-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 817885: WIP: just for test https://review.opendev.org/c/zuul/zuul/+/817885 | 04:01 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 817885: WIP: just for test https://review.opendev.org/c/zuul/zuul/+/817885 | 04:45 | |
@westphahl:matrix.org | corvus: it shouldn't be possible for the run handler to process a tenant that is not loaded. A tenant is only added to the abide after config loading completed for a tenant. Also, priming/reconfig is protected by the tenant read-write lock so I can't imagine a case where the scheduler discards events for pipelines as it should never have an incomplete or partial state of a tenant (when e.g. the read lock is held) | 06:37 |
---|---|---|
@westphahl:matrix.org | * corvus: it shouldn't be possible for the run handler to process a tenant that is not loaded. A tenant is only added to the abide after config loading is completed for a tenant. Also, priming/reconfig is protected by the tenant read-write lock so I can't imagine a case where the scheduler discards events for pipelines as it should never have an incomplete or partial state of a tenant (when e.g. the read lock is held) | 06:38 |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 817949: SQL reporter: add substring search on some fields https://review.opendev.org/c/zuul/zuul/+/817949 | 13:44 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 793159: Web UI: make more filters selectable in build, buildset searches https://review.opendev.org/c/zuul/zuul/+/793159 | 13:49 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 808041: [web] Early support for pagination in builds, buildsets search https://review.opendev.org/c/zuul/zuul/+/808041 | 13:49 | |
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 808042: [zuul-web] Add pagination information when querying builds, buildsets endpoints https://review.opendev.org/c/zuul/zuul/+/808042 | 13:49 | |
@jim:acmegating.com | swest: but if it wins a connection election and starts distributing connection events to tenants, it wouldn't distribute it to a tenant that wasn't loaded, right? | 13:56 |
@westphahl:matrix.org | corvus: correct, but that's not covered by 817869 AFAIKS | 14:04 |
@westphahl:matrix.org | so I think it would make sense to make the connections wait until the scheduler is primed | 14:06 |
@jim:acmegating.com | swest: okay, makes sense. i'll rework it to do that. | 14:07 |
@jim:acmegating.com | swest: we want the connections to accept events, but not distribute them to tenants | 14:08 |
@jim:acmegating.com | (otherwise, we'd have a huge window of lost events on startup) | 14:08 |
@westphahl:matrix.org | yes | 14:09 |
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 817959: Remove fake SQL classes and variables from tests https://review.opendev.org/c/zuul/zuul/+/817959 | 14:38 | |
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 817960: Add missing onStop() call to TestMysqlDatabase https://review.opendev.org/c/zuul/zuul/+/817960 | 14:42 | |
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 817963: Fix NoneType ppc exception in freezeJobGraph() https://review.opendev.org/c/zuul/zuul/+/817963 | 15:10 | |
@felixedel:matrix.org | corvus: A small fix for the latest changes to the ppc debugging https://review.opendev.org/c/zuul/zuul/+/817963. In case you have a better approach/idea, feel free to update this change ;-) I just spotted this exception while running some tests. However, this exception doesn't seem to affect the test results. | 15:14 |
-@gerrit:opendev.org- Clint Byrum proposed: | 16:42 | |
- [zuul/nodepool] 817487: Add network and subnetwork to GCE driver https://review.opendev.org/c/zuul/nodepool/+/817487 | ||
- [zuul/nodepool] 817990: Adding namespace args to k8s pod type https://review.opendev.org/c/zuul/nodepool/+/817990 | ||
@spamaps:spamaps.ems.host | ```FAIL: nodepool.tests.unit.test_launcher.TestLauncher.test_node_assignment_at_tenant_quota_ram``` | 17:43 |
@spamaps:spamaps.ems.host | Anyone know if that one regularly times out? | 17:43 |
@jim:acmegating.com | i've seen it once or twice; it's likely a slightly flaky test | 17:44 |
@spamaps:spamaps.ems.host | KK, I'll recheck | 17:44 |
@spamaps:spamaps.ems.host | On to other things.. `Warning: rbac.authorization.k8s.io/v1beta1 Role is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 Role` | 17:44 |
@spamaps:spamaps.ems.host | Looks like one of the things the k8s driver does needs some updating for newer k8s. | 17:45 |
@spamaps:spamaps.ems.host | I think it's just dropping the beta tag.. but I wonder.. are we testing k8s in the gate? I didn't look yet. | 17:46 |
@jim:acmegating.com | spamaps: yes, twice: nodepool-functional-k8s and nodepool-functional-openshift | 17:47 |
@spamaps:spamaps.ems.host | ` minikube_version: v1.22.0 # NOTE(corvus): 1.23.0 failed with 404 on create_namespaced_role` | 17:55 |
@spamaps:spamaps.ems.host | Indeed. ;) | 17:55 |
@jim:acmegating.com | yep, that needs some updating | 17:56 |
@spamaps:spamaps.ems.host | I am guessing v1 was ratified a long, long time ago. | 17:56 |
@jim:acmegating.com | spamaps: i have confirmed that gerrit's zuul was spamming the recursive equality issue as expected | 18:25 |
@jim:acmegating.com | and it appears it is no longer doing so now that i have restarted it | 18:26 |
@clarkb:matrix.org | I've been asked to do a resource usage audit of the opendev zuul/nodepool install and I found the graphite data we emit from zuul for that, but the numbers look off to me. Also we removed the job name from the info log entry on the executors that records this in the logs. Two questions. Am I looking at the statsd graphite data wrong? Any idea what the scale is there? Milliseconds? And would it be terrible to include the job name in the BuildRequest records so that they get logged? | 18:49 |
@clarkb:matrix.org | fwiw I'd like to use the statsd data for aggregate reporting, but the logs are still helpful when spot checking specific things | 18:49 |
@jim:acmegating.com | i think it'd be great to include the job name | 18:50 |
@clarkb:matrix.org | great, in that case I'll try to take a look at adding that in today | 18:51 |
@jim:acmegating.com | the graphite data can be a little tricky to interpret; i usually have to have the docs for the timer type handy to work out what i need | 18:54 |
@jim:acmegating.com | if you paste a link, i can try to help | 18:55 |
@clarkb:matrix.org | noted. I'll have to find the docs for these entries | 18:55 |
@jim:acmegating.com | you have to keep the aggregation in mind with timers too | 18:56 |
@clarkb:matrix.org | corvus: https://graphite.opendev.org/?width=586&height=308&target=stats.zuul.nodepool.resources.project.opendev_org-openstack-neutron.instances is the sort of thing I'm looking at | 18:56 |
@jim:acmegating.com | Clark: another option would be the sql db and/or rest api | 18:56 |
@clarkb:matrix.org | The idea would be to compare total instance time for each openstack project in a graph | 18:56 |
@clarkb:matrix.org | (or something along those lines) | 18:56 |
@jim:acmegating.com | the sql db is how zuul answers the question of how long a job takes now | 18:56 |
@clarkb:matrix.org | ah right | 18:57 |
@clarkb:matrix.org | I would probably need to sum all the data points over $timeperiod per project? | 18:57 |
@clarkb:matrix.org | Then I could say eg Neutron used X time over this period and Nova used Y time. | 18:58 |
@clarkb:matrix.org | integralByInterval(thatpath, "1h") Seems to give me something maybe useful | 19:02 |
@jim:acmegating.com | Clark: re that graphite link -- the underlying metric is "instance-seconds", but it's being interpreted as a rate per second, so it's instance-seconds/second, and graphite averages that over a minute. | 19:02 |
@clarkb:matrix.org | I see, they aren't raw values, they are a rate. Is that how we should be sending the data? | 19:03 |
@clarkb:matrix.org | I guess graphite deals with rates | 19:03 |
@clarkb:matrix.org | Anyway putting that into integralByInterval seems to give me something closer to what I want (an hourly or other period usage) | 19:04 |
@jim:acmegating.com | well, it's statsd that's doing the '/second' bit. | 19:04 |
@clarkb:matrix.org | Big link warning: https://graphite.opendev.org/?width=943&height=529&target=integralByInterval(stats.zuul.nodepool.resources.project.opendev_org-openstack-neutron.instances%2C%20%221d%22)&target=integralByInterval(stats.zuul.nodepool.resources.project.opendev_org-openstack-nova.instances%2C%20%221d%22)&from=00%3A00_20211109&until=23%3A59_20211115 is closer to what I need | 19:07 |
@clarkb:matrix.org | any idea why when I shift the range to the 8th or previous the scale drops by a factor of 10? | 19:07 |
@clarkb:matrix.org | It seems to do that even if I move the end date backward so not something to do with the range width | 19:08 |
@clarkb:matrix.org | I know we fixed a bug related to this but I didn't think it was that recently. Maybe that was when we restarted to pick up the sos work | 19:09 |
@jim:acmegating.com | i don't remember the details, but i do think there was a period where we may not have been reporting stats correctly | 19:10 |
@clarkb:matrix.org | ok it is probably good enough to compare more recent stuff and just make it work going forward. Thanks for walking me through this it was helpful | 19:10 |
@spamaps:spamaps.ems.host | > <@jim:acmegating.com> and it appears it is no longer doing so now that i have restarted it | 20:49 |
Woot! | ||
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 818019: Add job name to BuildRequest repr https://review.opendev.org/c/zuul/zuul/+/818019 | 20:53 | |
@clarkb:matrix.org | There is the change I promised. Easier to write than I thought it would be as job_name was already an attribute on BuildRequest | 20:53 |
@clarkb:matrix.org | I need to follow up on my config loading change too as unittests were unhappy with it | 20:54 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Felix Edel: [zuul/zuul] 817963: Fix NoneType ppc exception in freezeJobGraph() https://review.opendev.org/c/zuul/zuul/+/817963 | 20:58 | |
@clarkb:matrix.org | corvus: small thing on https://review.opendev.org/c/zuul/zuul/+/815558 if you have a second | 20:58 |
@jim:acmegating.com | ayep | 21:00 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 815558: Replace asserts with exceptions https://review.opendev.org/c/zuul/zuul/+/815558 | 21:00 | |
@clarkb:matrix.org | corvus: for https://review.opendev.org/c/zuul/zuul/+/817869 I guess we're waiting on the new patchset that does connection work prevention? | 21:12 |
@jim:acmegating.com | Clark: yep, i'll try to get to it today, but it's behind a few things in my queue | 21:18 |
@clarkb:matrix.org | no rush, I was just making sure I understood the state of the sos queue. I think I've approved a bunch of approvable changes | 21:21 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 817652: Prevent duplicate config file entries https://review.opendev.org/c/zuul/zuul/+/817652 | 21:47 | |
@clarkb:matrix.org | Yay for testing. That was an actual bug in the client when I updated readConfig | 21:48 |
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 817959: Remove fake SQL classes and variables from tests https://review.opendev.org/c/zuul/zuul/+/817959 | 22:02 | |
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 817960: Add missing onStop() call to TestMysqlDatabase https://review.opendev.org/c/zuul/zuul/+/817960 | 22:06 | |
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 817963: Fix NoneType ppc exception in freezeJobGraph() https://review.opendev.org/c/zuul/zuul/+/817963 | 22:30 | |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 817646: Re-order scheduler shutdown https://review.opendev.org/c/zuul/zuul/+/817646 | 22:30 | |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 817652: Prevent duplicate config file entries https://review.opendev.org/c/zuul/zuul/+/817652 | 23:16 | |
@clarkb:matrix.org | Arg I ran unittests but not the linter before the last push. Sorry about that | 23:16 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 818030: Add admin password support for Azure driver https://review.opendev.org/c/zuul/nodepool/+/818030 | 23:52 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 807464: Add metastatic driver https://review.opendev.org/c/zuul/nodepool/+/807464 | 23:53 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 23:58 | |
- [zuul/nodepool] 818030: Add admin password support for Azure driver https://review.opendev.org/c/zuul/nodepool/+/818030 | ||
- [zuul/nodepool] 807464: Add metastatic driver https://review.opendev.org/c/zuul/nodepool/+/807464 |
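
The rate-to-usage arithmetic discussed between 19:02 and 19:07 can be sketched in Python. This is only an illustration of the idea, not Graphite's or statsd's implementation: it assumes each datapoint is an average rate in instance-seconds per second over a fixed step (60 seconds in the example), and the function name `usage_by_interval`, its parameters, and the sample values are all hypothetical.

```python
from collections import defaultdict


def usage_by_interval(samples, step_seconds, interval_seconds):
    """Turn rate datapoints into total instance-seconds per interval.

    samples: iterable of (unix_timestamp, rate) pairs, where rate is an
             average in instance-seconds/second over one step, or None
             when the datapoint is missing.
    step_seconds: assumed spacing between datapoints (e.g. 60).
    interval_seconds: width of each reporting bucket (e.g. 3600 for "1h").
    """
    totals = defaultdict(float)
    for timestamp, rate in samples:
        if rate is None:
            # Missing datapoints contribute nothing to the total.
            continue
        # Align the datapoint to the start of its reporting bucket.
        bucket = timestamp - (timestamp % interval_seconds)
        # rate (instance-seconds/second) * step (seconds) = instance-seconds
        totals[bucket] += rate * step_seconds
    return dict(totals)


# Made-up datapoints: two minutes at a rate of 2.0, one gap, then one
# minute at a rate of 1.0 in the following hour.
samples = [(0, 2.0), (60, 2.0), (120, None), (3600, 1.0)]
totals = usage_by_interval(samples, step_seconds=60, interval_seconds=3600)
print(totals)
```

Comparing two projects then amounts to running the same summation over each project's series for the same time range. If the datapoints are not evenly spaced, or the series was reported incorrectly for part of the range, the totals will be off by the same proportion.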