-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 00:26 | |
- [zuul/nodepool] 807464: Add metastatic driver https://review.opendev.org/c/zuul/nodepool/+/807464 | ||
- [zuul/nodepool] 814837: Add more log messages to azure driver https://review.opendev.org/c/zuul/nodepool/+/814837 | ||
@clarkb:matrix.org | > <@iwienand:matrix.org> so, not sure what we could do to make buildx faster | 00:34 |
---|---|---|
We relied on the wheels for buster to make it faster. Is this time cost with or without python3.9 bullseye prebuilt wheels? | ||
@jim:acmegating.com | Clark: yes, we probably should test those commands, but i don't think we were planning on ensuring that was in the release; we actually restarted on the commit before those | 01:04 |
@jim:acmegating.com | so... given that... | 01:06 |
@jim:acmegating.com | zuul-maint: how does this look for a zuul release? commit bfe5a4a93524e1b534851f3d04c4f1ad7d44eec7 (tag: 4.10.3, refs/changes/93/814493/3) | 01:06 |
@iwienand:matrix.org | Clark: that's with it pointing at our bullseye 3.9 wheels. pip isn't spending time forking to build, any slowness seems to be limited to whatever it's doing internally | 01:15 |
@clarkb:matrix.org | Wow. Possibly doing dependency resolution? Maybe we can make that better with hints to that process | 01:15 |
-@gerrit:opendev.org- Tristan Cacqueray proposed: | 01:28 | |
- [zuul/zuul] 814842: Demonstrate pragma take over override-checkout https://review.opendev.org/c/zuul/zuul/+/814842 | ||
- [zuul/zuul] 814843: Demonstrate removing pragma fix override-checkout https://review.opendev.org/c/zuul/zuul/+/814843 | ||
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 814848: Add addtional checks to key deletion testing https://review.opendev.org/c/zuul/zuul/+/814848 | 02:05 | |
@clarkb:matrix.org | corvus ^ that is a couple extra checks that I thought about just now. Your testing with OpenDev made me realize we should double check that in our tests too | 02:05 |
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 760804: Store version information in component registry https://review.opendev.org/c/zuul/zuul/+/760804 | 02:59 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 814862: Bail out when a project moves between connections https://review.opendev.org/c/zuul/zuul/+/814862 | 07:52 | |
@jkt_:matrix.org | hi there, our OpenStack provider is messing with the API endpoint reverse proxy today, and Zuul reports that it cannot SSH to freshly-created VMs anymore ("permission denied") | 09:21 |
@jkt_:matrix.org | when I checked the VM's console, I see a line from cloud-init, `ci-info: no authorized SSH keys fingerprints found for user ci.` | 09:21 |
@jkt_:matrix.org | is that expected in normal operation, or is this a sign of a bug? how do I debug this further? | 09:22 |
@jkt_:matrix.org | I'm on nodepool `3.12.1.dev3` (these are my long-obsolete `runc` patches, nothing to the core), and zuul+nodepool have not been touched in a long time, and neither was their config | 09:23 |
@jkt_:matrix.org | per https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2020-08-05.log.html#t2020-08-05T08:41:12 this looks like a botched nova metadata service, that makes sense I guess | 09:32 |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 814773: Move re-enqueue to pipeline processing https://review.opendev.org/c/zuul/zuul/+/814773 | 10:59 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 814899: Delete old build sets immediately https://review.opendev.org/c/zuul/zuul/+/814899 | 11:22 | |
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul-jobs] 813034: Implement role for limiting zuul log file size https://review.opendev.org/c/zuul/zuul-jobs/+/813034 | 13:16 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 13:45 | |
- [zuul/nodepool] 814837: Add more log messages to azure driver https://review.opendev.org/c/zuul/nodepool/+/814837 | ||
- [zuul/nodepool] 807464: Add metastatic driver https://review.opendev.org/c/zuul/nodepool/+/807464 | ||
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul-jobs] 813034: Implement role for limiting zuul log file size https://review.opendev.org/c/zuul/zuul-jobs/+/813034 | 13:47 | |
@clarkb:matrix.org | I'll be in a different meeting this morning. But hope to join the zuul bof once my portion of that meeting ends | 14:02 |
@jim:acmegating.com | zuul bof starting nowish at https://meetpad.opendev.org/zuul-2021-10-21 | 14:03 |
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 814996: WIP: Make the ConfigLoader work independently of the Scheduler https://review.opendev.org/c/zuul/zuul/+/814996 | 14:04 | |
@jim:acmegating.com | direct etherpad link: https://etherpad.opendev.org/p/zuul-2021-10-21 | 14:06 |
-@gerrit:opendev.org- Simon Westphahl proposed: | 14:18 | |
- [zuul/zuul] 814773: Move re-enqueue to pipeline processing https://review.opendev.org/c/zuul/zuul/+/814773 | ||
- [zuul/zuul] 814899: Delete old build sets immediately https://review.opendev.org/c/zuul/zuul/+/814899 | ||
@jim:acmegating.com | we wrapped up the bof; took some notes in the etherpad | 15:18 |
@clarkb:matrix.org | corvus: thank you for the notes. I left thoughts on running zuul in opendev using the k8s operator. I think there are some large hurdles to get over there and I'm not sure it is currently a good fit. cc tristanC | 15:32 |
@tristanc_:matrix.org | Clark: i was suggesting a standalone deployment, like an untrusted service that the zuul community members would manage. | 15:48 |
@clarkb:matrix.org | tristanC: but where does the kubernetes come from? and does the zuul installation just sit idle then? | 15:49 |
@tristanc_:matrix.org | I don't know who can provide a kubernetes api, but i was thinking this demo could be used to run third-party-ci job for the zuul-jobs to verify they can run in a nodepool container provider | 15:50 |
@tristanc_:matrix.org | in other words, there would be a zuul/k8s-demo-config project with some secret (a review.opendev.org ci account, the k8s api password) and the zuul crd | 15:52 |
@clarkb:matrix.org | I don't think that needs to be a third party CI. But ultimately the problem is opendev has tried to run k8s like three different ways and ran into issues with all of them and now have even fewer people to help with that. If there was a kubernetes we could plug the production nodepool into it | 15:55 |
@clarkb:matrix.org | (but also if the kubernetes existed then there is potential to deploy other services in it, though I'm not sure if we should mix prod workload and ci workload so maybe we need two?) | 15:56 |
@tristanc_:matrix.org | i meant something untrusted/low risk where we could easily share the cluster api access with community members that would like to help get the operator production ready | 15:57 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 815025: DNM Manual depends on between dib and devstack https://review.opendev.org/c/zuul/zuul-jobs/+/815025 | 15:58 | |
@tristanc_:matrix.org | alternatively corvus suggested we could use google gerrit zuul for that instead, but i think this would have less visibility than if it was in the opendev zuul tenant. | 16:00 |
@clarkb:matrix.org | tristanC: I'm not sure such a thing can exist as it requires we use donated resources in a cloud and shouldn't waste them. That means we should have a minimal level of support and care and actually use the resources properly. But once we've done that we get into the trap where the opendev team is often left expected to support an ever growing list of things as the amount of help shrinks :/ | 16:03 |
@clarkb:matrix.org | Basically I'm saying I think this is possible, but we should do it in a way that makes it sustainable which for OpenDev likely means committing to converting over to k8s for many services as then the cost of running the k8s becomes minimal compared to the gain we get from running the services. But to do that I think we need a fair bit of help. The whole replacing the motor while the plane is flying problem | 16:08 |
@tristanc_:matrix.org | Clark: I was hoping such resources could be acquired specially for this purpose (demonstrate a production ready zuul operator), without needing any work from the opendev team, e.g. it would be fully managed by the zuul community | 16:08 |
@clarkb:matrix.org | I see, then I misunderstood what was meant by doing this in opendev | 16:09 |
@avass:vassast.org | tristanC: i suppose it may be enough to run a small k3s cluster on a single node? | 16:09 |
@tristanc_:matrix.org | Albin Vass: right, or the minimum required resources would do | 16:10 |
@tristanc_:matrix.org | i take it from the meeting that the mysql operator needs at least 3 nodes | 16:10 |
@avass:vassast.org | Yeah. I think it's possible to run a multi agent cluster on a single node with k3d | 16:11 |
@jim:acmegating.com | you can run pxc on one node (we do in tests), but if the goal is a realistic deployment, 3 would be better. | 16:14 |
@tristanc_:matrix.org | corvus: may i request your feedback on 814676 , i can't tell where to look at for fixing this issue (using devstack job in our zuul) | 16:17 |
@jim:acmegating.com | i'll try to take a look in a bit; i have to afk for a while now | 16:17 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 815025: DNM Manual depends on between dib and devstack https://review.opendev.org/c/zuul/zuul-jobs/+/815025 | 16:23 | |
@tobias.henkel:matrix.org | mhu, corvus : q on 735586, what do you think? | 16:34 |
@mhuin:matrix.org | tobiash: works for me but I haven't followed the zk-related changes in detail, I'll need pointers to do it | 17:15 |
@clarkb:matrix.org | fungi: re your comment on https://review.opendev.org/c/opendev/system-config/+/814817/1 I wonder if we should update that script to talk to zk for us? | 17:33 |
@clarkb:matrix.org | I think we land my docs change either way then improve as a followup, but thought I'd bring that up here in case other zuul users thought the decrypt secret tool could be a bit more automatic | 17:36 |
@clarkb:matrix.org | zuulians https://review.opendev.org/c/zuul/zuul/+/814848 is a test only update to make the changes I made to delete-keys a bit more robustly tested. Nothing needs to change in the actual implementation so not an emergency but it occurred to me we should check the additional behavior that is tested in that change. | 18:04 |
@fungicide:matrix.org | > <@clarkb:matrix.org> I think we land my docs change either way then improve as a followup, but thought I'd bring that up here in case other zuul users thought the decrypt secret tool could be a bit more automatic | 18:37 |
yeah, i think those are separate (albeit related) concerns | ||
@fungicide:matrix.org | > <@jim:acmegating.com> zuul-maint: how does this look for a zuul release? commit bfe5a4a93524e1b534851f3d04c4f1ad7d44eec7 (tag: 4.10.3, refs/changes/93/814493/3) | 18:44 |
sorry for the slow response, looks okay to me but opendev is running a later commit than that right? (new enough to have the /components api but not new enough to have the version info)? is that the reason for picking bfe5a4a instead of 1df09a8? i guess the latter would be fodder for a minor version increase to 4.11.0 instead of just a patchlevel increase... | ||
@clarkb:matrix.org | fungi: that commit is the commit opendev was on before the most recent restart. And ya I think corvus wants to avoid tagging where only the /components api without versions is available | 18:46 |
@fungicide:matrix.org | yep, makes perfect sense to me. i'm in favor | 18:48 |
@jim:acmegating.com | pushed 4.10.3 | 20:42 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 814685: DNM: Test unit tests on larger nodes https://review.opendev.org/c/zuul/zuul/+/814685 | 20:48 | |
@jim:acmegating.com | looking at https://review.opendev.org/814684, i think the issue with the tests in the zk stack is not so much that they are taking longer, but maybe that they are putting more load on zk and/or the host, which is causing zk disconnects. so maybe the way to address that is not to increase the job runtime, but to increase the zk session timeout. (but maybe the real answer is larger nodes; we'll see what 814685 says if i can ever get the syntax right) | 20:53 |
@jim:acmegating.com | as a point of interest... when we run yarn in verbose mode, it's not kidding. i think it's responsible for 145,000 lines of job output. maybe we should not use verbose? | 20:55 |
@clarkb:matrix.org | woah ++ to not being verbose in that case | 20:55 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 815072: Don't use --verbose with yarn https://review.opendev.org/c/zuul/zuul/+/815072 | 21:00 | |
@jim:acmegating.com | originally added in https://review.opendev.org/743623 -- i'm assuming it was just to confirm we were using the opendev mirrors | 21:00 |
@jim:acmegating.com | and maybe it wasn't as verbose then :) | 21:00 |
@clarkb:matrix.org | corvus: https://review.opendev.org/c/zuul/zuul/+/814848 is an easy review if you have time | 21:07 |
@jim:acmegating.com | done | 21:07 |
@clarkb:matrix.org | Looking at opendev's production zk resource utilization it seems that we use 1.3g of memory and about a whole cpu on the leader. Are the tests hitting zk a lot harder due to density of operations? | 21:18 |
@jim:acmegating.com | Clark: i imagine so.... maybe i should dust off the dstat role. but it stands to reason. the tests seem to start reliably failing about the middle of the current "put everything left in zk" stack. and each test thread is probably going to be mostly doing operations that go to zk. and there's usually more than one test running simultaneously. | 21:20 |
@clarkb:matrix.org | ya and it's a sprint to get through the tests vs zuul in production which tends to have a more smooth input of events | 21:21 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 815077: Run dstat in tox jobs https://review.opendev.org/c/zuul/zuul/+/815077 | 21:26 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 815078: Return dstat graph artifact https://review.opendev.org/c/zuul/zuul-jobs/+/815078 | 21:32 | |
@jim:acmegating.com | on the off chance the dstat roles still work, that might be nice | 21:32 |
@jim:acmegating.com | .3 release jobs succeeded | 21:34 |
@jim:acmegating.com | ianw, fungi, Clark: can you take a look at the comments on https://review.opendev.org/815078 | 22:09 |
@jim:acmegating.com | i haven't really kept up with the death of dstat... | 22:09 |
@jim:acmegating.com | dstat crashes. 'dool' is not available in ubuntu. | 22:11 |
@iwienand:matrix.org | didn't redhat "take it over" and then roll it into ... copilot or something like that? | 22:12 |
@iwienand:matrix.org | i seem to remember dealing with installing bits of this on fedora at some point | 22:13 |
@jim:acmegating.com | hrm, i'm not seeing options for installing co-pilot or copilot on ubuntu | 22:14 |
@clarkb:matrix.org | it's performance co-pilot or pcp. Problem is the packages don't work reliably | 22:14 |
@iwienand:matrix.org | pmlogger might be what we want in 2021 | 22:14 |
@clarkb:matrix.org | https://bugs.launchpad.net/devstack/+bug/1943184 | 22:15 |
@clarkb:matrix.org | collectl was what I was looking at replacing pcp with in devstack but I guess it isn't very maintained either | 22:15 |
@clarkb:matrix.org | I thought dstat was removed preemptively and didn't realize it didn't work at all | 22:15 |
@jim:acmegating.com | Clark: do you think the pcp problem you reported would affect incidental use for unit tests? | 22:17 |
@jim:acmegating.com | like, i guess if it fails to start we don't care...? | 22:17 |
@clarkb:matrix.org | ya if you install it with ansible saying failed when false or similar then it should be fine | 22:18 |
@clarkb:matrix.org | you just won't get the data. In devstack's case the problem is it treats the package installation as an error | 22:18 |
@jim:acmegating.com | okay, anyone know how to use pcp? :) | 22:18 |
@clarkb:matrix.org | corvus: I think if you install it it creates a drop in replacement for dstat | 22:19 |
@clarkb:matrix.org | that is how devstack uses it at least | 22:19 |
@jim:acmegating.com | okay yeah that appears to work | 22:19 |
@jim:acmegating.com | apt-get install pcp; dstat -tcmndrylpg --tcp --output /tmp/test.csv | 22:20 |
@jim:acmegating.com | presumably if we feed that to dstat-graph we'll get a graph | 22:20 |
@jim:acmegating.com | is there something more better copilot-ish we should do to collect data and make a graph? | 22:20 |
@clarkb:matrix.org | just be sure to make the install fail gracefully and the dstat run itself also fail gracefully? Though the dstat replacement may not rely on the stuff that fails to startup during install | 22:20 |
@clarkb:matrix.org | I'm not sure about copilot tooling. My only real exposure to it was debugging the devstack issues with it and suggesting collectl as a replacement | 22:21 |
@iwienand:matrix.org | corvus: looks like there's "pmdumptext" and then a bunch of gui-type tools | 22:21 |
@jim:acmegating.com | and just to confirm, you're not suggesting that now? | 22:21 |
@iwienand:matrix.org | pmchart/pmtime | 22:21 |
@clarkb:matrix.org | corvus: I think the pcp toolchain is super overkill and poorly designed (leading to the problems with basic package installation). But collectl like dstat is not maintained anymore so it may be the least evil thing if you can ignore the failures | 22:22 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 815078: Use pcp instead of dstand and return dstat graph artifact https://review.opendev.org/c/zuul/zuul-jobs/+/815078 | 22:26 | |
@clarkb:matrix.org | Basically there are a lot of not good options but if pcp works enough and when it doesn't we can keep going then it is likely the simplest option | 22:26 |
@jim:acmegating.com | Clark, ianw: maybe it's that easy? ^ | 22:26 |
@clarkb:matrix.org | ya that looks like it may avoid the known issues with pcp | 22:26 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 815077: Run dstat in tox jobs https://review.opendev.org/c/zuul/zuul/+/815077 | 22:27 | |
@jim:acmegating.com | added depends-on since we know dstat won't produce data without that patch | 22:27 |
@iwienand:matrix.org | i'll put on the todo list to play with pmlogger, etc. maybe a text dump (with timestamps) and something similar to the download-logs script where it's like "pipe this command to get a gui view of the stats" could be a useful combo | 22:29 |
@iwienand:matrix.org | that html view is a bit janky -- i looked at updating it once but every library it uses has the same name, but has essentially completely re-written itself in the meantime | 22:29 |
@iwienand:matrix.org | i.e. standard javascript | 22:29 |
@jim:acmegating.com | ianw: yeah -- i really dig being able to see it in a web page, so that's my #1 priority, but if there's rich local interface, that's nice too. hopefully pcp can do both, but tbh, i will probably never use the local one. | 22:30 |
@jim:acmegating.com | i ran pmgraph long enough to know i have no idea what it wants from me | 22:32 |
@jim:acmegating.com | oh it's failed_when | 22:39 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 815078: Use pcp instead of dstand and return dstat graph artifact https://review.opendev.org/c/zuul/zuul-jobs/+/815078 | 22:40 | |
@iwienand:matrix.org | just on zuul-jobs, could i get a couple of eyes on https://review.opendev.org/c/zuul/zuul-jobs/+/814695 to remove stretch testing and https://review.opendev.org/c/zuul/zuul-jobs/+/812273 adds a bit more testing to the rust roles to ensure we don't break pyca | 22:47 |
@clarkb:matrix.org | ianw: I guess that first one doesn't use zuul-jobs tagged job creation mechanism for all the platforms? | 22:52 |
@iwienand:matrix.org | Clark: no, not sure on the history there, i'm assuming it was targeted at a limited set of platforms | 22:53 |
@jim:acmegating.com | oh hey cool the dstat graph change worked - ianw if you want to take another look | 22:56 |
@jim:acmegating.com | rechecking the zuul change now | 22:57 |
@iwienand:matrix.org | lgtm. the page is not quite right in my firefox but that's the way it's always been | 22:58 |
@jim:acmegating.com | there does seem to be some broken bits on the graph page, which i guess is why ianw wanted to update it, but it's something. | 22:59 |
@jim:acmegating.com | yeah | 22:59 |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 815078: Use pcp instead of dstand and return dstat graph artifact https://review.opendev.org/c/zuul/zuul-jobs/+/815078 | 23:17 |