*** Guest7825 is now known as diablo_rojo_phone | 06:12 | |
*** ralonsoh_ is now known as ralonsoh | 15:00 | |
*** tao is now known as Guest7921 | 15:28 | |
Guest7921 | Hi everyone! We’re researching cross-project flakiness in OpenStack—share your insights in our 10-minute anonymous survey: https://forms.gle/dUMWRL8MNALQE6kG8 Thank you! 🚀 | 15:29 |
---|---|---|
clarkb | if there are no objections in the next 15-20 minutes I'll approve https://review.opendev.org/c/opendev/system-config/+/940536 to update the haproxy image location | 15:56 |
fungi | lgtm, thanks! | 16:01 |
fungi | clarkb: want me to go ahead and approve 940536? | 17:03 |
fungi | oh, never mind, you already did | 17:03 |
fungi | i missed the workflow vote on there | 17:03 |
fungi | and that was about 50 minutes ago, so should be merging soon | 17:04 |
fungi | zuul says another 23 minutes | 17:04 |
clarkb | yup I'm waiting patiently while I get other stuff done | 17:12 |
clarkb | infra-root I'm going to put this on the meeting agenda, but I'm thinking I'd like to do a server replacement sprint probably next week. Basically do our best to redeploy as many things on jammy or noble as possible to get off of focal. I think if we focus on that we should be able to get a good number of servers done in a week | 17:24 |
clarkb | a good part of what makes the process take a long time is waiting for reviews for things like dns updates and confirmation that services on the new server are working. My goal is that if we set aside time specifically to work through that then we can quickly get through those updates | 17:25 |
clarkb | also I realized that I think I can put the existing grafana server in the emergency file. Land the upgrade change, then deploy a new grafana and have it deploy a new one from scratch that we switch dns over to if we're happy with the result | 17:26 |
clarkb | that might be the safest way to do an upgrade if we're worried about landing the upgrade on the old server | 17:26 |
clarkb | open to ideas on ^ if we have a preference for going the safe route or not | 17:26 |
fungi | that sounds great | 17:27 |
fungi | we've done that for a few other replacements in the past, especially during the puppet-to-ansible work | 17:28 |
clarkb | in that case I'll work on a change to update launch node to check cpu count, then an update to test grafana on top of noble. When we are happy to land those I can put the existing host in the emergency file and then launch a new server | 17:31 |
opendevreview | Merged opendev/system-config master: Switch our haproxy image to quay opendevmirror location https://review.opendev.org/c/opendev/system-config/+/940536 | 17:32 |
corvus | hi, i'd like to get some feedback from opendev users, probably especially from the openstack project, on this change: https://review.opendev.org/938677 | 17:33 |
corvus | it's a status page ui change that causes the individual gate queue items to be collapsed by default | 17:33 |
corvus | here is the site preview for the change: https://0eff135c994e4125b903-72f20af1f3723272b921a9ee1bd5f518.ssl.cf2.rackcdn.com/938677/5/check/zuul-build-dashboard-opendev/007b437/npm/html/ | 17:34 |
clarkb | JayF: TheJulia: dansmith ^ are probably good feedback sources | 17:34 |
dansmith | there was already a change that causes the jobs to be collapsed by default, late last year | 17:35 |
dansmith | I wanted to complain about that already | 17:35 |
corvus | not jobs | 17:35 |
dansmith | I'm definitely not in favor of collapsing further for sure | 17:35 |
clarkb | if you use the expand all button then it doesn't look like you're affected by this change | 17:36 |
dansmith | yeah, is that new? that definitely helps.. I didn't recall seeing that after the previous change | 17:36 |
corvus | correct; that's kind of the thesis i'm wondering about: might it be the case that people who would normally be adversely affected by this are already using expand all? | 17:36 |
clarkb | expand all came out of the feedback from the earlier change you talked about | 17:37 |
clarkb | so it's newer than that change but not new | 17:37 |
dansmith | on zuul.o.o I have no expand all toggle | 17:37 |
dansmith | oh, show all jobs I guess | 17:38 |
corvus | this shows the difference in the proposed change (just to be clear and make sure we're talking about the same thing) https://imgur.com/a/XsCWxam | 17:38 |
dansmith | to me there's very little value in the fully collapsed view, so I'm not sure why that's an improvement.. I understand why collapsing the jobs (the previous change) might be preferred by some (but not me) | 17:39 |
dansmith | but as long as the expand-all-the-things is there then, meh | 17:39 |
dansmith | corvus: I'm complaining about any/all of the collapsing, given this is the first feedback ask, so I'm "collapsing" multiple changes in my complaints :) | 17:39 |
corvus | dansmith: https://imgur.com/a/hJAZ5XB the page should look like that, and if you select both of those toggles, you should get approximately "the old status page" | 17:40 |
clarkb | zuul load balancer also updated and seems to still work for me | 17:40 |
corvus | dansmith: if it doesn't look like that, then shift-reload :) | 17:40 |
dansmith | yeah, all the toggles makes me happy again | 17:41 |
dansmith | back button navigation seems very broken, but I'm guessing that's because it's a preview site | 17:42 |
corvus | dansmith: ack. and the expand-all (and some other changes) were made to address that; hopefully that helps. just fyi (since you wondered who the collapsed view helps) -- there are some much larger installations of zuul where it's difficult to get a view of overall trends without "zooming out" more, so the new status page is an attempt to help with those use cases; ideally without breaking existing ones. hopefully we're getting close. :) | 17:42 |
corvus | yes, "back" navigation can be weird on the preview sites in some cases; that change shouldn't affect it, so if it's not broken on the real site, it shouldn't be broken by that change. | 17:43 |
dansmith | corvus: okay I guess I would expect that's a reason to allow collapsing by non-default, but fair enough.. as long as I can expand (and it's sticky for me) then I'm happy | 17:45 |
corvus | ack, thanks | 17:46 |
JayF | corvus: I run with "expand all" checked all the time, I don't like the collapsing at all but I don't experience any of the collapsed versions as literally step 1 for any use there is "expand all" | 17:49 |
JayF | and afaict there is no change for the expanded version | 17:49 |
JayF | I agree with dansmith that I'd have preferred no change to default collapse at all; but given that change is already there, this single additional change will not impact my workflow | 17:50 |
clarkb | It's interesting to me that so many people appear to run with expand all set but no one seems aware of kolla and tacker and I'm sure others using multiples of our quota pushing just a small number of changes | 17:50 |
clarkb | maybe people are aware and just indifferent | 17:50 |
JayF | clarkb: expand all -> ^f [123456] -> look at jobs | 17:50 |
JayF | or replace a patch number with ironic | 17:51 |
fungi | aha, so using expand all but with a filtered view | 17:51 |
JayF | I use browser search frequently and do not like collapsed patterns which make me unable to use browser search | 17:51 |
JayF | fungi: not filtered-by-webapp; literally in browser search | 17:51 |
fungi | oh, got it | 17:51 |
JayF | I want to use the thing that has my keyboard shortcuts setup; not a webapp search box whose ui is different based on website :) | 17:51 |
dansmith | yes, and there's a special place in hell for web app designers that hijack the browser's find-in-page | 17:52 |
fungi | agreed. the worst for me is gitlab | 17:52 |
clarkb | we got gerrit to stop doing it and now github does it | 17:52 |
dansmith | it's right on the shore of a lava lake | 17:52 |
fungi | gitlab overloads both / and ^f, so i have to click on a browser menu to do find-in-page | 17:52 |
dansmith | yep, I feel sorry for those people.. gonna be toasty in retirement | 17:53 |
fungi | what i really wish is that browsers themselves would give users the ability to block that | 17:53 |
JayF | there are extensions which do so | 17:53 |
fungi | next best thing i guess | 17:53 |
JayF | but usually if a website hijacks it, it uses dynamic loading in a way that makes browser search useless | 17:53 |
dansmith | yeah, similar to Don | 17:53 |
dansmith | DontFuckWithPaste, which is another important one | 17:53 |
JayF | the only thing worse than infinite scroll is when they eat the thing I just scrolled past while showing the new thing | 17:54 |
corvus | i largely agree with everything said about searching, but i do want to point out that the search filters in zuul's status page automatically migrate to the url, so if you do set some up, they are easy to bookmark | 17:54 |
corvus | eg https://zuul.openstack.org/status?project=openstack%2Fnova&project=openstack%2Fkeystone | 17:55 |
fungi | that is super useful too | 17:55 |
corvus | (just in case people hadn't noticed that) | 17:56 |
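A minimal sketch of how a bookmarkable filter URL like the one corvus pasted can be assembled. The helper name `status_url` is hypothetical and not part of Zuul; the only thing taken from the log is the URL shape, where each filter is a repeated `project` query parameter with the slash percent-encoded.

```python
from urllib.parse import urlencode


def status_url(base, projects):
    """Build a status page URL with one `project` parameter per project.

    `status_url` is an illustrative helper, not a Zuul API. urlencode
    percent-encodes "/" as "%2F" by default, which matches the URL
    shown in the log.
    """
    query = urlencode([("project", name) for name in projects])
    return f"{base}?{query}"


if __name__ == "__main__":
    print(status_url(
        "https://zuul.openstack.org/status",
        ["openstack/nova", "openstack/keystone"],
    ))
```

Passing a list of pairs rather than a dict is what allows the same key to appear more than once in the query string.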
JayF | clarkb: I'll also note: kolla is a pretty unique skillset to manage -- I often even find myself pointing Ironic questions; when they include kolla/kolla-ansible, to the kolla team instead. I'm not sure the installers have the same benefit of a shared knowledge base to start from | 17:56 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add cpu count check to launch node https://review.opendev.org/c/opendev/system-config/+/940648 | 17:56 |
JayF | clarkb: because re-reading your comment; I realized that even if I was aware of a weirdness with kolla jobs, it'd be really outta scope of what I usually work on in openstack and would likely spend my time in places I can make a bigger impact | 17:57 |
clarkb | JayF: right but you can see they are running 64 jobs per patchset | 17:57 |
JayF | OK. So what would by action be based on that knowledge? That's what I'm getting at. That knowledge with no context is not actionable | 17:57 |
clarkb | it's not about how to debug kolla but how to see that kolla is using all the quota in zuul | 17:57 |
JayF | s/by/my/ | 17:57 |
clarkb | JayF: well ideally people stop asking me why zuul is slow (granted you aren't one of the people who have asked recently) | 17:58 |
clarkb | the information is generally right there: Because a different project used all of our quota all at once | 17:58 |
clarkb | infra-root 940648 is totally untested. Probably the easiest way to test that is to just land it and run it? | 17:59 |
fungi | we run that script manually anyway, and should be its only users, so that seems like a fine enough choice | 17:59 |
corvus | ++ | 18:01 |
clarkb | when not expanding the zuul dashboard it's easy to miss those details but so far 2/2 people we've asked for feedback run expanded so I'm just surprised that others are missing the info. But it is entirely possible they don't expand | 18:06 |
JayF | clarkb: with ci I take after Ron Popeil: I set it and forget it :D | 18:10 |
JayF | I care more about the inconsistent performance node to node (also likely caused by noisy neighbors where the noisy neighbor is me) than I do about queue time | 18:10 |
JayF | unless I'm trying to smash a CVE fix through the gate :D | 18:10 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update grafana to 10.4.14 https://review.opendev.org/c/opendev/system-config/+/940073 | 18:11 |
clarkb | infra-root ^ ok I went ahead and updated grafana to test on noble. If that change looks good still I can put grafana01 in the emergency file, then approve that change then deploy a new grafana02 on noble | 18:12 |
opendevreview | Merged opendev/system-config master: Add cpu count check to launch node https://review.opendev.org/c/opendev/system-config/+/940648 | 18:19 |
clarkb | I'll test ^ by launching grafana02 after a short break for some food | 18:20 |
fungi | oh, even better | 18:23 |
fungi | good that there's a candidate so we don't forget that's merged by the time we get around to running the script again | 18:23 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: DNM: test niz node https://review.opendev.org/c/opendev/zuul-jobs/+/932455 | 18:31 |
clarkb | I think I need to wait for the hourly jobs to redeploy the launch tool to the launcher venv | 18:44 |
clarkb | infra-prod-service-bridge should do that | 18:44 |
clarkb | ya looking at venv contents it hasn't updated yet and looking at ansible it should when hourlies run | 18:45 |
clarkb | that should allow me to check the grafana on noble ci results before launching a new node too | 18:47 |
clarkb | there is a bug in the launch node update. I've fixed it directly on bridge and will push to system-config once grafana02 boot confirms it is generally working | 19:10 |
clarkb | I forgot to boot with config drive so I'm waiting for it to timeout and I'll try again :/ | 19:12 |
clarkb | oh hrm the timeout is 600 seconds * 4 accounts so that's ~40 minutes? Maybe I should ^C and manually cleanup the server | 19:20 |
clarkb | oh no, it's 600 seconds total. I think it just timed out | 19:20 |
opendevreview | Clark Boylan proposed opendev/system-config master: Fix launch node string quoting https://review.opendev.org/c/opendev/system-config/+/940652 | 19:23 |
clarkb | that's the fix. launch node appears to have gotten past that check so I think we're good | 19:24 |
clarkb | grafana01 did not have an external volume mounted. It also doesn't use backups. Once this is done I'll push up a change to add it to dns then a change to update system-config and we should be set to see that it deploys properly (I already put 01 in the emergency file) | 19:27 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Add grafana02 to DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/940653 | 19:33 |
opendevreview | Clark Boylan proposed opendev/system-config master: Deploy grafana02 https://review.opendev.org/c/opendev/system-config/+/940654 | 19:47 |
clarkb | infra-root I believe 940651 fixes launch node based on my testing. Then 940073, 940653 and 940654 should be safe to land. grafana01 is already in the emergency file and bridge group vars were already set up (so 940654 reorg of host vars to group vars matches what we already do in prod) | 19:48 |
clarkb | I guess we may want dns to update first before approving the system-config change just to be sure that LE will deploy properly | 19:48 |
clarkb | as always please double check things | 19:49 |
clarkb | we actually do have a test case for testing launch node venv stuff that fails in 940654 | 20:13 |
clarkb | so fixing launch node is step 0 here | 20:13 |
opendevreview | Merged opendev/zone-opendev.org master: Add grafana02 to DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/940653 | 21:03 |
opendevreview | Clark Boylan proposed opendev/system-config master: Test launch installation on launch edits https://review.opendev.org/c/opendev/system-config/+/940656 | 21:11 |
clarkb | this is an attempt at improving the testing to actually run the test case we have when we edit launch/ | 21:11 |
clarkb | I've reintroduced the bug in this patchset to ensure we catch the problem and will pull the bug back out again if it does | 21:11 |
fungi | cool! | 21:12 |
clarkb | also reminder to suggest/add/edit meeting agenda items nowish. I intend on adding a reminder that our election is starting tomorrow and bring up the idea of the noble node replacement sprint/hackathon/focus for next week | 21:15 |
opendevreview | Merged opendev/system-config master: Fix launch node string quoting https://review.opendev.org/c/opendev/system-config/+/940652 | 21:17 |
clarkb | I have rechecked https://review.opendev.org/c/opendev/system-config/+/940654 now that ^ is in | 21:17 |
fungi | SyntaxError: unexpected character after line continuation character | 21:33 |
fungi | https://zuul.opendev.org/t/openstack/build/bf4b04f8acca4c43b0ddd7a8c660fdb5 | 21:33 |
fungi | worked! | 21:33 |
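For readers unfamiliar with the error fungi quoted: the log does not show the actual bug in launch node, so the following is only a sketch of one common way Python produces that message, namely a backslash left outside a string literal by a quoting mistake. The helper name `syntax_error_message` and the sample sources are hypothetical.

```python
def syntax_error_message(source):
    """Return the SyntaxError message for `source`, or None if it compiles.

    compile() only parses the code; nothing is executed.
    """
    try:
        compile(source, "<quoting-sketch>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


# Hypothetical sample, not the real launch node code. The source text is
# print(\"hello\"): the backslashes sit outside any string literal, so
# Python treats each one as a line continuation followed by stray text.
broken = 'print(\\"hello\\")'
fixed = 'print("hello")'

if __name__ == "__main__":
    print(syntax_error_message(broken))
    print(syntax_error_message(fixed))
```

A check like this runs at parse time, which is why a test that merely installs and imports the script is enough to catch it.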
opendevreview | Clark Boylan proposed opendev/system-config master: Test launch installation on launch edits https://review.opendev.org/c/opendev/system-config/+/940656 | 21:35 |
clarkb | that should make it mergeable | 21:35 |
clarkb | corvus: what updates should I apply to niz topic in our meeting tomorrow? Is it that zuul is going to start testing jobs on the niz managed nodes? | 21:41 |
clarkb | I've done a first pass set of edits on the agenda but didn't touch niz yet | 21:49 |
corvus | clarkb: i think that and the new repo | 21:53 |
opendevreview | Merged opendev/system-config master: Test launch installation on launch edits https://review.opendev.org/c/opendev/system-config/+/940656 | 21:53 |
fungi | clarkb: did you want the inventory addition merged first, or the upgrade change? | 21:53 |
clarkb | fungi: the inventory change is stacked on top of the upgrade change. I think we land both together though | 21:54 |
fungi | aha, yeah | 21:55 |
clarkb | but ya I'd like to avoid there ever being an old version on the new server we need to worry about configs migrating/upgrading for | 21:55 |
clarkb | start fresh and move on as our CI jobs seem to indicate this works | 21:55 |
fungi | you might actually want a bit of a pause between them. i've seen funk when a deploy job for change #1 starts running with the inventory addition from change #2 | 21:56 |
clarkb | ya that works too | 21:56 |
fungi | when they both merge at nearly the same time that is | 21:56 |
clarkb | grafana01 is in a holding pattern so should be safe | 21:56 |
clarkb | corvus: thanks I've made those updates | 21:56 |
opendevreview | Merged opendev/system-config master: Update grafana to 10.4.14 https://review.opendev.org/c/opendev/system-config/+/940073 | 22:19 |
clarkb | that "deployed" and I have confirmed grafana01 is still running the old image as expected | 22:23 |
fungi | awesome, i'll approve the inventory addition now | 22:24 |
clarkb | thanks | 22:25 |
opendevreview | Merged opendev/system-config master: Deploy grafana02 https://review.opendev.org/c/opendev/system-config/+/940654 | 23:02 |
clarkb | we're still waiting for jobs to start running but I will keep an eye on it | 23:20 |
clarkb | in the mean time any other edits for the meeting agenda? Otherwise I'll get that out soonish | 23:20 |