Wednesday, 2025-03-19

cardoeYeah that’s great.00:01
cardoeI was trying to get OSH to go to gunicorn so we could do uvicorn in the future.00:01
Clark[m]Ok the Gerrit meetup page says things happen at 4:45am Pacific time tomorrow 00:05
Clark[m]So somewhere there was a time zone conversion error00:05
Clark[m]"12:00 AM - 1:00 PM GMT" is what the email said. I saw the AM and thought midnight not noon00:06
Clark[m]But if you look closer it runs until 1pm00:06
Clark[m]Anyway I'm not sure I'll make the 5am edition. fungi: frickler: tonyb: if you happen to be awake you may be interested. They should stream it on the gerritforgetv youtube channel00:07
opendevreviewClark Boylan proposed opendev/system-config master: Rebuild our base python container images  https://review.opendev.org/c/opendev/system-config/+/94478900:26
clarkbif 2.0.24 doesn't work maybe we should stop building for arm or something00:26
fricklernb04 sure is creative ;) new error from the docker-compose cron: "the input device is not a TTY"07:58
opendevreviewDr. Jens Harbott proposed opendev/system-config master: Rebuild our base python container images  https://review.opendev.org/c/opendev/system-config/+/94478908:10
frickler^^ just testing a bit, feel free to revert/update. if we only need the uwsgi image for lodgeit, we might not even need both py3.11 and py3.12?08:11
tweining_hi. you probably heard this question before, but I didn't find information about it. How can I change the email of my Gerrit account? Is that possible?10:58
tweining_ah, nevermind. I found it. it is simply the "preferred email" in the settings11:04
fungitweining_: yes, gerrit lets you add multiple addresses to your account, so that you can push changes with any of them as the committer, but whichever you set as your preferred address is the one it shows for your account and the one to which it sends notifications13:36
tweining_thanks13:37
fungiclarkb: not sure about the fip mac addrs post, can't remember if that one went through moderation or not. i'll try to keep an eye out for header differences the next time i do though (i approved one just now for exceeding the 40kb message limit, but it was from an existing subscriber)14:01
*** dhill is now known as Guest1173514:21
fungiheading out to run some errands, back in about an hour14:26
clarkbfrickler: oh! the docker-compose issue is because docker-compose and docker invert the -t/-T settings and when I manually tested I had a tty because I was running in a shell14:48
clarkbI'll get a patch up for that shortly14:48
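The -t/-T inversion described above can be sketched as follows. This is a minimal illustration, not the actual cron fix: `docker exec` allocates no pseudo-TTY unless `-t` is passed, while `docker-compose exec` allocates one by default unless `-T` is passed. The helper name `exec_command` and the container/service names are hypothetical.

```python
import sys


def exec_command(tool, target, command, have_tty):
    """Build an exec command line that only asks for a TTY when one exists.

    tool is "docker" or "docker-compose"; target is a container or
    service name (hypothetical values in this sketch).
    """
    if tool == "docker":
        # docker exec: no pseudo-TTY by default, -t opts in.
        tty_args = ["-t"] if have_tty else []
        return ["docker", "exec", *tty_args, target, *command]
    if tool == "docker-compose":
        # docker-compose exec: pseudo-TTY by default, -T opts out.
        tty_args = [] if have_tty else ["-T"]
        return ["docker-compose", "exec", *tty_args, target, *command]
    raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    # Under cron stdin is not a terminal, so this prints a command with -T.
    print(exec_command("docker-compose", "some-service", ["true"],
                       sys.stdin.isatty()))
```

Run from an interactive shell, the same call omits `-T`, which is why a manual test can pass while the cron run fails with "the input device is not a TTY".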
clarkbfrickler: removing arm64 was something that I was considering too. Doing that would prevent us from building any uwsgi images on arm64 because the base images won't match. But we don't have any of those images today and if we wanted to add them I think we would do so with not uWSGI but granian or gunicorn or uvicorn etc14:49
clarkbfrickler: so I think that solution is fine14:49
clarkbfungi: ^ fyi for when you return, can you rereview 944789? I think that solution is ok14:49
opendevreviewClark Boylan proposed opendev/system-config master: Fix nodepool image export cron  https://review.opendev.org/c/opendev/system-config/+/94501614:55
clarkbI think ^ should fix the tty issue14:55
clarkbfrickler: fungi: the other idea I had was maybe doing RUN assemble uWSGI || assemble uWSGI and see if a second pass at compiling makes the build happier since it wouldn't be starting from scratch? But then I remembered python tries to do isolated builds now so that may not be the case15:01
clarkband then as for why mailman is ok but we aren't I wonder if this is a glibc vs musl problem15:02
clarkbthat could also explain why uwsgi builds on developer arm apple laptops (though thats a huge assumption stretch)15:02
fungierrands were faster than anticipated, so i'll probably step out again for a quick lunch in a few after i catch back up15:15
fungiclarkb: i approved 944799 but frickler has a cleanup question in a comment there, just heads up15:16
fungi944789 lgtm too, approved15:16
fungiwhat were we using the uwsgi arm64 images for anyway? we don't run lodgeit on that arch15:18
clarkbfungi: frickler: I think we keep the images in place forever? I don't know that there is a good reason to remove them. I suspect removing them would only potentially cause problems15:19
clarkbfungi: we haven't/ don't use uwsgi on arm64 but the base images are built for both arches because it removes a step later if you do end up needing them for that arch15:20
clarkbin this case I think uwsgi is definitely a dead end so rather than land a new thing that uses uwsgi on arm I would suggest a different wsgi server15:20
fungimakes sense, but also seems fine to clean up15:20
fungiokay, gonna grab some lunch while those gate, but shouldn't be too long. bbiab15:25
opendevreviewMerged opendev/system-config master: Drop python3.10 container image builds  https://review.opendev.org/c/opendev/system-config/+/94479915:36
fungiokay, back, sorry for all the disappearances16:25
clarkbI had to recheck the container rebuild change there was a quay.io hiccup pulling the multiarch build deps16:27
clarkbI think that is something we can optimize out of the uwsgi job if it is a persistent problem as we aren't doing multiarch anymore16:27
fungii see that16:27
clarkbbut for now it shouldn't hurt anything I don't think16:28
fungiyeah, will keep an eye out for any recurrences16:28
clarkbhttps://review.opendev.org/c/opendev/system-config/+/945016 passes testing now too which should hopefully be the last fix for nodepool builder cron jobs16:28
clarkband now we seem to have hit docker rate limits. Maybe this is a recheck in a few hours situation16:36
fungiugh16:45
fungiwe're still incrementally getting away from those at least16:46
clarkbya this is the first one that has affected me in a while16:46
clarkbI'm starting to look at booting an nb07 noble node in osuosl for arm test image builds16:46
clarkband I think we need a new mirror there too. But one thing at a time16:49
clarkbhahahaha server does not support sse4_2. Ok I think we need to make that check conditional to x86 only16:57
clarkbI'll work on a patch after I check launch node cleaned up after itself this launch16:58
fungia very astute observation, arm processors do indeed not support intel/amd x86 processor flags16:58
fungibut we should make sure to adjust it in such a way that we don't need to revisit in, say, the hopeful future where we add a risc5 builder17:05
opendevreviewClark Boylan proposed opendev/system-config master: Only check sse4_2 support on x86_64  https://review.opendev.org/c/opendev/system-config/+/94502917:05
clarkbfungi: yup I think ^ avoids that problem17:05
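A minimal sketch of the approach discussed above, assuming the check reads flags from /proc/cpuinfo-style text: the sse4_2 test is applied only when the architecture is x86_64, so arm64 (or a future RISC-V builder) passes without a special case. The function names are hypothetical and this is not the code from change 945029.

```python
def needs_sse4_2_check(machine):
    """Only x86_64 hosts can meaningfully report x86 CPU flags."""
    return machine.lower() in ("x86_64", "amd64")


def cpu_flags(cpuinfo_text):
    """Collect the entries of every 'flags' line in cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def host_is_acceptable(machine, cpuinfo_text):
    """Require sse4_2 on x86_64; accept every other architecture as-is."""
    if not needs_sse4_2_check(machine):
        return True
    return "sse4_2" in cpu_flags(cpuinfo_text)
```

On a real host the inputs could come from `platform.machine()` and the contents of `/proc/cpuinfo`. Allow-listing x86_64 rather than deny-listing arm is the design choice that avoids revisiting the check when another architecture shows up.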
clarkbI'll relaunch when that has landed17:06
fungiyeah, lgtm. i assume you haven't tried it with that patch applied, i know we don't do any testing, but happy to approve further adjustments17:09
clarkbya I haven't. I could edit the file in the launcher venv if we want to see it in use17:10
clarkbbut seems safe enough to land and have ansible update the venv and then try17:10
fungii usually do that when fixing launch-node just because i'm lazy and expect that i'll push bad patches otherwise ;)17:10
clarkbI'll do that now then17:11
fungiit's not as if there's any production impact17:11
clarkbits running now but will take a few minutes to get to that check17:12
clarkbits doing stuff now so I think that fixed it17:15
fungicool, my approach with manually-invoked tools like launch-node is that it's fine to experiment "in production" and the whole reason to keep them in git is so that once i've worked out a fix nobody else should have to repeat the same research17:21
fungiwe have these processes to make things easier, not to get in our way17:21
clarkbit seems to have succeeded launching too. I'm going to double check the node then will get some changes up17:26
fungiawesome17:29
*** dhill is now known as Guest1174217:33
opendevreviewClark Boylan proposed openstack/project-config master: Add config for the new nb07 nodepool builder  https://review.opendev.org/c/openstack/project-config/+/94503417:35
opendevreviewClark Boylan proposed opendev/zone-opendev.org master: Add nb07 to DNS  https://review.opendev.org/c/opendev/zone-opendev.org/+/94503517:36
frickleris there a specific reason for us not to mirror the python images from docker.io, too?17:37
opendevreviewClark Boylan proposed opendev/system-config master: Add nb07 to the inventory  https://review.opendev.org/c/opendev/system-config/+/94503617:38
clarkbfrickler: we do mirror a subset (we've been adding them as we go)17:38
clarkbI think those three changes should get nb07 deployed17:38
fungi"as we go" means as we move services to ubuntu noble, where we can use a new enough toolchain to support using buildset and intermediate registries with quay17:43
frickleris nb03 gone? wondering because 945036 removes it from a list instead of nb04, but there still are e.g. dns records for it17:46
clarkboh yes I meant to make a note of that17:46
clarkbit is gone from our inventory. It was in the old linaro cloud17:47
fungiright, the entire provider is gone17:47
fungithat reference was just missed cleanup from months ago17:47
clarkbI posted some comments on that17:47
clarkbjust to remove confusion for now and the future17:47
fricklerdo you want to clean up dns for that in the same patch or follow-up?17:48
fungiagreed, that's yet still more missed cleanup17:48
clarkbI'll do a followup17:48
clarkbI just approved the existing dns change17:48
opendevreviewClark Boylan proposed opendev/zone-opendev.org master: Remove nb03 records  https://review.opendev.org/c/opendev/zone-opendev.org/+/94503817:49
clarkbthere17:49
opendevreviewMerged opendev/zone-opendev.org master: Add nb07 to DNS  https://review.opendev.org/c/opendev/zone-opendev.org/+/94503517:56
opendevreviewMerged opendev/zone-opendev.org master: Remove nb03 records  https://review.opendev.org/c/opendev/zone-opendev.org/+/94503818:00
opendevreviewMerged openstack/project-config master: Add config for the new nb07 nodepool builder  https://review.opendev.org/c/openstack/project-config/+/94503418:10
opendevreviewMerged opendev/system-config master: Only check sse4_2 support on x86_64  https://review.opendev.org/c/opendev/system-config/+/94502918:16
opendevreviewTristan Cacqueray proposed zuul/zuul-jobs master: Update the set-zuul-log-path-fact scheme to prevent huge url  https://review.opendev.org/c/zuul/zuul-jobs/+/92758218:16
opendevreviewTristan Cacqueray proposed zuul/zuul-jobs master: Add the build and tenant to the job header  https://review.opendev.org/c/zuul/zuul-jobs/+/94504218:16
clarkbdeps have merged for https://review.opendev.org/c/opendev/system-config/+/945036 should I approve it now? The alternative is waiting for me to eat lunch first if we are worried about it deploying and causing problems18:24
fungiapproved it18:28
opendevreviewTristan Cacqueray proposed zuul/zuul-jobs master: Update the set-zuul-log-path-fact scheme to prevent huge url  https://review.opendev.org/c/zuul/zuul-jobs/+/92758218:35
opendevreviewJames E. Blair proposed zuul/zuul-jobs master: Add upload-image-s3 role  https://review.opendev.org/c/zuul/zuul-jobs/+/94481318:57
opendevreviewJames E. Blair proposed opendev/system-config master: Add remaining clouds as zuul connections  https://review.opendev.org/c/opendev/system-config/+/94504919:18
opendevreviewMerged opendev/zuul-providers master: Add the DFW3 region for Rackspace Flex  https://review.opendev.org/c/opendev/zuul-providers/+/94310419:20
clarkbfungi: tonyb: have a moment for https://review.opendev.org/c/opendev/system-config/+/945016 ? I'd go ahead and approve it except I've already broken this particular thing so figure an extra set of eyeballs is worthwhile20:07
fungii approve of breaking it again20:09
fungioh, i meant to also workflow +1 that one, sorrt20:13
fungis/sorrt/sorry/20:13
clarkbno problem I got it20:14
clarkbmaybe tomorrow we can start landing some python312 updates if I can get the base images to update today (I just rechecked the change)20:14
*** elibrokeit__ is now known as elibrokeit20:15
fungithat would be great, yep20:16
clarkbthe nb07 change should land in the next little bit. Then we have about half an hour of deploying things. If that goes well I'll shutdown the builder on nb04 and put it in the emergency file then request image builds20:32
clarkbthat should force nb07 to start building things20:32
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Refactor raxflex labels  https://review.opendev.org/c/opendev/zuul-providers/+/94505220:33
fungicorvus: that ^ reminds me that i'm missing an essential morsel of understanding about niz... how are the label names mapped to image+flavor combinations in each provider? is there a separate file with the parameters?20:38
funginever mind! it's the labels.yaml file20:39
corvusfungi: it's done globally in labels.yaml; so globally we say that the "niz-ubuntu-noble-16GB" label always means the "ubuntu-noble" image and the "16gb" flavor20:39
fungii should have looked harder (or at all)20:39
corvusthen it's the definitions of what "ubuntu-noble" or "16gb" mean that are different on each provider20:39
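The two-level lookup corvus describes can be sketched like this. It is only an illustration of the idea, not the zuul-providers schema: the label-to-(image, flavor) mapping is global, while each provider supplies its own meaning for those image and flavor names. The provider names and all IDs below are hypothetical.

```python
# Global mapping: a label always means the same image and flavor *names*.
LABELS = {
    "niz-ubuntu-noble-16GB": {"image": "ubuntu-noble", "flavor": "16gb"},
}

# Per-provider definitions of what those names mean (hypothetical values).
PROVIDERS = {
    "provider-a": {
        "images": {"ubuntu-noble": "image-id-a"},
        "flavors": {"16gb": "flavor-a-16"},
    },
    "provider-b": {
        "images": {"ubuntu-noble": "image-id-b"},
        "flavors": {"16gb": "flavor-b-large"},
    },
}


def resolve(label, provider):
    """Return the provider-specific (image, flavor) pair for a label."""
    names = LABELS[label]
    definitions = PROVIDERS[provider]
    return (definitions["images"][names["image"]],
            definitions["flavors"][names["flavor"]])
```

The same label resolves to different concrete resources depending on the provider, which is the split fungi was looking for.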
fungiyep, perfect20:39
corvus(and, honestly, not sure where any of this will land with respect to the refactoring in 945052 -- i think we're in the "no wrong answers" phase of figuring this out :)20:41
fungifrom a model perspective i was just thrown because providers.yaml has the labels, flavors and images, but not the mapping which connects them20:41
fungibut that's just an organizational choice really20:41
corvusthere's also flavors.yaml and images.yaml with the global part of those definitions20:42
fungiright, those seem more like declarations of the names with little else, but i suppose they could later include additional parameters20:42
corvusso in providers.yaml, on the providers objects, we're still "attaching" those global image and flavor definitions20:42
corvusyep20:42
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Add rackspace classic to zuul-launcher  https://review.opendev.org/c/opendev/zuul-providers/+/94505520:44
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Add rackspace classic to zuul-launcher  https://review.opendev.org/c/opendev/zuul-providers/+/94505520:45
opendevreviewMerged opendev/zuul-providers master: Refactor raxflex labels  https://review.opendev.org/c/opendev/zuul-providers/+/94505220:46
clarkbthe reason we have certcheck complaining about ptg.o.o appears to be stale apache worker processes. I'm going to restart apache on that server to clear those out20:46
fungisgtm20:47
opendevreviewMerged opendev/system-config master: Add nb07 to the inventory  https://review.opendev.org/c/opendev/system-config/+/94503620:47
clarkbthats all done and now I'll pay attention to ^20:49
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Add Openmetal provider  https://review.opendev.org/c/opendev/zuul-providers/+/94505620:50
corvusopenmetal seems to have some opendev-specific flavors, and some generic ones.  i believe we are only using the single opendev flavor20:51
corvusi'm guessing we should make some extra flavors there?20:52
clarkbcorvus: ya I think the reason for that was the built in flavors didn't get the ratios right for us. but ya we have admin access in that cloud so can add flavors20:52
corvuscool, that seems like a contribution opportunity for anyone who might be interested in that :)20:54
clarkbcorvus: what disk memory vcpu do you think the 4gb and 16gb flavors should have?20:54
corvusif we're not limited by disk, then 80 for all would be good; if we are, i think it'd be okay to do 40/80/80.20:55
clarkbI think we're ok on disk. cpu is the major limit there20:55
corvusfor vcpu, i think 4/8/8 would be okay20:55
clarkback20:56
corvus(and i suppose if we end up not using all the cpu, we could increase the cpu for the 16gb, but it's not important, and i don't see it happening much elsewhere.  i'm not sure which of cpu/ram we hit first)20:57
clarkbcorvus: ok created. I added a new -8GB flavor too just to fit into the naming scheme20:59
corvusooh cool thanks!21:00
clarkbfor anyone wondering what the process was there I sshed to openmetal.us-east.opendev.org then looked in the kolla info for admin login creds then logged into horizon and used the flavor widget21:00
clarkbcorvus: I think your key isn't in place on those servers and I seem to recall that is because you weren't interested in admining the cloud. Totally fine if that is the case but I can add you if you like21:01
corvusack thanks, i'll let you know :)21:01
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Add OVH provider  https://review.opendev.org/c/opendev/zuul-providers/+/94505721:03
clarkbcorvus: there is a syntax error in the first change in the stack.21:05
corvusha of course there is :)21:05
clarkbgood opportunity to add the new flavors to openmetal config too :)21:05
corvusoh i think that "syntax error" won't get fixed until we restart the schedulers with the new config; i'll do that in a minute21:07
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Add Openmetal provider  https://review.opendev.org/c/opendev/zuul-providers/+/94505621:07
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Add OVH provider  https://review.opendev.org/c/opendev/zuul-providers/+/94505721:07
clarkback keep in mind there is a deployment going on that will eventually attempt to interact with the zuul schedulers (minimally the job will run but it should noop)21:07
corvusbut i updated anyway21:07
clarkbthis is the new arm builder deployment. It edited the inventory so all the deployment jobs are running21:08
corvusjust looking at the job list, are we sure we can't run more in parallel?21:09
corvus"can't" is the wrong word21:09
corvusi mean effectiveness21:09
fungiwe could, though when it reaches the letsencrypt job that has to complete before the remainder can start21:10
clarkbya we can. The two big holdups are the beginning with bootstrap-bridge and -base running serially. Then things run in parallel until letsencrypt which is another synchronization point. Then things run in parallel afterwards to a degree but there are some deps between services after le too21:10
clarkbhappy to bump to 6 and see how that does21:10
corvusyeah, it just got past LE, so it's doing the next 20 jobs 4 at a time21:10
fungigiven its runtime, we've seen the letsencrypt job go for a while by itself after all earlier jobs have completed21:10
corvus22 jobs21:11
clarkbsome of those can't be run in parallel But many can its like gitea, gerrit, manage projects, zuul ? in that order?21:11
clarkbbut still thats 1821:12
fungiright, increasing could still certainly speed up deploys for changes that touch the inventory, but those are generally a small subset of our deploy buildsets21:12
corvusit's rolling through them pretty quickly, so i get that it's not going to save a huge amount of time, but it just doesn't feel right.  :) 21:12
corvusi think i'm leaning slightly more toward "more" now :)21:12
fungii definitely don't object to increasing it, when we upped it to 4 i felt like we probably needed to observe it for a week before increasing further21:13
opendevreviewMerged opendev/system-config master: Fix nodepool image export cron  https://review.opendev.org/c/opendev/system-config/+/94501621:14
fungihaving resource consumption graphs to look at for bridge would also be helpful as we raise the parallelism, but spot checks suggest we have plenty of headroom from a memory and load average perspective when these happen21:14
clarkbI think you can ssh port forward and look at cacti still21:15
corvuslooks like the zuul.conf update is waiting on a deploy (maybe this one) so i'm waiting for the job21:15
clarkbbut I haven't tried21:15
corvusoh ha it hasn't been approved/merged21:15
corvushttps://review.opendev.org/94504921:15
fungiit has now!21:16
clarkbdouble approved21:16
fungidouble secret probation21:17
opendevreviewMerged opendev/system-config master: Rebuild our base python container images  https://review.opendev.org/c/opendev/system-config/+/94478921:18
clarkbwoah it merged21:19
clarkbservice-nodepool succeeded. nb04 has been put in the emergency file and I stopped its builder service21:21
clarkband nb07 is building debian bookwork21:21
clarkb*bookworm21:22
clarkbI think shutting down nb04 may have "orphaned" debian-bookworm-arm64-dd297fd35c2e44f2bba8711f6e522ed2 in a building state in the db21:23
clarkbbut I can clean that up later if it doesn't get auto cleared out as other things build21:24
clarkbI'll request a few other images build now too21:24
clarkbubuntu-noble-arm64 and rockylinux-9-arm64 were queued up21:25
fungithe inventory change deploy seems to have completed too21:31
clarkbyup hourlies are running, then the nodepool cron fix, then image promotion21:32
clarkbI kinda wonder if the hourly running now is running with 944789 (because we use master in periodic/hourly) and then when we go back to deploy we'll "revert" to 94501621:34
clarkbI'm looking on bridge to see if I can tell21:34
clarkbya we have fd8241286dbafee3528035997e01b060392dac22 checked out which is rebuild container images21:34
clarkbit doesn't matter in this case but something we should be careful about I guess21:34
clarkbI guess a fix for that is to stop using master but the git state when the buildset is enqueued?21:35
clarkband then everything should go in lockstep?21:35
clarkbimportantly none of the hourly jobs are really ones that will have problems with things going back and forth21:36
clarkbzuul is maybe the biggest one but it will just do something like add new project, remove new project, add new project which is "fine"21:36
clarkbsomething to think about21:36
clarkboh hrm with 945016 holding the "lock" system-config is still fd8241286dbafee3528035997e01b060392dac2221:38
clarkblooking at the log we don't reset to master in this job. We skip that task so prepare-workspace-git must be doing the work for us?21:40
clarkbI think this behavior is safe which is great. I just want to understand how it works21:40
clarkbok 944789 is running without infra-prod-bootstrap-bridge and has infra-prod-service-gitea and infra-prod-service-refstack enqueued21:43
clarkboh never mind, I just can't read21:44
clarkbhey the system is working thats great. My skepticism being disproven is also great :)21:44
clarkbI made some notes for digging into why things didn't rollback like I expected them to. I don't think I'm going to do that right now since that has brain melt potential. But I think I have enough notes to look back and see the order of operations in logs21:46
opendevreviewMerged opendev/system-config master: Add remaining clouds as zuul connections  https://review.opendev.org/c/opendev/system-config/+/94504922:42
clarkbcorvus: ^ that has deployed at this point23:02
corvusah cool, thx; i'll restart the scheds23:04
clarkbwow the new nb07 server has already built noble and bookworm23:04
clarkbit seems much quicker than before. Looking at cpuinfo it seems like the same hardware. I wonder if kernels and/or ubuntu have just gotten better on arm?23:05
clarkbanyway I'm happy with its progress. I'll request it does even more image builds now23:05
clarkbeverything but openeuler has been requested. Maybe by tomorrow they will have all built and we can clean up the old server too. That would be nice23:07
corvusrestarted schedulers, web, and launcher with the new config file23:31
