clarkb | meeting time! | 19:00 |
---|---|---|
clarkb | we will get started momentarily | 19:00 |
ianw | o/ | 19:00 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Sep 7 19:01:10 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2021-September/000281.html Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | I had nothing to announce | 19:01 |
fungi | ml upgrade | 19:01 |
clarkb | oh yup that is on the topic list but worth calling out here if people read the announcements and not the rest of the log. | 19:02 |
clarkb | lists.openstack.org will have its operating system upgraded September 12 beginning at 15:00UTC | 19:02 |
fungi | #link http://lists.opendev.org/pipermail/service-discuss/2021-September/000280.html Mailing lists offline 2021-09-12 for server upgrade | 19:03 |
fungi | i also sent a copy to the main discuss lists for each of the different mailman sites we host on that server | 19:03 |
clarkb | the lists.katacontainers.io upgrade seemed to go well and we've tested this on zuul test nodes as well as a snapshot of that server | 19:04 |
clarkb | should hopefully be a matter of answering questions for the upgrade system and checking things are happy after | 19:04 |
clarkb | #topic Actions from last meeting | 19:05 |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-08-31-19.01.txt minutes from last meeting | 19:05 |
clarkb | There were no actions recorded | 19:05 |
clarkb | #topic Specs | 19:05 |
clarkb | #link https://review.opendev.org/c/opendev/infra-specs/+/804122 Prometheus Cacti replacement | 19:05 |
clarkb | corvus: fungi: ianw: can I get reviews on this spec? I think it is fairly straightforward and approvable but wanted to make sure I got the details as others expected them | 19:06 |
clarkb | thank you tristanC and frickler for the reviews | 19:06 |
fungi | thanks for the reminder, i've starred it | 19:06 |
clarkb | #topic Topics | 19:07 |
clarkb | #topic lists.o.o operating system upgrade | 19:07 |
clarkb | as mentioned previously this is happening on September 12 at 15:00UTC | 19:07 |
clarkb | This upgrade will affect lists for openstack, opendev, airship, starlingx and zuul | 19:07 |
fungi | i also did some preliminary calculations on memory consumption for the lists.katacontainers.io server post-upgrade and it seems like it's not going to present any significant additional memory pressure at least | 19:08 |
clarkb | thank you for checking that. I plan to be around for the upgrade as well | 19:08 |
fungi | unfortunately i didn't check memory utilization pre-upgrade and we don't have that server in cacti, so no trending | 19:08 |
fungi | however i'm not super concerned that the lists.o.o server will be under-sized for the upgraded state | 19:09 |
clarkb | it is bigger than I had thought previously too which gives us more headroom than I expected :) | 19:09 |
fungi | after the upgrade is concluded, the openinfra foundation is interested in adding a lists.openinfra.dev site and moving a number of foundation-specific lists to that, so i'll pay close attention to the memory utilization post-upgrade to make sure that addition won't pose a resource problem | 19:10 |
fungi | (for those who aren't aware, our current deployment model uses 9 python processes for the various queue runners for each site) | 19:11 |
clarkb | I think that is about it for the lists upgrade. be aware of it and fungi and I will keep everyone updated as we go through the process | 19:11 |
clarkb | #topic Improving OpenDev's CD Throughput | 19:12 |
fungi | also once the ubuntu upgrade is done, i think we can start planning more seriously for containerized mailman 3 | 19:12 |
fungi | oops, sorry | 19:12 |
clarkb | fungi: ++ | 19:12 |
clarkb | no worries I think that is the next step for the mailman services | 19:12 |
clarkb | I haven't had time to dig into our jobs yet. Too many things kept popping up | 19:12 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/807672/ starts to sketch this out. | 19:12 |
clarkb | But ianw took a look yesterday | 19:12 |
clarkb | ianw: can you give us the high level overview of this change? It seems you've modified a few jobs then started working on pipeline updates? Looks like you're sketching stuff out and this isn't quite ready yet | 19:13 |
fungi | i've been sort of paying attention to what jobs are running on system-config changes now, and it still seems sane | 19:13 |
ianw | yeah i was going to draw graphs and things but i noticed a few things | 19:14 |
ianw | firstly the system-config-run and infra-prod stages are fairly different; in that for system-config-run you just include the letsencrypt playbook, while for prod you need to run the job first | 19:14 |
ianw | in short, i think we really just need to make sure things depend on either the base job, or the letsencrypt job, or their relevant parent (but there's only a handful of cases like that) | 19:15 |
ianw | i don't see why they can't run in parallel after that | 19:16 |
clarkb | cool. I also noticed you change how manage-projects runs a little bit. I believe we're primarily driving that from project-config today, but this has it run out of system-config more often? | 19:16 |
ianw | yeah, manage-projects all i did was put the file matchers into the job: rather than in the projects | 19:17 |
clarkb | ianw: ok, I think it is done that way because we run it from openstack/project-config and file matchers are different there? | 19:18 |
ianw | and also i think that should probably depend on infra-prod-review? as in if we've rolled out any changes to review we'd want them to merge before projects | 19:18 |
clarkb | that might need a little bit of extra investigating to understand how zuul handles that and whether it is appropriate for manage-projects | 19:18 |
clarkb | ianw: ++ | 19:18 |
ianw | oh; that could be, yep. something that probably wants a comment :) | 19:18 |
ianw | then i think infra-prod-bridge was another one i wasn't sure of in the build hierarchy | 19:19 |
clarkb | that helps me understand some of what is going on there. I can leave some comments after the meeting | 19:19 |
ianw | that pulls an updated system-config onto bridge; but i don't think that matters? everything runs on bridge, but via zuul-checkout? | 19:19 |
clarkb | infra-prod-bridge also configures other things on bridge like the ansible version iirc | 19:20 |
ianw | yeah, it was mostly a sketch, i see it syntax errored. but it suggested to me that we can probably tackle the issue with mostly just thinking about it and formatting things nicely in the file | 19:20 |
fungi | forcibly updating the checkout on bridge seems like the most sensible way to prevent accidental rollbacks from races in different pipelines too | 19:21 |
clarkb | I think that each job is using the checkout associated with its triggering change | 19:21 |
clarkb | there is an escape hatch in that task that checks if it is running in a periodic pipeline in which case it uses master instead | 19:22 |
clarkb | definitely seems unnecessary to do the checkout in a prior job | 19:22 |
fungi | ahh, okay, so we still need some mitigation if mutex prioritization is implemented (did that ever land?) | 19:22 |
clarkb | ya I'm still not sure if we decided if that was necessary or not. Going from change to periodic should be fine, but periodic to change may not be? | 19:23 |
clarkb | though if we prioritize the change pipeline then periodic to change would only happen when a new change arrives and should be safe | 19:23 |
fungi | oh, when you say "checks if it is running in a periodic pipeline in which case it uses master instead" you mean explicitly updates the checkout when the build starts rather than using the master branch state zuul associated with it when enqueued. yeah that should be good enough | 19:23 |
clarkb | so ya I think we're ok as long as deploy has a higher priority than the periodic pipelines | 19:24 |
clarkb | fungi: yes, that was my reading of it | 19:24 |
fungi | yes, i concur | 19:24 |
ianw | so should "infra-prod-bridge" be the base job? as in infra-prod-base <- infra-prod-bridge <- infra-prod-letsencrypt <- <most other jobs> | 19:24 |
fungi | i forgot we had already arrived at that conclusion | 19:24 |
fungi | ianw: that sounds great to me | 19:25 |
ianw | if we're thinking that say updating an ansible version on bridge should affect all following jobs | 19:25 |
clarkb | ianw: yes I think so but less for having system-config updates and more so that ansible and its config update before running more jobs | 19:25 |
ianw | these are all soft dependencies | 19:26 |
clarkb | that sounds right | 19:26 |
clarkb | we don't need -bridge to run if ansible isn't updating | 19:26 |
ianw | i assume they "pass upwards" correctly. so basically if there's no changes that match on the base/bridge for the change we're running, then everything will just fire in parallel because it knows we're good | 19:26 |
clarkb | that is my understanding of how the soft dependencies should work | 19:27 |
ianw | we may uncover deficiencies in our file matchers, but i think we just have to watch what runs and debug that | 19:27 |
clarkb | that all sounds good. I'll try to leave those comments on the change and we can continue to refine this in review. | 19:29 |
clarkb | Anything else on this subject? | 19:29 |
ianw | nope, not from me | 19:29 |
clarkb | #topic Gerrit Account Cleanups | 19:29 |
clarkb | I finalized the previous batch of conflict cleanups which leaves us with 33 conflicts | 19:30 |
clarkb | My intention with these is to find a morning or afternoon where I can start writing down a plan for each one then email the users directly with that proposal | 19:30 |
clarkb | Then assuming I get acks back I'll go ahead and start committing those fixes in a tmp checkout of All-Users on review02. | 19:30 |
fungi | is the list of those in your homedir on review.o.o? | 19:30 |
clarkb | I'll probably give users 2-3 weeks to respond and if they don't, go ahead with my plan for them as well. Importantly once we commit these last fixes we should be able to fix any account while gerrit is online by adding and removing commits to all-users that pass validations | 19:31 |
clarkb | fungi: yup all the logs and details are in the typical location including my most recent audit results | 19:31 |
clarkb | I'll probably reach out if I need help with planning for these users otherwise I'll start emailing people this week hopefully | 19:33 |
clarkb | is anyone interested in being CC'd on those comms? | 19:33 |
ianw | sure | 19:34 |
clarkb | thanks! | 19:35 |
clarkb | #topic OpenDev Logo Hosting | 19:35 |
clarkb | The changes to make the opendevorg/assets image a thing landed this morning and gitea redeployed using those builds | 19:35 |
clarkb | thank you ianw for working through this | 19:35 |
fungi | it's awesome | 19:35 |
fungi | truly | 19:35 |
clarkb | We do still need to update gerrit and paste to incorporate the new bits one way or another | 19:35 |
clarkb | with gerrit we currently bind mount the static content dir and could put the files in that location and serve them that way | 19:36 |
clarkb | I'm not sure what the best method for paste would be | 19:36 |
clarkb | ianw: ^ you might have thoughts on those services? | 19:36 |
ianw | i think the easy approach of pointing that at https://opendev.org/opendev/system-config/assets/ | 19:37 |
ianw | i can propose changes for them both | 19:37 |
clarkb | that works too, and thanks | 19:37 |
clarkb | certainly we can start there and that will be far more static for the gitea 1.15.x upgrade | 19:37 |
clarkb | Once this logo effort is done I'd like to see if we're happy enough with the state of things to do that gitea upgrade. I'll bring that up once logos are done | 19:38 |
clarkb | #topic Rebooting gitea servers for host migrations in vexxhost sjc1 | 19:38 |
fungi | 08 is already done, yeah? just batching up the rest and then doing the lb? | 19:39 |
clarkb | This is a last minute addition as mnaser is asking us to reboot gitea servers to cold migrate them to new hardware | 19:39 |
fungi | did the gerrit server already get migrated? | 19:39 |
clarkb | yup 08 is done. 06 and 07 are pulled from haproxy and ready to go. Note that mnaser needs to do the reboot/cold migration on his end as we cannot trigger it ourselves so I'm working with mnaser to turn things off in haproxy and then he can migrate | 19:40 |
clarkb | fungi: the gerrit server is/was already on new amd hardware and doesn't need this to happen | 19:40 |
clarkb | I think previously mnaser had asked about doing review not realizing it was on the new amd stuff already | 19:40 |
clarkb | in any case review wasn't on the list supplied this morning | 19:40 |
fungi | oh, cool, i remember him indicating some weeks back a need to migrate it, but maybe that was old info | 19:40 |
clarkb | I can double check with him when he gets back to the migrations | 19:41 |
clarkb | do we have any opinions on how to do the load balancer? Probably just do it this afternoon (relative to my time) if the gitea backends are happy with the moves? | 19:41 |
clarkb | that potentially impacts zuul jobs but zuul tends to be quieter during that time of day | 19:41 |
fungi | yeah, i mean we could try to pause all running jobs $somehow but honestly we warn projects not to have their jobs pull from gitea or gerrit anyway | 19:42 |
ianw | does zuul need a restart for anything? | 19:42 |
clarkb | ianw: there are a few changes that we could restart zuul for but the one change that I really wanted to get in isn't ready yet or wasn't last I checked | 19:43 |
ianw | i don't think the fix to the log buttons rolled out, and iirc corvus mentioned maybe we should just roll the whole thing | 19:43 |
clarkb | https://review.opendev.org/c/zuul/zuul/+/807221/ that change | 19:43 |
corvus | my bugfix merged with 2 others and has gotten big | 19:43 |
clarkb | I intend on rereviewing that change this afternoon | 19:43 |
corvus | i'm not feeling a huge need to restart right now | 19:43 |
clarkb | ok | 19:44 |
fungi | i think a quick reboot for the lb is probably fine whenever | 19:44 |
clarkb | In that case probably the easiest thing for the load balancer is to just go for it | 19:44 |
fungi | agreed | 19:44 |
clarkb | sounds good I'll continue to coordinate with mnaser on that and get this done | 19:44 |
ianw | i can do it in my afternoon if we like, make it even quieter | 19:44 |
clarkb | ianw: I think the problem with that is mnaser (or vexxhost person) has to do it | 19:45 |
* fungi | does not know how to make ianw's afternoon even quieter | 19:45 |
ianw | oh right, well let me know :) | 19:45 |
clarkb | I'm mostly managing the impact on our side but mnaser is pushing the cold migrate button | 19:45 |
ianw | fungi: you could come and cure covid and get my kids out of homeschool, that would help! :) | 19:45 |
fungi | d'oh! | 19:46 |
clarkb | #topic Open Discussion | 19:46 |
clarkb | That was it for the agenda. Anything else worth mentioning? | 19:46 |
fungi | opendev's testing and deployment is going to be featured in a talk at ansiblefest, for those who missed the announcement in other places | 19:47 |
fungi | #link https://events.ansiblefest.redhat.com/widget/redhat/ansible21/sessioncatalog/session/16248953812130016Yue | 19:49 |
fungi | registration is "free" for the virtual event, after you fill out 20 pages about how your company might be interested in ansible ;) | 19:50 |
corvus | i hope it's good :) | 19:50 |
corvus | i think it's about the coolest thing you can do with ansible so... | 19:51 |
fungi | it certainly is cool, i'll give you that | 19:51 |
corvus | sessions are also ~20m so hopefully shouldn't be a slog | 19:52 |
clarkb | oh I like that | 19:52 |
corvus | sept 29-30 | 19:52 |
clarkb | I think shorter works better for virtual | 19:52 |
corvus | yeah, they had some good info for speakers about exactly that | 19:52 |
corvus | like, your audience is in a different situation when virtual, so structure the talk a little differently | 19:53 |
fungi | that's helpful | 19:53 |
clarkb | Sounds like that may be it. | 19:56 |
clarkb | Thank you everyone! | 19:56 |
clarkb | See you here next week. Same time and location | 19:56 |
fungi | thanks clarkb! | 19:56 |
clarkb | #endmeeting | 19:56 |
opendevmeet | Meeting ended Tue Sep 7 19:56:39 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:56 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2021/infra.2021-09-07-19.01.html | 19:56 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2021/infra.2021-09-07-19.01.txt | 19:56 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2021/infra.2021-09-07-19.01.log.html | 19:56 |
clarkb | and now time for some lunch :) | 19:57 |