*** hamalq has quit IRC | 01:25 | |
*** sboyron has joined #opendev-meeting | 06:54 | |
*** hashar has joined #opendev-meeting | 06:57 | |
*** hashar_ has joined #opendev-meeting | 07:43 | |
*** hashar has quit IRC | 07:46 | |
*** hashar_ is now known as hashar | 07:51 | |
*** hashar has quit IRC | 09:20 | |
*** hashar has joined #opendev-meeting | 11:39 | |
*** hashar has quit IRC | 13:24 | |
*** hashar has joined #opendev-meeting | 15:32 | |
*** hamalq has joined #opendev-meeting | 16:15 | |
*** hamalq_ has joined #opendev-meeting | 16:19 | |
*** hamalq has quit IRC | 16:20 | |
*** hashar is now known as hasharDinner | 17:36 | |
*** diablo_rojo has joined #opendev-meeting | 18:52 | |
clarkb | anyone else here for the meeting? | 19:00 |
---|---|---|
fungi | yeah, more or less | 19:00 |
clarkb | #startmeeting infra | 19:01 |
openstack | Meeting started Tue Mar 30 19:01:18 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
*** openstack changes topic to " (Meeting topic: infra)" | 19:01 | |
openstack | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2021-March/000199.html Our Agenda | 19:01 |
diablo_rojo | o/ | 19:01 |
clarkb | I wasn't around last week, but will do my best :) feel free to jump in and help keep things going in the right direction | 19:01 |
ianw | o/ | 19:02 |
clarkb | #topic Announcements | 19:02 |
*** openstack changes topic to "Announcements (Meeting topic: infra)" | 19:03 | |
clarkb | I didn't have any. Do others? | 19:03 |
fungi | i don't think so | 19:03 |
fungi | gitea was upgraded | 19:03 |
fungi | keep an eye out for oddities? | 19:04 |
clarkb | ++ | 19:04 |
fungi | zuul was recently updated to move internal scheduler state into zookeeper | 19:04 |
fungi | keep an eye on that too | 19:04 |
clarkb | #topic Actions from last meeting | 19:05 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:05 | |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-23-19.01.txt minutes from last meeting | 19:05 |
clarkb | ianw had an action to start asterisk retirement. I saw an email to service-discuss about it. | 19:05 |
ianw | no response on that, so i guess i'll propose the changes soon | 19:05 |
clarkb | ianw do you want to keep the action around until the changes are up and/or landed? seems to be moving along at least | 19:06 |
ianw | sure, make sure i don't forget :) | 19:06 |
clarkb | #action ianw Propose changes for asterisk retirement | 19:06 |
clarkb | #topic Priority Efforts | 19:06 |
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" | 19:06 | |
clarkb | #topic OpenDev | 19:06 |
*** openstack changes topic to "OpenDev (Meeting topic: infra)" | 19:06 | |
clarkb | as mentioned we upgraded gitea from 1.13.1 to 1.13.6 | 19:07 |
clarkb | keep an eye out for weirdness. | 19:07 |
clarkb | Do we also want to reenable project description updates and see if 1.13.6 handles that better? or maybe get the token usage change in first? | 19:07 |
ianw | tokens seem to maybe isolate us from any future hashing changes, but either way i think we can | 19:08 |
clarkb | ianw: maybe I should push up the description update change again and then compare dstat results with and without the token use. | 19:09 |
clarkb | that should give us a good indication of whether 1.13.6 has improved hashing enough? | 19:09 |
fungi | maybe | 19:09 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/782887 | 19:09 |
fungi | it was never completely smoking gun that project management changes triggered the cpu load | 19:09 |
ianw | for anyone reading without context :) | 19:09 |
fungi | they would sometimes overload *a* gitea backend and the rest would be perfectly happy | 19:10 |
clarkb | ya I suspect it has to do with background load as well | 19:10 |
fungi | so if we want to experiment in that direction, we'll need to leave it in that state for a while and it's not a surety | 19:10 |
clarkb | due to the way we load balance we don't necessarily get a very balanced load | 19:10 |
clarkb | I also made some new progress on the gerrit account classification process before taking time off | 19:11 |
clarkb | if you can review groups in review:~clarkb/gerrit_user_cleanups/notes.20210315 and determine if they can be safely cleaned up like previous groups that would be great | 19:12 |
clarkb | I'll pick that up again once others have had a chance to cross check my work | 19:12 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/780663 more user auditing improvements | 19:12 |
clarkb | that is a related scripting improvement. Looks like I have one +2 so I may just approve it today | 19:12 |
clarkb | essentially I had the scripts collect a bunch of data into yaml then I could run "queries" against it to see different angles | 19:13 |
clarkb | the different angles are written down in the file above and can be cross checked | 19:13 |
clarkb | #topic Update Configuration Management | 19:14 |
*** openstack changes topic to "Update Configuration Management (Meeting topic: infra)" | 19:14 | |
clarkb | Any new config mgmt updates we should be aware of/review? | 19:14 |
fungi | i don't think so | 19:16 |
clarkb | #topic General Topics | 19:16 |
*** openstack changes topic to "General Topics (Meeting topic: infra)" | 19:16 | |
clarkb | #topic Server Upgrades | 19:16 |
*** openstack changes topic to "Server Upgrades (Meeting topic: infra)" | 19:16 | |
clarkb | I did end up completing the upgrades for zuul executors and mergers and nodepool launchers | 19:16 |
clarkb | That leaves us with the zookeeper cluster and the scheduler itself | 19:17 |
clarkb | I have started looking at the zk upgrade and writing notes on an etherpad | 19:17 |
clarkb | #link https://etherpad.opendev.org/p/opendev-zookeeper-upgrade-2021 | 19:17 |
clarkb | that etherpad proposes two options we could take to do the upgrade. If y'all can review it and make sure the plans are complete and/or express an opinion on which path you would like to take I can boot instances and keep pushing on that | 19:18 |
clarkb | #topic Deploy new refstack server | 19:20 |
*** openstack changes topic to "Deploy new refstack server (Meeting topic: infra)" | 19:20 | |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/781593 | 19:20 |
clarkb | this change merged yesterday. ianw should I go ahead and remove this item from the meeting agenda? | 19:20 |
ianw | yep, deployment job ran so i'm not aware of anything else to do there | 19:20 |
clarkb | cool I'll get that cleaned up | 19:21 |
clarkb | #topic PTG Planning | 19:22 |
*** openstack changes topic to "PTG Planning (Meeting topic: infra)" | 19:22 | |
clarkb | I did submit a survey and put us on the schedule last week | 19:22 |
clarkb | the event runs April 19-23 and I selected Thursday April 22 1400-1600UTC and 2200-0000UTC for us | 19:22 |
clarkb | the first time should hopefully work for those in EU timezones and the second for those in asia/pacific/australia | 19:23 |
clarkb | my thought on that was we could do office hours and try to help some of our new project-config reviewers get up to speed or help other projects with infra related items | 19:23 |
clarkb | if the times just don't work or you think we need more or less let me know. I indicated we may need to rearrange scheduling when I filled out the survey | 19:24 |
clarkb | #topic docs-old volume cleanup | 19:24 |
*** openstack changes topic to "docs-old volume cleanup (Meeting topic: infra)" | 19:24 | |
clarkb | not sure if this is still current but it was on the agenda so here it is :) | 19:25 |
ianw | oh it was from when i was clearing out space the other day | 19:25 |
ianw | do we still need docs-old? | 19:26 |
fungi | we do not | 19:26 |
clarkb | is docs-old where we stashed the really old openstack documentation so that it could be found if people have really old installations but otherwise wouldn't show up in google results? | 19:26 |
fungi | that was kept around for people to manually copy things from if we failed to rebuild them during the transition to zuul v3 | 19:27 |
fungi | i think anything we weren't actively building but was relevant was manually copied to the docs volume | 19:27 |
ianw | clarkb: yeah, it leaking into google via https://static.opendev.org/docs-old/ (which i guess has nothing to stop that) was a concern | 19:27 |
ianw | ok, well it sounds like i can remove it then | 19:28 |
fungi | we should probably add a robots.txt to exclude spiders from the whole static vhost | 19:28 |
clarkb | would it make sense to see if Ajaeger has an opinion? | 19:28 |
clarkb | since Ajaeger was pretty involved in that at the time iirc | 19:28 |
ianw | fungi: yeah, i can propose that. everything visible there should have a "real" front-end i guess | 19:29 |
clarkb | I don't have enough of the historical context to make a decision. I'll defer to others, but suggest maybe double checking with ajaeger if we can | 19:31 |
ianw | ok, i can ask, don't want to bother him with too much old cruft these days :) | 19:31 |
clarkb | ya I don't think ajaeger needs to help with cleanup or backups or anything, just indicate if he thinks any of it is worth saving | 19:32 |
clarkb | #topic planet.openstack.org | 19:32 |
*** openstack changes topic to "planet.openstack.org (Meeting topic: infra)" | 19:32 | |
clarkb | Another one I don't have a ton of background on but I see a retire it option and I like the sound of that >_> | 19:33 |
clarkb | looks like the aggregator software is not being maintained anymore which puts us in a weird spot doing server updates | 19:33 |
ianw | yeah, linux australia retired their planet which made me think of it | 19:33 |
fungi | i guess we should probably at least let the folks using it know somehow | 19:33 |
fungi | like make an announcement | 19:33 |
clarkb | ++ and probably send that one to openstack-discuss given the service utilization | 19:33 |
ianw | i did poke at aggregation software, i can't see any that look python3 and maintained | 19:34 |
fungi | i could get the foundation to include a link to the announcement in a newsletter | 19:34 |
clarkb | basically say the software is not maintained and we can't find alternatives. We will retire the service as a result. | 19:34 |
ianw | i thought we could replace it with a site on static that has an OPML of the existing blogs if we like | 19:34 |
ianw | these days, a RSS to twitter feed would probably be more relevant anyway | 19:34 |
fungi | or if the foundation sees benefit in it, they may have a different way they would want to do something similar anyway | 19:34 |
fungi | yeah | 19:34 |
fungi | microblogging sites have really become the modern blog aggregators anyway | 19:35 |
ianw | (i did actually look for an rss to twitter thing too, thinking that would be more relevant. nothing immediately jumped out, a bunch of SaaS type things) | 19:35 |
clarkb | ya twitter, hacker news, reddit etc seem to be the modern tools | 19:36 |
clarkb | and authors just send out links from their accounts on those platforms | 19:36 |
ianw | vale RSS, RIP with google reader | 19:36 |
ianw | maybe give me an action item to remember and i can send that mail and start the process | 19:37 |
clarkb | #action ianw Announce planet.o.o retirement | 19:38 |
ianw | i am old enough to remember when jdub wrote and released the original "planet" and we all thought that was super cool and created a bunch of planets | 19:38 |
clarkb | #topic Tarballs ORD replication | 19:39 |
*** openstack changes topic to "Tarballs ORD replication (Meeting topic: infra)" | 19:39 | |
ianw | ok, last one, again from clearing out things earlier in the week | 19:39 |
ianw | of the things we might want to keep if a datacentre burns down, i think tarballs is pretty much the only one not replicated? | 19:40 |
ianw | #link https://etherpad.opendev.org/p/gjzssFmxw48Nn3_SBVo6 | 19:40 |
ianw | that's the list | 19:40 |
ianw | docs is already replicated | 19:41 |
clarkb | ++ I think the biggest consideration has been that the vos release to a remote site of large sets of data isn't quick | 19:41 |
clarkb | I think tarballs is not as large as our mirrors but bigger than docs? | 19:41 |
clarkb | I also suspect that we can set it up and see how bad it is and go from there? | 19:41 |
fungi | yeah, in that ballpark | 19:41 |
fungi | also the churn is not bad as it's mostly append-only | 19:41 |
fungi | or at least that's the impression i have | 19:42 |
fungi | i guess we'll find out if that's really true | 19:42 |
ianw | yeah, i don't think it's day-to-day operation; just recovery situations | 19:42 |
ianw | which happen more than you'd hope | 19:42 |
ianw | but still, i'd hate to feel silly if something happened and we just didn't have a copy of it | 19:43 |
clarkb | ya I think this is the sort of thing where we can make the change, monitor it to see if it is unhappy and go from there | 19:44 |
ianw | ORD has plenty of space. we can always drop the RO there in a recovery situation i guess too, if we need | 19:44 |
ianw | alright, i'll set that up. lmn if you think anything else in that list is similar | 19:44 |
clarkb | I want to say the newer openafs version we upgraded to is better about higher latency links? | 19:44 |
ianw | apparently, but still there's only so fast data gets between the two when it's a full replication scenario | 19:44 |
clarkb | ianw: maybe do all the project.* volumes? | 19:45 |
clarkb | I think those host docs for various things like zuul and starlingx | 19:46 |
clarkb | mirror.* shouldn't matter and is likely to be the most impacted by latency | 19:46 |
ianw | yeah, probably a good idea. i can update the docs for volume creation because we've sometimes done it and sometimes not it seems | 19:46 |
clarkb | ++ | 19:46 |
fungi | sure, small volumes are probably good to mirror more widely if for no other reason than we can, and they're one less thing we might lose in a disaster | 19:47 |
ianw | yeah, it all seems theoretical, but then ... fires do happen! :) | 19:48 |
clarkb | indeed | 19:49 |
clarkb | #topic Open Discussion | 19:49 |
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" | 19:49 | |
clarkb | That was all on the published agenda | 19:49 |
ianw | i have a couple of easy ones from things that popped up | 19:49 |
clarkb | worth noting we think we have identified a zuul memory leak which is causing zk disconnects | 19:50 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/782868 | 19:50 |
ianw | stops dstat output to syslog | 19:50 |
clarkb | fungi was going to restart the scheduler to reset the leak and keep us limping along. corvus mentioned being able to actually debug tomorrow | 19:50 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/783120 | 19:50 |
ianw | puts haproxy logs into our standard container locations | 19:50 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/782898 | 19:50 |
clarkb | ianw: the dstat thing is unexpected but change lgtm | 19:51 |
ianw | allows us to boot very large servers when they are donated to us :) | 19:51 |
clarkb | ha on that last one | 19:51 |
fungi | yeah, we're a few minutes out from being able to restart the scheduler without worrying about openstack release impact | 19:52 |
fungi | i'm just waiting for one build to finish updating the releases site | 19:52 |
ianw | is it helpful to restart with a debugger or anything for the leak? | 19:52 |
fungi | oh, clarkb, that oddity we were looking at with stale gerritlib used in a jeepyb job? it happened again when i rechecked | 19:53 |
ianw | clarkb: yeah, i was like "i'm sure i provided a reasonable size for boot from volume ... is growroot failing, etc. etc." :) | 19:53 |
clarkb | ianw: I want to say we already have a hook to run profiling on object counts | 19:53 |
clarkb | ianw: but that is a good question and we should confirm with corvus before we restart | 19:53 |
corvus | i have not previously used a debugger when debugging a zuul memory leak; only the repl and siguser | 19:53 |
corvus | i'm always open to new suggestions on debugging memleaks though :) | 19:54 |
clarkb | seems like the repl stuff and getting object counts has been really helpful in the past at least | 19:54 |
clarkb | corvus: when I've tried in the past it's been "fun" to figure out adding debugging symbols and all that. I suspect that since we use a compiled python via docker that this may be even more fun? | 19:56 |
clarkb | we can't just install the debugger symbols package from debian | 19:56 |
clarkb | (sorting that out may be a fun exercise for someone with free time though as it may be useful generally) | 19:57 |
clarkb | sounds like this may be about it. I can end here and we can go have breakfast/lunch/dinner :) | 19:57 |
clarkb | thank you everyone! | 19:57 |
clarkb | #endmeeting | 19:57 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 19:57 | |
openstack | Meeting ended Tue Mar 30 19:57:31 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:57 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-30-19.01.html | 19:57 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-30-19.01.txt | 19:57 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-30-19.01.log.html | 19:57 |
fungi | thanks clarkb! | 19:58 |
diablo_rojo | thanks clarkb! | 19:58 |
*** hasharDinner has quit IRC | 20:20 | |
*** openstackstatus has quit IRC | 22:42 | |
*** openstack has joined #opendev-meeting | 22:43 | |
*** ChanServ sets mode: +o openstack | 22:43 | |
*** sboyron has quit IRC | 23:02 |