*** diablo_rojo_phon has joined #opendev-meeting | 07:59 | |
-openstackstatus- NOTICE: Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved. | 09:00 | |
*** ChanServ changes topic to "Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved." | 09:00 | |
-openstackstatus- NOTICE: Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved. | 09:11 | |
*** ChanServ changes topic to "Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved." | 09:11 | |
*** ChanServ changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 12:23 | |
-openstackstatus- NOTICE: Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC. | 12:23 | |
*** diablo_rojo has joined #opendev-meeting | 16:41 | |
clarkb | anyone else here for the meeting? we will get started shortly | 18:59 |
ianw | o/ | 19:00 |
clarkb | #startmeeting infra | 19:01 |
openstack | Meeting started Tue Apr 28 19:01:11 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
*** openstack changes topic to " (Meeting topic: infra)" | 19:01 | |
openstack | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2020-April/000011.html Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
*** openstack changes topic to "Announcements (Meeting topic: infra)" | 19:01 | |
clarkb | For the OpenDev Service Coordinator I said we would wait for volunteers until the end of the month. That gives you a few more days if interested :) | 19:02 |
mordred | o/ | 19:02 |
fungi | how many people have volunteered so far? | 19:02 |
clarkb | fungi: I think only me with my informal "I'm willing to do it" portion of the message | 19:02 |
clarkb | I figured I would send a separate email thursday if no one else did first :) | 19:03 |
fungi | thanks! | 19:03 |
diablo_rojo_phon | o/ | 19:03 |
clarkb | #topic Actions from last meeting | 19:04 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:04 | |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.txt minutes from last meeting | 19:04 |
clarkb | There were no actions recorded last meeting. Why don't we dive right into all the fun ansible docker puppet things | 19:04 |
clarkb | #topic Priority Efforts | 19:04 |
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" | 19:04 | |
fungi | why don't we? | 19:04 |
clarkb | #topic Update Config Management | 19:04 |
*** openstack changes topic to "Update Config Management (Meeting topic: infra)" | 19:04 | |
fungi | oh, i see, that was a rhetorical question! ;) | 19:04 |
clarkb | fungi: :) | 19:04 |
clarkb | mordred: first up on this agenda we've got dockerization of gerrit. Are we happy with how that has gone and can we remove that agenda item now? | 19:05 |
clarkb | mordred: I think all of the outstanding issues I knew about were addressed | 19:05 |
fungi | we do seem to be caught back up since friday's unanticipated excitements | 19:05 |
fungi | and now we're much further along with things too | 19:06 |
corvus | so i guess that's done and we could start thinking about the upgrade to 2.16? | 19:06 |
mordred | WELLL | 19:06 |
mordred | there's still a cleanup task | 19:06 |
mordred | which is gerritbot | 19:06 |
mordred | I don't want to forget about it | 19:06 |
fungi | oh, right, tied in with the eavesdrop dockering | 19:07 |
mordred | but we've got eavesdrop split into a playbook and are containering accessbot on eavesdrop now | 19:07 |
mordred | as well as a gerritbot container patch up | 19:07 |
mordred | oh - we landed that | 19:07 |
mordred | cool - I wanna get that bit finished | 19:07 |
mordred | but then, yes, I agree with corvus - next step is working on the 2.16 upgrade | 19:08 |
clarkb | sounds like we are very close | 19:08 |
mordred | yeah. I think we can remove the agenda item - gerritbot is more just normal work | 19:08 |
fungi | though the gerrit upgrade becomes an agenda item | 19:09 |
clarkb | Next up is the Zuul driven playbooks. Last Friday we dove head first into running all of zuul from ansible and most of zuul with containers | 19:09 |
mordred | yeah - zuul-executor is still installed via pip - because we haven't figured out docker+bubblewrap+afs yet | 19:10 |
clarkb | #link https://review.opendev.org/#/c/724115/ Fix for zuul-scheduler ansible role | 19:10 |
mordred | everything else is in containers - which means everything else is now running python 3.7 | 19:10 |
clarkb | I found a recent issue with that that could use some eyeballs (looks like testing failed) | 19:10 |
fungi | we sort of didn't anticipate the merged change going through and changing uids/gids on servers, which turned it into a bit of a fire drill, but the outcome is marvellous | 19:10 |
clarkb | related to this is work to do the same with nodepool services | 19:11 |
mordred | clarkb: oh - that's the reason I didn't have the value set in the host_vars | 19:11 |
clarkb | mordred: does ansible load from the production value and override testing? | 19:11 |
clarkb | for nodepool services I think we are sorting out multiarch container builds as we have x86 and arm64 nodepool builders | 19:13 |
mordred | clarkb: I think so | 19:13 |
mordred | we're VERY close to having multiarch working | 19:13 |
mordred | https://review.opendev.org/#/c/722339/ | 19:14 |
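The multi-arch work referenced here boils down to publishing images for both x86_64 and arm64 under a single tag. The Zuul jobs under review wrap this in their own roles, but the underlying idea resembles a docker buildx invocation; this is a sketch only, and the image name below is illustrative rather than the real one.

```
# Sketch: build and push a manifest covering both architectures.
# Requires a buildx builder with qemu/binfmt emulation for the non-native arch.
docker buildx create --use
docker buildx build \
    --platform linux/amd64,linux/arm64 \
    --tag registry.example.org/nodepool-builder:latest \
    --push .
```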
clarkb | then once that is done and deployed I'd like to land a rebase of https://review.opendev.org/#/c/722394/ to make it easier to work with all of our new jobs in system-config | 19:14 |
corvus | yes please :) | 19:14 |
mordred | ++ | 19:14 |
mordred | https://review.opendev.org/#/c/724079/ <-- we need this to work with the multi-arch job on an ipv6-only cloud | 19:15 |
ianw | ^ that has an unknown configuration error? | 19:16 |
mordred | ianw: AROO | 19:18 |
corvus | maybe that's the long-standing bug we've been trying to track down | 19:19 |
corvus | i can take a look after mtg | 19:19 |
clarkb | The other thing I wanted to call out is that we are learning quite a bit about using Zuul for CD | 19:20 |
ianw | apropos the container services, i don't see any reason not to replace nb01 & nb02 with container based versions now? | 19:21 |
clarkb | for example I think we've decided that periodic jobs should pull latest git repo state rather than rely on what zuul provided | 19:21 |
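As a minimal sketch of what "pull latest git repo state" could look like at the start of a periodic deploy playbook: the parameters below are real Ansible git-module options, but the checkout path is illustrative and this is not necessarily how the production playbooks implement it.

```yaml
# Hypothetical task: refresh the checkout at the start of a periodic run
# instead of relying on the repo state Zuul prepared when the job was enqueued.
- name: Update system-config to latest master
  git:
    repo: https://opendev.org/opendev/system-config
    dest: /opt/system-config    # illustrative path
    version: master
    force: yes
```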
fungi | i like to think we're improving zuul's viability for cd | 19:21 |
clarkb | ianw: ++ | 19:21 |
clarkb | if you notice irregularities in playbook application to production please call it out. Because as fungi points out I think we are improving things by learning here :) | 19:21 |
ianw | i can do that and move builds there | 19:21 |
ianw | (new builders, i mean) | 19:21 |
fungi | at the very least we're becoming an early case study on complex cd with zuul | 19:22 |
fungi | (spoiler, it's working remarkably well!) | 19:22 |
clarkb | anything else on the subject of config management and continuous deployment? | 19:23 |
mordred | oh - I'm testing focal for zuul-executors | 19:23 |
mordred | https://review.opendev.org/#/c/723528/ | 19:23 |
fungi | it's due to be officially released next week, right? | 19:23 |
mordred | once that's working - I wanna start replacing ze*.openstack.org with ze*.opendev.org on focal | 19:24 |
mordred | it's already released | 19:24 |
fungi | oh! | 19:24 |
fungi | i'm clearly behind on the news | 19:24 |
fungi | time has become more of an illusion than usual | 19:24 |
ianw | hrm, so for new builders, such as nb0X ... focal too? | 19:25 |
fungi | might as well | 19:25 |
ianw | s/builders/servers ? | 19:25 |
clarkb | my only concern at this point with it is that major services like mysql crash on it | 19:25 |
ianw | that will add some testing updates into the loop, but that's ok | 19:26 |
mordred | clarkb: "awesome" | 19:26 |
clarkb | we should carefully test that things are working as we put focal into production | 19:26 |
mordred | ++ | 19:26 |
fungi | in theory anything we can deploy on bionic we ought to be able to deploy on focal, bugs like what clarkb mentions aside | 19:26 |
fungi | stuff that's still relying on puppet is obviously frozen in the xenial past | 19:26 |
ianw | i doubt system-config testing will work on it ATM for testing infrastructure reasons ... i'm working on it | 19:27 |
clarkb | less and less stuff is relying on puppet though | 19:27 |
mordred | ianw: we could also go ahead and just do bionic for builders and get that done | 19:27 |
ianw | particularly the ensure-tox role and it installing as a user | 19:27 |
mordred | ianw: since those are in containers - I mostly wanted to roll executors out on focal so that we could run all of zuul on the same version of python | 19:27 |
ianw | ... yes pip-and-virtualenv is involved somehow | 19:28 |
mordred | ianw: the focal test nodes are actually going ok for the run-service-zuul test job fwiw | 19:28 |
ianw | mordred: hrm, perhaps it's mostly if we were to have a focal bridge.o.o in the mix in the test suite, where testinfra is run | 19:29 |
mordred | yeah - for things where we're running our ansible, the lack of pre-installed stuff is good, since we install everything from scratch anyway | 19:30 |
clarkb | Sounds like that may be it for this topic | 19:31 |
clarkb | #topic OpenDev | 19:32 |
*** openstack changes topic to "OpenDev (Meeting topic: infra)" | 19:32 | |
clarkb | Another friendly reminder to volunteer for service coordinator if interested | 19:32 |
clarkb | On a services front we upgraded gitea thursdayish then had reports of failed git clones over the weekend | 19:32 |
clarkb | "thankfully" it seems that is a network problem and unrelated to our upgrade | 19:32 |
mordred | \o. | 19:33 |
clarkb | citycloud kna1 (and kna3?) was losing packets sent to vexxhost sjc1 | 19:33 |
mordred | I mean | 19:33 |
mordred | \o/ | 19:33 |
clarkb | fungi was able to track that down using our mirror in kna1 to reproduce the user reports | 19:33 |
clarkb | and from there we did some traceroutes and passed that along to the cloud providers | 19:33 |
clarkb | something to be aware of if we have more reports of this. Double checking the origin is worthwhile | 19:33 |
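The kind of reproduction described above can be run directly from the affected region's mirror; a sketch using mtr, which combines traceroute and ping and reports per-hop packet loss. The target hostname is illustrative.

```
# Run from the mirror in the affected region; the target is illustrative.
# --report runs a fixed number of cycles and prints per-hop loss statistics.
mtr --report --report-cycles 100 opendev.org
```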
clarkb | Also lists.* has been OOMing daily between 1000 and 1200 UTC | 19:34 |
fungi | yeah, that one's not been so easy to correlate | 19:34 |
clarkb | fungi has been running dstat data collection to help debug that and I think the data shows it isn't bup or mailman. During the period of sadness we get many listinfo processes | 19:34 |
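A hypothetical dstat invocation of the sort described, sampling periodically and logging to a file so the window around each OOM can be examined afterwards; the interval and output path are illustrative.

```
# Sample cpu/memory/swap plus the top memory consumer every 60 seconds
# and keep a CSV to correlate with the OOM window; the path is illustrative.
dstat --time --cpu --mem --swap --top-mem --output /var/log/dstat-lists.csv 60
```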
fungi | i think you're on to something with the semrush bot in the logs | 19:34 |
clarkb | those listinfo processes are started by apache to render webpage stuff for mailman, and correlating with the logs we have a semrush bot hitting us during every OOM I've checked | 19:35 |
clarkb | I've manually dropped in a robots.txt file to tell semrushbot to go away | 19:35 |
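The manually dropped-in robots.txt presumably amounts to something like the following; the exact contents used on lists.* are not confirmed here.

```
User-agent: SemrushBot
Disallow: /
```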
fungi | though ultimately, this probably means we should eventually upgrade the lists server to something with a bit more oomph | 19:35 |
clarkb | I've also noticed a "The Knowledge AI" bot but it doesn't seem to show up when things are sad | 19:35 |
fungi | or tune apache to not oom the server | 19:35 |
clarkb | fungi: and maybe even both things :) | 19:35 |
clarkb | but ya if the robots.txt "fixes" things I think we can encode that in puppet and then look at tuning apache to reduce the number of connections? | 19:36 |
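The apache tuning being floated would cap how many requests apache serves concurrently, which in turn bounds the number of listinfo CGI processes; a sketch assuming the prefork MPM, with the numbers purely illustrative since sensible limits depend on per-process memory use.

```apache
# Illustrative limits only; size these against available RAM per worker.
<IfModule mpm_prefork_module>
    ServerLimit             20
    MaxRequestWorkers       20
    MaxConnectionsPerChild  1000
</IfModule>
```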
clarkb | #topic General Topics | 19:37 |
fungi | i think so, yes | 19:37 |
*** openstack changes topic to "General Topics (Meeting topic: infra)" | 19:37 | |
clarkb | A virtual PTG is planned for the beginning of june | 19:38 |
clarkb | I've requested these time blocks for us: Monday 1300-1500 UTC, Monday 2300-0100 UTC, Wednesday 0400-0600 UTC | 19:38 |
clarkb | fungi: have you been tracking what registration and other general "getting involved" requires? | 19:38 |
fungi | not really | 19:39 |
fungi | i mean, i understand registration is free | 19:39 |
clarkb | k I'll try to get more details on that so that anyone interested in participating can do so | 19:40 |
clarkb | (I expect it will be relatively easy compared to typical PTGs) | 19:40 |
fungi | but folks are encouraged to register so that 1. organizers can have a better idea of what capacity to plan for, and 2. to help the osf meet legal requirements for things like code of conduct agreement | 19:40 |
fungi | there's also discussion underway to do requirements gathering for tools | 19:41 |
fungi | i can find an ml archive link | 19:41 |
clarkb | thanks! | 19:42 |
corvus | should we do anything with meetpad? | 19:43 |
fungi | #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014367.html PTG Signup Reminder & PTG Registration | 19:43 |
fungi | #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014481.html Virtual PTG Tooling Requirements & Suggestions | 19:44 |
clarkb | I think we should keep pushing on meetpad. Last I checked the deployment issues were addressed. Should we plan another round of test calls? | 19:44 |
corvus | is it generally working now? does it need more testing? are there bugs we should look at? does it meet any of the requirements the foundation has set? | 19:44 |
fungi | there's an etherpad in that second ml archive link where a list of requirements is being gathered | 19:45 |
clarkb | I don't know if it's working -> more testing is probably a good idea. One outstanding issue is that the http to https redirect is still missing iirc | 19:45 |
corvus | i noticed there was a bunch of 'required' stuff on https://etherpad.opendev.org/p/virt-ptg-requirements | 19:45 |
corvus | clarkb: ah, yeah forgot about that. i can add that real quick. | 19:45 |
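The missing http-to-https redirect is a small piece of web server configuration; a sketch assuming an Apache vhost in front of jitsi-meet, though the actual meetpad frontend may be set up differently.

```apache
<VirtualHost *:80>
    ServerName meetpad.opendev.org
    # Send all plain-http requests to the https site
    Redirect permanent / https://meetpad.opendev.org/
</VirtualHost>
```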
fungi | probably one of the harder things to meet there is "Legal/Compliance Approval (e.g. OSF Privacy Policy, OSF Code of Conduct, GDPR)" | 19:46 |
corvus | i don't even know what that means | 19:46 |
fungi | but i don't really know, maybe that's easy | 19:46 |
clarkb | corvus: it might be a good idea to get meetpad to a place where we are generally happy with it, then we can put it in front of the osf and ask them for more details on those less explicit requirements? | 19:47 |
fungi | i think it's supposed to mean that the osf privacy policy is linked from the service in an easy to find way, that the service complies with the gdpr (that's vague too), and that something enforces that all people connecting agree to follow the code of conduct | 19:47 |
fungi | but yes, getting clarification on those points would be good | 19:48 |
fungi | also we've (several of us) done our best to remind osf folks that people will use whatever tools they want at the end of the day, so it may be a matter of legal risks for the osf endorsing specific tools vs just accepting their use | 19:49 |
clarkb | I do think on paper jitsi meets the more concrete requirements, with the possible exception of the room size (depends on whether or not we can bump that up?) | 19:49 |
corvus | is there a limit on room size? | 19:49 |
clarkb | corvus: did you say it is a limit of 35? I thought someone said that | 19:49 |
corvus | i thought i read about hundreds of people in a jitsi room | 19:49 |
clarkb | but there was some workaround for that | 19:49 |
fungi | i think that one was more of "the service should still be usable for conversation when 50 people are in the same room" | 19:50 |
clarkb | https://community.jitsi.org/t/maximum-number-of-participants-on-a-meeting-on-meet-jit-si-server/22273 | 19:50 |
clarkb | maybe the 35 number came from something like that | 19:50 |
fungi | (noting that 10 people talking at once is tough to manage, much less 50, and that has little to do with the tools) | 19:50 |
clarkb | also those numbers may be specific to the meet.jit.si deployment (and we can tune ours separately?) | 19:50 |
fungi | also osf has concerns that whatever platforms are endorsed have strong controls allowing rooms to be moderated, people to be muted by moderators, and abusive attendees to be removed reliably | 19:51 |
corvus | yeah, sounds like there may be issues with larger numbers of folks | 19:51 |
clarkb | in any case I think step 0 is getting it to work in our simpler case with the etherpad integration | 19:51 |
clarkb | I think we haven't quite picked that up again since all the etherpad and jitsi updates, so it's worth retesting and seeing where it is at now | 19:52 |
corvus | k, i'll do the http redirect and ping some folks maybe tomorrow for testing | 19:52 |
clarkb | thanks! | 19:52 |
clarkb | fungi any wiki updates? | 19:52 |
fungi | none, the most i find time for is patrolling page edits from new users | 19:53 |
fungi | (and banning and mass deleting the spam) | 19:53 |
clarkb | #topic Open Discussion | 19:53 |
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" | 19:53 | |
clarkb | That takes us to the end of our agenda | 19:53 |
clarkb | As a quick note my ISP has been sold off and acquired by a new company. That transition takes effect May 1st (Friday). I don't expect outages but seems like chances for them are higher under those circumstances | 19:54 |
corvus | clarkb: i hear the internet is gonna be big | 19:54 |
clarkb | corvus: do you think we could sell books over the internet? | 19:55 |
ianw | if i could get an eye on | 19:55 |
ianw | #link https://review.opendev.org/#/c/723309/ | 19:56 |
ianw | that is part of the pip-and-virtualenv work to add an ensure-virtualenv role for things that actually require virtualenv | 19:56 |
ianw | dib is one such thing, and this gets the arm64 builds back to testing; we have dropped pip-and-virtualenv from them | 19:57 |
clarkb | ianw: I'll take a look after lunch if it is still up there then | 19:57 |
ianw | #link http://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-28.log.html#t2020-04-28T04:47:00 | 19:58 |
ianw | is my near term plan for that work | 19:58 |
fungi | i have a very rough start on the central auth spec drafted locally, though it's still a patchwork of prose i've scraped from various people's e-mail messages over several years so i'm trying to find time to wrangle that into something sensible and current | 19:58 |
fungi | and i've been fiddling with a new tool to gather engagement metrics from gerrit (and soon mailman pipermail archives, meetbot channel logs, et cetera) | 19:59 |
fungi | trying to decide whether i should push that up to system-config or make a new repo for it | 19:59 |
clarkb | fungi: new repo might be worthwhile. Thinking out loud here: the zuul work in system-config is really orienting it towards deploying tools rather than defining the tools themselves? | 20:00 |
fungi | yeah | 20:00 |
fungi | i concur | 20:00 |
fungi | i should make it an installable python project | 20:00 |
clarkb | and we are at time | 20:01 |
clarkb | thank you everyone@ | 20:01 |
clarkb | er ! | 20:01 |
clarkb | #endmeeting | 20:01 |
fungi | thanks clarkb! | 20:01 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 20:01 | |
openstack | Meeting ended Tue Apr 28 20:01:17 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:01 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-28-19.01.html | 20:01 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-28-19.01.txt | 20:01 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-28-19.01.log.html | 20:01 |
*** diablo_rojo has quit IRC | 20:39 | |
*** diablo_rojo has joined #opendev-meeting | 20:42 |