Tuesday, 2020-04-21

clarkbWe will get started on the meeting shortly18:59
clarkbanyone else here for the meeting? seems like it has been a busy distracting day19:00
fungii will busily switch to meeting mode19:00
clarkb#startmeeting infra19:01
openstackMeeting started Tue Apr 21 19:01:06 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
*** openstack changes topic to " (Meeting topic: infra)"19:01
openstackThe meeting name has been set to 'infra'19:01
mordredo/19:01
clarkb#link http://lists.opendev.org/pipermail/service-discuss/2020-April/000010.html Our Agenda19:01
ianwo/19:01
clarkb#topic Announcements19:01
*** openstack changes topic to "Announcements (Meeting topic: infra)"19:01
zbro/19:01
clarkbI wanted to call out here that splitting opendev into its own comms channels seems to be working for getting more people to engage19:02
AJaegero/19:02
clarkbwelcome! to all those people (not sure if any are here in this channel now but we've seen more traffic on the mailing list)19:02
fungiwe're up to 80 nicks currently in the #opendev channel19:03
fungi(still a far cry from the 250+ in #openstack-infra, but many of those may be zombies for all intents and purposes)19:03
clarkb#topic Actions from last meeting19:04
fungialso 20 subscribers to service-discuss and 25 to service-announce19:04
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)"19:04
clarkb#link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-14-19.01.txt minutes from last meeting19:04
clarkbthere were no actions.19:04
clarkb#topic Priority Efforts19:04
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)"19:04
clarkb#topic Update Config Management19:05
*** openstack changes topic to "Update Config Management (Meeting topic: infra)"19:05
clarkbmaybe mordred can update on his activity here then ianw?19:05
clarkbmaybe we lost mordred19:07
clarkbmy understanding of it is that we've continued to push towards zuul driven CD of things19:07
clarkbin particular we are now looking at cleaning up puppetry as and where necessary19:07
corvusdoes anyone know what the status of containerized zuul is?19:07
mordredheya - sorry19:08
mordredyes!19:08
mordredso - three things going on19:08
corvus(i'd like to proceed with the tls work, so catching up on that would be helpful for me)19:08
mordredfirst - I'm still working through the followup from the gerrit rollout - next on that list is gerritbot - this led me to eavesdrop which has turned in to reorganizing how we run puppet a bit19:09
mordredso - sorry for that rabbithole - but I think it'll be worth it19:09
mordredhttps://review.opendev.org/#/q/topic:puppet-apply-jobs19:09
mordredthat's the topic related to that19:09
mordredsecond and third are nodepool and zuul19:09
mordredhttps://review.opendev.org/#/q/topic:container-zuul19:10
corvus(i think that rabbit hole -- getting the puppet jobs down to size -- is great and worth it)19:10
mordrednodepool-launcher is ready to go and I think now safe to land: https://review.opendev.org/#/c/720527/19:10
mordredit won't restart containers in prod, so I think we can land it then do a manual rolling restart of the launchers19:10
mordredif people are happy with what we did there with starting vs. not starting docker-compose ... I can apply the same thing to the zuul patch:19:11
mordredhttps://review.opendev.org/#/c/717620/19:11
mordred(issue being we don't necessarily want ansible to run docker-compose up every time it runs - but we DO want that to happen in the gate)19:12
mordredI believe once I update that patch with the start boolean - it'll also be ready to go19:12
mordredand I think also safe to land19:12
mordredbut - since that's nodepool and zuul - please review with an eye to "is this safe to land"19:12
corvuswe probably could start nodepool-launcher every time19:13
mordredcorvus: maybe we land first with nothing starting - because we have to stop the systemd stuff ...19:13
corvusyeah19:13
corvusi'm okay with starting conservative there19:13
mordredand then land a patch to flip the var on the things where we're happy to do it every time19:13
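The "start" boolean discussed above can be sketched roughly as follows. This is only an illustration of the idea, not the actual system-config role: the function name, the `COMPOSE_FILE` path, and the choice to always pull images are assumptions made for the example. The point is that a deployment run only brings containers up when the flag is set, so production can default to off while gate jobs turn it on.

```python
from typing import List

# Hypothetical path, used only for illustration.
COMPOSE_FILE = "/etc/nodepool-docker/docker-compose.yaml"


def compose_commands(start_containers: bool) -> List[List[str]]:
    """Return the docker-compose commands one deployment run would execute.

    Images are always pulled (an assumption for this sketch); containers are
    only (re)started when start_containers is True.
    """
    base = ["docker-compose", "-f", COMPOSE_FILE]
    commands = [base + ["pull"]]
    if start_containers:
        # "up -d" creates or recreates the containers and detaches.
        commands.append(base + ["up", "-d"])
    return commands


# Production default: do not touch running services.
print(compose_commands(start_containers=False))
# Gate/test jobs: actually start the service so it can be exercised.
print(compose_commands(start_containers=True))
```

With the flag off, operators would still do the manual rolling restart mentioned above; a follow-up change could flip the flag for services where restarting on every run is acceptable.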
corvuswhat's the thinking on nodepool builders?  i didn't get the full story yesterday19:14
fungialso ianw discovered that debootstrap (used by dib to make debian/ubuntu images) needs a couple of patches to work from a container, so has published a custom build of it in a ppa and confirmed that's working from a container19:14
corvus(is nb04 broken? or what?)19:14
mordredianw is further with diagnosing the issue - I think he's got a working build19:14
fungiwe debated switching to something newer like mmdebstrap, but don't want dib to break for users of older platforms where those newer tools aren't yet shipped as part of the distro19:14
mordredbut it involves two unlanded merge requests19:14
mordredcorvus: nb04 is broken for debuntu builds19:14
mordredso they have been removed from it19:14
ianwyes, a few things in progress19:14
mordredoh good - it's ianw19:14
clarkbspecifically because debootstrap in docker containers explodes the next thing that runs in the container?19:15
ianwyes, it likes to unmount /proc19:15
fricklerhttps://review.opendev.org/72139419:15
corvusit sounds like we can't really run our builders or executors in containers at the moment19:15
corvusi'm a little worried that the zuul tls work is starting to collide with this19:15
corvusthe zuul patch has the executors running outside of containers19:16
corvusshould we rethink what we're doing with the builders?  or can we get them into a consistent state soon?19:16
mordredcorvus: well - I think we can get to full ansible19:16
ianwi am working to get our dib functional tests converted to building from the container19:16
fungiit sounds like we should be able to run builders from containers with a patched debootstrap19:16
mordredcorvus: which would be the part of the story that would most impact tls work, yes?19:17
corvusmordred: yeah19:17
corvusso we're looking at having 3 builders run from ansible+puppet, and 1 from ansible+containers?19:17
mordredso - yeah - let's give ianw a little bit to see if we can get a solid container story for the builder with patched debootstrap19:17
corvusor 3 from just "ansible"19:17
ianwplease don't forget there is an arm64 builder which has not had a lot of attention, but i would not like to drop19:17
mordredI think just ansible if we can't get the container build going19:17
mordredianw: I have thoughts on that - let's come back to arm19:18
corvuswhat kind of time are we talking about there, cause it sounds like ianw is working on a rabbit hole of his own with the container functional testing?19:18
corvusbasically, we're holding a zuul release on opendev being able to test this stuff19:18
mordredwell - it sounds like the patched debootstrap works - so now it's about updating testing to prove that it works and make sure we don't regress, yes?19:18
fungi(and working out the arm story)19:19
corvusso i think we need to either get the system into a place where we can realistically land a coordinated configuration change to the whole system in a day or two, or else sever the dependency between opendev and zuul releases (at least, temporarily)19:19
mordredok. so - there are a couple of options for that19:19
mordredwe can work on an ansible+pip install (I can work on that right now)- based on the current ansible+docker install and similar to how we did zuul-executors in the zuul patch19:20
mordredwe'll need focal nodes for them to be new enough19:20
clarkbmordred: why do we need focal for that?19:21
clarkbeverything is pip installed so shouldn't depend on focal?19:21
mordredbecause of the reasons we're using the containers in the first place- the rpm helper tools on bionic are old or missing19:21
clarkboh for builders specifically. Got it19:21
mordredyeah19:21
ianwmordred: if you mean, just use pip on a plain host to install, i.e. replicating the puppet in ansible, i have a patch that does that19:21
AJaegerfor focal, we need to merge https://review.opendev.org/#/c/720718/ to mirror it - and stop mirroring trusty19:21
mordredI think it's not unreasonable to upload a focal base image19:21
mordredyes19:22
mordredand that would be good so that we can have integration test jobs19:22
mordredbut - I think we can work those in parallel19:22
ianwalso, arm only builds xenial/buster/bionic/centos atm.  we don't need the updated tools which are required for fedora, as of right now19:22
corvusfungi: i don't understand your comment in 72071819:22
mordredand get a focal base image uploaded to rax-dfw and boot a nb on it that we can use for fedora builds19:23
corvusfungi: i don't know what the differences between those two hosts are19:23
mordredas ianw says - we only need that for fedora builds19:23
clarkbcorvus: its a response to my comment19:23
ianwcorvus: we have not done the work to move reprepro from puppet to ansible yet19:23
corvusclarkb: i understand that.  i don't understand how mirror-update.opendev.org and mirror-update.openstack.org are different19:23
mordredso we can boot the other ansible-based builders on bionic19:23
clarkbcorvus: mirror-update.opendev.org is ansible managed and only does rsync based mirror updates currently19:24
fungithe opendev.org server is the new one cron jobs are being migrated too, off the older openstack.org server19:24
clarkbcorvus: mirror-update.openstack.org does all the other mirror updates (reprepro and maybe other tools too)19:24
fungis/too/to/19:24
corvusbut they're both afs heads?19:24
fungiyes, both write into afs19:25
mordredI think we should not tie this to reworking anything about how reprepro and old mirror-update works - really just upping the quota and doing a manual release should be fine to get this moving, yes?19:25
clarkbmordred: yes19:25
corvussorry, this is proving a distraction.  i still don't understand fungi's comment and the implications, but i'll just follow up later.19:25
clarkbbasically my comment was calling out that you need to bump the quota and do the manual release19:26
clarkbif you do that it should all be fine19:26
corvusand fungi said something isn't necessary, but i don't know what.19:26
mordredgreat - so I think tasks would be: get focal mirroring going, get nodepool building focal nodes, build a manual focal-minimal to upload as a base image into rax-dfw, get a pure-ansible port of nodepool-builder19:26
mordredmost of those can be done in parallel19:26
fungicorvus: oh, because mirror-update.opendev.org gets vos release run remotely by ansible and uses localauth to avoid timeouts19:27
corvusso do we want to switch all of the nb nodes to pure-ansible, retiring the current nb04?19:27
mordredI'm happy to take the pure-ansible port since I'm cranking on that stuff- can someone else help drive the mirror update?19:27
fungisorry, i had to page all that back in19:27
corvusthen make the container switch later after there's lots more testing?19:27
ianwmordred: bionic is sufficient to build fedora.  in fact, i already did all of that, let me find the patch19:27
fungimy comment was specifically in response to clarkb's19:27
mordredyeah - I think that's a sane thing to do for now - although I do think that continuing the container debugging and testing work is important19:27
mordredianw: but not suse19:28
mordredianw: because it doesn't have zypper19:28
fungiin response to clarkb's "Its possible this is no longer a concern..."19:28
mordredso - I think we should operate under the assumption that having at least one focal node would be beneficial - and that we also might need at least one bionic node because arm. hopefully we can coalesce on only focal once we can prove out that it works fine for arm19:29
clarkbmordred: ianw I think the only major risk with the focal plan is focal + arm64. But it sounds like we can maybe keep that on xenial or bionic for a bit longer19:29
fungiso i was saying initial vos release timeouts *are* a concern for anything added to mirror-update.openstack.org (like reprepro-based mirroring which is still there for the moment) but not for things mirrored using mirror-update.opendev.org (like rsync-based stuff)19:29
mordredyeah19:29
mordredI don't think the ansible differences between bionic and focal are likely large19:29
fungicorvus: does that answer your question?19:29
mordredwe don't have big things like systemd vs sysvinit19:29
corvusfungi: does that change apply to opendev or openstack?19:30
corvusmordred: that sounds reasonable19:30
fungicorvus: openSTACK because it's an ubuntu mirror19:31
fungiso vos release timeouts are still a concern for that change19:31
fungii probably should have quoted clarkb's comment in my reply, but thought it was obvious what i was replying to (clearly it wasn't, sorry!)19:31
corvuswell, the part you were referring to with "this" would have been helpful19:32
mordredI'm happy to work on the ansible nodepool-builder (which ianw may have already done) and the focal-minimal image in rax-dfw - can someone else drive the steps needed to get the vos release done safely?19:32
ianwhttps://review.opendev.org/#/c/692924/19:32
mordredianw: awesome, thanks!19:32
clarkbianw: ^ does that seem reasonable? I think we can switch to containers on top of ansible without containers easily enough when we have that working reliably19:32
ianwtbh this is what i was proposing about 6 months ago :)19:33
mordredianw: although I think a lot of the current code can stay as it is in container-builder - so it'll be about picking out the appropriate things that we need for pip nodepool19:33
clarkbmordred: I can start a root screen on mirror-update and grab the lock to get this started19:33
clarkbthen we need to update the quota, merge the change, and manually trigger the update19:33
mordredclarkb: ++19:33
clarkbbut first I need to load my ssh key19:34
mordredianw: well - I think we would have been golden with the container work you did -- if there wasn't this crazy debootstrap bug :)19:34
mordredof all the things to derail this - I wasn't expecting _that_ ;)19:34
ianwit's just a lot of uncharted territory all around19:34
ianwanyway, i'll keep working on it19:35
mordredianw: cool - and yes - I'd love for us to be able to switch back to pure-containers there19:35
clarkbmordred: if we can swing back to gerrit things, we have a third party ci operator that is using devstack-gate and discovered that we are still not replicating to review.o.o/p/ properly19:35
mordredalthough there is also an arm thing we have to solve19:36
clarkbmordred: I triggered replication of openstack/requirements (the out of date repo) just in case that was something that didn't get rereplicated after config updates and no change19:36
mordredI've got thoughts on the arm thing - it's solvable19:36
mordredhttp://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-19.log.html#t2020-04-19T15:04:32 <-- basic summary of what we need to do to build arm and x86 images when we build images19:36
clarkbI also noticed we have replication things failing for the local /opt/git thing there19:36
mordredclarkb: ok - we need to investigate that - it should be fixed19:37
mordredcan we swing back to that after the meeting?19:37
clarkbmordred: yup19:37
corvusweird, a quick check shows that checks out for me19:38
clarkbcorvus: you can clone it but you end up on an older commit aiui19:38
corvusclarkb: i'm saying i don't see an older commit19:39
mordredoh - I know what it is19:39
corvusbut anyway, mordred requested we defer this19:39
mordredwell - I looked anyway19:39
mordredthe issue is a few missing repos that we created while the mount wasn't properly in place19:40
mordredso we didn't actually create them on the real filesystem19:40
corvus(my local test case is opendev/system-config, because that's the ancient checkout i noticed the problem with)19:40
mordredstarlingx/kernel would be one I believe19:40
mordredactually - it's owned by root19:41
mordredanywho - we can fix that19:41
mordredwe should figure out if we're running manage-projects as the wrong user19:42
mordredand thus creating the local replication target repos as the wrong user19:42
clarkbk anything else on this subject? as a time check we have 18 minutes left and a few other things to get to (but this was also a huge chunk of change last week so want to make sure we get through it)19:42
mordredI'm good19:43
fungiall clear here19:43
clarkb#topic OpenDev19:44
*** openstack changes topic to "OpenDev (Meeting topic: infra)"19:44
clarkbAs mentioned we seem to be picking up some new traffic which is good19:45
clarkbFungi has proposed that openstack-infra become a SIG and the openstack TC is on board with that19:45
clarkbI think it makes sense too19:45
fricklerwhy not fold it into qa?19:46
fungi#link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014296.html Forming a Testing and Collaboration Tools (TaCT) SIG19:46
AJaegerThe TaCT SIG - I love the proposal ;)19:46
clarkbfrickler: I think gmann is concerned they don't have the knowledge to manage some of the tools officially like that19:46
fungifrickler: members/leaders of the openstack qa team have expressed concern over suddenly becoming responsible for "more stuff"19:46
clarkbfrickler: I think long term that may make sense but for now this gives us the ability to keep it a thing that is more tightly scoped and work with qa team as necessary19:46
fungi(also i could see qa shutting down as a team and folding into the same sig eventually)19:47
frickleroh, maybe that direction might work, too19:47
frickleror split of some things like devstack19:47
fricklero.k.19:47
fricklers/of/off19:47
fungiwe could stand some volunteers to serve as chairs for that sig, if it's something folks are generally in favor of forming19:48
clarkbIf you'd like to volunteer to be service coordinator for opendev now is the time to do so19:49
clarkbI mentioned I'd be happy to continue but also think new involvement is good as well19:49
fungikeep in mind that there is little responsibility as a sig chair, mostly just be aware of what's going on generally within the sig and be able to serve as a representative for it19:49
clarkbya we'll need a sig chair as well but that's less involved I expect19:50
clarkb#topic General Topics19:50
*** openstack changes topic to "General Topics (Meeting topic: infra)"19:50
clarkbIt is PTG planning time19:50
clarkbwe have been given some constraints in an effort to have collaboration happen between projects and keep hours sane for attendees19:50
fungii have a link i was going to paste with sig chair responsibilities, but maybe i'll just follow up to that ml post with it19:51
fricklerso is the vPTG to run completely on meetpad? or something else like zoom or bj?19:51
fungi(since we're short on time)19:51
clarkbthe result of that is a giant ethercalc where we need to sign up for time19:51
fungifrickler: not determined yet19:51
clarkbfrickler: I think that is still being sorted out. I'd like to be able to make meetpad an option19:51
clarkbso we should keep pushing on it19:51
clarkbfrom that etherpad I've identified 3 two hour blocks that I think work with our global presence19:52
fungii gather there's communication going out in the next day or two from the event planners to try and nail down requirements for collaboration software19:52
corvusi'll see about working out what the deal is with the python version there19:52
clarkbcorvus: I think mordred pushed a fix for it19:52
fricklero.k., but that'd need some big push IMO. I'll try to get some time allocated for that19:52
corvusoh awesome19:52
mordredcorvus: the root cause was fun19:52
clarkbMonday 1300-1500 UTC, Monday 2300-0100 UTC, Wednesday 0400-0600 UTC19:52
mordredcorvus: https://review.opendev.org/#/c/721707/19:52
clarkbthose are the blocks I think will work so that we can each attend ~2 out of ~3 without too much pain19:52
clarkbif there isn't any immediate objection to those blocks I can go ahead and sign us up for them (and tweak later if necessary)19:53
clarkb(and yes it will mean an early morning or a late night for many of us if you intend to hit 2 of the 3)19:54
clarkb(but it seemed to be an equitable distribution when I wrote out the times in a table )19:54
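The kind of table mentioned above can be reproduced with a short script. This is a rough sketch only: the event date (`MONDAY`) and the three fixed UTC offsets are assumptions chosen for illustration and do not come from the meeting; only the three UTC block start times (Monday 1300, Monday 2300, Wednesday 0400) are taken from the discussion.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical Monday of the event week (2020-06-01 is a Monday); an assumption.
MONDAY = datetime(2020, 6, 1, tzinfo=timezone.utc)

# (days after Monday, UTC start hour) for the three proposed two-hour blocks.
BLOCKS = [(0, 13), (0, 23), (2, 4)]

# Fixed UTC offsets standing in for attendee regions; assumptions for illustration.
REGIONS = {
    "US Pacific (UTC-7)": -7,
    "Central Europe (UTC+2)": 2,
    "Eastern Australia (UTC+10)": 10,
}


def local_starts(offset_hours: int) -> list:
    """Return the local start time of each block for a fixed UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return [
        (MONDAY + timedelta(days=days, hours=hour)).astimezone(tz)
        for days, hour in BLOCKS
    ]


for region, offset in REGIONS.items():
    times = ", ".join(t.strftime("%a %H:%M") for t in local_starts(offset))
    print(f"{region}: {times}")
```

Under these assumed offsets, each region gets at least two blocks that start between 06:00 and 23:00 local time, which matches the "~2 out of ~3" goal; real planning would need the actual dates and daylight-saving rules.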
frickler+1 from me19:55
fungiyeah, i'll make myself available whenever19:55
AJaegerLGTM19:55
fungimaybe i can even swing all three with appropriate quantities of caffeine coursing through my veins19:55
clarkbcool. I probably won't get to signing up for those times until later today so let me know if there is a major conflict19:55
clarkbNext up is the wiki update but I think we can skip it due to time19:55
clarkbwhich takes us to etherpad19:56
fungietherpad is dead, long live etherpad?19:56
clarkbas part of mordred's container/ansible/cd work the old etherpad servers are gone including etherpad-dev19:56
clarkbwe haven't replaced that server and will instead rely on system-config end to end testing19:56
clarkbthe idea mordred and I had was if we need to verify UI behavior we can hold a test node and use it to manually verify off what zuul built19:56
clarkbI expect this will work reasonably well as a tool we can leverage for various services19:57
mordredyeah - if we find that sucks - we can always spin up a new etherpad-dev19:57
clarkbWanted to call this out as a separate agenda because if it isn't working well then that feedback would be good to hear19:57
clarkbmordred: ++19:57
clarkb#topic Open Discussion19:57
*** openstack changes topic to "Open Discussion (Meeting topic: infra)"19:57
clarkbalright have a few minutes for anything else19:57
clarkbSounds like that might be it. Thank you everyone!19:59
clarkb#endmeeting19:59
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"19:59
openstackMeeting ended Tue Apr 21 19:59:30 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:59
openstackMinutes:        http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.html19:59
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.txt19:59
fungi#link https://governance.openstack.org/sigs/reference/sig-guideline.html#select-sig-chairs sig chair responsibilities19:59
openstackLog:            http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.log.html19:59
fungid'oh, too late19:59
fungiwill follow up to the ml like i originally decided19:59
