Monday, 2019-12-09

00:00 <openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: build-container-image: support sibling copy  https://review.opendev.org/697936
00:00 <openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: build-docker-image: fix up siblings copy  https://review.opendev.org/697614
00:14 <openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: build-container-image: support sibling copy  https://review.opendev.org/697936
00:14 <openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: build-docker-image: fix up siblings copy  https://review.opendev.org/697614
00:18 <openstackgerrit> Merged zuul/zuul-jobs master: Fixes tox log fetching when envlist is set to 'ALL'  https://review.opendev.org/696531
04:28 <openstackgerrit> Ian Wienand proposed opendev/system-config master: Add roles for a basic static server  https://review.opendev.org/697587
04:41 <ianw> AJaeger: when you get a second, can you double check my comment in https://review.opendev.org/#/c/697587/2/playbooks/roles/static/files/50-governance.openstack.org.conf that we're publishing all of this to the right location
04:52 <openstackgerrit> Ian Wienand proposed opendev/system-config master: Add roles for a basic static server  https://review.opendev.org/697587
05:07 <openstackgerrit> Ian Wienand proposed opendev/system-config master: Add roles for a basic static server  https://review.opendev.org/697587
07:22 <AJaeger> ianw: left a comment on the change, looks like you missed "/projects"
07:23 <AJaeger> https://docs.opendev.org/opendev/infra-specs/latest/specs/retire-static.html also has "/afs/openstack.org/project/governance.openstack.org" but I don't see that /project in the change
08:15 <ianw> AJaeger: ok, thanks for that, i'll look at all the paths more closely.  but do you agree we publish /elections /tc /uc /sigs (i.e. your change catches all of these publish paths?)
08:20 <AJaeger> yes, that is fine as far as I can see...
09:24 <ianw> AJaeger: great, thanks, will fix up those paths tomorrow, hopefully should be working
10:30 <openstackgerrit> Matthieu Huin proposed zuul/zuul master: enqueue: make trigger deprecated  https://review.opendev.org/695446
11:56 <openstackgerrit> Albin Vass proposed zuul/nodepool master: Aws cloud-image is referred to from pool labels section  https://review.opendev.org/697998
13:09 <lennyb> clarkb: pls review small ovs-br create patch https://review.opendev.org/#/c/693850/
13:11 <donnyd> Just so there is some sort of status update on the FN log storage. I am still cleaning up the swift servers and getting them back to a state where they are actually ready for production. Thanks to all in the swift channel who have been super helpful thus far
13:19 <openstackgerrit> Fabien Boucher proposed zuul/zuul master: Pagure: remove connectors burden and simplify code  https://review.opendev.org/696134
13:20 <openstackgerrit> Fabien Boucher proposed zuul/zuul master: Pagure: remove connectors burden and simplify code  https://review.opendev.org/696134
13:33 <openstackgerrit> Matthieu Huin proposed zuul/zuul master: enqueue: make trigger deprecated  https://review.opendev.org/695446
14:27 <openstackgerrit> Tobias Henkel proposed zuul/nodepool master: Add back-off mechanism for image deletions  https://review.opendev.org/698023
14:44 <openstackgerrit> Tobias Henkel proposed zuul/nodepool master: Add back-off mechanism for image deletions  https://review.opendev.org/698023
15:09 <sshnaidm> The message you sent to legal-discuss@lists.openstack.org hasn't been delivered yet due to: Recipient email address is possibly incorrect. Domain has no MX records or is invalid
15:09 <sshnaidm> is that the right address for the mailing list?
15:10 <sshnaidm> Who maintains the openstack mailing lists?
15:19 <openstackgerrit> Albin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/698029
15:25 <openstackgerrit> Albin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/698029
15:29 <mordred> sshnaidm: https://docs.openstack.org/infra/system-config/lists.html
15:29 <mordred> however - wow: host -t MX lists.openstack.org
15:29 <mordred> lists.openstack.org has no MX record
15:29 <mordred> that doesn't seem good
15:32 <erobtom> Hi, I need help with something.
15:32 <erobtom> It looks like Gerrit at review.opendev.org stopped sending Gerrit events around 22 Nov.
15:32 <erobtom> Do I need to contact somebody to enable Gerrit events for our account?
15:33 <sshnaidm> mordred, and I don't see this list here: https://opendev.org/opendev/system-config/src/branch/master/modules/openstack_project/manifests/lists.pp
15:34 <sshnaidm> mordred, so no legal-discuss list anymore? :(
15:34 <mordred> sshnaidm: those docs are out of date I think, sorry. playbooks/host_vars/lists.openstack.org.yaml is where the lists are defined
15:35 <sshnaidm> legal-discuss-owner: spam
15:35 <sshnaidm> great
15:35 <mordred> but - the bigger issue here is that there is no MX record in dns for lists.o.o currently
15:36 <mordred> I just checked the rax cloud dns for openstack.org and it does not list an MX entry ... I'm not sure what has happened here
15:36 <mordred> fungi: you up and lurking by any chance?
15:36 <mordred> infra-root: ^^
15:37 <mordred> I can obviously add an MX record back - but I'm not 100% up to speed on the moves we've been making WRT dns upgrades and don't want to break anything further
15:38 <Shrews> mordred: the last archive date for that list is from july 2019. was it discontinued?
15:38 <Shrews> http://lists.openstack.org/pipermail/legal-discuss/
15:38 <fungi> mordred: yeah, looking
15:38 <mordred> it's just typically pretty low traffic
15:38 <mordred> fungi: I confirmed there's no record in the rax dashboard that I get to via logging in as openstackinfra
15:38 <mordred> (which I think is the correct production dns for openstack.org yes?)
15:39 <fungi> sshnaidm: mordred: lists.openstack.org has never had an mx record that i know of
15:39 <fungi> smtp says deliver to the mx record *if* there is one, otherwise use the address record
15:39 <sshnaidm> fungi, so maybe address of mail list is wrong?
15:39 <fungi> it's been working this way forever
15:39 <mordred> oh. duh
15:39 <mordred> sorry - I'm still just on first coffee of the day
15:40 <fungi> mx is meant to be an override in case you want mail for a system delivered somewhere other than the address of its canonical name
15:40 <fungi> it says "i don't handle e-mail directly, my mail exchanger is over here instead"
15:40 <sshnaidm> fungi, can you post something to legal-discuss@lists.openstack.org  ?
15:41 <fungi> mailservers refusing to deliver to a domain name which doesn't correspond to an mx record are broken, and rfc-noncompliant, though this is the first time i've ever heard of one doing something so absurd
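[Illustrative aside, not from the log: a minimal sketch of the RFC 5321 lookup order fungi describes above — prefer MX records, and fall back to the domain's own address record when none exist, which is why lists.openstack.org works without an MX. It assumes the third-party dnspython package; on dnspython 1.x the call would be dns.resolver.query() instead of resolve().]

    import dns.resolver

    def mail_exchangers(domain):
        """Return (preference, host) pairs an SMTP client should try, per RFC 5321."""
        try:
            answers = dns.resolver.resolve(domain, 'MX')
            return sorted((r.preference, r.exchange.to_text()) for r in answers)
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            # No MX published: deliver to the domain's A/AAAA record instead.
            return [(0, domain)]

    print(mail_exchangers('lists.openstack.org'))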
15:42 <fungi> sshnaidm: i expect i can. i have many times in the past. i'm hesitant to send a test message to the list and bother all its subscribers though
15:42 <openstackgerrit> Albin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/698029
15:42 <sshnaidm> fungi, well, I tried twice and I can't
15:43 <sshnaidm> will try from different provider maybe..
15:50 <fungi> sshnaidm: some web searches turn up only people sending through microsoft exchange servers reporting this specific sort of ndr, so it's possible this is a common "feature" some exchange administrators turn on without realizing they're breaking their ability to deliver e-mail from their users
15:51 <sshnaidm> fungi, I'm using RH mail, it's going via Gmail
15:52 <fungi> very strange, does it go directly to gmail or does it get relayed through another mta? what is the reporting mta in the ndr? it's the same dns records which handle all the list addresses on lists.openstack.org so if you're having trouble getting a message to one, you'd in theory have trouble getting a message to any of them
15:52 <mordred> sshnaidm: if this is the topic I think it's about - wanna paste the email you've written and i'll send it for you? that way it'll be both a test message and an appropriate list message if it goes through
15:53 <mordred> sshnaidm: (and it would be just as reasonable for it to come from me)
15:53 <fungi> can you possibly provide the entirety of the ndr? there might be more detail, perhaps it's not the actual error and is merely guidance to the user masking the response from some remote mta
15:53 <mordred> I mean - I have an opinion on the answer - but I can still ask the question :)
15:53 <sshnaidm> mordred, trying now from my private gmail, if won't go - let's do it
15:53 <mordred> sshnaidm: cool.
15:54 <corvus> sshnaidm: did you send the legal-discuss message the same way you did the openstack-discuss message?
15:55 <sshnaidm> corvus, yep
15:55 <sshnaidm> it worked from my private gmail
15:55 <corvus> sshnaidm: can you paste the entire bounce message you got originally?
15:55 <sshnaidm> but didn't from my RH mail
15:56 <sshnaidm> corvus, http://paste.openstack.org/show/787323/
15:57 <fungi> ahh, yeah, i realize now not everyone knows what "ndr" means, sorry :/
15:58 <fungi> sshnaidm: aha! i saw a lot of hits from confused mimecast when searching the web for that error too
15:58 <fungi> "Powered by Mimecast" right there in the non-delivery report
15:58 <corvus> it seems very weird to have different behavior for the different recipient addresses
15:58 <fungi> er, hits from confused mimecast users
15:59 <fungi> "Mimecast is an international company specializing in cloud-based email management for Microsoft Exchange and Microsoft Office 365, including security, archiving, and continuity services to protect business mail."
16:00 <fungi> so for whatever reason, that message got shunted through a mimecast service, maybe not all your messages do?
16:00 <mordred> wow
16:00 <sshnaidm> fungi, it's weird, maybe it's on openstack mail list somewhere?
16:00 <mordred> sshnaidm: are you sending the email via smtp from a mail client vs. through the web interface?
16:00 <sshnaidm> mordred, web
16:01 <mordred> weird. that whole received chain is fascinating
16:01 <fungi> sshnaidm: Received: from mimecast-mx02.redhat.com
16:01 <fungi> pretty sure that's *not* us
16:01 <sshnaidm> fungi, I see..
16:01 <sshnaidm> fungi, what can we do to satisfy mimecast? :)
16:01 <mordred> Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C435E18E9779 for <sshnaidm@gapps.redhat.com>; Mon,
16:02 <fungi> sshnaidm: i recommend talking to redhat's mail sysadmins
16:02 <fungi> postmaster@redhat.com might reach them, i have no idea. it's not as safe a bet these days as in the golden age of the 'net
16:02 <sshnaidm> it's weird, I post to openstack-discuss w/o any problems..
16:03 <mordred> sshnaidm: I agree - this is quite odd
16:03 <sshnaidm> fungi, when was the golden age? :)
16:03 <fungi> i dunno, sometime before now ;)
16:03 <mordred> sshnaidm: back when we used to ftp into public ftp servers to download software?
16:03 <mordred> good old prep.ai.mit.edu ...
16:04 <fungi> in the 90s i had pretty good luck reaching sysadmins of mailservers by sending to the ietf-mandated role alias addresses at their domains
16:04 <corvus> mordred: that time is now
16:04 <mordred> corvus: that's a great point
16:04 <clarkb> could it be content of the email?
16:04 <clarkb> you can probably check headers for the -discuss bound emails to see if they made it through the same mimecast service
16:05 * corvus worked hard to ensure that still worked, and it looks like the current sysadmins do too :)
16:05 <fungi> clarkb: yeah, likely that or frequency with which people on redhat's internal mail relays send to which addresses
16:05 <mordred> corvus, fungi: you know - I don't think we've ever discussed running a public ftp server for openstack software ...
16:05 <fungi> maybe messages for more common recipient addresses don't get the third degree
16:05 <corvus> sshnaidm: if you email the rh mail admins, feel free to cc me (using my rh address) if you like
16:05 <sshnaidm> corvus: ack, what is it?
16:06 <corvus> sshnaidm: jeblair@redhat.com
16:06 <sshnaidm> corvus: great
16:06 <mordred> sshnaidm: add mordred@redhat.com too just for kicks
16:06 <sshnaidm> ok
16:06 <clarkb> infra-root I think we should plan to restart our zuul (and maybe nodepool? shrews have an opinion?) services todayish. There were a few changes that went in friday that would be good to confirm are as happy in production as they were in testing
16:06 <corvus> if they have any questions, we can check server logs, etc
16:06 <clarkb> I can help with that once I've bootstrapped my day a bit more
16:07 <mordred> clarkb: I submitted the request for an sdk release to get the ovh thing in - should we wait on that?
16:07 <clarkb> mordred: I think we can update those venvs independent of updating and restarting zuul. But a downtime might be a good safe time to do that yes
16:07 <Shrews> clarkb: hrm, i just recall restarting nodepool during the ptg in anticipation of a release for tobiash, but i don't think we ever released. will need to look what went in since
16:09 <clarkb> mordred: I'll wait for sdk release to happen
16:09 <clarkb> as getting that tested and updated in base-jobs would be excellent too
16:11 <Shrews> i apparently did not log the commit sha for the restart  :/
16:11 <mordred> clarkb: we're in luck - the release is out
16:13 <Shrews> clarkb: i think restarting nodepool is probably a good idea anyway
16:13 <clarkb> Shrews: k
16:13 <Shrews> that would grab the new upload hook feature
16:13 <clarkb> mordred: re sdk I'm trying to figure out if we install it into the ansible venvs and don't see where we do that. We do install it globally on the executors though
16:13 <clarkb> oh wait maybe it is in zuul itself /me looks
16:14 <clarkb> yup
16:14 <clarkb> zuul/lib/ansible-config.conf
16:14 <clarkb> I think that means we want to stop services, run the zuul ansible upgrade command then start services.
16:15 <clarkb> ok time to finish breakfast and make an upgrade plan
16:15 <pabelanger> speaking of zuul/lib/ansible-config.conf can we land ansible 2.9 support in zuul this week?
16:15 <pabelanger> https://review.opendev.org/#/q/status:open+project:zuul/zuul+branch:master+topic:multi-ansible-wip
16:16 <pabelanger> I'd like to start using it for zuul.a.c, if possible
16:20 <Shrews> clarkb: i'm going to go ahead and restart the nodepool builders for us to grab the most recent updates. nb01 has *several* dib processes that have been running for more than a month
16:20 <Shrews> infra-root: ^^
16:20 <Shrews> that might require a system reboot to clear things out
16:21 <clarkb> ok
16:21 <fungi> thanks Shrews!
16:22 <pabelanger> sorry, that was posted in wrong channel
16:22 <pabelanger> moving to #zuul
16:22 <fungi> we're going to have to manually upgrade openstacksdk on the zuul executors, right?
16:22 <mordred> sshnaidm: I have replied to your email - so you'll at least get one response :)
16:22 <fungi> to clear the swift config change hurdle for ovh i mean
16:23 <clarkb> fungi: yes, I think we need to run the manage ansible command with the upgrade flag
16:24 <fungi> ahh
16:25 <Shrews> seems the nb01 processes cleaned up with a nodepool-builder shutdown. skipping system reboot
16:25 <openstackgerrit> Fabien Boucher proposed zuul/zuul master: Pagure: remove connectors burden and simplify code  https://review.opendev.org/696134
16:25 <fungi> clarkb: right, because it's in the ansible envs
16:27 <sshnaidm> mordred, thanks
16:27 <sshnaidm> mordred, I think we agreed at least to rename them, so they will be "new" modules anyway, old ones will be frozen and deprecated for 2 years
16:28 <mordred> sshnaidm: well - yes - from a consumer point of view it will be "new" in the sense that they'll need to update names - but the code itself will be largely the same and will have a direct lineage to the current code
16:29 <sshnaidm> mordred, that's possible, although we can go from scratch too
16:31 <mordred> we can - but I think that's going to have a pretty high cost - the existing modules have tons of real world use - I think making people update playbooks to use new names is one thing, but having modules maybe start behaving slightly differently for things that are being used for ops tasks would have a high user impact for basically no value
16:31 <Shrews> infra-root: wow, /opt is at 100% on nb01 and nb02. /mnt on nb03 (configured differently?) is also at 100%
16:31 <mordred> Shrews: that's non-awesome
16:31 <clarkb> Shrews: that is the image build leaks
16:31 <clarkb> /opt/dib_tmp fills
16:31 <sshnaidm> mordred, yeah, that makes sense
16:32 <clarkb> Shrews: I haven't been able to trace it back to specific behavior but I'm guessing it happens when dib dies in a way where its exit handlers don't run
16:32 <Shrews> clarkb: that's unfortunate. is it safe to just remove everything under /opt/dib_tmp ?
16:33 <clarkb> what I've done before is stop the builder and disable the service. Reboot which unmounts any stale mounts then rm the contents of dib_tmp then enable service and reboot
16:33 <Shrews> clarkb: gotcha
16:33 <clarkb> Shrews: it is except not with dib running
16:33 <mordred> sshnaidm: so basically - in the absence of someone saying it *must* be done (which I find highly unlikely) - I think our best path to success is to keep the collections split out changes minimal and focused on the collections structure related things (and some cleanups) ... but let's see what people on legal-discuss@ say for sure!
16:35 <sshnaidm> mordred, yeah, let's wait till the mtg on Thu
16:35 <mordred> ++
17:09 <Shrews> #status log /opt DIB workspace on nb0[123] was full. Removed all data in dib_tmp directory on each and restarted all builders on commit e391572495a18b8053a18f9b85beb97799f1126d
17:09 <openstackstatus> Shrews: finished logging
17:13 <Shrews> clarkb: builders are done. some are complaining about vexxhost volume conflicts again. I'll look into cleaning those up after lunch.
17:15 <clarkb> infra-root: our zuul restart playbook will do the manage ansible -u command while executors are stopped. However, I think I would like to test that command in the foreground manually on an executor before running it globally
17:15 <clarkb> this is because one of the changes that went in was to the manage ansible code path
17:16 <clarkb> any objection to me stopping ze01 now, running the update then starting it again?
17:16 <clarkb> then it will get done again when we run the playbook
17:18 <corvus> clarkb: that sounds fine
17:18 <fungi> no objection, sounds like a wise precaution
17:21 <openstackgerrit> Monty Taylor proposed openstack/project-config master: Announce ansible-collections-openstack in openstack-ansible-sig  https://review.opendev.org/698046
17:22 <clarkb> ok stopping ze01 executor now
17:23 <openstackgerrit> Albin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/698029
17:33 <clarkb> not a quick stop fwiw
17:45 <clarkb> infra-root: `/usr/lib/zuul/ansible/2.8/bin/pip freeze | grep sdk` says openstacksdk==0.39.0 after running `ANSIBLE_EXTRA_PACKAGES=gear zuul-manage-ansible -u` as root on ze01
17:46 <clarkb> that looks correct to me
17:46 <clarkb> I'm going to start ze01 now
17:46 <clarkb> I think that means we can run system-config/playbooks/zuul_restart.yaml whenever we are ready
17:46 <clarkb> is there anything else that should be checked prior to ^
17:47 <fungi> that looks right to me
17:47 <clarkb> I'll let the release team know
17:47 <corvus> clarkb: i'm guessing you'll watch ze01 for a sanity check to make sure it's able to execute ansible correctly?
17:47 <clarkb> corvus: ya
17:48 <corvus> k.  then yeah, when that checks out, i think we can do the whole shebang
17:48 <clarkb> currently it's cleaning up stale build dirs
17:48 <clarkb> waiting for it to start grabbing jobs
17:48 <fungi> and once the full restart is done, we should be ready to recheck the ovh swift config change test
17:49 <corvus> i think we (tobiash) finally fixed the stale dir cleanup, so it may actually be doing something now
17:49 <fungi> might be a good idea to take a look at the disk utilization graphs in cacti to see if there was a significant change at restart
17:50 <clarkb> it appears to not be the quickest thing so ya likely is doing stuff now
17:53 <clarkb> it is starting to process jobs now
17:57 <clarkb> so far ansible has only exited 0 according to the log file
17:58 <clarkb> I'll leave a tail -f | grep on that for a bit to sanity check any non zero results
18:13 <clarkb> just had our first failure. It occurred on opensuse-15, I'm guessing that platform is less stable with its jobs as it gets less attention?
18:13 <clarkb> otherwise things have been looking stable
18:14 <fungi> honestly, surprised it took this long to hit a failure
18:14 <fungi> i figured our cosmic background radiation there was stronger than that
18:16 <rishabhhpe> Hello All, need your inputs on the below:
18:16 <rishabhhpe> From Saturday onwards our openstack CI is failing for the ISCSI driver at a very early stage of devstack setup and for the FC driver it is getting queued. Please let us know if any changes were made on the community end.
18:17 <clarkb> oh thats neat. That job then went on to be unreachable when the cleanup playbook tried to run against it (that checks df and networking)
18:17 <clarkb> I think that host got properly unhappy
18:17 <clarkb> rishabhhpe: the openstack QA and cinder teams may be better points of contact
18:17 <clarkb> QA is responsible for devstack and cinder for iscsi related things typically
18:20 <clarkb> rishabhhpe: also it generally helps if you can provide more debugging information like log snippets when asking for help to debug failures
18:21 <clarkb> rishabhhpe: one guess off the top of my head is maybe you are trying to do python2.7 when now only python3 is supported
18:23 <rishabhhpe> We are using Python3.7 only for our setup but from our analysis i posted the success and failure log snippet here
18:23 <rishabhhpe> http://paste.openstack.org/show/787341/
18:32 <fungi> i heard a number of devstack plugins were encountering issues with calls to cli tools installed with pip for both python2 and python3 where the entrypoint for whichever was installed second overwrote the first and ended with them calling into the wrong version of python
18:32 <fungi> no idea if this is an example
18:33 <clarkb> fungi: I think this is simply a case of devstack deciding xenial is too old for itself
18:33 <clarkb> (conversation happening in the qa channel)
18:33 <fungi> ahh, good
18:34 <fungi> and yeah, i saw the message in the paste but wasn't clear whether that was a warning or a hard error
18:34 <clarkb> infra-root I've got a root screen started on bridge. I'll be running `ansible-playbook -f 20 /opt/system-config/playbooks/zuul_restart.yaml` from there shortly
18:34 <fungi> attaching
18:34 <clarkb> the zuul restart doesn't capture queues. Does someone else want to do that bit of the restart?
18:35 <fungi> i should be able to
18:35 <clarkb> k give me a few minutes to grab something to drink and then I'll be ready to start
18:36 <fungi> so far we've only been preserving contents for the check and gate pipelines in the openstack tenant
18:36 <fungi> do we need to expand that?
18:37 <fungi> i have a root screen session going on zuul.o.o for the queue capture and replay
18:42 <clarkb> fungi: I think the automated every 30 second capture may do all tenants now
18:42 <clarkb> but ya we should capture all tenants
18:43 <clarkb> fungi: I'm ready to start running the playbook now and have given the release team warning and checked they don't have anything in flight
18:44 <clarkb> fungi: let me know when you've got queues captured and I will start the playbook
18:44 <fungi> hrm, the command we've been using is going to the whitebox zuul.openstack.org api
18:44 <fungi> but it is also passing the tenant name, so maybe the hostname there is irrelevant
18:45 <fungi> i'll try to nab the other tenants with it too
18:45 <clarkb> it is because the openstack.org api is whiteboxed with redirects that assume a tenant
18:45 <clarkb> I don't think you can pass in any other tenant there
18:46 <clarkb> fungi: `python /opt/zuul/tools/zuul-changes.py http://zuul.openstack.org openstack check` is the command right? I think you can change that to https://zuul.opendev.org openstack check, zuul check, opendev check, etc ?
18:47 <clarkb> and append all of those files together?
18:47 <clarkb> (and maybe that should become an opendev script so we don't have to remember for next time)
18:47 <fungi> running
18:48 <fungi> i did a loop, but yeah
18:48 <clarkb> I won't start the playbook until you confirm you are happy with the results of ^
18:48 <fungi> also worth noting, tools/zuul-changes.py doesn't work with python3, encoding issues
18:49 <fungi> clarkb: another fun detail, looks like i got openstack tenant queue items in the zuul tenant check pipeline dump
18:50 <clarkb> hrm is that a zuul api bug?
18:50 <clarkb> fungi: or were you talking to zuul.openstack.org ?
18:50 <clarkb> (which always assumes openstack tenant)
18:50 <fungi> python /opt/zuul/tools/zuul-changes.py http://zuul.opendev.org zuul check
18:50 <fungi> that contains a bunch of tripleo items, as well as other stuff
18:51 <fungi> but maybe it's not doing sni and so hitting a default vhost?
18:51 <openstackgerrit> Tobias Henkel proposed zuul/zuul master: DNM: Add quick and dirty api crawler for ansible versions  https://review.opendev.org/698062
18:52 <clarkb> fungi: ya could be we need to use requests in there instead of urllib?
18:52 <clarkb> fungi: maybe try http then?
18:52 <fungi> it's using urllib2.urlopen() from python 2.7.12
18:53 <fungi> it was http, but i tried switching to https and that didn't change anything
18:54 <fungi> yeah, i suspect we need to rewrite this tool, should i take a quick stab at it now or just be satisfied with current behavior at the moment since openstack is the only tenant with queue items anyway according to the top level view at https://zuul.opendev.org/tenants
18:54 <clarkb> for now if openstack is the only one with queue items we are probably fine, but we should definitely fix this
18:55 <clarkb> oh I should double check the zuul install version on all hosts before I run the playbook
18:55 <fungi> okay, check and gate for openstack dumped, roll the update
18:55 <fungi> ahh, yeah, i can take another once you confirm
18:56 <fungi> up side to using urllib2 is that you can run this script without a virtualenv and without requests installed
18:56 <fungi> possible urllib in python3 has had sni support for a while, worth checking
18:56 <openstackgerrit> Kendall Nelson proposed openstack/cookiecutter master: Update CONTRIBUTING.rst template  https://review.opendev.org/696001
18:56 <fungi> might be if we just fix this up to be python3 then it'll simply work
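[Illustrative aside, not the actual tools/zuul-changes.py: a sketch, in python3 with requests, of the "dump a pipeline and print re-enqueue commands" behaviour being discussed. The /api/tenant/<name>/status endpoint is the public Zuul status API; the exact JSON field names are from memory and may differ slightly between Zuul versions.]

    import requests

    def dump_pipeline(base_url, tenant, pipeline):
        """Print `zuul enqueue` commands for live changes in one tenant's pipeline."""
        status = requests.get('%s/api/tenant/%s/status' % (base_url, tenant)).json()
        for p in status['pipelines']:
            if p['name'] != pipeline:
                continue
            for queue in p['change_queues']:
                for head in queue['heads']:
                    for item in head:
                        # Only live change items (ids like "697936,1") are re-enqueued.
                        if not item['live'] or ',' not in item['id']:
                            continue
                        print('zuul enqueue --tenant %s --pipeline %s '
                              '--project %s --change %s'
                              % (tenant, pipeline, item['project'], item['id']))

    dump_pipeline('https://zuul.opendev.org', 'openstack', 'check')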
18:57 <clarkb> double checked they all have zuul==3.11.2.dev72  # git sha 57aa3a0
18:57 <clarkb> fungi: ok ready now when you are
18:58 <fungi> new snapshots obtained, go for it
18:58 <clarkb> it is running
18:58 <mordred> sshnaidm, corvus: I replied to the RH reply
18:59 <clarkb> ok web didn't clean up its pid
18:59 <clarkb> the process isn't running though so I will manually rm that file
18:59 <clarkb> corvus: ^ any idea why that happens?
19:00 <sshnaidm> mordred, great, I was composing a long polite request..
19:00 <fungi> clarkb: any traceback in its log?
19:00 <fungi> maybe it died ungracefully when something it was connecting to went away
19:01 <mordred> sshnaidm: mine was a little short - tl;dr - "we are upstream admins and don't see any problems - what problems do you think there are?"
19:01 <clarkb> fungi: goes from 2019-12-09 18:58:55,408 DEBUG zuul.web: Websocket close: 4011 Error with Gearman to 2019-12-09 19:00:21,548 DEBUG zuul.Web: Configured logging: 3.11.2.dev72
19:02 <clarkb> fungi: no errors
19:02 <fungi> hrm
19:02 <fungi> so might have died in its sleep
19:02 <corvus> clarkb: yeah, there's a perms problem
19:02 <corvus> the web pid always needs to be manually deleted after it stops
19:04 <fungi> does zuul-web drop privileges after starting or something?
19:04 <clarkb> also it seems that the scheduler may remove its pid before it fully exits
19:04 <corvus> yep
19:04 <fungi> that would explain it
19:04 <clarkb> it was still around on my first ps listing to check on web
19:04 <clarkb> but gone on a subsequent one
19:05 <clarkb> currently we are in the wait for executors phase of the restart
19:05 <corvus> clarkb: well, it removes it right before it exists
19:05 <fungi> likely all the cleanup they're now doing
19:05 <clarkb> corvus: ah
19:05 <corvus> clarkb: maybe some swapping or something made that long enough to be noticeable
19:05 <corvus> rather "right before it exits"
19:06 <fungi> the scheduler uses a ton of memory, so yeah freeing all that could take some seconds
19:07 <fungi> both of those could be worked around by having a parent supervisor process responsible for the pidfile reaping, but that's added complexity and one more thing to go wrong on you
19:08 <corvus> i don't think it's a problem in the scheduler case (if you wanted to start the scheduler after the pidfile is removed, that should work :).  we do need to do something about web though.
19:08 <clarkb> fungi: I think you can reenqueue changes now
19:09 <fungi> yeah, if the scheduler has closed all its listeners/descriptors and isn't going to interact with anything else after the pidfile is removed, the most trouble it might cause is briefly insufficient memory
19:09 <clarkb> based on zuul web showing at least one change has snuck in
19:09 <fungi> clarkb: thanks, reenqueuing now
19:09 <clarkb> (there isn't anything to run the jobs yet but we can put them in the queue)
19:09 <fungi> nope, not yet
19:09 <fungi> rpc errors
19:10 <clarkb> how are those changes in there then
19:10 <fungi> er, no, not rpc errors
19:10 <clarkb> executors are stopping now
19:11 <clarkb> half have stopped
19:11 <fungi> the `zuul enqueue` subcommand's syntax has changed compared to what tools/zuul-changes.py produces now
19:11 <clarkb> fungi: I thought that change was backward compatible?
19:11 <clarkb> is it complaining about the trigger option?
19:12 <fungi> aha, no, it recorded a tenant of None
19:12 <fungi> will sed these real quick
19:12 <clarkb> k
19:12 <fungi> that's better
19:13 <fungi> will work out whatever's causing that in the python3ification
19:13 <clarkb> (there was a change to make --trigger optional but it should have been backward compat so good to hear that it was a different problem)
19:15 <clarkb> executors are all stopped and now are getting their ansible venvs updated
19:15 <clarkb> oh that's done now and they have started. playbook is completed
19:15 <clarkb> no errors other than manually rm'ing the web pid file
19:16 <clarkb> spot check on ze10's 2.8 ansible install shows an openstacksdk of 0.39.0
19:16 <Shrews> mnaser: fyi, i'm seeing something odd with some vexxhost instances in sjc1. There are instances that are either in the 'active' or 'error' state, power state is 'running', but the task state is 'deleting'. The ones I'm looking at seem to have all been created on nov 13
19:16 <clarkb> fungi: I'll recheck my base-test tester change now
19:16 <fungi> reenqueuing is still underway
19:17 <clarkb> https://review.opendev.org/#/c/680178/ is the base-test tester change
19:17 <fungi> cool
19:17 <fungi> reenqueue completed too, time to #status log?
19:17 <clarkb> expect that some executors will be busy rm'ing their stale build dirs for a few minutes on startup
19:18 <clarkb> though ze10 is into running jobs at this point so we may be generally past that step
19:18 <clarkb> fungi: ++ did you want to do it or should I?
19:18 <fungi> oh, i guess still waiting for them to start fully, yeah
19:19 <fungi> status log completed full zuul and nodepool restart at 19:00 utc for updated openstacksdk 0.39.0 release
19:19 <fungi> that look reasonable?
19:19 <fungi> s/completed/performed/ maybe
19:19 <corvus> can we throw in some shas?
19:19 <clarkb> fungi: should include the zuul version too, zuul==3.11.2.dev72  # git sha 57aa3a0, and make note of the depends-on and github driver changes?
19:19 <fungi> oh, yeah good thinkin
19:19 <fungi> right-o
19:19 <fungi> nodepool rev too?
19:20 <Shrews> mnaser: in case i lose my data set, sample instances include: 5045d5dc-cbf3-4f62-962d-4d47dc625567, 717a819c-78a9-4b18-974d-2f5a2616ed54, 447a7e88-f338-4c1b-8777-44360d7a6972, 39128606-8abb-411c-9b54-60264139105c
19:20 <clarkb> fungi: shrews did nodepool independently
19:20 <clarkb> (it isn't part of this playbook)
19:20 <Shrews> i did not do launchers today
19:20 <clarkb> Shrews: k
19:20 <fungi> ahh, okay, just builders
19:21 <fungi> status log completed full zuul restart at 19:00 utc with zuul==3.11.2.dev72 (57aa3a0) for updated openstacksdk 0.39.0 release and depends-on and github driver changes
19:21 <clarkb> fungi: ++
19:21 <fungi> still forgot to s/completed/performed/
19:21 <fungi> any elaboration on depends-on and github driver changes needed?
19:22 <clarkb> fungi: for both I would say something like "to avoid looking up extra unneeded changes via code review APIs"
19:23 <clarkb> they were related changes
19:25 <fungi> status log Performed full Zuul restart at 19:00 UTC with zuul==3.11.2.dev72 (57aa3a0) for updated OpenStackSDK 0.39.0 release and Depends-On and GitHub driver improvements to avoid looking up extra unneeded changes via code review APIs
19:25 <fungi> better?
19:25 <clarkb> yup
19:25 <fungi> #status log Performed full Zuul restart at 19:00 UTC with zuul==3.11.2.dev72 (57aa3a0) for updated OpenStackSDK 0.39.0 release and Depends-On and GitHub driver improvements to avoid looking up extra unneeded changes via code review APIs
19:25 <openstackstatus> fungi: finished logging
19:29 <clarkb> fungi: http://zuul.opendev.org/t/zuul/stream/56f9a111ff6e4cdf8323ca9752458552?logfile=console.log to watch base-test testing
19:32 <Shrews> infra-root: fyi, with the scheduler restart, we *should* see autohold held nodes sorted by build ID (for new holds going forward)
19:32 <Shrews> s/sorted/grouped/
19:33 <corvus> way cool for multinode jobs
19:33 <stevebaker_> hey, I'm researching setting up a CI failure bot for #openstack-ironic, but I can't find any other project doing this (other than the now inactive hubbot)
19:33 <corvus> much more useful than the previous "here is a pile of nodes!  good luck!" approach :)
19:34 <fungi> clarkb: so initial stab, zuul-changes.py works under python3 as long as we .decode('utf-8') the bytestreams coming out of urlopen().read(), but that doesn't seem to solve the failure to differentiate tenants, so deeper investigation is warranted
19:34 <corvus> stevebaker_: gerritbot has a broken feature to announce verified -2 reports (ie, gate failures).  fixing that is probably not more than a line or two of code.
19:35 <openstackgerrit> Tobias Henkel proposed zuul/nodepool master: Add missing release note about post-upload-hook  https://review.opendev.org/698072
19:35 <clarkb> fungi: we should rule out (or in) that zuul api gives us the correct json blob as expected
19:35 <corvus> stevebaker_: https://opendev.org/opendev/gerritbot/src/branch/master/gerritbot/bot.py#L208
19:35 <stevebaker_> corvus: interesting, thanks
19:36 <fungi> clarkb: well, i mean, i can use the zuul dashboard to browse the zuul tenant status so that's hitting the api and getting correct responses
19:36 <corvus> stevebaker_: i think the issue is mostly just that the approval type changed from "VRIF" to "Verified".  or something like that.
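[Illustrative aside — a hedged sketch of the kind of check corvus is describing: gerritbot's comment-added handling looking for a negative Verified vote in a Gerrit stream-events payload. The real code is in gerritbot/bot.py linked above; the field names follow Gerrit's stream-events JSON but are written from memory.]

    def is_gate_failure(event):
        """Return True for a comment-added event carrying a negative Verified vote."""
        for approval in event.get('approvals', []):
            # Older Gerrit emitted the label type as "VRIF"; newer releases
            # report "Verified", which is what the broken matcher misses.
            if approval.get('type') in ('VRIF', 'Verified'):
                if int(approval.get('value', 0)) < 0:
                    return True
        return False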
19:36 <clarkb> fungi: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_fd5/680178/3/check/tox-py27/fd5c36e/ I think it worked. Note the url is bhs not bhs1 :)
19:37 <clarkb> similar story with https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_56f/680178/3/check/tox-py35/56f9a11/
19:37 <clarkb> I'll go ahead and get base-jobs change pushed up to have this be new production state
19:40 <fungi> clarkb: yep! lgtm now
19:43 <mnaser> Shrews: ok ill check those out, thanks for pinging me
19:43 <mnaser> we added monitoring recently to catch ERROR state instances :)
19:44 <mnaser> so anything that's ERROR for over 24 hours alerts us
19:44 <openstackgerrit> Clark Boylan proposed opendev/base-jobs master: Push OVH region update to production  https://review.opendev.org/698074
19:44 <Shrews> mnaser: \o/
19:45 <mnaser> Shrews: actually it's available for anyone :)
19:45 <clarkb> infra-root ^ ovh change should be ready now I think
19:45 <mnaser> been working on this https://github.com/openstack-exporter :) -- the helmcharts have all the alerts
19:52 <fungi> clarkb: okay, somewhat good news on the zuul-changes.py script. it's actually sorta working on python 3.6 and later, apparently urlopen().read() produces a data type json.loads() can ingest just fine. it breaks on python 3.5, which is the default python3 on zuul.opendev.org (ubuntu xenial)
19:52 <fungi> however, the current behavior of the script when it hits a multi-tenant zuul api is to output changes matching the specified pipeline for all tenants, not filtered to the tenant provided in its command-line arguments
19:53 <fungi> i have a feeling it's been this way since multi-tenancy support was introduced in the script
19:53 <clarkb> oh neat I think that behavior works better for our use case?
19:54 <fungi> yeah, albeit confusing
19:54 <clarkb> infra-root I am going to upgrade openstacksdk to 0.39.0 on nl04 then restart nodepool launcher there
19:54 <clarkb> I'm choosing nl04 as the canary because it talks to ovh and sdk update affects ovh
19:54 <mordred> clarkb: ++
19:55 <fungi> so if you `python3 tools/zuul-changes.py https://zuul.opendev.org zuul check` you get the contents of any pipeline named "check" across all your zuul tenants (with the corresponding tenant names in the zuul enqueue(-ref) command-line, so that's safe at least
19:56 <clarkb> alright nl04 has had that done to it
19:56 <clarkb> Shrews: ^ fyi
20:00 <clarkb> 0013298874 built in bhs1 and went in-use not long ago
20:00 <clarkb> first indications seem to be good given that
20:00 <clarkb> mordred: ^ can you confirm that would cover the bug fix that you wrote for nodepool on the microversion side of things too?
20:01 <mordred> clarkb: 0.39.0 definitely would
20:01 <mordred> oh - wait - the nodepool fix
20:02 <clarkb> ya I'm testing two things. First is that ovh continues to work after the sdk json updates (seems to) and the other is whether or not your microversion fix is working (so we can make a release)
20:02 <clarkb> I believe that ovh booting instances would cover your microversion fix if their microversions support the thing you were fixing (I don't know how to determine that bit of info)
20:03 <mordred> clarkb: let me check them real quick
20:05 <mordred> clarkb: nope.
20:05 <mordred> {'versions': [{'status': 'SUPPORTED', 'min_version': '', 'updated': '2011-01-21T11:33:21Z', 'links': [{'rel': 'self', 'href': 'https://compute.bhs1.cloud.ovh.net/v2/'}], 'id': 'v2.0', 'version': ''}, {'status': 'CURRENT', 'min_version': '2.1', 'updated': '2013-07-23T11:33:21Z', 'links': [{'rel': 'self', 'href': 'https://compute.bhs1.cloud.ovh.net/v2.1/'}], 'id': 'v2.1', 'version': '2.38'}]}
20:05 <mordred> 2.38 is as new as they have there
20:05 <clarkb> mordred: can you check if vexxhost or fn supports it?
20:05 <clarkb> I can restart that launcher next
20:05 <mordred> I am 100% certain vexxhost does
20:05 <clarkb> alright I'll do vexxhost next then
20:05 <mordred> they're running train :)
20:06 <openstackgerrit> Tobias Henkel proposed zuul/nodepool master: Add missing release notes  https://review.opendev.org/698072
20:07 <clarkb> nl03 has been restarted now as well with sdk 0.39.0
20:07 <clarkb> this should cover mordred's microversion fix for nova in nodepool
20:07 <mordred> clarkb: actually - what region are we using with vexxhost? ca-ymq-1 ?
20:07 <clarkb> mordred: sjc1
20:07 <fungi> clarkb: yet more investigation indicates we get the multi-tenant behavior on zuul.o.o under python 2.7 as well if we use the main url and not the whitebox one
20:07 <clarkb> e391572495a18b8053a18f9b85beb97799f1126d is the nodepool commit I've restarted 03 and 04 on
20:08 <mordred> {'version': {'status': 'CURRENT', 'min_version': '2.1', 'updated': '2013-07-23T11:33:21Z', 'media-types': [{'type': 'application/vnd.openstack.compute+json;version=2.1', 'base': 'application/json'}], 'links': [{'rel': 'self', 'href': 'https://compute-sjc1.vexxhost.us/v2.1/'}, {'type': 'text/html', 'rel': 'describedby', 'href': 'http://docs.openstack.org/'}], 'id': 'v2.1', 'version': '2.72'}}
20:08 <mordred> clarkb: so yes - they are running 2.72
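[Illustrative aside, not from the log: a small sketch of the version-discovery check mordred is doing by hand above — fetch each compute endpoint's v2.1 document and read the maximum supported microversion. The URLs and JSON shape match the pastes above; version discovery is normally unauthenticated.]

    import requests

    for endpoint in ('https://compute.bhs1.cloud.ovh.net/v2.1/',
                     'https://compute-sjc1.vexxhost.us/v2.1/'):
        doc = requests.get(endpoint).json()
        # e.g. '2.38' for OVH BHS1 and '2.72' for vexxhost sjc1 at the time
        print(endpoint, doc['version']['version'])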
20:10 <clarkb> ok we have successfully built and put into use vexxhost nodes since the restart with newer sdk. That implies to me that mordred's microversion fix is working
20:11 <clarkb> I'll restart 02 and 01 now with new sdk as well
20:13 <clarkb> all 4 launchers are now restarted at nodepool==3.9.1.dev11  # git sha e391572
20:16 <clarkb> things look stable. I'm going to figure out lunch now
20:16 <openstackgerrit> David Shrewsbury proposed zuul/zuul master: Sort autoholds by request ID  https://review.opendev.org/698077
20:22 <Shrews> clarkb: awesome, thx
20:50 <openstackgerrit> Merged opendev/base-jobs master: Push OVH region update to production  https://review.opendev.org/698074
20:58 <openstackgerrit> Monty Taylor proposed opendev/storyboard master: Build container images  https://review.opendev.org/611191
20:59 <mordred> fungi, diablo_rojo: ^^ I had to respin that due to an error
21:01 <corvus> mordred, fungi: i'm surprised to see the opendevzuul key in that change... what's the goal?
21:03 <fungi> uploading docker images of storyboard builds to the opendev namespace on dockerhub... which key should we use for that?
21:03 <fungi> as a precursor to switching our storyboard deployments from puppety to ansible+containers
21:08 <corvus> fungi: that's the right key and the way we do that.  i was unsure whether the opendev/ namespace was the target, and if not, but the opendevzuul key was intended, then i would expect not to require a copy of the key there.  but it is, so it is.
21:08 <corvus> long story short, that change occupied about 3 states of quantum superposition in my mind, but has collapsed to one :)
21:09 <fungi> no worries, could/should we do that in project-config instead of spreading copies of the key around?
21:10 <fungi> either way, it's probably something we should call particular attention to in reviews, i agree
21:12 <corvus> i think there is a way to reduce copies of it (which we didn't see fit to do earlier since we thought images in opendev/ would come only from system-config).  but we can futz with that later.
21:13 <fungi> we could also do the image building via system-config with a little bit of rejiggering, i expect
21:14 <fungi> this is probably verging into ianw's earlier topic wrt opendev images for dib/nodepool except in this case it's an opendev image of an opendev project... but maybe we want to treat that case similarly?
21:15 <fungi> consistency could be a good thing there, i suppose
21:17 <corvus> yeah, in nodepool's case it's a different org and a different responsible party.  in this case (and probably for dib) it seems we're saying both are the same, so the answer is different.
21:18 <openstackgerrit> Steve Baker proposed opendev/gerritbot master: Fix event comment-added  https://review.opendev.org/698089
21:18 <corvus> mordred: re https://github.com/bgeesaman/registry-search-order from ianw -- when we use podman-compose we should make sure that we do not have a search line in registries.conf.
21:19 <fungi> gonna try to grab an early dinner out, should be back soonish
21:22 <corvus> i mean, the fully-qualified names should render us safe, but double checking that we don't have that configured is a good belts+suspenders thing
21:24 <clarkb> ianw: did you want to keep the dib container topic on the infra meeting agenda?
21:24 <clarkb> (I'm not sure if that has been sufficiently resolved with the siblings work)
21:26 <clarkb> #status log Restarted Nodepool Launchers at nodepool==3.9.1.dev11 (e391572) and openstacksdk==0.39.0 to pick up OVH profile updates as well as fixes in nodepool to support newer openstacksdk
21:26 <openstackstatus> clarkb: finished logging
21:27 <openstackgerrit> Steve Baker proposed openstack/project-config master: IRC #openstack-ironic gerritbot CI failed messages  https://review.opendev.org/698091
21:28 <stevebaker_> corvus: thanks, I've proposed this https://review.opendev.org/698089
21:30 <openstackgerrit> Merged zuul/zuul master: Sort autoholds by request ID  https://review.opendev.org/698077
21:32 <corvus> stevebaker_: that looks good to me -- looks like we'll have to dust off a few cobwebs with the dependencies, but that should be a tractable problem
21:32 <corvus> stevebaker_: https://review.opendev.org/545502 looks like it may be worth a look
21:35 <stevebaker_> corvus: readable event names would be nice. I'd be happy with that change, then reworking mine to just add x-all-comments
21:38 <stevebaker_> corvus: or landing my change then I can rebase https://review.opendev.org/545502
21:44 <openstackgerrit> Bernard Cafarelli proposed openstack/openstack-zuul-jobs master: Update openstack-python-jobs-neutron templates  https://review.opendev.org/698095
21:49 <openstackgerrit> Steve Baker proposed opendev/gerritbot master: Fix event comment-added  https://review.opendev.org/698089
21:52 <clarkb> zuul and nodepool continue to look stable. I'm popping out for a bike ride. Back in a bit
22:00 <ianw> clarkb: yeah, i think we can drop the topic, we have wip
22:12 <donnyd> did someone restart nl02?
22:12 <donnyd> duh.. it was right in front of me
22:13 <donnyd> clarkb: over time nl02 thinks there are a bunch of nodes in "ready", and for whatever reason it counts them against quota
*** cgoncalves has quit IRC22:16
*** rcernin has joined #openstack-infra22:16
*** Lucas_Gray has quit IRC22:17
openstackgerritIan Wienand proposed opendev/system-config master: Add roles for a basic static server  https://review.opendev.org/69758722:19
*** cgoncalves has joined #openstack-infra22:23
*** cgoncalves has quit IRC22:26
*** cgoncalves has joined #openstack-infra22:27
*** cgoncalves has quit IRC22:27
*** cgoncalves has joined #openstack-infra22:28
openstackgerritMerged zuul/nodepool master: Add missing release notes  https://review.opendev.org/69807222:34
*** rcernin has quit IRC22:36
openstackgerritIan Wienand proposed opendev/system-config master: mirror: remove debug output of apache config  https://review.opendev.org/69810422:38
openstackgerritIan Wienand proposed opendev/system-config master: Add roles for a basic static server  https://review.opendev.org/69758722:44
*** rkukura has quit IRC22:45
*** rkukura has joined #openstack-infra22:46
fungidonnyd: those are probably part of the min-ready count for various labels22:54
openstackgerritMerged opendev/system-config master: mirror jobs: copy acme.sh output  https://review.opendev.org/69620822:54
fungiwe might want to consider whether we should drop min-ready to 0 for infrequently-used node labels22:55
donnydI should just take mine down to 0 and see if it stops doing that22:55
fungiwell, they're a distributed global count22:55
fungifor frequently-used node labels it's not a problem because they'll be used to satisfy a node request not too long after they're booted22:56
donnydI should take them down to 0 in the FN config because it only takes 30-40 seconds for an instance to launch22:56
fungiyeah, trying to say those aren't per-region/per-provider values22:57
donnydI see Ubuntu and centos instances brought up and not used for weeks sometimes22:57
fungiwhich releases of ubuntu and centos? because that would be weird22:57
donnydohic22:58
donnydbionic and 722:58
donnydi think it happens when a new image is loaded22:58
fungithe idea is that nodepool is configured to boot some number of each label in advance to try to keep a pool of instantly-available nodes for some requests; the usual problem comes when we add a new node type, or one falls out of general use, and we pre-boot some which sit around for weeks waiting on a build to request them22:59
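The knob being discussed is a per-label min-ready value in the nodepool launcher config. The snippet below is purely illustrative; the label names and numbers are assumptions, not the actual project-config settings.

    # Illustrative nodepool launcher config, not the real project-config values.
    # min-ready is a global per-label target (not per provider), so a rarely
    # requested label can leave pre-booted nodes sitting idle for weeks.
    labels:
      - name: ubuntu-bionic
        min-ready: 1          # frequently requested, turns over quickly
      - name: centos-7
        min-ready: 1
      - name: rarely-used-label   # hypothetical label name
        min-ready: 0          # boot on demand only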
fungibut centos-7 and ubuntu-bionic nodes should be getting used straight away22:59
fungiso maybe this is indicative of a leak of some kind22:59
donnydthat is not what always happens22:59
donnydNext time I see it I will hit you up so maybe we can diagnose why, but they surely do that23:00
donnydI usually killed them, but then the "ready" node never gets released back to NL23:00
*** kjackal has quit IRC23:00
donnydI know a couple months back FN readys were at like 15 nodes23:00
fungii'm plumbing the depths of the node list now23:01
*** kjackal has joined #openstack-infra23:01
donnydwhich didn't actually exist, but nl was tracking that they did23:01
fungiright now there are no "ready" nodes in fn for any label23:02
donnydhttp://grafana.openstack.org/d/3Bwpi5SZk/nodepool-fortnebula?orgId=1&from=now-1w%2Fw&to=now-1w%2Fw23:02
donnydthat is because nl was just restarted, but give it a few weeks and it will slowly creep up23:02
fungi(according to the `nodepool list` command anyway)23:02
fungiokay, certainly sounds like a leak of some kind, would definitely be good to get to the bottom of, because what you're describing doesn't sound like behavior we expect23:03
donnydWe talked about this last time it was super high23:03
fungibut yeah, we'll probably need to wait for it to crop back up23:03
fungii doubt i was also super high at the time, but for the life of me i can't remember what we investigated/discovered23:04
donnydwe restarted nl and it went away last time too23:04
donnydnl02 that is23:05
fungiyeah, that's the launcher handling fn (and a few other providers)23:05
fungihopefully next time before an nl02 restart we can dig into logs around some example nodes stuck in "ready" state there23:05
donnydit's strange for sure. I will keep a closer eye on it and report back when it's doing that again23:06
fungithanks23:06
donnydprobably take another month though23:06
donnydHope all is well with you fungi  :)23:06
fungioh, yep, tourists are gone, no more 'canes headed our way this year probably, time to kick back and relax23:08
fungii hope things are well with you too!23:08
ianwdoes bridge.o.o really need to do "    - name: Clone puppet modules to /etc/puppet/modules"?  they're not used by bridge.o.o i don't think?23:10
*** diablo_rojo has quit IRC23:12
*** tkajinam has joined #openstack-infra23:12
*** ramishra has quit IRC23:12
fungidoes it copy those to remote nodes or only tell remote nodes to retrieve them?23:13
fungiat one time we pushed copies of puppet modules to remote nodes23:13
clarkbfungi: donnyd I think min ready is only served by rax right now23:14
clarkbbecause nodepool doesn't know how to distribute that23:14
clarkbis it possible we are leaking instances in fn and they count against our quota but nova api doesn't show them to nodepool anymore?23:14
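One way to test that hypothesis, sketched with openstacksdk; the cloud name is a placeholder clouds.yaml entry, and the exact limits attribute names can vary between sdk releases.

    # Rough sketch, not opendev tooling: compare the servers nova will list with
    # what the absolute limits say is counted against quota. A gap suggests
    # something is consuming quota that the API no longer shows.
    import openstack

    conn = openstack.connect(cloud="fortnebula")  # placeholder clouds.yaml entry

    servers = list(conn.compute.servers())
    limits = conn.compute.get_limits().absolute

    print(f"servers visible via the API: {len(servers)}")
    print(f"instances counted against quota: {limits.instances_used}")
    if limits.instances_used > len(servers):
        print("possible leak: quota usage exceeds the visible server count")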
clarkbianw: I believe we copy the master modules copies on bridge to the individual nodes23:14
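Purely as an illustration of what clarkb describes (this is not the actual system-config playbook, and the paths are assumptions), a task that pushes the bridge's module checkouts out to a node could look roughly like this:

    # Illustrative only -- not the actual system-config task. When the playbook
    # runs from the bridge, synchronize reads src on the control host and writes
    # dest on the managed node, i.e. "copy the master module copies to each node".
    - name: Push puppet modules from the bridge to the node
      synchronize:
        src: /etc/puppet/modules/
        dest: /etc/puppet/modules/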
*** ociuhandu has joined #openstack-infra23:15
*** diablo_rojo has joined #openstack-infra23:15
*** rcernin has joined #openstack-infra23:17
ianwok, that could be right.  not going to get sidetracked :)23:19
*** ociuhandu has quit IRC23:20
*** ociuhandu has joined #openstack-infra23:24
*** harlowja has quit IRC23:33
*** ociuhandu has quit IRC23:33
*** ociuhandu has joined #openstack-infra23:34
*** harlowja has joined #openstack-infra23:35
*** mriedem has quit IRC23:37
*** dchen has joined #openstack-infra23:37
*** ociuhandu has quit IRC23:39
*** ijw_ has quit IRC23:41
openstackgerritMerged zuul/zuul-jobs master: build-container-image: support sibling copy  https://review.opendev.org/69793623:42
*** rlandy has quit IRC23:43
clarkbwhen I was doing nodepool launcher restarts I noticed that we may have leaked some volumes in vexxhost sjc1. I think some of those leaks are related to the instances that shrews found won't delete but there are others that don't appear to be tied to specific instances anymore23:44
clarkbI'm going to try and clean those unassociated volumes now23:44
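A hedged sketch of the kind of cleanup pass being described, again with openstacksdk; the cloud name is a placeholder and the delete call is left commented out so nothing is removed without review.

    # Rough sketch, not opendev tooling: list volumes that are "available" with
    # no attachments, i.e. candidates for having leaked from deleted instances.
    import openstack

    conn = openstack.connect(cloud="vexxhost-sjc1")  # placeholder clouds.yaml entry

    for vol in conn.block_storage.volumes(details=True):
        if vol.status == "available" and not vol.attachments:
            print(vol.id, vol.name, vol.created_at)
            # conn.block_storage.delete_volume(vol)  # only after manual review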
*** ociuhandu has joined #openstack-infra23:44
openstackgerritMerged opendev/system-config master: mirror: remove debug output of apache config  https://review.opendev.org/69810423:45
openstackgerritMerged zuul/zuul-jobs master: build-docker-image: fix up siblings copy  https://review.opendev.org/69761423:49
*** ociuhandu has quit IRC23:49
openstackgerritMatt McEuen proposed openstack/project-config master: New project: ansible-role-airship  https://review.opendev.org/69811423:50
*** goldyfruit_ has quit IRC23:50
*** ociuhandu has joined #openstack-infra23:54
*** tosky has quit IRC23:56
openstackgerritMatt McEuen proposed openstack/project-config master: New project: go-redfish  https://review.opendev.org/69811523:59
