Monday, 2019-12-09

openstackgerritIan Wienand proposed zuul/zuul-jobs master: build-container-image: support sibling copy  https://review.opendev.org/69793600:00
openstackgerritIan Wienand proposed zuul/zuul-jobs master: build-docker-image: fix up siblings copy  https://review.opendev.org/69761400:00
openstackgerritIan Wienand proposed zuul/zuul-jobs master: build-container-image: support sibling copy  https://review.opendev.org/69793600:14
openstackgerritIan Wienand proposed zuul/zuul-jobs master: build-docker-image: fix up siblings copy  https://review.opendev.org/69761400:14
openstackgerritMerged zuul/zuul-jobs master: Fixes tox log fetching when envlist is set to 'ALL'  https://review.opendev.org/69653100:18
*** irclogbot_0 has joined #zuul00:24
*** irclogbot_0 has quit IRC00:40
*** jamesmcarthur has joined #zuul00:40
*** sgw has quit IRC01:11
*** jamesmcarthur has quit IRC01:17
*** irclogbot_0 has joined #zuul01:28
*** irclogbot_0 has quit IRC01:48
*** bhavikdbavishi has joined #zuul02:40
*** bhavikdbavishi1 has joined #zuul02:43
*** bhavikdbavishi has quit IRC02:44
*** bhavikdbavishi1 is now known as bhavikdbavishi02:44
*** irclogbot_2 has joined #zuul02:48
mnaserianw: did you end up breaking through with that dib in container bug?03:00
ianwmnaser: so i think dib is ok in containers; the one real issue was that it was inspecting the host init system, which failed in a container without an init system03:43
ianwmnaser: so i have had nodepool-builder+dib build and boot images in the functional test.  there's a lot in flight though about how we generate those images -- so at the moment it's not so much whether we can, but *how* we go about building the containers03:44
ianw697614 697936 add "sibling" installs to podman/docker builds, so we can build nodepool+dib+openstacksdk all into one container in the gate03:45
ianwcorvus: ^ reviews of those most welcome, it's been updated around the recently merged podman work03:46
mnaserianw: awesome. I’ll have a look at those tomorrow when I get sometime03:46
*** bhavikdbavishi has quit IRC03:47
ianwthanks, with those merged, i expect i can get the nodepool functional job using containers (theoretically, assuming no other issues pop up in practice :)03:48
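For context, a rough sketch of what the sibling-copy support under discussion might look like from a job's point of view: a build job that lists the source repos it needs and copies them into the image build context as siblings. Variable names follow the zuul-jobs build-docker-image role as I recall it; the job name, repository and project names here are hypothetical.

    - job:
        name: nodepool-build-image-siblings
        parent: build-docker-image
        required-projects:
          # checked out by Zuul; the siblings list below copies them into the build context
          - zuul/nodepool
          - openstack/diskimage-builder
          - openstack/openstacksdk
        vars:
          docker_images:
            - context: .
              repository: example/nodepool-builder
              siblings:
                - opendev.org/openstack/diskimage-builder
                - opendev.org/openstack/openstacksdk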
*** jamesmcarthur has joined #zuul04:11
*** jamesmcarthur has quit IRC04:16
*** bhavikdbavishi has joined #zuul04:34
*** bhavikdbavishi1 has joined #zuul04:36
*** bhavikdbavishi has quit IRC04:38
*** bhavikdbavishi1 is now known as bhavikdbavishi04:38
*** jamesmcarthur has joined #zuul04:42
*** jamesmcarthur has quit IRC04:48
*** jamesmcarthur has joined #zuul05:15
*** jamesmcarthur has quit IRC05:20
*** jamesmcarthur has joined #zuul05:25
*** jamesmcarthur has quit IRC05:30
*** raukadah is now known as chkumar|ruck06:04
*** jamesmcarthur has joined #zuul06:26
*** jamesmcarthur has quit IRC06:31
*** jamesmcarthur has joined #zuul07:05
*** AJaeger has quit IRC07:09
*** jamesmcarthur has quit IRC07:11
*** AJaeger has joined #zuul07:13
*** igordc has joined #zuul08:01
*** jamesmcarthur has joined #zuul08:07
*** sshnaidm|off is now known as sshnaidm08:11
*** jamesmcarthur has quit IRC08:11
*** jangutter has joined #zuul08:11
*** themroc has joined #zuul08:13
*** jangutter has quit IRC08:16
*** jangutter has joined #zuul08:16
*** avass has joined #zuul08:17
*** tosky has joined #zuul08:25
*** jcapitao|afk has joined #zuul08:30
*** igordc has quit IRC08:36
*** jamesmcarthur has joined #zuul08:46
*** saneax has joined #zuul08:51
*** jamesmcarthur has quit IRC08:51
*** avass has quit IRC09:04
*** jcapitao|afk is now known as jcapitao09:07
*** yolanda has joined #zuul09:09
*** hashar has joined #zuul09:29
*** jamesmcarthur has joined #zuul09:47
*** jamesmcarthur has quit IRC09:52
*** ssbarnea has quit IRC10:16
*** jamesmcarthur has joined #zuul10:28
openstackgerritMatthieu Huin proposed zuul/zuul master: enqueue: make trigger deprecated  https://review.opendev.org/69544610:30
*** saneax has quit IRC10:30
*** saneax has joined #zuul10:30
*** jamesmcarthur has quit IRC10:33
*** mhu has joined #zuul10:58
*** avass has joined #zuul11:00
*** ssbarnea has joined #zuul11:06
*** themroc has quit IRC11:26
*** jamesmcarthur has joined #zuul11:29
*** jamesmcarthur has quit IRC11:33
*** jcapitao is now known as jcapitao|lunch11:36
avasshow does nodepool handle it if a static node and a cloud provider node have the same label? Does it check if the static node is available before launching another node, or could that be a bit random?11:43
*** pcaruana has joined #zuul11:46
*** hashar has quit IRC11:56
openstackgerritAlbin Vass proposed zuul/nodepool master: Aws cloud-image is referred to from pool labels section  https://review.opendev.org/69799811:56
*** jamesmcarthur has joined #zuul12:06
*** jamesmcarthur has quit IRC12:11
sugaarHi, I was wondering why the demo uses a custom Dockerfile to launch nodepool (https://opendev.org/zuul/zuul/src/branch/master/doc/source/admin/examples/node-Dockerfile) instead of the official one from Dockerhub https://hub.docker.com/r/zuul/nodepool-launcher12:30
sugaarI am having a problem with the nodepool driver: even though I changed the demo to use the kubernetes driver, and the modified etc/nodepool/nodepool.yaml file is being loaded, the static driver is still running. Does anyone have a clue why that is happening? I tried deleting the nodepool image from my host, thinking that maybe the old one was being12:33
sugaarlaunched instead of a new one being created, but it didn't work12:33
*** hashar has joined #zuul12:37
avasssugaar: the node-Dockerfile is an example workernode. nodepool is called 'launcher' in the docker-compose file: https://opendev.org/zuul/zuul/src/branch/master/doc/source/admin/examples/docker-compose.yaml#L9912:40
*** bhavikdbavishi has quit IRC12:53
*** themroc has joined #zuul12:56
Shrewsavass: it would be random. there is no priority among providers12:56
*** rlandy has joined #zuul12:59
*** jamesmcarthur has joined #zuul13:00
avassshrews: too bad. was hoping we could use aws for overcapacity somehow since we still have those static nodes.13:03
Shrewsyou are not the first to desire such a feature13:07
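To make the question concrete, a sketch (not from the log) of a nodepool.yaml where the same label is offered by both a static pool and an AWS pool; as Shrews says, a request for that label can be satisfied by either provider, with no preference for the static node. Field names are from memory of the static and aws driver docs, and all host, image and flavor values are made up.

    labels:
      - name: ci-node
    providers:
      - name: static-provider
        driver: static
        pools:
          - name: main
            nodes:
              - name: static01.example.com
                labels:
                  - ci-node
      - name: aws-provider
        driver: aws
        region-name: us-east-1
        cloud-images:
          - name: example-image
            image-id: ami-0123456789abcdef0
        pools:
          - name: main
            max-servers: 5
            labels:
              - name: ci-node
                cloud-image: example-image
                instance-type: t3.large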
*** jcapitao|lunch is now known as jcapitao13:08
tobiashShrews: do you have any objection about adding nodepool image metadata as env variables during image build?13:12
tobiashthis way we would have a way to add this metadata into the image via a special element13:13
tobiashthe use case is that we would like to print some metadata (image name and number) within the jobs13:13
tobiashI thought about adding something like DIB_NODEPOOL_IMAGE_NAME and DIB_NODEPOOL_IMAGE_BUILD13:14
*** jamesmcarthur has quit IRC13:15
*** jamesmcarthur has joined #zuul13:15
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure: remove connectors burden and simplify code  https://review.opendev.org/69613413:19
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure: remove connectors burden and simplify code  https://review.opendev.org/69613413:20
*** zbr has quit IRC13:27
*** zbr has joined #zuul13:32
openstackgerritMatthieu Huin proposed zuul/zuul master: enqueue: make trigger deprecated  https://review.opendev.org/69544613:33
*** jamesmcarthur has quit IRC13:34
Shrewstobiash: not particularly, although image name is already in the metadata. is  DIB_NODEPOOL_IMAGE_BUILD the build ID? because only nodepool has that info at build time13:34
Shrewsoh, i guess the builder would be the one providing that info via env var13:35
*** Goneri has joined #zuul13:35
tobiashyes, that was my idea13:35
Shrewstobiash: i won't object, but it is a community decision, not mine alone.  :)13:36
tobiashShrews: cool :)13:37
Shrewstobiash: i'm not clear on how that info is passed to the launcher though, since it's the one setting metadata13:37
tobiashShrews: the idea is that nodepool supplies that var to dib and then a dib element can put this information e.g. into /etc/image-info13:38
tobiashthen e.g. the base playbook can read out this file if it exists13:38
tobiashthe launcher and zuul then don't have to know this data13:39
tobiashanother idea could be to put this into the nodepool node metadata in zk (I just checked, only the label is there currently, not the image or the build id)13:40
Shrewsi don't like the idea of storing any number of environment variables used for debugging purposes in zk13:43
Shrewsi'd rather restrict zk data to just what is necessary for nodepool/zuul to operate13:43
*** jamesmcarthur has joined #zuul13:45
sugaarthanks avass, I didn't notice the image underneath13:52
avasssugaar: no problem!13:53
sugaarIs the nodepool-launcher a generic image, or is it predefined to use a driver by default? Because I keep configuring etc/nodepool.yaml to use the kubernetes driver but it always uses the static driver13:53
avasssugaar: what does your nodepool.yaml look like?13:54
*** Petar_T has joined #zuul13:58
yoctozeptopabelanger, tobiash: just wrote to the ML, but thought I might as well try you here, it seems https://review.opendev.org/678273 is deemed highly controversial in its simplicity :-)14:05
tobiashyoctozepto: you already have my review ;)14:12
Petar_THi all!14:13
* Petar_T waves14:13
Petar_TI was wondering if one of you could help me with a (hopefully simple) question I have. I am fairly new to Zuul and I've been tasked to set up a few projects with pipelines in my company. The infrastructure and a few projects were already set up by someone else but they have now left. I have something I'd like done across multiple projects so I am14:13
Petar_Twriting the config for it in a sort of "common zuul jobs" repo. There are a few examples of common jobs in there already, but I have spotted that they all have multiple items listed under the "job.required-projects" config item - one for each project we'd like the job to run on, as well as the "common zuul jobs" repo so we have access to the actual14:13
Petar_Tplaybooks. Is this the idiomatic way of doing this? What if you had a large number of projects you'd like to run a small job on? Is there a way to get Zuul to not clone repos the job isn't running on? E.g. if I had projects "Foo" and "Bar" which ran something from "common-jobs" then when Foo is running its pipeline, it wouldn't need to clone14:13
Petar_T"Bar". One way that pops into mind is having a variable in the "job.required-projects" list that is set at project level and is merged with hardcoded required projects, but I was wondering if there's a more elegant way of doing it. Thanks14:13
yoctozeptotobiash: yes, for which I thank you :-) yet it did not get much heat since then14:13
Petar_T(oops sorry for the duplicate)14:14
tobiashyoctozepto: most people here were traveling/recovering from travel the last few weeks afaik14:14
*** jamesmcarthur has quit IRC14:15
sugaaravass https://paste.gnome.org/pdhrgfoy714:15
tobiashPetar_T: you don't need required-projects for running a job in a different repo; you only need to list repos that the job needs to have in the workspace while it runs14:16
tobiashPetar_T: the repos containing playbooks and roles will be automatically there14:16
yoctozeptotobiash: ack, hence /me providing the heat instead via ML and IRC14:16
*** jamesmcarthur has joined #zuul14:20
tobiashShrews: I'd have another topic about nodepool-builders: since you fixed image deletion (thanks for that!) nodepool doesn't leak images anymore, but our cloud occasionally has the problem that it leaves zombie volumes of instances behind that block the deletion of images.14:21
tobiashwith many builders (we run 10 atm) and a growing number of non-deletable images, the builders hammer glance with delete requests that all get rejected14:22
tobiashI think we need coordination using zk and some backoff mechanism in the image cleanup14:23
openstackgerritTobias Henkel proposed zuul/nodepool master: Add back-off mechanism for image deletions  https://review.opendev.org/69802314:27
tobiashShrews: what do you think about something like this? ^14:27
Petar_Ttobiash: Let me give you a bit more detail. The job in question is a linter. Its job config and playbooks live in the "common-jobs" repo. I have Foo and Bar, two C/C++ projects which need linting independently. If I have each of them under "required-projects" then when Foo's pipeline is running, cloning Bar would be a waste of effort. Obviously14:27
Petar_Tthis wasted effort is multiplied when there are even more projects that need linting.14:27
Shrewstobiash: yeah that’s the cinder issue we struggle with as well. Not much nodepool can do with those except eventually give up since it requires manual intervention14:27
tobiashShrews: yes, manual intervention will be required anyway, but I think we should at least use a back-off so we don't hammer the cloud14:28
tobiashtests are still missing though14:28
Shrewstobiash: will look in a bit14:28
tobiashPetar_T: you don't need any of foo or bar in required projects14:30
tobiashPetar_T: the project triggering the job is always part of the job14:31
*** mhu has quit IRC14:32
*** jamesmcarthur has quit IRC14:33
Petar_Ttobiash: Right I see! Existing config for other jobs has misled me then. So would I need the "common-jobs" repo? Or since the job is defined in that repo, does it know to clone that too?14:36
tobiashPetar_T: correct since the job is defined in common-jobs you don't need to list it as well14:36
tobiashPetar_T: you would need required-projects only if the job needs access to a repo baz regardless of which repo it runs against14:37
Petar_Ttobiash: Great, thanks. Much clearer now. I might have to correct the existing jobs at some point in that case :)14:38
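A sketch of the arrangement tobiash describes, assuming the linter job lives in the common-jobs repo (all names hypothetical). The repo holding the playbook is available automatically because it defines the job, and whichever project triggered the job is always in the workspace, so no required-projects entry is needed.

    # In common-jobs (.zuul.yaml or zuul.d/):
    - job:
        name: cpp-lint
        run: playbooks/cpp-lint.yaml
        # no required-projects: the triggering project (Foo or Bar) is
        # already present, and common-jobs is included as the job's source

    # In each project that wants the job (Foo, Bar, ...):
    - project:
        check:
          jobs:
            - cpp-lint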
openstackgerritTobias Henkel proposed zuul/nodepool master: Add back-off mechanism for image deletions  https://review.opendev.org/69802314:44
*** jamesmcarthur has joined #zuul14:46
*** jamesmcarthur has joined #zuul14:46
*** mhu has joined #zuul14:57
avasssugaar: went on a fika break :). That doesn't have a static driver pool so I'm gonna guess that there's a configuration problem which maybe causes nodepool to keep an older config?15:01
avassline 13: should be labels and not nodes right?15:03
sugaarindeed15:03
avassare the labels defined in the label section?15:05
avasssugaar: https://paste.gnome.org/pbehhrlxv15:06
Shrewstobiash: i think something like that would be ok (though there are issues with that change as-is). i think we might also want to consider a max number of delete attempts or something and just give up entirely on it, otherwise we just accumulate un-deletable uploads forever.15:07
*** sgw has joined #zuul15:09
*** jamesmcarthur_ has joined #zuul15:09
sugaareven though I changed from nodes to labels I get the same result. Those labels are defined here: https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[kubernetes].pools.labels.type15:11
tobiashShrews: makes sense15:12
sugaaravass I just understood what you meant by defined15:12
avasssugaar: :)15:12
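A minimal sketch of a kubernetes-driver nodepool.yaml along the lines avass is pointing at, assuming the structure from the nodepool docs; host, context and image names are placeholders. The two things to check are that the pool section uses labels (not nodes) and that every pool label is also declared in the top-level labels list.

    zookeeper-servers:
      - host: zk.example.com
        port: 2181
    labels:
      - name: pod-fedora
        min-ready: 0
    providers:
      - name: kube-cluster
        driver: kubernetes
        context: my-kube-context
        pools:
          - name: main
            labels:
              - name: pod-fedora
                type: pod
                image: docker.io/fedora:31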
*** jamesmcarthur has quit IRC15:13
Shrewstobiash: maybe a max timeframe instead, like 1 month or something, to give an admin time to manually intervene, then nodepool would be able to cleanup after that. i'm not sure of the best way to deal with that, tbh. It would be nice if nodepool could email an admin saying "hey, i give up on deleting this thing. it's up to you now."  :)15:16
*** chkumar|ruck is now known as raukadah15:19
openstackgerritAlbin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/69802915:19
tobiashShrews: yes, a max timeframe sounds more reasonable than attempts when we use a back-off15:21
tobiashwe also could regularly generate a summary of non-deletable images into the logs15:21
Shrewstobiash: yeah, but that idea depends on someone regularly checking the logs before you rotate them away15:22
Shrewsi never look at our logs unless there's an issue, tbh, or a new feature we're rolling out15:23
tobiashok, then be it email15:23
Shrewsemail is wishful thinking and hand-wavy, ala mordred   :)15:24
openstackgerritAlbin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/69802915:25
tobiashwell, I think email is the next best thing in that regard unless someone wants to write a jira connector :)15:25
* Shrews does NOT volunteer lol15:26
avass^any tip on where to start if I want to add a test case for those kinds of fixes? :)15:26
Shrewsavass: likely in nodepool/tests/unit/test_driver_aws.py15:28
*** sgw has quit IRC15:28
openstackgerritAlbin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/69802915:42
avassshrews: something like that I guess15:42
Shrewsavass: i'm not terribly familiar with the aws portion. we should ask SpamapS or tristanC to have a look when they get a chance15:44
avassShrews: Sure. Gonna make sure it fails if I remove the fix later as well :)15:45
Shrews++15:47
*** johanssone has quit IRC15:48
*** johanssone has joined #zuul15:49
*** saneax has quit IRC15:53
*** jcapitao is now known as jcapitao|afk16:04
*** themroc has quit IRC16:10
pabelangermorning, I incorrectly posted this in #openstack-infra16:22
pabelangerhttps://review.opendev.org/#/q/status:open+project:zuul/zuul+branch:master+topic:multi-ansible-wip16:22
pabelangercould we review and approve support for ansible 2.9 this week in zuul? I'd like to start to use it in zuul.ansible.com16:23
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure: remove connectors burden and simplify code  https://review.opendev.org/69613416:25
mordredpabelanger: don't kill me - but I kind of think that stack should be inverted - add 2.9 as a patch (very uncontroversial), then switch default to 2.8 - not super controversial, but it still has a user-facing impact we should think about - then remove 2.5 - since even though it's EOL upstream, we've had community concerns about removing ansible support from zuul and that would be a conversation too16:25
mordredpabelanger: so - you know - if they were in the opposite order, I think we could land "add 2.9" pretty quickly and easily without blocking it on the other two conversations16:25
SpamapSavass: Shrews is probably right, that test file has some faked aws stuff that should make it possible to recreate the scenario you see as breaking.16:26
pabelangerokay, I can invert. But I thought we already had the discussion to drop ansible 2.5 support, that's why I rebased the stack that way to start16:26
mordredpabelanger: oh did we?16:27
mordredpabelanger: I'm sorry - if we have, then I'm less worried about it16:27
pabelangerI _think_ so, but maybe we never officially announced anything16:27
mordredclarkb, corvus: ^^16:27
pabelangerbut yah, no issues updating changes if that is what is needed16:28
pabelangerhowever, from the zuuls I know of, I don't think any of them use 2.5 jobs16:29
tristanCbut what if a zuul relies on 2.5?16:31
pabelangertristanC: I am not sure I understand16:32
avassSpamapS: Yep i noticed that after going down the rabbit hole16:33
mordredpabelanger: if someone is running a zuul and they have jobs written that rely on 2.516:33
pabelangerright, of the zuuls I know of, I don't think any are doing that.16:34
pabelangerhowever, it is possible that there are zuuls using 2.516:34
pabelangerhttps://zuul-ci.org/docs/zuul/developer/specs/multiple-ansible-versions.html has listed 2.5 as deprecated for a long time16:35
pabelangerbut doesn't list a process of actually removing ansible version16:35
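For reference, the two knobs the multiple-ansible-versions spec describes, as I recall them: a tenant-wide default and a per-job pin. A job pinned like the second example is the kind that would be affected when 2.5 support is removed. Tenant, job names and versions here are illustrative only.

    # Tenant configuration (main.yaml):
    - tenant:
        name: example-tenant
        default-ansible-version: "2.8"

    # Per-job override in a job definition:
    - job:
        name: legacy-deploy
        ansible-version: "2.5"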
clarkbit would probably be friendly to users to have a 2.5 removal release16:36
clarkbseparate from other changes16:36
clarkbinverting the stack may make that easier? but I don't think it is necessary16:37
corvusif someone wants to propose a policy for how long after ansible EOLs a release we keep it in zuul, i think adding that to https://zuul-ci.org/docs/zuul/developer/specs/multiple-ansible-versions.html#job-configuration would be great16:37
pabelangerI am okay with us removing 2.5 before landing 2.916:37
corvusbasically, come up with when we think we should mark something as deprecated, when we should remove it, write a docs patch, merge it, and we can stick to it16:38
pabelanger(ansible 2.6 is eol now too)16:38
pabelanger+116:38
SpamapS*crikey*16:39
SpamapSTo be honest.. since 2.5-ish, I haven't seen a lot of breakage when I upgraded Ansible.16:40
*** jamesmcarthur has joined #zuul16:40
SpamapSIt was bad, 2.0->2.4 .. lots of aggressive changes.. but feels like it's pretty steady-eddie now.16:40
clarkbthere has been a bunch of deprecated and not-replaced functionality in ansible16:40
corvusSpamapS: yeah, and they've even backed down from one of their more aggressive changes recently16:40
clarkbmostly that results in lots of logging so far but not breakage16:41
pabelangerSpamapS: 2.10 will have some pain, IMO. With the move to collections16:41
corvusthe removal of with_first_found has been stayed pending improvements in loop16:41
pabelangerbut agree, last few years have been nice16:41
*** jamesmcarthur_ has quit IRC16:41
*** jamesmcarthur_ has joined #zuul16:41
SpamapSHeh, perhaps part of my non-troubles is that I avoid loop as much as possible... if it's not just looping on a list.. I write a python module. ;)16:42
pabelangerI'll work on the doc change this week for ansible removal16:43
corvuspabelanger: thanks16:43
*** jcapitao|afk is now known as jcapitao16:43
*** hashar has quit IRC16:44
corvusclarkb, pabelanger: how about a 3.12.0 release to clear out the current stuff, then merge the 2.5 removal and release that as 3.13, then merge the default change and addition of 2.9 and release as 3.14?16:44
corvusor do we want to explore mordred's idea of inverting the stack?16:44
pabelangerI am okay with that16:44
*** jamesmcarthur has quit IRC16:45
clarkbcorvus: that plan wfm16:45
pabelangerany 2.5 user could not move past 3.12.0 until they made the switch to at least 2.6.16:45
pabelangeralso, unrelated, I've been working on a buildset-galaxy server, in zuul.a.c. I've confirmed the idea, from zuuls POC, works as expected :)16:46
pabelangerhttps://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_52/52/2ed1e5fafc215852f746d854ca03c22c839483db/check/galaxy-quick-start-child/7f08f85/job-output.html#l62316:46
pabelangeransible-galaxy CLI pulling from 2 different galaxy servers16:46
corvuspabelanger: awesome!16:47
pabelangerthat is parent job with galaxy server and child pulling from it :)16:47
pabelangerI still have work to create general role, but plan to push up into zuul-jobs at some point16:47
corvuspabelanger: thank you :)16:48
corvusit looks like we'll restart opendev zuul soon, so i'll make the 3.12 release after that16:57
mhucorvus, you think these changes could make the cut for 3.12? https://review.opendev.org/#/q/status:open+project:zuul/zuul+branch:master+topic:zuul_admin_web+label:%22Code-Review%252B2%2217:00
corvusmhu: it looks like we're changing max_validity_time to max_token_age with no backwards-compat handling?17:10
*** dtroyer has joined #zuul17:19
corvusmhu: we might want to support both spellings for a while17:23
openstackgerritAlbin Vass proposed zuul/nodepool master: Keys must be defined for host-key-checking: false  https://review.opendev.org/69802917:23
mhucorvus, right, I'll have a look17:23
avassGonna continue working on that from home. Looks like it's gonna take a bit longer than i planned :)17:24
mordredpabelanger: cool! is that intended to work sort of like the speculative container stuff?17:24
corvusmordred, tristanC: speaking of speculative containers: https://review.opendev.org/696939  is ready for review17:27
pabelangermordred: Yup, I hope so. I believe there will be some changes needed on the ansible-galaxy side; today there are no options to install pre-release versions (semver)17:28
pabelangerbut, since users are expected to install collections from galaxy, and not git, it only makes sense to have an ephemeral galaxy in testing, like we did for docker images17:29
pabelangeralso means we get to provide feedback to the galaxy team now too :)17:29
pabelangerand, is a good story for cross project testing17:30
*** jamesmcarthur_ has quit IRC17:30
*** pcaruana has quit IRC17:31
*** jamesmcarthur has joined #zuul17:31
mordredcorvus: I'm curious - the openshift role is doing docker things - any reason to not just use install-docker role?17:31
mordredcorvus: I'm guessing because that's what it was already doing and that refactor is out of scope?17:32
tobiashpabelanger: there is the possibility that some of our jobs might still be on 2.5 ( not sure about that right now). May I get some additional time to check that before removing 2.5?17:32
corvusmordred: yeah that's the gist.17:34
pabelangertobiash: what are your thoughts on removing a version of ansible from zuul n+6 months after it is EOL'd upstream?17:34
mordredcorvus: cool. patch looks awesome17:34
pabelanger(in 2.5's case we need to figure out some timeline, as the n+6 months has already passed)17:34
tobiashpabelanger: that sounds reasonable17:34
corvuspabelanger, tobiash: how about we invert the stack then, that way pabelanger can keep moving on 2.9, and tobiash can have a bit more time to check on 2.5?17:35
tobiashSounds good, I don't see a reason why adding 2.9 should depend on removing 2.517:35
corvusclarkb: ^17:36
*** jamesmcarthur has quit IRC17:36
tobiashActually even making 2.8 the default is independent17:36
*** hashar has joined #zuul17:36
tobiashWhat do you think about defaulting straight away to 2.9?17:36
tobiashIs that too risky?17:36
pabelangertoo risky, IMO17:36
tobiashk17:37
pabelangerI think 2.8 is right17:37
pabelangerthat is pretty stable17:37
corvusso new plan would be: release current as 3.12, merge 2.9 plus default 2.8, release that as 3.13, merge 2.5 removal, release that as 3.14?17:37
clarkbcorvus: plan seems reasonable to me17:38
tobiash++17:39
corvuscool, after we release 3.12, i'll send an email to zuul-announce with the plan to give folks a heads up17:40
*** avass has quit IRC17:40
pabelangerokay, I'll need some time to rebase a bit, but the plan is fine17:40
corvusi'm assuming we'll try to do all of that this week, unless we find we need to keep 2.5 longer17:41
tobiashI guess I can tell tomorrow if we need it longer17:41
*** igordc has joined #zuul17:49
*** sgw has joined #zuul17:53
tristanCcorvus: that plan works for me too, though it doesn't seem like a burden to keep old ansible versions, and perhaps we should consider doing that in the future to avoid forcing users to change their jobs.17:53
fungithere will come a time when an older version of ansible doesn't work with whatever newer version of python you've got on the executors17:55
fungiso folks will be forced to drop old versions from their environments at some point17:55
fungibut also, testing that zuul can manage every version of ansible dating back to when it gained multi-ansible support will eventually become untenable (if it hasn't already)17:56
corvustristanC: i'm pretty uncomfortable with zuul supporting eol versions of ansible.  i can see the point that it's not strictly necessary (at least as long as we can physically continue to install them).  but i also think it's a confusing message to send to users.  what do we do if there is some security vulnerability in an old ansible?17:56
tristanCfungi: corvus: i didn't mean to keep them forever, but then it seems like we should be dropping 2.6 too.17:58
fungiwe probably should17:58
fungiwe've been talking about dropping 2.5 for quite a while17:58
fungiand it still hasn't happened for various reasons17:59
mordredI think largely because we hadn't gotten all the way to writing up an actual strategy for dropping yet17:59
pabelangeryup, we just talked about that doc now17:59
mordredyup18:00
fungithe previous idea was that we were marking 2.5 as deprecated when we introduced 2.7, and then removing it when we added 2.8 and marking 2.6 as deprecated, to be removed when we added 2.918:00
pabelangeras example, 6 months after ansible EOLs a version, we also remove it from zuul18:00
pabelangerif not 6 months, what are people okay with?18:00
fungii think the old plan is fine... basically leaves any zuul release with three versions of ansible: a deprecated version, an existing stable version, and a new/experimental version18:01
pabelangertristanC: ansible 2.5 has been deprecated in zuul for a while now, maybe 9-12 months. If users haven't migrated by now, how do we encourage zuul operators to do that?18:01
corvusthat helps keep the testing matrix reasonable18:02
fungiwhen you upgrade to a version of zuul which has deprecated the version of ansible you're using, that should be the signal to spend time updating your default version and/or jobs to use one of the two newer versions it supports18:02
*** hashar has quit IRC18:02
clarkbpabelanger: I think that points to my concern with this. As a CI tool do we want to force users to go through frequent churn of updating playbooks in addition to zuul itself?18:02
clarkbtypically what users want is jobs that continue to work once they are stable18:03
corvusclarkb: well, we've chosen ansible.  this comes with the territory i think.18:03
fungiwe've seen ansible stabilize over time, hopefully that trend continues18:04
clarkbcorvus: yes, except that there is little harm on the zuul side to continue supporting 2.5?18:04
tristanCi think the plan is fine, but this might be inconvenient for users if a new supported version doesn't work with their jobs18:04
clarkbfungi: pabelanger just pointed out 2.10 is likely to undo that trend18:04
corvusclarkb: see the harms fungi and i laid out above18:04
fungiclarkb: yeah, at least those major shifts in behavior may be farther apart now than previously18:04
*** jcapitao has quit IRC18:04
clarkbyeah I think the 6 month proposal or the trailing removal based on upstream releases is a reasonable compromise18:05
tobiashok, we at least default to 2.7 in our base job, now I need to check if any job overwrites this with 2.518:05
clarkbcorvus: ^18:05
pabelangeryah, I have some concerns with 2.10 personally. That will need a bit of work on our side to support collections18:05
tristanCit seems like 2.5 -> 2.8 mostly adds warning, but what about the point where roles needs adjustment for collections18:05
tobiashI guess I have to crawl the api18:05
clarkbcorvus: but we should also be aware that some users are likely to be unhappy with needing to update every 6 months especially if ansible updates become more difficult again18:06
tristanCclarkb: that's my concern yes18:07
pabelangertristanC: roles and collections are separate. So, if a job uses a role, it _should_ still be fine in 2.1018:08
corvusclarkb: yes.  i think our options are: 1) continue to support all ansible versions indefinitely.  i do not think we have the resources for this, so we have to take it off the table.  2) amplify our users' concerns so that ansible is aware of this and they minimize disruptive changes.  happily, zuul users and ansible users are in the same boat -- we're not asking for anything weird.  3) decide that ansible is18:08
corvustoo unreliable for zuul and replace it.  i don't think we've experienced remotely enough pain to consider this an option at this point (and i hope we do not).18:08
fungii also think that deciding not to upgrade a component of zuul regularly should be an indication that upgrading zuul regularly isn't right for you either18:10
corvusso if we're all on the same page that #1 and #3 are not things we should consider right now, then yes, i think we have some room to talk about how to minimize disruption for zuul users (in terms of when we add/remove supported ansible versions).  but also i think that just getting that process decided on and communicated will be a big help.  if we say "hey, we're adding a new version of ansible, it may have18:10
corvussome things you need to change, you should work on that over the next year before we drop the oldest" then that's a big help.18:10
fungizuul 3.9.0 will continue to support ansible 2.5.0 forever, after all18:10
corvusfungi: that is a useful perspective18:11
*** hashar has joined #zuul18:11
pabelangerso, if we way 2.5 is supported for ever, I would be okay with that, but in zuul.a.c, i don't want job to use it. So, if we could expose a way for zuul-manage to only install version you want, I think that makes me happy.18:12
fungiif zuul 3.14.0 drops support for ansible 2.5.0 and you need that, don't upgrade to zuul 3.14.018:12
fungi(you've already decided to continue relying on an ansible version which gets no security support, so it's hard to argue you're losing security support by not continuing to upgrade zuul)18:12
pabelangers/we way/we say/18:12
corvusexactly18:13
pabelangerfungi: ++18:13
clarkbya I think the only bit that makes this special to me is that it can force you to update your jobs18:14
corvusi think one thing that we can derive from clarkb's point is that having the widest possible window can help reduce churn.  so, if it comes down to it, not limiting ourselves arbitrarily to 3 versions will help people by potentially allowing larger jumps (or a longer time to make changes).18:15
clarkbupdating jobs is an end user facing thing and not just an operational thing18:15
fungiand i agree that's a downside to choosing what is basically a turing-complete automation language as a job definition language, but it's also a strong feature18:15
corvus(ie, we're about to more or less accidentally facilitate people performing 2.5 -> 2.9 upgrades)18:15
clarkbcorvus: ++ if you can go from say 2.5 to 2.10 (I dunno how feasible that actually is) then you do it once instead of 4 times18:15
tristanCfungi: well if you know you need 2.5.0, then you likely know what to do to fix your job and avoid being broken by a new release of zuul. My concern is for users who are not familiar with Ansible... In any case, I agree with corvus' choice: #1 and #3 are not good options either.18:15
fungiusers can shield themselves from some of the impact of that by avoiding using as many of ansible's more complex features (as SpamapS noted with his avoidance of complex loop constructs)18:16
fungiwith flexibility in the job definition language comes complexity, and with complexity comes churn18:17
clarkbfungi: yup18:17
fungiin opendev we took a similar route with jenkins, eschewing most of the plugin ecosystem and encoding more complex behaviors into jobs as shell blobs which served similar purposes18:20
pabelangercould somebody help with the following traceback we are seeing in zuul-scheduler, we just added a new github.com repo18:22
pabelangerhttp://paste.openstack.org/show/787340/18:22
pabelangersomething odd about the repo is that https://github.com/ansible/workshops/tree/devel has both devel and master branches18:22
corvusthat's a big repo18:23
*** pcaruana has joined #zuul18:23
fungiyeah, that git error usually comes about when there are ambiguous ref names (like a branch and a tag with the same name and the git command wasn't given sufficient context to differentiate them)18:24
corvuspabelanger: look up a few lines -- zuul deleted the devel branch and apparently did not add it back?18:24
pabelangeryes, I see that18:25
corvuspabelanger: can you track that event back to the zuul-merger which processed it originally?18:25
corvuspabelanger: that's responsible for deciding what branches should exist on the executor (that tb is from the executor, not the scheduler -- i assume that was just a typo)18:26
fungii'm not immediately seeing any obvious conflicts between branch and directory/file and tag names18:26
pabelanger(looking)18:27
corvuspabelanger: do you have the exclude-unprotected-branches setting enabled?  and if so, is devel unprotected?18:28
pabelangerah, you know what18:29
pabelangerI think that is it18:29
pabelangerI can see only master is protected, let me add back and recheck18:29
corvuspabelanger: you may need either some kind of event to the devel branch or a full reconfiguration for zuul to notice the change18:30
pabelangerack18:31
pabelangerthanks!18:31
pabelangerlet me test this and confirm18:31
pabelangercorvus: thanks! that was it, only master had branch protection18:34
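A sketch of the tenant option involved, assuming a tenant config along these lines (tenant name made up, project name from the log): with exclude-unprotected-branches set, Zuul only considers protected branches, so the unprotected devel branch was filtered out until branch protection was added.

    - tenant:
        name: example-tenant
        source:
          github:
            untrusted-projects:
              - ansible/workshops:
                  exclude-unprotected-branches: true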
* fungi files that bit of github integration away for future reference18:35
pabelangeryah, for some reason, before I thought zuul wouldn't start the job18:35
pabelangerthis time it did18:35
pabelangerbut yes, better docs on myself helps here18:36
pabelangersorry for the noise18:36
*** igordc has quit IRC18:37
corvuswell, also, if zuul said "I've been asked to check out a branch which was excluded due to exclude_unprotected_branches" that wouldn't be the worst error message in the world18:40
tobiashI just checked, and we don't have ansible 2.5 in use by any job, only 2.7 and 2.818:42
corvustobiash: that took much less time than you estimated :)18:43
tobiashcorvus: I wrote a script that crawled the zuul api :)18:43
tobiashin less than 50 lines of python code18:44
tristanCtobiash: you might be interested in https://review.opendev.org/#/c/633667/4 which can display such crawl results in the web interface18:48
tobiashtristanC: interesting18:49
tristanCthe web interface doesn't group by ansible-version, it currently only does per secret, semaphore, label/nodeset, project, ... but it should be relatively easy to add a view per ansible-version too18:50
openstackgerritTobias Henkel proposed zuul/zuul master: DNM: Add quick and dirty api crawler for ansible versions  https://review.opendev.org/69806218:51
tobiashfor reference, that's the quick and dirty crawler ^18:52
tristanCtobiash: it seems like your crawler can miss the ansible-version job attribute when it is set in project pipeline configs18:53
tobiashtristanC: oops that's correct18:54
tristanCtobiash: here is another crawler we wrote to auto-patch nodepool labels (it goes over every project pipeline config): https://softwarefactory-project.io/cgit/software-factory/sf-ops/tree/scripts/cleanup-nodepool-images.py18:54
tristanCoops, that actually was using the /config endpoint18:55
tristanCanyway, the function is the same, to check every job you have to: for each project, for each project config, for each pipeline, for each job, for each job config18:57
tobiashok, also no project pipeline using ansible 2.519:13
*** igordc has joined #zuul19:14
fungibeing able to filter jobs by ansible version in the dashboard could be neat, if that logic is embeddable into the config api19:14
fungitobiash: you might need to recheck 698062, we restarted opendev's zuul scheduler and that was the only change with jobs queued in the zuul tenant19:16
tobiashfungi: thanks, that wasn't expected to succeed anyway19:17
*** gmann is now known as gmann_afk19:17
corvusShrews: should we add a release note to nodepool and make a release with the sdk version bump?19:22
corvus(right now, there's nothing new since 3.9.0: https://zuul-ci.org/docs/nodepool/releasenotes.html )19:22
Shrewsi guess it would be odd to release with something in release notes. i know tobiash was asking for a new release with the sdk microversion fix during the ptg19:24
Shrewssorry it's taken so long for that, tobiash. fell off the radar19:25
Shrewss/with something/without something19:25
tobiashShrews: no worries, it fell off my radar as well19:25
Shrewsdoh19:25
corvusyeah, if we want to do a release with just an sdk version bump, let's add a note that just says something like  "updated openstack sdk dependency due to ..."19:25
Shrewscorvus: ++19:26
clarkb"updated openstack sdk dependency to pick up OVH profile updates which may be come necessary in the new year as OVH modifies their catalog"19:26
tobiashcorvus, Shrews: shall we also add a release note for the post upload hook?19:27
fungiwell, also it more generally added support for a bunch more ovh regions19:27
tobiashthat's new functionality19:27
Shrewsbut fyi, we are not yet running that version of the launcher in production19:27
clarkbShrews: I'm happy to do that restart once we have the zuul upgrade settled in19:28
clarkbprobably can do that after lunch today?19:28
Shrewssounds good19:28
corvustobiash: ++19:28
corvusthat seems noteworthy19:28
tobiashk, I'll prepare a note19:29
openstackgerritTobias Henkel proposed zuul/nodepool master: Add missing release note about post-upload-hook  https://review.opendev.org/69807219:35
Shrewstobiash: can we add the sdk stuff to that note as well?19:36
tobiashShrews: sure, do you have a text in mind? I'm not sure about the correct abstraction level translation from https://review.opendev.org/69087619:37
corvusmaybe clarkb's wording from 19:2619:38
tobiashcorvus: you mean the ovh profiles?19:39
corvusyes19:40
corvusthough i guess there was also the microversion thing?  i don't know anything about that...19:40
tobiashthe fix was independent of ovh, nodepool was broken with any cloud with newer openstacksdk19:41
Shrewstobiash: maybe "updated openstack sdk dependency to handle newer nova microversions"  ??? i believe you experienced the bug firsthand, so feel free to add more detailed context there if you find it helpful19:43
tobiashmaybe 'Fixed incompatibility issue with openstacksdk <version to be added> and greater.'19:43
*** hashar has joined #zuul19:44
tobiashas I understand it, the nova microversion thing is an implementation detail behind openstacksdk which wasn't expected to have a user-visible impact19:44
tobiashso for a release note it might be enough to state a compatibility issue or regression with newer openstacksdk has been fixed19:45
clarkboh there was another reason to update sdk then19:45
clarkbfwiw ovh users need to update to latest sdk due to cloud side changes19:45
Shrewswait, so, we didn't actually update sdk requirement19:46
Shrewswe just added a fix for a bug around referencing flavors when using a newer sdk19:47
tobiashclarkb: nodepool didn't pull in a new sdk version, it just got incompatible with the recent one which has been fixed by 69087619:47
Shrewsi'm not sure that needs a release note19:47
Shrews(sorry, been bouncing between things)19:47
clarkbok. I think that we should consider forcing the newer version given OVH's planned changes19:47
clarkbbut I don't know if anyone else uses ovh with nodepool19:47
Shrewswell that would be a NEW change, so that can include a release note with the change19:47
clarkb++19:48
Shrewswith that in mind, i'm going to +2 the change tobiash put up19:48
corvusif we're wanting to release nodepool with 690876 merged because nodepool 3.9.0 is uninstallable, that's probably relnote worthy19:48
corvuswell, not uninstallable19:48
corvusbut does-not-work-with-openstack-if-freshly-installed19:49
tobiashyepp, that was exactly the reason why I asked for a release back then (and later forgot about that)19:50
clarkbis I1f7b592265ac612ea6ca1b2f977e1507c6251da3 the fix we are talking about?19:50
corvusyes that is "the microversion thing"19:50
tobiashclarkb: yes19:50
Shrewscorvus: that's a good point on a different perspective19:50
clarkbin that case I think we should restart opendev's launchers on a commit that includes that (current HEAD). Concurrently merge the relnote update. Then if opendev is happy make a release against the relnote update19:51
corvusclarkb: ++19:51
corvusShrews: thanks, the approach i'm taking to see if there are relnote is: "why does {clarkb, tobiash} want a release? and might anyone else benefit from that knowledge?"19:52
clarkband we should make sure we have sdk 0.39.0 installed when we do that restart19:52
clarkbto ensure that latest works19:52
corvuser, "if there are relnote gaps"19:52
corvusthat wouldn't have been sufficiently ironic without omitting the word 'gap'19:53
mordredha19:53
tobiashmordred: do you know what version of openstacksdk required that fix?19:57
tobiashmordred: I guess 0.37.0? https://docs.openstack.org/releasenotes/openstacksdk/unreleased.html#upgrade-notes19:59
mordredtobiash: yes, that's right20:00
tobiashso a release note might be: "Fixed compatibility issue with openstacksdk 0.37.0 and above."?20:01
tobiashfollowing corvus' approach I'd consider this important info, as 3.9.0 does not work when freshly installed20:02
Shrewsi think that's pretty clear for the current state20:03
ianwcorvus / mordred: could i ask for your eyes on https://review.opendev.org/697936 & https://review.opendev.org/697614 ; it adds sibling copy to podman/generic builds (936) but of immediate interest it fixes the docker sibling copy (614)20:03
ianwi will need this to get the nodepool+containers functional test working20:04
openstackgerritTobias Henkel proposed zuul/nodepool master: Add missing release notes  https://review.opendev.org/69807220:06
Shrewstobiash: lgtm. corvus?20:08
clarkbI'm restarting our launchers at nodepool==3.9.1.dev11  # git sha e391572 and they seem happy (and happy against vexxhost and ovh which should confirm that latest sdk is working)20:12
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Sort autoholds by request ID  https://review.opendev.org/69807720:16
Shrewsthe unsorted chaos was starting to annoy me20:17
mordredcorvus: https://review.opendev.org/#/c/611191/ - I grok what is wrong here, but this might be an opportunity to report a different error20:58
corvusmordred: what is wrong?  (that is an unhandled error, so likely unanticipated)21:00
mordredcorvus: job definition referenced a secret name that didn't exist21:01
mordredcorvus: (because of pebkac - the change also adds a secret that didn't get used)21:01
corvusyeah, seems like that should be handled before it gets that far21:02
*** sgw has quit IRC21:02
mordredyah.21:02
*** hashar has quit IRC21:02
ianwhttps://github.com/bgeesaman/registry-search-order <- interesting lesson on container registry search order and squatters21:09
mordredianw: that's yet another reason why I think we should use nothing but fully qualified container names in our stuff :)21:11
ianwmordred: yeah, would that mean adding the hostname added by the registry roles when pulling speculative images?21:13
mordredianw: nope! the speculative image system supports multi-registry21:13
mordredianw: this is one of the reasons corvus wrote zuul-registry21:13
corvusianw: wow, thanks.  i had been ambivalent about the search order thing (but out of a habit of conservatism, i remove it when i see it).  i am no longer21:16
*** jamesmcarthur has joined #zuul21:22
mordredcorvus: ++21:26
openstackgerritMerged zuul/zuul master: Sort autoholds by request ID  https://review.opendev.org/69807721:30
*** pcaruana has quit IRC21:31
*** jamesmcarthur has quit IRC21:48
*** gmann_afk is now known as gmann21:48
clarkbI've rechecked the nodepool release notes change. Appears to have failed installing gcc?21:51
clarkbboth zuul and nodepool have been stable for opendev since I restarted them. Probably safe to make those releases once the release notes change merges21:52
*** Goneri has quit IRC21:56
Shrewsoh fudge. scheduler needs to be restarted to get the autohold sort thing. oh well, it can wait22:30
openstackgerritMerged zuul/nodepool master: Add missing release notes  https://review.opendev.org/69807222:34
corvuszuul 3.12.0 57aa3a06af28c99e310c53cc6b88d64012029d9823:22
clarkbneat23:23
corvusnodepool 3.10.0 805d2b4d180e2a3382e718e036ad4c8c521ecfc823:23
corvuswell, it was more of a question23:23
corvusclarkb, Shrews: do those look right?23:24
corvusfungi: ^23:24
clarkboh /me double checks23:24
clarkbcorvus: zuul lgtm23:25
clarkbnodepool lgtm too23:26
corvusi'll push those up as soon as i figure out how to run gpg23:26
corvussince that is now hard23:26
clarkbcorvus: I do GNUPGHOME=/path/to/mounted/normally/offline/device git tag -s $tag then a ps -elf | grep gpg-agent and I kill the most recent agent23:27
clarkbfor some reason my desktop starts one at boot time but that isn't the one that gets used at git tag time so I can't just pkill23:27
corvusthanks... i had to kill the agent since it was attached to emacs for pinentry... ugh.  i hate everything about this now.23:29
corvusthis has to be some kind of insider plot to sabotage pki23:29
corvuscommit 57aa3a06af28c99e310c53cc6b88d64012029d98 (tag: 3.12.0)23:30
corvusokay, that's what git show 3.12.0 shows me,  so i think that's right for zuul23:30
corvuscommit 805d2b4d180e2a3382e718e036ad4c8c521ecfc8 (HEAD -> master, tag: 3.10.0, origin/master, origin/HEAD, refs/changes/72/698072/2)23:31
corvusand that's for nodepool23:31
corvusboth pushed23:32
openstackgerritMerged zuul/zuul-jobs master: build-container-image: support sibling copy  https://review.opendev.org/69793623:42
*** rlandy has quit IRC23:43
openstackgerritMerged zuul/zuul-jobs master: build-docker-image: fix up siblings copy  https://review.opendev.org/69761423:49
fungicorvus: i do this to get gpg to stop trying to spawn graphical pinentry crap, if that's what you need... env DISPLAY="" GPG_TTY=$(tty) git tag -s 3.12.023:51
corvusfungi: ah thanks, that's the other half of the equation :)23:52
ianwcorvus: hrm, i noticed we just restarted zuul and now my change has a failing buildset registry job23:53
fungiyep, bits 'n' pieces, bits 'n' pieces23:53
ianwsee gate currently @ https://zuul.opendev.org/t/zuul/status23:53
ianwno link to logs, although the warnings might be printed now?23:53
corvusianw: it looks like the parent change still doesn't have a successful image build23:54
corvus697393 is parent?23:54
corvusits parent is building now, so a recheck of 697393 and 693464 should do it23:54
corvusor rather, its parent has successfully built23:54
clarkbunrelated to ianw's thing. I have confirmed that the zuul.attempts data in the job inventory seems to be working as expected in production23:55
corvusin other news, zuul and nodepool are both on pypi now; i'll send out announcements23:55
fungioh!!!23:55
fungiyay releases!23:56
*** tosky has quit IRC23:56
ianwcorvus: but the result on 697393 was from several days ago; it checks the last run? (sorry, clearly still not quite on top of this depends behaviour)23:57
corvusianw: yep, last or current23:57
ianwok, thanks23:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!