Wednesday, 2019-09-11

*** jamesmcarthur has joined #zuul00:02
*** jamesmcarthur has quit IRC00:06
*** jamesmcarthur has joined #zuul00:10
*** jamesmcarthur has quit IRC00:27
*** threestrands has joined #zuul00:27
*** jamesmcarthur has joined #zuul00:39
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: Add remove-zuul-sshkey  https://review.opendev.org/68071200:41
*** jamesmcarthur has quit IRC00:44
*** jamesmcarthur has joined #zuul01:02
*** rlandy has quit IRC01:50
*** jamesmcarthur has quit IRC01:59
*** jamesmcarthur has joined #zuul02:04
*** roman_g has quit IRC02:34
*** jamesmcarthur has quit IRC03:00
*** jamesmcarthur has joined #zuul03:06
*** jamesmcarthur has quit IRC03:13
*** jamesmcarthur has joined #zuul03:25
*** rfolco has quit IRC03:32
*** PrinzElvis has quit IRC03:39
*** webknjaz has quit IRC03:43
*** dcastellani has quit IRC03:45
*** PrinzElvis has joined #zuul03:45
*** webknjaz has joined #zuul03:46
*** dcastellani has joined #zuul03:46
*** jamesmcarthur has quit IRC03:57
*** ianychoi_ has joined #zuul04:24
*** ianychoi has quit IRC04:27
*** pcaruana has joined #zuul04:42
*** saneax has joined #zuul05:03
*** pcaruana has quit IRC05:12
*** bolg has joined #zuul05:45
*** bolg has quit IRC06:09
*** pcaruana has joined #zuul06:21
*** bolg has joined #zuul06:26
*** AJaeger has quit IRC06:28
*** AJaeger has joined #zuul06:35
*** hashar has joined #zuul06:42
*** roman_g has joined #zuul07:16
*** themroc has joined #zuul07:16
*** jpena|off is now known as jpena07:37
bogdandoo/07:38
bogdandoplease merge https://review.opendev.org/#/c/681182/07:38
*** avass has joined #zuul07:40
*** threestrands has quit IRC07:59
*** sshnaidm|afk is now known as sshnaidm|ruck08:09
*** ianychoi_ has quit IRC09:09
*** hashar has quit IRC09:38
*** saneax has quit IRC10:11
*** saneax has joined #zuul10:12
*** ianychoi has joined #zuul10:30
*** avass has quit IRC10:37
*** ianychoi has quit IRC10:45
*** ianychoi has joined #zuul10:45
*** shachar has quit IRC11:08
*** snapiri has joined #zuul11:08
*** sshnaidm|ruck is now known as sshnaidm|bbl11:20
*** hashar has joined #zuul11:20
*** avass has joined #zuul11:30
*** jpena is now known as jpena|lunch11:40
*** spsurya has joined #zuul11:58
*** rfolco has joined #zuul12:06
*** rlandy has joined #zuul12:23
*** jamesmcarthur has joined #zuul12:25
pabelangermorning! I wanted to see how we could make a change to nodepool, to not fail if an openstack provider has been configured to not have a public IP: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/openstack/handler.py#L19012:30
*** jamesmcarthur has quit IRC12:30
pabelangerwe have this use case in one of our network appliances, cisco iosxr, where the management interface can come online (via dhcp) but not have a default route12:31
pabelangerso we need to do some very weird things, to make multinode jobs happen12:31
pabelangerbut this requires both nodes to be on the same subnet, for public internet (which is really hard these days).12:31
pabelangerSo, by removing the check above or toggling it, we want to launch a node that only has private IPs, which the zuul executor possibly won't have direct access to12:32
*** jpena|lunch is now known as jpena12:32
*** bogdando has left #zuul12:43
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: WIP: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646413:11
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646413:13
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support  https://review.opendev.org/67409213:13
sean-k-mooneyi probably should have checked this before pushing: the github driver supports depends-on with the pull request URL, right?13:35
pabelangeryes13:35
sean-k-mooneyso this would work https://review.opendev.org/#/c/681474/13:35
pabelangeryup13:35
sean-k-mooneyif the intel zuul has that project in its zuul config13:35
sean-k-mooneycool i was expecting upstream zuul to explode because it does not13:36
sean-k-mooneyoh i forgot to add back in noop to upstream13:36
*** panda is now known as panda|ruck13:40
*** sshnaidm|bbl is now known as sshnaidm|ruck13:44
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646413:44
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support  https://review.opendev.org/67409213:44
*** swest has quit IRC13:45
*** panda|ruck is now known as panda|rover13:45
*** swest has joined #zuul13:45
*** bolg has quit IRC13:56
sshnaidm|ruckhi, how can I build new containers with zuul? The current containers on docker.io/zuul has zuul version 3.5.0 which is quite old13:59
Shrewspabelanger: interface_ip is not necessarily the public IP. it's whatever IP sdk determines is used for communicating with that server, which might be private: https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/cloud/meta.py#L318-L33713:59
*** swest has quit IRC13:59
sshnaidm|ruckdocker pull zuul/zuul; docker run -it zuul/zuul zuul --version13:59
sshnaidm|ruckZuul version: 3.5.013:59
pabelangerShrews: yah, agree. In this case, it is a hard check where nodepool needs to find the interface_ip. In this case, it will always be empty. I'd like to skip that check14:02
AJaegersshnaidm|ruck: did you confirm that the content is old - or is it just the version?14:02
AJaegersshnaidm|ruck: the image was updated yesterday, wasn't it?14:02
sshnaidm|ruckAJaeger, what do you mean by content?14:02
sshnaidm|ruckAJaeger, I need a newer zuul version, at least as in CI14:03
AJaegerI mean: Does it include current git master but uses wrong version number?14:03
pabelangerwe can look at the publish job to see14:03
Shrewspabelanger: i'm confused... how do you expect the zuul executor to communicate with the node then?14:04
AJaegerhttps://hub.docker.com/r/zuul/zuul/tags says "updated 17hours ago"14:04
pabelangerShrews: it doesn't, until the primary node is able to SSH into secondary and setup route14:04
pabelangerIt needs to be done this way, because route cannot be obtained via dhcp14:05
AJaegersshnaidm|ruck, pabelanger, this seems to be the job pushing the image, isn't it? http://zuul.opendev.org/t/zuul/build/e174c37169da417da31a85a644ca797614:05
pabelangeronly static14:05
sshnaidm|ruckAJaeger, it doesn't include git, it has /usr/local/lib/python3.7/site-packages/zuul-3.5.0.dist-info14:05
Shrewspabelanger: so these are nodes allocated via nodepool, but not expected to be used directly by the executor?14:05
pabelangerShrews: they are allocated in nodepool, but zuul / nodepool cannot route to them, until a pre-run task is setup in base job14:06
pabelangeronce that is done, then zuul executor can access it14:06
zbrnot sure who is generating https://zuul.opendev.org/manifest.json but it would be useful to include zuul version there.14:06
pabelangerthis is because, nodepool isn't able to properly setup network directly, it needs manual commands14:07
Shrewspabelanger: that goes directly against our current design. i don't have any ideas on that one, atm14:08
pabelangerwe have this working today14:08
pabelangerin zuul.a.c14:08
pabelangerbut, it works because we are using public IPs14:08
pabelangerI'd just like to have nodepool not enforce an interface ip14:09
pabelangerso I can flip the VM to private14:09
corvussshnaidm|ruck, AJaeger: "docker run -it zuul/zuul zuul --version"  ->  "Zuul version: 3.10.2.dev66"  for me14:09
fungipabelanger: how about a reverse nat, where the builder/executor masquerade as a local address on that network when connecting to those addresses? that way the device never needs a default route and just responds via layer 2 (arp or v6nd) resolution14:10
sshnaidm|ruckcorvus, did you pull from docker.io?14:11
corvussshnaidm|ruck: yes14:11
sshnaidm|ruckcorvus, me too..14:11
pabelangerfungi: trying to understand reverse nat comment14:12
sshnaidm|rucklemme check on vm, maybe cache..14:12
Shrewspabelanger: And the executor currently runs this pre-run task on the node, right?14:12
pabelangerShrews: on the primary node14:12
Shrewsoh, i think i see now14:13
pabelangerso here is an example14:13
pabelangerhttps://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_32/62132/b93674666faa9a116d2a6d160f48557e6c7a454e/third-party-check/ansible-test-network-integration-iosxr-python36/dd0c215/job-output.html#l59814:13
pabelangerwe have a pre-run job, that runs on primary node (2 nodeset)14:13
pabelangerto ensure it can route to appliance node (so they are on same subnet)14:14
AJaegersshnaidm|ruck, corvus, I get the same result as corvus. So, all looks fine here as well - and I pulled from dockerhub.14:14
pabelangerthen we know we can access it and do things to it14:14
*** bolg has joined #zuul14:14
pabelangerif we cannot, zuul aborts, and we retry14:14
pabelangerhowever, today we are using public IPs, in vexxhost-sjc1, since they have 1 single subnet of public IPs14:14
fungipabelanger: managed device lives in the "private" network and lacks routes outside that network. executor and launcher live outside the network and have static routes to a router which knows how to forward traffic into that network. last hop also performs layer 3 address translation to map the executor and launcher addresses to local addresses on that network so the device sees connections coming from a local14:14
fungi(to it) address14:14
pabelangerhowever, it is a single region we test against. In limestone, we can create provider network, that is private single subnet, and route between 2 nodes, without default routes14:15
Shrewspabelanger: how does the primary node determine the address of the appliance nodes?14:15
sshnaidm|ruckcorvus, AJaeger thanks, checked on different vm, it's also 3.10, seems like docker cached something14:15
pabelangerShrews: public IP, because nodepool knows it14:16
pabelangerwhen we flip to private, we'll need to manage that via nodepool again14:16
pabelangeras it will have the info14:16
pabelangerhttps://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_32/62132/b93674666faa9a116d2a6d160f48557e6c7a454e/third-party-check/ansible-test-network-integration-iosxr-python36/dd0c215/zuul-info/inventory.yaml14:17
pabelangeris example inventory file14:17
pabelangerfungi: so, if I understand, for that to work with multiple clouds, I'd have per region subnets, that don't conflict14:18
pabelangerso I know which cloud to route to14:18
corvuspabelanger: have you tried https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openstack].pools.host-key-checking ?14:19
fungipabelanger: a twist on that, if you do need conflicting/overlapping private networks, is to also use 1:1 nat in the other direction for those devices14:19
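The 1:1 NAT idea mentioned here can be illustrated with a small address-mapping sketch. It is only an illustration of the arithmetic, not router configuration: the function name `one_to_one_nat` and the example subnets are assumptions, chosen to show how two regions with the same overlapping private subnet could each be presented under a distinct, non-conflicting range.

```python
import ipaddress


def one_to_one_nat(address: str, inside: str, outside: str) -> str:
    """Map an address in the `inside` subnet to the same host offset in `outside`.

    Hypothetical helper: a real deployment would do this on the router,
    this only shows which address a device would appear as.
    """
    inside_net = ipaddress.ip_network(inside)
    outside_net = ipaddress.ip_network(outside)
    if (inside_net.version != outside_net.version
            or inside_net.prefixlen != outside_net.prefixlen):
        raise ValueError("1:1 NAT needs two subnets of the same size")
    addr = ipaddress.ip_address(address)
    if addr not in inside_net:
        raise ValueError(f"{addr} is not in {inside_net}")
    # Keep the host part, swap the network part.
    offset = int(addr) - int(inside_net.network_address)
    return str(outside_net.network_address + offset)


# Two regions reuse 192.168.0.0/24; each is exposed under its own range.
print(one_to_one_nat("192.168.0.10", "192.168.0.0/24", "10.1.0.0/24"))  # 10.1.0.10
print(one_to_one_nat("192.168.0.10", "192.168.0.0/24", "10.2.0.0/24"))  # 10.2.0.10
```

With a mapping like this, the executor can tell which cloud to route to from the translated address alone, even though the devices share the same private addresses.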
pabelangercorvus: yes, we disable that by default. However we hiccup on interface_ip missing14:19
corvuspabelanger: why are the nodes not always on the same subnet?  can't you make a neutron network and put them both on it?  the whole "abort and retry if they aren't on the same subnet" thing seems like it could be very problematic.14:24
pabelangercorvus: I don't know to be honest. I'd have to confirm with cloud providers how to do that or if openstack supports it. That in fact would be the easiest solution here.14:25
*** electrofelix has joined #zuul14:26
corvuspabelanger: i think that's worth looking into14:28
corvuspabelanger, Shrews: but back to the interface_ip thing --14:28
pabelangerk, I'll work up email to openstack ML14:28
corvuspabelanger, Shrews: it seems like we expect entire clouds to be either public or private, but iiuc, pabelanger has a cloud where he wants to get both public and private vms.  so yeah, that's not accommodated by the logic in sdk.  even if we set up an override option in nodepool to say "force private", that would still apply at the pool level, not the server level, so there's no way to say "use interface_ip14:30
corvusfor this server, and private_ip for this other one"14:30
corvuspabelanger, Shrews: the only workable option i see is strictly what pabelanger suggested: disable the interface ip check and just return no data.  i guess we could do that, but if we add that, we should add a warning saying people almost certainly don't want to enable this flag because it will mask all kinds of very frequent problems.14:32
Shrewspabelanger: corvus: i wonder if we can just skip the interface_ip check if host-key-checking is disabled. presumably you don't care about that ip if you're skipping ssh keyscan???14:35
corvusShrews: hrm, yeah, that may be reasonable14:36
*** nhicher has quit IRC14:39
pabelangeryah, looking at code more, if we skip interface_ip on host-key-checking false I think that makes sense.14:39
pabelangerso, +1 from me :)14:39
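The behaviour agreed on here can be sketched as follows. This is a minimal illustration of the decision only, not nodepool's actual handler code: `check_interface_ip` and `LaunchError` are hypothetical names, and the assumption is that with host-key-checking disabled no ssh keyscan is attempted at launch time, so a missing address can be tolerated.

```python
from typing import Optional


class LaunchError(Exception):
    """Raised when a launched server cannot be used (hypothetical name)."""


def check_interface_ip(interface_ip: Optional[str],
                       host_key_checking: bool) -> Optional[str]:
    """Return the address to record for the node.

    A missing interface_ip is fatal only when host-key-checking is
    enabled, because that is the case where a keyscan needs an address.
    """
    if interface_ip:
        return interface_ip
    if host_key_checking:
        raise LaunchError("server has no interface_ip to keyscan")
    # Assumption: the node is recorded without an address and something
    # else (for example a pre-run playbook) makes it reachable later.
    return None
```

As noted in the discussion, a flag like this can hide genuine launch failures, so a node recorded with no address needs some other path to become reachable.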
pabelangerbut also asking in openstack too14:39
*** sshnaidm|ruck is now known as sshnaidm|rover14:40
clarkbpabelanger: so your appliance refuses to set a default route from dhcp?14:46
pabelangerclarkb: yup!14:47
pabelangerit is terrible14:47
*** panda|rover is now known as panda|ruck14:52
fungiif the device is a piece of networking gear, that's not entirely uncommon behavior14:56
fungiyou end up with a lot of flat management networks and/or 1:1 nat when dealing with devices like that14:57
clarkbime the expectation is you manually configure the device and not use dhcp (but that still allows for a default route)14:57
*** bogdando has joined #zuul14:58
bogdandohi, please merge https://review.opendev.org/#/c/681182/14:58
bogdandoShrews, clarkb: ^^14:58
*** bogdando has left #zuul14:58
fungiyeah, or if the device is capable of placing its management address on any network where it has a serial/loopback then it may be able to make that routable via traditional routing protocols14:58
fungithat's more common for actual ip routers though, less so for pure ethernet switches15:00
pabelangeryah, usually this device is the one providing DHCP to the network, so that is why it is missing (or so I am told)15:01
pabelangerthis is basically, a large hack around the idea of not adding console support into nodepool :)15:02
clarkbin this particular case it does seem like you want to create your own network and subnet in neutron and boot all of the instances on that network15:04
clarkbthen your executor can have an interface on the network too15:04
clarkbI believe mordred has said that while vexxhost gives you public networking by default you can still create a network and subnet and router there15:04
pabelangerYup, that is right, when I last tested this we couldn't get nodepool to bring online node due to interface_ip missing15:06
pabelangerI did get it working with FIPs15:07
pabelangerbut some clouds don't support that15:07
pabelanger(and FIPs have an extra cost)15:07
*** jamesmcarthur has joined #zuul15:08
clarkbthere is a clouds.yaml setting to say the private ip is the ip to use15:08
pabelangermy last idea, is going to be using nested virt for the appliance. But really trying to avoid doing that15:08
clarkbif nodepool is checking that that is reachable it would still fail though15:08
pabelangerclarkb: would I need to create a new pool for that? I mostly only want this 1 VM to be setup with private15:09
clarkbyou might have to set up a new provider for that since it is a clouds.yaml setting15:10
pabelangerk, let me look into that. that might complicate things on nodepool config side, but also an option15:10
clarkbpabelanger: https://docs.openstack.org/os-client-config/latest/user/network-config.html15:11
clarkbI think you set routes_externally on the network to true15:11
*** chandankumar has quit IRC15:11
clarkbthen you'll get that IP back as "the ip" from sdk in nodepool15:11
pabelangerthat might actually not be bad, if that is the case15:12
*** chandankumar has joined #zuul15:12
pabelangerokay, let me test that15:12
*** Goneri has joined #zuul15:15
fungii strongly suspect openstack doesn't want to provide a means to request/guarantee provider network affinity, and expects you to create a network instead if you need that15:16
clarkbya thats the whole point of being able to configure networks and subnets yourself15:17
Shrewsclarkb: we good to merge https://review.opendev.org/681182 for bogdando? Not sure why it wasn't approved before...15:20
clarkbShrews: I don't know why tristanC didn't approve, but ya aiui we test the multinode roles fairly well so if tests pass I would expect that to be working and can be approved15:22
clarkbShrews: do you want to +A or should I?15:22
Shrewsi'll go ahead15:22
Shrewsjust wanted to make sure we weren't waiting for something15:23
*** bolg has quit IRC15:23
*** themroc has quit IRC15:26
*** igordc has joined #zuul15:31
*** mattw4 has joined #zuul15:34
*** mattw4 has quit IRC16:04
*** mattw4 has joined #zuul16:04
openstackgerritMerged zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts  https://review.opendev.org/68118216:05
*** mattw4 has quit IRC16:10
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: WIP: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646416:16
*** hashar has quit IRC16:18
*** chandankumar is now known as raukadah16:31
openstackgerritSorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text  https://review.opendev.org/68153216:32
pabelangerclarkb: that seems to work, but like I guessed, disrupts all other VMs that are also attached to that network. For some reason, openstacksdk is returning that as interface IP, even while public exists and routes external. I'm guessing there is no order involved when 2 networks have that setting enabled16:38
pabelangerbut, will look more into openstacksdk16:38
clarkbpabelanger: ya you need to use a separate provider with different clouds.yaml profile16:40
clarkbfor the ordering thing you can specify the public network as routes_externally false probably16:41
clarkbor only boot the instance with a single network16:41
pabelangeryah, I have complex network requirements and limited provider networks16:42
pabelangerif routes_externally could be passed via nodepool.yaml file, that would work16:43
clarkbwell in this case you are wanting to not use provider networks and instead use user configured networks16:43
pabelangerbut we'd need to grow support I think16:43
clarkbhrm?16:43
clarkbnodepool allows you to specify which networks to attach16:43
pabelangeryup, but I don't think we expose the setting of the network16:43
clarkbnodepool does16:43
pabelangeroh16:43
clarkbhttps://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openstack].pools.networks16:44
*** jpena is now known as jpena|off16:44
SpamapScorvus:I grabbed the nodepool task on https://storyboard.openstack.org/#!/story/2006516 .. but .. feels like the task list isn't quite in-line with the story. If I understand the story right, this is mostly about making the DB optional and allowing external settings. Yes?16:46
pabelangerclarkb: isn't that just the name of the network?16:47
pabelangernot the settings for it16:47
clarkbpabelanger: correct the settings go in clouds.yaml16:49
corvusSpamapS: i think the first 2 tasks are pre-reqs for the rest (or, well, at the very least, the zk task is a pre-req for nodepool)16:49
clarkbpabelanger: what you need to do is have two profiles in clouds.yaml with different settings for the same network. Then have two providers in nodepool using different clouds.yaml profiles16:50
pabelangerclarkb: okay, yah, that is still the 2 provider rule16:50
clarkbyes16:50
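The two-profile arrangement described here can be modelled in a few lines. The profiles below mirror the `networks` / `routes_externally` settings of clouds.yaml but are written as Python dicts; the profile names, network names, and addresses are assumptions, and `pick_interface_ip` is a toy model of "use the address on the network marked as routing externally", not openstacksdk's actual selection logic.

```python
# Two hypothetical clouds.yaml profiles for the same cloud. Only the
# routes_externally flags differ between them.
PROFILES = {
    "cloud-default": {
        "networks": [
            {"name": "public", "routes_externally": True},
            {"name": "private", "routes_externally": False},
        ]
    },
    "cloud-private": {
        "networks": [
            {"name": "public", "routes_externally": False},
            {"name": "private", "routes_externally": True},
        ]
    },
}


def pick_interface_ip(profile_name, addresses):
    """Toy model: return the server's address on the first network that
    the given profile marks as routing externally, or None."""
    for network in PROFILES[profile_name]["networks"]:
        if network["routes_externally"] and network["name"] in addresses:
            return addresses[network["name"]]
    return None


server = {"public": "203.0.113.10", "private": "192.168.0.10"}
print(pick_interface_ip("cloud-default", server))  # 203.0.113.10
print(pick_interface_ip("cloud-private", server))  # 192.168.0.10
```

Each nodepool provider would then point at one of the two profiles, which is why the approach doubles the number of providers.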
pabelangerwhich does work, I just manually forced it here: https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_07/207/19fd14e960c2dbe048cc429c581f594d067252fe/check/ansible-network-iosxr-appliance/b78c8a9/job-output.html#l4016:50
clarkbfwiw we intentionally stopped managing network settings in nodepool and rely on clouds.yaml16:51
clarkbso I think that is the correct way to do this16:51
pabelangeryah, I'm just not looking forward to doubling my nodepool config from 4 providers, to 8. for a single node :(16:52
corvusSpamapS: nodepool needs a zk.  the story says we should be able to provide zk connection information outside of any operators (like, imagine the IT department already runs a ZK).  so the easiest way to get something working in the test job is to set up a zk (which just happens to be run by an operator in the same k8s, but zuul-operator doesn't have to know about that).  then tell zuul-operator the zk16:52
corvusconnection info.  later on, we can do more fancy things with the zuul-operator interacting with the zk operator.16:52
corvus(and same thing applies to pxc and zuul)16:52
SpamapScorvus: got it. I'll put my extra cycles into those first few then.16:53
corvuskk16:53
SpamapSInteresting experience. I recently marked our `gate` pipeline as `supercedes: check`. As a result, PRs are merging with a "pending" status on check. I wonder if we can delete a status.17:03
*** hashar has joined #zuul17:04
openstackgerritJames E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin  https://review.opendev.org/68077817:06
openstackgerritPaul Belanger proposed zuul/nodepool master: Disable interface_ip check, when host-key-checking is disable  https://review.opendev.org/68154417:06
pabelangercorvus: clarkb: Shrews: ^that should be host-key-checking patch, we discussed this morning17:07
pabelangerif still okay, I'll add some testing around it after I grab some lunch17:07
corvusSpamapS: yeah, we might need a new reporter action in zuul to handle that; if you find that the github api supports it, we can do that.17:07
pabelangerthat would be the less work approach to make this work for us17:07
pabelangerbut understand if we don't want to do it17:08
corvusSpamapS: (reporter action meaning like "start, failure, success.... superceded")17:09
pabelangerclarkb: just thinking about it before getting food, is there no way in clouds.yaml to define my own network name, but map to existing provider network? I'm guessing not17:12
clarkbpabelanger: you create a network in the cloud and use that instead of the provider network17:14
pabelangeryah, I don't think I can create a network17:15
Shrewspabelanger: i think we need to document the behavior in the host-key-checking doc portion too17:15
clarkbmordred: has said you can in vexxhost17:15
pabelangerthis is limestone where I am testing17:15
clarkbI've never done it myself though17:15
clarkbyou may be able to there too17:16
pabelangertrying to figure it out with current resource17:16
pabelangerthe new provider works17:16
pabelangerbut that is a lot of overhead, like mirrors, quotas, etc.17:16
pabelangerI could get behind a pool, but that is dedicated quotas there too17:16
pabelangerShrews: good idea17:17
clarkbwhy do you need new mirrors?17:17
clarkbif the host can't route anyways its not talking to those17:17
pabelangerwell, updates for dns entries (if we had region mirrors)17:17
pabelangerimage uploads will be a thing17:18
pabelangerthat will have a cost to it17:18
clarkbnot sure I understand the dns entries problem either. for images considering this is an appliance I assume nodepool isn't uploading those and you are just setting a uuid?17:18
clarkbif so then you can use the image that is already there17:18
pabelangeryah, appliance side we can reuse, but controller node is managed by dib17:19
openstackgerritSorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text  https://review.opendev.org/68153217:19
clarkbcontroller node can be launched on the other network and the limited network17:19
clarkbbut ya if using another provider that would be another image17:19
clarkbmaybe we need the concept of a subprovider which inherits images from its parent17:20
pabelangeryah, quota is the bigger one honestly. need to share that between providers17:20
openstackgerritClark Boylan proposed zuul/zuul master: Pass zuul_success to cleanup playbooks  https://review.opendev.org/68155217:21
pabelangerokay, let me update nodepool patch, add tests17:21
openstackgerritSorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text  https://review.opendev.org/68153217:21
pabelangerthen, maybe switch to vexxhost and create private network17:21
clarkbcorvus: 681552 passes zuul_success to cleanup playbooks17:21
pabelangerthen limit which VMs use it17:21
pabelangerI can then loop back to other provider17:21
openstackgerritTristan Cacqueray proposed zuul/zuul master: synchronize: add support for kubectl connection  https://review.opendev.org/68155317:22
zbrcorvus: i updated https://review.opendev.org/#/c/681532/ -- rephrased and included screenshots before/after, looks ok now?17:23
tristanCzuul-maint: https://review.opendev.org/681553 integrates https://github.com/ansible/ansible/pull/62107 so that most zuul-jobs can run in kubernetes17:23
tristanCit's not ideal, but i don't know how long it will take for Ansible to support synchronize with kubectl connection. Could someone ask at AnsibleFest?17:24
clarkbI think we should be very careful about adding features to ansible that won't work outside of zuul (because people will expect playbooks to work in zuul and outside of zuul)17:25
clarkbI think we can ask (I'll be at the dev day Monday)17:25
zbrtristanC: my experience with Ansible was that if you make a PR it will be reviewed and merged quite fast, or maybe I was just lucky.17:25
tristanCclarkb: it's not specific to zuul, it just extend the synchronize connection support17:26
zbrtristanC: there is only one ugly aspect of ansible: if you add a new feature, it will only go into next release.17:26
zbrbut if you hurry up, there may even be time to slip things into 2.9, not sure. pabelanger probably knows better.17:27
zbri know this because I was upset that they refused my fix to enable "etc-hosts" for docker_image module which was missing, and they accepted it only for 2.9, did not want to add it to 2.8 because it counted as "new feature".17:28
*** sshnaidm|rover is now known as sshnaidm|off17:28
zbrfrom my point of view it was a bug: failure to pass argument to docker-py module, but always depends from which angle you see it.17:28
zbrsomeone's17:28
zbrfeature is someone's bug17:29
clarkbtristanC: you've imported the code from the PR into zuul right? and that PR hasn't merged. So if we merge that change it will be specific to zuul until that PR merges17:29
clarkband it will not be clear to users that have working kubectl rsync in zuul why it doesn't work with normal ansible17:30
clarkb(all of the other zuul changes to ansible prevent actions from being taken so tasks that run outside of zuul should still work)17:30
pabelangerzbr: tristanC  won't ship in 2.9, too late for that sadly17:31
tristanCclarkb: yes that's correct17:31
pabelangerbut, with 2.9 we can create our own zuul collection if wanted17:31
tristanCzbr: pabelanger: it can wait next or next+1, some feedback on it would be great though17:31
openstackgerritPaul Belanger proposed zuul/nodepool master: Disable interface_ip check, when host-key-checking is disable  https://review.opendev.org/68154417:32
*** spsurya has quit IRC17:32
zbri will try to test it because I have a working local cluster and I wanted to deploy zuul locally anyway to learn more about it, so I should be able to help.17:32
tristanCclarkb: i understand the concern, but on the other hand, without this we can't use most zuul-jobs on kubernetes17:33
zbrtristanC: do you have a small playbook that should test that code? it could help me do the testing.17:34
tristanCanother solution would be to patch zuul-jobs fetch roles to not do synchronize, but instead copies the artifacts to a known location and let the base jobs synchronize pull from the nodes17:34
tristanCor perhaps we need a zuul-k8s-jobs with variant of the zuul-jobs that are known to work with kubectl17:35
pabelangertristanC: clarkb: the way to do this post 2.9, is a collection. We need to start thinking about supporting that in zuul, given most modules will likely be removed from ansible/ansible. That way, we could ship our own zuul_syncronize over patching ansible core functionality17:36
tristanCzbr: ansible -m synchronize -a 'src=/tmp/dir dest=. mode=push' $pod-name17:36
clarkbwe only need to replace the pull from test node to executor right? because we can synchronize from executor to a filesystem?17:36
clarkbin that case replacing synchronize between executor and test node seems like something to explore17:36
clarkbpabelanger: I assume `pip install ansible` (what zuul runs) will get you some sort of useable system? However we can add zuul-ansible-bits to that pip install list too I suppose17:37
tristanCclarkb: the zuul-jobs patch is to circumvent the oc rsync command that can't be exector on localhost from untrusted jobs17:37
tristanCexecuted*17:37
pabelangerclarkb: IIRC, new workflow is pip install ansible, galaxy install collection foobar17:37
clarkbtristanC: right that goes into a trusted base job17:37
pabelangerpip ansible will be super minimal17:37
pabelangereg: no openstack modules17:37
tristanCclarkb: e.g. we can synchronize from a pod to the executor using oc rsync, but that requires a command17:37
clarkbtristanC: that is the same with jobs running on openstack VMs (you can't run it from untrusted context)17:38
tristanCclarkb: not sure to understand, you can run synchronize: mode=pull from a VM to the executor from untrusted context17:38
clarkboh right you just have to keep the destination in the working dir17:40
clarkbwhy is exec'ing oc different than exec'ing rsync in this case? is it because it happens at the module level rather than command ?17:40
tristanCthat's because zuul authorize synchronize to do so17:41
clarkbgot it17:42
tristanCcould we make a convention for jobs that they just need to copy their artifacts to a known location on the test instance, e.g. ~/job-logs, then we could have a generic "fetch-logs" roles that would run from the base jobs, and it would be easy to make it work for both ssh and kubectl connections17:44
openstackgerritMerged zuul/zuul master: Fix timestamp race occurring on fast systems  https://review.opendev.org/68093717:44
clarkbtristanC: I want to say that already exists but not all jobs do it17:46
clarkbthe conventional location exists I mean17:46
tristanCclarkb: indeed there is fetch-output... so we could in theory patch the fetch-* roles to implement a "copy_output_locally" toggle to make them use ~/zuul-output instead17:49
corvustristanC, clarkb: yes, that was an idea that mordred worked on for a little bit, and then others carried on for a little bit, but no one has ever pushed it to completion.17:50
corvusin general, the idea was to stop having jobs fetch things from remote nodes, and instead just put them in known directories on the remote nodes and have the base jobs do the copying back17:51
tristanCclarkb: corvus: great, thanks for the suggestion, i'll work on that instead, that seems like the best system17:51
corvus++17:51
corvustristanC: i think the 'fetch-output' role is the centerpiece of that.17:52
tristanCcorvus: yes, that seems like exactly what we need17:53
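The convention discussed here (jobs only stage files in a known directory on the node, and the base job does the copying back) can be sketched as below. This is a local-filesystem stand-in, not the zuul-jobs fetch-output role itself: the function names are hypothetical, the `logs` subdirectory under `~/zuul-output` is an assumption, and a plain directory copy takes the place of whatever transport (ssh, kubectl) the base job would really use.

```python
import shutil
import tempfile
from pathlib import Path


def stage_artifact(node_home: Path, artifact: Path, kind: str = "logs") -> Path:
    """Job side: copy an artifact into the agreed directory on the node."""
    dest_dir = node_home / "zuul-output" / kind
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(artifact, dest_dir))


def fetch_output(node_home: Path, executor_dir: Path) -> list:
    """Base-job side: pull the whole agreed directory back in one place.

    Only this function needs to know about the connection type; jobs
    never fetch anything themselves.
    """
    src = node_home / "zuul-output"
    if not src.is_dir():
        return []
    shutil.copytree(src, executor_dir, dirs_exist_ok=True)  # Python 3.8+
    return sorted(p.relative_to(executor_dir).as_posix()
                  for p in executor_dir.rglob("*") if p.is_file())


def demo() -> list:
    """Run both halves against temporary directories."""
    with tempfile.TemporaryDirectory() as tmp:
        node_home = Path(tmp) / "node"
        node_home.mkdir()
        log = node_home / "tox.log"
        log.write_text("py37: commands succeeded\n")
        stage_artifact(node_home, log)
        return fetch_output(node_home, Path(tmp) / "executor")


print(demo())  # ['logs/tox.log']
```

Because the transport lives in one place, supporting a new connection type means changing only the base-job half.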
*** nhicher has joined #zuul17:56
*** electrofelix has quit IRC17:56
*** jamesmcarthur has quit IRC18:06
*** jamesmcarthur has joined #zuul18:07
*** jamesmcarthur has quit IRC18:11
openstackgerritJames E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin  https://review.opendev.org/68077818:18
*** panda|ruck is now known as panda|ruck|off18:21
*** hashar has quit IRC18:41
pabelangerclarkb: welp, I don't think the dedicated provider idea is going to work, because I need to launch 2 types of nodes, controller / appliance. So both need to be on the private network, but if I enable routes externally, both will get private ip as interface_ip.19:20
pabelangerI can't just move iosxr to the new provider, because need the ability to do multi node across providers19:20
clarkbpabelanger: and the executor isn't sufficient for the controller piece?19:21
pabelangerright, we want to test ansible19:21
pabelangerand new network connections19:21
pabelangerI think we need an option to toggle routes externally via nodepool.yaml19:21
pabelangeror allow multi node across provider / pools19:22
pabelangerlet me see why nodescan is using the private ip19:24
clarkbactually this should work fine19:25
clarkbwhat you need is two networks for the controller19:25
clarkbone is the private network shared by the appliance the other is your public network19:25
clarkbon the controller you say routes external as per normal19:25
clarkbon the appliance you say routes external on the private network19:25
pabelangerhttp://paste.openstack.org/raw/775155/19:25
pabelangerthat is what I get on controller node19:25
pabelangerwhich should be public / private19:26
pabelangerI don't know why yet, it has private ip19:26
pabelangerI would expect public19:26
pabelangerhttps://github.com/ansible-network/windmill-config/blob/master/nodepool/nl01.sjc1.vexxhost.zuul.ansible.com.yaml#L173 is the new region19:27
pabelangerclarkb: the part I might be missing, is how can I say routes externally, for different networks, per label19:27
pabelangerhttps://github.com/ansible-network/windmill-config/blob/master/nodepool/clouds.yaml.j2#L28 is clouds.yaml19:28
clarkboh I see hrm19:28
clarkbI wonder what happens if you tell nodepool to boot the appliance with only a routes external false network?19:29
clarkbthen you don't need different configs. Nodepool's fallback behavior may work in that case? (eg if no external network then use whatever is there?)19:29
clarkbI don't know that it does that though19:29
pabelangerthat works, but no interface ip19:30
pabelangerbut nodepool doesn't like that19:30
pabelangerwithout patch above19:30
pabelangerhowever, i am starting to not like that idea, and the inventory file for appliance doesn't have ansible_host set19:31
pabelangerbecause zuul gets that from interface_ip19:31
pabelangernodepool.private_ipv4 is set however19:31
clarkbwell that most accurately describes the setup you need19:31
clarkbbasically you have an unreachable instance on a network somewhere and it comes with a second node that acts as a bridge19:32
clarkbin that case maybe we should fallback to "we have no better option so use the private ip"19:32
pabelangeryah, I'd actually want the following inventory: https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_07/207/19fd14e960c2dbe048cc429c581f594d067252fe/check/ansible-network-iosxr-appliance/b78c8a9/zuul-info/inventory.yaml19:32
pabelangerI was able to force that by manually changing clouds.yaml between boots19:33
pabelangertoday, in nodepool you cannot have a nodeset with 1 interface_ip public and other interface_ip private19:33
pabelangerthat's really what I'm looking for19:33
clarkbpabelanger: where does public ip come from in that inventory?19:33
clarkbthere should be no public ip only a private ip in my example19:33
pabelangeropenstacksdk19:34
clarkb(and you'd get no inventory_ip according to your message above)19:34
pabelangerbecause I had routes external true19:34
clarkbah ok19:34
pabelangerso, interface_ip on controller node become private too19:34
pabelangerthen keyscan fails19:34
clarkbpabelanger: another way of looking at this is that if you have no external network then zuul can't talk to it so having no ip in the zuul inventory is correct19:35
clarkbpabelanger: then your job could generate a new inventory from the supplied private ip and use that in its nested ansible19:35
clarkbya so we want to disable keyscanning (which I thought we already support)19:35
pabelangerclarkb: yah, that way does work. but need https://review.opendev.org/681544/19:35
pabelangerif I apply that, I can do what you said19:36
corvusremember also that if the executor can't talk to it (ie, during the pre playbooks), then it's going to be unhappy, so we may not want it in the inventory at all.  if that's the case, we may want to look into treating it like a "resource" (ie, what we do for k8s) rather than an item in the inventory.19:36
pabelangerthen deal with missing ansible_host info via playbooks19:36
clarkbcorvus: ++ I think the nested ansible (or other job content) needs to figure out how to interact with it rather than zuul19:36
pabelangercorvus: yup, today that is what we do, if the node is in appliance group, we have zero pre-playbooks or run playbooks use it, via hosts: all:!appliance19:37
pabelangerthen use your write-inventory role to add it to 1st node /etc/ansible folder, and control it from there19:38
pabelangerzuul-executor doesn't touch it19:38
pabelanger(appliance)19:38
pabelangerif we didn't have to test ansible, I think we'd be fine with zuul-executor using it directly19:39
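[Editor's sketch] The approach discussed above (leave the appliance without an ansible_host in the Zuul inventory, then have the job build a second inventory for the nested Ansible run from nodepool.private_ipv4) could look roughly like the following. This is a minimal illustration, not the write-inventory role itself: the function name, the input shape, and the example addresses are all assumptions made for the example.

```python
# Hypothetical sketch: build an inventory for the nested Ansible run on the
# controller. Only "ansible_host" and "nodepool.private_ipv4" come from the
# discussion; the function name and data layout are assumed.

def build_nested_inventory(hostvars, appliance_group):
    """Return an inventory dict in which appliance hosts that lack
    ansible_host get it filled in from nodepool.private_ipv4."""
    hosts = {}
    for name, variables in hostvars.items():
        entry = {}
        ansible_host = variables.get("ansible_host")
        if not ansible_host and name in appliance_group:
            # Zuul takes ansible_host from interface_ip, which is empty for
            # a node that has no externally routable network.
            ansible_host = variables.get("nodepool", {}).get("private_ipv4")
        if ansible_host:
            entry["ansible_host"] = ansible_host
        hosts[name] = entry
    return {"all": {"hosts": hosts}}


# Example data using documentation-range addresses (assumed values).
example_hostvars = {
    "controller": {
        "ansible_host": "203.0.113.10",
        "nodepool": {"private_ipv4": "10.0.0.5"},
    },
    "appliance": {"nodepool": {"private_ipv4": "10.0.0.6"}},
}
nested = build_nested_inventory(example_hostvars, {"appliance"})
```

The controller keeps its public address while the appliance falls back to its private one, which matches the mixed public/private nodeset pabelanger describes wanting.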
*** panda|ruck|off has quit IRC19:55
*** panda has joined #zuul19:57
*** jamesmcarthur has joined #zuul20:06
openstackgerritJames E. Blair proposed zuul/zuul master: Add enqueue reporter action  https://review.opendev.org/68113220:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add no-jobs reporter action  https://review.opendev.org/68127820:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add report time to item model  https://review.opendev.org/68132320:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add Item.formatStatusUrl  https://review.opendev.org/68132420:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin  https://review.opendev.org/68077820:15
pabelangerclarkb: corvus: Shrews: I've confirmed https://review.opendev.org/681544/ does allow the node to come online properly now, outside of the routes externally issue, if you could review when possible20:28
pabelangeralso, if you have ideas how to test in nodepool, aside from updating nodepool test, I can add those tests20:28
pabelangersorry devstack test20:29
openstackgerritPaul Belanger proposed zuul/zuul-jobs master: Also include nodepool inventory variables  https://review.opendev.org/68160120:39
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: fetch-javascript-tarball: introduce zuul_do_synchronize  https://review.opendev.org/68160320:41
tristanCclarkb: corvus: should i propose something similar to https://review.opendev.org/681603 for all the other affected fetch roles?20:41
corvustristanC: yeah -- though maybe call it 'zuul_use_fetch_output' with a default value of '{{ zuul_site_use_fetch_output }}' so the ux is that someone sets a site variable that says "my base job uses fetch-output" ?20:43
corvus(and i guess that's inverting the boolean, so, default of false instead of true)20:45
tristanCcorvus: good idea, thanks20:54
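[Editor's sketch] The toggle corvus suggests (a per-role 'zuul_use_fetch_output' that defaults to a site-wide 'zuul_site_use_fetch_output', itself defaulting to false) can be modelled as below. The two variable names and the ~/zuul-output location come from the discussion; the Python functions and return values are purely illustrative assumptions, since the real implementation would live in Ansible role defaults.

```python
# Hypothetical model of the variable resolution. In the actual roles this
# would be an Ansible default such as '{{ zuul_site_use_fetch_output }}'.

def resolve_use_fetch_output(role_vars, site_vars):
    """An explicit role-level setting wins; otherwise use the site
    variable; otherwise False (the inverted boolean corvus mentions)."""
    if "zuul_use_fetch_output" in role_vars:
        return bool(role_vars["zuul_use_fetch_output"])
    return bool(site_vars.get("zuul_site_use_fetch_output", False))


def artifact_destination(role_vars, site_vars):
    """Describe where a fetch-* role would put its artifacts."""
    if resolve_use_fetch_output(role_vars, site_vars):
        # Leave files on the node; the base job's fetch-output role
        # copies them back later.
        return "node:~/zuul-output"
    # Legacy behaviour: the role pulls files to the executor itself.
    return "executor:synchronize"
```

With this shape, a deployer whose base job runs fetch-output sets one site variable and every affected fetch role follows it, while an individual job can still override the value.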
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: fetch-javascript-tarball: introduce zuul_do_synchronize  https://review.opendev.org/68160321:02
*** jamesmcarthur has quit IRC21:20
*** avass has quit IRC21:32
*** saneax has quit IRC21:57
*** panda has quit IRC22:00
*** panda has joined #zuul22:03
*** rlandy is now known as rlandy|bbl22:24
*** threestrands has joined #zuul22:37
corvusdoes anyone have suggestions as to how to convince sphinx to supply more information than this: Warning, treated as error:22:49
corvus:1: (ERROR/3) Unknown interpreted text role "class".22:49
corvus(that is *literally* all it's telling me.  no idea even what file is triggering it.  still happens without any changes to .rst files)22:50
fungino clue. did you add some text with a role named "class"?22:51
corvusfungi: i didn't -- but even when i revert out the changes to rst files it still happens22:51
fungiokay, so something new. yeah line numbers would be nice :/22:51
corvussomehow it seems that changing the python code has caused this?  i also haven't changed any docstrings.22:51
corvusif i'm parsing the error string correctly, that's: "line number 1 in the empty file"22:52
fungiargh22:52
fungiis this when trying to locally do `tox -e docs` on the zuul repo?22:53
corvusyep, with 680778 checked out22:53
fungiseems i need a full set of ansible build dependencies installed to build zuul docs22:59
corvusthe client doc pages run "zuul" to get the help output23:00
SpamapSmaybe an upstream library changed that flubs the scraped output?23:01
pabelangerokay, super ugly, but managed to get the job working: https://github.com/ansible/ansible-zuul-jobs/pull/20723:03
corvusSpamapS: if i checkout the change ahead of it it works; so i think it's something about 680778 setting it off23:03
SpamapSah that's handy23:04
pabelangerthe appliance comes online with non-routable public IP, but controller can access it.  I need to figure out a better way of updating hostvars with right private IP, so I can use write-inventory role.23:05
pabelangerUsed the replace filter, but really want to replace I think23:05
fungiokay, i've gotten my dev env to the point where i can reproduce the opaque sphinx error with change 68077823:08
corvusfungi, SpamapS: okay, by reverting each chunk of the patch in 680778 one at a time and running sphinx, i've found it's related to the addition of the GerritPoller class in gerritconnection.py23:08
SpamapShah I did that too and was about to say that.. ;)23:09
SpamapSLuckily it was the 3rd hunk. ;)23:10
SpamapSYou can also just not mention that class, and it will build correctly.23:10
corvusi don't think it's referenced in the docs23:11
SpamapShm, you're right23:12
*** saneax has joined #zuul23:12
SpamapSbut I can also get the docs to build if I revert doc/source/admin/drivers/gerrit.rst23:12
SpamapSoh wait no, they just get further23:13
SpamapScorvus: the problem is GerritPoller needs a """ """ instead of #23:13
corvusit's specifically the line "poller_class = GerritPoller" setting a class variable in GerritConnection23:13
SpamapSoh hah every time I think I find it it explodes more23:14
SpamapSin different ways.. derp23:14
SpamapSyeah23:14
corvusokay, i suspect the linkage here is that the testing doc does document FakeGerritConnection which links to GerritConnection (though that is not documented); but that might be enough to get the sphinx autodoc stuff examining that class23:16
corvuswhy it's treating that instance variable that way is still a mystery23:16
* SpamapS puts $4.20 on it being some bonghits python parsing corner case23:16
*** sanjayu_ has joined #zuul23:18
*** igordc has quit IRC23:18
corvushuh, i assumed the "class" in "poller_class" was the "class" it was referring to, but no -- even if i change that varname to "poller_thingy" it still barfs23:19
fungieven commenting out the line entirely still breaks23:20
*** saneax has quit IRC23:20
corvusi'm able to fix it by commenting that line out23:21
fungimaybe i'm commenting out a different line23:23
corvusfungi: in change 680778 it's gerritconnection.py line 36323:23
fungialso possible i needed to git clean between runs23:23
corvusokay, new hypothesis -- testing.rst autodocs FakeGerritConnection with "inherited-members" so it picks up stuff from GerritConnection, which has a class variable that points to another class and sphinx can't handle that.23:25
fungiwith that line commented out (same line number/file) it breaks on me with the above error23:25
fungimaybe there's more you've also commented out?23:25
corvus(i explicitly tested reparenting that class from threading.Thread to object to exclude the hypothesis that it's the threading class that's weird)23:25
corvusfungi: almost certainly.  let me reset to that and see.23:25
SpamapSthat actually makes sense given the error message23:26
SpamapSIt's probably looking for `str` or `bytes` or `unicode` and it's getting `class` back from `type(thatvar)`23:26
fungifwiw, a web search on that error mostly shows commits/discussions about developers silencing it23:27
fungithough i never found a great explanation of why it was turning up in their cases, this explanation does make sense23:27
fungisphinx basically has a hard-coded list of types it expects23:27
corvusfungi: yep, i also had removed it in base.py  --  so essentially the error shows up in 2 places23:28
fungii wouldn't be surprised if class objects as classvars confuse it23:28
corvusif we remove *both* of those lines it fixes it23:28
fungiyep, confirmed23:30
fungii wonder if there's a way to mark variables so that autodoc will skip them23:32
corvuswell, one way to do that is to prefix them with an '_', which is a perfectly acceptable solution in this case so i think i'll go with that :)23:32
*** jamesmcarthur has joined #zuul23:32
*** igordc has joined #zuul23:33
openstackgerritJames E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin  https://review.opendev.org/68077823:34
corvusfungi, SpamapS: thanks!  that was "fun"  :)23:34
fungiindeed, that's a reasonable workaround i guess23:37
fungiit's possible https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#skipping-members could be used as an alternative, but the simple solution seems fine23:40
*** igordc has quit IRC23:40
*** igordc has joined #zuul23:42
*** jamesmcarthur has quit IRC23:46
*** jamesmcarthur has joined #zuul23:46
SpamapScorvus: at least you didn't have to print out the hierarchy this time. ;)23:48
*** jamesmcarthur has quit IRC23:51
*** sanjayu_ has quit IRC23:58
