*** jamesmcarthur has joined #zuul | 00:02 | |
*** jamesmcarthur has quit IRC | 00:06 | |
*** jamesmcarthur has joined #zuul | 00:10 | |
*** jamesmcarthur has quit IRC | 00:27 | |
*** threestrands has joined #zuul | 00:27 | |
*** jamesmcarthur has joined #zuul | 00:39 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: Add remove-zuul-sshkey https://review.opendev.org/680712 | 00:41 |
*** jamesmcarthur has quit IRC | 00:44 | |
*** jamesmcarthur has joined #zuul | 01:02 | |
*** rlandy has quit IRC | 01:50 | |
*** jamesmcarthur has quit IRC | 01:59 | |
*** jamesmcarthur has joined #zuul | 02:04 | |
*** roman_g has quit IRC | 02:34 | |
*** jamesmcarthur has quit IRC | 03:00 | |
*** jamesmcarthur has joined #zuul | 03:06 | |
*** jamesmcarthur has quit IRC | 03:13 | |
*** jamesmcarthur has joined #zuul | 03:25 | |
*** rfolco has quit IRC | 03:32 | |
*** PrinzElvis has quit IRC | 03:39 | |
*** webknjaz has quit IRC | 03:43 | |
*** dcastellani has quit IRC | 03:45 | |
*** PrinzElvis has joined #zuul | 03:45 | |
*** webknjaz has joined #zuul | 03:46 | |
*** dcastellani has joined #zuul | 03:46 | |
*** jamesmcarthur has quit IRC | 03:57 | |
*** ianychoi_ has joined #zuul | 04:24 | |
*** ianychoi has quit IRC | 04:27 | |
*** pcaruana has joined #zuul | 04:42 | |
*** saneax has joined #zuul | 05:03 | |
*** pcaruana has quit IRC | 05:12 | |
*** bolg has joined #zuul | 05:45 | |
*** bolg has quit IRC | 06:09 | |
*** pcaruana has joined #zuul | 06:21 | |
*** bolg has joined #zuul | 06:26 | |
*** AJaeger has quit IRC | 06:28 | |
*** AJaeger has joined #zuul | 06:35 | |
*** hashar has joined #zuul | 06:42 | |
*** roman_g has joined #zuul | 07:16 | |
*** themroc has joined #zuul | 07:16 | |
*** jpena|off is now known as jpena | 07:37 | |
bogdando | o/ | 07:38 |
bogdando | please merge https://review.opendev.org/#/c/681182/ | 07:38 |
*** avass has joined #zuul | 07:40 | |
*** threestrands has quit IRC | 07:59 | |
*** sshnaidm|afk is now known as sshnaidm|ruck | 08:09 | |
*** ianychoi_ has quit IRC | 09:09 | |
*** hashar has quit IRC | 09:38 | |
*** saneax has quit IRC | 10:11 | |
*** saneax has joined #zuul | 10:12 | |
*** ianychoi has joined #zuul | 10:30 | |
*** avass has quit IRC | 10:37 | |
*** ianychoi has quit IRC | 10:45 | |
*** ianychoi has joined #zuul | 10:45 | |
*** shachar has quit IRC | 11:08 | |
*** snapiri has joined #zuul | 11:08 | |
*** sshnaidm|ruck is now known as sshnaidm|bbl | 11:20 | |
*** hashar has joined #zuul | 11:20 | |
*** avass has joined #zuul | 11:30 | |
*** jpena is now known as jpena|lunch | 11:40 | |
*** spsurya has joined #zuul | 11:58 | |
*** rfolco has joined #zuul | 12:06 | |
*** rlandy has joined #zuul | 12:23 | |
*** jamesmcarthur has joined #zuul | 12:25 | |
pabelanger | morning! I wanted to see how we could make a change to nodepool, to not fail if an openstack provider has been configured to not have a public IP: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/openstack/handler.py#L190 | 12:30 |
*** jamesmcarthur has quit IRC | 12:30 | |
pabelanger | we have this use case in one of our network appliances, cisco iosxr, where the management interface can come online (via dhcp) but not have a default route | 12:31 |
pabelanger | so we need to do some very weird things, to make multinode jobs happen | 12:31 |
pabelanger | but this requires both nodes to be on the same subnet with public internet (which is really hard to get these days). | 12:31 |
pabelanger | So, by removing the check above or toggling it, we want to launch a node that only has private IPs, which the zuul executor possibly won't have direct access to | 12:32 |
*** jpena|lunch is now known as jpena | 12:32 | |
*** bogdando has left #zuul | 12:43 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: WIP: Allow ensure-tox to upgrade tox version https://review.opendev.org/676464 | 13:11 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Allow ensure-tox to upgrade tox version https://review.opendev.org/676464 | 13:13 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support https://review.opendev.org/674092 | 13:13 |
sean-k-mooney | i probably should have checked this before pushing: the github driver supports depends-on with the pull request URL, right? | 13:35 |
pabelanger | yes | 13:35 |
sean-k-mooney | so this would work https://review.opendev.org/#/c/681474/ | 13:35 |
pabelanger | yup | 13:35 |
sean-k-mooney | if the intel zuul has that project in its zuul config | 13:35 |
sean-k-mooney | cool, i was expecting upstream zuul to explode because it does not | 13:36 |
sean-k-mooney | oh i forgot to add back in noop to upstream | 13:36 |
*** panda is now known as panda|ruck | 13:40 | |
*** sshnaidm|bbl is now known as sshnaidm|ruck | 13:44 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Allow ensure-tox to upgrade tox version https://review.opendev.org/676464 | 13:44 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support https://review.opendev.org/674092 | 13:44 |
*** swest has quit IRC | 13:45 | |
*** panda|ruck is now known as panda|rover | 13:45 | |
*** swest has joined #zuul | 13:45 | |
*** bolg has quit IRC | 13:56 | |
sshnaidm|ruck | hi, how can I build new containers with zuul? The current containers on docker.io/zuul have zuul version 3.5.0, which is quite old | 13:59 |
Shrews | pabelanger: interface_ip is not necessarily the public IP. it's whatever IP sdk determines is used for communicating with that server, which might be private: https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/cloud/meta.py#L318-L337 | 13:59 |
*** swest has quit IRC | 13:59 | |
sshnaidm|ruck | docker pull zuul/zuul; docker run -it zuul/zuul zuul --version | 13:59 |
sshnaidm|ruck | Zuul version: 3.5.0 | 13:59 |
pabelanger | Shrews: yah, agree. In this case it is a hard check: nodepool needs to find the interface_ip, which here will always be empty. I'd like to skip that check | 14:02 |
AJaeger | sshnaidm|ruck: did you confirm that the content is old - or is it just the version? | 14:02 |
AJaeger | sshnaidm|ruck: the image was updated yesterday, wasn't it? | 14:02 |
sshnaidm|ruck | AJaeger, what do you mean by content? | 14:02 |
sshnaidm|ruck | AJaeger, I need a newer zuul version, at least as new as what's in CI | 14:03 |
AJaeger | I mean: Does it include current git master but use the wrong version number? | 14:03 |
pabelanger | we can look at the publish job to see | 14:03 |
Shrews | pabelanger: i'm confused... how do you expect the zuul executor to communicate with the node then? | 14:04 |
AJaeger | https://hub.docker.com/r/zuul/zuul/tags says "updated 17hours ago" | 14:04 |
pabelanger | Shrews: it doesn't, until the primary node is able to SSH into secondary and setup route | 14:04 |
pabelanger | It needs to be done this way, because the route cannot be obtained via dhcp | 14:05 |
AJaeger | sshnaidm|ruck, pabelanger, this seems to be the job pushing the image, isn't it? http://zuul.opendev.org/t/zuul/build/e174c37169da417da31a85a644ca7976 | 14:05 |
pabelanger | only static | 14:05 |
sshnaidm|ruck | AJaeger, it doesn't include git, it has /usr/local/lib/python3.7/site-packages/zuul-3.5.0.dist-info | 14:05 |
Shrews | pabelanger: so these are nodes allocated via nodepool, but not expected to be used directly by the executor? | 14:05 |
pabelanger | Shrews: they are allocated in nodepool, but zuul / nodepool cannot route to them until a pre-run task in the base job sets up the route | 14:06 |
pabelanger | once that is done, then zuul executor can access it | 14:06 |
zbr | not sure who is generating https://zuul.opendev.org/manifest.json but it would be useful to include zuul version there. | 14:06 |
pabelanger | this is because nodepool isn't able to properly set up the network directly; it needs manual commands | 14:07 |
Shrews | pabelanger: that goes directly against our current design. i don't have any ideas on that one, atm | 14:08 |
pabelanger | we have this working today | 14:08 |
pabelanger | in zuul.a.c | 14:08 |
pabelanger | but, it works because we are using public IPs | 14:08 |
pabelanger | I'd just like to have nodepool not enforce an interface ip | 14:09 |
pabelanger | so I can flip to the Vm to private | 14:09 |
corvus | sshnaidm|ruck, AJaeger: "docker run -it zuul/zuul zuul --version" -> "Zuul version: 3.10.2.dev66" for me | 14:09 |
fungi | pabelanger: how about a reverse nat, where the builder/executor masquerade as a local address on that network when connecting to those addresses? that way the device never needs a default route and just responds via layer 2 (arp or v6nd) resolution | 14:10 |
sshnaidm|ruck | corvus, did you pull from docker.io? | 14:11 |
corvus | sshnaidm|ruck: yes | 14:11 |
sshnaidm|ruck | corvus, me too.. | 14:11 |
pabelanger | fungi: trying to understand reverse nat comment | 14:12 |
sshnaidm|ruck | lemme check on vm, maybe cache.. | 14:12 |
Shrews | pabelanger: And the executor currently runs this pre-run task on the node, right? | 14:12 |
pabelanger | Shrews: on the primary node | 14:12 |
Shrews | oh, i think i see now | 14:13 |
pabelanger | so here is an example | 14:13 |
pabelanger | https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_32/62132/b93674666faa9a116d2a6d160f48557e6c7a454e/third-party-check/ansible-test-network-integration-iosxr-python36/dd0c215/job-output.html#l598 | 14:13 |
pabelanger | we have a pre-run playbook that runs on the primary node (2-node nodeset) | 13:13 |
pabelanger | to ensure it can route to the appliance node (so they are on the same subnet) | 14:14 |
AJaeger | sshnaidm|ruck, corvus, I get the same result as corvus. So, all looks fine here as well - and I pulled from dockerhub. | 14:14 |
pabelanger | then we know we can access it and do things to it | 14:14 |
*** bolg has joined #zuul | 14:14 | |
pabelanger | if we cannot, zuul aborts and retries | 14:14 |
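A minimal sketch of the kind of pre-run check described above, assuming a hypothetical "appliance" inventory group name; nodepool.public_ipv4 is the hostvar zuul's inventory provides, and a pre-run failure is what makes zuul abort and retry:

    # playbooks/pre.yaml (illustrative)
    - hosts: primary
      tasks:
        - name: Verify the primary node can route to the appliance
          command: "ping -c 3 {{ hostvars[groups['appliance'][0]].nodepool.public_ipv4 }}"
          changed_when: false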
pabelanger | however, today we are using public IPs, in vexxhost-sjc1, since they have 1 single subnet of public IPs | 14:14 |
fungi | pabelanger: managed device lives in the "private" network and lacks routes outside that network. executor and launcher live outside the network and have static routes to a router which knows how to forward traffic into that network. last hop also performs layer 3 address translation to map the executor and launcher addresses to local addresses on that network so the device sees connections coming from a local | 14:14 |
fungi | (to it) address | 14:14 |
pabelanger | however, it is a single region we test against. In limestone, we can create a provider network that is a single private subnet, and route between the 2 nodes without default routes | 14:15 |
Shrews | pabelanger: how does the primary node determine the address of the appliance nodes? | 14:15 |
sshnaidm|ruck | corvus, AJaeger thanks, checked on different vm, it's also 3.10, seems like docker cached something | 14:15 |
pabelanger | Shrews: public IP, because nodepool knows it | 14:16 |
pabelanger | when we flip to private, we'll need to manage that via nodepool again | 14:16 |
pabelanger | as it will have the info | 14:16 |
pabelanger | https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_32/62132/b93674666faa9a116d2a6d160f48557e6c7a454e/third-party-check/ansible-test-network-integration-iosxr-python36/dd0c215/zuul-info/inventory.yaml | 14:17 |
pabelanger | is example inventory file | 14:17 |
pabelanger | fungi: so, if I understand, for that to work with multiple clouds, I'd need per-region subnets that don't conflict | 14:18 |
pabelanger | so I know which cloud to route to | 14:18 |
corvus | pabelanger: have you tried https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openstack].pools.host-key-checking ? | 14:19 |
fungi | pabelanger: a twist on that, if you do need conflicting/overlapping private networks, is to also use 1:1 nat in the other direction for those devices | 14:19 |
pabelanger | corvus: yes, we disable that by default. However we hiccup on interface_ip missing | 14:19 |
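For context, a hedged sketch of where that option sits in a nodepool provider config (provider, network, and label names are illustrative):

    providers:
      - name: limestone
        cloud: limestone
        pools:
          - name: appliances
            host-key-checking: false   # skip the ssh keyscan for these nodes
            networks:
              - private-net
            labels:
              - name: iosxr-appliance
                cloud-image: iosxr
                flavor-name: medium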
corvus | pabelanger: why are the nodes not always on the same subnet? can't you make a neutron network and put them both on it? the whole "abort and retry if they aren't on the same subnet" thing seems like it could be very problematic. | 14:24 |
pabelanger | corvus: I don't know, to be honest. I'd have to confirm with cloud providers how to do that, or whether openstack supports it. That would in fact be the easiest solution here. | 14:25 |
*** electrofelix has joined #zuul | 14:26 | |
corvus | pabelanger: i think that's worth looking into | 14:28 |
corvus | pabelanger, Shrews: but back to the interface_ip thing -- | 14:28 |
pabelanger | k, I'll work up email to openstack ML | 14:28 |
corvus | pabelanger, Shrews: it seems like we expect entire clouds to be either public or private, but iiuc, pabelanger has a cloud where he wants to get both public and private vms. so yeah, that's not accommodated by the logic in sdk. even if we set up an override option in nodepool to say "force private", that would still apply at the pool level, not the server level, so there's no way to say "use interface_ip | 14:30 |
corvus | for this server, and private_ip for this other one" | 14:30 |
corvus | pabelanger, Shrews: the only workable option i see is strictly what pabelanger suggested: disable the interface ip check and just return no data. i guess we could do that, but if we add that, we should add a warning saying people almost certainly don't want to enable this flag because it will mask all kinds of very frequent problems. | 14:32 |
Shrews | pabelanger: corvus: i wonder if we can just skip the interface_ip check if host-key-checking is disabled. presumably you don't care about that ip if you're skipping ssh keyscan??? | 14:35 |
corvus | Shrews: hrm, yeah, that may be reasonable | 14:36 |
*** nhicher has quit IRC | 14:39 | |
pabelanger | yah, looking at the code more, if we skip interface_ip when host-key-checking is false, I think that makes sense. | 14:39 |
pabelanger | so, +1 from me :) | 14:39 |
pabelanger | but also asking in openstack too | 14:39 |
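Paraphrased, the proposal amounts to something like this in the openstack launch handler (a sketch of the idea, not the actual patch that later became https://review.opendev.org/681544):

    # nodepool/driver/openstack/handler.py (illustrative shape)
    if not interface_ip:
        if self.pool.host_key_checking:
            # keyscanning is on, so an unreachable node is a hard error
            raise exceptions.LaunchNetworkException(
                "Unable to find suitable IP for server")
        # keyscanning is off: let the node through and assume the job
        # reaches it some other way (e.g. via a primary node)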
*** sshnaidm|ruck is now known as sshnaidm|rover | 14:40 | |
clarkb | pabelanger: so your appliance refuses to set a default route from dhcp? | 14:46 |
pabelanger | clarkb: yup! | 14:47 |
pabelanger | it is terrible | 14:47 |
*** panda|rover is now known as panda|ruck | 14:52 | |
fungi | if the device is a piece of networking gear, that's not entirely uncommon behavior | 14:56 |
fungi | you end up with a lot of flat management networks and/or 1:1 nat when dealing with devices like that | 14:57 |
clarkb | ime the expectation is you manually configure the device and not use dhcp (but that still allows for a default route) | 14:57 |
*** bogdando has joined #zuul | 14:58 | |
bogdando | hi, please merge https://review.opendev.org/#/c/681182/ | 14:58 |
bogdando | Shrews, clarkb: ^^ | 14:58 |
*** bogdando has left #zuul | 14:58 | |
fungi | yeah, or if the device is capable of placing its management address on any network where it has a serial/loopback then it may be able to make that routable via traditional routing protocols | 14:58 |
fungi | that's more common for actual ip routers though, less so for pure ethernet switches | 15:00 |
pabelanger | yah, usually this device is the one providing DHCP to the network, so that is why it is missing (or so I am told) | 15:01 |
pabelanger | this is basically a large hack around the idea of not adding console support into nodepool :) | 15:02 |
clarkb | in this particular case it does seem like you want to create your own network and subnet in neutron and boot all of the instances on that network | 15:04 |
clarkb | then your executor can have an interface on the network too | 15:04 |
clarkb | I believe mordred has said that while vexxhost gives you public networking by default you can still create a network and subnet and router there | 15:04 |
pabelanger | Yup, that is right; when I last tested this we couldn't get nodepool to bring the node online due to the missing interface_ip | 15:06 |
pabelanger | I did get it working with FIPs | 15:07 |
pabelanger | but some clouds don't support that | 15:07 |
pabelanger | (and FIPs have an extra cost) | 15:07 |
*** jamesmcarthur has joined #zuul | 15:08 | |
clarkb | there is a clouds.yaml setting to say the private ip is the ip to use | 15:08 |
pabelanger | my last idea, is going to be using nested virt for the appliance. But really trying to avoid doing that | 15:08 |
clarkb | if nodepool is checking that that is reachable it would still fail though | 15:08 |
pabelanger | clarkb: would I need to create a new pool for that? I mostly only want this 1 VM to be set up with private | 15:09 |
clarkb | you might have to set up a new provider for that since it is a clouds.yaml setting | 15:10 |
pabelanger | k, let me look into that. that might complicate things on the nodepool config side, but it's also an option | 15:10 |
clarkb | pabelanger: https://docs.openstack.org/os-client-config/latest/user/network-config.html | 15:11 |
clarkb | I think you set routes_externally on the network to true | 15:11 |
*** chandankumar has quit IRC | 15:11 | |
clarkb | then you'll get that IP back as "the ip" from sdk in nodepool | 15:11 |
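A sketch of the clouds.yaml network settings being described, per the os-client-config docs linked above (cloud and network names are illustrative):

    clouds:
      limestone:
        networks:
          - name: private-net
            routes_externally: true   # sdk then treats this network's IP as the one to use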
pabelanger | that might actually not be bad, if that is the case | 15:12 |
*** chandankumar has joined #zuul | 15:12 | |
pabelanger | okay, let me test that | 15:12 |
*** Goneri has joined #zuul | 15:15 | |
fungi | i strongly suspect openstack doesn't want to provide a means to request/guarantee provider network affinity, and expects you to create a network instead if you need that | 15:16 |
clarkb | ya that's the whole point of being able to configure networks and subnets yourself | 15:17 |
Shrews | clarkb: we good to merge https://review.opendev.org/681182 for bogdando? Not sure why it wasn't approved before... | 15:20 |
clarkb | Shrews: I don't know why tristanC didn't approve, but ya aiui we test the multinode roles fairly well so if tests pass I would expect that to be working and can be approved | 15:22 |
clarkb | Shrews: do you want to +A or should I? | 15:22 |
Shrews | i'll go ahead | 15:22 |
Shrews | just wanted to make sure we weren't waiting for something | 15:23 |
*** bolg has quit IRC | 15:23 | |
*** themroc has quit IRC | 15:26 | |
*** igordc has joined #zuul | 15:31 | |
*** mattw4 has joined #zuul | 15:34 | |
*** mattw4 has quit IRC | 16:04 | |
*** mattw4 has joined #zuul | 16:04 | |
openstackgerrit | Merged zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 16:05 |
*** mattw4 has quit IRC | 16:10 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: WIP: Allow ensure-tox to upgrade tox version https://review.opendev.org/676464 | 16:16 |
*** hashar has quit IRC | 16:18 | |
*** chandankumar is now known as raukadah | 16:31 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text https://review.opendev.org/681532 | 16:32 |
pabelanger | clarkb: that seems to work, but as I guessed, it disrupts all other VMs that are also attached to that network. For some reason, openstacksdk is returning that as the interface IP, even while a public network exists and routes externally. I'm guessing there is no ordering involved when 2 networks have that setting enabled | 16:38 |
pabelanger | but, will look more into openstacksdk | 16:38 |
clarkb | pabelanger: ya you need to use a separate provider with different clouds.yaml profile | 16:40 |
clarkb | for the ordering thing you can specify the public network as routes_external false probably | 16:41 |
clarkb | or only boot the instance with a single network | 16:41 |
pabelanger | yah, I have complex network requirements and limited provider networks | 16:42 |
pabelanger | if routes_externally could be passed via nodepool.yaml file, that would work | 16:43 |
clarkb | well in this case you are wanting to not use provider networks and instead use user configured networks | 16:43 |
pabelanger | but we'd need to grow support I think | 16:43 |
clarkb | hrm? | 16:43 |
clarkb | nodepool allows you to specify which networks to attach | 16:43 |
pabelanger | yup, but I don't think we expose the setting of the network | 16:43 |
clarkb | nodepool does | 16:43 |
pabelanger | oh | 16:43 |
clarkb | https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openstack].pools.networks | 16:44 |
*** jpena is now known as jpena|off | 16:44 | |
SpamapS | corvus: I grabbed the nodepool task on https://storyboard.openstack.org/#!/story/2006516 .. but .. feels like the task list isn't quite in line with the story. If I understand the story right, this is mostly about making the DB optional and allowing external settings. Yes? | 16:46 |
pabelanger | clarkb: isn't that just the name of the network? | 16:47 |
pabelanger | not the settings for it | 16:47 |
clarkb | pabelanger: correct the settings go in clouds.yaml | 16:49 |
corvus | SpamapS: i think the first 2 tasks are pre-reqs for the rest (or, well, at the very least, the zk task is a pre-req for nodepool) | 16:49 |
clarkb | pabelanger: what you need to do is have two profiles in clouds.yaml with different settings for the same network. Then have two providers in nodepool using different clouds.yaml profiles | 16:50 |
pabelanger | clarkb: okay, yah, that is still the 2 provider rule | 16:50 |
clarkb | yes | 16:50 |
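A minimal sketch of that two-profile arrangement, with illustrative names: the same cloud appears twice in clouds.yaml with different network settings, and nodepool defines one provider per profile:

    # clouds.yaml (auth sections omitted)
    clouds:
      limestone-public:
        networks:
          - name: private-net
            routes_externally: false
      limestone-private:
        networks:
          - name: private-net
            routes_externally: true

    # nodepool.yaml
    providers:
      - name: limestone-public
        cloud: limestone-public
      - name: limestone-private
        cloud: limestone-private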
pabelanger | which does work, I just manually forced it here: https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_07/207/19fd14e960c2dbe048cc429c581f594d067252fe/check/ansible-network-iosxr-appliance/b78c8a9/job-output.html#l40 | 16:50 |
clarkb | fwiw we intentionally stopped managing network settings in nodepool and rely on clouds.yaml | 16:51 |
clarkb | so I think that is the correct way to do this | 16:51 |
pabelanger | yah, I'm just not looking forward to doubling my nodepool config from 4 providers to 8, for a single node :( | 16:52 |
corvus | SpamapS: nodepool needs a zk. the story says we should be able to provide zk connection information outside of any operators (like, imagine the IT department already runs a ZK). so the easiest way to get something working in the test job is to set up a zk (which just happens to be run by an operator in the same k8s, but zuul-operator doesn't have to know about that). then tell zuul-operator the zk | 16:52 |
corvus | connection info. later on, we can do more fancy things with the zuul-operator interacting with the zk operator. | 16:52 |
corvus | (and same thing applies to pxc and zuul) | 16:52 |
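Purely as illustration of "provide the zk connection information from outside", the operator's custom resource might carry something like the following; the field names here are hypothetical, not the actual CRD:

    apiVersion: operator.zuul-ci.org/v1alpha1   # hypothetical
    kind: Zuul
    spec:
      zookeeper:
        hosts: zk.example.com:2181   # e.g. a ZK the IT department already runs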
SpamapS | corvus: got it. I'll put my extra cycles into those first few then. | 16:53 |
corvus | kk | 16:53 |
SpamapS | Interesting experience. I recently marked our `gate` pipeline as `supercedes: check`. As a result, PRs are merging with a "pending" status on check. I wonder if we can delete a status. | 17:03 |
*** hashar has joined #zuul | 17:04 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 17:06 |
openstackgerrit | Paul Belanger proposed zuul/nodepool master: Disable interface_ip check, when host-key-checking is disable https://review.opendev.org/681544 | 17:06 |
pabelanger | corvus: clarkb: Shrews: ^that should be the host-key-checking patch we discussed this morning | 17:07 |
pabelanger | if still okay, I'll add some testing around it after I grab some lunch | 17:07 |
corvus | SpamapS: yeah, we might need a new reporter action in zuul to handle that; if you find that the github api supports it, we can do that. | 17:07 |
pabelanger | that would be the least-work approach to make this work for us | 17:07 |
pabelanger | but understand if we don't want to do it | 17:08 |
corvus | SpamapS: (reporter action meaning like "start, failure, success.... superceded") | 17:09 |
pabelanger | clarkb: just thinking about it before getting food: is there no way in clouds.yaml to define my own network name but map it to an existing provider network? I'm guessing not | 17:12 |
clarkb | pabelanger: you create a network in the cloud and use that instead of the provider network | 17:14 |
pabelanger | yah, I don't think I can create a network | 17:15 |
Shrews | pabelanger: i think we need to document the behavior in the host-key-checking doc portion too | 17:15 |
clarkb | mordred: has said you can in vexxhost | 17:15 |
pabelanger | this is limestone where I am testing | 17:15 |
clarkb | I've never done it myself though | 17:15 |
clarkb | you may be able to there too | 17:16 |
pabelanger | trying to figure it out with current resource | 17:16 |
pabelanger | the new provider works | 17:16 |
pabelanger | but that is a lot of overhead, like mirrors, quotas, etc. | 17:16 |
pabelanger | I could get behind a pool, but that is dedicated quotas there too | 17:16 |
pabelanger | Shrews: good idea | 17:17 |
clarkb | why do you need new mirrors? | 17:17 |
clarkb | if the host can't route anyway it's not talking to those | 17:17 |
pabelanger | well, updates for dns entries (if we had region mirrors) | 17:17 |
pabelanger | image uploads will be a thing | 17:18 |
pabelanger | that will have a cost to it | 17:18 |
clarkb | not sure I understand the dns entries problem either. for images considering this is an appliance I assume nodepool isn't uploading those and you are just setting a uuid? | 17:18 |
clarkb | if so then you can use the image that is already there | 17:18 |
pabelanger | yah, appliance side we can reuse, but the controller node is managed by dib | 17:19 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text https://review.opendev.org/681532 | 17:19 |
clarkb | controller node can be launched on the other network and the limited network | 17:19 |
clarkb | but ya if using another provider that would be another image | 17:19 |
clarkb | maybe we need the concept of a subprovider which inherits images from its parent | 17:20 |
pabelanger | yah, quota is the bigger one honestly. need to share that between providers | 17:20 |
openstackgerrit | Clark Boylan proposed zuul/zuul master: Pass zuul_success to cleanup playbooks https://review.opendev.org/681552 | 17:21 |
pabelanger | okay, let me update the nodepool patch and add tests | 17:21 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text https://review.opendev.org/681532 | 17:21 |
pabelanger | then, maybe switch to vexxhost and create private network | 17:21 |
clarkb | corvus: 681552 passes zuul_success to cleanup playbooks | 17:21 |
pabelanger | then limit which VMs use it | 17:21 |
pabelanger | I can then loop back to other provider | 17:21 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: synchronize: add support for kubectl connection https://review.opendev.org/681553 | 17:22 |
zbr | corvus: i updated https://review.opendev.org/#/c/681532/ -- rephrased and included screenshots before/after, looks ok now? | 17:23 |
tristanC | zuul-maint: https://review.opendev.org/681553 integrates https://github.com/ansible/ansible/pull/62107 so that most zuul-jobs can run in kubernetes | 17:23 |
tristanC | it's not ideal, but i don't know how long it will take for Ansible to support synchronize with kubectl connection. Could someone ask at AnsibleFest? | 17:24 |
clarkb | I think we should be very careful about adding features to ansible that won't work outside of zuul (because people will expect playbooks to work in zuul and outside of zuul) | 17:25 |
clarkb | I think we can ask (I'll be at the dev day Monday | 17:25 |
zbr | tristanC: my experience with Ansible was that if you make a PR it will be reviewed and merged quite fast, or maybe I was just lucky. | 17:25 |
tristanC | clarkb: it's not specific to zuul, it just extends the synchronize connection support | 17:26 |
zbr | tristanC: there is only one ugly aspect of ansible: if you add a new feature, it will only go into the next release. | 17:26 |
zbr | but if you hurry up, there may even be time to slip things into 2.9, not sure. pabelanger probably knows better. | 17:27 |
zbr | i know this because I was upset that they refused my fix to enable "etc-hosts" for the docker_image module, which was missing; they accepted it only for 2.9 and did not want to add it to 2.8 because it counted as a "new feature". | 17:28 |
*** sshnaidm|rover is now known as sshnaidm|off | 17:28 | |
zbr | from my point of view it was a bug: failure to pass an argument to the docker-py module, but it always depends on which angle you see it from. | 17:28 |
zbr | someone's feature is someone else's bug | 17:29 |
clarkb | tristanC: you've imported the code from the PR into zuul right? and that PR hasn't merged. So if we merge that change it will be specific to zuul until that PR merges | 17:29 |
clarkb | and it will not be clear to users that have working kubectl rsync in zuul why it doesn't work with normal ansible | 17:30 |
clarkb | (all of the other zuul changes to ansible prevent actions from being taken so tasks that run outside of zuul should still work) | 17:30 |
pabelanger | zbr: tristanC won't ship in 2.9, too late for that sadly | 17:31 |
tristanC | clarkb: yes that's correct | 17:31 |
pabelanger | but, with 2.9 we can create our own zuul collection if wanted | 17:31 |
tristanC | zbr: pabelanger: it can wait for next or next+1; some feedback on it would be great though | 17:31 |
openstackgerrit | Paul Belanger proposed zuul/nodepool master: Disable interface_ip check, when host-key-checking is disable https://review.opendev.org/681544 | 17:32 |
*** spsurya has quit IRC | 17:32 | |
zbr | i will try to test it because I have a working local cluster and I wanted to deploy zuul locally anyway to learn more about it, so I should be able to help. | 17:32 |
tristanC | clarkb: i understand the concern, but on the other hand, without this we can't use most zuul-jobs on kubernetes | 17:33 |
zbr | tristanC: do you have a small playbook that should test that code? it could help me do the testing. | 17:34 |
tristanC | another solution would be to patch the zuul-jobs fetch roles to not do synchronize, but instead copy the artifacts to a known location and let the base job's synchronize pull them from the nodes | 17:34 |
tristanC | or perhaps we need a zuul-k8s-jobs with variants of the zuul-jobs roles that are known to work with kubectl | 17:35 |
pabelanger | tristanC: clarkb: the way to do this post 2.9 is a collection. We need to start thinking about supporting that in zuul, given most modules will likely be removed from ansible/ansible. That way, we could ship our own zuul_synchronize instead of patching ansible core functionality | 17:36 |
tristanC | zbr: ansible -m synchronize -a 'src=/tmp/dir dest=. mode=push' $pod-name | 17:36 |
clarkb | we only need to replace the pull from test node to executor right? because we can synchronize from executor to a filesystem? | 17:36 |
clarkb | in that case replacing synchronize between executor and test node seems like something to explore | 17:36 |
clarkb | pabelanger: I assume `pip install ansible` (what zuul runs) will get you some sort of usable system? However we can add zuul-ansible-bits to that pip install list too I suppose | 17:37 |
tristanC | clarkb: the zuul-jobs patch is to circumvent the oc rsync command that can't be executed on localhost from untrusted jobs | 17:37 |
pabelanger | clarkb: IIRC, new workflow is pip install ansible, galaxy install collection foobar | 17:37 |
clarkb | tristanC: right that goes into a trusted base job | 17:37 |
pabelanger | pip ansible will be super minimal | 17:37 |
pabelanger | eg: no openstack modules | 17:37 |
tristanC | clarkb: e.g. we can synchronize from a pod to the executor using oc rsync, but that requires a command | 17:37 |
clarkb | tristanC: that is the same with jobs running on openstack VMs (you can't run it from untrusted context) | 17:38 |
tristanC | clarkb: not sure to understand, you can run synchronize: mode=pull from a VM to the executor from untrusted context | 17:38 |
clarkb | oh right you just have to keep the destination in the working dir | 17:40 |
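For comparison, the pull that works from an untrusted context on VM nodes today looks roughly like this (paths follow the usual zuul-jobs shape; zuul.executor.log_root is the permitted destination):

    - name: Pull collected logs back to the executor
      synchronize:
        src: "{{ ansible_user_dir }}/zuul-output/logs/"
        dest: "{{ zuul.executor.log_root }}"
        mode: pull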
clarkb | why is exec'ing oc different than exec'ing rsync in this case? is it because it happens at the module level rather than command ? | 17:40 |
tristanC | that's because zuul authorizes synchronize to do so | 17:41 |
clarkb | got it | 17:42 |
tristanC | could we make a convention that jobs just need to copy their artifacts to a known location on the test instance, e.g. ~/job-logs? then we could have a generic "fetch-logs" role that runs from the base jobs, and it would be easy to make it work for both ssh and kubectl connections | 17:44 |
openstackgerrit | Merged zuul/zuul master: Fix timestamp race occurring on fast systems https://review.opendev.org/680937 | 17:44 |
clarkb | tristanC: I want to say that already exists but not all jobs do it | 17:46 |
clarkb | the conventional location exists I mean | 17:46 |
tristanC | clarkb: indeed there is fetch-output... so we could in theory patch the fetch-* roles to implement a "copy_output_locally" toggle to make them use ~/zuul-output instead | 17:49 |
corvus | tristanC, clarkb: yes, that was an idea that mordred worked on for a little bit, and then others carried on for a little bit, but no one has ever pushed it to completion. | 17:50 |
corvus | in general, the idea was to stop having jobs fetch things from remote nodes, and instead just put them in known directories on the remote nodes and have the base jobs do the copying back | 17:51 |
tristanC | clarkb: corvus: great, thanks for the suggestion, i'll work on that instead, that seems like the best system | 17:51 |
corvus | ++ | 17:51 |
corvus | tristanC: i think the 'fetch-output' role is the centerpiece of that. | 17:52 |
tristanC | corvus: yes, that seems like exactly what we need | 17:53 |
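Under that convention a job-side role stops synchronizing and just stages its files for fetch-output to collect later; a small sketch, with an illustrative artifact path:

    - name: Stage the tarball where fetch-output will collect it
      copy:
        src: "{{ zuul.project.src_dir }}/dist/example.tar.gz"   # illustrative
        dest: "{{ ansible_user_dir }}/zuul-output/artifacts/"
        remote_src: true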
*** nhicher has joined #zuul | 17:56 | |
*** electrofelix has quit IRC | 17:56 | |
*** jamesmcarthur has quit IRC | 18:06 | |
*** jamesmcarthur has joined #zuul | 18:07 | |
*** jamesmcarthur has quit IRC | 18:11 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 18:18 |
*** panda|ruck is now known as panda|ruck|off | 18:21 | |
*** hashar has quit IRC | 18:41 | |
pabelanger | clarkb: welp, I don't think the dedicated provider idea is going to work, because I need to launch 2 types of nodes, controller / appliance. So both are on the private network, but if I enable routes externally, both will get the private ip as interface_ip. | 19:20 |
pabelanger | I can't just move iosxr to the new provider, because I need the ability to do multinode across providers | 19:20 |
clarkb | pabelanger: and the executor isn't sufficient for the controller piece? | 19:21 |
pabelanger | right, we want to test ansible | 19:21 |
pabelanger | and new network connections | 19:21 |
pabelanger | I think we need an option to toggle routes externally via nodepool.yaml | 19:21 |
pabelanger | or allow multi node across provider / pools | 19:22 |
pabelanger | let me see why nodescan is using the private ip | 19:24 |
clarkb | actually this should work fine | 19:25 |
clarkb | what you need is two networks for the controller | 19:25 |
clarkb | one is the private network shared by the appliance the other is your public network | 19:25 |
clarkb | on the controller you say routes external as per normal | 19:25 |
clarkb | on the appliance you say routes external on the private network | 19:25 |
pabelanger | http://paste.openstack.org/raw/775155/ | 19:25 |
pabelanger | that is what I get on controller node | 19:25 |
pabelanger | which should be public / private | 19:26 |
pabelanger | I don't know why yet, it has private ip | 19:26 |
pabelanger | I would expect public | 19:26 |
pabelanger | https://github.com/ansible-network/windmill-config/blob/master/nodepool/nl01.sjc1.vexxhost.zuul.ansible.com.yaml#L173 is the new region | 19:27 |
pabelanger | clarkb: the part I might be missing, is how can I say routes externally, for different networks, per label | 19:27 |
pabelanger | https://github.com/ansible-network/windmill-config/blob/master/nodepool/clouds.yaml.j2#L28 is clouds.yaml | 19:28 |
clarkb | oh I see hrm | 19:28 |
clarkb | I wonder what happens if you tell nodepool to boot the appliance with only a routes_externally false network? | 19:29 |
clarkb | then you don't need different configs. Nodepool's fallback behavior may work in that case? (eg if no external network then use whatever is there?) | 19:29 |
clarkb | I don't know that it does that though | 19:29 |
pabelanger | that works, but no interface ip | 19:30 |
pabelanger | but nodepool doesn't like that | 19:30 |
pabelanger | without the patch above | 19:30 |
pabelanger | however, i am starting to not like that idea, and the inventory file for the appliance doesn't have ansible_host set | 19:31 |
pabelanger | because zuul gets that from interface_ip | 19:31 |
pabelanger | nodepool.private_ipv4 is set however | 19:31 |
clarkb | well that most accurately describes the setup you need | 19:31 |
clarkb | basically you have an unreachable instance on a network somewhere and it comes with a second node that acts as a bridge | 19:32 |
clarkb | in that case maybe we should fallback to "we have no better option so use the private ip" | 19:32 |
pabelanger | yah, I'd actually want the following inventory: https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_07/207/19fd14e960c2dbe048cc429c581f594d067252fe/check/ansible-network-iosxr-appliance/b78c8a9/zuul-info/inventory.yaml | 19:32 |
pabelanger | I was able to force that by manually changing clouds.yaml between boots | 19:33 |
pabelanger | today, in nodepool you cannot have a nodeset with one node's interface_ip public and the other's private | 19:33 |
pabelanger | that's really what I'm looking for | 19:33 |
clarkb | pabelanger: where does public ip come from in that inventroy? | 19:33 |
clarkb | there should be no public ip only a private ip in my example | 19:33 |
pabelanger | openstacksdk | 19:34 |
clarkb | (and you'd get no inventory_ip according to your message above) | 19:34 |
pabelanger | because I had routes external true | 19:34 |
clarkb | ah ok | 19:34 |
pabelanger | so, interface_ip on the controller node becomes private too | 19:34 |
pabelanger | then keyscan fails | 19:34 |
clarkb | pabelanger: another way of looking at this is that if you have no external network then zuul can't talk to it so having no ip in the zuul inventory is correct | 19:35 |
clarkb | pabelanger: then your job could generate a new inventory from the supplied private ip and use that in its nested ansible | 19:35 |
clarkb | ya so we want to disable keyscanning (whichI thought we already support) | 19:35 |
pabelanger | clarkb: yah, that way does work. but need https://review.opendev.org/681544/ | 19:35 |
pabelanger | if I apply that, I can do what you said | 19:36 |
corvus | remember also that if the executor can't talk to it (ie, during the pre playbooks), then it's going to be unhappy, so we may not want it in the inventory at all. if that's the case, we may want to look into treating it like a "resource" (ie, what we do for k8s) rather than an item in the inventory. | 19:36 |
pabelanger | then deal with missing ansible_host info via playbooks | 19:36 |
clarkb | corvus: ++ I think the nested ansible (or other job content) needs to figure out how to interact with it rather than zuul | 19:36 |
pabelanger | corvus: yup, today that is what we do: if the node is in the appliance group, no pre-run or run playbooks use it, via hosts: all:!appliance | 19:37 |
pabelanger | then use your write-inventory role to add it to the 1st node's /etc/ansible folder, and control it from there | 19:38 |
pabelanger | zuul-executor doesn't touch it | 19:38 |
pabelanger | (appliance) | 19:38 |
pabelanger | if we didn't have to test ansible, I think we'd be fine with zuul-executor using it directly | 19:39 |
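A sketch of handing the appliance's private address to the nested ansible from the primary node (group name and destination are illustrative; nodepool.private_ipv4 is the hostvar mentioned above):

    - hosts: primary
      tasks:
        - name: Write an inventory pointing at the appliance's private IP
          copy:
            dest: /etc/ansible/hosts
            content: |
              appliance ansible_host={{ hostvars['appliance'].nodepool.private_ipv4 }}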
*** panda|ruck|off has quit IRC | 19:55 | |
*** panda has joined #zuul | 19:57 | |
*** jamesmcarthur has joined #zuul | 20:06 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add enqueue reporter action https://review.opendev.org/681132 | 20:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add no-jobs reporter action https://review.opendev.org/681278 | 20:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add report time to item model https://review.opendev.org/681323 | 20:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add Item.formatStatusUrl https://review.opendev.org/681324 | 20:15 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 20:15 |
pabelanger | clarkb: corvus: Shrews: I've confirmed https://review.opendev.org/681544/ does allow the node to come online properly now, outside of the routes externally issue, if you could review when possible | 20:28 |
pabelanger | also, if you have ideas how to test this in nodepool, aside from updating the devstack test, I can add those tests | 20:28 |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: Also include nodepool inventory variables https://review.opendev.org/681601 | 20:39 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: fetch-javascript-tarball: introduce zuul_do_synchronize https://review.opendev.org/681603 | 20:41 |
tristanC | clarkb: corvus: should i propose something similar to https://review.opendev.org/681603 for all the other affected fetch roles? | 20:41 |
corvus | tristanC: yeah -- though maybe call it 'zuul_use_fetch_output' with a default value of '{{ zuul_site_use_fetch_output }}' so the ux is that someone sets a site variable that says "my base job uses fetch-output" ? | 20:43 |
corvus | (and i guess that's inverting the boolean, so, default of false instead of true) | 20:45 |
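A sketch of that wiring in a fetch role's defaults and tasks (zuul_site_use_fetch_output is the variable name corvus proposes here, not an existing one):

    # defaults/main.yaml (illustrative)
    zuul_use_fetch_output: "{{ zuul_site_use_fetch_output | default(false) }}"

    # tasks/main.yaml (illustrative)
    - name: Copy the tarball into zuul-output instead of synchronizing
      copy:
        src: "{{ zuul.project.src_dir }}/example.tar.gz"
        dest: "{{ ansible_user_dir }}/zuul-output/artifacts/"
        remote_src: true
      when: zuul_use_fetch_output | bool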
tristanC | corvus: good idea, thanks | 20:54 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: fetch-javascript-tarball: introduce zuul_do_synchronize https://review.opendev.org/681603 | 21:02 |
*** jamesmcarthur has quit IRC | 21:20 | |
*** avass has quit IRC | 21:32 | |
*** saneax has quit IRC | 21:57 | |
*** panda has quit IRC | 22:00 | |
*** panda has joined #zuul | 22:03 | |
*** rlandy is now known as rlandy|bbl | 22:24 | |
*** threestrands has joined #zuul | 22:37 | |
corvus | does anyone have suggestions as to how to convince sphinx to supply more information than this: Warning, treated as error: | 22:49 |
corvus | :1: (ERROR/3) Unknown interpreted text role "class". | 22:49 |
corvus | (that is *literally* all it's telling me. no idea even what file is triggering it. still happens without any changes to .rst files) | 22:50 |
fungi | no clue. did you add some text with a role named "class"? | 22:51 |
corvus | fungi: i didn't -- but even when i revert out the changes to rst files it still happens | 22:51 |
fungi | okay, so something new. yeah line numbers would be nice :/ | 22:51 |
corvus | somehow it seems that changing the python code has caused this? i also haven't changed any docstrings. | 22:51 |
corvus | if i'm parsing the error string correctly, that's: "line number 1 in the empty file" | 22:52 |
fungi | argh | 22:52 |
fungi | is this when trying to locally do `tox -e docs` on the zuul repo? | 22:53 |
corvus | yep, with 680778 checked out | 22:53 |
fungi | seems i need a full set of ansible build dependencies installed to build zuul docs | 22:59 |
corvus | the client doc pages run "zuul" to get the help output | 23:00 |
SpamapS | maybe an upstream library changed that flubs the scraped output? | 23:01 |
pabelanger | okay, super ugly, but I managed to get the job working: https://github.com/ansible/ansible-zuul-jobs/pull/207 | 23:03 |
corvus | SpamapS: if i checkout the change ahead of it it works; so i think it's something about 680778 setting it off | 23:03 |
SpamapS | ah that's handy | 23:04 |
pabelanger | the appliance comes online with a non-routable public IP, but the controller can access it. I need to figure out a better way of updating hostvars with the right private IP, so I can use the write-inventory role. | 23:05 |
pabelanger | Used the replace filter, but really want to replace I think | 23:05 |
fungi | okay, i've gotten my dev env to the point where i can reproduce the opaque sphinx error with change 680778 | 23:08 |
corvus | fungi, SpamapS: okay, by reverting each chunk of the patch in 680778 one at a time and running sphinx, i've found it's related to the addition of the GerritPoller class in gerritconnection.py | 23:08 |
SpamapS | hah I did that too and was about to say that.. ;) | 23:09 |
SpamapS | Luckily it was the 3rd hunk. ;) | 23:10 |
SpamapS | You can also just not mention that class, and it will build correctly. | 23:10 |
corvus | i don't think it's referenced in the docs | 23:11 |
SpamapS | hm, you're right | 23:12 |
*** saneax has joined #zuul | 23:12 | |
SpamapS | but I can also get the docs to build if I revert doc/source/admin/drivers/gerrit.rst | 23:12 |
SpamapS | oh wait no, they just get further | 23:13 |
SpamapS | corvus:the problem is GerritPoller needs a """ """ instead of # | 23:13 |
corvus | it's specifically the line "poller_class = GerritPoller" setting a class variable in GerritConnection | 23:13 |
SpamapS | oh hah every time I think I find it it explodes more | 23:14 |
SpamapS | in different ways.. derp | 23:14 |
SpamapS | yeah | 23:14 |
corvus | okay, i suspect the linkage here is that the testing doc does document FakeGerritConnection, which links to GerritConnection (though that is not documented); but that might be enough to get the sphinx autodoc stuff examining that class | 23:16 |
corvus | why it's treating that instance variable that way is still a mystery | 23:16 |
* SpamapS puts $4.20 on it being some bonghits python parsing corner case | 23:16 | |
*** sanjayu_ has joined #zuul | 23:18 | |
*** igordc has quit IRC | 23:18 | |
corvus | huh, i assumed the "class" in "poller_class" was the "class" it was referring to, but no -- even if i change that varname to "poller_thingy" it still barfs | 23:19 |
fungi | even commenting out the line entirely still breaks | 23:20 |
*** saneax has quit IRC | 23:20 | |
corvus | i'm able to fix it by commenting that line out | 23:21 |
fungi | maybe i'm commenting out a different line | 23:23 |
corvus | fungi: in change 680778 it's gerritconnection.py line 363 | 23:23 |
fungi | also possible i needed to git clean between runs | 23:23 |
corvus | okay, new hypothesis -- testing.rst autodocs FakeGerritConnection with "inherited-members" so it picks up stuff from GerritConnection, which has a class variable that points to another class and sphinx can't handle that. | 23:25 |
fungi | with that line commented out (same line number/file) it breaks on me with the above error | 23:25 |
fungi | maybe there's more you've also commented out? | 23:25 |
corvus | (i explicitly tested reparenting that class from threading.Thread to object to exclude the hypothesis that it's the threading class that's weird) | 23:25 |
corvus | fungi: almost certainly. let me reset to that and see. | 23:25 |
SpamapS | that actually makes sense given the error message | 23:26 |
SpamapS | It's probably looking for `str` or `bytes` or `unicode` and it's getting `class` back from `type(thatvar)` | 23:26 |
fungi | fwiw, a web search on that error mostly shows commits/discussions about developers silencing it | 23:27 |
fungi | though i never found a great explanation of why it was turning up in their cases, this explanation does make sense | 23:27 |
fungi | sphinx basically has a hard-coded list of types it expects | 23:27 |
corvus | fungi: yep, i also had removed it in base.py -- so essentially the error shows up in 2 places | 23:28 |
fungi | i wouldn't be surprised if class objects as classvars confuse it | 23:28 |
corvus | if we remove *both* of those lines it fixes it | 23:28 |
fungi | yep, confirmed | 23:30 |
fungi | i wonder if there's a way to mark variables so that autodoc will skip them | 23:32 |
corvus | well, one way to do that is to prefix them with an '_', which is a perfectly acceptable solution in this case so i think i'll go with that :) | 23:32 |
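In code form, the workaround is just a rename (class names as discussed above); autodoc skips leading-underscore attributes, so the inherited-members pass no longer trips over a class object stored as a class attribute:

    class GerritConnection(BaseConnection):
        # was: poller_class = GerritPoller
        _poller_class = GerritPoller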
*** jamesmcarthur has joined #zuul | 23:32 | |
*** igordc has joined #zuul | 23:33 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 23:34 |
corvus | fungi, SpamapS: thanks! that was "fun" :) | 23:34 |
fungi | indeed, that's a reasonable workaround i guess | 23:37 |
fungi | it's possible https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#skipping-members could be used as an alternative, but the simple solution seems fine | 23:40 |
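For reference, a sketch of that autodoc-skip-member alternative using the documented hook signature (the skip predicate here is one possible choice):

    # doc/source/conf.py (illustrative)
    def skip_class_valued_attrs(app, what, name, obj, skip, options):
        # skip any documented attribute whose value is itself a class
        if isinstance(obj, type):
            return True
        return skip

    def setup(app):
        app.connect('autodoc-skip-member', skip_class_valued_attrs)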
*** igordc has quit IRC | 23:40 | |
*** igordc has joined #zuul | 23:42 | |
*** jamesmcarthur has quit IRC | 23:46 | |
*** jamesmcarthur has joined #zuul | 23:46 | |
SpamapS | corvus: at least you didn't have to print out the hierarchy this time. ;) | 23:48 |
*** jamesmcarthur has quit IRC | 23:51 | |
*** sanjayu_ has quit IRC | 23:58 |