Wednesday, 2021-06-30

clarkbcorvus: ok a few comments on https://review.opendev.org/c/zuul/zuul/+/788988 but I +2'd it00:20
clarkbthe majority are testing things. The other two are not critical at this stage I don't think. Just callouts for things that might need changing as we move forward00:20
clarkbbut please double check00:20
clarkband now I must figure out dinner00:21
opendevreviewJames E. Blair proposed zuul/zuul master: Remove ExecutorApi.update() call from tests  https://review.opendev.org/c/zuul/zuul/+/79877800:39
corvusclarkb: thanks!00:39
pabelanger[m]So we've found a regression in 4.6.0 I think02:33
pabelanger[m]related to https://opendev.org/zuul/zuul/commit/be50a6ca42c41c0608dd02930a01123afd4e606402:33
pabelanger[m]the runAnsibleFreeze function doesn't take into account an inventory file02:34
pabelanger[m]where an ssh connection is not SSH02:34
pabelanger[m]we have images that set https://zuul-ci.org/docs/nodepool/openstack.html#attr-providers.[openstack].diskimages.connection-type to network-cli02:35
pabelanger[m]but now we get timeout while scanning for facts02:35
pabelanger[m]https://pastebin.com/raw/cgR6H4JS02:36
pabelanger[m]the change to https://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L794 is why02:49
pabelanger[m]we properly skip them for setup, but no longer properly filter it on freeze playbook02:50
opendevreviewPaul Belanger proposed zuul/zuul master: Don't use blacklisted connections for freeze playbook  https://review.opendev.org/c/zuul/zuul/+/79878403:03
pabelanger[m]corvus: tobiash: clarkb: tristanC: ^my attempt to fix broken freeze jobs for zuul.a.c03:05
pabelanger[m]happy to make the code better if people suggest another way03:05
*** sshnaidm is now known as sshnaidm|afk04:39
*** marios is now known as marios|ruck05:08
*** marios is now known as marios|ruck07:16
*** jpena|off is now known as jpena07:39
*** marios is now known as marios|ruck08:09
opendevreviewBenjamin Schanzel proposed zuul/zuul-jobs master: Fix a bug in s3 log uploader with .gz files  https://review.opendev.org/c/zuul/zuul-jobs/+/79880208:15
opendevreviewSimon Westphahl proposed zuul/zuul master: Stop jobs on gearman disconnect  https://review.opendev.org/c/zuul/zuul/+/71472209:35
*** sshnaidm|afk is now known as sshnaidm10:53
*** jpena is now known as jpena|lunch11:27
*** marios|ruck is now known as marios|ruck|call12:01
*** jpena|lunch is now known as jpena12:27
*** marios|ruck|call is now known as marios|ruck12:57
corvuspabelanger: can we set gather_facts to false in the freeze playbook instead?13:45
pabelanger[m]I am trying to remember which host set_fact run on14:14
pabelanger[m]because we cannot access the node via SSH connection right now14:15
corvuspabelanger: can you test that to help confirm it works?15:56
corvuspabelanger: hrm, it looks like we don't gather facts in the setup playbook?16:03
*** marios|ruck is now known as marios|out16:12
opendevreviewMerged zuul/zuul master: Put zuul vars back in debug inventory.yaml  https://review.opendev.org/c/zuul/zuul/+/79809216:24
corvuspabelanger: i'm confused -- why can't network_cli handle fact gathering?16:48
*** jpena is now known as jpena|off16:49
corvusi just tested kubectl, and it does... which makes me wonder why we don't run the setup module for that16:49
corvusah, we don't run the setup module for that because we don't put it in the inventory at all16:51
pabelanger[m]it can, but it is a blacklisted connection. So up until now, we would not scan the nodes16:51
pabelanger[m]in these nodes right now, don't have interfaces online16:51
corvuspabelanger: so this isn't an issue with network_cli -- this issue is that you're adding nodes to the inventory which aren't online?16:52
corvuspabelanger: and i guess you do something out of band to make them be online?16:53
corvus(also, i was wrong, we do add kubectl nodes to the inventory; so it's unclear to me why we don't run setup on them)16:54
pabelanger[m]yes, we do https://github.com/ansible-network/windmill-config/blob/master/nodepool/nl01.sjc1.vexxhost.zuul.ansible.com.yaml#L347 which means we don't keyscan the node, and setup playbook would not run. Then we have a pre-run playbook that will scan the private IP and setup our public interface properly: https://github.com/ansible/ansible-zuul-jobs/blob/master/playbooks/ansible-network-appliance-base/pre.yaml#L1916:56
pabelanger[m]but now, we have no way to skip the job freeze playbook16:56
pabelanger[m]from zuul-executor we do not access these nodes16:56
pabelanger[m]only from the controller node16:56
corvuspabelanger: okay, that sounds like a pretty unusual situation, and tbh, i think the support in zuul needs to change a bit.  because if it's the case that in normal circumstances, a networkcli host can actually run setup or gatherfacts, that's probably what zuul should be doing.  you should probably rework both zuul and your use of it to use the "resources" system that is used for k8s namespaces for what you're doing (that's a concept of "here is17:02
corvusa resource from nodepool that zuul shouldn't use directly, but the job will").17:02
corvuspabelanger: that's a long-term issue, but it would be great if you're interested in working on that17:03
corvuspabelanger: for the short term, my guess is you may be the only user in the world using networkcli, so if you are okay with basically losing access to all job variables for networkcli hosts, i think we could merge your patch as-is17:03
corvuspabelanger: it sounds like if you're not using the hosts from zuul anyway, then you're probably not going to care about not having job vars17:04
avass[m]Does anyone know if it's possible to stop element from displaying the homeserver part of usersnames? Irc style layout doesn't really handle that very well17:06
pabelanger[m]corvus: yes, we only force network_cli today, because it is a blacklisted connection. It seems now, we will support it for job freeze playbooks, is that right?17:06
pabelanger[m]yes, I don't believe we use any host vars today on these nodes17:06
corvuspabelanger: can you update your patch with a release note?17:08
pabelanger[m]yes, I can do that shortly17:09
fungiavass[m]: my guess is that would lead to ambiguity if two users at different homeservers had the same local usernames17:09
corvusavass: it may only be doing it for folks who have a collision; have an example?17:10
avass[m]fungi : I could live with that not being completely unambiguous in the chat since it could also be possible to mouseover/open the user profile17:10
* avass[m] uploaded an image: (9KiB) < https://matrix.org/_matrix/media/r0/download/vassast.org/dvDfKPpmekrVlWvnSaxfAxBU/image.png >17:10
avass[m]corvus: yes: :)17:10
corvushrm, that's different than what i see.17:11
avass[m]maybe a version difference?17:11
* corvus uploaded an image: (8KiB) < https://matrix.org/_matrix/media/r0/download/acmegating.com/rnGvemIrLORJUeYmfbqlvRQn/image.png >17:11
corvusyeah maybe17:11
avass[m]oh, it's only on my windows machine...17:12
corvusElement version: 3a67dc18e9e3-react-3a67dc18e9e3-js-3a67dc18e9e3  lol that's helpful17:12
fungiblame windows17:12
clarkbthe collision between my irc and matrix names is useful bceaus matrix sends me notifications even if I primarily consume this channel via irc :)17:13
avass[m]Version: 1.7.31 on windows, but 1.7.20 is fine on manjaro. But I'm gonna guess it's a windows thing17:13
corvusavass: using element desktop vs browser?17:15
avass[m]corvus: desktop on all three machines, but I think it's an electron app anyway17:15
corvusavass: yeah, but i'm wondering how os makes a difference17:16
avass[m]corvus: that's a good question in that case, maybe it has different css engines?17:17
avass[m]or something like that17:17
corvuspabelanger: wait a minute -- i think i just now understood that you're saying you're not using network_cli at all -- you just found a connection type in zuul that happened to not run the setup playbook and you used that?17:19
corvuspabelanger: so we have no idea how network_cli behaves with the freeze playbook -- it could work for all we know, and your change could break it?17:20
pabelanger[m]yes, that is right17:21
pabelanger[m]in 3.0 we didn't support network_cli as a connection type17:21
pabelanger[m]because of security reasons17:21
pabelanger[m]but now, it seems freeze playbook doesn't restrict connections17:21
corvuspabelanger: 3.x would use network_cli connections in playbooks, it only didn't run setup17:22
corvusalso 4.x through 4.5.117:22
pabelanger[m]maybe I am not understanding the idea of BLACKLISTED_ANSIBLE_CONNECTION_TYPES17:23
pabelanger[m]I don't think we ever got network_cli to work from zuul-executor in 3.017:23
pabelanger[m]if we could, that would actually save a lot of work for us17:23
pabelanger[m]however, it's been a while since I've had to think about this process17:24
corvuspabelanger: this is where it was added: https://review.opendev.org/53026517:24
pabelanger[m]yes, that is true. Only recently gather facts for network_cli support was added17:25
pabelanger[m]we also have specific facts gathers, like vyos_facts or ios_facts. I need to check if gather_facts will map to them in 2.8 and 2.9 ansible17:26
corvuspabelanger: okay, i'm hesitant to change the behavior for network_cli here -- but i left a suggestion on https://review.opendev.org/798784 for what i think you can do, and i think it's about the same amount of work (basically, add a new resource type).  you'll want to add a test for it too, and a release note.17:28
pabelanger[m]corvus: gather_facts true on network_cli for 2.8 wouldn't work, but should be okay in 2.9: https://docs.ansible.com/ansible/devel/porting_guides/porting_guide_2.9.html#improved-gather-facts-support-for-network-devices17:30
pabelanger[m]so it depends on the version of ansible zuul jobs use17:30
pabelanger[m]and network_cli is only used for network appliances17:31
pabelanger[m]so you are likely right we are the only ones doin git17:31
pabelanger[m]it*17:31
corvuspabelanger: okay, i think it would also be okay to avoid running freeze on 2.8 in that case, but we should run it on 2.917:31
pabelanger[m]yes17:32
pabelanger[m]that still means, we have an issue of trying to scan a node that isn't online yet17:32
pabelanger[m]and not sure how to deal with that17:32
corvuspabelanger: yeah, i think the best solution for your case is the new resource type17:33
pabelanger[m]Hmm, okay. I was hoping to avoid writing a new feature to support this18:01
opendevreviewJames E. Blair proposed zuul/zuul master: Add some jitter to apscheduler interval cleanup jobs  https://review.opendev.org/c/zuul/zuul/+/79895320:20
opendevreviewJames E. Blair proposed zuul/zuul master: Add comment for test method  https://review.opendev.org/c/zuul/zuul/+/79895420:20
clarkbI just want to say kudos to everyone who has already reviewed the zookeeper v5 changes. They are often not short and also dense on details21:36
fungiand double-plus thanks from me as i have not found time to review any of them21:39
opendevreviewMerged zuul/zuul master: Add ExecutorApi  https://review.opendev.org/c/zuul/zuul/+/77090222:58
opendevreviewMerged zuul/zuul master: Change zone handling in ExecutorApi  https://review.opendev.org/c/zuul/zuul/+/78783322:59
opendevreviewMerged zuul/zuul master: Switch to string constants in BuildRequest  https://review.opendev.org/c/zuul/zuul/+/79184923:04
opendevreviewMerged zuul/zuul master: Clean up Executor API build request locking and add tests  https://review.opendev.org/c/zuul/zuul/+/78862423:08
opendevreviewMerged zuul/zuul master: Fix race with watches in ExecutorAPI  https://review.opendev.org/c/zuul/zuul/+/79230023:09
clarkbcorvus: I finally got through https://review.opendev.org/c/zuul/zuul/+/784195 can you check my comments on it?23:18
corvuslookin23:19
corvusclarkb: thanks; i provided an explanation on #1 and revised my vote to a -1 for #223:30
clarkbcool. I'll take a look once I've gotten through the next in the list23:30
opendevreviewJames E. Blair proposed zuul/zuul master: Simplify queue traversal of fileschangescompletedevent processing  https://review.opendev.org/c/zuul/zuul/+/79896623:51
opendevreviewJames E. Blair proposed zuul/zuul master: Switch from global to tenant event queues  https://review.opendev.org/c/zuul/zuul/+/79744023:53
opendevreviewJames E. Blair proposed zuul/zuul master: Remove unused addResultEvent method  https://review.opendev.org/c/zuul/zuul/+/79754223:53
opendevreviewJames E. Blair proposed zuul/zuul master: Lock/unlock nodes on executor server  https://review.opendev.org/c/zuul/zuul/+/77461023:53
opendevreviewJames E. Blair proposed zuul/zuul master: Remove ExecutorApi.update() call from tests  https://review.opendev.org/c/zuul/zuul/+/79877823:53
corvusclarkb: ^ that should address that; rebase incoming23:53
opendevreviewJames E. Blair proposed zuul/zuul master: Add some jitter to apscheduler interval cleanup jobs  https://review.opendev.org/c/zuul/zuul/+/79895323:53
opendevreviewJames E. Blair proposed zuul/zuul master: Add comment for test method  https://review.opendev.org/c/zuul/zuul/+/79895423:53

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!