Friday, 2023-06-09

@yoctozepto:matrix.orgmorning Zuulers; any preference regarding https://lists.zuul-ci.org/archives/list/zuul-discuss@lists.zuul-ci.org/thread/WUWBM5F3PXXDLKK6JNSP4UR4VTWDNPZ4/ ? I do not know how you handle such decisions to be honest...06:34
-@gerrit:opendev.org- Flavio Percoco Premoli proposed: [zuul/zuul] 885455: Use built-in URL data type instead of custom parse https://review.opendev.org/c/zuul/zuul/+/885455 06:36
@muneerefx:matrix.org Hi all, I am muneer and new to this technology 06:43
@flaper87:matrix.org> <@yoctozepto:matrix.org> morning Zuulers; any preference regarding https://lists.zuul-ci.org/archives/list/zuul-discuss@lists.zuul-ci.org/thread/WUWBM5F3PXXDLKK6JNSP4UR4VTWDNPZ4/ ? I do not know how you handle such decisions to be honest...06:44
I'm not on the mailing list (yet?) so dropping a comment here. I think it's safe, at this point, to drop support for Helm 2 (assuming nothing else is using it). Helm 3 has been around for what feels like ages and it's been a long while since I ran into a Helm 2-only chart. $0.02
@yoctozepto:matrix.orgthanks flaper87 my point exactly :-) 06:45
@yoctozepto:matrix.organd I recommend you join the mailing list; it's good for async discussions06:45
@yoctozepto:matrix.orgalthough this time I received no reply :D 06:46
@yoctozepto:matrix.organd I seem to be impatient...06:46
@yoctozepto:matrix.org:D 06:46
@muneerefx:matrix.organy video for learning zuul07:08
@flaper87:matrix.org> <@muneerefx:matrix.org> any video for learning zuul07:10
I think the `docker-compose.yaml` in the examples dir is quite useful. I'd highly recommend going through the tutorial: https://zuul-ci.org/docs/zuul/latest/tutorials/quick-start.html
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 885423: Clear tree cache queues on disconnect https://review.opendev.org/c/zuul/nodepool/+/885423 07:11
@yoctozepto:matrix.orgmuneerefx: it also depends on whether you want to run/maintain/deploy/administer zuul or "just use" zuul maintained by somebody else07:13
@jjbeckman:matrix.orgHi folks,07:44
I've done a little troubleshooting regarding my problem where simple jobs (e.g. `echo foo`) are taking over 30 seconds in my Kubernetes-based Zuul setup.
After adding `-v=9` to this `kubectl port-forward` command, the 30+ second delay mysteriously went away. Removing `-v=9` brings the issue back.
https://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L412
This does not make sense to me yet, but could a misbehaving `kubectl port-forward` (which, as I understand it, is required to stream logs to the web UI) be the reason for this 30+ second delay?
By the way, the web console has never worked for me; I just get multiple lines of `--- END OF STREAM ---`.
@avass:matrix.vassast.org> <@jjbeckman:matrix.org> Hi folks,08:10
>
> I've done a little troubleshooting regarding my problem where simple jobs (e.g. `echo foo`) are taking over 30 seconds in my Kubernetes-based Zuul setup.
>
> After adding `-v=9` to this `kubectl port-forward` command, the 30+ second delay mysteriously went away. Removing `-v=9` brings the issue back.
> https://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L412
>
> This does not make sense to me yet, but could a misbehaving `kubectl port-forward` (which, as I understand it, is required to stream logs to the web UI) be the reason for this 30+ second delay?
>
> By the way, the web console has never worked for me; I just get multiple lines of `--- END OF STREAM ---`.
Sounds like an issue with the zuul executor not being able to connect to the log streaming port
@avass:matrix.vassast.org Port 7900 08:11
https://zuul-ci.org/docs/zuul/4.3.0/discussion/components.html#overview
Not exactly sure how that works with kubernetes
@avass:matrix.vassast.org Did you get logs when you increased verbosity on kubectl port-forward, or did that only remove the 30s timeout? 08:14
@jjbeckman:matrix.org Hi Albin, thanks for the advice. I see, let me look into whether the executor is unable to connect to port 7900. 08:16
> Did you get logs when you increased verbosity on kubectl port-forward, or did that only remove the 30s timeout?
I wasn't able to tell any difference in the executor logs. Just that the jobs were completing in seconds, rather than 30+ seconds... which I know, doesn't make sense.
@avass:matrix.vassast.orgSo it's not only logs in zuul-web, but the actual job in the zuul-executor takes 30 seconds?08:18
@jjbeckman:matrix.org> So it's not only logs in zuul-web, but the actual job in the zuul-executor takes 30 seconds?08:19
Yes, the duration shown in zuul-web and the duration shown in the executor logs match.
@avass:matrix.vassast.org In any case my thought is that some part times out, likely because of a lack of logs, and increasing verbosity produces some log output that gets past a step that is waiting for logs somewhere 08:20
@jjbeckman:matrix.orgHere is an example.08:21
```
2023-05-31 07:19:58,953 DEBUG zuul.AnsibleJob.output: [e: 60cdeba0-ff83-11ed-8b5c-6d2bd0e764a2] [build: 0a404664fcc746308cd49be128bfe325] Ansible output: b'TASK [Test] ********************************************************************'
2023-05-31 07:20:30,877 DEBUG zuul.AnsibleJob.output: [e: 60cdeba0-ff83-11ed-8b5c-6d2bd0e764a2] [build: 0a404664fcc746308cd49be128bfe325] Ansible output: b'ok: [debian-bullseye] => {"changed": false, "cmd": ["echo", "foo"], "delta": "0:00:00.005171", "end": "2023-05-31 07:20:00.479927", "msg": "", "rc": 0, "start": "2023-05-31 07:20:00.474756", "stderr": "", "stderr_lines": [], "stdout": "foo", "stdout_lines": ["foo"], "zuul_log_id": "1a9f276b-d811-b3b3-b464-00000000000c-1-debianbullseye"}'
```
As you can see, "Test" takes 32 seconds to complete, but `delta` (on the second log line) is only 0.005171 seconds. Something that doesn't appear in the logs is happening...
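A minimal sketch (not from the conversation itself) of the arithmetic behind that observation. The two timestamps and the `delta` value are copied from the log lines above; the helper names `parse_log_ts` and `parse_delta` are hypothetical.

```python
from datetime import datetime, timedelta

# Timestamp layout used at the start of the executor log lines above.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_ts(text: str) -> datetime:
    """Parse an executor log timestamp such as '2023-05-31 07:19:58,953'."""
    return datetime.strptime(text, LOG_TS_FORMAT)


def parse_delta(text: str) -> timedelta:
    """Parse an Ansible 'delta' string such as '0:00:00.005171'."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# Values copied from the two log lines in the example.
task_started = parse_log_ts("2023-05-31 07:19:58,953")
result_logged = parse_log_ts("2023-05-31 07:20:30,877")
command_delta = parse_delta("0:00:00.005171")

wall_clock = result_logged - task_started
unexplained = wall_clock - command_delta

print(f"wall clock between log lines: {wall_clock.total_seconds():.3f} s")
print(f"command runtime (delta):      {command_delta.total_seconds():.6f} s")
print(f"time not accounted for:       {unexplained.total_seconds():.3f} s")
```

Almost all of the roughly 32 seconds falls outside the command itself, which matches the point being made in the message.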
@jjbeckman:matrix.org> In any case my thought is that some part times out, likely because of a lack of logs, and increasing verbosity creates some kind of logs which gets past the step of waiting for logs somewhere 08:22
I see... I guess that's a possibility...
@avass:matrix.vassast.org In any case here's a link to the Ansible callback responsible for log streaming: https://opendev.org/zuul/zuul/src/commit/2bdc98b6d3f0c565813e6f2d234866539ba7337a/zuul/ansible/base/callback/zuul_stream.py#L344 08:27
@jjbeckman:matrix.orgThanks a lot Albin! Let me look into what you have shared.08:29
@flaper87:matrix.orgjjbeckman: I recently went through an issue with the console. The solution was to make sure it is the *very first* task you run. It's got to be the very first one so it can start and the executor can run the `port-forward` to connect to it. If, for whatever reason, the `port-forward` is run before the console starts, then you won't get anything on the console08:43
@jjbeckman:matrix.orgHi flaper87, thanks for your advice.09:15
> The solution was to make sure it is the very first task you run.
I'm a bit confused with this bit though. Zuul automatically configures Ansible to execute `kubectl port-forward` as far as I can see by reading the source code, and as a user, I am not able to specify where to execute it in the playbooks. Hope that makes sense...
@jjbeckman:matrix.orgUnsure if related, but I confirmed that from `fingergw`, accessing `executor:7900` takes exactly 10 seconds. Really slow.09:16
```
root@zuul-fingergw-795bb7884c-fsvt7:/# time openssl s_client -connect zuul-executor:7900
CONNECTED(00000003)
...
---
real 0m10.010s
user 0m0.005s
sys 0m0.000s
```
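A minimal sketch (not from the conversation itself) of one way to see where such a delay sits, assuming it is either in name resolution or in the TCP connect. `time_connect` is a hypothetical helper; unlike `openssl s_client` it does no TLS handshake, so it only separates the DNS lookup from the TCP connection setup.

```python
import socket
import time


def time_connect(host: str, port: int, timeout: float = 15.0) -> tuple[float, float]:
    """Return (resolve_seconds, connect_seconds) for a TCP connection to host:port.

    Name resolution and the TCP connect are timed separately so that a slow
    DNS lookup can be told apart from a slow or filtered connection.
    """
    t0 = time.monotonic()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    t1 = time.monotonic()

    # Use the first address returned by the resolver.
    family, socktype, proto, _canonname, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
    t2 = time.monotonic()

    return t1 - t0, t2 - t1
```

From a pod such as the fingergw one, `time_connect("zuul-executor", 7900)` would be the equivalent of the check above; the host name and port are the ones used in the `openssl` command.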
-@gerrit:opendev.org- Tobias Henkel proposed: [zuul/nodepool] 885736: Fix typo that crashes playback worker when under load https://review.opendev.org/c/zuul/nodepool/+/885736 09:18
@flaper87:matrix.orgmmh, the executor should execute the port-forward on its own. 09:35
@flaper87:matrix.org https://opendev.org/zuul/zuul/src/commit/0bd76048d10e12d1c914b199582f46f12fd3f732/zuul/executor/server.py#L414 09:36
@flaper87:matrix.orggrep for `forward` in the executor logs to see if there was an error while trying to run it09:37
@jjbeckman:matrix.orgYes, hence my confusion with the advice "make sure it is the first task". How can I change the behavior of what is built in to Zuul?09:38
@jjbeckman:matrix.org> grep for forward in the executor logs to see if there was an error while trying to run it09:38
```
2023-06-09 09:26:33,083 INFO zuul.ExecutorServer: [e: b0926af0-06a7-11ee-8b60-c4887a70d41f] [build: 0eb0ad20e3db465e9edb43a4e955696d] Started Kubectl port forward on port 46709
2023-06-09 09:27:33,274 DEBUG zuul.ExecutorServer: [e: b0926af0-06a7-11ee-8b60-c4887a70d41f] [build: 0eb0ad20e3db465e9edb43a4e955696d] Rest of kubectl port forward output was: Forwarding from [::1]:46709 -> 19885
2023-06-09 09:27:33,274 DEBUG zuul.ExecutorServer: E0609 09:26:41.638892 6409 portforward.go:406] an error occurred forwarding 46709 -> 19885: error forwarding port 19885 to pod 60a688d104bccf842deab568632e4497b36f632f297b85ee21a417f8289bc206, uid : failed to execute portforward in network namespace "/var/run/netns/cni-dcf196fd-bc77-2c54-096d-c417f951d682": failed to connect to localhost:19885 inside namespace "60a688d104bccf842deab568632e4497b36f632f297b85ee21a417f8289bc206", IPv4: dial tcp4 127.0.0.1:19885: connect: connection refused IPv6 dial tcp6: address localhost: no suitable address found
2023-06-09 09:27:33,274 DEBUG zuul.ExecutorServer: E0609 09:26:41.639241 6409 portforward.go:234] lost connection to pod
2023-06-09 09:28:03,250 INFO zuul.ExecutorServer: [e: e9b9097e-06a7-11ee-9712-dda02c6cdd52] [build: 5ca2a2417409418f9a9b16267b8bd910] Started Kubectl port forward on port 33573
2023-06-09 09:29:03,740 DEBUG zuul.ExecutorServer: [e: e9b9097e-06a7-11ee-9712-dda02c6cdd52] [build: 5ca2a2417409418f9a9b16267b8bd910] Rest of kubectl port forward output was: Forwarding from [::1]:33573 -> 19885
2023-06-09 09:29:03,740 DEBUG zuul.ExecutorServer: E0609 09:28:12.200585 7553 portforward.go:406] an error occurred forwarding 33573 -> 19885: error forwarding port 19885 to pod 54d4f6853fc013ced5c1dfd183f6f0a448558085f2edbf914014952238562a09, uid : failed to execute portforward in network namespace "/var/run/netns/cni-6d458b82-4fac-e9fa-1acf-79e66a42368c": failed to connect to localhost:19885 inside namespace "54d4f6853fc013ced5c1dfd183f6f0a448558085f2edbf914014952238562a09", IPv4: dial tcp4 127.0.0.1:19885: connect: connection refused IPv6 dial tcp6: address localhost: no suitable address found
2023-06-09 09:29:03,740 DEBUG zuul.ExecutorServer: E0609 09:28:12.200912 7553 portforward.go:234] lost connection to pod
```
@jjbeckman:matrix.orgI see errors, but they only occur after the pipeline has been executed.09:39
@jjbeckman:matrix.orgAnd don't explain why each job is so slow.09:39
@flaper87:matrix.org This is a bit annoying and I've been meaning to send a PR. The job is slow because the executor keeps looping while it waits for the console, and it only considers the task complete once it gives up on the stream. By fixing my zuul_console issue I took jobs from 7 mins to 1 min :) 09:42
@flaper87:matrix.orgZuul console needs to run first so that it is launched in the "node" (pod) before the executor attempts to launch the port-forward09:42
@flaper87:matrix.org Did you check that the console is actually running in the pod? 09:42
@jjbeckman:matrix.orgI... see... I would very much like to solve my zuul_console issue as well  :)09:44
@jjbeckman:matrix.org> Zuul console needs to run first so that it is launched in the "node" (pod) before the executor attempts to launch the port-forward09:44
So I need to tweak the executor source code?
@jjbeckman:matrix.org > Did you check that the console is actually running in the pod? 09:44
I wasn't aware this was a thing. There should be a zuul_console process running in the node pod?
@flaper87:matrix.org This is what my pre.yaml for the base task looks like: 09:46
@flaper87:matrix.orgwell, at least a portion of it09:46
@flaper87:matrix.orgOnce you have that, you can exec into one of the CI pods and run `ps aux` to get all the processes. You should see the console one09:47
@flaper87:matrix.orgMaybe get the output of listening ports to see if it's actually listening on the port09:47
@flaper87:matrix.org `ss -lp | grep 19885` 09:48
@jjbeckman:matrix.orgAhhh, I see what you mean now, thanks so much.09:50
@jjbeckman:matrix.orgI need to run now, but will definitely try what you've suggested.09:50
@yoctozepto:matrix.org> <@yoctozepto:matrix.org> morning Zuulers; any preference regarding https://lists.zuul-ci.org/archives/list/zuul-discuss@lists.zuul-ci.org/thread/WUWBM5F3PXXDLKK6JNSP4UR4VTWDNPZ4/ ? I do not know how you handle such decisions to be honest...12:21
refreshing the message in case of more eyes in this channel at this time
@avass:matrix.vassast.orgI think it makes sense to remove helm v2 since support stopped ~3 years ago. 12:26
@yoctozepto:matrix.orgthanks, Albin12:31
@fungicide:matrix.org not kubernetes, but similar situation in opendev as well: https://opendev.org/opendev/base-jobs/src/commit/ca59b60/playbooks/base/pre.yaml#L45 13:38
@jim:acmegating.com muneerefx: https://www.youtube.com/watch?v=vb0Iuf-6wHs&pp=ygUHenV1bCBjaQ%3D%3D is based on the tutorial 13:43
@jim:acmegating.com jjbeckman: https://zuul-ci.org/docs/zuul/latest/operation.html#log-streaming has some reference information 13:44
-@gerrit:opendev.org- Zuul merged on behalf of Tobias Henkel: [zuul/nodepool] 885736: Fix typo that crashes playback worker when under load https://review.opendev.org/c/zuul/nodepool/+/885736 14:29
