Thursday, 2022-09-08

-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul-jobs] 856225: upload-npm : support authToken argument https://review.opendev.org/c/zuul/zuul-jobs/+/856225 01:13
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul] 853208: zuul-stream : Test against a Python 2.7 container https://review.opendev.org/c/zuul/zuul/+/853208 07:17
@sameer.deshpande:matrix.org> <@clarkb:matrix.org> Yes, this is the correct location for questions about zuul and nodepool. 08:45
Thanks Clark. I'm facing an issue while trying to install Nodepool builder 3.12.0 on Ubuntu 20.04. The nodepool-builder service throws an error while starting. Any pointers to resolve this issue?
systemctl status nodepool-builder.service
● nodepool-builder.service - LSB: Nodepool-builder
Loaded: loaded (/etc/init.d/nodepool-builder; generated)
Active: active (exited) since Wed 2022-09-07 09:30:47 UTC; 20h ago
Docs: man:systemd-sysv-generator(8)
Tasks: 0 (limit: 9508)
Memory: 0B
CGroup: /system.slice/nodepool-builder.service
File "/usr/local/lib/python3.8/dist-packages/d>
if is_socket(stdin_fd):
File "/usr/local/lib/python3.8/dist-packages/d>
file_socket = socket.fromfd(fd, socket.AF_IN>
File "/usr/lib/python3.8/socket.py", line 544,>
return socket(family, type, proto, nfd)
File "/usr/lib/python3.8/socket.py", line 231,>
_socket.socket.__init__(self, family, type, >
OSError: [Errno 88] Socket operation on non-sock>
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 856321: Add initial telemetry tracing to the executor component https://review.opendev.org/c/zuul/zuul/+/856321 13:02
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 856523: wip: Add span for builds and propagate via request https://review.opendev.org/c/zuul/zuul/+/856523 13:25
@westphahl:matrix.org tristanC: ^ this is a wip that propagates the parent span info via the build request 13:25
@westphahl:matrix.org corvus: just wondering if we always want to propagate the full span_info (including links + attributes) or if we have something similar to the w3c traceparent that mainly contains the trace and parent span id 13:27
@jim:acmegating.com swest: we need it to end the span since the span end may happen on a different host if we want to stick with the opentelemetry api (where links can only be added at the start).  if we want to stray farther from the api, we could set links/attrs only at the end, then we don't need to save them. 13:43
@westphahl:matrix.org corvus: for the use case of propagating the parent span info of the build via the build request I think we don't have to include the full span info of the build with the request. 13:44
@jim:acmegating.com swest: the parent span of the build is the buildset span, right?  the buildset span can start and end on any scheduler.  it's true that we don't need that info to construct the build span, but we *do* need it when we end the buildset span.  so rather than save all the info we need for the buildset span in one place, then also save a partial copy of it in another place for the build, we just save one copy.  it doesn't matter that we don't use the extra info. 13:48
@westphahl:matrix.org * corvus: I'm not sure I can follow. I was thinking about spans that may live on an executor (e.g. in tristanC's case) and where we might want to establish a relationship to the parent span. one option would be to include the full parent span info in the build request. Another option would be to only propagate enough info to establish the relationship to the parent span (e.g. similar to the w3c trace context) 13:53
@jim:acmegating.com swest: yes, what i'm saying is that any span that lives only on a single host does not need to be saved in zk.  any span that can start and end on a different host needs all the info to be saved in zk.  if we encounter a case where a parent span exists on only one host and then has a child span that exists on a different host, we will have a situation like you describe.  but so far, i think any time we have a child that can start/end on a different host than its parent, the parent itself will also need to start/end on a different host.  therefore the full parent needs to be saved in zk. 13:56
@jim:acmegating.com swest: it looks like you're saying that on the executor, we don't have the buildset object available to restore the parent, so we should save less info.  i'll take a closer look at your change later today. 13:59
@westphahl:matrix.org corvus: yes, we need to include some bit of information with the build request that allows us to establish a relationship to the parent span as we don't have the buildset available there. same goes for e.g. merge or files changes jobs 14:00
@westphahl:matrix.org corvus: what I was trying to say is: I think this bit of information doesn't have to be the full span info (it would certainly do the trick) but maybe something smaller, similar to the w3c trace context 14:01
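
For readers unfamiliar with the "w3c trace context" mentioned above: its `traceparent` value carries only a version, a trace id, a parent span id, and a flags byte. The following is a minimal sketch of building and parsing such a value, not the code from any of the changes under discussion; the function names `format_traceparent` and `parse_traceparent` are hypothetical.

```python
import re

# Version 00 layout: 00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def format_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Build a version-00 traceparent string (hypothetical helper)."""
    if not 0 < trace_id < 2 ** 128:
        raise ValueError("trace_id must be a non-zero 128-bit integer")
    if not 0 < span_id < 2 ** 64:
        raise ValueError("span_id must be a non-zero 64-bit integer")
    flags = 0x01 if sampled else 0x00
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(value: str):
    """Return (trace_id, span_id, sampled) or raise ValueError."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a version-00 traceparent: {value!r}")
    trace_id = int(match.group(1), 16)
    span_id = int(match.group(2), 16)
    if trace_id == 0 or span_id == 0:
        raise ValueError("all-zero trace or span ids are invalid")
    sampled = bool(int(match.group(3), 16) & 0x01)
    return trace_id, span_id, sampled
```

A value this small is enough to parent a new span on another host, but it does not carry the links and attributes that are needed to end the original span elsewhere, which is the trade-off debated above.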
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 856523: wip: Add span for builds and propagate via request https://review.opendev.org/c/zuul/zuul/+/856523 14:06
@tristanc_:matrix.org swest: corvus I am just trying to add some telemetry around git operations. I was wondering if we could use standalone spans, with just event_id /build_id attribute, but it sounds even better if they could be attached to a build/event parent span. 14:09
@westphahl:matrix.org tristanC: yeah, I think we should always establish a relationship to other/parent spans 14:19
@clarkb:matrix.org> <@sameer.deshpande:matrix.org> Thanks Clark. I'm facing an issue while trying to install Nodepool builder 3.12.0 on Ubuntu 20.04. The nodepool-builder service throws an error while starting. Any pointers to resolve this issue? 14:33
Nodepool 3.12.0 is about 2.5 years old and was created before Ubuntu 20.04 was released. I suspect that the version of Python there is simply too new for (and untested with) nodepool 3.12.0. However, it is difficult to say because the traceback you pasted has truncated lines. If you can, upgrading is likely to be helpful, as it would be software that we can support. 3.12.0 isn't something we would be fixing or updating at this point.
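
The visible part of the traceback ends in `socket.fromfd(...)` raising `OSError: [Errno 88]`, which is what happens when the descriptor being wrapped (here stdin) is not a socket. As an illustration only, here is a minimal sketch of a check that answers the same question without constructing a socket object. The helper name `fd_is_socket` is hypothetical; this is not the code from the truncated package path, and it is not a fix for nodepool 3.12.0.

```python
import os
import stat


def fd_is_socket(fd: int) -> bool:
    """Report whether a file descriptor refers to a socket.

    Uses fstat() on the descriptor, so nothing is wrapped in a socket
    object and no OSError escapes for pipes, files, or closed fds.
    """
    try:
        mode = os.fstat(fd).st_mode
    except OSError:
        # Closed or otherwise invalid descriptor: certainly not a socket.
        return False
    return stat.S_ISSOCK(mode)
```

Calling `fd_is_socket(0)` under systemd would report whether stdin is a socket rather than raising.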
-@gerrit:opendev.org- Thomas Cardonne marked as active: [zuul/zuul] 856294: feat(elasticsearch): support datastreams and rollover aliases https://review.opendev.org/c/zuul/zuul/+/856294 14:44
@jim:acmegating.com swest: yep, got it.  i think the flaw in my plan was that the executor does not have a buildset object, so i agree we should add something for this case.  how about i look into some options today and propose an update to my change and yours? 14:46
@westphahl:matrix.org corvus: sounds good 14:47
@avass:vassast.org `event_enqueue_processing_time` and `job_wait_time` seem to be kinda long for us (10-30sec) while schedulers are not using a lot of cpu and executors are accepting jobs. But maybe I just don't know what's happening during those periods? 14:56
@mbecker12:matrix.org> <@mbecker12:matrix.org> Hi, I left a comment on my patch https://review.opendev.org/c/zuul/nodepool/+/853993 recently regarding the tests. corvus maybe you could have another look if you have a minute :) 14:59
corvus: Could you take another look? I don't know how else to move forward with the tests there
@jim:acmegating.com mbecker12: yep, i'll try to take a look today, sorry i didn't get to it yesterday 15:04
@avass:vassast.org `job_wait_time` is the worst offender for us, because as far as I understand that's just time wasted holding a node while waiting for an executor to start running a job. 15:05
@clarkb:matrix.org Albin Vass: have you checked if you have a lot of executor load that is causing them to remove themselves from the available executor list (the various governors for resources can do this)? 15:35
@clarkb:matrix.org Albin Vass: this might be an indication that you need more executors or bigger executors, but that data should be recorded via statsd and/or logging 15:36
@avass:vassast.org Clark: well statsd shows that all executors are accepting jobs, and at least load avg shouldn't be high enough to stop accepting jobs. But I'll have to look into it more tomorrow 15:37
@avass:vassast.org and adding another two executors didn't change much except for distributing jobs to those as well. 15:38
@noonedeadpunk:matrix.org Hey! I've just realized, that it's not allowed somehow to copy workdir to remote host. So what's the way to actually copy current repo state to the third party host? 16:00
@fungicide:matrix.org Dmitriy Rabotyagov: https://zuul-ci.org/docs/zuul-jobs/general-roles.html#role-mirror-workspace-git-repos 16:02
@noonedeadpunk:matrix.org ah, thanks, indeed 16:03
@noonedeadpunk:matrix.org I guess I need to clone roles from repo first though 16:03
@fungicide:matrix.org Dmitriy Rabotyagov: zuul should prepare repo states in the local workspace for any required projects that job specifies 16:04
@noonedeadpunk:matrix.org I wonder if I can just run prepare-workspace against any host 16:04
@fungicide:matrix.org prepare-workspace is an alternative to mirror-workspace-git-repos 16:06
@noonedeadpunk:matrix.org Though prepare-workspace I believe is designed for localhost 16:06
@noonedeadpunk:matrix.org Ok, thanks, I will try this out now 16:07
@fungicide:matrix.org oh, yes sorry i meant to link https://zuul-ci.org/docs/zuul-jobs/general-roles.html#role-prepare-workspace-git 16:09
@fungicide:matrix.org (which is also not prepare-workspace) 16:09
@fungicide:matrix.org Dmitriy Rabotyagov: we just chuck it into our base job's pre-phase playbook like this: https://opendev.org/opendev/base-jobs/src/branch/master/playbooks/base/pre.yaml#L61 16:10
@noonedeadpunk:matrix.org Yes, so the thing is it's also in base for our deployment. Though, I'm still trying to test post-deployment in pre-review pipeline 16:11
@noonedeadpunk:matrix.org *post-merge 16:11
@noonedeadpunk:matrix.org Eventually, in post-merge it will simply be the repo state from git... 16:12
@noonedeadpunk:matrix.org So I don't really need to sync workdir, but just clone the repo.... 16:12
@fungicide:matrix.org i also don't recall now why we have both mirror-workspace-git-repos and prepare-workspace-git, the descriptions look almost identical except for the rolevars they take 16:12
@fungicide:matrix.org both say they use git operations to copy the local git state to remote nodes 16:13
@noonedeadpunk:matrix.org To me it feels like prepare-workspace-git basically prepares the workdir, while mirror-workspace-git-repos clones the workdir content to remote nodes 16:14
@noonedeadpunk:matrix.org But I haven't looked deep enough to say for sure 16:14
@noonedeadpunk:matrix.org I just for some reason decided that I need repo state from zuul/gerrit rather than just from git 16:16
@noonedeadpunk:matrix.org Since it's post-merge, anything I might need is already in the repo.... 16:16
@clarkb:matrix.org> <@fungicide:matrix.org> i also don't recall now why we have both mirror-workspace-git-repos and prepare-workspace-git, the descriptions look almost identical except for the rolevars they take 16:16
One uses rsync and the other git operations. The one that uses git operations should likely be preferred
@noonedeadpunk:matrix.org So thanks for helping, seems I just had to say the issue aloud to realize that it's a stupid way to deal with it 16:16
@noonedeadpunk:matrix.org @clark the one that uses rsync is prepare-workspace 16:17
@noonedeadpunk:matrix.org these 2 both use git 16:17
@clarkb:matrix.org oh I didn't realize there was a third. Interesting 16:17
@noonedeadpunk:matrix.org Hm, but how to clone gerrit repo given it requires auth.... Zuul obviously does have access to it, but I bet keypair is removed quite early... 16:23
@noonedeadpunk:matrix.org Pffff.... 16:23
@clarkb:matrix.org> <@noonedeadpunk:matrix.org> Hm, but how to clone gerrit repo given it requires auth.... Zuul obviously does have access to it, but I bet keypair is removed quite early... 16:26
The idea is that zuul would set the state for you and then your job can move that copy around without interacting with Gerrit. If you need to be able to push back to Gerrit though you'll need to manage a secret
@noonedeadpunk:matrix.org What I want is to copy the repo from localhost to a host that is not in the inventory at all. Likely the easiest thing is to make this host a static nodepool node.... 16:27
@clarkb:matrix.org you can use add_host to add it to the inventory as well. But even that isn't necessary; you could just connect to that host directly 16:28
@noonedeadpunk:matrix.org So I was trying not to spawn a node and just deal with localhost, as what I need is only to copy current project to $host 16:28
@noonedeadpunk:matrix.org And apparently I can't do that 16:28
@noonedeadpunk:matrix.org As `"Syncing files from outside the working dir /var/lib/zuul/builds/0eaa2f0882c548f98828911e7be27e0a/work is prohibited"` 16:30
@jim:acmegating.com Dmitriy Rabotyagov: what version of zuul are you running? 16:31
@noonedeadpunk:matrix.org 4.9.0 16:31
@jim:acmegating.com Dmitriy Rabotyagov: consider upgrading. 6.0.0 will let you do what you want in an untrusted playbook. 16:32
@jim:acmegating.com (< 6.0.0 you would need to do that in a trusted playbook) 16:32
@jim:acmegating.com https://zuul-ci.org/docs/zuul/latest/releasenotes.html#relnotes-6-0-0-upgrade-notes for details 16:34
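
The "outside the working dir" error quoted above is a path-containment restriction. The sketch below illustrates that general kind of check under stated assumptions. It is not Zuul's actual implementation, and `is_inside_workdir` is a hypothetical name.

```python
import os


def is_inside_workdir(path: str, work_dir: str) -> bool:
    """Return True if path resolves to work_dir or something beneath it.

    Both arguments are resolved with realpath() first, so "../" segments
    and symlinks cannot be used to step outside the working directory.
    """
    real_path = os.path.realpath(path)
    real_work = os.path.realpath(work_dir)
    return os.path.commonpath([real_path, real_work]) == real_work
```

With the build directory from the error message as `work_dir`, a source path such as `<work_dir>/src/project` passes, while `<work_dir>/../other` or `/etc/passwd` does not.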
@noonedeadpunk:matrix.org Well... Quite a few upgrade notes to read.... 16:35
@noonedeadpunk:matrix.org Also ansible-core 2.12 is sweet 16:35
@noonedeadpunk:matrix.org Ugh 16:35
@noonedeadpunk:matrix.org Tough choice :D 16:36
@noonedeadpunk:matrix.org sorry another question, shouldn't vars defined for a job be available for the playbook's "hosts"? 17:10
@noonedeadpunk:matrix.org As it seems that job vars are not yet loaded when the playbook is being launched 17:11
@noonedeadpunk:matrix.org Maybe it's fixed already in later versions though... 17:11
@clarkb:matrix.org Dmitriy Rabotyagov: the vars defined on jobs should be available in the playbooks 17:22
@clarkb:matrix.org they are just top level variables iirc. You can do host specific vars too. 17:23
@jim:acmegating.com fungi: i replied to your comments on the spec 18:03
@fungicide:matrix.org> <@jim:acmegating.com> fungi: i replied to your comments on the spec 18:36
thanks!
@noonedeadpunk:matrix.org @clark:matrix.org Well, I have a playbook prepare_deploy_host.yml: https://paste.openstack.org/show/bBTYTEGVTJbchRGrR7ac/ and a job: https://paste.openstack.org/show/bSuSHH4qkw0JpP6JoyTn/ 18:41
@noonedeadpunk:matrix.org and it fails like https://paste.openstack.org/show/b8CrzxoJpzeTKOtLdlFN/ 18:42
@noonedeadpunk:matrix.org And I o_O 18:42
@jim:acmegating.com zuul-maint: would any one else like to look at the change to add ansible 6?  https://review.opendev.org/853552 19:50
@jim:acmegating.com that's the next change needed for 6.4.0 19:50
@clarkb:matrix.org corvus: I can 19:56
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 855801: Add nodeset alternatives https://review.opendev.org/c/zuul/zuul/+/855801 20:01
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 855691: Remove deprecated pipeline queue configuration https://review.opendev.org/c/zuul/zuul/+/855691 20:31
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 855691: Remove deprecated pipeline queue configuration https://review.opendev.org/c/zuul/zuul/+/855691 20:32
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul] 856214: zuul_stream : Use !127.0.0.1 for loopback https://review.opendev.org/c/zuul/zuul/+/856214 21:05
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 853552: Add Ansible 6 https://review.opendev.org/c/zuul/zuul/+/853552 21:30
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: 22:44
- [zuul/zuul] 854458: Add support for configuring and testing tracing https://review.opendev.org/c/zuul/zuul/+/854458
- [zuul/zuul] 855096: Tracing: implement span save/restore https://review.opendev.org/c/zuul/zuul/+/855096
- [zuul/zuul] 855293: Add tracing tutorial https://review.opendev.org/c/zuul/zuul/+/855293
- [zuul/zuul] 856567: Add startSpanInContext tracing method https://review.opendev.org/c/zuul/zuul/+/856567
- [zuul/zuul] 856568: Use implicit trace context in build requests https://review.opendev.org/c/zuul/zuul/+/856568
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Simon Westphahl: [zuul/zuul] 856523: Add span for builds and propagate via request https://review.opendev.org/c/zuul/zuul/+/856523 22:44
@jim:acmegating.com swest: tristanC ^ okay i did some surgery there.  tristanC i pulled some of your change into swest's and then updated that to use the new simplified context swest was advocating.  i added all of us as co-authors to that one.  i also added another change on top of the stack which i consider optional -- we can decide how much we want to use the implicit context (basically, thread local variables) to save on having to pass spans around all the time. 22:46
@jim:acmegating.com tristanC: i think you could cherry-pick your change onto the end of the stack if you want, and maybe update it to look more like what i did with the context manager in executeJob 22:47
@jim:acmegating.com the whole stack is updated because i updated the first change to use a noop tracer if tracing is disabled so that we are guaranteed working context managers everywhere 22:48
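
To make the "implicit context" idea above concrete, here is a minimal sketch of keeping the current span in per-thread state so it does not have to be passed through every call. It uses Python's standard `contextvars` module, which gives each thread its own value. The names `use_span` and `current_span_name` and the dict-based span are hypothetical, and this is not the code in the proposed changes.

```python
import contextlib
import contextvars

# Holds the span that is "current" for the running thread or task.
_current_span = contextvars.ContextVar("current_span", default=None)


@contextlib.contextmanager
def use_span(name):
    """Make a new span current for the duration of the with-block.

    The new span records whichever span was current on entry as its
    parent; the previous span is restored on exit, even on error.
    """
    parent = _current_span.get()
    span = {"name": name, "parent": parent["name"] if parent else None}
    token = _current_span.set(span)
    try:
        yield span
    finally:
        _current_span.reset(token)


def current_span_name():
    """Return the name of the current span, or None outside any span."""
    span = _current_span.get()
    return span["name"] if span else None
```

Code deep in a call stack can then call `current_span_name()` without receiving the span as an argument, which is the convenience being weighed above against passing spans explicitly.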
@iwienand:matrix.org zuul-main: https://review.opendev.org/c/zuul/nodepool/+/853914 doesn't solve any of the openshift issues we discussed, but it does get the extant test working with f36 by installing a statically built client; removing a f35 dependency (the packages for the 3.11 oc client have been orphaned and don't work any more).  this isn't the long-term solution, but at least gets us off a f35 dependency for now 23:27
@clarkb:matrix.org ianw: I guess the fedora 36 packages are also statically linked but they are built with the wrong compiler if I read the upstream issue correctly? The tarball from github would've been built with the time appropriate golang compiler and is happy 23:31
@iwienand:matrix.org Clark: yeah i think all go things are essentially statically linked?  but yeah, building with the current go the platform ships is incompatible 23:32
