Tuesday, 2022-08-02

-@gerrit:opendev.org- Ian Wienand proposed: [zuul/nodepool] 849273: Dockerfile: move into separate group when running under cgroupsv2 https://review.opendev.org/c/zuul/nodepool/+/849273 01:13
@tristanc_:matrix.orgtony.breeds: Clark: unfortunately the jobs defined in opendev.org/zuul/zuul-jobs often use `become`, sometimes just to ensure a piece of software is installed, and that does not work well with unprivileged containers. In that case, we set up a passwd alias for zuul that has the uid 0, that way become is a noop and the job can run without using setuid or sudo.12:29
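The passwd alias described above could look roughly like the sketch below. This is only an illustration: the log does not show the entry actually used, so the helper names `passwd_alias` and `resolves_to_root`, as well as the home directory and shell values, are assumptions. It builds an `/etc/passwd`-style line that maps the name `zuul` to uid 0 and checks that the entry resolves to root.

```python
def passwd_alias(name: str, uid: int = 0, gid: int = 0,
                 home: str = "/root", shell: str = "/bin/sh") -> str:
    """Build an /etc/passwd-style line mapping `name` to `uid`.

    Hypothetical helper: the home and shell defaults are placeholders,
    not values taken from the discussion.
    """
    # Field order: name:password:uid:gid:gecos:home:shell
    return f"{name}:x:{uid}:{gid}:{name}:{home}:{shell}"


def resolves_to_root(passwd_line: str) -> bool:
    """Return True if the passwd line's uid field is 0."""
    fields = passwd_line.split(":")
    return int(fields[2]) == 0


line = passwd_alias("zuul")
print(line)
print(resolves_to_root(line))
```

With an entry like this, a task that becomes `zuul` is already running as uid 0, so no setuid binary or sudo is needed inside the container.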
@jpew:matrix.orgThe `tox-linters` job appears to be broken and is preventing gating for zuul?14:42
@jpew:matrix.orge.g. https://review.opendev.org/c/zuul/zuul/+/850685 14:42
@clarkb:matrix.orgIt is the new "missing whitespace after keyword" rule. That was brought up somewhere else as a problem for things like `assert(foo)` and `del(foo)` which is exactly the sort of thing failing in zuul.14:43
@jpew:matrix.orgAh, it wants `assert (foo)`  or `assert foo` b/c it's a keyword not a function?14:45
@clarkb:matrix.orgcorrect. I think `assert foo` is what they are likely looking for14:46
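As a small illustration of the "missing whitespace after keyword" complaint discussed above: `assert` and `del` are statements, so the parentheses in `assert(foo)` only group the expression. The sketch below (variable and function names are made up for the example) shows the flagged and preferred spellings side by side, plus the reason the function-call look is risky: a parenthesised condition and message form a tuple, and a non-empty tuple is always truthy.

```python
foo = [1, 2, 3]

# Spellings the linter rule flags: the keyword is written like a function call.
assert(foo)
del(foo[0])

# Preferred spellings: a space after the keyword, no parentheses needed.
assert foo
del foo[0]


def check(value):
    """Assert with a message using the statement form: condition, message."""
    assert value, "value must be truthy"
    return True


# Why the call-like form is risky: `assert (cond, "message")` would test a
# two-element tuple, and any non-empty tuple is truthy, so it never fails.
always_true = bool((False, "message"))

print(foo)          # [3]
print(always_true)  # True
```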
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 850575: doc: fix liveness probes path rendering https://review.opendev.org/c/zuul/zuul/+/850575 14:51
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 851895: Add whitespace around keywords https://review.opendev.org/c/zuul/zuul/+/851895 15:03
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:15:03
- [zuul/zuul] 850109: Add tests for zuul-client job-graph https://review.opendev.org/c/zuul/zuul/+/850109
- [zuul/zuul] 850111: Add test for zuul-client freeze-job https://review.opendev.org/c/zuul/zuul/+/850111
- [zuul/zuul] 851107: Add job graph support to web UI https://review.opendev.org/c/zuul/zuul/+/851107
- [zuul/zuul] 851268: Add freeze job to web UI https://review.opendev.org/c/zuul/zuul/+/851268
- [zuul/zuul] 851604: Use internal links in job graph display https://review.opendev.org/c/zuul/zuul/+/851604
@jim:acmegating.comzuul-maint: https://review.opendev.org/851895 would be great to merge asap15:04
@tobias.henkel:matrix.orgzuul-maint: did any of you observe random task hangs in the past similar to the issue in opendev a few weeks ago? I think we face something similar (we're still on python 3.8 and ansible 2.9).15:52
@tobias.henkel:matrix.orgall gdb back traces I got so far from hanging tasks look the same:15:52
```
Thread 1 (Thread 0x7f9ba968a740 (LWP 1758660)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1 0x00007f9ba99e67d1 in __GI___pthread_mutex_lock (mutex=0x7f9ba9dfc990 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:115
#2 0x00007f9ba9ddf1ce in _dl_add_to_namespace_list (new=0x55b53d175e10, nsid=0) at dl-object.c:33
#3 0x00007f9ba9dda792 in _dl_map_object_from_fd (name=name@entry=0x7f9ba58b54b0 "/usr/local/lib/python3.8/lib-dynload/_crypt.cpython-38-x86_64-linux-gnu.so", origname=origname@entry=0x0, fd=-1, fbp=fbp@entry=0x7ffc97c7ca90, realname=<optimized out>, loader=loader@entry=0x0, l_type=<optimized out>, mode=<optimized out>, stack_endp=<optimized out>, nsid=<optimized out>) at dl-load.c:1382
#4 0x00007f9ba9ddca8d in _dl_map_object (loader=0x0, loader@entry=0x7f9ba9dce000, name=name@entry=0x7f9ba58b54b0 "/usr/local/lib/python3.8/lib-dynload/_crypt.cpython-38-x86_64-linux-gnu.so", type=type@entry=2, trace_mode=trace_mode@entry=0, mode=mode@entry=-1879048190, nsid=<optimized out>) at dl-load.c:2466
#5 0x00007f9ba9de6feb in dl_open_worker (a=a@entry=0x7ffc97c7cfe0) at dl-open.c:228
#6 0x00007f9ba97c157f in __GI__dl_catch_exception (exception=exception@entry=0x7ffc97c7cfc0, operate=operate@entry=0x7f9ba9de6f60 <dl_open_worker>, args=args@entry=0x7ffc97c7cfe0) at dl-error-skeleton.c:196
#7 0x00007f9ba9de6bba in _dl_open (file=0x7f9ba58b54b0 "/usr/local/lib/python3.8/lib-dynload/_crypt.cpython-38-x86_64-linux-gnu.so", mode=-2147483646, caller_dlopen=0x7f9ba9c675d1 <_PyImport_FindSharedFuncptr+113>, nsid=<optimized out>, argc=17, argv=0x7ffc97c82d58, env=0x7ffc97c82de8) at dl-open.c:599
```
@tobias.henkel:matrix.orgall hang during dl_open of _crypt.cpython-38-x86_64-linux-gnu.so15:53
@fungicide:matrix.orgAlbin Vass: that ^ was the thing you ran down initially, right?16:08
@tobias.henkel:matrix.orgI think it's different, but similar16:09
@tobias.henkel:matrix.orgfor reference, that is Albin Vass' issue: https://github.com/ansible/ansible/issues/78270 16:11
@tobias.henkel:matrix.orgthe stack traces look different16:11
@fungicide:matrix.organd this isn't in pty allocation, it's loading a c lib16:13
@fungicide:matrix.orgout of curiosity, what glibc version?16:13
@fungicide:matrix.orgno signs that reads are generally hanging from that fs? is that on a job node or an executor?16:15
@tobias.henkel:matrix.orgit's debian buster 2.28-10+deb10u1 16:16
@tobias.henkel:matrix.orgit's on an executor16:16
@tobias.henkel:matrix.orgnope, no signs, io seems fine16:19
@clarkb:matrix.orgtobiash: that looks like you're waiting for a pthread mutex while opening the python crypt module's C component?17:20
@tobias.henkel:matrix.orgyes17:21
@tobias.henkel:matrix.orghowever in the meantime I found a few different traces as well17:21
@clarkb:matrix.orgmy initial hunch is that seems like a python bug17:21
@tobias.henkel:matrix.orgso no idea yet in which direction to look17:22
@tobias.henkel:matrix.orgmaybe I'll try the shotgun method of upgrading to py3.10 and bullseye17:22
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 851895: Add whitespace around keywords https://review.opendev.org/c/zuul/zuul/+/851895 18:11
@jpew:matrix.orgClark:  I can't seem to regate those 2 patches you reviewed yesterday, maybe I don't have permission?19:42
@clarkb:matrix.org> <@jpew:matrix.org> Clark:  I can't seem to regate those 2 patches you reviewed yesterday, maybe I don't have permission?19:43
The string is `recheck` not `regate`
@jpew:matrix.orgClark: Ah got it. Thanks19:44
@iwienand:matrix.orgtobiash: yeah when i debugged that issue the stack clearly had grantpt() in it, so i do think it's different. but i guess be careful with 3.10 + bullseye because you might actually hit that issue. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015740 does have a patch that i backported but i do think it's unlikely to make it into stable19:46
-@gerrit:opendev.org- Joshua Watt proposed: [zuul/zuul] 851931: doc: Fix Nodepool monitoring stats https://review.opendev.org/c/zuul/zuul/+/851931 20:15
@clarkb:matrix.orgcorvus: did you want to review https://review.opendev.org/c/zuul/nodepool/+/849273 since it changes how nodepool-builder is invoked on cgroupv2 hosts?20:19
@jim:acmegating.comClark: lgtm, thx20:21
-@gerrit:opendev.org- Zuul merged on behalf of Joshua Watt: [zuul/zuul] 850685: web: openapi: Fix item_ahead and items_behind https://review.opendev.org/c/zuul/zuul/+/850685 20:50
-@gerrit:opendev.org- Zuul merged on behalf of Joshua Watt: [zuul/zuul] 851550: smtpreporter: Add pipeline to subject https://review.opendev.org/c/zuul/zuul/+/851550 20:54
@clarkb:matrix.orgcorvus: do you know why we required python2.7 support in ansible? re https://review.opendev.org/c/zuul/zuul-jobs/+/851343 is it because ansible may itself run against python2.7 on the remote end and this gives us some checking that it is going to function in that case?21:22
@clarkb:matrix.orgianw: ^ fyi I suspect that may be the reason and we may not be able to remove the testing if that is the case21:23
@jim:acmegating.comClark: yeah, we need to at least issue the next major rev of zuul where we drop older ansible to do that i think.  and even then, i think it's just that we can look into what the current ansible support policy is.  is there a reason we need to drop that now?21:51
@iwienand:matrix.orgsomething started failing with the py27 job as i was working on the linter stack and i put in the bits to remove it.  i'm afraid what that something was i can't remember off the top of my head22:00
@jim:acmegating.comianw: looks like it's okay in general?  was just a one-time fluke?22:01
@iwienand:matrix.orgi'll have to dig back 22:04
@iwienand:matrix.orghttps://docs.ansible.com/ansible/latest/dev_guide/:ref:managed-node-requirements is currently a 404 22:11
@iwienand:matrix.orghttps://github.com/ansible/ansible/commit/2fc73a9dc357e776dbbbfd035c86fe880415e60a appears to have introduced this22:18
@iwienand:matrix.orgas usual from ansible a super unhelpful commit message with no context.  WHY DO THEY DO THIS!!!!!!22:20
@iwienand:matrix.orgi submitted https://github.com/ansible/ansible/pull/78424 to fix the control node link, and https://github.com/ansible/ansible/issues/78423 to try and figure out what ansible thinks works where23:17
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/nodepool] 851940: Bump dib to 3.23.0 https://review.opendev.org/c/zuul/nodepool/+/851940 23:53
