Thursday, 2021-04-01

*** hamalq has quit IRC [00:02]
*** shanemcd has quit IRC [00:02]
*** shanemcd has joined #zuul [00:03]
*** y2kenny has quit IRC [00:07]
*** fsvsbs has quit IRC [01:32]
*** jangutter_ has joined #zuul [02:14]
*** jangutter has quit IRC [02:17]
*** evrardjp has quit IRC [02:33]
*** evrardjp has joined #zuul [02:33]
<corvus> i'm beginning to doubt that i'm going to get useful data in a reasonable amount of time. [03:13]
<corvus> i think i have a handle on the kind of search to do next, and hopefully i can do it during a period where there aren't quite so many objects for it to churn through [03:18]
*** ykarel|away has joined #zuul [03:50]
*** ykarel|away is now known as ykarel [03:54]
*** ajitha has joined #zuul [03:59]
*** jfoufas1 has joined #zuul [04:27]
<openstackgerrit> Felix Edel proposed zuul/zuul master: Switch to ZooKeeper backed merge result events  https://review.opendev.org/c/zuul/zuul/+/784195 [05:55]
*** jangutter has joined #zuul [06:05]
*** fsvsbs has joined #zuul [06:05]
*** jangutter_ has quit IRC [06:08]
*** saneax has joined #zuul [06:16]
*** hashar has joined #zuul [06:39]
*** reiterative has quit IRC [06:49]
*** reiterative has joined #zuul [06:49]
*** jcapitao has joined #zuul [07:00]
*** tosky has joined #zuul [07:45]
<openstackgerrit> Daniel Blixt proposed zuul/zuul-jobs master: WIP: Make build-sshkey handling windows compatible  https://review.opendev.org/c/zuul/zuul-jobs/+/780662 [07:47]
*** ykarel has quit IRC [08:03]
*** ykarel has joined #zuul [08:05]
*** fsvsbs has quit IRC [08:14]
*** nils has joined #zuul [08:33]
*** jangutter_ has joined #zuul [08:57]
*** ykarel is now known as ykarel|lunch [08:58]
*** jangutter has quit IRC [09:00]
*** ykarel|lunch is now known as ykarel [10:10]
<openstackgerrit> Sorin Sbârnea proposed zuul/zuul master: WIP: Document tox environments  https://review.opendev.org/c/zuul/zuul/+/766460 [10:13]
*** jcapitao is now known as jcapitao_lunch [10:39]
*** hashar is now known as hasharLunch [11:02]
*** rlandy has joined #zuul [11:42]
*** jcapitao_lunch is now known as jcapitao [11:51]
*** hasharLunch is now known as hashar [11:58]
<openstackgerrit> Sorin Sbârnea proposed zuul/zuul master: Document local testing  https://review.opendev.org/c/zuul/zuul/+/766460 [12:13]
*** bhagyashris has quit IRC [12:28]
*** bhagyashris has joined #zuul [12:29]
*** sanjayu_ has joined #zuul [12:35]
*** saneax has quit IRC [12:36]
*** jangutter_ is now known as jangutter [12:38]
<jangutter> Newbie question on zuul config repos: we have 2 config repos for our system. One defines the pipelines and some projects, the other defines the base jobs. [12:41]
<jangutter> I submitted a review to the projects config-repo and Zuul reported a config error: some/untrusted/project/zuul.d/main.yaml: "Job base-job-X not defined". [12:42]
<jangutter> But, base-job-X which inherits from the default base job, is defined in the base-job config repo.... [12:43]
<avass> jangutter: it sounds like zuul hasn't loaded base-job-X for some reason [12:43]
<avass> jangutter: you can check what jobs are available in: https://zuul.opendev.org/t/zuul/jobs for example [12:44]
<avass> jangutter: and config errors at: https://zuul.opendev.org/t/zuul/config-errors maybe there's some error in the config for base-job-X [12:44]
<jangutter> avass: that https://zuul-ui/job/base-job-X is showing up nicely (it's just been merged) [12:45]
<avass> jangutter: was the change dependent on that job submitted shortly after? what happens if you recheck it, still config error? [12:46]
*** nhicher_ has joined #zuul [12:48]
<jangutter> avass: still syntax error. and the t/tenantname/config-errors is just blank. [12:51]
<avass> jangutter: weird, then I can only suspect a misspelled parent name or something like that [12:53]
<jangutter> avass: we're using load-branch in our tenant config to use the "main" branch.... would that interfere? [12:55]
<jangutter> avass: the fun thing is that the job is _running correctly_ in the untrusted project. But an entirely unrelated change in the config project causes troubles for Zuul... [12:56]
<jangutter> avass: bwahaha, I think we hit this one: https://review.opendev.org/c/zuul/zuul/+/762354 [12:58]
<avass> jangutter: ah yep that sounds like that could be it :) [12:59]
<jangutter> avass: I manually merged the change and it's working as intended, so it only affects the config-repo checks. [13:00]
*** sanjayu__ has joined #zuul [13:06]
*** sanjayu_ has quit IRC [13:09]
<zbr|rover> fungi: ok drop use of bionic for zuul? related to lack of yarnpkg on deps. [13:38]
<zbr|rover> apparently installing yarn using bindep is not quite so portable [13:39]
<fungi> not quite so portable how? are we testing doc builds on bionic? [13:39]
<fungi> it might be a reason to do doc builds on focal instead [13:40]
<zbr|rover> that is what i did, but with an undesired side effect, look at https://review.opendev.org/c/zuul/zuul/+/766460/ [13:40]
<zbr|rover> fixed the docs, but broke something else :D [13:40]
<fungi> probably the addition of compile test there [13:41]
<zbr|rover> probably fixable by removing test/compile, but it would not be correct, as they are needed for that. [13:41]
<fungi> the alternative is to specify that it's only on versions of distros which provide it (everyone else is on their own to work out how to get yarn installed anyway, like before) [13:42]
<zbr|rover> my hopes were to avoid using versions as I hate wasting time updating these files every ~6mo [13:44]
<openstackgerrit> Sorin Sbârnea proposed zuul/zuul master: Document local testing  https://review.opendev.org/c/zuul/zuul/+/766460 [13:44]
<zbr|rover> at least I got it working, it also involved a small fix on program-output, got a release yesterday. finally now it displays the failure reason when it does. [13:45]
<zbr|rover> used to be very cryptic [13:45]
<fungi> one way to avoid updating distro versions is to say something like platform:dpkg !platform:ubuntu-bionic [13:46]
<fungi> that strikes a bit of a balance, in that the zuul jobs won't try to install the yarnpkg package on ubuntu-bionic, if someone tries running bindep locally on a debian-stretch system it will perpetually report they need yarnpkg, but that's not the end of the world (at least serves as a reminder they need to go find yarn from somewhere, they can even satisfy it from stretch-backports) [13:48]
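
The selector fungi suggests combines one positive and one negated profile. A minimal Python sketch of how such a rule could be evaluated follows. It is a simplified model for illustration, not bindep's implementation: it assumes a rule applies when every positive selector is active and no negated selector is active, and real bindep may treat several positive selectors differently. The helper names and the debian-stretch and ubuntu-focal profile sets are assumptions.

```python
def parse_selectors(text):
    """Split a selector string such as 'a !b' into (positive, negative) sets."""
    positive, negative = set(), set()
    for token in text.split():
        if token.startswith("!"):
            negative.add(token[1:])
        else:
            positive.add(token)
    return positive, negative


def rule_applies(selector_text, active_profiles):
    """Simplified model: all positives active, no negated selector active."""
    positive, negative = parse_selectors(selector_text)
    return positive <= active_profiles and not (negative & active_profiles)


SELECTOR = "platform:dpkg !platform:ubuntu-bionic"

# Assumed profile sets for three hosts (illustrative only).
bionic = {"platform:dpkg", "platform:ubuntu", "platform:ubuntu-bionic"}
focal = {"platform:dpkg", "platform:ubuntu", "platform:ubuntu-focal"}
stretch = {"platform:dpkg", "platform:debian", "platform:debian-stretch"}

on_bionic = rule_applies(SELECTOR, bionic)    # yarnpkg is not requested
on_focal = rule_applies(SELECTOR, focal)      # yarnpkg is requested
on_stretch = rule_applies(SELECTOR, stretch)  # yarnpkg is still requested
```

Under this model the rule is skipped on ubuntu-bionic only, which matches the trade-off described above: a debian-stretch host keeps being told it needs yarnpkg.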
<zbr|rover> fungi: there is a tox-bindep plugin I wrote which could tell people what they miss before running the job, but i am not sure you will want me to enable it by default. [13:54]
<fungi> well, if people choose to install tox-bindep it'll do the thing they install it to do, right? [14:05]
*** ajitha has quit IRC [14:24]
<openstackgerrit> Tristan Cacqueray proposed zuul/zuul-jobs master: ensure-kubernetes: remove dns resolvers hack  https://review.opendev.org/c/zuul/zuul-jobs/+/784427 [14:37]
<openstackgerrit> Tristan Cacqueray proposed zuul/zuul-operator master: Remove command args override  https://review.opendev.org/c/zuul/zuul-operator/+/784181 [14:38]
*** jangutter_ has joined #zuul [14:40]
*** jangutte_ has joined #zuul [14:41]
*** jangutter has quit IRC [14:42]
*** jangutter_ has quit IRC [14:45]
*** sanjayu__ has quit IRC [14:51]
<zbr|rover> fungi: yes. that plugin runs bindep before tox commands. it works even if you install it alongside tox, no need to touch tox.ini file. [14:58]
<zbr|rover> it may prove annoying if bindep.txt files were not written for the platform you are running on. [14:59]
*** zbr|rover is now known as zbr [15:29]
<openstackgerrit> Jeremy Stanley proposed zuul/zuul-jobs master: Document algorithm var for remove-build-sshkey  https://review.opendev.org/c/zuul/zuul-jobs/+/783988 [15:30]
*** jfoufas1 has quit IRC [15:30]
<Shrews> fungi, et al.: that change ^ reminds me to remind you all that a new ansible 2.11 feature will allow validating role arguments at run time (https://docs.ansible.com/ansible-core/devel/user_guide/playbooks_reuse_roles.html#role-argument-validation). So the things you have in README now for documentation might be better served by moving them to `meta/main.yml` at some point.  That would also allow `ansible-doc` to display the role docs. [15:34]
<Shrews> just fyi. totally not required [15:34]
<fungi> Shrews: oh neat! [15:35]
<corvus> structured docs are great :) [15:35]
<zbr> finally! i was dreaming about that for a very long time. [15:36]
<zbr> it should not be very hard to migrate the docs from zuul-jobs to use the newer spec [15:37]
<corvus> probably automatable [15:41]
<corvus> i think the first step would be to get the new-style docs to render in sphinx so that https://zuul-ci.org/docs/zuul-jobs/ works [15:42]
*** ykarel is now known as ykarel|away [15:43]
*** ykarel|away has quit IRC [15:53]
*** hashar has quit IRC [15:58]
<zbr> ansible community team will need to do the same for collections [16:12]
*** sshnaidm is now known as sshnaidm|afk [16:20]
*** jcapitao is now known as jcapitao_afk [16:21]
<openstackgerrit> Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu  https://review.opendev.org/c/zuul/zuul-jobs/+/765177 [16:37]
*** hamalq has joined #zuul [16:40]
<tristanC> it seems like `use-buildset-registry` somehow disables the etcd service started by `ensure-kubernetes`, resulting in a broken cluster ( Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused ) [16:50]
<openstackgerrit> Merged zuul/zuul master: Use load-branch with trusted repository  https://review.opendev.org/c/zuul/zuul/+/762354 [16:51]
<tristanC> or perhaps it is something else, the first visible issue happens when the role is being applied, it causes `pkg/osutil: received terminated signal, shutting down...` [17:00]
<avass> tristanC: use-buildset-registry restarts the docker daemon so maybe that messes with it somehow [17:07]
*** jangutter has joined #zuul [17:09]
*** jangutte_ has quit IRC [17:12]
<openstackgerrit> Tristan Cacqueray proposed zuul/zuul-operator master: Remove command args override  https://review.opendev.org/c/zuul/zuul-operator/+/784181 [17:21]
<tristanC> avass: maybe, it's odd though because the api server does work for some requests after the docker restart.. i've added a retry loop in ^ [17:22]
*** nils has quit IRC [17:43]
*** jcapitao_afk has quit IRC [17:46]
*** jcapitao_afk has joined #zuul [17:48]
<corvus> clarkb, fungi, tobiash, swest: erm, i think i caught the memleak: https://i.imgur.com/obJIexi.png  --  it is related to semaphores, but it's not because of the layout reference.  if we removed the layout reference it's true we wouldn't leak layouts anymore (and wouldn't leak as much memory), but the real issue is that we're leaking semaphorehandlers because they inherit from ZooKeeperBase and therefore the [17:53]
<corvus> zk client keeps references to them.  so we need to clean those up.  if we do that, the layouts should get cleaned up too. [17:53]
<fungi> aha! [17:53]
<corvus> it's actually such a well-behaved leak, i got that from the quick-start setup where i was prototyping the next data collection.  so should be really easy to verify the fix. [17:54]
*** jcapitao_afk has quit IRC [17:55]
<clarkb> corvus: it's the onconnect and ondisconnect method registration that keeps the refs alive? [17:55]
<corvus> ya [17:55]
<clarkb> cool, I'm glad that got tracked down. Any idea why that didn't show up in the investigation yesterday? [17:56]
<corvus> i think unlike other zk base objects, we discard the semaphorehandler each time we reconfigure, so it's not really long-lived [17:56]
<corvus> clarkb: we just never completed the objgraph search because the search space was too big [17:56]
<clarkb> ah [17:56]
<clarkb> if we are going to do a big zuul restart to land that fix getting in https://review.opendev.org/c/zuul/zuul/+/784142 as part of it is likely a good idea too (assuming people agree with my analysis on that change) [17:57]
<corvus> clarkb: i will look at that more in depth tonight/tomorrow and make sure to approve if it hasn't been, but just from reading the 2 lines of code added, lgtm :) [17:58]
<corvus> i've got to task switch away for the rest of the day [17:58]
<clarkb> ok [17:59]
<clarkb> would it be helpful if someone looked at fixing the leak too? [17:59]
<fungi> there's no huge hurry for opendev's sake at least... looking at our current memory consumption i doubt we'll get in trouble again until monday at the earliest [18:00]
<corvus> i'm not gonna say no, but i'm planning on doing it tonight/tomorrow :) [18:01]
<clarkb> ok [18:01]
<clarkb> I'm probably not going to jump right into that as from memory it isn't quite clear to me how one would signal that a semaphorehandler was done being used. Maybe a call at the old code that moved the old semaphore forward to unregister the current semaphore from the zk client? [18:02]
<clarkb> actually ya that might be a simple thing to write, I'll give it a quick go and others can feel free to replace it or update it with something better [18:11]
*** ykarel|away has joined #zuul [18:13]
<openstackgerrit> Tristan Cacqueray proposed zuul/zuul-operator master: Update operator-framwork to v1.4.2  https://review.opendev.org/c/zuul/zuul-operator/+/784457 [18:20]
<corvus> clarkb: i was going to look into whether it needs to inherit from zk base at all [18:20]
<openstackgerrit> Clark Boylan proposed zuul/zuul master: Fix SemaphoreHandler leak  https://review.opendev.org/c/zuul/zuul/+/784458 [18:22]
<clarkb> corvus: oh that is a good point, it does seem like it doesn't use the on connect and on disconnect stuff (yet?) [18:23]
<clarkb> anyway feel free to replace that change or push new patchsets to it. Won't bother me at all [18:23]
* clarkb lunches [18:24]
*** ykarel|away has quit IRC [18:25]
<clarkb> ate a quick lunch and confirmed that SemaphoreBase seems to only use self.kazoo_client. Let me see about changing this up a little [18:43]
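
The leak pattern discussed above (a long-lived client holding callbacks registered by short-lived objects) can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: `FakeZKClient`, `Handler`, `register`, `unregister` and `close` are hypothetical names, not Zuul's or kazoo's API, and the fix shown (explicit unregistration) is one possible approach rather than what the proposed change does.

```python
import gc
import weakref


class FakeZKClient:
    """Hypothetical stand-in for a long-lived ZooKeeper client wrapper."""

    def __init__(self):
        self.on_connect_listeners = []

    def register(self, callback):
        self.on_connect_listeners.append(callback)

    def unregister(self, callback):
        self.on_connect_listeners.remove(callback)


class Handler:
    """Hypothetical short-lived object, recreated on each reconfiguration."""

    def __init__(self, client, layout):
        self.client = client
        self.layout = layout  # stays alive for as long as the handler does
        # The bound method holds a strong reference back to this handler.
        client.register(self._on_connect)

    def _on_connect(self):
        pass

    def close(self):
        # Drop the client's reference so the handler can be collected.
        self.client.unregister(self._on_connect)


def is_collected(client, cleanup):
    """Create a handler, discard it, and report whether it was freed."""
    handler = Handler(client, layout=object())
    ref = weakref.ref(handler)
    if cleanup:
        handler.close()
    del handler
    gc.collect()
    return ref() is None


client = FakeZKClient()
leaked = not is_collected(client, cleanup=False)  # callback keeps it alive
freed = is_collected(client, cleanup=True)        # unregistering releases it
```

Not registering callbacks at all, for example by not inheriting from the base class that registers them, would avoid the reference in the first place.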
<fungi> https://zuul-ci.org/docs/zuul/reference/developer/specs/circular-dependencies.html is implemented now, right? [18:45]
<fungi> also is https://zuul-ci.org/docs/zuul/reference/developer/specs/kubernetes-operator.html done? seems like we have people using it, or trying to [18:46]
<fungi> also i suppose a point of process... do we remove completed specs after the implementations merge, or after they appear in a tagged release? [18:46]
<fungi> i can't remember if there are still bits of https://zuul-ci.org/docs/zuul/reference/developer/specs/tenant-scoped-admin-web-API.html which haven't merged yet [18:49]
<openstackgerrit> Clark Boylan proposed zuul/zuul master: Fix SemaphoreHandler leak  https://review.opendev.org/c/zuul/zuul/+/784458 [19:01]
<openstackgerrit> Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu  https://review.opendev.org/c/zuul/zuul-jobs/+/765177 [19:07]
*** jangutter_ has joined #zuul [19:32]
*** jangutter has quit IRC [19:35]
<openstackgerrit> Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu  https://review.opendev.org/c/zuul/zuul-jobs/+/765177 [20:03]
<openstackgerrit> Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu  https://review.opendev.org/c/zuul/zuul-jobs/+/765177 [20:20]
*** nhicher_ has quit IRC [20:21]
*** holser has joined #zuul [20:34]
<openstackgerrit> Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu  https://review.opendev.org/c/zuul/zuul-jobs/+/765177 [20:39]
<mordred> pabelanger: ^^ left a drive-by comment [20:52]
*** decimuscorvinus has quit IRC [21:42]
*** decimuscorvinus has joined #zuul [21:42]
<clarkb> https://review.opendev.org/c/zuul/zuul/+/784458 did pass testing [21:44]
<clarkb> harder to say it absolutely fixes the leaks but at least that is a good sign [21:44]
<corvus> clarkb: ++ i'll test it locally tonight [21:50]
*** pabelanger has joined #zuul [21:50]
<pabelanger> so, I finally got https://review.opendev.org/c/zuul/zuul-jobs/+/765177 to pass. I think we have enough testing in place to ensure no breaks with ensure-podman [21:51]
<pabelanger> we've been hitting some random podman issues in zuul.a.c, that's mostly why I started working on it again [21:51]
<clarkb> guillaumec: I'm trying to catch up on https://review.opendev.org/c/opendev/gear/+/784083 and I guess the issue is you cannot use select/poll/etc on ssl sockets unless they are non blocking? [23:20]
<clarkb> and that change switches to a non blocking setup. I think that will work due to how the blocking variant was already only handling a packet at a time [23:20]
<clarkb> (basically blocking vs non blocking was pretty transparent to end users as they just got gearman jobs/events) [23:21]
<clarkb> guillaumec: I wonder if we even need Connection anymore [23:26]
<clarkb> seems the server side was already nonblocking and now the client side is being switched to non blocking too [23:26]
<clarkb> and worker uses baseclient too so it is converted as well [23:26]
<clarkb> if we are worried about it causing user visible behavior differences we could switch on whether or not ssl is used [23:32]
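
As background to the select/poll question above: Python's `ssl` documentation notes that a TLS socket can hold decrypted data that `select()` cannot see, so a non-blocking reader should drain `recv()` until it raises `ssl.SSLWantReadError` before waiting again. The sketch below illustrates that pattern only. It is not the gear implementation; `read_available` and `FakeSocket` are hypothetical names, and the fake socket exists purely so the example runs without a network.

```python
import ssl


def read_available(sock, bufsize=4096):
    """Drain a non-blocking (possibly TLS) socket without waiting.

    Returns (data, closed). Stops when the socket has nothing more to
    deliver right now; the caller then goes back to select()/poll().
    """
    chunks = []
    closed = False
    while True:
        try:
            data = sock.recv(bufsize)
        except (ssl.SSLWantReadError, BlockingIOError):
            break  # no complete data available yet
        if not data:
            closed = True  # peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks), closed


class FakeSocket:
    """Hypothetical test double that replays a scripted list of recv results."""

    def __init__(self, events):
        self.events = list(events)

    def recv(self, bufsize):
        event = self.events.pop(0)
        if isinstance(event, Exception):
            raise event
        return event


pending = read_available(FakeSocket([b"ab", b"cd", ssl.SSLWantReadError()]))
finished = read_available(FakeSocket([b"x", b""]))
```

A real client would also need to handle `ssl.SSLWantWriteError` and partial protocol packets, which this sketch leaves out.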
*** jangutter_ has quit IRC [23:38]
*** tosky has quit IRC [23:40]
<corvus> clarkb: https://review.opendev.org/784458 fixes the leak in my simple local test; i think we should get it merged in prep for the next restart.  <-- zuul-maint [23:41]
*** rlandy has quit IRC [23:42]
<corvus> clarkb: on 784142 did you eventually get that test to fail before your fix? [23:43]
<clarkb> https://review.opendev.org/c/zuul/zuul/+/784165 and https://review.opendev.org/c/zuul/zuul/+/784142 are the other two fixes that might be good to get in too [23:43]
<clarkb> corvus: yes the key was the content of the event [23:44]
<clarkb> or repo_state is what it calls it I think [23:44]
<clarkb> in particular I was passing in a refs/changes/xy/abxy: $rev entry in the repo state but the executors never fetch the changes in that way [23:44]
<clarkb> they do a fetch_head merged into target ref of refs/zuul/uuid [23:45]
<clarkb> anyway because they never have that ref it tripped the first condition always in testing. When I dropped that after understanding repo_state better the test did what I expected [23:45]
<clarkb> one thing that recently occurred to me is how periodic jobs would work if they just check out master too [23:48]
<clarkb> it is certainly odd how we haven't seemed to notice before but all the behavior we observed directly on the executors seemed to point to this [23:48]
<corvus> we almost never ff? :) [23:50]
<clarkb> could be [23:50]
<corvus> clarkb: lgtm; i'd like tobiash to take a look at it; do you think it can wait until next week, or should we merge now and ask for a retro-review? [23:50]
<clarkb> it can probably wait [23:51]
<clarkb> considering we only just noticed the behavior that seems to have been there all along [23:51]
<corvus> clarkb: i'm pretty weakly -1 on 784165 [23:56]
<clarkb> oh I didn't realize that we were also passing exception objects outside of the except: context before [23:58]
<corvus> i believe we thread-shifted them [23:58]
<clarkb> I agree that unless something is fundamentally odd this should be fine as is [23:58]
<clarkb> ah [23:58]
