Wednesday, 2021-06-09

mnaserjust switched it to run as 1000100:02
mnaserand ill see how that works00:02
*** josefwells has joined #zuul00:04
mnaserclarkb: that actually did the trick!00:10
mnaseri believe we should USER zuul and USER nodepool in the docker images imho00:10
corvustristanC: ^ that going to be okay with openshift?00:11
*** josefwells__ has quit IRC00:11
clarkbI think if you use USER you can still override the user00:13
clarkbbut if you don't use USER then it will default to root00:13
ianwi don't know but i'm getting "dib won't work with that" vibes :)00:13
mnaserDoesn’t it work already with nodepool, clarkb?00:14
clarkbuser: nodepool we set that on nodepool-builder containers with docker-compose00:16
clarkbso ya unless there is some weird not well documented behavior difference between explicitly setting USER and overriding user at runtime I would expect it to be ok00:16
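The precedence clarkb describes above (a runtime override wins over the image's USER, and an image without USER falls back to root) can be sketched as a small function. This is only an illustrative model; `effective_user` and its parameters are hypothetical names, not part of Docker or Zuul.

```python
from typing import Optional


def effective_user(image_user: Optional[str], runtime_user: Optional[str]) -> str:
    """Model which user a container process would start as.

    image_user:   value of a USER instruction in the image, if any (assumption).
    runtime_user: value of a runtime override such as `docker run --user`
                  or a compose `user:` key, if any (assumption).
    """
    if runtime_user:
        # A runtime override takes precedence over whatever the image sets.
        return runtime_user
    if image_user:
        # Otherwise the USER baked into the image applies.
        return image_user
    # No USER and no override: the process runs as root.
    return "root"
```

Under this model, adding USER zuul to the image would only change the default; a deployment that already passes its own user would be unaffected.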
opendevreviewMatthieu Huin proposed zuul/zuul master: Test zuul-client console-stream  https://review.opendev.org/c/zuul/zuul/+/79485400:20
mordredmnaser: we used to set user in places and removed it00:20
clarkbI think the problem had to do with hardcoded uids? but I think we are doing that now anyway just without the helper mode for people who aren't randomizing those?00:21
clarkbhrm the current Dockerfile doesn't seem to have USER in its history anywhere00:23
clarkbfor zuul I mean00:23
fungiwe must do some user mapping because the host system sees zuul:zuuld ownership on files written by the container processes on bindmounted dirs00:25
clarkbfungi: I think we hardcode the uid in ansible for things that map across00:26
opendevreviewMerged zuul/nodepool master: Update dib to 3.12.0  https://review.opendev.org/c/zuul/nodepool/+/79541600:36
*** josefwells has quit IRC01:29
*** bhavikdbavishi has joined #zuul04:19
*** marios has joined #zuul05:10
opendevreviewSimon Westphahl proposed zuul/zuul master: Refactor config/tenant (re-)loading  https://review.opendev.org/c/zuul/zuul/+/79526305:20
*** zenkuro has quit IRC06:10
*** bhavikdbavishi has quit IRC06:40
*** bhavikdbavishi has joined #zuul06:40
*** hashar has joined #zuul06:58
*** jpena|off is now known as jpena07:31
*** bhavikdbavishi has quit IRC07:51
*** tosky has joined #zuul07:53
opendevreviewFelix Edel proposed zuul/zuul master: WIP: Lock/unlock nodes on executor server  https://review.opendev.org/c/zuul/zuul/+/77461008:14
opendevreviewFelix Edel proposed zuul/zuul master: WIP: Lock/unlock nodes on executor server  https://review.opendev.org/c/zuul/zuul/+/77461008:24
avass[m]mnaser:  see: https://zuul-ci.org/docs/zuul-jobs/policy.html#preservation-of-owner-between-executor-and-remote08:28
avass[m]mnaser: that's why that rsync fails, `invalid argument` isn't very helpful when what it means is that the user doesn't exist08:29
opendevreviewGuillaume Chauvel proposed zuul/zuul master: [DNM] dummy test  https://review.opendev.org/c/zuul/zuul/+/79550108:49
opendevreviewSimon Westphahl proposed zuul/zuul master: Make reporting asynchronous  https://review.opendev.org/c/zuul/zuul/+/69125308:58
*** rpittau|afk is now known as rpittau09:22
*** ysandeep|ruck has joined #zuul09:25
opendevreviewMerged zuul/zuul-client master: Add build-info subcommand  https://review.opendev.org/c/zuul/zuul-client/+/75107009:26
ysandeep|ruck#zuul hello folks o/ top jobs in tripleo gate queue are stuck..09:27
ysandeep|ruckexample: https://zuul.opendev.org/t/openstack/status#794634 and other changes under the chain09:27
ysandeep|ruckhttps://zuul.opendev.org/t/openstack/stream/f0e55388ce7d4cf7b989315c18ca5d3d?logfile=console.log - is already at --- END OF STREAM ---09:27
ysandeep|ruckcould someone please help with this?09:28
tobiash[m]corvus, clarkb I wonder if this is a regression introduced by one of the latest changes (maybe a lost result event?) ^ let me know if I can help with the analysis09:40
opendevreviewMatthieu Huin proposed zuul/zuul master: Test zuul-client console-stream  https://review.opendev.org/c/zuul/zuul/+/79485409:51
*** hashar has quit IRC09:52
opendevreviewSimon Westphahl proposed zuul/zuul master: Make reporting asynchronous  https://review.opendev.org/c/zuul/zuul/+/69125310:16
opendevreviewSimon Westphahl proposed zuul/zuul master: Make reporting asynchronous  https://review.opendev.org/c/zuul/zuul/+/69125310:16
opendevreviewGuillaume Chauvel proposed zuul/zuul master: [DNM] dummy test  https://review.opendev.org/c/zuul/zuul/+/79550110:32
*** ysandeep|ruck is now known as ysandeep|afk10:33
opendevreviewMatthieu Huin proposed zuul/zuul master: Test zuul-client console-stream  https://review.opendev.org/c/zuul/zuul/+/79485410:42
*** ysandeep|afk is now known as ysandeep|ruck11:06
opendevreviewMohammed Naser proposed zuul/zuul-helm master: Run Zuul under the 'zuul' user.  https://review.opendev.org/c/zuul/zuul-helm/+/79551811:11
mnaserianw: no idea how, no idea why -- tried fedora-container this morning and .. "diskimage_builder.element_dependencies.MissingElementException: Element 'block-device' not found"11:26
mnaserurgh, nevermind11:28
mnaseri forgot to parent to my `base` so was missing vm element which brought in block-device11:29
mnaser"Error: 'overlay' is not supported over overlayfs, a mount_program is required: backing file system is unsupported for this graph driver" weee11:31
mnaserlooks like /var/lib/containers mount time11:34
avass[m]I'm currently debugging why zuul seemingly isn't receiving events from one connection. The scheduler can receive events with `gerrit stream-events`, and it was able to load configuration for projects from that connection. It's just not triggering anything and not logging any events. Any ideas how that could happen?11:34
avass[m]Those projects worked before restarting yesterday to pick up the layout caching bugfixes11:35
tristanCavass[m]: perhaps there is a lock in /zuul/event zk, we had to `zkCli.sh deleteall /zuul/events` to recover from that situation11:36
avass[m]tristanC: I was looking at that and thinking that it may be enough to delete the keys for that connection?11:36
avass[m]I'm gonna see if that works11:37
mnaserexcellent.  mounting /var/lib/containers did the trick and i got a containerfile image built11:39
avass[m]mnaser: did you see my comment earlier?11:41
mnaseravass[m]: yes!  wrt the chown invalid argument thing11:41
avass[m]ok, just making sure :)11:41
mnaseravass[m]: i think it was you that uses a bunch of zuul+gitlab right?11:42
avass[m]mnaser: more like trying to ;)11:42
avass[m]mnaser:  guillaumec  uses gitlab actively I believe11:42
mnaseravass[m]: ah i see, guillaumec fyi not sure if you are an ultimate user but this might be interesting to look at -- https://docs.gitlab.com/ee/user/project/merge_requests/approvals/#notify-external-services11:43
*** jpena is now known as jpena|lunch11:44
avass[m]tristanC: did you couple that with a restart?11:55
tristanCavass[m]: we do it before the restart11:56
avass[m]shouldn't locks be deleted when the zk session is lost?11:57
tristanCavass[m]: it should, but in our case we needed it to happen faster than zk timeout, so deleting all the zk events node before a restart did the trick11:58
guillaumecmnaser: thanks, premium only11:59
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab status context  https://review.opendev.org/c/zuul/zuul/+/79552512:00
avass[m]tristanC: oh so something like: the lock is still present so the scheduler sees that and thinks there is another scheduler that is managing those events, or does restarting not create a new session somehow? In my case the lock was still there so I'd expect the current scheduler to be managing the events for that connection.12:01
avass[m]but I'll try and see if deleting /zuul/events before restarting helps later.12:02
avass[m]and I would have expected a leader election to occur if the lock is deleted, but maybe that didn't work in my case because I deleted `/zuul/events/connection/<connection>`12:04
tristanCavass[m]: i haven't had time to dig into that issue, it was speculated that the scheduler was not stopped cleanly, but even with https://review.opendev.org/c/zuul/zuul/+/794035 that did not help12:04
avass[m]tristanC: ok, I'll continue digging a bit12:05
*** ysandeep|ruck is now known as ysandeep|mtg12:07
opendevreviewGuillaume Chauvel proposed zuul/zuul master: [DNM] dummy test  https://review.opendev.org/c/zuul/zuul/+/79550112:11
opendevreviewMatthieu Huin proposed zuul/zuul master: Test zuul-client console-stream  https://review.opendev.org/c/zuul/zuul/+/79485412:31
opendevreviewGuillaume Chauvel proposed zuul/zuul master: [DNM] dummy test  https://review.opendev.org/c/zuul/zuul/+/79550112:39
*** felixedel[m] has joined #zuul12:40
opendevreviewMatthieu Huin proposed zuul/zuul-client master: builds: fix API queries for boolean parameters, make tenant optional  https://review.opendev.org/c/zuul/zuul-client/+/79455312:40
*** felixedel[m] is now known as felixedel12:41
*** felixedel has joined #zuul12:41
opendevreviewGuillaume Chauvel proposed zuul/zuul master: [DNM] dummy test  https://review.opendev.org/c/zuul/zuul/+/79550112:43
opendevreviewBenjamin Schanzel proposed zuul/zuul-jobs master: Add a meta log upload role with a failover mechanism  https://review.opendev.org/c/zuul/zuul-jobs/+/79533612:48
*** jpena|lunch is now known as jpena12:49
opendevreviewGuillaume Chauvel proposed zuul/zuul master: [DNM] dummy test  https://review.opendev.org/c/zuul/zuul/+/79550113:02
opendevreviewGuillaume Chauvel proposed zuul/zuul master: [DNM] dummy test  https://review.opendev.org/c/zuul/zuul/+/79550113:08
*** Simon[m]1 has joined #zuul13:10
*** Simon[m]1 is now known as swest[m]13:17
*** swest[m] has quit IRC13:19
*** swest[m] has joined #zuul13:19
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab quick-start  https://review.opendev.org/c/zuul/zuul/+/79554013:33
opendevreviewMatthieu Huin proposed zuul/zuul-client master: Add buildsets, buildset-info to subcommands  https://review.opendev.org/c/zuul/zuul-client/+/75290913:35
opendevreviewMatthieu Huin proposed zuul/zuul-client master: Add console-stream subcommand  https://review.opendev.org/c/zuul/zuul-client/+/75123813:37
opendevreviewMatthieu Huin proposed zuul/zuul-client master: Add change-status subcommand  https://review.opendev.org/c/zuul/zuul-client/+/75983813:38
*** ysandeep|mtg is now known as ysandeep13:43
*** Shrews has joined #zuul13:47
*** josefwells has joined #zuul14:00
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab quick-start  https://review.opendev.org/c/zuul/zuul/+/79554014:11
*** tosky has quit IRC14:16
*** tosky has joined #zuul14:18
*** zenkuro has joined #zuul14:23
*** hashar has joined #zuul14:57
*** ysandeep is now known as ysandeep|away15:50
*** ysandeep|away has quit IRC16:06
*** marios is now known as marios|out16:15
*** rpittau is now known as rpittau|afk16:16
*** marios|out has quit IRC16:23
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab quick-start  https://review.opendev.org/c/zuul/zuul/+/79554016:27
*** jpena is now known as jpena|off16:38
opendevreviewMerged zuul/zuul master: Add support for vote-deleted event  https://review.opendev.org/c/zuul/zuul/+/79157216:44
opendevreviewJames E. Blair proposed zuul/zuul master: WIP: fix unknown job detection  https://review.opendev.org/c/zuul/zuul/+/79559717:02
corvustobiash, fungi, clarkb, frickler: ^ that should have caught the stuck job issue; let me see if i can make a test for that17:02
tobiash[m]great :)17:04
corvusto summarize from #opendev -- the executor VM was terminated by the cloud, geard probably detected that and returned a WORK_FAIL packet to the scheduler, but since we're in a transition period between geard and zk, and result events are in zk, the scheduler ignored the WORK_FAIL packet (since that's a result event).  the lost job cleanup process should have caught that and submitted a real "LOST" result event to the scheduler, but that code has a bug17:05
corvusi think it's best to just fix the lost cleanup event handling, and rely on that for catching this case, rather than to try to handle the WORK_FAIL case17:05
corvuswe only need this to work for a little while longer, then we'll be putting build requests in zk and it won't be applicable any more17:06
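The lost-build cleanup corvus refers to can be pictured roughly as follows: compare the builds the scheduler still considers running against the builds the job system actually knows about, and report the difference as "LOST". This is a minimal sketch under that assumption, not Zuul's actual code; `BuildResult` and `find_lost_builds` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class BuildResult:
    """A result event for a single build (hypothetical structure)."""
    build_id: str
    result: str


def find_lost_builds(scheduler_running: Iterable[str],
                     known_to_workers: Set[str]) -> List[BuildResult]:
    """Return a LOST result for every build the scheduler is waiting on
    that no worker knows about any more (for example because the
    executor VM disappeared)."""
    missing = set(scheduler_running) - known_to_workers
    # Sort so the output order is deterministic.
    return [BuildResult(build_id, "LOST") for build_id in sorted(missing)]
```

With a periodic check like this in place, the scheduler does not need to interpret the failure packet itself; the stuck build is eventually reported as lost either way.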
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab quick-start  https://review.opendev.org/c/zuul/zuul/+/79554017:24
opendevreviewMohammed Naser proposed zuul/zuul-jobs master: Switch jobs to use fedora-34 nodes  https://review.opendev.org/c/zuul/zuul-jobs/+/79563618:15
opendevreviewMohammed Naser proposed zuul/zuul-operator master: Switch jobs to use fedora-34 nodes  https://review.opendev.org/c/zuul/zuul-operator/+/79563818:16
opendevreviewMohammed Naser proposed zuul/nodepool master: Switch fedora-latest to use fedora-34  https://review.opendev.org/c/zuul/nodepool/+/79564218:22
opendevreviewJames E. Blair proposed zuul/zuul master: Fix unknown job detection  https://review.opendev.org/c/zuul/zuul/+/79559718:34
corvuszuul-maint: ^ would you please expedite the review of that?  it fixes a regression opendev saw in production; i'd like to merge that asap and make a release18:35
* clarkb rereviews18:38
clarkbcorvus: I have a couple of questions, but I think you can probably approve it if my questions are not critical18:47
opendevreviewJames E. Blair proposed zuul/zuul master: Fix unknown job detection  https://review.opendev.org/c/zuul/zuul/+/79559718:51
corvusclarkb: replied.  i ran flake8 locally and it caught two nits; that's a fixed version18:51
corvusi think we can carry over all the other +2s18:51
corvus(flake8 caught a comment whitespace issue and an unused variable)18:52
clarkbcorvus: that helps, thanks18:52
corvusthanks all; assuming that merges, i think we can go ahead and cut a release with that and the 2 layout fixes18:54
* mnaser is going to be hacking curl/curl to run inside zuul to propose replacing travis ci18:55
* mnaser deciphering https://github.com/curl/curl/blob/master/.travis.yml18:55
corvusmnaser: oh are they thinking about taking you up on your offer?18:57
*** timburke_ is now known as timburke18:58
mnasercorvus: "We're not picky, we can move to any service that interfaces with github decently. It's more about someone doing the actual heavy lifting of converting travis jobs to other-service-job."18:58
mnaserand ran out of credits at travis already in the first few days18:58
mnaserso i figured if there's a ci service and jobs, maybe we can convince them to at least check for now.. not gate18:58
avass[m]mnaser: oh sounds interesting, I can help out when I have spare time if needed19:00
mnaseravass[m]: awesome.  i will get a tenant running shortly and make vexxhost/curl run inside of it for now19:00
mnaserthe jobs seem straight forward19:00
corvusmnaser: neat; most of that travis file looks pretty straightforward; not sure what the cache is for19:00
mordredcorvus: it's for caching things19:00
avass[m]I've done a tiny bit of travisCI when I didn't have my zuul instance set up yet. Hope i remember something :)19:01
mordredcorvus: (I believe that's telling travis to save those dirs when the job is done and to start the next job with the contents of what was previously saved)19:01
mnaseryeah its saving state across jobs19:01
mordredI expect their build process clones those from somewhere19:01
corvusmnaser: i'd totally do most of that as a big shell task just to get things going; then maybe split it up a bit for improved ui19:01
avass[m]corvus: I think it's caching directories between builds and nothing more complicated than that, github actions has something similar and was the inspiration for zuul-cache19:02
corvusmordred: so maybe that happens in a makefile?19:02
corvusmy confusion was i didn't see where it was getting populated19:02
mnaseryes it archives post-job-completion those folders19:02
corvusi was expecting "curl http://something/wolfssl" but if that happens in a makefile, then it makes sense19:02
mnaserand loads if they dont exist pre-job-start, but yeah19:02
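The cache behaviour described above (archive the listed directories after the job, load them before the next job if they are not already there) can be sketched with plain directory copies. This is an illustration only, not how Travis or any Zuul role implements it; `restore_cache`, `save_cache`, and the directory layout are assumptions.

```python
import shutil
from pathlib import Path
from typing import Iterable


def restore_cache(cache_root: Path, workspace: Path, names: Iterable[str]) -> None:
    """Before the job: copy each cached directory into the workspace,
    but only if the workspace does not already contain it."""
    for name in names:
        src, dst = cache_root / name, workspace / name
        if src.is_dir() and not dst.exists():
            shutil.copytree(src, dst)


def save_cache(cache_root: Path, workspace: Path, names: Iterable[str]) -> None:
    """After the job: replace each cached directory with the copy
    the job left in the workspace."""
    for name in names:
        src, dst = workspace / name, cache_root / name
        if src.is_dir():
            if dst.exists():
                shutil.rmtree(dst)
            shutil.copytree(src, dst)
```

A build script that only fetches and compiles a dependency when its directory is missing would then skip that work on any run where `restore_cache` found a saved copy.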
mnaseryeah, i MIGHT just do bindep.txt though to cover the apt installs19:03
mnaserbut that might be too much =P19:03
corvusin which case avass's roles could be very helpful; but also, they're probably not needed for a first pass19:03
corvusmnaser: a change to move all those installs to bindep might look really nice and be a good introduction to bindep for them :)19:04
avass[m]mnaser: ansible's apt module is also very similar otherwise19:04
corvuscause i bet a bunch of bindep environments would fit very well with how they're making those package lists19:04
avass[m]yeah19:04
mordredmnaser: you should totally do a bindep.txt file19:05
mordredbecause as you're showing them - you can also show them the wonderful value of being able to machine-specify those things19:05
mordredI'd say do it in a patch series so it could be 2 patches - but github19:05
corvuswell, you can still depends-on :)19:05
mordredcorvus: ooh - good point19:05
mnaserhttps://github.com/curl/curl/blob/master/scripts/travis/script.sh19:06
corvuspersonally, i like the idea of doing the most straightforward translation first, followed by "here's how this could be improved"19:06
mnaserit looks like most of the work is already done i guess19:06
mnaserpass env variables and call that script..19:06
mnaserhttps://github.com/curl/curl/blob/master/scripts/travis/before_script.sh19:06
mnaseryou see git clones, i see required-projects .. look at all the potential19:06
mnaser:p19:06
* corvus lunches19:07
avass[m]mnaser: may wanna keep the git clones and point the remote  to a required-project in ci in case they're using that locally19:08
avass[m]though that does not look like something that is run locally19:08
mordredrequired-projects from curl would be a great way to start spidering out into those sibling projects19:09
mordred"check out what you could do with a depends-on between curl and openssl..."19:09
mnaseryeah, i'm aiming to be like.. the very basic form19:10
mnaserand then once they've adopted slowly help them use some more features19:10
avass[m]++19:10
mordred++19:11
mordredI think we all see a bunch of clear opportunities for improvement based on their current scripts19:11
mnaserbtw, just a heads up19:18
mnaseri dont have time to look into this now, but looks like a common thing on shutdown signal https://www.irccloud.com/pastebin/jtGaBjFu/19:18
opendevreviewAlbin Vass proposed zuul/zuul master: Remove argument from gerritwatcher cancel  https://review.opendev.org/c/zuul/zuul/+/79567419:20
mnaserrecord time avass[m] :)19:20
avass[m]mnaser: ^19:20
mnaserhttps://curl.zuul.vexxhost.dev/status19:20
avass[m]tristanC: maybe that's why gracefully stopping the scheduler wasn't working for you?19:21
tristanCavass[m]: perhaps, though we don't have the traceback19:30
*** hashar has quit IRC19:39
mnaseraw man, it's happening again19:47
mnaserhttps://www.irccloud.com/pastebin/26bSoijc/19:48
mnaser> Error: Project github.com/vexxhost/zuul-base-jobs does not have the default branch master19:48
mnaserhttps://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L1362-L139919:49
mnaseri guess i need to set `default-branch`19:51
avass[m]mnaser:  heh, I started working on something and pretty much came up with the same config you did. except that I put package installation in a pre-run :)19:58
mnaseravass[m]: that actually sounds like an even better idea19:59
mnaseravass[m]: feel free to submit prs, itll run jobs just fine19:59
avass[m]also I put everything in zuul.d with playbooks in zuul.d/playbooks with a .zuul.ignore, in case they don't want to pollute their top-level directory20:00
opendevreviewMerged zuul/zuul master: Fix unknown job detection  https://review.opendev.org/c/zuul/zuul/+/79559720:00
avass[m]sure20:00
mnaseravass[m]: if you wanna hack at this and i can start from your progress after you're done to avoid doing duplicate work? :P20:00
avass[m]I think you got further than me :)20:02
avass[m]and I'm gonna leave in a bit, it's getting late20:02
avass[m]I can pick it up tomorrow20:03
mnaseravass[m]: cool, ill update you with my progress :)20:03
mnaserill borrow your ideas though :)20:03
opendevreviewJames E. Blair proposed zuul/zuul master: Add checkpoint release note  https://review.opendev.org/c/zuul/zuul/+/79569320:17
corvuszuul-maint: ^ can we also squeeze that in after avass's change so we have a release note for the commit?20:18
corvuss/commit/release/20:21
*** Shrews has quit IRC20:24
clarkbI've approved it20:24
corvusthx!20:24
opendevreviewMerged zuul/zuul master: Remove argument from gerritwatcher cancel  https://review.opendev.org/c/zuul/zuul/+/79567420:46
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab quick-start  https://review.opendev.org/c/zuul/zuul/+/79554021:28
*** zenkuro has quit IRC21:30
*** zenkuro has joined #zuul21:31
opendevreviewMerged zuul/zuul master: Add checkpoint release note  https://review.opendev.org/c/zuul/zuul/+/79569321:40
corvuszuul-maint: commit 9bfdf43f49a952a0c2112258198e421ff7c9b501 (HEAD -> master, tag: 4.5.0, origin/master)21:43
corvushow's that look ?21:43
clarkb9bfdf43f49a952a0c2112258198e421ff7c9b501 is the commit I see and 4.5.0 is the appropriate version. Did you want to restart opendev on that or is the delta small enough (just that bug fix really?)21:45
corvusa couple of bugfixes that opendev isn't running; but avass reported running the layout fix, so it's only the bugfix from today that has seen no production use21:46
corvusgiven that, i'm comfortable with a release now and opendev restart at our convenience, but would be happy to reverse that and do the normal process.21:47
clarkbI'm ok with just going for it. Worst case we do a 4.5.121:48
corvusya21:48
corvusi'll wait a few mins for anyone else to double check that, then push it21:48
fungiyep, that looks sane to me. sorry my availability is unexpectedly spotty since my internet provider has been down for several hours now21:49
corvusokay, 4.5.0 pushed21:50
fungithanks!21:51
mnasercurl update: i've got jobs running right now, it was all running perfectly until job timed out, bumped to 60 minutes and watching now :)22:03
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab quick-start  https://review.opendev.org/c/zuul/zuul/+/79554022:09
avass[m]mnaser: do you have any object storage available? Swift?22:10
mnaseravass[m]: yes we're publishing logs to swift right now22:12
avass[m]mnaser: cool, then caching may be an option to speed things up. Could be enough with a simple ccache that is passed around22:13
mnaseravass[m]: yeah or maybe just start by the simple stuff they used to cache before or something22:14
avass[m]That too :)22:14
* mnaser admits they haven't done as much C in their life to know all about the build stuff :)22:14
avass[m]I've done way too much C while working at an automotive company :)22:15
fungii still think of c as a high-level programming language... scarred in my youth perhaps22:18
*** tosky has quit IRC22:47
opendevreviewGuillaume Chauvel proposed zuul/zuul master: WIP gitlab quick-start  https://review.opendev.org/c/zuul/zuul/+/79554023:04
