Monday, 2021-08-16

opendevreviewIan Wienand proposed zuul/zuul master: web: JobVariant: use @patternfly/react-table  https://review.opendev.org/c/zuul/zuul/+/80447805:53
opendevreviewIan Wienand proposed zuul/zuul master: web: JobVariant: visual cleanups  https://review.opendev.org/c/zuul/zuul/+/80447905:53
opendevreviewIan Wienand proposed zuul/zuul master: web: Nodeset: convert to a treeview  https://review.opendev.org/c/zuul/zuul/+/80448005:53
opendevreviewIan Wienand proposed zuul/zuul master: web: Add release note for Job/JobVariant changes  https://review.opendev.org/c/zuul/zuul/+/80448105:53
opendevreviewIan Wienand proposed zuul/zuul master: web: Use PF4 sizing/spacing css  https://review.opendev.org/c/zuul/zuul/+/80459905:53
opendevreviewIan Wienand proposed zuul/zuul master: web: JobVariant: add icons  https://review.opendev.org/c/zuul/zuul/+/80460005:53
opendevreviewIan Wienand proposed zuul/zuul master: web: JobVariant : convert to DescriptionList  https://review.opendev.org/c/zuul/zuul/+/80460105:53
opendevreviewIan Wienand proposed zuul/zuul master: web: JobVariant : use wide layout for DescriptionList  https://review.opendev.org/c/zuul/zuul/+/80460205:53
*** jpena|off is now known as jpena07:42
*** sshnaidm|pto is now known as sshnaidm10:30
*** sshnaidm is now known as sshnaidm|pto10:31
*** jpena is now known as jpena|lunch11:16
*** dviroel|out is now known as dviroel|ruck11:26
*** jpena|lunch is now known as jpena12:16
corvusianw, clarkb: please see my comment on https://review.opendev.org/80448114:03
opendevreviewBenjamin Schanzel proposed zuul/zuul master: Add tenant name on NodeRequests for Nodepool  https://review.opendev.org/c/zuul/zuul/+/78868014:35
corvuszuul-maint: https://review.opendev.org/804532 fixes a regression; we might want to merge and release that14:57
fungithanks, that came up 15:12
fungilast week15:12
fungiand i wasn't initially sure if it was intended and we'd forgotten to update the docs, or was inadvertent collateral damage from the url deprecation15:13
Clark[m]I guess the build page will render an arbitrary result and message set by a job or is that going away at some point too?15:15
corvusClark: yes it should; but it won't know what color to use15:16
corvusso honestly, i think it would be better to remove it, but intentionally -- this removal was an accident15:16
pabelanger[m]good morning, would love to get some eyes / reviews on https://review.opendev.org/c/zuul/zuul/+/804305 to fix an issue with semaphores and pipeline precedence15:35
clarkbpabelanger[m]: that was on my list for this morning. I'm having a bit of a slow start as my office is converted back to an office from a guest room15:36
pabelanger[m]thanks!15:36
pabelanger[m]I'm going to try and get to unpause jobs this week15:37
clarkbI suspect that the forced wait for the hourly deploy pipeline in opendev may be "fixed" by pabelanger[m]'s semaphore change too15:43
clarkbthough I haven't double checked the priority on those queues yet15:43
clarkbcorvus: ^ including pabelanger's change in any restart (and release) for 804532 may be a good idea?15:53
clarkbits a small regression but I think the zk queues changed that behavior15:53
*** jpena is now known as jpena|off15:58
*** marios is now known as marios|out16:01
opendevreviewMerged zuul/zuul master: Restore job success/failure message  https://review.opendev.org/c/zuul/zuul/+/80453216:10
clarkbcorvus: re https://review.opendev.org/c/zuul/zuul/+/804601 I agree about the spacing there. Do you feel comfortable enough with the prior changes to land the stack up to taht point? I don't feel so strongly about the lack of the explicit separation to avoid landing it if so16:14
clarkbI think overall they are good improvements16:14
opendevreviewPaul Belanger proposed zuul/zuul master: Set default precedence for semaphore testing  https://review.opendev.org/c/zuul/zuul/+/80478516:54
*** timburke__ is now known as timburke17:16
corvusclarkb, pabelanger: it semaphore precedence sounds like a good change, but i don't think it's been asserted that it's a zk-related regression?17:24
corvuss/it//17:24
clarkbcorvus: I think it is because before gearman would've ensured that ordering on the executor side17:24
clarkbcorvus: because the exuecotrs would process the jobs in that order regardless of how zuul put them on the gearman queue? (that was likely racey though)17:25
corvusclarkb: no, semaphore acquisition happens before enqueing17:25
clarkboh17:25
corvusclarkb: zk executor queue was a like-for-like replacement, that was the design17:25
clarkbI'm confused how the opendev deploy pipelines have chagned though17:25
corvuspabelanger's test case fails when applied to zuul 4.0.017:25
clarkbmaybe it was just racey and the zk io is different enough to chagne the race order in most situation for opendev17:26
corvusi didn't realize something had changed in opendev17:27
fungii'm not convinced anything has17:27
clarkbit definitely changed17:27
fungiwhat behavior were we seeing "before"?17:28
clarkbthe deploy pipeline would get the semaphore from the hourly pipeline17:28
clarkbnow the deploy pipeline must always wait for the hourly pipeline to complete17:28
clarkbI'm not sure that one behavior is necessarily more correct than the other but the behavior changed for sure17:28
clarkbthis is why it takes so much extra time for landed chagnes to apply beacuse they are always waiting for the entirety of the houlry pipeline to complete17:28
fungii guess my memory is fuzzy, i don't recall that being consistent, nor am i sure i've watched it enough lately to feel it's consistent now17:29
corvusthat does sound like a race17:29
clarkbthen if you have multiple changes in the deploy pipeline what happens now is the first one waits for the hourly to finish and then starts. This often takes long enough that a new hourly is enqueued and it takes the semaphore back again before the next chagne in the deploy pipeline further delaying those changes from applying17:29
clarkbpreviously the deploy pipeline would only wait for the running hourly job to finish then it would start, then the next deploy change would run, then the hourly pipeline would start where it left off17:30
fungibut yeah, seems like maybe we've traded one undefined behavior for another, either way being explicit with priorities there would be nice17:30
clarkbthis made deploying specific cahgnes much more responsive17:30
corvusclarkb: from a design pov, the opendev pipeline configuration is still flawed -- ideally we would not want deploy to interrupt hourly.  the real issue is that hourly is slow :(17:31
corvusit just "usually works and is okay"17:31
fungii do wonder now if we've designed the hourly jobs not to roll back the promoted deploy jobs17:32
clarkbyup hourly being slow is definitely part of the problem and it seems that a big part of that is ansible is slow :/17:32
corvusanyway, precedence makes more sense than definition order for this, so i have no objection to the change :)17:32
clarkbfungi: pretty sure we havne't beacuse they actually run latest system-config17:33
clarkbfungi: so what actually happens is that the hourly jobs are running the updates for the changes that are landing. Then an hour later the chagne jobs are running and nooping17:33
corvusclarkb: if there's a behavior change, it may be related to event results in zk -- that could have caused a race before, but now is less likely to race.17:33
fungirunning latest at time of build start would be fine. running latest at enqueue time would be my concern17:33
corvus(i just wouldn't characterize it as a regression, that's all)17:33
clarkbfungi: oh good point, I'm not sure which is used17:34
clarkbcorvus: gotcha17:34
corvusfungi: we now will run latest at enqueue time17:34
corvusand that is a change17:34
corvus(that's a new behavior introduced by the global repo state -- the idea is every build for a queue item should get the same state, and that state is frozen at enqueue time)17:34
corvusanother entirely sensible change that unfortunately doesn't do quite what we want in order to work around the opendev-specific deployment issues :/17:35
fungiyeah, so periodic item is enqueued and starts a build or several, partway through the buildset a promote job is enqueued with a newer state and grabs the semaphore, deploying newer system-config on systems, finishes its buildset and then the periodic resumes its builds rolling out an older system-config state (from its original enqueue time) rolling back some deployed software/configs17:35
corvusyep.  to be correct we should really hold a semaphore for the entire buildset -- which we could do with a parent job.  that would get us back to the behavior which we have now (before merging pabelanger's change) and which rightly annoys us.17:37
fungiif we're pushing the state to be used onto the deployment bastion, then we'll end up in that situation. if the periodics are just making sure the worktree is updated at build start then that would solve it17:37
corvuswe now have a choice: correct vs expedient.  (unfortunately, "slow" still applies to both)17:37
corvusfungi: good point, i don't recall which we do17:38
clarkbthe base jobs does a workspace sync from the executor to bridge17:39
clarkbbut we could override that after the fact17:39
fungiit's a convenient generalization in this particular case to assert that *all* deploy jobs (whether timer or change-merged triggered) should use the latest branch state at build start17:39
corvusclarkb: regarding 804601 -- i agree that 804600 is an improvement over current state and am okay with merging up to that point.17:40
clarkbinfra_prod_run_from_master: "{{ zuul.pipeline|default('') in ['periodic', 'opendev-prod-hourly'] }}"17:40
fungibut that's down to nuances of how opendev is handling its deployment model really17:40
clarkbI think we use ^ in system-config to work around this problem17:40
clarkbfungi: I think that this will end up being a common problem for anyone using zuul for CD though and is worth considering from zuul's perspective how one might address it17:41
fungithat also means it's safe to reenqueue a deploy queue item, i guess (to answer a question i had sometime recently)17:41
clarkbfungi: yes opendev is running into it but it is a generic CD problem when you are trying to apply many changes concurrently17:41
clarkbfungi: no because deploy isn't in 'periodic' or 'opendev-prod-hourly'17:41
clarkbIf we added deploy to that list and basically just said always run from master then we would be fine. But now you wouldn't have changes that maybe did cleanups like remove a cron then remove teh config management for it applying rpoperly17:42
fungioh, got it. so we should add deploy to that list maybe?17:42
fungithat would also mean it would be safe to make deploy use the supercedent manager17:43
corvusclarkb: i think you missed https://review.opendev.org/80447817:43
clarkbfungi: yup see the tradeoff I mentioned though17:43
clarkbfungi: basically if you want the system to apply things in order you can't rely on that. But maybe losing that functionality is the least bad option17:43
clarkbcorvus: oh ya I was going to +2 it if you were happy to land the stack up to the child point. I'll +2 now17:44
fungioh, actually no we can't use supercedent in deploy if we also want to use file filters17:44
clarkbcorvus: I think you can +A up to that point where spacing gets weird17:44
clarkbthinking out loud: What if we could have one pipeline supercede another? And basically make the hourly deploy eventually consistent? It would finish its current job(s) but then get evicted if a change showed up in deploy. Then deploy would run. This risk starvation but maybe is the least confusing setup17:46
corvusclarkb, fungi: i think the fundamental issue is that we have an hourly pipeline that takes too long17:46
corvusit should run daily and/or be fast17:47
clarkbcorvus: yes, but I'm not sure how we fix that without replacing ansible. (related note the job that runs puppet against everything which is admittedly a shrinking list runs very quickly)17:47
corvusthis is getting off-topic for #zuul17:47
clarkbI do think that zuul providing tools to make this less painful is a good idea. I doubt opendev is unique here. However, I don't really know what zuul can do that would be effective yet17:48
*** dviroel|ruck is now known as dviroel|out17:49
corvusclarkb: sure, i mean, i think zuul provides those tools.  how opendev uses them is more of an #opendev thing.  :)17:49
corvusclarkb: pipelines can superecede each other; gate does check.  but identifying that a queue item in a deploy pipeline should supercede one in periodic would not be trivial -- they have very different identifiers.17:50
pabelanger[m]clarkb: for your hourly job, using mitogen will likely help a ton. I had to use that for zuul.a.c deploy job.  I cannot remember how fast it improved things however18:53
clarkbpabelanger[m]: the problem is in forking the processes which mitogen doesn't help with aiui18:55
opendevreviewMerged zuul/zuul master: web: yarn update to latest @patternfly/react-core  https://review.opendev.org/c/zuul/zuul/+/80447318:59
opendevreviewMerged zuul/zuul master: web: use spread operator in JOB_FETCH_SUCCESS reducer  https://review.opendev.org/c/zuul/zuul/+/80447418:59
opendevreviewMerged zuul/zuul master: web: Job: remove height setting for job page  https://review.opendev.org/c/zuul/zuul/+/80447518:59
opendevreviewMerged zuul/zuul master: web: Job: use PF4 tabs  https://review.opendev.org/c/zuul/zuul/+/80447619:03
opendevreviewMerged zuul/zuul master: web: JobVariant: update to @patternfly/react-icons  https://review.opendev.org/c/zuul/zuul/+/80447719:04
opendevreviewMerged zuul/zuul master: web: JobVariant: use @patternfly/react-table  https://review.opendev.org/c/zuul/zuul/+/80447819:04
opendevreviewMerged zuul/zuul master: web: JobVariant: visual cleanups  https://review.opendev.org/c/zuul/zuul/+/80447919:04
corvuspabelanger, clarkb, tobiash: i think there's another issue to check for on https://review.opendev.org/80430519:10
opendevreviewMerged zuul/zuul master: web: Nodeset: convert to a treeview  https://review.opendev.org/c/zuul/zuul/+/80448019:11
opendevreviewMerged zuul/zuul master: web: Use PF4 sizing/spacing css  https://review.opendev.org/c/zuul/zuul/+/80459919:11
opendevreviewMerged zuul/zuul master: web: JobVariant: add icons  https://review.opendev.org/c/zuul/zuul/+/80460019:11
clarkbcorvus: that is an interesting one beacuse it was only really handled by convention previously, but definitely something that we want to happen19:49
corvusclarkb: yep -- the same is true for this change, which is why i think we need to make sure it's handled19:50
corvus(to be clear, i don't know if it's an issue; i think it needs to be verified)19:53
TylerPearce[m]Hi! I'm working on a branch called "zuul-test", but on the zuul web UI it's showing the branch as "master". Am I misunderstanding something here, or should the branch in Zuul's UI be "zuul-test"?20:35
clarkbTylerPearce[m]: in which context is it showing the branch as master? I think we need a bit more info to properly understand20:36
fungiTylerPearce[m]: what code review system is the change uploaded into, and what branch is it targeting? that's what zuul will show, it has nothing to do with (and won't even know about) your local topic branches20:36
* TylerPearce[m] uploaded an image: (208KiB) < https://matrix.org/_matrix/media/r0/download/matrix.org/XjycKSmDypwVgjhElJzWBjeT/image.png >20:39
TylerPearce[m]I'm using GitHub. I've configured Zuul to run a "check" pipeline when a PR is created, which seems to work as intended20:39
TylerPearce[m]Then in Zuul's web UI it's showing the branch as "master" 20:39
* TylerPearce[m] uploaded an image: (57KiB) < https://matrix.org/_matrix/media/r0/download/matrix.org/bZWJCxqyQyZRzOFuVBCzedFn/image.png >20:39
TylerPearce[m]Is this because the merge target is "master"?20:39
fungiTylerPearce[m]: yes, zuul tests "the future" where your change has merged to its intended target branch20:40
TylerPearce[m]Ahh okay, thank you for explaining! That makes sense20:40
fungithis becomes much more apparent when gating, and you have multiple changes trying to merge at the same time, it needs clear test states incorporating those changes together20:41
fungiso if it tested them all individually on separate branches then it could end up trying to merge different changes which break one another functionally20:41
TylerPearce[m]Ah I see20:41
TylerPearce[m]One more question if you don't mind: When inspecting a job on Zuul's web UI there's a tab called "Console". Where does this console read from (or rather, how can I write data to this console to be viewed in the UI?)20:44
fungithe "console" view is a hierarchical rendering of the json returned by ansible20:44
fungieach playbook as the top-level of the hierarchy, and then the individual tasks run within the scope of that playbook form the second level20:45
fungithen within each task you get the stdout, stderr, exit code, and so on20:45
TylerPearce[m]Ah okay, so from the quickstart guide I might expect to see the debug message "Hello world" from `testjob.yaml`, is that correct?20:46
fungiit's basically more fine-grained and detailed than the "console log" which is essentially a timestamped and combined stream of all the stdout/stderr20:46
* TylerPearce[m] < https://matrix.org/_matrix/media/r0/download/matrix.org/HoNztJBmfaJUbmhhpaFMToRk/message.txt >20:47
fungisounds right, i'd need to look back at what the tasks are in the demo jobs for the quickstart, but you should be able to find the corresponding task in the console breakdown, yes20:47
TylerPearce[m]Hmm okay. I must have borked something on my end because none of my tasks have any console output, though Zuul seems to be running them20:49
fungias one of the final tasks zuul should publish the json for that to the logserver. maybe that's not happening, or the dashboard is having trouble reading the file from the logserver container20:51
TylerPearce[m]Ah yep, I'm not using `playbooks/base/post-logs.yaml`. I imagine that's what should publish the logs 20:52
fungidoes the normally the console would look something like this: https://zuul.opendev.org/t/zuul/build/514def4b7daa4650959030a8b28c9df5/console20:52
fungithat fetches and renders the job-output.json file for the build from the log location for it20:53
fungi(which you can also find conveniently linked in the logs section)20:53
fungiso if there's no job-output.json, that would explain the empty console20:54
TylerPearce[m]Ah nice, that looks much better than my blank view 😂 I didn't set up my configuration to utilize the zuul jobs repository, I'm guessing that's what I need to get `post-logs.yaml` to work20:54
fungiand yes, post-logs.yaml is what should, during the post-run phase, copy those files to the logserver20:54
fungithere are some standard roles for log collection and publication in zuul-jobs, which i expect post-logs.yaml relies on, yes20:55
TylerPearce[m]Thank you very much! I'll add zuul-jobs into my config and give that a try20:56
fungiyou're welcome, sorry i couldn't be more help but i'm not at my desk at the moment20:57
TylerPearce[m]That was all the help I need for now, I appreciate it :) 20:57
TylerPearce[m]Okay so I added zuul-jobs to my config and the job succeeded, but I'm getting an error when viewing the job21:11
TylerPearce[m]```Network Error (Unable to fetch URL, check your network connectivity, browser plugins, ad-blockers, or try to refresh this page) http://localhost:8000/6/6/c7667393ff325324b46027979fd28b879a529029/check/testjob/21a9ee0/zuul-manifest.json```21:11
TylerPearce[m]I think this is because I'm not viewing this on localhost, but instead it's on an AWS EC2 container. Where can I overwrite `localhost:8000` to point to my EC2 instance URL?21:11
clarkbTylerPearce[m]: that is part of the base job iirc21:18
clarkbI think there is a zuul return to set the log url21:18
TylerPearce[m]Ah yep, I'm blind 🕶️ Thank you!21:30
TylerPearce[m]Tweaked my security group settings and now it's all working. Thanks again to everyone who helped me :) 21:51
fungiTylerPearce[m]: thanks for trying it out!21:57
TylerPearce[m]Does Zuul have a job variable for the current branch? (Not the target branch like https://zuul-ci.org/docs/zuul/reference/jobs.html#var-zuul.branch)23:26
clarkbTylerPearce[m]: in zuul's world they are the same thing23:28
clarkbTylerPearce[m]: when zuul prepares the workspace it will checkout the target branch with the changes applied to it. If you need to see your delta to the target branch in the workspace you can look at origin/target-branch as that is preserved with the original state23:29
TylerPearce[m]I see. Is there any way to reference the branch where the changes came from?23:29
clarkbIf you need the actual branch name the PR was requested under I'm not sure. I don't use github with zuul much. YOu can check the inventory file that zuul records as taht should have all the vars23:29
TylerPearce[m]Got it, thank you!23:30
clarkbit might record it as part of zuul.change or zuul.change_url?23:30
clarkbI suspect that will just give you PR numbers though23:30
TylerPearce[m]change is the pr number, I'll check change_url23:30
clarkbTylerPearce[m]: I suspect that https://opendev.org/zuul/zuul/src/branch/master/zuul/driver/github/githubconnection.py#L615-L635 and the event model would need to be updated to record that info23:34
clarkbas that seems to be where we make a recording of the relevant bits and the source branch isn't in there23:35
clarkbanother option may be to look it up with the change url23:35
clarkbin the job I mean rather than having zuul record it23:35
TylerPearce[m]Awesome info, thank you! I probably don't specifically need the branch name, I think I can accomplish the task via the PR number and the dev looking at the logs can infer the branch name or click the link :) 23:36

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!