*** stevthedev has joined #zuul | 00:05 | |
openstackgerrit | Merged zuul/zuul master: Update Gerrit config for quickstart https://review.opendev.org/755669 | 00:16 |
---|---|---|
*** hamalq has quit IRC | 00:16 | |
openstackgerrit | Merged zuul/zuul-jobs master: Fix certificate issue with use buildset registry https://review.opendev.org/741584 | 00:26 |
openstackgerrit | Merged zuul/zuul master: Revert "Revert "Update images to use python 3.8"" https://review.opendev.org/755671 | 00:31 |
*** freenzyfriday has joined #zuul | 00:35 | |
*** freenzyfriday has quit IRC | 00:39 | |
*** weshay|ruck has quit IRC | 00:54 | |
*** weshay has joined #zuul | 00:55 | |
*** weshay is now known as weshay|ruck | 00:55 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Revert "Disable broken fetch-sphinx-tarball test job" https://review.opendev.org/753199 | 03:16 |
*** stevthedev has quit IRC | 03:21 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Revert "Disable broken fetch-sphinx-tarball test job" https://review.opendev.org/753199 | 03:28 |
*** bhavikdbavishi has joined #zuul | 03:30 | |
*** bhavikdbavishi1 has joined #zuul | 03:33 | |
*** bhavikdbavishi has quit IRC | 03:34 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 03:34 | |
*** stevthedev has joined #zuul | 03:49 | |
*** freenzyfriday has joined #zuul | 04:15 | |
*** bhavikdbavishi has quit IRC | 04:25 | |
*** bhavikdbavishi has joined #zuul | 04:26 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #zuul | 04:33 | |
*** freenzyfriday has quit IRC | 04:36 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Revert "Disable broken fetch-sphinx-tarball test job" https://review.opendev.org/753199 | 04:59 |
*** bhavikdbavishi has quit IRC | 05:00 | |
*** bhavikdbavishi has joined #zuul | 05:00 | |
*** bhavikdbavishi has quit IRC | 05:23 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Merge Zookeeper connection methods and specialize exceptions https://review.opendev.org/754360 | 05:24 |
*** holser has joined #zuul | 05:24 | |
*** holser has quit IRC | 05:25 | |
*** holser has joined #zuul | 05:27 | |
*** hamalq has joined #zuul | 06:12 | |
*** holser has quit IRC | 06:21 | |
*** jcapitao has joined #zuul | 07:00 | |
*** bhavikdbavishi has joined #zuul | 07:02 | |
*** yolanda has quit IRC | 07:04 | |
*** yolanda has joined #zuul | 07:04 | |
*** hamalq has quit IRC | 07:13 | |
*** bhavikdbavishi has quit IRC | 07:17 | |
*** bhavikdbavishi has joined #zuul | 07:17 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Separate connection registries in tests https://review.opendev.org/712958 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Prepare Zookeeper for scale-out scheduler https://review.opendev.org/717269 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Mandatory Zookeeper connection for ZuulWeb in tests https://review.opendev.org/721254 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Driver event ingestion https://review.opendev.org/717299 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Connect merger to Zookeeper https://review.opendev.org/716221 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Connect fingergw to Zookeeper https://review.opendev.org/716875 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Connect executor to Zookeeper https://review.opendev.org/716262 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Improve typings in context of 744416 https://review.opendev.org/753578 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Merge Zookeeper connection methods and prepare test zookeeper https://review.opendev.org/754360 | 07:31 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Switch to using zookeeper instead of gearman for jobs https://review.opendev.org/744416 | 07:31 |
*** bhavikdbavishi1 has joined #zuul | 07:43 | |
*** tosky has joined #zuul | 07:43 | |
*** bhavikdbavishi has quit IRC | 07:44 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 07:44 | |
*** bhavikdbavishi has quit IRC | 07:51 | |
*** jpena|off is now known as jpena | 07:57 | |
*** hashar has joined #zuul | 07:58 | |
*** holser has joined #zuul | 07:59 | |
*** armstrongs has joined #zuul | 08:15 | |
*** armstrongs has quit IRC | 08:24 | |
openstackgerrit | zbr proposed zuul/zuul-jobs master: ensure-docker: validate network connectivity https://review.opendev.org/755505 | 08:27 |
openstackgerrit | zbr proposed zuul/zuul-jobs master: ensure-docker: validate network connectivity https://review.opendev.org/755505 | 08:32 |
*** jfoufas1 has joined #zuul | 08:39 | |
*** bhavikdbavishi has joined #zuul | 08:44 | |
*** bhavikdbavishi1 has joined #zuul | 08:49 | |
*** bhavikdbavishi has quit IRC | 08:51 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 08:51 | |
avass | something strange is going on with our executor running in eks. it doesn't seem possible to pause it since it keeps running jobs even though it's reporting that it's paused to statsd | 09:43 |
avass | for some reason this only happens in kubernetes | 09:43 |
tobiash | avass: could it be that it already was disabled by a governor? | 10:06 |
tobiash | avass: I've seen occurrences where pausing while it's unregistered fails due to a double unregister, and then it went into a weird state | 10:07 |
tobiash | but hadn't time to fix it yet | 10:08 |
tobiash | avass: e.g. this is a double pause attempt: http://paste.openstack.org/show/798640/ | 10:10 |
tobiash | maybe you find a similar stack trace | 10:10 |
avass | tobiash: I'll see if I can find something. paused_on_startup doesn't seem to work either | 10:21 |
avass | tobiash: we're still on 3.19 though since we've been slow to set up zookeeper tls | 10:22 |
tobiash | avass: I think that part didn't change since then | 10:24 |
openstackgerrit | zbr proposed zuul/zuul-jobs master: ensure-docker: validate network connectivity https://review.opendev.org/755505 | 10:35 |
openstackgerrit | Tobias Henkel proposed zuul/zuul-jobs master: Consolidate common log upload code into module_utils https://review.opendev.org/742736 | 10:39 |
openstackgerrit | Tobias Henkel proposed zuul/zuul-jobs master: Consolidate common log upload code into module_utils https://review.opendev.org/742736 | 10:40 |
*** holser has quit IRC | 10:40 | |
tobiash | AJaeger, ianw: I think that should address your comments ^ | 10:41 |
avass | tobiash: looks like it's logging that the executor is starting in paused mode but then starts jobs anyway: http://paste.openstack.org/show/798642/ | 10:46 |
tobiash | looks like it's registering regardless of pause at startup | 10:47 |
tobiash | what happens if you execute zuul-executor pause inside the pod? | 10:47 |
*** holser has joined #zuul | 10:48 | |
avass | tobiash: http://paste.openstack.org/show/798643/ | 10:50 |
avass | tobiash: looks like it works this time | 10:52 |
avass | tobiash: only difference is that it's using the default logconfig instead of outputting it in a json format. maybe that's causing some kind of problem | 10:53 |
avass | tobiash: actually no, it keeps starting jobs | 10:57 |
tobiash | that's weird, if it really unregistered that should not be possible | 10:57 |
tobiash | oh wait | 10:58 |
tobiash | it only paused the merge worker, not the executor worker | 10:58 |
tobiash | is there no exception or so? | 10:58 |
avass | nope | 10:58 |
tobiash | maybe just the log is misleading | 10:59 |
tobiash | can you check gear's function list? | 10:59 |
avass | how do I do that? | 11:00 |
tobiash | https://zuul-ci.org/docs/zuul/howtos/troubleshooting.html | 11:01 |
tobiash | you can connect to gearman using openssl and enter status | 11:01 |
avass | ah :) | 11:01 |
tobiash | then it prints all registered functions and queue lengths | 11:02 |
*** bhavikdbavishi has quit IRC | 11:06 | |
avass | tobiash: anything specific I'm looking for? | 11:07 |
tobiash | the execute functions | 11:07 |
*** jcapitao is now known as jcapitao_lunch | 11:09 | |
zbr | is anyone aware that the task summary counters for failed tasks do not match what the console displays? you may have failed tasks but the counter is always 0 | 11:09 |
zbr | example at https://zuul.opendev.org/t/zuul/build/262bad816e954d4394ddaf0b5bfe7aac | 11:10 |
avass | tobiash: for that specific executor it looks like this: http://paste.openstack.org/show/798644/ | 11:11 |
tobiash | avass: resume and stop are always there | 11:11 |
tobiash | execute doesn't have the executor name in the function name | 11:12 |
avass | tobiash: just saw it you mean this right: "executor:execute 47 47 6"? | 11:12 |
tobiash | yes | 11:12 |
tobiash | judging from the counts that's not a test env right? | 11:12 |
openstackgerrit | Merged zuul/zuul master: Optimize GitHub requests on PR merge https://review.opendev.org/752886 | 11:12 |
avass | nope | 11:13 |
tobiash | can you get a list of all that starts with 'executor:'? | 11:13 |
tobiash | or better compare the number of execute functions with the number of running executors | 11:16 |
*** jfoufas1 has quit IRC | 11:17 | |
avass | tobiash: there are more functions than running executors, but only six that are registered as available workers | 11:18 |
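The `status` reply discussed above can be checked mechanically. The following is a minimal sketch, assuming the usual Gearman admin-protocol layout of one function per line (name, total jobs, running jobs, available workers) with a lone `.` terminating the reply. `parse_gearman_status` and `FunctionStatus` are hypothetical helper names, not part of Zuul or gear.

```python
from typing import Dict, NamedTuple


class FunctionStatus(NamedTuple):
    total: int    # jobs queued or running for this function
    running: int  # jobs currently running
    workers: int  # workers registered as able to run this function


def parse_gearman_status(text: str) -> Dict[str, FunctionStatus]:
    """Parse the text reply to Gearman's `status` admin command."""
    result: Dict[str, FunctionStatus] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == ".":  # a lone "." ends the reply
            continue
        # Split from the right so the three counters are always the
        # last three whitespace-separated fields.
        parts = line.rsplit(None, 3)
        if len(parts) != 4:
            continue  # skip anything that is not a status row
        name, total, running, workers = parts
        result[name] = FunctionStatus(int(total), int(running), int(workers))
    return result


# The one row quoted in the channel, followed by the terminator.
sample = "executor:execute 47 47 6\n.\n"
status = parse_gearman_status(sample)
print(status["executor:execute"])
```

Comparing the `workers` field of `executor:execute` with the number of executors that should be unpaused is the check tobiash suggests.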
tobiash | I think I've found a race when pausing | 11:18 |
tobiash | pausing is implemented as a pause sensor of the normal governor | 11:19 |
tobiash | see the manageLoad function | 11:19 |
tobiash | that is called either after accepting a job or every 10 seconds | 11:19 |
tobiash | so if you pause the executor there might be 10 seconds in which it can take additional jobs before it really unregisters | 11:20 |
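The race described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Zuul's executor code: the `Governor` class, `pause_lazy`, and `pause_eager` are hypothetical names, the only sensor modelled is the pause flag, and the periodic 10-second call is simulated by calling `manage_load()` by hand.

```python
class Governor:
    """Toy model: the pause flag is just one sensor of a load governor."""

    def __init__(self):
        self.paused = False
        self.accepting_work = True  # True stands in for "registered"

    def manage_load(self):
        # Evaluate all sensors (here only the pause sensor) and
        # register or unregister to match the result.
        sensors_ok = not self.paused
        if self.accepting_work and not sensors_ok:
            self.accepting_work = False  # stands in for unregistering
        elif not self.accepting_work and sensors_ok:
            self.accepting_work = True   # stands in for re-registering

    def pause_lazy(self):
        # Only flips the sensor; nothing changes until the next
        # periodic manage_load() run, so jobs can still be accepted.
        self.paused = True

    def pause_eager(self):
        # Flips the sensor and re-evaluates immediately, closing
        # the window between the pause request and unregistering.
        self.paused = True
        self.manage_load()


lazy = Governor()
lazy.pause_lazy()
print(lazy.accepting_work)   # still True until manage_load() runs
lazy.manage_load()           # simulated periodic run
print(lazy.accepting_work)   # now False

eager = Governor()
eager.pause_eager()
print(eager.accepting_work)  # False right away
```

A real implementation would also need locking, because the periodic governor thread and the pause command can call `manage_load()` concurrently.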
avass | tobiash: and self.register_work sets self.accepting_work = True | 11:21 |
avass | tobiash: I'm guessing the reregister = True should not be there | 11:22 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Call manageLoad during pause and unpause https://review.opendev.org/755765 | 11:23 |
avass | tobiash: oh actually I take that back | 11:23 |
tobiash | I think that should fix it ^ | 11:23 |
avass | yeah I have to go through that to understand how it fits together | 11:24 |
tobiash | avass: can you verify that after pausing it at least stops accepting jobs after 10+ seconds? | 11:24 |
avass | tobiash: it doesn't :) | 11:24 |
tobiash | that is weird then | 11:25 |
tobiash | I have another idea then | 11:25 |
bolg | zuul-main: can I ask for a review on https://review.opendev.org/c/744416 and the rest of the branch (topic: scale-out-scheduler). Current comments were worked in. | 11:26 |
bolg | zuul-maint: ^^^ | 11:26 |
tobiash | avass: you might have a deadlock of the governor here: https://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L3008 | 11:26 |
tobiash | that would block the governor completely | 11:26 |
tobiash | avass: do you have 'Unregistering due to' or 'Re-registering as job is within its limits' messages in your logs? | 11:27 |
tobiash | avass: a thread dump (kill -SIGINT2) could help to prove this theory. If the theory is right, you'll have a thread hanging in one of the sensors (zuul.executor.sensor.* packages) | 11:28 |
tobiash | s/SIGINT2/SIGUSR2 | 11:29 |
avass | tobiash: no such logs | 11:30 |
tobiash | some of those query some system metrics which makes that plausible as you said it's only on eks | 11:30 |
tobiash | avass: so that further backs the hanging thread hypothesis | 11:31 |
tobiash | avass: so the thread dump should be the next step | 11:31 |
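For readers unfamiliar with signal-triggered thread dumps, the general technique looks roughly like this in Python. It is a generic sketch, not Zuul's actual handler: `format_thread_dump` and `install_dump_handler` are hypothetical names, and `SIGUSR2` is only available on Unix-like systems.

```python
import signal
import sys
import threading
import traceback


def format_thread_dump() -> str:
    """Return the current stack of every thread as one string."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append(f"Thread: {ident} {names.get(ident, 'unknown')}")
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


def install_dump_handler(signum=None):
    """Write a thread dump to stderr whenever `signum` is received."""
    if signum is None:
        signum = signal.SIGUSR2  # looked up lazily; Unix only

    def handler(received_signum, frame):
        sys.stderr.write(format_thread_dump())

    signal.signal(signum, handler)


if __name__ == "__main__":
    print(format_thread_dump())
```

The output is a list of stack traces, one per thread, which is why it can look like a burst of exceptions at first glance. A thread stuck inside a sensor would show the same sensor frame in every dump.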
*** jpena is now known as jpena|lunch | 11:36 | |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: tutorial: Rework quick-start and prepare for other tutorials https://review.opendev.org/732066 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: tutorial: Add "gate your first patch" https://review.opendev.org/732067 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: tutorial: Add "Use zuul jobs" https://review.opendev.org/732068 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: tutorial: Add "gate pipeline" https://review.opendev.org/732069 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: tutorial: Add "job secrets" https://review.opendev.org/732070 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: tutorial: Add "job dependencies" https://review.opendev.org/732071 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: Rename quick-start to zuul-tutorial-quick-start https://review.opendev.org/737656 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: [DNM] TEST run zuul tutorials to test stream+callback (+ zuul-jobs change) https://review.opendev.org/735477 | 11:40 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: [DNM] Test: run multiple tutorials ('job dependencies' 2 times) https://review.opendev.org/741558 | 11:40 |
avass | tobiash: SIGINT or SIGUSR2? | 11:43 |
tobiash | SIGUSR2 | 11:43 |
tobiash | sorry, typo ;) | 11:43 |
avass | tobiash: oh, missed the next line :) | 11:43 |
*** jfoufas1 has joined #zuul | 11:46 | |
*** sshnaidm is now known as sshnaidm|afk | 11:46 | |
avass | well, I get a ton of exceptions doing that | 11:47 |
*** iurygregory has quit IRC | 11:55 | |
tobiash | those are not exceptions, that is a complete thread dump | 11:55 |
tobiash | can you paste it? | 11:56 |
avass | ah yeah I just realized that | 11:56 |
*** jcapitao_lunch is now known as jcapitao | 11:56 | |
*** iurygregory has joined #zuul | 11:57 | |
avass | tobiash: yeah, one second | 11:58 |
*** rfolco has joined #zuul | 11:58 | |
avass | tobiash: http://paste.openstack.org/show/798645/ | 12:01 |
*** rlandy has joined #zuul | 12:07 | |
*** rlandy is now known as rlandy|rover | 12:07 | |
*** mattd01 has joined #zuul | 12:09 | |
tobiash | hrm, I don't see anything unusual | 12:09 |
tobiash | and the governor thread seems to be in its normal 10s wait | 12:10 |
tobiash | avass: when I pause I get this within 10s at the latest: zuul.ExecutorServer: Unregistering due to paused | 12:16 |
avass | is there any way the executor can reach the state accepting_work = false and any sensor = false after having registered? | 12:16 |
tobiash | accepting_work is only written in (un)register_work | 12:17 |
tobiash | and those are only used in _manageLoad | 12:18 |
avass | hmm | 12:18 |
*** jpena|lunch is now known as jpena | 12:30 | |
*** Goneri has joined #zuul | 13:13 | |
*** hashar has quit IRC | 13:25 | |
*** hashar has joined #zuul | 13:55 | |
*** jfoufas1 has quit IRC | 13:57 | |
openstackgerrit | zbr proposed zuul/zuul-jobs master: ensure-docker: validate network connectivity https://review.opendev.org/755505 | 14:04 |
*** holser has quit IRC | 14:09 | |
*** holser has joined #zuul | 14:12 | |
zbr | avass: ianw tristanC: please recheck ^ is now ready. | 14:33 |
*** Eighth_Doctor has quit IRC | 14:33 | |
*** stevthedev_ has joined #zuul | 14:38 | |
*** stevthedev has quit IRC | 14:40 | |
*** stevthedev_ is now known as stevthedev | 14:40 | |
*** Eighth_Doctor has joined #zuul | 14:48 | |
logan- | Hello! My github connection broke today and I was wondering if it is related to https://developer.github.com/changes/2020-04-15-replacing-create-installation-access-token-endpoint/. I am running 3.19 and confirmed I've got the fix https://opendev.org/zuul/zuul/commit/ea97b9f2e829331b0af0a6f0904cba691628c1f5 in my running env. Log output looks like | 15:36 |
logan- | http://paste.openstack.org/raw/798652/ | 15:36 |
mhu | Hello zuul-maint, the encrypt subcommand in zuul-client is go: https://review.opendev.org/#/q/topic:zuul-client_encrypt+(status:open) if you're okay with the changes, let's +3 them | 15:37 |
tobiash | logan-: can you enable debug logs to see if there is more info? | 15:40 |
logan- | tobiash: Oh whoops. I thought I had debugging enabled but I was mistaken, getting updated logs gathered. | 15:47 |
logan- | w/ debug enabled: http://paste.openstack.org/raw/798655/ | 15:53 |
tobiash | logan-: you get 403 when requesting check runs | 16:02 |
tobiash | you probably need to add check run permissions to the zuul github app | 16:02 |
fungi | he's the third person to report what looks like permission issues from the recent gh driver change, i wonder if we need an additional reminder of perms required in the release notes? | 16:06 |
logan- | Ah yes, that was it. Thank you! | 16:07 |
tobiash | at least there is https://zuul-ci.org/docs/zuul/reference/releasenotes.html#relnotes-3-17-0-new-features but it seems that somewhere in between it got required | 16:09 |
fungi | is the sudden requirement on github's side or something we changed then? | 16:12 |
tobiash | I think it might be a side effect of an optimization | 16:13 |
logan- | Yep it happened between 3.18 -> 3.19. I upgraded to 3.19 to pull in the fix for that access token endpoint change and that's when the checks permission broke. | 16:13 |
fungi | so maybe an unintended regression | 16:13 |
openstackgerrit | zbr proposed zuul/zuul-jobs master: Update ensure-docker for new releases https://review.opendev.org/752630 | 16:13 |
tobiash | should we amend a release note of 3.19 to add that as an upgrade notice? | 16:13 |
tobiash | I think that would appear at the correct version right? | 16:14 |
fungi | i'm not familiar enough with reno to know for sure if it's possible to alter release notes after the tag, but i think you can. seems like it just cares when the note identifier appeared in the history, not what the state of the file for it was in at that time | 16:15 |
logan- | Yes imo. I read the release notes for 3.19 so it would have avoided my ping at least. :) | 16:15 |
tobiash | fungi: afaik we did this already in the past | 16:15 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Add upgrade note to 3.19 regarding check run permissions https://review.opendev.org/755842 | 16:22 |
tobiash | fungi, logan- ^ | 16:22 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Add upgrade note to 3.19 regarding check run permissions https://review.opendev.org/755842 | 16:24 |
*** hashar has quit IRC | 16:28 | |
zbr | tobiash: fungi https://review.opendev.org/#/c/748480/ please. | 16:42 |
tobiash | fungi: I've responded to your comment | 16:45 |
fungi | thanks | 16:47 |
*** hamalq has joined #zuul | 16:58 | |
fungi | corvus: so we got some clarification on the ansiblefest virtual booth features... our package comes with private attendee/sponsor chat capability, just not a general booth group chat/message board | 17:00 |
fungi | so it is still synchronous | 17:01 |
fungi | also clunky and confusing that they have two entirely separate text chat mechanisms integrated | 17:05 |
*** mattd01 has quit IRC | 17:06 | |
openstackgerrit | Merged zuul/zuul-jobs master: Partial address ansible-lint E208 https://review.opendev.org/748480 | 17:18 |
*** jpena is now known as jpena|off | 17:20 | |
*** holser has quit IRC | 17:27 | |
corvus | tobiash: +3 | 17:54 |
*** jcapitao has quit IRC | 17:55 | |
zbr | two more on ensure-docker: https://review.opendev.org/#/q/topic:ensure-docker+(status:open+OR+status:merged) | 17:56 |
corvus | fungi: i'm not sure i fully understand who can chat with who in that situation; can you explain with more words? (are you saying an attendee can privately chat with "zuul"? or are you saying that fungi, as an attendee who is also a sponsor, is able to chat with other attendees?) | 17:57 |
fungi | corvus: the attendees can privately chat with the booth sponsor staffers in real time to ask questions, et cetera | 17:58 |
fungi | that's a base feature which is included in our sponsor package apparently | 17:59 |
fungi | there's a separate chat feature which allows booth attendees to leave persistent messages in real time and potentially talk to each other as well as the sponsor booth staffers, and that is not included in our package | 18:00 |
*** mattd01 has joined #zuul | 18:01 | |
corvus | ok. what we have sounds moderately useful then | 18:01 |
fungi | the staffers can also invite other specific attendees into the same private chat the attendee initiates apparently (if i fully understood the demo they showed us), but it's not just a wide open anybody can wander into the same chat space thing | 18:02 |
*** hashar has joined #zuul | 18:43 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add CORS header to quickstart log server config https://review.opendev.org/755864 | 18:44 |
corvus | fungi, clarkb, tobiash: ^ got 99% through recording the zuul talk and hit that | 18:44 |
corvus | the good news is that i think our quickstart test job is doing a pretty good job of keeping things from bit-rotting. | 18:45 |
corvus | the 2 things that did bitrot are relatively minor annoyances, and would be pretty hard to test | 18:46 |
fungi | oof | 18:49 |
openstackgerrit | Merged zuul/zuul master: Add upgrade note to 3.19 regarding check run permissions https://review.opendev.org/755842 | 19:02 |
*** yolanda has quit IRC | 19:04 | |
*** yolanda has joined #zuul | 19:04 | |
hamalq | I don't know if this question should be asked here: how can we access the openstackdev config inside a Tempest test? | 19:06 |
fungi | hamalq: you probably want the #openstack-qa channel for that | 19:07 |
hamalq | fungi: thanks | 19:07 |
fungi | the quality assurance team in openstack handles the tempest testsuite | 19:07 |
openstackgerrit | Merged zuul/zuul master: Add CORS header to quickstart log server config https://review.opendev.org/755864 | 19:42 |
*** tosky has quit IRC | 19:57 | |
*** tosky has joined #zuul | 19:58 | |
*** tosky has quit IRC | 20:19 | |
*** mattd01 has left #zuul | 20:41 | |
*** freenzyfriday has joined #zuul | 20:47 | |
*** freenzyfriday has quit IRC | 20:55 | |
*** hashar has quit IRC | 20:56 | |
*** rfolco has quit IRC | 20:58 | |
*** freenzyfriday has joined #zuul | 21:41 | |
*** freenzyfriday has quit IRC | 21:47 | |
*** holser has joined #zuul | 21:48 | |
*** rlandy|rover has quit IRC | 21:59 | |
*** freenzyfriday has joined #zuul | 22:35 | |
*** freenzyfriday has quit IRC | 22:44 | |
*** yolanda has quit IRC | 23:23 | |
*** yolanda has joined #zuul | 23:24 | |
*** freenzyfriday has joined #zuul | 23:30 | |
*** freenzyfriday has quit IRC | 23:35 |