Monday, 2022-03-28

-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/83485706:18
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/83485706:29
@dong.zhang:matrix.orgIs opendev zuul currently working? check job is not starting in https://zuul.opendev.org/t/zuul/status06:41
@q:fricklercloud.dezuul-maint: ^^ see #opendev, help wanted07:32
@bookwar:fedora.imHi, folks. I am looking into the topic of zuul deployment on Openshift. I want to reuse the code from https://opendev.org/zuul/zuul-helm, but it is missing licensing information. Is there a default license for OpenDev projects?09:55
@apevec:matrix.orgAleksandra Fedorova: that would be a question for the main author there, mnaser  10:32
@apevec:matrix.orgafaik each project must declare own license10:33
@apevec:matrix.orgmnaser: is zuul-helm still used for Vexxhost Zuul ?10:34
@apevec:matrix.orgAleksandra Fedorova: VH is doing Zuul hosting e.g. https://telecominfraproject.zuul.vexxhost.dev/builds10:36
@bookwar:fedora.imI doubt that specific repo is used for anything in its current state, it has several outdated links to deprecated locations, but that's not a problem for me. I am planning to adjust it to my needs anyway. But for that I need the license to permit it.10:36
@apevec:matrix.orghttps://vexxhost.com/solutions/managed-zuul/ if product placement is allowed here :)10:37
@apevec:matrix.orgAleksandra Fedorova: I'd email mnaser and corvus - it's all theirs10:38
@bookwar:fedora.imthanks for the tip, will do10:39
-@gerrit:opendev.org- Andy Ladjadj proposed: [zuul/zuul-jobs] 834043: [upload-logs-base] add public url attribute https://review.opendev.org/c/zuul/zuul-jobs/+/83404311:35
-@gerrit:opendev.org- Andy Ladjadj proposed: [zuul/zuul-jobs] 834043: [upload-logs-base] add public url attribute https://review.opendev.org/c/zuul/zuul-jobs/+/83404311:35
-@gerrit:opendev.org- Andy Ladjadj proposed: [zuul/zuul-jobs] 834043: [upload-logs-base] add public url attribute https://review.opendev.org/c/zuul/zuul-jobs/+/83404311:36
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/83485713:11
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 835452: Test zuul-client dequeue-all https://review.opendev.org/c/zuul/zuul/+/83545213:42
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835464: Add a blob store and store large secrets in it https://review.opendev.org/c/zuul/zuul/+/83546414:24
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/83485714:26
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 834857: Fix bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/83485714:46
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 832363: Add queue.dependencies-by-topic https://review.opendev.org/c/zuul/zuul/+/83236315:22
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 835349: Revert "Remove Worker class" https://review.opendev.org/c/zuul/zuul/+/83534915:30
@jim:acmegating.commhu: to be clear -- when i suggested we can add the information back, i meant in a new way -- the old implementation is not a good starting point.15:37
@mhuin:matrix.orgeven for the worker info? I understood that the nodes info needed rework15:38
@jim:acmegating.comyeah, see the original commit message -- we're sending that info back via a different method now15:39
@mhuin:matrix.orgThe revert doesn't rebase on master trivially anyway15:39
@mhuin:matrix.orgAh ok, I missed that15:40
@mnaser:matrix.orgCould we get some quick eyes on these, to fix the k8s jobs and a cleanup:15:49
- https://review.opendev.org/c/zuul/zuul-jobs/+/835162
- https://review.opendev.org/c/zuul/zuul-jobs/+/835156
@y2kenny:matrix.orgI have been seeing Zuul getting into weird states due to Gerrit connection issues.  This is Zuul 5.2.0 with 6x scheduler, 10x merger, 10x executor.  I don't know the internals of Zuul too well so I thought I should share my observations here:16:15
Observation 1: scheduler repeatedly updating gerrit changes (seems to be stuck in a loop?):
https://paste.opendev.org/show/b9zioMLgrMCNl0yK2868/
It's just the same few changes repeating (it's more than 3 lines... I just pasted 3 as an example)
Observation 2: jobs get stuck with the log streamer saying "BuildID not found"
I suspect a Gerrit connection error is causing Zuul's comment/score posting to fail and leaving Zuul stuck in a weird state. This is just a guess though. The recovery method is restarting the scheduler and executors.
@y2kenny:matrix.orgI see GerritConnection timeout can be adjusted for ssh but not sure if it's possible for http16:24
@clarkb:matrix.orgKenny Ho: you do have a gerrit connection error then?16:27
@y2kenny:matrix.orgyes, definitely having unstable connection to Gerrit.16:28
@y2kenny:matrix.orgbut only intermittently 16:28
@y2kenny:matrix.orgZuul seems to be having trouble recovering... I think?16:30
@y2kenny:matrix.orgsomething is happening but there are also a bunch of events not being scheduled16:30
@clarkb:matrix.orgWell it should just make a new connection. The http stuff is pretty stateless and for ssh it retries until reconnected16:31
@clarkb:matrix.orgI think it might be helpful to pick a specific instance of what looks weird and track that down using the event ids in the logs16:31
@y2kenny:matrix.orgso I am looking at one scheduler instance 16:32
@y2kenny:matrix.orgsame event but the log is filled with "Updating <change>"16:33
@y2kenny:matrix.orgfor a bunch of different changes16:33
@y2kenny:matrix.orgat a rate of about 1 a sec16:34
@y2kenny:matrix.orgoh... I just see some pipeline adding change16:34
@clarkb:matrix.orgYa it does that to ensure it has a current view of the repos for config accuracy16:34
@clarkb:matrix.orgIs the updating <change> happening over and over again for the same event?16:34
@y2kenny:matrix.orgover and over again for the same event yes16:34
@y2kenny:matrix.org(I am  assuming e: <hash> is event)16:35
@y2kenny:matrix.orgsame event and same change16:35
@clarkb:matrix.orgyes e: <hash> identifies the triggering event16:35
@y2kenny:matrix.orgsame event and same bunch of changes repeatedly at a rate of 1 per second16:37
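To check whether the same event keeps updating the same changes, the log lines can be grouped by event id. This is a minimal sketch, not part of Zuul: the helper names `count_updates` and `repeated` are hypothetical, and the line shape (`[e: <hash>]` followed by `Updating <...>`) is an assumption based on the messages quoted above, so the pattern may need adjusting to the real scheduler log format.

```python
import re
from collections import Counter

# Assumed line shape: "... [e: <event id>] ... Updating <Change ...>".
# The real scheduler log format may differ; adjust the pattern to match.
PATTERN = re.compile(r"\[e: (?P<event>[0-9a-f]+)\].*Updating (?P<change><[^>]*>)")


def count_updates(lines):
    """Count how often each (event id, change) pair is updated."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("event"), match.group("change"))] += 1
    return counts


def repeated(counts, threshold=2):
    """Return pairs seen at least `threshold` times, most frequent first."""
    return [(key, n) for key, n in counts.most_common() if n >= threshold]
```

Feeding it the lines of one scheduler's debug log and printing `repeated(count_updates(lines))` shows which event ids and changes account for the repetition.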
@clarkb:matrix.orgis the processing of that event hitting an error causing another scheduler to process it then repeating in a loop?16:38
@y2kenny:matrix.orgum... let me check the other scheduler's log...16:38
@y2kenny:matrix.orgI just sampled two other schedulers and they don't seem to have repeating events.  What I see is something else that I also see occasionally that I don't understand: (Exception loading ZKObject... kazoo.exceptions.NoNodeError...) Shouldn't be related though since it's on a pipeline that is not used.16:41
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835478: Add a note about bwrap and setsid https://review.opendev.org/c/zuul/zuul/+/83547816:41
@y2kenny:matrix.orgThis (repeating events) seems to only happen on the oldest scheduler (the scheduler I have been using before going to multi-scheduler)16:47
@y2kenny:matrix.orgnot sure if that's relevant either16:47
@jim:acmegating.comit's only happening on one scheduler because only one scheduler processes the gerrit event stream16:48
@y2kenny:matrix.orgah ok16:48
@fungicide:matrix.orgFlorian Haas: https://twitter.com/xahteiwi/status/1508470104970903561 got mentioned to me and since i don't have a twitter account i'll follow up here. i agree we don't seem to have any documentation explaining how git submodules are handled by zuul (the only reference i can find to the string "submodule" anywhere in the repo is in a couple of tests of the merger service), but there may be zuul users in here who use git submodules in their projects and can explain their workflow or related pitfalls sufficient for us to add something to the docs16:50
@fungicide:matrix.orgexpanding my search to include "gitmodule" i see that the merger resets the .gitmodules file if an exception is raised while fetching refs which could indicate a faulty configuration introduced in .gitmodules16:55
@q:fricklercloud.defwiw what Kenny Ho describes sounds pretty similar to what I saw on zuul01.opendev.org.16:56
@clarkb:matrix.orgKenny Ho: I think q ended up just restarting the scheduler to get it moving again. Of course that doesn't help root cause and debug it.17:19
@clarkb:matrix.orgAre there no errors between the restarts of event processing?17:20
@clarkb:matrix.orgSeems like it is not removing the events because it must be failing somewhere along the way?17:20
@mnaser:matrix.org> <@bookwar:fedora.im> I doubt that specific repo is used for anything in its current state, it has several outdated links to deprecated locations, but that's not a problem for me. I am planning to adjust it to my needs anyway. But for that I need the license to permit it.17:21
Sorry, there are a few outstanding patches which haven't landed because I haven't had time to fix the tests for it. However, it is functional with those patches (that are not passing but that's because the tests are borked)
@mnaser:matrix.orgWe use it in production and it works just fine :)17:21
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-helm] 835480: Add license file https://review.opendev.org/c/zuul/zuul-helm/+/83548017:23
@jim:acmegating.commnaser: ^ can you ack that?  that was my understanding at least17:23
@mnaser:matrix.org> <@jim:acmegating.com> mnaser: ^ can you ack that?  that was my understanding at least17:24
+1'd, yeah, apache2 makes sense and that's the intention to follow the rest of the Zuul licensing.
@jim:acmegating.comapevec: i think it would be good for you to ack that as a redhat representative.17:24
@jim:acmegating.commnaser: (well, most of the rest; there is some gpl3 there)17:24
@y2kenny:matrix.orgClark: There are errors (Gerrit connection timeout, max retries, etc.).  Restarting kind of helps but not really?  I can't quite put my finger on it yet17:26
@y2kenny:matrix.orglike... there are event/jobs that got through17:26
@y2kenny:matrix.orgbut feels like something major is blocking.17:27
@y2kenny:matrix.orgI restarted a few times but things didn't move until the last restart and everything seems to be unblocked17:28
@y2kenny:matrix.orgit's possible the problem is really just the network issue on the Gerrit side but then I am not sure how some of the jobs get through and succeed.17:28
@clarkb:matrix.orgI thought that we would eventually fail (we retry gerrit queries but after 3? retries we emit a failure)17:29
@y2kenny:matrix.orgI do get a Max retries exceeded17:29
@clarkb:matrix.orgI would expect it to move on at that point17:36
@clarkb:matrix.orgmaybe we aren't handling that case properly which puts us in a loop because the event isn't removed?17:36
@y2kenny:matrix.orgClark: Is the event stored in zk?  Because they seem to survive a scheduler restart (I could be mistaken though.)17:47
@clarkb:matrix.orgKenny Ho: yes, it is received by a scheduler and then stored in zookeeper in a queue where it is meant to be processed17:51
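The failure mode guessed at above (retries run out but the event stays queued, so it gets picked up again) can be illustrated with a small sketch. This is not Zuul's code: `process_queue`, `QueryError`, and the in-memory `deque` standing in for the ZooKeeper queue are all hypothetical, and the limit of 3 retries is only the number recalled in the discussion.

```python
from collections import deque


class QueryError(Exception):
    """Stands in for a failed Gerrit query (hypothetical)."""


def process_queue(queue, query, max_retries=3):
    """Drain `queue`, trying `query(event)` at most `max_retries` times each.

    The event is removed whether or not the query succeeds, so a failing
    event is reported once instead of being picked up again forever.
    """
    done, failed = [], []
    while queue:
        event = queue[0]  # peek; the event stays queued while we work on it
        try:
            for attempt in range(1, max_retries + 1):
                try:
                    query(event)
                    done.append(event)
                    break
                except QueryError:
                    if attempt == max_retries:
                        failed.append(event)
        finally:
            queue.popleft()  # always acknowledge, even after giving up
    return done, failed
```

If the `popleft()` only ran on success, an event whose query always fails would stay at the head of the queue and be retried indefinitely, which would look like the repeating log lines described earlier.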
@apevec:matrix.org> <@jim:acmegating.com> apevec: i think it would be good for you to ack that as a redhat representative.17:55
you mean for your contributions while employed by Red Hat?
@apevec:matrix.orgI can't ack myself, I can file a ticket for legal17:56
@jim:acmegating.comapevec: yep, just to avoid any doubt, since red hat is the copyright holder.17:56
@bookwar:fedora.im> <@mnaser:matrix.org> Sorry, there are a few outstanding patches which haven't landed because I haven't had time to fix the tests for it.  However, it is functional with those patches (that are not passing but that's because the tests are borked)19:11
Good to know, thank you. I'll see if i can help and contribute fixes back to the main repo then.
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835100: Rely on the unparsed config cache in reconfigurations https://review.opendev.org/c/zuul/zuul/+/83510019:31
@jim:acmegating.comAleksandra Fedorova: https://review.opendev.org/835480 should address your concern about the license19:33
@clarkb:matrix.orgcorvus: I think if you want to +A 835100 now that is probably fine since the edits were important but outside of the code and you had +2s before19:33
@vlotorev:matrix.orgHi, on https://zuul.opendev.org/t/opendev/projects there are projects from multiple connections: github, googlesource.19:53
On the other hand none of these projects from github/googlesource are mentioned in pipelines configuration https://opendev.org/opendev/project-config/src/branch/master/zuul.d/pipelines.yaml.
What's the benefit of adding these projects if they are not enqueued to pipelines? Only to test Depends-On?
@jim:acmegating.comvlotorev: that and required-projects19:54
@vlotorev:matrix.orgThanks.19:55
@clarkb:matrix.orgYa with github repos zuul will cache them for us and put them in place on jobs which is a common use case. We also do depends on with upstream gerrit for our deployment testing to ensure our bug fixes (and others if necessary) work for us20:19
@fungicide:matrix.orgvlotorev: we do also report to some projects on other connections in different tenants, but we try to isolate them so that we don't wind up with broken zuul configuration from projects outside our sphere of control impacting other tenants20:35
@fungicide:matrix.orgspecifically, limiting which projects we will read configuration from20:41
@fungicide:matrix.orgbut also in some cases limiting what specific kinds of configuration we'll allow to be used in them20:42
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835518: Fix recursive Gerrit change query https://review.opendev.org/c/zuul/zuul/+/83551821:11
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835522: Add more submitted-together tests https://review.opendev.org/c/zuul/zuul/+/83552221:57
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:22:10
- [zuul/zuul] 835518: Fix recursive Gerrit change query https://review.opendev.org/c/zuul/zuul/+/835518
- [zuul/zuul] 835522: Add more submitted-together tests https://review.opendev.org/c/zuul/zuul/+/835522
@jim:acmegating.comClark: ^ can you re-review those?  thx22:10
@clarkb:matrix.orgyes22:10
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 835478: Add a note about bwrap and setsid https://review.opendev.org/c/zuul/zuul/+/83547822:44
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 832870: Make promote work for any pipeline manager https://review.opendev.org/c/zuul/zuul/+/83287023:24
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 835100: Rely on the unparsed config cache in reconfigurations https://review.opendev.org/c/zuul/zuul/+/83510023:31
