-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 872908: Cleanup old rebase-merge dirs on repo reset https://review.opendev.org/c/zuul/zuul/+/872908 | 06:36 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 874096: Update reconfig event ltime on (smart) reconfig https://review.opendev.org/c/zuul/zuul/+/874096 | 07:39 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 872908: Cleanup old rebase-merge dirs on repo reset https://review.opendev.org/c/zuul/zuul/+/872908 | 09:17 | |
-@gerrit:opendev.org- Benedikt Löffler proposed: [zuul/nodepool] 802255: Use optional upload script for uploading an image https://review.opendev.org/c/zuul/nodepool/+/802255 | 09:52 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 873692: Return cached Github change on concurrent update https://review.opendev.org/c/zuul/zuul/+/873692 | 10:14 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 873692: Return cached Github change on concurrent update https://review.opendev.org/c/zuul/zuul/+/873692 | 10:15 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 873692: Return cached Github change on concurrent update https://review.opendev.org/c/zuul/zuul/+/873692 | 10:17 | |
@q:fricklercloud.de | can we get a new tag for zuul soon? seems there are a lot of unreleased fixes like https://review.opendev.org/c/zuul/zuul/+/872519 which just hit us after upgrading to 8.1.0 | 11:51 |
---|---|---|
@q:fricklercloud.de | how do other deployments handle this, do you all run from latest like #opendev? or do you build your own images? | 12:03 |
@q:fricklercloud.de | fwiw, trying with :latest containers gives me: | 12:19 |
`2023-02-17 12:15:40,351 ERROR zuul.WebServer: ValueError: ('Could not deserialize key data. The data may be in an incorrect format, it may be encrypted with an unsupported algorithm, or it may be an unsupported key type (e.g. EC curves with explicit parameters).', [_OpenSSLErrorWithText(code=75497580, lib=9, reason=108, reason_text=b'error:0480006C:PEM routines::no start line')])` | ||
@q:fricklercloud.de | disregard the last msg, that was a local deployment issue | 12:25 |
-@gerrit:opendev.org- Marvin Becker proposed: [zuul/nodepool] 873716: Add gpu support for k8s/openshift pods https://review.opendev.org/c/zuul/nodepool/+/873716 | 13:37 | |
@jim:acmegating.com | i think next week (after the next opendev restart) would be a good time for a release assuming everything still looks good | 14:52 |
@jpew:matrix.org | Any chance https://review.opendev.org/c/zuul/zuul/+/873742 could make the next release? | 15:06 |
-@gerrit:opendev.org- Marvin Becker proposed: [zuul/nodepool] 873716: Add gpu support for k8s/openshift pods https://review.opendev.org/c/zuul/nodepool/+/873716 | 16:05 | |
@clarkb:matrix.org | jpew: corvus I left a +2 on 873742 but noted a corner case where I think we'll still do the wrong thing. It should be less frequent than with the prior patchset though, hence my +2 if this gets most things moving along again | 16:46 |
@jim:acmegating.com | Clark: jpew that's more of a new behavior than a regression bugfix -- i don't think we should rush it | 17:06 |
@jpew:matrix.org | I thought Zuul never merged empty commits? | 17:06 |
@jpew:matrix.org | corvus: Our deploy pipelines are broken :/ | 17:07 |
@jim:acmegating.com | jpew: yeah i get the urgency for you but they've always been broken | 17:07 |
@jim:acmegating.com | this has some pretty serious implications for version handling, especially with tags | 17:07 |
@jim:acmegating.com | empty commits is actually one of the only ways you can get certain consistent behavior in zuul with tags | 17:08 |
@jim:acmegating.com | so anyway, let's please not rush this. in the meantime you can run it with your patch locally applied | 17:08 |
@jpew:matrix.org | Fair enough..... does it need to be a property of the event then instead? | 17:10 |
@jim:acmegating.com | (i wrote a bit more about the tag case on the change btw) | 17:10 |
@jpew:matrix.org | Or I guess if FETCH_HEAD is empty, we can not remove it | 17:12 |
@jim:acmegating.com | jpew: re event property: i think that might be one option (and i think Clark suggested something like that as a possibility) -- but realistically, the data model in zuul requires this be a property of the Change(Ref) object, so if it originates at the event, it needs to end up in the Change somehow. so some kind of a merged flag might help. re fetch_head: that sounds promising too... | 17:17 |
@jpew:matrix.org | The FETCH_HEAD is pretty simple, I'll work it up quick after lunch | 17:17 |
@clarkb:matrix.org | re not merging empty commits you can still push a merge commit and land that and have it work out properly I think. It's just that git handles the merge situation for us and we don't have to think about it much | 17:18 |
@jim:acmegating.com | Clark: aiui the change from jpew would unwind that even in a check pipeline; that makes me uncomfortable | 17:19 |
@jim:acmegating.com | to try to clarify: 1) empty commits are vital for some operations; 2) we don't want to alter the behavior of empty commits in check/gate | 17:20 |
@jpew:matrix.org | Clark: Cherry-pick already has a special case for merges | 17:20 |
@jim:acmegating.com | Clark: oh i see i misunderstood what you wrote | 17:21 |
@jpew:matrix.org | (which this doesn't change) | 17:21 |
@clarkb:matrix.org | > <@jim:acmegating.com> Clark: aiui the change from jpew would unwind that even in a check pipeline; that makes me uncomfortable | 17:21 |
Yes I think that is correct. But it would only affect the case where someone pushes an explicit empty commit. I have no idea if anyone does that, but being cautious seems fine. | ||
@clarkb:matrix.org | its definitely a corner case. But one that is possible | 17:21 |
@jim:acmegating.com | but yeah, i still don't think we want to require someone make a merge in order to cherry-pick an empty commit | 17:21 |
@jim:acmegating.com | i'm saying it's not a corner case | 17:21 |
@jim:acmegating.com | i'm saying it's absolutely 100% critical to certain workflows | 17:21 |
@jpew:matrix.org | You push direct to git, or push an empty commit to gerrit | 17:21 |
@clarkb:matrix.org | ok I wasn't aware of anyone using empty commits. But if they are then yes this would be problematic | 17:22 |
@jim:acmegating.com | yep i wrote about it a bit on the change -- it's the only way to get consistent behavior from tag-based pipelines on projects with multiple branches which merge back into master. | 17:22 |
@jpew:matrix.org | Ya, makes sense | 17:23 |
@jim:acmegating.com | (even if it weren't an important existing use case, we should always be careful/skeptical and try to minimize the difference between what zuul does and what the code review system will do) | 17:24 |
@jim:acmegating.com | (and to be clear, i think jpew is also trying to achieve that in the change being worked on -- it's a fine line we're going to have to walk here :) | 17:25 |
@jpew:matrix.org | corvus: Although ostensibly, cherry-pick mode with an empty commit is already broken today (because it doesn't pass --allow-empty) :) | 17:29 |
@jpew:matrix.org | But, it makes sense to fix it at the same time I suppose | 17:30 |
@jim:acmegating.com | yeah, now that we're thinking about it, we should be thorough :) | 17:34 |
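The `--allow-empty` point raised above can be reproduced with plain git, outside of Zuul. The following is a minimal sketch, assuming a `git` binary is available on `PATH`. It drives the git CLI through `subprocess` and is not how Zuul's merger is implemented; the function name `cherry_pick_empty_commit` and the helper `git` are hypothetical names used only for illustration.

```python
import os
import subprocess
import tempfile

# Identity and signing are set per command so the sketch does not depend on
# the caller's global git configuration.
GIT = [
    "git",
    "-c", "user.name=Example",
    "-c", "user.email=example@example.com",
    "-c", "commit.gpgsign=false",
]


def git(repo, *args):
    """Run a git command inside repo and return the CompletedProcess."""
    return subprocess.run(
        GIT + list(args), cwd=repo, capture_output=True, text=True
    )


def cherry_pick_empty_commit():
    """Return git's exit codes for cherry-picking an empty commit,
    first without and then with --allow-empty."""
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "-q")
        with open(os.path.join(repo, "README"), "w") as f:
            f.write("base\n")
        git(repo, "add", "README")
        git(repo, "commit", "-q", "-m", "base")

        # Create an explicitly empty commit, remember it, then move HEAD back.
        git(repo, "commit", "-q", "--allow-empty", "-m", "empty")
        empty_sha = git(repo, "rev-parse", "HEAD").stdout.strip()
        git(repo, "reset", "-q", "--hard", "HEAD~1")

        plain = git(repo, "cherry-pick", empty_sha)
        # The failed pick leaves a cherry-pick in progress; clear it first.
        git(repo, "cherry-pick", "--abort")
        allowed = git(repo, "cherry-pick", "--allow-empty", empty_sha)
        return plain.returncode, allowed.returncode
```

Calling `cherry_pick_empty_commit()` returns a pair of exit codes. The first pick is expected to fail because git refuses to cherry-pick an empty commit by default, and the second is expected to succeed.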
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [zuul/zuul] 872364: Switch our local testing docker-compose to mysql 8.0 https://review.opendev.org/c/zuul/zuul/+/872364 | 17:55 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 873470: Match events to pipelines based on topic deps https://review.opendev.org/c/zuul/zuul/+/873470 | 18:12 | |
-@gerrit:opendev.org- Joshua Watt proposed: [zuul/zuul] 873742: merger: Keep redundant cherry-pick commits https://review.opendev.org/c/zuul/zuul/+/873742 | 19:22 | |
@jpew:matrix.org | Ok. Fixed I think. Also added a test case to make sure that actually empty commits are preserved | 19:22 |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 872368: Allow default-ansible-version to be an int https://review.opendev.org/c/zuul/zuul/+/872368 | 19:30 | |
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 874096: Update reconfig event ltime on (smart) reconfig https://review.opendev.org/c/zuul/zuul/+/874096 | 19:30 | |
@goneri:matrix.org | Hi, we often face a situation where the job fails before the logs are uploaded. To simplify the troubleshooting, I started an app that actually collects all the logs as soon as the jobs start. https://github.com/goneri/the_zuul_watcher | 20:55 |
@fungicide:matrix.org | normal job failures shouldn't interfere with log uploads. sometimes there are failures uploading logs though. or nodes crashing and becoming unreachable before logs can be collected from them. one of the things we talked about was having the zuul info upload first in a separate task. it might also be possible for executors to save and stream console logs and json locally and upload before collecting log files from job nodes, worth looking into | 21:00 |
@jim:acmegating.com | fungi: even the failure to fetch logs should not interfere with log uploads in a properly constructed job. the base job should have log uploads in the last post-run playbook, and it should have fetching logs in the second-to-last post-run playbook. opendev's base jobs are constructed this way, and i recommend that as a pattern to follow. | 21:14 |
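The base job layout described above might look roughly like the following. This is a sketch only: `post-run` and `parent` are standard Zuul job attributes, but the playbook paths are hypothetical placeholders and are not taken from this discussion or from opendev's actual configuration.

```yaml
# Sketch only: the playbook paths below are hypothetical placeholders.
- job:
    name: base
    parent: null
    post-run:
      # Second-to-last post-run playbook: fetch logs from the job nodes.
      - playbooks/base/post-fetch.yaml
      # Last post-run playbook: upload whatever was collected,
      # including job-output.txt, and nothing else.
      - playbooks/base/post-upload.yaml
```

Keeping the upload alone in the final playbook means a failure while fetching logs from a node does not share a playbook with the upload step.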
@goneri:matrix.org | Well, for instance we recently had a situation where Zuul was able to connect to some of the nodes https://paste.centos.org/view/35afb08b, it was a misconfiguration of a provider. The application above was useful to troubleshoot the problem. | 21:17 |
@jim:acmegating.com | Gonéri 🇺🇦: that should have still appeared in the job-output.txt and been uploaded in the final post-run playbook assuming the pattern i described above | 21:19 |
@jim:acmegating.com | Gonéri 🇺🇦: but your program would certainly be useful in the case that it couldn't reach the log server for upload. those incidents should be recorded in the executor log, but this would make it easier for a non-admin to troubleshoot | 21:20 |
@jim:acmegating.com | Gonéri 🇺🇦: thanks for sharing -- i do see your program as useful. :) i also know that some people misconfigure zuul (for example, putting log uploads in the run playbook, or combining them in a playbook with other roles/tasks, or putting them in the cleanup playbook) and in those cases, they might see more "log failures" than would normally be expected. so i want to highlight that in case other factors are at play. | 21:22 |
@goneri:matrix.org | I don't think the base job did even start in my case. The job was failing before. | 21:24 |
@jim:acmegating.com | that might arguably be a bug in handling job setup failures then | 21:25 |
@jim:acmegating.com | (i say arguably because if we can't setup, the right choice might be to do nothing) | 21:27 |
@jim:acmegating.com | Gonéri 🇺🇦: anyway, thanks for sharing the problem description and your solution :) | 21:28 |
@clarkb:matrix.org | one use case that could be useful for is when jobs crash say with nested virt | 21:31 |
@clarkb:matrix.org | being able to create a record outside of the browser easily would be nice for those situations | 21:32 |
@jim:acmegating.com | Clark: how so? why wouldn't that be in the normal job-output.txt? | 21:32 |
@clarkb:matrix.org | corvus: I seem to recall it wasn't and the last time we had this going on users were asked to keep streams open to get the data | 21:35 |
@clarkb:matrix.org | I don't know why that was the case though, but definitely recall people needing to open the various console streams for the jobs they were interested in | 21:35 |
@jim:acmegating.com | Clark: that seems weird to me because the web stream is just a stream of the job-output.txt file, so the only reason not to have that at the end of the job is that uploading failed for some reason. shouldn't have anything to do with the remote node after the job starts (Gonéri 🇺🇦 found a case where it's a problem before the job starts) | 21:36 |
@jim:acmegating.com | Clark: even a timeout shouldn't affect that, as we give each post-run playbook its own timeout | 21:38 |
@clarkb:matrix.org | ya I may have conflated issues too. I remember people doing that for some reason though | 21:38 |
@jim:acmegating.com | well, it can sure be handy when things go really wrong :) | 21:39 |
@clarkb:matrix.org | maybe it was when tripleo had post-run timeouts which might have broken uploads or at least recording of where to find them | 21:39 |
@jim:acmegating.com | maybe that was before the split into 2 post-run playbooks? | 21:39 |
@jim:acmegating.com | or maybe was timing out the upload trying to upload way too much data? | 21:40 |
@clarkb:matrix.org | ya in tripleo's case it was processing the logs before uploading taking tons of time iirc. Which would be before splitting things I guess | 21:41 |
@jim:acmegating.com | yeah, the split into 2 playbooks solves a lot of probs | 21:41 |
@jim:acmegating.com | still, maybe ensuring that the upload does the job-output.txt first might not be a bad idea | 21:41 |
@jim:acmegating.com | (and maybe in doing so, it could set the return value early too) | 21:42 |
@jim:acmegating.com | but that gets complex for the synthetic index generation phase | 21:42 |
@clarkb:matrix.org | there were a couple things when fungi and I started looking at that. One was the indexes iirc. I forget the other | 21:42 |
@clarkb:matrix.org | its doable but needs effort | 21:42 |
@jim:acmegating.com | so maybe that needs to be part of the upload routine in ansible python itself | 21:42 |
@fungicide:matrix.org | oh. yes i forgot we'd already merged the change to upload the basic logs first in a separate playbook | 21:45 |
@goneri:matrix.org | We also had a case where the gather_facts was timing out. It was the same: without the Websocket opened at the right time, it was hard to troubleshoot the problem. | 21:48 |
-@gerrit:opendev.org- Joshua Watt proposed: [zuul/zuul] 873742: merger: Keep redundant cherry-pick commits https://review.opendev.org/c/zuul/zuul/+/873742 | 22:47 | |
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 873692: Return cached Github change on concurrent update https://review.opendev.org/c/zuul/zuul/+/873692 | 22:55 | |
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 872908: Cleanup old rebase-merge dirs on repo reset https://review.opendev.org/c/zuul/zuul/+/872908 | 23:07 |