Thursday, 2022-07-28

-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul-jobs] 851163: emit-job-header: noqa on error ignore https://review.opendev.org/c/zuul/zuul-jobs/+/851163 00:06
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul-jobs] 851014: linters: fix spaces between filters https://review.opendev.org/c/zuul/zuul-jobs/+/851014 00:07
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand:00:22
- [zuul/zuul-jobs] 851015: linters: add names to blocks https://review.opendev.org/c/zuul/zuul-jobs/+/851015
- [zuul/zuul-jobs] 851017: linters: update to ansible-lint 6 https://review.opendev.org/c/zuul/zuul-jobs/+/851017
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand:00:23
- [zuul/zuul-jobs] 851164: ansible-lint: disable progressive mode https://review.opendev.org/c/zuul/zuul-jobs/+/851164
- [zuul/zuul-jobs] 851263: ensure-kubernetes: pull cri-dockerd systemd from tag https://review.opendev.org/c/zuul/zuul-jobs/+/851263
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 851288: linters: standardise on newline at end of file https://review.opendev.org/c/zuul/zuul-jobs/+/851288 03:55
-@gerrit:opendev.org- Ian Wienand proposed:04:08
- [zuul/zuul-jobs] 851288: linters: standardise on newline at end of file https://review.opendev.org/c/zuul/zuul-jobs/+/851288
- [zuul/zuul-jobs] 851289: linters: use Ansible 5 for ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/851289
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 851334: test-requirements: bump to Ansible 5 https://review.opendev.org/c/zuul/zuul-jobs/+/851334 04:19
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 851334: test-requirements: bump to Ansible 5 https://review.opendev.org/c/zuul/zuul-jobs/+/851334 04:44
@tony.breeds:matrix.orgI'm sure I'm not the first person to ask this but ... Given a build is there any way to get the tasks/roles/plays that were run?  Ultimately my aim is to re-run *most* of a job that failed and debug it locally.  I know it's somewhat easy to iterate on a job in zuul but I still feel like something like this would be potentially quicker? 06:27
-@gerrit:opendev.org- Ian Wienand proposed:06:28
- [zuul/zuul-jobs] 851288: linters: standardise on newline at end of file https://review.opendev.org/c/zuul/zuul-jobs/+/851288
- [zuul/zuul-jobs] 851289: linters: use Ansible 5 for ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/851289
- [zuul/zuul-jobs] 851343: Drop py27 tox testing https://review.opendev.org/c/zuul/zuul-jobs/+/851343
- [zuul/zuul-jobs] 851344: test-requirements: drop 3.5 dependencies https://review.opendev.org/c/zuul/zuul-jobs/+/851344
-@gerrit:opendev.org- Ian Wienand proposed:06:37
- [zuul/zuul-jobs] 851344: test-requirements: drop 3.5 dependencies https://review.opendev.org/c/zuul/zuul-jobs/+/851344
- [zuul/zuul-jobs] 851334: test-requirements: bump to Ansible 5 https://review.opendev.org/c/zuul/zuul-jobs/+/851334
- [zuul/zuul-jobs] 851289: linters: use Ansible 5 for ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/851289
@iwienand:matrix.orgzuul-main: https://review.opendev.org/q/topic:ansible-lint-update-6 , previously reviewed and thank you for that 😀 has a few more changes from cleaning up and i figured might as well get o-z-j , project-config and base-jobs on the same level too. 07:47
@westphahl:matrix.orgcorvus Clark yeah I think we should go with 851255 and see if that has any significant performance impact for us. If that should be the case we can still look at 851256 12:37
@fungicide:matrix.org> <@tony.breeds:matrix.org> I'm sure I'm not the first person to ask this but ... Given a build is there any way to get the tasks/roles/plays that were run?  Ultimately my aim is to re-run *most* of a job that failed and debug it locally.  I know it's somewhat easy to iterate on a job in zuul but I still feel like something like this would be potentially quicker? 12:38
the build console view is a rendering of the job-output.json preserved along with your build logs, and has all the playbook and task details
@fungicide:matrix.orgwhether that's sufficient to be able to re-run the essence of a job is another question altogether though. there's been off-again/on-again efforts for a "zuul-runner" tool to query the zuul api for a facsimile of the bits you'd need to recreate a given build12:44
@fungicide:matrix.orgbut actually putting that to use gets pretty tough as soon as you exit the realm of trivial single-node jobs with no dependencies and no external state12:45
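As a rough illustration of the job-output.json suggestion above, this sketch walks such a document and lists the playbooks, plays and tasks it records. The sample data, the task names and the key names (`playbook`, `phase`, `plays`, `play`, `tasks`, `task`, `name`) are assumptions modelled on Ansible's JSON callback layout, so check them against a real job-output.json from your own build logs before relying on this.

```python
import json

# Hypothetical, heavily trimmed sample; a real job-output.json is much larger
# and its exact keys should be verified against an actual build.
SAMPLE = """
[
  {
    "playbook": "playbooks/pre.yaml",
    "phase": "pre",
    "plays": [
      {
        "play": {"name": "all"},
        "tasks": [
          {"task": {"name": "example-role : first task"}},
          {"task": {"name": "example-role : second task"}}
        ]
      }
    ]
  }
]
"""


def list_tasks(job_output):
    """Return (phase, playbook, play, task) tuples in recorded order."""
    rows = []
    for playbook in job_output:
        for play in playbook.get("plays", []):
            play_name = play.get("play", {}).get("name", "")
            for task in play.get("tasks", []):
                task_name = task.get("task", {}).get("name", "")
                rows.append((
                    playbook.get("phase", ""),
                    playbook.get("playbook", ""),
                    play_name,
                    task_name,
                ))
    return rows


rows = list_tasks(json.loads(SAMPLE))
for row in rows:
    print(" | ".join(row))
```

This only recovers what ran and in which order. As noted above, it does not recreate the inventory, secrets or external state a real build depends on.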
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 851414: Revert cri-dockerd changes https://review.opendev.org/c/zuul/zuul-jobs/+/851414 14:22
@jim:acmegating.comClark: ianw ^ that broke nodepool -- error at https://zuul.opendev.org/t/zuul/build/9ae631c37a384b4bb63515c6c5f04a00 14:23
@clarkb:matrix.orgcorvus: I don't have any objections to reverting but the existing testing isn't passing against the revert due to the issues the changes were initially addressing14:43
@jim:acmegating.commight be a good force-merge then?14:43
@jim:acmegating.comor we could pin the minikube version in the role (which is what nodepool is doing to continue testing)14:44
@clarkb:matrix.orgcorvus: or maybe nodepool will work if unpinning minikube? (I wonder if the fixes need new minikube) but ya I think pinning in the role may be a good workaround and then roll forward from there 14:45
@jim:acmegating.comClark: any chance you can do that?  i'm knee deep in other changes to zuul-jobs 14:46
@clarkb:matrix.orgI can take a look after my meeting. Might be an hour or so14:47
@jim:acmegating.comi think we should consider this break somewhat urgent.  i'd like to do a force-merge to correct it then.14:48
@clarkb:matrix.orgok, just keep in mind any other zuul-jobs updates that will trigger those jobs will no longer be landable through normal gating14:48
@jim:acmegating.comyeah, we're going back to status-quo before ianw's stack.  i think that's fine since the stack introduced a regression.14:49
@clarkb:matrix.orgYup. Just mentioning it as you indicated other zuul-jobs work is in progress14:49
@jim:acmegating.comoh yeah, the other work is to try to patch the testing hole that let this through14:50
@clarkb:matrix.orgI've given it a +2 if you want to force merge it14:50
@jim:acmegating.comcool, thanks14:51
-@gerrit:opendev.org- corvus.admin merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851414: Revert cri-dockerd changes https://review.opendev.org/c/zuul/zuul-jobs/+/851414 15:02
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:15:14
- [zuul/zuul-jobs] 851418: Add jammy testing https://review.opendev.org/c/zuul/zuul-jobs/+/851418
- [zuul/zuul-jobs] 851419: Sort supported platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851419
- [zuul/zuul-jobs] 851420: Support subsets of platforms in update-test-platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851420
- [zuul/zuul-jobs] 851421: Test ensure-kubernetes on all Ubuntu platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851421
@jim:acmegating.comClark: ianw ^ I think that series should get us the missing test coverage15:14
@clarkb:matrix.orgcorvus: thanks I'm going to grab breakfast now that my meeting is over and then dig into that15:40
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:15:40
- [zuul/zuul-jobs] 851418: Add jammy testing https://review.opendev.org/c/zuul/zuul-jobs/+/851418
- [zuul/zuul-jobs] 851419: Sort supported platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851419
- [zuul/zuul-jobs] 851420: Support subsets of platforms in update-test-platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851420
- [zuul/zuul-jobs] 851421: Test ensure-kubernetes on all Ubuntu platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851421
- [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425
@jim:acmegating.comthat's a rebase on current master, plus a revert at the end of the stack so we can check the error.15:40
@jim:acmegating.comClark: cool, thanks.  hopefully you can iterate on the last change in the stack and if it passes, we can just squash it with the next-to-last.  the ones above it pass, so no logistical issues there.15:41
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 16:44
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 16:44
@clarkb:matrix.orgcorvus: ^ I think maybe that will fix it. I'll squash everything together if testing looks good16:44
@jim:acmegating.comClark: thanks, that makes sense from what i've picked up so far :)16:48
@clarkb:matrix.organd the reason it broke nodepool is ianw's stack for newer k8s support turned on crio globally16:49
@jim:acmegating.comyep16:49
@clarkb:matrix.orghrm jammy doesn't have packages in that opensuse obs build area17:06
@clarkb:matrix.orgoh maybe I need to use a different crio version17:08
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 17:11
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 17:11
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 17:24
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 17:24
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 851443: Clarify disjoint builders in docs https://review.opendev.org/c/zuul/nodepool/+/851443 17:31
@clarkb:matrix.orgcorvus: ianw ^ My latest changes have the non crio jobs working at the end of the stack but the crio jobs fail17:55
@clarkb:matrix.orgMy hunch is that I'm not configuring minikube to use crio properly in the crio case just based on the difference between crio and docker jobs17:56
@jim:acmegating.combut crio/bionic works17:57
@clarkb:matrix.orgcorvus: yup, but it is installing old crio17:58
@clarkb:matrix.orgI think the issue must be using newer crio and not configuring something? It says the crio service is failing to start17:58
@jim:acmegating.com(since i just went and checked -- the registry crio job runs on focal, so fixing focal for ensure-k8s/crio should fix that)17:58
@clarkb:matrix.orgLet me update to see if crio 1.21 is any different (the first version to have jammy packages)18:00
@clarkb:matrix.orgactually it is 1.22 18:01
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 18:05
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 18:05
@clarkb:matrix.org"runtime validation: invalid runtime_path for runtime 'runc': \"stat /usr/lib/cri-o-runc/sbin/runc: no such file or directory\""" 18:33
@clarkb:matrix.orgFor some reason they make cri-o-runc a separate package and cri-o does not depend on it even though the service does indeed require it 18:35
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 18:37
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 18:37
@clarkb:matrix.org * `runtime validation: invalid runtime_path for runtime 'runc': "stat /usr/lib/cri-o-runc/sbin/runc: no such file or directory""` 18:37
@clarkb:matrix.orgOk I think that last patchset solved the issue. I'll work on a squash after lunch once all the testing reports back18:46
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com:19:00
- [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425
- [zuul/zuul-jobs] 851421: Test ensure-kubernetes on all Ubuntu platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851421
@clarkb:matrix.orgI swapped the order so that they are all mergeable now19:03
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 851259: Fix race in test_node_list_json https://review.opendev.org/c/zuul/nodepool/+/851259 19:51
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 847387: Revert "Pin minikube to 1.25.2" https://review.opendev.org/c/zuul/nodepool/+/847387 20:30
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 847387: Revert "Pin minikube to 1.25.2" https://review.opendev.org/c/zuul/nodepool/+/847387 20:32
@jim:acmegating.comthat should test lifting the pin after the zuul-jobs stack is in place20:32
@clarkb:matrix.orgthat seems to be failing with `nodepool.exceptions.LaunchNodepoolException: main-0000000000: couldn't find token for service account {stuff}` I wonder if newer k8s has api incompatibilities we need to address too? 21:05
@clarkb:matrix.orghuh we ask k8s to create a service account token and wait 30 seconds for that to complete when creating a namespace.21:25
@clarkb:matrix.orgI'm surprised something like a token needs to be waited on and isn't synchronous. But I guess that may be failing for some reason 21:25
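The wait described above is a poll-until-deadline loop. The following is a generic sketch of that pattern, not nodepool's actual code: `wait_for`, its parameters and the fake `fetch` callable are all hypothetical names used for illustration. Only the 30 second default comes from the discussion.

```python
import time


def wait_for(fetch, timeout=30.0, interval=1.0,
             clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or timeout elapses.

    clock and sleep are injectable so the loop can be exercised without
    really waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(
                "no token appeared within %.0f seconds" % timeout)
        sleep(interval)


# Stand-in for an API lookup: nothing on the first two polls, then a value.
attempts = iter([None, None, "fake-token"])
token = wait_for(lambda: next(attempts), timeout=5, interval=0,
                 sleep=lambda seconds: None)
print(token)
```

If the thing being polled for is never created, as appears to be the case in this failure, a loop like this can only time out, which would match the "couldn't find token" error.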
@clarkb:matrix.orghttps://zuul.opendev.org/t/zuul/build/4a94802b53074825a53e5179dde65579/log/minikube.txt#917-920 is that related?21:27
@jim:acmegating.comClark: is that error in response to our request, or is that a startup error?21:40
@clarkb:matrix.orgcorvus: I think it is a startup error. But wondering if we can't create a service account for similar reasons21:43
@clarkb:matrix.orgIn theory the authentication stuff should be completely independent of the container runtime driver. Which is why i suspect this is more related to the newer version of minikube and k8s21:44
@clarkb:matrix.orgLooking at the nodepool side of the logs we get the k8s response json logged while it goes through that 30 second wait for secrets. But I never see any secret info in the json21:58
@clarkb:matrix.orgI'm not familiar enough with the k8s api to know if we should expect that back in this case. But it does seem like maybe the API is doing something that we aren't expecting21:58
@iwienand:matrix.orgthanks, the whole cri-o/jammy stack lgtm.  haven't approved it as is the above saying that it installs but isn't actually happy running?21:59
@clarkb:matrix.orgianw: isn't happy for nodepool. But I'm not yet sure if that is a nodepool or minikube+k8s issue22:00
@clarkb:matrix.orghttps://zuul.opendev.org/t/zuul/build/4a94802b53074825a53e5179dde65579/log/job-output.txt#2804 is the start of where nodepool has problems in the job22:00
@clarkb:matrix.organd it is a service account on the new namespace that we're waiting for tokens for. Not the namespace itself22:01
@clarkb:matrix.orghttps://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/ shows that secrets should be automatically created for you22:03
@clarkb:matrix.orghttps://stackoverflow.com/questions/72256006/service-account-secret-is-not-listed-how-to-fix-it aha22:04
@clarkb:matrix.orgthe k8s docs need updating :)22:04
@clarkb:matrix.orgbut also I think that means it is a nodepool problem22:04
@clarkb:matrix.orgit looks like we can toggle an option to get the old behavior back, but I suspect the best thing for nodepool to do is to actually request tokens via the separate API22:06
@clarkb:matrix.orgas that should be most compatible with other k8s installs22:06
@iwienand:matrix.orgit really takes something to make devstack look like the "simple" option :)22:08
@clarkb:matrix.orghrm the guide there indicates that we'd only be compatible back to 1.2222:08
@clarkb:matrix.orgusing the new thing I mean. Instead we can use time bounded token objects22:08
@clarkb:matrix.orgMaybe nodepool needs to grow an awareness of the k8s version it is speaking to in order to make choices about its behavior :/ 22:09
@clarkb:matrix.orgMight be best to defer to people actually using k8s with nodepool to help determine what is the most appropriate action there. 22:10
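A minimal sketch of the version-awareness idea floated above, assuming the server reports a version string such as `v1.24.3`. The function names and the strategy labels `token-request` and `legacy-secret` are hypothetical, and the 1.22 cutoff is taken only from the compatibility remark in the discussion; none of this is an existing nodepool interface.

```python
import re


def parse_minor(git_version):
    """Extract (major, minor) from a version string such as 'v1.24.3'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError("unrecognised version: %r" % git_version)
    return int(match.group(1)), int(match.group(2))


def token_strategy(git_version):
    """Pick a (hypothetical) strategy label for obtaining a token.

    Assumption: the separate token request API is usable from 1.22 on,
    so older servers fall back to reading the auto-created secret.
    """
    if parse_minor(git_version) >= (1, 22):
        return "token-request"
    return "legacy-secret"


for version in ("v1.21.0", "v1.22.0", "v1.24.3"):
    print(version, token_strategy(version))
```

Comparing `(major, minor)` tuples avoids the string-comparison trap where `"1.9"` sorts after `"1.22"`.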
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 851473: DNM: Test k8s roles update https://review.opendev.org/c/zuul/nodepool/+/85147322:11
@jim:acmegating.comClark ianw ^ there's a noop change just to verify the roles are good for nodepool.  hopefully the zuul-jobs test coverage is sufficient, but since we can't remove the pin right now, that should double check that landing the stack won't break nodepool again.22:12
@clarkb:matrix.org++22:13
@iwienand:matrix.orgso if i'm understanding; the zuul-jobs are uncapped on minikube but have the cri/docker bridge installed now.  but nodepool doesn't seem to work with this version, and is currently pinned to 1.25.2 (which also hasn't introduced the changes that require the cri/docker bridge)?22:26
@iwienand:matrix.orgbasically 851473 is installing 1.25.2+cri-dockerd via zuul-jobs; but the cri-dockerd is just a no-op and runs in the background but nothing uses it? 22:28
@jim:acmegating.comthat's my understanding22:29
@jim:acmegating.comso i guess what this is going to tell us is whether 1.25.2 will work with the new cri-dockerd stuff -- since that isn't tested by zuul-jobs 22:30
@jim:acmegating.com(if it doesn't work, however, i'm not sure there's a very good argument to be made to hold up the zuul-jobs changes.  i don't know how much backwards compat for k8s versions we really expect ensure-kubernetes to provide)22:31
@iwienand:matrix.orgi guess we keep pulling on the thread about service accounts clarkb has started on.  i need to do some serious reading but will see if i can help22:32
@jim:acmegating.com(though still, i suppose wrapping the cri-dockerd in a version check in ensure-k8s might be reasonable) 22:33
@iwienand:matrix.orgyeah, that should be easy enough22:33
@jim:acmegating.comwell it's moot anyway -- it looks like the job passed22:33
@jim:acmegating.comhttps://zuul.opendev.org/t/zuul/build/6247ef9fb4634bb8be8cfdf3cfacb451 22:34
@iwienand:matrix.orgok, cool, so basically it just sits there unused22:34
@jim:acmegating.comi think everything is clear to land now22:34
