Tuesday, 2023-01-03

jandersgood morning Ironic o/07:51
jandersHappy (Gregorian) New Year 2023 everyone!07:51
rpittaugood morning ironic! o/08:19
rpittauvmedia is not happy with 2023 already :/08:19
opendevreviewRiccardo Pittau proposed openstack/sushy-tools master: Fix tox4 error  https://review.opendev.org/c/openstack/sushy-tools/+/86876008:27
rpittauseeing this in the vmedia job logs https://8d73c19295e479ed277f-694cde954935b887fd0bda99c57e475e.ssl.cf2.rackcdn.com/867962/3/check/ipa-tempest-uefi-redfish-vmedia-src/d1fdf05/controller/logs/ironic-bm-logs/node-0_console_2023-01-02-17%3A57%3A49_log.txt09:07
rpittaucould be an issue due to uefi+libvirt09:07
opendevreviewRiccardo Pittau proposed openstack/ironic master: Use jammy for base jobs  https://review.opendev.org/c/openstack/ironic/+/86905209:24
ajyaHi, in wallaby sushy docs job fails with "make is not allowed, use allowlist_externals to allow it". Looks like a change in newer tox. Not sure why it's wallaby only. Should add make to allowlist_externals? https://review.opendev.org/c/openstack/sushy/+/86878809:25
ajyaOr was PDF generation removed in newer branches?09:25
rpittauajya: I think we added make to allowlist starting from xena, so that needs to be added09:36
rpittauor better, we need to change from whitelist to allowlist09:37
ajyarpittau:  backport this https://review.opendev.org/c/openstack/sushy/+/796399? Or still need to keep 3.9.0 as min version in wallaby?09:47
rpittauajya: that's perfect, thanks, go ahead with the backport, I'll approve it09:48
rpittaulooks like we're using tox4 for all the stable branches too, so any change due to that needs to be backported down to wallaby09:56
rpittauor we cap tox in older branches, but I doubt we'll do that09:56
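For reference, the whitelist-to-allowlist rename discussed above can be sketched in Python. This is only an illustration, not the content of the linked backport: it assumes a tox.ini that still uses the older `whitelist_externals` key and rewrites it to `allowlist_externals`, the name used in the error message quoted earlier. The function name and the sample config are hypothetical.

```python
import re

# Hypothetical sample; the real sushy tox.ini may look different.
SAMPLE_TOX_INI = """[testenv:pdf-docs]
whitelist_externals = make
commands = make -C doc/build/pdf
"""


def rename_whitelist_externals(tox_ini_text: str) -> str:
    """Rename the legacy key to the one named in the tox error message."""
    return re.sub(
        r"^([ \t]*)whitelist_externals([ \t]*=)",
        r"\1allowlist_externals\2",
        tox_ini_text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    print(rename_whitelist_externals(SAMPLE_TOX_INI))
```

As the chat notes, the minimum tox version declared for the branch has to be raised along with the rename, since older tox releases do not know the new key.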
opendevreviewAija Jauntēva proposed openstack/sushy stable/wallaby: Update min version of tox to use allowlist  https://review.opendev.org/c/openstack/sushy/+/86904110:03
opendevreviewAija Jauntēva proposed openstack/sushy stable/wallaby: Update min version of tox to use allowlist  https://review.opendev.org/c/openstack/sushy/+/86904110:04
ajya^ ready10:06
rpittauajya: sorry, I think we need to include that in https://review.opendev.org/c/openstack/sushy/+/868788 otherwise other jobs won't pass :/10:18
ajyarpittau:  right, then dropping this and kamleshChauvhan will update his patch10:20
rpittauajya: thanks, realized a bit late :/10:20
rpittauTheJulia, JayF, when you have a moment please check https://review.opendev.org/c/openstack/ironic/+/86852110:28
ajyarpittau: np, should have realized myself11:01
rpittauI wonder how the functional tests were passing until now if IRONIC_BAREMETAL_BASIC_OPS was set to False and the macs file was not there, the job should've failed at https://opendev.org/openstack/ironic/src/branch/master/devstack/lib/ironic#L2379 but the trap was not triggering until switching to jammy11:27
opendevreviewRiccardo Pittau proposed openstack/ironic master: [WIP] Use jammy for base jobs  https://review.opendev.org/c/openstack/ironic/+/86905211:27
opendevreviewkamlesh chauvhan proposed openstack/sushy stable/wallaby: Fix tox4 and setuptools errors, update tox version  https://review.opendev.org/c/openstack/sushy/+/86878811:45
iurygregorymorning Ironic12:18
jandershey iurygregory o/12:23
iurygregoryjanders, o/ happy new year =)12:23
opendevreviewMerged openstack/ironic-python-agent master: Fix for tox4 and setuptools  https://review.opendev.org/c/openstack/ironic-python-agent/+/86796212:31
ajyaWallaby patch is passing, but newer branches have blockers. Should W+1 wallaby patch or wait for newer ones to merge first? https://review.opendev.org/c/openstack/sushy/+/86878813:23
jandersiurygregory thank you - Happy New Year 2023 to you, too!13:32
rpittauajya: let's merge that and xena, for zed we need to wait for ironic patch to merge and I think we'll have to do some backports for yoga too13:36
ajyarpittau: ok13:38
opendevreviewRiccardo Pittau proposed openstack/ironic stable/yoga: Fix CI  https://review.opendev.org/c/openstack/ironic/+/86904513:38
rpittauthis ^ is the yoga backport, probably we can remove the increase memory part, we'll see13:39
rpittaustill need to check why the image has doubled its size13:39
rpittaummm I think I need a separate change to fix the file not found error for the macs file13:41
opendevreviewRiccardo Pittau proposed openstack/ironic master: Do not look for IRONIC_VM_MACS_CSV_FILE if we don't generate it  https://review.opendev.org/c/openstack/ironic/+/86908413:47
rpittauthis ^ should avoid the issue with the missing macs file in jammy13:47
rpittaunot sure why it does not trigger the trap until focal13:47
rpittaumaybe a bug in the shell? ¯\_(ツ)_/¯13:48
opendevreviewRiccardo Pittau proposed openstack/ironic master: [WIP] Use jammy for base jobs  https://review.opendev.org/c/openstack/ironic/+/86905213:49
opendevreviewRiccardo Pittau proposed openstack/ironic master: Use jammy for base jobs  https://review.opendev.org/c/openstack/ironic/+/86905213:49
opendevreviewRiccardo Pittau proposed openstack/ironic master: Use jammy for base jobs  https://review.opendev.org/c/openstack/ironic/+/86905213:49
opendevreviewOleksandr Kozachenko proposed openstack/networking-generic-switch master: Add ArubaOS-CX switch support  https://review.opendev.org/c/openstack/networking-generic-switch/+/86859814:27
opendevreviewMerged openstack/sushy stable/wallaby: Fix tox4 and setuptools errors, update tox version  https://review.opendev.org/c/openstack/sushy/+/86878814:35
opendevreviewRiccardo Pittau proposed openstack/ironic master: [WIP] [PoC] A metal3 CI job  https://review.opendev.org/c/openstack/ironic/+/86387314:59
opendevreviewRiccardo Pittau proposed openstack/ironic master: Create IRONIC_VM_MACS_CSV_FILE if it does not exist  https://review.opendev.org/c/openstack/ironic/+/86908415:09
opendevreviewRiccardo Pittau proposed openstack/ironic master: Use jammy for base jobs  https://review.opendev.org/c/openstack/ironic/+/86905215:10
opendevreviewMerged openstack/sushy stable/xena: Fix tox4 and setuptools errors  https://review.opendev.org/c/openstack/sushy/+/86878715:27
TheJuliaGood morning15:37
rpittaugood morning TheJulia :)15:40
TheJuliarpittau: regarding the last part of the sentence on https://review.opendev.org/c/openstack/ironic/+/868521/4/releasenotes/notes/fix-context-image-hardlink-16f452974abc7327.yaml#615:53
TheJuliadoes that mean the deploy will, or will not work15:53
TheJuliaI ask because depending on how you read the release note, it is not entirely clear15:54
TheJuliaoh wow :( Image sizes16:08
TheJuliaipa-centos9-stable-zed.tar.gz 2022-12-15 20:13 926M 16:08
TheJulia[   ]ipa-centos9-master.tar.gz 2023-01-03 12:41 805M 16:08
TheJuliathat is 2x the size16:08
* TheJulia downloads one to begin the examination16:09
samuelkunkel[m]In August 2022 it was around 450M...16:09
samuelkunkel[m]At least that's the size of our ipa from back then16:10
TheJuliagood dat apoint, I wonder if they moved firmware around or added new firmware16:10
TheJuliadata point16:10
TheJuliaOur yoga image is like 417 MB16:14
TheJuliaI do remember during Zed we were hitting around 430mb or so16:14
* TheJulia wonders if there could be an ipa image downloading song16:16
TheJuliabut not "222 megabytes of ipa, download one, 223 megabytes of ipa"16:17
samuelkunkel[m]that matches. Just checked - it's even smaller.16:18
samuelkunkel[m]413M Aug 31 20:08 ipa.initramfs16:19
TheJuliarpittau: is the vmedia stuff from yesterday in relation to the macs hwinfo file?16:23
*** dansmith_ is now known as dansmith16:24
rpittauTheJulia: re https://review.opendev.org/c/openstack/ironic/+/868521/4/releasenotes/notes/fix-context-image-hardlink-16f452974abc7327.yaml#6 the deployment won't work if selinux is enabled and enforcing where the hardlink is created16:30
TheJuliaThat seems like a separate thing we need to address16:31
rpittauTheJulia: re vmedia stuff is not related to the macs hwinfo, I think. Puzzling enough, jobs should fail if that does not exist, but they don't, I noticed only in jammy where it correctly exits with a trap16:31
TheJuliais there a check we can run? 16:31
TheJuliaokay16:31
TheJuliaw/r/t vmedia, is there a pristine example of a failure?16:31
rpittauTheJulia: for the selinux stuff you can check the bifrost CI16:31
rpittaufor vmedia this is zed but the failure is the same https://zuul.opendev.org/t/openstack/build/fdd84facbd5f45a495483fd0c81a4ad016:33
rpittauI need to go now, but I will check msgs later o/16:34
JayFThat isn't like the failure I saw on master and was digging yesterday16:37
JayFthat one was a single failure error that it never completed cleaning16:37
TheJuliaw/r/t vmedia: https://b07845379824cfa9e48f-ea6c0ca013bce2ed83ca9ffa0031d5bd.ssl.cf1.rackcdn.com/868789/1/check/ironic-tempest-uefi-redfish-vmedia/fdd84fa/controller/logs/ironic-bm-logs/node-1_console_2023-01-03-09%3A35%3A53_log.txt16:41
TheJuliathere is our problem16:41
JayFI've seen a lot of errors on server consoles16:41
JayFthat's a new one for me :)16:41
TheJuliano oom or anything... although I think there is a case where grub can do something like that in a resource constrained VM16:42
TheJuliawe should fix the ramdisk first, tbh16:42
TheJuliaand see where things go from there16:42
* TheJulia extracts the giant ramdisk of doom16:42
JayFDid you find anything obviously outta whack (outside of "it's big" with the ramdisk?)16:42
TheJuliayup16:45
JayFpath to firmware root dir changed?16:45
TheJuliahttps://www.irccloud.com/pastebin/kz56YovN/16:45
TheJuliajunk in /var/tmp....16:45
JayFegad16:46
TheJulialooks like there is a slightly larger Mellanox firmware image now16:47
TheJuliabut only 66 mb, so not a huge gain and people do use those cards16:47
TheJulia(and each file is still reasonably sized)16:47
JayFI'm amazed we went this long without stripping /var/tmp16:47
TheJuliawell, we expect it to be empty16:50
TheJuliawe don't expect dracut to be leaving things around16:50
JayFis /var/tmp a ramdisk on a booted machine? 16:51
JayFIt is on many distros; I've not used CentOS 9 stream though....16:51
TheJuliayeah, in the ramdisk16:51
TheJuliain the base image, I believe it is just an empty folder16:51
JayFHow do we end up with dracut in the var/tmp then?16:52
TheJuliadracut likely gets triggered in the image build16:53
TheJuliato include modules16:53
TheJuliaI'll have to check the dib source16:53
opendevreviewJulia Kreger proposed openstack/ironic-python-agent-builder master: Remove /var/tmp/* from images  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/86909716:57
JayFTheJulia: I wonder if we should make CI fail if IPA jobs produce an image greater than "N"17:03
TheJuliaso it looks like there should be only one or two folders based upon what I see in the dracut scripting and dib17:07
TheJulialooks like the last change was to drop py2 support from dib, but other things can also trigger dracut to regenerate like updated kernels17:08
TheJuliaCI... maybe. But not the image build itself, some operators need the absurd ramdisk images :\17:09
JayFyeah, of course17:10
JayFthe other thing I've pondered, if we should publish  a CI-only image with more aggro firmware pruning17:10
TheJuliathat would be tinyipa really :)17:37
TheJuliait has only cpu firmware17:37
JayFthat is true17:48
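The "fail CI if the image is greater than N" idea floated above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function name, the 500 MiB threshold, and the image path are hypothetical, and nothing like this is confirmed to exist in ironic-python-agent-builder.

```python
import os
import sys

# Hypothetical threshold; the chat only refers to it as "N".
DEFAULT_MAX_MIB = 500


def check_image_size(path: str, max_mib: int = DEFAULT_MAX_MIB) -> int:
    """Return the image size in bytes, or raise if it exceeds max_mib."""
    size = os.path.getsize(path)
    limit = max_mib * 1024 * 1024
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, above the {max_mib} MiB limit"
        )
    return size


if __name__ == "__main__":
    # Usage: python check_image_size.py <path-to-image> [max-MiB]
    if len(sys.argv) < 2:
        sys.exit("usage: check_image_size.py <path-to-image> [max-MiB]")
    image = sys.argv[1]
    limit_mib = int(sys.argv[2]) if len(sys.argv) > 2 else DEFAULT_MAX_MIB
    try:
        print(check_image_size(image, limit_mib))
    except ValueError as exc:
        sys.exit(str(exc))
```

Per the concern raised above, such a check would belong in the CI job rather than in the image build itself, since some operators legitimately need very large ramdisks.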
TheJulia869097 is looking like it is in good shape so far18:02
TheJuliaJust need one more core18:03
JayFiurygregory: you around to land https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/86909718:04
iurygregoryJayF, yes18:08
iurygregoryCI looks green based on status, lgtm +W18:09
TheJuliaso the vmedia issue *may* be ovmf18:12
JayFovmf?18:12
TheJuliauefi firmware reference code18:19
TheJuliabut... I'm going to run it past some grub folks I know to see what they say18:19
JayFWhat is our fix in the short term?18:21
JayFmake the job nonvoting?18:21
TheJuliaI'm not sure yet18:21
TheJuliagrub maintainers prodded18:22
TheJulia"hey, does this cause deja vu?"18:22
TheJuliaThere is an open debian bug with the same invalid opcode from March18:22
TheJuliaThere is a super super fimilar RHCOS bug as well, but I think it was one of the things we already found and got fixed (unless something reverted)18:23
TheJuliaerr, similar18:24
JayFEven if that is correct, it'll take days for that to roll downstream to us, yeah?18:27
TheJuliapossibly longer, but they think it might be the wrong grub binary18:33
* TheJulia waits()18:33
TheJuliarhel recently did some simplification of the /boot/EFI structure because the wrong file was getting loaded in some cases and bad things would happen18:34
TheJuliaJayF: by chance have you submitted an operator feedback sort of session for the summit?18:38
JayFI've not submitted anything for summit18:38
JayFForum stuff is open until April; I would think that kind of feedback would be forum?18:38
TheJuliayeah, but I know some folks are already submitting stuff, so I wouldn't wait until april18:39
TheJuliaso the esp image is ubuntu artifacts18:43
TheJuliaand the ramdisk is centos18:44
TheJuliaI think that might be our issue :\18:44
JayFBluntly; that sounds like a bug in any event18:44
JayFYeah?18:44
JayFlike not our bug, like, their bug18:44
JayFIs there a valid reason for them to be incompatible?18:45
TheJuliawell18:45
TheJuliaI'm not sure, but I'm also wondering if this is the ovmf bug I stumbled upon in debian18:45
TheJuliawhich makes no sense18:45
JayFDo you have that bug handy for me to read up on?18:47
opendevreviewJulia Kreger proposed openstack/ironic master: Use centos grub artifacts with centos ramdisk for vmedia  https://review.opendev.org/c/openstack/ironic/+/86910319:14
TheJuliaI hope that works19:15
TheJuliaJayF: ^^^ 19:15
TheJuliaexplanation of why support of just "working" is iffy as well in the commit message19:15
JayFnot a bad approach from our perspective19:16
JayFbut I really, really hope upstream isn't dropping this19:16
TheJuliawhat do you mean "dropping this"?19:21
JayFmeaning like, this failure scenario is not OK19:22
JayFI should be able to mix and match those as a user19:22
JayF(upstream in this case being grub/ubu/centos)19:23
TheJuliait is not okay if secure boot is enforcing19:23
TheJuliabecause it shouldn't work with signed bits19:23
TheJuliabut these are not19:23
TheJuliawhich is... depressing19:23
TheJuliawe'll see19:23
JayFI mean, that's what I'm saying19:24
JayFif it's validating that the boot chain is the same with secure boot off, that's bad behavior19:24
JayFand will be breaky to folks in some rescue/recovery situations 19:24
TheJuliawoot, that change blew up horribly19:28
TheJulialike everything is red19:28
* TheJulia feels like this is success for today19:28
TheJuliai *think* there is another fundamental challenge at play, which is the fragmentation of grub2 overall.19:30
TheJulialike there are config differences between dialects of grub2 now19:30
TheJuliawhich... *insert scream here*19:30
opendevreviewJulia Kreger proposed openstack/ironic master: Use centos grub artifacts with centos ramdisk for vmedia  https://review.opendev.org/c/openstack/ironic/+/86910319:34
TheJulianow we wait()19:34
opendevreviewJulia Kreger proposed openstack/ironic-python-agent-builder stable/zed: Remove /var/tmp/* from images  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/86911819:44
opendevreviewMerged openstack/ironic-python-agent-builder master: Remove /var/tmp/* from images  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/86909720:12
JayFfungi: how do I manually kick off a "post" job? We need ironic-python-agent-build-image-dib-centos9 to run and publish a new IPA image based on changes we just put into ironic-python-agent-builder20:41
fungiJayF: normally the way you do it is to merge something in the repo. worst case a sysadmin can re-enqueue the last thing that merged to the branch20:42
JayFIf we merge a change to that repo, it'd be 100% just to kick that job off20:43
JayFwhich I can do if it's the preferred method...20:43
fungiwe do it all the time for our container image builds. some of our dockerfiles have comment lines that literally say something like "update to trigger a new image upload yyyy-mm-dd"20:43
fungii take it the ironic-python-agent-build-image-dib-centos9 job isn't triggered by merges to ironic-python-agent-builder... should it be?20:45
opendevreviewMerged openstack/ironic-inspector master: Remove lib/neutron-legacy leftovers  https://review.opendev.org/c/openstack/ironic-inspector/+/86819220:45
opendevreviewJay Faulkner proposed openstack/ironic-python-agent master: Remove old, unused proxy.sh file  https://review.opendev.org/c/openstack/ironic-python-agent/+/86910520:46
JayFfungi: I was thinking about that, it's a possibility but we'd need to review to make sure we CI enough in that repo 20:46
JayFiurygregory: TheJulia: https://review.opendev.org/c/openstack/ironic-python-agent/+/869105 exists to help us get a new IPA build out for CI, you wanna take a look?20:47
iurygregoryJayF, I'm ok with it +220:52
TheJuliaapproved the ipa change20:57
TheJulialooks like inspector's merge fail on the 21st was memory related, so hopefully that should make the world much happier once it merges20:58
TheJuliaI'm sensing vmedia is still broken :(20:59
TheJuliado we know when vmedia failures started?20:59
JayFI saw them emerge along the same timeline as tox 4.020:59
JayFso like, christmas week? maybe earlier and I didn't notice?20:59
TheJuliaso 22nd possibly?20:59
JayFI mean, in that ballpark yeah21:00
TheJuliastevebaker[m]: ^^ I'm sensing the kernel might be our headache :\21:00
TheJuliaI'll look at the finished job21:00
TheJuliaonce it fails21:00
JayFI think it might be wise at this point to temporarily make this job nonvoting?21:00
JayFI'm happy to write that change if you'd +2 it21:00
TheJulialet's see what the current change up there comes back with since it is still running (a bad sign)21:01
JayFthat comment is based on the presumption that fails :)21:02
JayFif it doesn't fail, celebrate instead :D 21:02
TheJuliaI'm going to packup here and head down the mountain21:03
* TheJulia wonders if centos's rpm builder is public21:04
TheJuliayeah, it failed21:05
TheJulia:(21:05
opendevreviewJay Faulkner proposed openstack/ironic master: Temporarily mark redfish vmedia as non-voting  https://review.opendev.org/c/openstack/ironic/+/86910721:09
TheJuliaJayF: you have a syntax issue21:12
TheJuliathe job name now needs to end with a :21:12
opendevreviewJay Faulkner proposed openstack/ironic master: Temporarily mark redfish vmedia as non-voting  https://review.opendev.org/c/openstack/ironic/+/86910721:13
JayFthanks21:13
TheJulianp21:15
TheJuliahttps://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_598/869103/2/check/ironic-tempest-uefi-redfish-vmedia/5981ece/controller/logs/ironic-bm-logs/node-0_console_2023-01-03-20%3A19%3A02_log.txt le-sigh21:17
JayFyou mentioned a "kernel change"21:18
JayFdid centos9 ship the stuff removing support for 32 bit systems or similar?21:18
TheJuliaI don21:18
TheJuliaI don't think so, it was a build rev difference21:19
TheJulialike 210 vs 21221:19
TheJuliaI think that is grub differences, with food I'll be able to think through it21:19
JayFyeah, like pathing differences21:20
TheJuliaa *ton* of wifi stuff merged21:24
TheJuliathe change log is insane21:29
TheJulialots of stuff dropped in the 21st and 22nd, but none of it looks really related21:30
JayFI mean, in my gentoo experience21:30
JayFinvalid opcode can also mean like, cflags changed21:30
JayFe.g. they compiled it with options that make it not work on our VM21:30
JayFlike enabling avx or something similar21:30
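One way to test the hypothesis above (a binary built with instructions the VM's CPU does not expose) is to compare the guest's CPU flags against a list of required ones. The Python sketch below is a hedged illustration: it assumes an x86 Linux guest where /proc/cpuinfo has a `flags` line, and the helper names and the choice of `avx` are hypothetical. It does not confirm that AVX is involved in the failure discussed here.

```python
import os


def cpu_flags(cpuinfo_text: str) -> set:
    """Extract the flag set from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, required) -> set:
    """Return the required flags that the CPU does not advertise."""
    return set(required) - cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    # Assumption: running inside the x86 Linux VM under investigation.
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as fh:
            print(sorted(missing_flags(fh.read(), ["avx"])))
```

An empty result would mean the listed flags are all exposed to the guest, which would point away from this particular explanation.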
JayFTheJulia: can you look at a config diff?21:31
JayFTheJulia: (or link me one)21:31
TheJuliahttps://kojihub.stream.centos.org/koji/buildinfo?buildID=2800421:32
TheJuliayeah, it could21:32
TheJuliahmm21:34
JayFTheJulia: 21:40
JayF> - redhat: configs: disable vDPA on all archs except x86_64 (Laurent Vivier) [2140885]21:40
JayF12/1421:40
JayFthat is sus21:40
JayFare we running x86_64 vms?21:40
TheJuliawe are21:41
TheJuliawe do host passthrough21:41
JayFah21:41
TheJuliahttps://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_598/869103/2/check/ironic-tempest-uefi-redfish-vmedia/5981ece/controller/logs/libvirt/libvirt/qemu/node-0_log.txt21:46
JayFTheJulia: not sure what I should see there21:49
TheJuliajust what is there and what the processor gets passed in as21:49
JayFah, okay21:50
TheJuliastevebaker[m]: you mentioned you had some issues when you built an ESP image on rhel recently, do you have some notes?21:50
stevebaker[m]TheJulia: I think it was all environmental, it had to be a privileged container for the mount to work. But I'll be switching to mtools when the dependencies are in the zed container images https://github.com/steveb/ironic-operator/commit/220a07224281b2181acdcb1bee560515b788bed821:54
TheJuliahmm21:55
TheJuliaokay21:55
TheJuliaAnyway, I'm going to look later, lunch and drive time now21:55
stevebaker[m]ok21:58
opendevreviewMerged openstack/ironic-python-agent-builder stable/zed: Remove /var/tmp/* from images  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/86911823:12
opendevreviewJay Faulkner proposed openstack/ironic-python-agent-builder stable/yoga: Remove /var/tmp/* from images  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/86912023:21
JayFWe need one more core review on https://review.opendev.org/c/openstack/ironic/+/869107 to temporarily unstick the gate23:23
stevebaker[m]done23:43
