Wednesday, 2023-03-29

opendevreviewIury Gregory Melo Ferreira proposed openstack/ironic-specs master: Firmware Interface  https://review.opendev.org/c/openstack/ironic-specs/+/878505 04:22
*** eandersson2 is now known as eandersson05:56
rpittaugood morning ironic! o/07:54
samuelkunkel[m]Hello09:45
samuelkunkel[m]has anyone ever built an arm64 ipa ramdisk/kernel based on centos stream 9?09:45
samuelkunkel[m]on x86 it sure does not work as there are many steps that require e.g. chroot09:46
samuelkunkel[m]but I am building on an arm64 machine (ampere altra) 09:46
samuelkunkel[m]debian ipa for arm64 works properly fine on the arm machine09:46
samuelkunkel[m]centos fails within the extract image stage already.... (full message at <https://matrix.org/_matrix/media/v3/download/matrix.org/jiSpFxHxZdHhzoYRCHeTGdMV>)09:48
samuelkunkel[m]need to check if x86 for 9-stream even works... brb09:48
samuelkunkel[m]Sideinfo: I am just trying the plain build (no ssh no extra dibs)09:49
samuelkunkel[m]export ARCH=aarch64 09:49
samuelkunkel[m]ironic-python-agent-builder -r 9-stream centos -vvvv 09:49
jssfrthat looks wrong09:49
jssfrit's somehow missing the device name09:49
jssfrand that comparison is weird09:49
jssfras if collected from a command which had multiple lines of output, but only a single line was expected09:49
samuelkunkel[m]Vendor ID:              ARM... (full message at <https://matrix.org/_matrix/media/v3/download/matrix.org/HmgTbfINhJLlTaKgNsbzuXKX>)09:50
samuelkunkel[m]quickly checking if x86 build on x86 works09:50
samuelkunkel[m]I dont mind switching to debian-minimal though, was just strange09:50
jssfrwasn't there an issue with interface naming on debian vs. centos?09:50
jrossersamuelkunkel[m]: https://docs.openstack.org/openstack-ansible-os_ironic/latest/configure-ironic-multiarch.html 09:51
samuelkunkel[m]yes.09:51
samuelkunkel[m]jssfr:  but it seems like it was partly an error on my side as well. But I will test this for sure09:51
samuelkunkel[m]thanks jrosser  - going to test the --extra-args=--no-tmpfs which is the only difference here09:53
samuelkunkel[m]also jssfr  the issue was between centos and ubuntu. Tbh I never even tested debian ^^09:54
samuelkunkel[m]The ubuntu ipa does some weird stuff. But I also found an option to configure the interface naming there (via a dib env option)09:54
jssfrright09:54
jssfrI'm all in favour of debianisation09:55
samuelkunkel[m]If I recall correctly ubuntu is also not a validated os for the ipa09:55
jrossersamuelkunkel[m]: i have all this working on ampere - those docs are based on what i did09:55
samuelkunkel[m]thanks09:56
jrosseri built the IPA image using rocky linux on an ampere vm09:56
samuelkunkel[m]Ok I currently build on the baremetal ampere machine itself running Ubuntu22. There I run docker in which I have a arm64/v8 containerd based on debian with all the diskimage-builder tools installed09:57
samuelkunkel[m]I guess I need to drill down my chain a bit09:57
samuelkunkel[m]but thank you for sharing!09:58
jrosseri think i might have abandoned trying to build the centos ipa on a not RH derivative OS09:58
jrosserit wasnt totally obvious what was / was not supposed to work09:58
samuelkunkel[m]so for x86 building centos on debian based systems works properly09:59
samuelkunkel[m]as we are mostly a complete ubuntu/debian based infrastructure it was fine for us to just keep the stream-9 ipa09:59
samuelkunkel[m]So it seems to be an issue with running the build within a container, as it seems to run properly in just a venv on the baremetal os10:06
samuelkunkel[m]interesting10:06
samuelkunkel[m]building centos arm ipa on ubuntu22 10:07
dtantsurfor the reason I don't remember, we ended up building our ARM image on Debian. Maybe rpittau remembers why.10:10
rpittaummm I don't remember exactly, I think it was missing something10:12
dtantsurTheJulia: I'm seeing a lot of "Client-side error: Agent token is required for heartbeat processing" on my environment, I wonder if something has regressed there..10:13
dtantsuralthough.. fast track looks completely broken on this node. IPA not accessible, cannot SSH...10:28
samuelkunkel[m]debian ipa works better than expected. One interesting thing. (collectors in use are default, pci-devices, logs) the fields under the system_vendor key within the introspection data are not populated.... (full message at <https://matrix.org/_matrix/media/v3/download/matrix.org/lpkOupgTlzrAhPuKzfHzRkkz>)10:34
samuelkunkel[m]need to scroll some logs10:34
samuelkunkel[m]Could not get real physical RAM from lshw: Expecting ',' delimiter: line 24 column 8 (char 664): json.decoder.JSONDecodeError: Expecting ',' delimiter: line 24 column 8 (char 664)10:38
samuelkunkel[m]hmm10:38
dtantsursamuelkunkel[m]: I remember some (older?) lshw output invalid JSON10:38
samuelkunkel[m]older in case of older hardware?10:39
dtantsurno, i mean the version of lshw itself10:41
samuelkunkel[m]root@debian:~# lshw -version10:42
samuelkunkel[m]does not even give me an output10:42
samuelkunkel[m]uff10:42
samuelkunkel[m]:D10:42
samuelkunkel[m]apt tells me its lshw/now 02.18.85-0.7 10:42
samuelkunkel[m]do you have any reference to that issue?10:43
dtantsursamuelkunkel[m]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1002025 10:46
samuelkunkel[m]thanks10:50
samuelkunkel[m]well, debian seems to be the only people that do not provide 2.19 yet...10:50
samuelkunkel[m]trying to get my build on a bookworm release (debian 12), but firmware-misc-nonfree was moved to a different repo11:38
samuelkunkel[m]is this something we should care about? (I assume I can just add the repo via deb deriving env var)11:39
dtantsurit's a good question, I don't have an answer immediately.. we probably do11:40
* samuelkunkel[m] uploaded an image: (3KiB) < https://matrix.org/_matrix/media/v3/download/matrix.org/OhiJkCNdliboVWULfscHRVzS/image.png >11:45
samuelkunkel[m]fyi: by using export DIB_DEBIAN_COMPONENTS="main,non-free-firmware" it works11:45
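[Editor's note] The build settings mentioned so far can be collected in one place. The sketch below only assembles the command line and environment; it does not run anything. The helper name `ipa_build_invocation` is hypothetical, and passing `bookworm` to `-r` for a Debian 12 build is an assumption modelled on the `-r 9-stream centos` invocation quoted earlier in the log. The `ARCH` and `DIB_DEBIAN_COMPONENTS` variables and the `--extra-args=--no-tmpfs` option are taken from the conversation as-is.

```python
import os


def ipa_build_invocation(distro, release, arch=None,
                         debian_components=None, extra_args=None):
    """Return (argv, env) for an ironic-python-agent-builder run.

    Nothing is executed here; the caller decides whether to run it.
    """
    env = dict(os.environ)
    if arch:
        # Same variable as `export ARCH=aarch64` in the log.
        env["ARCH"] = arch
    if debian_components:
        # Same variable as the DIB_DEBIAN_COMPONENTS export in the log.
        env["DIB_DEBIAN_COMPONENTS"] = ",".join(debian_components)
    argv = ["ironic-python-agent-builder", "-r", release, distro]
    if extra_args:
        argv.append(f"--extra-args={extra_args}")
    return argv, env


# Assumption: "bookworm" is the release name to pass for Debian 12.
argv, env = ipa_build_invocation(
    "debian", "bookworm",
    arch="aarch64",
    debian_components=["main", "non-free-firmware"],
)
print(" ".join(argv))
print(env["ARCH"], env["DIB_DEBIAN_COMPONENTS"])
```

To actually start a build, the pair could be handed to `subprocess.run(argv, env=env, check=True)` on a machine where the builder is installed.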
samuelkunkel[m]dtantsur:  thx for the hint btw. using debian 12 as ipa with 2.19 lshw solves the issue12:01
samuelkunkel[m]"system_vendor": {12:01
samuelkunkel[m]  "product_name": "PowerEdge R750 (SKU=090E;ModelName=PowerEdge R750)"12:01
samuelkunkel[m]works pretty good despite not being fully released yet afaik (debian 12)12:01
jssfrit's in some freeze already, so it won't break too much until the release :)12:02
jssfr(i.e. with the yaook hat on: I'm just fine with depending on debian 12 at this date)12:02
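[Editor's note] The lshw failure discussed above comes down to a JSON decode error on the tool's output. The sketch below shows one defensive way to handle that; it is not the agent's actual code. The function name `parse_lshw_json` is hypothetical, the sample strings are made up for illustration, and the list-unwrapping step assumes that some lshw versions wrap the top-level object in a list.

```python
import json


def parse_lshw_json(raw):
    """Parse `lshw -json` output; return a dict, or None if unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Broken output ends up here instead of crashing the caller.
        print(f"Could not parse lshw output: {exc}")
        return None
    # Assumption: some lshw versions wrap the top-level object in a list.
    if isinstance(data, list):
        data = data[0] if data else None
    return data if isinstance(data, dict) else None


# Made-up samples: one valid document, one with a missing comma.
good = '{"id": "debian", "class": "system", "product": "PowerEdge R750"}'
bad = '{"id": "debian" "class": "system"}'

print(parse_lshw_json(good)["product"])
print(parse_lshw_json(bad))
```

With the broken sample the function reports the decode error and returns `None`, so a caller can fall back to another data source rather than leave fields silently empty.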
iurygregorymorning Ironic12:13
rpittaudtantsur: I had a look at ipa-builder and I think the reason behind debian choice was just a matter of size, do we want to give ubuntu a try again? Also we should switch from bionic-focal to focal-jammy I guess and probably remove the ussuri job?12:47
rpittaualso pinging TheJulia JayF iurygregory stevebaker[m] for this ^12:48
rpittaucould be a topic for the next meeting12:49
iurygregoryrpittau, I think it makes sense to give another try12:49
iurygregoryyup, sounds like a plan to discuss in the next weekly meeting12:50
dtantsurrpittau: we need to revisit that trademark issue.. I also seem to recall that ubuntu images never worked for me for some reason12:54
rpittauI'm adding a topic to the next meeting12:55
dtantsurwe also need to check how small these are12:55
rpittauyeah, that was a big point if I remember well12:55
rpittauI'll try to spin up an image these days to get an idea12:56
TheJuliabrraaaains12:57
dtantsurTheJulia: how are you feeling today?13:00
iurygregoryi like that the PTG page says "The PTG is on break until 13 UTC!"13:00
iurygregory:D13:00
dtantsuriurygregory: mmm, yeah, inconvenient13:00
iurygregoryfor some reason the https://ptg.opendev.org/ was directing me to the Useful Link page :D13:01
dtantsuryep, same13:02
TheJuliadtantsur: intermittent fever throughout the night13:14
TheJuliadtantsur: I feel *much* better, but also still tired13:14
TheJulia:(13:14
dtantsur*hugs*13:14
TheJuliaNow the wife has it :(13:14
dtantsurle sigh. I was the 2nd here (just as with corona in summer)13:15
TheJuliadtantsur: on that client side error, is the agent re-launching on the systemd screen output?13:15
dtantsurTheJulia: I was trying to understand that, and seems like no. It's actually pretty puzzling, check out the description: https://issues.redhat.com/browse/OCPBUGS-11034 13:22
JayFdtantsur: you gave me flashbacks to custom compiled ipxe roms with custom CAs and TLS settings inside13:28
dtantsurlol13:28
iurygregoryJayF, I've updated operator-hour-baremetal-sig in the #openinfra-events13:29
TheJuliadtantsur: so I *think* the issue here is that ssl is bombing on the very first request, but the retry works... except the token has been generated13:33
TheJuliaMaybe we need to change the lockout until after the first use of the token13:33
TheJuliabut that is inherently racy13:33
TheJuliaand can conflict with systemd behavior13:34
TheJuliadtantsur: I think your agent is also starting up more than once13:41
TheJuliathe ethernet port list order changes...13:43
dtantsurTheJulia: at least the error message is complaining about the / endpoint, not about lookup13:47
TheJuliaahhh, yes13:54
TheJuliaso I *do* believe the container is getting restarted13:55
* TheJulia goes to see if we randomize the lookup list order13:58
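[Editor's note] The hypothesis in this thread (the first response is lost, the retry succeeds, but the token has already been issued) can be shown with a toy model. Everything below is hypothetical: `ToyConductor`, its methods, and the `lock_on_first_use` switch are illustrative names, not Ironic's real code paths. The second mode models the idea of delaying the lockout until the token is first used, which the thread itself calls racy.

```python
import secrets


class ToyConductor:
    """Toy model only; not Ironic's implementation."""

    def __init__(self, lock_on_first_use=False):
        self.token = None
        self.token_used = False
        self.lock_on_first_use = lock_on_first_use

    def lookup(self):
        """Return the token, or None once it has been locked out."""
        if self.token is None:
            self.token = secrets.token_hex(16)
            return self.token
        if self.lock_on_first_use and not self.token_used:
            # Variant: keep returning the token until it is used once.
            return self.token
        return None

    def heartbeat(self, token):
        if token is None or token != self.token:
            raise PermissionError(
                "Agent token is required for heartbeat processing")
        self.token_used = True
        return "ok"


def simulate(conductor):
    """Lose the first lookup response, retry, then heartbeat."""
    conductor.lookup()          # response lost, e.g. a failed TLS request
    token = conductor.lookup()  # the retry
    try:
        return conductor.heartbeat(token)
    except PermissionError as exc:
        return str(exc)


print(simulate(ToyConductor()))
print(simulate(ToyConductor(lock_on_first_use=True)))
```

In the default mode the retry receives no token and the heartbeat is rejected; in the variant the retry still gets the token. The variant also widens the window in which a second party could obtain the token, which is the race concern raised above.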
opendevreviewIury Gregory Melo Ferreira proposed openstack/ironic-specs master: Firmware Interface  https://review.opendev.org/c/openstack/ironic-specs/+/878505 14:21
JayFiurygregory: NO LOGIN PUBLIC RH BUGS <3<3<3<315:28
iurygregoryJayF, it will depend on the config of the bug15:29
JayFyeah, Julia said that in PTG :D 15:30
iurygregoryzoom decided to change my mic config LOL15:30
rpittaugood night! o/16:48
opendevreviewJay Faulkner proposed openstack/ironic-specs master: Retire now-outdated snapshot spec  https://review.opendev.org/c/openstack/ironic-specs/+/878930 16:52
JayF^^ as discussed in PTG mere minutes ago16:52
JayFkaloyank: Hey, we did get around to the item you requested in PTG. There are notes but basically: 1) Service Steps is expected to be the method we use to implement snapshot support so 2) The existing spec is invalid and will be moved to retired ^ 3) Once service steps are completed, it's expected to be a minimal add-on item to implement support for Snapshot from an Ironic16:53
JayFperspective.16:53
ebbexhttps://github.com/metal3-io/baremetal-operator/blob/main/docs/api.md#rootdevicehints there's nothing here in the api that would set `by_path` for the rootdevice, as per https://docs.openstack.org/ironic/latest/install/advanced.html right?16:54
kaloyankJayF: that sounds great16:58
JayFkaloyank: service steps are REALLY SUPER COOL if you're not familiar yet. Basically ability to apply "steps" (think: cleaning/deployment steps) to a node that's currently in ACTIVE state. Should be super useful for a lot of things, including implementing snapshot support :)16:59
JayFIs there anything anyone knows we need to get backported into Xena before the last release is cut?17:02
JayFIt'll be moving to EM very soon, and I need to review that releases PR. I'll check for any pending patches but please make noise in my direction very soon if there's a bugfix in flight we want to get in before releases close on Xena for good :)17:03
TheJulianot that I'm aware of17:04
* TheJulia lays down for hopefully quick nap17:04
kaloyankJayF: I think I got the gist sometime ago when TheJulia was talking about them17:05
* JayF is saving his respite for in 1 hour when the NY Pizza place opens17:05
kaloyankI have one question regarding the "fate" of the IPA comms: what are the next steps? Someone should write a spec, the spec should get approved and move onto implementation? Or there's more to discuss17:07
JayFI think we surfaced a lot of concerns and possible solutions17:08
kaloyankand just to set the record straight: The "bug" I hit with fast-track integrated Neutron was with a flat network17:08
JayFit'll culminate in the spec being updated, or a new one being written, about how to fix those problems17:09
kaloyankOK, I'm just not aware of how the process works17:09
JayFGenerally speaking for something there's two paths:17:10
JayF- It's small, so a simple bug is filed (or sometimes not even), and a patch is pushed, approved, and lands. This is how most of our simple bugfixes go.17:10
iurygregoryJayF, shouldn't the spec go to retired ? (just wondering)17:10
JayF- For larger, or architecturally impactful changes, we write up a spec, it gets landed, someone implements it. There are often times a spec gets written+approved but because priorities shift it never gets implemented :( 17:11
JayFiurygregory: this one? https://review.opendev.org/c/openstack/ironic-specs/+/878930 17:11
iurygregoryyeah17:11
iurygregoryI'm fine with just removing it also17:11
iurygregoryjust wanted to double check17:11
JayFit is, if you click the file it says: > rename from specs/approved/snapshot-support.rst > rename to specs/retired/snapshot-support.rst17:11
JayFhttps://review.opendev.org/c/openstack/ironic-specs/+/878930/1/specs/retired/snapshot-support.rst 17:12
iurygregoryoh ok17:12
kaloyankJayF: just asking because it wasn't evident to me what are the following steps. I'll just keep an eye on the ironic-specs repo then :)17:14
kaloyankgenerally, I'm interested in implementing the IPA comms changes17:14
JayFSo she's out now, working thru illness so we can PTG, but TheJulia is the person to sync with I think if you're willing to lend a hand17:15
JayFand welcome to the team :) we have stickers and pins, I will send you some if you want (this is not a joke; I literally will post you some Ironic stickers if you want 'em, help or no help.)17:16
kaloyanksure, I'd love to :)17:16
JayFDM me an address to ship stickers to and I'll put them in my mailbox today :)17:17
* TheJulia has insufficient cat cuddles to nap17:25
kaloyankTheJulia: feel free to reach me via DM to sync17:28
TheJuliaAck, maybe in an hour. I’m going to have to capture a cat to take a nap17:28
* TheJulia fails and takes migraine meds instead17:32
samuelkunkel[m]JayF:  can you share the link of the „security things to do as an ironic operator“ doc once you had a look? :)17:36
JayFhttps://docs.openstack.org/ironic/latest/admin/security.html#firmware-security I think does a pretty good job of my concerns17:36
JayFI wasn't going to edit it because that did a really good job17:36
samuelkunkel[m]Thanks!17:36
samuelkunkel[m]Also fine for me :)17:37
TheJuliaJayF: I was thinking it might make sense to mention things that would be good to look out for, like you mentioned hidden i2c buses, and I have seen that, and ... yeah... you can flash firmware across them.17:46
JayFTheJulia: bluntly, I'm not sure how much of that ... highly specific information is public or not17:49
JayFTheJulia: I stick that in the realm of > Ideally, an operator would work with their hardware vendor to ensure that proper firmware security measures are put in place ahead of time.17:49
JayFBecause at least IME, enable/disable of those interfaces is a compile time bios setting, not something that can be flipped on an image17:50
JayFI am worried if we list too many things to look out for, someone might think very wrongly that it's a comprehensive list.17:50
JayFWhen really I wanna say, pretty strongly "this is impossible to do alone, you need a vendor to help"17:50
JayFI guess I can edit that to say it pretty directly.17:53
JayFRFR, call for contributors for ARM CI: https://gist.github.com/jayofdoom/c26de64fde114125be48295a66341505 17:59
* JayF AFK for a while17:59
TheJuliaJayF: That is reasonable18:04
clarkbI guess virtualized arm baremetal isn't sufficient here?18:06
clarkbyou're worried more about the hardware management systems than software executing the arm instruction set?18:06
TheJuliaWe know some vendors have run into some weirdness on hardware with things such as booting when aspects worked just fine in VMs18:07
jrosserwhat specific thing is useful from people who are doing arm/ironic? kick the tyres on the experimental IPA image?18:11
TheJuliaPrimarily just make sure the image works and hopefully continues to work. We've had folks mention some arm weirdness in other areas like ipmi bmc interaction in some bugs, anything we can do to pin down/identify/fix those sorts of issues is super beneficial to the overall ecosystem. Without something like 3rd party CI, we're just sort of hoping.18:12
jrosserwell it’s a sample of 1 but learning all about ironic from scratch was a bigger thing than the ARM-ness of the nodes, which basically “just worked” for the admittedly small subset of things we’ve exercised18:18
TheJuliajrosser: that is good18:20
jrosserI expect the variance between vendors is large though, so perhaps collecting some success/not success info would be useful18:22
TheJuliakaloyank: so, snapshooting!18:23
kaloyankTheJulia: yep18:33
TheJuliaso I think the big challenge there is the mechanism and flow, not necessarily the trigger18:34
kaloyankjust to be clear, I think JayF had in mind to sync over the IPA comms change, although I'm fine with working on snapshotting via service steps18:35
kaloyankas I would really enjoy having both features in my deployment :)18:35
JayFkaloyank: keep talking and you'll have volunteered for more stuff ;) 18:39
TheJuliaThe comm stuff, I think we need to just write out what our solution is18:39
TheJuliaJayF: hehe18:39
kaloyankJayF: we have a saying at work that goes: what you haven't told us can't be used against you in court18:40
JayFI have a saying here in Ironic, it goes "please continue volunteering for more things", usually followed by a cackle18:41
JayF;)18:41
kaloyank^^18:41
JayFclarkb: I think all of us would be extremely nervous about publishing a thing as "supported" that literally zero contributors have ever seen /actually work/18:44
JayFclarkb: So like, I don't think 3rd party CI is an absolute requirement, but maybe >1 contributor/operator saying "I can confirm this works with $specificHardware" + virtualized CI18:44
JayFclarkb: but the problem is twofold: we don't have hardware *and* we don't have time. If we hook a hardware vendor to work on third-party-CI, we can potentially solve both18:45
kaloyankI went over the "Fix options" points in the etherpad and over https://review.opendev.org/c/openstack/ironic-specs/+/777172 and it looks OK18:51
TheJuliadtantsur: so w/r/t https://review.opendev.org/c/openstack/ironic-specs/+/872349/6/specs/approved/service-steps.rst#277, we've traditionally not hidden/changed states except for the None->Available transition.  I feel like going down that path may not be great, I know with nova it won't be breaking, but would it *really* be with Metal3 before it could be updated if a mismatch occurs?18:51
kaloyankI'm off for today, see you tmr o/19:05
clarkbJayF: yup. I just ask because with historically x86 software moving to arm usually the concern is does this work with the different cpu instruction set19:37
JayFclarkb: yeah; I think we have a reasonable level of confidence it works, at least IPA running on an ARM node does. What we're less certain of is what the full architecture looks like, and where the sharp edges are19:38
JayFand without an invested party to help us navigate those edges, it's pretty tough to say "yep, we support ARM"19:38
JayF(not to mention the questions of: which ARM? Supported for ironic services or just IPA/as a deployment target)19:39
clarkbyou should be able to test the ironic services pretty well on arm with our existing CI tooling19:40
clarkbthe other reason I ask is that I am curious if the fake baremetal works at all without nested virt (arm64 doesn't support this or does but very recently?)19:40
JayFWe had that conversation too :) 19:41
JayFanswering those questions takes time we don't want to spend without having an invested party19:41
*** JayF is now known as Guest930819:44
*** JasonF is now known as jayf19:44
*** jayf is now known as JayF19:44
opendevreviewJulia Kreger proposed openstack/ironic-specs master: Add service steps framework  https://review.opendev.org/c/openstack/ironic-specs/+/872349 19:48
opendevreviewMerged openstack/ironic-specs master: Retire now-outdated snapshot spec  https://review.opendev.org/c/openstack/ironic-specs/+/878930 20:32
JayFTheJulia: iurygregory: dtantsur: is one of you going to be at the session in one hour?20:50
JayFoh, in 10 minutes, apparently20:50
iurygregoryone hour? .-.20:50
JayFI'm GUD at math20:50
iurygregoryoh ok20:50
iurygregoryyeah, I will be20:50
JayFiurygregory: can you moderate the session?20:50
JayFI'll be there, but would rather be camera off/silent20:50
iurygregorysince I'm the owner of the spec 20:50
JayFhave a personal issue happening right now (my wife's school is in lockdown)20:50
iurygregoryJayF, sure20:50
iurygregorycan you send the code  from the etherpad in private20:51
JayFshe is fine but I am not in a good focused place to run a meeting20:51
JayFI'll join and give you host if I can20:51
iurygregoryoh ok20:51
iurygregorycool20:51
dtantsurJayF: I was about to ask when exactly we have the session20:51
dtantsurTheJulia: re hiding states: I'll need to think more about it when my brain is not shrimp20:52
TheJuliasure20:53
JayFiurygregory: if you'll join up now, I'll invite you to host20:53
* TheJulia coughs up a lung20:53
dtantsurso, right now, not in an hour?20:53
* TheJulia is sooo confused20:53
* dtantsur confused++20:54
* TheJulia suggests whiskey20:54
JayFschedule says right now20:54
JayFdon't trust my top-of-the-head math20:54
dtantsurthe etherpad used to say 2200 UTC hence my question20:55
JayFIf I made the etherpad and the schedule mismatch, it'd explain *my* confusion too20:55
TheJulia2200  UTC is in an hour20:55
dtantsurTheJulia: yep, but the PTG schedule says 2100 UTC20:56
TheJuliaoh20:56
TheJuliaoh!20:56
JayFI'll update PTG schedule now20:56
JayFwe should respect the etherpad schedule, I think20:56
JayFAny dissent?20:56
dtantsurwill be harder for me, but makes sense20:56
* dtantsur is reaching for more ibuprofen20:56
TheJulia:(20:56
TheJuliajanders: you awake yet? :)20:56
iurygregoryTheJulia, probably not20:57
iurygregoryhe is in asia 20:57
iurygregoryI think it's 4am or something for him20:57
dtantsurslack claims it's 7am for him. but slack may not be aware of his movements20:58
TheJulia... I thought he was like 1 hour behind syd20:58
iurygregoryexactly 20:58
dtantsuranyway, just please tell me the time. if it's in an hour, I'll try to recover a bit by consuming some pancakes (with ibuprofen lol)20:58
TheJuliais he out camping with satellite internet again?!20:58
JayF2200 UTC is the correct time20:58
JayFI screwed up the ptgbot schedule20:58
iurygregoryhe is in UTC + 720:58
dtantsurTheJulia: no more camping with a newborn :D20:59
* TheJulia had no idea20:59
TheJuliagood for him!20:59
dtantsurokay, so I'm blending into the background again, see you in an hour20:59
iurygregoryyeah, it's 4am for him :D21:00
iurygregoryhttps://time.is/UTC+7 21:00
iurygregoryso there is a small chance he can join in an hour, if his newborn doesn't let him sleep 21:01
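[Editor's note] The time zone arithmetic in this exchange can be checked with the standard library. The sketch below converts the two candidate session times (21:00 and 22:00 UTC on the log's date) to UTC+7; the helper name `local_time` is only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zone for UTC+7, as mentioned in the log.
UTC_PLUS_7 = timezone(timedelta(hours=7))


def local_time(moment, tz):
    """Format a timezone-aware datetime in the given zone."""
    return moment.astimezone(tz).strftime("%Y-%m-%d %H:%M")


ptg_schedule = datetime(2023, 3, 29, 21, 0, tzinfo=timezone.utc)
etherpad = datetime(2023, 3, 29, 22, 0, tzinfo=timezone.utc)

print(local_time(ptg_schedule, UTC_PLUS_7))  # 2023-03-30 04:00
print(local_time(etherpad, UTC_PLUS_7))      # 2023-03-30 05:00
```

So at 21:00 UTC it is 04:00 the next day in UTC+7, matching the "4am" estimate, and the 22:00 UTC session falls at 05:00 there.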
opendevreviewJulia Kreger proposed openstack/ironic-specs master: cross-conductor rpc/pxe hand-off  https://review.opendev.org/c/openstack/ironic-specs/+/873662 21:12
dtantsurJayF, just in case you (or anyone else) is curious: I'm slowly working on first Ironic resources for rust-openstack: https://github.com/dtantsur/rust-openstack/pull/139 21:51
iurygregoryPTG time22:00
