Wednesday, 2023-04-19

opendevreviewIury Gregory Melo Ferreira proposed openstack/ironic-specs master: Firmware Interface  https://review.opendev.org/c/openstack/ironic-specs/+/87850501:50
rpittaugood morning ironic! o/07:22
opendevreviewStephen Finucane proposed openstack/ironic master: db: Resolve SAWarning warnings  https://review.opendev.org/c/openstack/ironic/+/85634909:44
opendevreviewStephen Finucane proposed openstack/ironic master: tests: Replace invalid UUIDs  https://review.opendev.org/c/openstack/ironic/+/85634709:44
iurygregorygood morning Ironic 11:32
iurygregoryif possible a new round of reviews in https://review.opendev.org/c/openstack/ironic-specs/+/878505 would be good =)11:56
opendevreviewMerged openstack/ironic-python-agent-builder master: Move ubuntu jobs to jammy  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/87953812:32
TheJuliagood morning13:13
iurygregorygood morning TheJulia 13:18
TheJuliaugh, continued v6 issues this morning13:56
iurygregoryouch >.<14:01
iurygregorysorry to hear that14:01
iurygregoryv4 issues no? v6 is working14:01
TheJuliawell, they are taking different paths14:02
TheJuliabut v6 seems to just be getting crushed with packet loss14:02
iurygregoryoh god .-.14:02
dtantsurmorning TheJulia! I recall you wanted to discuss something?14:05
TheJuliaYeah, would at the bottom of the hour work for you?14:06
dtantsurTheJulia: should do. I'm listening to an all-hands now.14:07
jrosseris `redfish_verify_ca: false` the right way to disable certificate verification for redfish driver?14:08
TheJuliadtantsur: afterwards?14:08
samuelkunkel[m]jrosser:  yes14:08
dtantsuryep, should have a bit of time14:08
TheJuliadtantsur: just let me know after it is over, and a 10m heads up would be awesome so I can make coffee :)14:09
dtantsur++14:09
jrosserhmm i still get an SSL verification error https://paste.opendev.org/show/btDy9vFZFJTnm7mx6dae/14:10
samuelkunkel[m]which version of sushy do you use?14:10
TheJulia... jrosser... have you tried to restart the conductor service?14:11
jrosseroh - no i just updated the node14:11
TheJuliagive that a try14:11
TheJuliaif it suddenly works, I can explain why :)14:11
TheJuliaand then I'll kindly ask for a bug so we can fix it :)14:11
samuelkunkel[m]as this is a container, do you have REQUESTS_CA_BUNDLE set?14:12
samuelkunkel[m]thinking about this https://review.opendev.org/c/openstack/sushy/+/87088814:12
dtantsurTheJulia: suspecting session caching?14:12
samuelkunkel[m]and if your sushy is "too old" you can also run into this where the flag is just invalidated14:12
jrosserit still looks unhappy14:12
jrosseryes there is a REQUESTS_CA_BUNDLE set in order to talk to in keystone etc14:13
jrosser*in order to14:13
jrosserthis is Zed14:13
TheJuliadtantsur: that was what I was thinking, but verify_ca is covered in the cache14:14
TheJuliaohn oes14:14
TheJulianoes14:14
TheJuliahttps://github.com/openstack/ironic/blame/master/ironic/drivers/modules/redfish/utils.py#L21614:14
* TheJulia suspects dmitry will spot it14:14
samuelkunkel[m]which sushy version? as this was merged not long ago14:14
samuelkunkel[m]but hopefully I am wrong :)14:15
jrossersushy==4.3.214:15
TheJuliaerr, hmm, maybe not14:16
dtantsurI cannot spot anything wrong, but I'm also half occupied by a meeting14:16
samuelkunkel[m]I run sushy==4.4.1 in my zed conductor pods14:17
samuelkunkel[m]can you give it a try?14:17
jrosseri can manually install that i think - though i expect upper-constraints for zed has things to say about that14:17
TheJuliahmmmm14:18
TheJuliadtantsur: yeah, I thought we had the wrong field name but I traced it further up the code and it looks fine14:18
TheJuliajrosser: do you have debug logging turned up?14:19
jrosseri do14:20
samuelkunkel[m]ah, we install sushy 4.4.1 and we also backport the patch14:21
samuelkunkel[m]from the linked PR14:21
samuelkunkel[m]had just a look into the container image14:21
samuelkunkel[m]https://gitlab.com/yaook/images/infra-ironic/-/blob/devel/Dockerfile14:22
jrosserthis is what i get in the debug log https://paste.opendev.org/show/bGIVBSCCpNYKCBHyA0Vj/14:23
samuelkunkel[m]That is the bug mentioned in the PR14:24
jrosserand 870888 is part of sushy 4.4.1?14:24
jrosseroh or not - if you have to backport the patch14:25
samuelkunkel[m]y14:25
samuelkunkel[m]we use 4.4.1 and backport into 4.4.114:25
samuelkunkel[m]not sure in what it is included14:25
samuelkunkel[m]thats why, back when I build the zed image, I just took 4.4.1 and backported into that ^^14:25
jrosserahha maybe 4.4.2 is including it14:26
samuelkunkel[m]I did not check, but possible :)14:26
rpittausamuelkunkel[m], jrosser, 870888 is indeed in 4.4.214:28
samuelkunkel[m]then give it a try :)14:28
jrosserah i see it's now not failing on SSL errors with 4.4.214:28
samuelkunkel[m](note to me, adjust container build)14:28
jrosseris 870888 backportable to stable/zed?14:29
opendevreviewRiccardo Pittau proposed openstack/sushy stable/zed: workaround: requests verify handling if env is set  https://review.opendev.org/c/openstack/sushy/+/88083214:31
rpittaujrosser: cherry-picked :)14:31
jrosserthankyou - i can confirm it's fixed the ssl errors on Zed here14:32
rpittauawesome14:32
samuelkunkel[m]nice :)14:32
jrosserwell, i mean 4.4.2 has :)14:32
rpittauyeah, I think we can release 4.3.4 after that merges, we already have some fixes there14:32
rpittau4.3.4 being zed of course14:33
jrosserthats great - i have to jump some hoops to install outside of u-c14:33
samuelkunkel[m]really? :D14:41
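The thread above boils down to three conditions that together produced the SSL error: verification disabled on the node, `REQUESTS_CA_BUNDLE` set in the conductor's environment, and a sushy without change 870888. A minimal sketch of that check follows, assuming the node's `driver_info` is available as a plain dict. The function names are hypothetical and not part of ironic or sushy, and the stable/zed backport (880832) was only proposed at this point, so backports are handled by an explicit flag rather than by version number.

```python
# Sketch only: helper names are hypothetical; version numbers come from the log.
FIX_FIRST_RELEASE = (4, 4, 2)  # per the log, change 870888 is in sushy 4.4.2


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def verify_ca_may_be_ignored(driver_info, environ, sushy_version, backported=False):
    """Return True when redfish_verify_ca=false might be overridden.

    The three conditions are the ones discussed in the channel:
    verification is disabled on the node, REQUESTS_CA_BUNDLE is set for the
    conductor, and the installed sushy predates the workaround.
    """
    value = driver_info.get("redfish_verify_ca", True)
    # Assumption: treat the usual false-like spellings as "disabled";
    # anything else (True, or a CA path) counts as verification enabled.
    verify_disabled = str(value).strip().lower() in ("false", "0", "no", "off")
    bundle_set = bool(environ.get("REQUESTS_CA_BUNDLE"))
    has_fix = backported or parse_version(sushy_version) >= FIX_FIRST_RELEASE
    return verify_disabled and bundle_set and not has_fix
```

Called with `os.environ` from inside the conductor container and the node's `driver_info`, this flags the Zed setup described above (sushy 4.3.2) and clears once sushy 4.4.2 or a local backport is in place.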
TheJuliahttps://meet.google.com/dip-wrpc-jwe if anyone wants to discuss cross conductor rpc/survivability of actions14:41
samuelkunkel[m]jrosser:  are you happy with the supermicro arm server?14:46
jrosserkind of +/- i think14:47
jrosserwe have them working for openstack compute nodes and controllers just fine, but in that situation the bmc gets its address via DHCP and everything just works (tm)14:48
samuelkunkel[m]we try the HPE RL300. They dont provide inband IPMI access at all14:48
samuelkunkel[m]so the ipa does not find any useable bmc addresses14:48
jrosserwhere we want to set static BMC addresses for ironic nodes the setup screen seems barely tested14:48
samuelkunkel[m]so we had to build a workaround by looking up their DDNS names14:49
samuelkunkel[m]oh14:49
jrosserand there is no way to disable the BMC sharing one/other of the onboard ports which seems also to have really unusual behaviour compared to x86 supermicro where you can just turn it off14:49
jrosseralso some uefi boot order bugs14:50
jrosserhaving said all that, ampere/supermicro are being pretty responsive and have fixed stuff and given us new firmware14:50
jrosserand this was also pre-release hardware and beta-ish motherboards, so i can't complain too much14:51
jrosserthe reason i'm trying redfish is that i've just had two more supermicro/ampere nodes delivered and those don't respect the ipmi "next boot should be PXE" at all14:53
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent-builder master: Add a non-voting ubuntu arm64 bnuild check job  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/88085414:56
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent-builder master: Add a non-voting ubuntu arm64 build check job  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/88085414:57
TheJuliajrosser: so, we try the old documented raw values, can you reproduce that directly with just ipmi?15:34
TheJuliaand not the previously documented by supermicro values15:34
jrosseri'm not sure what to do actually15:35
jrosserif i stab F12 at the console when it reboots it correctly PXE boots into the ironic agent and does the cleaning15:35
jrosserif i dont do that, i get whatever was last on the ssd15:36
jrosserand i'm seeing similar behaviour now that i've tried the redfish driver too, like the setting of boot order is just ignored15:36
JayFI assume you've taken the grand tour of the firmware settings, and looked for something like "BMC boot control [disabled]" or similar?15:37
jrosseri will take another look15:37
JayFyou indicated this is prereleaseish hardware, yeah?15:38
TheJuliaI remember you mentioned the redfish case.  And we explicitly set the default value with efibootmgr as well15:38
jrosserthis one allegedly is a production version15:38
TheJuliaUgh15:39
TheJuliaSo, even to just deploy we can’t signal it with either…15:39
TheJuliaTo boot to network?15:39
jrossermy hunch is that happens by accident15:40
JayFif the ssd happens to be blank, it falls thru to net booting15:40
JayFso we can boot exactly one agent on it15:40
JayFyeah?15:40
jrosserright, and i've succeeded once in deploying an OS15:40
jrosserand now get that all the time15:41
opendevreviewMark Goddard proposed openstack/tenks master: Add retries for get_url and package tasks  https://review.opendev.org/c/openstack/tenks/+/88075915:43
opendevreviewMark Goddard proposed openstack/tenks master: Fix CI failures  https://review.opendev.org/c/openstack/tenks/+/88086615:43
TheJuliajrosser: with redfish, do you see if the field settings change for the 'Boot' field under the system?15:43
* TheJulia wonders if we're just getting some sort of static data blob back15:44
jrosserlike device: pxe ?15:45
TheJuliayeah15:45
jrosserhow do i show that with the cli - i see it in horizon15:47
TheJuliathis would be in debug logs from sushy showing responses15:48
* jrosser reboots the BMC15:57
opendevreviewMerged openstack/ironic stable/zed: [iRMC] Handle IPMI incompatibility in iRMC S6 2.x  https://review.opendev.org/c/openstack/ironic/+/87088115:57
rpittaugood night! o/16:15
jrosserok so i have at least two compounded issues17:23
jrosserthe bmc was in a weird state, that is cleared up for now by resetting it <- i've seen this before17:24
jrosserand my ipa image built for zed just sits there and seems to do nothing, though sadly i've not included dynamic-login so cant jump in and see whats going on17:25
jrosserthe centos8/yoga ipa i have is working though for clean & deploy17:26
TheJuliajrosser: is there any ipa output to the console?17:36
samuelkunkel[m]So the issue with the bmc I only had on pre Victoria Release using ipmi with x86 supermicro - with the current zed + redfish setting boot order works fine. 17:57
samuelkunkel[m]I would have assumed that they use the same ASpeed bmc on these platforms?17:57
samuelkunkel[m]So why would the bmc behave so differently?17:57
samuelkunkel[m]That sounds really not good17:58
jrossersamuelkunkel[m]: it’s open bmc on these18:16
jrosserTheJulia: there would be console output but that’s an open question we have with ampere/sm that it’s only going to the serial port, not the screen18:17
jrosserthat’s quite possibly a more general arm platform thing, but would be interested to know if the agent output is shown on the screen / kvm in your systems samuelkunkel[m]18:18
TheJuliajrosser: I guess serial port might be a thing to look at if you've got connectivity to it18:20
JayFjrosser: I wonder if you need something like console=/dev/ttySx on the kernel command line18:20
TheJuliaand what JayF said18:20
JayFjrosser: sometimes those machines can have >1 serial console, where one is physical and one is Serial-over-LAN18:20
jrosserthe physical one is usb /o\18:21
jrosserso all my lantronix console server ports are no good18:21
TheJuliaonly way to know is to try and iterate I guess18:21
TheJuliaoh, wait18:21
TheJuliaso the bmc does not sniff the serial port?18:21
JayFTheJulia: on the opencompute machines we had at rax it was the same way; two different OS-facing ports, one was physical serial one was IPMI-serial18:22
jrosseryes, there’s a graphical SOL interface as well as the vga kvm18:22
JayFoh yeah, and IPMI-serial was not the same as the graphical SoL interface (which wasn't installed in our machines but could still be output to, sadly)18:22
TheJuliayeah, I guess it would just take iterating18:23
TheJulia:(18:23
jrosseryep, I’m in the lab in person tomorrow so this will all be much easier to poke18:23
JayFthat's how I solved the problem then. Lots of "send the vendor an email; solve the problem with guess-and-check while waiting for a response"18:23
JayFalthough that vendor is the one that gave us the IPMI raw bits which (maybe?) are still in the IPMI driver18:23
samuelkunkel[m]jrosser: yes, partly. So for HPE you have the ILO (which is basically HPEs proprietary bmc). This one comes with a html5 console (and java). And this console works. (Until you boot the final Ubuntu image)18:24
samuelkunkel[m]So I can see stuff like Post, BIOS, pxe, ipxe18:24
jrosseryou see the output from ipa there?18:25
samuelkunkel[m]Yes18:25
samuelkunkel[m]We use stream 9 ipa (as debian 12 efibootmgr failed me)18:25
samuelkunkel[m]For debugging I have an IPA with a devuser / password18:25
samuelkunkel[m]And I can log in via the console18:25
jrosseryes - I forgot to include that in my stream 9 ipa18:26
jrosserwill fix that tomorrow18:26
samuelkunkel[m](Mostly I only grab the dhcp ip from there and then ssh into it)18:26
JayFyou can get the dhcp IP from Ironic, fwiw samuelkunkel[m] 18:26
samuelkunkel[m]Only if it is registered already18:27
JayFsamuelkunkel[m]: look at the driver_internal_info[agent_url] iirc (I may have slightly misremembered the location; but it's in there)18:27
JayFsamuelkunkel[m]: oh yeah, good point, only if the first lookup worked18:27
samuelkunkel[m]Yep :)18:27
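JayF's suggestion can be sketched as follows: once the agent's first lookup has succeeded, the agent's address can be read back from the node record instead of from the console. This assumes the node has been fetched as a dict (for example from JSON output of the node show command) and that the field is `driver_internal_info["agent_url"]`, which JayF himself gives only from memory. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def agent_host(node):
    """Return the host part of the agent callback URL, or None.

    None means the field is absent, which per the discussion is the case
    until the agent's first lookup against ironic has worked.
    """
    url = node.get("driver_internal_info", {}).get("agent_url")
    if not url:
        return None
    # urlsplit handles ports and bracketed IPv6 literals for us.
    return urlsplit(url).hostname
```

The addresses used to exercise this are documentation ranges, not values from the log.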
jrossertbh one of the reasons I was sticking with the ipmi driver was ipmitool-socat console access18:27
samuelkunkel[m]But you're right, I also do it if it fails on cleaning and such so ;)18:28
jrosserit seems odd to me that I can’t mix that with redfish/drac/etc for everything else18:28
samuelkunkel[m]That is something I never tried18:29
samuelkunkel[m]But is the console also that weird on other supermicro server?18:30
samuelkunkel[m]For the few we have for testing I recall using the html console via the BMC18:30
jrosserit’s not that it’s weird, it’s just not particularly accessible network wise18:31
jrosserso it’s an ssh -D / reconfigure browser proxy settings away18:31
jrosserbut that’s a local issue here with how the networks are partitioned18:32
samuelkunkel[m]There is an advisory for hpe about the blackscreen on the console. So this explains for my nodes why it's black once it's booted into the OS. Something like GRUB_CMDLINE_LINUX_DEFAULT="initcall_blacklist=sysfb_init" is required18:32
opendevreviewJulia Kreger proposed openstack/ironic-python-agent master: WIP: Disable md5  https://review.opendev.org/c/openstack/ironic-python-agent/+/86519018:32
TheJulia^^^^ == PAAAAAIIINNNNN18:32
samuelkunkel[m]Then the console should work, according to hpe18:32
jrosserI think we have seen similar, intermittently18:32
samuelkunkel[m]But it seems like all of them struggle with their arm platforms18:32
samuelkunkel[m]Would love to touch Gigabyte's 2-socket 2U4N platform18:33
samuelkunkel[m]1024 Cores on 2U18:33
samuelkunkel[m]Does anyone know if there is a dib in diskimage-builder to set GRUB_CMDLINE Stuff?18:34
TheJuliadtantsur: btw, I ended up making the checksum stuff smarter to figure out based upon length, because anything else is maddening18:39
TheJuliaalso hides the glance field name detail nicely now18:40
TheJuliaunless you want to do something like sha3-51218:40
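The length-based detection TheJulia describes might look roughly like the sketch below. This is not the code from the ironic-python-agent change; the function names, the table of supported lengths, and the `allow_md5` switch are all assumptions made for illustration. It also shows why sha3-512 is the awkward case: its hex digest has the same 128 characters as sha512, so length alone cannot distinguish them.

```python
import hashlib

# Hex digest length -> algorithm name (illustrative table, an assumption).
# sha3_512 also produces 128 hex characters, so it cannot be told apart
# from sha512 by length and would need to be named explicitly.
_ALGO_BY_HEX_LENGTH = {32: "md5", 64: "sha256", 128: "sha512"}


def guess_algorithm(checksum):
    """Guess the hash algorithm of a hex checksum from its length."""
    checksum = checksum.strip().lower()
    if any(c not in "0123456789abcdef" for c in checksum):
        raise ValueError("checksum is not a hex string")
    if len(checksum) not in _ALGO_BY_HEX_LENGTH:
        raise ValueError("unsupported checksum length: %d" % len(checksum))
    return _ALGO_BY_HEX_LENGTH[len(checksum)]


def verify(data, checksum, allow_md5=False):
    """Check data against a checksum, refusing md5 unless explicitly allowed."""
    algo = guess_algorithm(checksum)
    if algo == "md5" and not allow_md5:
        raise ValueError("md5 checksums are disabled")
    return hashlib.new(algo, data).hexdigest() == checksum.strip().lower()
```

With this shape the caller no longer needs to know which field the checksum came from, only the checksum string itself.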
TheJuliaI'm semi-stepping away, a mechanic will be arriving soon to take a look at the diesel engine which refuses to turn over19:32
TheJuliafor clarity, that is for the day, and I'm off the next two days19:34
-opendevstatus- NOTICE: The Etherpad service on etherpad.opendev.org will be offline for the next 90 minutes for a server replacement and operating system upgrade21:58
JayFFor virtualpdu; needed to fix the github sync (we have to land anything at all) https://docs.openstack.org/virtualpdu/latest/23:02
JayFbad link23:02
JayFhttps://review.opendev.org/c/openstack/virtualpdu/+/880894 Fix gitreview; Minor update to readme [NEW]   23:02
JayFgood link23:02
JayFI also just pushed https://review.opendev.org/c/openstack/project-config/+/880895 to make virtualpdu changes alert in here23:04
JayFo/23:04
