opendevreview | cid proposed openstack/ironic master: Fix IPA external inspection callback url override https://review.opendev.org/c/openstack/ironic/+/949521 | 00:03 |
---|---|---|
iurygregory | ok, inserted on slot 1, but the machine times out yay | 01:07 |
iurygregory | Cleaning for node 11111111-2222-3333-4444-555555555555 failed. Timeout reached while cleaning the node. Please check if the ramdisk responsible for the cleaning is running on the node. Failed on step {}. | 01:07 |
iurygregory | yeah, I wish i had ramdisk logs at all =( | 01:08 |
iurygregory | tomorrow will be a new day, I will try something else on iDRAC10, good night folks! | 01:28 |
cid | \o | 01:46 |
opendevreview | Jacob Anders proposed openstack/sushy master: Skipping UsbCd workaround on Supermicro ARS-111GL-NHR https://review.opendev.org/c/openstack/sushy/+/949116 | 02:06 |
opendevreview | Merged openstack/ironic master: Fix redfish driver URL parsing https://review.opendev.org/c/openstack/ironic/+/949597 | 04:35 |
rpittau | good morning ironic! o/ | 06:49 |
Continuity | Morning Ironic | 08:05 |
opendevreview | Elod Illes proposed openstack/ironic unmaintained/xena: [Stable Only] pin virtualbmc/sushy-tools/ironic-tempest-plugin to last released tag https://review.opendev.org/c/openstack/ironic/+/945716 | 08:50 |
abongale | Good Morning Ironic! | 08:58 |
dtantsur | iurygregory: let me join the voices thanking you for chasing the iDRAC 10 issue :) | 09:17 |
dtantsur | iurygregory: after doing https://github.com/dell/iDRAC-Redfish-Scripting/issues/324#issuecomment-2892526106, does the machine actually boot into the ISO if rebooted? | 09:18 |
dtantsur | If not, it's valuable information to tell them IMO | 09:18 |
dtantsur | Ah, I should have read the scrollback, right? So, try booting it with their scripts only, not Ironic. If it does not boot, complain on the bug and let us escalate it. | 09:24 |
iurygregory | dtantsur, the machine shows a screen saying it is trying to boot, but after some time it powers off. I'm going to re-install other firmware to test and try to manually boot an ISO to install an OS, just to see how things go. | 10:54 |
Sandzwerg[m] | So I created a bug for the UEFI/MBR-partition thing: https://bugs.launchpad.net/ironic/+bug/2111319 | 11:08 |
Sandzwerg[m] | Does it already make sense to notify Dell of https://github.com/dell/iDRAC-Redfish-Scripting/issues/324 ? If yes then I'll open a ticket internally. | 11:10 |
iurygregory | Sandzwerg[m], you work at Dell? <eyes> | 11:20 |
Sandzwerg[m] | No, Not at all. But we have a Dell representative at my place (also a HPE and Lenovo one) and can make internal tickets for stuff we want them to fix | 11:24 |
Sandzwerg[m] | Like we get premium support and have meetings every two weeks. And if you say you can't buy their hardware because it breaks your automation that sometimes helps to fix things. | 11:25 |
Sandzwerg[m] | Of course if they follow the redfish standard all should be fine | 11:25 |
iurygregory | It would be good to reach out to Dell, the way I've found was by opening the issue | 11:27 |
Sandzwerg[m] | OK I'll open an issue so our contact is aware. We don't have iDRAC 10 yet but that would be a blocker | 11:29 |
iurygregory | much appreciated Sandzwerg[m] o/ | 11:30 |
iurygregory | dtantsur, ok seems like the latest iDRAC firmware had some problems, i was unable to manually boot, did a rollback to 1.20.25.00 at least now it gets to the screen saying `Virtual CD Boot Requested by iDRAC`, will test via their script now | 11:58 |
dtantsur | ++ | 12:28 |
opendevreview | Riccardo Pittau proposed openstack/bifrost master: Default ansible to version 10.x https://review.opendev.org/c/openstack/bifrost/+/948245 | 12:29 |
opendevreview | Riccardo Pittau proposed openstack/ironic master: [WIP] Run metal3 integration job using UEFI boot (default) https://review.opendev.org/c/openstack/ironic/+/939694 | 12:32 |
opendevreview | Verification of a change to openstack/ironic unmaintained/xena failed: [Stable Only] pin virtualbmc/sushy-tools/ironic-tempest-plugin to last released tag https://review.opendev.org/c/openstack/ironic/+/945716 | 12:38 |
opendevreview | Elod Illes proposed openstack/ironic unmaintained/xena: [Stable Only] pin virtualbmc/sushy-tools/ironic-tempest-plugin to last released tag https://review.opendev.org/c/openstack/ironic/+/945716 | 12:40 |
iurygregory | machine boots when doing via their scripting.., testing with ironic again hardcoding the VirtualMedia/1 ... | 12:42 |
opendevreview | cid proposed openstack/ironic master: Add port/portgroup list conductor groups filter https://review.opendev.org/c/openstack/ironic/+/862292 | 12:45 |
iurygregory | https://dl.dell.com/content/manual13739887-overview-of-idrac10-redfish-enhancements.pdf?language=en-us | 13:06 |
TheJulia | good morning | 13:22 |
TheJulia | I've only read a few fragments of that doc reading, but..... some aspects of it just don't add up | 13:27 |
iurygregory | agree | 13:27 |
TheJulia | and obviously, it lacks what has been discovered | 13:31 |
TheJulia | JayF: rpittau: https://review.opendev.org/c/openstack/ironic/+/950192 was previously approved and had a typo fixed in the release note | 13:44 |
TheJulia | A quick re-review would be appreciated | 13:45 |
rpittau | done | 13:47 |
iurygregory | time to test again idrac10 with ironic and see if the script GetIdracLcSystemAttributesREDFISH.py will give me the same output | 14:04 |
iurygregory | ok the script reports Attribute Name: ServerBoot.1.FirstBootDevice, Current Value: VCD-DVD, the iso is inserted on slot 1 | 14:19 |
iurygregory | https://paste.opendev.org/show/bejGD6UuT87uOy4F0D0I/ | 14:20 |
iurygregory | YAY Kernel Panic FTW | 14:22 |
iurygregory | wondering if it's a bad image =( or some magic in the machine... | 14:24 |
queensly[m] | I'm working with sushy and using the Redfish emulator from sushy-tools. I’m calling mgr_inst.datetime in Python, but it’s None. I already added DateTimeLocalOffset, and that works fine. Is there something else I need to add in the manager.py file? | 14:32 |
opendevreview | Verification of a change to openstack/ironic master failed: CI: Reconfigure jobs to minimize tinyipa usage https://review.opendev.org/c/openstack/ironic/+/950192 | 15:13 |
opendevreview | Merged openstack/ironic-python-agent-builder master: Update pip version in dib source install https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/949974 | 15:13 |
opendevreview | Merged openstack/ironic-python-agent-builder master: Build CS9 DIB IPA ramdisk with python 3.12 https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/950152 | 15:39 |
okamitok[m] | Thanks Jay and Julia for the help over the last week.... (full message at <https://matrix.org/oftc/media/v1/media/download/AQjRLyN9iE8HCHAF-ocJsa6K1-8lVudXpwQ45heE9pP1rZIYXSAHcDhuzM3CWdBOGr3UReRqoxLhKiFIo92eCs5CeXNpPYfgAG1hdHJpeC5vcmcvTkJ3S0ltcFpORUhUcmRPaWxWTUJlSExm>) | 16:15 |
TheJulia | okamitok[m]: I suspect your nova-compute service is running with config drive disabled. If you explicitly set it to be enabled, then it should be populated on the config drive on the host. That is the best path to take and avoids reliance upon metadata services, which have networking-related quirks. | 16:31 |
TheJulia | okamitok[m]: regarding modifying post deploy, I'd recommend you just change the image to include the needful | 16:36 |
opendevreview | Julia Kreger proposed openstack/ironic master: DNM: CI Science - Expand the multinode job https://review.opendev.org/c/openstack/ironic/+/950206 | 16:49 |
opendevreview | Verification of a change to openstack/ironic master failed: CI: Reconfigure jobs to minimize tinyipa usage https://review.opendev.org/c/openstack/ironic/+/950192 | 17:16 |
opendevreview | Jay Faulkner proposed openstack/networking-baremetal master: Remove explicitly use of eventlet https://review.opendev.org/c/openstack/networking-baremetal/+/947985 | 17:27 |
TheJulia | looks like our devstack plugin is now broken | 17:56 |
TheJulia | start_neutron_api method from devstack is no longer found | 17:56 |
JayF | https://github.com/openstack/devstack/commit/9e81048bbb3b3adbfb7bd5307af9bce79290308c | 17:58 |
TheJulia | what is mind bending... start_neutron_api is not found | 18:01 |
JayF | It's not its own process anymore. | 18:05 |
JayF | it's gotta be metaprogramming somewhere to make those methods | 18:06 |
TheJulia | It was a method at some point, but I don't see it in the history at this point | 18:09 |
TheJulia | well | 18:09 |
TheJulia | start_neutron is still there | 18:09 |
TheJulia | so... | 18:09 |
opendevreview | Julia Kreger proposed openstack/ironic master: ci/devstack: Remove start_neutron_api explict call https://review.opendev.org/c/openstack/ironic/+/950455 | 18:11 |
TheJulia | looks like with the flow we now 503 on the network creation | 19:19 |
TheJulia | we're likely going to have to do heavier retooling | 19:19 |
* TheJulia sighs | 19:19 | |
*** jcosmao is now known as Guest16423 | 19:26 | |
iurygregory | ok, this is the weirdest thing i saw today, after the kernel panic the machine is unable to boot from the OS or from Virtual Media | 19:29 |
iurygregory | System is powered off -> HOST boot in progress -> Please wait while the system is initializing.. -> System is powered off | 19:29 |
iurygregory | LOL the iDRAC UI says there are no Disks omg | 19:32 |
iurygregory | https://paste.opendev.org/show/buc9su3tP5XaBmzw0y8D/ | 19:32 |
* TheJulia blinks | 19:41 | |
* TheJulia blinks some more | 19:41 | |
* JayF 👀 on that neutron issue | 19:50 | |
JayF | so | 19:58 |
JayF | we stop neutron-rpc-server (I'm unsure who "we" is yet) https://www.irccloud.com/pastebin/pLQZxkSW/ | 19:58 |
JayF | and it never gets restarted | 19:58 |
JayF | that is likely the root cause of the 503 | 19:58 |
JayF | and it doesn't look like failures; it looks like actual logic issues where we never even try to start it back | 19:59 |
JayF | TheJulia: ^ if you have anything to add, /me keeps digging | 19:59 |
TheJulia | yeah, likely | 20:00 |
TheJulia | so start_neutron only restarts the API then | 20:00 |
TheJulia | that was similar behavior | 20:00 |
JayF | start_neutron is not the opposite of stop_neutron | 20:01 |
JayF | and our code kinda assumes it is | 20:01 |
TheJulia | yeah | 20:01 |
TheJulia | well, kind of | 20:01 |
JayF | start_neutron_service_and_check | 20:02 |
JayF | I think is what we want | 20:02 |
JayF | https://opendev.org/openstack/devstack/src/branch/master/lib/neutron#L610 | 20:02 |
JayF | start_neutron only does agents | 20:02 |
JayF | so maybe both | 20:02 |
* JayF sciences | 20:02 | |
opendevreview | Jay Faulkner proposed openstack/ironic master: Science: replace start_neutron_api with start_neutron_service_and_check https://review.opendev.org/c/openstack/ironic/+/950461 | 20:03 |
JayF | stop_neutron doesn't do anything to the api, either | 20:05 |
JayF | so we may not be restarting that process generally | 20:05 |
TheJulia | wheeeeeeeeee | 20:05 |
TheJulia | the whole reason to cycle the config is because we were changing the config | 20:06 |
JayF | well, it's uwsgi | 20:06 |
JayF | so I think we have to bump the whole uwsgi if we wanna restart one | 20:06 |
JayF | but I also wonder if what we do is enough, generally | 20:06 |
JayF | I see no recent changes to stop_neutron | 20:10 |
JayF | so I think my change may be enough 🤞 | 20:10 |
TheJulia | hopefully | 20:12 |
opendevreview | Julia Kreger proposed openstack/ironic master: WIP Patch configdrive metadata https://review.opendev.org/c/openstack/ironic/+/946677 | 20:20 |
TheJulia | I think that is retooled as desired | 20:22 |
JayF | that failed, it looks like neutron crashed failing to connect to ovsdb | 20:43 |
JayF | I think there's a red herring somewhere? start_neutron_api doesn't exist in stable/2025.1 :| | 20:53 |
JayF | (devstack stable/2025.1) | 20:53 |
TheJulia | yeah, I know | 20:56 |
opendevreview | Merged openstack/ironic unmaintained/xena: [Stable Only] pin virtualbmc/sushy-tools/ironic-tempest-plugin to last released tag https://review.opendev.org/c/openstack/ironic/+/945716 | 21:01 |
JayF | neutron_plugin_configure_plugin_agent is gone | 21:01 |
JayF | hmmm not if you have a plugin loaded tho | 21:02 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Science: replace start_neutron_api with start_neutron_service_and_check https://review.opendev.org/c/openstack/ironic/+/950461 | 21:03 |
JayF | TheJulia: until recently; is_service_enabled neutron-api was basically unconditionally false | 21:08 |
JayF | TheJulia: that's the change. That branch of code hasn't been in actual use in ages afaict | 21:09 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Science: replace start_neutron_api with start_neutron_service_and_check https://review.opendev.org/c/openstack/ironic/+/950461 | 21:10 |
* JayF just removes that whole block to see what happens | 21:10 | |
JayF | https://zuul.opendev.org/t/openstack/build/38e5e97dbf9344e6ba08fa083aff2724/log/job-output.txt#25490 | 21:11 |
* JayF wonders if that whole block of code just never being called has some kind of side effect | 21:12 | |
JayF | heh now time to search git log -p in devstack for start_neutron_api to try and timestamp how long that's been dead code | 21:15 |
JayF | uh holy crap our jobs may have been operating weirdly for a while | 21:16 |
JayF | https://opendev.org/openstack/devstack/commit/a52041cd3f067156e478e355f5712a60e12ce649 | 21:17 |
JayF | start_neutron_api in the lib/neutron module has been gone since Nov, 2022 | 21:18 |
JayF | so that code has been not running for about 2.5 years | 21:18 |
JayF | and the recent devstack "fixes" to make neutron-api properly servicified exposed this | 21:18 |
JayF | I think my fix is a good fix now, will see if it passes CI and if so will update commit message | 21:18 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Remove code which has been long-dead https://review.opendev.org/c/openstack/ironic/+/950461 | 21:23 |
JayF | I went ahead and updated the commit message with all this context, so if/when it passes CI it can be landed | 21:23 |
JayF | TheJulia: ^ I think I nailed the CI thing | 21:24 |
TheJulia | woot | 21:54 |
JayF | I do wonder if those v6 tweaks need to be moved around | 21:55 |
TheJulia | wouldn't surprise me | 21:55 |
TheJulia | Our devstack plugin could use a good cleaning | 21:55 |
JayF | ...is that possibly the problem you had getting v6 to run outside of an OVN job? | 21:55 |
JayF | since that seems to be enabling proxy for ndp which seems useful | 21:56 |
TheJulia | no, the problem with v6 in non-ovn was a blend of edk2 firmware with dnsmasq | 21:56 |
JayF | ah that's right | 21:56 |
TheJulia | Anyway, I just spent the last hour and a half digging through customer logs | 21:57 |
JayF | did you find one at least half as straight as a 2x4 from lowes? /s | 21:58 |
TheJulia | lol | 21:58 |
TheJulia | I suspect brand shiny new idrac10s with idrac-wsman on some machines | 21:58 |
TheJulia | crazy errors | 21:58 |
JayF | just point them at the working redfi.... oh | 21:58 |
TheJulia | heh | 21:59 |
TheJulia | Redfis.... oh :( | 21:59 |
JayF | tell them to communicate to their vendor that it's highly recommended to not shuffle API endpoints for the hell of it | 21:59 |
TheJulia | pretty much, I've explicitly requested a bunch of details | 21:59 |
JayF | you could've written an Ironic API client when some of our contributors were still in high school and still run it today | 21:59 |
TheJulia | so... time will tell | 21:59 |
JayF | but they gotta move stuff around for one hardware revision | 22:00 |
JayF | it's disrespectful of the entire ecosystem and highly frustrating | 22:00 |
TheJulia | yeah, $words | 22:00 |
TheJulia | Anyhow, I need to step away. | 22:00 |
JayF | have a good one o/ | 22:00 |
JayF | my CI fix patch is already past the error point | 22:04 |
JayF | any cores around overnight or late today who see this: this is what needs to be approved to fix the gate once it passes CI; please land it: https://review.opendev.org/c/openstack/ironic/+/950461 | 22:04 |
TheJulia | stevebaker[m]: ^ | 22:07 |
stevebaker[m] | done | 22:13 |
JayF | job is failing :( | 22:15 |
JayF | not the same error though, so I assume it's one of our usual suspects | 22:15 |
TheJulia | looks like dhcp no worky | 22:30 |
JayF | actually no worky | 22:34 |
JayF | or race condition no worky | 22:34 |
JayF | I wonder if q-dhcp is now neutron-dhcp... | 22:34 |
* JayF waits for CI logs | 22:34 | |
TheJulia | I think they posted for one of the jobs | 22:34 |
JayF | only job that's failing+voting is ovn+ipv6 | 22:35 |
JayF | and that doesn't use neutron dhcp | 22:35 |
TheJulia | yeah, logs will need to be dug through. | 22:36 |
JayF | I'm looking real quick for any leads | 22:36 |
TheJulia | k, I should have spoons in the morning | 22:37 |
JayF | is_service_enabled q-dhcp + neutron-dhcp both pass | 22:37 |
JayF | so that naming may not matter | 22:37 |
TheJulia | That job is ovn anyhow | 22:37 |
JayF | q-dhcp service seems to be OK | 22:39 |
JayF | I'm looking at the multinode shard job | 22:39 |
JayF | since I know more about how that's shaped | 22:39 |
JayF | OVN is mostly still a black box to me :/ | 22:39 |
JayF | 2025-05-20T21:36:06.912Z|00003|reconnect|INFO|/var/run/openvswitch/db.sock: connection attempt failed (Address family not supported by protocol) in ovn-controller-vtep.log | 22:40 |
JayF | but I have no idea what a good one looks like | 22:40 |
JayF | It's also worth noting neutron has landed a lot of eventlet removal stuff | 22:41 |
JayF | places where we were depending on specific ordering or timing may be less happy now | 22:41 |
TheJulia | Yeah, there is a whole order of operations issue here | 22:48 |
TheJulia | with devstack creating networks | 22:48 |
JayF | the port bindings are failing | 22:48 |
TheJulia | Perhaps we need to front load in the disabling | 22:48 |
JayF | > May 20 21:40:59.886533 np0040838796 devstack@neutron-api.service[91224]: ERROR networking_generic_switch.generic_switch_mech [req-fb14966a-feae-469d-ad3d-c8b01d6ee3d6 req-282c99c0-dad7-4fea-8002-cf6a14e43b48 service ironic] Cannot bind port deaec2e4-5e49-4f7b-b4ab-70e6bcb21ef6 as device brbm is not configured. Check baremetal port link configuration. | 22:48 |
JayF | I am unsure if that is meaningful or not in this context | 22:48 |
JayF | this is the ovn v6 job | 22:48 |
TheJulia | that seems like a rather unhappy error | 22:48 |
JayF | https://e3fa69918ab3893f89a3-76ad47885070581f857a540cadaa6a6d.ssl.cf1.rackcdn.com/openstack/55cf2727b4c54f06b897353cf71ea0a3/controller/logs/screen-neutron-api.txt | 22:48 |
JayF | and would follow the pattern, potentially, of a configuration not being picked up | 22:49 |
TheJulia | the knob deleted was to prevent neutron from doing initial network configurations... hmmm | 22:49 |
JayF | I have a killer migraine, weather here is rain/stop/rain/stop/rain/stop which isn't great, I'm going to step away. I should have some time to dig again tomorrow... well, probably not until afternoon tbh but I'll try to make time in the morning :) | 22:49 |
JayF | oooh | 22:49 |
TheJulia | yes, go, step away | 22:50 |
JayF | I'm still here for another :10, going to try and figure it out | 22:50 |
JayF | I don't see that switch having gone away? | 22:51 |
JayF | the primary effective changes, afaict, are s/q-/neutron-/ except in cases where you're explicitly enabling the q- version of a service | 22:52 |
JayF | but clearly some piece is missing | 22:53 |
TheJulia | NEUTRON_CREATE_INITIAL_NETWORKS=False | 22:53 |
JayF | that is still honored in devstack afaict | 22:53 |
JayF | oh, you mean it's been disappeared | 22:54 |
JayF | but again; I have hard evidence that codepath is dead for 2.5+ years | 22:54 |
JayF | so even if that's not awesome, it should *not* be the root cause of our issue imo | 22:55 |
TheJulia | true | 22:56 |
TheJulia | Get some rest, I'll take a fresh look in the morning | 22:56 |
JayF | I would posit if it's failing with that code removed, either 1) random failures like we tend to see or 2) something else in the devstack change chunk | 22:56 |
JayF | I might recheck this once the -1 comes through just to see if it's reproducible | 22:57 |
TheJulia | reasonable | 22:57 |
JayF | OK, going to go find some headache medicine and !laptop :) o/ | 22:57 |
TheJulia | sounds like a plan | 22:57 |
* TheJulia checks the status of the nearby fire which is luckily heading away | 22:58 |
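
The history search JayF describes at 21:15 (looking through devstack's git log for `start_neutron_api` to date when it became dead code) can be sketched with git's pickaxe option, `-S`, which lists only the commits that change the number of occurrences of a string. This is a minimal sketch, not the command actually run in the channel: the helper names `pickaxe_command` and `find_symbol_history` are hypothetical, and the checkout location and the `lib/neutron` path filter are assumptions taken from the discussion.

```python
import subprocess


def pickaxe_command(symbol, path=None):
    """Build a `git log` argv listing commits that add or remove `symbol`."""
    cmd = [
        "git", "log",
        "-S" + symbol,          # pickaxe: occurrence count of the string changes
        "--date=short",         # print dates as YYYY-MM-DD
        "--format=%h %ad %s",   # short hash, author date, subject
    ]
    if path:
        cmd += ["--", path]     # restrict the search to one file or directory
    return cmd


def find_symbol_history(repo_dir, symbol, path=None):
    """Run the pickaxe search inside repo_dir; return one line per commit."""
    result = subprocess.run(
        pickaxe_command(symbol, path),
        cwd=repo_dir,           # repo_dir is assumed to be a git checkout
        capture_output=True,
        text=True,
        check=True,             # raise if git exits with an error
    )
    return result.stdout.splitlines()
```

Calling `find_symbol_history("devstack", "start_neutron_api", "lib/neutron")` from a directory that contains a devstack checkout would return one line per matching commit, newest first, so the first entry is the most recent commit that added or removed the symbol in that file.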