*** pmannidi|AFK is now known as pmannidi | 00:16 | |
opendevreview | Steve Baker proposed openstack/ironic-inspector master: Remove rootwrap rule for dnsmasq systemctl https://review.opendev.org/c/openstack/ironic-inspector/+/822373 | 02:16 |
*** pmannidi is now known as pmannidi|AFK | 06:42 | |
arne_wiebalck | Good morning, Ironic! | 07:02 |
janders | hey arne_wiebalck o/ | 07:07 |
holtgrewe | Good morning ;-) OK, so my hardware found a new way to annoy me. In UEFI, my Dell server is apparently storing information related to my previous CentOS installation. Now that I want to install RockyLinux, it does not have sufficient "boot option" slots available. | 07:10 |
holtgrewe | Is there a "clean UEFI boot options" step in ironic? | 07:10 |
arne_wiebalck | hey janders holtgrewe o/ | 07:16 |
holtgrewe | \o arne_wiebalck janders | 07:16 |
holtgrewe | https://bugzilla.redhat.com/show_bug.cgi?id=1680659 <-- omg, am I now close to Dell iDRAC bugs? *sigh* | 07:16 |
arne_wiebalck | holtgrewe: I don't think so ... but I think there should be quite a few slots before it runs out of space, no? | 07:16 |
holtgrewe | I have 16 slots filled ;-) | 07:17 |
arne_wiebalck | that does not sound like a lot | 07:17 |
arne_wiebalck | I am pretty sure we have more than that | 07:18 |
holtgrewe | sounds like a power of 2 | 07:18 |
arne_wiebalck | holtgrewe: how many slots do you use? | 07:18 |
holtgrewe | maybe I should not use the ipmi driver but rather redfish or this dell thing | 07:18 |
arne_wiebalck | I never tried to max it out ... let me check a standard server ... | 07:18 |
holtgrewe | 15x "Unavailable: CentOS Linux", 1x "PXE Device 1: NIC in Slot 4 Port 1 Partition 1", 1x "Unavailable: CentOS" | 07:19 |
holtgrewe | But maybe it is my Frankenstein Rocky disk image ;-) | 07:19 |
arne_wiebalck | hmm, first one uses only 8 | 07:21 |
arne_wiebalck | need to find one with more interfaces ... | 07:22 |
holtgrewe | arne_wiebalck: maybe let me check my uefi image first | 07:22 |
arne_wiebalck | found one with 13 now | 07:23 |
arne_wiebalck | holtgrewe: ok | 07:23 |
arne_wiebalck | holtgrewe: just checked the code, Ironic will remove duplicate entries | 07:30 |
holtgrewe | arne_wiebalck: do you also use Dell servers? if so, which driver? ipmi, redfish, or idrac? | 07:30 |
arne_wiebalck | holtgrewe: no Dell servers | 07:30 |
holtgrewe | arne_wiebalck: so maybe my ironic configuration is problematic? | 07:30 |
arne_wiebalck | holtgrewe: well, one other complication may be that s/w RAID does not use efibootmgr | 07:30 |
arne_wiebalck | holtgrewe: this is still on top of a s/w RAID, right? | 07:31 |
holtgrewe | arne_wiebalck: yes, sounds like I'm (again) hitting corner cases | 07:31 |
arne_wiebalck | holtgrewe: `efibootmgr -v` lists the same entry over and over again? | 07:31 |
holtgrewe | well, that's how you see that you're either on the bleeding edge ... or doing something really really stupid | 07:31 |
holtgrewe | arne_wiebalck: I could not boot into the machine | 07:32 |
holtgrewe | still trying to clear out the UEFI entries using Dell iDRAC/BIOS | 07:32 |
arne_wiebalck | holtgrewe: I guess you can get the current list with iDRAC as well maybe? | 07:33 |
arne_wiebalck | holtgrewe: with Redfish you can, I think | 07:33 |
holtgrewe | arne_wiebalck: yes, the | 07:33 |
holtgrewe | https://snipboard.io/GXkdon.jpg | 07:34 |
arne_wiebalck | holtgrewe: ugh :) | 07:34 |
holtgrewe | XML dump is <Attribute Name="UefiBootSeq">Unknown.Unknown.1-1, Unknown.Unknown.2-1, Unknown.Unknown.3-1, Unknown.Unknown.4-1, Unknown.Unknown.5-1, Unknown.Unknown.6-1, Unknown.Unknown.7-1, Unknown.Unknown.8-1, Unknown.Unknown.9-1, Unknown.Unknown.10-1, Unknown.Unknown.11-1, Unknown.Unknown.12-1, Unknown.Unknown.13-1, Unknown.Unknown.14-1, Unknown.Unknown.15-1, NIC.PxeDevice.1-1, | 07:35 |
holtgrewe | Unknown.Unknown.17-1</Attribute> | 07:35 |
arne_wiebalck | holtgrewe: I have never seen this before | 07:35 |
holtgrewe | arne_wiebalck: Yes, this summarises my experience with pretty much any BMC | 07:35 |
holtgrewe | I'm probably just holding all of them wrong | 07:36 |
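The `UefiBootSeq` attribute pasted above can be split and counted with a few lines of Python. This is only a rough sketch: the function name `split_boot_sequence` is made up for illustration, and treating every `Unknown.`-prefixed item as a stale entry is an assumption, not something the iDRAC documentation is quoted on here.

```python
import xml.etree.ElementTree as ET


def split_boot_sequence(attribute_xml):
    """Return (all_entries, unknown_entries) from one <Attribute> element."""
    elem = ET.fromstring(attribute_xml)
    if elem.tag != "Attribute" or elem.get("Name") != "UefiBootSeq":
        raise ValueError('expected an <Attribute Name="UefiBootSeq"> element')
    # The value is a comma-separated list; drop empty items and whitespace.
    entries = [item.strip() for item in (elem.text or "").split(",") if item.strip()]
    # Assumption: entries starting with "Unknown." are the stale ones.
    unknown = [item for item in entries if item.startswith("Unknown.")]
    return entries, unknown


# The sequence pasted in the channel, rebuilt programmatically.
SEQUENCE = ", ".join(
    [f"Unknown.Unknown.{i}-1" for i in range(1, 16)]
    + ["NIC.PxeDevice.1-1", "Unknown.Unknown.17-1"]
)
ATTRIBUTE = f'<Attribute Name="UefiBootSeq">{SEQUENCE}</Attribute>'

if __name__ == "__main__":
    entries, unknown = split_boot_sequence(ATTRIBUTE)
    print(f"{len(entries)} entries, {len(unknown)} unknown")
```

A full iDRAC XML export contains many more attributes; in that case the same check would have to be applied to the one element whose `Name` is `UefiBootSeq`.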
arne_wiebalck | holtgrewe: heh, don't get me started :-D | 07:36 |
holtgrewe | ILO turned out to be pretty but also problematic | 07:36 |
arne_wiebalck | holtgrewe: we have one type of BMC which needs to be reset, otherwise the server will not reboot | 07:36 |
holtgrewe | maybe the one shipping with SuperMicro is better on the command line but the HTML interface is soooo ugly | 07:36 |
holtgrewe | and not really good to use | 07:37 |
holtgrewe | so in my hands, Dell is still the best of the mediocre | 07:37 |
arne_wiebalck | the interfaces' quality differs a lot | 07:37 |
arne_wiebalck | and most of them are made to be used with a handful of servers, but not apt if you manage 1000s | 07:38 |
holtgrewe | At least with Dell I know a couple of tricks such as dumping config as XML, updating, loading again. | 07:38 |
holtgrewe | arne_wiebalck: haha, Dell has something now where you can manage a dozen from one web interface | 07:38 |
holtgrewe | I guess that's useful ... for windows admins | 07:38 |
holtgrewe | As for M1000e enclosures, you could bundle up to 8 into one admin interface. o_O | 07:39 |
arne_wiebalck | yeah ... I would appreciate it much more if they provided a nice API: we have one delivery which gives you a one-time iKVM link to the console | 07:39 |
arne_wiebalck | this is useful for integration with other tools | 07:40 |
holtgrewe | I think redfish is supposed to be that API. | 07:40 |
arne_wiebalck | I have handled two deliveries with redfish and had various issues | 07:41 |
holtgrewe | And IPMI is a decent protocol, it's just suffering from (at least) -- (a) embrace and extend and (b) interesting BIOS behaviour. | 07:41 |
holtgrewe | :-D | 07:41 |
arne_wiebalck | it is supposed to be, yes, but it is not there | 07:41 |
arne_wiebalck | the standard, maybe, but about the implementations I am not sure | 07:41 |
holtgrewe | I'd rather have them ship with an embedded Raspberry Pi and provide documentation for the sensors and actuators. | 07:42 |
arne_wiebalck | mind you, we have just moved the first redfish managed servers to prof | 07:42 |
arne_wiebalck | *prod | 07:42 |
arne_wiebalck | holtgrewe: right, but only if the APIs are the same on all hardware | 07:43 |
holtgrewe | arne_wiebalck: Is it possible to modify the uefi settings from within the booted OS? | 07:46 |
arne_wiebalck | holtgrewe: you mean beyond setting the boot order? | 07:46 |
holtgrewe | arne_wiebalck: yes | 07:47 |
holtgrewe | You mentioned removing duplicates. | 07:47 |
holtgrewe | sorry, let me google that myself... | 07:47 |
arne_wiebalck | holtgrewe: Ironic does this manually, let me get you a link ... | 07:47 |
arne_wiebalck | holtgrewe: https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/efi_utils.py#L229 | 07:48 |
holtgrewe | arne_wiebalck: and that is not executed when using software raid? | 07:49 |
arne_wiebalck | holtgrewe: I don't think so. | 07:50 |
arne_wiebalck | holtgrewe: s/w RAID bypasses most of the UEFI management | 07:50 |
holtgrewe | :-D | 07:50 |
arne_wiebalck | holtgrewe: for historic reasons :-D | 07:51 |
* arne_wiebalck always wanted to say this. | 07:51 | |
arne_wiebalck | holtgrewe: it was mostly since s/w RAID was done/tested right before a release, and I was not confident enough to redo all with efibootmgr and test before the release. | 07:52 |
holtgrewe | heh | 07:52 |
arne_wiebalck | holtgrewe: so, we agreed to leave it and clean up later | 07:52 |
arne_wiebalck | holtgrewe: so, here we are ... 3 or 4 releases later | 07:52 |
holtgrewe | Sounds like a pragmatic decision. I would have done the same. | 07:52 |
arne_wiebalck | holtgrewe: we should clean this up eventually, though, there is no reason for s/w RAID to do something different | 07:53 |
holtgrewe | Is there a way to prevent the reboot while IPA is running? I still have a devuser setup so I can use efibootmgr manually. | 07:53 |
arne_wiebalck | holtgrewe: two ways: | 07:54 |
arne_wiebalck | holtgrewe: move the node to maintenance when it is in clean_wait, but before the IPA starts | 07:54 |
arne_wiebalck | holtgrewe: enable fast_track on the conductor | 07:54 |
holtgrewe | arne_wiebalck: thanks, #1 is fine for me | 07:55 |
holtgrewe | arne_wiebalck: https://paste.openstack.org/show/811789/ | 07:58 |
holtgrewe | wheee | 07:58 |
arne_wiebalck | holtgrewe: heh, that looks indeed like a missing cleanup | 07:59 |
holtgrewe | arne_wiebalck: maybe the rocky installation did not write to UEFI... | 07:59 |
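For cleaning up by hand from the IPA ramdisk, the duplicate check discussed above can be sketched in Python. This is not the IPA code linked earlier, just a minimal illustration under assumptions: the helper names are invented, the sample output is made up, and the parser assumes each `efibootmgr -v` entry has the shape `BootXXXX* label<TAB>device path`. The generated commands use `efibootmgr -b XXXX -B`, which deletes one boot entry; they are only built here, not executed.

```python
import re

# Assumed entry shape: "Boot0001* Label<TAB>device-path".
# Lines such as "BootCurrent:" or "BootOrder:" do not match.
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})\*?\s+([^\t]*)(?:\t(.*))?$")


def parse_entries(output):
    """Return a list of (bootnum, label, device_path) tuples."""
    entries = []
    for line in output.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            num, label, path = match.groups()
            entries.append((num, label.strip(), path or ""))
    return entries


def duplicate_bootnums(entries):
    """Keep the first entry per (label, path); return the later repeats."""
    seen = set()
    duplicates = []
    for num, label, path in entries:
        key = (label, path)
        if key in seen:
            duplicates.append(num)
        else:
            seen.add(key)
    return duplicates


def delete_commands(bootnums):
    """Build (but do not run) one efibootmgr delete command per entry."""
    return [["efibootmgr", "-b", num, "-B"] for num in bootnums]


# Made-up sample output, for illustration only.
SAMPLE = "\n".join([
    "BootCurrent: 0003",
    "BootOrder: 0003,0000,0001,0002",
    "Boot0000* CentOS Linux\tHD(1,GPT,aaaa)/File(\\EFI\\centos\\shimx64.efi)",
    "Boot0001* CentOS Linux\tHD(1,GPT,aaaa)/File(\\EFI\\centos\\shimx64.efi)",
    "Boot0002* CentOS Linux\tHD(1,GPT,aaaa)/File(\\EFI\\centos\\shimx64.efi)",
    "Boot0003* PXE Device 1\tVenHw(xyz)",
])

if __name__ == "__main__":
    for command in delete_commands(duplicate_bootnums(parse_entries(SAMPLE))):
        print(" ".join(command))
```

Reviewing the printed commands before running any of them seems wise, since deleting the wrong entry can leave the machine unbootable.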
*** amoralej|off is now known as amoralej | 08:02 | |
rpittau | good morning ironic! o/ | 08:35 |
rpittau | zigo: backports should work, at least for us, we follow the dib default apt sources config and it has bullseye-backports | 08:39 |
zigo | priteau: dtantsur: I have uploaded lshw_02.19.git.2021.06.19.996aaad9c7-2~bpo11+1_amd64.changes, now it needs to clear the Debian backports NEW queue (ie: backports FTP masters need to approve the package). This may take some time, but hopefully not too much, that queue being almost empty it should go fast. | 09:30 |
dtantsur | thank you zigo! | 09:30 |
dmellado | dtantsur: o/ | 11:08 |
dmellado | I've got one more question, so | 11:08 |
dmellado | I can't seem to enroll the node, but I can get to redfish in its ip | 11:08 |
dmellado | how can I debug this? | 11:08 |
dmellado | https://paste.openstack.org/show/811796/ | 11:10 |
dmellado | I can seem to get to the redfish api directly | 11:11 |
dmellado | https://paste.openstack.org/show/811797/ | 11:11 |
dmellado | curl to redfish v1 | 11:13 |
dmellado | https://paste.openstack.org/show/811798/ | 11:13 |
dtantsur | dmellado: "Resource temporarily unavailable" sounds like one of 2 things: 1) the conductor crashed, 2) the idrac hardware type is not enabled | 11:14 |
dtantsur | dmellado: also, idrac != redfish, you're configuring WSMAN credentials | 11:15 |
dtantsur | unless you need any advanced feature, maybe start with just IPMI? | 11:15 |
dmellado | Yeah, I can take a look | 11:16 |
dmellado | but it does seem that conductor indeed failed | 11:16 |
dmellado | it seems that Dec 21 05:14:31 jumphost2.dfwt5g.lab ironic-conductor[991788]: 2021-12-21 05:14:31.382 991788 ERROR oslo_service.service ironic.common.exception.DriverLoadError: Driver, hardware type or interface ipxe could not be loaded. Reason: [Errno 13] Permission denied: '/httpboot/boot.ipxe'. | 11:16 |
dmellado | it's not there, so I'll reinstall | 11:16 |
dmellado | I'm learning lots, though xD | 11:17 |
dmellado | hmmm seems that I'm missing /httpboot/boot.ipxe | 11:56 |
dmellado | dtantsur: when would that be created? | 11:56 |
dtantsur | dmellado: it's created by bifrost during installation | 11:57 |
dtantsur | (unless you override http_boot_folder) | 11:57 |
dmellado | I have made a symlink on the http_boot_folder | 11:58 |
dmellado | so the folder itself doesn't get created but uses the symlinked one | 11:58 |
dmellado | I guess I'm getting too hacky xD | 11:59 |
dtantsur | dmellado: yeah, then it could be permissions or selinux | 11:59 |
dtantsur | if you need it in a different location, setting http_boot_folder is probably a better idea | 11:59 |
dmellado | I'll do that, thanks for all the tips! | 11:59 |
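When a symlinked boot folder produces "Permission denied" as above, it can help to look at every component of the path rather than only the final file. The sketch below is a hypothetical helper (`describe_path` is not part of ironic or bifrost) that reports the mode, owner and symlink target of each component. It covers plain file permissions only; SELinux denials would still have to be checked in the audit log.

```python
import os
import stat


def describe_path(path):
    """Return one record per path component, from the root down."""
    # abspath normalises the path without resolving symlinks.
    current = os.path.abspath(path)
    components = []
    while True:
        components.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    records = []
    for component in reversed(components):
        try:
            info = os.lstat(component)  # lstat: do not follow the last link
        except OSError as exc:
            records.append({"path": component, "error": exc.strerror})
            continue
        is_link = stat.S_ISLNK(info.st_mode)
        records.append({
            "path": component,
            "mode": stat.filemode(info.st_mode),
            "uid": info.st_uid,
            "symlink_to": os.readlink(component) if is_link else None,
        })
    return records


if __name__ == "__main__":
    # Path taken from the conductor error in the log above.
    for record in describe_path("/httpboot/boot.ipxe"):
        print(record)
```

The output has to be read with the service user in mind: a directory the conductor's user cannot traverse, or a symlink pointing somewhere that user cannot reach, would explain the error even if the file itself looks readable.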
* dtantsur bbl | 11:59 | |
*** amoralej is now known as amoralej|lunch | 13:40 | |
*** amoralej|lunch is now known as amoralej | 14:19 | |
dmellado | dtantsur: last trouble I give you | 14:57 |
dmellado | I think I'm almost done, and in a buggy environment | 14:57 |
dmellado | xD | 14:57 |
dmellado | so, now I could enroll the node and so | 14:57 |
dmellado | and I'm just getting a 403 from nginx | 14:57 |
dmellado | as I had to move and play with the folders and paths | 14:57 |
dmellado | where do you set that up in ironic? | 14:57 |
holtgrewe | arne_wiebalck: for some reason the rocky linux installation via IPA does not create the UEFI entry (when used with software RAID) | 14:57 |
dmellado | s/ironic/bifrost | 14:57 |
holtgrewe | arne_wiebalck: could it be that this is a side effect of the "skip everything in case of software RAID" feature that you mentioned earlier? | 14:58 |
arne_wiebalck | holtgrewe: not sure, there should still be an entry | 15:02 |
arne_wiebalck | holtgrewe: there is however an issue with cs8 and grub2-install in that it fails due to lack of support for secure boot | 15:02 |
arne_wiebalck | holtgrewe: another reason we need to move to efibootmgr ... next year :) | 15:03 |
arne_wiebalck | have a great break everyone, see you next year o/ | 15:03 |
holtgrewe | arne_wiebalck: o/ have a nice break | 15:04 |
holtgrewe | thanks for everything! | 15:04 |
*** akahat|ruck is now known as akahat|dinner | 15:18 | |
*** akahat|dinner is now known as akahat|ruck | 15:45 | |
dtantsur | dmellado: you really should check both nginx logs and selinux audit messages | 15:58 |
rpittau | bye everyone, see you on thursday! or next year :) | 16:29 |
dtantsur | rpittau: enjoy the break! | 16:30 |
rpittau | thank you dtantsur, you too :) | 16:30 |
opendevreview | Dmitry Tantsur proposed openstack/ironic-python-agent master: WIP Refactor: create image_download module https://review.opendev.org/c/openstack/ironic-python-agent/+/822536 | 16:58 |
NobodyCam | Good Morning Ironic'ers | 17:05 |
dtantsur | morning NobodyCam | 17:22 |
NobodyCam | O/ Morning dtantsur | 17:23 |
NobodyCam | staying warm out there | 17:23 |
dtantsur | when heating works - yes :) | 17:24 |
*** sshnaidm is now known as sshnaidm|afk | 17:34 | |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: [WIP] ImageCache: respect Cache-Control: no-store https://review.opendev.org/c/openstack/ironic/+/822329 | 18:01 |
*** amoralej is now known as amoralej|off | 18:02 | |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: [WIP] ImageCache: respect Cache-Control: no-store https://review.opendev.org/c/openstack/ironic/+/822329 | 18:21 |
dtantsur | good night folks o/ | 18:35 |
*** gmann is now known as gmann_afk | 19:51 | |
stevebaker[m] | good morning | 21:04 |
opendevreview | Steve Baker proposed openstack/ironic master: Use driver_internal_info methods for driver utils https://review.opendev.org/c/openstack/ironic/+/818505 | 21:52 |
opendevreview | Steve Baker proposed openstack/ironic master: Use driver_internal_info methods for drac driver https://review.opendev.org/c/openstack/ironic/+/818506 | 21:52 |
opendevreview | Steve Baker proposed openstack/ironic master: Use driver_internal_info methods for ilo driver https://review.opendev.org/c/openstack/ironic/+/818507 | 21:52 |
opendevreview | Steve Baker proposed openstack/ironic master: Use driver_internal_info methods for redfish driver https://review.opendev.org/c/openstack/ironic/+/818508 | 21:52 |
opendevreview | Steve Baker proposed openstack/ironic master: Use driver_internal_info methods for other drivers https://review.opendev.org/c/openstack/ironic/+/818509 | 21:52 |