Tuesday, 2024-10-15

rpittaugood morning ironic! o/06:45
rpittauas I said yesterday during the meeting, antelope is switching to unmaintained at the end of october07:16
rpittauall patches in 2023.1 are merged, but please double-check as I plan to do the final release before end of this week07:16
opendevreviewJunya Noguchi proposed openstack/ironic master: Add image build method for verified OS.  https://review.opendev.org/c/openstack/ironic/+/93239609:49
iurygregorytonyb, hey! we can merge https://review.opendev.org/c/openstack/project-config/+/904012 o/10:43
iurygregoryoh nvm, it's already merged :D10:43
iurygregorygood morning ironic10:44
* iurygregory needs more coffee :D10:44
TheJuliaWord has it from today’s keynote speaker, NASA Goddard is going to rip out xCAT and go to ironic13:50
TheJuliahttps://usercontent.irccloud-cdn.com/file/C2Qxt0vS/1729000278.JPG13:51
rpittauwow nice14:01
TheJuliaThat was Jonathan Mills on stage14:14
TheJuliahttps://usercontent.irccloud-cdn.com/file/jeN1rrjy/1729001774.JPG14:16
TheJuliaMore mention of things we care about :)14:16
rpittauyep :)14:18
opendevreviewMichael Sherman proposed openstack/ironic stable/2023.1: allow disk cleaning during deploy  https://review.opendev.org/c/openstack/ironic/+/93241814:20
cardoewho was the first speaker TheJulia?14:31
cardoeah Jonathan Mills. Reading comprehension fail.14:32
rpittau:)14:36
rpittaugood night! o/16:19
TheJuliaSo! University of Chicago has a snapshot thingie they want to make a service step16:24
JayFsounds like fun16:26
TheJuliaThey are going to try and submit their “horrible bash script” as a starting place16:52
TheJulia43 people in James Denton's Ironic session....17:37
TheJulia4 on the zoom17:38
TheJuliaInteresting…. https://usercontent.irccloud-cdn.com/file/MsyDyMMq/1729014383.JPG17:46
TheJuliaRaid is separate apparently because of hot spare17:46
TheJuliahttps://usercontent.irccloud-cdn.com/file/VPBf7J7m/1729014434.JPG17:47
iurygregoryThere is zoom available?!17:49
iurygregory=O17:49
TheJuliahttps://usercontent.irccloud-cdn.com/file/M7SNQxnK/trim.EE6C197D-9DBD-4ED0-9541-F2EEB84718CC.mp417:49
TheJuliaIf you registered for NA…17:50
TheJulia:)17:50
iurygregoryI didn't =( 17:50
TheJuliaThey only made the hybrid announcement last week17:51
TheJuliaAnd it still flew under the radar some, but it means we’re trying to record the sessions17:51
iurygregorynice =)17:51
iurygregoryenjoy the OID NA o/17:52
cardoeSo I can answer stuff about James's piece around Ironic18:11
cardoeThat first picture is what we do today. That's what I wanna get away from and do it all via out-of-band inspection18:12
cardoeRAID is problematic because the syntax doesn't allow for expressing the configuration they have.18:12
cardoeWe'll also have different users wanting different RAID setups and the "make a new flavor" scheme gets unwieldy.18:13
cardoeoh you said it... hot spare18:14
TheJuliaI think we could declare that with out of band if the driver supports it, but software raid wise I kind of went “huh”18:15
TheJuliaHmm, looks like you just add a device18:17
cardoeSo this gets into the sushy / sushy-oem-idrac18:17
TheJuliahttps://usercontent.irccloud-cdn.com/file/Rtdoj97U/1729016655.JPG18:24
TheJuliaEveryone is doing ironic18:24
TheJuliacardoe: knowing the data in advance to generate the raid config? Or at least before the initial cleaning pass?18:24
cardoeSo if I recall correctly we need to specify something specific about the disk for Dell to be happy and it's before the initial cleaning18:26
cardoeThat's one that I need to get back to messing with. It's been a number of sleeps since I last looked.18:27
TheJuliaFigures, you're doing hardware with Dell then. Yeah, I think the last time I did it, it had to be a spare disk out of the box18:34
TheJuliaWhich was clean in advance to not have corruption issues18:35
cardoeBelieve me. I'd rather hit my fingers with a hammer every day than deal with Dell.18:44
cardoeThe only thing he missed is that when we reset the BIOS we lose some settings and Ironic requires it to boot into IPA to change BIOS settings via redfish18:46
TheJuliawwwwwhhhhaaaattttt?!?19:05
* TheJulia feels like this has created a RAIFSD19:05
TheJuliaRedundant Array of Independent Finger Smashing Devices19:05
TheJuliaregarding OVN, we likely also need to be aware of https://bugs.launchpad.net/neutron/+bug/199507819:07
cardoeTheJulia: https://docs.openstack.org/ironic/latest/admin/drivers/idrac.html#pxe-reset-with-factory-reset-bios-clean-step19:29
cardoehttps://opendev.org/openstack/ironic/src/commit/c80b8bfdb2eb18d49b049f093c8c79ffd5cac164/ironic/drivers/modules/redfish/bios.py#L189-L190 doesn't have requires_ramdisk=False19:34
TheJuliaEasy to fix I guess :(19:41
opendevreviewJay Faulkner proposed openstack/ironic master: devstack: respect USE_VENV in Ironic  https://review.opendev.org/c/openstack/ironic/+/93077621:01
JayFgmann: ^ I think that is updated in a way that it should make your grenade change happy, I'll depends-on it to check after I ensure it passes current-ci/grenade21:02
iurygregoryJayF, https://review.opendev.org/c/openstack/ironic/+/930776/15/devstack/lib/ironic#1091 you changed ) for ( 21:12
JayFfixed, ty21:13
opendevreviewJay Faulkner proposed openstack/ironic master: devstack: respect USE_VENV in Ironic  https://review.opendev.org/c/openstack/ironic/+/93077621:13
iurygregoryyw21:13
keekzhi all, i'm reading through https://docs.openstack.org/ironic/latest/admin/drivers/redfish/metrics.html but it's unclear how to actually enable the sending of the redfish metrics. in ironic config i have enabled sensor_data, configured metrics backend, configured oslo notifications, but nothing is making it to the oslo notifications rabbit queue. i see ironic conductor collecting sensor data, but it's not shipping it, and no 21:55
keekzerrors in logs21:55
opendevreviewJay Faulkner proposed openstack/ironic master: devstack: respect USE_VENV in Ironic  https://review.opendev.org/c/openstack/ironic/+/93077621:57
JayFkeekz: I think you need https://github.com/openstack/ironic-prometheus-exporter21:57
JayFhmm it should come out via notifications per that doc21:58
JayFwhat is your notification_level?21:58
keekzthe docs are light, but i was under the impression the ironic-exporter read from the rabbit queue?21:58
keekzdebug21:58
JayFI don't know much about our hardware metrics22:00
JayFI'm digging22:00
keekzbefore i added [metrics] backend to my ironic.conf, i was getting an error that i've since lost in scroll which said i needed to configure metrics. added the metrics section and no more errors... but no metrics either :)22:01
JayFCan you paste a redacted version of your config?22:03
JayFI suspect metrics config has nothing to do with this but imbw22:03
JayFwell, both ways work I mean22:03
JayFalso your [oslo_messaging_notifications] is setup, right? 22:04
keekzwell [metrics] was required or else it gives an error. let me make a gist22:07
iurygregorywe had some changes if I recall22:08
iurygregorylet me get the right config you should use22:08
JayFwell lets look at the config they're using first :P 22:09
keekzhttps://gist.github.com/nicholaskuechler/5b2d7cd183c8dee47d826b16caa53ac1 - pretty simple. fwiw once i get this working i'll update that redfish metrics page with the missing steps / configs22:09
JayFaha the real name did it for me, we worked together at rax, yeah?22:09
keekzand actually - i just found the notifications have arrived in rabbit, but in a different queue than the other notifications :)22:10
JayFkeekz: you need transport_url and topics in that oslo_messaging_notifications section22:10
iurygregorydriver = messagingv2 I think this is wrong22:10
iurygregory=)22:10
iurygregoryyeah22:10
keekzyep that's me, jay :) 22:10
JayFit's right, just needs to have topics/transport urls set22:10
JayFin that case I'll send your manager my hourly rate /s 22:10
iurygregory[oslo_messaging_notifications]22:10
iurygregorydriver = prometheus_exporter22:10
iurygregorytransport_url = fake://22:10
iurygregorylocation = /opt/stack/node_metrics22:10
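The notification settings discussed here can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming only the option names mentioned in this conversation (`transport_url` and `location` for the `prometheus_exporter` driver, `transport_url` and `topics` for `messagingv2`). The helper name `missing_notification_options` and the `REQUIRED` table are hypothetical and are not part of Ironic or oslo.messaging.

```python
import configparser

# Options each driver needs, as stated in this conversation (an assumption,
# not copied from the oslo.messaging or ironic-prometheus-exporter docs).
REQUIRED = {
    "prometheus_exporter": ("transport_url", "location"),
    "messagingv2": ("transport_url", "topics"),
}

SECTION = "oslo_messaging_notifications"


def missing_notification_options(conf_text):
    """Return the names of options missing from the notifications section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section(SECTION):
        return ["[" + SECTION + "]"]
    driver = parser.get(SECTION, "driver", fallback=None)
    if driver is None:
        return ["driver"]
    # Unknown drivers have no entry in REQUIRED, so nothing is reported.
    return [
        option
        for option in REQUIRED.get(driver, ())
        if not parser.has_option(SECTION, option)
    ]


# The snippet pasted in the channel, used here as sample input.
SAMPLE = """
[oslo_messaging_notifications]
driver = prometheus_exporter
transport_url = fake://
location = /opt/stack/node_metrics
"""

print(missing_notification_options(SAMPLE))
```

Run against a config that sets only `driver = messagingv2`, the helper reports `transport_url` and `topics` as missing, which matches the advice given in the channel.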
JayFstill at rax?22:10
keekzyep, and i'm working on doug's team. so you can just send him the bill22:10
keekzspeaking of ironic exporter, is there a way to run it without a log file like that? i briefly looked in to it but our ironic is in kubernetes and it appeared to want actual log files, systemd services, etc.22:13
iurygregorywe forgot to update the docs when we required the new config [metrics]22:13
iurygregoryIPE is basically a flask app that will read all files in the location you have set22:14
iurygregorywe receive the data from ironic and turn into a file for each node22:15
keekzyeah, i stumbled through that myself. the https://docs.openstack.org/ironic/latest/admin/drivers/redfish/metrics.html didn't mention any config options needed so i did some spelunking to find them22:15
iurygregoryand later you can connect prometheus to scrape the data22:15
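The flow described here (one file per node written under `location`, then read back and served for Prometheus) can be illustrated with a toy sketch. It is not the exporter's actual code: the function names, the file names and the `example_temp_celsius` metric are all hypothetical, and the real exporter serves the result through a Flask endpoint rather than returning a string.

```python
import pathlib
import tempfile


def collect_node_metrics(location):
    """Join the contents of every regular file in `location` into one payload."""
    parts = []
    for path in sorted(pathlib.Path(location).iterdir()):
        if path.is_file():
            parts.append(path.read_text().rstrip("\n"))
    return "\n".join(parts) + "\n" if parts else ""


def demo():
    # Hypothetical per-node files; real file and metric names will differ.
    with tempfile.TemporaryDirectory() as location:
        root = pathlib.Path(location)
        (root / "node-a").write_text('example_temp_celsius{node="node-a"} 41\n')
        (root / "node-b").write_text('example_temp_celsius{node="node-b"} 39\n')
        return collect_node_metrics(location)


if __name__ == "__main__":
    print(demo(), end="")
```

Because the payload is rebuilt from files on every read, the directory has to be reachable by both the writer (the conductor's notification driver) and the reader, which is why the shared `location` matters in containerised deployments.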
keekzsounds not very kubernetes friendly?22:15
JayFiurygregory: can you fix the doc?22:15
iurygregoryI can22:15
JayFkeekz: it's a k8s-style prometheus exporting flow bolted onto a statsd/message queue based metrics system22:16
iurygregorybut keekz mentioned he would submit a patch if I understood22:16
JayFkeekz: so ... yeah, it's not a perfect fit, but it's close22:16
iurygregoryso I don't want to jump and send if he was planning on submitting the patch 22:16
iurygregorykeekz, are you using Metal3?22:16
keekzyes i can update docs on https://docs.openstack.org/ironic/latest/admin/drivers/redfish.html - are there other places to add to?22:17
iurygregoryhttps://docs.openstack.org/ironic-prometheus-exporter/latest/configuration.html22:17
keekzno, not using metal322:17
JayFiurygregory: keekz is cardoe's environment fwiw :) 22:17
JayFif you've heard him talk about his at all, that's where keekz is at22:17
iurygregoryoh nice!22:18
iurygregoryso rax is rackspace?22:18
cardoeyep22:18
iurygregoryI was thinking it was the cloud provider we have in the infra for opendev :D22:18
keekzyep, 'rax' was the old stock ticker22:19
iurygregorybecause we have one called RAX :D22:19
keekzit is22:19
cardoeiurygregory: it is.22:19
iurygregoryOMG22:19
iurygregory=O22:19
keekzdifferent team, same company22:19
iurygregorygood to know22:19
keekzalthough we've done a lot of work over the years on that environment as well :)22:19
cardoeThe new stuff is called "flex" I think it's called raxflex or rxtflex in the openinfra configs.22:20
iurygregorykeekz, if you have any trouble with the IPE you can directly ping me, since I was the one who wrote it o/22:21
JayFWhen I worked at RAX, we got in trouble for calling it RAX22:23
JayFbecause they said people could accidentally think you're talking about the stock, not the company22:23
cardoeSo keekz is doing the needful for kicking the tires and doing some of the short-term functionality. But long term I think we'd probably look to help extend to be a bit more Prometheus native or catching that event stream directly.22:23
JayFnow the stock ticker is RXT and they don't pay me a dime, so I can call it RAX all day long :P 22:23
JayFcardoe: hooking into the notifications stream is the key there I think22:23
JayFcardoe: you aware of https://github.com/openstack-exporter/openstack-exporter22:24
cardoeyeah keekz is running that now22:24
iurygregoryI wasn't aware <eyes>22:24
keekzyeah i set up openstack-exporter, that was pretty straight forward22:24
cardoespeaking of openstack-exporter, I proposed fixing the auth stuff in gophercloud more fully. stephenfin joined in the convo. It'll likely be a v3 thing.22:25
cardoeBasically I said that internally they need to treat everything like a clouds.yaml entry and auth based on that entry.22:25
keekzdoes that replace ironic-exporter? i haven't really done a deep dive except to get it to do what i wanted, which was to hook in to prometheus for basic api up/down alerts22:25
iurygregorykeekz, not that I'm aware ...22:25
iurygregorybut it does have ironic metrics22:26
iurygregorybut not the ones we collect from the bmc with sensor data22:26
cardoeSo another thing we'd want to do is have redfish eventing be part of that pipeline of data.22:26
keekzyeah it's more of what you can see in an `openstack baremetal node list` which is useful, but we were wanting to investigate some of the hardware health info redfish can provide22:27
iurygregoryhumm interesting22:27
cardoeSo if the hardware tells us its bad, put that on the notification stream as well.22:27
iurygregoryThe event subscription I added to redfish is via vendor passthru (and probably needs an update if things changed in the redfish schema)22:27
iurygregorybut is not integrated with IPE22:28
cardoeyeah I figure we'll need to commit some time to experimenting and then drafting some RFEs and specs around what would work.22:29
cardoeThe current hardware monitoring stacks (cause there's many different things) all live outside of Ironic22:30
iurygregoryyeah22:32
cardoeBut we look at it all and there's commonality between them and with the improvements in redfish the idea is could we do something generic inside of Ironic that can be built on.22:32
iurygregoryalso, try to not lower the interval to collect data a lot, otherwise you can cause problems to the BMC (but I think you are aware)22:32
iurygregorycardoe, totally agree22:33
keekzyeah some of the bmcs are painfully slow. in some of the ironic vendor docs i did find some performance tuning settings though 👍22:38
keekzi'm working on updating those docs (both the ironic-exporter and ironic redfish metrics) but it's getting late for me, so i'll have something for review tomorrow22:47
opendevreviewNicholas Kuechler proposed openstack/ironic-prometheus-exporter master: docs: Updates configuration documentation  https://review.opendev.org/c/openstack/ironic-prometheus-exporter/+/93245823:18
opendevreviewNicholas Kuechler proposed openstack/ironic-prometheus-exporter master: docs: Updates configuration documentation  https://review.opendev.org/c/openstack/ironic-prometheus-exporter/+/93245823:21
