Monday, 2024-03-11

rpittaugood morning ironic! o/07:41
rpittauJayF: great job on the highlights :)07:46
opendevreviewAdam Rozman proposed openstack/sushy-tools master: improve volume boot mode support  https://review.opendev.org/c/openstack/sushy-tools/+/91211309:28
opendevreviewMohammed Boukhalfa proposed openstack/sushy-tools master: Add fake_ipa inspection, lookup and heartbeater to fake system  https://review.opendev.org/c/openstack/sushy-tools/+/87536611:10
iurygregorygood morning Ironic11:33
iurygregorydtantsur, I'm looking at your comments in https://review.opendev.org/c/openstack/ironic/+/912336 , wondering about the eject part... this is what I had in mind https://paste.opendev.org/show/bHmFZsDXcYzEh2LWOy16/ so I'm trying to understand the case you mentioned12:18
* iurygregory brb, need to go to the insurance company to solve things about the car =X12:25
opendevreviewMerged openstack/ironic stable/2023.1: Special case lenovo UEFI boot setup  https://review.opendev.org/c/openstack/ironic/+/91044712:27
opendevreviewMerged openstack/ironic master: ci: support overriding the service project name  https://review.opendev.org/c/openstack/ironic/+/90922112:49
dtantsuriurygregory: "If a media was inserted trough Managers first and we update the BMC and now vmedia is trough Systems, I think the media was "automatically" ejected no?" - are you sure about this?13:02
TheJuliaIs that defined in the standard? I would hesiate unless it is explicitly defined13:21
dtantsurThat's basically my concern13:21
TheJuliaand *where* it is defined as well, we've seen meaning in schema spec be misconstrued13:21
TheJuliamultiple times, sometimes the same exact field13:21
opendevreviewRiccardo Pittau proposed openstack/ironic master: [trivial] add device_type param to attach_vmedia_device  https://review.opendev.org/c/openstack/ironic/+/91245714:11
opendevreviewJulia Kreger proposed openstack/ironic master: Add redfish https boot CI job  https://review.opendev.org/c/openstack/ironic/+/90109014:15
iurygregorydtantsur, not really, but in case of the Lenovo you won't even have VirtualMedia available trough Managers, so how we would eject? 14:31
dtantsuriurygregory: we don't then?14:31
dtantsurjust make sure we don't try to access the missing attribute14:32
iurygregoryI'm ok with attemting to eject in both Systems/Managers, we just need to be sure they have the VirtualMedia resource 14:32
dtantsuryou can do something like "if there is virtual media in systems, MissingAttributeError can be caught and ignored on the high level)14:33
iurygregorydtantsur, yeah, basically check if Virtual Media is available in System (if yes we eject), check if is available in Managers (if yes, eject)14:33
* TheJulia likes this idea14:35
dtantsur++14:35
iurygregoryawesome, I will talk with winicius to update things o/14:36
iurygregoryshould we bump sushy in a separate patch or nope?14:37
dtantsurwe can bump here as long as we remember to remove the bump in backports14:41
iurygregoryalso, I've noticed we have 5.0.014:44
iurygregoryshould we bump to it? and not only 4.8.0?14:44
rpittauiurygregory: that should be the minimum version needed for that to work14:47
dtantsuryep, we never bump "just in case"14:47
iurygregoryrpittau, yeah, that's why I told him to use 4.8.014:47
rpittauack14:47
iurygregoryjust wanted to be sure =)14:47
JayFHappy DST meeting day for 'muricans :D 15:00
JayF#startmeeting ironic15:00
opendevmeetMeeting started Mon Mar 11 15:00:39 2024 UTC and is due to finish in 60 minutes.  The chair is JayF. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'ironic'15:00
rpittauo/15:00
iurygregoryo/15:00
* TheJulia grumbles about daylight savings time15:00
dtantsuro/15:00
haoZhouo/15:00
TheJuliao/15:00
JayFWelcome to the weekly Ironic meeting. We operate under the OIF Code of Conduct.15:01
JayF#topic Announcements/Reminders15:01
JayF#info  Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio: https://tinyurl.com/ironic-weekly-prio-dash15:01
JayF#info  Project Teams Gathering (PTG) will be held from Monday, April 8 to Friday, April 12 2024 15:01
JayF#link https://etherpad.opendev.org/p/ironic-ptg-april-202415:01
JayF#info  Ironic Meetup/BareMetal SIG June 5, OpenInfra Days June 6 @ CERN. Signup at https://indico.cern.ch/event/1378171/ and https://indico.cern.ch/event/1376907/15:01
JayF#topic Caracal Release Schedule15:01
JayFIt's R-3 week, RC1s are due RC-ing projects by March 14th.15:02
JayFWe don't RC, but that also serves as a good point to us to get our release going15:02
JayFIf anyone has a specific change they want in Caracal, please let me know15:02
JayFI'm going to be focusing on landing stragglers and cutting releases this week 15:02
iurygregoryWe will probably want https://review.opendev.org/c/openstack/ironic/+/91233615:03
iurygregorythis week? :O15:03
iurygregoryoh wow15:03
dtantsur++ to this change15:03
JayF3 weeks until the release is final-final, and we can't ask releases team to take our deliverables last minute15:03
JayFso yeah, we gotta get moving :D 15:03
dtantsurand also https://review.opendev.org/c/openstack/ironic/+/907991?usp=dashboard if we want the new inspection to be (more usable) for non-standalone folks15:03
rpittauI'll try to squeeze in one more this week, fingers crossed15:04
JayFPlease let me know if you have a change up you want in so I don't ninja the release under you :D 15:04
dtantsurrpittau: you mean, vmedia implementation?15:04
rpittauyeah15:04
dtantsuroh yeah. API without a single implementation is not great15:04
rpittauyep15:05
JayFlets make sure we get things up fast :) time is coming up quickly15:06
JayF#topic Review Ironic CI Status15:06
JayFI'm pretty sure IPA gate is broken15:06
JayFI have a task on my todo list to dig into that today15:06
opendevreviewJulia Kreger proposed openstack/ironic master: Fix artifical rbac policy constraint that resulted in 500s  https://review.opendev.org/c/openstack/ironic/+/91096915:07
JayFI guess nothing else about CI status15:08
JayF#topic vPTG planning15:08
JayFPlease review https://etherpad.opendev.org/p/ironic-ptg-april-202415:09
rpittauJayF: I think that was (or is) metalsmith legacy job, I've retested a patch to verify15:09
JayFwe already have some pre-discussion happening15:09
JayFrpittau: thanks15:09
JayFrpittau: I am concerned it's a unit test issue15:09
JayFrpittau: so I'm not convinced that's the only bit, we'll see15:09
dtantsuryeah, I've left some comments today15:09
rpittauoh ok, didn't spot that15:09
JayF#topic Bug Deputy15:10
JayFI was the bug deputy, but realistically in name only15:10
JayFI'm happy to take it another week and do two weeks worth of triage :)15:11
JayFProbably a valuable activity to take before release, anyway15:11
JayF#topic Open Discussion15:11
JayFWe had an agenda item left15:11
JayF#info Please review BMC CA Cerfificate - ironic spec (https://docs.google.com/document/d/1Vxn-MrcXEnzeHaWvni4IE0WuQpR1QNWNC8EF65wXUho/edit) for RFE https://bugs.launchpad.net/ironic/+bug/2040236. Any comments are welcome.15:12
JayFMy first feedback will be to push it to gerrit for further feedback :)15:12
JayFbut I also intend on looking over it, in gdoc form and providing coarse feedback15:12
JayFAnything about this or anything else for open discussion?15:12
TheJuliawait, a google doc spec?15:12
rpittauyeah, they wanted to have a preliminary discussion on that :)15:13
dtantsurI think we should communicate that it's about time for real discussions15:13
JayFYeah, it was added to our agenda. It needs to be moved into gerrit but I am not going to hold that against them the first time around :)15:13
rpittauI agree15:13
JayFAnything else for open discussion?15:14
haoZhouI will send it to gerrit, thanks15:14
JayFhaoZhou: thanks for submitting it!15:15
JayFLast call for open discussion items15:15
TheJuliaOkay, so my $0.02 feeling on that is the addition of the option itself, as long as it is a singular option is likely okay without a spec15:15
TheJuliaI'm less and less a fan of driver specific field names unless absolutely necessary15:16
dtantsurI think JayF asked for the spec back in the RFE discussion days15:16
dtantsurTheJulia: well.. while a reasonable deployment should have the same CA for all BMCs, regardless of vendor, I can imagine a situation where some hardware simply cannot be updated15:16
dtantsurdunno15:16
JayFOr at least, it'd need to be aligned on something that is meant to reflect physical distance15:17
JayFlike conductor group15:17
dtantsurCould be a PTG topic because for now we have all these [redfish]kernel_append_params and so on15:17
TheJuliayeah15:17
JayF"a different conductor group needs a different CI" aligns with a design where you have one Ironic orchestrating multiple, separately owned/operated computer rooms15:17
JayFs/CI/CA/15:17
dtantsurThis is probably a different concern. You probably will have a different ironic.conf per conductor group.15:18
JayFooh15:18
TheJuliadtantsur: ++15:19
JayFEither way, gerrit is a good place to have that discussion :)15:19
dtantsurright15:20
JayFIs there anything else or should I close it up?15:20
JayF#endmeeting15:20
opendevmeetMeeting ended Mon Mar 11 15:20:54 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:20
opendevmeetMinutes:        https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-03-11-15.00.html15:20
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-03-11-15.00.txt15:20
opendevmeetLog:            https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-03-11-15.00.log.html15:20
TheJuliaoh, ipa's unit tests actually codify the underlying ironic-lib behavior15:29
TheJuliaso we could just do "yes, we called the library" mock level testing instead of "these commands got executed" level of verification15:30
TheJuliadtantsur: and regarding the parameter for the bmcs's ca, I'm not arguing about it being settable, I'm arguing about <drivername>_verify_ca being stamped all over the place, even on drivers which... have a shortend lifetime ahead of them15:32
dkingI see that the cleaning step `erase_devices` dispatches a call to `erase_block_device` across hardware managers, which makes it simple to override the erasing of block devices with custom code. However, I am not seeing a similar option for `erase_devices_metadata`.15:33
dtantsurTheJulia: I assumed we would only add it to the drivers that can support this option?15:33
JayFdking: erase_devices_metadata *is* the target of the hwmanager call15:34
JayFdking: just define your own in a higher-prio hwm15:34
TheJuliadtantsur: I can see sort of see an option, regardless of name making sense, It doesn't make sense to me to enabel a pattern of "ilo_verify_ca" and "redfish_verify_ca", because I have customers which will try to set it expecting both options to do something15:35
dkingI would like to be able to skip erasing metadata, specifically lvm and volume groups while allowing other drives and even volume groups to be wiped. Is there an easier way to do that than to essentially rewrite the entire `erase_devices_metadata`?15:35
TheJuliaeven when on only one of those drivers...15:35
JayFdking: the erase_devices bit is just there so you can implement "erase a single device" in a hardware-specific way15:35
JayFdking: the assumption is (I think) that you don't need hardware spceific help for metadata wipes15:35
TheJuliaIs anyone working on ironic-lib? I ask because In a couple hours I need to drive to LA15:35
TheJuliaerr15:35
TheJuliaironic-python-agent's ironic-lib testing15:36
dtantsurTheJulia: why not? I'm not sure I understood.\15:36
TheJuliahttps://www.irccloud.com/pastebin/0MbMHn35/15:36
dtantsurWe don't expect redfish_verify_ca and ilo_verify_ca to be different.. but they might in an extreme case15:36
JayFTheJulia: that's the ipa unit test issue I said I was looking at later, yeah?15:36
JayFyeah15:36
TheJuliadtantsur: but while being set on the same node is sort of the thing I'm concerned about15:36
JayFI already have time set aside :)15:36
TheJuliaJayF: okay, yeah, thanks!15:36
dtantsurTheJulia: wdym?15:37
dtantsurilo and redfish cannot apply to the same node15:37
dtantsurnot at the same time15:37
dkingJayF: I'm probably not doing something typical. What I'm looking to do at the moment is to skip wiping certain volume groups. So, it's not really a hardware problem. it's just that I want to make my own check before wiping them. Unfortunately in this case, I do want to wipe some other volume groups on the same node, so I still want the wiping functionality.15:37
JayFin my own hwm manager then, I'd override erase_devices_metadata15:37
JayFhave it get a list of devices15:38
rpittauoh you mean this https://review.opendev.org/c/openstack/ironic-python-agent/+/91048015:38
JayFfilter those devices for the set you care about15:38
JayFthen just either call the methods being used by the generichwm implementation or just copy+paste the actual metadata wipe code15:38
JayFdon't obsess over DRY'ing it up too much, it's OK to just copy+paste15:38
rpittauTheJulia: I ahve the fix for the unit test here https://review.opendev.org/c/openstack/ironic-python-agent/+/91048015:38
dtantsurMaybe we could improve that in IPA too15:38
dkingJayF: Yeah, that's hat I'm probably going to have to do. I just hate the idea of copying over the entire general erase_devices_metadata so that I can add in a skip. That means that I would lose any updates that get made to that method upstream, assuming that any ever do get made.15:39
JayFnot often :)15:39
JayFand if your hardware doesn't change, it's a feature not a bug, imo, that it doesn't change15:39
iurygregorydtantsur, what do you mean by normal AttributeError? https://review.opendev.org/c/openstack/ironic/+/912336/comment/3bc72da4_cb011dee/ doesn't sushy only raises MissingAttributeError  in such cases? https://opendev.org/openstack/sushy/src/branch/master/sushy/exceptions.py#L41 15:40
TheJuliadtantsur: I have a some customers who will try regardless to set both settings, regardless adding multiple options just starts to create divergent code which is driver specific, when those drivers are largely just being maintained with only upstream community engagement15:40
rpittauJayF: also fix for that unit test that I forgot :D https://review.opendev.org/c/openstack/ironic-python-agent/+/91048015:40
TheJuliadtantsur: I guess, I'm worried about added perception and interaction complexity more than anything else15:41
dtantsurTheJulia: "customers changing things they don't understand" is a problem but not necessarily a sort of problem that must unconditonally drive our decisions15:41
TheJuliadtantsur: I'm not saying unconditionally, I'm just saying I'm not a fan of stamping driver specific verify_ca options all over the place without intention15:41
dtantsurI'm not overly against just [conductor]bmc_verify_ca, but they we need to develop a policy around it15:41
TheJuliaagree 1000000%15:42
dtantsurso that we don't doing similar things, possible deprecate [<driver>]kernel_append_params and so on15:42
TheJuliaMore towards the ptg topic level of discussion, we likely need to message what the realistic lifetime and path is for, for example, the ilo hardware type15:43
TheJuliaand explicitly begin that process of expectation setting15:43
dtantsurYeah.. which reminds me, I need to have this conversation in Metal315:43
TheJulia++15:43
TheJuliasame for the classic drac driver, and likely more urgently because of it15:43
dtantsurhas ironic communicated the deprecation already?15:43
TheJuliawhere as ilo, we at least have contacts which will respond/engage15:44
dtantsurThe problem is: ilo4 is still quite actively used15:44
TheJuliafor classic drac I believe we did15:44
TheJuliafor ilo, I think we painted the wider context15:44
JayFre: verify_ca15:44
TheJuliamaybe not an official deprecation, but we should go ahead and set that path out15:44
JayFI think we need to think about models -- e.g. "this is a CA for BMCs" vs "this is a CA for agents"15:45
JayFand if we go that route, of making it not-driver-settable, we need to be really explicit about our "zones" of ssl verification15:45
TheJuliadtantsur: yeah, and I would expect ilo4 for a couple more years, it will sharply decline as the mean time between failures occur :)15:45
dtantsurand the ilo5 driver will probably live for longer... sigh15:45
TheJuliadtantsur: yeah, and we *will* hit a point, at some point, where the state of the proliantutils library will be the primary driver for us going "time to pull support"15:46
dkingJayF: Maybe then as a separate question, would there be any use in either a dispatch_to_managers for something like erase_block_device_metadata, which in the HardwareManager class simply calls disk_utils.destroy_disk_metadata, so that it could be overriden, or perhaps maybe even just add some type of hook to run a check before proceeding?15:46
TheJuliaI think that, and similar with dracclient, will be the prime forcing functions15:46
dtantsurThe problem may be RAID. ilo5 had RAID support via proliantutils but not redfish.15:47
JayFdking: I'd have to see the change. I'm not -1 outta hand to it by any stretch, but that's code that's called in almost every Ironic deployment with cleaning so we should be careful -- and I think you're probably overly concerned about the risks of just overriding that method. Just do it! It's fine! 15:47
TheJuliadtantsur: ugh15:48
TheJuliadtantsur: but with at least a stance written down, we have something to stand on on if we need to take any level of drastic action15:50
dtantsurtrue15:51
dtantsurhave we deprecated ibmc too?15:51
TheJuliaI believe so!?15:51
TheJuliaAt this point, I'd have to search15:51
JayFYes15:52
JayFI put the high level in cycle highlights15:52
TheJuliaJayF: https://review.opendev.org/c/openstack/ironic/+/911158/4/devstack/lib/ironic#586 is not working :( (le sigh)15:53
JayF  - Several Ironic drivers have been deprecated in favor of more modern, redfish-based drivers. The ``ibmc``, ``xclarity``, and ``idrac-wsman`` drivers will be removed during a future development cycle. Operators utilizing these drivers are encouraged to use the redfish hardware type instead. Additionally, users of the ``ilo`` hardware type on newer ILO6-based hardware will now be prompted to use ``redfish`` instead.15:53
dtantsurthanks!15:54
JayFTheJulia: I can add it to the list, but it's lower prio than IPA master branch. I'll be starting from scratch15:54
JayFBTW, we said it'd be removed in 2024.215:54
JayFI think we need to wait longer15:54
JayF(in release notes)15:54
JayFbut we can discuss that when removal time comes15:54
TheJuliaAnyway, I'm going to be hitting the road to drive to LA in ~45 minutes15:57
TheJuliadepending on traffic, I might not be on first thing in the morning15:57
TheJulia(i.e. the "no way I'm spending that much time in traffic" drive home" scenario15:57
dkingJayF: I'm doing that, but there's a couple of frustrations. For one, I now also have to maintain the proper version of ironic-libs in my requirements, but also, now I have to go through the trouble of making sure that I mock out `disk_utils.destroy_disk_metadata` in my unit tests. 16:05
JayFIf you're using a patch to do this, you're taking the wrong approach16:05
JayFyou should be able to make an indepedent hardware manager, like one of the examples we publish, that just implements that one command16:06
JayFand it will override16:06
dkingJayF: Yes, that's what I'm doing. But I still want the metadata wiped for all other block devices. So, in my custom hardware manager, I've having to re-implement erase_devices_metadata so that I can add in a check before it calls disk_utils.destroy_disk_metadata. But that means that disk_utils.destroy_disk_metadata would still get called in my own unit tests and it means that I'll have to have the package installed from which i16:08
dkingt is called.16:08
JayFactually, there's a better way than that, even16:08
JayFlemme get the docs16:08
JayFyou're really close :D16:08
dkingJayF: Great. 16:08
JayFhttps://docs.openstack.org/ironic-python-agent/latest/contributor/hardware_managers.html#custom-hardwaremanagers-and-cleaning16:09
JayFso you should be able to:16:09
JayFhmmmm16:10
dtantsurSide note: I plan to move the entire disk_utils into IPA. but that won't help the current situation.16:11
JayFyeah, we'd need a dispatch style like we use for erase_devices16:11
JayFbecause we can't do the "raise IncompatibleDeviceError and have upstream handle it" when the caller is saying "do N things"16:12
JayFso I see two paths: 1) https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/hardware.py#L3422 make this method a dispatch_to_managers() call everywhere so you can override it OR16:12
JayF2) as you mentioned above, change this to call out to dispatch inside the for loop here  https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/hardware.py#L175716:13
JayFI think that's the only non-patch way to override this method *without also* reimplementing it entirely/copying it out16:13
JayFwhich I still think is the route I'd take in your shoes16:13
dkingYes, any of those are good for me.16:13
dkingThat's what I'm doing for now to get the current functionality, but it's a hack and not terribly maintainable.16:14
JayFwell, that method barely ever changes 16:14
dking*what I'm working on now, because of this duscussion.16:14
dkingI understand, but even so it makes a few calls locally, including the important one, which isn't even in the ipa package itself, but the ironic-lib package. And all of that just means extra work even on top of hoping that none of the dependencies or original code changes.16:15
dkingit's not bad, but it's not ideal, and I can imagine that it would be generally useful for people to be able to cleanly make exceptions to what block devices (or metadata in this case) get wiped.16:16
JayFthese are the basic motivations around why we did it for erase devices16:16
JayFI'm sold on the value; but I also don't think it's likely to be a personal priority to implement16:17
JayFbut like with anything I'm happy to review and help where I can if someone else is going to16:17
opendevreviewDmitry Tantsur proposed openstack/ironic master: Switch to qemu-img functions from ironic-lib 6.0.0  https://review.opendev.org/c/openstack/ironic/+/91247116:18
opendevreviewVerification of a change to openstack/ironic master failed: Special case lenovo UEFI boot setup  https://review.opendev.org/c/openstack/ironic/+/90894616:27
dkingJayF: Actually, I just re-discovered get_skip_list_from_node, which might actually work. It looks like _list_erasable_devices has a default list of block_devices which is set to list_block_devices_check_skip_list, which in turn, in the default hardware manager, calls get_skip_list_from_node.16:32
JayFoh, score16:32
JayFif you find a way to finagle this in16:32
JayFplease do doc it up?16:32
JayFI'm not surprised there's a path, I went looking for it16:32
opendevreviewDmitry Tantsur proposed openstack/ironic-python-agent master: [WIP] Import disk_{utils,partitioner} from ironic-lib  https://review.opendev.org/c/openstack/ironic-python-agent/+/91247616:32
TheJuliayeouch CI16:32
dtantsurrpittau: will ^^ make fixing unit tests easier or harder?16:32
TheJuliahttps://www.irccloud.com/pastebin/Om7sJpM1/16:33
JayFTheJulia: side effect from our drain code, perhaps?16:33
dkingJayF: I am confused, though. In the abstract HardwareManager, get_skip_list_from_node raises errors.IncompatibleHardwareMethodError. However, I don't see it ever being called by dispatch, it it looks like that's only useful if other methods or implimented along the way.16:33
JayFTheJulia: where we might fire that log message inadvertantly?16:33
* JayF is hoping16:33
TheJuliaJayF: we don't fire it inadvertently16:33
JayFdking: so that means it actually is likely upstream-bugged16:33
TheJuliaJayF: it gets logged when we actually fail to update the heartbeat16:34
JayFdking: because we'd skip erasing *all* metadata if *one* disk was skipped16:34
TheJuliain a timely mannor16:34
JayFokay, yikes then16:34
rpittaudtantsur: mmm I don't think it would change much, but the problem is that the metalsmith legacy job also looks broken :/16:34
TheJuliaso in other words, when our write is not making it to disk (as in committed into the db)16:34
dkingJayF: yeah, that's what it looks like, but I haven't tested. 16:34
rpittaudtantsur: I'm rerunning CI here https://review.opendev.org/c/openstack/ironic-python-agent/+/910480 if it passes the metalsmith job I think it would be fast to merge16:35
dtantsurrpittau: it has just failed the metalsmith job16:35
rpittau\o/16:36
rpittaualright so we need to check that too :/16:36
dtantsurTheJulia: did you mean to address comments on https://review.opendev.org/c/openstack/ironic/+/907991 now or in a follow-up? Especially the docs update is not something I can trivially do between other tasks.16:42
dtantsurotherwise, it has 2x +216:42
TheJuliafollow-up is fine16:42
TheJuliafrom my point of view16:42
dtantsurCould you do a +W then?16:51
rpittauyep, metalsmith-integration-ipa-src-legacy in IPA is foobar16:55
rpittau'Failed to create config drive on disk /dev/vda'16:55
rpittaumunmap_chunk(): invalid pointer16:55
dtantsurWUT16:58
rpittauthat's parted doing weird things17:00
TheJulia.... is it too late to take a vacation?17:00
rpittaulol17:00
dtantsurI'm dreaming of a vacation since.. a while17:01
TheJuliaditto17:01
dtantsur(my last attempt failed miserably, which did not help at all)17:01
TheJuliaI'm actually going to start heading towards los angeles shortly17:01
TheJuliaat least, to give a talk17:01
dtantsurKubeCon next week, maybe I can get some rest there?17:01
TheJuliabahahahahaha17:01
rpittauLA, being a while I was there :)17:01
TheJuliasorry17:01
dtantsurc'mon, I can hope! :D17:02
TheJuliadtantsur: indeed, I highly recommend the mid-week morning nap17:02
TheJuliawhere you end up sleeping in by accident all morning17:02
dtantsur++17:02
dtantsurmy most important events during the conference will happen in the evening anyway :)17:02
TheJuliarpittau: doing a talk for the local openinfra meetup17:02
TheJuliawish me luck17:02
JayFNobody mailed me a random gift, perchance, did they? 17:02
dtantsurgood luck TheJulia!17:03
rpittauTheJulia: nice! have fun! :)17:03
TheJuliaJayF: I... don't... think so.17:03
JayFI had a piece of the first cray supercomputer, mounted, show up in my mailbox the other day with a vague anonymous note 17:03
TheJulianice17:03
JayFand it sounded ironic-adjacent so I figured I'd ask here17:03
JayFyes, exactly17:03
JayFI need to find, thank, and encourage this person to keep doing it 17:03
JayFlol17:03
TheJuliaheh17:03
JayFooooooooooooh I bet I know who it was17:03
rpittaumy brain refuses to go on, see ya tomorrow! o/17:05
* TheJulia needs coffee17:05
TheJuliacoffee is like a turbo button17:05
* TheJulia just dated herself there17:05
JayFHere's the real question: How much faster did a turbo button being pressed make computers go?17:06
TheJulia2.663Mhz faster17:06
dtantsurI was too small and never risked pressing the button :(17:06
TheJuliajust enough to brak the graphics17:06
TheJuliabreak the graphics17:06
TheJuliaGranted, those were CGA graphics17:07
JayFyou fell for the trick question17:07
JayFtraditionally, *enabling* turbo mode slowed the machine to 4.77mhz17:07
JayFdisabling turbo mode made it run at full speed, breakin' your games17:07
TheJuliaTIL17:07
JayF(unless your system builder hooked the button up backwards, which was also common since enabling turbo mode to go faster is weird)17:07
JayFs/en/dis/17:08
dkingJayF: The more I look at the code, the more it looks like a lot of the methods in the abstract HardwareManager class were never really intended to be abstract. They seem to depend deeply upon each other and are called with `self.` instead of being dispatched to manager. Essentially, the way they are working, it seems like they might as well just be moved to the GenericHardwareManager, but I think that it would be nice if they w18:03
dkingere instead dispatch_to_managers. Am I missing something?18:03
JayFEvery method in a clean step18:04
JayFis dispatched to managers18:04
JayFso if you have clean step X -> it gets dispatched18:04
JayFthere are only a few cases we'd need a "nested" dispatch18:04
JayFand that's when the step is plural18:04
JayFe.g. "erase_block_devices" has to then break into dispatches of "erase_block_device" in order for us to both ahve a step that says "erase everything" and the ability to override the erase of a single thing18:05
dkingJayF: Yes, the steps. I'm talking about the list of methods at the top of the abstract HardwareManager which simply `raise errors.IncompatibleHardwareMethodError` as if they are expected to be implemented by the custom hardware manager.18:05
JayFotherwise, the usage pattern expected is create a custom HWM with a higher priority and override the method18:05
JayFit's an abstract class18:05
JayFit's not supposed to implement anything?18:06
JayFI'm confused as to what you're asking18:06
dkingFor instance, list_block_devices. Let's say that you looked at the HardwareManager abstract class and saw that it simply raises an IncompatibleHardwareMethodError, so you assumed that you could just write your own in your custom hardware manager. Let's say that you first did a check to see if you wanted to handle it, and then, deciding that you don't here, you raised the IncompatibleHardwareMethodError instead, just like in the18:17
dking abstract class, expecting that it would go on to be handled by the default hardware manager. From what I see in the code, what would happen instead is that all the placess that try to list_block_devices would themselves throw IncompatibleHardwareMethodError because it's not dispatched and nothing is there to catch it.18:17
JayFso we call *all* possible hwm18:18
JayFif you have Custom1, Custom2, Generic in that priority order, and we call method_only_in_generic18:18
JayFit'll call Custom1.method_only_in_generic() -> Incompatible, so it'll call Custom2.method_only_in_generic() -> Incompatible, so it'll call Generic.method_only_in_generic() -> succeeds18:18
dkingJayF: That's how it would seem to me that it _should_ work, but I don't think that it does for these cases, because they aren't being dispatched. It looks like it's calling these on `self.`. I'm going to make a proof of concept to confirm what I suspect and see if that's helpful.18:26
opendevreviewMerged openstack/ironic master: Add inspection PXE filter service  https://review.opendev.org/c/openstack/ironic/+/90799118:31
JayFso I think what you're saying is that18:31
JayFoften GenericHardwareManager implementation details are not dispatched18:31
JayFand that is likely by design; anywhere we dispatch we're creating a de-facto API/plugin interface18:32
JayFso like, e.g. def some_erase_method(): self.do_pre_checks()18:32
JayFyou can't override do_pre_checks18:33
dkingWhat I am saying is that methods which exist in the abstract HardwareManager (not GenericHardwareManager), but which only raise IncompatibleHardwareMethodError, like list_block_devices, seem to be implying that they could be implemented by simply adding them to your custom hardware manager, and that if they do throw IncompatibleHardwareMethodError, then they would be run again by another hwm.18:39
JayFaha18:44
JayFI think I found the disconnect18:44
JayFI suspect list_block_devices may need to be gone from abstract HWM18:44
JayFwhen it was removed from GHWM18:44
dking^18:44
JayFthat would explain some of the confusion and why I've been confused too lol18:45
dkingBut the problem is that list_block_devices and others may actually be useful to be overridable.18:45
JayFwell, it's a pressure on both sides18:45
JayFthere are two ways to customize: configuration, which we've added more and more of18:45
JayFand plugins/hardware managers18:45
JayFoften the difficulty is ... if you override list_block_devices, you'd also likely be disabling some documented Ironic functionality provided by IPA18:46
dkingI found it because I was looking for a way to have certain block devices be skipped during cleaning. There's a lot of great methods which do checks, but none of them are simple to override. Essentially, there's no good existing hook I can find yet to override that. I know that I _could_ add them in the properties, but I would like to do it in my custom hwm because I want to use code (in this case to check lvms and volume groups18:47
dking for certain names).18:47
JayFYeah, I think the place we maybe disagree, design wise, is I basically think at the point you want to use code to pick devices to skip18:48
JayFI kinda think you should take over the whole process18:48
JayFe.g. just overriding erase_devices_metadata (or disabling the built-in one and using a step of your own)18:49
JayFthat's the pattern I've seen others follow (including myself) in this space in the past18:49
dkingThere's lots of useful stuff in IPA that I want to keep. I don't want to replace any part of that. I just simply want to hook in and say "Oh, and skip these devices, too."18:49
JayFI think that's a feature rather than a plugin. I suspect our answer to that request (if you filed an RFE) would be that a feature to skip based on some metadata would be a good one 19:04
dkingJayF: So, if the official position is that these should simply not exist as overridable methods and should just be removed from HardwareManager in favor just existing in GenericHardwareManager, that's fine. Then, I probably could just add in an overridable method which would allow for that sort of thing. Perhaps, something like `def get_additional_skip_list(self, block_devices=none)` and maybe some wrapper method, like `def get19:05
dking_device_skip_list(self, block_devices=none, just_raid=False)` which combines that (by dispatch_to_managers) and get_skip_list_from_node, and then call that every place that get_skip_list_from_node is called.  Does that sound reasonable?19:05
JayFand in fact, we could probably wire root_device_hints into skip_device_hints or similar (I'm surprised we haven't already done this)19:05
JayFdking: s/official position/Jay's opinion/19:05
JayFthat sounds sensible, hard to follow entirely without seeing the code19:05
JayFbut I also think just like ... hooking up a feature that does what you want, or gives you the tools to do what you want19:06
JayFinstead of giving a plugin hook 19:06
JayFmight be a viable path, too19:06
dkingWell, device hints are great and we should use them, but I'd like something that can be run from hwm code, at least for my use case.19:06
JayFbut usually with stuff like this, whoever pushes the patch gets large amounts of input into how it's done19:06
dkingIf it's practical, I'm thinking that I'd write the patch. I wouldn't want to make anybody else write it. But I do think that it can be useful for more people than just me.19:07
JayFso I'd file an RFE about it anyway, lay out the approach you wanna take in there, and put it on the meeting agenda for rfe approval19:09
JayFthat'll get it before the eyes of enough people to say "yep" or "nope" at a high level before you get too much code written19:09
dkingBoth the code for root_device_hints and get_skip_list_from_node use the same il_utils.find_devices_by_hints, so if you want device hints from the node, I think that's already there. I'm just wanting to discover the devices at runtime.19:09
JayFyeah and if you're doing "LVM things named X get skipped" I doubt we'd wanna implement that19:10
JayFbecause it breaks our API contract a little bit (it lets something *on the instance* decide if it gets deleted or not, which is not good for our security model)19:10
JayFbut if we add a hook in that spot19:10
JayFand you write code that does it :D 19:10
dkingExactly. That won't be something that would be generic. But giving me a hwm hook to allow people to implement their own code is what I think is useful.19:11
JayFhonestly just adding a single19:11
JayFdispatch_to_manaagers(custom_device_skip)19:11
JayFin the appropriate place, with a true/false result19:11
JayFand a device object passed thru19:11
dkingthat way, it could be just about any silly thing. I'm sure people would want to skip cleaning for all kinds of things. Maybe they have a backup on a disk and want to leave the disks with that, or disks with certain volume groups, and so forth.19:11
JayFmight be a simple-to-implement piece19:11
JayFThis is a place where I like for it to be possible to do what you need19:12
JayFbut we gotta make sure the defaults are not a weapon pointed at anyones' foot :D 19:12
JayFwhich is why I'm giving cagey answers instead of solid answers w/o seeing the code lol19:12
dkingThat is simple, but there are several places where the HardwareManager uses a skip list (in particular, the one from hints from the node), so that it might be useful to write a wrapper which includes both, to make the code cleaner and more DRY.19:13
JayFI think DRY in this case is the enemy of extensibility. Not saying don't think about it; but it's less of a primary concern than being explicit19:13
JayFI'd rather someone who is a 2/10 on python, mainly a sysadmin, be able to understand what's going on than I do someone who is a 10/10 on python being impressed with our python ;) 19:14
JayF(in context of hardware managers)19:14
dkingDefinitely. However, I think that if somebody is going to roll their own "get_additional_skip_list" and populate its return with devices they want wiped, they've pretty intentionally at least tried to hammer their own foot.19:15
dkingIn this case, the DRY I was speaking about wouldn't just be for readability, but to prevent somebody from accidentally forgetting an important step.19:16
dkingAnd the names would all be rather obvious, as well as how it is used.19:16
JayFI'm interested to see what you come up with19:17
dkingThanks. And thank you for walking down that trail with me.19:18
opendevreviewJay Faulkner proposed openstack/ironic-python-agent master: Workaround unit test failures with extra mocking  https://review.opendev.org/c/openstack/ironic-python-agent/+/91249019:54
JayFTheJulia: ^ fixes for IPA, I went the short way around, may try to delegate the fixing of the overall unit tests19:55
JayFr/n I want to go look and see how this all landed in ironic-lib while breaking IPA unit tests19:55
JayFwe don't cross gate anything on ironic-lib, great19:56
JayFI guess that makes sense given changes have to happen there first19:56
jamesdenton__QQ: Does Ironic support neutron's trunk port functionality? Especially interested in how this would look with ML2/NGS20:05
*** jamesdenton__ is now known as jamesdenton20:05
JayFI know we use it downstream, with some minimal patches20:06
jamesdentonpatches to ironic or to ngs?20:06
JayFI *think* the patches are for our shenanigans, and that we support that generally upstream20:06
JayFngs 20:06
jamesdentonok, that's managable20:06
JayFI think we might have added some downstream filtering stuff (e.g. port X limited to VLAN Y/Z)20:06
jamesdentonis this actual vlan pruning on a server switchport or more related to switch uplinks?20:09
JayFactual knob-turning on switches20:10
JayFour patch is basically a downstream thing that lets NGS go "no" if someone says "plug port X into VLAN A" where VLAN a is not allowed for port x20:10
opendevreviewJay Faulkner proposed openstack/ironic stable/zed: stable only/ci: pin CI to dnsmasq 2.85/pin proliantutils/scciclient  https://review.opendev.org/c/openstack/ironic/+/91115820:12
JayFTheJulia: ^ and that should be the fix that needs, I hope. Appears we were setting the microversion wrong20:13
opendevreviewVerification of a change to openstack/ironic master failed: Special case lenovo UEFI boot setup  https://review.opendev.org/c/openstack/ironic/+/90894620:26
jamesdentonJayF that patch sounds nifty20:28
JayFI think there's an upstream version of it somewhere20:29
JayFhttps://review.opendev.org/c/openstack/networking-generic-switch/+/88804720:29
JayFthat may actually be mergable20:30
jamesdentonoh ok, it's a config setting... the allowlist20:30
jamesdentonmakes sense20:30
JayFTheJulia: rpittau: Apparently everyone is workign on the same thing? https://review.opendev.org/c/openstack/ironic-python-agent/+/910480 + https://review.opendev.org/c/openstack/ironic-python-agent/+/91249020:40
JayFfile that under an hour-ish wasted20:40
TheJuliawhy am I sitting outside, sipping a cold drink... in LA... when it is windy on a cool day21:20
TheJuliawhy?!21:20
TheJuliawe really need to get jamesdenton an irccloud account :)21:20
TheJuliathat, or glue his shoes to irc :)21:20
TheJuliaJayF: interesting, ack, thanks on the microversion21:22
opendevreviewMerged openstack/ironic master: Switch to qemu-img functions from ironic-lib 6.0.0  https://review.opendev.org/c/openstack/ironic/+/91247121:27
opendevreviewMerged openstack/ironic master: [trivial] add device_type param to attach_vmedia_device  https://review.opendev.org/c/openstack/ironic/+/91245722:07
opendevreviewSteve Baker proposed openstack/sushy-tools master: Add virtual-media-boot to openstack driver  https://review.opendev.org/c/openstack/sushy-tools/+/90676822:37

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!