Friday, 2024-11-22

JayFdking and others interested: our IPA-Builder elements *do* use constraints to install: https://opendev.org/openstack/ironic-python-agent-builder/src/branch/master/dib/ironic-python-agent-ramdisk/install.d/ironic-python-agent-ramdisk-source-install/60-ironic-python-agent-ramdisk-install#L14 00:00
JayFWe might want to add a bit to our docs/examples around hardware manager customization to demonstrate that you need to use U-C if you're setting up downstream HWM unit tests00:00
JayFI am not going to take that action today because it's my EOD and my todo list is long; but that's something that hopefully someone could take care of (maybe an operator who experienced this firsthand :D )00:01
opendevreviewMerged openstack/ironic master: Use quay.io registry image for metal3 job  https://review.opendev.org/c/openstack/ironic/+/93589500:08
opendevreviewSteve Baker proposed openstack/ironic master: Calculate missing checksum for file:// based images  https://review.opendev.org/c/openstack/ironic/+/93599201:27
opendevreviewSteve Baker proposed openstack/ironic master: Calculate missing checksum for file:// based images  https://review.opendev.org/c/openstack/ironic/+/93599201:36
rpittaugood morning ironic! happy friday! o/07:11
rpittauJayF: re bugfix cut: the plan was to release next week (last week of November) so we're more than good, I'll get the release request up ASAP 07:12
opendevreviewVerification of a change to openstack/ironic master failed: Agent deploy: account for disable_power_off  https://review.opendev.org/c/openstack/ironic/+/93463710:15
opendevreviewIury Gregory Melo Ferreira proposed openstack/ironic master: Update Node Cache during Servicing  https://review.opendev.org/c/openstack/ironic/+/93600911:16
opendevreviewMerged openstack/ironic master: IPMI power: account for disable_power_off  https://review.opendev.org/c/openstack/ironic/+/93262413:01
cardoeAnyone know if we can boot loop on BIOS? Thinking about booting to the internal UEFI shell. Essentially trying to solve for the gotcha we have in the docs where a BIOS reset might make the machine unable to boot the IPA until we change some BIOS settings. Which are queued up in my next step but can’t get there cause IPA now didn’t boot.13:44
cardoeThere’s no reason the setting of the BIOS settings needs to boot into IPA via redfish. That’s all out of band. The only requirement is that the redfish task only completes once the machine powers up. I have a suspicion that these Dells have some shenanigans with ExitBootServices / EnterRuntimeServices. Cause I tried to boot Xen on there and the BIOS setting job completed on a very different time frame than Linux.13:47
cardoeI used Xen because I happened to have coded that behavior 10? Years ago and it left a scar on my soul so I know the behavior difference from Linux.13:49
cardoeToday we poke redfish directly cause we just say that ironic cannot handle this.13:51
dtantsurcardoe: janders and I really want to stop booting IPA so often during steps like firmware updates or BIOS settings. We haven't got to it yet.14:16
dtantsurThe reason IPA is booted before any steps run is because it may have its own clean/service steps that override the built-in ones (think, software RAID or some proprietary BIOS settings tool)14:16
dtantsurMaybe we need to change this approach with Redfish and declare that in-band steps should not override out-of-band ones14:17
TheJuliacardoe: set a machine to keep booting to bios?14:17
dtantsurIt feels like the solution may be to fix ironic, not to boot into bios?14:18
dtantsur(otherwise, we could add a special clean/service step for that)14:18
cardoedtantsur: Yeah that would make sense.14:20
cardoeTheJulia: I don't have TFTP anymore so can't boot BIOS.14:20
cardoeBut VirtualMedia doesn't work. Whatever Ironic sends results in a 400 error back.14:20
dtantsurOMG14:21
dtantsurWe haven't seen this in our lab, which model is that?14:21
cardoeSo we're using redfish-https and/or ipxe-http14:21
cardoeR7515, R7615 and R74014:21
dtantsurhuh14:21
dtantsurprobably newer than what we have, right iurygregory?14:22
iurygregorydtantsur, correct14:22
TheJuliagood morning everyone14:22
dtantsurmorning TheJulia 14:22
cardoeGive me a few minutes and I'll grab the bug (I think I made one) or at least pastebin the error.14:22
dtantsurcardoe: is there any error message? Or just HTTP 400?14:22
iurygregorymay the new ones work uefi http boot lol :D14:22
dtantsurright :)14:22
TheJuliacardoe: I mean, not bios mode, but I mean into firmware bios settings ?14:23
dtantsuriurygregory: if cardoe uses redfish-https, it works14:23
dtantsur(which is yay! at least)14:23
iurygregoryNICE!14:23
cardoeIt's on my other machine and its sooo far (gesturing to the other room). It's cold this morning so I'm in my chair with a blanket and not wanting to get up. :-D14:23
dtantsuroh yeah, it's very chilly here as well14:23
* dtantsur hopes for a lot of snow in the mountains this season14:23
* TheJulia feels the cold in her fingers14:23
* TheJulia doesn't want to be up this morning14:24
* iurygregory hopes for snow in Berlin between Dec 8 Dec12 :D14:24
TheJuliacardoe: maybe higher bandwidth, once caffeination soaks in, might help/enable getting on same page14:24
cardoeSo basically when we clean, we reset the BIOS and want to set some settings back.14:24
dtantsuriurygregory: not very likely, but let's keep the fingers crossed :)14:25
iurygregorydtantsur, yeah =)14:25
cardoeMy plan was a custom hardware manager to ensure that, which is likely what JayF did for Rackspace ages ago.14:25
dtantsurah, reset, interesting. we don't really do it in our lab.14:25
cardoeBut the BIOS reset happens and it reboots from the IPA and then it can't get back to IPA.14:25
cardoehttps://docs.openstack.org/ironic/latest/admin/drivers/idrac.html#pxe-reset-with-factory-reset-bios-clean-step14:26
cardoeToday we have automated cleaning off and a pile of Python poking redfish directly.14:26
dtantsurTheJulia: https://review.opendev.org/c/openstack/ironic/+/929904 could use your attention when you have a minute14:27
cardoeWe're using HTTP Push to push an ISO that's just a Linux ramdisk that sleeps forever14:28
cardoeSo I reset the BIOS and give it a one time boot of this sleep forever ISO.14:28
cardoeThat lets the Redfish Task complete.14:28
dtantsurHow is it different from booting IPA though?14:29
cardoeThen I apply the BIOS settings and give it the sleep forever ISO.14:29
cardoeWell IPA is a fetch and not a push.14:29
cardoeSushy doesn't yet have push in it.14:29
dtantsurIs push a standard thing? I thought we were still discussing it..14:29
cardoeI mean "standard"14:29
cardoeThere's an advertised field in the Redfish manager which gives you the endpoint.14:30
cardoeIt's then vendor specific for the payload but I think a recent redfish update provided a suggested payload.14:30
dtantsurcc janders ^^^14:31
cardoeIf OEM == "Hpe" do_this() if OEM == "Dell" do_that()14:31
dtantsurugh14:31
cardoeWhich is part of my interest in the OEM drivers in sushy.14:31
dtantsurAnyway. If we get a proper Task from sushy for BIOS settings, we can indeed boot into ~nothing (UEFI shell or whatever).14:32
cardoeWell sushy unfortunately drops the task cause it doesn't look at the response except to check for a 200.14:32
dtantsurIf you fix this, we'll owe you a beer :)14:33
cardoeWhen you issue a reboot you get something in the response, a flag that says "I'm gonna do something before I actually reboot". Then you hit the Task endpoint and you'll have a pending task.14:34
cardoeI'm trying to figure out how to lift this into sushy. I'm wanting us to stop crafting our own redfish poking library and just use sushy.14:34
dtantsur+++++14:34
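A minimal sketch of the flow cardoe describes: once the BMC reports pending work, poll the Task resource until it reaches a terminal state. `fetch_task` is a hypothetical callable (not a sushy API) that returns the Task resource as a dict, and the URI in the usage note is made up; `TaskState` and `PercentComplete` are property names from the Redfish Task schema.

```python
import time
from typing import Callable

# Terminal values of TaskState in the Redfish Task schema.
TERMINAL_STATES = frozenset({"Completed", "Exception", "Killed", "Cancelled"})


def wait_for_task(
    fetch_task: Callable[[str], dict],
    task_uri: str,
    interval: float = 5.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a Redfish Task until it reaches a terminal TaskState.

    fetch_task is a hypothetical helper that GETs task_uri and returns
    the decoded JSON body; it is injected so the loop can be exercised
    without a BMC.
    """
    waited = 0.0
    while True:
        task = fetch_task(task_uri)
        if task.get("TaskState") in TERMINAL_STATES:
            return task
        if waited >= timeout:
            raise TimeoutError(
                f"{task_uri} stuck at {task.get('PercentComplete')}% "
                f"in state {task.get('TaskState')}"
            )
        sleep(interval)
        waited += interval
```

Something like `wait_for_task(my_get_json, "/redfish/v1/TaskService/Tasks/1")` would then block until the BIOS job finishes, or raise if it sits at a partial percentage for too long.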
cardoeThe issue I've had is that the box boots and the task doesn't complete until the Linux kernel gets to a certain point.14:35
cardoeMy sleep forever ISO gets there pretty quick cause there's no init.14:35
dtantsurAre you sure it's related to the kernel O_o14:35
cardoeIPA gets there fairly quick and then Xen took its sweet time.14:35
dtantsurI'm really surprised to hear it. Can it just take a while?14:35
cardoeSo this is where my jump to conclusions mat came out and made me think it's related to ExitBootServices / EnterRuntimeServices14:36
cardoeSo grub used to call EBS but then Linux changed like 10+ years ago to want something in there so grub stopped doing that.14:37
dtantsurHmmm14:37
TheJuliadtantsur: my half caffeinated brain has looked at https://review.opendev.org/c/openstack/ironic/+/929904 and I see one tiny issue14:38
cardoeFor Xen I had to not call EBS until dom0 started up and I had to make a mapping of the memory to make it still work for dom0.14:38
cardoeSo that's where my timing guess is coming from.14:38
dtantsurI thought that iDRAC creates a job, which is executed by its firmware during reboot, then the real reboot happens automatically14:38
dtantsurbut who knows14:38
TheJuliadtantsur: no release note to indicate if anyone has hard coded their permitted formats and they permit raw, they might want to add gpt.14:38
TheJuliaas a follow-up is totally cool in my book, fwiw14:38
cardoeYeah iDRAC is creating the job for us.14:38
cardoeBut the job sits at like 33% complete until it starts to boot some OS.14:39
dtantsurTheJulia: I can do it, although I take this case into account and add "gpt" for them (I think, it has been a while)14:39
dtantsurcardoe: I'm speechless, iDRAC is even more complicated than I thought....14:40
cardoeI wish it would complete before something booted.14:40
dtantsurso yeah, so far we're booting IPA, this is why, I guess, we haven't noticed this behavior14:40
TheJuliadtantsur: ... I didn't see that, but a reno might be good anyway since we're also moving to the community library14:40
TheJuliadtantsur: oh, idracs are stupidly complex under the hood14:40
cardoeThe internal UEFI shell completes the job too.14:40
* TheJulia needs to nom something to reach "OS Running" state14:41
dtantsurTheJulia: this is the logic: https://review.opendev.org/c/openstack/ironic/+/929904/6/ironic/common/images.py#87014:41
dtantsurhappy to follow-up with a release note anyway14:41
cardoeSo I feel like the job isn't completing until the machine thinks it's moved onto an "OS is Running" state.14:41
dtantsursigh14:41
dtantsurWe can do the UEFI shell for sure, it's one of the boot device overrides that sushy supports14:41
* dtantsur needs to finally get some lunch, brb14:42
TheJuliadtantsur: ahh, yes14:43
TheJuliayeah, UEFI shell is totally a thing which can be requested14:43
cardoeAnd that's why I was asking about BIOS cause this would make disable_ramdisk=True work for factory_reset and apply_configuration on UEFI.14:43
cardoeBut it wouldn't work if you were booting legacy.14:43
dtantsurDo you care about legacy? I'm totally cool if we make disable_ramdisk conditional on UEFI.14:44
cardoeI don't care about legacy.14:44
cardoeOkay I'll go that route then.14:44
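A rough sketch of the "boot into the UEFI shell, but only on UEFI" idea from the exchange above. It builds the body for a raw PATCH against the Redfish ComputerSystem resource; the function name is hypothetical, and this is not how ironic or sushy implement it. `BootSourceOverrideTarget` and `BootSourceOverrideEnabled` are standard ComputerSystem properties.

```python
def uefi_shell_boot_override(boot_mode: str) -> dict:
    """Build a one-time boot override that targets the UEFI shell.

    Raises ValueError for legacy BIOS boot, where no UEFI shell exists,
    mirroring the idea of making disable_ramdisk conditional on UEFI.
    """
    if boot_mode.lower() != "uefi":
        raise ValueError(f"UEFI shell is unavailable in {boot_mode!r} boot mode")
    return {
        "Boot": {
            "BootSourceOverrideTarget": "UefiShell",
            "BootSourceOverrideEnabled": "Once",
        }
    }
```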
cardoeI think we did the sleep forever ISO back when we used to care about legacy boot.14:45
cardoeCause UEFI shell works and I don't have to have a magical bespoke image to push.14:45
cardoeOur push code is also horrifying cause it supports the vendor specific endpoint and not the redfish push one.14:46
TheJuliaWhen are we making "BIOS boot is dead" shirts?14:48
cardoeNot soon enough.14:52
TheJulia"BIOS boot is dead." with a subtext of "... At least on Bare Metal."15:23
rpittaubye everyone have a great weekend! o/15:38
JayF\15:41
JayF\o15:41
opendevreviewJulia Kreger proposed openstack/ironic master: First pass on some strucutral context setting for networking  https://review.opendev.org/c/openstack/ironic/+/93603916:48
opendevreviewJulia Kreger proposed openstack/ironic master: docs: begin making a general networking document  https://review.opendev.org/c/openstack/ironic/+/93604016:48
opendevreviewJulia Kreger proposed openstack/ironic master: docs: change network setup steps into the commands  https://review.opendev.org/c/openstack/ironic/+/93604116:48
opendevreviewJulia Kreger proposed openstack/ironic master: docs: rewrite ml2 and update physnet context  https://review.opendev.org/c/openstack/ironic/+/93604216:48
opendevreviewJulia Kreger proposed openstack/ironic master: docs: final cleanup pass on networking  https://review.opendev.org/c/openstack/ironic/+/93604316:48
dtantsurwow16:49
TheJuliabasically a complete rewrite of that file16:49
TheJuliasuper bad shape, it was in16:49
TheJuliaand still needs more work, hence the final change :)16:49
dtantsur+++16:49
TheJuliaI *think* that better sets base context, to help delineate things apart16:50
TheJuliaand also focuses more on "the single thing" as opposed to "oh, you need your api version, and then you need to do this and that"16:50
TheJuliawhich is just noise16:50
TheJulia... if you're at this point, at least.16:50
TheJulia... I think I'm still sort of missing some context around how to understand how vifs get applied, but it is much more theory for people to grok..16:59
TheJuliaI guess, maybe it is... how to use the thing related which got lost in some of the early networking stuff17:00
TheJulialike we know how many vifs a node can take17:00
TheJuliait is based upon the ports and what gets asked for17:00
TheJuliaBut hey, also opened a bug over preferring PXE ports...17:00
TheJulia*WHY!* 17:00
TheJuliaThat makes sense for PXE, sure17:01
TheJuliadoesn't make sense for tenant workloads17:01
TheJuliaAt least in my opinion17:01
TheJulia</rant>17:01
dtantsurHmm, fair17:01
TheJuliaThis concludes your daily ironic rant17:01
dtantsur\o/17:02
cardoeshould https://review.opendev.org/c/openstack/ironic/+/934065 be backported to 2024.2?17:08
cardoeHow would people feel if I got rid of pbr from the runtime of sushy? It's literally only used to parse the local version info to return the version as a major, minor, patch tuple.17:15
cardoepbr still doesn't have Python 3.12 support17:15
dtantsurI'd appreciate that17:15
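A minimal sketch of what a pbr-free version tuple could look like, assuming the installed package metadata is the source of truth. The function names are hypothetical and this is not the actual sushy change; `importlib.metadata` is in the standard library.

```python
import re
from importlib import metadata
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, int, int]:
    """Return (major, minor, patch) from a version string.

    Anything after the third numeric component (for example a
    ".dev4" suffix) is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def installed_version_tuple(dist_name: str) -> Tuple[int, int, int]:
    """Look up an installed distribution and parse its version."""
    return version_tuple(metadata.version(dist_name))
```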
cardoedtantsur: https://paste.opendev.org/show/bpoxYEvFBbqHpsGTy1IF/ that's the virtual media issue.17:27
dtantsurahhhh17:28
dtantsurwe seem to be getting reports of these 'Virtual Media is detached or Virtual Media devices are already in use.'17:28
dtantsurbtw cardoe, this literal error may be shadowing some more precise error because of our retry logic17:29
cardoehmm that could make sense.17:29
dtantsuryeah, I was going to file a bug but forgot17:29
dtantsurwe retry connection on HTTP 500, so chances are high that the first error was not the same as this one17:29
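To illustrate the shadowing dtantsur mentions: a generic retry helper that keeps the first failure attached to the one it finally raises. This is a sketch of the idea only, not sushy's or ironic's retry code; `call_with_retries` is a hypothetical name.

```python
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    retriable: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call func up to `attempts` times, preserving the first error.

    If every attempt fails, the last error is raised with the first
    one chained as its __cause__, so the original failure is not lost.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    first_error: Optional[BaseException] = None
    last_error: Optional[BaseException] = None
    for _ in range(attempts):
        try:
            return func()
        except retriable as exc:
            if first_error is None:
                first_error = exc
            last_error = exc
    if last_error is first_error:
        raise last_error
    raise last_error from first_error
```

With this shape, a first attempt that fails for a specific reason and later attempts that fail with "Virtual Media is detached or Virtual Media devices are already in use." would still surface the specific reason in the traceback.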
cardoeTheJulia: https://bugs.launchpad.net/sushy/+bug/2041902 did you have a follow on in Ironic planned for that?17:31
cardoedtantsur: iurygregory has https://review.opendev.org/c/openstack/sushy/+/924020 wonder if that's affecting me as well.17:33
iurygregorycardoe, this one I was working because of a Cisco Bug17:53
iurygregorybut it was a very old firmware that was EOL17:53
cardoeYou described Dell's latest firmware18:26
cardoedtantsur: was hoping you'd weigh in on https://review.opendev.org/c/openstack/ironic/+/93302018:30
dtantsurwill check on Monday18:30
TheJuliacardoe: no, no follow-up in ironic. I spot checked some machines and saw only partial support so reliance upon it is... sort of sketchy as the kids say18:33
TheJulia*but* relying on it can help some stuff18:34
cardoeThanks Steve Buscemi.18:35
TheJuliaThere *was* something I was thinking of where it would be super useful to check if present and then block/proceed with child nodes18:39
JayFTheJulia: https://review.opendev.org/c/openstack/ironic/+/936039 did you want someone else to review the doc changes before they land? 19:25
JayFIt's only been 3 hours and it has two +2s now, so just ensuring you weren't hoping someone wearing a maroon fedora wanted to look first :D 19:25
TheJuliaJayF: I've got 5 changes stacked up, merge what we can and I'll fix them as I roll19:26
JayFthat is what I suspected, but just checking these weren't written with a specific audience in mind :D 19:26
JayFwhen was the last time a non-critical bugfix was merged in ... like half a day :)19:26
TheJulianope, largely to just clarify the docs and get things to reflect reality19:26
TheJuliabecause now() doesn't reflect19:27
JayFI'm only one patch in and it's 100% improved19:27
TheJuliaheh19:27
TheJuliayeah, looks like I've got a warning in the next patch which means I need to pull a fix forward19:27
TheJuliasooon19:27
JayFwell I'm going to go put some lunch on and then finish reviewing the stack :) thanks!19:28
TheJuliaI'll try and fix that issue and rebase the stack19:30
TheJuliaJayF: keep in mind, I recognize through the changes that I'm improving chunks at a time, so not everything is perfect, but just trying to leave it better with each pass19:31
JayFmy bar for reviewing is always "is it better than it was" with the exception of things that are potential security risks or user-facing-APIs that are hard to fix later19:31
TheJuliaheh, looks like I also addressed your first comment largely in the sections being rewritten19:33
TheJuliamostly, at least19:33
JayFI thought that would be possible19:33
* TheJulia leaves comments on the last patch19:34
opendevreviewDoug Goldstein proposed openstack/sushy master: enable pycodestyle and pyflakes checks in ruff  https://review.opendev.org/c/openstack/sushy/+/93491519:37
cardoeSo I'm working on bringing the sushy-oem-idrac stuff in... I was going to put it on top of my pyupgrade to Python 3.9 patch.19:38
opendevreviewMerged openstack/ironic master: First pass on some strucutral context setting for networking  https://review.opendev.org/c/openstack/ironic/+/93603919:45
opendevreviewJulia Kreger proposed openstack/ironic master: docs: begin making a general networking document  https://review.opendev.org/c/openstack/ironic/+/93604019:51
opendevreviewJulia Kreger proposed openstack/ironic master: docs: change network setup steps into the commands  https://review.opendev.org/c/openstack/ironic/+/93604119:51
opendevreviewJulia Kreger proposed openstack/ironic master: docs: rewrite ml2 and update physnet context  https://review.opendev.org/c/openstack/ironic/+/93604219:51
opendevreviewJulia Kreger proposed openstack/ironic master: docs: final cleanup pass on networking  https://review.opendev.org/c/openstack/ironic/+/93604319:51
opendevreviewcid proposed openstack/ironic-specs master: Add a Kea DHCP backend  https://review.opendev.org/c/openstack/ironic-specs/+/93102520:05
opendevreviewcid proposed openstack/ironic-tempest-plugin master: Test double encoding of error message  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/93574020:05
opendevreviewcid proposed openstack/ironic-tempest-plugin master: Fix test to not expect double-JSON-encoded errs  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/93254420:06
cardoeTheJulia: Do we send "extras" on ports as part of the binding profile? How can a VIF have multiple physical networks?20:19
adamcarthur5https://review.opendev.org/c/openstack/ironic/+/928919/7 and https://review.opendev.org/c/openstack/ironic/+/928920/5 should be ready to be reviewed, and have had +1 verified from after the merge of the shard tests in tempest (So we are testing that we get fails on the right microversions)20:21
JayFso adamcarthur5 kinda dumb question about this change20:23
JayFis this going to be the first time we've validated schema for shards requests?20:23
JayFI'm trying to sniff out if there's any api change; e.g. an error message coming across different in a bad request workflow20:23
adamcarthur5I think it depends what you mean by "validated"? 20:23
adamcarthur5Ah - but by that metric, I suspect there is likely to be a change in there somewhere, yes20:24
adamcarthur5It depends on whether the internal-function checks you have already for microversions are consistent. But I could see a 404/406 switcharoo being possible?20:24
iurygregorycrazy bug in servicing for firmware update, at least in a Dell machine it updates but for some weird reason the new information is not in the DB :facepalm: 20:24
cardoeIs that the fix you just submitted?20:25
JayFadamcarthur5: yeah it just seems weird to me we had no schema on incoming requests before20:25
JayFadamcarthur5: lets add a release note to that? Just indicating we're validating input schemas20:25
iurygregorycardoe, yup the patch I pushed earlier today, but the fix isn't helping much at least in my testing :D20:26
JayFI was waffling but if there's any chance at all it's not invisible, lets put something in there under "features"20:26
adamcarthur5Sounds good JayF, do you think any extra testing is required for the schema validation?20:26
adamcarthur5Like what we did for microversion-fail testing20:26
JayFI think if it was my change, I'd want a solid answer to if that 404/406 switch is possible20:27
JayFbut that is also a very high bar :)20:27
adamcarthur5Yeah, at the very least, the first change is definitely tested and ready to go. I'll keep looking into the 2nd20:27
JayFthe first already had my +220:28
JayFreapplied20:28
JayFhttps://review.opendev.org/c/openstack/ironic/+/928919 would be nice to land if someone else who can is around and wants to /me nudges the two other cores watching IRC ;) 20:29
cardoeI'll look once I'm done reading this doc from Julia.20:30
cardoeDo we have a syntax to refer to config options in rst?20:31
cardoeor is it really ``[section]option``? I thought there was some other syntax.20:31
JayFthere is20:32
cardoeI feel like I wanna draw some diagrams for some of this stuff too.20:32
cardoeI +2'd it. I've been keeping up with the changes on it.20:37
JayFawesome, I'll land it \o/20:37
JayFI suspect once adamcarthur5 gets his engine going on the tempest-validation of microversion + schema changes, we'll have a lot of em to review20:38
JayFend state of good API test coverage enforcing microversions in tempest + more readable schemas is super exciting20:38
shermanmregarding the recent flurry of redfish/virtualmediaboot stuff, I wanted to mention that I ran into someone from the redfish/DMTF forum, and was pretty strongly requested to submit any reports of vendors with misbehaving implementations so they could *encourage* a fix20:43
shermanmto https://redfishforum.com/20:44
opendevreviewAdam McArthur proposed openstack/ironic-tempest-plugin master: Testing bad microversions on v1/allocations  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/93582720:49
opendevreviewcid proposed openstack/ironic-specs master: Add a Kea DHCP backend  https://review.opendev.org/c/openstack/ironic-specs/+/93102520:58
JayFCircling back around to the "ironic-ui wants to get /v1/conductors" thing; there *is* a /v1/drivers that appears to have that same info21:02
JayFand maybe more structured to redact info from21:02
TheJuliashermanm: yes, I believe some of that has occurred :)21:08
TheJuliacardoe: we use a couple different styles, my intent is to batch those sorts of changes up in the last change21:08
TheJuliacardoe: also, regarding extras on ports. A VIF cannot have multiple physical networks, but can be mapped across multiple ones, potentially21:09
TheJuliathink of a physical network as a fabric21:10
TheJuliait might be possible to be on physnet1 and physnet2, but not physnet321:10
TheJuliaphysnet3 is Julia's secret network of doom plugged into the special hypervisor full of lolcats21:10
TheJuliaAt least, that is my understanding21:12
TheJuliaa physical port can only be on a single physnet as well21:12
TheJulia(the one it is attached to)21:12
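A toy model of the physnet rules TheJulia lays out above: each physical port sits on exactly one physnet, while a VIF's network may be reachable on several. The `Port` class and `candidate_ports` are hypothetical and deliberately simplified; ironic's real port selection considers more than this (PXE-enabled ports and port groups, for instance).

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Port:
    name: str
    physical_network: str  # a port is attached to exactly one physnet
    vif_attached: bool = False


def candidate_ports(ports: List[Port], vif_physnets: Set[str]) -> List[Port]:
    """Return free ports whose physnet is one the VIF can be mapped to."""
    return [
        port
        for port in ports
        if not port.vif_attached and port.physical_network in vif_physnets
    ]
```

For a network reachable on physnet1 and physnet2, ports on physnet3 are never candidates, which matches the "but not physnet3" example.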
TheJulia.... Anyone have a delorean that can get up to 88? I need to go back in time and beat ourselves up about overusing "physnets"21:13
TheJuliacardoe: since this is a lot of theory, drawing are always good21:14
opendevreviewAdam McArthur proposed openstack/ironic-tempest-plugin master: Testing bad microversions on v1/nodes/{uuid}/firmware  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/93607121:16
cardoeWell I was just asking about the networks cause the wording implied multiple physical networks for the VIF.21:18
cardoefabric = physical network is literally what I'm preaching here.21:19
shermanmso, this is a thing I've had a real headache describing to my operators and in our own docs. (caveat, from my perspective): Ideally physical-network == fabric. BUT, it's possible that a single network fabric may have multiple physnets on it, and/or not all of those physnets could be attached to a given port on that fabric21:21
shermanmin our case, usually done to convince neutron to treat one set of vlans differently from another, even though they all coexist on the same fabric21:21
shermanmso it's very possible that one "baremetal port" (cause I don't quite understand vif in this context yet), could have more than one neutron physical network attached to it, because they're just logical representations on the same underlying fabric21:24
shermanm*potentially attached to it21:24
TheJuliacardoe: ahhhhh! I could see that, if you can highlight it, it would be worthy of revising a little21:27
opendevreviewAdam McArthur proposed openstack/ironic-tempest-plugin master: Testing bad microversions on v1/allocations  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/93582721:33
shermanmcardoe: ^ and please ignore my above if I misinterpreted your comment, totally agree that diagrams would be good.21:33
cardoeshermanm: so I’m not a network guy at all so maybe I’m off. But a fabric is essentially a pile of interconnected stuff. I’m making VNIs on top through a vxlan type plugin (not the stock one since that’s OVN specific).21:33
cardoeToday the ports are clueless about the specific VLANs they are getting in the racks to match up to those VNIs.21:34
cardoeBut we’re wanting to have the trunking extension. We’ll have to modify nova to special case those ports to create the correct network_data in cloud-init.21:35
TheJuliacardoe: so I've done Ethernet and FibreChannel networks in the past, and thinking of them as fabric helps the interconnected nature of the things, they can be cross-connected with constraints, but at that point you might not let everything cross21:35
shermanmcardoe: I'm mostly looking at this from the perspective of neutron + openvswitch + ngs with vlans, and that implementation. In that particular case, the only thing a neutron physnet means is which vlans, and which interface on the networking node, will be added to the ovs bridge. IIRC that part remains the same with OVN.  So I'm primarily thinking of "fabric" == "domain within which l2 segment identifiers are valid"  21:36
shermanmbut there's nothing that enforces that two different "logical domains" can't occupy the same physical domain, it's just up to the operator to make sure that they don't conflict21:37
JayFI would be careful making that kinda assumption, there are places that do things like "this is 10.1/16 locally, and natted to 172.16/16 for other regions" at least if you mean what I think you mean21:37
JayFunless you're explicitly saying, trying to draw a line, that it is two fabrics not one?21:37
JayF(in that crazy case)21:38
cardoeTheJulia: yeah we have piles of cross-connects. And on my roadmap are scheduler crazy for grokking cross-connects. But I’m honestly hoping for that to be scheduled around the heat death of the universe21:40
cardoeSome of my gear has ports on a storage fabric and other ports on real network.21:45
TheJuliashermanm: this is a valid way of looking at it as well. I guess the ultimate challenge is to frame, and then hope humans don't overcomplicate it more to make it seem simple21:45
TheJuliacardoe: Heat death of the universe sounds about right for that level of complexity21:46
cardoeWell I would say the person that comes after me but I don’t wish that on them.21:46
opendevreviewMerged openstack/ironic master: api: Introduce new mechanism for API versioning  https://review.opendev.org/c/openstack/ironic/+/92891921:47
TheJuliaheh21:49
shermanmTheJulia: I do agree on both interpretations if starting from scratch, I mostly wanted to be clear on "do we mean the same thing that neutron does by physical network", and if not, what are the constraints.   Even if just "we assume that physnet == fabric, and if you configure neutron to mean something else, you can still only attach one of them <for reasons>"21:49
TheJuliaits a good thing to sort of just disambiguate upfront21:50
TheJuliaor at least, set the same shared context for the rest of the text to be digested with21:50
cardoeSo shermanm we're integrating to Nautobot to give another layer of info for the operators. Maybe we'll go back to NetBox but we'll see.21:50
cardoeBut in there we handle the switch templating so we're not using NGS. NGS also doesn't do VXLAN21:51
TheJuliaVTEP might be a way...21:53
TheJuliabut yeah21:53
shermanmcardoe: I'm ultimately not talking too much about ngs here, everything just falls out of the neutron config for `network_vlan_ranges`  in ml2_conf, and `bridge_mappings` in ovs-agent.22:05
shermanmhttps://docs.openstack.org/neutron/latest/configuration/ml2-conf.html#ml2_type_vlan.network_vlan_ranges22:05
shermanmhttps://docs.openstack.org/neutron/latest/configuration/openvswitch-agent.html#ovs.bridge_mappings22:05
shermanmagree that it doesn't really apply to the vxlan case though22:05
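A small sketch of shermanm's point that several physnets can share one fabric and that it is up to the operator to keep their VLAN ranges from colliding. It parses a value in the documented `network_vlan_ranges` form (`<physical_network>:<vlan_min>:<vlan_max>` or just `<physical_network>`) and reports physnets with overlapping ranges. The helper names are hypothetical, and the check assumes all listed physnets live on the same fabric.

```python
from typing import Dict, List, Tuple

VlanRanges = Dict[str, List[Tuple[int, int]]]


def parse_network_vlan_ranges(value: str) -> VlanRanges:
    """Parse a comma-separated network_vlan_ranges value."""
    ranges: VlanRanges = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = entry.split(":")
        if len(parts) not in (1, 3):
            raise ValueError(f"malformed entry {entry!r}")
        # A bare physnet name is valid and carries no VLAN range.
        ranges.setdefault(parts[0], [])
        if len(parts) == 3:
            ranges[parts[0]].append((int(parts[1]), int(parts[2])))
    return ranges


def overlapping_physnets(ranges: VlanRanges) -> List[Tuple[str, str]]:
    """Return pairs of physnets whose VLAN ranges intersect."""
    names = sorted(ranges)
    clashes = []
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            if any(
                lo_a <= hi_b and lo_b <= hi_a
                for lo_a, hi_a in ranges[first]
                for lo_b, hi_b in ranges[second]
            ):
                clashes.append((first, second))
    return clashes
```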
cardoeTheJulia: I commented on the spot where it made me think multiple physical networks. And started reading the next patch...  where you made a bunch of the changes I suggested in the prior one. So yeah I keep my +2 on the current "next in line" patch.22:07
cardoeshermanm: yeah so I don't wanna get vlan ranges from a config file. And in fact that's something I brought up on the PTG call. I need that to be serviceable.22:07
TheJulia... that did come up... what was it22:08
TheJuliait feels wrong to be in a config file22:08
cardoeBecause in all the cases here, I don't fully own the entirety of the fabric.22:08
shermanmisn't there something with network-segment-ranges in neutron already?22:08
TheJuliaalso wrong like not giving cats scritches22:08
TheJuliacardoe: wasn't an idea that the fabric should be able to get asked what is free?22:08
cardoeyeah22:09
TheJuliacardoe: if so, we should ask neutron for it in the form of a bug/rfe22:09
TheJuliabecause... yeah22:09
TheJuliaits a really bad pattern22:09
TheJuliato rely upon config files22:09
cardoeI had a follow up convo with someone who was giving me push back on the call. I forget who. But they ended up agreeing with me and said to make a bug/rfe.22:09
shermanmbut I also don't mean to debate the docs change and meaning of physnet to death here, I'm super happy with the changes already :)22:09
cardoeI need to corner jamesdenton to help me write it.22:09
TheJuliajamesdenton: dude, the needful calls!22:10
cardoeBasically it's not possible to change a big running system.22:10
TheJulialets re-frame that22:10
cardoeCause they all need to change at the same time otherwise it's coin flip if you get the old value or the new value.22:10
cardoeThat's what the neutron dev concluded.22:11
TheJuliakeeping it running is a herculean task. Re-configuring the planet while Hercules is holding it is a laughable idea.22:11
cardoeCause each server reads the state of the ini and then updates the network-segment-ranges like shermanm was saying and then uses that value going forward. There's some re-sync background RPC thing that causes one of the nodes to update the DB if the DB is different than the state it has and all the other nodes read the value from the DB22:13
cardoeLike a quasi-leader/follower without any clear direction who the leader is.22:13
TheJuliaugh22:15
shermanmso, I know it used to be *only* ini-driven and somewhat implicit, but there is also an api/db driven neutron plugin:22:16
shermanmhttps://docs.openstack.org/neutron/latest/admin/config-network-segment-ranges.html22:16
shermanmand distinguishes between the ini-created ones, and the API created ones:22:16
shermanmhttps://docs.openstack.org/neutron/latest/admin/config-network-segment-ranges.html#default-network-segment-ranges22:16
shermanmbuut I have no idea how complete it is22:16
cardoeI honestly didn't dig that deep. I just said updating ini files across a fleet isn't a good management surface for us. And was told I needed to look into network-segment-ranges.22:18
cardoeThe dev came back with an in-depth email about the whole thing.22:20
cardoeUltimately I wanna use both cause we have provider networks and we've got self-service tenant networks.22:20
cardoeLike jamesdenton is one of my tenants for example. And he'll make himself a network and all that jazz.22:20
shermanmI'd be curious about what you conclude, totally agree that moving away from ini-based config here is the way to go. I've successfully used the plugin to adjust the valid segment range for creation of new self-service networks on existing physnets, but haven't tried anything with creation/deletion of the actual physnets. 22:22
cardoeSo I dunno what the right answer is. a system scoped API endpoint?22:24
cardoeini-file OR API?22:25
shermanmfrom what I understand, the current state is that it supports both, with the API providing the new mechanism, but also exposing existing ini config for backwards compatibility22:30
cardoeso that is just for self-service segments. it does nothing for provider segments22:51
TheJuliaI'm stepping away shortly22:51
TheJuliaI should be around again tomorrow morning22:51
opendevreviewcid proposed openstack/ironic-specs master: Add a Kea DHCP backend  https://review.opendev.org/c/openstack/ironic-specs/+/93102523:32
