Wednesday, 2023-11-29

opendevreviewMerged openstack/ironic stable/2023.2: Add missing compatibility between idrac and redfish firmware  https://review.opendev.org/c/openstack/ironic/+/90207400:40
opendevreviewMerged openstack/ironic-python-agent master: Fix vmedia network config drive handling  https://review.opendev.org/c/openstack/ironic-python-agent/+/89551901:10
opendevreviewMerged openstack/ironic-python-agent master: Parse efibootmgr type and details  https://review.opendev.org/c/openstack/ironic-python-agent/+/89977501:10
opendevreviewMerged openstack/ironic-python-agent stable/2023.2: improve multipathd error handling  https://review.opendev.org/c/openstack/ironic-python-agent/+/90148801:10
stevebaker[m]TheJulia: OK, I don't think assuming efibootmgr returning utf-16 holds in all cases https://zuul.opendev.org/t/openstack/build/2fd82ff944d045918d1bea322a3d60ff/log/controller/logs/ironic-bm-logs/node-0_console_2023-11-28-20:49:06_log.txt#3129-313101:24
stevebaker[m]TheJulia: when I revert that change and set use_standard_locale=True then it works https://zuul.opendev.org/t/openstack/build/7e4ba12e165145d2a8aad98445bf4aa6/log/controller/logs/ironic-bm-logs/node-0_console_2023-11-28-20:39:00_log.txt#3132-315101:25
opendevreviewMerged openstack/networking-generic-switch stable/zed: Fix regression plugging 802.3ad port group  https://review.opendev.org/c/openstack/networking-generic-switch/+/90098901:31
TheJuliastevebaker[m]: so.. is the answer just to look at /sys/efi directly then?03:03
TheJuliaI'm not sure how we otherwise navigate it, since the records can be and are supposed to be UTF1603:04
TheJuliaof course, coming from CI there are really no guarantees03:04
stevebaker[m]TheJulia: I'm sure they're stored in UTF16, but efibootmgr will surely only ever display in an encoding supported by the console?03:19
TheJulianot quite sure, because we got the definitive utf16 chars from a customer case's efibootmgr output03:20
TheJuliawe might need to go back and hunt it down even though we were able to simulate it in the unit tests03:21
TheJuliathe frustrating thing here is using the standard locale means we can't trust the output of it at all for compares03:25
TheJuliaonly way really to make sure would be to reproduce it with a shell and inject one of those fun characters03:26
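A minimal sketch of why the encoding assumption matters for comparisons, assuming a made-up boot entry label (not taken from any real efibootmgr output) and a hypothetical helper named decodes_to. It only shows that the same label stops matching once the bytes are decoded under the wrong assumption; it is not how ironic-python-agent handles the output.

```python
# Hypothetical boot entry label containing one non-ASCII character.
label = "Boot0001* debian\u00e9"

# The same label as it would look in two different byte encodings.
as_utf16 = label.encode("utf-16-le")
as_utf8 = label.encode("utf-8")


def decodes_to(raw: bytes, encoding: str, expected: str) -> bool:
    """Return True when raw decodes cleanly to expected under encoding."""
    try:
        return raw.decode(encoding) == expected
    except UnicodeDecodeError:
        return False


# Matching codec: the comparison holds.
print(decodes_to(as_utf16, "utf-16-le", label))
print(decodes_to(as_utf8, "utf-8", label))
# Wrong codec: the comparison fails, either by mismatch or by decode error.
print(decodes_to(as_utf8, "utf-16-le", label))
print(decodes_to(as_utf16, "utf-8", label))
```

The point of the sketch is that a comparison is only trustworthy when the encoding of the captured output is known, which is the uncertainty discussed above.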
rpittaugood morning ironic! o/07:25
opendevreviewMark Goddard proposed openstack/bifrost stable/2023.1: Fix key-order ansible errors  https://review.opendev.org/c/openstack/bifrost/+/90204009:24
dtantsuriurygregory: I'm absolutely sure we need to retry any errors 500 from Ironic in IPA11:04
dtantsurI cannot remember why we don't do it already11:04
iurygregorygood morning Ironic11:06
iurygregorydtantsur, ok! I will work on the patch for it11:06
opendevreviewMaryna Savchenko proposed openstack/ironic-python-agent master: Fix referencing to the raid_device var which is not set  https://review.opendev.org/c/openstack/ironic-python-agent/+/90032411:38
opendevreviewMaryna Savchenko proposed openstack/ironic-python-agent master: Fix referencing to the raid_device var which is not set  https://review.opendev.org/c/openstack/ironic-python-agent/+/90032411:40
opendevreviewMerged openstack/ironic master: Replace swiftclient usage with openstacksdk  https://review.opendev.org/c/openstack/ironic/+/89999911:40
opendevreviewMerged openstack/ironic master: Document wsgi_service fix from 16a806f  https://review.opendev.org/c/openstack/ironic/+/90211511:40
opendevreviewMaryna Savchenko proposed openstack/ironic-python-agent master: Fix referencing to the raid_device var which is not set  https://review.opendev.org/c/openstack/ironic-python-agent/+/90032411:40
opendevreviewMerged openstack/ironic stable/2023.2: Properly cleanup unix sockets in wsgi_service  https://review.opendev.org/c/openstack/ironic/+/90211612:28
iurygregorydtantsur, looking at the requests code for ConnectionError, ProxyError is derived from it. so the retry should work, no?  https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/inspector.py#L135-L145 12:30
iurygregorywe should probably add log to _post_to_inspector I think ...12:31
dtantsuriurygregory: is it really derived this way? I'd be really surprised if TCP-layer errors were a superclass for HTTP-layer ones.13:52
dtantsuryeah, it seems to be the case. fun.13:53
dtantsuriurygregory: ah, this breaks it https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/inspector.py#L14813:54
dtantsurwe don't raise for HTTP code (there is no raise_for_status)13:54
* iurygregory switching back to the context for 502 proxy bug13:57
iurygregoryso we should just update the if?  I think we are retrying it, but we fail anyway and we just raise it  https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/inspector.py#L149 https://paste.opendev.org/show/be7I3IOGrGKRX5OfP1z3/ 14:04
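A rough sketch of the idea discussed above: if a 5xx response is never turned into an exception, a retry loop that only catches connection errors will not retry it. This uses only the standard library; ServerError, post_with_retry and fake_send are hypothetical names, the built-in ConnectionError stands in for the HTTP library's own exception, and none of this is the actual inspector.py code.

```python
import time
from types import SimpleNamespace


class ServerError(Exception):
    """Hypothetical exception representing an HTTP 5xx response."""


def post_with_retry(send, attempts=3, delay=0.0):
    """Call send() until it returns a non-5xx response or attempts run out.

    send is any zero-argument callable returning an object with a
    status_code attribute (a stand-in for the real HTTP call).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            resp = send()
            if resp.status_code >= 500:
                # Convert the status into an exception so it is retried
                # like a connection failure instead of being returned.
                raise ServerError(f"server returned {resp.status_code}")
            return resp
        except (ConnectionError, ServerError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


# Simulated server: two 502 responses from a proxy, then success.
codes = iter([502, 502, 200])
calls = []


def fake_send():
    code = next(codes)
    calls.append(code)
    return SimpleNamespace(status_code=code)


result = post_with_retry(fake_send)
print(result.status_code, calls)
```

Raising inside the retried function is the design choice here: the retry logic then has a single signal (an exception) for both transport failures and server errors.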
opendevreviewMark Goddard proposed openstack/bifrost stable/2023.1: ansible-lint: Skip key-order[play]  https://review.opendev.org/c/openstack/bifrost/+/90215014:35
opendevreviewMerged openstack/ironic-lib master: Increase the ESP partition size to 550 MB  https://review.opendev.org/c/openstack/ironic-lib/+/90033014:38
adam-metal3dtantsur,jayF: I would be interested to learn about the dnsmasq/dhcp topic could I join the discussion ?15:01
JayFyep15:01
dtantsurabsolutely15:01
JayFreminder you owe us follow ups on the https auth support for ipa too ;) 15:01
JayFima see if I have your email to forward  the invite15:02
dtantsurI'll write an email explaining the context in a few minutes15:02
JayFmy "I have a small headcold from travelling" is morphing into "I think my body is trying to grow another human in my sinuses" so I apologize in advance if I sound like a muppet15:02
JayFadam-metal3: I don't have a good email for you for a calendar invite, DM me one?15:03
dtantsurJayF: we can delay this call, it's not urgent15:03
dtantsurI'd rather not have you suffer on it15:04
JayFI'm not suffering right now15:04
adam-metal3JayF: dm sent15:04
rpittauJayF, dtantsur, I'd be glad to participate too15:05
JayFnow your address I have :D15:06
rpittau:)15:06
dtantsuran annoyingly long email has been sent15:22
JayFdtantsur: how is this handled for k8s-native things?15:26
JayFdtantsur: do they just do static ip configuration for all containers?15:26
JayFI'm wondering if k8s itself has any kind of dhcp service going on15:26
JayFdtantsur: ... is there any reason we have to only have one provisioning network? 15:33
JayFgive each dnsmasq instance its own separate network, have Ironic have some awareness of it15:34
JayFI guess this is no different than taking like, a /23 and splitting it into 4 logical /25-sized dhcp ranges15:35
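The /23 split mentioned above can be sketched with Python's standard ipaddress module. The 10.0.0.0/23 prefix and the dnsmasq-N labels are example values only; the log does not name a real provisioning network.

```python
import ipaddress

# Example prefix only, chosen for illustration.
provisioning = ipaddress.ip_network("10.0.0.0/23")

# A /23 holds 512 addresses, so it divides into four /25 blocks of 128.
ranges = list(provisioning.subnets(new_prefix=25))

for index, subnet in enumerate(ranges):
    # Skip the network and broadcast addresses of each block.
    first, last = subnet[1], subnet[-2]
    print(f"dnsmasq-{index}: {subnet} hosts {first} - {last}")
```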
TheJuliaJayF: oh nodes :( But have you asked an AI for "JayF as a muppet?"15:35
JayFreally this just needs to be static'd all the way through15:35
dtantsurJayF: several provisioning networks is a tough ask for many operators15:35
TheJuliait is not unheard of, really15:35
dtantsurnot unheard of, but I'd be worried about making it a requirement15:36
dtantsuralthough, if we only do it for a multi-conductor setup...15:36
TheJuliaThat is fair15:36
JayFWell the other option would be15:36
JayFtry to do something more or less fully static15:36
TheJuliayou have to put an opinion someplace15:36
JayFwhere you'd need "N" IP addresses for "N" nodes15:36
dtantsurwhich does not cancel out the problem: an Ironic Node may be handled by a conductor on a different network15:36
JayFI think I'm still missing context then15:36
TheJuliamulti-provisioning network just makes it more complex, in that any conductor may become responsible15:36
JayFyeah, basically that's what you need15:37
JayFyou have to get completely rid of conductor locality15:37
TheJuliaunless each conductor has a unique one which is its own and they don't share15:37
TheJuliabut...15:37
TheJuliaWhere is a bar tender, I need tequila15:37
JayFyou can do that with static dhcp + on-demand-api-built pxe configs15:37
JayFI'm trying to think of how to solve the problem without N IPs for N nodes15:37
JayF(we did N ips w/N nodes in OnMetal for this kinda setup, where we really didn't care what conductor did what)15:37
dtantsurThere are many ways to make the whole thing much more complicated :) I'd rather keep it at least as complex as it is now (which does not mean simple)15:38
JayFI'd rather us make the software more complex and have the design in the end make sense in both models than us come up with something that sorta is jammed into place and kinda works15:38
JayFor at least, understand the path to ^ while hammering the temporary path into place lol15:38
dtantsurAt some point, the complexity outweighs even the best solution15:39
dtantsurand in the bare-metal world, we're always close to this point15:39
JayFPerhaps; but I'm thinking manage_agent_boot is what we did in OnMetal for this, and it was tech debt for a long time15:39
JayFand if we had designed it better then, we might have this problem solved now, and might not have wasted people-days dancing around that old, now-removed feature15:39
dtantsurI don't quite see how that would solve anything15:40
JayFYou could 100% do what you want, with metal3, with the onmetal-style manage_agent_boot setup w/static DHCP like we did at onmetal15:40
JayFit was just gross and bad :)15:40
dtantsurheh15:40
JayFI'm saying, if, instead of hacking something together for the short term back then, we had solved the problem, we might not be here now 15:41
dtantsurI already have static DHCP, that's actually the problem15:41
dtantsurif it was less static, I would not have the issue of directing a node to the right conductor15:41
rpittauTheJulia: hi! if you have a moment today can you please have a look at my answer to your comment in https://review.opendev.org/c/openstack/ironic/+/894918 ?thanks!15:41
JayFdtantsur: I think part of my problem is I don't grok why "right conductor" matters in a world where any conductor can provide a valid DHCP config15:41
JayFdtantsur: I'm assuming a homogenous conductor group and driver set, is that a bad assumption?15:42
dtantsurWe're not in that world15:42
dtantsurthat's really what my problem is :)15:42
JayFDid you not propose a feature to do that, as part of this solution?15:42
TheJuliarpittau: fair, I'm just thinking explicitly setting new device/media could also be booted from15:42
TheJuliarpittau: that seems "adminy" to me.....15:42
TheJuliabut maybe member is just right as long as the rights match up15:42
TheJuliaThen again, custom policy exists for a reason, operators who are worried about it can just say "only admins may"15:43
dtantsurJayF: this is one way of doing that. But then, we also have the dnsmasq DHCP interface, and it feels like we're developing two approaches to the same thing in parallel..15:43
JayFdtantsur: I guess my mental model of the cleanest way to do this is "make it so that any conductor can handle any query", which would use both of those features15:43
JayFdtantsur: if we don't have both, then it gets a lot harder15:44
rpittauTheJulia: ok, I see what you mean, but I'm not too worried honestly, just looking at other policies15:44
dtantsurJayF: that may be the path we take.. let's wait for the call itself before we draw a conclusion :)15:44
rpittauearlier this morning I saw a couple of failures in the metal3 integration job, I rechecked and it seems ok now, but please keep an eye on it15:53
TheJuliarpittau: I'm the only one who raised it, so I'm likely on the more conservative side of access controls, which aligns with my personality when it comes to risk management; in other words, likely sane to proceed as-is15:57
rpittauTheJulia: thanks! :)15:58
TheJuliahttps://ab9c7f011d4a8adf9dae-cec36eea8e90c9127fc5a72b798cfeab.ssl.cf5.rackcdn.com/901182/7/check/ironic-tempest-bfv/b58deaf/controller/logs/ironic-bm-logs/node-2_console_2023-11-28-18%3A53%3A39_log.txt <-- this makes me stupidly happy16:02
opendevreviewJulia Kreger proposed openstack/ironic master: DNM: CI test for httpboot jobs  https://review.opendev.org/c/openstack/ironic/+/90118216:10
TheJuliahmm... downloaded shim was only 2503 bytes. That seems, wrong.16:10
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: WIP: Test multiple boot interfaces as part of one CI job  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/90217116:44
TheJuliaJayF: ^16:45
TheJuliaAnyone else keeping an eye on the httpboot changes, ^ is largely the ideal end state for testing, instead of adding a bunch of new scenario jobs16:47
iurygregoryI will take a look at it16:48
TheJuliaiurygregory: everything httpboot through the http network boot variants should be good to review; I'm only just trying to figure out if the trouble is grub in general or other unrelated sadness in the change after it.17:09
iurygregoryTheJulia, ok o/17:14
TheJuliaLooks like we need to get the bug supervisor fixed on networking-generic-switch17:18
TheJuliahttps://bugs.launchpad.net/networking-generic-switch17:18
dtantsuruh oh17:22
JayFI'll email.17:25
rpittaubye everyone! o/17:25
dtantsurJayF: btw we're considering a metal3 ptg-which-cannot-be-called-ptg, possibly in late January17:25
JayFMetal3 Teams Gathering17:25
JayFM3TG17:25
JayFwith the superscript it'll look extra cool17:25
dtantsuradam-metal3: ^^^ :D17:25
adam-metal3dtantsur,JayF: done :D17:27
JayFdtantsur: TheJulia: email out to all members of that generic-switch-drivers group, asking them to change it over to ironic-drivers17:28
JayFoooh. no. It's the semi-annual Metal3 FORGE17:30
* JayF is going to be thinking of metal3+gathering puns for the next two weeks17:30
TheJulialooks like the manager is set to Vasyl17:30
JayFTheJulia: yep, I included him on the email as well, just wanted to make the net as large as possible17:31
TheJuliaSo it is funny, the video on my left monitor right now is of someone forging a Wakizashi17:31
opendevreviewVerification of a change to openstack/ironic master failed: Fix *_by_arch documentation and un-deprecate the options without it  https://review.opendev.org/c/openstack/ironic/+/90195817:41
JayFjust going to say17:46
JayFhttp boot is super cool17:46
JayFabout as exciting of a feature to review as I've done in a while17:46
JayFOoooh, how satisfying. The UEFI HTTP Boot spec *explicitly* indicates that an http boot is performed via a PXE environment17:51
JayFso the name stays the same, it's just like, DHCP-PXE vs HTTP-PXE17:51
JayFso satisfying17:52
JayFrpittau: https://zuul.opendev.org/t/openstack/build/fea192cc73e14d74a985cb47a6b8c205 another metal3 integration failure; it looks like something in our script to setup the config (it's applying upper-constraints to the install of ironic-lib from git and failing)17:55
dtantsurthought it has been fixed by https://github.com/metal3-io/ironic-image/pull/43417:57
JayFdtantsur: is there a way for me to see the sha of the image it should be getting?18:00
JayFif it's intermittent, I wonder if there's a stale image cached somewhere18:00
JayFI have the sha in the logs it has, I just don't know the cncf/2023 way to see what the correct image sha I want is 18:01
JayFTheJulia: dtantsur: NGS launchpad is fixed19:05
TheJuliaawesome19:45
TheJuliaAny chance I can get a review or two on https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/901213 specifically so we can fix the snmp job on the main gate for now20:04
iurygregoryTheJulia, +2 lgtm 20:08
stevebaker[m]good morning20:24
iurygregorymorning stevebaker[m] o/20:33
iurygregoryregarding the idea to try to speed up our multipath checking (since we have the crazy scenario with 84 disks with 4 paths each)   https://review.opendev.org/c/openstack/ironic-python-agent/+/902012 this is a WIP on how I think we should try to handle things, not sure if it makes sense; if anyone can provide feedback I would appreciate it o/ 20:54
TheJuliao/ stevebaker[m]  https://meet.google.com/ady-rbqz-uia20:59
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be restarting momentarily for a patch update to address a recently observed regression preventing some changes from merging21:09
TheJuliastevebaker[m]: https://www.compart.com/en/unicode/U+00FF <-- all sadness in efi nvram entry handling code due to this character21:37
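For reference, a small sketch of how U+00FF looks in a few byte encodings. The log does not say exactly how this character breaks the handling code, so this only illustrates that its byte form depends heavily on the encoding, and that a lone 0xFF byte is never valid UTF-8. The helper is_valid_utf8 is a hypothetical name.

```python
ch = "\u00ff"  # LATIN SMALL LETTER Y WITH DIAERESIS

utf8 = ch.encode("utf-8")        # two bytes: 0xC3 0xBF
utf16 = ch.encode("utf-16-le")   # two bytes: 0xFF 0x00
latin1 = ch.encode("latin-1")    # one byte: 0xFF


def is_valid_utf8(raw: bytes) -> bool:
    """Return True when raw can be decoded as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for name, raw in (("utf-8", utf8), ("utf-16-le", utf16), ("latin-1", latin1)):
    print(name, raw.hex(), "valid utf-8:", is_valid_utf8(raw))
```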
TheJuliaiurygregory: two thoughts added21:49
iurygregoryTheJulia, tks!21:50
opendevreviewMerged openstack/ironic-tempest-plugin master: Add snmp variant of ramdisk iso boot test  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/90121322:01
