Thursday, 2023-05-04

opendevreviewMerged openstack/ironic-tempest-plugin master: Secure RBAC Test  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/84242700:35
rpittaugood morning ironic! o/07:13
dtantsurrpittau: morning, could you check https://review.opendev.org/c/openstack/ironic-python-agent/+/879156 please?07:18
dtantsurI think this is the last bit that metal3 needs from us.07:18
rpittaudtantsur: I'll have a look at it now07:18
rpittauapproved07:20
dtantsur\o/ thx07:20
dtantsurTheJulia: bifrost also uses md5, we broke it as well...07:22
opendevreviewDmitry Tantsur proposed openstack/ironic-python-agent master: Revert disabling MD5 checksums  https://review.opendev.org/c/openstack/ironic-python-agent/+/88223607:28
dtantsurTheJulia, JayF, I'm sorry, but this ^^^ is the only reasonable path forward.07:28
dtantsurIf Red Hat has issues with it, I'll be happy to clarify in an internal discussion.07:28
dtantsurTheJulia: re your question about raw conversion. I've realized it's only the case for image_download_source=local.07:36
dtantsurNeither bifrost nor metal3 set that.07:38
opendevreviewMerged openstack/sushy master: Retry on ilo state error  https://review.opendev.org/c/openstack/sushy/+/88054208:49
opendevreviewMerged openstack/ironic-python-agent master: Add network interface speed to the inventory  https://review.opendev.org/c/openstack/ironic-python-agent/+/87915609:04
opendevreviewVerification of a change to openstack/ironic master failed: Support longer checksums for redfish firmware upgrade  https://review.opendev.org/c/openstack/ironic/+/88216309:22
opendevreviewVerification of a change to openstack/ironic master failed: Support sha256/sha512 with the ilo firmware upgrade logic  https://review.opendev.org/c/openstack/ironic/+/88216409:22
zigoIn my nova-compute.log, I get:09:31
zigonova.compute.manager nova.exception.NoResourceClass: Resource class not found for Ironic node <node-uuid>.09:31
zigoWhat should I check to fix?09:31
dtantsurzigo: does the node have its resource_class field populated?09:35
zigodtantsur: How do I check? The node is still in status "enroll" ...09:41
zigos/status/Provisioning State/09:41
dtantsurthen you can probably ignore the error09:43
dtantsurbut it's just a field, you can see it in $ baremetal node show09:43
zigodtantsur: No, I can't.09:51
zigoFYI, that's under Zed.09:51
zigoWhy isn't my node switching to Provisioning State: available ?09:52
kaloyankzigo: you have to move it to manageable with $ baremetal node manage <node-name>11:37
zigoAh, thanks.11:37
kaloyankthen move it to available with $ baremetal node provide <node-name>11:37
kaloyankwith the first operation Ironic will verify that it can control the machine via its management interface (IPMI, Redfish)11:37
zigokaloyank: The IPMI password isn't set, I don't think Ironic has it. Will it make a login / pass itself?11:38
kaloyankand with the second it will try to "clean" it, that is prepare it to have instances scheduled on it11:38
zigoIt's currently the default Dell thingy ... ( root / calvin ...).11:38
kaloyankzigo: no, it's required that you configure it, Ironic can't guess it11:38
zigokaloyank: Isn't there a way to make it set the login / pass randomly for me?11:39
zigoThe node is then back to enroll, probably because of this.11:39
dtantsurHow will it work with random credentials?11:40
zigodtantsur: That's what I did on *MY* custom baremetal provisioner: simply, my controller SSHes into the node to provision, and runs "ipmitool user set 2 password <BLA>"11:41
dtantsurAt the very least, you need to somehow SSH into the node.11:41
zigoYeah, which is why I have dropped a public key in my live-build Debian image ...11:42
dtantsurYou can, of course, extend IPA to do it for you. Some time ago, I wrote a whole blog post on why we don't do it in Ironic https://owlet.today/posts/setting-ipmi-credentials-the-history/11:42
kaloyankzigo: I don't get what you're frustrated about, it's perfectly fine to tell Ironic how to authenticate to the machines' BMC11:43
zigodtantsur: IMO, this would be a job for IPA, indeed, and then it could report the login / pass when doing the hardware introspection.11:43
zigokaloyank: Well, it's not automated, it's a manual procedure which I would like to avoid.11:43
dtantsurzigo: see the blog post for how you can do it via a site plugin (and why we won't do it)11:43
* zigo reads11:44
zigoThanks for the link.11:44
zigodtantsur: The "Channels and Numbers:" is IMO not a valid point, as mostly, user 2 and channel 1 works in most hardware, and we could have some kind of exceptions depending on vendors.11:47
zigoI have setup literally thousands of servers and it was never an issue.11:47
zigo"Error Handling" never was an issue for me either. As the PXE server ssh into the server after populating its db, if it fails, then the old server remains, just the PXE server db contains the wrong info, and later on, a "ipmi apply" can fix things.11:49
zigoThen for "Security", well, what's the issue if something pretends to be a server to provision, and we give it a random password?11:51
zigoYou're doomed anyway if this happens...11:51
kaloyankzigo: I still don't get what's bothering you regarding Ironic11:51
zigokaloyank: Let's say I have 36 servers to setup (real life experience: I already did this...), I don't want to manually log into each of them one by one to setup IPMI credentials, and inform Ironic about it. I need this to be automated.11:52
dtantsurzigo: that's a YMMV case. We've seen issues I describe there in practice.11:54
kaloyankIMO, it's basically a couple of for loops in bash but again, new contributions are always welcome, so feel free to hack around and submit your work :)11:55
zigodtantsur: IMO, if it's controversial, it's easy to fix: make the feature available, but disabled by default.11:55
zigodtantsur: You probably will see me opening such a PR in the near future.11:55
dtantsurzigo: you can give it a go, I'd suggest presenting the design to the community first11:56
zigodtantsur: I'm not there yet, I need my PoC to be fully functional first ! :)11:56
zigoKnowing there's no consensus, I'll make sure to write a spec for approval first then.11:57
* zigo jumps into a (boring) meeting now...11:57
iurygregorygood morning Ironic12:02
rpittaummmm seems the rbac enforced job is broken bad12:19
rpittauyep, very broken12:21
iurygregorymay the force be with you rpittau =)12:22
rpittauiurygregory: may the Force be with us all!12:22
iurygregoryyes!12:22
rpittauI see the most recent change in our tempest plugin is very recent https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/84242712:25
rpittauTheJulia: probably need help from you for this12:26
rpittauyeah, looks like one of the new tests is failing hard test_reader_cannot_create_node12:27
rpittauI think the expected exception is not correct, shouldn't that be 403 ?12:34
* iurygregory look12:34
rpittauthere are other 2 tests failing, most likely same problem, wrong exception, let's see if I can put together a quick patch12:35
iurygregoryrpittau, humm yeah to me it makes sense that is 403 - forbidden12:35
rpittauI have a doubt on test_reader_cannot_get_indicator_state, the other two are clear12:41
iurygregoryif it should be 403 also?12:44
rpittauit gives ServerFault but I'm not sure that's the correct exception, it may be the function that is wrong12:44
opendevreviewLana Kaleif proposed openstack/ironic master: Fix anaconda stage2_id loading from image properties  https://review.opendev.org/c/openstack/ironic/+/88228512:45
iurygregorywe need to look at our policy code I think, but from the description of the test "Reader cannot get indicator state" sounds like a 403 instead of 404 12:45
rpittaummm not sure, it should not be tehre, so 404 sounds correct12:47
iurygregoryto me it sounds like the indicator might be there, but the reader can't get the information12:49
opendevreviewRiccardo Pittau proposed openstack/ironic-tempest-plugin master: Fix rbac tests  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/88228813:01
rpittauTheJulia and others please have a look ^13:01
dtantsurrpittau: do you have an idea why it was not caught in the CI initially?13:05
iurygregoryI was wondering the same thing ^13:14
rpittauI don't see the tests running in the tempest plugin CI with the rbac enforced, that's why13:19
rpittauat least looking at the change that enabled the tests https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/84242713:20
rpittauam I wrong ?13:21
* TheJulia is just starting to wake up13:23
TheJulia... I could have sworn those tests were getting run previously13:57
TheJuliabut it looks like we only have scenario jobs on tempest now13:58
TheJuliano general testing...13:58
TheJuliadtantsur: ack14:02
TheJuliaw/r/t md514:02
TheJuliarpittau: got a link to the job where we noticed the tempest change breaking? The only guess I have is that somewhere between the change being started and now, we dropped the test from tempest itself?!14:02
rpittauTheJulia: here's one https://zuul.opendev.org/t/openstack/build/9eee93c64bc44a7d81fd390b470e4f1f14:03
TheJuliasweeet, thanks14:04
TheJuliaahh, it is just the flag14:13
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Advance tempest plugin tests to Zed  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/88231114:16
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Add RBAC specific tempest jobs to gate plugin  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/88231214:16
TheJuliaif zed is good, I'll add 2023.114:17
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Advance tempest plugin tests to Zed  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/88231114:36
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Add RBAC specific tempest jobs to gate plugin  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/88231214:37
TheJuliahmmmmm15:31
dtantsurany objections if I approve https://review.opendev.org/c/openstack/ironic-python-agent/+/882236 then? I have no problems with phasing out md5, but I'm afraid we've rushed it a bit.15:32
TheJuliaApproved, let's please actually change that and not forget it next year15:33
JayFhttps://review.opendev.org/c/openstack/ironic-specs/+/881509 trivial, needs 1 more +2A15:36
dtantsurTheJulia: I'll drive the conversation on the Metal3 side if someone can take Bifrost15:36
JayFTheJulia: https://review.opendev.org/c/openstack/ironic-specs/+/874189 you wanna resolve nits here, or land it and you'll fix in a followup?15:37
dtantsurTheJulia, JayF, we also need to make some significant noise about the MD5 deprecation. I don't think we even had an ML announcement?15:44
dtantsurI'm afraid a lot of people keep using md5 just because it has always been there15:45
JayFdtantsur: TBH; I'm not sure I realized it was likely to be breaky, beyond upgrade level instructions, until recently15:45
JayFdtantsur: so I'm not certain what the best way to communicate this is15:45
dtantsurWe cannot do much beyond release notes (already in place, but probably worth repeating in Ironic) and the ML15:46
opendevreviewMerged openstack/ironic-specs master: Update spec template to reflect launchpad move  https://review.opendev.org/c/openstack/ironic-specs/+/88150915:56
rpittaugood night! o/16:11
TheJuliaI guess the challenge is openstack as a whole moved past md5 checksums a long time ago16:19
TheJuliawe lagged by continuing to support md5 explicitly16:19
TheJuliaso to the openstack community as a whole, it seems like a moot topic since the enhanced checksum stuffs have been around for years16:20
TheJuliaI guess I feel like we're just punting a can down the road and without a forced means of change/improvement we're going to be revisiting it again, and again, since we're allowing others to not have to change/evolve/improve their own situations and they can inherently just kick the can down the road16:27
TheJuliaI think it was also first viewed as "an option to re-enable is acceptable" and then realizing metal3 has it as part of their public api, then we hit a "oh, well, crap" sort of situation16:28
TheJuliait has been a topic for a couple years, and now we're likely not going to remove it for another ?2? Granted, I don't care about removal as much as I do about disabling by default. At the end of the day I'll just shrug though.16:36
JayFdtantsur: I do wanna specifically point out: the problem here is not our deprecation (md5 can be re-enabled by operators or downstream packagers) but metal3 wanting to reuse our IPA deliverable16:38
JayFdtantsur: that to me makes it less of a clear delineation of "Ironic broke an API" vs just two projects that intersect making different decisions w/r/t timing16:38
dtantsurs/metal3/anyone/ and s/our IPA deliverable/any IPA deliverable/16:39
dtantsurnow it's fair16:39
TheJuliathe inherent challenge is we do control bifrost's code, we can fix stuff there fairly quickly16:39
dtantsurwe *expect* people to use our IPA deliverables or someone else's like RDO16:39
JayFMetal3 and Bifrost are the only Ironic consumers that require MD5 right now; we can fix Bifrost.16:39
dtantsur[citation needed]16:39
TheJuliano we don't, the deliverables are published for testing only16:39
JayFI can't prove a negative.16:39
dtantsurTheJulia: that's a very different take from what I had in mind when I created these images16:40
TheJuliayou can, but there is no warranty to the artifacts16:40
JayFThis is not the best path to solving the problem; the images exist, many people use them regardless of what we think about them.16:40
dtantsurI would expect a lot of people to just casually set image_checksum because it's there16:40
dtantsurand because Red Hat's business around FIPS has nothing to do with them16:40
JayFbut if you use the prebuilt images, you lose quite a bit of customization -- that's the tradeoff16:40
JayFthat would include the ability to re-enable md5 16:40
JayFI'm not saying we need to flip the switch back today or tomorrow; I'm saying we need an actual plan and a timeline to do it16:41
dtantsurWe could just follow our standard and agreed deprecation process16:41
JayFand to consider that ironic and metal3 are not joined tightly enough to do many coordinated deprecations like this without significantly slowing up both projects16:41
TheJuliadtantsur: complicated by the tick/tock impact now16:41
dtantsurHold on please. It's not just metal3. It's possibly THE MAJORITY of our consumers16:41
dtantsurYou keep ignoring the fact that absolutely nothing has motivated people to migrate away from MD5 before this Monday.16:42
TheJuliaAnyone using a recent glance would have just worked since the extra checksums have been around for quite a long time16:42
dtantsurOkay, the majority of non-Glance customers :)16:42
JayFdtantsur: So I put metal3 in a different bucket here; because AIUI there's an API tie-in there. The usual operator experience would involve needing to update the checksums in their images in OpenStack's API, which is hefty but not out of the norm for an upgrade requirement16:42
TheJuliaAnd we've provided the pass-through, just nobody adopted it16:42
TheJuliathe underlying issue is, we never added support for the extended use of the checksum field until now16:43
TheJuliaso, there is an easier path now16:43
dtantsurYep, metal3 is kinda special, but we cannot assume how much effort it is for the operators to change16:43
TheJuliaso... maybe that will help16:43
JayFdtantsur: I like the lens you're using; but I struggle with how to quantify that16:43
JayFdtantsur: and making decisions for a spooky mysterious operator I can't quantify is really difficult without just defaulting to 'brakes on at all times' 16:43
dtantsurI routinely talk to operators who are overworked, underpaid, and have to deal with a whole cloud alone16:43
dtantsurYour exposure may be to people of Yahoo scale16:44
TheJuliadtantsur: that is an unfortunate but common theme16:44
dtantsuryep16:44
JayFThis would be a pain in the rear for any environment I've worked in dtantsur :-( 16:44
TheJuliathings cannot be painless forever16:44
dtantsurSo when you start with "Hey, we broke you.." they may smash the monitor before you get to ".. but it's easy to fix" :)16:45
JayFdtantsur: FWIW; I've never worked on an openstack cloud, regardless of scale, that had enough people to maintain it properly16:45
TheJuliapain needs to be in logical chunks which can be navigated16:45
TheJuliaotherwise techdebt just gets piled on16:45
dtantsurRight. I'm by no means advocating to never remove MD5 or anything. FWIW, I've removed the iSCSI deploy :)16:45
JayFSo how would you all feel about me marking this as a topic for Vancouver PTG?16:45
dtantsurBut if our own removal has caught ourselves by surprise.. it's a bad sign.16:45
* JayF suspects dtantsur would not be there and it might not be super useful16:46
TheJuliawe also gave iscsi deploy an exceptionally long time on a boat before being pushed into the water and lit ablaze :)16:46
dtantsurJayF: I won't be there, but if you have an ops feedback session, it's good to mention the MD5 situation there16:46
dtantsurTheJulia: yep, I'm not advocating for exercising so much care as with iSCSI.16:46
TheJuliacan we get a graphic of me being an assassin chasing after md5 ?!?16:46
dtantsurBut even we ourselves need some graceful period.16:46
dtantsurOH16:46
* TheJulia wants artwork!16:46
dtantsurWe SO MUCH need an artist here16:47
JayFIt sounds like, if we are going this route, we might want to implement the configure-it-via-conductor path16:47
JayFbecause it sounds like we're going to want to give people a more easy freedom to flip it on/off16:47
TheJuliaJayF: already proposed, needs to be revised now since it forces a setting16:47
JayFack16:47
JayFthat to me makes this all... less interesting16:47
JayFbecause regardless of which way we fall; it's a single conductor config value to change behavior16:47
dtantsurIt's temporary. We have lived with MD5 for a long time, we can live with it for 6 months more.16:48
TheJuliawell, API lookup pass reply, but yeah16:48
TheJuliadtantsur: year+16:48
TheJuliawe can't remove it until 2024 under tick/tock16:48
dtantsurDepends on whether you want to respect the tick-tock thingy.16:48
TheJuliayeah16:48
TheJuliatrue16:48
TheJuliaI don't really want to16:48
dtantsur*I* personally don't have any stakes there as long as I have a few months to adapt Metal3 and someone fixes Bifrost.16:48
TheJuliabut I'm fine having an option and actually not explicitly removing as long as there is a knob that ends up with default false one day16:49
dtantsuronce it's false, we can plan eventual removal16:49
TheJuliabut first... why does grenade hate us16:49
* TheJulia files this next to "why does god need a starship"16:49
dtantsuryep. and the inspector grenade seems borked too16:49
TheJuliaso, in general, it *looks* like we start the network validation, but then things go sideways16:50
TheJuliawe don't get back to re-querying the image ref data so we definitely die somewhere in there16:50
TheJuliajust no debugging and no exception16:50
dtantsurTo the summit topic, I don't expect budget considerations to allow a lot of traveling for me in the near future.16:50
dtantsurSo unless it's somewhere close... we won't have a chance to discuss md5 over a shot of whiskey anytime soon16:50
TheJuliaWell, first we need glasses, neat or on the rocks16:51
TheJulia:)16:51
TheJuliaIt is looking like my travel is also going to be highly constrained16:52
dtantsurProsperous times are over in the cloud world16:52
JayFTheJulia: dtantsur: We do not have the choice to ignore SLURP. First of all; how can we complain about operators not having enough time AND want to skip something designed to make their life easier in the same convo?16:54
dtantsurWell, we started with "let's switch it off effective immediately"16:54
JayFTheJulia: dtantsur: but on top of that; we are an OpenStack project, TC resolutions say we need to respect it, so we should regardless of personal opinions.16:54
JayFdtantsur: well, if we're going to go the route of slow we should be properly-slow. I didn't realize such a change was as disruptive/controversial as it seems to be so my opinion is softening :) 16:55
TheJuliaThe US is starting to look like https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabdc3765-35e2-4382-8a27-143484c06ba1_1610x998.png and 16:55
TheJuliahttps://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed1d68f-3552-41b8-b726-1b72d8285b71_1582x984.png for me16:55
TheJuliaso travel moving forward is going to be... difficult even if there is budget16:56
dtantsuroh yeah. sorry to hear :(16:56
TheJuliac'est la vie16:56
TheJuliainteresting, there seems to be a multi-second lag between the logs17:02
TheJuliamaybe that might be why they are so confusing17:02
opendevreviewVerification of a change to openstack/ironic-lib master failed: Add jsonrpc client port capability  https://review.opendev.org/c/openstack/ironic-lib/+/87921117:16
opendevreviewJulia Kreger proposed openstack/ironic master: DNM: Attempting to torubleshoot grenade  https://review.opendev.org/c/openstack/ironic/+/88234717:40
TheJuliaWorth a try.... next step is adding debug logging to nova17:41
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Advance tempest plugin tests to Zed (mostly)  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/88231118:00
JayFBorderline-off-topic: I'm working on tshirt/sticker ideas around the theme of "Ironic Bear Metal Antelope Tour 2023.1" -- does this fit the bill? https://cdn.discordapp.com/attachments/306957680734371840/1103752466538975262/TastyMcRib_a_bear_playing_guitar_in_front_of_a_crowd_of_antelop_8516d96f-e297-4e0d-a488-801e2e945e04.png18:39
JayFOh, we need to add a grenade-skip-level job18:58
JayFI think iurygregory had a PR up for that a cycle early18:58
JayFit'd be interesting to see if it fails similarly to the grenade job now18:58
sschmittAnyone here try running the networking-baremetal ml2 driver along with another ml2 driver and vlan-transparency? Seems like since networking-baremetal doesnt specifically support this, it prevents any network from being created with vlan-transparency 19:33
TheJuliaJayF: dunno, I'm not a fan of big prints20:17
JayFto get the colors right, it'd likely have to be stickerified20:18
TheJuliayeah20:18
JayFWDYT about something along those lines (I've been trying, not so successfully, to refine it) for a sticker?20:26
TheJuliacould work. Anyway we could get a more metaly guitar? ;)20:32
JayFdo you have discord? I can invite you to my discord and you can see all the cursed things I've generated trying to improve that LOL 20:32
TheJuliaI do not20:33
TheJulia(have discord that is)20:33
JayFhttps://media.discordapp.net/attachments/306957680734371840/1103757136057610260/TastyMcRib_a_bear_and_an_antelope_playing_on_dueling_electric_g_7f1e1ef7-c1ba-4dcb-be64-51ec6e1205c6.png being an example lol20:33
JayFalso https://media.discordapp.net/attachments/306957680734371840/1103755810955341946/TastyMcRib_a_bear_playing_guitar_in_front_of_a_crowd_of_antelop_bbe1693f-dbf2-4fdb-b468-dfb2427ec8e4.png if I could touch up the eyes20:33
* TheJulia wonders where a third arm came from...20:33
JayFmidjourney doing midjourney things lol20:34
TheJuliaheh20:34
NobodyCamhahahahaha why is inspect wait in this list???20:44
NobodyCam`Node <UUID> can not have management_interface updated unless it is in one of allowed (enroll, inspecting, inspect wait, manageable, available) states or in maintenance mode. (HTTP 409)`20:44
NobodyCamand inspecting20:44
NobodyCamseem odd to allow update in those states20:45
NobodyCamhehehe any who Good AfterNoon Ironic Folks20:46
TheJuliabecause you can have introspection rules to change settings20:48
NobodyCamahh 20:49
iurygregoryyay, I'm getting "We are sorry! Your provided token seems to be void. Please check your email inbox for a newer one."22:30
iurygregory.-.22:30
JayFFrom where? :(22:31
iurygregorythe forum submission was accepted, so in the email in step one, they ask me to confirm .-.22:31
JayFah22:31
JayF#openinfra-events might be able to help22:32
iurygregorysince now I have my flight tickets, I was going to confirm things :D22:32
JayFwoo freakin hoo!22:32
iurygregoryno hotel yet, but not worried about it :D 22:32
iurygregoryI just sent an email to speakersupport 22:32
clarkbiurygregory: I've been told they see the email and will be in touch (probably tomorrow?)22:37
iurygregoryclarkb, tks!22:38
