Wednesday, 2022-12-07

vanougood morning ironic00:58
opendevreviewAija Jauntēva proposed openstack/sushy master: Fix volume deletion on newer iDRACs  https://review.opendev.org/c/openstack/sushy/+/86484507:53
rpittaugood morning ironic! o/08:36
kubajjGood morning everyone 09:16
kubajjrpittau: is it possible to recheck just one zuul pipeline?11:26
rpittaukubajj: unfortunately that's not possible11:26
iurygregorygood morning Ironic11:29
kubajjgood morning iurygregory 11:29
kubajjIf I want to recheck, do I just leave a comment, a review with just recheck in it or is there something special?11:30
rpittaukubajj: just add a comment with recheck at the beginning and possibly a reason for the recheck11:31
rpittauwhat's the patch ?11:31
kubajjhttps://review.opendev.org/c/openstack/ironic/+/86605611:31
kubajjZuul says Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance11:32
kubajjand I don't think that I modified anything that should cause a problem with partition image11:33
iurygregoryyou can add "recheck zuul retries"11:34
iurygregorythis would trigger recheck on the patch, and you provided the reason why you would like to do it =)11:34
rpittaukubajj: I agree, it doesn't look like the failure is due to your change11:34
kubajjthanks, will do11:36
opendevreviewJonathan Rosser proposed openstack/ironic master: Fix debug log message argument formatting  https://review.opendev.org/c/openstack/ironic/+/86685613:56
opendevreviewRiccardo Pittau proposed openstack/ironic master: Fix unit tests for Python 3.11  https://review.opendev.org/c/openstack/ironic/+/86686114:12
jandersdtantsur sorry I missed the ping about https://review.opendev.org/c/openstack/sushy/+/866612 - missed the boat now, but LGTM, thanks! 14:23
jandersOut of curiosity, what hardware needed this fix?14:23
dtantsurjanders: some ARM stuff, I pinged you downstream14:24
jandersdtantsur ACK14:25
kubajjdtantsur: what are the links that the controllers return?14:37
dtantsurkubajj: these are for navigation between API resources. Since your new endpoint is leaf (nothing under it), I think you can omit links for now.14:38
dtantsure.g. a node endpoint links to states (/v1/nodes/<node> to /v1/nodes/<node>/states)14:39
kubajjthanks14:39
TheJuliagood morning folks14:43
kubajjdtantsur: just to make sure, I need just get_one and it should just return the data14:50
dtantsurkubajj: yep14:50
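A minimal sketch of the link idea discussed above, assuming a simple dict-based response. The helper names (`node_with_links`, `leaf_get_one`) and the base URL are hypothetical and are not taken from Ironic's code. A parent resource carries navigation links to what sits under it, while a leaf endpoint just returns its data.

```python
# Hypothetical helpers illustrating navigation links between API resources.
# Names and the base URL are assumptions for illustration only.

def node_with_links(base_url, node_ident):
    """Build a node representation that links to its child 'states' resource."""
    node_url = f"{base_url}/v1/nodes/{node_ident}"
    return {
        "uuid": node_ident,
        "links": [{"rel": "self", "href": node_url}],
        # Link from /v1/nodes/<node> to /v1/nodes/<node>/states
        "states": [{"rel": "self", "href": f"{node_url}/states"}],
    }


def leaf_get_one(data):
    """A leaf endpoint has nothing under it, so it returns the data without links."""
    return dict(data)


if __name__ == "__main__":
    print(node_with_links("http://ironic.example", "node-1"))
    print(leaf_get_one({"inventory": {"cpu": {"count": 8}}}))
```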
rpittauneed to split, see ya tomorrow o/16:02
kubajjdtantsur: I think I figured out why we had the discussion about the get_by16:13
kubajjget_by_node_uuid in object definition16:14
opendevreviewJakub Jelinek proposed openstack/ironic master: WIP: API for node inventory  https://review.opendev.org/c/openstack/ironic/+/86687616:16
jrosserTheJulia: following up from my difficulty with debug=True yesterday i found a suspicious debug log https://review.opendev.org/c/openstack/ironic/+/86685616:44
jrosserit was a little tricky to find because the exception handling here doesn't output the stack trace https://github.com/openstack/ironic/blob/b34d79e3f440c408520f24a2263dc587ac205ee2/ironic/drivers/modules/agent_base.py#L539-L54516:44
JayFI don't /think/ that should be a functional change16:45
JayFbut imbw16:45
jrosseri'm now stuck in a bit of a loop where i can't get cleaning to succeed https://paste.opendev.org/show/bTFmRIKizVxJwO8E14Ww/16:46
JayFAre you using any kind of custom code in that?16:47
JayFOther than the above patch?16:47
jrosserno, this is stable/yoga16:48
JayFThis looks extremely broken16:48
JayFCan you reproduce that with IPA in debug mode?16:48
jrosseris there a correct way to "start again" after getting to clean failed state16:49
jrosseri am wondering if i'm doing that wrong16:49
JayFSo if it's in clean failed16:50
JayFyou should be able to run16:50
JayFbaremetal node blah manage16:50
JayFbaremetal node blah provide16:50
JayFyou have to go from:16:50
JayFclean failed -> manageable16:50
JayFthen from manageable -> cleaning -> available16:50
JayFif you're working with automated cleaning16:50
JayFso just manage, provide16:50
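The recovery path above can be sketched as a small transition table. This is a simplified, assumed subset of the provision states and events mentioned in this log, not Ironic's actual state machine. The table contents and the `InvalidTransition` class are illustrative.

```python
# Simplified, assumed subset of provision-state transitions (illustration only).
TRANSITIONS = {
    ("clean failed", "manage"): "manageable",
    ("manageable", "provide"): "cleaning",
    ("cleaning", "wait"): "clean wait",
    ("clean wait", "resume"): "cleaning",
    ("cleaning", "done"): "available",
    ("cleaning", "fail"): "clean failed",
    ("clean wait", "fail"): "clean failed",
}


class InvalidTransition(Exception):
    """Raised when an event is not valid in the current state."""


def process_event(state, event):
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise InvalidTransition(
            f"event {event!r} is not valid in state {state!r}"
        ) from None


if __name__ == "__main__":
    state = "clean failed"
    for event in ("manage", "provide", "done"):
        state = process_event(state, event)
        print(event, "->", state)
```

In this toy model, an event that has no entry for the current state is rejected rather than silently ignored.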
jrosserhmm ok thats what i'm doing16:50
JayFSo lets do a couple of things:16:51
JayFDo your "manage"16:51
JayFget a dump of the node object16:51
JayF(/v1/nodes/detail?uuid=blah or whatever the node-show-detail command is in OSC)16:51
JayFthen do a provide16:51
JayFwhen it breaks, get a node detail again16:51
JayFtoss all that into a pastebin and we should have the info we need to find the problem, I hope16:51
jrosserok i'll try that16:52
JayFTheJulia: ^^ uh, it looks like cleaning could be crazy-broken on stable/yoga  --- we are trying to call "wait" state verb on a node in cleanwait state16:52
* TheJulia tries to digest16:54
TheJuliawhat the....16:56
JayFyeah, that's what I was thinking lol16:56
TheJuliayeah, we're going to need state data for the fields before/after I think17:00
TheJuliajust so we're not spinning endlessly17:00
* jrosser not sure how to get node detail17:01
JayFbaremetal node show $node_uuid --detail # I think ? 17:01
opendevreviewVishal Manchanda proposed openstack/ironic-ui master: [DNM] Test CI jobs status  https://review.opendev.org/c/openstack/ironic-ui/+/86688017:06
jrosserJayF: before i run the 'provide' is this sufficient detail? https://paste.opendev.org/show/bkMzulFg0D4nm9zZqB41/17:18
JayFperfect17:18
jrosseri can't find a way to get detailed output from 'baremetal node show'17:18
JayF--long might be it17:18
JayFyou have the fields I  care about :D 17:18
JayFTheJulia: ^^ is clean_steps supposed to be erased on the failed -> manageable transition, or when cleaning restarts?17:19
TheJuliait is just baremetal node show17:19
TheJuliaJayF: I believe it is17:20
TheJuliauhh... both I think17:20
JayFSo I'll note; node.clean_steps in the above paste is still populated17:21
JayFwe'll wait to see the output but that looks sus17:21
TheJuliaThat seems like it is a problem then17:22
TheJuliabut it has been ~2 years since I was looking at that code17:22
JayFsame, but longer17:22
JayFand I don't know what coulda busted it in Yoga17:22
TheJulialikewise17:22
TheJuliathis seems super weird17:22
TheJuliabecause we run CI in debug too17:23
JayFjrosser: information about how you have installed Ironic would be useful too -- pip installed? Using packages? etc17:23
TheJuliabut... maybe it got orphaned there earlier on and the presence short circuits things17:23
TheJuliawhich *does* feel like we've had that come up17:23
jrosserit is installed with pip from git using openstack-ansible17:23
jrosserthis feels less directly related to debug than my trouble yesterday, but i may be wrong there17:24
jrosserunless i have some bogus state now from it encountering the exception17:25
TheJuliawell, I think you hit an unexpected exception17:26
TheJulialets see, what was that exception17:26
jrosserok here is the node detail after i get clean failed https://paste.opendev.org/show/bPcYy3RTxwWFSThLUVVl/17:27
TheJuliaalso, sorry for not being really around late yesterday/early today. Board stuffs17:28
TheJuliaouch17:31
TheJuliaso yeah17:31
TheJuliayou hit a typeerror, which bombed out things really abdly17:31
TheJuliabadly17:31
JayFIf we can't recover from that in Ironic-proper; it's still our bug to fix that up, yeah?17:31
JayFmanage/provide should be resetting enough context on the node to make it work17:32
JayFI just can't tell what is broken where, in the output17:32
TheJuliayeah, I'm not that far yet17:34
TheJuliamother in law just called17:34
TheJuliaerr17:34
TheJuliano, step mother17:34
* TheJulia gets the two confused17:34
TheJuliaoh!17:44
TheJuliaI see what is going on17:44
TheJuliaso clean step overrides are loaded17:44
JayFI'm not sure I follow?17:45
TheJuliahttps://github.com/openstack/ironic/blame/stable/yoga/ironic/conductor/steps.py#L18717:45
JayFI'm still not sure how that could cause the failures we're seeing17:47
TheJuliaI think we just need a general exception catch in do_node_clean17:47
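A rough sketch of what a general exception catch around a clean step could look like, assuming a plain dict stands in for the node. The function name `run_clean_step` and the dict fields are hypothetical; this is not the content of the patch proposed later in this log. The point is that an unexpected error (such as a TypeError) is logged with its stack trace and the node lands in a well-defined failed state.

```python
import logging

LOG = logging.getLogger(__name__)


def run_clean_step(node, step):
    """Run one clean step; on any unexpected error, log it and fail the node.

    `node` is a plain dict here and `step` is any callable taking the node.
    Both are stand-ins chosen for illustration.
    """
    try:
        step(node)
    except Exception as exc:
        # LOG.exception records the message together with the stack trace.
        LOG.exception("Unexpected error while cleaning node %s", node["uuid"])
        node["provision_state"] = "clean failed"
        node["last_error"] = f"{type(exc).__name__}: {exc}"
        return False
    return True


if __name__ == "__main__":
    def broken_step(node):
        # Mixing str and int triggers a TypeError, as an example failure.
        return "5" + 5

    node = {"uuid": "node-1", "provision_state": "cleaning", "last_error": None}
    print(run_clean_step(node, broken_step), node)
```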
jrosseryes i have [conductor] clean_step_priority_override = deploy.erase_devices_express:517:48
JayFSo that's the original typeerror17:49
JayFbut I don't think that explains why it's failing in a recurring method now17:49
TheJuliayeah, that is what I'm struggling with17:54
TheJuliado we have the latest error?17:54
JayFhe showed a node log of the error we see in the last node17:55
JayFlike well above17:55
JayFit appears to be failing reliably in the same way17:55
jrosseryes it is the same, this is from my last attempt just now https://paste.opendev.org/show/bUASWfgCz9kRVzd1fvRn/17:56
jrosseri can see in the console of the node that the express cleaning step complete with result: None17:57
TheJuliaoh!17:59
TheJuliaI think i know what is going on17:59
TheJuliajrosser: any chance we can get like the hundred lines before that in the logs ?18:04
jrossersure18:04
jrosserthis is from the point that the node was powered on https://pastebin.com/raw/Dn0AwziH18:14
JayF17:20:3718:22
JayFhmm, nope, I'm just reading multiple logs about the same thing as if it was multiple different logs18:23
JayFit was already in clean wait after it powered on, it never transitioned back to cleaning at any point, but yet wait was called18:24
JayFI wonder if this is some kind of weird edge case in DRAC cleaning + clean step override18:24
jrosseroh the drac stuff is just noise, there are a bunch of dells as well but this isn't one of them18:24
* JayF burns his hypothesis18:25
TheJuliayeah, this is bizarre18:33
JayFjrosser: you 100000000% sure this is unpatched, up to date stable/yoga?18:36
JayFjrosser: maybe a pip freeze from inside the venv in the container, if you can do that?18:37
JayF17:20:37 is the only state transition18:38
JayFwhy it is trying to wait a cleanwait node???18:38
jrosserhttps://paste.opendev.org/show/baKpxWfnQ18WzAMBXvjl/18:38
jrosserit should be ironic from commit 5fc42c4118:40
JayFyeah, and that's the latest yoga release18:40
JayFI'm looking at git commit logs for something to blame lol18:40
JayFthis is one of the most clearly broken things I've seen with Ironic, and I can't figure out why lol18:41
jrosseri need to fork ironic now anyway to apply my debug log patch repeatably, so i can redeploy all this fresh tomorrow18:41
JayFwell, if it stopped breaking after you redeployed18:41
JayFI'd be even more confused and upset lol18:41
TheJuliaeh, everything with weird cleaning stuff is... weird18:43
JayFhttps://github.com/openstack/ironic/commit/8034242c225f3293c08ca46dc588d00c5ad0e10a is the only commit I can even see in yoga that I'd be sus over18:43
JayFand I can't draw a line between those changes18:43
jrosserand just for completeness i tried again with debug=False and it's totally nothing to do with that18:43
JayFyeah my hunch is somehow garbage was left in the node cleaning information18:44
JayFfrom the typeerror18:44
JayFand that is causing consistent failures18:44
JayFso there are 2x things to fix: the root cause, the typeerror, and our recovery semantics around bad data in cleaning metadata18:44
JayFbut I'd really like to track it through the code18:44
jrosseris there value in persisting with this, i'm happy to debug but i don't really know what i'm looking at18:44
jrosseralternatively i can delete / recreate the node but the learning opportunity may be lost then18:45
TheJuliaso it is *almost* like an explicit call to resume is not getting recorded18:45
TheJuliaand when it saves the object and refreshes, it gets the old data18:45
JayFTheJulia: do you think there's any value in jrosser NOT rebuilding this? 18:45
JayFI think we have all the data, right?18:45
TheJuliajrosser: what version of sqlalchemy is in use?18:45
JayFeven if we can't explain it18:45
JayFoooooh18:45
JayFSQLAlchemy==1.4.3118:46
JayFsqlalchemy-migrate==0.13.018:46
JayFthat looks right to me18:46
TheJuliabingo18:46
JayFthat's not correct? for yoga?18:46
TheJuliadrop down to 1.3.x or 1.2.x18:46
JayFoslo.db==11.2.018:46
TheJuliawe're just doing 1.4/2.0 stuff in master branch18:46
JayFare our requirements wrong? or is OSA ignoring them?18:46
TheJuliauhhhhhhh18:46
TheJulia11.2.018:46
* TheJulia goes and checks18:47
JayFSQLAlchemy>=1.2.19 # MIT18:47
JayFoslo.db>=9.1.0 # Apache-2.018:47
JayFno upper bound, when clearly there needs to be one18:47
TheJuliaupper comes from requirements repo18:47
JayFhttps://github.com/openstack/requirements/blob/stable/yoga/global-requirements.txt#L32618:48
TheJuliahttps://github.com/openstack/requirements/blob/stable/yoga/upper-constraints.txt#L16418:48
TheJuliayeah18:48
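A small sketch of the point about bounds, assuming plain dotted numeric versions. The lower bound `1.2.19` and the installed version `1.4.31` come from the lines above. The ceiling value `1.3.99` is a made-up example, and the helper names are hypothetical. A requirement with only a lower bound accepts any newer version, so a ceiling has to come from somewhere else.

```python
def parse_version(text):
    """Turn '1.4.31' into (1, 4, 31) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_allowed(installed, minimum, ceiling=None):
    """Check an installed version against a lower bound and an optional ceiling."""
    version = parse_version(installed)
    if version < parse_version(minimum):
        return False
    if ceiling is not None and version > parse_version(ceiling):
        return False
    return True


if __name__ == "__main__":
    # Lower bound only: 1.4.31 satisfies SQLAlchemy>=1.2.19.
    print(is_allowed("1.4.31", "1.2.19"))
    # With a hypothetical ceiling in the 1.3 series, 1.4.31 is rejected.
    print(is_allowed("1.4.31", "1.2.19", ceiling="1.3.99"))
```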
TheJuliaso, I'd drop the version back to just see if it works18:48
JayFso that is incorrect? 18:48
JayF1.4 drops autocommit?18:48
TheJuliaI don't know, I've never seen this before18:48
TheJulia1.4 doesn't drop it, 2.0 does18:48
JayFlet me ask the question in a more declarative way:18:48
TheJuliabut... this feels like an autocommit issue18:48
JayFWould you have expected oslo.db 11.2.0 and sqlalchemy 1.4.31 to work?18:49
JayF(on stable/yoga Ironic)18:49
TheJuliayes, I would have, but I'm grasping at straws because 1.4 does have some major changes that perhaps we just didn't detect18:49
JayFOK; I understand where you're at now18:50
jrosserdo you think i have something inconsistent with u-c18:50
* jrosser tries to follow along18:50
JayFjrosser: the hypothesis is that there may be a currently-unknown-bug in newer sqlalchemy that we haven't identified yet18:50
JayFjrosser: can you try with 1.3.x or 1.2.x18:50
TheJulia... I feel like maybe there could be something, because the only thing that explains this, is if we get an old row back for node upon changing some other fields18:50
JayFjrosser: if that fixes it; we'll have to either fix compat with sqla 1.4.x or drop the upper-constraint 18:50
JayFjrosser: are you using anything exotic for your DB here?18:51
jrossermariadb18:51
JayFe.g. some kind of not-quite-mysql cloud service, or read-only secondaries, etc18:51
JayFI would think new mariadb would be fine18:51
TheJuliathe path is task.process_event('resume') not actually executing18:51
TheJuliaat least, that's what it *seems*18:52
TheJuliabut it has no way to actually not do so to get as far as it does18:52
* TheJulia hopes that makes sense18:52
JayFwhat line # do you think that's being called on?18:52
TheJulialooking for it again, give me a minute18:52
JayFjust to make sure I'm looking in the right spot18:52
JayFty18:52
TheJuliahttps://github.com/openstack/ironic/blob/master/ironic/conductor/task_manager.py#L61318:53
JayFyeah that's what I expected18:54
TheJuliathe fact the target_provision_state is available, and not none, also points to something funky18:54
TheJuliaHonestly, I'm kind of at the point where I'd want to try and reproduce this, but I don't see how we got on that track to begin with unless something "weird" occurred with db interaction18:55
JayFlike I said above; I think the typeerror killed it in a bad spot18:55
TheJuliaoh, I agree18:55
JayFand somehow gave us a node in a state that just screws up this over and over18:55
TheJuliaoh18:55
TheJuliayou know what... it could just be the fact target_provision_state is already set18:55
JayFlike maybe that leftover target_prov_state is never getting zeroed?18:55
TheJuliaI wonder if nuking that field in the db would clear this case up18:55
TheJuliabecause I think on failure, that does get reset18:55
JayFjrosser: mind trying something for us?18:55
jrossersure18:56
TheJuliaor at least, with cleaning it *should*18:56
JayFjrosser: reset the node to manageable state18:56
jrosserok, done18:56
JayFnope, that hypothesis is bunk TheJulia 18:56
JayFhttps://paste.opendev.org/show/bkMzulFg0D4nm9zZqB41/18:56
TheJuliaopenstack baremetal node show ?18:56
JayFtarget_provision_state is null in the before photo18:56
JayFdii['clean_steps'] is not though18:57
JayFwhich is what I think could be the culprit but I'm not sure18:57
TheJuliaugh18:57
JayFTheJulia: we can just have jrosser wipe all of dii, right?18:57
TheJuliayes18:58
JayFjrosser: so now you'll run this:18:58
JayFupdate nodes set driver_internal_info=None where uuid='THE NODES UUID' limit 1; 18:58
JayFbe 1000000% sure that's right or you'll break some other node18:58
JayFI do not believe there is a non-DB way to blank driver_internal_info :( 18:58
TheJuliadoing that will at least identify where this is going sideways since it is not standing out in the code18:58
TheJuliathere is not18:58
TheJuliawe purge most of it in error handling stuffs that make sense18:59
TheJuliabut yeah...18:59
JayFjrosser: then, after you've wiped driver_internal_info on the node; lets do another before picture (node show -> pastebin) then do the provide again18:59
jrosserERROR 1054 (42S22): Unknown column 'None' in 'field list'19:02
jrosseroh hmm19:02
JayFoh19:02
JayFnull19:02
JayFnot None19:02
JayFsorry, it's null in mysql, not `None`19:02
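The None versus NULL mix-up can be illustrated with a toy table, assuming Python's built-in sqlite3 module stands in for the MariaDB server used here. The table layout and UUIDs are invented for the example. When the value is passed as a bound parameter, Python's `None` is stored as SQL NULL, and the WHERE clause limits the change to one node.

```python
import sqlite3


def clear_driver_internal_info(conn, node_uuid):
    """Set driver_internal_info to NULL for one node; return rows changed."""
    cur = conn.execute(
        "UPDATE nodes SET driver_internal_info = ? WHERE uuid = ?",
        (None, node_uuid),  # None is bound as SQL NULL
    )
    conn.commit()
    return cur.rowcount


def make_demo_db():
    """Create an in-memory toy table with two nodes (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (uuid TEXT PRIMARY KEY, driver_internal_info TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [("node-1", '{"clean_steps": ["x"]}'), ("node-2", '{"clean_steps": []}')],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(clear_driver_internal_info(conn, "node-1"))
    print(conn.execute("SELECT * FROM nodes ORDER BY uuid").fetchall())
```

Checking the returned row count before moving on is a cheap way to confirm that exactly one node was touched.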
jrosserok done19:03
JayFaight, then like I said, do a node show -> pastebin19:03
JayFso we can have a snapshot before state for this test19:03
JayFthen do the provide19:03
JayFand it'll succeed this time. Maybe. We hope. Perhaps.19:03
JayFCross your fingers :D 19:03
jrosserhttps://paste.opendev.org/show/b853jZDgzdhsYrsrAHKN/19:05
* TheJulia is hoping lots19:14
jrossersame error :(19:18
TheJuliaUgh19:19
jrosseri am kind of tired now and it's late19:19
JayFI'll be here in the morning :) Thanks for working with us19:19
jrosseri will do that again tomorrow, as first time the host dropped to the uefi shell and didnt boot19:19
jrosserand i can't be certain if that was my fault or not19:19
TheJuliajrosser: get some rest, perhaps log.debug(task.node) before the exception is raised19:19
JayFI don't see anything else in the "before" node show that would indicate to me that something in the node object is busted19:20
JayFso it's gotta be environmental :( 19:20
jrosserthank you very much for your help, i can't help feeling there's something really silly going on here19:20
JayFlike TheJulia's scary hypothesis19:20
JayFjrosser: it shouldn't be possible to get Ironic in this broken of a state, even if misconfigured19:20
JayFjrosser: unless you find out it's been flapping between two or three DB servers mid-cleaning :P 19:20
JayFlol19:21
JayFI'm sayin', almost certainly Ironic's fault. 19:21
opendevreviewJulia Kreger proposed openstack/ironic master: Catch any exception for Cleaning  https://review.opendev.org/c/openstack/ironic/+/86693319:58
TheJuliastevebaker[m]: By chance have you looked at https://zuul.opendev.org/t/openstack/build/1172af245e204813a34d2fc3d61ef5c1 ?20:07
TheJuliarpittau: fyi https://review.opendev.org/c/openstack/ironic/+/866780 is failing, looks like job configuration on the branch20:07
TheJuliastevebaker[m]: rloo: hjensas: Reviews on https://review.opendev.org/c/openstack/ironic-specs/+/861803 would be appreciated20:08
stevebaker[m]TheJulia: that unit test fail looks very much related to the change. I'll take a look20:09
TheJuliathanks20:09
TheJuliayeah, I saw the ipmitool octets and went "rutro"20:09
rlooTheJulia: yes, I started it this morning and then got side-tracked. Have it down to do tomorrow morning (hopefully no interruptions tomorrow...)20:12
TheJuliaokay20:12
TheJuliaThanks20:12
stevebaker[m]TheJulia: ah, those tests are testing the default boot mode, which is bios on xena20:30
TheJuliaso question comes to mind, do we stop backports at that point then?20:32
stevebaker[m]TheJulia: the test tests default boot mode, explicit uefi, then explicit bios. The fix actually removes the test_management.py changes from the change20:35
stevebaker[m]So I think its ok to backport imo20:35
opendevreviewSteve Baker proposed openstack/ironic stable/xena: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86662720:36
TheJuliastevebaker[m]: ack ack20:38
opendevreviewSteve Baker proposed openstack/ironic stable/wallaby: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86662820:39
opendevreviewSteve Baker proposed openstack/ironic stable/wallaby: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86662820:41
JayFTheJulia: are stevebaker[m] ^^ patches the new version of that one I originally NACK'd?21:01
JayFor is this yet-another-kinda-breaking-change? :( 21:01
* JayF put a -1 and a question on the first backport in that chain21:07
JayFAs discussed in Monday's meeting, I've proposed a change to project-config to give access to a new group, ironic-release, to delete bugfix branches and create new tags (so we can tag bugfix-xy-eol and delete bugfix/x.y branch)21:31
JayFhttps://review.opendev.org/c/openstack/project-config/+/86693721:31
TheJuliaJayF: different one21:31
JayFAck; I mean, this is 2x in a week for iRMC driver?21:32
JayFAre we not having a high enough review bar for those changes?21:32
TheJuliathey have a ton of changes in review right now21:32
JayFIf we think this is OK to land, can someone make that case in the merge request in response to my comment21:32
TheJuliait is not about a bar, it is about addressing/fixing issues most appropriately21:32
JayFand lets get all the breaky stuff landed together and cut a minor-version-bump release21:32
JayFThis particular default boot mode fix looks like a straight up bug that we should've caught in review. 21:34
JayFand "we" is probably literal because there's a good chance I was a reviewer on it21:34
TheJuliamy conscious grok, which is not deep, was that they were not honoring the framework as the driver didn't get updated as ironic evolved21:36
TheJuliabut.... I could be in left field, under some rocks for all I know right now21:36
JayFI'm still at google maps trying to get directions to the ballpark ;) 21:36
TheJuliaoh good, it is not just me21:37
JayFI hadn't even thought about how easy it is for bitrot to break these drivers21:39
JayFand we don't even track or think about that at all in any kind of structured way21:40
JayFthis goes back to the discussion about iRMC driver having its own bios deal, and not doing the bios interface, because it predates21:40
stevebaker[m]JayF: any existing irmc node with explicit boot mode will not be affected by this backport; the irmc docs are clear about boot mode being explicitly set on the node. Not setting the boot mode on the node has undefined behaviour, that is what the backport fixes21:40
JayFundefined meaning, undocumented or unreliable?21:41
JayFThis is the way I posed the question to TheJulia about the other, similar kinda-breaky change: is there any chance a reasonable operator would have this working reliably the way they want it to *and* that the change could break it21:42
JayFsounds like in this case it would be:21:42
JayF- boot_mode unset on the node21:42
JayF- default_boot_mode set in config21:42
JayF- node is using !default boot_mode reliably21:42
JayF- operator wants the node to be using the nondefault boot_mode, despite never setting it explicitly21:42
JayFDo I understand it properly?21:42
JayFIf so, I think this fits like the other one: as long as we have an extremely straightforward release note warning operators the behavior is changing, it should be OK. And I have a strong preference for us cutting a minor-version-bump release of any impacted branches as soon as this (and the other breaky bugfix that Julia was proposing) lands21:43
JayFstevebaker[m]: ^ can you ack my understanding is correct? Then one of us put this in the merge req so it's documented and I'll flip my vote21:43
TheJuliaI *think* the inverse, not explicitly stating it thus undefined result on the back end based upon ?last? or prior hardware state21:44
TheJuliasort of like what you get by default with IPMI if you don't do the raw magic21:44
JayFI don't like that scenario as much21:44
JayFbecause it means there's no way for an operator to get the existing behavior post-bugfix21:44
TheJuliaahh, but steve said the docs said they are supposed to be explicit21:45
JayFand I stopped assuming if certain configurations made sense after working at places that did things that only made sense in downstream context21:45
TheJuliathis handles when the human didn't do the thing21:45
TheJuliaas I understand it21:45
JayFwhat I'm saying though, is right now, if you're running Zed21:45
JayFand you're in the impacted group21:45
JayFyou're getting "we boot with the last boot mode" behavior. Which is not awesome IMO, but if it reliably behaves that way21:45
TheJuliawell, if you didn't follow the docs, you're impacted regardless21:45
JayFthere's no way for an oper to restore that behavior21:45
stevebaker[m]JayF: It looks like defaults to bios when not explicit. default_boot_mode is bios for W, X, and uefi for Y, Z21:46
JayFe.g. if it was always bios or always uefi; we could at least dictate how to change it21:46
JayFOK; that is nicer then.21:46
JayFCan we ensure the release note explicitly tells an operator how to get existing behavior back if they want?21:46
TheJuliaI'm not sure, if documented to work differently, that the existing behavior was valid in a sense.21:46
TheJuliawithout being set...21:46
TheJuliathat is21:47
JayFI guess I should explicitly state21:47
JayFif someone is in this configuration21:47
JayFyou have to assume they aren't reading docs, right?21:47
TheJuliathe docs are dense, they could have just missed it21:49
JayFyeah, that's fair21:49
JayFI think what I said above and re: the previous change is still reasonable, make sure we have clear, directive release notes to operators who were desiring the previous behavior (or were in the uncertain config) 21:49
JayFand then lets cut a minor version bump release as soon as those two pieces are in21:50
stevebaker[m]safest release note advice would be to set capabilities boot_mode:bios explicitly for any node which doesn't have it set. And this note would only be required on Y, Z21:50
JayFand maybe, maybe consider not backporting this beyond the point where we can't "fit" in a new minor version :/ 21:50
stevebaker[m]JayF: W, X is actually safer because default_boot_mode is bios?21:51
JayFSo really, Y and Z are the only ones that are changing behavior in a real way21:51
JayFI cringe b/c I don't think we can minor-version-bump Y21:51
JayFbut that is just an unfortunate reality21:51
JayFI just noticed a misconfiguration in ironic-stable-maint; I'm going to be taking immediate action to rectify it, and will email the list.21:54
JayFWe have core-reviewers-emeritus still on the ironic-stable-maint group21:54
* TheJulia raises an eyebrow21:54
JayFI'm actually going to purge the whole list; we have ironic-stable-maint inheriting from ironic-core21:55
JayFso anything left in the list for ironic-stable-maint is just asking to be forgotten about21:55
stevebaker[m]JayF: So just clarifying, we can go ahead with the backports. Y, Z will get a clear warning in the release note to set the boot mode on existing nodes. W, X can have a softer warning because defaults result in no change?21:56
JayFI'm onboard with that21:57
JayFjust make sure the release note says what you'd tell an operator if they were like "Steve, what happened to my node?" :D 21:57
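The behaviour change under discussion can be modelled roughly as follows. This is a sketch built only from statements in this log: the driver appears to fall back to bios when the node has no explicit boot mode, and `default_boot_mode` is bios for W and X and uefi for Y and Z. The function names and the release table are illustrative, not the driver's code.

```python
# default_boot_mode per release series, as stated in the discussion above.
DEFAULT_BOOT_MODE = {
    "wallaby": "bios",
    "xena": "bios",
    "yoga": "uefi",
    "zed": "uefi",
}


def boot_mode_before_fix(explicit_boot_mode):
    """Assumed old behaviour: fall back to bios when nothing is set on the node."""
    return explicit_boot_mode or "bios"


def boot_mode_after_fix(explicit_boot_mode, release):
    """Assumed new behaviour: fall back to the release's default_boot_mode."""
    return explicit_boot_mode or DEFAULT_BOOT_MODE[release]


def behaviour_changes(explicit_boot_mode, release):
    """True when the backport would change the boot mode a node ends up with."""
    return boot_mode_before_fix(explicit_boot_mode) != boot_mode_after_fix(
        explicit_boot_mode, release
    )


if __name__ == "__main__":
    for release in DEFAULT_BOOT_MODE:
        print(release, behaviour_changes(None, release))
```

Under these assumptions, only nodes without an explicit boot mode on Y and Z see a different result, which matches the plan to put the strong release-note warning on those branches.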
stevebaker[m]JayF: maybe in a new "upgrade:" release note section?21:58
JayF+++++21:58
stevebaker[m]STEVE, MY NODES!21:58
TheJuliaJayF: when do you want to look at the RBAC stuffs?21:59
JayFuh21:59
JayFgive me a minute to finish sending a message to the list about the stable stuff21:59
JayFthen nowish is fine w/me?21:59
TheJuliauhh, maybe in a little bit21:59
TheJuliaI'm reviewing the api change now22:00
JayFtomorrow is fine, a breadcrumb in the review about where to go look to run tests is fine22:00
JayFI know you're busy :D22:00
TheJuliasure22:00
opendevreviewSteve Baker proposed openstack/ironic stable/zed: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86662522:12
stevebaker[m]JayF: let me know what you think of ^^ wording22:12
JayFTheJulia: hmm, could jrosser's issue be like, a stuck-open transaction22:18
JayFTheJulia: does that even make sense as a thing to suggest?22:18
JayFstevebaker[m]: +1'd it, only not a +2 because I only looked at it from an impact perspective right now22:24
TheJuliaeh... I don't *think* so, but it is curious if the conductor has restarted or not22:24
JayFI was just thinking that an open transaction, modifying that node, might be an explainable reason why you'd be getting weird results back after you SHOULD see the node in a new position ... but I guess that hypothesis is 100% negated by the fact the node is in a decent state between runs22:27
opendevreviewVerification of a change to openstack/ironic master failed: Fix debug log message argument formatting  https://review.opendev.org/c/openstack/ironic/+/86685622:28
opendevreviewSteve Baker proposed openstack/ironic stable/yoga: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86662622:40
opendevreviewSteve Baker proposed openstack/ironic stable/xena: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86662722:40
opendevreviewSteve Baker proposed openstack/ironic stable/wallaby: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86662822:43
opendevreviewSteve Baker proposed openstack/ironic bugfix/20.2: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86678022:44
opendevreviewSteve Baker proposed openstack/ironic bugfix/21.0: Align iRMC driver with Ironic's default boot_mode  https://review.opendev.org/c/openstack/ironic/+/86665622:45
opendevreviewJay Faulkner proposed openstack/ironic master: DB & Object layer for node.shard  https://review.opendev.org/c/openstack/ironic/+/86423623:27
opendevreviewJay Faulkner proposed openstack/ironic master: API support for CRUD node.shard  https://review.opendev.org/c/openstack/ironic/+/86694623:27
TheJuliaJayF: rutro https://review.opendev.org/c/openstack/ironic/+/866946/123:30
TheJuliacheck the entire commit message23:31
JayFlolsob23:31
JayFgimme a sec, I think I can "fix"23:31
TheJuliayeah, delete the new changeid line23:31
opendevreviewJay Faulkner proposed openstack/ironic master: DB & Object layer for node.shard  https://review.opendev.org/c/openstack/ironic/+/86423623:32
opendevreviewJay Faulkner proposed openstack/ironic master: API support for CRUD node.shard  https://review.opendev.org/c/openstack/ironic/+/86623523:32
JayFokay; fixed23:32
JayFty for pointing that out before it was too long on the split brain lol23:33
JayFgerrit telling on me, and how I usually manage multi-patch setups23:33
JayF(fixup things patch-by-patch, then rebase them back into the correct commits)23:33
TheJuliaI go patch by patch and just cherry-pick the next patch on top and go from there23:36
JayFYeah; I think I am a little less scared of rebasing than most folks. I try to avoid telling people my git workflow because they either want to adopt it or change it. 23:39
JayFlol23:39
