Thursday, 2023-09-14

opendevreviewVerification of a change to openstack/networking-generic-switch master failed: Fix batching error due to outdated etcd3gw  https://review.opendev.org/c/openstack/networking-generic-switch/+/88640400:09
opendevreviewJulia Kreger proposed openstack/ironic master: Enable OVN CI  https://review.opendev.org/c/openstack/ironic/+/88508700:38
opendevreviewOpenStack Release Bot proposed openstack/python-ironic-inspector-client stable/2023.2: Update .gitreview for stable/2023.2  https://review.opendev.org/c/openstack/python-ironic-inspector-client/+/89509301:36
opendevreviewOpenStack Release Bot proposed openstack/python-ironic-inspector-client stable/2023.2: Update TOX_CONSTRAINTS_FILE for stable/2023.2  https://review.opendev.org/c/openstack/python-ironic-inspector-client/+/89509401:36
opendevreviewOpenStack Release Bot proposed openstack/python-ironic-inspector-client master: Update master for stable/2023.2  https://review.opendev.org/c/openstack/python-ironic-inspector-client/+/89509501:36
*** osmanlicilegi is now known as Guest004:33
opendevreviewMerged openstack/ironic-ui master: Fix release note build  https://review.opendev.org/c/openstack/ironic-ui/+/89416409:24
opendevreviewVerification of a change to openstack/networking-generic-switch master failed: Fake: support adding a random sleep and injecting failures  https://review.opendev.org/c/openstack/networking-generic-switch/+/87479310:42
opendevreviewMerged openstack/ironic master: devstack - configurable ipv6 address mode  https://review.opendev.org/c/openstack/ironic/+/89362210:43
opendevreviewVerification of a change to openstack/networking-generic-switch master failed: Honor ngs_save_configuration setting when using batch commands  https://review.opendev.org/c/openstack/networking-generic-switch/+/88640511:24
iurygregorygood morning ironic11:44
dtantsurmorning iurygregory11:44
dtantsurdoes anyone understand why the Nova's grenade plugin  is trying to use the non-existing cirros 0.6.2 image? And why it does not break anyone else?11:44
dtantsurref https://a86af8e24720e7d1aa6e-c7277a3e95ece459b8b2ecfff9ffad89.ssl.cf5.rackcdn.com/894015/1/check/ironic-inspector-grenade/1b5ab34/controller/logs/grenade.sh_log.txt11:44
dtantsurhmmm, it's coming from the old Ironic plugin. why doesn't it break Ironic then?11:45
iurygregorydtantsur, I think we have the info in some jobs11:45
iurygregoryhardcoded11:45
dtantsurehhm, old Ironic uses 0.6.1, old Ironic in the inspector gate uses 0.6.2. WUT11:46
iurygregoryso we have 0.6.1 in our job config and devstack/tools/ironic/scripts/cirros-partition.sh11:49
dtantsurthen how does inspector manage to use 0.6.2? Oo11:49
* iurygregory checks inspector11:49
dtantsur0.6.2 is indeed the devstack's default11:50
iurygregoryin case we don't set it, it will get the default of CIRROS_VERSION_DEVSTACK11:51
iurygregoryso yeah it would make sense since it is the default there11:51
dtantsurI'm starting to get an idea of it. The image is only created on the old devstack, and the version there is different (0.5.2)11:54
iurygregorythis is only happening in inspector-grenade?11:55
dtantsurI suspect Ironic overrides the version11:56
dtantsuryep, we do11:57
iurygregorydtantsur, https://github.com/openstack/ironic/blob/master/zuul.d/ironic-jobs.yaml#L93111:58
iurygregoryin ironic-grenade we set the version we want11:59
opendevreviewDmitry Tantsur proposed openstack/ironic-inspector master: Use one version of Cirros in the grenade job  https://review.opendev.org/c/openstack/ironic-inspector/+/89516411:59
dtantsurlet's try ^^^11:59
iurygregory++11:59
iurygregoryyeah11:59
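
The version lookup discussed above can be sketched as follows. This is a hypothetical model, not devstack or Zuul code: the function name and the `CIRROS_VERSION` dictionary key are assumptions for illustration, while the version numbers are the ones quoted in the conversation.

```python
# Hypothetical model of the lookup; not actual devstack code.
DEVSTACK_DEFAULT_CIRROS = "0.6.2"  # devstack default quoted in the log


def resolve_cirros_version(job_vars, default=DEVSTACK_DEFAULT_CIRROS):
    """Return the job's CIRROS_VERSION if it is set, else the devstack default."""
    return job_vars.get("CIRROS_VERSION") or default


# ironic-grenade pins the version in its job config (per the log).
ironic_grenade = {"CIRROS_VERSION": "0.6.1"}
# The inspector grenade job sets nothing, so the default applies.
inspector_grenade = {}

print(resolve_cirros_version(ironic_grenade))
print(resolve_cirros_version(inspector_grenade))
```

Under this model, a job that sets no version silently follows whatever the default is on each side of the upgrade, which matches the mismatch described in the conversation.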
iurygregoryquick question, how bad would it be to change the value of [conductor]power_state_change_timeout to 90 ?12:00
dtantsurnot ideal. kind of a last resort thing if there is no hope to fix the hardware.12:01
iurygregorymy clean_step to update the firmware said it failed with "Failed to set node power state to power on.", the update was executed successfully  .-.12:02
iurygregoryand I've found this on the logs https://paste.opendev.org/show/b7k95XWrDquLeUXgzpUO/12:03
iurygregorymaybe 60 is not enough...12:03
dtantsurthat may just mean that your hardware is not available for a long time.12:04
dtantsurit's something to handle during the firmware upgrade, but I don't think raising the general timeout is a good idea.12:04
iurygregoryack12:04
dtantsurwhat is actually happening to the node in the meantime?12:05
dtantsurdo power calls just error? return None? return a stale value?12:05
iurygregorylet me try to find in the logs12:05
iurygregorythe node was on PowerOn, I sent the command to run the clean step that updates the firmware, the task is created, the firmware is updated, it triggers a reboot_to_finish_step, the node goes to PowerOff and then times out because it failed to change power state to 'power on' by 'rebooting': Failed to set node power state to power on https://paste.opendev.org/show/bhGNkcKx8vqteeVq2awC/ 12:34
dtantsurinteresting! do you have any insight into what was going on with the hardware itself?12:56
dtantsurI suspect it was doing the actual upgrade.. 12:56
opendevreviewDmitry Tantsur proposed openstack/sushy-tools master: Use WAL mode for SQLite cache  https://review.opendev.org/c/openstack/sushy-tools/+/89516813:09
iurygregoryagree, I will try to watch the console and see what happens 13:23
TheJuliaiurygregory: similarly, on some other machines, I think I've seen them take 2-3 minutes for their bmcs to update. From that other big box vendor.13:37
iurygregoryTheJulia, oh god to know13:38
* TheJulia suspects froidian slip there13:38
iurygregoryI just noticed that for ilo in the management interface they have a wait parameter... <checking what it does>13:39
TheJuliawe likely want the lock held while waiting, unfortunately.13:39
dtantsurwe can wait for power on longer in *this case*, but I don't think we should raise the default timeout13:43
TheJulia++13:55
TheJuliaI'd be fine waiting something like 5 minutes, on some level we might also be masking a failed upgrade as well, or my favorite "the partial upgrade" that didn't completely work.13:56
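
The idea of waiting longer only in this case, without raising the default, can be sketched like this. It is a minimal illustration, not Ironic code: `wait_for_power_state`, its parameters, and both constants are hypothetical names, with 60 seconds and "something like 5 minutes" taken from the conversation.

```python
import time

DEFAULT_TIMEOUT = 60          # general timeout mentioned in the log
POST_FIRMWARE_TIMEOUT = 300   # "something like 5 minutes", for this case only


def wait_for_power_state(get_state, target, timeout, interval=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns target or timeout seconds elapse."""
    deadline = clock() + timeout
    while True:
        try:
            state = get_state()
        except Exception:
            # Treat an unreachable BMC (e.g. while it resets) as "unknown".
            state = None
        if state == target:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller rebooting after a firmware update would pass `POST_FIRMWARE_TIMEOUT`, while every other power operation keeps `DEFAULT_TIMEOUT`. As noted in the conversation, a longer wait can also mask a failed or partial upgrade, so a `False` result still needs to be reported as an error.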
*** tosky_ is now known as tosky14:06
dtantsurbloody inspector grenade.. can I just nuke it?14:07
* TheJulia steps back14:07
TheJuliawho are you, and what did you do with our dear friend dtantsur ?!14:07
TheJulia;)14:08
TheJuliaI think you asking that way tells us all we need to know14:08
dtantsur:D14:08
TheJuliaIt doesn't make sense to keep... although have we merged an official "we're deprecating this standalone service" release note yet?14:08
dtantsurWe have not. But given its virtually zero amount of changes, a grenade job may simply be overkill.14:10
TheJuliaThen we should do both14:11
TheJuliaProvide notice "this is merging into ironic, stay tuned!"14:11
TheJuliaand kill the grenade job14:11
iurygregoryok the error (couldn't power on) happened a few seconds before the iLO GUI showed that the firmware update was finished and said iLO is being reset..14:14
opendevreviewDmitry Tantsur proposed openstack/ironic-inspector master: Update the project status and move broken jobs to experimental  https://review.opendev.org/c/openstack/ironic-inspector/+/89516414:16
dtantsurTheJulia, like this ^^?14:16
JayFdtantsur: TheJulia: My only question: is it possible some of the half-support in Ironic could break upgrades14:40
* JayF puts that in gerrit review14:41
dtantsurJayF, not impossible (but we don't run the inspector grenade job on ironic, so it won't help much)14:41
JayFI'm more saying; are we sure we *didn't already break the upgrade* 14:42
JayFif we're punting getting the job to work we need to feel super confident about that14:42
JayFYou're talking to someone who literally just got burned *yesterday* for pushing a change without CI so I'm a little extra sensitive to this lol14:43
TheJuliaI think the release note paints the picture appropriately14:44
dtantsurgiven how cryptic grenade is, it's not even trivial to figure out why it fails.. but the new module takes a different code path, so I'm also struggling to imagine how it will break grenade without breaking everything else14:44
TheJulia"not done, in progress, marking this in maintenance"14:44
JayFdtantsur: I know almost nothing about inspector, so some of those questions are just asking defensive questions 14:45
JayFdtantsur: if the answer is "the only way that could break something is if they put ironic in the catalog as the inspector" or something similar, that's an A++ answer14:45
TheJuliaWell, even putting ironic in as inspector in the catalog wouldn't really work since it is aiui, not a 1-1 move14:52
TheJuliaThey would need to intentionally drive off the happy path, as I see it14:53
TheJuliaMaybe we need an explicit issue in ironic’s release notes to advise against doing so?14:54
JayFI'd +1 such a change14:55
JayFbtw, please review prelude14:55
JayFit's ht'd ironic-week-prio14:55
TheJuliaWheeee IRC14:59
TheJuliaI added some spacing between items on https://etherpad.opendev.org/p/ironic-ptg-october-2023 because it was getting hard to separate items15:09
TheJuliaAnd another day, another item :)15:16
opendevreviewJulia Kreger proposed openstack/ironic master: Enable OVN CI  https://review.opendev.org/c/openstack/ironic/+/88508715:35
TheJuliaokay, that fixed the doc build issue in that15:36
JayFTheJulia: hmm16:27
JayFTheJulia: so locally on https://review.opendev.org/c/openstack/ironic/+/89500716:27
JayFTheJulia: it's building 2023.1 notes under 2023.2 notes16:27
JayFand that's what I expected based on TC discussions16:28
JayFso I'm a little weirded out to not see it working that way16:28
JayF(in the gate)16:28
TheJuliaunless it was added as explicit logic to reno, it doesn't work that way by default16:29
TheJuliaunreleased is always what we're about to branch off, and the content is only ever that branch until the prior stable branch16:29
JayFHmm. I thought changes had been made to that degree16:30
JayFI'll have to go lookup the governance change b/c I don't see reno changes16:30
TheJuliamaybe not released or maybe the version changed16:30
TheJuliaand honestly, wrapping it all together is *awful*16:30
TheJuliait means duplication across pages16:30
JayFTheJulia: OK, so regardless of what we do for slurp, this isn't a slurp lol16:33
TheJuliabahahahaha16:33
JayFTheJulia: Y->A->C and this is B, and I was confused16:33
* JayF has not been at his best the last two or three days16:33
TheJuliawell, time to re-write the prelude!16:33
JayFwell and stop worrying about it being too verbose16:34
TheJuliaBy prelude, it will be a prelude!16:34
JayFsince I can delete it all16:34
TheJuliaso interestingly, there *seems* to be a microversion issue in the sdk16:34
TheJuliaat least, trying to write a test :\16:35
opendevreviewJay Faulkner proposed openstack/ironic master: [releasenotes] Prelude for 2023.2/bobcat  https://review.opendev.org/c/openstack/ironic/+/89500716:44
-opendevstatus- NOTICE: The lists.airshipit.org and lists.katacontainers.io sites will be offline briefly for migration to a new server16:47
opendevreviewJay Faulkner proposed openstack/ironic master: [releasenotes] Prelude for 2023.2/bobcat  https://review.opendev.org/c/openstack/ironic/+/89500716:54
opendevreviewJay Faulkner proposed openstack/ironic master: [releasenotes] Prelude for 2023.2/bobcat  https://review.opendev.org/c/openstack/ironic/+/89500716:54
JayFThose are updated, ready for re-review. Especially curious what folks think about the upgrade note I added about possibility to skip this release. I'm on the fence if we should keep it but I feel like it's good context to have in the release notes.16:56
opendevreviewVerification of a change to openstack/ironic-python-agent stable/zed failed: Handle the node being locked  https://review.opendev.org/c/openstack/ironic-python-agent/+/89259417:04
opendevreviewVerification of a change to openstack/ironic-python-agent stable/yoga failed: Handle the node being locked  https://review.opendev.org/c/openstack/ironic-python-agent/+/89268718:04
TheJuliaJayF: able to reproduce shard being dropped :)18:42
opendevreviewVerification of a change to openstack/ironic-python-agent stable/xena failed: Handle the node being locked  https://review.opendev.org/c/openstack/ironic-python-agent/+/89259518:42
TheJuliaallow_unknown_params *is* broken18:49
JayFfor everything, or just baremetal?18:50
TheJulianot sure yet18:51
TheJuliaoh, I sort of see what is going on18:51
TheJuliaokay, so... so the doc string actually provides the clarity19:02
TheJulia"True to accept, but discard unknown query parameters"19:02
TheJuliait was never an override in other words19:02
TheJuliait says "False will result in a validation exception", but that is not true, it is *always* true.19:03
JayFaha so it's about discarding unknown values instead of raising on them19:04
TheJuliayup19:04
JayFso it was not possible to do what we wanted, that's nice to know19:04
TheJuliayeah, unfortunately19:04
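
The behaviour the docstring describes can be sketched as below. This is a hypothetical illustration of "accept, but discard unknown query parameters", not openstacksdk's implementation: the function name and the set of known parameters are assumptions. Per the conversation, the real code appears to behave as if the flag were always true.

```python
def filter_query_params(params, known, allow_unknown_params=False):
    """Keep only known query parameters.

    With allow_unknown_params=True, unknown keys are accepted but silently
    discarded. With False, they raise a validation error (the documented
    behaviour, which the log says is not what actually happens).
    """
    unknown = set(params) - set(known)
    if unknown and not allow_unknown_params:
        raise ValueError(f"unknown query parameters: {sorted(unknown)}")
    return {k: v for k, v in params.items() if k in known}


# "shard" is not in the (hypothetical) known set, so it is dropped, not sent.
query = {"shard": "a", "provision_state": "active"}
print(filter_query_params(query, {"provision_state"}, allow_unknown_params=True))
```

Either way the flag never forwards an unknown parameter to the API, which is why it could not serve as an override.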
JayFbut does indicate we will need to take a workitem to ensure openstacksdk supports modern Ironic API abilities19:05
TheJuliaindeed19:05
TheJuliagranted, where this gets difficult, there is a non-zero percentage we just won't care about at all19:05
TheJuliaat least, in terms of openstacksdk for other openstack service consumption19:05
JayFI don't know what you mean by that19:06
JayFwe who?19:06
TheJuliaroughly half our consumers don't use nova at all20:13
JayFI guess I'm missing how that ties back to "make sure Ironic's API is fully represented in openstacksdk" -- I thought openstacksdk had use cases outside of openstack<>openstack comms20:14
TheJuliasome like ansible, but it is not automatically at the top of any mind20:17
TheJuliaobviously there is room for improvement, and I think part of the challenge is the style20:18
TheJuliabut it might not be that bad really once you wrap your head around it20:18
JayFthat is mostly what I'm counting on20:18
JayFthat it'll be "figure out how to add a query param and test it" then do that a dozen times20:19
TheJuliaexactly20:19
TheJuliaand I'm done with the first one basically20:19
* TheJulia runs tox and waits20:19
TheJuliaso let's see, where did I put that tab with our webapi version history20:21
* TheJulia whistles as adding sdk support for stuff20:53
opendevreviewVerification of a change to openstack/ironic-python-agent stable/wallaby failed: Handle the node being locked  https://review.opendev.org/c/openstack/ironic-python-agent/+/89259620:58
TheJuliaI posted a series of patches to address sdk support of the node object22:17
JayFI will h-t them ironic-week-prio as well so we get ironic cores looking at them22:20
JayFah, that may not be enabled in ACLs in openstacksdk; I can't do it22:20
JayFa core or you may be able to 22:20
