ravlew | Good morning ironic | 09:41 |
---|---|---|
ravlew | I'm still getting the fedora-latest was not found error on stable/yoga | 09:43 |
ravlew | is there any ETA for the fix? | 09:44 |
opendevreview | Mahnoor Asghar proposed openstack/ironic master: Add inspection hooks https://review.opendev.org/c/openstack/ironic/+/890817 | 10:10 |
dtantsur | TheJulia, JayF, we should really fix the fact that not all fields can be specified on node creation... | 10:48 |
dtantsur | ravlew, someone needs to check and/or fix https://review.opendev.org/c/openstack/bifrost/+/893754 | 10:49 |
opendevreview | Mahnoor Asghar proposed openstack/ironic master: Add inspection hooks https://review.opendev.org/c/openstack/ironic/+/893533 | 10:51 |
ravlew | I see, thanks dtantsur | 11:17 |
TheJulia | good morning | 12:55 |
dtantsur | morning TheJulia! safe home? | 12:55 |
TheJulia | yes, finally got home yesterday evening | 12:56 |
dtantsur | nice | 12:58 |
* TheJulia | tries to wake up | 13:29 |
JayF | dtantsur: maybe we should put that item on the ptg | 13:44 |
dtantsur | Is it going to be controversial? | 13:44 |
JayF | Why would it be? | 13:45 |
dtantsur | I don't think so either. Then it's rather a Just Do It item than a PTG discussion? | 13:45 |
JayF | I'm using that ptg board as a place to put any new work streams for next cycle, not just the ones that we might want to talk about | 13:46 |
JayF | Then when it comes time to actually schedule the gathering, we'll pare it down to topics that have stuff to discuss | 13:47 |
dtantsur | ah cool | 13:48 |
TheJulia | mgoddard: o/ Are you guys using etcd with networking-generic-switch ? | 14:25 |
TheJulia | It looks like pulling the lock lease is failing, if you have any insight it would likely help JayF - https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_be9/888051/2/check/networking-generic-switch-tempest-dlm/be9c8a0/controller/logs/screen-q-svc.txt | 14:26 |
TheJulia | JayF: I feel like we could likely mark the dlm job non-voting for now, and just add a release note noting there is an issue with interacting with etcd, and we're still investigating | 14:27 |
opendevreview | Julia Kreger proposed openstack/ironic master: Support port name: API https://review.opendev.org/c/openstack/ironic/+/765569 | 14:34 |
opendevreview | Merged openstack/ironic master: Fix minor grammar issues in the help for new inspector options https://review.opendev.org/c/openstack/ironic/+/890138 | 14:45 |
dtantsur | wow, the metal3 job has caught a real breakage on https://review.opendev.org/c/openstack/ironic/+/863999! I wonder why nothing else did... | 14:47 |
TheJulia | secure boot, only the grub job would even get close to that and I think it is ipmi | 14:49 |
opendevreview | Verification of a change to openstack/ironic master failed: CI: Remove ubuntu focal job https://review.opendev.org/c/openstack/ironic/+/894014 | 14:49 |
dtantsur | no, it was a regression in the boot mode support | 14:49 |
TheJulia | oh, hmm | 14:49 |
TheJulia | the explicit API? | 14:49 |
dtantsur | nope, the normal boot mode stuff | 14:50 |
TheJulia | weird | 14:50 |
dtantsur | very weird indeed | 14:50 |
dtantsur | sushy-tools reports None, my code is waiting for "bios" in vain | 14:50 |
dtantsur | I'm not sure it can happen in real hardware, but who knows.. | 14:50 |
TheJulia | we don't switch any of the jobs dynamically afaik | 14:51 |
TheJulia | start state is the expected end state | 14:51 |
dtantsur | yeah, but start state is None. at least here. | 14:51 |
dtantsur | 'boot': {'allowed_values': ['Pxe', 'Cd', 'Hdd'], 'enabled': <BootSourceOverrideEnabled.CONTINUOUS: 'Continuous'>, 'mode': None, 'target': <BootSource.PXE: 'Pxe'>} | 14:52 |
JayF | My Sharding CI job is failing because ... we don't have permission to set shard? https://review.opendev.org/c/openstack/ironic/+/894460/5/devstack/lib/ironic#2587 | 14:54 |
JayF | how can that cred have ability to create node and set properties but not shards | 14:54 |
JayF | do we put a policy file in devstack jobs? | 14:54 |
dtantsur | TheJulia, I suspect the boot mode cannot be detected in metal3-dev-env, which snowballs into all sorts of problems | 14:55 |
dtantsur | I wonder if we should even try to change *anything* if the current mode is None... | 14:55 |
TheJulia | JayF: updating shard requires devstack-system-admin based upon current policy | 14:57 |
TheJulia | fallout from keeping the original "admin is admin everywhere" bug | 14:58 |
TheJulia | and that we support both, since TC has wanted to eradicate the system scoped model entirely | 14:58 |
JayF | In what world does it make sense that node:update:shard is a higher permission than node:create | 14:58 |
TheJulia | node:create has logic to permit project admins to create a node, and record the project in owner | 14:58 |
TheJulia | we likely just need to loosen the policy there for project scoped admins | 14:59 |
TheJulia | but if you look, we restrict other fields | 14:59 |
TheJulia | for now, I'd make the change as devstack-system-admin instead of devstack-admin | 14:59 |
JayF | https://github.com/openstack/ironic/blob/master/ironic/common/policy.py#L989 SYSTEM_OR_PROJECT_ADMIN ? | 14:59 |
JayF | I will try and figure out how to do that | 14:59 |
TheJulia | and we can revisit the RBAC policies/restrictions at the PTG based upon changes/evolution | 14:59 |
TheJulia | you can do that as well | 15:00 |
JayF | like here's the thing | 15:00 |
JayF | if you can set shard on create, you can make an argument for the current setup | 15:00 |
JayF | because updating shard is a footgun, potentially | 15:00 |
JayF | yeah I'll try to elevate devstack creds instead, I'm not convinced project admin should have that ability | 15:00 |
JayF | we should fix create | 15:01 |
TheJulia | but look at yeah, I think that might have been the intent to restrict changing the shard *as much as possible* | 15:01 |
TheJulia | so it didn't be come a footgun used | 15:01 |
TheJulia | s/be come/become/ | 15:01 |
TheJulia | you're going to need something like SYSTEM_OR_OWNER_ADMIN | 15:01 |
JayF | so sounds like there are two follow ups (so far) from this CI attempt: 1) Make node create take shard (and other stuff?) 2) Fix sharding permissions to allow owners to set shard | 15:02 |
JayF | but for now, I'm going to elevate the creds to get it passing | 15:02 |
JayF | since updating the policy needs to be done in a backportable way | 15:03 |
TheJulia | SYSTEM_OR_OWNER_ADMIN = ( '(' + SYSTEM_ADMIN + ') or (' + PROJECT_OWNER_ADMIN + ')' ) | 15:03 |
TheJulia | that is why discussion is really required, since to change policy on a backport we would have to go "we got this wrong" | 15:04 |
TheJulia | and treat it just as a bugfix | 15:04 |
JayF | IMO we should *only* change it if we determine it is a bug | 15:05 |
TheJulia | agreed | 15:05 |
JayF | and fixing the client to set shard on create dulls this corner | 15:05 |
TheJulia | very much so | 15:05 |
TheJulia | almost entirely, really | 15:05 |
TheJulia | since the agreed upon behavior was set and never really ever change | 15:05 |
TheJulia | because "bad things will happen" :) | 15:05 |
JayF | now to sift through a few thousand lines of lib/ironic to find how to call with different creds lol | 15:06 |
TheJulia | it is not a unique need in there | 15:06 |
TheJulia | baremetal --os-cloud devstack-system-admin continuecommandhere | 15:06 |
JayF | I figured I'd run across an example if I read enough :) | 15:07 |
* TheJulia | goes back to slide deck and le castle vania | 15:07 |
JayF | Speaking of, I got a talk accepted at SeaGL 2023 the first week in November | 15:07 |
TheJulia | Nice! | 15:07 |
TheJulia | Congrats! | 15:07 |
JayF | "Trust in Open Source" (mainly focusing on how you have to ensure your incentives are aligned with the projects moreso than anything else) | 15:08 |
opendevreview | Jay Faulkner proposed openstack/ironic master: [CI] Support for running with shards https://review.opendev.org/c/openstack/ironic/+/894460 | 15:11 |
JayF | thanks for the insight, I think that should do the trick | 15:11 |
JayF | I might need to add something based on if we're enforcing scope if we were to move that into more jobs | 15:11 |
TheJulia | JayF: I think that is still going to fail | 15:12 |
TheJulia | devstack-admin is a project scoped admin, not a system admin | 15:12 |
JayF | I read it and tried to figure out which it needed and flipped a coin | 15:15 |
* JayF | changes to -system- | 15:16 |
TheJulia | :) | 15:16 |
opendevreview | Jay Faulkner proposed openstack/ironic master: [CI] Support for running with shards https://review.opendev.org/c/openstack/ironic/+/894460 | 15:16 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: Redfish: wait for secure boot state change if it's not immediate https://review.opendev.org/c/openstack/ironic/+/863999 | 16:31 |
dtantsur | JayF, I hope this is much clearer for the operators ^^ | 16:31 |
JayF | I'm a big fan of the guardrails you added | 16:32 |
JayF | +2 with an optional note | 16:33 |
dtantsur | Thx! | 16:33 |
dtantsur | Meanwhile, we're still unable to reproduce the database locked issue in the metal3 job after the recent round of patches. | 16:35 |
JayF | there's a more celebratory way to say that ;) | 16:36 |
JayF | which will have the side effect of immediately causing a failure if it's not true | 16:36 |
JayF | LOL | 16:36 |
dtantsur | that's why I'm so careful ;) | 16:36 |
dtantsur | like apparently they don't say The Q Word in American hospitals :D | 16:37 |
TheJulia | Q word? | 16:39 |
JayF | I don't actually know this Q word | 16:39 |
dtantsur | TheJulia, "What a quiet day today" - "NOOOOOOOOOOO" | 16:39 |
JayF | I learned that rule working fast food, not a hospital | 16:40 |
JayF | lol | 16:40 |
* dtantsur | has learned recently | 16:40 |
JayF | > "What a quiet day today" > [manager runs in frantically] "TWO BUSES!!!!" | 16:40 |
TheJulia | Oh, American culture is about institutional under-staffing to ensure maximum shareholder benefit | 16:40 |
TheJulia | because, profit! | 16:40 |
TheJulia | so there are no quiet days, ever | 16:40 |
dtantsur | The under-staffing aspect is unfortunately not unique to the USA (the exact reasons may differ) | 16:40 |
TheJulia | JayF: in that case, the manager is the third bus | 16:43 |
JayF | fast food managers are both instruments and victims of the system in that case, they were under all the busses the whole time :( | 16:43 |
TheJulia | Indeed | 16:44 |
JayF | https://review.opendev.org/c/openstack/ironic/+/894460 this is concerning | 17:33 |
JayF | looks like everything worked except the part where it didn't work | 17:33 |
JayF | I'm asking infra to hold one for me | 17:35 |
opendevreview | Harald Jensås proposed openstack/ironic master: devstack - configurable ipv6 address mode https://review.opendev.org/c/openstack/ironic/+/893622 | 18:56 |
TheJulia | JayF: looks like it gets set, but it also looks like it never polls :\ | 20:10 |
JayF | TheJulia: define: it | 20:15 |
JayF | shard gets set, but ? never polls | 20:15 |
TheJulia | nova, doesn't appear to actually grab a list of nodes | 20:15 |
JayF | TheJulia: I have an infra hold out for one of these, so I'll get a look soon | 20:15 |
JayF | it's very possible I just messed up the config but it looked OK from the output | 20:16 |
JayF | it's OK though, there's a reason I wanted to get this running during RC period :) | 20:16 |
JayF | I am not the best at manual QA/QC so I'll not feel confident until this job works | 20:16 |
* TheJulia | lays down for a little bit | 20:19 |
JayF | I may also be stepping out a bit early, I'm not feeling great. Trying to make it a bit further. I suspect I'll end up working on prelude today and dedicate morning-brain to sharding ci | 20:19 |
TheJulia | Yeah, I’m laying down because suddenly not feeling great myself, likely just ripples of exhaustion from the last two days | 20:20 |
JayF | I'd imagine so | 20:21 |
JayF | wtf, shard isn't even in the query Sep 12 20:33:34 np0035232105 devstack@ir-api.service[86195]: [pid: 86195|app: 0|req: 203/203] 173.231.255.247 () {68 vars in 1732 bytes} [Tue Sep 12 20:33:34 2023] GET | 20:35 |
JayF | /baremetal/v1/nodes?fields=uuid%2Cpower_state%2Ctarget_power_state%2Cprovision_state%2Ctarget_provision_state%2Clast_error%2Cmaintenance%2Cproperties%2Cinstance_uuid%2Ctraits%2Cresource_class => generated 1253 bytes in 21 msecs (HTTP/1.1 200) 7 headers in 285 bytes (2 switches on core 0) | 20:35 |
JayF | and afaict everything is configured properly | 20:36 |
JayF | I'm starting to wonder if my testing was invalid, I'm not seeing any way this coulda worked | 20:45 |
JayF | and if it's broken this way, it's because we don't have proper support in openstacksdk for filtering by sharding | 20:48 |
JayF | and for some reason we use ironicclient throughout the nova driver except to get nodes | 20:49 |
JayF | I have no idea how the hell I tested this working | 20:49 |
JayF | https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L797 I am essentially seeing no evidence in logs that this is effective | 20:53 |
JayF | never gets added to the query | 20:54 |
JayF | that uses openstacksdk, it looks like everything is hooked up thru for shards | 20:54 |
JayF | I'm very confused | 20:54 |
JayF | johnthetubaguy: help ^ tomorrow please | 20:55 |
JayF | I need to step away, I'm really starting to feel worse rapidly. I'll be in tomorrow morning to tackle this again | 20:56 |
JayF | I think I figured it out while walking the dog. Testing it now. | 21:34 |
JayF | yep | 21:38 |
opendevreview | Jay Faulkner proposed openstack/ironic master: [CI] Support for running with shards https://review.opendev.org/c/openstack/ironic/+/894460 | 21:44 |
JayF | aight, I think that's the problem | 21:45 |
JayF | although it's clear to me it could've never worked before, so I must have had some weird thing when testing before where like, I was somehow able to provision to the pre-sharded versions of the nodes in nova's cache (?) | 21:46 |
JayF | I'm honestly not sure how nova works at that level, which probably is why I screwed up the manual testing :/ | 21:46 |
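
Note on the policy string quoted at 15:03: SYSTEM_OR_OWNER_ADMIN is built by plain string concatenation of two existing check strings. The sketch below only illustrates that composition. The values given to SYSTEM_ADMIN and PROJECT_OWNER_ADMIN are placeholders assumed for the example; the real definitions live in ironic/common/policy.py and may differ.

```python
# Placeholder check strings, assumed for illustration only; the real
# definitions live in ironic/common/policy.py and may differ.
SYSTEM_ADMIN = 'role:admin and system_scope:all'
PROJECT_OWNER_ADMIN = 'role:admin and project_id:%(node.owner)s'

# Same composition as quoted in the log: each side is parenthesized so
# the "or" joins two complete checks rather than their inner terms.
SYSTEM_OR_OWNER_ADMIN = (
    '(' + SYSTEM_ADMIN + ') or (' + PROJECT_OWNER_ADMIN + ')'
)

print(SYSTEM_OR_OWNER_ADMIN)
```

The parentheses matter because each component may itself contain "and"; without them the combined rule could be read with a different grouping than intended.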
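
Note on the request logged at 20:35: JayF's observation that "shard isn't even in the query" can be checked mechanically by decoding the query string. This is a minimal sketch using only the Python standard library. The request path is copied from the log line above; the helper name query_params is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Request path copied from the uWSGI log line quoted at 20:35.
REQUEST = (
    '/baremetal/v1/nodes?fields=uuid%2Cpower_state%2Ctarget_power_state'
    '%2Cprovision_state%2Ctarget_provision_state%2Clast_error'
    '%2Cmaintenance%2Cproperties%2Cinstance_uuid%2Ctraits%2Cresource_class'
)


def query_params(request_path):
    """Return the decoded query parameters of a request path as a dict."""
    return parse_qs(urlsplit(request_path).query)


params = query_params(REQUEST)
fields = params['fields'][0].split(',')

print(sorted(params))     # names of the parameters that were sent
print('shard' in params)  # whether a shard filter was part of the request
print(fields)             # the node fields that were requested
```

Only a fields parameter is present in this request, which matches the log discussion: the node list call carried no shard filter.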