Friday, 2022-08-19

opendevreviewTakashi Natsume proposed openstack/placement master: Move implemented specs for Xena and Yoga release  https://review.opendev.org/c/openstack/placement/+/85373000:08
opendevreviewMerged openstack/nova master: Unify placement client singleton implementations  https://review.opendev.org/c/openstack/nova/+/85290002:48
opendevreviewMerged openstack/nova master: Avoid n-cond startup abort for keystone failures  https://review.opendev.org/c/openstack/nova/+/85290102:48
opendevreviewMerged openstack/nova master: scheduler: Add an ephemeral encryption pre filter  https://review.opendev.org/c/openstack/nova/+/76045603:47
opendevreviewMerged openstack/nova master: Test attached volume extend actions in the nova-next job  https://review.opendev.org/c/openstack/nova/+/84370008:02
sean-k-mooneyby the way i dont know if i have explained how im using Review-Prioity +1 and +2 but im adding my +2 if im going to priotities it and i would like others too also and im adding +1 when i am going to review it and it woudl be nice if other reviewed it but im not asking other cores to go out of there way to prioritise it i.e. its a nice to have09:25
sean-k-mooneyim also not alwasy setting it, im just setting it on patches that are either imporant or have not reablly been reviewed in a while to highlight them to others09:26
sean-k-mooneyfor example i dont think this backport is critical https://review.opendev.org/c/openstack/nova/+/821349 but it would be nice to land sooner rather then later so RP +109:27
sean-k-mooneystephenfin: by the way im currently working on the  final patch in the vdpa seriese would you have time to review the first 3. i just need to add a release note and update the docs in the last patch and its also done09:29
stephenfinsean-k-mooney: sure (y)09:43
sean-k-mooneystephenfin: just pushing the last patch now once i fix the commti message so it should be ready by the time you get to it09:44
opendevreviewsean mooney proposed openstack/nova master: Add VDPA support for suspend and livemigrate  https://review.opendev.org/c/openstack/nova/+/85370409:46
sean-k-mooneygibi: ^ that should be ready for your review too. if ye have comments ill respin them collectinvly once ye have complete reviewing the whole sereise later today09:47
gibisean-k-mooney: ack, I will do a review round on it today09:49
sean-k-mooneyoh i have to fix 1 thing in the last patch for the compute service bump so ill respin that quickly09:50
* sean-k-mooney forgot to update the unit test09:50
gibi(today I spent hours in a rabbit hole spreading provider mapping around the claim code, but now I'm backing out of it as it is a dead end)09:50
opendevreviewsean mooney proposed openstack/nova master: Add VDPA support for suspend and livemigrate  https://review.opendev.org/c/openstack/nova/+/85370409:54
sean-k-mooneyi think you should just need to pass them to filter_pools09:54
sean-k-mooneyi.e. if you have the provider uuid in the pool filter pools can jsut filter to the ones it the alloctiaons09:55
sean-k-mooneyalthough that is proably over simplfying09:55
sean-k-mooneyi woudl be tempted to extend the pci request object to carry the RP infor that it shoudl be fullfiled form09:56
sean-k-mooneyand then use that latter when we are doing the filtering /caliming09:56
sean-k-mooneygibi: would something like ^ work better09:57
sean-k-mooneythat woudl avoid chaning the sigurtures of any of the fuctions09:57
gibiyeah I'm on this track09:57
gibithe filter_pools needs it09:57
gibiand we call that from 3 different places09:58
sean-k-mooneybut you would have to update teh request objecject before calling support/consume/apply09:58
gibi1) during scheduling (there we hace an allocation candidate to work with)09:58
gibi2) during claim (there we have the request spec to work with probably)09:58
gibi3) during consume (that is a big rabbit hole :D)09:58
gibibut you are right09:58
gibithe InstancePCIRequest could carry the PR uuid09:59
gibi_after_ the scheduler made the allocation in placement09:59
gibias that is then fixed09:59
sean-k-mooneywell if save the updated request object after 1 when we claim the allcoation candiate then 2 and 3 can just read it from there09:59
sean-k-mooneyyep09:59
gibiyes09:59
sean-k-mooneyso i think that will work out cleanly in the end09:59
gibiwe did something similar for QoS with the parent_ifname tag in the InstancePCIRequest10:00
sean-k-mooneyah yes that is indeed similar10:00
sean-k-mooneyas that narrowed the set of pools to consider10:00
gibithe RP uuid is actually cleaner10:00
opendevreviewMerged openstack/nova stable/xena: add regression test case for bug 1978983  https://review.opendev.org/c/openstack/nova/+/85321810:01
gibiso eventually we can refactor the QoS path to use that too10:01
sean-k-mooneyyep10:01
sean-k-mooneyonce we start populating the rp uuid in the pci device tabel and pci_stat pools10:01
sean-k-mooneyfor the pci devices are you going to add a new column for the rp_uuid10:02
sean-k-mooneyor just put it in the extra info column10:02
sean-k-mooneyi woudl be tempted to do the former10:02
sean-k-mooneyfor the pools i would just put it in the json blob10:02
gibiat the moment I don't see where I need the rp_uuid from the pci device, when I will see that I can consider this10:03
gibifor the pools it is probably the blob10:03
sean-k-mooneyya so i dont think we will evern need to do an sql query on the pools10:03
sean-k-mooneythat will alway be processed in python10:03
gibiI thinks so too10:03
sean-k-mooneybut the pci devices it would be nice to have them correalated with the placment rp10:03
sean-k-mooneyso it would be nice to put it at least in extra_info blob10:04
sean-k-mooneybut you possibely coudl move the fitlering to sql 10:04
sean-k-mooneyi.e. have filter pools intially get the candiate devcie with an sql query10:05
sean-k-mooneyalhtogh we have the pools in memory so i dont think that actully bys you much10:05
sean-k-mooneyso really i would jsut like it in the pci device tabel fro debugging10:05
sean-k-mooneyso extra info would be fine for that10:05
gibiyeah without the rp_uuid there you have to look up the pci address from the dev and query placement by RP name10:10
gibiI will see how this fits together but it probably make sense to add rp_uuid to the dev10:10
sean-k-mooneyya this is a minor convince feature so we can see where it fits when you get to it10:12
sean-k-mooneywe lived without trackign the neutron port uuid in the requster_id for years 10:12
sean-k-mooneybut due to recent evnets the more info we have for debuging things like this the happier i will be10:13
opendevreviewArnaud Morin proposed openstack/nova master: Unbind port when offloading a shelved instance  https://review.opendev.org/c/openstack/nova/+/85368210:14
amorinsean-k-mooney ^ I wrote both unit + functional tests10:16
sean-k-mooneythanks ill review it shortly10:17
amorinI also patch one of your tests to make it work10:17
amorinno hurries, thanks!10:17
* amorin will be afk for 1h10:17
sean-k-mooneyah yes test_shelve for remote managed ports10:18
opendevreviewMerged openstack/nova stable/xena: For evacuation, ignore if task_state is not None  https://review.opendev.org/c/openstack/nova/+/85321910:52
gibisean-k-mooney: I did a review round on the vdpa series, the last patch needs some func test coverage for live migration but other than that I'm happy 10:56
* gibi disappears for lunch10:57
tobias-urdinfrom what I understand nova's policy code restricts POST /os-external-server-events requests to only be allowed from the admin endpoint, how is that enforced? referring to issue mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=164044311:02
tobias-urdini'm trying to understand if that change is still required and valid or if that fix can be reverted11:02
tobias-urdinto using, for example, the internal endpoint instead11:03
sean-k-mooneygibi: or right good point ill add that11:04
sean-k-mooneytobias-urdin: the external events api is not only admin only but intended to be called only by other services11:05
sean-k-mooneytobias-urdin: as in admin shoudl never call it directly11:05
sean-k-mooneyusing the internal endpoint is fine you the calling service jsut need to use the admin clint11:06
sean-k-mooneybut old cinder did not do that properly11:06
sean-k-mooneyso if cinder have actully fixed there code to use the admin client to call the external events api instead of trying to use the user token then it can be revierted but the change in ooo was jsut a workaround for a bug in cinder11:07
sean-k-mooneynova has never required  /os-external-server-events  to use the admin endpoint11:08
tobias-urdinah, the logic was enforced by cinder I see! that's why I couldn't figure it out, my bad11:08
tobias-urdinthx11:08
sean-k-mooneywell not enforced they just had a bug11:08
tobias-urdinack11:09
sean-k-mooneythey tried to use teh users token that requsted the volume extend to send the external event11:09
sean-k-mooneyinstead of using an admin token11:09
sean-k-mooneyby using the admin endpoint they bypassed the admin check11:09
sean-k-mooneygibi: based on your questions regarding suspend and the compute service bump i realise i need a compute service bump for attach/detach. suspend can use the same one as migrate but attach/detach needs one too. so ill adress all your nit as part of that too11:23
stephenfinsean-k-mooney: done11:40
sean-k-mooneystephenfin: thanks ill respine them shortly11:48
gibisean-k-mooney: ack, good point11:55
sean-k-mooneystephenfin: i really would prefer to use the asserts by the way11:59
stephenfinYou shouldn't though. You've already highlighted the risks of doing so12:00
sean-k-mooneyi dont want to check this at runtime but do want to prevent you breaking the tests with bad mocks/fixtures12:00
stephenfinThen just raise a NovaException and don't handle it12:00
sean-k-mooneybut that is not the behaivor i want12:00
stephenfinWhat's the difference?12:00
sean-k-mooneyor a pattern we shoudl really follow12:00
sean-k-mooneythat will cause the agent to restart potially12:00
stephenfinso will an assert12:01
sean-k-mooneywell it wont because i know this is always set in real code12:01
stephenfinthen the exception won't do anything either12:01
sean-k-mooneybut what im trying to protect agaisnt is you create a unit test where you populate the object manually and fail to set the requried value12:02
sean-k-mooneywe already have assert in real code in nova12:02
sean-k-mooneywe dotn have many but they exists12:02
stephenfinwe do, but they really shouldn't be there and we shouldn't be adding to them12:03
stephenfinI don't get what the issue with 'raise Exception(...)' is12:03
stephenfinif we'll never see the AssertionError in real code then we'll never see the Exception12:03
sean-k-mooneyright but we pay the cost of checking12:04
sean-k-mooneyi can drop the check if you prefer but that is why assert exist12:04
sean-k-mooneyto help you debug and not pay any runtime cost in production code12:04
sean-k-mooneythose asserts didnt actully catch any issue in teh end but validated that that was not the issue12:05
sean-k-mooneyit wasnt the path that was wrong it was the key that was used for the path in the end12:05
stephenfinYeah, it's a band-aid for the lack of consistent type hinting in the code base, unfortunately12:06
stephenfinbut to answer your question, can we drop it so?12:06
sean-k-mooneystephenfin: the fact that the dev_path is not a keyword arg is enough documentaiton for me that we require this arg12:06
sean-k-mooneyso ya ill drop them12:07
sean-k-mooneycan typing assert that something is not None by the way12:07
sean-k-mooneyor not None or '' in this case12:07
sean-k-mooneyi dont think so but it would be nice if it could12:07
stephenfindef foo(bar: str) -> None:12:08
stephenfinbar has to be a string. It can't be a bool, int, None or anything esle12:08
stephenfin*else12:08
stephenfinit won't help you with your tests though since we don't type check our tests12:08
sean-k-mooneyit can be ''12:08
stephenfinyes, it could12:08
sean-k-mooneyok ill drop them when i refactor12:09
stephenfinta12:09
sean-k-mooneymind if i keep the ablity to disable optimiasation12:09
sean-k-mooneyill add it to passargs12:09
sean-k-mooneyinstead12:09
opendevreviewArnaud Morin proposed openstack/nova master: Unbind port when offloading a shelved instance  https://review.opendev.org/c/openstack/nova/+/85368212:09
sean-k-mooneythen i can add asset when debuging and just turn it off on the command line12:09
sean-k-mooney*passenv12:10
stephenfinwfm12:10
sean-k-mooneycool thanks for looking12:10
opendevreviewArnaud Morin proposed openstack/nova master: Unbind port when offloading a shelved instance  https://review.opendev.org/c/openstack/nova/+/85368212:15
stephenfinsean-k-mooney: regarding your earlier comments on +1 vs +2 review-priority, have you hovered over the +1 and +2 review-priority buttons?12:15
sean-k-mooneyya12:15
sean-k-mooneyi know what the text says12:15
sean-k-mooneythe contibutor vs core promise thing is a bit weired to me12:16
stephenfingibi: This PCI in placement series is really well structured and pretty easy to review (especially for PCI code /o\). Nice work (y)12:16
stephenfinoh, I read it as code12:16
stephenfinI thought they were saying +1 means _someone_ will review it but not me12:17
stephenfin*not necessarily me12:17
stephenfinOh well :)12:17
sean-k-mooneyso the idea was +1 can be set by anyone and they will review12:17
sean-k-mooneyan that is an indication that cores can use to perhaps also review it12:17
sean-k-mooneyand +2 is a commitmnet form the core reivew to review this12:18
gibiI think +1 is not well defined for cores12:18
gibiso sean-k-mooney you are free to use to for a "maybe"12:18
sean-k-mooneysylvain wanted to use +1 as a way for non cores to singal to cores that something might be ready for review too12:19
sean-k-mooneythe text in my orgially patch was +1 is core review requested and +2 is core review approved but that was also problematic12:19
gibiI think +1 for non-cores is the same as +2 for cores12:19
gibiboth is a promise12:19
sean-k-mooneyyep12:19
gibithat I, who set it, will review the patch12:20
sean-k-mooneyim using +1 as im going to review this but not nessiarly ping other to review it12:20
sean-k-mooneyvs +2 ill review it and when im going to give my +2 ill ping others to review it too12:20
gibithat is OK to me12:20
sean-k-mooneyi.e. i not only commit to reviewing but i also care about this not waiting for every12:21
sean-k-mooneykindo fo like feature-liason lite12:21
gibistephenfin: I needed the small step in the PCI work for myself too to see what is missing :) The inventory part is self contained mostly in the new translator. The scheduling part will be less easy to read (once I write it :D)12:21
gibisean-k-mooney: that make sense12:21
stephenfingibi: I got as far as https://review.opendev.org/c/openstack/nova/+/851358 +2 on everything I think12:24
* stephenfin lunches12:24
gibistephenfin: thank you, have a good one12:24
sean-k-mooneymy plan for there rest of the day is finish the vdpa seriese, review the pci serise and if melwitt has updated the encyption series review that. my plan for next week assuming vdpa is done is 100% upstream review so please ping as needed12:27
gibisean-k-mooney: ack. I will do another vdpa round today if needed12:29
gibistephenfin: thanks for noticing the TODO in https://review.opendev.org/c/openstack/nova/+/851358 I forgot it. Actually the patches above that are also ready until https://review.opendev.org/c/openstack/nova/+/850468 12:30
opendevreviewAmit Uniyal proposed openstack/nova stable/wallaby: add regression test case for bug 1978983  https://review.opendev.org/c/openstack/nova/+/85381112:31
opendevreviewAmit Uniyal proposed openstack/nova stable/wallaby: For evacuation, ignore if task_state is not None  https://review.opendev.org/c/openstack/nova/+/85381212:31
gibibut it is more than fair to stop there now12:31
stephenfinack, I'll keep going with it this afternoon so. If you could cobble together a follow-up I can finish that12:32
gibistephenfin: I will hold off with the follow up until sean-k-mooney reviews it12:38
opendevreviewAmit Uniyal proposed openstack/nova master: Adds check for VM snapshot fail while quiesce  https://review.opendev.org/c/openstack/nova/+/85217112:42
*** tosky is now known as Guest53212:51
*** tosky_ is now known as tosky12:51
*** tbachman_ is now known as tbachman13:10
opendevreviewMerged openstack/nova master: Add reno for fixing bug 1941005  https://review.opendev.org/c/openstack/nova/+/85326513:38
*** dasm|off is now known as dasm13:59
JayFGood morning; thanks for the reviews I've already been getting. I do have a couple of other PRs open in Gerrit I'd appreciate reviews on that are ready to go: https://review.opendev.org/c/openstack/nova/+/853529 and https://review.opendev.org/c/openstack/nova/+/853540 - thanks in advance.14:54
opendevreviewMerged openstack/nova stable/wallaby: Ignore plug_vifs on the ironic driver  https://review.opendev.org/c/openstack/nova/+/82134914:58
opendevreviewJay Faulkner proposed openstack/nova stable/ussuri: Ignore plug_vifs on the ironic driver  https://review.opendev.org/c/openstack/nova/+/82135114:59
JayFI need to re-learn my alphabet, apparently. Rebased the wrong one lol14:59
opendevreviewJay Faulkner proposed openstack/nova stable/victoria: Ignore plug_vifs on the ironic driver  https://review.opendev.org/c/openstack/nova/+/82135015:00
opendevreviewMerged openstack/nova master: block_device: Add DriverImageBlockDevice to block_device_info  https://review.opendev.org/c/openstack/nova/+/82652715:34
gibi2022-08-19 17:35:22,086 DEBUG [nova.pci.stats] Dropped 1 device(s) as they are on the wrong NUMA node(s)15:40
gibi2022-08-19 17:35:22,086 DEBUG [nova.pci.stats] Dropped 1 device(s) that are not part of the placement allocation15:40
gibi2022-08-19 17:35:22,086 DEBUG [nova.pci.stats] Not enough PCI devices left to satisfy request15:40
gibi... and the NUMATopologyFilter now works with placement allocation candidates \o/15:40
sean-k-mooneynice15:40
gibithere are some raw edges but the general idea seems to work15:41
gibinow I have like 10 WIP commits locally to clean up :D15:42
sean-k-mooneyack im sure we can flesh that out via review and or cleanups15:42
opendevreviewMerged openstack/nova master: block_device: Add encryption attributes to image and ephemeral disks  https://review.opendev.org/c/openstack/nova/+/82652816:23
opendevreviewmelanie witt proposed openstack/nova master: Follow up changes for ephemeral encryption  https://review.opendev.org/c/openstack/nova/+/85325417:50
opendevreviewBalazs Gibizer proposed openstack/nova master: DNM: why I cannot set request_id on InstancePCIRequiest  https://review.opendev.org/c/openstack/nova/+/85383518:23
gibisean-k-mooney: if you are still around ^^ I totally don't get this18:24
sean-k-mooneyi dont think you want to set request_id you want to set requester_id18:30
sean-k-mooneyill take a look quickly but we set this in teh neutorn module somewhere i think18:31
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/objects/instance_pci_requests.py#L43-L4418:33
sean-k-mooneybut actully it depens on what you want to track ther18:33
sean-k-mooneyare you tryign to add the placment request group or resouce provider there18:33
gibiI need a unique id18:33
sean-k-mooneywell the request_id should be unique18:34
gibifor neutron based InstancePCIRequests we generate a uuid for request_id18:34
gibiI try to do the same for the flavor based requests18:34
sean-k-mooneyyes and we set requester_id to the neutorn port uuid18:34
sean-k-mooneyah ok18:34
sean-k-mooneyam that should be posible 18:35
sean-k-mooneywhat error do you get18:35
gibihttps://github.com/openstack/nova/blob/master/nova/network/neutron.py#L239018:35
gibithe test fails as the domain has no PCI device18:36
sean-k-mooneyi would guess this is a bug in our fixture18:36
gibihttps://paste.opendev.org/show/bK1kbEypC2EUHXtL2ruT/18:36
sean-k-mooneylet me grab that patch and run it locally18:36
gibianyhow my brain is toasted and my wife just arrived so I have to log off. don't think too much about this issue it is late friday anyhow18:37
gibisee you all on Monday18:37
sean-k-mooneyok18:37
sean-k-mooneyim going to call it a day too18:37
gibihave a nice weekend18:38
sean-k-mooneyyou too ill try and look at this on monday after i rebase the vdpa patches18:38
opendevreviewDan Smith proposed openstack/nova-specs master: WIP: Robustify Compute Node Hostnames  https://review.opendev.org/c/openstack/nova-specs/+/85383718:58
dansmithsean-k-mooney: artom: ^18:58
dansmiththat's a big chunk of work, which we may never do, but I thought it was probably good to document some of the things we could/should do to make this better18:59
dansmitheither to point to and say "see, too big, never going to happen" or the opposite18:59
dansmithI've been thinking about the first work item for a long time and I think we should probably do that for safety even if we don't do any of the rest of it18:59
artomThat's kind of in the same vein as https://bugzilla.redhat.com/show_bug.cgi?id=1965419, which came up before when another customer renamed their hosts19:02
artomNow that I think about it, it may have been that exact same KCS19:02
artomBecause it was a 10 -> 13 FFU19:02
artomSorry, leaking downstream here19:02
sean-k-mooneyi mean we were broken in 16.1 requirenign neutron ot add a new config option 19:04
sean-k-mooneyhttps://bugzilla.redhat.com/show_bug.cgi?id=190050019:04
sean-k-mooneyresource_provider_default_hypervisor19:04
sean-k-mooneyhttps://github.com/openstack/neutron/commit/577217c52d677ba35ca78b87c06302d506f66ff9 and https://github.com/openstack/neutron/commit/ddf0fef28b7095724c8ba27f3275d0dad225225119:06
sean-k-mooneywere added to neutorn to work aorund changes in ooo19:06
opendevreviewDan Smith proposed openstack/nova-specs master: WIP: Robustify Compute Node Hostnames  https://review.opendev.org/c/openstack/nova-specs/+/85383719:30
*** dasm is now known as dasm|off21:33

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!