Thursday, 2022-08-18

03:10 <opendevreview> Merged openstack/nova master: block_device_info: Add swap to inline  https://review.opendev.org/c/openstack/nova/+/826523
05:19 <opendevreview> Amit Uniyal proposed openstack/nova master: Adds check for VM snapshot fail while quiesce  https://review.opendev.org/c/openstack/nova/+/852171
06:52 <opendevreview> Merged openstack/nova master: libvirt: Improve creating images INFO log  https://review.opendev.org/c/openstack/nova/+/826524
06:57 <opendevreview> Merged openstack/nova master: libvirt: Remove defunct comment  https://review.opendev.org/c/openstack/nova/+/826525
09:38 <opendevreview> Balazs Gibizer proposed openstack/nova master: Trigger reschedule if PCI consumption fail on compute  https://review.opendev.org/c/openstack/nova/+/853611
10:04 <gibi> sean-k-mooney: for yesterday's PCI filter bug discussion: the PciDevicePool object is set up differently depending on where it is used. On the compute it has a devices dict with PciDevice objects, but that is not present from the scheduler perspective.
10:04 <gibi> so we cannot simply replace support_requests with consume_requests
10:04 <gibi> as consume_requests depends on pool['devices'] being present
10:06 <gibi> we can call apply_requests and that will decrease the pool count, so it will detect the double booking. the only downside is that it does not support dependent devices
10:07 <gibi> as that would again require pool['devices'] to be present in the stats
10:07 <sean-k-mooney> ack
10:07 <gibi> btw support_requests is called from both the PciPassthroughFilter and the NumaTopologyFilter
10:07 <sean-k-mooney> ya, the scheduler just works on the count
10:07 <sean-k-mooney> yes
10:08 <sean-k-mooney> the numa topology filter needs to validate the device numa topology
10:08 <gibi> yepp
10:08 <gibi> so I will replace support with apply
10:08 <gibi> that will enhance the scheduling logic
10:08 <sean-k-mooney> you probably want to call both, no?
10:08 <sean-k-mooney> support then apply
10:09 <gibi> apply does what support does but also decreases counts
10:09 <sean-k-mooney> support is just a simple dict comprehension so it's cheap
10:09 <sean-k-mooney> ack, but is apply much more complex or about the same?
10:09 <gibi> I can call both but I think apply is a superset of support
10:09 <sean-k-mooney> have not looked in a while
10:10 <gibi> support is basically _filter_pools, apply is _filter_pools + _decrease_pool_count
10:10 <sean-k-mooney> ya, I'm just wondering about cost, but I guess apply is already called for multi-create
10:10 <gibi> apply is called at the end, yes
10:10 <gibi> anyhow
10:10 <sean-k-mooney> ack, so I guess that is fine
10:11 <sean-k-mooney> will apply decrease the pools if they don't all fit?
10:11 <sean-k-mooney> i.e. can it handle partial cases
10:11 <sean-k-mooney> I'm just wondering do we need to do an atomic swap
10:12 <gibi> I can call apply on a local copy of the stats
10:12 <gibi> then drop the copy
10:12 <sean-k-mooney> ack
10:12 <gibi> so no interference between parallel requests
10:12 <sean-k-mooney> well I was thinking swap the copy with the original if it succeeds
10:12 <sean-k-mooney> so that multi-create works
10:13 <sean-k-mooney> although
10:13 <sean-k-mooney> maybe not
10:13 <sean-k-mooney> dropping it might be better, to not break the numa topology filter
10:13 <gibi> we have two phases for multi-create: 1) running filters (this will apply on a copy) 2) consuming the selected host (this already applies on shared stats)
10:13 <sean-k-mooney> right, we will need to do apply 3 times
10:14 <sean-k-mooney> pci filter, numa topology filter and host manager
10:14 <gibi> yes
10:14 <sean-k-mooney> the first two should be copies to not break each other
10:14 <gibi> yes
10:14 <gibi> so I will keep the support call doing the apply on a copy
10:14 <gibi> and keep the apply working on the shared stats
10:14 <sean-k-mooney> you could make support call apply internally on a copy
10:15 <gibi> but then I need to signal to apply when to copy and when not to copy
10:15 <sean-k-mooney> I have not looked at the signature, but do we pass the pools in to apply?
10:15 <sean-k-mooney> or does it get them itself
10:16 <sean-k-mooney> I was assuming we passed them in
10:16 <sean-k-mooney> so support could do the copy for it
10:17 <sean-k-mooney> _apply takes the pools https://github.com/openstack/nova/blob/master/nova/pci/stats.py#L622-L653
10:17 <sean-k-mooney> it's up to you which you think is cleaner
10:17 <gibi> yes, but apply_requests doesn't
10:17 <gibi> anyhow, I will push the code soon
10:17 <sean-k-mooney> let me know when it's ready to review
10:18 <gibi> and we can look at that
10:18 <sean-k-mooney> cool
10:18 <gibi> thanks for the discussion
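
The copy-then-apply idea settled on above, in rough Python; this is only a sketch of the pattern, not nova's code, and it assumes a stats object exposing apply_requests() that raises on failure:

    import copy

    def supports_requests(stats, pci_requests):
        # Run the full apply logic (filter pools + decrease counts) against a
        # throwaway deep copy, so parallel filter runs and multi-create
        # scheduling never mutate the shared per-host stats.
        local_stats = copy.deepcopy(stats)
        try:
            local_stats.apply_requests(pci_requests)
        except Exception:  # real code would catch the specific PCI exception
            return False
        return True
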
10:24 <sean-k-mooney> I'm going to try and power through and finish the vdpa patches today, just an fyi
10:24 <sean-k-mooney> it's mainly just tests and docs at this point
10:24 <opendevreview> efineshi proposed openstack/python-novaclient master: Fix nova host-evacuate won't work with hostname is like a.b.c  https://review.opendev.org/c/openstack/python-novaclient/+/853465
10:25 <sean-k-mooney> although I have one downstream bug to look at first...
10:27 <gibi> sean-k-mooney: sure, I can review the rest of the vdpa series when it is ready
10:29 <opendevreview> Balazs Gibizer proposed openstack/nova master: Trigger reschedule if PCI consumption fail on compute  https://review.opendev.org/c/openstack/nova/+/853611
10:35 <opendevreview> Balazs Gibizer proposed openstack/nova master: Trigger reschedule if PCI consumption fail on compute  https://review.opendev.org/c/openstack/nova/+/853611
12:31 <opendevreview> Merged openstack/nova master: imagebackend: default by_name image_type to config correctly  https://review.opendev.org/c/openstack/nova/+/826526
12:31 <opendevreview> Merged openstack/nova master: image_meta: Add ephemeral encryption properties  https://review.opendev.org/c/openstack/nova/+/760454
12:31 <opendevreview> Merged openstack/nova master: BlockDeviceMapping: Add encryption fields  https://review.opendev.org/c/openstack/nova/+/760453
12:31 <opendevreview> Merged openstack/nova master: BlockDeviceMapping: Add is_local property  https://review.opendev.org/c/openstack/nova/+/764485
12:31 <opendevreview> Merged openstack/nova master: compute: Update bdms with ephemeral encryption details when requested  https://review.opendev.org/c/openstack/nova/+/764486
12:47 <opendevreview> sean mooney proposed openstack/nova master: add sorce dev parsing for vdpa interfaces  https://review.opendev.org/c/openstack/nova/+/841016
12:52 <opendevreview> sean mooney proposed openstack/nova master: add sorce dev parsing for vdpa interfaces  https://review.opendev.org/c/openstack/nova/+/841016
13:42 *** abhishekk is now known as akekane|home
13:43 *** akekane|home is now known as abhishekk
13:49 <dansmith> gibi: can you look at my reply here real quick? https://review.opendev.org/c/openstack/nova/+/852900
13:50 <dansmith> if you agree that I need to chase down the tests that fail because of shared state, I'll give that a shot
13:50 <gibi> dansmith: sure, I will look in 5
13:50 <dansmith> but if you have some other idea about why that might be, I'll be glad to have it before getting into that rabbit hole :)
13:50 <dansmith> thx
13:53 <gibi> OK, I need to run it locally to see the issues
13:55 <dansmith> the libvirt reshape test is the one I remember
13:58 <dansmith> hmm, maybe resetting the client during restart_compute_service is all I need
13:58 <dansmith> I need to run a full set now to see if the reshape ones are the only ones
14:00 <dansmith> what I really wanted to say in that comment was something like "this compute service stack is where we use a client with all the complex internal state and thus maybe we shouldn't share that with anything else that isn't part of this set of objects"
14:00 <dansmith> which is maybe still a good idea, I dunno
14:00 <dansmith> but re-using the same init code has the benefit of the similar error messages and things
14:02 <gibi> hm, restart_compute_service will trigger the creation of a new ComputeManager instance
14:03 <gibi> so you are right: that behaves differently in the test, where the two ComputeManagers will share report client state, from reality, where a compute restart will recreate the client
14:04 <dansmith> yeah, so I put a call there to reset the global state and it passed the few reshape tests I had in my history
14:04 <dansmith> running the full set now
14:05 <gibi> but in func tests we run all of our computes in the same process, so now they will share the same report client across compute hosts
14:06 <dansmith> will that work because of the multi-root functionality you mentioned?
14:06 <sean-k-mooney> I kind of feel like we should separate the singleton changes from the lazy loading
14:06 <sean-k-mooney> if you have not already done that
14:06 <dansmith> but as I said above, I'm also fine keeping them separate for the compute manager part if you think that's better
14:06 <sean-k-mooney> just so there is less change to consider
14:06 <gibi> dansmith: I'm not sure about the scope of the multi-root functionality
14:07 <dansmith> and I'll try to write something more concise than my verbose sentence above, but more useful than the one I put in there that caused the confusion
14:07 <dansmith> sean-k-mooney: they're already together, and can't be separated without losing some of the behavior that gibi wanted to keep
14:07 <gibi> I think in the short term keep a separate client per ComputeManager instance and note why we are doing it.
14:08 <dansmith> gibi: ack, sounds good
14:08 <gibi> I don't like the global, but I checked and in most cases it is safe based on how we use it
14:08 <gibi> the compute manager is an exception
14:09 <gibi> at least due to the func tests, but it also might be due to ironic's multiple nodes per compute
14:09 <dansmith> the global results in fewer hits to keystone and also fewer places we could fail if a call to keystone fails, and mirrors our other client behaviors
14:09 <dansmith> (the internal state does not, of course, but...)
14:10 <gibi> yeah, I accept the compromise
14:10 <gibi> the global has pros and cons
14:10 <dansmith> if you really hate the global I can switch it to per-use lazy load
14:10 <dansmith> but aside from the state thing, I don't know why we would prefer that
14:11 <gibi> yeah, it's only the shared state that makes it complicated.
14:11 <dansmith> ack, so we could also make each call to get the singleton generate a new local state object so that that part is not shared everywhere
14:11 <dansmith> I could leave a note in there with the idea in case it becomes problematic in the future
14:12 <gibi> if I had time I would also trim the report client to have only those methods that depend on the shared state, and move the independent functions somewhere else to make it clear what is problematic
14:12 <dansmith> ack, there are several things we *could* do to make this cleaner for sure
14:12 <gibi> the extra note in the singleton works for me
14:12 * gibi needs a time machine
14:13 <dansmith> ack, will do that: no singleton for the compute manager, and add that test
14:13 <gibi> ack
14:13 <gibi> thanks
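
The pattern being weighed here, sketched with invented names (get_report_client, reset_globals); it is illustrative only, not the actual patch:

    import threading

    _report_client = None
    _lock = threading.Lock()

    def get_report_client(factory):
        # One lazily created, process-wide client: fewer keystone round trips
        # and fewer startup failure points, at the cost of shared internal state.
        global _report_client
        if _report_client is None:
            with _lock:
                if _report_client is None:
                    _report_client = factory()
        return _report_client

    def reset_globals():
        # Functional tests "restart" a compute in-process, so they drop the
        # cached client here to mimic a real restart recreating it.
        global _report_client
        _report_client = None
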
14:17 <gibi> sean-k-mooney: I moved forward with the allocation candidate filtering step in hardware.py. That made me realize two things: 1) placement tells us RP uuids in the allocation candidates, and the stats/hardware.py code needs to map those RP uuids to pools; for that I need extra information in the pool. 2) the prefilters are run in the scheduler, so creating RequestGroups there is good for scheduling but not good for using those RequestGroups (especially the provider mappings in them) later to drive the PCI claim, as those RequestGroups are not visible outside of the scheduler process. For the QoS ports we create the groups in compute.api so those are visible to the conductor and the compute
14:19 <sean-k-mooney> I think for cpu pinning we do it elsewhere too
14:19 <gibi> as the prefilters act on the RequestSpec, I'm tempted to move the prefilter run to the conductor
14:20 <gibi> I can try and see what falls out
14:20 <gibi> how do you feel about it?
14:20 <sean-k-mooney> moving part of the scheduling out of the scheduler
14:20 <sean-k-mooney> that feels like more than we need to address this
14:20 <gibi> alternatively, I can try to return the modified request spec from the scheduler to the conductor
14:20 <gibi> but that is an RPC change
14:20 <gibi> afaik
14:21 <sean-k-mooney> I'm looking to see what we do for PMEM and cpu pinning; I remember we did it differently for one of them
14:21 <sean-k-mooney> before the prefilters
14:22 <sean-k-mooney> I'm fine with moving the request group creation to before the call to the scheduler
14:22 <sean-k-mooney> moving the running of the prefilters to the conductor, however, is I think probably more than we want to do
14:23 <gibi> ack, that would be another alternative: do the request group generation not in a prefilter
14:23 <gibi> that would make this similar to how QoS works
14:28 <sean-k-mooney> this is where we do it for cpu pinning https://github.com/openstack/nova/blob/e6aa6373d98103348a8ee3c59814350ea1556049/nova/scheduler/utils.py#L80
14:29 <gibi> I think that also runs in the scheduler
14:30 <sean-k-mooney> the implementation I think is in the request spec object
14:30 <sean-k-mooney> oh, it's not
14:30 <sean-k-mooney> https://github.com/openstack/nova/blob/e6aa6373d98103348a8ee3c59814350ea1556049/nova/scheduler/utils.py#L305-L321
14:31 <sean-k-mooney> but those are free-standing functions that build up the resource class request
14:31 <gibi> so that works as you never need to know where the VPMEM resource was fulfilled from
14:32 <sean-k-mooney> ya
14:32 <gibi> I will figure out something along the lines of the cyborg and qos requests
14:32 <gibi> both need the request groups after the scheduling to drive the claim on the compute
14:33 <sean-k-mooney> ya
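
A rough illustration of the direction gibi describes: build the PCI request groups from the flavor before the scheduler call (as the QoS-port and cyborg flows do), so the provider mappings come back with the selection and can drive the claim later. All names and the dict shape below are invented for the sketch:

    def pci_requests_to_request_groups(pci_requests):
        groups = []
        for index, pci_request in enumerate(pci_requests):
            # One suffixed group per PCI request, so placement reports which
            # resource provider satisfied which request; that mapping is what
            # the conductor/compute needs to drive the PCI claim.
            groups.append({
                'suffix': str(index + 1),
                'requester_id': pci_request.get('request_id'),
                'resources': dict(pci_request.get('resources', {})),
            })
        return groups
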
14:34 <sean-k-mooney> we do have some pci affinity code there by the way, added by https://github.com/openstack/nova/commit/db7517d5a8aaa5a24be12d9c3453dcd98d9a887e
14:34 <gibi> yeah, that also only extends the unsuffixed request group
14:35 <sean-k-mooney> yep, it's just finding a host that supports it
14:35 <sean-k-mooney> but you would potentially have to do that per pci device now
14:35 <sean-k-mooney> or at least eventually
14:35 <sean-k-mooney> since the policy is settable per alias
14:36 <sean-k-mooney> we can probably pretend I did not mention this for now :)
14:36 <sean-k-mooney> we said in the spec that we would leave numa to the numa topology filter
14:37 <sean-k-mooney> so going back to your original issue
14:37 <sean-k-mooney> you need to be able to correlate the pci pools to the placement candidates
14:37 <sean-k-mooney> and provider summaries
14:37 <gibi> hm, is that settable per alias? I see a flavor extra spec 'hw:pci_numa_affinity_policy': 'socket'
14:37 <sean-k-mooney> ya it is, one sec
14:38 <opendevreview> Dan Smith proposed openstack/nova master: Unify placement client singleton implementations  https://review.opendev.org/c/openstack/nova/+/852900
14:38 <opendevreview> Dan Smith proposed openstack/nova master: Avoid n-cond startup abort for keystone failures  https://review.opendev.org/c/openstack/nova/+/852901
14:38 <sean-k-mooney> https://github.com/openstack/nova/blob/master/nova/pci/request.py#L104-L107
14:38 <dansmith> gibi: ^
14:38 <dansmith> (no rush)
14:38 <gibi> ahh, we have "numa_policy": "required" in the alias
14:38 <gibi> dansmith: ack, I will look today
14:39 <sean-k-mooney> gibi: you can set any of the policies via the alias but they only take effect if not overridden by the flavor or image.
14:39 <sean-k-mooney> or port
14:39 <sean-k-mooney> but port is out of scope for now
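
The precedence just described, as a minimal sketch; the helper name is made up and the flavor/image conflict validation nova performs is skipped here:

    def effective_numa_affinity_policy(flavor_policy, image_policy, alias_policy):
        # The flavor or image setting wins; the numa_policy on the alias is
        # only a fallback when neither is set.
        return flavor_policy or image_policy or alias_policy
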
14:40 <gibi> sean-k-mooney: I assume for now (with a big bunch of ignorance) that it is handled by the NumaTopologyFilter properly regardless of the placement allocation :D
14:40 <gibi> but we have functional tests to see if this assumption holds
14:41 <sean-k-mooney> the numa topology filter should reject the host if the host can't support it today
14:41 <sean-k-mooney> but it's not currently allocation candidate aware
14:41 <sean-k-mooney> so it's considering all possible devices
14:41 <gibi> the numa topology filter will be after my change, as it calls support_requests, which will be a_c aware
14:42 <sean-k-mooney> not the ones in any one set of allocation candidates
14:42 <sean-k-mooney> yep
14:42 <gibi> so it has a chance to work :)
14:42 <sean-k-mooney> yep
14:42 <sean-k-mooney> as I said, we can ignore that wrinkle for now
14:42 <gibi> as for pool correlation with the allocation candidate, I will try to add a list of RP uuids to each pool, showing where the pool gets its devices from
14:43 <sean-k-mooney> ack. currently it should be 1:1
14:43 <sean-k-mooney> each pool should be mapped to a single RP, correct
14:43 <gibi> that would be awesome, if pool:RP is 1:1
14:43 <gibi> I thought that VFs from two PFs might end up in the same pool
14:44 <sean-k-mooney> I think they will be two pools
14:44 <gibi> I will check
14:44 <gibi> but this sounds good at least
14:44 <sean-k-mooney> pools are not 1:1 to device_spec entries
14:44 <gibi> then I can drive the pool consumption logic based on the RP uuids in the allocation candidate
14:44 <sean-k-mooney> but each PF gets its own pool
14:44 <sean-k-mooney> and VFs from different PFs are separate
14:45 * gibi makes a note to create extra test cases with multiple PFs providing identical VFs
14:45 <sean-k-mooney> by the way, if that is not the case today I don't see any reason we can't change it to make it 1:1
14:45 <gibi> yeah, that would have been my next proposal :)
14:46 <sean-k-mooney> the pools are stored in the pci_stats object in the compute node record
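
The pool/RP correlation gibi is after could look roughly like this, assuming each pool ends up carrying the UUID of the resource provider backing it (the 1:1 mapping discussed above); the key name rp_uuid is an assumption:

    def filter_pools_by_allocation(pools, allocated_rp_uuids):
        # Keep only the pools whose backing resource provider appears in the
        # chosen allocation candidate, so consumption only uses devices that
        # placement actually allocated.
        allocated = set(allocated_rp_uuids)
        return [pool for pool in pools if pool.get('rp_uuid') in allocated]
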
14:47 <gibi> hehe, I already have a note on where to create RequestGroups from the flavor https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L534-L540
14:48 <sean-k-mooney> hehe, ya, I was expecting it to be in the request_spec object
14:48 <sean-k-mooney> for cpu etc.
14:48 <sean-k-mooney> not the scheduler utils
14:50 <gibi> ohh, and I even tried to resolve that todo at some point https://review.opendev.org/c/openstack/nova/+/647396
14:51 <sean-k-mooney> hehe, you have too many commits :P
14:53 <gibi> OK, at least the commit message confirms my current view
14:54 <gibi> and adding just the pci alias related groups from the flavor does not seem problematic at that point
14:55 <sean-k-mooney> ya, we already require that the alias definition is the same on the api and the compute nodes
14:57 <sean-k-mooney> so you should be able to use the alias from the current config safely
14:57 <sean-k-mooney> actually that does not matter, you are not changing the requirement
14:58 <gibi> I don't think I depend on the alias on the compute, but good to know
14:58 <sean-k-mooney> resize requires the alias to be the same to create the correct pci requests, I believe
14:59 <gibi> hm, interesting
15:00 <sean-k-mooney> I have to join a call, but we have docs about it
15:01 <sean-k-mooney> https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#configure-nova-api I think we dropped the reason from the doc
15:08 <JayF> melwitt: ack; thank you. Assuming you are OK if I want to run with trying to get those landed?
15:09 <melwitt> JayF: yes, please feel free
15:09 <JayF> Anything else vaguely-ironic-related you want to point me at, please do
15:09 <JayF> I'm throwing a backport party and all patches are invited ;)
15:09 *** dasm|off is now known as dasm
15:13 <melwitt> ok, can do :)
15:24 <opendevreview> John Garbutt proposed openstack/nova master: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/842478
15:27 <opendevreview> John Garbutt proposed openstack/nova master: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/842478
15:55 <opendevreview> Arnaud Morin proposed openstack/nova master: Unbind port when offloading a shelved instance  https://review.opendev.org/c/openstack/nova/+/853682
15:57 <amorin> hello sean-k-mooney, see ^ for a proposal for the shelved instances with bound ports
15:58 <amorin> I can write a unit test for this if it seems good to the team
16:12 <sean-k-mooney> ah, you went with the flag approach rather than splitting it
16:12 <sean-k-mooney> ya, that should work
16:12 <sean-k-mooney> ideally we would have both unit and functional tests for this
16:12 <sean-k-mooney> but the direction looks fine
16:13 <sean-k-mooney> I guess we can see what ci says
16:13 <sean-k-mooney> and if it breaks anything
19:11 <opendevreview> sean mooney proposed openstack/nova master: fix suspend for non hostdev sriov ports  https://review.opendev.org/c/openstack/nova/+/841017
19:32 <sean-k-mooney> dansmith: got a sec to confirm something? is bumping the compute service version and checking it in pre-live-migrate sufficient to assert that the source and dest support a feature, where there are no other rpc changes required?
19:32 <dansmith> yeah
19:33 <sean-k-mooney> I believe the answer is yes but just want to check before I add that and add tests
19:33 <sean-k-mooney> ok
19:33 <sean-k-mooney> it's for the hotplug migration for vdpa. I'm adding a conductor check to bail if both hosts are not at the required compute service version
19:34 <sean-k-mooney> it's what we did for sriov migration too
19:55 <dansmith> cool
19:56 <sean-k-mooney> https://github.com/openstack/nova/blob/e6aa6373d98103348a8ee3c59814350ea1556049/nova/objects/service.py#L34 is not where I was expecting that to be set... I was looking in the compute manager
19:57 <sean-k-mooney> is the service version really global across the compute agent/scheduler/api etc.?
19:58 <sean-k-mooney> I thought they all had their own version number
19:58 <sean-k-mooney> apart from the rpc version
19:58 <dansmith> it's tied to the object, not the service,
19:58 <dansmith> and it's supposed to be global because it's kinda like a git hash.. "the code is up to this level"
19:58 <sean-k-mooney> ok
19:59 <dansmith> but within any given service (and like inside a container) it's always the same value, that's the point
19:59 <sean-k-mooney> ack, I'm just trying to see where I need to bump it
19:59 <dansmith> there, there's only one place
19:59 <sean-k-mooney> cool
19:59 <sean-k-mooney> just making sure
20:00 <dansmith> it's global in the code, but set on the service record in the database when a service updates its live-ness, which is why you can check it per-binary and per-host
20:00 <sean-k-mooney> ah ok
20:00 <sean-k-mooney> so the services initialise themselves with the current value when the process starts, and that is used to set the value in the db
20:01 <sean-k-mooney> so if a node has a different value in the db you know if it's older or newer
20:01 <sean-k-mooney> I was just expecting to see the constant directly in the compute manager
20:07 <dansmith> right, but we can use service versions for any of the other services as well
20:07 <dansmith> so it's per-service, not just for compute
20:08 <sean-k-mooney> yep, I thought we had a separate counter per service
20:08 <sean-k-mooney> but I can see why we share one
20:09 <sean-k-mooney> as that way we don't need to check if the scheduler is x and the conductor is y
20:10 <dansmith> you could still have that situation, of course
20:10 <dansmith> but instead of checking conductor_version=x, you check binary=conductor,version=x
20:11 <sean-k-mooney> ya, you could
20:11 <sean-k-mooney> but at least there would be one x value for all services for a given feature
20:13 <dansmith> no
20:13 <dansmith> if you have an old conductor and a new conductor, you could have different versions for each
20:13 <sean-k-mooney> that's not what I meant
20:13 <dansmith> it's really host=$host,binary=conductor,version=$x
20:14 <dansmith> that's the tuple to get the version of one service on one host
20:14 <sean-k-mooney> I can define 62 to mean we support vdpa hotplug migration
20:14 <sean-k-mooney> and if that needed to be checked for the compute or conductor or scheduler, it's still just "is that service >= 62"
20:15 <dansmith> yes, for a given host,binary pair, or (all),binary if you want to wait until they're all ready, yes
20:15 <sean-k-mooney> yes, each service instance for the conductor might be above or below it, but by having a single counter I can correlate that 62 means "supports feature X"
20:15 <dansmith> yeah
20:16 <sean-k-mooney> so I'm currently debating between checking the specific source and dest
20:16 <sean-k-mooney> or the min compute service version in the deployment
20:16 <sean-k-mooney> technically I just need the specific ones, but I would prefer to wait till you're fully upgraded, so min is tempting
20:17 <dansmith> yeah, min is pretty easy to understand and hard to screw up :)
20:18 <sean-k-mooney> yep, I'll go with that
20:19 <sean-k-mooney> if I do min I can actually do the check in the api instead
20:20 <sean-k-mooney> if the instance has vdpa ports and version < 62, raise 400
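
The gate sean-k-mooney sketches out would look roughly like this; the version number 62 comes from the discussion above, and the function name and exception are placeholders rather than the eventual implementation:

    VDPA_HOTPLUG_MIGRATION_VERSION = 62  # assumed value from the discussion

    def check_vdpa_live_migration(instance_has_vdpa_ports, min_compute_service_version):
        # Gating on the deployment-wide minimum compute service version means
        # the feature only turns on once every compute is upgraded, which is
        # simpler to reason about than checking just the source and dest.
        if (instance_has_vdpa_ports
                and min_compute_service_version < VDPA_HOTPLUG_MIGRATION_VERSION):
            raise RuntimeError(  # the API would return a 400 here instead
                'vDPA hot-plug live migration requires all nova-compute '
                'services to be at version >= %d'
                % VDPA_HOTPLUG_MIGRATION_VERSION)
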
20:30 <opendevreview> Merged openstack/nova master: virt: Add ephemeral encryption flag  https://review.opendev.org/c/openstack/nova/+/760455
20:38 <opendevreview> sean mooney proposed openstack/nova master: [WIP] Add VDPA support for suspend and livemigrate  https://review.opendev.org/c/openstack/nova/+/853704
20:39 <sean-k-mooney> ok, I'll finish the docs and release notes on ^ tomorrow, but basically that series is code complete and ready for review
20:39 <sean-k-mooney> I'm going to call it a night there and pick it up again tomorrow
20:44 <opendevreview> Jay Faulkner proposed openstack/nova stable/wallaby: Ignore plug_vifs on the ironic driver  https://review.opendev.org/c/openstack/nova/+/821349
22:08 *** dasm is now known as dasm|off
