Wednesday, 2022-06-15

opendevreviewMerged openstack/nova stable/ussuri: Reproduce bug 1953359  https://review.opendev.org/c/openstack/nova/+/82204703:29
opendevreviewMerged openstack/nova stable/ussuri: Extend the reproducer for 1953359 and 1952915  https://review.opendev.org/c/openstack/nova/+/82204803:29
opendevreviewMerged openstack/nova stable/ussuri: [rt] Apply migration context for incoming migrations  https://review.opendev.org/c/openstack/nova/+/82205004:28
gibisean-k-mooney: about hw:bfv=True. I don't think putting everything into flavor is a good idea. We already have the flavor explosion issue 06:24
gibialso in analogous for hw:bfv_type somebody could start asking for hw:port_vnic_type=direct06:25
sean-k-mooney[m]they could but they have wanted to do that for years too06:26
gibiand we said no :)06:26
sean-k-mooney[m]yep im not sure the flavor explosion issue is reall still an issue06:27
sean-k-mooney[m]i think operators are just used to there being many06:27
gibiwe still heard the term last week06:27
sean-k-mooney[m]right but they just accept it and create lots of flavors06:27
gibiOK, so if we say no for a long time to hw:bfv=True then they will accept that too :) and stop asking for more orchestrations 06:28
sean-k-mooney[m]like some of the telco provideders litrally create seperate flavors for every application in a multi app vnf automaticaly via heat06:29
sean-k-mooney[m]well the latter will never happen06:29
gibi:)06:29
sean-k-mooney[m]but the flavor may not be the right place06:29
sean-k-mooney[m]i dont thing its unreasonable to have a way to say always uses cinder volumes06:30
sean-k-mooney[m]or always to it on these set of hosts by default06:30
gibiif it is _always_ then this can be a config option06:30
sean-k-mooney[m]not it cant be06:30
sean-k-mooney[m]that would be config driven api behavior06:31
sean-k-mooney[m]which is one of the reasons we didnt like having a cinder images type06:31
gibior even the admin can define the nova instance dir to be in a place with close to 0 disk space to force it06:31
sean-k-mooney[m]well they can force it differntly via config options to day but its not somethign you can discover without trying to boot a vm06:32
sean-k-mooney[m]they are stilll raising it as a pain point so obviouls they still want a way to do this06:32
sean-k-mooney[m]i tried to adress this issue differntly in the past by proposing the idea of vm porfiles to have a semi composable flavor06:33
sean-k-mooney[m]the profiles would only have extra specs that did not change resouce usage so therefor did not change billing06:35
gibiextra spec can reqeust pinning that might be billed differently.06:35
gibibut I feel I'm too negative today. sorry06:35
sean-k-mooney[m]as a use you can request that in your image06:35
sean-k-mooney[m]so if the falvor has not explictly disabled it you can always do that06:36
gibistill dedicated cpus are billed differently in public clouds afaik06:36
sean-k-mooney[m]right but that trivial to adress06:36
sean-k-mooney[m]just put hw:cpu_policy=shared in the base flavors06:36
sean-k-mooney[m]today if you do not do that you can just put hw_cpu_policy=deicated in the image06:37
sean-k-mooney[m]i assumed you were going to bring up resouce:06:38
gibiso in my mind this is similar to network connectivity. if you need it pass a pre-created port to nova. so I would move the whole disk question out from the flavor too. If you need disk (:D) then pass either a pre-created local-disk or a pre-created volume to the boot command.06:38
sean-k-mooney[m]which is why i said extra specs that dont affect resouce useage06:38
sean-k-mooney[m]fair so we should not add delete on terminate for port in the future right :P06:39
gibihonestly. It is different what I would like to have and what we will do. I accept that06:39
gibi:P06:39
sean-k-mooney[m]hehe well we are both to busy to work on this in the near term06:40
bauzashola06:40
gibiI would remove the  whole --nic net-id if I could06:40
sean-k-mooney[m]so just brainstorming06:40
bauzasnot sure I have the whole context06:40
gibiyepp, it is just brainstorming06:40
gibibauzas: morning06:40
gibibauzas: take a seat, grab a coffee06:40
bauzasmorning folks06:40
sean-k-mooney[m]bauzas o/06:40
bauzasgibi: been there, done that :)06:40
bauzasliterrally06:41
bauzaswe're a Wednesday, I'm happy to work earlier06:41
gibiI did not drink coffee since left Berlin06:41
bauzasOMG06:41
gibiI miss the taste06:41
sean-k-mooney[m]i dont drink it on weekends which people find odd06:41
* bauzas can't imagine a day without coffee06:41
bauzasas I have kids at home, I love those breaks just after lunch with my wife06:42
bauzasI ask my children to go up to their bedrooms and we take 10 mins with a coffee :)06:42
sean-k-mooney[m]thats a nice ritual06:42
bauzasas I wasn't having a coffee, man...06:43
gibino kids, wife works non-remote so I have a lot of quite moments :D06:43
bauzashow could I break :p06:43
bauzasthat said, I also had beers06:43
gibino beer for me since Berlin too :)06:43
sean-k-mooney[m]i like to play with fryea {emma’s 8 month old puppy) when i grab a cup06:43
gibiyeah I need a cat06:44
gibiso going back to hw:bfv=True06:45
sean-k-mooney[m]bauzas: we were just chatting about the request form operator to have a way to have vms use boot form volume by default without a config option06:45
sean-k-mooney[m]i was wondering how terrible it would be to have an extra spec06:45
gibiI would disaggregate disk from flavor as I dream big06:45
sean-k-mooney[m]gibi you mean like port count06:46
gibicount?06:46
sean-k-mooney[m]well you dont ask for 5 ports in a flavor06:46
sean-k-mooney[m]ports are entirly seperate06:46
gibiyepp like that06:47
sean-k-mooney[m]so you would prefer if disk was also seperate06:47
gibiyepp06:47
sean-k-mooney[m]there sort of in a weird middel ground today06:47
sean-k-mooney[m]cause they can be seperate and in the case of flavor.ephmeral you can subdevide that into multiple disks06:48
bauzassean-k-mooney: what is the usecase ?06:48
bauzaswhy they don't want to ask for a volume or other way ?06:48
gibi$ openstack disk create --type [boot|ephemeral|swap] --backend [local-disk|volume] --size06:48
gibi$ openstack server create ... --disk <disk-uuid>06:48
sean-k-mooney[m]i belive they want to have diskless compute or low disk compute and have there tenants always use cinder storage06:49
sean-k-mooney[m]but they didint want there users to have to rememebr to do that, wasnt in the room :)06:50
sean-k-mooney[m]its the last line on the meet and greet doc06:50
sean-k-mooney[m]gibi that is an intersting idea06:50
gibior even06:50
gibi# admin06:50
gibi$ openstack disk profile create --type [boot|ephemeral|swap] --backend [local-disk|volume] --size06:50
gibi# user06:51
gibi$ openstack disk create --profile <profile-uuid>06:51
gibito have the way the admin limit the sizes and types06:51
sean-k-mooney[m]right i prefer the latter but that is getting into composable flavors06:51
gibiexactly06:51
gibiso it solves both problems :D06:51
gibi(no it does not )06:51
bauzassean-k-mooney: then maybe they should provide flavors with DISK_GB=006:52
sean-k-mooney[m]right if we did that i would prefer to do that for all resouces06:52
sean-k-mooney[m]bauzas they can yes06:52
sean-k-mooney[m]with our default policy that would for bfv usage06:52
bauzasif you create a flavor with some disk value, then it would create an ephemeral volue06:52
sean-k-mooney[m]*force06:53
sean-k-mooney[m]no not always06:53
bauzasso if you have diskless computes, don't create those flavors06:53
sean-k-mooney[m]people pushed back when i suggested we should not allow you do do boot form volume with flavor wiht disk!=006:53
sean-k-mooney[m]well what if you have a mix some diskless some with local storage06:54
sean-k-mooney[m]psi has fast nvme storage on the ci aggrate and the general aggreate is backed by ceph06:54
sean-k-mooney[m]that is why we litrally have ci.* and general.* flavors06:55
sean-k-mooney[m]as a user you are ment to rember to use qcow with ci* and raw images with general*06:55
sean-k-mooney[m]gibi to your point i do think composable flaors is proably the best approch but everyone hates thoses so meh06:56
sean-k-mooney[m]and by that somehting liek the —disk-profile you were suggesting06:57
gibiyeah. also I don't think we have workforce to make composable happen06:57
gibiso it is just a dream06:57
sean-k-mooney[m]we can hope06:58
sean-k-mooney[m]there were simpler pain points to adress frist06:59
gibisorry again for being extra negative this morning06:59
sean-k-mooney[m]im interested to see how many of the quota pain points will get addressed by unifed limits06:59
sean-k-mooney[m]gibi: its oke eventully you will underflow and be very very positive again :)07:00
bauzassorry, I think I still don't get the problem about what we miss 07:00
bauzasbut I'm of course wrong07:00
gibisean-k-mooney[m]: I'm waiting for that :D07:01
sean-k-mooney[m]bauzas a simple declaritive way to say that vms should use bfv07:01
sean-k-mooney[m]the command line is imperitive07:01
sean-k-mooney[m]i would have said images_type=cinder07:02
sean-k-mooney[m]but they expictly said not via a host level config option07:02
bauzassean-k-mooney: wdym by imperative ?07:02
sean-k-mooney[m]bauzas as a user i have to know and remember to ask for bfv07:03
jkulikjust to chime in, we also want flavors that default to boot-from-volume instead of ephemeral disk without the user having to explicitly specify bfv07:03
sean-k-mooney[m]as an operator i cant make that default by declaring it in a flaovr or image07:03
bauzasoh, because of --block-device07:03
bauzasas this is a flag, this can be missed by the user, gotcha07:03
sean-k-mooney[m]yep07:03
sean-k-mooney[m]hence hw:bfv=true in flaovr07:04
sean-k-mooney[m]meaing that this flavor would always use bfv07:04
bauzaswell, if you ask for disk=0 with no block device, you end up with ERROR, right?07:04
sean-k-mooney[m]yes07:04
gibidid we fixed that?07:04
sean-k-mooney[m]actully no07:04
bauzasso, basically, users are pushed to add a block device07:04
sean-k-mooney[m]ya so it depned on policy07:04
gibiin the past disk=0 meant no enforcement07:04
sean-k-mooney[m]correct07:05
gibiahh OK07:05
bauzashonestly, I just want to make sure disk=0 is meaningful07:05
sean-k-mooney[m]so default policy requires admin for the unbounded case07:05
bauzasif you don't ask for a local disk, then you're forced to add a volume 07:05
sean-k-mooney[m]it kind of is as an end user with just the member role you cannot boot with bfv if disk=007:06
sean-k-mooney[m]*without07:06
bauzasunless people want some OS residing fully in memory07:06
sean-k-mooney[m]right but that imperitive07:06
sean-k-mooney[m]it doesnt just work the same as any other flavor but use cinder netapp storage07:06
bauzassean-k-mooney: again, I could be wrong, but this behaviour seems correct to me07:06
bauzaswhich is, nova won't provide you disk for your instance07:07
bauzasthen you have to provide it07:07
sean-k-mooney[m]jkulik:  did you want to expand on this a little07:07
jkulikthis was the approach we fiddled with https://github.com/sapcc/nova/commit/e65287727ab5f78fbcf3f6da26a54456eaebf932 it will automatically add a boot_index=0 BDM if the flavor has a property boot_from_volume=true07:07
jkulikas we don't want to let our customers re-learn how to create VMs. it should "just work" like with ephemeral disks07:08
bauzasin the past, we said those magics behind the hood were nearly orchestration07:09
sean-k-mooney[m]ah you remembered to set delete on terminate to true too07:09
sean-k-mooney[m]jkulik:  it would be less maintance for you to do that via middleware by the way07:10
sean-k-mooney[m]right but im not sure nova should avoid all orchestration at the expense of ux07:11
sean-k-mooney[m]i generally agree we should not add any complex orchestation07:11
jkuliksean-k-mooney: can you point me to some docs on that? I can see how we could edit a request coming in, but I wouldn't expect a middleware to be able to query Nova's DB to get the flavor07:12
sean-k-mooney[m]and im sure my deffintion and yours of complex willl differ and thats ok too :)07:12
sean-k-mooney[m]jkulik well it runs in the api process so it can access the the nova.conf07:12
sean-k-mooney[m]i dont think we have example that hit the db today07:13
jkuliksounds wrong to me, tbh07:13
sean-k-mooney[m]less wrong then downstream modifications to the source code :)07:13
jkulikhaha, yes, well ... I've given up on not having those07:14
sean-k-mooney[m]i was just suggesting if you have rebase issues due to this maybe middleware would help07:14
sean-k-mooney[m]but i did not think about db access for the flavor definition07:15
jkulikand I thank you for the advice. I had never thgouth about that.07:15
sean-k-mooney[m]for what its worth the keystone middelware makes rest calls to validate the auth token so they can be complex but normally they are simple07:16
bauzassean-k-mooney: sorry was disturbed by some paperwork due to a flight cancelled07:20
bauzasthanks Europe, I'm owed 250€07:20
bauzassean-k-mooney: well, surely we can improve the UX07:21
sean-k-mooney[m]the question is how :)07:21
bauzasI just wanted to express the general thought that was 'if it implies kind of roundtrips between multiple projets and a lot of conditionals, then we should maybe discuss on the opportunity to make it a client thing'07:22
jkulikby linking to the docs in the error-message? :D07:22
sean-k-mooney[m]we used to punt on this and say use a heat template07:22
bauzasnow we have openstackcli07:22
sean-k-mooney[m]well there is a simple way to do it in the client now i think07:22
bauzasand thanks to hard efforts of stephenfin and a couple of others, we are able to have it done in the CLI07:23
sean-k-mooney[m]that does not really solve the problem that uers have to rememebr to do it07:23
jkulikmost of our users don't use openstackcli either07:23
bauzasdirectly the REST APIs, heh ?07:24
sean-k-mooney[m]jkulik do they use api directly/via a differnt client or horizon/heat07:24
jkulik$some library or yes, the REST APIs :(07:24
bauzasif so, they are powered users that can understand the need for block storage if disk=007:24
sean-k-mooney[m]ah hard core curl users :)07:24
jkulikiirc gophercloud is much used, but there was also a Java library that's in much use07:24
sean-k-mooney[m]bauzas not nessisarly07:25
jkulikright. they build their own library and it takes ages to make them update anything :D that's why we would like to be able to do it on our side07:25
sean-k-mooney[m]they could just be developers of applcition that run on openstack clouds07:25
bauzasfair enough07:25
bauzasbut again, conceptually, if we fail with a flavor of disk=0, then I think we're consistent07:26
sean-k-mooney[m]jkulik so you have that patch in production i take it07:26
sean-k-mooney[m]has it helpped07:26
bauzasthe problem would be to accept a flavor with disk=0 and magically create ephemeral storage07:26
jkulikno, we don't have it in production. we lacked too much other functionality for bfv VMs (we're still on rocky, e.g. rescue)07:26
sean-k-mooney[m]bauzas that is what we did untill around rocky issue i think07:26
bauzasyup07:27
bauzasI remember this07:27
bauzaspoint is, look at the figures of the meet-and-greet07:27
sean-k-mooney[m]jkulik so i think rescue is now there and rebuild is in flight this cycle07:27
jkulikbut it's not just disk=0, it's also that the flavor is explicitly marked for automatic bfv, right?07:27
bauzasprobably the one who requested for a bfv flag was hit by the fact he/she was running older than Rocky07:27
jkuliksean-k-mooney: yeah, looking forward to it. we're also trying to jump to xena by this year07:27
bauzasjkulik: that's what I call 'orchestration'07:28
bauzasthe 'automatic' side07:28
jkulik... and no orchestration in Nova itself07:29
sean-k-mooney[m]jkulik for context novas project scope doc declares orchstration as out of scope and we tend to define that as any addtional inter service operation that can  be done by a user and the result passed in to nova07:29
jkuliklike creating a volume from image, which Nova already does?07:30
sean-k-mooney[m]i.e. create your port/volume ahead of time and tell nova to use it07:30
bauzashttps://docs.openstack.org/nova/latest/contributor/project-scope.html#no-more-orchestration07:30
sean-k-mooney[m]jkulik yep so nova only can do that because we need to support that before cinder was split out07:30
bauzasjkulik: you're exactly pointing some orchestration we keep 07:30
bauzasbecause of the API consistency07:31
sean-k-mooney[m]cinder started as nova-volume07:31
jkulikok, if we keep it, we can still use it, right? I'm mean in the end it's just a flavor using exactly that07:31
sean-k-mooney[m]the same way ironic started as nova-baremetal07:31
sean-k-mooney[m]jkulik we likely will never remove it07:31
sean-k-mooney[m]we would have to raise our min api verion to do so07:31
sean-k-mooney[m]and we wont do that any time soon07:32
sean-k-mooney[m]if ever07:32
bauzasnever07:32
jkulik:)07:32
bauzasin particular given the lag we have 07:32
sean-k-mooney[m]bauzas well i object to never07:32
bauzassean-k-mooney: okay, let's be less pragmatic07:32
jkulikso imho, if the user comes in with a BDM of image -> volume, we create the volume. why can't we specify the flavor to default to image -> volume?07:32
bauzas"could be, eventually"07:32
sean-k-mooney[m]as i think its wrong for use to have a min version filed if that is the stance we are taking07:32
sean-k-mooney[m]should be eventurally07:33
bauzassean-k-mooney: I was there in 2015 when we drafted v207:33
bauzasv2.1 actually07:33
bauzaswe create min_version because we were considering it 07:33
bauzascreated*07:34
sean-k-mooney[m]yep and at that time we planned to increase it eventually07:34
bauzasbut given interop and other reasons, we ended up being less optimistic07:34
sean-k-mooney[m]jkulik what volume_type should be used07:34
sean-k-mooney[m]to create the volume form the image07:34
sean-k-mooney[m]should we delete it on terminate07:34
jkulikthat's already a decision Nova has to make07:34
bauzasnope07:34
jkulikthen Nova does nothing and it's the default volume type of Cinder07:35
bauzasand I don't want config-driven APIs07:35
sean-k-mooney[m]well we use the default volume type07:35
sean-k-mooney[m]which i gues is your point07:35
bauzaswe have defaults for sure07:35
bauzasbut that's for a volume07:35
jkulikI mean the feature is already there, I just want a flavor to use it07:35
sean-k-mooney[m]yep it is. we just dont have a way to enable it except via an api parmater today07:36
bauzasfor many reasons07:36
jkulikright. so my request would be to enable it via flavor parameter07:36
bauzasagain, what's the usecase if we force users to add a volume if they have a diskless flavor ?07:37
sean-k-mooney[m]to not force user to do things07:37
sean-k-mooney[m]and provide bettter ux07:37
bauzaswe have the "give me a port" thing07:37
bauzasand this isn't within a flavor, right?07:38
sean-k-mooney[m]do you mean give me an network07:38
sean-k-mooney[m]or —network on server create07:38
sean-k-mooney[m]the differnece with a port is that is never in a flaovr to begin with07:39
bauzassean-k-mooney: not if you do bandwidth-aware requests :)07:40
sean-k-mooney[m]there is nothing in the falvor for that07:42
gibiI think give me a network is when you say --nic auto07:42
sean-k-mooney[m]the bandwith request comes form the port07:42
gibiyepp bw is a disaggregated resource07:43
jkulikwhy does the flavor have to be specified diskless btw.? Couldn't we just let the operator specify whether the disk should be on ephemeral or on volume storage i.e. determine the deestination_type of the BDM?07:43
sean-k-mooney[m]no give me a network is the neutron feature that will create a network, subnet and router automaticaly to replicate nova-networks behavior07:43
gibisean-k-mooney[m]: /o\07:43
bauzasshould we ask the same to cinder then ?07:44
sean-k-mooney[m]https://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/get-me-a-network.html07:44
bauzas"give me a volume"07:44
bauzasand nova be --disk auto07:44
sean-k-mooney[m]its called volume create07:44
sean-k-mooney[m]its not the same thing07:44
sean-k-mooney[m]in the neutron case people just wanted the vm to have a port with an ip by defualt07:44
sean-k-mooney[m]but with neutron to do that you ahve to create a networ, subnet and router first07:45
bauzasjkulik: you're right, operators could propose both ephemeral storage and block one07:45
bauzaslife would be way easier if we weren't having flavors07:46
sean-k-mooney[m]i was thinking the same07:47
bauzassean-k-mooney: gibi: this reminds me the Public Cloud SIG session07:47
bauzassean-k-mooney: gibi: do you know that they stuggle having the same flavor names ?07:47
bauzasstruggle07:47
sean-k-mooney[m]disk=0 ram=0 vcpu=0 and just allow —resouce rc=ammount on server create07:47
sean-k-mooney[m]and use unified limits for quota/billing07:47
bauzasproposing a common set of flavors between all Passport operators seemed nearly impossible to achieve07:47
sean-k-mooney[m]ya we should likely not do that07:48
sean-k-mooney[m]the only way to do that is for nova to ship with default flavors07:48
bauzaseven that07:48
sean-k-mooney[m]that you can opt into07:48
sean-k-mooney[m]and never update just disable07:48
bauzasthey get rid of our default flavors07:48
bauzasOVH has strong concerns about touching their flavors, by instance.07:48
sean-k-mooney[m]we dont ship them the devstack one are not defualt in nova07:48
gibiI have no hard opinion about default flavors. I dont think this is a technical issue, it is an issue of how to create standards07:49
bauzasanyway, I'm just saying we're on a territory I don't like07:49
bauzasgibi: agreed, this isn't a problem we should solve technically07:49
bauzasgibi: we should let the community agree on a standard set, if they agree eventually07:50
sean-k-mooney[m]that or provide supprot for creating them in osc07:50
sean-k-mooney[m]or as a plugin07:50
bauzasthat's one of the reasons we wanted this unified client07:50
sean-k-mooney[m]i.e. have a repo where we can host some flavor templats for standard flaovr that opertors can contibute too07:51
bauzasnacking the fact that we could bfv auto in the client just because others don't use our client seems an alternative we quickly refused because $whataboutism07:51
sean-k-mooney[m]bauzas we already have the ablity to do bfv in the client today07:52
bauzasthen, personnally I feel the problem solved.07:52
jkulikbut not auto-bfv07:52
bauzasthen patch the client07:52
jkulikyeah, I can do that. again, doesn't help my customers ;)07:53
bauzasand patch any other fancy client07:53
jkulikI'll patch Nova instead thus. downstream07:53
sean-k-mooney[m]we have openstack server create —boot-form-volume <size> today07:53
sean-k-mooney[m]so ^ is our baseline in terms fo client just incase you were not aware bauzas07:54
bauzassean-k-mooney: thanks, appreciated07:55
bauzasI was looking at the client docs07:55
sean-k-mooney[m]its here https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server.html#server-create07:55
bauzassean-k-mooney: we can move on the etherpad and propose the CLI usage07:55
bauzasand see what people think07:55
sean-k-mooney[m]i thikn since walaby when stepehnfin pushed to get parity07:55
bauzasyay07:55
bauzasagain, see the metrics07:56
bauzasQueens and beyond07:56
sean-k-mooney[m]honestly i dont think that helps in any way07:56
sean-k-mooney[m]i was assumeing they already had that and it was not sufficent07:56
sean-k-mooney[m]but sure they may be on old clouds07:56
sean-k-mooney[m]so maybe its enough07:56
jkulikI get the approach of keeping the scope small and saying "we support osc, everything else is out of scope". But adding features like this only to osc will keep them away from a lot of customers on our side, so it might as well not be there. No hard feelings.07:56
bauzastrust me, they use very old relases07:56
* sean-k-mooney[m] responed to mail about juno usage last night07:57
bauzasjkulik: we support any client able to access our REST API07:57
sean-k-mooney[m]well technially we supprot the apis not the client07:58
bauzaswe even support clients that aren't able to negociate with the API about new microversions07:58
sean-k-mooney[m]we dont supprot any client excpet nova client from a project perspective07:58
bauzashere, we're not brainstorming about a lack in our APIs07:58
jkulikthen I don't get why you would propose to add the feature to a single client instead07:58
bauzasand again, I'm afraid bfv-auto would generate kind of a config-driven API07:58
bauzasjkulik: I'd propose other clients to do the same, to be precise07:59
sean-k-mooney[m]well that is why i suggested the falvor07:59
jkulikconfig-driven API is where the values, the API should get via request comes in from other sources e.g. flavor or config?07:59
sean-k-mooney[m]no it explcitly refer to useing nova.conf values to alter behavior07:59
jkulikah, thanks08:00
bauzaslike, delete_on_termination08:00
sean-k-mooney[m]that is not a config value is it08:00
sean-k-mooney[m]i did not think it was08:00
bauzasah my bad, bad example08:00
sean-k-mooney[m]the option i normall think of is force_configdrive08:00
sean-k-mooney[m]which is a config value08:01
bauzashonestly, I'm not opposed to add a way to describe automatic bfv, but we seriously need to consider the impacts08:01
bauzasand I personally have concerns with the solution be a flavor extraspec08:01
sean-k-mooney[m]ack08:02
sean-k-mooney[m]ya thats fair08:02
sean-k-mooney[m]i think i would want to see the last partity gaps get closed which i think is just rebuild first08:02
bauzasagreed08:02
sean-k-mooney[m]up until this cycle the delta for bfv and non bfv was too much to not have teh user be aware of which it was going to use ectra.08:03
jkulikhm ... if not flavor extraspecs where else? we also have baremetal flavors, which can't even use volume storage in our cloud08:03
bauzassean-k-mooney: that's the problem with operator feedback08:03
sean-k-mooney[m]jkulik actully they could depending on yoru release08:04
sean-k-mooney[m]ironic support boot form cinder via iscsi08:04
bauzassean-k-mooney: ideally I think we should propose a talk or some live thing about what nova can do with bfv noxw08:04
jkuliksean-k-mooney: not in our cloud. all volume-types are in some vmware datastores ... :(08:04
bauzasto let people know about what BFV is today08:04
sean-k-mooney[m]jkulik its pretty new too only like 2-3 relases i think08:05
sean-k-mooney[m]bauzas ya a general bfv talk woudl be good08:05
* bauzas wonders whether we may do this between us and ask the Foundation how to promote it08:06
sean-k-mooney[m]basically a bfv “state of the union”  talk of hay with z this is how it can now be used08:06
bauzasVancouver is too far away08:06
sean-k-mooney[m]i would wait for vancouver honestly08:06
sean-k-mooney[m]well08:06
sean-k-mooney[m]at least for zed to release08:06
bauzasafter Zed, this is fair08:07
sean-k-mooney[m]but maybe as a follow up to the project highlights08:07
bauzasI should run a project update thing anyway08:07
sean-k-mooney[m]ok brb08:07
opendevreviewBalazs Gibizer proposed openstack/nova master: Optimize numa_fit_instance_to_host  https://review.opendev.org/c/openstack/nova/+/84589608:17
gibisean-k-mooney: ^^ let me know what you think. I might missed some side effect in the algo that makes this optimization actually breaking change08:17
sean-k-mooneygibi: have not read it yet but when i see cache are you assume all instance numa ndoes will have the same requirements08:20
sean-k-mooneyoh you are caching the pairs08:20
sean-k-mooneyok that should work08:20
sean-k-mooneyill read the rest08:20
gibisean-k-mooney: I cache the result of the _numa_fit_instance_cell for host_cell instance_cell pairs08:20
sean-k-mooneyyep pairs are fine to cache08:21
sean-k-mooneythat break will jsut break the inner loop right08:22
* sean-k-mooney never rememebr how break works with nested loops08:22
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/845896/1/nova/virt/hardware.py#237608:22
gibiyepp break jumps out from the inner loop08:23
sean-k-mooneythen i dont see why this would not work08:23
gibiif you look at the except: branch it uses there too with the same effect08:23
sean-k-mooneyand its just storing pairs of ints so the memory overhead will be trivial08:23
gibiyepp it is len(host_cells) * len(instance_cells) * 2 integeres08:24
gibithat is a lot lest than then n!/(n-k)! number of fit calls the algo made08:24
sean-k-mooneyso like with comptueing factorial this shoudl reduce to the cell checks to linear08:25
sean-k-mooneyform O(n^2)08:25
gibiit is actuall O(n!/(n-k)!)08:25
gibiI mean it was08:25
sean-k-mooneyya just realised that it much large then n^208:25
sean-k-mooneyits n choose m08:26
gibiyeah we generate k long permutations from N items08:26
sean-k-mooneyso even with that optimiasation do we still want to change the default on mater08:26
gibisean-k-mooney: yepp, probably08:27
sean-k-mooneywith the sperad default it found a candiate on the first iteration08:27
sean-k-mooneydid you try thet repoducer with your chage08:27
sean-k-mooneyif not i was going to try that now08:27
gibithe cache change helps with the forst case, the default change only helps in some case08:27
gibithe reproduce took 6 mins for me without the cache, with the cache it took 3 seconds08:28
gibis/forts/worts/08:28
sean-k-mooneynice08:28
sean-k-mooneyso still double spreaing but much much more reasonable08:29
sean-k-mooneygibi: so other then a releasenote im not sure what elese to add to that patch.08:31
sean-k-mooneyim not sure we want to unit test using a cache08:31
sean-k-mooneyand fucntional test should not be able to tell the differnce08:31
gibiyepp I hope functional will tell me that everything works as is 08:31
gibiI can add a unit test that shows that the cache works by mocking and counting the _numa_fit_instance_cell inner calls08:32
sean-k-mooneythey should i don tthink you have change the outcome in anyway08:32
sean-k-mooneygibi i woudl only do that if you pull out that code into its own fucntion08:33
sean-k-mooneythat code bing the inner loop 08:33
sean-k-mooneyi think this is too much of an implemation detail to asser for the top level numa_fit_instance_to_host fucntion as it is08:33
opendevreviewribaudr proposed openstack/nova master: Allow unshelve to a specific host (Compute API part)  https://review.opendev.org/c/openstack/nova/+/83150708:34
opendevreviewribaudr proposed openstack/nova master: Allow unshelve to a specific host (REST API part)  https://review.opendev.org/c/openstack/nova/+/84589708:34
sean-k-mooneyespically since its a function scoped cache08:34
gibiyeah I dropped the unit test idea. after played around it a bit it seems very artifical09:33
gibiI have to fix up some unit test case and I will add a reno and follow your inline suggestion09:33
gibithe functional tests passed so I think it is correct09:33
sean-k-mooneyi would be interested to triger a white box run against it but we dont really have much numa testing with whitebox upstream09:39
sean-k-mooneyi think its a pretty safe optimisation09:39
sean-k-mooneyif you like i can try and find some time to try it on real hardware or simulated multi numa hardware at least09:39
sean-k-mooneybut if our func test pass i do trust them for most numa stff at this point09:40
*** tosky_ is now known as tosky09:42
gibiI also feel pretty safe with this now09:43
gibiI don't think we should spen much time manual testing it. I will ask the bug author to test it for us09:44
gibithey have big hardware appareantly09:45
sean-k-mooneyish 09:53
sean-k-mooneyits not really that big given its a singel socket system and amd launched that chip about 3 years ago09:54
sean-k-mooneythere are more of them out in the wild then you might otherwise expect vexhost has a similar sku them but they do not expose all the numa nodes09:54
sean-k-mooneyits configurable in the bios09:55
sean-k-mooneybut sure09:55
sean-k-mooneylets get the op to test and provide feedback09:55
gibiack10:07
opendevreviewBalazs Gibizer proposed openstack/nova master: Optimize numa_fit_instance_to_host  https://review.opendev.org/c/openstack/nova/+/84589610:40
gibinow with reno ^^10:40
opendevreviewBalazs Gibizer proposed openstack/nova stable/wallaby: Fix eventlet.tpool import  https://review.opendev.org/c/openstack/nova/+/84583810:43
gibibauzas: fyi, reported a new gate failure https://bugs.launchpad.net/nova/+bug/1978817 I will push a fix soo11:02
gibiit is not a blocker11:02
gibiit only fails on slooow nodes11:02
sean-k-mooneygibi other then formating it does not look like much changed in https://review.opendev.org/c/openstack/nova/+/845896/2/nova/tests/unit/virt/test_hardware.py11:13
gibisean-k-mooney: I needed to add id field for instance cells11:13
sean-k-mooneyoh just saw that11:13
gibi(and yes I reformatted it :D)11:13
sean-k-mooneyok ide make sense i guess11:13
sean-k-mooneywe need to modle the guest numa node that the object represents11:14
sean-k-mooneyand in real code it woudl always be set11:14
sean-k-mooneyso you are just fixing the test data11:14
opendevreviewBalazs Gibizer proposed openstack/nova master: Make test_wait_for_instance_event_* test time independent  https://review.opendev.org/c/openstack/nova/+/84592211:16
gibisean-k-mooney: yepp it is just adding a more realistic test data11:16
sean-k-mooneyok im +2 on that patch assuming zuul is happy this time11:17
sean-k-mooneystephenfin: you proably know that code the best out of the remaining cores would you mind looking at that if you have time11:17
sean-k-mooneyif not artom your +1 and bauzas review would be nice11:18
gibiyepp. I also asked for feedback from the bug reporter11:18
artomsean-k-mooney, which one?11:47
bauzasgibi: ack for the gate failure11:52
gibibauzas: since then I pushed the fix https://review.opendev.org/c/openstack/nova/+/84592211:52
bauzasjust saw it11:53
gibiartom: I think sean-k-mooney refered to the numa scheduling perf optimization fix https://review.opendev.org/c/openstack/nova/+/84589611:53
bauzasgibi: 1.23 secs, heh11:53
bauzasany reason why this value ?11:53
gibi123 :)11:54
gibijust a sequence of ints11:54
*** tosky_ is now known as tosky11:55
bauzascould be 0.123 :p12:00
gibilost opportunity12:02
gibibauzas: finally I understood your comment. see the response in https://review.opendev.org/c/openstack/nova/+/845922/1#message-84ed99281b79342e03258fde377820861086856312:37
bauzasgibi: sorry yeah, I understood this was a returned value of a mock12:38
gibiso there is no delay in the test12:38
bauzasyeah I was wrong when commenting12:39
gibino worries12:40
opendevreviewMerged openstack/nova stable/yoga: Simulate bug 1969496  https://review.opendev.org/c/openstack/nova/+/84083213:06
*** dasm|off is now known as dasm13:08
*** tosky__ is now known as tosky13:17
*** tosky is now known as Guest216913:29
*** tosky__ is now known as tosky13:29
*** tosky__ is now known as tosky13:58
opendevreviewnorman shen proposed openstack/nova master: Clear connection info if vol disconnected  https://review.opendev.org/c/openstack/nova/+/84599514:07
sean-k-mooneyartom: yes i was refering to https://review.opendev.org/c/openstack/nova/+/84589614:07
opendevreviewMerged openstack/nova stable/train: Only allow one scheduler service in tests  https://review.opendev.org/c/openstack/nova/+/75136214:10
opendevreviewMerged openstack/nova stable/train: func tests: move _run_periodics() into base class  https://review.opendev.org/c/openstack/nova/+/75136314:10
opendevreviewMerged openstack/nova stable/train: Helper to start computes with different HostInfos  https://review.opendev.org/c/openstack/nova/+/75136414:10
opendevreviewMerged openstack/nova stable/train: tests: Add reproducer for bug #1879878  https://review.opendev.org/c/openstack/nova/+/75136514:10
opendevreviewMerged openstack/nova stable/train: Add generic reproducer for bug #1879878  https://review.opendev.org/c/openstack/nova/+/75136614:10
gibisean-k-mooney: one hickup in the pci-tracking work: update_provider_tree is the place to report resources BUT [pci]device_spec and the whole pci tracking lives outside of the virt driver. So either we need to report PCI devs to placement ouside of update_provider_tree along with the pci tracker OR  do the reporting in update_provider tree and duplicate [pci]device_spec handling in the libvirt driver14:19
gibiI can start building up things in the libvirt driver and see if the disconnect with the pci manager causes any issues14:20
sean-k-mooneynot realy14:20
sean-k-mooneywe already hae access tot he pci tracker form in that fuction14:20
sean-k-mooneywe just neede to read the info form it14:20
gibiwe see the pci tracker in update_provider_tree?14:21
sean-k-mooneyits aviable via the resouce tracker14:21
gibido we have access to the resource tracker from the libvirt driver?14:22
gibiI dont see it14:22
sean-k-mooneywe should14:22
gibiso the tracker calls the virt driver to update the resource inventories but the tracker is not passed down to the driver14:23
gibias far as I see14:24
sean-k-mooneyhum14:25
sean-k-mooneycan we just pass it in14:25
sean-k-mooneyi dont like the idea of duplicating the logic in the driver14:25
sean-k-mooneythese are the two main places we care about correct https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L9244=14:27
gibihttps://github.com/openstack/nova/blob/93a65f06df67ce39d65827692150c78013c7f6d5/nova/virt/libvirt/driver.py#L853014:28
gibiwe need to report the inventories in update_provider_tree14:28
sean-k-mooneysorry wrong link https://github.com/openstack/nova/blob/93a65f06df67ce39d65827692150c78013c7f6d5/nova/compute/resource_tracker.py#L1221-L1232=14:29
gibisean-k-mooney: so you suggest to pass down the pci_tracker to the update_provider_tree?14:32
sean-k-mooneyyep14:35
sean-k-mooneycan catch the not implemted error and invoke without it if the driver does nto support it for one release14:35
sean-k-mooneywith a fixme14:35
sean-k-mooneyso that we dont rbeak out of tree drivers14:35
sean-k-mooneyfor pci in placment we only are adding supprot for libvirt for now14:35
opendevreviewAlexey Stupnikov proposed openstack/nova stable/victoria: Test aborting queued live migration  https://review.opendev.org/c/openstack/nova/+/84574814:36
opendevreviewAlexey Stupnikov proposed openstack/nova stable/victoria: Add functional tests to reproduce bug #1960412  https://review.opendev.org/c/openstack/nova/+/84575314:36
opendevreviewAlexey Stupnikov proposed openstack/nova stable/victoria: Clean up when queued live migration aborted  https://review.opendev.org/c/openstack/nova/+/84575414:36
sean-k-mooneygibi: do you think that passing it down or a subset of data form it woudl be problematic14:38
gibiit is different than the other resources like CPU of VGPU14:39
sean-k-mooneygibi: the stat pools woudl be all we need14:39
sean-k-mooneywell the other way to do it si to import the code form the pci module14:39
sean-k-mooneyand filter the list of pci device with it14:40
sean-k-mooneyand thel look at all the instnace and there pci usage14:40
gibiwe are moving logic around a virt driver boundary so I'm affraid14:40
sean-k-mooneyactully for placment14:40
sean-k-mooneywe only need to know the set of pci devices and the whitelist14:41
sean-k-mooneybut i would be unfortable with using the current set form libvirt as if its passed into a guest i dont know if that will still be in the data form libvirt14:41
sean-k-mooneyits also more work to recompute it  so i would be much more comfortabel geting the data form the pci tracker14:42
fricklernova is running nova-live-migration-ceph and tempest-integrated-compute-centos-8-stream as non-voting jobs in gate, can someone have a look and clean that up? https://review.opendev.org/84083314:42
frickler(just noticed because both are failing and I was worried about "my" patch queued behind it)14:42
gibisean-k-mooney: I will look into passing the pci tracker down to the virt driver but I feel bad about it architecturally14:42
sean-k-mooneywell you have anohter option14:43
sean-k-mooneywich is to implemnt this part of the update in the compute manager14:43
gibithat would duplicate the reshape logic :/14:44
gibiideally I would like to have all the similar thing is the same place14:44
gibiso if GPU is handled in the virt driver level then PCI should be too14:44
gibibut I'm not sure why we have a pci tracker in the manager level instead of in the virt level14:45
gibiI mean I know a lot of reasons but I probably don't know all of them14:45
gibiso moving the pci tracker down to the virt level is scary14:45
gibijust passing it down will create coupling that is scary too14:45
gibi<can I rewrite nova, please?>14:46
sean-k-mooneywell vGPU shoudl not be where it si now14:47
sean-k-mooneyand the pci module is ment to be shared across virt driver which is why its where it is14:47
gibiso should there be a mdev_tracker in the resource tracker? 14:49
gibiand also a cpu/memory/disk tracker?14:50
gibithen we would not need the update_provider_tree to run in the virt level14:50
gibiit could run on the compute manager level14:50
gibibut for some reasons we introduced update_provider tree down in the virt level14:50
sean-k-mooneyi wanted to track mdevs in the pci tracker or in the resouces table14:50
sean-k-mooneythat we use for pmem14:50
sean-k-mooneywe have multipel resouce tracker in nova already14:50
sean-k-mooneycpus and memory are traced in the hsot numa toplogy blob vai the hardware.py module14:51
sean-k-mooneypmem uses the resouces table14:51
sean-k-mooneypci has its own tracker14:51
sean-k-mooneyand medev use the libvirt xml domain files14:51
sean-k-mooneywhich is proably the worst of them14:51
gibiso when numa (if ever) will be in placement then the cpu inventory reporting in update_provider_tree need to be moved also to the compute manager level?14:52
gibias most of the numa tracking is in the resource tracker not in the virt level14:52
sean-k-mooneyprbably14:52
gibiso this shows that the concept of update_provider_tree is wrong14:53
sean-k-mooneythe resouce tracker is were most of the tackign happens14:53
gibiit cannot really update anything14:53
sean-k-mooneywell we can compute it14:53
sean-k-mooneythe virt driver has the list of cores14:53
sean-k-mooneyand the list of instnaces14:53
sean-k-mooneyactully its simpelr then that14:53
sean-k-mooneyfor cpus we jsut need cpu_share_set and cpu_dedicate_set14:54
sean-k-mooneyto define the capstiy14:54
sean-k-mooneyand all the traits are provided by the virt dirver14:54
sean-k-mooneythe same is technially true for pci devices14:54
gibiyepp14:54
sean-k-mooneyso you can just import the pci module14:54
sean-k-mooneyand pass it the set of hypervior devices14:55
sean-k-mooneyand ask it for the set of pools14:55
sean-k-mooneythen use the pools to do the update14:55
sean-k-mooneybut that is duplciationg the data we already have14:55
sean-k-mooneyso it just comes down to efficnecy14:55
gibiin my mind it comes down to coupling today the virt driver does not couple to the pci tracker at all but tomorrow it will14:56
gibiit increase complexity14:56
sean-k-mooneyright so really the reosuce tracker and pci tracker are ment to be the source of truth for the resouce that are avlaibel14:56
sean-k-mooneyand the virt driver just provide the raw resoucs14:56
sean-k-mooneybtu that is not how it works today14:57
sean-k-mooneywe have the virt driver direclty updateign the tree14:57
gibiyepp14:57
sean-k-mooneyso that is backwards architularly but we did it because how that tree will like is virt dirver depentent because ironic14:57
sean-k-mooneywell ironic vmware ectra they look differnt then libvirt or hyperv14:58
sean-k-mooneye.g. clsuterd vs 1:1 drivers14:58
gibihm it is differnt per driver yes14:58
gibiso that is baad abstraction14:59
sean-k-mooneycurrently yes14:59
sean-k-mooneyif you jsut want to resue the code form the pci module that is fine14:59
gibiwe have the generic nova scheduler code that depends on the tree but the tree if virt driver dependent14:59
sean-k-mooneyi dont really think we shoudl port it14:59
gibiI will play around more before I decide14:59
sean-k-mooneyack14:59
gibithanks for talking to me about it15:00
sean-k-mooneygibi: i will be happy to review working code15:00
gibisorry If I was sooo negative15:00
sean-k-mooneywhat ever form that takes15:00
gibishit meeting :/15:00
gibiI almost made progress today :D15:00
opendevreviewMerged openstack/nova stable/yoga: Allow claiming PCI PF if child VF is unavailable  https://review.opendev.org/c/openstack/nova/+/84083315:06
*** xek__ is now known as xek15:34
dansmithgibi: sean-k-mooney bauzas: glance is planning to make a change to how image locations are updated (for the tight integration with ceph)15:53
dansmithright now we use the user's token and don't have a "talk to glance with this service account" sort of setup like we do for neutron15:54
dansmithwe'd need to create images with their token, but then update the location with the service account15:54
dansmithwhich is similar I think to things we do for/with neutron15:54
dansmithbut the spec in glance is very close to merging and I don't think anyone from nova has looked at it or grokked it from a nova perspective (other than me)15:55
dansmithanyone have any concerns over that or want to review it first?15:55
bauzasdansmith: sorry, triage meeting atm15:56
* bauzas looks above15:56
dansmithbauzas: yep, np, reply when you can15:56
bauzasdansmith: pass me the spec15:57
dansmithhttps://review.opendev.org/c/openstack/glance-specs/+/840882/15/specs/zed/approved/glance/new-location-info-apis.rst15:57
bauzasdansmith: from what you described, the usecase seems legit15:58
dansmithbauzas: I think so to and it matches what we do for neutron at least, but it's a bit of a departure from how we talk to glance today (which is always with the user's token) so I just wanted to make sure we're all in agreement15:59
bauzasdansmith: I'm more concerned by who would be in charge of doing the necessary bits in nova16:00
sean-k-mooneyso we kidn of have this precedent for nuton because16:00
sean-k-mooneywe have to update the host-id field16:00
bauzasdansmith: I don't want us to lag with old behaviour because of lack of resources16:00
sean-k-mooneyfor glance dont we already need admin for reading the multi sotre backend locations16:00
bauzasbut if the spec owner signs off for doing the nova bits, I'm cool16:00
sean-k-mooneythis is one of the things that would be good to do with the service role in the futre16:00
whoami-rajatbauzas, i might end up doing it since I'm the author of the glance spec but if not will find someone who will be willing to do it16:01
dansmithsean-k-mooney: no, we use the internal api endpoint with the user's token16:01
sean-k-mooneyah ok16:01
bauzaswhoami-rajat: cool, I understand16:01
dansmithbauzas: yeah, I'm also curious about that, since it won't be super trivial given that it's buried pretty deep16:01
sean-k-mooneyso we cant depend on having the admin token today correct16:01
whoami-rajatsince I'm already in charge of the cinder changes, nova changes should not be too hard or different16:01
bauzasdansmith: you get my point16:01
bauzasdansmith: I'm afraid of this being not trivial and requiring some hard work, including testing16:02
dansmithsean-k-mooney: right, this is to solve that problem, and let nova present actual credentials to do that thing, instead of letting unprivileged people do it, but only through the internal api endpoint16:02
dansmithsean-k-mooney: basically real auth instead of host-based auth16:02
dansmithbauzas: yeah, I don't think it'll be super hard since we only do this in a couple places, but it's deep, so not trivial16:02
bauzaswhoami-rajat: before saying "not too hard", it's quite self-contained 16:03
bauzasah, burned by dansmith16:03
bauzasbut there be dragons16:03
bauzasalso16:03
dansmithI really don't think there will much burning or dragons, but I haven't gone digging myself to see exactly what would be involved, so I'm hedging :)16:04
bauzasI guess you'll stick with the existing auth'd roundrobins for a couple of releases ?16:04
bauzasrolling upgrades is a bit of a thing16:04
dansmithbauzas: I think what we do is: if the credentials are configured, use the new way, if not, assume the old way is still active16:04
bauzasall good then16:04
dansmithbecause some people could still be running glance from folsom and be perfectly happy16:05
dansmithso we don't want to introduce a hard requirement.. we could force a move later, but I think it has to be a decent window of overlap16:05
bauzasI'll chime into the review, thanks for the heads up16:05
whoami-rajatFolsom??? but as dansmith said, we will have backward compatibility16:05
dansmithwhoami-rajat: sarcasm so extreme to make it obvious :P16:05
bauzaswhoami-rajat: as someone who played with Folsom, glance was perfectly good at that time16:06
dansmithhaha16:06
dansmithhonestly, openstack has been downhill since folsom anyway :P16:06
whoami-rajat:D16:06
whoami-rajatwow, glance has been stable forever!16:06
bauzasthings were easier honestly16:06
bauzasnova-net, glance and nova-volume were easy to manage16:06
bauzasbut I'm diverting16:07
dansmithokay so to circle back,16:07
dansmithare there any concerns with the general approach that should hold up glance merging the spec and working on this new interface in zed so that we can start converting nova and cinder in Aardvark?16:08
bauzasI saw the first +216:08
bauzasI'll review the spec now16:08
bauzasbut from a 10.000ft level, no I think we're on a safe bet16:09
dansmithbauzas: ack, could you look at it today before you go? if so I can throw my +2 on there16:10
bauzasdansmith: on it 16:10
dansmithbauzas: thanks16:10
whoami-rajatthanks dansmith and bauzas for the discussion16:10
dansmith++16:10
gibiI read back I have no objection16:27
whoami-rajatthanks gibi 16:28
bauzasI reviewed the patch16:32
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: fake: Ensure need_legacy_block_device_info returns False  https://review.opendev.org/c/openstack/nova/+/84395816:32
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: tests: work around malformed serial XML  https://review.opendev.org/c/openstack/nova/+/84474316:32
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: functional: Use tempdir for CONF.instances_path  https://review.opendev.org/c/openstack/nova/+/84475016:32
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: Add a regression test for bug 1939545  https://review.opendev.org/c/openstack/nova/+/84395916:32
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: compute: Ensure updates to bdms during pre_live_migration are saved  https://review.opendev.org/c/openstack/nova/+/84396016:32
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: fup: Make connection_info returned by CinderFixture unique per attachment  https://review.opendev.org/c/openstack/nova/+/84474416:33
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: func: Add live migration rollback volume attachment tests  https://review.opendev.org/c/openstack/nova/+/84474516:33
opendevreviewArtom Lifshitz proposed openstack/nova stable/train: fup: Assert state of connection_info during LM rollback in func tests  https://review.opendev.org/c/openstack/nova/+/84474616:33
bauzasmy only concern which isn't blocking is about making sure we provide a smooth upgrade path for our operators 16:33
bauzasasking them to change their config before upgrading is a pain16:33
bauzasso, please, don't make it mandatory in one cycle16:34
bauzasdansmith: ^ this reminds me the tick-tock cadence16:34
dansmithbauzas: yeah absolutely. like I said, I think we need a pretty decent window16:34
bauzasif all of this becomes Aa16:34
dansmithdefinitely longer than even a single tick-tock16:35
bauzasthen we can't remove the old client usage in nova in Bb16:35
dansmithfor sure16:35
bauzasanyway, this is a nova trhing16:36
bauzasno need to paper it out in the glance spec16:36
bauzasbut just sayin', you could keep the old API for a bit16:36
bauzaswhoami-rajat: ^16:36
whoami-rajatbauzas, yep, that's mostly the plan to keep a window, also until the service-to-service interaction becomes available, this all will be a placeholder and won't be functionally active i.e. old path will be used16:41
bauzaswhoami-rajat: how do you plan to identify all the nova calls that modify the image location ?16:42
whoami-rajatbauzas, we've the glance code in nova that calls the glanceclient, eg: for add location https://github.com/openstack/nova/blob/93a65f06df67ce39d65827692150c78013c7f6d5/nova/image/glance.py#L55816:51
whoami-rajats/code/file16:51
whoami-rajatall the places calling glance should be calling this file but not sure if that's what you mean16:52
gibisean-k-mooney: one more info for the todays pci-tracking work PciDeviceStats is not good for me as the pool count is 0 if the devices are consumed, but I still need to create inventory for consumed devices too. So I need to build the inventory based on the PciDevice objects alone and match them to DeviceSpec one by one to get the metadata (resource_class, and traits from the config)17:20
sean-k-mooneyok so you can either get the rows directly from the tracker or build them i guess17:34
sean-k-mooneyi tought the stats also had the capastity but i guess not17:34
sean-k-mooneyah yes 17:36
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/objects/pci_device_pool.py#L55=17:37
gibiI can access count via stats.pools directly but that is only meaningful if there is no pci allocation at all on the compute17:38
gibiand yes I saw that the count is not visible via the object interface17:38
sean-k-mooneyright i tought we also had a total in the stats17:42
sean-k-mooneythat did not change in addtion to the count of free devices17:42
sean-k-mooneywe do not17:42
gibisean-k-mooney: quick question, does nova alway have a PciDevice object for parent devices? E.g. if only the VF is whitelisted nova still create a PciDevice for the parent is it so?17:50
gibinever mind I will track it down tomoroow17:52
gibio/17:52
sean-k-mooneyno i dont think we do17:58
sean-k-mooneywe could build oen but i dont know if we do by default17:58
sean-k-mooneyin the pci tracker today17:58
sean-k-mooneyif you whitelist a vf we do not create  a row for the vf in the db17:59
melwittelodilles: I wanted to get your opinion on backporting a refactor that skips a release https://review.opendev.org/c/openstack/nova/+/84475018:24
elodillesmelwitt: hmm. is there any reason to skip ussuri? O.o18:28
sean-k-mooneyim not aware of one18:29
melwittelodilles: not that I know of. we also don't know why the patch helps or why it's needed to make CI pass. have you ever heard of something like that?18:30
sean-k-mooneyunless it merged in ussuri when it was master18:30
melwittI looked at it and nothing rang a bell why it would help ¯\_(ツ)_/¯ 18:30
sean-k-mooneyit merged in victoria18:31
sean-k-mooneyhttps://github.com/openstack/nova/commit/980711f3d3f2db9dfd792a8c6fd15cb45a85f61518:31
elodillesi guess it was added in train to avoid some kind of conflict maybe? (and i guess it was not needed in ussuri)18:31
sean-k-mooneyi would guess so too18:32
sean-k-mooneyperhaps it was merged in in ussurit into one of the other commtis18:32
sean-k-mooneyor artom just forgot18:32
melwittthe commit message says CI was failing with "No such file or directory: '/home/zuul/src/opendev.org/openstack/nova/instances/tmpif84g_jd'" i.e. looking for a tempdir and it doesn't exist. but it's weird it wasn't needed in ussuri18:33
elodillesanyway, i'd rather see that either removed from the patch series, or backported to ussuri as well (if that is really needed)18:33
melwitt++18:34
melwittthanks18:35
sean-k-mooneyis it because of python 2?18:36
sean-k-mooneyprobably not actully18:36
melwitthm.. I didn't think of that. I'm looking at it a bit more to see if I can find why ussuri doesn't seem to need it. I don't like mystery patches 😆 18:37
sean-k-mooneyi think it does18:38
sean-k-mooneyif im reading this right18:38
sean-k-mooneythis is fixint an intermitent failure18:38
sean-k-mooneyNOTE(artom) This is needed to prevent the live migration test added18:38
sean-k-mooneyin the subsequent patch from failing in the gate18:38
sean-k-mooneyso either ther is shared state18:38
artomYeah, I'm confused too18:38
sean-k-mooneyor something else that causes the failure18:38
artomI initially thought it was a ussuri patch originally18:39
sean-k-mooneyand if stephen fixed that in victoria18:39
sean-k-mooneyit proably affect ussuri18:39
artomAnd only when melwitt noticed that it was in victoria and not ussuri did I realize something was weird18:40
sean-k-mooneyi think stephen orginally just did it as part of generic test cleanup when improving the fucntional test18:40
sean-k-mooneyto ensure that the instance path was always unique18:40
sean-k-mooneyso i think it should be fine too backport 18:41
sean-k-mooneyartom: i assume you were seeing failure in the ci as the not suggest without this18:42
artomYeah18:42
artomYou'll see it in the Zuul run on PS1 of https://review.opendev.org/c/openstack/nova/+/844745/118:43
artommelwitt, I mean I'm with you, I don't like mystery patches18:44
artomThe weird thing is that I couldn't reproduce the CI failure locally on stable/train18:44
artomI suppose it's worth a try to run CI on the func test in the gate without the instance_path fixture18:45
melwittI'm checking in opensearch to see if there's anything interesting18:45
melwittI mean, I'm ok with backporting the patch simply on the basis that the change it makes is the right thing to do, I was just not so comfortable with skipping ussuri18:46
sean-k-mooneysame18:48
artomI mean if proposing it to ussuri is the quickest way forward, I'mma do it18:48
sean-k-mooneyi jus t looked at the func test and i dont see anything obvious that is shareing the instance_path config option18:48
sean-k-mooneyoh18:48
sean-k-mooneyactully18:49
sean-k-mooneycould this be due to how you are starting the compute service18:49
artomMaybe? And do actually, the backport of the _start_computes() helper just landed, so I rebased on top of it18:50
artomAnd that's straight from ussuri, so maybe with that helper in place, I no longer need the fixture?18:50
* artom tries?18:50
sean-k-mooneyno stephs orginal patch that had the temp dir fix was based on https://review.opendev.org/c/openstack/nova/+/746943/618:50
sean-k-mooneywhich added the start_compute helper18:51
sean-k-mooneyinstead of start_computes18:51
sean-k-mooneyartom: you readded the start_computes usage when backporting18:52
sean-k-mooneyform victoria to ussuri18:52
artomI initially re-implemented start_compute(), then just rebased on top of the start_computes() with an s18:52
artomsean-k-mooney, oh, in U, yeah, I changed it to use start_computeS18:52
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/843948/3/nova/tests/functional/regressions/test_bug_1939545.py#5818:53
sean-k-mooneyvs https://review.opendev.org/c/openstack/nova/+/843951/3/nova/tests/functional/regressions/test_bug_1939545.py#6018:53
sean-k-mooneyso i wonder if that is somehow related18:53
artomWell, the individual start_compute didn't exist in ussuri18:53
sean-k-mooneysure just wondering if there is a delta there18:54
sean-k-mooneyits one of the thigs you had to manually fix18:54
sean-k-mooneyhttps://github.com/openstack/nova/blob/stable/victoria/nova/tests/functional/libvirt/base.py#L40=18:56
sean-k-mooneyso on victoria you get the temp path set in  base.ServersTestBase,18:57
artomsean-k-mooney, the *libvirt* base18:58
sean-k-mooneyyes18:59
artomOh, which the regression test inherits from18:59
sean-k-mooneyyes18:59
artomOh, and it's not there in ussuri18:59
sean-k-mooneycorrect 19:00
artomBut... https://github.com/openstack/nova/commit/980711f3d3f2db9dfd792a8c6fd15cb45a85f61519:00
artomThat's the git blame19:00
artomWait19:00
sean-k-mooney that on victoria19:00
sean-k-mooneyits what you packported19:00
artomNo, nothing's new here - the instances_path fixture is in V, not in U, and I cherry-picked it to T19:01
sean-k-mooneyyep19:01
artomThe mystery is - why is the test failure that I cherry-picked it for not happening in U19:01
sean-k-mooneyso you just missed it on U19:01
artomI know :)19:01
sean-k-mooneyam either the test that it was confilciitng with on V is not in u19:01
artomThe mystery is - why did we not apparently not need it in U19:01
sean-k-mooneyor it will fail if we recheck enough19:01
sean-k-mooneyyou said you did not need it locally right19:02
artomNope19:02
artomAt least, not with a couple of runs19:02
sean-k-mooneyso its proably timing/test order related19:02
artomelodilles, is your -1 on https://review.opendev.org/c/openstack/nova/+/843678 blocking?19:03
artomAs in, do I need to respin the entire stack?19:03
sean-k-mooneywell we have to at least wait for them to merge on the other branches19:03
sean-k-mooneybut i would personally respin U and T19:03
sean-k-mooneyjust adding that patch19:04
artomsean-k-mooney, T already has it19:04
artomWill add to U19:04
sean-k-mooneyright but the commits will change19:04
sean-k-mooneyfor the cherry picked form lines19:04
melwittelodilles: dang you have eagle eyes19:05
melwittI didn't notice it 😑 19:05
elodillesartom melwitt: i was just curious why the nova-tox-validate-backport job failed :D19:06
melwittoh argh, the n-v caught me19:07
elodillesand since the hash was matching... i figured out something else should be the problem :)19:07
sean-k-mooneydid it19:07
sean-k-mooneyit looks gren to me19:07
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/844750/219:07
melwittsean-k-mooney: bc it's n-v on the check queue19:07
sean-k-mooneyit would still say fail19:07
melwittthat gets me sooo often 19:07
sean-k-mooneyif it failed19:07
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/393a44fbe4ee444a8fcc29b78b969edb19:08
melwittit says failed but it's green +119:08
melwittno sorry19:08
sean-k-mooneyare we looking at diffenrt patches19:08
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/393a44fbe4ee444a8fcc29b78b969edb/log/job-output.txt#141719:08
melwittit is red fail but I didn't notice it bc it is n-v19:09
sean-k-mooneymelwitt: do you have a link19:09
melwittsean-k-mooney: https://review.opendev.org/c/openstack/nova/+/843678/319:09
elodillesmelwitt: well, we added this n-v as a feature, to spare some rechecks when the same patch on a newer branch merges19:10
melwittelodilles: yeah I know. I just keep not seeing it for some reason19:10
elodillesmelwitt: but yes, it could be misleading in some cases19:10
sean-k-mooneyah yes we were looking at differnt patches19:10
elodilles(for example in this case :))19:11
sean-k-mooneyi was looking at the tempurl patch form train19:11
elodillesartom: so anyway, that series needs some respin for multiple reasons :S19:14
elodillesartom: also I saw somewhere some interesting relation chain with a patch 2x in it (with different Patch Set number) :-o19:15
sean-k-mooneyyou also get interesting behavior if you use depend-on in the same repo19:16
melwittelodilles: I think that was a patch containing a squash that hasn't been abandoned yet (I didn't verify that yet tho)19:17
sean-k-mooneycrap that remiends me i have a backport somewhere that shoudl have 3 patches squashed but it only has 2 19:20
elodillesmelwitt: oh, i see. could be the case :S19:21
opendevreviewRico Lin proposed openstack/nova master: libvirt: Ignore LibvirtConfigObject kwargs  https://review.opendev.org/c/openstack/nova/+/83064419:23
opendevreviewRico Lin proposed openstack/nova master: libvirt: Remove unnecessary TODO  https://review.opendev.org/c/openstack/nova/+/83064519:23
opendevreviewRico Lin proposed openstack/nova master: add locked_memory extra spec and image property  https://review.opendev.org/c/openstack/nova/+/77834719:23
opendevreviewRico Lin proposed openstack/nova master: libvirt: Add vIOMMU device to guest  https://review.opendev.org/c/openstack/nova/+/83064619:23
opendevreviewRico Lin proposed openstack/nova master: Add traits for viommu model  https://review.opendev.org/c/openstack/nova/+/84450719:23
opendevreviewRico Lin proposed openstack/nova master: Add traits for viommu model  https://review.opendev.org/c/openstack/nova/+/84450719:31
ricolinsean-k-mooney: hi need your suggestion on https://review.opendev.org/c/openstack/nova/+/844507/1/nova/tests/fixtures/libvirt_data.py 19:34
ricolinas I don't have environment actually get iommu in domain XML, what do you think might be the way to go here?19:35
sean-k-mooneyyou dont have a vm with qemu installed?19:35
ricolingibi: yes19:35
sean-k-mooney well libvirt19:35
ricolinyes19:35
sean-k-mooneythis sample date can be gotten via the virsh command line19:36
ricolingibi: just update https://review.opendev.org/c/openstack/nova/+/778347 please kindly take a review again thanks :)19:36
ricolinsean-k-mooney: okay, let me check19:38
sean-k-mooneythe logic you have implement is not correct19:39
sean-k-mooneyricolin: ^ 19:39
sean-k-mooneyyou seam to be mixing up values form libvirt and the extra specs19:39
artomelodilles, the duplicate is because I messed up change IDs when rebasing and squashing, I abandonned the wrong one19:39
elodillesartom: ack, thx!19:43
ricolinsean-k-mooney: ah, so I assume this is also a wrong logic here to include `auto`? https://review.opendev.org/c/openstack/nova/+/844507/3/nova/virt/libvirt/driver.py#1210319:52
sean-k-mooney ricolin  looking20:05
sean-k-mooneythat logic might work at least partly20:05
sean-k-mooneyit depend on what is in fields.VIOMMUModel.ALL20:06
sean-k-mooneyricolin: so https://review.opendev.org/c/openstack/nova/+/830646/11/nova/objects/fields.py#611 is wrong20:07
sean-k-mooneywell20:07
sean-k-mooneyno that is the allow set for the image porperty20:08
sean-k-mooneyso https://review.opendev.org/c/openstack/nova/+/844507/3/nova/virt/libvirt/driver.py#1210320:08
sean-k-mooneyshoudl work but we wont have a auto or non trait20:08
sean-k-mooney*none20:08
ricolinack20:09
sean-k-mooneyso ya its just the test data that is wrong looking at it quickly20:10
sean-k-mooneyyou added auto and non to the test data20:11
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/844507/3/nova/tests/fixtures/libvirt_data.py#115720:11
sean-k-mooneyricolin: if you have say ubuntu with multipel qemu backend installed20:11
sean-k-mooneyubuntu installs all the backends by defualt20:12
sean-k-mooneyyou can basicaly do virsh domaincapabliities --arch aarch64 --machine virt20:12
sean-k-mooneyhttps://termbin.com/840k820:14
sean-k-mooneyricolin: actully im sorry im tired and confusing myslef20:14
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/844507/3#message-1ac3afc15e677d670697b8c2da280224ca15b516=20:15
sean-k-mooneyif you look at my review comment the iommu info is not in the domain caps output20:16
sean-k-mooneyricolin: so i think i said or intended to say you need to determin this based on teh qemu/libvirt version20:16
sean-k-mooneysince libvirt does not have an api to report it 20:16
sean-k-mooneyat least at first glance20:16
sean-k-mooneyill try and take a look at this again but you will likely need to determin the traits based on the libvirt/qemu version and perhapse check if the emulator is aviable20:17
sean-k-mooneyso you cant add an iommu section to the test data unless you update libvirt to provide that20:18
ricolinwhen you said libvirt/qemu version you mean the min version we supported right?20:22
sean-k-mooneyricolin: no20:26
sean-k-mooneythe actual version installed on the compute node20:26
ricolinsean-k-mooney: thanks for the advise, will take a deeper look tomorrow :)20:30
*** dasm is now known as dasm|off21:09

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!