Wednesday, 2019-08-14

<openstackgerrit> Takashi NATSUME proposed openstack/nova stable/queens: Fix misuse of nova.objects.base.obj_equal_prims  https://review.opendev.org/676291  00:00
<sean-k-mooney> i would probably do this slightly differently, however, e.g. not require the operator to configure aggregates, and do it like my image metadata traits changes  00:00
<openstackgerrit> Takashi NATSUME proposed openstack/nova stable/pike: Fix misuse of nova.objects.base.obj_equal_prims  https://review.opendev.org/676292  00:02
<sean-k-mooney> melwitt: i haven't reviewed the code for this, but i hope they are caching the aggregate metadata  00:02
<sean-k-mooney> the only way the prefilter could work would be to retrieve all of the nova host aggregates and their metadata, then calculate the aggregates that won't support the instance and generate the &member_of=!in:<agg1>,<agg2>,<agg3> query parameter  00:04
<melwitt> sean-k-mooney: here's the code, it's close to merging so get in on it while you can https://review.opendev.org/#/q/topic:bp/placement-req-filter-forbidden-aggregates+(status:open+OR+status:merged)  00:05
<sean-k-mooney> ya, so no caching...  00:07
<sean-k-mooney> the problem with caching is keeping the cache valid  00:07
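The query-parameter construction sean-k-mooney describes can be sketched in a few lines. This is a minimal illustration only; the function name and input shape are assumptions, not the real prefilter code, which lives in nova's scheduler request filters:

```python
def build_member_of_param(forbidden_agg_uuids):
    """Build the placement query parameter that excludes hosts in the
    given aggregates: member_of=!in:<agg1>,<agg2>,<agg3>.

    forbidden_agg_uuids: iterable of aggregate UUIDs the request must
    not land in (a hypothetical input shape for illustration).
    """
    if not forbidden_agg_uuids:
        return None  # nothing to exclude; omit the parameter entirely
    return 'member_of=!in:' + ','.join(sorted(forbidden_agg_uuids))
```

Sorting is just to make the parameter deterministic; placement treats the list as a set.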
<sean-k-mooney> melwitt: the scheduler does not directly talk to the db, right? i.e. does the scheduler access the db via the conductor, like the compute service?  00:11
<melwitt> sean-k-mooney: I think it talks directly  00:11
<melwitt> only the compute service does not  00:11
<sean-k-mooney> ok, so all the "control plane" services talk directly, but not computes  00:12
<melwitt> I think so  00:13
<sean-k-mooney> ok, i was not sure if it was only the api, metadata service and conductor that went direct.  00:13
<sean-k-mooney> i guess it makes sense for the scheduler to be able to take a fast path to the db  00:14
<melwitt> sean-k-mooney: you can see which services go through conductor by looking for 'indirection_api' under nova/cmd  00:41
<sean-k-mooney> indirection_api is new to me  00:42
<melwitt> yeah, that's the magic for objects going through conductor  00:43
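The "magic" melwitt mentions is the oslo.versionedobjects indirection pattern: when a service sets indirection_api on the object base class (as nova-compute's entry point under nova/cmd does), remotable object methods are routed over RPC to conductor instead of touching the DB directly. The classes below are simplified stand-ins for illustration, not nova's actual implementation:

```python
class FakeConductorAPI:
    """Stands in for the conductor RPC API."""
    def object_class_action(self, objname, method, args):
        return f'{objname}.{method} via conductor'


class FakeDB:
    """Stands in for direct DB access."""
    @staticmethod
    def get(objname, method, args):
        return f'{objname}.{method} via direct DB access'


def remotable_classmethod(fn):
    """Route the call through indirection_api when it is set."""
    def wrapper(cls, *args):
        if cls.indirection_api is not None:
            return cls.indirection_api.object_class_action(
                cls.__name__, fn.__name__, args)
        return fn(cls, *args)
    return classmethod(wrapper)


class Instance:
    # nova-compute's cmd module sets this to a conductor API object;
    # control-plane services leave it as None and go direct.
    indirection_api = None

    @remotable_classmethod
    def get_by_uuid(cls, uuid):
        return FakeDB.get(cls.__name__, 'get_by_uuid', (uuid,))
```

With indirection_api unset the call hits the DB path; after assigning a conductor API object, the same call is dispatched through it.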
<sean-k-mooney> ah, ok.  00:44
<sean-k-mooney> the way the prefilter is working bugs me  00:46
<sean-k-mooney> i need to re-read the spec, but i don't think it's working the way i expect it to  00:47
<sean-k-mooney> it may be equivalent  00:47
<sean-k-mooney> if the image has no trait request, i think it would allow all aggregates  00:48
<sean-k-mooney> and i think you would have to list all traits in the image/flavor on the aggregate for it to match.  00:49
<sean-k-mooney> i probably should be less tired when reviewing this, but this seems reversed to me  00:49
<melwitt> yeah, I think it will get all the aggregates, add any without the trait to the "no" bin, and then pass the "no" bin to placement and say "these are the forbidden aggregates, don't return hosts in them"  00:52
<sean-k-mooney> i think the logic is inverted.  00:53
<melwitt> which does seem backward, but I assume it's the only way to do it  00:53
<sean-k-mooney> it's treating the image and flavor as the authoritative source, not the metadata  00:53
<sean-k-mooney> this is broken in the same way the existing filter is  00:53
<sean-k-mooney> and it's the difference between the out-of-tree one and the in-tree one  00:53
<sean-k-mooney> for this prefilter to work, every trait that is in the image+flavor would have to be on the host_aggregate metadata  00:55
<melwitt> it does specifically say (in the proposed doc change) that image/flavor need not be set and will still not land on hosts in aggregates with =required. I just don't know how that's done  00:55
<sean-k-mooney> it's done by inverting the relationship  00:55
<openstackgerrit> Takashi NATSUME proposed openstack/nova stable/ocata: Fix misuse of nova.objects.base.obj_equal_prims  https://review.opendev.org/676295  00:56
<sean-k-mooney> so what you need to do is get the metadata for all host_aggregates. then for each aggregate you check all the keys are present in the flavor/image request  00:56
<sean-k-mooney> if a key is set on the aggregate but not in the flavor/image, then you add that to the forbidden list  00:56
<melwitt> oh ok  00:56
<melwitt> yeah  00:56
<sean-k-mooney> again, exactly what the out-of-tree filter does :)  00:56
<sean-k-mooney> ok, i'll comment on the review  00:57
<sean-k-mooney> as implemented, this would work exactly the same as the aggregate_instance_extra_specs filter  00:58
<sean-k-mooney> well, sort of  00:58
<sean-k-mooney> it uses traits instead of any property, and also looks at the image  00:58
<openstackgerrit> Takashi NATSUME proposed openstack/nova stable/ocata: Fix misuse of nova.objects.base.obj_equal_prims  https://review.opendev.org/676295  00:58
<sean-k-mooney> but conceptually it's the same  00:58
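The inverted relationship sean-k-mooney outlines amounts to roughly the following. This is a sketch with assumed data shapes (aggregate metadata as a dict of trait:*=required keys, request traits as a set of trait names); the real implementation is in the linked review:

```python
def forbidden_aggregates(aggregates, request_traits):
    """Return the set of aggregate UUIDs a request must not land in.

    An aggregate is forbidden if any 'trait:*' key in its metadata
    (with value 'required') names a trait that is not among the traits
    requested by the flavor/image.
    """
    forbidden = set()
    for agg_uuid, metadata in aggregates.items():
        required = {
            key[len('trait:'):]
            for key, value in metadata.items()
            if key.startswith('trait:') and value == 'required'
        }
        # Any required trait missing from the request forbids the aggregate.
        if required - request_traits:
            forbidden.add(agg_uuid)
    return forbidden
```

Note the direction: an empty request forbids every aggregate that requires anything, which is exactly why images/flavors with no trait request still avoid =required aggregates.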
<openstackgerrit> Takashi NATSUME proposed openstack/nova stable/ocata: Fix misuse of nova.objects.base.obj_equal_prims  https://review.opendev.org/676295  01:01
<sean-k-mooney> melwitt: actually, it might be correct https://review.opendev.org/#/c/671074/6/nova/objects/aggregate.py@476  01:04
<sean-k-mooney> i need to figure out what sql that generates  01:04
<melwitt> sean-k-mooney: note the ~ (not) squiggle  01:10
<sean-k-mooney> ya, looking at the docstring i think it's correct after all  01:11
<melwitt> phew, good  01:12
<sean-k-mooney> so it's selecting the aggregate if the aggregate metadata key is not in the set passed in  01:12
<openstackgerrit> ZHOU YAO proposed openstack/nova master: Preserve UEFI NVRAM variable store  https://review.opendev.org/621646  01:16
<sean-k-mooney> ok, i'm going to log off for the night o/  01:32
<sean-k-mooney> oh, infra changed how logs are rendered in the gate.  01:43
<sean-k-mooney> i don't think i like it, just because it's different than it has been for 6+ years  01:44
<openstackgerrit> Takashi NATSUME proposed openstack/nova master: api-ref: Fix collapse of 'host_status' description  https://review.opendev.org/676301  01:47
<openstackgerrit> Sundar Nadathur proposed openstack/nova master: ksa auth conf and client for Cyborg access  https://review.opendev.org/631242  02:23
<openstackgerrit> Sundar Nadathur proposed openstack/nova master: Refactor some methods for reuse by Cyborg-related code.  https://review.opendev.org/673734  02:23
<openstackgerrit> Sundar Nadathur proposed openstack/nova master: Add Cyborg device profile groups to request spec.  https://review.opendev.org/631243  02:23
<openstackgerrit> Sundar Nadathur proposed openstack/nova master: Create and bind Cyborg ARQs.  https://review.opendev.org/631244  02:23
<openstackgerrit> Sundar Nadathur proposed openstack/nova master: Get resolved Cyborg ARQs and add PCI BDFs to VM's domain XML.  https://review.opendev.org/631245  02:23
<openstackgerrit> Sundar Nadathur proposed openstack/nova master: WIP: Delete ARQs for an instance when the instance is deleted.  https://review.opendev.org/673735  02:23
<openstackgerrit> Yongli He proposed openstack/nova master: Add server sub-resource topology API  https://review.opendev.org/621476  02:44
<openstackgerrit> Boxiang Zhu proposed openstack/nova master: Make evacuation respects anti-affinity rule  https://review.opendev.org/649963  03:06
<openstackgerrit> Boxiang Zhu proposed openstack/nova master: Fix live migration break group policy simultaneously  https://review.opendev.org/651969  03:06
<boxiang> hi gibi, I have finished the regression functional test for this https://review.opendev.org/649963  03:08
<openstackgerrit> Merged openstack/nova master: Execute TargetDBSetupTask  https://review.opendev.org/633853  04:05
<openstackgerrit> ZHOU YAO proposed openstack/nova master: Preserve UEFI NVRAM variable store  https://review.opendev.org/621646  05:49
<openstackgerrit> Yongli He proposed openstack/nova master: Add server sub-resource topology API  https://review.opendev.org/621476  06:22
<openstackgerrit> Merged openstack/nova master: api-ref: Fix collapse of 'host_status' description  https://review.opendev.org/676301  07:27
<openstackgerrit> Bhagyashri Shewale proposed openstack/nova master: Ignore root_gb for BFV in simple tenant usage API  https://review.opendev.org/612626  09:55
<ut2k3> Hi guys, I unfortunately have two volumes stuck in "detaching" - is there any chance I can force-detach these from the instances? `nova volume-detach...` as well as resetting their state with `cinder reset-state...` did not help  10:05
<openstackgerrit> Ghanshyam Mann proposed openstack/python-novaclient master: Microversion 2.75 - Multiple API cleanup changes  https://review.opendev.org/676275  10:56
<stephenfin> efried: I'm kind of stuck on the reshape for VCPU -> PCPU. I think what we're doing at the moment is wrong, and the tests are hiding that fact https://review.opendev.org/#/c/674895/7/nova/virt/libvirt/driver.py@6891  10:58
<openstackgerrit> Ghanshyam Mann proposed openstack/python-novaclient master: Microversion 2.75 - Multiple API cleanup changes  https://review.opendev.org/676275  10:59
<stephenfin> efried: What I think we need to do is a multi-step process: (a) figure out if any instance on the host is using pinned CPUs by looking at the (host) NUMATopology.pinned_cpus attribute; if so, (b) figure out if there are any PCPU allocations against the host; if not, (c) figure out how many VCPUs to migrate to PCPUs and do that  11:00
<stephenfin> efried: But I can't do (b) easily, since the pattern we have for reshaping is to only provide allocations to 'update_provider_tree' if there's a 'ReshapeNeeded' exception raised, so I'm stuck  11:01
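The chicken-and-egg stephenfin describes comes from the reshape handshake: the resource tracker only passes allocations to update_provider_tree after the driver raises ReshapeNeeded. A condensed sketch of that handshake follows; the method and exception names match nova's virt-driver interface, but the bodies and data are illustrative stand-ins, not the real code:

```python
class ReshapeNeeded(Exception):
    """Raised by the driver when it cannot update inventory without
    also moving existing allocations."""


class FakeDriver:
    def __init__(self, needs_reshape):
        self.needs_reshape = needs_reshape

    def update_provider_tree(self, provider_tree, nodename, allocations=None):
        if self.needs_reshape and allocations is None:
            # Signal the caller to call us again, this time with every
            # consumer's allocations against this node.
            raise ReshapeNeeded()
        if allocations is not None:
            # ... move allocations (e.g. VCPU -> PCPU) and update
            # inventory together in one reshape operation ...
            return 'reshaped'
        return 'normal-update'


def update_from_provider_tree(driver, tree, nodename):
    """Resource-tracker side of the handshake (simplified)."""
    try:
        return driver.update_provider_tree(tree, nodename)
    except ReshapeNeeded:
        allocations = {'consumer-1': {'VCPU': 4}}  # fetched from placement
        return driver.update_provider_tree(tree, nodename, allocations)
```

The sticking point is visible in the sketch: the driver only sees allocations after it has already decided, without them, that a reshape is needed.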
<openstackgerrit> YAMAMOTO Takashi proposed openstack/nova master: WIP: midonet doesn't have plug-time vif events  https://review.opendev.org/676388  11:04
<openstackgerrit> Ghanshyam Mann proposed openstack/nova master: Testing tls with ipv6 also  https://review.opendev.org/676391  11:23
<openstackgerrit> Ghanshyam Mann proposed openstack/python-novaclient master: Microversion 2.75 - Multiple API cleanup changes  https://review.opendev.org/676275  11:48
<stephenfin> alex_xu, gibi: More than happy to let you handle the update :)  11:58
<stephenfin> I shall sit at the back of the room and judge mercilessly :P  11:58
<gibi> stephenfin: I will do the placement project update with tetsuro, so you can have the nova project update :)  11:59
<alex_xu> the world depends on the three of us~  12:01
<gibi> will be fun :)  12:04
<alex_xu> hah  12:04
<sean-k-mooney> stephenfin: you're not allowed to announce that we are killing nova and turning it into a sig :P  12:07
<alex_xu> haha  12:09
<alex_xu> I'm thinking how a sig works  12:09
<sean-k-mooney> alex_xu: they are allowed to have repos but have no ptl to herd the cats in a common direction  12:10
<alex_xu> that is fun  12:11
<sean-k-mooney> hehe, yep, which is why stephenfin's first act as docs ptl was to eliminate his position, so he will be the final docs ptl  12:12
*** belmoreira has quit IRC12:13
*** ociuhandu has quit IRC12:14
<alex_xu> haha :)  12:14
<alemgeta> hello, please, i'm doing my msc thesis on openstack nova. what exact failure detection algorithm is used in openstack, and where is the code? i'd appreciate any help  12:42
<tssurya> mriedem, dansmith: are you both around?  12:55
<dansmith> I am, but I'm about to jump on a call in 4 minutes  12:55
<tssurya> what did we finally decide on? tweaking the reno and adding the fact that the config on the ironic side should be disabled? or pushing the task state update to the driver level  12:56
<dansmith> not sure we decided anything specific, but I'd prefer to move the task state setting out of the api  12:57
<tssurya> ok, let me try to do that then  12:57
<dansmith> mriedem isn't around right now  12:57
<tssurya> ok, I'll push the rest of his suggestions and we can come back to that when he comes online  12:58
<alemgeta> hello, someone, i have a little question about openstack nova  12:58
<alemgeta> please, someone  12:59
<alemgeta> about failure detection  12:59
<alemgeta> doing an msc thesis  12:59
<luyao> dansmith: when will you be back? 😀  13:02
<sean-k-mooney> nova does not perform failure detection of any of the workloads that are deployed with it, if that is what you were wondering. it obviously validates that the action you asked it to perform succeeded, but it does not monitor the application lifetime  13:02
<sean-k-mooney> dansmith: speak his name and he shall appear  13:03
<sean-k-mooney> mriedem: tssurya wanted to know if you were around ~10 minutes ago  13:04
<dansmith> in an hour  13:04
<luyao> dansmith: got it! See you in an hour. :)  13:05
<mriedem> well, here i am  13:06
<sean-k-mooney> mriedem: i'll address your vPMU feedback later today. thanks for reviewing it before you dropped off yesterday.  13:08
<mriedem> yw  13:10
<mriedem> sean-k-mooney: more lxc failures https://logs.opendev.org/24/676024/5/experimental/nova-lxc/9c06394/controller/logs/screen-n-cpu.txt.gz#_Aug_13_23_16_20_191786  13:23
<mriedem> https://github.com/lxc/lxc/issues/1057 sounds similar  13:26
<sean-k-mooney> fun. debian does not symlink /sbin to /usr/sbin, so perhaps it's in /usr/sbin/init  13:32
<efried> stephenfin: That's an interesting one (your reshape chicken/egg)  13:43
<sean-k-mooney> stephenfin: by the way, regarding the reshape, your code needs to be able to handle a reshape in non-upgrade cases as well.  13:50
<sean-k-mooney> e.g. i need to handle the reshape when the new config options are defined  13:50
<sean-k-mooney> so you could deploy with train and have no config options set, then set them afterwards and restart the compute agent, which would trigger the reshape  13:51
<stephenfin> sean-k-mooney: Would you actually have a reshape there?  13:51
<stephenfin> surely you'd just start reporting the new VCPU/PCPU resources  13:52
<stephenfin> no different to changing vcpu_pin_set today  13:52
<stephenfin> you're not changing allocations from one type to the other  13:52
<sean-k-mooney> well, if you were previously using cpu pinning on that host  13:52
<efried> We've said until now that we don't want to reshape except on upgrades. (Though I've been skeptical that we would be able to stick to that.)  13:52
<sean-k-mooney> and then defined the cpu_dedicated_set, it would change from a vcpu allocation to a pcpu allocation  13:52
<stephenfin> sean-k-mooney: Nope. Remember, vcpu_pin_set is used for VCPU _and_ PCPU  13:53
<sean-k-mooney> yes, but you don't need to enable the prefilter  13:53
<stephenfin> Is that related?  13:54
<sean-k-mooney> or are we doing the conversion to resources:PCPU somewhere else  13:54
<stephenfin> efried: I _think_ I'm okay, actually  13:54
<sean-k-mooney> in the scheduler utils maybe?  13:54
<stephenfin> It seems the way the reshaping is done is that we build the list of inventory that we're going to report from the virt driver, but before we do the actual update we check for old allocations  13:55
<sean-k-mooney> this is what might require the reshape https://review.opendev.org/#/c/671801/19/nova/conf/workarounds.py  13:56
<stephenfin> So I should be able to simply check "does this compute node have any PCPU resources recorded for itself, and does it have any pinned instances", and if there's a mismatch then I reshape  13:56
<sean-k-mooney> but i guess since you would have had to install with disable_legacy_pinning_policy_translation=true  13:56
<sean-k-mooney> i guess it's fair to say if you are disabling that, you need to reshape  13:56
<sean-k-mooney> you are right about both inventories being reported  13:57
<stephenfin> sean-k-mooney: I'm kind of lost, tbh  13:57
<stephenfin> That will only affect the scheduler  13:57
<sean-k-mooney> it affects the placement query  13:57
<stephenfin> Right, but it won't have any impact on the compute node side  13:58
<stephenfin> I mean, unless you're suggesting we'd want to delay reshaping if that option was configured  13:58
<stephenfin> But that's not the idea  13:58
<sean-k-mooney> with disable_legacy_pinning_policy_translation=true, flavor.vcpus is translated to resources.VCPU when hw:cpu_policy=dedicated  13:58
<stephenfin> Yeah. So if that's the case, we continue to be able to use all the pre-Train compute nodes that aren't reporting PCPUs  13:59
<sean-k-mooney> yep  13:59
<stephenfin> But we would not be able to use any of the Train nodes that *are* reporting them  13:59
<sean-k-mooney> when that is removed, new vms would be translated correctly.  13:59
<stephenfin> Correct  13:59
<sean-k-mooney> i guess old vms would be fixed when they are migrated  14:00
<sean-k-mooney> so ya, i guess it's not too much of an issue  14:00
<stephenfin> Their allocations will also be fixed when the compute node is upgraded  14:00
<sean-k-mooney> you could avoid the reshape by migrating all vms off the old nodes to new ones  14:00
<sean-k-mooney> right  14:01
<stephenfin> I could, but I think that shuffling of instances between hosts has been rejected  14:01
<sean-k-mooney> ok, then ya, ignore me  14:01
<sean-k-mooney> well, operators can shuffle the instances. nova won't do it automatically  14:02
<stephenfin> So my current algorithm is: if the host has no PCPU inventory *and* 'ComputeNode.numa_topology.cells[*].pinned_cpus' is set (i.e. there are some pinned instances on the host), reshape  14:02
<stephenfin> and reshaping involves identifying the instances on the host that are pinned and migrating their VCPU allocations wholesale to PCPU allocations  14:03
<sean-k-mooney> ya  14:03
<sean-k-mooney> i think that should work correctly  14:03
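The two steps stephenfin spells out can be sketched as follows, with simplified stand-ins for ComputeNode.numa_topology and placement allocation records; the names and data shapes here are illustrative, not nova's actual structures:

```python
def needs_reshape(inventory, numa_cells):
    """The check: reshape only if the node reports no PCPU inventory
    yet has pinned guest CPUs recorded in its NUMA topology."""
    has_pcpu = 'PCPU' in inventory
    has_pinned = any(cell.get('pinned_cpus') for cell in numa_cells)
    return has_pinned and not has_pcpu


def reshape_allocations(allocations, pinned_instance_uuids):
    """Move VCPU allocations wholesale to PCPU for pinned instances,
    leaving unpinned instances' allocations untouched."""
    new_allocations = {}
    for consumer, resources in allocations.items():
        resources = dict(resources)  # don't mutate the caller's data
        if consumer in pinned_instance_uuids and 'VCPU' in resources:
            resources['PCPU'] = resources.pop('VCPU')
        new_allocations[consumer] = resources
    return new_allocations
```

Only instances identified as pinned are moved, which matches the later point that not every consumer on the host can safely be reshaped.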
<stephenfin> I sadly can't reshape every consumer's allocations, since there's a chance, however remote, that someone hasn't listened to us and has pinned and unpinned instances on the same host  14:04
<sean-k-mooney> it's an online data migration?  14:04
<stephenfin> Yeah, run on startup  14:04
<stephenfin> So this code will trigger once in the entire life of the node  14:04
<openstackgerrit> Balazs Gibizer proposed openstack/nova master: Support reverting migration / resize with bandwidth  https://review.opendev.org/676140  14:04
<stephenfin> We can theoretically remove it in U  14:04
<sean-k-mooney> i'm just thinking about the FFU implications of that, which are that we must start the agent  14:05
<stephenfin> I have the code rewritten to implement that and am just finishing off the tests  14:05
<sean-k-mooney> e.g. you can't FFU from queens to U  14:05
<stephenfin> Do we not do that anyway?  14:05
<sean-k-mooney> unless you stop at train to start the agent  14:05
<stephenfin> If not, that might justify us keeping the reshape code around for a few cycles  14:06
<stephenfin> bbiab  14:06
<sean-k-mooney> it's only the removal of the reshape code that would break FFU, i think  14:06
<openstackgerrit> Balazs Gibizer proposed openstack/nova master: Allow migrating server with port resource request  https://review.opendev.org/671497  14:06
<sean-k-mooney> or rather, require it to start the agents on an intermediate release  14:06
<luyao> dansmith: Are you around now?  14:10
<dansmith> luyao: almost done  14:10
<spatel> sean-k-mooney: Do you know what is going on here? http://paste.openstack.org/show/757000/  14:12
<luyao> dansmith: Great, I'll give my question first. It's about the libvirt driver: it seems that it's not recommended to access the DB in the libvirt driver, but in my patch I need to get the flavor id from the db to populate the device manager. is that acceptable? I would like your help with some comments.  14:12
<spatel> It's clearly related to a resize issue or bug  14:12
<sean-k-mooney> ya, i have seen that before.  14:13
<sean-k-mooney> are you deploying on nfs or shared storage?  14:13
<dansmith> luyao: no, you can't access the database directly, but the flavor is on the instance, so it should already be there and have what you need  14:14
<sean-k-mooney> spatel: this can happen when an instance was deleted in the db while the compute agent was stopped, and then was archived before the agent was started again  14:15
<sean-k-mooney> e.g. after an evacuation  14:15
<luyao> dansmith: https://review.opendev.org/#/c/672957/5/nova/virt/libvirt/device.py@196 I need to access the db to get the instance  14:16
<sean-k-mooney> the host has a stale libvirt domain that references a disk that is no longer found  14:16
<dansmith> luyao: you've got it right there  14:16
<spatel> sean-k-mooney: you are saying someone deleted an instance while the compute agent was down, and it got out of sync.  14:16
<dansmith> lyarwood: Instance.get_by_uuid() is okay, it accesses the database, but not directly - through conductor  14:17
<sean-k-mooney> yes, basically this happens because a stale libvirt domain xml is on the host that references an image file that no longer exists  14:17
<dansmith> oops  14:17
<spatel> I believe this is related to the resize issue when you do CPU pinning, which leaves the resize stalled  14:17
<dansmith> luyao: ^  14:17
<luyao> dansmith: so it's ok to access the database like this? I just thought it's not recommended to access the database in the driver  14:18
<dansmith> luyao: but you should try to avoid that if you can pull the instance there.. I'd have to look at the code to see where this is running  14:18
<dansmith> luyao: like I just said, you're not hitting the database directly with that call  14:18
<sean-k-mooney> spatel: it does look similar to https://bugs.launchpad.net/nova/+bug/1774249  14:19
<openstack> Launchpad bug 1774249 in OpenStack Compute (nova) pike "update_available_resource will raise DiskNotFound after resize but before confirm" [Medium,Triaged]  14:19
<spatel> sean-k-mooney: do you know how i can fix this issue or clean up the left-over disk?  14:19
<dansmith> luyao: see, there are now two places in that init routine that get instances: the mdev one and your pmem one  14:20
<dansmith> luyao: the mdev one is more efficient than what you have, but it's not okay to look up all the instances twice, so you need to refactor that code in a patch ahead of time,  14:20
<alex_xu> dansmith: yea, that code runs when nova-compute starts. we populate all the assigned vgpus or vpmems from libvirt, then we need the instance_uuid and flavor_id to identify each assignment  14:20
<sean-k-mooney> well, the disk not found error means there is no left-over disk  14:20
<dansmith> luyao: to look up the instances in the driver once and pass them to the mdev and pmem methods  14:20
<sean-k-mooney> spatel: you have a left-over domain xml  14:20
<spatel> let me check  14:20
<dansmith> luyao: but ideally you would pass all that in from the actual driver init, because it too does this lookup, I think  14:21
<sean-k-mooney> so you can delete it with virsh, but you should first check what the instance uuid was, and then check that nova does not think that vm should be running on that host  14:21
<sean-k-mooney> if nova thinks the vm should be deleted or running on a different host, then provided it's not in the resize_confirm state, it's safe to delete the domain xml  14:22
<spatel> sean-k-mooney: i think i have an xml file in /etc/libvirt/qemu which is not active in virsh list  14:24
<luyao> dansmith: Does driver init look up all instance objects? I think this should be done in the compute manager.  14:24
<dansmith> luyao: I said "I think", I'll look  14:25
<alex_xu> I guess not  14:25
<spatel> sean-k-mooney: can i just delete that file?  14:25
<alex_xu> now I think it is not  14:26
<mriedem> gibi: some thoughts on your tempest change to run resize+confirm in grenade https://review.opendev.org/#/c/675371/  14:26
<alex_xu> dansmith: is it ok to query all the related instance objects and pass them into driver.init_host?  14:27
<dansmith> alex_xu: luyao: it does, indirectly, because the compute manager calls init_instance for each, but yeah, not in driver init, although I'm not sure where we get the devicemanager instantiated either  14:27
*** belmoreira has joined #openstack-nova14:27
dansmithalex_xu: I would think refactoring init_host to take/pass all instances might be good, yeah14:28
dansmithlet me look at something else14:28
*** tbachman has quit IRC14:28
alex_xuluyao: ^ is that doable, I guess you need all the instances where instance.host = self.host plus any instances that are resizing?14:29
sean-k-mooneymriedem: regarding the lxc issue looking at https://logs.opendev.org/24/676024/5/experimental/nova-lxc/9c06394/controller/logs/screen-n-cpu.txt.gz#_Aug_13_23_16_20_055037 we are setting the container entrypoint to /sbin/init but that can be changed in the image metadata https://github.com/openstack/glance/blob/master/etc/metadefs/compute-libvirt-image.json#L63-L6714:29
dansmithalex_xu: luyao I was going to say, init_instance() already gets called for each instance in compute manager, and you could put your accounting code in there, but passing that list to init_host() would be just as good, if that works14:30
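A minimal sketch of the refactor being discussed, with a hypothetical driver class and method signature (not Nova's real API): the compute manager passes the instance list it already looked up into init_host(), and the driver rebuilds its in-memory device accounting from it, leaving only unknown owners (e.g. unconfirmed resizes) for a smaller follow-up query.

```python
class FakeDriver:
    """Illustrative stand-in for a virt driver; names are assumptions."""

    def __init__(self):
        self.assignments = {}  # device name -> instance uuid

    def init_host(self, host, instances):
        # Rebuild in-memory device accounting from the instance list the
        # compute manager already fetched for this host (no extra DB hit).
        known = {inst["uuid"] for inst in instances}
        for dev, owner in self._discover_assigned_devices().items():
            if owner in known:
                self.assignments[dev] = owner
            # else: owner has XML here but is not in the host's instance
            # list per the DB -> candidate for a targeted follow-up query
            # (e.g. the source side of an unconfirmed resize)

    def _discover_assigned_devices(self):
        # Stand-in for reading device assignments out of guest XML.
        return {"/dev/pmem0": "uuid-a", "/dev/pmem1": "uuid-z"}
```

The manager would call `driver.init_host(host, instances)` with the same list it iterates for init_instance(), which is the "look up once, pass down" improvement dansmith is asking for.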
sean-k-mooneyso i think for the debian issue we just need to set os_command_line in the lxc image to the path to the init system which is likely systemd now14:30
luyaoalex_xu: I need all instances which have xml on the host.14:30
dansmiththat would *improve* the current code, instead of your code making it *worse*14:30
alex_xuluyao: is there a case, you have xml for the target instance when resize, but the instance.host is still the src host?14:31
Tianhao_Hu@mriedem hi matt, about this issue, if the directory left after cold migration is empty and has no effect on cold migration, can you give some advice about whether we can consider this not a bug?14:32
Tianhao_Huhttps://bugs.launchpad.net/starlingx/+bug/182485814:32
openstackLaunchpad bug 1824858 in StarlingX "nova instance remnant left behind after cold migration completes" [Low,Confirmed] - Assigned to hutianhao27 (hutianhao)14:32
*** tbachman has joined #openstack-nova14:32
alex_xudansmith: yea, that is better, just need to check with luyao for the resize case, in case we have a domain xml but instance.host isn't the current host. I guess init won't get instances that don't belong to this host14:32
luyaoalex_xu: there is a case, the resizing is not finished ,but the instance.host is dest host14:32
dansmithalex_xu: yeah14:33
sean-k-mooneymriedem: could we use a different image in the ci job or set the path to the correct entrypoint. ill look at this again later but i dont think this is a nova bug per se, just incorrect configuration14:33
alex_xuluyao: ah, so the source host can have a domain xml, but it needs the instance whose host is the target host14:34
*** BjoernT_ has joined #openstack-nova14:35
openstackgerritSurya Seetharaman proposed openstack/nova master: API microversion 2.76: Add 'power-update' external event  https://review.opendev.org/64561114:35
luyaoalex_xu: yes, so we may also need to check migrations14:35
alex_xudansmith: I think I can query more instances like the case luyao said, those instances would only be passed to init_host, but won't be used for the later compute-manager initialization14:36
mriedemTianhao_Hu: i can't say off the top of my head and without digging through the 23 comments and description and recreate of that bug, which i'm unable to do right now14:36
*** BjoernT has quit IRC14:36
mriedemit also looks like it's rbd backend specific, and i don't have an environment to poke into that or recreate it right now14:36
mriedemsomeone from starlingx that is familiar with the nova code could triage it14:36
mriedemcould/should14:36
dansmithalex_xu: if you pass the list of instances on this host to init_host(), then you can get any instances that have xml but are not in that list, and do an independent query for those, which will be much smaller14:38
mriedemsean-k-mooney: i agree it's some kind of misconfig and related to needing to tell the image to use systemd now14:38
dansmithalex_xu: however, I would think that if you're just trying to figure out which namespaces are assigned, the xml should be enough, so I'm not sure why you need the instance from the db (but I haven't looked closely)14:38
alex_xudansmith: we need the flavor id. since we use (instance_uuid, flavor_id) to identify a claim. This is due to the same host resize. Both src and dest claim on the same host, we need a way to distinguish the claim14:40
alex_xuthe instance uuid I can get from the domain xml. the only trouble is the flavor id14:40
sean-k-mooneymriedem: looking at the lxc buster (debian 10) images the symlink appears to be there. /sbin/init -> /lib/systemd/systemd14:41
sean-k-mooneymriedem: so i think if we just use the right image it should just work14:41
mriedemalex_xu: the flavor name is in the domain xml metadata14:42
mriedemsee https://logs.opendev.org/24/676024/5/experimental/nova-lxc/9c06394/controller/logs/screen-n-cpu.txt.gz#_Aug_13_23_16_20_05503714:42
mriedem<nova:flavor name="m1.nano">14:42
dansmithalex_xu: okay makes sense14:42
alex_xumriedem: can we ensure all the virt drivers persist the flavor name? or is it libvirt specific.14:42
mriedemidk about the other drivers14:43
mriedembut they aren't doing pmems either14:43
dansmithalex_xu: mriedem you need more than that, you need the *actual* details of the flavor right?14:43
sean-k-mooneyalex_xu: i think the metadata is driver specific14:43
sean-k-mooneyalex_xu: it normally has the flavor uuid too14:43
dansmithalex_xu: looking up the flavor by name after the instance is spawned won't work because it may have changed, you need instance.flavor I think14:43
mriedemi'm wading into a conversation for which i don't have context of the problem, i'm just saying, the domain xml has flavor info14:43
alex_xudansmith: only need the id14:43
dansmithalex_xu: what does that tell you?14:44
alex_xudansmith: I just want to distinguish the src claim and dest claim on the same host. so instance_uuid and flavor_id is enough14:44
mriedemthe domain xml metadata doesn't store the flavor id so i guess that won't help you14:44
sean-k-mooneyalex_xu: we generate the metadata here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L286014:44
alex_xusean-k-mooney: I'm thinking we have uuid for flavor...?14:45
mriedemalex_xu: and this is because the move claim for pmems is moving to the driver, right?14:45
dansmithalex_xu: okay I don't understand, since the flavor id won't tell you size or type or anything, but I guess I'll understand when I read14:45
sean-k-mooneyalex_xu: yes we do but we also have other info14:45
Tianhao_Humriedem: Thank you for your advice and I will get someone to work with me trying to find out  why the directory is left.14:45
mriedemalex_xu: flavor.flavorid is the user-facing thing14:45
mriedemflavorid is not necessarily a uuid,14:46
mriedembut it must be unique14:46
sean-k-mooneyactually we might not have the uuid unless we put it in the name field14:46
alex_xudansmith: the flavor id will tell you whether it is src or dest. the dest has the new flavor, the src the old flavor14:46
luyaodansmith: flavor id is used to mark a namespace is assigned to which instance with which flavor14:46
dansmithluyao: I think that's a bad plan14:46
sean-k-mooneyalex_xu: you can always add whatever you need to https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L2902-L293014:46
dansmithmriedem: IIRC, flavorid is unique, but only amongst non-deleted flavors, so if you delete and recreate a flavor it may be different right?14:46
alex_xumriedem: for same host resize, also note it is useful for vgpu14:47
mriedemdansmith: flavors are no longer soft-deletable14:47
mriedemsince they moved to the api db14:47
mriedemschema.UniqueConstraint("flavorid", name="uniq_flavors0flavorid"),14:47
sean-k-mooneyalex_xu: we are using a custom xml namespace with a schema file that is not available anymore so you can extend it and it won't be any more broken than it already is14:47
dansmithmriedem: okay, right, but same point.. I can delete a flavor and recreate it and the flavors will be totally different14:47
alex_xudansmith: at least we don't allow resize to the same flavor (checked by flavor id i think)14:47
mriedemi don't know enough about the code or the problem here,14:48
mriedemnote there is also instance.instance_type_id which is the instance.flavor.id,14:49
mriedembut not in the domain xml so likely doesn't help14:49
dansmithbut still,14:49
alex_xusean-k-mooney: it is too late to add new info to the xml. existing instances won't have it until a restart14:49
sean-k-mooneyalex_xu: what info do you need14:49
dansmithI think they're making some assumptions that flavorid is permanent and never changes, such that considering two instances with the same pmem flavorid must be the same14:49
mriedemalex_xu: existing instances won't have vpmems either14:49
alex_xumriedem: dansmith oops, sorry, I use the flavor.id, not the flavorid14:49
dansmithalex_xu: so the integer id?14:50
alex_xudansmith: yes14:50
dansmithalex_xu: okay, well, that's still not a good idea, but less problematic14:50
alex_xumriedem: nice point14:50
*** priteau has joined #openstack-nova14:50
*** Tianhao_Hu has quit IRC14:51
*** luksky has quit IRC14:51
aspiersDoes anyone know why _get_guest_xml() has both instance and image_meta as params, rather than getting image_meta via instance.image_meta? I traced it back and found that prep_resize is called with image_meta from the request_spec, but I'm not sure why a request_spec would have different image_meta to the existing instance14:52
alex_xumriedem: but it should be a problem for vgpu14:52
alex_xumriedem: we already have vgpu instance14:52
alex_xudansmith: one more fun is we support same flavor same host cold migration :)14:52
alex_xudansmith: the vmware virt driver is the only driver that supports that. same host same flavor cold migration makes no sense for other virt drivers14:53
mriedemalex_xu: the vmware driver isn't supporting pmems14:53
dansmithalex_xu: okay I'm really not sure why these are problematic cases14:53
luyaodansmith: because when resizing to the same host, we can't distinguish the two groups of assignments14:54
alex_xuyea, same flavor.id14:54
mriedemif you're doing a resize, the flavors have to be different,14:55
mriedemif you're doing a cold migration, the flavors must be the same,14:55
dansmithluyao: yeah I'm not sure why that's necessary, but it's also a good reason _not_ to use flavor, because it will depend on whether or not the flavor is changing,14:55
sean-k-mooneyalex_xu: but why do you need to get flavor info from the xml. for resize we have the flavor objects and we should have them for cold migrate and live migrate14:55
mriedemif you're doing a resize, you can do it on the same host with libvirt,14:55
dansmithand building an assumption into this that same-host same-flavor migrations can't happen is a bad idea, IMHO14:55
mriedemif you're doing a cold migration, you cannot do it on the same host with libvirt14:55
mriedemand it seems folly to think any other driver is going to implement vpmem support14:55
dansmithsean-k-mooney: they're talking about during host init, with instances in the middle of a migration14:55
mriedemgiven how complicated this sounds even for libvirt14:55
sean-k-mooneydansmith: ah thanks for the context14:56
dansmithmriedem: we do same-host same-flavor migration on libvirt in the gate with a single worker yeah?14:56
mriedemno14:56
sean-k-mooneydansmith: i dont think so14:56
mriedemresize yes14:56
mriedemnot cold migration14:56
mriedemyou literally can't14:56
dansmithwhy not?14:56
aspiersHrm, sounds like my question might coincidentally be somewhat related to the ongoing conversation, except for image_meta instead of flavors14:56
sean-k-mooneydansmith: the libvirt driver does not report support for it14:57
alex_xusean-k-mooney: avoid db query in libvirt init14:57
mriedemhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L31414:57
mriedembecause ^14:57
mriedemremember the other day when we were talking about a workaround option to disallow cold migrating to the same source host as a scheduler option?14:57
mriedemthis is why14:57
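The gate mriedem links to is a per-driver capability flag; a toy version of the check, with illustrative names rather than Nova's exact internals, looks like this:

```python
# Simplified sketch: each virt driver declares whether it can cold-migrate
# an instance back onto its own host, and the move is refused otherwise.
# The dict below mirrors the discussion (libvirt: no, vmware: yes) but the
# structure and function are assumptions, not Nova's real code.

DRIVER_CAPABILITIES = {
    "libvirt": {"supports_migrate_to_same_host": False},
    "vmwareapi": {"supports_migrate_to_same_host": True},
}

def can_cold_migrate(driver_name, source_host, dest_host):
    # Cross-host cold migration is always allowed at this layer; the
    # same-host case is gated on the driver capability.
    if source_host != dest_host:
        return True
    return DRIVER_CAPABILITIES[driver_name]["supports_migrate_to_same_host"]
```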
alex_xusame flavor same host cold migration means....you stop and start the instance again :)14:57
sean-k-mooneyalex_xu: why? we will be hammering it anyway when the resource tracker starts up14:57
*** ociuhandu has joined #openstack-nova14:57
dansmithmriedem: either way, don't you think that making that assumption in the accounting for these is a bad plan, if that gets enabled in the future for whatever reason?14:57
dansmithalex_xu: no it doesn't14:57
*** boxiang has quit IRC14:58
dansmithmriedem: okay that's just because libvirt won't handle the disk copying properly because the names are all the same right?14:58
*** boxiang has joined #openstack-nova14:58
mriedemidk what cold migrating on the same host with libvirt would mean14:58
alex_xuyea, same question14:58
dansmithages ago we had *some* CI system (not vmware) that could do same-host cold migration because it would find bugs in that, but maybe it was xen or smokestack or something14:59
dansmithjust seems like a terribly deep assumption to make14:59
mriedemidk, the xenapi driver doesn't support it either https://github.com/openstack/nova/blob/master/nova/virt/xenapi/driver.py#L6714:59
mriedemonly the vmware driver does14:59
*** liuyulong has quit IRC14:59
mriedemi also don't understand how init_host is connected to a move claim15:00
dansmithalex_xu: explain again why you need to tie flavors to the pmem at all for accounting?15:00
dansmithmriedem: this is for host restart, rebuilding state for what has been given to what15:00
dansmithmriedem: but I don't understand why we need to correlate that with anything other than instance uuid, which is done in the xml, AFAIK15:01
mriedemyeah....i thought the pmem "allocation" type data was stored in instance_extra or something15:01
mriedemkeyed by instance uuid15:01
alex_xufor same host resize, we record the claim in memory. but when we confirm or revert the resize, we need to unclaim either the dst claim or the src claim. to distinguish them, instance_uuid alone isn't enough. I use instance_uuid + flavor.id.15:01
alex_xudansmith: mriedem sean-k-mooney ^15:01
dansmithalex_xu: isn't the whole point of this pmem stuff that the contents get moved with the instance if it crosses to another host?15:02
dansmith(tbh I don't know what the point of this is, but...)15:02
mriedemif you're adding a new move claim interface to the virt driver, why wouldn't you just pass the information down to the driver that the RT already knows - i.e. the old and new flavor and if it's a same-host resize?15:02
mriedemif we have to piece information together from what the hypervisor says in the domain xml, we screwed up somewhere in the design :)15:03
alex_xuwhen you confirm a resize, you need to unclaim the assignment on the source host. But for the same host, if you only use instance_uuid, when you unclaim by instance_uuid, both the source and dest claims will be unclaimed15:03
alex_xusame problem for revert resize15:03
dansmithalex_xu: so change the source allocation to be by migration uuid like we do for allocations?15:05
alex_xudansmith: the problem is when we revert the migration uuid back to instance uuid. for the resize to different host, those claim on two host's memory, we can't change the migration uuid back to instance uuid when the resize finish15:05
dansmithalex_xu: why not?15:06
dansmithif you can change it from instance to migration, why can't you change it back?15:06
*** helenafm has quit IRC15:06
*** jangutter has quit IRC15:07
alex_xudansmith: we claim the vpmem on the dest host(prep_resize), at that point, we need to change the source host claim's instance_uuid to migration_uuid.15:08
dansmith...right15:08
alex_xuyea...there is no way to change source host claim15:09
dansmithare you saying there's no way to rename a pmem device or whatever you're naming (instance, flavor) ?15:11
*** ociuhandu has quit IRC15:11
*** ociuhandu has joined #openstack-nova15:12
artomalex_xu, I feel like we should coordinate on the claim thing15:12
artomBecause IIUC your series touches on the same points as NUMA LM15:12
alex_xudansmith: when nova-compute starts up, we populate all the assignments of vpmem and vgpu into memory, storing those device claims in memory, keyed by (instance, flavor)15:12
alex_xuartom: yea15:13
sean-k-mooneyalex_xu: well why dont we just store them with the instance as the key15:13
dansmithalex_xu: are the pmem devices actually named something specific in linux? or are they just /dev/pmem/2 and we maintain the mapping between them and the instance?15:13
alex_xusean-k-mooney: back to the same host resize problem :)15:13
artomalex_xu, what's your series? I'm having trouble searching gerrit for patches you own15:14
mriedem(instance, flavor) is redundant because we have instance.flavor15:14
alex_xudansmith: they are just /dev/pmem, but I don't get what relationship with the vpmem name at here15:14
*** slaweq has quit IRC15:14
sean-k-mooneyalex_xu: then use instance+flavor_name which is in the xml or add the uuid to the xml15:14
dansmithalex_xu: okay so I don't understand why you have a problem accounting things with migration uuid vs. instance uuid at any given point15:14
mriedemi think dan is saying, you claim the new_flavor devices on the dest host with the instance, and claim the old_flavor devices on the source host with the migration15:15
dansmithsean-k-mooney: it's not a matter of storing that data, and further, flavor is an object that exists and is keyed in another database, so I think using it as a key here is a bad idea as well15:15
dansmithmriedem: right15:15
mriedemand by "claim the old_flavor devices on the source host with the migration" i mean, swap the mapping for the old_flavor device claim on the source host with the migration for the instance15:15
dansmithmriedem: same pattern as we have elsewhere15:15
alex_xudansmith: the resize calls the dest host to claim the resources first, then how can we change the in-memory claim info on the source host? by another rpc call, back to the source host?15:16
mriedemwhen you revert on the source host, you revert that mapping as well15:16
dansmithalex_xu: you get a finish call or a revert call, so you do it then and there right?15:16
mriedemalex_xu: do it when prep_resize casts to resize_instance on the source host?15:16
sean-k-mooneywell we could flip it and claim with the migration uuid on the dest15:16
dansmithsean-k-mooney: no15:16
sean-k-mooneyok15:17
dansmithsean-k-mooney: because then you have to change it again if they keep the instance, which is the natural path15:17
sean-k-mooneyya that is true  you  would have to change it in confrim resize15:17
alex_xudansmith: we should change to migration uuid before we add new claim for the target15:17
mriedemor like dan said just swap the source host claim during confirm_resize or finish_revert_resize (i'm not sure you'd even need to change the latter)15:17
alex_xuotherwise how can we distinguish the src and dest for the same host resize...15:17
mriedemfor same host resize,15:18
mriedemyou have some mapping of devices claimed on that host, right?15:18
dansmithalex_xu: are you familiar with how we do this for placement allocations?15:18
mriedemthe instance maps to the new flavor device allocations, the migration maps to the old flavor device allocations15:18
dansmithyes ^15:18
alex_xuyea, I know those thing for placement15:18
*** slaweq has joined #openstack-nova15:18
*** dave-mccowan has joined #openstack-nova15:19
mriedemso for same host resize move claim it seems you'd:15:19
mriedem1. map migration = old flavor15:19
mriedem2. change instance mapping to point from old to new flavor15:19
mriedemon confirm, you remove the mapping for the migration,15:19
mriedemon revert, you:15:19
mriedem1. map instance to old flavor,15:19
mriedem2. drop the migration mapping15:19
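The bookkeeping mriedem enumerates above can be sketched as a tiny claim table keyed by a single owner UUID (instance or migration), mirroring the placement-allocations pattern dansmith recommends instead of the (instance_uuid, flavor_id) key. The class and method names are hypothetical.

```python
class DeviceClaims:
    """Toy in-memory device claim table for a single host. The source
    side of a resize is re-keyed to the migration uuid, the dest side is
    held by the instance uuid -- same pattern placement uses for
    allocations during a move."""

    def __init__(self):
        self.claims = {}  # owner uuid -> list of device names

    def claim(self, owner_uuid, devices):
        self.claims[owner_uuid] = list(devices)

    def start_resize(self, instance_uuid, migration_uuid, new_devices):
        # 1. re-key the existing (old) claim to the migration uuid
        self.claims[migration_uuid] = self.claims.pop(instance_uuid)
        # 2. claim the new devices under the instance uuid
        self.claims[instance_uuid] = list(new_devices)

    def confirm_resize(self, migration_uuid):
        # instance keeps the new devices; drop the migration's old claim
        self.claims.pop(migration_uuid, None)

    def revert_resize(self, instance_uuid, migration_uuid):
        # drop the new claim and hand the old devices back to the instance
        self.claims.pop(instance_uuid, None)
        self.claims[instance_uuid] = self.claims.pop(migration_uuid)
```

Because the two claims never share a key, the ambiguous same-host same-uuid unclaim problem alex_xu raised does not arise.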
dansmithdon't say flavor here15:19
dansmithbut yes15:19
mriedemflavor device allocation thing 200015:20
mriedemwill flavour work?15:20
dansmithno15:20
dansmithI think building more things using the existing pattern of using migration uuid to reserve the old resources is a good idea15:21
alex_xuemm...I try to remember which case we said no for this in the beginning15:21
alex_xuluyao: ^ help me15:21
luyaoalex_xu: I'm always trying...15:22
*** slaweq has quit IRC15:24
*** tbachman has quit IRC15:24
tssuryamriedem: do you also prefer to push all the instance state checking and updates into the manager like dansmith ? since we have a lock there15:25
alex_xumriedem: dansmith the first rpc call to the compute node for the resize is sent to the dest host, not the src. so you can't do the 'migration = old flavor' mapping first15:26
*** dave-mccowan has quit IRC15:26
*** nweinber_ has joined #openstack-nova15:26
dansmithalex_xu: you can't, but you don't have to do the migration mapping until you hit the source for the first time15:26
dansmithalex_xu: isn't this the exact same set of steps for allocations in placement? so the ordering should work the same way15:27
*** nweinber__ has quit IRC15:27
openstackgerritYAMAMOTO Takashi proposed openstack/nova master: Revert "Revert resize: wait for events according to hybrid plug"  https://review.opendev.org/67502115:27
openstackgerritYAMAMOTO Takashi proposed openstack/nova master: Revert "Pass migration to finish_revert_migration()"  https://review.opendev.org/67644215:27
*** nweinber__ has joined #openstack-nova15:28
*** macz has joined #openstack-nova15:28
mriedemwell, it's a little different with placement allocations since we can manage those at the top in conductor15:28
alex_xudansmith: we switch instance_uuid to migration_uuid in the conductor, right?15:28
dansmithokay, that's fair15:28
mriedemfor example, if we claim the new devices on the dest in prep_resize, cast to resize_instance on the source, swap the old devices to the migration, and then something fails during the disk transfer or whatever, we won't go back to the dest to cleanup that old claim - but the RT should fix that up in a periodic15:29
mriedemi don't know where these "claims" are stored in memory though - in the compute manager? RT? driver?15:29
mriedemi'll just say, the RT logic is already super complex, and now it sounds like we're going to be duplicating parts of it elsewhere...15:30
alex_xumriedem: the swap should happen first for same host resize15:30
alex_xumriedem: claim store in driver15:30
*** nweinber_ has quit IRC15:30
*** shilpasd has joined #openstack-nova15:30
mriedemcleaning up from a failed same host resize is simpler since yo'ure on the same host, but not for different hosts15:30
sean-k-mooneyalex_xu: this is diferent then how we claim pci devices and cpu/hugepage right?15:31
*** tbachman has joined #openstack-nova15:31
mriedemthe driver doing resource tracking now.... :(15:31
sean-k-mooneybecause for those i thought we stored the claims in the db via the RT rather than in memory15:31
alex_xusean-k-mooney: yes, at least the vpmem and vgpu is managed by virt driver. pci and cpu managed by resource tracker15:32
mriedemthe resource tracker doing resource tracking still ... :(15:32
mriedemwhat a mess15:32
mriedemsome things in placement, some things in legacy nova tables and the RT, some things now in the driver15:32
sean-k-mooneywe should really be keeping all this in the RT until it's in placement15:32
sean-k-mooneyif it's ever in placement15:33
sean-k-mooneyputting this in memory in the driver is worrying15:33
alex_xusean-k-mooney: mriedem in the future, when pci and numa move to placement, then we needn't store them in the RT, right? then they're also managed in the virt driver?15:33
sean-k-mooneyor at least complex15:33
sean-k-mooneyalex_xu: it will still be needed in some cases15:33
alex_xusean-k-mooney: hah, you say different with dansmith :)15:33
* alex_xu prepare a gun for dansmith15:33
sean-k-mooneywell placement wont be tracking individual device assignments15:34
sean-k-mooneye.g. which vf (pci address) the vm is using15:34
sean-k-mooneyto do that we would need to create a RP per vf which we are not going to do15:34
alex_xusean-k-mooney: yea, that is what happen for vgpu and vpmem15:34
dansmithI'm not sure what sean-k-mooney said that is different than me15:35
sean-k-mooneyso the "assignment" information will always need to live in nova15:35
sean-k-mooneythe tally count of how many are available will be in placement15:35
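The split sean-k-mooney describes can be shown with a toy sketch: placement only keeps a count-style inventory per resource class, while nova (driver or RT) remembers which concrete device went to which instance. All names below are illustrative assumptions, including the custom resource class.

```python
# Placement-side view: just totals and usage for a resource class.
placement_inventory = {"CUSTOM_PMEM_4GB": {"total": 2, "used": 0}}

# Nova-side view: the concrete device -> instance mapping.
host_devices = {"/dev/pmem0": None, "/dev/pmem1": None}

def assign_device(instance_uuid):
    """Bump the placement tally and pick a specific free device."""
    inv = placement_inventory["CUSTOM_PMEM_4GB"]
    if inv["used"] >= inv["total"]:
        # In real life placement would have rejected the allocation.
        raise RuntimeError("no capacity left")
    inv["used"] += 1
    dev = next(d for d, owner in host_devices.items() if owner is None)
    host_devices[dev] = instance_uuid
    return dev
```

Placement can answer "is there room?" but only nova can answer "which /dev/pmemN does this guest get?", which is why the assignment has to live host-side (in the RT, the driver, or the guest XML).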
dansmithsean-k-mooney: when we previously discussed this, I wanted to avoid nova storing a mapping between the actual pmem device and the instance in our database, for a specific reason15:36
alex_xuI think dansmith said we use libvirt to persistent the assigment of devices, not DB. sean-k-mooney is talk about we still need the DB15:36
sean-k-mooneywell currently we regenerate the xmls on lots of operations so storing the mapping in the xml will be invasive15:37
dansmithright, so we have the mapping between instances and pmem devices stored in the libvirt xml15:37
dansmithwhatever, I give up, do whatever ya15:37
dansmith'll want15:37
sean-k-mooneyit will be there implicitly i guess15:37
alex_xudansmith: no...15:37
sean-k-mooneydansmith: we dont need to store it in the db provided we will never use it in a filter15:38
sean-k-mooneywe only need to store the pci info in the db to use it with the numa/pci passthough filters15:38
sean-k-mooneysame for the numa toplogy blob15:38
sean-k-mooneywe could calculate them locally on the host and keep them in memory otherwise15:39
mriedemwhat happens when i need a weigher to pick hosts with more or less allocated pmems?!15:39
sean-k-mooneyso if the only scheduling for vpmem is done via placement then the assignment could be tracked via the xml15:39
mriedemor pmem affinity15:40
sean-k-mooneymriedem: we would either need to call placement for the data or we cant15:40
mriedembtw, are there a fair number of rhosp users using vgpus now that you're on queens?15:40
sean-k-mooneypmem affinity(i assume numa affinity) could be modeled in the RP tree15:41
sean-k-mooneymriedem: not that im aware of15:41
sean-k-mooneymost are using full GPU passthough when they need gpus15:41
mriedemwas just going to say that15:41
sean-k-mooneynvidia licensing is $$$$15:41
dansmithI'll be really honest here, I think this is a very niche, very libvirt-specific, very unlikely-to-be-widely-used feature, and I think that adding a bunch of nova persistence for these things brings more impact to operations and upgrades than we need,15:42
dansmithand storing this information purely in the place where it matters (in libvirt) limits that impact and scope a lot15:42
*** ivve has quit IRC15:42
sean-k-mooneydansmith: im not against that just wanted to point out we have always assumed the xmls are not required until now15:43
dansmithif an operator changes hardware after a maint cycle that changes the ordering of these devices or something, I worry about handing persistent devices to the wrong instances, and I think keeping the mapping(s) in one place that is visible and accessible to the operators if they need to remap is also a good idea15:43
sean-k-mooneye.g. that we can just regenerate them15:43
dansmithsean-k-mooney: no, that's not true I don't think15:43
dansmithsean-k-mooney: if we delete the instance from a guest and we restart nova I think it will freak15:43
sean-k-mooneyyes but if an operator changes the xml with virsh and we do a hard reboot we just regenerate it15:44
dansmithsean-k-mooney: regenerating the xml all the time does not mean that the xml is not useful data.. we use it to determine which instances are actually on this host, vs just assigned15:44
sean-k-mooneyif the domain is missing i dont know what happens15:44
dansmithsean-k-mooney: not if we don't store that detail ourselves15:44
dansmithanyway, I think I've already spent way more time on this than this feature is worth,15:44
dansmithand the column in the db to just dump a blob of data into instance_extra was already merged before this was all discussed,15:45
dansmithso the easiest thing is to just let that become a dumping ground for all this stuff, regardless15:45
dansmithalex_xu: really sorry for ever even involving myself in this, my apologies15:46
*** tbachman has quit IRC15:46
mriedemonto tssurya's problem!15:46
tssuryayayy15:46
dansmithI was just going to say15:46
dansmithtssurya: I missed if there was a reply on the plan15:46
tssuryanot yet waiting for mriedem's opinion15:47
mriedemshe's just asking if i agree with changing task_state in hte api15:47
dansmithyeah15:47
dansmithdidn't see a response on that15:47
tssuryalet a comment on the patch: https://review.opendev.org/#/c/645611/15:47
tssuryaleft*15:47
alex_xudansmith: sorry, I'm trying my best to make it simple and easy. but yea, i still found these issues need help15:47
*** tbachman has joined #openstack-nova15:47
tssuryaalso efried ^ in case you have an opinion15:47
mriedemso if on power-update we don't check or set task_state in the api, we avoid the "instance is stuck with non-none task_state b/c it's on a stein compute" issue15:47
tssuryaright only to do the same in the manager15:48
mriedemand the driver / compute manager would need to handle the UnexpectedTaskStateError15:48
dansmithyep, and it moves the "do we do anything about this" closer to the thing that makes that decision15:48
tssuryawhy does the driver have to handle UnexpectedTaskStateError ?15:49
dansmithtssurya: it doesn't I don't think15:49
tssuryaI would be moving the task_state saving part into the manager15:49
mriedemthe downside is losing some race between the api and the compute manager where the sync power states task has turned off your bm instance b/c the nova db said it should be off but it's actually on again in ironic, right?15:49
tssuryaand since it has a lock it should be fine15:49
dansmithwhen I originally suggested this, I was imagining the driver wholly owning the "what do we do"15:49
tssuryadansmith: yea I remember you telling it in the spec design phase15:49
dansmithso only ironic will be anything other than a no-op in this case, and all it needs to do is do its poweroff15:49
*** damien_r has quit IRC15:49
dansmithtssurya: the lock only works within a compute node,15:50
tssuryaits just that the notifications/action stuff happens in the upper level15:50
dansmithtssurya: so you can still race with the compute node processing this and something else trying to take action on the instance15:50
mriedemsomething needs to handle UnexpectedTaskStateError for power-update otherwise we fail to process all of the events in the same request on the same host15:50
tssuryadansmith: oh yea true15:50
dansmithtssurya: meaning ironic may have sent that event, and meanwhile some user tried to reboot the instance at the same time15:50
mriedemand you could lose a race with the bug you're trying to fix with this, i think, right?15:51
tssuryaso what's the point of moving it to the manager again ?15:51
dansmithmriedem: all we need to do is handle, in the ironic case, what happens if I do instance.save(expected_task_state=None) right?15:51
mriedemtssurya: rolling upgrades for one15:51
tssuryaah yes15:51
dansmithtssurya: and it makes it so we don't touch the instance at all for any drivers that don't care about this15:51
tssuryamriedem: and yea the downside point is valid15:52
mriedemdansmith: i think so, but if the driver raises then the compute manager code has to handle it and not barf for the other events in the same request15:52
dansmithmriedem: I think the driver should just not raise, I think the driver should handle all of this, because it's only one15:52
mriedemso if the driver gets UnexpectedTaskStateError, it just logs and returns?15:53
dansmithyeah15:53
mriedemwfm15:53
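The behavior just agreed on ("the driver should just not raise ... logs and returns") could look roughly like this. The driver class, method name, and exception class are stand-ins (the real one is nova.exception.UnexpectedTaskStateError), not the actual patch.

```python
import logging

LOG = logging.getLogger(__name__)

class UnexpectedTaskStateError(Exception):
    """Stand-in for nova.exception.UnexpectedTaskStateError."""

class FakeIronicDriver:
    def power_update_event(self, instance, target_power_state):
        """Hypothetical power-update handler: only the ironic driver does
        real work here. A task_state conflict means something else is
        already acting on the instance, so log and return instead of
        raising, and the compute manager keeps processing the remaining
        events in the same request."""
        try:
            instance.task_state = "powering-off"
            instance.save(expected_task_state=None)
        except UnexpectedTaskStateError:
            LOG.info("Ignoring power update for %s: task_state changed "
                     "under us", instance.uuid)
            return
        # ... proceed with the actual power-off against ironic ...
```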
dansmithit will be contextually relevant,15:53
tssuryadansmith, mriedem: wait the instance.save(expected_Task_state=None) is happening in the manager no ?15:53
dansmithwhere compute manager won't know what the eff it means15:53
tssuryaand the instance.save(expected_task_state=task_states.POWERING_ON)15:54
tssuryawill happen in the driver15:54
mriedemtssurya: no, your code is doing that in the driver15:54
dansmithtssurya: I don't think so15:54
mriedemit's like the only thing the driver does15:54
mriedemhttps://review.opendev.org/#/c/645611/12/nova/virt/ironic/driver.py15:54
tssuryaok so we record an instance action start before setting the task_state to POWERING_ON ?15:54
mriedemthe instance action stuff in the api doesn't change15:54
mriedemimo15:54
dansmithI dunno15:55
mriedemthat's how you know from the api if the event failed or not15:55
tssuryaso isn't it confusing to the user that the task_state hasn't changed15:55
tssuryabut there is an instance-action record/notification15:55
tssuryaemitted15:55
dansmithmriedem: if I send this event for libvirt instance, it will record an action but nothing really happened15:55
*** rpittau is now known as rpittau|afk15:55
mriedemdansmith: and i'll say, you shouldn't really do that15:55
mriedemsame with trying to swap volumes on anything other than libvirt15:55
dansmith:/15:56
*** tesseract has quit IRC15:56
mriedemor extend an attached volume, or do host-assised snapshot, or anything with pmems :)15:57
mriedemso https://review.opendev.org/#/c/645611/12/nova/compute/api.py@250 would move to the compute manager, or the ironic driver?15:58
dansmithyeah, well, this is a minor detail15:58
mriedemi'm cool with that moving to the driver too15:58
mriedemuntil some other driver needs this same thing for whatever reason15:58
dansmithI would think it moves to the driver, and really doesn't need to be all that detailed I would think, but yeah15:58
dansmithI mean, I guess it could stay, I don't really care that much15:59
tssuryaok I can move that whole chunk with the no-op15:59
tssuryaall into the driver15:59
tssuryaonly the instance action and notification stuff remain in place in the api and manager right ?15:59
mriedemthe state checking in the api could save you some rpc traffic i guess, but you could be racing either way15:59
tssuryabut then again, is it ok for the task_state to still be None when we emit the state notification ?16:00
dansmiththe thing I think makes the biggest difference, is changing the power state before we know if we're going to do anything16:00
tssuryastart*16:00
dansmithI don't know the semantics of any expectations around that16:00
tssuryaok yea then maybe its fine :)16:01
dansmithI didn't really think notifications are required to be realtime and synchronous, so I don't think it'd matter16:01
mriedemtssurya: you can set the task_state on the instance in the compute manager before calling the driver method if we care - but then you have to handle UnexpectedTaskStateError there as well16:01
dansmithbut if we do that,16:01
dansmithwe16:01
dansmithare changing it for non-ironic instances,16:02
tssuryahmm yea16:02
dansmiththen catch the NotImplementedError, and then change it right back, yeah?16:02
mriedemwhich won't actually power off or on the guest16:02
mriedem@reverts_task_state would set the task_state back to None yes16:02
mriedembut i'm not too worried about non-ironic instances because this is admin-only api stuff and if someone is abusing it for non-ironic instances and it doesn't work as expected....meh?16:03
dansmithokay16:03
tssuryaso what's it going to be: manager or driver ?16:03
tssuryaat this point I don't mind either16:03
dansmith"at this point" .. translation: "at this point I'm about to shoot both of you guys"16:04
mriedemi forgot the question16:04
tssuryadansmith: ;)16:04
dansmithtssurya: I think it doesn't matter that much, figure it out when you move the code and it's probably okay16:04
tssuryamriedem: meaning we move the task state and https://review.opendev.org/#/c/645611/12/nova/compute/api.py@25016:04
dansmithtssurya: also when are you actually leaving?16:04
*** tssurya has quit IRC16:05
mriedemright now :)16:05
*** tssurya has joined #openstack-nova16:05
dansmithlol16:05
dansmith#ragequit16:05
mriedem(11:04:47 AM) dansmith: tssurya: also when are you actually leaving?16:05
tssuryasorry bad connection16:05
tssuryaI leave on Friday16:05
dansmithokay16:05
tssuryaso I can work on it this evening and tomorrow16:05
dansmithokay cool16:06
dansmithtssurya: so I said I think manager vs. driver probably doesn't matter that much, you choose while moving the code and it'll probably be okay16:06
dansmithI don't have a strong opinion without seeing it, so maybe just have to pick one and see16:06
tssuryaok awesome thanks16:06
dansmithmriedem: sound okay?16:06
tssuryaso first I'll try with the manager and then we can see16:06
mriedemi'll leave a couple of comments16:06
tssuryathanks a lot dansmith and mriedem :D16:07
*** belmoreira has quit IRC16:07
mriedemdone https://review.opendev.org/#/c/645611/1216:10
*** cdent has quit IRC16:12
tssuryaty16:13
dansmithmriedem: yeah, so assuming your "don't do this for libvirt" assertion I think what you said in there is fine16:13
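The race-handling pattern dansmith and mriedem settle on above — save with expected_task_state=None, and treat a conflict as "somebody else already acted on the instance", logging and dropping the event rather than failing the whole batch — can be sketched self-contained. Names here (FakeInstance, power_update_event) are illustrative stand-ins, not the merged nova code:

```python
class UnexpectedTaskStateError(Exception):
    """Raised when the instance's task_state changed under us."""


class FakeInstance:
    """Toy stand-in for a nova Instance with compare-and-swap save()."""

    def __init__(self):
        self.task_state = None   # persisted value
        self._pending = None     # unsaved local change

    def set_task_state(self, value):
        self._pending = value

    def save(self, expected_task_state):
        # Mimics instance.save(expected_task_state=...): only persist if
        # the stored task_state still matches what the caller expected.
        if self.task_state != expected_task_state:
            raise UnexpectedTaskStateError()
        self.task_state = self._pending


def power_update_event(instance, target_state):
    """Return True if the event was applied, False if we lost the race."""
    instance.set_task_state(
        'powering-on' if target_state == 'on' else 'powering-off')
    try:
        instance.save(expected_task_state=None)
    except UnexpectedTaskStateError:
        # A concurrent user action (e.g. a reboot) already set a
        # task_state; drop this event so the rest of the events in the
        # same request still get processed.
        return False
    return True
```

This keeps the compute manager oblivious: the driver either applies the event or quietly drops it, which is the "just logs and returns" behavior agreed on above.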
openstackgerritMerged openstack/nova stable/stein: Fix misuse of nova.objects.base.obj_equal_prims  https://review.opendev.org/67628916:13
*** cfriesen has quit IRC16:13
*** ricolin has quit IRC16:29
*** dklyle has quit IRC16:36
*** dklyle has joined #openstack-nova16:37
*** tssurya has quit IRC16:39
*** belmoreira has joined #openstack-nova16:39
*** ociuhandu_ has joined #openstack-nova16:41
*** belmoreira has quit IRC16:41
alex_xudansmith: mriedem sean-k-mooney anyway thanks for your time today, I indeed used a lot of it today :)16:42
*** fungi has quit IRC16:42
*** fungi has joined #openstack-nova16:43
dansmithalex_xu: it's okay, and I think sean-k-mooney has more than 24 hours in his days, so he has plenty to spare :)16:43
*** shilpasd has quit IRC16:43
mriedemno problem16:43
*** ociuhandu has quit IRC16:44
*** markvoelker has quit IRC16:44
dansmithmars has 25 hour long days, so maybe sean-k-mooney is from mars16:44
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: Hook resource_tracker to remove stale node information  https://review.opendev.org/67646116:45
*** ociuhandu_ has quit IRC16:45
*** adrianc has quit IRC16:50
*** markvoelker has joined #openstack-nova16:51
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: rt: only map compute node if we created it  https://review.opendev.org/67646316:52
*** adrianc has joined #openstack-nova16:52
*** mrjk has quit IRC16:54
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: Hook resource_tracker to remove stale node information  https://review.opendev.org/67646116:54
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: rt: only map compute node if we created it  https://review.opendev.org/67646316:55
*** ociuhandu has joined #openstack-nova16:56
*** psachin has quit IRC16:57
*** tbachman has quit IRC16:57
*** derekh has quit IRC17:00
*** ociuhandu has quit IRC17:00
efrieddustinc: Ima rebase the sdk series and bump ksa/sdk requirements and get it tracking against this devstack patch https://review.opendev.org/#/c/676268/17:00
efriedas soon as sdk 0.34.0 lands in u-c17:01
efriedwhen I say rebase - it's currently based on a really old nova commit, so I'm going to bring it up to current master.17:01
openstackgerritMatt Riedemann proposed openstack/nova stable/ocata: Hook resource_tracker to remove stale node information  https://review.opendev.org/67646717:05
*** mrjk has joined #openstack-nova17:08
roukis there a particular reason why OS-FLV-DISABLED:disabled is not exposed via api client? is it intended to be manual api queries only?17:12
*** spsurya has quit IRC17:14
rouki see an old proposed patch from... a long time ago, but it was abandoned.17:15
dansmithefried: on this:17:18
dansmithhttps://review.opendev.org/#/c/675705/2/nova/tests/functional/regressions/test_bug_1839560.py17:18
efriedyah17:19
dansmithefried: you're just noting that in real life we have a service with hostname "host1" but would normally get uuids for the node names, is that right?17:19
efrieddansmith: It appears to me as though, in real life, we actually go and create a node for the host, and then delete it when it doesn't come back in the list of nodes.17:19
efriedand because that messes with the test, it was easier to name the host 'node1', so that we don't do that deletion.17:20
dansmithwe *don't* do that in real life though17:20
efriedwell17:21
efriedwhat I did was pull down the patch, change node1 to host1 where it was talking about the host, and run the test.17:21
efriedand saw the message indicating that we deleted the host1 "node"17:21
dansmithright, it's just an artifact of the test that we create a host and node from the same name17:21
dansmithbut in real life that's not how that works,17:21
efriedoh, it's start_service that's doing that?17:22
dansmithit's just because of the test plumbing17:22
dansmithno, I think it's because of the fake driver and the service which has already been started by the time we get to the test17:22
dansmithwell,17:22
dansmithmaybe not been started,17:22
dansmithbut because we're using the fake driver,17:22
dansmithwe've already run get_available_nodes() by the time start_service has returned17:22
dansmithand that returns the same name as host.hostname or something17:23
dansmithso when he then overrides the node list to be node1, node2, you'd get a deletion of the host1 node that was already created17:23
dansmithanyway, I think your "I pulled down the patch" comment explains what I was asking, just that you were noticing test behavior and not concerned about relevance to real life or something else17:24
efriedI was concerned about relevance to real life. But if you're telling me it's just test harness noise, I'm okay with that.17:25
dansmithack17:25
efriedhowever, if it is test harness noise, I feel like we ought to be able to fix it.17:25
efriedmore easily than if it's in the real code.17:25
dansmithwe could for sure (just made a comment to that effect)17:26
dansmithI just don't think it's that important17:26
dansmithnormally having them match is handy17:26
efried"normally" not-ironic17:26
efriedno doubt17:26
efriedbut for ironic cases in general I would think it's never handy, n/a most of the time, but in certain cases like this one really confusing.17:27
dansmithyes, normally not ironic17:27
dansmithyep17:27
dansmithwell, yep to the first part, not the second17:27
dansmithI don't really think it's confusing, but probably just because I know the details17:28
dansmiththe test isn't testing the ironic case, it's testing the bring-back-the-dead case,17:28
dansmithwhich is related to how it works with ironic, but... not entirely ironic specific17:28
dansmithANYway17:28
efriedwe could add to the fragility of FakeDriver's init_host by special-casing another CONF.host, say ironic-compute, and making self._nodes [] in that case.17:30
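The stale-node cleanup dansmith describes — the resource tracker deleting any compute-node record no longer reported by the driver's get_available_nodes() — reduces to a set difference, which is why a test host named "host1" gets its auto-created "host1" node deleted once the fake driver starts reporting only node1/node2. A minimal illustrative helper (clean_stale_nodes is a hypothetical name, not nova's actual method):

```python
def clean_stale_nodes(db_node_names, driver_node_names):
    """Return the compute-node names the tracker would delete as stale.

    Any node recorded in the DB but absent from the driver's reported
    node list is considered orphaned and gets removed.
    """
    return sorted(set(db_node_names) - set(driver_node_names))
```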
*** priteau has quit IRC17:37
*** tbachman has joined #openstack-nova17:43
*** dpawlik has joined #openstack-nova17:43
*** ivve has joined #openstack-nova17:52
*** mjozefcz has quit IRC18:00
*** gyee has joined #openstack-nova18:06
*** ociuhandu has joined #openstack-nova18:11
*** eharney_ has joined #openstack-nova18:15
*** dpawlik has quit IRC18:15
*** eharney has quit IRC18:17
mriedemefried: fwiw,18:19
mriedemwhen we had a fake.set_nodes global i could have used that to avoid this problem,18:19
mriedembut since that was removed, you get the thing described above18:19
mriedem"so when he then overrides the node list to be node1, node2, you'd get a deletion of the host1 node that was already created"18:19
mriedemso i could have started with host1:host1, and then returned available nodes host1:host1 and host1:node2 but that's also weird - which was my reply18:20
efriedI let it go, but if y'all keep talking about it, I'm going to have to fix it.18:25
openstackgerritMerged openstack/nova master: lxc: make use of filter python3 compatible  https://review.opendev.org/67626318:25
mriedemdo you want to talk about AZ design in a new private cloud instead?18:26
mriedemsean-k-mooney: do you want to backport https://review.opendev.org/#/c/676263/ ?18:26
*** slaweq has joined #openstack-nova18:29
*** ociuhandu has quit IRC18:34
openstackgerritMatt Riedemann proposed openstack/nova stable/ocata: Log compute node uuid when the record is created  https://review.opendev.org/67648718:45
*** maciejjozefczyk has joined #openstack-nova18:53
*** maciejjozefczyk has quit IRC18:59
*** maciejjozefczyk has joined #openstack-nova19:04
sean-k-mooneymriedem: i can backport it. how far?19:15
sean-k-mooneymriedem: also i looked at what devstack is doing19:15
sean-k-mooneyit's using a cirros image for lxc by default19:15
sean-k-mooneyi have not unpacked it, but they use a different, or rather non-standard, init system19:15
sean-k-mooneyso that could be the issue with the other error19:16
sean-k-mooneyi might try to set up an lxc deployment locally and see if i can figure it out later in the week19:16
*** boxiang has quit IRC19:19
openstackgerritEric Fried proposed openstack/nova master: Enhance SDK fixture for 0.34.0  https://review.opendev.org/67649519:20
efriedmordred: ^19:20
mordredefried: ++ I think that's the right call19:21
efriedmriedem, dansmith: can I get a fast approval (assuming zuul happy) for this --^ please? It's blocking u-c for openstacksdk https://review.opendev.org/#/c/676457/19:21
openstackgerritsean mooney proposed openstack/nova stable/stein: lxc: make use of filter python3 compatible  https://review.opendev.org/67649619:21
sean-k-mooneyif i want to cherry-pick back through multiple branches, gerrit does not update the "cherry picked from commit" line unless it's merged, right19:23
sean-k-mooneyso i should either do it via the command line or update it manually?19:24
openstackgerritsean mooney proposed openstack/nova stable/rocky: lxc: make use of filter python3 compatible  https://review.opendev.org/67649819:25
openstackgerritsean mooney proposed openstack/nova stable/queens: lxc: make use of filter python3 compatible  https://review.opendev.org/67650019:25
*** beagles_pto has quit IRC19:26
openstackgerritsean mooney proposed openstack/nova stable/pike: lxc: make use of filter python3 compatible  https://review.opendev.org/67650219:26
sean-k-mooneyok i have cherry-picked it back to pike since that is when we added python3 support; now to fix all the topics19:27
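The command-line route sean-k-mooney asks about works because `git cherry-pick -x` records the source SHA itself, rather than waiting for gerrit to fill it in after the pick merges. A self-contained demo in a throwaway repo (branch and commit messages are placeholders):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev
echo a > f
git add f
git commit -qm "base"
git branch stable/pike
echo b > f
git commit -qam "lxc: make use of filter python3 compatible"
sha=$(git rev-parse HEAD)
# Back-port onto the stable branch; -x appends the provenance line
# "(cherry picked from commit <sha>)" to the commit message.
git checkout -q stable/pike
git cherry-pick -x "$sha" >/dev/null
git log -1 --format=%B | grep "cherry picked from commit $sha"
```

Repeating the checkout/cherry-pick pair for each stable branch gives every backport a correct provenance line up front, instead of fixing the messages up manually afterwards.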
mriedemefried: done19:31
efriedthanks19:32
*** kaisers has quit IRC19:32
mriedemnice to know that after talking about what to do for AZs in a new cloud for a couple of hours that my initial, "at this point i don't think we need any AZs" comment turned out to be the correct one19:33
mriedem$$$ professional $$$19:33
*** kaisers has joined #openstack-nova19:34
*** eharney_ has quit IRC19:38
*** shilpasd has joined #openstack-nova19:40
sean-k-mooneythe only thing i have ever used AZs for is to give people the choice of the old servers, the new servers or the ci servers19:40
openstackgerritMatt Riedemann proposed openstack/nova stable/stein: Add functional regression recreate test for bug 1839560  https://review.opendev.org/67650719:47
openstackbug 1839560 in OpenStack Compute (nova) "ironic: moving node to maintenance makes it unusable afterwards" [High,In progress] https://launchpad.net/bugs/1839560 - Assigned to Matt Riedemann (mriedem)19:47
mriedemsean-k-mooney: which you could do with host aggregates and flavors tied to those aggregates19:49
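The aggregate/flavor pairing mriedem mentions relies on nova's AggregateInstanceExtraSpecsFilter: aggregate metadata is matched against flavor extra specs carrying the `aggregate_instance_extra_specs:` prefix. A hedged sketch of the CLI flow (the aggregate, host, and flavor names are placeholders):

```
# Tag an aggregate of the "new" servers and tie a flavor to it.
openstack aggregate create --property hw_gen=new new-servers
openstack aggregate add host new-servers compute-07
openstack flavor create --ram 4096 --disk 40 --vcpus 2 m1.new
openstack flavor set m1.new \
    --property aggregate_instance_extra_specs:hw_gen=new
```

With the filter enabled in the scheduler, booting with m1.new can only land on hosts in new-servers — giving users the old/new/ci choice without defining availability zones.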
*** nweinber__ has quit IRC20:00
*** tbachman has quit IRC20:00
mriedemdropping os-acc before we even have a working nova/cyborg integration seems like jumping the gun20:02
efriedmordred: stubbing Adapter.get_endpoint is too big a hammer. I'm struggling to come up with another approach...20:06
*** luksky has joined #openstack-nova20:07
mordredefried: poop. my brain is towards the end of its hours of effectiveness in terms of coming up with useful ideas... I'll dig in to it first thing in the morning when I'm fresh.20:07
mordredthere's got to be something20:07
efriedmordred: problem is a bunch of other things hit Adapter and Adapter.get_endpoint20:08
mordredyeah20:08
openstackgerritMatt Riedemann proposed openstack/nova stable/stein: Restore soft-deleted compute node with same uuid  https://review.opendev.org/67650920:10
openstackgerritMerged openstack/nova master: Add functional regression recreate test for bug 1839560  https://review.opendev.org/67570520:14
openstackbug 1839560 in OpenStack Compute (nova) stein "ironic: moving node to maintenance makes it unusable afterwards" [High,In progress] https://launchpad.net/bugs/1839560 - Assigned to Matt Riedemann (mriedem)20:14
openstackgerritMerged openstack/nova master: Restore soft-deleted compute node with same uuid  https://review.opendev.org/67549620:14
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: Add functional regression recreate test for bug 1839560  https://review.opendev.org/67651320:16
openstackbug 1839560 in OpenStack Compute (nova) stein "ironic: moving node to maintenance makes it unusable afterwards" [High,In progress] https://launchpad.net/bugs/1839560 - Assigned to Matt Riedemann (mriedem)20:16
*** tbachman has joined #openstack-nova20:18
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: Restore soft-deleted compute node with same uuid  https://review.opendev.org/67651420:19
mriedemmnaser: backports are all up20:19
mriedemTheJulia: looks like ironic-tempest-dsvm-ipa-wholedisk-bios-agent_ipmitool-tinyipa is busted on stable/rocky,20:20
mriedemseeing this: It looks like a path. File ''/home/zuul/src/opendev.org/openstack/ironic-tempest-plugin'' does not exist.20:21
mriedemqueens and stein are ok20:23
mriedemoh i bet it's this https://github.com/openstack/ironic/commit/ef0fde41e9c78e79d0b3b618102bcb475fa1f69120:27
*** bbowen__ has quit IRC20:29
*** bbowen__ has joined #openstack-nova20:29
TheJuliabut that was on master and stein20:30
mriedem"The has stopped working out of a sudden." doesn't mention the root cause20:30
mriedemwhich might be something that affects rocky as well20:30
TheJuliamriedem: unless your meaning that it might be the fix?20:30
*** tbachman has quit IRC20:30
mriedemyeah20:30
TheJuliamaybe, I'm wondering why we've not seen it on rocky then20:31
TheJuliamriedem: are you guys just invoking the job directly by name, or do you have a definition in nova's config?20:31
mriedemlooks like you have... https://review.opendev.org/#/c/648360/20:31
mriedemwe just use the job name20:31
mriedemhttps://github.com/openstack/nova/blob/stable/rocky/.zuul.yaml#L21320:32
TheJuliainteresting20:32
mriedemyeah stable/rocky is broken for ironic as well,20:32
mriedemi'm working on backporting that fix but there are merge conflicts20:32
TheJuliawell, it was working as of Aug 320:34
TheJuliamriedem: okay, if you're doing that, I'll wait then20:35
*** tbachman has joined #openstack-nova20:41
*** dpawlik has joined #openstack-nova20:45
*** efried has quit IRC20:45
*** efried has joined #openstack-nova20:46
*** maciejjozefczyk has quit IRC21:00
*** eharney_ has joined #openstack-nova21:05
*** slaweq has quit IRC21:06
melwittmriedem: I'm thinking I should fix this up and rebase on it for the multi-cell archive patch https://review.opendev.org/67521821:18
melwittbecause if it does the re-use of the db engine, I won't have to make any changes to the "for table" method21:19
mriedemmelwitt: i forgot all about that patch :(21:21
mriedemi've got part of it fixed, i can push that up once i'm done rebasing the cross-cell-resize series and mark as WIP if you want to swing at it21:21
mriedembut yeah i assumed you'd rebase on that series of fixups21:21
melwittme too kinda, until I went to address review feedback and was like, hm, I think mriedem did something about this in another patch21:22
melwittok, that would be great. thanks21:22
*** markvoelker has quit IRC21:26
*** dpawlik has quit IRC21:35
openstackgerritMatt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_dest compute method  https://review.opendev.org/63329321:38
openstackgerritMatt Riedemann proposed openstack/nova master: Add PrepResizeAtDestTask  https://review.opendev.org/62789021:38
openstackgerritMatt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_source compute method  https://review.opendev.org/63483221:38
openstackgerritMatt Riedemann proposed openstack/nova master: Add nova.compute.utils.delete_image  https://review.opendev.org/63760521:39
openstackgerritMatt Riedemann proposed openstack/nova master: Add PrepResizeAtSourceTask  https://review.opendev.org/62789121:39
openstackgerritMatt Riedemann proposed openstack/nova master: Refactor ComputeManager.remove_volume_connection  https://review.opendev.org/64218321:39
openstackgerritMatt Riedemann proposed openstack/nova master: Add power_on kwarg to ComputeDriver.spawn() method  https://review.opendev.org/64259021:39
openstackgerritMatt Riedemann proposed openstack/nova master: Add finish_snapshot_based_resize_at_dest compute method  https://review.opendev.org/63508021:39
openstackgerritMatt Riedemann proposed openstack/nova master: Add FinishResizeAtDestTask  https://review.opendev.org/63564621:39
openstackgerritMatt Riedemann proposed openstack/nova master: Add Destination.allow_cross_cell_move field  https://review.opendev.org/61403521:39
openstackgerritMatt Riedemann proposed openstack/nova master: Execute CrossCellMigrationTask from MigrationTask  https://review.opendev.org/63566821:39
openstackgerritMatt Riedemann proposed openstack/nova master: Plumb allow_cross_cell_resize into compute API resize()  https://review.opendev.org/63568421:39
openstackgerritMatt Riedemann proposed openstack/nova master: Filter duplicates from compute API get_migrations_sorted()  https://review.opendev.org/63622421:39
openstackgerritMatt Riedemann proposed openstack/nova master: Change HostManager to allow scheduling to other cells  https://review.opendev.org/61403721:39
openstackgerritMatt Riedemann proposed openstack/nova master: Start functional testing for cross-cell resize  https://review.opendev.org/63625321:39
openstackgerritMatt Riedemann proposed openstack/nova master: Handle target host cross-cell cold migration in conductor  https://review.opendev.org/64259121:39
openstackgerritMatt Riedemann proposed openstack/nova master: Validate image/create during cross-cell resize functional testing  https://review.opendev.org/64259221:39
openstackgerritMatt Riedemann proposed openstack/nova master: Add zones wrinkle to TestMultiCellMigrate  https://review.opendev.org/64345021:39
openstackgerritMatt Riedemann proposed openstack/nova master: WIP: Re-use DB engine connection during archive_deleted_rows  https://review.opendev.org/67521821:45
mriedemmelwitt: ^ some test_db_api tests are failing with unique constraint errors, like an instance was archived when it didn't expect it to21:45
melwittmriedem: ok, thanks21:45
mriedemok i fixed it21:49
mriedemstill need to call metadata.bind.connect() per call to _archive_deleted_rows_for_table21:49
mriedemnot entirely sure why...21:49
*** macz has quit IRC21:50
openstackgerritMatt Riedemann proposed openstack/nova master: WIP: Re-use DB MetaData during archive_deleted_rows  https://review.opendev.org/67521821:52
*** dpawlik has joined #openstack-nova21:52
mriedemah crap need to remove the WIP21:52
openstackgerritMatt Riedemann proposed openstack/nova master: Re-use DB MetaData during archive_deleted_rows  https://review.opendev.org/67521821:52
mriedemok now i need to run21:52
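The optimization being iterated on above — build the schema metadata/connection once and reuse it for each per-table archive pass, instead of re-reflecting per call — can be illustrated with a stdlib sqlite3 stand-in. This archive_deleted_rows is a toy, not nova's SQLAlchemy implementation:

```python
import sqlite3


def archive_deleted_rows(conn, tables, max_rows):
    """Move soft-deleted rows into shadow_* tables over one shared connection.

    Returns a dict mapping table name -> number of rows archived.
    """
    archived = {}
    for table in tables:  # the same connection serves every table
        cur = conn.execute(
            "SELECT id FROM %s WHERE deleted != 0 LIMIT ?" % table,
            (max_rows,))
        ids = [row[0] for row in cur.fetchall()]
        if not ids:
            continue
        marks = ",".join("?" * len(ids))
        # Copy the soft-deleted rows to the shadow table, then purge them.
        conn.execute(
            "INSERT INTO shadow_%s SELECT * FROM %s WHERE id IN (%s)"
            % (table, table, marks), ids)
        conn.execute(
            "DELETE FROM %s WHERE id IN (%s)" % (table, marks), ids)
        archived[table] = len(ids)
    conn.commit()
    return archived
```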
*** mriedem has quit IRC21:53
openstackgerritDustin Cowles proposed openstack/nova master: Provider config file schema and loader  https://review.opendev.org/67334121:56
openstackgerritDustin Cowles proposed openstack/nova master: WIP: Public method to retrieve custom resource providers  https://review.opendev.org/67602921:56
openstackgerritDustin Cowles proposed openstack/nova master: WIP: Load the custom resource providers to resource tracker  https://review.opendev.org/67652221:56
*** dpawlik has quit IRC21:56
melwittthanks mriedem++22:00
*** luksky has quit IRC22:03
openstackgerritEric Fried proposed openstack/nova master: Enhance SDK fixture for 0.34.0  https://review.opendev.org/67649522:08
efriedmordred: A big but very focused hammer ^22:08
*** shilpasd has quit IRC22:09
*** artom has quit IRC22:09
mordredefried: wow22:14
mordredefried: I feel like pep8 isn't going to like it - but I do22:14
efriedoh, rat farts, forgot to fix that part...22:14
openstackgerritEric Fried proposed openstack/nova master: Enhance SDK fixture for 0.34.0  https://review.opendev.org/67649522:15
efriedthanks mordred for saving me a failed zuul run on a stupid22:15
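The "big but very focused hammer" efried describes — stubbing Adapter.get_endpoint but only hijacking the one service_type the fixture fakes, passing everything else through untouched — can be sketched with a stand-in class. This Adapter is illustrative, not keystoneauth1's real one, and the URLs are made up:

```python
import unittest.mock as mock


class Adapter:
    """Stand-in for keystoneauth1.adapter.Adapter (illustrative only)."""

    def __init__(self, service_type):
        self.service_type = service_type

    def get_endpoint(self, **kwargs):
        # Pretend this resolved the endpoint from the service catalog.
        return "http://real-catalog/%s" % self.service_type


real_get_endpoint = Adapter.get_endpoint


def fake_get_endpoint(self, **kwargs):
    # Only hijack the service the fixture fakes; every other adapter
    # still hits the real catalog lookup.
    if self.service_type == "baremetal":
        return "http://fake-ironic/v1"
    return real_get_endpoint(self, **kwargs)


with mock.patch.object(Adapter, "get_endpoint", fake_get_endpoint):
    assert Adapter("baremetal").get_endpoint() == "http://fake-ironic/v1"
    assert Adapter("compute").get_endpoint() == "http://real-catalog/compute"
```

The selective pass-through is what keeps the hammer "focused": the patch sits on a widely-used method but only changes behavior for the one service under test.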
*** mlavalle has quit IRC22:17
openstackgerritMerged openstack/nova stable/rocky: Fix misuse of nova.objects.base.obj_equal_prims  https://review.opendev.org/67629022:17
sean-k-mooneygibi: by the way you asked me over a month ago if i had an example of how to configure a host to use the same interface for both ovs and sriov. i am setting up my sriov servers again and this is the local.conf i use http://paste.openstack.org/show/757169/22:18
sean-k-mooneyi'm using port enp1s0f1 as the ovs port on the br-ex, and it's also the PF of the sriov VFs, with the neutron fdb extension enabled to make sure that ovs-vm to sriov-vm traffic does not need to hairpin at the top-of-rack switch22:20
*** markvoelker has joined #openstack-nova22:23
*** eharney_ has quit IRC22:30
*** spatel has quit IRC22:31
*** ivve has quit IRC22:33
*** BjoernT_ has quit IRC22:40
*** markvoelker has quit IRC22:48
*** tkajinam has joined #openstack-nova22:50
*** tyreymer has joined #openstack-nova23:05
*** xek has quit IRC23:14
*** markvoelker has joined #openstack-nova23:25
*** markvoelker has quit IRC23:30
*** tyreymer has quit IRC23:41

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!