Thursday, 2018-12-06

sean-k-mooneyspecifically use this feature https://review.openstack.org/#/c/603352/00:00
tssuryasean-k-mooney: yea thanks, didn't read the full spec yet but got the idea00:03
tssuryaso it would be something like adding/removing the host to/from an aggregate when we disable/enable the service I guess00:04
tssuryaand the fourth solution was traits I see00:05
sean-k-mooneyyep and we then just always include member_of=!<uuid of disabled aggregate>00:05
sean-k-mooneyin the placement request00:05
tssuryasean-k-mooney: yep makes sense00:06
sean-k-mooneywe can use a uuid5 to generate a stable uuid for the aggregate but other services can create their own00:06
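[editor's note: a minimal sketch of the two ideas above — a stable uuid5 aggregate ID plus the forbidden-aggregate query from the spec under review. The namespace string and resource amounts are invented examples, not anything agreed in the spec:]

```python
import uuid

# uuid5 is deterministic: every service derives the same aggregate UUID
# from the same name, so nothing needs to be shared out of band.
DISABLED_AGG_UUID = uuid.uuid5(uuid.NAMESPACE_DNS, 'nova.disabled-hosts')

# Query parameters for GET /allocation_candidates using the negative
# member_of syntax proposed in https://review.openstack.org/#/c/603352/
params = {
    'resources': 'VCPU:1,MEMORY_MB:512,DISK_GB:10',
    'member_of': '!%s' % DISABLED_AGG_UUID,
}
```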
tssuryayea, as long as the aggregate doesn't get stale (as in a disruption during enable that fails to remove the host from the aggregate) / we keep it in sync, it should work well.00:07
tssuryawhich is a corner case00:08
sean-k-mooneytssurya: well we could take care of that with a periodic task to fix it00:08
sean-k-mooneye.g. we update it immediately from the api but heal it later if the update got lost00:09
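[editor's note: a rough sketch of the heal-it-later idea on a conductor, assuming oslo.service periodic tasks; the interval and every underscore-prefixed helper are placeholders invented for illustration, not a design:]

```python
from oslo_service import periodic_task

class AggregateHealerTasks(periodic_task.PeriodicTasks):
    """Hypothetical (cell/super)conductor-side reconciliation task."""

    @periodic_task.periodic_task(spacing=3600)  # placeholder; a real option
    def _heal_disabled_aggregate(self, context):  # could allow disabling it
        # Both lookups below are invented helpers for illustration.
        disabled = {s.host for s in _disabled_compute_services(context)}
        members = _placement_aggregate_hosts(DISABLED_AGG_UUID)
        # The API already updated the aggregate on disable/enable; this
        # only repairs membership if that update got lost.
        for host in disabled - members:
            _add_host_to_aggregate(DISABLED_AGG_UUID, host)
        for host in members - disabled:
            _remove_host_from_aggregate(DISABLED_AGG_UUID, host)
```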
*** rodolof has joined #openstack-nova00:09
tssuryasean-k-mooney: after looking at the RT's periodic task, I am so done with periodic tasks as of now :D because of the overhead when having several computes00:09
tssuryabut yea as long as the interval is not so frequent, should be ok00:10
sean-k-mooneyhaha ya but this would run on i guess the conductor00:10
sean-k-mooneyor super conductor00:10
sean-k-mooneythe node status is in i think both the api and cell db so we don't need to run this on every compute node00:11
tssuryanode status is only in the cell db's00:11
sean-k-mooneyok so but it could run on the cell conductor then00:11
tssuryayea could be00:12
sean-k-mooneyand you could set the interval to a week if you wanted and this time support disabling it right out of the box by setting it to 000:12
*** _alastor_ has joined #openstack-nova00:12
tssuryabut not sure if it's an upcall when updating the api aggregate table from the cell conductor but yea these are implementation details00:13
sean-k-mooneyif you trust it to not get out of sync or you will fix it later then no need to run it00:13
sean-k-mooneytssurya: oh I'm talking about placement aggregates not nova host aggregates by the way00:13
tssuryasean-k-mooney: oh okay!00:14
tssuryasince you said "nova" adds hosts, got confused00:15
sean-k-mooneyya i should have said nova updates the placement aggregate with the compute node RP uuid00:15
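[editor's note: for concreteness, what "nova updates the placement aggregate with the compute node RP uuid" might look like against the placement REST API. The `placement` object below is an assumed keystoneauth-style adapter; error handling and generation-conflict retries are omitted:]

```python
# At microversion 1.19 the GET body carries the resource provider
# generation that a safe PUT must echo back.
resp = placement.get('/resource_providers/%s/aggregates' % rp_uuid,
                     microversion='1.19')
body = resp.json()

aggs = set(body['aggregates'])
aggs.add(str(DISABLED_AGG_UUID))

placement.put('/resource_providers/%s/aggregates' % rp_uuid,
              json={'aggregates': sorted(aggs),
                    'resource_provider_generation':
                        body['resource_provider_generation']},
              microversion='1.19')
```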
tssuryaright yea00:16
*** gyee has quit IRC00:17
*** _alastor_ has quit IRC00:17
*** wolverineav has quit IRC00:27
*** spatel has joined #openstack-nova00:29
*** wolverineav has joined #openstack-nova00:30
*** spatel has quit IRC00:34
*** jaosorior has quit IRC00:37
*** Swami has quit IRC00:52
*** Belgar81 has joined #openstack-nova00:55
*** brinzhang has joined #openstack-nova01:09
*** tommylikehu_ has joined #openstack-nova01:10
*** k_mouza has joined #openstack-nova01:13
*** k_mouza has quit IRC01:17
jaypipescfriesen: you have awoken the kraken. but unfortunately, the kraken is too tired from playing tennis to fight tonight :) so, will chat tomorrow about it.01:28
*** markvoelker has quit IRC01:33
*** betherly has joined #openstack-nova01:40
*** sapd1 has quit IRC01:40
*** sapd1 has joined #openstack-nova01:40
*** betherly has quit IRC01:44
*** david-lyle has joined #openstack-nova01:48
*** manjeets_ has joined #openstack-nova01:49
*** itlinux has joined #openstack-nova01:49
*** dklyle has quit IRC01:51
*** manjeets has quit IRC01:51
*** _alastor_ has joined #openstack-nova02:13
*** mschuppert has quit IRC02:15
*** mrsoul has quit IRC02:15
*** Dinesh_Bhor has joined #openstack-nova02:16
*** _alastor_ has quit IRC02:18
*** spatel has joined #openstack-nova02:22
*** cfriesen has quit IRC02:22
openstackgerritJack Ding proposed openstack/nova master: Preserve UEFI NVRAM variable store  https://review.openstack.org/62164602:23
openstackgerritGuo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication  https://review.openstack.org/62312002:33
*** hongbin has joined #openstack-nova02:35
*** mhen has quit IRC02:36
*** mhen has joined #openstack-nova02:37
*** wolverineav has quit IRC02:40
*** wolverineav has joined #openstack-nova02:41
openstackgerritGuo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication  https://review.openstack.org/62312002:44
*** wolverineav has quit IRC02:46
openstackgerritGuo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication  https://review.openstack.org/62312002:46
*** betherly has joined #openstack-nova02:51
openstackgerritGuo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication  https://review.openstack.org/62312002:51
*** imacdonn has quit IRC02:53
*** imacdonn has joined #openstack-nova02:53
*** betherly has quit IRC02:55
openstackgerritGuo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication  https://review.openstack.org/62312003:11
*** Dinesh_Bhor has quit IRC03:15
*** wolverineav has joined #openstack-nova03:21
*** rodolof has quit IRC03:23
*** Dinesh_Bhor has joined #openstack-nova03:23
*** wolverineav has quit IRC03:26
openstackgerritZhenyu Zheng proposed openstack/nova master: Handle tags in _bury_in_cell0  https://review.openstack.org/62185603:26
*** wolverineav has joined #openstack-nova03:30
*** psachin has joined #openstack-nova03:32
*** wolverineav has quit IRC03:34
*** tssurya has quit IRC03:42
openstackgerritTakashi NATSUME proposed openstack/nova stable/rocky: Add a bug tag for nova doc  https://review.openstack.org/62313003:43
*** wolverineav has joined #openstack-nova03:46
*** rodolof has joined #openstack-nova03:49
*** wolverineav has quit IRC04:02
*** wolverineav has joined #openstack-nova04:03
*** wolverineav has quit IRC04:07
openstackgerritGuo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication  https://review.openstack.org/62312004:16
*** rodolof has quit IRC04:23
*** betherly has joined #openstack-nova04:32
*** hongbin has quit IRC04:33
*** janki has joined #openstack-nova04:34
*** betherly has quit IRC04:37
openstackgerritGuo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication  https://review.openstack.org/62312004:52
*** lpetrut has joined #openstack-nova04:52
*** lpetrut has quit IRC05:36
*** Dinesh_Bhor has quit IRC05:36
*** Dinesh_Bhor has joined #openstack-nova05:42
*** wolverineav has joined #openstack-nova05:43
*** spatel has quit IRC05:46
*** ratailor has joined #openstack-nova05:46
*** wolverineav has quit IRC05:50
*** Dinesh_Bhor has quit IRC05:57
*** Dinesh_Bhor has joined #openstack-nova06:12
*** rambo_li has joined #openstack-nova06:16
*** Dinesh_Bhor has quit IRC06:18
*** sridharg has joined #openstack-nova06:20
*** Dinesh_Bhor has joined #openstack-nova06:36
*** rambo_li has quit IRC06:42
openstackgerritTakashi NATSUME proposed openstack/nova master: api-ref: Body verification for the lock action  https://review.openstack.org/62283506:58
openstackgerritZhenyu Zheng proposed openstack/nova master: Handle tags in _bury_in_cell0  https://review.openstack.org/62185607:00
*** wolverineav has joined #openstack-nova07:05
*** wolverineav has quit IRC07:10
*** dpawlik has joined #openstack-nova07:28
openstackgerritMerged openstack/nova master: Update mailinglist from dev to discuss  https://review.openstack.org/62182707:41
openstackgerritMerged openstack/nova master: modify the avaliable link  https://review.openstack.org/61690507:41
*** rcernin has quit IRC07:56
*** pcaruana has joined #openstack-nova07:58
*** pcaruana is now known as muttley07:58
*** maciejjozefczyk has joined #openstack-nova08:01
*** maciejjozefczyk has quit IRC08:01
*** maciejjozefczyk has joined #openstack-nova08:02
*** takashin has left #openstack-nova08:03
*** rcernin has joined #openstack-nova08:03
*** lpetrut has joined #openstack-nova08:06
*** slaweq has joined #openstack-nova08:06
*** awalende has joined #openstack-nova08:13
*** slaweq has quit IRC08:15
*** helenafm has joined #openstack-nova08:16
*** _alastor_ has joined #openstack-nova08:16
*** ralonsoh has joined #openstack-nova08:21
*** _alastor_ has quit IRC08:21
*** rcernin has quit IRC08:33
*** sahid has joined #openstack-nova08:33
*** brinzhang has quit IRC08:38
*** brinzhang has joined #openstack-nova08:38
*** tommylikehu_ has quit IRC08:40
*** Dinesh_Bhor has quit IRC08:41
*** mhen has quit IRC08:54
*** Dinesh_Bhor has joined #openstack-nova08:54
*** lpetrut has quit IRC08:57
*** mhen has joined #openstack-nova09:00
*** markvoelker has joined #openstack-nova09:01
*** awalende has quit IRC09:01
*** awalende has joined #openstack-nova09:02
*** awalende has quit IRC09:06
*** awalende has joined #openstack-nova09:08
*** maciejjozefczyk has quit IRC09:16
gibistephenfin: regarding sorted notification payload, would you like to see the samples stored in nova sorted only or also the json emitted on the message bus?09:16
*** k_mouza has joined #openstack-nova09:18
*** k_mouza has quit IRC09:22
*** k_mouza has joined #openstack-nova09:23
gibimdbooth: in the failure in nova.tests.functional.regressions.test_bug_1550919.LibvirtFlatEvacuateTest in http://logs.openstack.org/22/606122/7/check/nova-tox-functional/1f3126b/testr_results.html.gz there is a 9 second gap between '[nova.virt.libvirt.driver] Creating image' and the first polling of the server state. So something is definitely slow there _before_ the test even tries to wait for 5 seconds09:23
gibito see the server reach ACTIVE state09:23
mdboothgibi: looking09:24
mdboothgibi: That's one of the ones which looked 'generally unhappy' I think.09:24
gibimdbooth: unfortunately we don't have such timing data from successful runs :/09:25
mdboothgibi: Or DEBUG logs :(09:26
gibidoes 'Creating image' mean that in these tests we really generate the root fs for the instance?09:26
mdboothgibi: No. The disk creation is stubbed to just touch the file.09:27
*** ttsiouts has joined #openstack-nova09:27
mdboothHowever, it does execute _resolve_driver_format in imagebackend, which executes qemu-img.09:27
mdboothI have a patch locally to mock that out, but on my unloaded system it doesn't make a massive difference.09:28
mdboothI can see that it's about 3 seconds of wall clock time, although multithreaded.09:28
*** dtantsur|afk is now known as dtantsur09:29
gibimdbooth: I see. Honestly I don't know what could be the real problem. In the run http://logs.openstack.org/22/606122/7/check/nova-tox-functional/1f3126b/testr_results.html.gz I see missing DB table exceptions as well, which reminds me of the race we fixed with cdent last week in the placement db fixture. Maybe that race had other side effects. However we only fixed that in the split out placement repo.09:32
mdboothgibi: Like I said yesterday, I suspect it's just a canary: the first thing to die when conditions get bad.09:34
mdboothI could mock another couple of things out, and increase the timeout.09:34
*** markvoelker has quit IRC09:34
gibimdbooth: could very well be it. Still it would be nice to know why 5 seconds is not enough in these tests. But I know that we don't have data to figure that out09:35
mdboothgibi: Why don't we have debug enabled, btw?09:35
gibimdbooth: I'm not sure, maybe we don't want to store that much of logs.09:36
*** maciejjozefczyk has joined #openstack-nova09:36
gibimdbooth: but if you can propose a patch that turns the debug log on, I can support that. and we will see if there are other opinions09:37
*** ondrejme has quit IRC09:38
gibistephenfin: I can try to propose the sorting patch and see if people start puking09:42
gibistephenfin: anyhow, thanks for the reviews on those patches09:43
mdboothgibi: Thanks for spending time on this.09:43
gibimdbooth: I <3 mysteries :)09:43
* mdbooth too :)09:44
mdboothSometimes, anyway.09:44
*** derekh has joined #openstack-nova09:44
*** cdent has joined #openstack-nova09:54
*** tssurya has joined #openstack-nova09:59
*** ttsiouts has quit IRC10:00
*** ttsiouts has joined #openstack-nova10:01
*** ttsiouts has quit IRC10:05
*** brinzhang has quit IRC10:08
*** yan0s has joined #openstack-nova10:08
*** brinzhang has joined #openstack-nova10:08
yan0sI have a question about nova policies10:10
yan0sare variables "compute:create", "compute:get" etc deprecated?10:11
yan0sI see they are not mentioned in the latest documentation10:12
yan0shttps://docs.openstack.org/nova/rocky/configuration/policy.html10:12
yan0salso testing some of them, it seems they don't affect access rights10:13
yan0sonly their "os_compute_api" equivalents work10:13
kashyapgibi: Heya, Zuul was "-2" on this: https://review.openstack.org/#/c/620327/.  Guess if I do a 'recheck', I'll "lose" all the ACKs?10:19
gibikashyap: I don't think so10:20
kashyapSeems to be a 'neutron-grenade' failure10:20
gibikashyap: if you checked the failure and it seems unrelated to your patch then feel free to recheck10:20
kashyapYeah, they're unrelated; I can't spot anything related to my patch: http://logs.openstack.org/27/620327/3/gate/neutron-grenade/8ffe31b/job-output.txt.gz10:20
gibikashyap: yeah the cells-v1 test failure seems unrelated10:23
gibikashyap: and I dont see anything suspicious in the grenade log either10:24
kashyapThanks!  I hit a 'recheck'10:24
gibikashyap: btw, you only lose the +W if you need to rebase10:24
* kashyap hates to do mindless rechecks; so as not to waste the resources10:24
kashyapgibi: Ah, noted.10:24
*** erlon has joined #openstack-nova10:31
*** markvoelker has joined #openstack-nova10:31
*** k_mouza has quit IRC10:34
*** ttsiouts has joined #openstack-nova10:34
*** mvkr has joined #openstack-nova10:36
*** Dinesh_Bhor has quit IRC10:37
*** maciejjozefczyk has quit IRC10:45
*** maciejjozefczyk has joined #openstack-nova10:46
lyarwoodmdbooth: https://review.openstack.org/#/c/619804/ - I know you're busy with your own func test hell but can you add this to your queue today, happy to close it out if it isn't useful.10:48
*** ttsiouts has quit IRC10:50
mdboothlyarwood: Oh, nice. Does it work?10:50
*** ttsiouts has joined #openstack-nova10:50
mdboothlyarwood: I see you're using it in the patch above.10:50
lyarwoodmdbooth: I was, but I've dropped the patch above now10:50
lyarwoodmdbooth: so I either rebase this on your stuff or kill it10:51
lyarwoodmdbooth: and it works, I think.10:51
*** Belgar81 has quit IRC10:54
*** k_mouza has joined #openstack-nova10:55
*** ttsiouts has quit IRC10:55
*** markvoelker has quit IRC11:05
*** awalende has quit IRC11:15
*** k_mouza has quit IRC11:18
*** awalende has joined #openstack-nova11:20
*** ttsiouts has joined #openstack-nova11:20
*** awalende has quit IRC11:24
openstackgerritMerged openstack/nova master: Clean up cpu_shared_set config docs  https://review.openstack.org/61486411:28
openstackgerritMerged openstack/nova master: Delete NeutronLinuxBridgeInterfaceDriver  https://review.openstack.org/61699511:29
*** tbachman has quit IRC11:33
*** ttsiouts has quit IRC11:43
*** ttsiouts has joined #openstack-nova11:44
*** ttsiouts has quit IRC11:48
openstackgerritSurya Seetharaman proposed openstack/nova master: Add DownCellFixture  https://review.openstack.org/61481011:53
openstackgerritSurya Seetharaman proposed openstack/nova master: API microversion 2.68: Handles Down Cells  https://review.openstack.org/59165711:53
*** awalende has joined #openstack-nova11:53
*** awalende has quit IRC11:58
*** arches has joined #openstack-nova12:01
openstackgerritChris Dent proposed openstack/nova master: Correct lower-constraints.txt and the related tox job  https://review.openstack.org/62297212:04
*** arches has left #openstack-nova12:05
*** psachin has quit IRC12:08
*** awalende has joined #openstack-nova12:09
*** sambetts_ has quit IRC12:13
*** sambetts_ has joined #openstack-nova12:15
*** _alastor_ has joined #openstack-nova12:17
*** k_mouza has joined #openstack-nova12:20
*** _alastor_ has quit IRC12:23
*** k_mouza has quit IRC12:25
openstackgerritGuo Jingyu proposed openstack/nova master: Add rfb.VNC support for novncproxy  https://review.openstack.org/62233612:25
*** sambetts_ has quit IRC12:26
*** mriedem has joined #openstack-nova12:27
*** ttsiouts has joined #openstack-nova12:28
*** sambetts_ has joined #openstack-nova12:28
*** awalende has quit IRC12:29
*** erlon has quit IRC12:29
*** k_mouza has joined #openstack-nova12:30
openstackgerritZhenyu Zheng proposed openstack/nova master: Handle tags in _bury_in_cell0  https://review.openstack.org/62185612:33
*** ttsiouts has quit IRC12:40
*** ttsiouts has joined #openstack-nova12:41
*** udesale has joined #openstack-nova12:41
*** erlon has joined #openstack-nova12:43
*** brinzhang has quit IRC12:44
*** ratailor has quit IRC12:45
*** ttsiouts has quit IRC12:46
*** awalende has joined #openstack-nova12:50
*** betherly has joined #openstack-nova12:52
*** eharney has quit IRC12:54
*** arches has joined #openstack-nova12:54
*** ttsiouts has joined #openstack-nova12:58
mriedemtssurya: belmoreira: i'm not sure what i'm missing here https://bugs.launchpad.net/nova/+bug/180598912:58
openstackLaunchpad bug 1805989 in OpenStack Compute (nova) "Weight policy to stack/spread instances and "max_placement_results"" [Undecided,New]12:58
*** betherly has quit IRC13:01
tssuryamriedem: I guess from our perspective, since we don't use any scheduler filters (except the ComputeFilter): a scenario where placement returns all results (as in all possible available hosts) and the weights are applied to all of them, versus what happens now where only the max_placement_results number of available hosts get weighed,13:03
tssuryawould affect the way packing/spreading is done13:03
mriedemthat's working as designed....13:03
mriedemmax_placement_results is just a front-end filter13:04
sean-k-mooneytssurya: you don't use any filters so you don't use sriov/pci passthrough/cpu pinning/hugepages or numa in your cloud13:04
mriedemlike i said in the bug, even before placement, if you have 1k hosts, if the filters narrow that down to 10, then only 10 get weighed13:04
mriedemsean-k-mooney: no they don't,13:04
mriedemoh well,13:04
mriedemnvm13:04
mriedemb/c of cells v1 there were several things they couldn't support before, like affinity13:04
mriedemidk about pci/numa13:05
sean-k-mooneythis seems kind of like a duplicate of https://bugs.launchpad.net/nova/+bug/180598413:05
openstackLaunchpad bug 1805984 in OpenStack Compute (nova) "Placement is not aware of disable compute nodes" [Wishlist,Triaged]13:05
tssuryasean-k-mooney: we use pci passthrough, but not that much (just for one cell)13:05
mriedemsean-k-mooney: that is definitely a different problem13:05
mriedemthat bug is that placement is returning only disabled computes13:06
mriedembecause cern sets max_placement_results very low13:06
sean-k-mooneyyes13:06
mriedemthis other one is different13:06
mriedemplacement has no relationship to weighers in nova-scheduler13:06
sean-k-mooneyit is a different problem but they are both because the limiting is done in placmeent not after teh filters13:06
mriedemfor the weigher bug it doesn't matter13:07
mriedemfilters are before weighers,13:07
mriedemso if placement filtered out the hosts or the scheduler filters did, it doesn't matter13:07
mriedemthe weighers are only going to weigh what they get13:07
mriedemso they will spread/pack across whatever comes through the filters13:07
sean-k-mooneyit matters if you have lots of disabled hosts in the limited set of results you got from placement13:07
sean-k-mooneyit will skew the weighing by reducing the set of results13:08
tssuryamriedem: exactly, for us when not using "max_placement_results" the filters would give us almost all the hosts from a cell13:08
*** muttley has quit IRC13:08
mriedemtssurya: yeah, so IMO bug 1805989 is not really a bug, it's working as designed13:09
openstackbug 1805989 in OpenStack Compute (nova) "Weight policy to stack/spread instances and "max_placement_results"" [Undecided,New] https://launchpad.net/bugs/180598913:09
tssuryabut yea probably this is not a bug, it's a design thing13:09
mriedembug 1805984 is definitely a problem for which we have solutions13:09
openstackbug 1805984 in OpenStack Compute (nova) "Placement is not aware of disable compute nodes" [Wishlist,Triaged] https://launchpad.net/bugs/180598413:09
sean-k-mooneythat said it is working as designed as mriedem said. that design may not be desirable and we might want to consider how to change it13:09
gibimriedem: I saw your comment on the notification deprecation patch. I have to organize my thoughts to formulate an opinion13:09
mriedemwhat we need to do is get CERN to the point that they don't have to workaround perf issues by setting max_placement_results to 1013:10
mriedemwhen they have 14K hosts13:10
mriedemgibi: heh ok :)13:10
mriedemno rush13:10
gibimriedem: yeah, I will take my time13:10
gibi:)13:10
mriedemgibi: it's a bit depressing huh?13:10
tssuryamriedem: sure its a design thing, we just laid it out in case there were ideas13:10
tssuryamriedem: yea perf stuff is still going on13:10
tssuryahopefully we won't need to set "max_placement_results" to a low number then13:11
gibimriedem: I understand that the original goals of that work might not be applicable in the current situation of OpenStack13:11
sean-k-mooneytssurya: the placement randomisation was how we wanted people to enable spreading behavior13:11
*** tbachman has joined #openstack-nova13:12
mriedemtssurya: ok i marked the bug as invalid (working as designed)13:12
sean-k-mooneyso if you had 1000 hosts that could fit the request and you requested 50 then with the randomisation enabled you would get a random 50 out of that 1000 for the filters and weighers to select from13:12
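[editor's note: a toy illustration (not nova code) of the randomisation point above — with a low limit and no randomisation, the weighers only ever see the same slice of hosts:]

```python
import random

hosts = ['host%04d' % i for i in range(1000)]  # 1000 viable hosts

# max_placement_results=50 without randomisation: placement returns a
# deterministic 50, so a spreading weigher can only ever land on these.
deterministic_batch = hosts[:50]

# With [placement]/randomize_allocation_candidates enabled, each request
# weighs a different random 50, so scheduling reaches all 1000 over time.
random_batch = random.sample(hosts, 50)
```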
*** arches has quit IRC13:13
sean-k-mooneytssurya: do you remember what you set currently and what that value is as a proportion of your average cell size13:14
*** k_mouza_ has joined #openstack-nova13:14
mriedemaverage cell size is 200 at cern13:14
tssuryasean-k-mooney: 10 versus 800 in normal scenario13:14
mriedemi think they set max_placement_results to 1013:14
mriedem800? i thought it was 200 on average13:14
mriedemwith about 72-74 cells?13:15
tssuryamriedem: yea 200 for special project to cell mappings, on an average for normal users we set aside 5 to 7 default cells13:15
tssuryaeach cell having 20013:15
mriedemok13:15
mriedemas far as i know, https://bugs.launchpad.net/nova/+bug/1737465 is still the biggest perf issue13:16
openstackLaunchpad bug 1737465 in OpenStack Compute (nova) "[cellv2] the performance issue of cellv2 when creating 500 instances concurrently" [Medium,Confirmed]13:16
sean-k-mooneyok so could you increase the limit to say 10% of the hosts that the tenant can be expected to select from13:16
sean-k-mooneye.g. 80 in this case?13:16
*** k_mouza has quit IRC13:17
mriedemalso, as far as i know, cern is not yet grouping cells via host aggregate13:17
mriedemor are you?13:17
*** rcernin has joined #openstack-nova13:17
tssuryawe model cells as aggregates13:17
mriedembecause of https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#tenant-isolation-with-placement right?13:17
tssuryaas in every cell is an aggregate13:17
mriedem?13:17
mriedemok13:17
tssuryayea because of the pre-filter13:18
mriedemso, really,13:18
jaypipesmriedem: wait, there's a bug in shelve code?!13:18
sean-k-mooneytssurya: so you then use the tenant affinity filter to map tenants to those aggregates and therefore to cells?13:18
mriedemjaypipes: working as designed13:18
*** awalende has quit IRC13:18
tssuryasean-k-mooney: yea13:18
mriedemtssurya: if the pre-filter is working, then the max_placement_results probably doesn't need to be so low...13:18
jaypipesmriedem: I only have time to argue with cfriesen today, so I'll wait til he's online. :P13:19
sean-k-mooneymriedem: that is what i was wondering too13:19
tssuryamriedem: surely it need not be as low as 10, but even with it being 10 we have 5 to 8 seconds scheduling time13:19
mriedembecause if my project is mapped to a cell with ~200 hosts, then filtering on 200 hosts shouldn't be so bad13:19
tssuryathe time it would take to gather all those host states, well we are still trying to check out why it's 5 to 8 seconds13:19
sean-k-mooneytssurya: when you say scheduling are you referring to just the time it takes for the filter scheduler or the time the vm is in that state13:20
mriedemtssurya: do you know if this is set on the computes and the scheduler nodes? https://docs.openstack.org/nova/latest/configuration/config.html#filter_scheduler.track_instance_changes13:20
tssuryatime for the whole boot until select destinations is done fully13:20
jaypipesmriedem: I think it's pretty obvious I wouldn't support adding disabled flags to provider records, yes?13:20
mriedemjaypipes: yes, i could find you my reply to that13:21
mriedemwhich was probably not very nice13:21
jaypipesmriedem: yes, I saw it.13:21
mriedemand i hope cfriesen will forgive me13:21
*** muttley has joined #openstack-nova13:21
jaypipesmriedem: BTW, it's possible to do something like this:13:21
tssuryamriedem: not sure let me check13:21
sean-k-mooneyjaypipes: would you support using a placement aggregate for disabled compute nodes13:21
sean-k-mooneyjaypipes: and then use the new member_of=!<aggregate> feature13:22
jaypipesa) request max_placement_records. get those records. b) do a filter() against those provider UUIDs and the set of disabled compute services (would need to grab the compute node UUID, not the hostname, though), and c) if len(a) < len(b), request more from placement13:22
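[editor's note: sketched out, jaypipes' a/b/c approach might look like the loop below. All helper names and the candidate attribute are invented; placement has no paging API, so "request more" here just re-queries with a larger limit:]

```python
def get_enabled_candidates(context, req_spec, want):
    # (b) compute node UUIDs of disabled services -- a cell DB lookup
    disabled = _get_disabled_compute_node_uuids(context)
    limit = want
    while True:
        # (a) ask placement for allocation candidates
        batch = _get_allocation_candidates(req_spec, limit=limit)
        usable = [c for c in batch if c.rp_uuid not in disabled]
        # (c) stop when we have enough, or placement has nothing more
        if len(usable) >= want or len(batch) < limit:
            return usable[:want]
        limit *= 2  # came up short after filtering: ask for more
```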
mriedemtssurya: track_instance_changes enables the computes to rpc broadcast to the schedulers the information about the instances running on them (the hosts) and then the scheduler workers cache that information so that during a scheduling request the scheduler doesn't need to iterate all of the hosts (via db) to get their current instance list13:23
jaypipessean-k-mooney: sure, though forbidden aggregates are not approved spec yet...13:23
mriedemtssurya: it's not recommended for split MQ because the computes would be disconnected from the schedulers (so the cast goes in the trash)13:23
mriedemjaypipes: i've been +2 on that forbidden aggregates spec for a couple of weeks now13:23
jaypipessean-k-mooney: the solution should remain entirely on the nova side, though, IMHO, which is why I recommend the approach above as a method to take if NoValidHosts is encountered.13:23
sean-k-mooneyjaypipes: true but it's less of an abuse of the placement api and is just as simple to manage as adding or removing a trait13:23
mriedemjaypipes: what we talked about in channel yesterday was all nova side solutions13:24
jaypipesmriedem: gotta find another +2? do you want me to +2 that since I was the one who came up with the idea?13:24
sean-k-mooneyjaypipes: e.g. page in more results if the filter elimidate them13:24
mriedemjaypipes: what you suggested above sounds like paging, but also not great performance wise since we'd have to pull the data from the nova cell dbs to check the disabled status13:24
jaypipessean-k-mooney: is elimidate a combination of eliminate and intimidate?13:24
mriedemjaypipes: you were +2 on the spec before13:24
tssuryamriedem: its False13:24
mriedemtssurya: ok. you aren't running split MQ right?13:25
jaypipesmriedem: are we talking about the same spec?13:25
jaypipesmriedem: the nova side one or the placement-side one?13:25
mriedemhttps://review.openstack.org/#/c/603352/13:25
mriedem^ is the placement spec13:25
*** muttley has quit IRC13:25
mriedemtpatil has a spec leveraging that, which i couldn't understand13:25
mriedemand sounded like well if all the planets were aligned config wise things *might* not shit the bed13:26
jaypipesmriedem: ok, sorry I thought you were referring to the latter.13:26
sean-k-mooneyjaypipes: yes it likely was.13:26
mriedemthe latter is https://review.openstack.org/#/c/609960/13:26
jaypipesyes, that's what I thought you were referring to :)13:26
*** muttley has joined #openstack-nova13:26
mriedemyou can see my confusion in https://review.openstack.org/#/c/609960/2/specs/stein/approved/placement-req-filter-forbidden-aggregates.rst@1413:26
jaypipesyou being +2 on that one was news to me ;)13:26
mriedemheh, not even close13:27
jaypipesright...13:27
mriedemi don't feel great about that, because this is at least the 3rd spec that tushar has had to chase for his issue13:27
jaypipesmriedem: ok, well at least you know why I was a bit confused above then :)13:27
sean-k-mooneywhy do we have two specs for the same thing?13:27
jaypipesmriedem: ack, agreed. but the specs should be clear. and this isn't (yet)13:27
mriedemsean-k-mooney: it's not13:28
jaypipessean-k-mooney: mriedem had (correctly) asked to separate the nova-config stuff from the placement needs13:28
mriedemone is the placement change, one is the nova change to use it13:28
jaypipesjinx13:28
sean-k-mooneyoh ok13:28
mriedembecause they are separate services...13:28
sean-k-mooneyi was just aware of https://review.openstack.org/#/c/603352/13:28
jaypipessean-k-mooney: it's kind of like if you combine the two words eliminate and intimidate... better to have two words that mean different things.13:28
jaypipessean-k-mooney: :P13:29
mriedemtssurya: so if your computes can talk to the scheduler nodes over rpc, enabling that option might help you some with the scheduling performance issues13:29
*** rcernin has quit IRC13:29
*** muttley has quit IRC13:29
mriedembut with 14K computes, you might also melt your mq...13:29
* jaypipes reads back up to understand tssurya's issues...13:29
sean-k-mooney:)13:29
mriedemjaypipes: they severely limit max_placement_results to 10 because scheduling takes too long13:30
mriedemmostly boils down to https://bugs.launchpad.net/nova/+bug/1737465 i think13:30
openstackLaunchpad bug 1737465 in OpenStack Compute (nova) "[cellv2] the performance issue of cellv2 when creating 500 instances concurrently" [Medium,Confirmed]13:30
mriedemfrom the forum session on cells v2:13:30
mriedem"TODO: need someone to dig into optimized DB queries in HostManager._get_host_states()."13:30
jaypipesmriedem: by "scheduling" you are referring to the entire process on the nova side or are you referring to just the placement allocation candidates query?13:30
mriedemthe entire process13:30
jaypipesk13:30
mriedemthe biggest issue, as far as i know right now, is likely HostManager._get_host_states()13:31
jaypipesbut they only have one filter enabled... so that shouldn't be a big perf hit, right?13:31
jaypipesmriedem: and all ten of the hosts returned are disabled?13:31
sean-k-mooneyjaypipes: yep13:31
mriedemdifferent issue...13:31
mriedemwe're talking past each other13:31
sean-k-mooneywell its related13:31
sean-k-mooneyeither they hit the case where all are disabled13:32
mriedemthe disabled computes issue is a result of setting max_placement_results to 1013:32
jaypipeslemme read all the bug contents... one sec.13:32
sean-k-mooneyor they are not, but then when they weigh the hosts it's not an even spread or packing behavior anymore13:32
mriedemthe root issue is that they have max_placement_results set so low because of shitty performance in nova-scheduler13:32
*** liuyulong has joined #openstack-nova13:32
mriedemhttps://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L704 is a problem, because it iterates all the resulting compute nodes (via alloc candidates), and for each it joins the host aggregates and instances per host https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L73413:34
jaypipesmriedem: does CERN have placement request filtering enabled?13:34
mriedemso that the HostState has info about aggregates and instances before the HostState goes to the filters13:34
mriedemjaypipes: yes13:34
mriedemit was added for them :)13:34
sean-k-mooneymriedem: ya by setting it to 10 the scheduler only considered 5% of possible hosts in a cell or less than 1% of hosts that a tenant could be scheduled to13:34
*** pcaruana has joined #openstack-nova13:34
mriedemsean-k-mooney: you know i understand the problem right?13:34
sean-k-mooneymriedem: ya i know13:35
sean-k-mooneysorry ill leave you too it i need to finish fixing my own spec13:35
mriedemjaypipes: so if you have some wild sql fu thoughts on how we could optimize https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L704 that would be super13:35
mriedemlike, in a single query per cell, give me all the computes in a uuid list along with the list of instance uuids on those hosts13:37
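[editor's note: one possible shape for that single-query-per-cell, sketched with SQLAlchemy. Model and column names follow nova.db.sqlalchemy.models, but this exact query is not in the tree:]

```python
from nova.db.sqlalchemy import models

def instance_uuids_by_host(session, compute_node_uuids):
    # One round trip per cell: join compute nodes to their instances
    # and hand back {host: set(instance uuids)}.
    query = session.query(
        models.Instance.host, models.Instance.uuid,
    ).join(
        models.ComputeNode,
        models.ComputeNode.host == models.Instance.host,
    ).filter(
        models.ComputeNode.uuid.in_(compute_node_uuids),
        models.Instance.deleted == 0,  # soft-delete convention
    )
    hosts = {}
    for host, instance_uuid in query:
        hosts.setdefault(host, set()).add(instance_uuid)
    return hosts
```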
jaypipesmriedem: the two links to host_manager.py above are not doing SQL operations... they are doing lookups on the host manager's local hashmaps of aggregate and instance information...13:37
jaypipesmriedem: the nova-compute services are reporting aggregate and instance usage via the queryclient interface, and the host manager caches that information.13:38
mriedemhttps://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L74713:38
mriedemis definitely doing db stuff13:38
*** awalende has joined #openstack-nova13:38
mriedemthere is no cache here13:38
mriedemmaybe for aggregates13:38
mriedembut not for host->instance mappings13:38
jaypipesmriedem: that's only called if there's no cache entry for the host.13:38
*** pcaruana has quit IRC13:39
mriedemwhich there won't be per request13:39
jaypipesmriedem: those cache entries are populated on startup of the host manager and when the compute services report to the scheduler via the queryclient13:39
mriedemjaypipes: that's only if you have enabled track_instance_changes13:39
mriedemwhich cern does not do,13:39
mriedemand which won't work in a split MQ cells deployment where the computes can't rpc cast to the scheduler13:39
jaypipesguh...13:40
jaypipesso THAT'S the source of this issue, really.13:40
jaypipesok.13:40
*** awalende has quit IRC13:40
mriedemso i've suggested that cern try track_instance_changes=true to see if that caching helps13:40
mriedemmaybe at least for a handful of cells to start13:40
jaypipesmriedem: alright, I can do some SQL-fu on the _get_instances_by_host() method.13:40
*** awalende has joined #openstack-nova13:41
jaypipesmriedem: OK, I understand the issue better now. Gimme a little to experiment with something, ok?13:41
*** tssurya has quit IRC13:41
mriedemsure13:41
mriedemlet this be your white whale for the day13:41
*** tssurya has joined #openstack-nova13:41
jaypipesjust call me Ahab.13:42
jaypipesoh, hi tssurya :)13:42
*** k_mouza has joined #openstack-nova13:42
jaypipesmriedem: ok, so now I understand why CERN is setting max_placement_records so low... it's because a single HostMapping.get_by_host() is being issued for each found host.13:43
jaypipesinstead of a batched approach.13:43
*** pcaruana has joined #openstack-nova13:43
mriedemsure, it's more than just the host mapping query13:45
*** k_mouza_ has quit IRC13:45
jaypipeshttps://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L75813:45
mriedembingo13:45
jaypipesInstanceList too...13:45
jaypipesack.13:45
mriedemyes, and i have already neutered that13:45
mriedemthat used to pull full Instance objects back13:45
mriedemnow it's just uuids13:45
mriedemhttps://github.com/openstack/nova/commit/91f5af7ee7f7140eafb7237875f6cd6ea1abcd3813:46
jaypipesk13:46
fricklermriedem: any chance to give these stable reviews progress? https://review.openstack.org/619220 https://review.openstack.org/61925413:46
mriedemfrickler: there is a >0% chance yes13:46
*** pcaruana has quit IRC13:47
openstackgerritLee Yarwood proposed openstack/nova master: fixtures: Return a mocked class instead of method within fake_imagebackend  https://review.openstack.org/61980413:47
* frickler is positivized13:47
mriedemfrickler: why didn't you cherry pick the pike change from the queens change? looks like you had to redo some of the conflict resolution?13:48
mriedemnormally backports should just all go in a row, stein -> rocky -> queens -> pike13:48
mriedemlike a snowball13:48
fricklermriedem: it was yet another conflict, yes, that function likes to do file hopping it seems13:49
mriedemlyarwood: want to get this? https://review.openstack.org/#/c/619220/113:49
mriedemok13:49
lyarwoodmriedem: looking13:50
*** dpawlik has quit IRC13:50
mriedemok scored both13:50
lyarwoodmriedem: okay done, would you mind taking a look at https://review.openstack.org/#/q/status:open+project:openstack/nova+topic:bug/1764883 when you get a chance?13:51
*** awalende has quit IRC13:52
*** dr_gogeta86 has joined #openstack-nova13:55
mriedemi see how it is, i give you something easy and in return you give me this13:56
mriedemoh these are backports..13:56
mriedemnvm13:56
*** yikun has quit IRC13:56
lyarwoodmriedem: something just as easy, yeah, what a guy ;)13:57
*** ttsiouts has quit IRC14:01
*** eharney has joined #openstack-nova14:01
*** awalende has joined #openstack-nova14:01
*** ttsiouts has joined #openstack-nova14:01
*** rodolof has joined #openstack-nova14:02
*** ttsiouts has quit IRC14:06
openstackgerritsean mooney proposed openstack/nova-specs master: Add spec for sriov live migration  https://review.openstack.org/60511614:06
sean-k-mooneybauzas: adrianc sorry for the delay on updating ^14:07
*** rodolof has quit IRC14:07
*** rodolof has joined #openstack-nova14:08
*** spatel has joined #openstack-nova14:13
*** awaugama has joined #openstack-nova14:13
*** spatel has quit IRC14:17
adriancack sean-k-mooney, will look into it14:19
*** ratailor has joined #openstack-nova14:19
*** k_mouza has quit IRC14:22
*** jaosorior has joined #openstack-nova14:23
*** ttsiouts has joined #openstack-nova14:24
*** cfriesen has joined #openstack-nova14:27
openstackgerritMerged openstack/nova-specs master: Per aggregate scheduling weight (spec)  https://review.openstack.org/59930814:27
tssuryahi back jaypipes :)14:28
tssuryamriedem: thanks for the info on the config option, will see if enabling it makes it better14:29
tssuryaI guess the documentation says something like "if the configured filters and weighers do not need this information, disabling this option will improve performance"14:29
tssuryabut I didn't know what this exactly did until after looking into the code14:30
*** mchlumsky has joined #openstack-nova14:30
*** k_mouza has joined #openstack-nova14:30
sean-k-mooneyadrianc: there are only very minor changes as there seem to be few questions raised14:31
sean-k-mooneyadrianc: it should not affect your poc14:31
*** psachin has joined #openstack-nova14:32
*** mchlumsky has quit IRC14:33
*** janki has quit IRC14:33
*** diliprenkila has joined #openstack-nova14:33
*** mchlumsky has joined #openstack-nova14:34
mriedemtssurya: yeah it's mostly for the affinity type filters14:35
*** mlavalle has joined #openstack-nova14:35
*** ratailor has quit IRC14:35
tssuryamriedem: ah ok14:36
tssuryayea14:36
mriedemthe performance thing there is probably about not needlessly sending instance information from all computes to the scheduler14:37
mriedembut misses the part about how the scheduler then blindly pulls the information from the db anyway14:37
mriedemhmm...14:38
mriedemi wonder...14:38
mriedemif we only got the HostState.instances in the scheduler if track_instance_changes was true14:38
mriedemsince they are kind of connected14:38
mriedemwe wouldn't want to make anything conditional on enabled filters b/c people can have out of tree filters14:38
mriedemor we could just add a new option, CONF.filter_scheduler.do_your_filters_care_about_instances_on_hosts14:39
mriedemwith a better name14:39
mriedemmore config options is gross, but if you have split mq then track_instance_changes won't work and you don't need to enable it14:39
tssuryaright, I am pretty sure this option was enabled by default and three months ago we disabled it and saw a very slight performance improvement actually14:40
*** diliprenkila has quit IRC14:40
*** liuyulong has quit IRC14:40
mriedembecause of less mq traffic/14:40
mriedem?14:40
tssuryabut yea I am getting this from a tracking ticket on the issue, probably we will look into this next week14:41
mriedemok14:41
tssuryayea I think so14:41
mriedemas i said, i'd expect your mq traffic to go up by enabling it, but your overall scheduling time, at least for hosts in the cells reporting instance info, might go down14:41
*** ttsiouts has quit IRC14:41
mriedembut ultimately if you aren't enabling filters that need that information, it's a waste of time all around14:41
*** ttsiouts has joined #openstack-nova14:42
tssuryahmm also what do you think, could querying all the cell DBs to fetch the compute nodes be reason enough to add time?14:42
sean-k-mooneyif people have time could they take a look at https://review.openstack.org/#/c/591607/1114:42
tssuryaI honestly am not sure if the whole parallel thing works well enough even with scatter gather,14:43
tssuryawill have to dig through that too14:43
*** Swami has joined #openstack-nova14:44
mriedemtssurya: we're querying enabled cells right? but that could be 70 cells which don't contain the 10 compute nodes you care about?14:44
mriedemalso, how many scheduler workers are you running?14:45
dansmithtssurya: why do you think the parallel thing doesn't work?14:47
*** ttsiouts has quit IRC14:47
dansmithor you said "well enough"...14:47
*** diliprenkila has joined #openstack-nova14:49
sean-k-mooneydansmith: stephenfin: can you add https://review.openstack.org/#/c/591607/11 to your review queue.14:50
tssuryadansmith: well last time we tried to measure the parallel part in time over going to all 70 DBs we did not get any solid proof of fastness14:50
diliprenkilaHi all, While creating instance snapshots , type conversion error occurs nova.api.openstack.wsgi [req-fd9b8a70-f455-42e1-b186-93fffaa6192e ff9650c86533492581513eca72b48409 2eea218eea984dd68f1378ea21c64b83 - 765703fcca634b149c7a012626847d2f 765703fcca634b149c7a012626847d2f] Unexpected exception in API method: TypeError: Unable to set 'os_hidden' to 'False'. Reason: u'False' is not of type u'boolean'14:50
stephenfinsean-k-mooney: Sure14:50
mriedemdiliprenkila: you're talking about https://bugs.launchpad.net/nova/+bug/1806239 yes14:51
openstackLaunchpad bug 1806239 in OpenStack Compute (nova) "nova-api should handle type conversion while creating server snapshots " [Undecided,New]14:51
dansmithtssurya: all of the processing of the database results happens in serial, which is generally the majority of the work14:51
tssuryadon't know was just wondering if someone had done the scatter-gather versus going sequential perf study14:51
dansmithtssurya: that's because we're using eventlet instead of regular threads14:51
openstackgerritsean mooney proposed openstack/nova master: Add fill_virtual_interface_list online_data_migration script  https://review.openstack.org/61416714:51
tssuryaah yea which is why the bottleneck is still the time of the slowest DB14:52
diliprenkilamriedem: full log is at: https://etherpad.openstack.org/p/e20ERvBEkw14:52
dansmithtssurya: no, not really, that's just because we have to wait for all the results in order to be able to produce them in order14:52
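[editor's note: for reference, the helper under discussion — the per-cell queries run in eventlet greenthreads, but result handling is serial, so wall time tracks the slowest cell plus the serial post-processing. Signature and sentinels per nova.context around Rocky; `ctxt` is an assumed RequestContext:]

```python
from nova import context as nova_context
from nova import objects

results = nova_context.scatter_gather_all_cells(
    ctxt, objects.ComputeNodeList.get_all)

for cell_uuid, nodes in results.items():
    if nodes in (nova_context.did_not_respond_sentinel,
                 nova_context.raised_exception_sentinel):
        continue  # that cell timed out or raised
    # ...everything from here down happens in serial
```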
*** _alastor_ has joined #openstack-nova14:52
tssuryamriedem: yea it's still going to 70 cells out of which we care only about the 1 cell which might have those 10 nodes14:53
*** ttsiouts has joined #openstack-nova14:53
mriedemhmm, wonder if we could front-filter the cells via host mappings,14:53
mriedemthat wouldn't be worth it in the case of 1-2 cells,14:53
mriedembut for 70 it might14:53
dansmithmriedem: like we do for projects on list.., and we have a knob to choose14:53
mriedemsort of like how we get instance mappings by project_id and then filter the cells from those mappings14:53
mriedemyeah14:53
*** janki has joined #openstack-nova14:54
tssuryamriedem: yea we had a bug for that, and kind of have a patch to try that out, will first see if there is a major difference14:54
*** diliprenkila_ has joined #openstack-nova14:55
mriedemdiliprenkila: i don't see os_hidden here https://docs.openstack.org/glance/latest/admin/useful-image-properties.html14:55
mriedemis it just not documented?14:55
dansmithmriedem: and, it only helps for 70 cells when you regularly cut out 68 of those cells for any given scheduling request14:55
tssuryadansmith: oh ok14:55
*** Swami has quit IRC14:55
bauzasmriedem: diliprenkila: I guess he means kvm_hidden14:56
mriedemtssurya: this bug? https://bugs.launchpad.net/nova/+bug/176730314:56
openstackLaunchpad bug 1767303 in OpenStack Compute (nova) "Scheduler connects to all cells DBs to gather compute nodes info" [Undecided,Incomplete] - Assigned to Surya Seetharaman (tssurya)14:56
tssuryayea14:56
*** Swami has joined #openstack-nova14:56
mriedembauzas: no it's a property on the image? https://github.com/openstack/glance/blob/a308c444065307e99f18b521ed8d95714be24da7/glance/db/sqlalchemy/alembic_migrations/versions/rocky_expand01_add_os_hidden.py#L1614:56
*** diliprenkila has quit IRC14:57
bauzasah my bad then14:57
bauzaswhat does this ?14:57
*** _alastor_ has quit IRC14:57
* bauzas goes looking at nova repo14:57
bauzasmmm https://github.com/openstack/nova/search?q=os_hidden&unscoped_q=os_hidden14:58
*** diliprenkila_ has quit IRC14:59
mriedemtssurya: ok changed that to triaged and added notes15:00
mriedemadded to https://etherpad.openstack.org/p/BER-cells-v2-updates also15:00
mriedemsince i've got a running list of perf related issues in there15:00
*** diliprenkila has joined #openstack-nova15:01
sean-k-mooneybauzas: looking at http://codesearch.openstack.org/?q=os_hidden&i=nope&files=&repos= os_hidden is used by horizon and in glance but not in nova15:01
mriedemnova likely gets the image properties, shoves them into instance.system_metadata, and then on snapshot we populate the image meta for the new image from that sys_meta but aren't saying os_hidden is a boolean15:02
mriedemand default to send it as a string or something15:02
mriedemwe have a whitelist for shit like this15:03
tssuryamriedem: thanks15:03
diliprenkilamriedem: can't we send os_hidden as boolean?15:04
sean-k-mooneydiliprenkila: we probably could but nova does not know os_hidden is a thing so it has no logic specifically for handling it15:05
mriedemdiliprenkila: the problem is likely here https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/image/glance.py#L73415:05
mriedemthis is why microversions are nice - nova, as a client, never asked for this new response field15:05
mriedemstashed it away and then blindly gave it back15:06
mriedemapparently there isn't any tempest testing for this either15:06
mriedemdiliprenkila: i left some questions in the bug, i'm not sure why tempest wouldn't already be failing on this, unless you have to do something to trigger this failure - can you provide reproduction steps in the bug report?15:08
diliprenkilamriedem : yes i will provide reproduction steps15:08
mriedemi'm pretty sure "output[prop_name] = str(prop_value)" is the problem15:09
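[editor's note: the suspected failure in miniature, plus one hedged shape a fix could take — illustrative only, not a merged patch:]

```python
# What the quoted line effectively does when building snapshot metadata:
prop_value = False
output = {'os_hidden': str(prop_value)}  # -> 'False' (a string)
# glance's schema then rejects it: u'False' is not of type u'boolean'

# One possible fix shape (hypothetical helper): restore boolean-ish
# strings before sending the properties back to glance.
def _unstringify_bool(value):
    if isinstance(value, str) and value.lower() in ('true', 'false'):
        return value.lower() == 'true'
    return value

output['os_hidden'] = _unstringify_bool(output['os_hidden'])  # -> False
```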
sean-k-mooneyis this something the glanceclient could fix for us15:09
sean-k-mooneyit defines it as a bool in its schema15:10
mriedemwe don't use glanceclient15:10
sean-k-mooneyhttp://git.openstack.org/cgit/openstack/python-glanceclient/tree/glanceclient/v2/image_schema.py#n21615:10
sean-k-mooneyoh ok15:10
openstackgerritBalazs Gibizer proposed openstack/nova master: Remove port allocation during detach  https://review.openstack.org/62242115:10
mriedemi think we only use ksa now15:10
*** dpawlik has joined #openstack-nova15:10
sean-k-mooneywe still import glanceclient15:10
mriedemoh yeah i guess we do,15:11
mriedemwe use ksa to get the session adapter thing to construct glanceclient15:11
mriedemanyway, it'd be cool if apis didn't return things you didn't ask for...15:12
diliprenkilamriedem: yes15:12
sean-k-mooneyso could we modify glanceclient to munge the values from strings to bools so we did not have to handle this field ourselves15:12
*** dpawlik has quit IRC15:15
mriedemi'm sure if we complained to the glance team about this, they'd say "you should be using the schema provided with the image"15:15
mriedemwhich we aren't15:15
*** amodi has joined #openstack-nova15:15
mriedemthere is only one image field we deal with via the schema and that's disk_format15:15
mriedemhttps://review.openstack.org/#/c/375875/15:16
*** dpawlik has joined #openstack-nova15:17
*** awalende has quit IRC15:17
*** dpawlik has quit IRC15:17
*** dpawlik has joined #openstack-nova15:18
diliprenkilamriedem: so we should fix the os_hidden type in nova ? not in glance15:19
openstackgerritSilvan Kaiser proposed openstack/nova master: Exec systemd-run without --user flag in Quobyte driver  https://review.openstack.org/55419515:20
*** sridharg has quit IRC15:21
mriedemdiliprenkila: i don't think there is probably anything to change in glance,15:21
mriedembut i'd like to know why tempest isn't failing with this, but i need to know the recreate steps,15:22
openstackgerritChris Dent proposed openstack/nova master: Correct lower-constraints.txt and the related tox job  https://review.openstack.org/62297215:22
mriedembecause tempest has very basic tests where it creates a server and then creates a snapshot of that server,15:22
mriedemwhich i would think should cause this failure15:22
*** slaweq has joined #openstack-nova15:23
mriedemmaybe tempest isn't using image api v2.7?15:26
diliprenkilamridem: may be15:27
artommriedem, I think by default it goes to the lowest microversion15:28
mriedemglance doesn't have microversions but...15:28
artomUnless the test specifies min_microversion (or just microversion?)15:28
artomOh, so just endpoints?15:28
mriedemi need a mordred15:28
mordredI didn't do it15:29
mriedemimage api versions, go!15:29
*** slaweq has quit IRC15:29
mriedemas in, wtf15:29
artomYou're holding out for a mordred 'till the end of the night15:29
mriedemdoes the user opt into those or you just get what the server has available?15:29
diliprenkilamriedem: i am using nova: 18.0.0 , glance: 2.9.115:29
mordredthey're silly. there is no selection - you just get the API described by the highest number in that list15:29
mordredso - basically - ignore the thing after the 2.15:30
mriedemso if glance is rocky, i get 2.715:30
mriedemhttps://developer.openstack.org/api-ref/image/versions/index.html#version-history15:30
mordredyeah15:30
mriedemthen i don't know why tempest would not fail on https://bugs.launchpad.net/nova/+bug/180623915:30
openstackLaunchpad bug 1806239 in OpenStack Compute (nova) "nova-api should handle type conversion while creating server snapshots " [Undecided,New]15:30
mordrednova is still using glanceclient right?15:31
mriedemyeah15:31
mordredyeah - tempest uses direct rest calls - so it's possible glanceclient is doing something wrong. or tempest is doing something wrong15:31
mriedemwell, tempest would just create a server and tell nova to snapshot it15:32
mriedemand then nova will use glanceclient15:32
*** slaweq has joined #openstack-nova15:32
mriedemif that's all it takes to tickle this with rocky glance, i'm not sure why tempest wouldn't blow up15:32
mriedemanyway, i'll wait for diliprenkila to provide recreate steps15:33
mordredoh. gotcha15:33
mordredyeah. that's super weird15:34
*** jhesketh has quit IRC15:34
ShilpaSDHi All, facing an issue while stacking, E: Sub-process /usr/bin/dpkg returned an error code (1), any suggestions to resolve this?15:35
*** jhesketh has joined #openstack-nova15:35
*** bringha has quit IRC15:36
*** slaweq has quit IRC15:37
dansmithmriedem: so on that rpc logging thing at startup, do we think the actual query is slowing things down, or the logging of the pointless message?15:37
mriedemidk15:37
mriedemlooking at logs15:38
mriedemwell we take 2 seconds dumping our gd config options :)15:39
dansmithwhich should be mostly just log traffic, right?15:39
mriedemyeah15:39
dansmithso maybe the logging is the thing15:39
dansmithalso15:39
mriedemwe start loading extensions at Dec 05 20:14:00.91952015:39
dansmithyou know that if we were to actually start compute first, we would cache the service version the way we want and avoid the multiple lookups15:40
mriedemlooks like we're done loading extensions at Dec 05 20:14:27.71858715:40
mriedemso there is another thing here,15:40
mriedemthe rpcapi client does the version query thing,15:40
mriedembut API also constructs a SchedulerReportClient per instance, which apparently uses a lock15:41
mriedemDec 05 20:14:27.687766 ubuntu-xenial-ovh-bhs1-0000959981 devstack@n-api.service[23459]: DEBUG oslo_concurrency.lockutils [None req-dfdfad07-2ff4-43ed-9f67-2acd59687e0c None None] Lock "placement_client" acquired by "nova.scheduler.client.report._create_client" :: waited 0.000s {{(pid=23462) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:327}}15:41
mriedemso...15:41
dansmithI also just noticed/remembered that the multi-cell version of this does not cache at all15:41
mriedemand that's an in-memory lock15:41
mriedemheh remember how efried_cya_jan removed that lazy-load scheduler report client stuff? https://github.com/openstack/nova/blob/master/nova/compute/api.py#L25615:44
dansmithyeah15:44
efried_cya_januh oh15:44
dansmithmriedem: is there a warn-once pattern I should be using for logs?15:45
mriedemdansmith: i think just a global15:46
dansmithack15:46
mriedemi'll strip out a separate bug for this report client init thing15:46
*** kaisers_ has joined #openstack-nova15:46
mriedemhttps://bugs.launchpad.net/nova/+bug/180721915:51
openstackLaunchpad bug 1807219 in OpenStack Compute (nova) "SchedulerReporClient init slows down nova-api startup" [Medium,Triaged]15:51
cdentmdbooth: is that ^ related at all to the slow down you were seeing in your explorations yesterday(?)?15:52
efried_cya_janmriedem: We ought to singleton that guy. If we're not having caching conflicts, it can only be out of luck because the API obj isn't doing anything that touches the cache.15:53
edleafeefried_cya_jan: more like efried_big_fat_liar15:53
efried_cya_janI'm here for like another 20 minutes today. This does two things: 1) keeps my inbox down to triple digits; 2) makes you scared to talk about me behind my back.15:54
mriedemi intentionally talked about you to summon you15:55
mdboothcdent: Only if we also do this during server create. Possible. The number of API objects created was insane.15:55
kaisers_stephenfin: mdbooth: Updated https://review.openstack.org/#/c/554195/ as discussed yesterday15:55
edleafeOh, it's more fun to talk about you to your face!15:55
*** dpawlik has quit IRC15:55
efried_cya_janmriedem: Is the removal of lazy-load causing (or even related to) that one-reportclient-per-API?15:56
*** jaypipes has quit IRC15:56
mriedemefried_cya_jan: not sure, we might have always been loading it during API init15:57
*** jaypipes has joined #openstack-nova15:58
mriedembut i think it would have done the lazy-load15:58
mriedemsince LazyLoader only creates the client thing once something is accessed on it15:58
efried_cya_janAnd it's only expensive because it's serialized, not because the report client is doing anything heavy, right?15:59
mriedemyeah i think so15:59
mriedemhttps://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L26016:00
mriedemwe create 2 provider trees in there for some reason...16:00
mriedemhttps://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L27016:00
mriedemhttps://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L27916:00
mriedemoops16:00
mriedemhttps://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L28616:00
mriedemis there any good reason we need to do that in both places?16:01
*** janki is now known as janki|dinner16:02
efried_cya_janmriedem: There should never be a need for one process to create two separate provider trees <=> report client instances, period.16:02
efried_cya_janThat could only ever lead to bugs.16:02
efried_cya_janIt will only ever *not* lead to bugs by luck.16:02
efried_cya_janhold on, looking at your links...16:03
mriedemso, i'm going to remove this https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L286-L28716:03
mriedemand change https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L270-L27216:03
mriedemto just call clear_provider_cache16:03
mriedemfor great success16:03
mriedemok?!16:03
mriedembutthole surfers got me all amped up16:03
*** yan0s has quit IRC16:03
efried_cya_janmriedem: That will be fine, it's removing redundant calls, but it's not going to change anything functionally. We were calling ProviderTree init twice before, but not creating two separate provider trees, just overwriting self._provider_tree.16:05
efried_cya_janmriedem: However, I don't agree with the details. You can't get rid of _create_client without affecting @safe_connect. For now (until @safe_connect has died a fiery death), I would just remove the provider tree and assoc timer inits from __init__.16:06
efried_cya_janAnd if you would like to take a step on the road to that fiery demise of @safe_connect, you could review https://review.openstack.org/#/c/613613/ :P16:06
mriedemefried_cya_jan: safe_connect doesn't care about provider tree and association refresh16:07
mriedemonly the ksa client thing16:08
melwitto/16:08
efried_cya_janmriedem: the reason we want to clear the cache in safe_connect is that we got there due to an error, so we can't count on our cache being correct.16:08
efried_cya_janI sense you're trying to find a way to remove the use of that semaphore. I don't think we can do that.16:09
mriedemi'm not16:09
mriedemi'm just trying to get rid of this code creating the provider tree data structure 20 times in different places16:09
mriedembecause i'm OCD16:09
mriedembut i hear you16:10
efried_cya_janWhat you can do is make _create_client call clear_provider_cache, and remove those inits from __init__. That ought to consolidate the actual LOCs that do the init to one place.16:10
efried_cya_janI think.16:10
*** takamatsu has quit IRC16:10
efried_cya_janbut hm, this makes me wonder whether clear_provider_cache needs to be under that same lock.16:10
efried_cya_jan(for the other cases where it's called)16:10
efried_cya_janI'll leave that steaming pile in front of you and walk away.16:11
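A hedged sketch of the consolidation efried_cya_jan is describing, using stub types and an assumed lock placement for illustration: one method owns the cache (re)initialization and `_create_client` routes through it, so `__init__` stops duplicating those lines. The comment also notes why putting the same non-reentrant lock directly on `clear_provider_cache` would deadlock, which matches what mriedem reports a few lines below.

```python
import threading


class ProviderTree(object):
    """Stand-in for nova's cached provider tree."""


class SchedulerReportClient(object):
    def __init__(self):
        self._client_lock = threading.Lock()
        # No separate provider-tree init here; _create_client handles it.
        self._client = self._create_client()

    def clear_provider_cache(self):
        # The one place the cached state is (re)built. If this method also
        # acquired self._client_lock, the call from _create_client below
        # would deadlock, since threading.Lock is not reentrant.
        self._provider_tree = ProviderTree()
        self._association_refresh_time = {}

    def _create_client(self):
        with self._client_lock:
            # Recreating the client invalidates anything we cached.
            self.clear_provider_cache()
            return object()  # stand-in for the real ksa adapter
```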
* kashyap likes efried_cya_jan's nick; guess it means he's going to disappear soon :D16:14
efried_cya_jankashyap: Yeah, right about... now.16:15
efried_cya_jano/16:15
kashyap(That's good; happy to see screen-starers getting breaks...)16:15
sean-k-mooneyefried_cya_jan: enjoy the break16:15
kashyapefried_cya_jan: Glad; don't make the mistake of staying connected to the VPN :D16:15
* kashyap is off from 17th; will be catching up on my books and other expat errands, and so forth.16:16
*** pas-ha has joined #openstack-nova16:16
*** k_mouza_ has joined #openstack-nova16:22
*** k_mouza has quit IRC16:25
*** Miouge has left #openstack-nova16:28
mriedemheh putting the same lock on clear_provider_cache makes the tests lock up16:29
openstackgerritMatt Riedemann proposed openstack/nova master: Only construct SchedulerReportClient on first access from API  https://review.openstack.org/62324616:32
openstackgerritMatt Riedemann proposed openstack/nova master: DRY up SchedulerReportClient init  https://review.openstack.org/62324716:32
mriedemsomeone should probably tell the intel nfv ci to stop reporting b/c it's busted16:33
mriedemand has been for awhile16:34
mnaserhttps://wiki.openstack.org/wiki/ThirdPartySystems/Intel_NFV_CI16:40
*** ohorecny2 has quit IRC16:41
mnaserlooks like it's in wip or something16:41
*** psachin has quit IRC16:41
gibimriedem: my organized thoughts about the legacy notification deprecation http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000685.html16:43
pas-hahi all, have a question re nova + barbican integration - if we have enabled `[glance]verify_glance_signatures` in nova, how is booting from snapshot (including unshelve of an instance) supposed to work? AFAIU nova does not re-sign the snapshots it creates...16:45
mriedempas-ha: unfortunately everyone that i know that worked on that is no longer around, but people in #openstack-barbican might know16:48
mriedemgibi: thanks16:48
pas-hamriedem: thanks, will ask around there then :-)16:48
gibimriedem: it might not help much but at least it summarizes our options16:49
mriedemgibi: yeah that's still useful16:49
*** ttsiouts has quit IRC16:49
*** ttsiouts has joined #openstack-nova16:50
mriedempas-ha: i assume you mean because of this https://docs.openstack.org/nova/rocky/configuration/config.html#DEFAULT.non_inheritable_image_properties16:50
mriedemso when nova creates a snapshot, the image snapshot does not inherit the signature16:51
mriedemso trying to boot/unshelve from it later won't work16:51
mriedemif you have verify_glance_signatures=True16:51
mriedeminterestingly enough, though sadly not surprising, those image properties are not documented https://docs.openstack.org/glance/latest/admin/useful-image-properties.html16:52
pas-hayep, but AFAIU even if they were passed, the image signature would no longer be valid anyway, as surely the hash of the snapshot is not the same as that of the pristine image16:52
mriedemyeah16:53
mriedemi guess there is https://docs.openstack.org/glance/latest/user/signature.html16:53
mriedemi guess the user would need to sign the snapshots after nova creates them right?16:53
*** ttsiouts has quit IRC16:55
pas-haright now that's the only way I see it might work, yes16:56
mriedemthis was the spec if that contains any info https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/image-verification.html16:57
mriedemit does say "Add functionality to Nova which calls the standalone module when Nova uploads a Glance image and the verify_glance_signatures configuration flag is set."16:58
mriedembut i don't actually see that happening16:59
mriedempas-ha: you might be better off simply asking about this in the openstack-discuss mailing list to see if anyone else is working on this already17:00
mriedemor interested in closing this gap17:00
pas-hayes, figured that out, will do, thanks!17:00
mriedemyw17:00
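For concreteness, a hedged sketch of the re-signing step the user would have to do, following the glance signature guide linked above: sign the snapshot's bytes with RSA-PSS/SHA-256 and base64-encode the result. The `img_signature*` property names come from the glance docs; the key handling, file paths, and the final `openstack image set` step are assumptions for illustration.

```python
import base64

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding


def make_image_signature(image_bytes, key_pem_path, password=None):
    with open(key_pem_path, 'rb') as f:
        key = serialization.load_pem_private_key(
            f.read(), password=password, backend=default_backend())
    signature = key.sign(
        image_bytes,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256())
    return base64.b64encode(signature).decode('utf-8')

# The returned value would then be attached to the snapshot image, e.g.:
#   openstack image set <snapshot> \
#     --property img_signature=<value> \
#     --property img_signature_hash_method=SHA-256 \
#     --property img_signature_key_type=RSA-PSS \
#     --property img_signature_certificate_uuid=<barbican cert ref>
```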
*** diliprenkila has quit IRC17:09
*** Swami has quit IRC17:14
*** helenafm has quit IRC17:16
*** janki|dinner has quit IRC17:16
*** tssurya has quit IRC17:18
*** k_mouza_ has quit IRC17:24
*** _alastor_ has joined #openstack-nova17:28
*** k_mouza has joined #openstack-nova17:45
*** k_mouza has quit IRC17:47
*** jmlowe has quit IRC17:49
*** eharney has quit IRC17:52
*** igordc has joined #openstack-nova17:53
cfriesenthere's contact information there...is someone going to email them?17:54
cfriesen^ for the intel NFV CI17:54
sean-k-mooneyis it broken?17:54
cfriesenmriedem: says yes17:55
cfriesenbah, shouldn't have colon there17:55
*** gyee has joined #openstack-nova17:55
sean-k-mooneyi read it anyway :)17:55
sean-k-mooneyi need to talk to infra and see if we can migrate that ci upstream17:56
sean-k-mooneyit all comes down to having host nodes with nested virt17:56
sean-k-mooneyand 2 numa nodes17:56
sean-k-mooneythe 2 numa nodes are optional but it used dual numa node guests too17:57
*** macza has joined #openstack-nova17:57
*** derekh has quit IRC17:59
*** dtantsur is now known as dtantsur|afk18:01
*** udesale has quit IRC18:04
*** Swami has joined #openstack-nova18:05
lyarwoodWe shouldn't allow a cold migration from an offline compute host right?18:06
dansmithwe don't check, AFAIK18:06
lyarwoodYeah that's what I'm seeing in the API at least18:06
lyarwoodbut that's a bug right? The source compute being offline should fail the attempt right there?18:07
dansmithwell,18:07
dansmithyou could argue that for sure18:07
lyarwoodk, this is from a downstream bug report where a user has tried to migrate instead of evacuate from an offline compute and the instance gets stuck in resize_prep etc.18:08
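A hedged sketch of the guard lyarwood is arguing for, not an actual nova patch: fail the cold migration in the API when the source compute service is down, and point the operator at evacuate. The `service_is_up()` call mirrors nova's servicegroup API; the exception type here is invented for illustration.

```python
class ComputeHostDown(Exception):
    pass


def assert_source_host_up(servicegroup_api, source_service):
    # A cold migration needs the source nova-compute to participate
    # (resize_prep etc.), so a dead source can only leave the instance
    # stuck; failing fast here would avoid the downstream bug described.
    if not servicegroup_api.service_is_up(source_service):
        raise ComputeHostDown(
            'Compute service on host %s is down; use evacuate instead.'
            % source_service.host)
```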
*** sahid has quit IRC18:15
*** jmlowe has joined #openstack-nova18:18
openstackgerritJack Ding proposed openstack/nova master: [WIP] Preserve UEFI NVRAM variable store  https://review.openstack.org/62164618:20
*** spatel has joined #openstack-nova18:21
spatelsean-k-mooney: afternoon18:21
sean-k-mooneyo/18:21
spatelsean-k-mooney: i had a question related to the horizon dashboard Overview section, it's saying Active Instances: 36718:22
sean-k-mooneyyep18:22
spatelbut when i run "nova list" it's saying you have 280 instances total18:22
*** N3l1x has quit IRC18:22
spatelwhere that 367 came from?18:22
sean-k-mooneythat is a good question18:22
sean-k-mooneyi believe it's hitting the simple tenant usage api18:23
*** N3l1x has joined #openstack-nova18:23
spatelis it counting deleted instances too?18:23
sean-k-mooneyso it is probably showing the total number of instances that were launched18:23
sean-k-mooneylet me take a look at the dashboard for a sec18:23
spatelok18:24
*** k_mouza has joined #openstack-nova18:24
sean-k-mooneyso i just opened my local horizon - are you looking at the overview section under the project section or admin?18:26
sean-k-mooneyin the project page it should just show the active instances, not deleted ones18:26
*** k_mouza has quit IRC18:29
*** wolverineav has joined #openstack-nova18:29
*** wolverineav has quit IRC18:29
*** wolverineav has joined #openstack-nova18:29
*** slaweq has joined #openstack-nova18:30
*** dave-mccowan has joined #openstack-nova18:31
sean-k-mooneyspatel: you could run "openstack usage list" or openstack usage show --project <your project> and see if that says 36718:31
sean-k-mooneyspatel: i do not have a version of horizon that has the simple tenant usage api support turned on so i just see the dashboard that shows the currently active instances18:32
spatelhold on doing it18:35
*** slaweq has quit IRC18:36
spatelsean-k-mooney: http://paste.openstack.org/show/736782/18:36
spatelhere you go18:36
sean-k-mooneyso you were looking at the overview in the admin section18:36
sean-k-mooneywell maybe not18:37
*** wolverineav has quit IRC18:37
sean-k-mooney314 != 36718:37
*** jmlowe has quit IRC18:37
openstackgerritDan Smith proposed openstack/nova master: Only warn about not having computes nodes once in rpcapi  https://review.openstack.org/62328218:40
openstackgerritDan Smith proposed openstack/nova master: Make service.get_minimum_version_all_cells() cache the results  https://review.openstack.org/62328318:40
openstackgerritDan Smith proposed openstack/nova master: Make compute rpcapi version calculation check all cells  https://review.openstack.org/62328418:40
dansmithmriedem: ^18:40
dansmithmriedem: hopefully we can see the difference in just the logging, and measure the impact of the all-cells change18:41
*** eharney has joined #openstack-nova18:42
sean-k-mooneyspatel: how many running vms does "openstack hypervisor stats show -c running_vms -f value" return18:42
spatelrunning18:43
-spatel- [root@ostack-osa ~(keystone_admin)]# openstack hypervisor stats show -c running_vms -f value18:43
-spatel- 28318:43
sean-k-mooneyok so that is what you were expecting18:45
sean-k-mooneyit should match "openstack server list -f value -c ID | wc -l"18:45
*** wolverineav has joined #openstack-nova18:45
sean-k-mooneybut horizon is saying 36718:45
spatelI have two projects so i need to run that command in a specific project18:46
spateladmin is not giving me any data when i run "openstack server list -f value -c ID | wc -l"18:46
sean-k-mooneyum, use this one instead "openstack server list -f value -c ID --all-projects | wc -l"18:47
spatelacross all project output is "287"18:47
spatelwhen i run this "openstack server list -f value -c ID --all-projects | wc -l"18:47
sean-k-mooneyok so that will include shelved or error state vms18:48
spatelpossibly some machines are shutdown or in error state ...18:48
sean-k-mooneyit's close enough to the running vms from the hypervisor stats that i would guess horizon may be wrong18:48
sean-k-mooneyopenstack server list would include all vms except deleted instances i think by default18:49
spatelbut 367 which horizon is claiming is way way out..18:49
sean-k-mooneyyes it is18:50
sean-k-mooneyi honestly don't know why18:50
spateli assume it's counting deleted and other states also...18:50
spatelI will open BUG :)18:50
spatellets see what other people chime in18:50
sean-k-mooneyspatel: it should not include deleted18:50
spatelthat is what i am thinking18:50
sean-k-mooneyout of interest what is the result of "openstack server list -f value -c ID --deleted --all-projects | wc -l"18:51
sean-k-mooneyis it around 80 ish?18:51
*** slaweq has joined #openstack-nova18:52
mriedemdansmith: ack, we might get results by eod18:53
dansmithmriedem: aye18:53
dansmithmriedem: I just -1ed your placement client thing because I'm an ass, but I will make the change for you if you want18:54
dansmithto make myself feel better18:54
mriedemwhat assery is this18:54
spatelsean-k-mooney: ^^18:54
* dansmith goes to look up the types of assery18:54
-spatel- [root@ostack-osa ~(keystone_admin)]# openstack server list -f value -c ID --deleted --all-projects | wc -l18:54
-spatel- 33118:54
spateli am confused now :(18:54
spatelHere, i filed a bug https://bugs.launchpad.net/horizon/+bug/180725118:55
openstackLaunchpad bug 1807251 in OpenStack Dashboard (Horizon) "Horizon Overview summary showing wrong numbers " [Undecided,New]18:55
cdentI imagine there is a super upper ontology of assery18:55
*** jmlowe has joined #openstack-nova18:56
*** cdent has quit IRC18:56
*** slaweq has quit IRC18:57
sean-k-mooneyspatel:  that is strange18:58
spateldefinitely a bug, or maybe some DB cleanup stuff?18:59
sean-k-mooneyeven the other things like cpu hours are way out18:59
*** tbachman has quit IRC18:59
spatelexactly18:59
sean-k-mooneyare they showing the same time period19:00
sean-k-mooney2018-11-08 to 2018-12-0719:00
sean-k-mooneyit looks like the cli is defaulting to the last month but maybe horizon has a different default19:01
spatelon Horizon the dates default to 2018-12-05 to 2018-12-0619:01
spatel24 hours period19:01
sean-k-mooneythat does not explain how horizon is showing more usage19:01
sean-k-mooneyin fact it makes it less likely..19:01
spatelhmm19:02
sean-k-mooneyso ya sorry i can't really help more than that19:02
spateldon't worry it was good help to understand that i am not crazy...19:06
spateli thought i didn't understand or was missing something, but after you confirmed it, it seems like a bug19:06
spatelanyway i have opened a ticket so let's see19:06
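One plausible (hedged) explanation for the gap: the simple tenant usage API behind the Overview page reports every instance that existed at any point in the requested window, including since-deleted ones, while `nova list` only shows instances that exist now. A sketch of checking that from python-novaclient, assuming an authenticated keystone session and that the microversion shown is available:

```python
import datetime

from novaclient import client as nova_client


def usage_instance_count(sess, tenant_id, hours=24):
    nova = nova_client.Client('2.40', session=sess)
    end = datetime.datetime.utcnow()
    start = end - datetime.timedelta(hours=hours)
    usage = nova.usage.get(tenant_id, start, end)
    # server_usages lists each instance that ran at any point in the
    # window -- deleted ones included -- which is why this count can
    # exceed what `nova list` shows at this moment.
    return len(getattr(usage, 'server_usages', []))
```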
cfriesensean-k-mooney: do you know if anyone has ever looked at setting IRQ affinity for PCI devices?  (would only really make sense for the "dedicated" case)19:07
sean-k-mooneyi did like 4 years ago19:07
sean-k-mooneythere are two schools of thought on this: either use irqbalance or whatever to affinitize all irqs away from the cores used by vms19:08
cfriesensean-k-mooney: I'm talking about for PCI-passthrough, setting the affinity to the pCPUs used by the guest19:09
sean-k-mooneyor dynamically affinitize the irqs for a vf to the vm cpus19:09
sean-k-mooneyya i was asked to make that change a few years ago and got pushback from the redhat folks on the libvirt team.19:10
sean-k-mooneylet me see if i can find that.19:10
*** udesale has joined #openstack-nova19:11
*** igordc has quit IRC19:19
*** igordc has joined #openstack-nova19:19
sean-k-mooneyhmm it must have been an intel-only bug which apparently i can't see19:21
*** dpawlik has joined #openstack-nova19:22
*** dpawlik has quit IRC19:24
sean-k-mooneycfriesen: i assume you cant see https://bugzilla.redhat.com/show_bug.cgi?id=113566819:28
openstacksean-k-mooney: Error: Error getting bugzilla.redhat.com bug #1135668: NotPermitted19:28
cfriesensean-k-mooney: nope. :)19:29
sean-k-mooneycfriesen: when we ran this by the libvirt people they were concerned that affinitizing the interrupts to the vm cores could cause a lot of guest VM_exits to service the interrupts19:29
cfriesensean-k-mooney: enough to outweigh the preloading of the cache?19:29
openstackgerritMatt Riedemann proposed openstack/nova master: Only construct SchedulerReportClient on first access from API  https://review.openstack.org/62324619:30
openstackgerritMatt Riedemann proposed openstack/nova master: DRY up SchedulerReportClient init  https://review.openstack.org/62324719:30
*** udesale has quit IRC19:31
sean-k-mooneyso this was created on 2014-08-29 so my memory is a little fuzzy, but they wanted us to show that it would outweigh that overhead, and at the time we did not have the capacity to test it in an openstack environment to measure it19:31
sean-k-mooneyso it was dropped because of a lack of evidence to show it was a useful feature to add to libvirt19:31
sean-k-mooneywe abandoned our nova work as a result19:32
sean-k-mooneywhat everyone did agree on is that affinitizing to the same numa node was a good idea19:32
*** alex_xu has quit IRC19:32
cfriesenokay, thanks19:33
sean-k-mooneythinking about it now i think affinitizing to the emulator thread or thread pool in isolate/shared mode might be better than to the vcpus19:33
*** alex_xu has joined #openstack-nova19:34
sean-k-mooneycfriesen: did you do any testing out of interest?19:35
sean-k-mooneycfriesen: hehe oh look, here is my original mail https://www.mail-archive.com/libvir-list@redhat.com/msg100707.html19:35
cfriesenI'm actually not sure.  we had an in-house nova commit to enable pinning, but I wasn't the one that implemented it.  Kind of curious myself if they tested it. :)19:36
sean-k-mooneywow that was before i got the intel legal email footer removed from my account - that was a long time ago19:36
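For reference, a hedged sketch of the host-side half of that idea, done outside libvirt: walk a VF's MSI interrupts via sysfs and point them at a cpu list through procfs. The sysfs/procfs paths are standard Linux interfaces; the PCI address and cpu range in the usage comment are invented, and this requires root.

```python
import os


def affinitize_pci_irqs(pci_addr, cpu_list):
    """Point all MSI irqs of a PCI device at the given cpu list."""
    msi_dir = '/sys/bus/pci/devices/%s/msi_irqs' % pci_addr
    for irq in os.listdir(msi_dir):
        # smp_affinity_list accepts a human-readable list, e.g. '8-11'.
        with open('/proc/irq/%s/smp_affinity_list' % irq, 'w') as f:
            f.write(cpu_list)


# e.g. affinitize_pci_irqs('0000:81:10.1', '8-11')
# (hypothetical VF address and pinned pCPU range)
```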
*** dpawlik has joined #openstack-nova19:39
*** dpawlik has quit IRC19:44
coreycbtobias-urdin: we have a placement package now19:56
coreycbtobias-urdin: just fyi, i know you were asking for it a little while back19:57
*** wolverineav has quit IRC19:59
*** wolverineav has joined #openstack-nova20:00
*** wolverineav has quit IRC20:05
lyarwoodcoreycb: link? I'll wire that up in my puppet changes tomorrow20:11
openstackgerritChris Dent proposed openstack/nova master: Correct lower-constraints.txt and the related tox job  https://review.openstack.org/62297220:12
*** rcernin has joined #openstack-nova20:12
*** tbachman has joined #openstack-nova20:13
*** tbachman_ has joined #openstack-nova20:15
mriedemaspiers: will config drive work with sev guests?20:15
mriedemsean asked in the spec review but i didn't see an answer20:16
coreycblyarwood: placement-api is the package name20:17
*** tbachman has quit IRC20:17
*** tbachman_ has quit IRC20:20
*** david-lyle is now known as dklyle20:24
mriedemaspiers: comments inline https://review.openstack.org/#/c/609779/ but +120:25
lyarwoodcoreycb: https://review.openstack.org/623306 - loads of assumptions about the package but I'll take another look in the morning20:26
*** udesale has joined #openstack-nova20:29
coreycblyarwood: that generally looks good to me. let me know how it goes and thanks for testing.20:29
*** wolverineav has joined #openstack-nova20:33
*** wolverineav has quit IRC20:38
openstackgerritMatt Riedemann proposed openstack/nova master: Ignore MoxStubout deprecation warnings  https://review.openstack.org/62330920:38
mriedemlet's do this ^20:38
*** mriedem has quit IRC20:45
*** udesale has quit IRC20:46
*** mriedem has joined #openstack-nova20:47
*** takashin has joined #openstack-nova20:49
artomThe hell, how do I dn sync cell1?20:49
artom*db20:50
melwittnova meeting in 10 minutes20:50
*** wolverineav has joined #openstack-nova20:53
coreycbmriedem: hi, just a friendly ping to see if you can review this when you get a chance. looks like it has a +1 from another core dev: https://review.openstack.org/#/c/579004/20:55
artomAh, just pass --config-file cell1.conf to nova-manage20:55
artom(from https://docs.openstack.org/nova/pike/cli/nova-manage.html)20:55
mriedemcoreycb: it's got the star but yeah i need to spend some time on it20:58
coreycbmriedem: ok thanks in advance20:59
*** Sundar has joined #openstack-nova21:01
SundarI'd appreciate it if somebody could answer the question in: https://ask.openstack.org/en/question/117509/unable-to-update-rc-inventory-in-resource-provider/ . If openstack-discuss is a better place to pose that question, I'll bring it there.21:02
*** jmlowe has quit IRC21:16
dansmithSundar: pretty sure your url there is wrong21:22
dansmithSundar: you might want to look at the nova functional tests or the placement gabbit21:23
*** jaosorior has quit IRC21:23
dansmithSundar: you're trying to put inventory on a provider right?21:24
dansmithSundar: https://developer.openstack.org/api-ref/placement/#update-resource-provider-inventories21:24
dansmithI guess I'm wrong and the url is right, but I surely didn't think that was how that worked21:24
dansmithbecause: atomicity21:25
dansmithah right, you can PUT on the provider itself to do multiples21:25
dansmithhttps://developer.openstack.org/api-ref/placement/?expanded=update-resource-provider-inventories-detail#update-resource-provider-inventories21:26
mriedemor just, openstack --debug resource provider inventory set ...21:26
mriedemhttps://docs.openstack.org/osc-placement/latest/index.html21:26
dansmithmriedem: yeah, that's how I'd be doing it.. I figured he had some reason for doing it with curl21:27
Sundardansmith: mriedem: I use curl by default. And I thought whatever works via openstack commands should also work via curl. Apparently not. :) Anyways, the openstack command worked. Thanks!21:35
dansmithSundar: of course you _can_ use curl, but you get into situations like you're in now where you get back a 400 because you typo'd something and it's hard to tell what that is :)21:36
mriedemit should work via curl21:36
*** tbachman has joined #openstack-nova21:36
mriedemright21:36
dansmithit's why we don't all write in machine language in 201821:36
SundarCurl tends to be much faster than openstack commands, in my experience.21:37
dansmithSundar: you asked on Nov 21 and got an answer on Dec 6.. seems like pretty terrible performance to me :)21:38
mriedemit fails very fast yes :)21:38
efried_cya_janSundar: I just looked at the gabbits, and it appears as though in order for that PUT to work, the inventory of that resource class has to exist already.21:38
efried_cya_janI.e. the PUT is designed to modify an existing inventory, not create a new one.21:38
mriedemPUT generally means updating something that exists21:38
efried_cya_janI think edleafe might disagree on that one21:38
Sundardansmith: I had a very long retry setting :)21:39
efried_cya_janSundar: let me get you a link to the gabbit...21:39
edleafeefried_cya_jan: damn straight21:39
edleafePUT replaces21:39
efried_cya_janSundar: https://github.com/openstack/placement/blob/master/placement/tests/functional/gabbits/inventory.yaml#L23421:40
efried_cya_janedleafe: In which case this operation is busted :(21:40
mriedemgaudenz: you need to set this to true in the devstack-plugin-ceph job https://github.com/openstack/tempest/blob/b62baf7c16d4609ea92e2ffc974e2f3a0b1cec80/tempest/config.py#L89621:40
mriedemgaudenz: and make that patch depend on the nova chnage21:40
mriedemand then you should be good21:40
mriedemgaudenz: right in here https://github.com/openstack/devstack-plugin-ceph/blob/39de6df04130cf2f221fb5ba2a9b5ff597de332a/devstack/plugin.sh#L9821:41
efried_cya_janedleafe, cdent: It appears we're missing a gabbi test for a successful PUT /rps/{u}/inventories/{rc}21:41
efried_cya_jan...or it's not in that file21:41
mriedemgaudenz: commented in the nova change on the steps21:41
efried_cya_janMaybe Sundar would like to propose that :)21:41
efried_cya_janokay, I'm going away again. o/21:42
efried_cya_janOh, edleafe, thanks for covering the n-sch summary in the meeting.21:42
Sundarefried_cya_jan: We are still seeing you in December. :/ That aside, https://developer.openstack.org/api-ref/placement/#update-resource-provider-inventories documents only PUT.21:42
efried_cya_janSundar: Yes, the document is not clear on the fact that the inventory needs to exist first.21:43
mriedemgaudenz: btw, i'm pretty sure cern was already working on this...21:43
efried_cya_janSundar: But the error message on that 400 is.21:43
dansmithSundar: it's PUT on the provider that works regardless21:43
mriedemgaudenz: melwitt: https://review.openstack.org/#/c/594273/21:43
edleafeefried_cya_jan: I prepared that thinking that you were going to be a man of your word21:43
mriedemso something is fishy21:43
dansmithSundar: PUT on a provider/class only works after it has been created via PUT on provider21:43
Sundarmriedem: Sorry, didn't get your reference to the devstack-plugin-ceph job. Why is that involved here?21:44
melwittmriedem: whoa, huh21:44
Sundardansmith: Makes sense. I just went by the doc.21:44
mriedemhttps://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes21:44
mriedemSundar: that's for gaudenz, not you21:44
efried_cya_janSundar: Right, as dansmith is saying, if you do PUT /rps/{u}/inventories with a payload like  {'resource_provider_generation': $X, 'inventories': {$RC: {'total': $N, ...}}} then subsequent PUTs on /rps/{u}/inventories/{rc} ought to work.21:44
* efried_cya_jan really walks away.21:45
melwitto/ efried_cya_jan21:45
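Putting the two-step flow Sundar hit into concrete (and hedged) shape: first PUT the provider's whole inventories document to create the inventory, then PUT the single resource class to update it. The endpoint paths follow the placement API reference linked above; the endpoint URL, token, resource class, and generation values are all illustrative.

```python
import requests

PLACEMENT = 'http://placement.example.com'  # hypothetical endpoint
RP_UUID = '00000000-0000-0000-0000-000000000000'  # made-up provider uuid
HEADERS = {
    'X-Auth-Token': 'TOKEN',  # a real token would come from keystone
    'Content-Type': 'application/json',
}

# 1) Create the inventory by PUTting the provider's full inventories list.
resp = requests.put(
    '%s/resource_providers/%s/inventories' % (PLACEMENT, RP_UUID),
    headers=HEADERS,
    json={'resource_provider_generation': 0,
          'inventories': {'CUSTOM_FPGA': {'total': 4}}})
resp.raise_for_status()
gen = resp.json()['resource_provider_generation']

# 2) Only now does the single-class URL work, since (per the gabbits
#    discussed above) it replaces an existing record rather than creating one.
requests.put(
    '%s/resource_providers/%s/inventories/CUSTOM_FPGA' % (PLACEMENT, RP_UUID),
    headers=HEADERS,
    json={'resource_provider_generation': gen, 'total': 8},
).raise_for_status()
```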
melwittmriedem: thanks for finding that. good to compare/consolidate on this cc gaudenz21:46
gaudenzmriedem: Thanks, for the links. Will have a look.21:47
mriedemi knew about the cern one originally, i thought this was that21:47
mriedemdansmith: you got results https://review.openstack.org/#/c/623282/ it works \o/21:47
mriedemnot sure if you want to change that variable name21:47
mriedemi suggest LOGGED_NO_COMPUTES21:47
gaudenzmriedem, melwitt: I cooked up something as a blueprint just now: https://blueprints.launchpad.net/nova/+spec/extend-volume-net does this look good?21:47
mriedemgaudenz: heh, well, there is already a blueprint for this...21:48
mriedemhttps://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes21:48
mriedemso...we might as well just mark one superseded (the new one)21:48
mriedembut need you and jose to sort out what the correct solution is here21:48
melwittI can do the launchpad updates21:50
melwittgaudenz: yeah, so just use the already existing https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes and collaborate with jose on the approach21:50
gaudenzNo problem. At least I did not invest too much time in writing the blueprint. I'm fine with marking mine as superseded.21:50
gaudenzI'll look at jose's patch and contact him. Thanks again for helping out.21:51
mriedemdansmith: unit test failures in https://review.openstack.org/#/c/623283/121:54
mriedemhttp://logs.openstack.org/83/623283/1/check/openstack-tox-py27/d6a560e/testr_results.html.gz21:54
melwittgaudenz: sounds good, thanks21:55
*** jmlowe has joined #openstack-nova21:55
mriedemefried_cya_jan: before i dig into the fewer placement calls from _ensure patch, you've addressed jaypipes' ironic concerns from https://review.openstack.org/#/c/615677/9/nova/compute/resource_tracker.py@812 ?21:57
mriedemi assume so, but want to make sure i'm not going to waste time21:57
*** spatel has quit IRC22:08
dansmithmriedem: hmm, okay, I switched the order at the last minute and then an hour later realized that was probably not the best idea, so that's probably why22:11
dansmiththey were all passing before I did that22:11
*** manjeets_ is now known as manjeets22:28
*** rodolof has quit IRC22:37
mriedemcoreycb: your wish is my command https://review.openstack.org/#/c/579004/22:39
*** spatel has joined #openstack-nova23:04
*** lbragstad has quit IRC23:08
*** spatel has quit IRC23:09
*** lbragstad has joined #openstack-nova23:09
openstackgerritGaudenz Steinlin proposed openstack/nova master: Extend volume for libvirt network volumes (RBD)  https://review.openstack.org/61303923:13
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: Add functional regression test for bug 1794996  https://review.openstack.org/62334823:23
openstackbug 1794996 in OpenStack Compute (nova) rocky "_destroy_evacuated_instances fails and kills n-cpu startup if lazy-loading flavor on a deleted instance" [High,Triaged] https://launchpad.net/bugs/179499623:23
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: Fix InstanceNotFound during _destroy_evacuated_instances  https://review.openstack.org/62334923:23
*** mlavalle has quit IRC23:36
openstackgerritMerged openstack/nova master: Remove ironic/pike note from *_allocation_ratio help  https://review.openstack.org/62015423:44
openstackgerritMerged openstack/nova master: Change the default values of XXX_allocation_ratio  https://review.openstack.org/60280323:44
*** N3l1x has quit IRC23:53
openstackgerritMatt Riedemann proposed openstack/nova stable/queens: Add functional regression test for bug 1794996  https://review.openstack.org/62335423:59
openstackbug 1794996 in OpenStack Compute (nova) rocky "_destroy_evacuated_instances fails and kills n-cpu startup if lazy-loading flavor on a deleted instance" [High,In progress] https://launchpad.net/bugs/1794996 - Assigned to Matt Riedemann (mriedem)23:59
openstackgerritMatt Riedemann proposed openstack/nova stable/queens: Fix InstanceNotFound during _destroy_evacuated_instances  https://review.openstack.org/62335523:59
mriedemomg23:59
