Thursday, 2018-12-06

sean-k-mooney	specifcially use this feature https://review.openstack.org/#/c/603352/	00:00
tssurya	sean-k-mooney: yea thanks, did read the full spec yet but got the idea	00:03
tssurya	so it would be something like adding/removing to an aggregate when we disable/enable the service I guess	00:04
tssurya	and the fourth solution was traits I see	00:05
sean-k-mooney	yep and we then jsut alway include member_of=!<uuid of disbaled aggreage>	00:05
sean-k-mooney	in the placement request	00:05
tssurya	sean-k-mooney: yep makes sense	00:06
sean-k-mooney	we can use a uuid5 to generate a sable uuid for the aggragte but other services can create there own	00:06
tssurya	yea, as long as the aggregate doesn't get stale (as in disruption during enable which fails to remove it from the aggregate)/we keep it in sync should work well.	00:07
tssurya	which is a corner case	00:08
sean-k-mooney	tssurya: well we could take care of that with a periodic task to fix it	00:08
sean-k-mooney	e.g we udate it imendiatly form the api but heal it later if the update got lost	00:09
*** rodolof has joined #openstack-nova		00:09
tssurya	sean-k-mooney: after looking at the RT's periodic task, I am so down with periodic tasks as of now :D because of the overhead when having several computes	00:09
tssurya	but yea as long as the interval is not so frequent, should be ok	00:10
sean-k-mooney	haha ya but this would be on i guess the conductor	00:10
sean-k-mooney	or supper conductor	00:10
sean-k-mooney	the node status is in i think both the api and cell db so we dont need to run this on every compute node	00:11
tssurya	node status is only in the cell db's	00:11
sean-k-mooney	ok so but it could run on the cell conductor then	00:11
tssurya	yea could be	00:12
sean-k-mooney	an you could set the interval to a week if you wanted and this time supprot disabling it right out of the box by setting to 0	00:12
*** _alastor_ has joined #openstack-nova		00:12
tssurya	but not sure if its an upcall when updating the api aggregate table frm the cell conductor but yea these are implementation details	00:13
sean-k-mooney	if you trust it to not get out of sync our you will fix it later then no need to run it	00:13
sean-k-mooney	tssurya: oh im talking about placemetn aggreagtes not nova host aggrats by the way	00:13
tssurya	sean-k-mooney: oh okay!	00:14
tssurya	since you saud "nova" adds hosts, got confused	00:15
sean-k-mooney	ya i should have said nova update the placment aggragte with the compute node RP uuid	00:15
tssurya	right yea	00:16
*** gyee has quit IRC		00:17
*** _alastor_ has quit IRC		00:17
*** wolverineav has quit IRC		00:27
*** spatel has joined #openstack-nova		00:29
*** wolverineav has joined #openstack-nova		00:30
*** spatel has quit IRC		00:34
*** jaosorior has quit IRC		00:37
*** Swami has quit IRC		00:52
*** Belgar81 has joined #openstack-nova		00:55
*** brinzhang has joined #openstack-nova		01:09
*** tommylikehu_ has joined #openstack-nova		01:10
*** k_mouza has joined #openstack-nova		01:13
*** k_mouza has quit IRC		01:17
jaypipes	cfriesen: you have awoken the kraken. but unfortunately, the kraken is too tired from playing tennis to fight tonight :) so, will chat tomorrow about it.	01:28
*** markvoelker has quit IRC		01:33
*** betherly has joined #openstack-nova		01:40
*** sapd1 has quit IRC		01:40
*** sapd1 has joined #openstack-nova		01:40
*** betherly has quit IRC		01:44
*** david-lyle has joined #openstack-nova		01:48
*** manjeets_ has joined #openstack-nova		01:49
*** itlinux has joined #openstack-nova		01:49
*** dklyle has quit IRC		01:51
*** manjeets has quit IRC		01:51
*** _alastor_ has joined #openstack-nova		02:13
*** mschuppert has quit IRC		02:15
*** mrsoul has quit IRC		02:15
*** Dinesh_Bhor has joined #openstack-nova		02:16
*** _alastor_ has quit IRC		02:18
*** spatel has joined #openstack-nova		02:22
*** cfriesen has quit IRC		02:22
openstackgerrit	Jack Ding proposed openstack/nova master: Preserve UEFI NVRAM variable store https://review.openstack.org/621646	02:23
openstackgerrit	Guo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication https://review.openstack.org/623120	02:33
*** hongbin has joined #openstack-nova		02:35
*** mhen has quit IRC		02:36
*** mhen has joined #openstack-nova		02:37
*** wolverineav has quit IRC		02:40
*** wolverineav has joined #openstack-nova		02:41
openstackgerrit	Guo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication https://review.openstack.org/623120	02:44
*** wolverineav has quit IRC		02:46
openstackgerrit	Guo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication https://review.openstack.org/623120	02:46
*** betherly has joined #openstack-nova		02:51
openstackgerrit	Guo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication https://review.openstack.org/623120	02:51
*** imacdonn has quit IRC		02:53
*** imacdonn has joined #openstack-nova		02:53
*** betherly has quit IRC		02:55
openstackgerrit	Guo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication https://review.openstack.org/623120	03:11
*** Dinesh_Bhor has quit IRC		03:15
*** wolverineav has joined #openstack-nova		03:21
*** rodolof has quit IRC		03:23
*** Dinesh_Bhor has joined #openstack-nova		03:23
*** wolverineav has quit IRC		03:26
openstackgerrit	Zhenyu Zheng proposed openstack/nova master: Handle tags in _bury_in_cell0 https://review.openstack.org/621856	03:26
*** wolverineav has joined #openstack-nova		03:30
*** psachin has joined #openstack-nova		03:32
*** wolverineav has quit IRC		03:34
*** tssurya has quit IRC		03:42
openstackgerrit	Takashi NATSUME proposed openstack/nova stable/rocky: Add a bug tag for nova doc https://review.openstack.org/623130	03:43
*** wolverineav has joined #openstack-nova		03:46
*** rodolof has joined #openstack-nova		03:49
*** wolverineav has quit IRC		04:02
*** wolverineav has joined #openstack-nova		04:03
*** wolverineav has quit IRC		04:07
openstackgerrit	Guo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication https://review.openstack.org/623120	04:16
*** rodolof has quit IRC		04:23
*** betherly has joined #openstack-nova		04:32
*** hongbin has quit IRC		04:33
*** janki has joined #openstack-nova		04:34
*** betherly has quit IRC		04:37
openstackgerrit	Guo Jingyu proposed openstack/nova-specs master: Proposal for a safer noVNC console with password authentication https://review.openstack.org/623120	04:52
*** lpetrut has joined #openstack-nova		04:52
*** lpetrut has quit IRC		05:36
*** Dinesh_Bhor has quit IRC		05:36
*** Dinesh_Bhor has joined #openstack-nova		05:42
*** wolverineav has joined #openstack-nova		05:43
*** spatel has quit IRC		05:46
*** ratailor has joined #openstack-nova		05:46
*** wolverineav has quit IRC		05:50
*** Dinesh_Bhor has quit IRC		05:57
*** Dinesh_Bhor has joined #openstack-nova		06:12
*** rambo_li has joined #openstack-nova		06:16
*** Dinesh_Bhor has quit IRC		06:18
*** sridharg has joined #openstack-nova		06:20
*** Dinesh_Bhor has joined #openstack-nova		06:36
*** rambo_li has quit IRC		06:42
openstackgerrit	Takashi NATSUME proposed openstack/nova master: api-ref: Body verification for the lock action https://review.openstack.org/622835	06:58
openstackgerrit	Zhenyu Zheng proposed openstack/nova master: Handle tags in _bury_in_cell0 https://review.openstack.org/621856	07:00
*** wolverineav has joined #openstack-nova		07:05
*** wolverineav has quit IRC		07:10
*** dpawlik has joined #openstack-nova		07:28
openstackgerrit	Merged openstack/nova master: Update mailinglist from dev to discuss https://review.openstack.org/621827	07:41
openstackgerrit	Merged openstack/nova master: modify the avaliable link https://review.openstack.org/616905	07:41
*** rcernin has quit IRC		07:56
*** pcaruana has joined #openstack-nova		07:58
*** pcaruana is now known as muttley		07:58
*** maciejjozefczyk has joined #openstack-nova		08:01
*** maciejjozefczyk has quit IRC		08:01
*** maciejjozefczyk has joined #openstack-nova		08:02
*** takashin has left #openstack-nova		08:03
*** rcernin has joined #openstack-nova		08:03
*** lpetrut has joined #openstack-nova		08:06
*** slaweq has joined #openstack-nova		08:06
*** awalende has joined #openstack-nova		08:13
*** slaweq has quit IRC		08:15
*** helenafm has joined #openstack-nova		08:16
*** _alastor_ has joined #openstack-nova		08:16
*** ralonsoh has joined #openstack-nova		08:21
*** _alastor_ has quit IRC		08:21
*** rcernin has quit IRC		08:33
*** sahid has joined #openstack-nova		08:33
*** brinzhang has quit IRC		08:38
*** brinzhang has joined #openstack-nova		08:38
*** tommylikehu_ has quit IRC		08:40
*** Dinesh_Bhor has quit IRC		08:41
*** mhen has quit IRC		08:54
*** Dinesh_Bhor has joined #openstack-nova		08:54
*** lpetrut has quit IRC		08:57
*** mhen has joined #openstack-nova		09:00
*** markvoelker has joined #openstack-nova		09:01
*** awalende has quit IRC		09:01
*** awalende has joined #openstack-nova		09:02
*** awalende has quit IRC		09:06
*** awalende has joined #openstack-nova		09:08
*** maciejjozefczyk has quit IRC		09:16
gibi	stephenfin: regarding shorted notification payload, would you like to see the samples stored in nova sorted only or also the json emitted on the message bus?	09:16
*** k_mouza has joined #openstack-nova		09:18
*** k_mouza has quit IRC		09:22
*** k_mouza has joined #openstack-nova		09:23
gibi	mdbooth: in the failure in nova.tests.functional.regressions.test_bug_1550919.LibvirtFlatEvacuateTest in http://logs.openstack.org/22/606122/7/check/nova-tox-functional/1f3126b/testr_results.html.gz there are 9 seconds gap between '[nova.virt.libvirt.driver] Creating image' and the first polling of the server state. So someting is definitely slow there _before_ even the test try to wait for 5 seconds	09:23
gibi	to see the server is ACTIVE state	09:23
mdbooth	gibi: looking	09:24
mdbooth	gibi: That's one of the ones which looked 'generally unhappy' I think.	09:24
gibi	mdbooth: unfortunately we dont have such timing data from successful runs :/	09:25
mdbooth	gibi: Or DEBUG logs :(	09:26
gibi	does 'Creating image' means that in these test we really generate the root fs for the instance?	09:26
mdbooth	gibi: No. The disk creation is stubbed to just touch the file.	09:27
*** ttsiouts has joined #openstack-nova		09:27
mdbooth	However, it does execute _resolve_driver_format in imagebackend, which executes qemu-img.	09:27
mdbooth	I have a patch locally to mock that out, but on my unloaded system it doesn't make a massive difference.	09:28
mdbooth	I can see that it's about 3 seconds of wall clock time, although multithreaded.	09:28
*** dtantsur\|afk is now known as dtantsur		09:29
gibi	mdbooth: I see. Honestly I don't know what could be the real problem. In the run http://logs.openstack.org/22/606122/7/check/nova-tox-functional/1f3126b/testr_results.html.gz I see missing DB table exceptions as well that reminds me the race we fixed with cdent last week in the placement db fixture. Maybe that race had other side effects. However we only fixed that in the split out placement repo.	09:32
mdbooth	gibi: Like I said yesterday, I suspect it's just a canary: the first thing to die when conditions get bad.	09:34
mdbooth	I could mock another couple of things out, and increase the timeout.	09:34
*** markvoelker has quit IRC		09:34
gibi	mdbooth: could very well be it. Still it woul be nice to know why 5 seconds in not enough in these tests. But I know that we don't have data to figure that out	09:35
mdbooth	gibi: Why don't we have debug enabled, btw?	09:35
gibi	mdbooth: I'm not sure, maybe we don't want to store that much of logs.	09:36
*** maciejjozefczyk has joined #openstack-nova		09:36
gibi	mdbooth: but if you can propose a patch that turns the debug log on, I can support that. and we will see if there are other oppinions	09:37
*** ondrejme has quit IRC		09:38
gibi	stephenfin: I can try to propose the sorting patch and see if people start puking	09:42
gibi	stephenfin: anyhow, thanks for the reviews on those patches	09:43
mdbooth	gibi: Thanks for spending time on this.	09:43
gibi	mdbooth: I mysteries :)	09:43
* mdbooth too :)		09:44
mdbooth	Sometimes, anyway.	09:44
*** derekh has joined #openstack-nova		09:44
*** cdent has joined #openstack-nova		09:54
*** tssurya has joined #openstack-nova		09:59
*** ttsiouts has quit IRC		10:00
*** ttsiouts has joined #openstack-nova		10:01
*** ttsiouts has quit IRC		10:05
*** brinzhang has quit IRC		10:08
*** yan0s has joined #openstack-nova		10:08
*** brinzhang has joined #openstack-nova		10:08
yan0s	I have a question about nova policies	10:10
yan0s	are variables "compute:create", "compute:get" etc deprecated?	10:11
yan0s	I see they are not mentioned in the latest documentation	10:12
yan0s	https://docs.openstack.org/nova/rocky/configuration/policy.html	10:12
yan0s	also testing some of them I think they affect access rights	10:13
yan0s	only their "os_compute_api" equivalents work	10:13
kashyap	gibi: Heya, Zuul was "-2" on this: https://review.openstack.org/#/c/620327/. Guess if I do a 'rechek', I'll "lose" all the ACKs?	10:19
gibi	kashyap: I don't think so	10:20
kashyap	Seems to be a 'neutron-grenade' failure	10:20
gibi	kashyap: if you checked the fauilre and seems unrealated to your patch then feel free to recheck	10:20
kashyap	Yeah, they're unrelated; I can't spot anything related to my patch: http://logs.openstack.org/27/620327/3/gate/neutron-grenade/8ffe31b/job-output.txt.gz	10:20
gibi	kashyap: yeah the cells-v1 test failure seems unrelated	10:23
gibi	kashyap: and I dont see anything suspicious in the grenade log either	10:24
kashyap	Thanks! I hit a 'recheck'	10:24
gibi	kashyap: btw, you only loose the +W if you need to rebase	10:24
* kashyap hates to do mindless rechecks; so as not to waste the resources		10:24
kashyap	gibi: Ah, noted.	10:24
*** erlon has joined #openstack-nova		10:31
*** markvoelker has joined #openstack-nova		10:31
*** k_mouza has quit IRC		10:34
*** ttsiouts has joined #openstack-nova		10:34
*** mvkr has joined #openstack-nova		10:36
*** Dinesh_Bhor has quit IRC		10:37
*** maciejjozefczyk has quit IRC		10:45
*** maciejjozefczyk has joined #openstack-nova		10:46
lyarwood	mdbooth: https://review.openstack.org/#/c/619804/ - I know you're busy with your own func test hell but can you add this to your queue today, happy to close it out if it isn't useful.	10:48
*** ttsiouts has quit IRC		10:50
mdbooth	lyarwood: Oh, nice. Does it work?	10:50
*** ttsiouts has joined #openstack-nova		10:50
mdbooth	lyarwood: I see you're using it in the patch above.	10:50
lyarwood	mdbooth: I was, but I've dropped the patch above now	10:50
lyarwood	mdbooth: so I either rebase this on your stuff or kill it	10:51
lyarwood	mdbooth: and it works, I think.	10:51
*** Belgar81 has quit IRC		10:54
*** k_mouza has joined #openstack-nova		10:55
*** ttsiouts has quit IRC		10:55
*** markvoelker has quit IRC		11:05
*** awalende has quit IRC		11:15
*** k_mouza has quit IRC		11:18
*** awalende has joined #openstack-nova		11:20
*** ttsiouts has joined #openstack-nova		11:20
*** awalende has quit IRC		11:24
openstackgerrit	Merged openstack/nova master: Clean up cpu_shared_set config docs https://review.openstack.org/614864	11:28
openstackgerrit	Merged openstack/nova master: Delete NeutronLinuxBridgeInterfaceDriver https://review.openstack.org/616995	11:29
*** tbachman has quit IRC		11:33
*** ttsiouts has quit IRC		11:43
*** ttsiouts has joined #openstack-nova		11:44
*** ttsiouts has quit IRC		11:48
openstackgerrit	Surya Seetharaman proposed openstack/nova master: Add DownCellFixture https://review.openstack.org/614810	11:53
openstackgerrit	Surya Seetharaman proposed openstack/nova master: API microversion 2.68: Handles Down Cells https://review.openstack.org/591657	11:53
*** awalende has joined #openstack-nova		11:53
*** awalende has quit IRC		11:58
*** arches has joined #openstack-nova		12:01
openstackgerrit	Chris Dent proposed openstack/nova master: Correct lower-constraints.txt and the related tox job https://review.openstack.org/622972	12:04
*** arches has left #openstack-nova		12:05
*** psachin has quit IRC		12:08
*** awalende has joined #openstack-nova		12:09
*** sambetts_ has quit IRC		12:13
*** sambetts_ has joined #openstack-nova		12:15
*** _alastor_ has joined #openstack-nova		12:17
*** k_mouza has joined #openstack-nova		12:20
*** _alastor_ has quit IRC		12:23
*** k_mouza has quit IRC		12:25
openstackgerrit	Guo Jingyu proposed openstack/nova master: Add rfb.VNC support for novncproxy https://review.openstack.org/622336	12:25
*** sambetts_ has quit IRC		12:26
*** mriedem has joined #openstack-nova		12:27
*** ttsiouts has joined #openstack-nova		12:28
*** sambetts_ has joined #openstack-nova		12:28
*** awalende has quit IRC		12:29
*** erlon has quit IRC		12:29
*** k_mouza has joined #openstack-nova		12:30
openstackgerrit	Zhenyu Zheng proposed openstack/nova master: Handle tags in _bury_in_cell0 https://review.openstack.org/621856	12:33
*** ttsiouts has quit IRC		12:40
*** ttsiouts has joined #openstack-nova		12:41
*** udesale has joined #openstack-nova		12:41
*** erlon has joined #openstack-nova		12:43
*** brinzhang has quit IRC		12:44
*** ratailor has quit IRC		12:45
*** ttsiouts has quit IRC		12:46
*** awalende has joined #openstack-nova		12:50
*** betherly has joined #openstack-nova		12:52
*** eharney has quit IRC		12:54
*** arches has joined #openstack-nova		12:54
*** ttsiouts has joined #openstack-nova		12:58
mriedem	tssurya: belmoreira: i'm not sure what i'm missing here https://bugs.launchpad.net/nova/+bug/1805989	12:58
openstack	Launchpad bug 1805989 in OpenStack Compute (nova) "Weight policy to stack/spread instances and "max_placement_results"" [Undecided,New]	12:58
*** betherly has quit IRC		13:01
tssurya	mriedem: I guess from our prespective since we don't use any scheduler fitlers (except the Compute Filter), as opposed to a scenario where placement would return all results as in all possible available resources on which the weights would be applied versus what happens now when we do this only the max_placement_results number of available resources (hosts)	13:03
tssurya	would effect the way packing/spreading is done	13:03
mriedem	that's working as designed....	13:03
mriedem	max_placement_results is just a front-end filter	13:04
sean-k-mooney	tssurya: you dont use any filetre so you dont use sriov/pci passthrough/cpu pinning/hugepages or numa in your cloud	13:04
mriedem	like i said in the bug, even before placement, if you have 1k hosts, if the filters narrow that down to 10, then only 10 get weighed	13:04
mriedem	sean-k-mooney: no they don't,	13:04
mriedem	oh well,	13:04
mriedem	nvm	13:04
mriedem	b/c of cells v1 there were several things they couldn't support before, like affinity	13:04
mriedem	idk about pci/numa	13:05
sean-k-mooney	this seams kindof like a duplicate of https://bugs.launchpad.net/nova/+bug/1805984	13:05
openstack	Launchpad bug 1805984 in OpenStack Compute (nova) "Placement is not aware of disable compute nodes" [Wishlist,Triaged]	13:05
tssurya	sean-k-mooney: we use pci passthrough, but not that much (just for one cell)	13:05
mriedem	sean-k-mooney: that is definitely a different problem	13:05
mriedem	that bug is that placement is returning only disabled computes	13:06
mriedem	because cern sets max_placement_results very low	13:06
sean-k-mooney	yes	13:06
mriedem	this other one is different	13:06
mriedem	placement has no relationship to weighers in nova-scheduler	13:06
sean-k-mooney	it is a different problem but they are both because the limiting is done in placmeent not after teh filters	13:06
mriedem	for the weigher bug it doesn't matter	13:07
mriedem	filters are before weighers,	13:07
mriedem	so if placement filtered out the hosts or the scheduler filters did, it doesn't matter	13:07
mriedem	the weighers are only going to weigh what they get	13:07
mriedem	so they will spread/pack across whatever comes through the filters	13:07
sean-k-mooney	it maters if you have lost of disabled host in the limited set of result you got form plamcnet	13:07
sean-k-mooney	it will skew the weighing by reducing the set of results	13:08
tssurya	mriedem: exactly for us when not using "max_placement_results" filters would give us almost all the hosts from a cell	13:08
*** muttley has quit IRC		13:08
mriedem	tssurya: yeah, so IMO bug 1805989 is not really a bug, it's working as designed	13:09
openstack	bug 1805989 in OpenStack Compute (nova) "Weight policy to stack/spread instances and "max_placement_results"" [Undecided,New] https://launchpad.net/bugs/1805989	13:09
tssurya	but yea probably this iss not a bug its design thing	13:09
mriedem	bug 1805984 is definitely a problem for which we have solutions	13:09
openstack	bug 1805984 in OpenStack Compute (nova) "Placement is not aware of disable compute nodes" [Wishlist,Triaged] https://launchpad.net/bugs/1805984	13:09
sean-k-mooney	that said it is working as desinged as mriedem said. that desing my not be desireable and we might want to consider how to change it	13:09
gibi	mriedem: I saw your comment on the notification deprecation patch. I have to organize my thoughts formulate an opinion	13:09
mriedem	what we need to do is get CERN to the point that they don't have to workaround perf issues by setting max_placement_results to 10	13:10
mriedem	when they have 14K hosts	13:10
mriedem	gibi: heh ok :)	13:10
mriedem	no rush	13:10
gibi	mriedem: yeah, I will take my time	13:10
gibi	:)	13:10
mriedem	gibi: it's a bit depressing huh?	13:10
tssurya	mriedem: sure its a design thing, we just laid it out in case there were ideas	13:10
tssurya	mriedem: yea perf stuff is still going on	13:10
tssurya	hopefully we won't need to set "max_placement_results" to a low number then	13:11
gibi	mriedem: I understand that the original goals of that work might not be applicable in the today situaion on OpenStack	13:11
sean-k-mooney	tssurya: the placment randomisation was how we wanted peole to enable spreading behavior	13:11
*** tbachman has joined #openstack-nova		13:12
mriedem	tssurya: ok i marked the bug as invalid (working as designed)	13:12
sean-k-mooney	so if you had a 1000 hosts that could fit the request and you requested 50 then with the randomisation enbled you would get a random 50 out of that 1000 for the filters and weigher to select form	13:12
*** arches has quit IRC		13:13
sean-k-mooney	tssurya: do you rember what you set currently and what is that value as a propotion of your average cell size	13:14
*** k_mouza_ has joined #openstack-nova		13:14
mriedem	average cell size is 200 at cern	13:14
tssurya	sean-k-mooney: 10 versus 800 in normal scenario	13:14
mriedem	i think they set max_placement_results to 10	13:14
mriedem	800? i thought it was 200 on average	13:14
mriedem	with about 72-74 cells?	13:15
tssurya	mriedem: yea 200 for special project to cell mappings, on an average for normal users we set aside 5 to 7 default cells	13:15
tssurya	each cell having 200	13:15
mriedem	ok	13:15
mriedem	as far as i know, https://bugs.launchpad.net/nova/+bug/1737465 is still the biggest perf issue	13:16
openstack	Launchpad bug 1737465 in OpenStack Compute (nova) "[cellv2] the performance issue of cellv2 when creating 500 instances concurrently" [Medium,Confirmed]	13:16
sean-k-mooney	ok so could you increase the limit to say 10% of the host that tenat can be expect to selct form	13:16
sean-k-mooney	e.g. 80 in this case?	13:16
*** k_mouza has quit IRC		13:17
mriedem	also, as far as i know, cern is not yet grouping cells via host aggregate	13:17
mriedem	or are you?	13:17
*** rcernin has joined #openstack-nova		13:17
tssurya	we model cells as aggregates	13:17
mriedem	because of https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#tenant-isolation-with-placement right>	13:17
tssurya	as in every cell is an aggregate	13:17
mriedem	?	13:17
mriedem	ok	13:17
tssurya	yea because of the pre-filter	13:18
mriedem	so, really,	13:18
jaypipes	mriedem: wait, there's a bug in shelve code?!	13:18
sean-k-mooney	tssurya: so you then use the teanay affinity filter to map tenants to thos aggreates and therefor to cells?	13:18
mriedem	jaypipes: working as designed	13:18
*** awalende has quit IRC		13:18
tssurya	sean-k-mooney: yea	13:18
mriedem	tssurya: if the pre-filter is working, then the max_placement_results probably doesn't need to be so low...	13:18
jaypipes	mriedem: I only have time to argue with cfriesen today, so I'll wait til he's online. :P	13:19
sean-k-mooney	mriedem: that is what i was wondering too	13:19
tssurya	mriedem: surely it need not be as low as 10, but even with it being 10 we have 5 to 8 seconds scheduling time	13:19
mriedem	because if my project is mapped to a cell with ~200 hosts, then filtering on 200 hosts shouldn't be so bad	13:19
tssurya	the time it would tkae to gather all those host states , well we are still trying to check iut why its 5 to 8 seconds	13:19
tssurya	out*	13:20
sean-k-mooney	tssurya: when you say schduiling are you refering to just the time it take for the filter schduler or the time the vm is in that sate	13:20
mriedem	tssurya: do you know if this is set on the computes and the scheduler nodes? https://docs.openstack.org/nova/latest/configuration/config.html#filter_scheduler.track_instance_changes	13:20
tssurya	time for the whole boot until select destinations is done fully	13:20
jaypipes	mriedem: I think it's pretty obvious I wouldn't support adding disabled flags to provider records, yes?	13:20
mriedem	jaypipes: yes, i could find you my reply to that	13:21
mriedem	which was probably not very nice	13:21
jaypipes	mriedem: yes, I saw it.	13:21
mriedem	and i hope cfriesen will forgive me	13:21
*** muttley has joined #openstack-nova		13:21
jaypipes	mriedem: BTW, it's possible to do something like this:	13:21
tssurya	mriedem: not sure let me check	13:21
sean-k-mooney	jaypipes: would you support usign a placement aggrage for disabled compute nodes	13:21
sean-k-mooney	jaypipes: and then use the new member_of:!<aggreage> feature	13:22
jaypipes	a) request max_placement_records. get those records. b) do a filter() against those provider UUIDs and the set of disabled compute services (would need to grab the compute node UUID, not the hostname, though), and c) if len(a) < len(b), request more from placement	13:22
mriedem	tssurya: track_instance_changes enables the computes to rpc broadcast to the schedules the information about the instances running on them (the hosts) and then the scheduler workers cache that information so that during a scheduling request the scheduler doesn't need to iterate all of the hosts (via db) to get their current instance list	13:23
jaypipes	sean-k-mooney: sure, though forbidden aggregates are not approved spec yet...	13:23
mriedem	tssurya: it's not recommended for split MQ because the computes would be disconnected from the schedulers (so the cast goes in the trash)	13:23
mriedem	jaypipes: i've been +2 on that forbidden aggregates spec for a couple of weeks now	13:23
jaypipes	sean-k-mooney: the solution should remain entirely on the nova side, though, IMHO, which is why I recommend the approach above as a method to take if NoValidHosts is encountered.	13:23
sean-k-mooney	jaypipes: true but its less of an abuse of the placment api and is just as simple to manage as adding or removing a trait	13:23
mriedem	jaypipes: what we talked about in channel yesterday was all nova side solutions	13:24
jaypipes	mriedem: gotta find another +2? do you want me to +2 that since I was the one who came up with the idea?	13:24
sean-k-mooney	jaypipes: e.g. page in more results if the filter elimidate them	13:24
mriedem	jaypipes: what you suggested about sounds like paging, but also not great performance wise since we'd have to pull the data from the nova cell dbs to check the disabled status	13:24
jaypipes	sean-k-mooney: is elimidate a combination of eliminate and intimidate?	13:24
mriedem	jaypipes: you were +2 on the spec before	13:24
tssurya	mriedem: its False	13:24
mriedem	tssurya: ok. you aren't running split MQ right?	13:25
jaypipes	mriedem: are we talking about the same spec?	13:25
jaypipes	mriedem: the nova side one or the placement-side one?	13:25
mriedem	https://review.openstack.org/#/c/603352/	13:25
mriedem	^ is the placement spec	13:25
*** muttley has quit IRC		13:25
mriedem	tpatil has a spec leveraging that, which i couldn't understand	13:25
mriedem	and sounded like well if all the planets were aligned config wise things might not shit the bed	13:26
jaypipes	mriedem: ok, sorry I thought you were referring to the latter.	13:26
sean-k-mooney	jaypipes: yes it likely was.	13:26
mriedem	the latter is https://review.openstack.org/#/c/609960/	13:26
jaypipes	yes, that's what I thought you were referring to :)	13:26
*** muttley has joined #openstack-nova		13:26
mriedem	you can see my confusion in https://review.openstack.org/#/c/609960/2/specs/stein/approved/placement-req-filter-forbidden-aggregates.rst@14	13:26
jaypipes	you being +2 on that one was news to me ;)	13:26
mriedem	heh, not even close	13:27
jaypipes	reight...	13:27
mriedem	i don't feel great about that, because this is at least the 3rd spec that tushar has had to chase for his issue	13:27
jaypipes	mriedem: ok, well at least you know why I was a bit confused above then :)	13:27
sean-k-mooney	why do we have two specs for the same thing?	13:27
jaypipes	mriedem: ack, agreed. but the specs should be clear. and this isn't (yet)	13:27
mriedem	sean-k-mooney: it's not	13:28
jaypipes	sean-k-mooney: mriedem had (correctly) asked to separate the nova-config stuff from the placement needs	13:28
mriedem	once is the placement change, one is the nova change ot use it	13:28
jaypipes	jinx	13:28
sean-k-mooney	oh ok	13:28
mriedem	because they are separate services...	13:28
sean-k-mooney	i was just aware of https://review.openstack.org/#/c/603352/	13:28
jaypipes	sean-k-mooney: it's kind of like if you combine the two words eliminate and intimidate... better to have two words that mean different things.	13:28
jaypipes	sean-k-mooney: :P	13:29
mriedem	tssurya: so if your computes can talk to the scheduler nodes over rpc, enabling that option might help you some with the scheduling performance issues	13:29
*** rcernin has quit IRC		13:29
*** muttley has quit IRC		13:29
mriedem	but with 14K computes, you might also melt your mq...	13:29
* jaypipes reads back up to understand tssurya's issues...		13:29
sean-k-mooney	:)	13:29
mriedem	jaypipes: they severely limit max_placement_results to 10 because scheduling takes too long	13:30
mriedem	mostly boils down to https://bugs.launchpad.net/nova/+bug/1737465 i think	13:30
openstack	Launchpad bug 1737465 in OpenStack Compute (nova) "[cellv2] the performance issue of cellv2 when creating 500 instances concurrently" [Medium,Confirmed]	13:30
mriedem	from the forum session on cells v2:	13:30
mriedem	"TODO: need someone to dig into optimized DB queries in HostManager._get_host_states()."	13:30
jaypipes	mriedem: by "scheduling" you are referring to the entire process on the nova side or are you referring to just the placement allocation candidates query?	13:30
mriedem	the entire process	13:30
jaypipes	k	13:30
mriedem	the biggest issue, as far as i know right now, is likely HostManager._get_host_states()	13:31
jaypipes	but they only have one filter enabled... so that shouldn't be a big perf hit, right?	13:31
jaypipes	mriedem: and all ten of the hosts returned are disabled?	13:31
sean-k-mooney	jaypipes: yep	13:31
mriedem	different issue...	13:31
mriedem	we're talking past each other	13:31
sean-k-mooney	well its related	13:31
sean-k-mooney	either they hit all are diable	13:32
mriedem	the disabled computes issue is a result of setting max_placement_results to 10	13:32
jaypipes	lemme read all the bug contents... one sec.	13:32
sean-k-mooney	or they are not but then when the weight the host its not a even spread or packing behavior anymore	13:32
mriedem	the root issue is that they have max_placement_results set so low because of shitty performance in nova-scheduler	13:32
*** liuyulong has joined #openstack-nova		13:32
mriedem	https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L704 is a problem, because it iterates all the resulting compute nodes (via alloc candidates), and for each it joins the host aggregates and instances per host https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L734	13:34
jaypipes	mriedem: does CERN have placement request filtering enabled?	13:34
mriedem	so that the HostState has info about aggregates and instances before the HostState goes ot the filters	13:34
mriedem	jaypipes: yes	13:34
mriedem	it was added for them :)	13:34
sean-k-mooney	mriedem: ya by seting it to 10 the schuler only considerd 5% of posibel host in a cell or less then 1% of host that tenatn could be schduled too	13:34
*** pcaruana has joined #openstack-nova		13:34
mriedem	sean-k-mooney: you know i understand the problem right?	13:34
sean-k-mooney	mriedem: ya i know	13:35
sean-k-mooney	sorry ill leave you too it i need to finish fixing my own spec	13:35
mriedem	jaypipes: so if you have some wild sql fu thoughts on how we could optimize https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L704 that would be super	13:35
mriedem	like, in a single query per cell, give me all the computes in a uuid list along with the list of instance uuids on those hosts	13:37
jaypipes	mriedem: the two links to host_manager.py above are not doing SQL operations... they are doing lookups on the host manager's local hashmaps of aggregate and instance information...	13:37
jaypipes	mriedem: the nova-compute services are reporting aggregate and instance usage via the queryclient interface, and the host manager caches that information.	13:38
mriedem	https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L747	13:38
mriedem	is definitely doing db stuff	13:38
*** awalende has joined #openstack-nova		13:38
mriedem	there is no cache here	13:38
mriedem	maybe for aggregates	13:38
mriedem	but not for host->instance mappings	13:38
jaypipes	mriedem: that's only called if there's no cache entry for the host.	13:38
*** pcaruana has quit IRC		13:39
mriedem	which there won't be per request	13:39
jaypipes	mriedem: those cache entries are populated on startup of the host manager and when the compute services report to the scheduler via the queryclient	13:39
mriedem	jaypipes: that's only if you have enabled track_instance_changes	13:39
mriedem	which cern does not do,	13:39
mriedem	and which won't work in a split MQ cells deployemtn where the computes can't rpc cast to the scheduler	13:39
jaypipes	guh...	13:40
jaypipes	so THAT'S the source of this issue, really.	13:40
jaypipes	ok.	13:40
*** awalende has quit IRC		13:40
mriedem	so i've suggested that cern try track_instance_changes=true to see if that caching helps	13:40
mriedem	maybe at least for a handful of cells to start	13:40
jaypipes	mriedem: alright, I can do some SQL-fu on the _get_instances_by_host() method.	13:40
*** awalende has joined #openstack-nova		13:41
jaypipes	mriedem: OK, I understand the issue better now. Gimme a little to experiment with something, ok?	13:41
*** tssurya has quit IRC		13:41
mriedem	sure	13:41
mriedem	let this be your white whale for the day	13:41
*** tssurya has joined #openstack-nova		13:41
jaypipes	just call me Ahab.	13:42
jaypipes	oh, hi tssurya :)	13:42
*** k_mouza has joined #openstack-nova		13:42
jaypipes	mriedem: ok, so now I understand why CERN is setting max_placement_records so low... it's because a single HostMapping.get_by_host() is being a issued for each found host.	13:43
jaypipes	instead of a batched approach.	13:43
*** pcaruana has joined #openstack-nova		13:43
mriedem	sure, it's more than just the host mapping query	13:45
*** k_mouza_ has quit IRC		13:45
jaypipes	https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L758	13:45
mriedem	bingo	13:45
jaypipes	InstalceList too...	13:45
jaypipes	ack.	13:45
mriedem	yes, and i have already neutered that	13:45
mriedem	that used to pull full Instance objects back	13:45
mriedem	now it's just uuids	13:45
mriedem	https://github.com/openstack/nova/commit/91f5af7ee7f7140eafb7237875f6cd6ea1abcd38	13:46
jaypipes	k	13:46
frickler	mriedem: any chance to give these stable reviews progress? https://review.openstack.org/619220 https://review.openstack.org/619254	13:46
mriedem	frickler: there is a >0% chance yes	13:46
*** pcaruana has quit IRC		13:47
openstackgerrit	Lee Yarwood proposed openstack/nova master: fixtures: Return a mocked class instead of method within fake_imagebackend https://review.openstack.org/619804	13:47
* frickler is positivized		13:47
mriedem	frickler: why didn't you cherry pick the pike change from the queens change? looks like you had to redo some of the conflict resolution?	13:48
mriedem	normally backports should just all go in a row, stein -> rocky -> queens -> pike	13:48
mriedem	like a snowball	13:48
frickler	mriedem: it was yet another conflict, yes, that function likes to do file hopping it seems	13:49
mriedem	lyarwood: want to get this? https://review.openstack.org/#/c/619220/1	13:49
mriedem	ok	13:49
lyarwood	mriedem: looking	13:50
*** dpawlik has quit IRC		13:50
mriedem	ok scored both	13:50
lyarwood	mriedem: okay done, would you mind taking a look at https://review.openstack.org/#/q/status:open+project:openstack/nova+topic:bug/1764883 when you get a chance?	13:51
*** awalende has quit IRC		13:52
*** dr_gogeta86 has joined #openstack-nova		13:55
mriedem	i see how it is, i give you something easy and in return you give me this	13:56
mriedem	oh these are backports..	13:56
mriedem	nvm	13:56
*** yikun has quit IRC		13:56
lyarwood	mriedem: something just as easy, yeah, what a guy ;)	13:57
*** ttsiouts has quit IRC		14:01
*** eharney has joined #openstack-nova		14:01
*** awalende has joined #openstack-nova		14:01
*** ttsiouts has joined #openstack-nova		14:01
*** rodolof has joined #openstack-nova		14:02
*** ttsiouts has quit IRC		14:06
openstackgerrit	sean mooney proposed openstack/nova-specs master: Add spec for sriov live migration https://review.openstack.org/605116	14:06
sean-k-mooney	bauzas: adrianc sorry for the delay on updating ^	14:07
*** rodolof has quit IRC		14:07
*** rodolof has joined #openstack-nova		14:08
*** spatel has joined #openstack-nova		14:13
*** spatel has joined #openstack-nova		14:13
*** awaugama has joined #openstack-nova		14:13
*** spatel has quit IRC		14:17
adrianc	ack sean-k-mooney, will look into it	14:19
*** ratailor has joined #openstack-nova		14:19
*** k_mouza has quit IRC		14:22
*** jaosorior has joined #openstack-nova		14:23
*** ttsiouts has joined #openstack-nova		14:24
*** cfriesen has joined #openstack-nova		14:27
openstackgerrit	Merged openstack/nova-specs master: Per aggregate scheduling weight (spec) https://review.openstack.org/599308	14:27
tssurya	hi back jaypipes :)	14:28
tssurya	mriedem: thanks for the info on the config option, will see if enabling it makes it better	14:29
tssurya	I guess the documentation says something like "f the configured filters and weighers do not need this information, disabling this option will improve performance"	14:29
tssurya	but I didn't know what this exactly did until after looking into the code	14:30
*** mchlumsky has joined #openstack-nova		14:30
*** k_mouza has joined #openstack-nova		14:30
sean-k-mooney	adrianc: there are very minor changes as there seam to be little questions raised	14:31
sean-k-mooney	adrianc: it should not affect you poc	14:31
*** psachin has joined #openstack-nova		14:32
*** mchlumsky has quit IRC		14:33
*** janki has quit IRC		14:33
*** diliprenkila has joined #openstack-nova		14:33
*** mchlumsky has joined #openstack-nova		14:34
mriedem	tssurya: yeah it's mostly for the affinity type filters	14:35
*** mlavalle has joined #openstack-nova		14:35
*** ratailor has quit IRC		14:35
tssurya	mriedem: ah ok	14:36
tssurya	yea	14:36
mriedem	the performance thing there is probably about not needing to needlessly send instance information from all computes to the scheduler	14:37
mriedem	but misses the part about how the scheduler then blindly pulls the information from the db anyway	14:37
mriedem	hmm...	14:38
mriedem	i wonder...	14:38
mriedem	if we only got the HostState.instances in the scheduler if track_instance_changes was true	14:38
mriedem	since they are kind of connected	14:38
mriedem	we wouldn't want to make anything conditional on enabled filters b/c people can have out of tree filters	14:38
mriedem	or we could just add a new option, CONF.filter_scheduler.do_your_filters_care_about_instances_on_hosts	14:39
mriedem	with a better name	14:39
mriedem	more config options is gross, but if you have split mq then track_instance_changes won't work and you don't need to enable it	14:39
tssurya	right, I am pretty sure this option was enabled by default and three months ago we disabled it and saw a very slight performance improvement actually	14:40
*** diliprenkila has quit IRC		14:40
*** liuyulong has quit IRC		14:40
mriedem	because of less mq traffic/	14:40
mriedem	?	14:40
tssurya	but yea I am getting this from a tracking ticket on the issue, probably we will look into this next week	14:41
mriedem	ok	14:41
tssurya	yea I think so	14:41
mriedem	as i said, i'd expect your mq traffic to go up by enabling it, but your overall scheduling time, at least for hosts in the cells reporting instance info, might go down	14:41
*** ttsiouts has quit IRC		14:41
mriedem	but ultimately if you aren't enabling filters that need that information, it's a waste of time all around	14:41
*** ttsiouts has joined #openstack-nova		14:42
tssurya	hmm also what do you think querying all the cell DBs to fetch the compute nodes could be a reason enough to add time ?	14:42
sean-k-mooney	if people have time could they take a look at https://review.openstack.org/#/c/591607/11	14:42
tssurya	I honestly am not sure if the whole parallel works well enough even with scatter gather,	14:43
tssurya	will have to dig through that too	14:43
*** Swami has joined #openstack-nova		14:44
mriedem	tssurya: we're querying enabled cells right? but that could be 70 cells which don't contain the 10 compute nodes you care about?	14:44
mriedem	also, how many scheduler workers are you running?	14:45
dansmith	tssurya: why do you think the parallel thing doesn't work?	14:47
*** ttsiouts has quit IRC		14:47
dansmith	or you said "well enough"...	14:47
*** diliprenkila has joined #openstack-nova		14:49
sean-k-mooney	dansmith: stephenfin: can you add https://review.openstack.org/#/c/591607/11 to your review queue.	14:50
tssurya	dansmith: well last time we tried to measure the parallel part in time over going to all 70 DBs we did not get any solid proof of fastness	14:50
diliprenkila	Hi all, While creating instance snapshots , type conversion error occurs nova.api.openstack.wsgi [req-fd9b8a70-f455-42e1-b186-93fffaa6192e ff9650c86533492581513eca72b48409 2eea218eea984dd68f1378ea21c64b83 - 765703fcca634b149c7a012626847d2f 765703fcca634b149c7a012626847d2f] Unexpected exception in API method: TypeError: Unable to set 'os_hidden' to 'False'. Reason: u'False' is not of type u'boolean'	14:50
stephenfin	sean-k-mooney: Sure	14:50
mriedem	diliprenkila: you're talking about https://bugs.launchpad.net/nova/+bug/1806239 yes	14:51
openstack	Launchpad bug 1806239 in OpenStack Compute (nova) "nova-api should handle type conversion while creating server snapshots " [Undecided,New]	14:51
dansmith	tssurya: all of the processing of the database results happens in serial, which is generally the majority of the work	14:51
tssurya	don't know was just wondering if someone had done the scatter-gather versus going sequential perf study	14:51
dansmith	tssurya: that's because we're using eventlet instead of regular threads	14:51
openstackgerrit	sean mooney proposed openstack/nova master: Add fill_virtual_interface_list online_data_migration script https://review.openstack.org/614167	14:51
tssurya	ah yea which is why the bottelneck is the time of the slowest DB still	14:52
diliprenkila	mriedem>full log is at :https://etherpad.openstack.org/p/e20ERvBEkw	14:52
dansmith	tssurya: no, not really, that's just because we have to wait for all the results in order to be able to produce them in order	14:52
*** _alastor_ has joined #openstack-nova		14:52
tssurya	mriedem: yea its still going to 70 cells out of which we care only about the 1 cells which might have those 10 nodes	14:53
*** ttsiouts has joined #openstack-nova		14:53
mriedem	hmm, wonder if we could front-filter the cells via host mappings,	14:53
mriedem	that wouldn't be worth it in the case of 1-2 cells,	14:53
mriedem	but for 70 it might	14:53
dansmith	mriedem: like we do for projects on list.., and we have a knob to choose	14:53
mriedem	sort of like how we get instance mappings by project_id and then filter the cells from those mappings	14:53
mriedem	yeah	14:53
*** janki has joined #openstack-nova		14:54
tssurya	mriedem: yea we had a bug for that, and kind of have a patch to try that out, will first see if there is a major difference	14:54
*** diliprenkila_ has joined #openstack-nova		14:55
mriedem	diliprenkila: i don't see os_hidden here https://docs.openstack.org/glance/latest/admin/useful-image-properties.html	14:55
mriedem	is it just not documented?	14:55
dansmith	mriedem: and, it only helps for 70 cells when you regularly cut out 68 of those cells for any given scheduling request	14:55
tssurya	dansmith: oh ok	14:55
*** Swami has quit IRC		14:55
bauzas	mriedem: diliprenkila: I guess he means kvm_hidden	14:56
mriedem	tssurya: this bug? https://bugs.launchpad.net/nova/+bug/1767303	14:56
openstack	Launchpad bug 1767303 in OpenStack Compute (nova) "Scheduler connects to all cells DBs to gather compute nodes info" [Undecided,Incomplete] - Assigned to Surya Seetharaman (tssurya)	14:56
tssurya	yea	14:56
*** Swami has joined #openstack-nova		14:56
mriedem	bauzas: no it's a property on the image? https://github.com/openstack/glance/blob/a308c444065307e99f18b521ed8d95714be24da7/glance/db/sqlalchemy/alembic_migrations/versions/rocky_expand01_add_os_hidden.py#L16	14:56
*** diliprenkila has quit IRC		14:57
bauzas	ah my bad then	14:57
bauzas	what does this ?	14:57
*** _alastor_ has quit IRC		14:57
* bauzas goes looking at nova repo		14:57
bauzas	mmm https://github.com/openstack/nova/search?q=os_hidden&unscoped_q=os_hidden	14:58
*** diliprenkila_ has quit IRC		14:59
mriedem	tssurya: ok changed that to triaged and added notes	15:00
mriedem	added to https://etherpad.openstack.org/p/BER-cells-v2-updates also	15:00
mriedem	since i've got a running list of perf related issues in there	15:00
*** diliprenkila has joined #openstack-nova		15:01
sean-k-mooney	bauzas: looking at http://codesearch.openstack.org/?q=os_hidden&i=nope&files=&repos= os_hidden is used by horizon and in glance but not in nova	15:01
mriedem	nova likely gets the image properties, shoves them into instance.system_metadata, and then on snapshot we populate the image meta for the new image from that sys_meta but aren't saying os_hidden is a boolean	15:02
mriedem	and default to send it as a string or something	15:02
mriedem	we have a whitelist for shit like this	15:03
tssurya	mriedem: thanks	15:03
diliprenkila	mriedem: can't we send os_hidden as boolean?	15:04
sean-k-mooney	diliprenkila: we proably could but nova does not know os_hidden is a thing so it has no logic specifically for haneling it	15:05
mriedem	diliprenkila: the problem is likely here https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/image/glance.py#L734	15:05
mriedem	this is why microversions are nice - nova, as a client, never asked for this new response field	15:05
mriedem	stashed it away and then blindly gave it back	15:06
mriedem	apparently there isn't any tempest testing for this either	15:06
mriedem	diliprenkila: i left some questions in the bug, i'm not sure why tempest wouldn't already be failing on this, unless you have to do something to trigger this failure - can you provide reproduction steps in the bug report?	15:08
diliprenkila	mriedem : yes i will provide reproduction steps	15:08
mriedem	i'm pretty sure "output[prop_name] = str(prop_value)" is the problem	15:09
sean-k-mooney	is this something the glanceclient could fix for us	15:09
sean-k-mooney	it defines it as a bool in its schema	15:10
mriedem	we don't use glanceclient	15:10
sean-k-mooney	http://git.openstack.org/cgit/openstack/python-glanceclient/tree/glanceclient/v2/image_schema.py#n216	15:10
sean-k-mooney	oh ok	15:10
openstackgerrit	Balazs Gibizer proposed openstack/nova master: Remove port allocation during detach https://review.openstack.org/622421	15:10
mriedem	i think we only use ksa now	15:10
*** dpawlik has joined #openstack-nova		15:10
sean-k-mooney	we still import glancclient	15:10
mriedem	oh yeah i guess we do,	15:11
mriedem	we use ksa to get the session adapter thing to construct glanceclient	15:11
mriedem	anyway, it'd be cool if apis didn't return things you didn't ask for...	15:12
diliprenkila	mriedem: yes	15:12
sean-k-mooney	so could we modify glance client to munge the values form strings to bools so we did not have to handel this field ourselves	15:12
*** dpawlik has quit IRC		15:15
mriedem	i'm sure if we complained to the glance team about this, they'd say "you should be using the schema provided with the image"	15:15
mriedem	which we aren't	15:15
*** amodi has joined #openstack-nova		15:15
mriedem	there is only one image field we deal with the schema and that's disk_format	15:15
mriedem	https://review.openstack.org/#/c/375875/	15:16
*** dpawlik has joined #openstack-nova		15:17
*** awalende has quit IRC		15:17
*** dpawlik has quit IRC		15:17
*** dpawlik has joined #openstack-nova		15:18
diliprenkila	mriedem: so we should fix the os_hidden type in nova ? not in glance	15:19
openstackgerrit	Silvan Kaiser proposed openstack/nova master: Exec systemd-run without --user flag in Quobyte driver https://review.openstack.org/554195	15:20
*** sridharg has quit IRC		15:21
mriedem	diliprenkila: i don't think there is probably anything to change in glance,	15:21
mriedem	but i'd like to know why tempest isn't failing with this, but i need to know the recreate steps,	15:22
openstackgerrit	Chris Dent proposed openstack/nova master: Correct lower-constraints.txt and the related tox job https://review.openstack.org/622972	15:22
mriedem	because tempest has very basic tests where it creates a server and then creates a snapshot of that server,	15:22
mriedem	which i would think should cause this failure	15:22
*** slaweq has joined #openstack-nova		15:23
mriedem	maybe tempest isn't using image api v2.7/	15:26
mriedem	?	15:26
diliprenkila	mridem: may be	15:27
artom	mriedem, I think by default it goes to the lowest microversion	15:28
mriedem	glance doesn't have microversions but...	15:28
artom	Unless the test specifies min_microversion (or just microversion?)	15:28
artom	Oh, so just endpoints?	15:28
mriedem	i need a mordred	15:28
mordred	I didn't do it	15:29
mriedem	image api versions, go!	15:29
*** slaweq has quit IRC		15:29
mriedem	as in, wtf	15:29
artom	You're holding out for a mordred 'till the end of the night	15:29
mriedem	does the user opt into those or you just get what the server has available?	15:29
diliprenkila	mriedem: i am using nova: 18.0.0 , glance: 2.9.1	15:29
mordred	they're silly. there is no selection - you just get the API described by the highest number in that list	15:29
mordred	so - basically - ignore the thing after the 2.	15:30
mriedem	so if glance is rocky, i get 2.7	15:30
mriedem	https://developer.openstack.org/api-ref/image/versions/index.html#version-history	15:30
mordred	yeah	15:30
mriedem	then i don't know why tempest would not fail on https://bugs.launchpad.net/nova/+bug/1806239	15:30
openstack	Launchpad bug 1806239 in OpenStack Compute (nova) "nova-api should handle type conversion while creating server snapshots " [Undecided,New]	15:30
mordred	nova is still using glanceclient right?	15:31
mriedem	yeah	15:31
mordred	yeah - tempest uses direct rest calls - so it's possible glanceclient is doing something wrong. or tempest is doing something wrong	15:31
mriedem	well, tempest would just create a server and tell nova to snapshot it	15:32
mriedem	and then nova will use glanceclient	15:32
*** slaweq has joined #openstack-nova		15:32
mriedem	if that's all it takes to tickle this with rocky glance, i'm not sure why tempest wouldn't blow up	15:32
mriedem	anyway, i'll wait for diliprenkila to provide recreate steps	15:33
mordred	oh. gotcha	15:33
mordred	yeah. that's super weird	15:34
*** jhesketh has quit IRC		15:34
ShilpaSD	Hi All, facing issue while stack, E: Sub-process /usr/bin/dpkg returned an error code (1), any suggestions to resolve this?	15:35
*** jhesketh has joined #openstack-nova		15:35
*** bringha has quit IRC		15:36
*** slaweq has quit IRC		15:37
dansmith	mriedem: so on that rpc logging thing at startup, do we think the actual query is slowing things down, or the logging of the pointless message?	15:37
mriedem	idk	15:37
mriedem	looking at logs	15:38
mriedem	well we take 2 seconds dumping our gd config options :)	15:39
dansmith	which should be mostly just log traffic, right?	15:39
mriedem	yeah	15:39
dansmith	so maybe the logging is the thing	15:39
dansmith	also	15:39
mriedem	we start loading extensions at Dec 05 20:14:00.919520	15:39
dansmith	you know that if we were to actually start compute first, we would cache the service version the way we want and avoid the multiple lookups	15:40
mriedem	looks like we're done loading extensions at Dec 05 20:14:27.718587	15:40
mriedem	so there is another thing here,	15:40
mriedem	the rpcapi client does the version query thing,	15:40
mriedem	but API also constructs a SchedulerReportClient per instance, which apparently uses a lock	15:41
mriedem	Dec 05 20:14:27.687766 ubuntu-xenial-ovh-bhs1-0000959981 devstack@n-api.service[23459]: DEBUG oslo_concurrency.lockutils [None req-dfdfad07-2ff4-43ed-9f67-2acd59687e0c None None] Lock "placement_client" acquired by "nova.scheduler.client.report._create_client" :: waited 0.000s {{(pid=23462) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:327}}	15:41
mriedem	so...	15:41
dansmith	I also just noticed/remembered that the multi-cell version of this does not cache at all	15:41
mriedem	and that's an in-memory lock	15:41
mriedem	heh remember how efried_cya_jan removed that lazy-load scheduler report client stuff? https://github.com/openstack/nova/blob/master/nova/compute/api.py#L256	15:44
dansmith	yeah	15:44
efried_cya_jan	uh oh	15:44
dansmith	mriedem: is there a warn-once pattern I should be using for logs?	15:45
mriedem	dansmith: i think just a global	15:46
dansmith	ack	15:46
mriedem	i'll strip out a separate bug for this report client init thing	15:46
*** kaisers_ has joined #openstack-nova		15:46
mriedem	https://bugs.launchpad.net/nova/+bug/1807219	15:51
openstack	Launchpad bug 1807219 in OpenStack Compute (nova) "SchedulerReporClient init slows down nova-api startup" [Medium,Triaged]	15:51
cdent	mdbooth: is that ^ related at all to the slow down you were seeing in your explorations yesterday(?)?	15:52
efried_cya_jan	mriedem: We ought to singleton that guy. If we're not having caching conflicts, it can only be out of luck because the API obj isn't doing anything that touches the cache.	15:53
edleafe	efried_cya_jan: more like efried_big_fat_liar	15:53
efried_cya_jan	I'm here for like another 20 minutes today. This does two things: 1) keeps my inbox down to triple digits; 2) makes you scared to talk about me behind my back.	15:54
mriedem	i intentionally talked about you to summon you	15:55
mdbooth	cdent: Only if we also do this during server create. Possible. The number of API objects created was insane.	15:55
kaisers_	stephenfin: mdbooth: Updated https://review.openstack.org/#/c/554195/ as discussed yesterday	15:55
edleafe	Oh, it's more fun to talk about you to your face!	15:55
*** dpawlik has quit IRC		15:55
efried_cya_jan	mriedem: Is the removal of lazy-load causing (or even related to) that one-reportclient-per-API?	15:56
*** jaypipes has quit IRC		15:56
mriedem	efried_cya_jan: not sure, we might have always been loading it during API init	15:57
*** jaypipes has joined #openstack-nova		15:58
mriedem	but i think it would have done the lazy-load	15:58
mriedem	since LazyLoader only creates the client thing until something is accessed on it	15:58
efried_cya_jan	And it's only expensive because it's serialized, not because the report client is doing anything heavy, right?	15:59
mriedem	yeah i think so	15:59
mriedem	https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L260	16:00
mriedem	we create 2 provider trees in there for some reason...	16:00
mriedem	https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L270	16:00
mriedem	https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L279	16:00
mriedem	oops	16:00
mriedem	https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L286	16:00
mriedem	is there any good reason we need to do that in both places?	16:01
*** janki is now known as janki\|dinner		16:02
efried_cya_jan	mriedem: There should never be a need for one process to create two separate provider trees <=> report client instances, period.	16:02
efried_cya_jan	That could only ever lead to bugs.	16:02
efried_cya_jan	It will only ever not lead to bugs by luck.	16:02
efried_cya_jan	hold on, looking at your links...	16:03
mriedem	so, i'm going to remove this https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L286-L287	16:03
mriedem	and change https://github.com/openstack/nova/blob/c9dca64fa64005e5bea327f06a7a3f4821ab72b1/nova/scheduler/client/report.py#L270-L272	16:03
mriedem	to just call clear_provider_cache	16:03
mriedem	for great success	16:03
mriedem	ok?!	16:03
mriedem	butthole surfers got me all amped up	16:03
*** yan0s has quit IRC		16:03
efried_cya_jan	mriedem: That will be fine, it's removing redundant calls, but it's not going to change anything functionally. We were calling ProviderTree init twice before, but not creating two separate provider trees, just overwriting self._provider_tree.	16:05
efried_cya_jan	mriedem: However, I don't agree with the details. You can't get rid of _create_client without affecting @safe_connect. For now (until @safe_connect has died a fiery death), I would just remove the provider tree and assoc timer inits from __init__.	16:06
efried_cya_jan	And if you would like to take a step on the road to that fiery demise of @safe_connect, you could review https://review.openstack.org/#/c/613613/ :P	16:06
mriedem	efried_cya_jan: safe_connect doesn't care about provider tree and association refresh	16:07
mriedem	only the ksa client thing	16:08
melwitt	o/	16:08
efried_cya_jan	mriedem: reason we want to clear the cache in safe_connect is because it happened due to an error, so we can't count on our cache being correct.	16:08
efried_cya_jan	I sense you're trying to find a way to remove the use of that semaphore. I don't think we can do that.	16:09
mriedem	i'm not	16:09
mriedem	i'm just trying to get rid of this code creating the provider tree data structure 20 times in different places	16:09
mriedem	because i'm OCD	16:09
mriedem	but i hear you	16:10
efried_cya_jan	What you can do is make _create_client call clear_provider_cache, and remove those inits from __init__. That ought to consolidate the actual LOCs that do the init to one place.	16:10
efried_cya_jan	I think.	16:10
*** takamatsu has quit IRC		16:10
efried_cya_jan	but hm, this makes me wonder whether clear_provider_cache needs to be under that same lock.	16:10
efried_cya_jan	(for the other cases where it's called)	16:10
efried_cya_jan	I'll leave that steaming pile in front of you and walk away.	16:11
* kashyap likes efried_cya_jan's nick; guess it means he's going to disappear soon :D		16:14
efried_cya_jan	kashyap: Yeah, right about... now.	16:15
efried_cya_jan	o/	16:15
kashyap	(That's good; happy to see screen-starers getting breaks...)	16:15
sean-k-mooney	efried_cya_jan: enjoy the break	16:15
kashyap	efried_cya_jan: Glad; don't make the mistake of staying connected to the VPN :D	16:15
* kashyap is off from 17th; will be catching on my books and other expat errands, and so forth.		16:16
*** pas-ha has joined #openstack-nova		16:16
*** k_mouza_ has joined #openstack-nova		16:22
*** k_mouza has quit IRC		16:25
*** Miouge has left #openstack-nova		16:28
mriedem	heh putting the same lock on clear_provider_cache makes the tests lock up	16:29
openstackgerrit	Matt Riedemann proposed openstack/nova master: Only construct SchedulerReportClient on first access from API https://review.openstack.org/623246	16:32
openstackgerrit	Matt Riedemann proposed openstack/nova master: DRY up SchedulerReportClient init https://review.openstack.org/623247	16:32
mriedem	someone should probably tell the intel nfv ci to stop reporting b/c it's busted	16:33
mriedem	and has been for awhile	16:34
mnaser	https://wiki.openstack.org/wiki/ThirdPartySystems/Intel_NFV_CI	16:40
*** ohorecny2 has quit IRC		16:41
mnaser	looks like it's in wip or something	16:41
*** psachin has quit IRC		16:41
gibi	mriedem: my organized thoughts about the legacy notification deprecation http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000685.html	16:43
pas-ha	hi all, have a question re nova + barbican integration - if we have enabled `[glance]verify_glance_signatures` in nova, how booting from snapshot (including unshelve of instance) is supposed to work? AFAIU nova does not re-sign the snapshots it creates...	16:45
mriedem	pas-ha: unfortunately everyone that i know that worked on that is no longer around, but people in #openstack-barbican might know	16:48
mriedem	gibi: thanks	16:48
pas-ha	mriedem: thanks, will ask around there then :-)	16:48
gibi	mriedem: it might not help much but at least it summarizes our options	16:49
mriedem	gibi: yeah that's still useful	16:49
*** ttsiouts has quit IRC		16:49
*** ttsiouts has joined #openstack-nova		16:50
mriedem	pas-ha: i assume you mean because of this https://docs.openstack.org/nova/rocky/configuration/config.html#DEFAULT.non_inheritable_image_properties	16:50
mriedem	so when nova creates a snapshot, the image snapshot does not inherit the signature	16:51
mriedem	so trying to boot/unshelve from it later won't work	16:51
mriedem	if you have verify_glance_signatures=True	16:51
mriedem	interestingly enough, but sadly probably not surprising, is that those image properties are not documented https://docs.openstack.org/glance/latest/admin/useful-image-properties.html	16:52
pas-ha	yep, but AFAIU even if they were passed, the image signature is no longer valid already as surely the hash of the snapshot is not the same as that of pristine image	16:52
mriedem	yeah	16:53
mriedem	i guess there is https://docs.openstack.org/glance/latest/user/signature.html	16:53
mriedem	i guess the user would need to sign the snapshots after nova creates them right?	16:53
*** ttsiouts has quit IRC		16:55
pas-ha	right now that's the only way I see it might work, yes	16:56
mriedem	this was the spec if that contains any info https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/image-verification.html	16:57
mriedem	it does say "Add functionality to Nova which calls the standalone module when Nova uploads a Glance image and the verify_glance_signatures configuration flag is set."	16:58
mriedem	but i don't actually see that happening	16:59
mriedem	pas-ha: you might be better off simply asking about this in the openstack-discuss mailing list to see if anyone else is working on this already	17:00
mriedem	or interested in closing this gap	17:00
pas-ha	yes, figured that out, will do, thanks!	17:00
mriedem	yw	17:00
*** diliprenkila has quit IRC		17:09
*** Swami has quit IRC		17:14
*** helenafm has quit IRC		17:16
*** janki\|dinner has quit IRC		17:16
*** tssurya has quit IRC		17:18
*** k_mouza_ has quit IRC		17:24
*** _alastor_ has joined #openstack-nova		17:28
*** k_mouza has joined #openstack-nova		17:45
*** k_mouza has quit IRC		17:47
*** jmlowe has quit IRC		17:49
*** eharney has quit IRC		17:52
*** igordc has joined #openstack-nova		17:53
cfriesen	there's contact information there...is someone going to email them?	17:54
cfriesen	^ for the intel NFV CI	17:54
sean-k-mooney	is it broken?	17:54
cfriesen	mriedem: says yes	17:55
cfriesen	bah, shouldn't have colon there	17:55
*** gyee has joined #openstack-nova		17:55
sean-k-mooney	i read it anyway :)	17:55
sean-k-mooney	i need to talk to infra and see if we can migrate that ci upstream	17:56
sean-k-mooney	it all comes down to having host noes with nested virt	17:56
sean-k-mooney	and 2 numa nodes	17:56
sean-k-mooney	the 2 numa nodes are optional but it used dual numa node guest too	17:57
*** macza has joined #openstack-nova		17:57
*** derekh has quit IRC		17:59
*** dtantsur is now known as dtantsur\|afk		18:01
*** udesale has quit IRC		18:04
*** Swami has joined #openstack-nova		18:05
lyarwood	We shouldn't allow a cold migration from an offline compute host right?	18:06
dansmith	we don't check, AFAIK	18:06
lyarwood	Yeah that's what I'm seeing in the API at least	18:06
lyarwood	but that's a bug right? The source compute being offline should fail the attempt right there?	18:07
dansmith	well,	18:07
dansmith	you could argue that for sure	18:07
lyarwood	k, this is from a downstream bug report where a user has tried to migrate instead of evacuate from an offline compute and the instance gets stuck in resize_prep etc.	18:08
*** sahid has quit IRC		18:15
*** jmlowe has joined #openstack-nova		18:18
openstackgerrit	Jack Ding proposed openstack/nova master: [WIP] Preserve UEFI NVRAM variable store https://review.openstack.org/621646	18:20
*** spatel has joined #openstack-nova		18:21
spatel	sean-k-mooney: afternoon	18:21
sean-k-mooney	o/	18:21
spatel	sean-k-mooney: i had question related horizon dashboard Overview section, its saying Active instance: 367	18:22
sean-k-mooney	yep	18:22
spatel	but when i run "nova list" its saying you have 280 instance total	18:22
*** N3l1x has quit IRC		18:22
spatel	where that 367 came from?	18:22
sean-k-mooney	that is a good question	18:22
sean-k-mooney	i belive it hitting the simple teant usage api	18:23
*** N3l1x has joined #openstack-nova		18:23
spatel	does it counting deleted instance too?	18:23
sean-k-mooney	so it is proably showing the total number of instaces that were launched	18:23
sean-k-mooney	let me take a look a dashboard for a sec	18:23
spatel	ok	18:24
*** k_mouza has joined #openstack-nova		18:24
sean-k-mooney	so i just opend my local hoizon you are lokking at the overview seaction in the project section or admin	18:26
sean-k-mooney	in the project page it should just show the active instance not deleted	18:26
*** k_mouza has quit IRC		18:29
*** wolverineav has joined #openstack-nova		18:29
*** wolverineav has quit IRC		18:29
*** wolverineav has joined #openstack-nova		18:29
*** slaweq has joined #openstack-nova		18:30
*** dave-mccowan has joined #openstack-nova		18:31
sean-k-mooney	spatel: you could run "openstack usage list" or openstack usage show --project <your project> and see if that say 367	18:31
sean-k-mooney	spatel: i do not have a version of horizon that has teh simple teant usage api support turned on so i just see the dashboard that show the currently active instances	18:32
spatel	hold on doing it	18:35
*** slaweq has quit IRC		18:36
spatel	sean-k-mooney: http://paste.openstack.org/show/736782/	18:36
spatel	here you go	18:36
sean-k-mooney	so you were looking at the overview in the admin section	18:36
sean-k-mooney	well maybe not	18:37
*** wolverineav has quit IRC		18:37
sean-k-mooney	314 != 367	18:37
*** jmlowe has quit IRC		18:37
openstackgerrit	Dan Smith proposed openstack/nova master: Only warn about not having computes nodes once in rpcapi https://review.openstack.org/623282	18:40
openstackgerrit	Dan Smith proposed openstack/nova master: Make service.get_minimum_version_all_cells() cache the results https://review.openstack.org/623283	18:40
openstackgerrit	Dan Smith proposed openstack/nova master: Make compute rpcapi version calculation check all cells https://review.openstack.org/623284	18:40
dansmith	mriedem: ^	18:40
dansmith	mriedem: hopefully we can see the difference in just the logging, and measure the impact of the all-cells change	18:41
*** eharney has joined #openstack-nova		18:42
sean-k-mooney	spatel: how many running vms does "openstack hypervisor stats show -c running_vms -f value" return	18:42
spatel	running	18:43
-spatel- [root@ostack-osa ~(keystone_admin)]# openstack hypervisor stats show -c running_vms -f value		18:43
-spatel- 283		18:43
sean-k-mooney	ok so that is what you were exepcting	18:45
sean-k-mooney	it should match "openstack server list -f value -c ID \| wc -l"	18:45
*** wolverineav has joined #openstack-nova		18:45
sean-k-mooney	but horizon is saying 367	18:45
sean-k-mooney	it should match "openstack server list -f value -c ID \| wc -l"	18:46
spatel	I have two project so i need to run that command in specific project	18:46
spatel	admin not giving me any data when i run "openstack server list -f value -c ID \| wc -l"	18:46
sean-k-mooney	am use this one instead "openstack server list -f value -c ID --all-projects \| wc -l"	18:47
spatel	across all project output is "287"	18:47
spatel	when i run this "openstack server list -f value -c ID --all-projects \| wc -l"	18:47
sean-k-mooney	ok so that will inclode shelved or error state vms	18:48
spatel	possible some machine are shutdown or error stat ...	18:48
sean-k-mooney	its clode enought to the running vms for hypervior state that i would get horizon may be wrone	18:48
sean-k-mooney	openstack server list would include all vm except deleted instances i think by default	18:49
spatel	but 367 which horizon claiming is way way out..	18:49
sean-k-mooney	yes it is	18:50
sean-k-mooney	i honetly dont know why	18:50
spatel	i assume its counting delete and other state also...	18:50
spatel	I will open BUG :)	18:50
spatel	lets see what other people chime in	18:50
sean-k-mooney	spatel: it should not include deleted	18:50
spatel	that is what i am thinking	18:50
sean-k-mooney	out of interest what is the resulted of "openstack server list -f value -c ID --deleted --all-projects \| wc -l"	18:51
sean-k-mooney	is it around 80 ish?	18:51
*** slaweq has joined #openstack-nova		18:52
mriedem	dansmith: ack, we might get results by eod	18:53
dansmith	mriedem: aye	18:53
dansmith	mriedem: I just -1ed your placement client thing because I'm an ass, but I will make the change for you if you want	18:54
dansmith	to make myself feel better	18:54
mriedem	what assery is this	18:54
spatel	sean-k-mooney: ^^	18:54
* dansmith goes to look up the types of assery		18:54
-spatel- [root@ostack-osa ~(keystone_admin)]# openstack server list -f value -c ID --deleted --all-projects \| wc -l		18:54
-spatel- 331		18:54
spatel	i am confused now :(	18:54
spatel	Here i file bug https://bugs.launchpad.net/horizon/+bug/1807251	18:55
openstack	Launchpad bug 1807251 in OpenStack Dashboard (Horizon) "Horizon Overview summary showing wrong numbers " [Undecided,New]	18:55
cdent	I imagine there is.a super upper ontology of assery	18:55
*** jmlowe has joined #openstack-nova		18:56
*** cdent has quit IRC		18:56
*** slaweq has quit IRC		18:57
sean-k-mooney	spatel: that is strange	18:58
spatel	definitely bug or may be some DB cleanup stuff?	18:59
sean-k-mooney	even teh other things like cpu hours are way out	18:59
*** tbachman has quit IRC		18:59
spatel	exactly	18:59
sean-k-mooney	are they showing the same time period	19:00
sean-k-mooney	2018-11-08 to 2018-12-07	19:00
sean-k-mooney	it looks like the cli is defaulting to the last month but maybe horizon has a differnt default	19:01
spatel	on Horizon dates are default set to 2018-12-05 to 2018-12-06	19:01
spatel	24 hours period	19:01
sean-k-mooney	that does not explain how horrizon is showing more usage	19:01
sean-k-mooney	in fact it makes it less likely..	19:01
spatel	hmm	19:02
sean-k-mooney	so ya sorry i cant really help more then that	19:02
spatel	don't worry it was good help to understand that i am not crazy...	19:06
spatel	i thought i didn't understand or missing something but after you confirm it seems bug	19:06
spatel	anyway i have opened ticket so lets see	19:06
cfriesen	sean-k-mooney: do you know if anyone has ever looked at setting IRQ affinity for PCI devices? (would only really make sense for the "dedicated" case)	19:07
sean-k-mooney	i did like 4 years ago	19:07
sean-k-mooney	thre are two field of tought on this either user irq blanace or whatever to afinities all irq away form the cores used by vms	19:08
cfriesen	sean-k-mooney: I'm talking about for PCI-passthrough, setting the affinity to the pCPUs used by the guest	19:09
sean-k-mooney	or dynamicaly affinities irqs for a vf to the vm cpus	19:09
sean-k-mooney	ya i was asked to make that change a few years ago and got push back rom redhat folks on the libvirt team.	19:10
sean-k-mooney	let me see if i can find that.	19:10
*** udesale has joined #openstack-nova		19:11
*** igordc has quit IRC		19:19
*** igordc has joined #openstack-nova		19:19
sean-k-mooney	hum i must have been an intel only bug which aparently i cant see	19:21
*** dpawlik has joined #openstack-nova		19:22
*** dpawlik has quit IRC		19:24
sean-k-mooney	cfriesen: i assume you cant see https://bugzilla.redhat.com/show_bug.cgi?id=1135668	19:28
openstack	sean-k-mooney: Error: Error getting bugzilla.redhat.com bug #1135668: NotPermitted	19:28
cfriesen	sean-k-mooney: nope. :)	19:29
sean-k-mooney	cfriesen: when we rant this by the libvirt people they were concerned that affinities the interups to the vm cores could cause alot of guest VM_exits to serivce the interupts	19:29
cfriesen	sean-k-mooney: enough to outweigh the preloading of the cache?	19:29
openstackgerrit	Matt Riedemann proposed openstack/nova master: Only construct SchedulerReportClient on first access from API https://review.openstack.org/623246	19:30
openstackgerrit	Matt Riedemann proposed openstack/nova master: DRY up SchedulerReportClient init https://review.openstack.org/623247	19:30
*** udesale has quit IRC		19:31
sean-k-mooney	so this was created on 2014-08-29 so my memory is a little fuzzy but they wanted use to show that it would out way that overhead and at the time we did not have the capsity to test it in an openstack environmnt to messuer it	19:31
sean-k-mooney	so it was drop becasue of lack of evidence to show ti was a usefull feature to add to libvirt	19:31
sean-k-mooney	we abandonded our nova work as a result	19:32
sean-k-mooney	what everyone did agree on is that affizing to the same numa node was a good idea	19:32
*** alex_xu has quit IRC		19:32
cfriesen	okay, thanks	19:33
sean-k-mooney	thinking about it now i think affinitizing to the emuplator thread or thread poll in isolate/shared mode might be better the to the vcpus	19:33
*** alex_xu has joined #openstack-nova		19:34
sean-k-mooney	cfriesen: did you do any testing out of interest?	19:35
sean-k-mooney	cfriesen: hehe oh look here is my orginal mail https://www.mail-archive.com/libvir-list@redhat.com/msg100707.html	19:35
cfriesen	I'm actually not sure. we had an in-house nova commit to enable pinning, but I wasn't the one that implemented it. Kind of curious myself if they tested it. :)	19:36
sean-k-mooney	wow that was before i got the intel leagal email footer remove form my acount that was along time ago	19:36
*** dpawlik has joined #openstack-nova		19:39
*** dpawlik has quit IRC		19:44
coreycb	tobias-urdin: we have a placement package now	19:56
coreycb	tobias-urdin: just fyi, i know you were asking for it a little while back	19:57
*** wolverineav has quit IRC		19:59
*** wolverineav has joined #openstack-nova		20:00
*** wolverineav has quit IRC		20:05
lyarwood	coreycb: link? I'll wire that up in my puppet changes tomorrow	20:11
openstackgerrit	Chris Dent proposed openstack/nova master: Correct lower-constraints.txt and the related tox job https://review.openstack.org/622972	20:12
*** rcernin has joined #openstack-nova		20:12
*** tbachman has joined #openstack-nova		20:13
*** tbachman_ has joined #openstack-nova		20:15
mriedem	aspiers: will config drive work with sev guests?	20:15
mriedem	sean asked in the spec review but i didn't see an answer	20:16
coreycb	lyarwood: placement-api is the package name	20:17
*** tbachman has quit IRC		20:17
*** tbachman_ has quit IRC		20:20
*** david-lyle is now known as dklyle		20:24
mriedem	aspiers: comments inline https://review.openstack.org/#/c/609779/ but +1	20:25
lyarwood	coreycb: https://review.openstack.org/623306 - loads of assumptions about the package but I'll take another look in the morning	20:26
*** udesale has joined #openstack-nova		20:29
coreycb	lyarwood: that generally looks good to me. let me know how it goes and thanks for testing.	20:29
*** wolverineav has joined #openstack-nova		20:33
*** wolverineav has quit IRC		20:38
openstackgerrit	Matt Riedemann proposed openstack/nova master: Ignore MoxStubout deprecation warnings https://review.openstack.org/623309	20:38
mriedem	let's do this ^	20:38
*** mriedem has quit IRC		20:45
*** udesale has quit IRC		20:46
*** mriedem has joined #openstack-nova		20:47
*** takashin has joined #openstack-nova		20:49
artom	The hell, how do I dn sync cell1?	20:49
artom	*db	20:50
melwitt	nova meeting in 10 minutes	20:50
*** wolverineav has joined #openstack-nova		20:53
coreycb	mriedem: hi, just a friendly ping to see if you can review this when you get a chance. looks like it has a +1 from another core dev: https://review.openstack.org/#/c/579004/	20:55
artom	Ah, just pass --config-file cell1.conf to nova-manage	20:55
artom	(from https://docs.openstack.org/nova/pike/cli/nova-manage.html)	20:55
mriedem	coreycb: it's got the star but yeah i need to spend some time on it	20:58
coreycb	mriedem: ok thanks in advance	20:59
*** Sundar has joined #openstack-nova		21:01
Sundar	I'd appreciate if somebody can answer the question in: https://ask.openstack.org/en/question/117509/unable-to-update-rc-inventory-in-resource-provider/ . If openstack-discuss is a better place to pose that question, I'll bring it there.	21:02
*** jmlowe has quit IRC		21:16
dansmith	Sundar: pretty sure your url there is wrong	21:22
dansmith	Sundar: you might want to look at the nova functional tests or the placement gabbit	21:23
*** jaosorior has quit IRC		21:23
dansmith	Sundar: you're trying to put inventory on a provider right?	21:24
dansmith	Sundar: https://developer.openstack.org/api-ref/placement/#update-resource-provider-inventories	21:24
dansmith	I guess I'm wrong and the url is right, but I surely didn't think that was how that worked	21:24
dansmith	because: atomicity	21:25
dansmith	ah right you can PUT on the provider itself do do multiples	21:25
dansmith	https://developer.openstack.org/api-ref/placement/?expanded=update-resource-provider-inventories-detail#update-resource-provider-inventories	21:26
mriedem	or just, openstack --debug resource provider inventory set ...	21:26
mriedem	https://docs.openstack.org/osc-placement/latest/index.html	21:26
dansmith	mriedem: yeah, that's how I'd be doing it.. I figured he had some reason for doing it with curl	21:27
Sundar	dansmith: mriedem: I use curl by default. And I thought whatever works via openstack commands should also work via curl. Apparently not. :) Anyways, the openstack command worked. Thanks!	21:35
dansmith	Sundar: of course you _can_ use curl, but you get into situations like you're in now where you get back a 400 because you typo'd something and it's hard to tell what that is :)	21:36
mriedem	it should work via curl	21:36
*** tbachman has joined #openstack-nova		21:36
mriedem	right	21:36
dansmith	it's why we don't all write in machine language in 2018	21:36
Sundar	Curl tends to be much faster than openstack commands, in my experience.	21:37
dansmith	Sundar: you asked on Nov 21 and got an answer on Dec 6.. seems like pretty terrible performance to me :)	21:38
mriedem	it fails very fast yes :)	21:38
efried_cya_jan	Sundar: I just looked at the gabbits, and it appears as though in order for that PUT to work, the inventory of that resource class has to exist already.	21:38
efried_cya_jan	I.e. the PUT is designed to modify an existing inventory, not create a new one.	21:38
mriedem	PUT generally means updating something that exists	21:38
efried_cya_jan	I think edleafe might disagree on that one	21:38
Sundar	dansmith: I had a very long retry setting :)	21:39
efried_cya_jan	Sundar: let me get you a link to the gabbit...	21:39
edleafe	efried_cya_jan: damn straight	21:39
edleafe	PUT replaces	21:39
efried_cya_jan	Sundar: https://github.com/openstack/placement/blob/master/placement/tests/functional/gabbits/inventory.yaml#L234	21:40
efried_cya_jan	edleafe: In which case this operation is busted :(	21:40
mriedem	gaudenz: you need to set this to true in the devstack-plugin-ceph job https://github.com/openstack/tempest/blob/b62baf7c16d4609ea92e2ffc974e2f3a0b1cec80/tempest/config.py#L896	21:40
mriedem	gaudenz: and make that patch depend on the nova chnage	21:40
mriedem	and then you should be good	21:40
mriedem	gaudenz: right in here https://github.com/openstack/devstack-plugin-ceph/blob/39de6df04130cf2f221fb5ba2a9b5ff597de332a/devstack/plugin.sh#L98	21:41
efried_cya_jan	edleafe, cdent: It appears we're missing a gabbi test for a successful PUT /rps/{u}/inventories/{rc}	21:41
efried_cya_jan	...or it's not in that file	21:41
mriedem	gaudenz: commented in the nova change on the steps	21:41
efried_cya_jan	Maybe Sundar would like to propose that :)	21:41
efried_cya_jan	okay, I'm going away again. o/	21:42
efried_cya_jan	Oh, edleafe, thanks for covering the n-sch summary in the meeting.	21:42
Sundar	efried_cya_jan: We are still seeing you in December. :/ That apart, https://developer.openstack.org/api-ref/placement/#update-resource-provider-inventories documents only PUT.	21:42
efried_cya_jan	Sundar: Yes, the document is not clear on the fact that the inventory needs to exist first.	21:43
mriedem	gaudenz: btw, i'm pretty sure cern was already working on this...	21:43
efried_cya_jan	Sundar: But the error message on that 400 is.	21:43
dansmith	Sundar: it's PUT on the provider that works regardless	21:43
mriedem	gaudenz: melwitt: https://review.openstack.org/#/c/594273/	21:43
edleafe	efried_cya_jan: I prepared that thinking that you were going to be a man of your word	21:43
mriedem	so something is fishy	21:43
dansmith	Sundar: PUT on a provider/class only works after it has been created via PUT on provider	21:43
Sundar	mriedem: Sorry, didn't get your reference to the devstack-plugin-ceph job. Why is that involved here?	21:44
melwitt	mriedem: whoa, huh	21:44
Sundar	dansmith: Makes sense. I just went by the doc.	21:44
mriedem	https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes	21:44
mriedem	Sundar: that's for gaudenz, not you	21:44
efried_cya_jan	Sundar: Right, as dansmith is saying, if you do PUT /rps/{u}/inventories with a payload like {'resource_provider_generation': $X, 'inventories': {$RC: {'total': $N, ...}}} then subsequent PUTs on /rps/{u}/inventories/{rc} ought to work.	21:44
* efried_cya_jan really walks away.		21:45
melwitt	o/ efried_cya_jan	21:45
melwitt	mriedem: thanks for finding that. good to compare/consolidate on this cc gaudenz	21:46
gaudenz	mriedem: Thanks, for the links. Will have a look.	21:47
mriedem	i knew about the cern one originally, i thought this was that	21:47
mriedem	dansmith: you got results https://review.openstack.org/#/c/623282/ it works \o/	21:47
mriedem	not sure if you want to change that variable name	21:47
mriedem	i suggest LOGGED_NO_COMPUTES	21:47
gaudenz	mriedem, melwitt: I cooked up something as a blueprint just now: https://blueprints.launchpad.net/nova/+spec/extend-volume-net does this look good?	21:47
mriedem	gaudenz: heh, well, there is already a blueprint for this...	21:48
mriedem	https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes	21:48
mriedem	so...we might as well just mark one superseded (the new one)	21:48
mriedem	but need you and jose to sort out what the correct solution is here	21:48
melwitt	I can do the launchpad updates	21:50
melwitt	gaudenz: yeah, so just use the already existing https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes and collaborate with jose on the approach	21:50
gaudenz	No problem. At least I did not invest too much time in writing the blueprint. I'm fine with marking mine as superseeded.	21:50
gaudenz	I'll look at jose's patch and contact him. Thanks again for helping out.	21:51
mriedem	dansmith: unit test failures in https://review.openstack.org/#/c/623283/1	21:54
mriedem	http://logs.openstack.org/83/623283/1/check/openstack-tox-py27/d6a560e/testr_results.html.gz	21:54
melwitt	gaudenz: sounds good, thanks	21:55
*** jmlowe has joined #openstack-nova		21:55
mriedem	efried_cya_jan: before i dig into the fewer placement calls from _ensure patch, you've addressed jaypipes' ironic concerns from https://review.openstack.org/#/c/615677/9/nova/compute/resource_tracker.py@812 ?	21:57
mriedem	i assume so, but want to make sure i'm not going to waste time	21:57
*** spatel has quit IRC		22:08
dansmith	mriedem: hmm, okay, I switched the order at the last minute and then realized that was probably not the best idea an hour later, so that's probably why	22:11
dansmith	they were all passing before I did that	22:11
*** manjeets_ is now known as manjeets		22:28
*** rodolof has quit IRC		22:37
mriedem	coreycb: your wish is my command https://review.openstack.org/#/c/579004/	22:39
*** spatel has joined #openstack-nova		23:04
*** lbragstad has quit IRC		23:08
*** spatel has quit IRC		23:09
*** lbragstad has joined #openstack-nova		23:09
openstackgerrit	Gaudenz Steinlin proposed openstack/nova master: Extend volume for libvirt network volumes (RBD) https://review.openstack.org/613039	23:13
openstackgerrit	Matt Riedemann proposed openstack/nova stable/rocky: Add functional regression test for bug 1794996 https://review.openstack.org/623348	23:23
openstack	bug 1794996 in OpenStack Compute (nova) rocky "_destroy_evacuated_instances fails and kills n-cpu startup if lazy-loading flavor on a deleted instance" [High,Triaged] https://launchpad.net/bugs/1794996	23:23
openstackgerrit	Matt Riedemann proposed openstack/nova stable/rocky: Fix InstanceNotFound during _destroy_evacuated_instances https://review.openstack.org/623349	23:23
*** mlavalle has quit IRC		23:36
openstackgerrit	Merged openstack/nova master: Remove ironic/pike note from *_allocation_ratio help https://review.openstack.org/620154	23:44
openstackgerrit	Merged openstack/nova master: Change the default values of XXX_allocation_ratio https://review.openstack.org/602803	23:44
*** N3l1x has quit IRC		23:53
openstackgerrit	Matt Riedemann proposed openstack/nova stable/queens: Add functional regression test for bug 1794996 https://review.openstack.org/623354	23:59
openstack	bug 1794996 in OpenStack Compute (nova) rocky "_destroy_evacuated_instances fails and kills n-cpu startup if lazy-loading flavor on a deleted instance" [High,In progress] https://launchpad.net/bugs/1794996 - Assigned to Matt Riedemann (mriedem)	23:59
openstackgerrit	Matt Riedemann proposed openstack/nova stable/queens: Fix InstanceNotFound during _destroy_evacuated_instances https://review.openstack.org/623355	23:59
mriedem	omg	23:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!