Thursday, 2019-10-17

openstackgerritBrin Zhang proposed openstack/nova-specs master: Remove the todo in the migrations spec  https://review.opendev.org/68905600:02
*** markvoelker has joined #openstack-nova00:15
*** markvoelker has quit IRC00:20
*** markvoelker has joined #openstack-nova00:22
*** mriedem_afk has quit IRC00:30
openstackgerritMerged openstack/nova-specs master: Remove the todo in the migrations spec  https://review.opendev.org/68905600:43
*** bnemec has quit IRC00:45
*** rcernin has joined #openstack-nova00:51
*** brinzhang has joined #openstack-nova00:53
*** bnemec has joined #openstack-nova00:56
*** bnemec has quit IRC01:07
*** Liang__ has joined #openstack-nova01:10
*** brinzhang_ has joined #openstack-nova01:11
*** brinzhang has quit IRC01:14
*** mtreinish has quit IRC01:27
*** nanzha has joined #openstack-nova01:30
*** brinzhang has joined #openstack-nova01:34
*** brinzhang_ has quit IRC01:36
*** brinzhang_ has joined #openstack-nova02:05
*** brinzhang_ has quit IRC02:05
*** yaawang_ has quit IRC02:06
*** yaawang_ has joined #openstack-nova02:06
*** brinzhang has quit IRC02:08
*** SonPham has joined #openstack-nova02:12
SonPhamhi. How do I install the python-novaclient package from GitHub? I installed it from git, but Horizon gives an error.02:14
*** awalende has joined #openstack-nova02:44
*** HagunKim has joined #openstack-nova02:46
*** victor286 has joined #openstack-nova02:47
*** mdbooth has quit IRC02:47
*** awalende has quit IRC02:48
*** mdbooth has joined #openstack-nova02:49
*** ricolin has joined #openstack-nova02:55
*** dave-mccowan has quit IRC03:00
openstackgerritLiang Fang proposed openstack/nova-specs master: Support volume local cache  https://review.opendev.org/68907003:07
*** yaawang_ has quit IRC03:08
*** yaawang_ has joined #openstack-nova03:08
*** mkrai__ has joined #openstack-nova03:09
*** Kevin_Zheng has joined #openstack-nova03:21
*** nweinber has joined #openstack-nova03:32
*** mkrai__ has quit IRC03:33
*** gbarros has quit IRC03:33
*** SonPham has quit IRC03:37
*** takashin has left #openstack-nova03:39
*** psachin has joined #openstack-nova03:41
*** nweinber has quit IRC03:51
*** mkrai__ has joined #openstack-nova03:54
*** larainema has joined #openstack-nova03:55
*** Liang__ is now known as LiangFang04:05
LiangFanghi cores, I want to propose a spec about volume local cache. Can anybody be my Feature liaison? Thank you so much. spec: Support volume local cache  https://review.opendev.org/68907004:14
LiangFangI'm new to Nova04:14
*** igordc has quit IRC04:15
*** pcaruana has joined #openstack-nova04:18
*** mkrai__ has quit IRC04:22
*** mkrai__ has joined #openstack-nova04:23
*** jangutter has joined #openstack-nova04:38
*** jangutter has quit IRC04:42
*** markvoelker has quit IRC04:45
*** lbragstad has quit IRC04:53
*** lbragstad has joined #openstack-nova04:53
*** dansmith has quit IRC04:54
*** ianw has quit IRC04:54
*** ianw_ has joined #openstack-nova04:55
*** dansmith has joined #openstack-nova04:55
*** ianw_ is now known as ianw04:56
*** mkrai__ has quit IRC05:01
*** mkrai_ has joined #openstack-nova05:01
*** Luzi has joined #openstack-nova05:04
*** ratailor has joined #openstack-nova05:07
*** ociuhandu has joined #openstack-nova05:26
*** ociuhandu has quit IRC05:31
*** sridharg has joined #openstack-nova05:33
*** brinzhang has joined #openstack-nova05:34
*** brinzhang has quit IRC05:39
*** takamatsu has quit IRC05:47
*** jawad_axd has joined #openstack-nova05:59
*** jawad_ax_ has joined #openstack-nova06:03
*** jawad_axd has quit IRC06:03
*** lpetrut has joined #openstack-nova06:04
*** lpetrut has quit IRC06:05
*** lpetrut has joined #openstack-nova06:05
*** jawad_ax_ has quit IRC06:07
*** jawad_axd has joined #openstack-nova06:14
*** janki has joined #openstack-nova06:18
*** rcernin has quit IRC06:18
*** udesale has joined #openstack-nova06:22
*** nanzha has quit IRC06:26
*** nanzha has joined #openstack-nova06:28
*** LiangFang has quit IRC06:29
*** Liang__ has joined #openstack-nova06:29
*** dpawlik has joined #openstack-nova06:42
*** markvoelker has joined #openstack-nova06:46
*** markvoelker has quit IRC06:51
*** FlorianFa has quit IRC06:53
*** trident has quit IRC06:55
*** slaweq has joined #openstack-nova06:55
*** damien_r has joined #openstack-nova06:56
eanderssonVMs stuck in BUILD is such a pain :'(06:58
*** trident has joined #openstack-nova06:58
gibi_offeandersson: can I do something to help with that?07:02
*** gibi_off is now known as gibi07:02
eanderssongibi, been fighting these for a while in Rocky. We got a few issues fixed upstream.07:03
eanderssonWe first thought these were due to restarted computes07:03
eanderssonhttps://review.opendev.org/#/c/687535/07:03
eanderssonbut I think the issues are due to race conditions in this case07:04
eanderssonI see two instances scheduled at the same second on the same compute07:04
eanderssonI see an allocation, but both VMs are stuck in building / scheduling07:04
eanderssonwith no compute visible in the instance info07:04
eandersson(had to track the compute down using the db)07:04
gibieandersson: interesting. So do both servers have an allocation in placement, or only one of them?07:05
eanderssonboth have allocation in placement07:05
eanderssonboth are stuck in BUILD07:06
*** FlorianFa has joined #openstack-nova07:06
gibiand none of the servers has instance.host updated to point to the compute?07:06
eanderssoncorrect07:07
eanderssonI see on the compute 2x Final resource view, image xxx at /var/lib07:08
eanderssonand then nothing else07:08
*** tesseract has joined #openstack-nova07:09
eanderssonhmm maybe the compute is bad in this case07:17
eanderssonI just got 3 more stuck on that host07:17
eanderssonbut don't understand why they never fail07:18
gibieandersson: I guess instance.task_state is None. Unfortunately there are no logs coming out of the compute between the build request reaching the compute and setting instance.vm_state to BUILDING, and the instance_claim that sets instance.host07:18
*** mjozefcz|afk has joined #openstack-nova07:18
*** takamatsu has joined #openstack-nova07:18
*** ttsiouts has joined #openstack-nova07:18
eanderssonbtw we are running the very latest rocky07:20
*** awalende has joined #openstack-nova07:20
gibieandersson: did you happen to see instance.create.start versioned notification or compute.instance.create.start legacy notification for these servers?07:21
eanderssonI can check designate07:21
eanderssonor nvm they only catch end07:21
gibieandersson: are PCI devices or SR-IOV ports requested for these servers?07:23
eanderssonnothing special07:24
gibieandersson: do the two Final resource view logs account for the allocations of the servers?07:25
eanderssonSorry, don't fully understand that07:26
eanderssonlast part07:27
gibiyou mentioned that you saw two "Final resource view" logs from the compute07:27
gibithose contain how much vcpu and ram is used on the compute07:27
gibidoes that usage include the usage of your two servers stuck in build?07:27
eanderssonactually does not look like it adds up07:29
eanderssonin fact when they are stuck I dont see disk in allocations07:29
eanderssonIt honestly looks like allocations is just wrong07:31
eandersson(in the db)07:31
eanderssonI see 3 items in the db, but only two vms on the box (and nothing stuck in building atm)07:32
gibieandersson: do you see logs like 'Lock "compute_resources" acquired by "nova.compute.resource_tracker.instance_claim"' ?07:35
gibior in general any 'Lock "compute_resources"'07:36
eanderssonnothing ;'(07:36
*** jangutter has joined #openstack-nova07:38
*** ttsiouts has quit IRC07:38
*** ttsiouts has joined #openstack-nova07:39
eanderssonIt's possible that the allocations are a legacy of some failed cold migrations tbh07:41
eanderssonbut not sure why they would cause a deadlock07:41
*** ttsiouts has quit IRC07:43
*** Liang__ has quit IRC07:44
eanderssonnvm the unaccounted for allocation was another vm stuck in bad state07:45
eanderssonjust forgot --all-projects07:45
eanderssonstuck in building for over 24 hours :p07:46
eanderssonalso nvm the disk allocation issue... my query had LIMIT 10 on it :D07:47
eanderssonLet me try to schedule a VM manually to that host and see if it works07:48
*** rpittau|afk is now known as rpittau07:52
*** xek has joined #openstack-nova07:52
*** ttsiouts has joined #openstack-nova07:53
gibieandersson: I suggest adding oslo_concurrency=DEBUG to the [DEFAULT]/default_log_levels config of the nova-compute service, because if you see Final resource view logs then you should see logs about the Lock "compute_resources" as well07:54
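A minimal sketch of the effect gibi's suggestion is after, written with the Python standard library `logging` module rather than oslo.log itself: the service-wide level stays at INFO while one named logger is opened up to DEBUG. The helper `apply_log_levels` and the log messages are hypothetical and only mimic the `logger=LEVEL` pair format; how nova-compute actually wires this up is not shown in the channel.

```python
import io
import logging


def apply_log_levels(pairs):
    """Apply comma-separated 'logger=LEVEL' pairs (hypothetical helper)."""
    for pair in pairs.split(","):
        name, _, level = pair.strip().partition("=")
        logging.getLogger(name).setLevel(level)


# Capture output in memory so the effect is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)  # everything else stays at INFO

apply_log_levels("oslo_concurrency=DEBUG")

# Illustrative messages only; not real nova-compute output.
logging.getLogger("oslo_concurrency.lockutils").debug(
    'Lock "compute_resources" acquired')
logging.getLogger("nova.compute.manager").debug("still suppressed at INFO")

captured = stream.getvalue()
print(captured)
```

Only the lock message is captured: the child logger inherits DEBUG from `oslo_concurrency`, while the other logger still falls back to the root level. That is why lock tracing can be turned on without switching the whole service to debug.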
eanderssonSure I can do that now07:54
*** xek_ has joined #openstack-nova07:54
eanderssonCan reproduce this issue 100% on this host07:54
gibiso the periodic jobs can run and update the resource view but no new instances get the chance to claim resources07:55
gibiboth use the same compute_resources lock so I don't see how one of them can progress and not the other07:57
eandersson> Lock "compute_resources" released by "nova.compute.resource_tracker._update_available_resource" :07:57
*** xek has quit IRC07:57
*** ralonsoh has joined #openstack-nova07:57
eandersson> Running periodic task ComputeManager._poll_unconfirmed_resizes run_periodic_tasks07:57
*** dtantsur|afk is now known as dtantsur07:59
*** tssurya has joined #openstack-nova07:59
eandersson> Compute_service record updated for computexxx:computexxx _update_available_resource07:59
eandersson> Lock "compute_resources" released by "nova.compute.resource_tracker._update_available_resource"07:59
eanderssonThese are the last two lines07:59
gibieandersson: now when you boot a VM do you see Lock "compute_resources" acquired by "nova.compute.resource_tracker.instance_claim" ?08:00
*** sapd1 has joined #openstack-nova08:00
eanderssonI don't see instance_claim08:01
gibithen somehow the build request cannot grab the compute_resources lock08:02
eandersson> > Instance x has been scheduled to this compute host, the scheduler has made an allocation against this compute node but the instance has yet to start. Skipping heal of allocation:08:02
eandersson> _remove_deleted_instances_allocations08:03
gibieandersson: that is logged because instance.host is not set08:04
*** tkajinam has quit IRC08:05
gibieandersson: do you see logs like  Lock "805e10fe-2601-4849-9593-3a83f2875bfb" acquired by "nova.compute.manager._locked_do_build_and_run_instance" where the lock name is the uuid of the server being built?08:06
*** ociuhandu has joined #openstack-nova08:06
eanderssonnothing with locked_do_build08:07
gibieandersson: do you have max_concurrent_builds configured ?08:09
eanderssonnope08:09
*** xek__ has joined #openstack-nova08:11
eanderssonI don't see any lock not getting released08:12
eanderssonunless it was held before the vm was created08:13
gibithis is strange. The instance.uuid lock is grabbed here https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/compute/manager.py#L2039 and inside that lock we set the instance.vm_state from SCHEDULING to BUILDING here https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/compute/manager.py#L213208:14
*** xek_ has quit IRC08:15
gibijust to be sure, is the instance.vm_state BUILDING ?08:15
gibior it is still in SCHEDULING ?08:15
gibieandersson: nvm08:15
gibithe SCHEDULING was the task_state08:15
eandersson> OS-EXT-STS:task_state               | scheduling08:15
*** ociuhandu has quit IRC08:15
eandersson> OS-EXT-STS:vm_state                 | building08:16
*** cdent has joined #openstack-nova08:16
eanderssonbtw restarting the nova-compute does nothing08:17
eanderssonbut deleting the vm works fine08:17
*** cdent has left #openstack-nova08:17
gibieandersson: if restarting the compute does not fix the issue then it cannot be a lock, as that would be cleaned up by the restart. It also cannot be that your compute ran out of RPC workers to process incoming messages08:18
*** Shatadru has joined #openstack-nova08:18
gibibut it feels like the build request does not reach the compute08:20
eanderssonI can see the build request in the db at least08:20
gibieandersson: do other computes using the same message bus (rabbit) work properly?08:21
eanderssonyep I see messaging flowing between nova and nova-compute08:22
eanderssonqueues are all in rabbit08:22
eanderssonI could capture the rmq messages to the compute08:24
*** ratailor_ has joined #openstack-nova08:26
bauzasgood morning Nova08:26
bauzaseandersson: gibi: any logs from the conductor vs. compute showing this ?08:27
eanderssonconductor is not in debug so got very little logs there :'(08:28
* bauzas is just trying to find good SIM card opportunities for data usage while in Shanghai :)08:28
*** sapd1 has quit IRC08:28
*** ratailor has quit IRC08:28
bauzaseandersson: are you able to follow the req-id down to compute ?08:28
bauzasor as gibi said, nothing there ?08:28
jkulikis it possible to reconfigure the log-level via the eventlet_backdoor?08:28
jkulikjust in case it's activated ...08:29
* gibi have to step away from the machine for a while08:29
jkulikprobably doesn't make sense if it's an old request and can't be reproduced. nevermind.08:30
eanderssonI can reproduce this 100% on this compute08:30
eanderssonbut it's not a dev environment, so can't mess too much08:31
*** dpawlik has quit IRC08:31
eanderssonI have about 10 VMs stuck in this state, so it isn't just one host.08:31
*** takamatsu has quit IRC08:32
eanderssonbut this host I am looking into fails 100%, even when specifying the compute using --availability-zone08:33
sean-k-mooneyeandersson: are you seeing the compute agent pause for a long period when running the update resources periodic task?08:37
eanderssonIt does not look like it08:38
sean-k-mooneyok i was wondering if it was related to the libvirt thing we recently fixed08:39
eanderssonThat is what I was thinking as well.08:39
eanderssonWhen we started seeing this.08:39
eanderssonI have that build in my lab.08:39
eanderssonI assume you are referring to https://review.opendev.org/#/c/687535/08:40
sean-k-mooneythe fix for the eventlet issue with libvirt08:40
*** derekh has joined #openstack-nova08:40
eanderssonah not sure about that one08:40
eanderssoncan you link it?08:40
sean-k-mooneyno i was thinking of something else, ya let me find it08:40
* gibi is back08:40
*** ociuhandu has joined #openstack-nova08:40
eandersson(btw this is with  rocky)08:40
*** mkrai_ has quit IRC08:41
*** ociuhandu has quit IRC08:42
bauzaseandersson: sean-k-mooney: if that's a lock issue, the logs will tell it08:42
sean-k-mooneyyes it's likely unrelated and not backported https://review.opendev.org/#/c/677736/08:43
sean-k-mooneyits not a lock issue08:43
bauzasagain, tracking the request-id is super important08:43
sean-k-mooneybut it was causing rpc issues08:43
bauzassean-k-mooney: we were suspecting some lock holding the RPC calls08:44
bauzasbut, anyway, logs, logs, logs08:44
eanderssonIf it was an RPC issue it would timeout at some point at some end08:44
eanderssonright?08:44
bauzascorrect08:44
sean-k-mooneyeventually you would get a messaging timeout, yes08:44
eanderssonOne of these VMs has been stuck for 24 hours with no error logs08:44
bauzaseandersson: again, are you able to track the request down on compute ?08:45
bauzaswhat's the last step the logs are telling you for a specific instance ?08:45
gibialso the bug behind https://review.opendev.org/#/c/677736/ says that it makes the compute get marked down, which is not the case for eandersson08:45
openstackgerritTushar Patil proposed openstack/nova-specs master: Allow compute nodes to use DISK_GB from shared storage RP  https://review.opendev.org/65018808:45
bauzasoh wait, the last task state is "scheduling" ?08:46
bauzasthat's waaaaay different from a compute issue then :)08:46
sean-k-mooneygibi: right, it causes the compute agent to block on the call to libvirt and nothing else gets processed until libvirt returns08:47
eanderssonYea - not sure it is nova-compute specific, since restarting nova-compute has no effect08:47
eanderssonbut that specific compute has this issue, but the next one works fine08:48
eanderssonkeep in mind I am just bypassing the scheduler using --availability-zone nova:<compute>08:48
sean-k-mooneythere was a case about a month ago where a patch was submitted for a codepath where we did not catch an exception, which left the vm in building forever. but i can't recall which one it was08:48
bauzaseandersson: IIRC, 'scheduling' task state is different from 'spawning'08:48
eanderssonYea08:49
sean-k-mooneyyou would see a trace in the compute log if that was the case, which also seems not to be the case08:49
bauzaseandersson: so, I suspect nothing goes back from scheduler08:49
bauzasso the conductor can't trigger the RPC call to compute08:49
bauzaswhich will change the task to spawn08:49
bauzasactually, https://docs.openstack.org/nova/latest/reference/vm-states.html#create-instance-states08:52
bauzaswe unset the task state before plugging the network and the devices08:52
bauzasbut since you're still on 'scheduling', I bet nothing goes down to compute and stays on conductor08:52
bauzasagain, logs...08:52
*** mkrai_ has joined #openstack-nova08:54
eanderssonIf I search for the req from VM create I only find the api call08:55
eanderssonI only find the compute because it's in the database.08:55
eanderssonIf I could reproduce it in the lab I could give you all the logs you can dream of :p08:56
*** dpawlik has joined #openstack-nova08:57
bauzaseandersson: even on production, can't you just query through some specific request-id ?08:59
eanderssonYes08:59
eanderssonI can give you anything that is INFO or higher =]08:59
bauzasthat's enough09:00
bauzasalso, you could just query the os-instance-action API to get you a bit of what happened https://docs.openstack.org/api-ref/compute/?expanded=#servers-actions-servers-os-instance-actions09:01
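As a rough sketch of the two calls bauzas is pointing at, the snippet below only builds the request URLs for listing a server's actions and for showing one action by request ID. The endpoint, server UUID, and request ID are placeholders, and the helper `instance_actions_url` is hypothetical; authentication and the actual HTTP call are left out.

```python
from urllib.parse import quote


def instance_actions_url(compute_endpoint, server_id, request_id=None):
    """Build the os-instance-actions URL for a server (hypothetical helper)."""
    url = "{}/servers/{}/os-instance-actions".format(
        compute_endpoint.rstrip("/"), quote(server_id))
    if request_id is not None:
        # Show details, including events, for one action by its request ID.
        url += "/" + quote(request_id)
    return url


# Placeholder values, for illustration only.
endpoint = "http://controller.example/compute/v2.1"
list_url = instance_actions_url(endpoint, "SERVER_UUID")
detail_url = instance_actions_url(endpoint, "SERVER_UUID", "req-EXAMPLE")
print(list_url)
print(detail_url)
```

The list call returns the actions recorded for the server; the per-request call is the one expected to carry the events for that action.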
bauzaseandersson: accordingly, for a specific request-id corresponding to a server create, are you able to track the last service involved ?09:01
eanderssonnova.api is the last service with that request-id09:02
bauzascrazy09:02
eandersson> [<req> <x> <y>]<IP> "POST /v2.1/<tenant>/servers" status: 202 len: 472 microversion: 2.1 time: 1.45370609:02
eanderssonI can see the scheduling request in the database under request_specs09:03
bauzaseandersson: I'd suggest you use the os-instance-action API then09:03
bauzasand look at the events, if you enabled them09:03
bauzaseandersson: the request_spec record is created by the nova-api service so that doesn't prove it reached the scheduler service09:04
eanderssonSure09:04
eanderssonWouldn't specifying the host actually by-pass the scheduler?09:04
bauzasit depends, which release ?09:04
eanderssonRocky09:05
sean-k-mooneyif you use the availability zone way, partly09:05
bauzasnope, it won't then09:05
sean-k-mooneyit will check that the az exists09:05
bauzassean-k-mooney: even with this, it will call out the scheduler09:05
sean-k-mooneyand then skip the rest09:05
sean-k-mooneysure09:05
sean-k-mooneybut it does not run all the filters09:05
bauzasoh yeah, sure09:05
bauzasbut eandersson tells us that nothing but nova-api shows evidence of the request ID09:05
bauzaseven not the conductor09:06
sean-k-mooneyit will still call the scheduler however, so you should see something in the log for it09:06
eanderssonWell conductor logs nothing09:06
eanderssonWe have many thousands of vms per day and the number of logs in the conductor is zero09:06
bauzasyou should call Alice to follow the rabbit...09:06
eandersson...09:06
*** zbr has quit IRC09:07
sean-k-mooneyeandersson: is the only indication of this request the api log and the db entry?09:07
sean-k-mooneye.g. are you not seeing it in any other service at all?09:07
eanderssonLet me search for the instance uuid09:08
*** victor286 has quit IRC09:08
*** janki has quit IRC09:09
eanderssonnova, placement and neutron-server are showing up09:10
bauzaseandersson: we do non-blocking RPC calls for the nova-api service09:10
*** zbr has joined #openstack-nova09:11
bauzaseandersson: so I wouldn't be surprised if you wouldn't capture RPC timeouts09:11
bauzashence the rabbit queue checks09:11
eanderssonThe rabbitmq queues look fine. Nothing queued, nothing stuck in unack'd. I could capture messages going to the queues.09:12
sean-k-mooneyin rocky we are not patching the api with eventlet so we dont have the heartbeat issue but i was just wondering how far it got in the boot process before going silent09:13
sean-k-mooneyif we only see references to the instance boot request in the api but not in the conductor or scheduler, it implies it's failing very early09:14
eanderssonI only see the compute in placement09:14
sean-k-mooneyok so it's getting to the scheduler then09:15
sean-k-mooneyin fact if it's created the allocation, it's passed the filters and selected the host09:15
*** HagunKim has quit IRC09:16
sean-k-mooneyso it's failing sometime between the scheduler returning to the conductor and the conductor calling the compute node09:16
bauzassean-k-mooney: that's not my understanding from what eandersson said by "I only see the compute in placement"09:17
sean-k-mooneyoh i may have misread that09:17
bauzaseandersson: no GET logs on https://docs.openstack.org/api-ref/placement/?expanded=list-allocation-candidates-detail#list-allocation-candidates ?09:17
eanderssonI see build_and_run_instance hit the compute09:18
bauzasWTF09:18
eanderssonwhen capturing a rabbitmq message09:18
eanderssonreceived by the compute09:18
bauzasbut no logs ?09:19
bauzasI suspect your log factory not working then :D09:19
bauzasthat's... crazy09:19
*** ociuhandu has joined #openstack-nova09:21
eanderssonbauzas, talking about compute or conductor?09:21
bauzasI frankly don't know what to say, I'm just lost09:22
eanderssonlike look at the conductor https://zuul.opendev.org/t/openstack/build/9a820944e63e409cb8dbf5b83931263e/log/logs/screen-n-cond.txt.gz09:22
eanderssonand search for INFO09:22
bauzasyou're seeing logs only on the api service, but you just told us you're able to see a build call to compute09:22
eanderssonYou'll find like 5 logs09:22
eanderssonhttps://zuul.opendev.org/t/openstack/build/d34ddb7bd62148968c33f4fe8f348e8b/log/controller/logs/screen-n-super-cond.txt.gz09:22
eanderssonI am not sure what you are talking about to be honest.09:23
eanderssonhttps://zuul.opendev.org/t/openstack/build/d34ddb7bd62148968c33f4fe8f348e8b/log/controller/logs/screen-n-sch.txt.gz09:23
eanderssonThis is the scheduler, again search for INFO09:24
eanderssonyou'll find like zero entries09:24
eanderssonWe have 1k compute nodes, and enabling debug would be very... spammy09:24
eanderssonbut unfortunately INFO does not provide a ton of logs outside of API09:25
*** ociuhandu has quit IRC09:25
bauzaseandersson: as I suggested, can you please run 'nova instance-action-list' to get more information ?09:27
bauzasbut I understand your point, INFO logs aren't talkative09:27
bauzasI thought we fixed that in Newton (or sometime around it)09:28
eanderssonIs that the same as openstack server even list?09:28
eandersson*event09:28
eanderssonbecause os-instance-actions just shows me the request id, server id, action and start time09:29
eanderssonlet me check the body09:29
*** ociuhandu has joined #openstack-nova09:31
*** ociuhandu has quit IRC09:32
*** mkrai_ has quit IRC09:32
*** ociuhandu has joined #openstack-nova09:33
sean-k-mooneybauzas: ya looking at the conductor logs for the gate jobs the non debug version is not very useful09:33
sean-k-mooneywe likely should make https://zuul.opendev.org/t/openstack/build/df644e9fdde346f2813e6220312a9ca5/log/controller/logs/screen-n-super-cond.txt.gz#867 info level09:34
eanderssonI enabled debug09:35
eandersson> [instance: x] Selected host: compute1031; Selected node: compute1031; ; Alternates: [] schedule_and_build_instances09:35
bauzaseandersson: I checked and events are shown by default if you are on Pike and above when you call out instance-action API09:35
eandersson>  Re-scheduling is disabled. populate_retry09:35
bauzaseandersson: but you need to login with admin creds09:36
bauzashttps://docs.openstack.org/api-ref/compute/?expanded=show-server-action-details-detail#id17009:36
bauzasanyway, gotta run,09:36
bauzas\o09:36
eanderssonhttp://paste.openstack.org/show/EUjnd5TWftogxGIVaSNP/09:37
eanderssonThis is the last log line09:37
eanderssonfrom the conductor / scheduler09:37
eanderssonyea bauzas nothing interesting in the events09:38
*** Shatadru has quit IRC09:38
eanderssonEverything looks to be scheduled fine09:39
sean-k-mooneyright that is the last log for the instace we expect to see in the conductor log09:39
sean-k-mooneywell unless there is an error09:39
eanderssonno errors in the logs09:40
eandersson:'(09:40
sean-k-mooneyand nothing in the compute agent log in debug either09:40
sean-k-mooneyit just disappears?09:40
eanderssonI pasted some logs early on09:40
sean-k-mooneyyou captured the rpc being received, right09:41
eanderssonlet me check again09:41
sean-k-mooneyoh ill scroll back09:41
*** mkrai_ has joined #openstack-nova09:44
tssuryaeandersson: what's the exact boot request you are making ?09:45
sean-k-mooneyi don't see the logs for the compute scrolling back, unfortunately.09:45
eanderssonsean-k-mooney, I think the logs we discussed were things like09:46
eandersson> instance x has been scheduled to this compute host09:46
eanderssonLet me enable even more debug logs09:47
sean-k-mooneyah, you can re-disable debug on the conductor/scheduler since we are pretty sure it's getting to the compute node at this point, right09:47
eanderssonYea already disabled09:48
eanderssongenerates a lot of logs :p09:48
sean-k-mooneyfor 200 servers with all services in debug i believe it's something like 30GB a day in uncompressed logs09:48
sean-k-mooneythat's what kolla-ansible said in their configuration docs in any case09:49
eanderssonhehe well that is without kubernetes, octavia etc hitting your apis constantly09:49
eanderssonplus all other automation09:49
sean-k-mooneyright you would really want log rotate to be working well if you use debug always09:50
*** ricolin has quit IRC09:50
sean-k-mooneywe should try to have more of a balance however, e.g. make info more useful without being just spam09:51
eanderssonOn compute I see a message being received, but nothing more09:52
eandersson> DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: ad3b7ccda9944e9e87f497f4401aa0b4 __call__09:52
sean-k-mooneyok so it's not building the libvirt xml or plugging any ports09:52
sean-k-mooneyit's just receiving it09:53
eanderssonyea09:53
sean-k-mooneydid you also check with the instance uuid09:53
eanderssonI bet this compute is bust, but everything seems fine (including already existing vms)09:53
eanderssonyea09:53
*** mkrai_ has quit IRC09:53
eanderssonI do get some logs with the instance id09:53
eanderssonbut it's just the ones I posted earlier09:54
sean-k-mooneyok09:54
eandersson> Instance ... has been scheduled to this compute host, the scheduler has made an allocation against this compute node but the instance has yet to start. Skipping heal of allocation: ...09:54
sean-k-mooneyright that is the periodic task09:54
sean-k-mooneythat makes sure the resource usage on the host matches what we expect09:55
sean-k-mooneythat is normal to see between when the host has been selected in the db as the target for the instance and the vm starting09:55
eanderssonAnyway bed time. It's 3AM.09:57
*** brinzhang has joined #openstack-nova09:57
eanderssonThanks for helping sean-k-mooney and gibi09:57
sean-k-mooneyeandersson: o/ get some rest09:58
*** Luzi has quit IRC10:00
*** mkrai_ has joined #openstack-nova10:10
*** mjozefcz|afk is now known as mjozefcz10:18
*** ociuhandu has quit IRC10:27
*** ociuhandu has joined #openstack-nova10:27
*** ttsiouts has quit IRC10:30
*** ttsiouts has joined #openstack-nova10:31
openstackgerritBrin Zhang proposed openstack/python-novaclient master: Add functional test for migration-list in v2.80  https://review.opendev.org/68863510:32
openstackgerritMerged openstack/nova stable/queens: Fix 'has_calls' method calls in unit tests  https://review.opendev.org/67737810:33
*** brinzhang_ has joined #openstack-nova10:34
*** ttsiouts has quit IRC10:36
*** brinzhang has quit IRC10:36
*** brinzhang has joined #openstack-nova10:37
*** brinzhang_ has quit IRC10:38
gibiis it only me or did the powervm unit tests start failing on master?10:42
*** tbachman has quit IRC10:42
*** dpawlik has quit IRC10:43
gibiAttributeError: 'DiGraph' object has no attribute 'node'10:44
*** Luzi has joined #openstack-nova10:44
*** CeeMac has joined #openstack-nova10:47
gibihttps://702b7e8f253d29e679a6-2fe3f6c342189909aad5220492fb4721.ssl.cf1.rackcdn.com/688387/5/check/openstack-tox-cover/2c6410c/testr_results.html.gz10:47
gibihttp://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22AttributeError%3A%20'DiGraph'%20object%20has%20no%20attribute%20'node'%5C%22&from=7d10:48
gibithere are plenty of hits not just in nova10:48
gibionly affecting python 3.x jobs not the 2.7 jobs10:50
jkulikhttps://github.com/networkx/networkx/commit/6b1ce03f485076d39994e8d624bbf6ca82466eb9#diff-027182481aebf9ad0dda6ca00714653aR95 seems to be the cause10:55
jkulikseems like there's no upper version requirement defined in the taskflow library for python3 https://opendev.org/openstack/taskflow/src/branch/master/requirements.txt#L2410:56
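To illustrate the failure mode jkulik is describing, here is a small sketch that does not import networkx at all. `OldStyleGraph` and `NewStyleGraph` are hypothetical stand-ins: one keeps a legacy `node` alias next to `nodes`, the other exposes only `nodes`, which is what the linked commit appears to leave behind. The helper `node_attrs` is likewise an assumption about what a compatible lookup could look like, not the fix taskflow adopted.

```python
class OldStyleGraph:
    """Stand-in for a graph exposing both the legacy and current accessors."""

    def __init__(self):
        self.nodes = {"a": {"color": "red"}}
        self.node = self.nodes  # legacy alias


class NewStyleGraph:
    """Stand-in for a graph that exposes only `nodes`."""

    def __init__(self):
        self.nodes = {"a": {"color": "red"}}


def node_attrs(graph, name):
    """Look up node attributes without relying on the legacy alias."""
    view = getattr(graph, "nodes", None)
    if view is None:
        view = graph.node  # only reached on objects lacking `nodes`
    return view[name]


# Code written against the legacy alias breaks on the new-style object.
try:
    NewStyleGraph().node
    error = ""
except AttributeError as exc:
    error = str(exc)

print(error)
print(node_attrs(OldStyleGraph(), "a"), node_attrs(NewStyleGraph(), "a"))
```

The error text has the same shape as the `'DiGraph' object has no attribute 'node'` message gibi pasted, while the lookup through `nodes` works on both stand-ins.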
*** brinzhang_ has joined #openstack-nova10:57
*** udesale has quit IRC10:57
*** brinzhang_ has quit IRC10:58
gibijkulik: good find. I agree that this can be the problem10:59
*** brinzhang has quit IRC11:00
*** mkrai_ has quit IRC11:02
fricklergibi: jkulik: that matches what I found comparing pip freeze from working and broken jobs. want to propose a cap to reqs as a quick fix?11:06
*** mtreinish has joined #openstack-nova11:12
*** dpawlik has joined #openstack-nova11:16
*** nweinber has joined #openstack-nova11:32
*** ttsiouts has joined #openstack-nova11:33
*** ttsiouts has quit IRC11:44
*** ttsiouts has joined #openstack-nova11:45
*** dviroel has joined #openstack-nova11:45
*** epoojad1 has joined #openstack-nova11:45
*** nweinber has quit IRC11:46
*** ociuhandu has quit IRC11:47
*** ociuhandu has joined #openstack-nova11:48
*** Luzi has quit IRC11:48
*** markvoelker has joined #openstack-nova11:48
*** ttsiouts has quit IRC11:49
*** ratailor_ has quit IRC11:51
*** ociuhandu has quit IRC11:54
*** bbowen has joined #openstack-nova12:03
*** tbachman has joined #openstack-nova12:03
*** ociuhandu has joined #openstack-nova12:03
*** Luzi has joined #openstack-nova12:04
*** jamesden_ is now known as jamesdenton12:09
*** ociuhandu has quit IRC12:11
*** ociuhandu has joined #openstack-nova12:11
*** ociuhandu has quit IRC12:16
*** sapd1 has joined #openstack-nova12:21
*** larainema has quit IRC12:26
*** takamatsu has joined #openstack-nova12:35
*** brinzhang has joined #openstack-nova12:37
*** dtantsur is now known as dtantsur|brb12:38
*** mriedem has joined #openstack-nova12:42
*** sapd1 has quit IRC12:42
openstackgerritBrin Zhang proposed openstack/python-novaclient master: Add functional test for migration-list in v2.80  https://review.opendev.org/68863512:42
brinzhangmriedem: There is an issue, do you have time to review https://review.opendev.org/#/c/688635/5/novaclient/tests/functional/v2/test_migrations.py@11612:44
*** dave-mccowan has joined #openstack-nova12:44
*** eharney has quit IRC12:45
brinzhangmriedem: I can't find a way to get a server with an in-progress live migration to run server-migration-list/show against12:45
gibifrickler jkulik: sorry I was pulled into a meeting in the meanwhile12:45
gibifrickler: at the end I think taskflow needs to be fixed or the networkx req in taskflow needs to be pinned12:46
mriedembrinzhang: likely not going to be able to do that one since it's not deterministic12:47
fricklergibi: yes, updating taskflow not to use the deprecated attribute anymore would be the correct permanent solution I think12:47
fricklergibi: also the u-c update here is failing, so this currently should only affect "special" jobs https://review.opendev.org/68907912:48
*** Luzi has quit IRC12:48
*** epoojad1 has quit IRC12:48
brinzhangmriedem: should I remove the server-migration tests from this patch and do them as a follow-up in novaclient?12:48
*** Luzi has joined #openstack-nova12:49
*** derekh has quit IRC12:49
gibifrickler: hm, but if upper-constraints pins networkx to 2.3 then how can it be that py3.7 jobs are pulling in networkx 2.4?12:50
brinzhangmriedem: If I make fake data, I don't think it's necessary to do this test.12:50
fricklergibi: might be jobs ignoring u-c? didn't check in detail yet.12:51
mriedembrinzhang: yeah i suppose, i forgot that those are only for in-progress live migrations12:51
mriedemwe can't do those anyway since the functional job is single-node12:51
brinzhangyes12:51
openstackgerritBrin Zhang proposed openstack/python-novaclient master: Add functional test for migration-list in v2.80  https://review.opendev.org/68863512:53
*** brinzhang_ has joined #openstack-nova12:54
brinzhang_mriedem: I have updated this patch.12:54
*** takashin has joined #openstack-nova12:54
*** ttsiouts has joined #openstack-nova12:54
mriedemok12:54
*** tetsuro has quit IRC12:55
*** tetsuro has joined #openstack-nova12:55
brinzhang_mriedem: thanks.12:56
*** brinzhang has quit IRC12:56
gibifrickler: locally when I reproduce the problem I see the following in the tox log12:59
gibi$ cat  .tox/py37/log/py37-1.log  | grep networkx12:59
gibiIgnoring networkx: markers 'python_version == "2.7"' don't match your environment12:59
gibiIgnoring networkx: markers 'python_version == "3.4"' don't match your environment12:59
gibiIgnoring networkx: markers 'python_version == "3.5"' don't match your environment12:59
gibiIgnoring networkx: markers 'python_version == "3.6"' don't match your environment12:59
gibifrickler: nvm, I use py3.7 so those logs are valid13:00
fricklergibi: IIUC that's expected when running with python3.7, but is there a cap with ==3.7 in place?13:00
fricklergibi: which repo are you running this in, nova or taskflow?13:00
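A minimal sketch of the marker matching discussed above, assuming constraint lines of the form `name===version;python_version=='X.Y'`. The helper name `pinned_version` and the sample lines are hypothetical and only illustrate why pip reports "Ignoring networkx: markers ... don't match your environment" for every line whose marker names a different Python version; real marker evaluation in pip is more general than this.

```python
import sys


def pinned_version(constraint_lines, package, python_version=None):
    """Return the '===' pin for package whose marker matches python_version.

    Hypothetical helper: only the simple marker form
    python_version=='X.Y' is understood; other markers are skipped.
    """
    if python_version is None:
        python_version = "%d.%d" % sys.version_info[:2]
    for raw in constraint_lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        requirement, _, marker = line.partition(";")
        name, sep, version = requirement.partition("===")
        if not sep or name.strip().lower() != package.lower():
            continue
        marker = marker.strip()
        if marker:
            key, _, wanted = marker.partition("==")
            if key.strip() != "python_version":
                continue
            if wanted.strip().strip("'\"") != python_version:
                # This is the case pip logs as "Ignoring <package>".
                continue
        return version.strip()
    return None


# Illustrative lines only; the version numbers are assumptions,
# not a copy of the real upper-constraints.txt.
SAMPLE = [
    "networkx===2.2;python_version=='2.7'",
    "networkx===2.3;python_version=='3.6'",
    "networkx===2.3;python_version=='3.7'",
]

print(pinned_version(SAMPLE, "networkx", "3.7"))
```

A `None` result for a given interpreter version would mean no pin applies at all, which is one way an unconstrained release could be pulled in.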
*** nweinber has joined #openstack-nova13:04
*** ttsiouts has quit IRC13:06
*** ttsiouts has joined #openstack-nova13:07
*** ociuhandu has joined #openstack-nova13:09
*** ttsiouts has quit IRC13:11
mriedemare the powervm driver unit tests failing since yesterday a known issue?13:11
mriedemhttps://c6fecb2db5c55fa0effa-6229cc6450d9b491384804026d2fbd81.ssl.cf5.rackcdn.com/688980/1/gate/openstack-tox-py36/71a8bdd/testr_results.html.gz13:11
fricklermriedem: IIUC that's the networkx issue13:11
fricklergibi: seems transitive upper-constraints are ignored13:12
fricklersee e.g. https://zuul.opendev.org/t/openstack/build/2c6410c8f6344d19b9c88844b93f0683/log/job-output.txt#525-52813:12
*** derekh has joined #openstack-nova13:12
mriedemyeah 2.4 was released 11 hours ago13:12
*** gbarros has joined #openstack-nova13:12
*** takamatsu has quit IRC13:12
*** ttsiouts has joined #openstack-nova13:14
mriedemhttps://bugs.launchpad.net/nova/+bug/184849913:15
openstackLaunchpad bug 1848499 in OpenStack Compute (nova) "powervm driver tests fail with networkx 2.4: "AttributeError: 'DiGraph' object has no attribute 'node'"" [Critical,Confirmed]13:15
efriedmriedem: do we need to cap taskflow?13:15
efriedoh, or networkx13:15
mriedemit is capped in upper-constraints,13:15
mriedembut as frickler said it seems it's not being honored13:15
fricklerif I add "networkx>=1.11" to nova/test-reqs.txt, the cap works. without it, it doesn't13:15
* mriedem goes to requirements13:15
*** udesale has joined #openstack-nova13:16
openstackgerritHuachang Wang proposed openstack/nova master: To create single NUMA node instance in function '_get_numa_topology_auto'  https://review.opendev.org/68893213:17
openstackgerritHuachang Wang proposed openstack/nova master: Assign and track instance pinning cpu through 'cpu_pinning' field  https://review.opendev.org/68893313:17
openstackgerritHuachang Wang proposed openstack/nova master: Add a new instance CPU allocation policy: mixed  https://review.opendev.org/68893413:17
openstackgerritHuachang Wang proposed openstack/nova master: virt/libvirt: Get host pin cpuset according instance cpu_pinning  https://review.opendev.org/68893513:17
openstackgerritHuachang Wang proposed openstack/nova master: metadata: export the vCPU IDs that are pinning on the host CPUs  https://review.opendev.org/68893613:17
*** mjozefcz has quit IRC13:19
gibifrickler: it is the nova repo I'm using to reproduce13:24
*** mjozefcz has joined #openstack-nova13:26
gibifrickler, mriedem: If I add '+  -r{toxinidir}/requirements.txt'13:27
gibifrickler, mriedem: to tox.ini then the transitive deps are correct and the test passes13:27
mriedemstephenfin: ^?13:28
mriedemlooks like networkx 2.4 changed Graph.node to Graph.nodes, weeee https://github.com/networkx/networkx/blob/networkx-2.4/doc/release/release_2.4.rst#deprecations13:28
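A minimal sketch of how calling code could tolerate the Graph.node to Graph.nodes change, assuming the caller only needs the per-node attribute mapping. The helper `node_attrs` and the two stand-in classes are hypothetical; the stand-ins exist so the example runs without networkx installed and are not the networkx or taskflow implementation.

```python
def node_attrs(graph):
    """Return a mapping of node -> attribute dict for a graph-like object.

    Prefers the newer ``nodes`` attribute when it can be indexed, and
    falls back to the older ``node`` attribute otherwise (in networkx 1.x
    ``nodes`` was a plain method, so it cannot be indexed).
    """
    nodes = getattr(graph, "nodes", None)
    if hasattr(nodes, "__getitem__"):
        return nodes
    return graph.node


class NewStyleGraph:
    """Stand-in for a graph that only exposes ``nodes`` (like networkx 2.4)."""

    def __init__(self):
        self.nodes = {"a": {"color": "red"}}


class OldStyleGraph:
    """Stand-in for a graph with a ``node`` dict and a ``nodes()`` method."""

    def __init__(self):
        self.node = {"a": {"color": "red"}}

    def nodes(self):
        return list(self.node)


for g in (NewStyleGraph(), OldStyleGraph()):
    print(type(g).__name__, node_attrs(g)["a"])
```

Code that only ever runs against networkx 2.x could skip the fallback and index `graph.nodes` directly.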
*** mkrai_ has joined #openstack-nova13:30
*** eharney has joined #openstack-nova13:30
stephenfinlooking13:30
mriedemgibi: i want to say we used to explicitly include -r{toxinidir}/requirements.txt in deps but it was removed b/c it's installed for us in the tox env13:30
mriedembut maybe that breaks u-c processing?13:30
openstackgerritBalazs Gibizer proposed openstack/nova master: Make sure tox install requirements.txt with upper-constraints  https://review.opendev.org/68915213:31
gibimriedem, frickler, stephenfin: ^^ temporary fix13:31
stephenfinmriedem, gibi: We need mordred/smcginnis for this13:31
fricklergibi: mriedem: stephenfin: that seems a bug introduced in https://review.opendev.org/#/c/684775/13:31
stephenfingibi: You're essentially reverting b13c33caa07fc82b19c233f9ad46a1813eb3e76d13:31
* smcginnis reads scrollback13:32
stephenfinI think we should probably do that or 19a0bdfec454bd921b718e7dc49fe2673fa79b1013:32
stephenfinsmcginnis: we don't include 'requirements.txt' in deps because tox builds an sdist for us which will include said deps automagically13:32
openstackgerritTakashi NATSUME proposed openstack/nova stable/stein: Fix unit of hw_rng:rate_period  https://review.opendev.org/68915313:32
stephenfinhowever, I switched from overriding 'install_command' to providing constraints via '-c FILE' in deps13:33
mriedemstephenfin: you mean revert https://review.opendev.org/#/c/684775/ right?13:33
fricklerstephenfin: but it seems to include uncapped reqs13:33
gibistephenfin: install_commands are used for every install step in tox while deps only used to install test-requirements ?13:33
stephenfingibi: Yeah, I think so13:33
smcginnisstephenfin: This seems OK - https://review.opendev.org/#/c/689152/1/tox.ini13:33
smcginnisThat is what is done elsewhere.13:33
openstackgerritTakashi NATSUME proposed openstack/nova stable/rocky: Fix unit of hw_rng:rate_period  https://review.opendev.org/68915413:33
smcginnisThen the other jobs that need different deps, like the lower-constraints job, override "deps" to set the requirements it needs.13:34
stephenfinsmcginnis: Cool. I didn't know this would happen so I need to go make sure we haven't broken other projects13:34
smcginnisAs long as the -c isn't hard coded in the install_command, setting "deps" should be flexible enough to use different requirements for different jobs.13:35
openstackgerritTakashi NATSUME proposed openstack/nova stable/queens: Fix unit of hw_rng:rate_period  https://review.opendev.org/68915513:35
*** factor has quit IRC13:36
smcginnisOne tricky bit I've seen is that it's not always obvious that putting a deps line in a tox environment overrides rather than appends to the deps that are used. So there have been some cases where teams have meant to use an additional file but have ended up excluding some common ones.13:36
smcginnisSo just make sure that wherever you set deps you always include things like test-requirements (where appropriate of course).13:36
stephenfingibi: comments left13:36
stephenfinsmcginnis: Yeah, we're good there. We use the '{testenv[blah]}deps' syntax everywhere that matters13:37
mriedemstephenfin: replied to you in there13:37
*** ileixe has joined #openstack-nova13:38
smcginnisstephenfin: ++13:38
stephenfinmriedem: yup, both valid13:38
stephenfingibi, efried, mriedem, alex_xu: Also, I'm on PTO from tomorrow until the summit. Just FYI13:39
efriedholy crap, that's in like two and a half weeks, I didn't realize how close it was.13:40
sean-k-mooneyyep13:40
stephenfinyou're telling me13:40
efriedgdi, I have to take my kid to the dentist this morning during meeting time13:41
gibistephenfin: I've just confirmed that having the constraint in the install_command also works13:41
stephenfingibi: yeah, I think we don't want to do that because people forget to override install_command for the lower-constraints target13:41
stephenfinat least that's what I took away from smcginnis' comments above and elsewhere13:42
*** lbragsta_ has joined #openstack-nova13:42
smcginnisCorrect.13:42
gibistephenfin, smcginnis: I got your comments on the fix, I will respin that patch quickly13:43
openstackgerritTakashi NATSUME proposed openstack/nova stable/pike: Fix unit of hw_rng:rate_period  https://review.opendev.org/68915813:43
*** gbarros has quit IRC13:44
efriedokay, I'm going to be late to the meeting, but hopefully not more than a few minutes13:46
*** gbarros has joined #openstack-nova13:46
efriedpoor planning (from like months ago)13:46
*** efried is now known as efried_afk13:46
*** brinzhang has joined #openstack-nova13:47
*** brinzhang_ has quit IRC13:50
mriedemi'll start the meeting13:50
*** jawad_axd has quit IRC13:52
*** jawad_axd has joined #openstack-nova13:54
*** ociuhandu has quit IRC13:57
*** Luzi has quit IRC13:57
*** jawad_axd has quit IRC13:58
mriedemnova meeting starting now in -meeting14:00
*** gbarros has quit IRC14:02
*** ociuhandu has joined #openstack-nova14:02
openstackgerritBalazs Gibizer proposed openstack/nova master: Make sure tox install requirements.txt with upper-constraints  https://review.opendev.org/68915214:03
gibimriedem, stephenfin, smcginnis ^^14:03
*** gbarros has joined #openstack-nova14:04
openstackgerritDan Smith proposed openstack/nova stable/train: Update compute rpc version alias for train  https://review.opendev.org/68916414:05
*** jawad_axd has joined #openstack-nova14:05
sean-k-mooneyoh, for the docs job: there was a reason we did not install requirements and test-requirements in the docs job before14:06
sean-k-mooneythe way we build docs using auto generation, it's kind of required, but there was pushback to doing this in the past.14:07
*** ociuhandu has quit IRC14:07
*** brinzhang_ has joined #openstack-nova14:08
openstackgerritHuachang Wang proposed openstack/nova master: [WIP] metadata: export the vCPU IDs that are pinning on the host CPUs  https://review.opendev.org/68893614:08
*** bnemec has joined #openstack-nova14:11
*** brinzhang has quit IRC14:12
*** efried_afk is now known as efried14:12
*** gbarros has quit IRC14:13
*** sapd1 has joined #openstack-nova14:13
*** dtantsur|brb is now known as dtantsur14:14
*** ociuhandu has joined #openstack-nova14:15
stephenfinsean-k-mooney: We're installing requirements.txt by default already. Unless you have skipsdist=False configured, tox will build your package for you14:15
stephenfinwhich obviously requires all dependencies in requirements.txt at a minimum14:16
sean-k-mooneyya i know14:16
sean-k-mooneybecause of the way our docs work you need /requirements.txt and /docs/requirements.txt14:16
sean-k-mooneyyou should not need test-requirements.txt but it won't break anything if it's installed14:17
*** gbarros has joined #openstack-nova14:18
*** lpetrut has quit IRC14:18
*** yan0s has joined #openstack-nova14:20
*** gbarros has quit IRC14:20
*** gbarros has joined #openstack-nova14:23
*** dpawlik has quit IRC14:27
*** gbarros has quit IRC14:27
*** gbarros has joined #openstack-nova14:28
*** mkrai_ has quit IRC14:30
*** mkrai_ has joined #openstack-nova14:30
*** takamatsu has joined #openstack-nova14:32
*** yan0s has quit IRC14:32
*** yan0s has joined #openstack-nova14:32
openstackgerritStephen Finucane proposed openstack/nova master: functional: Change order of two classes  https://review.opendev.org/68917814:36
openstackgerritStephen Finucane proposed openstack/nova master: functional: Rework '_delete_server'  https://review.opendev.org/68917914:36
openstackgerritStephen Finucane proposed openstack/nova master: functional: Make '_wait_for_state_change' behave consistently  https://review.opendev.org/68918014:36
openstackgerritStephen Finucane proposed openstack/nova master: functional: Unify '_wait_until_deleted' implementations  https://review.opendev.org/68918114:36
openstackgerritStephen Finucane proposed openstack/nova master: functional: Make 'ServerTestBase' subclass 'InstanceHelperMixin'  https://review.opendev.org/68918214:36
*** jawad_axd has quit IRC14:37
stephenfinmdbooth, sean-k-mooney: ^14:40
*** gyee has joined #openstack-nova14:40
*** ttsiouts has quit IRC14:40
stephenfinthe juicy one in the middle is failing for reasons I haven't yet grokked, but that's where I'm going with it14:40
*** ttsiouts has joined #openstack-nova14:43
*** sridharg has quit IRC14:43
openstackgerritMatthew Booth proposed openstack/nova master: Add new functional test base for libvirt tests  https://review.opendev.org/68918614:44
mdboothstephenfin sean-k-mooney: ^^^14:45
mdboothsean-k-mooney: I ripped out a few things you added for your test because I'm not using them in mine, so you'll almost certainly want to add some of them back in14:46
mdboothsean-k-mooney: But I think that's ok. i.e. Extend functionality as the use case arises.14:46
sean-k-mooneysure14:46
*** ricolin has joined #openstack-nova14:47
mdboothHowever, with ^^^, a functional libvirt test is: inherit IntegratedTestBase; self._start_compute('compute1'); server = self._create_active_server()14:47
mdboothAnd I like that simplicity14:47
*** jmlowe has quit IRC14:48
sean-k-mooneyya that seems like a good way forward14:48
* efried chauffeurs again14:53
* efried totally fubared today's calendar14:53
*** efried is now known as efried_afk14:53
* melwitt lobbys14:53
melwittdansmith: from the meeting, any opinion on whether this is better off as a bp or a wishlist bug that is backportable? https://blueprints.launchpad.net/nova/+spec/nova-manage-db-purge-task-log14:54
melwitttask_log records pile up and there's no way to clean them up14:54
*** TxGirlGeek has joined #openstack-nova14:54
melwittgibi, stephenfin, bauzas: ^ any opinion14:55
melwitt?14:55
* mriedem notes she asked everyone except the ptl :)14:56
gibimelwitt: I will have to dig a bit14:57
gibimelwitt: give me 15 minutes as there is a parallel meeting14:57
mriedemmdbooth: new functional tests shouldn't be using IntegratedTestBase14:57
*** mkrai_ has quit IRC14:57
melwittmriedem: he's afk!14:57
mriedemit's got all sorts of warts from api samples tests, like CastAsCall fixture and stuff14:57
melwittefried_afk: can you give your opinion on bp vs wishlist bug while you're driving pls ^14:58
*** gbarros has quit IRC14:58
dansmithmelwitt: I'd probably say it's a feature like db purge was, but I understand the desire to make it backportable (for real value), so I don't feel that strongly14:59
melwittack14:59
bauzasmelwitt: /me looks15:00
bauzashonestly, I have the same thoughts about the audit command15:01
*** mkrai_ has joined #openstack-nova15:01
bauzasonce we merge it (and honestly, it still needs some time from me), I think we *could* honestly backport it to help operators15:01
bauzasI said we *could*15:01
mriedemlike, honestly?15:01
bauzasbut we need some consensus15:01
bauzasif someone doesn't want to backport any feature or a wishlist bug, I understand it15:02
bauzasbecause I could tell this15:02
melwittyeah, I mean, the usual is we backport downstream only in the feature cases. example: I'm in the middle of backporting db purge, archive_deleted_rows --before and --all-cells15:02
*** dpawlik has joined #openstack-nova15:02
*** takashin has left #openstack-nova15:02
bauzasyeah, honestly, backporting the audit command only downstream wouldn't be a problem for me15:03
melwittif people are ok with backporting purge_task_log upstream, then bug it up I guess15:03
*** jmlowe has joined #openstack-nova15:03
bauzasbut I think operators not using OSP would also love it, even for Train15:03
melwittyeah15:03
bauzasand I think for purge, it's the same15:03
melwittI dunno, I would have thought the same for purge, --before and --all-cells15:03
melwittthough15:04
bauzasso, yeah, I agree with you, maybe just provide a backport change in stable/train and then we could discuss about it there15:04
*** gbarros has joined #openstack-nova15:04
*** mjozefcz has quit IRC15:04
mriedemimo backporting standalone new commands (like heal_allocations in my case) is less of an issue because if they are busted then whatever, no one is using them on stable already anyway,15:05
mriedembut backporting big changes to existing CLIs that people are using, like the all cells stuff for archive, is much riskier15:05
bauzasactually, good point15:05
melwittyeah, I could see that. risk aspect15:05
bauzasif we're adding some argument, I don't see the problem15:05
bauzasbut if we're changing some arg, then yes it's at risk15:06
mriedemdepends on how invasive it is15:06
melwittheh yeah.15:06
bauzasright, hence we should be discussing it on the stable change15:06
mriedemdiconico07: i've commented in https://bugs.launchpad.net/nova/+bug/1847367 from the results of the meeting15:06
openstackLaunchpad bug 1847367 in OpenStack Compute (nova) "Images with hw:vif_multiqueue_enabled can be limited to 8 queues even if more are supported" [Undecided,Confirmed] - Assigned to sean mooney (sean-k-mooney)15:06
bauzasI mean, anyone can provide any change to the stable branches15:06
bauzasit's just the stable cores that either accept or disagree with it15:07
*** dpawlik has quit IRC15:07
sean-k-mooneymriedem: cool i have something typed up as well15:07
melwittbauzas: the master change isn't written yet :P but really the discussion here is whether to do it as a bp or a wishlist bug with backports in mind15:07
melwittI think the process has been, if it's a bp then it's totally nacked on stable15:08
bauzasmelwitt: mriedem: but honestly, the stable rules don't say 'please don't backport any feature'15:08
bauzashttps://docs.openstack.org/project-team-guide/stable-branches.html#appropriate-fixes15:08
bauzasapart from https://docs.openstack.org/project-team-guide/stable-branches.html#active-maintenance rule #115:09
sean-k-mooneymriedem: this was the downstream bug that i was going to fix https://bugzilla.redhat.com/show_bug.cgi?id=1714075 but to be honest i have known about this behavior for years and it's bugged me so i'll be happy to fix it15:09
openstackbugzilla.redhat.com bug 1714075 in openstack-nova "[OSP13][NFV] 8 queues limit is applicable for tap device not for vhostuser port in kernel version 3." [Medium,Assigned] - Assigned to smooney15:09
bauzasbut in https://docs.openstack.org/project-team-guide/stable-branches.html#review-guidelines we say "  Proposed backports breaking any of the above guidelines can be discussed as exception requests on the openstack-discuss list (prefix with [stable]) where the stable maintenance core team will have the final say. "15:09
bauzasmelwitt: so, see, even with stable, you can still have exceptions15:09
sean-k-mooneymriedem: i was pretty sure i had already filed a bug for vhost-user but i can't find it in launchpad so i'll file a new one as you said15:09
bauzasmelwitt: so I don't see a problem with you asking for an exception once you're done with master15:10
mriedem"the backport guidelines don't say anything about new features...oh except this part where it says backports for new features are completely forbidden"15:11
*** nanzha has quit IRC15:11
mriedem:/15:11
melwittbauzas: I don't think that's likely to fly with the stable team :P just mho15:11
mriedemmelwitt: just do a wishlist bug, drop the bp, write the patch and we can slit each others throats on backport policy in 3 months?15:11
melwittmriedem: lol, sounds great15:11
bauzasmriedem: it tells about some possible exceptions :p15:12
mriedemhow about someone familiar with the new blueprint process tell me the decoder ring for what i can set for the Direction and Definition fields when approving a specless blueprint?15:12
dansmithIf we couldn't backport a tiny feature to mitigate spectre, I can't imagine we're going to get permission to backport something like this15:12
mriedemcan i mark both as "approved"?15:12
*** nanzha has joined #openstack-nova15:13
melwittI think there's an ML mail about that. /me looks15:13
sean-k-mooneymriedem: i think the intent was to mark the Direction as approved after review around m215:13
*** mlavalle has joined #openstack-nova15:14
sean-k-mooneybut this is a small thing that i expect mel will have ready pretty quickly so i hope its merged well before that point15:14
sean-k-mooneyso ya you probably could mark both as approved15:14
melwittnvm, I guess it doesn't really explain it http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009945.html15:14
bauzasmriedem: https://specs.openstack.org/openstack/nova-specs/readme.html#the-lifecycle-of-a-specification15:14
bauzasmriedem: basically, now set Definition as "approved"15:15
* dansmith hopes nobody notices him in the corner trying to light the building on fire15:15
bauzasFWIW, for "Direction", we didn't have a consensus when merging the proposal15:15
bauzasso, leave it blank15:16
dansmithbauzas: you might say there was no.....Direction?15:16
sean-k-mooneybauzas: if the thing is merged before m2 or m3 it really does not matter15:16
* mriedem rimshots15:16
bauzasin theory, "Direction" would be set by the PTL and used for 'important' BPs15:16
mriedemif only we had a priority field...15:16
* sean-k-mooney recoils from that joke15:16
melwittlol ahhhh15:16
bauzasbut I disagreed on that since it wasn't explaining the process to define *which* BPs would be blessed15:16
bauzashence the use of conditional15:17
mriedemDirection is binary btw, approved or not approved,15:17
mriedemlike you can be pregnant or not15:17
dansmithkinda like this conversation can make you suicidal or not?15:17
bauzasheh honestly, we shouldn't care now about those fields until someone (say efried_afk) clarifies the use15:18
sean-k-mooneyi was going to ask about the inplace rebuild but i'm just going to write a unit test and fix my typos instead15:18
bauzasI see those fields as "optional" for further usage :)15:18
bauzasI was more interested honestly in the other side of the change, which is the feature liaison concept15:18
*** yan0s has quit IRC15:23
*** igordc has joined #openstack-nova15:24
*** gbarros has quit IRC15:26
*** ileixe has quit IRC15:26
*** maciejjozefczyk has joined #openstack-nova15:27
*** zbr has quit IRC15:29
*** zbr__ has joined #openstack-nova15:29
gibimelwitt: my suggestion for the https://blueprints.launchpad.net/nova/+spec/nova-manage-db-purge-task-log . Do the implementation with backportability in mind. Use a bug if you want to avoid the procedural -2 on stable backport. If the bug backport will be nack-ed by the stable team then you still have a backportable fix that a distro can backport15:31
melwittgibi: makes sense, thanks15:31
*** jawad_axd has joined #openstack-nova15:32
gibimelwitt: and I have not technical problems with the proposed change in that bp15:32
gibiI mean I don't have any technical issues15:32
melwittack, thanks15:33
*** mkrai_ has quit IRC15:34
*** mkrai__ has joined #openstack-nova15:34
*** maciejjozefczyk has quit IRC15:37
*** nanzha has quit IRC15:39
*** jbernard has quit IRC15:41
*** TxGirlGeek has quit IRC15:42
*** maciejjozefczyk has joined #openstack-nova15:47
*** dpawlik has joined #openstack-nova15:47
openstackgerritMerged openstack/os-traits master: Add COMPUTE_NODE trait  https://review.opendev.org/68896915:47
*** maciejjozefczyk has quit IRC15:48
*** jmlowe has quit IRC15:49
*** brinzhang_ has quit IRC15:49
*** jbernard has joined #openstack-nova15:49
*** brinzhang_ has joined #openstack-nova15:50
*** brinzhang_ has quit IRC15:51
*** nanzha has joined #openstack-nova15:51
*** dpawlik has quit IRC15:52
*** TxGirlGeek has joined #openstack-nova15:52
*** ttsiouts has quit IRC15:54
*** gbarros has joined #openstack-nova15:54
mriedemgibi: on https://review.opendev.org/#/c/689049/1/nova/scheduler/client/report.py@1844 - i'm adding a new kwarg to handle the logic if the target consumer does not exist,15:56
mriedemthoughts on variable names? i was thinking "target_is_new" or "reverting_allocations"15:56
mriedemwhat makes more sense to you?15:56
mriedemthe former might be more obvious, the latter is maybe too tightly coupled to what is calling the method15:56
*** gbarros has quit IRC15:58
*** gbarros has joined #openstack-nova15:58
*** tssurya has quit IRC16:00
*** efried_afk is now known as efried16:02
efriedmelwitt: If it matters for backportability, make it a bug.16:03
melwittefried: thanks, I will bug it16:03
efriedAll evidence to the contrary, I'm anti-process. I just want to get shit done.16:03
efriedso whatever moves the ball16:04
melwittwfm16:04
bauzasefried: the only problem is that any negotiation on what's reasonable and what's not depends on the mandate people give you16:07
bauzasefried: and I'm super afraid of us trying to decide priorities based on biased arguments16:07
efriedwhat do you mean, mandate?16:07
efriedand what people?16:07
*** mkrai__ has quit IRC16:08
efriedand yes, I agree, it's tough to coordinate priorities16:08
bauzasefried: I think I mixed two things16:08
efriedbauzas: but it's a fact that we approve more than we can hope to accomplish16:08
bauzasyou were mentioning the stable rules with melwitt's BP, I was thinking of the new spec process with the Direction field16:09
*** damien_r has quit IRC16:09
efriedand IMO it's better to cut things off, even if it's completely arbitrary (like a hard number, picked at random) than to just meander along and have no idea at the start of the release what's got a chance of merging by the end of the release.16:09
efriedthat's what's been motivating me from day one.16:09
bauzasefried: you'd probably be surprised if I told you I don't see a problem with having more things approved than we can accomplish16:10
efriedI wouldn't say I'm surprised. The fact that nobody seems to mind is why we are where we are.16:10
melwittas a person trying to get my things done, I'd rather have some chance than no chance, but that's just MHO16:11
bauzasright16:11
*** gbarros has quit IRC16:11
bauzasand having some way to help contributors to understand the dynamics improve the situation16:11
bauzasimproves*16:11
efriedMine can't be the only downstream that yells at me because "what do you mean it didn't get reviewed? The blueprint was approved!"16:11
sean-k-mooneyno but that has always been a thing16:12
efriedwhich is a perfect non-argument for continuing to allow it.16:12
bauzasefried: that's one of the reasons why I don't want our upstream process with specs be an OKR for my management16:12
efriedokr?16:12
dansmithefried: there are more reasons for things not getting reviewed than bandwidth or over-committing16:12
bauzasobjective key result16:12
bauzasI mean, I don't wanna brag because my spec is approved16:13
efrieddansmith: understood and acknowledged in the ML Thread of Doom.16:13
sean-k-mooneywell my point was the bottleneck has never been writing the code. it has always been reviewing it. we can reduce the scope for new features but i know that that leads to less investment in openstack16:13
bauzasif I were bragging, that would be because I feel we got a consensus on the design we gonna achieve for the thing I wanna implement16:13
bauzasbut certainly not implying that my stuff is done16:14
bauzasanyway, I need to call it a day16:15
bauzasthings change, people become parents of kids who grown up and have social activities16:15
bauzasgrow*16:15
bauzasand since stupidly kids under 8 can't drive, I need to AWOL16:15
sean-k-mooneybauzas: don't sell your kids short. have you given them the opportunity to try? what could possibly go wrong16:16
openstackgerritMatt Riedemann proposed openstack/nova master: Delete source allocations in move_allocations if target no longer exists  https://review.opendev.org/68904916:16
*** gbarros has joined #openstack-nova16:17
mriedema wild idea: more cores should do more reviews16:20
mriedem*gasp*16:20
mriedemhttps://www.stackalytics.com/report/contribution/nova/12016:20
*** jbernard has quit IRC16:20
mriedemwho is <1 review per day on that list?16:20
*** udesale has quit IRC16:22
efried1/3 of the core team.16:22
mriedemright,16:22
mriedemso if you're a core below that threshold, stop complaining16:23
*** ociuhandu_ has joined #openstack-nova16:23
efriedto be fair, I don't see those cores complaining.16:23
efriedActually, I think I'm the only one complaining.16:23
mriedemi see bauzas and melwitt complaining above16:23
*** jbernard has joined #openstack-nova16:23
melwittmy comment was not intended as a complaint16:24
efriedfwiw I saw both as stating reasons for preferring the status quo wrt approving more than we can hope to review.16:25
dansmithyeah, that was my understanding as well16:25
dansmithboth melwitt and bauzas have been around since we've tried many similar schemes in the past too16:26
dansmithas have I and mriedem16:26
*** ociuhandu has quit IRC16:26
melwittI've not argued or voted on any of the new process things because I am in a difficult spot these days with upstream review time. I said one sentence from the perspective of being a contributor. I didn't want anyone to see it as complaining from me16:26
*** markvoelker has quit IRC16:26
*** rpittau is now known as rpittau|afk16:27
*** nanzha has quit IRC16:27
*** ociuhandu_ has quit IRC16:30
bauzasfolks, I was on and off last cycles, and I promised too much so now I'm done with this16:32
*** maciejjozefczyk has joined #openstack-nova16:32
bauzaswhat I just want is helping others as much as I can16:32
efriedThere was a request for a feature liaison earlier :)16:33
bauzasso, I'm glad mriedem pings me with asking to review stable changes for example, or spec review request16:33
efriedhttp://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-10-17.log.html#t2019-10-17T04:14:0616:33
bauzasand then, if I can commit myself, I do16:33
*** dtantsur is now known as dtantsur|afk16:37
openstackgerritStephen Finucane proposed openstack/nova master: functional: Make '_wait_for_state_change' behave consistently  https://review.opendev.org/68918016:39
openstackgerritStephen Finucane proposed openstack/nova master: functional: Unify '_wait_until_deleted' implementations  https://review.opendev.org/68918116:39
openstackgerritStephen Finucane proposed openstack/nova master: functional: Make 'ServerTestBase' subclass 'InstanceHelperMixin'  https://review.opendev.org/68918216:39
*** lpetrut has joined #openstack-nova16:39
*** lpetrut has quit IRC16:40
*** lpetrut has joined #openstack-nova16:40
eanderssonsean-k-mooney, the issue is oslo messaging related btw, or maybe rabbitmq related.16:45
*** sapd1 has quit IRC16:46
eanderssonI tried to publish and consume to the specific compute queues and everything worked fine16:46
eanderssonI even captured the message last night from the scheduler16:46
eanderssonbut deleting the queues and restarting the compute, and magically it's working16:46
*** derekh has quit IRC16:58
sean-k-mooneythat makes me think of an oslo bug17:00
sean-k-mooneyeandersson: https://bugs.launchpad.net/oslo.messaging/+bug/166151017:01
openstackLaunchpad bug 1661510 in oslo.messaging "topic_send may loss messages if the queue not exists" [Medium,In progress] - Assigned to Gabriele Santomaggio (gsantomaggio)17:01
sean-k-mooneyi think that's the one i'm thinking of17:01
*** lpetrut has quit IRC17:01
*** markvoelker has joined #openstack-nova17:05
sean-k-mooneymelwitt: do you remember this oslo change that introduced the mandatory flag for rabbitmq17:05
sean-k-mooneyhttps://review.opendev.org/#/c/660373/17:05
melwitta little bit, yeah17:06
sean-k-mooneymelwitt: do you recall if we ever started using it in nova17:06
melwittnot that I know of17:06
sean-k-mooneyi would guess the nova bug for losing messages is still open17:06
sean-k-mooneyit's possible that is what eandersson hit17:07
*** amodi has joined #openstack-nova17:07
*** tbachman_ has joined #openstack-nova17:10
melwittI don't recall a nova bug for this one17:10
melwittin lp17:10
sean-k-mooneyi think there was one but my search-fu is failing17:10
sean-k-mooneyis there a way to search for bugs you commented on?17:11
*** jangutter_ has joined #openstack-nova17:11
*** tbachman has quit IRC17:11
*** tbachman_ is now known as tbachman17:11
sean-k-mooneyoh there is17:12
melwitthere's the IRC convo about it from the same day we commented on the oslo.messaging review http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-06-06.log.html#t2019-06-06T22:47:0917:12
sean-k-mooneyoh that's clever, i would not have thought of that. i remembered it was mnaser that hit it17:13
*** jangutter has quit IRC17:14
melwittrest of the convo is here and I don't see any nova lp bug mentioned other than an old one that got no additional info on it17:15
melwitthttp://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-06-07.log.html17:15
sean-k-mooneyok so there wasn't a specific nova bug but there were 3 related bugs17:15
sean-k-mooneyhttps://bugs.launchpad.net/nova/+bug/179470617:15
openstackLaunchpad bug 1794706 in OpenStack Compute (nova) "The instance left stuck when oslo.messaging raised MessageDeliveryFailure exception" [Undecided,Expired]17:15
sean-k-mooneyhttps://bugs.launchpad.net/oslo.messaging/+bug/143795517:15
openstackLaunchpad bug 1437955 in oslo.messaging "RPC calls and responses do not use the mandatory flag (AMQP)" [Wishlist,Confirmed] - Assigned to Gabriele Santomaggio (gsantomaggio)17:15
melwittyeah17:15
sean-k-mooneyhttps://bugs.launchpad.net/oslo.messaging/+bug/166151017:15
openstackLaunchpad bug 1661510 in oslo.messaging "topic_send may loss messages if the queue not exists" [Medium,In progress] - Assigned to Gabriele Santomaggio (gsantomaggio)17:15
*** maciejjozefczyk has quit IRC17:15
sean-k-mooneyeandersson: so i think you were hitting the same issue as mnaser17:16
*** awalende has quit IRC17:16
*** jangutter_ has quit IRC17:16
*** awalende has joined #openstack-nova17:16
sean-k-mooneymelwitt: thanks :)17:17
melwitt:)17:17
sean-k-mooneyshould i unexpire https://bugs.launchpad.net/nova/+bug/1794706 by the way17:18
openstackLaunchpad bug 1794706 in OpenStack Compute (nova) "The instance left stuck when oslo.messaging raised MessageDeliveryFailure exception" [Undecided,Expired]17:18
sean-k-mooneyi guess there is no point, we don't have the info we need17:18
*** awalende has quit IRC17:21
eanderssonYea that sounds like the exact issue17:22
eanderssonbtw another major issue with this is that unless you know what you are doing it's very difficult to identify the bad compute17:23
melwittsean-k-mooney: yeah, I'd say don't bother, it's from 2017 and not enough info to move forward, unless I've missed something17:23
melwittoh, nvm 2018. I can't read17:23
melwittsean-k-mooney: if you wanted to unexpire and add more info to it based on eandersson experience (if it's the same thing) then I think that makes sense17:27
sean-k-mooneyeandersson: yes. would you feel comfortable writing this up as a bug17:27
sean-k-mooneymelwitt: it's not exactly the same thing as the old nova bug17:27
sean-k-mooneybut i think it's the same oslo bug17:27
melwittack17:28
sean-k-mooneywhich is what mnaser was hitting17:28
eanderssonI am pretty sure what happened here was that we had a rabbitmq network partition weeks ago17:28
eanderssonand that partition somehow damaged the queue17:28
sean-k-mooneyyep17:28
eanderssonbut only from openstack perspective, because I could consume the queue using the ui etc.17:28
sean-k-mooneybasically if the queue gets deleted17:28
sean-k-mooneyand you don't restart the nova-compute agent17:28
sean-k-mooneythen it won't recreate it17:28
eanderssonand conductor17:28
eanderssonyea17:28
eanderssonthe weird thing is that it existed in a "healthy" state with the bindings etc17:29
melwittand if we leverage the earlier mentioned oslo.messaging change, we can make it recover in nova?17:29
eanderssonIf I can find another bad queue I can test17:29
sean-k-mooneyi think so17:29
melwittkewl17:30
sean-k-mooneyi think either the conductor or compute node would get an exception if they tried to do a topic send to the queue, and that would allow us to fix it by recreating the queues17:30
openstackgerritMerged openstack/nova stable/queens: lxc: make use of filter python3 compatible  https://review.opendev.org/67650017:30
openstackgerritMerged openstack/nova master: Make sure tox install requirements.txt with upper-constraints  https://review.opendev.org/68915217:31
sean-k-mooneymelwitt: i would need to fully re-read the oslo feature but they had a recovery mechanism in mind17:31
melwittsean-k-mooney: yeah. if I'm remembering right, they were saying just setting an option called 'mandatory' would make it do the things by itself17:32
*** mdbooth has quit IRC17:32
melwittand didn't want to change the default to mandatory=1/true, so they added this config option passing mechanism. and we just needed to pass 'mandatory' to it in nova17:33
melwittsomething like that17:33
sean-k-mooneyya something like that17:33
sean-k-mooneyhttps://bugs.launchpad.net/oslo.messaging/+bug/143795517:33
openstackLaunchpad bug 1437955 in oslo.messaging "RPC calls and responses do not use the mandatory flag (AMQP)" [Wishlist,Confirmed] - Assigned to Gabriele Santomaggio (gsantomaggio)17:33
sean-k-mooneythis kind of explains it but i think there was more in the review17:33
* melwitt nods17:33
*** mdbooth has joined #openstack-nova17:34
sean-k-mooneywhile this is only the 2nd time someone has reported this in the last 4 months, it is really hard to triage/identify this issue, so we probably should try to fix this. i guess we should follow up with the oslo folks and see if it's ready to use17:36
*** lpetrut has joined #openstack-nova17:37
sean-k-mooneyi'm surprised more people have not hit this to be honest17:37
*** ociuhandu has joined #openstack-nova17:38
sean-k-mooneymaybe they have and just rebooted the node17:38
melwittyeah, should just be a matter of bumping our minimum oslo.messaging version and finding where to set the mandatory flag17:41
melwittthat doesn't help for stable branches obvs. not sure if we have any options there17:42
melwittI can give it a go unless someone else wants to17:42
melwittlooks like the change was released in version 9.8.017:43
melwittthinking more, there's no way we could have this for stable branches17:46
*** psachin has quit IRC17:46
melwittand also, how to reproduce and prove the mandatory flag fixes anything17:46
*** dpawlik has joined #openstack-nova17:48
*** dpawlik has quit IRC17:53
*** mriedem has quit IRC17:55
*** mriedem has joined #openstack-nova17:56
*** jawad_axd has quit IRC17:59
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'os-consoles' API  https://review.opendev.org/68790718:05
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'nova-console' service, 'os-consoles' API  https://review.opendev.org/68790818:05
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'nova-xvpvncproxy'  https://review.opendev.org/68790918:05
*** jawad_axd has joined #openstack-nova18:11
openstackgerritStephen Finucane proposed openstack/nova master: functional: Change order of two classes  https://review.opendev.org/68917818:12
openstackgerritStephen Finucane proposed openstack/nova master: functional: Rework '_delete_server'  https://review.opendev.org/68917918:12
openstackgerritStephen Finucane proposed openstack/nova master: functional: Make '_wait_for_state_change' behave consistently  https://review.opendev.org/68918018:12
openstackgerritStephen Finucane proposed openstack/nova master: functional: Unify '_wait_until_deleted' implementations  https://review.opendev.org/68918118:12
openstackgerritStephen Finucane proposed openstack/nova master: functional: Make 'ServerTestBase' subclass 'InstanceHelperMixin'  https://review.opendev.org/68918218:12
*** ociuhandu has quit IRC18:13
mnasereandersson: i have hit that exact issue many times :(18:17
mnaserit really has to do with rabbitmq being unhappy when the queues come back18:18
mnasercurious -- what version of it are you running?18:18
*** ociuhandu has joined #openstack-nova18:26
*** lbragsta_ has quit IRC18:29
eandersson3.7.518:31
*** zbr__ has quit IRC18:32
*** zbr has joined #openstack-nova18:34
mordredstephenfin: did y'all sort out the requirements thing from earlier?18:37
dansmithmordred: with the networx thing or whatever?18:38
dansmithmordred: https://review.opendev.org/#/c/689152/18:38
mordreddansmith: cool!18:40
*** ceryx has joined #openstack-nova18:41
*** spatel has joined #openstack-nova18:44
*** lpetrut has quit IRC18:46
*** jawad_axd has quit IRC18:50
*** dpawlik has joined #openstack-nova18:51
*** ociuhandu has quit IRC18:52
*** dpawlik has quit IRC18:55
*** tbachman has quit IRC19:01
*** jangutter has joined #openstack-nova19:06
*** tesseract has quit IRC19:07
eanderssonsean-k-mooney, mnaser > Message not delivered: NO_ROUTE (312) to queue 'compute.<compute_name>' from exchange 'nova'19:07
eanderssonThis is the actual issue19:07
eanderssonIt's a super odd bug because the queue is fine, and the compute can consume from it19:07
eanderssonbut the binding (that is 100% there) simply does not work19:07
*** maciejjozefczyk has joined #openstack-nova19:08
eanderssonif confirm-deliveries isn't enabled, which might be the problem, amqp isn't going to raise an error19:09
eanderssonmandatory flag isn't enough afaik19:09
eanderssonmandatory + confirm_deliveries is what I used19:10
eanderssonhttp://paste.openstack.org/show/784547/19:12
openstackgerritMerged openstack/nova master: Add functional recreate test for bug 1848343  https://review.opendev.org/68898019:13
openstackbug 1848343 in OpenStack Compute (nova) "Reverting migration-based allocations leaks allocations if the server is deleted" [Medium,In progress] https://launchpad.net/bugs/1848343 - Assigned to Matt Riedemann (mriedem)19:13
eandersson* http://paste.openstack.org/show/784548/19:14
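A minimal sketch of the behaviour discussed above, as a toy in-memory model rather than a real AMQP client: without the mandatory flag an unroutable publish is silently dropped, while with it the publisher gets an error it can act on. The names ToyExchange, Unroutable, bind, unbind and publish are hypothetical and exist only for illustration. A real client additionally needs delivery confirmations enabled for the failure to reach the publisher, which this model collapses into a synchronous raise.

```python
class Unroutable(Exception):
    """Raised by the toy model when a mandatory message matches no binding."""


class ToyExchange:
    """Hypothetical stand-in for a broker exchange; not a RabbitMQ API."""

    def __init__(self, name):
        self.name = name
        # routing key -> list standing in for the bound queue
        self.bindings = {}

    def bind(self, routing_key):
        """Create (or return) the queue bound under routing_key."""
        return self.bindings.setdefault(routing_key, [])

    def unbind(self, routing_key):
        """Simulate a lost queue or broken binding."""
        self.bindings.pop(routing_key, None)

    def publish(self, routing_key, body, mandatory=False):
        """Route body to the bound queue; return True if it was delivered."""
        queue = self.bindings.get(routing_key)
        if queue is None:
            if mandatory:
                # Mirrors the NO_ROUTE error quoted in the log above.
                raise Unroutable(
                    f"NO_ROUTE (312) to queue '{routing_key}' "
                    f"from exchange '{self.name}'"
                )
            return False  # dropped without any signal to the sender
        queue.append(body)
        return True


exchange = ToyExchange("nova")
exchange.bind("compute.host1")
exchange.unbind("compute.host1")  # e.g. after a network partition

# Without mandatory the sender never learns the message was lost.
delivered = exchange.publish("compute.host1", "build_instance")

# With mandatory the sender can surface the failure instead of hanging.
try:
    exchange.publish("compute.host1", "build_instance", mandatory=True)
except Unroutable as exc:
    error = str(exc)
```

In this model the sender only learns about the dead route in the second call, which is the point of passing the flag from nova: a stuck cast becomes a visible error that can be logged or retried.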
openstackgerritMerged openstack/nova master: Add live migration recreate test for bug 1848343  https://review.opendev.org/68899419:14
openstackgerritMerged openstack/nova master: Add compute side revert allocation test for bug 1848343  https://review.opendev.org/68901319:17
openstackbug 1848343 in OpenStack Compute (nova) "Reverting migration-based allocations leaks allocations if the server is deleted" [Medium,In progress] https://launchpad.net/bugs/1848343 - Assigned to Matt Riedemann (mriedem)19:17
*** maciejjozefczyk has quit IRC19:24
*** tbachman has joined #openstack-nova19:29
*** tbachman has quit IRC19:37
openstackgerritMatt Riedemann proposed openstack/nova master: Delete source allocations in move_allocations if target no longer exists  https://review.opendev.org/68904919:44
*** awalende has joined #openstack-nova19:46
*** ralonsoh has quit IRC19:47
*** pcaruana has quit IRC19:49
*** maciejjozefczyk has joined #openstack-nova19:49
*** awalende has quit IRC19:51
*** mlavalle has quit IRC19:55
*** mlavalle has joined #openstack-nova19:55
*** mgariepy has quit IRC19:56
*** maciejjozefczyk has quit IRC19:58
*** bnemec has quit IRC20:03
*** maciejjozefczyk has joined #openstack-nova20:04
*** bnemec has joined #openstack-nova20:05
mriedemeasy couple of patches to fix an upgrade issue since stein https://review.opendev.org/#/q/topic:bug/1824435+(status:open+OR+status:merged)20:06
*** ricolin_ has joined #openstack-nova20:06
*** ricolin has quit IRC20:09
* efried chauffeurs *again*20:19
*** efried is now known as efried_afk20:19
*** maciejjozefczyk has quit IRC20:19
*** jangutter has quit IRC20:23
*** igordc has quit IRC20:26
*** nweinber has quit IRC20:30
*** gbarros has quit IRC20:37
*** maciejjozefczyk has joined #openstack-nova20:41
*** mriedem has quit IRC20:49
*** eharney has quit IRC20:49
eanderssonmnaser, https://github.com/rabbitmq/rabbitmq-server/pull/1884#issuecomment-46427781020:51
eanderssonI think this is the issue20:51
*** dpawlik has joined #openstack-nova20:51
*** spatel has quit IRC20:52
eanderssonhttps://github.com/rabbitmq/rabbitmq-server/pull/187920:52
*** dpawlik has quit IRC20:56
*** mlavalle has quit IRC20:56
*** mlavalle has joined #openstack-nova20:56
*** igordc has joined #openstack-nova20:58
*** trident has quit IRC20:58
*** _mlavalle_1 has joined #openstack-nova20:59
*** _mlavalle_1 has quit IRC20:59
melwittif that's the case, then all that should be needed is to upgrade rabbit to 3.7 and no need to consume the mandatory flag option in nova, iiuc20:59
melwittoh wait, according to the backscroll, need both the flag and the fix in 3.7 then sounds like21:02
*** mlavalle has quit IRC21:02
melwitteandersson: can you correct me pls? ^21:03
*** igordc has quit IRC21:03
melwittjust trying to understand whether we need to do anything in nova21:03
*** trident has joined #openstack-nova21:04
*** panda has quit IRC21:08
*** igordc has joined #openstack-nova21:09
*** lpetrut has joined #openstack-nova21:11
*** panda has joined #openstack-nova21:11
*** gbarros has joined #openstack-nova21:11
*** gbarros has quit IRC21:13
*** lbragstad has quit IRC21:14
*** lpetrut has quit IRC21:17
*** gyee has quit IRC21:20
*** lbragstad has joined #openstack-nova21:24
*** mlavalle has joined #openstack-nova21:30
*** tbachman has joined #openstack-nova21:39
*** rcernin has joined #openstack-nova21:39
*** markvoelker has quit IRC21:39
*** maciejjozefczyk has quit IRC21:42
openstackgerritMatthew Booth proposed openstack/nova master: Add new test base for libvirt functional tests  https://review.opendev.org/68918621:46
openstackgerritMatthew Booth proposed openstack/nova master: Unplug VIFs as part of cleanup of networks  https://review.opendev.org/66338221:46
openstackgerritMatthew Booth proposed openstack/nova master: Fix incorrect instance state after build failure  https://review.opendev.org/68927821:46
*** jmlowe has joined #openstack-nova21:46
eanderssonmelwitt, so the queue can still get stuck in a bad state. It happens.21:52
eanderssonSo having that extra layer of protection is still good.21:52
eandersson(e.g. network partitions etc)21:53
melwitteandersson: sorry, what I mean is, what do we need to do in nova, if anything? pass the mandatory flag? pass the mandatory flag and the confirm_deliveries flag? or is nothing needed?21:53
*** rcernin has quit IRC21:53
eanderssonI think this probably has to be done on the oslo.messaging side21:53
melwittI wasn't sure whether your findings meant a change in the plan for nova21:54
eanderssonbut I still feel like a vm shouldn't be able to stay in a state like that indefinitely21:54
eanderssonmaybe that could be changed to an rpc call instead?21:55
melwittno, it shouldn't. sorry, somehow what I'm saying is not coming across. you posted links about rabbit bugs earlier, so I wasn't sure if you were saying it's a rabbit bug and there's nothing we can do to help in nova21:55
*** slaweq has quit IRC21:56
melwittthis is the change they made in oslo.messaging to allow people (like nova) to pass flags https://review.opendev.org/66037321:57
eanderssonYea - if that is exposed I would add mandatory to these calls in nova.21:58
melwittok, cool21:59
eanderssonIt will help protect against a bunch of potential RabbitMQ issues22:02
melwittthanks for clarifying that. I'll find how to add the flag on the nova side22:03
*** rcernin has joined #openstack-nova22:09
*** ricolin_ has quit IRC22:10
*** mmethot_ has quit IRC22:13
*** tbachman has quit IRC22:13
*** tbachman has joined #openstack-nova22:15
*** spatel has joined #openstack-nova22:24
eanderssonmelwitt, some information on mandatory here https://www.rabbitmq.com/reliability.html#routing22:27
*** rcernin has quit IRC22:27
*** rcernin has joined #openstack-nova22:27
*** spatel has quit IRC22:29
*** adriant has quit IRC22:46
melwitteandersson: thx22:50
*** dpawlik has joined #openstack-nova22:52
*** dpawlik has quit IRC22:56
*** TxGirlGeek has quit IRC23:00
*** xek__ has quit IRC23:01
*** TxGirlGeek has joined #openstack-nova23:02
*** tkajinam has joined #openstack-nova23:11
openstackgerritMerged openstack/nova stable/stein: Stop sending bad values from libosinfo to libvirt  https://review.opendev.org/68806723:17
openstackgerritMerged openstack/nova stable/queens: Add functional recreate test for regression bug 1825537  https://review.opendev.org/67535523:17
openstackbug 1825537 in OpenStack Compute (nova) queens "finish_resize failures incorrectly revert allocations" [Medium,In progress] https://launchpad.net/bugs/1825537 - Assigned to Matt Riedemann (mriedem)23:17
*** mlavalle has quit IRC23:27
*** TxGirlGeek has quit IRC23:36
*** markvoelker has joined #openstack-nova23:40
*** markvoelker has quit IRC23:45
*** nweinber has joined #openstack-nova23:58