Wednesday, 2020-08-05

*** tetsuro has joined #openstack-nova00:07
openstackgerritsean mooney proposed openstack/os-vif master: update tox envs and support pdf docs  https://review.opendev.org/72803700:13
openstackgerritsean mooney proposed openstack/os-vif master: support pyroute2 0.5.13  https://review.opendev.org/74480900:13
*** _erlon_ has quit IRC00:14
openstackgerritsean mooney proposed openstack/os-vif master: support pyroute2 0.5.13  https://review.opendev.org/74480900:15
openstackgerritsean mooney proposed openstack/os-vif master: update tox envs and support pdf docs  https://review.opendev.org/72803700:15
openstackgerritsean mooney proposed openstack/os-vif master: [goal] migrate testing to ubuntu focal  https://review.opendev.org/73813000:17
*** gyee has quit IRC00:30
*** songwenping__ has joined #openstack-nova00:34
*** songwenping_ has quit IRC00:37
openstackgerritsean mooney proposed openstack/os-vif master: deprecate ovs-vsctl driver and make native the default  https://review.opendev.org/74481600:49
*** tetsuro_ has joined #openstack-nova00:54
*** tetsuro_ has quit IRC00:56
*** tetsuro_ has joined #openstack-nova00:56
*** mgariepy has quit IRC00:57
*** tetsuro has quit IRC00:58
*** tetsuro_ has quit IRC00:59
*** tetsuro has joined #openstack-nova00:59
*** mgariepy has joined #openstack-nova01:12
*** xiaolin has joined #openstack-nova01:23
*** mgariepy has quit IRC01:32
openstackgerritBrin Zhang proposed openstack/nova master: Fix CI failture: Introduce decorator>=3.4.0 to test-requirements.txt  https://review.opendev.org/74438501:32
openstackgerritBrin Zhang proposed openstack/nova master: Fix issue with decorator in lower-constraints job  https://review.opendev.org/74438501:33
*** mgariepy has joined #openstack-nova01:33
*** dave-mccowan has joined #openstack-nova01:45
*** tetsuro_ has joined #openstack-nova02:11
*** dave-mccowan has quit IRC02:13
*** tetsuro has quit IRC02:15
*** aarents has quit IRC02:22
*** aarents has joined #openstack-nova02:23
*** songwenping_ has joined #openstack-nova02:25
openstackgerritsean mooney proposed openstack/os-vif master: deprecate ovs-vsctl driver and make native the default  https://review.opendev.org/74481602:27
*** songwenping__ has quit IRC02:28
*** mtreinish has joined #openstack-nova02:39
*** brinzhang has joined #openstack-nova02:54
openstackgerritBrin Zhang proposed openstack/nova master: Fix issue with decorator in lower-constraints job  https://review.opendev.org/74438502:56
*** songwenping__ has joined #openstack-nova02:59
*** Yumeng has joined #openstack-nova03:00
*** songwenping_ has quit IRC03:03
*** JamesBen_ has quit IRC03:03
*** rcernin has joined #openstack-nova03:07
*** markvoelker has joined #openstack-nova03:10
*** tetsuro_ has quit IRC03:12
*** markvoelker has quit IRC03:15
*** mkrai has joined #openstack-nova03:32
*** psachin has joined #openstack-nova03:33
*** JamesBenson has joined #openstack-nova03:38
*** markvoelker has joined #openstack-nova03:46
*** markvoelker has quit IRC03:51
*** tkajinam has quit IRC03:51
*** tkajinam has joined #openstack-nova03:52
*** markvoelker has joined #openstack-nova03:53
*** markvoelker has quit IRC04:05
*** vishalmanchanda has joined #openstack-nova04:15
*** ociuhandu has joined #openstack-nova04:18
*** sapd1_x has joined #openstack-nova04:22
*** ociuhandu has quit IRC04:23
*** songwenping_ has joined #openstack-nova04:24
*** brinzhang_ has joined #openstack-nova04:24
*** brinzhang has quit IRC04:27
*** songwenping__ has quit IRC04:27
*** udesale has joined #openstack-nova04:33
*** evrardjp has quit IRC04:33
*** evrardjp has joined #openstack-nova04:33
*** brinzhang has joined #openstack-nova04:46
*** swp20 has joined #openstack-nova04:46
*** brinzhang_ has quit IRC04:48
*** songwenping_ has quit IRC04:49
*** ratailor has joined #openstack-nova04:56
*** ratailor has quit IRC04:59
*** ratailor has joined #openstack-nova04:59
*** JamesBenson has quit IRC05:01
*** sapd1_x has quit IRC05:02
*** lbragstad_ has joined #openstack-nova05:04
*** lbragstad has quit IRC05:07
*** songwenping_ has joined #openstack-nova05:35
*** swp20 has quit IRC05:39
*** udesale_ has joined #openstack-nova05:45
*** udesale has quit IRC05:47
openstackgerritMerged openstack/nova master: Fix lower-constraints conflicts  https://review.opendev.org/74450605:56
*** gokhani has joined #openstack-nova05:58
*** Yumeng has quit IRC06:00
*** yaawang has quit IRC06:02
*** yaawang has joined #openstack-nova06:02
*** yaawang has quit IRC06:20
*** yaawang has joined #openstack-nova06:21
*** udesale_ has quit IRC06:24
*** maciejjozefczyk has joined #openstack-nova06:28
*** dklyle has quit IRC06:31
*** mlycka has joined #openstack-nova06:43
*** links has joined #openstack-nova06:48
*** udesale has joined #openstack-nova06:52
*** lchabert has quit IRC06:53
*** mvorwerk has joined #openstack-nova06:54
*** yaawang has quit IRC06:57
*** yaawang has joined #openstack-nova06:58
*** rcernin has quit IRC06:58
*** rcernin_ has joined #openstack-nova06:59
*** mkrai has quit IRC07:00
*** mvorwerk has quit IRC07:00
*** mvorwerk has joined #openstack-nova07:00
*** nightmare_unreal has joined #openstack-nova07:00
*** tesseract has joined #openstack-nova07:00
*** rcernin_ has quit IRC07:05
*** rcernin has joined #openstack-nova07:06
*** slaweq has joined #openstack-nova07:07
gokhanihi team, I have an OpenStack environment on the Pike release. We have been using it for 2 years. Our environment consists of 3 controllers and 23 compute hosts, but for the past week we have been getting warnings like "Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 24.01 sec" on the compute nodes. What could be the reason for this? Do I need to look at the mariadb galera cluster or the rabbitmq cluster? Logs are at http://paste.openstack.org/show/796586/07:12
gokhani* http://paste.openstack.org/show/796586/07:28
*** dougsz has joined #openstack-nova07:32
*** tosky has joined #openstack-nova07:38
*** mkrai has joined #openstack-nova07:41
*** songwenping__ has joined #openstack-nova07:57
*** songwenping_ has quit IRC07:59
*** dtantsur|afk is now known as dtantsur08:04
*** xek has joined #openstack-nova08:21
*** tosky has quit IRC08:26
*** tosky has joined #openstack-nova08:27
*** k_mouza has joined #openstack-nova08:29
*** ralonsoh has joined #openstack-nova08:30
*** bjolo has joined #openstack-nova08:35
*** songwenping_ has joined #openstack-nova08:36
*** songwenping_ has quit IRC08:37
*** songwenping_ has joined #openstack-nova08:37
*** songwenping__ has quit IRC08:39
*** songwenping_ has quit IRC08:41
*** songwenping_ has joined #openstack-nova08:42
*** derekh has joined #openstack-nova08:51
*** tetsuro has joined #openstack-nova08:54
*** rcernin has quit IRC09:04
*** rcernin has joined #openstack-nova09:04
*** rcernin has quit IRC09:05
*** rcernin has joined #openstack-nova09:05
*** jangutter has quit IRC09:11
*** tetsuro has quit IRC09:11
*** jangutter has joined #openstack-nova09:11
*** ociuhandu has joined #openstack-nova09:23
*** sapd1_x has joined #openstack-nova09:30
*** yaawang has quit IRC09:35
*** yaawang has joined #openstack-nova09:36
*** mlycka has quit IRC09:42
*** mlycka has joined #openstack-nova09:44
*** jangutter has quit IRC09:46
*** jangutter has joined #openstack-nova09:46
openstackgerritStephen Finucane proposed openstack/nova master: fakelibvirt: Remove nova-network remnants  https://review.opendev.org/73732909:49
openstackgerritStephen Finucane proposed openstack/nova master: network: Add type hints  https://review.opendev.org/74486909:49
openstackgerritStephen Finucane proposed openstack/nova master: network: Drop 'vpn' parameter from 'allocate_for_instance'  https://review.opendev.org/74487009:49
openstackgerritStephen Finucane proposed openstack/nova master: network: Remove unused 'affect_auto_assigned' parameter  https://review.opendev.org/74487109:49
openstackgerritStephen Finucane proposed openstack/nova master: network: Remove 'kwargs' from 'get_instance_nw_info'  https://review.opendev.org/74487209:49
*** ralonsoh has quit IRC09:50
openstackgerritStephen Finucane proposed openstack/nova master: tests: Add helpers for suspend, resume and reboot of server  https://review.opendev.org/74128509:54
openstackgerritStephen Finucane proposed openstack/nova master: libvirt: Pass context, instance to '_create_guest'  https://review.opendev.org/74128609:54
openstackgerritStephen Finucane proposed openstack/nova master: api: Reject non-spawn operations for vTPM  https://review.opendev.org/74150009:54
openstackgerritStephen Finucane proposed openstack/nova master: libvirt: Add emulated TPM support to Nova  https://review.opendev.org/63136309:54
openstackgerritStephen Finucane proposed openstack/nova master: docs: Add docs for vTPM support  https://review.opendev.org/73921309:54
openstackgerritStephen Finucane proposed openstack/nova master: Add type hints to 'nova.compute.manager'  https://review.opendev.org/74286309:54
openstackgerritStephen Finucane proposed openstack/nova master: Don't unset Instance.old_flavor, new_flavor until necessary  https://review.opendev.org/74199509:54
openstackgerritStephen Finucane proposed openstack/nova master: privsep: Add support for recursive chown, move_tree operations  https://review.opendev.org/74286409:54
openstackgerritStephen Finucane proposed openstack/nova master: Add type hints to 'nova.virt.libvirt.utils'  https://review.opendev.org/74286509:54
openstackgerritStephen Finucane proposed openstack/nova master: Add support for resize and cold migration of emulated TPM files  https://review.opendev.org/63993409:54
lyarwoodRIP zuul09:57
kashyapstephenfin: In the default Nova configuration, VMs are allowed to overcommit and float freely across host CPUs -- still true?10:11
stephenfinlyarwood: blame artom and his damned reviews10:11
stephenfinkashyap: yes10:12
stephenfinonly caveat is they can't overcommit against themselves10:12
stephenfincan't boot an 8 core instance on a 6 core host10:12
kashyapstephenfin: Ah, okay; I was about to ask for an example :)10:12
stephenfin(or host with only 6 cores enabled in nova, via 'vcpu_pin_set' (legacy) or combo of 'cpu_dedicated_set' and 'cpu_shared_set')10:13
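The rule stephenfin describes (instances can overcommit against each other, but not against themselves) can be sketched as below. This is a minimal illustration, not Nova's scheduler code: the function name `can_boot`, its parameters, and the allocation ratio value are assumptions made for the example.

```python
def can_boot(instance_vcpus, usable_host_cpus, used_vcpus=0, allocation_ratio=16.0):
    """Return True if an instance with floating (unpinned) vCPUs may land on the host.

    usable_host_cpus is the number of host cores enabled for guests, e.g. the
    cores left after 'vcpu_pin_set' or 'cpu_shared_set' restrictions.
    allocation_ratio is an illustrative overcommit factor (assumed value).
    """
    # An instance cannot overcommit against itself: each of its vCPUs needs
    # a distinct host core to run on at the same time.
    if instance_vcpus > usable_host_cpus:
        return False
    # Different instances may share cores, up to the overcommit capacity.
    capacity = usable_host_cpus * allocation_ratio
    return used_vcpus + instance_vcpus <= capacity


# The example from the discussion: an 8 core instance on a 6 core host.
print(can_boot(8, 6))  # False
print(can_boot(6, 6))  # True
```

The first check is the caveat from the chat; the second is the ordinary overcommit limit that lets many smaller instances share the same cores.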
*** tkajinam has quit IRC10:15
kashyapThanks10:15
openstackgerritTony Su proposed openstack/nova master: Provider Config File: Functions to merge provider configs to provider tree  https://review.opendev.org/67652210:18
openstackgerritTony Su proposed openstack/nova master: Provider Config File: Enable loading and merging of provider configs  https://review.opendev.org/69346010:18
openstackgerritLee Yarwood proposed openstack/nova master: zuul: Start to migrate nova-live-migration to zuulv3  https://review.opendev.org/71160410:19
tony_sustephenfin: gibi: the last 3 provider-config-file patches are ready to go. change log: 1) all comments reflected in code 2) added three more test cases to cover negative code logic 3) unified coding style.10:21
*** rcernin has quit IRC10:23
*** xek has quit IRC10:24
gibitony_su: thanks. I added that to the queue for runway slot10:27
tony_sugibi: thanks gibi. I will continue working on these patches in the next two weeks...10:31
*** mvorwerk has quit IRC10:31
*** mvorwerk has joined #openstack-nova10:32
*** jangutter_ has joined #openstack-nova10:32
*** songwenping_ has quit IRC10:35
*** jangutter has quit IRC10:35
*** hemna has quit IRC10:38
*** hemna has joined #openstack-nova10:38
*** mvorwerk_ has joined #openstack-nova10:42
openstackgerritLee Yarwood proposed openstack/nova master: WIP zuul: nova-multinode-evacuate  https://review.opendev.org/74488310:43
lyarwoodhmm I thought the grenade issues had been resolved?10:44
*** mvorwerk has quit IRC10:44
*** ociuhandu has quit IRC10:54
*** ociuhandu has joined #openstack-nova10:55
*** ociuhandu has quit IRC11:00
brinzhangstephenfin: hi, do I need to fix the points from your comments in https://review.opendev.org/#/c/742863/2/nova/compute/manager.py@261511:01
stephenfinbrinzhang: Not at all. I was saying that your fix was correct and mine was not :)11:02
brinzhangchange the LOG.exception(exc.format_message()) to LOG.exception(exc), and use Exception instead of exception.NovaException11:02
brinzhangand I will report a bug for this change11:02
brinzhangstephenfin: oh sorry, your patch is working now, I will just fix mine ^11:04
openstackgerritBrin Zhang proposed openstack/nova master: [Trivial] Remove wrong format_message() conversion  https://review.opendev.org/74428011:07
*** ralonsoh has joined #openstack-nova11:08
brinzhangstephenfin: updated, thanks11:09
*** sapd1_x has quit IRC11:09
brinzhangstephenfin, sean-k-mooney, gibi: the cyborg evacuate support patch was updated, pls review again while you have time, thanks.11:10
brinzhanghttps://review.opendev.org/#/c/715326/11:10
brinzhangand I rebased it on the optimize patch, pls see https://review.opendev.org/#/c/726564/611:10
*** jangutter has joined #openstack-nova11:12
*** jangutter_ has quit IRC11:15
*** markvoelker has joined #openstack-nova11:16
*** raildo has joined #openstack-nova11:21
*** markvoelker has quit IRC11:25
artomstephenfin, sorry :( it was a drive-by thing as well, I didn't actually look at anything else11:27
artomHence no vote11:28
lyarwoodartom: how dare you provide useful reviews11:30
lyarwoodI mean really11:30
lyarwoodyou're making the rest of us look bad ^_^11:31
lyarwoodor maybe I'm doing that on my own11:31
artomlyarwood, "useful" would be a stretch :P11:31
gibisean-k-mooney: thanks for the repeated test of the sriov attach. I responded in the review. Unfortunately I cannot reproduce your failures in my env.11:33
artom(Yes, I'm knowingly leaving your last sentence unanswered >;)11:33
gibibrinzhang: ack11:33
*** ociuhandu has joined #openstack-nova11:36
openstackgerritMerged openstack/python-novaclient master: Remove unused code  https://review.opendev.org/74413611:38
*** xek has joined #openstack-nova11:40
*** JamesBenson has joined #openstack-nova11:40
sean-k-mooneygibi: so in your env removal of direct physical or macvtap devices worked?11:41
gibisean-k-mooney: yes11:41
gibithe only thing that I can see is the remaining MAC on the VF after macvtap removal11:41
gibithe rest of your failures do not appear for me11:41
sean-k-mooneyi used a single vm to do all the testing11:42
*** ociuhandu has quit IRC11:42
sean-k-mooneyi'll try it again with multiple vms and see if it makes any difference11:42
*** ociuhandu has joined #openstack-nova11:42
sean-k-mooneywhat os and libvirt version are you using?11:42
gibiubuntu 18.04, libvirt 6.0.0 qemu 4.211:43
sean-k-mooneythe libvirt behavior may have changed? i was using centos 8.1 maybe 8.2 i might be using older libvirt11:43
sean-k-mooneyi need to boot up the server to check11:43
gibithe silent failure of macvtap and direct physical removal feels like a problem with finding the device that needs to be removed. I can try to add extra LOGs around that logic to trace the matching in your env11:44
gibitwo weeks ago I upgraded the libvirt from 4.0.0 to 6.0.0 and qemu from 2.11 to 4.2 due to a different failure in the "simple" direct case11:45
sean-k-mooneyif we need libvirt 6.0.0 we could note that as a min version i guess.11:46
sean-k-mooneybut yes11:46
sean-k-mooneyboth feel like we just did not find it in the xml and remove it11:46
*** brinzhang_ has joined #openstack-nova11:47
gibiregarding the leaking MAC address after macvtap removal, who should do the removal of the MAC from the VF? is it libvirt?11:47
sean-k-mooneygibi: libvirt should, although we also have code in nova to clear it for old libvirts11:49
sean-k-mooneyits also not resetting the programmed vlan on the vf11:49
sean-k-mooneybut that is likely the same issue11:49
*** brinzhang has quit IRC11:49
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L774-L78411:50
sean-k-mooneywe seem not to be calling unplug11:50
sean-k-mooneyin the direct case it probably is not clearing the trusted vf status11:50
sean-k-mooneyi didnt actually test that11:51
sean-k-mooneywe should be calling https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L833 as part of detach11:51
gibisean-k-mooney: thanks I will trace this missing unplug in my env11:52
sean-k-mooneywe are apparently https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L220111:52
gibiI even have a log in the dmesg about the MAC removal during macvtap11:53
gibi[1911821.990047] ixgbe 0000:81:00.0: removing MAC on VF 211:53
gibibut right after it11:53
gibi[1911822.241163] ixgbe 0000:81:00.0: Setting VLAN 100, QOS 0x0 on VF 211:53
gibiit seems the vlan is set back to it11:53
sean-k-mooneyoh god damit....11:53
sean-k-mooneythe neutron sriov nic agent is probably racing with unplug11:54
sean-k-mooneyalthough i guess it could be libvirt?11:54
*** k_mouza has quit IRC11:54
sean-k-mooneywe might want to do unplug after detach?11:55
gibigood points11:55
gibiI will gather logs from libvirt and the neutron agent to see if this is a race11:55
*** xek has quit IRC11:56
*** rcernin has joined #openstack-nova11:56
sean-k-mooney[centos@sriov-1 ~]$ libvirtd --version11:57
sean-k-mooneylibvirtd (libvirt) 6.0.011:57
sean-k-mooneyso same libvirt version11:57
sean-k-mooneyim using different nics than you: you have niantic 10G nics and im using 1G e1000 nics, but that should not matter11:57
*** mkrai has quit IRC11:58
sean-k-mooney[centos@sriov-1 ~]$ /usr/libexec/qemu-kvm --version11:58
sean-k-mooneyQEMU emulator version 4.2.0 (qemu-kvm-4.2.0-19.el8)11:58
sean-k-mooneyi think that is the same qemu too?11:58
sean-k-mooneyyep11:58
sean-k-mooneyso ya likely not related to the versions11:58
gibicool, one set of possible differences is ruled out11:59
* gibi trying to find the racing component12:01
*** rcernin has quit IRC12:02
sean-k-mooneyits probably libvirt12:09
stephenfingibi, lyarwood: Could you folks stick https://review.opendev.org/#/c/744021/ on your respective review queues, please? Feel free to chuck something my way too12:10
lyarwoodstephenfin: I was looking at that yesterday12:12
lyarwoodstephenfin: I *think* I get it, I just wanted to grep around a little more before voting12:12
stephenfinnw. Lots of context needed for it, unfortunately12:13
*** gokhani has left #openstack-nova12:14
lyarwoodyup indeed, func tests helped however so thanks for that at least12:14
*** rcernin has joined #openstack-nova12:17
*** alex_xu has joined #openstack-nova12:20
*** brinzhang0 has joined #openstack-nova12:21
*** rcernin has quit IRC12:22
openstackgerritStephen Finucane proposed openstack/nova master: libvirt: Drop support for Xen  https://review.opendev.org/74323112:24
openstackgerritStephen Finucane proposed openstack/nova master: libvirt: Remove 'hypervisor_version' from 'libvirt_info'  https://review.opendev.org/74419912:24
*** brinzhang_ has quit IRC12:24
*** derekh has quit IRC12:24
*** martinkennelly has joined #openstack-nova12:28
gibisean-k-mooney: yeah, it is a race between libvirt detaching the device and nova unplugging the vif (and resetting the MAC). If I move the unplug after the detach in the nova code then the VF MAC and VLAN are reset properly after the macvtap port is detached12:35
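The ordering gibi describes can be sketched as follows. This is only an illustration of "detach first, unplug second"; `detach_interface`, `detach_device`, and `unplug_vif` are hypothetical names passed in as callables, not the functions in Nova's libvirt driver.

```python
def detach_interface(vif, detach_device, unplug_vif):
    """Detach a VIF from a guest, then clean up the network backend.

    detach_device and unplug_vif are hypothetical callables supplied by the
    caller. Running unplug_vif only after detach_device has returned avoids
    the race in which the hypervisor reprograms the VF (MAC, VLAN) after the
    cleanup has already reset it.
    """
    detach_device(vif)   # wait for the hypervisor to release the device
    unplug_vif(vif)      # only now reset MAC/VLAN on the VF


# Usage: record the call order with simple stand-in callables.
calls = []
detach_interface(
    "vf2",
    detach_device=lambda vif: calls.append(("detach", vif)),
    unplug_vif=lambda vif: calls.append(("unplug", vif)),
)
print(calls)  # [('detach', 'vf2'), ('unplug', 'vf2')]
```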
sean-k-mooneyi have been wondering if we should do unplug twice12:36
sean-k-mooneywe generally want to disconnect the device from the network backend before removing it from the vm12:36
sean-k-mooneybut just doing it once might be fine too12:36
lyarwoodstephenfin: LGTM btw12:36
sean-k-mooneyat the end12:36
sean-k-mooneygibi: did you confirm it was libvirt by stopping the sriov nic agent?12:37
*** udesale has quit IRC12:37
*** udesale has joined #openstack-nova12:38
sean-k-mooneygibi: or did you manage to find a log message12:38
gibisean-k-mooney: stopping the neutron nic agent did not solve the race so I assumed it is libvirt12:42
sean-k-mooneyyep makes sense to me12:42
sean-k-mooneyi saw in the libvirtd log that it does set the mac a number of times but i dont have the devstack logs to correlate the timestamps12:43
alex_xuefried: do you know what is the usecase for this https://review.opendev.org/#/c/693414/3/specs/ussuri/approved/provider-config-file.rst@24712:43
gibisean-k-mooney: I will add a separate patch into the series that moves the unplug12:45
efriedalex_xu: Yes.12:45
efriedThe idea there was that you could define a "default" rule to apply to all your compute nodes, but then override it for specific ones.12:45
efriedUsed for ironic, but also in cases where you want to have your rules centralized and ansibled out to the hosts.12:47
alex_xuefried: I see, thanks12:47
*** nweinber has joined #openstack-nova12:48
alex_xuefried: is there any reason we should ignore the additional inventories and traits when they conflict with the virt driver managed ones, instead of erroring out on the conflict?12:48
*** JamesBenson has quit IRC12:49
efriedWe debated this at design time. I can tell you for sure the answer to your question is "yes". But I can't remember exactly why :P12:49
alex_xuefried: ok, so the intended result is to ignore, not to error out, right? I saw that the code errors out rather than ignoring.12:50
efriedOh, whatever the design says is what we decided on.12:51
*** sapd1_x has joined #openstack-nova12:51
alex_xuefried: ok, thanks :) I'm not going to dump out the history12:51
*** sapd1 has quit IRC12:51
efriedFor this issue, I'm reasonably sure whatever is in the design is going to be there because it's what we decided on, not because we accidentally missed it.12:51
alex_xuefried: ok, cool12:52
efriedhttps://specs.openstack.org/openstack/nova-specs/specs/victoria/approved/provider-config-file.html#provider-config-consumption-from-nova says "ignore" under "Provider Tree Merging".12:53
alex_xuefried: yes, that is what I read also12:55
alex_xuI can't think of a reason for the difference between ignore and error out either12:56
efriedalex_xu: here's an explanation of that other thing https://review.opendev.org/#/c/693414/3/specs/ussuri/approved/provider-config-file.rst@13712:56
alex_xunice12:58
efriedalex_xu: https://review.opendev.org/#/c/612497/12/specs/train/approved/provider-config-file.rst@20413:00
efriedI remember now:13:00
efriedThe conflicts in question should error on startup, but be ignored thereafter.13:00
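The behavior efried recalls (a conflict is an error at startup but is ignored afterwards) can be sketched like this. It is a minimal illustration under stated assumptions: `merge_inventories`, `ProviderConfigConflict`, and the plain-dict inventory shape are hypothetical and are not taken from the Nova patches under review.

```python
class ProviderConfigConflict(Exception):
    """Hypothetical error raised when a config entry clashes with the driver."""


def merge_inventories(driver_inv, config_inv, at_startup):
    """Merge extra inventories from a provider config into the driver's view.

    driver_inv: resource class -> inventory reported by the virt driver
    config_inv: resource class -> inventory declared in the config file
    at_startup: True on the first merge after the service starts
    """
    merged = dict(driver_inv)
    for resource_class, inventory in config_inv.items():
        if resource_class in driver_inv:
            if at_startup:
                # Fail fast so the operator notices the bad config.
                raise ProviderConfigConflict(resource_class)
            # At runtime, keep the driver's value and skip the config entry
            # so a late conflict cannot take down the compute service.
            continue
        merged[resource_class] = inventory
    return merged


driver = {"VCPU": {"total": 8}}
config = {"VCPU": {"total": 4}, "CUSTOM_FOO": {"total": 1}}
print(merge_inventories(driver, config, at_startup=False))
# {'VCPU': {'total': 8}, 'CUSTOM_FOO': {'total': 1}}
```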
*** rcernin has joined #openstack-nova13:00
alex_xuefried: is that for the case where a virt driver managed inventory or trait shows up later?13:03
* alex_xu is reading the comment, found something complex13:04
*** derekh has joined #openstack-nova13:09
*** priteau has joined #openstack-nova13:09
*** iurygregory has quit IRC13:12
*** rcernin has quit IRC13:15
*** xek has joined #openstack-nova13:18
*** iurygregory has joined #openstack-nova13:18
*** udesale has quit IRC13:21
*** udesale has joined #openstack-nova13:22
alex_xuefried: virt driver's update_provider_tree always overwrites the provider tree's inventory. so the conflict will be found at startup, and there won't be any conflict after startup. so the code feels right13:26
*** mlycka has quit IRC13:29
efriedYeah, I thought there was a theoretical edge case where we could come across an error at runtime. The chances were very small, but we wanted to make sure we didn't crater the driver if it did happen. I think that's what led to the design decision as it stands.13:30
*** raildo has quit IRC13:35
sean-k-mooneycan there be a conflict if i make a call to placement directly and modify the inventory13:35
sean-k-mooneyi know you are not really meant to do that but users be users and they dont always do what we tell them13:36
efriedHeh. "You just voided your warranty. You're on your own."13:36
sean-k-mooneyhave you started getting support requests from customers yet13:37
sean-k-mooneythey do that a lot...13:38
sean-k-mooneythen ask us to fix it anyway13:38
*** raildo has joined #openstack-nova13:38
efriedof course.13:39
efriedUnless you really break things, any manual change to the placement inventory ought to be scrubbed back out on the next periodic.13:40
sean-k-mooneywe had one customer that, for example, when they wanted to spawn a vm on specific cores on a host, stopped nova-compute, updated the vcpu_pin_set to only have those cores, booted the vm with --availability-zone <zone>:<host> and then complained that if someone did a concurrent operation on another vm on the host it could cause issues with pinning ...13:40
efriedYou could probably add things that neither the compute nor the config care about, and they would stick around. Probably. But that won't break the code, I don't think.13:41
sean-k-mooneyefried: ya it should heal on the next run13:41
sean-k-mooneyif you create your own RP it definitely should be ok13:41
*** ratailor has quit IRC13:42
sean-k-mooneyif you add inventories to one of nova's RPs via the api well thats not allowed so nova is free to delete it13:42
*** raildo has quit IRC13:43
sean-k-mooneyim not sure if we actually will delete the inventory but we are allowed to13:43
*** raildo has joined #openstack-nova13:43
*** raildo has quit IRC13:44
efriedIt's been a hot minute since I looked at update_from_provider_tree, but I think we would delete it, yes.13:44
*** raildo has joined #openstack-nova13:44
sean-k-mooneyalex_xu: by the way i reworked https://review.opendev.org/#/c/739131/ after your comments in version 5, im hoping stephenfin will get back to it to be the second +2 later today but just an fyi in case you want to look at it before then.13:46
*** k_mouza has joined #openstack-nova13:46
*** raildo_ has joined #openstack-nova13:46
sean-k-mooneyefried: this is basically why we are providing the provider.yaml i.e. to enable a supported way to do this so ya.13:46
sean-k-mooneyefried: hows openshift land going?13:47
*** kaisers has joined #openstack-nova13:47
efriedcorrect13:49
*** dave-mccowan has joined #openstack-nova13:49
efriedThings are going well. After a couple months of pretty serious culture shock and vertical learning curve, I'm getting my feet under me.13:49
*** raildo_ has quit IRC13:50
efriedI've written an operator (in go) and feel pretty comfortable navigating openshift/kube APIs.13:50
sean-k-mooneycool, i have read some go but havent really written any, never had the need.13:51
sean-k-mooneyi am vaguely aware of what operators do but never looked under the cover to see how they are implemented13:52
efriedWith existing tools (operator-sdk) and any kind of sweng background, they're ridiculously easy to write.13:52
sean-k-mooneyi believe there are some frameworks to implement them without go too, using declarative yaml files referencing CRDs, right13:52
sean-k-mooneyi know you can go the custom controller route too in go13:53
*** swp20 has joined #openstack-nova13:53
sean-k-mooneybut again more or less have just read the docs and then done nothing with that info13:53
sean-k-mooney/read/skimmed/13:54
efriedThe CRDs are how the operators extend the kube API. Instances of those (CRs) are how you trigger the operator to do its thing. I'm not aware of a way to "declare" an operator with just yaml.13:54
efriedAnd yes, an operator is effectively a custom controller that follows some rules.13:54
* sean-k-mooney read is too strong an implication for what i actually did13:54
*** jangutter_ has joined #openstack-nova13:55
sean-k-mooneyefried: are you involved with the current experiments to deploy openstack using operators13:56
efriedI don't get anywhere near openstack anymore.13:56
sean-k-mooneythere is an experiment to run openstack on openshift, the repo is public somewhere13:56
*** jangutter has quit IRC13:58
*** JamesBenson has joined #openstack-nova13:59
*** Liang__ has joined #openstack-nova13:59
*** Liang__ is now known as LiangFang14:00
*** dave-mccowan has quit IRC14:00
sean-k-mooneyefried: https://github.com/openstack-k8s-operators14:00
efriedcool14:01
sean-k-mooneyefried: mdbooth and others have been working on that for a while14:01
*** k_mouza has quit IRC14:02
sean-k-mooneyefried: also this is what i was thinking of https://kudo.dev/14:03
*** k_mouza has joined #openstack-nova14:04
*** ociuhandu has quit IRC14:04
*** ociuhandu has joined #openstack-nova14:05
sean-k-mooneyefried: i was considering writing an operator for deploying zuul at one point and i was looking at kudo among other things as a way to do it14:05
*** xek has quit IRC14:06
sean-k-mooneywell zuul, nodepool, zookeeper, gearman and gerrit14:06
sean-k-mooneybut i just wrote the manifests by hand instead.14:07
*** ociuhandu has quit IRC14:10
openstackgerritsean mooney proposed openstack/os-vif master: support pyroute2 0.5.13  https://review.opendev.org/74480914:16
*** links has quit IRC14:16
openstackgerritMerged openstack/nova master: Document nova in tree virt drivers  https://review.opendev.org/74006114:22
*** JamesBen_ has joined #openstack-nova14:27
*** JamesBenson has quit IRC14:30
*** redrobot has joined #openstack-nova14:32
*** brinzhang_ has joined #openstack-nova14:36
*** dklyle has joined #openstack-nova14:37
*** mlavalle has joined #openstack-nova14:37
*** brinzhang0 has quit IRC14:39
*** jangutter has joined #openstack-nova14:46
*** JamesBenson has joined #openstack-nova14:46
*** JamesBen_ has quit IRC14:48
*** jangutter_ has quit IRC14:50
*** mvorwerk_ has quit IRC14:54
*** xek has joined #openstack-nova14:54
*** mvorwerk has joined #openstack-nova14:56
*** JamesBenson has quit IRC14:56
*** psachin has quit IRC14:57
*** mvorwerk has quit IRC15:02
*** LiangFang has quit IRC15:13
*** mkrai has joined #openstack-nova15:15
openstackgerritLee Yarwood proposed openstack/nova master: WIP zuul: nova-multinode-evacuate  https://review.opendev.org/74488315:17
*** JamesBenson has joined #openstack-nova15:18
*** tosky has quit IRC15:31
*** lbragstad_ is now known as lbragstad15:42
*** slaweq_ has joined #openstack-nova15:45
*** slaweq has quit IRC15:45
*** slaweq_ is now known as slaweq15:45
openstackgerritBalazs Gibizer proposed openstack/nova master: Move equality check into LibvirtConfigGuestInterface  https://review.opendev.org/74452415:53
openstackgerritBalazs Gibizer proposed openstack/nova master: Remove unused vpn param from allocate_for_instance  https://review.opendev.org/74493315:53
openstackgerritBalazs Gibizer proposed openstack/nova master: [WIP] Support SRIOV interface attach and detach  https://review.opendev.org/74099515:54
openstackgerritBalazs Gibizer proposed openstack/nova master: Only unplug vif after the device is detached from libvirt  https://review.opendev.org/74493415:54
*** dave-mccowan has joined #openstack-nova15:54
*** xek_ has joined #openstack-nova15:57
openstackgerritBalazs Gibizer proposed openstack/nova master: DNM: add logging for device matching  https://review.opendev.org/74493615:57
*** maciejjozefczyk has quit IRC15:59
*** xek has quit IRC16:00
openstackgerritHarshavardhan Metla proposed openstack/nova stable/rocky: [stable-only] Moved the quoted section  https://review.opendev.org/74468116:04
melwittlyarwood: I just barely started looking at ceph job failures on stable/queens, saw this error message "libvirtError: internal error: unable to execute QEMU command 'device_add': Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk1'"  https://zuul.opendev.org/t/openstack/build/fa85e5eeb1f84389b5dd259c8a829552/log/controller/logs/screen-n-cpu.txt#3282416:05
melwittdoes that ring any bells to you?16:06
*** hamalq has joined #openstack-nova16:08
melwitthm nvm maybe https://ask.openstack.org/en/question/95328/ceph-cinder-attach-volume-to-running-instance16:09
*** hamalq has quit IRC16:10
*** JamesBenson has quit IRC16:10
*** hamalq has joined #openstack-nova16:11
*** JamesBenson has joined #openstack-nova16:15
openstackgerritHarshavardhan Metla proposed openstack/nova stable/rocky: "[stable-only]" Removed the quoted section  https://review.opendev.org/74468116:21
*** mkrai has quit IRC16:24
stephenfinmelwitt, bauzas: not sure if you'll know this off the top of your head, but should we be releasing resources before we confirm a resize?16:26
*** dave-mccowan has quit IRC16:26
*** markvoelker has joined #openstack-nova16:27
stephenfinBased on https://review.opendev.org/#/c/641806/21, it seems we do (via the resource tracker's periodic task run)16:27
melwittstephenfin: releasing as in deleting allocations on the source before confirming? AFAIK no, supposed to hold allocations on the source and dest until confirm/revert16:28
*** tosky has joined #openstack-nova16:29
*** dougsz has quit IRC16:30
melwittI think that patch's commit message is saying prior to that patch, we were reporting resources held on the source *after* a confirm resize (until the next periodic run) whereas after a confirm, the resources should be shown as released on the source. but only after the confirm, not before16:31
*** udesale has quit IRC16:32
stephenfinah, that makes sense16:32
stephenfinso now we're freeing those resources immediately16:32
melwittimmediately upon a confirm, yeah16:33
melwittor rather reporting them immediately after a confirm. the claim has always been dropped at the time of confirm, but apparently the reporting was delayed potentially by one periodic task run16:34
stephenfinokay, cool. There's a bug in it that I'm working on (https://bugs.launchpad.net/nova/+bug/1879878, fwiw); tl;dr: if the periodic runs between the API request and the 'confirm_resize' call in the RT, the ComputeNode.numa_topology gets out of whack16:34
openstackLaunchpad bug 1879878 in OpenStack Compute (nova) "VM become Error after confirming resize with Error info CPUUnpinningInvalid on source node " [Medium,Confirmed] - Assigned to Stephen Finucane (stephenfinucane)16:34
melwittah, gotcha16:34
melwittoh interesting, so if the periodic task fires while confirm resize is still processing, it turns the instance to ERROR16:36
openstackgerritBalazs Gibizer proposed openstack/nova master: Move equality check into LibvirtConfigGuestInterface  https://review.opendev.org/74452416:37
stephenfinyeah, the '_update_available_resource' function of the RT seems to regenerate ComputeNode objects from scratch, including the embedded numa_topology16:37
*** brinzhang0 has joined #openstack-nova16:37
stephenfinand it doesn't seem to be accounting for the still-present allocations on the host once the instance has been confirmed16:37
openstackgerritBalazs Gibizer proposed openstack/nova master: Only unplug vif after the device is detached from libvirt  https://review.opendev.org/74493416:38
*** dtantsur is now known as dtantsur|afk16:39
openstackgerritBalazs Gibizer proposed openstack/nova master: [WIP] Support SRIOV interface attach and detach  https://review.opendev.org/74099516:39
openstackgerritBalazs Gibizer proposed openstack/nova master: DNM: add logging for device matching  https://review.opendev.org/74493616:39
*** brinzhang_ has quit IRC16:40
melwittstephenfin: note that sean-k-mooney's comments are correct though, that if this is in a version older than claims in placement, races are expected (especially since we don't have numa in placement). in your repro, is that with cpu_dedicated_set and on the master branch?16:42
*** maciejjozefczyk has joined #openstack-nova16:42
gibisean-k-mooney: added a patch on top of the sriov_attach series with extra logging that can help you figure out why the macvtap and direct-physical devices are not detached for you16:42
gibisean-k-mooney: https://review.opendev.org/74493616:42
gibialso added a patch to move the vif unplug after libvirt detach16:42
openstackgerritStephen Finucane proposed openstack/nova master: tests: Add reproducer for bug #1879878  https://review.opendev.org/74495016:43
openstackbug 1879878 in OpenStack Compute (nova) "VM become Error after confirming resize with Error info CPUUnpinningInvalid on source node " [Medium,Confirmed] https://launchpad.net/bugs/1879878 - Assigned to Stephen Finucane (stephenfinucane)16:43
openstackgerritStephen Finucane proposed openstack/nova master: TODO  https://review.opendev.org/74495116:43
stephenfinmelwitt: ^16:43
stephenfinI can reproduce trivially with new-style configuration16:43
melwittwhile using cpu_dedicated_set? /me looks16:43
stephenfinyup16:43
melwittok I see16:43
stephenfinI think I'm close to a fix at least. Worst case scenario, we put mriedem's stuff inside a conditional to only run if we still know that it exists16:45
stephenfini.e. if it's in the list of tracked migrations and instances16:45
stephenfinthough I haven't figured out what that will leak yet16:46
stephenfintbd16:46
melwittsorry, what's the proposed fix? being that it's intended that resources be held on source and dest until confirm or revert16:47
stephenfinhttps://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L579-L59116:48
*** gyee has joined #openstack-nova16:48
stephenfinwe only do that if the instance appears in our list of tracked_migrations or tracked_instances16:48
*** nweinber has quit IRC16:49
stephenfinafaict, '_update_usage_from_instances' wipes those and regenerates them. When we confirm the resize, the instance is marked as deleted, which means it's not picked up by '_update_usage_from_instances'16:50
stephenfinso the race is between that (the instance getting marked as deleted and '_update_usage_from_instances' running) and the call to 'drop_move_claim'16:51
stephenfinI _think_16:51
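
[Editor's sketch] The interleaving stephenfin describes above can be modelled with a toy tracker. This is only an illustration of the working theory in the log, not nova's resource tracker: `ToyTracker`, `periodic_rebuild`, `drop_claim`, `UnpinningInvalid` and the `guarded` flag are all hypothetical names, and the "guard" models the fallback idea of only dropping usage when the instance is still tracked.

```python
class UnpinningInvalid(Exception):
    """Toy stand-in for the CPUUnpinningInvalid error named in the bug."""


class ToyTracker:
    """Hypothetical, heavily simplified model of per-host CPU pinning."""

    def __init__(self):
        self.tracked = {}    # instance id -> set of CPUs pinned for it
        self.pinned = set()  # all CPUs currently recorded as pinned

    def claim(self, inst, cpus):
        self.tracked[inst] = set(cpus)
        self.pinned |= set(cpus)

    def periodic_rebuild(self, visible):
        # Models the periodic task: usage is regenerated from scratch
        # using only the instances it can still see on this host.
        self.tracked = {inst: set(cpus) for inst, cpus in visible.items()}
        self.pinned = set()
        for cpus in self.tracked.values():
            self.pinned |= cpus

    def drop_claim(self, inst, cpus, guarded=False):
        # Guarded variant: skip the release if the rebuild already
        # forgot this instance (assumption: nothing is left to free).
        if guarded and inst not in self.tracked:
            return False
        cpus = set(cpus)
        if not cpus <= self.pinned:
            raise UnpinningInvalid(f"{sorted(cpus)} are not pinned")
        self.pinned -= cpus
        self.tracked.pop(inst, None)
        return True


def run(guarded):
    rt = ToyTracker()
    rt.claim("vm1", {0, 1})
    # Confirm has started, so the periodic no longer sees vm1 here...
    rt.periodic_rebuild(visible={})
    # ...and only afterwards does the confirm path drop the claim.
    try:
        rt.drop_claim("vm1", {0, 1}, guarded=guarded)
        return "ok"
    except UnpinningInvalid:
        return "error"


if __name__ == "__main__":
    print("unguarded:", run(guarded=False))
    print("guarded:", run(guarded=True))
```

In this model the unguarded drop fails because the rebuild has already cleared the pinning it tries to release. The log itself notes that it is still unknown what such a guard might leak.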
sean-k-mooneygibi: thanks, i'll restart with that now and see how it works16:58
*** derekh has quit IRC17:00
melwittstephenfin: this is interesting bc the commit message of mriedem's change says "This fixes the issue by also updating usage in drop_move_claim when the instance is not in tracked_migrations but is in tracked_instances." (which would do part of what you're suggesting) but it appears not to be what is in the code17:00
gibisean-k-mooney: thanks. I will end my day about now but I will be back tomorrow morning17:00
sean-k-mooneystephenfin: sounds plausible, i had not got to the point of digging into how the race happens17:00
sean-k-mooneygibi: no worries17:00
melwittstephenfin: oh, nvm, I'm misreading this... it's removing usage by removing the instance from tracked_instances and then updating usage17:03
sean-k-mooneymelwitt: stephenfin if this race is happening the way you say17:08
sean-k-mooneywould this also happen with the old config options?17:09
stephenfinsean-k-mooney: I think so17:09
sean-k-mooneyok its just more obvious now?17:09
stephenfinI think the bug is with that patch from mriedem17:09
sean-k-mooneyor they were just unlucky17:09
stephenfinI don't think it's anything to do with the PCPU work17:09
sean-k-mooneyok17:09
stephenfintbc, it's still a working theory but it would be easy to prove out by backporting that fix to e.g. stable/stein where none of that code is present17:10
sean-k-mooneyi think the upgrade procedure we proposed for our downstream customer still makes sense17:10
*** hamalq has quit IRC17:10
sean-k-mooneybut i think they will need the backport of your two fixes17:10
sean-k-mooneythe one for this race17:10
sean-k-mooneyand the one for the isolate on smt host with new config options17:10
sean-k-mooneyright?17:10
*** hamalq has joined #openstack-nova17:11
stephenfinthe isolate on SMT fix is mostly unrelated and should be backported regardless17:11
stephenfinstill unsure about this race17:12
sean-k-mooneyok17:12
*** hamalq has quit IRC17:18
*** ociuhandu has joined #openstack-nova17:18
*** nightmare_unreal has quit IRC17:20
*** ociuhandu has quit IRC17:22
*** jdillaman has quit IRC17:23
*** brinzhang_ has joined #openstack-nova17:26
*** brinzhang0 has quit IRC17:28
*** janno has quit IRC17:31
*** janno has joined #openstack-nova17:33
*** brinzhang0 has joined #openstack-nova17:35
*** brinzhang has joined #openstack-nova17:37
*** priteau has quit IRC17:38
*** brinzhang_ has quit IRC17:39
*** brinzhang0 has quit IRC17:39
*** nweinber has joined #openstack-nova17:40
*** k_mouza has quit IRC17:43
*** hamalq has joined #openstack-nova17:44
*** hamalq has quit IRC17:45
*** hamalq has joined #openstack-nova17:46
openstackgerritStephen Finucane proposed openstack/nova master: tests: Add reproducer for bug #1879878  https://review.opendev.org/74495017:51
openstackbug 1879878 in OpenStack Compute (nova) "VM become Error after confirming resize with Error info CPUUnpinningInvalid on source node " [Medium,Confirmed] https://launchpad.net/bugs/1879878 - Assigned to Stephen Finucane (stephenfinucane)17:51
openstackgerritStephen Finucane proposed openstack/nova master: Don't unset Instance.old_flavor, new_flavor until necessary  https://review.opendev.org/74495817:51
stephenfinmelwitt: I think that's the fix. Could you sanity check at some point?17:51
melwittstephenfin: yeah I'll take a look17:51
stephenfindansmith: You should probably look too ^ It's a variant of a patch I have for vTPM and as I think I've mentioned before, your name is all over the code I'm touching17:52
*** bbowen has quit IRC17:59
*** raildo has quit IRC18:03
*** raildo has joined #openstack-nova18:08
*** tesseract has quit IRC18:09
*** raildo_ has joined #openstack-nova18:11
*** ralonsoh has quit IRC18:12
*** raildo_ has quit IRC18:13
*** k_mouza has joined #openstack-nova18:13
*** k_mouza has quit IRC18:16
dansmithokay I'm kinda heads-down on something else right now, but melwitt I guess let me know if you're unsure18:36
melwittI haven't looked yet but I think I can already say I'm unsure :)18:36
*** brinzhang_ has joined #openstack-nova18:37
*** brinzhang has quit IRC18:40
*** raildo has quit IRC18:46
*** raildo has joined #openstack-nova18:48
*** xek_ has quit IRC19:06
*** bbowen has joined #openstack-nova19:21
*** nweinber has quit IRC19:35
*** brinzhang0 has joined #openstack-nova19:36
*** brinzhang_ has quit IRC19:39
*** tosky has quit IRC19:39
lyarwoodstephenfin: https://review.opendev.org/#/c/699291/10 - finally got back to this btw if you can take a look19:43
openstackgerritArtom Lifshitz proposed openstack/nova master: Handle Neutron errors in _post_live_migration()  https://review.opendev.org/72976320:11
*** vishalmanchanda has quit IRC20:25
openstackgerritArtom Lifshitz proposed openstack/nova master: WIP: Centralize wait_for_unversioned_notification  https://review.opendev.org/74498520:29
openstackgerritLee Yarwood proposed openstack/nova master: WIP zuul: nova-evacuate  https://review.opendev.org/74488320:33
*** gyee has quit IRC20:36
*** gyee has joined #openstack-nova20:38
*** raildo has quit IRC20:40
*** raildo has joined #openstack-nova20:40
openstackgerritLee Yarwood proposed openstack/nova master: WIP zuul: nova-evacuate  https://review.opendev.org/74488320:41
*** smcginni1 has joined #openstack-nova20:47
*** smcginnis has quit IRC20:50
*** smcginni1 is now known as smcginnis20:50
*** ociuhandu has joined #openstack-nova20:51
*** ociuhandu has quit IRC20:56
*** raildo has quit IRC20:57
lyarwoodmelwitt: https://review.opendev.org/#/c/743319/ - would you mind +W'ing that again as CI is finally green again20:58
* lyarwood wonders if it's okay for him to just +W it in this case if it has already been +W'd in the past?20:58
melwittlyarwood: yeah, I will20:59
*** maciejjozefczyk has quit IRC21:00
lyarwoodthanks :)21:01
melwittnp21:03
openstackgerritLee Yarwood proposed openstack/nova master: WIP zuul: nova-evacuate  https://review.opendev.org/74488321:08
openstackgerritSean McGinnis proposed openstack/nova master: Add lsscsi to bindep  https://review.opendev.org/74499221:17
*** slaweq has quit IRC21:18
smcginnisNeeded for an os-brick change in the latest release ^21:19
*** xek has joined #openstack-nova21:19
*** gyee has quit IRC21:24
sean-k-mooneysmcginnis: should that not be listed in os-brick's bindep21:24
sean-k-mooneynot nova's21:25
*** tosky has joined #openstack-nova21:25
*** gyee has joined #openstack-nova21:25
sean-k-mooneysmcginnis: nova's unit tests should be mocking any calls to os-brick21:25
sean-k-mooneysmcginnis: os-brick is a deliverable of cinder, not nova, so if it was to be added to any project for devstack would it not be better to add it to cinder's bindep21:27
smcginnissmcginnis: It is in os-brick's bindep, but it turns out when we install libs from their released version, that doesn't do us any good.21:28
smcginnissean-k-mooney: Hah, oops. Talking to myself. :)21:28
sean-k-mooneyit happens. i am told it's only a problem if you are surprised by the answer21:29
smcginnissean-k-mooney: I was able to get a change in devstack so it will use bindep when installing from source (it didn't before) but still nothing to address this case.21:29
smcginnis;)21:29
sean-k-mooneywell i'm wondering if you should be adding this to cinder21:29
openstackgerritMerged openstack/nova stable/train: Silence amqp heartbeat warning  https://review.opendev.org/72805721:30
sean-k-mooneyi mean at a minimum it probably should be lsscsi [cinder]21:30
smcginnisThat one is in https://review.opendev.org/#/c/743291/21:30
smcginnisAh, didn't see there was a profile for that.21:30
sean-k-mooneywell there probably isn't21:30
sean-k-mooneyi'm suggesting adding one21:30
sean-k-mooneyit's not a dep of nova when not using cinder, right21:31
smcginnisSo then zuul playbooks would also need to be updated to use that profile.21:31
sean-k-mooneyjust if you are using os-brick ?21:31
smcginnisYeah, so probably only applicable to compute nodes.21:31
sean-k-mooneyya only compute nodes and if cinder is deployed21:31
sean-k-mooneywhat you propose is probably ok, but i just said i would ask since it's not really a dep of nova21:32
smcginnisYeah, makes sense.21:32
smcginnisI don't have time now, but I may be able to follow up later to make it better.21:33
sean-k-mooneywe are not really strict about listing the min deps in bindeps21:33
smcginnisIt is a small package, so at least it's not pulling down the world for this.21:33
sean-k-mooneyya21:33
sean-k-mooneyi would kind of prefer if we use profiles more21:33
sean-k-mooneye.g. add a mysql and postgres profile21:33
smcginnisThat could speed things up overall if we did.21:33
smcginnisI wonder how many places we would need to update playbooks now though. :/21:34
sean-k-mooneywe would need a way to pass info from the job in a declarative way21:34
sean-k-mooneyif we used it more optimally, where we have a profile for each of the configurable backends a project uses and then list the profiles that correspond to the deployment we are testing, that would be nice but also a lot of work :)21:35
*** rcernin has joined #openstack-nova21:36
smcginnisYeah. Good idea though.21:36
*** rcernin has quit IRC21:36
*** brinzhang_ has joined #openstack-nova21:36
*** rcernin has joined #openstack-nova21:36
*** brinzhang0 has quit IRC21:39
openstackgerritMerged openstack/nova master: Removed the host FQDN from the exception message  https://review.opendev.org/74395021:42
*** xek has quit IRC21:47
*** markmcclain has quit IRC21:53
*** rcernin has quit IRC22:03
*** rcernin has joined #openstack-nova22:17
*** martinkennelly has quit IRC22:26
*** jmlowe has quit IRC22:27
*** jmlowe has joined #openstack-nova22:31
*** rcernin has quit IRC22:40
*** rcernin has joined #openstack-nova22:40
*** rcernin has quit IRC22:40
*** rcernin has joined #openstack-nova22:44
*** artom has quit IRC22:45
*** artom has joined #openstack-nova22:46
openstackgerritMerged openstack/nova master: compute: Don't delete the original attachment during pre LM rollback  https://review.opendev.org/74331922:53
openstackgerritMerged openstack/nova master: func: Add CinderFixture to _IntegratedTestBase  https://review.opendev.org/74353522:54
*** tkajinam has joined #openstack-nova23:00
*** mlavalle has quit IRC23:09
*** ociuhandu has joined #openstack-nova23:10
*** markmcclain has joined #openstack-nova23:12
*** tosky has quit IRC23:14
*** ociuhandu has quit IRC23:16
*** irclogbot_1 has quit IRC23:34
*** irclogbot_3 has joined #openstack-nova23:38
*** yoctozepto3 has joined #openstack-nova23:44
*** yoctozepto has quit IRC23:45
*** yoctozepto3 is now known as yoctozepto23:45
openstackgerritLee Yarwood proposed openstack/nova stable/ussuri: compute: Don't delete the original attachment during pre LM rollback  https://review.opendev.org/74416223:56
