Tuesday, 2021-03-23

*** luksky has quit IRC00:07
roukold qemu -> migrate to 3.1 machine type -> upgrade qemu -> update machine type -> slowly let people trickle over is the migration plan till nova can handle this, i guess.00:08
*** tosky has quit IRC00:09
*** johnthetubaguy has quit IRC00:16
*** johnthetubaguy has joined #openstack-nova00:22
*** hamalq has quit IRC00:56
*** macz_ has joined #openstack-nova00:58
*** macz_ has quit IRC01:03
*** rouk has quit IRC01:09
*** martinkennelly has quit IRC01:14
*** ircuser-1 has joined #openstack-nova01:26
*** zhanglong has joined #openstack-nova01:34
*** tbachman has quit IRC01:39
*** tbachman has joined #openstack-nova01:41
*** jamesdenton has quit IRC02:13
*** jamesden_ has joined #openstack-nova02:13
*** rcernin has quit IRC02:28
openstackgerritxinyu wang proposed openstack/nova-specs master: add instance join/leave server group  https://review.opendev.org/c/openstack/nova-specs/+/78235302:30
*** rcernin has joined #openstack-nova02:39
*** hemanth_n has joined #openstack-nova03:07
openstackgerritxinyu wang proposed openstack/nova master: join/leave server group  https://review.opendev.org/c/openstack/nova/+/78235603:25
openstackgerritxinyu wang proposed openstack/nova master: join/leave server group  https://review.opendev.org/c/openstack/nova/+/78235603:35
*** zhanglong has quit IRC03:51
*** khomesh24 has joined #openstack-nova04:03
*** vishalmanchanda has joined #openstack-nova04:09
*** viks____ has quit IRC04:20
*** links has joined #openstack-nova04:22
*** mkrai has joined #openstack-nova04:23
*** gyee has quit IRC04:26
*** whoami-rajat_ has joined #openstack-nova04:57
*** mgariepy has quit IRC05:14
*** kinpaa12389 has joined #openstack-nova05:14
*** mgariepy has joined #openstack-nova05:14
*** jamesden_ has quit IRC05:25
*** jamesdenton has joined #openstack-nova05:25
*** rcernin has quit IRC05:25
*** rcernin has joined #openstack-nova05:26
*** zhanglong has joined #openstack-nova05:31
*** jamesdenton has quit IRC06:20
*** jamesden_ has joined #openstack-nova06:21
*** links has quit IRC06:24
*** ralonsoh has joined #openstack-nova06:26
openstackgerritJosephine Seifert proposed openstack/nova stable/ussuri: Add config parameter 'live_migration_scheme' to live migration with tls guide  https://review.opendev.org/c/openstack/nova/+/78212606:30
*** manuvakery1 has joined #openstack-nova06:30
*** Luzi has joined #openstack-nova06:34
*** kinpaa12389 has quit IRC06:36
*** links has joined #openstack-nova06:44
*** ratailor has joined #openstack-nova06:47
*** zhanglong has quit IRC06:51
*** dklyle has quit IRC06:55
*** mkrai has quit IRC07:04
*** mkrai_ has joined #openstack-nova07:05
*** lpetrut has joined #openstack-nova07:11
*** xarlos has quit IRC07:15
*** slaweq_ has joined #openstack-nova07:15
*** zhanglong has joined #openstack-nova07:23
*** links has quit IRC07:23
*** mkrai_ has quit IRC07:23
*** links has joined #openstack-nova07:26
*** zhanglong has quit IRC07:33
*** kinpaa12389 has joined #openstack-nova07:37
*** pawan-gupta has joined #openstack-nova07:39
*** mkrai_ has joined #openstack-nova07:43
*** rcernin has quit IRC07:46
*** slaweq_ is now known as slaweq07:53
*** whoami-rajat_ is now known as whoami-rajat08:03
*** zhanglong has joined #openstack-nova08:10
*** gokhani has joined #openstack-nova08:13
*** rpittau|afk is now known as rpittau08:13
Luzigibi, elod, lyarwood: the next backport (part 2 of 4) https://review.opendev.org/c/openstack/nova/+/78212608:14
*** khomesh24 has quit IRC08:16
kinpaa12389Hi,08:18
kinpaa12389I am trying to create a snapshot from a server. For this operation, I have set a token expiration value in keystone.conf and added a time.sleep() in snapshot()08:18
kinpaa12389Because of the token expiry value in keystone.conf, glanceclient receives a 401 error. How do I debug this 401 error in keystone?08:18
kinpaa12389I enabled all keystone logs but am still not able to see where this error is returned to glanceclient.08:18
kinpaa12389keys=root, access, warnings, keystone, cc, radius, keystonemiddleware, keystoneauth, oslo_messaging, ldap, amqp, amqplib, sqlalchemy08:18
kinpaa12389above are keys in /etc/keystone/logging.conf08:18
kinpaa12389~08:18
*** rcernin has joined #openstack-nova08:23
*** andrewbonney has joined #openstack-nova08:24
lyarwoodLuzi: LGTM but could you cherry-pick -x $commit_from_stable_victoria https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes08:38
kashyapIs it the fix to the LM w/ TLS guide?08:38
kashyapAh, yep08:39
lyarwoodyes08:39
kashyapMornin, folks08:39
lyarwoodmorning08:39
openstackgerritLee Yarwood proposed openstack/nova stable/ussuri: libvirt: Skip encryption metadata lookups if secret already exists on host  https://review.opendev.org/c/openstack/nova/+/76577008:41
openstackgerritLee Yarwood proposed openstack/nova stable/train: Add regression test for bug #1895696  https://review.opendev.org/c/openstack/nova/+/75248708:45
openstackbug 1895696 in Cinder "unable to boot instance from encrypted volume created from a glance image of an encrypted volume" [Undecided,New] https://launchpad.net/bugs/189569608:45
openstackgerritLee Yarwood proposed openstack/nova stable/train: Create volume attachment during boot from volume in compute  https://review.opendev.org/c/openstack/nova/+/75248808:45
openstackgerritLee Yarwood proposed openstack/nova stable/train: compute: Skip cinder_encryption_key_id check when booting from volume  https://review.opendev.org/c/openstack/nova/+/75248908:45
*** zhanglong has quit IRC08:46
*** ociuhandu has joined #openstack-nova08:51
openstackgerritJosephine Seifert proposed openstack/nova stable/ussuri: Add config parameter 'live_migration_scheme' to live migration with tls guide  https://review.opendev.org/c/openstack/nova/+/78212608:52
Luzilyarwood, ^ is this what you meant?08:54
lyarwoodyup ack'd thanks08:54
*** ociuhandu has quit IRC08:59
*** lucasagomes has joined #openstack-nova09:02
*** derekh has joined #openstack-nova09:02
*** tosky has joined #openstack-nova09:04
openstackgerritLee Yarwood proposed openstack/nova stable/stein: compute: Lock by instance.uuid lock during swap_volume  https://review.opendev.org/c/openstack/nova/+/75873409:08
lyarwoodsome quick and easy docs changes ahead of rc if anyone has time https://review.opendev.org/c/openstack/nova/+/779479 & https://review.opendev.org/c/openstack/nova/+/77944609:09
* lyarwood owes stephenfin a few docs reviews in return09:09
*** ociuhandu has joined #openstack-nova09:09
*** ociuhandu has quit IRC09:09
*** martinkennelly has joined #openstack-nova09:12
*** ociuhandu has joined #openstack-nova09:20
openstackgerritLee Yarwood proposed openstack/nova stable/queens: Use absolute path during qemu img rebase  https://review.opendev.org/c/openstack/nova/+/78079009:21
*** zoharm has joined #openstack-nova09:21
*** vishalmanchanda has quit IRC09:28
*** zhanglong has joined #openstack-nova09:33
*** luksky has joined #openstack-nova10:07
*** tosky has quit IRC10:13
*** admin0 has quit IRC10:14
*** tosky has joined #openstack-nova10:14
*** ociuhandu has quit IRC10:16
*** zhanglong has quit IRC10:20
*** ociuhandu has joined #openstack-nova10:22
gibidoes somebody remember why we forbid reparenting RPs in placement? I think we allow null -> parent_uuid update but we reject parent_uuid -> new_parent_uuid update. I would need local reparenting, moving an RP to another parent within the same RP tree10:28
*** ociuhandu has quit IRC10:28
gibiI can see that moving an RP from one tree to another could break things in the allocations10:28
bauzasgibi: because it would need some reshape, maybe ?10:34
bauzas:)10:35
gibibauzas: yeah if there are allocations then moving an RP could invalidate them10:35
gibiin the generic case10:35
bauzasgibi: afair, we only accept to modify the RPs by some reshape method10:35
bauzaslike we did for vGPUs and others10:35
bauzasif we modify the parent, then the resources would also be modified10:36
gibidoes reshape allow reparenting?10:36
bauzasgibi: yup10:36
bauzasyou provide new inventories10:36
gibihm, then reshape it is10:36
bauzasgibi: context ?10:36
gibibauzas: we have a bug in neutron that creates a slightly wrong RP tree for qos10:37
bauzasgibi: ahah10:38
*** ociuhandu has joined #openstack-nova10:38
gibithe expected tree would be computeRP <- neutron agent RP <- deviceRP10:38
gibibut after a bugfix in Ussuri it was changed to10:38
bauzasgibi: related to the ML thread I saw ?10:38
gibicomputeRP <- neutron agent RP10:39
gibicomputeRP <- device RP10:39
bauzasI see10:39
gibibauzas: which thread? We just figured this problem out yesterday10:39
bauzassec10:39
gibiit does not cause any scheduling issue at the moment, this is why it was hidden so long10:39
bauzastitle is '[ops] Bandwidth problem on computes'10:40
gibibauzas: nope, that is some physical bandwidth issue10:40
bauzasah yeah, just looked10:40
gibiso the above bug does not cause any issue today as the agent RP is basically unused10:40
gibibut I have a request to start tracking OVS packet processing capacity as a resource10:41
gibiand as today the ovs agent has a 1-1 relationship with the ovs agent, the ovs agent RP would be a good place for that new resource10:41
bauzasokay I see the problem10:42
lucasagomeshi, can someone please take a look at https://review.opendev.org/c/openstack/nova/+/776419, https://review.opendev.org/c/openstack/nova/+/776944 and https://review.opendev.org/c/openstack/nova/+/776934 ?10:42
gibiI mean the ovs agent has 1-1 relationship with OVS itself10:42
gibibauzas: so first I would like to fix the bug10:42
lucasagomesThese are small patches that will prevent the nova gate from breaking when we change the default network backend in DevStack to OVN next cycle10:42
gibiand then introduce new resources10:42
gibilucasagomes: I don't expect that these patches will be merged before the RC1 and the stable/wallaby is cut.10:43
lucasagomesgibi, ah fair enough ok10:43
gibilucasagomes: but kick me after the cut and I will find somebody to review them10:44
gibibauzas: so one way to fix it is to re-parent the device RP10:44
lucasagomesgibi, that sounds good, thanks much!10:44
gibibauzas: under the agent RP as it was before Ussuri10:44
gibibauzas: anyhow I will look into the reshape way now10:44
gibithat is a good tip10:45
gibis/tip/suggestion/10:46
*** rcernin has quit IRC10:46
*** mkrai_ has quit IRC10:47
gibibauzas: hm, so /reshape moves inventory and allocation between RPs but it does not move RPs between parents10:49
gibiso I could create a new device RP under the agent and move the inventory and the allocations from the old device RP to the new device RP and then delete the old device RP10:50
gibithis could work from placement perspectiv10:51
gibie10:51
gibineutron port does store the uuid of the RP the port is allocating from. So that would need to change too10:52
bauzasgibi: you provide a new tree with the reshape API10:53
gibibauzas: really? I don't see RPs provided in the API10:54
gibiI see inventories and allocations10:54
*** macz_ has joined #openstack-nova10:55
*** macz_ has quit IRC11:00
*** yoctozepto has quit IRC11:02
gibibauzas: the vgpu reshape also creates the pgpu RPs _before_ the placement reshape call https://github.com/openstack/nova/blob/3de7fb7c327db348d04d15d4cd3c4f811a336126/nova/virt/libvirt/driver.py#L841411:03
*** yoctozepto has joined #openstack-nova11:03
*** artom has joined #openstack-nova11:04
bauzasgibi: sure, but we provided back the new provider tree by this method https://github.com/openstack/nova/blob/3de7fb7c327db348d04d15d4cd3c4f811a336126/nova/virt/libvirt/driver.py#L838011:05
bauzasgibi: so, we provide a new tree and then we look at the allocations11:05
bauzasif there are some of them, then we call the reshape API11:05
bauzashttps://github.com/openstack/nova/blob/3de7fb7c327db348d04d15d4cd3c4f811a336126/nova/virt/libvirt/driver.py#L810211:06
bauzass/allocs/inventories11:06
bauzashttps://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/reshape-provider-tree.html#proposed-change11:08
bauzasgibi: every 60 secs (by default), the virt driver passes back the provider tree11:08
bauzasso it can change it11:09
*** rcernin has joined #openstack-nova11:09
bauzasbut if the virt driver wants to modify the tree by punting new inventories and allocations from a RP to another one, then it needs to tell placement to move them11:09
bauzasgibi: I can understand your concern, you don't want to move inventories or allocations, but just moving the parent11:10
*** vishalmanchanda has joined #openstack-nova11:11
bauzasbut then I guess that the resources would be modified11:12
bauzasgibi: say, for example, with https://docs.openstack.org/placement/latest/user/provider-tree.html#filtering-by-tree11:12
*** elod has quit IRC11:13
bauzasyou would then get different candidates11:13
bauzasthat's why we need to be sure that we won't have different resources11:13
gibisure if I change the structure of the tree then the in_tree a_c queries will change11:14
gibiregardless if I change the tree with reshape or just going in the db and change the parent_uuid of an RP11:15
gibias far as I see if I base the fix on reshape then the process is: create new dev RP, reshape inv and alloc from old dev RP, change neutron ports to point to the new dev RP, delete old dev RP11:16
gibiI think this would work11:16
gibibut it creates a new RP instead of moving the existing RP11:17
gibiso it is more complicated than what I need11:17
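The four steps gibi lists can be sketched on a toy in-memory tree. This is only an illustration of the ordering, not the placement or neutron API: the `Provider` class, the `move_device_under_agent` helper and the provider ids are hypothetical, and the real move of inventories and allocations would be a single reshape call.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Provider:
    """Toy stand-in for a resource provider (hypothetical, not the real API)."""
    parent: Optional[str] = None
    inventory: Dict[str, int] = field(default_factory=dict)
    # consumer id -> {resource class -> amount}
    allocations: Dict[str, Dict[str, int]] = field(default_factory=dict)


def move_device_under_agent(tree, ports, old_dev, new_dev, agent):
    """Replace old_dev (child of the compute RP) with new_dev under agent."""
    # 1. Create the new device RP under the agent RP.
    tree[new_dev] = Provider(parent=agent)
    # 2. "Reshape": move inventory and allocations from old to new.
    tree[new_dev].inventory = tree[old_dev].inventory
    tree[new_dev].allocations = tree[old_dev].allocations
    tree[old_dev].inventory, tree[old_dev].allocations = {}, {}
    # 3. Re-point every port that was allocating from the old device RP.
    for port_id, rp_id in ports.items():
        if rp_id == old_dev:
            ports[port_id] = new_dev
    # 4. Delete the now-empty old device RP.
    del tree[old_dev]
    return tree
```

The sketch makes the drawback visible: the device ends up with a new provider id, so every reference held outside placement (here the `ports` mapping) has to be rewritten, which a plain parent update would not require.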
*** rcernin has quit IRC11:17
*** rcernin has joined #openstack-nova11:18
*** elod has joined #openstack-nova11:21
*** ociuhandu has quit IRC11:22
openstackgerritStephen Finucane proposed openstack/nova master: docs: Add a resource limits guide  https://review.opendev.org/c/openstack/nova/+/78143311:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Add a real-time guide  https://review.opendev.org/c/openstack/nova/+/78143411:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Remove duplicate TPM extra spec info  https://review.opendev.org/c/openstack/nova/+/78143511:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Remove duplicated PCI passthrough extra spec info  https://review.opendev.org/c/openstack/nova/+/78143611:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Add SEV guide  https://review.opendev.org/c/openstack/nova/+/78143711:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Add CPU models guide  https://review.opendev.org/c/openstack/nova/+/78143811:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Change formatting of hypervisor config guides  https://review.opendev.org/c/openstack/nova/+/78143911:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Add libvirt misc doc  https://review.opendev.org/c/openstack/nova/+/78144011:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Clarify host-model, host-passthrough differences  https://review.opendev.org/c/openstack/nova/+/78241011:24
openstackgerritStephen Finucane proposed openstack/nova master: docs: Fold in MDS security flaw doc  https://review.opendev.org/c/openstack/nova/+/78241111:24
openstackgerritStephen Finucane proposed openstack/nova master: api: Improve extra spec validator help texts  https://review.opendev.org/c/openstack/nova/+/78241211:24
*** rcernin has quit IRC11:25
*** ociuhandu has joined #openstack-nova11:29
gibibauzas: another issue. I need to keep the name of the device RP as nova depends on it11:41
gibiand RP name needs to be unique11:41
bauzasyup11:41
gibiin placement11:41
bauzasin general, we use (parent)_(name)11:41
gibiso the create a new device RP way does not work11:41
sean-k-mooneyis this in relation to something specific?11:42
bauzasI wonder11:42
bauzassean-k-mooney: see above11:42
sean-k-mooneyim trying to read back but what i have got so far is moving bandwidth inventories via a reshape?11:43
bauzasgibi: maybe you should just create a new reshape method11:43
bauzasgibi: and then modify the names11:43
bauzasin there11:43
bauzaslike : does the RP have the wrong parent UUID ? yes => ReshapeNeeded11:44
gibibauzas: ohh the RP name is settable, so we can use a temporary name11:44
bauzasthen, ReshapeNeeded will be provided so you could say 'ah, ReshapeNeeded, so create a new RP, modify the existing name, and then use this name for my new RP'11:44
sean-k-mooneygibi: its settable only once i think11:45
sean-k-mooneygibi: eg i dont think you can modify it after you create the RP11:45
gibisean-k-mooney: parent (null-> nonnull) and name is settable11:45
gibias far as I see11:45
bauzasgibi: this way, you wouldn't change the RPs after the compute service is restarted11:45
sean-k-mooneythere used to be a limitation around this but im trying to recall what it was11:45
gibibauzas: these RPs are not handled by nova, so this reshape will happen in neutron :)11:46
bauzasah, shit, right11:46
gibibauzas: but besides that yes11:46
* bauzas needs to lunch11:46
sean-k-mooneyallowing RPs to be reparented would be a much cleaner approach11:47
sean-k-mooneyi have asked for this in the past11:47
sean-k-mooneyto avoid the need to reshape11:47
bauzasno11:48
bauzasbecause you need to modify them just after restarting the service11:48
bauzasand not after11:48
bauzasor candidates would be different11:48
sean-k-mooneybauzas: we have discussed this in the past and i have never been satisfied with the answers as to why changing the parent is not allowed11:48
gibisean-k-mooney: agree reparenting in my specific case would be cleaner, a generic reparenting is dangerous on the other hand11:49
gibilike moving an RP from one tree to another11:50
sean-k-mooneyi dont think its any more dangerous than a reshape11:50
gibithat would definitely break existing allocations11:50
sean-k-mooneyit may or it may not11:50
sean-k-mooneythere are several cases where it would not11:50
sean-k-mooneygibi: setting the parent is what i was thinking of before not the name by the way11:51
sean-k-mooneye.g. where we could only do it once11:51
gibisean-k-mooney: yeah, parent can only be set from null to non null11:51
sean-k-mooneygibi: the bandwidth RPs should really be using the pci device address not the netdev name by the way11:52
sean-k-mooneyi strongly advocated for not using the netdev name in the RP name when this was being first added11:52
gibisean-k-mooney: nova uses PCI address, neutron uses netdev name11:52
sean-k-mooneyright which is problematic11:53
gibias neutron handles these RPs it got the netdev name and nova does a translation.11:53
sean-k-mooneyright i know11:53
sean-k-mooneybut i did not want it to use the netdev name in neutron at all11:53
gibithats a wide change ^^11:54
gibias neutron used that even before qos11:54
sean-k-mooneycorrect which has been broken for sriov for a long time11:54
sean-k-mooneyby the way i was not suggesting that neutron stops using devname everywhere11:55
sean-k-mooneythat is a separate thing11:55
sean-k-mooneyjust that we model device in placement with the pci address11:55
*** zoharm has quit IRC11:55
sean-k-mooneywhen we start modeling sriov vfs in placement we will need to use the PF pci address as part of the name for the VF inventories11:56
*** brinzhang has quit IRC11:56
sean-k-mooneyother but that will be challenging with qos11:56
*** brinzhang has joined #openstack-nova11:57
sean-k-mooneywe should not have different behaviour with nic vs other pci devices so i dont see using the devname as a valid option for nova reporting pci device in the future11:57
gibisean-k-mooney: I agree that when we model the PFs in placement we need to do it right. and make the QoS related RPs aligned.11:59
*** admin0 has joined #openstack-nova12:02
*** tobias-urdin has joined #openstack-nova12:10
*** ociuhandu_ has joined #openstack-nova12:12
*** ociuhandu has quit IRC12:13
*** ociuhandu has joined #openstack-nova12:13
gibisean-k-mooney: it is not impossible to rename the device RPs in placement to use pci address, we just need to add some netdev -> PCI address translation code in the neutron sriov agent, and keep conditional logic in nova to look for old netdev names and new pci address names for an extra cycle to support rolling upgrade12:13
sean-k-mooneyhow did the RP get moved by the way12:14
sean-k-mooneyyou mentioned a bugfix?12:14
sean-k-mooneyi assume that has been reverted?12:14
sean-k-mooneyand yes that is one way to address that in the future12:15
gibisean-k-mooney: this fixed a bug https://review.opendev.org/c/openstack/neutron/+/696600 but broke the tree12:15
sean-k-mooneythe other way would be to add a kind of symlink or alias feature to placement12:15
sean-k-mooneyso we can refer to the same RP with different names12:15
sean-k-mooneyah i remember this12:16
*** ociuhandu_ has quit IRC12:17
sean-k-mooneygibi: what is currently broken by having them under the compute node RP12:17
gibisean-k-mooney: if you upgrade from Train to Ussuri in a way that you had QoS configured already in Train then the Ussuri neutron will error out12:18
gibiif you deploy a new Ussuri then no visible problem seen12:18
sean-k-mooneygot it12:18
sean-k-mooneyso we dont have min bandwidth testing in grenade12:18
gibithis is why the problem went undetected for so long12:18
sean-k-mooneywell neutron does not12:19
gibiI can even think that scheduling works after the upgrade12:19
gibibut i have to reproduce it and try12:19
sean-k-mooneyi dont think it would break scheduling12:19
gibinew deployments are not affected just have a wrong tree structure12:19
sean-k-mooneynova is not currently relying on the structure12:19
sean-k-mooneyi proposed using that structure because i wanted to model other networking requirements12:20
sean-k-mooneylike max port count supported by a vswitch12:20
sean-k-mooneyor trait on an agent provider for offloads or network type support e.g. vxlan12:20
sean-k-mooneythen eventually add that to the query with same subtree12:21
*** hemna has quit IRC12:21
sean-k-mooneybut we did not have same subtree at the time so the agent RPs are not actually adding any benefit currently12:21
*** hemna has joined #openstack-nova12:21
sean-k-mooneyanyway so the fix for this would go into neutron right12:22
gibiyes and yes12:22
gibiand the whole thing came out as I started looking into modeling OVS packet processing capacity on the ovs agent RP12:22
*** hemna has quit IRC12:23
sean-k-mooneyas in packet per second?12:23
gibiyepp12:23
sean-k-mooneyyou and i both know that that depends on the type of packet you use and the ip pipeline/match rules12:23
*** hemna has joined #openstack-nova12:23
sean-k-mooneyits very hard to get that right without a lot of testing on your exact hardware12:24
sean-k-mooneyi assume neutron will be dumb and just have the operator say the capacity is X in a config like bandwidth12:24
gibiI'm not well educated in OVS12:24
gibiyes the inventory would be config driven like bandwidth12:24
sean-k-mooneygibi: the pps number is differnt for vlan network vs vxlan network on the same ovs on the same host12:25
sean-k-mooneye.g. l3 tunneled networks use more cpu cycles to decap and encap than vlan12:25
sean-k-mooneyin the case of ovs-dpdk vm to vm traffic on the same host uses more cycles than vm to physical network12:26
sean-k-mooneysince vm to vm cant leverage as many hardware offloads12:26
sean-k-mooneyso depending on the direction of traffic and the overlay used the capacity will change12:27
*** ociuhandu has quit IRC12:27
sean-k-mooneyyou can obviously deploy and test for this but it also will change based on other factors12:27
sean-k-mooneysuch as security group rules implemented in conntrack and the traffic profile.12:27
sean-k-mooneye.g. setting up and tearing down lots of tcp connections requires all the initial packets to go through conntrack to do the state tracking12:28
sean-k-mooneyso the pps will be lower with lots of new connections being established vs steady state12:29
sean-k-mooneymulticast/broadcast vs unicast also is a factor12:29
sean-k-mooneygibi: so im glad figuring that out will be the operator's problem12:29
*** ociuhandu has joined #openstack-nova12:33
lyarwoodhttps://review.opendev.org/c/openstack/nova/+/768466 - anyone able to review this change introducing a nova-live-migration-ceph job? Everything has finally merged so we should be good to go now.12:33
lyarwoodI'm still working on the grenade change ontop of it12:34
gibisean-k-mooney: thanks for the info, I don't have this deep networking knowledge, so I appreciate these details12:36
*** hemanth_n has quit IRC12:37
gibisean-k-mooney: I agree that configuring this will be really deployment specific12:37
* sean-k-mooney im just happy i dont need to deal with rfc2544 in my current role12:38
*** ociuhandu has quit IRC12:38
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Simplify device_path check in _detach_encryptor  https://review.opendev.org/c/openstack/nova/+/77846312:38
gibisean-k-mooney: as far as I understand some of our big deployers tend to use a unified network setup, like all vlan. Also these deployers tend to do deep performance testing with their config, so I assume they will figure out a pps number for their specific deployment and traffic patterns.12:38
sean-k-mooneyi worked with other teams at intel that worked on ovs and vsperf in opnfv12:38
sean-k-mooneygibi: yep they do12:39
sean-k-mooneybut for public clouds this gets harder as you dont know the workload12:39
gibiyes12:39
sean-k-mooneyi assume as part of this there is an enforcement element12:39
sean-k-mooneye.g. a max pps qos policy12:40
sean-k-mooneyi think tc can enforce that12:40
gibiyeah the max pps is already proposed https://review.opendev.org/c/openstack/neutron-specs/+/779940/1/specs/wallaby/qos_pps_rule.rst12:40
gibiat the moment I'm looking into min pps only for the scheduling decision but later there might be enforcement for that as well12:41
sean-k-mooneygood because we dont currently allow enforcing that in nova flavor extra specs with quota:vif_* and i kind of want to kill those12:41
sean-k-mooneywell we can do what we do for bandwidth today12:42
sean-k-mooneye.g. set min = max12:42
sean-k-mooneyand just schedule on min and enforce max12:42
sean-k-mooneywhich since we dont oversubscribe means we could effectively enforce min12:42
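The "schedule on min, enforce max" idea can be sketched as follows. This is a minimal illustration under stated assumptions: `PacketRateProvider` and its methods are hypothetical names, not neutron or placement code, and the capacity is assumed to be a config-driven number as discussed above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PacketRateProvider:
    """Hypothetical provider holding a config-driven pps inventory."""
    capacity_pps: int
    allocations: Dict[str, int] = field(default_factory=dict)

    def try_allocate(self, port_id: str, min_pps: int) -> bool:
        """Schedule on min: accept only if all guaranteed rates still fit."""
        if sum(self.allocations.values()) + min_pps > self.capacity_pps:
            return False  # would oversubscribe the provider
        self.allocations[port_id] = min_pps
        return True

    def max_pps_limit(self, port_id: str) -> int:
        """Enforce max: with min == max the limit is the allocated rate."""
        return self.allocations[port_id]
```

Because the sum of allocated minimums never exceeds the capacity and each port is capped at its own minimum, the sum of the enforced maximums also fits, so every port can reach its guaranteed rate without a dedicated minimum-rate mechanism in the datapath.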
gibiyes, that is possible here too12:43
sean-k-mooneyi dont think ovs support min pps12:43
sean-k-mooneyeven for sriov that is a challenge12:43
gibithere are some differences with BW. The packet rate inventory seems to be directionless in nature12:45
* bauzas is back12:45
gibiand in case of OVS it more belongs to the softswitch level than to the bridge level12:45
gibiand for SRIOV packet rate is not really a limiting factor12:46
gibiafaik12:46
sean-k-mooneythe same is true for pps form an ovs point of view12:46
sean-k-mooneybridges are meaningless for the most part12:46
sean-k-mooneygibi: it kind of still is12:47
sean-k-mooneyeven with sriov 64b packets will limit the bandwidth of the nic12:47
sean-k-mooneytypically they can handle 1 or 2 VFs at that rate but not multiple12:47
gibisean-k-mooney: interesting, I did not know that12:48
gibiso the pps will be the limit before the bandwidth if there are a lot of VFs and the packets are small?12:49
sean-k-mooneynormally the north south direction can handle full line rate at 64b packets, for 10G thats about 14.4mpps12:49
sean-k-mooneyso it depends, where the limit will come in would be vm to vm traffic on the same host12:49
sean-k-mooneythe other place its a problem is port mirroring12:50
sean-k-mooneyor running vf in trusted mode12:50
sean-k-mooneysetting a vf to trusted mode consumes a lot of bandwidth, ignoring the security implications, especially if there are multiple of them12:51
sean-k-mooneyit forces the nic to copy a subset of packets to multiple vfs12:51
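For reference, the line-rate packet figure for minimum-size frames can be worked out from the Ethernet framing overhead. The sketch below assumes standard Ethernet, where each frame carries an extra 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap); the function name is illustrative only. For 10G and 64-byte frames it gives roughly 14.88 Mpps, in the same range as the figure quoted above.

```python
def line_rate_pps(link_bps: int, frame_bytes: int = 64,
                  overhead_bytes: int = 20) -> float:
    """Theoretical maximum packets per second on an Ethernet link.

    overhead_bytes covers preamble (7), start-of-frame delimiter (1)
    and the minimum inter-frame gap (12).
    """
    bits_per_frame = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_per_frame


# 10 Gbit/s link with minimum-size 64-byte frames
print(f"{line_rate_pps(10_000_000_000) / 1e6:.2f} Mpps")
```

With larger frames the same link needs far fewer packets per second to saturate, which is why small-packet workloads hit a pps limit long before a bandwidth limit.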
*** mgoddard has quit IRC12:53
gibithanks, these make sense. This also points to the direction that keeping bandwidth and packet rate inventory configurable as it is really deployment and traffic pattern dependent12:53
sean-k-mooneygibi: you tend to hit the pcie bottleneck before the pps bottleneck in most cases but if you want to get full line rate with sriov and small packets i know our nfv folks had to do a lot of tuning12:53
gibigood to know12:54
sean-k-mooneyhttps://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/network_functions_virtualization_planning_and_configuration_guide/part-sriov-nfv-configuration#sect-configuring-sriov12:55
sean-k-mooneythey are using isolcpus=1-19,21-39 on the kernel in part 412:55
sean-k-mooney6.2.412:55
gibiyeah we also separate tenant cpus from host cpus and from ovs cpus12:57
gibiwith isolcpu12:57
sean-k-mooneyreally i dont think people should use that on non realtime hosts12:57
*** ociuhandu has joined #openstack-nova12:57
sean-k-mooneyits deprecated upstream in the kernel and its kind of a blunt hammer12:57
gibiisolcpu helps avoid host cpu load affecting the pinned VMs12:58
sean-k-mooneyin many cases you can achive the same thing without using it12:58
sean-k-mooneyit does yes12:58
sean-k-mooneybut people enable it and then deploy unpinned vms on the host12:58
gibicgroups would be the alternative I guess12:58
sean-k-mooneyyes tuned has support12:59
sean-k-mooneywe wanted to entirely remove support for isolcpus from our product and move people to use tuned instead12:59
sean-k-mooneybut we have not been able to convince the nfv dfg to do that yet12:59
sean-k-mooneyhttps://github.com/redhat-performance/tuned/blob/master/profiles/cpu-partitioning/cpu-partitioning-variables.conf13:00
sean-k-mooneyyou can use tuned to configure the cgroups for you including isolating the cores from irqs13:00
sean-k-mooneythats what we use by default for non realtime hosts now13:01
*** ociuhandu has quit IRC13:02
gibiI see13:03
sean-k-mooneyin the cgroup way you can prevent the kernel scheduling vms to the core independently of balancing the vms on cores13:04
sean-k-mooneyisolcpus does both13:05
sean-k-mooneyso you can use the tuned/cgroup method with floating vms too not just pinned13:05
sean-k-mooneybut isolcpu does a few things that you cant quite do from userspace fully or at least with tuned at this point so they still use and support it for nfv usecases13:06
gibiI see. I think we only support pinned vms by default13:06
sean-k-mooneythat both complicates and simplifies your life :)13:06
gibiyeah13:06
gibisoo telco13:06
gibi:)13:07
sean-k-mooneycomplicates because numa and simplifies because only numa13:07
gibiyeah13:07
sean-k-mooneyif you dont have to deal with people mixing numa and non numa vms its nice13:07
*** ociuhandu has joined #openstack-nova13:08
gibiyes we sort of force numa with pinned cpus and huge pages13:08
*** ociuhandu has quit IRC13:08
*** ociuhandu has joined #openstack-nova13:10
*** ociuhandu has quit IRC13:15
openstackgerritMerged openstack/nova master: Dynamically archive FK related records in archive_deleted_rows  https://review.opendev.org/c/openstack/nova/+/77383413:19
*** lpetrut has quit IRC13:22
*** ociuhandu has joined #openstack-nova13:26
*** hemanth_n has joined #openstack-nova13:29
bauzasfolks, if you need to discuss with kashyap, he's off IRC as he has an Internet issue with his ISP13:29
sean-k-mooneyi did have one thing that came up last night13:31
*** ociuhandu has quit IRC13:31
sean-k-mooneystarts here http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2021-03-22.log.html#t2021-03-22T19:34:1913:32
sean-k-mooneybasically qemu changed the cpu flags that are set in a specific cpu model and if you dont use versioned machine types that will break live migration13:32
sean-k-mooneywe may or may not be able to work around that in nova via the migration xml13:33
sean-k-mooneybut the end effect is that epyc-ibrs when the vm was booted on the source host no longer results in the same cpu features enabled if you boot a vm on the same host or a different one with the same libvirt xml13:34
sean-k-mooneywhen you are using unversioned machine types13:34
bauzassean-k-mooney: kashyap dunno when he will be back13:34
sean-k-mooneyit's not super urgent, at least for me13:35
*** belmoreira has joined #openstack-nova13:35
*** ratailor has quit IRC13:35
sean-k-mooneybut rouk has already confirmed that setting an explicit machine type in nova or using kashyap's recent patch to allow removing feature flags both cannot fix the issue13:35
sean-k-mooneysince neither takes effect on live migration13:36
sean-k-mooneyit won't break us downstream since we use versioned machine types, but it will break anyone that uses an unversioned machine type after a qemu update13:36
sean-k-mooneythat includes our downstream customers if they set an unversioned machine type in the nova config or in the glance image13:37
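A rough way to audit for the exposure described above is to check whether a configured machine type carries a version suffix. This is only a sketch: `is_versioned_machine_type` is a hypothetical helper, not nova code, and the pattern assumes names shaped like `pc-i440fx-3.1` or `pc-q35-rhel8.2.0` versus unversioned aliases such as `pc` and `q35`.

```python
import re

# Assumption: a versioned machine type ends in "-<optional letters><N>.<N>[.<N>...]",
# e.g. "pc-i440fx-3.1" or "pc-q35-rhel8.2.0". Unversioned aliases ("pc", "q35")
# have no such suffix and can resolve to a different type after a qemu update.
_VERSION_SUFFIX = re.compile(r"-[a-z]*\d+(?:\.\d+)+$")


def is_versioned_machine_type(machine_type: str) -> bool:
    """Return True if the machine type name ends in a version suffix."""
    return bool(_VERSION_SUFFIX.search(machine_type))


def unversioned(machine_types):
    """Return the subset of names that would float with the qemu version."""
    return [m for m in machine_types if not is_versioned_machine_type(m)]
```

Running `unversioned()` over the machine types found in the nova config and image properties would give a list of candidates to pin before the next qemu upgrade.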
*** hemanth_n has quit IRC13:39
*** ociuhandu has joined #openstack-nova13:42
*** ociuhandu has quit IRC13:47
*** ociuhandu has joined #openstack-nova13:50
*** sapd1 has joined #openstack-nova13:51
*** ociuhandu has quit IRC13:52
*** ociuhandu has joined #openstack-nova13:52
*** mgoddard has joined #openstack-nova13:57
*** mlavalle has joined #openstack-nova13:59
*** ociuhandu has quit IRC14:04
*** ociuhandu has joined #openstack-nova14:10
*** ociuhandu has quit IRC14:15
*** ociuhandu has joined #openstack-nova14:27
lyarwoodgibi / stephenfin ; https://review.opendev.org/c/openstack/nova/+/733627 - could either of you look at this sometime this week, wsgi fix when launched by mod_wsgi14:30
gibilyarwood: ack14:33
lyarwoodthanks14:33
*** ociuhandu has quit IRC14:35
stephenfindone14:39
gibistephenfin won14:43
*** dklyle has joined #openstack-nova14:47
*** Luzi has quit IRC14:53
bauzasgibi: taking the semaphore for bug triaging14:54
gibibauzas: given14:54
bauzasI see very old open bugs that aren't triaged14:55
bauzaslike https://bugs.launchpad.net/nova/+bug/146363114:55
openstackLaunchpad bug 1463631 in grenade "60_nova/resources.sh:106:ping_check_public fails intermittently" [Undecided,Confirmed]14:55
bauzasany reason to leave them in such a state?14:55
*** whoami-rajat has quit IRC15:03
bauzaslyarwood: hmm, very interesting corner case https://bugs.launchpad.net/nova/+bug/192088615:13
openstackLaunchpad bug 1920886 in OpenStack Compute (nova) "ImageNotFound error occurs after live migration" [Undecided,New]15:14
*** mkrai has joined #openstack-nova15:17
gibibauzas: https://bugs.launchpad.net/nova/+bug/1463631 has a not too old comment from lyarwood that he saw it again15:22
openstackLaunchpad bug 1463631 in grenade "60_nova/resources.sh:106:ping_check_public fails intermittently" [Undecided,Confirmed]15:22
lyarwoodsorry was flooded with pings downstream15:24
* lyarwood reads15:24
lyarwoodmy god, a well written bug report15:25
lyarwoodI think I might faint15:25
*** lpetrut has joined #openstack-nova15:26
*** macz_ has joined #openstack-nova15:30
*** mkrai has quit IRC15:32
*** mkrai has joined #openstack-nova15:32
lyarwoodgibi / bauzas ; re the grenade bug, yeah I couldn't make any sense of that, we could drop nova if you don't think we can help15:35
*** ebbex has quit IRC15:35
*** dosaboy has quit IRC15:35
*** nautik has quit IRC15:35
*** owalsh has quit IRC15:35
*** owalsh has joined #openstack-nova15:36
*** lpetrut has quit IRC15:37
*** martinkennelly has quit IRC15:37
*** purplerbot has quit IRC15:37
*** dosaboy has joined #openstack-nova15:37
*** frickler has quit IRC15:37
*** ebbex has joined #openstack-nova15:37
*** irclogbot_0 has quit IRC15:37
*** frickler has joined #openstack-nova15:37
gibilyarwood: I've just run a logstash query http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Couldn't%20ping%20server%5C%22 and in the last 7 days we had 5 hits so this is active15:38
*** masterpe has quit IRC15:38
kashyapbauzas: Thanks!  My network has been super flaky; just reading the scrollback15:39
*** irclogbot_1 has joined #openstack-nova15:39
openstackgerritLee Yarwood proposed openstack/nova master: compute: Reject requests to commit intermediary snapshot of an inactive instance  https://review.opendev.org/c/openstack/nova/+/78113815:39
kashyapsean-k-mooney: Hi, looking at the chat log15:39
*** martinkennelly has joined #openstack-nova15:40
kashyapAssuming I'm still connected here, I'm getting a "We’re having trouble finding that site."15:40
*** adriant7 has joined #openstack-nova15:44
*** rouk has joined #openstack-nova15:45
bauzaslyarwood: ack, will put the grenade bug to be invalid for nova15:45
*** xek has quit IRC15:45
*** adriant has quit IRC15:46
*** adriant7 is now known as adriant15:46
rouksean-k-mooney: so yeah, neither updating cpu_map nor machine type worked.15:46
*** xek has joined #openstack-nova15:46
roukim just building images with qemu 3.1 i guess now as the last ditch fix.15:46
*** rouk has quit IRC15:48
*** irclogbot_1 has quit IRC15:49
*** rouk has joined #openstack-nova15:51
*** mkrai has quit IRC15:51
*** kinpaa12389 has quit IRC15:52
*** irclogbot_0 has joined #openstack-nova15:53
*** ociuhandu has joined #openstack-nova15:56
*** nautik has joined #openstack-nova15:57
openstackgerritRuby Loo proposed openstack/nova master: Allow plus sign in flavor ids  https://review.opendev.org/c/openstack/nova/+/78254515:58
*** LinPeiWen has quit IRC16:06
openstackgerritRuby Loo proposed openstack/nova master: Allow plus sign in flavor ids  https://review.opendev.org/c/openstack/nova/+/78254516:09
*** gokhani has quit IRC16:11
*** masterpe has joined #openstack-nova16:20
*** __ministry has joined #openstack-nova16:38
*** links has quit IRC16:55
openstackgerritMerged openstack/nova master: Initialize global data separately and run_once in WSGI app init  https://review.opendev.org/c/openstack/nova/+/73362716:56
*** __ministry has quit IRC16:59
*** lucasagomes has quit IRC17:00
*** iurygregory has quit IRC17:02
*** belmoreira has quit IRC17:07
*** jangutter_ has joined #openstack-nova17:08
*** ociuhandu_ has joined #openstack-nova17:10
*** jangutter has quit IRC17:12
*** ociuhandu has quit IRC17:14
*** ociuhandu_ has quit IRC17:15
*** rpittau is now known as rpittau|afk17:26
*** iurygregory has joined #openstack-nova17:29
*** sapd1 has quit IRC17:29
*** derekh has quit IRC18:03
*** andrewbonney has quit IRC18:12
*** vishalmanchanda has quit IRC18:17
*** hamalq has joined #openstack-nova18:18
rouksean-k-mooney: nova patch wise, see a simpler way to make it work?18:31
sean-k-mooneythe only thing i can think of is18:34
sean-k-mooneyto update the migrate xml18:34
sean-k-mooneyto list exactly the features that are in use on the source node18:34
sean-k-mooneyand skip the cpu compare18:34
sean-k-mooneywith a workaround flag18:34
sean-k-mooneybut i have not looked at writing that18:35
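The idea sketched in the messages above (rewrite the migration XML so it names the exact CPU features in use on the source) might look roughly like this. It is an illustration only, not a nova patch: `pin_cpu_features` is a hypothetical helper, the feature names passed in are placeholders, and it does not cover skipping the CPU compare or the workaround flag.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def pin_cpu_features(domain_xml: str, model: str, features: Iterable[str]) -> str:
    """Rewrite the <cpu> element of a libvirt domain XML so that it names
    the CPU model and every required feature explicitly.

    Assumption: the caller already knows which features the running guest
    has on the source host; how to obtain them is out of scope here.
    """
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        cpu = ET.SubElement(root, "cpu")

    # Drop the old model/feature children but keep anything else
    # (for example <topology> or <numa>) untouched.
    for child in list(cpu):
        if child.tag in ("model", "feature"):
            cpu.remove(child)

    cpu.set("mode", "custom")
    cpu.set("match", "exact")

    model_el = ET.Element("model", {"fallback": "forbid"})
    model_el.text = model
    cpu.insert(0, model_el)

    for name in sorted(features):
        ET.SubElement(cpu, "feature", {"policy": "require", "name": name})

    return ET.tostring(root, encoding="unicode")
```

The point of listing features explicitly is that the destination no longer depends on what the model name expands to in its own qemu version.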
sean-k-mooneyi pinged kashyap about this a little earlier to see if he had any ideas but you tried most of the ones i thought would be straightforward18:36
roukthat sounds like more work than removing a commit from qemu18:36
sean-k-mooneysince they didn't work, nothing that is quick and simple18:36
sean-k-mooneyya it is18:36
roukill get it cooking then.18:37
*** hemna has quit IRC18:39
*** hemna has joined #openstack-nova18:50
*** manuvakery1 has quit IRC18:51
*** dtantsur is now known as dtantsur|afk18:55
*** dtantsur|afk is now known as dtantsur|afk|afk18:55
*** dtantsur|afk|afk is now known as dtantsur|afk18:55
*** hemna has quit IRC19:25
*** hemna has joined #openstack-nova19:26
melwittgmann: do you think such a change like https://review.opendev.org/c/openstack/nova/+/782545 represents an api change that needs a new microversion?19:53
gmannmelwitt: yeah, it changes 400 to 200 which is an interop issue. a microversion is needed for such changes18:56
melwittgmann: ack, thanks19:56
gmannmelwitt: this is the same as allowing more chars in keypair names - https://blueprints.launchpad.net/nova/+spec/allow-special-characters-in-keypair-name20:09
*** pawan-gupta has quit IRC20:15
melwittgmann: a-ha, thanks for the example!20:19
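The microversion point above can be pictured as validation that only relaxes when the client asks for a new enough version, so older clients keep getting the 400 they always got. This is a simplified sketch, not nova's schema code: the version `2.99` and both patterns are made-up assumptions for illustration.

```python
import re

# Hypothetical microversion at which "+" becomes legal in flavor ids.
PLUS_SIGN_MIN_VERSION = (2, 99)

# Simplified stand-ins for the real validation patterns (assumptions).
_OLD_PATTERN = re.compile(r"[a-zA-Z0-9. _-]+")
_NEW_PATTERN = re.compile(r"[a-zA-Z0-9. _+-]+")


def parse_microversion(value: str):
    """Turn a string such as "2.79" into a comparable (major, minor) tuple."""
    major, minor = value.split(".")
    return int(major), int(minor)


def flavor_id_is_valid(flavor_id: str, requested_version: str) -> bool:
    """Validate a flavor id against the rules of the requested microversion."""
    if parse_microversion(requested_version) >= PLUS_SIGN_MIN_VERSION:
        pattern = _NEW_PATTERN
    else:
        pattern = _OLD_PATTERN
    return pattern.fullmatch(flavor_id) is not None
```

Comparing versions as integer tuples rather than strings avoids ordering "2.9" after "2.79".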
*** sean-k-mooney has quit IRC20:24
*** jmlowe has quit IRC20:39
*** ociuhandu has joined #openstack-nova20:39
*** jmlowe has joined #openstack-nova20:40
*** rcernin has joined #openstack-nova21:01
*** ociuhandu has quit IRC21:05
*** ociuhandu has joined #openstack-nova21:11
*** rcernin has quit IRC21:16
*** ociuhandu has quit IRC21:16
*** jangutter has joined #openstack-nova21:22
*** jangutter_ has quit IRC21:26
*** ralonsoh has quit IRC21:36
*** fnordahl has quit IRC21:37
*** rcernin has joined #openstack-nova21:46
*** slaweq has quit IRC21:52
*** slaweq has joined #openstack-nova21:54
*** rcernin has quit IRC22:10
*** rcernin has joined #openstack-nova22:10
openstackgerritMerged openstack/nova stable/ussuri: Add config parameter 'live_migration_scheme' to live migration with tls guide  https://review.opendev.org/c/openstack/nova/+/78212622:46
*** hoonetorg has joined #openstack-nova22:53
*** slaweq has quit IRC23:05
*** slaweq has joined #openstack-nova23:06
*** jamesden_ has quit IRC23:16
*** jamesdenton has joined #openstack-nova23:16
*** luksky has quit IRC23:22
*** johnthetubaguy has quit IRC23:36
*** johnthetubaguy has joined #openstack-nova23:38
*** macz_ has quit IRC23:40
*** johnthetubaguy has quit IRC23:49
*** hamalq has quit IRC23:55
*** johnthetubaguy has joined #openstack-nova23:57
