Wednesday, 2020-03-25

*** tetsuro has joined #openstack-nova00:08
*** ociuhandu has quit IRC00:13
*** ociuhandu has joined #openstack-nova00:14
*** ociuhandu has quit IRC00:18
*** tetsuro has quit IRC00:19
*** tetsuro has joined #openstack-nova00:21
*** bbowen has joined #openstack-nova00:27
*** tosky has quit IRC00:32
*** lbragstad has quit IRC00:37
*** zhanglong has joined #openstack-nova00:54
openstackgerritBrin Zhang proposed openstack/python-novaclient master: Microversion 2.83 - action event fault details  https://review.opendev.org/71456100:54
*** larainema has joined #openstack-nova00:56
openstackgerritMerged openstack/nova stable/pike: rt: only map compute node if we created it  https://review.opendev.org/67646300:58
openstackgerritBrin Zhang proposed openstack/nova master: Expose instance action event details out of the API  https://review.opendev.org/69443000:58
openstackgerritBrin Zhang proposed openstack/nova master: Add instance actions v283 samples test  https://review.opendev.org/70625100:58
brinzhang_stephenfin, gibi: updated done of bp/action-event-fault-details https://review.opendev.org/69443000:58
openstackgerritMerged openstack/nova stable/pike: pike-only: remove broken non-voting ceph jobs  https://review.opendev.org/70007201:05
brinzhang_dansmith: Did you see my reply in https://review.opendev.org/#/c/693828/?01:11
*** Liang__ has joined #openstack-nova01:13
*** zhanglong has quit IRC01:16
brinzhang_sean-k-mooney: I left some comments in https://review.opendev.org/#/c/631244/69/nova/accelerator/cyborg.py@214, can you check again? If they are valid, I think that can be done in a follow-up.01:21
*** zhanglong has joined #openstack-nova01:21
*** macz_ has joined #openstack-nova01:24
*** liuyulong has quit IRC01:28
*** macz_ has quit IRC01:28
gmannbrinzhang_: one comment on policy check for older version - https://review.opendev.org/69443001:34
brinzhang_gmann: will check01:36
brinzhang_gmann: https://review.opendev.org/#/c/694430/11/nova/api/openstack/compute/instance_actions.py@187 it does not have much impact compared to before, right?01:40
gmannbrinzhang_: one more comment on policy name01:41
gmannbrinzhang_: you mean for request with <2.83 ?01:42
brinzhang_but that can reduce the policy checks,01:42
gmannyeah, with <2.83 we anyhow will not show the 'details' field so no need to check policy things also01:42
brinzhang_ok, I will change it to what you wrote01:44
brinzhang_change "The value will be ``null`` for old records." to "The value will be ``null`` for older versions.", is that ok?01:44
openstackgerritmelanie witt proposed openstack/nova master: DNM: try to get some debug info for bug 1844929  https://review.opendev.org/70147801:45
openstackbug 1844929 in OpenStack Compute (nova) "grenade jobs failing due to "Timed out waiting for response from cell" in scheduler" [High,Confirmed] https://launchpad.net/bugs/184492901:45
gmannbrinzhang_: that is where I am confused, by older version you mean the API version?01:46
gmannbecause for older microversion we are not showing the field itself.01:46
brinzhang_yes, before microversion 2.8301:46
gmannok, then you can remove that line as that field is present in the response only from 2.83 onward. if the request is with <2.83 then 'details' itself is not present01:47
brinzhang_ok01:48
brinzhang_remove this line01:48
gmannyour api-ref change reflects that this field is new in 2.8301:48
gmann+101:48
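The rule gmann describes above (the 'details' field exists only in responses to requests at microversion 2.83 or later, so older requests need no policy check for it) can be sketched in a few lines. This is a minimal, self-contained illustration and not Nova's code: `parse_microversion`, `format_event`, and the `policy_allows` callable are hypothetical names standing in for the real version and policy helpers, and leaving the key out when the policy denies access is an assumption of this sketch.

```python
def parse_microversion(version):
    """Turn a string such as '2.83' into a comparable (major, minor) tuple."""
    major, minor = version.split(".")
    return int(major), int(minor)


def format_event(event, requested_version, policy_allows):
    """Build the response body for one instance action event.

    `policy_allows` is a hypothetical zero-argument callable that
    returns True when the caller may see the 'details' field.
    """
    body = {"event": event["event"], "result": event["result"]}
    # Compare integer tuples: as plain strings '2.9' would sort above '2.83'.
    supports_details = parse_microversion(requested_version) >= (2, 83)
    # Short-circuit: the policy is consulted only when the field can appear.
    if supports_details and policy_allows():
        body["details"] = event.get("details")
    return body


event = {
    "event": "compute_run_instance",
    "result": "Error",
    "details": "No valid host",
}
old_body = format_event(event, "2.82", lambda: True)
new_body = format_event(event, "2.83", lambda: True)
```

With this shape a request below 2.83 never reaches the policy callable, which is the simplification agreed in the exchange above.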
*** spatel has joined #openstack-nova01:56
openstackgerritBrin Zhang proposed openstack/nova master: Expose instance action event details out of the API  https://review.opendev.org/69443002:05
openstackgerritBrin Zhang proposed openstack/nova master: Add instance actions v283 samples test  https://review.opendev.org/70625102:05
brinzhang_gmann: done, thanks02:05
brinzhang_because of the policy name changes, I have tested it locally, it works fine.02:05
openstackgerritBrin Zhang proposed openstack/nova master: Expose instance action event details out of the API  https://review.opendev.org/69443002:08
openstackgerritBrin Zhang proposed openstack/nova master: Add instance actions v283 samples test  https://review.opendev.org/70625102:08
openstackgerritmelanie witt proposed openstack/nova master: DNM: try to get some debug info for bug 1844929  https://review.opendev.org/70147802:11
openstackbug 1844929 in OpenStack Compute (nova) "grenade jobs failing due to "Timed out waiting for response from cell" in scheduler" [High,Confirmed] https://launchpad.net/bugs/184492902:11
*** macz_ has joined #openstack-nova02:17
gmannbrinzhang_: after reading the 2.62 changes, i think we do not need the new policy. we already have existing policy which can control the info from non-admin. commented on spec also02:20
gmannlet's wait for sean-k-mooney  reply. https://review.opendev.org/#/c/694430/13/nova/policies/instance_actions.py@5202:20
*** gyee has quit IRC02:21
*** macz_ has quit IRC02:21
gmanni think sean-k-mooney concern was we should not pass these info to non-admin which is controlled via existing policy also, like host info in 'events' - https://github.com/openstack/nova/blob/f454e1dec9580abf4605e071bdd678a40f492a49/nova/api/openstack/compute/instance_actions.py#L17002:22
*** ianw has quit IRC02:30
brinzhang_gmann: I cannot open github, you mean the https://review.opendev.org/#/c/694430/13/nova/api/openstack/compute/instance_actions.py@17202:35
brinzhang_gmann: the show_host = api_version_request.is_supported(req, '2.62')?02:35
gmannbrinzhang_: yeah, show_host with 2.6202:35
*** ianw has joined #openstack-nova02:36
*** ianw has quit IRC02:37
gmannbrinzhang_: let's wait for sean-k-mooney reply before changes in case something i missed02:37
brinzhang_gmann: the .BASE_POLICY_NAME % 'events' is a rule_admin policy02:37
openstackgerritMerged openstack/nova stable/train: libvirt: Provide the backing file format when creating qcow2 disks  https://review.opendev.org/71078802:37
brinzhang_gmann: ok, let sean-k-mooney check again02:38
*** ianw has joined #openstack-nova02:40
gmannyeah event policy is admin by default02:40
brinzhang_so show_host can be shown for an admin user, I think the details policy (SYSTEM_READER) is suitable for now.02:42
openstackgerritGhanshyam Mann proposed openstack/nova master: Add test coverage of existing flavor_manage policies  https://review.opendev.org/71481402:44
openstackgerritBrin Zhang proposed openstack/nova-specs master: [Trivial] Remove note for the implementation  https://review.opendev.org/71481702:47
openstackgerritLuyao Zhong proposed openstack/nova master: bug-fix: set do_cleanup always True for libvirt driver  https://review.opendev.org/71459303:00
openstackgerritLuyao Zhong proposed openstack/nova master: support live migration with vpmems  https://review.opendev.org/68785603:00
openstackgerritLuyao Zhong proposed openstack/nova master: Track orphan instances and error migrations in resource tracker  https://review.opendev.org/71465303:00
openstackgerritGhanshyam Mann proposed openstack/nova master: Introduce scope_types in os-flavor-manage  https://review.opendev.org/71481803:02
openstackgerritGhanshyam Mann proposed openstack/nova master: Add new default roles in os-flavor_manage policies  https://review.opendev.org/71481903:18
openstackgerritGhanshyam Mann proposed openstack/nova master: Pass the actual target in os-flavor-manage policy  https://review.opendev.org/71482203:25
*** psachin has joined #openstack-nova03:32
*** ociuhandu has joined #openstack-nova03:34
*** ociuhandu has quit IRC03:38
*** udesale has joined #openstack-nova04:51
openstackgerritmelanie witt proposed openstack/nova master: DNM: try to get some debug info for bug 1844929  https://review.opendev.org/70147804:51
openstackbug 1844929 in OpenStack Compute (nova) "grenade jobs failing due to "Timed out waiting for response from cell" in scheduler" [High,Confirmed] https://launchpad.net/bugs/184492904:51
*** ratailor has joined #openstack-nova05:00
*** vishalmanchanda has joined #openstack-nova05:03
*** spatel has quit IRC05:04
openstackgerritLuyao Zhong proposed openstack/nova stable/train: bug-fix: Reject live migration with vpmem  https://review.opendev.org/71406405:26
*** links has joined #openstack-nova05:29
*** evrardjp has quit IRC05:36
*** ociuhandu has joined #openstack-nova05:36
*** evrardjp has joined #openstack-nova05:36
*** TxGirlGeek has quit IRC05:40
*** TxGirlGeek has joined #openstack-nova05:40
*** ociuhandu has quit IRC05:41
*** TxGirlGeek has quit IRC05:45
*** dklyle has quit IRC05:50
*** macz_ has joined #openstack-nova05:53
*** macz_ has quit IRC05:58
*** xek_ has joined #openstack-nova06:07
openstackgerritQiu Fossen proposed openstack/nova master: The instance is volume backed and power state is PAUSED,shelve the instance failed  https://review.opendev.org/71160906:12
openstackgerritElod Illes proposed openstack/nova stable/pike: Mask the token used to allow access to consoles  https://review.opendev.org/70887606:40
openstackgerritElod Illes proposed openstack/nova stable/pike: Avoid circular reference during serialization  https://review.opendev.org/71414806:40
openstackgerritSundar Nadathur proposed openstack/nova master: Delete ARQs for an instance when the instance is deleted.  https://review.opendev.org/67373506:48
openstackgerritSundar Nadathur proposed openstack/nova master: Enable hard/soft reboot with accelerators.  https://review.opendev.org/69794006:48
openstackgerritSundar Nadathur proposed openstack/nova master: Enable start/stop of instances with accelerators.  https://review.opendev.org/69955306:48
openstackgerritSundar Nadathur proposed openstack/nova master: Enable and use COMPUTE_ACCELERATORS trait.  https://review.opendev.org/69955406:48
openstackgerritSundar Nadathur proposed openstack/nova master: Bump compute rpcapi version and reduce Cyborg calls.  https://review.opendev.org/70422706:48
openstackgerritSundar Nadathur proposed openstack/nova master: Block unsupported instance operations with accelerators.  https://review.opendev.org/67472606:48
openstackgerritSundar Nadathur proposed openstack/nova master: Add cyborg tempest job.  https://review.opendev.org/67099906:48
*** tetsuro has quit IRC07:13
*** xek_ has quit IRC07:14
*** xek_ has joined #openstack-nova07:14
*** vesper11 has quit IRC07:16
*** vesper has joined #openstack-nova07:16
*** belmoreira has joined #openstack-nova07:46
*** dpawlik has joined #openstack-nova07:47
*** nightmare_unreal has joined #openstack-nova07:48
*** maciejjozefczyk has joined #openstack-nova07:52
*** tesseract has joined #openstack-nova07:58
*** slaweq has joined #openstack-nova07:59
*** ralonsoh has joined #openstack-nova08:01
openstackgerritLuyao Zhong proposed openstack/nova master: Track orphan instances and error migrations in resource tracker  https://review.opendev.org/71465308:13
*** sapd1_x has joined #openstack-nova08:18
*** tkajinam has quit IRC08:19
*** stephenfin has quit IRC08:19
*** amoralej|off is now known as amoralej08:22
*** ileixe has joined #openstack-nova08:23
*** tosky has joined #openstack-nova08:24
*** tetsuro has joined #openstack-nova08:26
nightmare_unrealopenstackgerrit:08:26
*** dtantsur|afk is now known as dtantsur08:29
*** stephenfin has joined #openstack-nova08:30
*** rpittau|afk is now known as rpittau08:31
*** slaweq has quit IRC08:41
luyaobrinzhang_: https://review.opendev.org/#/c/678451/ was moved to https://review.opendev.org/#/c/714653, I add more testcases, thanks for your review08:44
lyarwoodelod: morning, https://review.opendev.org/#/c/713961/ if you have time, almost finished with these now.08:55
brinzhang_luyao: Thanks, got it, I will check later (I have some docs I need to complete).09:00
luyaobrinzhang_: thanks09:02
brinzhang_luyao:np ^^09:02
*** ociuhandu has joined #openstack-nova09:04
luyaolyarwood: I have a separate patch to address the 'do_cleanup' flag issue, could you look at it again? https://review.opendev.org/#/c/71459309:04
lyarwoodyup can try today09:05
luyaolyarwood: thanks :)09:05
*** tetsuro has quit IRC09:05
luyaoelod: thanks for review, comments addressed https://review.opendev.org/#/c/714064/09:07
*** ociuhandu has quit IRC09:08
*** slaweq has joined #openstack-nova09:18
brinzhang_sean-k-mooney: https://review.opendev.org/#/c/694430/ and https://review.opendev.org/#/c/699669/ need your check, gmann has a -1 with a question about the new policy for showing events:details; I think it's necessary and it fits my use case, I left my comment in the spec.09:18
* gibi got hit by a list of downstream issues so will mostly be off today09:19
openstackgerritjayaditya gupta proposed openstack/nova master: Support for nova-manage placement heal_allocations --cell  https://review.opendev.org/71445909:28
* nightmare_unreal checks if /me works here09:36
* nightmare_unreal it works09:36
openstackgerritHuaqiang Wang proposed openstack/nova master: Refactor the code in checking available host CPUs  https://review.opendev.org/71465709:36
openstackgerritHuaqiang Wang proposed openstack/nova master: Introduce 'MIXED' CPU allocation policy for instance  https://review.opendev.org/71335409:36
openstackgerritHuaqiang Wang proposed openstack/nova master: Introduce the interface of creating 'MIXED' policy instance through 'PCPU' and 'VCPU'  https://review.opendev.org/71335509:36
openstackgerritHuaqiang Wang proposed openstack/nova master: metadata: export the vCPU IDs that are pinning on the host CPUs  https://review.opendev.org/68893609:36
huaqiang stephenfin: nice! you quickly fixed so much for mixed-instance bp!09:37
huaqiangI'd like to follow your code and also contribute09:37
stephenfinhuaqiang: Feel free to take ownership of the whole lot of them, if they're helpful09:38
huaqiangso I also updated the code to address the comments already made09:38
huaqiangYou are the guarantor of the code09:39
huaqiangIf there is something you'd like me to do, I'd be glad to do it09:39
huaqiangmaybe starting from testing your code?09:39
huaqiangI see not all tests passed09:39
stephenfinOh, I hadn't checked that yet. Let me respin things to fix those09:40
stephenfinThen we can figure out if any of them are useful09:40
*** martinkennelly has joined #openstack-nova09:41
huaqiangI'll spend about two hours testing your patches that are not marked 'WIP', if you haven't tested them yourself. or you can tell me which patches need more testing09:43
stephenfinI think everything not marked in WIP is potentially useful09:44
stephenfinThe WIP patches duplicate your work so I'll probably abandon my ones. I wrote those WIP patches last week before you submitted the new revision09:45
huaqiangI'd like to know if you will continue these 'WIP' patches?09:45
huaqiangok09:45
huaqianga lot of them are similar09:45
stephenfinI won't. You've already done that work09:45
huaqianggot.09:45
huaqiangI'd like to take the responsibility09:46
*** zhanglong has quit IRC09:51
*** ivve has joined #openstack-nova09:53
*** Liang__ has quit IRC09:57
elodlyarwood: hi, +W'd10:00
lyarwoodelod: many thanks :)10:01
elodluyao: thanks, looks good to me, +210:01
*** factor has joined #openstack-nova10:04
elodlyarwood: thanks, too :) Now there are a bunch of patches in the gate queue in pike, hope we won't hit too many error_extending failures :S10:04
*** tesseract has quit IRC10:05
*** tesseract has joined #openstack-nova10:07
lyarwoodelod: I noticed the tempest job had failed a few times, was that the issue?10:12
lyarwoodelod: I haven't had time to look into it yet but wanted to later today10:12
lyarwoodelod: really want to flush the stable/pike queue once and for all :)10:12
elodlyarwood: yes, most of them are that failure :S yes, stable/pike will look nice if everything will be merged in the queue \o/10:14
*** ociuhandu has joined #openstack-nova10:14
*** rcernin has quit IRC10:17
lyarwoodelod: I'll take a look later today10:19
elodlyarwood: thanks! AFAIK that issue is not a new one, but strange that we hit now in that number... maybe because there were not so many check/gate runs towards pike in the last months10:26
*** maciejjozefczyk_ has joined #openstack-nova10:27
*** trident has quit IRC10:29
*** maciejjozefczyk has quit IRC10:29
*** trident has joined #openstack-nova10:31
*** trident has quit IRC10:33
*** trident has joined #openstack-nova10:37
lyarwoodelod: https://review.opendev.org/#/c/697523/ - I don't think we can workaround the issue on stable/pike within c-vol, thoughts on blacklisting the specific test in our compute jobs?10:44
openstackgerritJohn Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full  https://review.opendev.org/57503410:44
openstackgerritJohn Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full  https://review.opendev.org/57503410:44
*** lbragstad has joined #openstack-nova10:45
johnthetubaguygibi: stephenfin: I noticed we were all talking about this patch, it seems very much like one of the big ironic pain points, so I fixed up my worries: https://review.opendev.org/#/c/57503410:47
johnthetubaguybelmoreira: I am wondering if you have seen this patch, and if it would help your powersync issues at all: https://review.opendev.org/#/c/57503410:49
belmoreirajohnthetubaguy: no, let me have a look10:51
johnthetubaguyits from the vmware folks, which I guess see related issues10:52
*** ociuhandu has quit IRC10:52
openstackgerritLee Yarwood proposed openstack/nova stable/pike: tempest: Avoid bug #1796708 on slower stable/pike CI hosts  https://review.opendev.org/71491510:57
openstackbug 1796708 in Cinder "VolumesExtendTest.test_volume_extend_when_volume_has_snapshot intermittently fails with "Extend volume failed.: VolumeNotDeactivated: Volume volume-5514a6ad-abbb-46b3-a464-d73cc67e55af was not deactivated in time."" [Medium,Confirmed] https://launchpad.net/bugs/179670810:57
lyarwoodelod: ^ lets skip that test for now.10:57
*** ociuhandu has joined #openstack-nova11:01
elodlyarwood: usually i don't like disabling tests, but maybe now that is OK for that test on pike, especially for that bug is open since 2018 and has no fix. Let's see the regex works well in your patch :)11:03
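Skipping a single test by regex, as discussed above, usually comes down to a negative lookahead that selects every test ID except the unwanted one. The following is a minimal sketch of that pattern only; it is not the content of the patch under review, and the `example.*` module paths in the test IDs are made up for illustration (only the class and method name come from the bug title).

```python
import re

# Class and method name taken from the bug title quoted in the log.
SKIPPED = "VolumesExtendTest.test_volume_extend_when_volume_has_snapshot"

# Negative lookahead: match any test ID that does NOT contain SKIPPED.
SELECT = re.compile(r"^(?!.*" + re.escape(SKIPPED) + r").*$")

# Hypothetical test IDs; the 'example.' prefixes are placeholders.
test_ids = [
    "example.volume.VolumesExtendTest.test_volume_extend",
    "example.volume." + SKIPPED,
    "example.servers.ServersTest.test_create_server",
]

selected = [test_id for test_id in test_ids if SELECT.match(test_id)]
```

Note that the neighbouring `test_volume_extend` test is still selected, because the lookahead only rejects IDs containing the full skipped name.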
*** sapd1 has quit IRC11:04
*** sapd1_x has quit IRC11:06
openstackgerritJohn Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full  https://review.opendev.org/57503411:12
openstackgerritJohn Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full  https://review.opendev.org/57503411:13
*** tosky is now known as tosky_11:24
openstackgerritJohn Garbutt proposed openstack/nova master: WIP: Enforce resource limits using oslo.limit  https://review.opendev.org/61518011:33
*** tosky_ is now known as tosky11:40
openstackgerritMerged openstack/nova stable/pike: Avoid circular reference during serialization  https://review.opendev.org/71414811:40
openstackgerritMerged openstack/nova stable/pike: Mask the token used to allow access to consoles  https://review.opendev.org/70887611:40
openstackgerritMerged openstack/nova stable/pike: Remove exp legacy-tempest-dsvm-full-devstack-plugin-nfs  https://review.opendev.org/70206111:40
lyarwoodwell well well11:41
lyarwoodlooks like we got some faster CI nodes on that run11:41
*** rpittau is now known as rpittau|bbl11:42
*** eharney has quit IRC11:44
*** eharney has joined #openstack-nova11:49
*** spatel has joined #openstack-nova11:53
luyaolyarwood: Hi, thanks for your quick comments on 'do_cleanup' flag bug-fix https://review.opendev.org/#/c/71459311:55
*** spatel has quit IRC11:58
lyarwoodluyao: np, I agree this needs cleaning up, I just don't think setting do_cleanup to True is the correct way of doing it for now.12:00
*** amodi has joined #openstack-nova12:00
openstackgerritBalazs Gibizer proposed openstack/nova master: [Community goal] Update contributor documentation  https://review.opendev.org/71242012:01
luyaolyarwood: that's what I want to ask, I'm confused about that,  do you mean I need another cleanup method to do the cleanup? not invoking driver.cleanup directly or rollback_live_migration_at_destination12:02
lyarwoodluyao: yes, I think something like live_migration_cleanup_source and live_migration_cleanup_destination would be better instead of overloading cleanup itself12:04
lyarwoodelod: fun, we can't limit the regex used by the tempest-full jobs as they are using this tox env to run the commands - https://github.com/openstack/tempest/blob/51fe1ae61bed5d62c18864748520db25144f6db9/tox.ini#L103-L11612:05
luyaolyarwood:  thinking.....what's the difference between the new cleanup method and the existing one12:05
lyarwoodluyao: the existing one duplicates lots of cleanup already handled during the live migration flow12:07
lyarwoodluyao: we already unplug VIFs, disconnect volumes etc on success12:07
lyarwoodluyao: and depending on the failure we also do it there12:07
lyarwoodluyao: IMHO we should break up the cleanup method into smaller private methods that handle each aspect of this and use them only when required during the LM flows12:08
elodlyarwood: :-/ anyway, at least a couple of patches got merged. so maybe recheck is enough for now as we don't have that many pike patches...12:09
lyarwoodI'll post a change removing it so if it does end up blocking things we can still remove it12:10
openstackgerritLee Yarwood proposed openstack/nova stable/pike: zuul: Remove tempest-full from the gate due to bug #1796708  https://review.opendev.org/71491512:11
openstackbug 1796708 in Cinder "VolumesExtendTest.test_volume_extend_when_volume_has_snapshot intermittently fails with "Extend volume failed.: VolumeNotDeactivated: Volume volume-5514a6ad-abbb-46b3-a464-d73cc67e55af was not deactivated in time."" [Medium,Confirmed] https://launchpad.net/bugs/179670812:11
elodlyarwood: sounds like a plan :)12:11
openstackgerritMarcin Juszkiewicz proposed openstack/nova master: fix typo in wrong cpu_model message  https://review.opendev.org/71492812:13
luyaolyarwood: OK, got it, I'll look into the code and reply to you in more detail on that patch, thanks  :)12:16
*** hrw has joined #openstack-nova12:20
hrwmorning12:20
*** links has quit IRC12:27
*** links has joined #openstack-nova12:28
*** ratailor has quit IRC12:32
openstackgerritMerged openstack/nova stable/train: nova-live-migration: Ensure subnode is fenced during evacuation testing  https://review.opendev.org/71396112:38
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Use virDomainBlockCopy to swap volumes when using -blockdev  https://review.opendev.org/69683412:40
lyarwood^ gibi / kashyap / stephenfin ; rebased with a bug created and referenced for tracking if you have time to review.12:40
kashyaplyarwood: Will look.  Trying to investigate a different bug I was thrown at elsewhere12:41
*** udesale_ has joined #openstack-nova12:42
*** udesale has quit IRC12:45
hrwkevinz: thanks for https://review.opendev.org/#/c/709494 - finally booted VM in qemu TCG12:58
*** spatel has joined #openstack-nova12:59
kevinzhrw: np, good to hear that :-D12:59
hrwkevinz: I wonder how many aarch64 changes done in nova should be redone in libvirt ;d13:00
hrwbut then still would stay due to desync between projects13:00
*** rpittau|bbl is now known as rpittau13:00
*** nweinber has joined #openstack-nova13:01
kevinzhrw: hope not so many, we just tweak tweak and tweak13:02
hrwkevinz: -M virt as default feels like something for libvirt ;D13:04
*** ryneq has joined #openstack-nova13:04
hrwno, it is default there. it's qemu where it is not13:04
sean-k-mooneydo we still have the cpu feature flag check disabled for aarch6413:04
kevinzyes, some CPU-related methods in libvirt are not implemented on aarch6413:05
sean-k-mooneyso its not safe to use max as the default model if we want to support livemigration13:05
hrwsean-k-mooney: anything around cpu features/model/passthrough on aarch64 is like walking on minefield13:05
sean-k-mooneywell in an upgrade case at least13:05
kevinzsean-k-mooney: yes, I know that Kevin Zheng from Huawei is working on libvirt side to make that happen13:06
sean-k-mooneyhrw: well we have the info in /sys13:06
brinzhang_sean-k-mooney: did you look at https://review.opendev.org/#/c/694430/ and https://review.opendev.org/#/c/699669/3, gmann wants you to check those ^^13:06
sean-k-mooneylibvirt is just not reading it13:06
sean-k-mooneybrinzhang_: no but ill look now13:06
kevinzso hopefully live migration will work well on arm64 soon13:06
hrwsean-k-mooney: can you remind me /sys path?13:06
brinzhang_sean-k-mooney: yeah, thanks. I think we need the new policy13:07
sean-k-mooneyhrw: actually its in /proc/cpuinfo13:07
sean-k-mooneyhrw: the "flags" field is "Flags" on aarch6413:07
sean-k-mooneywhich prevents libvirt reading it13:07
sean-k-mooneythe model is also available13:08
hrw /proc/cpuinfo... file which should just die13:08
sean-k-mooney hrw it really should not13:08
sean-k-mooneyhrw: its the standard interface to report this info13:08
hrwit is far from standard13:08
hrweach arch has own way13:08
sean-k-mooneyyes but at least its a common location13:09
sean-k-mooneyotherwise you have to use more arcane cpuid checks and model specific registers13:09
hrwyep13:10
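The point made above, that the key holding CPU features in /proc/cpuinfo is not named the same on every architecture, is easy to handle when parsing by hand. The sketch below is only an illustration, not libvirt's or Nova's logic: it matches the key case-insensitively and also accepts `Features`, which to the best of my knowledge is the label arm64 kernels print. The two sample texts are invented and much shorter than real output.

```python
def cpu_feature_flags(cpuinfo_text):
    """Return the feature flags of the first CPU in /proc/cpuinfo-style text."""
    wanted_keys = {"flags", "features"}  # compared in lower case
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon, then compare the key ignoring case.
        if sep and key.strip().lower() in wanted_keys:
            return set(value.split())
    return set()


# Invented, shortened samples for illustration only.
X86_SAMPLE = "processor\t: 0\nmodel name\t: Example CPU\nflags\t\t: fpu vme sse2\n"
ARM_SAMPLE = "processor\t: 0\nFeatures\t: fp asimd crc32\nCPU implementer\t: 0x41\n"

x86_flags = cpu_feature_flags(X86_SAMPLE)
arm_flags = cpu_feature_flags(ARM_SAMPLE)
```

On a real host the text would come from reading `/proc/cpuinfo`; the function returns an empty set when no matching key is present.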
*** amoralej is now known as amoralej|lunch13:12
kashyapgibi: Yeah, the bitwise OR and logical OR of flags is always a bit confusing for me too; they look reasonable, see my comment: https://review.opendev.org/#/c/696834/12/nova/virt/libvirt/guest.py@77313:17
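The difference between the two kinds of OR mentioned above is small to type and large in effect, so a short sketch may help. The flag names and values here are hypothetical, chosen only to show the behaviour; they are not copied from the patch or from libvirt.

```python
from enum import IntFlag


class CopyFlag(IntFlag):
    """Hypothetical bit flags, each occupying its own bit."""
    SHALLOW = 1
    REUSE_EXT = 2


# Bitwise OR merges the bits, so both flags are set in the result.
combined = CopyFlag.SHALLOW | CopyFlag.REUSE_EXT

# Logical OR returns the first truthy operand, so the second flag is lost.
picked = CopyFlag.SHALLOW or CopyFlag.REUSE_EXT

both_set = CopyFlag.REUSE_EXT in combined
second_lost = CopyFlag.REUSE_EXT not in picked
```

When accumulating flags to pass to an API that expects a bitmask, `|` (or `|=`) is the operator that preserves every flag.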
*** ociuhandu has quit IRC13:19
*** ociuhandu has joined #openstack-nova13:20
*** mriedem has joined #openstack-nova13:20
*** ociuhandu has quit IRC13:25
*** sapd1_x has joined #openstack-nova13:38
gibikashyap, lyarwood: thanks. I'm +213:43
lyarwoodgibi: many thanks!13:44
dansmithbrinzhang_: I did, but I didn't understand what any of that had to do with why we need to use patch13:48
*** martinkennelly has quit IRC13:52
*** martinkennelly has joined #openstack-nova13:53
*** Liang__ has joined #openstack-nova13:55
*** Liang__ is now known as LiangFang13:56
*** liuyulong has joined #openstack-nova13:59
*** amoralej|lunch is now known as amoralej14:00
*** dswebb has joined #openstack-nova14:02
*** happyhemant has joined #openstack-nova14:06
*** maciejjozefczyk_ is now known as maciejjozefczyk14:19
*** prometheanfire has quit IRC14:19
*** prometheanfire has joined #openstack-nova14:23
openstackgerritMaciej Kucia proposed openstack/nova master: SR-IOV passthrough: Check PF only if VF is enabled  https://review.opendev.org/47664214:23
openstackgerritMerged openstack/nova master: ksa auth conf and client for Cyborg access  https://review.opendev.org/63124214:25
*** psachin has quit IRC14:27
openstackgerritLee Yarwood proposed openstack/python-novaclient master: Microversion 2.83 - Stable device boot from volume rescue  https://review.opendev.org/71495614:29
openstackgerritLee Yarwood proposed openstack/nova master: virt: Provide block_device_info during rescue  https://review.opendev.org/70081114:29
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Add support for stable device rescue  https://review.opendev.org/70081214:29
openstackgerritLee Yarwood proposed openstack/nova master: compute: Report COMPUTE_RESCUE_BFV and check during rescue  https://review.opendev.org/70142914:29
openstackgerritLee Yarwood proposed openstack/nova master: compute: Extract _get_bdm_image_metadata into nova.utils  https://review.opendev.org/70521214:29
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Support boot from volume stable device instance rescue  https://review.opendev.org/70143114:29
openstackgerritLee Yarwood proposed openstack/nova master: api: Introduce microverion 2.83 allowing boot from volume rescue  https://review.opendev.org/70143014:29
openstackgerritLee Yarwood proposed openstack/nova master: DNM - Test stable device rescue tests with BFV instances  https://review.opendev.org/71005014:29
huaqianghello. I see a '_from_dict' method in some NovaObject-based classes, but not in all of them,14:33
huaqiangshould I make it work for a new field?14:33
openstackgerritLuigi Toscano proposed openstack/nova stable/ocata: Remove exp legacy-tempest-dsvm-full-devstack-plugin-nfs  https://review.opendev.org/71495814:34
nightmare_unrealhey, how can one overwrite allocation for instance14:36
*** ociuhandu has joined #openstack-nova14:36
huaqiangnightmare_unreal: cool name :D14:38
*** tbachman has quit IRC14:38
nightmare_unrealhuaqiang: thanks :D , it's just a nick I registered when I was more into gaming haha14:39
mriedemnightmare_unreal: why do you want/need to?14:42
nightmare_unrealworking on this : https://bugs.launchpad.net/nova/+bug/186899714:42
openstackLaunchpad bug 1868997 in OpenStack Compute (nova) "option to overwrite allocations for instances" [Undecided,New] - Assigned to jayaditya gupta (jayssj11)14:42
mriedemis that referring to the heal_allocations CLI? https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement14:43
mriedemwe don't really need to track todos in the code with bug reports...so i'm not sure why someone opened that bug14:43
mriedemor is that someone you? :)14:43
nightmare_unrealyes that's me :)14:45
mriedemok https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L216 doesn't refer to a todo14:46
mriedemoh wrong line, 212614:46
mriedemhttps://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L212614:46
nightmare_unrealyup that one14:46
*** tbachman has joined #openstack-nova14:47
nightmare_unrealI also did the --cell one : https://review.opendev.org/#/c/714459/14:47
nightmare_unrealbut it needs review14:47
mriedemare you on belmiro's team at cern?14:47
nightmare_unrealyup14:48
nightmare_unrealnew joinee14:48
gibinightmare_unreal: I will get back to https://review.opendev.org/#/c/714459/ hopefully tomorrow14:49
*** jraju__ has joined #openstack-nova14:49
nightmare_unrealthanks gibi14:49
mriedemcool. welcome. i can leave some quick comments on ^14:49
nightmare_unrealsure14:49
*** links has quit IRC14:49
gibimriedem: thanks!14:50
gibimriedem: do you miss reviewing nova code ? :)14:51
*** macz_ has joined #openstack-nova14:52
*** dklyle has joined #openstack-nova14:52
openstackgerritLee Yarwood proposed openstack/nova master: WIP libvirt: Break up get_disk_mapping within blockinfo  https://review.opendev.org/71496214:54
*** sapd1_x has quit IRC14:55
*** ociuhandu has quit IRC14:56
mriedemgibi: heal_allocations started as my baby so i'm partial14:57
*** udesale_ has quit IRC14:57
*** brinzhang has joined #openstack-nova14:59
gibi:)15:00
mriedemnightmare_unreal: ok comments inline15:00
nightmare_unrealthanks :)15:00
mriedemgibi: it's also nice to review something outside of github too15:01
brinzhangsean-k-mooney: gmann: If we re-use the os-instance-actions:events policy and we want to expose noValidHost and other information to non-admins, that cannot be done by modifying the policy, can it?15:01
mriedemnightmare_unreal: as to your original question, this put_allocations method is the one that overwrites the allocations for an instance https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L194815:02
brinzhangwe dont want to expose the traceback to the non-admin user15:02
gibimriedem: I don't have too much experience with github but I imagine gerrit is a nicser interface15:02
gibinicer15:02
mriedemgibi: just...different15:02
* nightmare_unreal coming from github world15:02
mriedemunified diff in github reviews isn't terrible15:03
nightmare_unrealnew to gerrit though15:03
nightmare_unrealthanks mriedem , I will work on it and submit again.15:03
mriedemerr i should say split diff i guess to be like how i used gerrit15:03
mriedemnightmare_unreal: so the way heal_allocations works is we determine if an instance needs healing and the conditional for that is here https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L192215:05
gmannbrinzhang: os-instance-actions:events policy is admin by default15:05
mriedembypassing that is essentially a --force option or something like that15:05
sean-k-mooneyi prefer gerrit for review also. im not really a fan of the pull request workflow but its still better than doing things by email15:05
gmannso it would not expose those info to non-admin. until override to do so15:05
nightmare_unrealmriedem:  ah okay thanks for the guidance .15:06
mriedemnightmare_unreal: the thing that gets tricky is probably the network port allocations logic in there since you can tell the command to skip that, so we could have some problems if you skip healing port allocations but then forcefully overwrite the existing allocations to match the current flavor15:06
mriedemso that probably means if you add a --force option or something it probably needs to be mutually exclusive with --skip-port-allocations15:07
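A minimal sketch of the mutual exclusion mriedem describes, using plain argparse. The --force option is hypothetical (it is only proposed in this discussion), and nova-manage builds its parser differently, so this illustrates the idea rather than the real command wiring.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the heal_allocations CLI; the real
    # nova-manage command does not construct its parser this way.
    parser = argparse.ArgumentParser(prog="heal_allocations")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--force", action="store_true",
                       help="(hypothetical) overwrite existing allocations")
    group.add_argument("--skip-port-allocations", action="store_true",
                       help="skip healing port allocations")
    return parser


def parse(argv):
    """Return the parsed args, or None if the options conflict."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse reports the conflict on stderr and exits.
        return None
```

Passing both options makes argparse reject the command line, so the forced overwrite can never run with port healing skipped.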
brinzhangWe need a policy that lets operators expose the details to non-admins (by modifying the default policy). If we just rely on the admin-only default, the traceback already covers everything and we would not need to populate details at all15:07
mriedemthis is why functional tests are better for changes to this command because there are a lot of moving parts15:07
brinzhanggmann15:08
nightmare_unrealokay15:08
* mriedem goes back to his real job15:08
*** ociuhandu has joined #openstack-nova15:09
brinzhanggmann:https://review.opendev.org/#/c/699669/3/specs/ussuri/approved/action-event-fault-details.rst@5415:09
brinzhanggmann: My use case is for non-admins. In the end the policy needs to default to the system_reader role, but we should allow operators to change the default policy to show the details to non-admin users15:11
brinzhanggmann: That's why I insist on using a new policy for showing details15:12
brinzhanggmann: as gibi commented in https://review.opendev.org/#/c/699669/2/specs/ussuri/approved/action-event-fault-details.rst@12815:15
luyaolyarwood: Hi, I replied on https://review.opendev.org/#/c/714593/ in detail. As you commented, a new cleanup method might be better, but it's a really huge change, so I think we need a bp. I'm willing to do this, but later: my target is vpmem live migration in this release. :)15:16
luyaolyarwood: thanks again for your comments. :)15:16
brinzhangdansmith: I replied to your comment in https://review.opendev.org/#/c/693828/, we discussed the destroy-instance-with-datavolume approach using the PATCH API in more detail in the spec15:18
*** eharney has quit IRC15:18
brinzhangdansmith: The spec https://review.opendev.org/#/c/580336/15:18
lyarwoodluyao: if you just need to clean vpmem things up then you can add them to the existing cleanup methods outside of cleanup for live migration15:19
*** prometheanfire has quit IRC15:19
lyarwoodluyao: post_live_migration or rollback_live_migration_at_destination etc15:19
*** macz_ has quit IRC15:20
*** macz_ has joined #openstack-nova15:21
*** eharney has joined #openstack-nova15:21
luyaolyarwood: I'd like to set do_cleanup True if there are vpmems, do you think it's OK?15:22
gmannbrinzhang: gibi so this is the bit we are missing. the os-instance-actions:events policy is admin only, and if the operator wants to show it to non-admins then the traceback, host name etc. can be seen by non-admins, which is all good because the operator wants that15:22
*** prometheanfire has joined #openstack-nova15:23
gmannand now we are showing the new field 'details' in the 'events' dict, so we will add 'details' only if the os-instance-actions:events policy passes. so the operator has to open os-instance-actions:events to non-admins first, and only then can non-admins see the new field 'details'15:24
gmanneven if we have a new policy for the 'details' field, the operator has to enable the 'events' policy for non-admins.15:24
gmanni mean the operator cannot 1. keep os-instance-actions:events admin only and 2. open the new policy os-instance-actions:events:details to non-admins15:25
lyarwoodluyao: I'd rather not change the semantics for vpmems at all and just add cleanup for them directly in the required places as we do with vifs and volumes15:25
lyarwoodluyao: I'll add a comment once I've finished something15:25
brinzhanggmann: the traceback records sensitive information, so why expose it to non-admins?15:26
gmannbrinzhang: we are embedding new field 'details' in 'events' dict which is already policy-configurable for non-admin.15:26
gmannbrinzhang: yeah, i am saying if the events policy is admin only, how does the operator make 'details' visible to non-admins15:26
gmannit is inside 'events' dict not outside15:27
gmannhttps://review.opendev.org/#/c/694430/13/nova/api/openstack/compute/instance_actions.py@18315:27
brinzhanggmann: no, if the microversion 2.51, we can see the events dict https://opendev.org/openstack/nova/src/branch/master/nova/api/openstack/compute/instance_actions.py#L17115:28
brinzhangs/ if the microversion 2.51/ if the microversion >= 2.5115:29
*** mlavalle has joined #openstack-nova15:30
*** arxcruz|rover is now known as arxcruz15:30
luyaolyarwood: Thanks. I understand you mean we can add a separate method to clean up vpmems. Actually I had one such solution, but that means we need to add an rpc api to clean up the destination vpmems, like rollback_live_migration_at_destination. alex_xu commented that it's very vpmem and libvirt specific, and he hoped we could utilize the current cleanup method15:30
*** LiangFang has quit IRC15:32
gmannbrinzhang: you are right on that. i missed that microversion change.15:32
brinzhanggmann: so we don't have to pass os-instance-actions:events to show the 'events' dict.15:32
gmannbrinzhang: and hostId is also shown always to non-admin15:32
brinzhanggmann: yes, hostId always shown to non-admin15:32
gmannwhen and how will the operator decide that he/she does not want to show the traceback to non-admins but does want to show them 'details', which is nothing but the error message for nova exceptions and the exception name for others?15:34
gmanni mean we want to guard the 'details' field as admin by default but tell the operator to enable it for non-admins if he/she wants to, while keeping the traceback admin only15:35
brinzhangYes, right15:35
gmannactually that use case I am not getting. if the new policy were non-admin by default then it would make sense15:35
gmannbut we cannot make it non-admin by default because it may be an info leak15:36
brinzhanggmann: thanks ^^15:36
brinzhangyes, that's why we set system_reader by default15:36
lyarwoodluyao: I'm confused, we already have these within the libvirt driver? You wouldn't need to add any RPC calls.15:36
gmannbrinzhang:  i mean i cannot get the use case, that is why i am finding it difficult to understand the use of the new policy15:37
gmannor the question is: how can the operator decide that 'details', which has admin-related info, is exposed to non-admins but the traceback is not15:38
brinzhangyeah, if the spec did not have the policy limit, I think you would hit that case first15:38
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Support boot from volume stable device instance rescue  https://review.opendev.org/70143115:39
openstackgerritLee Yarwood proposed openstack/nova master: api: Introduce microverion 2.83 allowing boot from volume rescue  https://review.opendev.org/70143015:39
*** gyee has joined #openstack-nova15:40
brinzhangif the operator wants to expose the details to non-admins, they just need to change the default policy, right?15:40
brinzhangbecause BASE_POLICY_NAME % 'events:details' only controls whether details are shown in the 'events' dict15:41
luyaolyarwood: we already have vpmem cleanup logic inside libvirt driver, driver.cleanup will invoke vpmem cleanup.15:41
*** sapd1 has joined #openstack-nova15:41
lyarwoodluyao: so why can't we just call that specific logic from other places instead of calling the entire cleanup method?15:42
luyaolyarwood:  if I want to clean up vpmems on the destination host but do_cleanup is False, I need an rpc call for vpmem cleanup15:43
gmannbrinzhang: yeah i get that but my point is it is difficult for the operator to decide that 'details' (which can contain non-nova exceptions and so can leak infra info) can be shown to non-admins while the traceback cannot.15:43
luyaolyarwood: alternatively we can set do_cleanup to True, then the rpc call rollback_live_migration_at_destination will be invoked, and vpmem cleanup will be called inside that15:44
brinzhanggmann: if the details come from a non-nova exception, only the exception class name is shown in details15:45
gmannbrinzhang: if any operator asks how to use these two policies differently (one allowed for admin and one for non-admin) then we would not have a clear answer, right?15:45
brinzhanggmann: pls see https://review.opendev.org/#/c/712697/15:45
*** jraju__ has quit IRC15:47
*** TxGirlGeek has joined #openstack-nova15:47
luyaolyarwood: so I asked whether I could set do_cleanup to True if there are vpmems15:48
brinzhanggmann: 'traceback' shows the full exception info, including python paths and all the details. 'details' only shows the formatted message if it's a nova exception; for a non-nova exception, we just show minimal info to the non-admin15:48
brinzhangI don't think it is unclear15:48
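A rough sketch of the rule brinzhang describes for the 'details' field: nova exceptions contribute their formatted message, anything else contributes only its class name. The NovaException class and the event_details helper below are hypothetical stand-ins written for illustration, not the code under review.

```python
class NovaException(Exception):
    """Hypothetical stand-in for nova's base exception class."""

    def format_message(self):
        return str(self)


def event_details(exc):
    # Nova exceptions carry a message intended for users, so show it.
    if isinstance(exc, NovaException):
        return exc.format_message()
    # Other exceptions may embed paths or infrastructure details in
    # their message, so expose only the class name.
    return exc.__class__.__name__
```

As gmann notes in the discussion, even a bare class name can hint at which driver or library is in use, which is the remaining leak being debated.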
luyaolyarwood: Do I make it clear?15:49
lyarwoodluyao: yeah okay, that might be okay in the short term but I think after this we really need to clean this interface up15:49
gmannbrinzhang: that is what i was thinking of adding on the API side, but serialize_args already does it.15:49
lyarwoodluyao: cleanup within libvirt is actually looking at migrate_data so why we are making the call to cleanup dependent on it is weird15:49
gmannto handle the non nova exception details15:49
luyaolyarwood: yeah agree15:49
brinzhanggmann: you mean there is something I need to add in the os-instance-actions API?15:52
gmannbrinzhang: no, i mean we hide the details of non-nova exceptions, but the exception name itself can leak some info, like the driver used etc.15:54
gmanncan non-admins take action based on a non-nova exception ?15:55
brinzhangmaybe they can try to do something that is within their power, nothing else15:56
gmanni am thinking we could hide non-nova exceptions from the 'details' field and only expose nova exceptions, which is what the 'details' use case is for non-admins15:56
luyaolyarwood: previously we only had the instance path files to clean up, but now we have other devices that need cleanup15:56
gmanndansmith: ^^ ? any use case of keeping non-nova exception name in action event 'details' field.15:56
gmannadmin anyways can see all details from traceback15:57
brinzhanggmann: thanks, I am sorry it's too late for me, I have to go.15:58
gmannso that we can keep new field 'details' usable and no info leak for non-admin15:58
gmannbrinzhang: ah sorry. yeah. I will reply on review. thanks for discussion and late night.15:58
luyaolyarwood: we could also add a flag in the libvirt migrate data to tell whether there are vpmems that need cleanup, but I'm not sure it's necessary15:59
brinzhanggmann:We only show non-nova exception class name to users, I don't think it will cause serious information leakage.15:59
brinzhanggmann: this serialize_args change comes from mriedem and dansmith, if they are around, I think you can get more from them.16:01
brinzhanggmann: thanks too, bye16:01
sean-k-mooneyluyao: we had to do host cleanup before for things other than the instance files16:01
sean-k-mooneyluyao: like removing mounted volumes, cleaning up ports or other actions16:02
brinzhanggmann: this is the original thought https://review.opendev.org/#/c/694428/9/nova/objects/instance_action.py@19616:03
openstackgerritLee Yarwood proposed openstack/python-novaclient master: Microversion 2.83 - Stable device boot from volume rescue  https://review.opendev.org/71495616:03
lyarwoodluyao: possibily, just need to jump on a call and I'll try to update the review again16:04
sean-k-mooneybrinzhang: for non admin i think they should only see the class name for nova exceptions too16:04
brinzhangsean-k-mooney: yeah, agree, make sense to me too.16:04
sean-k-mooneynon admins usually don't have the access required to fix the cause of most nova exceptions16:05
luyaosean-k-mooney: sorry, you mean post live migration?16:07
sean-k-mooneyluyao: yes we clean up those resources earlier in the function16:08
sean-k-mooneyluyao: so we unplug the guest interface on the ovs bridge for example16:08
sean-k-mooneyand we have to unmount any cinder volumes that were mounted on the source node16:09
luyaosean-k-mooney: yeah, invoking driver.cleanup will not cleanup them again16:09
openstackgerritLee Yarwood proposed openstack/nova master: virt: Provide block_device_info during rescue  https://review.opendev.org/70081116:09
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Add support for stable device rescue  https://review.opendev.org/70081216:09
openstackgerritLee Yarwood proposed openstack/nova master: compute: Report COMPUTE_RESCUE_BFV and check during rescue  https://review.opendev.org/70142916:09
openstackgerritLee Yarwood proposed openstack/nova master: compute: Extract _get_bdm_image_metadata into nova.utils  https://review.opendev.org/70521216:09
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Support boot from volume stable device instance rescue  https://review.opendev.org/70143116:09
openstackgerritLee Yarwood proposed openstack/nova master: api: Introduce microverion 2.83 allowing boot from volume rescue  https://review.opendev.org/70143016:09
openstackgerritLee Yarwood proposed openstack/nova master: DNM - Test stable device rescue tests with BFV instances  https://review.opendev.org/71005016:09
sean-k-mooneyluyao: ya i know16:10
sean-k-mooneywell with the flags you have set16:10
gmannsean-k-mooney: brinzhang and that is what the use case actually is: expose something a non-admin could fix. maybe filtering or whitelisting the non-admin-fixable exceptions would be better here ?16:10
gmannor at least not expose the non-nova exception at all.16:10
sean-k-mooneygmann: well i would guess any 4xx errors should be actionable by them in some way16:11
sean-k-mooneyif we are identifying them as client issues16:11
luyaosean-k-mooney: yeah, and now I need driver.cleanup to cleanup vpmems16:11
gmannsean-k-mooney: yeah most of them yes. few 404 might not be but I have not checked all exceptions but overall 4xx is in their range16:12
*** damien_r has quit IRC16:16
luyaosean-k-mooney, lyarwood: I'll be offline and can't respond promptly, so please leave comments on patch https://review.opendev.org/#/c/687856 if you have any suggestions about vpmem cleanup during live migration. Many thanks. :)16:17
*** sapd1 has quit IRC16:24
sean-k-mooneyluyao: sure16:24
sean-k-mooneyluyao: o/16:25
*** damien_r has joined #openstack-nova16:27
*** damien_r has quit IRC16:32
openstackgerritGhanshyam Mann proposed openstack/nova master: Add test coverage of existing flavor_manage policies  https://review.opendev.org/71481416:32
*** liuyulong has quit IRC16:42
*** ociuhandu has quit IRC16:47
*** ociuhandu has joined #openstack-nova16:48
*** ociuhandu has quit IRC16:53
openstackgerritBalazs Gibizer proposed openstack/nova master: Reproduce bug 1869050  https://review.opendev.org/71499716:53
openstackbug 1869050 in OpenStack Compute (nova) "migration of anti-affinity server fails due to stale scheduler instance info" [Low,Triaged] https://launchpad.net/bugs/1869050 - Assigned to Balazs Gibizer (balazs-gibizer)16:53
openstackgerritBalazs Gibizer proposed openstack/nova master: Update scheduler instance info at confirm resize  https://review.opendev.org/71499816:53
hrwhttps://review.opendev.org/#/c/709494 - can someone take a look so aarch64 will be a bit better in nova?16:56
openstackgerritJohn Garbutt proposed openstack/nova master: Update quota sets APIs  https://review.opendev.org/71274916:58
openstackgerritJohn Garbutt proposed openstack/nova master: Tell oslo.limit how to count nova resources  https://review.opendev.org/71330116:58
*** damien_r has joined #openstack-nova16:59
*** belmoreira has quit IRC17:01
*** dtantsur is now known as dtantsur|afk17:06
melwittkashyap: would you mind revisiting the aarch64 patch, it's been updated ^17:06
openstackgerritLee Yarwood proposed openstack/nova master: WIP libvirt: Break up get_disk_mapping within blockinfo  https://review.opendev.org/71496217:09
sean-k-mooneyalex_xu: dansmith gibi so just did a evacuate test with the cyborg fake driver. http://paste.openstack.org/show/791153/17:12
openstackgerritStephen Finucane proposed openstack/nova master: hardware: Update and correct typing information  https://review.opendev.org/71469417:12
openstackgerritStephen Finucane proposed openstack/nova master: libvirt: Add typing information  https://review.opendev.org/71469517:12
openstackgerritStephen Finucane proposed openstack/nova master: tests: Split instance NUMA object tests  https://review.opendev.org/71469617:12
openstackgerritStephen Finucane proposed openstack/nova master: objects: Replace 'cpu_pinning_requested' helper  https://review.opendev.org/71469717:12
openstackgerritStephen Finucane proposed openstack/nova master: hardware: Don't consider overhead CPUs for unpinned instances  https://review.opendev.org/71469817:12
openstackgerritStephen Finucane proposed openstack/nova master: hardware: Remove handling of pre-Train compute nodes  https://review.opendev.org/71469917:12
openstackgerritStephen Finucane proposed openstack/nova master: hardware: Add validation for 'cpu_realtime_mask'  https://review.opendev.org/46820317:12
openstackgerritStephen Finucane proposed openstack/nova master: hardware: Tweak the 'cpu_realtime_mask' handling slightly  https://review.opendev.org/46145617:12
openstackgerritStephen Finucane proposed openstack/nova master: hardware: Rework 'get_realtime_constraint'  https://review.opendev.org/71470017:12
openstackgerritStephen Finucane proposed openstack/nova master: hardware: Invert order of NUMA topology generation  https://review.opendev.org/71470117:12
sean-k-mooneyalex_xu: dansmith gibi we can evacuate but it does not create allocation for the fpga17:12
*** rpittau is now known as rpittau|afk17:13
*** derekh has joined #openstack-nova17:14
sean-k-mooneythe arqs are also not updated http://paste.openstack.org/show/791154/17:15
sean-k-mooneyill update the block operation patch review with that info but currently we cannot evacuate properly.17:15
lyarwoodstephenfin: https://review.opendev.org/#/c/696834/ - not sure if you're still here but this should be ready now.17:16
* stephenfin clicks17:17
stephenfinlyarwood: done17:18
lyarwoodstephenfin: thanks17:19
openstackgerritJohn Garbutt proposed openstack/nova master: WIP: Enforce resource limits using oslo.limit  https://review.opendev.org/61518017:20
openstackgerritJohn Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full  https://review.opendev.org/57503417:21
openstackgerritmelanie witt proposed openstack/nova stable/train: Add config option for neutron client retries  https://review.opendev.org/71501017:27
openstackgerritJohn Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full  https://review.opendev.org/57503417:30
melwittlyarwood: I dunno if you saw my comment on this one https://review.opendev.org/708030 IIUC this is an option you're thinking to keep indefinitely, if so, it shouldn't go under [workarounds] as they're things intended to be temporary and removed17:31
*** evrardjp has quit IRC17:36
*** evrardjp has joined #openstack-nova17:36
*** ociuhandu has joined #openstack-nova17:39
kashyapmelwitt: Hiya; will look at the AArch64 thing tom. in the AM.   (Aside: just to keep you posted, I'm off from tomm. evening until 31)17:44
kashyap(s/31/31st-Mar/)17:44
*** ociuhandu has quit IRC17:45
kashyapActually, looking now17:45
melwittcool thanks!17:46
kashyapmelwitt: Okay, they went with the upstream QEMU AArch64 recomm. of model 'max'.  Cool17:47
openstackgerritJohn Garbutt proposed openstack/nova master: WIP: Enforce resource limits using oslo.limit  https://review.opendev.org/61518017:52
*** tesseract has quit IRC18:03
kashyapstephenfin: melwitt: The release note contains a lot of not useful info, which will only confuse: https://review.opendev.org/#/c/709494/2018:05
kashyapstephenfin: melwitt: I suggested whittling it down to a couple of sentences.  Hope that looks okay18:05
stephenfinkashyap: yeah, I was iffy on that too but figured it was good enough. Now that there's two of us...18:06
kashyapMaybe whoever is merging it can amend it?  If it's not urgent, perhaps Kevin could respin18:06
kashyapstephenfin: Hehe, much of it is verbatim from a review comment I made; looks odd to have "stream of consciousness" as a release note ;-)18:06
kashyapErr, I myself made a grammar error; /me goes to fix18:07
kashyapAlright; /me goes to make some dinner18:09
*** nightmare_unreal has quit IRC18:36
*** irclogbot_2 has quit IRC18:37
*** lbragstad has quit IRC18:57
openstackgerritGhanshyam Mann proposed openstack/nova master: Add test coverage of existing hypervisors policies  https://review.opendev.org/71502918:57
*** amoralej is now known as amoralej|off18:59
*** maciejjozefczyk has quit IRC19:01
*** ociuhandu has joined #openstack-nova19:01
*** irclogbot_1 has joined #openstack-nova19:02
openstackgerritGhanshyam Mann proposed openstack/nova master: Introduce scope_types in os-hypervisors  https://review.opendev.org/71503619:10
*** ociuhandu has quit IRC19:13
lyarwoodmelwitt: yeah sorry was working my way down to these changes this week19:14
melwittdansmith: are you aware that in a vanilla devstack with one cell, we are getting [workarounds]disable_group_policy_check_upcall = True ? this is new to me19:14
lyarwoodmelwitt: I'll respin and/or update in the morning.19:14
mriedemit's intentional because of superconductor mode19:14
mriedemmelwitt: ^19:15
dansmithyeah, what mriedem said19:15
mriedemotherwise affinity tests will blow up19:15
dansmithso people either have to disable affinity or enable that workaround19:15
mriedemthere are a few things disabled by default like that in devstack19:15
melwittlyarwood: ok, np at all. just wanted to make sure in case you didn't see19:15
dansmithbecause we still don't have affinity in placement19:15
mriedemhttps://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls19:16
mriedemanything not marked complete in that list probably has some kind of flag to disable it in devstack19:16
mriedemand we don't test cross_az_attach=false in the gate anywhere so...that just flies under the radar19:17
melwittmriedem, dansmith: thanks. yeah, I see that the logic is based on whether we have a superconductor going or not. just trying to work out whether we want to, or if there's a way to, set it False if we know everything's on the same MQ. context: we got a regression reported in https://bugs.launchpad.net/nova/+bug/1863190 that's looking not like a regression since I saw [workarounds]disable_group_policy_check_upcall = True in the config19:17
openstackLaunchpad bug 1863190 in OpenStack Compute (nova) "Server group anti-affinity no longer works" [Undecided,New]19:17
*** macz_ has quit IRC19:17
mriedemi think there is at least one known latent multi-cell bug with how (anti-)affinity works, but i'm fuzzy on the details19:18
melwitttwo parallel anti-affinity requests seeming to violate policy in the single MQ deployment19:18
mriedemif you don't have multiple cells then i guess that doesn't apply19:18
mriedemif you have a single cell and support anti-affinity then you need the late check enabled in the compute19:19
melwittyeah if you are multi MQ then you can't get anti-affinity if the requests hit at the same time19:19
melwittright19:19
mriedem[workarounds]disable_group_policy_check_upcall = True means you're opting into the wildness19:19
melwittI was just thinking you'd think our default devstack with one MQ should set it False19:19
mriedemit's false by default19:19
*** larainema has quit IRC19:20
melwittit is, but something in the default devstack logic is setting it True19:20
melwittthat is, I cloned the devstack repo and brought up a vanilla devstack and I'm getting it set to True19:20
mriedemyeah, because superconductor is the default mode19:20
mriedemsuperconductor is the ideal mode for deploying nova, so we test in the gate with that by default19:20
melwittyeah. and I'm thinking this sort of "bug" will keep being reported occasionally bc people not realizing what devstack is doing19:21
mriedemit's a known limitation, link them to the docs19:21
mriedemtempest has (anti)affinity tests as well but i don't think they make parallel requests19:22
mriedembecause of this19:22
melwittyeah. they have to be very parallel too. the first time I tried to repro I didn't get it but after a few tries I got it19:22
mriedemright, i remember poking around in those tempest tests awhile back related to all of this19:24
*** ralonsoh has quit IRC19:26
mriedemoh also i think tempest tests the affinity stuff with 2 servers in the same create request - same (anti)affinity group, and that works b/c the scheduler knows about the decisions within the same request19:26
melwittah, yeah19:26
mriedemhttps://github.com/openstack/tempest/blob/7a588ded216f74ddd0015c3065d4fae10de2161f/tempest/api/compute/admin/test_servers_on_multinodes.py19:27
mriedemand https://github.com/openstack/tempest/blob/f419f4d36fd0f99a9c53fe3a984d172b02e828c5/tempest/api/compute/servers/test_server_group.py19:27
mriedembut of course you get nfv mano systems in the wild that are robots just firing off rapid requests19:28
mriedembut those are likely single cell and shouldn't be disabling that late affinity check upcall :)19:28
mriedemstarlingx had patches for a lot of this server group stuff19:29
melwittyeah, I see19:29
mriedemto try and mitigate some of it, but it wasn't all perfect either, e.g. locks within conductor but that would only lock *that* conductor (or scheduler) worker, not across all - unless you use an external locking mechanism, like a db or etcd or something19:29
melwittyeah, I know of one where they run with a single scheduler and serialize affinity requests19:30
melwittright19:30
mriedemtrying to make scheduling requests serialized for group based scheduling19:30
mriedemfor starlingx with like 1 node and 1 worker then it's probably fine19:30
* melwitt nods19:30
melwittok. I'll finish repro'ing the situation with the workaround set to False to make sure the late affinity check triggers, and write up something for the bug. and will link the doc19:32
*** irclogbot_1 has quit IRC19:37
*** irclogbot_2 has joined #openstack-nova19:40
*** irclogbot_2 has quit IRC19:42
mriedemmelwitt: if you're so inclined and it's something that keeps coming up it might be worth writing something up in the troubleshooting docs, e.g. why are my servers that are in x policy landing on the same/different hosts when they shouldn't?19:44
mriedemand explain the parallel issue19:44
mriedemi found it nice to write something up once and then just point people to that19:45
mriedemhttps://docs.openstack.org/nova/latest/admin/support-compute.html19:45
*** irclogbot_2 has joined #openstack-nova19:45
*** macz_ has joined #openstack-nova19:50
melwittmriedem: yup I think that's a good idea. I'll do that19:51
melwittthanks for suggesting19:51
*** martinkennelly has quit IRC19:54
mriedemmy first few weeks on the new job were me in slack being like "why x? why y? where is z documented?" and then taking the answers and trying to document them to feel like i was useful19:55
melwittthat's a good investment. every time I don't do that, I regret it later19:58
melwittand I usually forget because I have the memory recall of a hamster19:59
*** irclogbot_2 has quit IRC20:00
*** ociuhandu has joined #openstack-nova20:01
*** irclogbot_0 has joined #openstack-nova20:04
*** irclogbot_0 has quit IRC20:12
*** happyhemant has quit IRC20:15
*** irclogbot_2 has joined #openstack-nova20:16
*** irclogbot_2 has quit IRC20:16
*** irclogbot_0 has joined #openstack-nova20:22
*** ociuhandu has quit IRC20:27
*** ociuhandu has joined #openstack-nova20:28
*** ociuhandu has quit IRC20:33
melwittjohnsom: hey, finally got a chance to dig into the bug report you opened awhile back about anti-affinity, tl;dr is I don't find that there's been a regression. pls see my latest comment explaining https://bugs.launchpad.net/nova/+bug/186319020:37
openstackLaunchpad bug 1863190 in OpenStack Compute (nova) "Server group anti-affinity no longer works" [Undecided,New]20:37
johnsommelwitt Ok, thank you for having a look. I got some feedback that it changed around queens, but I didn't go back and confirm either way.20:38
*** tosky has quit IRC20:40
johnsommelwitt My money is on that setting being the variable.20:41
melwittjohnsom: it looks like it was likely a timing difference bc the change that disabled the late affinity upcall was back in pike https://review.opendev.org/47755620:42
johnsomLol, that is "around" in  OpenStack time.20:43
melwittaround for certain values of around20:43
johnsomYep20:44
*** seba has quit IRC20:46
*** seba has joined #openstack-nova20:46
johnsomrm_work FYI: https://bugs.launchpad.net/nova/+bug/1863190 comment 720:46
openstackLaunchpad bug 1863190 in OpenStack Compute (nova) "Server group anti-affinity no longer works" [Undecided,New]20:46
*** ericyoung has quit IRC20:47
rm_workhmm k20:47
rm_workwe switched to hard-anti-affinity and made sure we have retries enabled20:48
*** nweinber has quit IRC20:48
melwittrm_work: you have to have your cell conductors and computes configured a certain way to be able to handle racing affinity requests. if you have one shared MQ the configs can be set to support it. if you have multiple MQs there's no enforcement of affinity for racing requests until affinity support is added to placement20:49
*** ericyoung has joined #openstack-nova20:49
rm_workk20:50
rm_workour symptom was that a late check will catch it, but20:50
rm_workfor soft-anti-affinity, it doesn't actually BLOCK it20:50
rm_workwhich means soft-anti-affinity is pretty useless20:50
rm_workhard-anti-affinity it catches and sends for rescheduling, but our problem was we had reschedules disabled20:50
rm_workwe have since fixed that20:50
melwittyeah. I'm not sure what soft-anti-affinity is for ... give it a try and if not, meh I guess20:51
openstackgerritMerged openstack/nova master: libvirt: Use virDomainBlockCopy to swap volumes when using -blockdev  https://review.opendev.org/69683420:51
rm_workit's supposed to be "best attempt"20:51
melwittyeah20:51
rm_workso if you are down to only one schedulable HV, it'll still *work* because that's better than nothing20:51
rm_workbut in our case, using pack scheduling, it basically never does anything unless it catches up front20:51
rm_workthe late-catch will do nothing as it's still "valid"20:52
rm_workwhich makes it not so useful20:52
melwittright20:52
melwittso what did you do? enable some retries?20:52
rm_workswitched to hard-aa and set scheduling retries to 3 (which is the original default, i think -- we had set it specifically to 0)20:53
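A toy model of the behaviour rm_work describes: with hard anti-affinity, a host that fails the late check on the compute is rejected and the request moves on to the next candidate, but only while attempts remain. The function and parameter names are hypothetical and this simulates the flow only; it is not nova's scheduler or compute code.

```python
def schedule_with_late_check(candidates, hosts_in_group, max_attempts):
    """Try candidate hosts in order; return the chosen host or None.

    hosts_in_group holds hosts that already run a member of the
    anti-affinity group by the time the compute performs its check,
    which is how a racing request shows up.
    """
    for host in candidates[:max_attempts]:
        if host in hosts_in_group:
            # Late check failed on this compute: reschedule.
            continue
        return host
    # Out of attempts (or candidates): the build fails.
    return None
```

With a single attempt, the request fails as soon as the first host loses the race, which matches the symptom rm_work saw while reschedules were disabled.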
*** tosky has joined #openstack-nova20:54
melwittrm_work: yeah, ok. we do have a gap regarding the default pack scheduling + server group requests as mriedem mentioned earlier, and a way we could deal with that is to do something similar to starlingx where we serialize server group request claims at the scheduler, but we'd need to use a distributed lock since we have multiple scheduler workers. not something we already have in nova so would take more effort to add. would be a spec20:58
melwitt and all20:58
*** mlavalle has quit IRC21:09
*** mlavalle has joined #openstack-nova21:11
*** xek_ has quit IRC21:17
openstackgerritMerged openstack/nova stable/pike: Improve metadata server performance with large security groups  https://review.opendev.org/69752321:20
openstackgerritGhanshyam Mann proposed openstack/nova master: Add new default roles in os-hypervisors policies  https://review.opendev.org/71507121:24
mriedemmelwitt: tbc, i think the solution for the affinity problem during scheduling likely involves placement as has been discussed for years, not serializing things like starlingx did as a workaround21:32
mriedembut how that would work in placement has always been difficult to model21:32
openstackgerritGhanshyam Mann proposed openstack/nova master: Pass the actual target in os-hypervisors policy  https://review.opendev.org/71507421:33
mriedembut then placement is your distributed lock :)21:33
melwittmriedem: yeah, sorry, I was missing that the pack pattern problem with anti-affinity would go away with placement affinity21:33
melwittthis stuff confuses the hell out of me21:33
*** derekh has quit IRC21:34
melwittso, nix what I said earlier rm_work ^21:35
*** dpawlik has quit IRC21:35
mriedem"if you have multiple MQs there's no enforcement of affinity for racing requests until affinity support is added to placement" is not quite true, you just need the conductors configured to hit the API DB (compute -> cell conductor -> API DB); devstack doesn't configure the cell conductor with the API DB connection so again, devstack is doing the *ideal* separate setup but not what most (if any) nova deployments are probably d21:40
mriedem,21:40
mriedempretty sure most nova deployments just have the api db connection configured everywhere21:41
openstackgerritMerged openstack/nova stable/stein: nova-live-migration: Ensure subnode is fenced during evacuation testing  https://review.opendev.org/71396221:41
mriedemand yeah you have to have reschedules enabled to....reschedule :)21:41
*** derekh has joined #openstack-nova21:42
mriedemthe only things that do the late affinity check in the computes are server create and evacuate, so you can still violate affinity policy for other moves (unshelve, cold and live migrate)21:42
melwittyeah ... I was realizing that slowly regarding the difference between database access vs MQ access21:42
mriedemand the only flows that reschedule today from the compute are create and cold migrate/resize21:42
mriedemi think evacuate just fails the operation if you fail the late affinity check21:43
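[Editor's sketch] mriedem's summary of which operations run the late affinity check and which can reschedule can be written as a small lookup. This is a Python sketch of the statements above as of this discussion, not nova code: the set names, the operation labels, and `on_affinity_violation` are hypothetical, and the evacuate outcome reflects his "i think" rather than a confirmed behaviour.

```python
# Operations whose compute-side flow runs the late affinity check.
LATE_AFFINITY_CHECK = {"create", "evacuate"}

# Operations that can reschedule from the compute (only if reschedules are
# enabled in the deployment).
RESCHEDULES_FROM_COMPUTE = {"create", "cold_migrate", "resize"}


def on_affinity_violation(operation):
    """Describe what happens when a racing request violates group policy."""
    if operation not in LATE_AFFINITY_CHECK:
        # No late check, so the violation is never noticed on the compute.
        return "undetected"
    if operation in RESCHEDULES_FROM_COMPUTE:
        return "reschedule"
    # Checked, but with no reschedule path the operation just fails.
    return "fail"


outcomes = {op: on_affinity_violation(op)
            for op in ("create", "evacuate", "unshelve",
                       "cold_migrate", "live_migrate")}
```

Only server create both detects the violation and has a way to recover from it; unshelve, cold migrate, and live migrate can still end up violating the policy.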
melwittI can't remember why that "impossible to contact bc MQ" was ever a thing wrt to affinity21:44
*** derekh has quit IRC21:44
*** ociuhandu has joined #openstack-nova21:45
melwittwas it before alternate_hosts became a thing maybe?21:45
melwittsigh ... have to correct my comment yet again21:46
*** ociuhandu has quit IRC21:49
*** spatel has quit IRC21:52
openstackgerritGhanshyam Mann proposed openstack/nova master: Add test coverage of existing instance usage log policies  https://review.opendev.org/71508021:54
mriedemalternate hosts just solved the problem of the cell conductor needing to go back to the scheduler and API DB21:55
mriedemthere is actually still an upcall bug there which i don't think got fixed,21:55
mriedemduring reschedule the conductor will update the instance.availability_zone for the alternate host and to do that it needs to hit the aggregates table which is in the API DB21:56
melwittyeah, I mean I had thought in the past it was said that the late affinity check would be impossible due to MQ isolation. but as you explained that's not true. so if that was ever said, I wondered why. it might have been before alternate hosts was added21:57
mriedemerr i guess i fixed that https://review.opendev.org/#/q/topic:bug/1781286+(status:open+OR+status:merged)+branch:master21:57
mriedemanytime, it's time to social distance myself into the kitchen, o/21:58
*** mriedem has left #openstack-nova21:58
dansmithmelwitt: I think what you're thinking of is that we don't currently have a way for the child cell to know about the top-level mq, and I don't think we should22:01
dansmithwe do have a separate api_db connection string, so as a hack, child conductors can use that as if they were a top-level conductor to still hit that database,22:02
dansmithwhich at least reduces the scope of who at the lower level can talk up,22:02
dansmithand since we should be solving this in placement, we can just hang onto the status quo instead of further pollute the model by teaching everyone to call up22:02
dansmithconcerned people could give perms to the child conductors to only view the instance_groups and related tables I think, to further limit the scope of what it can see to just what is needed for that check22:03
melwittdansmith: thanks ... I think I am thinking of that. but I could have sworn that there was some previously discussed impossibility about it regarding separate MQs, I might have been mixing back before we had alternate hosts, how once you're in the cell you can't call the scheduler again to request a reschedule22:04
dansmith...because we don't have a way to tell those services about the upper mq like we do for the upper db22:05
melwittright. yeah, I do understand that. maybe I was thinking back to before we had alternate hosts and became able to reschedule without sharing a MQ22:05
dansmithwell, yeah, the alternate hosts thing was the only way we could reschedule without adding a similar upcall22:06
melwittand had tied that to the late affinity check in my head. I dunno22:06
dansmithfor the same reason22:06
dansmithwell, it's the same thing of course22:06
dansmithit was just easier to solve that with pre-populating some alternates to chew through, whereas the affinity check is not so easy22:07
melwittoh ... guh22:07
openstackgerritGhanshyam Mann proposed openstack/nova master: Introduce scope_types in os-instance-usage-audit-log  https://review.opendev.org/71508222:08
melwittdansmith: I think what confused me was that prior to alternate hosts, you needed to be able to access the scheduler's MQ to reschedule (right?) ... so an upcall to another MQ. and I didn't process that the late affinity check does not involve needing to upcall to another MQ to work, it only needs to upcall to the API DB22:20
dansmithmelwitt: yes, (re)schedule is an rpc call, whereas the affinity check is just a db operation22:21
melwittright, ok22:22
*** Jeffrey4l has quit IRC22:27
*** Jeffrey4l has joined #openstack-nova22:29
*** slaweq has quit IRC22:34
*** vishalmanchanda has quit IRC22:35
openstackgerritGhanshyam Mann proposed openstack/nova master: Add new default roles in os-instance-usage-audit-log policies  https://review.opendev.org/71508522:37
*** slaweq has joined #openstack-nova22:46
openstackgerritGhanshyam Mann proposed openstack/nova master: Pass the actual target in os-instance-usage-audit-log policy  https://review.opendev.org/71508922:48
*** slaweq has quit IRC22:51
*** brinzhang has quit IRC22:51
*** tkajinam has joined #openstack-nova22:52
openstackgerritGhanshyam Mann proposed openstack/nova master: Add new default roles in os-instance-usage-audit-log policies  https://review.opendev.org/71508522:55
openstackgerritGhanshyam Mann proposed openstack/nova master: Pass the actual target in os-instance-usage-audit-log policy  https://review.opendev.org/71508922:56
*** rcernin has joined #openstack-nova22:56
openstackgerritmelanie witt proposed openstack/nova master: Add info about affinity requests to the troubleshooting doc  https://review.opendev.org/71509223:18
*** efried_gone has quit IRC23:18
openstackgerritMerged openstack/nova master: libvirt: Use oslo.utils >= 4.1.0 to fetch format-specific image data  https://review.opendev.org/71078523:18
openstackgerritMerged openstack/nova master: remove DISTINCT ON SQL instruction that does nothing on MySQL  https://review.opendev.org/70585023:19
openstackgerritmelanie witt proposed openstack/nova master: Add info about affinity requests to the troubleshooting doc  https://review.opendev.org/71509223:20
*** macz_ has quit IRC23:24
*** lbragstad has joined #openstack-nova23:25
brinzhang_gmann: I would like we not only catch 4xx error, maybe be we also need get 500, so I would like to keep the exception name to populate the details if it is a non-nova exception23:49
brinzhang_gmann: for example: as a non-admin, iuf I created server failed because of NoValidHost(500), if I get this message, that I can try again, Otherwise I cannot get nothing useful message23:51
brinzhang_s/iuf/if23:51
