*** tetsuro has joined #openstack-nova | 00:05 | |
*** tosky has quit IRC | 00:14 | |
*** brinzhang has joined #openstack-nova | 00:18 | |
*** tetsuro_ has joined #openstack-nova | 00:23 | |
*** tetsuro has quit IRC | 00:25 | |
*** dave-mccowan has joined #openstack-nova | 00:35 | |
*** lbragstad has joined #openstack-nova | 00:38 | |
*** xek has quit IRC | 00:47 | |
*** xek has joined #openstack-nova | 00:47 | |
*** brinzhang has quit IRC | 00:52 | |
*** brinzhang has joined #openstack-nova | 00:52 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add new default roles in os-instance-actions policies https://review.opendev.org/706470 | 01:00 |
---|---|---|
brinzhang | gmann: done, thanks. I also replied to your question; you can review again when you are free. | 01:03 |
*** nweinber has joined #openstack-nova | 01:27 | |
*** Liang__ has joined #openstack-nova | 01:27 | |
*** yaawang has quit IRC | 01:40 | |
*** yaawang has joined #openstack-nova | 01:40 | |
*** dave-mccowan has quit IRC | 01:43 | |
*** dave-mccowan has joined #openstack-nova | 01:52 | |
*** ociuhandu has joined #openstack-nova | 02:04 | |
*** tetsuro has joined #openstack-nova | 02:08 | |
*** ociuhandu has quit IRC | 02:10 | |
*** tetsuro_ has quit IRC | 02:11 | |
*** dave-mccowan has quit IRC | 02:29 | |
*** yaawang has quit IRC | 02:39 | |
*** yaawang has joined #openstack-nova | 02:39 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Store instance action event exc_val fault details https://review.opendev.org/694428 | 02:47 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Expose instance action event details out of the API https://review.opendev.org/694430 | 02:47 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add instance actions v283 samples test https://review.opendev.org/706251 | 02:47 |
*** ociuhandu has joined #openstack-nova | 02:54 | |
*** tetsuro_ has joined #openstack-nova | 02:59 | |
*** tetsuro has quit IRC | 03:03 | |
*** ociuhandu has quit IRC | 03:04 | |
*** ociuhandu has joined #openstack-nova | 03:06 | |
*** sapd1 has joined #openstack-nova | 03:08 | |
*** ociuhandu has quit IRC | 03:11 | |
openstackgerrit | Kevin Zhao proposed openstack/nova master: Add default cpu model for aarch64 https://review.opendev.org/709494 | 03:14 |
*** tetsuro_ has quit IRC | 03:16 | |
*** mkrai has joined #openstack-nova | 03:18 | |
*** tetsuro has joined #openstack-nova | 03:32 | |
*** tetsuro_ has joined #openstack-nova | 03:42 | |
*** tetsuro has quit IRC | 03:45 | |
*** brinzhang_ has joined #openstack-nova | 03:47 | |
*** brinzhang has quit IRC | 03:51 | |
*** damien_r has joined #openstack-nova | 03:51 | |
*** damien_r has quit IRC | 03:56 | |
*** udesale has joined #openstack-nova | 04:25 | |
*** ratailor has joined #openstack-nova | 04:40 | |
*** nweinber has quit IRC | 04:41 | |
*** tetsuro_ has quit IRC | 05:00 | |
*** slaweq has joined #openstack-nova | 05:13 | |
openstackgerrit | Kevin Zhao proposed openstack/nova master: fix ut error on arm64 https://review.opendev.org/713163 | 05:31 |
*** evrardjp has quit IRC | 05:35 | |
*** evrardjp has joined #openstack-nova | 05:36 | |
*** links has joined #openstack-nova | 05:37 | |
openstackgerrit | Kevin Zhao proposed openstack/nova master: Add default cpu model for aarch64 https://review.opendev.org/709494 | 05:42 |
*** ociuhandu has joined #openstack-nova | 05:43 | |
*** ociuhandu has quit IRC | 05:44 | |
*** ociuhandu has joined #openstack-nova | 05:44 | |
*** ircuser-1 has quit IRC | 05:50 | |
*** ociuhandu has quit IRC | 05:54 | |
*** ociuhandu has joined #openstack-nova | 05:55 | |
*** ociuhandu has quit IRC | 06:01 | |
*** irclogbot_0 has quit IRC | 06:29 | |
*** lbragstad has quit IRC | 06:37 | |
*** dpawlik has joined #openstack-nova | 06:58 | |
*** ratailor has quit IRC | 07:05 | |
*** ratailor has joined #openstack-nova | 07:08 | |
*** tetsuro has joined #openstack-nova | 07:19 | |
*** damien_r has joined #openstack-nova | 07:27 | |
*** irclogbot_2 has joined #openstack-nova | 07:30 | |
*** damien_r has quit IRC | 07:33 | |
*** tetsuro_ has joined #openstack-nova | 07:41 | |
*** tetsuro has quit IRC | 07:44 | |
*** ociuhandu has joined #openstack-nova | 07:47 | |
openstackgerrit | Kevin Zhao proposed openstack/nova master: Add default cpu model for aarch64 https://review.opendev.org/709494 | 07:54 |
*** iurygregory has joined #openstack-nova | 08:02 | |
*** dpawlik has quit IRC | 08:07 | |
*** dpawlik has joined #openstack-nova | 08:07 | |
*** maciejjozefczyk has joined #openstack-nova | 08:11 | |
*** tesseract has joined #openstack-nova | 08:12 | |
*** damien_r has joined #openstack-nova | 08:13 | |
*** rcernin has quit IRC | 08:14 | |
*** rpittau|afk is now known as rpittau | 08:15 | |
*** damien_r has quit IRC | 08:18 | |
*** dpawlik has quit IRC | 08:18 | |
*** amoralej|off is now known as amoralej | 08:20 | |
*** ociuhandu has quit IRC | 08:20 | |
*** tkajinam has quit IRC | 08:23 | |
*** breizhkoala has joined #openstack-nova | 08:26 | |
*** dpawlik has joined #openstack-nova | 08:28 | |
*** sapd1 has quit IRC | 08:39 | |
*** sapd1 has joined #openstack-nova | 08:39 | |
*** ralonsoh has joined #openstack-nova | 08:42 | |
*** elod has joined #openstack-nova | 08:46 | |
*** ociuhandu has joined #openstack-nova | 08:50 | |
openstackgerrit | waleed mousa proposed openstack/os-vif master: [Follow Up] OVS DPDK port representors support https://review.opendev.org/705018 | 08:56 |
*** ccamacho has joined #openstack-nova | 08:57 | |
*** tetsuro_ has quit IRC | 09:00 | |
*** jaosorior has joined #openstack-nova | 09:01 | |
brinzhang_ | stephenfin: can you review the os-volume-attachments default policy refresh patch (the end)? https://review.opendev.org/#/c/710190/ it is blocking the complete implementation of my destroy-instance-with-datavolume work | 09:02 |
*** aarents has quit IRC | 09:03 | |
brinzhang_ | stephenfin: same for the os-instance-actions policies https://review.opendev.org/#/c/706470/, it blocks the action-event-fault-details feature | 09:03 |
lyarwood | https://zuul.opendev.org/t/openstack/build/790d84d24ac946ec90712585a301ba3f/log/job-output.txt#6750 <- anyone with more bash-fu than me able to tell me why ceph.sh is exiting with 1 here? This works locally and I can't see anything useful dumped with set -o xtrace | 09:03 |
brinzhang_ | johnthetubaguy: also need you to check | 09:04 |
brinzhang_ | johnthetubaguy: stephenfin: thanks | 09:04 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: nova-live-migration: Wait for n-cpu services to come up after configuring Ceph https://review.opendev.org/713035 | 09:06 |
*** tosky has joined #openstack-nova | 09:07 | |
*** ociuhandu has quit IRC | 09:10 | |
*** ociuhandu has joined #openstack-nova | 09:11 | |
*** Liang__ has quit IRC | 09:15 | |
*** mkrai has quit IRC | 09:21 | |
*** mkrai has joined #openstack-nova | 09:22 | |
*** aarents has joined #openstack-nova | 09:32 | |
*** martinkennelly has joined #openstack-nova | 09:40 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: images: Move qemu-img info calls into privsep https://review.opendev.org/706897 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: images: Allow the output format of qemu-img info to be controlled https://review.opendev.org/706898 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: virt: Pass request context to extend_volume https://review.opendev.org/706899 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Correctly resize encrypted LUKSv1 volumes https://review.opendev.org/706900 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Use oslo.utils >= 4.1.0 to fetch format-specific image data https://review.opendev.org/710785 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Always provide the size in bytes when calling virDomainBlockResize https://review.opendev.org/707590 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: images: Remove Libvirt specific configurable use from qemu_img_info https://review.opendev.org/707591 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Remove QEMU_VERSION_REQ_SHARED https://review.opendev.org/710239 | 09:45 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: images: Make JSON the default output format of calls to qemu-img info https://review.opendev.org/711679 | 09:45 |
lyarwood | hmmm do I need to be part of a particular group to triage bugs? https://bugs.launchpad.net/nova/+bug/1864020 | 10:07 |
openstack | Launchpad bug 1864020 in OpenStack Compute (nova) "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified in the image metadata (See https://libvirt.org/kbase/backing_chains.html for troubleshooting)" [Undecided,Fix committed] - Assigned to Lee Yarwood (lyarwood) | 10:07 |
lyarwood | I thought I was but I can't seem to set the importance of the above bug | 10:07 |
*** nightmare_unreal has joined #openstack-nova | 10:08 | |
donnyd | nightmare_unreal: check this out https://docs.openstack.org/nova/train/contributor/index.html | 10:09 |
nightmare_unreal | sure | 10:09 |
donnyd | the nova contributor guide is pretty rich in detail on what it takes to get up and running | 10:10 |
donnyd | https://docs.openstack.org/nova/train/contributor/how-to-get-involved.html | 10:10 |
*** ratailor has quit IRC | 10:11 | |
*** ratailor has joined #openstack-nova | 10:13 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: nova-live-migration: Wait for n-cpu services to come up after configuring Ceph https://review.opendev.org/713035 | 10:29 |
*** nightmare_unreal has quit IRC | 10:38 | |
*** zigo has joined #openstack-nova | 10:52 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/rocky: Reproduce bug 1862633 https://review.opendev.org/713187 | 10:56 |
openstack | bug 1862633 in OpenStack Compute (nova) "unshelve leak allocation if update port fails" [Medium,Fix released] https://launchpad.net/bugs/1862633 - Assigned to Balazs Gibizer (balazs-gibizer) | 10:56 |
*** ociuhandu has quit IRC | 11:02 | |
*** ociuhandu has joined #openstack-nova | 11:02 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: bug-fix: Reject live migration with vpmem https://review.opendev.org/708110 | 11:04 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: address specific resources cleanup issue https://review.opendev.org/699148 | 11:04 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: support live migration with vpmems https://review.opendev.org/687856 | 11:04 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Track orphan instances and error migrations in resource tracker https://review.opendev.org/678451 | 11:04 |
*** ociuhandu has quit IRC | 11:07 | |
*** ratailor_ has joined #openstack-nova | 11:07 | |
*** ratailor has quit IRC | 11:10 | |
openstackgerrit | John Garbutt proposed openstack/nova master: Add unified limits configuration https://review.opendev.org/712137 | 11:18 |
openstackgerrit | John Garbutt proposed openstack/nova master: Add logic to enforce local api and db limits https://review.opendev.org/712139 | 11:18 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: nova-live-migration: Wait for n-cpu services to come up after configuring Ceph https://review.opendev.org/713035 | 11:19 |
elod | lyarwood: this is the team you need if i'm not mistaken: https://launchpad.net/~nova-bugs | 11:22 |
*** eharney has joined #openstack-nova | 11:23 | |
*** sean-k-mooney has joined #openstack-nova | 11:25 | |
lyarwood | elod: thanks, I was sure I was already but nvm | 11:27 |
*** tkajinam has joined #openstack-nova | 11:27 | |
*** tkajinam has quit IRC | 11:29 | |
lyarwood | ah membership expires | 11:29 |
lyarwood | weird | 11:29 |
elod | yeah. It says you have been a member since 2016 | 11:29 |
lyarwood | yeah just renewed | 11:30 |
lyarwood | no excuses not to triage now I guess /o\ | 11:30 |
elod | :] | 11:30 |
sean-k-mooney | lyarwood: do you know if there is an issue with the internal vpn | 11:31 |
lyarwood | sean-k-mooney: nope I'm on AFAICT | 11:31 |
* lyarwood checks | 11:31 | |
sean-k-mooney | lyarwood: the amsterdam site keeps kicking my connection | 11:31 |
*** rpittau is now known as rpittau|bbl | 11:31 | |
*** ociuhandu has joined #openstack-nova | 11:31 | |
lyarwood | sean-k-mooney: try FAB | 11:31 |
sean-k-mooney | ok i have to go find the config file and update my user name but i'll give it a try | 11:32 |
* lyarwood assumes every corps VPN endpoints are getting hammered at the moment | 11:32 | |
luyao | lyarwood: I addressed the 'do_cleanup' flag issue according to your and alex_xu's comments, could you look at it again? https://review.opendev.org/#/c/687856/12 | 11:34 |
lyarwood | luyao: ack will look shortly | 11:35 |
luyao | lyarwood: thanks | 11:35 |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/rocky: Clean up allocation if unshelve fails due to neutron https://review.opendev.org/713196 | 11:40 |
*** jangutter has joined #openstack-nova | 11:40 | |
luyao | brinzhang_: your comments were addressed, thanks https://review.opendev.org/#/c/687856/12 | 11:40 |
luyao | stephenfin: Hi, are you around | 11:45 |
stephenfin | yup | 11:46 |
brinzhang_ | luyao: thanks, I have not reviewed it all yet, I will continue when I am free :) | 11:46 |
luyao | stephenfin: Do you have time to review vpmem live migration support? I believe you have been familiar with the vpmem feature. :). https://review.opendev.org/#/q/topic:support-live-migration-with-virtual-persistent-memory+(status:open+OR+status:merged) | 11:46 |
luyao | brinzhang_: Cool, Thanks | 11:47 |
brinzhang_ | luyao: I have some other work to do first, so I will continue later, sorry | 11:47 |
stephenfin | luyao: Oh, I meant to take a look at that. Can do | 11:47 |
brinzhang_ | stephenfin: do you see my comments above? | 11:48 |
luyao | stephenfin: Thanks a lot | 11:48 |
stephenfin | yup, also on my list | 11:48 |
brinzhang_ | stephenfin: thanks | 11:48 |
brinzhang_ | these features are all ready for review; the api change and the novaclient change are both done :) | 11:49 |
*** mkrai has quit IRC | 12:02 | |
openstackgerrit | waleed mousa proposed openstack/os-vif master: [Follow Up] OVS DPDK port representors support https://review.opendev.org/705018 | 12:03 |
*** Luzi has joined #openstack-nova | 12:07 | |
openstackgerrit | waleed mousa proposed openstack/os-vif master: [Follow Up] OVS DPDK port representors support https://review.opendev.org/705018 | 12:07 |
brinzhang_ | dansmith: I have added the nova and non-nova exception functional tests for instance action events fault details, pls see https://review.opendev.org/#/c/694430/7/nova/tests/functional/test_instance_actions.py | 12:09 |
*** rcernin has joined #openstack-nova | 12:10 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/rocky: Reproduce bug 1862633 https://review.opendev.org/713187 | 12:13 |
openstack | bug 1862633 in OpenStack Compute (nova) "unshelve leak allocation if update port fails" [Medium,Fix released] https://launchpad.net/bugs/1862633 - Assigned to Balazs Gibizer (balazs-gibizer) | 12:13 |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/rocky: Clean up allocation if unshelve fails due to neutron https://review.opendev.org/713196 | 12:13 |
*** breizhkoala has quit IRC | 12:16 | |
*** mkrai has joined #openstack-nova | 12:18 | |
*** udesale_ has joined #openstack-nova | 12:26 | |
*** ratailor__ has joined #openstack-nova | 12:27 | |
*** udesale has quit IRC | 12:28 | |
*** nweinber has joined #openstack-nova | 12:29 | |
*** ratailor_ has quit IRC | 12:30 | |
*** jraju__ has joined #openstack-nova | 12:33 | |
*** links has quit IRC | 12:34 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: nova-live-migration: Wait for n-cpu services to come up after configuring Ceph https://review.opendev.org/713035 | 12:38 |
*** ratailor__ has quit IRC | 12:39 | |
*** ociuhandu has quit IRC | 12:43 | |
*** ociuhandu has joined #openstack-nova | 12:43 | |
*** mkrai has quit IRC | 12:47 | |
*** tkajinam has joined #openstack-nova | 12:48 | |
*** mgariepy has joined #openstack-nova | 12:49 | |
*** nicolasbock has joined #openstack-nova | 12:56 | |
*** damien_r has joined #openstack-nova | 13:01 | |
*** ociuhandu has quit IRC | 13:02 | |
*** artom has joined #openstack-nova | 13:03 | |
*** ociuhandu has joined #openstack-nova | 13:03 | |
*** amoralej is now known as amoralej|lunch | 13:07 | |
*** lbragstad has joined #openstack-nova | 13:08 | |
*** ociuhandu has quit IRC | 13:08 | |
*** ociuhandu has joined #openstack-nova | 13:09 | |
*** rpittau|bbl is now known as rpittau | 13:13 | |
openstackgerrit | John Garbutt proposed openstack/nova master: Assert API behavior for noop quota driver https://review.opendev.org/712140 | 13:17 |
openstackgerrit | John Garbutt proposed openstack/nova master: Make unified limits APIs return reserved of 0 https://review.opendev.org/712141 | 13:17 |
openstackgerrit | John Garbutt proposed openstack/nova master: Add logic to enforce local api and db limits https://review.opendev.org/712139 | 13:17 |
openstackgerrit | John Garbutt proposed openstack/nova master: Enforce api and db limits https://review.opendev.org/712142 | 13:17 |
*** beekneemech is now known as bnemec | 13:17 | |
*** CeeMac has joined #openstack-nova | 13:18 | |
*** kaisers_ has joined #openstack-nova | 13:18 | |
*** elod has quit IRC | 13:21 | |
*** elod has joined #openstack-nova | 13:21 | |
*** ociuhandu has quit IRC | 13:26 | |
dansmith | brinzhang_: okay, after coffee | 13:26 |
*** ociuhandu has joined #openstack-nova | 13:26 | |
brinzhang_ | dansmith: good morning ^^ | 13:27 |
*** dpawlik has quit IRC | 13:29 | |
*** ociuhandu has quit IRC | 13:31 | |
*** mgariepy has quit IRC | 13:39 | |
*** tkajinam has quit IRC | 13:44 | |
*** mkrai has joined #openstack-nova | 13:48 | |
*** mgariepy has joined #openstack-nova | 13:48 | |
*** mgariepy has quit IRC | 13:54 | |
*** ociuhandu has joined #openstack-nova | 14:01 | |
*** nightmare_unreal has joined #openstack-nova | 14:06 | |
*** mgariepy has joined #openstack-nova | 14:08 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Fix intermittently failing regression case https://review.opendev.org/713243 | 14:09 |
*** Luzi has quit IRC | 14:11 | |
*** haleyb has joined #openstack-nova | 14:13 | |
*** amoralej|lunch is now known as amoralej | 14:14 | |
openstackgerrit | John Garbutt proposed openstack/nova master: Update quota_class APIs for db and api limits https://review.opendev.org/712143 | 14:15 |
sean-k-mooney | bauzas: would you have time to look at https://review.opendev.org/#/c/666914/21 and the follow-up patches? it would be nice to be able to close that out. gibi, maybe you could take a look too since efried_gone is no longer here to review. since stephen and i are the authors we need a non-redhat person to review, unless we are moving away from that requirement. alex_xu or johnthetubaguy would also work if | 14:17 |
sean-k-mooney | they are around. | 14:17 |
*** mriedem has joined #openstack-nova | 14:19 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM - Test TEMPEST_EXTEND_ATTACHED_ENCRYPTED_VOLUME https://review.opendev.org/707593 | 14:20 |
*** udesale_ has quit IRC | 14:23 | |
*** Luzi has joined #openstack-nova | 14:24 | |
gibi | sean-k-mooney: ack. I cannot promise too much (have a long review queue atm) but added it to my queue | 14:26 |
gibi | sean-k-mooney: regarding the non-RH core requirement. I see this requirement as something that will be very problematic due to less diversity in the core team | 14:28 |
sean-k-mooney | gibi: ya. don't worry if you can't review, but the trifecta rule is likely to become a problem unless the diversity of the core team can be restored. as someone who likes that rule i would be sad to see it go, but i guess we will see how it goes | 14:29 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Fix intermittently failing regression case https://review.opendev.org/713243 | 14:32 |
brinzhang_ | gmann: would you like to check the os-instance-action policy again? https://review.opendev.org/#/c/706470/ | 14:33 |
gmann | brinzhang_: yeah, i will check after my internal meeting | 14:33 |
brinzhang_ | gmann: yeah, if it is ok, I think I can continue with the os-instance-action (bp/action-event-fault-details) feature, there is a new policy based on the default policy change | 14:35 |
brinzhang_ | gmann: thanks | 14:36 |
gmann | brinzhang_: yeah, you can continue on that, make dependency. | 14:37 |
brinzhang_ | gmann: Adding a default policy is only the remaining part, everything else is ready(TODO in it). So I want to do it after this done. | 14:40 |
openstackgerrit | Merged openstack/nova master: Cleanup test for system reader and reader_or_owner rules https://review.opendev.org/712515 | 14:40 |
*** ociuhandu has quit IRC | 14:41 | |
*** mkrai has quit IRC | 14:52 | |
*** ociuhandu has joined #openstack-nova | 14:54 | |
*** gyee has joined #openstack-nova | 14:54 | |
*** TxGirlGeek has joined #openstack-nova | 14:55 | |
openstackgerrit | Kevin Zhao proposed openstack/nova master: Add default cpu model for aarch64 https://review.opendev.org/709494 | 14:57 |
bauzas | sean-k-mooney: I'll try but tbh, those next weeks will be crazy | 14:58 |
bauzas | but adding it to my queue | 14:59 |
*** ociuhandu has quit IRC | 14:59 | |
sean-k-mooney | bauzas: yep i understand just trying to see if we can complete some blueprints that are close to being done. | 14:59 |
bauzas | sure | 15:00 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add new default roles in os-instance-actions policies https://review.opendev.org/706470 | 15:01 |
kashyap | lyarwood: Hiya, hope the renewed comments make sense: forgot that we also need MIN_QEMU_BLOCKDEV | 15:01 |
kashyap | lyarwood: And a different, newer version for MIN_LIBVIRT_BLOCKDEV. (Notes in the review :)) | 15:02 |
artom | Remind me again what are the criteria for reporting a VM status as UNKNOWN? Host being down is one of them, right? | 15:05 |
artom | melwitt, ^^ if you're awake | 15:05 |
dansmith | yeah or cell down | 15:05 |
artom | dansmith, aha, ack, thanks! | 15:05 |
dansmith | it started at cell down only, but I think melwitt changed that recently | 15:05 |
artom | dansmith, that's the impression I got - not super important, just clearing out old (downstream) BZs | 15:06 |
*** nicolasbock has quit IRC | 15:08 | |
*** nicolasbock has joined #openstack-nova | 15:09 | |
*** spatel has joined #openstack-nova | 15:11 | |
spatel | sean-k-mooney: Good morning, if I want to disable Hyper-Threading without the BIOS setting, how do I do that? I was reading this AWS article about offlining cpu threads; do you think this is a valid way to do that - https://aws.amazon.com/blogs/compute/disabling-intel-hyper-threading-technology-on-amazon-linux/ | 15:12 |
sean-k-mooney | spatel: yes, you can do it via the sysfs virtual file system | 15:14 |
sean-k-mooney | the bios is the best way to do it, as offlining cores will not unpartition the l1 cache | 15:14 |
spatel | sean-k-mooney: does that perform as well as the BIOS setting? | 15:15 |
sean-k-mooney | not quite | 15:15 |
sean-k-mooney | if you disable it in the bios it doubles the l1 cache available to the core | 15:15 |
sean-k-mooney | when ht is enabled the l1 cache is partitioned so that each hyperthread has its own region of the l1 cache | 15:15 |
spatel | We found erlang does a pretty good job when HT is disabled, but again I don't want to set that in the BIOS (it's painful); I want to give that control to end users | 15:16 |
sean-k-mooney | but with it disabled at the bios level all the l1 cache is available to the physical core, at least on older intel architectures | 15:16 |
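For reference, the sysfs approach discussed above can be sketched roughly as follows. This is a minimal illustration, not a tested operational script: it assumes the standard Linux files `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` and `/sys/devices/system/cpu/cpuN/online`, and the helper names (`parse_cpu_list`, `siblings_to_offline`, `read_siblings`) are hypothetical. It only computes and prints which threads would be offlined, and as noted in the discussion it does not unpartition the L1 cache the way the BIOS setting does.

```python
from typing import Dict, List


def parse_cpu_list(text: str) -> List[int]:
    """Parse a sysfs CPU list such as "0,4" or "0-1,8-9" into sorted ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def siblings_to_offline(siblings: Dict[int, str]) -> List[int]:
    """Keep the lowest-numbered thread of each core and return the rest."""
    offline = set()
    for text in siblings.values():
        threads = parse_cpu_list(text)
        offline.update(threads[1:])
    return sorted(offline)


def read_siblings(cpu_ids: List[int]) -> Dict[int, str]:
    """Read thread_siblings_list for each CPU from sysfs (Linux only)."""
    result = {}
    for cpu in cpu_ids:
        path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
        with open(path) as handle:
            result[cpu] = handle.read()
    return result


if __name__ == "__main__":
    # Hypothetical 4-core / 8-thread host where cpuN and cpuN+4 share a core.
    example = {n: f"{n % 4},{n % 4 + 4}" for n in range(8)}
    for cpu in siblings_to_offline(example):
        # Writing "0" to this file as root would take the thread offline.
        print(f"echo 0 > /sys/devices/system/cpu/cpu{cpu}/online")
```

On a real host, `read_siblings` would replace the hard-coded `example` mapping; the sibling numbering differs between machines, so it should be read rather than assumed.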
sean-k-mooney | spatel: are you using cpu pinning | 15:17 |
spatel | Yes CPU pinning | 15:17 |
sean-k-mooney | if so you can use the cpu_thread_policy | 15:17 |
sean-k-mooney | create multiple flavors for the erlang instance and let them choose | 15:17 |
spatel | sean-k-mooney: cpu_thread_policy=isolate ? | 15:18 |
sean-k-mooney | e.g. hw:cpu_thread_policy=prefer vs hw:cpu_thread_policy=isolate | 15:18 |
sean-k-mooney | yes | 15:18 |
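A small sketch of the "multiple flavors" suggestion, assuming pinned instances (`hw:cpu_policy=dedicated`) and the `hw:cpu_thread_policy` extra spec with its `prefer`, `isolate` and `require` values. The flavor names and the helper `pinned_flavor_extra_specs` are made up for illustration and are not part of nova.

```python
# Values accepted by the hw:cpu_thread_policy flavor extra spec.
VALID_THREAD_POLICIES = ("prefer", "isolate", "require")


def pinned_flavor_extra_specs(thread_policy: str) -> dict:
    """Build extra specs for a CPU-pinned flavor with the given thread policy."""
    if thread_policy not in VALID_THREAD_POLICIES:
        raise ValueError(f"unknown thread policy: {thread_policy!r}")
    return {
        # Thread policies only apply to instances with dedicated (pinned) CPUs.
        "hw:cpu_policy": "dedicated",
        "hw:cpu_thread_policy": thread_policy,
    }


# Hypothetical flavor names: one that may share cores with sibling threads,
# one that keeps each vCPU on its own physical core.
FLAVORS = {
    "erlang.ht": pinned_flavor_extra_specs("prefer"),
    "erlang.noht": pinned_flavor_extra_specs("isolate"),
}

if __name__ == "__main__":
    for name, specs in FLAVORS.items():
        props = " ".join(f"--property {k}={v}" for k, v in specs.items())
        print(f"openstack flavor set {props} {name}")
```

End users can then pick the flavor that suits their workload, which gives them the choice without touching the BIOS on each compute host.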
spatel | sean-k-mooney: I have tried all kind of combination but erlang doesn't like it. | 15:18 |
spatel | when i run the VM on a single NUMA node performance is really good | 15:19 |
sean-k-mooney | the sysfs performance delta is pretty small since normally your app/data won't fit in l1 anyway | 15:19 |
spatel | sean-k-mooney: look at this - https://imgur.com/a/8zapZ8x | 15:21 |
spatel | To better understand the CPU topology I am comparing it with AWS, and here is what I found | 15:22 |
spatel | On my openstack VM the CPU topo looks very strange | 15:22 |
sean-k-mooney | the lower image looks like what i would expect | 15:23 |
spatel | l1d & l1i cache is shared | 15:23 |
spatel | why aws has l1d and l1i outside | 15:23 |
sean-k-mooney | i dont know | 15:23 |
sean-k-mooney | but you can alter this in the libvirt xml i think | 15:24 |
spatel | That is the problem, I have checked with Alicloud and aws and both have a perfect CPU topo but my openstack has very odd output | 15:24 |
sean-k-mooney | this is not something we would expose however | 15:24 |
spatel | I think it could be QEMU version or bug | 15:24 |
sean-k-mooney | well what do you mean by odd | 15:25 |
spatel | I am planning to upgrade my qemu to 4.2 (currently running 2.12) | 15:25 |
sean-k-mooney | the image you provided showing the kvm instance | 15:25 |
sean-k-mooney | looks like real hardware would | 15:25 |
spatel | qemu-kvm | 15:25 |
spatel | both are virtual machine | 15:25 |
sean-k-mooney | sure but looking at https://imgur.com/a/8zapZ8x the bottom image looks correct, the top looks incorrect | 15:26 |
spatel | you are saying AWS instance looks incorrect? | 15:27 |
sean-k-mooney | yes | 15:27 |
sean-k-mooney | that is the topology that we should see if and only if you had HT disabled | 15:27 |
spatel | If i run same command on my host compute it looks exactly like AWS one | 15:27 |
spatel | sean-k-mooney: no | 15:27 |
spatel | Let me show you my two physical compute topo (HT vs non-HT) | 15:28 |
spatel | hold on.. | 15:28 |
sean-k-mooney | ok so looking locally they have changed how this works in later versions | 15:30 |
sean-k-mooney | the view that you see in openstack is how it used to work in nehalem and i believe up to sandybridge or ivybridge | 15:30 |
sean-k-mooney | spatel: as i said the bios setting used to change the topology between the aws one and the openstack one at the hardware level | 15:32 |
spatel | sean-k-mooney: https://imgur.com/a/at3WBBf | 15:32 |
spatel | These are my two compute hosts (one has HT enabled and the second has HT disabled) | 15:32 |
sean-k-mooney | spatel: yep, as i said this has changed with different hardware microarchitectures | 15:33 |
spatel | If you look at the bottom picture, it's very similar to the AWS virtual instance; that means the AWS virtual machine is correctly exposing the physical topology, including cache | 15:33 |
sean-k-mooney | spatel: openstack/nova is not currently setting the cpu cache topology; it's decided by libvirt | 15:33 |
sean-k-mooney | spatel: openstack is not meant to expose the host topology by default | 15:34 |
spatel | even in host-passthrough ? | 15:34 |
sean-k-mooney | correct | 15:34 |
sean-k-mooney | openstack does not specify the cache topology at all | 15:34 |
spatel | hmmm | 15:34 |
sean-k-mooney | that is left entirely to libvirt today | 15:34 |
sean-k-mooney | libvirt allows us to set this but we don't, so you get whatever libvirt/qemu decide to provide | 15:35 |
spatel | hmm! how is Ali cloud doing this, even though they are running openstack? | 15:35 |
spatel | maybe they have a hacked version of the software that they designed themselves | 15:35 |
spatel | Anyway, so you think the BIOS level is the best way; then I will go with that, but it will be a growing pain as my cloud grows :( | 15:37 |
*** breizhkoala has joined #openstack-nova | 15:38 | |
openstackgerrit | John Garbutt proposed openstack/nova master: Update limit APIs https://review.opendev.org/712707 | 15:39 |
openstackgerrit | John Garbutt proposed openstack/nova master: Update quota sets APIs https://review.opendev.org/712749 | 15:39 |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Enforce unified limits using oslo.limit https://review.opendev.org/615180 | 15:39 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Use virDomainBlockCopy to swap volumes when using -blockdev https://review.opendev.org/696834 | 15:48 |
lyarwood | kashyap: ^ updated btw | 15:48 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add new default roles in os-volumes-attachments policies https://review.opendev.org/710190 | 15:52 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add PATCH volume attachments api to os-volume_attachments https://review.opendev.org/693828 | 15:52 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add new policy to PATCH update volume API https://review.opendev.org/711194 | 15:52 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add functional tests for PATCH volume attachments API https://review.opendev.org/710965 | 15:52 |
*** ociuhandu has joined #openstack-nova | 15:55 | |
dansmith | sean-k-mooney: AFAIK, the cyborg patch that generates the libvirt xml hasn't changed much, and you've tested that at some point with real devices (right?) so we can assume it works without much fanfare? | 15:57 |
*** iurygregory is now known as iurygregory|brb | 15:57 | |
kashyap | lyarwood: Will check; thx | 15:59 |
*** ociuhandu has quit IRC | 16:00 | |
sean-k-mooney | dansmith: i haven't tested with real devices, no | 16:00 |
dansmith | oh I thought you had okay | 16:01 |
sean-k-mooney | dansmith: i can go specifically review that patch however | 16:01 |
dansmith | presumably sundar has | 16:01 |
sean-k-mooney | i believe you are correct in that it does not change much | 16:01 |
sean-k-mooney | dansmith: yes sundar has apparently tested it with the Rush Creek fpga card | 16:01 |
dansmith | I looked over it a while back and I think the only way I'd be able to find stuff really wrong with it is through log examination | 16:02 |
dansmith | it's pretty straightforward | 16:02 |
*** ivve has joined #openstack-nova | 16:03 | |
sean-k-mooney | dansmith: i rebased the cyborg devstack plugin multinode series this morning by the way, just to resolve the merge conflict | 16:05 |
dansmith | I saw, thanks | 16:05 |
sean-k-mooney | if i rebase it again do you want me to move your host name fix patch lower? hopefully they will merge soon anyway but that is useful outside of multinode testing | 16:06 |
dansmith | it's not super critical unless it's blocking people.. I put it later just to avoid messing up your series, but obviously it's probably an easy merge.. your call | 16:06 |
*** mmethot_ has quit IRC | 16:07 | |
sean-k-mooney | ok if i need to respin i can move it down. i don't think others have really complained about it, plus you can always override the host via the local.conf anyway | 16:08 |
dansmith | yup.. I imagine that's because most people are using more throwaway machines for their testing (and it's probably not getting a very wide audience anyway) but.. yep, not critical and there is a workaround | 16:11 |
melwitt | artom, dansmith: fyi I didn't change the meaning of the UNKNOWN status, it originally was only for host down and then when the down cells handling was added, it was used for that as well. my change was just a new policy rule to allow UNKNOWN status to be seen by non-admin if indicated by policy | 16:12 |
dansmith | melwitt: it was originally for cell down only, AFAIR | 16:13 |
melwitt | this is the logic for host down https://github.com/openstack/nova/blob/master/nova/compute/api.py#L5339-L5351 | 16:13 |
dansmith | mark-host-down left the status in place | 16:13 |
dansmith | that's host status | 16:13 |
sean-k-mooney | dansmith: on a slightly different topic do you have interest in/ time to review the provider.yaml series? just trying to figure out which redhat cores to bug as a reviewer when its ready. | 16:14 |
dansmith | he's talking about instance status right? | 16:14 |
melwitt | wasn't that the question? | 16:14 |
sean-k-mooney | melwitt: since you're here same question ^ | 16:14 |
melwitt | oh, sorry. sigh | 16:14 |
*** sapd1 has quit IRC | 16:14 | |
dansmith | melwitt: vm status | 16:14 |
melwitt | well, either way I didn't change the meaning of vm status either | 16:14 |
dansmith | melwitt: I thought you were proposing the vm status change but okay | 16:15 |
* sean-k-mooney brb | 16:15 | |
melwitt | dansmith: I did but you explained why it wouldn't be a good idea and I agreed with your reasoning and updated the spec to stop proposing it. the spec was approved some time after that | 16:16 |
*** mmethot has joined #openstack-nova | 16:16 | |
dansmith | sean-k-mooney: I dunno, I don't have a huge interest in reviewing that | 16:16 |
melwitt | *I did originally | 16:16 |
dansmith | melwitt: ack, I didn't remember, I thought you had kept that in. | 16:16 |
*** derekh has joined #openstack-nova | 16:17 | |
melwitt | artom: sorry I got vm status and host status mixed up. UNKNOWN vm status is for down cell only and was not changed as a result of my adding a host_status:unknown-only policy rule | 16:21 |
artom | melwitt, oh? So what happens when the host isn't reachable? We report the last recorded status from the DB? | 16:22 |
melwitt | artom: correct. only host status will say UNKNOWN | 16:22 |
dansmith | host_status on the instance gives you a sanitized "don't expect this instance to be actionable because the host is not healthy" | 16:22 |
dansmith | normally only admins can see info about hosts, so that field is the indicator to the user that "things are not as they appear" without exposing too much | 16:23 |
melwitt | artom: and host status is normally admin-only, so I added a new policy rule host_status:unknown-only that defaults to admin-only intended for operators who want to let non-admin users see UNKNOWN host status | 16:23 |
melwitt | host_status policy rule includes showing UP, DOWN, MAINTENANCE, UNKNOWN and host_status:unknown-only shows only UNKNOWN | 16:24 |
artom | melwitt, aha, thanks :) Can you find a link to that patch/spec/whatever? | 16:24 |
melwitt | artom: https://review.opendev.org/679181 | 16:24 |
*** sapd1 has joined #openstack-nova | 16:27 | |
*** tesseract has quit IRC | 16:33 | |
sean-k-mooney | dansmith: no worries. gibi and erric were the main reviewers so im just thinking of who can take over from erric. when its ready ill add it to the runway list and see who is interested | 16:34 |
*** ociuhandu has joined #openstack-nova | 16:34 | |
gibi | sean-k-mooney: the provider config series are also on my radar | 16:43 |
evrardjp | https://review.opendev.org/711950 has merged. Congratulations! | 16:44 |
sean-k-mooney | gibi: yep im addressing the feedback on the last patch currently | 16:44 |
gibi | sean-k-mooney: cool. thanks | 16:44 |
sean-k-mooney | ill take a look at the unit tests once i have that done | 16:45 |
*** ociuhandu has quit IRC | 16:53 | |
*** ociuhandu has joined #openstack-nova | 16:53 | |
*** brtknr_ has quit IRC | 16:57 | |
*** brtknr has joined #openstack-nova | 16:57 | |
gibi | cores, there is a trivial, functional test only change: https://review.opendev.org/#/c/713243/ | 16:58 |
*** ociuhandu has quit IRC | 16:58 | |
*** rpittau is now known as rpittau|afk | 17:07 | |
*** iurygregory|brb is now known as iurygregory | 17:08 | |
*** Luzi has quit IRC | 17:14 | |
openstackgerrit | melanie witt proposed openstack/nova master: Follow-ups for host_status:unknown-only policy rule https://review.opendev.org/713295 | 17:21 |
openstackgerrit | Merged openstack/nova master: Fix intermittently failing regression case https://review.opendev.org/713243 | 17:23 |
melwitt | dansmith: pedantic correction to what I said earlier, the spec was abandoned bc it was decided to no longer need a spec when it became policy rule only. and then blueprint was approved after it became policy rule only | 17:28 |
melwitt | (in case you ever go looking at the spec and can't find it, like I just did) | 17:29 |
melwitt | *for | 17:29 |
*** evrardjp has quit IRC | 17:35 | |
*** evrardjp has joined #openstack-nova | 17:36 | |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Enforce resource limits using oslo.limit https://review.opendev.org/615180 | 17:49 |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Tell oslo.limit how to count nova resources https://review.opendev.org/713301 | 17:49 |
melwitt | gibi: I dunno if you saw my comment in https://review.opendev.org/712674 test just needs a tweak to handle kwarg vs positional arg | 17:54 |
dansmith | melwitt: heh okay.. | 17:55 |
melwitt | just in case you opened it and went, hWHAT! abandoned! ?!?! | 17:57 |
*** dtantsur is now known as dtantsur|afk | 17:57 | |
*** jangutter has quit IRC | 17:57 | |
*** derekh has quit IRC | 18:00 | |
*** damien_r has quit IRC | 18:06 | |
*** ociuhandu has joined #openstack-nova | 18:07 | |
openstackgerrit | Dan Smith proposed openstack/nova master: Remove non-optional kwarg for virt block_device_info https://review.opendev.org/713310 | 18:09 |
*** damien_r has joined #openstack-nova | 18:10 | |
*** maciejjozefczyk has quit IRC | 18:13 | |
*** ociuhandu has quit IRC | 18:23 | |
*** ociuhandu has joined #openstack-nova | 18:24 | |
*** ociuhandu has quit IRC | 18:29 | |
*** maciejjozefczyk has joined #openstack-nova | 18:29 | |
gibi | melwitt: ack, now see it. thanks. I will get back to that tomorrow | 18:36 |
*** breizhkoala has quit IRC | 18:37 | |
melwitt | gibi: k. I didn't want to update it, so I will be able to +2 | 18:38 |
gibi | melwitt: sure | 18:39 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: [Community goal] Update contributor documentation https://review.opendev.org/712420 | 18:42 |
gibi | stephenfin, brinzhang_: fixed the comments ^^ | 18:43 |
*** CeeMac has quit IRC | 18:48 | |
sean-k-mooney | gibi are you done for the day or are you still around | 18:58 |
gibi | sean-k-mooney: I'm here for quick questions, not for longer things | 18:59 |
sean-k-mooney | gibi: its related to the provider.conf | 19:00 |
sean-k-mooney | basically the way i was asserting that the traits dont conflict with the virt driver traits does not quite work | 19:00 |
sean-k-mooney | what happens is it works on the first iteration then fails on the second as the trait is already there | 19:01 |
sean-k-mooney | the end to end functional test you asked for found the issue | 19:01 |
sean-k-mooney | im just wondering what the best way to address that is | 19:01 |
sean-k-mooney | gibi: this is what im doing which works fine for inventories as we start from scratch each time https://review.opendev.org/#/c/676522/44/nova/compute/resource_tracker.py@1751 | 19:03 |
sean-k-mooney | but for traits we start with the traits from placement | 19:03 |
sean-k-mooney | i guess i need to think about it again | 19:03 |
gibi | sean-k-mooney: let me sleep on it | 19:04 |
sean-k-mooney | ya no worries | 19:04 |
gibi | sean-k-mooney: can it be that we say CUSTOM traits are always overwritten by the provider config as we don't expect that a virt driver reports CUSTOM traits anyhow | 19:04 |
sean-k-mooney | it might be as simple as if we have provider.yaml remove all custom traits | 19:04 |
sean-k-mooney | gibi: ya so i think that is what erric wanted | 19:05 |
sean-k-mooney | either you manage it from the api and dont use the provider.yaml | 19:05 |
*** lbragstad_ has joined #openstack-nova | 19:05 | |
sean-k-mooney | or you use the provider.yaml in which case we can reset the traits and build them up again | 19:05 |
gibi | sean-k-mooney: yeah, this make senese | 19:07 |
gibi | sense | 19:07 |
sean-k-mooney | ill give that a try if you think that is valid | 19:07 |
gibi | we just need to document it carefully | 19:07 |
sean-k-mooney | ya ok ill see if i can make that work and get back to you. | 19:07 |
*** lbragstad has quit IRC | 19:08 | |
sean-k-mooney | the parades may be canceled but its still st patricks day tomorrow so ill be off until wednesday | 19:08 |
gibi | sean-k-mooney: sure. happy st patricks day! | 19:09 |
*** ociuhandu has joined #openstack-nova | 19:10 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Ensures that COMPUTE_RESOURCE_SEMAPHORE usage is fair https://review.opendev.org/712674 | 19:11 |
gibi | melwitt: fixed it up real quick | 19:12 |
gibi | and now I'm gone for today | 19:12 |
melwitt | gibi: awesome thanks | 19:13 |
*** ralonsoh has quit IRC | 19:15 | |
*** ociuhandu has quit IRC | 19:20 | |
*** martinkennelly has quit IRC | 19:21 | |
*** ociuhandu has joined #openstack-nova | 19:21 | |
*** maciejjozefczyk has quit IRC | 19:24 | |
*** ociuhandu has quit IRC | 19:26 | |
*** CeeMac has joined #openstack-nova | 19:29 | |
*** jraju__ has quit IRC | 19:31 | |
*** lbragstad_ is now known as lbragstad | 19:32 | |
*** amoralej is now known as amoralej|off | 19:40 | |
melwitt | dansmith: test coverage for fair locking is ready https://review.opendev.org/712674 | 19:50 |
*** ccamacho has quit IRC | 19:50 | |
dansmith | ah, I was like "this is dumb you're just testing your fixture" but I see now | 19:51 |
dansmith | like I said, I'm not really sure it's that important, but as long as it doesn't get in the way too much.. | 19:51 |
melwitt | yeah, I thought it's a nice way to cover this and catch any future uses without fair=True for the compute semaphore | 19:57 |
*** ociuhandu has joined #openstack-nova | 20:04 | |
openstackgerrit | sean mooney proposed openstack/nova master: Provider Config File: Enable loading and merging of provider configs https://review.opendev.org/693460 | 20:04 |
*** ociuhandu has quit IRC | 20:08 | |
*** mgariepy has quit IRC | 20:35 | |
openstackgerrit | Merged openstack/nova stable/train: Functional test for UnexpectedDeletingTaskStateError https://review.opendev.org/711210 | 20:36 |
*** nweinber has quit IRC | 20:38 | |
*** mgariepy has joined #openstack-nova | 20:49 | |
melwitt | artom: I just hit https://bugs.launchpad.net/nova/+bug/1813789 intermittent gate failure on one of my patches and saw you have patches/comments in the lp bug. do you have any idea where this is at right now? I saw you landed https://review.opendev.org/644881 9 months ago but it wasn't for this bug. just wondering if you happen to know anything about the current bug we have in the gate | 21:11 |
openstack | Launchpad bug 1813789 in OpenStack Compute (nova) "Evacuate test intermittently fails with network-vif-plugged timeout exception" [Medium,In progress] - Assigned to Artom Lifshitz (notartom) | 21:11 |
*** ociuhandu has joined #openstack-nova | 21:11 | |
melwitt | mriedem: thought your ghost might find this interesting https://review.opendev.org/713035 | 21:13 |
artom | melwitt, IIRC my patch only addresses the revert-resize case | 21:13 |
artom | melwitt, so any other race was outside of its scope | 21:14 |
mriedem | spooky | 21:14 |
melwitt | artom: oh, I see. thanks, helps to know that. I wonder if the same pattern could be applied to the evacuate case | 21:14 |
mriedem | so that's why the changes for that job on pike always failed? | 21:14 |
melwitt | mriedem: yeah | 21:15 |
artom | melwitt, at first pass I'd say no - I have to reload context, but it was a *really* specific scenario with revert resize | 21:15 |
melwitt | mriedem: lyarwood figured it out. it's failing nearly 100% on the pike branch | 21:15 |
mriedem | yeah i gave up on https://review.opendev.org/#/c/700072/ and thought it was due to some other pike thing that was fixed by QA awhile back | 21:16 |
melwitt | artom: ah, k. yeah even that vague info helps. I know nothing about it till now | 21:16 |
mriedem | but obviously not | 21:16 |
mriedem | artom: at the time we had talked about the same issue in evacuate | 21:16 |
artom | melwitt, like, the source host had to have the NIC already wired, and it had to be OVS | 21:16 |
mriedem | there is an old gate bug for that race | 21:16 |
artom | mriedem, yeah, but it can't have been the same root cause | 21:16 |
mriedem | http://status.openstack.org/elastic-recheck/#1813789 | 21:16 |
mriedem | no it's not due to using OVN or whatever :) | 21:17 |
*** ociuhandu has quit IRC | 21:17 | |
artom | mriedem, right, which is why I ended up filing https://bugs.launchpad.net/nova/+bug/1832028 and using that in my patch | 21:17 |
openstack | Launchpad bug 1832028 in OpenStack Compute (nova) stein "revert resize: vif-plugged external event sent too soon if Neutron is using OVS hybrid plug" [Medium,Fix committed] - Assigned to Artom Lifshitz (notartom) | 21:17 |
artom | Because turns out my thing from downstream was different than the intermittent upstream evacuate failures | 21:18 |
melwitt | dang | 21:18 |
mriedem | unless i was wrong on https://bugs.launchpad.net/nova/+bug/1813789 i had left comments about the order of events that showed the race | 21:18 |
openstack | Launchpad bug 1813789 in OpenStack Compute (nova) "Evacuate test intermittently fails with network-vif-plugged timeout exception" [Medium,In progress] - Assigned to Artom Lifshitz (notartom) | 21:18 |
artom | melwitt, brutal honesty: stay away :P It's not a can of worms you want to open | 21:19 |
mriedem | i think by "the same" i meant the fix for evacuate is similar, we need to register the callback before plugging vifs | 21:19 |
mriedem | because right now for evacuate we bind ports to the new host and then spawn the guest and it's the low level spawn in the driver that registers the callback | 21:19 |
artom | mriedem, yeah, but did we ever work out *why* that was necessary? | 21:19 |
mriedem | and we could have already gotten the response from the port bind | 21:19 |
melwitt | artom: yeah. I already got my ass kicked looking at http://status.openstack.org/elastic-recheck/#1844929 spent days digging in and no dice so far | 21:19 |
artom | mriedem, wouldn't that depend on the Neutron backend though? | 21:20 |
mriedem | artom: i think i just said why :) and it's in the bug | 21:20 |
artom | mriedem, I'm pretty sure at least some of them would wait until libvirt plugs the VIF before sending out the event | 21:20 |
melwitt | artom: but anecdotally I see http://status.openstack.org/elastic-recheck/#1813789 fail really often in my gerrit notifications so argh ... just want to fix some of these | 21:20 |
mriedem | we use ovs in the gate and for that backend neutron sends the event when the port binding host changes | 21:21 |
sean-k-mooney | artom: ovs waits to send the event yes | 21:21 |
mriedem | which is why we *don't* get the event for things like hard reboot | 21:21 |
artom | mriedem, sean-k-mooney, get your stuff in line and stop contradicting yourselves ;) | 21:21 |
* mriedem sharpens knife | 21:22 | |
artom | melwitt, I get you - it's annoying and you want to fix it | 21:22 |
artom | It's just such a mess | 21:22 |
sean-k-mooney | mriedem: do you know if we create a second port binding and activate it or just update the host | 21:22 |
sean-k-mooney | mriedem: that would change when the event is sent | 21:22 |
mriedem | evacuate doesn't use multiple port bindings like live migration | 21:22 |
*** ociuhandu has joined #openstack-nova | 21:22 | |
mriedem | when i left the only things that used multiple port bindings were live migration and cross-cell resize | 21:23 |
sean-k-mooney | ok well the condition to send the event is the port must be in the active state and be bound to a host | 21:23 |
sean-k-mooney | so since its already in an active state when we bind it in the evacuate it might send the event immediately | 21:23 |
artom | sean-k-mooney, so what mriedem was saying then | 21:23 |
sean-k-mooney | yep i havent checked it but i would guess that he is correct | 21:23 |
artom | melwitt, well, if you want to take a whack at it, you could probably use the bind-time stuff I added to the model to change when evacuate starts listening for the event | 21:24 |
melwitt | ok so the main idea is find a way to register the callback earlier on | 21:24 |
artom | But there be dragons | 21:24 |
sean-k-mooney | artom: if we created a second port binding it would not send it until we activate it | 21:24 |
melwitt | artom: thanks | 21:25 |
artom | melwitt, well, yes and no - looking at my own code, depending on whether the port has what I called "bind_time_events", you wait in the compute manager when you send the Neutron request | 21:26 |
*** xek has quit IRC | 21:26 | |
artom | And if they're "plug_time_events", you wait in the virt driver when you plug the VIFs | 21:26 |
melwitt | ahhh ok | 21:26 |
sean-k-mooney | mriedem: yes those are still the only things that use the multiple port bindings | 21:26 |
artom | melwitt, also, ask sean-k-mooney ;) | 21:27 |
artom | (Bus, meet Sean :D ) | 21:27 |
melwitt | lol | 21:27 |
sean-k-mooney | artom: then you just delete the code and start again | 21:27 |
artom | Can we do that for all of Nova? ;) | 21:27 |
melwitt | I think I vaguely get it. I can read through your patch, just knowing a generic idea of what's going on helps a lot. saves a lot of time | 21:28 |
mriedem | "if we created a second port binding it would not send it until we activate it" is a bigger non-backportable change most likely because of the behavior changes between compute and conductor | 21:29 |
artom | melwitt, ping me if you have questions / need review / whatever | 21:29 |
melwitt | thanks ++ | 21:29 |
artom | (/me needs pressure to "re-join" upstream) | 21:29 |
artom | I've been neglecting y'all | 21:29 |
mriedem | don't forget to loop dansmith into this when you want to talk about it, i'm sure he'd love to | 21:30 |
artom | You're such a good friend | 21:30 |
melwitt | artom: heh. I might end up running away from this screaming after I try to work on it, so if that doesn't happen maybe I'll ping you | 21:30 |
artom | melwitt, screaming would be a good sign, actually | 21:31 |
sean-k-mooney | ya we cant backport adopting multiple port bindings for evacuate | 21:31 |
artom | Means you're sane (inasmuch as that's still possible) | 21:31 |
sean-k-mooney | im not sure if we want to do that or not in general | 21:31 |
melwitt | lol | 21:31 |
sean-k-mooney | it might be useful but its a lot of work to untangle things and make sure it works | 21:31 |
artom | Yep. We'd also need to run it with a couple of other Neutron backends | 21:32 |
artom | IIRC we created a DNM job to run against... OVS? OVN? | 21:32 |
artom | OVB? OG? | 21:32 |
sean-k-mooney | ovs ovn and lb | 21:32 |
artom | RunDMC? | 21:32 |
melwitt | 😂 | 21:33 |
melwitt | had to use an emoji for that one | 21:33 |
sean-k-mooney | sure you "had to" :P | 21:34 |
melwitt | yeah, it wouldn't let me type anything else until I posted the emoji | 21:34 |
sean-k-mooney | did https://bugs.launchpad.net/nova/+bug/1813789 come up recently downstream or in relation to the nova-live-migration job | 21:35 |
openstack | Launchpad bug 1813789 in OpenStack Compute (nova) "Evacuate test intermittently fails with network-vif-plugged timeout exception" [Medium,In progress] - Assigned to Artom Lifshitz (notartom) | 21:35 |
sean-k-mooney | i remember talking about it a few days ago | 21:35 |
sean-k-mooney | i just dont recall the context | 21:35 |
melwitt | lyarwood mentioned it in the nova meeting | 21:35 |
sean-k-mooney | ah ya it was in context of the zuul v3 migration | 21:36 |
melwitt | just saying he hit that bug and http://status.openstack.org/elastic-recheck/#1844929 a bunch of times while trying to get some work done | 21:36 |
sean-k-mooney | yep | 21:36 |
melwitt | so I started looking at http://status.openstack.org/elastic-recheck/#1844929 and got nowhere. and now I know http://status.openstack.org/elastic-recheck/#1813789 is also horrid | 21:37 |
sean-k-mooney | so given we dont use the multiple port bindings flow i think we can assume that as long as the port is still active it will receive a bind time event rather than plug time | 21:38 |
mriedem | those probably aren't even in the same ballpark of terrible | 21:38 |
sean-k-mooney | but i would need to think that through more carefully to make sure that is correct | 21:39 |
mriedem | in the bug i linked in logs where the things were happening so it's not really a question of where the race is | 21:39 |
mriedem | though those log links are going to be dead by now | 21:39 |
mriedem | unshelve has the same issue | 21:40 |
*** ociuhandu has quit IRC | 21:40 | |
mriedem | 1. bind to new host triggers async network-vif-plugged event, 2. driver.spawn plugs vifs which sets up the callback handler | 21:40 |
mriedem | if you get the event before 2 you're stuck | 21:40 |
*** ociuhandu has joined #openstack-nova | 21:40 | |
melwitt | right, ok | 21:40 |
sean-k-mooney | mriedem: in the unshelve case you could argue that when we go to shelve offloaded the status of the port should be down | 21:41 |
sean-k-mooney | which would prevent the event being sent in the ovs case at least until its plugged in the driver | 21:42 |
mriedem | yup, i opened an old bug for that as well | 21:42 |
sean-k-mooney | but we would need to still use the bind_time vs plugtime thing to check | 21:42 |
sean-k-mooney | is the state of the port something we can control from nova? | 21:43 |
sean-k-mooney | or rather are we allowed to set it | 21:43 |
sean-k-mooney | if so we could set it to down when we do an evacuate | 21:43 |
melwitt | well, I'll try looking at it, see how it goes | 21:44 |
*** ociuhandu has quit IRC | 21:45 | |
mriedem | this is the thing i was thinking about for a shelve related bug https://github.com/openstack/nova/blob/stable/stein/nova/network/neutronv2/api.py#L3332 | 21:47 |
mriedem | looks like i updated that as part of the cross-cell series https://github.com/openstack/nova/blob/master/nova/network/neutron.py#L3339 | 21:47 |
mriedem | https://review.opendev.org/#/c/697162/ | 21:48 |
mriedem | it's all coming back to me | 21:48 |
sean-k-mooney | so in that function we would just set the binding host to None | 21:48 |
sean-k-mooney | which would unbind it | 21:49 |
sean-k-mooney | and the status should go to down as a result | 21:49 |
mriedem | yeah like how _unbind_ports works | 21:50 |
mriedem | except you can't clear the device_owner on the port when shelve offloading | 21:50 |
mriedem | the nova instance needs to continue to "own" the port | 21:50 |
sean-k-mooney | we need to keep device_id which is the nova instance uuid too but ya | 21:51 |
mriedem | let us *shelve* this discussion for 6 months from now when it comes up again :) | 21:51 |
sean-k-mooney | :) | 21:51 |
mriedem | o/ | 21:51 |
*** mriedem has left #openstack-nova | 21:51 | |
*** ircuser-1 has joined #openstack-nova | 21:55 | |
sean-k-mooney | melwitt: so calling self.network_api.cleanup_instance_network_on_host on the source node during an evacuate might allow us to use the bind vs plug time event code to determine when to wait in the evacuate case too | 21:57 |
sean-k-mooney | melwitt: it is currently only called for cross cell resize | 21:57 |
sean-k-mooney | but ya we should be doing that definitely during shelve offload to fix the shelve case | 21:58 |
melwitt | sean-k-mooney: so does that mean that you think evacuate is pretty "easy" but shelve will be more difficult? or are they a similar level of complexity | 22:01 |
*** ociuhandu has joined #openstack-nova | 22:02 | |
sean-k-mooney | shelve should be easy, evacuate might be more difficult but i think we just need to call cleanup_instance_network_on_host in the right place | 22:02 |
sean-k-mooney | so in the shelve case we should be calling it in shelve offload | 22:03 |
sean-k-mooney | and in evacuate we need to do it before we call spawn on the dest host | 22:03 |
sean-k-mooney | so pretty early on | 22:03 |
sean-k-mooney | that will put the port into an unbound state which will set the port status to down | 22:04 |
melwitt | ok. just saying it sounds like the utilities are available, just have to leverage them | 22:04 |
melwitt | with the model bind_time stuff | 22:04 |
sean-k-mooney | they were not in place until recently | 22:04 |
melwitt | I'm not saying they were, just trying to understand what's the landscape today | 22:05 |
sean-k-mooney | yep | 22:05 |
sean-k-mooney | i think they were both added last cycle so they are there from train on | 22:05 |
sean-k-mooney | they should be backportable too i think | 22:06 |
*** ociuhandu has quit IRC | 22:07 | |
melwitt | ok | 22:08 |
*** slaweq has quit IRC | 22:09 | |
*** bbowen has quit IRC | 22:10 | |
*** bbowen has joined #openstack-nova | 22:10 | |
*** damien_r has quit IRC | 22:13 | |
*** slaweq has joined #openstack-nova | 22:21 | |
*** slaweq has quit IRC | 22:26 | |
*** spatel has quit IRC | 22:27 | |
openstackgerrit | Merged openstack/nova stable/train: Unplug VIFs as part of cleanup of networks https://review.opendev.org/711251 | 22:31 |
openstackgerrit | sean mooney proposed openstack/nova master: [WIP] unbind port before evacuate and shelve offload https://review.opendev.org/713342 | 22:38 |
sean-k-mooney | melwitt: im not sure if ^ will work but i think it would be something like that, at least as a start | 22:39 |
melwitt | cool thanks sean-k-mooney | 22:39 |
*** threestrands has joined #openstack-nova | 22:51 | |
*** tkajinam has joined #openstack-nova | 22:51 | |
*** CeeMac has quit IRC | 22:59 | |
*** lbragstad has quit IRC | 23:08 | |
*** zzzeek has quit IRC | 23:21 | |
*** zzzeek has joined #openstack-nova | 23:22 | |
*** gyee has quit IRC | 23:38 | |
*** eharney has quit IRC | 23:45 | |
*** Wellie has quit IRC | 23:57 |
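
Editor's sketch of the evacuate race discussed around 21:19-21:40 (the port bind fires network-vif-plugged before driver.spawn registers the callback). This is a toy model only: `EventWaiter`, `bind_port`, `evacuate_racy` and `evacuate_fixed` are hypothetical names, not Nova or Neutron APIs, and the "event" is delivered synchronously on bind to make the ordering problem deterministic. It just illustrates why registering the waiter before binding avoids the timeout.

```python
import threading


class EventWaiter:
    """Toy stand-in for a compute-side external event handler (hypothetical)."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def prepare(self, name):
        # Register interest in an event before it can arrive.
        with self._lock:
            self._events[name] = threading.Event()

    def deliver(self, name):
        # An event with no registered waiter is simply dropped.
        with self._lock:
            ev = self._events.get(name)
        if ev is None:
            return False
        ev.set()
        return True

    def wait(self, name, timeout):
        with self._lock:
            ev = self._events[name]
        # True if the event was delivered, False if we timed out.
        return ev.wait(timeout)


def bind_port(waiter):
    # Assumption: the backend sends the event as soon as the already-active
    # port is bound to the new host.
    return waiter.deliver("network-vif-plugged")


def evacuate_racy(timeout=0.1):
    waiter = EventWaiter()
    bind_port(waiter)                       # 1. event fires, nobody listening
    waiter.prepare("network-vif-plugged")   # 2. callback registered too late
    return waiter.wait("network-vif-plugged", timeout)


def evacuate_fixed(timeout=0.1):
    waiter = EventWaiter()
    waiter.prepare("network-vif-plugged")   # register first
    bind_port(waiter)                       # event is now captured
    return waiter.wait("network-vif-plugged", timeout)


if __name__ == "__main__":
    print("racy evacuate got event:", evacuate_racy())
    print("fixed evacuate got event:", evacuate_fixed())
```

In this model the racy ordering always times out and the fixed ordering always succeeds. The real behaviour depends on the Neutron backend and on whether the port is still active when it is bound, as noted in the discussion.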