Tuesday, 2021-11-09

*** hemna9 is now known as hemna01:27
opendevreviewMerged openstack/nova master: Ignore plug_vifs on the ironic driver  https://review.opendev.org/c/openstack/nova/+/81326304:36
gibilyarwood: I saw elodilles explained the setuptools pin question. thanks elodilles 08:02
bauzashola folks08:10
bauzasgibi: I'm asked to present some PTG updates in a company session today at the same time of the upstream meeting08:10
gibibauzas: o/08:10
bauzasgibi: it would be a 2 min presentation about Nova 08:10
bauzasgibi: could you help me by chairing the meeting when I'm asked to discuss ?08:11
gibibauzas: sure08:11
bauzasI could run the meeting, then passing it to you for 5 mins 08:11
bauzasand then, either you continue or me :)08:11
gibiok08:12
gibiI will handle it when you need to switch08:13
bauzasgibi: thanks08:18
bauzasappreciated08:18
gibino worries08:22
gibilyarwood, elodilles: I'm seeing multiple guest kernel panics in stable/victoria volume related tests08:54
* gibi gather links08:54
gibi1) https://zuul.opendev.org/t/openstack/build/67c89daf17e3475cb1d632f87beeb60d/log/controller/logs/tempest_log.txt#595008:55
lyarwoodJust jumping on a call but I wonder if we were still using cirros 0.4.0 back then?08:56
lyarwood /opt/stack/devstack/files/cirros-0.5.1-x86_64-disk.img08:57
lyarwoodmaybe not08:57
gibi2) https://1a59031cf12ee85b5b8a-5c947c8d22eb7769ff9d2de46bec4cc9.ssl.cf5.rackcdn.com/810915/2/gate/nova-grenade-multinode/ebc944c/testr_results.html08:57
lyarwoodhowever I also see image.http_image               = http://download.cirros-cloud.net/0.3.1/cirros-0.3.1-x86_64-uec.tar.gz08:57
lyarwoodanyway I'll take a look after this call08:58
gibiack, I see both 0.3.1 and 0.5.1 in the logs08:58
gibihm the issue in the grenade job uses a different cirros as it has kernel 4.4.0 while the failed test case in the live migration job has kernel 5.3.009:00
gibiboth kernel stack trace shows page fault but in different processes09:01
kashyapgibi: Got a link to the traceback?09:11
gibikashyap: https://zuul.opendev.org/t/openstack/build/67c89daf17e3475cb1d632f87beeb60d/log/controller/logs/tempest_log.txt#595009:11
gibithat is one09:11
kashyapYep, finally it loaded; thanks09:12
kashyapSo, the above trace is with kernel 4.4.0?  (i.e. CirrOS 0.5.1?)09:12
gibithis one is kernel 5.309:13
gibi[   15.489062] CPU: 0 PID: 284 Comm: ip Not tainted 5.3.0-26-generic #28~18.04.1-Ubuntu09:13
kashyapYes, just saw it.  Silly me09:14
gibisorry wrong buffer09:14
gibi...09:14
gibi[   15.302770] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.3.0-26-generic #28~18.04.1-Ubuntu09:14
gibithis one is from the stack trace you are looking at09:14
gibiand that is matching with cirros 0.5.109:16
kashyapYeah, figured as much.  The trace seems to go into kernel RCU (read-copy update) code in the kernel ... which I was told can be used to "frighten small children and adults alike"09:17
kashyaps/in the kernel//09:17
gibi:)09:17
kashyapHm, I wonder what changed suddenly in stable/victoria for us to hit these09:18
gibiI can try to check how frequently we hit kernel panics in stable/victoria and when we get the increase09:18
gibiwe don't have much logs going backward in time for nova-live-migration as it was turned off for a while on stable09:26
gibihttps://zuul.opendev.org/t/openstack/builds?job_name=nova-live-migration&branch=stable%2Fvictoria09:27
gibibased on this it started failing yesterday09:27
gibibut it is small sample09:28
gibiit seems other branches (wallaby, xena, master) are not affected but only master has good amount of runs to be sure09:32
gibibut master uses cirros 0.5.209:33
gibiohh both wallaby and xena also uses 0.5.209:35
gibimaybe it is the cirros version09:35
gibiI'm wondering where we define the cirros version09:36
gibiok, that is devstack 09:37
gibithe 0.5.2. bump was this patch https://review.opendev.org/c/openstack/devstack/+/77917909:37
kashyapHmm09:37
kashyapOkay, so failing since yesterday; and only affects stable/victoria09:37
kashyap(The bump was only this year - it shouldn't affect stable/victoria?)09:39
gibistable/ussuri uses cirros 0.4.0 and it seems that is also not affected (still small sample)09:39
gibikashyap: the bump to 0.5.2 does not affect victoria, that uses 0.5.1 still as devstack has stable branches too09:39
kashyapAaah, right09:40
* kashyap back in a bit09:40
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: [stable-only]Bump cirros to 0.5.2 for live migration  https://review.opendev.org/c/openstack/nova/+/81717309:47
gibilyarwood, kashyap: that is my guess ^^ lets see what happens09:47
gibilyarwood: a totally different failure from stable/victoria https://zuul.opendev.org/t/openstack/build/3f48404a55904986b6f5bcd2ce7d1908/log/job-output.txt#251709:47
gibidie 276 'Support for rhel8 is incomplete: no support for installing packages'09:48
lyarwoodhmm that only adds in the ahci module so I dunno maybe09:48
gibiit is from tempest-integrated-compute-centos-8-stream09:48
lyarwoodoh I think that has never worked but I've added it in on master09:49
lyarwoodand because tempest is branchless 09:49
lyarwoodfun times09:49
gibi:)09:49
* lyarwood checks09:49
lyarwoodhttps://review.opendev.org/c/openstack/tempest/+/797614 was the change09:49
lyarwoodlanded overnight09:49
lyarwoodso I need to add a branch conditional in there I guess09:50
gibifor which branch?09:51
gibiahh I see it passed on master 09:52
gibibut now it fails on master too https://zuul.opendev.org/t/openstack/build/a637bf6e68c545e59c8d091393a2307e/log/job-output.txt but with a totally different issue09:52
gibiwith botocore version conflict09:53
lyarwoodhmmm https://review.opendev.org/c/openstack/devstack/+/688614 is in stable/victoria 09:56
lyarwoodoh it's CentOSStream09:57
* lyarwood facepalm09:57
lyarwoodhttps://review.opendev.org/c/openstack/devstack/+/803023 was rejected so I need the branch conditional in master gah09:59
gibiI've opened a gate bug for the tempest-integrated-compute-centos-8-stream job failing on master with version conflict as it seems to be 100% hit 10:00
gibitempest-integrated-compute-centos-8-stream10:00
gibihttps://bugs.launchpad.net/nova/+bug/195029110:01
lyarwoodweird10:03
lyarwood`Cannot install cinder because these package versions have conflicting dependencies.` FWIW10:04
gibiinteresting, nothing recent in cinder bumped version10:10
gibiand nothing in the requirements repo since the 6th10:10
gibithis was on the 6th bumping boto https://review.opendev.org/c/openstack/requirements/+/816611/2/upper-constraints.txt#31410:12
gibisorry 4th10:12
lyarwoodgibi: did you have a bug for the stable/victoria issue?10:15
* lyarwood will raise one if not10:15
gibinope10:15
lyarwoodack10:15
gibiplease raise one10:15
kashyapgibi: lyarwood: Catching up ... is it because of this not merging yet? https://review.opendev.org/c/openstack/devstack/+/803023 (fix is_fedora for centos 8 stream)10:18
lyarwoodyeah but as we didn't support centos8stream at that point we shouldn't backport this anyway10:20
lyarwoodI'll just land some regex shortly to fix this10:20
opendevreviewJun Chen proposed openstack/nova master: Catch an exception in power off procedure  https://review.opendev.org/c/openstack/nova/+/81717610:21
kashyaplyarwood: Ah, noted.  Sorry, regex based on what?  To selectively check if 8stream is available, if not fallback to vanilla CentOS?10:23
lyarwoodkashyap: regex to stop the centos8stream job from running on branches older than wallaby10:24
lyarwoodhttps://review.opendev.org/c/openstack/tempest/+/81717910:24
kashyapAah, like that; thx!10:25
opendevreviewLee Yarwood proposed openstack/nova stable/victoria: DNM - Test integrated-gate-compute fix for centos8stream  https://review.opendev.org/c/openstack/nova/+/81718010:25
lyarwood^ testing here10:25
gibiI cannot reproduce the version conflict locally seen in https://bugs.launchpad.net/nova/+bug/195029110:28
gibibut I don't have python3.6 :/10:30
* gibi installing py3.610:32
fricklergibi: that looks like a failure in the index from pypi CDN, there is no conflict if you look at the version numbers. pip just fails to generate a proper error when it can't find that specific version in the index10:35
gibifrickler: ohh, good point, then I guess the error will go away after the recheck10:35
bauzasgibi: should we mark the bug Critical as it holds the gate ?10:35
bauzashttps://bugs.launchpad.net/nova/+bug/1950291 10:35
gibibauzas: wait a bit, frickler has an explanation that might mean it was a transient only10:36
bauzasok10:36
gibiI have recheck running now 10:36
bauzasthat's what I see10:36
* gibi stops installing py3.6 locally :D10:36
fricklerwe have some way of telling the CDN to refresh its cache, I can look that up in a bit10:36
gibistill failing with boto conflict after recheck https://zuul.opendev.org/t/openstack/build/0ba1dd59972d48a98fe29b47dbb82e1e/log/job-output.txt10:41
bauzasgibi: marking it Critical until we figure out a better vision10:42
bauzasgibi: just to be clear, this is an unrelated issue from the stable/victoria gate, right?10:43
bauzashere, we have centos-stream on wallaby and later10:43
gibibauzas: right 10:43
bauzasand I see lyarwood providing a tempest fix for the stable branches that are impacted10:44
fricklerbauzas: gibi: I did "curl -XPURGE https://pypi.org/simple/botocore" and the same without the "/botocore". please try another recheck10:44
gibifrickler: ack I will10:45
gibiand thanks10:45
* bauzas needs to go off10:45
bauzasbut I'll scroll when I'm back10:45
fricklerif it is still failing for jobs starting now, please ping infra-root in #opendev, I'll be afk for a bit10:46
kevkosean-k-mooney: hi, here ? :) 10:46
gibifrickler: ack thanks11:03
opendevreviewLee Yarwood proposed openstack/nova master: libvirt: Create qcow2 disks with the correct size without extending  https://review.opendev.org/c/openstack/nova/+/77927511:03
gibilyarwood: fyi cirros 0.5.2 is not a solution for the kernel panic as  https://review.opendev.org/c/openstack/nova/+/817173 still triggers it11:10
lyarwoodgibi: Yeah I didn't think it would tbh11:24
gibiso back to square one11:24
gibilyarwood: should I file a bug for the kernel panic problem on stable/victoria? 12:00
gibior you already did?12:00
lyarwoodI haven't so go ahead12:00
gibiok12:01
gibiI will do12:01
gibilyarwood: https://bugs.launchpad.net/nova/+bug/195031012:09
gibilyarwood: could this be the appearance of our old volume detach bug ^^ where the fix was the redesigned detach code in https://review.opendev.org/q/topic:bug/188252112:21
gibithat was backported only to wallaby12:21
gibiand I do see in the nova log that the _do_wait_and_retry_detach function goes through the 7 iteration12:22
opendevreviewMerged openstack/nova master: Remove SESSION_CONFIGURED global from DB fixture  https://review.opendev.org/c/openstack/nova/+/81568913:21
gibifrickler: seems your PURGE command helped, later runs do not hit the boto version conflict13:26
fricklergibi: great, thanks for confirming13:27
gibilyarwood: I backported the libvirt event based detach series to stable/victoria let's see if that helps with the kernel panic13:51
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: libvirt: Define and emit DeviceRemovedEvent and DeviceRemovalFailedEvent  https://review.opendev.org/c/openstack/nova/+/81720913:51
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: libvirt: add AsyncDeviceEventsHandler  https://review.opendev.org/c/openstack/nova/+/81721013:51
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: libvirt: allow querying devices from the persistent domain  https://review.opendev.org/c/openstack/nova/+/81721113:53
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: libvirt: parse alias out from device config  https://review.opendev.org/c/openstack/nova/+/81721213:56
kevkodansmith: commented on https://review.opendev.org/c/openstack/nova/+/81703013:57
opendevreviewMerged openstack/nova master: Refactor Database fixture  https://review.opendev.org/c/openstack/nova/+/81569013:58
opendevreviewMerged openstack/nova master: Use ReplaceEngineFacade fixture  https://review.opendev.org/c/openstack/nova/+/81682013:58
opendevreviewMerged openstack/nova master: Fix interference in db unit test  https://review.opendev.org/c/openstack/nova/+/81473513:59
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: Replace blind retry with libvirt event waiting in detach  https://review.opendev.org/c/openstack/nova/+/81721414:00
gibielodilles: this probably interest you too ^^14:01
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: Move the guest.get_disk test to test_guest  https://review.opendev.org/c/openstack/nova/+/81721514:03
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: libvirt: Remove dead error handling code  https://review.opendev.org/c/openstack/nova/+/81721614:03
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: Move instance power state check to _detach_with_retry  https://review.opendev.org/c/openstack/nova/+/81721714:03
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: Consolidate device detach error handling  https://review.opendev.org/c/openstack/nova/+/81721814:03
opendevreviewBalazs Gibizer proposed openstack/nova stable/victoria: Parse alias from domain hostdev  https://review.opendev.org/c/openstack/nova/+/81648614:32
elodillesgibi: wow, 10 patches for a single bug fix? :-o14:40
gibielodilles: you know that, it is the libvirt event based device detach series14:43
gibiit was backported to wallaby and now I backported it to victoria14:43
elodillesgibi: oh, that's soooo... May... o:D14:48
gibi:D14:49
stephenfingibi: Okay with me backporting those DB test changes?14:55
gibistephenfin: which one?14:56
stephenfinhttps://review.opendev.org/c/openstack/nova/+/814735 and company14:56
gibiI'm already on it14:56
stephenfinoh, great :D14:56
bauzasreminder: nova team meeting in 1 hour15:00
bauzas... here15:00
bauzas(sorry, forgot to tell)15:00
opendevreviewsean mooney proposed openstack/nova master: This change replaces all hardcoded tox enve with generative envs  https://review.opendev.org/c/openstack/nova/+/80429215:02
sean-k-mooneystephenfin: that ^ still has you -2 on it can you remove it so we can proceed with the review15:03
stephenfinoh yeah, sure15:03
sean-k-mooneyi probably have typos and other issues in it but for the most part i think it's ready to review15:04
opendevreviewAlexey Stupnikov proposed openstack/nova master: Test aborting queued live migration  https://review.opendev.org/c/openstack/nova/+/77625015:07
clarkbfrickler: bauzas gibi it happens when new deps are released and then we pin them in constraints because pypi has a fallback for its CDN lookups that tends to run out of date by a couple of weeks it seems15:10
clarkbopenstack notices because our requirements system is really good at bumping and constraining new deps15:10
gibiclarkb: I see. Is it easy to detect when this happen?15:10
gibijust by looking at the conflict I did not figure out15:11
bauzasgibi: fwiw, this bug is still Critical, so we'll discuss it at the meeting15:11
bauzaselodilles: man, you updated the stable section in https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting, right?15:12
gibibauzas: we can close the bug, frickler's purge solved the issue15:12
elodillesbauzas: yes15:12
bauzaselodilles: can you please at the agenda, I got a merge conflict15:12
bauzasgibi: ack, please do15:12
clarkbgibi: no, one of the bugs that someone could file is against pip to output a better error message. Maybe ideally have it print out the versions it did find15:12
gibibauzas: on it15:13
bauzasgibi: thanks15:13
gibiclarkb: I see15:13
elodillesbauzas: sorry :S15:14
elodillesbauzas: is there anything I should do now regarding the wiki page? :S15:15
bauzaselodilles: just looking at what you provided15:16
bauzaselodilles: when I merged, I could not have seen some modification you provided15:16
elodillesbauzas: if it helps to you just delete my change and I'll add it again15:19
bauzaselodilles: nah, should be ok15:19
elodillesack15:19
elodillesI'll try to remember to sync with you next week to avoid another merge conflict o:)15:20
opendevreviewBalazs Gibizer proposed openstack/nova stable/xena: Remove SESSION_CONFIGURED global from DB fixture  https://review.opendev.org/c/openstack/nova/+/81723615:32
opendevreviewBalazs Gibizer proposed openstack/nova stable/xena: Refactor Database fixture  https://review.opendev.org/c/openstack/nova/+/81723715:33
opendevreviewBalazs Gibizer proposed openstack/nova stable/xena: Use ReplaceEngineFacade fixture  https://review.opendev.org/c/openstack/nova/+/81723915:35
opendevreviewBalazs Gibizer proposed openstack/nova stable/xena: Fix interference in db unit test  https://review.opendev.org/c/openstack/nova/+/81724015:35
gibistephenfin: here are the backports 15:35
bauzasnova meeting in 3 mins15:57
bauzas#startmeeting nova16:01
opendevmeetMeeting started Tue Nov  9 16:01:09 2021 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:01
opendevmeetThe meeting name has been set to 'nova'16:01
gibio/16:01
bauzasI'll pass the baton to gibi for a few mins16:02
bauzas#chair gibi16:02
opendevmeetCurrent chairs: bauzas gibi16:02
elodilleso/16:02
bauzas#link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting16:03
gibibauzas: if everybody from RH is on the meeting where you will present then we might not need this meeting :)16:03
* bauzas facepalms16:03
bauzasI dunno16:03
artomIt's mostly a listening meeting, so we can lurk in both16:03
gibiahh I see16:04
artomBut yeah, active participation will be... patchy16:04
bauzaslet's start and we'll see 16:04
gibiok16:04
bauzas#topic Bugs (stuck/critical)16:04
bauzas#info No Critical bug16:04
bauzasthanks gibi for triaging the one16:04
bauzas#link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 25 new untriaged bugs (+3 since the last meeting)16:05
bauzas#help Nova bug triage help is appreciated https://wiki.openstack.org/wiki/Nova/BugTriage16:05
bauzas#link https://storyboard.openstack.org/#!/project/openstack/placement 32 open stories (+0 since the last meeting) in Storyboard for Placement 16:05
bauzasanything to discuss about bugs ?16:05
bauzasok, let's move on to the gate status16:06
bauzas#topic Gate status 16:06
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:06
gibiwe had an intermittent failure this morning on master16:07
gibibut it is resolved now16:07
bauzasyeah16:07
bauzaswe have a few other new bugs 16:07
bauzaslike https://bugs.launchpad.net/nova/+bug/195031016:07
bauzas(easy one to triage, btw.)16:07
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status 16:08
bauzaswe again had an issue with placement-nova-tox-functional-py3816:08
bauzas#link https://zuul.openstack.org/build/0c6c18cff1d74f99a6f1a19913f35818 issue with placement-nova-tox-functional-py38 last run16:08
gibithat is the mysterious 16:09
gibi/bin/sh: 1: Syntax error: "(" unexpected16:09
gibihm16:09
bauzashttps://zuul.openstack.org/build/0c6c18cff1d74f99a6f1a19913f35818/log/job-output.txt#80216:09
gibiit is again the tox showconfig  role16:10
bauzasgibi: I need to pass you the baton now for 5-ish mins16:10
gibiack16:10
gibianyhow I will look into that placement failure I feel we handled this before16:10
gibianything else on the gate status?16:10
gibi#topic Release Planning 16:12
gibiYoga-1 is due Nov 18th #link https://releases.openstack.org/yoga/schedule.html#y-116:12
gibiwhich means 16:12
gibi#info Spec review day on Tuesday Nov 16th16:12
gibiwhich is next Tuesday16:12
gibianything else about release planning?16:12
gibi#topic Review priorities 16:14
gibihttps://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement)+label:Review-Priority%252B116:14
gibinot a huge list16:14
gibiand most of them has feedback already 16:15
gibi#link https://review.opendev.org/c/openstack/nova/+/816861 bauzas proposing a documentation change for helping contributors to ask for reviews16:15
gibiI will definitely review that ^^ but probably not today16:15
gibiany comment / question about review priorities?16:16
gmannI will also check today16:16
* bauzas is back16:18
* gibi hands back the baton16:18
bauzasyeah, so I wrote this one16:18
bauzasI know we had concerns during the PTG but I'd love to see comments in https://review.opendev.org/c/openstack/nova/+/81686116:19
bauzasok, let's move16:20
bauzas#topic Stable Branches 16:20
bauzaselodilles: your time16:20
elodillesvictoria and ussuri are blocked until tempest fix lands: https://review.opendev.org/c/openstack/tempest/+/81717916:20
elodillesno news yet regarding the investigation of the 'volume detach' failures that requires many rechecks on multiple stable branches16:20
elodillesUssuri Extended Maintenance transition is scheduled this week (Nov 12)16:21
elodillesfinal release patch proposed: https://review.opendev.org/c/openstack/releases/+/81722616:21
gmannwill check tempest one 16:22
elodillesstill, the list of open and unreleased patches if someone is interested: https://etherpad.opendev.org/p/nova-stable-ussuri-em16:22
bauzas++16:22
elodillesand patches that need one +2 on ussuri: https://review.opendev.org/q/project:openstack/nova+branch:stable/ussuri+is:open+label:Code-Review%253E%253D%252B216:22
bauzasI need to do homework16:22
elodillesgmann: thanks in advance!16:22
bauzaswe have a large list of ussuri changes16:23
elodilleslet me know if something needs to be fit into the final release and I'll hold the release patch until16:23
bauzasbut I'll try to review a few of them I think are important16:23
bauzasI could ask other company folks if they're interesting16:23
bauzasinterested*16:23
bauzasthat said, just saying out loud, my own company isn't getting fully interested in ussuri backports for obvious reasons16:24
gibilooking at the list a lot of them are not merged to victoria yet16:24
gmann+A on tempest fix16:24
gibigmann: thanks for that16:24
gibi!16:24
elodillesyes. only some could be reasonably quickly merged16:24
elodillesgmann: \o/16:24
gibielodilles: do you have some links for "no news yet regarding the investigation of the 'volume detach' failures that requires many rechecks on multiple stable branches"16:25
elodillesand we are close to deadline16:25
bauzasyup16:25
gibielodilles: is it related to the recently seen kernel panics on stable/victoria ?16:25
bauzasI can look at that tomorrow16:25
elodillesgibi: not really, as there were not so much activity on stable nowadays16:25
elodillesgibi: but yes, it could be related to the kernel panic issue as well16:26
gibias for that I have a huge packport to see if helps16:26
bauzasa packport ? nice16:26
gibibackport :D16:26
elodillespun intended :D16:27
gibibauzas: https://review.opendev.org/q/topic:bug/1882521 if you are interested :D16:27
bauzasalways interested in eating reviews16:27
gibi(and yes it is -1 all over as we need the tempest fix gmann just approved)16:27
bauzasyeah16:27
bauzasthis doesn't help btw.16:27
bauzasanyway16:28
bauzasmoving on ?16:28
bauzas#topic Sub/related team Highlights 16:28
bauzasLibvirt (lyarwood)16:28
bauzasI guess he's not there16:29
bauzasno worries, we can punt this to next week16:29
bauzas#topic Open discussion 16:29
bauzasOff-path Network Backends spec re-review https://review.opendev.org/c/openstack/nova-specs/+/787458 after addressing PTG comments (dmitriis)16:29
bauzasdmitriis: around ?16:30
dmitriisyep16:30
dmitriisOne of the asks during the PTG was that the Neutron cores review the Neutron spec: https://review.opendev.org/c/openstack/neutron-specs/+/788821/16:31
dmitriisThere is some progress on that, I am waiting for a second +2 (hopefully there will be some more feedback today).16:31
dmitriisI updated the spec with some of the points that were discussed during the PTG as well16:31
bauzas\o/16:32
bauzasso I guess it's our turn ?16:32
dmitriisThat would be much appreciated :^)16:32
bauzasok, so just a ping for reviews ? :)16:32
bauzasnothing you wanna discuss with the team by now ?16:32
bauzassome open left question, maybe ?16:33
dmitriisyes, just a ping for now16:33
dmitriistrying to get some eyes on it early since we are getting closer to the spec freeze and holidays16:33
bauzasdmitriis: I guess you saw we plan a spec review day ?16:34
bauzasdmitriis: this doesn't mean we won't review your spec *before*16:34
bauzasbut we would appreciate if you could be around on this particular day16:35
dmitriisbauzas: yes, I plan to be around for that and ready to address feedback16:35
bauzasdmitriis: excellent, thanks16:35
bauzasgiven the size of the spec, first runs of reviews will be needed before the spec review day16:35
bauzasbut this helps to know you'll be around16:36
dmitriisbauzas: it had some rounds of reviews around April/May 2021 already16:36
dmitriisbut, yes, I think early views would be preferred16:36
bauzas:)16:37
bauzasok, I guess we consumed the whole agenda16:37
dmitriisthere were some external dependencies in Libvirt and OVN that got merged recently (so this is out of the way). During the PTG we agreed that the Neutron spec needs to be reviewed first and that I need to address some additional points16:37
dmitriisack16:37
bauzasdmitriis: yup, indeed16:37
bauzasdmitriis: but yeah, I get the fact the dependencies are now solved16:38
bauzasso it's our turn16:38
bauzasanyone wanting to raise anything before we shutdown the meeting ?16:39
gibi-16:40
bauzas#endmeeting16:40
opendevmeetMeeting ended Tue Nov  9 16:40:20 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:40
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-09-16.01.html16:40
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-09-16.01.txt16:40
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-09-16.01.log.html16:40
dmitriiso/16:40
elodilleso/16:40
bauzashah, fancy https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-09-16.01.html16:40
bauzasI fixed the use of the #link command and the topics16:41
bauzasI guess I need to make sure we provide an #info command per topic16:42
* bauzas tries to make our minutes more readable16:42
opendevreviewBalazs Gibizer proposed openstack/nova master: Apply common irrelevant_files for centos 8 job  https://review.opendev.org/c/openstack/nova/+/81727816:48
gibilyarwood: ^^ on tweak for the new job16:49
gibi*one16:49
lyarwoodgibi: it's part of tempest-integrated-compute so is this really needed?16:51
lyarwoodah wait the template is called something different my bad16:51
gibiyeah, I noticed that it was run on https://review.opendev.org/c/openstack/nova/+/814735 but no other tempest job run there16:51
opendevreviewBalazs Gibizer proposed openstack/placement stable/xena: Use 'functional-without-sample-db-tests' tox env for placement nova job  https://review.opendev.org/c/openstack/placement/+/81725517:02
opendevreviewBalazs Gibizer proposed openstack/nova stable/xena: Define new functional test tox env for placement gate to run  https://review.opendev.org/c/openstack/nova/+/81725617:02
gibibauzas, gmann: ^^ these backports are needed to make the periodic placement test run green on stable/xena17:02
gibias per https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly17:02
opendevreviewBalazs Gibizer proposed openstack/placement stable/xena: Use 'functional-without-sample-db-tests' tox env for placement nova job  https://review.opendev.org/c/openstack/placement/+/81725517:04
opendevreviewStephen Finucane proposed openstack/nova master: Use unittest.mock instead of third party mock  https://review.opendev.org/c/openstack/nova/+/71467617:05
gibistephenfin: on the backport of the Database fixture fix, I think we need to backport https://review.opendev.org/c/openstack/nova/+/810291 as well17:15
gibior at least I see that as a difference between master and xena and my backport on xena now fails mysteriously https://zuul.opendev.org/t/openstack/build/d7c064c8981b40618e3d24fc221c1832/log/job-output.txt17:16
kevkoanyone to help me investigate nova/neutron problem :/17:18
gibianyhow I gave up for today17:18
sean-k-mooneygibi: i have not reviewed that but skimming it quickly it seems like a small enough change17:18
gibisean-k-mooney: me neither, I probably need to pull it in apply it to xena and see if it resolves the test failure with the xena backport17:19
kevkosean-k-mooney: hi, i patched nova code to see how much time spent to get event about vif plugged from neutron 17:19
kevkoon my test environment it is about 10 - 40 sec ..sometimes it is higher sometimes it is lower ..what is strange that sometimes when I run heat stack ..it is quite fast and I can see debug log message about vif event ..sometimes it is long time .. :(17:21
sean-k-mooneykevko: it sounds like when there are a lot of vms starting it presumably gets longer17:24
sean-k-mooneyare you seeing them get close to the 300 time out or are they still generally below that17:25
kevkosean-k-mooney: nope, it is really low17:26
kevkosean-k-mooney: https://paste.opendev.org/show/810887/17:26
kevkostack is always same .. 6 small cirros instances 17:26
kevkoopenstack is clean testing env ..so no other processes running ...just my stack is building ..17:27
sean-k-mooneykevko: that points to this not being a general performance problem then so increasing the timeout won't help17:28
kevkosean-k-mooney: yeah, something is somewhere buggy :D probably in neutron ..17:28
sean-k-mooneyyou will have to start correlating the nova and neutron logs to see if/when the ovs ports are created and what happens17:28
kevkoon neutron-server side i can see this -> 2021-11-09 16:40:42.749 8 ERROR neutron.agent.dhcp.agent [-] Unexpected number of DHCP interfaces for metadata proxy, expected 1, got 2 17:29
sean-k-mooneyya it's either a problem with libvirt creating the tap and adding it to ovs or a problem in the neutron l2 agent17:29
kevkohmm, If you have a time ..I can give you access to that LAB env 17:30
kevkosean-k-mooney: or give logs ? 17:30
kevkosean-k-mooney: because I don't know if I am able to debug it :/ ..trying whole day 17:31
sean-k-mooneyif you can share logs from 5-10min before/after the vm failed for the neutron l2 agent and nova-compute agent that should be enough17:32
sean-k-mooneyi can try and take a look but unfortunately i probably won't be able to fully debug this for you17:32
kevkook, give me minute17:33
sean-k-mooneyreally the way to approach this is look for the point at which nova/libvirt create the domain which will in turn create the port and get the timestamp17:33
sean-k-mooneythen you need to look at the l2 agent log and see if it starts processing the port in the treat_ports function17:33
sean-k-mooneykevko: this is the code that should configure the port after it's added https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L192517:34
kevkosean-k-mooney: https://debian.kevko.ultimum.cloud/neutron-openvswitch-agent.log17:37
sean-k-mooneydo you know the uuid of the port/tap name or mac17:38
kevkonova-compute 17:16:53.188 line 17:39
kevkosean-k-mooney: probably this ? 17:40
kevko2021-11-09 17:11:24.974 7 DEBUG neutron.agent.resource_cache [req-72e6674d-a4b6-4040-b62f-e7983c5c74f3 f21b4913a25d411fa774338091bd105a 5bd5561af79540c38df13222dce135f6 - - -] Resource Port 8d3373f0-6329-4252-8114-fc981873e0fb updated (revision_number 21->22). Old fields: {'dns': PortDNS(current_dns_domain='',current_dns_name='',dns_domain='',dns_name='',port_id=8d3373f0-6329-4252-8114-fc981873e0fb,previous_dns_domain='',previo17:40
kevkous_dns_name=''), 'device_id': '', 'bindings': [PortBinding(host='',port_id=8d3373f0-6329-4252-8114-fc981873e0fb,profile={},status='ACTIVE',vif_details=None,vif_type='unbound',vnic_type='normal')], 'device_owner': ''} New fields: {'dns': PortDNS(current_dns_domain='',current_dns_name='',dns_domain='',dns_name='prod-p0000000001-s0000000001-uan',port_id=8d3373f0-6329-4252-8114-fc981873e0fb,previous_dns_domain='',previous_dns_name=17:40
kevko''), 'device_id': 'f01680bd-ba12-4029-b11f-b2d5ae848818', 'bindings': [PortBinding(host='compute0',port_id=8d3373f0-6329-4252-8114-fc981873e0fb,profile={},status='ACTIVE',vif_details=None,vif_type='unbound',vnic_type='normal')], 'device_owner': 'compute:nova'} record_resource_update /usr/lib/python3/dist-packages/neutron/agent/resource_cache.py:18517:40
kevkofound via instance id 17:40
sean-k-mooneyok so that is in the log 17:42
sean-k-mooney021-11-09 17:05:32.150 7 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-70183bba-0380-45d0-afef-7834c5644b2a - - - - -] Starting to process devices in:{'current': {'8d3373f0-6329-4252-8114-fc981873e0fb', 'f2b15696-b359-4216-a22a-804ebf285332', 'cc1609e0-d7ae-45a1-9405-1c95bb8dabf1', '05295e05-4fc2-4c00-ba77-8e4ff57b2ae3'}, 'added': set(), 'removed':17:42
sean-k-mooneyset(), 'updated': {'8d3373f0-6329-4252-8114-fc981873e0fb', 'f2b15696-b359-4216-a22a-804ebf285332', '05295e05-4fc2-4c00-ba77-8e4ff57b2ae3'}, 're_added': set()} rpc_loop /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:266217:42
sean-k-mooneyand the status is set to UP at 17:05:4117:43
sean-k-mooney2021-11-09 17:05:41.766 7 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-70183bba-0380-45d0-afef-7834c5644b2a - - - - -] Setting status for 8d3373f0-6329-4252-8114-fc981873e0fb to UP _bind_devices /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:120217:43
sean-k-mooneyso the device configuration completes in the agent at 17:05:4417:45
sean-k-mooneya side effect of setting the status up should be calling the provisioning blocks code, which will eventually send the event to nova17:46
sean-k-mooneykevko: that's interesting, i'm seeing it repeat later in the log too17:50
kevkowell, i think if something is broken ..it is trying to spawn the instance again, no ? 17:51
sean-k-mooneynot on the same host17:52
sean-k-mooneythe current host appears to be compute017:53
sean-k-mooneythe revision_number 23->24 update is definitely going from bound to compute0 with status down to compute0 with status up17:55
sean-k-mooneywhich correlates with the 21->22 detail above17:56
sean-k-mooneyit looks like the issue is elsewhere, perhaps in the dhcp agent or neutron server17:56
sean-k-mooneykevko: for the event to be sent both the l2 agent and dhcp agent need to notify the neutron server that the provisioning is complete17:57
sean-k-mooneysince the l2 agent seems to be working correctly the next most likely candidate is the dhcp agent being slow when many vms are created17:58
kevko6 vms ? :/17:58
sean-k-mooneyif this is the issue, it's likely that there is a bug in the configuration that is causing the agent to block/hang for some reason18:01
sean-k-mooneyit's not really a performance issue18:01
sean-k-mooneywe have had bugs in the interaction with dnsmasq in the past18:01
sean-k-mooneykevko: in any case when the l2 agent sets the port status as active it executes this code, which marks it complete for the l2 agent18:02
sean-k-mooneyhttps://github.com/openstack/neutron/blob/9241c76b04e6745cc648ee42037cfe6ddad3600a/neutron/plugins/ml2/rpc.py#L312-L33118:02
sean-k-mooneyif both sides had completed the provisioning the event would have been sent18:02
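The "both sides must report" behaviour discussed here can be pictured with a toy model. The sketch below is not neutron's implementation: the class `ProvisioningTracker` and its method names are hypothetical, and the entity labels "L2" and "DHCP" are just strings chosen for the illustration. It only shows the idea that the notification fires once the last pending entity reports completion.

```python
class ProvisioningTracker:
    """Toy model: fire a callback once every registered entity reports done."""

    def __init__(self, on_complete):
        self._pending = {}  # port_id -> set of entities still provisioning
        self._on_complete = on_complete

    def add_component(self, port_id, entity):
        """Register an entity that must finish before the port is ready."""
        self._pending.setdefault(port_id, set()).add(entity)

    def provisioning_complete(self, port_id, entity):
        """Mark one entity as done; notify when none remain."""
        remaining = self._pending.get(port_id)
        if remaining is None:
            return  # nothing registered (or already completed) for this port
        remaining.discard(entity)
        if not remaining:
            del self._pending[port_id]
            self._on_complete(port_id)

    def pending(self, port_id):
        """Return a copy of the entities still outstanding for the port."""
        return set(self._pending.get(port_id, ()))


sent = []  # stands in for "event sent to nova"
tracker = ProvisioningTracker(on_complete=sent.append)
port = "8d3373f0-6329-4252-8114-fc981873e0fb"
tracker.add_component(port, "L2")
tracker.add_component(port, "DHCP")

tracker.provisioning_complete(port, "L2")
# The l2 side is done but DHCP is still pending, so nothing is sent yet.
after_l2 = (list(sent), tracker.pending(port))
print(after_l2)

tracker.provisioning_complete(port, "DHCP")
print(sent)
```

In this model a VM that never receives its event corresponds to one entity never calling `provisioning_complete`, which matches the suspicion that one of the agents is stuck while the other has finished.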
kevkobug in configuration ? 18:03
kevkoyeah, i saw some fixed bugs on launchpad18:04
sean-k-mooneyin the neutron server you should see one of these two logs noting that the l2 agent has completed its provisioning https://github.com/openstack/neutron/blob/9241c76b04e6745cc648ee42037cfe6ddad3600a/neutron/db/provisioning_blocks.py#L133-L14018:05
kevkosean-k-mooney: nothing, i have wallaby btw18:10
sean-k-mooneyi don't think this has changed much from wallaby to master18:11
sean-k-mooneyit might be best to take this to the neutron channel, but it would seem that for whatever reason the port status change is not propagating to the neutron server then18:12
kevkodo you want ssh key to that lab ? 18:12
sean-k-mooneyunfortunately i have some other work i need to get done so i'm not sure i can really support debugging this much beyond what i have already done18:13
kevkosean-k-mooney: ok, no problem, thank you very much ...18:14
kevkobtw, I have neutron server set to Debug = False ..so that's the reason why I am not seeing those debug messages .. 18:14
sean-k-mooneyah ya, these are debug only since it's a bit verbose18:15
kevkook, have to go ...thank you very much 18:16
opendevreviewArtom Lifshitz proposed openstack/nova master: DNM: Run OVS job with hybrid plug  https://review.opendev.org/c/openstack/nova/+/81730320:16
opendevreviewArtom Lifshitz proposed openstack/nova master: DNM: Run OVS job with hybrid plug  https://review.opendev.org/c/openstack/nova/+/81730320:18
opendevreviewDan Smith proposed openstack/nova master: WIP: Revert project-specific APIs for servers  https://review.opendev.org/c/openstack/nova/+/81620620:26
dansmithgmann: lbragstad: my brain is fried from ^ so use extra caution while reviewing20:26
dansmithhowever, I do think that's much easier to read than what was there before, and hopefully makes the iteration from current..scope..nolegacy more clear20:27
gmanndansmith: thanks, ack20:35
lbragstaddansmith sweet - thanks20:45
hyang[m]Hi there, can someone help to review https://review.opendev.org/c/openstack/nova/+/811521? It can help to close both https://bugs.launchpad.net/nova/+bug/1943969 and https://bugs.launchpad.net/neutron/+bug/194261521:06
artomHah, so revert resize is broken with ovs + hybrid plug22:14
artomhttps://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f60/817303/2/check/nova-ovs-hybrid-plug/f60d54c/testr_results.html22:15
artomWe first noticed this downstream, and now that ^^ tested it upstream, same result22:15
gmanndansmith: lbragstad johnthetubaguy[m] I created this wikitable to audit all the nova API policy - https://wiki.openstack.org/wiki/Nova/rbac 22:19
gmannI have kept a few as "?", mainly the multi-policy ones, for example showing the host_status policy in GET /servers. please review those. 22:20
gmanndansmith: lbragstad johnthetubaguy[m] I have updated those as per my understanding and with the new direction we agreed on Wed. My eyes are hurting now after listing/auditing these ~225 policies. will catch up on this tomorrow. 22:24
dansmithgmann: wow, I thought you were going to do it in a google sheet or something22:32
dansmithI'm sure your eyes are literally bleeding now :/22:32