Tuesday, 2021-09-07

opendevreviewJorhson Deng proposed openstack/nova master: recheck the attachment_id after the reschedule successful  https://review.opendev.org/c/openstack/nova/+/79620901:49
opendevreviewAde Lee proposed openstack/nova master: Add check job for FIPS  https://review.opendev.org/c/openstack/nova/+/79051903:15
opendevreviewWenping Song proposed openstack/placement master: Dropping lower constraints testing  https://review.opendev.org/c/openstack/placement/+/78786307:57
bbezakHi - Can we make a release os-vif for Victoria? 2.2.1? this bug is quite critical - https://bugs.launchpad.net/os-vif/+bug/1892132. And it is not part of 2.2.0 release08:45
sean-k-mooneyit's not actually a bug, it was a kernel ABI break09:02
sean-k-mooneybut yes we can probably do a release, we just need to propose it to the releases repo09:02
sean-k-mooneybbezak: release management is automated via a git repo09:03
sean-k-mooneyso we just need to update this with the correct sha https://github.com/openstack/releases/blob/master/deliverables/victoria/os-vif.yaml09:04
sean-k-mooneythere is a script in the release repo for doing that or you can do it by hand09:04
sean-k-mooneyhttps://github.com/openstack/os-vif/compare/2.2.0...stable/victoria09:05
kashyapsean-k-mooney: What's the kernel ABI break?09:06
sean-k-mooneylooking at the delta it does not look like there is much else included but releases are pretty cheap09:06
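A rough sketch of what the release proposal discussed above amounts to, assuming the deliverable file keeps a `releases` list whose entries carry a `version` plus a per-repo `hash`. The helper names `next_bugfix_version` and `add_release` and the placeholder hashes are hypothetical; the real file would be edited with the script in the releases repo or by hand.

```python
def next_bugfix_version(version):
    """Bump the last component of an X.Y.Z version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def add_release(deliverable, repo, commit_hash):
    """Append a bugfix release entry after the latest listed release."""
    last_version = deliverable["releases"][-1]["version"]
    entry = {
        "version": next_bugfix_version(last_version),
        "projects": [{"repo": repo, "hash": commit_hash}],
    }
    deliverable["releases"].append(entry)
    return entry


# In-memory stand-in for deliverables/victoria/os-vif.yaml (assumed layout).
# The hashes are placeholders, not real commit ids.
victoria = {
    "releases": [
        {
            "version": "2.2.0",
            "projects": [
                {"repo": "openstack/os-vif", "hash": "<sha of 2.2.0>"},
            ],
        },
    ],
}

new_entry = add_release(
    victoria, "openstack/os-vif", "<sha of stable/victoria tip>"
)
print(new_entry["version"])
```

The hash would be the tip of stable/victoria once the fix has merged, which is what the compare link above is used to check.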
bbezakindeed - OFED 5.4 also introduced that change in renumeration at some level.09:06
bbezakhttps://bugzilla.redhat.com/show_bug.cgi?id=191870309:06
kashyapAh, a commit is linked - https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f109:06
bbezakkashyap: https://bugzilla.redhat.com/show_bug.cgi?id=191870309:06
kashyapThanks09:06
sean-k-mooneykashyap: Mellanox moved where the representor netdevs were advertised in /sys09:07
sean-k-mooneywe have fixed os-vif and libvirt to be able to handle both now09:07
sean-k-mooneybut we need to actually release os-vif with the fix09:07
kashyapI see09:07
kashyapThis seems to be a "forced" ABI break - because of an external vendor09:08
kashyapBut yeah, a break nonetheless09:08
sean-k-mooneywell yes they chose to do this to enable a new feature in the future09:08
sean-k-mooneywe adapted on master a long time ago and the backports were held up on review bandwidth for some time09:09
sean-k-mooneynow that they have actually landed we can do a release for them09:09
bbezakyesterday ussuri was merged, but I think older ones are still waiting for +2s09:09
bbezakand/or rechecks09:10
sean-k-mooneyi think train was reviewed. i should now have stable branch rights so ill take a look09:10
sean-k-mooneyah you are right09:10
sean-k-mooneythey are still pending09:10
* bauzas wonders whether we should comment in the prelude section which python versions we support09:12
sean-k-mooneyit's in the governance repo09:12
bauzasok, then it's not needed09:12
sean-k-mooneyhttps://github.com/openstack/governance/blob/master/reference/runtimes/xena.rst09:13
sean-k-mooneytechnically i guess these are minimums but 3.6-3.8 is what is tested09:13
sean-k-mooneytechnically i think we can mostly run on 3.509:14
sean-k-mooneybbezak: i have rechecked the train version, i'll take a look at them again later today09:14
sean-k-mooneybbezak: are you going to propose a release to the git repo or would you like me to do that 09:15
bbezaksean-k-mooney: thx, I will try to propose git release for victoria/ussuri09:17
opendevreviewBalazs Gibizer proposed openstack/nova master: DNM: check nova job results with placement transaction fix  https://review.opendev.org/c/openstack/nova/+/80755809:45
bbezakussuri, victoria os-vif release has been proposed, as per docs - https://releases.openstack.org/reference/using.html#using-new-release-command. PTL needs to approve those, please take a look gibi: 10:07
bbezakhttps://review.opendev.org/c/openstack/releases/+/807694 https://review.opendev.org/c/openstack/releases/+/80769610:07
sean-k-mooneygibi: i have reviewed both of those ^ and they look good to me. we dont have any other pending backports that i see that we should wait for10:17
sean-k-mooneyi have some patches for master that i plan to backport but we can do another release for those in a few weeks, they are not urgent and releases are pretty cheap so i have no issue with deferring them for now10:18
bauzasgibi: permission to rewrite your cycle highlights for the prelude ?10:18
gibibauzas: granted :)10:39
gibisean-k-mooney: I will check them soon 10:40
gibibauzas: when you need some distraction there are two doc / reno patches up for review https://review.opendev.org/c/openstack/nova/+/807564  and https://review.opendev.org/c/openstack/nova/+/70566711:10
gibistephenfin, sean-k-mooney: I added a topic about tox.ini basepython pinning to the PTG etherpad https://etherpad.opendev.org/p/nova-yoga-ptg L8111:25
gibilyarwood: if you have time I could use your opinion on https://bugs.launchpad.net/nova/+bug/1942766 11:42
lyarwoodgibi: yeah live and then persistent could be an option but ultimately a hard reboot would recover the situation so it's a `low` bug at best IMHO11:54
gibilyarwood: as the hard reboot will regenerate the xml from the db?11:55
lyarwoodgibi: correct it destroys the live domain and undefined the persistent domain before recreating everything based on what we have in the db11:56
lyarwoodundefines*11:56
gibicool, thanks11:56
opendevreviewBalazs Gibizer proposed openstack/nova master: Add more retries to TestMigrateFromDownHost tests  https://review.opendev.org/c/openstack/nova/+/80771412:45
gibilyarwood: a further tuning on these tests ^^12:52
gibiohh I see you already checked12:52
gibithanks12:53
lyarwoodgibi: yup already reviewed, I did see a gate failure with this last week but didn't get time to look, thanks for sorting that again12:53
* lyarwood needs to stop writing racey tests12:53
gibilyarwood: no problem, I like hunting down these :)12:53
gibiand I think there is no way to avoid races sometime, nova is complex12:53
lyarwoodwe all have our vices :D12:53
gibi:D12:54
bauzasman, I wish generated sphinx errors were simpler to debug with reno13:35
kashyapbauzas: What's the error that's giving you grief?13:40
bauzas/home/sbauza/git/openstack/nova/releasenotes/source/unreleased.rst:40: WARNING: Bullet list ends without a blank line; unexpected unindent.13:41
kashyapAh, I've seen this enough no. of times that it is now written into my brain's ROM13:41
bauzasalso, running the tox releasenotes target takes a while13:42
bauzasit's repopulating all the releases13:42
opendevreviewBalazs Gibizer proposed openstack/nova master: Avoid unbound instance_uuid var during delete  https://review.opendev.org/c/openstack/nova/+/80560513:48
opendevreviewMerged openstack/nova master: fup: Print message logging uncaught nova-manage exceptions  https://review.opendev.org/c/openstack/nova/+/80735814:28
opendevreviewMerged openstack/nova master: console: Improve logging  https://review.opendev.org/c/openstack/nova/+/77840714:29
gibicores: do we support providing the adminPassword to the guest via the metadata service? 14:31
gibiI see that the metadata service is trying to fetch the password from instance.system_metadata https://github.com/openstack/nova/blob/402fe188b4e7ff76109e8a5ea1f24a5e915eaa09/nova/api/metadata/password.py#L3714:31
gibibut I don't see we ever store the adminPassword there 14:31
gibiasking due to https://bugs.launchpad.net/nova/+bug/194270914:32
gibiFYI: nova meeting starts in 9 minutes here in the channel15:50
bauzasgibi: I'll be a bit late, daughter's homework helping15:59
gibiack15:59
gibi#startmeeting nova16:00
opendevmeetMeeting started Tue Sep  7 16:00:10 2021 UTC and is due to finish in 60 minutes.  The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
gibio/16:00
dansmitho/16:01
gibi#topic Bugs (stuck/critical)16:03
gibino critical16:03
gibi#link 13 new untriaged bugs (-4 since the last meeting): #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New16:03
gibiwe have 1 bug marked with xena-rc-potential tag #link https://bugs.launchpad.net/nova/+bugs?field.tag=xena-rc-potential16:03
gibihttps://bugs.launchpad.net/nova/+bug/1942345 fix is going through the gate16:04
gibiand we have the placement transaction scope issue https://storyboard.openstack.org/#!/story/2009159 that is being worked on in https://review.opendev.org/c/openstack/placement/+/80701416:04
gibiI guess we should mark that as RC critical too16:04
gibiI think melwitt's solution is OK but we cannot really make a reproduction test working in the functional env 16:05
gibiso I started running nova jobs against the fix16:05
gibiin https://review.opendev.org/c/openstack/nova/+/80755816:05
gibiso far the runs there are not producing the error 16:06
gibiso probably we will land that fix without the reproduction test16:06
gibiis there any other bug we should consider as RC critical?16:07
dansmithyou're saying that we can't repro the failure in functional,16:07
dansmithbut that's kinda expected and normal for these kinds of load-based heisenbugs right?16:07
* bauzas is now back and around16:07
gibidansmith: yes, it is a race that is hard to reproduce in a clean env (it needs mysql and it needs parallel transactions)16:08
bauzasdansmith: the problem is about mysql16:08
dansmithgibi: yeah16:08
bauzashah, jinxed16:08
bauzasdifficult to write a correct parallel test16:08
dansmithyeah, just sounded like gibi was expecting we wouldn't land until we had a repro test16:09
dansmithand I'm saying I'd have expected that to be impossible16:09
gibiwe tried the repro test but we failed16:09
gibiso I'm OK to land this without a repro16:09
bauzas"I guess we should mark that as RC critical too" > yes, please16:10
gibibauzas: OK, I will. It is already in the tracking etherpad https://etherpad.opendev.org/p/nova-xena-rc-potential16:10
gibiso if there is no other bug for the RC then I have a question about https://bugs.launchpad.net/nova/+bug/194270916:11
gibido we support providing the adminPassword to the guest via the metadata service?16:11
gibiI see that the metadata service is trying to fetch the password from instance.system_metadata 16:11
gibihttps://github.com/openstack/nova/blob/402fe188b4e7ff76109e8a5ea1f24a5e915eaa09/nova/api/metadata/password.py#L3716:11
gibibut as far as I see we don't store the password in instance.system_metadata on master16:12
gibiand I was not able to track down changes around this in the git history16:12
* bauzas dunoo16:12
dansmithI don't remember if this works with libvirt16:13
dansmithI want to say no16:13
bauzaswe need to look at the code honestly16:13
gibibauzas: please, I got stuck tracking down if it ever worked16:13
gibiOK moving on then16:14
gibiany other bug we need to discuss?16:14
gibi#topic Gate status 16:15
gibiNova gate bugs #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure16:15
gibiThe nova-next job should be passing now after the new osc-placement release on Friday16:15
gibiAllocation deletion conflict still can happen, and it is being worked on in https://review.opendev.org/c/openstack/placement/+/80701416:15
bauzasgibi: wish me good luck for finding about the crap in sysmetadata16:15
gibibauzas: good luck! :)16:16
gibiI'm not tracking other high frequency bugs in the gate I think we are able to merge code to master16:16
gibiPlacement periodic job status #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly16:16
gibiplacement periodics are green16:16
gibiany other gate issue we need to talk about?16:17
gibi#topic Release Planning 16:18
gibiRelease tracking etherpad #link https://etherpad.opendev.org/p/nova-xena-rc-potential16:18
gibiWe are in Feature Freeze.16:18
gibiI'm not tracking any FFEs16:18
gibiand not tracking any feature patches approved but not landed yet16:18
gibiWe need to produce RC1 by the 17th of September at the latest, which is in less than 2 weeks.16:19
* bauzas raised fist at reno btw. 16:19
gibiyeah, bauzas writing the reno prelude as you see16:19
gibi:)16:19
bauzasI can't upload the prelude change because of a weird issue with my rst file16:19
bauzas(actually, yaml one)16:19
gibibauzas: push it up and tomorrow I can look at it 16:19
bauzashope to fix it sooner than later16:19
bauzasgibi: yeah, if I can't find it, I'll throw it to the gate16:20
gibithe above etherpad has the links to RC bugs and other TODOs 16:20
bauzasprobably something obvious enough that I missed.16:20
gibiany question about the coming release?16:20
gibi#topic PTG Planning 16:21
gibievery info is in the PTG etherpad #link https://etherpad.opendev.org/p/nova-yoga-ptg16:21
gibiwe don't have many topics yet, but there is still plenty of time16:22
gibiIf you see a need for a specific cross project section then please let me know16:22
gibiany question about the PTG?16:22
mlozais there a way to set extra_specs on an existing instance without modifying the db?16:22
gibiI guess you can turn to bauzas about those as well as he is the PTL elect :)P16:22
bauzascould we have a yoga mat for the PTG ? :D16:23
gibimloza: let's get back to you at the Open Discussion section16:23
gibibauzas: sure you can, please buy one in Decathlon16:23
gibi:)16:23
* bauzas misses swag16:23
gibi#topic Stable Branches 16:24
gibielodilles_pto is off 16:24
gibiso we don't have status update on the wiki16:24
gibibut I guess stable is OK :)16:24
gibiany stable issue?16:24
bauzaseither way, we're entering a pretty quiet period about stable16:24
gibiI have one piece of news: we pushed stable os-vif releases today16:25
gibihttps://review.opendev.org/c/openstack/releases/+/80769416:25
gibihttps://review.opendev.org/c/openstack/releases/+/80769616:25
gibivictoria and ussuri16:25
gibianyhow moving on16:25
gibi#topic Sub/related team Highlights 16:26
gibiLibvirt (bauzas)16:26
gibi? ;)16:26
bauzasnothing to say16:26
gibi#topic Open discussion 16:26
gibinothing on the agenda16:26
gibimloza: your turn16:26
gibimloza: do you mean flavor extra_specs?16:26
mlozayup16:26
gibiyou have to resize the vm 16:27
bauzasembedded flavors can't be changed16:27
bauzasunless you resize, indeed16:28
mlozaright now live migrating is failing because the instance doesn't have extra_specs that the flavor has. I cannot afford the resize since it incurs a downtime16:29
dansmithI don't think that's the reason for a live migration failure, right?16:30
mlozai have host aggregates with metadata16:31
mlozathe metadata is associated to the flavors 16:31
dansmithI see, so it's the aggregate not the flavor16:32
dansmithforce migrate should skip that, no? bauzas ?16:32
bauzaswait, was distracted, sorry16:33
gibiforce still runs the scheduler, doesn't it?16:33
bauzasgibi: that depends on the microversion you ask16:33
gibitrue16:33
gibiwith old enough microversion you can skip the scheduler16:33
bauzasmloza: which exact command do you run ?16:33
dansmithI was going to say, I thought there was a way.. I mean that's how you break your AZ right? :)16:33
gibihost=<new-host> force=True  microversion <= 2.6716:34
mlozaopenstack server migrate --os-compute-api-version 2.30 --live-migration $id16:34
bauzaswith no target ?16:34
mlozayep with no taret16:34
mlozatarget*16:35
gibiyou need < 2.30 and you need a target16:35
bauzasif so, this goes to the scheduler16:35
bauzaswhich checks this indeed16:35
gibihttps://docs.openstack.org/api-ref/compute/?expanded=live-migrate-server-os-migratelive-action-detail#live-migrate-server-os-migratelive-action16:35
dansmiththe scheduler is doing what you asked, in that case.. not sending the host to the aggregate that requires specific specs16:35
sean-k-mooneymloza: if you dont pass a target then it will use the az from the request spec16:35
bauzasgibi: or you can use 2.30 with the explicit force flag16:35
gibibauzas: yes, or <=2.67 and force=True16:35
bauzas(which was deprecated later)16:35
bauzasthat, yes16:35
bauzasI'd say, the contract is granted.16:36
bauzasyou're asking to migrate something you shouldn't migrate16:36
bauzasI know this is weird16:37
sean-k-mooneyif you force the migration will that bypass the aggregate extra spec affinity filter16:37
sean-k-mooneyi assume that is the one that is blocking the host16:37
sean-k-mooneyi did not think it would16:37
bauzasbut if all your hosts are within aggregates that set metadata and you have a filter that verifies this, then you explicitly write a contract16:37
sean-k-mooneyi guess with force we will just check the host exists and skip everything else with the old microversion?16:38
sean-k-mooneythat would leave the vm in a state where a future move operation will try to move it out of the aggregate you moved it into16:39
mlozagot it to live migrate after passing a target host 16:39
gibiOK, I guess that is solved then 16:40
gibi:016:40
gibi:)16:40
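The microversion rules traded back and forth above can be summarised in a small sketch. It only encodes what was said in the channel (a target host before 2.30, or a target host plus force=True up to 2.67, skips the scheduler; no target always goes through it). The function names are hypothetical and this is not nova code.

```python
def parse_microversion(text):
    """Turn a string such as '2.30' into a comparable (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def bypasses_scheduler(microversion, host=None, force=False):
    """Return True when a live migration request would skip the scheduler.

    Encodes only the rules stated in the discussion above; treat it as an
    illustration of the conversation, not as the API contract.
    """
    version = parse_microversion(microversion)
    if host is None:
        # No target: the scheduler picks a host and runs its filters.
        return False
    if version < (2, 30):
        # Old behaviour: a requested host is used as-is.
        return True
    if version <= (2, 67):
        # The force flag is available in this range.
        return bool(force)
    # Newer microversions: the requested host is still validated.
    return False


# The command from the channel: 2.30 with no target goes to the scheduler.
print(bypasses_scheduler("2.30"))
# A target with an old enough microversion skips it.
print(bypasses_scheduler("2.29", host="compute-2"))
# 2.30 with an explicit target and force=True skips it as well.
print(bypasses_scheduler("2.30", host="compute-2", force=True))
```

Comparing tuples rather than floats avoids treating 2.3 and 2.30 as the same version.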
gibiany other topic for today?16:40
gibithen let's close this16:42
gibithanks for joining16:42
gibi#endmeeting16:42
opendevmeetMeeting ended Tue Sep  7 16:42:23 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:42
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2021/nova.2021-09-07-16.00.html16:42
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2021/nova.2021-09-07-16.00.txt16:42
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2021/nova.2021-09-07-16.00.log.html16:42
* bauzas just figured https://docs.openstack.org/reno/2.4.1/usage.html#debugging16:42
bauzasholy shit, I was already knowing about it16:42
mlozaThanks. I think the instance should have extra_specs and prefer live migrating without a target host. If I have to add to extra_specs in the instance, is it instance_extra table that I have to modify ?16:44
mlozainstance_uuid='a65b26f5-a278-4200-a573-84e01c685699'; which has "extra_specs": {"aggregate_instance_extra_specs:cephcomputestorage": "true"}16:45
mlozahttps://paste.opendev.org/raw/bMf35OK3I9aYbCNCv9jm/16:45
gibimloza: yepp if you want to hack the db then that is the place16:45
sean-k-mooneywell there are 2 places16:47
sean-k-mooneythe instance extra table is what would be used for generating the vm xml and on the compute node16:47
opendevreviewmelanie witt proposed openstack/placement master: Narrow scope of set allocations database transaction  https://review.opendev.org/c/openstack/placement/+/80701416:47
sean-k-mooneyi would hope that is the same flavor we use for move operations that are not resize, but we do also have flavor info in the request spec which is what is passed to the scheduler filters16:48
sean-k-mooneythe scheduler never uses the info in the instance_extra table directly16:49
mlozawhere does the scheduler pull info if not from instance_extra table ?16:51
sean-k-mooneyi believe we invoke the scheduler with a populated request spec object when we do the select destination call from the conductor16:52
sean-k-mooneybut technically there is a request_specs table in the api db16:52
sean-k-mooneywith a serialised copy of the request spec16:52
bauzassean-k-mooney: this is correct16:53
bauzasbut when you resize (eg.) you update the request spec16:53
sean-k-mooneyyes16:53
bauzasfor the flavor, I mean16:53
sean-k-mooneybut what about live migrate and cold migrate16:53
sean-k-mooneydo they just use the api db copy 16:53
sean-k-mooneyor reconstruct it from the embedded flavor in the cell db16:54
bauzascold and live migrate don't modify the resources, right?16:54
bauzasresize and rebuild do16:54
bauzasso, yes, we just call out the request spec from the api table16:55
bauzaswe never lookup the cell db16:55
sean-k-mooneyright which has an embedded copy of the flavor extra specs16:55
bauzasthe instance table, you mean?16:55
sean-k-mooneymloza: so you would have to update both the instance_extra table for the compute node to use and the request_specs table for the scheduler to use16:56
sean-k-mooneybauzas: no the request_spec has a full embedded copy of the flavor as a json blob16:56
sean-k-mooneywell serialised nova object16:56
bauzasah this16:56
bauzasyes16:56
bauzaswe have the embedded flavor stored in two places16:57
sean-k-mooneyso if mloza is changing it to make live migration work they need to update the request_spec version16:57
sean-k-mooneybauzas: yes16:57
bauzasone in the api db within the request_specs table16:57
bauzasone in the cell db within the instances table16:57
bauzasthe instances table is never used for scheduling decisions16:57
sean-k-mooneyinstance_extra but yes16:57
bauzasmy bad, yes indeed16:57
mlozai looked at request_specs in nova api db and indeed, there is extra_specs 16:58
sean-k-mooneyso mloza you would have to keep both in sync if you do decide to hack the db16:58
mlozanoted. thanks for the info16:59
mlozaneed to figure out the query to modify the json blob in the table extra_specs and request_specs 17:01
bauzasuse the python objects directly17:01
bauzassimple and quick17:01
bauzasbut you need to run the python script somewhere where you have some nova code17:02
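Without nova code at hand, the "keep both copies in sync" point can still be sketched on plain JSON. The blob layouts and the two path constants below are simplified assumptions made for illustration; the real rows hold serialised nova objects and should be inspected before anything is changed. `set_extra_spec` and `read_extra_specs` are hypothetical helpers, and the extra spec key is the one quoted earlier in the log.

```python
import json


def set_extra_spec(blob_text, path, key, value):
    """Return a copy of a JSON blob with one flavor extra spec set.

    `path` lists the keys leading to the dict that holds `extra_specs`.
    """
    document = json.loads(blob_text)
    node = document
    for step in path:
        node = node[step]
    node.setdefault("extra_specs", {})[key] = value
    return json.dumps(document)


def read_extra_specs(blob_text, path):
    """Fetch the extra_specs dict found under `path`."""
    node = json.loads(blob_text)
    for step in path:
        node = node[step]
    return node["extra_specs"]


# Assumed, simplified layouts of the two embedded flavor copies.
REQUEST_SPEC_PATH = ("nova_object.data", "flavor", "nova_object.data")
INSTANCE_EXTRA_PATH = ("cur", "nova_object.data")

request_spec = json.dumps(
    {"nova_object.data": {"flavor": {"nova_object.data": {"extra_specs": {}}}}}
)
instance_flavor = json.dumps(
    {"cur": {"nova_object.data": {"extra_specs": {}}}, "old": None, "new": None}
)

key = "aggregate_instance_extra_specs:cephcomputestorage"
request_spec = set_extra_spec(request_spec, REQUEST_SPEC_PATH, key, "true")
instance_flavor = set_extra_spec(instance_flavor, INSTANCE_EXTRA_PATH, key, "true")

# Both copies must agree: the scheduler reads one, the compute node the other.
in_sync = (
    read_extra_specs(request_spec, REQUEST_SPEC_PATH)
    == read_extra_specs(instance_flavor, INSTANCE_EXTRA_PATH)
)
print(in_sync)
```

Applying the same function to both blobs is the point of the sketch: a change made to only one copy leaves scheduling and the compute node disagreeing about the flavor.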
* sean-k-mooney really wishes we had that recreate api.17:02
sean-k-mooneythis come up often17:02
bauzasin the past, we said those instances can be rebuilt17:02
sean-k-mooneynot for flavor17:02
sean-k-mooneybut ya resized17:03
bauzasor resized 17:03
sean-k-mooneybut this comes up a lot17:03
bauzasand we said we are cloud17:03
sean-k-mooneyit is a significant downstream and upstream17:03
bauzasfor some reason, people thought we were vmware17:03
sean-k-mooneyright and there is a cloud native way to do it17:03
sean-k-mooneywell k8s would support this too although it would delete it and recreate it17:04
bauzasresize --live ?17:04
bauzasyou could even use bfv17:04
sean-k-mooneyno, resize to same flavor updates embedded extra specs17:04
sean-k-mooneywe agreed i should write a spec for that17:04
sean-k-mooneyjust didnt do it last cycle17:04
sean-k-mooneystill a move op with downtime17:04
sean-k-mooneylike normal resize, just we don't allow resize to same flavor today17:05
sean-k-mooneyso you have to duplicate the flavor rather than update it and then update all your existing vms 17:05
dansmithresize to same flavor might involve a reschedule though right?17:09
dansmithso you'd need to have a "move me if necessary" or "don't move me and abort if I would have to move" option17:09
sean-k-mooneydansmith: yes its a normal resize just removing the requirement to have a different flavor17:10
dansmithright but people wanting this will likely want to violate those assumptions17:10
dansmiththe person asking for this wanted to resize to the same flavor, but without moving or even rebooting, so they could then live migrate17:11
sean-k-mooneyi think the assumption would be it's always going to change host17:11
sean-k-mooneyit might now but probably will17:11
dansmiththat wouldn't do what the person wanted though17:11
sean-k-mooney*not17:11
dansmithso I'm just saying, it's sticky :)17:11
dansmithor slippery, or whatever17:11
sean-k-mooneytrue its the only safe way to do it if we allow you to update anything17:11
sean-k-mooneye.g. numa or cpu pinning17:12
sean-k-mooneyby the way we dont correctly update extra spec on resize https://review.opendev.org/c/openstack/nova/+/805882/1/nova/tests/functional/libvirt/test_pci_sriov_servers.py17:12
sean-k-mooneyartom is working on that17:13
artomOnce I surface from these escalations, but yes17:13
sean-k-mooneyhttps://launchpad.net/bugs/194100517:13
sean-k-mooneyim pretty sure we regressed this at some point17:13
sean-k-mooneybut ya that is also a thing right now17:13
sean-k-mooneyalthough i'm slightly confused why this set of tempest tests is passing https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/806239/6/whitebox_tempest_plugin/api/compute/test_vpmu.py17:16
artomsean-k-mooney, maybe we update the request spec when we confirm the resize?17:19
sean-k-mooneyoh ya that could be it17:19
sean-k-mooneyso we need that refactoring i suggested where we can assert things during resize verify17:19
artomAye17:20
sean-k-mooneythat would also explain why we have not noticed this if it updates on confirm17:20
sean-k-mooneythe window to observe it would be only during the migration17:20
artomAnd if it causes problems during scheduling...17:21
opendevreviewSylvain Bauza proposed openstack/nova master: Add the Xena prelude section  https://review.opendev.org/c/openstack/nova/+/80778617:55
opendevreviewMerged openstack/nova master: Parse alias from domain hostdev  https://review.opendev.org/c/openstack/nova/+/80694318:05
