Tuesday, 2022-08-30

*** dasm is now known as dasm|off02:00
bauzasgood morning Nova06:57
* bauzas gets a SIGHUP after 3 weeks06:57
gibibauzas: welcome back07:48
bauzasgibi: thanks07:50
*** elodilles_pto is now known as elodilles08:06
gibithe user_data patch looks good to me, but needs a second core. The rebuild of bfv also looks good to me but I have some nits that can be fixed in the last patch or as a follow-up if needed09:08
*** brinzhang_ is now known as brinzhang09:09
gibiI will keep an eye on the manila series09:10
kgubehi, I'm new to Nova development! I've been working on a blueprint+spec that I'd like to submit and get feedback on, but I've been wondering: should I wait until after the Zed release?09:10
sean-k-mooneykgube: if it's a spec you can submit it to the spec repo now09:17
gibikgube: welcome!09:17
sean-k-mooneykgube: it might not get much review for the next week or two but you can feel free to create it09:17
gibikgube: it is a good time to submit the spec as we will soon be freed up from Zed and start planning the AA cycle09:17
sean-k-mooneykgube: once we release our first release candidate the master branch will technically be open for AA cycle development09:18
sean-k-mooneyalthough we tend not to merge large changes until the final release is done09:18
gibikgube: also if the spec is a bit controversial then we can add it to the AA PTG planning to discuss it in real time09:19
kgubealright, thanks for the info!09:20
gibisean-k-mooney: I've replied to your question in https://review.opendev.org/c/openstack/nova/+/85383509:20
sean-k-mooneygibi: thanks, I'll look again shortly. most of the series I looked at looks good09:25
sean-k-mooneyI see a good split point again about halfway through the open patches09:25
sean-k-mooneyaround here ish https://review.opendev.org/c/openstack/nova/+/85444009:26
kgubesean-k-mooney: since AA is not yet available in the spec repo, should I submit it to Zed for now?09:26
sean-k-mooneykgube: you can just create the directory locally like dan did here https://review.opendev.org/c/openstack/nova-specs/+/85383709:27
sean-k-mooneykgube: I'll create a patch to do it properly later today09:28
sean-k-mooneyand then you can rebase when it's merged09:28
sean-k-mooneyfor now just copy the zed template and use that as the basis of the spec09:28
kgubealright!09:28
sean-k-mooneyI don't think we plan to update the template for Antelope currently09:29
bauzasI was planning to do the AA paperwork right after Zed-3 FWIW09:34
sean-k-mooneyack, well that's Thursday, so if you want to create it go for it, otherwise I'll create it when I get time09:35
sean-k-mooneywe have two things to do for the specs repo09:35
sean-k-mooneyone, create the new folder and copy the template09:35
sean-k-mooneytwo, run the script to move/symlink the implemented specs09:36
sean-k-mooneythat second task uses the blueprint state in launchpad to generate the list09:36
sean-k-mooneyso it is better to wait until next week, when everything is updated, before we do that09:36
gibisean-k-mooney: ack09:40
bauzassean-k-mooney: moving the specs is generally planned for RC1 https://docs.openstack.org/nova/latest/contributor/ptl-guide.html#milestone-309:52
bauzasI mean approved => implemented09:53
sean-k-mooneyyep09:59
sean-k-mooneythat is to account for possible FFEs10:00
sean-k-mooneykgube: what is the thing you have been working on, by the way?10:30
kgubesean-k-mooney: support for extending attached file-based volumes (such as NFS volumes)10:38
sean-k-mooneyoh, that should already exist10:39
sean-k-mooneyprovided the cinder volume supports it10:39
sean-k-mooneyI remember that case being special but I believe it should work10:39
sean-k-mooneyah yes https://docs.openstack.org/cinder/latest/reference/support-matrix.html#operation_online_extend_support10:40
sean-k-mooneyGeneric NFS Reference Driver (NFS): missing10:40
sean-k-mooneybut some vendor nfs drivers supprot it 10:41
sean-k-mooneyVeritas Cluster NFS Driver (NFS): complete10:41
sean-k-mooneyNetApp Data ONTAP Driver (iSCSI, NFS, FC): complete10:41
sean-k-mooneykgube: so that is likely a cinder feature rather than a nova one10:41
kgubesean-k-mooney: https://bugs.launchpad.net/nova/+bug/197829410:41
sean-k-mooneyhum10:43
sean-k-mooneyI'm not sure if we really want nova to have logic to resize these volumes on behalf of the backend10:43
sean-k-mooneyI guess we can review your proposal10:43
sean-k-mooneybut I would probably push this towards os-brick10:44
kgubethe logic is already there, though10:44
sean-k-mooneyor require the backend driver to do it, not nova10:44
kgubeit's just exposed as an external server event only10:44
sean-k-mooneychanging the external events api to be synchronous is likely not something we can do10:44
kgubeThe problem is that QEMU has to perform the resize10:45
sean-k-mooneyqemu or qemu image10:45
sean-k-mooney*qemu-img10:45
kgubeqemu holds a lock on the attached file, which qemu-img respects10:46
kgubeso it refuses to resize10:46
sean-k-mooneyright but you can call qemu-img with force-share10:46
sean-k-mooneymy understanding is that's what the nfs driver normally did to resize the device10:47
sean-k-mooneyis that not the case10:47
kgubethat does not work for modifying the image, afaik10:47
kgubeand the qemu-img docs explicitly warn against modifying attached volumes10:48
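
(For context, a minimal sketch of the locking behaviour being described here; the volume path is hypothetical and the exact error text varies by QEMU version. qemu-img info accepts --force-share/-U for read-only inspection of an in-use image, but qemu-img resize needs a write lock and refuses while the attached guest's QEMU holds the file.)

    # Sketch only: illustrating why qemu-img cannot resize an attached
    # file-based volume. The path below is a made-up NFS-backed location.
    import subprocess

    volume = "/var/lib/nova/mnt/<share-hash>/volume-<uuid>"  # hypothetical

    # Read-only inspection can bypass the image lock with --force-share (-U):
    subprocess.run(["qemu-img", "info", "-U", volume], check=True)

    # Resizing requires a write lock, so while the guest's QEMU process
    # holds the file this fails with "Failed to get 'write' lock":
    subprocess.run(["qemu-img", "resize", volume, "+10G"], check=True)
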
sean-k-mooneyi see10:49
sean-k-mooneythen rather than a new server action10:50
sean-k-mooneyI think you should model the new api on the existing /os-assisted-volume-snapshots api10:50
sean-k-mooneyhttps://docs.openstack.org/api-ref/compute/#create-assisted-volume-snapshots10:50
sean-k-mooneyactually https://docs.openstack.org/api-ref/compute/#assisted-volume-snapshots-os-assisted-volume-snapshots is a slightly better link10:51
sean-k-mooneyyou could add a /os-assisted-volume-extend api for this usecase10:52
sean-k-mooneythat can be blocking or non-blocking as it's a new api for this exact use case10:52
sean-k-mooneyit's not really a server action like the other instance actions, hence a new top-level api10:53
kgubeok, yeah, I guess that makes sense10:54
kgubeso Cinder does not need to care which instance the volume is attached to10:56
sean-k-mooneywell, I'm not actually sure why we don't include the instance uuid, it would make our life simpler10:58
sean-k-mooneyso I would probably include it in the request as it's simpler for us to then make the rpc to the compute node work10:59
sean-k-mooneyI'm not actually sure that /os-assisted-volume-snapshots works properly for multi_attach volumes for example11:00
sean-k-mooneyalthough I suspect that is not supported via nfs backends11:00
sean-k-mooneyso I would probably make the body something like (instance_uuid, volume_uuid, new_size) and perhaps include the attachment_uuid if that is useful, but probably not required11:02
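(A sketch of the proposed request, modeled on the existing assisted-snapshots API; the endpoint and field names below only exist in this discussion, not in Nova.)

    # Hypothetical body for the proposed POST /os-assisted-volume-extend.
    # Field names follow the discussion above; nothing here is a real API.
    body = {
        "volume_extend": {
            "instance_uuid": "11111111-2222-3333-4444-555555555555",
            "volume_uuid": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
            "new_size": 20,  # GiB; attachment_uuid could be added if useful
        }
    }
    # Nova would use instance_uuid to look up instance.host and RPC the
    # resize to that compute node, as described below.
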
kgubehm, NetApp ONTAP seems to support multiattach11:02
sean-k-mooneywith that we can internally look up the instance.host and make an rpc to the compute to do the resize11:02
sean-k-mooneykgube: I think that is likely only with iSCSI11:02
sean-k-mooneyyou could make it work, I guess, with nfs11:03
sean-k-mooneybut that will be much, much harder to support extend with11:03
sean-k-mooneyas we have multiple QEMU processes using it11:03
kgubeyeah11:03
kgubeI think this will only work if the file is read-only11:04
sean-k-mooneyfor iSCSI the volume resize is going to happen on the NetApp SAN11:04
kgubebut then we can't resize it anyway11:04
kgubeor rather, qemu won't11:04
sean-k-mooneyyep, that's an edge case which probably should be called out and validated on the cinder side11:07
sean-k-mooneye.g. prevent multi-attach volumes on nfs from either being created or resized while attached, depending on what is more appropriate11:07
kgubeyeah, that is something cinder will have to do11:11
opendevreviewBalazs Gibizer proposed openstack/nova master: Follow up for the PCI in placement series  https://review.opendev.org/c/openstack/nova/+/85518511:19
opendevreviewBalazs Gibizer proposed openstack/nova master: Doc follow up for PCI in placement  https://review.opendev.org/c/openstack/nova/+/85518611:19
bauzasreminder: nova meeting will happen today at 1600 UTC here in this channel12:23
bauzasgibi: sean-k-mooney: I can take the meeting back12:24
* bauzas is currently looking at what was done during my 3.5-week PTO12:24
gibibauzas: you can have the meeting back :)12:29
bauzasnot sure if I should say \o/ or /.o\12:29
UgglaHi bauzas, I hope you enjoy your vacation.12:30
bauzasI did12:30
UgglaJust to let you know that unfortunately I could not attend today's meeting. :(12:31
bauzasnp12:31
kashyapUggla: s/enjoy/enjoyed/?  I thought he's going to have another vacation :-P12:31
kashyap(Which is also fine, if he's up for it.)12:31
Uggla*enjoyed thx kashyap 12:31
bauzashaha, can't wait for someone saying I'm on "perpetual PTO"12:31
bauzasyou know what was fun ? I was in a camp site for two weeks 12:32
bauzasfirst week, the campsite was making its announcements in French12:32
bauzaseventually, in the last week, like 70% of the people there were German12:33
bauzasas it was the last children's vacation week in France, while some German Länder still have 3 weeks12:33
bauzasso, the campsite folks were making announcements in German12:33
bauzas*in Corsica12:34
UgglaI hope you had no problems with the storm in Corsica?12:34
bauzasnone 12:36
bauzasI was on the East12:36
bauzaswhen the storm arrived around Ajaccio, we just got a small one, just wind and a bit of rain for 10 mins12:38
bauzasbut the campsite asked us in the afternoon to make sure we could quickly move to a gym the next evening if needed, as a new storm was arriving12:39
bauzaseventually, we didn't move12:39
Ugglabauzas, good to know. I saw some impressive storm videos and thought about you hoping you were safe.12:45
bauzasthanks12:45
bauzasindeed this was impressive12:45
Ugglaclearly12:45
bauzasI visited some small village in the mountains on the day the storm arrived12:46
bauzasand yeah, they got problems with just the wind12:46
bauzas200km/h12:46
bauzasfortunately, the mountain protected us12:46
kashyapYikes!12:54
ricolinsean-k-mooney gibi stephenfin  hey, can you give another review on https://review.opendev.org/c/openstack/nova/+/830646/ thanks13:38
sean-k-mooneysure, have you started reworking https://review.opendev.org/c/openstack/nova/+/844507/13:40
sean-k-mooneywe will likely want to merge both together13:41
sean-k-mooneythey probably should be in the other order too13:41
*** dasm|off is now known as dasm13:46
stephenfinricolin: sure13:55
dansmithwhoami-rajat: are you working on gibi's comments? if not, I'll do it14:05
ricolinsean-k-mooney: yes, I'll see if I can push something out today14:09
gibistephenfin: sean-k-mooney made review progress in the PCI series so if you have free cycles then we could merge some patches there. I also pushed a FUP on top with doc fixes based on your comments14:15
stephenfingibi: Also on my list14:16
gibistephenfin: thanks14:16
bauzasI haven't yet had time to look at open reviews, but if people want me to look at some, \o14:29
* bauzas looks at the previous meetings and some changes14:29
whoami-rajatdansmith, sorry, I was away. I'm currently busy with the PTL nomination, after that I can address them14:50
dansmithwhoami-rajat: I'll do it14:51
whoami-rajatdansmith, ack, thanks14:51
dansmithjust didn't want to step on toes if you were in the middle already14:51
sean-k-mooneyI have a few nits I'll be adding shortly14:51
sean-k-mooneyreviewing the series now14:51
sean-k-mooneynothing worth holding it for, however14:51
whoami-rajatack, yeah I've been busy with cinder deadlines as well this week14:53
dansmithsean-k-mooney: get them in so I can address14:54
sean-k-mooneythe nits on the first patch are up, I'll review the last two again shortly14:56
sean-k-mooneyno issue with the second patch, reviewing the last one now14:57
dansmithoh, I thought you meant the last one15:01
dansmithif they're just nits, I think we can/should do a FUP for the nits on the lower ones15:01
dansmithI was going to revise the top one because it won't impact much15:01
sean-k-mooneyack15:01
sean-k-mooneymy main gripe is _detach_device being used as the name of the function that detaches the root volume15:02
sean-k-mooneysince it's really, really ambiguous15:02
sean-k-mooneybut it's private/internal15:02
sean-k-mooneyand we can change that later15:02
sean-k-mooneyit's not going to affect the rpc or anything outside the compute manager and the tests for it15:02
sean-k-mooneythere were a couple of typos in the comments too but again nothing requiring a respin15:03
sean-k-mooneydansmith: do you have a way forward for https://review.opendev.org/c/openstack/nova/+/830883/27/nova/tests/functional/test_boot_from_volume.py15:04
sean-k-mooneyI can try and run that locally shortly and take a look15:05
dansmithsean-k-mooney: I fixed it15:05
sean-k-mooneyack15:05
dansmithin the patchset I pushed after that15:05
dansmithhence the zuul +115:05
sean-k-mooneyoh, that's an old comment15:05
dansmithyeah15:05
sean-k-mooneyok marked it done15:06
sean-k-mooneyok done15:07
dansmithack15:08
opendevreviewDan Smith proposed openstack/nova master: Add API support for rebuilding BFV instances  https://review.opendev.org/c/openstack/nova/+/83088315:09
sean-k-mooneydansmith: I +2'd the userdata patch but left the +W to you or melwitt, by the way, since I respun it15:09
dansmithsean-k-mooney: I haven't reviewed that one really yet15:09
dansmithbut I think melwitt did, so hopefully she can hit it15:10
sean-k-mooneyshe had one pending comment from PS7 https://review.opendev.org/c/openstack/nova/+/816157/15/nova/objects/instance.py but I also don't think she considered it a blocker15:11
sean-k-mooneyit's a property on the ovo so technically we could change that later without causing any object change; the one thing we can't change is the key in the system metadata table without a data migration15:12
sean-k-mooneyso I'm hoping she is ok with it as is15:13
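(For context, the pattern being referenced looks roughly like this; a sketch, with the property and sysmeta key names invented for illustration.)

    # Sketch: exposing a system_metadata key as a property on the Instance
    # versioned object. A plain Python property is not an OVO field, so the
    # accessor can be renamed later without an object version bump; only the
    # persisted sysmeta key ('user_data_dirty' here, hypothetical) would
    # need a data migration to change.
    from oslo_utils import strutils

    @property
    def user_data_dirty(self):
        return strutils.bool_from_string(
            self.system_metadata.get('user_data_dirty', 'false'))
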
jhartkopfsean-k-mooney, gibi: Hey, just saw that the user data change already received two +2s from you. I have some adjustments (in response to nits) ready locally. Should I save these for a follow-up?15:23
sean-k-mooneyyes15:23
sean-k-mooneyplease address them in a follow-up patch15:23
gibijhartkopf: o/ . As sean-k-mooney suggests, let's address those nits in a separate patch. We intend to land the user_data patch as is today to allow the next feature behind it to land too15:24
opendevreviewMerged openstack/nova master: Adapt websocketproxy tests for SimpleHTTPServer fix  https://review.opendev.org/c/openstack/nova/+/85337915:24
opendevreviewBalazs Gibizer proposed openstack/nova-specs master: Update the PCI in placement spec  https://review.opendev.org/c/openstack/nova-specs/+/85521815:26
gibisean-k-mooney, stephenfin: updated the PCI spec to reflect the implementation reality ^^15:27
sean-k-mooneyack15:27
jhartkopfsean-k-mooney, gibi: Alright, just wanted to make sure :)15:28
bauzasgibi: thanks (or someone else) for having created https://etherpad.opendev.org/p/nova-zed-blueprint-status this helps my work15:32
gibibauzas: I think I noted somewhere that you owe me a beer for that :)15:33
bauzasgibi: if we were traveling to a PTG, I could have made it15:34
gibiyeah I know :) 15:34
bauzasbut, I can only give you a e-beer15:34
*** artom__ is now known as artom15:34
* bauzas discovered this the day before he moved to Corsica15:34
gibibauzas: we can take that beer in Vancouver then 15:34
bauzaswe need to progress then on sustainability efforts as I would like to present those :)15:35
bauzasneed then*15:35
bauzasreminder: nova meeting in 15 mins here15:46
gibistephenfin, sean-k-mooney: I prepared a respin of the PCI series to fix two issues sean-k-mooney spotted in the Create RequestGroups from InstancePCIRequests and in Support resource_class and traits in PCI alias. Should I push it now or hold on while you are reviewing?15:52
dansmithanybody seen this yet? https://zuul.opendev.org/t/openstack/build/5088531548e2497f8a59ddd8c8b89de815:53
dansmithrequirements blocking job setup15:53
dansmithI don't see a lot of other fails, so either it's unstable or just happened15:54
dansmithbut found it in a manilla job 7 mins ago15:54
* dansmith moves to -qa15:56
sean-k-mooneyno, neither has had a release recently and both are 3.7+15:57
gibiyeah I haven't seen such a failure yet15:58
sean-k-mooneyI wonder if it's stable15:58
sean-k-mooneyusually you see a failure and a retry of the same package multiple times when this happens15:59
sean-k-mooneyI'm not seeing that15:59
bauzas#startmeeting nova16:00
opendevmeetMeeting started Tue Aug 30 16:00:15 2022 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
bauzashey folks, happy to see you all again16:00
gmanno/16:00
elodilleso/16:00
bauzasok, let's start, people will arrive16:01
bauzas#link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting16:02
bauzas#topic Bugs (stuck/critical) 16:02
bauzas#info One Critical bug 16:02
bauzasbut,16:02
bauzas#link https://bugs.launchpad.net/nova/+bug/1986545 Just sent to the gate16:02
bauzasso the fix should be merged soon16:03
bauzasthanks melwitt for having worked on it16:03
bauzas#link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 7 new untriaged bugs (-1 since the last meeting)16:03
bauzas#link https://storyboard.openstack.org/#!/project/openstack/placement 26 open stories (-1 since the last meeting) in Storyboard for Placement 16:03
bauzas#info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:03
gibio/16:03
bauzasUggla wasn't able to attend this meeting, but...16:03
bauzas#info bug baton is being passed to Uggla16:04
* bauzas is testing a new leadership, mwawawa16:04
bauzasmwahaha even 16:04
gibi:)16:04
bauzasthanks all who looked at bugs while I was off, the progress is impressive16:05
bauzasany bug to raise before we move on ?16:05
bauzasguess not16:06
bauzas #topic Gate status 16:06
bauzas#topic Gate status 16:06
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:06
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status 16:06
bauzas#link https://zuul.openstack.org/builds?job_name=tempest-integrated-compute-centos-9-stream&project=openstack%2Fnova&pipeline=periodic-weekly Centos 9 Stream periodic job status16:07
bauzas#link https://zuul.opendev.org/t/openstack/builds?job_name=nova-emulation&pipeline=periodic-weekly&skip=0 Emulation periodic job runs16:07
bauzas#info Please look at the gate failures and file a bug report with the gate-failure tag.16:07
bauzasall the job runs are green16:07
bauzashaven't seen the gate today, but I have seen that nova-next was failing last week16:07
bauzasthanks all who worked on fixing the gate16:08
bauzasthat being said, any gate failure recently spotted or requiring to be discussed today ?16:08
gibigate seems to be good given we have FF week16:09
bauzasindeed16:09
bauzasand what a good point for passing to the next topic16:10
bauzas:)16:10
bauzas#topic Release Planning 16:10
bauzas#link https://releases.openstack.org/zed/schedule.html16:11
bauzas#info Zed-3 is in *2 days*16:11
bauzas#link Zed tracking etherpad: https://etherpad.opendev.org/p/nova-zed-blueprint-status16:11
bauzas#link https://etherpad.opendev.org/p/nova-zed-microversions-plan16:11
bauzasI've looked at the blueprint status16:11
bauzasI'll do the launchpad status cleanup later this week16:11
bauzasto make sure it shows the right progress16:11
bauzascorrect me if I'm wrong, but I'm seeing for the moment 3 different series to focus reviews on16:12
bauzasoh my bad, 5 16:13
bauzasfrom what I've seen, there is good review momentum on the userdata series16:13
gmannfor RBAC, this is last patch for documentation and releasenotes #link https://review.opendev.org/c/openstack/nova/+/85488216:14
gibiI think user_data + rebuild bfv can land today16:14
gmannsean-k-mooney is waiting for gibi to review, I think?16:14
bauzasgibi: and they have a proper microversion, so no conflict is expected besided a small one16:14
gibigmann: ohh, I thought I pushed +A there16:14
gibigmann: I will fix it right away16:14
bauzasbesides*16:14
gmanngibi: thanks16:15
bauzasI'll put my attention to the open reviews tomorrow morning16:15
gibigmann: done16:15
gmannthanks again16:16
dansmiththe top of the bfv one,16:16
bauzasfun fact: I'm asked to write our Zed cycle highlights while approx. 50% of the open blueprints are still under review16:16
bauzaspoke elodilles ;)16:16
dansmithjust hit a pypi indexing fail, so it will have to be rechecked when it finishes16:16
sean-k-mooneygmann: yep was going to appove it if it was not apporve by my end of day by gibi16:16
* elodilles is shocked :-o16:16
bauzasdo people need a group discussion here at the meeting for any of those series ?16:16
bauzasor can we continue to review off the meeting ?16:17
gmannsean-k-mooney: thanks.16:17
sean-k-mooneyi think we can continue off meeting for the most part16:17
bauzascool16:17
bauzas#topic Review priorities 16:18
bauzas#link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+(label:Review-Priority%252B1+OR+label:Review-Priority%252B2)16:18
bauzasmost of them are for bugs16:18
bauzasI'll look at them after FF16:19
bauzasfor the two series having a review-prio flag, they are already under great review attention and close to being merged16:19
bauzas#topic Stable Branches 16:20
bauzaselodilles: want to continue ?16:20
elodillesyepp, thanks16:20
elodillesthough, since we are around FF not so much happening around stable branches16:21
elodilles#info stable/stein (and older) are blocked: grenade and other devstack based jobs fail with the same timeout issue as stable/train was previously16:21
elodilles#info newer branches should be OK16:21
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:21
elodillesthat's it16:22
bauzasthanks16:22
bauzasanything to point out ?16:23
bauzasif not, we can wrap it up16:23
elodillesnothing else from me16:23
bauzasok, last but not the least16:24
bauzas#topic Open discussion 16:24
bauzasnothing is on the agenda16:24
bauzasand we're 2 days from FF16:24
bauzasanything urgent to raise ?16:24
bauzasI maybe have one16:24
bauzasthis is premature and I don't have the full picture in my mind about the current whole Zed status, but do people feel we should grant some exceptions?16:25
bauzasproactively I mean16:25
bauzasnot which ones16:25
bauzasbut should we all agree on discussing this if needed ?16:25
gibiwe have 2 weeks between FF and RC!16:26
gibiRC116:26
sean-k-mooney realistically I think if we were to, I would only extend until the next meeting16:26
gibiyeah16:27
bauzasyup, me too16:27
bauzasif you don't mind, let's just say we're not against that, provided at least one core agreed a series16:27
gibiI want to believe that we can land the PCI series but it is a lot of patches16:27
sean-k-mooneymaybe revisit on Thursday16:27
bauzasand provided we're able to merge till next week16:27
sean-k-mooneygibi: I think we can land enough of it to be useful16:27
bauzassean-k-mooney: correct, that was my point16:27
sean-k-mooneyunsure if all of it16:27
bauzaswe're just on a meeting16:28
bauzasok, looks we're in a consensus16:28
gibisean-k-mooney: yeah I feel the same but I would like to aim for all :D16:28
sean-k-mooneyme too16:28
bauzaswe'll just leave room for this, but we'll consider it on Thursday on a case-by-case basis anyway16:28
bauzasthat's all I wanted16:28
sean-k-mooney+116:28
bauzasthat's it for me then16:29
bauzasunless someone else has something to raise, we can close16:29
sean-k-mooneyjust one other thing to note: dansmith noticed a possible pypi CDN issue that may block things merging for a bit16:29
sean-k-mooneyhopefully it will resolve itself shortly16:29
gibifingers crossed16:29
bauzasshit, ok16:30
* bauzas will send hugs to the pypi operators16:30
bauzasanyway, thanks all 16:31
sean-k-mooneythere is nothing we can do really so no need to worry right now16:31
sean-k-mooneyo/16:31
bauzas#endmeeting16:31
opendevmeetMeeting ended Tue Aug 30 16:31:17 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:31
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2022/nova.2022-08-30-16.00.html16:31
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2022/nova.2022-08-30-16.00.txt16:31
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2022/nova.2022-08-30-16.00.log.html16:31
elodilleso/16:31
sean-k-mooneyelodilles: before you go16:32
sean-k-mooneyelodilles: have you seen https://9f27113a1a10a64b577d-d92e3f8cc209d4a3d1e66263399702fb.ssl.cf1.rackcdn.com/855022/3/check/openstack-tox-py39/62146ba/testr_results.html16:32
sean-k-mooneyreference and actual look pretty identical to me16:33
sean-k-mooneyand it passed on py3616:33
sean-k-mooneyI was going to try running that locally to confirm, but just wondering if I should just recheck to see if it passes16:33
opendevreviewBalazs Gibizer proposed openstack/nova master: Create RequestGroups from InstancePCIRequests  https://review.opendev.org/c/openstack/nova/+/85277116:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Support resource_class and traits in PCI alias  https://review.opendev.org/c/openstack/nova/+/85331616:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Split PCI pools per PF  https://review.opendev.org/c/openstack/nova/+/85444016:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Map PCI pools to RP UUIDs  https://review.opendev.org/c/openstack/nova/+/85411816:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Make allocation candidates available for scheduler filters  https://review.opendev.org/c/openstack/nova/+/85411916:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Filter PCI pools based on Placement allocation  https://review.opendev.org/c/openstack/nova/+/85412016:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Factor out base class for candidate aware filters  https://review.opendev.org/c/openstack/nova/+/85492916:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Store allocated RP in InstancePCIRequest  https://review.opendev.org/c/openstack/nova/+/85412116:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Func test for PCI in placement scheduling  https://review.opendev.org/c/openstack/nova/+/85412216:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Support cold migrate and resize with PCI tracking in placement  https://review.opendev.org/c/openstack/nova/+/85424716:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Support evacuate with PCI in placement  https://review.opendev.org/c/openstack/nova/+/85461516:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Support unshelve with PCI in placement  https://review.opendev.org/c/openstack/nova/+/85461616:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Support same host resize with PCI in placement  https://review.opendev.org/c/openstack/nova/+/85444116:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Test reschedule with PCI in placement  https://review.opendev.org/c/openstack/nova/+/85462616:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Test multi create with PCI in placement  https://review.opendev.org/c/openstack/nova/+/85466316:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Allow enabling PCI scheduling in Placement  https://review.opendev.org/c/openstack/nova/+/85492416:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Follow up for the PCI in placement series  https://review.opendev.org/c/openstack/nova/+/85518516:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Doc follow up for PCI in placement  https://review.opendev.org/c/openstack/nova/+/85518616:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Drop InstanceEventFixture  https://review.opendev.org/c/openstack/nova/+/85526216:36
sean-k-mooneyelodilles: ya, those all passed locally with py3916:37
sean-k-mooneyI'll run them a few times but I think that's just an intermittent failure perhaps16:37
gibidansmith: if you could +A https://review.opendev.org/c/openstack/nova/+/816157  then we could start landing the rebuild bfv too16:38
gibiI go get some fresh air but I will check back later to see if my +2 is needed somewhere16:42
dansmithgibi: I said above, but I haven't reviewed that one at all16:55
dansmithand I think melwitt did, so probably best to let her ack it16:55
dansmithlet's give her a few hours but if she doesn't pop up today maybe I can review it enough to ack it16:56
elodillessean-k-mooney: hmmmm, looks strange. the difference is some alignment on the title of the tables. thanks, i'll try to look into it tomorrow.16:56
sean-k-mooneythey passed locally so I'll recheck the first patch and see17:07
gmannbauzas: just a heads up, your PTL nomination patch is still showing WIP/merge conflict, maybe you need to rebase on master? https://review.opendev.org/c/openstack/election/+/85263017:12
sean-k-mooneyI'm going to go get something to eat, I might do some reviews later17:19
gibidansmith: ack17:29
dansmithgibi: ah crap, I dunno why my delete of that instance event file didn't work18:30
dansmithah, my over-use of shell macros I bet18:32
opendevreviewDan Smith proposed openstack/nova master: Add API support for rebuilding BFV instances  https://review.opendev.org/c/openstack/nova/+/83088318:32
dansmithgibi: as you said, we might as well keep working on the top patch until it gets closer to the bottom, so ^18:32
gibidansmith: sure, I abandoned mine18:46
gibiand put the +2 back to the top of bfv18:46
dansmithgibi: I'm about to commit some comments on the user_data patch,18:55
dansmithcan you have a look?18:55
dansmithgibi: specifically this: https://review.opendev.org/c/openstack/nova/+/816157/comments/f31f7593_ff5dd9fc18:57
dansmithisn't this just using instance sysmeta to avoid an RPC bump?18:58
dansmithand thus it's side-stepping any RPC or object versioning we'd normally have for something like this?18:58
dansmithrebuild is basically growing a new feature, and we need to pass it a flag,18:58
dansmithbut instead of the flag and version bump, we're stashing it in instance sysmeta18:58
dansmithwhich means we can't fail the call with "sorry we're pinned to RPC 6.1 so we can't do that right now"18:59
dansmithsean-k-mooney: ^18:59
sean-k-mooney[m]the original idea was to allow updating the user data and then regenerating the config drive the next time we reboot the instance19:01
dansmithbut that's not how it's implemented now right?19:01
sean-k-mooney[m]although I think we changed our mind and blocked update with config drive19:01
dansmithright19:01
sean-k-mooney[m]so I'm not sure why we are blocking it now19:02
sean-k-mooney[m]but that is why it was done that way19:02
dansmithwhy does hard reboot not just always regenerate config drive?19:02
sean-k-mooney[m]no19:02
sean-k-mooney[m]it does not do it today at all19:02
dansmithI'm saying.. why not just make it regenerate always19:03
dansmithinstead of the dirty flag19:03
dansmiththen you get to say it's best-effort, based on compute and virt support for doing so19:03
sean-k-mooney[m]that would need to pull the data from the db but I guess we could19:03
dansmithso?19:04
sean-k-mooney[m]just saying that's the side effect19:04
dansmiththe way it is right now, you've basically created a shadow RPC interface with no versioning which also accumulates in the DB19:04
sean-k-mooney[m]I think we wanted to avoid that19:04
sean-k-mooney[m]I'm trying to think if there is any other downside to always doing it19:05
sean-k-mooney[m]we added a trait to signal if the backend supports regenerating it19:05
dansmithyeah, which also requires that we ask placement if *we* support a thing, which seems kinda odd :)19:06
sean-k-mooney[m]well it's a compute capability trait19:06
sean-k-mooney[m]we have several like that19:06
dansmithbut we don't have to check placement for that right?19:06
sean-k-mooney[m]am i dont think its in the api db so normally i think we do check placement19:07
dansmiththe ones that we use for scheduler filtering make sense of course, but I thought we wrote them somewhere we could get at them ourselves19:07
dansmithanyway19:08
dansmiththe shadow RPC interface seems much worse to me19:08
sean-k-mooney[m]I don't think it's in the compute nodes table so I'm not sure where they would be19:08
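(For reference, a sketch of the capability-trait mechanism under discussion: virt drivers declare boolean capabilities, and the driver base class maps them to COMPUTE_* os-traits that the resource tracker reports to placement. The capability name shown is invented for this feature.)

    # Sketch based on the nova/virt/driver.py pattern; the capability name
    # below is hypothetical for the config drive regeneration feature.
    class LibvirtDriver(ComputeDriver):
        capabilities = dict(
            ComputeDriver.capabilities,
            supports_config_drive_regeneration=True,  # hypothetical flag
        )

    # ComputeDriver.capabilities_as_traits() translates each enabled
    # capability into a COMPUTE_* trait on the compute node resource
    # provider, hence "ask placement if *we* support a thing".
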
sean-k-mooney[m]I guess we were not really thinking of it as an rpc interface19:09
sean-k-mooney[m]just some metadata on the instance19:09
sean-k-mooney[m]but I see your point19:09
dansmithwell the test is, that if you ran this under grenade,19:09
dansmithyou'd allow the reboot with the new user data, but it wouldn't get honored19:09
sean-k-mooney[m]for an un-upgraded compute19:09
sean-k-mooney[m]hum19:10
sean-k-mooney[m]ya, you're right19:10
dansmithso you go to a lot of work to return a fail to the API call if it's not honor-able, but then you'll quietly say "got it" and send it off to a compute that will ignore it :)19:10
sean-k-mooney[m]so we would also need a compute service bump19:10
sean-k-mooney[m]and min version check19:10
dansmithnot really,19:10
dansmiththe rpc pinning will handle that for you19:10
sean-k-mooney[m]if we changed the hard reboot api with a new parameter19:11
dansmiththe auto pin will stick to the minimum supported version, and the rpcapi.py will raise if it can't send at v6.1 so you can error the api call19:11
dansmithright19:11
sean-k-mooney[m]or always rebuild as you said19:11
dansmithif we always rebuild, then we need a service version and check,19:11
dansmithbecause then you're assuming the compute will do a thing that it might not19:11
dansmithletting rpc handle it is simpler and more direct19:11
sean-k-mooney[m]yep, that's why I was thinking of the check initially19:11
sean-k-mooney[m]ya fair point19:12
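(A sketch of the "let RPC handle it" pattern for nova/compute/rpcapi.py being agreed on here; the flag and exception names are illustrative, not the code that eventually merged.)

    # Sketch: version-gating a new reboot flag in the compute RPC API.
    # With automatic pinning, a deployment with old computes stays pinned
    # below 6.1, so the API can fail fast instead of silently dropping
    # the new behaviour on an old compute.
    def reboot_instance(self, ctxt, instance, block_device_info,
                        reboot_type, regenerate_config_drive=False):
        client = self.router.client(ctxt)
        if regenerate_config_drive and not client.can_send_version('6.1'):
            # hypothetical exception, surfaced as an API error:
            # "sorry, we're pinned to RPC 6.0 so we can't do that right now"
            raise exception.ConfigDriveRegenerationNotSupported()
        ...
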
sean-k-mooney[m]how do you want to proceed19:12
dansmithI dunno, it sucks that this is going to bump the bfv one too because it already touches rpc :/19:12
dansmithbut this also seems very wrong19:12
dansmithI can probably bang out the RPC change pretty quick19:13
dansmithbut it might push either of these into FFE territory,19:13
sean-k-mooney[m]I feel like this is not the first time we have done it this way. but that does not mean it was right before19:13
dansmithwell, I hope we haven't done this before19:14
dansmiththis is the whole reason we have all the rpc, object, and service version plumbing19:14
dansmithfor this exact sort of thing19:14
sean-k-mooney[m]not via system metadata19:14
sean-k-mooney[m]I'm actually thinking about a bug fix we did19:14
sean-k-mooney[m]where we needed to fix it and backport19:15
sean-k-mooney[m]but that was because we could not change the rpc19:15
sean-k-mooney[m]I don't think we have done this in a new feature19:15
dansmithwell, it kinda depends on the mechanics as to whether that was okay or not19:15
dansmiththis is basically a new flag to an RPC call, so side-stepping it is pretty bad19:16
dansmithif it's the other way around, like compute decorating an instance so the controllers can see something, that's a bit different19:16
dansmithI'm guessing there's no tempest test for this and thus no hope that even luck would get us a test fail on a grenade job?19:17
sean-k-mooney[m]we stashed a flag in the port profile in the migration vif object https://github.com/openstack/nova/blob/master/nova/objects/migrate_data.py#L81-L9219:18
dansmithI mean, it'll clearly break, I just mean to demonstrate it19:18
dansmithbecause if we did, this couldn't even merge19:18
sean-k-mooney[m]there is no tempest test for this, no19:18
dansmiththat seems to me to be the "should we merge this as-is".. if we can easily write a valid tempest test that would fail on the grenade job, that's pretty bad19:19
sean-k-mooney[m]I don't think it would be that hard to test this although I have not really done much with tempest. let's see if there is a similar test we can modify19:20
*** dasm is now known as dasm|off19:21
dansmithno,19:21
dansmithI'm not saying we should waste time on a test19:21
dansmithI'm saying we *could* easily write a legit test that *will* fail on a grenade job because this is broken19:21
dansmiththat should be our *semantic test* for "is this something we should merge"19:22
dansmithsean-k-mooney[m]: this also has a bit of a race sort of condition in it19:23
dansmithif I do a hard reboot with new user data,19:23
sean-k-mooney[m]ah right, I think you convinced me already that we should not merge it as is19:23
dansmiththe API will write the user data to the instance record, then go to make the rpc call19:23
dansmithif rabbit is down, or the compute is down, the rpc call will fail,19:23
dansmithwell, I guess it will still see the dirty flag in that case, but only if they do another hard reboot19:24
dansmithpoint being, the pre-update of the user data is presumptuous19:24
sean-k-mooney[m]yes19:24
dansmithnot easy to resolve that at this point though19:24
sean-k-mooney[m]the dirty flag was intended to say regenerate the config drive the next time you hard reboot19:25
sean-k-mooney[m]which made sense when the intent was to allow update with config drive19:25
dansmithanother interesting wrinkle in that case,19:25
dansmithis that we'll set that flag even if the compute is way too old,19:25
sean-k-mooney[m]since it was lazy19:25
dansmiththen they reboot, nothing happens.. then 6mo later when the operator upgrades that compute .. poof, user data changes19:26
sean-k-mooney[m]yes but when the compute is eventually updated it will regenerate the config on the next reboot19:26
dansmithright, but that could be many months later19:27
dansmithwhich could be quite confusing for someone,19:27
dansmithespecially if they wrote something to test, tried to set it via hard reboot, it seemed to allow it but never worked, shrugged and went off,19:28
dansmiththen 6mo later it changes suddenly...19:28
dansmiththat user_data script might have been untested, they couldn't test it, so they assumed no harm19:28
sean-k-mooney[m]it could. so originally the problem of how/when to regenerate the config drive came up a few months ago after the spec review19:29
dansmithI upgraded my +0 to -1: https://review.opendev.org/c/openstack/nova/+/81615719:29
sean-k-mooney[m]and at the time the idea of the dirty flag was discussed to allow this19:29
sean-k-mooney[m]it's obvious now that we should have introduced the rpc bump19:29
dansmithyeah I understand the "lazy do it later" aspect, but if it's broken and doesn't get honored right away, that "later" being a half year or more seems like more harm than good19:30
sean-k-mooney[m]but originally I thought that lazy rebuild was actually desirable19:30
sean-k-mooney[m]ack19:30
dansmithlazy is good as long as it's not TOO LAZY :P19:30
sean-k-mooney[m]the rebuild series has both a compute service bump and an rpc bump, right?19:31
dansmithyes19:32
sean-k-mooney[m]so either way it will have to be updated when this is addressed19:32
dansmithyes19:32
sean-k-mooney[m]so 6.1 would be for update_userdata and 6.2 for rebuild19:32
dansmithyeah19:33
dansmithare you thinking be sneaky and combine? I dunno how I feel about that19:33
sean-k-mooney[m]no19:33
sean-k-mooney[m]6.1 for extending hard_reboot19:33
sean-k-mooney[m]and 6.2 for extending rebuild19:34
sean-k-mooney[m]not combining both into 6.119:34
sean-k-mooney[m]since they're in different patches, that would be odd19:34
dansmithack, yep, that's what i'll be, and of course service versions for each bump19:34
dansmith*it'll19:34
sean-k-mooney[m]and we should not squash them obviously since the patches are unrelated19:34
dansmithno, I just thought you were about to "get creative" :)19:35
sean-k-mooney[m]hehe not with correctness19:35
sean-k-mooney[m]outside of a bug fix…19:35
sean-k-mooney[m]jhartkopf: ^19:36
sean-k-mooney[m]not sure you were following that19:37
dansmithnot online, AFAICT19:38
dansmithugh,19:39
dansmithall of the virt and rpc stuff that the rebuild patches touch will conflict out19:39
dansmithI so wish luck had landed these in opposite order19:39
sean-k-mooney[m]by the way, apparently the matrix bridge does not actually only show you the currently online people19:46
sean-k-mooney[m]which is why jhartkopf auto-completed for me even if they are not here19:47
dansmithsean-k-mooney[m]: https://termbin.com/q3ev19:51
dansmithI think that's basically what needs to happen19:51
dansmithI wish we could get a read on this from melwitt and gibi so I know if I should spend my evening trying to get this all changed and tested19:52
sean-k-mooney[m]ignoring your base64 nits that looks about right19:53
dansmithit also seems like the tests on this are pretty lacking, no?19:54
dansmithlike, there are no tests on the virt driver changes?19:55
sean-k-mooney[m]I'm currently trying to set up a devstack env, I don't currently have one to test it. we do still have 2 days until code freeze19:55
dansmithlike, no test that I see that actually checks that the libvirt driver will honor the flag19:55
opendevreviewRico Lin proposed openstack/nova master: Add traits for viommu model  https://review.opendev.org/c/openstack/nova/+/84450719:56
gibiI'm not sure I have enough brain left today for this. But I have at least two comments on the above: 1) I don't think this is a rebuild feature, it is tied to hard reboot 2) I think we prevented the possible upgrade issue with the capability trait.19:56
ricolinsean-k-mooney[m]: just updated the viommu traits patch :)19:56
gibiI cannot argue that this can be done differently19:56
gibiand dansmith you are right that the system metadata dependency makes it at least a grey interface19:57
gibiit is sad that we figured out this issue late in the cycle19:57
dansmithgibi: I don't understand what you mean by #1, but yeah I guess you're right on the trait.. that's preeety thin though :)19:58
gibi#1 is probably just a misunderstanding from19:59
gibi20:58 < dansmith> rebuild is basically growing a new feature, and we need to pass it a flag,19:59
sean-k-mooney[m]the trait won't be reported on a non-upgraded compute, yes20:00
sean-k-mooney[m]which might help for the upgrade case specifically20:00
dansmithgibi: oh yeah I meant reboot there sorry20:00
gibion the flag itself. The RPC flag vs the persisted field has some semantic difference. If we update the user_data in the DB in the API layer and then pass a flag to regenerate the config drive via the RPC, then a lost RPC means that the DB data and the config drive data get out of sync20:00
dansmithgibi: but we can and should revert if we're reporting failure to the user20:01
sean-k-mooney[m]so im not sure if we should revert20:01
dansmithbecause if this happens because of the lack of a trait or version,20:01
gibithe reboot RPC is a cast20:01
gibiso if the RPC lost the API wont notice20:01
sean-k-mooney[m]the reason for that is: today, if we update instance metadata20:01
dansmiththen it will pop into being in six months and be very confusing20:01
sean-k-mooney[m]we don't update the config drive20:01
sean-k-mooney[m]unless you do a cross-cell migration20:02
sean-k-mooney[m]so you can have a delta between the metadata api and the config drive today20:02
melwittugh, my irc client had "frozen", not receiving new messages for only this network, and I didn't realize it until now. had to close and reopen the client to receive and send messages20:02
dansmithgibi: ack, not for a version conflict, but for an actual lost RPC we'd get out of sync.. I'm not sure if that's better or worse than queuing an update for six months later on a different version of the software, but fair point20:03
gibiyeah, both case seems problematic 20:03
dansmithindeed20:03
melwittI just skimmed through yalls review comments from today a little while ago and don't have a handle yet on what's going on. I will read further and add a comment once I understand it20:03
dansmithI guess the trait eliminates the acute concern of this being actually broken, but I'm still concerned about setting the precedent for shadow RPC interfaces in metadata, even if protected by a flag like that20:04
gibicould we do both the DB update and the config driver regeneration from the nova-compute service?20:04
sean-k-mooney[m]do we consider the user_data to be of higher importance to update than other info in the config drive20:04
dansmithit's what we have versioning for and how we do math about "can we do this now or not"20:04
sean-k-mooney[m]that we don't regenerate today20:04
dansmithgibi: well my first thought was not a flag, but pass the user data to the reboot call, and let it update it20:05
dansmithgibi: that would be much better all around20:05
dansmithI need to look at the potential size limit though20:05
sean-k-mooney[m]like if you attach a volume/interface or update server metadata, that won't get updated in the config drive today20:05
gibiahh yeah, it is a blob20:05
dansmithif you can pass a MiB that would be bad...20:05
sean-k-mooney[m]it's 64k I think20:06
* dansmith notes that he said "I wish melwitt and gibi were around" .. and then it happened ;P20:06
dansmithsean-k-mooney[m]: is it?20:06
sean-k-mooney[m]it's large, yes20:06
dansmithwe might want to make that bigger though at some point, so expecting to put that into an rpc message might be a bad idea long-term20:06
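(The 64k figure matches the API request schema limit on user_data, which is along these lines; a paraphrase, as the exact schema location varies across releases.)

    # user_data is validated as base64 text capped at 64 KiB, roughly as in
    # nova/api/openstack/compute/schemas/servers.py (paraphrased):
    'user_data': {
        'type': 'string',
        'format': 'base64',
        'maxLength': 65535,
    }
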
melwittit was a bit of a coincidence :) I saw sean's comment on the userdata review come in email and they said "I talked to dan" but I didn't see any talking to dan in the channel. that's when I realized my client was messed up20:07
dansmithaha20:07
gibiI need to drop for the night. I'm fine with pulling user_data out of the release while we design it better. I just wish we could somehow avoid, in the future, pushing contributors into a design dead end and then pulling the rug out at FF.20:08
dansmithI know the feeling because I was arguing that we not do that for bfv rebuild either20:08
gibiyeah20:09
dansmithand I noted in my comment that (a) I know the implication and (b) I'm willing to scramble on the work20:09
melwittso I disconnected and reconnected the network and I saw yalls comments rolling in. but when I sent messages there was no acknowledgement, so I checked the irc logs and my messages weren't there. so I had to escalate to a full quit/start of my client. and now it's working 🙄 20:09
gibiI can look at the patch / comments tomorrow morning. But now I drop. See you tomorrow20:09
dansmithalright20:09
gibio/20:09
dansmithsean-k-mooney[m]: how about this:20:09
dansmithsean-k-mooney[m]: how about we let this land as it is, because theoretically the trait should catch it, and we convert to an RPC interface after the BFV set and before the release20:10
dansmiththat won't be a behavioral change since it *should* be catching it now20:10
dansmithif someone is deploying on master within a two week window they could have some sysmeta cruft, but highly unlikely20:10
sean-k-mooney[m]ack we can likely get that working by the end of the week20:11
sean-k-mooney[m]as part of the follow up patch once bfv is landed20:11
dansmithit's mostly what I just wrote, but 6.220:11
sean-k-mooney[m]yep20:11
dansmithbut I also think that this is missing a lot of testing it should have20:12
sean-k-mooney[m]looking at it again you are right20:12
dansmithhas anyone other than the author tried this on a real devstack? with configdrive?20:12
sean-k-mooney[m]no, I was going to see if I could do that tonight but I might just do it tomorrow at this point20:12
dansmithalso, in my defense, gibi *did* ask me to review this :)20:12
dansmithand I *did* try to punt to melwitt20:13
dansmithand melwitt *did* sabotage her irc client so she "didn't see that"20:13
sean-k-mooney[m]I was going to see if I could create a tempest test for this although I'm not sure how to force the vm to boot on the un-upgraded node for grenade20:13
dansmithyeah you can't really, so you have to boot two and hit both I think20:14
dansmithbut at least it would non-deterministically fail20:14
sean-k-mooney[m]oh, with the anti-affinity filter20:14
dansmithmelwitt: are you caught up yet, enough to grok that ^ plan?20:15
melwittdansmith: yeah I think so20:16
dansmithand what say ye?20:16
melwittthe plan sounds like a good compromise20:17
sean-k-mooney[m]we can likely sync with the author tomorrow, but I can set this up and test it tomorrow in any case and perhaps look at more testing20:18
sean-k-mooney[m]so see if we can harden this and unblock bfv20:19
dansmithI just un-1'd it with a writeup20:20
dansmithsean-k-mooney[m]: are you saying that because you want to test it before it merges, given it has no real testing now?20:20
dansmithor because you expect some change to this again?20:20
sean-k-mooney[m]they were going to work on a follow-up patch to address some nits20:21
sean-k-mooney[m]and I want to test it and figure out what else they should add tests for in that20:21
dansmithokay but we told them to do those as a follow-up right?20:21
sean-k-mooney[m]yep20:21
dansmithokay, well, tomorrow our runway is even shorter20:22
dansmithso hopefully you can do that in the morning :)20:22
melwittsean-k-mooney[m]: something I was confused on was whether the admin password (if there is one) would be preserved across a configdrive recreate. it seemed like yes? based on userdata update during rebuild impl, but I couldn't find definitively how that works20:23
dansmithI guess I thought you all were more confident in this with all the +2s it had20:23
sean-k-mooney[m]I was prioritising it more than I would otherwise because of the bfv series on top20:24
dansmithyeah the ordering was unfortunate20:25
sean-k-mooney[m]I should have reviewed it more closely, sorry. thanks for reviewing though, even if the timing is not ideal it's better to get this right20:26
dansmithyep, so maybe if you test in the morning and +W it we can get the others in the queue and I'll start on the proper RPC stuff based on my pastebin above when I'm around20:27
opendevreviewRajat Dhasmana proposed openstack/python-novaclient master: Add support to rebuild boot volume  https://review.opendev.org/c/openstack/python-novaclient/+/82716321:23
opendevreviewRajat Dhasmana proposed openstack/python-novaclient master: Add support to rebuild boot volume 2.94  https://review.opendev.org/c/openstack/python-novaclient/+/82716321:31
whoami-rajat^ rebased on top of 2.93 to avoid conflicts later21:33
whoami-rajatsean-k-mooney[m], jfyi, you have a -2 on the 2.93 client patch https://review.opendev.org/c/openstack/python-novaclient/+/81615821:34
sean-k-mooney[m]oh, I didn't clear that after you respun21:41
sean-k-mooney[m]I probably won't review that tonight but I'll drop the -2, sorry about that21:41
sean-k-mooney[m]oh, that's the user data one21:43
whoami-rajatyep, you already dropped the -2 from mine some time back but without the user data one, my change can't get in :)21:44
sean-k-mooney[m]I meant to do that when the spec was approved21:48
sean-k-mooney[m]there is no reason your bfv series would not work with the lvm driver, right?21:50
sean-k-mooney[m]I have a devstack deploying to test the user_data update but forgot to enable ceph21:50
sean-k-mooney[m]but I can pull in your changes afterwards and test them if I get time21:51
sean-k-mooney[m]dansmith:  https://termbin.com/650223:14
sean-k-mooney[m]I'll try ubuntu 20.04 in the morning, but on c9s the regeneration fails23:15
dansmithugh23:15
dansmithI was not expecting that kind of failure23:15
dansmithseems if we get to the mkisofs part we should be as good as otherwise23:16
sean-k-mooney[m]not when we are just calling an existing function23:16
dansmithso the configdrive was created properly on instance boot but failed on regenerate?23:16
sean-k-mooney[m]ya it might be because of the format23:17
sean-k-mooney[m]this might be trying to update the iso23:17
dansmithoh you mean code that works if the iso doesn't exist but fails if it already does?23:18
sean-k-mooney[m]when we use vfat that is allowed, but for iso it needs to delete and recreate the file23:18
dansmithright23:18
dansmithI guess I expected it to overwrite, but maybe that's why permission denied, because qemu owns it now?23:18
sean-k-mooney[m]ya, the previous if only ran that code if the file did not exist23:18
dansmithyeah23:18
sean-k-mooney[m]https://review.opendev.org/c/openstack/nova/+/816157/15/nova/virt/libvirt/driver.py#495023:18
dansmithman, if this really doesn't work (with isofs) and nobody tested until now ... :P23:19
sean-k-mooney[m]the reboot continues23:19
sean-k-mooney[m]the error is logged and ignored23:19
sean-k-mooney[m]so if you don't actually look at the logs it looks like it worked from the api23:19
dansmithyeah, but I hope that's not why it went unnoticed23:19
dansmiththis is also why a tempest test would be good...23:20
dansmithearly in the bfv rebuild, we were doing the same.. quietly not doing the rebuild, and I wrote a test that touches a file, then does the rebuild, and asserts that it's gone.. and that pointed out that we weren't actually doing it23:20
sean-k-mooney[m]looking at the vfat path, I think it would reopen the file and reformat it https://review.opendev.org/c/openstack/nova/+/816157/15/nova/virt/libvirt/driver.py#495023:22
sean-k-mooney[m]although I'm kind of surprised it's failing like this for iso23:23
sean-k-mooney[m]the upper function https://github.com/openstack/nova/blob/master/nova/virt/configdrive.py#L13723:23
dansmithI gotta run in a minute23:23
sean-k-mooney[m]oh never mind23:23
dansmithI thought you were going to do this tomorrow?23:23
sean-k-mooney[m]yep23:24
sean-k-mooney[m]so the upper function creates a temp dir with the metadata files and then turns that into an iso using the final path as the output23:24
sean-k-mooney[m]I thought it was going to do it in the temp dir and move it23:24
sean-k-mooney[m]so ya, if qemu has a lock on that file it will cause a permission denied23:25
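(The flow being traced, as a simplified sketch of nova/virt/configdrive.py; see the links above for the real code.)

    # Simplified sketch of ConfigDriveBuilder.make_drive(): the metadata
    # files go into a tempdir, but the image itself is written at the
    # *final* path. For iso9660, genisoimage therefore overwrites the
    # existing file in place, which fails with EACCES while a running
    # QEMU owns/locks it, matching the permission-denied trace above.
    def make_drive(self, path):
        with utils.tempdir() as tmpdir:
            self._write_md_files(tmpdir)
            if CONF.config_drive_format == 'iso9660':
                self._make_iso9660(path, tmpdir)  # genisoimage ... -o <path>
            else:  # 'vfat': reopens and reformats the target file
                self._make_vfat(path, tmpdir)
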
dansmithwell, I think libvirt chowns it before boot doesn't it?23:25
dansmithI was thinking ownership not lock23:25
sean-k-mooney[m]the issue is this is happening at the wrong time23:27
sean-k-mooney[m]the vm is still running23:27
sean-k-mooney[m]it needs to happen when the vm is stopped23:27
dansmithoh right, before the destroyed message23:27
dansmithif this really needs that level of care, I'm going to recommend we swap the order of the bfv stuff23:28
sean-k-mooney[m]right now it's at the top of hard reboot https://review.opendev.org/c/openstack/nova/+/816157/15/nova/virt/libvirt/driver.py#388723:28
dansmithyeah, before destroy23:28
sean-k-mooney[m]ya, so I don't object to swapping the order given the issues with the user data patch as is23:29
sean-k-mooney[m]anyway time for me to go sleep o/23:29
dansmithI guess if we're about to destroy, it's not *as* bad that we write to it before, but we could be competing with writes from the guest in the vfat case23:29
dansmithstill, wrong as you note23:29
dansmithack, thanks for testing, g'nite23:29
sean-k-mooney[m]by the way, I tested the novaclient and osc changes too, which do work, although I'm going to ask for the osc change to take a user-data file instead of a string, or at least have that as an option23:32
sean-k-mooney[m]I'm pretty sure that's what we do on boot23:32
sean-k-mooney[m]passing the base64-encoded string on the command line is annoying without doing $(echo "stuff" | base64)23:33
sean-k-mooney[m]I guess that's less important if the nova part is broken23:34
dansmithoh yeah, I hadn't even looked23:35
dansmithpassed as a CLI string makes no sense23:35
