Friday, 2025-01-17

opendevreviewStephen Finucane proposed openstack/nova master: libvirt: Wrap un-proxied listDevices() and listAllDevices()  https://review.opendev.org/c/openstack/nova/+/93931711:08
gibifolks, what is a good way to change a multinode zuul job (nova-next) to have a different placement config than the default? I cannot just set vars:devstack_local_conf as that fails because it tries to be applied on all nodes and the computes have no placement service. Setting it in group-vars for the controller only does not work either, as then I lose what is already set via vars.13:54
gibicontext https://review.opendev.org/c/openstack/nova/+/93727513:54
gibido I need to duplicate the settings from vars to group-vars?13:55
dougszuHas anyone seen ephemeral volumes corrupted after a VM is migrated?  I see Nova create the filesystem on the backing disk file during instance create, but then regenerate the backing disk on the destination HV. It leads to the UUID changing in the backing disk file, and the VM sees the filesystem as corrupted (for a subset of filesystems which are good spotting the anomaly). 13:56
dougszu*good at spotting13:56
dougszuhappy to create a bug and work on it, if there isn't something already13:57
opendevreviewBalazs Gibizer proposed openstack/nova master: [CI][nova-next]test with placement ac breadth-first  https://review.opendev.org/c/openstack/nova/+/93727514:04
sean-k-mooneygibi: so i can think of a few hacks but i thought we fixed this in the past14:20
sean-k-mooneygibi: there was a patch proposed before to make sure that before the post-config code ran it would do a make -p on the parent path14:21
sean-k-mooneythe hack is in the devstack_local_conf section you could just add that14:21
sean-k-mooneysorry mkdir -p14:21
gibisean-k-mooney: still there is no placement.conf to edit on the computes so mkdir -p alone is not enough14:24
sean-k-mooneydougszu: so nova will redownload the backing image from glance and copy it from the source if the image is not deleted14:25
sean-k-mooney*is deleted14:25
sean-k-mooneygibi: the file should not be required to exist14:25
sean-k-mooneyit should create it if its not there14:25
sean-k-mooneygibi: at least thats what my memory says is the expected behavior14:26
sean-k-mooneyim trying to find this in the devstack repo now14:26
dougszusean: what I don't see is the attempt to download the backing file - perhaps it is a misconfiguration. I've filed this with steps: https://bugs.launchpad.net/nova/+bug/209517314:26
sean-k-mooneyoh sorry you mean ephemeral storage provisioned via the flavor14:28
sean-k-mooneyso for live migration we copy the data14:28
sean-k-mooneyfor cold migration we dont as far as i recall14:28
sean-k-mooneyi would need to check but we do not expect the content of the ephemeral disks to be preserved for a number of operations14:29
sean-k-mooneyresize is one of them, i dont know about cold-migration. live migration is intended to preserve the data14:29
dougszuThanks - I see the ephemeral data copied for both live and cold. The bug affects both14:30
sean-k-mooneyso the content of the backing file should not really affect the resulting image since its meant to be blank14:31
sean-k-mooneybut i think i know why this is happening14:31
dougszuYeah, so I think you can configure it that way - to not create a FS on the ephemeral drive. But if you do tell Nova to create FS, it will create it on the backing file.14:31
sean-k-mooneyi think we are currently formatting the backing file if you have not disabled that14:31
sean-k-mooneydougszu: right so we have been discussing removing the ability for nova to format the ephemeral storage14:32
sean-k-mooneywe planned to disable it by default and then deprecate it but i dont think we have done that yet14:32
sean-k-mooneyfor the ephemeral disk there is no backing file in glance14:33
sean-k-mooneyso the only option on cold or live migrate would be to copy it from the source node, otherwise we would generate a new one on the dst with no reference to the old one and a different uuid as a result14:33
dougszucurious -yeah, that would work, it forces the FS into the top layer14:33
dougszuand fixes the issue14:33
dougszubut not for existing VMs :D14:34
gibisean-k-mooney: so you suggest that devstack_local_conf post-config should work even if the referenced service is not deployed and the config file is not there as it i) creates the missing dir ii) generates a new file with the post-config content if the file does not exist?14:34
sean-k-mooneyright so eventually we also dont want to have backing files for test disks at all14:34
sean-k-mooneythe fact they have them is a side effect of how the qcow driver is currently implemented and not really intentional14:34
sean-k-mooneygibi: that is how i think it was originally intended to work yes14:35
sean-k-mooneyor at least that is how i would have implemented it14:35
dougszuYou also can't copy the backing file from the source node. Eg. if you had two identical VMs on different HVs, and migrated them both to another HV, they would fight over the backing file.14:35
sean-k-mooneydougszu: you can but you need to have a different name for it14:36
gibisean-k-mooney: ack, I can also try to dig devstack...14:36
gibibased on thsi14:36
gibithis14:36
dougszuyeah - agree on that - different name would work14:37
sean-k-mooneydougszu: but ya we dont generate a reproducible file system on the ephemeral backing file today as far as im aware.14:37
sean-k-mooneyi agree this is a bug of some kind14:37
sean-k-mooneyits not clear to me what the best path forward is in the short term14:37
sean-k-mooneylong term there should be no backing file and or no data in the backing file 14:38
dougszuthanks Sean, that's all helpful. I think as you say, the easiest fix is to just not support created the FS on the backing file14:38
dougszu*creating14:38
dougszuI will have a think about a fix for existing VMs. I think you can rebase the top layer onto a new backing file for one.14:39
sean-k-mooneywe wanted to make that change to reduce the security footprint of nova14:39
sean-k-mooneydougszu: we might be able to flatten the image on migration, im not sure14:39
dougszuIt also avoids issues with for example Nova making a v5 XFS FS and then booting a C7.9 image with XFS 4 support etc14:40
dougszuyeah, flatten is also interesting14:40
sean-k-mooneyso i think this bug is another reason to implement https://blueprints.launchpad.net/nova/+spec/default-ephemeral-format-unformated sooner rather than later14:44
sean-k-mooneyi cant commit to doing that this cycle but i might see if i can find time to try14:45
dougszu^ that looks good. I'd be happy to test/review/help out.14:49
sean-k-mooneygibi: i think this is the relevant function https://github.com/openstack/devstack/blob/38f8f4da4556c11bf526392359ed6c14b45d87ea/inc/meta-config#L7914:50
gibisean-k-mooney: thanks!14:52
sean-k-mooneythe comment says it should create the file if it does not exist, which is the behavior i recall. i think its just failing because /etc/placement is not there for it to create the file under, hence the mkdir -p comment14:53
gibiOK I will dig14:55
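
The behaviour discussed above (create the missing parent directory, then create the config file if it does not exist, then apply the post-config settings) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: devstack's real implementation is the bash code in inc/meta-config, and the helper name merge_post_config, the section name "example" and the option name "some_option" are hypothetical placeholders, not devstack or placement names.

```python
import configparser
from pathlib import Path


def merge_post_config(conf_path, section, options):
    """Hypothetical helper: merge options into an INI file, creating it if needed."""
    path = Path(conf_path)
    # Equivalent of `mkdir -p` on the parent path, e.g. /etc/placement
    path.parent.mkdir(parents=True, exist_ok=True)

    # Interpolation is disabled so values containing '%' are stored verbatim
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips a file that does not exist, so a missing
    # config file simply results in an empty parser
    parser.read(path)

    # The DEFAULT section always exists and cannot be added explicitly
    if section != parser.default_section and not parser.has_section(section):
        parser.add_section(section)
    for key, value in options.items():
        parser.set(section, key, value)

    # Writing creates the file when it was not there before
    with path.open("w") as handle:
        parser.write(handle)
    return path
```

With a helper like this, a call such as merge_post_config("/etc/placement/placement.conf", "example", {"some_option": "value"}) would not fail on a compute node just because /etc/placement is absent, which is the failure mode described in the conversation.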
opendevreviewDan Smith proposed openstack/nova master: Handle GPT image detection  https://review.opendev.org/c/openstack/nova/+/93392814:58
opendevreviewDan Smith proposed openstack/nova master: DNM: Test with latest oslo.utils  https://review.opendev.org/c/openstack/nova/+/93344414:58
sean-k-mooneygibi: found it https://review.opendev.org/c/openstack/devstack/+/92284615:01
sean-k-mooneygibi: i know i debugged that at some point before15:01
sean-k-mooney*knew15:02
gibisean-k-mooney: awesome, I will put a depends-on on it to see if this helps. 15:06
gibisean-k-mooney: also an easy win review about whitebox in periodic queue https://review.opendev.org/c/openstack/nova/+/83345315:07
sean-k-mooneyoh cool artom updated that15:08
sean-k-mooneydone15:08
sean-k-mooneywe can review it for stability i guess and see how it goes15:08
gibiyepp, bauzas tends to check the periodic state during our weekly meeting so we will see15:10
sean-k-mooneyspeaking of easy wins im also approving https://review.opendev.org/c/openstack/nova/+/939453/1 to disable the heal periodic in nova-next 15:11
sean-k-mooneywhen you have time https://review.opendev.org/c/openstack/nova/+/939476 is the follow up about changing the default15:11
sean-k-mooneyi was going to add that to next weeks team meeting to get input15:11
* sean-k-mooney added15:14
gibiack15:25
dansmithugh.. this is wrong: https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.snapshot_image_format15:57
dansmiththe default is not "the same type as source image" it's "same type as the ephemeral backend is using"15:58
opendevreviewBalazs Gibizer proposed openstack/nova master: [CI][nova-next]test with placement ac breadth-first  https://review.opendev.org/c/openstack/nova/+/93727515:59
sean-k-mooneydansmith: i have not looked at the code but perhaps "source image" in this context was intended to refer to "the format of the file on disk that was snapshotted" rather than glance16:31
sean-k-mooneywhich would match what you are stating16:31
dansmithwell, that's what the code is doing, but I don't think that's how someone would read that doc text16:31
dansmithglance people also had assumed that nova snapshotted in the format of the original (source glance) image16:33
sean-k-mooneyya without reading the docs i would have assumed nova would use the same format in glance as the image was on disk but i agree thats not what i would take away from that text16:33
sean-k-mooneywell novas current behavior is not really surprising to me necessarily16:34
sean-k-mooneybut i agree the docs are incorrect16:34
dansmithit's not really surprising to me either knowing the code and how things work under the covers, I just think that's not what the reader of the doc would think16:35
sean-k-mooney this is also kind of glossing over images_type=rbd16:35
dansmithand so when it was asserted that the format would be the same as the source (glance image) and I read that doc, I assumed that's what we were doing16:35
dansmithyeah, also that16:35
sean-k-mooneydo you know what happens if you set this and you use images_type=rbd ?16:36
sean-k-mooneyit would be reasonable to assume it either has no effect (probably what happens today)16:36
dansmithI assume you mean also where we share a pool16:36
dansmithif sharing, we don't even run this code16:36
sean-k-mooneybut it would also be reasonable to assume we would actually upload an image in the requested format16:36
dansmithif not, we'll do the same thing, I think16:36
dansmithyeah, good point16:37
dansmithsomeone thinks they can set this to get qcow2 images even for things on ceph...16:37
sean-k-mooneyim not really arguing either way just that the config option seems somewhat problematic on other fronts16:38
sean-k-mooneyi kind of get why it exists16:38
sean-k-mooneyi.e. use force raw for the hypervisor but store all snapshots in qcow to save space 16:39
dansmithyeah, problematic indeed, since it will cause us to unwrap a qcow2 into a raw with qemu16:39
dansmithI'm not sure that would really happen unless qemu does empty space detection when you convert, and even if it did, it wouldn't work unless the guest OS had *actually* zeroed sectors16:39
sean-k-mooneyright im thinking of qcows sparse file support16:40
sean-k-mooneybut i also dont know if we enable that when we create the snapshot16:40
sean-k-mooneybut other than that usecase i dont really see a valid reason to ever set this16:41
dansmithapparently it does zero detect and collapse (just tried) but still.. only helps if the guest zeroes the unused sectors or never touches them16:41
dansmithanyway I'm on this because of our need to upload to glance with an accurate (i.e. raw vs gpt) format16:42
sean-k-mooneyim not sure if operators would scream but im inclined to say remove this current config option16:43
sean-k-mooneyim unsure if we should have an option to say "match glance vs match file on disk"16:43
dansmithwell, that's what default is now16:43
sean-k-mooneythe kinder thing to do would be just set the glance type to what we are updating and continue to support this option16:43
dansmithoh, you mean two options16:44
dansmithI see16:44
sean-k-mooneywell this is config driven api behavior right16:44
dansmithI'm not really sure "match glance" is necessary, I just think it's more visible from the user's perspective and also helps us say "nope, you booted from gpt, so you stay gpt"16:44
dansmithyes, it is16:44
sean-k-mooneyas an end user i would kind of expect the snapshot to be in the same format as the original image16:44
dansmithconfig-driven api behavior across two services16:44
sean-k-mooneyas an operator that does not want the image conversion overhead on every boot16:45
sean-k-mooneyi probably want it to match the file on disk16:45
dansmithright, but, you upload a vmdk, it gets converted to raw, and snapshotted as a qcow216:45
dansmithhard to say what behavior will please everyone, but it's also frustrating to have so much variability here16:46
sean-k-mooneyto some degree as long as the data and the format listed in glance align thats actually ok16:46
dansmithit's just more reason the ephemeral stuff is a mess16:46
sean-k-mooneyi mean one way around it would be to add the format to the snapshot api16:46
sean-k-mooneybut im not convinced thats good to expose16:47
sean-k-mooneyfor one i dont know if it would work for any other virt driver16:47
sean-k-mooneybut ignoring that, operators might not want to allow that and it will also complicate this code even more so...16:47
dansmithnah, that would allow users to tickle qemu-img bugs16:48
sean-k-mooneyyep which seems bad. its better to let them convert it themselves after they download16:49
sean-k-mooneywell safer perhaps not better16:49
dansmithyeah16:49
sean-k-mooneydansmith: so are you just going to update the help text and make sure the format in glance is set correctly for now?16:50
dansmithso basically, I think we need to inspect (with format inspector) the snapshot image if the target format is raw, and then upload to glance as raw or gpt depending16:50
sean-k-mooneywell snapshot is only for the root disk16:50
dansmithit is set in glance correctly, AFAICT, but yeah, when I add that behavior I can update the text of the config option16:50
sean-k-mooneyso it will be gpt or one of the other formats but not raw right16:51
dansmithwell, in almost all cases yes, but presumably if you're running something really strange in your guest, or doing direct kernel boot with just a whole-disk filesystem...16:51
dansmithas I said before, I think we should deprecate that behavior (at least require a full disk image even for direct kernel boot)16:51
dansmithbut presumably OS/2 in a guest could be a thing16:52
dansmitheven solaris on sparc has been GPT since 2010, and the arm server spec says GPT16:52
sean-k-mooneyhum im sure there are some bbs vms on openstack somewhere16:52
dansmithso the things you might be running in there that could legit be not on GPT is dwindling16:52
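
The "raw or gpt depending" decision mentioned above can be illustrated with a very small Python sketch. It is not the oslo.utils format inspector and not the code from the GPT detection patch; it only relies on the fact that a GPT header starts with the signature "EFI PART" at LBA 1, and it assumes a logical sector size of either 512 or 4096 bytes. The function names looks_like_gpt and glance_disk_format are hypothetical.

```python
GPT_SIGNATURE = b"EFI PART"
# Assumed logical sector sizes; the GPT header lives at LBA 1,
# so its byte offset equals the sector size.
SECTOR_SIZES = (512, 4096)


def looks_like_gpt(first_bytes):
    """Return True if the GPT header signature is found at LBA 1."""
    for sector in SECTOR_SIZES:
        if first_bytes[sector:sector + len(GPT_SIGNATURE)] == GPT_SIGNATURE:
            return True
    return False


def glance_disk_format(first_bytes):
    """Hypothetical helper: pick 'gpt' or 'raw' for an unwrapped disk image."""
    return "gpt" if looks_like_gpt(first_bytes) else "raw"
```

A caller would pass the first few KiB of the snapshot file, for example the result of reading 8192 bytes from it. A real inspector would validate far more than the signature (header CRC, protective MBR, and so on), so this is only a heuristic.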
dansmithbbs?16:52
sean-k-mooneybulletin board systems16:53
dansmithoh, BBS16:53
dansmithhah16:53
sean-k-mooneyhave you seen the offensively modern BBS post. someone wrote a k8s operator to dynamically scale out a BBS with new pods for connections as needed16:54
dansmithnice16:54
sean-k-mooneyi dont have it to hand but i kind of like how they subverted modern tech to run something like that16:54
dansmithyeah, that's cool16:54
dansmithI still have a CP/M laptop with an integrated 300 baud modem downstairs16:55
dansmithdialed into many a BBS with that thing16:55
dansmithhttps://www.sinasohn.com/cgi-bin/clascomp/bldhtm.pl?computer=starlet16:56
sean-k-mooneyi dont know if you have ever seen a game stream on twitch but their chat system is literally just irc with some extensions baked into the way they format the messages for emoji and the like16:59
dansmithwhat? no threads? how will I miss messages?17:00
sean-k-mooney:)17:00
sean-k-mooney  8x40 LCD ouch not even 80 cols17:01
sean-k-mooney"  Has WordStar built-in as well as a spreadsheet and terminal program. " in 96K of rom17:02
dansmithit was awesome.. I typed some papers on it and transferred to the family PC over serial for printing17:03
sean-k-mooneyi wrote many a school essay on one of these https://en.wikipedia.org/wiki/Palm_V i had a p4 era laptop for a time but fell back to that when it was in for repair for like 6 months. but ya same, serial transfer to a pc and then print17:06
dansmithwith a keyboard?17:08
dansmithI still have my US Robotics Pilot 5000, the original upscale model, with a Palm 3 upgrade chip in it17:08
sean-k-mooneyunfortunately no, with a stylus typing one letter at a time. i did try a laser typing attachment but it was actually slower17:09
dansmithwow17:09
sean-k-mooneybecause of my dyslexia i had a laptop from the age of about 11 for school work. so in class i would use the laptop instead of a notebook outside of science/math17:10
sean-k-mooneywhen it broke i used that instead for a while until it could be fixed. at home i could just use the family pc when my laptop was broken but ya. 17:11
sean-k-mooneyi dont know if i would have the will power to type a 2000 word essay even on a modern phone keyboard at this point17:11
dansmithyeah, graffiti was really really good.. I was definitely faster and more accurate with that than a modern phone17:13
opendevreviewmitya-eremeev-2 proposed openstack/nova master: Add ability do not evacuate if ephemeral devices  https://review.opendev.org/c/openstack/nova/+/93823518:24
opendevreviewmitya-eremeev-2 proposed openstack/nova master: Add ability to preserve ephemeral in evacuation  https://review.opendev.org/c/openstack/nova/+/93823518:44
priteauHello Nova team. Did you notice that openstacksdk-functional-devstack has been consistently failing since earlier today? Always the same error in openstack.tests.functional.compute.v2.test_keypair.TestKeypairAdmin: openstack.exceptions.NotFoundException: No Keypair found for <id>19:46
priteauhttps://zuul.opendev.org/t/openstack/builds?job_name=openstacksdk-functional-devstack&project=openstack/nova19:47
sean-k-mooneywe have not changed anything related to keypairs lately so that does not sound like its related to nova19:47
priteauYes, nothing new merged today19:48
sean-k-mooneyfor it to affect the sdk it would have had to have been an api change or a default policy change19:49
priteauBut I looked on opendev at the list of merged patches today, nothing looked suspect to me. Some requirements bump but they were earlier in the day.19:49
sean-k-mooneyand we have not done either this cycle19:49
priteauWe'll see if it still happens after the weekend19:51
fricklersean-k-mooney: priteau: the only changes that I see got merged in the suspicious timeframe were in keystone, asking over there now20:35
priteaufrickler: I just realised that the first failure (https://zuul.opendev.org/t/openstack/build/1bed6dab1cc1497a9a4a6a4e83d8a11e) had this keystone change included: https://review.opendev.org/c/openstack/keystone/+/93881420:44
opendevreviewPierre Riteau proposed openstack/nova master: [DNM] Checking gate health  https://review.opendev.org/c/openstack/nova/+/93956020:48
