opendevreview | Stephen Finucane proposed openstack/nova master: libvirt: Wrap un-proxied listDevices() and listAllDevices() https://review.opendev.org/c/openstack/nova/+/939317 | 11:08 |
---|---|---|
gibi | folks, what is a good way to change a multinode zuul job (nova-next) to have a different placement config than default? I cannot just set vars:devstack_local_conf as that fails as it tries to be applied on all nodes and computes have no placement service. Setting it in a group-vars that is controller only does not work either as then I lose what is already set via vars. | 13:54 |
gibi | context https://review.opendev.org/c/openstack/nova/+/937275 | 13:54 |
gibi | do I need to duplicate the settings from vars to group-vars? | 13:55 |
dougszu | Has anyone seen ephemeral volumes corrupted after a VM is migrated? I see Nova create the filesystem on the backing disk file during instance create, but then regenerate the backing disk on the destination HV. It leads to the UUID changing in the backing disk file, and the VM sees the filesystem as corrupted (for a subset of filesystems which are good spotting the anomaly). | 13:56 |
dougszu | *good at spotting | 13:56 |
dougszu | happy to create a bug and work on it, if there isn't something already | 13:57 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [CI][nova-next]test with placement ac breadth-first https://review.opendev.org/c/openstack/nova/+/937275 | 14:04 |
sean-k-mooney | gibi: so i can think of a few hacks but i thought we fixed this in the past | 14:20 |
sean-k-mooney | gibi: there was a patch proposed before to make sure that before the post-config code ran it would do a make -p on the parent path | 14:21 |
sean-k-mooney | the hack is in the devstack_local_conf section you could just add that | 14:21 |
sean-k-mooney | sorry mkdir -p | 14:21 |
gibi | sean-k-mooney: still there is no placement.conf to edit on the computes so mkdir -p alone is not enough | 14:24 |
sean-k-mooney | dougszu: so nova will redownload the backing image from glance and copy it from the source if the image is not deleted | 14:25 |
sean-k-mooney | *is deleted | 14:25 |
sean-k-mooney | gibi: the file should not be required to exist | 14:25 |
sean-k-mooney | it should create it if its not there | 14:25 |
sean-k-mooney | gibi: at least that's what my memory says is the expected behavior | 14:26 |
sean-k-mooney | im trying to find this in the devstack repo now | 14:26 |
dougszu | sean: what I don't see is the attempt to download the backing file - perhaps it is a misconfiguration. I've filed this with steps: https://bugs.launchpad.net/nova/+bug/2095173 | 14:26 |
sean-k-mooney | oh sorry you mean ephemeral storage provisioned via the flavor | 14:28 |
sean-k-mooney | so for live migration we copy the data | 14:28 |
sean-k-mooney | for cold migration we dont as far as i recall | 14:28 |
sean-k-mooney | i would need to check but we do not expect the content of the ephemeral disks to be preserved for a number of operations | 14:29 |
sean-k-mooney | resize is one of them i dont know about cold-migration. live migration is intended to preserve the data | 14:29 |
dougszu | Thanks - I see the ephemeral data copied for both live and cold. The bug affects both | 14:30 |
sean-k-mooney | so the content of the backing file should not really affect the resulting image since its meant to be blank | 14:31 |
sean-k-mooney | but i think i know why this is happening | 14:31 |
dougszu | Yeah, so I think you can configure it that way - to not create a FS on the ephemeral drive. But if you do tell Nova to create FS, it will create it on the backing file. | 14:31 |
sean-k-mooney | i think we are currently formatting the backing file if you have not disabled that | 14:31 |
sean-k-mooney | dougszu: right so we have been discussing removing the ability for nova to format the ephemeral storage | 14:32 |
sean-k-mooney | we planned to disable it by default and then deprecate it but i dont think we have done that yet | 14:32 |
sean-k-mooney | for the ephemeral disk there is no backing file in glance | 14:33 |
sean-k-mooney | so the only option on cold or live migrate would be to copy it from the source node otherwise we would generate a new one on the dst with no reference to the old one and a different uuid as a result | 14:33 |
dougszu | curious -yeah, that would work, it forces the FS into the top layer | 14:33 |
dougszu | and fixes the issue | 14:33 |
dougszu | but not for existing VMs :D | 14:34 |
gibi | sean-k-mooney: so you suggest that devstack_local_conf post-config should work even if the referenced service is not deployed and the config file is not there as it i) creates the missing dir ii) generates a new file with the post-config content if the file does not exist? | 14:34 |
sean-k-mooney | right so eventually we also dont want to have backing files for test disks at all | 14:34 |
sean-k-mooney | the fact they have them is a side effect of how the qcow driver is currently implemented and not really intentional | 14:34 |
sean-k-mooney | gibi: that is how i think it was originally intended to work yes | 14:35 |
sean-k-mooney | or at least that is how i would have implemented it | 14:35 |
dougszu | You also can't copy the backing file from the source node. Eg. if you had two identical VMs on different HVs, and migrated them both to another HV, they would fight over the backing file. | 14:35 |
sean-k-mooney | dougszu: you can but you need to have a different name for it | 14:36 |
gibi | sean-k-mooney: ack, I can also try to dig devstack... | 14:36 |
gibi | based on thsi | 14:36 |
gibi | this | 14:36 |
dougszu | yeah - agree on that - different name would work | 14:37 |
sean-k-mooney | dougszu: but ya we dont generate a reproducible file system on the ephemeral backing file today as far as im aware. | 14:37 |
sean-k-mooney | i agree this is a bug of some kind | 14:37 |
sean-k-mooney | its not clear to me what the best path forward is in the short term | 14:37 |
sean-k-mooney | long term there should be no backing file and or no data in the backing file | 14:38 |
dougszu | thanks Sean, that's all helpful. I think as you say, the easiest fix is to just not support created the FS on the backing file | 14:38 |
dougszu | *creating | 14:38 |
dougszu | I will have a think about a fix for existing VMs. I think you can rebase the top layer onto a new backing file for one. | 14:39 |
sean-k-mooney | we wanted to make that change to reduce the security footprint of nova | 14:39 |
sean-k-mooney | dougszu: we might be able to flatten the image on migration, im not sure | 14:39 |
dougszu | It also avoids issues with for example Nova making a v5 XFS FS and then booting a C7.9 image with XFS 4 support etc | 14:40 |
dougszu | yeah, flatten is also interesting | 14:40 |
sean-k-mooney | so i think this bug is another reason to implement https://blueprints.launchpad.net/nova/+spec/default-ephemeral-format-unformated sooner rather than later | 14:44 |
sean-k-mooney | i cant commit to doing that this cycle but i might see if i can find time to try | 14:45 |
dougszu | ^ that looks good. I'd be happy to test/review/help out. | 14:49 |
sean-k-mooney | gibi: i think this is the relevant function https://github.com/openstack/devstack/blob/38f8f4da4556c11bf526392359ed6c14b45d87ea/inc/meta-config#L79 | 14:50 |
gibi | sean-k-mooney: thanks! | 14:52 |
sean-k-mooney | the comment says it should create the file if it does not exist which is the behavior i recall. i think its just failing because /etc/placement is not there for it to create the file under, hence the mkdir -p comment | 14:53 |
gibi | OK I will dig | 14:55 |
opendevreview | Dan Smith proposed openstack/nova master: Handle GPT image detection https://review.opendev.org/c/openstack/nova/+/933928 | 14:58 |
opendevreview | Dan Smith proposed openstack/nova master: DNM: Test with latest oslo.utils https://review.opendev.org/c/openstack/nova/+/933444 | 14:58 |
sean-k-mooney | gibi: found it https://review.opendev.org/c/openstack/devstack/+/922846 | 15:01 |
sean-k-mooney | gibi: i know i debugged that at some point before | 15:01 |
sean-k-mooney | *knew | 15:02 |
gibi | sean-k-mooney: awesome, I will put a depends-on on it to see if this helps. | 15:06 |
gibi | sean-k-mooney: also an easy win review about whitebox in periodic queue https://review.opendev.org/c/openstack/nova/+/833453 | 15:07 |
sean-k-mooney | oh cool artom updated that | 15:08 |
sean-k-mooney | done | 15:08 |
sean-k-mooney | we can review it for stability i guess and see how it goes | 15:08 |
gibi | yepp, bauzas tends to check the periodic state during our weekly meeting so we will see | 15:10 |
sean-k-mooney | speaking of easy wins im also approving https://review.opendev.org/c/openstack/nova/+/939453/1 to disable the heal periodic in nova-next | 15:11 |
sean-k-mooney | when you have time https://review.opendev.org/c/openstack/nova/+/939476 is the follow up about changing the default | 15:11 |
sean-k-mooney | i was going to add that to next weeks team meeting to get input | 15:11 |
* sean-k-mooney added | 15:14 | |
gibi | ack | 15:25 |
dansmith | ugh.. this is wrong: https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.snapshot_image_format | 15:57 |
dansmith | the default is not "the same type as source image" it's "same type as the ephemeral backend is using" | 15:58 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [CI][nova-next]test with placement ac breadth-first https://review.opendev.org/c/openstack/nova/+/937275 | 15:59 |
sean-k-mooney | dansmith: i have not looked at the code but perhaps "source image" in this context was intended to refer to "the format of the file on disk that was snapshotted" rather than glance | 16:31 |
sean-k-mooney | which would match what you are stating | 16:31 |
dansmith | well, that's what the code is doing, but I don't think that's how someone would read that doc text | 16:31 |
dansmith | glance people also had assumed that nova snapshotted in the format of the original (source glance) image | 16:33 |
sean-k-mooney | ya without reading the docs i would have assumed nova would use the same format in glance as the image was on disk but i agree that's not what i would take away from that text | 16:33 |
sean-k-mooney | well novas current behavior is not really surprising to me necessarily | 16:34 |
sean-k-mooney | but i agree the docs are incorrect | 16:34 |
dansmith | it's not really surprising to me either knowing the code and how things work under the covers, I just think that's not what the reader of the doc would think | 16:35 |
sean-k-mooney | this is also kind of glossing over images_type=rbd | 16:35 |
dansmith | and so when it was asserted that the format would be the same as the source (glance image) and I read that doc, I assumed that's what we were doing | 16:35 |
dansmith | yeah, also that | 16:35 |
sean-k-mooney | do you know what happens if you set this and you use images_type=rbd ? | 16:36 |
sean-k-mooney | it would be reasonable to assume it either has no effect (probably what happens today) | 16:36 |
dansmith | I assume you mean also where we share a pool | 16:36 |
dansmith | if sharing, we don't even run this code | 16:36 |
sean-k-mooney | but it would also be reasonable to assume we would actually upload an image in the requested format | 16:36 |
dansmith | if not, we'll do the same thing, I think | 16:36 |
dansmith | yeah, good point | 16:37 |
dansmith | someone thinks they can set this to get qcow2 images even for things on ceph... | 16:37 |
sean-k-mooney | im not really arguing either way just that the config option seems somewhat problematic on other fronts | 16:38 |
sean-k-mooney | i kind of get why it exists | 16:38 |
sean-k-mooney | i.e. use force raw for the hypervisor but store all snapshots in qcow to save space | 16:39 |
dansmith | yeah, problematic indeed, since it will cause us to unwrap a qcow2 into a raw with qemu | 16:39 |
dansmith | I'm not sure that would really happen unless qemu does empty space detection when you convert, and even if it did, it wouldn't work unless the guest OS had *actually* zeroed sectors | 16:39 |
sean-k-mooney | right im thinking of qcows sparse file support | 16:40 |
sean-k-mooney | but i also dont know if we enable that when we create the snapshot | 16:40 |
sean-k-mooney | but other than that usecase i dont really see a valid reason to ever set this | 16:41 |
dansmith | apparently it does zero detect and collapse (just tried) but still.. only helps if the guest zeroes the unused sectors or never touches them | 16:41 |
dansmith | anyway I'm on this because of our need to upload to glance with an accurate (i.e. raw vs gpt) format | 16:42 |
sean-k-mooney | im not sure if operators would scream but im inclined to say remove this current config option | 16:43 |
sean-k-mooney | im unsure if we should have an option to say "match glance vs match file on disk" | 16:43 |
dansmith | well, that's what default is now | 16:43 |
sean-k-mooney | the kinder thing to do would be just set the glance type to what we are updating and continue to support this option | 16:43 |
dansmith | oh, you mean two options | 16:44 |
dansmith | I see | 16:44 |
sean-k-mooney | well this is config driven api behavior right | 16:44 |
dansmith | I'm not really sure "match glance" is necessary, I just think it's more visible from the user's perspective and also helps us say "nope, you booted from gpt, so you stay gpt" | 16:44 |
dansmith | yes, it is | 16:44 |
sean-k-mooney | as an end user i would kind of expect the snapshot to be in the same format as the original image | 16:44 |
dansmith | config-driven api behavior across two services | 16:44 |
sean-k-mooney | as an operator that does not want the image conversion overhead on every boot | 16:45 |
sean-k-mooney | i probably want it to match the file on disk | 16:45 |
dansmith | right, but, you upload a vmdk, it gets converted to raw, and snapshotted as a qcow2 | 16:45 |
dansmith | hard to say what behavior will please everyone, but it's also frustrating to have so much variability here | 16:46 |
sean-k-mooney | to some degree as long as the data and the format listed in glance align that's actually ok | 16:46 |
dansmith | it's just more reason the ephemeral stuff is a mess | 16:46 |
sean-k-mooney | i mean one way around it would be to add the format to the snapshot api | 16:46 |
sean-k-mooney | but im not convinced that's good to expose | 16:47 |
sean-k-mooney | for one i dont know if it would work for any other virt driver | 16:47 |
sean-k-mooney | but ignoring that, operators might not want to allow that and it will also complicate this code even more so... | 16:47 |
dansmith | nah, that would allow users to tickle qemu-img bugs | 16:48 |
sean-k-mooney | yep which seems bad. its better to let them convert it themselves after they download | 16:49 |
sean-k-mooney | well safer perhaps not better | 16:49 |
dansmith | yeah | 16:49 |
sean-k-mooney | dansmith: so are you just going to update the help text and make sure the format in glance is set correctly for now? | 16:50 |
dansmith | so basically, I think we need to inspect (with format inspector) the snapshot image if the target format is raw, and then upload to glance as raw or gpt depending | 16:50 |
sean-k-mooney | well snapshot is only for the root disk | 16:50 |
dansmith | it is set in glance correctly, AFAICT, but yeah, when I add that behavior I can update the text of the config option | 16:50 |
sean-k-mooney | so it will be gpt or one of the other formats but not raw right | 16:51 |
dansmith | well, in almost all cases yes, but presumably if you're running something really strange in your guest, or doing direct kernel boot with just a whole-disk filesystem... | 16:51 |
dansmith | as I said before, I think we should deprecate that behavior (at least require a full disk image even for direct kernel boot) | 16:51 |
dansmith | but presumably OS/2 in a guest could be a thing | 16:52 |
dansmith | even solaris on sparc has been GPT since 2010, and the arm server spec says GPT | 16:52 |
sean-k-mooney | hum im sure there are some bbs vms on openstack somewhere | 16:52 |
dansmith | so the things you might be running in there that could legit be not on GPT is dwindling | 16:52 |
dansmith | bbs? | 16:52 |
sean-k-mooney | bulletin board systems | 16:53 |
dansmith | oh, BBS | 16:53 |
dansmith | hah | 16:53 |
sean-k-mooney | have you seen the offensively modern BBS post. someone wrote a k8s operator to dynamically scale out a BBS with new pods for connections as needed | 16:54 |
dansmith | nice | 16:54 |
sean-k-mooney | i dont have it to hand but i kind of like how they subverted modern tech to run something like that | 16:54 |
dansmith | yeah, that's cool | 16:54 |
dansmith | I still have a CP/M laptop with an integrated 300 baud modem downstairs | 16:55 |
dansmith | dialed into many a BBS with that thing | 16:55 |
dansmith | https://www.sinasohn.com/cgi-bin/clascomp/bldhtm.pl?computer=starlet | 16:56 |
sean-k-mooney | i dont know if you have ever seen a game stream on twitch but their chat system is literally just irc with some extensions baked into the way they format the messages for emoji and the like | 16:59 |
dansmith | what? no threads? how will I miss messages? | 17:00 |
sean-k-mooney | :) | 17:00 |
sean-k-mooney | 8x40 LCD ouch not even 80 cols | 17:01 |
sean-k-mooney | " Has WordStar built-in as well as a spreadsheet and terminal program. " in 96K of rom | 17:02 |
dansmith | it was awesome.. I typed some papers on it and transferred to the family PC over serial for printing | 17:03 |
sean-k-mooney | i wrote many a school essay on one of these https://en.wikipedia.org/wiki/Palm_V i had a p4 era laptop for a time but fell back to that when it was in for repair for like 6 months. but ya same, serial transfer to a pc and then print | 17:06 |
dansmith | with a keyboard? | 17:08 |
dansmith | I still have my US Robotics Pilot 5000, the original upscale model, with a Palm 3 upgrade chip in it | 17:08 |
sean-k-mooney | unfortunately no, with a stylus typing one letter at a time. i did try a laser typing attachment but it was actually slower | 17:09 |
dansmith | wow | 17:09 |
sean-k-mooney | because of my dyslexia i had a laptop from the age of about 11 for school work. so in class i would use the laptop instead of a notebook outside of science/math | 17:10 |
sean-k-mooney | when it broke i used that instead for a while until it could be fixed. at home i could just use the family pc when my laptop was broken but ya. | 17:11 |
sean-k-mooney | i dont know if i would have the will power to type a 2000 word essay even on a modern phone keyboard at this point | 17:11 |
dansmith | yeah, graffiti was really really good.. I was definitely faster and more accurate with that than a modern phone | 17:13 |
opendevreview | mitya-eremeev-2 proposed openstack/nova master: Add ability do not evacuate if ephemeral devices https://review.opendev.org/c/openstack/nova/+/938235 | 18:24 |
opendevreview | mitya-eremeev-2 proposed openstack/nova master: Add ability to preserve ephemeral in evacuation https://review.opendev.org/c/openstack/nova/+/938235 | 18:44 |
priteau | Hello Nova team. Did you notice that openstacksdk-functional-devstack has been consistently failing since earlier today? Always the same error in openstack.tests.functional.compute.v2.test_keypair.TestKeypairAdmin: openstack.exceptions.NotFoundException: No Keypair found for <id> | 19:46 |
priteau | https://zuul.opendev.org/t/openstack/builds?job_name=openstacksdk-functional-devstack&project=openstack/nova | 19:47 |
sean-k-mooney | we have not changed anything related to keypairs lately so that does not sound like its related to nova | 19:47 |
priteau | Yes, nothing new merged today | 19:48 |
sean-k-mooney | for it to affect the sdk it would have had to have been an api change or a default policy change | 19:49 |
priteau | But I looked on opendev at the list of merged patches today, nothing looked suspect to me. Some requirements bump but they were earlier in the day. | 19:49 |
sean-k-mooney | and we have not done either this cycle | 19:49 |
priteau | We'll see if it still happens after the weekend | 19:51 |
frickler | sean-k-mooney: priteau: the only changes that I see got merged in the suspicious timeframe were in keystone, asking over there now | 20:35 |
priteau | frickler: I just realised that the first failure (https://zuul.opendev.org/t/openstack/build/1bed6dab1cc1497a9a4a6a4e83d8a11e) had this keystone change included: https://review.opendev.org/c/openstack/keystone/+/938814 | 20:44 |
opendevreview | Pierre Riteau proposed openstack/nova master: [DNM] Checking gate health https://review.opendev.org/c/openstack/nova/+/939560 | 20:48 |