Thursday, 2023-03-30

04:02 *** dasm|off is now known as Guest9353
05:38 <opendevreview> Amit Uniyal proposed openstack/placement master: Bugtracker link update  https://review.opendev.org/c/openstack/placement/+/876768
06:03 <frickler> melwitt: that at least sounds plausible, thanks for checking
06:18 <opendevreview> Tobias Urdin proposed openstack/nova master: Remove libvirt tunnelled migration  https://review.opendev.org/c/openstack/nova/+/879021
09:34 <opendevreview> Tobias Urdin proposed openstack/nova master: Remove libvirt tunnelled migration  https://review.opendev.org/c/openstack/nova/+/879021
09:35 <bauzas> if people are OK, I'd like to add a functest for testing cross_az_attach https://review.opendev.org/c/openstack/nova/+/878948
09:41 <gibi> bauzas: I will be absent from the vPTG from 15:00 UTC, hopefully only for half an hour, due to a downstream call
09:44 <bauzas> ack
09:44 <bauzas> we'll discuss with glance
09:44 <bauzas> see the agenda I posted on the ML
09:44 <bauzas> we'll discuss with glance *at that time*
09:45 * bauzas runs an errand now
10:15 <gibi> ack
10:27 <opendevreview> yatin proposed openstack/nova master: [DNM] Test lower tb cache  https://review.opendev.org/c/openstack/nova/+/868419
10:41 <tobias-urdin> I have a conundrum that I cannot wrap my head around. I've been looking at the possibility of removing the need for remotefs (rsync/scp over ssh) when doing live migrations. That should be possible, but requires some RPC changes to get rid of testing whether the instance dir is on shared storage, and some other stuff. Just out of curiosity I checked the
10:41 <tobias-urdin> remotefs usage all around, and we also use it for config drive migrations (as qemu (libvirt blocks us) does not allow migrating a read-only ISO source file), fetching kernel and ramdisk, copying vTPM data, copying cached images from other compute nodes(?), etc.
10:42 <tobias-urdin> so while what I want to do, removing the need for SSH key distribution for live migrations, is possible, getting it completely removed in the libvirt driver seems hard, because I cannot wrap my head around what we could replace that file-copying logic with
10:43 <tobias-urdin> if only libvirt had a file copy implementation in their protocol, we wouldn't need anything else for remote access and could get TLS etc.; theoretically it could be done with block devices using storage pools, but that would be even worse imo
10:43 * tobias-urdin brain hurts
11:13 <jrosser> tobias-urdin: key distribution can be avoided by using signed keys
11:21 <zigo> bauzas: You remember I couldn't do live-migration with some of my VMs? It turns out that:
11:21 <zigo> - It only happens with SOME images, like Rocky Linux 8.7
11:21 <zigo> - upgrading both source and destination from Qemu 5.2 to 7.2 (from bullseye-backports) fixes the issue! \o/
11:23 <tobias-urdin> jrosser: yeah, but there's also more to it than that: what about lateral movement between compute nodes, being able to essentially wipe the instance disks of another compute node? Or what about an operating system with no ssh running? Sure, you could run ssh in a container and expose the nova dir, but then point #1 still applies. Hard problem
11:38 <sean-k-mooney1> zigo: glad it's fixed, but odd that it depended on the image
11:39 <zigo> sean-k-mooney1: According to the people from #qemu, it's likely that the image is using some new feature of the virtio stuff.
11:39 <zigo> What's weird is that with Rocky Linux 9, I didn't have the issue...
11:39 <sean-k-mooney1> it's probably using the transitional virtio device or something like that
11:40 <sean-k-mooney1> as in, the driver is probably negotiating an older feature set
11:40 *** sean-k-mooney1 is now known as sean-k-mooney
11:40 <zigo> sean-k-mooney1: I'll upgrade my whole cluster to the newer version of Qemu and will migrate (non-live) the problematic instances ...
11:41 <zigo> Still very annoying, but luckily, only very few instances are affected.
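(For context, the QEMU upgrade zigo describes on Debian bullseye compute nodes might look roughly like the sketch below; the exact package names are assumptions, and the bullseye-backports repository must already be enabled in APT sources.)

```shell
# Sketch only: pull the newer QEMU from bullseye-backports on a
# Debian 11 compute node (package names assumed, repo already enabled)
apt update
apt install -t bullseye-backports qemu-system-x86 qemu-utils
```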
12:29 <kashyap> zigo: Ahh, so upgrading the source and dest QEMU to 7.2 fixes it -- did you test it?
12:31 <kashyap> Well, you did test it, otherwise you wouldn't put that "\o/"
12:36 <bauzas> zigo: sorry, was taking a bit of time off before the vPTG
12:36 <bauzas> (after the usual child taxi for lunch :p )
12:39 <bauzas> as I said to the PTGbot, we start at 1pm UTC with the neutron x-p session in the Neutron room (juno, i.e. https://www.openstack.org/ptg/rooms/juno )
12:47 <artom> sean-k-mooney, I'll be late for that (son's dentist appointment), can you cover the delete_on_termination stuff?
12:47 <artom> IIRC you're the only other person with the context
12:49 <sean-k-mooney> am, yes I can
12:49 <artom> Cheers!
13:28 *** whoami-rajat__ is now known as whoami-rajat
13:30 <bauzas> dvo-plv: are you in the neutron room?
13:30 <bauzas> dvo-plv: we're discussing your topic now
13:37 <dvo-plv> yes, thank you
14:42 <bauzas> break now until 3pm UTC, and then please join the cinder room: https://bluejeans.com/556681290
14:42 <bauzas> also, I had a topic about Xena EM, we'll discuss this at the next meeting
14:43 <bauzas> (unless people have concerns by now, for sure)
14:43 <elodilles> bauzas: ack, we can have a couple of words about it o/
14:44 <bauzas> elodilles: tbh, I need to look at the current open changes
14:44 <sean-k-mooney> bauzas: what room should we be in next
14:44 <sean-k-mooney> well, now
14:45 <bauzas> sean-k-mooney: cinder room, 3pm UTC
14:45 <bauzas> we have a break now
14:45 <elodilles> bauzas: yes, there are plenty of open patches: https://review.opendev.org/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/placement+OR+project:openstack/nova)+branch:stable/xena
14:45 <bauzas> that was my ask before we left
14:46 <elodilles> the question, though, is whether anyone sees anything that should be part of the 'final-before-em' release of stable/xena
14:46 <dansmith> sean-k-mooney: what is the qemu security issue that started causing detach issues that you mentioned on the list?
14:46 <sean-k-mooney> dansmith: the original motivation for the change in qemu was related to a security issue, I believe
14:46 <sean-k-mooney> I don't actually know the details
14:46 <dansmith> sean-k-mooney: but what's the change? I wasn't aware anything changed (intentionally)
14:46 <bauzas> was in libvirt 8, right?
14:47 * bauzas missed the regression index
14:47 <sean-k-mooney> it was undefined behavior if you retried a detach while it was in progress
14:47 <sean-k-mooney> they intentionally made it an error and have it abort the in-progress detach
14:47 <bauzas> elodilles: I can raise the question to the nova team by email
14:47 <elodilles> bauzas: yepp, that is perfectly OK I think
14:47 <bauzas> elodilles: and we could conclude at the next weekly meeting
14:47 <sean-k-mooney> in old versions of qemu it would actually try detaching again
14:48 *** Guest9353 is now known as dasm
14:48 <elodilles> bauzas: ++
14:48 <dansmith> sean-k-mooney: that's not actually causing us trouble though, right? if we needed to retry the detach it probably wasn't working anyway, right?
14:48 <bauzas> elodilles: or the next one, I've seen the deadline for approving
14:48 <dansmith> sean-k-mooney: ah, meaning we're never sending the acpi event anymore after the first one?
14:49 <sean-k-mooney> correct, after the first one it never gets sent again, but also our retry mechanism would stop the detach from proceeding
14:49 <sean-k-mooney> i.e. if it was just slow detaching and we retried, it would abort the detach
14:49 <elodilles> bauzas: though we should not postpone the release too close to the transition, otherwise if we hurry and merge things and something turns out to be broken, then we cannot fix it anymore ;)
14:49 <dansmith> hmm, okay
14:49 <sean-k-mooney> dansmith: gibi swapped us from blind retries on a timeout/interval to trying to use qemu events
14:50 <sean-k-mooney> or maybe that was lee
14:50 <sean-k-mooney> in either case that was not enough to resolve this issue
14:50 <dansmith> okay, I guess I didn't know about this other detail
14:50 <sean-k-mooney> it just seems like we need to kick the vms several times to get the detach to work
14:51 <bauzas> elodilles: yeah, that's understandable, we shall not be lazy
14:51 <dansmith> if it's a matter of the first event getting missed or something, that definitely *could* support the "use a real distro" argument, I guess
14:51 <dansmith> if the first one fails, is there some way the "in progress"-ness gets reset such that allowing a retry *ever* works?
14:51 <bauzas> elodilles: although (tbh) I'm like ~0% interested in the Xena branch :)
14:51 <bauzas> actually, EM is helping my work :)
14:52 <sean-k-mooney> dansmith: currently I believe there is no way to reset the state without restarting the vm
14:52 <sean-k-mooney> https://gitlab.com/libvirt/libvirt/-/issues/309
14:52 <dansmith> does running the guest agent allow us to do it that way instead of just acpi?
14:53 <dansmith> thanks, I'll brush up on that bug
14:53 <sean-k-mooney> good question, I do not think so, but maybe
14:53 <sean-k-mooney> I have never really looked at what the guest agent can actually do
14:53 <sean-k-mooney> I know it has some filesystem apis to freeze them
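(The guest-agent filesystem freeze APIs mentioned here can be driven from the host with virsh; a minimal sketch, assuming a domain named instance-00000001 with qemu-guest-agent running inside the guest:)

```shell
# Freeze all mounted filesystems inside the guest via the qemu guest agent
virsh domfsfreeze instance-00000001
# ... take a consistent snapshot/backup here ...
# Thaw the filesystems again
virsh domfsthaw instance-00000001
```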
14:54 <sean-k-mooney> dansmith: the other thing to keep in mind is it's not always acpi
14:54 <sean-k-mooney> when you change to q35 we started to use pcie native hotplug instead
14:54 <sean-k-mooney> that proved to be buggy so they went back to acpi
14:54 <dansmith> for disks? but I'm using acpi as a stand-in.. I guess for disks I figured it was an eject request or something
14:54 <sean-k-mooney> I'm not sure if they have changed back to native pcie hotplug or if it still uses acpi
14:55 <sean-k-mooney> there are 2 ways to signal it: for the pc machine type it used acpi interrupts, because you only had a pci bus, not pcie
14:55 <sean-k-mooney> pcie has its own hotplug mechanism and qemu tried to use that instead
14:55 <elodilles> bauzas: we had a xena release this year, so at least we are mostly good ;)
14:56 <sean-k-mooney> then they hit bugs and went back to acpi
14:56 <sean-k-mooney> for virtio-blk each disk is a pci device
14:56 <sean-k-mooney> for virtio-scsi they are not
14:56 <sean-k-mooney> they are scsi devices connected to the controller
14:57 <bauzas> reminder: we restart in 3 mins, cinder room
14:57 <sean-k-mooney> speaking of which, we could try using virtio-scsi I guess
14:57 <sean-k-mooney> just set hw_disk_bus=scsi in devstack
14:58 <sean-k-mooney> I don't think that helps, as I think it's the acpi path that's buggy, but it's an option to try
14:58 <sean-k-mooney> dansmith: sorry for the context dump :)
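(Switching a guest to virtio-scsi as suggested above is typically done through Glance image properties rather than nova config; a hedged sketch, where the image name "test-image" is an assumption:)

```shell
# Sketch: ask nova's libvirt driver to attach disks through a
# virtio-scsi controller instead of virtio-blk, per image
openstack image set \
    --property hw_disk_bus=scsi \
    --property hw_scsi_model=virtio-scsi \
    test-image
```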
15:00 <whoami-rajat> https://redhat.bluejeans.com/556681290
15:01 <sean-k-mooney> dansmith: ^ are you joining that, by the way?
15:01 <dansmith> sean-k-mooney: nope, tc now..
15:01 <sean-k-mooney> ah ok
15:02 <sean-k-mooney> we can recap the direct image location conversation
15:46 <bauzas> senrique: oh, you just joined, cool, thanks
15:53 *** whoami-rajat__ is now known as whoami-rajat
15:55 <bauzas> senrique: a few docs, then :)
15:56 <bauzas> senrique: this is our overall process workflow https://docs.openstack.org/nova/latest/contributor/process.html#how-do-i-get-my-code-merged
15:56 <senrique> bauzas, hey :)
15:56 <bauzas> tl;dr: create a blueprint on https://blueprints.launchpad.net/nova/
15:57 <bauzas> then, once you think you have time to attend a specific nova meeting, add your topic to the weekly meeting agenda https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting (in the open discussion, the last topic)
15:57 <bauzas> and then we'll discuss it at the subsequent meeting, asking you for a few details, and if we agree, we just approve your blueprint design (from a procedural pov)
15:58 <bauzas> then you're free to work on the implementation patches and ask for reviews
15:59 <bauzas> senrique: one example is https://meetings.opendev.org/meetings/nova/2023/nova.2023-01-17-16.00.log.html#l-287
16:10 <bauzas> break until 1620 UTC
16:10 <bauzas> and then, see you back in https://www.openstack.org/ptg/rooms/diablo
16:11 * gibi is back for the break, best timing ever
16:19 <senrique> thank you bauzas!!
18:41 <opendevreview> sean mooney proposed openstack/nova master: [DNM] testing enableind discard by default  https://review.opendev.org/c/openstack/nova/+/879077
19:04 <opendevreview> Merged openstack/nova stable/xena: Accept both 1 and Y as AMD SEV KVM kernel param value  https://review.opendev.org/c/openstack/nova/+/843938

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!