Monday, 2023-12-18

opendevreviewSylvain Bauza proposed openstack/nova-specs master: Proposes mdev live-migration support in libvirt  https://review.opendev.org/c/openstack/nova-specs/+/900636 08:37
opendevreviewsean mooney proposed openstack/nova master: add initial healthcheck support  https://review.opendev.org/c/openstack/nova/+/825015 12:38
opendevreviewsean mooney proposed openstack/nova master: add initial healthcheck support  https://review.opendev.org/c/openstack/nova/+/825015 13:24
opendevreviewsean mooney proposed openstack/nova master: add healthcheck manager to manager base  https://review.opendev.org/c/openstack/nova/+/827844 13:24
opendevreviewsean mooney proposed openstack/nova master: add healthcheck tracker to nova context  https://review.opendev.org/c/openstack/nova/+/829468 13:24
opendevreviewsean mooney proposed openstack/nova master: add healthcheck utils and constants  https://review.opendev.org/c/openstack/nova/+/829469 13:39
opendevreviewsean mooney proposed openstack/nova master: add healthcheck endpoint to proxy commands  https://review.opendev.org/c/openstack/nova/+/830703 13:39
bauzassean-k-mooney: gibi: fwiw, I'm about to start writing the series for the mdev live-migration RFE14:02
gibibauzas: good luck!14:04
opendevreviewArtom Lifshitz proposed openstack/nova master: Allow live migrate paused instance when post copy is enabled  https://review.opendev.org/c/openstack/nova/+/444517 14:11
artommelwitt, sean-k-mooney, gibi, btw, I updated https://review.opendev.org/c/openstack/nova/+/883682/ to hopefully reflect what was agreed on in the review comments14:38
artomSo now there's the notifications best effort patch on top: https://review.opendev.org/c/openstack/nova/+/903807/3 14:38
gibiartom: thanks. Both look good to me15:07
sean-k-mooneyartom: ack ill look at it shortly15:08
artomThank _you_!15:09
gibisean-k-mooney: if you have a moment later, then I verified https://review.opendev.org/c/openstack/nova/+/444517 and it solves the live-migration of paused instance when post-copy is configured in nova.conf. 16:29
gibiartom: ^^ \o/16:29
kashyapgibi: Also the "double live migration of a paused instance" thing - did you see that bz too?16:33
kashyapAlthough, why would one want to run a double LM, though16:33
artomgibi, ~~o~~16:34
kashyap"If a VM is in paused, and it live-migrated twice, it is lost"16:34
kashyaphttps://bugs.launchpad.net/nova/+bug/1947725 16:34
kashyap"Lost"?16:34
artomHere instance instance, here boy!16:35
gibikashyap: I can try16:46
gibikashyap: nope, that bug is still reproducible https://paste.opendev.org/show/bswQVIMJddVkrUtC1Pk0/ 16:55
gibikashyap: but it makes sense as the related qemu bug is also open https://gitlab.com/qemu-project/qemu/-/issues/686 16:56
kashyapgibi: Ohh, I missed the attached QEMU bug16:58
artomsean-k-mooney, so, tracing our snapshot upload call, we actually call it with a file handler for image_file/data17:06
artomhttps://opendev.org/openstack/nova/src/branch/master/nova/virt/libvirt/driver.py#L3182 17:07
artomhttps://opendev.org/openstack/nova/src/branch/master/nova/virt/libvirt/driver.py#L3176 17:07
artomThe thing is, on the sender side (Nova) passing a file handler to Python requests results in a streaming upload: https://docs.python-requests.org/en/latest/user/advanced.html#streaming-uploads 17:09
artomFor chunks we have to use a generator or iterator: https://docs.python-requests.org/en/latest/user/advanced.html#chunk-encoded-requests 17:09
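The distinction artom describes can be sketched as follows. This is a minimal illustration, not Nova's code: the helper name `iter_chunks`, the 64 KiB chunk size, and the in-memory stand-in for the snapshot file are all assumptions. Per the requests documentation linked above, a file-like object passed as `data=` is streamed, while a generator or iterator passed as `data=` produces a chunk-encoded request.

```python
import io

CHUNK_SIZE = 64 * 1024  # assumed chunk size, for illustration only


def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive blocks read from a file-like object (hypothetical helper)."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


# In-memory stand-in for the snapshot file handle passed as image data.
image_file = io.BytesIO(b"x" * (150 * 1024))

# Streaming upload (file-like object):   requests.put(url, data=image_file)
# Chunk-encoded upload (generator):      requests.put(url, data=iter_chunks(image_file))
chunks = list(iter_chunks(image_file))
chunk_lengths = [len(c) for c in chunks]
```

Only the shape of the `data=` argument differs between the two modes; the bytes sent are the same.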
sean-k-mooneyack so we likely need a [glance]upload_type=streaming|chunked config option17:10
sean-k-mooneydefault to streaming since that is the current behavior and allow opting into chunked17:11
sean-k-mooneyand we can consider changing the default in the future17:11
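The option proposed here might be read and validated roughly like this. It is a sketch under stated assumptions: the option name `upload_type`, its `[glance]` section, and the two values are taken from the proposal in the discussion and do not exist in Nova, and plain `configparser` is used only to keep the example self-contained (Nova itself defines options with oslo.config).

```python
import configparser

VALID_UPLOAD_TYPES = ("streaming", "chunked")  # values proposed in the discussion
DEFAULT_UPLOAD_TYPE = "streaming"              # keeps the current behavior as the default


def get_upload_type(conf_text):
    """Return the hypothetical [glance] upload_type from nova.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Fall back to the default when the section or the option is absent.
    value = parser.get("glance", "upload_type", fallback=DEFAULT_UPLOAD_TYPE)
    if value not in VALID_UPLOAD_TYPES:
        raise ValueError(
            f"upload_type must be one of {VALID_UPLOAD_TYPES}, got {value!r}"
        )
    return value
```

Defaulting to `streaming` means existing deployments see no change unless an operator opts into `chunked`.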
sean-k-mooneymelwitt: bauzas gibi can ye look at https://review.opendev.org/c/openstack/nova/+/903530/2 and the patch below to unblock mypy in the requirements job for stephenfin 17:20
sean-k-mooneyseparately this https://review.opendev.org/c/openstack/nova/+/897218/2 and the 3 patches below it will finish the codespell and sphinx-lint series17:22
sean-k-mooneyif we can get the mypy codespell and sphinx-lint series merged this week it would be great17:22
sean-k-mooneygibi: how did you test https://review.opendev.org/c/openstack/nova/+/444517 in the openstack-k8s-operators CI 17:27
sean-k-mooneydid you submit a pr?17:27
sean-k-mooneyi know they were working on making the content provider support building with upstream patches17:27
sean-k-mooneybut i didnt think that was done yet17:27
melwittsean-k-mooney: sure, I can look (if others don't get to it first)17:28
opendevreviewArtom Lifshitz proposed openstack/nova master: POC: attempting glance chunked uploads  https://review.opendev.org/c/openstack/nova/+/903611 17:29
gibisean-k-mooney: sorry I was misleading, I applied the patch locally 17:42
sean-k-mooneyoh ok17:42
sean-k-mooneythat is something we will be able to do eventually17:42
sean-k-mooneywith a depends on17:42
sean-k-mooneybut not for a few months yet17:42
sean-k-mooneyits on the todolist and something ill be supporting with rhex-ci too eventually 17:43
melwittsean-k-mooney: hrm.. I notice that artom's patch is going to fail CI for a guest kernel panic 🫤 and it looks to be using the split image18:14
artom:(18:15
opendevreviewArtom Lifshitz proposed openstack/nova master: POC: attempting glance chunked uploads  https://review.opendev.org/c/openstack/nova/+/903611 18:17
sean-k-mooneymelwitt: i guess we can review the logs when it finishes18:18
melwittand it seems like it's often (always?) this volume backed server resize test(s). so I wonder if there's something about it that's different than everything else. I'll see if I can find anything18:18
sean-k-mooneyis that in nova-next where we are not using split image or one of the other jobs18:18
melwittno it's in tempest-integrated-compute-rbac-old-defaults18:19
sean-k-mooneywell the volume test obviously has some pci hotplug events18:19
sean-k-mooneyso maybe we can mitigate that by adding hw_disk_bus=scsi18:19
melwittlogs are at https://zuul.opendev.org/t/openstack/build/fcc86981ee0e488f9f8b05a5969bfacb 18:19
melwitthm ok18:19
sean-k-mooneyi.e. change from virtio-blk to virtio-scsi as that will not be a pci hotplug and instead be a scsi attach/detach18:20
melwittgotcha18:20
sean-k-mooneythe stack trace looks like its in the page fault handler 18:21
sean-k-mooneyso this looks like the same failure we saw before18:21
sean-k-mooneyi wonder if this is bfv18:22
sean-k-mooneyand if we replace the image in that case18:22
melwittit is bfv18:23
sean-k-mooneyok so we should double check that tempest used the correct cirros image in that case18:23
sean-k-mooneyits calling into the shared create_server https://github.com/openstack/tempest/blob/ab3686d28d2728001e3bd2fd543575087bf00137/tempest/api/compute/servers/test_server_actions.py#L491 18:25
sean-k-mooneywhich ends up using https://github.com/openstack/tempest/blob/ab3686d28d2728001e3bd2fd543575087bf00137/tempest/common/compute.py#L204 18:25
sean-k-mooneyimage_id = CONF.compute.image_ref 18:26
sean-k-mooneymelwitt: have we configured that to be the split image?18:26
sean-k-mooneyits https://zuul.opendev.org/t/openstack/build/fcc86981ee0e488f9f8b05a5969bfacb/log/controller/logs/tempest_conf.txt#24 18:26
sean-k-mooney86a0308f-80a0-4582-8a64-f2b54e487232 18:27
sean-k-mooneyi guess we could check the devstack log and confirm what image that is18:27
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/fcc86981ee0e488f9f8b05a5969bfacb/log/controller/logs/devstacklog.txt#15755-15773 18:27
sean-k-mooneyok so that is the uec image18:28
sean-k-mooneyso ya my best suggestion is lets try adding hw_disk_bus=scsi in devstack and see if that helps, or disabling https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.libvirt_disable_apic 18:29
melwittok, I can try that18:31
sean-k-mooneythinking about this its failing in the really early boot18:32
sean-k-mooneyso im not sure that will help18:32
sean-k-mooneythe other thing we could try is adding swap to the tempest flavors to see if that helps with the OOM issues18:32
melwittah right18:34
artomsean-k-mooney, bleah, glanceclient chunks it for us: https://opendev.org/openstack/python-glanceclient/src/branch/master/glanceclient/common/http.py#L110 19:17
sean-k-mooney ack streaming might still be good to enable19:17
artomWhich... I guess I learned stuff today, but then I'm no closer to understanding why g-api OOM'ed :P19:17
sean-k-mooneyya not sure19:18
opendevreviewMerged openstack/nova master: Allow live migrate paused instance when post copy is enabled  https://review.opendev.org/c/openstack/nova/+/444517 19:59
opendevreviewMerged openstack/nova master: [codespell] fix final typos and enable ci  https://review.opendev.org/c/openstack/nova/+/897214 21:20
opendevreviewMerged openstack/nova master: Bump hacking version  https://review.opendev.org/c/openstack/nova/+/903529 21:20
JayFsean-k-mooney: https://blueprints.launchpad.net/nova/+spec/ironic-guest-metadata is updated; if you have a preliminary review before tomorrow's meeting I'm happy to do a round trip of feedback ahead of it :)22:22
opendevreviewJay Faulkner proposed openstack/nova master: [ironic] Partition & use cache for list_instance*  https://review.opendev.org/c/openstack/nova/+/900831 23:05
opendevreviewJay Faulkner proposed openstack/nova master: Limit nodes by ironic shard key  https://review.opendev.org/c/openstack/nova/+/903915 23:05
opendevreviewJay Faulkner proposed openstack/nova master: Add nova-manage ironic-compute-node-move  https://review.opendev.org/c/openstack/nova/+/903916 23:05
opendevreviewJay Faulkner proposed openstack/nova master: Make compute node rebalance safter  https://review.opendev.org/c/openstack/nova/+/903917 23:05
opendevreviewMerged openstack/nova master: [codespell] ignore codespell in git blame  https://review.opendev.org/c/openstack/nova/+/897215 23:29
