opendevreview | Yaguang Tang proposed openstack/nova stable/2023.2: Fix device type when booting from ISO image https://review.opendev.org/c/openstack/nova/+/945903 | 01:56 |
---|---|---|
opendevreview | Yaguang Tang proposed openstack/nova stable/2024.1: Fix device type when booting from ISO image https://review.opendev.org/c/openstack/nova/+/945816 | 06:52 |
sahid | o/ | 09:28 |
sahid | any chance to have a review of that one, and the series behind it? | 09:29 |
sahid | https://review.opendev.org/c/openstack/neutron/+/940983/6 | 09:29 |
sahid | it's waiting for a while now :-) | 09:29 |
sean-k-mooney | i assume you meant to put that in the neutron channel | 09:31 |
stephenfin | sean-k-mooney: trivial README change here if you've 3.5 seconds https://review.opendev.org/c/openstack/nova/+/944994 | 09:32 |
sean-k-mooney | sahid: but i would not recommend creating a thread pool dynamically in a plugin | 09:32 |
sean-k-mooney | sahid: if you're going to use thread pools you should create one globally for neutron to use and reuse it | 09:32 |
sean-k-mooney | like we do in nova | 09:32 |
sahid | sean-k-mooney: oh yes... my mistake | 09:34 |
sahid | sean-k-mooney: that is an interesting point yes | 09:34 |
sean-k-mooney | sahid: https://review.opendev.org/c/openstack/nova/+/922497 | 09:34 |
sahid | i will discuss that with neutron, i think ralonsoh may also need it from it's change to the l3 agent | 09:35 |
sahid | its | 09:35 |
sean-k-mooney | actually that's not quite the patch i wanted | 09:35 |
sean-k-mooney | well no that uses futurist | 09:35 |
sean-k-mooney | oh im not adding eventlet tpool there im just moving the import | 09:36 |
sean-k-mooney | thats what was confusing me | 09:36 |
sahid | in all cases I think it makes sense to have one global thread pool that everything picks from, so each service will use its own thread pool | 09:40 |
sahid | but sometimes we may need one context of threads so we can play with them, join them or kill them | 09:41 |
sean-k-mooney | sahid: you should never kill a thread in python | 09:44 |
sean-k-mooney | you should be using futures for the most part when you need to wait on results | 09:44 |
sean-k-mooney | if you need to kill something you should be using a process pool | 09:45 |
sean-k-mooney | otherwise you need to dispatch cancelable tasks | 09:45 |
sean-k-mooney | killing a thread via python is generally unsafe, there is no public api to do that and you have to fall back to posix hacks, which means you can leak locks and file handles | 09:46 |
opendevreview | sean mooney proposed openstack/osc-placement master: Add bindep.txt for ubunutu 24.04 support https://review.opendev.org/c/openstack/osc-placement/+/946031 | 12:14 |
opendevreview | sean mooney proposed openstack/osc-placement master: Add bindep.txt for ubunutu 24.04 support https://review.opendev.org/c/openstack/osc-placement/+/946031 | 12:14 |
sean-k-mooney | hi folks https://review.opendev.org/c/openstack/osc-placement/+/946031 seems to fix the osc-placement docs jobs. we will need to backport that to stable 2025.1 as well | 13:07 |
sean-k-mooney | we can do that after the release. tl;dr we need a bindep file to install pcre because the header is not transitively installed on ubuntu 24.04 | 13:08 |
cardoe | Hoping to get https://review.opendev.org/c/openstack/nova/+/942019 on your folks radar too. Really helps in figuring out why something failed to get the error message instead of the hex memory address of the error message. | 13:23 |
gibi | cardoe: I've approved the patch. Thanks for pushing a fix | 13:25 |
cardoe | Thank you. :) | 13:25 |
gibi | sean-k-mooney: bauzas: could one of you take a look at this simple os_traits addition https://review.opendev.org/c/openstack/os-traits/+/944049 for OTU? | 14:08 |
sean-k-mooney | sure but if we plan to do a release for that we should include https://review.opendev.org/c/openstack/os-traits/+/940418 | 14:09 |
bauzas | done | 14:09 |
gibi | sean-k-mooney: sure | 14:12 |
dansmith | gibi: my test should have caught that I was iterating RCs for no reason.. do you mind if I put a slightly-unrealistic allocation in for the PCI device in the test where there are two RCs to make sure we only call invalidate once? | 14:14 |
dansmith | PCI in placement would only have one CUSTOM_PCI_X_Y:1 thing in the allocation, but if I put two I can make sure I'm only doing it once | 14:15 |
gibi | dansmith: I don't mind at all | 14:16 |
dansmith | cool | 14:16 |
dansmith | gibi: are you walking up the series again? if so, I'll hold off pushing this fix | 14:16 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Add upgrade status check for duplicate cell names https://review.opendev.org/c/openstack/nova/+/901810 | 14:17 |
opendevreview | Merged openstack/os-traits master: Add HW_PCI_ONE_TIME_USE trait https://review.opendev.org/c/openstack/os-traits/+/944049 | 14:18 |
gibi | dansmith: yeah I'm reviewing at the moment | 14:27 |
dansmith | ack, i'll hold off | 14:27 |
gibi | thanks | 14:27 |
artom | Uggla, heads up, new security/nova x-project topic for PTL at https://etherpad.opendev.org/p/nova-2025.2-ptg#L77 | 14:40 |
gibi | dansmith: I finished reviewing the stack | 14:54 |
dansmith | gibi: cool, thanks, I'm just running tests after removing the spec copy | 14:54 |
gibi | OK | 14:54 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Update the api-ref for unshelve https://review.opendev.org/c/openstack/nova/+/938054 | 14:57 |
gibi | dansmith: do you plan to push some functional test coverage top of the series? | 14:58 |
dansmith | I haven't gone deeply into the functional tests for pci yet, because I've been sort of working with my devstack (and scheming about getting it tested in a job) | 14:59 |
dansmith | if you have a pointer to a good place to look at something that would be applicable for inspiration, I can do that | 15:00 |
gibi | sure: https://github.com/openstack/nova/blob/master/nova/tests/functional/libvirt/test_pci_in_placement.py this has good examples | 15:00 |
gibi | actually VM boots with PCI in Placement starts here https://github.com/openstack/nova/blob/master/nova/tests/functional/libvirt/test_pci_in_placement.py#L1618 | 15:01 |
dansmith | cool, I'll get to a stopping point in the other thing I'm doing soon and try to add some more of that | 15:02 |
bauzas | oh doh | 15:02 |
bauzas | forgot we got daylight savings here | 15:02 |
bauzas | I was about to yell to Uggla he forgot the meeting :D | 15:03 |
* bauzas | goes back into his cave for 57 mins | 15:03 |
gibi | dansmith: cool | 15:03 |
gibi | dansmith: you probably need to extend our assert to check for the reserved value on the inventory here https://github.com/openstack/nova/blob/98226b60f3fe7b20e8d7f208c12f8d0086cd83d0/nova/tests/functional/libvirt/test_pci_sriov_servers.py#L182-L188 | 15:05 |
gibi | as today only total and max_unit is asserted | 15:06 |
Uggla | Meeting in around 1h | 15:07 |
dansmith | ack | 15:08 |
* gibi | now hates DST | 15:08 |
Uggla | @artom, ok I have seen the cross session needs. | 15:09 |
dansmith | gibi: I'm not sure I can write those functional tests | 15:18 |
dansmith | well, the delay has messed up the joke.. I was going to say "I can't bring myself to waste enough vertical space to fit the style in that file" | 15:27 |
dansmith | Uggla: can we do an os-traits release to get the new one that merged this morning? we can't really test functionally until that's available in our reqs/venvs | 15:27 |
Uggla | @dansmith, yes I think so. I'll propose a patch for it right after the Epoxy release. Will it be ok for you ? | 15:29 |
* gibi | looks at golang built in formatter | 15:30 |
dansmith | gibi: don't get me started | 15:30 |
* gibi | stops looking | 15:30 |
gibi | :) | 15:30 |
dansmith | Uggla: I guess I'll put these in a separate patch since they won't pass until then. I already have a local patch to add just the trait reporting thing to a functional test, which can't work in CI until the os-traits release | 15:30 |
dansmith | so I'll just add some more in there and push that up with the caveat that it won't work until that releases | 15:31 |
dansmith | gibi: ^ re: functionals in a separate patch, at least for the moment | 15:32 |
Uggla | @dansmith, ok lgtm | 15:32 |
gibi | dansmith: works for me too | 15:32 |
* gibi | hands his spare vertical space to dansmith to put it to good use | 15:33 |
dansmith | I had stopped looking at the functional stuff since I had to hack my local env with the trait to get even basic stuff working, but I'll just do that so I can make some progress there | 15:33 |
dansmith | gibi: unfortunately, the '\n' bytes are free, it's the screen real estate and the burning feeling in my eyeballs that is not | 15:34 |
gibi | true | 15:36 |
gibi | what if you add a new test_ file for otu and use your own style? I would totally accept that | 15:37 |
gibi | the current file is already 2000 LOC so it is reasonable to split | 15:38 |
dansmith | I already hate that nova has grown this very-different style from its reasonably-consistent-just-not-machine-formatted style before this madness | 15:40 |
dansmith | so further splitting and being different to all the other PCI-in-placement tests here doesn't seem beneficial, except for my eyeballs | 15:41 |
dansmith | I will attempt to hold my tongue and just do it, but if I can't stand it I'll split | 15:42 |
dansmith | I don't *think* there will be a lot of OTU tests here | 15:42 |
dansmith | there's just not that much to do I think | 15:42 |
gibi | dansmith: understood | 15:47 |
gibi | I'm not sure how nova kept the reasonably consistent style in the past | 15:47 |
gibi | were there stricter reviewers back then? | 15:48 |
gibi | less hippy devs? | 15:49 |
dansmith | a lot of existing code and people copying the style that was there | 15:49 |
dansmith | we also had some guidelines like breaking lines with parens instead of backslashes, and several of the pep8 style guidelines which were not machine enforced | 15:50 |
dansmith | a lot of people wrote a lot of reasonably-consistent code before machine formatting made everything reliably ugly :) | 15:50 |
Uggla | Meeting in 5mn. | 15:55 |
Uggla | #startmeeting nova | 16:01 |
opendevmeet | Meeting started Tue Apr 1 16:01:42 2025 UTC and is due to finish in 60 minutes. The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:01 |
opendevmeet | The meeting name has been set to 'nova' | 16:01 |
Uggla | Hello everyone | 16:01 |
masahito | o/ | 16:02 |
dansmith | o/ | 16:02 |
gibi | o/ | 16:02 |
sean-k-mooney | o/ | 16:02 |
elodilles | o/ | 16:03 |
bauzas | o/ | 16:03 |
Uggla | #topic Bugs (stuck/critical) | 16:03 |
Uggla | #info No Critical bug | 16:03 |
Uggla | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:04 |
Uggla | anything about bugs ? | 16:04 |
Uggla | #topic Gate status | 16:05 |
Uggla | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:05 |
Uggla | #link https://etherpad.opendev.org/p/nova-ci-failures-minimal | 16:05 |
Uggla | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status | 16:05 |
Uggla | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:05 |
sean-k-mooney | so technically the osc-placement gate is blocked without https://review.opendev.org/c/openstack/osc-placement/+/946031 | 16:06 |
Uggla | #info Please try to provide meaningful comment when you recheck | 16:06 |
sean-k-mooney | we need that on master and stable/2025.1 | 16:06 |
sean-k-mooney | tldr without a bindep file the tox-docs job fails on noble | 16:06 |
sean-k-mooney | specifically because libpcre3-dev is not installed | 16:07 |
sean-k-mooney | it only affects the jobs so not blocking the release | 16:07 |
elodilles | sean-k-mooney: add me as reviewer to the stable patch and I'll +2 it o:) | 16:08 |
elodilles | and thanks for the fix! | 16:08 |
sean-k-mooney | sure. ill propose it once its on master. it came up when reviewing https://review.opendev.org/c/openstack/osc-placement/+/943759 | 16:08 |
elodilles | +1 | 16:08 |
Uggla | thx for the fix sean-k-mooney ! | 16:08 |
sean-k-mooney | we didnt have any osc-placement change this cycle that im aware of | 16:09 |
sean-k-mooney | so not having release notes for a few days is pretty low impact | 16:09 |
elodilles | sean-k-mooney: ACK | 16:09 |
sean-k-mooney | so no need to rush but good to do before we forget about it | 16:09 |
Uggla | sure, anything else about the gate topic ? | 16:10 |
sean-k-mooney | not from me, we can move on i think | 16:11 |
Uggla | yep | 16:11 |
bauzas | I'll try to review the bindep patch | 16:11 |
bauzas | we did this before iirc | 16:11 |
sean-k-mooney | we had to fix a different bindep issue for nova | 16:11 |
sean-k-mooney | it was not including the test profile | 16:11 |
sean-k-mooney | that got fixed a few weeks ago | 16:12 |
Uggla | #topic Release Planning | 16:12 |
Uggla | #link https://releases.openstack.org/epoxy/schedule.html | 16:12 |
Uggla | #info Nova deadlines are set in the above schedule | 16:13 |
Uggla | #link 945904: 2025.1 Epoxy final releases for cycle-with-rc projects | https://review.opendev.org/c/openstack/releases/+/945904 | 16:13 |
Uggla | #link https://releases.openstack.org/flamingo/schedule.html | 16:13 |
Uggla | #info Remaining post RC1: update min service version for a SLURP or non-SLURP release : https://review.opendev.org/c/openstack/nova/+/944018/ | 16:14 |
Uggla | #info Nova Flamingo deadlines will be discussed at the PTG. | 16:14 |
Uggla | Epoxy should be released tomorrow if I'm not wrong. | 16:15 |
Uggla | \o/ | 16:15 |
elodilles | ~o~ | 16:15 |
gibi | nice | 16:15 |
elodilles | yepp, we start the machinery tomorrow | 16:16 |
Uggla | btw @elodilles I have +1 the patch above. | 16:16 |
elodilles | thanks Uggla o/ | 16:16 |
Uggla | #topic Review priorities | 16:18 |
Uggla | #info Flamingo priorities will be discussed at the PTG. | 16:18 |
Uggla | #topic PTG planning | 16:18 |
Uggla | #info Next PTG will be held on Apr 7-11 | 16:18 |
Uggla | #link https://etherpad.opendev.org/p/nova-2025.2-ptg | 16:19 |
Uggla | I think we have collected all the topics, the agenda is a draft for the moment. I think I will finalize it tomorrow. | 16:19 |
Uggla | Today, there's a new and rather unexpected topic on the table for the PTG. | 16:20 |
Uggla | Rumor has it that it originates from a secret TC meeting focused on improving our resilience against supply chain attacks. | 16:20 |
dansmith | Uggla: the glance ptl was just pinging me about a session with glance about location api and a couple other things | 16:20 |
Uggla | As a result, we have a new top priority for the next development cycle — replacing the eventlet removal initiative. | 16:20 |
Uggla | It has been officially decided to begin rewriting Nova, starting with the scheduler component, in Brainfuck (https://en.wikipedia.org/wiki/Brainfuck). | 16:21 |
Uggla | The minimalist nature of the language, combined with its near-total unreadability, provides an unparalleled level of protection against malicious code injections. | 16:21 |
Uggla | A working PoC is already available here: https://www.jdoodle.com/ia/1FbD | 16:21 |
Uggla | Yep, @dansmith he contacted me for a new cross team meeting. | 16:22 |
dansmith | okay good | 16:22 |
dansmith | I put it on the etherpad at the bottom | 16:22 |
* gibi | hopes there is a mandatory formatter in the brainfuck compiler | 16:22 |
gibi | Btw, will we have a full nova PTG day on Monday to front load stuff while we have? | 16:23 |
* dansmith | scowls at gibi | 16:23 |
Uggla | yes we have booked the full week Monday to Friday | 16:23 |
gibi | ack | 16:24 |
Uggla | I'll try to clarify the full agenda by tomorrow. | 16:24 |
Uggla | Monday we may start later and cover only the retro. | 16:25 |
bauzas | hmmm, April gotcha | 16:25 |
Uggla | We have a question from mikal | 16:25 |
opendevreview | sean mooney proposed openstack/osc-placement master: Add bindep.txt for ubunutu 24.04 support https://review.opendev.org/c/openstack/osc-placement/+/946031 | 16:25 |
Uggla | mikal: Sean suggested on IRC at https://meetings.opendev.org/irclogs/%23openstack-nova/latest.log.html#t2025-03-25T19:01:27 that a "specless blueprint" might be sufficient to finish off the SPICE VDI work. I've therefore created https://blueprints.launchpad.net/nova/+spec/libvirt-spice-vdi, but won't be able to attend the meeting due to timezones. Could y'all discuss and decide if I need to propose something for the PTG / require a spec / or if this is sufficient to finish off this work? Thanks! | 16:25 |
sean-k-mooney | ya so basicaly we did approve the usb and sound device changes as part of spice direct | 16:26 |
sean-k-mooney | but we didnt have time to land it | 16:26 |
sean-k-mooney | so since the code is written if there are no objections i was suggesting we proceed with a specless blueprint unless people had design concerns | 16:27 |
gibi | if there are no changes compared to the approved spec then I'm fine with it | 16:27 |
sean-k-mooney | mikal said they can attend part of the ptg | 16:27 |
sean-k-mooney | so we could cover it at the start of one of the sessions if needed | 16:27 |
Uggla | sean-k-mooney, it is unclear in the message he said he won't be able to attend. | 16:28 |
sean-k-mooney | i think they can attend if its first thing but it will be midnight for them | 16:28 |
sean-k-mooney | so if we can handle this async over email or approve and just move to gerrit | 16:29 |
sean-k-mooney | it will work a lot better for them | 16:29 |
bauzas | last time, he was able to join on a late evening for him | 16:29 |
bauzas | but we need to give him a specific time | 16:29 |
bauzas | like Friday 1pm if that works for him | 16:29 |
sean-k-mooney | so do we want to defer approving https://blueprints.launchpad.net/nova/+spec/libvirt-spice-vdi until we chat with them in the ptg? | 16:30 |
sean-k-mooney | ill note that while that has spice in the name | 16:30 |
gibi | I have nothing against approving it | 16:30 |
sean-k-mooney | this applies to vnc also | 16:30 |
Uggla | yep I will set it as early as possible. sean-k-mooney are you ok to discuss this topic if Mikal won't join | 16:31 |
sean-k-mooney | sure | 16:31 |
Uggla | cool | 16:31 |
Uggla | any concerns about the specless BP for Mikal ? | 16:32 |
sean-k-mooney | ill note that we only need a ptg session if we have open design questions. | 16:32 |
bauzas | I don't have any concerns | 16:33 |
Uggla | so i guess we can go for it. | 16:33 |
gibi | go for it | 16:34 |
Uggla | sean-k-mooney, I will set the topic for the agenda, maybe it will be quick if it is crystal clear. | 16:34 |
sean-k-mooney | ack | 16:34 |
Uggla | something else you want to discuss for the PTG prep ? | 16:35 |
sean-k-mooney | am just an fyi im double booked with watcher | 16:35 |
sean-k-mooney | so ill attend where i can | 16:36 |
sean-k-mooney | ping if im not there and you would like my input | 16:36 |
sean-k-mooney | i tried to leave comment in the doc already | 16:36 |
Uggla | sean-k-mooney, ok sure thx. | 16:36 |
Uggla | I close this topic by saying we will not have this meeting next week due to the vPTG. | 16:36 |
Uggla | #topic Stable Branches | 16:36 |
masahito | i'm not sure i mentioned it before. if my topic is selected for the ptg topics, i prefer 14utc or later. | 16:36 |
Uggla | masahito, yes, this is somewhere in my mind. | 16:37 |
masahito | thanks. | 16:37 |
Uggla | #info stable/2024.* gates broken with nova-ceph-multistore job failure (test case test_volume_upload fails - No image found with ID...) | 16:37 |
Uggla | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:37 |
Uggla | elodilles, the floor is yours | 16:38 |
elodilles | yeah, i couldn't identify the root cause for that job failure yet ^^^ :( | 16:38 |
elodilles | so if anyone has any idea, it is appreciated o:) | 16:38 |
elodilles | stable/2023.2 gate is fine, and afaik stable/2025.1, too, but not 100% on that | 16:39 |
elodilles | and that is all i can say now | 16:39 |
sean-k-mooney | the iso patch has been rechecked again... | 16:41 |
sean-k-mooney | still hiting that volume bug | 16:41 |
sean-k-mooney | i havent looked at it yet either ill admit | 16:41 |
sean-k-mooney | i assume cinder or glance should be seeing the same failure? | 16:42 |
gmann | I think I saw it on stable/2025.1 also (have not checked if it is same or different?) https://review.opendev.org/c/openstack/devstack/+/945239/comments/231d655b_52020b94 | 16:42 |
* elodilles | clicks but review.o.o is slow nowadays here :/ | 16:43 |
sean-k-mooney | that's still the nova job | 16:43 |
sean-k-mooney | i was hoping that was a non nova one showing the same issue | 16:44 |
elodilles | gmann: looks like same | 16:44 |
Uggla | looks the same yes. | 16:44 |
sean-k-mooney | dansmith: do you know off the top of your head if there is anything special about the nova-ceph-multistore job that would cause no image found on volume upload? | 16:45 |
dansmith | volume upload to glance? | 16:45 |
sean-k-mooney | presumably yes | 16:45 |
dansmith | not really, that seems sort of impossible, | 16:46 |
dansmith | since you create first and then upload | 16:46 |
elodilles | (and the devstack patch merged successfully on 27th March, so it either not fully blocking or it was fixed on stable/2025.1 somehow) | 16:46 |
gmann | seems like glance fail to import image? | 16:46 |
gmann | #link https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-g-api.txt#12807 | 16:46 |
sean-k-mooney | 953cc449-8f49-4d56-ab3e-49b54f49f937 failed to import image 833ccb36-83b8-41a7-b932-86884e4cdfc0 to the filesystem.: NoneType: None | 16:47 |
dansmith | is this a multinode setup? | 16:47 |
dansmith | with glance-remote configured? | 16:47 |
dansmith | that sort of looks like it can't do import because it can't look up the configured self-reference url of a node that did the stage, maybe? | 16:48 |
sean-k-mooney | its multinode and multi store but im not sure if that also means multiple glance apis | 16:48 |
sean-k-mooney | oh no | 16:48 |
sean-k-mooney | its single node | 16:48 |
sean-k-mooney | but multiple backends | 16:48 |
dansmith | yeah, single node and no g-api-r service | 16:49 |
dansmith | that must be the import-from-http test and not volume upload then | 16:50 |
sean-k-mooney | its tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_volume_upload[id-d8f1ca95-3d5b-44a3-b8ca-909691c9532d,image] | 16:50 |
sean-k-mooney | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_54c/openstack/54c93553fc6c4797b25077fb29a89338/testr_results.html | 16:50 |
sean-k-mooney | its a POST to https://10.0.18.121/volume/v3/volumes/5726aafd-b7f7-4dbd-be64-265076f0efbb/action so asking cinder to upload the volume to glance | 16:51 |
sean-k-mooney | Body: {"os-volume_upload_image": {"image_name": "tempest-VolumesActionsTest-Image-2109450379", "disk_format": "raw"}} | 16:51 |
dansmith | I also don't expect volume upload to use import | 16:52 |
dansmith | sean-k-mooney: I meant the failed task that gmann linked to | 16:53 |
dansmith | let's not debug it here | 16:53 |
sean-k-mooney | oh ok | 16:53 |
gibi | Uggla: I think we can close the meeting and let the folks continue troubleshooting | 16:53 |
sean-k-mooney | ya | 16:53 |
gmann | I think there are some image delete request in log | 16:53 |
gmann | #link https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-g-api.txt#19062 | 16:53 |
gmann | yeah let's debug later | 16:53 |
Uggla | yep can we look at that after the meeting ? | 16:54 |
sean-k-mooney | sure | 16:54 |
Uggla | I skip the vmwareapi 3rd-party CI efforts Highlights as fwiesel is ooo this week. | 16:54 |
Uggla | Latest topic | 16:54 |
Uggla | #topic Open discussion | 16:54 |
Uggla | If there is not, I'll close the meeting. | 16:55 |
Uggla | anything more to discuss ? | 16:55 |
Uggla | 3... | 16:55 |
Uggla | 2... | 16:56 |
Uggla | 1... | 16:56 |
bauzas | . | 16:56 |
Uggla | thanks all | 16:57 |
Uggla | #endmeeting | 16:57 |
opendevmeet | Meeting ended Tue Apr 1 16:57:16 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:57 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2025/nova.2025-04-01-16.01.html | 16:57 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-04-01-16.01.txt | 16:57 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2025/nova.2025-04-01-16.01.log.html | 16:57 |
elodilles | thanks Uggla o/ | 16:57 |
bauzas | thanks Uggla | 16:57 |
masahito | thank you | 16:57 |
gibi | thanks | 16:57 |
Uggla | I thought @dansmith would like to discuss brainfuck formatting. :) | 16:58 |
sean-k-mooney | dansmith: gmann: i have a theory but im not sure if it makes sense. looking at https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_54c/openstack/54c93553fc6c4797b25077fb29a89338/controller/logs/etc/glance/glance-api_conf.txt the default backend is file(cheap), this is a volume upload test and the job has ceph. so i would assume that cinder is using ceph. could the issue be that cinder is expecting to upload to the robust(ceph) store but we are trying to upload to the default file store | 16:58 |
sean-k-mooney | its called black | 16:58 |
sean-k-mooney | with its one line per function argument | 16:59 |
sean-k-mooney | dansmith: what im wondering is did cinder create a volume snapshot on ceph and then try and use the interoperable import flow to import from the ceph store to file store or something odd like that | 17:00 |
dansmith | you can't use import like that | 17:00 |
dansmith | so I don't think so | 17:00 |
dansmith | you're thinking of adding the location directly | 17:00 |
sean-k-mooney | perhaps | 17:01 |
sean-k-mooney | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_54c/openstack/54c93553fc6c4797b25077fb29a89338/controller/logs/etc/cinder/cinder_conf.txt | 17:01 |
sean-k-mooney | cinder is using ceph | 17:01 |
sean-k-mooney | im just wondering if cinder is getting confused about how to upload the image | 17:01 |
dansmith | maybe best to ask a cinder person to look | 17:02 |
sean-k-mooney | ya im going to very quickly try and find the change id in their logs | 17:03 |
sean-k-mooney | *request id | 17:03 |
elodilles | gmann sean-k-mooney : it looks like this patch landed around the time when things started to break: https://review.opendev.org/c/openstack/tempest/+/938592 | 17:04 |
gmann | elodilles: yeah, I was about to debug that and someone rechecked and it merged and then I forgot | 17:05 |
gmann | I am checking some tempest logs to see if some test is making requests that race GET and DELETE | 17:05 |
gmann | I will do after TC meeting | 17:05 |
sean-k-mooney | there is a traceback related to that i think in cinder | 17:06 |
sean-k-mooney | it thinks it uploaded the image DEBUG cinder.volume.manager [None req-ca38b299-90c1-489f-93c7-3fd8b32b5e69 tempest-VolumesActionsTest-2112541612 None] Uploaded volume to glance image-id: 439cbb04-d5ef-480f-89c9-84bf2d4c749b | 17:06 |
elodilles | thanks in advance gmann o/ | 17:06 |
gmann | because this is the image id which is 404 in the failing test and there is a successful delete request https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-g-api.txt#19069 | 17:06 |
gmann | and after this DELETE request there are GET and 404 | 17:07 |
opendevreview | Dan Smith proposed openstack/nova master: Invalidate PCI-in-placement cached RPs during claim https://review.opendev.org/c/openstack/nova/+/944149 | 17:07 |
opendevreview | Dan Smith proposed openstack/nova master: Support "one-time-use" PCI devices https://review.opendev.org/c/openstack/nova/+/943816 | 17:07 |
opendevreview | Dan Smith proposed openstack/nova master: Add one-time-use devices docs and reno https://review.opendev.org/c/openstack/nova/+/944262 | 17:07 |
opendevreview | Dan Smith proposed openstack/nova master: WIP: Functional tests for one-time-use devices https://review.opendev.org/c/openstack/nova/+/946065 | 17:07 |
gmann | sean-k-mooney: I am searching via image id from the tempest test failing traceback: | 17:07 |
gmann | Details: {'message': 'No image found with ID 3eb80d77-c55b-40d1-bf5d-1e22607350e8<br /><br />\n\n\n', 'code': '404 Not Found', 'title': 'Not Found'} | 17:07 |
sean-k-mooney | https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-c-vol.txt#4972-5037 | 17:07 |
dansmith | gibi: that ^ tests the full workflow in a single test.. there's a lot of setup so I'm not sure if it's really useful to split that into multiple cases | 17:08 |
dansmith | gibi: also not sure its feasible to test anything other than the happy path there (the cache issue is covered in unit tests) | 17:08 |
sean-k-mooney | gmann: do you know where that 3eb80d77-c55b-40d1-bf5d-1e22607350e8 is coming from | 17:09 |
sean-k-mooney | we can see at the start of the test logs that the id of the created image is 439cbb04-d5ef-480f-89c9-84bf2d4c749b | 17:09 |
dansmith | I assume testing things like startup failures due to improper config of a VF or something are not really needed to be covered in functional since they're covered in unit | 17:10 |
sean-k-mooney | gmann: oh this is doing multiple operations in one test | 17:10 |
gmann | sean-k-mooney: it is from cinder upload_volume https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/tempest/api/volume/test_volumes_actions.py#L123 | 17:10 |
sean-k-mooney | so what's happening is the raw image is uploading fine | 17:12 |
sean-k-mooney | and the qcow is failing | 17:12 |
sean-k-mooney | gmann: the qcow fails because of https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-c-vol.txt#4972-5037 | 17:13 |
sean-k-mooney | its failing in the format inspector code | 17:13 |
dansmith | on the cinder side | 17:14 |
sean-k-mooney | yep | 17:14 |
dansmith | and that's their oooold integrated format inspector code | 17:14 |
gmann | but in that case, I expect upload_volume to fail instead of giving image_id | 17:14 |
sean-k-mooney | so its a ceph backed cinder volume. and we are asking cinder to upload the volume as an image in qcow2 format | 17:14 |
dansmith | and apparently because filename is none | 17:14 |
sean-k-mooney | gmann: i have not looked at the cinder code but i believe the workflow is create the glance image then upload the data | 17:15 |
sean-k-mooney | so they are probably failing in between those two steps | 17:15 |
sean-k-mooney | gmann: from the looks of that trace the ceph cinder volume driver does not support converting the format when uploading a volume to an image | 17:17 |
sean-k-mooney | gmann: so in the near term we probably should tweak that job to only use raw | 17:17 |
sean-k-mooney | and if cinder ever supports that with ceph backed volumes we can turn it back on | 17:17 |
sean-k-mooney | looking at the trace they are either expecting a host mounted volume or a qemu rbd path to be passed as the path to qemu image. but its obviously not passing either given its None | 17:19 |
sean-k-mooney | dansmith: gmann i think the short term fix would be to add disk_format=raw here https://github.com/openstack/nova/blob/master/.zuul.yaml#L712 | 17:23 |
sean-k-mooney | based on the default change in https://review.opendev.org/c/openstack/tempest/+/938592/4/tempest/config.py | 17:23 |
sean-k-mooney | qcow makes sense for lvm jobs perhaps since the volume will be a local block device and they can call qemu-img on it | 17:25 |
dansmith | why not have the cinder team confirm and do that if so? | 17:25 |
dansmith | isn't this on a stable job? | 17:25 |
sean-k-mooney | its likely on master too | 17:25 |
sean-k-mooney | but yes | 17:25 |
dansmith | how would that not be 100% fail? | 17:26 |
gmann | but this is not 100% failure in stable and master pass correctly ? | 17:26 |
dansmith | my point is, let's not just go papering over some failure with a devstack change as this seems like it should be a cinder bug no? | 17:26 |
sean-k-mooney | right i think it is a cinder bug | 17:26 |
gmann | dansmith: yeah, that's what I was wondering, it's not 100% | 17:26 |
dansmith | right, so I say kick it over to them | 17:27 |
sean-k-mooney | so i dont see it as papering over | 17:27 |
sean-k-mooney | we need to report it to them as a gate blocker | 17:27 |
sean-k-mooney | and then either revert the tempest default change until it's fixed or just skip it in our job until fixed | 17:27 |
dansmith | if glance is not using ceph as well, then asking them to upload a qcow2 image is totally legit | 17:27 |
sean-k-mooney | but i agree the first step is get cinder involved | 17:27 |
sean-k-mooney | glance is using file and ceph | 17:28 |
sean-k-mooney | so yes uploading qcow is legit in the job | 17:28 |
sean-k-mooney | given file is the default | 17:28 |
sean-k-mooney | even if it was not the default it would be legit but slow | 17:28 |
dansmith | yeah | 17:28 |
sean-k-mooney | we also have force image conversion to raw in this job | 17:29 |
sean-k-mooney | which could be a different issue | 17:29 |
sean-k-mooney | it's not failing in glance however so it's not the current issue | 17:29 |
supamatt | :wave: hi folks this change broke security groups https://review.opendev.org/c/openstack/nova/+/811521 , if any security groups with the same name exist in the project (whether the VM is using that sec group or not). Results in VMs not being able to be built with an error saying "Multiple security groups found matching <name>. Use an ID to be more specific" | 17:32 |
sean-k-mooney | if you're relying on passing the name instead of the uuid | 17:33 |
sean-k-mooney | it would also only fail if that conflicting named security group was shared with you | 17:33 |
sean-k-mooney | supamatt: i'm not really sure how to address the conflicting requirements | 17:34 |
sean-k-mooney | preferring the local one seems incorrect | 17:34 |
sean-k-mooney | if there is a name conflict it should be raised as an error by nova | 17:35 |
masahito | gmann: gibi: hi please review this api bug fix https://review.opendev.org/c/openstack/nova/+/939658 The v2.100 follows the same fix, so it might be ok for W+1. | 17:36 |
supamatt | sean-k-mooney: that's the thing, I am passing the uuid to nova. But nova errors if any security groups share the same name in a project, regardless of whether I use that sec group to provision a vm. | 17:36 |
sean-k-mooney | supamatt: on ok | 17:37 |
sean-k-mooney | *oh | 17:37 |
sean-k-mooney | that's simpler to fix | 17:37 |
sean-k-mooney | supamatt: have you filed a bug for this yet | 17:37 |
supamatt | I was checking lp now to see if there was an existing one | 17:38 |
supamatt | but it seems I am the first to see this, so I will file one | 17:38 |
sean-k-mooney | if you pass a name then the current behavior is arguably correct. we could prefer the one from the current project but that might be surprising. if you pass a uuid we should use the one you said even if there are duplicate names, so that's a legit bug | 17:39 |
sean-k-mooney | supamatt: so that's a new feature in epoxy which releases tomorrow | 17:40 |
sean-k-mooney | we won't have a fix for it in the release but we likely will do another release after the ptg | 17:41 |
sean-k-mooney | supamatt: https://review.opendev.org/c/openstack/nova/+/811521/13/nova/network/neutron.py#851 that's where the bug is | 17:51 |
sean-k-mooney | supamatt: the current algorithm is raising based on the security groups available to the user, not the requested ones | 17:51 |
sean-k-mooney | we need to change the data structures and the algorithm slightly | 17:52 |
gmann | masahito: ack, will check soon | 17:53 |
gmann | dansmith: sean-k-mooney: elodilles: on test_volume_upload ceph failure, I found the cinder fix which is being backported to stable branches one by one. https://review.opendev.org/q/I32b505aa69c71b62e7e3a52d65d38165d34e97d8 | 17:57 |
sean-k-mooney | gmann ack | 17:58 |
gmann | we just need to wait for that to be merged on all stable | 17:58 |
dansmith | gotcha | 17:58 |
dansmith | I'm also a bit surprised (not really) they have not converted to oslo format inspector | 17:58 |
dansmith | I even prototyped the patch for them | 17:59 |
sean-k-mooney | hum ya the failure was on stable/2025.1 | 17:59 |
sean-k-mooney | that means they are also missing some of the multiple format things | 18:00 |
sean-k-mooney | for iso support and vfat support | 18:00 |
dansmith | yeah | 18:00 |
dansmith | not sure if that's an issue for them or not | 18:00 |
sean-k-mooney | well it means you won't be able to create volumes from some isos presumably | 18:00 |
sean-k-mooney | that's a pretty niche usecase given you can only use them properly with hacks | 18:01 |
dansmith | I'm still a bit surprised that fix that gmann found is not deterministic in its fail pattern | 18:01 |
elodilles | gmann: ah, cool, sounds great! thanks! | 18:01 |
sean-k-mooney | dansmith: ya that's odd to me too | 18:01 |
sean-k-mooney | dansmith: that unlink https://review.opendev.org/c/openstack/cinder/+/945616/2/cinder/volume/drivers/rbd.py#2126 bug | 18:04 |
sean-k-mooney | upload can raise so there should at least be a try finally block here | 18:05 |
dansmith | unlink you mean? | 18:06 |
dansmith | it also looks like they're using two strategies to unlink the temp file no? | 18:06 |
sean-k-mooney | ya the os.unlink of the temp export of the volume | 18:06 |
dansmith | oh, I guess the context only unlinks on error | 18:06 |
dansmith | but yeah, that could definitely fail | 18:06 |
sean-k-mooney | oh they are using fileutils.remove_path_on_error(tmp_file): | 18:07 |
sean-k-mooney | to handle the exception case | 18:07 |
sean-k-mooney | and then the explicit unlink for happy path | 18:07 |
dansmith | right, seems like they should use try..finally: unlink() | 18:07 |
sean-k-mooney | that works... but not how i would have done that | 18:07 |
dansmith | yeah | 18:07 |
sean-k-mooney | don't we use contextlib.ExitStack or something like that to handle it in nova/oslo | 18:08 |
dansmith | in some places yeah but I don't think it's really necessary here | 18:08 |
sean-k-mooney | we might not even need that | 18:08 |
sean-k-mooney | right, that's really only needed if we have multiple resources we need to clean up | 18:08 |
sean-k-mooney | gmann: are we sure this is not just hard failing, and it only looks like it's failing sometimes due to the backport of that patch | 18:10 |
sean-k-mooney | gmann: given the "fix" i don't see how this could sometimes work and sometimes not | 18:10 |
sean-k-mooney | enabling the qcow image type and this backport have all been happening over the last week | 18:11 |
gmann | sean-k-mooney: after this fix merged in master and stable/2025.1, I have not seen failure. in my devstack change on stable/2025.1 it started passing once this fix backport merged on stable/2025.1. I checked the timing to make sure that | 18:11 |
sean-k-mooney | ack | 18:12 |
sean-k-mooney | well let's just wait for the final patch to land i guess | 18:12 |
gmann | stable/2024.2 backport still not merged so there we have 100% | 18:12 |
gmann | 100% failing | 18:12 |
gmann | yeah | 18:12 |
sean-k-mooney | ack that makes sense | 18:13 |
sean-k-mooney | gmann: the default change of image formats was in tempest yes? | 18:16 |
gmann | yes | 18:16 |
sean-k-mooney | i wonder if it would make sense to replace https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/zuul.d/project.yaml#L113-L115 with the nova ceph multistore job | 18:17 |
sean-k-mooney | that would provide the same level of ceph coverage but also test this configuration where uploads go to a glance backed by file in this case. | 18:18 |
sean-k-mooney | that might be an overreaction but that would have caught this | 18:20 |
sean-k-mooney | we already have quite a lot in check | 18:20 |
gmann | sean-k-mooney: +1, that make sense. we do have nova-ceph-multistore in tempest but as experimental which is what we forget to run on related changes | 18:20 |
sean-k-mooney | so i would be hesitant to suggest adding more jobs | 18:20 |
sean-k-mooney | its in devstack too | 18:20 |
sean-k-mooney | https://github.com/openstack/devstack/blob/master/.zuul.yaml#L933-L934 | 18:20 |
sean-k-mooney | that's how we are testing ceph integration on devstack changes | 18:21 |
gmann | yes, I think replacing devstack-plugin-ceph-tempest-py3->nova-ceph-multistore in tempest will cover more cases | 18:21 |
gmann | catching in devstack is late where tempest things get merged | 18:22 |
gmann | sean-k-mooney: might be late for you but tomorrow or later if you can propose that in tempest I can merge. if not then I will do sometime later this week | 18:23 |
sean-k-mooney | i can do it now. its 19:23 but ill be around until at least the top of the hour | 18:24 |
gmann | thanks | 18:24 |
sean-k-mooney | gmann: am what will i do about https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/zuul.d/project.yaml#L159-L160 | 18:27 |
sean-k-mooney | it was temporarily disabled in gate 3 years ago | 18:27 |
sean-k-mooney | when i replace it will i add nova-ceph-multinode or just remove that comment | 18:28 |
sean-k-mooney | https://github.com/openstack/tempest/commit/b1ea4327108cbbd518dfc75482dff79493b4edc9 | 18:28 |
gmann | sean-k-mooney: yeah, mostly due to stability. but you can keep nova ceph as voting and remove from experimental queue https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/zuul.d/project.yaml#L176 | 18:28 |
gmann | yeah | 18:28 |
sean-k-mooney | ok ill add it to both | 18:28 |
sean-k-mooney | its voting on nova and devstack and stable so i dont see a reason to skip in gate | 18:29 |
gmann | yeah | 18:29 |
sean-k-mooney | gmann: last question will i put the current devstack-ceph job in experimental. i think no but i can replace nova-ceph-multistore with it if it's useful to have there | 18:31 |
sean-k-mooney | experimental has quite a lot in it too and we can easily add a dnm to run it if needed | 18:31 |
supamatt | sean-k-mooney: here's the lp bug for the security group problem, https://bugs.launchpad.net/nova/+bug/2105896 | 18:31 |
sean-k-mooney | thanks, i think i figured out how to fix it and commented in the original patch | 18:32 |
sean-k-mooney | gmann: i think https://review.opendev.org/c/openstack/tempest/+/946076 should be good. i will respin it tomorrow if you have any issues | 18:40 |
gmann | sean-k-mooney: looks good, thanks | 18:40 |
opendevreview | Matthew Heler proposed openstack/nova master: Fix creating virtual servers with multiple security groups https://review.opendev.org/c/openstack/nova/+/946079 | 19:28 |
opendevreview | Goutham Pacha Ravi proposed openstack/nova stable/2025.1: DNM: test dependency on devstack-plugin-ceph changes https://review.opendev.org/c/openstack/nova/+/946082 | 19:42 |
sean-k-mooney | supamatt: we will need some test coverage but https://review.opendev.org/c/openstack/nova/+/946079/1/nova/network/neutron.py looks like it implements the changes i suggested. | 20:14 |
sean-k-mooney | supamatt: i'm not sure if you have had a chance to test it or not | 20:14 |
sean-k-mooney | i can help you write a functional reproducer or perhaps write one for you if you don't have time but let's see how ci goes overnight | 20:15 |
sean-k-mooney | im going to drop for today but ill check back tomorrow | 20:15 |
supamatt | I tested the patch in lab, seems to have worked and the vm built. When previously it did not. | 20:38 |
opendevreview | Matthew Heler proposed openstack/nova master: Fix creating virtual servers with multiple security groups https://review.opendev.org/c/openstack/nova/+/946079 | 22:54 |
opendevreview | Matthew Heler proposed openstack/nova master: Fix creating virtual servers with multiple security groups https://review.opendev.org/c/openstack/nova/+/946079 | 23:38 |
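
Note on the security group discussion (17:32 to 17:52): the behavior sean-k-mooney describes, where a duplicate name should only be an error when the *requested* reference is an ambiguous name and never when a UUID was passed, can be sketched roughly as below. This is an illustrative sketch only, not the code from the nova patch: `resolve_security_groups`, `AmbiguousSecurityGroup`, `SecurityGroupNotFound`, and the dict layout of `available` are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class AmbiguousSecurityGroup(Exception):
    """Hypothetical error: a requested *name* matches several groups."""


class SecurityGroupNotFound(Exception):
    """Hypothetical error: a requested reference matches nothing."""


def resolve_security_groups(requested, available):
    """Map each requested name or ID to a security group ID.

    `available` is assumed to be a list of dicts with "id" and "name"
    keys, i.e. every group visible to the project (owned or shared).
    """
    by_id = {sg["id"]: sg for sg in available}
    by_name = defaultdict(list)
    for sg in available:
        by_name[sg["name"]].append(sg["id"])

    resolved = []
    for ref in requested:
        # An exact ID match is never ambiguous, even if other visible
        # groups happen to share that group's name.
        if ref in by_id:
            resolved.append(ref)
            continue
        matches = by_name.get(ref, [])
        if len(matches) > 1:
            # Only a *requested* name with several matches is an error.
            raise AmbiguousSecurityGroup(
                "Multiple security groups found matching %s. "
                "Use an ID to be more specific" % ref)
        if not matches:
            raise SecurityGroupNotFound(ref)
        resolved.append(matches[0])
    return resolved
```

The point of the sketch is that the duplicate check runs per requested reference, so unrelated duplicate names elsewhere in the project never affect a request made by UUID.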
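
Note on the temp file cleanup discussion (18:04 to 18:08): dansmith's suggestion of `try..finally: unlink()` in place of an error-only cleanup context plus a separate happy-path unlink could look something like the sketch below. It is a generic illustration, not the cinder rbd driver code; `export_and_upload` and its `export` and `upload` callables are hypothetical stand-ins for the volume export and the image upload steps.

```python
import os
import tempfile


def export_and_upload(export, upload):
    """Export to a temp file, upload it, and always remove the temp file.

    `export` and `upload` are hypothetical callables that each take the
    temp file path; either one may raise.
    """
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        export(tmp_path)
        upload(tmp_path)
    finally:
        # A single cleanup path covers both success and failure, so no
        # separate error-only handler is needed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            # The file was already removed by one of the callables.
            pass
    return tmp_path
```

As noted in the discussion, something like `contextlib.ExitStack` only starts to pay off when several resources need cleaning up; for one temp file a plain `try`/`finally` is enough.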