opendevreview | Merged openstack/nova master: Add metadata for shares https://review.opendev.org/c/openstack/nova/+/850500 | 08:40 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Fix instance vm_state during shelve https://review.opendev.org/c/openstack/nova/+/934294 | 10:07 |
opendevreview | Ivan Tkachuk proposed openstack/nova master: Reduce calls to qemu-img for disk_info https://review.opendev.org/c/openstack/nova/+/936246 | 11:34 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Show candidate combinatorial explosion by dev number https://review.opendev.org/c/openstack/nova/+/855885 | 12:33 |
opendevreview | Rajesh Tailor proposed openstack/nova-specs master: Show finish_time field in instance action show https://review.opendev.org/c/openstack/nova-specs/+/929780 | 13:08 |
greatgatsby | Good day. I think we've found a bug in os-brick fibre_channel.py and just looking to discuss quickly before I submit a ticket or even a PR (I have a working fix currently). Is this the best place to ask about it? | 14:47 |
frickler | greatgatsby: well technically os-brick belongs to the cinder team, but there might be some overlap. I'd still suggest to start in #openstack-cinder first | 14:54 |
greatgatsby | thanks, will do | 14:54 |
sean-k-mooney | gibi: when you have time would you mind reviewing this revert https://review.opendev.org/c/openstack/nova/+/909122 | 15:41 |
sean-k-mooney | it's related to our scp conversation | 15:41 |
gibi | sean-k-mooney: good point. thanks. +A | 15:45 |
sean-k-mooney | gibi: that's what's required to fix using ip addresses for migration, indirectly | 15:46 |
gibi | I see | 15:46 |
sean-k-mooney | that is why we don't need the blueprint for that feature | 15:47 |
sean-k-mooney | it was only failing because of this scp. | 15:47 |
sean-k-mooney | if you set the migration/live migration inbound address to an ip it will work once this is reverted | 15:47 |
sean-k-mooney | that workaround is hardcoded to use instance.host as the source host to pull the file from | 15:48 |
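For reference, the setting sean-k-mooney refers to is the real `[libvirt] live_migration_inbound_addr` option; the fragment below is a hypothetical excerpt and the address shown is only an example.

```ini
# Hypothetical nova.conf fragment on a compute node.
# Have the destination advertise an IP address (rather than a resolvable
# hostname) for incoming live migration traffic:
[libvirt]
live_migration_inbound_addr = 192.0.2.10
```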
bauzas | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Nov 26 16:00:27 2024 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:01 |
tkajinam | o/ | 16:01 |
elodilles | o/ | 16:01 |
s3rj1k | hi all | 16:01 |
Uggla | o/ | 16:02 |
bauzas | hey | 16:03 |
bauzas | starting slowly | 16:04 |
bauzas | #topic Bugs (stuck/critical) | 16:04 |
bauzas | #info No Critical bug | 16:04 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:05 |
bauzas | any questions about bugs ? | 16:05 |
bauzas | ok moving on | 16:07 |
bauzas | #topic Gate status | 16:07 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:07 |
bauzas | #link https://etherpad.opendev.org/p/nova-ci-failures-minimal | 16:07 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 | 16:07 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:08 |
bauzas | #info Please try to provide meaningful comment when you recheck | 16:08 |
bauzas | I saw a couple of failures but those are known issues | 16:08 |
bauzas | anything about CI failures that is pretty new ? | 16:08 |
bauzas | (all periodics are green) | 16:08 |
bauzas | looks not, moving on | 16:10 |
bauzas | #topic Release Planning | 16:10 |
bauzas | #link https://releases.openstack.org/epoxy/schedule.html | 16:10 |
bauzas | #action bauzas to add Epoxy nova deadlines in the schedule | 16:10 |
bauzas | I'm pretty done with the patch proposal but I need to fix something before uploading it | 16:10 |
bauzas | #topic Review priorities | 16:11 |
bauzas | #link https://etherpad.opendev.org/p/nova-2025.1-status | 16:11 |
bauzas | the page should be up to date, feel free to use it and amend it | 16:11 |
bauzas | anything about that ? | 16:12 |
gibi | o/ | 16:13 |
sean-k-mooney | o/ nothing from me on that topic | 16:13 |
bauzas | cool | 16:13 |
bauzas | #topic Stable Branches | 16:13 |
bauzas | elodilles: shoot | 16:13 |
elodilles | #info stable/2024.2 gate seems to be OK | 16:14 |
elodilles | #info stable/2024.1 gate is blocked on grenade-skip-level & stable/2023.2 is blocked on nova-grenade-multinode | 16:14 |
elodilles | failure is due to stable/2023.1->unmaintained/2023.1 transition, devstack and grenade fixes are proposed | 16:14 |
elodilles | and actually the 2024.1 branch fix (grenade) patch is already in the gate queue | 16:14 |
elodilles | though: other workaround is to set these jobs as non-voting - given that gate should not rely on an unmaintained branch | 16:15 |
elodilles | see further details: | 16:15 |
elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:15 |
elodilles | and that's all from me about stable branches now | 16:15 |
bauzas | thanks | 16:16 |
elodilles | np | 16:16 |
bauzas | #topic vmwareapi 3rd-party CI efforts Highlights | 16:16 |
bauzas | fwiesel: around ? | 16:16 |
bauzas | looks he's AFK | 16:17 |
bauzas | no worries, moving on | 16:17 |
fwiesel | Sorry , I am here | 16:17 |
bauzas | ah | 16:17 |
bauzas | anything to raise from your side ? | 16:17 |
fwiesel | There was a regression in oslo.utils (master) and I have created a change to fix it: https://review.opendev.org/c/openstack/oslo.utils/+/936247 | 16:17 |
fwiesel | Hopefully the builds will be back to the two failures and I will tackle those then. | 16:18 |
fwiesel | That's from my side | 16:18 |
sean-k-mooney | ah, is that related to removing netifaces | 16:18 |
bauzas | okay, gtk | 16:18 |
bauzas | thanks | 16:18 |
* tkajinam is aware of the proposed fix and will ping the other cores to get in | 16:19 |
tkajinam | get that in | 16:19 |
bauzas | nice, thanks tkajinam | 16:19 |
tkajinam | fwiesel, if you need a new release with the fix early then ping me | 16:19 |
tkajinam | once that is merged | 16:19 |
fwiesel | tkajinam: Thanks, I'll let you know | 16:20 |
bauzas | cool | 16:20 |
bauzas | then moving to the last item from the agenda | 16:20 |
bauzas | #topic Open discussion | 16:20 |
bauzas | nothing in the agenda, so anything, anyone ? | 16:21 |
s3rj1k | there is this https://bugs.launchpad.net/nova/+bug/2089386 | 16:21 |
sean-k-mooney | i have one followup from last week too | 16:21 |
sean-k-mooney | lets start with s3rj1k topic | 16:21 |
bauzas | ok, s3rj1k, shoot | 16:22 |
s3rj1k | the idea is to allow host discovery to be concurrent, both cli and internal, using distributed locking | 16:23 |
sean-k-mooney | so perhaps i can provide some context | 16:24 |
s3rj1k | this is mostly needed for k8s-like envs where discovery is run in multiple places | 16:24 |
sean-k-mooney | s3rj1k is interested in using the discover hosts periodic in an HA env | 16:24 |
bauzas | s3rj1k: I think that topic requires a proper discussion that can't be done during a meeting | 16:25 |
sean-k-mooney | currently we require that if you use the periodic it's enabled on at most one host | 16:25 |
sean-k-mooney | they would like to address that pain point | 16:25 |
bauzas | if we want to discuss the design, it has to be an async conversation in a properly formatted document | 16:25 |
bauzas | that's the reason why we introduced our specification program for those kinds of feature requests | 16:26 |
s3rj1k | bauzas: spec? or rfe is enough for this time? | 16:26 |
sean-k-mooney | so this would definitely be a spec if you were going to work on it | 16:26 |
bauzas | s3rj1k: are you familiar with spec writing or do you need guidance ? | 16:26 |
s3rj1k | bauzas: done one for neutron, so all ok | 16:27 |
sean-k-mooney | i think before going that far however s3rj1k wanted some initial feedback on whether this is in scope for nova to fix | 16:27 |
bauzas | sean-k-mooney: well, I'm not sure we have a quorum today for such design discussion | 16:27 |
bauzas | if that was something before the PTG, we would have said "sure, just add that to the PTG and we'll discuss it" | 16:28 |
sean-k-mooney | thats still an option | 16:28 |
sean-k-mooney | i suggested that s3rj1k bring it here to advertise that it exists | 16:28 |
bauzas | honestly, I haven't yet formally written the nova deadlines for Epoxy but we're already running short on time | 16:28 |
sean-k-mooney | and then start either a mailing list or spec discussion after that | 16:28 |
bauzas | what exact problem are we trying to solve then ? | 16:29 |
sean-k-mooney | currently if you enable the discover host periodic task in more than one scheduler it can get duplicate key errors from the db | 16:29 |
bauzas | are we speaking of concurrent nova-scheduler services that need to be HA active-active for X reasons ? | 16:29 |
sean-k-mooney | as 2 processes can race to create the mappings | 16:29 |
sean-k-mooney | leading to errors in the logs | 16:29 |
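To illustrate the race just described and the distributed-locking idea from the RFE, here is a minimal, self-contained sketch. It uses a plain `threading.Lock` as a stand-in for a real distributed lock (e.g. one from tooz), and a Python set as a stand-in for the host_mappings table; all names are hypothetical, this is not nova code.

```python
# Two "schedulers" run the discover-hosts periodic concurrently. Without the
# discovery lock, both could try to insert the same host mapping and one would
# hit the DB's unique constraint (the duplicate key error from the logs).
import threading

host_mappings = set()               # stand-in for the host_mappings DB table
db_write_lock = threading.Lock()    # mimics the DB's atomic insert
discovery_lock = threading.Lock()   # stand-in for a distributed lock (e.g. tooz)


class DuplicateKeyError(Exception):
    """Stand-in for the DB integrity error two racing schedulers can hit."""


def insert_host_mapping(host):
    # mimic an INSERT against a unique-constrained column
    with db_write_lock:
        if host in host_mappings:
            raise DuplicateKeyError(host)
        host_mappings.add(host)


def discover_hosts(unmapped_hosts):
    # guarded version: only one "scheduler" discovers at a time, and it
    # re-checks for hosts that were mapped while it waited for the lock
    with discovery_lock:
        for host in unmapped_hosts:
            if host in host_mappings:
                continue  # another scheduler already mapped it
            insert_host_mapping(host)


# two concurrent "schedulers" running the same periodic
threads = [
    threading.Thread(target=discover_hosts, args=(["compute1", "compute2"],))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(host_mappings))  # each host mapped exactly once, no DuplicateKeyError
```

Dropping `discovery_lock` reintroduces the race: both threads can pass the `in host_mappings` check before either insert lands, which is exactly the duplicate key failure mode.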
sean-k-mooney | we don't actually support that today | 16:30 |
bauzas | I think we always said that nova-scheduler has to be active-passive | 16:30 |
sean-k-mooney | but our documentation on that is kind of lacking | 16:30 |
sean-k-mooney | no | 16:30 |
bauzas | I'd pretty much bet we documented it | 16:30 |
sean-k-mooney | the scheduler has been supported in active-active for a very long time | 16:30 |
bauzas | never | 16:30 |
sean-k-mooney | yes | 16:30 |
tkajinam | as far as I can tell Tripleo in the past deployed it in all controllers | 16:31 |
bauzas | with placement, we thought that we /could/ run it active-active but there were reasons not to | 16:31 |
sean-k-mooney | nope | 16:31 |
bauzas | tkajinam: which was a bug that we raised a couple of times | 16:31 |
sean-k-mooney | downstream it's been active-active since like 16, maybe before | 16:31 |
bauzas | and I think TripleO changed it to A-P | 16:31 |
bauzas | for that exact reason | 16:31 |
sean-k-mooney | nope | 16:31 |
tkajinam | no | 16:31 |
sean-k-mooney | ok well i think we need a longer discussion on this RFE request | 16:32 |
sean-k-mooney | likely a spec, and we probably don't have time to complete it in epoxy | 16:32 |
sean-k-mooney | but we should discuss this more async | 16:32 |
s3rj1k | no prob, thanks sean-k-mooney for taking a lead on explaining | 16:33 |
bauzas | I have to admit that none of that tribal knowledge is written in https://docs.openstack.org/nova/latest/admin/scheduling.html | 16:34 |
sean-k-mooney | it's also not in the config option | 16:34 |
sean-k-mooney | i left my initial feedback on the bug when i triaged it as opinion | 16:35 |
sean-k-mooney | i didn't mark it as invalid as i thought we should at least discuss it more widely first | 16:35 |
bauzas | for now, we should document that active-passive HA configuration for sure | 16:36 |
sean-k-mooney | for the periodic only | 16:36 |
bauzas | because indeed, we know that there is no eventual consistency between schedulers | 16:36 |
sean-k-mooney | the scheduler should generally be deployed active-active | 16:36 |
bauzas | that's your opinion :) | 16:36 |
sean-k-mooney | but also the periodic has performance issues | 16:36 |
sean-k-mooney | bauzas: it's what we use in our product | 16:36 |
sean-k-mooney | and what almost all installers do by default | 16:36 |
bauzas | https://specs.openstack.org/openstack/nova-specs/specs/abandoned/parallel-scheduler.html | 16:37 |
tkajinam | yeah > almost all installer do by default | 16:37 |
sean-k-mooney | that's a different proposal | 16:38 |
bauzas | I'll literally quote the first sentence of that spec : | 16:38 |
bauzas | "If you running two nova-scheduler processes they race each other, they don’t find out about each others choices until the DB gets updated by the nova-compute resource tracker. This has lead to many deployments opting for an Active/Passive HA setup for the nova-scheduler process." | 16:38 |
tkajinam | people may prefer act-act for simplicity, and to avoid the clustering mechanism needed to implement active-passive. | 16:39 |
tkajinam | without large warning :-P | 16:39 |
sean-k-mooney | bauzas: that does not really apply as of placement | 16:39 |
sean-k-mooney | bauzas: i would consider it very incorrect advice to document that active-active is not supported | 16:39 |
gibi | yeah, the goal of placement is to shrink the race window between parallel schedulers | 16:40 |
gibi | it is a solved problem for those resources that are tracked in placement | 16:41 |
bauzas | I don't disagree with the fact that HA active-active schedulers is a problem to solve | 16:41 |
gibi | for those that are not tracked there, the compute manager has a lock around claim to prevent overallocation | 16:41 |
gibi | and we have alternatives to reschedule | 16:41 |
bauzas | gibi: exactly, hence the A/P mechanism | 16:41 |
gibi | no this is A A | 16:41 |
gibi | the only A P problem is in the periodic discovery | 16:42 |
bauzas | in the very early times, we were considering reschedules as a way to address the problem | 16:42 |
bauzas | we moved away from that tenet by wanting to reduce reschedules, leading indeed to a broader problem | 16:42 |
gibi | we reduced reschedules with placement | 16:43 |
bauzas | originally, the scheduler wasn't intended to provide an exact solution | 16:43 |
gibi | and we improved reschedules with alternatives generation | 16:43 |
bauzas | right, which is why we never solved that problem | 16:43 |
bauzas | we reduced the scope of reschedules, that's it | 16:43 |
sean-k-mooney | we solved it to the point that we recommend active-active as the default | 16:43 |
gibi | in a distributed system there are limits to what you can solve exactly | 16:43 |
gibi | I agree with sean-k-mooney, we can recommend A A | 16:44 |
gibi | actually OSP 18 does A A A | 16:44 |
gibi | (or as many As as you want :D) | 16:44 |
sean-k-mooney | right, our product does not support active-passive, but i believe that was true in 17 as well | 16:44 |
bauzas | A A A is OK to me with resources tracked by placement | 16:44 |
sean-k-mooney | anyway, perhaps we should move on? | 16:45 |
bauzas | agreed | 16:45 |
sean-k-mooney | we can talk about this more but probably don't need to in the meeting | 16:45 |
bauzas | and agreed on the fact we need a spec | 16:45 |
bauzas | but maybe the solution is to add more resources to placement | 16:45 |
sean-k-mooney | well that is the general direction anyway | 16:46 |
bauzas | or consider this as a non-solvable problem and accept reschedules as a caveat | 16:46 |
sean-k-mooney | but that does not address the reported problem | 16:46 |
gibi | on the proposal of a distributed discover, I can suggest doing the discovery outside of a scheduler periodic to avoid the race | 16:46 |
sean-k-mooney | nova-audit would | 16:46 |
bauzas | anyway, moving on | 16:46 |
sean-k-mooney | gibi: yes it's very different | 16:46 |
bauzas | s3rj1k: fancy writing a spec ? | 16:46 |
sean-k-mooney | bauzas: ack so i had one quick topic | 16:46 |
bauzas | sean-k-mooney: shoot | 16:46 |
s3rj1k | gibi: a similar issue would be with the CLI, check out the RFE | 16:46 |
s3rj1k | bauzas: will do | 16:47 |
sean-k-mooney | so last week i raised adding rodolfo to os-vif core | 16:47 |
sean-k-mooney | i sent a mail to the list and no one objected | 16:47 |
gibi | s3rj1k: I mean if you ensure the discover is only run from a single CLI session at a time then I assume there is no race | 16:47 |
sean-k-mooney | so if there is no other objection here i will proceed with that after the call. | 16:47 |
s3rj1k | gibi: yes, need external control on how CLI gets run | 16:48 |
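The pattern gibi describes (no periodic in the schedulers, discovery driven from one externally controlled place) can be sketched as the following nova.conf fragment; the option and command named in the comment are real, while where you schedule the command from (cron, a k8s CronJob, etc.) is up to the deployment.

```ini
# nova.conf on every scheduler: leave the in-scheduler discovery periodic
# disabled (-1 is the default) so no two schedulers race on host mappings.
# Instead, drive discovery from a single externally controlled place by
# running:  nova-manage cell_v2 discover_hosts
[scheduler]
discover_hosts_in_cells_interval = -1
```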
s3rj1k | lets move on, yes | 16:48 |
gibi | sean-k-mooney: no objection on my side | 16:48 |
bauzas | sean-k-mooney: no objections indeed | 16:48 |
sean-k-mooney | ack so that is all i had | 16:49 |
tkajinam | I have no objections but +1 :-) (I'm not a core, though) | 16:49 |
sean-k-mooney | i'll send a mail to the list and then i'll add them after that | 16:50 |
sean-k-mooney | just to keep a record of it beyond this meeting | 16:50 |
bauzas | ++ | 16:51 |
bauzas | okay, then I think we're done for today | 16:52 |
bauzas | anything else ? | 16:52 |
bauzas | looks not | 16:52 |
bauzas | have a good end of day | 16:52 |
bauzas | thanks all | 16:52 |
bauzas | #endmeeting | 16:52 |
opendevmeet | Meeting ended Tue Nov 26 16:52:50 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:52 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2024/nova.2024-11-26-16.00.html | 16:52 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2024/nova.2024-11-26-16.00.txt | 16:52 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2024/nova.2024-11-26-16.00.log.html | 16:52 |
s3rj1k | thanks all | 16:52 |
tkajinam | thanks ! | 16:52 |
tkajinam | nothing urgent but I've added a few patches to drop deprecated/unmaintained deps to review priority list just fyi | 16:53 |
elodilles | thanks o/ | 16:53 |
gibi | o/ | 16:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Show candidate combinatorial explosion by dev number https://review.opendev.org/c/openstack/nova/+/855885 | 17:16 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Add unit test coverage of get_machine_ips https://review.opendev.org/c/openstack/nova/+/936287 | 17:26 |
opendevreview | Douglas Viroel proposed openstack/nova-specs master: Add spec for show scheduler hints in server details https://review.opendev.org/c/openstack/nova-specs/+/936140 | 19:01 |
opendevreview | Andrei Yachmenev proposed openstack/nova-specs master: Dynamic disk qos updates support https://review.opendev.org/c/openstack/nova-specs/+/936302 | 19:14 |
opendevreview | Merged openstack/nova master: api: Add response body schemas for remaining server action APIs https://review.opendev.org/c/openstack/nova/+/915743 | 20:24 |
*** haleyb is now known as haleyb|out | 21:12 |
*** iurygregory__ is now known as iurygregory | 23:37 |