Monday, 2025-06-30

gmaansean-k-mooney: stephenfin: while adding tempest tests for migrations by project manager, I realized that we also need to allow the list migrations API (list all migrations, not just in-progress live migrations) for manager. I am amending it in the spec, ptal https://review.opendev.org/c/openstack/nova-specs/+/95372203:24
*** elodilles is now known as elodilles_pto06:14
opendevreviewThomas Goirand proposed openstack/nova master: Add a [libvirt]/force_virtio_cd option  https://review.opendev.org/c/openstack/nova/+/95373208:37
opendevreviewThomas Goirand proposed openstack/nova master: Add a [libvirt]/force_virtio_cd option  https://review.opendev.org/c/openstack/nova/+/95373208:40
zigoDoes the above patch make sense, and does the team understand my motivation as an operator?08:41
opendevreviewThomas Goirand proposed openstack/nova master: Add a [libvirt]/force_virtio_cd option  https://review.opendev.org/c/openstack/nova/+/95373208:41
opendevreviewBalazs Gibizer proposed openstack/nova stable/2025.1: Fix neutron client dict grabbing  https://review.opendev.org/c/openstack/nova/+/95373709:49
opendevreviewMerged openstack/nova stable/2024.1: Add repoducer test for bug 2074219  https://review.opendev.org/c/openstack/nova/+/95365610:07
sahido/10:59
sahidI'm still stuck finding information regarding live migrations that are failing10:59
sahidit's weird that we don't log anything10:59
sahidthe migration list --server xx shows that the status is error but doesn't explain what the error is11:00
sean-k-mooneyyou should check the source compute node for logs11:03
sean-k-mooneyor conductor in some cases11:03
sahidhmm, I did not try the conductor... on the source the only error I have is Live Migration failure: 'fa:....'11:06
sahidMigration operation has aborted11:06
sean-k-mooneyaborted is not the same as failing. that likely just hit the timeout11:06
sahidsean-k-mooney: I think the Live Migration failure is coming first from the libvirt driver11:09
sahidthen the exception is propagated 11:09
sean-k-mooneyit can if it failed on the migrate step11:10
sahidlet me see if I can find where the message Migration operation has aborted is coming from11:10
sahidthe event mentions compute_rollback_live_migration_at_destination and there is no log at the destination11:14
sahidwe are failing to log some important details11:14
opendevreviewMerged openstack/nova stable/2024.1: Fix detaching devices by alias with mdevs  https://review.opendev.org/c/openstack/nova/+/95366211:23
sean-k-mooneyjust an fyi i replied to mikal's email just now, it seems neutron broke debian 12 support with https://github.com/openstack/neutron/commit/8990dd598f84b9da976d72672c6fd6037fc0d125 so the nova hybrid plug job is broken until they fix the collation type11:24
opendevreviewKamil Sambor proposed openstack/nova master: Use futurist for _get_default_green_pool()  https://review.opendev.org/c/openstack/nova/+/94807211:55
opendevreviewKamil Sambor proposed openstack/nova master: Replace utils.spawn_n with spawn  https://review.opendev.org/c/openstack/nova/+/94807611:55
opendevreviewKamil Sambor proposed openstack/nova master: Add spawn_on  https://review.opendev.org/c/openstack/nova/+/94807911:55
opendevreviewKamil Sambor proposed openstack/nova master: Move ComputeManager to use spawn_on  https://review.opendev.org/c/openstack/nova/+/94818611:55
opendevreviewKamil Sambor proposed openstack/nova master: Replace eventlet.event.Event with threading.Event  https://review.opendev.org/c/openstack/nova/+/94975411:55
*** bauzas0 is now known as bauzas12:43
fungi[repeating myself from yesterday, sorry] it's come up a few times in debian over recent years that other cloud platforms don't rely on ahci driver access for cdrom devices, and so their minimal "cloud" linux kernel config doesn't include support for it, which means users have to manually override to virtio to make configdrive work with debian's official "genericcloud" images.12:47
fungiis there a reason not to switch the default in nova to virtio, as suggested in https://bugs.debian.org/1108403 ? (i see zigo just pushed change 953732 for this too)12:47
zigoI didn't push for this, because I feared it would be too aggressive a change, but I'd basically agree with fungi that it'd be nice to have virtio by default.12:48
opendevreviewThomas Goirand proposed openstack/nova master: Add a [libvirt]/force_virtio_cd option  https://review.opendev.org/c/openstack/nova/+/95373212:49
zigofungi: Thanks for spotting the typo btw! :)12:50
zigoMy patch looks like it is passing the CI, it just had a POST_FAILURE for nova-ovs-hybrid-plug.12:50
fungiit's also obvious to us, but perhaps not to the debian kernel maintainers, that nova changing its default now will only slowly trickle down to public cloud providers, as there are some still running their services on decade-old versions of openstack12:53
zigoI just believe that Waldi only cares about his customer in this case (ie: Microsoft and Azure) that's not impacted by the missing AHCI driver.12:54
zigoIMO, it's a conflict of interest case.12:54
zigoNevertheless, virtio-cd is much nicer than ide...12:55
sean-k-mooneyfungi: so we use sata by default when using q35 machine type or ide for pc 12:59
sean-k-mooneyfungi: we cant use virtio-scsi because of windows i believe12:59
sean-k-mooneyso we dont really have an alternative choice13:00
sean-k-mooneyzigo: we only use ide for the pc machine type, if you default to q35 we use sata as i noted above. virtio-cd either didnt exist or was not an option when we last changed the default 4+ years ago13:02
opendevreviewTakashi Kajinami proposed openstack/nova master: Use built-in declarative  https://review.opendev.org/c/openstack/nova/+/95375113:02
sean-k-mooneywe could consider if that is now viable based on our min libvirt version although we may need to restrict that to only linux machines unless windows ships with it by default13:02
zigosean-k-mooney: Oh, so activating the feature I'm proposing would break windows setup, right?13:03
sean-k-mooneymaybe we dont support virtio-cd at all13:03
sean-k-mooneyso we would first need to support that and determine if it can be the default or a conditional default based on OS_TYPE on the glance image13:03
sean-k-mooneyim late for a meeting but ill be free in an hour and can look into this more13:04
zigoWe do, it just needs a specific metadata.13:04
zigoAs far as my colleague is telling me, this would *not* break our windows image that has the virtio CD support.13:05
zigothe metadata is:13:05
zigohw_cdrom_bus=virtio13:05
sean-k-mooneywill it break a vanilla windows install iso13:05
sean-k-mooneyzigo: so you can set that13:05
zigo*we* wouldn't care.13:05
sean-k-mooneybut there used to be problems with using it13:05
sean-k-mooneywell nova would 13:06
zigosean-k-mooney: Yeah, *we* can, but some idiot customers have no idea about it, will take system image snapshot or backup, attempt to boot, and see it fail because of the CD being ide and the lack of AHCI support in the image.13:06
zigoI'd like to avoid this ...13:06
sean-k-mooneyso we really try to not make things dynamic based on the image in a lot of cases but its worth a discussion at least13:07
opendevreviewTakashi Kajinami proposed openstack/nova master: sqlalchemy: Use built-in declarative  https://review.opendev.org/c/openstack/nova/+/95375113:07
sean-k-mooneyzigo: i know there was a reason we could not use virtio as the default at the time13:07
sean-k-mooneybut i dont recall the details13:08
sean-k-mooneyit may have been our min libvirt version at the time13:08
sean-k-mooneyzigo: i assume you would prefer that we use virtio by default both for pc and q35 machine type?13:09
sean-k-mooneyzigo: the last time we changed a default was yoga https://blueprints.launchpad.net/nova/+spec/virtio-as-default-display-device13:12
sean-k-mooneyso we have precedent for doing this13:12
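[editor's note] The bus selection discussed above (sata for q35, ide for pc, with hw_cdrom_bus on the image overriding the default) can be sketched roughly as follows. This is a minimal illustration and not nova's actual code: the function name, the `operator_default` parameter, and the fallback to ide for unknown machine types are all assumptions made for the example, and `operator_default` is only a stand-in for an operator-wide option, not the behaviour of the proposed patch.

```python
from typing import Mapping, Optional

# Per-machine-type defaults as described in the conversation.
DEFAULT_CDROM_BUS = {"q35": "sata", "pc": "ide"}


def pick_cdrom_bus(machine_type: str,
                   image_props: Mapping[str, str],
                   operator_default: Optional[str] = None) -> str:
    """Return the bus for the CD-ROM (config drive) device.

    Hypothetical helper: precedence is image property, then an
    operator-wide default, then the machine-type default.
    """
    # An explicit hw_cdrom_bus image property wins over everything else.
    explicit = image_props.get("hw_cdrom_bus")
    if explicit:
        return explicit
    # Hypothetical operator-wide override (assumption, see lead-in).
    if operator_default:
        return operator_default
    # Fall back to the machine-type default; unknown types get ide here,
    # which is an assumption of this sketch.
    family = "q35" if "q35" in machine_type else "pc"
    return DEFAULT_CDROM_BUS.get(family, "ide")
```

With these assumptions, an image carrying hw_cdrom_bus=virtio gets virtio regardless of machine type, while an image without the property keeps the sata or ide default.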
opendevreviewTakashi Kajinami proposed openstack/placement master: sqlalchemy: Use built-in declarative  https://review.opendev.org/c/openstack/placement/+/95375913:13
stephenfinsean-k-mooney: gibi: ralonsoh: I assume you're aware that nova-ovs-hybrid-plug is permafailing on master since last Friday? https://zuul.opendev.org/t/openstack/builds?job_name=nova-ovs-hybrid-plug&project=openstack/nova13:54
ralonsohlet me check13:54
stephenfinI took a look at things that merged to neutron and neutron-lib between the last pass and first fail but didn't spot anything obvious13:55
stephenfinralonsoh: neutron-api is failing because there's a name field being set on a model that doesn't have the relevant column13:55
stephenfinhttps://zuul.opendev.org/t/openstack/build/79bd7dae1b954169a77159ca53b85422/log/controller/logs/screen-neutron-api.txt13:56
stephenfin(specifically the networksegments table as you'll se)13:56
stephenfin*see13:56
ralonsohlet me check first the last patches merged13:56
ralonsohstephenfin, this is because a stupid guy merged a patch last friday14:02
ralonsoha very stupid guy and a very stupid patch14:02
gibisean-k-mooney: dansmith: could you add back the +A to https://review.opendev.org/c/openstack/nova/+/948072/ it was lost in a rebase14:02
dansmithdone14:03
ralonsohstephenfin, I've pushed the fix: https://review.opendev.org/c/openstack/neutron/+/953768. Check that the Neutron db migration is failing: https://zuul.opendev.org/t/openstack/build/46c6d08486c3461dbc6f3fcb3c2b832d/log/job-output.txt#1349114:03
stephenfinralonsoh: :D14:03
gibidansmith: thanks14:03
stephenfinahh14:04
stephenfinI was looking in the wrong place14:04
stephenfinralonsoh: Looks like a bug in devstack too. init_neutron shouldn't be finishing with result 0 if the db_sync failed, right? https://zuul.opendev.org/t/openstack/build/46c6d08486c3461dbc6f3fcb3c2b832d/log/job-output.txt#1350914:05
ralonsohright, that should have stopped the stack process14:05
opendevreviewribaudr proposed openstack/nova-specs master: Enable memfd support for shared memory backing  https://review.opendev.org/c/openstack/nova-specs/+/95168914:09
Ugglasean-k-mooney, gibi, dansmith if you have time I'd be happy if you could have a look at ^14:11
zigosean-k-mooney: I'd prefer if it was virtio-cd *always*, yes.14:41
sean-k-mooneyso if we were to make that change it would only take effect for new vms14:41
zigoYeah, I get that point.14:42
sean-k-mooneywhen we made the other changes we added code to record the current model in the db14:42
sean-k-mooneywe did that so we could change the default in the future14:42
zigoBTW. what is the use of rabbit_qos_prefetch_count? I've set use_queue_manager and rabbit_stream_fanout, but now nova-compute complains I haven't set rabbit_qos_prefetch_count and it is to its default of zero. What is a good value for this then ?14:43
sean-k-mooneywe basically pretend the glance image property is set and add it in the instance system metadata table as if it came from the glance image when the vm is first created14:43
sean-k-mooneyzigo: no idea i dont think we use that in nova today so its probably directly coming from oslo.messaging14:44
sean-k-mooneyzigo: we directly include the oslo.messaging config options in our config because all the parsing of those etc. is done by oslo not nova14:44
zigoYeah, like everyone else does! :)14:45
sean-k-mooneywell except swift...14:45
zigoWhich is a real pain.14:45
zigoWe tried switching to json log to send all to Elasticsearch, it worked great ... except for swift ! :/14:46
sean-k-mooneyso i am not sure what that does but you might get a better answer from the oslo folks14:46
sean-k-mooneyzigo: i both love and hate that that exists14:46
sean-k-mooneyits really nice for usecases like that14:46
sean-k-mooneybut when our customers enable that and i have to read the raw logs in json format it sucks14:47
sean-k-mooneyi wish there was a tool to take the json format and print them normally in human readable format in oslo14:47
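[editor's note] A converter of the kind wished for here could be sketched in a few lines of standard-library Python. This is a rough illustration, not an existing oslo tool: the function name is made up, and the record keys it reads (asctime, levelname, name, message, traceback) are assumptions about what the JSON log formatter emits, so they may need adjusting for a real deployment.

```python
import json
from typing import Iterable, Iterator


def humanize(lines: Iterable[str]) -> Iterator[str]:
    """Turn JSON log lines into one-line human-readable records.

    Lines that are not JSON objects are passed through unchanged.
    """
    for raw in lines:
        raw = raw.rstrip("\n")
        if not raw.strip():
            continue
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            # Not JSON at all (for example a plain-text log line).
            yield raw
            continue
        if not isinstance(rec, dict):
            yield raw
            continue
        # Assumed key names; missing keys are rendered as "-".
        yield "{} {} {} {}".format(
            rec.get("asctime", "-"),
            rec.get("levelname", "-"),
            rec.get("name", "-"),
            rec.get("message", ""),
        )
        # A traceback may be a string or a list of strings.
        tb = rec.get("traceback")
        if isinstance(tb, list):
            tb = "".join(str(part) for part in tb)
        if isinstance(tb, str):
            for tb_line in tb.splitlines():
                yield "    " + tb_line
```

Fed from a file object or sys.stdin, this prints roughly what the plain-text formatter would have shown, with any traceback indented under its record.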
zigoIt was committed together with the fanout and stream stuff by Arnaud Morin <arnaud.morin@ovhcloud.com>14:47
zigohttps://review.opendev.org/c/openstack/oslo.messaging/+/89082514:47
sean-k-mooneyya so nova does not support any of that officially14:48
zigoThough absolutely no doc about it ... :/14:48
sean-k-mooneyim vaguely aware that stream support has been added but no work has been done to use nova with it14:48
sean-k-mooneyit might work but no one has done the work to test it14:48
zigohttps://www.rabbitmq.com/docs/consumer-prefetch14:48
sean-k-mooneythere has been no cross project work to support it that im aware of14:48
sean-k-mooneyok so it sounds like its a limit on how many unacked messages a client can have at a time14:49
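[editor's note] The prefetch behaviour described here can be illustrated with a toy model. This is only a simulation of the idea under simple assumptions, not oslo.messaging or RabbitMQ client code: the function and variable names are invented, and a prefetch count of 0 is treated as "no limit", which is how the RabbitMQ consumer prefetch documentation describes it.

```python
from collections import deque
from typing import Deque, List


def deliver(queue: Deque[str], unacked: List[str],
            prefetch_count: int) -> List[str]:
    """Deliver queued messages until the consumer holds
    prefetch_count unacknowledged messages (0 means unlimited)."""
    sent = []
    while queue and (prefetch_count == 0 or len(unacked) < prefetch_count):
        msg = queue.popleft()
        unacked.append(msg)  # outstanding until the consumer acks it
        sent.append(msg)
    return sent


# Usage: with a prefetch of 2 the broker stops after two deliveries
# and only sends more once the consumer acknowledges one.
queue = deque(["m1", "m2", "m3"])
unacked: List[str] = []
first = deliver(queue, unacked, prefetch_count=2)
unacked.remove("m1")  # the consumer acks m1
second = deliver(queue, unacked, prefetch_count=2)
```

A small limit keeps a slow consumer from hoarding messages, while 0 lets the broker push everything it has.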
sean-k-mooneyhttps://bugs.launchpad.net/oslo.messaging/+bug/2031497 really should have had an oslo spec IMO14:51
sean-k-mooneymaybe they had one for stream support in general14:52
zigoAgreed.14:52
sean-k-mooneyzigo: i hope youre not planning to use that in debian 13 by default14:52
sean-k-mooneythe stream support i mean 14:53
sean-k-mooneyits disabled by default in oslo https://docs.openstack.org/nova/latest/configuration/config.html#oslo_messaging_rabbit.rabbit_stream_fanout14:53
sean-k-mooneyand we do not have any testing with it at all for nova14:53
sean-k-mooneywe also do not test with rabbit_quorum_queue set to true either14:54
sean-k-mooneyit might work but again no work has been done in nova to officially support it14:54
sean-k-mooneywe could close that gap by enabling testing in nova-next14:55
zigoWell, it's time to move to it: rabbitmq does *not* support HA queue starting at version 4.0.x14:55
zigoSo no choice for operators.14:55
sean-k-mooneybut we would also need to update the docs before officially supporting it14:55
sean-k-mooneyim aware although flamingo is the first release we are using 4.0.0 i think14:55
sean-k-mooneyalthough that may have been epoxy14:56
sean-k-mooneyit depends on what is in ubuntu 24.0414:56
zigoAhh... Please everyone: stop that bullshit again. Upstream does not get to choose what's in the distro. :(14:56
sean-k-mooneywe get to choose what we support14:56
sean-k-mooneyand if we dont have it available in our ci to test with we cant claim support14:56
zigoDebian Trixie has RabbitMQ 4.0.5, and it has Epoxy.14:56
sean-k-mooneyack14:57
zigoJust like I have to make Epoxy run on Python 3.13: no choice... :/14:57
sean-k-mooneyi think its going to be important to get trixie in our ci as soon as practical14:57
sean-k-mooneywe officially only support debian 12 currently14:57
zigoI need to fix this rabbitmq config first, so I can continue my tests.14:57
sean-k-mooneytrixie would be added next cycle14:57
zigoTrixie is out in a few weeks... :)14:58
sean-k-mooneyack, i think most of those features are available in the older releases of rabbit too14:58
sean-k-mooneyso we can try and use them in our existing jobs provided they work14:58
zigoIt's always the same story anyways, so I'm used to it. :)14:58
gmaansean-k-mooney: you might have noticed but pinging just in case you missed that manager role series is ready for review https://review.opendev.org/q/topic:%22bp/policy-manager-role-default%22+status:open16:36
sean-k-mooneygmaan: its probably in my gerrit folder but this week and last i have not been on top of burning that down to 0 each day16:42
sean-k-mooneyso thanks for the ping ill take a look at the spec today16:42
gmaansean-k-mooney: thanks16:42
opendevreviewTakashi Kajinami proposed openstack/nova master: sqlalchemy: Use built-in declarative  https://review.opendev.org/c/openstack/nova/+/95375116:54
sean-k-mooneytkajinam: so ^ is not necessarily a bad thing but it needs to have a tracker, at the minimum this should be a tech debt but IMO.16:56
sean-k-mooneythe other thing to consider is while we support sqlalchemy 2.016:57
sean-k-mooneywe still support 1.4.13+16:57
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/requirements.txt#L616:57
*** iurygregory__ is now known as iurygregory16:58
sean-k-mooneyso to do that we would need to bump our min version to 2.016:58
sean-k-mooneywe could do that but that will need a release note16:58
sean-k-mooneyto capture the upgrade impact16:58
sean-k-mooneyoh actually no16:58
sean-k-mooneyhttps://github.com/sqlalchemy/sqlalchemy/commit/450f5c0d6519a439f40 was in 1.4.016:59
opendevreviewBalazs Gibizer proposed openstack/nova master: Run unit test with threading mode  https://review.opendev.org/c/openstack/nova/+/95347516:59
opendevreviewBalazs Gibizer proposed openstack/nova master: [test]RPC executor selectively using threading  https://review.opendev.org/c/openstack/nova/+/95381516:59
sean-k-mooneyso it does not force a min version bump above our current min16:59
opendevreviewBalazs Gibizer proposed openstack/nova master: [test]RPC using threading or eventlet selectively  https://review.opendev.org/c/openstack/nova/+/95381517:07
opendevreviewBalazs Gibizer proposed openstack/nova master: Move ConductorManager to use spawn_on  https://review.opendev.org/c/openstack/nova/+/94818717:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Make nova.utils.pass_context private  https://review.opendev.org/c/openstack/nova/+/94818817:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Rename DEFAULT_GREEN_POOL to DEFAULT_EXECUTOR  https://review.opendev.org/c/openstack/nova/+/94808617:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Make the default executor configurable  https://review.opendev.org/c/openstack/nova/+/94808717:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Print ThreadPool statistics  https://review.opendev.org/c/openstack/nova/+/94834017:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Document threading mode and tuneables  https://review.opendev.org/c/openstack/nova/+/94936417:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Allow services to start with threading  https://review.opendev.org/c/openstack/nova/+/94831117:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Run nova-next with n-sch in threading mode  https://review.opendev.org/c/openstack/nova/+/94845017:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Do not yield in threading mode  https://review.opendev.org/c/openstack/nova/+/95099417:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Run nova-api and -metadata in threaded mode  https://review.opendev.org/c/openstack/nova/+/95195717:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Allow to start unit test without eventlet  https://review.opendev.org/c/openstack/nova/+/95343617:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Run unit test with threading mode  https://review.opendev.org/c/openstack/nova/+/95347517:24
opendevreviewBalazs Gibizer proposed openstack/nova master: [test]RPC using threading or eventlet selectively  https://review.opendev.org/c/openstack/nova/+/95381517:24
opendevreviewBalazs Gibizer proposed openstack/nova master: Warn on long task wait time for executor  https://review.opendev.org/c/openstack/nova/+/95266617:27
opendevreviewBalazs Gibizer proposed openstack/nova master: FUP: Translate scatter-gather to futurist  https://review.opendev.org/c/openstack/nova/+/95333817:28
opendevreviewBalazs Gibizer proposed openstack/nova master: FUP: Use futurist for _get_default_green_pool()  https://review.opendev.org/c/openstack/nova/+/95333917:28
sean-k-mooneyas an fyi the nova gate should become unblocked once https://review.opendev.org/c/openstack/neutron/+/953768 lands17:56
*** iurygregory_ is now known as iurygregory18:07
sean-k-mooneygmaan: im holding +w to make sure you agree that my interpretation of host info aligns with yours18:10
gmaansean-k-mooney: checking18:11
sean-k-mooneyprobably should have pinged with the link to the comment https://review.opendev.org/c/openstack/nova-specs/+/953722/comment/a91a7f7a_c8837144/18:14
gmaansean-k-mooney: replied, yes, host related fields will be 'None' for non-admin18:16
sean-k-mooneyok if we are treating including the query param as an error for non admins18:17
sean-k-mooneythen thats ok 18:17
sean-k-mooneyand ack on the rest18:17
sean-k-mooneyNone is fine i guess or empty string18:17
gmaanyeah18:18
sean-k-mooneywhichever is allowed by the schema18:18
gmaanfor request, you mean to error ?18:18
gmaanyeah, there will be no change in request, whatever it is allowed now will be same18:18
sean-k-mooneywell i dont want to allow them to probe by starting a resize and then guessing the source node etc18:18
sean-k-mooneyso ok i think im happy we are on the same page18:18
sean-k-mooneyim ok to move to the implementation review so ill add +w18:19
gmaansean-k-mooney: this is what the implementation will look like with more granular policies https://review.opendev.org/c/openstack/nova/+/953063/7/nova/api/openstack/compute/migrations.py18:20
gmaanand the next patch in that series changes the policy, to show what all will be open/allowed for manager https://review.opendev.org/c/openstack/nova/+/94134718:21
sean-k-mooney ya that looks reasonable18:23
sean-k-mooneyi left a comment on the asymmetry with dest_host and the duplicative info that it contains with dest_compute18:24
sean-k-mooneybut that is just an observation not a request to change anything18:24
gmaanack18:24
sean-k-mooneywe are a little inconsistent with how we filter on the query side and how we represent this in the response18:25
sean-k-mooneylike host in the query can match the source or dest host but source_compute only matches the source, and you cant query by only dest compute or node18:26
sean-k-mooneyi dont feel like its worth having a microversion to fix that right now18:26
sean-k-mooneyafter we finish the openapi series and we have collected a bunch of similar oddities it might make sense to do another cleanup microversion to normalise things18:27
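[editor's note] The query asymmetry described above (host matches either side of a migration, source_compute matches only the source, and there is no destination-only filter) can be sketched as below. This is an illustrative model, not nova's DB API code: the Migration dataclass and filter_migrations function are hypothetical names that exist only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Migration:
    """Hypothetical, trimmed-down migration record."""
    id: int
    source_compute: str
    dest_compute: str


def filter_migrations(migrations: Iterable[Migration],
                      host: Optional[str] = None,
                      source_compute: Optional[str] = None) -> List[Migration]:
    """Apply the two filters discussed in the conversation."""
    result = []
    for m in migrations:
        # "host" matches when it is either the source or the destination.
        if host is not None and host not in (m.source_compute, m.dest_compute):
            continue
        # "source_compute" matches the source side only.
        if source_compute is not None and m.source_compute != source_compute:
            continue
        result.append(m)
    return result
```

In this model there is deliberately no dest_compute parameter, which mirrors the gap noted in the discussion.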
sean-k-mooneygmaan: you have 2 series this cycle right, service role and manager role. do you have a preferred order for those in terms of review18:28
gmaanhost in query is either one, just checked if we do the same in code or not and yes it matches either one https://github.com/openstack/nova/blob/43d57ae63d1ecda24d8707b4750d404daadc980f/nova/db/main/api.py#L342518:29
gmaansean-k-mooney: yeah, I am targeting to finish manager role first18:29
sean-k-mooneyack18:30
sean-k-mooneyhopefully we can do both but i think doing one then the other is more likely to work out18:30
sean-k-mooneyill try and do a pass over the manager role then this week18:30
gmaanyeah, once I am done with manager role I am going to resume my service role also18:30
sean-k-mooneyping me if i havent by friday18:30
gmaancool, ack18:30
opendevreviewMerged openstack/nova-specs master: Allow list migrations policy to manager role  https://review.opendev.org/c/openstack/nova-specs/+/95372218:30
sean-k-mooneyUggla: i left some comments on your memfd spec https://review.opendev.org/c/openstack/nova-specs/+/951689 i dont see any other review on that so being realistic that probably wont land this cycle19:40
sean-k-mooneyit could but its definitely at risk19:40
sean-k-mooneywe might want to punt it preemptively so we can discuss the upgrade impact with less time pressure19:40
sean-k-mooneythere are some details i would like to see captured in the spec that are missing and im not sure we should rush it but im +1 on the overall proposal19:41
