Thursday, 2024-07-11

opendevreviewkeerthivasan proposed openstack/nova master: Move nova-manage db purge to nova-audit  https://review.opendev.org/c/openstack/nova/+/70878300:11
*** bauzas_ is now known as bauzas00:13
*** bauzas_ is now known as bauzas01:20
*** bauzas_ is now known as bauzas02:29
*** bauzas_ is now known as bauzas02:54
*** bauzas_ is now known as bauzas03:33
*** bauzas_ is now known as bauzas04:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Stabilize iso format unit tests  https://review.opendev.org/c/openstack/nova/+/92392405:32
opendevreviewBalazs Gibizer proposed openstack/nova master: Stabilize iso format unit tests  https://review.opendev.org/c/openstack/nova/+/92392406:56
*** bauzas_ is now known as bauzas07:16
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.1: fix qemu-img version dependent tests  https://review.opendev.org/c/openstack/nova/+/92387807:27
*** bauzas_ is now known as bauzas07:53
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Reject cpu_model_extra_flags when cpu_mode='none'  https://review.opendev.org/c/openstack/nova/+/92393308:06
*** bauzas_ is now known as bauzas08:49
gibistephenfin: we are short on cores this week. Could you check a small unit test stabilization patch https://review.opendev.org/c/openstack/nova/+/923924 ?09:09
songwenping__gibi, sean-k-mooney: hi, does evacuate have a reschedule process?09:13
sean-k-mooneysongwenping__: no 09:13
sean-k-mooneyif evacuate fails you should just evacuate again09:13
sean-k-mooneywe do not reschedule any move operation as far as i'm aware09:14
songwenping__i find build, unshelve and migrate have reschedule, right?09:14
sean-k-mooneylive migrate definitely does not, i don't think cold migrate/resize does either09:14
sean-k-mooneybuild does but i'm not sure about unshelve09:15
sean-k-mooneywell that's not entirely true 09:15
sean-k-mooneywe got rid of reschedule entirely and replaced it with alternate_host 5 or 6 years ago09:15
sean-k-mooneymaybe more09:15
sean-k-mooneyi think if pre live migrate fails we may check the alternate hosts and the same might be true of other move ops09:16
sean-k-mooneybut we never retry and reschedule once we attempt to start the migration of data09:16
songwenping__does build still do a full reschedule?09:16
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.1: Stabilize iso format unit tests  https://review.opendev.org/c/openstack/nova/+/92393509:16
sean-k-mooneyno it's not a reschedule because we don't go back to the scheduler, it uses the alternative hosts the scheduler originally provided09:17
sean-k-mooneyby default the scheduler gives us 3 hosts09:17
sean-k-mooneythe conductor tries those in order09:17
opendevreviewBalazs Gibizer proposed openstack/nova stable/2023.2: Stabilize iso format unit tests  https://review.opendev.org/c/openstack/nova/+/92393609:18
sean-k-mooneyprovided the allocation candidate can still be claimed in placement09:18
songwenping__got it.09:18
sean-k-mooneyif we fail early enough in evacuate/resize we may do the same09:18
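The alternate-host behaviour described above can be illustrated with a small sketch. This is not nova code: `FakePlacement`, `build_with_alternates` and the `spawn` callback are hypothetical names invented for illustration, and only vCPUs are modelled. It shows the idea from the discussion: the conductor walks the host list the scheduler already returned, claims each candidate before using it, and never goes back to the scheduler.

```python
from dataclasses import dataclass


@dataclass
class FakePlacement:
    """Toy stand-in for placement (hypothetical): free vCPUs per host."""
    free_vcpus: dict

    def claim(self, host, vcpus):
        # The claim succeeds only if the host still has enough capacity.
        if self.free_vcpus.get(host, 0) >= vcpus:
            self.free_vcpus[host] -= vcpus
            return True
        return False

    def release(self, host, vcpus):
        # Give the capacity back when the build on this host fails.
        self.free_vcpus[host] += vcpus


def build_with_alternates(hosts, vcpus, placement, spawn):
    """Try the selected host, then each alternate, in the given order.

    `hosts` is the list the scheduler returned once; the scheduler is
    not consulted again. Returns the host used, or None if all fail.
    """
    for host in hosts:
        if not placement.claim(host, vcpus):
            continue  # candidate can no longer be claimed, skip it
        try:
            spawn(host)
            return host
        except RuntimeError:
            placement.release(host, vcpus)
    return None
```

With three hosts where the first is full and the second fails to spawn, the function falls through to the third host and returns it.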
opendevreviewBalazs Gibizer proposed openstack/nova stable/2023.1: Stabilize iso format unit tests  https://review.opendev.org/c/openstack/nova/+/92393709:20
sean-k-mooneysongwenping__: https://docs.openstack.org/nova/latest/contributor/resize-and-cold-migrate.html#resource-claims09:22
songwenping__as nova-scheduler workers have the same host_state info, if two evacuations come at the same time, the resources look sufficient in nova-scheduler, but the nova-compute claim fails because the resources the two vms require exceed the remaining resources on nova-compute.09:22
sean-k-mooneysongwenping__: that won't happen in general 09:22
sean-k-mooneyit will only happen if the vm is using a resource not tracked in placement or with numa topology09:23
sean-k-mooneyso before we call the compute, the conductor creates an atomic claim in placement09:23
sean-k-mooneyfor cpu/ram/disk and some other resources09:24
songwenping__yeah, but in our HA mechanism, it occurs.09:24
sean-k-mooneythe vm should never get to the compute part, are you running placement on an Active Active galera by any chance09:24
sean-k-mooneyactive/active meaning multi master/multi writer setup09:25
songwenping__maybe our nova-scheduler uses host_state cache info and does not get ar from placement.09:31
sean-k-mooneythats not how that works09:31
sean-k-mooneyunless this is pre rocky09:32
songwenping__yeah, we use rocky :)09:32
songwenping__the cache mechanism is deprecated after rocky?09:35
sean-k-mooneythere is a cache but we first filter in placement before the scheduler filters run and then claim atomically after to avoid races09:35
songwenping__sean-k-mooney: got it, thanks.09:38
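The race discussed above (two requests seeing the same stale host view) and why an atomic claim closes it can be sketched as follows. This is a toy model under stated assumptions, not how placement is implemented: `Inventory` and `schedule` are hypothetical names, and the in-process lock only stands in for whatever makes the real claim atomic.

```python
import threading


class Inventory:
    """Toy resource inventory (hypothetical) with an atomic claim."""

    def __init__(self, total):
        self.total = total
        self.used = 0
        self._lock = threading.Lock()

    def claim(self, amount):
        # Check and update happen under one lock, so two concurrent
        # claims can never both succeed past the capacity limit.
        with self._lock:
            if self.used + amount > self.total:
                return False
            self.used += amount
            return True


def schedule(cached_free, inventory, amount):
    """Filter on a possibly stale cached view, then claim atomically."""
    if cached_free < amount:
        return "filtered"
    return "claimed" if inventory.claim(amount) else "claim_failed"
```

If two requests for 6 units each both see a cached view of 8 free units, both pass the filter step, but only the first claim succeeds. The second fails at claim time, before anything reaches the compute node.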
fricklerelodilles: you offered help some days ago, additional reviews for https://review.opendev.org/q/topic:%22format-inspector%22+status:open would be nice I think10:47
elodillesfrickler: ACK, thanks for the ping. i did not say it upstream, but i'm mostly off today. so i might only get there to review the patches tomorrow ~afternoonish10:55
fricklerelodilles: ah, ok, thx for the update, let's hope things might be resolved until then11:09
elodillesfrickler: i've reviewed and +2+W'd the stable/2024.1 patches (except the last patch as that is not merged on master yet)12:14
*** bauzas_ is now known as bauzas12:41
sean-k-mooneythe last 2 patches that stabilize the unit tests, we believe, are just needed for old versions of the tools13:07
sean-k-mooneyfrickler: elodilles basically we were backporting those downstream to train/wallaby  and had ci failures 13:07
opendevreviewArnaud Morin proposed openstack/nova master: Avoid failing if compute_id file is empty  https://review.opendev.org/c/openstack/nova/+/92395313:09
fricklersean-k-mooney: yes, I'm not too worried about those, but we are running with the major iso format-inspector patch downstream now, so it would be comforting to see it merged13:19
sean-k-mooneyfrickler: oh good to know. have you found any issue with it i should be aware of?13:33
sean-k-mooneyif you do just ping and i'll take a look13:33
fricklersean-k-mooney: sure, but no issues found so far13:36
mnasersean-k-mooney: btw https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/M3BBTNE22VMMZSFMGFWMJJDYEPW7VNHL/ of interest based on our discussion the other day :)13:38
mnaserwhen you have a second i'd appreciate feedback13:38
sean-k-mooney you realise that is O(hosts*aggregates^2) right13:41
sean-k-mooneyim not saying that wont work13:42
sean-k-mooneybut we might need to be a little smarter if we were to do that in nova proper in terms of performance13:43
sean-k-mooneylike see that there is a server group in the request spec, grab all the relevant hosts for that server group and look up the aggregates once in a map and pass the relevant aggregates in the host state object to each filter 13:44
sean-k-mooneyi'm not sure  if that will work but looking up every aggregate related to every other host that has an instance in the same server group per host is expensive...13:45
mnasersean-k-mooney: I realize it is expensive, but i couldn't think of a better approach, but also I am looking up aggregate for server groups only, so it is limiting it to the scope of hosts that are part of that host aggregate13:57
mnaserso potentially a single lookup if there is one other node for example13:58
mnaserand I think I can't really do a cache inside of the filter for a spec because its invoked for each host13:59
sean-k-mooneythe filter runs per host however14:00
sean-k-mooneyso you multiply that by the number of hosts returned from placement * the number of vms in the build request14:01
opendevreviewMerged openstack/nova stable/2024.1: port format inspector tests from glance  https://review.opendev.org/c/openstack/nova/+/92372214:01
opendevreviewMerged openstack/nova stable/2024.1: Reproduce iso regression with deep format inspection  https://review.opendev.org/c/openstack/nova/+/92372314:02
sean-k-mooneymnaser: what i was thinking is if we were to do this we would have the hostmanager get all the hosts in the group and the aggregates for those groups once before running the filter on each host and store the info in either the hoststate object or request spec so we can share that info across each vm and across each host14:03
sean-k-mooneymnaser: we do something similar for groups already in terms of maintaining a mapping of instance uuids to hosts14:03
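The "look it up once, share it with every host check" idea above can be sketched as below. Everything here is an assumption made for illustration: the function names, the plain-dict inputs, and the anti-affinity semantics (reject a host that shares a failure domain with the group) are not taken from mnaser's filter or from nova's filter API.

```python
def group_failure_domains(group_hosts, host_aggregates, aggregate_domain):
    """Compute once per request: failure domains the group already uses.

    group_hosts: hosts that already run an instance of the server group.
    host_aggregates: dict mapping host name to its aggregate names.
    aggregate_domain: dict mapping aggregate name to a failure domain.
    """
    domains = set()
    for host in group_hosts:
        for agg in host_aggregates.get(host, ()):
            domain = aggregate_domain.get(agg)
            if domain is not None:
                domains.add(domain)
    return domains


def host_passes(host, used_domains, host_aggregates, aggregate_domain):
    """Per-host check that only consults the precomputed set."""
    for agg in host_aggregates.get(host, ()):
        # Aggregates without a failure domain map to None, which is
        # never in used_domains, so they do not reject the host.
        if aggregate_domain.get(agg) in used_domains:
            return False
    return True
```

The expensive walk over the group's hosts happens once, and each per-host check then costs only as much as that host's own aggregate list, instead of repeating the group lookup for every candidate host and every vm in the request.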
mnaseryeah we have host_state.instances which anti affinity can easily use14:05
sean-k-mooneyyep thats why its there14:05
sean-k-mooneyits only optionally populated based on a config option14:05
sean-k-mooneyas it potentially uses a lot of memory14:05
sean-k-mooneywe also have a list of aggregates in the host_state too14:06
mnaserI think I end up using that14:06
mnaserbut then I have to check what is the failure domain set to _this_ host aggregate, and lookup what it is for all the other VMs14:06
sean-k-mooneyyep 14:07
sean-k-mooneyso i'm not saying your filter won't work14:07
mnaseroh yeah no it does work its just.. eww.14:07
sean-k-mooneybut if we were to support this in nova properly we likely would need to do something else to optimise this14:07
mnasersean-k-mooney: I think maybe an extra kwarg for host_passes to include all of the HostState for all the nodes still eligible in the filter14:08
sean-k-mooneywe try not to do that14:08
mnaserso that reduces how much I need to lookup14:08
sean-k-mooneyit breaks all out of tree filters when we do14:09
sean-k-mooneynot that we technically support those14:09
sean-k-mooneybut adding fields to the host state object does not break the api14:09
mnaserhmm ok valid14:09
sean-k-mooneyso we would pass this either via the host_state object or request spec14:09
mnaserI mean if the failure domain concept becomes a little bit more concrete then there are other places we can leverage it to optimize the way it's looked up for sure14:10
mnaserand maybe just maybe also expose it as a sha'd string in nova's api... but thats stretching my luck :)14:10
mnaserthe goal here is to expose this to k8s cluster provisioners and so that the failure domain can be exposed to the user so they can make sure they schedule their pods on nodes with different failure domains14:11
*** bauzas_ is now known as bauzas14:56
*** bauzas_ is now known as bauzas16:04
sean-k-mooneymnaser: you asked for feedback :)16:41
opendevreviewMerged openstack/nova stable/2024.1: Add iso file format inspector  https://review.opendev.org/c/openstack/nova/+/92372416:58
opendevreviewMerged openstack/nova stable/2024.1: fix qemu-img version dependent tests  https://review.opendev.org/c/openstack/nova/+/92387816:59
*** bauzas_ is now known as bauzas17:14
opendevreviewkeerthivasan proposed openstack/nova master: Move nova-manage db purge to nova-audit  https://review.opendev.org/c/openstack/nova/+/70878317:28
opendevreviewkeerthivasan proposed openstack/nova master: Move nova-manage db purge to nova-audit  https://review.opendev.org/c/openstack/nova/+/70878317:44
opendevreviewMerged openstack/nova master: Stabilize iso format unit tests  https://review.opendev.org/c/openstack/nova/+/92392417:57
opendevreviewJens Harbott proposed openstack/nova master: DNM: Test devstack change  https://review.opendev.org/c/openstack/nova/+/92375919:24
*** bauzas_ is now known as bauzas19:35
*** bauzas_ is now known as bauzas21:11
*** bauzas_ is now known as bauzas21:31
