Friday, 2022-12-16

*** haleyb_ is now known as haleyb01:58
*** haleyb is now known as haleyb_away01:59
*** akekane is now known as abhishekk05:36
*** blarnath is now known as d34dh0r5306:52
*** ralonsoh__ is now known as ralonsoh07:35
congntHi everyone, I have a new compute node with an Intel Ice Lake CPU (Intel(R) Xeon(R) Gold 5320 CPU @ 2.20GHz). But libvirt recognizes the model as Broadwell-noTSX-IBRS. Does anyone know this bug? I'm using OpenStack Victoria deployed by kolla-ansible, libvirt version 6.0.0-0ubuntu8.808:08
obrecongnt: Hi! I guess that a "cat /proc/cpuinfo | grep mpx" returns no results?08:15
congntYes08:15
congntNo result with this command08:16
obreThe issue is basically that libvirt expects that flag to call a CPU "IceLake".08:17
obreBut Intel does not ship mpx support for quite a few of their newer CPU lines.08:17
obreBut you can configure libvirt with your own custom cpu-models.08:17
obreWe do for example use this file on our IceLake nodes: https://github.com/ntnusky/profile/blob/master/files/libvirt/cpu/x86_Icelake-Server-NTNU.xml08:18
obreAdd that file to /usr/share/libvirt/cpu_map/ and update /usr/share/libvirt/cpu_map/index.xml to include a link to it. Then after a restart of libvirt you should have an Ice Lake CPU available.08:19
congntobre: thank you so much, i will research it.08:32
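For reference, registering the custom model obre describes means dropping the XML file into /usr/share/libvirt/cpu_map/ and adding an include entry for it to the x86 arch section of index.xml, roughly like this (a sketch; the surrounding stock entries are elided, and the filename matches the NTNU example above):

    <cpus>
      <arch name='x86'>
        ...
        <include filename='x86_Icelake-Server-NTNU.xml'/>
      </arch>
    </cpus>

A restart of libvirtd then makes the new model selectable, e.g. via nova's cpu_model option discussed below.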
*** bhagyashris is now known as bhagyashris|sick09:26
*** kopecmartin is now known as kopecmartin|sick10:45
sean-k-mooneycongnt: ya so this is a common issue with older releases of libvirt.10:52
sean-k-mooneycongnt: there are 3 ways to work around it if you can't use a newer libvirt: 1 is to create a custom model, possibly by copying the one from a newer release; the next is to use the model that matches closest by setting cpu_mode=custom and cpu_model=<whatever>; the final way to work around this is to set cpu_mode=host-passthrough10:54
sean-k-mooneyfor 2 i forgot to say you can add the missing cpu flags with cpu_model_extra_flags10:55
sean-k-mooneythose config options are all in the libvirt section of the nova.conf10:55
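Spelled out, workarounds 2 and 3 would look roughly like this in the [libvirt] section of nova.conf on the compute node (a sketch; the model name and extra flag are illustrative placeholders, not recommendations):

    [libvirt]
    # option 2: pick the closest named model and add back missing flags
    cpu_mode = custom
    cpu_model = Cascadelake-Server-noTSX
    cpu_model_extra_flags = avx512vnni

    # option 3 (instead of the above): expose the host CPU directly, at the
    # cost of live-migration compatibility between non-identical hosts
    # cpu_mode = host-passthrough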
songwenpingsean-k-mooney: hi, i remember the live migration will check the source node's resource capacity, right?11:09
songwenpingsean-k-mooney: hi, i remember the live migration will also check the source node's memory capacity, right?11:20
sean-k-mooneyi would have to check if live migration is using 2 separate allocations or just one. i know for some move operations we use the migration context to hold the allocation for one of the nodes and the vm for the other11:27
sean-k-mooneywe have not ported all move operations to use that workflow. cold migration does if i recall correctly, evacuate does not. i think live migration will use 2 allocations but again i would need to look at the code to confirm11:28
sean-k-mooneygibi: bauzas  do ye recall off the top of ye're heads what we do for live migration ^11:32
gibilive migration uses the migration uuid to hold the source node alloc11:56
gibionly evacuate is an exception11:56
sean-k-mooneyack that is what i thought11:59
sean-k-mooneywe should fix evacuation sooner rather than later... i'm going to go add that to our downstream backlog11:59
sean-k-mooneymaybe i can push for that in B or C; it would be nice to get that finally fixed12:00
sean-k-mooneyalthough the placement allocation explosion issue might be more important12:00
sean-k-mooneywe also still need to start using the consumer types feature right12:01
sean-k-mooneyto actually start marking the migration allocations as migrations12:01
gibiyes and yes12:02
sean-k-mooneyby the way i will shortly be resuming the pci series review. stephenfin: are you on pto from today?12:03
sean-k-mooneystephenfin: i'm hoping to finish reviewing the rest of the pci series today, but if not it will be my goal to get it completed in the first week of january when i'm back12:04
songwenpingsean-k-mooney,gibi: it seems unreasonable if live migration uses 2 separate allocations; the vm cannot be migrated if the source node doesn't have enough resources.12:42
sean-k-mooneysongwenping: it can; we won't stop the migration if the source is overcommitted12:50
sean-k-mooneythat will fail for evacuate but live migration should work12:51
sean-k-mooneythat said you should never get into that situation unless you changed something in a way that was unsupported or hit a bug12:51
sean-k-mooneyfor example reduced the memory in the source node, either intentionally or due to a DIMM failure, or changed the allocation ratio12:52
songwenpingwe use the Rocky code, and placement is integrated with nova.12:52
sean-k-mooneymultiple allocations were introduced in Queens12:53
sean-k-mooneyso it should be there in rocky12:53
songwenpingbut we encountered the problem: the vm cannot migrate because the source node's memory is overcommitted12:54
sean-k-mooneywe may have a bug in rocky or master then, but you should not be in an overcommit scenario12:55
sean-k-mooneyit's fine to oversubscribe provided the total amount and allocation ratio align12:55
sean-k-mooneythe temporary fix would be to increase the allocation ratio to enable the migration12:55
sean-k-mooneyand then restore it when done12:56
sean-k-mooneyalternatively you could try cold migration.12:56
songwenpingyes, we'll try to increase the allocation ratio now, and analyse why the memory is overcommitted12:58
gibiyepp it is a known behavior that you cannot migrate from an overallocated node. remove the overallocation by temporarily increasing the allocation ratio, move the instance, restore the allocation ratio. And separately investigate how you ended up in an overallocated scenario13:12
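As a concrete (hypothetical) sketch of that workaround using the osc-placement plugin, where <rp_uuid> stands for the source compute's resource provider and 131072 for its current MEMORY_MB total:

    openstack resource provider inventory class set <rp_uuid> MEMORY_MB \
        --total 131072 --allocation_ratio 2.0

then migrate the instance and restore the original ratio. Note that nova-compute may overwrite placement inventory it manages on its next periodic update, so raising ram_allocation_ratio in nova.conf on the source node (and restarting nova-compute) is the more durable route on older releases.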
sean-k-mooneygibi: i thought that only affected migrations that use a single allocation13:18
gibisean-k-mooney: I think it is the other way around. Evacuation is the only move that does not use a migration allocation on the source. It only extends the instance allocation to cover both the source and the dest node. So I think evac is not affected by this13:20
gibiall the other move operations move the instance allocation from the instance_uuid to the migration_uuid on the source node13:20
gibithat move allocation is what triggers the situation13:21
gibibecause placement does not have a move semantic13:21
sean-k-mooneyhum maybe13:22
gibihttps://github.com/openstack/nova/blob/36091a7ed7ad553d5cbb5dcfde0090e1e762bc34/nova/scheduler/client/report.py#L2023-L204313:23
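For context, the linked move_allocations helper performs the "move" as a single POST /allocations request that empties one consumer and fills the other, roughly of this shape (a sketch with placeholder UUIDs; consumer generations and project/user ids are elided):

    POST /allocations
    {
        "<instance_uuid>": {"allocations": {}},
        "<migration_uuid>": {
            "allocations": {
                "<source_rp_uuid>": {"resources": {"MEMORY_MB": 4096, "VCPU": 2}}
            }
        }
    }

Because placement re-validates capacity on every allocation write, the request is rejected on an overallocated source even though net usage is unchanged, which is exactly the failure mode discussed above.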
sean-k-mooneyevacuate is broken for the overcommitted case too however13:23
sean-k-mooneythe extension fails if the source is overallocated13:23
gibiOK then I was mistaken on that part. then all allocation manipulation is rejected by placement if there is at least one overallocated RP in the change13:23
sean-k-mooneyprobably yes13:24
gibifor some reason I assumed that if the allocation on the overallocated RP does not change then it is OK and evac only adds new alloc on new RPs.13:24
sean-k-mooneyi filed a downstream tracker for this and it's a known issue upstream so hopefully this will eventually get resolved13:24
sean-k-mooneyalthough it might require placement changes13:24
gibithe resolution probably needs a new placement microversion either to change the semantic of the existing POST /allocations or to add a new API that understands move semantic13:25
gibibut I agree we should do something, as it is a common issue for deployers13:26
*** dasm|off is now known as dasm13:36
opendevreviewJorge San Emeterio proposed openstack/nova-specs master: Review usage of oslo-privsep library on Nova  https://review.opendev.org/c/openstack/nova-specs/+/86543213:40
opendevreviewRuby Loo proposed openstack/nova stable/yoga: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791213:49
opendevreviewRuby Loo proposed openstack/nova stable/xena: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791314:03
opendevreviewRuby Loo proposed openstack/nova stable/wallaby: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791414:07
gibibauzas, sean-k-mooney: fyi I filed two bugs in the last two days about gate instabilities as I'm hitting them https://bugs.launchpad.net/glance/+bug/1999800 https://bugs.launchpad.net/tempest/+bug/1999893 14:08
bauzasshitty shit14:09
bauzasgibi: thanks14:09
gibithese are infrequent ones but I saw both more than once so I reported them14:09
*** akekane is now known as abhishekk14:11
*** umbSubli1 is now known as umbSublime14:16
gibiI'm a magnet for bugs these days14:52
gibithe latest, this is from my local env14:52
gibiDec 16 15:51:21 bedrock kernel: traps: flake8[1268565] general protection fault ip:55f5a5ed6e83 sp:7ffdf4a39d50 error:0 in python3.10[55f5a5dba000+2a3000]14:52
gibiI cannot even run tox -e pep8 as flake8 fails all the time14:52
ykarelHi, can someone look into https://bugs.launchpad.net/nova/+bug/194960614:54
ykarellibvirt-8.0.0 now provides an option to set tb-cache14:55
ykarelwithout it, it's difficult to run multiple guest vms together in CI on jammy hosts14:56
gibiykarel: can we default the tb-cache size globally via some libvirt configuration?15:07
gibikashyap: ^^15:07
ykarelgibi, no idea, but if that's possible then it would be helpful, as it can be set outside of nova too15:08
gibiykarel: yep, it would be convenient; otherwise we need to create a nova feature just for our CI usage15:08
ykarelyes15:09
gibii.e. a new nova compute host level config variable in the [libvirt] section set to some small value applied blindly to all emulated domains by the nova-compute service15:09
ykarelit used to be 32MiB before it was raised to 1GiB15:10
ykarelso that should be good for CI at least15:10
kashyapgibi: Hmmm, good question15:13
kashyapgibi: It rings a faint bell as I looked at it in the past, but I forget15:14
* kashyap looks15:14
kashyapI'm in a hurry as I need to take a train shortly, but I'll take a quick look15:14
gibikashyap: no worries, it is not super urgent :)15:14
kashyapgibi: Good news: yes!  libvirt does allow it15:15
kashyapLOL, I even tested it in upstream libvirt myself and totally forgot:15:15
kashyapgibi: ykarel: https://listman.redhat.com/archives/libvir-list/2021-November/224873.html15:15
ykarelkashyap, yeap i tested that and it works; now we are looking at whether we can set it globally via some libvirt conf15:16
gibikashyap: with my limited understanding it only shows that it is allowed via the domain xml; can we also set it via some hypervisor level global config?15:16
ykarelso we don't have to change nova code just to support CI usecase15:16
kashyapgibi: ykarel: I don't think a global config is possible - as near as I know15:19
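For reference, the per-domain setting that landed in libvirt 8.0.0 looks like this in the guest XML (a sketch; the 128 MiB value is arbitrary, against the QEMU default that was raised from 32MiB to 1GiB as ykarel notes, and it only applies to TCG/emulated guests):

    <domain type='qemu'>
      ...
      <features>
        <tcg>
          <tb-cache unit='MiB'>128</tb-cache>
        </tcg>
      </features>
      ...
    </domain>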
gibikashyap: thanks15:19
kashyapykarel: But just shoot an email to libvirt-users@redhat.com list and ask there.15:19
kashyapPeople are friendly :)15:19
ykarelkashyap, Ok Thanks15:20
ykarelwill send a mail15:20
kashyapykarel: A quick tip: ask them to explicitly Cc you on responses, as you're not subscribed to that list (I guess)15:23
opendevreviewBalazs Gibizer proposed openstack/nova master: Split ignored_tags in stats.py  https://review.opendev.org/c/openstack/nova/+/86797815:24
gibisean-k-mooney: I did the split as we discussed ^^15:24
ykarelThanks kashyap, yes right /me not subscribed15:24
ykarelkashyap, gibi sent https://listman.redhat.com/archives/libvirt-users/2022-December/013844.html16:17
rloohi sean-k-mooney, these should be ready to approve (zuul is happy anyway!): https://review.opendev.org/c/openstack/nova/+/867912, https://review.opendev.org/c/openstack/nova/+/867913 & https://review.opendev.org/c/openstack/nova/+/867914 (thanks!)16:26
sean-k-mooneyrloo: ack17:18
opendevreviewEdward Hope-Morley proposed openstack/nova stable/yoga: ignore deleted server groups in validation  https://review.opendev.org/c/openstack/nova/+/86798917:29
sean-k-mooneyrloo: the older backports are not quite right17:31
rloosean-k-mooney: gahhhh. did you comment? I'll take a look.17:32
sean-k-mooneythe content is fine but it looks like you cherry picked from master in all cases instead of from the previous cherry pick17:32
sean-k-mooneyyep it's pretty minor17:32
rlooyes, i cherry picked from master. do you want me to do it from the previous cherry pick? 17:32
sean-k-mooneyjust the commit message is wrong17:32
sean-k-mooneyrloo: yep you should cherry pick from the previous cherry pick17:33
sean-k-mooneyi think ironic does this slightly differently due to how ye do bugfix branches17:33
rloogeez. i thought if i used the UI to do the cherry pick, it'd do the right thing. the reason i didn't do it from the previous one was cuz things looked messier, heh. 17:33
sean-k-mooneyfor nova the backports go from newest to oldest branch and you cherry pick from the previous branch17:33
rlooi haven't been doing upstream stuff, so i don't even recall how ironic does it... i did try to find doc about it but gave up. 17:34
sean-k-mooneyrloo: i actually care about the cherry-pick lines less than others do, but i know melwitt and elodilles do like them to be done a specific way17:34
sean-k-mooneyfor me i just do a git reset --hard origin/stable/<whatever> then git review -X <previous version>17:35
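Spelled out, that flow for creating e.g. the wallaby backport from the already-proposed xena one would be roughly (a sketch; the local branch name is arbitrary and 867913 is the xena change from above):

    git fetch origin
    git checkout -b backport-wallaby origin/stable/wallaby
    # cherry-pick the previous backport from Gerrit; -X records the
    # "(cherry picked from commit ...)" line in the commit message
    git review -X 867913
    git review stable/wallaby

Picking from the previous backport rather than master is what keeps the chain of cherry-pick lines in the commit message correct from the newest to the oldest branch.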
rloono worries. should i create new PRs, the 'right' way? 17:35
sean-k-mooneywell they don't have to be new reviews; you just need to fix the commit message with the cherry-pick lines17:36
rloowell, if i manually do that -- there won't be a conflict in the wallaby one (if i recall) cuz the change was similar to the xena one. but i didn't tell you that, i'll fix the commit messages...17:38
sean-k-mooneyright so i do not normally remove the conflict bit in that case, although i know others do17:39
sean-k-mooneyi do if others ask17:39
rloo(and if someone had time to fix that UI so it doesn't allow cherry picking from master to n-2+ stable branches, heh)17:39
sean-k-mooneyone thing i have not tested is whether the behavior changes if the patch is merged17:39
rloosean-k-mooney: ahh, yes, you're right. if i had cherry picked from xena (which mentions the conflict), the wallaby one would have the same commit msg so.17:40
sean-k-mooneyi think it does17:40
sean-k-mooneybasically if it's merged and you cherry pick it i think it adds the line properly17:41
sean-k-mooneyi think it only doesn't if you do it to an open review. this has changed in different gerrit versions17:41
opendevreviewRuby Loo proposed openstack/nova stable/yoga: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791217:57
opendevreviewRuby Loo proposed openstack/nova stable/xena: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791317:58
opendevreviewRuby Loo proposed openstack/nova stable/wallaby: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791418:00
opendevreviewRuby Loo proposed openstack/nova stable/xena: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791318:06
opendevreviewRuby Loo proposed openstack/nova stable/wallaby: Ironic nodes with instance reserved in placement  https://review.opendev.org/c/openstack/nova/+/86791418:07
sean-k-mooneythose all look good. bauzas: if you are around the next few days can you review them and babysit them through the gate?18:10
rloosean-k-mooney: thx for reviewing them. now i feel like i should do more upstream stuff before i forget. ha ha. (I might backport https://review.opendev.org/c/openstack/nova/+/842478 just for fun, we don't have a need for that. yet.)18:12
sean-k-mooneyso my understanding is that it should not be needed with the fix you have backported18:12
sean-k-mooneyrloo: well it would be good to have if you disable the fix you backported18:13
sean-k-mooneyso i guess if you don't have cleaning and don't want the extra time18:13
sean-k-mooneythen having both might make sense18:13
sean-k-mooneyso looking at it quickly it should be backportable too, so if you want to, go for it18:14
rloowe have cleaning and we don't put nodes in maint often. but i could see that being useful for others, and who knows, we might want it. The trick is getting my downstream stuff done so I have time to do some upstream stuff ;)18:15
sean-k-mooneyi know that feeling; right now my upstream time is 99% reviews18:16
sean-k-mooneywell and irc18:16
rloowow, i appreciate that and I'm sure others do too, sean-k-mooney! Just don't burn out on that.18:16
sean-k-mooneywell it's how i can best support the rest of the team18:17
rloo++++18:17
sean-k-mooneyi could write a bunch of code but i know it won't get reviewed quickly, so while my upstream time is limited i'm putting it into reviews to enable others to land their fixes18:18
* sean-k-mooney does not want to be a manager but does like to help others fix things18:18
sean-k-mooneyrloo: by the way if the ironic folks ever want to turn their json rpc impl into an oslo.messaging driver so we can deploy nova without rabbit... i would not be upset18:19
* sean-k-mooney hopes the nats driver actually becomes a thing18:20
sean-k-mooneyzigo: are you still pursuing ^18:22
sean-k-mooneyi occasionally look at https://review.opendev.org/q/topic:asyncio-nats but you know, time18:22
rloosean-k-mooney: that is a great idea, might be good if you mentioned it in the ironic channel. problem is so few people, so many things we'd like to do. but worth it if we can get rid of rabbit....19:14
opendevreviewRuby Loo proposed openstack/nova stable/zed: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/86792419:32
opendevreviewRuby Loo proposed openstack/nova stable/yoga: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/86801021:27
opendevreviewRuby Loo proposed openstack/nova stable/yoga: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/86801021:30
opendevreviewRuby Loo proposed openstack/nova stable/xena: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/86801121:30
opendevreviewRuby Loo proposed openstack/nova stable/xena: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/86801121:33
opendevreviewRuby Loo proposed openstack/nova stable/wallaby: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/86801221:33
opendevreviewRuby Loo proposed openstack/nova stable/wallaby: Ironic: retry when node not available  https://review.opendev.org/c/openstack/nova/+/86801221:36
*** dasm is now known as dasm|off21:54
*** osmanlicilegi is now known as Guest022:33
*** ChanServ changes topic to "This channel is for Nova development. For support of Nova deployments, please use #openstack"22:40
