Wednesday, 2023-02-01

opendevreviewMerged openstack/nova master: Detect host renames and abort startup  https://review.opendev.org/c/openstack/nova/+/86392007:12
dvo-plvHello, All. Could you please review our new changes in the next blueprint https://review.opendev.org/c/openstack/nova-specs/+/85929008:33
opendevreviewMaxim Monin proposed openstack/nova master: Server Rescue leads to Server ERROR state if basew image is deleted  https://review.opendev.org/c/openstack/nova/+/87238508:33
sean-k-mooneyhum ^ is expected if you dont pass an image to use instead08:41
sean-k-mooneydvo-plv: we are well after spec freeze specs wont reopen until ~march08:41
dvo-plvOkay, I see, we talked about our approach to add NT NIC support. So we made some poc and I have pushed new code to simplify understanding of our approach08:50
sean-k-mooneyok so Feature Freeze is in 2 weeks, RC 1 will be about 2 weeks after that and the master branch will reopen for the 2023.2 release08:51
sean-k-mooneyin the interim you can rebase your spec to the 2023.2 folder when that is created and it can be reviewed08:52
sean-k-mooneyhas the neutron work been accepted yet08:53
sean-k-mooneyand if so is it implemented/reviewed or pending08:53
opendevreviewribaudr proposed openstack/os-traits master: Add 'COMPUTE_SHARE_LOCAL_FS'  https://review.opendev.org/c/openstack/os-traits/+/87218509:11
sean-k-mooneybauzas: gibi  im happy with both sahid's series and dans at this point and think we can proceed with merging both if ye can rereview them again this morning that would be great09:31
bauzassean-k-mooney: yup I wanted to look at it yesterday, will do it this morning09:32
sahido/ ++ guys, I will be around if you want me to change or add something09:33
sean-k-mooneythere were some trivial nits but i was fine with a followup patch for those09:33
sahidsure I will do that09:34
sean-k-mooneyat this point i would prefer to land the changes so that the sdk/osc changes can merge etc. and so you can avoid any conflicts on api or compute service version09:34
sahid++09:35
opendevreviewRodolfo Alonso proposed openstack/os-vif master: Implement "BaseCommand" result property  https://review.opendev.org/c/openstack/os-vif/+/87239109:53
sean-k-mooneyralonsoh: ok so that is a preemptive measure to allow the ovsdbapp to be updated when its released.09:55
ralonsohsean-k-mooney, yes09:56
sean-k-mooneyyou're cutting it kind of close09:56
ralonsohsean-k-mooney, I know that, that's why I'm speeding it09:56
sean-k-mooneywell more that the non-client lib freeze is thursday week09:57
opendevreviewKashyap Chamarthy proposed openstack/nova stable/xena: libvirt: Add a workaround to skip compareCPU() on destination  https://review.opendev.org/c/openstack/nova/+/87197509:57
opendevreviewKashyap Chamarthy proposed openstack/nova stable/xena: Add a workaround to skip hypervisor version check on LM  https://review.opendev.org/c/openstack/nova/+/85120509:57
opendevreviewKashyap Chamarthy proposed openstack/nova stable/xena: libvirt: At start-up rework compareCPU() usage with a workaround  https://review.opendev.org/c/openstack/nova/+/87201109:57
ralonsohsean-k-mooney, is in two weeks09:57
sean-k-mooneyso both os-vif and ovsdbapp need to have this merged by then09:57
sean-k-mooneyralonsoh: no that is feature freeze09:57
sean-k-mooneynot the non-client lib freeze09:57
sean-k-mooneythe non-client lib freeze is the 9th09:58
sean-k-mooneyhttps://releases.openstack.org/antelope/schedule.html09:58
bauzassahid: sean-k-mooney: series sent to the gate with an ask for a FUP for 2 nits09:58
sean-k-mooneythanks if you still have energy to review i tested dans series this morning09:59
sean-k-mooneyit worked as expected although one error could be better10:00
sean-k-mooneyagain i think thats fixable in a follow up too10:00
sean-k-mooneyso i dont think we need to wait for that10:00
sean-k-mooneyralonsoh: ill try and loop back to the os-vif change once ci has run10:01
ralonsohsean-k-mooney, thanks a lot10:01
sean-k-mooneyi know its not actually used really right now so it cant break just want to make sure everything else is fine with it10:01
opendevreviewJorge San Emeterio proposed openstack/nova master: Dividing global privsep profile  https://review.opendev.org/c/openstack/nova/+/87172910:02
bauzassean-k-mooney: working hard on cutting the fake sysfs dir btw. for my own series10:05
bauzasdefinitely too large for our gate10:05
sean-k-mooneyyou only need a small subset of it currently10:06
sean-k-mooneyif you need me to generate new data i can10:07
bauzassean-k-mooney: yeah, I need to cut some numbers, unless you have another smaller sysfs, my proposal is just to drop a few cpus and related info10:32
sean-k-mooneyyou can drop entire trees in the fake file system10:36
sean-k-mooneylike the numa nodes and memory 10:36
opendevreviewMerged openstack/nova stable/zed: Improving logging at '_allocate_mdevs'.  https://review.opendev.org/c/openstack/nova/+/87141310:37
bauzasonce I'm done with downstream stuff, I'll cut 10:38
opendevreviewJorge San Emeterio proposed openstack/nova master: WIP: Moving privsep profiles to nova/__init__.py  https://review.opendev.org/c/openstack/nova/+/87201010:48
gibidansmith sean-k-mooney: I've approved the rest of the stable compute uuid series. 10:55
sean-k-mooneycool10:56
bauzasditto10:56
sean-k-mooneydid you have any issues or concerns10:56
bauzaseven the top patch which was WIP yesterday ?10:56
gibiI dont see any wip patches10:56
sean-k-mooneydan pushed stuff yesterday evening10:56
sean-k-mooneyafter you signed off10:56
gibihttps://review.opendev.org/q/topic:bp%252Fstable-compute-uuid10:56
sean-k-mooneyi woke up at 5 am today so i reviewed and tested all the new patches this morning10:57
gibiI think we are in good shape here10:57
sean-k-mooneybauzas: dansmith took the suggestion of adding a STUB_COMPUTE_ID class property10:58
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/872204/5/nova/test.py#17810:58
sean-k-mooneyand that allowed them to get the final tests working10:58
sean-k-mooneyby stubbing _ensure_existing_node_identity by default except in tests that are testing it10:59
sean-k-mooneybauzas: if your interested in the extra manual tests i did my notes are here https://etherpad.opendev.org/p/Stable-compute-uuid-manual-testing#L38211:00
gibisean-k-mooney: yeah I saw that, make sense11:01
gibiour compute start / restart logic in func test is a bit messy11:01
sean-k-mooneyya but its a useful mess most of the time :)11:02
sean-k-mooneygibi: did you intend to +w https://review.opendev.org/c/openstack/nova/+/872220 bauzas do you want to have a look or will i send it into the gate11:03
sean-k-mooneygibi: you set review priority +2 which you may or may not have intended :)11:04
gibisean-k-mooney: my bad, fixed it11:10
gibisean-k-mooney, sahid: Am I correct here https://review.opendev.org/c/openstack/nova/+/858384/41/doc/api_samples/os-evacuate/v2.95/server-evacuate-find-host-req.json ?11:10
gibiI think targetState is only part of the RPC API but not the REST API11:11
bauzassean-k-mooney: gibi: sorry my internal brain concurrency mechanism is currently locked with a downstream semaphore11:13
sean-k-mooneygibi: correct only RPC not RestAPI11:16
sean-k-mooney gibi  it used to be in the rest api but we removed it11:16
sean-k-mooneythat tells me our api sample tests are not validating extra fields11:17
gibisean-k-mooney: OK, then lets fix that sample in a FUP. other than that I have no issue with the evacuate series, but I only skimmed it as it was already approved11:17
bauzasgibi: ++ and thanks for the spot11:27
sahidthank you guys I'm building a patch to fix all the points that you noticed11:30
opendevreviewKashyap Chamarthy proposed openstack/nova stable/wallaby: Add a workaround to skip hypervisor version check on LM  https://review.opendev.org/c/openstack/nova/+/85120611:47
opendevreviewKashyap Chamarthy proposed openstack/nova stable/wallaby: libvirt: At start-up rework compareCPU() usage with a workaround  https://review.opendev.org/c/openstack/nova/+/87240211:47
opendevreviewMerged openstack/nova master: compute: enhance compute evacuate instance to support target state  https://review.opendev.org/c/openstack/nova/+/85838311:49
opendevreviewMerged openstack/nova master: api: extend evacuate instance to support target state  https://review.opendev.org/c/openstack/nova/+/85838411:49
sahidi'm not sure about what should be changed for openstacksdk and python-openstackclient?12:19
sahida release note would be enough?12:19
sean-k-mooneyyou need to bump the max microversion12:20
sean-k-mooneythats about it 12:20
sean-k-mooneyyou could add help text for evacuate12:21
sean-k-mooneyto explain the new behavior in osc12:21
sean-k-mooneythat would also be a good addition12:21
sahidyes i was thinking about that too12:21
opendevreviewMaxim Monin proposed openstack/nova master: Server Rescue leads to Server ERROR state if base image is deleted  https://review.opendev.org/c/openstack/nova/+/87238512:38
artomsahid, I think I'll need to add the 2.94 bump before yours though (for the FQDN hostname) 13:05
artomDon't think I need to do anything else, since we don't appear to validate the hostname anywhere in the client, so it can already be an FQDN13:05
artomsahid, actually, I suspect you can just bump directly to 2.95 and be done with it13:05
artomYeah, we don't do anything clientside13:07
*** dasm|off is now known as dasm13:44
sahidartom: thank you !13:51
opendevreviewSahid Orentino Ferdjaoui proposed openstack/nova master: fup: support evacuate target state  https://review.opendev.org/c/openstack/nova/+/87241313:56
sahidartom: i think i don't get where we should bump this version ?13:58
opendevreviewJean-Sébastien Bevilacqua proposed openstack/nova master: Add Lustre support to nova  https://review.opendev.org/c/openstack/nova/+/85378614:12
artomsahid, I don't know off the top of my head either, maybe I'll do both when I find it14:24
artomsahid, so 2.95 doesn't actually change anything in the API itself, there's just a new default instance state?14:49
artomafter evacuation?14:49
opendevreviewMaxim Monin proposed openstack/nova master: Server Rescue leads to Server ERROR state if base image is deleted  https://review.opendev.org/c/openstack/nova/+/87238515:03
opendevreviewArtom Lifshitz proposed openstack/python-novaclient master: Bump microversion to 2.95  https://review.opendev.org/c/openstack/python-novaclient/+/87241815:09
artomsahid ^^15:09
artomHrmm, so how do we make this work for openstackclient? AFAICT there is no max microversion declaration anywhere15:11
artomDoes it just magically work if users pass --os-compute-api-version=2.95?15:12
artomOh, and we have no multinode functional tests for osc15:15
artomSo we can't even test 2.9515:15
* artom does a func test for 2.94 only15:20
opendevreviewArtom Lifshitz proposed openstack/nova-specs master: Amend FQDN in hostname spec to reflect implementation  https://review.opendev.org/c/openstack/nova-specs/+/87242215:26
*** ksambor is now known as NICK-afk16:08
*** rpittau is now known as elfosardo16:34
*** elfosardo is now known as rpittau16:43
opendevreviewDan Smith proposed openstack/nova master: Protect against a deleted node id file  https://review.opendev.org/c/openstack/nova/+/87220416:47
opendevreviewDan Smith proposed openstack/nova master: Move comment about _destroy_evacuated_instances()  https://review.opendev.org/c/openstack/nova/+/87234816:47
opendevreviewArtom Lifshitz proposed openstack/python-novaclient master: Bump microversion to 2.95  https://review.opendev.org/c/openstack/python-novaclient/+/87241816:55
opendevreviewStephen Finucane proposed openstack/nova master: db: Remove legacy migrations  https://review.opendev.org/c/openstack/nova/+/87242817:08
opendevreviewStephen Finucane proposed openstack/nova master: db: Remove the legacy 'migration_version' table  https://review.opendev.org/c/openstack/nova/+/87242917:08
stephenfinsean-k-mooney: gibi: that should fix SQLA 2.0 compat ^17:08
stephenfinI think it's okay to drop them completely. We've supported automatic migration to alembic since Wallaby. Antelope will be 5 releases later which spans even the biggest fast-forward upgrade interval. Also, even with FFU we expect folks to run DB upgrades on each version so17:19
opendevreviewDan Smith proposed openstack/nova master: Check our nodes for hypervisor_hostname changes  https://review.opendev.org/c/openstack/nova/+/87222017:23
opendevreviewDan Smith proposed openstack/nova master: Protect against a deleted node id file  https://review.opendev.org/c/openstack/nova/+/87220417:23
opendevreviewDan Smith proposed openstack/nova master: Move comment about _destroy_evacuated_instances()  https://review.opendev.org/c/openstack/nova/+/87234817:23
opendevreviewDan Smith proposed openstack/nova master: Abort startup if nodename conflict is detected  https://review.opendev.org/c/openstack/nova/+/87243217:23
dansmithgdi17:23
sean-k-mooneystephenfin: ill take a look shortly18:11
opendevreviewSylvain Bauza proposed openstack/nova master: cpu: interfaces for managing state and governor  https://review.opendev.org/c/openstack/nova/+/86823618:37
opendevreviewSylvain Bauza proposed openstack/nova master: libvirt: let CPUs be power managed  https://review.opendev.org/c/openstack/nova/+/82122818:37
opendevreviewSylvain Bauza proposed openstack/nova master: WIP: enable cpus when an instance is spawning  https://review.opendev.org/c/openstack/nova/+/86823718:37
bauzassean-k-mooney: looks like there is a discrepancy between the guest pcpu and the numa topology blob : 18:37
bauzashttps://paste.opendev.org/show/bcAxuCSeroU2VHjfqpUx/18:37
sean-k-mooneythats the bug i reported last week18:40
sean-k-mooneyoh the power series18:40
sean-k-mooneybauzas: let me check if you are using the right data set18:41
bauzasnevermind, I found the root cause18:41
sean-k-mooneyok18:41
sean-k-mooneyby the way just so you are aware18:41
sean-k-mooneycore 0 cant generally be turned off18:42
sean-k-mooneyits special in the kernel18:42
sean-k-mooneymore generally the first core in a socket can be special in the same way18:42
bauzassean-k-mooney: found the discrepancy reason : https://paste.opendev.org/show/bAQTjXRRqaxXPO11HPq6/18:42
bauzastl;dr: cpu_pinning property is wrong18:43
bauzasI'll change my functest to use the pcpuset18:43
bauzass/use/verify 18:43
sean-k-mooneyor you're reading it wrong18:43
sean-k-mooneyit looks correct to me18:44
sean-k-mooneycpu_pinning_raw={0=0,1=1,2=2,3=3,4=6} that is a dict of logical guest core to host core18:44
sean-k-mooneyso topology.cpu_pinning returned {0, 1, 2, 3, 6}18:45
bauzasI'm confused18:45
sean-k-mooneywhich are the host cores the vm cores are pinned to18:45
bauzasare those numbers the vcpu ones ?18:45
sean-k-mooneythe key 0-418:45
sean-k-mooneyare logical guest cpu cores 0-418:45
sean-k-mooneythe values are the ids of the host cores those vcpus are pinned to18:46
opendevreviewSylvain Bauza proposed openstack/nova master: WIP: enable cpus when an instance is spawning  https://review.opendev.org/c/openstack/nova/+/86823718:46
bauzasthen, the guest.vcpu.cpu set is wrong18:46
bauzasfrom libvirt18:46
sean-k-mooneycan you show me the libvirt xml18:46
bauzasgimme a sec18:47
* bauzas actually wonders what's the best for getting them from the functest18:47
bauzascalling the domain I guess18:47
sean-k-mooneyyou should use the instance numa topology blob18:47
sean-k-mooneythat is the single source of truth18:48
bauzasLaptop froze, had to reboot 18:51
opendevreviewDan Smith proposed openstack/nova master: Stable compute uuid functional tests  https://review.opendev.org/c/openstack/nova/+/87244118:52
*** gibi is now known as gibi_pto19:04
opendevreviewDan Smith proposed openstack/nova master: Stable compute uuid functional tests  https://review.opendev.org/c/openstack/nova/+/87244119:12
opendevreviewSylvain Bauza proposed openstack/nova master: WIP: enable cpus when an instance is spawning  https://review.opendev.org/c/openstack/nova/+/86823719:35
bauzassean-k-mooney: updated ^19:35
sean-k-mooneybauzas: ack ill take a look tomorrow19:38
sean-k-mooneydansmith: ah i was going to ask but i see that there was an issue with ironic on your series19:38
dansmithsean-k-mooney: only because I added that missing node file check before the ironic exclusion19:39
dansmithbefore I added that, it worked fine on ironic19:39
dansmithbut it's running another job with the two reversed now, should be fine, but we should wait to be sure19:39
sean-k-mooneyack i can take a look again tomorrow19:40
sean-k-mooneyi also see you started adding func tests in a follow up to codify some of the manual tests19:40
sean-k-mooneyoh and you addressed the compute node create traceback19:41
sean-k-mooneycool let us know when its ready to re review19:41
sean-k-mooneyi think you have addressed everything i found in my manual testing at this point i can quickly run through the list again tomorrow19:43
dansmithsean-k-mooney: no, I can't really address the traceback (on startup) without a change to oslo.service AFAIK19:48
dansmithbut it will no longer be a trace if it happens during periodic19:48
sean-k-mooneyi was refering to https://review.opendev.org/c/openstack/nova/+/872432/1/nova/compute/manager.py19:49
sean-k-mooneysorry not that19:49
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/872432/1/nova/compute/resource_tracker.py19:49
dansmithsean-k-mooney: if you could re-+W this before you go, that can still merge https://review.opendev.org/c/openstack/nova/+/872220/319:49
dansmithand then we'll have less for tomorrow19:49
sean-k-mooneysure19:50
dansmithsean-k-mooney: yeah, but in order to get service startup to abort, we still have to raise and you'll get a trace in the logs19:50
dansmithsean-k-mooney: that one was just hit by a rebase accidentally19:50
sean-k-mooneyi was just skimming the later two patches by the way19:50
sean-k-mooneydansmith: ok and the agent will abort on start in that case?19:50
dansmithyes,19:51
sean-k-mooneyill test it tomorrow either way just wondering what to expect19:51
sean-k-mooneycool19:51
dansmiththe reason it wasn't before is we swallow and ignore Exception, except for specific ones, so now this makes InvalidConfiguration abort, if startup=True19:51
sean-k-mooneypresumably in a decorator19:51
dansmithso that patch is mostly just to make sure we catch the duplicate error specifically, turn it into InvalidConfiguration, and then allow InvalidConfiguration on startup to abort us19:52
simondodsleyQuestion from a customer using Queens (I know EOL and unsupported), is there a way to migrate a boot volume to a new backend wit ha shutdown instance?19:52
dansmithno19:52
dansmithjust in our own wrapper around update_available_resource()19:52
sean-k-mooneyah ok19:52
dansmithyou'll see when you look closer.. it was fairly obvious, there are just lots of layers19:52
sean-k-mooneyack19:53
sean-k-mooneysimondodsley: so by ha shutdown instance you mean instance ha is in use but the instance is stopped19:53
simondodsleyi believe that is what they are asking19:53
sean-k-mooneysimondodsley: if its a boot from volume guest i would still expect a cinder volume retype or volume migration could be used19:53
sean-k-mooneynova does not have any apis for this so you would have to ask the cinder folks i think19:54
simondodsleythey tried the cinder retype and it didn't work. Nova got confused and attached the new volume with the wrong vd device19:54
simondodsleyor is that definitely a cinder/os-brick thing?19:54
sean-k-mooneyvd device?19:55
sean-k-mooneyas in /dev/vda /dev/vdb in the guest19:55
sean-k-mooneyif so the device path in the guest is not actually used when you are using libvirt19:56
sean-k-mooneywe cant actually control that19:56
sean-k-mooneybut if you mean on the host that sounds like a bug but im not sure if its a nova one or os-brick/cinder one19:56
sean-k-mooneymost of the host block device management is done by os-brick19:57
sean-k-mooneydansmith: oh ya that was right in front of me https://review.opendev.org/c/openstack/nova/+/872432/1/nova/compute/manager.py#1049120:01
dansmithyep, that's it, and you can see earlier we raise on startup for reshape things20:02
sean-k-mooneyyep20:02
dansmithI wasn't expecting us to have a "log and swallow" exception handler there, so it took me a while to realize why it *was* running at startup, but not stopping us20:02
sean-k-mooneyya20:03
sean-k-mooneyok actually going now o/ the first patch is on its way20:03
dansmiththanks20:04
*** dasm is now known as dasm|off22:40
opendevreviewmelanie witt proposed openstack/nova master: Reproducer for bug 2003991 unshelving offloaded instance  https://review.opendev.org/c/openstack/nova/+/87247022:51
opendevreviewmelanie witt proposed openstack/nova master: Enforce quota usage from placement when unshelving  https://review.opendev.org/c/openstack/nova/+/87247122:51
