Friday, 2023-09-01

opendevreviewOpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/nova/+/89164804:44
opendevreviewMerged openstack/nova master: Bump MIN_{LIBVIRT,QEMU} for "Bobcat"  https://review.opendev.org/c/openstack/nova/+/88725504:56
opendevreviewmelanie witt proposed openstack/nova master: Follow up for unified limits documentation  https://review.opendev.org/c/openstack/nova/+/89346206:01
opendevreviewmelanie witt proposed openstack/nova master: Follow up for unified limits: PCPU and documentation  https://review.opendev.org/c/openstack/nova/+/89346206:03
melwittjohnthetubaguy, bauzas, sean-k-mooney: followup for unified limits to handle PCPU and add a bit more detail to docs ^ thanks06:04
opendevreviewmelanie witt proposed openstack/nova master: Follow up for unified limits: PCPU and documentation  https://review.opendev.org/c/openstack/nova/+/89346206:20
opendevreviewAmit Uniyal proposed openstack/nova stable/2023.1: Refactor CinderFixture  https://review.opendev.org/c/openstack/nova/+/89346406:33
opendevreviewAmit Uniyal proposed openstack/nova stable/2023.1: Add function to get all attachments in Cinder.API module  https://review.opendev.org/c/openstack/nova/+/89346506:33
opendevreviewAmit Uniyal proposed openstack/nova stable/2023.1: Reproducer for dangling bdms  https://review.opendev.org/c/openstack/nova/+/89346606:33
opendevreviewAmit Uniyal proposed openstack/nova stable/2023.1: Delete dangling bdms  https://review.opendev.org/c/openstack/nova/+/89346706:35
opendevreviewAmit Uniyal proposed openstack/nova stable/zed: Adds regression functional test for 1980720  https://review.opendev.org/c/openstack/nova/+/89346806:47
opendevreviewAmit Uniyal proposed openstack/nova stable/zed: Refactor CinderFixture  https://review.opendev.org/c/openstack/nova/+/89346906:47
opendevreviewAmit Uniyal proposed openstack/nova stable/zed: Add function to get all attachments in Cinder.API module  https://review.opendev.org/c/openstack/nova/+/89347006:47
opendevreviewAmit Uniyal proposed openstack/nova stable/zed: Reproducer for dangling bdms  https://review.opendev.org/c/openstack/nova/+/89347106:47
opendevreviewAmit Uniyal proposed openstack/nova stable/zed: Delete dangling bdms  https://review.opendev.org/c/openstack/nova/+/89347206:47
*** whoami-rajat_ is now known as whoami-rajat07:10
opendevreviewAmit Uniyal proposed openstack/nova stable/yoga: Adds regression functional test for 1980720  https://review.opendev.org/c/openstack/nova/+/89347407:27
opendevreviewAmit Uniyal proposed openstack/nova stable/yoga: Refactor CinderFixture  https://review.opendev.org/c/openstack/nova/+/89347507:27
opendevreviewAmit Uniyal proposed openstack/nova stable/yoga: Add function to get all attachments in Cinder.API module  https://review.opendev.org/c/openstack/nova/+/89347607:27
opendevreviewAmit Uniyal proposed openstack/nova stable/yoga: Reproducer for dangling bdms  https://review.opendev.org/c/openstack/nova/+/89347707:27
opendevreviewAmit Uniyal proposed openstack/nova stable/yoga: Delete dangling bdms  https://review.opendev.org/c/openstack/nova/+/89347807:27
bauzasmorning folks07:40
* bauzas yawns07:40
bauzasmelwitt: on it 07:52
bauzasthanks for the quick follow-ups07:53
bauzasauniyal: your series was tagged as a blueprint07:53
bauzasauniyal: accordingly, you can't backport it in stable branches as it would change a behavior07:54
bauzassee the backport policy rules that are given here https://docs.openstack.org/project-team-guide/stable-branches.html#appropriate-fixes07:57
ykarelHi is there some known issue with test_reboot_server_hard, recently saw many failures08:07
ykarelFails like AssertionError: time.struct_time(tm_year=2023, tm_mon=9, tm_mday=1, tm_hour=7, tm_min=26, tm_sec=24, tm_wday=4, tm_yday=244, tm_isdst=0) not greater than time.struct_time(tm_year=2023, tm_mon=9, tm_mday=1, tm_hour=7, tm_min=26, tm_sec=24, tm_wday=4, tm_yday=244, tm_isdst=0)08:07
ykarelk didn't find a bug report, so reported https://bugs.launchpad.net/tempest/+bug/203375208:14
ykarelit's started just few hours back08:14
bauzasykarel: oh08:41
bauzasprobably related to some recent merge we did08:42
bauzasbecause now we call Cinder when rebooting so the performance could be impactedcx08:42
fricklerelodilles: https://review.opendev.org/c/openstack/nova/+/871968 is still blocked by your W-1, can you please check?08:53
fricklerbauzas: calling cinder for a reboot? I'm curious about that, do you happen to have a reference?08:54
fricklerah, https://review.opendev.org/c/openstack/nova/+/882284 ?08:55
bauzasfrickler: see the bug report08:55
bauzasyea08:55
elodillesfrickler: looking, thanks08:56
bauzassean-k-mooney: dansmith: auniyal: looks like https://review.opendev.org/c/openstack/nova/+/882284 created a gate issue08:57
fricklera reno referencing "this change" is weird wording btw08:57
bauzasfrickler: we can modify this 08:57
auniyalbauzas, ack 08:57
bauzasin general we say "Now, Nova does X or Y"08:58
bauzasor "With this new feature named "blah", now we have"08:58
bauzasauniyal: I'm working on a patch that will conditionnally check the volumes if only BDMs exist09:02
auniyalack bauzas, any thing I can help with09:02
bauzaswe may need core reviewers09:03
fricklerelodilles: thx, now just someone needs to retrigger the W+1 I think09:04
elodillesfrickler: i did it just before you wrote :)09:09
bauzasauniyal: actually, we can't avoid a Cinder API call for getting the volumes09:10
bauzasvolume attachments*09:10
auniyalyeah, I was about to say 09:11
auniyalsaw the comment on bug report09:11
auniyalcinder calls are separate from bdms what nova have09:11
auniyalcan we increase reboot time check  ?09:12
bauzasoh wait09:14
bauzasthis isn't calculating the time it boots09:15
bauzasthis is just getting the uptime09:15
SvenKieskesystemd-analyze time gives you boot times :)09:19
SvenKieskewithout any context09:20
bauzasthat's weird then09:23
bauzashttps://github.com/openstack/tempest/blob/28ba6ee268bdf5d18a13a0dc7c7fea2c3a3dfcce/tempest/common/utils/linux/remote_client.py#L84-L88 is just giving me when the server was booted09:24
bauzasand https://github.com/openstack/tempest/blob/master/tempest/api/compute/servers/test_server_actions.py#L127-L128 is just comparing the timestamps to check that the VM was actually rebooted09:24
bauzasthis isn't at all a performance check09:25
SvenKieskeI'm not understanding your last sentence: do you mean this isn't supposed to be a performance check, or are you indicating that this should be a performance check but fails to do so?09:30
SvenKieskedepending on what stage the virtual machine is in, the time comparison could be flaky, during early boot especially, if time is not correctly synced.09:31
SvenKieskeI'd also strongly advise not to directly get stuff from proc-fs. why not use a native python function instead at least?09:33
bauzasSvenKieske: Tempest gets the uptime, subs it from the current time so it gets the local time when the server was booted, then reboots the server and does the same math with uptime09:33
bauzasand then there is an assertion check that verifies that the reboot timestamp is greater than the original timestamp (meaning the OS was actually rebooted)09:34
bauzashttps://github.com/openstack/tempest/blob/28ba6ee268bdf5d18a13a0dc7c7fea2c3a3dfcce/tempest/api/compute/servers/test_server_actions.py#L127-L128 and https://github.com/openstack/tempest/blob/28ba6ee268bdf5d18a13a0dc7c7fea2c3a3dfcce/tempest/api/compute/servers/test_server_actions.py#L10909:34
SvenKieskeI hope that test never runs during dailight saving time adjustments *just kidding*09:34
bauzasUTC exists for a good reason :)09:35
bauzasbasically, this failing check means that the server *wasn't* rebooted09:35
SvenKieskeI'm strongly reminded about this now, don't know why: https://gist.github.com/timvisee/fcda9bbdff88d45cc9061606b4b923ca09:35
SvenKieskethis is really a bad check, to be blunt09:36
SvenKieskejust use systemd for this stuff already guys, it's there, it works, I know it has it's own bugs, but it's better then the stuff that was before it.09:36
auniyalin test reboot_time should be more then (first) boot_time  --- but its failing because it same.09:37
auniyalhttps://paste.opendev.org/show/bk8aK3ZbjpeDYs6swcoQ/09:37
SvenKieskee.g. you can easily check the boot counter with systemd09:37
bauzasauniyal: which means that the server *wasn't* rebooted09:37
auniyalyeah09:37
bauzasI'm just checking the logs09:37
bauzaswe may have had an exception in the reboot09:38
bauzasoh wait09:39
SvenKieskewell that could also be true :) but just to check the wall clock seems not really reliable in the long way, I bet you get false positives someday (when clocks will be out of sync in VMs, and they will be, if the ntp setup is not bulletproof)09:39
bauzasif we ssh the guest too soon, the reboot may not be done yet09:39
bauzasSvenKieske: I don't think it's an instrumentation issue09:39
SvenKieskeso you need to detect the reboot, before ssh'ing in order to detect the reboot ;)09:39
auniyalbut still will it saw same time ? time must have updated right ?09:40
bauzasauniyal: not if the guest wasn't rebooted *yet*09:40
bauzasyay, another waiter !09:40
auniyalyeah, so it took boot_time, reboot (and waited until get ACTIVE), then took new_boot_time, must be greater, 09:42
auniyalor we should set our own clock counter in test ??09:43
auniyaland verify if we are able to run any cmd at all in system after reboot ?09:44
SvenKieskewell you're already doing that if you read the system time, you "just" need to verify that this reading was successfull and not NULL :)09:45
auniyalyeah, that should be enough to say server rebooted 09:46
bauzasokay, I think I spotted09:54
bauzaswe reverted the reboot due to us unable to reach cinder09:55
auniyalbauzas  https://zuul.opendev.org/t/openstack/build/0bf10393fa25419c8f3a072ab091f0da/log/controller/logs/screen-n-cpu.txt#12033 this right ?10:00
bauzasyes10:01
bauzasthe keystone service catalog isn't containing the volumev3 endpoint url, that's what's saying the exception10:02
bauzasbut heh, https://1e11be38b60141dbb290-777f110ca49a5cd01022e1e8aeff1ed5.ssl.cf1.rackcdn.com/893401/5/check/neutron-ovn-tempest-ovs-release/f379752/controller/logs/local_conf.txt10:04
bauzas"disable_service cinder"10:04
auniyalokay, its reverted but still shouldn't /proc/uptime get updated ?10:05
auniyalstill great test though it caught it or navigated us to the reason10:07
bauzasno, I mean, we have a problem in Nova : we can't longer hard reboot if you don't deploy Cinder10:09
bauzaswe should handle this case10:09
auniyalyes10:10
bauzasnot sure sean-k-mooney is around but I'd appreciate his thoughts on the best way to alleviate https://github.com/openstack/nova/blob/88c73e2931dcb276982612dbbdc3b849d4977c16/nova/compute/manager.py#L417710:10
auniyalso if cinder.attachment_get_all failed, with could not create client or endpoint not found skip the cinder check10:12
auniyalI'll create a patch for that, 10:12
bauzasyeah that sounds the obvious fix, but I wonder whether there is a better way to flag that this environment isn't having cinder"10:12
bauzasI'm just looking at how we manage such case elsewhere in the code10:13
bauzasyeah found10:15
bauzasauniyal: https://github.com/openstack/nova/blob/88c73e2931dcb276982612dbbdc3b849d4977c16/nova/compute/manager.py#L3173-L317810:16
bauzasthis is gross but meh10:16
auniyalack bauzas10:16
sean-k-mooney bauzas  what the issue10:18
bauzasauniyal: would you be able to propose a patch soon ?10:18
auniyalbauzas yes10:19
bauzassean-k-mooney: tl;dr: some jobs that don't pull cinder fail10:19
sean-k-mooneyah10:19
bauzassean-k-mooney: because now we call cinder in reboot10:19
bauzasso we need to wrap the right exceptions10:19
bauzashttps://bugs.launchpad.net/nova/+bug/203375210:19
sean-k-mooneywell we just need a decorator that check if keystone is installed and skipt the function 10:19
bauzassean-k-mooney: keystone is installed, cinder isn't10:20
sean-k-mooneytreat this like how we test for api extentions in neutron10:20
sean-k-mooneybauzas: yes10:20
sean-k-mooneyoh i said keystoen 10:20
sean-k-mooneyi ment check in keystone if cinder is installed10:20
bauzasyeah but I got your point10:20
bauzasyeah, that's why I wanted to discuss this with you and dansmith10:21
bauzaswe have 100 ways to make it working10:21
bauzasideally I would think that our internal cinder client package would wrap this, but this sounds a huge work10:21
sean-k-mooneyso when discusssing this in the past we noted that if you cant connect ot cinder for any reason we shoudl jsut boot with the bdm we have10:21
bauzasyeah and with shutdown we just have a long list of wrapped exceptions in order to let shutdown succeed anyway10:22
bauzashttps://github.com/openstack/nova/blob/master/nova/compute/manager.py#L3164-L318810:22
bauzastbc, I agree with it, reboot *has to* be trustable anyway10:23
bauzasso, auniyal is just stacking all the exception handling for now10:23
sean-k-mooneyso this is what we do for neutron extnetions10:23
sean-k-mooneyhttps://github.com/openstack/nova/blob/88c73e2931dcb276982612dbbdc3b849d4977c16/nova/network/neutron.py#L1386-L140610:23
sean-k-mooneydo we get an expction when we try to crate the client or when we try to make the first call?10:23
opendevreviewMerged openstack/nova stable/zed: libvirt: At start-up rework compareCPU() usage with a workaround  https://review.opendev.org/c/openstack/nova/+/87196810:23
bauzassean-k-mooney: when we make the call, that's the problem10:24
sean-k-mooneyno thats fine10:24
sean-k-mooneyso with the neutron decorators we are caching the results10:25
sean-k-mooneyso w eonly do the check once10:25
bauzassean-k-mooney: yeah sounds better,10:25
sean-k-mooneyso what i woul drobaly do is somethign like add a decorator that check the cinder max microversion10:26
bauzasbut for the sake of fixing this bug quickly, auniyal is just doing like shutdown and stacking all the exceptions10:26
sean-k-mooneyand we can handle the logic with that. i.e. say this cufntion need whatever10:26
sean-k-mooneyi would prefer to revert then hack in a fix if we dont do it properly10:26
bauzassean-k-mooney: well, let's see auniyal's patch10:27
bauzasI don't disagree but I'd then also like to change https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L3164-L318810:28
bauzasand possibly others10:28
sean-k-mooneybauzas: we shoudl condider disabling cinder in one fo our jobs by the way10:34
bauzassean-k-mooney: good point10:34
sean-k-mooneywe do not need it in nova-ovs-hybrid-plug10:34
bauzassean-k-mooney: I was just looking at the jobs list :)10:35
bauzasfancy me creating a patch ?10:35
sean-k-mooneyits ok i can do it10:35
sean-k-mooneyim looking at the code to see if i can also formulate a patch for the cinder issue10:35
bauzasokay, do it then, you had the copyrights on the idea :)10:35
sean-k-mooneydo you have a link to a failing job?10:36
bauzasyeah10:36
bauzashttps://1e11be38b60141dbb290-777f110ca49a5cd01022e1e8aeff1ed5.ssl.cf1.rackcdn.com/893401/5/check/neutron-ovn-tempest-ovs-release/f379752/controller/logs/local_conf.txt10:36
sean-k-mooney ok so that is form the constuction of the client not the first call10:38
bauzassean-k-mooney: https://zuul.openstack.org/job/neutron-ovn-base removes cinder10:38
sean-k-mooneyits failing here https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L24710:39
bauzasyeah due to https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L86310:39
bauzaswe're not caching the client10:40
bauzaswe just instantiate a client at every call10:40
sean-k-mooneyya that is somehtign we can fix when we replace the client with the sdk10:41
bauzasyup10:41
bauzasanyway, my daughters are starving, I shall start to cook something or they'll burn the house10:41
bauzasfortunately, the school is starting again next week10:42
opendevreviewAmit Uniyal proposed openstack/nova master: Addis a try-catch for cinder unavailability.  https://review.opendev.org/c/openstack/nova/+/89348910:50
opendevreviewsean mooney proposed openstack/nova master: only attempt to clean up dangeling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350211:48
sean-k-mooneybauzas: auniyal ^ is how i woudl fix it11:48
opendevreviewsean mooney proposed openstack/nova master: only attempt to clean up dangeling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350211:50
auniyalyeah, its looks cleaner than https://review.opendev.org/c/openstack/nova/+/89348911:50
sean-k-mooneyjsut fixing some typos11:51
sean-k-mooneyis there a bug for this11:51
sean-k-mooneyah https://bugs.launchpad.net/tempest/+bug/203375211:51
auniyalyes11:52
opendevreviewsean mooney proposed openstack/nova master: only attempt to clean up dangeling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350211:53
bauzassorry folks, was with kids12:07
auniyalsean-k-mooney, I added 2 comments in patch, its same what I was trying to do locally12:08
auniyal*your patch - https://review.opendev.org/c/openstack/nova/+/893502/312:08
stephenfinUggla: https://review.opendev.org/c/openstack/openstacksdk/+/89350512:15
stephenfinI'm off for the rest of the afternoon, so hopefully that does the trick for you :)12:15
Ugglaoh thank you stephenfin, I'll have a look12:16
Ugglastephenfin, have a great day12:16
stephenfinAlso published here https://gist.github.com/stephenfin/5834e4bd3213a2b9a9313ee3a7f5694612:21
stephenfinUggla: Cheers12:21
Ugglastephenfin, damned I'have again forgotten to register the options. :( It needs more testing but that looks good.12:24
Ugglastephenfin, thank you so much12:24
opendevreviewsean mooney proposed openstack/nova master: only attempt to clean up dangling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350212:51
sean-k-mooneybauzas: ^ i think i have adressed your nits stephenfin  if your around can you take a look quickly too12:53
* sean-k-mooney goes to make a coffee12:53
bauzassean-k-mooney: cool do12:53
dvo-plvsean-k-mooney, Hello13:48
sean-k-mooneyhi13:48
dansmithjohnthetubaguy: comment here if you could look quickly:  https://review.opendev.org/c/openstack/nova/+/88698914:00
dansmithalso bauzas, since you were +214:00
sean-k-mooneydansmith: can you take a look at this https://review.opendev.org/c/openstack/nova/+/89350214:00
sean-k-mooneymerging the dangeling bdms serise broke reboot for deploymetns without cinder14:01
bauzasdansmith: shouldn't need to update node, right?14:01
bauzasdansmith: it's not changing14:01
bauzasbut, compute_id, mmmm, yeah you're right :(14:01
dvo-plvI would like to talk regarding our proposal14:02
dansmithbauzas: if compute_id is changing then node is changing, however, we're changing host to point to a different node, right?14:02
dvo-plvCurrently I see two option. 1 we move back to the separate mech driver. 2 we generate socket in the OpenStack and pass it to the dpdk14:02
bauzasdansmith: well, the compute node uuid is the ironic node uuid14:03
bauzasso, it shouldn't be changing, right?14:03
dvo-plvI belive that 2 option is more suitable for you14:03
dansmithokay I see, we're moving the node between services...14:03
dvo-plvSo, how do you see it in terms of the OpenStack, out dpdk team currently analyze this approach with our pmd driver14:04
bauzasdansmith: correct, that being said, now I wonder about compute_id14:04
dansmithbauzas: compute_id should match the node, so only change one if the other14:04
bauzasah, I was about looking on your series14:05
dvo-plvI belive that using common option vhost-server-path is not acceptable in case dpdk port type, so we should create the new one, and it also will be appropriate parameter14:05
bauzas(b/c while I reviewed it, honestly, I didn't thought about Ironic)14:05
sean-k-mooneydvo-plv: ya so option 2 woudl be my prefernce. but add a new dpdkarg or extra_config to hold the path14:06
sean-k-mooneydvo-plv: we have kind of a chicken and egg problem14:07
dansmithsean-k-mooney: really? we have no jobs without cinder?14:07
sean-k-mooneydansmith: apprently14:07
sean-k-mooneydansmith: its enabled by default in the devstack base job14:07
dansmithsean-k-mooney: that doesn't make sense to me.. if we have no cinder, we should have no bdms with type==volume14:08
sean-k-mooneydvo-plv: os-vif does not have a way to return data ot nova, nova need to knwo the socket path to put it in the libvirt xml so qemu cna create it, and neutron does not knwo what vendor nic is being used14:09
bauzasdansmith: ok, indeed compute_id == compute_node.id (found it in the code) so are you +W then ?14:09
bauzasdansmith: see https://bugs.launchpad.net/nova/+bug/203375214:09
dansmithbauzas: it was already +W, I just -Wd to make sure we weren't missing something14:09
sean-k-mooneydansmith: hum... that is a very good point14:10
sean-k-mooneyoh14:10
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/893502/4/nova/compute/manager.py#421714:10
sean-k-mooneydansmith: that line is alwasy called 14:10
dansmithah14:10
bauzasdansmith: the problem is with https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L420914:10
bauzaseverytime we call it14:11
dansmithso let's put a check to not do anything else if no bdms_to_delete right?14:11
dansmithif not bdms_to_delete: return14:11
bauzasno, because we could have none BDMs 14:11
bauzaswhile there could be dangling volume attachments14:11
sean-k-mooneyi mean i can adda that as well 14:11
bauzasbut, I looked at shutdown14:12
sean-k-mooneyare you suggesting removing the check for if cinder exists14:12
dansmithbauzas: that's to delete attachments cinder thinks we have?14:12
sean-k-mooneyor just also adding a second check on line 421614:12
dvo-plvsean-k-mooney, If we extend this field with socket, it will be acceptable, right ? https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovsdb/ovsdb_lib.py#L18414:12
dansmithbauzas: isn't that wrong anyway?14:12
dansmithbauzas: what if we have a multi-attach volume, with one instance and one standalone attachment for an externally-managed database server or something?14:12
sean-k-mooneydvo-plv: i think that would be the correct approch yes14:13
sean-k-mooneyand if other vendors then want to integrate they just need to supprot that too14:13
dvo-plvokay, great, we will rework this solution. Do I need to udpate spec file ?14:14
bauzasdansmith: on the first way, I'm okay with the verification : if we have nova attachments, then we can verify whether cinder also has them14:14
bauzasdansmith: on the other way, we could have cinder attachments while we wouldn't see BDMs14:14
dansmithbauzas: right, but that's valid in the real world right?14:14
sean-k-mooneydvo-plv: yes but not entirly because of this. last week was the non-client lib freeze so os-vif is frozen for bobcat and yesterday was the feature freeze for nova so the spec need to be reposposed for next cycle14:15
dansmithbauzas: or, I guess we're filtering by only attachments for that instance, so maybe that's okay14:15
sean-k-mooneydvo-plv: so it would be nice to note this in that when you reporpose the spec14:15
bauzasdansmith: well, if you see my comments, I was torn14:15
bauzasI wonder whether it's possible to have dangled volume attachments for this instance while we no longer have the BDMs14:16
bauzasin order to remove the stale cinder attachments14:16
dansmithbauzas: we should only do that if they specify our instance in the attachment14:16
sean-k-mooneybauzas: if you try to attach a volume to an isntace via cinder it will create that case14:16
sean-k-mooneyfortunetly that is now blokcked without a service token14:17
sean-k-mooneyi.e. if a attachment extis for a voluem that we do not have a bdm for. you either did a db restore or messed with the nova db or you use cidner api to create it directly which we never supported14:18
sean-k-mooneydansmith: bauzas: i have not been fully following your conversation but im planning to finish in the next hour or so14:19
sean-k-mooneywith that in mind is there a direction we want to take to unblock things14:19
sean-k-mooneyrevert? proceed with patch? or is there a change you woudl like me to make quickly?14:19
dansmithstale attachments are the only *legit* way to get there now, but that can certainly happen14:19
sean-k-mooneyalternitivly im fine with someone else takeing over this14:19
dvo-plvokay, I got it14:19
dvo-plvthanks14:19
dansmithsean-k-mooney: bauzas I'd be okay with revert and fix, but if not, sean-k-mooney's patch needs some change I think14:20
dansmithI can make the change if bauzas is around to approve, but melwitt is out today and monday all of the US is off14:20
dansmithso a fast revert might be more prudent14:20
bauzasdansmith: that's what it's doing now, right? we ask Cinder to tell us if we have volume attachments for this instance14:20
dansmithbauzas: yeah, I'm just now reading it.. that wasn't in the version I last reviewed14:20
bauzasbut, how can we make this call conditional? I don't see 14:21
dansmithwell, one thing we could do is just not do that healing at all if the instance has no volume bdms14:21
bauzasthat's why wrapping the exception seems the easiest path to me14:21
bauzasdansmith: but we wouldn't fix the problem of a stale volume attach14:21
bauzasand looking at other pieces of code, we just blindly catch the keystone exception actually14:22
sean-k-mooneybauzas: in the event there are no other cinder voluem on the instance14:22
dansmithbauzas: yep, we just need to make a call14:22
bauzasdansmith: example with shutdown https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L3173-L317814:22
dansmithbauzas: yep, we could just wrap all of that function in that check and move on.. perhaps that's safest14:23
bauzasthat's what I originally ask to auniyal but then sean-k-mooney and I discussed on having a better helper14:23
dansmithbut we need to not log that on every single reboot for people that have no cinder at all, which is what sean-k-mooney's patch will currently do14:23
sean-k-mooneyah i can drop the log14:23
sean-k-mooneyi didnt have it orgianlly 14:23
bauzasdansmith: to not help the problem, we don't cache the cinder client14:23
bauzasdansmith: we're really instantiating it at every call we make14:24
sean-k-mooneydansmith: this is the log you are refering too right https://review.opendev.org/c/openstack/nova/+/893502/4/nova/compute/manager.py#419314:24
dansmithbauzas: are we?14:24
bauzasyes14:24
dansmithsean-k-mooney: the one with the ugly parens, yeah :)14:24
bauzasdansmith: eg. https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L86314:25
dansmithbauzas: ack14:25
dansmithbauzas: you need to re-+W the ironic manage one14:26
opendevreviewDanylo Vodopianov proposed openstack/nova master: Packed virtqueue support was added.  https://review.opendev.org/c/openstack/nova/+/87607514:27
bauzasdansmith: ack, will14:28
opendevreviewsean mooney proposed openstack/nova master: only attempt to clean up dangling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350214:28
dansmithman I hate wasting vertical space on dangling parens14:30
sean-k-mooneyi have kind of started to do that because of ifs but ya i prefer not to wast it either14:31
dansmithman it is SO ugly14:31
bauzasblack syntax kills python :)14:33
bauzasit's Friday I can duck 14:33
dansmithbauzas: ++14:40
bauzaswell, now I know that we don't have jobs without cinder14:40
dansmithI'm really shocked tbh, and makes me think we have some room for trimming if that's the case ;)14:41
bauzasyeah that's actually good news for the gate14:45
sean-k-mooneybefore i go fix the test are we ok with this https://termbin.com/d2q214:47
auniyalsean-k-mooney, should we catch one more i.e only  Exception  in manager.py14:50
auniyalafter EndpointNotFound14:50
bauzassean-k-mooney: looking14:50
sean-k-mooneyauniyal: no we shoud avoid catching Exception in general14:51
bauzasyeah14:51
sean-k-mooneyi think we even have a hacking check to prevent you doing that without a #noqa line14:51
sean-k-mooneys/line/comment/14:51
bauzasno AFAIR14:52
bauzasbut meh, anyway14:52
auniyalsean-k-mooney ack14:52
bauzassean-k-mooneydoesn't see in your patch where you're moving the attachment_get_call(), just see it added at first14:52
bauzasbut I assume you're moving the call, not copying it14:53
auniyalactually I was saying it because its here https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L318414:53
sean-k-mooneyim moving it im still editing but i just wanted to confirm that was the general direction14:53
bauzasyeah, that's the direction we agreed14:53
auniyalbut yes, it should be avoided14:53
bauzasmake it the first call in the method and bail out early 14:53
bauzasthis way, non-cinder envs would meh this method14:54
sean-k-mooneyyep and im adding a second early our for14:54
sean-k-mooneylen(bdms_to_delete) == 0:14:54
sean-k-mooneyafter the first loop14:54
bauzaswell, we're looping, right ?14:54
bauzasah14:55
bauzasthen after the loop, my bad, you just told it :)14:55
sean-k-mooneyhttps://termbin.com/a7sc14:55
sean-k-mooneyok im goign to start fixign the test and ill push somthing up in a bit14:56
dansmithsean-k-mooney: I'd be okay with that other than the dangling paren14:57
sean-k-mooneyfor the excpet14:57
dansmithand the unnecessary wrapping of the except list14:57
dansmiththat could be two lines but is four14:57
sean-k-mooneywell our convention is if you need to wrap you move all args14:58
dansmithno it's not14:58
sean-k-mooneyit depend on if you want to dedent the args14:59
dansmiththat's black convention that has been sneaking into nova over years14:59
sean-k-mooneyeither we dont denent them or we do and then move all14:59
dansmithfirst example I found in the same file: https://review.opendev.org/c/openstack/nova/+/893489/1/nova/compute/manager.py#388114:59
dansmiththere's another right above it14:59
sean-k-mooney yep i know we have been converting but fine i have swap back to that15:00
dansmith"we" have not been converting, AFAIK.. "some people" have been converting :)15:00
sean-k-mooneyi dont really mind doing that in this case as it not hevvily indented15:00
sean-k-mooneyi really hate when we have really indeneted arges across like 10 lins15:01
sean-k-mooneywhen it can be 4 if we dedent and wrap properly15:01
dansmithwell, regardless, this is not that case15:01
sean-k-mooneywe are both trying not to have lots of wasted virtical space so im ok to do it the other way here15:01
dansmithyep15:04
sean-k-mooneyok i think i have things wokring now15:41
sean-k-mooneyi cant do the second early out15:42
opendevreviewsean mooney proposed openstack/nova master: only attempt to clean up dangling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350215:42
sean-k-mooneyok i need to run now so dansmith bauzas  can ye take this from here.15:47
bauzasack15:47
bauzasdansmith: can you please validate https://review.opendev.org/c/openstack/nova/+/893502 ?16:20
dansmithbauzas: +W.. we might want to get a depends-on running against it in parallel to make sure it really solves the problem16:23
dansmithwas that a devstack base job that hit the original problem?16:23
bauzasdansmith: yeah, some neutron ones16:24
bauzashttps://bugs.launchpad.net/tempest/+bug/203375216:24
dansmithneutron ones or devstack ones?16:24
dansmithykarel: still around?16:25
bauzassorry https://zuul.openstack.org/job/neutron-ovn-base16:25
ykareldansmith, yes kind off16:25
bauzashttps://zuul.openstack.org/job/tempest-integrated-networking doesn't skip cinder, but his child above skips it16:25
dansmithykarel: I'll submit a DNM neutron patch to test16:26
dansmithykarel: https://review.opendev.org/c/openstack/neutron/+/89353816:26
ykareldansmith, ack16:26
bauzasI'll bail out now16:27
dansmithbauzas: ack16:27
bauzaslong day for a long week16:27
dansmithbauzas: puh lease man.. back for one week after gone for what, three?16:28
dansmithI'm going to really enjoy that ONE day off I have next week ;)16:28
bauzas3 and a half :)16:28
dansmithpssh16:28
bauzasyeah, but going back on the ff week, lovin' <316:28
dansmithbauzas: go on, get out of here.. I don't want to over stress your system :)16:29
bauzasdansmith: anyway, enjoy your own timeoff :)16:29
dansmithheh thanks16:29
auniyaldansmith - https://bf80d1d7066e6c071f87-1fbe10cd6d23547f6092d75bc2b558ab.ssl.cf2.rackcdn.com/893502/6/check/nova-ovs-hybrid-plug/4042471/job-output.txt - "DBNonExistentDatabase"16:46
auniyaland https://zuul.opendev.org/t/openstack/build/40424718c3524bb78ca22cb9d5293b6c/log/controller/logs/screen-n-cpu.txt#14946 - this one is oslo_service.periodic_task16:47
dansmithauniyal: what about it?16:48
auniyalnova-ovs-hybrid-plug job failed with POST fail16:48
auniyalbecause  devstack@c-bak.service16:49
dansmithoh, sean changed the zuul yaml16:49
dansmithI didn't even look at that16:49
auniyalcouldn't start16:49
dansmithyeah we should just do that in a different patch16:49
dansmithauniyal: can you update sean's patch to just not do the .zuul part and I'll +2+W it?16:50
dansmithmove the .zuul part to a second patch on top and we can go from there separate from fixing the gate16:50
auniyalack16:50
dansmithauniyal: can you do it quick? if not I will, but it would be better if you do and I +W16:50
auniyalyes I am doing16:51
dansmithcool16:51
dansmith++ glad you were watching those jobs live, I didn't even pay attention to the .zuul changer16:51
dansmithsomething else is re-enabling c-bak, which is why that happened:16:54
dansmith2023-09-01 16:19:08.457805 | compute1 | +++ enable_service c-bak16:54
opendevreviewAmit Uniyal proposed openstack/nova master: only attempt to clean up dangling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350216:56
opendevreviewAmit Uniyal proposed openstack/nova master: Moved zuul job update  https://review.opendev.org/c/openstack/nova/+/89354016:56
dansmithgmann: can you tell us why this is not working? https://review.opendev.org/c/openstack/nova/+/893540/1/.zuul.yaml16:57
dansmithgmann: c-bak ends up re-enabled but fails to start because the rest of cinder is disabled16:57
gmanndansmith: checking16:58
dansmithauniyal: ah, you didn't remove the comment from the commit message about modifying the job, can you fix that real quick?16:59
auniyalyes17:00
sean-k-mooneywithout the job config tweak we dont actuly knwo that this is fixed17:01
dansmithsean-k-mooney: yes, we have a neutron DNM against it17:01
sean-k-mooneybut i see that was factored out into its own change17:01
sean-k-mooneyok i would have prefered to keep them in the same patch btu sure17:02
gmanndansmith: where you are seeing it failing?17:02
sean-k-mooneyas long as we eventually get a job without cinder working its fine17:02
dansmithgmann: https://bf80d1d7066e6c071f87-1fbe10cd6d23547f6092d75bc2b558ab.ssl.cf2.rackcdn.com/893502/6/check/nova-ovs-hybrid-plug/4042471/job-output.txt17:02
dansmithgmann: search for "noneextentdatabase"17:02
dansmithsean-k-mooney: yeah, definitely need that, but just fixing that job won't even ensure that neutron is unblocked17:03
dansmithsean-k-mooney: I've got https://review.opendev.org/c/openstack/neutron/+/893538 for that verification17:03
sean-k-mooney cool well i was just checking back to see if there were any issues17:03
opendevreviewAmit Uniyal proposed openstack/nova master: only attempt to clean up dangling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350217:03
opendevreviewAmit Uniyal proposed openstack/nova master: zuul update  https://review.opendev.org/c/openstack/nova/+/89354017:03
sean-k-mooneyauniyal: why did you rebase those?17:04
dansmithsean-k-mooney: to fix the commit message17:04
dansmithsean-k-mooney: if we can figure out why it's not working in devstack, I'll fast-approve that job change or get gmann to help me, it just doesn't work as you had it17:04
dansmithsince it was part of the approved initial patch17:05
sean-k-mooneyit can wait till monday so if its not workign ill look at it then17:05
dansmithack17:05
gmanndansmith: sean-k-mooney auniyal this is multinode job and we need to disable the cinder in subnode also17:05
dansmithahh17:05
sean-k-mooneyoh right17:05
gmannneutron might be having example of those17:05
gmannlike this one https://github.com/openstack/neutron/blob/ed6023c3473585b4af6980404e7fe36ea152f980/zuul.d/tempest-multinode.yaml#L15117:06
sean-k-mooneyits trivial17:06
dansmithyeah17:06
sean-k-mooneyit just need to be in teh group vars section17:06
dansmithI'll fix it17:06
opendevreviewDan Smith proposed openstack/nova master: zuul update  https://review.opendev.org/c/openstack/nova/+/89354017:07
opendevreviewDan Smith proposed openstack/nova master: Make our nova-ovs-hybrid-plug job omit cinder  https://review.opendev.org/c/openstack/nova/+/89354017:08
gmannperfect, it should work now. will keep eyes on it17:08
dansmiththanks gmann ;)17:08
sean-k-mooneyits slightly overkill but ya it will17:08
sean-k-mooneythe sub node only has cinder volume and not swift bits17:09
sean-k-mooneybut it does not hurt to be explcit17:09
dansmithit clearly had c-bak17:09
sean-k-mooneyreally i tought that was only on the contoler17:09
dansmiththat was the failure17:09
sean-k-mooneyin anycase it does not hurt to disabel them all explictly17:10
dansmithc-bak was trying to start and didn't find the cinder database because it wasn't setup on the main node17:10
sean-k-mooneyack we might have moved that at some point17:10
gmannyeah, we do not enable swift in subnode by default but never know which child job doing might be doing/will do that so disable explicitly is better17:11
sean-k-mooneyok im going to go mostly afk. i will skim the logs later tonight but otherwise ill cat to ye on tuesday? its a long weekend in the US right17:12
dansmithyup17:13
sean-k-mooneycool well enjoy17:13
dansmiththanks, o/17:13
gmannsean-k-mooney: o/ enjoy weekend 17:14
dansmithauniyal: devstack finished on the subnode this time17:33
auniyalack 17:33
auniyaljob "openstacksdk-functional-devstack" is still running 17:34
dansmithI meant on the top patch to change the job17:35
dansmiththe one where we had to add the c-bak settings to the subnode17:35
auniyal.zuul patch where `c-bak: false` is set 17:37
dansmithyeah17:42
auniyalin here https://zuul.openstack.org/status#893540 - nova-ovs-hybrid-plug: failing and openstacksdk-functional-devstack: in-progress17:43
auniyalunit and functional tests are passed17:44
auniyalokay you meant deployment finished without issue17:44
auniyalright ?17:44
elodilleshi, nova team! fyi, train-eol patch has merged & the branches are deleted: https://paste.opendev.org/show/bVfeofByKRN0ZrNX779o/17:50
auniyalthis is not a regular thing right - https://zuul.opendev.org/t/openstack/build/475dab49cd33437a96ba81437779b875/log/controller/logs/screen-n-cpu.txt#1451517:50
dansmithauniyal: yeah17:50
gmannelodilles: thanks17:52
elodillesgmann: np17:52
dansmithgmann: this is a different failure: https://zuul.opendev.org/t/openstack/build/475dab49cd33437a96ba81437779b87521:05
dansmiththe one we were trying to resolve was where it tried to start c-bak on the subnode and failed21:05
dansmithoh, but, it's running an evacuate test on a bfv instance which clearly won't work, I see21:06
dansmithso we need to remove that post action or whatever21:06
gmanndansmith: I mean the failure it had before for evacuate test21:06
dansmithokay gotcha21:07
gmannyeah, or separate the test with cinder availability  or so?21:07
opendevreviewMerged openstack/nova master: only attempt to clean up dangling bdm if cinder is installed  https://review.opendev.org/c/openstack/nova/+/89350221:09
dansmithgmann: I don't think there's any reason we should be running that post test on the ovs job, I think it's just inherited and wasting time21:12
dansmithI'll leave that to sean on monday21:13
gmanndansmith: not sure, but it job definition explicitly says about it "Run move operations, reboot, and evacuation (via the same post-run hook21:14
gmann      as the nova-live-migration job) tests with the OVS network backend "21:14
gmannI think we can avoid running bfv evacuate test if cinder disabled. 21:14
dansmithyep, that's what I meant and said in my comment21:18
gmannyeah21:23

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!