Monday, 2022-09-12

*** gibi_off is now known as gibi07:02
opendevreviewChristian Rohmann proposed openstack/nova master: db: Drop redundant indeces on instances and console_auth_tokens tables  https://review.opendev.org/c/openstack/nova/+/85675707:11
UgglaGood morning Nova.07:46
bauzasgood morning07:51
gibio/07:52
gibiUggla: do you still have a question on my comment on the manila series ?07:52
Ugglahi gibi , no that's ok for the moment. :)07:53
gibiUggla: cool. Sorry for not responding last week I was deep in some k8s discussions07:53
Ugglagibi, no worries that's fine.07:54
sahido/ I have a specific use case regarding nova host-evacuate: we would like the host evacuation to run, but with all evacuated instances forced into the shutdown state08:43
sahidis there a way to make this happen?08:43
sean-k-mooneyno08:48
sean-k-mooneyevacuate at the api level results in the vm being evacuated to the same state it's currently in in the db08:48
sean-k-mooneyso you would need an api change for that.08:48
sean-k-mooneynovaclient's shell is deprecated so we are not adding or altering any commands08:49
sean-k-mooneyand nova host-evacuate is intentionally not supported in osc08:49
sean-k-mooneyso at this point we should not alter/extend its behavior08:49
gibiI'm wondering what happens if you try to first stop the VM then evacuate it08:50
sahidgibi if the host is down, nothing happens08:53
sahidthe other idea was to extend resetState08:53
gibiI feel like this might be a new microversion to the evacuate action, adding a flag to instruct nova to evacuate but not start the VM on the dest08:54
sahidthat's what I was thinking as well, but for host-evacuate it seems you don't want us to make any changes08:56
gibihost-evacuate is a client side concept. You can replace that with a shell script calling the openstack client08:56
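A minimal sketch of such a replacement script, assuming the per-server `nova evacuate` command and the `openstack server list --host` filter are available in your client versions; the function name and error handling are illustrative only:

```shell
# evacuate_host: hypothetical stand-in for the removed `nova host-evacuate`.
# Evacuating per server makes each failure visible individually, which the
# old bulk client-side command could not do.
evacuate_host() {
    host=$1
    # list every instance still recorded on the dead host
    for uuid in $(openstack server list --all-projects --host "$host" -f value -c ID); do
        # evacuate one instance; report but continue on failure
        nova evacuate "$uuid" || echo "evacuation of $uuid failed" >&2
    done
}
```

The admin still has to fence the host, or mark it down (e.g. with `openstack compute service set --down`), before running anything like this.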
sahidside question, why host-evacuate is not supported in openstack client?08:57
gibiwhat you cannot do today via the nova REST API is evacuate an active VM into the stopped state, hence my microversion thinking08:57
sean-k-mooneysahid: because of what gibi said08:57
sean-k-mooneyit's a client-side implementation and we did not want to support it any more08:58
sean-k-mooneythe error handling is terrible08:58
gibisahid: because it is considered orchestration08:58
sean-k-mooneywell that too08:58
sahidyes that makes sense, i understand now08:58
sean-k-mooneybut more because if one of the evacuations fails, it's kind of undefined what the end result of the command is08:58
sahidso back to the original use-case, would it make sense to have evacuate with a flag to force the state?08:59
sean-k-mooneyit won't be one of (all evacuated or all still on the original host), it will be a mix08:59
sean-k-mooneysahid: i would say target state08:59
sean-k-mooneyrather than force08:59
sean-k-mooneythat has been requested before, at the last in-person ptg i think08:59
sean-k-mooneyi would not be opposed to an evacuate-to-stopped option09:00
sean-k-mooneyim not sure that shelved makes sense09:00
sean-k-mooneybut started/stopped i can see 09:00
gibiI think a target_state enum (AsBefore, Stopped)09:00
gibimakes sense09:00
gibiAsBefore=NoChange09:00
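A sketch of what that enum could look like; the class and member names just mirror gibi's AsBefore/Stopped suggestion and are not from any merged nova code:

```python
import enum


class TargetState(enum.Enum):
    """Hypothetical target_state values for a new evacuate microversion."""

    AS_BEFORE = 'as_before'  # no change: land the VM in its current DB state
    STOPPED = 'stopped'      # evacuate, but leave the VM powered off on the dest


# leaving target_state unspecified would keep today's behaviour
DEFAULT_TARGET_STATE = TargetState.AS_BEFORE
```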
gibican we evacuate a shelved instance?09:01
sean-k-mooneyon reset-state: while I would like to expand what it can do so that you can specify something other than available/error, I'm not sure this is the right way to do this09:01
sean-k-mooneygibi: no09:01
sean-k-mooneygibi: because its not on a host09:01
gibiOK, cool then :D09:01
gibiI started worrying :)09:01
sean-k-mooneyI meant it would not make sense to evacuate to shelved_offloaded09:02
sean-k-mooneywe could allow shelve_offloading when it's down instead09:02
sean-k-mooneybut its not really evacuate09:02
sean-k-mooneyevacuate is meant to move the vm from one host to another09:02
sean-k-mooneywhereas shelve/unshelve moves it from being on a host to not, and vice versa09:03
gibiyeah09:03
sean-k-mooneyit's kind of a pedantic distinction but i don't quite consider them equal09:03
sean-k-mooneyyou could argue it either way09:03
sean-k-mooneyso i would not be against allowing stop to work in a host-down state, by the way09:04
sean-k-mooneyyou would update the db and treat it kind of like a local delete09:04
sean-k-mooneywhen the compute agent comes back up it would reconcile the vm state09:04
sean-k-mooneyif you stopped it and then evacuated, that would solve sahid's case09:05
opendevreviewAmit Uniyal proposed openstack/nova master: Adds a repoducer for post live migration fail  https://review.opendev.org/c/openstack/nova/+/85449909:05
opendevreviewAmit Uniyal proposed openstack/nova master: [compute] always set instnace.host in post_livemigration  https://review.opendev.org/c/openstack/nova/+/79113509:05
sahidsean-k-mooney: yes it's also a possibility09:05
sean-k-mooneythe one thing to keep in mind i guess is that even if we allow stop09:06
sean-k-mooneyit does not change the responsibility of the admin09:07
sean-k-mooneythat is, as an admin you are required to ensure a host is fenced or all vms are stopped before you evacuate09:07
sean-k-mooneyif we allow stop in a host-down state the admin still needs to ensure it is stopped to prevent data corruption09:08
sean-k-mooneybut if they can, that would allow them to evacuate without starting the vm again09:08
gibithis is why I would connect the stopping to the evacuation action, that way it is clear that on the source host it is not stopped09:08
sean-k-mooneyack, yeah that's cleaner09:08
sean-k-mooneyand the existing is-it-safe-to-evacuate check would also be run09:09
gibiyes09:09
sean-k-mooneye.g. the heartbeat has been missed or you set force_down09:09
sean-k-mooneyi think johnthetubaguy expressed interest in this in the past09:09
sean-k-mooneyor at least support for the people that were asking for it in the past09:10
sean-k-mooneyoh that reminds me i guess we are not merging my default change?09:11
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/83082909:11
sean-k-mooneyi would either like to merge that before RC1 or after we create the stable branch09:12
sahidthank you guys - do we agree to extend evacuate with a target state (AsBefore, Stopped)? Should I report a bug with a detailed description or share a spec?09:17
sean-k-mooneysahid: all api changes require a spec regardless of how trivial09:21
sean-k-mooneythis would need a spec and a new microversion09:21
sahidack, I'll make this happen for A.09:22
sean-k-mooneyit can be a pretty short spec but there will at least be a conductor rpc change and likely a compute one too, to pass the target state09:22
sahidsure no worries09:22
sean-k-mooneysahid: the repo is open for spec reviews so whenever you have time feel free to submit one09:23
sahid+109:24
bauzassahid, gibi, sean-k-mooney: tbh, I think we discussed host-evacuate support before, and we said it was close to orchestration since a client could be doing it09:29
bauzasso the consensus is that some client or script could be doing it09:29
bauzas(or Heat, or whatever else)09:30
sean-k-mooneybauzas: yep09:31
sean-k-mooneybauzas: but what we were discussing regarding a spec was allowing the api to accept a target state to evacuate to09:31
sean-k-mooneyone of: active, poweredoff, or current09:32
sean-k-mooneywhere current, or not specified, means what we do today09:32
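Purely as an illustration of the shape such a microversioned request body might take (the target_state field and its values are assumptions pending the spec, not an existing nova API):

```json
{
    "evacuate": {
        "host": "compute-2",
        "target_state": "poweredoff"
    }
}
```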
zigogibi: oslo.concurrency 5.0.1 hangs when I try to run its unit tests, which is probably related to your patch (at: https://review.opendev.org/c/openstack/oslo.concurrency/+/855714 ). Any idea what's going on? Do I need the latest Eventlet (I know we're lagging one minor version behind)?10:10
zigoAh no, I'm even with 0.30.2, maybe that's why...10:12
* zigo tries upgrading10:14
gibizigo: do you know where it is hanging? which test case?10:14
zigoHard to tell...10:15
zigoLast output was: https://paste.opendev.org/show/bsDsX7d0gw1SDAnLcRMA/10:16
zigoThen it hung ...10:16
sean-k-mooneywhat's that from?10:16
zigosean-k-mooney: building oslo.concurrency.10:16
sean-k-mooneyoh i disconnected and reconnected for a bit10:17
sean-k-mooneymissed the start of your conversation with gibi i think10:17
zigoProblem is: I have autopkgtest issues in Eventlet ... 0.33.x :/10:18
sean-k-mooneyhttps://github.com/openstack/oslo.concurrency/blob/5397838f4117300a509bff474dfcdd60b5993677/oslo_concurrency/tests/unit/test_processutils.py#L184-L20110:18
sean-k-mooneyi see10:20
sean-k-mooneyso that's failing while building, in some cases10:20
sean-k-mooneyi'm not really sure how/why10:20
gibizigo: could you point to how you run the unit tests?10:21
sean-k-mooneythe test is just constructing an instance of an exception class10:21
sean-k-mooneyand asserting that when you call str() on it, it contains the message10:21
gibithere are unit tests in the repo that can be run with and without eventlet monkey patching but there are tests that can only run in eventlet10:22
gibihence the https://github.com/openstack/oslo.concurrency/blob/01cf2ffdf48c21f886b2aa3f766be5d268248c18/tox.ini#L14-L1510:22
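Those two tox.ini lines run the suite twice, roughly like this (paraphrased from the linked file; exact flags may differ):

```ini
# paraphrased from oslo.concurrency's tox.ini: the unit tests run twice,
# once without and once with eventlet, toggled via TEST_EVENTLET
commands =
  env TEST_EVENTLET=0 lockutils-wrapper stestr run {posargs}
  env TEST_EVENTLET=1 lockutils-wrapper stestr run {posargs}
```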
sean-k-mooneymaybe this print is the issue https://github.com/openstack/oslo.concurrency/blob/5397838f4117300a509bff474dfcdd60b5993677/oslo_concurrency/tests/unit/test_processutils.py#L20310:22
zigoI simply do this:10:22
zigoPYTHON=python3 stestr run --subunit | subunit2pyunit10:22
gibiI can imagine that if you run the eventlet-aware tests without eventlet monkey patching then the eventlet-only tests might misbehave10:23
zigoI'll try further and let you know where it leads me.10:23
sean-k-mooneyif you do things like eventlet.spawn directly without monkeypatching10:24
sean-k-mooneyyou need to manually invoke the event loop to have it run10:24
sean-k-mooneywe saw that in nova-api when we added scatter-gather10:24
gibizigo: based on that command line you run without eventlet monkey patching but you still run tests from https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/tests/unit/test_lockutils_eventlet.py10:25
gibizigo: you can try not running those tests to see if that resolves the hang10:26
sean-k-mooneyyou could fix them by adding eventlet.sleep(seconds=0)10:26
sean-k-mooneyi think that will make it work if not monkeypatched10:26
sean-k-mooneybut ya, not running them would be better10:27
gibithere is the place where the test monkey patches selectively https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/tests/__init__.py10:27
sean-k-mooneyhttps://github.com/openstack/oslo.concurrency/blob/5397838f4117300a509bff474dfcdd60b5993677/oslo_concurrency/tests/unit/test_lockutils_eventlet.py#L4910:28
sean-k-mooneyso if we also do that on line 51, before the pool.waitall()10:28
sean-k-mooneyi think that might actually work in either case10:28
sean-k-mooneybut it might make sense for us to use the skip function in the class10:28
sean-k-mooneyto have it skip if not monkey patched10:29
sean-k-mooneygibi: is there any reason not to check if we are monkey patched in the setup function and call skipTest10:35
gibiI don't know; we might need to consult the oslo cores10:40
gibibecause there were eventlet-specific tests before I added mine, I thought it was handled centrally not to run them in a non-patched env10:40
gibithis might not be the case10:41
sean-k-mooneyi don't see that logic generically10:41
sean-k-mooneyit's certainly possible to add10:41
noonedeadpunkhello folks! I was wondering - does a revert-resize failure ring a bell for anybody? I.e. create server, resize server, revert resize -> the VM is "stuck" in REVERT_RESIZE until the message times out, then it goes back to VERIFY_RESIZE but with the original flavor, and then nova-compute shuts the VM down on the hypervisor entirely. The only way to recover is to reset its state11:20
sean-k-mooneynot that i recall, but i can see that potentially happening if we raise an exception in the revert path and can't proceed11:21
sean-k-mooneylike if the source host was down or something like that, we would not be able to revert11:22
noonedeadpunkpaste: https://paste.openstack.org/show/bh3kML89sPFYDN9HkDSd/11:22
noonedeadpunksean-k-mooney: that said, out of 100 tempest runs of tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert, 57 have failed11:23
noonedeadpunkthe stack trace I've spotted: https://paste.openstack.org/show/bytEWO0CHe8cSVktE3tv/11:24
noonedeadpunkI assumed it can be related to the heartbeat_in_pthread thing, as in the region it was "default" setting on Xena (which is enabled), but disabling it didn't fix that11:25
sean-k-mooneydo you have any timeouts from ovsdbapp11:27
sean-k-mooneyin the nova-compute logs11:27
noonedeadpunksean-k-mooney: um, nope11:30
noonedeadpunkit's ovs, not ovn fwiw11:30
sean-k-mooneyya it would be the same either way11:30
sean-k-mooneyhttps://bugzilla.redhat.com/show_bug.cgi?id=208558311:30
sean-k-mooneyi was wondering if it was related to that11:30
sean-k-mooneythose are still pending backport upstream https://review.opendev.org/c/openstack/os-vif/+/84177111:32
sean-k-mooneyhttps://review.opendev.org/c/openstack/os-vif/+/841772/111:32
sean-k-mooneyif you are using the native  os-vif backend then the compute agent can hang if the connection to the ovs db drops11:33
sean-k-mooneyand that can cause timeouts11:33
sean-k-mooneythe workaround is to just use the vsctl backend11:33
sean-k-mooneyyou will see something that looks like this11:34
sean-k-mooney2022-09-01 16:33:38.583 2 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] [POLLIN] on fd 18 __log_wakeup /usr/lib64/python3.9/site-packages/ovs/poller.py:26311:34
noonedeadpunkin neutron-ovs-agent the only string referencing port is `neutron.agent.common.ovs_lib [req-3aa28a82-1d30-4713-989a-e5093e55f7ab - - - - -] Port 200e682f-b9af-487c-aa8c-605e20b99002 not present in bridge br-int`11:34
sean-k-mooney2022-09-01 16:33:38.584 2 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: entering ACTIVE _transition /usr/lib64/python3.9/site-packages/ovs/reconnect.py:51911:34
sean-k-mooney2022-09-01 16:33:40.874 2 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] [POLLIN] on fd 18 __log_wakeup /usr/lib64/python3.9/site-packages/ovs/poller.py:26311:34
sean-k-mooney2022-09-01 16:33:43.584 2 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] [POLLIN] on fd 18 __log_wakeup /usr/lib64/python3.9/site-packages/ovs/poller.py:26311:34
sean-k-mooney2022-09-01 16:33:48.587 2 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] 4999-ms timeout __log_wakeup /usr/lib64/python3.9/site-packages/ovs/poller.py:24811:34
sean-k-mooneynoonedeadpunk: not in neutron in the nova-compute agent11:34
noonedeadpunkah, I would need to have DEBUG enabled11:35
noonedeadpunklet me enable it and reproduce :)11:35
sean-k-mooneynoonedeadpunk: ah well, that just means that the logical port is defined in the northdb11:35
sean-k-mooneybut not actually added to the br-int bridge yet11:35
sean-k-mooneyi think11:36
sean-k-mooneyso not the northdb, but more or less the same11:36
sean-k-mooneyneutron knows that the port should exist but nova has not added it yet11:36
sean-k-mooneynoonedeadpunk: if you don't have debug enabled, the symptom to look for is large gaps of 5+ seconds in the log without output11:37
sean-k-mooneythat however is not helpful if the system is idle or not actively spawning a vm11:38
sean-k-mooneysince that is what you would expect it to look like unless you asked nova to do something11:38
noonedeadpunkyeah, well, delay between `Updating port 73862d0c-b31b-4b75-b833-8e29d8066b9a with attributes {'binding:host_id': 'cc-compute04-tky1', 'device_owner': 'compute:nova'}` and stack trace is exactly 5 sec. But it's exactly the timeout11:40
sean-k-mooneythat is a little sus yes11:42
sean-k-mooneythe tl;dr of https://bugs.launchpad.net/os-vif/+bug/1929446 is that in the ovs python bindings they basically make a blocking call to select.poll() which blocks the main thread11:44
sean-k-mooneyhttp://patchwork.ozlabs.org/project/openvswitch/patch/20210611142923.474384-1-twilson@redhat.com/11:45
noonedeadpunkah11:47
* noonedeadpunk reproducing with debug enabled11:47
sean-k-mooneyby the way we have seen ValueError: Circular reference detected before11:54
sean-k-mooneybut i don't think we ever figured out where that comes from or how11:54
sean-k-mooneyany time we have seen it, there has always been some other error too11:55
sean-k-mooneyand fixing the other error has caused both to go away11:55
noonedeadpunkoh yes, I do see `DEBUG ovsdbapp.backend.ovs_idl.vlog [-] [POLLIN] on fd 31 __log_wakeup /openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/ovs/poller.py:263` when revert_resize is failed11:56
noonedeadpunkthanks sean-k-mooney, I will check out if patch will work :)11:57
sean-k-mooneyit's ok to see it if you don't see large timeouts, or if you only see it occasionally11:57
noonedeadpunkI see it only when revert is failing11:58
sean-k-mooneyotherwise either backport that patch or set os-vif to use vsctl11:58
sean-k-mooneyack11:58
sean-k-mooney[os_vif_ovs]/ovsdb_interface=vsctl11:58
sean-k-mooneysetting that in your nova.conf will also workaround it11:59
noonedeadpunkah, undocumented option?:)12:00
noonedeadpunklet me try it out then12:01
sean-k-mooneywell, it's not undocumented12:01
sean-k-mooneyits an os_vif option12:01
sean-k-mooneynot nova12:01
noonedeadpunkwell, yeah, but keystone_authtoken and oslo are included. And with that, as an operator I would expect to see the rest as well...12:02
noonedeadpunkAnyway)12:02
zigoLooks like my issue is related to https://github.com/eventlet/eventlet/issues/73012:02
zigoI need to fix eventlet in Debian with py 3.10 first, and then see what's going on with oslo.concurrency ...12:03
sean-k-mooneynoonedeadpunk: we could render them i guess; i was expecting them to be in https://docs.openstack.org/os-vif/latest/index.html but i'm not seeing them12:03
sean-k-mooneynoonedeadpunk: we have to explicitly list namespaces if we want them to show up12:04
sean-k-mooneynoonedeadpunk: this is where it's defined, by the way https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L71-L8212:05
sean-k-mooneythese are the valid options https://github.com/openstack/os-vif/blob/771dfffcd90dcd7c8c95c41744092f5ad4917be3/vif_plug_ovs/ovsdb/api.py#L18-L2112:06
noonedeadpunknice, thanks!12:07
sean-k-mooneynoonedeadpunk: if you're using ml2/ovs you should also be enabling https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L96-L9912:08
sean-k-mooneyignore per_port_bridge12:08
sean-k-mooneyi should remove that12:08
noonedeadpunktil about `isolate_vif`12:09
noonedeadpunkand yes, your note does make sense12:10
noonedeadpunksean-k-mooney: `[os_vif_ovs]/ovsdb_interface = vsctl` seems not to solve the issue. 12:30
noonedeadpunkno ovsdbapp logs though12:32
sean-k-mooneyok, then it's likely not because of the agent locking up12:33
sean-k-mooneyyou mentioned you have disabled the pthread heartbeat, yes12:33
sean-k-mooneydo you use the iptables firewall or openvswitch?12:34
noonedeadpunkyeah, I did, for all neutron-ovs-agents, nova-compute/scheduler/conductor12:34
noonedeadpunkhuh.12:34
sean-k-mooneywith iptables we add the linux bridge and veth pair; with the ovs firewall we add the tap directly to ovs12:34
noonedeadpunkI just realized I likely missed disabling for neutron-server12:35
noonedeadpunkas I assumed it's running in uwsgi, but likely it's not in this deployment12:35
sean-k-mooneywell, neutron-server12:35
sean-k-mooneyah, i was going to say it's probably wsgi12:35
sean-k-mooneyya, worth checking12:36
sean-k-mooneythat has been reverted to off by default12:36
sean-k-mooneynot sure if its been released yet on stable branches12:36
noonedeadpunknah, I disabled heartbeat_in_pthread for neutron-server as well12:36
noonedeadpunkfirewall_driver = iptables_hybrid12:37
noonedeadpunkyeah, I guess there's an lxb in place12:37
sean-k-mooneyyep 12:49
sean-k-mooneyok, one sec12:49
sean-k-mooneyhttps://github.com/openstack/nova/commit/0b0f40d1b308b29da537859b72080488560c23d412:51
noonedeadpunkTbh I still kind of blame pthread, as the actual error is `oslo_messaging.rpc.server eventlet.timeout.Timeout: 300 seconds`, which is exactly what would happen because of pthreads iirc12:51
noonedeadpunkand it's intermittent as well12:51
noonedeadpunkbut only revert is affected which is weird given it's pthread12:52
sean-k-mooneyhttps://bugs.launchpad.net/nova/+bug/195200312:52
sean-k-mooneyso we went back and forth with this due to a few in-flight changes at once12:52
sean-k-mooneybut you are probably hitting that12:52
sean-k-mooneywe should not be waiting for the network-vif-plugged event if you have a specific set of patches.12:53
sean-k-mooneynoonedeadpunk: do you have https://github.com/openstack/nova/commit/66c7f00e1d9d7c0eebe46eb4b24b2b21f741378912:54
gibizigo: ack, let me know if I can help somehow12:56
sean-k-mooneywhen we addressed https://bugs.launchpad.net/nova/+bug/1895220 it introduced https://bugs.launchpad.net/nova/+bug/1952003, which originally fixed https://bugs.launchpad.net/nova/+bug/1832028 and https://bugs.launchpad.net/nova/+bug/183390212:56
sean-k-mooneynoonedeadpunk: https://github.com/openstack/nova/commit/0b0f40d1b308b29da537859b72080488560c23d4 is in yoga 12:57
noonedeadpunksean-k-mooney: I do have https://github.com/openstack/nova/commit/66c7f00e1d9d7c0eebe46eb4b24b2b21f741378912:58
noonedeadpunkI guess I don't have I3cb39a9ec2c260f422b3c48122b9db512cdd799b though, as it's Xena12:58
sean-k-mooneynoonedeadpunk: what about https://review.opendev.org/c/openstack/nova/+/82841412:58
sean-k-mooneywe backported it12:58
sean-k-mooneybut only 5 months ago12:59
noonedeadpunkNah, we did not do this minor upgrade12:59
noonedeadpunkLet me check it out12:59
sean-k-mooneyim not sure if we have done a release since then12:59
noonedeadpunkGerrit says you did :p13:00
noonedeadpunkBut we run 24.0.1.dev10, and it's included in 24.1.113:00
sean-k-mooneyno, it says we merged it13:00
sean-k-mooneywhere did you see that in gerrit13:00
noonedeadpunkthree dots in upper right corner -> included in13:01
sean-k-mooneyoh wow, didn't know that was a thing13:01
sean-k-mooneyhttps://github.com/openstack/releases/commit/ac4be06827ec7a450233244d8c5cae8834b95ffc13:02
noonedeadpunkit was there even in gerrit 213:02
sean-k-mooneybut yes we did that on 21st jun13:02
sean-k-mooneynever used it; i normally just check the release on github13:03
sean-k-mooneynoonedeadpunk: so ya, sorry, i think it's https://bugs.launchpad.net/nova/+bug/195200313:04
sean-k-mooneyon the plus side, if it is, then you just need to do the minor update when you have time13:05
noonedeadpunkah, bug report is indeed super familiar13:18
zigogibi: Building Eventlet, I still get this:13:29
zigohttps://paste.opendev.org/show/bIinQaPTAy81Uac3ZPHS/13:29
zigoAfter a lot of head-scratching, I can't get how to fix it (note: I already cherry-picked https://github.com/eventlet/eventlet/pull/754/commits/cd2532168e33d892de625f9fc831bf0951f4e937 the collections.abc.Iterable one, and another about ssl_version=ssl.PROTOCOL_TLSv1_2).13:29
zigoThe send_method object contains what, in fact?13:29
zigoI see it's self.fd.send_method or something ...13:30
*** dasm|off is now known as dasm13:33
* zigo tries this patch: https://github.com/eventlet/eventlet/commit/f0a887b94a86f9567e33037646712b89f02ae44113:35
* zigo gives up and skips the broken unit test.13:43
gibiI looked at it but I have no ideas either on that13:43
noonedeadpunksean-k-mooney: seems that patch revert does work, thanks a lot!14:14
sean-k-mooneywe had 3 or 4 super-niche edge cases that we resolved, and unfortunately it took us a while to realise that that was no longer required after we fixed that previous bug14:16
sean-k-mooneynoonedeadpunk: i'm glad this is working for you14:16
zigogibi: In oslo.concurrency, I tried reverting "Fix fair internal lock used from eventlet.spawn_n" and it was still stuck, so now I'm trying to revert "Prove that spawn_n with fair lock is broken" ...14:34
sean-k-mooneyzigo: just as an fyi, that fix is needed14:35
sean-k-mooneyzigo: without it none of the fair locks in nova actually work14:35
zigoYou mean the "Fix fair internal lock used from eventlet.spawn_n" ?14:36
sean-k-mooneyyes14:36
sean-k-mooneythat is required for correctness 14:36
zigoRight, though it doesn't seem to be the brokenness ...14:36
sean-k-mooneyack14:36
zigoIt passed ...14:39
zigoRemoving https://review.opendev.org/c/openstack/oslo.concurrency/+/855713 fixed my issue.14:40
gibias I noted earlier, you are probably running those tests ^^ without monkey patching, hence they get stuck14:42
gibiyou have no better option now than removing those tests14:43
gibibut you still have to keep the fix from "Fix fair internal lock used from eventlet.spawn_n", as sean-k-mooney noted14:43
zigoI did.14:44
zigogibi: Is it possible that I'm running into this problem because I'm not doing `TEST_EVENTLET=0 lockutils-wrapper` before stestr run ?14:45
gibizigo: yes, I think so14:45
zigoOk, will try.14:45
zigoThanks.14:46
zigoIndeed, that looks like fixing the issue, thanks gibi! :)14:57
gibias a follow up we need better handling of those tests in oslo.concurrency. The eventlet-specific tests should be skipped if TEST_EVENTLET is not requested14:58
zigoI'd prefer if the whole unit test suite failed when TEST_EVENTLET is not set ...15:11
zigoThis way, a guy like me would know what to do... :)15:11
zigo(just my 2 cents of advice...)15:12
opendevreviewDmitriy Rabotyagov proposed openstack/nova master: [doc] Add os_vif configuration options  https://review.opendev.org/c/openstack/nova/+/85720215:27
gibizigo: yeah, it is not helpful if the tests just get stuck15:27
JayFSo heads up, it looks like there might be some persistent failure in stable/yoga: https://review.opendev.org/c/openstack/nova/+/854257 the openstacksdk functional job has failed almost every time on this change (and it's clearly unrelated)22:10
JayFI didn't see it mentioned on the etherpad so I figured I'd mention it here. I'm not too attached to that patch specifically anymore (all the things we know need backporting from Ironic driver have been) -- but it works well as a test case to see the failures.22:10
*** dasm is now known as dasm|off22:59

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!