Thursday, 2021-06-10

*** liuyulong has joined #openstack-neutron01:25
opendevreviewliuyulong proposed openstack/neutron master: Add the DHCPReponder for IPv6  https://review.opendev.org/c/openstack/neutron/+/77328301:27
opendevreviewliuyulong proposed openstack/neutron master: Add agent extension 'dhcp' for ovs agent  https://review.opendev.org/c/openstack/neutron/+/77656701:27
opendevreviewliuyulong proposed openstack/neutron master: Add fullstack test case for OVS DHCP extension  https://review.opendev.org/c/openstack/neutron/+/77656801:27
*** spatel has joined #openstack-neutron02:05
*** manpreetk has quit IRC02:08
*** spatel has quit IRC02:51
*** spatel has joined #openstack-neutron02:54
*** spatel has quit IRC03:22
*** gthiemon1e has joined #openstack-neutron04:31
*** gthiemonge has quit IRC04:31
*** gthiemon1e has quit IRC04:32
*** gthiemonge has joined #openstack-neutron04:32
*** manub has joined #openstack-neutron04:47
*** manpreetk has joined #openstack-neutron05:02
*** digitalsimboja has joined #openstack-neutron05:16
*** opendevreview has quit IRC05:26
*** lajoskatona has joined #openstack-neutron05:53
*** opendevreview has joined #openstack-neutron06:05
opendevreviewMerged openstack/neutron master: Fix "config-routed-networks" document errors  https://review.opendev.org/c/openstack/neutron/+/79558106:05
*** ltomasbo has joined #openstack-neutron06:15
opendevreviewSlawek Kaplonski proposed openstack/neutron-lib master: Revert "Add network parameter to some of the ML2 API methods"  https://review.opendev.org/c/openstack/neutron-lib/+/79561706:17
opendevreviewSlawek Kaplonski proposed openstack/neutron-lib master: Revert "Add network parameter to some of the ML2 API methods"  https://review.opendev.org/c/openstack/neutron-lib/+/79561706:18
*** jlibosva has joined #openstack-neutron06:51
*** ralonsoh has joined #openstack-neutron06:54
*** rpittau|afk is now known as rpittau06:54
lajoskatonaralonsoh, obondarev: https://review.opendev.org/c/openstack/oslo.privsep/+/794993 big chunks are from ralonsoh's https://review.opendev.org/c/openstack/oslo.privsep/+/78298106:59
lajoskatonaralonsoh, obondarev: I see though assertionerror (https://opendev.org/openstack/oslo.privsep/src/branch/master/oslo_privsep/comm.py#L167 ) like here: https://a0d9b83e558c2b5b4833-56ee05c3af7ad9e598ae576b02ec5a60.ssl.cf2.rackcdn.com/794993/2/check/openstack-tox-py38/aedb81c/job-output.txt07:01
lajoskatonaralonsoh, obondarev: I have to check that07:02
*** lucasagomes has joined #openstack-neutron07:07
fricklerslaweq, others: has it ever been considered for RBAC to have an access_as_owner action or similar that would also allow actions like creating ports with allowed-address-pairs set? or is that a nonsensical idea?07:07
slaweqfrickler it wasn't considered, but I think that You can probably play with policy rules to allow creation of ports with allowed address pairs in shared network, no?07:09
fricklerslaweq: well I would prefer it to be able to be set by users and not require admin access. I also need it only on specific networks, can one do policy rules that have a network filter?07:12
slaweqnope, policy rules are for all networks, not per specific one07:12
slaweqthere is opened BP to add access-as-readonly to the RBAC07:13
slaweqYou can propose another RFE for access-as-owner action probably07:13
fricklerdo you have a link for that handy? otherwise I can do a search myself07:13
fricklerhttps://bugs.launchpad.net/neutron/+bug/1875516 , but that looks almost abandoned :-(07:15
opendevmeetLaunchpad bug 1875516 in neutron "[RFE] Allow sharing security groups as read-only" [Wishlist,New]07:15
slaweqfrickler yes, that's the one07:16
slaweqand it's opened for long time07:16
slaweq:/07:16
fricklerso that likely means that if I propose the RFE, I should also be ready to implement it myself. guess I need to think about that a bit, thx so far07:18
*** luksky has joined #openstack-neutron07:29
*** elvira has joined #openstack-neutron07:32
slaweqI can try to help but only as low-prio thing for me as I have other things on my plate07:34
ralonsohlajoskatona, let me check this patch07:38
*** digitalsimboja has quit IRC07:38
*** isabek has joined #openstack-neutron07:51
*** elvira1 has joined #openstack-neutron07:54
*** elvira has quit IRC07:56
*** elvira1 is now known as elvira07:56
opendevreviewliuyulong proposed openstack/neutron master: Add agent extension 'dhcp' for ovs agent  https://review.opendev.org/c/openstack/neutron/+/77656708:04
opendevreviewliuyulong proposed openstack/neutron master: Add fullstack test case for OVS DHCP extension  https://review.opendev.org/c/openstack/neutron/+/77656808:04
opendevreviewSlawek Kaplonski proposed openstack/neutron master: [Doc] Remove one of the known DVR HA limitations  https://review.opendev.org/c/openstack/neutron/+/79553108:05
*** isabek_ has joined #openstack-neutron08:16
*** isabek has quit IRC08:17
opendevreviewRodolfo Alonso proposed openstack/neutron stable/wallaby: [OVN] Add "filter-validation" extension to supported list  https://review.opendev.org/c/openstack/neutron/+/79562308:28
opendevreviewRodolfo Alonso proposed openstack/neutron stable/wallaby: Add "filter-validation" support for "OVNL3RouterPlugin"  https://review.opendev.org/c/openstack/neutron/+/79545508:30
*** digitalsimboja has joined #openstack-neutron08:38
*** digitalsimboja has quit IRC08:47
opendevreviewRodolfo Alonso proposed openstack/neutron master: Use "multiprocessing.Queue" for "TestNeutronServer" related tests  https://review.opendev.org/c/openstack/neutron/+/79389908:54
*** lajoskatona has quit IRC08:57
*** lajoskatona has joined #openstack-neutron09:06
*** lajoskatona1 has joined #openstack-neutron09:08
*** lajoskatona has quit IRC09:14
*** digitalsimboja has joined #openstack-neutron09:23
opendevreviewSlawek Kaplonski proposed openstack/neutron stable/rocky: Exclude fallback tunnel devices from netns cleanup  https://review.opendev.org/c/openstack/neutron/+/79562409:33
opendevreviewSlawek Kaplonski proposed openstack/neutron stable/queens: Exclude fallback tunnel devices from netns cleanup  https://review.opendev.org/c/openstack/neutron/+/79562509:33
*** jlibosva has quit IRC09:53
*** jlibosva has joined #openstack-neutron09:53
*** TMM has quit IRC10:36
*** TMM has joined #openstack-neutron10:36
ralonsohbcafarel, https://review.opendev.org/c/openstack/neutron/+/790702/2/zuul.d/tempest-multinode.yaml#15210:39
ralonsohone qq10:39
ralonsohthe neutron config in the subnode will be the same as in the primary one, right?10:39
ralonsohexcept if we add something different there10:40
ralonsoh(btw, I think this duplication is a bad cherry-pick)10:40
ralonsohit is NOT in the master branch patch10:41
bcafarel:) yes it probably comes from cherry-pick, though the question is still valid10:41
bcafarelmy hunch is it should use the primary vars if there is no override, but not sure10:42
ralonsohbcafarel, yeah, this parameter is the same in both nodes10:42
ralonsohso I don't need to duplicate it10:42
ralonsohsubnode will have the same generated config file10:42
ralonsohright?10:42
bcafarelI think so yes (and if that's not the case, we have an issue in master)10:43
ralonsohperfect, I'll remove this duplication10:43
bcafarelhmm that though we can check in a recent master job run see if it has the param properly set!10:43
ralonsohright, I'll do right now10:44
ralonsohbcafarel, nope, https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_66c/795059/2/check/neutron-tempest-multinode-full-py3/66cb24e/compute1/logs/etc/neutron/neutron_conf.txt10:45
ralonsohthe configs are independent10:45
ralonsohI need to amend the master zuul job definition10:46
bcafarelralonsoh: I was almost as fast, was looking at https://7408824defcee8201456-a3def1e703d420249f59d3783ec441e9.ssl.cf5.rackcdn.com/318542/30/check/neutron-tempest-multinode-full-py3/c344d27/compute1/logs/etc/neutron/neutron_conf.txt10:46
ralonsohand the cherry-pick, lucky strike, is correct10:46
bcafarelralonsoh: fixing bugs without even realizing it :)10:47
*** digitalsimboja has quit IRC10:47
ralonsohhahahahah10:47
ralonsohthat was a lucky strike for sure10:47
opendevreviewMerged openstack/neutron master: Using 31-Bit and 32-Bit prefixes for IPv4 reasonably  https://review.opendev.org/c/openstack/neutron/+/31854210:48
bcafarel:) (and indeed neutron.conf in victoria backport is OK)10:49
opendevreviewMerged openstack/neutron master: Added dnsmasq for opensuse installation  https://review.opendev.org/c/openstack/neutron/+/79517010:49
opendevreviewRodolfo Alonso proposed openstack/neutron master: Add "nova:live_migration_events" flag to subnode in multinode CI job  https://review.opendev.org/c/openstack/neutron/+/79576111:15
ralonsoh^^^ once this patch is merged, I'll push a backport for W11:16
bcafarelralonsoh++11:29
*** hjensas|afk has quit IRC11:49
*** hjensas has joined #openstack-neutron11:49
*** whoami-rajat has quit IRC11:57
*** manub has quit IRC12:15
*** manub has joined #openstack-neutron12:15
*** spatel has joined #openstack-neutron12:19
opendevreviewsean mooney proposed openstack/os-vif master: [WIP] mock ovs.poller.Poller  https://review.opendev.org/c/openstack/os-vif/+/79577012:24
*** yoctozepto is now known as Guest157012:36
*** Guest1570 is now known as yoctozepto12:36
ralonsohfolks, just a heads-up12:43
ralonsohhttps://review.opendev.org/c/openstack/requirements/+/78833912:43
ralonsohSQLAlchemy has been bumped to 1.4.1512:44
ralonsohjust to stay alert to the CI if you see weird things related to the DB12:44
ralonsohthanks!12:44
opendevreviewRodolfo Alonso proposed openstack/neutron master: Add "nova:live_migration_events" flag to subnode in multinode CI job  https://review.opendev.org/c/openstack/neutron/+/79576112:48
*** sean-k-mooney has joined #openstack-neutron13:01
sean-k-mooneyo/13:01
sean-k-mooneyralonsoh: tl;dr regarding https://review.opendev.org/c/openstack/os-vif/+/795770 is that currently the ovsdbapp is starting an instance ot the ovs poller internallly on what it really wants to be a real thread but its monky patched.13:02
sean-k-mooneyralonsoh: due to how the ovs poller works it cna deadlock the nova compute agent13:03
ralonsohsean-k-mooney, but how do you plan to use this connection then?13:03
ralonsohI must be missing something there13:04
sean-k-mooneywell it passes the os-vif functional tests so its actully creating port in ovs with this13:04
sean-k-mooneyos-vif never reads form the ovs-db13:04
sean-k-mooneyit only ever writes to it so we dont need to pool for state changes13:05
sean-k-mooneyovsdbapp does this for neturon to monitor ovs in the l2 agent13:05
sean-k-mooneythat is just a temp patch to see if this will work13:05
ralonsohI still don't understand13:05
ralonsohthis poller is needed to run commands13:05
ralonsohhttps://github.com/openstack/ovsdbapp/blob/5d24facd36ca06b1f5f793c834da2efa4fa3a508/ovsdbapp/backend/ovs_idl/connection.py#L95-L13713:05
sean-k-mooneyno its not13:05
sean-k-mooney the pooler is not what is runnign the command13:06
ralonsohahhh no13:06
sean-k-mooneyits the idl object that is13:06
ralonsohyou are right13:06
ralonsohthis "run" is not the command.run13:06
sean-k-mooneyim just making the fd_wait and block noops13:06
ralonsohthis is just for the local cache of the DB13:06
sean-k-mooneyyep13:06
sean-k-mooneyso with this hack because that is what that patch is the cache will be stale13:06
sean-k-mooneybut for os-vif that is fine since we never read anything13:07
ralonsohok ok, maybe I can propose a patch for ovsdbapp13:07
sean-k-mooneyi was going to do that after13:07
ralonsohmaking this poller something optional13:07
ralonsohperfect!13:07
sean-k-mooneyto add a poll=True option ot the constuctor13:07
ralonsohwell, not the poller but the "run" thread13:07
sean-k-mooneyah right13:07
sean-k-mooneyso if this does not work my alternitve plan13:08
sean-k-mooneywas to chagne this https://github.com/openstack/ovsdbapp/blob/5d24facd36ca06b1f5f793c834da2efa4fa3a508/ovsdbapp/backend/ovs_idl/connection.py#L9113:08
sean-k-mooneyto always create a real pthread13:08
sean-k-mooneyeven when its being used form a monky patched context13:08
sean-k-mooneycurrenlty its a green thread13:08
ralonsohwe also use greenthreads13:09
ralonsohand is working13:09
sean-k-mooneyit works in nova but ocationally it blocks13:09
sean-k-mooneyhttps://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L59-L6313:09
sean-k-mooneythis i think is why ^13:09
ralonsoh(well, this is also a problem in Neutron)13:09
ralonsohwe don't properly handle user threads13:09
sean-k-mooneyyes it would affect neutron too13:09
sean-k-mooneycausing the l2 agent to perodically hang13:10
ralonsohthis is a topic to be discussed with Terry13:10
ralonsohthanks a lot13:10
ralonsohbecause this is causing problems too13:10
ralonsohin Neutron13:10
sean-k-mooneylet me see if i can show you what it look like in the log13:10
ralonsohand, btw, is it legit to mix user and OS threds13:10
ralonsohthere is no problem (if done correctly)13:10
sean-k-mooneyyes if done correctly its fine and this already has all the locking required13:11
opendevreviewMerged openstack/neutron-lib master: API "subnet-segmentid-writable" should inherit field definition  https://review.opendev.org/c/openstack/neutron-lib/+/79534013:11
sean-k-mooneywe do it for oslo.messaging already13:11
sean-k-mooneythe heartbeat thread is a real pthread13:11
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/bb6fd21b5d8c471a89f4f6598aa84e5d/log/controller/logs/screen-n-cpu.txt#12235-1226213:14
sean-k-mooneythere you can see it blocks the nova agent for ~5 seconds13:14
sean-k-mooneyand it does it a few times before other things get a change to run13:14
ralonsohTerry told me this is wrong13:15
ralonsoh"Without polling, we can't respond to the echo requests that the server sends and it can"13:15
ralonsoh^^13:15
ralonsohdisconnect us.13:15
sean-k-mooneyyep  i know its a hack13:15
ralonsohthis is why you can't do this13:15
ralonsohok ok hehehe13:15
sean-k-mooneywe would reconect but i want to see if that log message goes away13:16
ralonsohchecking the logs now13:16
sean-k-mooneythe correct way as i said i probly to put the poller in a real pthread13:16
ralonsohright13:16
*** otherwiseguy has joined #openstack-neutron13:16
ralonsohthere he is13:17
*** obondarev has joined #openstack-neutron13:17
sean-k-mooneyopendevreview: o/13:17
sean-k-mooney...13:17
sean-k-mooneyotherwiseguy: o/13:17
otherwiseguysean-k-mooney: \o13:17
ralonsohotherwiseguy, sean-k-mooney is talking about making this13:17
ralonsohhttps://github.com/openstack/ovsdbapp/blob/5d24facd36ca06b1f5f793c834da2efa4fa3a508/ovsdbapp/backend/ovs_idl/connection.py#L9113:17
ralonsoha real thread13:17
sean-k-mooneyto workaround https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L59-L6313:18
ralonsohthat could solve some problems we have with greenthreads too13:18
otherwiseguyralonsoh: is it not as real a thread as Python can provide?13:18
sean-k-mooneyotherwiseguy: no13:18
ralonsohno if this is called from a monkeypatch program13:18
sean-k-mooneyotherwiseguy: eventlet has monky patched it13:18
ralonsohright13:18
sean-k-mooneyotherwiseguy: it still wont escape the gil13:18
otherwiseguybut ovsdbapp doesn't monkeypatch it.13:19
ralonsohNeutron does13:19
sean-k-mooneynova and neutorn does13:19
ralonsohand neutron calls ovsdbapp13:19
otherwiseguyright, but that is a choice of the user of the library is all I'm saying. :)13:19
sean-k-mooneyin this case nova mokeypatches the world, it call os-vif, os-vif then calls ovsdbapp13:19
sean-k-mooneyotherwiseguy: ovs un monkeypatches https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L270-L29013:20
ralonsohotherwiseguy, yeah... you are right, but we must be aware of this13:20
ralonsohhttps://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L59-L6313:20
ralonsohso I think this is a good idea that ovsdbapp offers this option13:20
ralonsohto use a OS thread there, instead of eventlet13:21
sean-k-mooneyyou really only see this issue in the ci because some fo the nodes are slow13:22
*** elvira has quit IRC13:22
sean-k-mooneybut i suspect it woudl also happen when a node is recaching capasity13:22
ralonsohor any race condition in the code not returning the GIL13:23
sean-k-mooneyill submit a second patch to ovsdbapp to ty and moge that to a real pthread and we can see if that helps13:24
ralonsohcool for me!13:24
sean-k-mooneywe can see what the ci says in terem of it if break anything13:24
otherwiseguysean-k-mooney: When you say OS thread, are you talking about some kind of C-extension thing so we can actually use multiple cores? I'm not following how there is going to be a performance benefit since python threads suck.13:25
sean-k-mooneyotherwiseguy: we dont care about performace we just dont want the pooling to deadlock the eventlet event loop13:26
otherwiseguygotcha13:26
sean-k-mooneyright now when the poller blocks like here https://github.com/openstack/ovsdbapp/blob/5d24facd36ca06b1f5f793c834da2efa4fa3a508/ovsdbapp/backend/ovs_idl/connection.py#L10613:26
sean-k-mooneythe entire agent blocks until the timeout expires or the file desciptor gets a kick form ovs13:27
sean-k-mooneyim totally fine with it doing that as long as tha tis not the tread on which the eventlet event loop is running on13:27
otherwiseguysean-k-mooney: ralonsoh: but it looks like we are using the _SelectSelect class and not select.poll? https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L153. The only place I see that would actually use the system poll is in check_connection_completion() which is only used when open/connect on a stream is called. Am I misreading that?13:34
ralonsohotherwiseguy, the problem is the thread using the poller13:34
ralonsoh_SelectSelect13:35
ralonsohsorry13:35
ralonsohhttps://github.com/openstack/ovsdbapp/blob/5d24facd36ca06b1f5f793c834da2efa4fa3a508/ovsdbapp/backend/ovs_idl/connection.py#L9113:35
ralonsohthis thread that executes "run" is a greenthread13:35
ralonsoh(as commented because Neutron and Nova are monkey patched)13:35
ralonsohthat should be a OS thread (preemptive thread, the context is given by the OS, not the user in the code, stopping the running eventlet thread)13:36
otherwiseguyright, and I thought greenthreads should be fine with _SelectSelect because it doesn't use poll and was made to work with greenthreads?13:36
* otherwiseguy tries to catch up13:36
otherwiseguyThe comment makes it sound like that is literally its reason for existing.13:37
opendevreviewMerged openstack/neutron stable/train: Enable ovsdb debug messages in functional and fullstack  https://review.opendev.org/c/openstack/neutron/+/79462513:38
sean-k-mooneyotherwiseguy: i think they had inteded it to work that way but as you can see form the nova logs https://zuul.opendev.org/t/openstack/build/bb6fd21b5d8c471a89f4f6598aa84e5d/log/controller/logs/screen-n-cpu.txt#12235-1226213:39
*** digitalsimboja has joined #openstack-neutron13:39
sean-k-mooneythere are direct 5 second gaps in output while we wait for the time out to expire13:39
sean-k-mooneyno other processing seams to be happening on the agent13:39
sean-k-mooneythis results in some rpc calls timing out during the live migration tests13:40
sean-k-mooneymost of the time this pooling returns instantly so its not an issue13:40
sean-k-mooneythese long pause seam to also be assocaited with reconnects13:42
otherwiseguysean-k-mooney: weird, because I think each of those state changes requires a call to idl.run() to happen, which only happens if we pass a block() call. And nothing happens in between them.13:42
otherwiseguyYeah, reconnects are definitely super expensive. If there was an active transaction, we'd be waiting on on it to complete (we queue txns and block popping the results with commit_block())13:44
ralonsohbut there we have the problem, in the block() call13:44
ralonsohhttps://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L137-L14013:44
ralonsohthat will wait13:44
ralonsohat this point, if this is a user thread, the GIL will be given to the next one13:45
ralonsohif the next one never returns, this wait method will never continue13:45
sean-k-mooneywell in theory that select should be monkeypatched and yeild13:46
sean-k-mooneybut it does not look like that is what is happening here13:47
sean-k-mooneyralonsoh: otherwiseguy lee just noted that his might be  the same issue https://bugs.launchpad.net/nova/+bug/186388913:47
opendevmeetLaunchpad bug 1863889 in OpenStack Compute (nova) "Revert resize problem in neutron-tempest-dvr-ha-multinode-full" [Medium,Confirmed]13:47
*** pjakuszew has joined #openstack-neutron13:54
opendevreviewBodo Petermann proposed openstack/neutron master: Config option to enable OVN IDL on other workers  https://review.opendev.org/c/openstack/neutron/+/79578113:55
*** elvira has joined #openstack-neutron13:56
*** digitalsimboja has quit IRC13:56
sean-k-mooneyotherwiseguy: am i correct in saying ovsdbapp does not use oslo.utils?13:57
sean-k-mooneyya no oslo libs in requirements13:58
sean-k-mooneyok ill just port the relevent logic13:58
otherwiseguysean-k-mooney: correct. we try to avoid any openstack requirements since there are non-openstack users. (except we use oslo testing stuff)13:59
otherwiseguyI'm still going to be shocked if that select() call is blocking.13:59
otherwiseguy:)13:59
sean-k-mooneyotherwiseguy: ok by the way we need to fix the packaging of the ovs python binding in centos/rhel13:59
sean-k-mooneyotherwiseguy: currently ovsdbapp depend on the ovs python bidnings which depend on libopenvswitch14:00
otherwiseguysean-k-mooney: yeah, the C extension for not-stupidly-slow json parsing.14:00
sean-k-mooneywhich is fine if it was not for the fact that in cenots libopenvswich is part of the main openvswithc package so we end up pulling in all of openvswitch and dpdk14:00
otherwiseguy(it's literally 60x slower)14:01
sean-k-mooneyright but my point is libopenvswtich shoudl be its own package14:01
sean-k-mooneysince that is all the python bindings needs14:01
sean-k-mooneyits one of the reason our continer image are so big14:02
opendevreviewMamatisa Nurmatov proposed openstack/neutron master: use payloads for PORT AFTER_UPDATE events  https://review.opendev.org/c/openstack/neutron/+/79511714:02
*** pjakuszew has quit IRC14:02
sean-k-mooneyinstalling nova-api on centos pulls in all of ovs and dpdk even though it does not use os-vif but that is in the requirement for python-nova, and since os-vif pulls in ovsdbapp ...14:03
*** pjakuszew has joined #openstack-neutron14:04
*** pjakuszew has quit IRC14:05
*** pjakuszew has joined #openstack-neutron14:05
otherwiseguysean-k-mooney: oh my. yeah, I assumed it was just requiring openvswitch-devel (which has the libs/headers). Requiring openvswitch itself is silly.14:08
sean-k-mooneywell it required libopenvswitch but its provided by the main openvswitch package14:09
*** pjakuszew is now known as cz3|work14:09
*** cz3|work is now known as pjakuszew14:10
sean-k-mooneythe only thing that really hurts is container size14:10
*** pjakuszew has left #openstack-neutron14:10
sean-k-mooneybut its still not ideal14:10
otherwiseguyin rhel openvswitch2.13 it looks like python3-openvswitch directly requires openvswitch: Requires: %{pkgname} = %{?epoch:%{epoch}:}%{version}-%{release}14:11
sean-k-mooneyew14:11
otherwiseguysean-k-mooney: so changes everywhere :D14:11
*** cz3|work has joined #openstack-neutron14:17
opendevreviewsean mooney proposed openstack/ovsdbapp master: always use real threads when polling ovs.  https://review.opendev.org/c/openstack/ovsdbapp/+/79578914:17
sean-k-mooneyok cool there is a tempest job on ovsdbapp ovsdbapp-neutron-ovn-tempest-ovs-release14:18
otherwiseguysean-k-mooney: if we un-patch threading, then all of the locking we do will block eventlet, right?14:19
sean-k-mooneyit would yes but tha that is inside the run method that would be on the new thread i think14:20
sean-k-mooneyah we lock in start and stop14:20
otherwiseguyWe basically lock around all run() calls.14:20
sean-k-mooneywell run should always be called in a sperate trhead14:21
sean-k-mooneyi could revert the loc to use the patched version but i dont think that woudl be correct14:21
sean-k-mooneywell14:22
sean-k-mooneysee its tricky14:22
sean-k-mooneywhen monkeypatched we want it to yeild when it trys to take the lock and cang with out yeilding the thread14:22
sean-k-mooneyjust the greentread14:23
sean-k-mooneymaybe we should use a filesystem lock14:23
otherwiseguythe fact that nothing happens even between those multiple state changes/run calls, makes me really wonder about what the problem is.14:23
otherwiseguyThe problem happening during reconnects makes sense, since that is the only time that it uses system poll.14:26
sean-k-mooneywell mocking the poller had the opicite effect i wanted but it is what you were expecting14:27
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/39c7eed869ce4035be359cb1f7a5b8f6/log/controller/logs/screen-n-cpu.txt#721214:27
otherwiseguyFor the normal poller we use in ovsdbapp, we have the safe _SelectSelect.14:27
sean-k-mooneyits cause it to have to reconnect every 5 seconds14:27
otherwiseguyBut in python-ovs, it uses get_system_poll() in the connection open/connect calls.14:27
otherwiseguyYeah, mocking poller would break everything. We'd not respond to the echo requests from the server and it would disconnect us.14:28
otherwiseguyHaving python-ovs's stream.py not use the system poller seems like it would be the fix for the issue at reconnect.14:28
sean-k-mooneyyes perhaps14:29
sean-k-mooneyalthoguh it raises the question as to why it need to reconnect14:29
otherwiseguytimeout too low, ovsdb-server is singlethreaded and dumb. probe intervals default to 5s which is too small. I have a patch that removes probe intervals in favor of tcp keepalives for the *client generated* echo requests.14:30
otherwiseguysean-k-mooney: https://review.opendev.org/c/openstack/neutron/+/79489214:30
otherwiseguybut there are also the server generated requests, which are configured in NB/SB Connection table as inactivity_probe14:31
sean-k-mooneyah i see we could port that to os-vif i guess but could we do this in ovsdbapp instead14:31
otherwiseguyMy thinking being that tcp keepalives at least won't have ovsdb-server as a bottleneck.14:32
otherwiseguysean-k-mooney: the biggest issue is that ovsdb-server was designed around the OVS switch usecase, which means small dbs and few connections. OVN on the other hand, large DBs with 100s of connections...a single threaded ovsdb server is ... not ideal.14:33
otherwiseguyIdeally that neutron patch would be in python-ovs and configurable. I didn't do it in ovsdbapp because I try to behave like a normal library and not change behavior in the library when I don't have to, but if it seems after running this for a while that it is definitely far superior, I'd be open to forcing it on people. ovsdbapp doesn't have config files, but I could make it a non-default option in releases.14:37
opendevreviewRodolfo Alonso proposed openstack/neutron master: [DNM] Test ovsdbapp patch  https://review.opendev.org/c/openstack/neutron/+/79582014:39
sean-k-mooneyotherwiseguy: well even if we could pass in a flag to the constotor of the connection for example to alther the behaivor we could opt into it14:44
sean-k-mooneyotherwiseguy: for now im not against porting the change you did in neutron to os-vif but i need to digest what you did first14:45
opendevreviewRodolfo Alonso proposed openstack/neutron master: Bump os-ken to 2.0.0  https://review.opendev.org/c/openstack/neutron/+/79373514:46
sean-k-mooneyotherwiseguy: am i correct in saying you are usign the no probes mixin to disable the echo and redefining the tcp and ssl stream calles by inheritng form the mixin and registering the new classes to be used by ovs stream module instead14:48
otherwiseguysean-k-mooney: basically a) set SO_KEEPALIVE to enable tcp keepalives (relying on system sysctl settings for the TCP_KEEPIDLE/KEEP_CNT/KEEPINTVL which in OSP we default to 5/1/1, but is 2 hours default in linux) b) disable sending probes via echo request (from the client)14:48
otherwiseguysean-k-mooney: Correct.14:48
sean-k-mooneyya ok we could try that in os-vif also although i am not sure that will help with the reconnect issue14:49
otherwiseguyIt might be a good idea to override with configurable TCP_KEEPIDLE, TCP_KEEPCNT, TCP_KEEPINTVL.14:49
sean-k-mooneyos-vif does have config options so we can at least make some of this configurable if we need too14:50
otherwiseguysean-k-mooney: it *could* help, if it's the client-side echo request at 5s default is what is causing it to disconnect and reconnect.14:50
otherwiseguyIt could also be the server-side echo requests, but neutron has those configured to much longer times than 5s.14:51
sean-k-mooneywell its at least worth a try14:51
sean-k-mooneydo you mind if i port it or would you like to do it14:52
otherwiseguySince we're seeing 5s timeouts, I'd put my money on it. With ovsdbapp you can also when you create the Idl, you can pass probe_interval={milliseconds} to increase the probe interval as well.14:53
otherwiseguys/ovsdbapp/ovs/14:55
otherwiseguyHere is how neutron does it: https://github.com/openstack/neutron/blob/a0cd7e76390c1f35aff5f95612cc0b7bcf0eaa9e/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L48114:55
sean-k-mooney ok14:55
sean-k-mooneyso the thing is though in os-vif we dont ever need to monitor the ovsdb14:56
sean-k-mooneyi was going to say we never read from the ovsdb and that is almost true14:56
otherwiseguysean-k-mooney: Feel free to port it. I'm happy to do it if you need me to as well. Yeah, the python-ovs Idl, last I checked, didn't support a write-only method, though the C idl does, I think.14:57
sean-k-mooneyi belive we have one check to see if the ovsdb support setting the mtu on a port14:57
sean-k-mooneythe main reason we keep the connection open today id to not need to generate teh client and do the schema verions stuff evey time we add or remvoe a port14:58
otherwiseguyBut the way OVS IDL stuff works mostly assumes that you have stuff in memory. The Command objects that ship with ovsdbapp do lookups by name, which would require in-memory copies of the rows, etc. So even if you think you aren't reading, you are.14:58
*** manub has quit IRC15:01
otherwiseguysean-k-mooney: In general, the IDL method of doing things kind of crappy (but better than the old way of calling the -ctl commands, which basically create a new IDL for every call). Maintaining an in-memory copy of the DB rows you need *per connection/worker* is gross. You can at least limit the tables/rows to only the ones you care about at least with register_table: https://github.com/openstack/neutron/blob/a0cd7e76390c1f35aff5f95612c15:04
otherwiseguyc0b7bcf0eaa9e/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L505 and cond_change(): https://github.com/openstack/neutron/blob/a0cd7e76390c1f35aff5f95612cc0b7bcf0eaa9e/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L71115:04
otherwiseguywhich can also cut down on how long reconnects take.15:05
sean-k-mooneyi see we porably shoudl do that so15:07
sean-k-mooney i think we just need either the interface table or ports table15:08
*** lucasagomes has quit IRC15:08
sean-k-mooneyi think ports have interfaces i always mix those up15:08
*** liuyulong has quit IRC15:12
*** isabek_ has quit IRC15:12
*** liuyulong has joined #openstack-neutron15:12
opendevreviewMerged openstack/neutron-lib master: Revert "Add network parameter to some of the ML2 API methods"  https://review.opendev.org/c/openstack/neutron-lib/+/79561715:15
otherwiseguysean-k-mooney: yeah, you just need to make sure that if you grab a table that has a refTable column that references a different table, to get that table too (at least if you ever reference that column--it *should* just display the uuid instead of a row when accessing the column)15:15
opendevreviewKamil Sambor proposed openstack/neutron master: Set trunk sub-port when bind profile is created  https://review.opendev.org/c/openstack/neutron/+/79533415:18
*** tkajinam has quit IRC15:19
otherwiseguysean-k-mooney: yeah, a quick fix would be https://github.com/openstack/os-vif/blob/d8af3568b8b92748f61029a96c46fd513b6795c2/vif_plug_ovs/ovsdb/impl_idl.py#L26 to remove the register_all() and do register_table() for what you need, and then in the next line pass probe_interval=60000 to idl.Idl and see if it helps.15:22
sean-k-mooneyok ill work that into may patches to os-vif15:23
sean-k-mooneyim on an internal team call but ill try an poc this later today15:24
otherwiseguyI'd bet it'll prevent some reconnects, though of course fixing the blocking during a reconnect would be the python-ovs socket_util.get_system_poll() call.15:24
otherwiseguyGreat! Let me know if there's anything you need.15:25
sean-k-mooneyare you going to follow up with ovs on the python-ovs change15:26
sean-k-mooneyone thing i coudl do in os-vif is monkey patch socket_util.get_system_poll but thats not great15:26
sean-k-mooneye.g. make it just return the safe select implemenation15:27
sean-k-mooneyi have a number of ways to proceed in anycase15:27
*** digitalsimboja has joined #openstack-neutron15:28
otherwiseguysean-k-mooney: yeah, it's something I should at least raise on the mailing list.15:36
otherwiseguyI was happy that they basically had a method to monkey-patch the TCP/SSLStream classes, so I didn't feel *quite* as bad.15:37
otherwiseguy"It's not my fault that you had a global dict and then a method that wrote to it that you didn't prefix with an underscore."15:37
opendevreviewSlawek Kaplonski proposed openstack/neutron master: DVR: Populate ARP entries of the allowed_address_pairs to the routers  https://review.opendev.org/c/openstack/neutron/+/60133615:38
*** digitalsimboja has quit IRC15:46
*** obondarev has quit IRC15:57
*** elvira has quit IRC15:57
*** ltomasbo has quit IRC16:01
opendevreviewRodolfo Alonso proposed openstack/neutron master: Populate self.floating_ips_dict using "ip rule" information  https://review.opendev.org/c/openstack/neutron/+/79460416:11
*** rpittau is now known as rpittau|afk16:24
opendevreviewMerged openstack/neutron master: ovn: Don't use dict.remove() for filtering dhcp ports in db-sync  https://review.opendev.org/c/openstack/neutron/+/78908516:29
opendevreviewMerged openstack/neutron master: Payloads for PORT: BEFORE_CREATE and PRECOMMIT_CREATE  https://review.opendev.org/c/openstack/neutron/+/79457916:31
opendevreviewBodo Petermann proposed openstack/neutron master: Config option to enable OVN IDL on other workers  https://review.opendev.org/c/openstack/neutron/+/79578116:41
opendevreviewsean mooney proposed openstack/ovsdbapp master: always use real threads when polling ovs.  https://review.opendev.org/c/openstack/ovsdbapp/+/79578916:47
opendevreviewMerged openstack/networking-ovn stable/train: Make networking-ovn-tripleo-ci-centos-7-containers-multinode non-voting  https://review.opendev.org/c/openstack/networking-ovn/+/79550016:51
*** jlibosva has quit IRC16:52
*** raildo has joined #openstack-neutron17:18
*** sean-k-mooney has quit IRC17:41
*** cz3|work is now known as pjakuszew18:08
opendevreviewMerged openstack/neutron master: [DHCP] Fix cleanup_deleted_ports method  https://review.opendev.org/c/openstack/neutron/+/79475518:13
opendevreviewMerged openstack/neutron master: Remove "mitogen" library installation from "docs" job  https://review.opendev.org/c/openstack/neutron/+/79337718:13
*** digitalsimboja has joined #openstack-neutron18:24
*** digitalsimboja has quit IRC18:52
*** sean-k-mooney has joined #openstack-neutron19:04
sean-k-mooneyotherwiseguy: ralonsoh its just one ci run but with https://review.opendev.org/c/openstack/ovsdbapp/+/795789 there are not reconnects19:05
sean-k-mooneyotherwiseguy: ralonsoh  but the next most recent ci run for ovsdbapp has reconnects19:07
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/9b751acd11664482a681705fa2c7047c/log/controller/logs/screen-n-cpu.txt19:07
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/9b751acd11664482a681705fa2c7047c/log/controller/logs/screen-n-cpu.txt#725219:07
sean-k-mooneyotherwiseguy: ralonsoh  so https://review.opendev.org/c/openstack/ovsdbapp/+/795789 does seam to fix that issue19:08
sean-k-mooneynow that i have udpated it to also cater for the fact that neutron-api runs under wsgi and while eventlest is avaiable it not patchs19:09
sean-k-mooneywhich is why it failed in the first version19:09
sean-k-mooneyim still going to explore the os-vif optimisations but this seams to prevent the reconnects19:09
*** digitalsimboja has joined #openstack-neutron19:10
sean-k-mooneyi would guess previously te echos were not replied to in time because nova was bussy doing things and eventlet and the ovs poller was sharing a single os thread now they are not19:11
*** digitalsimboja has quit IRC19:38
*** supamatt has joined #openstack-neutron20:02
*** spatel has quit IRC20:06
*** lajoskatona1 has quit IRC20:07
*** TMM has quit IRC20:25
*** TMM has joined #openstack-neutron20:25
*** spatel has joined #openstack-neutron20:35
opendevreviewBrian Haley proposed openstack/ovn-octavia-provider master: Support creating members without a subnet ID  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/79589620:45
*** spatel has quit IRC20:52
*** ralonsoh has quit IRC20:56
opendevreviewBrian Haley proposed openstack/ovn-octavia-provider master: Add Health Monitor support  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/71325321:35
opendevreviewBrian Haley proposed openstack/ovn-octavia-provider master: Support creating members without a subnet ID  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/79589621:45
*** spatel has joined #openstack-neutron21:47
opendevreviewMerged openstack/neutron master: Mark fullstack test_l2_agent_restart as unstable  https://review.opendev.org/c/openstack/neutron/+/79422821:53
*** spatel has quit IRC22:06
*** yamamoto has quit IRC22:22
*** yamamoto has joined #openstack-neutron22:33

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!