*** spandhe has quit IRC | 00:03 | |
*** SridharG has quit IRC | 00:04 | |
*** jorgem has quit IRC | 00:11 | |
*** matsuhashi has joined #openstack-neutron | 00:20 | |
openstackgerrit | Dane LeBlanc proposed a change to openstack/neutron: Cisco Nexus: maximum recursion error in ConnectionContext.__del__ https://review.openstack.org/80737 | 00:20 |
---|---|---|
*** mwagner_lap has joined #openstack-neutron | 00:21 | |
openstackgerrit | Dane LeBlanc proposed a change to openstack/neutron: Cisco plugin fails with ParseError no elem found https://review.openstack.org/81304 | 00:21 |
*** leseb_ has joined #openstack-neutron | 00:22 | |
openstackgerrit | Evgeny Fedoruk proposed a change to openstack/neutron: Cancelling thread start while unit tests running https://review.openstack.org/81323 | 00:23 |
*** ijw has joined #openstack-neutron | 00:23 | |
*** zhipeng has quit IRC | 00:24 | |
*** SumitNaiksatam has quit IRC | 00:24 | |
*** alagalah has quit IRC | 00:25 | |
*** hogepodge has joined #openstack-neutron | 00:25 | |
hogepodge | I have a python neutron client question. | 00:25 |
*** ijw_ has joined #openstack-neutron | 00:26 | |
hogepodge | What input is the client expecting for this command: /usr/bin/neutron subnet-update 192.168.22.0/24 --host-routes list=true {'destination': '10.10.10.10/24','nexthop': '8.8.8.8'} 192.168.22.0/24 | 00:26 |
hogepodge | It returns Invalid input for host_routes. Reason: Invalid input. '{'destination': '10.10.10.10/24','nexthop': '8.8.8.8'}' must be a dictionary with keys: ['destination', 'nexthop']. | 00:27 |
hogepodge | I can't for the life of me figure out how to enter a dictionary on the command line. | 00:27 |
*** leseb_ has quit IRC | 00:27 | |
*** yfried1 has quit IRC | 00:27 | |
openstackgerrit | Salvatore Orlando proposed a change to openstack/neutron: NSX plugin: return 400 for invalid gw certificate https://review.openstack.org/80948 | 00:28 |
*** ijw has quit IRC | 00:29 | |
oda-g | hogepodge: --host_routes type=dict list=true destination=10.10.10.10/24,nexthop=8.8.8.8 destination=11.11.11.11/24,nexthop=9.9.9.9 | 00:33 |
hogepodge | Thanks! oda-g | 00:35 |
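The `type=dict list=true` form above passes each route as a string of comma-separated `key=value` pairs rather than as a Python literal. A minimal sketch of how such arguments map onto the `host_routes` list in a subnet update body; `parse_host_routes` is a hypothetical helper written for illustration, not the client's own parser:

```python
def parse_host_routes(args):
    """Turn 'destination=CIDR,nexthop=IP' strings into host_routes dicts.

    Hypothetical helper for illustration; python-neutronclient has its
    own argument parsing.
    """
    routes = []
    for arg in args:
        # Each argument is a comma-separated list of key=value pairs.
        route = dict(pair.split("=", 1) for pair in arg.split(","))
        missing = {"destination", "nexthop"} - route.keys()
        if missing:
            raise ValueError(f"missing keys {sorted(missing)} in {arg!r}")
        routes.append(route)
    return routes


routes = parse_host_routes([
    "destination=10.10.10.10/24,nexthop=8.8.8.8",
    "destination=11.11.11.11/24,nexthop=9.9.9.9",
])
# Shape of the subnet update body that carries the routes.
body = {"subnet": {"host_routes": routes}}
print(body)
```

Each route ends up as a dictionary with exactly the two keys the server asks for, and a misspelled key is rejected before anything is sent.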
*** manishg has quit IRC | 00:35 | |
*** dvorkini_ has quit IRC | 00:37 | |
*** overlayer has quit IRC | 00:39 | |
*** samuelbercovici has quit IRC | 00:40 | |
*** blogan has quit IRC | 00:40 | |
*** thuc has quit IRC | 00:42 | |
*** thuc has joined #openstack-neutron | 00:42 | |
*** blogan_ has joined #openstack-neutron | 00:46 | |
*** thuc has quit IRC | 00:47 | |
*** yamahata has quit IRC | 00:47 | |
*** devlaps has quit IRC | 00:52 | |
*** mwagner__ has joined #openstack-neutron | 00:56 | |
*** spandhe has joined #openstack-neutron | 00:56 | |
*** dguitarbite has quit IRC | 01:09 | |
*** bada has quit IRC | 01:25 | |
*** bada has joined #openstack-neutron | 01:26 | |
*** alagalah has joined #openstack-neutron | 01:35 | |
*** sfox has joined #openstack-neutron | 01:36 | |
openstackgerrit | Akihiro Motoki proposed a change to openstack/python-neutronclient: Work around pypy testing issue https://review.openstack.org/81413 | 01:38 |
*** amotoki has joined #openstack-neutron | 01:38 | |
*** xuhanp has joined #openstack-neutron | 01:38 | |
*** _cjones_ has quit IRC | 01:39 | |
*** _cjones_ has joined #openstack-neutron | 01:39 | |
*** alagalah has quit IRC | 01:40 | |
*** blogan_ has quit IRC | 01:41 | |
*** _cjones_ has quit IRC | 01:44 | |
oda-g | amotoki: ping | 01:45 |
*** ijw_ has quit IRC | 01:45 | |
*** thuc has joined #openstack-neutron | 01:52 | |
*** thuc_ has joined #openstack-neutron | 01:57 | |
*** dkehn has quit IRC | 01:58 | |
*** thuc has quit IRC | 01:59 | |
*** dave_tucker is now known as dave_tucker_zzz | 02:03 | |
*** sfox has quit IRC | 02:06 | |
*** dkehn has joined #openstack-neutron | 02:07 | |
*** yamahata has joined #openstack-neutron | 02:07 | |
openstackgerrit | Abhishek Raut proposed a change to openstack/neutron: Add missing ondelete option to Cisco N1kv tables https://review.openstack.org/81419 | 02:11 |
*** thuc has joined #openstack-neutron | 02:11 | |
*** thuc_ has quit IRC | 02:12 | |
*** thuc_ has joined #openstack-neutron | 02:12 | |
amotoki | oda-g: pong | 02:14 |
oda-g | amotoki: would you pick up to review https://review.openstack.org/#/c/79858/ which needs one more +2 ? | 02:15 |
*** thuc_ has quit IRC | 02:15 | |
*** thuc has quit IRC | 02:16 | |
*** thuc has joined #openstack-neutron | 02:16 | |
oda-g | amotoki: generally speaking, when a fix is not reviewed for a long time, what should I do ? | 02:16 |
*** sweston has quit IRC | 02:17 | |
amotoki | oda-g: just looked at it. I have one question. | 02:21 |
amotoki | oda-g: can't metaplugin terminate RPC call and redispatch it to each plugin? I am not sure we can. | 02:21 |
oda-g | amotoki: it is technically possible, but it would need a large amount of code. | 02:23 |
*** thuc_ has joined #openstack-neutron | 02:24 | |
*** sfox has joined #openstack-neutron | 02:26 | |
amotoki | oda-g: I am not sure what "many environment" means. Do you assume a case where one plugin is an agent-based plugin and the other is a controller-based plugin? | 02:26 |
amotoki | oda-g: if so this looks reasonable. | 02:26 |
oda-g | amotoki: and we would need to modify each plugin so that it does not consume the RPC queue. | 02:27 |
*** thuc has quit IRC | 02:27 | |
*** sbalukoff has quit IRC | 02:28 | |
*** shakamunyi has joined #openstack-neutron | 02:28 | |
amotoki | oda-g: I understand it. to do so, we need to take care of sending RPC from plugin to agent. | 02:28 |
*** devlaps has joined #openstack-neutron | 02:29 | |
*** thuc_ has quit IRC | 02:29 | |
amotoki | oda-g: At least "many env" is a bit confusing. I would suggest describing more precisely in which cases this solution works. | 02:29 |
amotoki | oda-g: i will add a comment on your review. | 02:29 |
*** singhs has joined #openstack-neutron | 02:30 | |
oda-g | amotoki: "many environment": yes, the case you mentioned is one example. | 02:30 |
oda-g | amotoki: thanks for your review. | 02:31 |
openstackgerrit | Ian Wienand proposed a change to openstack/neutron: Record and log reason for dhcp agent resync https://review.openstack.org/81173 | 02:32 |
*** sfox has quit IRC | 02:34 | |
*** chandan_kumar has joined #openstack-neutron | 02:37 | |
*** coolsvap has quit IRC | 02:39 | |
*** SridharG has joined #openstack-neutron | 02:43 | |
*** suresh12 has quit IRC | 02:43 | |
*** markwash has quit IRC | 02:45 | |
*** Jianyong has joined #openstack-neutron | 02:48 | |
openstackgerrit | Xu Han Peng proposed a change to openstack/neutron: Permit ICMPv6 RAs only from known routers https://review.openstack.org/72252 | 02:51 |
*** jecarey has joined #openstack-neutron | 02:52 | |
*** gdubreui has quit IRC | 02:53 | |
openstackgerrit | Akihiro Motoki proposed a change to openstack/neutron: Mock agent RPC for FWaaS tests to delete DB objs https://review.openstack.org/78457 | 02:58 |
openstackgerrit | Akihiro Motoki proposed a change to openstack/neutron: Ensure to count firewalls in target tenant https://review.openstack.org/80715 | 02:58 |
*** dguitarbite has joined #openstack-neutron | 03:08 | |
*** thuc has joined #openstack-neutron | 03:12 | |
*** matsuhashi has quit IRC | 03:13 | |
*** changbl has joined #openstack-neutron | 03:17 | |
*** sweston has joined #openstack-neutron | 03:17 | |
*** matsuhashi has joined #openstack-neutron | 03:20 | |
*** SumitNaiksatam has joined #openstack-neutron | 03:22 | |
*** sfox has joined #openstack-neutron | 03:23 | |
*** harlowja is now known as harlowja_away | 03:23 | |
*** matsuhashi has quit IRC | 03:25 | |
*** jecarey has quit IRC | 03:26 | |
*** yfried has joined #openstack-neutron | 03:26 | |
*** sfox1 has joined #openstack-neutron | 03:29 | |
*** devlaps has quit IRC | 03:30 | |
*** sfox has quit IRC | 03:32 | |
openstackgerrit | oda-g proposed a change to openstack/neutron: Enable to select an RPC handling plugin under Metaplugin https://review.openstack.org/79858 | 03:34 |
*** thuc has quit IRC | 03:35 | |
*** thuc has joined #openstack-neutron | 03:35 | |
*** thuc has quit IRC | 03:40 | |
*** chandan_kumar has quit IRC | 03:41 | |
*** SridharG has quit IRC | 03:43 | |
*** spandhe has quit IRC | 03:45 | |
*** spandhe has joined #openstack-neutron | 03:47 | |
*** spandhe has quit IRC | 03:48 | |
*** banix has joined #openstack-neutron | 03:50 | |
*** spandhe has joined #openstack-neutron | 03:51 | |
*** dguitarbite has quit IRC | 03:52 | |
*** spandhe has quit IRC | 03:54 | |
*** vkozhukalov_ has joined #openstack-neutron | 03:55 | |
*** matrohon has quit IRC | 03:58 | |
*** matrohon has joined #openstack-neutron | 03:59 | |
*** ramishra has joined #openstack-neutron | 04:00 | |
*** thuc has joined #openstack-neutron | 04:03 | |
openstackgerrit | shihanzhang proposed a change to openstack/neutron: Prevent dhcp port deletion from the API https://review.openstack.org/73802 | 04:08 |
*** blogan has joined #openstack-neutron | 04:18 | |
*** matsuhashi has joined #openstack-neutron | 04:23 | |
*** _cjones_ has joined #openstack-neutron | 04:31 | |
*** _cjones_ has quit IRC | 04:35 | |
*** banix has quit IRC | 04:36 | |
*** thuc has quit IRC | 04:37 | |
*** thuc has joined #openstack-neutron | 04:38 | |
*** blogan has quit IRC | 04:39 | |
*** thuc has quit IRC | 04:42 | |
*** networkstatic has quit IRC | 04:48 | |
*** sbalukoff has joined #openstack-neutron | 04:48 | |
*** dguitarbite has joined #openstack-neutron | 04:57 | |
openstackgerrit | fumihiko kakuma proposed a change to openstack/neutron: OFA agent: use hexadecimal IP address in tunnel port name https://review.openstack.org/81436 | 04:57 |
*** arosen1 has quit IRC | 05:01 | |
*** _cjones_ has joined #openstack-neutron | 05:03 | |
openstackgerrit | A change was merged to openstack/neutron: Kill 'Skipping unknown group key: firewall_driver' log trace https://review.openstack.org/80379 | 05:04 |
*** _cjones_ has quit IRC | 05:10 | |
*** sbalukoff1 has joined #openstack-neutron | 05:20 | |
*** sbalukoff has quit IRC | 05:20 | |
*** arosen1 has joined #openstack-neutron | 05:21 | |
*** xianghui has joined #openstack-neutron | 05:26 | |
*** yfried has quit IRC | 05:30 | |
openstackgerrit | Akihiro Motoki proposed a change to openstack/python-neutronclient: Support packet_filter extension in NEC plugin https://review.openstack.org/49869 | 05:32 |
*** irenab has joined #openstack-neutron | 05:33 | |
*** dguitarbite has quit IRC | 05:40 | |
*** thuc has joined #openstack-neutron | 05:48 | |
*** pradipta_away is now known as pradipta | 05:51 | |
*** thuc has quit IRC | 05:53 | |
*** tomoe_ has joined #openstack-neutron | 05:53 | |
*** saju_m has joined #openstack-neutron | 05:58 | |
*** matsuhashi has quit IRC | 05:59 | |
*** matsuhashi has joined #openstack-neutron | 06:01 | |
*** nati_ueno has joined #openstack-neutron | 06:03 | |
*** ramishra has quit IRC | 06:03 | |
*** _cjones_ has joined #openstack-neutron | 06:04 | |
*** Jabadia has joined #openstack-neutron | 06:07 | |
*** gdubreui has joined #openstack-neutron | 06:08 | |
*** _cjones_ has quit IRC | 06:09 | |
*** SridharG has joined #openstack-neutron | 06:10 | |
*** ramishra has joined #openstack-neutron | 06:12 | |
*** tomoe_ has quit IRC | 06:12 | |
oda-g | amotoki: I fixed the commit message of https://review.openstack.org/#/c/79858/ please check. | 06:13 |
openstackgerrit | fumihiko kakuma proposed a change to openstack/neutron: OFA agent: use hexadecimal IP address in tunnel port name https://review.openstack.org/81436 | 06:15 |
*** arosen1 has quit IRC | 06:15 | |
openstackgerrit | shihanzhang proposed a change to openstack/neutron: Prevent dhcp port deletion from the API https://review.openstack.org/73802 | 06:16 |
openstackgerrit | Kevin Benton proposed a change to openstack/neutron: Allow CIDRs with non-zero masked portions https://review.openstack.org/81137 | 06:26 |
*** nati_uen_ has joined #openstack-neutron | 06:28 | |
*** nati_ueno has quit IRC | 06:31 | |
*** yfried has joined #openstack-neutron | 06:32 | |
*** gongysh has joined #openstack-neutron | 06:33 | |
*** nati_uen_ has quit IRC | 06:38 | |
*** arosen1 has joined #openstack-neutron | 06:40 | |
openstackgerrit | fumihiko kakuma proposed a change to openstack/neutron: Fix the "OVS agent loop slowdown" problem for OFAgent https://review.openstack.org/80864 | 06:44 |
openstackgerrit | gongysh proposed a change to openstack/neutron: return false or true according to binding result https://review.openstack.org/81457 | 06:45 |
*** matsuhashi has quit IRC | 06:47 | |
openstackgerrit | liusheng proposed a change to openstack/neutron: Remove vi modelines https://review.openstack.org/80213 | 06:48 |
*** gdubreui has quit IRC | 06:48 | |
*** tomoe_ has joined #openstack-neutron | 06:49 | |
*** matsuhashi has joined #openstack-neutron | 06:50 | |
*** amritanshu_RnD has joined #openstack-neutron | 06:50 | |
*** amritanshu_RnD is now known as Guest72277 | 06:50 | |
*** irenab has quit IRC | 06:51 | |
*** ramishra has quit IRC | 06:52 | |
*** _cjones_ has joined #openstack-neutron | 06:52 | |
*** _cjones_ has quit IRC | 06:56 | |
*** _cjones_ has joined #openstack-neutron | 06:56 | |
*** singhs has quit IRC | 07:00 | |
*** irenab has joined #openstack-neutron | 07:00 | |
*** _cjones_ has quit IRC | 07:01 | |
*** gongysh has quit IRC | 07:04 | |
*** gongysh has joined #openstack-neutron | 07:04 | |
*** oda-g has quit IRC | 07:07 | |
*** arosen1 has quit IRC | 07:13 | |
*** Akshik_ has joined #openstack-neutron | 07:14 | |
Akshik_ | is there a troubleshooting guide for fixing VM IP reachability issues with neutron? | 07:14 |
*** ramishra has joined #openstack-neutron | 07:15 | |
*** arosen1 has joined #openstack-neutron | 07:19 | |
*** bada has quit IRC | 07:25 | |
*** bada has joined #openstack-neutron | 07:26 | |
gongysh | Akshik: are u using openvswitch? | 07:26 |
*** ramishra has quit IRC | 07:29 | |
*** matsuhashi has quit IRC | 07:30 | |
*** gongysh has quit IRC | 07:30 | |
*** matsuhashi has joined #openstack-neutron | 07:31 | |
*** gongysh has joined #openstack-neutron | 07:31 | |
*** carl_baldwin has joined #openstack-neutron | 07:36 | |
*** morganfainberg is now known as morganfainberg_Z | 07:43 | |
*** arosen1 has quit IRC | 07:44 | |
*** tomoe_ has quit IRC | 07:47 | |
openstackgerrit | Xu Han Peng proposed a change to openstack/python-neutronclient: Create new IPv6 attributes for Subnets by client https://review.openstack.org/75871 | 07:48 |
*** ramishra has joined #openstack-neutron | 07:52 | |
*** nati_ueno has joined #openstack-neutron | 07:55 | |
openstackgerrit | Nachi Ueno proposed a change to openstack/neutron: Add enable_security_group option https://review.openstack.org/67281 | 07:55 |
*** tomoe_ has joined #openstack-neutron | 07:57 | |
*** ajo has joined #openstack-neutron | 08:00 | |
*** gongysh has quit IRC | 08:01 | |
*** gongysh has joined #openstack-neutron | 08:01 | |
*** ajo has quit IRC | 08:04 | |
*** ajo has joined #openstack-neutron | 08:04 | |
*** shakamunyi has quit IRC | 08:07 | |
*** tomoe_ has quit IRC | 08:07 | |
openstackgerrit | Berezovsky Irena proposed a change to openstack/neutron: Add update binding:profile with physical_network https://review.openstack.org/81281 | 08:09 |
openstackgerrit | Akihiro Motoki proposed a change to openstack/neutron: NEC plugin: Honor Retry-After response from OFC https://review.openstack.org/81472 | 08:10 |
*** leseb_ has joined #openstack-neutron | 08:15 | |
openstackgerrit | gongysh proposed a change to openstack/neutron: use floatingip's ID as key instead of itself https://review.openstack.org/81474 | 08:17 |
*** jgallard has joined #openstack-neutron | 08:21 | |
*** leseb_ has quit IRC | 08:22 | |
*** leseb_ has joined #openstack-neutron | 08:22 | |
openstackgerrit | Darragh O'Reilly proposed a change to openstack/neutron: lb-agent: reduce delete_vlan_bridge() logging severity https://review.openstack.org/75993 | 08:24 |
*** leseb_ has quit IRC | 08:27 | |
*** tomoe_ has joined #openstack-neutron | 08:29 | |
*** gongysh has quit IRC | 08:32 | |
*** gongysh has joined #openstack-neutron | 08:33 | |
*** shakamunyi has joined #openstack-neutron | 08:33 | |
openstackgerrit | Zang MingJie proposed a change to openstack/neutron: Ensure host_routes order in subnet https://review.openstack.org/78946 | 08:35 |
*** chandankumar_ has quit IRC | 08:37 | |
*** shakamunyi has quit IRC | 08:38 | |
openstackgerrit | Oleg Bondarev proposed a change to openstack/neutron: Sync excutils from oslo https://review.openstack.org/81480 | 08:39 |
*** chandan_kumar has joined #openstack-neutron | 08:39 | |
*** tomoe_ has quit IRC | 08:41 | |
*** jlibosva has joined #openstack-neutron | 08:44 | |
*** jlibosva has quit IRC | 08:45 | |
*** tomoe_ has joined #openstack-neutron | 08:46 | |
*** djoreilly has joined #openstack-neutron | 08:50 | |
*** yamahata has quit IRC | 08:51 | |
Akshik_ | gongysh: yes I'm using openvswitch only | 08:51 |
*** vkozhukalov_ has quit IRC | 08:55 | |
*** tomoe_ has quit IRC | 08:55 | |
gongysh | Akshik: first make sure the GRE is passing through ok if you are using GRE | 08:56 |
*** afazekas has joined #openstack-neutron | 08:56 | |
gongysh | Akshik: then make sure all OVS tags and ofports are configured correctly: no OVS ports are tagged 4095, and OVS ports starting with 'tap' and 'qr-' must have ofport set | 08:58 |
gongysh | Akshik: then make sure each qdhcp namespace has exactly one interface besides 'lo'. | 08:59 |
gongysh | Akshik: then check that dnsmasq is running | 08:59 |
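The tag and ofport checks above can be sketched as a small filter over port records. This is only an illustration: the dictionary shape and the sample ports are hypothetical (in practice the values would come from inspecting the OVS database), and treating a missing or non-positive `ofport` as "not set" is an assumption.

```python
DEAD_VLAN_TAG = 4095  # the tag value the checklist says no port should carry


def find_suspect_ports(ports):
    """Return (port name, reason) pairs for ports that fail the checks."""
    problems = []
    for port in ports:
        name = port["name"]
        if port.get("tag") == DEAD_VLAN_TAG:
            problems.append((name, "tagged 4095"))
        ofport = port.get("ofport")
        # Assumption: a missing or non-positive ofport means it is not set.
        if name.startswith(("tap", "qr-")) and (ofport is None or ofport <= 0):
            problems.append((name, "ofport not set"))
    return problems


# Hypothetical sample data, not real output from any tool.
sample_ports = [
    {"name": "tap1a2b3c4d-5e", "tag": 1, "ofport": 7},
    {"name": "tap9f8e7d6c-5b", "tag": 4095, "ofport": 8},
    {"name": "qr-0a1b2c3d-4e", "tag": 2, "ofport": -1},
]
for name, reason in find_suspect_ports(sample_ports):
    print(name, reason)
```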
*** safchain has joined #openstack-neutron | 09:01 | |
openstackgerrit | Oleg Bondarev proposed a change to openstack/neutron: Sync excutils from oslo https://review.openstack.org/81480 | 09:03 |
*** markmcclain has joined #openstack-neutron | 09:03 | |
*** jlibosva has joined #openstack-neutron | 09:04 | |
obondarev | amotoki: ping | 09:06 |
*** ygbo has joined #openstack-neutron | 09:06 | |
amotoki | obondarev: pong | 09:07 |
obondarev | amotoki: fixed commit message in https://review.openstack.org/#/c/81480/ Can you please check? | 09:07 |
obondarev | amotoki: Thanks! | 09:07 |
amotoki | obondarev: i just checked it and +2'ed :-) | 09:07 |
obondarev | amotoki: yeah :) | 09:08 |
amotoki | obondarev: we can clean up log messages now :) | 09:08 |
obondarev | amotoki: what do you mean? | 09:08 |
amotoki | obondarev: what I mean is that some save_and_reraise usages need to be updated to pass reraise=False in their constructors | 09:10 |
amotoki | to avoid the unintended log message (about dropping the original exception). | 09:11 |
obondarev | amotoki: ah, gotcha. Yes, this is what I'm going to do next | 09:11 |
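To illustrate the point about `reraise=False`: the class below is a simplified stand-in written for this sketch, not the oslo `excutils` code. It assumes the behaviour described above, where the "original exception being dropped" message is only logged when the context manager still intended to re-raise. `SaveAndReraise` and `translate_error` are hypothetical names.

```python
import logging
import sys

LOG = logging.getLogger("save_and_reraise_sketch")


class SaveAndReraise:
    """Simplified stand-in for a save-and-reraise context manager."""

    def __init__(self, reraise=True):
        self.reraise = reraise

    def __enter__(self):
        # Remember the exception currently being handled.
        self.saved = sys.exc_info()[1]
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is not None:
            # A new exception escaped the block and replaces the saved one.
            if self.reraise:
                LOG.error("Original exception being dropped: %r", self.saved)
            return False
        if self.reraise:
            raise self.saved
        return False


def translate_error(reraise):
    """Handle a failure by raising a different, deliberate exception."""
    try:
        raise KeyError("port not found")
    except KeyError:
        with SaveAndReraise(reraise=reraise):
            raise ValueError("invalid port")
```

Calling `translate_error(True)` logs the dropped `KeyError` even though replacing it was intentional; `translate_error(False)` raises the same `ValueError` without that log line.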
obondarev | amotoki: for some reason I read 'log messages' as 'commit messages' in your message :) | 09:13 |
*** roeyc has joined #openstack-neutron | 09:13 | |
*** carl_baldwin has quit IRC | 09:14 | |
*** tomoe_ has joined #openstack-neutron | 09:19 | |
*** gongysh has quit IRC | 09:22 | |
openstackgerrit | Ihar Hrachyshka proposed a change to openstack/neutron: Synced rpc and gettextutils modules from oslo-incubator https://review.openstack.org/80998 | 09:23 |
*** gongysh has joined #openstack-neutron | 09:23 | |
*** jgallard has quit IRC | 09:23 | |
*** jgallard has joined #openstack-neutron | 09:24 | |
*** amotoki has quit IRC | 09:26 | |
*** tomoe_ has quit IRC | 09:29 | |
*** leseb_ has joined #openstack-neutron | 09:33 | |
*** nati_ueno has quit IRC | 09:33 | |
*** nati_ueno has joined #openstack-neutron | 09:34 | |
*** shakamunyi has joined #openstack-neutron | 09:34 | |
*** leseb_ has quit IRC | 09:37 | |
*** jp_at_hp has joined #openstack-neutron | 09:38 | |
*** shakamunyi has quit IRC | 09:38 | |
*** nati_ueno has quit IRC | 09:38 | |
*** gongysh has quit IRC | 09:41 | |
*** tomoe_ has joined #openstack-neutron | 09:44 | |
*** nati_ueno has joined #openstack-neutron | 09:47 | |
*** overlayer has joined #openstack-neutron | 09:49 | |
openstackgerrit | A change was merged to openstack/neutron: Stop removing ip allocations on port delete https://review.openstack.org/81196 | 09:52 |
*** Jabadia_ has joined #openstack-neutron | 09:54 | |
*** Matt2 has quit IRC | 09:55 | |
*** Jabadia has quit IRC | 09:57 | |
openstackgerrit | Claudiu Belu proposed a change to openstack/neutron: Fixes the Hyper-V agent individual ports metrics https://review.openstack.org/78795 | 09:57 |
*** tomoe_ has quit IRC | 10:02 | |
openstackgerrit | jun xie proposed a change to openstack/neutron: Use different name for the same constraint https://review.openstack.org/81488 | 10:07 |
*** roeyc has quit IRC | 10:09 | |
openstackgerrit | jun xie proposed a change to openstack/neutron: Use different name for the same constraint https://review.openstack.org/81488 | 10:10 |
*** pcm_ has joined #openstack-neutron | 10:12 | |
*** pcm_ has quit IRC | 10:13 | |
*** tomoe_ has joined #openstack-neutron | 10:14 | |
*** pcm_ has joined #openstack-neutron | 10:14 | |
*** ramishra has quit IRC | 10:21 | |
*** ramishra has joined #openstack-neutron | 10:22 | |
*** ramishra has quit IRC | 10:22 | |
*** Jianyong has quit IRC | 10:26 | |
*** gdubreui has joined #openstack-neutron | 10:33 | |
*** vkozhukalov_ has joined #openstack-neutron | 10:33 | |
*** shakamunyi has joined #openstack-neutron | 10:35 | |
*** shakamunyi has quit IRC | 10:39 | |
*** ramishra has joined #openstack-neutron | 10:42 | |
obondarev | amotoki: ping | 10:46 |
*** rotbeard has joined #openstack-neutron | 10:50 | |
*** bada_ has joined #openstack-neutron | 10:58 | |
*** bada has quit IRC | 11:00 | |
markmcclain | sdague: this merged about an hour ago https://review.openstack.org/81196 should address top bug in gate | 11:02 |
sdague | markmcclain: great | 11:02 |
markmcclain | I am concerned about 1248757 | 11:05 |
markmcclain | sdague: ^ has seen a sharp increase in last 24hrs | 11:06 |
*** nati_ueno has quit IRC | 11:06 | |
*** ramishra has quit IRC | 11:07 | |
*** baoli has quit IRC | 11:08 | |
*** baoli has joined #openstack-neutron | 11:13 | |
*** baoli has quit IRC | 11:14 | |
*** tomoe_ has quit IRC | 11:15 | |
openstackgerrit | Li Ma proposed a change to openstack/neutron: Race condition of L3-agent to add/remove routers https://review.openstack.org/73234 | 11:16 |
openstackgerrit | Li Ma proposed a change to openstack/neutron: Race condition of L3-agent to add/remove routers https://review.openstack.org/73234 | 11:18 |
*** jgallard has quit IRC | 11:18 | |
obondarev | markmcclain: per Salvatore's comment on 1248757 it has a chance to be fixed with https://review.openstack.org/81196 as well. | 11:19 |
markmcclain | obondarev: right just concerned that it reappeared | 11:19 |
*** matsuhashi has quit IRC | 11:22 | |
*** matsuhashi has joined #openstack-neutron | 11:24 | |
*** Jabadia has joined #openstack-neutron | 11:31 | |
*** Jabadia_ has quit IRC | 11:31 | |
*** Matt2 has joined #openstack-neutron | 11:31 | |
*** Jabadia has quit IRC | 11:34 | |
*** safchain has quit IRC | 11:35 | |
*** Jabadia has joined #openstack-neutron | 11:35 | |
*** shakamunyi has joined #openstack-neutron | 11:36 | |
*** markmcclain has quit IRC | 11:36 | |
*** matsuhashi has quit IRC | 11:37 | |
*** shakamunyi has quit IRC | 11:40 | |
*** Jabadia has quit IRC | 11:40 | |
openstackgerrit | Li Ma proposed a change to openstack/neutron: Race condition of L3-agent to add/remove routers https://review.openstack.org/73234 | 11:40 |
*** Jabadia has joined #openstack-neutron | 11:41 | |
*** Jabadia_ has joined #openstack-neutron | 11:41 | |
*** xuhanp has quit IRC | 11:42 | |
*** Jabadia has quit IRC | 11:45 | |
*** yfried has quit IRC | 11:46 | |
*** rook has joined #openstack-neutron | 11:46 | |
*** overlayer has quit IRC | 11:53 | |
*** yfried has joined #openstack-neutron | 12:03 | |
openstackgerrit | Li Ma proposed a change to openstack/neutron: Race condition of L3-agent to add/remove routers https://review.openstack.org/73234 | 12:07 |
*** roeyc has joined #openstack-neutron | 12:11 | |
*** Guest72277 has quit IRC | 12:12 | |
*** Atti has joined #openstack-neutron | 12:14 | |
*** pradipta is now known as pradipta_away | 12:14 | |
sdague | so that fingerprint is actually kind of terrible, because it's not the issue | 12:19 |
openstackgerrit | Paul Michali proposed a change to openstack/neutron: Cisco VPN device driver post-merge cleanup https://review.openstack.org/80062 | 12:19 |
*** salv-orlando has quit IRC | 12:21 | |
*** chandan_kumar has quit IRC | 12:23 | |
*** gdubreui has quit IRC | 12:23 | |
*** Akshik_ has quit IRC | 12:26 | |
*** Jabadia has joined #openstack-neutron | 12:29 | |
*** Jabadia_ has quit IRC | 12:29 | |
*** jecarey has joined #openstack-neutron | 12:33 | |
*** leseb_ has joined #openstack-neutron | 12:33 | |
*** shakamunyi has joined #openstack-neutron | 12:36 | |
*** Jabadia has quit IRC | 12:38 | |
*** Jabadia has joined #openstack-neutron | 12:39 | |
*** leseb_ has quit IRC | 12:40 | |
*** leseb_ has joined #openstack-neutron | 12:41 | |
*** shakamunyi has quit IRC | 12:41 | |
*** leseb_ has quit IRC | 12:42 | |
*** leseb_ has joined #openstack-neutron | 12:42 | |
*** Jabadia has quit IRC | 12:43 | |
*** Jabadia has joined #openstack-neutron | 12:44 | |
*** xianghui has quit IRC | 12:46 | |
*** subhranshu has joined #openstack-neutron | 12:48 | |
subhranshu | Hi team, need some help in neutron scalability | 12:49 |
*** amuller has joined #openstack-neutron | 12:50 | |
subhranshu | We are planning to deploy 1000 VMs with FWaaS and LBaaS using Neutron and are not sure how to calculate how many neutron network nodes we would need to start with | 12:50 |
*** dims has quit IRC | 12:50 | |
subhranshu | help on this will be highly appreciated | 12:50 |
*** chandan_kumar has joined #openstack-neutron | 12:56 | |
*** thuc has joined #openstack-neutron | 12:57 | |
*** thuc_ has joined #openstack-neutron | 12:58 | |
*** tomoe_ has joined #openstack-neutron | 13:00 | |
*** thuc has quit IRC | 13:01 | |
subhranshu | Does anyone have any idea how we can do calculations for neutron scaling? | 13:02 |
*** baoli has joined #openstack-neutron | 13:04 | |
*** salv-orlando has joined #openstack-neutron | 13:11 | |
*** julim has joined #openstack-neutron | 13:12 | |
*** safchain has joined #openstack-neutron | 13:13 | |
*** jgallard has joined #openstack-neutron | 13:15 | |
*** xuhanp has joined #openstack-neutron | 13:15 | |
*** yfried has quit IRC | 13:17 | |
*** subhranshu has quit IRC | 13:21 | |
*** mwagner_lap is now known as mwagner_dontUseM | 13:25 | |
*** yfried has joined #openstack-neutron | 13:26 | |
*** mwagner__ has quit IRC | 13:30 | |
*** Atti has quit IRC | 13:35 | |
openstackgerrit | Ann Kamyshnikova proposed a change to openstack/neutron: WIP Implement test https://review.openstack.org/76520 | 13:35 |
openstackgerrit | Ann Kamyshnikova proposed a change to openstack/neutron: WIP Add testing for database from oslo https://review.openstack.org/76519 | 13:35 |
*** safchain has quit IRC | 13:37 | |
*** safchain has joined #openstack-neutron | 13:40 | |
*** thuc has joined #openstack-neutron | 13:41 | |
*** dave_tucker_zzz is now known as dave_tucker | 13:42 | |
*** peristeri has joined #openstack-neutron | 13:44 | |
*** alagalah has joined #openstack-neutron | 13:44 | |
*** thuc_ has quit IRC | 13:44 | |
*** thuc has quit IRC | 13:45 | |
*** alexpilotti has joined #openstack-neutron | 13:47 | |
xuhanp | salv-orlando, ping | 13:49 |
salv-orlando | hi xuhanp | 13:49 |
xuhanp | will you have time to review my patch https://review.openstack.org/#/c/76125/ again? You approved it previously, but it hit a conflict during merge :-) | 13:49 |
*** nati_ueno has joined #openstack-neutron | 13:50 | |
*** jgrimm has joined #openstack-neutron | 13:53 | |
*** thuc has joined #openstack-neutron | 13:54 | |
*** carlp has joined #openstack-neutron | 13:54 | |
*** irenab has quit IRC | 13:55 | |
*** alagalah has quit IRC | 13:55 | |
salv-orlando | xuhanp: sure | 13:57 |
*** jecarey has quit IRC | 13:57 | |
xuhanp | salv-orlando, thanks a lot! | 13:57 |
*** thuc has quit IRC | 13:57 | |
*** salv-orlando_ has joined #openstack-neutron | 13:59 | |
*** sneezewort has joined #openstack-neutron | 13:59 | |
*** salv-orlando has quit IRC | 14:01 | |
sneezewort | Why does turning off TCP segmentation offload on the virtual NICs increase network performance from an instance? | 14:03 |
*** nati_ueno has quit IRC | 14:03 | |
*** salv-orlando_ has quit IRC | 14:03 | |
*** vkozhukalov_ has quit IRC | 14:05 | |
*** thuc has joined #openstack-neutron | 14:07 | |
*** vkozhukalov_ has joined #openstack-neutron | 14:08 | |
*** safchain has quit IRC | 14:08 | |
*** markmcclain has joined #openstack-neutron | 14:10 | |
*** markmcclain1 has joined #openstack-neutron | 14:12 | |
openstackgerrit | Jakub Libosvar proposed a change to openstack/neutron: Add support for https requests on nova metadata https://review.openstack.org/81535 | 14:13 |
*** chandan_kumar has quit IRC | 14:13 | |
openstackgerrit | dongfeng proposed a change to openstack/neutron: add support for huawei snc https://review.openstack.org/81536 | 14:15 |
*** markmcclain has quit IRC | 14:15 | |
openstackgerrit | enikanorov proposed a change to openstack/neutron: Fix namespace existence check method for it shall not be called with a namespace https://review.openstack.org/81537 | 14:15 |
*** chandan_kumar has joined #openstack-neutron | 14:17 | |
openstackgerrit | Jakub Libosvar proposed a change to openstack/neutron: Add support for https requests on nova metadata https://review.openstack.org/81535 | 14:18 |
*** shakamunyi has joined #openstack-neutron | 14:18 | |
openstackgerrit | dongfeng proposed a change to openstack/neutron: remove errors https://review.openstack.org/81540 | 14:22 |
*** alagalah has joined #openstack-neutron | 14:25 | |
openstackgerrit | dongfeng proposed a change to openstack/neutron: remove whitespace https://review.openstack.org/81543 | 14:26 |
*** Jianyong has joined #openstack-neutron | 14:27 | |
*** thuc_ has joined #openstack-neutron | 14:28 | |
*** thuc has quit IRC | 14:30 | |
*** thuc_ has quit IRC | 14:32 | |
*** markmcclain1 has quit IRC | 14:40 | |
*** jobewan has joined #openstack-neutron | 14:40 | |
*** dims has joined #openstack-neutron | 14:43 | |
*** thedodd has joined #openstack-neutron | 14:49 | |
*** SridharG has quit IRC | 14:49 | |
openstackgerrit | Oleg Bondarev proposed a change to openstack/neutron: Fix usage of save_and_reraise_exception https://review.openstack.org/81549 | 14:50 |
*** devlaps has joined #openstack-neutron | 14:51 | |
openstackgerrit | Jakub Libosvar proposed a change to openstack/neutron: Sync service and systemd modules from oslo-incubator https://review.openstack.org/81211 | 14:52 |
*** shakamunyi has quit IRC | 14:52 | |
*** shakamunyi has joined #openstack-neutron | 14:52 | |
*** thuc has joined #openstack-neutron | 14:53 | |
*** markmcclain has joined #openstack-neutron | 14:54 | |
*** salv-orlando has joined #openstack-neutron | 14:54 | |
*** Jianyong has quit IRC | 14:55 | |
*** alagalah has quit IRC | 14:55 | |
*** pcm_ has quit IRC | 14:56 | |
*** roeyc has quit IRC | 14:58 | |
*** carl_baldwin has joined #openstack-neutron | 14:59 | |
*** dvorkinista has joined #openstack-neutron | 15:00 | |
*** mwagner__ has joined #openstack-neutron | 15:05 | |
*** yamahata has joined #openstack-neutron | 15:07 | |
*** networkstatic has joined #openstack-neutron | 15:07 | |
HenryG | markmcclain: hi, do you have a few minutes to discuss https://bugs.launchpad.net/neutron/+bug/1292114 ? | 15:09 |
*** jecarey has joined #openstack-neutron | 15:09 | |
openstackgerrit | Jakub Libosvar proposed a change to openstack/neutron: Sync service and systemd modules from oslo-incubator https://review.openstack.org/81211 | 15:15 |
markmcclain | HenryG: sure.. what's up? | 15:16 |
HenryG | markmcclain: one way to fix it is rather cringe-worthy: https://review.openstack.org/80406 | 15:17 |
HenryG | markmcclain: is there a better way? | 15:18 |
markmcclain | yeah not excited about surprise append to the list of plugins | 15:20 |
*** networkstatic_ has joined #openstack-neutron | 15:21 | |
*** tvardeman has joined #openstack-neutron | 15:21 | |
HenryG | markmcclain: I guess we could add it to each individual migration where it is missing | 15:22 |
HenryG | markmcclain: problem is, it will happen again | 15:23 |
*** dvorkinista has quit IRC | 15:23 | |
*** admin0 has joined #openstack-neutron | 15:24 | |
admin0 | hi all | 15:24 |
*** blogan has joined #openstack-neutron | 15:24 | |
admin0 | in icehouse, will the neutron l3 agent be in high availability ? | 15:24 |
HenryG | markmcclain: that patch does not affect other plugins, and we have tested it with ours - it solves all the migration problems | 15:26 |
*** dvorkinista has joined #openstack-neutron | 15:26 | |
*** banix has joined #openstack-neutron | 15:26 | |
*** networkstatic_ has quit IRC | 15:26 | |
markmcclain | It's not ideal, but does make sense | 15:26 |
*** arosen1 has joined #openstack-neutron | 15:27 | |
amuller | admin0: What Neutron agent? There are many... L2 on each compute node, then there's L3, DHCP and metadata | 15:28 |
amuller | admin0: That's ignoring the advanced services | 15:28 |
amuller | admin0: The Neutron service itself will have no HA capabilities in Icehouse (That would require an external tool like Pacemaker) | 15:30 |
*** sbalukoff1 has quit IRC | 15:30 | |
HenryG | markmcclain: thanks for understanding. It still makes me go 'ewww' though, and what's worse is there is a need to backport it to Havana. :( | 15:30 |
amuller | admin0: As for the agents: DHCP agent supports active/active, by installing it on multiple nodes, and changing /etc/neutron.conf to allow multiple dhcp agents per network | 15:31 |
markmcclain | yeah | 15:31 |
amuller | admin0: L3 agent will have HA for routers (Not for advanced services like FW, VPN, LB) in Juno it looks like, not Icehouse | 15:31 |
*** leseb_ has quit IRC | 15:32 | |
*** leseb has joined #openstack-neutron | 15:32 | |
HenryG | markmcclain: the problem is showing up in deployments. Not sure why us devs don't see it with devstack? Would we need some kind of grenade testing to invoke more migration scenarios? | 15:34 |
markmcclain | well it won't show up in the gate because the cisco migrations are not run | 15:34 |
markmcclain | ideally the 3rd party system should catch this | 15:35 |
HenryG | markmcclain: I understand that, but we run the cisco plugin with devstack all the time and don't see migration problems. | 15:35 |
markmcclain | is create_all() "fixing" the issue for you? | 15:35 |
HenryG | markmcclain: where is that? | 15:36 |
HenryG | markmcclain: found it, looking ... | 15:36 |
markmcclain | it should be in the db code | 15:36 |
HenryG | Trying to understand when it gets run ... | 15:38 |
markmcclain | when the db connection is instantiated | 15:38 |
*** jecarey has quit IRC | 15:38 | |
markmcclain | look at 40296 | 15:39 |
*** jecarey has joined #openstack-neutron | 15:39 | |
markmcclain | salv-orlando worked to remove create_all | 15:39 |
markmcclain | you'll probably need to rebase and adapt for new plugins | 15:39 |
*** leseb has quit IRC | 15:39 | |
*** leseb has joined #openstack-neutron | 15:40 | |
*** dfarrell07 has joined #openstack-neutron | 15:40 | |
*** overlayer has joined #openstack-neutron | 15:41 | |
*** admin0 has quit IRC | 15:41 | |
*** otherwiseguy has joined #openstack-neutron | 15:41 | |
HenryG | you've lost me. That patch is abandoned? | 15:43 |
openstackgerrit | Jakub Libosvar proposed a change to openstack/neutron: Add support for https requests on nova metadata https://review.openstack.org/81535 | 15:43 |
*** leseb has quit IRC | 15:44 | |
*** sbalukoff has joined #openstack-neutron | 15:45 | |
HenryG | markmcclain: is it something that is still planned to go ahead? If so I will try to follow the progress and understand it better. | 15:49 |
*** admin0 has joined #openstack-neutron | 15:50 | |
markmcclain | it might still be revived | 15:50 |
markmcclain | if you want to revive it | 15:50 |
HenryG | ok. Thanks for the help and info! | 15:50 |
markmcclain | salv-orlando says that would be ok | 15:51 |
*** _cjones_ has joined #openstack-neutron | 15:52 | |
salv-orlando | HenryG: I can click restore… and then you can keep working on it? | 15:52 |
salv-orlando | I might have the time for doing that if I give up on eating or sleeping | 15:52 |
salv-orlando | I wouldn't do that however | 15:52 |
HenryG | salv-orlando: ok, but I will need help | 15:53 |
salv-orlando | HenryG: I already clicked restore, what other help you might need? ;) | 15:53 |
HenryG | salv-orlando: markmcclain: we're talking juno for that, right? | 15:53 |
salv-orlando | I'm here to explain the logic of the change, and the review history | 15:53 |
markmcclain | I'd be interested in it for Icehouse | 15:54 |
*** coolsvap has joined #openstack-neutron | 15:54 | |
*** kevinbenton has left #openstack-neutron | 15:54 | |
*** chandankumar_ has joined #openstack-neutron | 15:54 | |
markmcclain | because create_all() seems to cover up bad migrations | 15:54 |
*** kevinbenton has joined #openstack-neutron | 15:55 | |
*** chandan_kumar has quit IRC | 15:56 | |
*** yfried has quit IRC | 15:56 | |
*** amuller has quit IRC | 15:56 | |
*** carl_baldwin has quit IRC | 15:56 | |
*** rossella_s has quit IRC | 15:57 | |
*** jorgem has joined #openstack-neutron | 15:57 | |
*** rossella_s has joined #openstack-neutron | 15:57 | |
*** rcurran has joined #openstack-neutron | 15:58 | |
*** irenab has joined #openstack-neutron | 15:58 | |
*** Jabadia has quit IRC | 15:58 | |
*** dims has quit IRC | 15:59 | |
HenryG | ok, I'll see what I can do. I am really not familiar with databases just so you know. | 15:59 |
*** admin0 has left #openstack-neutron | 15:59 | |
*** Jabadia has joined #openstack-neutron | 15:59 | |
*** _cjones_ has quit IRC | 15:59 | |
*** sbalukoff has quit IRC | 15:59 | |
*** _cjones_ has joined #openstack-neutron | 15:59 | |
*** sbalukoff has joined #openstack-neutron | 15:59 | |
*** jorgem1 has joined #openstack-neutron | 16:00 | |
*** jorgem has quit IRC | 16:01 | |
HenryG | salv-orlando: when would be a good time to go through the explaining? | 16:02 |
*** Jabadia has quit IRC | 16:03 | |
salv-orlando | HenryG: We're a bit busy now, so responses might be delayed… | 16:04 |
HenryG | salv-orlando: understood. let me rebase and test, then I will ping you again | 16:04 |
salv-orlando | HenryG: you can perhaps just try running unit tests after rebasing. Most will fail as they rely on auto-generation of schemas | 16:05 |
salv-orlando | for those failing you can then look at how they were fixed in that patch. | 16:05 |
salv-orlando | I'm sorry it's been almost 6 months since I did it. I already struggle trying to remember what I had yesterday for dinner | 16:06 |
*** alagalah has joined #openstack-neutron | 16:06 | |
HenryG | bangers and mash | 16:06 |
*** ijw has joined #openstack-neutron | 16:07 | |
*** thuc has quit IRC | 16:07 | |
*** thuc has joined #openstack-neutron | 16:08 | |
*** dvorkinista has quit IRC | 16:08 | |
*** Sukhdev has joined #openstack-neutron | 16:08 | |
*** dfarrell07 has quit IRC | 16:10 | |
arosen1 | and scones and tea for breakfast :) | 16:10 |
*** alagalah has quit IRC | 16:10 | |
ihrachys | arosen1: one of those missing changes to fix the patch is I3034fbb20e790f2d6f22c50b74a9f9dcdf9081aa | 16:12 |
*** thuc has quit IRC | 16:12 | |
*** dfarrell07 has joined #openstack-neutron | 16:12 | |
arosen1 | ihrachys: i think that was a fix for a patch that we want to leave out : ) | 16:13 |
arosen1 | err maybe not. | 16:14 |
ihrachys | arosen1: I don't know, I just cherry-picked it, and now unit test that failed before succeeds :) | 16:16 |
arosen1 | ihrachys: great! Hopefully that's the missing key | 16:17 |
*** armax has joined #openstack-neutron | 16:17 | |
*** xuhanp has quit IRC | 16:17 | |
*** blogan has quit IRC | 16:18 | |
ihrachys | arosen1: yeah, both unit tests are fixed by this. you should try to upload with this included, and we'll see whether tempest tests succeed too :) | 16:20 |
*** ihrachys is now known as ihrachys|afk | 16:20 | |
*** bada_ has quit IRC | 16:21 | |
*** singhs has joined #openstack-neutron | 16:21 | |
*** bada_ has joined #openstack-neutron | 16:21 | |
*** jlibosva has quit IRC | 16:22 | |
openstackgerrit | Brandon Logan proposed a change to openstack/neutron: Added config value help text in ns metadata proxy https://review.openstack.org/80816 | 16:23 |
*** dvorkinista has joined #openstack-neutron | 16:30 | |
*** leseb has joined #openstack-neutron | 16:33 | |
*** SumitNaiksatam has quit IRC | 16:33 | |
*** sbalukoff has quit IRC | 16:33 | |
*** blogan has joined #openstack-neutron | 16:34 | |
*** thuc_ has joined #openstack-neutron | 16:36 | |
marun | salv-orlando: it looks like we are seeing more lock timeouts rather than fewer. :/ | 16:36 |
marun | salv-orlando: most seem to be on ports | 16:36 |
*** thuc__ has joined #openstack-neutron | 16:37 | |
salv-orlando | marun: rebasing my patch and re-pushing right now. | 16:37 |
salv-orlando | I analysed gate failure since your patch merged. | 16:37 |
salv-orlando | it's still a bit early but it seems failure rate reduced by 15% | 16:37 |
marun | I have to wonder whether we should be doing something other than 'for update' everywhere | 16:37 |
marun | salv-orlando: when I run the logstash query with the bug, the simple graph at the top suggests that incidence is significantly worse than yesterday | 16:38 |
marun | (although maybe simply more jobs are running) | 16:38 |
openstackgerrit | Salvatore Orlando proposed a change to openstack/neutron: Add a semaphore to some ML2 operations https://review.openstack.org/80413 | 16:39 |
*** rcurran has quit IRC | 16:40 | |
*** shakamunyi has quit IRC | 16:40 | |
*** thuc_ has quit IRC | 16:40 | |
*** salv-orlando has quit IRC | 16:40 | |
openstackgerrit | mark mcclain proposed a change to openstack/neutron: add HEAD sentinel file that contains migration revision https://review.openstack.org/79377 | 16:40 |
*** hogepodge has quit IRC | 16:41 | |
*** dvorkinista has quit IRC | 16:41 | |
marun | wtf, a semaphore?? | 16:41 |
marun | that's not going to help us when multiple servers are in play | 16:41 |
*** spandhe has joined #openstack-neutron | 16:42 | |
*** salv-orlando has joined #openstack-neutron | 16:42 | |
marun | salv-orlando: to repeat myself - wtf, a semaphore?? | 16:42 |
marun | arosen: why on earth are you +2ing https://review.openstack.org/#/c/80413/ | 16:43 |
*** hogepodge has joined #openstack-neutron | 16:43 | |
salv-orlando | marun: oh I feel you are intimidating me. ;) | 16:43 |
salv-orlando | that was agreed with comstud | 16:43 |
marun | really? | 16:43 |
marun | comstud: do you want to explain? | 16:43 |
salv-orlando | yeah I pointed to the eavesdrop logs on the review | 16:43 |
salv-orlando | you'll find everything there | 16:44 |
marun | how does that help if someone has multiple neutron servers? | 16:44 |
salv-orlando | marun: but if you come up with a plan to remove all the select..for update queries, that would solve the problem | 16:44 |
marun | or is the idea that multiple neutron servers effectively have to use multiple masters? | 16:44 |
marun | salv-orlando: I think the big hole in what we're doing right now has to do with ml2 events | 16:45 |
marun | salv-orlando: it's pretty hard to have clean db interaction if there are going to be arbitrary code called from transactions | 16:45 |
marun | rkukura, mestery: ^^ | 16:45 |
*** markmcclain has quit IRC | 16:45 | |
salv-orlando | marun: I unfortunately do not know ML2 enough. | 16:46 |
marun | salv-orlando: it's not complicated. | 16:46 |
salv-orlando | Let's then agree to forget about the patch I did | 16:46 |
salv-orlando | and you, rkukura and mastery will sort that out? | 16:46 |
salv-orlando | sorry, I meant mestery | 16:46 |
marun | salv-orlando: there are [resource event]_precommit calls in transactions | 16:46 |
* mestery reads backscroll | 16:46 | |
*** irenab has quit IRC | 16:46 | |
marun | mestery: I'm starting to think we need to remove precommit calls from transaction scope | 16:47 |
rkukura | salv-orlando, marun: The pre commit calls are intended to be within the transactions, and drivers are not supposed to do anything in these that could block. | 16:47 |
* mestery agrees this will cause problems with multiple servers. | 16:47 | |
kevinbenton | marun: the deadlock the semaphore protects against doesn't affect multiple servers | 16:47 |
mestery | +1 to what rkukura said | 16:47 |
*** salv-orlando has quit IRC | 16:47 | |
marun | kevinbenton: really? | 16:48 |
rkukura | marun: We already have post commit methods that are outside the transactions | 16:48 |
kevinbenton | marun: yes, the deadlock happens between coroutines in the same threadpool | 16:48 |
*** salv-orlando has joined #openstack-neutron | 16:48 | |
rkukura | I'm working on a patch that moves the bind_port() methods outside transactions | 16:48 |
kevinbenton | one gets the mysql lock for update and yields | 16:48 |
marun | kevinbenton: I'm not sure deadlock is the problem so much as lock timeouts | 16:48 |
mestery | kevinbenton: See rkukura's comments ^^^ | 16:48 |
marun | kevinbenton: which a semaphore does nothing to help | 16:48 |
kevinbenton | the lock timeout is the side effect of the deadlock | 16:48 |
salv-orlando | I am far from being an expert on ml2. However, we had several people including me looking at the logs, and concluding it's an eventlet issue, not a db issue. | 16:49 |
salv-orlando | still, I'm happy to see the interest from the core devs for ML2 is on this patch! | 16:49 |
marun | salv-orlando: so we're not working around bad db code, just eventlet misbehaviour? | 16:49 |
rkukura | salv-orlando: Which patch? | 16:49 |
kevinbenton | thread1 gets mysql lock and yields. thread2 tries to get same mysql lock and blocks | 16:49 |
kevinbenton | deadlock until sql lock timeout | 16:50 |
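The sequence kevinbenton describes can be modelled without eventlet or a database. The sketch below is a toy simulation under stated assumptions: generators stand in for green threads, and `RowLock`, `LockWaitTimeout`, `transaction` and `run` are hypothetical names invented for illustration, not Neutron or eventlet APIs. It assumes the database driver blocks the whole process while waiting for a row lock, so the waiter can only ever end in a timeout.

```python
class LockWaitTimeout(Exception):
    """Stand-in for the database's lock wait timeout error."""


class RowLock:
    """Toy model of a row lock taken with SELECT ... FOR UPDATE."""

    def __init__(self):
        self.holder = None

    def acquire(self, who):
        if self.holder is not None and self.holder != who:
            # Assumption: the real driver blocks the whole process here, so the
            # holder can never run and release; the wait ends in a timeout.
            raise LockWaitTimeout(f"{who} timed out waiting for {self.holder}")
        self.holder = who

    def release(self, who):
        if self.holder == who:
            self.holder = None


def transaction(name, lock, yield_inside):
    """Generator playing the role of one green thread running a transaction."""
    lock.acquire(name)
    try:
        if yield_inside:
            yield  # cooperative switch while the row lock is still held
    finally:
        lock.release(name)


def run(tasks):
    """Round-robin scheduler; returns (name, outcome) pairs in finishing order."""
    outcomes = []
    pending = list(tasks)
    while pending:
        name, gen = pending.pop(0)
        try:
            next(gen)
            pending.append((name, gen))
        except StopIteration:
            outcomes.append((name, "committed"))
        except LockWaitTimeout:
            outcomes.append((name, "lock wait timeout"))
    return outcomes


lock = RowLock()
result = run([
    ("thread1", transaction("thread1", lock, yield_inside=True)),
    ("thread2", transaction("thread2", lock, yield_inside=False)),
])
print(result)
```

In this model the error is reported by thread2, while thread1, the one that yielded inside its transaction, commits normally. That is why the timeout's traceback alone does not point at the offending code.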
mestery | rkukura: https://review.openstack.org/#/c/80413/ | 16:50 |
salv-orlando | marun: did you read the eavesdrop log I linked on gerrit? | 16:50 |
rkukura | Why yield inside a transaction? | 16:50 |
marun | salv-orlando: If I could find human comments on reviews with all the auto-test chatter, I would :p | 16:50 |
rkukura | seems like that is asking for timeouts | 16:50 |
salv-orlando | ok | 16:50 |
*** leseb has quit IRC | 16:50 | |
rkukura | marun: +1 on hiding the chatter | 16:50 |
marun | salv-orlando: I'll dig for it, sorry | 16:50 |
*** leseb has joined #openstack-neutron | 16:51 | |
salv-orlando | finding that... | 16:51 |
* mestery can't find it either. #^@#@ auto comments on gerrit reviews. | 16:51 | |
marun | salv-orlando: which day? | 16:52 |
*** tstevenson has quit IRC | 16:52 | |
salv-orlando | http://eavesdrop.openstack.org/irclogs/%23openstack-qa/%23openstack-qa.2014-03-13.log | 16:52 |
kevinbenton | rkukura: right, we shouldn't yield in a transaction, but it seems to be difficult to guarantee that all of the eventlet yielding calls aren't inside of a transaction | 16:52 |
marun | mestery, rkukura: there is a javascript bookmarklet to colorize auto chatter at least, but it should be enabled by default | 16:52 |
salv-orlando | check for me, sdague, comstud and markmcclain chatting around 14:00GMT | 16:52 |
*** salv-orlando has quit IRC | 16:52 | |
marun | kevinbenton: is there any way to track yield points in eventlet so we can see if we can fix? | 16:53 |
*** ygbo has quit IRC | 16:53 | |
kevinbenton | marun: not that i know of, but i'm far from an eventlet expert | 16:54 |
*** alagalah has joined #openstack-neutron | 16:54 | |
kevinbenton | something to consider is a LOG call in the transaction | 16:55 |
marun | kevinbenton: Maybe it makes sense to optionally debug log transaction enter/exit. So at least when we see a deadlock we'll know the source. | 16:55 |
*** SumitNaiksatam has joined #openstack-neutron | 16:55 | |
marun | kevinbenton: ah, I guess we're on the same page. | 16:55 |
kevinbenton | marun: not quite | 16:55 |
marun | kevinbenton: do tell | 16:55 |
kevinbenton | marun: i was just going to say that even LOG calls could cause a yield | 16:55 |
kevinbenton | marun: if a file system call is configured | 16:56 |
marun | kevinbenton: F$#@ | 16:56 |
marun | kevinbenton: so we'd need a logging mechanism that would avoid yields in transactions. | 16:56 |
*** alagalah has quit IRC | 16:57 | |
marun | kevinbenton: i.e. flush to disk only when safe to do so | 16:57 |
mestery | marun: Thanks for the reminder of the bookmarklet! Works well. Should be default. :) | 16:57 |
kevinbenton | marun: yes | 16:57 |
kevinbenton | marun: and anyone that makes the mistake of logging the normal way in a transaction will have to be publicly chastised :-( | 16:57 |
marun | kevinbenton: at least we'll know about it if we see a log for entry without a corresponding one for exit before the error is reported. | 16:58 |
marun | kevinbenton: (in the context of a transaction I mean) | 16:58 |
*** dfarrell07 has quit IRC | 16:59 | |
marun | kevinbenton: The only problem is if the semaphore patch is merged we won't be able to find these problems anymore. | 16:59 |
marun | kevinbenton: I'm not sure if that's desirable - do we want a bandaid or a real fix? | 16:59 |
kevinbenton | marun: yeah, it will help with debugging. because the current DB timeout exception isn't from the coroutine that mistakenly yielded | 16:59 |
kevinbenton | marun: right | 17:00 |
kevinbenton | always a real fix | 17:00 |
kevinbenton | but bandaids exist because the real fixes are painful :-) | 17:01 |
*** alagalah has joined #openstack-neutron | 17:01 | |
kevinbenton | i think for icehouse the semaphore would be good for stability | 17:01 |
marun | kevinbenton: actually, we shouldn't have to worry about logging causing transaction problems.. | 17:01 |
*** dfarrell07 has joined #openstack-neutron | 17:01 | |
*** sweston has quit IRC | 17:01 | |
kevinbenton | marun: why not? | 17:01 |
marun | kevinbenton: My thought is to create a context manager that wraps the transaction one. | 17:01 |
marun | kevinbenton: log before/after the transaction | 17:02 |
marun | kevinbenton: search/replace everywhere | 17:02 |
*** dfarrell07 has quit IRC | 17:02 | |
marun | kevinbenton: do you see any issue with this approach? | 17:02 |
kevinbenton | marun: oh okay, just strip out anyone that logs inside the transaction? | 17:02 |
*** sweston has joined #openstack-neutron | 17:02 | |
marun | kevinbenton: oh! | 17:03 |
marun | kevinbenton: I just got it, I'm dense/slow | 17:03 |
marun | kevinbenton: So logging could be causing the deadlock? | 17:03 |
marun | kevinbenton: and if we could capture the logs inside of a transaction we might be able to eliminate it? | 17:03 |
kevinbenton | marun: yeah | 17:04 |
*** dfarrell07 has joined #openstack-neutron | 17:04 | |
marun | kevinbenton: i was only thinking of logging transaction start/end so we could trace where the deadlock was coming from | 17:04 |
marun | kevinbenton: both probably make sense | 17:04 |
kevinbenton | marun: actually this might not help find it | 17:05 |
Sukhdev | sdague: Sean are you here? | 17:05 |
marun | kevinbenton: no? | 17:05 |
kevinbenton | marun: the eventlet that mistakenly yields isn't the one that blows up with a timeout | 17:05 |
*** chandan_kumar has joined #openstack-neutron | 17:05 | |
marun | kevinbenton: maybe not, but couldn't we eliminate the yields? | 17:05 |
kevinbenton | marun: oh, never mind, it will still help because there would be a start and stop that enclose the exception from the yielding thread | 17:06 |
*** salv-orlando has joined #openstack-neutron | 17:07 | |
kevinbenton | marun: so logging is a good start | 17:07 |
rkukura | marun, kevinbenton: I think we need to distinguish between two kinds of blocking in transactions | 17:07 |
rkukura | There is external blocking, such as a driver talking to a controller within a transaction | 17:08 |
marun | rkukura: which imho, should not be happening. | 17:08 |
marun | rkukura: that pretty much guarantees a yield | 17:08 |
rkukura | Then there is internal stuff (local to the node) such as yields or logging that shouldn't block for long, but can cause green threads to context switch | 17:08 |
marun | rkukura: if that's 'by design', we have to rethink. | 17:08 |
kevinbenton | marun, rkukura: in the eyes of eventlet, they are the same and both cause the coroutine to yield | 17:09 |
rkukura | Our design should be to avoid the external blocking within transactions completely. | 17:09 |
marun | rkukura: right. | 17:09 |
*** harlowja_away is now known as harlowja | 17:09 | |
marun | rkukura: so we're focusing on the local stuff, which can include logging. | 17:09 |
rkukura | Our design should also minimize likelihood of context switches due to short term local blocking/yielding within transactions | 17:10 |
marun | rkukura: my thought is that we need a wrapper for initiating transactions that collects logs so that no logging occurs during a transaction | 17:10 |
marun | rkukura: because any file io can trigger a yield | 17:10 |
marun | rkukura: and outputs the logs after transaction commit/rollback | 17:10 |
rkukura | marun: That might be worthwhile | 17:11 |
kevinbenton | marun, rkukura: any io *does* trigger a yield | 17:11 |
marun | kevinbenton: right, thank you for the clarification | 17:11 |
rkukura | I like the idea of logging the beginning and end of each TX | 17:11 |
marun | rkukura: yes, and if we do that as well, we'll figure out other sources of yields that we can fix. | 17:11 |
rkukura | Making this collect logging would also reduce the chance of a yield | 17:11 |
rkukura | But a yield should not normally be an issue, except when the server process (single thread) is overloaded and can't keep up | 17:12 |
marun | rkukura: so we should be able to ensure that no yielding occurs during transactions and we can go back to fighting locks caused by trying to do too much in a given transaction. | 17:12 |
kevinbenton | rkukura: the deadlock risk is still there though, just a narrower window | 17:12 |
kevinbenton | rkukura: if there is any yielding | 17:12 |
*** hogepodge has quit IRC | 17:13 | |
marun | rkukura: I don't think we can protect against overloading in a production deployment. | 17:13 |
marun | rkukura: so avoiding yields in transactions would seem the best option. | 17:13 |
marun | salv-orlando: ping | 17:13 |
marun | salv-orlando: have you been following any of this? | 17:13 |
comstud | marun: semaphore idea is a stop-gap until you can figure out a better way to not create deadlocks or lock wait timeouts | 17:14 |
*** vkozhukalov_ has quit IRC | 17:14 | |
marun | comstud: I think we have a better way. | 17:14 |
comstud | marun: it will only work when running single process, but it was just an idea of a quick fix | 17:14 |
comstud | the other was retrying, which would work multi-process | 17:14 |
comstud | Next step would be to fix the real problems | 17:14 |
comstud | marun: ok | 17:14 |
marun | comstud: We've been discussing creating a transaction wrapper that collects logs so that file io from logging can't happen inside the transaction (and outputs them after commit/rollback). | 17:15 |
*** morganfainberg_Z is now known as morganfainberg | 17:15 | |
*** hogepodge has joined #openstack-neutron | 17:15 | |
marun | comstud: the same wrapper would log transaction entry/exit so we could correlate deadlocks with yielding transaction to find other triggers for yields. | 17:15 |
comstud | yeah, sounds like a good alternative | 17:16 |
comstud | cool | 17:16 |
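A minimal sketch of the wrapper marun describes, assuming the transaction is available as an ordinary context manager. The names `logged_transaction` and `defer` are hypothetical, chosen for illustration; nothing here is taken from the Neutron tree. Messages produced inside the transaction are queued in a plain list and only handed to the logging module after the transaction has committed or rolled back, and entry and exit are logged outside the transaction.

```python
import contextlib
import logging

LOG = logging.getLogger(__name__)


@contextlib.contextmanager
def logged_transaction(transaction, name, log=LOG):
    """Wrap `transaction` (any context manager) with deferred logging.

    Yields a `defer(level, msg, *args)` callable. Deferred messages are kept
    in a list and emitted only once the transaction has finished, so the
    logging module is never entered while the transaction is open.
    """
    deferred = []

    def defer(level, msg, *args):
        deferred.append((level, msg, args))

    log.debug("transaction %s: enter", name)
    try:
        with transaction:
            yield defer
    finally:
        # Runs after commit or rollback, including when the body raised.
        for level, msg, args in deferred:
            log.log(level, msg, *args)
        log.debug("transaction %s: exit", name)
```

A caller would write something like `with logged_transaction(tx, "update_port") as defer:` and call `defer(logging.DEBUG, "updating port %s", port_id)` in place of a direct `LOG.debug` inside the block. An "enter" line with no matching "exit" line before a timeout is reported would then identify the transaction that was still open.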
marun | comstud: are there other obvious sources for yields other than logging file io? | 17:16 |
*** leseb has quit IRC | 17:16 | |
*** hogepodge has quit IRC | 17:16 | |
*** leseb has joined #openstack-neutron | 17:16 | |
*** SridharG has joined #openstack-neutron | 17:16 | |
comstud | no, unless you're doing crazy things like using queues inside of the transactions | 17:16 |
marun | comstud: I hope that's not the case, but I guess we'll find out. | 17:17 |
*** ijw has quit IRC | 17:17 | |
*** chandan_kumar has quit IRC | 17:17 | |
comstud | or some other 3rd party module that would make use of threading.Lock | 17:17 |
comstud | 3rd party modules that use python logging module could be a problem | 17:17 |
comstud | unfortunately sqlalchemy itself falls into that category :) | 17:17 |
rkukura | does access to config potentially yield? | 17:17 |
marun | comstud: What do you think of using mock? | 17:17 |
*** alagalah has quit IRC | 17:18 | |
marun | comstud: since it should affect even instantiated objects. | 17:18 |
marun | comstud: I'm not sure I understand the problem with threading.Lock, could you elaborate? | 17:18 |
rkukura | python eventlet doesn't have any notion of thread priorities, does it? Would be nice if this transaction wrapper could bump the thread's priority up | 17:19 |
comstud | marun: sure | 17:19 |
comstud | eventlet is not thread safe | 17:19 |
comstud | oops, different problem | 17:19 |
comstud | sorry, I'm confusing 2 different problems here | 17:19 |
comstud | for this problem, it's really not specific to threading.Lock | 17:19 |
comstud | but use of locks can cause a greenthread switch if the lock is locked by other greenthread | 17:20 |
rkukura | comstud: I meant bump up the current green thread priority, not the underlying OS thread | 17:20 |
comstud | queues and things apply as well because it'll switch greenthread if you try to block getting something from queue, but queue is empty | 17:20 |
comstud | etc | 17:20 |
jogo | any updates on bug 1283522? | 17:20 |
comstud | unfortunately the python logging module uses threading.Lock to serialize logging | 17:20 |
comstud | which eventlet monkey patches | 17:21 |
jogo | it accounts for roughly 30% of our known gate failures | 17:21 |
comstud | and causes a greenthread switch if another greenthread has it locked | 17:21 |
marun | comstud: ah, so if a greenthread tries to acquire a lock that is already held, it has the same effect as io -> yield | 17:21 |
comstud | right | 17:21 |
*** leseb has quit IRC | 17:22 | |
marun | jogo: we're discussing it now | 17:22 |
comstud | otherwise you have a deadlock | 17:22 |
comstud | it must switch to another greenthread to allow it to run | 17:22 |
comstud | and unlock the lock | 17:22 |
*** mlavalle has joined #openstack-neutron | 17:22 | |
marun | jogo: we're considering ways to avoid yields in transactions | 17:22 |
marun | comstud: right. so we need to avoid both io and lock contention to prevent yields | 17:23 |
comstud | So, one thing I tried before... | 17:23 |
*** dims has joined #openstack-neutron | 17:23 | |
comstud | is monkey patching the logging module to use the non-monkey patched threading.RLock | 17:23 |
comstud | but it proved to be rather ugly and difficult to do... | 17:23 |
comstud | but that's an option if you feel like trying that | 17:23 |
*** BillTheKat has joined #openstack-neutron | 17:24 | |
*** BillTheKat is now known as Gil_McGrath | 17:24 | |
marun | comstud: but does that avoid the potential for file io? | 17:24 |
comstud | file io does not cause switches | 17:24 |
marun | comstud: no? | 17:24 |
marun | comstud: *confused* | 17:24 |
comstud | no, it should not.. eventlet won't poll on file i/o | 17:24 |
marun | comstud: only network io? | 17:24 |
comstud | right | 17:25 |
comstud | (I could be wrong... but I'm like 90%+ sure :) | 17:25 |
comstud | the reason I say so is... | 17:25 |
comstud | eventlet tries to read() and write() to file first. | 17:25 |
comstud | and only polls on EAGAIN | 17:25 |
marun | comstud: what is EAGAIN? | 17:25 |
comstud | syscall errno | 17:25 |
marun | <- ignorant | 17:25 |
comstud | EAGAIN is a syscall errno meaning "no data right now" | 17:26 |
comstud | when using non-blocking I/O | 17:26 |
marun | comstud: so when reading in other words. | 17:26 |
marun | comstud: writing wouldn't yield, but reading could | 17:26 |
comstud | and that never happens for file i/o | 17:26 |
marun | ah, ok. | 17:26 |
comstud | not from a file | 17:26 |
comstud | there's always data | 17:26 |
comstud | or EOF | 17:26 |
*** harlowja has quit IRC | 17:26 | |
comstud | So eventlet should never switch on file I/O | 17:26 |
marun | comstud: from a regular file, but what about something like stdout? | 17:26 |
comstud | I think there was a misunderstanding about this before | 17:27 |
comstud | because we knew that the logging module caused switches | 17:27 |
comstud | and an assumption was made that it was from file io | 17:27 |
marun | comstud: ah, ok. | 17:27 |
comstud | when in fact it's actually from the threading.RLock | 17:27 |
marun | comstud: so the problem is the same, I was just misunderstanding the cause. | 17:27 |
comstud | correct | 17:27 |
kevinbenton | comstud: so file IO will never cause a yield on write? | 17:27 |
marun | comstud: ok, so collecting logs is still a good idea | 17:27 |
comstud | kevinbenton: correct | 17:28 |
comstud | epoll doesn't even support polling on file descriptors tied to files | 17:28 |
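The EAGAIN behaviour comstud refers to can be observed with the standard library alone. This sketch assumes a POSIX system; `nonblocking_read` is a hypothetical helper written for illustration. It compares a non-blocking read on an empty pipe with a non-blocking read on an empty regular file.

```python
import errno
import os
import tempfile


def nonblocking_read(fd):
    """Read up to 16 bytes; return the data, or "EAGAIN" if it would block."""
    try:
        return os.read(fd, 16)
    except BlockingIOError as exc:
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return "EAGAIN"
        raise


# A pipe with nothing written to it: no data yet, so the kernel reports EAGAIN.
read_end, write_end = os.pipe()
os.set_blocking(read_end, False)
pipe_result = nonblocking_read(read_end)
os.close(read_end)
os.close(write_end)

# An empty regular file in non-blocking mode: the read returns EOF immediately.
with tempfile.TemporaryFile() as f:
    os.set_blocking(f.fileno(), False)
    file_result = nonblocking_read(f.fileno())

print(pipe_result, file_result)
```

Only the pipe read reports that it would block. The regular file answers at once with end-of-file, which matches the point that a read-then-poll-on-EAGAIN strategy has nothing to wait for on ordinary files.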
*** rotbeard has quit IRC | 17:28 | |
*** yfried has joined #openstack-neutron | 17:28 | |
comstud | marun: Yeah, I think that's reasonable | 17:28 |
jogo | marun: so how come this bug started recently? | 17:28 |
marun | jogo: I wish I knew. I'm presuming we were masking it with other issues. | 17:29 |
comstud | jogo: Could be getting more pseudo-parallelism somehow in DB queries | 17:29 |
comstud | more greenthreads doing DB calls, who knows | 17:29 |
marun | jogo: it's pretty strange to see how much its incidence has increased, even in the past day alone. | 17:29 |
comstud | just a guess.. I don't know a lot about neutron in general | 17:29 |
comstud | :) | 17:29 |
*** hogepodge has joined #openstack-neutron | 17:29 | |
rkukura | More load on the server will increase the likelihood of DB timeouts | 17:29 |
jogo | marun: have you gone through the openstack/openstack repo looking for suspects? | 17:29 |
jogo | because if we could revert the thing that broke us and then fix the root cause that would make things run smoother | 17:30 |
marun | jogo: openstack/openstack? | 17:30 |
jogo | http://git.openstack.org/cgit/openstack/openstack | 17:30 |
jogo | has the full openstack git history across all relevant repos | 17:30 |
kevinbenton | here is a diagram https://docs.google.com/drawings/d/13A2x4AWbf8zmzeGApUmYVlBrW8CMTPFTCBGSP_nTzDA/edit?usp=sharing | 17:31 |
marun | jogo: why would I use that vs looking at the commit log for neutron? | 17:31 |
marun | kevinbenton: is that what's happening here? | 17:31 |
comstud | marun: honestly, i'd probably just throw the semaphore decorator on for now | 17:32 |
marun | comstud: I was wondering how network io to mysql played into this... | 17:32 |
comstud | because you can get a quick fix that way | 17:32 |
comstud | And then work on a better fix | 17:32 |
marun | comstud: ok, fair enough. | 17:32 |
*** Gil_McGrath has quit IRC | 17:32 | |
comstud | You really don't lose anything by doing so right now | 17:32 |
*** Gil_McGrath has joined #openstack-neutron | 17:32 | |
comstud | since the DB queries all run serially anyway | 17:32 |
kevinbenton | marun: yeah, the trick is finding the yielding call | 17:32 |
comstud | Ie, you're not losing any parallelism | 17:32 |
marun | comstud: my visceral reaction to things like semaphores is 'fsck no!' but leaving the gate in bad shape is worse. | 17:32 |
comstud | marun: Yeah, I agree. :) | 17:33 |
comstud | In this case it probably looks worse than it really is | 17:33 |
comstud | It looks like a huge performance issue, but you're really not losing anything | 17:33 |
marun | comstud: I'm not worried about the performance so much as adding complexity instead of fixing things. | 17:33 |
comstud | Sure | 17:34 |
comstud | but this decorator is less complex than any of the real fixes, actually. | 17:34 |
comstud | :) | 17:34 |
marun | comstud, kevinbenton: so does eventlet ever yield due to network io with mysql, inside a transaction? | 17:34 |
comstud | But I agree with your general sentiment :) | 17:34 |
comstud | marun: Not right now, no | 17:35 |
comstud | or really not ever | 17:35 |
rkukura | marun: Is https://review.openstack.org/#/c/80413/ the patch being discussed here? | 17:35 |
comstud | Not with the default mysql backend module | 17:35 |
comstud | which uses sockets in C | 17:35 |
kevinbenton | marun: no i don't think so because at that point things are in the hands of the C lib | 17:35 |
marun | rkukura: yes | 17:35 |
comstud | eventlet can't wrap it... so they all just block | 17:35 |
marun | comstud: ah, ok. | 17:35 |
marun | kevinbenton: thanks for the clarification | 17:36 |
comstud | Which sucks, because we can't run DB calls in parallel... | 17:36 |
*** harlowja has joined #openstack-neutron | 17:36 | |
comstud | Anyway, longer term is we fix eventlet.. and run db calls in Thread pools | 17:37 |
rkukura | comstud: Are there any eventlet-friendly pure python mysql backends? | 17:37 |
openstackgerrit | A change was merged to openstack/neutron: Remove individual cfg.CONF.resets from tests https://review.openstack.org/79846 | 17:37 |
kevinbenton | the other option would be to avoid "lock for updates" like the plague | 17:37 |
marun | fix eventlet? Or throw it under a bus? ;) | 17:37 |
marun | kevinbenton: If only that were an option. | 17:37 |
marun | kevinbenton: db consistency sometimes demands locking. | 17:38 |
kevinbenton | rkukura: yes | 17:38 |
kevinbenton | pymysql | 17:38 |
*** suresh12 has joined #openstack-neutron | 17:38 | |
kevinbenton | i tried it out | 17:38 |
rkukura | And? | 17:38 |
kevinbenton | immediately detects the deadlock and raises an exception | 17:38 |
kevinbenton | so it doesn't automatically yield like i had hoped | 17:39 |
rkukura | kevinbenton: Is this a specific deadlock? | 17:39 |
kevinbenton | rkukura: it detects that it can't get a lock for update | 17:39 |
comstud | marun: One or the other. I have a replacement coming, actually. :) | 17:39 |
comstud | we'll see. | 17:39 |
marun | comstud: so salvatore's patch, is it ready to go as is? | 17:39 |
comstud | i haven't seen it | 17:39 |
comstud | and i'm in a 1 on 1 now | 17:40 |
comstud | i can look in a bit | 17:40 |
rkukura | marun: If it's https://review.openstack.org/#/c/80413/, I may change my -1 to a -2 | 17:40 |
marun | rkukura: why? | 17:40 |
rkukura | If anything, it will just move the problem around | 17:40 |
salv-orlando | marun: even if we agree that we want a semaphore I still need to address a comment to kevinbenton. Following his lead I found two more calls that need to be synchronized | 17:40 |
marun | rkukura: I think we may need to take this approach to appease the gate gods. | 17:40 |
marun | rkukura: and then we can work on a real fix. | 17:41 |
marun | rkukura: a -2 would be counterproductive | 17:41 |
kevinbenton | rkukura: i don't think it moves the problem. it does fix it | 17:41 |
marun | kevinbenton: it masks it. | 17:41 |
rkukura | What is the specific deadlock scenario that this avoids? | 17:41 |
marun | kevinbenton: very effectively! | 17:41 |
kevinbenton | rkukura: it's just a civil war era fix (amputate the leg to avoid infection) | 17:41 |
kevinbenton | rkukura: the one i sent the diagram for | 17:42 |
marun | rkukura: have you seen the logstash query? | 17:42 |
salv-orlando | rkukura: it would be good if you can -2 the patch and add appropriate comments on it. I'm not able to participate in the discussion now, so I would be happy if we can move it to gerrit | 17:42 |
marun | rkukura: jogo reports that 30% of gate failures are due to this issue | 17:42 |
kevinbenton | rkukura: it will prevent the second thread from trying to get an SQL lock while the other has it | 17:42 |
marun | rkukura: see the query link in the bug: https://bugs.launchpad.net/neutron/+bug/1283522 | 17:42 |
rkukura | kevinbenton, marun: It would really help me to understand which threads are trying to do what when this deadlock occurs. | 17:43 |
marun | rkukura: as far as we know logging could be the culprit | 17:43 |
marun | rkukura: or any 3rd party library that uses threading locks | 17:43 |
*** ijw_ has joined #openstack-neutron | 17:44 | |
marun | rkukura: the point is, something is causing a yield. | 17:44 |
kevinbenton | rkukura: usually it's simultaneous delete_ports | 17:44 |
rkukura | The commit msg says "This semaphore has been introduced to avoid undesired eventlet | 17:44 |
rkukura | yields...". How does locking a semaphore prevent yields? | 17:44 |
marun | rkukura: because then only one thing can attempt to acquire a given lock at a time. | 17:44 |
kevinbenton | rkukura: it doesn't, but it does make sure that the yields don't kill everything | 17:44 |
rkukura | We cannot put the semaphore on get_device_details | 17:44 |
marun | lock -L db lock | 17:45 |
marun | oops, lock -> db lock | 17:45 |
rkukura | The patch I'm working on will result in the possibility of transactions within that. | 17:45 |
marun | rkukura: ?? | 17:45 |
marun | rkukura: but the issue isn't transactions | 17:45 |
marun | rkukura: the issue is db locks | 17:45 |
marun | rkukura: are you saying get_device_details would require locking?? | 17:46 |
kevinbenton | rkukura: the only things that need to be protected with a semaphore are the ones that lock the table for update | 17:46 |
rkukura | Which table? | 17:46 |
marun | rkukura: whichever ones are being locked for update. | 17:46 |
kevinbenton | rkukura: the port table in this case is the problem child | 17:47 |
rkukura | Maybe we just want to put synchronization at the very top level so the process only handles a single REST call or single RPC call at a time! | 17:48 |
marun | uh, no | 17:48 |
marun | i'm assuming you're being sarcastic | 17:48 |
rkukura | Slightly | 17:49 |
*** dfarrell07 has quit IRC | 17:49 | |
*** sbalukoff has joined #openstack-neutron | 17:49 | |
marun | without eventlet we have to run lots of python processes to handle the same load | 17:49 |
marun | the memory overhead would be significant | 17:49 |
rkukura | With my patch there will be a loop inside get_device_details that can do a transaction each iteration. A semaphore around this is not a good idea | 17:49 |
kevinbenton | does get_device_details lock the port table? | 17:50 |
marun | rkukura: Please answer my question: Are you going to lock in said transactions? | 17:50 |
rkukura | lock for update | 17:51 |
marun | rkukura: and if so, why? | 17:51 |
*** dfarrell07 has joined #openstack-neutron | 17:51 | |
rkukura | the lock for update is on the port binding table | 17:51 |
*** bada_ has quit IRC | 17:51 | |
marun | rkukura: why is a method called get_device_details writing to the table? | 17:51 |
marun | rkukura: shouldn't it be set_device_details? :p | 17:51 |
*** afazekas has quit IRC | 17:51 | |
*** dvorkinista has joined #openstack-neutron | 17:51 | |
*** bada_ has joined #openstack-neutron | 17:52 | |
rkukura | With my patch, get_device_details, get_port, create_port, update_port will call a _bind_if_needed() function that does the binding. | 17:52 |
marun | *sigh* | 17:52 |
marun | so much for sanity | 17:52 |
marun | that violates a pretty fundamental oo design principle | 17:53 |
marun | i'm sure there's a good reason, but, oy! | 17:53 |
jogo | marun: because perhaps there was a devstack or devstack-gate change that triggered this | 17:53 |
openstackgerrit | A change was merged to openstack/neutron: Mock agent RPC for FWaaS tests to delete DB objs https://review.openstack.org/78457 | 17:53 |
rkukura | Since the binding is no longer inside the transaction, after the binding completes, a truncation is started that locks the binding table for update, verifies that nothing affecting the binding (binding:host_id, binding:profile, binding:vnic_type) has changed, and then commits the binding result | 17:53 |
openstackgerrit | A change was merged to openstack/neutron: Ensure to count firewalls in target tenant https://review.openstack.org/80715 | 17:53 |
marun | rkukura: Well, at least the lock isn't on the ports table. | 17:54 |
rkukura | s/truncation/transaction/ | 17:54 |
marun | rkukura: We only need to synchronize for that table. | 17:54 |
marun | rkukura: so you should be safe. | 17:55 |
rkukura | I still do not see why this semaphore is needed | 17:55 |
marun | rkukura: cause: eventlet yields inside transaction holding lock against the ports table | 17:55 |
rkukura | Is it simply that our tests run the server in such an overloaded state that a yield inside a transaction is statistically likely to result in a DB timeout? | 17:56 |
marun | rkukura: bandaid solution: prevent entry to more than one function at a time that locks the ports table | 17:56 |
*** dvorkinista has quit IRC | 17:56 | |
rkukura | So the bandaid just moves the problem to some other locking of some other table | 17:56 |
marun | rkukura: maybe, and we can revert it if so. | 17:56 |
marun | rkukura: maybe the only hotspot is the ports table. | 17:57 |
kevinbenton | rkukura: only if there are other tables that lock for update and yield | 17:57 |
marun | rkukura: we won't know until we've tried. | 17:57 |
kevinbenton | rkukura: there are like 4 functions that delete ports | 17:57 |
marun | rkukura: we have the power of git! nothing is permanent. | 17:57 |
kevinbenton | rkukura: and in testing they all happen close together | 17:57 |
kevinbenton | delete_network, delete_subnet, delete_router, delete_port | 17:57 |
marun | rkukura: we don't have the luxury of finding a perfect solution. the negative impact is too great. | 17:58 |
marun | rkukura: so long as we actively work on a real fix, a bandaid is a valuable option. | 17:59 |
*** spandhe has quit IRC | 17:59 | |
marun | rkukura: anyway, I was skeptical too until talking it over with comstud. If you're still unsure, you may want to follow up with him. | 18:00 |
rkukura | I just gave it a -2. Maybe I don't understand the exact issue here, and I'm happy to discuss it, but I cannot see the semaphore around get_device_details() being acceptable. If this gets in, I need to abandon the patch I've been working on. | 18:01 |
*** RajeshMohan has joined #openstack-neutron | 18:02 | |
marun | rkukura: ?? | 18:02 |
*** baoli has quit IRC | 18:02 | |
marun | rkukura: Didn't we just tell you that the patch in question does not affect your patch unless you are locking the ports table? | 18:03 |
marun | rkukura: so you _are_ locking the ports table?? | 18:03 |
*** baoli has joined #openstack-neutron | 18:03 | |
*** kanzhe_ has joined #openstack-neutron | 18:03 | |
marun | rkukura: or is it the existing code that is doing so? | 18:04 |
rkukura | My patch will drastically increase the length of time that the semaphore around get_device_details could remain locked. | 18:04 |
*** hemanthravi has joined #openstack-neutron | 18:04 | |
enikanorov__ | kanzhe_: so... | 18:04 |
marun | rkukura: *sigh* | 18:04 |
marun | rkukura: my brain is confused | 18:05 |
enikanorov__ | how do you see inserting an LB into a router? can you specify the workflow? | 18:05 |
openstackgerrit | Abhishek Raut proposed a change to openstack/neutron: Fix segment allocation tables in Cisco N1kv plugin https://review.openstack.org/78506 | 18:05 |
*** s3wong_ has joined #openstack-neutron | 18:05 | |
kanzhe_ | enikanorov__: I don't know how the provider knows the context? | 18:05 |
kevinbenton | rkukura: a slowed down call that works is better than one that randomly deadlocks and explodes | 18:05 |
*** spandhe has joined #openstack-neutron | 18:05 | |
marun | kevinbenton: so I get using a semaphore when locking is involved... | 18:05 |
enikanorov__ | kanzhe_: well, there are options | 18:06 |
marun | kevinbenton: but if there is a lock on a given table, how are reads affected? | 18:06 |
enikanorov__ | kanzhe_: opt #1 | 18:06 |
rkukura | marun: Most of get_device_details really should not be locking the port table for update. Is it just the update_port_status() call at the end that needs to lock it for update? | 18:06 |
marun | rkukura: I would guess so, yeah. | 18:07 |
enikanorov__ | kanzhe_: you create LB specify a flavor that says 'routed insertion', then you need to pass router_id, or the provider derives router from the pool or vip subnet | 18:07 |
*** spandhe has quit IRC | 18:07 | |
enikanorov__ | so user either provides insertion context (router_id), or it is implicitly used | 18:07 |
kevinbenton | marun: good question. there still may be an issue there | 18:07 |
enikanorov__ | kanzhe_: opt #2 - chain provider passes insertion context to LB plugin upon LB creation | 18:08 |
*** Sukhdev has quit IRC | 18:08 | |
kevinbenton | marun: lock for update only blocks reads of that row, right? | 18:08 |
SumitNaiksatam | enikanorov: i think ideally we should use a generic mechanism for the latter (that is passing the router_id) | 18:08 |
marun | rkukura: it may make sense to isolate the write from the read | 18:08 |
rkukura | marun: All three of the RPC methods to which the semaphore is being added call update_port_status(), which is actually updating the port table. Maybe we can just put the semaphore in that function? | 18:08 |
*** peristeri has quit IRC | 18:08 | |
marun | kevinbenton: I think so | 18:08 |
marun | rkukura: that sounds reasonable | 18:09 |
enikanorov__ | SumitNaiksatam: the thing is that even passing router_id is not very good. it's better if provider could derive router based on existing information | 18:09 |
SumitNaiksatam | enikanorov__: that part i think we agree, it should not be a requirement | 18:09 |
kevinbenton | marun: so it is possible that a deadlock would occur if something tried to read a port that is locked for update | 18:09 |
marun | kevinbenton: arg++ | 18:10 |
rkukura | marun: I'm looking into it, and will update the review if it seems to make sense. | 18:10 |
marun | rkukura: thank you for digging, your concerns are certainly valid. | 18:11 |
kanzhe_ | enikanorov__: If there are more than one router, router_id may need to be explicitly defined. In the case of L2 devices, the context is getting more complicated. | 18:11 |
enikanorov__ | kanzhe_: right, and then i wonder how a tenant can specify such context... | 18:11 |
marun | kevinbenton: in a perfect world we could hook into eventlet's machinery to trigger an exception if a yield occurred when it shouldn't... | 18:12 |
rkukura | marun: Even update_port_status calls into mechanism drivers' pre and post commit methods, and the post commit methods could make remote calls. | 18:12 |
kanzhe_ | enikanorov__: tenant knows where the service is needed since he is the one asking for it. | 18:12 |
comstud | marun: mock greenthread.getcurrent().switch() method to trap :) | 18:12 |
marun | comstud: so there is a way? yay! | 18:13 |
kevinbenton | marun: i was joking with markmcclain that eventlet just needs a @roadrage decorator where it refuses to yield | 18:13 |
comstud | yep | 18:13 |
*** pradipta_away is now known as pradipta | 18:13 | |
comstud | kinda hacky, but if that's what you want, yeah | 18:13 |
comstud | well | 18:13 |
marun | comstud: I think that's what we need. | 18:13 |
comstud | Actually, I guess that's not quite what you want | 18:13 |
enikanorov__ | kanzhe_: that is understood, i mean how it's defined, literally. router_id is a single parameter to a API call/CLI command | 18:13 |
comstud | because that traps the switch BACK to current greenthread | 18:13 |
marun | comstud: why not? | 18:13 |
enikanorov__ | insertion_context is a complex object | 18:13 |
marun | comstud: ah, nope. | 18:13 |
kanzhe_ | enikanorov__: If the tenant needs the service for a subnet, then subnet is the context. same for network. If the service applies to all traffic at the default gateway, then router is the context. | 18:13 |
comstud | marun: ya, more difficult to trap what you really want | 18:14 |
marun | comstud: I think this capability is essential to our use case. | 18:14 |
enikanorov__ | kanzhe_: ok, i see | 18:14 |
kevinbenton | comstud: actually that's what i just described isn't it? | 18:14 |
marun | comstud: if we have to avoid yield, we should be told explicitly how it happens so we can work to avoid. | 18:14 |
kevinbenton | comstud: blocking eventlet from yielding | 18:14 |
kanzhe_ | enikanorov__: The serviceContext is meant to address it. I agree it is a bit confusing at its current state. | 18:14 |
marun | comstud: or as kevinbenton suggests, could we simply not block the yield? | 18:15 |
*** yfried has quit IRC | 18:15 | |
marun | sorry, not block -> block. | 18:15 |
kevinbenton | comstud: package it up into a decorator we can slap on transactions | 18:15 |
kanzhe_ | enikanorov__: glad to see that you agree on the importance of serviceContext | 18:15 |
comstud | eventlet internally, depending on what it does, plucks a greenthread off a list | 18:15 |
comstud | and does a .switch() on it | 18:15 |
comstud | but you have no way to really trap that | 18:15 |
comstud | you have no idea what it'll select | 18:15 |
comstud | and where the list comes from depends on what's causing the switch | 18:16 |
enikanorov__ | kanzhe_: ok. would be great if you could show the workflow of 1-2 examples for loadbalancer service | 18:16 |
marun | comstud: but isn't that for entering rather than yielding? | 18:16 |
comstud | that's what a yield really is | 18:16 |
marun | comstud: or is the idea that the switch is an interrupt that we can't turn off selectively? | 18:16 |
comstud | it's a switch to another greenthread | 18:16 |
comstud | be it the eventlet hub or another greenthread | 18:16 |
enikanorov__ | kanzhe_: i mean literally cli commands that need to be executed to create LB inserted in routed mode | 18:16 |
marun | comstud: sorry for my ignorance of eventlet internals :/ | 18:16 |
rkukura | marun, kevinbenton, salv-orlando: Would it be possible to move the semaphore locking to basically the same scope as the "with session.begin():" statement that contains the "with_lockmode('update')"? | 18:16 |
enikanorov__ | kanzhe_: we can continue this over ML | 18:17 |
kanzhe_ | enikanorov__: Sure. For the one-arm mode LB, the router where the LB needs to be attached should be the context. Hence, the serviceContext.routers=[router_id], where the router_id is the router of interest. | 18:17 |
comstud | marun: no worries! you don't really want to know the internals | 18:17 |
comstud | :) | 18:17 |
comstud | unfortunately i've looked too much at it | 18:17 |
kevinbenton | yes | 18:17 |
kevinbenton | rkukura: yes | 18:17 |
enikanorov__ | kanzhe_: my question is mostly about API and user experience. I understand how that could look in the code. but how will that look from API/CLI perspective? | 18:17 |
kanzhe_ | enikanorov__: For one-arm with DSR case, serviceContext should have a router and subnet, where the router is the same as one-arm mode, the subnet is the one where server-pools are. | 18:18 |
marun | comstud: so would it be possible to do either of a. prevent yield or b. raise exception on yield? | 18:18 |
rkukura | kevinbenton: Do you think those semaphores at that scope would still avoid the deadlocks? | 18:18 |
marun | comstud: just want to clarify whether our best option is still logging transaction entry/exit and then having to correlate deadlocks with missing exist. | 18:18 |
marun | exits | 18:18 |
kevinbenton | rkukura: with lockutils.lock(name, lock_file_prefix, external, lock_path) | 18:18 |
kanzhe_ | enikanorov__: I am not familiar with CLI syntax, Sumit, can you help here? | 18:18 |
kevinbenton | rkukura: yes, as long as they occur before the lock for update | 18:18 |
enikanorov__ | kanzhe_: let me explain | 18:19 |
comstud | marun: I can't think of a clean way to do it | 18:19 |
comstud | offhand | 18:19 |
SumitNaiksatam | enikanorov__: go ahead | 18:19 |
SumitNaiksatam | kanzhe_: i am around :-) | 18:19 |
marun | comstud: :( | 18:19 |
kevinbenton | comstud: do you have a link to the yielding logic you are referring to? | 18:19 |
enikanorov__ | kanzhe_: you're saying for DSR it's router_id and subnet_id, so service context is two parameters. do we pass them directly to API or we create separate object, and then provide its ID to a create_loadbalancer? | 18:20 |
enikanorov__ | kanzhe_: so basically it's a choice between: | 18:20 |
kanzhe_ | SumitNaiksatam: Is CLI or API regarding serviceContext captured in the design spec? | 18:20 |
SumitNaiksatam | enikanorov__: we may not need to create a new object | 18:20 |
enikanorov__ | #1 create-loadbalancer --flavor flavor-dsr --router-id id1 --subnet-id id2 | 18:21 |
marun | I have to grab some food | 18:21 |
*** bada_ has quit IRC | 18:21 | |
SumitNaiksatam | enikanorov__: essentially what kanzhe is saying is exactly what is proposed in the service_context | 18:21 |
marun | comstud: thank you for your patience in explaining things to me! | 18:21 |
comstud | kevinbenton: give me 10 minutes.. still on a call, actually | 18:21 |
comstud | no problem | 18:21 |
enikanorov__ | #2 create-service-context --router-id id1 --subnet-id id2 | 18:21 |
*** bada_ has joined #openstack-neutron | 18:21 | |
enikanorov__ | #2.1 create_loadbalancer --flavor flavor-dsr --context context_id | 18:21 |
SumitNaiksatam | enikanorov__: we are using #1 currently to go with the service_context patch | 18:22 |
kanzhe_ | enikanorov__: either approach is ok. Creating a object will permit the context to be used by other service. For FW case, the context may be large. So it may be a good reason to be an object. | 18:23 |
enikanorov__ | SumitNaiksatam: i see, thanks | 18:23 |
SumitNaiksatam | enikanorov__: so there is no new resource or object that needs to be created | 18:23 |
*** otherwiseguy has quit IRC | 18:23 | |
SumitNaiksatam | kanzhe_: i agree, however enikanorov__ and nachi had reservations about having to create another object | 18:23 |
SumitNaiksatam | kanzhe_: and it would add another step to the workflow | 18:23 |
SumitNaiksatam | enikanorov__ kanzhe_: so we have a compromise | 18:23 |
enikanorov__ | yes, i'd prefer #1 | 18:23 |
kanzhe_ | SumitNaiksatam: understood. I am fine with either approach. | 18:24 |
SumitNaiksatam | enikanorov__ kanzhe_: we define the notion of a service_context with the expectation that each service understands it | 18:24 |
kanzhe_ | enikanorov__: whichever one is easier to consume by the user. | 18:24 |
enikanorov__ | SumitNaiksatam: kanzhe_: so my initial confusion was about the fact that we're not actually adding 'service context' in API as an attribute. | 18:24 |
SumitNaiksatam | enikanorov__: the service context is an optional attribute to each service | 18:24 |
enikanorov__ | instead it's just a notion that could be some subset of parameters | 18:24 |
enikanorov__ | like router_id, subnet_id, etc | 18:25 |
SumitNaiksatam | enikanorov__: it is indeed an optional attribute | 18:25 |
SumitNaiksatam | enikanorov__: but it's optional | 18:25 |
enikanorov__ | well, you mean the context itself is an attribute? | 18:25 |
*** Adri2000 has quit IRC | 18:26 | |
enikanorov__ | or any attrs that i've mentioned are optional? | 18:26 |
kanzhe_ | SumitNaiksatam: enikanorov__ I would argue to make the serviceContext a required parameter for each service. How can a service be inserted without a context? There is no default insertion. | 18:26 |
enikanorov__ | kanzhe_: right now every service has its default insertion | 18:26 |
enikanorov__ | we only need context when there is a choice, right? | 18:27 |
SumitNaiksatam | enikanorov__: yes | 18:27 |
*** s3wong_ is now known as s3wong | 18:27 | |
SumitNaiksatam | enikanorov__: the mechanism to specify the choice should ideally be uniform across services | 18:27 |
SumitNaiksatam | enikanorov__: the service_context aims to do that | 18:27 |
kanzhe_ | enikanorov__: Why can't we capture the default insertion in the context? | 18:27 |
SumitNaiksatam | enikanorov__: in a backward compatible and non-intrusive manner | 18:28 |
*** ijw_ has quit IRC | 18:28 | |
enikanorov__ | kanzhe_: then what is a 'required attribute'? | 18:28 |
*** alagalah has joined #openstack-neutron | 18:28 | |
enikanorov__ | if i don't care about the insertion mode, should i specify it? | 18:28 |
kanzhe_ | enikanorov__: !None | 18:28 |
enikanorov__ | then why do you say it's a required attr? | 18:29 |
kanzhe_ | enikanorov__: Required attribute doesn't mean a required parameter in CLI/API. | 18:29 |
kanzhe_ | enikanorov__: Each service has a default serviceContext if not explicitly overwritten. | 18:30 |
*** salv-orlando has quit IRC | 18:31 | |
*** leseb has joined #openstack-neutron | 18:31 | |
kanzhe_ | enikanorov__: For LB and FW, it may be a router. Maybe for VPN too. | 18:31 |
enikanorov__ | kanzhe_: ok, i need probably to read the code. I think i don't fully understand the idea | 18:32 |
enikanorov__ | i need to go now, thanks for the discussion | 18:32 |
*** alagalah has quit IRC | 18:32 | |
kanzhe_ | enikanorov__: Ok. | 18:33 |
enikanorov__ | let's continue it in the next days | 18:33 |
enikanorov__ | ttyl! | 18:33 |
kanzhe_ | enikanorov__: I think we are on the same page for the need for service context, but differ on how to express it. | 18:33 |
enikanorov__ | may be | 18:33 |
kanzhe_ | Let's discuss more next time. :-) | 18:34 |
*** jgallard has quit IRC | 18:36 | |
*** irenab has joined #openstack-neutron | 18:38 | |
openstackgerrit | Tomoe Sugihara proposed a change to openstack/neutron: Implement MidoNet plugin for Icehouse (Part 1) https://review.openstack.org/78543 | 18:38 |
*** irenab has quit IRC | 18:44 | |
*** blogan has quit IRC | 18:45 | |
*** shakayumi has joined #openstack-neutron | 18:47 | |
*** shakayumi has quit IRC | 18:47 | |
*** julim has quit IRC | 18:47 | |
*** mmmucky has joined #openstack-neutron | 18:48 | |
*** Adri2000 has joined #openstack-neutron | 18:48 | |
*** Adri2000 has joined #openstack-neutron | 18:48 | |
*** hemanthravi has quit IRC | 18:48 | |
*** yfried1 has joined #openstack-neutron | 18:51 | |
*** blogan has joined #openstack-neutron | 18:51 | |
*** Gil_McGrath has quit IRC | 18:54 | |
hogepodge | so pardon my ignorance here, but is nested neutron possible using gre on havana? | 18:56 |
*** bada_ has quit IRC | 18:57 | |
*** jp_at_hp has quit IRC | 18:58 | |
*** bada_ has joined #openstack-neutron | 18:58 | |
*** shakayumi has joined #openstack-neutron | 18:58 | |
*** shakayumi has quit IRC | 18:58 | |
*** alagalah has joined #openstack-neutron | 18:59 | |
*** jecarey_ has joined #openstack-neutron | 19:01 | |
*** dvorkinista has joined #openstack-neutron | 19:01 | |
*** s3wong has quit IRC | 19:01 | |
rkukura | marun, kevinbenton, comstud: Please see my comments in https://review.openstack.org/#/c/80413/, and let me know if you think the suggested improvement would still address the issue. | 19:01 |
*** jecarey has quit IRC | 19:01 | |
*** shakayumi has joined #openstack-neutron | 19:02 | |
*** shakayumi has quit IRC | 19:02 | |
*** hemanthravi has joined #openstack-neutron | 19:02 | |
*** hemanthravi has quit IRC | 19:02 | |
*** jorgem has joined #openstack-neutron | 19:03 | |
*** jorgem has quit IRC | 19:03 | |
*** jorgem has joined #openstack-neutron | 19:03 | |
*** shakayumi has joined #openstack-neutron | 19:04 | |
*** jorgem1 has quit IRC | 19:04 | |
kevinbenton | rkukura: that should work. i replied inline | 19:11 |
rkukura | thanks | 19:11 |
*** singhs_ has joined #openstack-neutron | 19:11 | |
rkukura | kevinbenton: In https://review.openstack.org/#/c/81367/2/neutron/tests/unit/bigswitch/test_base.py, I don't see self.httpMock being used anywhere. | 19:12 |
kevinbenton | rkukura: salv-orlando is offline. since this seems to be causing a problem, should i push a new one up for you to review? | 19:12 |
*** singhs has quit IRC | 19:12 | |
*** singhs_ is now known as singhs | 19:12 | |
kevinbenton | rkukura: i reference it in a child class when i need to verify a mock call | 19:13 |
kevinbenton | rkukura: oh sorry, you're right | 19:13 |
kevinbenton | rkukura: i forgot i decided to mock the rest method | 19:13 |
kevinbenton | rkukura: -1 it please and i'll fix it | 19:14 |
rkukura | kevinbenton: Not really sure about updating salv-orlando's patch. Might be worth trying if he's gone for the day. | 19:15 |
rkukura | kevinbenton: -1'ed it | 19:15 |
openstackgerrit | Kevin Benton proposed a change to openstack/neutron: BigSwitch ML2: Include bound_segment in port https://review.openstack.org/81367 | 19:16 |
*** leseb has quit IRC | 19:17 | |
*** dvorkinista has quit IRC | 19:18 | |
*** shakayumi has quit IRC | 19:19 | |
*** rook has quit IRC | 19:19 | |
*** saju_m has quit IRC | 19:21 | |
*** saju_m has joined #openstack-neutron | 19:22 | |
*** vkozhukalov_ has joined #openstack-neutron | 19:23 | |
*** suresh12 has quit IRC | 19:25 | |
*** bada_ has quit IRC | 19:25 | |
*** bada_ has joined #openstack-neutron | 19:26 | |
*** leseb has joined #openstack-neutron | 19:26 | |
*** sfox1 has quit IRC | 19:31 | |
openstackgerrit | Kevin Benton proposed a change to openstack/neutron: Add a semaphore to some ML2 operations https://review.openstack.org/80413 | 19:32 |
*** otherwiseguy has joined #openstack-neutron | 19:34 | |
kevinbenton | rkukura, marun: ^^ | 19:34 |
*** leseb has quit IRC | 19:36 | |
*** suresh12 has joined #openstack-neutron | 19:37 | |
*** zzelle has joined #openstack-neutron | 19:38 | |
*** suresh12 has quit IRC | 19:38 | |
*** julim has joined #openstack-neutron | 19:38 | |
*** suresh12 has joined #openstack-neutron | 19:39 | |
*** alagalah has quit IRC | 19:40 | |
*** petertoft has joined #openstack-neutron | 19:42 | |
*** pradipta is now known as pradipta_away | 19:42 | |
*** petertoft has quit IRC | 19:43 | |
*** devlaps has quit IRC | 19:53 | |
*** muhanpong has joined #openstack-neutron | 19:53 | |
openstackgerrit | Tomoe Sugihara proposed a change to openstack/neutron: Implement MidoNet plugin for Icehouse (Part 1) https://review.openstack.org/78543 | 19:54 |
*** otherwiseguy has quit IRC | 19:54 | |
*** leseb has joined #openstack-neutron | 19:55 | |
*** tomoe_ has quit IRC | 19:56 | |
*** jorgem has quit IRC | 19:58 | |
rkukura | kevinbenton, marun: The updated patch looks good to me. Do we want to merge this ASAP, or just test it a few times? | 19:59 |
openstackgerrit | Evgeny Fedoruk proposed a change to openstack/neutron: New SSL extension https://review.openstack.org/81612 | 19:59 |
*** jorgem has joined #openstack-neutron | 20:00 | |
*** muhanpon1 has joined #openstack-neutron | 20:01 | |
*** kanzhe_ has quit IRC | 20:01 | |
*** dfarrell17 has joined #openstack-neutron | 20:02 | |
openstackgerrit | Evgeny Fedoruk proposed a change to openstack/neutron: New SSL extension https://review.openstack.org/74031 | 20:05 |
*** jorgem1 has joined #openstack-neutron | 20:06 | |
*** dvorkinista has joined #openstack-neutron | 20:06 | |
*** dfarrell07 has quit IRC | 20:06 | |
*** muhanpong has quit IRC | 20:06 | |
*** jorgem has quit IRC | 20:07 | |
*** devlaps has joined #openstack-neutron | 20:07 | |
*** saju_m has quit IRC | 20:08 | |
*** dave_tucker is now known as dave_tucker_zzz | 20:08 | |
*** saju_m has joined #openstack-neutron | 20:08 | |
*** samuelbercovici has joined #openstack-neutron | 20:09 | |
*** tongli has joined #openstack-neutron | 20:09 | |
*** djoreilly has quit IRC | 20:09 | |
*** dave_tucker_zzz is now known as dave_tucker | 20:10 | |
*** dave_tucker is now known as dave_tucker_zzz | 20:17 | |
*** dfarrell17 has quit IRC | 20:25 | |
*** baoli has quit IRC | 20:26 | |
*** alagalah has joined #openstack-neutron | 20:26 | |
*** dfarrell07 has joined #openstack-neutron | 20:27 | |
*** alagalah has quit IRC | 20:32 | |
*** alagalah has joined #openstack-neutron | 20:34 | |
marun | rkukura: Probably makes sense to run some check jobs at least. | 20:37 |
rkukura | marun: Makes sense. | 20:37 |
*** otherwiseguy has joined #openstack-neutron | 20:39 | |
comstud | kevinbenton, marun: actually, you can probably catch eventlet.hubs.get_hub().switch() | 20:43 |
comstud | that is what the current greenthread will call to switch to the hub (scheduler) | 20:43 |
comstud | to schedule the next greenthread | 20:43 |
marun | comstud: can you give a code pointer? | 20:43 |
comstud | yeah, sec | 20:44 |
comstud | https://github.com/eventlet/eventlet/blob/master/eventlet/hubs/__init__.py#L121 | 20:45 |
comstud | that is what all of the IO calls hit when they get EAGAIN | 20:45 |
comstud | you can see an example of it switching out to the hub here: | 20:45 |
comstud | https://github.com/eventlet/eventlet/blob/master/eventlet/hubs/__init__.py#L155 | 20:45 |
comstud | but there are cases where the current greenthread will directly switch to another, bypassing the hub | 20:46 |
comstud | Queue is a case | 20:46 |
comstud | https://github.com/eventlet/eventlet/blob/master/eventlet/queue.py#L288 | 20:46 |
comstud | but most cases, like if queue is empty and there are no current putters, you'll hit this: | 20:47 |
comstud | https://github.com/eventlet/eventlet/blob/master/eventlet/queue.py#L299 | 20:47 |
comstud | which hits this: https://github.com/eventlet/eventlet/blob/master/eventlet/queue.py#L130 | 20:47 |
comstud | which is the switch back to the hub, like the trampoline case | 20:47 |
*** banix has quit IRC | 20:47 | |
comstud | so get_hub().switch is the most common way the current greenthread yields | 20:48 |
marun | comstud: so monkey patching that and then raising an exception when the switch occurs when not wanted? | 20:49 |
comstud | yeah, that'll hit most cases | 20:50 |
marun | comstud: ok, great. | 20:50 |
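A minimal sketch of the idea discussed above: replace the hub's `switch` with a function that raises while a guarded section is active. The names `forbid_switch`, `UnexpectedYield`, and `FakeHub` are hypothetical and used only for illustration. `FakeHub` stands in for the object returned by `eventlet.hubs.get_hub()` so the example runs without eventlet installed. As noted in the discussion, this would not catch direct greenthread-to-greenthread switches that bypass the hub.

```python
import contextlib


class UnexpectedYield(RuntimeError):
    """Raised when guarded code tries to switch to the hub."""


@contextlib.contextmanager
def forbid_switch(hub):
    """Temporarily replace hub.switch so any yield to the hub raises."""
    original = hub.switch

    def guarded_switch(*args, **kwargs):
        raise UnexpectedYield("greenthread yielded inside a guarded section")

    hub.switch = guarded_switch
    try:
        yield
    finally:
        # Always restore the original method, even if the body raised.
        hub.switch = original


class FakeHub(object):
    """Hypothetical stand-in for the hub; a real hub does much more."""

    def switch(self):
        return "switched"
```

With eventlet, one would presumably pass `eventlet.hubs.get_hub()` as `hub` and wrap the transaction body in `with forbid_switch(hub):`; whether a real hub tolerates having the attribute replaced this way is an assumption to verify.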
marun | comstud: I was previously thinking that this had to be on all transactions, but maybe it needs to be more selective... | 20:50 |
comstud | something using a Queue could hit edge cases where the current greenthread switches directly to another greenthread | 20:50 |
comstud | where you can't really easily predict which one, so you can't really catch it | 20:51 |
marun | comstud: so that's where logging might come in handy so we can trace things manually. | 20:51 |
comstud | I can't see why trapping a switch *back* to the current greenthread wouldn't also give you what you want, though | 20:51 |
*** coolsvap has quit IRC | 20:51 | |
comstud | I mean, you detect it late, but you'll detect it 100% of the time | 20:51 |
*** dvorkinista has quit IRC | 20:51 | |
*** harlowja is now known as harlowja_away | 20:52 | |
marun | comstud: So long as it's detected, yeah. | 20:52 |
marun | comstud: and what's the call to switch back? | 20:52 |
comstud | the problem is the traceback would be the other greenthread, so you can't see exactly what caused the switch in the first place | 20:52 |
marun | comstud: maybe we need the option of both, then. | 20:53 |
sdague | the neutron deadlock bug still seems to be a problem | 20:53 |
sdague | not sure who's tracking that one now | 20:53 |
comstud | eventlet.greenthread.getcurrent() would be your current greenthread | 20:53 |
*** vkozhukalov_ has quit IRC | 20:54 | |
*** arosen1 has quit IRC | 20:54 | |
*** ajo has quit IRC | 20:54 | |
comstud | a .switch() on it is what should switch back | 20:54 |
marun | sdague: we have a patch in play we want to run check jobs to make sure it's sane | 20:54 |
marun | sdague: https://review.openstack.org/#/c/80413/ | 20:54 |
sdague | marun: great | 20:54 |
marun | sdague: also we're discussing a more general fix for eventlet yield in transaction w/lock | 20:54 |
comstud | marun: yeah, both might be best.. so you get a more useful traceback 99% of the time.. | 20:54 |
kevinbenton | comstud: thanks for the pointer. do you know what happens if one greenthread spawns another? | 20:55 |
comstud | the current greenthread stays active | 20:55 |
kevinbenton | comstud: will that just queue it | 20:55 |
comstud | the new greenthread gets put on the scheduling list | 20:55 |
comstud | yeah | 20:55 |
kevinbenton | comstud: ok | 20:55 |
sdague | marun: well 80413 seems to have failed the neutron tests, though I don't see the deadlock in the logs | 20:56 |
marun | comstud: so the other question, is there any harm in yielding in a transaction that holds no explicit lock? | 20:56 |
marun | sdague: patch set 2 failed | 20:56 |
marun | sdague: 3 is still running | 20:56 |
sdague | oh, I missed that, sorry | 20:56 |
kevinbenton | marun: i think it just failed 6 mins ago, didn't it? | 20:56 |
marun | kevinbenton: the check job for the previous patch reported 6m ago | 20:57 |
marun | kevinbenton: still waiting on the results for the current patch | 20:57 |
kevinbenton | oh, the with statement i used doesn't work in py26 | 20:57 |
comstud | marun: if there happens to be no DB lock acquired before the yield, i would think it would be fine | 20:57 |
marun | comstud: ok, so initially it probably makes sense to prevent yields always | 20:58 |
comstud | but i'm not sure you can predict that without knowledge of the DB backend | 20:58 |
kevinbenton | http://logs.openstack.org/13/80413/3/check/gate-neutron-python26/5e589f5/console.html | 20:58 |
comstud | which seems like a bad assumption | 20:58 |
marun | comstud: then we'll probably want to see about tying into sqlalchemy so that only explicit locking prevents yield | 20:58 |
comstud | this problem goes completely away when we use Thread pools for DB calls | 20:58 |
marun | (both in a transaction, of course) | 20:58 |
comstud | but we need an eventlet fix for that too | 20:59 |
marun | comstud: when is that scheduled to happen? | 20:59 |
comstud | they have a patch, i question whether it performs well | 20:59 |
marun | comstud: who has a patch? I'm unclear as to where this work would be done (library, project, etc) | 21:00 |
comstud | eventlet | 21:00 |
comstud | i think it's issue 137 | 21:00 |
comstud | *checks* | 21:00 |
comstud | https://bitbucket.org/eventlet/eventlet/issue/137/ | 21:01 |
comstud | it's a long read | 21:01 |
*** nati_ueno has joined #openstack-neutron | 21:01 | |
comstud | or actually, my proposed fix: https://bitbucket.org/eventlet/eventlet/pull-request/29/fix-use-of-semaphore-with-tpool-issue-137/diff | 21:01 |
comstud | is a long read | 21:01 |
comstud | my proposed fix is not quite fool proof, so it didn't merge | 21:01 |
marun | comstud: so timeline for a proper fix is indeterminate, gotcha | 21:02 |
comstud | correct | 21:02 |
comstud | we've been using my patch on a limited basis in public cloud | 21:03 |
marun | comstud: just checking whether it makes sense to put effort into the workaround, it sounds like it is worthwhile. | 21:03 |
comstud | but it's not perfect | 21:03 |
comstud | anyway, this is what caused me to look at alternatives to eventlet | 21:03 |
marun | and are there any? | 21:04 |
comstud | gevent is based on eventlet so has the same issues | 21:04 |
marun | java/go/node.js? :p | 21:04 |
comstud | anything else is a radical change in how we do things | 21:04 |
marun | (i kid, i love python) | 21:04 |
comstud | haha | 21:04 |
kevinbenton | perl | 21:04 |
marun | lol | 21:04 |
comstud | anyway, so I have something that's thread safe that has a compatible API | 21:05 |
comstud | written in C | 21:05 |
marun | an eventlet replacement? | 21:05 |
comstud | so it also happens to be much faster than eventlet | 21:05 |
comstud | yeah | 21:05 |
marun | and why aren't we using it? | 21:05 |
*** manishg has joined #openstack-neutron | 21:05 | |
comstud | it's not quite done... and i was deciding on a name.. I need to put it up on github now that I have one | 21:05 |
comstud | it's like 90% usable probably | 21:05 |
*** leseb has quit IRC | 21:05 | |
kevinbenton | restricted to a single thread? :-) | 21:05 |
comstud | no, completely thread safe! | 21:06 |
comstud | it actually uses threads internally to get some more parallelism wrt doing I/O | 21:06 |
comstud | but even without that, most operations take about 10% of the time of eventlet | 21:07 |
comstud | like greenthread switching/scheduling | 21:07 |
marun | comstud: and what's the catch? :) | 21:07 |
comstud | kinda started out as a fun side project, but I think it's proven to me so far it could be a replacement | 21:07 |
marun | comstud: It sounds great. | 21:07 |
comstud | catch is it's not quite done and probably has unknown bugs! | 21:07 |
comstud | :) | 21:07 |
comstud | anyway, Coming Soon (tm). | 21:08 |
marun | comstud: ah, fair enough. | 21:08 |
marun | comstud: look forward to seeing it! | 21:08 |
openstackgerrit | Kevin Benton proposed a change to openstack/neutron: Add a semaphore to some ML2 operations https://review.openstack.org/80413 | 21:08 |
kevinbenton | marun: ^^ | 21:08 |
kevinbenton | syntax i used was not py26 acceptable | 21:09 |
kevinbenton | rkukura: ^^ | 21:09 |
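The log does not say which `with` construct failed on py26. One common cause, shown here only as an assumption, is a single `with` statement listing several context managers, which Python 2.6 does not accept. Nesting the statements works on 2.6 and later. The names `lock_a`, `lock_b`, and `update_both` are hypothetical.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()


def update_both(log):
    """Hold both locks using nested with statements (valid on Python 2.6+)."""
    # Python 2.7+ also accepts "with lock_a, lock_b:", but 2.6 rejects it
    # as a syntax error, so the nested form is the portable spelling.
    with lock_a:
        with lock_b:
            log.append("both locks held")
    return log
```

Both locks are released in reverse order when the blocks exit, the same as with the combined form.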
marun | kevinbenton: let's make sure it passes the check jobs consistently at least a couple of times and then we can merge and see if it improves the gate failure rate | 21:10 |
comstud | bbs, late food break | 21:11 |
marun | comstud: ciao | 21:11 |
kevinbenton | marun: can you re-assign the parent bug to salv-orlando? https://bugs.launchpad.net/neutron/+bug/1283522 | 21:11 |
kevinbenton | marun: every time i patch it re-assigns it | 21:12 |
jogo | for the latest neutron bug: it's older than we thought https://review.openstack.org/#/c/81604/ | 21:13 |
*** thuc__ has quit IRC | 21:14 | |
*** harlowja_away is now known as harlowja | 21:14 | |
*** thuc has joined #openstack-neutron | 21:15 | |
*** thuc_ has joined #openstack-neutron | 21:17 | |
*** RajeshMohan has quit IRC | 21:17 | |
*** thuc_ has quit IRC | 21:17 | |
jogo | looks like there was a big spike on 3/12 | 21:17 |
jogo | marun: ^ | 21:17 |
*** mwagner__ has quit IRC | 21:17 | |
*** carlp has quit IRC | 21:17 | |
*** baoli has joined #openstack-neutron | 21:17 | |
*** thuc_ has joined #openstack-neutron | 21:18 | |
*** RajeshMohan has joined #openstack-neutron | 21:18 | |
*** thuc_ has quit IRC | 21:18 | |
*** thuc_ has joined #openstack-neutron | 21:18 | |
*** thuc_ has quit IRC | 21:19 | |
*** thuc has quit IRC | 21:19 | |
*** tvardeman has quit IRC | 21:20 | |
*** thuc has joined #openstack-neutron | 21:20 | |
*** mlavalle has quit IRC | 21:25 | |
*** bvandenh has quit IRC | 21:27 | |
*** sweston has quit IRC | 21:27 | |
*** saju_m has quit IRC | 21:28 | |
openstackgerrit | Dhanashree Gosavi proposed a change to openstack/neutron: Add support for router scheduling in Cisco N1kv Plugin https://review.openstack.org/77323 | 21:30 |
marun | jogo: danke | 21:31 |
marun | kevinbenton: if you're working on it, you can own it for now :) | 21:31 |
marun | salv-orlando is probably happy to not have responsibility for it. he's always stepping up because somebody has to, but he's also chronically overworked. (though feel free to correct me, salv-orlando, if I'm wrong!) | 21:32 |
kevinbenton | marun: i think he's offline for the night (~UTC), which is why i proposed a new patch | 21:33 |
kevinbenton | marun: since this seems to be painful in the gate | 21:33 |
kevinbenton | marun: it's selfish as much as anything. i have 8 patches doing battle in the gate :-) | 21:34 |
marun | kevinbenton: for critical bugs, I don't think patch ownership really comes into it. If you can move things forward, great. | 21:34 |
*** skath has quit IRC | 21:35 | |
*** suresh12 has quit IRC | 21:37 | |
*** BuSerD has joined #openstack-neutron | 21:39 | |
*** suresh12 has joined #openstack-neutron | 21:39 | |
*** suresh12 has quit IRC | 21:40 | |
rkukura | marun, kevinbenton: Did we decide whether to do a few rechecks with this first, or just merge it ASAP? | 21:43 |
kevinbenton | rkukura: with the current failure rate, i suggest we merge now | 21:44 |
kevinbenton | rkukura: it will implicitly get checked many times due to the gate resets | 21:44 |
kevinbenton | rkukura: and if one of them fails it will get dumped back to us | 21:45 |
rkukura | kevinbenton: true | 21:45 |
*** SumitNaiksatam has quit IRC | 21:46 | |
*** krtaylor has quit IRC | 21:46 | |
*** suresh12 has joined #openstack-neutron | 21:47 | |
*** dguitarbite_ has joined #openstack-neutron | 21:48 | |
rkukura | kevinbenton: gave it my +2, will be AFK for a while | 21:48 |
kevinbenton | rkukura: sounds good | 21:48 |
*** oda-g has joined #openstack-neutron | 21:49 | |
*** leseb has joined #openstack-neutron | 21:49 | |
*** networkstatic is now known as networkstatic_zZ | 21:52 | |
*** krtaylor has joined #openstack-neutron | 21:52 | |
oda-g | nati_ueno: ping | 21:54 |
*** zzelle has quit IRC | 21:56 | |
*** jobewan has quit IRC | 22:00 | |
*** dvorkinista has joined #openstack-neutron | 22:01 | |
*** jecarey_ has quit IRC | 22:02 | |
*** devlaps has quit IRC | 22:03 | |
marun | kevinbenton: I can approve then | 22:08 |
marun | kevinbenton: as soon as gate check passes | 22:09 |
*** gdubreui has joined #openstack-neutron | 22:09 | |
kevinbenton | marun: ok | 22:11 |
*** ijw has joined #openstack-neutron | 22:13 | |
*** devlaps has joined #openstack-neutron | 22:18 | |
*** armax has quit IRC | 22:19 | |
*** sweston has joined #openstack-neutron | 22:26 | |
openstackgerrit | A change was merged to openstack/neutron: BigSwitch: Watchdog thread start after servers https://review.openstack.org/81090 | 22:29 |
openstackgerrit | A change was merged to openstack/neutron: Add session persistence support for NVP advanced LBaaS https://review.openstack.org/59146 | 22:29 |
*** dfarrell07 has quit IRC | 22:30 | |
*** thuc has quit IRC | 22:33 | |
*** thuc has joined #openstack-neutron | 22:33 | |
*** arosen1 has joined #openstack-neutron | 22:36 | |
*** thuc has quit IRC | 22:38 | |
*** SridharG has quit IRC | 22:39 | |
marun | kevinbenton: still py26 error? | 22:39 |
*** _cjones_ has quit IRC | 22:42 | |
*** thedodd has quit IRC | 22:42 | |
*** mwagner__ has joined #openstack-neutron | 22:42 | |
*** _cjones_ has joined #openstack-neutron | 22:42 | |
openstackgerrit | Ian Wienand proposed a change to openstack/neutron: Record and log reason for dhcp agent resync https://review.openstack.org/81173 | 22:45 |
*** jorgem1 has quit IRC | 22:45 | |
*** ijw has quit IRC | 22:49 | |
*** dims has quit IRC | 22:51 | |
*** ijw has joined #openstack-neutron | 22:53 | |
*** SumitNaiksatam has joined #openstack-neutron | 22:53 | |
*** armax has joined #openstack-neutron | 22:58 | |
*** harlowja is now known as harlowja_away | 22:59 | |
kevinbenton | marun: no, that was a timeout error | 23:02 |
kevinbenton | marun: that hit on some other patches too | 23:02 |
marun | kevinbenton: so problem on the infra side? | 23:02 |
kevinbenton | marun: i think something changed in the gate on the limit | 23:02 |
kevinbenton | marun: yeah | 23:02 |
kevinbenton | marun: i filed the bug i referenced, but it might be invalid because i filed it in the neutron project | 23:03 |
*** sneezewort has quit IRC | 23:03 | |
marun | kevinbenton: hmmm. so a timeout as in the job took too long? | 23:04 |
kevinbenton | marun: yeah | 23:04 |
*** yamahata has quit IRC | 23:04 | |
marun | yee. our unit tests suck | 23:04 |
kevinbenton | marun: they are just really extensive | 23:04 |
kevinbenton | marun: :-) | 23:04 |
marun | kevinbenton: no, they suck :) | 23:04 |
marun | kevinbenton: It's my priority to fix them as of last week, but I keep getting sidelined. | 23:05 |
kevinbenton | marun: to know if the XML encoding REALLY works, you must run all of the unit tests twice ;-) | 23:05 |
marun | kevinbenton: ….and tempest, don't forget tempest | 23:05 |
marun | kevinbenton: without running the same code paths at least 1000 times we have no guarantee that things will work! | 23:05 |
*** nati_ueno has quit IRC | 23:07 | |
*** dims has joined #openstack-neutron | 23:07 | |
kevinbenton | marun: new lines of code are like new shoes. you have to break them in before you can find out if they will work | 23:08 |
marun | kevinbenton: i mean, just because a statement worked one way, doesn't mean it will work another. | 23:08 |
*** otherwiseguy has quit IRC | 23:13 | |
*** ijw has quit IRC | 23:13 | |
*** tongli has quit IRC | 23:13 | |
*** blogan has quit IRC | 23:18 | |
openstackgerrit | Ian Wienand proposed a change to openstack/neutron: Log dnsmasq host file generation https://review.openstack.org/79843 | 23:19 |
*** armax has quit IRC | 23:20 | |
openstackgerrit | A change was merged to openstack/neutron: API layer documentation https://review.openstack.org/79675 | 23:20 |
openstackgerrit | Ian Wienand proposed a change to openstack/neutron: Log dnsmasq host file generation https://review.openstack.org/79843 | 23:20 |
*** zhipeng has joined #openstack-neutron | 23:26 | |
*** thuc has joined #openstack-neutron | 23:28 | |
*** sweston has quit IRC | 23:34 | |
*** overlayer has quit IRC | 23:36 | |
*** harlowja_away is now known as harlowja | 23:38 | |
*** kfox1111 has joined #openstack-neutron | 23:41 | |
kfox1111 | So... About once a day, I'm seeing qpid-stat show: | 23:41 |
kfox1111 | q-plugin 5.81k 1.72m 1.72m 3.76m 1.11g 1.11g 1 2 | 23:42 |
kfox1111 | 5.81 thousand stuck messages. restarting neutron-server causes it to drain the queue. | 23:42 |
kfox1111 | what could cause this? | 23:42 |
openstackgerrit | Dhanashree Gosavi proposed a change to openstack/neutron: Add support for router scheduling in Cisco N1kv Plugin https://review.openstack.org/77323 | 23:43 |
*** singhs has quit IRC | 23:43 | |
*** networkstatic_zZ is now known as networkstatic | 23:44 | |
kevinbenton | marun: still around? | 23:53 |
marun | kevinbenton: yes | 23:53 |
openstackgerrit | Evgeny Fedoruk proposed a change to openstack/neutron: Cancelling thread start while unit tests running https://review.openstack.org/81323 | 23:53 |
*** armax has joined #openstack-neutron | 23:54 | |
kevinbenton | marun: should be receiving a verdict from jenkins on the ML2 patch shortly | 23:54 |
*** leseb has quit IRC | 23:54 | |
kevinbenton | marun: if it passes it would be good to send it to war with the gate for the evening | 23:54 |
kevinbenton | marun: well it looks like it will pass, only job is non-voting | 23:55 |
marun | kevinbenton: ok | 23:55 |
*** mwagner_dontUseM is now known as mwagner | 23:56 |