Thursday, 2021-10-28

opendevreviewMerged openstack/neutron stable/wallaby: Delete SG log entries when SG is deleted  https://review.opendev.org/c/openstack/neutron/+/81087000:10
opendevreviewFederico Ressi proposed openstack/neutron master: Change tobiko CI job in the periodic queue  https://review.opendev.org/c/openstack/neutron/+/81397707:35
opendevreviewManu B proposed openstack/os-ken master: Msgpack version upgrade to 1.0.0  https://review.opendev.org/c/openstack/os-ken/+/81578408:02
opendevreviewRodolfo Alonso proposed openstack/neutron master: Replace "tenant_id" with "project_id" in OVO base  https://review.opendev.org/c/openstack/neutron/+/81581409:41
opendevreviewManu B proposed openstack/os-ken master: Msgpack version upgrade to 1.0.0  https://review.opendev.org/c/openstack/os-ken/+/81578409:41
opendevreviewRodolfo Alonso proposed openstack/neutron master: Replace "tenant_id" with "project_id" in metering service  https://review.opendev.org/c/openstack/neutron/+/81480709:43
tobias-urdinralonsoh: can i ask for a +w on https://review.opendev.org/c/openstack/neutron/+/815310 thanks! :)09:55
ralonsohtobias-urdin, done09:57
em__Is there any way to set the MTU of ovs-system?10:01
em__AFAICS it needs to be set on all nodes10:01
ralonsohem__ you can set the MTU of a network10:06
ralonsohall ports attached to this network will inherit it10:06
opendevreviewRodolfo Alonso proposed openstack/neutron stable/xena: Allow to set the OpenFlow protocol when needed  https://review.opendev.org/c/openstack/neutron/+/81216210:08
em__ralonsoh, are you referring to global_physnet_mtu?10:13
em__or are you referring to 'ovs-vsctl set Interface br-int mtu_request=1400'?10:13
ralonsohem__, none of them10:20
ralonsohthe global_physnet_mtu is for physical networks, this is the max limit10:20
ralonsohfrom the doc:10:21
ralonsoh"MTU of the underlying physical network. Neutron uses this value to calculate MTU for all virtual network components. For flat and VLAN networks, neutron uses this value without modification. For overlay networks such as VXLAN, neutron automatically subtracts the overlay protocol overhead from this value. Defaults to 1500, the standard value for Ethernet."10:21
ralonsohyou should not set the MTU of any interface of OVS manually10:21
ralonsohand br-int port (and interface) is an internal port10:22
ralonsohtraffic does not use this port10:22
ralonsohso the MTU of this port is irrelevant10:22
em__ralonsoh, it was extremely relevant for us, maybe i could explain and you could correct what maybe was a wrong assumption?10:23
ralonsohem__, again, the MTU of br-int port is NOT relevant10:23
ralonsohtraffic does not use this port10:23
ralonsohby setting the MTU of br-int port you are not limiting the MTU of br-int bridge10:24
em__ralonsoh, I'm not doubting you, just trying to explain so you can maybe point out where we had an issue10:24
ralonsohthe MTU setting is per interface10:24
ralonsohyou didn't explain what the issue is10:25
*** jp is now known as Guest426510:25
em__ralonsoh, our mng network is a vswitch of our provider. Since it is itself encapsulated in a VXLAN and then we use a VLAN inside, we have an MTU of 1400 on this interface. It is critical to use this MTU when we communicate on this vswitch10:25
em__So all our nodes (controller/computes) share this one vswitch which is (AFAIU) not controlled by openstack at all - it exists pre-deployment and works pre-deployment. It is a usual linux interface with a vlan tag, something like enp9s0.400210:26
ralonsohso for your OpenStack deployment, the physical MTU must be 140010:27
ralonsohset this value then10:27
em__we did that (in terms of global_physnet_mtu: 1400 in the neutron.conf of our ovn-controller)10:27
em__this seems to work just fine. We then have an additional vswitch, also from the provider, which is our subnet for the floating ips. Again MTU 1400 due to the VXLAN+VLAN of the provider10:28
ralonsohand what's the problem then?10:28
em__this vswitch we only attached to the controller (since we do not want to use DVR). It seems like we run into trouble with that and how MTU is now handled10:28
em__when we create the network in openstack we use an MTU of 1400 for it: openstack network create --external --share --mtu 1400 --provider-physical-network physnet1 --provider-network-type flat provider-wan10:29
em__now we also create a geneve (self-service) network (with no particular MTU setting) put an instance into that network, create a router ... assign a floating ip and all this.10:30
em__Now the problematic part is that ICMP works (without -s, i.e. default size), but nothing other than ICMP does. We can neither wget from inside the instance nor connect into the instance via SSH; the connection seems to stall 10:31
em__that is why we suspected an MTU issue. 10:31
em__also, the controller interface seems to be very slow when we do not set the MTU of br-int (ovs-vsctl set Interface br-int mtu_request=1400) - or at least we see the difference. We did not measure this yet10:32
em__I think that's it. Still not doubting your point, but where do you think we went down the wrong rabbit hole?10:32
ralonsohat this point I don't see where the traffic is being dropped. If I'm not wrong, from what you are saying is in the GW interface, when connecting to the external network10:33
ralonsohand from there I don't know what type of overlaying network/VLAN config you have10:34
ralonsohcheck the physical network max MTU10:34
ralonsohthen take into account the size of the overlaying network (not applied by OpenStack)10:34
ralonsohand use this MTU10:34
em__let me ask some quick questions so I get your terms right and do not confuse you with wrong answers: physical network MTU refers to the MTU used on the vswitches, so e.g. enp9s0.4001(mng)/enp9s0.4002(wan)?10:35
ralonsohin any case, setting the br-int MTU is irrelevant10:36
ralonsohthat will do nothing10:36
em__i do not understand yet. Trying to. What is the physical network MTU you refer to?10:36
ralonsohthe underlying physical network10:37
ralonsohthat is the infrastructure network installed in the premises10:37
ralonsohsomething openstack cannot control10:37
ralonsohor set10:37
em__in this case, that is the vswitch we get from the provider. we have 2, both with MTU 140010:38
em__so setting global_physnet_mtu: 1400 - as you already confirmed - was the right thing. Now you continued with 'now calculate your MTU downwards from that point' - right?10:39
em__since every geneve / vxlan or vlan added on top will add an additional encapsulation header, the internal max mtu will become smaller and smaller - that is what you refer to, right?10:39
ralonsohem__, the mtu will be calculated by the driver type in Neutron10:40
ralonsohdepending on the header size10:40
ralonsohyou don't need to make any math there10:41
em__i understand. So from your POV setting global_physnet_mtu: 1400 is all we need to do?10:42
ralonsohyes10:42
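The arithmetic Neutron does on ralonsoh's behalf can be sketched like this. The overhead figures below are typical IPv4 numbers, not read out of Neutron's type drivers (Geneve overhead in particular varies with the options carried), so treat them as illustrative:

```shell
# Rough sketch of how the per-network MTU is derived from global_physnet_mtu.
PHYSNET_MTU=1400          # what the provider's vswitch supports
VXLAN_OVERHEAD=50         # 14 outer eth + 20 IPv4 + 8 UDP + 8 VXLAN
GENEVE_OVERHEAD=58        # as above, plus the larger Geneve header/options

echo "flat/VLAN network MTU: ${PHYSNET_MTU}"
echo "VXLAN network MTU:     $((PHYSNET_MTU - VXLAN_OVERHEAD))"
echo "Geneve network MTU:    $((PHYSNET_MTU - GENEVE_OVERHEAD))"
```

With the stock 1500 physnet MTU the same subtraction gives the familiar 1450 (VXLAN) and 1442 (OVN Geneve) defaults.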
em__i understand. Do you think setting the MTU when creating the network is correct?10:44
em__(still here)10:44
ralonsohif you are creating an external network, let neutron define this value, according to the external network provided and the network type10:45
ralonsohdo not provide the mtu value10:46
em__understood10:46
em__ok so i guess i wipe the cluster and use global_physnet_mtu:1400 only and see what happens10:47
em__Thank you a lot for sorting things out10:47
em__I'm not sure if this is related or is a neutron question at all (might rather be nova?). When we start any cloud-init based instance (debian-genericcloud or cirros), they seem unable to contact the meta-data service https://gist.github.com/EugenMayer/42a0f13ccf5f18076c4e2d84655bda66 - could that be a communication issue based on OVN/neutron or the way meta-data has been deployed (on which nodes) - or can nobody tell?10:51
ralonsohem__, is the metadata present in this compute node?10:52
ralonsohcheck the metadata namespace10:52
em__we have a neutron_ovn_metadata_agent on each compute10:54
ralonsohand the metadata namespace?10:54
ralonsohis it there?10:54
ralonsohovnmeta-<id_of_the_network>10:54
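The check being asked for, as a sketch (the UUID below is a placeholder for the self-service network's neutron ID; on a box without OVN the fallback messages fire instead):

```shell
# Verify the OVN metadata agent created its per-network namespace.
# NET_ID is a placeholder -- substitute your neutron network UUID.
NET_ID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
NS="ovnmeta-${NET_ID}"

# List namespaces (note: 'ip netns', not 'ip addr' -- the namespaces do
# not show up as interfaces in the root namespace):
ip netns 2>/dev/null | grep ovnmeta || echo "no ovnmeta namespace found"

# If present, inspect the interfaces inside it:
ip netns exec "$NS" ip a 2>/dev/null || echo "namespace $NS not present"
```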
em__on the compute, there is nothing matching ip addr | grep ovnmeta10:55
ralonsohno10:55
ralonsohip netns10:55
em__ip netns is empty, on all computes and controller10:56
ralonsohso you have a problem, the metadata agent didn't create the namespace10:56
ralonsohcheck the network10:56
ralonsohhow many subnets do you have?10:56
em__ok. What is the direction on this: does neutron_ovn_metadata_agent start up and try to talk back to another OVS service (most of those run on our controller; we do not have dedicated gateway or db nodes)?10:57
opendevreviewMerged openstack/neutron stable/victoria: [ovn] Stop monitoring the SB MAC_Binding table to reduce mem footprint  https://review.opendev.org/c/openstack/neutron/+/81487010:58
ralonsohno, ovn metadata agent only talks to neutron server10:58
em__ralonsoh, right now we have 2 networks: 1 provider-wan with 1 subnet and one self-service with 1 subnet (a /28 on the former, a /24 on the latter)10:58
ralonsohand does this subnet have DHCP enabled?10:58
ralonsohthe self-service one10:59
em__yes it has, but OVN handles DHCP specially, doesn't it?10:59
em__yes it has10:59
em__are you referring to neutron_ovn_dhcp_agent? it is off by default, we did not activate it10:59
ralonsohno, ovn metadata agent11:00
ralonsohthere is no ovn DHCP agent11:00
em__https://docs.openstack.org/kolla-ansible/latest/reference/networking/neutron.html#ovn-ml2-ovn 11:00
ralonsohthis is not an OVN dhcp agent, this is a neutron DHCP agent11:01
ralonsohfor other backends11:01
ralonsohfor example, if you have ironic nodes with ovs11:01
ralonsohbut there is no OVN DHCP agents11:01
em__you mean there are none (not yet implemented) - or there are none in our setup (which is the issue)?11:02
ralonsohDHCP is replied by OVN11:02
em__i think i lost you - though trying hard to follow11:03
ralonsohovn deployments do not need DHCP agents for internal ports (or SRIOV ports)11:04
ralonsohbut you can have multiple backends11:04
ralonsohor in this case, Ironic nodes11:04
ralonsohthose ports are not bound to OVN11:04
ralonsohthat means you need a neutron DHCP agent, spawning dnsmasq, to reply to DHCP requests from those ports11:05
ralonsohthis is what "neutron_ovn_dhcp_agent" means11:05
em__so basically as long as we deploy instances in an OVN based geneve or similar network, a DHCP agent is not needed. If we then also used, let's say, linux bridges, we would need to set "neutron_ovn_dhcp_agent" in addition?11:06
ralonsohyes11:07
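The toggle being discussed, as a kolla-ansible globals.yml sketch (variable names taken from the kolla docs linked above; the default shown is my reading, so verify it against your kolla release):

```yaml
# /etc/kolla/globals.yml
neutron_plugin_agent: "ovn"
# Off by default. Only needed when ports exist that OVN does not bind --
# e.g. Ironic bare-metal nodes or other backends that still need a
# neutron DHCP agent spawning dnsmasq:
neutron_ovn_dhcp_agent: "no"
```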
em__Understood - thinking about the above meta-data issue, i would say (and that is what you told me before too) it is unrelated to that11:08
opendevreviewMerged openstack/neutron stable/ussuri: [ovn] Stop monitoring the SB MAC_Binding table to reduce mem footprint  https://review.opendev.org/c/openstack/neutron/+/81486911:11
opendevreviewMerged openstack/neutron stable/ussuri: Fix OVN migration workload creation order  https://review.opendev.org/c/openstack/neutron/+/81561811:11
em__not sure what the issue with the meta-data service could be about. Found things like https://bugs.launchpad.net/charm-ovn-chassis/+bug/190768611:12
em__when we switched from OVS to OVN deployment (in our lab and on DC, fresh clean deployments via kolla) this issue started to appear11:13
opendevreviewRodolfo Alonso proposed openstack/neutron master: Replace "tenant_id" with "project_id" in OVO base  https://review.opendev.org/c/openstack/neutron/+/81581411:41
dmitriisHi Neutron Core Devs, could this spec https://review.opendev.org/c/openstack/neutron-specs/+/788821 be added to the next Neutron drivers meeting agenda? The RFE (https://bugs.launchpad.net/neutron/+bug/1932154) is in the rfe-approved state so this is about the spec only.11:48
ralonsohdmitriis, the RFE is approved12:46
ralonsohnow what you need it to have the spec reviewed12:46
ralonsohthe drivers meeting is on fridays12:46
ralonsohhttps://meetings.opendev.org/#Neutron_drivers_Meeting12:46
em__ralonsoh, started metadata in debug, those are the logs https://gist.github.com/EugenMayer/e2c5796f7c547c224a3aaccad2c35257 .. not sure what is failing. without debug https://gist.github.com/EugenMayer/bbda954d207fc03b1fd9ea0bb6d143cf12:51
ralonsohem__, can't connect to OVN, check that12:53
em__10.0.0.3 is my controller - what service will i try to connect to?12:54
ralonsohactually no12:54
ralonsohagent can't connect to OVS DB12:54
em__ovs sb?12:54
ralonsohnot SB or NB12:55
ralonsohOVS DB12:55
em__let me check that12:55
ralonsohaccording to those logs, the OVN NB controller is tcp:10.0.0.3:664212:55
em__i guess it is neither openvswitch_db, nor ovn_nb_db, nor ovn_sb_db12:55
ralonsohbut cannot connect to OVS controller tcp:127.0.0.1:664012:55
ralonsohthat's what I said12:56
ralonsohthis is the local OVS DB12:56
em__ok then this should be it12:56
em__it is a kolla based deployment, where the meta-data agent is separated from the ovn controller 12:56
em__2 different docker containers. Thus if 127.0.0.1 is used, it will not work, for obvious reasons, from the meta-data container12:57
em__i refer to https://gist.github.com/EugenMayer/b6611b9725a7697d0a392c1b3c1a568312:57
em__since both run isolated in 2 different containers, 127.0.0.1 cannot be used to connect from meta-data to ovn controller12:58
ralonsohovn controller is not the ovs service12:58
ralonsohovsdb-server.service12:58
ralonsohthis is the service you need to access12:58
em__hmm so that one needs to run on the meta-data container too13:00
ralonsohwhat is the value of "ovsdb_connection"?13:01
em__in the configuration?13:01
ralonsohyes, in this compute13:02
em__ralonsoh, https://gist.github.com/EugenMayer/fda894a3c6bb2bde5b9d1756b234cd7913:03
ralonsohchange it to "tcp:10.0.0.3:6640"13:03
opendevreviewOleg Bondarev proposed openstack/neutron master: Add Local IP L2 extension  https://review.opendev.org/c/openstack/neutron/+/80711613:05
opendevreviewOleg Bondarev proposed openstack/neutron master: Add Local IP L2 extension flows  https://review.opendev.org/c/openstack/neutron/+/81510213:05
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Fix expected exception raised when new scope types are enforced  https://review.opendev.org/c/openstack/neutron/+/81583713:06
em__ralonsoh, with13:10
em__cat /etc/kolla/neutron-ovn-metadata-agent/neutron_ovn_metadata_agent.ini  | grep ovsdb13:10
em__ovsdb_connection = tcp:10.0.0.3:664013:10
em__ovsdb_timeout = 1013:10
em__the meta-data agent does no longer come up at all13:11
ralonsohcheck the ovs db connection13:11
ralonsohovs-vsctl list connection13:11
ralonsohand set this value there13:11
em__https://gist.github.com/EugenMayer/9256af4e6d82dd993838bdb73472cb5313:12
ralonsohif you don't have access to the ovs db from this container, set an IP different from 127.0.0.1 and change ovsdb_connection and the connection value13:12
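Putting ralonsoh's two pieces together for a containerised agent, as a sketch (10.0.0.3 is the controller address from the paste, used here as a placeholder; the `[ovs]` section name is my reading of the agent's sample config, so verify it):

```ini
# neutron_ovn_metadata_agent.ini -- point the agent at a TCP endpoint of
# the node-local ovsdb-server instead of the in-container loopback:
[ovs]
ovsdb_connection = tcp:10.0.0.3:6640
ovsdb_timeout = 10
```

For this to work the host's ovsdb-server must actually listen on that address as well, e.g. via `ovs-vsctl set-manager ptcp:6640:10.0.0.3`, so that the connection target listed by `ovs-vsctl` matches the `ovsdb_connection` value.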
opendevreviewRodolfo Alonso proposed openstack/neutron master: Replace "tenant_id" with "project_id" in OVO base  https://review.opendev.org/c/openstack/neutron/+/81581413:13
em__ralonsoh, currently I'm a bit confused; over at kolla 127.0.0.1 seems to be the expected value, and looking at lsof there seems to be something listening on that port too13:14
em__the entire setup is automatically orchestrated - while I'm currently not sure anybody in kolla runs an OVN+Xena stack13:15
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Don't enforce scopes in the API policies UT temporary  https://review.opendev.org/c/openstack/neutron/+/81583813:15
em__I'm trying my best to make sense of it, thank you for helping all the way13:18
opendevreviewLucas Alvares Gomes proposed openstack/neutron master: [OVN] Update check_for_mcast_flood_reports() to check for mcast_flood  https://review.opendev.org/c/openstack/neutron/+/81584313:21
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Fix expected exception raised when new scope types are enforced  https://review.opendev.org/c/openstack/neutron/+/81583713:22
em__ralonsoh, i tried the exact same setup with OVS (https://github.com/EugenMayer/openstack-lab/tree/stable/ovs) and it works (meta-data), while the same setup using OVN has this meta-data issue (https://github.com/EugenMayer/openstack-lab/tree/stable/ovn). I would now test OVN+wallaby using the same configuration, since i got a hint in kolla that people run that in production right now. Would that help, probably together with creating a bug in 14:00
em__launchpad, especially if it works in Wallaby but not in Xena?14:00
dmitriisralonsoh: ack, I'll attend and propose an on-demand topic just to have some slot to answer possible questions about the spec.14:12
zigoyoctozepto: frickler: OpenVSwitch and OVN just got approved in the official Debian Bullseye backports, it will reach your local repository over night ! :)14:29
*** ricolin_ is now known as ricolin14:40
yoctozeptozigo: lovely!14:43
yoctozeptothank you for letting us know14:44
yoctozeptozigo: here's our gift to you https://review.opendev.org/c/openstack/governance/+/81585114:44
opendevreviewMamatisa Nurmatov proposed openstack/neutron master: Remove todo's in Y release  https://review.opendev.org/c/openstack/neutron/+/81585314:51
em__ralonsoh, wallaby worked without an issue, so it is Xena related without changing any other parts. Where should i open a bug report and which parts do you need?14:55
ralonsohem__, in https://launchpad.net/neutron14:58
ralonsohprovide agent logs in debug mode, at least the non working one14:58
ralonsohand provide the git hash of the versions used14:59
em__which versions of what? That will possibly be hard for me since i use kolla14:59
ralonsohthe version of Neutron15:00
ralonsohkolla deploys the services using a tag, doesn't it?15:00
em__not yet for Xena, they are in RC, so not yet tagged (latest)15:01
em__ralonsoh, should i link the reproducer? I have two vagrant stacks - they only differ in terms of wallaby / xena. that's it15:04
em__https://github.com/EugenMayer/openstack-lab/tree/stable/ovn - ovn xena15:04
em__https://github.com/EugenMayer/openstack-lab/tree/stable/ovn-wallaby - ovn wallaby15:04
ralonsohok, but more important is to know if the metadata agent creates the network namespace, the interfaces, adds the routes, etc15:05
ralonsohso all that is defined in https://github.com/openstack/neutron/blob/master/neutron/agent/ovn/metadata/agent.py#L400-L51615:06
zigoyoctozepto: \o/15:08
zigoTook 10 years until it happened ... :)15:08
zigoFYI, I'm currently building arm64 builds for Linaro, that may be useful too ...15:09
yoctozeptozigo: yeah, /me aware, thanks :-)15:10
zigoyoctozepto: FYI, I got my jenkins builder up and running, and it's currently building OVS.15:11
zigoI won't be building arch: all on it though...15:11
zigoUnfortunately, I need the (not in bullseye) openvswitch-source binary to build OVN.15:12
opendevreviewRodolfo Alonso proposed openstack/neutron master: Replace "target_tenant" with "target_project" in RBAC OVOs and models  https://review.opendev.org/c/openstack/neutron/+/81585515:12
yoctozeptozigo: I do not understand the last statement ;/ ovn has its own sources and what is "source binary"?15:13
zigoyoctozepto: This: https://packages.debian.org/search?keywords=openvswitch-source15:13
zigoopenvswitch-source_2.15.0+ds1-8_all.deb15:14
yoctozeptooh, weird15:14
zigoOVN needs the sources of OVS to build ...15:15
zigoThat's how we provide it.15:15
zigoBullseye doesn't have it, which is why I backported it too.15:15
yoctozeptook, makes sense15:15
zigoOtherwise, the Bullseye version of OVS is fine.15:15
em__ralonsoh, tried my best https://bugs.launchpad.net/neutron/+bug/194909715:18
ralonsohem__, "- Using Xena+OVS works."15:18
ralonsoh??15:18
ralonsohah ok15:18
ralonsohOVS15:18
ralonsohok then15:18
em__Xena+OVS works15:19
em__Wallaby+OVN works15:19
em__Xena+OVN does not work15:19
ralonsohem__, but again, is metadata agent creating the namespace?15:20
ralonsohif the namespace is created, do we have interfaces inside?15:20
em__Maybe someone from kolla needs to get involved to give you better insight into the exact version of neutron used - you can pull quay.io/openstack.kolla/ubuntu-source-neutron-metadata-agent:xena and check yourself though15:20
em__ralonsoh, on that topic, I'm not sure. The first time you asked me about the namespace, ip netns was empty. Then I fiddled around with the IP and all that - then i was asked in kolla to run that again, and then there was a namespace15:21
em__so I'm not sure about that.15:21
ralonsohthat's mandatory to try to debug this problem15:22
em__The question 'do we have interfaces inside' - I'm not sure what you mean. Inside the docker container? AFAIK not needed since --network host is used15:22
ralonsohthis is the first thing to check  here15:22
mlavalleslaweq: Happy birthday!15:22
ralonsohno, inside the namespace15:22
ralonsohip netns exec ovnmeta-xxxxxxxx ip a15:22
ralonsohso check step by step what https://github.com/openstack/neutron/blob/master/neutron/agent/ovn/metadata/agent.py#L400-L516 is doing15:23
em__i have no running stack right now. If you like, write what you need into the ticket; I'll provide that information tomorrow15:24
em__(as far as i can)15:24
opendevreviewRodolfo Alonso proposed openstack/neutron master: Replace "tenant_id" with "project_id" in OVO base  https://review.opendev.org/c/openstack/neutron/+/81581415:37
opendevreviewMerged openstack/neutron master: Set RPC timeout in PluginReportStateAPI to report_interval  https://review.opendev.org/c/openstack/neutron/+/81531015:56
em__ralonsoh, what is $meta in https://paste.opendev.org/show/810262/ ?16:00
ralonsohthe metadata namespace name16:00
ralonsohovnmeta-xxxx16:00
em__i see16:02
em__ip netns exec $meta tcpdump -vvnni tap40ede83a-61@if816:02
em__Cannot open network namespace "tcpdump": No such file or directory16:02
em__do i need to install tcpdump?16:02
ralonsohyes16:02
em__is that how one uses tcpdump in those ovn based networks? that's neat!16:03
em__ralonsoh, is that what you need? https://gist.github.com/EugenMayer/3b7d1fc4a42d7fc911229f38eec891dd16:07
em__i added it to the bug16:11
em__cu tomorrow16:12
opendevreviewTobias Urdin proposed openstack/neutron stable/xena: Set RPC timeout in PluginReportStateAPI to report_interval  https://review.opendev.org/c/openstack/neutron/+/81587917:58
opendevreviewMamatisa Nurmatov proposed openstack/neutron master: Remove todo's in Y release  https://review.opendev.org/c/openstack/neutron/+/81585318:13
opendevreviewGhanshyam proposed openstack/neutron master: DNM: testing tempest test change  https://review.opendev.org/c/openstack/neutron/+/81589818:42
-opendevstatus- NOTICE: mirror.bhs1.ovh.opendev.org filled its disk around 17:25 UTC. We have corrected this issue around 18:25 UTC and jobs that failed due to this mirror can be rechecked.18:44
opendevreviewMerged openstack/neutron stable/xena: Fix OVN migration workload creation order  https://review.opendev.org/c/openstack/neutron/+/81562119:29
opendevreviewMerged openstack/neutron stable/wallaby: Fix OVN migration workload creation order  https://review.opendev.org/c/openstack/neutron/+/81562019:30
opendevreviewMerged openstack/neutron stable/victoria: Fix OVN migration workload creation order  https://review.opendev.org/c/openstack/neutron/+/81561919:43
opendevreviewMerged openstack/neutron master: Networking guide: Add trunk limitation to min bandwidth  https://review.opendev.org/c/openstack/neutron/+/81560919:43
opendevreviewMerged openstack/neutron stable/victoria: Delete log entries when SG or port is deleted  https://review.opendev.org/c/openstack/neutron/+/81529922:32

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!