opendevreview | Miguel Lavalle proposed openstack/neutron master: [DNM] Add rate-limiting to metadata agents https://review.opendev.org/c/openstack/neutron/+/858879 | 00:11 |
---|---|---|
prometheanfire | I have a baremetal port that doesn't seem to be getting responses to dhcp requests, I can see the dhcp requests hitting the only chassis on the network, I'm unsure how to check ovn itself | 01:03 |
prometheanfire | well, more than checking that the port was assigned a chassis | 01:05 |
prometheanfire | I see the dataflow but I don't see packets making it to the namespace running the ovn metadata agent | 03:54 |
prometheanfire | I don't think it's related to https://bugs.launchpad.net/neutron/+bug/2007167 but it's hard to tell, I'm not sure where the packet is getting dropped | 03:56 |
prometheanfire | offhand, the tftp next server is set correctly, but I don't see any dhcp responses in the first place, so dunno | 03:57 |
prometheanfire | I do have enable_distributed_floating_ip enabled for ovn, but don't think that's related either | 03:59 |
prometheanfire | maybe because the network is external? | 04:06 |
tkajinam | hi. is there any good documentation to understand the features implemented in ovn-agent (not ovn-metadata-agent) as of 2023.1 ? | 06:39 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: WIP: ensure redirect-type=bridged not used for geneve networks https://review.opendev.org/c/openstack/neutron/+/878450 | 06:51 |
hjensas | dtantsur: afaict the job does not use ML2 baremetal, mechanism_drivers = openvswitch in /etc/neutron/plugins/ml2/ml2_conf.ini. i.e. there is no mechanism driver that binds vnic type: baremetal in this case. | 07:07 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: Add support for localnet_learn_fdb OVN option https://review.opendev.org/c/openstack/neutron/+/877675 | 07:08 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: Add support for localnet_learn_fdb OVN option https://review.opendev.org/c/openstack/neutron/+/877675 | 07:08 |
hjensas | dtantsur: I don't think the failed binding is the issue, i.e like you mention this used to work without the port being bound properly before ML2 baremetal. Unless something changed, is wiring the FIP now depending on the port being bound? | 07:11 |
ralonsoh | tkajinam, hi, this is now an empty shell. The plan during this cycle is to move every agent related stuff to this agent | 08:29 |
*** elvira2 is now known as elvira | 08:30 | |
ralonsoh | there is one feature implemented but only applies for HWOL environments that need QoS | 08:30 |
ralonsoh | very specific | 08:30 |
tkajinam | ralonsoh, ah, ok. thanks. | 08:51 |
slaweq | ralonsoh lajoskatona ykarel hi, please check https://review.opendev.org/c/openstack/neutron/+/876556 when You will have some time | 09:02 |
ralonsoh | sure | 09:02 |
slaweq | it seems it helps with memory consumption in CI jobs in many projects so we can use it too | 09:02 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: Ensure redirect-type=bridged not used for geneve networks https://review.opendev.org/c/openstack/neutron/+/878450 | 09:18 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: Ensure redirect-type=bridged not used for geneve networks https://review.opendev.org/c/openstack/neutron/+/878450 | 09:28 |
opendevreview | Merged openstack/ovn-octavia-provider stable/yoga: Fix broken pep8 jobs due to bandit 1.7.5 updated version https://review.opendev.org/c/openstack/ovn-octavia-provider/+/877464 | 10:46 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [sqlalchemy-20] Do not use strings for aatribute names in loader options https://review.opendev.org/c/openstack/neutron/+/878480 | 10:47 |
opendevreview | Frode Nordahl proposed openstack/neutron-lib master: ext-gw-multihoming: api-def and api-ref https://review.opendev.org/c/openstack/neutron-lib/+/870887 | 11:44 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [sqlalchemy-20] Provide SQL "case" expression correct input paremeters https://review.opendev.org/c/openstack/neutron/+/878526 | 11:48 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: Ensure redirect-type=bridged not used for geneve networks https://review.opendev.org/c/openstack/neutron/+/878450 | 11:50 |
opendevreview | Frode Nordahl proposed openstack/neutron master: Allow Multiple External Gateways https://review.opendev.org/c/openstack/neutron/+/873593 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: Add extra router attributes for ECMP and BFD https://review.opendev.org/c/openstack/neutron/+/874797 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Add end to end test for QosExtension https://review.opendev.org/c/openstack/neutron/+/877603 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Drop use of OVN_GW_PORT_EXT_ID_KEY https://review.opendev.org/c/openstack/neutron/+/877831 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Drop use of LR OVN_GW_NETWORK_EXT_ID_KEY https://review.opendev.org/c/openstack/neutron/+/877712 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Implement support for external-gateway-multihoming extension https://review.opendev.org/c/openstack/neutron/+/874199 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Allow L3 scheduler to be aware of current transaction https://review.opendev.org/c/openstack/neutron/+/874760 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Add helper for retrieving LR associated with LRP https://review.opendev.org/c/openstack/neutron/+/873698 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Apply soft anti-affinity for LRs with multiple LRPs when scheduling https://review.opendev.org/c/openstack/neutron/+/873699 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] OVNClient._get_router_ports: Drop unused parameter https://review.opendev.org/c/openstack/neutron/+/878527 | 11:54 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: Ensure redirect-type=bridged not used for geneve networks https://review.opendev.org/c/openstack/neutron/+/878450 | 11:54 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Implement support for external-gateway-multihoming extension https://review.opendev.org/c/openstack/neutron/+/874199 | 13:52 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Allow L3 scheduler to be aware of current transaction https://review.opendev.org/c/openstack/neutron/+/874760 | 13:52 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Add helper for retrieving LR associated with LRP https://review.opendev.org/c/openstack/neutron/+/873698 | 13:52 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Apply soft anti-affinity for LRs with multiple LRPs when scheduling https://review.opendev.org/c/openstack/neutron/+/873699 | 13:52 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Honor `enable_default_route_ecmp` attribute https://review.opendev.org/c/openstack/neutron/+/878531 | 13:52 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: Remove the ``OVNSqlFixture`` class workaround https://review.opendev.org/c/openstack/neutron/+/874669 | 14:15 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: Replace "tenant_id" with "project_id" in IPAM engine https://review.opendev.org/c/openstack/neutron/+/877533 | 14:17 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: Improve "sync_ha_chassis_group" method https://review.opendev.org/c/openstack/neutron/+/872023 | 14:18 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: WIP - Add ``OVNGatewayHAChassisGroup`` scheduler class https://review.opendev.org/c/openstack/neutron/+/872033 | 14:18 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [OVN] Remove backwards compatibility with OVN < v20.09 https://review.opendev.org/c/openstack/neutron/+/870621 | 14:23 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [OVN] OVN agent should register "Chassis_Private" by default https://review.opendev.org/c/openstack/neutron/+/878535 | 14:34 |
opendevreview | Rodolfo Alonso proposed openstack/neutron stable/2023.1: [OVN] OVN agent should register "Chassis_Private" by default https://review.opendev.org/c/openstack/neutron/+/878536 | 14:35 |
frickler | in my downstream CI, I'm seeing neutron startup issues with yoga for some time now. log looks like https://paste.opendev.org/show/bfIlfudaBGuZUsTSBoAf/ after the crash, neutron restarts and is working fine, but in the meantime CI has failed | 14:38 |
frickler | if I wait and retry things like port creation some minutes later, everything is fine. any clue on this? | 14:38 |
frickler | now happening 100%, maybe 30% or so, likely some kind of race. deployment is on 3 nodes if that matters | 14:39 |
frickler | s/now/not | 14:39 |
opendevreview | Luis Tomas Bolivar proposed openstack/neutron master: Ensure redirect-type=bridged not used for geneve networks https://review.opendev.org/c/openstack/neutron/+/878450 | 14:42 |
ykarel_ | frickler, what is the neutron version there, latest yoga? | 14:51 |
ykarel_ | 20.3.0 | 14:51 |
opendevreview | Mohammed Naser proposed openstack/neutron master: fix: add log message for periodic_sync_routers_task fullsync https://review.opendev.org/c/openstack/neutron/+/878248 | 14:53 |
frickler | ykarel_: latest stable/yoga | 14:53 |
ykarel_ | frickler, i recalled there were a couple of fixes in 20.3.0 but if latest then it can be something else | 15:06 |
opendevreview | Merged openstack/neutron master: [OVN] Remove "update_port_qos_with_external_ids_reference" https://review.opendev.org/c/openstack/neutron/+/874105 | 15:06 |
prometheanfire | for ovn, is there an issue with baremetal ports not being responded to for dhcp requests on external networks? I see requests but no responses from the only chassis on the network's physical interface | 15:06 |
ykarel_ | like https://review.opendev.org/c/openstack/neutron/+/865159, https://review.opendev.org/c/openstack/neutron/+/857773 | 15:07 |
ykarel_ | also seeing the hash empty, seems there is some issue with database | 15:08 |
ralonsoh | ykarel_, well, this is the ovsdbapp result | 15:10 |
ralonsoh | cause: Result queue is empty" | 15:10 |
ralonsoh | this could be an issue with the transaction | 15:11 |
frickler | can there be an issue with 865159 if multiple neutron-servers do their initial startup in parallel? just reading the commit message that sounds like a possible issue | 15:11 |
frickler | I don't think that scenario is tested in CI, either | 15:11 |
ralonsoh | frickler, when the Neutron server is started, is the OVN DB up? | 15:16 |
ralonsoh | prometheanfire, the DHCP for baremetal ports is provided in the Neutron controllers. Is there an ovn metadata agent running there? | 15:17 |
ralonsoh | and do you see the corresponding namespace (with the network name)? | 15:17 |
ralonsoh | network ID | 15:17 |
sahid | btw ralonsoh are you ok to unblock https://review.opendev.org/c/openstack/neutron/+/871113 ? | 15:20 |
ralonsoh | yes, done | 15:20 |
sahid | ralonsoh: thank you :-) | 15:21 |
frickler | ralonsoh: how could I check that? | 15:23 |
ralonsoh | frickler, is the ovsdb service running? | 15:25 |
prometheanfire | ralonsoh: ya, the namespace is there, running haproxy | 15:25 |
ralonsoh | prometheanfire, and do you see the packets arriving to the namespace? | 15:26 |
prometheanfire | ralonsoh: no packets from the external network, only packets from the VMs on the host (controller is on a hypervisor) | 15:26 |
ralonsoh | that's a problem, let me check with lucasagomes if I'm right on this: the metadata agent port in the Neutron controller should be the one providing the IP for baremetal ports | 15:28 |
frickler | ralonsoh: ovn and ovs should be started well before neutron, yes, like some minutes. nothing obvious in their logs, too | 15:28 |
ralonsoh | prometheanfire, in any case, please report a launchpad bug describing the issue (and the version used) | 15:29 |
ralonsoh | we'll triage it ASAP | 15:29 |
prometheanfire | ralonsoh: ya, it's something that's getting beyond my ability to troubleshoot lol, tried looking at flows, etc | 15:29 |
prometheanfire | kk | 15:29 |
ralonsoh | frickler, let me check if this is a possible error in ovsdbapp | 15:32 |
ralonsoh | frickler, what version of ovsdbapp do you have? | 15:33 |
lucasagomes | ralonsoh, hi there, lemme read | 15:35 |
lucasagomes | ralonsoh, yes, that sounds correct to me | 15:36 |
ralonsoh | ok, so maybe we have a bug there. Now the point is why the metadata namespace is not receiving the dhcp requests | 15:36 |
lucasagomes | ralonsoh, cause for ports with VNIC_BAREMETAL, what ML2/OVN does is to create a port of type "external" which will be bound to a controller instead of the compute node | 15:37 |
lucasagomes | I say controller, but it will be bound to a chassis in OVN with the "enable-gw-as-chassis" flag | 15:37 |
lucasagomes | enable-chassis-as-gw* sorry | 15:37 |
ralonsoh | prometheanfire, ^^ do you have this flag? | 15:37 |
frickler | actually I just noticed that the deployment uses 20.2.0, so it doesn't have the patch ykarel_ mentioned. I need to check that, sorry for the confusion | 15:38 |
lucasagomes | https://docs.openstack.org/neutron/latest/admin/ovn/external_ports.html | 15:38 |
prometheanfire | ralonsoh: yes, I checked that :D | 15:38 |
prometheanfire | checked via 'ovs-vsctl get open . external-ids:ovn-cms-options' | 15:39 |
ralonsoh | frickler, that makes more sense because we use "ovsdb-client" commands | 15:39 |
ralonsoh | prometheanfire, and can you dump the traffic up to the controller? just to know where this traffic is dropped | 15:39 |
ralonsoh | the dhcp request I mean | 15:39 |
prometheanfire | tcpdump on the controller shows the dhcp request hitting the expected interface | 15:40 |
prometheanfire | not sure where the next place to dump would be | 15:40 |
ralonsoh | you mean the interface inside the metadata namespace? | 15:41 |
prometheanfire | no, outside the namespace | 15:41 |
ralonsoh | ok, the baremetal port dhcp request should reach the Neutron controller | 15:41 |
prometheanfire | inside the namespace I just see a bunch of arp-who-has and nothing else (which is odd because ssh is running on that network) | 15:42 |
ralonsoh | and the first interface should be the external bridge interface | 15:42 |
lucasagomes | prometheanfire, so you are using ML2/OVN with Neutron DHCP for baremetal? | 15:42 |
lucasagomes | prometheanfire, make sure u have the disable_ovn_dhcp_for_baremetal_ports config option set to True for that | 15:42 |
prometheanfire | ml2/ovn with ovn dhcp | 15:42 |
lucasagomes | ah ok | 15:43 |
prometheanfire | not using the 'old' neutron dhcp | 15:43 |
lucasagomes | sure | 15:43 |
ykarel_ | frickler, so then would be better to update to 20.3.0 as that contains couple of those fixes | 15:44 |
ralonsoh | and can you check in the NB database that you have a logical_switch_port with type "external" that matches the ID of the neutron port | 15:44 |
prometheanfire | ralonsoh: ya, nb database had a switch port labeled external | 15:46 |
ralonsoh | and what about the dhcp request in the external bridge interface? | 15:47 |
prometheanfire | I think I see a configuration problem on my end | 15:48 |
prometheanfire | my network mappings map to non-existent bridges/veth pairs | 15:48 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878441 | 15:49 |
opendevreview | Rodolfo Alonso proposed openstack/neutron stable/2023.1: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878442 | 15:49 |
prometheanfire | fat finger, probably... | 15:49 |
prometheanfire | let me fix that then try again (making sure the correct veth pair is used at least) | 15:49 |
opendevreview | Rodolfo Alonso proposed openstack/neutron stable/zed: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878443 | 15:50 |
prometheanfire | the ovs side name being wrong is a naming issue as long as the data flows into it correctly | 15:50 |
opendevreview | Rodolfo Alonso proposed openstack/neutron stable/yoga: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878444 | 15:50 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878441 | 15:50 |
opendevreview | Rodolfo Alonso proposed openstack/neutron stable/2023.1: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878442 | 15:50 |
opendevreview | Rodolfo Alonso proposed openstack/neutron stable/xena: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878445 | 15:51 |
opendevreview | Rodolfo Alonso proposed openstack/neutron stable/wallaby: Revert "Ensure vlan network traffic is not centralized" https://review.opendev.org/c/openstack/neutron/+/878446 | 15:51 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [OVS][QoS] Add QoS support for Trunk service, OVS driver https://review.opendev.org/c/openstack/neutron/+/839523 | 15:56 |
prometheanfire | ralonsoh: ok, with that fixed I do see dhcp requests in that namespace, but only for other interfaces, the one configured got filtered out, trying with the mac I saw in the namespace (updated the baremetal port address) | 16:04 |
prometheanfire | ralonsoh: switching the port address makes that address be filtered, could it be a security group filtering dhcp? | 16:07 |
prometheanfire | or port security | 16:09 |
ralonsoh | dhcp packets are accepted always | 16:09 |
ralonsoh | so now you see the DHCP requests reaching the namespace interface, right? | 16:09 |
prometheanfire | hmm, whatever (mac) address I configure for the baremetal port gets filtered and does not reach the namespace interface | 16:10 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Implement support for external-gateway-multihoming extension https://review.opendev.org/c/openstack/neutron/+/874199 | 16:10 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Honor `enable_default_route_ecmp` attribute https://review.opendev.org/c/openstack/neutron/+/878531 | 16:10 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Allow L3 scheduler to be aware of current transaction https://review.opendev.org/c/openstack/neutron/+/874760 | 16:10 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Add helper for retrieving LR associated with LRP https://review.opendev.org/c/openstack/neutron/+/873698 | 16:10 |
opendevreview | Frode Nordahl proposed openstack/neutron master: [ovn] Apply soft anti-affinity for LRs with multiple LRPs when scheduling https://review.opendev.org/c/openstack/neutron/+/873699 | 16:10 |
opendevreview | Frode Nordahl proposed openstack/neutron master: WIP [ovn] Add support for enable_default_route_bfd attribute https://review.opendev.org/c/openstack/neutron/+/878543 | 16:10 |
prometheanfire | the namespace interface sees DHCP requests for other mac addresses | 16:10 |
ralonsoh | but the mac address should be the same as the one in the neutron port database | 16:10 |
ralonsoh | for other mac address? | 16:11 |
prometheanfire | the mac address it sees is the one not configured for the neutron port | 16:11 |
prometheanfire | the mac address configured for the neutron port seems like it's dropped before reaching the namespace | 16:11 |
ralonsoh | the baremetal port mac address and the DB neutron port mac address must be the same | 16:11 |
ralonsoh | no no | 16:11 |
prometheanfire | they are the same | 16:11 |
ralonsoh | the dhcp request cannot have other mac | 16:12 |
ralonsoh | if so, dhcp server won't reply to the correct mac | 16:12 |
prometheanfire | I'm saying that the dhcp request from the configured mac address is not seen; only packets from ports that are not managed by openstack get forwarded to the namespace | 16:12 |
prometheanfire | the server sending requests tries one interface, fails to get a reply then tries the other interface | 16:13 |
prometheanfire | server sends dhcp request from ironic/neutron port (mac address matches), packet reaches ovn controller host but does not reach the namespace | 16:14 |
ralonsoh | so the packet is dropped in ovs | 16:14 |
prometheanfire | yes, that's what it seems like | 16:15 |
prometheanfire | ok, did a tcpdump all along the path, a dhcp reply is being sent before hitting the namespace, so maybe ovs handles it before putting the packet on the namespace network | 16:19 |
prometheanfire | now to watch closer... | 16:19 |
prometheanfire | hmm, ok, I think we are good? the host isn't responding to whatever is given as the dhcp response (it gets a response that it just may not be happy with) | 16:23 |
ralonsoh | so the dhcp reply is reaching the baremetal port | 16:24 |
ralonsoh | let me check one patch related to this | 16:24 |
prometheanfire | yes, I think it's short-circuited in ovs somewhere, it's the correct response though, options are set | 16:24 |
ralonsoh | prometheanfire, what version are you running? | 16:25 |
ralonsoh | do you have this patch: https://review.opendev.org/q/I59038639a8411c11c5fb8b366d9c858ef3db4f70 | 16:26 |
prometheanfire | of what? | 16:26 |
ralonsoh | Neutron version | 16:26 |
prometheanfire | yes | 16:26 |
prometheanfire | option 150 is sent | 16:26 |
ralonsoh | so at this point you need to check why the baremetal server is not accepting this dhcp reply | 16:27 |
prometheanfire | I'm thinking the hardware doesn't like ipxe, neutron part seems like it's working now from what I can see | 16:27 |
prometheanfire | yep | 16:27 |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: [OVN] Remove backwards compatibility with OVN < v20.09 https://review.opendev.org/c/openstack/neutron/+/870621 | 16:59 |
opendevreview | Miro Tomaska proposed openstack/neutron master: Fix intermittent failures in finding metada port in SB DB https://review.opendev.org/c/openstack/neutron/+/878549 | 17:03 |
*** ministry is now known as __ministry | 17:09 | |
opendevreview | Rodolfo Alonso proposed openstack/neutron master: Increase port name size and type to internal https://review.opendev.org/c/openstack/neutron/+/873118 | 17:37 |
*** Guest8553 is now known as atmark | 17:48 | |
*** atmark is now known as Guest8755 | 17:49 | |
*** Guest8755 is now known as atmark | 17:58 | |
opendevreview | Merged openstack/neutron master: [OVS] Parse the "permitted_ethertypes" at the FW initialization https://review.opendev.org/c/openstack/neutron/+/876997 | 19:19 |
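
The two host-side checks from the baremetal DHCP thread above (the `enable-chassis-as-gw` option in `ovn-cms-options`, and bridge mappings pointing at bridges that actually exist) could be scripted roughly as below. This is only a sketch under assumptions: the helper names are made up for illustration, it assumes `ovs-vsctl` is on the PATH of the chassis, and it assumes the mappings live in `external-ids:ovn-bridge-mappings` as a comma-separated `physnet:bridge` list. Only the `ovn-cms-options` query is taken from the log itself.

```python
import subprocess


def _unquote(raw):
    """Strip whitespace and the double quotes ovs-vsctl prints around strings."""
    return raw.strip().strip('"')


def has_gateway_flag(cms_options_raw):
    """True if enable-chassis-as-gw is one of the comma-separated CMS options."""
    options = {opt.strip() for opt in _unquote(cms_options_raw).split(",")}
    return "enable-chassis-as-gw" in options


def parse_bridge_mappings(raw):
    """Turn 'physnet1:br-ex,physnet2:br-prov' into {physnet: bridge}."""
    mappings = {}
    for item in _unquote(raw).split(","):
        item = item.strip()
        if not item:
            continue
        physnet, _, bridge = item.partition(":")
        mappings[physnet.strip()] = bridge.strip()
    return mappings


def missing_bridges(mappings, existing_bridges):
    """Return (physnet, bridge) pairs whose bridge is not present on the host."""
    existing = set(existing_bridges)
    return [(net, br) for net, br in mappings.items() if br not in existing]


def _ovs_vsctl(*args):
    """Run ovs-vsctl and return stdout; raises if the command fails."""
    result = subprocess.run(
        ["ovs-vsctl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def check_host():
    """Hypothetical entry point: run on the chassis that should serve DHCP."""
    cms = _ovs_vsctl("get", "open", ".", "external-ids:ovn-cms-options")
    raw_map = _ovs_vsctl("get", "open", ".", "external-ids:ovn-bridge-mappings")
    bridges = _ovs_vsctl("list-br").split()
    return {
        "gateway_flag": has_gateway_flag(cms),
        "missing_bridges": missing_bridges(parse_bridge_mappings(raw_map), bridges),
    }
```

Calling `check_host()` on the controller would return a small dict; a non-empty `missing_bridges` list would correspond to the kind of mapping mistake found in the thread. The parsing helpers take plain strings, so they can be exercised without an OVS installation.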