Thursday, 2021-10-28

oklhostmorning05:17
jingvar\o/06:03
em__Good morning. Does anybody in here really run Xena (in production)? It seems fairly buggy (starting / stopping / deleting instances is partially broken, cloud-init seems to be broken). Should we rather go with Wallaby?06:05
opendevreviewMichal Nasiadka proposed openstack/kolla master: cinder-volume: Install binary python libs only in binary  https://review.opendev.org/c/openstack/kolla/+/81001606:12
mnasiadkaGood morning06:45
mnasiadkamgoddard, yoctozepto: I noticed we have one bug targeted against Xena - https://launchpad.net/kolla-ansible/+milestone/13.0.0 - do we want to fix it before RC2?06:45
yoctozeptomnasiadka: there are several others, just not targeted explicitly06:50
yoctozeptoif you can review, then please do (-:06:51
yoctozeptootherwise - pff06:51
mnasiadkaI think we should be better at targeting bugs, and register priorities with RFE tag, and target them - it would be good to get a nice overview...06:51
mnasiadkait's a bug and a feature in the same moment ;)06:53
mnasiadkalooks quite safe06:53
em__I'm not yet sure how overrides are used in kolla-ansible. I need to set global_physnet_mtu to fix our MTU problem. E.g. I found https://github.com/openstack/kolla-ansible/blob/master/tests/templates/neutron-server-overrides.j2 - is what I'm looking for https://docs.openstack.org/kolla-ansible/latest/admin/advanced-configuration.html or rather something else where I can change just one aspect (like that value)?06:54
*** amoralej|off is now known as amoralej07:06
opendevreviewMichal Nasiadka proposed openstack/kolla master: neutron: Use update-alternatives --display instead of --query  https://review.opendev.org/c/openstack/kolla/+/81578107:20
opendevreviewMerged openstack/kolla-ansible master: [mariadb] Drop some old workaround  https://review.opendev.org/c/openstack/kolla-ansible/+/81103507:20
opendevreviewUwe Grawert proposed openstack/kolla-ansible master: [Grafana] Add unified alerting and smtp options  https://review.opendev.org/c/openstack/kolla-ansible/+/81569407:48
em__looking at the neutron configuration ansible task i see that https://gist.github.com/EugenMayer/6e9723b9347c5305a4b3cf077b11aa9a are the locations that can be used for overrides. I have set node_custom_config: "/etc/kolla/config" in my globals.yml, thus i created https://gist.github.com/EugenMayer/064b20e8d67e9ed02e4dede9c1a69e3a . controller is the hostname of the target node. Redeploying does not apply the setting to 07:49
em__/etc/kolla/neutron-server/neutron.conf - what am i doing wrong?07:49
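The override question above comes down to layered INI merging: a small override file changes one key while the rest of the generated neutron.conf stays as it is. The following is only a minimal sketch of that idea using Python's standard configparser. It is not kolla-ansible's actual merge code, and the MTU values are hypothetical.

```python
import configparser


def merge_ini(*layers):
    """Merge INI texts in order; keys in later layers override earlier ones."""
    merged = configparser.ConfigParser()
    for text in layers:
        merged.read_string(text)
    return merged


# Stand-in for the generated neutron.conf (values are hypothetical).
base = """
[DEFAULT]
debug = False
global_physnet_mtu = 1500
"""

# Stand-in for an override file that changes only one option.
override = """
[DEFAULT]
global_physnet_mtu = 1450
"""

result = merge_ini(base, override)
print(result["DEFAULT"]["global_physnet_mtu"])
print(result["DEFAULT"]["debug"])
```

If the merged value does not show up in /etc/kolla/neutron-server/neutron.conf after a redeploy, the override file is most likely not at one of the locations the configuration task actually reads.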
opendevreviewMerged openstack/kayobe stable/wallaby: CI: Disable heat in upgrade jobs to save disk space  https://review.opendev.org/c/openstack/kayobe/+/81493207:58
opendevreviewPierre Riteau proposed openstack/kayobe stable/wallaby: CI: enable DNF tests on CentOS Stream 8  https://review.opendev.org/c/openstack/kayobe/+/81475208:07
opendevreviewMark Goddard proposed openstack/kayobe master: infra VMs: use wait_for rather than wait_for_connection  https://review.opendev.org/c/openstack/kayobe/+/81321208:53
jingvar {{ ( enable_elasticsearch | bool or ( elasticsearch_address != kolla_internal_vip_address )) and   not enable_monasca | bool }}09:06
jingvarwhat will this evaluate to with FQDN enabled? ^^^09:06
jingvarkolla-ansible/ansible/group_vars/all.yml:elasticsearch_address: "{{ kolla_internal_fqdn }}" 09:07
jingvarof course kolla_internal_fqdn != kolla_internal_vip_address - is it a bug or a feature :)09:08
mgoddardjingvar: bug09:13
mgoddardjingvar: fixed: https://review.opendev.org/q/If23a6b1273c2639d1296becc9d222546d52f63ac09:15
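The condition jingvar quoted can be checked by hand. Below is a minimal sketch that mirrors the Jinja expression in plain Python; the function name and the address values are hypothetical and only serve to show why the inequality is always true once an FQDN is configured.

```python
def elasticsearch_condition(enable_elasticsearch, elasticsearch_address,
                            kolla_internal_vip_address, enable_monasca):
    """Plain-Python mirror of the Jinja expression quoted in the channel."""
    return (enable_elasticsearch
            or elasticsearch_address != kolla_internal_vip_address
            ) and not enable_monasca


# Hypothetical values for illustration only.
vip = "10.0.0.250"
fqdn = "internal.example.org"

# Address equals the VIP: the enable_elasticsearch flag alone decides.
without_fqdn = elasticsearch_condition(False, vip, vip, False)

# Address is the FQDN: the inequality holds even with Elasticsearch disabled.
with_fqdn = elasticsearch_condition(False, fqdn, vip, False)

print(without_fqdn, with_fqdn)
```

The second case is the behaviour mgoddard confirmed as a bug: the address comparison was presumably meant to detect an external Elasticsearch, but an FQDN satisfies it as well.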
opendevreviewMerged openstack/kolla stable/wallaby: cinder-volume/ubuntu: add lsscsi and nvme  https://review.opendev.org/c/openstack/kolla/+/81544109:29
opendevreviewMerged openstack/kayobe stable/xena: Drop become in stackhpc.libvirt-vm for seed vm provision  https://review.opendev.org/c/openstack/kayobe/+/81564209:46
opendevreviewMerged openstack/kayobe stable/xena: CI: Disable heat in upgrade jobs to save disk space  https://review.opendev.org/c/openstack/kayobe/+/81563609:46
opendevreviewMerged openstack/kayobe stable/xena: Fix link syntax in release note  https://review.opendev.org/c/openstack/kayobe/+/81544909:46
opendevreviewwu.chunyang proposed openstack/kolla-ansible master: Fix wrong opts in cyborg.conf  https://review.opendev.org/c/openstack/kolla-ansible/+/81567209:59
jingvarmgoddard:  the old env:)10:08
opendevreviewMarcin Juszkiewicz proposed openstack/kolla master: base: drop commented repositories  https://review.opendev.org/c/openstack/kolla/+/81582410:58
hrwcleanup10:58
hrwmnasiadka: https://review.opendev.org/c/openstack/kolla/+/805640 - can you take your RP-1?11:02
mnasiadkaDone, but I'm still not convinced I like the ARG approach.11:04
mnasiadkaI've had this in the past - maybe we want to combine both patches: https://review.opendev.org/c/openstack/kolla/+/79283311:05
hrwprometheus exporters... EXPORTERNAME_exporter_*, prometheus_EXPORTERNAME_*11:09
hrwand prometheus_EXPORTERNAME_exporter_* :D11:10
mnasiadkawell, there was Mark's PoC patch that we could use for Prometheus exporters deployment and remove those from Kolla, that provide their own container image in docker hub/quay11:12
opendevreviewMerged openstack/kolla-ansible master: Fix broken deploy of placement service  https://review.opendev.org/c/openstack/kolla-ansible/+/81552411:12
em__Is anybody here using OVN (Xena) and able to properly access the OVN metadata service, so that cloud-init instances like cirros can retrieve their configuration?11:13
opendevreviewMerged openstack/kolla-ansible master: docs: Parameterize kolla-ansible version and branch  https://review.opendev.org/c/openstack/kolla-ansible/+/81504311:17
mnasiadkaem__: what's wrong with ovn-metadata-agent? any errors in the logs?11:35
em__mnasiadka, to be honest, we do not know. All we know so far is that it cannot be contacted and that we have no ovnmeta- namespace after the deployment, so all instances fail like this cirros one: https://gist.github.com/EugenMayer/42a0f13ccf5f18076c4e2d84655bda66 - if you can tell me which logs to dig into, I will happily do so11:43
em__we can reproduce this issue with the switch to OVN, in the lab and in our datacenter (so we also know this is not related to any MTU issues)11:44
em__this is our globals.yml: https://github.com/EugenMayer/openstack-lab/blob/master/deploy/3_configure_kolla.sh#L1611:45
em__this is our inventory (https://github.com/EugenMayer/openstack-lab/tree/master/config)11:45
em__sorry wrong branch, both links are wrong11:45
mnasiadkawell, ovn-metadata-agent needs to run on the compute your VM is running on, and there should be some errors in the logs of that service if it fails to create a netns and spawn haproxy there11:45
em__inventory : https://github.com/EugenMayer/openstack-lab/tree/stable/ovn/config11:45
em__globals.yml: https://github.com/EugenMayer/openstack-lab/blob/stable/ovn/deploy/3_configure_kolla.sh#L1611:46
em__ovn-metadata-agent  is running on the computes11:46
mnasiadkaso then there must be some issue with its operation, best is to check its logs in /var/log/kolla/neutron11:48
em__wait, ovn-metadata-agent is not running on the computes, i was confusing it with11:48
em__neutron_ovn_metadata_agent11:48
opendevreviewMerged openstack/kayobe stable/wallaby: CI: enable DNF tests on CentOS Stream 8  https://review.opendev.org/c/openstack/kayobe/+/81475211:48
mnasiadkawell, that's the one11:49
mnasiadkaI just mixed up its name ;)11:49
em__hehe11:49
em__so which logs to check?11:50
em__there is no haproxy on the compute though (only on the controller)11:50
em__those are the logs https://gist.github.com/EugenMayer/bbda954d207fc03b1fd9ea0bb6d143cf11:51
em__10.0.0.3:6642: connection attempt failed11:51
em__10.0.0.3 is the controller11:52
em__what should run on 6642?11:52
mnasiadkaovn sb db11:52
mnasiadkayou might need to enable debug to understand what's going wrong11:53
em__telnet 10.0.0.3 6642 works from the compute instance11:53
em__the question is, what network is that agent in?11:53
mnasiadkaand also check if there are network namespaces created for ovn metadata (ip netns)11:53
em__there are none, ip netns is empty11:54
em__ovn sb db runs on the controller, so that is right. While it fails to connect, telnet works11:54
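The telnet test above only proves that the port accepts TCP connections. A minimal sketch of the same check in Python follows; the function name is hypothetical. A successful result says nothing about whether the OVSDB session on top of the connection works, which is where mnasiadka suspects the problem.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call from a compute node (address and port taken from the log):
# tcp_port_open("10.0.0.3", 6642)
```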
mnasiadkaI think it's connecting, don't see the error with connection on the bottom11:55
mnasiadkabut getting ovsdbSbOvnIdl seems to have some issue11:55
em__for me - this is a pure OVN non-DVR setup - no strings attached. So I would have expected the core infra to work OOTB, and I assume that the metadata service is one of those core services11:57
em__mnasiadka, any hints on how to tackle this? This is the core issue why nothing cloud-init related works11:58
opendevreviewMichal Nasiadka proposed openstack/kayobe master: Build overcloud host image directly with DIB  https://review.opendev.org/c/openstack/kayobe/+/77260911:59
mnasiadkaem__: enable debug (debug=True under [DEFAULT] in ovn_metadata_agent.ini) and check logs again12:03
em__will do12:09
em__mnasiadka, you mean neutron_ovn_metadata_agent.ini ?12:09
mnasiadkayes, sorry12:09
em__docker restart neutron_ovn_metadata_agent12:11
em__should be enough, right?12:11
mnasiadkayup, if you edited the templated config in /etc/kolla on the host12:15
em__sure i did12:19
em__mnasiadka, https://gist.github.com/EugenMayer/e2c5796f7c547c224a3aaccad2c3525712:20
em__does this help?12:20
*** amoralej is now known as amoralej|lunch12:28
em__mnasiadka, i think this is a kolla deployment /architecture bug12:58
em__AFAICS the metadata container tries to connect to the local OVN controller via 127.0.0.1 - which obviously won't work for Kolla container-based deployments, since the ovn-controller runs in an isolated container: https://gist.github.com/EugenMayer/b6611b9725a7697d0a392c1b3c1a568312:59
mnasiadkanothing is isolated network wise in kolla, everything uses host networking, so the connection will work.13:02
em__but that won't work for 127?13:03
em__mnasiadka, https://gist.github.com/EugenMayer/fda894a3c6bb2bde5b9d1756b234cd7913:04
mnasiadkaWhy?13:04
em__mnasiadka, isn't ovn just connecting to the wrong ip?13:04
em__ovs13:04
mnasiadkait's connecting to the local ovsdb port13:05
mnasiadkaas you can see even in lsof - it's connected13:05
em__I'm not using --network host in Docker very often, since I avoid it as much as possible. But usually 127.0.0.1 will not resolve to the host's local socket, but to the container's local socket. Am I wrong on this with --network host?13:05
mnasiadkaand the log snippet you posted is not showing anything extraordinary13:05
em__well mnasiadka at the same time https://gist.github.com/EugenMayer/9b3e4b871e33b74673daae22c9845d6c this still fails13:06
mnasiadkaone moment13:07
mnasiadkayou have both neutron-metadata-agent and neutron-ovn-metadata-agent on the same host?13:07
mnasiadkaah no, misread13:07
mnasiadkaovn-controller, uhh13:07
em__on compute I have an ovn-controller and neutron-ovn-metadata-agent13:08
mnasiadkaanyway, need more of those debug logs - from startup (with dumping of all variables, etc) to anything that makes sense13:08
em__no neutron-metadata-agent though13:08
mnasiadkaand check if you have any content in /var/lib/neutron in the neutron_ovn_metadata_agent image13:08
em__mnasiadka, https://gist.github.com/EugenMayer/ee25916307127f3b31c1a2eb58794478 - anything particular you are looking for?13:13
em__mnasiadka, if you tell me what debug logs or variables you need, I'm happy to dig those out anytime13:16
em__mnasiadka, I'm not sure OVN+Xena is run by anybody right now, since this seems to be the 'most virgin' installation of that I can think of13:16
mnasiadkaI'm referring to this: Unable to access /var/lib/neutron/external/pids/e8b1b5b2-8327-42bd-b654-04f2c1346d2a.pid.haproxy13:16
mnasiadkaWell, we are running OVN+Victoria and OVN+Wallaby, can't think what could be broken in Xena13:17
em__didn't they rename things in Xena relating to the controllers?13:17
em__maybe there is something mixed up?13:17
mnasiadkaSo check if you have /var/lib/neutron/external/pids directory13:17
em__the socket /var/lib/neutron/external/pids/e8b1b5b2-8327-42bd-b654-04f2c1346d2a.pid.haproxy exists, yes13:17
em__(on the meta-data container)13:18
em__mnasiadka, is there any OVN+Xena test setup you would trust more than https://github.com/EugenMayer/openstack-lab/tree/stable/ovn to verify that it is working in general and that it is a local issue with my configuration?13:19
mnasiadkaso the error is not an error per se - and haproxy should be running inside the container - can you double check you have haproxy processes running in the container?13:19
opendevreviewPierre Riteau proposed openstack/kayobe stable/xena: Fix --check argument for overcloud host configure  https://review.opendev.org/c/openstack/kayobe/+/81584013:20
opendevreviewPierre Riteau proposed openstack/kayobe stable/xena: infra VMs: use wait_for rather than wait_for_connection  https://review.opendev.org/c/openstack/kayobe/+/81584113:20
*** amoralej|lunch is now known as amoralej13:20
em__ps aux | grep haproxy13:21
em__neutron       51  0.0  0.0 2301140 12672 ?       Ssl  15:12   0:00 haproxy -f /var/lib/neutron/ovn-metadata-proxy/e8b1b5b2-8327-42bd-b654-04f2c1346d2a.conf13:21
em__i would say - yes it does13:21
em__inside "neutron-ovn-metadata-agent", to ensure there is no confusion about where13:21
mnasiadkaok then, so the problem might be somewhere else, but not really in a position to spend more time on this today13:22
mnasiadkajust show output of "ip netns ls"13:22
em__running on compute, right?13:24
em__❯ hostname13:25
em__compute113:25
em__❯ ip netns ls13:25
em__ovnmeta-e8b1b5b2-8327-42bd-b654-04f2c1346d2a (id: 0)13:25
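The namespace check above can also be scripted. The sketch below is a hypothetical helper that pulls the ovnmeta- namespaces out of `ip netns ls` output, using the line pasted in the channel as sample input; it only parses text and does not call `ip` itself.

```python
def ovnmeta_namespaces(ip_netns_output):
    """Return the names of all ovnmeta- namespaces in `ip netns ls` output."""
    names = []
    for line in ip_netns_output.splitlines():
        fields = line.split()
        # The namespace name is the first field; "(id: N)" may follow it.
        if fields and fields[0].startswith("ovnmeta-"):
            names.append(fields[0])
    return names


# Sample taken from the output pasted above.
sample = "ovnmeta-e8b1b5b2-8327-42bd-b654-04f2c1346d2a (id: 0)\n"
print(ovnmeta_namespaces(sample))
```

An empty list on a compute node that hosts an instance would correspond to the earlier state in this log, where `ip netns` printed nothing.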
em__mnasiadka, I have now proven that this very same setup works with neutron_plugin_agent: "openvswitch". So it is 100% related to OVN13:28
mnasiadkait's 100% related to neutron-ovn-metadata-agent ;)13:28
mnasiadkait seems the network namespace is created, so I have no clue how it works13:28
mnasiadkayou would need to follow the virtual links in this namespace and openvswitch13:29
mnasiadkaand most probably raise a bug in Neutron's Launchpad13:29
em__the problem with that right now is that I'm entirely new to the entire setup and have never had a running OVN setup, so debugging like that will prove to be very, very hard13:30
em__so basically https://github.com/EugenMayer/openstack-lab/tree/stable/ovn does not work while https://github.com/EugenMayer/openstack-lab/tree/stable/ovs does work - the only changes are https://github.com/EugenMayer/openstack-lab/commit/9784b599989c4b44c0ab89791473437d2da33604#diff-934224e66ed40046e86ff7ac9f4c38242d5364e6b4a50be21b710df5e26b879bR3813:32
mnasiadkayou are comparing two very different network solutions13:37
mnasiadkathe fact that Kolla-Ansible makes it easy to "switch" is very nice, but still it's like comparing Juniper and Cisco ;)13:37
mnasiadkaAs said, the configuration is correct, the way that Kolla deploys it is correct (and worked before Xena), so raise a bug in Neutron and see what they say.13:38
mnasiadkaespecially that ovn metadata agent created a netns, spawned haproxy and did all that it should - so the problem needs to be somewhere in connectivity between your vm and the haproxy from ovn metadata agent13:39
em__mnasiadka, ok surely will try13:47
em__mnasiadka, thank you13:51
mnasiadkanp14:05
em__I will now try the entire setup on Wallaby too, mnasiadka - I think it will be helpful for the bug report too, if this works out. AFAICS the inventory did not change a lot, but the haproxy/loadbalancer group has been renamed. The diff looks fairly small https://github.com/EugenMayer/openstack-lab/commit/074e4e690a197c06033c8ba598693144a98450ce?diff=unified14:11
mnasiadkayes, but that's the api haproxy - it doesn't have any effect on neutron_ovn_metadata_agent14:12
*** ricolin_ is now known as ricolin14:40
em__mnasiadka, as I kind of suspected, yes, it works with Wallaby, so it is a Xena issue14:52
opendevreviewMerged openstack/kolla-ansible master: Fix missing Ansible version in the error message  https://review.opendev.org/c/openstack/kolla-ansible/+/81573514:53
em__mnasiadka, which bug tracker should I use exactly?14:54
em__mnasiadka, is there any way to tell the neutron guys which version of neutron is used for the xena ubuntu images?15:01
opendevreviewMerged openstack/kayobe master: infra VMs: use wait_for rather than wait_for_connection  https://review.opendev.org/c/openstack/kayobe/+/81321215:27
opendevreviewMerged openstack/kayobe master: Fix --check argument for overcloud host configure  https://review.opendev.org/c/openstack/kayobe/+/80000615:27
*** amoralej is now known as amoralej|off15:52
opendevreviewJames Kirsch proposed openstack/kolla-ansible master: Use system scoped tokens with Keystone  https://review.opendev.org/c/openstack/kolla-ansible/+/81557716:09
em__mnasiadka, if you have a second, it would be useful to know which Xena neutron images are currently packaged in Kolla. If you like, just comment here https://bugs.launchpad.net/neutron/+bug/1949097 - thank you a lot16:11
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/xena: Fix missing Ansible version in the error message  https://review.opendev.org/c/openstack/kolla-ansible/+/81580516:15
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/wallaby: Fix missing Ansible version in the error message  https://review.opendev.org/c/openstack/kolla-ansible/+/81580616:15
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/victoria: Fix missing Ansible version in the error message  https://review.opendev.org/c/openstack/kolla-ansible/+/81580716:15
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/ussuri: Fix missing Ansible version in the error message  https://review.opendev.org/c/openstack/kolla-ansible/+/81580916:16
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible master: mariadb: use add_host to include inactive hosts in shard grouping  https://review.opendev.org/c/openstack/kolla-ansible/+/81427616:29
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/xena: Fix broken deploy of placement service  https://review.opendev.org/c/openstack/kolla-ansible/+/81587516:46
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/wallaby: Fix broken deploy of placement service  https://review.opendev.org/c/openstack/kolla-ansible/+/81587616:46
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/victoria: Fix broken deploy of placement service  https://review.opendev.org/c/openstack/kolla-ansible/+/81587716:46
opendevreviewRadosław Piliszek proposed openstack/kolla-ansible stable/ussuri: Fix broken deploy of placement service  https://review.opendev.org/c/openstack/kolla-ansible/+/81587816:46
opendevreviewMerged openstack/kayobe stable/xena: Fix --check argument for overcloud host configure  https://review.opendev.org/c/openstack/kayobe/+/81584017:37
opendevreviewMerged openstack/kayobe stable/xena: infra VMs: use wait_for rather than wait_for_connection  https://review.opendev.org/c/openstack/kayobe/+/81584117:37
opendevreviewWill Szumski proposed openstack/kayobe master: Fix overcloud database backup  https://review.opendev.org/c/openstack/kayobe/+/81589518:04
-opendevstatus- NOTICE: mirror.bhs1.ovh.opendev.org filled its disk around 17:25 UTC. We have corrected this issue around 18:25 UTC and jobs that failed due to this mirror can be rechecked.18:44
