oklhost | morning | 05:17 |
---|---|---|
jingvar | \o/ | 06:03 |
em__ | Good morning. Does anybody in here really run Xena (in production)? It seems fairly buggy (starting / stopping / deleting instances is partially broken, cloud-init seems to be broken). Should we rather go with Wallaby? | 06:05 |
opendevreview | Michal Nasiadka proposed openstack/kolla master: cinder-volume: Install binary python libs only in binary https://review.opendev.org/c/openstack/kolla/+/810016 | 06:12 |
mnasiadka | Good morning | 06:45 |
mnasiadka | mgoddard, yoctozepto: I noticed we have one bug targeted against Xena - https://launchpad.net/kolla-ansible/+milestone/13.0.0 - do we want to fix it before RC2? | 06:45 |
yoctozepto | mnasiadka: there are several others, just not targeted explicitly | 06:50 |
yoctozepto | if you can review, then please do (-: | 06:51 |
yoctozepto | otherwise - pff | 06:51 |
mnasiadka | I think we should be better at targeting bugs, and register priorities with RFE tag, and target them - it would be good to get a nice overview... | 06:51 |
mnasiadka | it's a bug and a feature at the same time ;) | 06:53 |
mnasiadka | looks quite safe | 06:53 |
em__ | I'm not yet sure how overrides are used in kolla-ansible. I need to set global_physnet_mtu to fix our MTU problem. I found e.g. https://github.com/openstack/kolla-ansible/blob/master/tests/templates/neutron-server-overrides.j2 - is what I'm looking for https://docs.openstack.org/kolla-ansible/latest/admin/advanced-configuration.html or rather something else where I can change just one aspect (like that value)? | 06:54 |
*** amoralej|off is now known as amoralej | 07:06 | |
opendevreview | Michal Nasiadka proposed openstack/kolla master: neutron: Use update-alternatives --display instead of --query https://review.opendev.org/c/openstack/kolla/+/815781 | 07:20 |
opendevreview | Merged openstack/kolla-ansible master: [mariadb] Drop some old workaround https://review.opendev.org/c/openstack/kolla-ansible/+/811035 | 07:20 |
opendevreview | Uwe Grawert proposed openstack/kolla-ansible master: [Grafana] Add unified alerting and smtp options https://review.opendev.org/c/openstack/kolla-ansible/+/815694 | 07:48 |
em__ | looking at the neutron configuration ansible task i see that https://gist.github.com/EugenMayer/6e9723b9347c5305a4b3cf077b11aa9a are the locations that can be used for overrides. I have set node_custom_config: "/etc/kolla/config" in my globals.yml, thus i created https://gist.github.com/EugenMayer/064b20e8d67e9ed02e4dede9c1a69e3a . controller is the hostname of the target node. Redeploying does not apply the setting to | 07:49 |
em__ | /etc/kolla/neutron-server/neutron.conf - what am i doing wrong? | 07:49 |
opendevreview | Merged openstack/kayobe stable/wallaby: CI: Disable heat in upgrade jobs to save disk space https://review.opendev.org/c/openstack/kayobe/+/814932 | 07:58 |
opendevreview | Pierre Riteau proposed openstack/kayobe stable/wallaby: CI: enable DNF tests on CentOS Stream 8 https://review.opendev.org/c/openstack/kayobe/+/814752 | 08:07 |
opendevreview | Mark Goddard proposed openstack/kayobe master: infra VMs: use wait_for rather than wait_for_connection https://review.opendev.org/c/openstack/kayobe/+/813212 | 08:53 |
jingvar | `{{ ( enable_elasticsearch \| bool or ( elasticsearch_address != kolla_internal_vip_address )) and not enable_monasca \| bool }}` | 09:06 |
jingvar | what will be there with enabled fqdn ^^^ | 09:06 |
jingvar | kolla-ansible/ansible/group_vars/all.yml:elasticsearch_address: "{{ kolla_internal_fqdn }}" | 09:07 |
jingvar | of course kolla_internal_fqdn != kolla_internal_vip_address - is it a bug or a feature :) | 09:08 |
mgoddard | jingvar: bug | 09:13 |
mgoddard | jingvar: fixed: https://review.opendev.org/q/If23a6b1273c2639d1296becc9d222546d52f63ac | 09:15 |
opendevreview | Merged openstack/kolla stable/wallaby: cinder-volume/ubuntu: add lsscsi and nvme https://review.opendev.org/c/openstack/kolla/+/815441 | 09:29 |
opendevreview | Merged openstack/kayobe stable/xena: Drop become in stackhpc.libvirt-vm for seed vm provision https://review.opendev.org/c/openstack/kayobe/+/815642 | 09:46 |
opendevreview | Merged openstack/kayobe stable/xena: CI: Disable heat in upgrade jobs to save disk space https://review.opendev.org/c/openstack/kayobe/+/815636 | 09:46 |
opendevreview | Merged openstack/kayobe stable/xena: Fix link syntax in release note https://review.opendev.org/c/openstack/kayobe/+/815449 | 09:46 |
opendevreview | wu.chunyang proposed openstack/kolla-ansible master: Fix wrong opts in cyborg.conf https://review.opendev.org/c/openstack/kolla-ansible/+/815672 | 09:59 |
jingvar | mgoddard: the old env:) | 10:08 |
opendevreview | Marcin Juszkiewicz proposed openstack/kolla master: base: drop commented repositories https://review.opendev.org/c/openstack/kolla/+/815824 | 10:58 |
hrw | cleanup | 10:58 |
hrw | mnasiadka: https://review.opendev.org/c/openstack/kolla/+/805640 - can you take your RP-1? | 11:02 |
mnasiadka | Done, but I'm still not convinced I like the ARG approach. | 11:04 |
mnasiadka | I've had this in the past - maybe we want to combine both patches: https://review.opendev.org/c/openstack/kolla/+/792833 | 11:05 |
hrw | prometheus exporters... EXPORTERNAME_exporter_*, prometheus_EXPORTERNAME_* | 11:09 |
hrw | and prometheus_EXPORTERNAME_exporter_* :D | 11:10 |
mnasiadka | well, there was Mark's PoC patch that we could use for Prometheus exporters deployment and remove those from Kolla, that provide their own container image in docker hub/quay | 11:12 |
opendevreview | Merged openstack/kolla-ansible master: Fix broken deploy of placement service https://review.opendev.org/c/openstack/kolla-ansible/+/815524 | 11:12 |
em__ | Is anybody here using OVN (Xena) and able to properly access the OVN metadata service, so that cloud-init instances like cirros can retrieve their configuration? | 11:13 |
opendevreview | Merged openstack/kolla-ansible master: docs: Parameterize kolla-ansible version and branch https://review.opendev.org/c/openstack/kolla-ansible/+/815043 | 11:17 |
mnasiadka | em__: what's wrong with ovn-metadata-agent? any errors in the logs? | 11:35 |
em__ | mnasiadka, to be honest, we do not know. All we know so far is that it cannot be contacted and we have no ovnmeta- namespace after the deployment, thus all instances fail like this cirros one: https://gist.github.com/EugenMayer/42a0f13ccf5f18076c4e2d84655bda66 - if you can tell me which logs to dig into, I'll happily do so | 11:43 |
em__ | we can reproduce this issue with the switch to OVN, in the lab and in our datacenter (so we also know this is not related to any MTU issues) | 11:44 |
em__ | this is our globals.yml: https://github.com/EugenMayer/openstack-lab/blob/master/deploy/3_configure_kolla.sh#L16 | 11:45 |
em__ | this is our inventory (https://github.com/EugenMayer/openstack-lab/tree/master/config) | 11:45 |
em__ | sorry wrong branch, both links are wrong | 11:45 |
mnasiadka | well, ovn-metadata-agent needs to run on the compute your VM is running on, and there should be some errors in the logs of that service if it fails to create a netns and spawn haproxy there | 11:45 |
em__ | inventory : https://github.com/EugenMayer/openstack-lab/tree/stable/ovn/config | 11:45 |
em__ | globals.yml: https://github.com/EugenMayer/openstack-lab/blob/stable/ovn/deploy/3_configure_kolla.sh#L16 | 11:46 |
em__ | ovn-metadata-agent is running on the computes | 11:46 |
mnasiadka | so then there must be some issue with its operation, best is to check its logs in /var/log/kolla/neutron | 11:48 |
em__ | wait, ovn-metadata-agent is not running on the computes, i was confusing it with | 11:48 |
em__ | neutron_ovn_metadata_agent | 11:48 |
opendevreview | Merged openstack/kayobe stable/wallaby: CI: enable DNF tests on CentOS Stream 8 https://review.opendev.org/c/openstack/kayobe/+/814752 | 11:48 |
mnasiadka | well, that's the one | 11:49 |
mnasiadka | I just mixed up its name ;) | 11:49 |
em__ | hehe | 11:49 |
em__ | so which logs to check? | 11:50 |
em__ | there is no haproxy on the compute though (only on the controller) | 11:50 |
em__ | those are the logs https://gist.github.com/EugenMayer/bbda954d207fc03b1fd9ea0bb6d143cf | 11:51 |
em__ | 10.0.0.3:6642: connection attempt failed | 11:51 |
em__ | 10.0.0.3 is the controller | 11:52 |
em__ | what should run on 6642? | 11:52 |
mnasiadka | ovn sb db | 11:52 |
mnasiadka | you might need to enable debug to understand what's going wrong | 11:53 |
em__ | telnet 10.0.0.3 6642 works from the compute instance | 11:53 |
em__ | the question is, what network is that agent in? | 11:53 |
mnasiadka | and also check if there are network namespaces created for ovn metadata (ip netns) | 11:53 |
em__ | there are none, ip netns is empty | 11:54 |
em__ | ovn sb db runs on the controller, so that is right. While it fails to connect, telnet works | 11:54 |
mnasiadka | I think it's connecting, don't see the error with connection on the bottom | 11:55 |
mnasiadka | but getting ovsdbSbOvnIdl seems to have some issue | 11:55 |
em__ | for me - this is a pure OVN non-DVR setup - no strings attached. So I would have expected that the core infra works OOTB, and I assume that the metadata service is one of those core services | 11:57 |
em__ | mnasiadka, any hints on how to tackle this? This is the core issue why nothing cloud-init related works | 11:58 |
opendevreview | Michal Nasiadka proposed openstack/kayobe master: Build overcloud host image directly with DIB https://review.opendev.org/c/openstack/kayobe/+/772609 | 11:59 |
mnasiadka | em__: enable debug (debug=True under [DEFAULT] in ovn_metadata_agent.ini) and check logs again | 12:03 |
em__ | will do | 12:09 |
em__ | mnasiadka, you mean neutron_ovn_metadata_agent.ini ? | 12:09 |
mnasiadka | yes, sorry | 12:09 |
em__ | docker restart neutron_ovn_metadata_agent | 12:11 |
em__ | should be enough, right? | 12:11 |
mnasiadka | yup, if you edited the templated config in /etc/kolla on the host | 12:15 |
em__ | sure i did | 12:19 |
em__ | mnasiadka, https://gist.github.com/EugenMayer/e2c5796f7c547c224a3aaccad2c35257 | 12:20 |
em__ | does this help? | 12:20 |
*** amoralej is now known as amoralej|lunch | 12:28 | |
em__ | mnasiadka, I think this is a kolla deployment/architecture bug | 12:58 |
em__ | AFAICS the metadata container tries to connect to the local OVN controller via 127.0.0.1 - which obviously won't work for kolla container-based deployments, since the ovn-controller runs in an isolated container: https://gist.github.com/EugenMayer/b6611b9725a7697d0a392c1b3c1a5683 | 12:59 |
mnasiadka | nothing is isolated network wise in kolla, everything uses host networking, so the connection will work. | 13:02 |
em__ | but that won't work for 127? | 13:03 |
em__ | mnasiadka, https://gist.github.com/EugenMayer/fda894a3c6bb2bde5b9d1756b234cd79 | 13:04 |
mnasiadka | Why? | 13:04 |
em__ | mnasiadka, isn't ovn just connecting to the wrong ip? | 13:04 |
em__ | ovs | 13:04 |
mnasiadka | it's connecting to the local ovsdb port | 13:05 |
mnasiadka | as you can see even in lsof - it's connected | 13:05 |
em__ | I'm not using --network host in docker very often, since I avoid it as much as possible. But usually 127.0.0.1 will not resolve to the host's local socket, but to the container's local socket. Am I wrong on this with --network host? | 13:05 |
mnasiadka | and the log snippet you posted is not showing anything extraordinary | 13:05 |
em__ | well mnasiadka at the same time https://gist.github.com/EugenMayer/9b3e4b871e33b74673daae22c9845d6c this still fails | 13:06 |
mnasiadka | one moment | 13:07 |
mnasiadka | you have both neutron-metadata-agent and neutron-ovn-metadata-agent on the same host? | 13:07 |
mnasiadka | ah no, misread | 13:07 |
mnasiadka | ovn-controller, uhh | 13:07 |
em__ | on compute I have an ovn-controller and neutron-ovn-metadata-agent | 13:08 |
mnasiadka | anyway, need more of those debug logs - from startup (with dumping of all variables, etc) to anything that makes sense | 13:08 |
em__ | no neutron-metadata-agent though | 13:08 |
mnasiadka | and check if you have any content in /var/lib/neutron in the neutron_ovn_metadata_agent image | 13:08 |
em__ | mnasiadka, https://gist.github.com/EugenMayer/ee25916307127f3b31c1a2eb58794478 - anything particular you are looking for? | 13:13 |
em__ | mnasiadka, if you tell me what debug logs or variables you need, I'm happy to dig those out anytime | 13:16 |
em__ | mnasiadka, I'm not sure OVN+Xena is run by anybody right now, since this seems to be the 'most virgin' installation of that I can think of | 13:16 |
mnasiadka | I'm referring to this: Unable to access /var/lib/neutron/external/pids/e8b1b5b2-8327-42bd-b654-04f2c1346d2a.pid.haproxy | 13:16 |
mnasiadka | Well, we are running OVN+Victoria and OVN+Wallaby, can't think what could be broken in Xena | 13:17 |
em__ | didn't they rename things in Xena relating to the controllers? | 13:17 |
em__ | maybe there is something mixed up? | 13:17 |
mnasiadka | So check if you have /var/lib/neutron/external/pids directory | 13:17 |
em__ | the socket /var/lib/neutron/external/pids/e8b1b5b2-8327-42bd-b654-04f2c1346d2a.pid.haproxy exists, yes | 13:17 |
em__ | (on the meta-data container) | 13:18 |
em__ | mnasiadka, is there any OVN+Xena test setup you would trust more than https://github.com/EugenMayer/openstack-lab/tree/stable/ovn to verify it is working in general and it is a local issue with my configuration? | 13:19 |
mnasiadka | so the error is not an error per se - and haproxy should be running inside the container - can you double check you have haproxy processes running in the container? | 13:19 |
opendevreview | Pierre Riteau proposed openstack/kayobe stable/xena: Fix --check argument for overcloud host configure https://review.opendev.org/c/openstack/kayobe/+/815840 | 13:20 |
opendevreview | Pierre Riteau proposed openstack/kayobe stable/xena: infra VMs: use wait_for rather than wait_for_connection https://review.opendev.org/c/openstack/kayobe/+/815841 | 13:20 |
*** amoralej|lunch is now known as amoralej | 13:20 | |
em__ | `ps aux \| grep haproxy` | 13:21 |
em__ | neutron 51 0.0 0.0 2301140 12672 ? Ssl 15:12 0:00 haproxy -f /var/lib/neutron/ovn-metadata-proxy/e8b1b5b2-8327-42bd-b654-04f2c1346d2a.conf | 13:21 |
em__ | i would say - yes it does | 13:21 |
em__ | inside "neutron-ovn-metadata-agent", to ensure there is no confusion about where | 13:21 |
mnasiadka | ok then, so the problem might be somewhere else, but not really in a position to spend more time on this today | 13:22 |
mnasiadka | just show output of "ip netns ls" | 13:22 |
em__ | running on compute, right? | 13:24 |
em__ | ❯ hostname | 13:25 |
em__ | compute1 | 13:25 |
em__ | ❯ ip netns ls | 13:25 |
em__ | ovnmeta-e8b1b5b2-8327-42bd-b654-04f2c1346d2a (id: 0) | 13:25 |
em__ | mnasiadka, I have now proven that this very same setup works entirely with neutron_plugin_agent: "openvswitch". So it is 100% related to OVN | 13:28 |
mnasiadka | it's 100% related to neutron-ovn-metadata-agent ;) | 13:28 |
mnasiadka | it seems the network namespace is created, so I have no clue how it works | 13:28 |
mnasiadka | you would need to follow the virtual links in this namespace and openvswitch | 13:29 |
mnasiadka | and most probably raise a bug in Neutron's Launchpad | 13:29 |
em__ | the problem with that right now is, I'm entirely new to the entire setup and never had a running OVN setup, so debugging like that will prove to be very, very hard | 13:30 |
em__ | so basically https://github.com/EugenMayer/openstack-lab/tree/stable/ovn does not work while https://github.com/EugenMayer/openstack-lab/tree/stable/ovs does work - the only changes are https://github.com/EugenMayer/openstack-lab/commit/9784b599989c4b44c0ab89791473437d2da33604#diff-934224e66ed40046e86ff7ac9f4c38242d5364e6b4a50be21b710df5e26b879bR38 | 13:32 |
mnasiadka | you are comparing two very different network solutions | 13:37 |
mnasiadka | the fact that Kolla-Ansible makes it easy to "switch" is very nice, but still it's like comparing Juniper and Cisco ;) | 13:37 |
mnasiadka | As said, configuration is correct, the way that Kolla deploys it is correct (and was working before Xena), so raise a bug in Neutron and see what they say. | 13:38 |
mnasiadka | especially that ovn metadata agent created a netns, spawned haproxy and did all that it should - so the problem needs to be somewhere in connectivity between your vm and the haproxy from ovn metadata agent | 13:39 |
em__ | mnasiadka, ok surely will try | 13:47 |
em__ | mnasiadka, thank you | 13:51 |
mnasiadka | np | 14:05 |
em__ | I will now try the entire setup on wallaby too mnasiadka - I think it will be helpful for the bug report too, if this works out. AFAIK the inventory did not change a lot, but loadbalancer/haproxy has been renamed. The diff looks fairly small https://github.com/EugenMayer/openstack-lab/commit/074e4e690a197c06033c8ba598693144a98450ce?diff=unified | 14:11 |
mnasiadka | yes, but that's the api haproxy - it doesn't have any effect on neutron_ovn_metadata_agent | 14:12 |
*** ricolin_ is now known as ricolin | 14:40 | |
em__ | mnasiadka, as I kind of suspected, yes it works with wallaby, so it is a Xena issue | 14:52 |
opendevreview | Merged openstack/kolla-ansible master: Fix missing Ansible version in the error message https://review.opendev.org/c/openstack/kolla-ansible/+/815735 | 14:53 |
em__ | mnasiadka, which bug tracker should I use exactly? | 14:54 |
em__ | mnasiadka, is there any way to tell the neutron guys which version of neutron is used for the xena ubuntu images? | 15:01 |
opendevreview | Merged openstack/kayobe master: infra VMs: use wait_for rather than wait_for_connection https://review.opendev.org/c/openstack/kayobe/+/813212 | 15:27 |
opendevreview | Merged openstack/kayobe master: Fix --check argument for overcloud host configure https://review.opendev.org/c/openstack/kayobe/+/800006 | 15:27 |
*** amoralej is now known as amoralej|off | 15:52 | |
opendevreview | James Kirsch proposed openstack/kolla-ansible master: Use system scoped tokens with Keystone https://review.opendev.org/c/openstack/kolla-ansible/+/815577 | 16:09 |
em__ | mnasiadka, if you have a second, it would be useful to know which xena neutron images are currently packaged in kolla. If you like, just comment here https://bugs.launchpad.net/neutron/+bug/1949097 - thank you a lot | 16:11 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/xena: Fix missing Ansible version in the error message https://review.opendev.org/c/openstack/kolla-ansible/+/815805 | 16:15 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/wallaby: Fix missing Ansible version in the error message https://review.opendev.org/c/openstack/kolla-ansible/+/815806 | 16:15 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/victoria: Fix missing Ansible version in the error message https://review.opendev.org/c/openstack/kolla-ansible/+/815807 | 16:15 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/ussuri: Fix missing Ansible version in the error message https://review.opendev.org/c/openstack/kolla-ansible/+/815809 | 16:16 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible master: mariadb: use add_host to include inactive hosts in shard grouping https://review.opendev.org/c/openstack/kolla-ansible/+/814276 | 16:29 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/xena: Fix broken deploy of placement service https://review.opendev.org/c/openstack/kolla-ansible/+/815875 | 16:46 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/wallaby: Fix broken deploy of placement service https://review.opendev.org/c/openstack/kolla-ansible/+/815876 | 16:46 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/victoria: Fix broken deploy of placement service https://review.opendev.org/c/openstack/kolla-ansible/+/815877 | 16:46 |
opendevreview | Radosław Piliszek proposed openstack/kolla-ansible stable/ussuri: Fix broken deploy of placement service https://review.opendev.org/c/openstack/kolla-ansible/+/815878 | 16:46 |
opendevreview | Merged openstack/kayobe stable/xena: Fix --check argument for overcloud host configure https://review.opendev.org/c/openstack/kayobe/+/815840 | 17:37 |
opendevreview | Merged openstack/kayobe stable/xena: infra VMs: use wait_for rather than wait_for_connection https://review.opendev.org/c/openstack/kayobe/+/815841 | 17:37 |
opendevreview | Will Szumski proposed openstack/kayobe master: Fix overcloud database backup https://review.opendev.org/c/openstack/kayobe/+/815895 | 18:04 |
-opendevstatus- NOTICE: mirror.bhs1.ovh.opendev.org filled its disk around 17:25 UTC. We have corrected this issue around 18:25 UTC and jobs that failed due to this mirror can be rechecked. | 18:44 |
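
Note on the 09:06 exchange: the condition jingvar quoted can be modelled in plain Python to see why mgoddard called it a bug. This is only a sketch of the boolean logic, not kolla-ansible code; the function name and the example address and FQDN below are hypothetical. It assumes, as quoted at 09:07, that elasticsearch_address defaults to kolla_internal_fqdn.

```python
def elasticsearch_condition(enable_elasticsearch: bool,
                            elasticsearch_address: str,
                            kolla_internal_vip_address: str,
                            enable_monasca: bool) -> bool:
    """Plain-Python mirror of the Jinja2 expression quoted at 09:06.

    (enable_elasticsearch or address != VIP) and not enable_monasca
    """
    differs_from_vip = elasticsearch_address != kolla_internal_vip_address
    return (enable_elasticsearch or differs_from_vip) and not enable_monasca


# Hypothetical values, for illustration only.
VIP = "10.0.0.250"
FQDN = "internal.example.com"

# Elasticsearch disabled, address equal to the VIP: condition is False.
without_fqdn = elasticsearch_condition(False, VIP, VIP, False)

# Elasticsearch disabled, but the address is the FQDN: the inequality alone
# makes the condition True, which is the surprise jingvar pointed out.
with_fqdn = elasticsearch_condition(False, FQDN, VIP, False)

print(without_fqdn, with_fqdn)
```

The inequality check seems intended to detect an external Elasticsearch address, so any deployment where the internal FQDN differs from the VIP would satisfy it even with Elasticsearch disabled.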
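
Note on the 12:03 suggestion: the debug switch mnasiadka describes is a single key under [DEFAULT]. A minimal sketch of generating that INI fragment with Python's standard configparser follows; the helper name is hypothetical, and where the resulting text has to be placed (the templated file under /etc/kolla on the host, or an override directory) is deliberately left out, since the log does not settle it.

```python
import configparser
import io


def render_debug_override() -> str:
    """Return an INI fragment that sets debug=True under [DEFAULT].

    Hypothetical helper: it only builds the text and does not touch
    any kolla-managed file.
    """
    parser = configparser.ConfigParser()
    # configparser exposes the DEFAULT section as a mapping.
    parser["DEFAULT"]["debug"] = "True"
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


fragment = render_debug_override()
print(fragment)
```

After changing the config on the host, the log's own next step applies: restart the container (docker restart neutron_ovn_metadata_agent) and read the logs again.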