Thursday, 2024-03-14

opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Bump ansible version to 2.15.9  https://review.opendev.org/c/openstack/openstack-ansible/+/90561908:56
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix inventory defenition for Cloudkitty  https://review.opendev.org/c/openstack/openstack-ansible/+/91226908:57
kleinihttps://paste.opendev.org/show/bZZYrhnNuQMzW6yMpJv3/ I constantly get this warning about lost database connections to the MySQL server during query, in all services. How is this supposed to work reliably? The OSA defaults produce even more of those warnings. I already reduced openstack_db_connection_recycle_time to 400 while the timeout on the Galera side remains 600. How can I avoid those warnings?11:12
gokhan__I also get lost connection to MySQL errors on antelope 27.3.011:22
noonedeadpunkkleini: `Lost connection to MySQL server during query` is a really bad thing11:27
kleinias said, these are OSA defaults and it happens with all OpenStack services across all three infra nodes, and Galera is totally idle11:28
noonedeadpunkI guess there are 2 possible things here - the 1st is that smth is off with haproxy, for instance, the internal VIP is getting into some failover loop11:30
kleinihaproxy and Galera are totally stable and monitored via Prometheus. no VIP failover nothing.11:31
kleinisqlalchemy or oslo.db seem to be so badly implemented that they keep open connections lying around. they time out. then some request comes in and 'select 1' fails. so, this connection_recycle_time does not seem to work reliably11:33
noonedeadpunkiirc, select 1 is issued periodically to detect/reset such connection11:38
noonedeadpunkI can recall there was some setting regarding that....11:39
noonedeadpunkhttps://opendev.org/openstack/oslo.db/src/commit/5363ca11c9d4d9a5a9cf5a2be2fc40c52659f258/oslo_db/sqlalchemy/engines.py#L78-L10111:40
noonedeadpunkSo I assume, that should be solved with sqlalchemy 2.0.5: https://opendev.org/openstack/oslo.db/commit/64e50494f219b5c06ed79f947c91cdb7f37cb0d611:44
noonedeadpunkbut I think even for 2024.1 it still will be <2.011:44
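(Editor's sketch) The interplay discussed above, between a client-side recycle time (400 in kleini's setup) and a server-side idle timeout (600 on the Galera side), can be modelled in a few lines. This is a simplified stdlib-only model and not the oslo.db or SQLAlchemy implementation: the names `PooledConnection` and `checkout` are hypothetical, and it assumes the pool only inspects a connection at checkout time, replacing it when it is older than the recycle time and otherwise pinging it (the 'select 1' mentioned above).

```python
from dataclasses import dataclass


@dataclass
class PooledConnection:
    created_at: float    # time the connection was opened (seconds)
    last_used_at: float  # time it last carried a query (seconds)


def checkout(conn: PooledConnection, now: float,
             recycle_time: float, server_timeout: float) -> str:
    """Describe what this simplified pool does with `conn` at time `now`."""
    age = now - conn.created_at
    idle = now - conn.last_used_at
    if age > recycle_time:
        # Too old: replaced proactively, so no error is logged.
        return "recycled"
    if idle > server_timeout:
        # The server already dropped the idle connection; the ping fails.
        return "lost-connection"
    return "reused"


conn = PooledConnection(created_at=0.0, last_used_at=0.0)

# Recycle time 400, server timeout 600, as in the discussion above.
print(checkout(conn, now=100.0, recycle_time=400.0, server_timeout=600.0))
print(checkout(conn, now=500.0, recycle_time=400.0, server_timeout=600.0))

# Hypothetical misconfiguration: recycle time larger than the server timeout.
print(checkout(conn, now=700.0, recycle_time=3600.0, server_timeout=600.0))
```

In this model a connection can never be idle for longer than it has existed, so a recycle time below the server timeout never reaches the "lost-connection" branch. If that holds for a real deployment, warnings seen with 400/600 would have to come from something the model leaves out; that is an inference from the sketch, not a diagnosis.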
opendevreviewMerged openstack/openstack-ansible-os_cloudkitty master: Enable CloudKitty APIv2  https://review.opendev.org/c/openstack/openstack-ansible-os_cloudkitty/+/91229111:45
opendevreviewMerged openstack/openstack-ansible-os_heat master: Deprecate and remove heat_deferred_auth_method variable  https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/90510911:57
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Grant proper privileges to admin user for testing purposes  https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/91210811:58
kleininoonedeadpunk, thank you very much. so I need to be patient regarding that issue and try to ignore those messages in the logs.12:23
opendevreviewOpenStack Release Bot proposed openstack/openstack-ansible-haproxy_server master: reno: Update master for unmaintained/victoria  https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/91301612:25
noonedeadpunkoh no....12:26
noonedeadpunkgerritbot flood is to start now I assume12:26
opendevreviewMerged openstack/openstack-ansible master: Fix physical network mapping for linuxbridge  https://review.opendev.org/c/openstack/openstack-ansible/+/91276812:36
opendevreviewMerged openstack/openstack-ansible master: Upgrade Gnocchi to 4.6  https://review.opendev.org/c/openstack/openstack-ansible/+/91227712:36
noonedeadpunkfwiw, ^ this still results in broken gnocchi due to "recently" upgraded werkzeug in u-c :(12:37
noonedeadpunkjust realized that today12:37
noonedeadpunkhttps://github.com/gnocchixyz/gnocchi/pull/1378 was to address it12:37
opendevreviewOpenStack Release Bot proposed openstack/ansible-hardening master: reno: Update master for unmaintained/xena  https://review.opendev.org/c/openstack/ansible-hardening/+/91313612:52
spatelDid you guys upgrade 2023.1 to 2023.2 ? any good or bad experience 13:39
opendevreviewAleksandr Chudinov proposed openstack/openstack-ansible-os_nova master: fix apparmor profile for non-standard nova home  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/91258314:05
hamburglerspatel: minimal issues from our end 2023.1 > 2023.2 when using 28.0.1, think it was just haproxy not logging for us15:08
spateloh! so that is not openstack related issue but just infra stuff. 15:10
opendevreviewJonathan Rosser proposed openstack/openstack-ansible stable/wallaby: Remove use of undefined ceph distro job zuul template  https://review.opendev.org/c/openstack/openstack-ansible/+/91019215:24
noonedeadpunksounds like we can be switching to 2024.1 branch slowly17:03
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_gnocchi master: Drop default policy file location  https://review.opendev.org/c/openstack/openstack-ansible-os_gnocchi/+/91324417:37
opendevreviewMerged openstack/openstack-ansible master: Add check_hostname option to db healthcheck tasks  https://review.opendev.org/c/openstack/openstack-ansible/+/91115017:44
f0oSpecifying haproxy_bind_(in/ex)ternal_lb_vip_interface after everything has been deployed doesnt seem to update the haproxy configs - is that intended?18:13
noonedeadpunkUm, it actually should18:18
f0oI reran haproxy-install and os-keystone-install and it didnt change anything18:18
f0oit triggered creation of certs but the actual haproxy config/s werent touched at all18:19
noonedeadpunkf0o: so. I think, you'd need to run smth like setup-openstack.yml --tags haproxy-service-config18:19
noonedeadpunkas each service backend/frontend is configured in the service playbooks today18:20
f0ooh ok; let me run that18:20
noonedeadpunkBut I would expect keystone to be re-configured at least from what you ran18:20
f0oit finished and no changes done to the configs18:26
opendevreviewJimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/91324818:30
jrosserf0o: the logic in the haproxy role / service template is pretty complicated - it might be you have some kind of inconsistency in the variables18:41
f0oyeah I tried looking into it and noped out of it tbh18:42
jrosserdo you actually need to bind to an interface?18:43
jrosserfor 99% of cases the IP is sufficient18:43
f0oI have two options; bind to interface or set net.ipv4.tcp_l3mdev_accept=118:43
f0oI did the latter now18:43
noonedeadpunkf0o: aha, so you bound to the interface18:44
f0oother way; I didnt bind and now would need to for VRFs to work correctly18:44
f0oit's no biggie, can just nuke the rack again and do a resetup with the VRF interfaces straight away18:45
noonedeadpunkI do have a setup with working binds to interfaces18:45
f0oadditive changes sometimes arent additive hah18:45
jrosserwell it should work18:45
jrosserlike I say there are probably a bunch of combinations of vars that are not valid18:45
f0olikely18:45
f0oI will spend some more time tomorrow on it18:46
f0oI've had a real stroll down memory lane today with hitting multi-year old bugs in OVS, FRR, VRFs, ....18:46
noonedeadpunkSo I do have `internal_lb_vip_address` set to FQDN and then `haproxy_bind_external_lb_vip_address: "*"` and `haproxy_bind_external_lb_vip_interface: bond0.3104`18:46
f0o^ that's what I had too18:47
f0owhen I ran that playbook just now18:47
jrosserwhich one did you run?18:47
f0ohaproxy-install && os-keystone-install as well as setup-openstack.yml --tags haproxy-service-config as suggested by noonedeadpunk 18:48
f0oweirdly enough it created the Certs for the *-Interface pair and uploaded them18:48
noonedeadpunkand no changes in /etc/haproxy/conf.d/keystone_service ?18:48
f0ojust never changed the haproxy configs/18:48
f0onoonedeadpunk: nope18:49
noonedeadpunkhuh18:49
noonedeadpunkand you're running 2023.2?18:49
f0oyup18:49
jrossernoonedeadpunk: there are some for loops in the service template which I was a bit suspicious about18:49
noonedeadpunklike - we did run the role yesterday and it worked18:49
f0obad8ffe55b51f8197e71b9d552282480e6a40063 to be precise18:50
noonedeadpunkor well, day before yesterday18:50
noonedeadpunkf0o: just in case - where did you define these variables?18:50
jrosserlike if vip_binds somehow ends up with several entries18:50
noonedeadpunkgroup_vars/host_vars18:50
f0oin /etc/openstack_deploy/user_variables.yml18:50
f0othat's what the docs said18:50
noonedeadpunkand you have commented things out in openstack_user_config?18:51
noonedeadpunkin global_overrides?18:51
f0ocommented out?18:51
noonedeadpunkso, some part of the docs could suggest adding `haproxy_bind_external_lb_vip_address` there instead of user_variables18:51
f0oI still have global_overrides>internal_lb_vip_address defined18:52
noonedeadpunkso just trying to check you don't have any conflicts18:52
noonedeadpunkthat is indeed very interesting18:52
f0oI was under the impression global>*_lb_vip_address would dictate the OpenStack endpoints and haproxy_bind_* would bypass those to get alternative binds18:52
f0oas per https://docs.openstack.org/openstack-ansible-haproxy_server/latest/configure-haproxy.html#overriding-the-address-haproxy-will-bind-to18:52
noonedeadpunkyes, that's exactly what happens18:53
f0ook then it doesnt work :D18:53
f0oor it probably does work on a fresh install but not when it's applied afterwards18:53
noonedeadpunknah, it should not matter18:53
jrosserf0o: rather than nuke it, you should debug18:53
noonedeadpunkit's resulting in a template18:53
f0ofor what it's worth, I also have letsencrypt enabled... not sure if that adds more complications18:54
jrosseradd some debug: var=<blah>18:54
jrosserfollowed by fail: tasks just before the template, and see what you get in the actual variables used18:54
f0oyeah I'll litter the haproxy service.j2 with some debug comments and hope to get some change into the configs18:56
jrosserok cool then we can fix either our docs, the code or your vars and it’s understanding++18:56
noonedeadpunkI would add debug here: https://opendev.org/openstack/openstack-ansible-haproxy_server/src/branch/master/tasks/haproxy_service_config_external.yml#L2818:57
f0ohttps://github.com/openstack/openstack-ansible-haproxy_server/blob/master/templates/service.j2 << this one18:58
f0o:+1:18:58
f0oman I'm too used to slack when that's my first instinct18:58
noonedeadpunkor maybe even here https://opendev.org/openstack/openstack-ansible-haproxy_server/src/branch/master/tasks/haproxy_service_config.yml#L3018:58
f0ogood pointers, will tackle it tomorrow19:00
f0otime to walk the dog 19:01
noonedeadpunkI guess I'd try to print out _haproxy_service_configs_simplified for the beginning....19:01
noonedeadpunkor even better - haproxy_tls_vip_binds19:03
noonedeadpunkas basically this logic is the question I guess: https://opendev.org/openstack/openstack-ansible-haproxy_server/src/branch/master/templates/service.j2#L15-L1919:05
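(Editor's sketch) While debugging the template, it may help to check mechanically whether a rendered file such as /etc/haproxy/conf.d/keystone_service actually contains an interface bind. The helper below is a minimal stdlib-only sketch and not part of openstack-ansible: the function name `bind_interfaces` is hypothetical, and the sample config text, IP address and certificate path are made up for illustration. It relies only on HAProxy's `bind <address> ... interface <name>` syntax.

```python
import re

# Matches "bind <address> [options...]" but not keywords such as "bind-process".
BIND_RE = re.compile(r"^\s*bind\s+(\S+)(.*)$")


def bind_interfaces(config_text: str) -> list:
    """Return (address, interface-or-None) for every bind line in the text."""
    results = []
    for line in config_text.splitlines():
        match = BIND_RE.match(line)
        if not match:
            continue
        address = match.group(1)
        options = match.group(2).split()
        interface = None
        if "interface" in options:
            idx = options.index("interface")
            if idx + 1 < len(options):
                interface = options[idx + 1]
        results.append((address, interface))
    return results


# Made-up sample, loosely shaped like a rendered frontend section.
SAMPLE = """\
frontend keystone_service-front-1
    bind *:5000 interface bond0.3104 ssl crt /etc/haproxy/ssl/example.pem
    bind 172.29.236.9:5000
    mode http
"""

for address, interface in bind_interfaces(SAMPLE):
    print(address, interface)
```

To use it on a real host, the file contents could be read with `open(path).read()` and passed in; a bind whose interface comes back as None after the playbook run would indicate that the interface variable did not reach the template.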
opendevreviewJimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/91324819:08
opendevreviewJimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/91324819:09
opendevreviewJimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/91324819:10
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Add support for ovn-bgp-agent deployment  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/90978022:24
opendevreviewMerged openstack/openstack-ansible-os_neutron master: Use ansible_facts['processor_vcpus'] instead of fact variable  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91248123:58