opendevreview | Merged openstack/openstack-ansible-repo_server stable/2024.2: Remove access limitations to repo vhost https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938676 | 01:24 |
f0o | noonedeadpunk: stopping cinder-backup (that HPE appliance thing I mentioned; it's not managed by OSA sadly), removing vhost and rerunning os-cinder-install fixed it :) | 06:25 |
f0o | guess we run without cinder-backup for now and figure that one out during the day | 06:38 |
f0o | hrm I copied over the config from cinder-api; re-added the backup shenanigans to it and upgraded the cinder-backup egg to https://opendev.org/openstack/cinder/commit/9d1b6b8a9f07d742fee094539199c4c14ea294e0 with the upper-constraints from OSA - still no luck. I wonder if some other dependency somewhere in the world of oslo.messaging needs to be updated... | 06:52 |
harun | Hi all, I encountered a TLS issue with Cluster API. First of all, I specified "ca_file: '/etc/ssl/certs/ca-certificates.crt'" in magnum.conf and updated the file in all magnum containers, but I got an error like this (https://paste.openstack.org/show/bAvyHbH3dKC4TSlFHDJU/) in the "openstack-cloud-controller-manager" container on the kubernetes master and worker nodes. When I change it to tls-insecure=true in the /etc/kubernetes/cloud.conf file the problem | 07:24 |
harun | is solved. But I don't want to keep tls-insecure=true; how can I solve this problem properly? | 07:24 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests master: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938867 | 08:33 |
jrosser | harun: did you see this? https://github.com/openstack/openstack-ansible-ops/blob/master/mcapi_vexxhost/playbooks/files/openstack_deploy/group_vars/magnum_all/main.yml#L24-L33 | 08:36 |
jrosser | if your issue is in the workload cluster communicating with your openstack external endpoint, then you must specify the CA to use for that in [drivers] openstack_ca_file | 08:38 |
noonedeadpunk | f0o: so cinder-api and cinder-volume does work now? | 08:38 |
noonedeadpunk | f0o: yes, oslo.messaging was severely patched for quorum queues to work in 2023.2 iirc | 08:38 |
harun | jrosser - thank you for response, i will try | 08:45 |
harun | I noticed that in user_variables_magnum.yml file openstack_ca_file: '/usr/local/share/ca-certificates/ExampleCorpRoot.crt' is specified, I will change this line and try | 08:48 |
jrosser | well, it needs to be correct for your deployment | 08:49 |
jrosser | what certificate do you actually have on the external endpoint? | 08:49 |
jrosser | those files that I showed you demonstrate the settings needed for an OSA all-in-one, but real deployments might differ from that | 08:50 |
jrosser | harun: if you use something like letsencrypt, or a proper cert from a public CA on the external endpoint you don’t need to set that at all, and mistakenly setting it to the openstack internal CA will break it | 09:28 |
jrosser | harun: conversely, if you use a cert from a company private CA, or a self signed one from OSA on the external endpoint, then you *must* provide the correct CA to the workload cluster in the magnum config otherwise it will also be broken | 09:30 |
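For readers following along: in OSA terms, the CA that the workload cluster should trust for the external endpoint is typically supplied through a magnum.conf override. A sketch only - the file path is illustrative, and the override variable follows the usual OSA `<service>_<file>_overrides` convention, so verify it against the linked mcapi_vexxhost example and your os_magnum version:

```yaml
# user_variables.yml sketch -- NOT a drop-in config.
# The path must point at the CA that actually signed the certificate
# on your *external* VIP (a company private CA in harun's case).
magnum_magnum_conf_overrides:
  drivers:
    openstack_ca_file: '/usr/local/share/ca-certificates/MyCompanyRoot.crt'
```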
jrosser | so that’s why my question of “what cert do you actually have on the external vip” | 09:31 |
jrosser | is crucial to understand | 09:31 |
*** tosky_ is now known as tosky | 09:33 |
noonedeadpunk | I'd say in general it's really fine to rely on a system-wide trust store in any case | 09:33 |
jrosser | if you use LE or public cert, for sure | 09:34 |
noonedeadpunk | given that even for internal traffic there're cases where capi driver tries to communicate over public interface with keystone | 09:34 |
noonedeadpunk | but even if it's self-signed - won't it end up in system trust? | 09:34 |
jrosser | oh well unless I misunderstood, this is workloads cluster <> external vip | 09:34 |
noonedeadpunk | ah, could be | 09:34 |
jrosser | and the image knows nothing except what's in the ca-certificates package | 09:35 |
noonedeadpunk | yeah, ok, true | 09:35 |
noonedeadpunk | fair | 09:35 |
jrosser | fwiw I think I should put some divas about this in the ops repo | 09:35 |
noonedeadpunk | we should in fact document this better.. as I also spent a while on understanding all quirks | 09:36 |
jrosser | and comments in those example vars files | 09:36 |
jrosser | *docs in the ops repo (autocorrect!) | 09:36 |
noonedeadpunk | I think there were some comments: https://review.opendev.org/c/openstack/openstack-ansible-ops/+/931556/4/mcapi_vexxhost/playbooks/files/openstack_deploy/group_vars/magnum_all/main.yml | 09:37 |
noonedeadpunk | but I agree that they can be extended even more | 09:38 |
noonedeadpunk | we kinda should land https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938867 as it's resulting now in quite some errors in zuul config | 09:44 |
jrosser | are there others to remove from there too? | 09:46 |
jrosser | oh perhaps not | 09:48 |
noonedeadpunk | we never added others to tests... that's how long we've been trying to get rid of it :D | 09:54 |
f0o | noonedeadpunk: yeah cinder-api/volume/scheduler work now - Now I just need to patch that rogue cinder-backup haha | 10:07 |
f0o | I ran `pip install --constraint constraints.txt git+https://opendev.org/openstack/cinder@stable/2024.1#egg=cinder cryptography ecdsa httplib2 keystonemiddleware osprofiler PyMySQL pymemcache python-memcached systemd-python "oslo.messaging" "tooz[zookeeper]" --upgrade` where the constraints.txt is taken from the release repo | 10:09 |
f0o | (https://opendev.org/openstack/requirements/src/branch/stable/2024.1/upper-constraints.txt) but it says that everything is already matching | 10:09 |
f0o | wonder if I need something specific to tell pip to do stuff | 10:09 |
f0o | I do have oslo.messaging-14.7.2 which matches the constraint.txt | 10:10 |
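One hedged explanation for "everything is already matching": pip's `--upgrade` defaults to `--upgrade-strategy only-if-needed`, so transitive dependencies that already satisfy the requirement (oslo.messaging here) are left untouched. A sketch of the two usual workarounds, wrapped in a function so nothing runs by accident - the URL and versions are taken from the discussion above:

```shell
# Sketch only -- review, then run inside the cinder venv.
upgrade_cinder_eager() {
    # `eager` upgrades already-installed transitive deps to the
    # constraint-pinned versions; the default `only-if-needed`
    # skips anything that still "satisfies" the requirement.
    pip install --upgrade --upgrade-strategy eager \
        --constraint constraints.txt \
        'git+https://opendev.org/openstack/cinder@stable/2024.1#egg=cinder'
}
# Or force an exact reinstall of one suspect package:
#   pip install --force-reinstall --no-deps 'oslo.messaging==14.7.2'
```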
opendevreview | Merged openstack/openstack-ansible-tests master: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938867 | 10:29 |
f0o | is there a way to tell OSA not to manipulate /etc/sysctl.conf but instead use /etc/sysctl.d/99-osa.conf ? | 10:39 |
f0o | the issue is that /etc/sysctl.conf is included _last_ which then overrides all sorts of things - with OVN this becomes an issue as the conntrack-max is too low | 10:40 |
f0o | but also any other sysctl setting one might have set elsewhere is just silently overwritten | 10:40 |
jrosser | f0o: can you set the value you want here? https://github.com/openstack/openstack-ansible-openstack_hosts/blob/bab1f5c75cde047776e6bebecab6fa203bc369d8/defaults/main.yml#L122 | 11:08 |
f0o | jrosser: that might solve the specific example not the underlying issue | 11:09 |
jrosser | we can't do it both ways though | 11:09 |
jrosser | some things set by OSA in sysctl might be absolutely required | 11:10 |
f0o | just seems odd to me to edit /etc/sysctl.conf instead of /etc/sysctl.d/*.conf | 11:10 |
f0o | you can name it 99-zz-osa.conf and it will probably be the last one; but if I wanted to override some of it I can make a file 99-zz-zz.conf as an advanced user | 11:11 |
f0o | forcing it into /etc/sysctl.conf leaves me no option but to `watch "rm /etc/sysctl.conf && sysctl --system"` as a safety measure while using OSA | 11:12 |
f0o | or I get everyone on board with putting all possible sysctl values into the osa-deployment vars, some of which come from 3rd party software which we only know after shit hits the fan | 11:12 |
f0o | to me it's just an odd choice to push operators/admins into a corner like that | 11:13 |
jrosser | you can't override something already in sysctl.conf by using the .d directory? | 11:13 |
f0o | nop | 11:14 |
f0o | sysctl loads /etc/sysctl.conf after all .d/*.conf making whatever is in /etc/sysctl.conf the law | 11:14 |
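f0o's precedence claim can be demonstrated without touching a live system. The sketch below replays the ordering that procps `sysctl --system` documents (all `/etc/sysctl.d/*.conf` fragments in lexical order, then `/etc/sysctl.conf` last) against throwaway files:

```shell
# Simulate sysctl --system precedence: later assignments win,
# and /etc/sysctl.conf is read last of all.
tmp=$(mktemp -d)
mkdir -p "$tmp/sysctl.d"
echo 'net.netfilter.nf_conntrack_max = 1048576' > "$tmp/sysctl.d/99-osa.conf"
echo 'net.netfilter.nf_conntrack_max = 262144'  > "$tmp/sysctl.conf"
# Replay the documented read order: .d fragments first, sysctl.conf last.
last=$(cat "$tmp"/sysctl.d/*.conf "$tmp/sysctl.conf" \
       | awk -F' = ' '/nf_conntrack_max/ {v=$2} END {print v}')
echo "effective value: $last"   # the /etc/sysctl.conf value wins
rm -rf "$tmp"
```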
jrosser | if you feel strongly about it then please do consider submitting a patch to the openstack_hosts role to make use of the `sysctl_file` parameter to the ansible module that applies the configuration | 11:16 |
jrosser | https://docs.ansible.com/ansible/latest/collections/ansible/posix/sysctl_module.html#parameter-sysctl_file | 11:16 |
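A minimal sketch of what such a patch could look like. The `openstack_sysctl_file` variable is hypothetical (it does not exist in the role today), but `sysctl_file` is a real parameter of the `ansible.posix.sysctl` module:

```yaml
# Hypothetical task sketch for the openstack_hosts role.
- name: Apply kernel options to a configurable sysctl file
  ansible.posix.sysctl:
    name: "{{ item.key }}"
    value: "{{ item.value }}"
    # Defaults to today's behaviour; deployers could point this
    # at e.g. /etc/sysctl.d/99-osa.conf instead.
    sysctl_file: "{{ openstack_sysctl_file | default('/etc/sysctl.conf') }}"
    state: present
    reload: true
  with_items: "{{ openstack_kernel_options }}"
```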
f0o | <3 | 11:17 |
jrosser | that would be the first step to either 1) moving the OSA specific settings out to their own file 2) allowing you to name the file whatever you want 3) allowing an extension point to write out arbitrary other sysctl config files for whatever else your vendors require | 11:18 |
f0o | yeah I think just adding a new variable which defaults to /etc/sysctl.conf would be a good way. This way it won't change default behavior but allows advanced users to specify alternate paths | 11:20 |
jrosser | sure - ideally everything in OSA should be configurable to achieve what specific thing a deployment needs | 11:20 |
jrosser | thats clearly not possible here so the ideal outcome is increasing the configurability for all use cases | 11:20 |
jrosser | it's likely that adding that extra var could be seen as a bugfix (and therefore backportable) if it causes some difficulty for you right now | 11:22 |
jrosser | a more extensive breaking change would be to change openstack_kernel_options and openstack_user_kernel_options into more than one list so they could write out multiple files into /etc/sysctl.d/ if that was useful | 11:24 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-os_neutron master: Set valid_interfaces to internal for ironic-python-agent https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/938881 | 11:35 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/2024.2: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938882 | 11:57 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/2024.1: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938883 | 11:57 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/2023.2: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938884 | 11:58 |
harun | jrosser: we use a private CA from a company; we put the CA path in the file and copied it to the magnum containers, and strangely it didn't work | 12:07 |
jrosser | you should be able to find it in the cloud-init data that is passed to the created magnum cluster | 12:08 |
jrosser | and it should be referenced in the created magnum cluster cloud config | 12:09 |
jrosser | there is a file somewhere in /etc/ i think in the workload cluster that holds the definition of where the openstack endpoint is, and that should reference the CA you supply | 12:10 |
jrosser | sorry i don't have access to one of these today to give a more specific example | 12:10 |
jrosser | but our lab environment is exactly the same, we have a private company issued cert on the external VIP, which is different to the OSA self signed one on the internal VIP | 12:11 |
harun | jrosser: thank you, i will check it | 12:14 |
jrosser | harun: I also see that in your error message you put `"https://<IP>:5000/"` as the URL with an error | 12:15 |
jrosser | so you actually have a certificate that will validate for an IP address? | 12:15 |
jrosser | or have you replaced the fqdn with <IP> just for that paste? | 12:15 |
harun | yes, i replaced just for that paste | 12:59 |
harun | may it be a problem with TLS cipher suites? | 12:59 |
jrosser | harun: it's clear from the error "certificate signed by unknown authority" | 13:20 |
jrosser | about where you have put <IP>, i was trying to ask if you have everything setup "properly", like the fqdn (not IP) used in the service catalog for external endpoints, corresponding entries in your DNS for the external VIP, a certificate issued correctly for the external VIP fqdn (not for the IP) | 13:24 |
jrosser | there is a lot that has to be right for SSL verification to work | 13:24 |
jrosser | and if you obfuscate the paste, and put in "<IP>", that makes me concerned that some shortcut has been taken which introduces a fundamental problem | 13:25 |
jrosser | but ultimately this is about the CA in the k8s workload cluster not trusting the SSL certificate on the external api endpoint | 13:27 |
jrosser | and you have probably two possibilities there: that the CA is not passed properly through magnum to the workload cluster, or you have some intermediate certificate missing from the cert chain you have installed on haproxy | 13:28 |
jrosser | *intermediate CA certificate missing | 13:28 |
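The "certificate signed by unknown authority" failure mode can be reproduced offline. The sketch below generates two throwaway self-signed CAs to show that a certificate only verifies against a bundle that actually contains its issuer; the commented `s_client` line is the equivalent live check (hostname and CA path are placeholders):

```shell
# Two unrelated throwaway CAs stand in for "company CA" vs "OSA internal CA".
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout /tmp/ca1.key -out /tmp/ca1.crt -subj '/CN=Company Root CA' 2>/dev/null
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout /tmp/ca2.key -out /tmp/ca2.crt -subj '/CN=OSA Internal CA' 2>/dev/null
openssl verify -CAfile /tmp/ca1.crt /tmp/ca1.crt          # succeeds: issuer is in the bundle
openssl verify -CAfile /tmp/ca2.crt /tmp/ca1.crt || true  # fails: wrong CA, same error class
# Live equivalent against the external VIP (placeholders):
#   openssl s_client -connect cloud.example.com:5000 \
#       -CAfile /path/to/company-ca.crt </dev/null | grep 'Verify return code'
```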
f0o | run-upgrade.sh stops the rabbitmq-server and then runs `TASK [rabbitmq_server : Set rabbitmq cluster name on primary node]` without actually starting rabbitmq-server and thus failing | 14:24 |
f0o | where would I debug this? | 14:24 |
f0o | https://paste.opendev.org/show/bZpul73fctwdSorjc6la/ | 14:27 |
f0o | snipped away all the install output; but there is no task that ensures rabbitmq-server is actually running post-install, while there is a step that prevents it from starting during installation | 14:27 |
noonedeadpunk | are you upgrading to 2024.2 or smth? | 14:33 |
noonedeadpunk | as there was a bug regarding that which is already addressed | 14:33 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-rabbitmq_server/commit/b09ac39cbca11f7a5a14731e583246fe6a6c420e | 14:34 |
f0o | yep 2024.2 | 14:39 |
f0o | 3 weeks ago ... why wasn't this pulled in just now? | 14:39 |
noonedeadpunk | we didn't tag this change yet, so it's not part of any release | 14:39 |
f0o | aaaaaaaaaah | 14:39 |
f0o | well I guess I can apply the patch locally :) | 14:39 |
noonedeadpunk | it will be part of next 30.0.1 or smth | 14:39 |
noonedeadpunk | there were also couple of other backports here and there... | 14:40 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-repo_server stable/2024.2: Fix tags usage for the role https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938907 | 14:41 |
noonedeadpunk | actually, I think I can pretty much go on and prepare that release... | 14:42 |
f0o | :3 | 14:43 |
opendevreview | Merged openstack/openstack-ansible-os_skyline stable/2024.1: Ensure proper db connection string with SSL enabled https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/938465 | 14:48 |
opendevreview | Merged openstack/openstack-ansible-tests stable/2024.2: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938882 | 15:31 |
opendevreview | Merged openstack/openstack-ansible-tests stable/2024.1: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938883 | 15:46 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.1: Bump SHAs for 2024.1 https://review.opendev.org/c/openstack/openstack-ansible/+/938919 | 15:50 |
opendevreview | Merged openstack/openstack-ansible-tests stable/2023.2: Remove sahara from zuul required projects https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938884 | 15:51 |
opendevreview | Merged openstack/ansible-role-frrouting master: Use centralized requirements for molecule testing https://review.opendev.org/c/openstack/ansible-role-frrouting/+/938222 | 15:56 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.2: Bump SHAs for 2023.2 https://review.opendev.org/c/openstack/openstack-ansible/+/938928 | 16:10 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.2: Bump SHAs for 2024.2 https://review.opendev.org/c/openstack/openstack-ansible/+/938931 | 16:17 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Delete ceontos-privet-epel release note https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/938933 | 16:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Delete centos-private-epel release note https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/938933 | 16:28 |
f0o | does Keystone and Glance no longer user Rabbitmq vhosts? | 16:40 |
f0o | reason I ask is because I did the HA->Quorum migration playbooks but noticed on run-upgrade.sh that rabbitmq wouldn't start because /glance and /keystone were still using HA policies. So I just removed those vhosts and re-ran the upgrade. Now I see that there are no /glance or /keystone vhosts anymore, even after running the keystone and glance install playbooks specifically | 16:43 |
f0o | I can login so it does seem to work, just curious about this | 16:43 |
noonedeadpunk | yeah, neither of them use messaging | 16:43 |
f0o | coolio | 16:44 |
f0o | maybe run-upgrade should drop those vhosts then | 16:44 |
noonedeadpunk | I guess we configured it historically as we didn't separate rpc/notifications. So if no ceilometer is installed - they won't have vhosts | 16:44 |
f0o | was a bit annoying to find those but `rabbitmq-diagnostics list_policies_with_classic_queue_mirroring -s --formatter=pretty_table` helped (https://www.rabbitmq.com/docs/3.13/ha#detect-usage) | 16:45 |
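For anyone hitting the same thing, a sketch of the manual cleanup, wrapped in a function so nothing is deleted by accident. The vhost names come from the discussion; confirm nothing consumes from a vhost before dropping it:

```shell
# Detect leftover classic-mirroring policies (per the RabbitMQ 3.13 docs):
#   rabbitmq-diagnostics list_policies_with_classic_queue_mirroring -s \
#       --formatter=pretty_table
drop_unused_vhost() {
    # $1 = vhost name, e.g. /glance or /keystone
    rabbitmqctl list_policies -p "$1"   # review what would go away
    rabbitmqctl delete_vhost "$1"       # removes the vhost and its policies
}
# Usage, only after confirming the services no longer use messaging:
#   drop_unused_vhost /glance
#   drop_unused_vhost /keystone
```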
f0o | makes sense | 16:45 |
noonedeadpunk | well, it's a bit tricky, as the roles themselves would filter out the vhost since it doesn't pass the condition. | 16:45 |
f0o | then I guess I successfully upgraded to 2024.2 now! | 16:45 |
noonedeadpunk | and doing outside of role is kinda risky | 16:45 |
noonedeadpunk | and they generally don't hurt or consume resources.... | 16:45 |
noonedeadpunk | but if you have idea on how to remove them safely - contributions are always welcome! | 16:46 |
noonedeadpunk | frankly - I don't think we invested much time into such clean-up | 16:46 |
f0o | tbh no clue; I think this is quite the exceptional use case where rabbitmq outright removed a feature and failed startup because of those old policies | 16:47 |
f0o | anyway time for the pub for a well deserved beer. Cinder-Backups is still failing but I dont think anyone uses it even | 16:47 |
noonedeadpunk | eventually we should be dropping HA policy... | 16:47 |
noonedeadpunk | but beer is well deserved for sure! | 16:48 |
opendevreview | Merged openstack/ansible-role-systemd_networkd master: Add routing policy management for interfaces https://review.opendev.org/c/openstack/ansible-role-systemd_networkd/+/937624 | 16:59 |
noonedeadpunk | jrosser: should it be still in WIP: https://review.opendev.org/c/openstack/ansible-config_template/+/938513 ? | 17:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-httpd master: Initial commit to the role https://review.opendev.org/c/openstack/ansible-role-httpd/+/938245 | 17:01 |
opendevreview | Merged openstack/ansible-role-systemd_networkd master: Do not try to configure resolved when it's not available https://review.opendev.org/c/openstack/ansible-role-systemd_networkd/+/938512 | 17:10 |
jrosser | noonedeadpunk: no it's not wip any more - I think I didn't understand the test failures at first, but it's passing and there are comments where I've changed the tests | 17:14 |
noonedeadpunk | yeah, I also think it's fine | 17:26 |
noonedeadpunk | just in case - I went on and proposed 2023.1 migration to unmaintained: https://review.opendev.org/c/openstack/releases/+/938952 | 17:56 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!