Friday, 2025-01-10

01:24 <opendevreview> Merged openstack/openstack-ansible-repo_server stable/2024.2: Remove access limitations to repo vhost  https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938676
06:25 <f0o> noonedeadpunk: stopping cinder-backup (that HPE appliance thing I mentioned; it's not managed by OSA sadly), removing the vhost and rerunning os-cinder-install fixed it :)
06:38 <f0o> guess we run without cinder-backup for now and figure that one out during the day
06:52 <f0o> hrm, I copied over the config from cinder-api, re-added the backup shenanigans to it and upgraded the cinder-backup egg to https://opendev.org/openstack/cinder/commit/9d1b6b8a9f07d742fee094539199c4c14ea294e0 with the upper-constraints from OSA - still no luck. I wonder if some other dependency somewhere in the world of oslo.messaging needs to be updated...
07:24 <harun> Hi all, I encountered a TLS issue in Cluster API. First of all, I specified "ca_file: '/etc/ssl/certs/ca-certificates.crt'" in magnum.conf and updated the file in all magnum containers, but I got an error like this (https://paste.openstack.org/show/bAvyHbH3dKC4TSlFHDJU/) in the "openstack-cloud-controller-manager" container on the Kubernetes master and worker nodes. When I change it to tls-insecure=true in the /etc/kubernetes/cloud.conf file the problem is solved, but I don't want to rely on tls-insecure=true. How can I solve this problem?
08:33 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests master: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938867
08:36 <jrosser> harun: did you see this? https://github.com/openstack/openstack-ansible-ops/blob/master/mcapi_vexxhost/playbooks/files/openstack_deploy/group_vars/magnum_all/main.yml#L24-L33
08:38 <jrosser> if your issue is in the workload cluster communicating with your openstack external endpoint, then you must specify the CA to use for that in [drivers] openstack_ca_file
08:38 <noonedeadpunk> f0o: so cinder-api and cinder-volume do work now?
08:38 <noonedeadpunk> f0o: yes, oslo.messaging was severely patched for quorum queues to work in 2023.2 iirc
08:45 <harun> jrosser - thank you for the response, I will try
08:48 <harun> I noticed that in the user_variables_magnum.yml file openstack_ca_file: '/usr/local/share/ca-certificates/ExampleCorpRoot.crt' is specified; I will change this line and try
08:49 <jrosser> well, it needs to be correct for your deployment
08:49 <jrosser> what certificate do you actually have on the external endpoint?
08:50 <jrosser> those files that I showed you demonstrate the settings needed for an OSA all-in-one, but real deployments might differ from that
09:28 <jrosser> harun: if you use something like letsencrypt, or a proper cert from a public CA on the external endpoint, you don't need to set that at all, and mistakenly setting it to the openstack internal CA will break it
09:30 <jrosser> harun: conversely, if you use a cert from a company private CA, or a self-signed one from OSA on the external endpoint, then you *must* provide the correct CA to the workload cluster in the magnum config, otherwise it will also be broken
09:31 <jrosser> so that's why my question of "what cert do you actually have on the external vip"
09:31 <jrosser> is crucial to understand
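[Editor's note: a minimal sketch of what the advice above boils down to in an OSA deployment, assuming the mcapi_vexxhost-style overrides from the group_vars file jrosser linked; the CA path is the example one harun quotes earlier, and both the `magnum_config_overrides` structure and the playbook rerun are assumptions to verify against your checked-out role, not a confirmed recipe.]

```
# Hypothetical sketch: point magnum at the CA that signed the *external*
# VIP certificate, so workload clusters can verify the API endpoint.
cat >> /etc/openstack_deploy/user_variables.yml <<'EOF'
magnum_config_overrides:
  drivers:
    # Path inside the magnum containers; the file must also exist there
    openstack_ca_file: '/usr/local/share/ca-certificates/ExampleCorpRoot.crt'
EOF
# Re-apply the magnum configuration afterwards:
openstack-ansible playbooks/os-magnum-install.yml
```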
09:33 *** tosky_ is now known as tosky
09:33 <noonedeadpunk> I'd say in general it's really fine to rely on a system-wide trust store in any case
09:34 <jrosser> if you use LE or public cert, for sure
09:34 <noonedeadpunk> given that even for internal traffic there are cases where the capi driver tries to communicate with keystone over the public interface
09:34 <noonedeadpunk> but even if it's self-signed - won't it end up in the system trust store?
09:34 <jrosser> oh well, unless I misunderstood, this is workload cluster <> external vip
09:34 <noonedeadpunk> ah, could be
09:35 <jrosser> and the image knows nothing except what's in the ca-certificates package
09:35 <noonedeadpunk> yeah, ok, true
09:35 <noonedeadpunk> fair
09:35 <jrosser> fwiw I think I should put some divas about this in the ops repo
09:36 <noonedeadpunk> we should in fact document this better.. as I also spent a while understanding all the quirks
09:36 <jrosser> and comments in those example vars files
09:36 <jrosser> *docs in the ops repo (autocorrect!)
09:37 <noonedeadpunk> I think there were some comments: https://review.opendev.org/c/openstack/openstack-ansible-ops/+/931556/4/mcapi_vexxhost/playbooks/files/openstack_deploy/group_vars/magnum_all/main.yml
09:38 <noonedeadpunk> but I agree that they can be extended even more
09:44 <noonedeadpunk> we kinda should land https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938867 as it's now resulting in quite a few errors in the zuul config
09:46 <jrosser> are there others to remove from there too?
09:48 <jrosser> oh perhaps not
09:54 <noonedeadpunk> we never added others to tests... that's how long we've been trying to get rid of it :D
10:07 <f0o> noonedeadpunk: yeah, cinder-api/volume/scheduler work now - now I just need to patch that rogue cinder-backup haha
10:09 <f0o> I ran `pip install --constraint constraints.txt git+https://opendev.org/openstack/cinder@stable/2024.1#egg=cinder cryptography ecdsa httplib2 keystonemiddleware osprofiler PyMySQL pymemcache python-memcached systemd-python "oslo.messaging" "tooz[zookeeper]" --upgrade` where the constraints.txt is taken from the requirements repo
10:09 <f0o> (https://opendev.org/openstack/requirements/src/branch/stable/2024.1/upper-constraints.txt) but it says that everything is already matching
10:09 <f0o> wonder if I need something specific to tell pip to do stuff
10:10 <f0o> I do have oslo.messaging-14.7.2 which matches the constraints.txt
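[Editor's note: pip skips any package whose installed version already satisfies the requirement, which is likely why it reports everything as matching. A minimal sketch of forcing a reinstall of just the cinder egg under the same constraints; the flags are standard pip, but this approach is an assumption, not something confirmed later in the log.]

```
# Reinstall cinder from the branch even though the version is unchanged;
# --no-deps keeps pip from needlessly rebuilding every dependency, and
# the constraints file still pins anything that does get pulled in.
pip install --constraint constraints.txt --upgrade --force-reinstall --no-deps \
    "git+https://opendev.org/openstack/cinder@stable/2024.1#egg=cinder"
```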
10:29 <opendevreview> Merged openstack/openstack-ansible-tests master: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938867
10:39 <f0o> is there a way to tell OSA not to manipulate /etc/sysctl.conf but instead use /etc/sysctl.d/99-osa.conf?
10:40 <f0o> the issue is that /etc/sysctl.conf is included _last_, which then overrides all sorts of things - with OVN this becomes an issue as the conntrack-max is too low
10:40 <f0o> but also any other sysctl setting one might have set elsewhere is just silently overwritten
11:08 <jrosser> f0o: can you set the value you want here? https://github.com/openstack/openstack-ansible-openstack_hosts/blob/bab1f5c75cde047776e6bebecab6fa203bc369d8/defaults/main.yml#L122
11:09 <f0o> jrosser: that might solve the specific example, not the underlying issue
11:09 <jrosser> we can't do it both ways though
11:10 <jrosser> some things set by OSA in sysctl might be absolutely required
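[Editor's note: a minimal sketch of jrosser's suggestion, assuming the `openstack_user_kernel_options` variable mentioned later in this log takes the same key/value item format as the linked defaults file; the conntrack value is an example only, and the format should be checked against your checked-out openstack_hosts role.]

```
# Hypothetical sketch: raise conntrack-max via the openstack_hosts role,
# so the value OSA writes to /etc/sysctl.conf is already the one you want.
cat >> /etc/openstack_deploy/user_variables.yml <<'EOF'
openstack_user_kernel_options:
  - key: 'net.netfilter.nf_conntrack_max'
    value: 1048576  # example only; size this for your OVN deployment
EOF
```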
11:10 <f0o> just seems odd to me to edit /etc/sysctl.conf instead of /etc/sysctl.d/*.conf
11:11 <f0o> you can name it 99-zz-osa.conf and it will probably be the last one; but if I wanted to override some of it I can make a file 99-zz-zz.conf as an advanced user
11:12 <f0o> forcing it into /etc/sysctl.conf leaves me no options but to `watch "rm /etc/sysctl.conf && sysctl --system"` while using OSA as a safety measure
11:12 <f0o> or I get everyone on board with putting all possible sysctl values into the osa-deployment vars, some of which come from 3rd party software which we only know about after shit hits the fan
11:13 <f0o> to me it's just an odd choice to push operators/admins into a corner like that
11:13 <jrosser> you can't override something already in sysctl.conf by using the .d directory?
11:14 <f0o> nope
11:14 <f0o> sysctl loads /etc/sysctl.conf after all .d/*.conf, making whatever is in /etc/sysctl.conf the law
11:16 <jrosser> if you feel strongly about it then please do consider submitting a patch to the openstack_hosts role to make use of the `sysctl_file` parameter to the ansible module that applies the configuration
11:16 <jrosser> https://docs.ansible.com/ansible/latest/collections/ansible/posix/sysctl_module.html#parameter-sysctl_file
11:17 <f0o> <3
11:18 <jrosser> that would be the first step to either 1) moving the OSA-specific settings out to their own file, 2) allowing you to name the file whatever you want, or 3) allowing an extension point to write out arbitrary other sysctl config files for whatever else your vendors require
11:20 <f0o> yeah, I think just adding a new variable which defaults to /etc/sysctl.conf would be a good way. This way it won't change default behavior but allows advanced users to specify alternate paths
11:20 <jrosser> sure - ideally everything in OSA should be configurable to achieve whatever specific thing a deployment needs
11:20 <jrosser> that's clearly not possible here yet, so the ideal outcome is increasing the configurability for all use cases
11:22 <jrosser> it's likely that adding that extra var could be seen as a bugfix (and therefore backportable) if it leads to some difficulty for you right now
11:24 <jrosser> a more extensive breaking change would be to change openstack_kernel_options and openstack_user_kernel_options into more than one list so they could write out multiple files into /etc/sysctl.d/ if that was useful
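[Editor's note: a minimal sketch of the `sysctl_file` parameter jrosser points at, shown as an ad-hoc invocation from the deploy host rather than a role patch; the `compute_hosts` target group and the file name are assumptions, and the module parameters are taken from the linked ansible.posix.sysctl documentation.]

```
# Hypothetical: apply one kernel option into a dedicated sysctl.d file
# instead of /etc/sysctl.conf, using the module's sysctl_file parameter.
ansible compute_hosts -m ansible.posix.sysctl -a "name=net.netfilter.nf_conntrack_max value=1048576 sysctl_file=/etc/sysctl.d/99-osa.conf state=present reload=true"
```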
11:35 <opendevreview> Andrew Bonney proposed openstack/openstack-ansible-os_neutron master: Set valid_interfaces to internal for ironic-python-agent  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/938881
11:57 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/2024.2: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938882
11:57 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/2024.1: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938883
11:58 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/2023.2: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938884
12:07 <harun> jrosser: we use a company private CA; we put the CA path in the file and copied it to the magnum containers, and strangely it didn't work
12:08 <jrosser> you should be able to find it in the cloud-init data that is passed to the created magnum cluster
12:09 <jrosser> and it should be referenced in the created magnum cluster cloud config
12:10 <jrosser> there is a file somewhere in /etc/ I think in the workload cluster that holds the definition of where the openstack endpoint is, and that should reference the CA you supply
12:10 <jrosser> sorry, I don't have access to one of these today to give a more specific example
12:11 <jrosser> but our lab environment is exactly the same, we have a private company-issued cert on the external VIP, which is different to the OSA self-signed one on the internal VIP
12:14 <harun> jrosser: thank you, I will check it
12:15 <jrosser> harun: I see also in your error message you put `"https://<IP>:5000/"` as the URL with an error
12:15 <jrosser> so you actually have a certificate that will validate for an IP address?
12:15 <jrosser> or have you replaced the fqdn with <IP> just for that paste?
12:59 <harun> yes, I replaced it just for that paste
12:59 <harun> could it be a problem with TLS cipher suites?
13:20 <jrosser> harun: it's clear from the error "certificate signed by unknown authority"
13:24 <jrosser> about where you have put <IP>, I was trying to ask if you have everything set up "properly", like the fqdn (not IP) used in the service catalog for external endpoints, corresponding entries in your DNS for the external VIP, a certificate issued correctly for the external VIP fqdn (not for the IP)
13:24 <jrosser> there is a lot that has to be right for SSL verification to work
13:25 <jrosser> and if you obfuscate the paste and put in "<IP>", that makes me concerned that some shortcut has been taken which introduces a fundamental problem
13:27 <jrosser> but ultimately this is about the CA in the k8s workload cluster not trusting the SSL certificate on the external api endpoint
13:28 <jrosser> and you have probably two possibilities there: that the CA is not passed properly through magnum to the workload cluster, or you have some intermediate certificate missing from the cert chain you have installed on haproxy
13:28 <jrosser> *intermediate CA certificate missing
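[Editor's note: both failure modes jrosser names can be checked from outside the cluster with openssl. A minimal sketch, assuming `cloud.example.com` stands in for the real external VIP fqdn and the CA file path is the example one quoted earlier in the log.]

```
# Show the full chain haproxy serves; a missing intermediate CA cert
# is visible here as a gap between the leaf and the root.
openssl s_client -connect cloud.example.com:5000 -servername cloud.example.com -showcerts </dev/null
# Verify the endpoint against the CA bundle handed to magnum; anything
# other than "Verify return code: 0 (ok)" reproduces the k8s error.
openssl s_client -connect cloud.example.com:5000 -servername cloud.example.com \
    -CAfile /usr/local/share/ca-certificates/ExampleCorpRoot.crt </dev/null 2>/dev/null | grep 'Verify return code'
```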
14:24 <f0o> run-upgrade.sh stops the rabbitmq-server and then runs `TASK [rabbitmq_server : Set rabbitmq cluster name on primary node]` without actually starting rabbitmq-server first, and thus fails
14:24 <f0o> where would I debug this?
14:27 <f0o> https://paste.opendev.org/show/bZpul73fctwdSorjc6la/
14:27 <f0o> snipped away the whole package-installation part; but there is no task that ensures that rabbitmq-server is actually running post-install, and there is a step that prevents it from starting during installation
14:33 <noonedeadpunk> are you upgrading to 2024.2 or something?
14:33 <noonedeadpunk> as there was a bug regarding that which is already addressed
14:34 <noonedeadpunk> https://opendev.org/openstack/openstack-ansible-rabbitmq_server/commit/b09ac39cbca11f7a5a14731e583246fe6a6c420e
14:39 <f0o> yep, 2024.2
14:39 <f0o> 3 weeks ago ... why wasn't this pulled in just now?
14:39 <noonedeadpunk> we didn't tag this change yet, so it's not part of any release
14:39 <f0o> aaaaaaaaaah
14:39 <f0o> well I guess I can apply the patch locally :)
14:39 <noonedeadpunk> it will be part of the next release, 30.0.1 or something
14:40 <noonedeadpunk> there were also a couple of other backports here and there...
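[Editor's note: a minimal sketch of applying the unreleased commit by hand, assuming the role was checked out under /etc/ansible/roles on the deploy host and that the commit is reachable from the fetched branch; adjust paths and branch for your environment before running this.]

```
# Cherry-pick the upstream fix into the locally checked-out role copy.
cd /etc/ansible/roles/rabbitmq_server
git fetch https://opendev.org/openstack/openstack-ansible-rabbitmq_server master
git cherry-pick b09ac39cbca11f7a5a14731e583246fe6a6c420e
```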
14:41 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-repo_server stable/2024.2: Fix tags usage for the role  https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938907
14:42 <noonedeadpunk> actually, I think I can pretty much go on and prepare that release...
14:43 <f0o> :3
14:48 <opendevreview> Merged openstack/openstack-ansible-os_skyline stable/2024.1: Ensure proper db connection string with SSL enabled  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/938465
15:31 <opendevreview> Merged openstack/openstack-ansible-tests stable/2024.2: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938882
15:46 <opendevreview> Merged openstack/openstack-ansible-tests stable/2024.1: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938883
15:50 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.1: Bump SHAs for 2024.1  https://review.opendev.org/c/openstack/openstack-ansible/+/938919
15:51 <opendevreview> Merged openstack/openstack-ansible-tests stable/2023.2: Remove sahara from zuul required projects  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/938884
15:56 <opendevreview> Merged openstack/ansible-role-frrouting master: Use centralized requirements for molecule testing  https://review.opendev.org/c/openstack/ansible-role-frrouting/+/938222
16:10 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.2: Bump SHAs for 2023.2  https://review.opendev.org/c/openstack/openstack-ansible/+/938928
16:17 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.2: Bump SHAs for 2024.2  https://review.opendev.org/c/openstack/openstack-ansible/+/938931
16:20 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Delete ceontos-privet-epel release note  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/938933
16:28 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Delete centos-private-epel release note  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/938933
16:40 <f0o> do Keystone and Glance no longer use RabbitMQ vhosts?
16:43 <f0o> reason I ask is because I did the HA->Quorum migration playbooks but noticed on run-upgrade.sh that rabbitmq wouldn't start because /glance and /keystone were still using HA policies. So I just removed those vhosts and re-ran the upgrade. Now I see that there are no /glance or /keystone vhosts anymore, even after running the keystone and glance install playbooks specifically
16:43 <f0o> I can log in so it does seem to work, just curious about this
16:43 <noonedeadpunk> yeah, neither of them uses messaging
16:44 <f0o> coolio
16:44 <f0o> maybe run-upgrade should drop those vhosts then
16:44 <noonedeadpunk> I guess we've configured it historically as we didn't separate rpc/notifications. So if no ceilometer is installed - they won't have vhosts
16:45 <f0o> was a bit annoying to find those but `rabbitmq-diagnostics list_policies_with_classic_queue_mirroring -s --formatter=pretty_table` helped (https://www.rabbitmq.com/docs/3.13/ha#detect-usage)
16:45 <f0o> makes sense
16:45 <noonedeadpunk> well, it's a bit tricky, as the roles themselves would filter out the vhost since it doesn't pass the condition
16:45 <f0o> then I guess I successfully upgraded to 2024.2 now!
16:45 <noonedeadpunk> and doing it outside of the role is kinda risky
16:45 <noonedeadpunk> and they generally don't hurt or consume resources...
16:46 <noonedeadpunk> but if you have an idea on how to remove them safely - contributions are always welcome!
16:46 <noonedeadpunk> frankly - I don't think we invested much time into such clean-up
16:47 <f0o> tbh no clue; I think this is quite the exceptional use case where rabbitmq outright removed a feature and failed startup because of those old policies
16:47 <f0o> anyway, time for the pub for a well-deserved beer. Cinder-Backups is still failing but I don't think anyone even uses it
16:47 <noonedeadpunk> eventually we should be dropping the HA policy...
16:48 <noonedeadpunk> but beer is well deserved for sure!
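[Editor's note: a minimal sketch of the manual clean-up f0o describes, using the diagnostics command quoted above; the policy name `HA` and the vhost list are examples, so check the diagnostics output before clearing or deleting anything.]

```
# List any remaining classic queue mirroring (HA) policies:
rabbitmq-diagnostics list_policies_with_classic_queue_mirroring -s --formatter=pretty_table
# Either clear just the stale policy on a vhost...
rabbitmqctl clear_policy -p /glance HA
# ...or drop the now-unused vhosts outright:
rabbitmqctl delete_vhost /glance
rabbitmqctl delete_vhost /keystone
```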
16:59 <opendevreview> Merged openstack/ansible-role-systemd_networkd master: Add routing policy management for interfaces  https://review.opendev.org/c/openstack/ansible-role-systemd_networkd/+/937624
17:00 <noonedeadpunk> jrosser: should it still be WIP: https://review.opendev.org/c/openstack/ansible-config_template/+/938513 ?
17:01 <opendevreview> Dmitriy Rabotyagov proposed openstack/ansible-role-httpd master: Initial commit to the role  https://review.opendev.org/c/openstack/ansible-role-httpd/+/938245
17:10 <opendevreview> Merged openstack/ansible-role-systemd_networkd master: Do not try to configure resolved when it's not available  https://review.opendev.org/c/openstack/ansible-role-systemd_networkd/+/938512
17:14 <jrosser> noonedeadpunk: no, it's not WIP any more - I think I didn't understand the test failures at first, but it's passing and there are comments where I've changed the tests
17:26 <noonedeadpunk> yeah, I also think it's fine
17:56 <noonedeadpunk> just in case - I went on and proposed the 2023.1 migration to unmaintained: https://review.opendev.org/c/openstack/releases/+/938952
