opendevreview | Merged openstack/openstack-ansible stable/2024.1: Bump SHAs for 2024.1 https://review.opendev.org/c/openstack/openstack-ansible/+/938919 | 00:19 |
---|---|---|
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_octavia master: Switch from focal to jammy based amphora image for CI testing https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/939697 | 09:01 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-haproxy_server master: Make sysctl configuration path configurable Defaults to /etc/sysctl.conf to retain current behavior https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/939601 | 09:04 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-haproxy_server master: Remove extra whitespace delimiter to satisfy ansible-lint https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/939606 | 09:05 |
jrosser | manila role is pretty unhappy | 09:13 |
jrosser | there's something wrong with setting up the ceph apt repo | 09:13 |
noonedeadpunk | yeah | 09:50 |
noonedeadpunk | with ganesha | 09:50 |
noonedeadpunk | and it's been a while frankly speaking | 09:50 |
noonedeadpunk | just never had time to take a look | 09:50 |
noonedeadpunk | jrosser: it seems we really need to review what we're doing with the PKI role. There was an ML thread yesterday, and I've realized that the certs we're issuing today are likely only valid for 1y | 10:30 |
noonedeadpunk | we've discussed that at the PTG I guess, though I didn't know we're actually on a timer now with the topic | 10:31 |
noonedeadpunk | https://docs.ansible.com/ansible/latest/collections/community/crypto/x509_certificate_module.html#parameter-entrust_not_after | 10:32 |
noonedeadpunk | and also seems like module now supports acme as well? | 10:32 |
noonedeadpunk | but in general we simply don't provide any value there: https://opendev.org/openstack/ansible-role-pki/src/branch/master/tasks/standalone/create_cert.yml#L54-L69 | 10:33 |
jrosser | noonedeadpunk: here https://github.com/openstack/openstack-ansible/blob/master/inventory/group_vars/all/ssl.yml#L48 | 10:37 |
jrosser | the default should be 3650 days for the CA and intermediate | 10:39 |
jrosser | but are you concerned about server certs or the CA? | 10:40 |
jrosser | and it's not entrust, it's ownca https://docs.ansible.com/ansible/latest/collections/community/crypto/x509_certificate_module.html#parameter-ownca_not_after | 10:42 |
jrosser | having said all this, it would be great to have a "rotate server certs" playbook | 10:44 |
noonedeadpunk | I guess I'm more worried about certs | 10:47 |
noonedeadpunk | as then pretty much one should rotate them once a year, ie for computes and octavia? | 10:47 |
jrosser | so as far as i can see the default is 3650 days from that module | 10:47 |
noonedeadpunk | for CA, not certs? | 10:47 |
jrosser | where is this shorter default? | 10:48 |
noonedeadpunk | ah, wrong parameter.... | 10:49 |
noonedeadpunk | so it's applied based on the "provider" basically | 10:49 |
noonedeadpunk | oh | 10:49 |
noonedeadpunk | *ok | 10:49 |
jrosser | but regardless i think some work on rotating these would be really helpful | 10:50 |
noonedeadpunk | Yeah, I'd try to scope that for 2025.1 | 10:50 |
noonedeadpunk | ok, thanks for pointing to the correct one | 10:50 |
noonedeadpunk | as I got terrified a bit | 10:50 |
jrosser | there are two very distinct things, reissuing the server certs with new private key from scratch, from the existing CA | 10:51 |
jrosser | that should be straightforward and perhaps just needs some tags adding | 10:51 |
jrosser | and then there is updating the expiry time on the existing CA/Intermediate, without changing the private key | 10:52 |
noonedeadpunk | pretty much what would `-e pki_regen_cert=true` do? | 10:52 |
jrosser | this is relatively easy, but not particularly rigorous | 10:52 |
jrosser | then there is issuing a new intermediate, or new CA cert entirely and re-issuing all the server certs signed by that new one | 10:53 |
jrosser | ^ without breaking everything horribly :) | 10:53 |
noonedeadpunk | that is the tricky one, yes | 10:53 |
jrosser | one thing that i worry about is so many places specify a "ca_file", and that is really unhelpful for rotation | 10:54 |
jrosser | because ideally there is a period where the new and old CA are both trusted | 10:55 |
noonedeadpunk | yeah, true | 10:57 |
jrosser | something to test is if that ca_file can be a cert bundle with more than one CA inside | 10:58 |
jrosser | and also we have tricky things like rotating certs on the novnc proxy and libvirt | 10:59 |
noonedeadpunk | there was some kind of mitigation you've shown me earlier | 10:59 |
jrosser | but i remember that mnaser was looking at that before | 10:59 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_horizon stable/2024.2: Add retries to u_c fetch https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/939709 | 12:01 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible master: Add additional commented RabbitMQ policy to manage segment sizes https://review.opendev.org/c/openstack/openstack-ansible/+/939713 | 13:28 |
opendevreview | Merged openstack/ansible-role-systemd_networkd master: Only restart non-networkd services when the role is configured to install them https://review.opendev.org/c/openstack/ansible-role-systemd_networkd/+/939640 | 13:45 |
noonedeadpunk | I was also thinking if we should adjust default for stream queues... | 14:30 |
noonedeadpunk | I can recall also looking into that | 14:31 |
jrosser | i think that this is critical to merge now https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938270 | 14:33 |
jrosser | because we have merged this https://review.opendev.org/c/openstack/openstack-ansible/+/938275 upgrades are broken elsewhere for now | 14:33 |
noonedeadpunk | damiandabrowski: NeilHanlon can you please review these once you have some time? ^ | 14:40 |
opendevreview | Merged openstack/openstack-ansible master: Molecule to respect depends-on for test-requirements update https://review.opendev.org/c/openstack/openstack-ansible/+/939290 | 14:41 |
damiandabrowski | sure thing! will do that in a moment | 14:41 |
opendevreview | Merged openstack/openstack-ansible master: Add noble to molecule testing https://review.opendev.org/c/openstack/openstack-ansible/+/939306 | 14:46 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-repo_server master: Use FQCN for modules https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938272 | 14:48 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:01 |
opendevmeet | Meeting started Tue Jan 21 15:01:43 2025 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:01 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:01 |
noonedeadpunk | #topic rollcall | 15:01 |
noonedeadpunk | o/ | 15:01 |
NeilHanlon | o/ | 15:02 |
NeilHanlon | noonedeadpunk: sure thing re: those reviews | 15:02 |
NeilHanlon | NVM damiandabrowski did them :D | 15:02 |
noonedeadpunk | sorry just pinged couple of ppl to be sure :) | 15:03 |
NeilHanlon | hehe no worries | 15:04 |
noonedeadpunk | #topic office hours | 15:05 |
damiandabrowski | hi! | 15:06 |
noonedeadpunk | ok, so first of all, TC has merged the patch that confirms HTTP repo as recognized one by OpenStack: https://review.opendev.org/c/openstack/governance/+/935694 | 15:06 |
jrosser | o/ hello | 15:07 |
noonedeadpunk | retirement of qdrouterd is still pending though: | 15:07 |
noonedeadpunk | #link https://review.opendev.org/c/openstack/governance/+/938193 | 15:07 |
noonedeadpunk | from our side everything merged except manila | 15:07 |
noonedeadpunk | which is broken on ganesha setup | 15:08 |
noonedeadpunk | so potentially some love is needed there | 15:08 |
jrosser | we are going to get CI breakage when qdrouterd is retired | 15:10 |
jrosser | should we already start removing it from the stable branches? | 15:11 |
noonedeadpunk | doh, yes, we should.... | 15:13 |
noonedeadpunk | very-very good point | 15:13 |
jrosser | this is just from required-projects isn't it, nothing more than that | 15:15 |
noonedeadpunk | yes | 15:15 |
noonedeadpunk | though let's not backport this removal to 2023.1 https://opendev.org/openstack/openstack-ansible-tests/src/branch/master/zuul.d/jobs.yaml#L69 | 15:16 |
noonedeadpunk | as otherwise I'd have to update https://review.opendev.org/c/openstack/releases/+/938952 again, and it's long overdue | 15:16 |
* noonedeadpunk thinks unmaintained policy is huge overcomplication | 15:17 | |
jrosser | i am hoping that 2023.1 might be the first branch moved to unmaintained that is not broken so badly we can't fix it | 15:18 |
noonedeadpunk | but also - once roles are switched to unmaintained - I'll propose shas update for integrated repo | 15:18 |
noonedeadpunk | I think Zed was okeyish? | 15:18 |
jrosser | ish | 15:18 |
noonedeadpunk | yeah | 15:19 |
jrosser | this sort of ish https://review.opendev.org/c/openstack/openstack-ansible/+/932921?tab=change-view-tab-header-zuul-results-summary | 15:19 |
noonedeadpunk | hopefully it's a marker of effort put into roles stability | 15:19 |
noonedeadpunk | hm | 15:20 |
noonedeadpunk | I wonder what went wrong there | 15:20 |
noonedeadpunk | as we don't do much in post_jobs | 15:21 |
jrosser | it is very sad that our branches that transition to unmaintained are basically wrecked | 15:21 |
noonedeadpunk | I guess it's matter of capacity to maintain them | 15:21 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible stable/2024.2: Remove ansible-role-qdrouterd from zuul required-projects https://review.opendev.org/c/openstack/openstack-ansible/+/939723 | 15:22 |
jrosser | well as i've said a few times we basically kept things working for some pretty long time | 15:22 |
jrosser | but i put really a massive effort into trying to fix the transitioned unmaintained branches, at the expense of working on new stuff | 15:23 |
jrosser | and i pretty much failed on them all | 15:23 |
noonedeadpunk | yes, true | 15:23 |
noonedeadpunk | I'm frankly not sure what to suggest here. We can go EOL for these branches 6 months after EOM | 15:24 |
noonedeadpunk | but it could also be un-ideal | 15:24 |
noonedeadpunk | so basically drop all to Zed right away | 15:25 |
jrosser | and the timing is so bad too | 15:25 |
jrosser | as we have a huge fight with the just-made-unmaintained branch right in the middle of the current cycle | 15:25 |
jrosser | and distract from getting the new release sorted out in time | 15:25 |
jrosser | anyway, <rant> | 15:26 |
noonedeadpunk | so, we have a choice kinda. Previously we were very distracted as unmaintained timing was right during our preparation for the release | 15:26 |
jrosser | well like i say i am hopeful that 2023.1 will be much better | 15:26 |
noonedeadpunk | My understanding is that we have a trailing release, so we can go to unmaintained 1 month after our release or so | 15:26 |
jrosser | there is support for automatically handling both stable and unmaintained branch names in the scripts now | 15:27 |
noonedeadpunk | so basically that's the activity at the very beginning of "our" cycle | 15:27 |
jrosser | yeah this is true | 15:28 |
jrosser | perhaps we just need to be more strict about getting that transition done straight after our release | 15:28 |
noonedeadpunk | and usually that's the least loaded time | 15:28 |
noonedeadpunk | yeah, and that's pretty much on me | 15:30 |
noonedeadpunk | (and I'm failing with that from time to time) | 15:30 |
noonedeadpunk | but indeed - will try to be just more organized | 15:32 |
jrosser | so molecule and tests repo? | 15:33 |
jrosser | we had some good progress and some bugs there this week | 15:33 |
noonedeadpunk | yes. I think the only uncovered part is plugins, where tests repo runs | 15:33 |
noonedeadpunk | at least according to this: https://codesearch.openstack.org/?q=openstack-ansible-role-jobs&i=nope&literal=nope&files=&excludeFiles=&repos= | 15:34 |
noonedeadpunk | so given your work with LXC containers role - it should be doable now | 15:35 |
jrosser | yes - i have got a bit distracted from that this week | 15:36 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible unmaintained/zed: [CI] Remove periodic jobs from unmaintained branch https://review.opendev.org/c/openstack/openstack-ansible/+/932921 | 15:36 |
jrosser | but i think that now everything should be in place to pretty quickly convert the old tests to molecule | 15:36 |
noonedeadpunk | I frankly just didn't have time as I was summoned to finalize an ovn-bgp-agent setup | 15:36 |
* noonedeadpunk not very good in networking | 15:37 | |
NeilHanlon | heh | 15:37 |
jrosser | i need to get back on with the lxc_container_create one | 15:37 |
* jrosser also been distracted from that | 15:38 | |
noonedeadpunk | but it seems we're 99% done with functional tests at this point (given the whole scope over years) | 15:38 |
jrosser | there is also adding coverage for the other things in the plugins repo | 15:39 |
noonedeadpunk | and not only there | 15:39 |
jrosser | which could mean we don't necessarily need to run the full suite of integrated repo tests if we make a good job of that | 15:39 |
noonedeadpunk | like rabbitmq and galera are also good candidates I guess | 15:39 |
noonedeadpunk | btw last week I also had a look into a topic we've discussed a long time ago - simplifying the bootstrap process | 15:46 |
noonedeadpunk | or better say - moving complexity around :D | 15:46 |
noonedeadpunk | #link https://review.opendev.org/c/openstack/openstack-ansible/+/939151/ | 15:46 |
jrosser | ah yes i did see that | 15:46 |
noonedeadpunk | I can't recall why I did set a WIP there though | 15:46 |
jrosser | is this OK for upgrades? | 15:47 |
noonedeadpunk | The part I liked more is the next patch, which replaces set_fact in a loop with some jinja | 15:47 |
noonedeadpunk | a downside is being way less verbose | 15:48 |
noonedeadpunk | but user-facing stuff is way cleaner and readable, imo | 15:49 |
noonedeadpunk | yeah, it does work for upgrades now | 15:49 |
noonedeadpunk | that was a nasty part to do for upgrades: https://review.opendev.org/c/openstack/openstack-ansible/+/939151/21/scripts/gate-check-commit.sh | 15:49 |
noonedeadpunk | I'm really not sure I like it. But I can't come up with better thing to trigger role pull from zuul | 15:50 |
noonedeadpunk | once pre-task have finished and we're in gate-check-commit | 15:50 |
jrosser | so we don't actually test the user facing script? | 15:50 |
noonedeadpunk | we do with shastest only | 15:51 |
noonedeadpunk | https://zuul.opendev.org/t/openstack/build/9a806d6a192a424d816fe01288c15850 | 15:51 |
noonedeadpunk | but it's same today | 15:51 |
noonedeadpunk | or well, we test a very specific unrealistic version of it | 15:51 |
jrosser | oh well actually we do use it | 15:53 |
jrosser | with a user-role / collection requirements setup by the pre- job | 15:53 |
noonedeadpunk | ah | 15:54 |
noonedeadpunk | yes, we prepare user-requirements, true | 15:54 |
opendevreview | Merged openstack/openstack-ansible-apt_package_pinning master: Use OSA_TEST_REQUIREMENTS_FILE for molecule job https://review.opendev.org/c/openstack/openstack-ansible-apt_package_pinning/+/939299 | 15:54 |
jrosser | i think thats what i'd missed | 15:54 |
jrosser | it's not that we duplicate the functionality into the zuul pre job | 15:54 |
noonedeadpunk | but it's still like... not what every user will do | 15:54 |
noonedeadpunk | no-no | 15:54 |
jrosser | it's more preparing the input for the user overrides | 15:54 |
noonedeadpunk | I've just moved zuul-specific things there | 15:55 |
jrosser | yes that makes sense | 15:55 |
opendevreview | Merged openstack/ansible-role-systemd_service master: Use OSA_TEST_REQUIREMENTS_FILE for molecule job https://review.opendev.org/c/openstack/ansible-role-systemd_service/+/939292 | 15:55 |
* noonedeadpunk also having some obvious memory issues | 15:55 | |
opendevreview | Merged openstack/ansible-config_template master: Use OSA_TEST_REQUIREMENTS_FILE for molecule job https://review.opendev.org/c/openstack/ansible-config_template/+/939302 | 15:55 |
jrosser | and previously it could swap in the zuul repos sort of in-line | 15:55 |
noonedeadpunk | I kind of can recall adding some molecule jobs to the integrated repo.... | 15:56 |
noonedeadpunk | I never pushed that? | 15:56 |
jrosser | it would be nice if that playbook kept a copy of the requirements files for the two branches in an upgrade | 15:57 |
noonedeadpunk | regarding test of user-role-requirements..... | 15:57 |
noonedeadpunk | so I'm not sure if it can use zuul stuff for N-1 | 15:57 |
noonedeadpunk | or better say - I don't know how to do that | 15:57 |
opendevreview | Merged openstack/ansible-role-systemd_networkd master: Use OSA_TEST_REQUIREMENTS_FILE for molecule job https://review.opendev.org/c/openstack/ansible-role-systemd_networkd/+/939304 | 15:58 |
noonedeadpunk | ok, I can't find the patch for testing user-role-requirements locally either | 16:00 |
noonedeadpunk | Maybe I dreamt of it... | 16:00 |
noonedeadpunk | anyway | 16:00 |
noonedeadpunk | #endmeeting | 16:00 |
opendevmeet | Meeting ended Tue Jan 21 16:00:54 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2025/openstack_ansible_meeting.2025-01-21-15.01.html | 16:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2025/openstack_ansible_meeting.2025-01-21-15.01.txt | 16:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2025/openstack_ansible_meeting.2025-01-21-15.01.log.html | 16:00 |
noonedeadpunk | ah, it was this one https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/938980 | 16:02 |
noonedeadpunk | or not... | 16:02 |
noonedeadpunk | probably it was just a dream after all | 16:04 |
noonedeadpunk | jrosser: I think that pre-osa-requirements.yml never runs twice | 16:07 |
noonedeadpunk | so on upgrade it's just skipped | 16:08 |
noonedeadpunk | as `when: - "'upgrade' not in action"` | 16:08 |
noonedeadpunk | so on N-1 just regular a-r-r are used | 16:08 |
noonedeadpunk | and then then's why this exist: https://review.opendev.org/c/openstack/openstack-ansible/+/939151/21/scripts/gate-check-commit.sh | 16:08 |
noonedeadpunk | *that's | 16:09 |
noonedeadpunk | it's already after N-1 is done, and checkout to N is performed | 16:09 |
spotz[m] | You're making me work this morning noonedeadpunk :) | 16:34 |
noonedeadpunk | sorry for that :D | 16:38 |
noonedeadpunk | I had busy weekends as you might see | 16:38 |
noonedeadpunk | jrosser: maybe you know... how in the world we make galera role to work in CI? | 16:44 |
jrosser | how do you mean? :) | 16:44 |
noonedeadpunk | in terms - I was pinged internally, that on ubuntu there's no `admin` user created | 16:44 |
jrosser | molecule? | 16:44 |
noonedeadpunk | molecule is smth I'm gonna try | 16:44 |
noonedeadpunk | for now - spawned 3 ubuntu 24.04 vms | 16:45 |
noonedeadpunk | and ansible 2.18.1.... | 16:45 |
jrosser | so this is the role being used standalone? | 16:45 |
noonedeadpunk | yeah | 16:45 |
noonedeadpunk | so the user should be created with this | 16:45 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/tasks/galera_install_apt.yml#L76-L83 | 16:45 |
noonedeadpunk | oh, I see | 16:46 |
jrosser | that code is 9 years old | 16:48 |
jrosser | i wonder if that really is correct any more | 16:48 |
jrosser | because we should have root/local socket out of the box? | 16:48 |
noonedeadpunk | I'm wondering if https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/vars/debian.yml#L67 is right | 16:49 |
noonedeadpunk | as it should be `mariadb-server` I believe | 16:49 |
noonedeadpunk | but how the heck it works in CI | 16:49 |
spotz[m] | Ok first run through done, jrosser you're good at catching typos so let me know if I missed something:) | 16:49 |
jrosser | so that's why i wonder really if that code is doing anything useful | 16:50 |
noonedeadpunk | spotz[m]: I've used a linter this time in vscode :p | 16:50 |
noonedeadpunk | but how it works in ci | 16:50 |
jrosser | because it comes from the time that we (mis)used the root user for everything | 16:50 |
jrosser | if the default install gives root/no-password/local-socket access | 16:50 |
jrosser | then we can come along and add the admin user with that | 16:51 |
noonedeadpunk | yeah. so it's totally a good idea to refactor... | 16:51 |
jrosser | ^ this is guesswork, but i'm just wondering if, when we changed things to not mess with the root user, removing this was forgotten | 16:51 |
jrosser | spotz[m]: which patch is this? | 16:52 |
spotz[m] | https://review.opendev.org/c/openstack/openstack-ansible/+/939609 | 16:52 |
jrosser | cool thanks i will take a look | 16:52 |
noonedeadpunk | jrosser: so the thing is, that we never create an admin user outside of this debconf | 16:52 |
spotz[m] | I spent hours yesterday tracking down a YAML issue, I can not be trusted:) | 16:52 |
jrosser | noonedeadpunk: what is this for? https://github.com/openstack/openstack-ansible-galera_server/blob/f773a8fb23a015969e886934649754a81d561601/vars/main.yml#L31 | 16:53 |
noonedeadpunk | huh | 16:53 |
jrosser | from just 2 mins trying to remember the code, this is what i expect to be making the users | 16:54 |
jrosser | i didn't look much deeper than that though | 16:54 |
noonedeadpunk | how I haven't found it | 16:54 |
noonedeadpunk | then, I think I do have news... | 16:55 |
jrosser | https://github.com/search?q=repo%3Aopenstack%2Fopenstack-ansible-galera_server%20galera_root_user&type=code | 16:55 |
noonedeadpunk | you can't set galera_serial=100% | 16:55 |
opendevreview | Merged openstack/openstack-ansible-repo_server master: Use standalone httpd role https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938270 | 16:55 |
noonedeadpunk | as galera_all[1:] will never pass this handler: https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/tasks/galera_server_main.yml#L103-L104 | 16:55 |
jrosser | and super confusingly there is also https://github.com/search?q=repo%3Aopenstack%2Fopenstack-ansible-galera_server%20galera_root_user&type=code | 16:55 |
jrosser | that seems to duplicate a bunch of what is in those ansible vars | 16:55 |
noonedeadpunk | yeah, it's only for EL | 16:56 |
noonedeadpunk | so yes, /o\ | 16:56 |
jrosser | well only EL or not it's still duplicate? | 16:56 |
noonedeadpunk | yeah | 16:57 |
jrosser | so this maybe does come back to the question of what happens in CI | 16:57 |
jrosser | because in the infra jobs we do run a 3 node database cluster | 16:57 |
noonedeadpunk | in CI it works just because of this I think now https://opendev.org/openstack/openstack-ansible-plugins/src/branch/master/playbooks/galera_server.yml#L39 | 16:58 |
noonedeadpunk | so these flush_handlers do not run against [1:] | 16:58 |
noonedeadpunk | I pretty much need to come up with molecule test there.... | 17:00 |
jrosser | i'm just trying to wrap my brain around why serial 100% does not work | 17:02 |
noonedeadpunk | so, I can explain :) | 17:04 |
noonedeadpunk | galera_server_setup.yml is executed right after flush_handlers | 17:05 |
noonedeadpunk | where admin user is created | 17:05 |
noonedeadpunk | but, admin user is also used as SST user | 17:05 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/defaults/main.yml#L160 | 17:05 |
noonedeadpunk | if you run 100% - they are restarted, but they can not fetch data from the "master" as there's no user existing at the point of their restart | 17:06 |
noonedeadpunk | so it's a race condition | 17:07 |
noonedeadpunk | realistically - you in fact don't want to run 100% | 17:08 |
noonedeadpunk | ever | 17:08 |
noonedeadpunk | but it shouldn't fail at least.... | 17:08 |
jrosser | errrr https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/tasks/galera_server_main.yml#L108 ? | 17:08 |
noonedeadpunk | but L103 - executed for all | 17:08 |
noonedeadpunk | and they already have in config this | 17:08 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/templates/cluster.cnf.j2#L40 | 17:09 |
noonedeadpunk | so at L106 only galera_all[0] survives and comes to it, while rest are marked as FAILED during restart retries | 17:10 |
jrosser | well i guess that the code is pretty unobvious | 17:10 |
jrosser | it is difficult to read and understand what is happening like this | 17:10 |
noonedeadpunk | some refactoring totally won't hurt | 17:11 |
jrosser | it is however very obvious to have a task which immediately after install creates the users needed for the cluster to function, on all the nodes | 17:11 |
noonedeadpunk | and other thing | 17:11 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-plugins/src/branch/master/playbooks/galera_server.yml#L66-L67 | 17:11 |
noonedeadpunk | that is unobvious as well | 17:11 |
noonedeadpunk | but if you set `galera_install_client: true` - server setup will fail as well | 17:12 |
noonedeadpunk | at earlier step though | 17:12 |
jrosser | so i was just looking at this root business | 17:12 |
jrosser | and it might be that we have enough releases now where the admin user is created and used properly that we can strip out all the special cases | 17:12 |
noonedeadpunk | as placing my.cnf with a user that does not exist - doesn't result in anything good | 17:12 |
noonedeadpunk | and handling this only on playbook level not perfect kinda | 17:13 |
jrosser | there was a time when this was a mess because we had to handle an upgrade where the admin user was not there | 17:13 |
jrosser | so perhaps now we can be more strict with the root user? | 17:14 |
noonedeadpunk | I think we should be able to, yes | 17:14 |
jrosser | that would reduce the complexity | 17:14 |
noonedeadpunk | but if we'd think about isolated usecase - still some things to improve inside role logic | 17:15 |
noonedeadpunk | as that race condition about root/non-root in playbook still will be true | 17:15 |
jrosser | and it would be very nice for increased obviousness to create the users earlier | 17:15 |
noonedeadpunk | yeah | 17:15 |
jrosser | well, there are more tools now, like throttle: | 17:15 |
jrosser | we didn't have that before | 17:15 |
jrosser | so you can force the equivalent of serial: 1 for a task even if the play does not specify that | 17:16 |
noonedeadpunk | I'm not sure it will be helpful there? or it might be in an actual handler | 17:17 |
noonedeadpunk | yeah. that can help | 17:17 |
noonedeadpunk | good point | 17:17 |
jrosser | it's just a matter of opinion, but perhaps handlers are overused in the role | 17:17 |
noonedeadpunk | I guess I'd need to come up with some basic role testing before all that... | 17:18 |
jrosser | yes i think that would be great to have a molecule for the role standalone | 17:18 |
jrosser | we are using it like that for sure outside OSA | 17:18 |
noonedeadpunk | me too | 17:18 |
noonedeadpunk | we also finally need to figure out a way of publishing in galaxy I guess | 17:19 |
jrosser | in fact can probably also test upgrades in molecule pretty easily | 17:19 |
noonedeadpunk | at least with molecule I feel a bit more confident about that | 17:19 |
jrosser | not upgrades between branches maybe but certainly between versions on mariadb | 17:20 |
noonedeadpunk | well, we don't really know the previous version in molecule for that | 17:20 |
noonedeadpunk | we can for rabbitmq though, I assume | 17:20 |
noonedeadpunk | as there's mapping available for that https://opendev.org/openstack/openstack-ansible-rabbitmq_server/src/branch/master/vars/main.yml#L26-L36 | 17:21 |
noonedeadpunk | potentially a good thing to add to mariadb | 17:21 |
jrosser | answer seems to be that we could do some good cleanup | 17:21 |
noonedeadpunk | yeah | 17:22 |
noonedeadpunk | indeed | 17:22 |
noonedeadpunk | I still owe a flag to re-bootstrap rabbitmq though :( | 17:22 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server master: Extend example playbook to contain valid values https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/939740 | 17:47 |
noonedeadpunk | actually having functional tests in repos is handy as no need to fully re-invent testing... | 18:26 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server master: Add molecule testing for the role https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/939751 | 18:56 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-pki master: Install setuptools for noble https://review.opendev.org/c/openstack/ansible-role-pki/+/939752 | 19:07 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-pki master: Use OSA_TEST_REQUIREMENTS_FILE for molecule job https://review.opendev.org/c/openstack/ansible-role-pki/+/939301 | 19:07 |
opendevreview | Merged openstack/openstack-ansible master: Return upgrade jobs back to voting https://review.opendev.org/c/openstack/openstack-ansible/+/939307 | 19:55 |
jrosser | there's a fairly repeatable error on here https://review.opendev.org/c/openstack/ansible-role-systemd_mount/+/939303 | 20:13 |
opendevreview | Merged openstack/openstack-ansible-repo_server master: Use FQCN for modules https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/938272 | 22:16 |
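
A note on the `throttle:` idea raised around 17:15 in the log: the following is a minimal sketch, assuming a hypothetical `mariadb` service name and reusing the `galera_all` group mentioned in the discussion. It is not taken from the galera_server role; it only illustrates how one task can be limited to a single host at a time even when the play itself does not set `serial`.

```yaml
# Hypothetical play: the task names and the service name are illustrative
# assumptions, not the actual content of the galera_server role.
- name: Restart MariaDB without a play-level serial
  hosts: galera_all
  tasks:
    - name: Restart mariadb on one host at a time
      ansible.builtin.service:
        name: mariadb
        state: restarted
      # Only one host may run this task at any given moment,
      # even though the play runs against all hosts in parallel.
      throttle: 1
```

Unlike a play-level `serial: 1`, `throttle: 1` only affects the task it is set on; the other tasks in the play still run across all hosts in parallel.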
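
On the certificate lifetime question discussed between 10:30 and 10:49: the following is a minimal sketch, assuming hypothetical file paths under `/etc/pki/example`, of signing a server certificate with the `ownca` provider of `community.crypto.x509_certificate` and an explicit `ownca_not_after`. It is an illustration rather than the ansible-role-pki task; as noted in the log, which `*_not_after` parameter applies depends on the provider in use.

```yaml
# Hypothetical play: all paths are placeholders chosen for illustration.
- name: Issue a server certificate with an explicit lifetime
  hosts: localhost
  tasks:
    - name: Sign the CSR with our own CA
      community.crypto.x509_certificate:
        path: /etc/pki/example/server.crt
        csr_path: /etc/pki/example/server.csr
        provider: ownca
        ownca_path: /etc/pki/example/ca.crt
        ownca_privatekey_path: /etc/pki/example/ca.key
        # Relative timestamp: the certificate expires 365 days after signing.
        # Setting it explicitly avoids relying on the module default.
        ownca_not_after: "+365d"
```

Pinning the lifetime per certificate like this would make a future "rotate server certs" playbook easier to reason about, since the expected expiry no longer depends on a default.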