Tuesday, 2023-11-07

opendevreviewJimmy McCrory proposed openstack/openstack-ansible-galera_server master: Include CA cert in client my.cnf  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/90026606:21
jrossernixbuilder: is that only with firefox? other browsers are OK?08:45
opendevreviewNiklas Schwarz proposed openstack/openstack-ansible-rabbitmq_server master: Add ability to add custom configuration for RabbitMQ  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/90002308:50
damiandabrowskiis there something wrong with CI? I spotted a lot of POST_FAILURES yesterday09:23
damiandabrowskihttps://zuul.opendev.org/t/openstack/builds?project=%09openstack%2Fopenstack-ansible-os_nova&result=POST_FAILURE&skip=009:23
jrosseri think there might be failure in rackspace object storage for log uploads09:38
damiandabrowskiahh thanks09:42
nixbuilderjrosser: No... other browsers fail as well.  But I think I found a workaround in just turning off SSL on horizon.10:21
jrossernixbuilder: what kind of build is that? can you reproduce it with an all-in-one?10:27
nixbuilderThis is our production build... AIO works fine.  In our production environment I have three infra, five compute and two haproxy servers.  After install I will ultimately have ~15 or so compute servers.10:32
jrosserif there is a difference between production and AIO then that could point to a configuration error10:44
jrosseri am guessing that this is an internal system rather than internet facing if you're trying to connect to 10.255.60.2910:44
jrosserfor my internet facing endpoints i use some of the online SSL checkers to validate the https setup10:45
anskiynixbuilder: you can try issuing curl/openssl s_client on it: could get less vague error message10:46
nixbuilderjrosser: This is our second 'new' cloud. But yes, both clouds are all internal and have no access to the outside world as they sit behind the corporate firewalls.10:46
jrosserwhere did the SSL certificate come from?10:47
jrosserlike anskiy says you can do some quite useful debugging with CLI tools against the horizon service10:47
nixbuilderjrosser: I assume the SSL certificates were the self-signed certificates generated by the OSA scripts.10:48
nixbuilderanskiy: I did do some wget/curl debugging as well as openssl. But then decided to eliminate https just like our old system.10:49
jrosseryou could try this https://testssl.sh/10:50
noonedeadpunkI kinda wonder if that's also related to some proxy settings potentially?10:51
noonedeadpunkthat would explain why CLI works, as it doesn't go through proxy10:51
jrosserthat depends on environment vars10:51
noonedeadpunkwell, true10:52
noonedeadpunkbut usually we suggest adding internal network to no_proxy in docs iirc10:52
noonedeadpunkand then there's a question - does the CLI from your local machine also work, or were you referring only to the utility container CLI?10:53
anskiynixbuilder: you've disabled https for horizon right now, if I understood you correctly?10:55
nixbuilderI did do a packet capture on the infra node and basically it said I was getting a 400 Bad Request error... wireshark also noted that it saw unencrypted HTTP traffic over HTTPS. That is when I decided to turn off SSL on install and re-install.10:57
nixbuilderanskiy: Yes, I have disabled HTTPS on horizon and am in the process of re-installing from scratch.10:57
nixbuildernoonedeadpunk: The cli was on the infra node.10:58
nixbuilderNo containers... all bare metal10:58
anskiywell, seeing non-https traffic in what's supposed to be https connection explains the error, so it's either, like noonedeadpunk suggested: some problem with proxy on the machine where you're running firefox, or something wrong with haproxy's config10:59
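
A minimal sketch of the kind of check discussed above: it attempts a TLS handshake against an endpoint and reports whether the handshake completes. The function name `speaks_tls` is hypothetical, and certificate verification is switched off on the assumption that the deployment uses self-signed certificates, as mentioned in the conversation. A `False` result only says that whatever answered did not complete a TLS handshake; it does not say whether the cause is haproxy, the backend, or an intermediate proxy.

```python
import socket
import ssl


def speaks_tls(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TLS handshake with host:port completes.

    Verification is disabled because the sketch assumes self-signed
    certificates; it tests only the protocol, not trust.
    """
    context = ssl.create_default_context()
    context.check_hostname = False  # must be cleared before CERT_NONE
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except (ssl.SSLError, ConnectionResetError):
        # The peer answered with something that is not TLS (for example
        # plain HTTP), or dropped the connection during the handshake.
        return False


# Example call (address taken from the discussion, port 443 is an assumption):
# print(speaks_tls("10.255.60.29", 443))
```

Running it once from the machine with the browser and once from an infra node could help separate a client-side proxy problem from a load balancer one.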
opendevreviewMerged openstack/openstack-ansible-os_ironic stable/2023.1: Use common value for inspector callback URL  https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/90008010:59
nixbuilderanskiy: I suspected haproxy misconfiguration but since I don't have any experience with haproxy as our existing production system didn't use it, I decided to turn off SSL.11:01
nixbuilderanskiy: Our existing production system does not use HTTPS on horizon.11:02
noonedeadpunkI also wonder if that could be some kind of HSTS issue as well....11:17
jrossereven so - a full metal deploy is an interesting thing as not many people do that11:21
* jrosser can't remember if we actually enable horizon in CI for that11:22
noonedeadpunkwe don't generally11:23
jrossernixbuilder: if you run into problems like these please do ask, if there is a deployment bug we should try to fix it11:23
anskiysome time ago I was trying to disable horizon on is_metal, and it didn't work :)11:23
jrosserotherwise we end up with a kind of FUD situation with "metal deploys don't work for https"11:23
anskiyit was still deployed as part of the shared-infra, IIRC11:24
noonedeadpunkyeah, in case it's in shared-infra - it will get deployed. But in CI we don't use the shared-infra definition11:37
nixbuilderALL: Thanks for the help.  If I run into problems I will send up smoke signals :-)11:49
opendevreviewMerged openstack/openstack-ansible-os_ceilometer master: Enable Ceilometer resource cache  https://review.opendev.org/c/openstack/openstack-ansible-os_ceilometer/+/88803212:11
NeilHanlono/ mornin' folks13:11
noonedeadpunk\o/13:33
NeilHanloni'm glad i looked at my calendar.. forgot about daylight savings..14:46
NeilHanlonsee y'all in 15 ;) 14:46
* noonedeadpunk hates daylight savings14:49
mgariepylol.14:54
mgariepywe should all be in UTC for everything.14:54
noonedeadpunk#startmeeting openstack_ansible_meeting15:01
opendevmeetMeeting started Tue Nov  7 15:01:00 2023 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:01
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:01
noonedeadpunk#topic roll call15:01
noonedeadpunko/15:01
NeilHanlono/ heya15:01
mgariepyhey15:01
* NeilHanlon resists urge to use other meetbot's commands15:01
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Set the default domain for the role_assignment  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/90032915:04
noonedeadpunk#topic office hours15:04
noonedeadpunkso. I have barely caught up with what happened last week15:04
jrossero/15:07
NeilHanlonheya jrosser15:07
noonedeadpunkFrom what I see we've almost landed quorum_queues, with very valid comment on the ceilometer role15:08
NeilHanlonthere was also a question last week about quorum queues for Keystone and Swift 15:08
noonedeadpunkWe also do have zun, magnum and manila roles broken15:08
noonedeadpunkyeah, I guess that's related to the comment15:09
noonedeadpunkI think I've skipped these 2 because neither keystone nor swift uses them for RPC 15:09
noonedeadpunkbut I think they indeed should be covered15:09
noonedeadpunk#action noonedeadpunk to propose quorum_queues patches for keystone and swift15:10
NeilHanlondepending on the work required, I think it'd be ok to postpone to after release, since as you said they are not used for RPC15:10
noonedeadpunkFor broken roles I had a look only at manila, and it's broken due to ceph-ansible being incompatible with ansible-core 2.1515:10
NeilHanlonalso FYI RHEL 9.3 released today. So.. Rocky 9.3 will come at some point. I'll give advanced notice15:11
anskiyNeilHanlon: yeah, instead of remembering that thing, I've managed to put a valid comment :)15:11
noonedeadpunkAnd I even proposed PR: https://github.com/ceph/ceph-ansible/pull/746615:11
NeilHanlonanskiy: thank you :D I did also put it on our last PTG etherpad so I would remember (somehow)15:12
noonedeadpunkBut I'm kinda concerned whether it will merge, and if it does - I highly doubt it would be backported by the release due date...15:12
noonedeadpunkregarding magnum and zun - they seem to both fail with DB upgrade. I haven't looked there yet. And that has also happened after updating SHAs...15:13
noonedeadpunkWould need to have a closer look what's wrong there.15:14
NeilHanlonhm15:14
noonedeadpunkin good news - distro path seems to "just work" after switching to Bobcat :) 15:16
NeilHanlon🥳15:16
noonedeadpunkoh, well, magnum is slightly different. but not less confusing15:17
jrossersomething odd is happening there15:17
jrosser`The container-infrastructure-management service for default:RegionOne exists but does not have any supported versions.`15:18
noonedeadpunkyeah15:19
noonedeadpunkbut sometimes it passes...15:19
noonedeadpunkso would need to reproduce this one15:19
jrosserthat error actually is from the SDK, so it could be trouble anywhere from SDK onward15:20
noonedeadpunkFor openstack-resources topic: trove seems good except upgrade jobs. Apparently, the role is trying to change a network that's in use and fails. Likely I've missed something, as that works nicely for octavia, which is pretty much similar: https://review.opendev.org/c/openstack/openstack-ansible-os_trove/+/89928415:21
noonedeadpunkso magnum and a standalone playbook are left to reach the minimum par for the role15:21
noonedeadpunkno progress regarding skyline so far15:32
anskiyfor some reason I've been poking at the Vagrantfile inside openstack-ansible: I've added some cache (which breaks a bit at simultaneous package installation in containers) and roles/collections passthrough: https://github.com/dbalagansky/openstack-ansible/commit/b9168a5d022a5321c7bca4b7f5bbb470eccbaa7c15:33
noonedeadpunkanskiy: to be frank - we haven't maintained that for quite a while now15:38
noonedeadpunkso I can hardly comment on that and I was kinda thinking to clean that up with functional tests repo and functional jobs from tox15:39
jrosseri think there was even the start of an effort to remove all of that15:40
noonedeadpunkand now with vagrant being BSL-licensed...15:40
NeilHanlonyeahhhh15:41
nixbuilderAnyone have any ideas where I can start looking to fix this error: "infra01 neutron-rpc-server[8037]: SQL connection failed. 10 attempts left." Keeps repeating over and over.  Plus connecting to port 9696 yet that port is open on the server.15:42
noonedeadpunkI think it's a good idea to have some easy way to spawn an AIO in some kind of VM on the localhost without much fuss. But I kinda failed to find a good option except virt-manager....15:46
anskiynoonedeadpunk: well, I was hoping to fix two things: eliminate running with patches between the host and VM, and add the ability to easily instantiate MNAIO15:51
anskiyand the last part is what I've had previously, but with some additional roles in the middle (which rendered openstack_user_config, for example)15:52
anskiyI can probably try rewriting this thing on plain ansible with `community.libvirt` collection, if this is really something that's welcome :)15:53
noonedeadpunkI actually was thinking about re-writing MNAIO for quite a while but never had a chance15:55
noonedeadpunkAnd my idea was also to have some test_openstack_user_config that would be parsed by dynamic_inventory and provided as input to some role that will spawn resources15:56
jrosseri think jamesdenton has looked also at the MNAIO15:56
noonedeadpunkyeah15:56
NeilHanloncrap forgot to bring up my thing but it's not very long or important.. I made some progress on Incus for Fedora and Enterprise Linux, and made a connection with someone who did a much better job than I at packaging it, too :) so -- will be working on that in my spare time to get it ready for Rocky 9 and friends15:57
noonedeadpunkand yeah, community.libvirt was on the radar as one "driver", while another "driver" was to use already existing openstack project (and probably leverage openstack_resources role for that purpose)15:58
noonedeadpunkNeilHanlon: oh, this is great news!15:58
NeilHanlonthere is apparently some *drama* happening with some of the dependencies of LXD, so will be interesting to see how it plays out15:58
NeilHanlon`cowsql` and `raft`, specifically15:59
noonedeadpunkI assume that incus folks should be also quite interested to provide some help with that effort15:59
NeilHanlonI assume so as well15:59
noonedeadpunkyeah, I think I've read smth about cowsql at least in one of their blogposts16:00
NeilHanlonraft apparently has had two ABI changes this year without a soname bump, which is.. ugh16:00
NeilHanlon#link https://github.com/ganto/copr-lxc416:00
NeilHanlonmostly that's for me, but, yeah. that's where I'll be looking to build some crap on EPEL with :) 16:01
NeilHanlonnotably, it also should mean we can eventually remove the dependency on my personal copr for lxc416:01
NeilHanlon(https://opendev.org/openstack/openstack-ansible-lxc_hosts/src/branch/master/vars/redhat-host.yml#L19)16:02
noonedeadpunkat some point I think I've proposed patch for that....16:02
noonedeadpunkI _think_ it was built for epel even?16:02
noonedeadpunkhttps://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2022-968b01292a16:03
noonedeadpunk#endmeeting16:03
opendevmeetMeeting ended Tue Nov  7 16:03:41 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:03
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-11-07-15.01.html16:03
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-11-07-15.01.txt16:03
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-11-07-15.01.log.html16:03
NeilHanlonhmm. maybe we are using that one for the `templates-extra` thing, then16:03
NeilHanlonthanks for running the meeting btw noonedeadpunk16:03
noonedeadpunkyeah, could be for the templates....16:04
noonedeadpunkworth revising that actually16:04
jrossernixbuilder: neutron rpc server is trying to connect to the database16:05
NeilHanlonI think the consensus when I spoke with `thm` about it was that we shouldn't build two things in the same spec file, so I think I can (probably) just introduce a distinct `lxc-templates-extra-legacy` package which provides the legacy templates. or.. something like that anyways16:05
jrosserso you need to check some things, that the database is OK, that haproxy *thinks* that the database is OK, and the database port on the internal VIP is reachable from wherever neutron-rpc-server is running16:05
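
The last of those checks (is the database port on the internal VIP reachable from where neutron-rpc-server runs) can be sketched as a plain TCP connect. The helper name `port_reachable` is hypothetical, and the VIP address in the example call is a placeholder; 3306 is the MariaDB default port. Note that a successful connect only proves that something accepted the connection: haproxy may accept on the VIP even when it considers every database backend down, so this does not replace looking at the haproxy status.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Example call (placeholder VIP, run on the host where neutron-rpc-server lives):
# print(port_reachable("172.29.236.9", 3306))
```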
NeilHanlonor maybe we just move away from the legacy templates? i don't remember what's involved with that16:06
nixbuilderjrosser: Yeah... I think the manila install may have broken mysql16:06
jrosserwell, that would be surprising :)16:07
noonedeadpunkNeilHanlon: to be frank - me neither...16:08
jrosserhaproxy status is a good place to start, using `hatop`16:08
noonedeadpunkpotentially we indeed don't need them. But iirc on ubuntu they were providing profiles for apparmor or smth like that...16:09
NeilHanlonpsh16:11
NeilHanlonsecurity16:11
NeilHanlonwho needs it16:12
NeilHanlonalthough, i did recently find a guide for openstack bobcat on centos stream 9 that had selinux profiles...16:12
noonedeadpunkwell, I'm pretty sure we're running with apparmor profiles being active16:13
noonedeadpunknever managed to deal nicely with selinux though....16:13
anskiythey don't actually have much in common :)16:13
opendevreviewMerged openstack/openstack-ansible-os_nova stable/2023.1: Always disable libvirt default network  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/90019016:15
noonedeadpunkhttps://paste.openstack.org/show/bhf0TBfoChPcdv1eaFBw/16:15
noonedeadpunkwell... they both try to restrict apps from doing weird things? :p16:15
NeilHanlonostensibly16:18
anskiynoonedeadpunk: if you put it that way, yes, but for selinux it's just a part of it: because it can operate on users/network primitives/inter-process comms/filesystem objects16:18
noonedeadpunkbut I've close to never run openstack with EL in any sort of production (except migrating 1 deployment from centos 7 to debian 10 or smth)16:18
NeilHanlonthey're both security theater :) 16:18
noonedeadpunkanskiy: yeah, but satisfying it is really... time consuming, I would say?16:19
anskiynoonedeadpunk: sure :) I was just trying to say that selinux is a bit bigger in what it could do, in comparison to apparmor16:20
anskiyand you probably wouldn't even need it for the case, when it's not a multiuser system16:21
nixbuilderjrosser: Well something surely broke mysql... it was working up until the manila install (https://paste.opendev.org/show/bY3q5tmyC2itBA4jFGI1/)16:21
jrossernixbuilder: is there anything else using the IP/ports that you'd expect mariadb to be using there (maybe manila related?)16:23
jrosseralso something has caused mariadb to restart? 16:25
jrosserthats worth understanding16:25
nixbuilderjrosser: To be honest, I didn't realize that manila would install... my bad.  I took the openstack_user_config.yml from my previous AIO, edited it, and guess I missed the manila portion and that's why the script started to install manila.16:25
jrosserremember that we run metal AIO for manila, and this works16:26
jrosser+/- ceph-ansible trouble of course16:26
jrosserbut regardless, the database should not be restarted by installing a new service16:26
nixbuilderjrosser: I don't understand what this error means... "Failed to open backend connection: -110 (Connection timed out)"... this is coming from the mariadb.16:28
jrosserthere is a database cluster, the different mariadb instances need to talk to each other16:29
damiandabrowskisorry, i couldn't attend the meeting16:29
damiandabrowskijrosser: regarding magnum and `The container-infrastructure-management service for default:RegionOne exists but does not have any supported versions.`16:29
damiandabrowskiI believe this patch will fix it: https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/89752616:29
jrosserdamiandabrowski: ahh cool - maybe we can get within the timeout now the jobs are adjusted a bit16:30
damiandabrowskifingers crossed16:30
jrossernixbuilder: this https://mariadb.com/kb/en/galera-cluster-address/16:31
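
As a rough illustration of what the cluster address on that page encodes, here is a sketch that splits a `gcomm://` value into the peers each MariaDB node needs to reach. The function name `parse_gcomm` is hypothetical, the sketch assumes IPv4 addresses or hostnames only, and 4567 is assumed as the default Galera replication port when none is given.

```python
from typing import List, Tuple


def parse_gcomm(address: str, default_port: int = 4567) -> List[Tuple[str, int]]:
    """Split a wsrep_cluster_address value into (host, port) pairs.

    Assumes IPv4 addresses or hostnames; bare IPv6 literals are not handled.
    """
    scheme, sep, rest = address.partition("://")
    if scheme != "gcomm" or not sep:
        raise ValueError(f"not a gcomm:// address: {address!r}")
    hosts_part = rest.split("?", 1)[0]  # drop any ?option=value suffix
    nodes = []
    for item in hosts_part.split(","):
        item = item.strip()
        if not item:
            continue
        host, colon, port = item.rpartition(":")
        if colon and port.isdigit():
            nodes.append((host, int(port)))
        else:
            nodes.append((item, default_port))
    return nodes


# Example call with placeholder addresses:
# parse_gcomm("gcomm://10.0.0.1,10.0.0.2,10.0.0.3")
```

Each resulting pair could then be tested for reachability from every other database node, since a timeout between peers is one possible reading of the "Failed to open backend connection" message quoted above.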
jrossernixbuilder: there is also a guide for database maintenance https://docs.openstack.org/openstack-ansible/latest/admin/maintenance-tasks.html16:35
noonedeadpunkdamiandabrowski: ah, right, thanks for reminding about it16:35
NeilHanlonbtw if anyone needs reviews, ping me.. i'm distracted like usual but I am happy to help out16:39
nixbuilderjrosser: Well I got mysql running on one node by turning off the cluster in /etc/my.cnf.d/cluster.cnf16:39
nixbuilderjrosser: just need to try and figure out how to fix the cluster.16:39
jrosserthis suggests that there is something broken, perhaps with your networking16:40
jrosserthe playbooks should bring up the database cluster and it should then work on its own16:40
nixbuilderjrosser: Reading through the link you sent me.16:41
jrosserit's important to get this all to the position where you can run/r-run the playbooks without incident as thats the basis of an upgrade16:42
jrosser*re-run16:42
opendevreviewMerged openstack/openstack-ansible-os_nova stable/xena: Always disable libvirt default network  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/90019316:53
nixbuilderjrosser: Yeah... when I tried to recover the database I kept getting these... manila did something to my database: https://paste.opendev.org/show/bfbKAaXWRJtf2cuopFcF/17:08
noonedeadpunknixbuilder: I don't think it's really smth wrong with the dvb17:15
noonedeadpunk*db17:15
noonedeadpunkI think I have quite some of such messages in logs17:15
noonedeadpunkit might be that wait_timeout is unaligned...17:16
noonedeadpunk(though we've attempted to sync it lately)17:16
jrossernixbuilder: you should try to get the db cluster working correctly according to the maintenance tasks document17:17
jrosserif you think that you have manila services deployed that 1) you don't want 2) you believe are somehow interfering with the db, just stop those services17:18
nixbuilderjrosser: Thanks for the help!17:19
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_cinder master: Restart cinder-purge-deleted service only on abnormal exit  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/90034718:04
opendevreviewMerged openstack/openstack-ansible-os_nova stable/yoga: Always disable libvirt default network  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/90019118:35
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Adjust condition for availability_zone definition  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/89912719:11
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-repo_server master: Fix example playbook linters  https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/90035919:13
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-repo_server master: Ensure mounts are present only when they are expected to exist  https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/89906319:13
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-repo_server master: Cleanup upgrade tasks  https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/89906419:13
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Remove obsoleted provider drivers  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/88551919:15
opendevreviewMerged openstack/ansible-role-zookeeper master: Add upgrade jobs for zookeeper  https://review.opendev.org/c/openstack/ansible-role-zookeeper/+/89775421:33
opendevreviewMerged openstack/openstack-ansible master: Disable wheels build for metal AIO deployments  https://review.opendev.org/c/openstack/openstack-ansible/+/89931921:50
jamesdentonbeen a while, but here's what i was doing with MNAIO: https://github.com/busterswt/MNAIOv222:59
