Wednesday, 2023-10-11

opendevreviewMerged openstack/openstack-ansible master: Switch classic queues to version 2  https://review.opendev.org/c/openstack/openstack-ansible/+/89580600:34
Guest2868is there anyone who can help me with a ceph problem on osa?01:44
NeilHanlonGuest2868: possibly. what is the trouble you're having02:20
Guest2868When I run setup-infra, my ceph.conf file is not as it should be, and when I modify it once on my server I get errors in the syslog. So now I am trying by installing ceph manually.02:25
NeilHanlonwhat does the conf look like, and what do you expect it to look like?02:26
Guest2868NeilHanlon: this is a recap of my config files https://paste.openstack.org/show/bvpV737joawlmCFxzw5y/ as you can see, in the ceph.conf I should have mon_host 192.168.102.30, not 101. I don't understand why this is happening02:30
noonedeadpunkGuest2868: hey, I think you get .30 because that's what you have in `ceph-mon_hosts` defined: https://paste.openstack.org/show/b2SJX9RBx5z7YISLEm30/06:55
Guest2868hi06:56
noonedeadpunkThere was a way to define a different IP for ceph communication/storage networks rather than just relying on inventory though, but I would need to check the ceph-ansible project for that06:57
noonedeadpunkLike if you want to use .30 for SSH and then .101 for ceph communication06:57
Guest2868did you check this ? https://paste.openstack.org/show/bvpV737joawlmCFxzw5y/06:57
noonedeadpunkI guess that's what you're looking for?06:58
Guest2868L2906:58
noonedeadpunkbut you have same ip on L1606:58
noonedeadpunkand then how you defined ceph-mon_hosts in openstack_user_config06:59
Guest2868192.168.101.30 != 192.168.100.3006:59
noonedeadpunkoh, yes07:00
noonedeadpunkit's somehow using the tunnel net indeed, huh07:01
Guest2868Why, when I change ceph.conf on my mon server and run a systemctl restart ceph-mon@mon1.service, does the log not update the IPs of the mon to match the conf?07:05
noonedeadpunkGuest2868: I assume smth fishy is going on here: https://github.com/ceph/ceph-ansible/blob/490ca79ccc5b1cf7270032a70be41500578f3ae8/roles/ceph-facts/tasks/set_monitor_address.yml07:06
noonedeadpunkSo can you check in your logs (/openstack/log/ansible-logging/ansible.log) which of these couple of tasks are getting OK and which are skipped?07:07
jrosseris this a ceph deployed with osa, or external?07:08
noonedeadpunkaccording to openstack_user_config - with osa07:09
noonedeadpunkbut if that is intentional or not - it's a good question :)07:09
noonedeadpunkbut even according to the config - it's weird why tunnel network is taken as ceph mon address07:10
jrosserthen config of ceph_mons is not needed?07:10
noonedeadpunkyeah, but that shouldn't hurt either... or well, anyway not in a way it does07:11
Guest2868Ok I will read this I have a meeting now07:11
noonedeadpunkwhen mon_host in ceph.conf is set to 192.168.*101*.30 while all the rest refer to different things07:11
Guest2868thank you, I'll keep you posted07:11
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Track stable/2023.2 SHAs for upstream projects  https://review.opendev.org/c/openstack/openstack-ansible/+/89743408:10
opendevreviewMerged openstack/openstack-ansible-os_blazar master: Fix linter errors for example playbook  https://review.opendev.org/c/openstack/openstack-ansible-os_blazar/+/89680009:35
opendevreviewMerged openstack/openstack-ansible-ceph_client stable/2023.1: Allow to distribute custom key with the role  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/89780809:52
opendevreviewMerged openstack/openstack-ansible-ceph_client stable/2023.1: Add AppArmor configuration for ceph read/write caching  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/89773010:04
noonedeadpunkjamesdenton: if you have some spare time, it would be awesome if you could take a look at https://review.opendev.org/c/openstack/openstack-ansible/+/89438410:07
opendevreviewDmitriy Rabotyagov proposed openstack/ansible-role-python_venv_build stable/zed: Use distribution_major_version for Debian and CentOS  https://review.opendev.org/c/openstack/ansible-role-python_venv_build/+/89781910:11
opendevreviewMerged openstack/openstack-ansible-os_neutron master: Deprecate OpenDaylight support  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/89746111:38
opendevreviewMerged openstack/openstack-ansible-lxc_hosts master: Remove old cleaup task  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/89785511:59
opendevreviewMerged openstack/openstack-ansible-lxc_hosts master: Remove old tasks and vars from image download process  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/89786011:59
noonedeadpunkfolks, does anyone use osa with a non-root user? including on remote hosts (ie connecting as ubuntu@infra1)12:02
noonedeadpunkAs all looks nice, until a task is delegated from containerA to containerB, as that fails with ` Connection refused - Failed to get init pid` quite fairly12:03
jrosserno we dont do that12:07
noonedeadpunkugh12:09
mgariepyi don't either.12:11
opendevreviewMerged openstack/openstack-ansible-os_blazar master: Add quorum support for service  https://review.opendev.org/c/openstack/openstack-ansible-os_blazar/+/89569412:22
noonedeadpunkI kinda wonder if context is losing become during delegation here: https://opendev.org/openstack/openstack-ansible-plugins/src/branch/master/plugins/connection/ssh.py#L45112:23
noonedeadpunkyes, somehow it's "False"12:27
noonedeadpunkwhen things are delegated12:27
noonedeadpunkoh. we override become explicitly for some reason?12:43
noonedeadpunkhttps://opendev.org/openstack/ansible-role-python_venv_build/src/branch/master/tasks/python_venv_wheel_build.yml#L1812:44
noonedeadpunkI wonder why this might be needed actually...12:44
noonedeadpunkLike if you run aio metal install with non-root user?12:45
jrosserthats changed in this patch with not really an explanation https://opendev.org/openstack/ansible-role-python_venv_build/commit/ac5e5e9283e8b263dd23b9b1f1dcea218c9370e512:46
noonedeadpunkI guess that was about the time of initial 12:47
noonedeadpunkFrankly speaking, I'm inclined to drop that, as I'm really not sure about the usecase...12:48
noonedeadpunkLike if you want to have become - run with become?12:48
jrosseri think get rid of it12:48
jrosserunless there is something super clear then that sort of thing shouldnt be in a role12:49
opendevreviewMerged openstack/openstack-ansible stable/yoga: Remove unreadable unicode symbols  https://review.opendev.org/c/openstack/openstack-ansible/+/88421912:54
opendevreviewMerged openstack/ansible-role-python_venv_build stable/2023.1: Use distribution_major_version for Debian and CentOS  https://review.opendev.org/c/openstack/ansible-role-python_venv_build/+/89781012:58
jamesdentonnoonedeadpunk it's on my list for today13:09
noonedeadpunkI wonder if condition there was supposed to be != localhost. As then it would make slightly more sense13:12
jrosserNeilHanlon: i still see a fair few rocky failures, some like this `Failed to download metadata for repo 'crb': Yum repo downloading error: Downloading error(s): repodata/fc2cce4d-3c40-4532-9b1a-fca99cdf0674-PRIMARY.xml.gz`13:14
jrosserwe wouldnt be running into a rate limit or something by running that same task almost simultaneously across a bunch of containers?13:15
opendevreviewDmitriy Rabotyagov proposed openstack/ansible-role-python_venv_build master: Drop unneeded become overrides  https://review.opendev.org/c/openstack/ansible-role-python_venv_build/+/89794813:18
NeilHanlonjrosser: no, though, we did have an outage for maintenance around 05:00 UTC. dl.rockylinux.org should not have been affected, though13:19
NeilHanlonas of ~04:45 UTC, dl.rockylinux.org should be "stable"13:20
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-lxc_hosts master: Remove lxc_cache_map variable  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/89786113:22
jrosseris it helpful to be collecting the errors somewhere? i might just have convinced myself it's very fail-y when really it's not13:23
noonedeadpunkWe can get an etherpad for that13:23
noonedeadpunkor maybe better ethercalc?13:24
NeilHanlonyeah i think that wouldn't hurt to collect13:24
noonedeadpunkjrosser: I've replied in https://review.opendev.org/c/openstack/openstack-ansible/+/89756813:50
noonedeadpunkfrankly speaking I'm more confused of the usecase when nova-compute are LXC. I'm not sure that ever worked even...13:51
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/xena: Switch roles to track stable/xena  https://review.opendev.org/c/openstack/openstack-ansible/+/88492613:56
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Ensure tempest include and exclude lists all use unique names  https://review.opendev.org/c/openstack/openstack-ansible/+/89396813:57
Guest2868noonedeadpunk: I'm back. This morning you asked me to check which tasks are skipped; this is the cat of the log: https://paste.openstack.org/show/bvHdFXX4WhBLfyDIKrV9/13:58
Guest2868My mon_host in ceph.conf is 192.168.100.30 but when i check the syslog i got a Oct 11 14:01:18 mon1 ceph-mon[29368]: 2023-10-11T14:01:18.606+0000 7f67a7e70640  0 -- [v2:192.168.102.30:3300/0,v1:192.168.102.30:6789/0] send_to message mon_probe(probe d054f11a-d4a2-436f-acf7-1fa49bbe0eeb name mon1 leader -1 new mon_release quincy) v8 with empty dest14:01
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Stop installing openssh and rsync to containers  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/88994514:16
noonedeadpunkGuest2868: I think that 192.168.102.30 is more like what you should have had according to your config14:18
noonedeadpunkThe weird part was about 192.168.101.30 though14:19
Guest2868i removed ceph things in the mon server i rerun setup-host14:20
Guest2868fron openstack_user_config, should I write the ip from storage network or from mgmt network?14:22
Guest2868*from14:22
jrosseropenstack_user_config should always be the network that you want ansible to ssh to the machines on, most people make that the mgmt network14:25
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Stop installing openssh and rsync to containers  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/88994514:26
Guest2868good14:26
jrosserNeilHanlon: did you see anything like this? Not so recent centos-8-stream cloud image trying to `dnf update` https://paste.opendev.org/show/bW7McSirGoLISoK25ivX/14:27
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Remove requirement to have id_rsa.pub  https://review.opendev.org/c/openstack/openstack-ansible/+/89795714:27
NeilHanlonjrosser huh... i've not, but.. let me poke the infra team14:29
jrosseri think it's related to https://github.com/rpm-software-management/rpm/commit/486579912381ede82172dc6d0ff3941a6d0536b514:30
jrosserbut it's a bit of a bummer if existing uploaded images are now useless14:31
NeilHanlonhm yeah that'd be bad14:33
Guest2868After rerunning setup-infrastructure, the task failed at: TASK [ceph-mon : waiting for the monitor(s) to form the quorum...] And on my mon server the log is still showing [v2:192.168.102.30:3300/0,v1:192.168.102.30:6789/0] send_to message mon_probe(probe d054f11a-d4a2-436f-acf7-1fa49bbe0eeb name mon1 leader -1 new mon_release quincy) v8 with empty dest  And the ceph.conf is https://paste.openstack.org/show/b002nxqQvU03xd6sR14:35
jrosserGuest2868: how many ceph nodes do you have?14:36
Guest28683 for osd and 1 for monitor14:36
opendevreviewMerged openstack/openstack-ansible-os_neutron master: Workaround ovs bug that resets hostname with add command  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/89765614:37
jrosserGuest2868: you can run the ansible playbook with -vvv and you'll get to see the output from the tasks, specifically the one that failed14:39
jrosserit is very verbose so only run the most specific playbook that you need to14:39
Guest2868Is this result not verbose enough? https://paste.openstack.org/show/bUUjsTo7AaIOPc5mBB9q/14:39
opendevreviewMerged openstack/openstack-ansible-os_keystone stable/2023.1: oidc: fix recognition of x forwarded headers from v2.4.11  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/89780614:40
jrosserGuest2868: it says `"state\":\"probing\"`14:41
jrosserand this is what it looks for https://github.com/ceph/ceph-ansible/blob/490ca79ccc5b1cf7270032a70be41500578f3ae8/roles/ceph-mon/tasks/ceph_keys.yml#L1314:41
Guest2868what am I supposed to understand from that?14:43
jrosserthat the state it is looking for is "leader" or "peon" and it is neither of those14:43
jrosserthats why the task failed14:43
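A minimal sketch of the kind of check described above, not the ceph-ansible task itself: it assumes the monitor status is a JSON document with a top-level "state" field (as in the pasted output), and the function name `mon_in_quorum` is hypothetical.

```python
import json

# States that, per the discussion above, count as the monitor having joined quorum.
QUORUM_STATES = {"leader", "peon"}


def mon_in_quorum(mon_status_json: str) -> bool:
    """Return True when the reported monitor state is "leader" or "peon".

    Assumption: the JSON has a top-level "state" key, as in the pasted output.
    """
    status = json.loads(mon_status_json)
    return status.get("state") in QUORUM_STATES


if __name__ == "__main__":
    # A monitor stuck in "probing" does not satisfy the check.
    print(mon_in_quorum('{"name": "mon1", "state": "probing"}'))
    print(mon_in_quorum('{"name": "mon1", "state": "leader"}'))
```

A monitor that stays in "probing" has not found or agreed with its peers, which matches the retries running out in the failed task.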
jrosser(sorry i'm really quite confused about what the actual issue is with your deployment)14:44
Guest2868I'm trying to get a ceph install working14:44
jrosserand there are some wrong IP addresses?14:44
Guest2868I don't know, I'm trying to understand, so I checked the logs, the return of ansible, and the ceph.conf on the destination server. I found weird ips. In my user_variable I use my storage network ips, in the ceph.conf I found ips from my mgmt network. And the log shows ip errors on the storage network (mgmt 192.168.100.0/24, storage 192.168.102.0/24)14:46
Guest2868And the error in syslog seems to say there is a problem with [v2:192.168.102.30:3300/0,v1:192.168.102.30:6789/0] send_to message mon_probe(probe d054f11a-d4a2-436f-acf7-1fa49bbe0eeb name mon1 leader -1 new mon_release quincy) v8 with empty dest14:47
Guest2868why is this showing 192.168.102.30:3300 when my ceph.conf mon_host key is 192.168.100.30?14:48
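To make the mismatch easier to see, here is a small sketch that maps the addresses mentioned so far onto the two subnets quoted above. The subnet values come from the conversation; the labels and the `classify` helper are illustrative assumptions.

```python
import ipaddress

# Subnets as stated in the conversation; the label names are illustrative.
NETWORKS = {
    "mgmt": ipaddress.ip_network("192.168.100.0/24"),
    "storage": ipaddress.ip_network("192.168.102.0/24"),
}


def classify(ip: str) -> str:
    """Return the label of the first known subnet containing ip, else "unknown"."""
    addr = ipaddress.ip_address(ip)
    for name, net in NETWORKS.items():
        if addr in net:
            return name
    return "unknown"


if __name__ == "__main__":
    seen = [
        ("ceph.conf mon_host", "192.168.100.30"),
        ("ceph-mon log line", "192.168.102.30"),
        ("earlier ceph.conf", "192.168.101.30"),
    ]
    for source, ip in seen:
        print(f"{source}: {ip} -> {classify(ip)}")
```

Read this way, the running monitor is bound to a storage-network address while ceph.conf points at a mgmt-network address, which is the inconsistency being asked about.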
jrosser`In my user_variable I use my storage network ips` what does this mean?14:49
Guest2868like this example file https://github.com/openstack/openstack-ansible/blob/master/etc/openstack_deploy/user_variables.yml.prod-ceph.example#L20-L2214:52
Guest2868What am I supposed to do?15:07
jrosserGuest2868: if it was me, i would find something that is certainly wrong, then understand why it is wrong15:08
noonedeadpunkif you ask me, I think it's worth finding out why your ceph.conf is getting 192.168.100.3015:08
noonedeadpunkBecause from the config I saw I have the impression that the storage network should be used for communication, which is 192.168.102.3015:09
jrosserGuest2868: it is also helpful to look in the systemd journal rather than syslog, because i am not really sure which daemon has printed the log line you pasted15:10
noonedeadpunkBut I guess that it could be due to some fact missing, that https://github.com/ceph/ceph-ansible/blob/main/roles/ceph-facts/tasks/set_monitor_address.yml is not getting defined properly15:10
jrosserand if there is a mismatch between ceph.conf and what the log says, it could be as straightforward as for some reason, the ceph mon has not been restarted15:11
jrosserperhaps due to an error / failing playbook previously15:11
opendevreviewMerged openstack/openstack-ansible stable/2023.1: Gather extra networking facts for keepalived  https://review.opendev.org/c/openstack/openstack-ansible/+/89728315:20
opendevreviewMerged openstack/openstack-ansible stable/zed: Gather extra networking facts for keepalived  https://review.opendev.org/c/openstack/openstack-ansible/+/89728415:21
noonedeadpunkNeilHanlon: do you have slightest idea why libvirtd.service might be missing on c9s, given that libvirt-daemon was installed?15:40
NeilHanloni started looking into that yesterday...15:41
NeilHanlonbut basically: no...15:41
noonedeadpunkmhm, I see...15:41
noonedeadpunkI think I know15:42
noonedeadpunkit's not installed :)15:42
NeilHanlonheh15:43
NeilHanlonthat'd be one reason15:43
NeilHanlondid someone change the Provides: on a package again like systemd-uboot? :P 15:43
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_nova stable/zed: Install libvirt-deamon for RHEL systems  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/89798215:44
noonedeadpunkthis ^15:44
noonedeadpunkwe had landed https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/884380 but that wasnt enough for Zed15:45
NeilHanlonbleh15:45
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron stable/zed: Fix typo for  vpnaas_custom_config distribution  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/89671115:45
noonedeadpunkdnf logs were pretty much misleading though15:46
noonedeadpunkas they were having plenty of libvirt-daemon-stuff15:46
NeilHanlonyeah i a gree...15:46
NeilHanlonagree*15:46
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-lxc_hosts master: Remove lxc_cache_map variable  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/89786115:56
opendevreviewMerged openstack/openstack-ansible-haproxy_server stable/2023.1: Add possibility to override haproxy_ssl_path  https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/89716716:41
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: [doc] Add documentation on  https://review.opendev.org/c/openstack/openstack-ansible/+/89799918:25
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: [doc] Add documentation on running as non-root  https://review.opendev.org/c/openstack/openstack-ansible/+/89799918:27
jrossernoonedeadpunk: what is the goal with non-root? to meet some security rules?18:33
noonedeadpunkI don't know....18:34
noonedeadpunkrly18:34
noonedeadpunkWas just asked internally for how long we're going to require root for deployments 18:34
jrosseryeah interesting18:34
NeilHanlonroot is bad mkay18:35
NeilHanlonNIST said so18:35
NeilHanlonor something18:35
jrosseras it’s pretty much the same with passwordless Sufi18:35
* NeilHanlon shrugs18:35
jrosser*sudo18:35
NeilHanloni kinda like the name sufi better18:35
noonedeadpunkBut it was in regards of updating our training offering, which teaches ppl how to use OSA for deployments and upgrades: https://academy.cleura.cloud/courses/course-v1:cleura+cc201+202310/about18:36
noonedeadpunkWell, it's to enable to fully disable `root` as a username I guess...18:36
noonedeadpunkBut well. I think you can also provide become password?18:37
noonedeadpunkI didn't test this though... But I assume that should be possible... And quite some become methods as well, I didn't check all of them to be frank....18:37
noonedeadpunkmaybe there's smth that allow you to do it in a smart way...18:38
NeilHanlonPAM!18:38
NeilHanlon:P 18:38
noonedeadpunkhehe18:39
jrosserI was just interested because occasionally people ask for it18:39
noonedeadpunkAnd, I guess, you can even develop your own become plugin?18:39
NeilHanlon`sed -i 's/root/toor/g' /etc/{passwd,group,shadow}`18:40
noonedeadpunkyeah, that would work as well actually :D18:40
noonedeadpunkjrosser: I guess in our case it was more to follow "good practise" and not use root as that is considered as "moveton" nowadays18:40
noonedeadpunkrather than any real use-case (at least for now)18:41
jrosserimho sudo passwords are not great18:42
jrosserthough I’d be interested to look at the crossover of running osa as non root and ssh certs that permit sudo18:44
noonedeadpunkBut we're also having a really strictly regulated deployment, so maybe it was also smth related to that as well potentially18:44
jrosseryeah18:48
jrosserbeing able to show you’ve got no hashed passwords or keys anywhere is super useful18:48
noonedeadpunkpasswordless sudo is not helping much though....18:49
noonedeadpunkbut maybe if you take password from vault....18:49
noonedeadpunkdunno18:49
jrosserno, well, that's why I want to look again at the certs stuff18:49
jrosserif you can control in the cert that sudo is allowed that would be cool18:50
jrosseras it would be ephemeral18:50
noonedeadpunkI wonder if you indeed can do that with pam_ssh_agent_auth...18:51
noonedeadpunkor smth like that18:51
noonedeadpunkhttps://linux.die.net/man/8/pam_ssh_agent_auth18:51
jrosserwe still need a “rescue” user with a password for use over ipmi kvm if all else is broken18:51
jrosserthat’s a bit sad18:52
jamesdentonnoonedeadpunk really slick drawings, thank you18:52
noonedeadpunktrue18:52
jamesdentonFWIW I am documenting an OVS->OVN migration, will share that later this week, hopefully18:52
noonedeadpunkawesome!18:53
noonedeadpunkthat is smth I always wanted to get my hands to, but never managed to18:53
* noonedeadpunk even was going to add some automation to neutron inplace for tripleo path18:53
jamesdentonan in-place migration for TripleO based deploys?18:54
jamesdentonor similar to what OOO is doing?18:54
* noonedeadpunk tries to translate slick and that looks like a negative characteristics?:)18:54
noonedeadpunkInstead of what OOO is doing as it's deprecated18:54
noonedeadpunkAnd it's pretty much OOO oriented rather than generic18:55
jamesdenton'slick' meaning 'great'18:56
noonedeadpunkaha, ok :D18:57
jamesdenton"impressive"18:57
noonedeadpunkgoogle translate was proposing "slippy" as synonym, so... :D18:58
opendevreviewJames Denton proposed openstack/openstack-ansible master: Stop ignoring hostnames without underscores  https://review.opendev.org/c/openstack/openstack-ansible/+/89800218:58
jamesdentonahh, well 'slick' does mean 'slippery', usually in the context of a road after a rain or something like that. But as slang, it means 'cool' or similar18:59
noonedeadpunkaha, gotcha, now I know 18:59
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: [doc] Add documentation on running as non-root  https://review.opendev.org/c/openstack/openstack-ansible/+/89799919:00
noonedeadpunktalking about docs - this is another thing that is probably worth checking. I made it during the previous release and it might be good to have that in https://review.opendev.org/c/openstack/openstack-ansible/+/88537619:02
jamesdentonis no_log a var that can be passed on the command line for troubleshooting?19:07
noonedeadpunkum, no, but we added some var that will do same somewhere lately19:07
noonedeadpunklike _oslodb_setup_nolog for DB and _service_setup_nolog19:08
jamesdentonthank you19:09
opendevreviewMerged openstack/openstack-ansible master: Define tempest config overrides in unique variables per service  https://review.opendev.org/c/openstack/openstack-ansible/+/89476319:36
opendevreviewMerged openstack/openstack-ansible master: [doc] Add example network architectures for OVN  https://review.opendev.org/c/openstack/openstack-ansible/+/89438419:37
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: [doc] Add documentation on running as non-root  https://review.opendev.org/c/openstack/openstack-ansible/+/89799919:42
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Update VPNaaS package for RHEL  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/89800819:51
opendevreviewMerged openstack/openstack-ansible-os_nova stable/zed: Install libvirt-deamon for RHEL systems  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/89798221:26
-opendevstatus- NOTICE: Another short Gerrit outage for updates on review.opendev.org. This update ensures we are using the current versions of all Gerrit plugins.23:45
