Wednesday, 2022-10-19

*** ysandeep|out is now known as ysandeep|PTO		00:05
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794	07:06
jrosser_	^ thats running an upgrade job on train too, which we probably want to get rid of?	07:10
noonedeadpunk	Yeah, true	07:11
noonedeadpunk	I also see no way of saving centos 7 lxc job	07:11
noonedeadpunk	As centos has dropped their image for lxc	07:11
noonedeadpunk	And for the legacy method we need infra proxy that was likely dropped or super outdated	07:11
noonedeadpunk	Don't want to mess/fix it	07:12
noonedeadpunk	I wonder what out of that we actually need https://opendev.org/opendev/base-jobs/src/branch/master/roles/mirror-info/templates/mirror_info.sh.j2#L83-L85	07:13
noonedeadpunk	or well, we'd need to fix that https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror/templates/mirror.vhost.j2#L259-L262	07:16
noonedeadpunk	and replace with https://us.lxd.images.canonical.com/	07:17
noonedeadpunk	(but I'd rather drop)	07:17
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Drop usage of lxc containers proxy https://review.opendev.org/c/openstack/openstack-ansible/+/861825	07:20
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/ussuri: Bump SHA for galera, rabbitmq and rally roles https://review.opendev.org/c/openstack/openstack-ansible/+/853029	07:41
noonedeadpunk	Nah, we can't drop as I guess like Rocky use only this way of images retrieval	07:42
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831	07:45
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831	07:47
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794	07:48
gokhanisi	hi folks, I am trying to make keystone ldap integration but it didn't work. For testing I created openldap server on ubuntu focal and created this ldif file > https://paste.openstack.org/show/bEY20h8Nvj5XtaE2zDJC/ this is my keystone domain config > https://paste.openstack.org/show/boQuvnoOHaLZ2lmkFMYd/, I created b3lab domain manually but in keystone logs it says b3lab domain not found. Maybe I have missed some things to do.	07:52
* noonedeadpunk has no experience in ldap integration		08:01
kleini	gokhanisi: this is my configuration for OSA to configure keystone with LDAP auth	08:24
gokhanisi	kleini, and it worked for you	08:27
kleini	yes, it works	08:34
kleini	gokhanisi: is your keystone configuration file located in /etc/keystone/domains/keystone.b3lab.conf?	08:37
gokhanisi	kleini, yes it is like that https://paste.openstack.org/show/bjO6UVFPtOA71srmlcMq/	08:41
gokhanisi	and this is keystone.conf https://paste.openstack.org/show/bBggmfcI7WMFPZYMGo31/	08:43
kleini	maybe turn on verbose and debug logging in keystone. did you try to use ldapsearch to test connection to your LDAP?	08:47
gokhanisi	kleini, it is working with ldap search > https://paste.openstack.org/show/bbOjfu6ZohKIh5PVZaZn/	08:54
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/ussuri: Switch to tracking stable/ussuri for EM release https://review.opendev.org/c/openstack/openstack-ansible/+/853029	08:54
gokhanisi	maybe I am typing the url wrong	08:54
gokhanisi	kleini, thanks it is working now :) and now How can we map openstack projects with ldap objects?	09:02
gokhanisi	I can list groups and users on ldap	09:02
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794	09:13
kleini	just do normal role assignments. I create projects then in the same domain - b3lab in your case - and then assign role member to some group or user for that project	09:16
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831	09:17
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Return jobs to voting https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861855	09:20
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/train: Disable upgrade jobs on EM branch https://review.opendev.org/c/openstack/openstack-ansible/+/861858	09:23
dokeeffe85	Hi again, I lost my controller and computes to a power failure and when I rebooted them today after getting them back online I get the following errors https://paste.openstack.org/show/beSCxaSdw95snHjbRdhy/ when trying to attach & start containers. I obviously didn't get a chance to stop the containers before I lost the servers. Is there anyway to fix this or is it a lxc-containers-destroy.yml + lxc-containers-create.yml again? Thanks in	09:56
dokeeffe85	advance	09:56
noonedeadpunk	dokeeffe85: and what does /var/log/lxc/lxc-infra1_utility_container-f80f87fa.log say?	09:58
admin1	tag 25.1.1 fails on python_venv_build : Install python packages into the venv => ERROR: Error [Errno 2] No such file or directory: 'git' while executing command git version\nERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH .. doh !!	10:24
admin1	and i have not done any changes or overrides	10:24
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868	10:25
admin1	ssh to container, apt install git ; re-run playbook .. and its solved ..	10:25
noonedeadpunk	admin1: is it for placement?	10:25
admin1	no .. c1_heat_api_container	10:25
admin1	ignore the c1_	10:26
noonedeadpunk	nah, then needs patching	10:26
admin1	where would i see our CI passing logs ..	10:26
noonedeadpunk	Just for placement was fixed with https://opendev.org/openstack/openstack-ansible-os_placement/commit/6084c248fcae02c413133329b705678cd75c1bfe	10:26
admin1	i got no issues on placement	10:26
noonedeadpunk	CI can be quite different. As we forcefully enable wheels building there	10:27
admin1	oh	10:27
noonedeadpunk	And you will see error if wheels build is disabled for some reason	10:27
noonedeadpunk	like running with limit is one option	10:27
noonedeadpunk	and it's result of one "fix" that now more properly evaluates things	10:28
noonedeadpunk	but git issues in other places started arising when wheels are not built	10:29
noonedeadpunk	so would be great if you could push some patch for that	10:29
admin1	this was in acceptance .. i will deploy tonight in a prod env . .. will have a 100% confirmation then ..	10:31
dokeeffe85	noonedeadpunk this is the entire file https://paste.openstack.org/show/bNys7633pfauez8qdhQV/	10:43
noonedeadpunk	and what if you add `-F` to lxc-start?	10:45
noonedeadpunk	I just assume that smth is off with either /var/lib/machines mount or some net interface	10:47
noonedeadpunk	But not sure what's exactly	10:47
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_tacker master: Add deployment of tacker-scheduler https://review.opendev.org/c/openstack/openstack-ansible-os_tacker/+/861870	10:52
admin1	when "external" ceph is enabled, ceph is also not installed in glance	10:52
admin1	that blocks the playbook on octavia when it wants to upload the amphora image	10:52
admin1	external ceph using cephadm,	10:53
noonedeadpunk	I think it depends on what backends you've enabled for glance?	10:54
admin1	user_variables => glance_ceph_client: glance ; glance_default_store: rbd ; glance_rbd_store_pool: images	10:55
noonedeadpunk	`glance_default_store: rbd` is the thing that should do the trick actually	10:55
noonedeadpunk	as that's the condition on when ceph part does run https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/main.yml#L157-L158	10:56
noonedeadpunk	and `_glance_available_stores: "{{ [ glance_default_store ] + glance_additional_stores }}"`	10:57
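The store check described above can be sketched in Python. This is only an illustration of the list concatenation in `_glance_available_stores`; the function names are hypothetical, and the assumption that the ceph part runs whenever `rbd` is among the available stores is a guess at the linked condition, not a copy of the os_glance role.

```python
def available_stores(default_store, additional_stores):
    """Mirror of "[ glance_default_store ] + glance_additional_stores"."""
    return [default_store] + list(additional_stores)


def needs_ceph(default_store, additional_stores):
    # Assumption: the ceph client setup is wanted when any store is rbd.
    return "rbd" in available_stores(default_store, additional_stores)


if __name__ == "__main__":
    # Values below are illustrative, not taken from a real deployment.
    print(available_stores("rbd", ["http"]))
    print(needs_ceph("rbd", []))
    print(needs_ceph("file", ["http"]))
```

With `glance_default_store: rbd` set, as in admin1's user_variables, such a check would be true regardless of the additional stores.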
admin1	i am going to return glance playbook with -vvv and grep rbd/ceph	10:59
admin1	return -> rerun	10:59
noonedeadpunk	so if you go to interactive python (ie /openstack/venvs/glance-<version>/bin/python) and execute `import rbd` it will fail with import?	10:59
admin1	yeah .. no /etc/ceph and no packages	10:59
noonedeadpunk	super weird	11:00
admin1	ceph_pkg_source: distro ..	11:00
admin1	without this, build does not work on 22.0.4	11:00
noonedeadpunk	it's 22.04?	11:00
admin1	yeah	11:00
noonedeadpunk	rly weird	11:01
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/queens: Switch linters to EOL https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861873	11:08
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868	11:08
admin1	noonedeadpunk https://gist.github.com/a1git/5f5a129b62e57cd52c8791f5ecd0d986	11:11
admin1	not sure if it helps	11:11
admin1	noonedeadpunk, is this a good way to force ? openstack-ansible os-glance-install.yml -e 'glance_default_store: rbd'	11:12
noonedeadpunk	but it shows that ceph.conf is being copied	11:12
noonedeadpunk	and ceph packages are symlinked properly	11:13
admin1	now i see ceph.conf :D	11:13
admin1	hmm..	11:13
noonedeadpunk	https://gist.github.com/a1git/5f5a129b62e57cd52c8791f5ecd0d986#file-gistfile1-txt-L418	11:14
admin1	i will destroy this container and retry .. could be it fails the first time and then works the 2nd time	11:14
noonedeadpunk	and all tasks are OK actually. Nothing was changed	11:14
noonedeadpunk	um, then tasks would be in changed state	11:14
noonedeadpunk	according to paste I can say nothing was done during this run	11:15
noonedeadpunk	`c1_glance_container-bf88ca5b : ok=108 changed=1 ` and this changed is forceful user creation	11:15
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868	11:28
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868	11:28
admin1	tracked down the issue of glance not working to " installed ceph-common package post-installation script subprocess returned error exit status 6"	11:39
admin1	cannot even purge it ..	11:40
admin1	that is the only diff in relation to ceph i see in cinder vs glance container	11:41
jrosser_	try installing that by hand and see what the error is	11:44
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_barbican stable/rocky: Stop old uwsgi service if exist https://review.opendev.org/c/openstack/openstack-ansible-os_barbican/+/861877	11:46
dokeeffe85	noonedeadpunk it seems that lxcbr0 doesn't exist https://paste.openstack.org/show/bI9tQbBaDAeXkBgVqT4Z/	11:53
admin1	no logs, nothing except /usr/bin/dpkg returned an error code ..	11:53
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/ussuri: Switch to tracking stable/ussuri for EM release https://review.opendev.org/c/openstack/openstack-ansible/+/853029	11:53
admin1	installing strace to go deep	11:53
admin1	no logs and nothing	11:53
noonedeadpunk	dokeeffe85:try restarting systemd-networkd	11:54
noonedeadpunk	lxcbr0 is managed with it	11:54
admin1	write(6, "{\"jsonrpc\":\"2.0\",\"method\":\"org.d"..., 100) = 100 -- read(6, 0x562324022690, 4096) = -1 ECONNRESET (Connection reset by peer) -- EBADF (Bad file descriptor)	11:55
admin1	i am going to deploy it in prod and see if i face the same issue or not	11:56
noonedeadpunk	so, train seems to be fixed way easier than ussuri	11:56
admin1	one quick question .. is there any override to tell glance and cinder to use the same container for instance :D	11:56
dokeeffe85	noonedeadpunk, no joy with that. "brctl show" only lists mgmt, storage & vxlan bridges	11:56
admin1	same repos, same packages .. ceph-common is good in one, bad in another	11:56
admin1	i have already lxc-destroyed the containers to recreate	11:57
admin1	so will try in a prod env now to make sure its not my env .	11:57
noonedeadpunk	dokeeffe85: as I said check systemd-networkd	11:58
dokeeffe85	noonedeadpunk yep I restarted it and then tried restarting the container with -F and same result	11:59
jrosser_	if lxcbr0 is missing you need to look at the service that creates it and see why it is missing	12:01
jrosser_	there is no point moving on to restarting the container until the bridge is there	12:01
admin1	ip link set dev lxcbr0 up ?	12:03
dokeeffe85	lxcbr0 doesn't exit. Not sure which service creates it jrosser_ it was all working fine until a reboot of the server so it was created initially	12:13
jrosser_	you have /etc/network/interfaces.d/lxc-net-bridge.cfg ?	12:16
admin1	dokeeffe85 is it an aio ?	12:25
admin1	single node all in 1 install	12:25
admin1	dokeeffe85, reboot -- was it after some update/upgrade of packages ?	12:26
noonedeadpunk	admin1: regarding override for glance/cinder - it's definitely smth env.d related	12:37
noonedeadpunk	should be doable	12:37
noonedeadpunk	like create /etc/openstack_deploy/env.d/glance.yml with https://paste.openstack.org/show/bVCyhed2fVR46gT2FQPu/	12:39
noonedeadpunk	not 100% sure about that so worth to backup openstack_inventory just in case :D	12:39
noonedeadpunk	actually... likely it won't work as just glance playbook won't run as no hosts will be in glance_all	12:40
dokeeffe85	jrosser_ yep I have that file. admin1 nope it's not aio. We had a power cut and I lost the three servers	12:40
jrosser_	admin1: i dont think that combining glance and cinder is useful to address whatever ceph trouble you had	12:41
jrosser_	as usual root cause needs to be found	12:41
noonedeadpunk	then likely worth to mess up with other parts of env.d	12:41
admin1	yeah .. working to deploy in prod with full logs ..	12:41
jrosser_	dokeeffe85: then you can try to 'ifup' the interface	12:41
admin1	so that i can share	12:41
noonedeadpunk	but should be overall doable	12:41
jamesdenton	FWIW: I have resigned to adding a lxcbr0 bridge to my netplan config to avoid the issue mentioned (not being recreated on reboot). I could not easily replicate the issue, and since a) we do baremetal in prod and b) we don't often reboot controllers, i don't see it in the wild ,either	12:44
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Mark Zaqar as deprecated in role matrix https://review.opendev.org/c/openstack/openstack-ansible/+/861884	12:45
noonedeadpunk	well there're 2 patches around for this topic - first to move lxcbr fully to networkd and second - allow to avoid osa trying to create it	12:46
noonedeadpunk	(and manage)	12:46
jamesdenton	/thumbsup	12:47
*** frenzy_friday is now known as frenzyfriday|rover		12:52
dokeeffe85	jrosser_ nope I can't as the interface doesn't exist, I can't see it anywhere. jamesdenton can you give me a paste of that netplan bridge you created please?	13:01
jrosser_	ifup is the command to bring up the interface with /etc/network/... type definitions afaik	13:02
jrosser_	it doesnt need to exist, you need to make it exist with some command	13:02
jamesdenton	https://paste.opendev.org/show/bLtNoxDfwJGIHg9t2KQp/ -- netplan apply would bring it up in this case	13:02
admin1	noonedeadpunk, i am trying the env.d override to have cinder/glance in the same container .. and changed the line in setup-openstack to install cinder first and then glance .. lets see what that does	13:05
admin1	if it works, then i can destroy this env and again start fresh	13:05
jrosser_	admin1: we already warned you about the ansible groups being empty doing that	13:06
admin1	:)	13:07
jrosser_	i cannot understand doing this rather than just debug an apt problem	13:07
jrosser_	tbh there are expected to be issues as you're attempting something thats not tested	13:08
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Remove usage of rsyslog roles https://review.opendev.org/c/openstack/openstack-ansible/+/861886	13:11
jamesdenton	jrosser_ Re: OVN ref arch: I am thinking of creating three -- 1) 3 controller nodes + X compute nodes, with DVR and computes as gateway chassis. 2) 3 controller node + 3 network node (gateway chassis) + X compute (non-gateway). 3) 3 controller/network node (gateway chassis) + X compute node. I think that mirrors most of today's deployments. Thoughts?	13:18
jrosser_	yes that would cover it - though i'm not sure they actually call it DVR these days as it's a little different	13:19
jrosser_	i	13:20
jamesdenton	i'll double check	13:20
admin1	jamesdenton, these days deployment demand are now more towards HCI with ceph .. so 3x nodes ( controllers ) + 3x nodes ( hypervisor ) .. no network nodes .. the 3x controllers + 3x hypervisors all have ceph running ..	13:22
admin1	compute also act as network nodes	13:22
jrosser_	really?	13:22
jamesdenton	Yes, I have seen that pushed more lately.	13:22
admin1	jrosser_ yes :)	13:23
dokeeffe85	Thanks jamesdenton that worked as far as starting the containers but there's other issues now after the reboot that I'll have to dig a bit deeper on before I ask any questions	13:23
jamesdenton	dokeeffe85 sure, just let us know.	13:23
dokeeffe85	Will do thanks	13:23
jrosser_	must be fun constraining the memory on a combined ceph/hypervisor when things go $wrong in ceph	13:24
admin1	hypervisor is only doing the role of the osd	13:24
admin1	and not monitors and others .. its the controllers that do them	13:24
jrosser_	steady state yeah whatever, but it can get out of hand real quick on the OSD when things are "broken"	13:24
jamesdenton	admin1 since we decouple OSA from Ceph deployment, I'll let the operator tack Ceph onto any one of those scenarios	13:25
jrosser_	lose a portion of your cluster due to a switch or some other problem and the memory usage can get very high	13:25
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add release note about used ansible and ceph versions https://review.opendev.org/c/openstack/openstack-ansible/+/861889	13:26
jamesdenton	Anyone here using OSA+OpenDaylight? Or OSA+NSX-T?	13:27
noonedeadpunk	hyperconverged infra is always "fun". Most fun for osd+hypervisor is RAM consumption. As you always need RAM for domains, but OSDs also require ram. So I think I would reserve a lot of ram for hypervisor in placement if I had to do that	13:28
noonedeadpunk	there's a bunch of fixes passing for stable branches - would be good if we could merge them sooner rather than later, before they break again :D https://review.opendev.org/q/parentproject:openstack/openstack-ansible+branch:%255Estable/.*+status:open+label:Verified	13:32
jrosser_	huh lots of depends-on there	13:35
jrosser_	need to get the order right	13:35
noonedeadpunk	yeah, quite some... But I expected to be worse tbh	13:36
noonedeadpunk	it's mostly rabbitmq/erlang thing	13:36
noonedeadpunk	that's broken even back to Rocky	13:37
noonedeadpunk	But I'm not sure I have enough motivation now to fix Rocky....	13:37
noonedeadpunk	and rocky not in terms of distro but in terms of openstack release	13:40
noonedeadpunk	we should EOL it to get rid of this confusion lol	13:40
jamesdenton	mgariepy I seem to recall you had some patches to os_neutron for placing OVN gateway chassis? versus all ovn-controllers being gateways? Or maybe you just mentioned wanting to do it, can't recall	13:43
jamesdenton	if not, i can do it as part of this doc exercise	13:43
mgariepy	let me look	13:44
jamesdenton	My plan is to split out the gateway logic from the neutron_ovn_controller group, and create a second group named neutron_ovn_gateway_chassis, and just manipulate inventory accordingly.	13:46
mgariepy	https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/760647	13:47
jamesdenton	yeah ok, same concept	13:47
mgariepy	yep.	13:47
mgariepy	add me to review i can help if you want :D	13:47
jamesdenton	great. I'll resurrect that, thank you!	13:47
jamesdenton	definitely	13:47
jamesdenton	i'll put together some steps to duplicate this in 6-9 VMs, depending on the scenario	13:48
noonedeadpunk	So regarding zookeeper. I was using fork of this repo https://opendev.org/windmill/ansible-role-zookeeper/src/branch/master	13:49
noonedeadpunk	And I do see how we will struggle with it	13:50
noonedeadpunk	Maybe worth trying to reach pabelanger but I kind of doubt he will be eager to add config_template...	13:52
admin1	destroyed everything .. retrying again ..	13:53
admin1	this time i will log everything from start	13:53
noonedeadpunk	as he was not willing to listen about configuring cluster out of the box (https://github.com/openstack-archive/ansible-role-zookeeper/commit/b223d56660ea21f0feb8c6c7bf27dd4bac07a7fe)	13:54
opendevreview	Merged openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868	13:55
admin1	kolla+ovn works out of the box ( to get inspired from )	13:55
jamesdenton	it's not inspiration that's lacking :D	13:56
jamesdenton	only time	13:56
admin1	getting inspired is subtly saying to copy to save time :)	13:56
jamesdenton	it does work for OSA, too. Just looking to make it more complete so we can make it the default vs LXB	13:56
admin1	i also know ovn is now in starlingx .. but have not tested the latest one	13:56
jamesdenton	yes, things have been borrowed here and there :)	13:57
admin1	no need to reinvent the wheel if it works	13:57
jamesdenton	but I think OSA does a better job of spelling out different deployment scenarios versus some of the others, which tend to be a little more... prescriptive/opinionated	13:57
admin1	yes .. which is why i stick to osa for all prod and (paid) work .. and rest of the time, test others	13:57
admin1	we all are operators and so also OSA is better .. 2 weeks back i tried kolla + ovn and could not get octavia to work .. tried for 2 weeks .. zero replies :)	13:59
admin1	kolla/docker works . and so when it works, there are no interaction because it just works and people have no questions .. and when it breaks or does not work for some reasons, no one knows how to answer	14:01
noonedeadpunk	fwiw we're having operator hours now in https://www.openinfra.dev/ptg/rooms/folsom	14:01
jamesdenton	i am delayed by 25 more min	14:04
opendevreview	Merged openstack/openstack-ansible-tests stable/ussuri: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861742	14:10
opendevreview	Merged openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831	14:10
dokeeffe85	jamesdenton I left my desk for a bit and when I came back I know have a horizon dashboard and my VM's are up and can ping out so I don't know what happened but it's back. Thanks everyone	14:47
dokeeffe85	*now	14:47
*** dviroel is now known as dviroel|lunch		15:46
nixbuilder	I am confused (as is normal) about vxlan for private tenant networks. The documentation here (https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts-networkconfig.html) says that "Note that br-vxlan is not required to be a bridge at all, a physical interface or a bond VLAN subinterface can be used directly and will be more efficient." So what parameters do I use in openstack_user_config to use a	16:17
nixbuilder	physical interface?	16:17
noonedeadpunk	basically what you need - consistent interface name across net nodes and compute nodes	16:24
noonedeadpunk	It can be bridge, but you can also name interface as vxlan in netplan or systemd-networkd	16:24
jrosser_	is it eventually what is specified here https://github.com/openstack/openstack-ansible/blob/master/etc/openstack_deploy/openstack_user_config.yml.example#L273	16:25
noonedeadpunk	to make it less confusing you might use `host_bind_override: $name` there as well	16:26
noonedeadpunk	(iirc)	16:26
jamesdenton	i would expand that to say, the container_* bits are probably not important these days since neutron agents are on metal and not in LXC anymore (right?), but there is logic that uses the range based on type to populate ml2_conf.ini and the agent configs. we could probably stand to test/update this	16:26
jamesdenton	host_bind_override would only be applicable to vlan type, i think	16:26
jrosser_	ultimately whatever the value of "{{ tunnel_address }}" is what matters in os_neutron	16:27
noonedeadpunk	iirc it's applied everywhere in bare metal hosts	16:27
jamesdenton	right, so i might try a deployment and it would be nice if we could eliminate container_* since it's irrelevant	16:27
jrosser_	it is hugely confusing	16:27
*** dviroel|lunch is now known as dviroel		16:28
noonedeadpunk	or you can just define neutron_provider_networks like here https://opendev.org/openstack/openstack-ansible-os_neutron/src/branch/master/defaults/main.yml#L392-L399 in user_variables and forget about openstack_user_config :D	16:28
jamesdenton	or that :)	16:28
noonedeadpunk	(in terms of vxlan/vlan/flat nets)	16:28
jamesdenton	1,000 ways... to shoot yourself in the foot	16:29
noonedeadpunk	yup, seems we tried our best to make it confusing...	16:29
noonedeadpunk	and really huge part is just historical	16:29
noonedeadpunk	but we indeed need to review our docs	16:30
jamesdenton	nixbuilder i think what we're saying is, that you need a VLAN dedicated to overlay traffic and that vlan interface can have an IP on it (the TEP) or the vlan interface could be in a bridge (i.e. br-vxlan) that has an IP on it (the TEP). Based on some logic, that IP will be automatically discovered and used for local_ip in neutron config files	16:30
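The "IP will be automatically discovered" step above can be illustrated with a short Python sketch. It assumes the discovery works by picking the host address that falls inside the tunnel network's CIDR, which is an assumption about the logic rather than a description of what OSA actually does; the function name and the example addresses are hypothetical.

```python
import ipaddress


def find_tunnel_address(host_addresses, tunnel_cidr):
    """Return the first host address inside tunnel_cidr, or None."""
    network = ipaddress.ip_network(tunnel_cidr)
    for addr in host_addresses:
        if ipaddress.ip_address(addr) in network:
            return addr
    return None


if __name__ == "__main__":
    # Hypothetical host with a management address and a TEP address.
    addresses = ["172.29.236.11", "172.29.240.11"]
    print(find_tunnel_address(addresses, "172.29.240.0/22"))
```

Under that assumption it does not matter whether the address sits on a plain VLAN interface or on a bridge such as br-vxlan, since only the address itself is matched.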
jamesdenton	noonedeadpunk agreed. it all comes back to docs <crying emoji>	16:31
jamesdenton	nixbuilder it may be easier to not define at all in openstack_user_config, and instead use what jrosser_ mentioned. You're then defining everything manually	16:32
noonedeadpunk	tbh what I like about metal deployments is how clean your openstack_user_config is....	16:33
nixbuilder	Thanks everyone... I'll put on another pot of coffee and digest all this. Thanks again.	16:34
jrosser_	https://github.com/openstack/openstack-ansible-os_neutron/blob/36a2f02561b9281ee7e46287601f2d21a7fbc142/defaults/main.yml#L385-L386	16:34
jamesdenton	sure, thanks for asking. it prods us to update the docs	16:34
jamesdenton	thats a bad default	16:34
jamesdenton	lol	16:34
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts stable/ussuri: Use legacy image retrieval for CentOS 7 https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/861744	16:36
jrosser_	i wonder if theres actually much point at all keeping the vxlan part of `provider_networks`	16:36
jrosser_	maybe it's needed for feeding into OVS or something, i don't know	16:36
jamesdenton	well, i think the only benefit is the range logic.	16:36
jamesdenton	we'd have to push that into a separate var or doc or whatever	16:37
jrosser_	but it could instead be just two obviously named vars	16:37
jrosser_	the address and the range	16:37
jamesdenton	sure. we're gonna find out here shortly	16:37
jamesdenton	i never liked all of those being "provider" networks, anyway. confusion.	16:38
jrosser_	imho it sort of turns into "container networks" which really is what you might want this for, wiring up the control plane	16:38
jamesdenton	true	16:38
jrosser_	the neutron-ness of it is somehow making it conceptually really hard	16:39
jrosser_	but then also we do use it to create OVS bridges with the right settings?	16:39
jamesdenton	not in this case, that only applies to vlan	16:39
jrosser_	:)	16:39
jamesdenton	for vxlan, since the ip logic is separate, i think it's only the container wiring (not relevant) and range stuffs. So maybe new vars is the way to go, but then also need to keep the provider_networks override mechanism there, too.	16:41
* noonedeadpunk is also confused by now		16:42
jamesdenton	for vlan, yeah, i think we use container_bridge for building the ovs bridge	16:42
noonedeadpunk	I think I need to get mnaio admin1 promoted to really have a good play with all options to come up with what we can drop and how simplify	16:43
jamesdenton	agreed	16:43
jrosser_	i dont specify any of this at all btw	16:43
noonedeadpunk	We do have neutron_provider_networks	16:44
jamesdenton	you're using neutron_provider_networks?	16:44
jrosser_	let me look	16:44
noonedeadpunk	but I see lot's of crap in openstack_user_config as well for $reason	16:44
jamesdenton	yeah, i guess that was intended as the one-stop-shop for abstracting this stuff out? but these days specifying things in both places gets confusing	16:45
jrosser_	https://paste.opendev.org/show/bQFDXRzwYupWNh14989p/	16:46
jamesdenton	i see	16:46
jrosser_	its the same interface name everywhere - except where it's not	16:46
noonedeadpunk	yeah, it's yet another option I'd say	16:46
jamesdenton	well, we never defined the interface, anyway	16:46
jamesdenton	i think we're relying on matching the CIDR?	16:46
jrosser_	so its easier to just not bother trying to do it in o_u_c and just do that in group_vars instead	16:46
noonedeadpunk	yeah, I tend to agree here	16:48
jrosser_	there might have been a neater way to do that, but if you have non uniformity then the provider_networks thing is very hard to use	16:48
jrosser_	imho variables are better	16:48
jrosser_	because you can have them in user_variables globally -> uniform deployment	16:48
noonedeadpunk	but again, why would you have non uniformity given you can name interface in netplan/systemd-networkd	16:48
jrosser_	or you can put them wherever you need in group_vars	16:48
noonedeadpunk	But yes, provider_networks is quite complex/unobvious for neutron usecase	16:49
jrosser_	naming interfaces is hard	16:50
noonedeadpunk	so I'd rather place them wherever else, and leave provider_networks only for containers usecase	16:50
jrosser_	as you need to have the mac of everything recorded somewhere to deploy those names	16:50
noonedeadpunk	well... In maas and ironic you have I believe?	16:50
jamesdenton	so, yank all of the neutron-specific "provider_networks'" from o_u_c and direct folks (via docs) to using neutron_provider_networks override? either as group or host vars? or global?	16:51
jrosser_	and fun times as the behaviour is surprising between focal/jammy and focal/focal+HWE kernel	16:51
noonedeadpunk	jamesdenton: I'd say yes?	16:51
jamesdenton	i just went thru this exercise yesterday... real PITA. (interface naming).	16:51
jrosser_	like PXEboot into focal, mess with the interfaces, install HWE kernel, reboot -> WTF	16:51
jamesdenton	noonedeadpunk i think that's fair. keep the logic for upgraded deployments but remove the doc examples	16:51
noonedeadpunk	yeah, fair	16:51
jrosser_	jamesdenton: if you have good ideas about interface naming would be interesting	16:52
jrosser_	we talked about it here this week off the back of nvidia/mlx changing theirs again	16:52
noonedeadpunk	I totally want to get rid of br-vlan/br-vxlan naming....	16:52
jrosser_	but did not have a great plan	16:52
noonedeadpunk	but indeed seems that bridge is still really consistent in terms of naming...	16:53
jamesdenton	pfft. seems we had been relying on biosdevname but our recent deploys don't have it. Just had to implement *.link files based on driver and using PATH as the name. Names are obnoxious, but consistent. But even between drivers you can have mild variance in the same slot (ens1f0s0 vs ens1f0s0np0) or something like that	16:53
jamesdenton	yes, i think eliminating those bridges is a Good Thing™	16:54
jrosser_	yes that np0 thing is what caught us out	16:54
jrosser_	theres now np<N> and nv<N> or smth for PF vs. VF	16:54
jamesdenton	but it does add consistency. i don't like br-ex -> br-vlan -> bond1 though	16:54
noonedeadpunk	true... that's why they're still there.	16:55
noonedeadpunk	I feel physical pain for ppl who have br-vlan.100 added to neutron-lxb bridge though...	16:55
jamesdenton	jrosser_ i have no good ideas, just complaints.	16:56
jrosser_	understood - we decided to do nothing and just fix everything up for the new names across focal->jammy	16:56
jrosser_	there was no good answer	16:56
jamesdenton	noonedeadpunk Bridgeception	16:57
jamesdenton	non-persistent persistent naming. gotta love it.	16:57
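A minimal sketch of the *.link approach jamesdenton describes above (match on driver, name by path), written as a small Python helper that renders the unit text. The driver name mlx5_core, the 10- file prefix and both function names are illustrative assumptions, not taken from the discussion; Driver= and NamePolicy= are standard systemd.link keys.

```python
def render_link_file(driver: str, name_policy: str = "path") -> str:
    """Render a systemd.link unit that applies one naming policy
    to every NIC bound to the given kernel driver."""
    lines = [
        "[Match]",
        f"Driver={driver}",
        "",
        "[Link]",
        f"NamePolicy={name_policy}",
    ]
    return "\n".join(lines) + "\n"


def link_file_name(driver: str, priority: int = 10) -> str:
    """Build a file name that sorts before the distribution default
    (99-default.link), so this unit is matched first."""
    return f"{priority:02d}-{driver}.link"


if __name__ == "__main__":
    # Hypothetical driver name, for illustration only.
    print(link_file_name("mlx5_core"))
    print(render_link_file("mlx5_core"))
```

The rendered text would go under /etc/systemd/network/. This only pins which naming policy is used per driver; it does not by itself remove the np0-style variance between drivers mentioned above.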
mgariepy	i rename all the interfaces, this way i can upgrade the OS without getting a surprise name ;p	17:07
jamesdenton	so, i was under the impression that in order to rename an interface in netplan it first had to be identified as something (ie. rename ens1f0 -> management) but that if the interface came up as ens1f0np0 first, then it wouldn't work? Maybe you then also have to specify something more specific (like MAC?)	17:08
mgariepy	i use MAC	17:09
mgariepy	but i guess it would be too much work to do it via ansible for physical hw.	17:10
jamesdenton	then i guess you just have to be conscious of a chassis swap or nic swap or something? maybe not too common	17:10
jamesdenton	i think some of the tech debt we have in OSA is trying to be too clever	17:10
mgariepy	i don't swap nic often.	17:10
mgariepy	a couple of years back netplan was not too great about it either.. when matching on the mac with vlans added, it was trying to rename the vlan interface as well ...	17:12
mgariepy	fun times.	17:12
jamesdenton	seems to have matured a bit	17:13
mgariepy	yeah it works ok now.	17:13
opendevreview	Merged openstack/openstack-ansible-os_rally stable/ussuri: Move rally details to constraints https://review.opendev.org/c/openstack/openstack-ansible-os_rally/+/861730	17:15
opendevreview	Merged openstack/openstack-ansible-lxc_hosts stable/ussuri: Use legacy image retrieval for CentOS 7 https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/861744	17:15
mgariepy	https://netplan.io/reference#common-properties-for-physical-device-types	17:19
mgariepy	you can match by driver also.	17:20
mgariepy	but i do have my full inventory with macs most of the time so it's quite easy for me to set it up with renaming :D	17:21
mgariepy	with interface renaming i get some consistency also.	17:23
mgariepy	all my servers have the first 25G interface named 25G-1, whatever pci slot/brand it is.	17:25
mgariepy	you can probably get the info from setup module, and filter by speed/type and so on.	17:26
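A rough sketch of what mgariepy outlines above: filter interfaces by speed/type, then emit a netplan fragment that matches each one by MAC and renames it with set-name. The input is assumed to be a list of dicts shaped like per-interface facts (device, macaddress, speed, type); that shape, the sort order, and the function names are assumptions for illustration. match, macaddress and set-name are the netplan keys from the reference linked above.

```python
IFNAME_MAX = 15  # Linux interface names are limited to 15 characters


def pick_interfaces(nics, speed, kind="ether"):
    """Keep interfaces of the wanted speed (Mb/s) and type that expose a MAC,
    sorted by their current device name (an arbitrary but stable order)."""
    picked = [
        nic for nic in nics
        if nic.get("type") == kind
        and nic.get("speed") == speed
        and nic.get("macaddress")
    ]
    return sorted(picked, key=lambda nic: nic["device"])


def netplan_rename(nics, prefix):
    """Render a netplan fragment that renames each NIC to <prefix>-<n> by MAC."""
    lines = ["network:", "  version: 2", "  ethernets:"]
    for idx, nic in enumerate(nics, start=1):
        name = f"{prefix}-{idx}"
        if len(name) > IFNAME_MAX:
            raise ValueError(f"interface name too long: {name}")
        lines += [
            f"    {name}:",
            "      match:",
            f"        macaddress: \"{nic['macaddress']}\"",
            f"      set-name: {name}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up inventory data, for illustration only.
    inventory = [
        {"device": "ens1f1np1", "macaddress": "aa:bb:cc:00:00:02", "speed": 25000, "type": "ether"},
        {"device": "ens1f0np0", "macaddress": "aa:bb:cc:00:00:01", "speed": 25000, "type": "ether"},
        {"device": "eno1", "macaddress": "aa:bb:cc:00:00:03", "speed": 1000, "type": "ether"},
    ]
    print(netplan_rename(pick_interfaces(inventory, 25000), "25G"))
```

Matching on MAC keeps the name stable whatever the kernel calls the device at boot, at the cost of updating the inventory after a NIC or chassis swap. VLAN interfaces stacked on a renamed NIC usually share its MAC, so the vlan behaviour noted above is worth re-checking on the netplan version in use.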
spatel	I have a question folks, by mistake i have deleted the monitoring user in galera and now haproxy is saying mysql is down	17:54
spatel	How do i quickly add the monitoring user back ?	17:54
spatel	what does that monitoring user have to do with haproxy?	17:58
jamesdenton	might be as easy as: CREATE USER 'monitoring' IDENTIFIED BY '{{ galera_monitoring_user_password }}';	17:59
admin1	you can do grant connect,select on *.* to monitoring@'%' identified by 'password from secrets'	17:59
jamesdenton	https://github.com/openstack/openstack-ansible-galera_server/blob/5200b50cf650fb5ad5e0733b9e0ead207dbf6c6a/vars/main.yml#L31-L51	17:59
admin1	well, you can just add quickly and later fix the specific permissions	17:59
spatel	jamesdenton i didn't find any password in /etc/openstack_deploy/user_secrets.yml	18:00
admin1	spatel the pass could also be in the haproxy cfg	18:01
spatel	Nothing here - cat /etc/openstack_deploy/user_secrets.yml | grep galera_monitoring_user_password	18:01
jamesdenton	ok hmm	18:02
spatel	nothing in haproxy.cfg file	18:02
mgariepy	in the clustercheck script inside the galera container	18:04
spatel	This is bizarre.. :(	18:04
jamesdenton	looks like a fairly recent addition i guess	18:04
spatel	This is the script - https://paste.opendev.org/show/bYkLV1m5l2ZjrlVjObsa/	18:06
spatel	No password there, maybe it uses the mysql root password?	18:07
spatel	MYSQL_PASSWORD="${2-}" ?	18:07
jamesdenton	maybe theres no password?	18:07
jamesdenton	i think that's a hash	18:08
anskiy	there is a password, I can see it in user_secrets	18:08
opendevreview	Merged openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794	18:08
spatel	anskiy why is my user_secrets not showing it	18:08
jamesdenton	https://github.com/openstack/openstack-ansible/commit/302c8226e6ea51d9e0c76050b470d525dfb33d60	18:08
spatel	anskiy what is the name in user_secrets?	18:09
jamesdenton	see the release note? i think it implies there was no password	18:09
anskiy	spatel: I think during the upgrade to Y, when you should have run the script to check for missing secrets, you'd have had to add one	18:09
jamesdenton	You can also override variable to	18:09
jamesdenton	``galera_monitoring_user_password: ""`` to not use password for auth and	18:09
jamesdenton	preserve previous behaviour	18:09
jamesdenton	maybe just start with creating the monitoring user w/ no pass and see if that works	18:10
spatel	Now what should i do to bring it back quickly to fix my production :(	18:10
spatel	I didn't know it's so important	18:10
anskiy	spatel: `galera_monitoring_user_password` is the variable name in my user_secrets	18:11
spatel	anskiy i don't have that in my user_sec file	18:11
jamesdenton	spatel what OSA version? anskiy what OSA version?	18:11
spatel	Wallaby 23.3.0	18:12
anskiy	jamesdenton: I've added it when I've upgraded to Y (that was stable/yoga at the time)	18:12
anskiy	oh, you're on W, nvm then, I guess...	18:12
jamesdenton	right, so Wallaby wouldn't have that logic	18:12
jamesdenton	spatel create a monitoring user with no password in galera	18:13
jamesdenton	that ought to do it	18:13
spatel	let me do ...	18:13
spatel	done! - CREATE USER 'monitoring' IDENTIFIED BY '';	18:14
spatel	Looks like that fixed my issue jamesdenton	18:15
jamesdenton	good deal	18:15
spatel	haproxy is happy now	18:15
spatel	Thank you so much jamesdenton	18:16
spatel	This is total mess :)	18:16
jamesdenton	as anskiy mentioned, once you upgrade you may need to update secrets to add that var, as a password will be required	18:16
jamesdenton	^^ release notes ought to cover this	18:16
admin1	^^ yes .. release note was there	18:17
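To summarize the fix above as a hedged sketch: a small Python helper that builds the statement spatel ran, with the password left empty for releases before the secret existed (Wallaby here) and filled in from galera_monitoring_user_password on releases where the release note says one is expected. The function name, the explicit '%' host and the refusal of quotes/backslashes are assumptions of this sketch, not something taken from the galera_server role.

```python
def monitoring_user_sql(password: str = "", user: str = "monitoring", host: str = "%") -> str:
    """Build a CREATE USER statement for the galera monitoring user.

    An empty password reproduces the no-password behaviour discussed above.
    Values containing quotes or backslashes are rejected rather than escaped,
    to keep this sketch simple and safe.
    """
    for value in (password, user, host):
        if "'" in value or "\\" in value:
            raise ValueError("quotes and backslashes are not supported in this sketch")
    return f"CREATE USER '{user}'@'{host}' IDENTIFIED BY '{password}';"


if __name__ == "__main__":
    # No-password variant, equivalent to the statement used in the log
    # ('monitoring' with no host part defaults to '%').
    print(monitoring_user_sql())
    # Password variant; the value here is a placeholder, not a real secret.
    print(monitoring_user_sql("example-secret"))
```

After an upgrade to a release that expects the secret, the user would need its password set to match the value in user_secrets, as the release note quoted above indicates.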
admin1	spatel, still have ovn on prod ?	18:18
admin1	any issues so far ?	18:18
spatel	Yes i am running ovn in production without any issue.	18:18
Adri2000	hello jrosser_, I filed this bug about an issue with ansible-role-pki, would appreciate your input when possible :) thanks https://bugs.launchpad.net/openstack-ansible/+bug/1993575	19:00
Adri2000	I took the opportunity to close this old (2016) related wishlist bug that's actually fixed since the introduction of openstack_host_ca_certificates: https://bugs.launchpad.net/openstack-ansible/+bug/1649844 (apparently I'm also the reporter of this bug report!)	19:05
jrosser_	Adri2000: the run_once certainly looks like it could be an issue	19:08
jrosser_	which playbook would you be expecting to install this CA for you?	19:08
jrosser_	either playbooks/containers-lxc-create.yml or playbooks/openstack-hosts-setup.yml i expect	19:10
jrosser_	if you are able to test it out removing that run_once it would be helpful - even better a patch :)	19:10
Adri2000	jrosser_: playbooks/containers-lxc-create.yml, as I'm targeting the Keystone containers. for now the workaround I used is to run the playbook limited to Keystone containers only, so the task will run_once on a Keystone container and therefore will have the variable correctly set. I guess removing run_once should work, but I'll test to be sure, and can push it as a patch	19:15
Adri2000	then	19:15
jrosser_	seems i used rather too much run_once in the PKI role	19:15
jrosser_	Adri2000: also take a look in vars/main.yml of the PKI role - there are other ways in there to provide your own certs in as many variables as you like	19:18
Adri2000	interesting, didn't know that	19:28
*** dviroel is now known as dviroel|biab		20:47
*** dviroel|biab is now known as dviroel		21:59
*** dviroel is now known as dviroel|out		23:03
