art | hi all | 05:50 |
---|---|---|
art | we have installed OSA (Zed) + OVN on 3 controllers and 2 compute nodes (called compute01 and compute03) | 05:50 |
art | ovn-northd and neutron-server are installed in containers (on the infra hosts), and the OVN gateways have been set up on the compute nodes. | 05:50 |
art | https://www.irccloud.com/pastebin/lSSoji45/user%20variable | 05:51 |
art | Also, this is our config of the VLAN (br-provider) and tenant (br-vxlan) networks: | 05:52 |
art | https://www.irccloud.com/pastebin/qyTylsZm/openstack_user_config | 05:52 |
art | On our compute nodes, before deploying OSA, we had created br-provider (using OVS in netplan). Then we let OSA install the compute nodes. | 05:52 |
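(For reference, a minimal netplan sketch of creating br-provider as an OVS bridge outside OSA; the interface name and file path are hypothetical and may differ from the pasted config.)

```yaml
# /etc/netplan/60-ovs-provider.yaml -- hypothetical sketch
network:
  version: 2
  ethernets:
    eno2: {}                # physical uplink for the provider network
  bridges:
    br-provider:
      interfaces: [eno2]
      openvswitch: {}       # render this bridge via OVS rather than a Linux bridge
```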
art | In this scenario, sometimes some of the instances had no access to the external network | 05:52 |
art | Then I added another compute node (called compute02). The only difference between compute02 and the other computes is that I did not create br-provider manually, and let OSA create it. | 05:53 |
art | Currently, the routers of the projects whose first chassis priority is compute02 have access to the external network, but the others do not. | 05:53 |
art | For example: in the following screenshot, chassis_name “75c7ce80-…” is our compute02, and it has access to the external net. | 05:53 |
art | As the default configuration of OVN seems to follow the “Centralized Floating IP” architecture, I tried stopping the ovn-controllers on compute01 and compute03, expecting the router on compute02 to route the traffic of the projects whose first chassis priority is not compute02, such as this one (the first priority of this router is the chassis ‘0d3c2a0a-…’, which is our compute01): | 05:54 |
art | It seems I can not send screenshots here! :P | 05:55 |
art | anyway, stopping the ovn-controllers on all three compute nodes did not have any effect on the instances' access to the external network. It seems the OVN gateway cannot route the traffic to its second- and third-priority gateway chassis. | 05:55 |
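(The gateway chassis priorities being discussed can be inspected from any host with access to the OVN databases; a sketch, with a hypothetical router port name:)

```sh
# List all registered chassis and their port bindings
ovn-sbctl show

# Print the gateway chassis of a router's external port, sorted by priority
ovn-nbctl lrp-get-gateway-chassis lrp-<external-port-uuid>
```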
art | Not sure if I am making a mistake, or something is wrong with our config. | 05:58 |
art | Do you have any idea what’s wrong over here? | 05:58 |
jrosser | does your use of *controller_hosts and *compute_hosts in user_variables actually do what you expect? | 05:59 |
jrosser | what I mean is, where are those defined, and how do you think ansible loads the values? | 06:00 |
art | we have defined them in openstack_user_config: | 06:14 |
art | https://www.irccloud.com/pastebin/RDfVaDm7/ | 06:15 |
jrosser | ok, so openstack_user_config is the input to the OSA dynamic inventory which creates any required ansible groups and so on | 06:33 |
art | exactly | 06:34 |
jrosser | user_variables sets ansible vars that you use to override things in group_vars / host_vars or ansible role defaults | 06:34 |
jrosser | so defining network_northd-hosts in user_variables won’t define an ansible host group, nor is *compute_hosts an ansible variable that you can refer to outside openstack_user_config | 06:36 |
jrosser | I think it looks like you are trying to define host groups in user_variables, which won’t work | 06:38 |
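(For context: the `*compute_hosts` syntax is a plain YAML alias, resolved by the YAML loader within the one file that defines the matching anchor, never by Ansible. A minimal sketch of how such anchors are typically used in openstack_user_config.yml; host names and IPs are hypothetical:)

```yaml
compute_hosts: &compute_hosts
  compute01:
    ip: 172.29.236.11
  compute03:
    ip: 172.29.236.13

# The alias makes the OVN gateway group identical to compute_hosts
network-gateway_hosts: *compute_hosts
```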
art | oh, I'm sorry, I wrote it incorrectly here! The ovn groups have been defined in openstack_user_config, not user_variables. I just made a mistake in writing them here | 06:39 |
art | the config related to ovn in user_variables is just these lines: | 06:39 |
art | https://www.irccloud.com/pastebin/aQmtD267/ | 06:40 |
jrosser | anyway I think you need ovn-controller running on all chassis regardless of whether they are gateways or not | 06:41 |
art | nothing else related to OVN is defined in user_variables. All other configs have been defined in openstack_user_config, exactly as you mentioned | 06:41 |
art | it means I need to have the ovn-controller daemon on all of my compute nodes, which I currently do | 06:42 |
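(A quick sketch for verifying that every compute has registered as a chassis, run wherever ovn-sbctl can reach the southbound DB:)

```sh
# Each compute should appear with the hostname neutron expects
ovn-sbctl --columns=name,hostname list Chassis

# And on each compute node itself
systemctl status ovn-controller
```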
art | Is there anything else that should be considered? | 06:43 |
jrosser | sadly I have very limited experience with OVN | 06:43 |
jrosser | if your compute02 works reliably then this maybe points to some difference in how br-provider was created between netplan and letting OSA do it | 06:44 |
art | that's ok. Thank you very much for taking the time to help me with this issue. really appreciate that | 06:45 |
art | yeah, it seems so. I'll try changing all the other computes to this config for br-provider and let you know the result :) | 06:45 |
art | Thanks again | 06:45 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible stable/2023.1: haproxy: fix health checks for serialconsole in http mode https://review.opendev.org/c/openstack/openstack-ansible/+/891452 | 07:05 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Remove deprecated variables https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891567 | 08:03 |
jrosser | arxcruz: ^ looking at codesearch this might be a breaking change, but I am assuming that the design error mentioned in the commit message here has been addressed https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/884172 | 08:05 |
noonedeadpunk | btw https://review.opendev.org/q/topic:osa%252Fopenstack_resources is another breaking thing | 08:21 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Add HTTP/2 support for frontends/backends https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/891572 | 08:37 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Add HTTP/2 support for frontends/backends https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/891572 | 09:21 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Enable HTTP/2 for TLS-covered frontends https://review.opendev.org/c/openstack/openstack-ansible/+/891575 | 09:31 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Update haproxy healthcheck options https://review.opendev.org/c/openstack/openstack-ansible/+/887285 | 09:36 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Rename includelist/excludelist file path vars https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891578 | 10:00 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Allow include/exclude lists to be defined in many variables https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891579 | 10:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/zed: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891447 | 11:05 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/2023.1: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891279 | 11:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant master: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/888985 | 11:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/2023.1: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891279 | 11:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/zed: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891447 | 11:07 |
jrosser | i don't have much luck at all with tempest basic server ops on a baremetal AIO :/ | 11:38 |
jrosser | lots of neutron failed port binding | 11:38 |
noonedeadpunk | I've just spawned one. | 11:48 |
noonedeadpunk | Even with Octavia | 11:48 |
jrosser | the tempest vars patches seem to be working | 11:50 |
jrosser | but i backed out to only basic server ops as it was all very breaking | 11:50 |
jrosser | i see this https://paste.opendev.org/show/bWJPQOl7ddixcg6TXrKp/ | 11:50 |
jrosser | but then because of OVN I'm a bit lost about what to look at next | 11:51 |
noonedeadpunk | I actually do recall seeing that in CI, but it was solved somehow... | 11:52 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Ensure test exclusion file is removed when there are no exclusions https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891586 | 11:52 |
jrosser | like there's no neutron agent any more to look in the logs of | 11:52 |
noonedeadpunk | Usually neutron-server and then ovn-northd are good starting points | 11:53 |
noonedeadpunk | Ah. I think it was a hostname mismatch when I saw it in CI | 11:53 |
noonedeadpunk | the one defined for ovsdb did not match what neutron expected | 11:53 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-os_neutron/commit/74b0884fc232aa96f601b4c24c3e36f3fba4f1bb | 11:54 |
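(A sketch of how such a mismatch can be checked; external_ids:hostname is the key ovn-controller uses, when set, to name the chassis:)

```sh
# On the compute node: the hostname the chassis registers with
ovs-vsctl get Open_vSwitch . external_ids:hostname
hostname -f

# Compare against the hosts neutron knows about
openstack network agent list
```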
jrosser | weirdly, I can manually boot a cirros and apparently attach it to one of the leftover tempest networks | 11:54 |
noonedeadpunk | Actually, we likely can revert this now... | 11:54 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891453 | 11:57 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Do not override tempest_plugins for AIO scenarios https://review.opendev.org/c/openstack/openstack-ansible/+/891587 | 11:57 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Do not override tempest_plugins for AIO scenarios https://review.opendev.org/c/openstack/openstack-ansible/+/891587 | 11:58 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891454 | 11:59 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891453 | 12:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891453 | 12:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Adopt for usage openstack_resources role https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/889879 | 13:26 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_gnocchi master: Use proper galera port in configuration https://review.opendev.org/c/openstack/openstack-ansible-os_gnocchi/+/890100 | 13:33 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_aodh master: Use proper galera port in configuration https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/890093 | 13:34 |
noonedeadpunk | to unblock adjutant role, we'd need to start merging from Zed: https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891447 | 13:36 |
noonedeadpunk | otherwise upgrade jobs are not happy | 13:36 |
spatel | Folks! Do you have any suggestion of good JBOD card for Dell server? I am looking to use for Ceph. | 13:46 |
jrosser | LSI Logic/Broadcom HBAs have been OK for us | 13:51 |
jrosser | spatel: ^ you mean like SAS adaptor to connect a JBOD chassis? | 13:51 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant master: Stop reffering _member_ role https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891462 | 14:01 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Add openstack_resources role skeleton https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/878794 | 14:17 |
spatel | jrosser, what do you mean by JBOD chassis? | 14:22 |
jrosser | spatel: you've not been very specific - i assumed you meant you had some disk chassis to add to a dell server | 14:23 |
spatel | I have a Dell server with a RAID controller that doesn't have a JBOD mode. I have two options: configure each drive as a separate RAID0 and expose it to the OS, or buy a new card which has a JBOD mode. | 14:23 |
spatel | I have 4x 6TB disks on each Dell server for OSDs | 14:24 |
jrosser | the firmware on some dell controllers can be reflashed to IT mode | 14:25 |
jrosser | on my r730 i can configure the drives as "non raid" and they just get passed through to the OS | 14:26 |
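(A hedged sketch for confirming from the OS that "non raid" drives really are passed through rather than wrapped in single-disk virtual disks; device names are examples:)

```sh
# TRAN should show sata/sas for passed-through drives
lsblk -o NAME,SIZE,TYPE,TRAN

# Should report the drive's own vendor/model, not a RAID LUN
smartctl -i /dev/sda
```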
spatel | Oh!! I see | 14:32 |
spatel | Do you have dedicated mon nodes, or are they running on the openstack controllers? | 14:33 |
spatel | I have 5 ceph nodes and am thinking of running the controller node on the OSD nodes :( | 14:33 |
spatel | mon is very lightweight and only requires processing when an OSD or any node is down | 14:34 |
noonedeadpunk | usually having mon/mgr on osd is not great | 14:34 |
noonedeadpunk | as then the upgrade process is quite cumbersome. | 14:35 |
noonedeadpunk | or well, maybe with cephadm it's not a concern anymore | 14:35 |
jrosser | i have dedicated mon nodes in that they are in LXD on some other hosts sharing with other stuff | 14:36 |
noonedeadpunk | yeah, you totally should have some kind of isolation between mon and osd services if you decide to co-locate them on the same hardware | 14:36 |
jrosser | but they are in a totally separate failure domain from the OSD | 14:36 |
spatel | In cephadm they are kind of isolated anyway in containers | 14:40 |
noonedeadpunk | also rgw/mds are not _that_ lightweight in case you've ever considered using them | 14:41 |
spatel | I also have a question related to the OS: which one is better for ceph, CentOS vs Ubuntu? | 14:41 |
* jrosser has waaay more hardware for rgw than mons | 14:41 |
jrosser | like 10x | 14:41 |
spatel | I heard podman is the best container system for Ceph, so I chose the CentOS distro | 14:42 |
noonedeadpunk | Well, I guess it would highly depend on the use case and traffic served | 14:42 |
noonedeadpunk | but cephadm will deploy centos containers anyway? | 14:42 |
noonedeadpunk | there's no selection from what I heard? | 14:43 |
noonedeadpunk | and dockerhub contains pretty outdated images, so it's also pretty much only quay.io that should be used | 14:44 |
spatel | what do you mean by depending on the use case or traffic? | 14:51 |
spatel | containers are just for the control plane, right? | 14:51 |
noonedeadpunk | I think pretty much everything runs in containers? Including OSDs? | 14:52 |
jrosser | that was probably to do with my very large number of rgw | 14:52 |
noonedeadpunk | not 100% sure though | 14:52 |
spatel | I love CentOS but am just worried that IBM will mess with it again in the future :) | 14:54 |
noonedeadpunk | but there're no other options with cephadm? | 14:54 |
noonedeadpunk | yes, you can run containers pretty much anywhere, but the containers themselves will have centos anyway? | 14:55 |
noonedeadpunk | just check their manifests: https://quay.io/repository/ceph/ceph/manifest/sha256:11e9bcdd06cfd4f6b0fef3da5c2a3f0f87d7a48c043080d6f5d02f7010ad72eb | 14:58 |
spatel | Yes! Ubuntu requires a dedicated repository to run podman | 14:58 |
noonedeadpunk | 17.2.6 running on CentOS 8 Stream | 14:59 |
spatel | CentOS does it natively | 14:59 |
jrosser | no wonder there are nervous people tbh | 14:59 |
spatel | I don't trust IBM tbh :) | 15:00 |
noonedeadpunk | or well: https://github.com/ceph/ceph-container/tree/main/ceph-releases/quincy | 15:00 |
spatel | They are business people and they can change policy with the stroke of a pen | 15:00 |
noonedeadpunk | this is the repo these containers are built from - rhel/ibm/centos | 15:01 |
noonedeadpunk | and not only can they - they have already done that lately | 15:02 |
spatel | Let's stick with Ubuntu then... to hell with CentOS | 15:02 |
noonedeadpunk | but as I said - you still have CentOS inside images :) | 15:03 |
noonedeadpunk | or better say - inside containers | 15:03 |
noonedeadpunk | so as long as you're running cephadm it doesn't really matter what will be on host | 15:05 |
spatel | noonedeadpunk No... I have deployed cephadm and it's running ubuntu-based containers | 15:06 |
spatel | https://paste.opendev.org/show/bOZDh0p3sTj62acovqxh/ | 15:06 |
noonedeadpunk | but uname will show the host kernel | 15:06 |
noonedeadpunk | as containers are not virtualizing that | 15:07 |
jrosser | try some dnf command :) | 15:07 |
noonedeadpunk | what's `cat /etc/*release*`? | 15:07 |
spatel | CentOS Stream release 8 | 15:08 |
opendevreview | Merged openstack/openstack-ansible master: Update haproxy healthcheck options https://review.opendev.org/c/openstack/openstack-ansible/+/887285 | 15:08 |
spatel | wtf... | 15:08 |
jrosser | spatel: this is how containers work, right? | 15:09 |
jrosser | there's some $you-shouldnt-have-to-care rootfs / runtime and it uses the host kernel | 15:10 |
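(A small sketch illustrating this on a cephadm host, assuming a standard install: the rootfs inside the container is CentOS, while the kernel is the host's:)

```sh
# Inside a ceph container: the image's distro (CentOS Stream)
sudo cephadm shell -- cat /etc/os-release

# On the host and inside the container alike: the same host kernel
uname -r
```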
spatel | +1 | 15:10 |
jrosser | same is true for LXC/LXD | 15:10 |
jrosser | it's completely trivial to run another OS in LXC/LXD on ubuntu or whatever | 15:11 |
jrosser | it's just an OSA thing where we take the choice to tightly couple the OS in the container to the host | 15:11 |
noonedeadpunk | Well, we technically could de-couple that as well... But then we'd need to pull in container images from linuxcontainers.org rather than building them | 15:12 |
spatel | currently we have ceph + cinder but in the future we want to set up NFS storage | 16:09 |
spatel | cinder with NFS requires mounting the NFS volume on all compute nodes, correct? | 16:10 |
noonedeadpunk | yup | 16:16 |
noonedeadpunk | which ends up in a need to reboot all computes once NFS becomes unreachable | 16:17 |
spatel | holy crap! | 16:18 |
spatel | why not a lazy unmount? | 16:18 |
spatel | Does cinder create an LVM volume on top of the NFS share and give it to VMs? | 16:19 |
spatel | Trying to understand how an NFS share plugs into VMs :) | 16:20 |
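(For context: the cinder NFS driver uses no LVM at all; it mounts the share and stores each volume as a plain file, raw or qcow2, which qemu on the compute node opens directly once the same share is mounted there. A minimal cinder.conf sketch with a hypothetical backend name and share-list path:)

```ini
[nfs-backend]
volume_driver = cinder.volume.drivers.nfs.NfsDriver
volume_backend_name = nfs
# one "host:/export" entry per line in this file
nfs_shares_config = /etc/cinder/nfs_shares
nfs_mount_point_base = /var/lib/cinder/mnt
```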
jonher | hi, I was running a deployment yesterday and finished it up today after hitting an issue with the ceph-rgw-keystone-setup playbook. I'm pretty sure this has worked in previous deployments on the same git branch/version when running the 'setup-openstack' playbook: https://github.com/openstack/openstack-ansible/blob/xena-em/playbooks/ceph-rgw-keystone-setup.yml#L17 | 16:31 |
jonher | here hosts would evaluate to "localhost" and execute there; then, as the vars_files load, "openstack_service_setup_host" will become the first utility container as defined in defaults/source_install.yml | 16:31 |
jonher | what I ended up doing was setting "openstack_service_setup_host" in user_variables.yml to the first utility container, just like source_install.yml does. Otherwise it would attempt to use /openstack/utility-<version>/bin/python on the osa-deployer host to run the task. | 16:31 |
jonher | But this seems like a bug? And why has it previously worked? Is the variable somehow available from a previous task when running setup-openstack.yml in one pass or similar? | 16:31 |
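(A sketch of the workaround jonher describes, pinning the setup host in user_variables.yml the same way defaults/source_install.yml does; the group name follows the usual OSA inventory and should be treated as an assumption:)

```yaml
# user_variables.yml -- hypothetical workaround sketch
openstack_service_setup_host: "{{ groups['utility_all'][0] }}"
```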
noonedeadpunk | jonher: huh, interesting. But the playbook is expected to load defaults/source_install.yml according to the code? Though now I'm not so sure whether ansible evaluates vars differently, i.e. including vars only after resolving hosts | 20:07 |
noonedeadpunk | as that would result in the behaviour you've mentioned | 20:07 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.1: Bump keystone version to unblock upgrade jobs https://review.opendev.org/c/openstack/openstack-ansible/+/891633 | 20:11 |
noonedeadpunk | though I wonder how that worked in CI, as I believe we run this code in the ceph job, and it's LXC-based, so localhost should not be a valid target... | 20:12 |
*** bjoernt_ is now known as bjoernt | 21:41 |