Thursday, 2023-03-02

opendevreview	Merged openstack/openstack-ansible-plugins master: Do not use openstack.osa.linear strategy plugin https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/874425	02:02
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Prepare service roles for separated haproxy config https://review.opendev.org/c/openstack/openstack-ansible/+/871189	03:58
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Bump OpenStack-Ansible Zed https://review.opendev.org/c/openstack/openstack-ansible/+/876028	04:28
opendevreview	Merged openstack/openstack-ansible stable/zed: Do not run dstat by default https://review.opendev.org/c/openstack/openstack-ansible/+/875608	06:28
jrosser	morning	09:05
noonedeadpunk	o/	09:12
noonedeadpunk	regarding your comment on 871189 - I have no idea about LE part as haven't touched it	09:13
jrosser	the comment is really about where the state comes from	09:14
noonedeadpunk	With zuul being overloaded it's impossible to understand if any change makes performance impact or not....	09:15
jrosser	in the previous iteration of this, in a deployment with no horizon - horizon_all would be empty list. the certbot backend setup in the haproxy playbook would never be removed because there were no play targets	09:15
noonedeadpunk	maybe we can change this condition then? https://review.opendev.org/c/openstack/openstack-ansible/+/871189/16/playbooks/os-horizon-install.yml#31	09:16
jrosser	in the new iteration, the haproxy role is always run against haproxy_all in its own play, so there needs to be logic in that playbook to decide if the `certbot` backend should be present/absent, depending on horizon_all | length	09:16
jrosser	ahha	09:17
jrosser	so it actually is conditional anyway then	09:17
noonedeadpunk	yep	09:17
noonedeadpunk	as we run against haproxy_all and not horizon_all	09:17
jrosser	yes indeed	09:17
noonedeadpunk	we just add one group to another so it's pretty much the same	09:18
noonedeadpunk	and I'm not sure if we want or not to configure all backends regardless	09:18
noonedeadpunk	I'd say not to save some execution time	09:19
jrosser	so maybe we should think about if it is good to have `haproxy_certbot_service` defined twice	09:19
noonedeadpunk	Also I think we might want to try to speedup dynamic_inventory by introducing threads there	09:19
noonedeadpunk	Well. haproxy-service-config.yml is now a playbook, so we can call it whenever we want and provide it whatever we want	09:20
jrosser	i think i might prefer to have `haproxy_certbot_service` defined just once in group_vars/haproxy/ and not also in horizon group vars	09:20
jrosser	then how an override works for that is super clear	09:20
noonedeadpunk	And have like haproxy_certbot_service_absent to call play one more time to remove it?	09:21
noonedeadpunk	or we can just |selectattr there	09:22
jrosser	well there is an `enable` condition here https://review.opendev.org/c/openstack/openstack-ansible/+/871189/17/playbooks/haproxy-install.yml#57	09:22
noonedeadpunk	aha	09:24
noonedeadpunk	so basically we should run same thing twice	09:24
noonedeadpunk	I tend to leave that to you and damiandabrowski to fix to be frank :D	09:24
jrosser	omg now i dont understand why there is `haproxy_certbot_service` and also `haproxy_letsencrypt_service`	09:26
jrosser	/o\	09:26
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Bump OpenStack-Ansible master https://review.opendev.org/c/openstack/openstack-ansible/+/876043	09:26
noonedeadpunk	ugh, now both yoga and xena seems to be broken due to rabbit on centos....	09:32
jrosser	the other thing is that now the haproxy play can see vars from both haproxy_all and "{{ service_group }}" the patch can probably be divided in half again	09:32
jrosser	it's no longer necessary to move the vars around at the same time as make the other changes, as either location is good	09:33
damiandabrowski	"in the new iteration, the haproxy role is always run against haproxy_all in its own play, so there needs to be logic in that playbook to decide if the `certbot` backend should be present/absent, depending on horizon_all | length"	09:36
damiandabrowski	i guess i need to spend some time trying to understand Dmitriy's changes as for now I have no idea what could have changed in terms of letsencrypt	09:37
jrosser	^ i had not noticed the when: horizon_all | length tbh	09:37
damiandabrowski	ok, so what problem do we have with LE for now? :D	09:37
jrosser	i have no idea really its too hard to understand :(	09:38
jrosser	for example why is there `haproxy_certbot_service` and also `haproxy_letsencrypt_service`	09:38
jrosser	really this is a super good example of why i kept asking for a series of short, easy to understand patches	09:38
jrosser	changes need to be written for the reviewer really	09:39
jrosser	each one being the smallest possible step to achieve something well described in the commit message	09:40
damiandabrowski	but we always had haproxy_certbot_service and haproxy_letsencrypt_service, it's not a new thing	09:40
jrosser	? https://codesearch.opendev.org/?q=haproxy_certbot_service&i=nope&literal=nope&files=&excludeFiles=&repos=	09:40
damiandabrowski	ah sorry, you may be right, so:	09:42
noonedeadpunk	well, it's not always possible I should say... With changes like that where we need to update all playbooks I'm not sure how to split that in smaller chunks...	09:42
damiandabrowski	haproxy_letsencrypt_service handles connections to certbot which listens on 8888	09:43
damiandabrowski	so if horizon is deployed, haproxy can redirect /.well-known requests to haproxy_letsencrypt_service	09:44
damiandabrowski	but if horizon is not deployed, haproxy_certbot_service listens on port 80 and redirects traffic to haproxy_letsencrypt_service	09:45
damiandabrowski	after horizon is deployed, haproxy_certbot_service is removed because it's no longer needed	09:46
damiandabrowski	maybe services names are confusing...	09:51
jrosser	i think that we should rename `haproxy_horizon_service` and decouple the deployment of it completely from horizon because actually it is the handling for all http/https traffic on port 80/443 regardless of if we have horizon/LE/security.txt	09:52
jrosser	that should just be one of the default services that haproxy deploys always	09:53
jrosser	and that then starts to open the door for making compute.example.com / dashboard.example.com etc etc rather than making all services on unique ports	09:54
jrosser	this then turns into a simplification, because theres only ever one thing handling port 80/443 for all cases	09:55
jrosser	^ renaming the service makes a simplification	09:55
damiandabrowski	but it's completely different approach, it doesn't go along with either point 1 or point 2 from my blueprint: https://opendev.org/openstack/openstack-ansible-specs/src/commit/30bbdf82c9df389c35c187aa9523e7e31d0c5b03/specs/antelope/separated-haproxy-service-config.rst	09:59
damiandabrowski	well... depending on how it's done	10:01
damiandabrowski	i can imagine that we may have something like "base" service, listening on 80/443 and forwarding traffic to other frontends based on URL etc.	10:02
jrosser	isnt that what i'm describing?	10:02
jrosser	anyway sorry i have meetings now	10:02
damiandabrowski	probably yes, i just needed a moment to process your msg :D	10:04
damiandabrowski	sounds interesting, noonedeadpunk what do you think?	10:04
noonedeadpunk	that sounds quite reasonable and we were discussing ability to have that kind of endpoints for a while now	10:06
noonedeadpunk	And blocker was partially on haproxy side	10:06
noonedeadpunk	But I'm not really sure how this service should look like	10:06
noonedeadpunk	As it should be able to understand when horizon/security.txt/certbot is available or needed and define ACLs based on that	10:07
damiandabrowski	yeah, but it sounds doable. I guess it will be a mix of URL/path based routing	10:14
damiandabrowski	it may help:	10:14
damiandabrowski	https://www.haproxy.com/blog/how-to-map-domain-names-to-backend-server-pools-with-haproxy/	10:15
damiandabrowski	https://www.haproxy.com/blog/path-based-routing-with-haproxy/	10:15
damiandabrowski	but yeah, it may be tricky for horizon, because we can't redirect traffic to horizon backend if it doesn't exist yet :|	10:18
damiandabrowski	but we can make the same trick as in there and just check in haproxy-install.yml if horizon is running: https://review.opendev.org/c/openstack/openstack-ansible/+/871189/17/playbooks/haproxy-install.yml#50	10:36
jrosser	damiandabrowski: regarding the blueprint - it has always been that it's not 100% achievable	10:39
damiandabrowski	when i mentioned the blueprint, i was thinking that you just want to move horizon service to haproxy_default_services	10:41
damiandabrowski	so forget it :D	10:41
admin10	hi damiandabrowski .. i want your help in something i am struggling with .. in haproxy :)	10:42
admin10	and i know you are working day-in and out in it	10:42
admin10	https://gist.githubusercontent.com/a1git/45b41d370f04af817cb5d592d52b307b/raw/40a1b024fc6c0406410b52ca88ce6692d89707f9/gistfile1.txt --- this explains what i am trying to do	10:42
admin1	i use a wildcard, and want to be able to do s3.domain.com on the same 443, but redirect it to ceph backend	10:44
admin1	to take it even further, if this works, i will try to do id.example.com for keystone, images.example.com etc images, volume.example.com for cinder etc	10:45
admin1	so that restrictive firewalls from companies where only 80/443 is allowed can still access the apis	10:45
jrosser	admin1: you asked this before and i showed you how the letsencrypt ACL works	10:47
jrosser	you should be able to use the same approach for other things as well	10:47
damiandabrowski	yup, it should work IMO. I posted two links to haproxy blog above, they may help you	10:48
jrosser	there is a place here that you can add ACLs for the horizon frontend https://github.com/openstack/openstack-ansible/blob/master/inventory/group_vars/haproxy/haproxy.yml#L239	10:49
admin1	but where do i specify the actual SNI name ? is haproxy_service_name the sni name ?	10:51
jrosser	you can also add arbitrary config to each frontend https://github.com/openstack/openstack-ansible-haproxy_server/blob/stable/zed/templates/service.j2#L86-L88	10:51
jrosser	you are trying to add the config lines "lines to add" from your gist?	10:52
admin1	yes	10:52
jrosser	what release of OSA is this	10:53
admin1	the issue i see is the bind .. when 443 binds to VIP address cloud.domain.com, would those acl work for s3.domain.com	10:53
admin1	26.0.1	10:53
admin1	the last tag we have .. zed	10:53
jrosser	does cloud.domain.com and s3.domain.com resolve to the same ip?	10:54
admin1	yes they do	10:54
jrosser	so there is no problem with binding	10:54
admin1	ok	10:54
jrosser	binding to a VIP is binding to an IP, not an FQDN	10:54
admin1	ok .. and the ssl used is a *.domain.com wildcard	10:55
jrosser	so.....	10:55
jrosser	this is where the horizon haproxy setup is defined, that handles port 80/443 https://github.com/openstack/openstack-ansible/blob/master/inventory/group_vars/haproxy/haproxy.yml#L224-L241	10:55
jrosser	that is what is generating `frontend horizon-front-1` in your gist - does that make sense?	10:56
admin1	looking into that, i need to add the 4 lines on haproxy_security_txt_acl as list ?	10:58
admin1	i am looking into the wrong line	10:59
jrosser	now look here https://github.com/openstack/openstack-ansible/blob/master/inventory/group_vars/haproxy/haproxy.yml#L587	10:59
jrosser	there is a variable already that you can use to merge new things into `haproxy_horizon_service`, it is called `haproxy_horizon_service_overrides`	11:00
jrosser	what you would then do is go and look in defaults/main.yml for the haproxy_server role, as that should be the documentation for what you can do	11:01
jrosser	and there we find a ready made example https://github.com/openstack/openstack-ansible-haproxy_server/blob/master/defaults/main.yml#L88-L89	11:02
jrosser	putting all of those things together would end up with something like this https://paste.opendev.org/show/b84UvW46rYtpYiZjmbo4/	11:03
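The routing idea discussed above (one bind on the VIP, backend chosen per request by hostname, fallback to horizon) can be sketched roughly as follows. This is only an illustrative Python model of the decision an haproxy ACL would make, not haproxy or OSA configuration; the backend names `ceph-rgw-back` and `horizon-back` and the helper `pick_backend` are hypothetical.

```python
from typing import Optional

# Hypothetical host -> backend map; the names are illustrative, not OSA defaults.
HOST_BACKENDS = {
    "s3.domain.com": "ceph-rgw-back",
}
# Assumed fallback when no host rule matches (the dashboard in this discussion).
DEFAULT_BACKEND = "horizon-back"


def pick_backend(host_header: Optional[str]) -> str:
    """Return the backend for a request, based only on its Host header."""
    if not host_header:
        return DEFAULT_BACKEND
    # Drop an optional ":port" suffix and lower-case the name,
    # since hostnames are case-insensitive.
    host = host_header.split(":", 1)[0].strip().lower()
    return HOST_BACKENDS.get(host, DEFAULT_BACKEND)
```

Because both names resolve to the same IP, the listener never needs to know the FQDN; only the per-request match does, which is why the bind itself is not a problem.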
damiandabrowski	jrosser: noonedeadpunk i prepared a static haproxy PoC with a support for letsencrypt, security.txt, custom routing based on URL and fallback to horizon if no ACL is matched.	11:46
damiandabrowski	https://paste.openstack.org/show/bhZzGGKwrfm5AuS23WkG/	11:46
damiandabrowski	so I actually like jrosser's idea, but considering that I have 2 more days before a vacation(I'm absent from 6th to 13th March) I'll focus now on recent "virtual groups" feature added by Dmitriy and come back to this PoC after vacation	11:47
noonedeadpunk	we have quite nasty bug with systemd units being changed	11:56
noonedeadpunk	And it seems it was close to always like that (or until I've fixed another bug)	11:57
noonedeadpunk	So, because we do provide "state: started" for systemd units, and don't listen on "systemd service changed" - services are not restarted if the only change is systemd unit	11:58
noonedeadpunk	Good/easy example is uwsgi role	11:58
noonedeadpunk	I think I'd prefer adding listen of `systemd service changed` to handlers over restarting services with systemd role...	11:59
damiandabrowski	but we do listen on "systemd service changed", don't we? https://opendev.org/openstack/ansible-role-systemd_service/src/branch/master/handlers/main.yml	12:04
damiandabrowski	"services are not restarted if the only change is systemd unit" maybe you meant "systemd unit state"?	12:05
damiandabrowski	as i'm pretty sure that service will be restarted if /etc/systemd/system/*.service content changes	12:05
noonedeadpunk	damiandabrowski: nope, it won't	12:08
noonedeadpunk	I've submitted a bug that explains it better https://bugs.launchpad.net/openstack-ansible/+bug/2009029	12:08
damiandabrowski	'services_results.item.state is not defined' ah, i get it now	12:10
noonedeadpunk	Actually... I'm thinking about third option	12:16
noonedeadpunk	Change condition to `'services_results.item.restart_changed | default(systemd_service_restart_changed) | bool' or 'services_results.item.state is not defined'` instead of AND	12:16
noonedeadpunk	As for now I can't think of any bad consequences of this....	12:20
noonedeadpunk	this condition was like that since role being established	12:22
noonedeadpunk	well. except adding extra condition that state is not stopped if it's defined	12:24
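The difference between the existing AND condition and the proposed OR condition (plus the extra "not stopped" guard) can be modelled in a few lines. This is a minimal sketch of the logic as described in the messages above, not the role's actual handler; the function `should_restart`, its parameters, and the assumed default of `True` for `systemd_service_restart_changed` are illustrative.

```python
from typing import Optional


def should_restart(restart_changed: Optional[bool],
                   state: Optional[str],
                   default_restart_changed: bool = True,
                   combine: str = "and") -> bool:
    """Model of the restart decision for one changed unit.

    restart_changed: per-service flag, None when the service does not set it.
    state: per-service state ("started", "stopped", ...), None when undefined.
    default_restart_changed: stands in for systemd_service_restart_changed
        (assumed True here).
    combine: "and" for the condition as it is today, "or" for the proposal.
    """
    flag = default_restart_changed if restart_changed is None else bool(restart_changed)
    state_undefined = state is None
    if combine == "and":
        return flag and state_undefined
    # Proposed variant: OR, with the extra guard that a defined state is not "stopped".
    return (flag or state_undefined) and state != "stopped"
```

Under the AND form a unit declared with `state: started` is never restarted when only its unit file changes, which is the behaviour described in the bug; under the OR form it is restarted, while a unit declared `stopped` is left alone.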
jrosser	damiandabrowski: i will work on a patch to make a `base` haproxy frontend for everything port 80/443	12:36
damiandabrowski	ok, great!	12:36
jrosser	we can change that on master first and then your stuff to separate the configs will end up hopefully simpler	12:37
damiandabrowski	makes sense	12:38
opendevreview	Dmitriy Rabotyagov proposed openstack/ansible-role-systemd_service master: Restart changed services if state is started https://review.opendev.org/c/openstack/ansible-role-systemd_service/+/876083	13:18
opendevreview	Dmitriy Rabotyagov proposed openstack/ansible-role-systemd_service master: Restart changed services if state is started https://review.opendev.org/c/openstack/ansible-role-systemd_service/+/876083	13:23
jrosser	damiandabrowski: i am wondering about `frontend letsencrypt-front-1` - do we actually need that?	13:28
jrosser	the LE acl redirects to a backend, and i'm not sure the frontend is actually useful	13:29
jrosser	perhaps this comes from before we had `haproxy_backend_only: true` available	13:30
damiandabrowski	yup, i think we can safely drop it	13:44
opendevreview	Dmitriy Rabotyagov proposed openstack/ansible-role-systemd_service master: Restart changed services if state is started https://review.opendev.org/c/openstack/ansible-role-systemd_service/+/876083	13:44
admin1	jrosser, i ran the playbook with just this .. https://paste.opendev.org/show/b84UvW46rYtpYiZjmbo4/ . it made no changes .. grep s3 does not show anything	13:46
admin1	my bad	13:47
admin1	i ran horizon playbook and not haproxy :D	13:47
noonedeadpunk	Damn, 876083 is quite bad option	13:50
noonedeadpunk	Despite it looks damn easy	13:50
noonedeadpunk	Maybe we can target it for stable branches only as a fix...	13:51
noonedeadpunk	or close eyes on double restart	13:54
opendevreview	Dmitriy Rabotyagov proposed openstack/ansible-role-systemd_service master: Restart changed services if state is started https://review.opendev.org/c/openstack/ansible-role-systemd_service/+/876083	14:16
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible-haproxy_server master: Allow default_backend to be specified https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/876157	14:23
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible master: Split haproxy horizon config into 'base' frontend and 'horzion' backend https://review.opendev.org/c/openstack/openstack-ansible/+/876160	14:32
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible master: Split haproxy horizon config into 'base' frontend and 'horzion' backend https://review.opendev.org/c/openstack/openstack-ansible/+/876160	14:33
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible master: Split haproxy horizon config into 'base' frontend and 'horizon' backend https://review.opendev.org/c/openstack/openstack-ansible/+/876160	14:35
opendevreview	Dmitriy Rabotyagov proposed openstack/ansible-role-pki master: Allow to provide custom handler names https://review.opendev.org/c/openstack/ansible-role-pki/+/875757	14:59
opendevreview	Dmitriy Rabotyagov proposed openstack/ansible-role-pki master: Allow to provide custom handler names https://review.opendev.org/c/openstack/ansible-role-pki/+/875757	15:00
noonedeadpunk	this systemd_service is rabbit hole....	15:06
noonedeadpunk	I regret I've spotted that bug	15:09
admin1	jrosser, it worked for 26.0.1 .. but not for 25.3.0 .. checking ..	15:09
* noonedeadpunk revises his plan to become a farmer		15:09
jrosser	admin1: `haproxy_horizon_service_overrides` was only introduced in Zed	15:11
admin1	oh	15:11
admin1	ok	15:11
jrosser	you will have to do the same thing a different way - thats why i asked you which release :)	15:11
admin1	it is ok .. i will upgrade this cluster to zed	15:11
admin1	is there an easy way to do it pre zed ?	15:13
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible master: Use 2.0.0 release for ansible-collections-openstack https://review.opendev.org/c/openstack/openstack-ansible/+/873092	15:14
jrosser	admin1: you would have to put the whole of https://github.com/openstack/openstack-ansible/blob/stable/yoga/inventory/group_vars/haproxy/haproxy.yml#L197-L213 into user_variables with those extra bits added	15:15
jrosser	so really not so bad	15:15
admin1	i think i will find upgrading easier than figuring that out :D	15:15
jrosser	really just this i think https://paste.opendev.org/show/bf7o9MxUFSAf3KpRtBUX/	15:17
jrosser	urgh i think i find some weird bug in python_venv_build as well	16:21
jrosser	when updating upper-constraints SHA/contents and rebuilding lets say utility venv, the correct versions don't appear until `rm /var/www/repo/os-releases/26.1.0.dev57/ubuntu-22.04-aarch64/requirements/utlity*` first	16:23
jrosser	noonedeadpunk: ahha look at this https://github.com/openstack/ansible-role-python_venv_build/blob/master/tasks/python_venv_wheel_build.yml#L126	17:33
noonedeadpunk	um?	17:33
jrosser	but what is in `"{{ _venv_build_requirements_prefix }}-source-constraints.txt"` ?	17:33
jrosser	doh `--constraint http://172.29.236.101:8181/constraints/upper_constraints_cached.txt`	17:33
jrosser	so that is never `changed`	17:33
jrosser	we symlink something with a constant name to whatever u-c file is downloaded	17:34
noonedeadpunk	yup	17:34
jrosser	and i was just completely failing having updated to u-c SHA to build things with the right version	17:34
jrosser	and that is why	17:34
noonedeadpunk	aha	17:35
jrosser	thats really a bug	17:35
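Why a constant file name hides the update can be shown with a small comparison. This is a hedged sketch of the general problem, not code from python_venv_build or repo_server; the helper names are hypothetical and the sample constraint contents are made up for illustration.

```python
import hashlib


def changed_by_reference(old_ref: str, new_ref: str) -> bool:
    """Change detection that only looks at the name/URL written into a file."""
    return old_ref != new_ref


def changed_by_content(old_body: bytes, new_body: bytes) -> bool:
    """Change detection that compares a digest of what the name points at."""
    return hashlib.sha256(old_body).digest() != hashlib.sha256(new_body).digest()


# The reference stays constant while the data behind it moves to a new u-c SHA.
ref = "http://172.29.236.101:8181/constraints/upper_constraints_cached.txt"
old_constraints = b"examplepkg===1.0.0\n"  # made-up contents
new_constraints = b"examplepkg===1.1.0\n"  # made-up contents

reference_changed = changed_by_reference(ref, ref)
content_changed = changed_by_content(old_constraints, new_constraints)
```

Here `reference_changed` is false while `content_changed` is true, which matches the symptom: the task writing the `--constraint` line never reports `changed`, so nothing downstream rebuilds.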
noonedeadpunk	I'm down in the rabbit hole with systemd_service....	17:35
jrosser	oh no :)	17:35
noonedeadpunk	Well. It's known things... We were having dynamic-SHA previously there, but decided to have constant thing	17:36
jrosser	do you remember why?	17:36
noonedeadpunk	As it was - fail unobviously why file is missing vs succeed with wrong venv	17:36
jrosser	https://github.com/openstack/openstack-ansible-repo_server/commit/a5df0d1a9bd4f4a24b578ae0596890f7b345381a	17:36
noonedeadpunk	As to update u-c you'd need to run repo playbook	17:36
noonedeadpunk	which is unobvious	17:36
noonedeadpunk	we indeed had couple of bugs and reports in IRC that things fail unobviously	17:37
jrosser	i will have a think about how to make that better	17:37
noonedeadpunk	And there's kind of close to no way to fail with good error message	17:37
noonedeadpunk	And we were out of idea how to make it better	17:37
jrosser	really because every service can have it's own u-c version we should arrange it so that each service playbook ensures that the right version is present	17:38
jrosser	rather than do the wrong thing in the repo server playbook	17:38
noonedeadpunk	I kind of need help with systemd as it seems that now we're at unstable equilibrium of bugs	17:38
noonedeadpunk	And fixing current one will lead to another one	17:38
jrosser	hmm	17:39
noonedeadpunk	Well, if you define glance_upper_constraints_url - it will be respected	17:39
noonedeadpunk	so each service having own u-c kind of works as of today	17:41
noonedeadpunk	the thing is that caching u-c on repo_host is kind of out of scope for python_venv_build role which I kind of agree with	17:41
noonedeadpunk	So about systemd it's https://bugs.launchpad.net/openstack-ansible/+bug/2009029	17:42
noonedeadpunk	I went to dropping handlers from roles and instead notifying systemd_service role so that it take care about service restarts	17:43
noonedeadpunk	But then I came to neutron and realised how bad idea is that	17:43
noonedeadpunk	So we kind of should not fix systemd_service role as then we will get services restarted twice, and we can't leave service restart to systemd role as then we lose logic and flexibility we need somewhere	17:44
noonedeadpunk	And we can't really restart services with service roles, as we don't know if service should be running at all or it should be remain stopped/disabled....	17:45
noonedeadpunk	And we can't filter that when generating variable as then we won't pass it to systemd_service and it won't be actually stopped/masked/disabled. So we need to do filtering right in handlers and don't touch systemd_service role	17:47
noonedeadpunk	Which sounds like terrible idea but I'm out of them	17:47
admin1	in a cluster, some hypervisors were removed ( and will not be put back) .. is there a good way to remove their entry from inventory so that next time ansible will not look for them or know about them	18:14
noonedeadpunk	admin1: inventory-manage.py does have a flag as of today that drops host from inventory	18:18
noonedeadpunk	-r or -d can't really recall exactly. But --help should tell :)	18:18
jrosser	noonedeadpunk: i am trying to understand the systemd thing a bit	18:22
jrosser	is it that we need to conditionally notify inside the systemd_service role only sometimes?	18:23
noonedeadpunk	So if we notify inside systemd_service AND we change like glance.conf - same service will be restarted twice	18:33
noonedeadpunk	also we don't need to restart service when we've passed state: stopped or enabled: false or masked: true, for example	18:34
noonedeadpunk	cross out enabled here :D	18:34
noonedeadpunk	but you got the gist	18:34
jrosser	its horrible but is `services_results` available outside the `systemd_service` role after it has run?	18:40
jrosser	just wild handwaving about being able to suppress a second restart if you've already done it	18:40
jrosser	or do we need to make `systemd_service` have some more official "return value" in a set_fact that describes the state of what was done, and can be used later	18:42
* jrosser has to travel		18:44
noonedeadpunk	Well, the thing is that systemd_service handler executes AFTER service role handler as service_role is usually triggered first (systemd_service is run almost at very end of the role)	18:59
noonedeadpunk	We can listen for `systemd service changed` in role handlers - that's available and working, but yeah, then we kind of need to have same condition in each role	19:00
noonedeadpunk	And flushing handlers in systemd_service is not helpful either, just in case	19:02
noonedeadpunk	We have more than enough control to be frank on the behaviour of systemd_service role. So it's not that it's doing smth unexpected.	19:04
noonedeadpunk	but yeah... Will head out for today as well. Hopefully will be less frustrated in the morning...	19:05
mgariepy	cya have a nice evening	19:06
opendevreview	Matthew Thode proposed openstack/openstack-ansible-os_octavia master: Implement support for octavia-ovn-provider driver https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/868462	20:20
prometheanfire	noonedeadpunk: I'm able to use ^ to create a lb, but traffic has yet to get through (seems like an arp issue for the 'public' side of the LB)	20:21
prometheanfire	it's VERY dirty/hackish	20:22
jrosser	noonedeadpunk: https://paste.opendev.org/show/bpl3Nm0dfoWWG0ACiRq4/	23:12
