Wednesday, 2023-04-19

opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Gather generic masakari facts  https://review.opendev.org/c/openstack/openstack-ansible/+/88060604:30
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Gather generic masakari facts  https://review.opendev.org/c/openstack/openstack-ansible/+/88060704:30
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/xena: Gather generic masakari facts  https://review.opendev.org/c/openstack/openstack-ansible/+/88060804:30
jrossermorning05:58
jrosserwe should totally describe a small home lab / office lab config in the docs05:59
jrosserthe network setup is really not clear and easy to make a big mess with multi-homing / reverse-path-lookup trouble06:00
jrosserwhy does `playbooks/os-neutron-install.yml --tags neutron-config --limit neutron_server` do a bunch of python_venv_build stuff, thats odd06:27
noonedeadpunkNeilHanlon: yup, that's correct07:11
noonedeadpunkCI is not failing as we pull latest from zuul, but if you deploy a sandbox it will get version from ansible-role-requirements07:12
noonedeadpunkI think we have some bumps for reviews07:12
noonedeadpunkbecause we do include role and have tag always... So it's running tasks that also have tag always inside python_venv_build...07:13
jrosserthose `always` tasks depend on this being defined https://github.com/openstack/openstack-ansible-os_neutron/blob/master/tasks/neutron_install.yml#L7407:25
noonedeadpunkoh, wait, we import_role there07:26
jrosser:)07:26
noonedeadpunkwithout tags....07:26
jrosserwell i just tried to update some neutron server config stuff and it all blew up in a weird way07:27
noonedeadpunkthis really drives me up the wall07:27
noonedeadpunkOh.... so despite it should run only with tag neutron-install, since imports are (now?) processed before role....07:28
noonedeadpunk /o\07:28
noonedeadpunkI bet replacing with include will just sort that out in this case07:28
noonedeadpunkI feel like we'd need to do a couple of things in the near future - replace all module names with FQCNs and with that thoroughly review what we are using for imports/includes and tags, and why07:30
jrosseryeah - and i think we have to write it in our developer docs07:30
jrosserbecause its such a mess07:31
noonedeadpunkI'm not sure I have full understanding/overview of all possible corner cases to be frank07:34
noonedeadpunkAnd obviously smth has changed with behaviour quite lately IMO07:35
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_ironic master: Add example networking-generic-switch user role for Arista switch  https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/88079707:37
noonedeadpunkdamiandabrowski: would be awesome to land this: https://review.opendev.org/q/topic:bump_osa+status:open07:48
jrossernoonedeadpunk: maybe we need some test case include/import code actually in the repo08:06
jrosserlike self contained examples that show things08:06
noonedeadpunkwe actually need to test tags I assume.08:06
noonedeadpunkafter cleaning facts08:06
noonedeadpunkbut again I'm not sure how to do that without reincarnating test repo08:07
noonedeadpunkas majority of roles will fail without keystone and infra being present08:07
jrosseri really meant just standalone manual things08:09
jrosserjust to demonstrate what is / is not supposed to happen08:09
jrosserbecause i'm also pretty much not having a full understanding either08:09
noonedeadpunkah08:11
noonedeadpunkWell, the thing is that right now we're still launching such things with functional jobs/tests repo08:12
noonedeadpunkbut maybe we could expand vartests scenario or smth08:13
noonedeadpunkHm, looking at https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/test-vars-overrides.yml I wonder what point of having this at all...08:15
damiandabrowskinoonedeadpunk: regarding https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/88078109:03
damiandabrowskii tried using meta: clear_facts to fix this issue and it doesn't help. clearing facts for localhost and haproxy_all does not help and clearing facts for all hosts breaks playbook execution09:04
noonedeadpunkand https://paste.openstack.org/show/bLHdaI0NjQy229dZ9gZQ/ is not working either I assume?09:05
damiandabrowskinope :/09:06
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Do not limit IP prefix for DHCP rule  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/88080409:50
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Change default CIDR for security_group  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/88054409:51
opendevreviewMerged openstack/openstack-ansible-os_ironic master: Add example networking-generic-switch user role for Arista switch  https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/88079709:56
admin1i need some direction .. i have disabled internal ssl in horizon, so if i do http://<internal-ip-of-horizon> , they all load fine all the time .. but when I do the same via https://domain.com ( going via haproxy) , i get 503 Service Unavailable  No server is available to handle this request.10:46
admin13 controllers, so 3 horizon containers ..  doing http:// directly to the 172.29.236 never fails 10:46
jrosseradmin1: 503 service unavailable from haproxy means it does not think the backend is up10:48
admin1horizon back in haproxy stats shows all green 10:48
admin1image paste: https://pasteboard.co/vW6VJR7Gr2eE.png10:49
jrossertbh i am not really understanding "disabled internal ssl" at all10:51
noonedeadpunkbut internal ip goes also through haproxy10:51
noonedeadpunkare you sure that DNS is correct ?:)10:51
jrosseradmin1: right now there is really no SSL at all haproxy<>horizon unless you have done something really specific to enable that10:52
admin1haproxy  horizon section -> https://gist.githubusercontent.com/a1git/90d9b40c4d6f6313854abab6746b4270/raw/cd5c373c1f7c4c01fcf87f944a4fb4bd0dd3184e/gistfile1.txt10:53
jrosserbut what do you mean "disabled internal ssl in horizon"10:54
noonedeadpunkI think for internal vip10:54
admin1horizon_enable_ssl: false10:54
admin1http://horizon-mgmt-lxc-ip was redirecting to https:// , removed that one so that they work on the mgmt network on direct 80 10:55
noonedeadpunkI'm not sure though horizon should be working through internal IP at all to be frank...10:55
admin1http://172 ip just works fine 10:55
admin1it is what haproxy is proxying, 10:55
admin1but in my case, its 503 and i have not been able to figure out why 10:56
noonedeadpunkI think it might be STS10:56
jrosserbut the apache server in the horizon container needs to be told if the user beyond haproxy is accessing as https or not i think?10:56
jrosserit's not as simple as just saying it is or is not https to haproxy10:56
noonedeadpunkIt should detect by X-Forwarded-Proto10:57
noonedeadpunkthat's part of frontend config10:57
damiandabrowskiand its logic was a bit weird...I fixed it a couple of days ago in master10:58
damiandabrowskihttps://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/87951410:58
noonedeadpunkBut in case https is disabled, hsts should be disabled as well10:58
jrosserbut here https://github.com/openstack/openstack-ansible-os_horizon/blob/stable/zed/templates/openstack_dashboard.conf.j2#L910:58
noonedeadpunkum... how that's gonna work at all... behind haproxy...10:59
jrosseron Zed, all through the logic that configures this there is `horizon_enable_ssl` and `horizon_external_ssl` and those are not the same thing10:59
jrosserhorizon_enable_ssl is basically bogus and legacy thing for putting your own cert on the horizon container afaik11:00
noonedeadpunkwe're redirecting http>https on haproxy level...11:00
noonedeadpunkAH, if you've disabled tls and did not run horizon role then yeah, it will just redirect to 443 that's not listening anymore11:01
admin1i am happy if only haproxy is doing ssl termination and internal services are on non-https .. this means only 3 servers ( controllers ) are doing the ssl transaction instead of 20x3 = 60 containers each doing their own ssl transaction11:01
jrosseradmin1: which release are you using?11:02
noonedeadpunkthis part completely looks legacy and not needed to me https://github.com/openstack/openstack-ansible-os_horizon/blob/stable/zed/templates/openstack_dashboard.conf.j2#L9-L1611:02
admin16.0.1 had no issues.. i upgraded to 6.1.0 and getting this issue 11:02
jrosserZed? please forget everything about 60 containers doing ssl11:03
jrossernone of this is relevant at all until antelope11:03
admin1zed :) 11:03
jrosserwhat we work on today in master branch is not part of Zed release11:03
admin1i destroyed the containers and rebuilt, had no effect11:04
admin1i can rm -rf the /etc/ansible/roles, run it again, destroy the repo servers and try to force a build 11:04
noonedeadpunkI don't really see differences in 26.0.1 and 26.1.0 that could affect that11:06
noonedeadpunkI'm completely lost now11:08
jrosserlikewise11:09
noonedeadpunkso http://<internal-ip-of-horizon> - works,  https://<public-fqdn-of-horizon> - does not, https://<public-ip-of-horizon> - ?11:09
jrosseradmin1: tbh this sounds like you have some incorrect override made11:09
jrosseralso i never access horizon on the internal endpoint so could not say if this does/doesnt work11:10
admin1variables for this: https://gist.githubusercontent.com/a1git/c41ec803bf7a6964e3c7b7a2acab388e/raw/bf073a4f04011e4f9df4cd7bf78402f02508d4f5/gistfile1.txt11:10
admin1i did ssh deploy -D 6000, where 6000 is socks, then use foxyproxy to have browser use 6000 for all browsing = can access all internal ips and see how it works /behaves 11:11
admin1i can validate that it works: https://pasteboard.co/lJHAUC47Y1Xp.png11:12
jrosseri am not sure that using `horizon_enable_ssl: false` is valid thing to do11:15
noonedeadpunkadmin1: I think you will need to override haproxy_horizon_service11:16
noonedeadpunkAs horizon_enable_ssl is indeed not used for haproxy config11:16
noonedeadpunkSo basically https://opendev.org/openstack/openstack-ansible/src/branch/stable/zed/inventory/group_vars/haproxy/haproxy.yml#L227-L228 should be just False11:17
jrosser?11:17
jrosserbut what about the frontend?11:17
noonedeadpunkBut horizon_enable_ssl I guess assume no ssl will be used at all?11:17
noonedeadpunkbut I'm not really understanding what is trying to be achieved right now11:18
noonedeadpunkIf fully disable tls for horizon - then that's the way?11:18
admin1it disables the internal redirection from http://horizon-mgmt-lxc-ip to https://horizon-mgmt-lxc-ip which just  does not load in the browser at all11:18
jrosseradmin1: can you be more precise please - this is very confusing11:20
jrosserdo you mean the redirect in haproxy, or one in the horizon apache?11:20
admin1ok .. let me walk through the scenario step by step ..   1.  cluster upgraded from 6.0.1 -> 6.1.0 ..      https://cloud stopped working (horizon) .. checked haproxy stats .. all green ..   logged into each of the horizon servers and did ss -plant .. all listening on 80 and 443, no errors ..  .. tried curl http://ip  .. it redirected to   https://ip11:24
admin1..  did curl https://ip   gives curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number .. my thought was that https is using some weird cert so haproxy <-> horizon is not working, so pasted here ..  jrosser gave me the link11:24
admin1https://github.com/openstack/openstack-ansible-os_horizon/blob/stable/zed/templates/openstack_dashboard.conf.j2#L9 which allows me to disable the internal https redirection .. did that, could access and login to each/all of horizon dashboard without fail and issues .. but   https://cloud ( via haproxy) still give 503 11:24
jrosseri was not telling you to use that variable11:25
jrosseri intended to show what was configuring the redirect in the horizon apache11:25
admin1at least that helped me to know that horizon is not in error and i can login and work properly and the issue lies in   haproxy -> horizon 11:26
admin1otherwise   503 Service Unavailable via haproxy and "172.29.239.156 sent an invalid response.  ERR_SSL_PROTOCOL_ERROR" via  https would have pointed to non-existent issues on the horizon side 11:27
jrosserhave you looked at the generated apache config?11:28
jrosseri don't think i can help here really11:30
admin1https://gist.githubusercontent.com/a1git/be2f283b53361bacae6345820d6a98a6/raw/d780106e23390d93dd6dd93543f94417469be110/gistfile1.txt11:30
jrosserbest to make an AIO with a completely standard config and compare what you see there11:30
admin1i don't see an issue here 11:30
admin1in the apache .. as horizon works perfectly fine in this 11:31
jrosserbut you say this `curl https://ip   gives curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number .. my thought was that https is using some weird cert so haproxy <-> horizon is not working`11:32
jrosserthats the IP of the horizon container?11:32
admin1yes 11:32
noonedeadpunkcould you be using tlsv1.1?11:33
admin1by default 80 -> 443 is enabled for internal ip in apache.conf, and curl gives that .. https:// in the browser also does not work 11:33
noonedeadpunkor some override for that?11:33
jrosseradmin1: there is no redirect in the apache config you pasted11:33
admin1yes, because i removed it .. 11:33
noonedeadpunkwe have protocols defined here https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/all/ssl.yml#L1911:34
admin1noonedeadpunk, this is the generated config -> https://gist.githubusercontent.com/a1git/90d9b40c4d6f6313854abab6746b4270/raw/cd5c373c1f7c4c01fcf87f944a4fb4bd0dd3184e/gistfile1.txt11:34
noonedeadpunkadmin1: yes, so VIP must be accessed through SSL both internal and external according to this11:34
jrosserthe curl error is what happens when you point a browser at something on port 443 which is actually http, not https11:35
noonedeadpunkI think more vice versa?11:35
noonedeadpunkah, no, you're right11:36
admin1i  can try to rm -rf the /etc/haproxy,  remove the ssl override and re-run the haproxy playbook again 11:37
jrosserimho if the horizon apache config was generated with a redirect then that says that the config is wrong11:37
jrosserif the variables are set up properly to make that backend be http, then the redirect would not be there11:37
admin1but i pasted the whole variables and the generated configs 11:37
noonedeadpunkI think this kinda all heading to some weird direction11:37
admin1i have no overrides for ssl or haproxy 11:38
noonedeadpunkRedirect is useless imo, since haproxy still uses 80 port11:38
noonedeadpunkredirect should close to never get utilized, except for direct access to horizon backends?11:38
jrosserright - but it totally confuses when you have socks tunnel into mgmt network then try to debug with a browser11:38
noonedeadpunkyeah. as I said, I don't think horizon even supposed to be working on internal network11:39
admin1how else to check if horizon is running good .. curl gives  curl: (35) error:0A00010B:SSL routines::wrong version number11:39
jrosser^ you are accessing something as https which is not actually https11:39
jrosserport 443 is serving plain http in that case11:40
admin1you mean for the curl ? 11:40
jrosseryes11:40
jrosserjust because it is on port 443 does not mean it is https11:40
jrosseronly if the web server config makes it be https11:40
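A rough illustration of why curl reports "wrong version number" here, written as a Python sketch rather than anything taken from the channel. A TLS record starts with a content-type byte (0x16 for handshake, 0x15 for alert) followed by a version whose major byte is 0x03. If port 443 is serving plain HTTP, the reply to the client's ClientHello starts with the ASCII text "HTTP/", which the TLS client then tries to read as a record header. The function name `classify_first_bytes` is made up for this example.

```python
def classify_first_bytes(data: bytes) -> str:
    """Guess the protocol from the first bytes a server sends back.

    Assumption for this sketch: `data` is what a TLS client received
    right after sending its ClientHello.
    """
    # TLS record header: content type (0x15 alert / 0x16 handshake),
    # then the protocol version, whose major byte is 0x03.
    if len(data) >= 2 and data[0] in (0x15, 0x16) and data[1] == 0x03:
        return "tls"
    # A plain HTTP server answers the unreadable ClientHello in clear text.
    if data.startswith(b"HTTP/"):
        return "plain-http"
    return "unknown"


if __name__ == "__main__":
    print(classify_first_bytes(b"HTTP/1.1 400 Bad Request\r\n"))
    print(classify_first_bytes(bytes([0x16, 0x03, 0x03, 0x00, 0x50])))
```

In the "plain-http" case the bytes "H", "T", "T" land where the client expects the content type and version, which is the mismatch behind the curl error quoted above.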
admin1if you guys have any zed, login to any horizon container ..  ss -plant you will see apache on 80 and 443 .. if you do curl -I   http://ip:80 , it will redirect to https://ip/auth/login/?next=/     if you do curl on that ( which horizon is asking ) it will give that error11:41
admin1so is the default behavior wrong ? 11:41
noonedeadpunkum, apache on https is wrong, yes, but it never used by haproxy11:41
noonedeadpunkso it should not cause issues by default11:41
noonedeadpunklet me quickly add horizon to sandbox deployment11:43
admin1i am going to rm -rf my /etc/ansible/roles and rerun the playbooks 11:43
noonedeadpunkNot sure how this can help, but feel free11:44
admin1noonedeadpunk, have to try to solve it somehow 11:44
admin1prod affected :( 11:45
noonedeadpunkdoing random action unlikely to help11:47
noonedeadpunkhow one old belgian friend said - less haste more speed11:47
admin1well, i submitted my variables, generated configs, also screenshots that horizon is in fact working , but got no ideas .. so have to try something .. 11:48
noonedeadpunkadmin1: ok, so I don't have redirect in apache by default11:48
admin1are you in 6.1.0 ?11:48
jrosseri don't either11:48
admin1i have one is 26.0.1 and i have redirect there also 11:49
admin1in*11:49
noonedeadpunkI still not understanding what exact setup of horizon you want. What I got is that you're trying to solve issue that horizon gets SSL error when accessing public FQDN11:49
opendevreviewMerged openstack/openstack-ansible-os_neutron master: Use include instead of import for conditional tasks  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87494911:49
admin1when I do https://cloud.domain.com, i want horizon to load 11:50
noonedeadpunkBut not with checking what's wrong there but trying to disable SSL at all? 11:50
jrosseradmin1: again can you be precise, do you mean the redirect is in the apache config like this https://github.com/openstack/openstack-ansible-os_horizon/blob/9c07e79890692cb477005cf34139fd20d4417be7/templates/openstack_dashboard.conf.j2#L10-L1511:50
jrosseradmin1: or do you mean that with wget/curl you get a 302 regardless of the apache config11:50
jrosserthese are not the same thing, but i have no idea which you mean11:50
admin1jrosser, i do not have that 11:51
jrosserwhat?! :)11:51
admin1but curl http:// redirects to https:// 11:51
jrosserok ;)11:51
noonedeadpunkok, that's correct, as this is what haproxy should do11:52
jrosser^ i think he means inside the horizon container?11:52
admin1yes 11:52
jrossersooo much confusion11:52
admin1inside horizon 11:52
noonedeadpunk++11:52
admin1when you guys login to a horizon container , then do curl http://ip, what does it return ? 11:53
admin1curl -I ( to get the headers) 11:53
admin1it will return https://<internal-ip>/auth/login 11:53
noonedeadpunkwhat if you try setting horizon_external_ssl: True?11:55
admin1i will try again ..  i have   external_lb_vip_address  -> cloud.domain.com ..  .. when i do https://cloud.domain.com it gives 503 ..  but if I check the stats page, all backend is green ..   when I directly call the http://horizon-internal-ip/ i am able to login fine 11:56
noonedeadpunkBut likely horizon_enable_ssl should be left as default...11:57
noonedeadpunkbut horizon_external_ssl should be true by default11:57
noonedeadpunkUnless you have override for openstack_external_ssl11:58
noonedeadpunkso basically `horizon_enable_ssl` should be true as well I guess11:58
admin1i have haproxy_user_ssl_cert  and haproxy_user_ssl_key 11:59
admin1ok11:59
jrossernoonedeadpunk: are you sure? `horizon_enable_ssl` is backend ssl11:59
noonedeadpunkUm.... I don't see that in paste...11:59
admin1line 8 and 9 12:00
admin1is the only lines i have for ssl 12:00
noonedeadpunkjrosser: I'm not... Yeah, probably just default12:00
admin1so i am missing some required vars like horizon_external_ssl12:01
damiandabrowskijrosser: there's some information about that in commit msg: https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/879515/412:01
noonedeadpunkadmin1: horizon_external_ssl is default to openstack_external_ssl12:02
noonedeadpunkthat is True by default https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/all/all.yml#L7112:02
noonedeadpunkand yes, jrosser is right, horizon_external_ssl should be true while horizon_enable_ssl should be false12:03
noonedeadpunkaha, so I was looking at new "fixed" code12:04
jrosseryeah is all different (and odd) in Zed12:04
noonedeadpunkso override admin1 has regarding horizon_enable_ssl: false is correct one12:04
jrosserimho the whole business of using socks to a browser looking at the horizon container is not valid12:05
damiandabrowskihorizon_enable_ssl should be enabled by default in Zed and it's kind of strange but okay as long as horizon_external_ssl is True12:05
jrosserreally?12:05
noonedeadpunkwell, redirect won't be added at least12:05
noonedeadpunkand certs won't be copied as well...12:05
damiandabrowskifrom https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/879515 commit message:12:06
jrossereven without the redirect in the apache config (lets be precise!) there is still a 302 if you curl port 80 in the horizon container12:06
damiandabrowskiThis patch does not change current behavior in gating as backend TLS12:06
damiandabrowskiworks only with horizon_external_ssl=False(while it's set to True by12:06
damiandabrowskidefault).12:06
damiandabrowskiso i don't mean that backend TLS will work if you set horizon_enable_ssl=True in Zed12:07
damiandabrowskiI'm just saying you can have horizon_enable_ssl=True, backend tls will still be disabled(if you have horizon_external_ssl=True) and everything should work fine12:07
noonedeadpunkOK, I think goal#1 admin1 is to get rid of the redirect in apache by setting up horizon_external_ssl and horizon_enable_ssl "correctly"12:07
jrossernoonedeadpunk: try curl -v http://<ip>:80 in your horizon container12:08
noonedeadpunkso external one should be true, enable - dunno, aio works with default12:08
noonedeadpunkjrosser: I have bare metal sandbox... But it redirect to http://172.29.236.100/auth/login/?next=/ which is fine?12:09
noonedeadpunkbut yes, it's 30212:11
noonedeadpunkthough it looks like valid one, and not to https 12:11
jrosserhttps://paste.opendev.org/show/bWCJbfhDFv2UGh71zVpd/12:11
noonedeadpunkhttps://paste.opendev.org/show/bT5hWF4Tc3okutnvxfbD/12:12
noonedeadpunkok, it should not redirect to https12:12
damiandabrowskinoonedeadpunk: basically in Zed you should have two redirects i think12:12
damiandabrowskihttps://10.6.0.153 -> http://10.6.0.153/auth/login/?next=/ -> https://10.6.0.153/auth/login/?next=/12:12
damiandabrowskiso https -> http -> https12:13
damiandabrowskiwhich is strange but works12:13
damiandabrowskiIIRC it was fixed in https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/879514/512:13
noonedeadpunkjrosser: what do you have in /etc/apache2/sites-enabled/ ?12:13
jrosserhttps://paste.opendev.org/show/bOHqweRX25roNJOzlu8T/12:14
noonedeadpunkbrrrrrr12:14
noonedeadpunkso the only thing that can redirect this way is horizon itself12:15
jrosseryes12:15
noonedeadpunkor default vhost which I assume is absent12:15
damiandabrowskii need to leave for now, I can help later this evening if needed as I spent plenty of time on horizon's redirections recently...12:16
opendevreviewMerged openstack/openstack-ansible-os_cinder master: Move online data migrations to post-restart step  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/88021012:17
noonedeadpunkjrosser: but again, it should be "fine" once you access through haproxy12:17
jrosseryes it should12:17
jrosserthis is why i think just hacking at things making it work on br-mgmt is not a good plan12:18
jrosserparticularly if you want internal vip to be http12:18
jrosseror different from external anyway12:18
admin1i am re-doing it  .. maybe it will fix it 12:19
noonedeadpunkI still think that original issue admin1 could have was related to HSTS until things god messed up12:19
noonedeadpunk*got12:19
noonedeadpunkas then haproxy will result in error as well - not sure if 503 or not though12:20
noonedeadpunkit will show all backends as green, but until security policies are met - it won't let you through to them12:21
noonedeadpunkSo I'd try setting `haproxy_security_headers_csp_report_only: true` and revert all other changes to be frank12:22
admin1the reports, where does it log to noonedeadpunk,  journalctl ? 12:24
noonedeadpunkUm, browser console iirc12:25
jrosserthis could also be some problem with the certificate to generate a 50312:26
jrosserwhen the backends are up12:26
noonedeadpunkyeah, could be...12:27
noonedeadpunkbut no,  csp is blocked by browser iirc, so it won't be 503?12:28
noonedeadpunkyeah, and internal VIP should not work exactly due to CSP12:30
noonedeadpunkas we paste there public VIP rules12:30
noonedeadpunkalso in case of some redirect loop in apache I think haproxy should mark backends as down. as we're checking for /auth/login/ and expect 20012:33
noonedeadpunk301/302 will mark backends as down12:34
noonedeadpunkWill it?:)12:34
noonedeadpunknah, all 2xx and 3xx are valid12:35
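To spell out the point about health checks: by default an HAProxy HTTP check treats any 2xx or 3xx response as healthy, unless an explicit `http-check expect` rule says otherwise. The Python sketch below only models that default rule; the function name `default_check_passes` is hypothetical and it does not read any real haproxy config.

```python
def default_check_passes(status: int) -> bool:
    """Model of the default HTTP check rule: 2xx and 3xx count as healthy.

    Assumption: no `http-check expect` rule overrides the default.
    """
    return 200 <= status <= 399


if __name__ == "__main__":
    for code in (200, 302, 404, 503):
        print(code, default_check_passes(code))
```

Under that rule a backend answering every check with a 302 still shows green on the stats page, even if the redirect target is unusable.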
noonedeadpunkso indeed it could be 302 to https :(12:36
admin1i found this in haproxy ..  [19/Apr/2023:12:37:05.466] base-front-1~ base-front-1/<NOSRV> -1/-1/-1/-1/0 503 237 - - SC-- 523/1/0/0/0 0/0 "GET /auth/login/?next=/ HTTP/1.1"  12:38
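For reference, a small Python sketch that pulls the interesting fields out of a log line like the one just pasted. It assumes the standard HAProxy HTTP log layout (accept date, frontend, backend/server, timers, status, bytes, two cookie fields, termination state, connection counters, queue counters, quoted request); the regex and the name `parse_haproxy_http_log` are written for this example only.

```python
import re

# Assumed layout: the default HAProxy HTTP log format.
LOG_RE = re.compile(
    r'\[(?P<accept_date>[^\]]+)\]\s+'
    r'(?P<frontend>\S+)\s+'
    r'(?P<backend>[^/\s]+)/(?P<server>\S+)\s+'
    r'(?P<timers>\S+)\s+'
    r'(?P<status>\d{3})\s+'
    r'(?P<bytes_read>\d+)\s+'
    r'\S+\s+\S+\s+'                  # captured request/response cookies
    r'(?P<termination>\S{4})\s+'
    r'\S+\s+\S+\s+'                  # connection counters, queue counters
    r'"(?P<request>[^"]*)"'
)


def parse_haproxy_http_log(line: str) -> dict:
    """Return the named fields of one HTTP log line, or {} if it does not match."""
    match = LOG_RE.search(line)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    sample = (
        '[19/Apr/2023:12:37:05.466] base-front-1~ base-front-1/<NOSRV> '
        '-1/-1/-1/-1/0 503 237 - - SC-- 523/1/0/0/0 0/0 '
        '"GET /auth/login/?next=/ HTTP/1.1"'
    )
    fields = parse_haproxy_http_log(sample)
    # <NOSRV> in the server slot means no backend server handled the request.
    print(fields["frontend"], fields["backend"], fields["server"], fields["status"])
```

On the sample line the server field is `<NOSRV>` and the backend name matches the frontend name, so the 503 came from a frontend that had no server to hand the request to, not from a horizon container.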
jrosseryou've deployed master branch12:38
jrosserthats not Zed12:38
admin1doh ! 12:38
jrosserbingo12:38
jrossersimplest answer is almost the most likley12:38
jrosser*always12:38
admin1i rm -rf /etc/ansible , checkout 6.1.0 and re-ran the haproxy playbook thinking of this possibility 12:39
jrosser6.1.0 ?12:39
admin126.1.0  12:39
jrosserdid you bootstrap-ansible.sh ?12:39
admin1yes :D 12:39
admin1i need to rm -rf the /etc/ansible from all controller and re-run it again i guess12:40
admin1to completely remove the old one ? 12:40
jrosserthe haproxy `base` stuff is only present in master and will be in antelope12:40
noonedeadpunkonly on deploy host12:40
* noonedeadpunk was wondering how you've spotted that...12:40
noonedeadpunkWell, this also means we have stuff broken in master ?:)12:41
noonedeadpunkor well, for master we have some patches that might be covering this...12:41
jrosseri think it's because dropping master on top of Zed leaves the old config for port 443 which would have shown the horizon backends up12:41
admin1i suspected and i did a ps aux | grep openstack on all servers, and everything showed 26.1.0 .. but somehow haproxy is master :( 12:42
jrosserbut at the same time drops the base config which actually serves port 443 but has no backends12:42
jrosseror something like that12:42
jrosseradmin1: no, not really12:42
noonedeadpunkyeah, so we might be lacking upgrade path right now12:42
jrosserthere is not anything in the haproxy role that defines that `base` should exist12:42
jrosserthats all in the openstack-ansible repo variables12:42
noonedeadpunkor well, it could be master one day12:43
jrosseryou have used master branch of the openstack-ansible repo at some point, and that has written config into /etc/haproxy/conf.d/12:43
admin1if i checkout a new branch based on 26.1.0, then remove the /etc/ansible, bootstrap all .. and then rm -rf /etc/haproxy from controllers and ran haproxy again, it should get me back to the default of 26.1.0 for haproxy ? 12:43
jrosserand no matter what you do, how many playbooks you run, that is not going to get removed unless you fix it by hand12:43
noonedeadpunkI think it won't as there's nothing that would remove base12:43
jrosserthis is why just delete/recreate containers / run playbooks without debugging is wasted time12:44
noonedeadpunkadmin1: rm -rf /etc/haproxy/conf.d/12:44
admin1yes , that is what i am saying 12:44
noonedeadpunkand then re-run haproxy role12:44
admin1rm -rf /etc/haproxy* and re-run haproxy playbook should recreate the new configs right ? 12:44
noonedeadpunkyes12:44
admin1ok12:44
admin1another thing i noticed .. out of 3 controllers. i from haproxy disabled 2 and enabled only 1 .. what i noticed was out of 6 tries, openstack page loaded 1 time, 4 times was 503 and final time was  a HUGE openstack logo with broken css .. all from the same single active backend 12:46
noonedeadpunkwasn't that during playbook run when vip is failovering across controllers?12:50
admin1i had stats page setup with pass to access .. it was after the run and everything stable, i disabled each backend 1 by 1 and enabled only 1 to check if its a specific container among the 3 that was causing 50312:51
admin1and i found out that even on same backend, with the remaining 2 being on MAINT, the result was not consistent 12:51
admin1503 - works .. 503 .. 503 .. 503 ..  broen css --12:52
admin1broken*12:52
admin1hmm.. now i am thinking . what if someone forgot to do -b 26.1.0 and just checkout master as 26.1.0  ( i have zsh and it shows heads/26.1.0  ) in the branch12:56
admin1would ps ax | grep openstack will still show 26.1.0 ( my local branch name) in the venv ? 12:57
admin1jrosser noonedeadpunk thanks .. fixed 13:07
admin1just for my understanding, what is the base all about ? are we adding multi domain support of some sort ? 13:07
jrosser\o/13:07
admin1or white label type of options 13:07
jrosserthere is a large refactoring of the haproxy config in Antelope13:08
jrosserand the `base` config brings together letsencrypt, security.txt and horizon (all the things on port 443) in a much neater and more obvious way than they are at the moment13:09
jrosserit is also more extensible to make the kind of thing you have done with everything on port 443 very easy in the future13:09
admin1that config is working fine in my sandbox .. 13:10
admin1everything neat and on ssl 13:10
jrosserwell like i say there will be significant changes in Antelope13:11
admin1if only there could be a sub_filter to change api and domain urls on the fly, we can whitelist a single horizon to multiple domains :D13:11
admin1all endpoints get sub_filter to diff domains as their own custom cloud .. 13:11
jrosserhaproxy cannot do any rewriting, if thats what you mean13:13
admin1it can .. rspirep .. 13:13
admin1http-response replace-header now a days 13:14
admin1i meant , someday to see some white label features 13:14
admin1back to my question .. if I see this, /openstack/venvs/nova-26.1.0/ , does it mean nova is on 26.1.0 .. or is that 26.1.0 whatever my branch name is ? 13:16
admin1how do i validate that my install is not on master .. ( except haproxy) everything   was(is)  working just fine .. 13:17
noonedeadpunkadmin1: it's whatever venv_tag or openstack_release is13:19
noonedeadpunkit can be overwritten13:19
admin1i do not have those overrides 13:20
noonedeadpunkby default it's identified using pbr basically https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/bootstrap-ansible.sh#L15013:21
admin1if i checkout tag 26.1.0 and make my local branch 1.1.1, it will not show 1.1.1. there right ? 13:21
noonedeadpunkno, it will show only version that was bootstrapped13:22
admin1ok . then phew ! .. 13:22
admin1we have a habit of running haproxy first to ensure it binds to all the correct ports etc, so only this got master and others are safe and in the version they should be 13:23
noonedeadpunkso if you do like cd /etc/ansible/roles/haproxy; git checkout master - version won't be affected13:23
opendevreviewMerged openstack/openstack-ansible stable/xena: Bump OpenStack-Ansible Xena  https://review.opendev.org/c/openstack/openstack-ansible/+/88047813:28
opendevreviewMerged openstack/openstack-ansible stable/xena: Gather generic masakari facts  https://review.opendev.org/c/openstack/openstack-ansible/+/88060813:28
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/wallaby: Bump erlang versions  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/87670813:41
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/wallaby: Switch rabbitmq repo back to packagecloud  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/88032413:41
opendevreviewNeil Hanlon proposed openstack/openstack-ansible stable/zed: bump openstack_hosts role to resolve openvswitch3.1 problem on Rocky  https://review.opendev.org/c/openstack/openstack-ansible/+/88082613:42
opendevreviewMerged openstack/openstack-ansible master: Drop `else` condition in the container_skel_load loop  https://review.opendev.org/c/openstack/openstack-ansible/+/87869614:04
noonedeadpunkdamiandabrowski: jrosser I have a small problem with https://review.opendev.org/c/openstack/openstack-ansible/+/871189 and it's problem not regarding content, but how we want to merge things efficiently.14:15
noonedeadpunkand it's related to https://bugs.launchpad.net/openstack-ansible/+bug/200729614:16
noonedeadpunkSo basically, what I'm planning to do, is create a directory for each group, instead of having simple files14:17
noonedeadpunkand move content of  repo_packages https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/defaults/repo_packages there to cover this bug14:17
noonedeadpunkI'm thinking of creating a follow up patch, that would move haproxy to its own file, group_vars that were present to another and repo_packages content to its own, so that the bump script is able to find the file by expected filename/path14:18
noonedeadpunkBut I think I would need to have 2 patches to move just created haproxy configs to the directory, and then move repo_packages?14:20
spatelfolks! we don't have release notes for 26.1.0?15:28
spatelhttps://docs.openstack.org/releasenotes/openstack-ansible/zed.html15:29
noonedeadpunkLooks like we didn't manage to write any15:32
noonedeadpunkAt very least we had couple of bugs closed there15:32
noonedeadpunkbut yeah. don't see any release notes :(15:33
damiandabrowskinoonedeadpunk: "what I'm planning to do, is create a directory for each group"16:16
damiandabrowskido we really need to have directory for each group? I don't think we have that many variables defined16:16
jrosserthe files are edited automatically by the version bump tool16:17
noonedeadpunk^ this16:18
jrosserit is much much easier to write them out completely without having to worry about * other vars along for the ride16:18
damiandabrowskiwhere can i find this tool?16:18
damiandabrowskihttps://github.com/noonedeadpunk/osa-bump-bot16:20
damiandabrowskiare we talking about this one?16:20
noonedeadpunkhttps://github.com/noonedeadpunk/osa_cli_releases16:20
noonedeadpunkbtw we should update links here https://docs.openstack.org/openstack-ansible/latest/contributor/periodic-work.html#osa-cli-tooling16:21
noonedeadpunkor better move utilities in code...16:21
noonedeadpunkwell, regardless, it's a very bad idea to edit yaml files that have some content, and especially since we need to render them with jinja16:22
noonedeadpunklike there's no way for them to be co-located with other variables16:22
noonedeadpunkso my question was mostly related to how to organize our work to kinda not block each other and at the same time not to push patches that will just re-arrange things...16:24
damiandabrowskito me it totally looks like it should be "fixed" in a bump tool, but if you both think that creating separate directories for each group is needed then i trust you16:28
damiandabrowskii guess it would be the best if we can merge https://review.opendev.org/c/openstack/openstack-ansible/+/871189 now16:28
damiandabrowskiyesterday it already had two +2s but I applied few really minor changes suggested by noonedeadpunk 16:28
noonedeadpunkyeah and then we also have https://review.opendev.org/c/openstack/openstack-ansible/+/879085/1516:29
damiandabrowskiyes, I'll respond to your comment in a minute, i need to check one thing16:30
jrosserfor all things like that don't we need to duplicate them in group_vars/all16:31
damiandabrowskii think if default value of the variable has some logic that may be required16:32
damiandabrowskior at least in group_vars, it doesn't need to be defined for all hosts in my opinion16:36
damiandabrowskiwe already have glance_default_store defined in inventory/group_vars/glance_all16:40
damiandabrowskiso i think we can just add these lines to glance_all file16:40
damiandabrowskihttps://paste.openstack.org/show/bvVpcWgqxSLSj08vhPrT/16:40
jrosseri do not think we can override things from vars/16:41
jrosserlike the one starting _16:41
noonedeadpunkcan or should?16:42
noonedeadpunkas we obviously can16:42
noonedeadpunkBut most likely should not16:42
noonedeadpunkwe can not override ones, that are in debian/redhat, since they will get overloaded on  vars include16:43
noonedeadpunkbut vars/main.yml can be overridden16:43
jrosserbut role vars are higher precedence than group_vars?16:44
jrosserwhich is why we "reflect" them generally through defaults/ and prefix with _ as a reminder16:44
noonedeadpunkah, yes, right, over group_vars they have precedence16:45
jrosseryes sorry by override i didnt mean -e, that obviously wins16:45
noonedeadpunkoh, well, even over play vars.. huh16:45
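
The precedence point discussed above can be sketched in Python. This is a simplified model under stated assumptions, not Ansible's actual resolver: it lists only five of the documented precedence levels, and the `resolve` helper and the sample values are hypothetical.

```python
# Simplified model of Ansible variable precedence, lowest to highest.
# Only a subset of the documented levels is listed here.
PRECEDENCE = [
    "role_defaults",  # defaults/main.yml in the role
    "group_vars",     # inventory group_vars
    "play_vars",      # vars: section of the play
    "role_vars",      # vars/main.yml in the role
    "extra_vars",     # -e on the command line
]


def resolve(name, sources):
    """Return (value, winning_level) for a variable, or (None, None)."""
    value, winner = None, None
    for level in PRECEDENCE:
        # A later (higher) level replaces whatever a lower level set.
        if name in sources.get(level, {}):
            value, winner = sources[level][name], level
    return value, winner


# Hypothetical values, for illustration only.
sources = {
    "group_vars": {"_glance_available_stores": ["file"]},
    "role_vars": {"_glance_available_stores": ["file", "http"]},
}
print(resolve("_glance_available_stores", sources))
```

In this model the group_vars value is silently ignored because the same name also exists in role vars, which is the surprise being described; only an `extra_vars` entry would win over it.
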
noonedeadpunkhm, then it looks like https://opendev.org/openstack/openstack-ansible-ceph_client/src/branch/master/vars/main.yml#L19-L46 should not be there16:46
noonedeadpunkanyway16:46
jrosserultimately it depends where we use it16:46
jrosserand if ever it is to be overridden16:47
damiandabrowskibut haproxy config doesn't know anything about role defaults or variables as haproxy role is imported outside the role16:47
jrosserit's ok if we don't describe it in defaults/main.yml imho16:47
noonedeadpunkwell, we have release notes like this https://opendev.org/openstack/openstack-ansible-ceph_client/src/branch/master/releasenotes/notes/move-gnocchi-component-118ae07fce3562e1.yaml16:47
noonedeadpunklike it makes sense to be defined per group then16:48
noonedeadpunkor maybe not as it's filtered anyway by groups...16:48
damiandabrowskiadditionally, only _glance_available_stores is defined in role vars and it only refers to other variables so it's not relevant IMO16:48
noonedeadpunkmeh16:48
jrosserright - and i think there is confusion because we're in the mind of saying to users "put that in user_*.yml" and that it doesnt matter16:48
jrosserbut then trying to use group_vars to be selective rather than global, all this then really does matter16:49
noonedeadpunkwell, it's just impossible to do some things without group_vars16:49
jrosseri still think we shouldnt override things in main repo group vars which are notionally private in role vars16:50
noonedeadpunkyes, agree here16:50
noonedeadpunkI'd say we should make this variable public first16:50
jrosseryep16:51
noonedeadpunkand then override it's default in group_vars16:51
jrosserotherwise there is 8-O with the vars precedence and surprises16:51
noonedeadpunkso patch to glance is needed16:51
damiandabrowskiso do i understand it correctly? add https://paste.openstack.org/show/bvVpcWgqxSLSj08vhPrT/ to inventory/group_vars/glance_all and move _glance_available_stores from role vars to role defaults?16:54
noonedeadpunkwell, you can do glance_available_stores: "{{ _glance_available_stores }}" in role defaults16:57
damiandabrowskiis there any reason not to do `glance_available_stores: "{{ [ glance_default_store ] + glance_additional_stores }}"` ?16:58
noonedeadpunkand in group_vars you don't actually need to re-define glance_additional_stores16:58
damiandabrowskiwhy?16:58
noonedeadpunkwell, you can move that to defaults as is16:58
damiandabrowskias I said before, haproxy role is imported outside glance role so it doesn't have any knowledge about glance role defaults16:59
damiandabrowskiso glance_additional_stores has to be defined in group_vars if its used in haproxy service definition16:59
noonedeadpunkwell, I meant you can do like glance_use_uwsgi: "{{ ('ceph' not in [ glance_default_store | default('') ] + glance_additional_stores | default([]) ) }}"17:01
noonedeadpunkso we don't maintain defaults in 2 places17:01
noonedeadpunkwhile keeping the logic17:02
noonedeadpunkwell, as long as we won't add ceph to default store17:02
damiandabrowskiyeah i can if you think it's better, but haproxy_backend_ssl will be really hard to read then17:02
damiandabrowskiit's already quite complex17:02
damiandabrowskihaproxy_backend_ssl: "{{ (glance_use_uwsgi | default(True)) | ternary((glance_backend_ssl | default(openstack_service_backend_ssl)), False) }}"17:02
noonedeadpunkwell, you can move glance_use_uwsgi outside of haproxy logic and just get it defined as well17:03
noonedeadpunkthen it will be kinda same - haproxy_backend_ssl: "{{ glance_use_uwsgi | ternary((glance_backend_ssl | default(openstack_service_backend_ssl)), False) }}"17:04
damiandabrowskii have literally 5 minutes left, I'll try to upload required changes before i leave17:05
noonedeadpunkso you just skip defining glance_additional_stores as we know ceph is not there by default17:05
noonedeadpunkand it's not in glance_default_store17:05
noonedeadpunkor well, glance_default_store is already in group_vars17:06
noonedeadpunkso no need for default there17:06
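
The two Jinja expressions quoted above can be restated as plain Python to make the logic easier to read. This is a sketch only: the function names mirror the variable names from the chat, `None` stands in for an undefined Ansible variable, and the argument defaults are assumptions.

```python
def glance_use_uwsgi(default_store="", additional_stores=None):
    """True unless 'ceph' is among the default or additional stores."""
    stores = [default_store] + list(additional_stores or [])
    return "ceph" not in stores


def haproxy_backend_ssl(use_uwsgi, service_backend_ssl, glance_backend_ssl=None):
    """Mirror of: use_uwsgi | ternary(glance_backend_ssl | default(service_backend_ssl), False)."""
    if not use_uwsgi:
        # Second branch of the ternary: no TLS on the backend.
        return False
    if glance_backend_ssl is None:
        # Models default(): fall back to the deployment-wide setting.
        return service_backend_ssl
    return glance_backend_ssl


use_uwsgi = glance_use_uwsgi("file", ["http"])
print(use_uwsgi, haproxy_backend_ssl(use_uwsgi, service_backend_ssl=True))
```

Computing `glance_use_uwsgi` once and passing it in keeps the haproxy expression short, which is the readability argument made in the discussion.
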
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible-os_glance master: Move _glance_available_stores to defaults  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/88087217:15
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends  https://review.opendev.org/c/openstack/openstack-ansible/+/87908517:16
damiandabrowskihopefully i didn't make any mistake there17:17
damiandabrowskisee you tomorrow guys17:17
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Update release name to Antelope  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/88076118:09
psyminfor a test deployment, are internal_lb_vip_address and external_lb_vip_address required?19:03
jrosserpsymin: both are always required and should never be the same19:04
psyminmy goal at the moment is to have a playbook configured that will deploy to the three servers, even if it isn't what I want.  Then recreate the targets, change the playbook config, and redeploy.19:04
psyminIf those IPs are necessary, do I first need to configure haproxy or will ansible do that for me with the config info?19:05
jrosserthat is all done for you19:07
psyminnice19:07
jrosserhow many controllers are you using?19:07
psyminI'm having some terminology confusion.  I'd like to use three controllers, but would make do with one.  I'll look up which openstack services make up the "controller"19:08
jrosserapi services / database / message queues etc19:09
jrosserif you only define one, then I think you need to manually assign the internal/external vip to your interfaces19:09
psyminwhich network should the vips be on?19:10
jrosserif you have multiple controllers then keepalived is responsible for the IPs for H/A19:11
psyminDo I need to manually configure keepalived before deploying or does ansible do that?19:12
jrosserpsymin: keepalived is all done for you19:27
psyminnice19:27
jrosserpsymin: regarding which network the bios should be on, internal should be on the mgmt network19:28
jrosserexternal really depends on your use case19:28
jrossernot bios :( … vips I mean19:28
psyminwhew, thanks for the clarification :)19:29
* jrosser autocorrect19:29
jrosserI think it’s important to think clearly about two different things19:29
psyminI assume I should choose an IP for external_lb_vip_address that is on an interface and network that can route to the internet?19:29
jrosserone, you are deploying the cloud and want all its components to talk to each other via the internal vip with sufficient security/whatever19:30
jrossertwo, you are the user of the cloud accessing the dashboard on some ip that is the external vip19:31
jrosserand as an extension of that, as a user you make some vm which need an IP on whatever network is appropriate for your situation19:31
psymincurrently I can only access the machines using the management IPs19:31
psyminwhich is acceptable to me at the moment19:32
jrosserthat’s ok19:33
psyminI doubt we'll access the dashboard from the external IP19:35
jrosseryou will, that’s how it works19:35
psyminwe'd use wireguard to connect to the private network and from there to the dashboard19:35
psyminahh19:35
jrosserexternal means “external side of haproxy”19:35
jrosserwhich may or may not be “external” in terms of your network19:36
jrosserit could be on your lan19:36
psyminfor safety, can that external side of haproxy be the management network which isn't routable to the internet?19:36
jrosserI’m not sure how to answer that19:37
psyminI probably am thinking of the network incorrectly.  19:38
jrosseras it would be radically different depending if you were building a cloud with actual internet ip on the outside19:38
jrosseror if you make something to run test workloads internally behind your firewall for example19:38
psyminAny time we'll need an actual IP I believe we'll manually map that through in the router to the destination service.19:38
jrosserit’s totally use case dependant, and one of the biggest reasons there’s no right answer19:38
psyminone usage case is to run the mail server.  No public IPs bound to the server, but ports forwarded in the router to it.  (not currently on openstack)19:39
jrosseri think a diagram is good19:41
psymincan you suggest a linux-friendly diagram program :)19:42
jrosseryour cloud has an outside (haproxy external vip and your vm networking)19:42
jrosserand it has an inside, haproxy internal vip, mgmt, storage, vxlan etc19:42
jrosserwith haproxy sat right at the center of that19:42
psyminimagining this diagram, would the network that I ssh to a target from be considered "outside" ?19:45
jrosserI think that depends where on the diagram you would put yourself when running playbooks/doing admin on the hosts19:46
jrossercould be vpn, or your existing company network, basically how you’d be comfortable managing hosts or network gear today19:47
jrosserthen as a user of the cloud, accessing the dashboard, run terraform or whatever you’d place yourself somewhere a bit different maybe, as in most cases the admin of the cloud does that for some customers/users19:48
jrosserand finally you have your services, like your email server that might be accessible from the whole internet19:49
jrosserhopefully thinking about it like this clarifies the purpose of the different networks like mgmt and external vip a bit19:49
psyminI'll try to lay out what we're hoping for in a diagram.19:50
psyminon a slight tangent since work pressure is ramping up, how bad of an idea would it be to use multi node AIO?19:52
jrosserimho that is advanced level openstack-ansible19:54
psyminokay, if it were all up to you, and you had three servers, how would you set it up?19:54
psyminusage case being one mail server and no customers19:54
noonedeadpunkpsymin: I think I would split services by servers. 20:59
psyminnoonedeadpunk, so one controller, one storage, one compute?21:00
noonedeadpunkNo, not really. So, let's say nova-api and glance on node01 and node02, cinder-api and keystone on node02 and node03 and placement with neutron on node01 and node0321:01
noonedeadpunkall of them are part of galera and rabbitmq clusters, as well as all used for VMs (nova-compute)21:02
noonedeadpunkThere're more services to consider though, but I guess you've got the gist21:02
noonedeadpunkbut idea is to have 2 copies of everything spread instead of 3 copies like suggested by docs21:03
noonedeadpunkexcept galera and rabbit that must have 3 copies for quorum21:03
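
The three-node layout described above can be written down and checked with a short Python sketch. The service-to-node mapping is taken from the messages above; the `check_layout` helper, the `node01`..`node03` list and the minimum-copy thresholds as code are illustrative assumptions, and the other services mentioned as "more services to consider" are not covered.

```python
from collections import Counter

NODES = ["node01", "node02", "node03"]

# Placement as described in the chat: two copies of each API service,
# three copies of galera and rabbitmq, nova-compute on every node.
PLACEMENT = {
    "nova-api": ["node01", "node02"],
    "glance": ["node01", "node02"],
    "cinder-api": ["node02", "node03"],
    "keystone": ["node02", "node03"],
    "placement": ["node01", "node03"],
    "neutron": ["node01", "node03"],
    "galera": NODES,
    "rabbitmq": NODES,
    "nova-compute": NODES,
}

QUORUM_SERVICES = {"galera", "rabbitmq"}  # need 3 copies for quorum


def check_layout(placement):
    """Return a list of services that have too few distinct hosts."""
    problems = []
    for service, hosts in placement.items():
        required = 3 if service in QUORUM_SERVICES else 2
        if len(set(hosts)) < required:
            problems.append(service)
    return problems


# Count how many services land on each node to see whether load is even.
load = Counter(host for hosts in PLACEMENT.values() for host in hosts)
print(check_layout(PLACEMENT), dict(load))
```

With this mapping every node carries the same number of services, and losing any single node still leaves one copy of each API service running.
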
-opendevstatus- NOTICE: The Etherpad service on etherpad.opendev.org will be offline for the next 90 minutes for a server replacement and operating system upgrade21:57
opendevreviewMerged openstack/openstack-ansible stable/yoga: Bump OpenStack-Ansible Yoga  https://review.opendev.org/c/openstack/openstack-ansible/+/88047923:05
