Tuesday, 2020-12-08

04:40 <openstackgerrit> Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Fix caps issue to enable powertools repo  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906
04:42 <openstackgerrit> Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Fix caps issue to enable powertools repo  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906
04:47 <openstackgerrit> Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Fix caps issue to enable powertools repo  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906
04:53 <openstackgerrit> Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: CentOS 8.3 Fix caps issue to enable powertools repo  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906
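For context on the "caps issue" these patches fix: CentOS 8.3 renamed the repository id from `PowerTools` to `powertools`, so enabling it under the old capitalised name stopped working. A minimal illustration, assuming dnf-plugins-core is installed:

```bash
# CentOS <= 8.2 shipped the repo id capitalised:
dnf config-manager --set-enabled PowerTools
# CentOS 8.3 lower-cased it, so the old name no longer matches:
dnf config-manager --set-enabled powertools
```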
06:40 <openstackgerrit> Satish Patel proposed openstack/openstack-ansible-openstack_hosts master: Add support of CentOS 8.3 for aio  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906
07:22 <noonedeadpunk> mornings
08:16 <jrosser> morning
08:17 <jrosser> i guess we are going to need to backport https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839
08:33 <noonedeadpunk> yeah, looks like we do
08:41 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible master: Ensure kuryr repo is available within CI images  https://review.opendev.org/c/openstack/openstack-ansible/+/765765
08:42 <jrosser> ^ that's the blocker for zun i think
08:42 <noonedeadpunk> I don't think depends-on will work here
08:42 <noonedeadpunk> because we clone the tests repo with run_tests?
08:42 <noonedeadpunk> or do we check if we're in ci there?
08:42 <noonedeadpunk> (can't actually recall)
08:42 <jrosser> iirc there is a symlink perhaps, but it could be broken
08:43 <noonedeadpunk> will see
08:45 <jrosser> https://github.com/openstack/openstack-ansible/blob/master/run_tests.sh#L67-L88
08:45 <noonedeadpunk> oh
08:48 <noonedeadpunk> I'm a bit afraid of merging kuryr on master...
08:49 <jrosser> yes i saw your comment
08:49 <jrosser> this is a bit tricky
08:49 <noonedeadpunk> I'm not sure how exactly kuryr works, but couldn't this result in broken db migrations or smth like this?
08:50 <jrosser> the zun stuff won't work without https://opendev.org/openstack/kuryr/commit/d36befada61e1376479536c0e62d1e769eee846c
08:51 <noonedeadpunk> won't there be some migration problems when we downgrade the version of kuryr?
08:51 <noonedeadpunk> just cherry-picked it to victoria
08:51 <jrosser> awesome thanks
08:52 <jrosser> i'm not sure - but afaik kuryr is a network driver for docker
08:52 <jrosser> *neutron driver for docker
08:52 <andrewbonney> Yeah, as far as I can see it's relatively separate given they're intending to remove it in a future release
08:53 <andrewbonney> Fwiw the changes to master since branching on kuryr look mostly packaging/testing related to date
08:53 <jrosser> the only reason that andrewbonney's patch points at master is that there is no backport of the fix merged yet
08:54 <jrosser> so it really depends how much we want to merge the zun stuff now, or wait for the kuryr backport
08:54 <noonedeadpunk> yeah, I got that. just thinking how safe it would be to do an rc on master. considering it's just an rc - I guess that should be ok...
08:55 <noonedeadpunk> (considering we most likely will forget to bump it back)
08:55 <noonedeadpunk> looking through the code quickly I didn't find anything that could break things while downgrading, so agreed, let's merge it
08:56 <jrosser> we had an etherpad for V didn't we?
08:56 <noonedeadpunk> I think we were writing to the ptg one
08:56 <noonedeadpunk> https://etherpad.opendev.org/p/osa-wallaby-ptg
08:58 <noonedeadpunk> oh, btw, we need to backport that to U https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/754122/9/vars/redhat.yml
08:58 <noonedeadpunk> in a more nasty way, to support centos 7 as well...
08:59 <jrosser> looks like the depends-on has worked in the linters job: 2020-12-08 08:50:54.517944 | ubuntu-bionic | ++ sudo /usr/bin/pip3 install 'bindep>=2.4.0' tox 'virtualenv<20.2.2'
08:59 <jrosser> but then sadly ARA has bombed out
09:00 <jrosser> sqlite3.OperationalError: no such table: playbooks
09:00 <noonedeadpunk> but linters overall succeeded
09:01 <jrosser> yes, should we make a tiny change to the patch to cancel the job and re-try?
09:01 <noonedeadpunk> 765765 is still passing?
09:03 <noonedeadpunk> so let it run maybe
09:04 <jrosser> oh you are right - i thought that the ara error would be failing it
09:04 <jrosser> so for https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/754122/9/vars/redhat.yml
09:04 <jrosser> do we need a nasty backport to U for centos7 or just switch the job to centos 8 on U?
09:05 <jrosser> it won't cherry-pick in gerrit but it seems to be clean when i do the whole patch here
09:05 <noonedeadpunk> well, we need to ensure that things will work against centos 7 as well?
09:06 <noonedeadpunk> and on centos7 mod_proxy_uwsgi is needed
09:06 <noonedeadpunk> and we can't just use a ternary here
09:06 <noonedeadpunk> (as an empty element in the list makes the package module fail)
09:06 <jrosser> so you'd like a 7 and an 8 job on U?
09:07 <noonedeadpunk> no, just a centos 8 job, but if we just drop mod_proxy_uwsgi - we will break the migration path for centos 7?
09:07 <noonedeadpunk> as on U we still support 7?
09:08 <jrosser> we can split out redhat-7.yml and redhat-8.yml vars files on U as part of the backport
09:08 <noonedeadpunk> well, the upgrade will work since the package will be present...
09:08 <jrosser> kind of ugly too....
09:12 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone stable/ussuri: Move openstack-ansible-uw_apache centos job to centos-8  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/765928
09:13 <noonedeadpunk> maybe like this? ^
09:14 <jrosser> yes looks reasonable, needs the zuul jobs for 7 to stay though
09:15 <jrosser> i always forget we can build the vars inline like that
09:15 <noonedeadpunk> it's nasty as well though :(
09:16 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone stable/ussuri: Move openstack-ansible-uw_apache centos job to centos-8  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/765928
09:19 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-os_keystone master: Remove centos-7 conditional packages  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/765931
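The "vars inline" pattern being referenced here, sketched with illustrative names (not the literal patch content): build the package list in one expression so the centos-7-only package is appended conditionally and no empty element ever lands in the list.

```bash
# Sketch of a vars/redhat.yml for the ussuri backport; the variable and
# package names are illustrative.
cat > vars/redhat.yml <<'EOF'
keystone_distro_packages: "{{ ['httpd', 'mod_ssl'] + (['mod_proxy_uwsgi'] if ansible_distribution_major_version == '7' else []) }}"
EOF
```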
09:25 <noonedeadpunk> and we need https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906 as well....
09:26 <noonedeadpunk> except I'm not sure about https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906/5/vars/redhat-8.yml
09:27 * jrosser wonders where the zuul job for that is
09:28 <jrosser> that's testing > 8.3? sounds odd
09:28 <noonedeadpunk> yeah
09:29 <noonedeadpunk> And I think that check of the kernel version should work here
09:32 <noonedeadpunk> ah, it has a 4.18.0 kernel...
09:35 <jrosser> i expect those vars have just been copied straight over from debian/ubuntu
09:35 <jrosser> so it's probably reasonable to have a different version for centos
09:39 <jrosser> all these variables we have like cinder_git_project_group: ... do they do anything any more
09:39 <jrosser> or are they just history from repo_build?
09:41 <noonedeadpunk> iirc history from repo_build
09:42 <noonedeadpunk> let me check the bump script just in case
09:50 <noonedeadpunk> no, can't find anything
09:50 <noonedeadpunk> I guess we can drop it now
09:50 <SiavashSardari> morning
09:51 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-os_aodh master: Remove centos-7 conditional packages  https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/765934
09:52 <SiavashSardari> noonedeadpunk: do we have a plan for backporting 'Explicitly use rabbitmq collection' in the rabbitmq_server repo to the ussuri branch?
09:53 <SiavashSardari> of course with all the dependencies in the role requirements, etc.
09:53 <noonedeadpunk> Hi. I think no
09:53 <noonedeadpunk> We use ansible 2.9 in U and we won't change that
09:54 <noonedeadpunk> we probably could use collections, but I'm not sure there were any changes in the collection compared to the module in 2.9
09:54 <SiavashSardari> I checked the rabbitmq collection and it says it requires ansible 2.9+ so I thought maybe it's not a bad idea
09:55 <noonedeadpunk> the problem with collections on 2.9 is that you can't install them from git, and ansible galaxy is soooooooo unstable
09:56 <SiavashSardari> oh, didn't know that.
09:57 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-os_ceilometer master: Remove centos-7 conditional configuration  https://review.opendev.org/c/openstack/openstack-ansible-os_ceilometer/+/765956
09:58 <SiavashSardari> but we use the openstack collection with galaxy. should that change too?
09:58 <noonedeadpunk> iirc we needed some change that was present only in the collection
09:59 <noonedeadpunk> well we can backport, but honestly it's pretty much work without any feasible profit
09:59 <noonedeadpunk> we have a pinned rabbit version which works with the built-in module
10:00 <noonedeadpunk> and I see no reason to do work that we can avoid doing :)
10:02 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Add support of CentOS 8.3 for aio  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906
10:03 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Ensure kuryr repo is available within CI images  https://review.opendev.org/c/openstack/openstack-ansible/+/765765
10:06 <SiavashSardari> I got your point. I'm going to add some rabbitmq shovels for our internal use, and I wanted to use collections on the U release. I guess I'll use the ansible modules till the next upgrade. thank you
10:07 <jrosser> SiavashSardari: on ussuri we took a really quite conservative approach to collections, pretty much to try things out. At that time the openstack modules had just been moved to a collection and we wanted to get at the updated modules there
10:08 <jrosser> though for Victoria we have switched completely, by going from ansible to ansible-base and complete use of collections
10:09 <noonedeadpunk> we don't even have USER_COLLECTION_FILE for U
10:09 <jrosser> in particular we were very much stuck with rabbitmq upgrades because the newer version of rabbit we wanted changed things which broke the module inside ansible
10:09 <jrosser> that was a reason for needing to move to the collection
10:10 <noonedeadpunk> I think we got this change merged at the end of the day :)
10:10 <SiavashSardari> Thanks for the explanation.
10:11 <SiavashSardari> qq: I've never come across USER_COLLECTION_FILE in OSA. what is that?
10:12 <noonedeadpunk> you can define a set of collections that will be installed with the bootstrap-ansible script
10:12 <jrosser> SiavashSardari: https://github.com/openstack/openstack-ansible/commit/ef1061a021a9b557d3dfb7f6e632b078e81e2f08
10:12 <noonedeadpunk> by default it's the user-collection-requirements.yml file in the openstack-ansible path, but it can be adjusted with the $USER_COLLECTION_FILE env var
10:12 <jrosser> that will be in master/victoria
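A minimal sketch of how that fits together on master/victoria (the collection entry is illustrative; the file name and env var are the ones described above):

```bash
# Default location is inside the openstack-ansible checkout:
cat > /opt/openstack-ansible/user-collection-requirements.yml <<'EOF'
collections:
  - name: community.rabbitmq
    source: https://galaxy.ansible.com
EOF
# ...or point the bootstrap script at a custom location:
export USER_COLLECTION_FILE=/etc/openstack_deploy/user-collection-requirements.yml
cd /opt/openstack-ansible && scripts/bootstrap-ansible.sh
```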
10:12 <noonedeadpunk> yeah :)
10:15 <SiavashSardari> That is a great idea, if we had it in U that would help me a lot. for now I use the ansible-collection-requirements file, but I'm definitely looking forward to the next release to take advantage of that.
10:16 <SiavashSardari> jrosser: Thanks for the link
10:21 <noonedeadpunk> uh, how annoying centos is...
10:22 <noonedeadpunk> they seem to have exactly the same kernel, but just the modules are merged
10:22 <noonedeadpunk> whaaat
10:24 <jrosser> that's not just an artefact of the CI node is it?
10:25 <noonedeadpunk> nope... `Red Hat Enterprise Linux 8.3 is distributed with the kernel version 4.18.0-240.` https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/8.3_release_notes/index#enhancement_kernel
10:25 <noonedeadpunk> well, 8.2 had 4.18.0-193
10:26 <noonedeadpunk> I'm not sure we can catch this with a version test
10:26 <noonedeadpunk> and probably doing it spatel's way, checking the distro, is more reliable then...
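Why the kernel test can't work here, and what the distro check looks like instead (the release strings are examples):

```bash
# 8.2 and 8.3 both report a 4.18.0 kernel; only the build suffix differs,
# which a plain version comparison won't capture:
uname -r                  # 4.18.0-193.* on 8.2, 4.18.0-240.* on 8.3
# The distro point release is unambiguous, so test that instead:
cat /etc/centos-release   # e.g. "CentOS Linux release 8.3.2011"
# (in Ansible terms: ansible_distribution_version is version('8.3', '>='))
```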
10:53 <noonedeadpunk> and what bad timing for all of this, each time...
10:54 <noonedeadpunk> I feel like we might have a cross-dependency between 765839 and 765906
10:57 <noonedeadpunk> oh no.... http://paste.openstack.org/show/800825/
10:59 <noonedeadpunk> that's on centos 8.3
11:01 <noonedeadpunk> well, it might be pretty ok, probably this was happening before, but the legacy install (without the wheel build) also fails...
12:00 <jrosser> noonedeadpunk: for 765839 shall i make the centos-8 job nv, then we merge it?
12:00 <jrosser> otherwise we are in difficulty
12:05 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-tests master: Bump virtualenv to version prior to 20.2.2  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839
12:09 <noonedeadpunk> yeah, good choice, as CI is much faster for 765839
12:11 <jrosser> so the wheel build failure for systemd-python, is that something *else* we need to fix (!) ?
12:11 <noonedeadpunk> yeah....
12:11 <noonedeadpunk> maybe it was just some floating thing...
12:13 <noonedeadpunk> but I'm afraid it's not
12:13 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-tests master: Return centos-8 jobs to voting  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765986
13:13 <admin0> utility container .. TASK [python_venv_build : Install python packages into the venv] almost hangs and takes ages .. is that normal ?
13:14 <admin0> i am building 3 clusters in parallel, i see the same behaviour .. especially in 21.2.0 .. i don't think it was this long in 21.1.0
13:23 <jrosser> admin0: it's not normal, but without some debug it's hard to say
13:23 <jrosser> it will probably be building the wheels, and there is a log for that in the repo server container
13:46 <admin0> jrosser, in the utility container -- python_venv_build : Install python packages into the venv - the repo is built in utility ?
13:47 <jrosser> the python wheels are built in the repo container
13:48 <jrosser> things are being installed into a venv in the utility container, like the openstack client
13:48 <jrosser> but the stuff that goes into that venv is actually compiled on the repo container
13:49 <admin0> aah . now clear
13:50 <jrosser> it is running the python_venv_build ansible role to do that
13:50 <jrosser> so "TASK [python_venv_build : Install python packages into the venv] almost hangs and takes ages .. is that normal ?"
13:50 <jrosser> ^ with that, my first debugging step would be to go look at the log in the repo container
13:52 <openstackgerrit> Marc Gariépy proposed openstack/openstack-ansible-os_horizon master: Add ability to configure ALLOWED_HOSTS for horizon.  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998
14:03 <spatel> jrosser: centos 8.3 is out and it's failing to build AIO at https://opendev.org/openstack/openstack-ansible-lxc_hosts/src/branch/master/tasks/lxc_cache_preparation.yml#L85
14:03 <spatel> looks like an issue with epel-lxc_host.repo
14:03 <spatel> I am debugging to see what is going on
14:04 <jrosser> you've made a patch though - or something different?
14:04 <jrosser> i am also just running some stuff in the 8.3 VM
14:04 <spatel> I did a patch here https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/765906
14:05 <spatel> but now i am stuck at the next step, which i am debugging, and will commit a patch if i find something
14:06 <spatel> it would be good to have a Zuul job for 8.3
14:07 <jrosser> spatel: it sadly doesn't work like
14:07 <jrosser> that
14:07 <jrosser> one day it is 8.2, the next it is 8.3, already in our jobs
14:07 <jrosser> so everything is broken completely
14:10 <spatel> jrosser: looks like i was having an issue last night but right now it's working, maybe something was funky about the repo last night
14:10 <spatel> jrosser: +1
14:11 <jrosser> noonedeadpunk: i have reproduced the failure to build systemd-python
14:16 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Cleanup group_vars  https://review.opendev.org/c/openstack/openstack-ansible/+/766001
14:17 <noonedeadpunk> jrosser: from the logs it looked like something I have no idea how to fix, except symlinking the module from the system python...
14:18 <jrosser> yeah i'm just trying to find what is going on
14:18 <jrosser> "in expansion of macro 'LIBSYSTEMD_VERSION'" <- not even finding where that is defined right now
14:19 <noonedeadpunk> maybe it gets defined during the wheel build?
14:19 <noonedeadpunk> I think the wheel builds were failing previously as well, but then they got installed properly somehow...
14:22 <jrosser> it's from the libsystemd-dev headers i think
14:23 <jrosser> spatel: have you got any centos8 not-8.3 hosts around?
14:23 <spatel> yes i have centos 8.3 in the lab
14:24 <jrosser> no, i want earlier, 8.2 or something
14:24 <spatel> i have 8.2 also
14:24 <jrosser> ok can you try `pkg-config --modversion libsystemd`
14:24 <spatel> doing it on 8.2
14:25 <spatel> http://paste.openstack.org/show/800838/
14:26 <jrosser> if that's an OSA install can you try the same in the repo container?
14:27 <spatel> same error inside the containers
14:29 <mgariepy> haproxy_endpoint set state failed to connect on master now ?
14:30 <jrosser> spatel: if you don't mind installing the systemd-devel package it should give a proper output
14:30 <spatel> doing it now
14:31 -spatel- [root@infra-lxb-1 ~]# pkg-config --modversion libsystemd
14:31 -spatel- 239 (239-41.el8_3)
14:32 <jrosser> oh hrrm will that have installed the 8.3 version of that package?
14:32 * jrosser curses centos
14:33 <spatel> but you can get the version info using systemctl --version
14:33 <spatel> why do you want to use pkg-config ?
14:34 <jrosser> the wheel build is compiling C code
14:35 <jrosser> and C build systems make heavy use of pkg-config to find out 'things' like versions and required linker flags for system libraries
14:35 <spatel> hmm
14:35 <jrosser> and that is where centos 8.3 is currently going all wrong
14:36 <jrosser> i want to compare the output from that pkg-config command on 8.3 with something earlier
14:36 <spatel> ok
14:36 <jrosser> because if you take the output you just gave `239 (239-41.el8_3)` and try to do some primitive version compare on that string it's going to not work
14:37 <jrosser> when i do the same on an ubuntu bionic box i get a plain version like `237`
14:37 <spatel> got it
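The comparison in one place, with the outputs as reported in this conversation:

```bash
# ubuntu bionic returns a bare number that any version compare can handle:
pkg-config --modversion libsystemd    # -> 237
# centos 8.3 (with systemd-devel installed) returns a composite string,
# which breaks naive parsing in build scripts such as python-systemd's:
pkg-config --modversion libsystemd    # -> 239 (239-41.el8_3)
```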
14:39 <openstackgerrit> Marc Gariépy proposed openstack/openstack-ansible-os_horizon master: Add ability to configure ALLOWED_HOSTS for horizon.  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998
14:57 <spatel> jrosser: my playbook is also failing at TASK [python_venv_build : Build wheels for the packages to be installed into the venv]
14:57 <jrosser> for keystone?
14:57 <spatel> yes
14:57 <jrosser> yes that is where it is also failing here
14:57 <spatel> http://paste.openstack.org/show/800842/
14:58 <mgariepy> https://blog.centos.org/2020/12/future-is-centos-stream/ << more fun to come.
14:59 <jrosser> spatel: sure, noonedeadpunk has already posted the underlying error before http://paste.openstack.org/show/800825/
14:59 <jrosser> this is why i want the pkg-config output
15:00 <spatel> jrosser: ok let me know when you cut a patch
15:01 <noonedeadpunk> whaaat - another fedora?
15:02 <mgariepy> "If you are using CentOS Linux 8 in a production environment, and are concerned that CentOS Stream will not meet your needs, we encourage you to contact Red Hat about options."
15:02 <noonedeadpunk> yeah, was really just reading this
15:02 <noonedeadpunk> Well, this feels totally like IBM influence
15:03 <spatel> "If you are using CentOS Linux 8 in a production environment, and are concerned that CentOS Stream will not meet your needs, we encourage you to contact Red Hat about options."
15:03 <noonedeadpunk> As well as the point about dropping CentOS support
15:03 <spatel> yikes...
15:03 <jrosser> this is just too much
15:03 <jrosser> as it breaks all the branches all the time
15:03 <jrosser> nothing is stable ever again
15:03 <mgariepy> yep. it sounds a lot like IBM..
15:04 <mgariepy> let's drop centos and add arch. ;p
15:04 <noonedeadpunk> it's like - how are we supposed to sell RHEL when there's centos for free which is as stable as rhel...
15:04 <noonedeadpunk> at least arch has nice docs lol. But not sure if anybody uses it in prod?
15:05 <mgariepy> i use it in prod as on my laptop lol
15:05 <mgariepy> but i would never do servers with it hahaha
15:05 <mgariepy> i am not insane (at least i don't think i am)
15:06 <spatel> This is good news for the Ubuntu community..
15:07 <mgariepy> i don't think it's good news for ubuntu, but it's a really bad one for centos
15:07 <noonedeadpunk> +1
15:08 <spatel> IBM wants more $$ (no free donuts)
15:08 <noonedeadpunk> pretty much afraid that ubuntu may follow the pattern
15:10 <ThiagoCMC> BTW, I've never ever used CentOS, for me it was always an unstable AH, not to mention super hard to install and maintain things, and a dozen repos. Debian has thousands of packages ready-to-go, no need to add third party repos.
15:10 <spatel> Time to fire up a gentoo lab
15:10 <ThiagoCMC> Debian is rock solid, Ubuntu will keep the LTS release as-is, I bet.
15:10 <mgariepy> let's fork rhel to eyebeeemOS
15:11 <ThiagoCMC> Let
15:11 <ThiagoCMC> Let's forget about RH-based distros lol
15:12 <jrosser> this is a serious point though - i've spent pretty much all my day today trying to figure out WTF is going on with a distro i don't even use
15:12 <jrosser> this is not sustainable if it's going to change all the time
15:13 <spatel> agreed, anyway next year it will be over
15:13 <spatel> i don't think people will use CentOS Stream in production
15:13 <mgariepy> what's the point of working on it now if it's all over in a year?
15:13 <spatel> Still some folks like me using it :)
15:14 <spatel> maybe next year i'll have to decide which way to go
15:14 <mgariepy> well. yep but. hey, install this, then switch your os to something else.
15:14 <mgariepy> before upgrading.
15:14 <ThiagoCMC> If OSA focuses only on Debian/Ubuntu, I'm super happy! Debian is awesome, even with systemd. lol
15:15 <jrosser> on a positive note i think i have a gross workaround for the keystone systemd_python build error
15:15 <mgariepy> LOL
15:15 <kleini> systemd is great, maybe sometimes buggy
15:15 <spatel> Once i install it i am not going to touch it for the next few years..
15:15 <jrosser> spatel: do you have an AIO at the point keystone failed?
15:16 <spatel> yes jrosser
15:16 <jrosser> spatel: can you stick LIBSYSTEMD_VERSION="239" into /etc/environment inside the repo container and re-run the keystone playbook?
15:17 <spatel> ok
15:17 <jrosser> this may/may not work :/
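Spelled out, the experiment is (run inside the repo container; 239 is the major version this host reports via systemctl --version, and the playbook invocation assumes a standard OSA checkout):

```bash
# Pin the version so the systemd-python build doesn't have to parse the
# composite pkg-config output, then re-run the failing playbook:
echo 'LIBSYSTEMD_VERSION="239"' >> /etc/environment
cd /opt/openstack-ansible/playbooks && openstack-ansible os-keystone-install.yml
```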
15:18 <spatel> running the playbook
15:20 <spatel> jrosser: now it failed at a different point - http://paste.openstack.org/show/800843/
15:21 <jrosser> that's odd - did the python_venv_build step work though?
15:22 <jrosser> could do with a big chunk of the log if you can paste it
15:23 <noonedeadpunk> spatel: can you run with -e venv_rebuild=true?
15:23 <spatel> k
15:24 <jrosser> noonedeadpunk: i am trying to take advantage of this https://github.com/systemd/python-systemd/blob/d08f8dd0f4607a72f1d5497467a2f0cf5a8ee5d4/setup.py#L24-L28
15:24 <noonedeadpunk> smart move
15:25 <noonedeadpunk> had no idea they were thinking about such a situation in advance
15:25 <mgariepy> since 2016.
15:28 <spatel> noonedeadpunk: -e venv_rebuild=true did magic..
15:28 <noonedeadpunk> lol
15:30 <spatel> jrosser: are you going to patch the repo using LIBSYSTEMD_VERSION="239" ?
15:30 <jrosser> well, it's figuring out a way to do that now which isn't breaking everything else!
15:30 <spatel> +1
15:31 <jrosser> we don't generally drop environment variables much
15:32 <noonedeadpunk> we can do that in openstack_hosts for centos 8 only... and eventually I think we can just parse the output better with a regexp?
15:33 <jrosser> that would be good - you can guarantee that the version will change somehow :)
15:37 <spatel> why not just do: if distro==CentOS8, systemctl --version | head -n1 | awk '{print $2}', otherwise systemctl --version
15:37 <noonedeadpunk> otherwise we shouldn't do anything actually :p
15:37 <spatel> :)
15:39 <spatel> Anyway this regex is only for 1 year.. not sure what the CentOS Stream version will look like.
15:39 <jrosser> actually that's a good point - do we have code already somewhere that needs the systemd version
15:39 <jrosser> this var may already exist
15:39 <noonedeadpunk> hm, might be... can't recall exactly where we might have it
15:41 <jrosser> we have some tasks here to copy https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/common-tasks/os-nspawn-container-setup.yml#L16-L29
15:42 <noonedeadpunk> just lineinfile instead of the set_fact, yeah...
15:44 <jrosser> i forget how we manage /etc/environment, is it copied from the host into the lxc?
15:44 <mgariepy> global_environment_variables
15:45 <mgariepy> https://github.com/openstack/openstack-ansible-openstack_hosts/blob/master/defaults/main.yml#L147
15:47 <mgariepy> the same template exists for lxc_container_create
15:47 <mgariepy> template/var ..
15:47 <mgariepy> https://github.com/openstack/openstack-ansible-openstack_hosts/blob/master/templates/environment.j2
15:47 <noonedeadpunk> oh well, we should just update the template https://opendev.org/openstack/openstack-ansible-openstack_hosts/src/branch/master/templates/environment.j2
15:48 <mgariepy> and this one: https://github.com/openstack/openstack-ansible-lxc_container_create/blob/master/templates/environment.j2
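A sketch of the durable fix converging here, expressed as shell-level behaviour only (the eventual patch wires this through the environment.j2 templates rather than ad-hoc commands): read the libsystemd version once and keep only the leading integer, so future composite version strings still parse.

```bash
# What the role needs /etc/environment to end up containing:
libsystemd_version="$(pkg-config --modversion libsystemd | grep -oE '^[0-9]+')"
echo "LIBSYSTEMD_VERSION=\"${libsystemd_version}\"" >> /etc/environment   # -> 239
```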
15:48 <admin0> spatel, in your config, exact run .. "2020-12-08 15:47:57.748 48729 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge br-lbaas for physical network lbaas does not exist. Agent terminated!" -- is the first error i encounter on the compute nodes
15:48 <spatel> i think we copy the /etc/environment file from host to container, right?
15:48 <mgariepy> no not really, it's generated the same way tho.
15:48 <noonedeadpunk> isn't openstack_hosts also run against the lxc containers?
15:48 <admin0> do i need to create a blank br-lbaas on the compute nodes also ?
15:49 <spatel> admin0: don't create br-lbaas on the compute nodes
15:49 <admin0> ok
15:49 <spatel> the compute will use br-vlan to tunnel the br-lbaas-mgmt vlan traffic
15:50 <admin0> what do i do about that error message that causes neutron-linuxbridge-agent to die ?
15:50 <noonedeadpunk> so these might be conflicting blocks ??
15:52 <jrosser> admin0: you should not have a physical network 'lbaas', that suggests you still have config for the flat network in place
15:52 <noonedeadpunk> I think we might need to drop this from lxc_container_create
15:52 <jrosser> i have the start of a patch which i will push shortly for openstack_hosts
15:52 <jrosser> even if it needs some improvement
15:53 <admin0> jrosser, the type is raw .. i directly copied the blocks from https://satishdotpatel.github.io//openstack-ansible-octavia/
15:54 <jrosser> ok, so look in the neutron config file that has been templated out and see what you have
15:56 <admin0> hmm. i think i accidentally had mgariepy's linuxbridge override in my config
15:56 <admin0> fixing ..
15:58 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030
16:00 <jrosser> hrrm that still has the delegate_to physical_host in it
16:00 <jrosser> also i saw that we still use the 8.1 container base image, but i think there was trouble moving newer than that?
16:01 <noonedeadpunk> yep
16:02 <jrosser> not really sure about this patch tbh - hack at it if you think it needs changing
16:02 <noonedeadpunk> but, we run an upgrade of all packages in the image before storing it and creating any containers from it
16:03 <admin0> ok .. it's running .. now in the octavia logs, i see "Failed to establish a new connection: [Errno 113] No route to host" .. this is the container trying to contact the amphora instance
16:03 <spatel> Yes, i have seen 8.1 but we do run the upgrade so now i am seeing the 8.3 version in containers
16:03 <admin0> as per the example, i see br-vlan.27 and the patch is there ..
16:03 <admin0> so br-lbaas is patched to .27 .. and the amphora instance has the .27 vlan ports
16:04 <jrosser> admin0: inside the octavia container do you see eth14 with an IP you expect?
16:04 <spatel> can you post the full brctl show output?
16:04 <admin0> sure one moment please
16:05 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030
16:05 <noonedeadpunk> oh btw
16:05 <noonedeadpunk> #startmeeting openstack_ansible_meeting
16:05 <openstack> Meeting started Tue Dec  8 16:05:46 2020 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:05 *** openstack changes topic to " (Meeting topic: openstack_ansible_meeting)"
16:05 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:06 <noonedeadpunk> We don't have new bugs that are worth discussing, so
16:06 <noonedeadpunk> #topic office hours
16:06 *** openstack changes topic to "office hours (Meeting topic: openstack_ansible_meeting)"
16:07 <noonedeadpunk> well, and here I think we're on the same page, but just to sum up what went wrong during the last week
16:07 <noonedeadpunk> 1. CentOS 8.3 released and broke master and ussuri
16:08 <noonedeadpunk> 2. a new virtualenv was also released, which breaks our linter jobs
16:08 <admin0> i have not added vlan 27 on the router and .1 as the gateway ? should i do that ?
16:08 <noonedeadpunk> and that at the time when we were absolutely ready to make an rc
16:09 <admin0> spatel, https://gist.github.com/a1git/4368656babd5d74753eb1ce3e5c2bc83
16:09 <noonedeadpunk> 766030 sounds like smth that should work. except we probably need to squash it with 765906?
16:09 <noonedeadpunk> what do you think jrosser?
16:11 <jrosser> oh right yes, it's not going to work on its own
16:14 <noonedeadpunk> Well, except for this I don't have many topics to discuss... Looking forward to fixing the gates and merging the zun stuff to be able to branch
16:14 <noonedeadpunk> oh, well, also placed a bit of a scary thing - https://review.opendev.org/c/openstack/openstack-ansible/+/766001
16:14 <noonedeadpunk> it made me pretty frustrated because of the inconsistency between roles and their behaviour
16:15 <noonedeadpunk> I think we should also move service_region and package_state to role defaults
16:15 <jrosser> oh nice cleanup there
16:17 <noonedeadpunk> eventually I noticed that the octavia role was creating an internal uri with http while all other internal urls were set to https, as I had openstack_service_internaluri_proto: https in overrides...
16:18 <noonedeadpunk> and can't stop myself lol
16:18 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030
16:18 <admin0> i added .1 on the router for vlan 27 .. and also added one ip of that range on br-lbaas eth14 .. now the errors are not coming, i still cannot ping but it says "Mark READY in DB for amphora: 1808c89b-2eeb-4f34-b6ab-b1a9536fab49 with compute id 1a018808-1467-4744-9acc-f1b3cbc0d717"
16:18 <noonedeadpunk> I tried to check that the removed stuff is not used anywhere and is equal to the defaults, but worth double checking
16:18 <admin0> the log of non-stop connection errors is not there yet
16:18 <admin0> but since i cannot ping, i dunno if it's working or not working
16:19 <jrosser> noonedeadpunk: there were also the vars we talked about from the openstack_services.yml file
16:20 <jrosser> https://review.opendev.org/c/openstack/openstack-ansible/+/766001 is passing apart from the centos things so it's not so bad to go in an rc
16:21 <noonedeadpunk> ah, indeed, git_project_group
16:21 <noonedeadpunk> well, the most scary thing with 766001 is that it's not checking all the roles....
16:26 <noonedeadpunk> oh, btw, worth saying, I placed deprecation patches for the galera_client roles https://review.opendev.org/q/topic:%22osa%252Fdeprecate_galera_client%22+(status:open%20OR%20status:merged)
16:26 <noonedeadpunk> and placed patches to revive the monasca repos https://review.opendev.org/q/topic:%22osa%252Frevive_monasca%22+(status:open%20OR%20status:merged)
16:27 <noonedeadpunk> mensis volunteered to submit the required fixes to make the role functional - he has a working role for U so I see no reason not to revive the repo
16:29 <jrosser> ok so we need this to merge now i think https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839
16:31 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Remove *_git_project_group variables  https://review.opendev.org/c/openstack/openstack-ansible/+/766039
16:31 <noonedeadpunk> any extra vote? :)
16:34 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-tests master: Return centos-8 jobs to voting  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765986
16:34 <jrosser> ^ that should give an early indicator of whether we have fixed the wheel build
16:38 <admin0> brb -- food
16:42 <prometheanfire> I imagine this is fine, but just wanted an ack: openstack-ansible-galera_client is going away?
16:42 <prometheanfire> (offhand, what is it replaced by, an upstream role?)
16:43 <jrosser> the galera client and galera server ansible roles were always in a circular dependency fight with each other
16:43 <noonedeadpunk> prometheanfire: we moved the client side into the galera_server role
16:43 <jrosser> as they both defined the version to install, so it was best to merge the two together
16:43 <prometheanfire> yarp
16:43 <noonedeadpunk> well, thinking about this now - we probably could set the version in the integrated repo :p
16:44 <noonedeadpunk> but whatever :)
16:44 <jrosser> if we tested them with the new infra jobs, yes
16:44 <jrosser> but with functional jobs not so much
16:44 <noonedeadpunk> yeah, then we'd need to bump it in the tests repo as well... agree
16:45 <prometheanfire> also offhand, when is the .1 release? iirc that was when the upgrades were 'supported'
16:45 <noonedeadpunk> iirc there's not so much to upgrade for U->V
16:45 <noonedeadpunk> or we just implemented it all at once, since we have upgrade jobs nowadays
16:46 <noonedeadpunk> so we see a broken upgrade path at once
16:46 <prometheanfire> ya, my distro based upgrade went easily, no big migrations to run for the main projects
16:47 <jrosser> prometheanfire: you do install_method=distro?
16:48 <prometheanfire> oh, I mean my gentoo packaging stuff
16:48 <prometheanfire> forgot about that install method :P
16:48 <jrosser> oh phew - thought you meant OSA distro :)
16:50 <prometheanfire> as far as the osa gentoo stuff, I may work on it soon (a month or three, whenever the new nuc11 stuff comes out) since I'm rebuilding my lab
16:50 <prometheanfire> once that is done I can stop packaging most/many of the openstack things on gentoo and point people to that
16:51 <prometheanfire> then again, I don't think it'll ever be officially supported, I'm not the only gentoo user, but I am about the only dev
16:56 <noonedeadpunk> #endmeeting
16:56 *** openstack changes topic to "Launchpad: https://launchpad.net/openstack-ansible || Weekly Meetings: https://wiki.openstack.org/wiki/Meetings/openstack-ansible || Review Dashboard: http://bit.ly/osa-review-board-v3"
16:56 <openstack> Meeting ended Tue Dec  8 16:56:54 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
16:56 <openstack> Minutes:        http://eavesdrop.openstack.org/meetings/openstack_ansible_meeting/2020/openstack_ansible_meeting.2020-12-08-16.05.html
16:56 <openstack> Minutes (text): http://eavesdrop.openstack.org/meetings/openstack_ansible_meeting/2020/openstack_ansible_meeting.2020-12-08-16.05.txt
16:56 <openstack> Log:            http://eavesdrop.openstack.org/meetings/openstack_ansible_meeting/2020/openstack_ansible_meeting.2020-12-08-16.05.log.html
16:58 <prometheanfire> oh, I was in a meeting, lol
16:59 <noonedeadpunk> np here :)
17:02 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030
17:02 <jrosser> i messed up the squash of those and put the delegate_to back in
17:02 <jrosser> danger of editing in the gerrit UI
17:04 <noonedeadpunk> I'm a bit lost with the octavia client certificates. It seems we rotate the certificates each time we run the role. But should octavia be able to talk to the amphoras with new certificates? I guess not?
17:04 <noonedeadpunk> As it authenticates with the amphora?
17:04 <noonedeadpunk> *with SSL to the amphora
17:04 <noonedeadpunk> or did I get it wrong?
17:05 <noonedeadpunk> Just seeing some weird stuff here and trying to understand if it's related or not
17:10 <jrosser> it'll be to do with the CA i guess?
17:11 <jrosser> if the certificates can be validated then it doesn't necessarily matter if they change
17:12 <jrosser> this https://github.com/openstack/openstack-ansible-os_octavia/blob/master/tasks/octavia_certs.yml#L97-L104
17:14 <noonedeadpunk> ok, I see what has happened here
17:14 <noonedeadpunk> not a cool pattern https://opendev.org/openstack/openstack-ansible-os_octavia/src/branch/master/defaults/main.yml#L428
17:15 <noonedeadpunk> in case you auth as a non-root user (or have ldap and run via sudo) this will differ between users...
17:16 <noonedeadpunk> well, it's configurable so whatever :)
17:17 <noonedeadpunk> so yeah, I got a new CA and everything...
17:22 <jrosser> what on earth is that for!
17:23 <jrosser> shouldn't these default to /etc/openstack_deploy?
17:24 <noonedeadpunk> I'd say it should. Well, not /etc/openstack_deploy, but OSA_CONFIG_DIR
17:24 <noonedeadpunk> But I think a caveat might be if you choose a setup_host other than localhost
17:26 <jrosser> this certainly needs improving
17:26 <jrosser> i have just looked on a deploy host here and those files are in a place not covered by backup/version control
17:26 <noonedeadpunk> they are not, yes :p
17:27 <noonedeadpunk> I was pretty sure I would need to respawn all the amphoras now...
17:27 <noonedeadpunk> but finally found valid certs...
17:27 <jrosser> i guess for the time being octavia_cert_dir can be overridden and the files put in a better place
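That interim override is a one-liner in user config (the target path is illustrative):

```bash
# Keep the Octavia CA material somewhere covered by backup/version control:
cat >> /etc/openstack_deploy/user_variables.yml <<'EOF'
octavia_cert_dir: /etc/openstack_deploy/octavia_certs
EOF
```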
17:28 <jrosser> +/- permissions of course, i'd expect some of those to have a quite restrictive mode
17:29 <jrosser> yes the private keys are 0600 which makes it difficult
17:29 <noonedeadpunk> I just find it super hard to change the default as it obviously will break existing deployments (unless we do some symlinking with the upgrade scripts)
17:29 <jrosser> unless we have a one-cycle task that moves the files
17:29 <noonedeadpunk> honestly - we run osa with root only, so whatever the permissions are on the deploy host...
17:29 <openstackgerrit> Merged openstack/openstack-ansible-tests master: Bump virtualenv to version prior to 20.2.2  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/765839
17:30 <noonedeadpunk> but in the case of octavia_cert_setup_host != localhost it is really tricky...
17:31 <spatel> fyi, successfully deployed an aio using centos 8.3
17:31 <noonedeadpunk> I hope we will cover that with the SSL topic as well though
17:31 <noonedeadpunk> will need to recheck 766030
17:31 <jrosser> i'm not sure i really understand the use case for cert_setup_host != localhost
17:32 <jrosser> unless you don't allow the CA key off a specific host
17:32 <noonedeadpunk> no idea, but since we do allow this at the moment, I can expect somebody might be using it...
17:32 * jrosser heads out for a bit
17:40 <mgariepy> https://zuul.opendev.org/t/openstack/build/5bf01bb6687a41e0970220c461489297/log/job-output.txt#10807
17:40 <mgariepy> noonedeadpunk, have you seen that ?
17:41 <mgariepy> somewhere else ? i'm not sure what can trigger that, but my horizon patch fails there on multiple os/checks
17:47 <noonedeadpunk> nope, not really
17:50 <noonedeadpunk> but anyway we need to merge https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 before anything will pass on centos
18:03 <mgariepy> done
18:05 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Trigger service restart on cert change  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/766062
18:27 <kleini> I just tried 21.2.0 in staging and noticed that /etc/ceph for Glance, Cinder volume and the computes is not created any more. Maybe ceph-client is not running. I am using ceph config from files.
18:27 <kleini> Does anybody have a pointer to what may have changed?
18:33 <spatel> folks, do you know what is going on with my rabbitMQ - http://paste.openstack.org/show/800861/
18:37 <spatel> Looks like some bad compute node crying..
18:45 <admin0> spatel, it seems to work .. but the operating status shows: offline
18:46 <spatel> admin0: looks like it's time to check the octavia logs etc.. and see if you find something interesting.
18:46 <spatel> did you see the amphora vm on the compute?
18:47 <kleini> https://opendev.org/openstack/openstack-ansible-ceph_client/src/branch/master/tasks/main.yml#L19 <- this is not backported to ussuri
18:47 <kleini> ^^^ answering my own question
18:51 <admin0> yes .. the lb is working fine .. just that the status shows offline
18:58 <spatel> very odd, it should be online
19:00 <jrosser> kleini: if you can find the patches (git blame?) hit the cherry-pick button in gerrit
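That backport workflow is roughly the following (the commit hash is a placeholder; the cherry-pick can equally be done with the button in the gerrit UI as suggested):

```bash
# Find the commit that introduced the missing behaviour on master...
git blame tasks/main.yml -L 19,19
# ...then propose it to the stable branch:
git checkout -t -b backport-ussuri origin/stable/ussuri
git cherry-pick -x <commit>
git review stable/ussuri
```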
19:00 <spatel> operation basic.publish caused a channel exception not_found: no exchange 'reply_92897d7bc2ad495c8688964cf8433a6b' in vhost '/nova'
19:01 <spatel> getting these errors, do you think it's a good idea to restart the rabbit cluster?
19:01 <jrosser> admin0: from the octavia container you should be able to curl the api endpoint in the amphora, can you check that?
19:07 <admin0> jrosser, spatel this is what I see: https://pasteboard.co/JE0QxcL.png
19:08 <openstackgerrit> Marcus Klein proposed openstack/openstack-ansible-ceph_client stable/ussuri: Allow to proceed with role if ceph_conf_file is set  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/765952
19:08 <admin0> when i curl the floating IP, i get round-robin between web1 web2 web1 web2 .. so i know it's working fine
19:08 <spatel> Do you have spare_amphora_pool_size in octavia.conf ?
19:09 <spatel> just curious, i had that setting and it was causing a strange issue
19:09 <admin0> so on the hypervisor (i only have 1 to test this) there are 5 instances, 2x web servers, 1 cirros and the 2x amphoras
19:10 <spatel> Are you using SINGLE or ACTIVE-STANDBY?
19:10 <admin0> none.. whatever is the default
19:10 <spatel> maybe the LB is up but octavia-health-manager isn't able to pull stats (guessing)
19:11 <spatel> octavia_spare_amphora_pool_size: 0
19:11 <spatel> octavia_loadbalancer_topology: SINGLE
19:11 <admin0> spare_amphora_pool_size is set to 1
19:12 <admin0> how does the octavia-health-manager pull stats
19:12 <spatel> that option is very bad and was causing issues in my setup
19:12 <admin0> so set pool_size to 1 ?
19:12 <admin0> i meant 0
19:12 <spatel> spare_amphora_pool_size = 0 (rebuild the LB again)
19:12 <admin0> ok
19:12 <admin0> and another question .. how does the health manager connect to the amphora ? via its lb ip ?
19:12 <openstackgerrit> Marcus Klein proposed openstack/openstack-ansible-ceph_client stable/ussuri: Allow to proceed with role if ceph_conf_file is set  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/765952
19:12 <admin0> i mean its own lbaas pool ip ?
19:13 <spatel> br-lbaas-mgmt
19:13 <masterpe> We did not have any problem with spare_amphora_pool_size. In our setup we have it at 5.
19:13 <admin0> i am using whatever default amphora image it downloads ..
19:13 <admin0> have not created my own
19:13 <spatel> if you do (ip netns list) you will see your VIP inside a namespace
19:14 <spatel> I am using the same default amphora image provided by OSA
19:14 <kleini> jrosser: https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/765952
19:14 <admin0> spatel, the ip netns list (where ?)
19:14 <admin0> inside the default amphora image ?
19:14 <spatel> inside the amphora
19:14 <admin0> aah .. ok
19:14 <admin0> let me check how to connect to it
19:14 <spatel> the amphora VM creates the public vip inside a namespace to isolate it from the br-lbaas-mgmt IP
19:15 <spatel> ssh to the br-lbaas-mgmt ip (you need to upload an ssh-key)
19:16 <admin0> for whatever reason, i cannot even ping the default amphora ip
19:16 <spatel> the VIP ip?
19:17 <spatel> you can't ping because ICMP is not allowed, only vip ports will be open, like port 80 or 443
19:17 <admin0> i was trying to ping the default amphora instance that comes up first under the octavia user in the service tenant
19:17 <admin0> which i cannot .. it only has the lbaas ip range .
19:17 <admin0> the other one i created via the gui has a vip
19:17 <jrosser> admin0: this is the time to take a breath and recap how this works
19:17 <admin0> :D
19:17 <admin0> ok
19:18 <spatel> :)
19:18 <jrosser> the public vip is for the loadbalanced service
19:19 <jrosser> the interface on the lbaas network connects the amphora to the backend octavia container via br-lbaas and eth14
19:19 <jrosser> the octavia backend service connects over that lbaas network to the amphora, with https, for monitoring and config
19:20 <jrosser> you should be able to see evidence of that in the various octavia service journals on the controller
19:20 <jrosser> you should also be able to curl the backend api endpoint of the amphora from inside the octavia container
19:20 <jrosser> these are verification steps that your lbaas network is all running as it should
19:21 <mgariepy> how do i run the aio_metal tests for the horizon repo ? all the metal tests fail.. https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998
19:21 <spatel> jrosser: is this guide still valid for restarting a rabbitMQ cluster - https://docs.openstack.org/openstack-ansible/rocky/admin/maintenance-tasks/rabbitmq-maintain.html
19:22 <jrosser> admin0: does this make sense? if you have some status error for the lb it will be to do with that backend network amphora<>octavia communication
19:23 <admin0> does the default amphora image allow ping ?
19:23 <spatel> admin0: no
19:23 <spatel> ICMP is blocked by design
19:23 <admin0> i mean from the octavia container -> amphora instance
19:23 <jrosser> curl the api
19:23 <admin0> sorry, when u say api, exactly what :(
19:23 <masterpe> admin0: If you enable it in the acl it is.
19:24 <masterpe> But it is not meant to do that.
19:24 <spatel> You can ping from the amphora --> octavia container using the br-lbaas subnet (you can't ping the lb vip)
19:24 <admin0> i am unable to ping from container -> amphora
19:24 <admin0> but i heard that is by design
19:25 <spatel> masterpe: i have been told octavia deploys its security rules from python code (you can't modify them by security-group)
19:25 <admin0> ok .. so the amphora instance that is under octavia/service acts as a network node for all other tenant-created load balancers ?
19:25 <masterpe> I have changed the security group in the service project to allow icmp
19:27 <spatel> masterpe: admin0: sorry, i found you can't ICMP (i just verified, it's not allowing me to ICMP)
19:27 <admin0> i see, 9443 is the port that is open
19:27 <spatel> you can telnet to port 22 and verify
19:27 <admin0> i only see 9443 open in the firewall from octavia/service for the amphora image
19:27 <admin0> so let me check if the octavia container can reach this port
19:27 <masterpe> Yes, tcp works better to verify.
19:27 <spatel> telnet 10.62.7.156 22
19:28 <spatel> works for me but ICMP does not
19:28 <jrosser> this is what runs in the amphora https://docs.openstack.org/octavia/latest/contributor/api/haproxy-amphora-api.html
19:29 <jrosser> it should be on port 9443 (i think?) and you should be able to curl https://your-amphora-on-lbaas-mgmt-ip:9443/info or something similar
19:30 <johnsom> You would have to have the right TLS cert and keys for that curl to work.
19:30 <jrosser> indeed, though i think at the moment just verifying that the network is even connected is challenging
19:31 <jrosser> so a valid failure there is information :)
19:31 <johnsom> You can use s_client: openssl s_client -connect <amphora lb-mgmt-net IP>:9443
19:32 <johnsom> That should dump the server certificate information. You will want to control-c out of it as it is waiting for the client to send certificate information
19:32 <jrosser> this is also all kind of well logged though too
19:32 <jrosser> so it should be reasonably easy to see success/failure in the service journal
19:33 <johnsom> True
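Those verification steps together (run from inside the octavia container; the amphora IP is the one from the example above - getting as far as the TLS handshake already proves the lb-mgmt network path works, and the client-cert failure afterwards is expected):

```bash
# L4 reachability to the amphora agent port:
timeout 3 bash -c 'cat < /dev/null > /dev/tcp/10.62.7.156/9443' && echo open
# Dump the amphora agent's server certificate (ctrl-c to exit if it hangs):
openssl s_client -connect 10.62.7.156:9443 </dev/null
```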
19:33 <spatel> johnsom: Let's allow ICMP :) we should open a story or feature request.
19:34 <admin0> recall that i had to manually add the 10.62.0.x range IP to br-lbaas in the container .. i had put a wrong subnet . so it was seeing one amphora but not able to connect to another one
19:34 <johnsom> Yeah, we can add an option to turn that on. By default the security is locked down on the amphora as they really are just a black box.
19:34 <ThiagoCMC> I'm curious about Octavia... Is it possible to run Octavia instances in LXC instead of QEMU? I wanna try it but I really don't want the network traffic of the LBaaS to pass through virtualization (it doesn't make sense to me)...
19:34 <admin0> i am going to remove this lb, add another one and see what it does
19:35 <admin0> if operators (like me) want, i can easily add the icmp rules to the amphora instances logged in as octavia/service
19:36 <johnsom> ThiagoCMC: Yes, we had a working proof of concept with LXC, but nova-lxc is now dead so we never merged it. That said, it's a rare case that you have enough load that the service VM is a bottleneck. It is very optimized.
19:38 <ThiagoCMC> johnsom, it feels sad that both nova-lxc and nova-lxd are gone... :-(
19:38 <ThiagoCMC> I'll give it a try then!
19:38 <admin0> except "operating stats" offline .. everything else seems to work perfectly fine
19:39 <admin0> operating-status
19:39 <ThiagoCMC> The nova-lxd was awesome, especially these days that LXD supports both LXC and QEMU side by side... OpenStack could be a Libvirt-free cloud!
19:39 <johnsom> https://review.opendev.org/c/openstack/octavia/+/636066 was the working LXD patch. Really I did it to try it out and to see if I could find something more stable than nova was at the time
19:41 <johnsom> admin0: Yeah, operating status OFFLINE and no stats means you have a problem on the lb-mgmt-net, the UDP traffic over 5555 from the amphora instance to the controllers is not working. The health manager is not receiving the heartbeat packets.
19:41 <admin0> udp is not enabled by default in the firewalls it's creating
19:42 <admin0> let me check if i can add 5555 udp and see if that solves it
19:42 <johnsom> It's outbound, so the SGs on the amphora won't show it as a rule, it's open by default. There should be a rule on the controller SGs however that allows it.
19:43 <masterpe> admin0: did you solve the issue where the vlan interface was not joined to the bridge on the compute node?
19:43 <johnsom> I'm pretty sure OSA has that right, maybe people have deployed with it. Unless something has changed since I last looked at the repo.
19:43 <johnsom> maybe->many
19:43 <admin0> in the default sec_group created, octavia_sec_grp, the only thing allowed is tcp 9443
19:44 <admin0> in the other amphora, which i created for load balancing http, i see 80 and 1025
19:44 <johnsom> admin0: Yeah, that is the wrong SG
19:44 <admin0> there are no other sec groups
19:44 <admin0> it's the default, octavia_sec_group (for the default amphora) and some uuid for the one i created
19:44 <johnsom> It's the SG on the controller port
19:45 <admin0> it's the secgroup under the service/octavia user isn't it ?
19:45 <admin0> there is no 5555 or udp in any of the sec groups created
19:45 <johnsom> I doubt it. It's probably under an OSA admin account
19:46 <johnsom> SGs under the octavia account are really only for the amphora side/service VMs
19:46 <admin0> all lbs created appear as instances under service/octavia with the security group also managed from there
19:46 <admin0> under admin, it just appears in neutron as lbaas
19:47 <admin0> and under service/octavia, there are 3 rules .. the default, the one for octavia and the one for my lb .. it does not have any udp in it
19:47 <admin0> or tcp port 555
19:47 <admin0> 5555
19:47 <johnsom> admin0: Right, that is what I said, it's a SG outside the octavia account. Probably the one called lbaas
19:48 <johnsom> It's been over a year since I have used OSA to deploy, so I'm a bit rusty on the setup there.
19:48 <jrosser> am i right in thinking that would only be for OVS?
19:48 <admin0> johnsom, not under admin
19:48 <admin0> i can confirm that all lbaas related instances/security groups are under service/octavia
19:48 <johnsom> OVS and linux bridge, but I don't know if the OVN stuff has been set up in OSA yet
19:48 <ThiagoCMC> johnsom, thanks!
19:49 <jrosser> in admin0's case the bridge on the compute node comes straight over to the controller as a vlan provider network with linuxbridge
19:49 <johnsom> admin0: Look for a port in neutron called something similar to o-hm0
19:50 <spatel> I keep two openrc files, one for admin and one octavia.rc for LB operations
19:50 <johnsom> That would be the controller port, the SG on that port is the one that will have UDP/5555 open.
19:50 <admin0> it's not there
19:50 <admin0> maybe i need to add udp/5555 manually
19:50 <admin0> and it will fix itself
19:50 <admin0> you sure it's udp/5555 ?
19:50 <johnsom> admin0: no, it's set up by OSA
19:51 <admin0> johnsom, telling you bro .. it's not there
19:51 <admin0> at least not on the 21.1.0 tag i am using
19:52 <jrosser> admin0: use the diagram from spatel's blog as your reference, that's what you have built
19:52 <jrosser> if you are trying to find this UDP traffic start at the source, with tcpdump on the compute node, and follow it logically through
19:53 <masterpe> admin0: the octavia_sec_grp in my case: ingress tcp 9443, ingress tcp 22 and ingress icmp.
19:53 <admin0> i followed his blog exactly .. even vlan27 ..
19:53 <admin0> and so far it's working .. the only issue i see is the offline status, and i was trying to understand why
19:53 <johnsom> https://github.com/openstack/openstack-ansible-os_octavia/blob/master/defaults/main.yml#L365
19:53 <johnsom> So OSA is adding the rule directly via iptables
19:54 <spatel> This is what i have - http://paste.openstack.org/show/800864/
19:54 <admin0> spatel, i have exactly what you have
19:55 <johnsom> jrosser: +1, It's extremely unlikely it is an SG or iptables rule issue. Like I said, many people have deployed this over the years and this part hasn't changed.
19:55 <johnsom> spatel: Yeah, that looks good for a load balancer with a VIP on port 80
19:56 <admin0> so you guys see online status .. with the same security group, i see it as offline
19:56 <admin0> means something else
19:56 <spatel> admin0: enable debug on the octavia side and you will see something in the logs about why it's offline
19:56 <admin0> i will maybe tcpdump all outgoing packets and see
19:56 <admin0> from the container
19:57 <spatel> admin0: yes, i am seeing online
19:57 <johnsom> Are you seeing the stats increment when you hit the VIP endpoint?
19:57 <johnsom> If not, you have a networking problem on the lb-mgmt-net. If they are incrementing, check that your LB isn't disabled via admin state down.
19:58 <masterpe> admin0: are you able to connect to the lbaas interface on port 22?
19:58 <admin0> johnsom, this is what i see https://pasteboard.co/JE0QxcL.png .. i do not see stats
*** nurdie has joined #openstack-ansible19:58
johnsomYeah, horizon doesn't display them, use the CLI19:58
admin0masterpe, the lbaas has an internal IP and a vip .. the VIP i understood is for the load balancer .. i tried ssh to the internal ip, but there is no security group rule19:58
admin0let me enable ssh and try again19:58
johnsomopenstack loadbalancer stats show <lb ID>19:59
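Following the decision logic above, a rough sketch of the two checks (the lb id is a placeholder; column selection via -c per the openstack CLI):

    openstack loadbalancer stats show <lb-id>       # counters should grow while you curl the VIP
    openstack loadbalancer show <lb-id> -c admin_state_up -c operating_status -c provisioning_status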
masterpeFor the ssh there is an OSA variable: octavia_ssh_enabled20:00
masterpeBut I think you can also add it to the octavia_sec_grp SG20:00
spatelhttp://paste.openstack.org/show/800865/20:01
spateladmin0: you need to upload ssh-key using octavia account (i mostly do that using horizon)20:02
admin0that is done20:02
johnsomYou will need to enable ssh via the octavia config (or the OSA variable that sets it) and then build a new amphora. Otherwise the SSH key won't be there.20:02
admin0i have ssh enable as well20:02
*** nurdie has quit IRC20:02
admin0ok .. so even though my lb is working , curl VIP round-robins between   web1 and web2 ..  my stats are all zero20:03
spatelI have downloaded the amphora image and stuck my password in it so i don't use an SSH key :)  so i have the freedom to ssh from any host20:03
admin0what happens if i delete the default amphora image20:04
admin0will it create a new one with the key ?20:04
spatelyou don't need to delete the image at this point, just upload an ssh-key and see if you can ssh in20:05
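A sketch of that step with the service/octavia credentials sourced; the key name here is arbitrary and must match whatever key name the octavia config expects:

    source octavia.rc                               # the octavia-user openrc mentioned above
    openstack keypair create --public-key ~/.ssh/id_rsa.pub octavia-key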
*** luksky has joined #openstack-ansible20:05
admin0so i added ssh and icmp to the security groups .. but from the octavia_container, i cannot ping or even ssh .. the only thing that returns something is: openssl s_client -connect <amphora-ip>:944320:05
spatelIn my case i have downloaded the qcow2 image separately and edited that image to add my password before uploading to glance (that way i don't need an ssh key, i can just ssh root@amphora-ip)20:06
*** nurdie has joined #openstack-ansible20:08
spateltonight i will add all these troubleshooting steps to my blog so it would be easy for future users.20:09
admin0https://gist.githubusercontent.com/a1git/671a00479386859f7f09697d522d778d/raw/1687a5a003972c78720a8680c37883f4ae4306d8/gistfile1.txt  == this is what i started to see in hypervisor now20:10
admin0johnsom, is there a manual way to validate that the container can talk to port 5555 and that the firewall is not preventing the stats ?20:11
spateladmin0: i am seeing same error in my neutron logs20:12
spatelI thought it was just me, but now you are also seeing it20:12
admin0still everything seems to work .. and it seems to be octavia related20:12
spatel (nf_tables):  CHAIN_USER_DEL failed (Device or resource busy): chain neutronARP-tap2cdc4ac7-4620:12
admin0this is a complete greenfield env .. just set it up to do octavia only20:12
admin0i can add your ssh keys if interested to explore it20:13
spatelworth opening a bug to see what the community thinks about it20:13
admin0i am going to try putting the octavia service in debug mode, and also run tcpdump to check what ports it's trying to connect to but not able to20:13
admin0so that the offline status can turn online, and the stats are there20:14
johnsomadmin0 If you enter the container for your octavia health manager process, you can tcpdump -nli <interface name> udp port 5555. You should see one packet every ten seconds20:15
admin0' container for your octavia health manager process ' -- where is this ?20:16
jrosseradmin0: are you on ubuntu focal for your compute node?20:16
admin0yep20:16
johnsomon your controller(s)20:16
johnsomIf you have more than one controller in your OSA deployment, it may be longer than ten seconds for one controller to get a packet as it rotates through them randomly20:17
spatel  In my setup i am seeing 5555 udp ping20:17
admin0johnsom, i only have 1 controller for this, and it has a single container called: c1_octavia_server_container-bd64dff320:17
jrosserit looks like your error on the compute node is this https://review.opendev.org/c/openstack/neutron/+/76540820:17
-spatel- 12:16:58.123393 IP 10.62.7.156.55886 > 10.62.7.76.5555: UDP, length 29220:17
-spatel- 12:17:14.647583 IP 10.62.7.143.55887 > 10.62.7.76.5555: UDP, length 29120:17
admin0wait .. i should have more containers ?20:18
johnsomadmin0 Ok, jump in that container.20:18
johnsomOSA used to have one container for each process, but maybe that has changed?20:18
johnsomOne for API, one for worker, one for health manager, and one for house keeping processes20:19
spateladmin0: the octavia-health proc runs inside the same octavia-server container (there will be 4 daemons)20:19
spatelin short everything will run inside a single octavia container20:19
johnsomOk, so that has changed since I have deployed. That is fine, they are low-load processes20:19
admin0https://gist.github.com/a1git/3f6fcbe73f9dff34ecc84d4413bb440b20:19
admin0that is what it has20:20
spatelThis is what i have - http://paste.openstack.org/show/800866/20:20
johnsomYep, they are all there20:20
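Putting the tcpdump suggestion together with the single-container layout, roughly (container name taken from the paste above; the interface name is an assumption, check "ip link" inside the container first):

    # on the controller
    lxc-attach -n c1_octavia_server_container-bd64dff3
    # then, inside the container, on the lb-mgmt-net interface
    tcpdump -nli eth14 udp port 5555    # expect roughly one heartbeat packet every ten seconds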
johnsomWhat do you have for "ip link"?20:20
spateljrosser: that bug isn't merged yet right? I am seeing same error all over the place20:21
admin0https://gist.github.com/a1git/b4fa23d9011d3d35ae65ca11ba7f19a920:21
jrosserspatel: correct, bad neutron bug20:21
admin0i manually added the 10.63.0.3/16 as that is the range i gave for lbaas-mgmt20:22
jrosseri would suggest forking the neutron repo, cherry pick that and override to point to your own git repo20:22
spateljrosser: that would be a good idea until it merges into victoria20:22
jrosserspatel: it's also ussuri, and from an OSA perspective we can't do anything until that merges in the real neutron repo20:23
admin0i am hitting the load balancer, but tcpdump  -nli any udp port 5555 produces nothing at all20:23
jrosserthough as an end user there are all the hooks for you to work around locally20:23
johnsomYeah, ok, so the lb-mgmt-net is broken somehow.20:23
spateljohnsom: if lb-mgmt-net is broken then octavia should keep deleting and re-creating amphora right?20:24
johnsomMy guess is, since you manually added 10.63.0.3/16, the amphora on the lb-mgmt-net interface don't have a route back to this address.20:25
johnsomspatel No, due to other nova issues, we don't start the failover clock until we have received at least one heartbeat.20:25
admin0its in the same L220:25
jrosserit does seem to be a big red flag that you've got an exception on the compute node to do with security groups and a known neutron bug which hasn't had its fix merged yet20:26
johnsomCan you paste the output of a "openstack server list" for the amphora service VM?20:26
admin0that i have to do as service/octavia20:27
admin0one moment20:27
johnsomjrosser Yeah, could be.20:28
*** cshen has quit IRC20:43
admin0ok .. question . i have octavia_management_net_subnet_cidr: 10.62.0.0/21  in  user_variables, but in user_config, under lbaas:   172.29.232.0/22 -- do they have to be the same ?20:46
admin0coz i followed first rackspace, then mgariepy and then spatel, i think i may have this inconsistency20:48
admin0thank you johnsom for pointing it out20:48
admin0so checking if mgariepy and spatel have the same, or diff .. and if they should match20:48
*** cshen has joined #openstack-ansible20:48
jrosserthey should match, one is the config for the OSA containers and their interfaces20:50
spatellbaas and octavia_management_net_subnet_cidr should be same20:50
jrosserthe other is for the neutron provider network, and these must line up20:50
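A sketch of the two places that must agree, using illustrative values and the standard OSA file layout:

    # /etc/openstack_deploy/openstack_user_config.yml (OSA containers/interfaces side)
    cidr_networks:
      lbaas: 10.62.0.0/21

    # /etc/openstack_deploy/user_variables.yml (neutron lb-mgmt-net side)
    octavia_management_net_subnet_cidr: 10.62.0.0/21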
admin0aha .. heartbeat issue should be solved after this ..20:50
admin0let me work on it ..20:50
admin0then the remaining issue will be that error that both spatel and i see20:50
spatelhttps://review.opendev.org/c/openstack/neutron/+/76540820:52
spatelwe need to wait for that code to merge; before that we have to do some hacks to roll out the patch20:52
spateladmin0: in my blog i have used the same network  lbaas: 172.27.40.0/24   / octavia_management_net_subnet_allocation_pools: 172.27.40.200-172.27.40.25020:53
admin0mine got mixed up following multiple ways to get it done20:53
admin0but i have some clarity now20:53
spatel:)20:53
admin0finally20:54
spateladmin0: This is how you learn, if it works on the first shot then you won't get a chance to learn the underlying technology :)20:56
spatelAfter that try Senlin also for fun20:56
spatelmy rabbitMQ isn't happy on production, maybe i will rebuild the whole cluster tomorrow, it's too late to poke that beast..21:00
admin0so if the ip range and mgmt range are to be the same, then .. in a 3 controller setup,  there will be 3 ips assigned to controllers .. that openstack will have no idea of .. and there is a chance of the same ip being assigned to an amphora21:07
admin0because as far as i can recall, the containers use ips other than the reserved ips21:07
admin0and because these containers are created automatically, we cannot predict these reserved ips21:07
spatelyou can use used_ips to reserve some ips21:09
spateli kept some IPs for future use and the rest for lbaas-mgmt-net21:09
admin0you mean manually editing the eth14 ip to make it fall inside the reserved range .. meaning manual changes to the inventory file21:10
admin0user config has this now:  lbaas: 10.62.0.0/21 . reserved: - "10.62.0.1,10.62.0.50"   user variables have: octavia_management_net_subnet_allocation_pools: 10.62.0.51-10.62.7.25021:11
admin0but when the lxc is created, it used 10.62.7.1921:11
admin0it means out of 1000s of ips, there is a chance that one of the 3 amphorae created will match the container ip, unless that container ip is manually specified somehow via ansible21:12
admin0i can manually change the container ip to make it fall within .50 and edit the inventory .. unless a better way exists21:13
spatelThis is my production example21:16
spatel lbaas: 10.62.0.0/2121:17
spatelused_ips: - "10.62.0.1,10.62.6.255"21:17
spateloctavia_management_net_subnet_allocation_pools: 10.62.4.1-10.62.7.25421:17
spatelfrom /21 subnet i gave 10.62.4.1-10.62.7.254 to amphora21:18
admin0yes, now the lxc container also needs an ip right, which is picked outside of 10.62.0.1,10.62.6.255 ..21:18
admin0so there is a chance it will overlap with your allocation_pool21:18
spateloctavia_management_net_subnet_allocation_pools: this pool will be controlled by DHCP so don't use those IPs anywhere manually21:19
admin0that is what i am saying21:19
spatelyes..21:19
admin0the ip picked at random by ansible for the lxc , the dhcp will not know about it21:19
spatelyup21:19
admin0so there is a chance that if you don't change your lxc container ip manually and fix the inventory, an amphora can overlap with the container21:20
admin0causing a possible service outage due to both ips being in the same vlan21:20
spatelpossible21:20
jrosseryou should make these ranges mutually exclusive21:20
admin0right21:20
admin0or just edit the inventory and fix the container ..21:21
jrosserin openstack_user_config ensure that the range you give to neutron is included in used_ips21:21
jrosserand put at least the opposite of that in octavia_management_net_subnet_allocation_pools21:21
jrosserwell you know what i mean.....21:22
admin0yep21:22
spatelI need to fix my range also :)  10.62.4.1-10.62.7.254  this is overlapping21:22
admin0isn't it also possible to just manually pass an IP to a container ?21:22
jrosserit is possible but ugly21:22
jrosserbest to just chop a small range off the front of the subnet and let the OSA inventory allocate from that21:22
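Put together, a non-overlapping split of the same /21 might look like this (values illustrative): the amphora DHCP pool is declared used in openstack_user_config so the OSA inventory only allocates container IPs from the small slice at the front.

    # /etc/openstack_deploy/openstack_user_config.yml
    cidr_networks:
      lbaas: 10.62.0.0/21
    used_ips:
      - "10.62.0.51,10.62.7.250"    # amphora pool; inventory must not allocate here

    # /etc/openstack_deploy/user_variables.yml
    octavia_management_net_subnet_cidr: 10.62.0.0/21
    octavia_management_net_subnet_allocation_pools: 10.62.0.51-10.62.7.250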
spateljrosser: what is the status of CentOS 8.3 builds, are you seeing them passing?21:23
jrosserit is late, i am not currently looking at them much21:23
spateli am thinking to upgrade my 8.2 to 8.321:23
admin0spatel, https://lists.centos.org/pipermail/centos-announce/2020-December/048208.html21:24
spatelwe had very long discussion this morning about that :)21:25
admin0hehe21:25
spateli will test ubuntu in my lab to see how we can shift from centos821:26
admin0i like centos .. just don't like being helpless during an upgrade when i cannot fix the inter-dependency issues .. or the packages being a bit older than required21:28
admin0ok . testing octavia again21:29
spatelI never used ubuntu in my entire career :)21:30
*** nurdie has quit IRC21:32
admin0ok .. so i nuked the default amphora, re-created the octavia lxc container with the correct ip range21:42
admin0but the amphora is not coming up again .. is there a way to kickstart it ?21:42
admin0the logs do not show any attempt .. i already ran the playbooks twice ..21:43
spatelmake sure the lbaas-mgmt-net is there + check compute nova logs etc21:48
*** pto_ has joined #openstack-ansible21:52
*** pto has quit IRC21:54
*** ebbex has quit IRC21:55
*** ebbex has joined #openstack-ansible21:55
*** pto_ has quit IRC22:05
*** pto has joined #openstack-ansible22:05
*** luksky has quit IRC22:18
*** spatel has quit IRC22:19
*** cshen has quit IRC22:31
*** luksky has joined #openstack-ansible22:32
admin0jrosser,  this is actually breaking neutron now ( and without the agent, nova fails to spawn any new vms)22:45
*** cshen has joined #openstack-ansible22:51
admin0looks like this https://bugs.launchpad.net/neutron/+bug/188728122:53
openstackLaunchpad bug 1887281 in neutron "[linuxbridge] ebtables delete arp protect chain fails" [Medium,Fix released] - Assigned to Lukas Steiner (steinerlukas)22:53
jrosseradmin0: yes, that is the bug i posted the fix for earlier22:57
jrosserfrom an OSA perspective we can't do anything until that merges22:57
admin0how do i cherry pick it22:57
admin0when might it merge22:57
jrosserbut there are all the hooks you need to override your version22:57
admin0and can it be back-merged to old tags ?22:57
admin0aah22:57
admin0oh yes .. i forgot about those22:57
jrosserit is in neutron, not OSA, so it is up to the neutron team to merge it22:57
jrosserthen when OSA makes a point release it will be automatic for OSA upgrades to get that22:58
admin0will it be in 21.3.0 ?22:58
jrosserwhat i would do here is fork the neutron repo to your github or something, and make a branch off of stable/ussuri22:58
jrosserthen cherry pick the patch from gerrit onto the tip of that branch22:59
admin0i use tags .. on 21.1.022:59
jrosserno22:59
jrosserthat is an OSA version and has no meaning for neutron22:59
admin0ok22:59
jrosserwhen you have done that there is an example here https://docs.openstack.org/openstack-ansible/latest/user/source-overrides/index.html23:00
jrosseruse the stuff under "Overriding other upstream projects source code" but switch it for neutron instead of glance23:00
jrosserand point to your patched repo23:00
jrosserthis is neutron itself you need to patch, not os_neutron ansible role or openstack-ansible23:01
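A rough sketch of that workflow, assuming a GitHub fork; the gerrit change is the one linked above, while the patchset number, fork URL and branch name are placeholders:

    # fork openstack/neutron to your own namespace first, then:
    git clone https://github.com/<yourorg>/neutron && cd neutron
    git checkout -b ussuri-ebtables-fix origin/stable/ussuri
    git fetch https://review.opendev.org/openstack/neutron refs/changes/08/765408/<patchset>
    git cherry-pick FETCH_HEAD
    git push origin ussuri-ebtables-fix

    # /etc/openstack_deploy/user_variables.yml (variable names per the OSA
    # source-overrides doc linked above)
    neutron_git_repo: https://github.com/<yourorg>/neutron
    neutron_git_install_branch: ussuri-ebtables-fix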
admin0when i click the cherry-pick-to-ussuri button, i get this url https://review.opendev.org/c/openstack/neutron/+/76540823:02
admin0is that the url i need to point to ?23:02
jrosserno, like i say, you need to fork the whole neutron repo yourself on github23:03
jrosserstandard github things, not gerrit23:03
jrosseror wherever you would need to host your modified version of neutron23:03
*** gshippey has quit IRC23:03
admin0oh23:04
admin0but if i wait and its merged, since it seems to break prod.. would it alone be enough to make a 21.3.0 tag ?23:05
admin0or if i wait it out, how does it make its way to our osa release ?23:06
*** rfolco has quit IRC23:06
jrosserevery ~2 weeks we release a new tag of OSA which picks up the SHA of the tip of whatever the corresponding stable/<release> branches are for all of nova/neutron/keystone/...23:07
jrosserthis really is what an OSA release is23:07
jrosserall of the git SHAs are moved forward together and validated as a set in CI23:07
jrosseradmin0: take a look at this https://github.com/openstack/openstack-ansible/commit/1973fee1f3c172ad65a7440864092be237a01b1023:10
jrosserthat is a patch which "makes a release" of OSA on a stable branch23:10
jrosserit is nothing more than updating a whole heap of git repo SHAs23:10
*** kukacz has quit IRC23:14
*** kukacz has joined #openstack-ansible23:16
*** rfolco has joined #openstack-ansible23:21
*** cshen has quit IRC23:28
*** nurdie has joined #openstack-ansible23:40
admin0sorry my irc got disconnected23:41
admin0so if i wait 2 weeks, if its merged by then, it will be in the next release23:41
admin0thanks jrosser .23:41
admin0will check/learn tomorrow how to do it manually from you .. late for today23:42
*** nurdie has quit IRC23:42
*** rfolco has quit IRC23:53
*** rfolco has joined #openstack-ansible23:53
*** rfolco has quit IRC23:58
