Thursday, 2021-08-05

nurdiehttps://docs.openstack.org/kolla-ansible/train/user/centos8.html00:27
nurdieIs the kolla-ansible guide OK to follow for a C7 to C8 upgrade?00:27
nurdieShould I upgrade the deploy host first?00:27
jrossernurdie: which release are you on?07:02
jrossernurdie: the OS of the deploy host does not matter07:03
jrosserto upgrade/change OS you start with the controllers and then do the computes, that should be possible07:04
jrossercentos7 support was not removed from OSA for old releases but it might be that other repos you need are now gone07:04
jrosserbut for example I just look at our CI jobs for stable/train and they are still passing for centos7.....07:06
opendevreviewAndrew Bonney proposed openstack/openstack-ansible-os_keystone stable/wallaby: Fix shibboleth compatibility for ubuntu 18.04  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/80355207:23
*** rpittau|afk is now known as rpittau07:47
anskiyjrosser: I've reproduced a bug with OVN service name on Stream on AIO. What should I do next? :)10:44
jrosseranskiy: check it’s something we’ve not already fixed in master... raise a bug on launchpad.... talk to spatel here later when he’s around.....10:47
jrosserI’m not really around this week but spatel and mgariepy have the most practical experience with OVN10:47
anskiyit's already on master, so I'll try to reach spatel, as I have another concern about OVN clustering (looks like it doesn't work)10:50
anskiythank you10:50
jrosserthere is an outstanding patch to change how the clustering works10:50
jrosserfrom haproxy to native cluster10:51
anskiyyeah, I saw that one, but my concern is on the OVN side more, not the neutron <-> OVN10:51
anskiythe one you're talking about is actually how it should be done: there is no routing in clustered OVN, so a client would just hang if it connects to a non-leader node :)10:52
spatelmgariepy hey! morning 13:06
spatelhow to merge these jobs ? https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80213413:06
spateli meant start the gate process 13:07
anskiyspatel: hey! so, I've managed to reproduce a bug with OVN installation on Stream in AIO on current master. It uses wrong service name (ovn-central instead of ovn-northd).13:08
spatelyes, ovn-central starts ovn-northd behind it 13:09
nurdiejrosser: My deployment is Train 20.2.1 on CentOS7. Some of it is now on 20.2.6, but I couldn't finish it so I brought up services as I could. I have no idea how that's passing your CI o.o13:09
spatelnurdie we don't have CentOS7 CI job.. did i miss something?13:10
nurdieOne example is that OSA asks pip2 to install keystone==16.0.2.dev19, but that doesn't exist13:10
anskiyspatel: there is no such service on Stream, only ovn-northd13:10
nurdiespatel: Yes, jrosser had asked me a question about 5hours ago when I was drooling and snoring13:11
jrosserspatel: Train has centos713:11
spatelah! so just for train, i wasn't aware of that sorry13:11
jrossernurdie: the install from source code can totally install that version of keystone13:12
spatelanskiy this is what i am seeing - https://paste.opendev.org/show/807909/ 13:12
nurdieFrom source code? Is there a different way of doing it? I was adding a task to service roles to sed that dev branch out for pip before the installs13:13
jrossernurdie: check out the centos7 jobs here https://review.opendev.org/c/openstack/openstack-ansible/+/803405 they ran in the last 24 hours13:13
anskiyspatel: wait, is this inside container? 13:14
jrosserOSA does not install openstack from pip packages13:14
spatelanskiy behind that file it starts these services - https://paste.opendev.org/show/807910/13:14
nurdiejrosser: I ultimately failed with C7 and pip2 (probably because of pip2?) not liking python-systemd13:14
spatelyes inside LXC container 13:14
jrossernurdie: well there is centos7 and some specific version of centos7.x13:15
anskiyspatel: ah, so, apparently it's some flavor of Debian inside that. I'm talking about a full metal deployment.13:15
jrosseranskiy: Debian?13:16
anskiyI mean, OS in this LXC is Debian or Ubuntu, because that's how OVN starts on those13:18
nurdiejrosser: plain ol' 7.9. Example of what I couldn't get past: https://pastebin.com/WeA4vZE613:19
anskiythis is how it looks on Stream with SCENARIO='aio_metal': https://paste.opendev.org/show/807911/13:19
nurdieThat was a nova-api container install. I want to move to ubuntu if it has fewer package issues :) 13:20
spatelanskiy you are saying you can't see ovn-central service?13:21
mgariepymorning.13:21
jrossernurdie: 404 Client Error: Not Found for url: http://10.250.0.210:8181/os-releases/20.2.6/centos-7.6-x86_64/ - skipping13:22
jrosser2021-08-04T18:42:02,414 Skipping link: unsupported archive format: .6-x86_64:13:22
mgariepyafter some digging on the sphinx stuff. the issue is the rst parser not doing fallback on "unknown languages" 13:22
jrosserthat is your root cause I think13:22
anskiyspatel: yes, there is no such service, when you deploy openstack on Stream on metal. There are a couple more issues, which are going after this one too.13:23
spatelhow the heck is the CI job passing then? 13:24
spateli did install on metal and it worked for me.. i was able to create vm etc...13:25
jrossernurdie: I can’t check it for you but you need to look through later code if we ever updated the repo path not to include a ‘.’13:25
jrosserlike I say it’s passing centos-7 jobs currently and all the logs are there on the link I gave you to look through13:26
anskiyspatel: from which package that service came in your case? In my lab ovn-northd.service comes from ovn-2021-central-21.03.0-40.el8s.x86_6413:27
spateldid you install using distro or source?13:27
jrossernurdie: it is failing to find the built wheels on the repo server because of that error, falling back to pypi and then not finding the specific ‘devN’ versions from the built wheel constraints13:28
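A minimal sketch of the symptom in the log above, assuming the repo index is filtered by file extension. It only illustrates how a dotted directory name such as `centos-7.6-x86_64` can be read as a file whose "extension" is `.6-x86_64`; it is not pip's actual code, and the names `guess_extension`, `is_supported` and `SUPPORTED` are hypothetical.

```python
import os.path

# Hypothetical subset of archive suffixes an installer might accept.
SUPPORTED = {".whl", ".zip", ".gz"}


def guess_extension(link_path: str) -> str:
    """Return what an extension-based filter would treat as the suffix."""
    # Drop a trailing slash so a directory link is judged by its last component.
    name = os.path.basename(link_path.rstrip("/"))
    # splitext splits at the LAST dot, wherever it happens to be in the name.
    return os.path.splitext(name)[1]


def is_supported(link_path: str) -> bool:
    """True if the guessed suffix is one of the accepted archive suffixes."""
    return guess_extension(link_path) in SUPPORTED


if __name__ == "__main__":
    for path in (
        "/os-releases/20.2.6/centos-7.6-x86_64/",
        "/os-releases/20.2.6/centos-7-x86_64/",
        "keystone-16.0.2.dev19-py2-none-any.whl",
    ):
        print(path, repr(guess_extension(path)), is_supported(path))
```

Under this assumption, a repo path without the `.` in the distribution version has no suffix at all, which would fit jrosser's suggestion to look for a later change that removes the `.` from the path.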
anskiyspatel: source is default, right? I haven't changed it.13:28
spateljrosser could you kick off the gate process? - https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80213413:29
spatelanskiy yes source is default13:29
nurdiejrosser: if a '.' is the cause of all of my woes over the last week, I'm going to beat the nearest printer to death13:29
anskiyspatel: btw, are you sure that it was using OVN in the end? Because, if I don't set neutron_provider_networks in user_*.yml it doesn't trigger OVN installation at all.13:30
jrosserspatel: I can’t because the depends-on are not merged13:31
spateloh!! got it 13:31
spatelanskiy hmm! i just noticed we don't have OVN centos-8-stream job in CI job 13:32
spatelanskiy i am kicking off my centos-8-stream lab to deploy AIO with OVN and see how it goes.. i think we should setup zuul CI job for OVN C8 stream13:33
mgariepydowngrading docutils to 0.16 seems to fix the issue.13:33
anskiyspatel: there are a couple more bugs to this (which lead to non-working OVN), some of which I'm not sure how to fix properly.13:33
spatelI am running c8-stream in my lab and no issue at all.. but i would like you to open bug or show some error so we can fix it 13:34
anskiyokay :)13:36
jrosseranskiy: are you interested in helping to maintain the centos stream support in OSA?13:36
nurdieAlso, jrosser: what do you mean, "from source code"? As opposed to what, a highly customized deployment where an operator pins their own pip packages? I'm not trying to be facetious! If there's a "better" way, I'll do it13:36
jrossernurdie: if you can do a “distro” install which uses exclusively apt/rpm from Ubuntu cloud archive or RDO, but I would highly discourage that in an OSA context13:37
anskiyjrosser: I don't have much experience in contributing to projects this large, but I can try :)13:37
jrosseranskiy: I think it’s only fair that I say that OSA is maintained by its end users for its end users13:38
nurdieAh, got it. Yeah, I prefer to leave the version pinning to the pros. I do that with the company I work for's HA version of their product so I understand that :)13:39
jrosserand we see less and less people using centos over time, and without active maintainers it may even come to dropping support13:39
nurdiejrosser: thanks yet again. I will hunt for the missing period and submit a bug if I find it13:39
spatelanskiy you will start maintaining and you won't even realize it, that is how it works :)13:39
spateljrosser looks like it doesn't like this solution - https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/80347513:41
spateltrying to hunt what is going on 13:41
jrossernurdie: I have a very distant memory of needing to fix that period issue but I can’t look at the history this week13:41
nurdieIt's cool. At any rate, I'm very appreciative of your knowledge and willingness to help. You, sir, are a scholar and a star13:42
nurdieIf you're a sir* or whatever13:42
jrosserspatel: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_fd8/803475/1/check/openstack-ansible-deploy-hosts_distro_lxc-centos-8-stream/fd8d618/logs/host/lxc-cache-prep-commands.log.txt13:44
spatelhmm very odd13:45
spateli am deploying in lab to debug.. 13:46
spateljrosser maybe this is the solution - https://access.redhat.com/discussions/4222851 but we will see when my build completes14:00
nurdiejrosser: Perhaps it's even worse: a missing trailing slash (/)? https://pastebin.com/2VfBjJZY14:05
nurdie>_<14:05
nurdieCould that be an nginx bug? Needing to add try clauses for missing trailing slashes?14:06
nurdieEr, try_files* rather14:07
jrossernurdie: you could make a symlink to a simpler path on all the repo servers and override this https://github.com/openstack/ansible-role-python_venv_build/blob/stable/train/defaults/main.yml#L14314:16
jelabarre-rhfor the ansible module/task "yum_repository", it seems that's only for adding repos where you have the full baseurl, rather than simply enabling one that is already defined15:08
jelabarre-rhat least that's the impression I'm getting from looking at the examples on the module's documentation page15:09
mgariepyspatel, jrosser https://github.com/openstack/openstack-helm/commit/9c89c32bd3c862f100cb0170909cb4c2312153c515:41
mgariepythat's a way to fix a translation issue..15:42
spatelNice!15:44
mgariepyit's not super respectful for the translator IMO.15:45
spatelwe have to unblock the gate15:54
mgariepythe issue is the rst parser in docutils not falling back to en when encountering unknown lang.15:57
spatelThis stuff is beyond my knowledge :)16:06
mgariepyi'm not a python dev either but i can read ;p16:06
spatelwhy don't we go with non-voting so we can release next wallaby and come back on this later when we have more manpower :)16:07
mgariepy+2 on the patch16:13
mgariepyspotz_, ? do you have a minute? https://review.opendev.org/c/openstack/openstack-ansible/+/80337116:13
opendevreviewSatish Patel proposed openstack/openstack-ansible-lxc_hosts master: Add yum vars for centos-8-stream lxc containers  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/80363016:15
spateljrosser this is the fix for centos-8-stream lxc cache failure - https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/80363016:16
jrosserspatel: are they not needed on the host for a metal job?16:27
*** sshnaidm is now known as sshnaidm|afk16:29
spotz_mgariepy: looking16:33
spotz_ha jrosser got it16:33
mgariepyjrosser, is on vacation !:P16:40
*** rpittau is now known as rpittau|afk16:41
admin1[haproxy_server : regen pem]  -- this looks like a new addition .. why would this run when i am supplying my own certs ? 16:44
admin1and it fails 16:44
jrosserbecause it has to concatenate the CA, key and certificate16:45
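A minimal sketch of the concatenation step jrosser mentions, assuming the goal is one combined PEM file for HAProxy. The function name `build_combined_pem` and the block order (certificate, CA, key) are assumptions made for illustration; the log does not show what the `haproxy_server` role actually does.

```python
def build_combined_pem(cert_pem: str, key_pem: str, ca_pem: str = "") -> str:
    """Join PEM blocks into one text, each block ending in a single newline.

    The order used here (certificate, optional CA chain, private key) is an
    assumption for this sketch, not taken from the haproxy_server role.
    """
    blocks = [cert_pem, ca_pem, key_pem]
    # Skip empty blocks and normalise surrounding whitespace so that the
    # END line of one block never runs into the BEGIN line of the next.
    return "".join(block.strip() + "\n" for block in blocks if block.strip())


if __name__ == "__main__":
    cert = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----"
    key = "-----BEGIN PRIVATE KEY-----\nBBBB\n-----END PRIVATE KEY-----\n\n"
    print(build_combined_pem(cert, key), end="")
```

Normalising the whitespace matters because a missing newline between two blocks leaves an `-----END ...-----` marker glued to the next `-----BEGIN ...-----` marker, which PEM parsers tend to reject.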
admin1so this is a new addition ? before i only had to provide a cert pem and a key pem 16:46
admin1now have to provide all 3 of them separately ? 16:46
admin1haproxy_ssl_self_signed_regen: false ; haproxy_user_ssl_cert: /opt/ssl/cert.pem ; haproxy_user_ssl_key: /opt/ssl/key.pem  -- these 3 lines worked fine for years 16:47
jrosserwhich branch fails?16:47
admin1i am trying to deploy 23.0.0 16:48
jrosserhave you read the release notes16:48
admin1strangely , i have not this time :) 16:48
jrosserahha - I cannot help with it this week but there is a total overhaul for ssl in W16:49
admin1https://docs.openstack.org/releasenotes/openstack-ansible/unreleased.html  - i see some pointers there16:49
admin1so just change to haproxy_ssl_cert_path and that's it, it looks like 16:51
spateljrosser for metal it's automatically creating those variables; it's only not creating them for the lxc container, i believe because we are building it in a chroot and it may need some dependency 16:51
admin1jrosser , is it 1 single pem now ? i can't find anything when doing a grep -ri for the _key 16:52
jrosserspatel: https://github.com/openstack/openstack-ansible-lxc_hosts/blob/master/vars/centos-8.3.yml#L2416:54
admin1it worked for me for the wildcard . i will try to submit a patch to the documentation to add a few lines in the docs to make it much clearer 17:00
spateljrosser sweet!! so we should just need to add - /etc/yum/vars 17:05
spateljrosser should we create a centos-8.yml file? 17:12
jrosserit’s complicated17:15
spatelthen lets go with 8.3 for now or create symlink to centos-8.yml 17:16
jrosser?17:16
spateli meant add here - https://github.com/openstack/openstack-ansible-lxc_hosts/blob/master/vars/centos-8.3.yml#L2417:16
jrosserstream vs not stream is what makes it complicated17:16
jrosseriirc 8.3 and 8.4 are not-stream17:17
spatelyes 17:17
jrosservars/redhat.yml covers stream?17:17
spatelhmm it should cover.. 17:17
jrosseras it’s neither 8.3 nor 8.4 so falls back to that, I think17:17
spatellet me add in redhat.yml and run test 17:18
jrosserthere’s no proper detection of stream in ansible so this is a mess17:18
spatelhmm17:22
spatellet me add in redhat.yml and rebuild my lab to see if that works then we will stick with redhat.yml 17:28
opendevreviewMarc Gariépy proposed openstack/openstack-ansible master: skip -W on sphinx-build for translation.  https://review.opendev.org/c/openstack/openstack-ansible/+/80363517:48
mgariepylet's see if that works.17:50
mgariepythe issue seems to be more the sphinx-build command -W that treats warnings as errors and the bash script that does exit on error.17:52
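A minimal sketch of the idea behind "skip -W for translation", assuming the docs build passes a language override to `sphinx-build` for translated builds. The helper `sphinx_build_args` and its parameters are hypothetical; the flags `-b`, `-W` and `-D` are standard `sphinx-build` options, but the log does not show how the actual patch is written.

```python
from typing import List, Optional


def sphinx_build_args(
    source_dir: str,
    out_dir: str,
    language: Optional[str] = None,
) -> List[str]:
    """Build a sphinx-build command line for one documentation build."""
    args = ["sphinx-build", "-b", "html"]
    if language is None:
        # Default (English) build: keep -W so any warning fails the job.
        args.append("-W")
    else:
        # Translated build: override the language and leave -W off, so a
        # warning about an unknown language does not abort the script.
        args += ["-D", f"language={language}"]
    return args + [source_dir, out_dir]


if __name__ == "__main__":
    print(sphinx_build_args("doc/source", "doc/build/html"))
    print(sphinx_build_args("doc/source", "doc/build/html/de", language="de"))
```

The trade-off is the one mgariepy hints at: dropping `-W` for translated builds unblocks the gate, but real warnings in those builds no longer fail the job.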
spatelfingers crossed 18:05
mgariepyit passed.. but well..  https://review.opendev.org/c/openstack/openstackdocstheme/+/80275818:09
opendevreviewSatish Patel proposed openstack/openstack-ansible-lxc_hosts master: Add yum vars for centos-8-stream lxc containers  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/80363018:12
opendevreviewSatish Patel proposed openstack/openstack-ansible-lxc_hosts master: Add yum vars for centos-8-stream lxc containers  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/80363018:13
spateljrosser this works in my lab - https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/803630/3/vars/redhat.yml18:13
spatelmgariepy great! lets push it out.. 18:14
mgariepyit might soon be dropped .. :/18:16
admin1checking if you guys know why this comes up from time to time "galera_server : Fail if galera_cluster_name doesnt match provided value" 18:39
admin1greenfield lab install 18:39
admin1literally the container was just created 18:39
admin1mgariepy spatel jrosser, any of you already using 23.0 in prod ? 19:23
admin1in my case, either the rabbitmq fails to cluster, or the mysql fails to cluster :( 19:23
admin1prod as in 3x controllers + valid ssl cert for horizon+apis 19:24
mgariepynot me.19:25
spateladmin1 i don't, and i am working on upgrading to 23.0 but there are some pending patches 19:28
spateladmin1 open bug if you think its real issue 19:34
spatelbefore we release 23.1 19:34
admin1yeah .. i am doing more testing now 19:40
spatel+119:48
spatelIn the CI jobs we don't do multi-node so you will see more issues when you go out and start deploying on multi-node (especially clustered apps)19:48
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org is going down for a quick restart to adjust its database connection configuration, and should return to service momentarily20:02
nurdieI evacuated a compute node and 3 instances are stuck in "accepted" in "nova migration-list", even though those hosts are booted and running on a different compute node. Do I need to edit galera to clear that up?20:05
admin1nurdie, never delete an entry .. just update fields :) 20:48
admin1using tcpdump, maybe make sure its 100% in the other server 20:48
admin1wondering what happens if you cancel the migration job (after they have already migrated) 20:49
nurdieadmin1: Oh it's definitely on the other server lol. The failed compute node is super dead21:08
nurdieAnyone know off the top of your head what db.table that will be in?21:08
opendevreviewSatish Patel proposed openstack/openstack-ansible-openstack_hosts master: Add nova dependency repo for distro install  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/80347521:09
admin1#openstack-nova guys will know this nurdie21:09
nurdieadmin1: thanks!21:11
jrosseradmin1: we normally make the “x.1.0” release after a few people have tried multinode labs / upgrades on a new release and things found in those to fix21:15
jrosserwould be great if you could take a look (particularly regarding SSL as that’s a huge change for W). the docs certainly need improving and examples adding for the new features there21:16
admin1i am on it 21:16
jrosserawesome21:17
admin1setting up 2 labs .. one greenfield multinode .. and one installing a  new 22.2.0 now so that once the greenfield works, will test from 22.2.0 > 23.0.0 as well 21:17
admin1to see what needs to be changed for existing variable files21:17
jrosseryeah, there is now a full internal CA21:18
jrossera new ansible role to manage that and a whole set of new files in /etc/openstack_deploy, and new variables to manage it21:18
