Tuesday, 2024-03-19

opendevreviewIvan Halomi proposed openstack/kolla-ansible master: Refactor of docker worker  https://review.opendev.org/c/openstack/kolla-ansible/+/90829507:40
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Run ML2/OVS agents processes in separate containers  https://review.opendev.org/c/openstack/kolla-ansible/+/86478008:12
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Run ML2/OVS agents processes in separate containers  https://review.opendev.org/c/openstack/kolla-ansible/+/86478008:15
opendevreviewWenping Song proposed openstack/kolla-ansible master: Fix the ansible intro_inventory.html link  https://review.opendev.org/c/openstack/kolla-ansible/+/91362409:21
guesswhat[m]Enable the Implicit flow (this one allows you to use the OpenStack CLI with oidcv3 plugin) ( https://opendev.org/openstack/kolla-ansible/src/branch/stable/2023.2/doc/source/contributor/setup-identity-provider.rst )    is it still true ?10:18
opendevreviewMichal Arbet proposed openstack/kolla-ansible master: Fix installation of ovs-dpdk service  https://review.opendev.org/c/openstack/kolla-ansible/+/91365310:45
opendevreviewMerged openstack/kolla-ansible master: Fix the ansible intro_inventory.html link  https://review.opendev.org/c/openstack/kolla-ansible/+/91362410:47
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Run ML2/OVS agents processes in separate containers  https://review.opendev.org/c/openstack/kolla-ansible/+/86478011:08
opendevreviewJakub Darmach proposed openstack/kayobe master: Use new collections in Kayobe  https://review.opendev.org/c/openstack/kayobe/+/91074211:08
halomivahey, i have bad news about this patchset https://review.opendev.org/c/openstack/ansible-collection-kolla/+/910751  in the end it was just partial solution because we install docker using pip only in virtual environments and during zuul doesn't use them so it means it fails on the same problem, I looked for solution but it seems that ubuntu/debian only support package version 5.0.3 of python3-docker. I see there is some osbpo repository used in p11:16
halomivawould it be possible to add it also to docker and add there newer release of python3-docker package ? or how should i deal with this problem?11:16
Fl1ntHi everyone12:41
Fl1ntWhen doing a tox -e genconfig, tox complains that stevedore.named can't find the kolla namespace, any idea? I'm trying to generate the kolla-build.conf sample config file, but it seems something is missing. Using the 15.6.0 tag.12:42
fricklerFl1nt: iirc there was an issue with tox >= 4, can you check your tox version?12:44
Fl1ntah! ok, 4.1412:44
Fl1ntI guess I need to use below 4 then right?12:44
Fl1nttesting right now12:45
Fl1ntfrickler, seems it work, perfect, thanks!12:46
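For readers hitting the same genconfig failure, the check frickler asks for can be sketched in Python. This is only an illustration under assumptions: the helper names are hypothetical, and the "tox >= 4" boundary comes from frickler's recollection in this log rather than from a confirmed bug report.

```python
def tox_major(version: str) -> int:
    """Return the major component of a tox version string such as '4.14'."""
    return int(version.strip().split(".")[0])


def affected_by_tox4_issue(version: str) -> bool:
    """True when the version falls in the range frickler recalls as problematic.

    Assumption: the boundary is exactly major version 4, as stated in the chat.
    """
    return tox_major(version) >= 4


if __name__ == "__main__":
    # Fl1nt reported 4.14; the workaround used in the log was installing a tox below 4,
    # for example with: pip install 'tox<4'
    for candidate in ("4.14", "3.28.0"):
        print(candidate, affected_by_tox4_issue(candidate))
```

The version string would normally come from `tox --version`, whose first token is the version number.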
Fl1ntoof ><' python plumbing seems to have a hard time.12:52
Fl1ntpython is now complaining about setuptools and buildeggs being deprecated12:52
mnasiadkaThese are only warnings12:52
Fl1nthttps://paste.opendev.org/show/bueuqo1OMa3KOAYssKoq/12:54
Fl1ntmnasiadka, it completely broke on my run12:54
mnasiadkaPlease raise a bug then12:54
Fl1ntI was looking for humans in here first as sometimes you know things broke but just because of the doc informations being deprecated.12:56
Fl1ntBTW: What is kolla plan for RPM based build? Is the plan to abandon and switch to DEB based? How is the transition happening? Does kolla still support CentOS while transitioning to Rocky?12:57
Fl1ntmnasiadka, bugs tickets created: https://bugs.launchpad.net/kolla/+bug/2058385 & https://bugs.launchpad.net/kolla/+bug/205838313:17
mnasiadkaWe are not abandoning RPM13:23
mnasiadkaWe don't support CentOS 9 officially, but we have jobs that build and deploy on that distro, because that's a good prognostic of failures in EL9 ;-)13:23
r3ap3rFl1nt, see the second "note" in this link regarding CentOS 9 Stream: https://docs.openstack.org/kolla-ansible/latest/user/support-matrix.html13:34
r3ap3rI wasn't sure what version you were trying to build so I just grabbed the latest docs.13:35
SvenKieskethanks for pointing to the docs, was in the process of doing the same14:25
r3ap3rSvenKieske np. :-)14:31
opendevreviewVictor Morales proposed openstack/kolla-ansible master: Ensure that nova scheduler runs before refesh it  https://review.opendev.org/c/openstack/kolla-ansible/+/91327714:34
Fl1ntmnasiadka, ok thanks14:41
Fl1nt@r3ap3r, yeah I've already read these pages, hence my questions as it is a bit disturbing to get an ok answer for host but not for image.14:42
Fl1ntr3ap3r, :D14:42
SvenKieskewhy is that disturbing?14:42
Fl1ntBecause docker images shouldn't care at all about host OS indeed.14:43
SvenKieskeyeah and in theory, theory and practice are the same, but in practice they are never the same :)14:44
SvenKieskeit begins with the kernel being used, which bleeds into every container. then the kolla containers are more like mini distros, not pure docker containers, as I'm pretty sure you know already.14:44
SvenKieskealmost nobody runs docker containers like they "should be" run, that is, with a single binary inside it, and nothing else.14:45
SvenKieskeI guess you could try bootstrap kolla in distroless containers and see if that works. but then again, if you don't need a distro and just have single binaries, you also don't need centos support in the container?14:46
SvenKieskebut the docker best practices are unclear to begin with: https://docs.docker.com/develop/develop-images/guidelines/ so not really any advice you can follow easily. ¯\_(ツ)_/¯14:51
Fl1ntSvenKieske, yep just my point, I'm wondering if someone already tried to build using distroless-python already.14:53
SvenKieskeI don't think so, one of the many problems you will run into is, that many many applications assume a working linux/unix env14:53
Fl1ntSvenKieske, I'm not really following Docker best practices, just underlying technology way to work tho.14:53
SvenKieskee.g. fernet tokens in keystone are renewed with cron, so you need cron, but you don't get notified about any errors in cron, because the default cron mechanism for error logging is to mail the root account, so you need mail, in theory.14:54
SvenKieskeelasticsearch/opensearch is sensitive to LANG_* settings in env14:55
SvenKieskethe list goes on and on. you can of course replicate all this stuff on a per container basis14:55
Fl1ntbut anyway, right now my concerns are more related to CentOS Stream 9 build being so unrelayable so far on our side but I'm checking if it's related to our environment or the build process overall. Hence why I'm trying to build everything on my servers with a fully connected to public internet env.14:55
Fl1nts/unrelayable/unreliable/14:56
SvenKieskebasically, docker is an elaborate way to contain the dependency chain issues modern systems have. at least that is what it's used for in many infrastructure systems. many of the other advantages are not being used, because it's hard to do so, and a lot of work, that nobody has done so far.14:56
SvenKieskeis your end goal running centos stream in prod? or is this more a canary for RHEL/other EL distros?14:57
Fl1ntWe already run a very large cluster installation using CentOS since a long time now14:57
SvenKieskeI don't want to discourage you but I hear quite often centos stream itself is breaking - as I would expect by its nature as rhel's new upstream dev env -14:57
SvenKieskemost people I know switched to alma or rocky14:58
SvenKieskethat's why we have those images as well14:58
SvenKieskethat being said, if we support users building their own images with stream, at least that part should work14:58
Fl1ntRocky and Alma are shitty broken replacement (excuse the language), we benchmarked them against CentOS Stream 9 on Snapshots, they're really not usable.14:58
SvenKieskeinteresting :) I don't use any of the *EL world since..10 years (I used scientific linux back then)14:59
Fl1ntMost of Rocky and Alma repo are indeed CentOS repos tbh :D14:59
SvenKieskewhat's broken about them? Performance? Or Usage Regressions?14:59
Fl1ntUsage regression, need a lot of workarounds for repos etc.15:00
Fl1ntwe've another small debian based cluster, but gosh debian can't support correctly enterprise level HW...15:00
SvenKieskeyeah the ootb experience with hardware is better, well with newer kernels (and prop. firmware in some debian instances)15:01
r3ap3rCentOS Stream is not a point release distro. That is like a project saying we support CentOS 9 Stream and deploying Fedora and expecting it to work. Rocky Linux, Alma, RHEL, etc... are all "downstream" of CentOS Stream. The binaries are "newer" than the point release distros are "older". Your experiences between the two are going to be different. Not meant to be an "attack" but just trying to clarify what seems to be a 15:01
r3ap3rmisconception.15:01
Fl1ntDebian do partial support and implementation of HW, but I do cope with that as they're 100% fully benevolent, not Enterprise backed.15:02
SvenKieskeoh you would be surprised how many companies work on debian, some hide behind gmail.com addresses though :D15:02
SvenKieskeI would just use upstream LTS kernels and move on, but well, not everyone can do that15:03
Fl1ntr3ap3r, we've an internal product similar to artifactory but on steroid, as we're 100% offline installation, we do point in time installation using CentOS, everything is fine once your CI/Build pipeline do complete.15:03
SvenKieskenothing beats a custom gentoo build, but who has time for that these days?15:03
r3ap3rFl1nt, that doesn't make your CentOS deployment point release though. Your CentOS deployment will still be incompatible with point release distros.15:04
SvenKieskewe should imho also start looking into what it takes to support tox4, but I guess this is already happening somewhere in the opendev ecosystem, at least I saw some mails around this topic15:04
Fl1ntr3ap3r, this is not a misconception at all, we do work really closely with RHEL on many topics as we are using it heavily on customers products, so we do know how CentOS is working, and honestly it's way more stable than supposedly stable enterprise distros such as ubuntu where canonical sneakily swaps packages that need patches without changing the versions and then crashes the apt checksums.15:05
SvenKiesker3ap3r: I guess they don't care about point release compatibility, they just want something stable. if you mirror complete packages, versioned, you can basically build your own stable distro this way15:05
SvenKieskewe did basically the same thing with varying degrees of success in the long term at some former company. you need to watch which packages you really support, because the support burden can quickly grow out of hand15:07
Fl1ntWe could basically replicate the whole CentOS Stream Build system and it would achieve 100% matching artifact, we just skip this part and basically do our own point in time distro using various advanced tools15:07
SvenKieskeisn't all you need basically an aptly mirror? I forgot the name of the rhel counterpart. you can do snapshots there and basically install any version of any package ever hitting the mirrors. then you cut release images, test them. implement upgrade tests for important packages, do upgrades, release new images, done15:08
Fl1ntso basically we use a CentOS Stream 9 consistent and coherent installation properly freezed15:09
SvenKieskeyep15:09
SvenKieskeit starts getting unmaintainable for most companies once you seriously want to address security issues though, at least for smaller orgs.15:10
SvenKieskebut that might not be an issue, either due to small images (less packages), or a large org, or very good automation.15:10
Fl1ntWe do that internally as our solution do support many more than just CentOS, we do it for Ubuntu/CentOS/Oracle Linux/RHEL/Debian and HTTP/git/rpm/deb/whatever based sync packages15:10
Fl1ntBasically, we can't trust anything and can't get internet access, so all in all everything is sync/analyzed/build/used internally.15:11
SvenKieskeit will be interesting though how distro kernels respond to https://lore.kernel.org/linux-cve-announce/ I bet we will see basically daily kernel updates, at least twice per week :D15:11
r3ap3rSvenKieske, what I was bringing up was more to address the thinking of "since CentOS is supported as a host for the containers, the CentOS based container images should work too" which is a faulty thought process since the point release binaries never really align with the CentOS binaries since they are essentially "beta" versions of what is in the point releases. Before the packages from CentOS are brought into the point 15:12
r3ap3rreleases, they are "massaged" and function tested to ensure compatibility and stability. We can't just assume they work because CentOS is upstream of the point releases.15:12
Fl1ntwe could do it if required as we can completely manage our whole pipeline schedule, right now we do it every week with everything related to core distribution package being monthly based signature snapshots and then everything security related is weekly based.15:13
r3ap3rSame thing happens when they bring packages from Fedora to CentOS. Just because something works in CentOS, doesn't necessarily means it will work in Fedora.15:14
Fl1ntyes ABI/API/etc compatibility, hence why I'm trying to fix this specific bug that I'm chasing right now.15:15
SvenKieskethis just shows how broken both concepts are: the "stream release" and "pin software version foo in my container and I'm fine" :D15:15
Fl1nt but in our case that doesn't really matter as all our stack is using the coherent release system. 15:15
SvenKieskeI don't understand your centos<-> fedora comparison. usually it's the other way around: centos stream is cut from fedora and fedora packages are brought into centos, not the other way around.15:16
SvenKieskefedora always was the dev upstream for rhel15:16
SvenKieskewell not always, but since a very long time15:16
r3ap3rSvenKieske, the comparison is the equivalent of what Fl1nt is attempting to do with the Kolla containers. They are tested and built on Rocky Linux 9 but they want them to work on CentOS Stream. That is like trying to get something that works in CentOS Stream to work in Fedora. They are downstream from each other, hence why the expectation of something from downstream to work upstream is a misnomer. 15:18
Fl1ntIs there a reason why we don't put ansible on the matrix page for kolla and kolla-ansible?15:18
Fl1ntr3ap3r, hence why on any run of our system any downstream patch is then reported and contributed back to the upstream, that then, if they want do something with them.15:19
Fl1ntlike for designate for instance...15:19
kevko Fl1nt what is very large cluster ? 15:19
Fl1ntkevko, more nodes than a public european hosting provider for instance.15:20
SvenKieskeah understood now15:20
SvenKieskeregarding ansible: I think it was somewhere, but I guess it's in the getting started or installation tutorial. I _think_ we also have prechecks now that check if the ansible version matches expectations and throws an error if not15:21
kevkoFl1nt: how many has public european hosting provider ? :D 15:21
r3ap3rFl1nt, seems sensible.15:21
Fl1ntSvenKieske, yep I know where to find the info, but it always seems a bit odd to me to get a matrix page about storage/base image but not any tools that actually build/generate all of that.15:23
SvenKieskeFl1nt: there you are: https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html#install-dependencies-for-the-virtual-environment15:23
SvenKieskeas you know: PRs welcome. I just recently improved some little parts in the docs, but docs are not loved by most devs it seems. it's also hard to get reviews for docs only patches imho :-/15:23
SvenKieskeI guess most customers don't like to pay for explicit docs stuff, but complain when they are not there15:24
SvenKieskeweird world, isn't it? :)15:24
Fl1ntSvenKieske, I could do doc review tho as our main feedback on the platform is about the openstack doc :D15:25
SvenKieskemhm, if you google for "kolla supported ansible versions" the first link goes to our github mirror, where the version isn't even rendered.. https://github.com/openstack/kolla-ansible/blob/master/doc/source/user/quickstart.rst15:25
SvenKieskeguess github doesn't really support restructured text15:26
Fl1ntI mean, our customers of the platform do complain a lot about it, I'm myself used to it as I work with OS since cactus release.15:26
SvenKieskethen I guess my sentence still applies: customers complain but don't want to pay for improvement :D15:27
Fl1ntkevko, OVH have over 400K servers split upon 41 Datacenters, we do have ~128 of them. I'll let you project it :D15:29
Fl1ntSvenKieske, I do it when I have time, like today... but you know how it is when you're working ^^15:30
SvenKieskewell, that reminds me about this burned out ovh "datacenter" :D (stacked containers with wood, which burns really good it seems)15:30
SvenKieskeI hope your DC design is better than OVH's ;)15:30
Fl1ntyep, that was a glorious fire :D15:30
SvenKieskewell they are dirt cheap, I guess you get what you pay for15:31
Fl1ntSvenKieske, on the most critical yes, definitely, on some others... meh :D15:31
Fl1ntPay peanuts get monkey yep15:31
SvenKieskeI only knew complete concrete building DCs with guards and stuff, seeing this the first time blew my mind. I mean I know about the containerized DC that you can put on a truck, but how on earth can you use wood in the supporting construction?15:32
opendevreviewMerged openstack/kolla-ansible master: Revert "Pin zun jobs to Docker 20"  https://review.opendev.org/c/openstack/kolla-ansible/+/90409315:32
SvenKieske"move fast and break things" :D15:32
SvenKieskedc edition15:32
Fl1ntSvenKieske, that's what you get when you fall for any weird concept such as greenscore...15:33
SvenKieskewell I guess you can build a "green" DC and maybe even with certain wood stuff, if you have a good concept and people knowing their stuff (I know about a - afaik - 14 stories high wooden building in Hamburg, which is as fire resistant as a concrete building)15:34
SvenKieskebut you need to have equipment, staff and excellent planning/time for such stuff. and I bet it's not cheap15:35
SvenKieskeit's 18 floors, actually: https://suelzle-stahlpartner.de/en/references/housing-construction/germanys-highest-wooden-house/15:36
opendevreviewMerged openstack/kolla-ansible master: Zun: remove docker's cluster-store option  https://review.opendev.org/c/openstack/kolla-ansible/+/90416415:37
opendevreviewMerged openstack/kolla-ansible master: Revert "zun: Deprecate Zun provisionally"  https://review.opendev.org/c/openstack/kolla-ansible/+/90409415:37
Fl1ntWood per se ain't the issue as you point out ^^15:38
SvenKieskewell, the ground stuff is still concrete15:38
SvenKieskeincompetent people will find a way to even burn down a fire resistant building I guess ;)15:39
Fl1ntyep, totally, but as long as they learn to not repeat the issue I'm fine with that ^^15:40
Fl1ntalthough OVH is... not learning anything...15:40
Fl1ntthey've a large OS cluster too btw15:40
SvenKieskeyeah I know. I found their talk about backing up ceph quite interesting (I believe it was at fosdem some years ago)15:41
opendevreviewMerged openstack/kolla-ansible master: Fix Skyline API Server TLS configuration  https://review.opendev.org/c/openstack/kolla-ansible/+/90932915:41
SvenKieskethey actually do also some upstream work here and there, could be more I guess, but well. you take what you get.15:41
opendevreviewMerged openstack/kolla-ansible master: Skyline configure Prometheus  https://review.opendev.org/c/openstack/kolla-ansible/+/91051415:41
Fl1ntFun fact, CEPH Erasure Code come from one of my old team btw :D15:45
Fl1ntthe erasure code part not the whole CEPH I mean :D15:47
Fl1ntThanks everyone for this discussion, was cool ^^ see you later folks!15:56
opendevreviewMartin Hiner proposed openstack/kolla-ansible master: Add container engine migration scenario  https://review.opendev.org/c/openstack/kolla-ansible/+/83694116:14
guesswhat[m]How is this relevant https://docs.openstack.org/kolla-ansible/latest/reference/storage/external-ceph-guide.html ? Seems that keyring names are changed https://github.com/openstack/kolla-ansible/blob/0b820f10e084a650950f144772993d8afbed247c/releasenotes/notes/multiple-ceph-backends-913051631c6e69ee.yaml#L1516:33
kevkoguesswhat[m]: it's not 16:37
kevkoguesswhat[m]: keyring names are just refactored ... nothing changed for user16:38
kevkoguesswhat[m]: only if he wants to support several ceph clusters16:38
guesswhat[m]@kevko i have troubles with TASK [nova-cell : Copy over ceph cinder keyring file]   The error was: 'dict object' has no attribute 'stat'. 'dict object' has no attribute 'stat'\n\nThe error appears to be in '/usr/local/share/kolla-ansible/ansible/roles/nova-cell/tasks/external_ceph.yml'16:47
guesswhat[m]but the /etc/kolla/config/nova/ceph.client.cinder.keyring file exists16:47
guesswhat[m]have it, seems that it's because cinder_backend_ceph is disabled, although I am trying to enable it in a different way16:50
guesswhat[m]any idea, how to skip this https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/cinder/templates/cinder.conf.j2#L148 ?  cinder_ceph_backends: [] does not work16:51
kevkoguesswhat[m]: wait16:52
kevkoguesswhat[m]: you are trying to setup something really anti-pattern 16:54
kevkoguesswhat[m]: if you want to omit backend config for cinder_ceph_backend ...just turn off cinder_backend_ceph <<< 16:54
guesswhat[m]1. ) i have multiple pool names, kolla configuration does not solve this, cinder_ceph_backend has to be enabled, otherwise the error  ^^^ 16:55
kevkoguesswhat[m]: what ? of course it solve 16:56
kevkoguesswhat[m]: okay, i can help you as i am author of that patch :) 16:57
kevkoguesswhat[m]: what is your ceph_cinder_user: "cinder" and ceph_nova_user: "nova"16:58
guesswhat[m]thats not a problem16:58
kevkoguesswhat[m]: what is the problem ? you can use config-override ...so...16:59
guesswhat[m]this is not possible to set https://github.com/openstack/kolla-ansible/blob/0b820f10e084a650950f144772993d8afbed247c/ansible/roles/cinder/templates/cinder.conf.j2#L15216:59
guesswhat[m]thus, if you want to have multiple backends, you have to configure cinder_ceph_backends  and also have override files, which is ...17:01
SvenKieskekevko: have you ever seen "Inappropriate ioctl for device" in the socat mariadb clustercheck?17:08
mnasiadkaSvenKieske: everybody has seen that, socat logging is crap17:20
mnasiadkasometimes I think it would be better if that crap would not log anything :)17:20
SvenKieskexD it totally tripped me up and let me look in the wrong direction..17:25
SvenKieskewhen do we replace this with proxysql again? :D17:25
mnasiadkaI'm ok with defaulting to proxysql17:37
mnasiadkajust somebody raise that patch please17:37
mnasiadkaand people that love clustercheck can change enable_proxysql to false ;)17:37
SvenKieskeI guess I even have a clue why socat dumps these logs. I think it's because haproxy does an HTTP check and afaik socat knows nothing about http and complains about the unclean connection shutdown because it can't unwind an http connection..17:55
SvenKieskebut it's only a wild guess, I'm not deeply familiar with socats protocol support, I would not be surprised if it speaks native http17:55
guesswhat[m]kevko: this is missing from your patch https://pastebin.com/raw/FcWrSbvX18:19
Lockesmithjovial, it looks like I actually screwed up a trunk group assignment. I'm not sure why that was causing the weird interaction and not just stopping all traffic, but fixing that appears to have gotten things running. Thanks for your time in troubleshooting this!19:25
kevkoguesswhat[m]: no ! even if you leave cinder_ceph_backends as it is in defaults ...you can create /etc/kolla/config/cinder/cinder-volume.conf and place your [rbd-X] -> with configs as your override ...and service will be overconfigured with user specified override22:33
kevkoguesswhat[m]: well, but there is a patch waiting for review which is solving this out-of-the box https://review.opendev.org/c/openstack/kolla-ansible/+/907166 << but it doesn't mean that you can't fulfill it with the config override kolla already provides ...22:36
kevkoguesswhat[m]: after that patch you have freedom to specify ceph cluster, user, rbd pool  for every ceph cluster you would like ...22:38
kevkofor all services and also for every single compute node ..if you want ...we are using it for splitting AZs and storages ...22:39
kevko(for one customer)22:39
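The override kevko describes (a /etc/kolla/config/cinder/cinder-volume.conf file carrying one [rbd-X] section per backend) can be sketched by generating the file with Python's configparser. Treat this as a hedged illustration, not the patch's method: the backend names, pool names, and the ceph.conf path are hypothetical, the option names are the usual Cinder RBD driver options as far as I know, and whether `enabled_backends` should be set in the override depends on your deployment.

```python
import configparser
import io


def render_cinder_override(backends):
    """Render an INI override with one section per Ceph RBD backend.

    `backends` is a list of dicts with hypothetical keys: name, pool, user.
    """
    cfg = configparser.ConfigParser()
    # Assumption: the override also lists the backends Cinder should enable.
    cfg["DEFAULT"] = {"enabled_backends": ",".join(b["name"] for b in backends)}
    for b in backends:
        cfg[b["name"]] = {
            "volume_driver": "cinder.volume.drivers.rbd.RBDDriver",
            "volume_backend_name": b["name"],
            "rbd_pool": b["pool"],
            "rbd_user": b["user"],
            "rbd_ceph_conf": "/etc/ceph/ceph.conf",  # hypothetical path
        }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    example = [
        {"name": "rbd-1", "pool": "volumes-ssd", "user": "cinder"},
        {"name": "rbd-2", "pool": "volumes-hdd", "user": "cinder"},
    ]
    # The printed text would be saved as /etc/kolla/config/cinder/cinder-volume.conf
    print(render_cinder_override(example))
```

Generating the file keeps the per-pool sections uniform when there are many pools, which is the situation guesswhat[m] describes.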
opendevreviewMichal Arbet proposed openstack/kolla-ansible master: Switch mariadb's loadbalancer from HAProxy to ProxySQL  https://review.opendev.org/c/openstack/kolla-ansible/+/91372422:52
kevkoSvenKieske: don't use mariadb clustercheck ... using proxysql for years  from stein 22:53
kevkoSvenKieske: mnasiadka: https://review.opendev.org/c/openstack/kolla-ansible/+/913724 << switch to proxysql :D 22:53
aravindhI saw some chatter about replacing the bash in kolla-ansible command with python. Should I try to do this and contribute back? 23:15
aravindhif you guys are okay, I would suggest writing the CLI with typer, it makes it way easier to maintain in the long run..23:16
