Tuesday, 2023-11-14

hamidlotfi_Good morning.06:33
hamidlotfi_I want to remove a controller node from the OpenStack cluster, how do I do that?06:33
hamidlotfi_What are the steps to do it? I didn't see anything in the documents.06:33
hamidlotfi_I am using the zed/26.1 version.06:33
hamidlotfi_@noonedeadpunk @jrosser  06:33
noonedeadpunkhamidlotfi_: hey08:48
noonedeadpunkyeah, we indeed don't have documentation for that and honestly I kinda never did that on my own on production.08:49
noonedeadpunkBut I think, generically, you need to:08:50
noonedeadpunk1. Manually stop haproxy and keepalived services on the controller you want to remove, if these are not on standalone hosts08:51
noonedeadpunk2. Remove controller from the inventory, i.e. ./scripts/inventory-manage.py -r control0408:51
noonedeadpunk3. Remove reference to control from openstack_user_config (this could be 2 actually)08:52
noonedeadpunkand then kinda... run setup-infrastructure and setup-openstack....08:52
hamidlotfi_why run setup-infra and setup-openstack ?08:53
noonedeadpunkas you need to reconfigure galera and rabbitmq clusters, and then change references in each service config to remove old cluster members08:53
noonedeadpunkand re-configure haproxy to remove backends08:54
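A minimal dry-run sketch of the removal steps outlined above, assuming they are run from the openstack-ansible checkout on the deploy host. It only builds and prints the commands and executes nothing. The ssh invocation, the systemd unit names, the `.yml` playbook file names and the helper name `controller_removal_plan` are assumptions for illustration; only `./scripts/inventory-manage.py -r <host>` is quoted from the discussion.

```python
def controller_removal_plan(host, lb_on_controller=True):
    """Return the ordered commands for removing `host` (dry run, nothing is executed)."""
    steps = []
    if lb_on_controller:
        # Step 1: only needed when haproxy/keepalived run on the controller itself.
        # The ssh call and unit names are assumptions.
        steps.append(f"ssh {host} systemctl stop haproxy keepalived")
    # Step 2: drop the host from the generated inventory (command from the discussion).
    steps.append(f"./scripts/inventory-manage.py -r {host}")
    # Step 3 is a manual edit, so it is recorded as a comment line.
    steps.append(f"# remove {host} from openstack_user_config")
    # Re-run the playbooks so galera, rabbitmq, haproxy and the service
    # configs stop referencing the removed member (file names assumed).
    steps.append("openstack-ansible setup-infrastructure.yml")
    steps.append("openstack-ansible setup-openstack.yml")
    return steps


if __name__ == "__main__":
    for command in controller_removal_plan("control04"):
        print(command)
```

Printing the plan first makes it easy to review the order before running anything by hand; as noted in the discussion, this procedure has not been verified in production.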
hamidlotfi_I want to replace an old controller node with a new one.08:54
hamidlotfi_Do you have a suggestion for this?08:54
noonedeadpunkthat is completely different than removing a control node :)08:54
noonedeadpunkor well, depending if new one can have the same hostname as the old one or not08:55
hamidlotfi_I want to set up a new controller node, join it to the OpenStack cluster and remove the old one.08:55
noonedeadpunkso if you're fine with them sharing the same hostname - process would be pretty much easy I guess08:56
hamidlotfi_I used this method to add a new compute node but I got a lot of errors.08:58
noonedeadpunkHuh? We do that quite a lot today08:59
noonedeadpunkSo, first you would need to disable haproxy backends on control you wanna replace (to make it gracefully)08:59
noonedeadpunkWe use playbook like this for such thing: https://paste.openstack.org/show/bYOYrgIJYZETJdMXOPpm/09:01
noonedeadpunkthen I guess you can just shutdown the controller, provision a new one, boot it and kinda run setup-hosts --limit control01,control01-containers (or smth like that...) 09:03
jrosserhamidlotfi_: replacing a node (or reinstalling the OS, which is basically the same) does not need a specific removal step if it comes back as the same thing in the ansible inventory09:19
jrosseradding a new controller then removing an old one is a really complicated way to do that which involves much extra work09:20
hamidlotfi_noonedeadpunk: Thanks for your help.09:27
hamidlotfi_jrosser: I know this is a complicated process.09:27
jrosserhamidlotfi_: but it doesn't have to be?09:27
jrosserreplacing things "in place" is much easier, and it gives you practice for when you want to do an operating system upgrade across your whole deployment09:28
hamidlotfi_jrosser: Can you explain more about what you mean by replacement?09:31
jrosseryou've added a controller and removed a controller09:31
jrosserthat's "replacing a controller" ?09:31
opendevreviewAndrew Bonney proposed openstack/openstack-ansible-galera_server master: Fix ignored database directories configuration  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/90086009:32
hamidlotfi_jrosser: 09:36
hamidlotfi_I know what I want to do, I mean the steps to do it, how should it be done?09:36
jrosserthere are some instructions for an OS upgrade here https://docs.openstack.org/openstack-ansible/latest/admin/upgrades/distribution-upgrades.html09:37
jrosserwhich is very similar09:37
opendevreviewMerged openstack/openstack-ansible-os_swift master: Fix example playbook linters  https://review.opendev.org/c/openstack/openstack-ansible-os_swift/+/90078911:19
opendevreviewMerged openstack/openstack-ansible-os_keystone master: Add quorum queues support for service  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/90063111:24
nixbuilderI have Antelope installed and everything seems to work (so far) *EXCEPT* for adding images... I have the system configured to use cinder for image storage... however whenever I try to add an image I get a "HttpException: 500: Server Error for url: http://<hidden>:9292/v2/images/51db54d6-3d6a-49e5-9045-08c281cd52b1/file, Internal Server Error".  Here is the full log entry... https://paste.opendev.org12:20
opendevreviewMerged openstack/openstack-ansible stable/zed: Apply rate limit for journald in AIO builds  https://review.opendev.org/c/openstack/openstack-ansible/+/90067012:22
nixbuilderOh and BTW... adding volumes works fine.12:24
noonedeadpunknixbuilder: paste link is just a link to a new one fwiw12:33
nixbuildernoonedeadpunk: When I click on https://paste.opendev.org/show/bSIIleOANxJADPUnjTWk/ I get the entire log I posted12:37
noonedeadpunknixbuilder: question: do you have /etc/glance/rootwrap.d/glance_cinder_store.filters ?12:42
nixbuildernoonedeadpunk: Yes.12:43
nixbuildernoonedeadpunk: https://paste.opendev.org/show/b14nu4kdYKlvtO9fI34s/12:44
noonedeadpunknixbuilder: can you also provide output for `/openstack/venvs/glance-27.2.0/bin/pip list` ?12:46
nixbuildernoonedeadpunk: Here it is: https://paste.opendev.org/show/bUYETZXqtgNsVuOlyr5Y/12:48
noonedeadpunkhuh, ok12:49
noonedeadpunkI was hoping we're missing smth obvious, rather than oslo.rootwrap somehow not being executed from venv or not respecting it...12:50
noonedeadpunkAnd I'm kinda sure that thing used to work not that long ago12:50
nixbuildernoonedeadpunk: I'm just a newbie when it comes to oslo and all of that... but one thing I noticed was that the Antelope system did not take any parameters after the /usr/bin/rootwrap in the cinder or glance sudoers file.  Could that be the reason?12:53
nixbuildernoonedeadpunk: Our production Pike system has "glance ALL = (root) NOPASSWD: /usr/bin/glance-rootwrap /etc/glance/rootwrap.conf *"12:54
nixbuildernoonedeadpunk: Similar syntax with cinder as well.12:55
noonedeadpunkprojects moved to privsep since then12:56
noonedeadpunkhttps://governance.openstack.org/tc/goals/selected/migrate-to-privsep.html12:56
noonedeadpunkor well....12:56
noonedeadpunkI'd need to get some sandbox actually to check on that12:57
nixbuildernoonedeadpunk: I guess I will build an AIO and see if I can duplicate the problem there.13:00
noonedeadpunknixbuilder: if you can give the overrides applied - I can spawn it as well13:04
nixbuildernoonedeadpunk: You mean you want my openstack_user_config and user_variables file?13:06
noonedeadpunknah, just how you set glance_* variables for user_variables13:08
nixbuildernoonedeadpunk: Oh... ok13:09
nixbuildernoonedeadpunk: https://paste.opendev.org/show/bikK43TFHQ4ZAXJjkNHk/13:11
mgariepyanyone seen this ? https://bugzilla.redhat.com/show_bug.cgi?id=184684413:43
mgariepyTL;DR; placement seems to have some race conditions or similar on selecting host for vms. 13:44
jrosseris that the conclusion? last comment is about reserving more host memory13:47
mgariepyin my case it scheduled 3 240gb vm on a 512 gb host.13:48
jrosserewww13:48
mgariepynot just a small bit.13:48
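To put numbers on the case just described: three 240 GB instances request 720 GB, well above a 512 GB host. The sketch below is a simple capacity check under the usual (total - reserved) * allocation ratio rule; the function name and the default values for `reserved_gb` and `ram_allocation_ratio` are illustrative assumptions, not values from this deployment.

```python
def fits_on_host(vm_sizes_gb, host_ram_gb, reserved_gb=0, ram_allocation_ratio=1.0):
    """Return True if the summed VM memory fits within the host's usable capacity."""
    # Usable capacity after the host reservation, scaled by the overcommit ratio.
    usable_gb = (host_ram_gb - reserved_gb) * ram_allocation_ratio
    return sum(vm_sizes_gb) <= usable_gb


if __name__ == "__main__":
    # The reported case: 3 x 240 GB on a 512 GB host, no overcommit.
    print(fits_on_host([240, 240, 240], 512))
    # Two such instances with a hypothetical 16 GB host reservation.
    print(fits_on_host([240, 240], 512, reserved_gb=16))
```

With no overcommit the first check fails, so a scheduler honouring this rule should never have placed the third instance, which is consistent with the suspicion of a race rather than a sizing mistake.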
jrossersort of related we see very large memory requirements when starting vgpu instances13:48
mgariepyfor vgpu it does pre-allocate memory (at least on passthrough)13:49
jrosserway in excess of the allocation13:49
jrosserswap is totally needed, even if there's no overcommitment13:49
mgariepyhuh13:49
jrosserotherwise OOM killer13:49
noonedeadpunkwow13:49
jrosserbut the steady state once stuff is running is not using swap at all13:50
mgariepyin my case it was also causing oom kill :)13:50
noonedeadpunkI haven't seen that, but I haven't checked for it either13:50
jrosserwe found the hard way with 2 v.large GPU instances per server that one would regularly get OOM when the other booted13:50
mgariepyi wasn't looking at it until a customer reported that his vm was being killed .. :/ 13:51
mgariepyhow much do you leave for the host ?13:51
noonedeadpunkwe do smth like 16gb13:52
jrosserit ended up being really quite a lot, 50G maybe13:52
jrosserand that wasn't enough13:52
mgariepywow13:52
jrosserbut that's the weird thing, it was just transient at VM startup13:52
jrosserso eventually we made enough swap space on another disk (grrr zfs root)13:53
jrosserswap on ZFS will deadlock, don't do it :)13:53
mgariepyisn't zfs consuming a lot of memory too ?13:54
mgariepyi do use zfs on my controllers but not on the compute nodes haha13:54
jrosseryeah exactly, and it's catch 22 when it tries to swap on a zfs volume that it needs a bunch of memory to do that13:54
andrewbonneyThere were some nasty libvirt memory leaks too which needed recent versions to patch13:55
jrosserwe did some experiments with deliberately overcommitting and could lock the computes up reliably like that13:55
noonedeadpunknixbuilder: I think I have built an AIO for testing. What commands you were using to create an image?14:01
*** starkis is now known as Guest697214:02
nixbuildernoonedeadpunk: openstack image create cirros-0.6.1 --file cirros-0.6.1-x86_64-disk.img14:14
nixbuildernoonedeadpunk: This is on a brand new install of Antelope.14:14
noonedeadpunknixbuilder: are you sure you can do like this?14:42
noonedeadpunkAs I thought that cinder storage means you can not provide a `--file` argument to image creation14:42
noonedeadpunkas path to the file would be a cinder volume14:42
noonedeadpunkhttps://docs.openstack.org/cinder/latest/admin/volume-backed-image.html14:42
noonedeadpunkSo I'm not sure if `--file` is applicable in this case14:43
jrosserisn't that why privsep was trying to do a bunch of iscsi stuff14:44
jrosserto arrange a volume to be there to put the image into14:44
noonedeadpunkcould be actually...14:45
jrosserafaik this lets you have images in some cinder store then (if you configure it right) boot an instance from a snapshot, much like we can do with ceph14:47
nixbuilderThis works in our Production Pike cloud.... https://paste.opendev.org/show/bPL57mzdOT1RBrwIahRv/14:48
jrossergoogle says https://ilearnedhowto.wordpress.com/2020/05/06/how-to-install-cinder-in-openstack-rocky-and-make-it-work-with-glance/14:48
noonedeadpunkyeah, so things like `cinder_store_auth_address, cinder_store_user_name, cinder_store_password` should be added as an override as of today14:49
noonedeadpunkI don't see logic handling that in our glance role14:49
jrosserit's a shame there was not slightly more debug from privsep about exactly what command it tried14:50
jrossersimilarly we miss barbican<>glance integration currently as well14:51
noonedeadpunkyeah14:51
nixbuildernoonedeadpunk: jrosser: Reading the documentation that you sent over.14:52
noonedeadpunkWhile we actually have cinder in additional stores by default. huh14:53
jrossernixbuilder: i think what we need to do is check that everything is set up in the service config files as they should be14:54
jrosserit's quite possible that few people are using this integration and there is a gap in the way we set it up14:54
noonedeadpunkoh well.14:58
noonedeadpunkappears that our rootwrap filters are outdated14:58
noonedeadpunkat least this https://opendev.org/openstack/glance_store/src/branch/master/etc/glance/rootwrap.d/glance_cinder_store.filters looks completely different from https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/files/rootwrap.d/glance_cinder_store.filters15:00
noonedeadpunk#startmeeting openstack_ansible_meeting15:00
opendevmeetMeeting started Tue Nov 14 15:00:55 2023 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:00
noonedeadpunk#topic rollcall15:01
noonedeadpunko/15:01
noonedeadpunkI hope I didn't mess up with timezones15:01
NeilHanlono/ 15:02
NeilHanlonseems OK to me15:02
NeilHanlonbut I'm in a different one than usual 😂 15:02
noonedeadpunk#topic office hours15:07
noonedeadpunkwe had really good progress landing things during previous week15:07
noonedeadpunk2023.1 is still struggling from full_disk though on upgrade jobs15:08
noonedeadpunkI do hope that this backport could help with that15:08
noonedeadpunk#link https://review.opendev.org/c/openstack/openstack-ansible/+/90067015:08
opendevreviewMerged openstack/openstack-ansible-galera_server master: Fix ignored database directories configuration  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/90086015:09
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Ensure tempest include and exclude lists all use unique names  https://review.opendev.org/c/openstack/openstack-ansible/+/89396815:11
jrossero/15:11
noonedeadpunkour jobs now seem to be quite faster after all15:12
noonedeadpunklooks like we're back to 1:10 for metal jobs15:12
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server stable/2023.1: Fix ignored database directories configuration  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/90088815:14
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server stable/zed: Fix ignored database directories configuration  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/90088915:14
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server stable/yoga: Fix ignored database directories configuration  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/90089015:15
noonedeadpunkSo releasing15:15
noonedeadpunkI was waiting for https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/899045 to land to propose new releases for stable branches15:15
noonedeadpunkas you might have seen - Yoga is going to be dropped and re-created as "unmaintained/yoga"15:16
noonedeadpunkwhich means we need to wrap up all patches we want to land there15:16
noonedeadpunk#link https://review.opendev.org/q/parentproject:openstack/openstack-ansible+branch:%255Estable/yoga+status:open+15:16
noonedeadpunkdo we wanna put some work and try to fix these or should we just abandon them?15:17
noonedeadpunkNext, is what we have left for Bobcat branching15:21
jrosserhmm i have not managed to find time to work on stable branches15:21
jrossertends to involve a lot of making AIO and investigating15:21
noonedeadpunkYeah, while in general I like doing things like that, right now there's no time for that :(15:22
noonedeadpunkSo I guess I will go on and abandon backports that are not trivially broken15:22
noonedeadpunkSo, 2023.2. I think biggest thing, that was on plate requiring changes everywhere was osa/quorum_queues.15:23
noonedeadpunkIt's mostly landed now, except Zun, which is just broken for 2023.215:23
noonedeadpunkWhile I've managed to fix it for master, I dunno what to do with the backport: https://review.opendev.org/c/openstack/zun/+/90078515:23
noonedeadpunkAnother thing is openstack_resources role15:25
noonedeadpunkI've started making some adjustments and add magnum resources management. Hopefully will push some result soonish15:25
jrosserah i can test that in my cluster-api patch15:26
noonedeadpunkok, awesome15:26
jrosserwhen i can find some time i'm turning that into a collection15:26
noonedeadpunkI guess we can branch once these will land15:26
noonedeadpunkAbout skyline - haven't worked on that yet...15:27
noonedeadpunkAnd seems we have quite valid bug with cinder as a glance storage15:31
noonedeadpunkthat nixbuilder raised today15:31
jrosseri was wondering if that was a candidate to add to the os_cinder/os_glance CI jobs15:32
noonedeadpunkyeah, makes sense15:32
jrosseractually yes we run an nfs scenario there15:33
jrosserso a cinder one would be a good addition15:33
noonedeadpunkwe just need to fix that first somehow :)15:35
jrosserit's sometimes quite tricky to come up with the cross-role variables to control this stuff15:37
jrosseror to say that its actually a documentation issue and a specific set of overrides are required by the operator15:37
noonedeadpunkAnd I guess I know the reason15:37
jrosserthere is already a place to document this https://docs.openstack.org/openstack-ansible-os_glance/latest/configure-glance.html#configuring-default-and-additional-stores15:38
opendevreviewMerged openstack/openstack-ansible-os_aodh master: Add quorum support for service  https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/89569015:39
noonedeadpunkyeah, so the issue is in the "wrong" rootwrap, that doesn't include bindir from venv15:39
noonedeadpunkwhich was removed back then: https://opendev.org/openstack/openstack-ansible-os_glance/commit/9748e6b1543d225b19afe8fe2f93f5a7d66a69e4#diff-c99b9efb7da8932e4002fa34d4f7e5cf5d69d35115:41
noonedeadpunkanyway, let's investigate that after the meeting...15:42
opendevreviewMerged openstack/openstack-ansible-os_swift master: Add quorum queues support for service  https://review.opendev.org/c/openstack/openstack-ansible-os_swift/+/90063215:42
noonedeadpunkIf that's it15:46
noonedeadpunk#endmeeting15:46
opendevmeetMeeting ended Tue Nov 14 15:46:29 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:46
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-11-14-15.00.html15:46
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-11-14-15.00.txt15:46
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-11-14-15.00.log.html15:46
jrosserso - quorum queues job needed too perhaps?15:46
noonedeadpunkOh, that is good that you've mentioned that15:47
noonedeadpunkI clean forgot that we should disable them by default15:48
jrossercould combine that with an _infra_ job on the rabbitmq role15:48
jrosserthen we get a cluster + quorum at the same time15:48
noonedeadpunkwell, infra job only runs keystone15:48
noonedeadpunkwhich kinda... doesn't care much about rabbit15:48
noonedeadpunkI guess15:49
jrosserwell - i guess we can make anything we want if we add _quorum_ into the scenario parsing15:49
noonedeadpunkyeah, true15:49
jrosserwhich is some sort of step toward making it default15:49
noonedeadpunkWe have to make it default before rabbitmq 4 releases15:50
jrosseralso seems to be some progress here https://review.opendev.org/q/topic:bug-203149715:50
noonedeadpunkoh, new reviews are in16:01
noonedeadpunknot for most crucial parts though16:36
noonedeadpunkand that's still for 2024.116:36
jrosseryes - so from OSA perspective we should chase those patches or we are delayed by another cycle16:39
noonedeadpunknixbuilder: can I kindly ask you to submit a bug report?16:49
noonedeadpunkI'm working now on proposing a fix16:49
nixbuildernoonedeadpunk: Thank you... sure. Where do I submit the bug report?17:22
jrossernixbuilder: here https://launchpad.net/openstack-ansible17:23
nixbuilderjrosser: Thank you!17:24
nixbuilderNoonedeadpunk: jrosser: Bug report here...https://bugs.launchpad.net/openstack-ansible/+bug/204350317:45
noonedeadpunkawesome17:45
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add glance_bin to rootwrap defenition  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/90093017:50
noonedeadpunknixbuilder: can you check if that patch solves your issue? ^17:50
nixbuildernoonedeadpunk: I will start with a fresh install, patch the files and then let you know.17:55
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Remove glance_cinder_store filters override  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/90093117:55
noonedeadpunkIn my AIO it kinda worked. But I guess that's not all what's needed to make it fully functional...17:56
nixbuildernoonedeadpunk: OK... it will take a bit to verify.  I'll let you know ASAP.17:56
noonedeadpunkwe _really_ need to give some love to glance role as looking at it a bit wondering how it worked at all...18:17
noonedeadpunkand not sure how to fix in a good way....18:18
noonedeadpunkLooking now at `glance_available_stores` that should eventually be a list of mappings, but in fact is smth half-baked that would require quite some overrides if one dares to make it a mapping18:20
nixbuildernoonedeadpunk: Oh wow... didn't mean to create a huge rabbit hole :-)18:29
noonedeadpunknixbuilder: nah, I think it's a good thing actually...18:37
spatelfolks, did you ever upload 700GB image in glance ?18:45
mgariepywow.18:45
spatelI am getting timeout randomly when hit 50% upload18:46
spatelI am using command line 18:46
mgariepyno never did that.18:46
mgariepyhow comes it's so big ?18:46
spatelThis is someone's VM running on VMware and they want to migrate to openstack.. 18:47
spatelI convert image from vmdk to raw and now trying to upload 18:47
mgariepyi'm currently decommissioning an old cloud and downloading via glance is so buggy. i do export data  via rbd directly.18:48
spatelcan I import image directly in ceph?18:49
mgariepythat's a good question18:49
spatelFeels like haproxy getting timeout 18:52
mgariepyis glance store and forward or stream the data to ceph ?18:52
spatelI think glance is backed by ceph so it should stream to ceph directly18:53
spatelI wish we could upload glance image to ceph and tell glance to point there 18:54
spatelis this right doc for me? - https://xahteiwi.eu/resources/hints-and-kinks/importing-rbd-into-glance/18:55
spatellooks good to me.. 18:56
noonedeadpunkspatel: You can always trust Florian :D19:06
mgariepyyeah looks nice :D19:07
mgariepyspatel, if you run from a volume you might want to convert it to raw if it's in vmdk format first.19:08
mgariepyotherwise you will have the same issue.19:08
mgariepyjust in a later stage.19:08
spatelnoonedeadpunk  lol19:18
spatelimage is in raw format 19:18
opendevreviewMerged openstack/openstack-ansible-os_octavia stable/2023.1: Add security rule for octavia healthmanager  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/89904519:45
nixbuildernoonedeadpunk: OK... got the servers reloaded, source code files patched and now on to installation.20:08
opendevreviewMerged openstack/openstack-ansible-os_ceilometer master: Add quorum support for service  https://review.opendev.org/c/openstack/openstack-ansible-os_ceilometer/+/89569620:30
opendevreviewChristian Rohmann proposed openstack/openstack-ansible-rabbitmq_server stable/zed: Add ability to add custom configuration for RabbitMQ  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/90094120:34
logan-👋20:34
jrossero/ hello20:45
logan-y'all up and switched irc networks on me!21:12
logan-hope everyone is well, glad to make my way back here and see some familiar names :) 21:13
dmsimard[m]you did not witness the implosion of freenode? :D22:12
mgariepyhey welcome back logan- 22:29
opendevreviewJimmy McCrory proposed openstack/openstack-ansible-galera_server master: Include CA cert in client my.cnf  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/90026622:35
