Wednesday, 2024-01-24

opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_ironic master: Fix a typo in pxe_redfish definition  https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/90635308:39
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Extra PIP_OPTS in bootstrap_ansible script must be space separated  https://review.opendev.org/c/openstack/openstack-ansible/+/90647209:06
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/90519910:15
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Add SCENARIO_EXCLUDE environment variable to limit expansion of SCENARIO  https://review.opendev.org/c/openstack/openstack-ansible/+/90638810:17
jrosser^ i was thinking about this patch, and there are some different ways to do it10:18
jrossercurrently it makes a new env var, SCENARIO_EXCLUDE to remove things from the expanded scenario10:18
jrosserbut we could also do it in a single var with a new keyword to split on SCENARIO=aio_lxc_magnum_octavia_exclude_heat10:19
jrosserthat way no changes to gate_check_commit cli parameters would be needed10:19
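The single-variable idea jrosser describes can be sketched in a few lines. This is a hypothetical illustration of splitting the scenario string on an `_exclude_` keyword, not the actual gate-check-commit.sh logic; the token names are examples:

```python
# Hypothetical sketch: split SCENARIO on an "_exclude_" keyword and drop
# the excluded components from the expanded scenario token list.
def expand_scenario(scenario: str) -> list[str]:
    if "_exclude_" in scenario:
        included, excluded = scenario.split("_exclude_", 1)
    else:
        included, excluded = scenario, ""
    components = included.split("_")
    to_drop = set(excluded.split("_")) if excluded else set()
    return [c for c in components if c and c not in to_drop]

print(expand_scenario("aio_lxc_magnum_octavia_exclude_heat"))
# → ['aio', 'lxc', 'magnum', 'octavia']
```

A separate SCENARIO_EXCLUDE env var would do the same filtering, just sourced from a second variable instead of a keyword in the string.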
noonedeadpunkI'm really not sure about the `_exclude_` separator, frankly speaking10:43
noonedeadpunknor about the overall SCENARIO_EXCLUDE thing. Feels like it's better not to add things to the scenario than to remove them from it afterwards10:44
noonedeadpunkbut I see how the current approach is sometimes problematic10:44
jrosseryeah, i don't really like it either10:46
jrosseri was just looking at https://review.opendev.org/90561910:47
jrosserthat fails here https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/job-output.txt#1344410:48
jrosserbecause "Could not find a module for {{hostvars['aio1_repo_container-c0674b1d']['ansible_facts']['pkg_mgr']}}."}10:48
jrosseri was wondering if this actually does anything https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/job-output.txt#1205310:49
jrosserhttps://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/repo-install.yml#L16-L2010:50
jrosseri think i will test this locally, maybe if the tasks list is empty it's a valid optimisation in ansible not to gather facts10:50
jrosseroh well https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/gate-check-commit.sh#L21410:55
noonedeadpunkyeah, I guess we're gathering them here though: https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/gate-check-commit.sh#L18711:03
noonedeadpunkand then skip down the road11:03
jrosserthat doesnt gather for lxc containers though11:05
jrosserand this is only failing the lxc jobs11:05
noonedeadpunkwell, ok L21011:05
jrossertbh i thought that python_venv_build role had some specific handling for ensuring that facts are present for the repo host11:05
noonedeadpunkhandy that we store gathered facts in logs actually (or at least should)11:07
noonedeadpunkhttps://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/logs/ansible/facts-all.log.txt11:07
noonedeadpunkso that should not really be a problem...11:08
jrosserok, i'll make an AIO i think11:09
gokhan_hello folks, when upgrading from victoria to wallaby, I am getting rabbitmq crash errors. https://paste.openstack.org/show/blNvUeMG64dBChRchb92/ what could be the reason for this?11:16
noonedeadpunkgokhan_: really have no idea11:31
noonedeadpunkwould make sense to check rabbitmq and erlang versions on all containers11:31
noonedeadpunkAnd check they match: https://www.rabbitmq.com/which-erlang.html11:32
noonedeadpunkPotentially - upgrade versions of rabbit/erlang to supported ones11:32
gokhan_noonedeadpunk, yes you are right 11:35
gokhan_rabbitmq version is 3.11.3 and erlang version is 2611:36
gokhan_erlang 26 is supported with rabbitmq 3.12.011:37
noonedeadpunkgokhan_: yeah, so I assume that the versions of rabbit are just missing from the repos, so you're falling back to the system version11:42
noonedeadpunkgokhan_: we have variables `rabbitmq_package_version` and `rabbitmq_erlang_version_spec` to control these 2 things11:43
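The two variables noonedeadpunk mentions would typically go in user_variables.yml. A minimal sketch; the version values below are hypothetical placeholders, pick a matching pair from the rabbitmq/erlang compatibility matrix for your target release:

```yaml
# user_variables.yml - pin rabbitmq and erlang explicitly so the playbooks
# don't fall back to whatever the system repos happen to provide.
# NOTE: the values here are illustrative examples only.
rabbitmq_package_version: "3.12.0"
rabbitmq_erlang_version_spec: "1:26*"
```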
gokhan_thanks noonedeadpunk , I will check rabbitmq repos and resolve it. 11:45
noonedeadpunkyeah, basically that....11:46
noonedeadpunkgokhan_: otherwise, you can try using `rabbitmq_install_method: distro` which will potentially downgrade the cluster, but dunno how good that is given you're quite far from the distro-provided versions11:46
noonedeadpunkso that might be bad idea11:47
gokhan_noonedeadpunk, would it be a problem to use the same versions in xena?12:00
gokhan_sorry, xena is also a problem, there is no rabbitmq-server 3.9.28-1 version in the repo12:02
noonedeadpunkand you should keep an eye out not to actually downgrade12:06
noonedeadpunkas there might be flags in place that will prevent doing so (or better say startup with lower version)12:07
noonedeadpunkSo if you're upgrading to antelope - just take the antelope version right away12:07
noonedeadpunkand keep it through all upgrades12:07
noonedeadpunkand basically just skip running further rabbitmq upgrades12:07
noonedeadpunkthat might work :)12:07
gokhan_yes I am upgrading to antelope12:08
gokhan_I will continue with rabbitmq antelope version 12:09
jrosserso `Could not find a module for {{hostvars['aio1_repo_container-8475472c']['ansible_facts']['pkg_mgr']}}.`12:11
jrossermeans that this line did not actually template https://github.com/ansible/ansible/blame/stable-2.15/lib/ansible/plugins/action/package.py#L4912:11
jrosserand so the module name is not templated when it prints the error here https://github.com/ansible/ansible/blame/stable-2.15/lib/ansible/plugins/action/package.py#L6612:12
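The behaviour jrosser describes can be modelled with a toy example. This is not ansible code; it is a hypothetical sketch of how a templar that refuses to render "unsafe" values lets the raw `{{ ... }}` expression leak into the error message:

```python
# Toy model (not ansible) of the untemplated-module-name symptom: if the
# templar treats the input as unsafe and returns it as-is, the later
# "could not find a module" error interpolates the still-raw template string.
class ToyTemplar:
    def __init__(self, variables):
        self.variables = variables

    def template(self, value, *, unsafe=False):
        # Mimics the suspected 2.15.7 behaviour: unsafe values are
        # returned untemplated.
        if unsafe:
            return value
        key = value.strip("{} ")
        return self.variables.get(key, value)

templar = ToyTemplar({"pkg_mgr": "apt"})
module = templar.template("{{pkg_mgr}}", unsafe=True)
error = f"Could not find a module for {module}."
print(error)  # prints: Could not find a module for {{pkg_mgr}}.
```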
noonedeadpunkyou're looking at ansible-core 2.15.7 issue?12:21
jrosseryeah12:21
noonedeadpunktbh that kinda feels like a connection plugin thing12:22
noonedeadpunkas affects only lxc12:22
noonedeadpunkso potentially we fail to propagate facts somehow, dunno...12:22
noonedeadpunkor well12:22
noonedeadpunkmaybe not - we just don't delegate facts on metal12:22
jrosserwould you expect to get some templating exception then12:23
jrosserlike if it can't find hostvars['foo']['ansible_facts']['pkg_mgr']12:24
noonedeadpunkwell the error there doesn't make much sense to me at all12:24
noonedeadpunkI guess I'll just be looking for diffs between 2.15.5 and 2.15.712:24
noonedeadpunk(I guess that's what we're upgrading for?)12:25
noonedeadpunkthere should not be many of them...12:25
jrosserright, next thing to try12:25
noonedeadpunk`Nested templating may result in an inability for the conditional to be evaluated. See the porting guide for more information.`12:26
noonedeadpunkLike this one was introduced in 2.15.7: https://docs.ansible.com/ansible-core/2.15/porting_guides/porting_guide_core_2.15.html#playbook12:27
noonedeadpunkbut yeah, probably not directly related...12:27
noonedeadpunk```import_role`` reverts to previous behavior of exporting vars at compile time.` -> that's actually interesting change.....12:29
noonedeadpunkAs I recall, there was hassle specifically with python_venv_build regarding imports/includes messing up vars12:29
noonedeadpunkI wonder if 2.15.8 might be covering it....12:32
noonedeadpunkDoubt though12:33
jrosser2.15.6 works, 2.15.7 fails12:34
jrosserso it is in the diff between those12:34
noonedeadpunkthe main diff for 2.15.7 is CVE-2023-5764 12:46
noonedeadpunkand 2.15.8 looks like bugfix release for this as well12:47
jrossergit bisect says this is the commit that breaks it https://github.com/ansible/ansible/commit/fea130480d261ea5bf6fcd5cf19a348f1686ceb114:11
jrosserTIL that you can git clone ansible, run source ./hacking/env-setup and OSA automatically picks up the git version14:12
noonedeadpunkso... there were a couple of patches to fix the unsafe regression in 2.15.8.... So... I assume you've tested that it's still not working?14:16
jrosserthe original patch tries to update to 2.15.814:23
jrosserhttps://review.opendev.org/c/openstack/openstack-ansible/+/90561914:23
noonedeadpunkoh, ok14:24
noonedeadpunksorry14:24
spatelQuick question: if I take a snapshot, move it to a new openstack cloud and spin up an instance, can I delete that snapshot from glance once the instance is up?14:38
spatelI am using Ceph backend storage. 14:38
spatelMy problem is I am planning to migrate all vms and I don't want 1000s of snapshots sitting in the glance repo eating up space :(14:39
spatelGetting this error in logs - rbd.PermissionError: [errno 1] RBD permission error (error listing children.)14:57
spatelLooks like this is a parent/child relationship issue14:58
jrosserthis is OSA?14:58
spatelno 14:59
spatelThis should be some Ceph issue, right?14:59
spatelI found this article - https://stackoverflow.com/questions/47346402/permissions-for-glance-user-in-ceph14:59
jrosserwell whatever you use to deploy openstack needs to set the right permissions14:59
jrosseror rather openstack / ceph combination14:59
jrosserthere was a thread on the mailing list about this recently14:59
spatelTechnically you are allowed to delete a snapshot right after creating an instance from it, right?15:00
spatelDo you have link for that thread?15:00
jrossernot without going and checking the ML archive15:00
spatelThis is my glance caps: caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=images15:01
spatelLooks like my glance needs access to the volumes pool as well, otherwise how does it delete the snapshot image?15:02
jrosseryou can also check in the OSA vars how we set this up15:04
spatelHere are the full error logs - https://paste.opendev.org/show/bsHiWAdBpqIO4oZFFI57/15:05
noonedeadpunkthese caps should work I guess15:17
noonedeadpunkBut is it the glance user which is in use then?15:17
noonedeadpunkI dunno15:18
spatelafter giving permission - caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=images, allow rx pool=volumes 15:21
spatelthe error disappeared and now I'm getting this msg - Unable to delete image '3708f961-fb74-49f1-ab9b-40cf7954abed' because it is in use.15:21
spatelLooks like glance needs the `allow rx pool=volumes` permission to read the volume15:21
spatelnoonedeadpunk How do i delete snapshot then? 15:22
spatelI don't want 1000s of snapshots ending up in my glance repository :(15:22
noonedeadpunkI'm kinda slightly confused about what you're trying to do I guess15:22
noonedeadpunkLike snapshot first of all is protected when it appears in glance iirc15:23
noonedeadpunkso to be removed it should be unprotected first.15:23
noonedeadpunkBut then - why not remove it from glance in the first place?15:23
spatelits not protected - https://paste.opendev.org/show/bsHiWAdBpqIO4oZFFI57/15:23
spatelprotected                        | False                  15:23
spatelAm i missing something ?15:24
noonedeadpunkThat is actually a great question, as all the images I have in glance are protected15:25
spatelI have two openstack and trying to migrate instances from A to B 15:27
spatelI'm taking a snapshot from A and moving it to B15:27
spatelSpin up instance and now want to delete snapshot to cleanup space.. 15:28
spatelIf I can't delete the snapshot then I will have 1000s of them in glance, which is crazy15:28
hamburglerif a volume is built on top of a snapshot, don't think you can delete the snapshot?15:32
spatelHmm! 15:33
spatelYou are saying I should spin up the instance using it as an image instead of boot-from-volume?15:34
hamburgleryou could do either, but if the volume is created from a snapshot, I don't think you can delete the snapshot afterwards while the volume is in use15:35
spatelThis is very interesting point... 15:37
hamburglermaybe I don't have all the details, is this a completely separate cloud you're moving a volume to?15:37
spatelI am taking snapshots in glance and moving them to the new cloud15:37
spatelimporting that snapshot into glance and just spinning up an instance15:38
hamburglerand spinning the instance up works yes? - you will not be able to delete the 'image/volume' that the instance is based on 15:41
spatelI have created an instance and it works..15:41
hamburglerlike for example if you have instances built off images, in image pool, you cannot delete the base image if volumes are built on it15:41
hamburglerat least probably not safely i believe lol15:41
spatelHow do I check whether this image is tied to that volume?15:43
spatelif I want to run a script to find out who is using this snapshot to create an instance15:43
hamburglercan do it from ceph cli15:43
hamburglerneed to check my notes :) can't remember 15:44
spatelHmm I am trying to google that :)15:48
jrossernoonedeadpunk: https://github.com/ansible/ansible/blob/stable-2.15/lib/ansible/template/__init__.py#L744-L74615:50
jrosserwild behaviour there, just returning the thing you wanted templated, but not templated15:51
hamburglerspatel: you can use rbd ls poolnamehere15:51
jrosserwith apparently no error or warning 15:51
hamburglerthen rbd info poolname/volumename15:51
noonedeadpunkIIRC, they treat "unsafe" exactly the same as "raw" in jinja15:51
hamburglerand you will see parent image in output:15:52
noonedeadpunkand I can recall a suggestion in the docs to use unsafe/raw as the same thing15:52
hamburglerthat is just for volume-on-image, but I think you can see created snapshots another way too15:52
jrossernoonedeadpunk: though i am trying really hard to reproduce with a simple playbook delegating package task from one host to another and it pretty much just works15:52
hamburglerspatel: https://paste.openstack.org/show/bSGo4DajpV9u4luVkEvm/15:53
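The `rbd info` lookup hamburgler describes can also be post-processed in JSON form. A sketch under assumptions: the function name and the sample data are made up, and the exact parent field layout of `rbd info --format json` can vary between ceph versions:

```python
import json

# Sketch: `rbd info --format json pool/volume` includes a "parent" object
# when the volume is a clone of a glance image. Sample JSON is illustrative,
# not real cluster output; field names assumed from common ceph releases.
def parent_of(rbd_info_json: str):
    info = json.loads(rbd_info_json)
    parent = info.get("parent")
    if not parent:
        return None  # flat image, no COW parent
    return f"{parent['pool']}/{parent['image']}@{parent['snapshot']}"

sample = json.dumps({
    "name": "volume-1234",
    "parent": {"pool": "images", "image": "fake-glance-image", "snapshot": "snap"},
})
print(parent_of(sample))  # → images/fake-glance-image@snap
```

Looping this over `rbd ls` output would give the "who is using this snapshot" script spatel asked about.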
spatelhamburgler I can see image is parent - https://paste.opendev.org/show/bu3c7lqf6Ue5tUqn7d6X/15:53
hamburgleryeah i think if you were to do a test and try to rm the image, it wouldn't allow it, it would say there are volumes built on it or something, but obviously only try with test images/volumes :D 15:54
hamburglerbeen awhile since I looked at that15:54
noonedeadpunkjrosser: would the thing work if we drop our connection plugin and just ssh to the container normally?15:55
jrossergood question15:55
noonedeadpunklike any other host rather than containers15:55
jrosseri have a meeting now but will try that15:55
spatelYou are saying to use the rbd command to delete it from Ceph15:55
hamburglermmm, only with a test image, just to see behaviour15:56
jrosseryou are moving the snapshot to a different ceph though?15:57
* jrosser very confusing15:57
hamburglerI think he is wanting to delete the snapshot that was moved to a new openstack cluster, after an instance is spun up15:57
hamburglerbut I think that can't be done if an instance is now running on top of that snapshot15:58
jrosserspatel: you have two openstack? and two ceph? or one?15:58
spatelYes... 16:06
spateltwo openstack two ceph 16:06
spatelThis is the latest and greatest openstack and I'm trying to move vms from old to new..16:07
spatelI wish I could directly move files from ceph to ceph and boot vms instead of doing this snapshot ping-pong :(16:07
hamburglernot sure if there is a much better way to do it :(16:11
spatelHmm! This is crazy :O16:11
hamburglerwould be nice to import directly to ceph, but then the entries are not in the openstack database16:15
jrosserit might be possible to rbd export / import then do a bunch of database manipulation16:15
jrosserbut thats a gigantic hack16:16
spatelhaha :O16:16
spatelThis is good read - https://docs.ceph.com/en/quincy/rbd/rbd-snapshot/#layering16:17
spatellooks like ceph just does COW and keeps the image as a base image16:17
spatelFLATTENING A CLONED IMAGE.. does anyone know what it's trying to say?16:22
jrosserif you take a snapshot it's like a no-op, just points to the original data16:24
jrosserbut if you were to want to "detach" that from the original data then you need to fully resolve all the copy-on-write stuff into completely new "flat" data16:24
spatelhmm I think I need to flatten the image16:29
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/90519916:32
spatelVery interesting - https://blueprints.launchpad.net/cinder/+spec/flatten-volume-from-image16:34
spatelWhat if I use a QCOW2 image to import as a snapshot?16:35
spatelLet me ask this question in mailing list and see what other think about this16:37
spatelI did an experiment with a QCOW2 image and it doesn't have a parent reference16:45
mgariepyif you store qcow2 images they need to be converted for volumes16:46
spatelmgariepy Yes I know, they need time to download, convert and upload back16:55
spatelbut it allows me to delete the snapshot16:55
mgariepyyou can enable image caching also.16:55
jrossernoonedeadpunk: i completely reproduced it outside OSA16:55
jrosserhttps://paste.opendev.org/show/bdr2qyCGXujPbCQj9x5M/16:56
mgariepywhich will do it once and then all the others will use the cached image.16:56
noonedeadpunkjrosser: looks like a veeeeery reportable bug16:56
jrosserit looks like the use of ansible_facts[...] to index into my_other_hosts makes the value be AnsibleUnsafeText16:57
jrosserif i make that a regular var that i set with -e then it works16:57
opendevreviewMartin Oravec proposed openstack/openstack-ansible stable/zed: Keepalived WIP missing in proxy-protocol-networks mysql configuration.  https://review.opendev.org/c/openstack/openstack-ansible/+/90644716:58
noonedeadpunkoh16:58
jrossereven better example https://paste.opendev.org/show/btNogSfVyKmguabmOh6n/17:01
jrosserandrewbonney: is that patch related to what you were looking at this week? https://review.opendev.org/c/openstack/openstack-ansible/+/90644717:03
jrosserthis has familiar sound to it17:03
jrosserspatel: did you miss a step in the description of what you do? don't you have to create an image from the snapshot in order to download it?17:05
noonedeadpunkI really wonder if it's "as designed" or not....17:09
noonedeadpunkas if it is, it's getting really weird17:09
noonedeadpunkregarding 906447 - I'm slightly confused why it's needed. Like if you have the exact same issue for a different reason - that would be interesting17:10
jrosserwe have the galera role used outside OSA and i know andrew has a ticket in our system to figure out what is wrong with the proxy protocol setup17:19
noonedeadpunkah, ok17:24
hamburglerIt looks like during a fresh install of Bobcat, or an upgrade switching from mirrored to quorum queues, in os-cinder-install the backup service tries to start and ends up failing, because the new vhost doesn't actually get created until later on when cinder-api gets called; think this may need to be re-ordered?19:07
noonedeadpunkhamburgler: yeah, that can be a really valid issue19:22
noonedeadpunkjrosser: not sure if around... Do you have any idea on how to enable this? https://docs.openstack.org/keystone/latest/api/keystone.api.s3tokens.html19:23
noonedeadpunkLike I see plenty of 404s towards /v3/s3tokens and that's annoying.19:23
noonedeadpunkAt this point I'm about to think it should be done through api-paste19:23
noonedeadpunkbut I don't see any19:24
noonedeadpunkdisregard, I guess I'm stupid19:25
spateljrosser noonedeadpunk hamburgler I found the solution :) it's ceph flatten - https://paste.opendev.org/show/bL03lJDvEomnxVHkZebP/20:35
spatelas soon as I flattened the image it let me delete the snapshot20:35
hamburglerawesome! :) glad you found a solution20:36
spatelOn the mailing list someone said it has been addressed in the new bobcat release - https://docs.openstack.org/releasenotes/glance_store/2023.2.html#relnotes-4-6-0-stable-2023-220:37
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-plugins master: Add openstack_resources role skeleton  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/87879421:09

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!