Wednesday, 2023-01-18

opendevreviewMerged openstack/openstack-ansible master: Add Octavia OVN Provider repo requirements  https://review.opendev.org/c/openstack/openstack-ansible/+/87083400:19
opendevreviewMerged openstack/openstack-ansible stable/zed: Add Glance tempest plugin repo to testing SHA pins list  https://review.opendev.org/c/openstack/openstack-ansible/+/87077700:26
opendevreviewchandan kumar proposed openstack/openstack-ansible-os_tempest master: [DNM] whitelist neutron pkg debug  https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/87088407:13
opendevreviewMerged openstack/openstack-ansible-os_nova master: Support configuration of resource providers with config files  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87055909:32
opendevreviewMerged openstack/ansible-role-pki master: Update variables gathering to use vars/varnames lookups  https://review.opendev.org/c/openstack/ansible-role-pki/+/86966409:54
opendevreviewMerged openstack/openstack-ansible-plugins master: Add variable to control no_log in mq_setup role  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/86960210:08
opendevreviewMerged openstack/openstack-ansible-plugins master: Add variable to control no_log in service_setup role  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/86960410:15
opendevreviewMerged openstack/openstack-ansible-plugins master: Fix no_log variable templating in db_setup role.  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/87084310:15
opendevreviewMerged openstack/openstack-ansible-galera_server master: Remove "warn" parameter from command module  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/86965610:19
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Disable dhcp-agent and metadata-agent for OVN  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87085410:30
opendevreviewMerged openstack/openstack-ansible master: Add tempest and tempest plugins to required jobs for source deploys  https://review.opendev.org/c/openstack/openstack-ansible/+/87083910:50
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Allow git servers for openstack services and tempest to be overridden  https://review.opendev.org/c/openstack/openstack-ansible/+/86974811:17
*** dviroel|afk is now known as dviroel11:19
*** dviroel|afk is now known as dviroel11:54
admin1hi all .. if I already have a server as network node. and then removed it from network hosts, how can I make it such that it does not add it back as network node again during redeploy12:54
admin1i have one node which i also removed from inventory, but everytime i run the playbook, it adds it back12:54
noonedeadpunkadmin1: so you need to remove from inventory and from openstack-user-config (or conf.d) as well12:56
jrosseradmin1: how did you remove it from the inventory?12:57
admin1manually by hand 12:57
admin1also if its possible to uninstall ( neutron ) related stuff cleanly from the host 12:58
jrosseri mean did you use the inventory_manage script, or did you just edit openstack_user_config?12:59
admin1i could not use the script ..   but did a manual config . because i wanted to remove the l3 agents and dhcp agents from controllers 13:01
admin1network_hosts  -> *infrastructure_hosts  =>>  *compute_hosts13:02
admin1agents were in infra before , now they run in compute 13:03
admin1and want to clean infra hosts of all neutron agents not being deployed there again13:03
jrosseryou know that just editing openstack_user_config is not removing it from the inventory13:04
mgariepyanother point is if it were deployed in lxc container you could just delete the container and be done with the cleaning. 13:06
mgariepyi don't think we do have anything that clean services from a host.13:06
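An editorial sketch of the point made above: the generated inventory is a separate file, so a host stays in it until it is removed there as well as from openstack_user_config. The snippet assumes the common Ansible dynamic-inventory layout (groups holding a `hosts` list, plus `_meta.hostvars`). The function name `remove_host` and the sample data are hypothetical, and this is not the logic of the real inventory management script.

```python
import copy


def remove_host(inventory, hostname):
    """Return a copy of `inventory` with `hostname` removed from every group."""
    result = copy.deepcopy(inventory)
    for name, group in result.items():
        if name == "_meta":
            continue
        hosts = group.get("hosts", [])
        if hostname in hosts:
            hosts.remove(hostname)
    # Drop the per-host variables as well, if present.
    result.get("_meta", {}).get("hostvars", {}).pop(hostname, None)
    return result


# Hypothetical sample data, not taken from a real deployment.
sample = {
    "_meta": {"hostvars": {"infra1": {}, "compute1": {}}},
    "network_hosts": {"hosts": ["infra1", "compute1"]},
    "compute_hosts": {"hosts": ["compute1"]},
}
cleaned = remove_host(sample, "infra1")
print(cleaned["network_hosts"]["hosts"])
```

Editing only the user config would correspond to changing the input of the generator while leaving a stale entry like `infra1` in the stored result.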
mgariepynoonedeadpunk, added a comment on https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87085413:44
noonedeadpunkah damn, I was thinking about that but forgot to include in patch13:53
jamesdentonFWIW, DHCP Agent may still be needed for OVN+Ironic in certain scenarios13:55
jamesdentoncan prob cross that bridge later13:55
mgariepyi didn't see your update.. and i was looking where the hell that task was coming from lol13:55
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Disable dhcp-agent and metadata-agent for OVN  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87085413:58
opendevreviewJames Denton proposed openstack/openstack-ansible-os_ironic master: Create local log path for Ironic Python Agent  https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/87096114:18
jrosserjamesdenton: i had something along those lines here https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/86700314:20
jrosserbut not configurable14:20
jamesdentonoh, ok. i missed that since i'm on Yoga in this env14:20
jrosserso that is a very good question about ironic + dhcp + ovn14:20
jamesdentoni'll scratch mine!14:20
jrosseri assume that the ovn dhcp thing really only has concern for things on chassis it runs on?14:21
jamesdentoni believe it did, but has been extended to support a new baremetal port type14:23
jamesdentonhere's one related patch14:23
jamesdentonhttps://bugs.launchpad.net/neutron/+bug/197143114:23
jamesdentonhttps://review.opendev.org/c/openstack/neutron/+/84031614:23
jamesdentonhttps://review.opendev.org/c/openstack/neutron/+/84028714:23
jamesdentonlooks like the port gets scheduled to one of the controllers (ovn) which is then responsible for responding14:24
jamesdentonjrosser have you built a disk image lately for ironic? and can you share your command?14:28
jrosserit's in my docs patch :)14:28
jamesdentonwell then. lemme go looking14:29
jamesdentonand reviewing :(14:29
jrosseri gave an example for aarch64 https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/867547/11/doc/source/configure-ironic-multiarch.rst14:30
jrosserbut actually thats totally portable14:30
jamesdentonsure14:30
jrosseroh and thats efiboot as well, as you'd need for arm14:31
jrosserthere is also a not-whole-disk image example in https://github.com/openstack/openstack-ansible-os_ironic/blob/master/doc/source/configure-ironic.rst14:32
jrosserbut i did get in a mess with separate initrd/root images and stuck with whole-disk in the end14:33
jrosserthough it was a mess of my own creating, it was just more confusing14:34
jamesdentonright, ok.14:38
jamesdentonwhole disk is what i've been doing, but running into this issue: Installing GRUB2 boot loader to device /dev/sda failed with Unexpected error while running command.14:39
jrosserwell, getting the IPA log (and making sure it is in debug) was spectacularly useful14:39
jamesdentoni feel like i've run into that before and don't recall the fix14:39
jrosserkernel_append_params14:40
jrosserhttps://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/867547/11/doc/source/configure-ironic-debugging.rst#6814:40
jrosserthats where to enable debugging even if you don't want the login thing14:40
jamesdentonthanks14:41
jrosserit was also super useful to change the node state to maintenance=on and the ironic state machine will freeze14:41
jrosserotherwise it gets in some bad state and gives up / powers off and you lose the ability to debug14:41
jamesdentonthanks, yeah maint mode trips me up sometimes14:53
jamesdentonme: "why does this server keep powering down"14:53
mgariepyjrosser, root:neutron for ssl cert won't work. it's 600 for the key so group neutron won't help for reading the file.14:53
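An editorial sketch of the permission point above: with mode 600 only the owner has any access, so changing the group to neutron does not let that group read the key. The helper name `group_can_read` is hypothetical; it only inspects the group-read bit of a mode value.

```python
import stat


def group_can_read(mode):
    """Return True if the group-read bit is set in a file mode."""
    return bool(mode & stat.S_IRGRP)


print(group_can_read(0o600))  # False: owner read/write only
print(group_can_read(0o640))  # True: group may read
```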
*** dviroel is now known as dviroel|lunch15:32
*** dviroel|lunch is now known as dviroel16:23
jamesdentonjrosser user error; i had kernel and ramdisk assigned to my whole disk image :|16:32
jrosserjamesdenton: I think I managed to do the opposite16:32
jrosserset the image as a whole disk one but make whatever setting it is that has ironic think it’s kernel/ramdisk16:33
jamesdentoni have kernel and ramdisk assigned to the baremetal node for booting, then had made a whole disk image and mistakenly applied to the image, too. IPA did not want to write the bootloader. 16:34
jrosserjamesdenton: when i got stuck with this i asked in #ironic and this ended up in the docs https://docs.openstack.org/ironic/latest/admin/troubleshooting.html#why-do-i-have-an-error-that-an-nvme-partition-is-not-a-block-device16:54
jamesdentonoh interesting, i read that, too16:55
jamesdentonthings are working great now that i adjusted the image. just a dummy move16:56
jrosserit was suuuuper unclear what was going on16:56
jrosserand in my case `img_type` was set when it shouldnt have been16:56
jamesdentoni didnt set that, either16:56
jrosserthere are many ways to shoot the foot :)16:56
jrossernoonedeadpunk: we can probably do like this patch for OSA https://review.opendev.org/c/openstack/ansible-collections-openstack/+/86720217:02
noonedeadpunkoh, yes, would make sense17:03
noonedeadpunkBut I don't think we're seeing too much galaxy-related issues lately?17:04
noonedeadpunkIt's surprisingly stable lately...17:04
*** gmann is now known as gmann_afk17:29
noonedeadpunkopenssh_keypair module is sooooooooo weird.....17:36
noonedeadpunkchmod makes it fail with "Unable to read the key. The key is protected with a passphrase or broken. Will not proceed."17:36
noonedeadpunkmore hilarious, if set mode to 0644 for module itself it also crashes on second run17:37
noonedeadpunkhttps://paste.openstack.org/show/bqBh6uvEUgffLhsazBR6/17:37
*** gmann_afk is now known as gmann17:41
*** gmann is now known as gmann_afk18:06
mgariepylol18:10
* noonedeadpunk submitted bug report to collection18:13
mgariepyclose, works as design ?18:15
mgariepy:P18:15
mgariepyjrosser, have you seen the graph on this page? https://docs.openstack.org/networking-ovn/latest/admin/refarch/refarch.html18:18
mgariepyit does explains most of the inner working of ovn deploy.18:19
mgariepyhow they are connected i mean and what part is responsible for doing what.18:20
jrosserthat is very useful18:21
mgariepyjust missing the dhcp ;p haha18:22
jrosserI was wondering if it was yet possible to have multiple active gateway nodes18:24
jrosserthere was some priority iirc rather than sharing18:25
jrosserdiagram also missing bgp agent for ipv618:26
noonedeadpunkfwiw networking-ovn was last updated in 2018 and was completely deprecated in April 202018:27
noonedeadpunkSo no idea how this is relevant to be frank18:27
jamesdentonwell, i think it was just rolled into neutron18:27
noonedeadpunkYes, but also couple of changes landed to neutron as well regarding ovn18:27
mgariepywith the diag you see where each service takes it's info and where it runs18:28
jamesdentonhttps://docs.openstack.org/neutron/latest/admin/ovn/refarch/refarch.html18:28
noonedeadpunkJust saying that docs might be partially obsolete18:28
jamesdentonfixed.18:28
noonedeadpunkyeah. well, it's almost same18:30
noonedeadpunkso, I've missed - do we need neutron-dhcp-agent for ovn ?:)18:31
noonedeadpunkor https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/870854 is good to go?18:31
mgariepywe don't need it for the vms.18:32
mgariepythe dhcp is handled by ovn directly18:32
noonedeadpunkwhat I don't like a lot, that setup_ovs_ovn.yml is not idempotent and raises plenty of changed each time....18:38
noonedeadpunkbut we can iterate on that18:38
jrosserwhat is the branch name going to be for AA?18:52
*** gmann_afk is now known as gmann18:56
noonedeadpunk2023.119:13
noonedeadpunk* stable/2023.1 I assume19:13
noonedeadpunkbtw, that's the bug report with answer to my issue https://github.com/ansible-collections/community.crypto/issues/56419:13
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Use cryptography backend for openssh_keypair  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/87099719:27
admin1has anyone experienced playbooks hanging/failing on os_keystone : Add service project19:31
noonedeadpunkwell, saw that in AIO until we increased threads19:32
noonedeadpunknever outside of aio19:33
admin1i am using a lab env to test something ..  its a 12 threads, 32gb  ..      19:33
admin1load is 0.38 19:33
admin1its the controller itself 19:33
admin1not an AIO 19:33
noonedeadpunkthreads in terms of apache/uwsgi19:34
admin1oh 19:34
noonedeadpunkas mpm was prohibiting to spawn more19:34
admin1curl http(s) endpoint:5000 replies fine 19:34
noonedeadpunkwell, I think it all depends how parallel connections it can handle19:35
admin1how to increase the threads ? just to rule it out 19:35
noonedeadpunkwe did smth like that https://opendev.org/openstack/openstack-ansible/commit/078c82b03456d46641a3ec05e3d14bd3ac6d1cd519:35
noonedeadpunkBut I assume that default values should be fine19:35
jrosseryou could try similar from the utility container with the cli19:35
jrosserbecause (sort of) that’s what the ansible is doing19:36
jrossermaybe also creating the Keystone Service project is the first use of admin creds? there might be something wrong there19:37
admin1changed the connections ..  retrying 19:40
admin1set the thread count to 219:40
admin1https://paste.openstack.org/show/bkZIDUbbPIPtcrwMEw2Y/ -  i see gateway timeout .. but no idea what it means19:47
admin1curl http://172.29.236.11:5000/ from util container responds just fine 19:49
noonedeadpunkwell, that might be aligned with amount of processes/threads19:56
noonedeadpunkas curl is just single request, but multiple at the same time might cause issues. But I'm not sure that's it19:56
noonedeadpunkAs default is calculated based on cpu cores iirc19:56
noonedeadpunkso that's more than enough19:56
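An editorial sketch of the difference described above between a single curl and several requests at once. It fires a number of GET requests in parallel and tallies the status codes, so a backend that answers one request but returns 504 under concurrency would show up in the summary. The function names (`fetch_status`, `probe_concurrently`, `summarize`) are hypothetical, and the `fetch` parameter exists only so the logic can be exercised without a network.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET on `url`.

    Connection failures and timeouts (URLError, OSError) are not caught
    here and will propagate to the caller.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # Error responses such as 504 still carry a status code.
        return err.code


def probe_concurrently(url, count, fetch=fetch_status):
    """Issue `count` requests at the same time and return their status codes."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(fetch, [url] * count))


def summarize(codes):
    """Count how many times each status code occurred."""
    counts = {}
    for code in codes:
        counts[code] = counts.get(code, 0) + 1
    return counts
```

From a utility container this could be called as `summarize(probe_concurrently("http://172.29.236.11:5000/", 20))`, where the request count of 20 is an arbitrary choice.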
mgariepyis the openstack client works ?19:57
admin1well, its the first playbook in a new install 19:57
noonedeadpunk`keystoneauth1.exceptions.http.GatewayTimeout: Gateway Timeout (HTTP 504)` hm19:58
admin1stuck at Making authentication request to http://172.29.236.11:5000/v3/auth/tokens  => Resetting dropped connection: 172.29.236.1119:58
noonedeadpunkwhat's in keystone / apache logs?19:58
noonedeadpunkAs that might be mysql connection issue actually19:58
mgariepyare all relevant endpoints up in haproxy  ?19:59
jrosserhttp?20:00
jrosserdont you get 504 when a http thing tries to talk to an https endpoint?20:01
jrosserwell anyway - i thought we default the internal vip to https20:02
admin1https:// is only on the public side20:02
admin1internally its all http only 20:02
admin1the openrc in the util has export OS_AUTH_URL=http://172.29.236.11:5000/v320:03
admin1all endpoints are up 20:03
admin1i am going to set keystone_wsgi_threads: and keystone_wsgi_processes:  to 12 ( number of cpu threads ) in user_variables, delete the keystone and repo container and retry again 20:04
admin1keystone service backend shows green in haproxy status gui 20:05
admin1and curl https:// on public and http on private endpoint on 5000 both give the reply 20:05
jrosseris there no_log on the task?20:05
admin1will enable log as well 20:06
admin1i have to test moving one of the cluster running on 24.2.4 from ceph running with ceph-ansible to  ceph with cephadm, so was trying to recreate the same setup 20:07
admin1i think no_log is there by default 20:07
admin1the only change i have made is the rabbitmq version change 20:07
jrosseryou will have to comment out the no_log20:08
jrosserbut i guess the auth parameters are all in openrc / clouds.yml anyway20:08
admin1yeah .. 20:09
admin1git checkout the tag,  ran setup hosts and infra .. validated all endpoints are up, mysql working from utility, utility can reach the endpoints .. then ran os-keystone setup  and got stuck 20:09
jrosseri think i might be checking that keystone service logs20:10
admin1i already nuked the keystone container .and creating it again 20:10
jrosser504 means that haproxy waited too long for the thing behind it to reply20:10
admin1this time, thread count is 12 20:10
admin1i did not checked what it was by default though20:11
admin1as i had never been stuck in this state before20:11
mgariepyjamesdenton, https://pasteboard.co/IYHYnNjokewN.png20:11
mgariepywhat do you think about that ?20:11
jamesdentonbig improvement20:11
mgariepyi can change the network connection to match the other png and add it to your patch.20:12
*** prometheanfire is now known as Guest176920:12
mgariepydoes it reflect the correctly how it's wired up you think ?20:13
mgariepyit sure depend on a bunch of other stuff but.20:13
mgariepyany png paste site you guys prefer ? i don't paste png often.20:15
jamesdentonit looks good, except br-tun isn't really connected directly to a vlan, while br-ex is. So i'm not sure how to reflect that20:19
mgariepyyeah indeed.20:22
jamesdentonbut i really dig the changes, thank you20:22
jamesdentonand i've never used a png share, so this looks fine20:22
mgariepyit's not a real network it's just a bunch of flows in ovs lol20:22
admin1ps ax | grep uwsgi | grep -v grep    | wc -l    => 14 threads running 20:22
admin1inside the keystone container20:23
mgariepyi'll redraw it like this one : https://review.opendev.org/c/openstack/openstack-ansible/+/867577/5/doc/source/reference/figures/networking-openvswitch-cn.drawio.png20:26
admin1what you see is  (  journalctl -f ) ..  https://paste.openstack.org/show/bBzQwlnQaOUBlQgnklg1/   that the container suddenly restarts during os_keystone : Add service project20:34
jrosseradmin1: the threads thing is a red herring, it was specific to CI jobs where we had deliberately turned the number right down, too much so it turned out20:34
jamesdentonmgariepy sure, that could work. we might even add comments to the page20:35
admin1it happened again 20:38
jamesdentonWell, gonna give this Yoga->Zed upgrade with OVN a go. Wish me luck20:47
admin1best of luck .. 20:47
admin1yoga(ovs) -> zed (ovn ) ? 20:48
jamesdentonovn both ways20:48
mgariepytell us how it goes.20:48
jamesdentonwill do!20:48
jrosserjamesdenton: stable/zed roles still point to the point we branched20:50
jrosserin ansible-role-requirements20:50
spatelnoonedeadpunk what do you use for snapshot of images, i meant regularly or schedulers? 20:50
jamesdentonok, so some backports may be missing?20:51
spatelI am looking at Freezer but not sure its best solution 20:51
jrosserjamesdenton: you won't get anything merged to roles since the release20:51
jamesdentongood point, maybe i'll wait. 20:51
mgariepymaybe wait the next point rel.20:52
mgariepylol20:52
jrosseri think we are super close to making 26.1.020:52
jrosser+/- ovn stuff :)20:52
jamesdentonthen i'll wait for that, thanks for the heads up20:52
noonedeadpunkspatel: we have in-house solution for that... But eventually it's freezer (which is barely alive) or mistral that can be leveraged for automation of snapshots20:53
jrosseri never understood why cinder-backup does not have any concept of schedule20:54
spatelThanks for heads up on freezer :)20:54
noonedeadpunkyep, I'm quite eager to release 26.1.0 but each time I'm about to push bump, we find a bug20:54
spatelI will look into mistral 20:54
mnaserjrosser: https://github.com/vexxhost/staffeln20:57
jamesdentona wild mnaser appears20:57
mnaseri'm always lurkin20:57
jamesdentongood to see you20:57
mnaserbut staffeln is an open source cinder-volume scheduler for lack of a better word20:57
mnaserjamesdenton: ty, you too, hope to see you in vancouver as well20:58
jamesdentonhope to be there!20:58
mnaserhaving network fun?20:58
spatelovn fun :)20:58
jrossermnaser: that is interesting - thanks21:00
spatelso it works with cinder-backup api 21:00
jrosserwith something automatic like that i would properly build something on nfs to sit next to ceph21:01
jrosseri have a small thing like that for cinder-backup now but hardly anyone uses it because its really not user friendly21:02
jrosserbut if it were scheduled/automatic thats a whole different game21:02
spatelWish there is a doc or example to set it up 21:05
noonedeadpunkso basically most important part from the dead freezer21:06
noonedeadpunkwhich is cool actually21:07
*** dviroel is now known as dviroel|afk21:09
jrosserspatel: we make a role for it? :)21:10
spatelhaha! indeed,  first lets give it a shot and see if its going to fulfill our need 21:11
spatelI will give it a try and see.. I am exploring solution for automatic backup of customer vm (i wish horizon has a option to schedule backup or similar) 21:12
spatelDid you ever see this error in mysql - https://paste.opendev.org/show/bd6nfhdsFRfMbfol7BUK/21:13
spatelThis is my very old openstack running on Queens :D21:13
jrosserhave you followed the "maintenance tasks" in the docs to check the cluster status21:14
spatelEvery few days getting OOM error on mysql 21:14
spatelCluster is healthy (synced + showing 3 nodes) 21:14
spatelGoogle saying ignore that error but i think this is something serious. because only that node getting OOM. other two running fine. 21:15
spatelserver load is ok.21:16
spatelThis is how it started - https://paste.opendev.org/show/bBEwJJiXiVHUIEILsi0y/21:16
spatelDid anyone try ChatGPT ?21:21
opendevreviewMarc Gariépy proposed openstack/openstack-ansible master: Update documentation for LXC/metal and LXB/OVS/OVN  https://review.opendev.org/c/openstack/openstack-ansible/+/86757721:23
opendevreviewMarc Gariépy proposed openstack/openstack-ansible master: Update documentation for LXC/metal and LXB/OVS/OVN  https://review.opendev.org/c/openstack/openstack-ansible/+/86757721:24
*** Guest1769 is now known as prometheanfire21:24
mgariepysomething like that that need a bit of your touch jamesdenton  ;) 21:24
jamesdentonlol21:25
jamesdentoni will take a look later today, thank you for doing that21:25
mgariepyyou are welcome.21:25
mgariepyi just want to document it a bit for internal and internet uses ;D21:25
mgariepymaybe i try to put too much in the drawing..21:26
mgariepyhave a nice evening21:26
jamesdentonwell, i think more detail is better here21:26
jamesdentonyou too!21:26
mgariepyyeah more detail but the issue is that we might need to split it on a couple of drawing21:27
jamesdentonagreed21:28
opendevreviewMerged openstack/openstack-ansible-os_neutron master: Disable dhcp-agent and metadata-agent for OVN  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87085422:07
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Add facility to rewrite source URLs for ansible collections during bootstrap  https://review.opendev.org/c/openstack/openstack-ansible/+/87082022:07
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_neutron stable/zed: Disable dhcp-agent and metadata-agent for OVN  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87090722:08
prometheanfire^ had to do this manually on my install, I thought it was because ovs/linuxbridge was set up first22:27
admin1changed from 24.4.2 -> 24.4.3 and it works  ¯\_(ツ)_/¯  22:37
admin1is there a easy way to enable mfa/2fa for keystone when using osa ? 22:45
jrosseradmin1: it is a post deployment step so not really in scope, but for which user?22:47
admin1horizon 22:48
admin1i think i have to read more about it on22:49
jrosserhorizon user? or user of horizon?22:49
jrosserthe 2fa settings are per user in keystone, not really to do with horizon22:49
jrosserhttps://docs.openstack.org/keystone/latest/admin/multi-factor-authentication.html#multi-factor-authentication22:50
admin1jrosser, many thanks22:56
