Tuesday, 2024-03-19

hamburgler2Is anyone actively using trove? - Struggling with the guest-agent config at the moment and the trove channel is pretty dead, I can dump logs to swift without issue, but looks like there is some kind of bug in the guest agent image venv - /opt/guest-agent/backup/main.py when that calls the create backup function in datastore/service.py, looks like the command being executed is malformed/out of order. 05:22
hamburgler2Error in https://paste.openstack.org/show/byFcSFmIpjfED8XKt0ad/ - guest agent conf also in there too. Has anyone come across this before? Database instances work fine also, just backups do not.05:22
gokhan__is there a way to override upper constraints in osa? I am trying to install neutron fwaas on antelope, but it doesn't create the pyzmq wheel because it doesn't have a python 3.10 whl. I need to override the pyzmq version. 06:59
noonedeadpunkgokhan__: you can override the url to your own "fork" of it or just random url09:05
noonedeadpunkbut not a specific occurrence09:05
noonedeadpunkhamburgler2: well, I used it a while ago, but wanted to get it running next month in some "pet" project09:05
noonedeadpunkhamburgler2: and you do have a swift api endpoint for the region I assume...09:22
noonedeadpunkbut yeah, it seems an issue in settings, agree09:22
noonedeadpunkoh my, that's a mess /o\ https://opendev.org/openstack/trove/src/branch/master/trove/guestagent/datastore/service.py#L485-L49309:26
noonedeadpunkwhy in the world it opens self with popen...09:26
noonedeadpunkah, because it runs that inside the docker container.....09:27
noonedeadpunkquestion is - what is main.py....09:29
noonedeadpunkand eventually, argument is passed from code regardless https://opendev.org/openstack/trove/src/branch/master/trove/guestagent/datastore/service.py#L46709:30
noonedeadpunkhamburgler2: ensure you have this commit in your image: https://opendev.org/openstack/trove/commit/e998b6886602575127ebe613e56cee3a5a01c6c609:32
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Use container setup role from plugins repo  https://review.opendev.org/c/openstack/openstack-ansible/+/90500409:40
gokhan__noonedeadpunk, thanks. For distribution upgrade, it seems we also need to migrate volumes to other hosts if we deploy cinder volumes to containers. 09:57
gokhan__noonedeadpunk, another question: for installing a service from a different version, do we need to override the git package urls and requirements url? do we need any extra settings?09:59
noonedeadpunkgokhan__: um, you're using non-ceph?09:59
gokhan__no I am using ceph 10:00
noonedeadpunkso.. why you need to migrate volumes to other hosts? you've disabled active/active setup?10:00
gokhan__I am using default settings. I am not aware of active/active setup. How can we check this 10:02
noonedeadpunkI can't recall _for sure_, but there should be `cluster` setting in cinder.conf, and then for each volume in database there's a field `cluster_name` or smth, that should match10:03
noonedeadpunkalso, for "proper" active/active setup you'd need to have some coordination, like zookeeper or etcd10:04
noonedeadpunkas otherwise you can catch some race conditions10:04
noonedeadpunkbut we worked without it for a while without anything too obvious too frequent10:05
gokhan__cluster is ceph in cinder.conf 10:05
gokhan__there is no coordination url setting in cinder 10:06
gokhan__cinder.conf10:06
noonedeadpunkwell, so probably active/active can have race conditions, but it shouldn't require migrating volumes afaik10:27
gokhan__ok thanks, I need to set cinder coordination group 10:40
noonedeadpunkwell, if you add zookeeper to inventory - it would be added on its own10:41
noonedeadpunkbut if you have etcd running somewhere - you can point to it manually10:41
gokhan__noonedeadpunk, we have already zookeeper and we will add it. 10:50
noonedeadpunkhuh10:51
noonedeadpunkif it's spawned with osa - it should be just added by role I guess...10:51
jrosseryeah there should be no need to manually add that10:52
noonedeadpunkugh, crap, seems rdo has borked ovn installation for rocky....10:59
noonedeadpunkhttps://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/913582?tab=change-view-tab-header-zuul-results-summary10:59
gokhan__noonedeadpunk, we have our own zookeeper role. if I set cinder-coordination group to our zookeeper group, it can work  11:02
noonedeadpunkyeah, sure11:03
gokhan__noonedeadpunk, I want to also ask for using magnum with bobcat version. we will override magnum github settings and change magnum role to bobcat. do we need another setting ? I will also test cluster api driver.11:13
noonedeadpunkProbably not? except different fedora-coreos image and k8s/etcd versions for templates....11:15
gokhan__noonedeadpunk, I am not sure about requirement git url. it seems we need to change also requirement git url in magnum role 11:17
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Create an openrc for nb/sb clients  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91358211:17
noonedeadpunkoh, yes, you probably need :)11:18
noonedeadpunkit's `magnum_upper_constraints_url`11:18
noonedeadpunkbasically you can set that to `magnum_upper_constraints_url: https://releases.openstack.org/constraints/upper/18ef0785c4d95c0b7a144c2f9b3ca6a97df20e52`11:19
gokhan__thanks noonedeadpunk :)11:22
jrossergokhan__: you will have some challenges to run my "complete" magnum cluster_api patches on an older OSA11:37
gokhan__jrosser, I am planning to test on antelope if it is possible 11:38
jrosserif you want OSA to deploy the control plane k8s cluster for you automatically then there are lots of patches needed on Antelope, will be less on Bobcat and hopefully none on C11:38
jrosserthe testing is all currently in the context of what will be the C release11:38
gokhan__jrosser, if I can install management kubernetes cluster on lxc containers, I can handle other challenges.  11:40
jrosseryou can - my patches do that11:41
gokhan__jrosser, ok thanks I will test it next week. 11:42
jrosserwhat i am saying is that you will need to significantly patch a whole bunch of other stuff in OSA to make that management k8s cluster on the LXC deploy properly11:43
gokhan__thanks jrosser I got it :) I will look at the patches. 11:44
noonedeadpunk#startmeeting openstack_ansible_meeting15:00
opendevmeetMeeting started Tue Mar 19 15:00:39 2024 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:00
noonedeadpunk#topic roll call15:00
noonedeadpunko/15:00
NeilHanlono/15:00
damiandabrowskihi!15:01
jrossero/ hello15:01
noonedeadpunk#topic office hours15:02
noonedeadpunkso one thing that was raised today is that Rocky seems to be reliably failing OVN installation15:02
noonedeadpunkhttps://zuul.opendev.org/t/openstack/build/b48c718f0b794b18b313eb4a513b0cac15:03
noonedeadpunk"nothing provides ovn23.09 needed by rdo-ovn-2:23.09-2.el9s.noarch from rdo-deps"15:03
noonedeadpunkwith that it feels like CentOS is passing somehow?15:03
jrosserdoes this mean we are blocked on everything?15:03
noonedeadpunkI think so, yes15:03
NeilHanlonack. I see c9s had a build of this. https://cbs.centos.org/koji/packageinfo?packageID=1132915:03
NeilHanlonI will build, test, and release.15:04
NeilHanlonto get around it for now we could pin to rdo-ovn-23.06, if that exists15:04
noonedeadpunkyeah, I think they don't drop things from rdo repos15:05
noonedeadpunkso should be possible15:05
noonedeadpunkwould be good to do for rocky specifically and leave centos with latest to catch up if smth goes off15:05
NeilHanlonyep. i will submit a Change to do that15:05
noonedeadpunkok, awesome15:06
noonedeadpunkit seems that most projects have branched 2024.115:06
noonedeadpunkSo probably we should switch to tracking it?15:06
jrosserit would be a great time to look through our backlog and draw up a list of whats left to merge15:07
noonedeadpunkyes, totally15:08
jrossertheres the ovn bgp stuff and also a few changes from jimmy15:08
noonedeadpunkmost controversial thing - quorum queues as default15:08
noonedeadpunkalso skyline potentially15:08
jrosseryeah certainly15:09
noonedeadpunk(I failed with replacing nginx actually)15:09
jrosserright - i remember looking at it and thinking it was straightforward15:09
jrosserand basically deciding i didnt really understand the nginx setup at all15:09
noonedeadpunksomehow it does smth quite different from what maps are doing15:09
noonedeadpunkand not to mention hardcoded paths in static files...15:10
noonedeadpunkso to make it work on same ports as horizon - it can't be really in subdirectory15:10
noonedeadpunkwhich potentially would just break nice urls15:11
noonedeadpunkthere's also https://review.opendev.org/q/topic:%22osa-eom%2215:12
noonedeadpunkwhich seems to be failing due to error in zuul?15:12
noonedeadpunkI haven't looked into that either :(15:13
jrosserdo those branches still exist?15:16
noonedeadpunk I think so?15:16
noonedeadpunkhttps://opendev.org/openstack/openstack-ansible/src/branch/stable/xena15:16
jrosseroh yes ok15:17
noonedeadpunkhttps://review.opendev.org/c/openstack/releases/+/91041415:18
noonedeadpunkbut I'm not sure if branches are in zuul actually....15:18
noonedeadpunkthey should be I assume15:18
jrosserdoes it not even run those?15:19
noonedeadpunkactually I haven't checked on that15:20
noonedeadpunklet's try some recheck and see15:20
noonedeadpunkso, it appears for a second15:21
noonedeadpunkand that's it15:21
noonedeadpunkso I think some config issue15:21
noonedeadpunkyeah https://zuul.opendev.org/t/openstack/config-errors?project=openstack%2Fopenstack-ansible&skip=015:22
noonedeadpunkbuster15:22
noonedeadpunkI think you were proposing smth?15:22
noonedeadpunkhttps://review.opendev.org/c/openstack/openstack-ansible/+/91019215:23
jrosseryeah i have a bunch https://review.opendev.org/q/topic:%22osa/zuul-errors%2215:23
jrosserbut tbh i really really would like these just to get force merged where possible15:24
jrosserand they are about to become pointless when the branches are renamed, so it's just /o\ and i kind of wonder why to put effort in 15:24
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/wallaby: Remove use of undefined ceph distro job zuul template  https://review.opendev.org/c/openstack/openstack-ansible/+/91019215:26
noonedeadpunkaha, ok, I now recall the discussion15:27
NeilHanloni've got ovn23.09 in the oven for rocky/rebuilds15:27
jrosserah so there is a ton more to do for debian-buster and centos-715:27
noonedeadpunkwow, that's fast15:27
NeilHanlon#link https://cbs.centos.org/koji/taskinfo?taskID=387776915:27
jrosserbut also there is branches that need deleting like pike and stuff15:27
NeilHanlontrying to figure out where/what requires 'rdo-openvswitch' which is what we need to pin back down15:28
noonedeadpunkYeah, I think pike and rest was slightly different track15:28
noonedeadpunkNeilHanlon: well, if it's already tested, how long it might take to get released?15:28
noonedeadpunkas if it's like 24h or smth - might be easier to just wait?15:29
NeilHanlonI think like, tomorrow15:29
NeilHanlonyeah15:29
NeilHanlonand, i've created a ticket at work to implement automation, or at least automatic tickets ... for this15:29
noonedeadpunkI think we can live with borked gates until then15:29
jrosseri can do some more work on the zuul errors patches15:30
jrosserfor buster and centos-715:30
noonedeadpunkcentos-7 feels like it's still present....15:30
noonedeadpunkI see that now job is in zuul15:31
NeilHanlonfyi CentOS 7 is going to be EoL on June 30th15:31
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/wallaby: Switch SHAs to EOM  https://review.opendev.org/c/openstack/openstack-ansible/+/91341415:31
jrosserlots of `The nodeset "centos-7" was not found.`15:31
noonedeadpunkwell, I was checking https://zuul.opendev.org/t/openstack/config-errors?project=openstack%2Fopenstack-ansible&skip=015:31
jrosseroh well....15:32
jrosserif only it was possible to wildcard on that page15:32
jrosserbecause the errors are all over the repos15:32
noonedeadpunkah, well15:34
jrosserlike https://zuul.opendev.org/t/openstack/config-errors?project=openstack%2Fopenstack-ansible-ops&skip=015:35
*** f0o_ is now known as f0o15:35
noonedeadpunkoh, actually, another thing to merge is https://review.opendev.org/q/topic:%22osa/apt_key%2215:38
noonedeadpunkI clean forgot about that :(15:38
noonedeadpunkso sounds like we have quite some outstanding topics right now15:39
jrosseryeah, perhaps we need an etherpad for things to do before release?15:39
jrossereveryone busy++ right now so might be helpful15:40
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/xena: Remove use of undefined ceph distro job zuul template  https://review.opendev.org/c/openstack/openstack-ansible/+/91025515:42
noonedeadpunkI'd suggest using PTG etherpad15:42
noonedeadpunkhttps://etherpad.opendev.org/p/osa-dalmatian-ptg ?15:43
jrossersure - the current work section is what we want15:43
noonedeadpunkbtw, are there any updates for availability for the ptg week?15:44
jrosseri pretty much cant make it15:45
noonedeadpunkok, I see15:45
noonedeadpunkI was thinking of more-or-less moving over what we didn't manage to work on during this cycle15:48
noonedeadpunkSo not to scope anything too breaking I assume15:48
opendevreviewMerged openstack/ansible-hardening master: reno: Update master for unmaintained/xena  https://review.opendev.org/c/openstack/ansible-hardening/+/91313615:55
noonedeadpunk#endmeeting15:59
opendevmeetMeeting ended Tue Mar 19 15:59:31 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:59
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-03-19-15.00.html15:59
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-03-19-15.00.txt15:59
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-03-19-15.00.log.html15:59
opendevreviewMerged openstack/openstack-ansible-haproxy_server master: reno: Update master for unmaintained/victoria  https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/91301616:00
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/xena: Switch SHAs to EOM  https://review.opendev.org/c/openstack/openstack-ansible/+/91341316:00
noonedeadpunkI think we might need to squash these 2 things to have a chance to pass bootstrap....16:01
opendevreviewJonathan Rosser proposed openstack/openstack-ansible stable/xena: Remove use of undefined ceph distro job zuul template  https://review.opendev.org/c/openstack/openstack-ansible/+/91025516:03
jrossergrrr .vscode files :(16:03
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/xena: Switch SHAs to EOM  https://review.opendev.org/c/openstack/openstack-ansible/+/91341316:10
spatelnoonedeadpunk sorry I won't be able to join meeting 16:37
spateldealing with production issue :(16:38
spatelI have a question, if someone delete cinder volume then where to find the logs of that.. I did search but didn't find anywhere in logs of cinder 16:38
noonedeadpunkspatel: no worries, it just ended and was quite productive I guess. We will catch up again about the progress in 2 weeks (on April 2)16:42
spatel+1 please let me know if you need to test or something. soon I am going to deploy freezer in one of my labs to test all the functions and will start reporting bugs. 17:03
ThiagoCMCSo, Ceph Ansible `stable-8.0` is working with Ubuntu 22.04 + UCA Bobcat to deploy Ceph Reef!17:06
ThiagoCMC=P17:07
gebzA simple null delayed me 3 days :'D17:08
gebzIt was true whoever said that null was a billion dollar mistake 17:08
gebz@noonedeadpunk thank you man, it run smooth :D17:08
gebzNow for the moment of truth.. Fingers crossed17:09
noonedeadpunkgebz: oh, sweet it worked out17:20
noonedeadpunkspatel_: yeah, actually I'm pretty sure that current master is broken badly there due to sqlalchemy 17:20
noonedeadpunkbut I had plans for my pet thing to try out freezer and put some effort into it in like a couple of months17:21
NeilHanlonnoonedeadpunk: ovn23.09 is out at https://mirror.stream.centos.org/SIGs/9/nfv/x86_64/openvswitch-common/Packages/o/ -- I think that means CI should start working soon..17:47
noonedeadpunkit was waaaaaay faster than 24 h :D17:47
NeilHanlon:D 17:48
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: [doc] Expand documentation on OVN useful commands  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91358818:17
*** jamesdenton_ is now known as jamesdenton18:41
ThiagoCMCjrosser! I'm using Ceph Ansible `stable-8.0` branch to deploy native Ceph Reef on Ubuntu 24.04! No Docker/Podman required!18:48
ThiagoCMCParty time!18:48
hamburgler2noonedeadpunk: that commit is present, Swift api endpoint for region is up and functional, can dump trove logs to swift container and visible on dashboard through horizon or via accessing swift cli, just seems to be when the guest image goes to create the backup using backup container image, the command when constructed, the positional arguments are out of order in some way19:26
noonedeadpunkhm, that exact patch should have solved that....19:26
noonedeadpunkor it broke in a different way :D19:27
noonedeadpunkbut imo, that line should be enough to get arg respected: https://opendev.org/openstack/trove/src/commit/e998b6886602575127ebe613e56cee3a5a01c6c6/backup/main.py#L5519:28
noonedeadpunkit could be some other place though19:30
noonedeadpunkbut I don't see anything obvious...19:31
f0ojust as I was gonna drop out for the day I noticed that with this OVN setup, for my VMs with vxlan, OVS/OVN leaks all connected subnets (br-mgmt for instance) into the OVS19:41
f0oI'm hoping this is just some silly VRF fuckup on my end19:42
f0oVRF fuckup on my end19:52
f0oman that got me worried really fast19:53
f0oI know OVS isn't VRF aware which makes it a bit tedious to work with19:53
jrosserf0o: do you have some unusual setup there?19:55
f0opossibly20:01
f0oOpenstack -> OVS -> TopOfRackRouter -> BGP -> Internet20:01
f0ousually I just segregate things with VRFs and call it a day20:02
jrosseryou mean on the TOR?20:02
jrosserso not an issue with OSA getting mgmt network mixed up into OVS?20:02
f0obut here for some reason the router is actually routing 10.0.3.1 (which is lxcbr0) and also 10.20.0.0/22 which is br-mgmt and a bunch of other things20:02
f0owell it's hard to say because I dont have a vanilla negative test here20:03
f0oI think OVS will be just as happy to route br-mgmt on your setup as it is on mine20:03
f0obecause I can nuke the VRF table leakage to prevent internet access but that does not prevent access to 10.0.3.1 nor br-mgmt range20:04
jrosserfrom where? I’m unsure if you mean from tenant vm or elsewhere20:04
f0oinside a tenant-vm with geneve network20:04
jrosserreally?20:05
f0othat vm can happily connect to br-mgmt, br-vxlan, lxcbr020:05
f0ogets routed from the hypervisor through geneve_sys_6081 interface on the Gateway node and then out lxcbr0/br-mgmt/...20:05
jrosserthis is reproducible in an all-in-one?20:06
jrossernoonedeadpunk: ^^20:06
f0oI will need to do that tomorrow when my brain is fresh20:06
f0oI just noticed it because this vm happily resolved dns against 10.0.3.1 which made me very suspicious20:06
f0oand I was just about to sign off and call this done... 20:06
f0ofrom what I can say, OVS does everything right - packet goes out of the libvirt tap interface, gets pushed into the vxlan/geneve overlay and delivered to the openstack-router on the gateway node.20:07
f0oNow on the gateway node that packet gets full access to all connected networks of the gateway node it seems20:08
f0oand because OVS is not VRF aware, you cant limit it by just stuffing it into a VRF.20:08
f0oSo you likely have to resort to iptables to forbid forwarding based on interfaces20:09
f0owhich seems more like a bandaid than a stable fix20:09
f0obut packet-logic-wise it makes sense why it is how it is20:09
f0oin linuxbridge you didn't have this issue because you flushed the packet into an interface but OVS doesn't do that, the flows governed by northd do all the routing (S/DNAT) and then it's handed over to the kernel20:10
f0oand the kernel knows where things are20:10
f0osorry for wall of text, just had to write it down before the fever hits me and I get brainmush again20:11
jrosserno it’s fine, sounds like something that shouldn’t be able to happen really20:14
f0oI will set up some wiretaps tomorrow and get some packet tracing from the hypervisors to the routers and see where what when touches it to get more observations into this20:24
f0oOVS is hot and new, so I have no real clue how it works. maybe there's just one silly setting that's forgotten or some other gotcha 20:25
f0oI just did a low-effort fix by adding a blackhole route for 10.0.0.0/8 on the VRF that my br-ext is attached to. That did solve it because it matches early20:30
f0othe routing looks like: br-ext[vlan123 slaved to vrf OVS] -> default next-hop-vrf IBGP -> IBGP has full bgp tables with next-hop-vrf/s onto the next router/s.20:32
f0oso any packet that's exiting vlan123 would be pushed to IBGP with next-hop-self where it would find its destination directly or return with DestinationUnknown error. 10.0.0.0/8 is in the MGMT vrf, far away and not leaked into anything20:33
f0oso vlan123 should not be pushing packets into that vrf whatsoever. I triple checked the routing tables just now for the ranges of br-mgmt etc, they only exist in MGMT.20:34
f0oI'm dropping out now, will do more tests tomorrow20:35
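
A quick sanity check of the blackhole workaround described above, as a minimal sketch: it only confirms that the two destinations named in the discussion (10.0.3.1 on lxcbr0 and 10.20.0.0/22 on br-mgmt) fall inside the 10.0.0.0/8 prefix, so a blackhole route for that prefix would match them. The function and constant names are hypothetical and not taken from any OVS/OVN or OSA tooling.

```python
import ipaddress

# Blackhole prefix from the discussion (added on the VRF that br-ext is attached to).
BLACKHOLE = ipaddress.ip_network("10.0.0.0/8")

# Destinations mentioned in the discussion.
LXCBR0_ADDR = ipaddress.ip_address("10.0.3.1")      # lxcbr0
BR_MGMT_NET = ipaddress.ip_network("10.20.0.0/22")  # br-mgmt


def covered_by_blackhole(target, blackhole=BLACKHOLE):
    """Return True if an address or a whole network lies inside the blackhole prefix."""
    if isinstance(target, (ipaddress.IPv4Address, ipaddress.IPv6Address)):
        return target in blackhole
    return target.subnet_of(blackhole)


if __name__ == "__main__":
    for name, target in (("lxcbr0", LXCBR0_ADDR), ("br-mgmt", BR_MGMT_NET)):
        print(name, target, covered_by_blackhole(target))
```

This says nothing about why the gateway node forwards the traffic in the first place; it only shows which ranges the workaround catches.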
int33hHi, has anyone had any issues with ssh controlmaster multiplexing when trying to do an openstack ansible deployment?22:05
int33hI haven't found any way to disable multiplexing of the ssh sessions 22:06
jrosserint33h: best to share whatever error you get in a paste, if you can22:07
jrosserbut we don't get really much/any trouble with ssh stuff at all22:08
int33hjrosser: right now im not getting any error it just stands still, https://pastebin.com/CRn5cWbR22:10
int33hdid a pastebin where it is atm 22:10
int33hI tried the ssh command manually and it works if i remove the controlpath22:10
int33hIts one of these weeks where everything keep not working in the most unexpected ways :P22:11
int33hlast thing i expected was hanging ssh sessions :P22:14
jrossermaybe turning up the sshd logging on the target and look for issues there22:18
jrossereverything i read suggests this is somehow related to the ssh setup on the target22:18
int33hhmmm strange its not even trying to connect22:23
int33hyeah, with controlpath it's not even trying to connect; as soon as i try without the controlpath it works fine. With controlpath still there, there isn't even a line in the sshd log with debug322:25
int33hyes increasing loglevel on ssh, it gets stuck on debug1: auto-mux: Trying existing master22:27
int33hIs it possible to disable all multiplexing22:37
int33hI found how to disable it , seems like i get a broken pipe now , maybe the more proper error 22:54
int33hThink i found the real issue now, since i got a proper error message now. I have a pair of pfsense firewalls in front of the management interface, i changed the config to redirect to a different interface not going through the firewall23:06
int33hNow its chugging away23:06
int33hFiguring out why pfsense kills the ssh sessions I'll do tomorrow, time to sleep 23:07
int33hjrosser: thanks for the idea about sshd logging , it took me in the right direction23:07
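
For reference, a minimal sketch of running a manual ssh test with connection sharing turned off, which is one way to reproduce the "works without the controlpath" check from the discussion. The log does not say how multiplexing was actually disabled, so this is an assumption: it relies on the OpenSSH client options `ControlMaster=no` and `ControlPath=none`, and the helper name and `target-host` are placeholders.

```python
import shlex

# OpenSSH client options that turn connection sharing off for one invocation.
NO_MUX_OPTIONS = ["-o", "ControlMaster=no", "-o", "ControlPath=none"]


def ssh_command(host, remote_command, multiplex=False):
    """Build an ssh argv list; with multiplex=False no control socket is used."""
    argv = ["ssh"]
    if not multiplex:
        argv += NO_MUX_OPTIONS
    argv += [host, remote_command]
    return argv


if __name__ == "__main__":
    # "target-host" is a placeholder, not a host from the log.
    print(shlex.join(ssh_command("target-host", "true")))
```

If the command built this way fails with a real error (as the broken pipe did here) while the multiplexed one just hangs, the problem is likely on the network path rather than in the control socket itself.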