Friday, 2023-01-13

prometheanfirecoolcool00:02
*** arxcruz is now known as arxcruz|ruck07:54
moha7jamesdenton: Hi; Why do we use the type 'bridges' on the hosts for OSA, but use the interfaces themselves during a manual installation of OpenStack?08:52
moha7In document: "Install the kernel extra package if you have one for your kernel version"  --> Shouldn't it be "... if you have NOT one ..."? (https://docs.openstack.org/project-deploy-guide/openstack-ansible/zed/targethosts.html+)09:12
noonedeadpunkmoha7: well, I guess it depends. I assume there were cases when there're no extra modules for the installed kernel09:25
noonedeadpunkAnd bridges/interfaces: we're in the process of updating the documentation. On bare metal hosts bridges do not make much sense; they were originally used to align naming throughout the docs09:26
noonedeadpunkAnd James pushed a couple of commits to change that, but it's a big piece of work09:27
noonedeadpunkand we all have a bunch of other duties09:27
moha7Ah, "_cases, when there're no extra modules_"  -,> Then it's my misunderstanding, you're right.09:40
moha7noonedeadpunk: So you mean I can bypass the step of setting bridges up in the Ubuntu Netplan configuration, either lxb bridges or ovs bridges, and it works by just giving the name of interfaces, as in ens160, ens192, etc directly in the openstack_user_config.yml file instead of br-mgmt and br-vxlan, right?09:46
opendevreviewAndrew Bonney proposed openstack/ansible-role-zookeeper master: Add configuration option for native Prometheus exporter  https://review.opendev.org/c/openstack/ansible-role-zookeeper/+/87004909:48
noonedeadpunkmoha7: um, well, not always :D There's a bit more to it than that. If we're talking about bare metal deployments (without LXC) - then yes, bridges are not needed. If we're talking about compute or net nodes - bridges are not needed there either. But they are needed for LXC hosts (your controllers/infra nodes)09:49
noonedeadpunkBut even then br-vlan or br-vxlan can be just interfaces instead of bridges09:50
noonedeadpunkAgain - it's about what can be done, but docs with bridges work just fine09:50
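A rough sketch of what "interfaces instead of bridges" could look like in openstack_user_config.yml for a metal deploy — hypothetical values, with the interface names borrowed from moha7's question; the bridge-based layout from the docs works just as well:

```yaml
# Hypothetical provider_networks fragment for a no-LXC host: plain interface
# names stand in where br-mgmt / br-vxlan would normally go. Queue names and
# group_binds follow the standard OSA examples; adjust to your environment.
global_overrides:
  provider_networks:
    - network:
        container_bridge: "ens160"        # instead of br-mgmt
        container_type: "veth"
        container_interface: "eth1"
        ip_from_q: "container"
        type: "raw"
        group_binds:
          - all_containers
          - hosts
        is_container_address: true
    - network:
        container_bridge: "ens192"        # instead of br-vxlan
        container_type: "veth"
        container_interface: "eth10"
        ip_from_q: "tunnel"
        type: "vxlan"
        range: "1:1000"
        net_name: "vxlan"
        group_binds:
          - neutron_linuxbridge_agent
```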
moha7noonedeadpunk: I think I get the point; but what do you mean by "bare metal deployments (without LXC)"? Do you mean other deployment solutions as in manual or Kolla?09:57
moha7got*09:57
noonedeadpunkSo in OSA you can either use LXC containers to separate services or deploy everything on node directly without any containers09:58
noonedeadpunkhttps://docs.openstack.org/openstack-ansible/latest/reference/inventory/configure-inventory.html#deploying-directly-on-hosts09:59
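For reference, the linked page boils down to something like this in openstack_user_config.yml — a minimal sketch, host name and IP hypothetical:

```yaml
# Setting no_containers makes OSA treat the host as a metal deploy and
# install the services directly on it instead of inside LXC containers.
infra_hosts:
  infra1:
    ip: 172.29.236.11
    no_containers: true
```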
moha7OMG! I didn't know that10:06
noonedeadpunkunfortunately, we have quite a few things that are not obviously documented, as there's never time for docs :(10:08
noonedeadpunkbut contributions are always very welcome10:10
noonedeadpunkAs it's also hard to spot what's missing when dealing with things on a daily basis - your eye gets blurred and you hardly check the docs, since you already know how to do things10:11
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server master: Prevent mariadbcheck.socket to wait for network.target  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/87007110:16
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server master: Remove "warn" parameter from command module  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/86965610:25
noonedeadpunkNeilHanlon: ah, I guess I found what you meant about rocky changes - https://review.opendev.org/q/topic:rocky-9-virtqemud10:51
jrossernoonedeadpunk: how far do you think we are from making 26.1.0?10:56
jrosserwe have the things andrewbonney found this week and OVN / ironic docs10:57
noonedeadpunkWell, I was thinking to make at least 26.0.1 weeks ago... But now that's actually the question I was going to ask you, given the things that were found10:57
jrosserwell - we have completed an upgrade now with 26.0.0 + tons of patches10:57
jrosserbut i think that if we merge everything that is proposed then our local patches pretty much all disappear10:58
jrosserandrewbonney: can confirm this ^10:58
andrewbonneyYes, I think everything is merged to Zed now apart from https://review.opendev.org/c/openstack/openstack-ansible/+/86997410:58
jrosseri had a change to uwsgi to bind to multiple interfaces that's not backported10:58
jrosserbut that's kind of a feature rather than a bugfix, and it landed late10:59
noonedeadpunkI can recall also a couple of bugfixes now on master only11:00
jrosseralso https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/86754711:01
jrosserand https://review.opendev.org/c/openstack/openstack-ansible/+/86757711:01
jrosseroh was that the systemd service loop?11:01
jrosser*dependency loop11:01
noonedeadpunkyeah11:01
noonedeadpunkboth for mount and mariadb11:01
noonedeadpunkwell docs do not block releases as they're not tied to them11:02
jrosseranyway i think we are close11:02
noonedeadpunkI also feel nervous about https://review.opendev.org/q/topic:rocky-9-virtqemud as that sounds like a huge PITA11:05
noonedeadpunkor well, https://fedoraproject.org/wiki/Changes/LibvirtModularDaemons11:06
jrossertbh I am not sure why the jobs still pass11:10
noonedeadpunkyeah. eventually I think we're kind of ready11:10
jrosserparticularly if there are conflicting packages somehow11:10
noonedeadpunkwell, maybe there's some meta stuff for the transition11:10
noonedeadpunkBut I would expect that kind of migration happening only with new major libvirt version11:11
noonedeadpunkWhich I wouldn't expect to see at least on Rocky... 11:11
noonedeadpunkwell, it might be coming from RDO as well11:11
jrosserI was unsure of the patches too11:11
jrosserlike not converting the daemon name to a list on all OSes11:12
jrosserbut I only looked at them super quickly11:12
noonedeadpunkyeah, how that passed I have no idea11:12
noonedeadpunkas iterating over a string should have failed dramatically11:12
noonedeadpunkunless new ansible does some check nowadays for string/list (which would be quite weird)11:15
noonedeadpunk(not weirder than re-implementing the package resolver for the apt module though)11:16
noonedeadpunkBtw I have that thingy that I'd love to discuss: https://review.opendev.org/c/openstack/openstack-ansible/+/869762 Maybe worth saving it for the meeting... But I'm going to rely on smth like that in a new deployment as it's a nightmare otherwise11:17
amaraoIf I want to test a newer version of a role, what is the easiest way to do so? Is editing ansible-role-requirements.yml enough? And when should I patch it, before running the bootstrap script or after?12:07
noonedeadpunkamarao: the easiest way is to create an /etc/openstack_deploy/user_role_requirements.yml file that contains the override of the role12:10
noonedeadpunkand after that run bootstrap-ansible.sh12:10
noonedeadpunkor you can go to /etc/ansible/roles/role_name and just run `git pull; git checkout` - but that's only for testing, and better not to do it like that12:11
amaraoOh, thanks. I didn't know about user_role_requirements.yml.12:12
noonedeadpunkthere're a couple of bugs related to that that were patched just now - like if you override literally every role there was an issue. And it was a bit trickier with collection-requirements until the patch where we have names for all collections in the list12:14
amaraoDoes user_role_requirements.yml have the same format as ansible-role-requirements.yml?12:19
opendevreviewMerged openstack/ansible-role-systemd_mount stable/zed: Fix mount's systemd unit dependency logic  https://review.opendev.org/c/openstack/ansible-role-systemd_mount/+/86993712:26
jrosseramarao: https://opendev.org/openstack/openstack-ansible/src/branch/master/doc/source/reference/configuration/extending-osa.rst#maintaining-local-forks-of-ansible-roles12:30
noonedeadpunkamarao: sorry, indeed it's user-role-requirements.yml. and yes, same format12:34
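So a minimal /etc/openstack_deploy/user-role-requirements.yml could look like the sketch below (the role name is real, the src/version values are just illustrative); entries here override the matching ones from ansible-role-requirements.yml on the next bootstrap-ansible.sh run:

```yaml
# Same list format as ansible-role-requirements.yml.
- name: galera_server
  scm: git
  src: https://opendev.org/openstack/openstack-ansible-galera_server
  version: master   # or a branch / tag / SHA to test
```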
amaraoIt worked, thank you! My last question, and I feel silly about it: if I want to test changes from review (before merge), where can I get the git remote? E.g. https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/870071/1. I've really dug around and I can't find a git url for proposed changes to clone...12:53
jrosseramarao: press the "three dots" top right12:57
jrosserchoose "download patch"12:58
jrosserthen i usually cherry-pick onto my local repo12:58
opendevreviewDmitriy Rabotyagov proposed openstack/ansible-role-pki master: Create relative links to root instead of absolute  https://review.opendev.org/c/openstack/ansible-role-pki/+/87008912:58
amaraoHuh... I hoped for a git URL to put into user-role-requirements.yml... There isn't a git-accessible repo with pre-merge changes, is there?13:00
jrosserwell it's a patch, not really merged to a repo13:00
jrosserso clone doesn't really make much sense13:00
amaraoSo I need to apply patch after running ansible bootstrap..., ok, I got it.13:01
amarao(I do not have local repos, as I use a separate 'deployment server' for the whole ansible thing, and I manage it with my in-house playbook).13:01
amaraoOh, no, it is! Three dots, download patch, and there is a checkout url: https://review.opendev.org/openstack/openstack-ansible-galera_server refs/changes/71/870071/113:04
jrosserI feel that's slightly ambiguous about exactly what point in the tree that refers to13:07
noonedeadpunkYeah, the problem is that our script at the moment is not capable of dealing with refs13:09
noonedeadpunkit's likely possible to fix, but time is always a problem13:10
noonedeadpunk(at least for our parallel git clone script)13:10
noonedeadpunkdamn, I'm so fed up dealing with our dynamic_inventory script... No fun debugging it at all...13:14
noonedeadpunkAlso I'm afraid of consequences once we fix https://bugs.launchpad.net/openstack-ansible/+bug/2002645/comments/613:16
noonedeadpunkas now behaviour is quite unpredictable I'd say13:16
amaraoWe've given up on the dynamic inventory entirely and just put our own inventory in place, without invoking the dynamic inventory at all. It's much more verbose, but at least it's very debuggable.13:22
noonedeadpunkwell, in case of bare metal deployments that does make sense indeed.13:22
noonedeadpunkFor lxc - not much13:22
jrosseramarao: for if you're doing a lab or just tests then yes you can cherry-pick after bootstrap13:23
jrosserif you want to make that more deterministic for production deployment then i'd recommend forking the repo on github and making your own branch to keep outstanding patches, then refer to that in ansible-role-requirements.yml13:24
noonedeadpunkI find the dynamic inventory handy and generally working, until it's not :D13:24
amaraoIt's slightly more convoluted. I have a set of playbooks to reinstall BM, clone, bootstrap, configure, run openstack-ansible playbooks, etc. It's not in production yet, but it's all 'auto'. So now I'm trying to squeeze patch mode into the existing playbooks...13:24
amarao(I also run remote ansible via local ansible and dump logs after that... crazy thing).13:24
noonedeadpunkany reason to not use ara?13:25
amarao(and pinning all deps, removing internet from install time are my next targets)13:25
jrosserwell if you keep your /etc/openstack_deploy in a git repo then you'll have user-role-requirements.yml under version control, referring to your forked repo13:25
noonedeadpunkAs OSA is capable of installing ara; not sure if we've ever implemented configuration to map it to a remote server, but it should be easy to do13:25
amaraoI plan to remove the intermediate ansible at the end and just have a common inventory between osa playbooks and my playbooks. The main "odd" thing we have is that the inventory will be generated by our own orchestration code (the one to configure BM servers, switches, etc). So 'ansible-in-ansible' is temporary. Also, it's surprisingly usable: https://paste.openstack.org/show/bRWWON76mcgAbGPtI0Ug/ (note the Logs task at the end with /dev13:28
amarao/tty hack).13:28
jrosseramarao: the whole reason that user-role-requirements is there is to address your issue of needing to manage unmerged patches or ones from master13:29
jrosserwe tried, but it is too difficult to automate pulling in a list of patches from gerrit when building the deployment server13:29
jrosserit is much easier to maintain forks of the relevant repos on github or a mirror and refer to those as overrides13:30
amaraoI still haven't found a way to convert a pre-merge ref to a compatible format. It fails with "No commits selected for shallow request" but I'm working on it.13:30
jrosserimho it is better to make a fork13:30
amaraoI want to make it usable for future cases.13:31
jrossermaybe i'm not explaining properly13:31
jrosserhere is what we do to carry an outstanding patch to the uwsgi role for example https://github.com/bbc/ansible-role-uwsgi/tree/bbc-zed-26.0.013:32
jrosserthen just refer to that in user-role-requirements13:33
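i.e. the user-role-requirements.yml entry for that fork would be roughly this (a sketch, assuming the role is named uwsgi in the requirements list):

```yaml
- name: uwsgi
  scm: git
  src: https://github.com/bbc/ansible-role-uwsgi
  version: bbc-zed-26.0.0   # your own branch carrying the outstanding patches
```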
amaraoOk, I give up. git clone can't clone a specific refspec, which is unexpected to me. I'll go your way, thanks.13:56
jamesdentonmoha7 the use of bridges, like br-vxlan, br-vlan, br-mgmt, etc. is related to the use of LXC containers running the actual services. Those bridges allow network communication through the host, since they are bridging the physical interface to the container(s). Some of the bridges, like br-vlan, were used when Neutron agents lived in LXC containers, too, but are no longer necessary since the agents live on metal now. If14:12
jamesdenton you are not using LXC, then the bridges aren't technically as important, but the installation still assumes they are used. It also provides some consistency between hosts, especially when physical interface names can vary (unless you rename them)14:12
moha7Does the type of bridge matter, lxb bridges or ovs bridges?14:14
jamesdentonboth should be supported, but the default is lxb14:15
moha7jamesdenton:  If I run OSA without LXC, then considering that "the installation still assumes [those bridges] are used", should I build the bridges or not?14:18
jamesdentonI run on "metal", which is non-LXC, and I use br-mgmt and br-overlay (renamed from br-vxlan)14:19
jamesdentonthe management IP of my host is configured on br-mgmt14:19
jamesdentonand br-overlay has the TEP (VTEP) address for Geneve/VXLAN14:19
jamesdentonhappy to share my netplan config14:19
moha7Is there any benchmark, or experience, indicating which method is better: separating services with LXC containers, or running all the stuff on the nodes in bare metal mode without LXC?14:21
moha7jamesdenton: > happy to share my netplan config14:22
moha7I am very much looking forward to it ((:14:22
noonedeadpunkThe only tricky part regarding metal is day-2 operations14:22
jamesdentonWell, operationally, it probably makes more sense to segment the services via containers, and many here are doing just that. There is a speed penalty during deployment/upgrades, since you have to touch the containers vs metal14:22
noonedeadpunkEspecially when different services rely on same distro-provided binaries14:23
noonedeadpunkCeph is quite good example of how things might go wrong if both glance and cinder use rbd as a backend14:23
jamesdentonmoha7 booting up my infra VMs now; migrated to a new hypervisor14:24
* NeilHanlon 's lab is in complete disarray14:24
noonedeadpunkas then when upgrading glance you might also upgrade ceph, which is also used by cinder, which might not be ready for such an upgrade14:24
jamesdentoni have managed to keep mine relatively functional14:24
noonedeadpunkWell, in case you're storing images in swift - that's not an issue. Or have cinder-volumes elsewhere14:25
noonedeadpunkso depends14:25
jrosseralso the lxc is only the controllers14:25
noonedeadpunkI would say metal is also quite popular. So might be matter of personal preferences as well14:26
jrosserwhen you have lots of computes the potential overhead of a containerised control plane is not really large14:26
NeilHanlon"it depends", unfortunately14:26
jamesdentonjrosser spot on14:26
jrosserbut if you want to rinse-repeat a multinode lab lots of times, then it will certainly be quicker with a metal deploy14:27
jamesdentoni have forgotten the caveats of lxc and 18.04->20.04 or 20.04->22.04, anything to worry about there?14:27
jrosseri think that now we build the base images from scratch rather than all that nonsense with downloading one, it seems pretty much a non event14:28
jamesdentonmoha7 https://paste.opendev.org/show/b4rV83uoshha3RTxRT75/14:28
jrosseri guess the thing to note there with jamesdenton's network config for a metal deploy is that it's pretty much identical to what you'd have for an lxc one14:29
jamesdentonyeah, actually, this is a hybrid14:29
jamesdentonslow migration from LXC -> metal; my rabbitmq and galera services are still containerized14:29
jamesdentonso, i get the "best of both worlds" when things break :D14:30
noonedeadpunkjamesdenton: yeah, well, cross-OS support doesn't really work due to lsyncd and the repo containers; that was fixed only in Yoga14:30
noonedeadpunkSo it's quite a mess with disabling the proper repo backend on haproxy and stopping lsyncd14:30
jamesdentonahh ok14:30
moha7jamesdenton: Thanks; Your vlan20 has mtu 1500, while the others are all set to 9000; Is there something special about br-mgmt having an MTU of 1500?14:30
noonedeadpunkBut I'd say we have quite fair docs of the process14:30
jamesdentonmoha7 yes, vlan20 is my management interface to a default gateway (Cisco Firepower). Want to keep 1500 MTU there to avoid issues out to the worls14:31
jamesdenton*world14:31
noonedeadpunkmoha7: I don't think it's special, rather there's no need for jumbo frames there14:31
jamesdentonbut vlan21 is Geneve/VXLAN and vlan22 is storage, so 9000 there14:31
noonedeadpunkSince there's no bulk data flowing there, mostly small packets14:31
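A hypothetical netplan sketch of the layout jamesdenton describes (his real config is in the paste above) — interface names and addresses invented, MTUs as discussed: 1500 on the management path, 9000 for overlay and storage:

```yaml
network:
  version: 2
  ethernets:
    eno1: {mtu: 9000}
  vlans:
    vlan20: {id: 20, link: eno1, mtu: 1500}   # management, out to the world
    vlan21: {id: 21, link: eno1, mtu: 9000}   # Geneve/VXLAN
    vlan22: {id: 22, link: eno1, mtu: 9000}   # storage
  bridges:
    br-mgmt:
      interfaces: [vlan20]
      addresses: [172.29.236.11/22]           # host management IP lives here
      mtu: 1500
    br-overlay:
      interfaces: [vlan21]
      addresses: [172.29.240.11/22]           # TEP (VTEP) address
      mtu: 9000
```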
* NeilHanlon twitches at the words Cisco Firepower14:31
noonedeadpunk:D14:31
jamesdentonnoonedeadpunk do you recall the issue with /var/run dirs not being created? Do we have a bug for that?14:32
jamesdentonNeilHanlon you will be happy to know it runs ASA code.14:32
noonedeadpunkNeilHanlon: so, any idea about switching to this https://fedoraproject.org/wiki/Changes/LibvirtModularDaemons on R9?14:32
NeilHanlonjamesdenton: thanks I hate it :P 14:32
jamesdentonme too. me too.14:33
NeilHanlonnoonedeadpunk: yeah i've been doing some digging. it's definitely switched to for RHEL 9, but i'm not sure why we're only seeing this pop up now14:33
NeilHanlonhttps://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/considerations_in_adopting_rhel_9/assembly_virtualization_considerations-in-adopting-rhel-914:33
noonedeadpunkI'm just a bit surprised that it's done with the same libvirt version. In case it's the same14:33
NeilHanlonas far as I can tell, the RHEL/Rocky libvirt package has shipped these modular daemons since the beginning14:34
noonedeadpunkaha14:34
noonedeadpunkbut the old way works somehow, I assume?14:34
NeilHanlonThe guidance from RHEL suggests that it is indeed possible to use the old, monolithic daemon(s).14:35
NeilHanlonhttps://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/configuring_and_managing_virtualization/optimizing-virtual-machine-performance-in-rhel_configuring-and-managing-virtualization#proc_enabling-modular-libvirt-daemons_assembly_optimizing-libvirt-daemons14:35
noonedeadpunkAs I'm mostly concerned about migration of the tcp/tls sockets14:36
NeilHanlonIMO, the path of least astonishment for users would be to keep using the monolithic daemon until other OSes switch (?)14:36
NeilHanlonthough i'm worried now we'll need to account for both scenarios going forward for current users14:36
opendevreviewMerged openstack/ansible-role-systemd_networkd stable/xena: Fix static routes to use Destination rather than Source key  https://review.opendev.org/c/openstack/ansible-role-systemd_networkd/+/86983714:38
noonedeadpunkYeah, that's my main concern as well actually. As using different approaches can be quite messy. Not to mention that a prerequisite is to stop all VMs14:38
noonedeadpunkAnd that's kind of a deal breaker here 14:39
jrosseri guess what we don't know is the provenance of the install that prometheanfire had, like when/how it was done14:39
jrosservs. what happens if you start from fresh today14:40
jrossertheres quite a bunch of it here https://codesearch.opendev.org/?q=virtqemud14:42
moha7noonedeadpunk: "day2 operations"? Is it an idiom?14:43
jrossermoha7: well once you have deployed (day 1) you then are on to day 2 and everything beyond14:44
noonedeadpunk yah, puppet merged that a year ago...14:44
jrosserwhen you need to monitor / maintain / upgrade14:44
noonedeadpunkSo sounds like high time for us as well....14:45
jrosserand hopefully do so without any of your users noticing you do that14:45
*** tosky_ is now known as tosky14:47
noonedeadpunkHm, I see neither debian bullseye nor focal having virtproxyd/virtqemud14:47
noonedeadpunkdon't have jammy handy14:48
NeilHanloni'm kinda baffled this change was shoehorned into RHEL 9 like this14:49
NeilHanloni wouldn't have expected it until RHEL 1014:49
jrosserjammy has a man page https://packages.ubuntu.com/search?suite=jammy&arch=any&mode=filename&searchon=contents&keywords=virtproxyd14:50
moha7jrosser: day2, cool14:50
noonedeadpunkwhich is mostly puppet code ? o_O14:53
NeilHanlonin the future all systems will be configured by just writing ruby14:53
noonedeadpunkor python like charms do14:54
noonedeadpunkand nothing in kinetic/lunar either.14:55
noonedeadpunkSo it sounds like it's a rhel thingy for the time being, and time should be invested to find out a migration plan and timeline for it14:55
noonedeadpunkor keep monolithic until it's possible 14:56
noonedeadpunksounds like discussion for PTG actually14:56
NeilHanloni'm going to spend some time this afternoon if I can trying to recreate prometheanfire's breakage, possibly upgrading from 9.0 to see if that has something to do with it14:57
NeilHanlonbut agreed probably a good discussion for PTG14:58
noonedeadpunktbh, I'd assume that our CI would catch some obvious breakages...15:01
noonedeadpunkbut maybe it's caveats for multinode, like live migrations or smth15:01
NeilHanlonprometheanfire: if you can also let us know a bit more about the provenance of the system as jrosser said. How did you do the initial deploy? When? What OS version was installed initially? etc15:02
damnthemHi, guys. Is it possible to create a separate Ceph cluster for each Cinder AZ, using OSA-integrated ceph-ansible playbook?15:14
noonedeadpunkdamnthem: well. partially at least :) I managed to do that, but there were some non-trivial failures in ceph-ansible down the road which I had to shortcut through, as that was a POC deployment15:20
noonedeadpunkI can't really recall details, but I think the issue was when you're trying to run ceph-ansible across all AZs at once15:21
noonedeadpunkWhen running with limits it was working fine15:21
noonedeadpunkAlso I'd suggest using cluster_names15:21
noonedeadpunkWait a sec, I still might have config for AZs somewhere15:22
damnthemYes, actually I also thought about that (cluster names). And judging by the structure of the ceph-ansible playbook I got a couple of ideas about why you have problems with running the installation across all AZs)15:24
noonedeadpunkdamnthem: these were user_variables regarding ceph/cinder https://paste.openstack.org/show/bBIcVS2faINLR1Erh5pX/15:24
damnthemThat would be helpful for sure, thank you15:25
* NeilHanlon saves15:25
noonedeadpunkthese were openstack_user_config https://paste.openstack.org/show/bgPT2VqXSpoe4jeCyGpB/15:26
noonedeadpunkAnd I also created this env.d/ceph.yml as I wanted to host mons/mgrs in LXC containers: https://paste.openstack.org/show/b6JiwHSUsI11365DeHtA/15:28
noonedeadpunkAs I said - that was sandbox I've played with, so not everything 100% accurate but should give some idea15:28
noonedeadpunkAh, and I've also used this patch for inventory to generate handy AZ groups https://review.opendev.org/c/openstack/openstack-ansible/+/86976215:29
noonedeadpunkThere're quite some nuances down the road, but overall quite doable, yes15:29
noonedeadpunklikely I forgot to paste env.d/cinder.yml for cinder_volumes... here it goes https://paste.openstack.org/show/bDRoc5EqmAI4IE5nWHWi/15:31
noonedeadpunkI wanted to write proper docs about the AZ path, but I lack time and am still in the process of deploying it for real15:35
damnthemThanks again. Will look into that15:44
prometheanfirenoonedeadpunk: pre-req to stop VMs, that's harsh15:45
prometheanfireNeilHanlon: I didn't do the initial deploy, started with rocky-9, my deploy has ceph as well15:45
prometheanfirethe error I was seeing was trying to run virsh secret-create (iirc) from the ceph_client osa role15:46
mgariepyjamesdenton, around ?15:48
prometheanfirethat seems to trigger a systemd socket which then starts virtsecretd; virtsecretd has a Conflicts= line on libvirtd, so it kills libvirtd when it comes up15:48
NeilHanloni think we should mask the modular daemons15:50
prometheanfirethe playbook starts libvirtd, which kills the virtsecretd socket/service.  it then runs the virsh secret-create, which either fails because of the missing socket/service or starts the virtsecretd socket/service; either way the secret-create fails15:50
prometheanfirechanging to the modular daemons requires vm restarts?15:51
MrRHi all, speaking of haproxy, I'm hoping someone can give me some guidance on it and letsencrypt. I'm having trouble pulling a certificate from LE, which I think is just my own mistake, but help would be appreciated.16:01
MrRI have a port forward pointing to my external vip ip (have and can try others again) and port 8888, the vip/domain is set in my router and i can ping it (and all nodes/related ips) from the entire network.16:01
MrRI can see when running setup-infrastructure that it is trying to pull the cert but fails. it always runs on my second node, which has an ip of 236.55, and i can see in the journal that it does in fact come up (i've tried a few ips, such as for the nodes on the flat and mgmt network, so i could be mistaken that it worked on the external vip ip, but it definitely came up for at least one of the ips) but it still fails.16:01
MrRIf i try manually pulling a cert via certbot i can do so via a record on my domain with a forward from the router to the device/domain pulling the cert (this is on a non-openstack machine in case it matters). so in my troubleshooting i used nc -l as a listener on a node, used an online port checker, and it states closed; trying again with the non-openstack machine, the port reports open. I'm guesstimating i'm 16:01
MrRgetting confused somewhere along the way as to what ips i should be pointing these to. 16:01
MrRSo, i have the following set:16:01
MrRhaproxy_keepalived_external_vip_cidr: "192.168.1.62/24"16:01
MrRhaproxy_keepalived_internal_vip_cidr: "{{internal_lb_vip_address}}/22"16:01
MrRhaproxy_keepalived_external_interface: br-flat16:01
MrRhaproxy_keepalived_internal_interface: br-mgmt16:01
MrRhaproxy_ssl_letsencrypt_enable: True16:01
MrRhaproxy_ssl_letsencrypt_install_method: "distro"16:01
MrRhaproxy_ssl_letsencrypt_email: ""16:01
MrRhaproxy_interval: 200016:01
MrRopenstack_service_adminuri_proto: https16:01
MrRopenstack_service_internaluri_proto: https16:01
MrRhaproxy_ssl: True16:01
MrRhaproxy_ssl_all_vips: true16:01
MrRI also have internal_vip set as 192.168.1.61, external_vip set as openstack.domain.com which as i said is resolvable via my router across the entire network.16:01
MrRThe nodes have the relevant networks, including flat, which each node has an ip on (in the case of node 2, 192.168.1.55 for flat). I have tried setting haproxy_ssl_letsencrypt_certbot_bind_address to multiple different ips (node, external_vip, internal_vip, mgmt, flat etc with the relevant forwards etc) to still have it fail. What am i missing?16:02
MrRI can run/test anything required if help can be offered in figuring this out. Any help would be welcomed16:02
MrRsorry, i did not know it would paste like that, new to irc16:02
noonedeadpunkoh, so we basically don't test the ceph scenario for rhel16:12
noonedeadpunkthat's good input on how to reproduce, prometheanfire16:13
noonedeadpunkNeilHanlon: masking daemons is easy peasy :D16:13
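A sketch of that masking as an ad-hoc Ansible task — only the daemons named in this discussion are listed; RHEL 9 ships more modular daemons (virtnetworkd, virtstoraged, etc.), so the list is illustrative, not complete:

```yaml
- name: Keep monolithic libvirtd by masking the modular daemon sockets
  hosts: nova_compute
  tasks:
    - name: Mask modular libvirt sockets
      ansible.builtin.systemd:
        name: "{{ item }}"
        masked: true
      loop:
        - virtqemud.socket
        - virtsecretd.socket
        - virtproxyd.socket
```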
noonedeadpunkeasy way to reproduce should be `git clone https://opendev.org/openstack/openstack-ansible; cd openstack-ansible; ./scripts/gate-check-commit.sh aio_metal_ceph`16:14
noonedeadpunkprometheanfire: well, that's what's said in the rhel docs: `Your virtual machines are shut down.` in the Prerequisites of the Enabling modular libvirt daemons section here https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/configuring_and_managing_virtualization/optimizing-virtual-machine-performance-in-rhel_configuring-and-managing-virtualization#proc_enabling-modular-libvirt-daemons_assembly_optimizing-libvirt-daemons16:15
noonedeadpunkMrR: for pasting we use https://paste.openstack.org/ in general16:16
MrRahhh, thanks for that, I'll use that in future16:16
prometheanfirenoonedeadpunk: heh, fun migration then16:17
noonedeadpunkWell, I haven't tested that, jsut saw that in docs jsut now to which Neil has pointed me16:17
noonedeadpunkdamn, when I will stop making typos in word `just`16:19
mgariepywhen you take the time to setup auto-correct for it :P16:20
noonedeadpunkMrR: I assume the issue might be in port forwarding? As for let's encrypt we're doing some tricks and it's not really port 8888 IIRC16:20
prometheanfireI switched back and forth on a test system with running VMs, I'm sure it's not supported but it's 'fine'16:20
noonedeadpunkprometheanfire: well, anyway I think these patches do need a bit more love. As they're not covering conversion of the tcp/tls sockets at the very least16:21
prometheanfireyep, I'm waiting for direction from NeilHanlon first (mask or not)16:22
noonedeadpunkMrR: we use 8888 just for the backend, and then access it through haproxy that listens on 443 I guess16:22
noonedeadpunkI'm quite buried with stuff right now, and likely next week as well :(16:23
jrosserMrR: port 8016:23
jrosseryou need 80 and 443 to work externally for LE to work16:24
MrR<noonedeadpunk, ok i'll try forwarding 443 also. Should i say set the haproxy_ssl_letsencrypt_certbot_bind_address to the node ip on the flat network or should the vip address work or should i just leave it out?16:25
jrosserdont change haproxy_ssl_letsencrypt_certbot_bind_address16:25
MrRjrosser, i had tried 443 also but i may not have matched ips, i'll fire it up and give it a go now16:26
noonedeadpunkMrR: and eventually these overrides: https://docs.openstack.org/openstack-ansible/latest/user/security/index.html#certbot-certificates16:26
jrosserMrR: the important thing to know is that the LE servers will make a request to your virtual IP16:26
jrosserMrR: and haproxy directs that to whichever of your infra nodes is making the certbot request16:26
MrRjrosser, so leave haproxy_ssl_letsencrypt_certbot_bind_address out, forward 80 and 443 to the external_vip which is attached to the domain, and give it a go16:28
jrosserthat would be a good start16:28
jrossercertbot needs to bind to a local IP on the infra node16:28
jrosserhaproxy then uses that as a backend16:28
jrosserthis is complex because there are N infra nodes behind a loadbalancer16:29
MrRso the flat ip should work if i'm understanding that right, only 3 nodes right now, all running haproxy16:29
jrosserand trickery is required to make sure that the request from the LE servers to collect the token from certbot goes to the correct infra service16:30
jrosserwell, put simply you make a request for a certificate for example.com16:30
jrosserwhatever IP example.com resolves to must serve the token back at the well-known url16:30
jrosserMrR: i don't really know what you mean by the flat IP16:32
MrRsorry, i mean an ip on the br-flat network16:33
jrosserthats kind of not important16:33
jrosserexternal_lb_vip_address must be set to the FQDN that resolves to your haproxy external IP16:33
jrosserthen whatever port forward you have must allow 80/443 for that IP16:35
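Putting that together, the relevant bits of MrR's config would look roughly like this — values taken from the paste above, file placement per the standard layout; the FQDN must resolve to the external VIP, with 80/443 forwarded to it:

```yaml
# openstack_user_config.yml (global_overrides):
external_lb_vip_address: openstack.domain.com     # resolves to the external VIP
# user_variables.yml:
haproxy_keepalived_external_vip_cidr: "192.168.1.62/24"
haproxy_ssl_letsencrypt_enable: true
```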
opendevreviewMerged openstack/ansible-role-systemd_mount stable/yoga: Fix mount's systemd unit dependency logic  https://review.opendev.org/c/openstack/ansible-role-systemd_mount/+/86993816:38
jrosserMrR: you are also doing NAT?16:38
opendevreviewMerged openstack/openstack-ansible-galera_server master: Prevent mariadbcheck.socket to wait for network.target  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/87007116:48
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-galera_server stable/zed: Prevent mariadbcheck.socket to wait for network.target  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/87005616:48
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-galera_server stable/yoga: Prevent mariadbcheck.socket to wait for network.target  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/87005716:48
jrosserMrR: "in my troubleshooting i used nc -l as a listener on a node"16:49
jrosserthe backend to haproxy is expected to be http https://github.com/openstack/openstack-ansible/blob/master/inventory/group_vars/haproxy/haproxy.yml#L29016:51
jrosserso haproxy will not respond to just a port being opened16:51
jrossersee how we "prime" the loadbalancer to ensure that it directs traffic with the correct infra node before running certbot with a fake http server https://github.com/openstack/openstack-ansible-haproxy_server/blob/master/templates/letsencrypt_pre_hook_certbot_distro.j2#L416:52
MrRyes, using NAT, sorry was just firing them up and changing a few things to run it again16:54
jrosserif you try that python command on an infra node you should be able to see the haproxy status change16:55
jrosserand then your external port check should work, or even wget the fake server16:55
jrossernoonedeadpunk: ^ i thought we should look at what directory the temporary LE web server actually serves....16:57
MrRalready started setup-infra so if it fails i'll run that and report back, usually fails ssl in about 15 mins after starting17:08
MrRran with -vvvv so i can share the output if it helps17:09
jrosseryou can run the haproxy playbook on its own17:10
prometheanfirefor self managed ceph that we want to use for object storage (swift), I imagine we can't use the built in ceph-rgw stuff and have to create the tenant / endpoint / etc manually?17:10
jrosserlook inside setup-infra.yml, it just calls a string of other playbooks17:10
jrosserprometheanfire: I think that the rgw setup stuff is in its own playbook to enable that17:11
prometheanfireya, ceph-rgw-keystone-setup is a thing17:12
prometheanfirergw would need to be configured manually on the ceph side since osa still doesn't touch that, but it could work17:12
jrosserso given the right set of vars it should set those things up17:12
prometheanfireya, figuring out those vars now :P17:12
prometheanfireceph_rgws is not going to be defined, so I need to short-circuit some stuff (skip the ceph-rgw-install playbook entirely); looks like the keystone playbook doesn't block based on ceph_rgws at least17:15
jrosserprometheanfire: pay careful attention to https://github.com/openstack/openstack-ansible/blob/master/playbooks/ceph-rgw-install.yml#L1717:17
prometheanfireyep, which is why I have to call it directly17:18
jrosserthat is equivalent to (rgw deployed by OSA) or (list of external rgw)17:18
jrosseryou shouldn't have to17:18
jrosserif this is an external ceph cluster then defining a list of ceph_rgws just like you supply a list of external mons should be whats needed17:19
jrosserthat list being length > 0 will make the service/endpoint setup stuff happen17:19
prometheanfireif ceph_rgws is set then osa will install ceph-rgw on those hosts (the ceph-server)17:20
jrosserno17:20
jrosserwell yes17:20
jrosserbut ceph-rgw != ceph_rgw17:20
jrossersmall but important difference in group name vs. var name there17:21
jrosserdefine and populate the group ceph-rgw and OSA will deploy17:21
jrosserdefine the list ceph_rgw of some external hosts and OSA will not deploy, but will make reference to them in the haproxy setup17:21
prometheanfireok, I'll try with ceph_rgws set to the mons (where I have them deployed)17:22
prometheanfirecool17:22
jrosserhere https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/all/ceph.yml#L43-L4817:22
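A hedged sketch of that, assuming ceph_rgws takes the same list-of-addresses shape as an external ceph_mons list — check the linked group_vars for the exact format; the IPs here are invented:

```yaml
# user_variables.yml - external radosgw instances, here assumed to run on
# the mon hosts as in prometheanfire's setup. A non-empty list makes OSA
# wire up haproxy and the keystone service/endpoints without deploying rgw.
ceph_rgws:
  - 172.29.244.10
  - 172.29.244.11
  - 172.29.244.12
```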
opendevreviewMerged openstack/openstack-ansible-os_nova master: Fix scheduler track_instance_changes option  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87002317:24
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_nova stable/zed: Fix scheduler track_instance_changes option  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87005817:25
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_nova stable/yoga: Fix scheduler track_instance_changes option  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87005917:25
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_nova stable/xena: Fix scheduler track_instance_changes option  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87006017:25
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Add container_ip option for metal hosts  https://review.opendev.org/c/openstack/openstack-ansible/+/87011317:28
noonedeadpunkjrosser: andrewbonney you might be interested in this one ^17:28
noonedeadpunkI assume it should be safe for existing environments. If you apply container_ip in o_u_c, container_address will be replaced with valid data. And according to the test that will be reversed17:29
jrosseri think it will be interesting to look why so far we did not need that17:30
jrossernoonedeadpunk: did you have some specific thing leading to that - interesting to understand what you've used it for17:31
noonedeadpunkyou're not running on metal and applied a bunch of overrides that are still needed until we replace ansible_host with management_address17:31
noonedeadpunkwell, eventually it's andrewbonney that led me to look into this direction. As replacing ansible_host with container_address should have worked here https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/galera_all.yml#L46 in my opinion17:32
noonedeadpunkbut it didn't as ips were wrong17:33
jrosserah right ok17:33
noonedeadpunkWe don't have an environment with ssh!=mgmt yet - just deploying it, but I'm kind of unsure about what happens with my_ip on neutron and nova17:34
jrosseroh yes i think we did initially have live migration on the ssh interface17:34
noonedeadpunkAs that will resolve to mess https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/common-playbooks/nova.yml#L157 for example17:34
jrosseri do not remember why or what we did, can look at that next week but we had to change something to ensure it was on mgmt network17:35
noonedeadpunkI believe change nova_management_address17:35
noonedeadpunkBut I can imagine plenty of things that can go wrong17:35
noonedeadpunkAnd even worse for bare metal scenario17:35
noonedeadpunkWhere services will likely listen on SSH network17:36
opendevreviewMerged openstack/openstack-ansible-lxc_hosts master: Allow to create OVS bridge for lxcbr0  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/86860317:36
jrosserwe should be generally removing ansible_host then from the configs17:37
noonedeadpunkas https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/all/all.yml#L3517:37
noonedeadpunkI wonder why this was never raised tbh... Though we never said it's possible to split SSH from MGMT17:38
jrosserhmm, well we did it from the start17:38
noonedeadpunkyeah. well. If we take the galera example, I'm not sure that ansible_host is incorrect here https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/galera_all.yml#L36 17:38
jrosserbut totally possible we have a bunch of overrides17:39
noonedeadpunkbut well, on lxc it's not _that_ painful17:39
noonedeadpunkso regarding galera - I can assume that monitoring in general will be performed through the SSH net17:40
noonedeadpunkBut I'm not sure, as I'm not there yet :D17:40
jrosserright, in our case the SSH net has all the nodes and a pair of bastions17:40
jrosserso is like a DMZ17:40
jrosserand also on the bastions are elasticsearch / prometheus gateways to gather monitoring17:41
noonedeadpunkyeah, exactly, so monitoring is worth allowing from ansible_host17:41
noonedeadpunkI was thinking about same architecture17:41
opendevreviewMerged openstack/ansible-role-pki master: Create relative links to root instead of absolute  https://review.opendev.org/c/openstack/ansible-role-pki/+/87008917:41
opendevreviewMerged openstack/ansible-role-pki master: Use loop labels to suppress output  https://review.opendev.org/c/openstack/ansible-role-pki/+/87001517:41
jrosserwell it's worked nicely17:41
jrossergives a good structure as well when considering a compromised node17:42
opendevreviewDmitriy Rabotyagov proposed openstack/ansible-role-pki stable/zed: Create relative links to root instead of absolute  https://review.opendev.org/c/openstack/ansible-role-pki/+/87006517:42
jrosserand connections always going from more trusted -> less trusted zones for SSH and monitoring17:42
noonedeadpunkyep, right17:44
jrosseri mean if you really go to town then you can do private vlan on the switches too17:45
jrosseras there is no legitimate traffic except bastion<>nodes, nothing nodes<>nodes17:45
noonedeadpunkum, any migrations go nodes<>nodes? or?17:46
jrosserfor us thats on br-mgmt17:46
jrosserwe made the ssh network not particularly resilient, like no bonds or anything17:46
noonedeadpunkoh, yes, sure17:46
jrosserbut br-mgmt is all bonds and blah blah17:46
jrosserand nothing except monitoring should break if we lose the ssh network17:46
noonedeadpunkI'm not sure how it all ends up at the end in terms of networking design, as I don't have enough clearance for all details of the deployment 17:47
noonedeadpunkbut presumably it will be nuts17:47
jrosserhah :)17:47
jrosserwe were thinking about how to do L3-to-the-host and ECMP rather than L2 bonds17:48
jrosserthat is maybe the most fragile / largest blast radius failure we have right now17:49
noonedeadpunkYeah, that has been raised as well17:49
noonedeadpunkBut the bigger pain point is likely CD-ing changes17:49
jrosserfirmware upgrades on VPC pair switches was terrifying and when it goes wrong you lose a whole load of stuff17:49
noonedeadpunkat least for me, as any SSH access should be minimized17:50
noonedeadpunkSo - unattended openstack upgrades :D17:50
jrosseri think the bulk of our use of SSH is for debugging wtf is going on when things break17:51
noonedeadpunkyeah, exactly. 17:51
noonedeadpunkbut I think the idea is to get rid of the deploy host somehow and just use CI/CD for all management17:52
noonedeadpunkWhich I still can't fully wrap my head around implementing17:52
jrosserit's perhaps not so hard to have the deploy host creatable from state that's all stored in git17:53
jrosserbut there does seem to need to be a lot of human-in-the-loop to deploy changes17:54
noonedeadpunkAs I feel it's too futuristic/utopian17:54
noonedeadpunkWell, bootstrapping the deploy host with Zuul is already done - wasn't that hard indeed17:54
noonedeadpunkBut getting rid of human-in-the-loop is ugh....17:55
noonedeadpunkSimple operations like adding a new node or rolling out a specific service change are doable17:55
noonedeadpunkBut still not sure what to do with ad-hocs or debugging17:56
noonedeadpunkBtw I have a sweet test for the inventory, likely need to share it in the ops repo or smth...17:58
noonedeadpunkIt verifies no mistake was made and groups are not intersecting (like no l3 agent on compute hosts), that the amount of containers is correct and not missing smth (like 2 galera ones), and that every compute is part of the ovs agent group as well. So basically intersections of different groups and checking their content18:00
noonedeadpunkFully ansible, so it can be a native zuul job18:01
noonedeadpunka bit generalized tox version: https://paste.openstack.org/show/bNhsa7uWomhDfoYtUlDg/18:05
noonedeadpunkthe order of elements in collocated_groups is critical, just in case18:06
noonedeadpunkyeah, nothing too fancy, but protects from generic human mistakes18:07
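A trimmed-down sketch of that kind of check as a plain Ansible playbook (the fuller tox version is in the paste above); the group names are standard OSA inventory groups, the exact assertions and counts are illustrative:

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: No L3 agents collocated with computes
      ansible.builtin.assert:
        that:
          - >-
            groups['neutron_l3_agent'] | default([])
            | intersect(groups['compute_hosts'] | default([]))
            | length == 0

    - name: Expected number of galera containers (assuming 3 infra nodes)
      ansible.builtin.assert:
        that:
          - groups['galera_container'] | default([]) | length == 3
```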
noonedeadpunkok, will check out for the weekend now. Will have a look at the stable branches' merge status though to propose bumps18:08
mgariepyhave a nice one noonedeadpunk 18:08
mgariepytake care18:08
opendevreviewMerged openstack/openstack-ansible master: Restore dynamic_inventory unit testing  https://review.opendev.org/c/openstack/openstack-ansible/+/86977618:20
jrosserhave a good weekend18:32
noonedeadpunk963!rg91w19:18
noonedeadpunksorry, kiddo playing with keyboard :D19:18
mgariepy;)19:36
jamesdentonmgariepy i am around now, sorry20:24
mgariepyno worries. 20:25
mgariepywas only looking at the ovn png again and i was wondering why it didn't contain the kvm ;)20:25
jamesdentonas in, the tap? or something else?20:26
mgariepylike it's done for ovs and lxb20:26
jamesdentonoh, ok. good question and prob an oversight20:27
jamesdentonif you wanna leave a comment i will prob revisit that commit this weekend20:28
mgariepydone thanks20:29
jamesdentonthanks, i see what you mean. complete oversight20:29
jamesdentoni think you had another comment, too. the "br-tunnel" thing. i may remove or at least comment on that20:30
mgariepywell yeah it depends on some implementation also.20:32
mgariepyadded the comment for br-tunnel20:34
jamesdentonthanks20:36
mgariepyi'd like to document the ovn commands a bit.20:37
mgariepyon different hosts and so on. how to map the different UUIDs20:38
mgariepyhttps://docs.openstack.org/openstack-ansible-os_neutron/latest/app-ovn.html#useful-open-virtual-network-ovn-commands20:38
mgariepyto add to that a bit.20:38
jamesdentonyeah, any tips you have would be good20:45
mgariepyyep i'll check to start something next week20:47
mgariepybut i don't have an mnaio install. so documenting the output of the commands would be kinda hard for me.20:48
jamesdentonIf you have an openstack cloud to deploy against, i've been working on a MNAIOv2 that will do an 8-node VM-based deploy w/ OVN or OVS20:50
mgariepyi do have a couple of clouds to deploy against.20:50
mgariepywhere is your v2 code ?20:50
jamesdentoni need to clean it up a little bit, but it's here: https://github.com/busterswt/mnaiov220:50
jamesdentonthe readme needs some love, i will update that in a few. but it uses terraform to deploy the VMs, sets up some routers, etc. Needs a routed external network so your terraform/ansible host can hit the VMs20:52
mgariepyi'll take a look with a colleague next week for sure.20:54
jamesdentoncool, any feedback is appreciated. 20:54
mgariepysure20:56
mgariepynow it's time to go shovel some snow.20:58
mgariepyhave a nice weekend20:59
jamesdentonsounds like fun, don't hurt yourself! take care20:59
mgariepynot super fun.. ;p haha i need to do it so if i need to leave i don't get stuck 21:01
prometheanfireI know this isn't the ceph channel but that one's dead, so... when using `ceph orch` to deploy radosgw, is the only way to configure it via the `ceph config set` command (and not via the hosts's ceph.conf)?21:04
jamesdentonsorry, can't help ya21:06
opendevreviewMerged openstack/openstack-ansible-galera_server stable/zed: Prevent mariadbcheck.socket to wait for network.target  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/87005621:24
MrRnoonedeadpunk jrosser thank you, that python command really helped (shortened to timeout 5 python3 -m http.server 8888 --bind IP). After trial and mostly error I have figured out that the ip to forward to is the internal_lb_vip, the ONE ip i hadn't tried, which is absolutely typical. I don't know if this is how it's supposed to be, but if it is i found no documentation stating so. now on to the next problem haha21:25
MrRprometheanfire yes only via config set, but stringing them together with && makes it easier21:26
prometheanfirekk, thanks.  upstream docs still seem to point to ceph.conf config by default21:27
MrRno worries, yeah a lot of the docs refer to past ways, try and stick to the cephadm section on the ceph site, if it's not under the ceph header it can take some piecing together to find the new command21:28
MrR*cephadm header21:28
prometheanfireyepyep21:29
opendevreviewMerged openstack/openstack-ansible-os_nova stable/zed: Fix scheduler track_instance_changes option  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/87005821:29
opendevreviewMerged openstack/ansible-role-pki stable/zed: Create relative links to root instead of absolute  https://review.opendev.org/c/openstack/ansible-role-pki/+/87006521:30
jrosserMrR: it should not be the internal_lb_vip you need to forward, it really should be the external one21:45
jrosserif you think about a cloud without NAT like you have, but with the API endpoints internet-accessible, the mgmt network (internal vip) will not be routable from the outside where the LE servers are (rfc1918 address), but the external VIP should be (a proper external IP)21:47
jrosserMrR: just FYI here is how it works https://github.com/openstack/openstack-ansible/blob/master/inventory/group_vars/haproxy/haproxy.yml#L224-L24121:50
jrosserport 443 is horizon as https, and haproxy is set up to redirect port 80 to port 44321:50
MrRthose were my thoughts, which is why I hadn't thought to try it. for now, it's progress; when i finally have a running cloud that i can tear down and bring up easily i'll go back to it and fix it. this is my fourth way of deploying (tried packages, tripleo and kolla before this) and the first time i've had such an issue21:50
MrRThanks, bookmarked for when i look into it21:51
jrosserbut there is also an haproxy acl set up (haproxy_ssl_letsencrypt_acl) which redirects any requests to .well-known/acme-challenge to the letsencrypt loadbalancer backend rather than horizon21:51
MrRFound my next problem, qdrouterd is looking for a file called debian-11.yml that doesn't seem to exist21:52
MrRThink i'll just remove it, as it was just an "enable everything and see what fails" option haha21:52
jrosserhmm i am surprised that is even being deployed tbh21:52
MrRwell that reply tells me i should get rid of it haha21:53
jrosseranyway it's good to hear you've got LE working, the setup for that is really quite subtle in order to make it support H/A21:54
jrosserMrR: i think that you'll find OSA is more of a 'toolbox' approach compared to some of the other deploy methods21:55
jrosseryou can have really whatever architecture you want, metal or containers, tons of choice on networking, everything configurable21:56
jrosserwe provide a set of "sensible defaults" but really that is just a reference to start from21:56
MrRquick question, is there a better way of removing services that i've missed? The way i have done it is to use the inventory-manage script, remove all references and destroy the related containers21:57
jrosserdownside is that the learning curve can be very steep, and there are many possibilities to break things21:57
jrosserinventory-manage is the 'official' tool we provide to do that21:58
MrRyeah my issues with some of the other methods were documentation and troubleshooting; this tells me where i can find the problem, with the others you sometimes needed a crystal ball21:58
jrosserit would be really useful to get any feedback on things like this21:58
MrRgreat thanks, i'm doing something right haha21:58
jrosserheh no worries21:58
jrossermost people active here are actual operators21:59
jrossernot just developers working on $tool for their employers product, so theres a lot of real-world experience here21:59
prometheanfireMrR: yep, got it working22:04
mgariepyjamesdenton, it's done. 1h of snowblower and a shovel do the trick. 200 ft by 15ft is kinda long for snow removal22:04
* prometheanfire is waiting for the snow to stop, lake effect stuff here22:06
mgariepyhere the snow is full of water. it's a bit too hot for snow to be ligth and fluffy haha22:09
mgariepythere was a good 4-5 inch of snow to remove.22:09
prometheanfireah, nothing bad here, just 2ish inches22:12
MrRjrosser I can admit now that I am familiar with it, this is the best method; the documentation is much better, but i have found a few things i have had to go on a deep dive for, and i'm still not sure if i can override files such as /etc/ansible/roles/os_trove/defaults/main.yml in /etc/openstack_deploy, and it took a while to find i even needed to edit that file. Which reminds me, when setting up an independent rabbitmq 22:16
MrRfor trove, in group_vars/trove_rabbitmq.yml it states to put rabbitmq_monitoring_password: <password>. my question is, do i replace <password> or does it pull it from user_secrets.yml? I've looked and still don't know as i haven't got that far22:16
MrRand if it doesnt pull it from  user_secrets.yml it really should22:17
jrosserso as far as everything in /etc/ansible/roles/* goes, the key thing is that for all of them defaults/main.yml is the "external interface" to the roles22:17
jrosserthat should be your first port of call and essentially the documentation for the external interface to those roles22:18
jrosserregarding trove, honestly i can say thats one of our least used roles22:19
jrosserMrR: could you give me a link to group_vars/trove_rabbitmq.yml ?22:20
MrRit's not something i need and won't be in my final deployment, just testing for the sake of it22:20
MrRyeah let me just find it22:21
jrosserthe need to set up an extra rabbitmq is tricky22:21
jrosseroh hold on thats not trove is it....22:21
jrosser:)22:21
jamesdentonmgariepy yeah, 200ft, oh boy22:21
MrRhttps://docs.openstack.org/openstack-ansible-os_trove/zed/configure-trove.html22:21
jamesdentonno snow for me since moving back from Ohio.22:21
opendevreviewMerged openstack/openstack-ansible master: Skip haproxy with setup-infrastructure for upgrades  https://review.opendev.org/c/openstack/openstack-ansible/+/86997422:22
jrosserMrR: i think that documentation is probably wrong22:24
jrosserbecause rabbitmq_monitoring_password and rabbitmq_cookie_token are defined in user_secrets.yml, and those vars are loaded at the highest possible precedence in ansible22:25
mgariepyjamesdenton, i have a 45inch walk behind snowblower22:26
mgariepyi'm too cheap to buy a tractor ;) haha22:26
jrosserso those will always override anything set in /etc/openstack_deploy/group_vars/trove_rabbitmq.yml22:26
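i.e. the clash looks roughly like this — a sketch of the two files, per the documented layout, with placeholder values:

```yaml
# /etc/openstack_deploy/user_secrets.yml - loaded at the highest precedence
rabbitmq_monitoring_password: aaaa   # this value wins everywhere
---
# /etc/openstack_deploy/group_vars/trove_rabbitmq.yml - per the trove docs
rabbitmq_monitoring_password: bbbb   # silently shadowed, so the trove
                                     # rabbitmq gets the same creds as main
```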
MrRan old value then; i suppose if it succeeds i can check the password/cookie and see if it matches. if not i'll just remove it as i don't actually need it. i have a strange need to find out what does and doesn't work, i swear i like making more work for myself22:28
jrosseri think that what is going to happen is that the trove rabbitmq will end up with the same var settings as the main one22:30
jrosserand thats a quite interesting bug!22:31
jamesdentoni had a snow thrower, but didn't end up using it as we had 2 mild winters before i left. ended up giving it to the neighbor22:31
MrRyeah as it completely defeats the point of a separate rabbitmq haha, i'll let you know22:31
MrRor file a bug, which would probably be a better idea22:32
jrosserwell perhaps - all being equal the trove vms communicate over a dedicated br-dbaas network and can only contact the dedicated rabbitmq22:32
jrosserif they can get to the mgmt network you have bigger problems22:32
MrRtrue, forgot I had to set a network for it22:33
jrosseri think you might be in for a rough time trying to just enable everything22:33
jrossertelemetry stack can be challenging22:33
prometheanfirelast I knew it was fairly orphaned22:45
MrRjrosser, well I'm about to find out, setup-infra has just finished without error, so on to setup-openstack. I did go through as much documentation as I could find to set all variables, but then again i'm also working on at least 4 projects and running a household, so i won't be too surprised if i've missed something. actually i'm not sure if i set up the ceph rgw on ceph's side, set up everything ceph apart from that 23:04
MrRso best check 23:04
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Skip haproxy with setup-infrastructure for upgrades  https://review.opendev.org/c/openstack/openstack-ansible/+/87006823:40
