Tuesday, 2023-06-27

anskiyHello. I've been upgrading our deployment from 25.2.0 to 25.4.0 and while running `setup-infrastructure` with `rabbitmq_upgrade=true`, OSA stopped all rabbitmq services on all nodes at once for upgrade -- is this the way it should work? The reason I'm asking is after this, all `nova-compute` services end up being broken, until I manually restart them, because they couldn't restore rabbitmq connections07:34
hamidlotfi_Hi there,07:41
hamidlotfi_I want to install Zed again in a new test env on Ubuntu 22.04, but it shows me this error07:41
hamidlotfi_https://www.irccloud.com/pastebin/QWUSW36B/07:41
hamidlotfi_please help me07:41
hamidlotfi_@jrosser 07:41
hamidlotfi_@admin1 07:41
hamidlotfi_@noonedeadpunk 07:42
hamidlotfi_note:  in the syslog infra alert /etc/haproxy/haproxy.cfg not found!07:46
admin1paste the config and generated haproxy also 07:54
admin1and also check whether you got some special characters in the file during copy/paste, hamidlotfi_07:54
hamidlotfi_please give me more detail07:58
hamidlotfi_If there are special characters in the file, is the error not displayed? 08:01
anskiyhamidlotfi_: as far as I can tell, the error is: `bind openstack.stage.abramad.com:443' : unable to load certificate from file '/etc/haproxy/ssl/haproxy_infra01-openstack.stage.abramad.com.pem`08:02
hamidlotfi_This file exists08:02
hamidlotfi_but I don't understand08:02
jrosserit can't both be true, that the file exists and the error says it doesn't08:09
jrosserif it does exist then you could look at the permissions and see if they make sense08:09
hamidlotfi_What permission should it have?  jrosser 08:12
jrosserit should be readable by the user that the haproxy process runs as08:16
jrosseranskiy: the rabbitmq should not be all shut down at once https://github.com/openstack/openstack-ansible/blob/master/playbooks/rabbitmq-install.yml#L6808:18
hamidlotfi_the two cert files have -rw-r--r-- permissions, isn't that correct?08:18
anskiyjrosser: well, there is this comment: https://github.com/openstack/openstack-ansible/blob/master/playbooks/rabbitmq-install.yml#L43, so I guess this works as intended.08:19
jrosseroh ok, right08:19
anskiybut it should leave one node running...08:20
jrosserwell i don't know what is happening there, but surely we would have a heap of bug reports if that happened to everyone :/08:21
jrosserand rabbitmq_upgrade=true on a minor upgrade, is that because of the rabbitmq/erlang repos moving around?08:22
noonedeadpunkmornings08:22
anskiythat's from the docs :)08:22
anskiybut yeah, it was reinstalling it from novemberrain repos08:22
noonedeadpunkyeah, I think we messed up with rabbit repos a lot lately, so it makes sense to me to run with rabbitmq_upgrade=true. But it's kinda "safe" to do so - we still run with this flag from time to time whenever we hit a weird issue with rabbit, as it's the fastest way to recover08:23
anskiygot it, gonna try to reproduce this thing08:25
jrosserhamidlotfi_: i think you have to just debug what is wrong with haproxy, if the file is there thats OK, if the permissions are reasonable thats OK, but the contents might be somehow broken08:25
jrosseryou can validate the .pem file with the openssl command line tools, or you can delete it and re-run the haproxy playbook which should put them back into place08:26
hamidlotfi_let me check it08:27
jrosseras usual, comparing with an all-in-one build is a quick way to sanity check08:27
halalinoonedeadpunk iirc the rabbitmq playbook with rabbitmq_upgrade=true keeps the rabbitmq cluster up and running while upgrading the other nodes, unless all cluster nodes are DOWN 08:35
damiandabrowskihalali: rabbitmq_upgrade brings all nodes down before starting them08:44
damiandabrowskihttps://opendev.org/openstack/openstack-ansible-rabbitmq_server/src/branch/master/tasks/rabbitmq_post_install.yml#L7308:44
damiandabrowskihttps://opendev.org/openstack/openstack-ansible-rabbitmq_server/src/branch/master/tasks/rabbitmq_restart.yml08:44
damiandabrowskihttps://opendev.org/openstack/openstack-ansible-rabbitmq_server/src/branch/master/tasks/rabbitmq_stopped.yml08:44
damiandabrowskiwe do that because: "Rolling upgrades are possible only between compatible RabbitMQ and Erlang versions."08:46
damiandabrowskihttps://www.rabbitmq.com/upgrade.html#rolling-upgrades08:46
noonedeadpunkhalali: nah, I think at some point it completely shuts down the cluster08:49
noonedeadpunkbut that triggers services to re-connect and re-create queues08:53
anskiycontrol plane services were absolutely fine with that08:54
anskiyand they've done as you described08:54
damiandabrowskias we discussed some time ago, I did some TLS performance tests and created an etherpad containing my findings09:08
damiandabrowskihttps://etherpad.opendev.org/p/openstack-ansible-tls-performance-impact09:08
damiandabrowskiplease have a look when you have some time so maybe we can discuss it during the meeting09:09
halaliOK, I see09:12
noonedeadpunkdamiandabrowski: actually http/2 is really interesting09:20
noonedeadpunkit seems that the only way to use http/2 in python is via hyper though :(09:26
damiandabrowski:/09:30
noonedeadpunkoh, hyper is deprecated...09:30
noonedeadpunkwell, basically because http/2 requires async as well...09:33
noonedeadpunkit's httpx now instead I believe... But still, that will require really _a lot_ of refactoring in each project to get this implemented09:36
kleiniI am stumbling over https://bugs.launchpad.net/designate/+bug/1982252 after upgrade to Yoga. There is no response in that bug for a whole year now.09:56
kleinidnspython used here was upgraded from 1.16.0 (Xena) to 2.1.0 (Yoga). Checking, if that causes the issue.09:59
noonedeadpunkwell, dnspython is part of the upper-constraints, and it's pinned to 2.1.0 there. 12:02
noonedeadpunkSo I'd say it's on designate to fix that...12:02
noonedeadpunkjohnsom: maybe you have any idea about this? ^12:03
adivyahi Team13:21
adivyaI had a query regarding the upgrade: do we have any document for the OS upgrade before doing the actual OpenStack upgrade?13:22
adivyafor example, I am trying to find out whether Ubuntu 18.04 supports the Wallaby OpenStack version13:22
adivyaand if I have to upgrade Ubuntu 18.04 to Ubuntu 20.04, do we need to keep anything in mind, or is there any link you can provide?13:23
NeilHanlonadivya: https://docs.openstack.org/openstack-ansible/latest/admin/upgrades/distribution-upgrades.html13:26
adivyaThank you13:32
admin1hi adivya .. upgrade is straightforward 13:37
admin1as in, the docs work and it's nothing fancy. upgrade, run playbooks .. 13:44
lowercaseshould be noted that when an upgrade is performed, it is expected post-reboot that the service in the venv will no longer work until the playbooks are run again13:47
lowercaseand 13:47
lowercaseClearing out stale information13:47
lowercase    Removing stale ansible-facts13:47
lowercasesection of the docs is quite mandatory13:47
jrosserlowercase: "it is expected post-reboot that the service in the venv will no longer work until the playbooks are run again" <- do you mean when doing an in-place OS version upgrade?13:51
jrosser^ just need to be specific because it's also quite acceptable to reinstall the new OS completely13:52
adivyaok got you14:00
jamesdentondeploying stable/2023.1 with ansible_hardening: false, and hitting an error w/ these pam vars that are actually defined. Seen anything like that? https://paste.opendev.org/show/beE9qzoEIPqXNFuOI2f9/14:08
jamesdentondeploying on Jammy, too.14:08
noonedeadpunkI would move ansible_hardening somewhere else... like to hardening playbook and skip running role at all if it's defined14:29
jamesdentonwell, this is normal setup_hosts.yml play w/ apply_security_hardening set to false. Lemme set to True and see what changes14:31
jamesdentonthe role should be skipped but all tasks are executed anyway (but skipped). 14:32
jamesdentonWhen set to True, tasks execute successfully14:34
noonedeadpunkjamesdenton: aha, well, then `when` is likely not applicable here https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/security-hardening.yml#L4014:44
noonedeadpunkSo it should be `tasks: include_role: ansible-hardening when: apply_security_hardening | bool`14:45
jamesdentonok, i can try that14:45
jamesdentoni will spin up an AIO to test that14:45
jamesdentonthank you14:45
noonedeadpunkI guess smth has changed with recent ansible versions14:46
noonedeadpunkand everyone runs this hardening, so...14:46
jamesdentonmaybe.. i used this configuration fairly recently w/ Zed on 22.04, i thought anyway14:46
jamesdentonyes, i disable it for labs since it adds time14:47
jamesdentonit's strange that it couldn't find the vars, though14:47
noonedeadpunkwell, it's not, as these vars are included inside role15:00
noonedeadpunkthis is not in defaults or vars/main15:00
noonedeadpunkand all tasks are skipped, so vars/debian.yml is simply not loaded15:00
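A minimal sketch of the change suggested above, assuming a placeholder play name and host pattern (neither is copied from security-hardening.yml). With a task-level `include_role`, the `when` condition is evaluated on the include itself, so a false value means the role is never entered and nothing depends on variables that only the role's own tasks would load:

```yaml
# Sketch only: the play name and hosts pattern are placeholders.
- name: Apply security hardening (illustrative)
  hosts: all
  tasks:
    - name: Include the hardening role only when it is enabled
      ansible.builtin.include_role:
        name: ansible-hardening
      # Evaluated once, on the include statement, not on every task in the role
      when: apply_security_hardening | bool
```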
noonedeadpunk#startmeeting openstack_ansible_meeting15:00
opendevmeetMeeting started Tue Jun 27 15:00:54 2023 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:00
noonedeadpunk#topic rollcall15:00
noonedeadpunko/15:01
damiandabrowskihi!15:01
jrossero/ hello15:03
NeilHanlono/15:03
NeilHanlonsorta around. doing some errands15:03
noonedeadpunk#topic office hours15:04
mgariepyo/15:05
noonedeadpunkI don't have big agenda for today. I guess mainly we should land some backports to 2023.1 and make new bugfix release https://review.opendev.org/q/parentproject:openstack/openstack-ansible+branch:%255Estable/2023.1+status:open+15:06
noonedeadpunkAs most nasty thing is that I forgot to update openstack-ansible-plugins version in a-c-r15:06
noonedeadpunkso heat is going to fail15:06
noonedeadpunkalso gnocchi is known to be broken, but I have no idea what we can do with that15:06
noonedeadpunkas constraints are not respected when project has pyproject.toml15:07
jamesdentono/15:09
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_cinder stable/2023.1: Use v3 service type in keystone_authtoken config  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/88705715:09
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_cinder stable/zed: Use v3 service type in keystone_authtoken config  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/88705815:09
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_cinder stable/yoga: Use v3 service type in keystone_authtoken config  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/88705915:09
jrosserwe need to clean up the cinder role15:10
jrosserlots of v1/v2 stuff in there15:10
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.1: Ensure management_address is used instead of ansible_host  https://review.opendev.org/c/openstack/openstack-ansible/+/88706015:11
noonedeadpunkyup - that's really good call15:12
noonedeadpunkand I guess we kinda need to review patches for making tls/internal tls as default15:14
noonedeadpunkI'm personally reluctant to vote on that, because I don't really have any strict opinion on that15:14
noonedeadpunkI'm not sure if it's good default or not15:14
damiandabrowski#link https://etherpad.opendev.org/p/openstack-ansible-tls-performance-impact15:15
noonedeadpunkand this is actually good work and smth to think about15:15
damiandabrowskiafter my benchmarks, i also don't have a strong opinion15:15
noonedeadpunkI will add the topic for next TC meeting (not the one that will be in 2 hours, but next week)15:15
noonedeadpunkTo see what they think about http/2 and if it's time for openstack to adopt it15:16
noonedeadpunkbut I see a tremendous amount of work that would be required, which is probably the main blocker15:17
noonedeadpunkand yeah, not having TLS on the internal VIP makes quite a big difference compared to having TLS enabled on it15:18
noonedeadpunkand like almost 30% difference between current default and suggested one, if I'm right?15:18
noonedeadpunk60s vs 88s15:19
jrosseridk what the other tools do for this15:19
jrosserif we are different by having tls or by not having it15:19
damiandabrowskinoonedeadpunk: yeah, but I can't explain why enabling TLS on backend doesn't make any difference while for haproxy it does15:20
noonedeadpunkjrosser: not sure I got your point? as I guess as long as we test both we should be good?15:22
spatelFolks! I am trying to run the OSA stack inside an lxd container for lab/stage/testing but it looks like that isn't supported; I hit this error when running setup-host.yml - https://paste.opendev.org/show/bsBRasNMflOnflDb68bm/15:23
spatelany workaround ?15:24
jrosseri mean if the default for the other tools is to do TLS then that says that the lower performance might be seen as acceptable already15:25
noonedeadpunkdoes kolla enforce internal tls?15:25
noonedeadpunk(I don't know to be frank)15:26
jrosserme neither - thats why it would be interesting to see what the other perspectives are15:26
noonedeadpunkspatel: do you know how things are with tls in kolla world?:)15:27
jrosserspatel: in an LXD you can't do anything with the kernel really, so you need to disable those tasks, look at the code and the vars to make some overrides15:28
noonedeadpunkregarding your question - this specific issue can be overcome by defining `openstack_host_specific_kernel_modules: []` but I think you will fail in soooo many places, that I don't find it being feasible to run inside container15:28
spatelI mostly keep TLS disabled but it does have support to encrypt all traffic using haproxy - https://docs.openstack.org/kolla-ansible/latest/admin/tls.html15:28
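The override mentioned above could be placed in the deployment's user variables, assuming the usual `/etc/openstack_deploy/user_variables.yml` location. This is only a sketch: it just empties the list of host kernel modules, and as noted, other tasks are still likely to fail inside a container:

```yaml
# /etc/openstack_deploy/user_variables.yml (assumed location)
# Empty the list of host kernel modules for lab use in a container.
openstack_host_specific_kernel_modules: []
```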
noonedeadpunkjrosser: default is `no` https://opendev.org/openstack/kolla-ansible/src/branch/master/ansible/group_vars/all.yml#L834-L84015:29
spateljrosser just disabled that task and re-running it.. Hope we can make it variable to make it workable on LXD playground 15:29
mgariepyspatel, lxc --vm ?15:30
spatelYes, running whole stack inside LXD to mimic production 15:30
noonedeadpunkyeah, lxd can manage LVM15:30
noonedeadpunkbrrrrrrr15:30
mgariepyvm.. lvm meh15:30
noonedeadpunk*KVM15:30
spatelIts quick to spin up and testing 15:30
noonedeadpunkspatel: yeah, but it can be proper VM rather than lxc container15:31
spatelLVM for cinder correct but we can use physical host for LVM support we don't need that inside LXD 15:31
spatelcurrently my dev/stage environment is running inside VMware VMs, which are very hard to set up and destroy.. I want something quick and automated, and LXD is very quick and easy15:32
noonedeadpunkthe problem with lxc containers, is that you can't manage a lot of things, including time, kernel modules, firewall?, devices15:33
noonedeadpunk(probably you can have firewall if proper modules are loaded though)15:34
noonedeadpunkspatel:  https://ubuntu.com/blog/lxd-virtual-machines-an-overview15:34
noonedeadpunkso spawning proper KVM VM is quite as trivial as lxc container IMO15:35
spatelHmmm! 15:35
damiandabrowskimaybe we just found a volunteer who can work on https://github.com/openstack/openstack-ansible-ops/tree/master/multi-node-aio ? :D 15:36
noonedeadpunkreturning back to tls - I would leave default as is, but improve testing whenever possible15:36
noonedeadpunkhehe15:36
mgariepyonly need to add --vm to your lxc launch command 15:36
noonedeadpunkexactly ^15:36
spatellol15:37
damiandabrowskiokay, so keep tls disabled for now but implement 'tls-transition' scenario anyway, right?15:37
spatelmgariepy let me try.. --vm15:38
noonedeadpunkyeah, we must test it anyway imo15:39
noonedeadpunkmaybe also document better on how to enable/switch to TLS and possible performance degradation?15:40
jrosseri think i will be switching to tls15:40
mgariepyi'll too.15:41
damiandabrowskiwe will switch to tls as well(at least in some regions)15:41
jrosserit's just on * everywhere here so my openstack is a pretty big outlier15:41
mgariepybut i'm pretty low on api calls so i don't expect it to cause much issue 15:41
noonedeadpunkbut I kinda feel extra complexity by this as default especially for beginners or those who don't care a lot as network is internal15:42
noonedeadpunkso it kinda pretty much depends on usecases and regulations15:42
damiandabrowskibut if we see ~30% degradation on rally, maybe it's indeed better to keep it disabled by default15:42
noonedeadpunk(and existence of quantum computers)15:42
spatelmgariepy that works!! --vm15:43
noonedeadpunkI don't think we have too much complexity with our implementation which we don't want to carry for some period of time15:45
noonedeadpunksince now we just rely on haproxy configuration at playbook runtime, this extra complexity for tcp is not gigantic anymore15:47
damiandabrowskibut this '--vm' parameter is interesting(didn't know about it before)16:03
damiandabrowskido I understand correctly that if we implement LXD support at some point, it will be much easier to spin up multi-node-aio?16:03
damiandabrowskias we can skip all virsh/pxe tasks then16:06
spateldamiandabrowski let me spin up my lab and I will give you feedback on how it goes, but I agree with you that LXD is much faster and easier if it works with OSA 16:14
opendevreviewMerged openstack/openstack-ansible stable/2023.1: Remove other releases from 2023.1 index page  https://review.opendev.org/c/openstack/openstack-ansible/+/88492116:16
damiandabrowskii'm not sure if it's faster, but for ex. it has a proper tooling for image management. But I think that requirement to install LXD from snap successfully prevented us from switching to it so far16:17
damiandabrowskinoonedeadpunk: endmeeting? ;)16:18
noonedeadpunk#endmeeting16:18
opendevmeetMeeting ended Tue Jun 27 16:18:34 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:18
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-06-27-15.00.html16:18
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-06-27-15.00.txt16:18
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-06-27-15.00.log.html16:18
noonedeadpunkyes...16:18
noonedeadpunksry16:18
opendevreviewJames Denton proposed openstack/openstack-ansible master: Use include_role in task to avoid lack of access to vars  https://review.opendev.org/c/openstack/openstack-ansible/+/88708216:25
spatelHow do I use br-mgmt for the vxlan tunnel? 16:58
spatelI don't want to create br-vxlan dedicated bridge16:58
noonedeadpunkyou don't need to16:58
noonedeadpunkeventually, you need just consistent interface name for vxlan16:59
noonedeadpunkwith IP on it16:59
spatel? 17:00
spatelI have only two interfaces so thinking br-mgmt I can use for vxlan 17:00
spateland br-vlan for provider network 17:01
opendevreviewMerged openstack/openstack-ansible-galera_server master: Add optional compression to mariabackup  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/88618017:34
opendevreviewMerged openstack/openstack-ansible-ceph_client stable/2023.1: Fix retrievement keyrings from files  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/88647717:46
opendevreviewMerged openstack/openstack-ansible-ceph_client stable/2023.1: Fix permissions for ceph cache directories  https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/88647117:46
admin1spatel, you can use kvm with the openstack backign image ( just like how nova does it ) to quickly bring up dev boxes 18:13
admin1backing ! 18:13
admin1my dev one takes less than 1 min to boot, and it has all the networks etc ready .. ips are pre assigned via vyos 18:13
admin1all i need to do is run one small post-install ansible and the netplan is populated with the right bridges etc 18:14
admin1in less than 5 mins, the whole dev env is ready for a new openstack build 18:14
spatelgive me recipe of KVM and spin up environment 18:14
admin1let me do some cleanup  and put it in github 18:15
spatelcool18:17
admin1https://gist.githubusercontent.com/a1git/871420b52587c609b5d2d24d6e204869/raw/960ffa9f595133ed34709a1353f12eb1869cca9d/gistfile1.txt  -- something like this 18:23
admin1jammy.img is the same image you download for customers for openstack 18:23
admin1you can rewrite the mac to make it anything to get the same IP from dhcp, or remove it to get diff ones 18:24
spateladmin1 Thanks! but what about setting up networking for br-mgmt/br-vlan/br-vlan etc.. 18:44
spatelwe need internal bridge to communicate between multiple VM for multi-node deployment 18:44
spatelAIO is easy but multi-node setup not 18:46
admin1yes, you create netplans on the host and the guest 18:58
spatelhow will vxlan talk to the other VMs? 18:59
admin1in your host, you just add host-vxlan as bridge 18:59
admin1and then pass that as interface on br-vxlan to the hosts18:59
spatelwe have to create bridge etc.. and connect vm to those bridge to make communication work18:59
admin1yes 18:59
spatelthat's the sort of example I am looking for :)19:00
admin1this is an old version, but you get the idea 19:01
admin1https://gist.githubusercontent.com/a1git/c15ecbc87738d9d8390e6477d497c4c0/raw/032de84cee842201ecdc09bf47cb8e148f0df5e3/gistfile1.txt19:01
jamesdentonthere doesn't need to be a vxlan bridge, necessarily, or a dedicated vlan for vxlan traffic. Only an IP on each host that can be used for point-to-point vxlan traffic19:06
jamesdentonthat could be the IP on br-mgmt if you wanted it to19:06
jamesdentonit's just that the reference architecture uses a dedicated VLAN/IP for overlay (vxlan) traffic, and a bridge exists because of legacy reasons, connecting the old neutron-agent container to the host19:07
admin1in my case, i wanted to test cloud-connect feature 19:08
admin1which is a vpn for customer to directly plug into their vxlan 19:09
admin1which is why i put a host bridge 19:09
admin1also to tcpdump from host and check for traffic, sflow testing19:09
jamesdentonbecause it needs to be reachable by some other external host?19:09
admin1this is an all in one dev setup where i test various scenarios 19:09
admin1and one is cloud-connect 19:09
jamesdentonok19:10
jamesdentonnot familiar w/ that19:10
admin1which is you offer vpn to customer or their own l2 connect which will plugin to their internal networks directly 19:10
lowercaseguys im getting, 0x8024401719:12
lowercasewrong chat sry19:12
spateljamesdenton you are saying if i don't have br-vxlan then it will use br-mgmt for vxlan?19:12
jamesdentonIIRC it may default to ansible_host, which is likely br-mgmt19:13
jamesdentonyou can certainly try it. Ultimately, the IP srt at local_ip in the ml2/ovs/lxb config files is what is used19:14
jamesdenton*the IP used in local_ip19:14
spateljamesdenton I will give it a try 19:14
jamesdentonso, the playbooks do their best to determine what that needs to be based on the overrides and openstack_user_config19:14
spatelI didn't know that br-vxlan is optional 19:14
spatel+119:15
jamesdentonWell,  don't want to say it's optional but it's not necessarily required? Opposing statements, I know19:15
spatelI know for production its important but for test/lab its not 19:16
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible master: Add 'tls-transition' scenario  https://review.opendev.org/c/openstack/openstack-ansible/+/88519420:55
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible master: Allow to update AIO config prior to an upgrade  https://review.opendev.org/c/openstack/openstack-ansible/+/88519021:05
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible stable/2023.1: Add support for 'tls-transition' scenario  https://review.opendev.org/c/openstack/openstack-ansible/+/88711821:12
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible master: Add 'tls-transition' scenario  https://review.opendev.org/c/openstack/openstack-ansible/+/88519421:18
opendevreviewDamian Dąbrowski proposed openstack/openstack-ansible master: Add 'tls-transition' scenario  https://review.opendev.org/c/openstack/openstack-ansible/+/88519421:26