Wednesday, 2022-02-09

opendevreview: Merged openstack/openstack-ansible-galera_server master: Convert xinetd clustercheck to systemd socket service  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/824042 [00:44]
*** dviroel|ruck|afk is now known as dviroel|ruck [00:48]
*** dviroel|ruck is now known as dviroel|ruck|out [00:57]
*** dviroel|ruck|out is now known as dviroel|out [00:57]
opendevreview: Bhagyashri Shewale proposed openstack/openstack-ansible-os_tempest master: Move zuul jobs layout to centos9 only for master branch  https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/828449 [03:27]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_nova master: Drop nova_glance_api_servers variable  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/828460 [06:55]
jrosser: calico is broken on victoria "oslo_config.cfg.NoSuchOptError: no such option report_interval in group [AGENT]" [07:00]
noonedeadpunk: I'd say it's broken everywhere. Just NV now [07:02]
noonedeadpunk: I was trying to dig one day but didn't find where it gets (or it was some lazy loading with no way to overcome) [07:03]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-os_neutron stable/victoria: Remove legacy centos-8 jobs  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/827483 [07:03]
jrosser: maybe time to think if we keep support or not [07:04]
noonedeadpunk: the only occurrence was https://opendev.org/openstack/openstack-ansible-os_neutron/src/branch/master/templates/metering_agent.ini.j2#L15 but even dropping this file didn't help [07:04]
jrosser: it is not really compatible with internal VIP ssl either [07:04]
noonedeadpunk: I tend to agree here [07:04]
jrosser: because of instances wanting metadata on http and calico not running an haproxy for metadata [07:04]
noonedeadpunk: I think ovn kind of same? [07:05]
jrosser: potentially, i really don't know much about it [07:06]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Rename RBD cinder backend  https://review.opendev.org/c/openstack/openstack-ansible/+/828463 [07:11]
noonedeadpunk: but calico interest is really limited I believe. [07:11]
noonedeadpunk: well, evrardjp was talking about it recently, so likely need to double check before saying for sure :) [07:12]
jrosser: ok well like all this stuff it needs maintenance effort [07:35]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_nova master: Remove secure_proxy_ssl_header logic  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/828467 [07:42]
noonedeadpunk: I think this needs to be double checked as maybe we need to just apply logic in other place ^ [07:42]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Switch keystone logging to syslog  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/828469 [07:58]
jrosser: i'm getting good value out of the infra scenario tests for the ssh keypairs stuff [08:34]
jrosser: its already testing the repo sync as part of that so shows up some bugs on centos-8 [08:35]
noonedeadpunk: who was surprised about centos-related hickups [08:40]
noonedeadpunk: *hiccups [08:40]
*** sshnaidm|afk is now known as sshnaidm [08:54]
opendevreview: Merged openstack/openstack-ansible-openstack_hosts stable/victoria: Assume centos version is at least 8.3  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/828346 [10:06]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Use uwsgi role for keystone  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/828510 [10:10]
opendevreview: Merged openstack/openstack-ansible-lxc_hosts stable/xena: Replace CentOS 8 with Stream jobs  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/828095 [10:21]
opendevreview: Merged openstack/openstack-ansible-lxc_hosts stable/wallaby: Ensure that the legacy network-scripts package is present  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/828236 [10:27]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_horizon master: Move Listen definition to VHosts  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/828515 [10:49]
opendevreview: Merged openstack/openstack-ansible stable/xena: Fix additional facts gathering in ceph-install.yml  https://review.opendev.org/c/openstack/openstack-ansible/+/828392 [11:10]
*** dviroel|out is now known as dviroel|ruck [11:10]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Define X-Forwarded-Proto for keystone  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/828518 [11:19]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Drop ProxyPass out of VHost  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/828519 [11:44]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_horizon master: Move Listen definition to VHosts  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/828515 [11:49]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Do not run rsyslog against RabbitMQ  https://review.opendev.org/c/openstack/openstack-ansible/+/826347 [12:29]
noonedeadpunk: would be awesome to get another review on https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/826338/ :) [12:30]
*** akahat|rover is now known as akahat|PTO [14:11]
jrosser: is this a thing? lsyncd[7554]: rsync: failed to open "/var/www/repo/repo_prepost_cmd.sh", continuing: Permission denied (13) [14:23]
opendevreview: Merged openstack/openstack-ansible-lxc_hosts stable/wallaby: Replace CentOS 8 with Stream jobs  https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/827966 [14:28]
jrosser: oh that's confusing, lsyncd writes some stuff to the journal and most of it to /var/log/lsyncd/lsyncd.log [14:29]
noonedeadpunk: whaaat [14:54]
jamesdenton: good morning [14:55]
jrosser: o/ hello [14:55]
damiandabrowski[m]: hey! [14:55]
jamesdenton: my bouncer died, and i didn't really notice [14:55]
jamesdenton: :| [14:56]
jamesdenton: anything new? [14:57]
jrosser: well i would make some centos related comment, but that's just nothing new :) [14:59]
jrosser: this maybe https://review.opendev.org/c/openstack/openstack-ansible/+/828386 [14:59]
jrosser: ^ that blew up quite badly on stable branches [14:59]
jamesdenton: hmm [15:00]
noonedeadpunk: should we wait for master patch before merging it? [15:02]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible stable/xena: Remove enablement of neutron tempest plugin in scenario templates  https://review.opendev.org/c/openstack/openstack-ansible/+/828548 [15:02]
jrosser: tada! [15:02]
jamesdenton: was it some particular test causing issues? [15:03]
noonedeadpunk: it was like neutron-lib and tempest plugin being incompatible I guess [15:03]
noonedeadpunk: as it didn't even come to tests) [15:04]
jrosser: it installed master version of the plugin which then tries to test non existing things in older neutron iirc [15:04]
jamesdenton: i don't really know how their tags work, seems like the latest one stops ~train [15:05]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-os_neutron stable/xena: DNM - test https://review.opendev.org/c/openstack/openstack-ansible/+/828548  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/828549 [15:07]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible stable/xena: Remove enablement of neutron tempest plugin in scenario templates  https://review.opendev.org/c/openstack/openstack-ansible/+/828548 [15:09]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible stable/wallaby: Remove enablement of neutron tempest plugin in scenario templates  https://review.opendev.org/c/openstack/openstack-ansible/+/828551 [15:10]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-os_neutron stable/xena: DNM - test https://review.opendev.org/c/openstack/openstack-ansible/+/828548  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/828549 [15:10]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-os_neutron stable/wallaby: DNM - test https://review.opendev.org/c/openstack/openstack-ansible/+/828551  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/828552 [15:12]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Remove enablement of neutron tempest plugin in scenario templates  https://review.opendev.org/c/openstack/openstack-ansible/+/828553 [15:47]
spatel: jamesdenton around? [16:00]
jamesdenton: yes [16:00]
spatel: I have question related STP enable/disable on bridge with ubuntu netplan - https://paste.opendev.org/show/bzIbTv4XYyKh6oySYFfI/ [16:01]
spatel: brctl show - saying STP is not enabled [16:01]
spatel: netplan - default config saying stp is enabled [16:01]
spatel: netplan doc saying STP is enabled by default [16:02]
spatel: how should i prove that it's really really disabled [16:02]
jamesdenton: hmm, you might try 'bridge -d link show <br>' [16:03]
jamesdenton: i believe 'state' reflects STP state [16:04]
spatel: here is the output - https://paste.opendev.org/show/bFHvFO0JBEgn5LzVmOKn/ [16:05]
spatel: trying to understand what flag indicates stp is active [16:06]
spatel: learning on flood on ??? [16:06]
spatel: state forwarding priority 32 cost 2 [16:07]
spatel: does that mean STP is enabled? [16:07]
spatel: jamesdenton we had network loop and i believe this could be the issue.. [16:08]
NeilHanlon: spatel: if state is anything but 0 (DISABLED), then STP is enabled [16:12]
NeilHanlon: `state forwarding` is spanning tree forwarding [16:12]
spatel: hmm very odd then.. [16:13]
NeilHanlon: if you're using a bridge with two interfaces, or if you bridged two interfaces on the same LAN, then you can cause loops, yes [16:14]
spatel: neutron create tap interface they are always showing STP on [16:14]
spatel: i do have bond interface active-backup mode [16:14]
NeilHanlon: The best thing to do is to never flood BPDUs to the devices unless you have to for some reason [16:14]
spatel: what is the best practice to disable STP for everything on compute node? [16:15]
spatel: if i disable STP on bond0 then it should disable underlying bridges/vlans or not? [16:18]
jrosser: here's a little something we cooked up with openstack-ansible https://superuser.openstack.org/articles/environmental-reporting-dashboards-for-openstack-from-bbc-rd/ [16:20]
jamesdenton: spatel it's probably in your best interest to leave the default (stp on) [16:21]
spatel: hmm [16:22]
jamesdenton: i wouldn't trust brctl for accurate info, i think it was deprecated a while back in favor of iproute2 (bridge) [16:22]
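One more way to cross-check the question above, independent of brctl, might be to read the bridge-level STP setting that the Linux kernel exposes in sysfs. The sketch below assumes the bridge has a /sys/class/net/NAME/bridge/stp_state attribute (0 = off, 1 = kernel STP, 2 = userspace STP); the function names are hypothetical and not part of any tool mentioned in this log. Note this is the bridge-wide setting, which is separate from the per-port `state` field printed by `bridge -d link show`.

```python
from pathlib import Path

# Meaning of the values found in the kernel's bridge/stp_state attribute.
STP_STATES = {0: "disabled", 1: "kernel STP", 2: "userspace STP"}


def describe_stp_state(raw: str) -> str:
    """Translate the raw content of stp_state into a readable label."""
    value = int(raw.strip())
    return STP_STATES.get(value, f"unknown ({value})")


def read_stp_state(bridge: str, sysfs_root: str = "/sys/class/net") -> str:
    """Read and describe the STP setting of one bridge (hypothetical helper)."""
    path = Path(sysfs_root) / bridge / "bridge" / "stp_state"
    return describe_stp_state(path.read_text())


if __name__ == "__main__":
    import sys

    # Usage: python3 stp_state.py br-vlan br-mgmt
    for name in sys.argv[1:]:
        print(name, read_stp_state(name))
```

Running it against each bridge on a compute node would show whether netplan actually turned STP on, regardless of what brctl reports.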
spatel: we have noticed one of our compute node lock up because of memory and same time switch block entire vlan on that rack [16:23]
spatel: now i started thinking about STP in bridge.. maybe it created some kind of loop because i have bond interface and if stp is enabled then it will do damage correct? [16:23]
jamesdenton: nice article jrosser [16:24]
spatel: i don't know i am just making up some story [16:24]
jrosser: jamesdenton: thank you :) [16:24]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-plugins master: Add ssh_keypairs role  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/825113 [16:40]
NeilHanlon: spatel: linux will dutifully process and flood Spanning Tree Bridge Protocol Data Units (BPDUs) out other interfaces in a bridge--that's what it's supposed to do because it has to ensure that the data is flooded through the entire tree. I've seen (and caused) broadcast storms due to this exact thing ;) [16:40]
opendevreview: Merged openstack/openstack-ansible master: Remove symlinking of selinux libraries into the ansible-runtime venv  https://review.opendev.org/c/openstack/openstack-ansible/+/827556 [16:40]
spatel: I do have BPDU-Protection on my edge interface of switch but still not sure what happened to that box when it crashed [16:41]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-repo_server master: Use ssh_keypairs role to generate keys for repo sync  https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/827100 [16:42]
spatel: looking for some kernel watchdog config if kernel shutdown machine during any crash then it would be good [16:42]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-os_nova master: Use ssh_keypairs role to generate cold migration ssh keys  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/825306 [16:44]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-os_keystone master: Use ssh_keypairs role to generate fernet sync ssh keys  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/827090 [16:45]
jamesdenton: sorry spatel - just finished digging myself out of a hole i created with OVN. [19:46]
spatel: :) tell me the story [19:47]
spatel: jamesdenton ^ [19:48]
spatel: I am running OVN on production so i would like to know that [19:49]
jamesdenton: i swapped out a node but kept the same name/ips/etc [19:52]
jamesdenton: chassis id changed [19:53]
jamesdenton: and the new node didn't rejoin the cluster properly [19:53]
jamesdenton: i have notes but it's a mess. would likely be better off recreating the situation and walk through the fix properly [19:53]
spatel: hmm... that is interesting.. [19:57]
spatel: worth testing in lab and see.. [19:58]
spatel: did you try this - https://github.com/amorenoz/ovsdb-mon [20:02]
spatel: this is good tool for debug OVN [20:02]
spatel: i am planning to play and see how we can make things easy [20:02]
noonedeadpunk: debug OVN sounds like pain.... [21:18]
noonedeadpunk: have huge concerns about it from operations perspective... [21:18]
noonedeadpunk: oh, btw, spatel do you run ovn already somewhere in prod?:) [21:18]
spatel: I told you i am deploying HPC on openstack so that is where i am running OVN [21:19]
spatel: it has 30 compute nodes and yes it's production [21:19]
noonedeadpunk: mmm, and do you use tenant routers there?:) [21:20]
spatel: OVN is not that bad only problem is we don't have enough knowledge to debug and fix quickly :( [21:20]
spatel: Yes we do tenant router and VxLAN etc.. [21:21]
noonedeadpunk: does it break ? :D [21:21]
noonedeadpunk: So eventually why I'm asking - I'm super unhappy about l3 routers with ovs [21:21]
noonedeadpunk: it's really a pita to do maintenance on net nodes [21:21]
noonedeadpunk: But ovn doesn't have net nodes as concept:) [21:22]
noonedeadpunk: as it's DVR [21:22]
noonedeadpunk: But not sure if it made things less painful [21:22]
spatel: Yes OVN doesn't have net node and it works smooth [21:22]
spatel: I am running in HA mode so if node is down it will automatically shift load to next machine.. [21:23]
noonedeadpunk: like we recently had big issues with l3s just because of rabbit fallen apart... [21:23]
spatel: what is the connection with rabbit? [21:23]
damiandabrowski[m]: noonedeadpunk: thanks for reminding me about it, now I'll have nightmares :D [21:24]
noonedeadpunk: you see ?:) [21:24]
noonedeadpunk: ah, always welcome damiandabrowski[m]! [21:24]
noonedeadpunk: I actually already know why that all happened :D [21:24]
spatel: OVN is very simple compared to traditional L3 deployment in namespace :) [21:25]
noonedeadpunk: (kidding) [21:25]
noonedeadpunk: spatel: so I'm mainly concerned if things won't go south for example when ovs package got updated or glibc [21:25]
noonedeadpunk: (connection to rabbit btw is that l3 when losing connection for $timeout starts re-syncing and causes tons of other issues) [21:26]
spatel: hmm the beauty of OVN is it has zero dependency with rabbitMQ [21:27]
spatel: noonedeadpunk agreed upgrading stuff in OVN not great i would say.. but again we need to keep doing otherwise never going to learn :( [21:27]
noonedeadpunk: yeah, I know... [21:27]
spatel: just need to push hard [21:28]
noonedeadpunk: So I was more kind of interested if you're happy overall comparing to your ovs setup with dpdk , blackjack and... you know:) [21:29]
spatel: I can see people developing tools to debug OVN so that is good [21:29]
noonedeadpunk: well, when having tool is only option for debug... [21:29]
spatel: i stopped using dpdk :( i didn't see any performance gain [21:29]
noonedeadpunk: so just regular ovs? [21:30]
spatel: Yes OVN+OVS [21:30]
spatel: I found that unless you run a DPDK-aware application there is no advantage :( [21:31]
noonedeadpunk: I see [21:31]
spatel: i did lots of loadtesting and result is same DPDK vs non-DPDK [21:31]
spatel: because VM virtio is not going to improve performance just because you are running OVS+DPDK on host [21:32]
spatel: no one can beat SRIOV that is fact [21:32]
noonedeadpunk: likely also depends on network cards, as modern ones cover gap with offloading [21:33]
spatel: noonedeadpunk also i have successfully setup my infiniband network to run MPI job :) [21:33]
noonedeadpunk: oh! [21:33]
spatel: i did pass through Mellanox to vm and then my VM able to see VF and i successfully run MPI job [21:33]
noonedeadpunk: has it worked out as you expected with subnet manager ? :D [21:34]
spatel: i am able to get 100Gbps inside VM [21:34]
spatel: Yes i configured subnet manager inside infiniband switch :) [21:34]
spatel: soon i am going to write up my blog about ib fun [21:34]
noonedeadpunk: IB always fun. I'm glad I'm not dealing with it anymore :D [21:35]
spatel: I am not doing any IPoIB stuff [21:35]
noonedeadpunk: Oh, yes, that's actually nice thing [21:35]
noonedeadpunk: as otherwise it's nightmare [21:35]
spatel: I am getting 100Gbps speed between two VM so that is awesome :) [21:35]
noonedeadpunk: also - don't install any ceph packages with OFED :D [21:35]
spatel: hmm what do you mean ? [21:36]
noonedeadpunk: yeah, I can imagine. I had only 40Gbps with rubbish ConnectX-2 and then upgraded to ConnectX-3 Pro that were soooo amazing back then:) [21:37]
noonedeadpunk: If you upgrade OFED for example, it will drop all ceph packages on host [21:37]
spatel: I have ConnectX-5 [21:37]
noonedeadpunk: so if you were running OSD node.... [21:37]
noonedeadpunk: As there's some dependency on ubuntu between ofed built packages and ceph-common [21:38]
noonedeadpunk: maybe it's fixed today... [21:38]
spatel: I have noticed when i install OFED then it does compile module for kernel and upgrade kernel also [21:38]
noonedeadpunk: yeah, with dkms usually... [21:38]
spatel: maybe because of that ceph doesn't like it [21:38]
noonedeadpunk: it was more about package cross-dependency I guess... but yeah. dunno how valid that is nowadays [21:39]
spatel: I don't have ceph storage in this environment (I do have glusterFS ) [21:39]
noonedeadpunk: yeah, I do recall [21:40]
spatel: in each compute node we have 384GB memory :D [21:41]
spatel: I think most costly openstack i have ever built [21:41]
noonedeadpunk: heh, yeah, tiny computes :D [21:41]
spatel: 15 Tesla GPU each cost $20,000 around [21:41]
noonedeadpunk: btw [21:42]
noonedeadpunk: you just passthrough tesla inside vms? [21:42]
spatel: from 64GB to 384G is big deal for me.. hehe [21:42]
spatel: Yes i did passthrough [21:42]
noonedeadpunk: and you don't do licensing? OR you don't use cuda? [21:42]
spatel: We don't have license :( [21:43]
spatel: This HPC is for research and not for public service so we don't need virtualization [21:43]
spatel: I can understand for public cloud [21:43]
noonedeadpunk: Well it was more about some confusion coming from https://docs.nvidia.com/grid/13.0/grid-licensing-user-guide/index.html#software-enforcement-grid-licensing [21:44]
noonedeadpunk: `When licensing is enforced through software, the performance of the virtual GPU or physical GPU is degraded over time if the VM fails to obtain a license.` [21:44]
spatel: hmm [21:44]
noonedeadpunk: and just in previous paragraph they say `GPU pass through for compute-intensive virtual servers requires vCS` [21:45]
spatel: hehe.. [21:45]
spatel: are you guys running GPU in your cloud? [21:45]
noonedeadpunk: I bet with T4 I was also passing through without any issues, but I'm not sure if they were working inside VMs tbh [21:45]
spatel: hmm [21:46]
noonedeadpunk: but yeah, likely it only raises when gridd is installed on compute node [21:46]
noonedeadpunk: but it's confusing... [21:47]
spatel: i am also new in GPU and so learning for me [21:47]
spatel: i found a bug in OSA /etc/hosts file [21:48]
spatel: it has container name with _ underscores [21:48]
spatel: that is not valid hostname for /etc/hosts file [21:48]
spatel: https://paste.opendev.org/show/bMFRNBU2jhEOgbi6k0ov/ [21:49]
spatel: not sure if it has been fixed in Xena but i am seeing error in wallaby [21:50]
noonedeadpunk: we haven't changed it for a while now [21:50]
noonedeadpunk: https://opendev.org/openstack/openstack-ansible-openstack_hosts/src/branch/master/tasks/openstack_update_hosts_file.yml < that is responsible for generating [21:50]
noonedeadpunk: so likely it comes from `hostvars[item]['ansible_facts']['hostname']` [21:51]
spatel: yes.. during debug i saw lots of errors in logs saying invalid hostname so i freaked out and noticed this issue [21:51]
noonedeadpunk: I have that everywhere on V as well [21:51]
spatel: we should fix it (no rush but) just noticed [21:52]
noonedeadpunk: I haven't seen issues in logs though( [21:52]
spatel: i have seen in /var/log/syslog file [21:52]
spatel: maybe during reboot of system [21:53]
noonedeadpunk: yeah, might be [21:53]
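For reference on the underscore point above, here is a minimal sketch of a hostname check following the RFC 952 / RFC 1123 rules (labels of letters, digits and hyphens, 1 to 63 characters, not starting or ending with a hyphen). The function name is hypothetical and this is not code from the openstack_hosts role; the sample name is the container name that appears in a Zuul log path later in this channel.

```python
import re

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_hostname(name: str) -> bool:
    """Return True if name is a syntactically valid RFC 1123 hostname."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing dot (FQDN form)
    if not name or len(name) > 253:
        return False
    return all(_LABEL.match(label) for label in name.split("."))


if __name__ == "__main__":
    for candidate in ("aio1_repo_container-a31176e7",
                      "aio1-repo-container-a31176e7"):
        print(candidate, is_valid_hostname(candidate))
```

The underscore form is rejected while the hyphenated form passes, which matches the "invalid hostname" complaints described above.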
spatel: did you work on openstack masakari ? [21:54]
noonedeadpunk: I did [21:54]
spatel: i am looking for HA solution for some critical application [21:54]
noonedeadpunk: I want to add it to current workloads as well [21:55]
spatel: How does it work and how good is it? [21:55]
spatel: last week one of my vm went down which breached SLA :( [21:55]
noonedeadpunk: well I never really used instancemonitor tbh [21:55]
noonedeadpunk: But it would help with that I believe [21:56]
spatel: i am planning to play with this in LAB to test out and see how we can use it to improve SLA [21:56]
spatel: I don't have shared storage, does it need one? [21:56]
spatel: currently i have developed IP_TAKE_OVER.sh script [21:57]
noonedeadpunk: all depends, you know. So instancemonitor tracks vm by virsh log and if it sees VM down tries to re-spawn it locally first [21:57]
noonedeadpunk: if not - tries evacuate iirc. [21:57]
spatel: whenever vm down or anything happened someone from NOC run IP_TAKE_OVER.sh script and attach vif to my standby VM [21:57]
spatel: I don't have shared storage so evacuate won't help [21:57]
noonedeadpunk: yeah, and hostmonitor actually does just evacuate [21:58]
noonedeadpunk: when it finds that compute went down [21:58]
noonedeadpunk: I'm not really sure what is the app, but it sounds like you more need loadbalancer? [21:58]
spatel: We have very complex application running for our customer which has many components talking to each other [21:59]
noonedeadpunk: ah [21:59]
spatel: if one of application or vm is down then i need to replace that with *SAME* ip [21:59]
spatel: keeping same IP is very important for us [22:00]
spatel: otherwise i have to reboot every single machine in that application [22:00]
spatel: Engineering working to fix legacy code but meantime i need some hack :) [22:01]
damiandabrowski[m]: maybe You can disable port_security for these ports and replace IP_TAKE_OVER.sh with pacemaker/keepalived? [22:01]
damiandabrowski[m]: or better: add more allowed ip pairs [22:01]
noonedeadpunk: well masakari is more about revive what you already have [22:02]
noonedeadpunk: you can define custom workflows there in case of failovers ofc [22:02]
noonedeadpunk: but it needs writing code [22:02]
noonedeadpunk: and also it monitors on qemu/libvirt level [22:03]
noonedeadpunk: not app inside vm [22:03]
spatel: i need to test in lab and see how it can fit in my deployment [22:03]
*** dviroel|ruck is now known as dviroel|out [22:33]
opendevreview: Jonathan Rosser proposed openstack/openstack-ansible-os_keystone master: Use ssh_keypairs role to generate fernet sync ssh keys  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/827090 [22:59]
opendevreview: Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Define X-Forwarded-Proto for keystone  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/828518 [23:03]
jrosser: feels like this has strange side effects https://github.com/openstack/openstack-ansible/commit/6e9da4753af83e5b1c34f6ee7c35854c15a72bb0#diff-8c199e8e49846eb701be959066e29d5279fbde49ce2e92ce4a3ca274af3e3d9cR25 [23:17]
noonedeadpunk: like what? [23:17]
jrosser: makes it hard when writing a role like ssh_keypairs, that it runs the whole play on repo_servers[0] then again on all the rest [23:18]
noonedeadpunk: we have run_once somewhere in role? [23:18]
jrosser: so the role tasks are not run against all the nodes at the same time [23:18]
jrosser: so for example, it deploys the keys and lsyncd onto node[0] [23:19]
jrosser: then starts again and puts the keys on nodes [1] and [2] [23:19]
jrosser: but somehow on centos lsyncd already fails because it cannot ssh to [1] and [2] when the service starts [23:20]
noonedeadpunk: why it starts though? As handlers should basically run after all play done? [23:20]
noonedeadpunk: Well , deb needs hook to prevent service from starting [23:21]
noonedeadpunk: but centos by default doesn't start in general... [23:21]
noonedeadpunk: or we have flush_handlers there somewhere? [23:22]
jrosser: handler once https://zuul.opendev.org/t/openstack/build/f3ae5dc016cd423987842b70c8801485/log/job-output.txt#13347-13350 [23:22]
jrosser: then later handler twice https://zuul.opendev.org/t/openstack/build/f3ae5dc016cd423987842b70c8801485/log/job-output.txt#13881-13886 [23:23]
jrosser: idk why this is different on focal [23:24]
jrosser: well, tasks sun in the same order, but end result works [23:24]
jrosser: *run [23:24]
noonedeadpunk: I wonder why we need flush_handlers at end of tasks/main.yml [23:26]
noonedeadpunk: to restart based on serial I guess... [23:26]
noonedeadpunk: so like we basically need to run keypairs in pre_tasks for lsyncd [23:29]
noonedeadpunk: or just do rolling restart of lsyncd in post tasks [23:29]
noonedeadpunk: from other side serial in this way doesn't make real sense [23:30]
jrosser: currently the data is in role defaults [23:30]
jrosser: so playbook pre_tasks would need that moving [23:31]
noonedeadpunk: as if we think about it, we can have several group of hosts for repo [23:31]
noonedeadpunk: (if we have multiple OS) [23:32]
noonedeadpunk: so 1, 100% is just wrong [23:32]
noonedeadpunk: but also I think we miss smth like that https://opendev.org/openstack/openstack-ansible/src/branch/stable/rocky/playbooks/repo-build.yml#L33-L41 to set group of repo containers per OS [23:33]
noonedeadpunk: maybe we should just do rolling restart of lsyncd in post-tasks? [23:35]
jrosser: well right now it's only on [0] [23:35]
jrosser: perhaps the flush handlers is wrong [23:36]
noonedeadpunk: mmmm.... [23:36]
jrosser: but i also see the idea there when using serial [23:36]
noonedeadpunk: what if centos systemd unit missing restart on failure? [23:36]
jrosser: oh maybe [23:37]
noonedeadpunk: so we can just add override [23:37]
jrosser: behaviour is just different on focal https://zuul.opendev.org/t/openstack/build/c490c74c8c774d6490685a498f04bedf/log/logs/openstack/aio1_repo_container-a31176e7/lsyncd/lsyncd.log.txt [23:38]
jrosser: it doesn't bail out on error [23:38]
noonedeadpunk: so it just retrying... [23:39]
noonedeadpunk: `Terminating since "insist" is not set` hm. [23:40]
jrosser: helpful https://github.com/lsyncd/lsyncd/issues/632 [23:41]
noonedeadpunk: some russian blog post suggesting adding insist = true, in /etc/lsyncd.conf [23:41]
noonedeadpunk: specifically for centos btw [23:42]
jrosser: `Continues startup even if a startup rsync cannot connect.` [23:42]
jrosser: looks like what we need [23:42]
noonedeadpunk: which is exactly the case [23:43]
noonedeadpunk: so somewhere here https://opendev.org/openstack/openstack-ansible-repo_server/src/branch/master/templates/lsyncd.lua.j2#L611 ? [23:44]
jrosser: huh https://github.com/openstack/openstack-ansible-repo_server/blob/master/templates/lsyncd.defaults.j2#L2 [23:45]
noonedeadpunk: but for redhat we pass only config [23:46]
noonedeadpunk: so that explains :) [23:46]
jrosser: does DAEMON_ARGS even make sense with systemd? [23:47]
noonedeadpunk: considering -insist applies for ubuntu.... [23:48]
jrosser: /etc/lsyncd.conf exists on centos and we don't try to manage it [23:48]
noonedeadpunk: I'd tried to move to config... [23:48]
noonedeadpunk: but we pass another conf file? [23:49]
noonedeadpunk: https://github.com/openstack/openstack-ansible-repo_server/blob/master/templates/lsyncd.defaults.j2#L4 [23:49]
jrosser: the lua file [23:50]
noonedeadpunk: How is systemd inotify thing goes ?:D [23:50]
jrosser: haha -ENOTIME [23:50]
jrosser: like this is turning into yak shaving again [23:50]
noonedeadpunk: yeah, but we define LSYNCD_OPTIONS to repo_lsyncd_config_file which should just replace /etc/lsyncd.conf with our PATH [23:50]
noonedeadpunk: so i bet it's not taken into account [23:51]
jrosser: is that different though https://github.com/openstack/openstack-ansible-repo_server/blob/master/vars/debian.yml#L30 [23:51]
jrosser: the horrid horrid file here https://github.com/openstack/openstack-ansible-repo_server/blob/master/templates/lsyncd.lua.j2 [23:52]
noonedeadpunk: except for debian we don't override path I believe [23:52]
noonedeadpunk: as I said - we should put insist here https://opendev.org/openstack/openstack-ansible-repo_server/src/branch/master/templates/lsyncd.lua.j2#L611 [23:52]
jrosser: it still ships an init script /o\ https://packages.ubuntu.com/focal/amd64/lsyncd/filelist [23:52]
noonedeadpunk: I guess [23:52]
noonedeadpunk: no wonder - last lsyncd release was years ago [23:53]
noonedeadpunk: to be correct almost 4 years ago [23:53]
jrosser: oh i see what you mean now [23:54]
* jrosser didn't spot you could put config in the lua file [23:54]
noonedeadpunk: not sure if we should drop defaults for ubuntu... [23:55]
* noonedeadpunk is quite drunk and clock shows almost 2am.... [23:55]
jrosser: yeah late [23:55]
* jrosser sleeps [23:55]

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!