Tuesday, 2021-08-03

*** sshnaidm|afk is now known as sshnaidm07:27
*** rpittau|afk is now known as rpittau07:30
opendevreviewMerged openstack/openstack-ansible-tests master: setup.cfg: Replace dashes with underscores  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/78976109:28
spatelwhere is noonedeadpunk ?? 13:03
spatelI have some patches which I'd like to get reviewed so we can merge them :) if someone else is in the core group, please let me know 13:30
mgariepyif the patches are not too big i can take a few minutes to check them out.13:34
spatelmgariepy its tiny change in OVN so yes impact is very low 13:35
spatel1. https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80213413:35
spatel2. https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80306013:36
nurdieHey OSA! Trying to rebuild containers after a bunch of power outages wrecked my cluster.13:37
nurdiefatal: [infra1_utility_container-9f7e62cd]: FAILED! => {"changed": false, "msg": "Could not find the requested service systemd-networkd: host"}13:37
nurdieOSA 20.2.6 on CentOS 713:37
nurdieThere's maybe an option somewhere on openstack_user_config? I've tried to trace this but I'm coming up short13:39
spatelnurdie run -vv and see where its not able to find it 13:48
spatelmgariepy one more patch which is https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/802701 to get rid of centos-8 job 13:49
mgariepyspatel, are the ovn ovsdb all clustered now ?13:49
spatelYes 13:49
mgariepyand everything is active-active ?13:50
spatelthat patch is already merged, ovsdb cluster doesn't do active-active (it works like RabbitMQ)13:50
spatelone master and other slaves 13:50
spatelleader node and followers (only the leader node can write data; followers are just read copies)13:51
spatelclient will automatically detect the leader and send write operations to the leader only13:51
spatelwhen the leader is dead, one of the followers makes itself leader after an election13:52
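The leader/follower behaviour described above can be sketched as a toy model. This is only an illustration, not the real OVSDB clustering code: the `Cluster` class, the node names, and the "lowest name wins" election rule are all hypothetical simplifications (a real election uses a consensus protocol with terms and quorum).

```python
class Cluster:
    """Toy leader/follower cluster (hypothetical, not the OVSDB implementation)."""

    def __init__(self, node_names):
        # Every node keeps its own copy of the data.
        self.data = {name: {} for name in node_names}
        self.alive = set(node_names)
        self.leader = node_names[0]

    def elect(self):
        # Simplified election: the surviving node with the lowest name wins.
        self.leader = min(self.alive)
        return self.leader

    def fail(self, name):
        # Mark a node as dead; a new leader is chosen on the next write.
        self.alive.discard(name)

    def write(self, key, value):
        # Clients always send writes to the current leader.
        if self.leader not in self.alive:
            self.elect()
        # The leader replicates the write to every live node.
        for name in self.alive:
            self.data[name][key] = value
        return self.leader

    def read(self, name, key):
        # Followers can serve reads from their local copy.
        return self.data[name].get(key)


cluster = Cluster(["db1", "db2", "db3"])
cluster.write("port", "lsp-1")      # handled by db1
cluster.fail("db1")                 # leader dies
cluster.write("port", "lsp-2")      # a follower takes over as leader
```

The point of the sketch is only that writes have a single entry point at any given time, while every live member holds a readable copy.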
spatelI believe we have addressed all OVN issues (except SSL related which i need to work)13:53
mgariepyrabbit is a-a iirc13:54
spatelno 13:54
spatelIn RabbitMQ only one node write and other just sync 13:55
spatelAs far as i know.. they are not multi-master 13:55
nurdiespatel: it's not configuring it on any new containers. Additionally, the /etc/resolv.conf link to /var/run isn't getting setup either13:56
nurdieIt's not configuring it on existing containers either, but that goes undetected because it's already configured there :p 13:57
spatelmgariepy in RabbitMQ it's a little different: each queue has a primary node, and if you try to write data to a secondary (non-primary) node, that node forwards your request to the primary for the write operation14:00
nurdieAlso, on train, OSA keeps configuring openstack-pike repo on servers >_<14:01
nurdieI just ran: openstack-ansible -vv playbooks/setup-hosts.yml -l infra1_repo_container-2aa293c714:03
nurdieIt fails because no DNS config is on that container14:03
nurdieIt fails here: TASK [openstack_hosts : Add requirement packages (repositories gpg keys packages, toolkits...)]14:03
nurdiehttps://pasteboard.co/Kebr9F4.png14:05
nurdieI can bandaid that but I want to fix the plays14:06
spatelYou should keep the same CentOS 7 version and not change it, otherwise the repo server will cause issues, i believe 14:09
mgariepyspatel, i'm a bit confused by the ovs/ovn stuff.14:09
spatelif yum update fails, it looks like you have an issue somewhere in the repo config14:10
mgariepyhttps://github.com/openstack/openstack-ansible-os_neutron/blob/master/tasks/providers/setup_ovs_ovn.yml#L3014:10
mgariepywouldn't it need to be changed so we can get rid of the haproxy config?14:11
spatelovs is just L2/L3 switch and OVN is SDN controller 14:11
spatelyes we don't need haproxy stanza for OVN 6641 and 6642 14:13
spatelbecause we can point the agent directly at the cluster member nodes 14:13
mgariepyyeah this part was clear in my head. the issue i have is the `ovs-vsctl set open . external-ids:ovn-remote=tcp:{{ neutron_ovn_ip }}:6642` doesnt quite compute in my head haha14:13
spatelneutron_ovn_ip is the haproxy vip which james initially configured before we implemented the ovn cluster patch14:14
mgariepycan you then fix that part in subsequent patches ?14:15
spatelin my patch i am using list of cluster member nodes https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/803060 14:16
spatelsure i can use same variable in that place 14:16
spatelcommand: "ovs-vsctl set open . external-ids:ovn-remote={{ neutron_ovn_sb_connection }}"     14:18
spatelsomething like that14:18
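A rough sketch of what such a connection string could look like when the agent points at the cluster members instead of the haproxy VIP. The helper names and the IP addresses below are hypothetical, and the exact contents of `neutron_ovn_sb_connection` in the patch are an assumption; the only things taken from the discussion are port 6642 and the `ovs-vsctl` command shape.

```python
def ovn_remote(members, port=6642, proto="tcp"):
    """Join cluster member addresses into a comma-separated ovn-remote value."""
    return ",".join(f"{proto}:{ip}:{port}" for ip in members)


def ovs_vsctl_command(members):
    """Build the ovs-vsctl command line quoted in the discussion above."""
    return f"ovs-vsctl set open . external-ids:ovn-remote={ovn_remote(members)}"


# Hypothetical addresses of three OVN southbound DB cluster members.
members = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
print(ovs_vsctl_command(members))
```

Listing every member lets the client find the current leader on its own, which is why the haproxy stanza for 6641 and 6642 is no longer needed.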
mgariepyor only on the self-ip address i'm not too sure.14:18
mgariepybut you are probably right about the connection string.14:19
opendevreviewSatish Patel proposed openstack/openstack-ansible-os_neutron master: Use list of cluster member ipaddr for ovn ml2 agent to connect  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80306014:20
spateldone, check it out 14:20
spatelit was a good catch14:21
spatellater i will create one more patch to remove ovn config from haproxy lb to just clean up 14:22
mgariepycheck my comment :D14:23
opendevreviewSatish Patel proposed openstack/openstack-ansible-os_neutron master: Use list of cluster member ipaddr for ovn ml2 agent to connect  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80306014:25
spateldone! 14:25
mgariepyperfect14:25
mgariepythanks14:25
spatelThanks for the review :)14:26
nurdieShould systemd-resolved be installed on _infrastructure_hosts as well as containers, or just containers?14:27
spatelmgariepy do the Depends-On: patches get merged by themselves, or do they need a separate +2 review?14:27
mgariepythey all need to be merged with +214:28
spatelnurdie i think so but i have never done that part, and never get any issue like that14:29
nurdieI've been running openstack for almost 3 years now and have never seen this either >_<14:30
spatelare you trying to upgrade something or changing OS.. ? 14:30
nurdieSequential power outages did a number. We even lost a whole compute node's RAID controller. Thankfully, we're full Ceph backend :)14:30
nurdieNope14:30
nurdieJust recovering14:30
spatelagain centos7 is old and anything is possible 14:30
nurdieold usually means stable14:31
spatelnot really, soon it will be EOL + sometime broken repo also cause issues 14:31
spatelcentos7 isn't part of CI-CD pipeline so hard to know.. what is broken and when. 14:33
nurdieo.014:33
nurdieC7 doesn't EOL until 202414:33
spatelmgariepy do you know what is going on here - https://review.opendev.org/c/openstack/openstack-ansible/+/80304114:33
spatelopenstack-tox-docs https://zuul.opendev.org/t/openstack/build/589ec38bc3114399b722c3cb27f4fd75 : FAILURE in 5m 46s14:33
nurdieWhat should I upgrade to to be in CI? CentOS Stream?14:33
spatelOSA dropped support for centos7 so it's hard to debug..  (yes centos stream is the new toy)14:34
spatelRecently i changed my deployment to use ubuntu (i am very happy) 14:35
spatelcentos is now puppet of redhat :) anytime they can change policy 14:35
nurdieHow did you do that? Just backup galera and do a full redeploy?14:35
nurdieI know....we aren't happy about it either14:35
spatelnot centos7---> ubuntu migration (only new deployments will go to ubuntu) 14:36
nurdieah. I won't have that luxury. It's a production cluster. I thought I had 2 more years to create a CERT cluster but I guess that needs to happen now lol14:37
spateli am running centos7 cloud with 800 production vms and everyday i pray to god :) 14:38
nurdieIn theory, I should be able to easily backup galera and do full redeploy. All of my vms are on Ceph14:39
spateli haven't thought about how to upgrade this cluster, but i'm planning to migrate to ubuntu soon; maybe create a parallel cluster and migrate vms day by day, or lift and shift 14:39
spatelnurdie if you can afford downtime then yes.. but i would say run a test in the lab 14:39
nurdiei do not want to think about migrating 800 vms lol14:40
nurdiemy heart goes out to you14:40
spatelIn my infra nodes i replace SSD every 1 year to just make sure it won't die with SSD ( yes controller can die also but not thinking about that yet :) )14:41
nurdieand, so far, you feel that ubuntu is far more stable than your centos stack?14:41
spatelYes and very open.. not like centos keep hunting for packages and repos.. because they change them a lot..14:42
nurdieman you should look in ceph. it's very easy to deploy and has been a total rock for me. you'll lose some IO perf, but you'll never lose prod data again14:42
nurdie"never" ;)14:42
spatelthe majority of clouds run on ubuntu if you look back at the survey, but again it's a personal choice so i don't want to force you :) 14:43
nurdiehey i'm in this for stability. only started on centos because that's what the rest of prod runs on. i have zero qualms with adopting a deb fork14:43
spatelmy application doesn't need any data storage, all i need memory, network and cpu for realtime processing data.. also i don't trust my network which can bring down my cloud or latency 14:44
spatelfor small deployment i can understand but we are running lots of vms 14:45
nurdieYou do have double the vms that I have.....but only double O:)14:46
nurdieMy ceph cluster is all flash14:47
nurdieWorks pretty alright14:47
nurdieBut yeah, doesn't sound like that would be a worthwhile investment for your ops14:47
spatelwe do have ceph and it went down and brought down 200 vms since then we moved to local compute disk 14:48
spatelceph is good as long as it's happy, but it can create massive issues if not fed carefully.. it requires some skills which we don't have at present 14:49
nurdieThat sucks man sorry to hear that. It's never good times when losing that much14:50
mgariepyspatel, not sure what's wrong with the docs on this. maybe spotz have some idea ?15:03
spotzreading15:13
spotzI haven't done a ton of Ceph and OpenStack to be honest. I can see if it'll run on my NUC but going through them is the only way I'll know15:15
mgariepy the error : /home/zuul/src/opendev.org/openstack/openstack-ansible/doc/source/admin/backup-restore.rst::rST localisation for language "id" not found15:23
spatelmgariepy something changed recently somewhere and broke that test15:26
mgariepylol it seems like it's missing something from the id translation.15:27
mgariepy:/15:27
spateljust curious why do we have that test only for this specific role and not others, can we set it to non-voting meantime 15:28
spotzspatel: That would seem like a good idea if this is the only place it exists. Then we can evaluate it if we need it and possibly add it everywhere15:31
spatelmgariepy do you know how to set this to non-voting 15:36
mgariepyin zuul.d/project-templates.yaml it can be set to non-voting.15:45
spatelis that file inside role? 15:46
spotzShould be as each repo is tested separately15:47
spatellet me check 15:48
spateli can see it here - https://review.opendev.org/plugins/gitiles/openstack/openstack-ansible/+/refs/heads/master/zuul.d/project-templates.yaml15:50
opendevreviewSatish Patel proposed openstack/openstack-ansible-os_nova master: Add dependency repo for centos-8-stream distro install  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/80336815:55
opendevreviewSatish Patel proposed openstack/openstack-ansible master: set non-voting for broken tox-doc test  https://review.opendev.org/c/openstack/openstack-ansible/+/80337115:59
spatelmgariepy do we need to set +2 to merge this job? 16:00
mgariepyspatel, you need to add : in the file ;p16:02
mgariepy - blah: then the voting: false16:02
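The change being described turns a plain job name in the template's job list into a mapping that carries `voting: false`. A small sketch of that transformation on the parsed data, assuming the job list has been loaded into Python lists and dicts; the helper name `make_non_voting` and the second job name are hypothetical, while `openstack-tox-docs` is the failing job mentioned earlier.

```python
def make_non_voting(jobs, job_name):
    """Return a copy of a job list where job_name carries voting: False.

    A plain entry such as "some-job" becomes {"some-job": {"voting": False}},
    which corresponds to the YAML form:
        - some-job:
            voting: false
    """
    result = []
    for job in jobs:
        if job == job_name:
            result.append({job_name: {"voting": False}})
        else:
            result.append(job)
    return result


# "another-job" is a placeholder, not a real job from the template.
check_jobs = ["openstack-tox-docs", "another-job"]
print(make_non_voting(check_jobs, "openstack-tox-docs"))
```

The trailing colon after the job name is what makes the entry a mapping, so that the `voting` attribute has something to attach to.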
spatel?16:03
spatelhttps://review.opendev.org/c/openstack/openstack-ansible/+/803371/1/zuul.d/project-templates.yaml16:04
spateldid i miss something?16:04
spateloh hold on16:04
mgariepycheck my comment :D16:05
spatel+1 got it 16:05
opendevreviewSatish Patel proposed openstack/openstack-ansible master: set non-voting for broken tox-doc test  https://review.opendev.org/c/openstack/openstack-ansible/+/80337116:06
spateldone :)16:06
mgariepysphinx got updated recently.16:08
spatelthat might have broken something 16:08
spatelalso we should get rid of centos-8 jobs because anyway its EOL end of 2021 (it will reduce some load on build servers)16:10
spatel5 more months to go 16:10
*** rpittau is now known as rpittau|afk16:13
spatelmgariepy how to merge this stuff now - https://review.opendev.org/c/openstack/openstack-ansible/+/80337119:06
spatelwe need 2 +2 to get thing merge right?19:07
mgariepywait a sec i'm digging on the doc issue19:09
jrossermgariepy: can you give this a push to help the infra folk out https://review.opendev.org/c/openstack/openstack-ansible-tests/+/80312719:18
mgariepyjrosser, done.19:20
mgariepythe id translation seems to break the build.19:20
jrossermgariepy: there was a post to the ML about that, I’ve just asked in the infra channel if anyone knows the fix19:22
mgariepyok cool thanks jrosser 19:26
jrosserI have a feeling that the suggestion may end up being we are running the docs job when we don’t need to - i.e when we’re not modifying the contents of doc/19:31
mgariepyit should build anyhow19:35
mgariepyif i remove the `id` translation the doc builds. 19:35
spatelmgariepy how to do that part to fix broken test19:43
spateljrosser could you give this guy push because its holding bunch of other patches https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80213419:45
jrosserspatel: done19:46
jrosserI’m on holidays this week, noonedeadpunk as well too I think19:46
spatellucky guys! :)19:47
spateljrosser i have fixed centos-8-stream distro install and here is the patch if you like to bump it up - https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/80336819:48
spatelwe should remove (non-voting) from openstack-ansible-deploy-aio_distro_metal-centos-8-stream 19:49
spatelI am trying my best to keep centos-8-stream in good shape.. 19:50
jrosseris os_nova really the place for a openvswitch repo? feels like networking imho?19:51
spatelnova rpms looking for openvswitch dependencies... 19:52
spatelwe have the exact same repo in the os_neutron role too, but setup-openstack.yml runs the nova playbook first, and nova has that dependency19:53
opendevreviewMerged openstack/openstack-ansible-tests stable/train: Update Debian stable job  https://review.opendev.org/c/openstack/openstack-ansible-tests/+/80312720:00
jrosserspatel: feels very similar to this https://github.com/openstack/openstack-ansible-openstack_hosts/blob/master/vars/redhat-8.yml#L8720:02
spatelYes!! 20:02
spatelwe can move that here but only for distro method 20:03
jrosserand then should we always get openvswitch from the same place for source/distro?20:03
spatelyes same place 20:03
spatelfor nova we don't need openvswitch but because of rpm dependency tree it asking to have that repo20:04
spatelwe can move that repo in hosts if that is correct way to do 20:05
spatelit will look more organized 20:05
spateljrosser can you give bump here so we can get rid of broken centos-8 build - https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80270120:05
jrosserdone20:07
spatelcool! 20:07
spateli can try to move centos repo to here https://github.com/openstack/openstack-ansible-openstack_hosts/blob/master/vars/redhat-8.yml#L8720:10
spateladd condition only for distro 20:11
spatelin os_neutron we have condition only install when OVS deployment required so that is good20:11
jrosseris there a good reason to use a condition20:12
jrosserwhy don’t we always use that one?20:12
spatelLet's say i want to install from source with LinuxBridge and don't want to use OVS, then i don't need that repo 20:14
spatelsure, we can use that repo also regardless of source or distro 20:15
spatelnot going to hurt 20:15
spatelanyway i will push out that in hosts role and will see so we will have centralized place 20:16
spatelI gotta go!  you enjoy your holiday and have a safe vacation :) see you next week sometime 20:17
opendevreviewIan Wienand proposed openstack/openstack-ansible stable/stein: Remove Debian Stable testing  https://review.opendev.org/c/openstack/openstack-ansible/+/80340423:04
opendevreviewIan Wienand proposed openstack/openstack-ansible stable/train: Remove Debian Stable testing  https://review.opendev.org/c/openstack/openstack-ansible/+/80340523:07