*** hamalq has quit IRC | 01:01 | |
*** hamalq has joined #opendev-meeting | 02:25 | |
*** hamalq has quit IRC | 06:11 | |
*** diablo_rojo has quit IRC | 06:34 | |
*** hashar has joined #opendev-meeting | 06:49 | |
*** hashar is now known as hasharAway | 10:08 | |
*** hasharAway is now known as hashar | 10:55 | |
*** hashar has quit IRC | 11:43 | |
*** hashar has joined #opendev-meeting | 12:16 | |
*** hashar has quit IRC | 14:29 | |
*** hamalq has joined #opendev-meeting | 16:16 | |
*** hamalq has quit IRC | 17:41 | |
*** diablo_rojo has joined #opendev-meeting | 18:03 | |
clarkb | anyone else here for the opendev infra meeting? | 19:00 |
---|---|---|
corvus | oh that's me | 19:00 |
clarkb | fungi mentioned he won't make it today | 19:00 |
clarkb | #startmeeting infra | 19:01 |
openstack | Meeting started Tue Sep 22 19:01:09 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
*** openstack changes topic to " (Meeting topic: infra)" | 19:01 | |
openstack | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2020-September/000100.html Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
*** openstack changes topic to "Announcements (Meeting topic: infra)" | 19:01 | |
clarkb | PTG and Summit are fast approaching. If you plan to participate, now is a good time to register | 19:01 |
ianw | o/ | 19:01 |
clarkb | schedules for all three should be up too so you can cross check your timezone and availability | 19:02 |
clarkb | unfortunately I don't currently have links ready but if you need help finding anything I'm sure I can either find the info or know who to talk to | 19:02 |
diablo_rojo | o/ | 19:02 |
clarkb | Also the smoke is mostly gone now and tomorrow is the last day where my kids don't have school obligations for the next several months so I'm going to take the day off and go do something in the rain | 19:03 |
clarkb | #topic Actions from last meeting | 19:03 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:03 | |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-09-15-19.01.txt minutes from last meeting | 19:03 |
clarkb | no recorded actions | 19:03 |
clarkb | #topic Priority Efforts | 19:04 |
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" | 19:04 | |
clarkb | #topic Update Config Management | 19:04 |
*** openstack changes topic to "Update Config Management (Meeting topic: infra)" | 19:04 | |
clarkb | nb04.opendev.org has been cleaned up | 19:04 |
corvus | so all of zuul+nodepool are running from container images now? | 19:04 |
clarkb | corvus: yes | 19:05 |
corvus | is all of puppet-openstackci unused by us now? | 19:05 |
clarkb | I'm not sure if we use the elasticsearch and logstash stuff out of there or not | 19:05 |
clarkb | I think not so ya that may be completely unused by us now | 19:06 |
corvus | that's a major milestone :) | 19:06 |
clarkb | ya, and we're also using the upstream images for all services too | 19:07 |
clarkb | *all zuul and nodepool services | 19:07 |
clarkb | which is a good way to help ensure those function well for the real world | 19:07 |
*** hamalq has joined #opendev-meeting | 19:09 | |
clarkb | #link https://review.opendev.org/#/c/744821/ fetch sshfp dns records in launch node | 19:09 |
clarkb | #link https://review.opendev.org/#/c/750049/ wait for ipv6 RAs when launching nodes | 19:09 |
clarkb | those are two launch node changes that would be good to land as we continue to roll out services with new config management on new hosts | 19:10 |
clarkb | Any other config management related items to discuss? | 19:10 |
clarkb | #topic OpenDev | 19:11 |
*** openstack changes topic to "OpenDev (Meeting topic: infra)" | 19:11 | |
clarkb | It has been noticed that our Gitea project descriptions are not updated when we change them | 19:11 |
clarkb | I believe that the current method to address this is to run the sync-gitea-projects.yaml playbook which will do the slow resync of everything | 19:12 |
clarkb | If that is the current method, do we need to lock ansible on bridge when doing it to prevent other things from conflicting? I think so | 19:13 |
clarkb | fungi also brought up that we should think about ways to do this automatically if we can manage it somehow | 19:13 |
corvus | i'm trying to remember why we don't already do that | 19:14 |
ianw | how slow is slow? | 19:14 |
clarkb | I want to say it took about 4 hours to run the sync playbook last time mordred ran it | 19:14 |
clarkb | but maybe that was before we rewrote it in python? | 19:14 |
ianw | ok, that counts as slow :) | 19:15 |
clarkb | thinking out loud here: maybe we lock ansible on bridge, run the sync playbook and time it, then based on that we can consider doing a full sync each time? | 19:16 |
clarkb | also the ci job for gitea does two passes of project creation iirc. Maybe we can look at that for rough timing data | 19:16 |
ianw | ++ | 19:17 |
corvus | it's not clear to me that sync-gitea-projects.yaml will update the descr | 19:17 |
clarkb | corvus: oh interesting is this possibly a bug in our python too? | 19:18 |
corvus | clarkb: yeah (or missing feature); a quick scan of the python looks like it only touches description on create | 19:18 |
clarkb | neat | 19:18 |
corvus | settings and branches get updated by sync-gitea-projects.yaml, but not the description | 19:18 |
corvus | expect it would be an easy fix | 19:18 |
clarkb | now I'm thinking we make a change that just updates descriptions and use the ci job to time it | 19:18 |
clarkb | since I'm pretty sure we do two passes in ci now | 19:19 |
corvus | (also, i'm only at 90% confidence on this) | 19:19 |
clarkb | corvus: we can also use the ci job to check your theory on that | 19:19 |
clarkb | I can take a look at that if no one else is able/interested. It may just be a day or two while I work through other things first | 19:19 |
clarkb | and if you are interested I'm happy for someone else to work on it too :) | 19:20 |
clarkb | The other gitea topic I added to the agenda is that gitea 1.12.4 has released. I've done the last few gitea upgrades. Curious if anyone else is interested in giving it a go. | 19:21 |
clarkb | With the minor releases it's mostly about double checking file diffs and editing them as necessary for our forked html content | 19:21 |
ianw | i can give it a go | 19:22 |
clarkb | Our CI job for that has decent coverage of the automation results, and if you really want to check the rendered web ui launching a gitea locally isn't too bad | 19:22 |
clarkb | ianw: thanks! | 19:22 |
clarkb | On the gerrit upgrade side of things I've been distracted by a number of operational issues the last few weeks so unfortunately no new updates here | 19:23 |
clarkb | Any other OpenDev topics people would like to bring up? | 19:23 |
clarkb | #topic General Topics | 19:25 |
*** openstack changes topic to "General Topics (Meeting topic: infra)" | 19:25 | |
clarkb | #topic Splitting puppet-else into service specific infra-prod jobs | 19:25 |
*** openstack changes topic to "Splitting puppet-else into service specific infra-prod jobs (Meeting topic: infra)" | 19:25 | |
clarkb | This is something that ianw reminded me we had planned to do | 19:25 |
clarkb | essentially we split the node definition(s) out of our manifests/site.pp and put them in manifests/service.pp then add new jobs to run puppet for that specific service instead of everything | 19:26 |
clarkb | the reason this is coming up is we've been having servers like elasticsearch servers crash on rax then ansible sits there waiting for them until the puppet-else job times out | 19:26 |
clarkb | this adds a lot of noise to our logs and it is hard to tell if things are working or not since they are lumped into a big basket | 19:26 |
clarkb | I wonder if we should plan a sprint to make those changes and get through a number of them in a day or two | 19:27 |
ianw | yeah, something else i can put on my short-term todo list | 19:28 |
clarkb | (also if anyone knows how to convince ansible to timeout more quickly when ssh will never succeed that would be great too) | 19:28 |
clarkb | ianw: I'm happy to help too, though for me having a day or two dedicated to it would likely help, maybe ping me when you intend on working on it and I'll start later in the day and we can sift through them? | 19:29 |
ianw | ok; hopefully it's all pretty mechanical | 19:30 |
ianw | always surprises though :) | 19:30 |
clarkb | ya I think the least mechanical part will be adding testinfra tests | 19:30 |
clarkb | I think that was part of mordred's original goal so that we can switch out puppet for ansible+docker and have the tests confirm everything is still happy | 19:30 |
clarkb | without needing to replace test frameworks | 19:30 |
clarkb | #topic Bup and Borg Backups | 19:31 |
*** openstack changes topic to "Bup and Borg Backups (Meeting topic: infra)" | 19:31 | |
clarkb | #link https://review.opendev.org/741366 ready to merge when we are. | 19:31 |
clarkb | kept this on the agenda as ianw mentioned I should | 19:31 |
clarkb | I don't think it's incredibly urgent as bup continues to work, but being better prepared for focal and beyond is a good thing too | 19:31 |
clarkb | ianw: anything else to add on that one? | 19:32 |
corvus | we looking for another +2, or just waiting for a babysitter? | 19:33 |
clarkb | corvus: I think waiting for a babysitter (ianw mentioned he could do it, but there have been many many distractions since) | 19:33 |
corvus | well, there's a host bringup, so a bit more work than babysitting | 19:33 |
ianw | yeah, just waiting for me to have a chunk of time to bring up the new server, and babysit | 19:33 |
clarkb | mostly I think it gets deprioritized since bup is working | 19:33 |
ianw | although there's been some chat about alternative providers | 19:33 |
* corvus touches wood | 19:33 | |
ianw | do we want to put the bup server somewhere !rax? | 19:34 |
clarkb | ianw: the original goal with bup was to backup to >1 provider | 19:34 |
corvus | i think 2 servers in 2 providers would be great | 19:34 |
clarkb | corvus: ++ | 19:34 |
ianw | any preference of the options we have? | 19:34 |
corvus | i'd vote for rax + 1. | 19:35 |
clarkb | ianw: mnaser has recently indicated he'd be happy to host more things. Backups likely make sense there due to the use of ceph too | 19:35 |
clarkb | (our backups will be replicated many times) | 19:35 |
mnaser | yes, i meant to reply to the email, we have a lot of capacity of storage in mtl btw | 19:35 |
corvus | i'll just say that at one point we *did* have rax +1, and the +1 exited the cloud business. i really really hope that doesn't happen again (and i certainly don't expect it to). but having been bitten once. | 19:35 |
ianw | ok, sounds like vexxhost mtl | 19:36 |
mnaser | please feel free to loop me in if you need quota bumps or anything like that | 19:36 |
clarkb | mnaser: thank you! | 19:36 |
corvus | rax+mtl sounds great :) | 19:36 |
ianw | mnaser: thanks, will do | 19:36 |
clarkb | #topic PTG Planning | 19:38 |
*** openstack changes topic to "PTG Planning (Meeting topic: infra)" | 19:38 | |
clarkb | As mentioned earlier now is a good time to register if you plan to attend the PTG | 19:38 |
clarkb | #link https://www.openstack.org/ptg/ Register for the PTG | 19:38 |
clarkb | #link https://etherpad.opendev.org/opendev-ptg-planning-oct-2020 October PTG planning starts here | 19:38 |
clarkb | I've added a number of topics to this etherpad | 19:38 |
clarkb | we are just over a month away so now is a great time to think about what we should be talking about during our PTG times | 19:39 |
clarkb | Feel free to add input on the topics I've added or add your own | 19:39 |
clarkb | if a particular topic is very important to you it might be a good thing to indicate your time availability next to the topic so we can include you | 19:39 |
clarkb | Also, I plan to use meetpad again as that worked well for us last time | 19:40 |
clarkb | Any other PTG concerns or thoughts? | 19:41 |
clarkb | #topic Switch fedora-latest to fedora-32 | 19:43 |
*** openstack changes topic to "Switch fedora-latest to fedora-32 (Meeting topic: infra)" | 19:43 | |
clarkb | #link https://review.opendev.org/#/c/752744/ | 19:43 |
clarkb | I sent an email last week saying we'd make this change today | 19:43 |
clarkb | I intend on approving the change after the meeting unless there are last minute objections | 19:43 |
clarkb | I figure if anyone really really needs fedora-30 they can use fedora-30 directly as we work them off of it | 19:43 |
clarkb | hoping that in the near future we'll delete the fedora-30 image entirely | 19:43 |
clarkb | part of the motivation here is that the old fedoras seem to be bitrotting with respect to ansible. Ansible isn't able to reliably manage systemd services on f31 for example | 19:44 |
clarkb | Getting to the up to date fedora version seems important as a result | 19:44 |
clarkb | ianw: ^ any particular concerns from you on that topic? you probably do more fedora things than the rest of us | 19:44 |
ianw | no, i mean we shouldn't really be using fedora-!latest in jobs, we've always said it's a rolling thing | 19:45 |
clarkb | ya the number of cases where fedora-30 is used explicitly is very small | 19:46 |
clarkb | nodepool, ara, and dib | 19:46 |
clarkb | nodepool and ara are being/have been updated and dib will just stop testing f30 builds I think | 19:46 |
clarkb | #topic Open Discussion | 19:47 |
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" | 19:47 | |
clarkb | https://review.opendev.org/#/c/752908/ is a change I'm hoping to get review(s) from someone with a fresh perspective | 19:47 |
clarkb | I've had some initial concerns but have largely come around to thinking merging it is probably the most pragmatic thing | 19:48 |
ianw | one thing was restarting zuul-web to pickup the new pf4 changes that were merged | 19:48 |
clarkb | hoping that someone else can take a look and double check on that | 19:48 |
clarkb | I'll probably approve it by the end of my work day if no one else looks as I don't want the tripleo testing to flounder longer | 19:48 |
clarkb | ianw: typically those are really straightforward, you docker-compose down and docker-compose up -d in /etc/zuul-web or whatever the dir is | 19:49 |
ianw | i'll check, yesterday there were unanswered questions | 19:49 |
ianw | clarkb: is there a reason we don't CD deploy that? | 19:49 |
clarkb | ianw: if you want to do the zuul-web restart I'm around for another 5 or so hours and will happily be backup if something goes wrong | 19:49 |
clarkb | ianw: I think because sometimes you need to restart zuul-web and scheduler together | 19:50 |
clarkb | corvus: ^ is that overly cautious on our part? | 19:50 |
ianw | ahh, ok, yeah this is not an API change | 19:50 |
ianw | but i guess it could be, at some times | 19:50 |
clarkb | ya most of the time it's fine to just restart | 19:50 |
clarkb | occasionally it isn't :) | 19:50 |
ianw | i'll take a look then | 19:51 |
corvus | i think it would be fine to cd zuul-web | 19:52 |
corvus | but the mechanics are tricky | 19:52 |
corvus | zuul repo is in a different tenant, etc | 19:52 |
corvus | really want a url trigger or something for that, i'd think. | 19:53 |
clarkb | we check the docker-compose pull info in the gitea role to understand if we need to restart in a safe way (which is different than just down and up) | 19:53 |
ianw | hrm, i docker restarted it, but it looks the same | 19:53 |
clarkb | we might be able to do something similar for zuul-web and get the hour delayed CD | 19:53 |
ianw | which must mean what i thought would be new containers isn't | 19:53 |
corvus | ianw: i think it needs more than a restart for the container to be recreated with a new image | 19:54 |
clarkb | yes I think that is the case | 19:54 |
corvus | i think a docker-compose down/up ? | 19:54 |
clarkb | ya down then up -d is what I usually use | 19:54 |
ianw | ok, yeah, looks like that's in the bash history | 19:55 |
ianw | my usual source of best practice tips :) | 19:55 |
clarkb | Sounds like that may be it | 19:56 |
clarkb | Thank you everyone | 19:56 |
ianw | yay, that got it :) | 19:56 |
clarkb | we'll be back here next week; until then feel free to chat in #opendev or on service-discuss@lists.opendev.org | 19:56 |
corvus | clarkb: thanks :) | 19:56 |
clarkb | #endmeeting | 19:56 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 19:56 | |
openstack | Meeting ended Tue Sep 22 19:56:55 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:56 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-09-22-19.01.html | 19:56 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-09-22-19.01.txt | 19:57 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-09-22-19.01.log.html | 19:57 |
diablo_rojo | Thanks clarkb! | 20:00 |
*** gouthamr has quit IRC | 22:06 | |
*** mnaser has quit IRC | 22:06 | |
*** gouthamr has joined #opendev-meeting | 22:09 | |
*** mnaser has joined #opendev-meeting | 22:10 | |
*** gouthamr has joined #opendev-meeting | 22:11 |