*** tobiash has quit IRC | 07:28 | |
*** openstackstatus has quit IRC | 07:39 | |
*** openstackstatus has joined #opendev-meeting | 07:39 | |
*** ChanServ sets mode: +v openstackstatus | 07:39 | |
*** tobiash has joined #opendev-meeting | 08:07 | |
*** tobiash has quit IRC | 09:02 | |
*** tobiash_ has joined #opendev-meeting | 09:02 | |
*** SotK has quit IRC | 09:24 | |
*** SotK has joined #opendev-meeting | 12:26 | |
*** tobiash_ is now known as tobiash | 13:37 | |
*** yoctozepto8 has joined #opendev-meeting | 14:00 | |
*** yoctozepto has quit IRC | 14:01 | |
*** yoctozepto8 is now known as yoctozepto | 14:01 | |
*** hrw has joined #opendev-meeting | 18:24 | |
clarkb | we'll get the meeting started shortly | 18:59 |
*** diablo_rojo has joined #opendev-meeting | 18:59 | |
hrw | o/ | 18:59 |
mordred | o/ | 19:00 |
diablo_rojo | o/ | 19:00 |
corvus | o/ | 19:00 |
fungi | ohai | 19:00 |
clarkb | #startmeeting infra | 19:01 |
openstack | Meeting started Tue May 26 19:01:16 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
*** openstack changes topic to " (Meeting topic: infra)" | 19:01 | |
openstack | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2020-May/000029.html Our Agenda | 19:01 |
ianw | o/ | 19:01 |
clarkb | #topic Announcements | 19:01 |
*** openstack changes topic to "Announcements (Meeting topic: infra)" | 19:01 | |
clarkb | The PTG is running next week | 19:02 |
zbr | o/ | 19:02 |
AJaeger | o/ | 19:02 |
clarkb | we'll talk more about that later in the meeting, but wanted to make sure people were aware that was going to happen next week | 19:02 |
clarkb | #topic Actions from last meeting | 19:02 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:02 | |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-19-19.01.txt minutes from last meeting | 19:02 |
clarkb | There were no recorded actions, though I think a fair bit has happened so let's continue and discuss | 19:03 |
clarkb | #topic Priority Efforts | 19:03 |
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" | 19:03 | |
clarkb | #topic Update Config Management | 19:03 |
*** openstack changes topic to "Update Config Management (Meeting topic: infra)" | 19:03 | |
clarkb | mordred: corvus: fungi: I think we have learned new things about using Zuul for CD thursday-friday ish? | 19:03 |
clarkb | specifically how requiring base and things like LE before running other services can have cascading effects. This is something ianw has pointed out before and has proposed a potential workaround | 19:04 |
corvus | a new insight is that since sometimes we have the run-zuul job run without the other base jobs, then we can short-circuit the protections afforded by running it in sequence after the base jobs | 19:05 |
mordred | yeah. that was less awesome | 19:05 |
corvus | clarkb: what's the ianw fix? | 19:05 |
mordred | clarkb: I've also got an idea about splitting base into per-service-base | 19:05 |
corvus | also, mordred had a proposal he was noodling on | 19:05 |
mordred | that I wanted to check with people about today and then if it's ok with people I'll write it | 19:06 |
clarkb | corvus: https://review.opendev.org/#/c/727907/ it's a potential fix for unreachable hosts (basically ignore them). That is a subset of the issues we ran into | 19:06 |
clarkb | I pointed out that may not be safe given that we sometimes do expect things to run in order. However ignoring things like that specifically in base may be ok. My biggest concern with base there would be ignoring a host and not setting up its firewall properly, then that host somehow working a few minutes later when services are configured | 19:07 |
mordred | the idea in a nutshell is - make per-service base playbooks that run the base roles, move the soft-depend from service-playbook->base to service-playbook->service-base - and add appropriate file matchers | 19:07 |
mordred | this allows us to still run base things before services - but limits the success-needed scope to the hosts involved in the given service | 19:07 |
mordred | so if paste is unreachable, we don't skip running zuul | 19:08 |
mordred | but - we WILL skip running paste if paste-base fails | 19:08 |
corvus | mordred: what about running the base role in the normal service playbook, and putting limits on all the service jobs? | 19:08 |
mordred | we do this rather than just putting base roles into every service playbook to avoid no-op installing users for 5 minutes on every service | 19:08 |
corvus | mordred: (ie, "limit: zuul" on run-zuul -- would that waste too much time running the base role?) | 19:09 |
mordred | corvus: those were the two other ideas - base role in normal service I don't like because most of the base stuff moves very slowly but does take a non-zero amount of time to do nothing on each service invocation - and with zuul file matchers we can reduce the number of times we run service-base | 19:09 |
corvus | i don't even really know why the base job ran for my change series :( | 19:10 |
mordred | corvus: and yeah - I think we could run base.yaml --limit zuul ... but on friday when we were talking about it it seemed like we might have a harder time understanding exactly what was going on with that approach | 19:10 |
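A minimal illustration of option d) as described above. The playbook path and group name are assumptions for the sketch; the log does not show the real invocation.

```console
# Illustrative only: constrain the shared base playbook to the hosts of
# one service, so an unreachable host elsewhere doesn't block this run.
ansible-playbook playbooks/base.yaml --limit zuul
```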
clarkb | my biggest concern with the above proposal is it seems like it could get very complicated quickly | 19:10 |
clarkb | corvus' suggestion is much simpler but might be less efficient | 19:10 |
clarkb | s/might/very likely/ | 19:10 |
mordred | there are like 4 proposals - sorry, can we be more specific | 19:11 |
mordred | let me name them: | 19:11 |
mordred | a) ignore unreachable b) service-base c) base-in-service d) --limit | 19:11 |
*** hamalq has joined #opendev-meeting | 19:12 | |
clarkb | b seems like it will have many complicated file matcher rules and it will be more difficult to understand when things will run. c seems much simpler (that is what my comment above was trying to say) | 19:12 |
mordred | awesome. and yes - I agree - I think c is the simplest thing to understand - although it might be less efficient on a per-service basis | 19:13 |
ianw | clarkb: "would be ignoring a host and not setting up its firewall properly, then that host somehow working a few minutes later when services are configured" ... the intent at least was to ignore the unreachable host (so it would presumably be unreachable for the service playbook); if it was an error that's a different exit code so service playbook stops | 19:13 |
corvus | because the file matchers for b would be the union of the matchers for todays "base" and today's "service" playbooks? | 19:14 |
ianw | not that i'm really arguing for a) | 19:14 |
mordred | corvus: I think we could trim down the matchers for today's "base" - they're currently a bit broad | 19:14 |
mordred | but yes | 19:14 |
mordred | well ... | 19:14 |
mordred | no | 19:14 |
mordred | service-base would be a per-service subset of today's base file matchers - but would not need to include a service's file matchers | 19:15 |
mordred | today I think we do base on inventory and playbooks/host_vars | 19:15 |
mordred | if we did per-service base, we could file-matchers on specific host_vars | 19:15 |
mordred | like we do for service | 19:15 |
mordred | but we can also do that with c) | 19:15 |
clarkb | right it would be host_vars/service*.yaml and group_vars/service.yaml rather than host_vars/* group_vars/* sort of thing right? | 19:15 |
mordred | yeah | 19:16 |
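The narrowed matchers being discussed could look something like this hypothetical Zuul job sketch for option b). The job name, parent, and exact paths are assumptions, not the actual project-config:

```yaml
# Hypothetical sketch of a per-service base job (option b). It triggers
# only on files relevant to the zuul hosts, rather than all of
# inventory/ and playbooks/host_vars/ as today's infra-prod-base does.
- job:
    name: infra-prod-service-base-zuul
    parent: infra-prod-base    # assumed parent name
    files:
      - inventory/.*
      - playbooks/host_vars/zuul.*\.yaml
      - playbooks/group_vars/zuul\.yaml
      - playbooks/service-base-zuul\.yaml
      - playbooks/roles/base/.*
```

As corvus notes below, each service would get a near-identical block of matchers, differing only in the service-specific paths: verbose, but mechanical.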
corvus | what's the job name for base today? | 19:16 |
corvus | (cause it's not "run-base") | 19:16 |
clarkb | infra-prod-run-base iirc | 19:16 |
corvus | no that's "run base on pretend nodes for testing" | 19:16 |
corvus | not "actually run base on real nodes" | 19:16 |
mordred | corvus: infra-prod-base | 19:16 |
mordred | and it's triggered on all of inventory/ playbooks/host_vars/ and playbooks/group_vars/ | 19:17 |
mordred | (as well as base.yaml and the base roles) | 19:17 |
corvus | that explains why it ran for my changes; it's hard to change stuff without changing host_vars | 19:18 |
mordred | yeah | 19:18 |
clarkb | thinking out loud here, I think I'd be happy to try b) with a fallback of c) if we find it is too complicated | 19:18 |
clarkb | mostly wanted to call out the potential for complexity early so that we can try and minimize it | 19:18 |
mordred | ++ ... I think the file matchers are going to be very similar for b) and c) | 19:18 |
mordred | how about I try it for one service and we can look at it | 19:18 |
mordred | and if it's too complex, we go with c) | 19:19 |
clarkb | wfm | 19:19 |
corvus | iiuc, b) we'll have a set of like 4-6 file matchers for each service-base job | 19:19 |
corvus | they'll basically look the same | 19:19 |
corvus | except the specific groups and hostvars files/dirs will be service related | 19:19 |
mordred | yeah. and there's not an awesome way to de-duplicate that | 19:19 |
corvus | so i think it'll be *verbose* but maybe not *complicated* | 19:19 |
ianw | is it correct to say c) means basically every playbooks/service-*.yaml will have a pre_task of include_tasks: base.yaml? | 19:19 |
mordred | corvus: ++ | 19:20 |
mordred | ianw: basically yeah | 19:20 |
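Option c) as ianw describes it might look roughly like this. This assumes the base roles' tasks were refactored into an includable task file (`include_tasks` takes a task file, not a playbook); names are illustrative:

```yaml
# Hypothetical sketch of option c: run the base setup at the top of each
# service playbook, e.g. playbooks/service-zuul.yaml.
- hosts: zuul
  pre_tasks:
    # base.yaml here would be a tasks file covering the shared base
    # setup (users, repos, iptables, ...), assumed to exist for this sketch
    - include_tasks: base.yaml
  roles:
    - service-zuul
```

The trade-off mordred raises above is that this re-runs the (mostly no-op) base setup on every service invocation.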
mordred | corvus: I just had an idea ... | 19:20 |
mordred | corvus: I *think* normal ansible supports roles in subdirs referenced with dot notation... maybe we put our base roles in playbooks/roles/base - so we can make the file matchers list smaller | 19:20 |
mordred | so it would be roles: - base.users | 19:21 |
mordred | I can do a test of that too and see if it works- could allow us to shrink the file matchers | 19:22 |
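The layout mordred is proposing to test might look like this; whether plain Ansible actually resolves the dotted name this way is exactly what the WIP change (730937) is meant to verify:

```yaml
# Assumed directory layout:
#   playbooks/roles/base/users/tasks/main.yaml
#   playbooks/roles/base/iptables/tasks/main.yaml
# A playbook could then (if dot notation works as hoped) reference:
- hosts: all
  roles:
    - base.users      # i.e. playbooks/roles/base/users
    - base.iptables
```

The payoff would be a single `playbooks/roles/base/.*` file matcher instead of one matcher per base role.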
corvus | sounds good | 19:22 |
corvus | hey | 19:22 |
corvus | i wonder if there's something we can do about the hostvars like that too | 19:23 |
corvus | like, could we segregate the base hostvars in such a way that we can more correctly detect when to run base | 19:23 |
corvus | cause just looking at the roles base runs, they pretty much never change, including the data for them | 19:24 |
mordred | corvus: I think that's a good idea | 19:24 |
corvus | but we need to run the job a lot because of hostvars | 19:24 |
clarkb | corvus: if it recursively looks for matching filenames we might be able to use host_vars/zuul/zuul01.yaml and then just match on host_vars/zuul ? | 19:24 |
clarkb | I have no idea if ansible works that way though | 19:25 |
mordred | I do not think it does | 19:25 |
mordred | oh - you know ... | 19:25 |
corvus | the roles are: users, base-repos, base-server, timezone, unbound, exim, snmpd, iptables | 19:25 |
mordred | yeah | 19:25 |
mordred | remote: https://review.opendev.org/730937 WIP Move users into a base subdir | 19:26 |
mordred | I just pushed that up to see if subdirs will work for role organization | 19:26 |
mordred | we could maybe make a second hostvar location into which we just put base-associated hostvars | 19:26 |
mordred | maybe inside of the inventory dir | 19:26 |
corvus | yeah, that sounds worth exploring | 19:26 |
mordred | so we could have inventory/host_vars/zuul01.openstack.org.yarml into which we put base hostvars for zuul - and playbooks/host_vars/zuul01.openstack.org.yaml has hostvars for the zuul service playbook | 19:27 |
mordred | but we should write our own plugin that understands .yarml files | 19:27 |
corvus | that's the key | 19:27 |
mordred | I think it's super important | 19:27 |
mordred | similarly ... | 19:27 |
mordred | we could split the inventory into multiple files | 19:28 |
mordred | (we own the plugin that does that anyway) | 19:28 |
clarkb | we tend to not change the inventory super often so that may be a good followin? | 19:28 |
clarkb | *follow on | 19:28 |
mordred | yeah | 19:28 |
mordred | just thinking of ways to simplify matching what to run when | 19:28 |
clarkb | another approach could be to avoid having zuul's job descriptions determine the rules for us | 19:29 |
clarkb | we could have a system that zuul triggered that determined what to execute based on git deltas or similar | 19:29 |
clarkb | (this is me thinking out loud, I Think that would take far more effort to get right since we'd be building that from scratch and in many ways it will probably look like what zuul is already doing) | 19:30 |
mordred | I think that would wind up needing to implement something similar to file matchers | 19:30 |
mordred | yeah | 19:30 |
corvus | yeah, though it might centralize it a bit | 19:30 |
corvus | (we have have a job that decides what child jobs to run) | 19:30 |
corvus | er | 19:30 |
corvus | (we can have a job that decides what child jobs to run) | 19:30 |
mordred | that's a good point | 19:31 |
corvus | still, i think b and the related things we discussed are a good place to start | 19:32 |
clarkb | as a time check half our hour is gone now. Do we think mordred's plan to try it out on some services is a good place to start or should we discuss this further? | 19:32 |
clarkb | corvus: ++ from me | 19:32 |
corvus | and we can think about a dispatch job if it gets wild | 19:32 |
mordred | I'll get a patch doing b) for service-zuul up after the meeting | 19:32 |
clarkb | mordred: thanks | 19:32 |
clarkb | anything else config management related before we move on? | 19:32 |
mordred | after I get done with that | 19:33 |
mordred | I want to start working on gerrit upgrade | 19:33 |
zbr | super | 19:33 |
clarkb | mordred: exciting, there is a recent thread on gerrit performance post upgrade to notedb that we may want to read over | 19:33 |
clarkb | (as part of upgrade planning) | 19:33 |
mordred | yup. have been reading that | 19:33 |
mordred | some things about frequent aggressive GC | 19:34 |
clarkb | I was somewhat disappointed that some of the response seems to have been "notedb performs less well, deal with it" | 19:34 |
mordred | clarkb moore's law will fix | 19:34 |
corvus | yeah, i think notedb perf depends a lot on caching | 19:35 |
clarkb | #topic OpenDev | 19:36 |
*** openstack changes topic to "OpenDev (Meeting topic: infra)" | 19:36 | |
clarkb | I've sent out the call for advisory board volunteers. | 19:36 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2020-May/000026.html Advisory Board thread. We have some volunteers already! | 19:37 |
clarkb | we've gotten a couple responses so far which is good considering there were holidays in many parts of the world recently | 19:37 |
* hrw joins | 19:38 | |
clarkb | I think the next steps here are to give people enough time to catch up on email, and work through how various groups want to select membership. But in a few weeks I'm hoping we can select a simple operating mechanism for reaching out between the groups (admins and advisory board) | 19:39 |
clarkb | I suggested a simple mailing list subject tag, but we'll see if that makes sense once we've got a membership | 19:39 |
clarkb | any questions or concerns on this? | 19:39 |
fungi | the subject tag may also be more trouble than its worth until volume ramps up on that ml (assuming it ever does) | 19:40 |
clarkb | fungi: ya that's possible, I can see it being nice for client side filtration too though | 19:40 |
clarkb | (even if you are getting all of the emails) | 19:40 |
fungi | ot' | 19:41 |
fungi | grr | 19:41 |
fungi | it's also something we could talk about during the ptg | 19:41 |
clarkb | ++ | 19:41 |
clarkb | #topic General topics | 19:42 |
*** openstack changes topic to "General topics (Meeting topic: infra)" | 19:42 | |
clarkb | #topic pip-and-virtualenv next steps | 19:42 |
*** openstack changes topic to "pip-and-virtualenv next steps (Meeting topic: infra)" | 19:42 | |
clarkb | ianw: I kept this on the agenda as I didn't see announcement of the changes we had planned. Anything we can do to help or is it simply a matter of time now? | 19:43 |
ianw | i've been working on the announcement at https://etherpad.opendev.org/p/rm-pip-and-virtualenv | 19:43 |
ianw | it feels way too long | 19:43 |
fungi | maybe the summary needs a summary? ;) | 19:44 |
clarkb | #link https://etherpad.opendev.org/p/rm-pip-and-virtualenv announcement draft for pip and virtualenv changes | 19:44 |
ianw | fungi: yeah, i'll put a tl;dr at the top | 19:44 |
clarkb | sounds like it is moving along then, should we keep this on the agenda for next meeting? | 19:45 |
corvus | maybe something action focused? like "no action required unless $foo happens in which case you can $bar" | 19:45 |
corvus | which i *think* is the case :) -- ie, i think we're telling people they shouldn't need to do anything, but jobs are varied, and if something breaks they have options | 19:45 |
ianw | corvus: good point; i think the main action will be "if virtualenv is not found, install it" | 19:45 |
clarkb | ++ to giving people shortcut to fixes if they have a problem | 19:46 |
ianw | ok, will do, will ping for reviews on opendev at some point, thanks | 19:47 |
clarkb | #topic DNS cleanup and backups | 19:47 |
*** openstack changes topic to "DNS cleanup and backups (Meeting topic: infra)" | 19:47 | |
clarkb | fungi: I didn't end up sharing the dns zone contents with foundation staff. I think you may have, did that happen? | 19:48 |
fungi | yes, they observed there was a lot they could clean up in there, but no concerns publishing the list of records | 19:48 |
ianw | #link https://review.opendev.org/#/c/728739/ | 19:49 |
ianw | is the final job that backs up all the RAX domains | 19:49 |
clarkb | cool | 19:50 |
ianw | should we etherpad the openstack.org and we can go through it, maybe put "DELETE" next to things people know can go, then I can clean it up at some point? | 19:50 |
clarkb | ianw: that works for me. We can also share that etherpad with the foundation admins and they can help annotate it too? | 19:51 |
clarkb | though maybe it is better for them to delete the things they know about | 19:51 |
clarkb | #topic Using HTTPS with in region mirrors | 19:52 |
*** openstack changes topic to "Using HTTPS with in region mirrors (Meeting topic: infra)" | 19:52 | |
clarkb | moving along as we only have a few minutes left | 19:52 |
clarkb | #link https://review.opendev.org/730861 Test ssl with mirrors via base-test | 19:52 |
ianw | #link https://etherpad.opendev.org/p/rax-dns-openstack-org | 19:52 |
ianw | ^^ to go through | 19:53 |
clarkb | #link https://review.opendev.org/730862 Use ssl with mirrors in production if base-test is happy | 19:53 |
clarkb | I've been pushing this along now that ianw rebuilt all of our mirrors. The big upside to this is we get a bit more assurance that nothing silly is happening with packages since we don't sign them with reprepro | 19:53 |
clarkb | also with pypi it will be a nice to have too since it basically relies on ssl for all its trust | 19:53 |
clarkb | the first change I've linked will update base-test, we can check bindep and things are happy with it, then roll it out globally | 19:54 |
mordred | ++ | 19:54 |
clarkb | #topic Scaling Meetpad/Jitsi Meet | 19:54 |
*** openstack changes topic to "Scaling Meetpad/Jitsi Meet (Meeting topic: infra)" | 19:54 | |
clarkb | my change to configure a jvb server has landed. I think that means we can deploy a jvb server | 19:55 |
corvus | let's do it | 19:55 |
clarkb | This is something I can likely do tomorrow if no one else can do it sooner | 19:55 |
corvus | it'll be good to have that up and running, then we can add others quickly if needed | 19:55 |
clarkb | any new server called jvb01.opendev.org in our inventory should be configured properly (and jvb02, 03, 99 etc) | 19:55 |
fungi | hopefully we don't need more than 100 | 19:56 |
fungi | though we probably support an arbitrary number of digits there | 19:56 |
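The naming scheme joked about above (jvb01, jvb02, ... jvb99, and beyond) can be illustrated with a small hypothetical matcher. The actual inventory matching logic is not shown in the log; this only demonstrates that a digit-run pattern supports an arbitrary number of digits, as fungi suggests:

```python
import re

# Hypothetical pattern for jvbNN.opendev.org hosts: \d+ places no upper
# bound on the digit count, so jvb100 matches just as well as jvb01.
pattern = re.compile(r"^jvb\d+\.opendev\.org$")

hosts = [
    "jvb01.opendev.org",
    "jvb99.opendev.org",
    "jvb100.opendev.org",   # three digits also match
    "zuul01.opendev.org",   # unrelated host, ignored
]
jvb_hosts = [h for h in hosts if pattern.match(h)]
print(jvb_hosts)
```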
clarkb | I'm totally happy with someone else spinning one up today if they have time. otherwise I've got it on the todo list for tomorrow | 19:56 |
clarkb | #topic Project Renames | 19:57 |
*** openstack changes topic to "Project Renames (Meeting topic: infra)" | 19:57 | |
clarkb | A reminder that June 12 was pencilled in last week. I think that is still looking ok for me | 19:57 |
fungi | wfm | 19:57 |
clarkb | I'll try to bring it up with the openstack tc soon so that any other renames they want can be included at that time too | 19:57 |
clarkb | #topic Virtual PTG Attendance | 19:58 |
*** openstack changes topic to "Virtual PTG Attendance (Meeting topic: infra)" | 19:58 | |
clarkb | #link https://virtualptgjune2020.eventbrite.com Register if you plan to attend. This helps with planning details. | 19:58 |
clarkb | #link https://etherpad.opendev.org/p/opendev-virtual-ptg-june-2020 PTG Ideas | 19:58 |
clarkb | Please register and feel free to add ideas to our planning document | 19:58 |
clarkb | Any thoughts on whether or not we should have this meeting next week? | 19:58 |
clarkb | unlike a normal PTG we won't be distracted by travel and timezones. The PTG times we've requested do not conflict with this meeting | 19:59 |
clarkb | I guess I can show up and if others do too we'll have a meeting. | 19:59 |
corvus | yeah, maybe play it by ear? | 20:00 |
clarkb | I don't have to go out of my way to be around during that time period | 20:00 |
clarkb | also a reminder that the PTG is happening next week | 20:00 |
clarkb | and that takes us to the end of our hour | 20:00 |
clarkb | thank you everyone | 20:00 |
clarkb | #endmeeting | 20:00 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 20:01 | |
openstack | Meeting ended Tue May 26 20:00:58 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:01 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-26-19.01.html | 20:01 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-26-19.01.txt | 20:01 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-26-19.01.log.html | 20:01 |
fungi | thanks clarkb! | 20:01 |
*** hrw has left #opendev-meeting | 20:01 | |
*** tobiash has quit IRC | 22:33 | |
*** tobiash has joined #opendev-meeting | 22:34 |