*** hamalq has quit IRC | 01:47 | |
*** ianw has quit IRC | 06:18 | |
*** ianw has joined #opendev-meeting | 06:19 | |
*** sboyron_ has joined #opendev-meeting | 08:10 | |
*** hashar has joined #opendev-meeting | 08:45 | |
*** sboyron_ is now known as sboyron | 14:51 | |
*** hashar has quit IRC | 15:29 | |
*** hashar has joined #opendev-meeting | 16:22 | |
*** hashar is now known as hasharDinner | 18:07 | |
*** hamalq has joined #opendev-meeting | 18:35 | |
clarkb | anyone else here for the meeting? | 19:00 |
---|---|---|
ianw | o/ | 19:00 |
clarkb | #startmeeting infra | 19:01 |
openstack | Meeting started Tue Feb 16 19:01:11 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
*** openstack changes topic to " (Meeting topic: infra)" | 19:01 | |
openstack | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2021-February/000184.html Our Agenda | 19:01 |
clarkb | Sorry I had meant to send this out yesterday and got it put together but then got distracted by server upgrades | 19:01 |
clarkb | #topic Announcements | 19:03 |
*** openstack changes topic to "Announcements (Meeting topic: infra)" | 19:03 | |
clarkb | There were none listed | 19:03 |
clarkb | #topic Actions from last meeting | 19:03 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:03 | |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-09-19.01.txt minutes from last meeting | 19:03 |
clarkb | we had two (though only one got properly recorded) | 19:03 |
clarkb | ianw was looking at wiki borg backups (I think this got done?) | 19:04 |
ianw | yes wiki is now manually configured to be backing up to borg | 19:04 |
clarkb | corvus had an action to unfork jitsi meet | 19:05 |
clarkb | the web component at least (everything else is already unforked) | 19:05 |
corvus | not done, feel free to re-action | 19:06 |
clarkb | #action corvus unfork jitsi meet web component | 19:06 |
clarkb | #topic Priority Efforts | 19:06 |
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" | 19:06 | |
clarkb | #topic OpenDev | 19:06 |
*** openstack changes topic to "OpenDev (Meeting topic: infra)" | 19:06 | |
fungi | i also saw ianw's request for me to double-check the setup on wiki.o.o, will try to get to that after the meeting wraps up | 19:06 |
clarkb | fungi: thanks | 19:06 |
clarkb | I did further investigation of gerrit inconsistent accounts and wrote up notes on review-test | 19:07 |
clarkb | I won't go through all the status of things because I don't think much has changed since the last meeting | 19:07 |
clarkb | but I could use another set or two of eyeballs to look over what I've written down to see if the choices described there make sense | 19:07 |
clarkb | if they do then the next step is likely to make that staging All-Users repo and start committing changes | 19:08 |
clarkb | we don't need to work through that in the meeting but if you have time to look at it and want me to walk you through it let me know | 19:08 |
clarkb | I was going to call out a couple of Gerrit 3.3 related changes but looks like both have merged at this point. Thank you reviewers | 19:08 |
clarkb | For the gitea OOM problems we've noticed recently I pushed up a haproxy rate limiting framework change | 19:09 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/774023 Rate limiting framework change for haproxy. | 19:09 |
clarkb | I doubt that is mergeable as is, but if you have a chance to review it and provide thoughts like "never" or "we could probably get away with $RATE" that may be useful for future occurrences | 19:09 |
fungi | i'm feeling like if the count of conflicting accounts is really that high, we should consider sorting by which have the most recent (review/owner) activity and prioritize those, then just disable any which are inactive and let people know, rather than manually investigating hundreds of accounts | 19:10 |
clarkb | that said, I am beginning to suspect that these problems may be self induced | 19:10 |
clarkb | fungi: yup, I'm beginning to think that may be the case. We could do a rough quick search for active accounts, manually check and fix those, then do retirement for all others | 19:10 |
fungi | er, inactive in the sense of not used recently | 19:10 |
fungi | not the inactive account flag specifically | 19:10 |
clarkb | I can look at the data from that perspective and write up a set of alternate notes | 19:11 |
fungi | maybe also any which are referenced in groups | 19:11 |
clarkb | judging based on the existing data I expect that may be something like 50 accounts max that we have to sort out manually and the rest we can just retire | 19:11 |
fungi | but those are likely very few at this point | 19:11 |
clarkb | but would need to do that audit | 19:11 |
fungi | i can try to help with that | 19:11 |
clarkb | thanks | 19:11 |
clarkb | To help investigate further if the gitea ooms may be self inflicted by our aggressive project description updates I've been trying to get some server metrics into our system-config-run jobs | 19:12 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/775051 Dstat stat gathering in our system-config-run jobs to measure relative performance impacts. | 19:12 |
clarkb | That failed in gitea previously, but I just pushed a rebase to help make gerrit load testing a thing | 19:12 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/775883 Gerrit load testing attempt | 19:13 |
clarkb | there was a recent email to the gerrit mailing list about gatling-git which can be used to do artificial load testing against a gerrit and that inspired me to write ^ | 19:13 |
clarkb | I think that could be very useful for us if I can manage to make it work | 19:13 |
clarkb | in particular I'm interested in seeing differences between 3.2 and 3.3 | 19:13 |
ianw | lgtm; unfortunately there didn't seem to be a good way to provide a visualization of that i could find | 19:13 |
ianw | (dstat) | 19:14 |
clarkb | ya I think we can approach this step by step and add bits we find are lacking or would be helpful | 19:14 |
clarkb | anyway this has all been in service of me trying to better profile our services as we've had a couple of issues around that recently | 19:14 |
clarkb | I think the work is promising but still early and may have very rough edges :) | 19:15 |
clarkb | ianw and fungi also updated some links on opendev docs and front page to better point people at our incident list | 19:15 |
clarkb | Are there any other opendev related items to bring up before we move on? | 19:15 |
clarkb | #topic Update Config Management | 19:17 |
*** openstack changes topic to "Update Config Management (Meeting topic: infra)" | 19:17 | |
clarkb | ianw: the new refstack deployment is happy now? we are just waiting on testing before scheduling a migration? | 19:18 |
ianw | well i guess i have migrated it | 19:18 |
ianw | the data at least | 19:19 |
ianw | yes, not sure what else to do other than click around a bit? | 19:19 |
clarkb | right but refstack.openstack.org is still pointed at the old server (so we'll need to do testing, then schedule a downtime where we can update dns and remigrate the data) | 19:19 |
clarkb | I think kopecmartin had some ideas around testing, probably just point kopecmartin at it to start and see what that turns up | 19:19 |
ianw | has any new data come into it? | 19:19 |
clarkb | new data does occasionally show up, though I don't know if it has in this window | 19:20 |
ianw | you can access the site via https://refstack01.openstack.org/#/ | 19:20 |
clarkb | I'll try to catch kopecmartin and point them to ^ | 19:21 |
clarkb | and then we can take it from there | 19:21 |
ianw | ++ | 19:21 |
clarkb | fungi: ianw: I also saw that ansible was reenabled on some afs nodes | 19:21 |
clarkb | any updates on that to go over? | 19:21 |
fungi | i think it's all caught up, now we can focus on ubuntu upgrades on those | 19:22 |
ianw | that was a small problem i created that fungi fixed :) | 19:22 |
ianw | yep, trying some in-place focal upgrades is now pretty much top of my todo | 19:23 |
fungi | more like a minor oversight in the massive volume of work you completed to get all that done | 19:23 |
clarkb | ++ and thanks for the followup there fungi | 19:23 |
clarkb | Any other config management items to cover? | 19:23 |
ianw | semi related is | 19:24 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/775546 | 19:24 |
ianw | to upgrade grafana, just to keep in sync | 19:24 |
clarkb | looks like an easy review | 19:24 |
clarkb | #topic General Topics | 19:25 |
*** openstack changes topic to "General Topics (Meeting topic: infra)" | 19:25 | |
clarkb | We just went over afs so we can skip to bup and borg backups | 19:26 |
clarkb | #topic Bup and Borg Backups | 19:26 |
*** openstack changes topic to "Bup and Borg Backups (Meeting topic: infra)" | 19:26 | |
clarkb | wiki has been assimilated | 19:26 |
fungi | resistance was substantial, but eventually futile | 19:26 |
clarkb | any other updates? should we consider removing this topic from our meetings? | 19:27 |
*** hasharDinner is now known as hashar | 19:27 | |
ianw | umm maybe keep it for one more week as i clean it up | 19:27 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/766630 | 19:28 |
ianw | would be good to look at, which removes bup things | 19:28 |
clarkb | ok | 19:28 |
ianw | i left removing the cron jobs just as a manual task, it's easy enough to just delete them | 19:28 |
clarkb | sounds good | 19:29 |
clarkb | #topic Enable Xenial to Bionic/Focal system upgrades | 19:30 |
*** openstack changes topic to "Enable Xenial to Bionic/Focal system upgrades (Meeting topic: infra)" | 19:30 | |
clarkb | #link https://etherpad.opendev.org/p/infra-puppet-conversions-and-xenial-upgrades Start capturing TODO list here | 19:30 |
clarkb | please add additional info on todo items there. I add them as I come across them (though have many other distractions too) | 19:30 |
clarkb | I also intend to start looking at zuul, nodepool, and zookeeper os upgrades as soon as the zuul release settles | 19:31 |
clarkb | I'm hopeful we can largely just roll through those by adding new servers, and removing old ones | 19:31 |
clarkb | the zuul scheduler being the exception there | 19:31 |
fungi | if we were already on zuul v5... ; | 19:32 |
fungi | ;) | 19:32 |
clarkb | if others have time to start looking at other services (I know ianw has been talking about looking at review, thanks) that would be much appreciated | 19:32 |
clarkb | #topic opendev.org not reachable via IPv6 from some ISPs | 19:33 |
*** openstack changes topic to "opendev.org not reachable via IPv6 from some ISPs (Meeting topic: infra)" | 19:33 | |
clarkb | frickler put this item on the agenda. frickler are you around to talk about it? If not I'll do my best | 19:33 |
frickler | yeah so I brought this up mainly to add some nagging toward mnaser | 19:34 |
frickler | or maybe find some other contact at vexxhost | 19:34 |
frickler | the issue is that the IPv6 prefix vexxhost is using is not properly registered, so some ISPs (like mine) are not routing it | 19:34 |
clarkb | noonedeadpunk is another contact there | 19:34 |
frickler | oh, great, I can try that | 19:35 |
fungi | it's specifically about how the routes are being announced in bgp, right? | 19:35 |
frickler | the issue is in the route registry, which providers use to filter bgp announcements | 19:36 |
fungi | usually the way we dealt with it in $past_life was to also announce our aggregates from all borders | 19:36 |
frickler | they registered only a /32, but announce multiple /48s instead | 19:36 |
clarkb | I see, so it's a separate record that routers will check against to ensure they don't accept bad bgp advertisements? | 19:36 |
fungi | so you announce the /32 to all your peers but also the individual /48 prefixes or whatever from the gateways which can route for them best | 19:37 |
frickler | vexxhost only needs to create route objects for the individual /48s matching what they announce via bgp | 19:37 |
fungi | and yes, there is basically a running list maintained by the address registries which says which prefix lengths to expect | 19:38 |
fungi | out of what ranges | 19:38 |
frickler | the prefix opendev.org is in is 2604:e100:3::/48, which is what they announce via their upstreams | 19:39 |
fungi | and operators wishing to optimize their table sizes use that list to implement filters | 19:39 |
frickler | but a route object only exists for 2604:e100::/32 | 19:39 |
frickler | no, that's not about table size, it is general bgp sanity | 19:40 |
frickler | except not too many providers care about that | 19:40 |
frickler | but I expect that to change in the future | 19:40 |
fungi | the main sanity they care about is "will the table overrun my allocated memory in some routers" | 19:40 |
fungi | (and it's no fun when your border routers start crashing and rebooting in a loop as soon as they peer, let me tell you) | 19:41 |
frickler | this is more related to the possibility of route hijacking | 19:41 |
clarkb | frickler: where does this registry live? arin (those IPs are hosted in the USA iirc) | 19:42 |
fungi | yeah, but that possibility exists with or without that filter list, and affects v4 as well | 19:42 |
frickler | in that case it would be arin maybe, though the /32 is registered in radb | 19:42 |
clarkb | (mostly just curious, I know we can't update it for them) | 19:42 |
frickler | I don't know all the details for american networks, in europe it would be RIPE | 19:43 |
clarkb | ok, in any case I would see if noonedeadpunk can help | 19:43 |
ianw | (ftp://ftp.radb.net/radb/dbase/level3.db.gz contains a large amount of ascii art of cartoon characters, which is ... interesting) | 19:43 |
clarkb | anything else on this topic? | 19:44 |
*** hashar has quit IRC | 19:45 | |
frickler | no, fine for me | 19:45 |
clarkb | #topic Open Discussion | 19:45 |
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" | 19:45 | |
clarkb | Anything else? | 19:45 |
*** hashar has joined #opendev-meeting | 19:45 | |
fungi | yeah, the individual lirs make and (generally) publish their allocation policies indicating what size allocations they're making from what ranges | 19:46 |
fungi | they tend to expect you to at least have aggregates announced for those | 19:46 |
fungi | er, s/lirs/rirs/ | 19:47 |
clarkb | sounds like that may be it? | 19:48 |
clarkb | I'll give it another couple of minutes | 19:48 |
fungi | you find recommendations like "route-filter 2600::/12 prefix-length-range /19-/32;" in old lists, e.g. https://www.space.net/~gert/RIPE/ipv6-filters.html | 19:49 |
fungi | that's the /12 which covers our address, and the recommendation is to only accept prefixes between /19 and /32 long in it | 19:50 |
clarkb | and sounds like that may be it, thanks everyone. | 19:51 |
fungi | so if a provider is using a filter like that, they'll discard the /48 routes vexxhost is announcing | 19:51 |
clarkb | we can continue the ipv6 discussion in #opendev | 19:51 |
fungi | thanks clarkb! | 19:51 |
clarkb | #endmeeting | 19:51 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 19:51 | |
openstack | Meeting ended Tue Feb 16 19:51:20 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:51 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-16-19.01.html | 19:51 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-16-19.01.txt | 19:51 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-16-19.01.log.html | 19:51 |
*** hashar has quit IRC | 20:35 | |
*** sboyron has quit IRC | 21:47 |
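
The two checks described in the IPv6 discussion (a prefix-length filter such as `route-filter 2600::/12 prefix-length-range /19-/32;`, and a route object that matches the announced prefix) can be sketched with Python's standard `ipaddress` module. This is only an illustration under stated assumptions: the function names are hypothetical, the exact-match rule for route objects is an assumption based on frickler's description, and real providers may apply different policies (for example, accepting more-specific prefixes covered by a registered aggregate).

```python
import ipaddress


def passes_length_filter(prefix, covering, min_len, max_len):
    """Hypothetical helper: mimic a 'prefix-length-range' style filter.

    Accept the prefix only if it lies inside the covering block and its
    length is within [min_len, max_len].
    """
    net = ipaddress.ip_network(prefix)
    cover = ipaddress.ip_network(covering)
    return net.subnet_of(cover) and min_len <= net.prefixlen <= max_len


def has_exact_route_object(prefix, route_objects):
    """Hypothetical helper: is there a registered route object that
    exactly matches the announced prefix? (Assumed strict policy.)"""
    net = ipaddress.ip_network(prefix)
    return any(net == ipaddress.ip_network(r) for r in route_objects)


# Values taken from the discussion above.
announced = "2604:e100:3::/48"    # prefix opendev.org is in, as announced
registered = ["2604:e100::/32"]   # the only route object that exists

# The /48 is too long for a /19-/32 filter on 2600::/12, so it is dropped.
filter_ok = passes_length_filter(announced, "2600::/12", 19, 32)

# The aggregate /32 itself would pass the same filter.
aggregate_ok = passes_length_filter(registered[0], "2600::/12", 19, 32)

# No route object matches the /48 exactly, only the covering /32.
exact_ok = has_exact_route_object(announced, registered)

print(filter_ok, aggregate_ok, exact_ok)
```

Under these assumptions, either registering route objects for the individual /48s (frickler's suggestion) or also announcing the /32 aggregate (fungi's suggestion) would give a filtering provider a route it can accept.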