Tuesday, 2021-02-16

*** hamalq has quit IRC [01:47]
*** ianw has quit IRC [06:18]
*** ianw has joined #opendev-meeting [06:19]
*** sboyron_ has joined #opendev-meeting [08:10]
*** hashar has joined #opendev-meeting [08:45]
*** sboyron_ is now known as sboyron [14:51]
*** hashar has quit IRC [15:29]
*** hashar has joined #opendev-meeting [16:22]
*** hashar is now known as hasharDinner [18:07]
*** hamalq has joined #opendev-meeting [18:35]
<clarkb> anyone else here for the meeting? [19:00]
<ianw> o/ [19:00]
<clarkb> #startmeeting infra [19:01]
<openstack> Meeting started Tue Feb 16 19:01:11 2021 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. [19:01]
<openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. [19:01]
*** openstack changes topic to " (Meeting topic: infra)" [19:01]
<openstack> The meeting name has been set to 'infra' [19:01]
<clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2021-February/000184.html Our Agenda [19:01]
<clarkb> Sorry I had meant to send this out yesterday and got it put together but then got distracted by server upgrades [19:01]
<clarkb> #topic Announcements [19:03]
*** openstack changes topic to "Announcements (Meeting topic: infra)" [19:03]
<clarkb> There were none listed [19:03]
<clarkb> #topic Actions from last meeting [19:03]
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" [19:03]
<clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-09-19.01.txt minutes from last meeting [19:03]
<clarkb> we had two (though only one got properly recorded) [19:03]
<clarkb> ianw was looking at wiki borg backups (I think this got done?) [19:04]
<ianw> yes wiki is now manually configured to be backing up to borg [19:04]
<clarkb> corvus had an action to unfork jitsi meet [19:05]
<clarkb> the web component at least (everything else is already unforked) [19:05]
<corvus> not done, feel free to re-action [19:06]
<clarkb> #action corvus unfork jitsi meet web component [19:06]
<clarkb> #topic Priority Efforts [19:06]
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" [19:06]
<clarkb> #topic OpenDev [19:06]
*** openstack changes topic to "OpenDev (Meeting topic: infra)" [19:06]
<fungi> i also saw ianw's request for me to double-check the setup on wiki.o.o, will try to get to that after the meeting wraps up [19:06]
<clarkb> fungi: thanks [19:06]
<clarkb> I did further investigation of gerrit inconsistent accounts and wrote up notes on review-test [19:07]
<clarkb> I won't go through all the status of things because I don't think much has changed since the last meeting [19:07]
<clarkb> but I could use another set or two of eyeballs to look over what I've written down to see if the choices described there make sense [19:07]
<clarkb> if they do then the next step is likely to make that staging All-Users repo and start committing changes [19:08]
<clarkb> we don't need to work through that in the meeting but if you have time to look at it and want me to walk you through it let me know [19:08]
<clarkb> I was going to call out a couple of Gerrit 3.3 related changes but looks like both have merged at this point. Thank you reviewers [19:08]
<clarkb> For the gitea OOM problems we've noticed recently I pushed up a haproxy rate limiting framework change [19:09]
<clarkb> #link https://review.opendev.org/c/opendev/system-config/+/774023 Rate limiting framework change for haproxy. [19:09]
<clarkb> I doubt that is mergeable as is, but if you have a chance to review it and provide thoughts like "never" or "we could probably get away with $RATE" that may be useful for future occurrences [19:09]
<fungi> i'm feeling like if the count of conflicting accounts is really that high, we should consider sorting by which have the most recent (review/owner) activity and prioritize those, then just disable any which are inactive and let people know, rather than manually investigating hundreds of accounts [19:10]
<clarkb> that said, I am beginning to suspect that these problems may be self induced [19:10]
<clarkb> fungi: yup, I'm beginning to think that may be the case. We could do a rough quick search for active accounts, manually check and fix those, then do retirement for all others [19:10]
<fungi> er, inactive in the sense of not used recently [19:10]
<fungi> not the inactive account flag specifically [19:10]
<clarkb> I can look at the data from that perspective and write up a set of alternate notes [19:11]
<fungi> maybe also any which are referenced in groups [19:11]
<clarkb> judging based on the existing data I expect that may be something like 50 accounts max that we have to sort out manually and the rest we can just retire [19:11]
<fungi> but those are likely very few at this point [19:11]
<clarkb> but would need to do that audit [19:11]
<fungi> i can try to help with that [19:11]
<clarkb> thanks [19:11]
<clarkb> To help investigate further if the gitea ooms may be self inflicted by our aggressive project description updates I've been trying to get some server metrics into our system-config-run jobs [19:12]
<clarkb> #link https://review.opendev.org/c/opendev/system-config/+/775051 Dstat stat gathering in our system-config-run jobs to measure relative performance impacts. [19:12]
<clarkb> That failed in gitea previously, but I just pushed a rebase to help make gerrit load testing a thing [19:12]
<clarkb> #link https://review.opendev.org/c/opendev/system-config/+/775883 Gerrit load testing attempt [19:13]
<clarkb> there was a recent email to the gerrit mailing list about gatling-git which can be used to do artificial load testing against a gerrit and that inspired me to write ^ [19:13]
<clarkb> I think that could be very useful for us if I can manage to make it work [19:13]
<clarkb> in particular I'm interested in seeing differences between 3.2 and 3.3 [19:13]
<ianw> lgtm; unfortunately there didn't seem to be a good way to provide a visualization of that i could find [19:13]
<ianw> (dstat) [19:14]
<clarkb> ya I think we can approach this step by step and add bits we find are lacking or would be helpful [19:14]
<clarkb> anyway this has all been in service of me trying to better profile our services as we've had a couple of issues around that recently [19:14]
<clarkb> I think the work is promising but still early and may have very rough edges :) [19:15]
<clarkb> ianw and fungi also updated some links on opendev docs and front page to better point people at our incident list [19:15]
<clarkb> Are there any other opendev related items to bring up before we move on? [19:15]
<clarkb> #topic Update Config Management [19:17]
*** openstack changes topic to "Update Config Management (Meeting topic: infra)" [19:17]
<clarkb> ianw: the new refstack deployment is happy now? we are just waiting on testing before scheduling a migration? [19:18]
<ianw> well i guess i have migrated it [19:18]
<ianw> the data at least [19:19]
<ianw> yes, not sure what else to do other than click around a bit? [19:19]
<clarkb> right but refstack.openstack.org is still pointed at the old server (so we'll need to do testing, then schedule a downtime where we can update dns and remigrate the data) [19:19]
<clarkb> I think kopecmartin had some ideas around testing, probably just point kopecmartin at it to start and see what that turns up [19:19]
<ianw> has any new data come into it? [19:19]
<clarkb> new data does occasionally show up, though I don't know if it has in this window [19:20]
<ianw> you can access the site via https://refstack01.openstack.org/#/ [19:20]
<clarkb> I'll try to catch kopecmartin and point them to ^ [19:21]
<clarkb> and then we can take it from there [19:21]
<ianw> ++ [19:21]
<clarkb> fungi: ianw: I also saw that ansible was reenabled on some afs nodes [19:21]
<clarkb> any updates on that to go over? [19:21]
<fungi> i think it's all caught up, now we can focus on ubuntu upgrades on those [19:22]
<ianw> that was a small problem i created that fungi fixed :) [19:22]
<ianw> yep, trying some in-place focal upgrades is now pretty much top of my todo [19:23]
<fungi> more like a minor oversight in the massive volume of work you completed to get all that done [19:23]
<clarkb> ++ and thanks for the followup there fungi [19:23]
<clarkb> Any other config management items to cover? [19:23]
<ianw> semi related is [19:24]
<ianw> #link https://review.opendev.org/c/opendev/system-config/+/775546 [19:24]
<ianw> to upgrade grafana, just to keep in sync [19:24]
<clarkb> looks like an easy review [19:24]
<clarkb> #topic General Topics [19:25]
*** openstack changes topic to "General Topics (Meeting topic: infra)" [19:25]
<clarkb> We just went over afs so we can skip to bup and borg backups [19:26]
<clarkb> #topic Bup and Borg Backups [19:26]
*** openstack changes topic to "Bup and Borg Backups (Meeting topic: infra)" [19:26]
<clarkb> wiki has been assimilated [19:26]
<fungi> resistance was substantial, but eventually futile [19:26]
<clarkb> any other updates? should we consider removing this topic from our meetings? [19:27]
*** hasharDinner is now known as hashar [19:27]
<ianw> umm maybe keep it for one more week as i clean it up [19:27]
<ianw> #link https://review.opendev.org/c/opendev/system-config/+/766630 [19:28]
<ianw> would be good to look at, which removes bup things [19:28]
<clarkb> ok [19:28]
<ianw> i left removing the cron jobs just as a manual task, it's easy enough to just delete them [19:28]
<clarkb> sounds good [19:29]
<clarkb> #topic Enable Xenial to Bionic/Focal system upgrades [19:30]
*** openstack changes topic to "Enable Xenial to Bionic/Focal system upgrades (Meeting topic: infra)" [19:30]
<clarkb> #link https://etherpad.opendev.org/p/infra-puppet-conversions-and-xenial-upgrades Start capturing TODO list here [19:30]
<clarkb> please add additional info on todo items there. I add them as I come across them (though have many other distractions too) [19:30]
<clarkb> I also intend to start looking at zuul, nodepool, and zookeeper os upgrades as soon as the zuul release settles [19:31]
<clarkb> I'm hopeful we can largely just roll through those by adding new servers, and removing old ones [19:31]
<clarkb> the zuul scheduler being the exception there [19:31]
<fungi> if we were already on zuul v5... ; [19:32]
<fungi> ;) [19:32]
<clarkb> if others have time to start looking at other services (I know ianw has been talking about looking at review, thanks) that would be much appreciated [19:32]
<clarkb> #topic opendev.org not reachable via IPv6 from some ISPs [19:33]
*** openstack changes topic to "opendev.org not reachable via IPv6 from some ISPs (Meeting topic: infra)" [19:33]
<clarkb> frickler put this item on the agenda. frickler are you around to talk about it? If not I'll do my best [19:33]
<frickler> yeah so I brought this up mainly to add some nagging toward mnaser [19:34]
<frickler> or maybe find some other contact at vexxhost [19:34]
<frickler> the issue is that the IPv6 prefix vexxhost is using is not properly registered, so some ISPs (like mine) are not routing it [19:34]
<clarkb> noonedeadpunk is another contact there [19:34]
<frickler> oh, great, I can try that [19:35]
<fungi> it's specifically about how the routes are being announced in bgp, right? [19:35]
<frickler> the issue is in the route registry, which providers use to filter bgp announcements [19:36]
<fungi> usually the way we dealt with it in $past_life was to also announce our aggregates from all borders [19:36]
<frickler> they registered only a /32, but announce multiple /48s instead [19:36]
<clarkb> I see so it's a separate record that routers will check against to ensure they don't accept bad bgp advertisements? [19:36]
<fungi> so you announce the /32 to all your peers but also the individual /48 prefixes or whatever from the gateways which can route for them best [19:37]
<frickler> vexxhost only needs to create route objects for the individual /48s matching what they announce via bgp [19:37]
<fungi> and yes, there is basically a running list maintained by the address registries which says which prefix lengths to expect [19:38]
<fungi> out of what ranges [19:38]
<frickler> the prefix opendev.org is in is 2604:e100:3::/48, which is what they announce via their upstreams [19:39]
<fungi> and operators wishing to optimize their table sizes use that list to implement filters [19:39]
<frickler> but a route object only exists for 2604:e100::/32 [19:39]
<frickler> no, that's not about table size, it is general bgp sanity [19:40]
<frickler> except not too many providers care about that [19:40]
<frickler> but I expect that to change in the future [19:40]
<fungi> the main sanity they care about is "will the table overrun my allocated memory in some routers" [19:40]
<fungi> (and it's no fun when your border routers start crashing and rebooting in a loop as soon as they peer, let me tell you) [19:41]
<frickler> this is more related to the possibility of route hijacking [19:41]
<clarkb> frickler: where does this registry live? arin (those IPs are hosted in the USA iirc) [19:42]
<fungi> yeah, but that possibility exists with or without that filter list, and affects v4 as well [19:42]
<frickler> in that case it would be arin maybe, though the /32 is registered in radb [19:42]
<clarkb> (mostly just curious, I know we can't update it for them) [19:42]
<frickler> I don't know all the details for american networks, in europe it would be RIPE [19:43]
<clarkb> ok, in any case I would see if noonedeadpunk can help [19:43]
<ianw> (ftp://ftp.radb.net/radb/dbase/level3.db.gz contains a large amount of ascii art of cartoon characters, which is ... interesting) [19:43]
<clarkb> anything else on this topic? [19:44]
*** hashar has quit IRC [19:45]
<frickler> no, fine for me [19:45]
<clarkb> #topic Open Discussion [19:45]
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" [19:45]
<clarkb> Anything else? [19:45]
*** hashar has joined #opendev-meeting [19:45]
<fungi> yeah, the individual lirs make and (generally) publish their allocation policies indicating what size allocations they're making from what ranges [19:46]
<fungi> they tend to expect you to at least have aggregates announced for those [19:46]
<fungi> er, s/lirs/rirs/ [19:47]
<clarkb> sounds like that may be it? [19:48]
<clarkb> I'll give it another couple of minutes [19:48]
<fungi> you find recommendations like "route-filter 2600::/12 prefix-length-range /19-/32;" in old lists, e.g. https://www.space.net/~gert/RIPE/ipv6-filters.html [19:49]
<fungi> that's the /12 which covers our address, and the recommendation is to only accept prefixes between /19 and /32 long in it [19:50]
<clarkb> and sounds like that may be it, thanks everyone. [19:51]
<fungi> so if a provider is using a filter like that, they'll discard the /48 routes vexxhost is announcing [19:51]
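[Editor's note: the filter behaviour fungi describes above can be sketched in a few lines of Python. This is only an illustration, not how any router implements it. The function name accepted_by_length_filter is hypothetical; the filter values come from the "route-filter 2600::/12 prefix-length-range /19-/32;" recommendation quoted above, and the two prefixes are the ones frickler gave earlier in the meeting.]

```python
import ipaddress


def accepted_by_length_filter(route, covering, min_len, max_len):
    """Return True if `route` lies inside `covering` and its prefix
    length is between min_len and max_len (inclusive).

    Hypothetical helper: it only mimics the prefix-length-range
    check discussed in the log, nothing else a router would do.
    """
    net = ipaddress.ip_network(route)
    cover = ipaddress.ip_network(covering)
    return net.subnet_of(cover) and min_len <= net.prefixlen <= max_len


# Filter quoted in the log: accept /19 through /32 inside 2600::/12.
FILTER = ("2600::/12", 19, 32)

# The registered aggregate passes the filter.
print(accepted_by_length_filter("2604:e100::/32", *FILTER))    # True

# The more specific route that covers opendev.org is rejected.
print(accepted_by_length_filter("2604:e100:3::/48", *FILTER))  # False
```

[Under such a filter only the /32 aggregate would be accepted, which matches fungi's point that a provider filtering this way discards the /48 routes.]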
<clarkb> we can continue the ipv6 discussion in #opendev [19:51]
<fungi> thanks clarkb! [19:51]
<clarkb> #endmeeting [19:51]
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" [19:51]
<openstack> Meeting ended Tue Feb 16 19:51:20 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [19:51]
<openstack> Minutes:        http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-16-19.01.html [19:51]
<openstack> Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-16-19.01.txt [19:51]
<openstack> Log:            http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-02-16-19.01.log.html [19:51]
*** hashar has quit IRC [20:35]
*** sboyron has quit IRC [21:47]