*** hamalq_ has quit IRC | 03:45 | |
*** hamalq has joined #opendev-meeting | 06:44 | |
*** sboyron has joined #opendev-meeting | 07:51 | |
*** hashar has joined #opendev-meeting | 08:07 | |
*** hamalq has quit IRC | 09:10 | |
*** hamalq has joined #opendev-meeting | 09:14 | |
*** hamalq_ has joined #opendev-meeting | 09:16 | |
*** hamalq has quit IRC | 09:18 | |
*** hamalq_ has quit IRC | 09:20 | |
*** hamalq has joined #opendev-meeting | 10:00 | |
*** hamalq has quit IRC | 10:04 | |
*** hashar is now known as hasharLunch | 11:26 | |
*** hamalq has joined #opendev-meeting | 12:01 | |
*** hamalq has quit IRC | 12:05 | |
*** hasharLunch is now known as hashar | 12:37 | |
*** hamalq has joined #opendev-meeting | 14:02 | |
*** hamalq has quit IRC | 14:07 | |
*** hamalq has joined #opendev-meeting | 14:17 | |
*** hamalq has quit IRC | 14:22 | |
*** hashar has quit IRC | 15:31 | |
*** hashar has joined #opendev-meeting | 16:00 | |
*** hamalq has joined #opendev-meeting | 16:18 | |
*** hamalq has quit IRC | 16:23 | |
*** hamalq has joined #opendev-meeting | 16:56 | |
*** hamalq has quit IRC | 17:01 | |
*** hamalq has joined #opendev-meeting | 17:06 | |
*** guillaumec has joined #opendev-meeting | 17:07 | |
*** hashar is now known as hasharDinner | 18:02 | |
fungi | ahoy! | 19:00 |
---|---|---|
clarkb | hello | 19:00 |
*** frickler has joined #opendev-meeting | 19:00 | |
clarkb | we'll get started shortly (if you are looking for the opendev infra meeting you are in the right place) | 19:00 |
clarkb | #startmeeting infra | 19:01 |
openstack | Meeting started Tue Dec 8 19:01:14 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
*** openstack changes topic to " (Meeting topic: infra)" | 19:01 | |
openstack | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2020-December/000151.html Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
*** openstack changes topic to "Announcements (Meeting topic: infra)" | 19:01 | |
ianw | o/ | 19:02 |
clarkb | I intend to be far away from keyboards next week. I'll also be doing school duties to give my wife a break from that so will be distracted either way. This means that we will need a meeting chair volunteer or we can cancel the next meeting | 19:02 |
clarkb | then for the 22nd and 29th I figured we'd play it more by ear as others are also likely taking time? | 19:02 |
corvus | clarkb: are you away all next week? | 19:03 |
fungi | with things getting quiet, having fewer meetings might also just be nice | 19:03 |
clarkb | corvus: ya sorry, trying to take the week off and get some rest/reset | 19:03 |
corvus | don't be sorry :) | 19:03 |
fungi | i heard he's taking a week-long trip to oregon | 19:03 |
corvus | (just wanted to be clear if it was a day or a week) | 19:03 |
ianw | ++ sounds good :) | 19:03 |
corvus | i'll be around through the 23rd, then not around | 19:04 |
clarkb | if you will be around and want to chair either let me know or maybe just send out a meeting agenda email on monday | 19:04 |
clarkb | #topic Actions from last meeting | 19:05 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:05 | |
fungi | i'm in favor of just having fewer meetings and handling things as they come up | 19:05 |
clarkb | fungi: that wfm | 19:05 |
clarkb | #undo | 19:05 |
openstack | Removing item from minutes: #topic Actions from last meeting | 19:05 |
corvus | yep | 19:05 |
clarkb | in that case why don't we consider the meeting cancelled and we can schedule meetings as necessary instead with those who happen to be around | 19:05 |
fungi | seconded | 19:05 |
clarkb | and apply similar logic to the 22nd and 29th | 19:05 |
clarkb | #topic Actions from last meeting | 19:06 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:06 | |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-12-01-19.01.txt minutes from last meeting | 19:06 |
clarkb | there were no actions recorded so let's just dive in | 19:06 |
clarkb | #topic Priority Efforts | 19:06 |
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" | 19:06 | |
clarkb | #topic OpenDev | 19:07 |
*** openstack changes topic to "OpenDev (Meeting topic: infra)" | 19:07 | |
clarkb | On the gerrit side of things I listed out a few items for further tuning consideration | 19:07 |
clarkb | last night (relative to me) ianw ended up restarting gerrit as it became non responsive. I think that possibly the lack of memory headroom with java 11 may be related to that? I've pushed up a change to reduce allowed heap size to 44g from 48g | 19:07 |
clarkb | I think java 11's non heap space is larger than java 8's | 19:08 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/766020 reduce java heap size on review.o.o | 19:08 |
clarkb | that should give us more room for things like apache, git gc, backups, and so on | 19:08 |
fungi | this all seems reasonable. i've +2'd but not approved the changes you recommended for the next restart | 19:08 |
clarkb | even if the memory wasn't at fault for the issue last night I think we're seeing swapping and should avoid it if possible | 19:08 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/765867 Put more jgit configs in jgit.config | 19:09 |
clarkb | This change is the other one that I think we should consider for the next restart. Reading the gerrit documentation it is completely confusing as to whether our preexisting jgit tunables in gerrit.config apply anymore or if they need to be in jgit.config | 19:09 |
clarkb | I tried poking around the source to figure it out but failed doing that as well. Instead I'm thinking let's put the config options in both files and see if we get a difference in behavior | 19:10 |
ianw | yeah, the host was responsive, but gerrit was not. my debugging was not extensive unfortunately | 19:10 |
clarkb | There are two children changes of ^ this change which are worth consideration too, but likely need more eyeballs and should go in on later restarts so we can tell what helps and what doesn't | 19:10 |
clarkb | specifically using packedGitUseStrongRefs and enabling git protocol 2 | 19:10 |
clarkb | for the strong refs the idea there reading stuff from matthias upstream is that when garbage collection happens it has a tendency to flush out jgit caches which jgit then immediately refills and this thrashing can lead to a sad gerrit | 19:11 |
clarkb | the strong refs makes the garbage collector stop doing that. My concern with this change is that I'm not sure if the garbage collector can ever clean strong refs if it needed to? | 19:11 |
clarkb | (strong refs are not eligible for garbage collections) | 19:12 |
clarkb | I think if we land this particular change we should do so when we can monitor it over a long period of time just to keep an eye on memory use | 19:12 |
clarkb | for git protocol v2, the idea there is it is much more efficient for git client operations when dealing with repos that have a lot of refs (like our gerrit repos) | 19:12 |
clarkb | the client must also support it but current git clients default to v2 aiui so as systems update we would get more and more use out of that? | 19:13 |
clarkb | anyway those first two changes should be much safer than the latter two. And if we can get the first two in and restart with them that would probably be good | 19:13 |
fungi | sounds great | 19:14 |
clarkb | The other tunable that I discovered is that gerrit allows you to split its thread resources into batch and interactive user sets. The idea here is that things like CI systems could have dedicated thread resources. I'm not sure if this would help us or not but I noticed it was something called out in tuning discussions | 19:14 |
fungi | i can do another gerrit restart later in my evening for the initial changes you mentioned | 19:15 |
clarkb | if others have time to look into ^ that would probably be good (even if it is to say "no we don't want this as it will starve regular users") | 19:15 |
clarkb | fungi: thanks! | 19:15 |
clarkb | that was all I had for tunables. ianw want to update us on the ci results table progress? | 19:15 |
ianw | for a quick look at what i've got see https://104.130.172.52/c/openstack/diskimage-builder/+/554002 | 19:16 |
ianw | there's a tab | 19:16 |
clarkb | ooh I like that | 19:17 |
corvus | ++ that looks great | 19:17 |
ianw | this is just all very very simple plain javascript @ https://github.com/ianw/gerrit-zuul-summary-status/blob/main/gr-zuul-summary-status/gr-zuul-summary-status-view.js#L87 | 19:17 |
corvus | are comment tags available to the js? | 19:18 |
corvus | (so we could act on that instead of author name?) | 19:18 |
fungi | i guess we call it a "zuul summary" because it's based on parsing zuul's standard comment format, even though it may include results from other non-zuul ci systems reporting in a similar format? | 19:18 |
ianw | corvus: hrm, whatever is in https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#change-info i guess | 19:19 |
ianw | fungi: yeah, i mean that's up for debate i guess | 19:19 |
ianw | i think i can probably get it down to be simple enough to be a single file | 19:19 |
ianw | i am starting to wonder about pushing it upstream, it might feel more at home even as a contrib/ in zuul | 19:21 |
* diablo_rojo sneaks in late | 19:21 | |
fungi | but it installs as a pg plugin? | 19:21 |
clarkb | the big upside to pushing it upstream is that we know there are other zuul users out there with gerrit and they may be more likely to find these things on the gerrit side (as it is a gerrit modification)? | 19:21 |
corvus | ianw: https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#change-message-info 'tag' field | 19:21 |
corvus | yeah, and we might be able to take advantage of the gerrit plugin ecosystem | 19:22 |
clarkb | in fact a zuul user I didn't recognize (sorry if I should've) caught the stream events thing on gerrit 3.3.0 (which we will talk about in a bit) and sent mail about it to the repo discuss list | 19:22 |
corvus | (like i think there's a pluginmanager plugin or something where you can click-to-install gerrit plugins) | 19:22 |
clarkb | ya there is | 19:22 |
clarkb | (we don't have it enabled on our setup) | 19:23 |
corvus | so i'd be in favor of putting that in the upstream gerrit for that and community relations purposes :) | 19:23 |
ianw | corvus: i'll look into it. basically the plugin gets called with a changeinfo object for the current change. my debug method is "console.log" so that's how i inspect what's going on :) | 19:23 |
ianw | yeah, i do have it building via bazel ATM | 19:24 |
ianw | and there's testing frameworks for polymer | 19:24 |
corvus | and a zuul to run those tests :) | 19:24 |
clarkb | anything else to add on this? | 19:25 |
ianw | nope, i'll just keep plugging away on it | 19:25 |
corvus | ianw: if you're okay with pushing that to gerrit's gerrit, i think the next step is to send an email to repo-discuss requesting the repo creation; i can help with that if you want | 19:25 |
ianw | corvus: thanks. i will clean it up a bit more and get back to you | 19:26 |
corvus | fwiw, i think it looks good enough to start iterating on things in parallel :) | 19:27 |
clarkb | Next up is the built in WIP status for changes on newer gerrit. We had hacked in WIP support by adding a -1 approval category that change owners and cores could toggle, but now gerrit supports it directly (for change owners at least). People have started asking about using the actual WIP status instead of the approval category | 19:27 |
clarkb | but I think just now we have discovered that zuul doesn't yet know about the built in wip status and should be updated before we recommend our users use the built in wip status | 19:28 |
clarkb | corvus: fungi zbr any other specifics to call out on that? sounds like work will start soon on addressing that in zuul | 19:28 |
corvus | zbr volunteered to work on a change tomorrow | 19:29 |
fungi | i had nothing to add | 19:29 |
clarkb | ok, I figure once zuul is updated we'll do more testing then we can decide if we want to clean up the old approval hack or not (or at least offer that as an option to users) | 19:29 |
fungi | just be aware it will cause top-of-queue gate resets for now if people accidentally approve wip state changes | 19:30 |
clarkb | oh ya because submit will fail which zuul will think is a merge failure | 19:30 |
fungi | zuul will get as far as trying to submit, right | 19:30 |
clarkb | that is a good point particularly since we have seen deep gate queues in some projects recently (there has been a lot of python trouble with pip lately) | 19:31 |
fungi | python comics, issue #473: the trouble with pip | 19:31 |
clarkb | Last up on the Gerrit OpenDev topic was calling out that Gerrit 3.3.0's event stream implementation breaks zuul's ability to take action on comment contents (think recheck comments) | 19:31 |
clarkb | corvus: ^ are any other zuul interactions with gerrit known to be affected ? | 19:32 |
clarkb | calling this out because upstream is aware of the issue and is working on addressing it, but we should avoid upgrading to 3.3 until it is fixed | 19:33 |
fungi | can we insert after this subtopic the jeepyb lp bug/bp hook scripts? i wanted to know if anyone has made progress on those or if i should try to pick them up next myself tomorrow-ish | 19:33 |
clarkb | sure I think that was all I had on it (basically upgrade to 3.3.0 has found a blocker) | 19:34 |
corvus | clarkb: that's all i'm aware of | 19:34 |
clarkb | fungi: I am not aware of anyone working on them yet | 19:34 |
corvus | latest on the stream-events thing is luca is going to rage code a bunch of tests :) | 19:34 |
fungi | cool, mostly just trying to prioritize the stuff we've been accumulating on the post-upgrade etherpad | 19:34 |
clarkb | fungi: ianw had looked at them briefly pre upgrade iirc, but that was the last I heard | 19:34 |
clarkb | fungi: ++ and thank you | 19:34 |
corvus | he said something about "if it's not tested it's broken" | 19:34 |
fungi | and bug/bp integration seems to be next on the painpoints after/alongside ci results table | 19:35 |
fungi | corvus: i feel like i've heard that somewhere before | 19:35 |
ianw | yeah, i hadn't really got that far with them, but now we have the actual REST API to play against i think we can iterate on it faster | 19:35 |
clarkb | Anything else on the subject of gerrit and or opendev? | 19:36 |
ianw | one quick thing on the system-config gate test for gerrit/review ... what does the review-dev node test over just the review node? | 19:36 |
ianw | i'm wondering if we can prune that to just the one node? | 19:36 |
fungi | it's been pointed out that we may be invalidating gerrit logins more quickly than (we think) we've configured, so i'll also test whether my restart later today invalidates my webui session | 19:37 |
clarkb | ianw: ya I think the idea before we realized that we really need something like a prod alike is that we might have -dev and prod in different stages of upgrades | 19:37 |
clarkb | ianw: since we're doing pre merge testing anyway I think we can probably have a single node that just does the thing we want prod to look like and use it that way | 19:37 |
clarkb | ianw: mordred may remember if there was any better reason than that though | 19:37 |
clarkb | fungi: oh good idea | 19:38 |
mordred | what did I do? | 19:38 |
clarkb | mordred: basically in the system-config job for gerrit we have a review.o.o and review-dev.o.o fake tests nodes separated I think | 19:38 |
fungi | i feel like we should rip out review-dev at this point (and keep in mind when we're ready to also tear down review-test in favor of held job nodes) | 19:38 |
clarkb | fungi: ++ | 19:38 |
ianw | fungi: re the logout, i was not logged out when i restarted it last night my time | 19:39 |
mordred | yeah- I think it was just because they were a bit different | 19:39 |
clarkb | at this point its an artifact of how we didn't have great testing for gerrit and now we can make that better with testing that looks like prod | 19:39 |
mordred | so I think re-collapsing those at this point is ... yup | 19:39 |
fungi | ianw: thanks, that's also a useful data point | 19:39 |
ianw | ok, i will propose that. it makes it a bit simpler doing a full gerrit initialisation and pushing changes in the job | 19:40 |
clarkb | #topic Update Configuration Management | 19:40 |
*** openstack changes topic to "Update Configuration Management (Meeting topic: infra)" | 19:40 | |
clarkb | Has there been any movement on this topic in the last week (sorry gerrit has been overly consuming) | 19:41 |
fungi | this might be the place to remind folks we're running into dockerhub rate limits on our containerized service test jobs | 19:42 |
fungi | no easy answers at this point though | 19:42 |
corvus | is it enough we're ready to decide we want to do something about it? | 19:42 |
clarkb | corvus: probably not? It is just infrequent enough that I haven't rage fixed it :) | 19:42 |
fungi | it hasn't been particularly crippling yet | 19:43 |
fungi | but worth keeping an eye on in case it escalates quickly | 19:43 |
clarkb | it may be worth setting up a job to publish to quay just to see if that works? | 19:43 |
corvus | iirc, we're thinking if it is annoying enough, we should start by looking into squid, and if that fails, we could look at a smart proxy based on zuul-registry but that's high-effort. that still a decent summary? | 19:43 |
clarkb | since that may be an easy out | 19:43 |
clarkb | corvus: yup I think as far as proper fixing goes that is a good summary | 19:44 |
clarkb | (my quay comment is more that "maybe this is an easy half measure to consider alongside ^" and we'd still want to cache for quay anyway) | 19:44 |
fungi | yeah, that's still the latest thinking as far as i'm aware | 19:44 |
clarkb | #topic General Topics | 19:46 |
*** openstack changes topic to "General Topics (Meeting topic: infra)" | 19:46 | |
clarkb | #topic Bup and Borg Backups | 19:46 |
*** openstack changes topic to "Bup and Borg Backups (Meeting topic: infra)" | 19:46 | |
clarkb | ianw: this is still on the agenda mostly as a reminder that we should look out for your bup removal change and +2 that after we've verified borg backups? | 19:46 |
clarkb | ianw: is that change up yet? | 19:46 |
fungi | thanks, that reminds me to actually add restoration docs exercising to my to do list | 19:48 |
ianw | no, it is not. i'll get to it so we can hopefully sort it by year end | 19:48 |
clarkb | thanks | 19:48 |
fungi | there's no rush | 19:48 |
clarkb | #topic OpenStackID hosting | 19:48 |
*** openstack changes topic to "OpenStackID hosting (Meeting topic: infra)" | 19:48 | |
clarkb | This I failed to add to the agenda but the foundation sysadmins have started to think about what a more ideal hosting situation looks like (ignoring who is hosting it) which I think is a good first step in figuring out how we collaborate (if at all) in hosting it | 19:49 |
clarkb | basically taking another look at service needs and requirements and work out how to deploy it well | 19:50 |
clarkb | (this hasn't been forgotten) | 19:50 |
clarkb | Then for remaining topics I may have to declare bankruptcy on ptg followups or at least in the way I've done them before. Meetpad testing with users in china has not happened yet (I'd still like to coordinate that though), and puppet job splitting hasn't happened as far as I can tell | 19:51 |
clarkb | #topic Open Discussion | 19:51 |
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" | 19:51 | |
fungi | thinking back, the reason we insisted on hosting it in opendev previously was that we were tying the gerrit contact store api to it, so new contributors at the time couldn't agree to the (then mandatory for basically all projects) osf icla if openstackid was down, but also we were looking at depending on it for authenticating users to various services | 19:51 |
clarkb | #undo | 19:51 |
openstack | Removing item from minutes: #topic Open Discussion | 19:51 |
fungi | gerrit has since removed the contact store feature entirely so that's no longer a concern | 19:52 |
clarkb | that is a good point, the requirements/needs on our end have shifted too | 19:53 |
fungi | and the only services we set up authenticating against it were translate (openstack-only abandonware which needs to be replaced soonish), refstack (also openstack-only, tied to foundation trademark programs), and survey (beta which never really gained traction) | 19:53 |
clarkb | alright I'll open it up now as we only have a few minutes left | 19:54 |
clarkb | #topic Open Discussion | 19:54 |
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" | 19:54 | |
clarkb | Anything else to call out really quickly? | 19:54 |
diablo_rojo | Nothing from me. | 19:55 |
fungi | i've promised to spend less time on the computer this month. i expect to still be around some of the time but will also be taking more time away as i can to work on some projects around the house. also probably for the last week-ish of the month i may not be around much at all | 19:56 |
clarkb | ya I'll be trying to take it easy around the holidays though in and out | 19:57 |
fungi | for me that probably translates to fixing emergency fires but maybe not much progress on longer term efforts | 19:57 |
fungi | s/fixing/fueling/ ? ;) | 19:59 |
clarkb | heh | 19:59 |
clarkb | anyway we are about at time now. Thanks everyone! | 19:59 |
clarkb | #endmeeting | 19:59 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 19:59 | |
openstack | Meeting ended Tue Dec 8 19:59:38 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:59 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-12-08-19.01.html | 19:59 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-12-08-19.01.txt | 19:59 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-12-08-19.01.log.html | 19:59 |
fungi | thanks clarkb! | 19:59 |
*** hasharDinner has quit IRC | 21:34 | |
*** sboyron has quit IRC | 22:20 |
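
For readers following the CI results table discussion in the OpenDev topic: fungi notes that the summary is built by parsing zuul's standard comment format. Below is a minimal Python sketch of that idea, not ianw's plugin (which is JavaScript running inside Gerrit). The line shape `- <job> <url> : <RESULT> <free text>`, the sample comment, and the names `JobResult`, `parse_ci_comment` and `SAMPLE` are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class JobResult(NamedTuple):
    """One row of a CI results table (hypothetical structure)."""
    job: str
    url: str
    result: str
    detail: str


# Assumed line shape: "- <job> <url> : <RESULT> <free text>".
# This is an illustration of the format, not a definitive grammar.
_RESULT_LINE = re.compile(
    r"^- (?P<job>\S+) (?P<url>\S+) : (?P<result>[A-Z_]+)(?: (?P<detail>.*))?$"
)

# Made-up sample comment; job names and URLs are placeholders.
SAMPLE = """Build failed (check pipeline).

- example-job-a https://zuul.example.org/build/1 : SUCCESS in 4m 12s
- example-job-b https://zuul.example.org/build/2 : FAILURE in 10m 01s (non-voting)
"""


def parse_ci_comment(message: str) -> List[JobResult]:
    """Return one JobResult per line that looks like a job result."""
    results = []
    for line in message.splitlines():
        match = _RESULT_LINE.match(line.strip())
        if match:
            results.append(
                JobResult(
                    job=match["job"],
                    url=match["url"],
                    result=match["result"],
                    detail=match["detail"] or "",
                )
            )
    return results


if __name__ == "__main__":
    for row in parse_ci_comment(SAMPLE):
        print(f"{row.job:15} {row.result:8} {row.detail}")
```

Matching on the line shape rather than on the comment author means a third-party CI system reporting in a similar format would also be picked up. Selecting which comments to parse by the `tag` field corvus points to would be a separate filtering step applied before this function.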
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!