*** zbr is now known as zbr|ruck | 09:26 | |
clarkb | fungi: corvus: I'll startmeeting infra here? | 14:50 |
---|---|---|
fungi | sgtm | 14:51 |
corvus | oh hi | 14:51 |
clarkb | #startmeeting infra | 14:51 |
openstack | Meeting started Fri Jul 24 14:51:37 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:51 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:51 |
*** openstack changes topic to " (Meeting topic: infra)" | 14:51 | |
openstack | The meeting name has been set to 'infra' | 14:51 |
clarkb | #topic Gerrit Project Renames July 24, 2020 | 14:51 |
*** openstack changes topic to "Gerrit Project Renames July 24, 2020 (Meeting topic: infra)" | 14:51 | |
clarkb | #link https://etherpad.opendev.org/p/gerrit-2020-07-24 | 14:52 |
fungi | clarkb: we've used opendev-maint for the meeting name previously i think? | 14:52 |
clarkb | ah that is what I was wondering. Maybe we can just mv this file when we are done? | 14:52 |
clarkb | or should I end and start new? | 14:52 |
fungi | it's probably fine for this one | 14:53 |
clarkb | k | 14:53 |
fungi | i missed you were asking what meeting name to use, sorry :/ | 14:53 |
clarkb | I have started a root screen on bridge and run disable-ansible there. We are also waiting for a couple of openstack releases to flush through the release and post pipelines. | 14:53 |
clarkb | Once the releases are complete we'll proceed with irc notices and running the playbook | 14:54 |
fungi | i've checked the renames data change and the copy of the yaml file from it on bridge.o.o, both identical and correct | 14:55 |
fungi | they match what's going on in the rename changes too | 14:56 |
clarkb | there is also a single ironic job that should end any minute now and if it succeeds will flush about 5 changes so may wait for that too, though I think that is less critical | 14:56 |
fungi | it's zuul.opendev.org/t/openstack/status/ | 15:01 |
fungi | er, mispaste | 15:01 |
clarkb | we're still waiting on nodes for the releases but I think all of the jobs are queued at this point. Now we wait | 15:02 |
clarkb | nova's release notes job takes more than half an hour? | 15:03 |
clarkb | I imagine its safe to restart Gerrit while that is running since it isn't pushing tags | 15:04 |
fungi | yeah, the problem is if we take gerrit down during tag pushes, slightly less so for the job which proposes constraints changes | 15:04 |
fungi | the last build for that ironic change is also just wrapping up now | 15:05 |
clarkb | fungi: it would probably be good to wait at least for the release-openstack-python jobs to finish too in case they fetch from gerrit? | 15:05 |
clarkb | I don't think they do but I'm always surprised :) | 15:06 |
fungi | i don't think they do. propose-update-constraints will definitely be pushing to gerrit though | 15:08 |
clarkb | oh I see | 15:09 |
clarkb | ya | 15:09 |
clarkb | we are 4 minutes away for the one release | 15:10 |
clarkb | fungi: https://zuul.opendev.org/t/openstack/build/ad0de3a8325c4c3ab4c462e6ee1bf509 nova release failed on that | 15:13 |
clarkb | is that something we can deal with after the maintenance? | 15:13 |
clarkb | looks like it did upload to pypi | 15:14 |
clarkb | and it ran on ze10 (not sure if that was one of the time delta servers) | 15:14 |
fungi | ugh, yes that's the same problem i was trying to track down on ze11 yesterday | 15:15 |
fungi | ze02, ze10 and ze11 were the three which spontaneously rebooted at various times on wednesday | 15:15 |
fungi | it'll have to be dealt with after the maintenance | 15:16 |
fungi | it involves manually copying files from pypi into afs | 15:16 |
clarkb | I'm checking with smcginnis now that we are good to proceed with maintenance from their side | 15:17 |
clarkb | all the tagging and constraints have been pushed I think | 15:17 |
clarkb | fungi: corvus I've got the ansible playbook command queued up in screen. smcginnis thinks we are clear to proceed. | 15:19 |
corvus | ++ | 15:19 |
clarkb | will running the status notice here confuse the bots: #status notice We are renaming projects in Gerrit and review.opendev.org will experience a short outage. Thank you for your patience. | 15:19 |
fungi | yep, should be clear to proceed | 15:20 |
clarkb | I'll run it in #opendev to avoid any bot confusion | 15:20 |
fungi | it shouldn't confuse the bots | 15:20 |
clarkb | oh well then I guess I'll try it here | 15:20 |
clarkb | #status notice We are renaming projects in Gerrit and review.opendev.org will experience a short outage. Thank you for your patience. | 15:20 |
openstackstatus | clarkb: sending notice | 15:20 |
fungi | the only real confusion will be if we switch meeting topics while still under alert | 15:20 |
fungi | i think | 15:21 |
-openstackstatus- NOTICE: We are renaming projects in Gerrit and review.opendev.org will experience a short outage. Thank you for your patience. | 15:21 | |
fungi | or end the meeting while still under alert | 15:21 |
clarkb | ya I didn't alert in part to avoid topic changes | 15:21 |
clarkb | Once it says it is done I'll remove my # prefix on the bridge screen command line and run the playbook? | 15:21 |
fungi | the maintenance itself will take less time to complete than undoing the alert would anyway | 15:22 |
fungi | yeah, looks right | 15:22 |
corvus | ++ | 15:23 |
openstackstatus | clarkb: finished sending notice | 15:24 |
clarkb | alright proceeding with the playbook command now | 15:24 |
fungi | please do | 15:24 |
clarkb | it is running | 15:24 |
fungi | no usable temp dir found? | 15:25 |
clarkb | uhm there was grump about a tmp dir? | 15:25 |
clarkb | but its proceeding? | 15:25 |
corvus | review-test | 15:25 |
clarkb | review-test | 15:25 |
clarkb | ya ok | 15:25 |
fungi | yup | 15:25 |
clarkb | it's at the wait for gerrit to come up stage | 15:26 |
clarkb | logs claim it is ready so web should catch up momentarily | 15:28 |
fungi | api has gone from refusing to hanging | 15:28 |
fungi | help responds now | 15:28 |
fungi | ls-projects isn't returning yet for me though | 15:29 |
corvus | i can load changes using the web ui | 15:29 |
clarkb | yup web ui is up for me but ls-projects isn't working yet | 15:29 |
clarkb | show-queue works /me tries ls-projects again | 15:30 |
fungi | now it's outputting the projects list for me, just pausing/buffering slowly | 15:30 |
fungi | maybe project listing is slow at startup | 15:31 |
fungi | okay it finally returned for me | 15:31 |
fungi | and now it returns quickly when rerunning | 15:31 |
clarkb | yup my initial call errored but now if I do it it works | 15:31 |
fungi | so we're probably in the clear to move along | 15:31 |
clarkb | corvus: ^ you think we are ready too? | 15:31 |
corvus | yep | 15:32 |
clarkb | that lgtm other than review-test | 15:32 |
fungi | why does it run against review-test? | 15:33 |
clarkb | fungi: we must be using the review group and review-test is in it? | 15:33 |
clarkb | would probably be a good idea to be more explicit or remove review-test from that group but that will need investigating | 15:33 |
fungi | ahh, that would make sense, yeah | 15:33 |
fungi | okay, so now we land the rename changes? | 15:34 |
clarkb | we are ready to merge https://review.opendev.org/#/c/739286/ and https://review.opendev.org/#/c/738979 right? | 15:34 |
fungi | yeah i think so | 15:34 |
clarkb | this is where we had problems last time but the ansible disablement seems to be working so I think its ok | 15:34 |
fungi | also 742731 right? | 15:34 |
clarkb | and I should force merge those because the first change cannot merge as is? | 15:34 |
clarkb | fungi: ya that one too but it can happen without force merging and before or after we disable ansible | 15:35 |
corvus | yeah, i think force-merge 738286 | 15:35 |
fungi | yeah, unless we want to split the zuul tenant config change out we'll need to bypass zuul | 15:35 |
clarkb | corvus: should I force merge both so that we can reenable ansible more quickly? | 15:35 |
corvus | we'll get a zuul config error which will reconcile once we re-enable zuul on bridge and deploy | 15:35 |
corvus | clarkb: yes i think it's fine to do both | 15:36 |
clarkb | ok I'll force merge both now | 15:36 |
fungi | we need all three merged before we turn ansible back on though, right? | 15:36 |
fungi | er, no i guess the renames data is only used if we rebuild gitea servers | 15:37 |
clarkb | fungi: no, the record change is purely information and not processed by automation (yet) | 15:37 |
fungi | so that can merge in its own time, yeah | 15:37 |
corvus | (i had a typo earlier, 739286 not 738286) | 15:37 |
clarkb | corvus: ya I opened them from the etherpad and checked content | 15:38 |
clarkb | I have merged both | 15:38 |
fungi | yep, gerritbot confirmed | 15:38 |
clarkb | https://gitea01.opendev.org:3000/openstack/project-config/commits/branch/master I'm checking that on 01 to 08 now | 15:39 |
clarkb | 01 lgtm | 15:39 |
clarkb | all 8 lgtm | 15:40 |
fungi | yep, i also just finished checking them. all 8 look like they have those now | 15:40 |
clarkb | deploy has both changes queued up and the first manage projects job should be hitting our ansible is disabled check | 15:40 |
clarkb | fungi: corvus let me know if you think there is anything else we should check before removing the ansible disablement file. I think we should be good to proceed | 15:41 |
corvus | clarkb: i think we're good | 15:41 |
fungi | i think we're ready to go | 15:41 |
clarkb | the file has been rm'd | 15:42 |
clarkb | msg executing local code is prohibited | 15:42 |
clarkb | ok then | 15:42 |
corvus | ? | 15:42 |
clarkb | manage-projects job failed due to ^ | 15:42 |
corvus | link? | 15:43 |
clarkb | getting one | 15:43 |
clarkb | https://zuul.opendev.org/t/openstack/build/948ba0341b334e9db4c2f32779fdae86 | 15:43 |
clarkb | I think we're ok from a renaming standpoint | 15:43 |
clarkb | we just failed to run the job but once the job is working again we'll apply the updated state and noop | 15:44 |
clarkb | however, we cannot create new projects at the moment | 15:44 |
clarkb | it is the git repo update | 15:45 |
clarkb | we may actually not update anything right now :/ | 15:45 |
clarkb | I'm wondering if that means we want to disable ansible again to keep periodic jobs from potentially being unhappy about things? | 15:45 |
fungi | i guess we haven't actually run infra-prod-manage-projects successfully since the upgrade: https://zuul.opendev.org/t/openstack/builds?job_name=infra-prod-manage-projects | 15:46 |
corvus | i suspect we don't really need to disable since it's just going to continue to bomb at the start | 15:47 |
clarkb | fungi: or any other infra-prod playbook | 15:47 |
clarkb | corvus: ya I think that is correct since this is very early in the infra prod playbook | 15:47 |
corvus | we're using the zuul project ssh key for access control, right? | 15:48 |
clarkb | corvus: yes | 15:48 |
clarkb | it should be added to the bridge zuul user | 15:48 |
corvus | i think we're going to need a new base job in a config-project to fix this | 15:49 |
corvus | system-config is currently in the openstack tenant, so we can add it to opendev/base-jobs or openstack/project-config | 15:50 |
corvus | i don't think we have a config-project that's just for the opendev tenant, do we? | 15:50 |
clarkb | corvus: we have opendev/project-config which we don't really use for much yet | 15:50 |
corvus | aha | 15:50 |
corvus | it's not in openstack | 15:51 |
corvus | maybe we should just put this in base-jobs for now, then move it? | 15:51 |
fungi | i'm okay with it going into opendev/base-jobs initially | 15:51 |
clarkb | opendev/base-jobs you mean? ya I think that would be easy enough | 15:51 |
corvus | i mean, no other project will be able to use it because of the project key anyway | 15:52 |
corvus | i'll work on a change | 15:52 |
fungi | we'll eventually shuffle all of that into the opendev tenant regardless | 15:52 |
clarkb | we can also explicitly limit it to openstack/project-config and openstack/system-config right? | 15:52 |
fungi | and can then put it in opendev/project-config when we do | 15:52 |
clarkb | allowed-projects or whatever the term is | 15:52 |
corvus | fungi: yeah, and when we do, we can pull it into a narrower scope | 15:52 |
corvus | clarkb: yes | 15:52 |
clarkb | and the only role it seems to use is prepare-workspace-git which is in zuul-jobs which we include to base-jobs so that will be ok too I think | 15:55 |
clarkb | we can also probably start small and disable most of the CD jobs and try the new one with a single job? and then expand from there as things are happy | 15:56 |
clarkb | hrm unknown project opendev/system-config :/ | 16:00 |
fungi | where did you see that? | 16:00 |
clarkb | on the change corvus just pushed | 16:00 |
corvus | i'll fix it in a minute, working on the other change now | 16:00 |
corvus | that should be openstack | 16:00 |
clarkb | k | 16:00 |
clarkb | oh right | 16:00 |
fungi | oh, in 742934 | 16:01 |
corvus | oh wait no that's right | 16:01 |
corvus | it's just that opendev/system-config is not in every tenant | 16:01 |
corvus | let's just skip the allowed-projects | 16:01 |
clarkb | and by skipping allowed projects other projects can run it but will fail to ssh because they don't have the project key? | 16:02 |
corvus | yep. | 16:03 |
clarkb | ok maybe if we do that we should test it once it has landed (to ensure it fails as expected) | 16:03 |
clarkb | but I may be overly paranoid | 16:03 |
corvus | clarkb: sounds good | 16:04 |
clarkb | corvus: as another option can we add system-config to opendev without loading any configs from it? | 16:08 |
clarkb | its a move we'll do eventually (but with loaded configs) so maybe that is a good step anyway? | 16:08 |
corvus | clarkb: would need to be added to every tenant | 16:09 |
corvus | the issue is opendev/base-jobs is in every tenant, so every tenant needs to understand that job defn | 16:09 |
clarkb | because we load base jobs in every tenant, got it | 16:09 |
corvus | we could add system-config and project-config to every tenant and just "include: []" but i think even that is too messy | 16:09 |
corvus | this gets better once we move system-config into the opendev tenant | 16:10 |
clarkb | ya | 16:10 |
clarkb | https://review.opendev.org/#/admin/groups/459,members is who has approval on that repo fwiw. I'm thinking I may trim it down a bit? | 16:11 |
clarkb | infra-core + dmsimard, frickler, mnaser, ajaeger? | 16:11 |
clarkb | any objections? | 16:12 |
clarkb | I guess the ssh key still protects us there | 16:12 |
clarkb | so its probably fine to leave it as is | 16:12 |
fungi | oh, yeah, lots of emeritus reviewers who haven't been involved for a while | 16:12 |
clarkb | oh no the ssh key doesn't protect us as much there once we consume the base job | 16:12 |
clarkb | so ya I think trimming that a bit makes sense. Any objections to doing it with the group above? | 16:13 |
mnaser | no objection | 16:13 |
mnaser | and if they're active again, we can bring them up anytime | 16:13 |
clarkb | mnaser: ya its an active vs inactive question but also a root vs not root question | 16:13 |
mnaser | ah. well, in full transparency, i have access to vexxhost/base-jobs which is a config project inside vexxhost tenant | 16:14 |
mnaser | so.. i'd lose that.. | 16:14 |
corvus | i think i'm lost | 16:14 |
fungi | mnaser: clarkb is proposing to continue including you | 16:14 |
corvus | we're talking about trimming membership of project-config-core which has approval rights for which repo(s)? | 16:14 |
clarkb | corvus: opendev/base-jobs | 16:15 |
mnaser | oh, sigh, sorry. i thought the + was a ping, and not foo + bar | 16:15 |
corvus | i think clarkb's trimming is appropriate and should have no net effect | 16:15 |
fungi | yes, i agree with the proposed adjustment to that group membership | 16:16 |
AJaeger | clarkb: trimming looks fine - I just wonder about dmsimard, is he still reviewing? I would drop him as well | 16:16 |
clarkb | mnaser: yup I trust both you and AJaeger with that access even if you aren't proper roots | 16:16 |
clarkb | AJaeger: let me double check if we've cleared out his ssh key if so I'll clear from that group too | 16:17 |
AJaeger | clarkb: change Ife3cfdfe3b674c7703adcbcf7f5a4af708fcd03a | 16:17 |
clarkb | dmsimard is removed yup | 16:17 |
clarkb | https://review.opendev.org/#/admin/groups/459,members should be good now. infra-core had ianw and frickler in it so they are removed from the extra individuals list | 16:18 |
clarkb | now I'm going to update infra-core while I'm thinking about it | 16:18 |
clarkb | thats done, should better reflect what we've got in ansible | 16:19 |
AJaeger | clarkb: updates look good to me | 16:19 |
clarkb | I've approved https://review.opendev.org/#/c/742731/1 | 16:24 |
clarkb | I'll end the meeting here as the steps on the etherpad (other than disk cleanups) are completed | 16:28 |
clarkb | #endmeeting | 16:28 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 16:28 | |
openstack | Meeting ended Fri Jul 24 16:28:39 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:28 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-07-24-14.51.html | 16:28 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-07-24-14.51.txt | 16:28 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-07-24-14.51.log.html | 16:28 |
*** zbr|ruck is now known as zbr | 16:41 | |
*** hamalq has joined #opendev-meeting | 17:29 | |
*** AJaeger has quit IRC | 17:45 | |
*** hamalq has quit IRC | 23:10 |
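
At 15:39 the commit log for openstack/project-config was checked by hand on each of the eight Gitea backends. As a rough sketch of that step, the per-backend URLs could be generated rather than typed. This is not the team's tooling: the helper name `gitea_commit_urls` is hypothetical, and the hostname pattern (gitea01 through gitea08, port 3000) is assumed from the single URL pasted in the log plus the "01 to 08" remark.

```python
def gitea_commit_urls(repo, branch="master", count=8, port=3000):
    """Build one commit-log URL per Gitea backend.

    Assumption: backends are named gitea01..gitea<count> under
    opendev.org and serve the web UI on the given port, matching
    the URL pasted in the log above. Hypothetical helper.
    """
    return [
        f"https://gitea{n:02d}.opendev.org:{port}/{repo}/commits/branch/{branch}"
        for n in range(1, count + 1)
    ]


if __name__ == "__main__":
    # Print the URLs so each backend can be opened and compared by eye.
    for url in gitea_commit_urls("openstack/project-config"):
        print(url)
```

The sketch only builds URLs and makes no network requests, so confirming that the force-merged changes appear on each backend is still a manual (or separately scripted) comparison.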