Friday, 2020-07-24

*** zbr is now known as zbr|ruck09:26
clarkbfungi: corvus: I'll startmeeting infra here?14:50
fungisgtm14:51
corvusoh hi14:51
clarkb#startmeeting infra14:51
openstackMeeting started Fri Jul 24 14:51:37 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.14:51
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.14:51
*** openstack changes topic to " (Meeting topic: infra)"14:51
openstackThe meeting name has been set to 'infra'14:51
clarkb#topic Gerrit Project Renames July 24, 202014:51
*** openstack changes topic to "Gerrit Project Renames July 24, 2020 (Meeting topic: infra)"14:51
clarkb#link https://etherpad.opendev.org/p/gerrit-2020-07-2414:52
fungiclarkb: we've used opendev-maint for the meeting name previously i think?14:52
clarkbah that is what I was wondering. Maybe we can just mv this file when we are done?14:52
clarkbor should I end and start new?14:52
fungiit's probably fine for this one14:53
clarkbk14:53
fungii missed you were asking what meeting name to use, sorry :/14:53
clarkbI have started a root screen on bridge and run disable-ansible there. We are also waiting for a couple of openstack releases to flush through the release and post pipelines.14:53
clarkbOnce the releases are complete we'll proceed with irc notices and running the playbook14:54
fungii've checked the renames data change and the copy of the yaml file from it on bridge.o.o, both identical and correct14:55
fungithey match what's going on in the rename changes too14:56
clarkbthere is also a single ironic job that should end any minute now and if it succeeds will flush about 5 changes so may wait for that too, though I think that is less critical14:56
fungiit's zuul.opendev.org/t/openstack/status/15:01
fungier, mispaste15:01
clarkbwe're still waiting on nodes for the releases but I think all of the jobs are queued at this point. Now we wait15:02
clarkbnova's release notes job takes more than half an hour?15:03
clarkbI imagine it's safe to restart Gerrit while that is running since it isn't pushing tags15:04
fungiyeah, the problem is if we take gerrit down during tag pushes, slightly less so for the job which proposes constraints changes15:04
fungithe last build for that ironic change is also just wrapping up now15:05
clarkbfungi: it would probably be good to wait at least for the release-openstack-python jobs to finish too in case they fetch from gerrit?15:05
clarkbI don't think they do but I'm always surprised :)15:06
fungii don't think they do. propose-update-constraints will definitely be pushing to gerrit though15:08
clarkboh I see15:09
clarkbya15:09
clarkbwe are 4 minutes away for the one release15:10
clarkbfungi: https://zuul.opendev.org/t/openstack/build/ad0de3a8325c4c3ab4c462e6ee1bf509 nova release failed on that15:13
clarkbis that something we can deal with after the maintenance?15:13
clarkblooks like it did upload to pypi15:14
clarkband it ran on ze10 (not sure if that was one of the time delta servers)15:14
fungiugh, yes that's the same problem i was trying to track down on ze11 yesterday15:15
fungize02, ze10 and ze11 were the three which spontaneously rebooted at various times on wednesday15:15
fungiit'll have to be dealt with after the maintenance15:16
fungiit involves manually copying files from pypi into afs15:16
clarkbI'm checking with smcginnis now that we are good to proceed with maintenance from their side15:17
clarkball the tagging and constraints have been pushed I think15:17
clarkbfungi: corvus I've got the ansible playbook command queued up in screen. smcginnis thinks we are clear to proceed.15:19
corvus++15:19
clarkbwill running the status notice here confuse the bots: #status notice We are renaming projects in Gerrit and review.opendev.org will experience a short outage. Thank you for your patience.15:19
fungiyep, should be clear to proceed15:20
clarkbI'll run it in #opendev to avoid any bot confusion15:20
fungiit shouldn't confuse the bots15:20
clarkboh well then I guess I'll try it here15:20
clarkb#status notice We are renaming projects in Gerrit and review.opendev.org will experience a short outage. Thank you for your patience.15:20
openstackstatusclarkb: sending notice15:20
fungithe only real confusion will be if we switch meeting topics while still under alert15:20
fungii think15:21
-openstackstatus- NOTICE: We are renaming projects in Gerrit and review.opendev.org will experience a short outage. Thank you for your patience.15:21
fungior end the meeting while still under alert15:21
clarkbya I didn't alert in part to avoid topic changes15:21
clarkbOnce it says it is done I'll remove my # prefix on the bridge screen command line and run the playbook?15:21
fungithe maintenance itself will take less time to complete than undoing the alert would anyway15:22
fungiyeah, looks right15:22
corvus++15:23
openstackstatusclarkb: finished sending notice15:24
clarkbalright proceeding with the playbook command now15:24
fungiplease do15:24
clarkbit is running15:24
fungino usable temp dir found?15:25
clarkbuhm there was grump about a tmp dir?15:25
clarkbbut it's proceeding?15:25
corvusreview-test15:25
clarkbreview-test15:25
clarkbya ok15:25
fungiyup15:25
clarkbit's at the wait for gerrit to come up stage15:26
clarkblogs claim it is ready so web should catch up momentarily15:28
fungiapi has gone from refusing to hanging15:28
fungihelp responds now15:28
fungils-projects isn't returning yet for me though15:29
corvusi can load changes using the web ui15:29
clarkbyup web ui is up for me but ls-projects isn't working yet15:29
clarkbshow-queue works /me tries ls-projects again15:30
funginow it's outputting the projects list for me, just pausing/buffering slowly15:30
fungimaybe project listing is slow at startup15:31
fungiokay it finally returned for me15:31
fungiand now it returns quickly when rerunning15:31
clarkbyup my initial call errored but now if I do it it works15:31
fungiso we're probably in the clear to move along15:31
clarkbcorvus: ^ you think we are ready too?15:31
corvusyep15:32
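[Editor's note] The behaviour described above, where the first ls-projects call errored or hung and later calls returned quickly, can be sketched as a small polling helper. This is a minimal Python sketch under stated assumptions, not what was run during the maintenance: wait_until_ready and its parameters are hypothetical names, and the probe is any callable the operator supplies (for example, one wrapping an SSH "gerrit ls-projects" call).

```python
import time


def wait_until_ready(probe, attempts=5, delay=1.0, sleep=time.sleep):
    """Call probe() until it succeeds or the attempts run out.

    probe is a hypothetical zero-argument callable that raises on failure
    and returns a result on success. Returns (result, tries_used).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return probe(), attempt
        except Exception as exc:  # a failed call is treated as "not ready yet"
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # pause before the next try
    raise RuntimeError(f"not ready after {attempts} attempts") from last_error
```

The sleep function is injectable so the helper can be exercised without real delays; a service that is slow only at startup, as the project listing appeared to be here, would succeed on a later attempt.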
clarkbthat lgtm other than review-test15:32
fungiwhy does it run against review-test?15:33
clarkbfungi: we must be using the review group and review-test is in it?15:33
clarkbwould probably be a good idea to be more explicit or remove review-test from that group but that will need investigating15:33
fungiahh, that would make sense, yeah15:33
fungiokay, so now we land the rename changes?15:34
clarkbwe are ready to merge https://review.opendev.org/#/c/739286/ and https://review.opendev.org/#/c/738979 right?15:34
fungiyeah i think so15:34
clarkbthis is where we had problems last time but the ansible disablement seems to be working so I think it's ok15:34
fungialso 742731 right?15:34
clarkband I should force merge those because the first change cannot merge as is?15:34
clarkbfungi: ya that one too but it can happen without force merging and before or after we disable ansible15:35
corvusyeah, i think force-merge 73828615:35
fungiyeah, unless we want to split the zuul tenant config change out we'll need to bypass zuul15:35
clarkbcorvus: should I force merge both so that we can reenable ansible more quickly?15:35
corvuswe'll get a zuul config error which will reconcile once we re-enable zuul on bridge and deploy15:35
corvusclarkb: yes i think it's fine to do both15:36
clarkbok I'll force merge both now15:36
fungiwe need all three merged before we turn ansible back on though, right?15:36
fungier, no i guess the renames data is only used if we rebuild gitea servers15:37
clarkbfungi: no, the record change is purely information and not processed by automation (yet)15:37
fungiso that can merge in its own time, yeah15:37
corvus(i had a typo earlier, 739286 not 738286)15:37
clarkbcorvus: ya I opened them from the etherpad and checked content15:38
clarkbI have merged both15:38
fungiyep, gerritbot confirmed15:38
clarkbhttps://gitea01.opendev.org:3000/openstack/project-config/commits/branch/master I'm checking that on 01 to 08 now15:39
clarkb01 lgtm15:39
clarkball 8 lgtm15:40
fungiyep, i also just finished checking them. all 8 look like they have those now15:40
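[Editor's note] The check across gitea01 to gitea08 above follows one URL pattern per backend. As a minimal Python sketch, assuming the host naming and port shown in the link clarkb pasted (the function name gitea_commit_urls and its parameters are hypothetical), the eight commit-log URLs can be generated like this:

```python
def gitea_commit_urls(repo="openstack/project-config", branch="master", backends=8):
    """Return the commit-log URL on each backend, gitea01 through gitea08 by default."""
    return [
        f"https://gitea{n:02d}.opendev.org:3000/{repo}/commits/branch/{branch}"
        for n in range(1, backends + 1)
    ]


if __name__ == "__main__":
    # Print one URL per backend so each can be opened and compared by hand.
    for url in gitea_commit_urls():
        print(url)
```

This only builds the URLs; confirming that each backend shows the merged changes is still a manual (or separately scripted) comparison.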
clarkbdeploy has both changes queued up and the first manage projects job should be hitting our ansible is disabled check15:40
clarkbfungi: corvus let me know if you think there is anything else we should check before removing the ansible disablement file. I think we should be good to proceed15:41
corvusclarkb: i think we're good15:41
fungii think we're ready to go15:41
clarkbthe file has been rm'd15:42
clarkbmsg executing local code is prohibited15:42
clarkbok then15:42
corvus?15:42
clarkbmanage-projects job failed due to ^15:42
corvuslink?15:43
clarkbgetting one15:43
clarkbhttps://zuul.opendev.org/t/openstack/build/948ba0341b334e9db4c2f32779fdae8615:43
clarkbI think we're ok from a renaming standpoint15:43
clarkbwe just failed to run the job but once the job is working again we'll apply the updated state and noop15:44
clarkbhowever, we cannot create new projects at the moment15:44
clarkbit is the git repo update15:45
clarkbwe may actually not update anything right now :/15:45
clarkbI'm wondering if that means we want to disable ansible again to keep periodic jobs from potentially being unhappy about things?15:45
fungii guess we haven't actually run infra-prod-manage-projects successfully since the upgrade: https://zuul.opendev.org/t/openstack/builds?job_name=infra-prod-manage-projects15:46
corvusi suspect we don't really need to disable since it's just going to continue to bomb at the start15:47
clarkbfungi: or any other infra-prod playbook15:47
clarkbcorvus: ya I think that is correct since this is very early in the infra prod playbook15:47
corvuswe're using the zuul project ssh key for access control, right?15:48
clarkbcorvus: yes15:48
clarkbit should be added to the bridge zuul user15:48
corvusi think we're going to need a new base job in a config-project to fix this15:49
corvussystem-config is currently in the openstack tenant, so we can add it to opendev/base-jobs or openstack/project-config15:50
corvusi don't think we have a config-project that's just for the opendev tenant, do we?15:50
clarkbcorvus: we have opendev/project-config which we don't really use for much yet15:50
corvusaha15:50
corvusit's not in openstack15:51
corvusmaybe we should just put this in base-jobs for now, then move it?15:51
fungii'm okay with it going into opendev/base-jobs initially15:51
clarkbopendev/base-jobs you mean? ya I think that would be easy enough15:51
corvusi mean, no other project will be able to use it because of the project key anyway15:52
corvusi'll work on a change15:52
fungiwe'll eventually shuffle all of that into the opendev tenant regardless15:52
clarkbwe can also explicitly limit it to openstack/project-config and openstack/system-config right?15:52
fungiand can then put it in opendev/project-config when we do15:52
clarkballowed-projects or whatever the term is15:52
corvusfungi: yeah, and when we do, we can pull it into a narrower scope15:52
corvusclarkb: yes15:52
clarkband the only role it seems to use is prepare-workspace-git which is in zuul-jobs which we include to base-jobs so that will be ok too I think15:55
clarkbwe can also probably start small and disable most of the CD jobs and try the new one with a single job? and then expand from there as things are happy15:56
clarkbhrm unknown project opendev/system-config :/16:00
fungiwhere did you see that?16:00
clarkbon the change corvus just pushed16:00
corvusi'll fix it in a minute, working on the other change now16:00
corvusthat should be openstack16:00
clarkbk16:00
clarkboh right16:00
fungioh, in 74293416:01
corvusoh wait no that's right16:01
corvusit's just that opendev/system-config is not in every tenant16:01
corvuslet's just skip the allowed-projects16:01
clarkband by skipping allowed projects other projects can run it but will fail to ssh because they don't have the project key?16:02
corvusyep.16:03
clarkbok maybe if we do that we should test it once it has landed (to ensure it fails as expected)16:03
clarkbbut I may be overly paranoid16:03
corvusclarkb: sounds good16:04
clarkbcorvus: as another option can we add system-config to opendev without loading any configs from it?16:08
clarkbit's a move we'll do eventually (but with loaded configs) so maybe that is a good step anyway?16:08
corvusclarkb: would need to be added to every tenant16:09
corvusthe issue is opendev/base-jobs is in every tenant, so every tenant needs to understand that job defn16:09
clarkbbecause we load base jobs in every tenant, got it16:09
corvuswe could add system-config and project-config to every tenant and just "include: []" but i think even that is too messy16:09
corvusthis gets better once we move system-config into the opendev tenant16:10
clarkbya16:10
clarkbhttps://review.opendev.org/#/admin/groups/459,members is who has approval on that repo fwiw. I'm thinking I may trim it down a bit?16:11
clarkbinfra-core + dmsimard, frickler, mnaser, ajaeger?16:11
clarkbany objections?16:12
clarkbI guess the ssh key still protects us there16:12
clarkbso it's probably fine to leave it as is16:12
fungioh, yeah, lots of emeritus reviewers who haven't been involved for a while16:12
clarkboh no the ssh key doesn't protect us as much there once we consume the base job16:12
clarkbso ya I think trimming that a bit makes sense. Any objections to doing it with the group above?16:13
mnaserno objection16:13
mnaserand if they're active again, we can bring them up anytime16:13
clarkbmnaser: ya it's an active vs inactive question but also a root vs not root question16:13
mnaserah.  well, in full transparency, i have access to vexxhost/base-jobs which is a config project inside vexxhost tenant16:14
mnaserso.. i'd lose that..16:14
corvusi think i'm lost16:14
fungimnaser: clarkb is proposing to continue including you16:14
corvuswe're talking about trimming membership of project-config-core which has approval rights for which repo(s)?16:14
clarkbcorvus: opendev/base-jobs16:15
mnaseroh, sigh, sorry.  i thought the + was a ping, and not foo + bar16:15
corvusi think clarkb's trimming is appropriate and should have no net effect16:15
fungiyes, i agree with the proposed adjustment to that group membership16:16
AJaegerclarkb: trimming looks fine - I just wonder about dmsimard, is he still reviewing? I would drop him as well16:16
clarkbmnaser: yup I trust both you and AJaeger with that access even if you aren't proper roots16:16
clarkbAJaeger: let me double check if we've cleared out his ssh key if so I'll clear from that group too16:17
AJaegerclarkb: change Ife3cfdfe3b674c7703adcbcf7f5a4af708fcd03a16:17
clarkbdmsimard is removed yup16:17
clarkbhttps://review.opendev.org/#/admin/groups/459,members should be good now. infra-core had ianw and frickler in it so they are removed from the extra individuals list16:18
clarkbnow I'm going to update infra-core while I'm thinking about it16:18
clarkbthat's done, should better reflect what we've got in ansible16:19
AJaegerclarkb: updates look good to me16:19
clarkbI've approved https://review.opendev.org/#/c/742731/116:24
clarkbI'll end the meeting here as the steps on the etherpad (other than disk cleanups) are completed16:28
clarkb#endmeeting16:28
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"16:28
openstackMeeting ended Fri Jul 24 16:28:39 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:28
openstackMinutes:        http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-07-24-14.51.html16:28
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-07-24-14.51.txt16:28
openstackLog:            http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-07-24-14.51.log.html16:28
*** zbr|ruck is now known as zbr16:41
*** hamalq has joined #opendev-meeting17:29
*** AJaeger has quit IRC17:45
*** hamalq has quit IRC23:10
