*** hamalq has joined #opendev-meeting | 15:21 | |
*** hamalq_ has joined #opendev-meeting | 15:23 | |
*** hamalq has quit IRC | 15:27 | |
clarkb | ++ to stopping regular deployment activity | 20:19 |
---|---|---|
clarkb | I'm trying to get in me before 2100 | 20:20 |
fungi | #startmeeting opendev-maint | 20:27 |
openstack | Meeting started Fri Jun 12 20:27:58 2020 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot. | 20:27 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 20:28 |
*** openstack changes topic to " (Meeting topic: opendev-maint)" | 20:28 | |
openstack | The meeting name has been set to 'opendev_maint' | 20:28 |
fungi | #link https://etherpad.opendev.org/p/gerrit-2020-06-12 | 20:28 |
fungi | clarkb: do you recall what file it is we create on bridge.o.o to pause deployments? | 20:29 |
clarkb | fungi: it should be in the etherpad | 20:29 |
clarkb | its also documented in bridges system config dics | 20:30 |
clarkb | *docs | 20:30 |
fungi | in which etherpad? | 20:30 |
fungi | i wasn't finding where we did it in the last one from 2020-03-20 | 20:30 |
clarkb | the one I did for this time | 20:31 |
* clarkb looks | 20:31 | |
fungi | oh, there was another pad? | 20:31 |
fungi | anyway, yeah, docs say /home/zuul/DISABLE-ANSBILE | 20:31 |
clarkb | https://etherpad.opendev.org/p/gerrit-project-renames-20200612 | 20:32 |
fungi | #info did `sudo touch /home/zuul/DISABLE-ANSBILE` on bridge.o.o | 20:32 |
fungi | aha, i'll switch to that pad. didn't realize you already had one | 20:32 |
clarkb | I put that together yesterday and linked it in irc. Your url earlier looked similar enough that I thought they were the same | 20:32 |
fungi | #link https://etherpad.opendev.org/p/gerrit-project-renames-20200612 | 20:33 |
fungi | nah, i just missed that you had already created one | 20:33 |
fungi | clarkb: the pad says to put review.o.o in emergency disable *and* stop deployments, is the emergency disable list addition actually necessary? | 20:35 |
fungi | also i guess i should pull a copy of renames/20200612.yaml from https://review.opendev.org/735211 onto bridge.o.o | 20:36 |
clarkb | fungi: I don't think it is necessary | 20:41 |
clarkb | and the renames file should already be there at the path I list (double check though) | 20:41 |
fungi | yep, i see now in the pad you already staged it at /root/renames/20200612/20200612.yaml | 20:42 |
fungi | confirmed it's there and checksum matches the one from 735211 | 20:43 |
fungi | i've started a screen session as root on bridge.o.o | 20:44 |
clarkb | I've joined it | 20:44 |
corvus | i'm standing by | 20:45 |
fungi | and yeah, since these are all repositories moving to different namespaces i wouldn't have expected any groups to get renamed (as we aren't namespacing groups... yet anyway) | 20:45 |
clarkb | ya I double checked just to be sure fwiw, but there weren't any I saw | 20:46 |
fungi | i looked through the changes as well and didn't see any | 20:46 |
fungi | the acls in them are only git moves, with the exception of cla enforcement added in one | 20:47 |
fungi | 10 minutes to go. i suppose at the 5 minute mark i can send a status notice that gerrit is being restarted | 20:49 |
clarkb | sounds good | 20:50 |
fungi | something like... | 20:52 |
fungi | status notice The Gerrit service on review.opendev.org is going offline momentarily at 21:00 UTC for project rename maintenance, but should return within a few minutes: http://lists.opendev.org/pipermail/service-announce/2020-June/000004.html | 20:52 |
clarkb | lgtm | 20:52 |
fungi | hopefully if that takes anyone by surprise, it'll be incentive to subscribe to service-announce | 20:52 |
fungi | #status notice The Gerrit service on review.opendev.org is going offline momentarily at 21:00 UTC for project rename maintenance, but should return within a few minutes: http://lists.opendev.org/pipermail/service-announce/2020-June/000004.html | 20:55 |
openstackstatus | fungi: sending notice | 20:55 |
-openstackstatus- NOTICE: The Gerrit service on review.opendev.org is going offline momentarily at 21:00 UTC for project rename maintenance, but should return within a few minutes: http://lists.opendev.org/pipermail/service-announce/2020-June/000004.html | 20:55 | |
fungi | i've staged the ansible-playbook command from the pad in the screen session, ready to engage at 21:00 | 20:57 |
openstackstatus | fungi: finished sending notice | 20:58 |
clarkb | doing last minute sanity checks and the flag to run gerrit init before gerrit start seems to default to false which is what we want | 20:58 |
fungi | thanks | 20:59 |
clarkb | oh we explicitly set it to false when we run it too | 20:59 |
fungi | go time. green light from mission control? | 21:00 |
clarkb | the command looks correct to me | 21:00 |
clarkb | corvus: ^ you ready and go? | 21:00 |
fungi | #info running `ansible-playbook /home/zuul/src/opendev.org/opendev/system-config/playbooks/rename_repos.yaml -e@/root/renames/20200612/20200612.yaml` | 21:00 |
corvus | clarkb: yep | 21:00 |
fungi | repolist is undefined | 21:01 |
fungi | edit repos to repolist? | 21:01 |
* clarkb looks at historical lists | 21:01 | |
clarkb | it has always been repos before /me looks at the code now | 21:02 |
corvus | -e repolist=ABSOLUTE_PATH_TO_VARS_FILE | 21:02 |
corvus | is the command wrong? | 21:02 |
clarkb | oh ya we didn't do repolist= | 21:02 |
corvus | that ^ is from the docs | 21:02 |
clarkb | is that -e repolist=@filename or -e@repolist=filename? | 21:03 |
fungi | -e@repolist=/root/renames/ or? | 21:03 |
fungi | checking docs | 21:03 |
corvus | i think it should not have @ | 21:03 |
corvus | playbooks/rename_repos.yaml: - include_vars: "{{ repolist }}" | 21:03 |
corvus | cause that's how it's used | 21:03 |
fungi | docs say -e repolist= | 21:04 |
clarkb | corvus: got it and ya I think that is correct. -e repolist=filepath | 21:04 |
fungi | #info running `ansible-playbook /home/zuul/src/opendev.org/opendev/system-config/playbooks/rename_repos.yaml -e repolist=/root/renames/20200612/20200612.yaml` | 21:04 |
clarkb | sorry I took that from the etherpad we used last time and didn't cross check the docs | 21:04 |
fungi | looking better, thanks ;) | 21:05 |
corvus | i updated the etherpad for today to avoid perpetuating the error in case that happens again | 21:05 |
clarkb | ++ | 21:05 |
clarkb | hrm looking at zuul status I think our jobs are not pausing | 21:06 |
clarkb | we don't have any hourly gitea or gerrit jobs queued though so we are probably ok | 21:06 |
fungi | the jobs themselves start, but ansible blocks if that file is present, from what i gather | 21:07 |
fungi | okay, i just saw some errors scroll by | 21:07 |
fungi | user id changes biting us? | 21:08 |
corvus | seems likely | 21:08 |
fungi | the project key moves failed | 21:08 |
clarkb | ya that seems likely related to the regrouping and reuiding | 21:08 |
fungi | that seems to have been the only issue it encountered | 21:08 |
clarkb | re the jobs, that is correct but two jobs have succeeded since after you reported touching that file | 21:09 |
clarkb | fungi: it should stop completely once it hits the first error | 21:09 |
fungi | ahh, okay, so there was still more to run | 21:09 |
corvus | i'll make a new playbook | 21:09 |
clarkb | it should be zuuld and ya we should make a new playbook for things that happen from failure forward. thank you corvus | 21:10 |
fungi | yeah, so looks like we're using zuuld on the scheduler now, not zuul | 21:10 |
clarkb | fungi: ya this was missed when we changed that stuff | 21:10 |
corvus | ~corvus/rename_repos.yaml look good? | 21:10 |
fungi | and the playbook tried to chown to zuul which dne | 21:10 |
clarkb | corvus: yes lgtm | 21:11 |
fungi | yep, that looks like the remaining tasks with the first one fixed to the new username | 21:11 |
corvus | should we move that playbook into place? | 21:11 |
fungi | i was simply going to run `ansible-playbook ~corvus/rename_repos.yaml -e repolist=/root/renames/20200612/20200612.yaml` | 21:12 |
corvus | i can't remember if it references any roles or includes | 21:12 |
corvus | if it does, those will be relative to the playbook path | 21:12 |
clarkb | corvus: it does for the gitea stuff | 21:12 |
fungi | but yeah, i can move it | 21:12 |
clarkb | and for gerrit start | 21:12 |
corvus | then best to cp it to pwd | 21:12 |
clarkb | hold on | 21:12 |
clarkb | if zuul is running jobs the jobs could undo that copy | 21:12 |
corvus | or rather, the system-config/playbooks dir | 21:12 |
clarkb | would it be better to make a copy of the subtree and then copy your edit into that? | 21:13 |
clarkb | (I'm trying to understand what zuul is doing now fwiw) | 21:13 |
fungi | i could just put it at /home/zuul/src/opendev.org/opendev/system-config/playbooks/fixed_rename_repos.yaml and then reference that new playbook name, right? | 21:13 |
fungi | and then clean up the checkout when we're done | 21:14 |
clarkb | fungi: assuming that workspace setup doesn't reset --hard | 21:14 |
fungi | if it does, we'll just get an error at that point anyway, right? non-destructive if so | 21:15 |
clarkb | good point | 21:15 |
clarkb | then ya I think thats the way to do it | 21:15 |
clarkb | corvus: ^ that sound good? | 21:15 |
clarkb | fwiw I only see an older (possibly stale) remote_puppet else run from zuul ansible | 21:16 |
clarkb | and we're seeing jobs rotate attempts now | 21:16 |
corvus | wfm | 21:16 |
clarkb | so I think we may be good either way | 21:16 |
fungi | #info ran `cp ~corvus/rename_repos.yaml /home/zuul/src/opendev.org/opendev/system-config/playbooks/fixed_rename_repos.yaml` | 21:16 |
fungi | #info running `ansible-playbook /home/zuul/src/opendev.org/opendev/system-config/playbooks/fixed_rename_repos.yaml -e repolist=/root/renames/20200612/20200612.yaml` | 21:16 |
fungi | it's gotten further, thanks | 21:17 |
clarkb | we are to the step where we check gerrit is up again and happy before hitting enter | 21:18 |
fungi | yeah, ssh: connect to host review.opendev.org port 29418: Connection refused | 21:18 |
clarkb | the process is running but not yet httpd'ing for me yet | 21:18 |
clarkb | I want to say this takes ~3-4 minutes now? | 21:19 |
fungi | yeah, i'm not worried (yet) | 21:19 |
fungi | responding! | 21:20 |
clarkb | [2020-06-12 21:20:00,219] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.13.12-11-g1707fec ready | 21:20 |
clarkb | httpd still not there yet | 21:20 |
fungi | well, ssh api is responding so we can proceed | 21:20 |
fungi | any objections? | 21:20 |
clarkb | there are sshd exceptions in the error log | 21:21 |
fungi | (at least our reminder there says "Make sure that Gerrit ssh api is accepting requests" which it seems to be | 21:21 |
clarkb | did you login and run a command? | 21:21 |
fungi | i did | 21:21 |
clarkb | web is up for me now anyway | 21:21 |
fungi | i mean, i asked it for ssh api help, but i can do more | 21:21 |
clarkb | I think we can proceed | 21:21 |
clarkb | I was thinking something like ls-projects but I expect its fine as web responds now too | 21:21 |
fungi | i can get it to list projects | 21:21 |
corvus | i just pushed up a change | 21:21 |
fungi | i'd call that working | 21:21 |
clarkb | cool lets proceed | 21:21 |
fungi | okay, continuing | 21:21 |
corvus | 21:21 < openstackgerrit> James E. Blair proposed opendev/system-config master: Fix rename playbook after zuul user rename https://review.opendev.org/735397 | 21:21 |
fungi | #info gerrit api is responding, continued | 21:22 |
fungi | #info playbook run completed | 21:22 |
fungi | no new errors | 21:22 |
fungi | anybody want to double-check anything before we start approving the rename changes? | 21:22 |
clarkb | if I search for changes on the renamed projects I get no results | 21:23 |
clarkb | do we have to wait for reindexing to complete for that to work? | 21:23 |
fungi | yes | 21:23 |
clarkb | fungi: that was the only other thing I was going to check | 21:23 |
fungi | searches hit the index | 21:23 |
clarkb | but if we have to reindex I don't know if we want to wait | 21:23 |
fungi | if you know a change for one of those projects, you should still be able to access it | 21:24 |
clarkb | ah ok let me see | 21:24 |
fungi | queries which involve the project name will fail to find it though | 21:24 |
fungi | once the online reindex reaches the renamed projects, they'll be searchable again | 21:25 |
fungi | which thankfully is long, long before the reindex completes | 21:25 |
fungi | (unless we rename nova, openstack-manuals, neutron, or something similarly large) | 21:25 |
clarkb | https://review.opendev.org/#/c/726006/ errors which I think is a refstack or interop change | 21:26 |
clarkb | java.lang.IllegalArgumentException: passed project openstack/refstack when creating ChangeNotes for 726006, but actual project is osf/refstack | 21:26 |
clarkb | maybe we need reindexing to finish for that to work too? | 21:26 |
fungi | ahh, yeah okay so i guess they can't be reached until the index is done | 21:27 |
fungi | though we don't typically wait for reindex to complete to do the project-config changes | 21:27 |
fungi | but will likely need to wait to be able to update .gitreview files for those repos | 21:27 |
clarkb | I've double checked that refstack hasn't reindexed it (just to be sure the problem is likely related to indexing) and it has not so thats good | 21:28 |
fungi | corvus: you still have a wip on 714478, mind if i delete it? or do you want to undo that (or approve the change)? | 21:29 |
corvus | i'll +w if we're ready | 21:29 |
clarkb | ya I think I'm about as ready as I'll be without waiting for reindexing to finish | 21:30 |
corvus | done | 21:30 |
fungi | cool, thanks | 21:30 |
clarkb | should we rm the DISABLE-ANSIBLE file to ensure those jobs run properly? | 21:30 |
clarkb | in particular I think we need them to update zuul config | 21:30 |
corvus | that's probably a stellar idea | 21:31 |
fungi | it wasn't clear to me whether we needed to wait until the changes merged to clear DISABLE-ANSIBLE | 21:31 |
fungi | in the past we didn't resume configuration management for the gerrit server until the rename changes merged, so that manage-projects wouldn't recreate old stuff | 21:31 |
mordred | yeah - that's probably a good idea ... | 21:32 |
clarkb | fungi: thats a good point | 21:32 |
mordred | we use tip of project-config | 21:32 |
clarkb | I think the correct thing now is to rm the file because of ^ | 21:32 |
clarkb | oh wait | 21:32 |
clarkb | no I see | 21:32 |
clarkb | we want them to merge then rm I get it | 21:32 |
mordred | no - I think let's wait to rm until the patches merge | 21:32 |
mordred | yeah | 21:32 |
fungi | yeah, we've scripted changes to match the future state of project-config, not the present state | 21:32 |
fungi | so if manage-projects runs with the present state, we'll wind up with a mess | 21:32 |
clarkb | yup | 21:33 |
clarkb | and I guess we can always reenqueue if we need to once things have merged | 21:33 |
clarkb | rather than try and sync with zuul perfectly | 21:33 |
clarkb | gerrit is slowly chewing through the largest repos now | 21:33 |
fungi | 734669 fails zuul configuration. are we going to need to bypass gating on that one? | 21:34 |
corvus | that probably could be merged by zuul if it were split into 2 changes | 21:35 |
corvus | change main.yaml first, then the rest. | 21:35 |
corvus | but as written, yes, that would need to be force-merged | 21:36 |
fungi | i can split it up | 21:36 |
corvus | depends on how long you want to stick around. :) | 21:36 |
fungi | we can always bypass check | 21:36 |
corvus | i'd vote to force-merge | 21:38 |
clarkb | ya I'm ok with force merge too | 21:38 |
fungi | about halfway through the dance to split the patch, but i'll just do that instead | 21:38 |
clarkb | one thing I notice is we have a lot more "big" repos to reindex now | 21:39 |
clarkb | we've only finished a handful so far | 21:39 |
clarkb | but it is progressing | 21:39 |
clarkb | also we should really do that gerrit disk cleanup I never do because its scary | 21:40 |
clarkb | we have 23GB free where the git repos and indexes live | 21:40 |
clarkb | (enough for now but steadily getting smaller) | 21:40 |
mordred | yeah. we have a LOT of saved indexes in the homedir too | 21:40 |
clarkb | mordred: ya exactly | 21:40 |
fungi | #info bypassed gating on 734669 so as not to spend time splitting the change during maintenance | 21:41 |
clarkb | zuul's reindexing now | 21:41 |
clarkb | that means zuul is one of our bigger repos :) | 21:41 |
clarkb | fungi: re removing the disable file has the change replicated first? | 21:42 |
clarkb | I think we want to wait for that as well? | 21:43 |
fungi | i haven't removed it yet | 21:43 |
fungi | just documenting in the pad that it's a step | 21:43 |
clarkb | (I want to say reindexing and replication are different worker threads so that should work well) | 21:43 |
fungi | though yes, i guess the question remains whether we need to make sure something is replicated to gitea before we turn jobs back on... they use branch tip so do they get that directly from gerrit or gitea? | 21:44 |
mordred | gitea | 21:44 |
clarkb | I want to say it is from gitea | 21:44 |
mordred | definitely gitea | 21:44 |
fungi | okay, so we need to make sure the project-config changes have replicated to gitea i guess | 21:45 |
clarkb | ugh local network has suddenly gone flaky for me so when I run the show-queue commands ssh says review.o.o is unreachable. Then I run it again and it is reachable | 21:45 |
clarkb | its not review I notice it typing in this shell for irc too | 21:45 |
fungi | i've noted the additional wait condition in the pad | 21:46 |
fungi | #link https://opendev.org/openstack/project-config/commits/branch/master | 21:47 |
fungi | i see merge commits for both rename changes | 21:47 |
fungi | so we should be safe? | 21:47 |
clarkb | I like to check the individual backends and I just did that; they all look good | 21:48 |
fungi | okay, thanks for confirming | 21:49 |
clarkb | progress is picking up with reindexing which is what we expect since it starts with the most expensive ones and proceeds to the cheaper ones | 21:50 |
fungi | so we should be all clear to remove /home/zuul/DISABLE-ANSIBLE now | 21:50 |
clarkb | fungi: yes I think so | 21:50 |
clarkb | mordred: corvus ^ do you agree? | 21:50 |
mordred | ++ | 21:50 |
corvus | abstain (weak agree) | 21:50 |
fungi | so i was *going* to remove it | 21:52 |
fungi | but it's no longer there | 21:52 |
fungi | i wonder if that explains the behavior clarkb was seeing | 21:52 |
mordred | uh - what removed it? | 21:52 |
fungi | i'd love to know | 21:52 |
mordred | fungi: I see it | 21:52 |
mordred | root@bridge:/home/zuul# ls -ltra /home/zuul/DISABLE-ANSBILE | 21:53 |
mordred | -rw-r--r-- 1 root root 0 Jun 12 20:32 /home/zuul/DISABLE-ANSBILE | 21:53 |
mordred | oh | 21:53 |
mordred | nope | 21:53 |
mordred | I see a file named incorrectly | 21:53 |
clarkb | oh ya looking at scrollback thats the file that was there when I was trying to figure out why things were running | 21:53 |
mordred | this is maybe a stupid idea - but maybe we should make a helper utility called "disable-ansible" that touches the file | 21:54 |
clarkb | so uh | 21:54 |
mordred | so that we can dis<tab> | 21:54 |
clarkb | we may have recreated the refstack and interop repos? | 21:54 |
mordred | and not worry about misspelling | 21:54 |
mordred | yeah. we probably did | 21:54 |
corvus | ansibile is great | 21:54 |
fungi | wow | 21:54 |
mordred | corvus: so is ansbile | 21:54 |
fungi | i picked the ONE place in our documentation where it's mistyped, and didn't notice | 21:54 |
fungi | https://docs.opendev.org/opendev/system-config/latest/bridge.html#running-ansible-on-nodes "When done, don’t forget to remove /home/zuul/DISABLE-ANSBILE" | 21:55 |
corvus | okay, so how do we remove those repos? | 21:55 |
mordred | I think we have to shut gerrit down again | 21:55 |
mordred | I'm not 100% sure what the right choice is on the gitea replicas | 21:55 |
mordred | since I'm guessing the new repos will have removed the redirects | 21:56 |
clarkb | mordred: they did I just checked | 21:56 |
mordred | I mean - obvs we can do db surgery to fix the redirects | 21:56 |
mordred | but that'll take some investigation | 21:56 |
clarkb | can we even delete projects? | 21:56 |
corvus | i'll start working on the gitea stuff | 21:56 |
clarkb | in gerrit I mean | 21:56 |
clarkb | or do we toggle some flag saying this isn't visible instead? | 21:56 |
corvus | please shut gerrit down now | 21:57 |
clarkb | can we safely stop it if it is reindexing? | 21:57 |
clarkb | I assume so since its online | 21:57 |
corvus | because we can delete projects if they haven't changed | 21:57 |
clarkb | gotcha | 21:57 |
fungi | #status notice gerrit is being taken offline for emergency cleanup, will return to service again shortly | 21:57 |
openstackstatus | fungi: sending notice | 21:57 |
mordred | yeah - I think worst case it'll just be another reindex | 21:57 |
-openstackstatus- NOTICE: gerrit is being taken offline for emergency cleanup, will return to service again shortly | 21:58 | |
fungi | do we use docker-compose to take it down now? | 21:58 |
clarkb | fungi: yes, cd /etc/gerrit-compose && sudo docker-compose down | 21:58 |
mordred | yeah | 21:58 |
fungi | doing that now | 21:58 |
fungi | it's stopped | 21:59 |
mordred | corvus: removing from gerrit is removing the git repo - do we need to delete from db there too? | 21:59 |
* mordred can work on that | 21:59 | |
fungi | yes, needs to be removed from the mysql db | 21:59 |
corvus | mordred: dunno | 21:59 |
fungi | same tables we rename in | 21:59 |
fungi | account_project_watches where project_name = oldname | 22:00 |
openstackstatus | fungi: finished sending notice | 22:00 |
fungi | changes where dest_project_name = oldname | 22:01 |
fungi | though hopefully those match zero rows | 22:01 |
clarkb | I found https://stackoverflow.com/questions/8723716/delete-a-project-in-gerrit which seems to indicate that removing the git dirs may be sufficient on some versions of gerrit (maybe those with notedb though?) | 22:01 |
mordred | yeah - probably so | 22:01 |
mordred | now I'm wondering why /root/.my.cnf doesn't have the ssl=true line that's in our ansible config management | 22:01 |
mordred | but maybe that can be a task for older monty | 22:01 |
fungi | clarkb: yeah, i think that stuff winds up in special refs in those repos with notedb | 22:01 |
fungi | but also if nobody created new changes or watches on those recreated repos, it's likely we won't have anything to remove from the db | 22:02 |
mordred | fungi: yeah - I'm going to quick select to see if that's true | 22:02 |
corvus | for gitea -- how about we use the web ui to delete the old name, rename the new name to the old, then rename the old to the new? | 22:02 |
clarkb | corvus: we should be able to test that on a single backend too to ensure it works properly | 22:03 |
clarkb | but that plan sounds good to me | 22:03 |
mordred | me too | 22:03 |
clarkb | separately, should we touch that ansible file now? | 22:03 |
clarkb | the proper one? | 22:03 |
mordred | yes | 22:03 |
corvus | clarkb: yes; i expect it too because i did that when testing renames originally | 22:03 |
fungi | corvus: that sounds like it should create redirects again... though doesn't our redirect creation also have the same effect if it reruns after we delete the old repos? | 22:03 |
fungi | (part of the reason for recording those in yaml) | 22:03 |
clarkb | fungi: no | 22:04 |
clarkb | that hasn't been implemented yet | 22:04 |
clarkb | we've been recording things so that we can do that | 22:04 |
fungi | oh, okay | 22:04 |
fungi | sometimes i confuse the future with the present, sorry :/ | 22:04 |
clarkb | I'm touching the ansible disable file now | 22:04 |
corvus | we could just manually delete the old names in gitea and then run that part of the playbook | 22:04 |
corvus | or rather, manually delete the old names, move the new to the old, then run the playbook | 22:05 |
corvus | that only saves one step though | 22:05 |
clarkb | corvus: ya I think your plan was great myself and gets us to the end state we want | 22:05 |
fungi | yep, agreed | 22:05 |
clarkb | does anyone else want to double check the file in /home/zuul on bridge? | 22:05 |
mordred | I do not believe there are any gerrit db changes | 22:06 |
clarkb | thinking out loud on how to address zuul running in the future. Maybe we should edit authorized keys? | 22:06 |
mordred | mysql> select * from account_project_watches where project_name in ('openstack/refstack', 'x/whitebox-tempest-plugin', 'openstack/python-tempestconf', 'openstack/refstack-client'); | 22:06 |
mordred | Empty set (0.01 sec) | 22:06 |
fungi | clarkb: it looks correct this time, thanks | 22:06 |
mordred | mysql> select * from changes where dest_project_name in ('openstack/refstack', 'x/whitebox-tempest-plugin', 'openstack/python-tempestconf', 'openstack/refstack-client'); | 22:06 |
mordred | Empty set (0.64 sec) | 22:06 |
clarkb | mordred: is there a general projects table too? | 22:06 |
clarkb | basically what does ls-projects read? | 22:07 |
mordred | no. no projects table | 22:07 |
fungi | clarkb: i expect the current method for disablement is just fine as long as we type it correctly | 22:07 |
clarkb | fungi: sure, but its fragile whereas breaking ssh is robust | 22:07 |
mordred | I believe it reads from a cache that's related to git repos on disk | 22:07 |
clarkb | mordred: rgr | 22:07 |
clarkb | mordred: and to confirm I think what I hear you saying is we only need to remove the git dirs? | 22:07 |
fungi | i should have looked more closely at the command i cut and pasted, and not just assumed our documentation is somehow magically immune to typos | 22:07 |
mordred | clarkb: yes - I believe that is correct | 22:08 |
mordred | clarkb, fungi: I think making a utility script to set the file would be a very easy one-liner and prevent future mistakes while keeping things simple | 22:08 |
fungi | that sounds reasonable | 22:08 |
clarkb | I like that | 22:08 |
mordred | clarkb: I will now delete the git repos | 22:08 |
mordred | clarkb: I think you already have done that | 22:09 |
mordred | as I do not see them | 22:09 |
fungi | or manage-projects never actually made it that far? | 22:09 |
clarkb | mordred: I have not | 22:10 |
mordred | perhaps we only made the projects in gitea | 22:10 |
clarkb | fungi: manage projects succeeded | 22:10 |
clarkb | the job I mean | 22:10 |
mordred | well - that's disturbing - because there are no projects named that in gerrit now | 22:10 |
fungi | does it only apply repos which changed, or do we do a full run when we trigger it? | 22:10 |
mordred | OH | 22:11 |
mordred | you know what? | 22:11 |
corvus | https://gitea01.opendev.org:3000/openstack/interop should be complete | 22:11 |
mordred | those repos are going to be in the manage-projects cache on gerrit | 22:11 |
mordred | so manage-projects isn't going to try to re-create them gerrit side - purely accidentally | 22:11 |
clarkb | mordred: yup I'm reading the logs on bridge now and I concur that is what happened | 22:11 |
mordred | \o/ | 22:11 |
clarkb | it basically said "my shas match so I'm good" | 22:11 |
mordred | cool. so gerrit is actually fine (and will remain fine) | 22:12 |
clarkb | someone else may want to double check that manage projects logs on bridge to concur too but that is my read of it | 22:12 |
clarkb | corvus: redirect wfm there | 22:12 |
mordred | yes - I concur | 22:12 |
mordred | corvus: did you do that by clicking things? | 22:13 |
corvus | mordred: yes | 22:13 |
fungi | okay, so we're ready to start gerrit again? | 22:14 |
corvus | do we want to wait until gitea is done? replication | 22:14 |
clarkb | ya that way everything works properly when it starts functioning | 22:15 |
fungi | oh, right, that, sorry missed it wasn't done on all the backends | 22:15 |
mordred | corvus: yeah - I think that's safest | 22:15 |
fungi | i also have the documentation correction for "ansbile" ready to push once it's up (interesting fact, it's not the only place in our documentation where we did that) | 22:15 |
corvus | im confused -- 1 sec | 22:16 |
corvus | i don't see an x/whitebox-tempest-plugin; it still has a redirect | 22:16 |
corvus | maybe it just didn't get around to it? | 22:17 |
clarkb | corvus: that one is ok | 22:18 |
clarkb | corvus: because it was in the first change that merged which is what ran manage-projects | 22:18 |
corvus | ah ok | 22:18 |
clarkb | it was the gap between first and second changes merging that bit us | 22:18 |
clarkb | and so only second change's projects are sad (I noted the list on the etherpad too) | 22:18 |
fungi | oh, right | 22:18 |
fungi | so if the three-rename change had merged first we'd only have to correct x/whitebox-tempest-plugin | 22:19 |
clarkb | yes | 22:20 |
corvus | okay i've done all the things for gitea01 | 22:21 |
corvus | are other folks idle? should we just divide up the clicking? | 22:21 |
corvus | it's silly, but it's probably ~= to the time to make a playbook | 22:21 |
clarkb | I can click buttons if we decide that is easiest | 22:21 |
mordred | I can click. should we divide up backends? | 22:21 |
corvus | k; ya'll grab the root password from bridge | 22:22 |
corvus | i'll lay stuff out in the etherpad | 22:22 |
mordred | corvus: it's in hostvars right? | 22:22 |
corvus | yep | 22:22 |
corvus | git grep gitea_root_password | 22:22 |
clarkb | yup found it. is the username admin? | 22:23 |
corvus | root | 22:23 |
clarkb | heh that makes sense given the key name | 22:23 |
mordred | corvus has done 01 - actually, why don't I make an etherpad list and we can grab backends | 22:24 |
mordred | corvus has done it better than me | 22:24 |
corvus | oh heh i was about to say i like that better | 22:24 |
corvus | okay we'll do it my way :) | 22:25 |
corvus | clarkb, mordred: etherpad make sense? gtg? | 22:25 |
mordred | corvus: we go to /login ? | 22:25 |
corvus | oh right | 22:25 |
corvus | /user/login | 22:25 |
mordred | that did not work :( | 22:26 |
mordred | nevermind. I'm dumb | 22:26 |
clarkb | https://gitea0X.opendev.org:3000/user/login | 22:26 |
clarkb | that worked for me | 22:26 |
corvus | then click the settings in the top right for the repo | 22:26 |
fungi | grabbing backends might get me in trouble, but i'll try one anyway | 22:27 |
corvus | you could put that on the url too, but i liked to see the repo readme | 22:27 |
fungi | stole 8 from corvus | 22:27 |
corvus | fungi: take 8 | 22:27 |
corvus | good | 22:27 |
mordred | k. 2 is done | 22:31 |
clarkb | 4 is done | 22:32 |
corvus | 6,7 done | 22:34 |
clarkb | 5 is done now | 22:35 |
mordred | 3 is done | 22:35 |
mordred | also - let me express how happy I am that clicking things is not our normal job | 22:35 |
clarkb | once 8 is done we start gerrit, trigger reindexing, check things are happy again then delete the zuul ansible disable file? | 22:36 |
fungi | i'm so unbelievably terrible at it | 22:36 |
fungi | almost there | 22:36 |
clarkb | fungi: you need to get yourself a gamer mouse | 22:37 |
fungi | just one double-rename left | 22:37 |
fungi | and finished | 22:39 |
fungi | so i can start gerrit again now? | 22:39 |
corvus | go from me | 22:39 |
clarkb | I think so. command for that is docker-compose up -d in the /etc/gerrit-compose dir | 22:39 |
fungi | it's on its way up again now | 22:40 |
fungi | #info started gerrit again after cleanup | 22:40 |
mordred | woot | 22:41 |
mordred | that was exciting | 22:41 |
fungi | #link https://review.opendev.org/735400 Correct "ansbile" typos | 22:42 |
clarkb | now we need to trigger reindexing I think | 22:42 |
fungi | status notice The Gerrit service on review.opendev.org is available again | 22:43 |
fungi | should we? ^ | 22:43 |
clarkb | fungi: I think we should trigger reindexing first? | 22:43 |
mordred | yeah | 22:44 |
fungi | i can do that now | 22:44 |
fungi | gerrit index start accounts --force | 22:44 |
fungi | gerrit index start changes --force | 22:44 |
fungi | is what the playbook does | 22:44 |
clarkb | ya that should be what we want | 22:44 |
fungi | #info restarted reindexing for accounts and changes | 22:45 |
clarkb | I saw all the tasks queue up and now gerrit is nomming on them | 22:47 |
clarkb | now I think we can maybe make the notice? | 22:47 |
corvus | ++ | 22:47 |
fungi | #status notice The Gerrit service on review.opendev.org is available again | 22:47 |
openstackstatus | fungi: sending notice | 22:47 |
-openstackstatus- NOTICE: The Gerrit service on review.opendev.org is available again | 22:47 | |
fungi | other than keeping an eye on things (particularly reindexing) i guess we're basically done? | 22:48 |
fungi | someone will also need to push .gitreview updates to the 5 repos i suppose | 22:48 |
fungi | but doesn't need to be us necessarily | 22:48 |
clarkb | and removing the ansible zuul disable when happy with things (is that now?) | 22:48 |
fungi | aha, right | 22:49 |
fungi | that | 22:49 |
clarkb | I think we should remove that then watch manage projects run (which is queued up and waiting) | 22:49 |
fungi | i expect we are? and yes to watching | 22:49 |
clarkb | the one queued up should be for the most recent merge so all of that should be happy now | 22:49 |
clarkb | ya I think we are | 22:50 |
clarkb | mordred: corvus ^ any other thoughts before we do that? | 22:50 |
openstackstatus | fungi: finished sending notice | 22:50 |
fungi | once that's cleared, i may switch to a more evening-appropriate part of my residence | 22:51 |
fungi | but will still be around to check/test stuff, and fix anything else we don't know is broken | 22:51 |
clarkb | I'm around, but also enjoying tea because it is cold and rainy | 22:54 |
mordred | clarkb: yeah - I think so | 22:55 |
mordred | (like, it sounds good) | 22:55 |
clarkb | fungi: ^ do you want to remove it or should I? | 22:55 |
fungi | i'm happy to | 22:57 |
fungi | and done | 22:57 |
fungi | i removed the stray mistyped one as well | 22:57 |
fungi | #info ran `rm /home/zuul/DISABLE-ANSIBLE` to reenable deployments | 22:58 |
fungi | once again, sorry about all that, i should have looked more closely at commands i was cutting and pasting | 22:59 |
clarkb | we're back to having zuul get indexed so things are moving along there | 22:59 |
fungi | i guess once we see the manage-projects run complete as expected, we can endmeeting. probably no need to wait for reindexing as that will be hours still | 23:00 |
clarkb | if anyone is wondering run cloud launcher is running on bridge | 23:04 |
clarkb | its just slow because of the previous disable and now it has to do its things | 23:05 |
clarkb | cloud launcher is finishing up now, I think manage projects should be next? | 23:09 |
clarkb | I want to say priority is such that that happens | 23:09 |
clarkb | openstack/gnocchi is being reindexed | 23:09 |
clarkb | I wonder if there are ways to optimize this to ignore ancient history until the end | 23:10 |
fungi | which may not be what you want if you wind up renaming ancient history for some reason | 23:11 |
fungi | right now it optimizes to fill all the parallel workers with the largest workloads first so that everything is finished sooner | 23:12 |
clarkb | it is moving pretty quickly now, enough of the big repos have cleared out and it can work on the smaller ones | 23:13 |
clarkb | manage projects ran | 23:16 |
clarkb | redirects for openstack/interop still work | 23:16 |
fungi | awesome | 23:18 |
fungi | shall we call it a wrap? | 23:18 |
clarkb | I think so? It would be nice to confirm that that refstack change I linked earlier works once reindexing hits refstack | 23:19 |
clarkb | but thats the only thing outstanding that I can see | 23:19 |
clarkb | and looks like reindexing may be done for it /me tries it | 23:19 |
fungi | ls-projects lists a osf/refstack and no openstack/refstack | 23:19 |
clarkb | https://review.opendev.org/#/c/726006/ loads now with no http 500 | 23:20 |
mnaser | before i muck around too much | 23:20 |
mnaser | https://review.opendev.org/#/c/714480/ was a change i wanted to land relating to the rename | 23:20 |
mnaser | it lists "Cannot Merge" but i just rebased it to the tip of the master (Add njohnston liaison preference) | 23:21 |
mnaser | and the dependency is merged too? | 23:21 |
fungi | the "cannot merge" may have been from before reindexing completed | 23:21 |
mnaser | it looks like after my rebase, now zuul also thinks it cannot merge either | 23:21 |
fungi | zuul may not have reconfigured yet, if our configuration management hasn't gotten around to triggering a reload of the tenant config | 23:22 |
clarkb | I think zuul asks gerrit | 23:22 |
clarkb | fungi: that change is to governance | 23:22 |
fungi | ahh, or that | 23:22 |
fungi | oh, hrm | 23:22 |
mnaser | yeah and it depends on a project-config change | 23:22 |
mnaser | so really its not an 'affected' project | 23:22 |
clarkb | but we can check the zuul logs for it to see why zuul is unhappy | 23:22 |
mordred | clarkb, fungi: don't forget about: https://review.opendev.org/#/c/735401/ - I imagine we'll forget about it if we don't grab it today | 23:22 |
fungi | gerrit doesn't take depends-on into account for showing "cannot merge" | 23:23 |
mordred | clarkb: and https://review.opendev.org/#/c/735400/ from fungi | 23:23 |
mnaser | fungi: right -- so that's why when i rebased to the most recently merged change, i figured it.. should be ok? | 23:23 |
mnaser | i mean, i can pull it down and fully push back under another commit id but.. | 23:24 |
fungi | also claims "Patch in Merge Conflict" | 23:24 |
mnaser | i mean if we think its just a weird thing inside openstack/governance, i can pull it down and push it back up | 23:25 |
fungi | mnaser: i see a merge conflict with that change and head of master over reference/projects.yaml | 23:26 |
mnaser | under another commit id and that might unblock whatever is stuck | 23:26 |
fungi | maybe gerrit/zuul aren't wrong? | 23:26 |
mnaser | fungi: i'm curious as to how/why gerrit let me rebase it through the UI | 23:26 |
mnaser | usually it would whine about a merge conflict in the UI | 23:26 |
clarkb | oh you used the ui | 23:26 |
mnaser | i should have clarified earlier, it sounds like the ui has some dark magic? | 23:26 |
clarkb | well the ui uses jgit | 23:27 |
clarkb | so weirdness to be expected :P | 23:27 |
clarkb | I'm still tracking down how zuul failed | 23:27 |
fungi | ahh, yeah i really don't know what the ui does for its magic/remote rebase | 23:27 |
clarkb | we don't seem to log the merger that merges things though :/ | 23:27 |
mnaser | let me cherry-pick it locally and see if git on my machine complains | 23:27 |
fungi | clarkb: i expect zuul is actually unhappy because that change is really merge-conflicting with current master state | 23:27 |
mnaser | if it does, maybe that's why zuul is unhappy | 23:28 |
mnaser | ^ | 23:28 |
fungi | yeah, i mean, i tried to rebase it on master locally just now which is how i know it's conflicting over reference/projects.yaml | 23:28 |
mnaser | yeah, it conflicts locally | 23:28 |
mnaser | fungi: always a step ahead of me :) | 23:29 |
clarkb | ERROR: content conflict in reference/projects.yaml | 23:29 |
clarkb | zuul tripped that too | 23:29 |
clarkb | so ya I think its actually in an unmergable state | 23:29 |
clarkb | I wonder if rebase in gerrit ui relies on the index to check | 23:29 |
clarkb | and if index is stale you just get a weird result | 23:29 |
clarkb | only nova and openstack-manuals are reindexing now | 23:30 |
fungi | i want to say that gerrit's rebase function is for when you have a parent change which gets a new patchset and you want your change rebased on the new state of the parent | 23:30 |
clarkb | so ya I'd push a new ps properly rebased using git and see if it complains from there | 23:30 |
clarkb | or maybe if it has a conflict it just rebases to the most recent thing that doesn't conflict | 23:31 |
clarkb | because it did change the HEAD^ | 23:31 |
mnaser | it is weird, because sometimes it does also complain about merge conflict in the UI too | 23:31 |
clarkb | and its different than what the latest ps is at | 23:31 |
mnaser | (anyways, pushed it up and its good now) | 23:31 |
fungi | cool. maybe this story has a happier ending in newer gerrit | 23:31 |
clarkb | mordred: bug on https://review.opendev.org/#/c/735401/2 | 23:32 |
fungi | mnaser: regardless, thanks for bringing that up, thankfully i don't see anything to suggest it indicates a problem with the maintenance | 23:32 |
mnaser | yeah, i figured it would be good to have an early signal _in case_ it was something | 23:33 |
mnaser | but no we can all have our rest of friday : | 23:33 |
mnaser | :) | 23:33 |
fungi | yay! | 23:33 |
mordred | clarkb: fixed - and thus why we code-review :) | 23:34 |
clarkb | I've detached from the root screen on bridge | 23:34 |
clarkb | I think we can probably shut it down now? | 23:34 |
clarkb | mordred: +2 now. | 23:35 |
mordred | fungi: got a sec for a +A on 735401? | 23:36 |
clarkb | I've rechecked https://review.opendev.org/#/c/735397/ as its gating was lost by gerrit being down I think | 23:36 |
clarkb | everything else looks merged or on its way except for 735491 | 23:37 |
clarkb | *401 | 23:37 |
fungi | lookin | 23:38 |
clarkb | I'm going to take a break for a bit now. Will check back in a bit to see if nova and openstack manuals are done reindexing | 23:39 |
clarkb | but I think this is looking happy now | 23:39 |
fungi | okay, shall i endmeeting and declare the maintenance done? reindexing should take care of itself | 23:39 |
clarkb | ++ | 23:40 |
fungi | #endmeeting | 23:45 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 23:45 | |
openstack | Meeting ended Fri Jun 12 23:45:41 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 23:45 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-06-12-20.27.html | 23:45 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-06-12-20.27.txt | 23:45 |
openstack | Log: http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-06-12-20.27.log.html | 23:45 |
fungi | #status log project rename maintenance concluded: http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-06-12-20.27.html | 23:46 |
openstackstatus | fungi: finished logging | 23:46 |
fungi | thanks everybody! | 23:46 |
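
The deployment pause used during this maintenance is a flag file: its presence on bridge stops regular deployments, and removing it (`rm /home/zuul/DISABLE-ANSIBLE`) lets them resume. A minimal sketch of that kind of gate follows. Only the path comes from the log; the function names and the gating logic are hypothetical and are not the actual system-config implementation. It also illustrates why a misspelled filename such as `DISABLE-ANSBILE` would not pause anything if the check looks for the exact name.

```python
from pathlib import Path

# Path taken from the #info line in the log; everything else here is an
# illustrative assumption, not the real deployment tooling.
DISABLE_FLAG = Path("/home/zuul/DISABLE-ANSIBLE")


def deployments_paused(flag: Path = DISABLE_FLAG) -> bool:
    """Return True when the flag file exists under its exact name."""
    return flag.exists()


def run_deployment(deploy, flag: Path = DISABLE_FLAG) -> bool:
    """Call deploy() unless the flag file is present; report whether it ran."""
    if deployments_paused(flag):
        return False
    deploy()
    return True
```

With a gate like this, touching the correctly spelled file makes `run_deployment` return `False` without calling `deploy`, while a stray mistyped file next to it has no effect.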
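
fungi's description of the reindexing order (fill all the parallel workers with the largest workloads first so that everything finishes sooner) matches the classic largest-first greedy scheduling heuristic. The sketch below shows that heuristic in general terms. It is not Gerrit's implementation; the function name is hypothetical and the workload sizes are made-up numbers attached to repository names mentioned in the log.

```python
import heapq


def schedule_largest_first(workloads, workers):
    """Greedy largest-first assignment of workloads to parallel workers.

    workloads maps a name to an estimated size; each workload goes to the
    worker with the smallest accumulated load so far. Returns the finishing
    time of the busiest worker and the per-worker assignment.
    """
    heap = [(0, i) for i in range(workers)]  # (accumulated load, worker index)
    heapq.heapify(heap)
    assignment = {i: [] for i in range(workers)}
    by_size = sorted(workloads.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        load, idx = heapq.heappop(heap)  # least-loaded worker
        assignment[idx].append(name)
        heapq.heappush(heap, (load + size, idx))
    makespan = max(load for load, _ in heap)
    return makespan, assignment


# Hypothetical sizes, for illustration only.
sizes = {"nova": 8, "openstack-manuals": 7, "gnocchi": 3, "zuul": 2, "refstack": 1}
makespan, plan = schedule_largest_first(sizes, workers=2)
```

Starting the big repositories early keeps them from becoming a long tail at the end, which fits clarkb's observation that only nova and openstack-manuals were still reindexing near the close of the maintenance.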