-openstackstatus- NOTICE: review.opendev.org is being restarted for scheduled maintenance; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 16:04 | |
fungi | okay, we can start prepping for the etherpad maintenance in here i suppose | 16:53 |
---|---|---|
corvus | status notice etherpad.openstack.org will be offline for about 30 minutes while it is migrated to a new server with a new hostname; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 16:54 |
corvus | how's that look? | 16:54 |
corvus | also, do we want to startmeeting? | 16:55 |
corvus | maybe startmeeting opendev-maintenance ? | 16:55 |
corvus | infra-root: i summon you :) | 16:56 |
fungi | *poof* | 16:56 |
* fungi appears in a puff of smoke | 16:56 | |
clarkb | corvus: ++ on the meeting we can try that out for records | 16:56 |
clarkb | and the status message lgtm | 16:56 |
fungi | that lgtm | 16:57 |
fungi | using meetbot for this one would work, but not for anything where we use #status alert, as they will fight for control of the channel topic | 16:57 |
corvus | or should we call it 'opendev-maint' because typing is hard? | 16:58 |
fungi | i'm fine with the abbrev, sure | 16:59 |
* mordred waves | 16:59 | |
mordred | corvus: yes | 16:59 |
corvus | "opendev-maint" going once | 16:59 |
corvus | ...going twice... | 16:59 |
corvus | ...sold | 17:00 |
corvus | #startmeeting opendev-maint | 17:00 |
openstack | Meeting started Fri Apr 10 17:00:05 2020 UTC and is due to finish in 60 minutes. The chair is corvus. Information about MeetBot at http://wiki.debian.org/MeetBot. | 17:00 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 17:00 |
*** openstack changes topic to " (Meeting topic: opendev-maint)" | 17:00 | |
openstack | The meeting name has been set to 'opendev_maint' | 17:00 |
corvus | ha, apparently it's opendev_maint :) | 17:00 |
corvus | #status notice etherpad.openstack.org will be offline for about 30 minutes while it is migrated to a new server with a new hostname; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 17:00 |
openstackstatus | corvus: sending notice | 17:00 |
* mordred is in a screen on etherpad01.opendev.org | 17:01 | |
-openstackstatus- NOTICE: etherpad.openstack.org will be offline for about 30 minutes while it is migrated to a new server with a new hostname; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 17:01 | |
corvus | joined | 17:01 |
fungi | joined as well | 17:01 |
mordred | k. I'm ready to rock and roll there - somebody else want to stop existing etherpad? | 17:01 |
mordred | ( | 17:02 |
* clarkb is joining | 17:02 | |
corvus | i'll stop existing etherpad | 17:02 |
mordred | I'm going to warn everybody - it's like watching paint dry in the screen once this is running | 17:02 |
fungi | oh, for the db dump/source pipeline? | 17:02 |
clarkb | I've joined | 17:02 |
mordred | yup | 17:02 |
corvus | old etherpad is stopped | 17:03 |
mordred | ok. I'm going to run the command | 17:03 |
mordred | it is running | 17:03 |
corvus | neat, old etherpad is running a puppetlabs mcollectived server | 17:04 |
corvus | whatever that is | 17:04 |
openstackstatus | corvus: finished sending notice | 17:04 |
mordred | WOW | 17:04 |
corvus | mordred: is etherpad running on the new server? | 17:04 |
mordred | corvus: it should not be | 17:04 |
mordred | I only started the mariadb service | 17:04 |
corvus | cool, i confirm that's the case :) | 17:04 |
clarkb | mcollective was puppets message bus for doing orchestration like tasks | 17:05 |
corvus | should we start the dns change now? | 17:05 |
corvus | i believe we should change etherpad.openstack.org cname to point to etherpad.opendev.org ? | 17:05 |
mordred | yeah - I think that's a good idea | 17:05 |
corvus | i'll get started on that while clarkb and fungi confirm :) | 17:06 |
clarkb | ++ | 17:06 |
fungi | yes definitely | 17:07 |
fungi | to give the change time to propagate | 17:07 |
fungi | presumably the plan is to delete the existing a/aaaa rrs for etherpad.openstack.org and replace it with a cname to etherpad.opendev.org | 17:08 |
corvus | etherpad.openstack.org is currently a cname for etherpad01 | 17:09 |
corvus | etherpad.openstack.org is currently a cname for etherpad01.openstack.org | 17:09 |
corvus | i was going to change it to be a cname for etherpad.opendev.org | 17:09 |
corvus | so the result will be etherpad.openstack.org -> etherpad.opendev.org -> etherpad01.opendev.org | 17:09 |
clarkb | corvus: ++ | 17:09 |
mordred | corvus: I think that's correct | 17:09 |
fungi | ahh, right, so just update the cname, even easier | 17:10 |
corvus | there's just one problem; i don't see etherpad.openstack.org in the list of records in the rax web ui | 17:10 |
corvus | it was there when i changed the ttl a few days ago | 17:10 |
fungi | scroll all the way to the end and then keyword search? | 17:10 |
corvus | is there some kind of limit? | 17:10 |
mordred | the rax records are paged and sorted by type | 17:10 |
corvus | fungi: that is my usual procedure which i have done | 17:10 |
fungi | it only pages in some at a time and you have to scroll | 17:10 |
mordred | weird | 17:10 |
fungi | ahh, i can try | 17:10 |
clarkb | the length of the db backup is making me think about this. Whats the disk situation like on the new server? it has a 50GB volume and is currently using ~3GB of that for the prod db? | 17:11 |
mordred | also - https://review.opendev.org/#/c/718764 can be landed now | 17:11 |
corvus | wait i found it | 17:11 |
fungi | standing down! | 17:11 |
clarkb | also ^F doesn't work properly | 17:11 |
corvus | ctrl-f was not bringing it up | 17:11 |
mordred | corvus: once it's loaded it's about 30G of data | 17:11 |
mordred | gah | 17:11 |
mordred | clarkb: ^^ | 17:11 |
clarkb | mordred: is 50GB big enough? | 17:11 |
corvus | but scrolling to it, it shows up (and it's highlighted) | 17:11 |
mordred | that's what the volume was on the old one | 17:11 |
clarkb | mordred: ah ok | 17:11 |
clarkb | and we can always attach another volume and grow the lv | 17:12 |
clarkb | now that I've said ^ and checked lvs I'm far less worried :) | 17:12 |
mordred | ++ | 17:12 |
* fungi checks paint, still sticky | 17:12 | |
mordred | that said - I was totally a shemp when I attached that volume so the lv has a stupid name | 17:12 |
corvus | #info updated etherpad.openstack.org. CNAME from etherpad01.openstack.org. to etherpad.opendev.org. | 17:13 |
corvus | i left the ttl at 300 | 17:13 |
mordred | cool | 17:13 |
corvus | do we have an ssl cert for etherpad.openstack.org on etherpad01.opendev.org? | 17:14 |
fungi | yeah, i already tested that bit | 17:14 |
corvus | cool, i thought so, just running through things again :) | 17:14 |
mordred | if you want to watch the db size grow: | 17:14 |
mordred | ls -ltrah /var/etherpad/db/etherpad@002dlite/store.ibd | 17:14 |
mordred | on etherpad01.opendev.org | 17:14 |
clarkb | ya and the LE verification failed the first time around because dns wasn't set up properly to verify that the first time | 17:15 |
fungi | X509v3 Subject Alternative Name: DNS:etherpad.opendev.org, DNS:etherpad.openstack.org, DNS:etherpad01.opendev.org | 17:15 |
fungi | according to openssl | 17:15 |
mordred | woot | 17:15 |
corvus | etherpad.openstack.org. 300 IN CNAME etherpad.opendev.org. | 17:16 |
corvus | etherpad.opendev.org. 299 IN CNAME etherpad01.opendev.org. | 17:16 |
corvus | etherpad01.opendev.org. 218 IN A 104.130.124.120 | 17:16 |
corvus | that's what i get from dig now | 17:16 |
clarkb | corvus: looks perfect | 17:17 |
corvus | and cool, the http redirect is working | 17:17 |
corvus | (because apache is up; it's just the eplite service that's down) | 17:17 |
mordred | while we're waiting - it occurred to me recently - is having apache on the host rather than in a docker container and in the compose file the right choice? would it make more sense to run it as an apache container as well? | 17:19 |
clarkb | mordred: ya I was thinking about that back when I thought refstack might grow some momentum again. I think if we want to go away from using host networking having a host run webproxy is nice though it could be the one host network container too | 17:20 |
fungi | right, i tested the redirect yesterday as well, albeit with the etherpad service down and apache serving an error for it | 17:20 |
fungi | so looks like what i got from my local /etc/hosts edit | 17:21 |
mordred | clarkb: yeah - I was thinking about it from a "what would be different about these container services if we decided to roll out k8s" | 17:21 |
clarkb | mordred: if we rolled out k8s we'd probably use the nginx ingress controller for a good chunk of that ? | 17:21 |
corvus | i'm ambivalent about whether we run apache in a container or not; if we did, we could still use host networking | 17:22 |
clarkb | though services like etherpad need rewriting which I don't know that it can do | 17:22 |
corvus | clarkb: we would use *some kind* of ingress controller, not necessarily the nginx one, depending on what our load balancer situation was like | 17:22 |
clarkb | fair | 17:22 |
corvus | and many of them can rewrite | 17:22 |
mordred | clarkb: yeah - I think we can still run apache behind the ingress controller in those cases - so that we don't have to rewrite all of our rewrites | 17:22 |
mordred | but also - cloud load balancers are a thing too | 17:22 |
mordred | when we did the gitea setup, we used a cloud load balancer that attached to exposed service of each pod running | 17:23 |
clarkb | and that cloud load balancer was running haproxy not nginx :) | 17:23 |
mordred | that said - in our current clouds we can do the same thing only with nginx ingress if we use VRRP to manage which thing owns the VIP | 17:23 |
mordred | if we don't want to rely on a cloud load balancer | 17:24 |
mordred | I know that it's possible to create VRRP-enabled ports in neutron in vexxhost | 17:24 |
clarkb | mordred: ya the basic requirement is being able to control a shared l2 network between the instances with the 3 IPs on that network | 17:25 |
clarkb | though maybe you don't even need the third ip on that network if you can vrrp separately? its been a while since I had to do vrrp | 17:26 |
corvus | here's an ingress controller config for gke with a path mapping (to /, but the syntax is there to imagine other roots); so it's doing layer 7 load balancing -- https://gerrit.googlesource.com/zuul/ops/+/refs/heads/master/k8s/zuul.yaml#315 | 17:26 |
fungi | clarkb: yeah, technically you can have vrrp/hsrp/carp use only two addresses (though a third makes it somewhat easier) | 17:27 |
mordred | corvus: so that ingress setup seems like it's mapping a single external ip to the resources? | 17:29 |
clarkb | mordred: I think its a name not an ip | 17:30 |
clarkb | (so they could do magic with dns potentially) | 17:30 |
mordred | kubernetes.io/ingress.global-static-ip-name: "zuul-static-ip" | 17:31 |
mordred | is what I was keying off of | 17:31 |
corvus | mordred: yes, it's a single pre-allocated static ip | 17:31 |
corvus | (i previously ran "gcloud get me a static ip named zuul-static-ip") | 17:32 |
clarkb | ah | 17:32 |
clarkb | so its referencing cloud resources outside of k8s | 17:32 |
mordred | nod. so pattern-wise (ignoring mechanics for a sec) - that would potentially map to the sorts of things we'd want to do | 17:32 |
corvus | yep | 17:32 |
mordred | so figuring out the equiv pattern for us inside of a k8s in openstack would be a key piece if we wanted to explore using k8s for services instead of compose | 17:33 |
clarkb | we are at 13GB used | 17:37 |
clarkb | and now 15GB this paint is sticky | 17:41 |
mordred | yeah | 17:44 |
fungi | "wet data, do not touch" | 17:44 |
mordred | seems to be running slower today | 17:44 |
fungi | it is a holiday | 17:48 |
corvus | we're expecting it to be how big? | 17:48 |
fungi | ~30gb clarkb said? | 17:48 |
corvus | 30g right? | 17:48 |
clarkb | ya thats what mordred said above | 17:49 |
fungi | oh, got it | 17:49 |
corvus | so we're 36 minutes away from completion | 17:49 |
corvus | status notice The etherpad migration is still in progress; revised estimated time of completion 18:30 UTC | 17:50 |
corvus | should we send that? | 17:50 |
fungi | yeah, warranted | 17:51 |
clarkb | ++ | 17:51 |
corvus | #status notice The etherpad migration is still in progress; revised estimated time of completion 18:30 UTC | 17:51 |
openstackstatus | corvus: sending notice | 17:51 |
corvus | i'm going to afk for about 30m | 17:51 |
-openstackstatus- NOTICE: The etherpad migration is still in progress; revised estimated time of completion 18:30 UTC | 17:51 | |
fungi | once maintenance is concluded, it may be time to prepare for my annual viewing of "the life of brian" | 17:52 |
clarkb | I'll be making a tunafish sandwich for lunch when this is done | 17:53 |
mordred | fungi, clarkb : while you're waiting: https://review.opendev.org/#/c/718764/ | 17:53 |
mordred | and actually - I think we can not land that yet | 17:54 |
openstackstatus | corvus: finished sending notice | 17:54 |
mordred | and land it once we take etherpad out of the emergency file to ... no, that's too laggy. nevermind me | 17:55 |
clarkb | https://review.opendev.org/#/c/719051/ another good one to review though it had a post failure | 17:55 |
mordred | I think we can land it whenever | 17:55 |
mordred | clarkb: and this one remote: https://review.opendev.org/719053 Set env vars pointing to correct file locations | 17:57 |
mordred | and remote: https://review.opendev.org/719052 Fix issues from rolling out containers | 18:01 |
mordred | infra-root db migration done | 18:02 |
mordred | I might have been wrong about db size | 18:02 |
fungi | or there were a lot of zeroes at the end | 18:02 |
clarkb | or newer mysql is more compact | 18:02 |
mordred | I think actually 32G of free space on device is what I was looking at :) | 18:02 |
fungi | so ready to start up the container? | 18:02 |
mordred | yeah - I think so | 18:03 |
mordred | any last concerns? | 18:03 |
fungi | none for me | 18:04 |
clarkb | none from me | 18:04 |
mordred | k. here we go | 18:04 |
mordred | k. I reloaded an openstack etherpad, it redirected to opendev and all is good | 18:04 |
fungi | i reconnected to a pad i already had open and got sent to the right (new) place | 18:05 |
mordred | we might want to keep our eyes on this as it gets usage - might need to tune the my.cnf settings | 18:05 |
fungi | didn't even reload, just clicked the reconnect button from when it got disconnected during the shutdown | 18:05 |
fungi | we did at least incorporate the apache tuning we had on the old deployment, right? | 18:05 |
mordred | yeah | 18:06 |
mordred | innodb_buffer_pool_size= 256M is the one I think might be applicable | 18:06 |
fungi | tested out a few more pads, not seeing any problem yet | 18:06 |
clarkb | mordred: thinking it may need to be bigger? | 18:06 |
mordred | but honestly, 256M of hot data isn't bad | 18:06 |
clarkb | and ya I think individual etherpads tend to be pretty small. Its the history data that grows (I wonder if we can tune it to prefer the newer pad data) | 18:07 |
mordred | it'll do that naturally - the buffer pool will only contain the most recently touched pages | 18:07 |
mordred | so I think it should be fine | 18:08 |
mordred | in other news, my new dowel-style rolling pin has arrived | 18:09 |
fungi | have fun! i still just use a boring old marble cylinder roller | 18:11 |
fungi | but i like the extra weight | 18:11 |
mordred | are you saying I'm fat? | 18:12 |
fungi | heh | 18:13 |
clarkb | that post failure was due to an rsync failure fwiw | 18:13 |
clarkb | mordreds approval seems to have rechecked it | 18:13 |
clarkb | do we need to send an all clear now? and maybe end the meeting? | 18:14 |
clarkb | not sure what other work there is to do other than following up on gerrit jeepyb things | 18:14 |
mordred | I think we should end the meeting - don't know if we need an all clear | 18:18 |
mordred | I think this one is good | 18:18 |
mordred | we might need to restart etherpad to pick up the settings.json update - but that should be a thing that can just be done - within the margin of error for an internet-facing service's connectivity | 18:19 |
mordred | oh - we need to take etherpad01.opendev.org out of emergency - shall I do that? | 18:19 |
clarkb | ++ | 18:19 |
clarkb | and then sometime next week clean up the old server and db? probably after we have backups running for the new server? | 18:19 |
mordred | no - we need to land ... | 18:20 |
mordred | https://review.opendev.org/#/c/719036/ | 18:20 |
mordred | and then ... one sec | 18:20 |
fungi | mordred: are we missing an equivalent of https://opendev.org/opendev/system-config/src/branch/master/modules/openstack_project/templates/gerrit_patchset-created.erb ? | 18:20 |
clarkb | mordred: comment on https://review.opendev.org/#/c/719036/1 | 18:21 |
fungi | nevermind, found it at https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gerrit/templates/patchset-created.j2 | 18:22 |
mordred | clarkb: updated - and pushed up 2 additional | 18:23 |
mordred | fungi: oh - I missed that one in the patch didn't I? | 18:24 |
fungi | mordred: yeah, i commented | 18:24 |
fungi | since it's a template it's not in the same directory | 18:24 |
mordred | ++ | 18:24 |
fungi | though maybe make it not a template? | 18:24 |
corvus | o/ | 18:24 |
fungi | it's only templated so we can toggle the welcome message feature on the existence or absence of a welcome_message_gerrit_ssh_private_key value | 18:25 |
clarkb | mordred: and do we expect that to noop for review01.openstack.org? I guess since its already configured? | 18:25 |
mordred | fungi: yeah - which doesn't exist on review-dev I think | 18:25 |
fungi | which i expect was more transitional or for the benefit of people who might reuse our hook scripts | 18:25 |
mordred | clarkb: I think the backup group is intended to be a normal group for servers we backup? | 18:25 |
fungi | anyway, yeah, drop the conditional, move to files, add envvar exports | 18:26 |
clarkb | mordred: aha got it | 18:26 |
mordred | the backup-server is the only one we only run some times | 18:26 |
clarkb | also I accidentally added a +W on that group change. I've removed that | 18:26 |
mordred | (see the two followup patches) | 18:26 |
mordred | fungi: no - I think review-dev doesn't have that key | 18:26 |
mordred | fungi: we'd need to add one for it - and a welcome message user | 18:27 |
mordred | that said ... | 18:27 |
mordred | fungi: I updated it - I think you'll like it now | 18:30 |
mordred | corvus: does the stack at https://review.opendev.org/#/c/719077/ look right to you? | 18:32 |
corvus | mordred: yeah -- though what was the conclusion about puppet managing backups on review? | 18:33 |
corvus | (have we confirmed that's gone?) | 18:33 |
mordred | those would be cron jobs right? | 18:33 |
clarkb | mordred: yes cron jobs | 18:33 |
clarkb | and since puppet isn't running its not managing it | 18:33 |
clarkb | would mostly just be ensuring ansible applies the same or similar cron jobs and bup config | 18:34 |
mordred | yeah. let me remove the bup cronjob | 18:34 |
mordred | there's also 2 other cronjobs we have for root we need to add to ansible | 18:34 |
mordred | but I'll leave them for now | 18:34 |
mordred | until we have the patch to replace them | 18:34 |
mordred | k. bup cronjob on review01.opendev.org has been removed - we should expect ansible to add one now | 18:35 |
mordred | lemme make a patch to add the others | 18:35 |
clarkb | service-backup should apply it | 18:35 |
clarkb | when you add the server to the backup group | 18:36 |
clarkb | (I don't know what triggers that playbook though) | 18:36 |
mordred | clarkb: well - we have a patch to trigger all playbooks on inventory changes | 18:40 |
mordred | that hasn't landed | 18:40 |
mordred | https://review.opendev.org/719088 <-- gerrit cron jobs | 18:40 |
mordred | clarkb: I take it back - inventory changes trigger everything now: https://review.opendev.org/71908 | 18:41 |
mordred | clarkb: so adding and removing the things to groups should cause the backup playbook to run | 18:41 |
clarkb | k | 18:41 |
clarkb | mordred: that link is missing a digit | 18:41 |
mordred | clarkb: https://review.opendev.org/#/c/717114/ is what I meant | 18:42 |
clarkb | specifically line 1716 of that change covers this case | 18:43 |
mordred | yeah | 18:43 |
mordred | hah | 18:44 |
corvus | looks like it's time to end the meeting | 18:47 |
corvus | #endmeeting | 18:47 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 18:47 | |
openstack | Meeting ended Fri Apr 10 18:47:50 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 18:47 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.html | 18:47 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.txt | 18:47 |
openstack | Log: http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.log.html | 18:47 |
*** diablo_rojo has joined #opendev-meeting | 18:55 | |
-openstackstatus- NOTICE: Due to a database migration error, etherpad.opendev.org is offline until further notice. | 20:07 | |
*** diablo_rojo_phon has joined #opendev-meeting | 20:53 | |
*** diablo_rojo has quit IRC | 21:54 | |
-openstackstatus- NOTICE: Maintenance on etherpad.opendev.org is complete and the service is available again | 22:23 |
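
The DNS change discussed at 17:09 and checked with dig at 17:16 can be sketched as a small lookup over a static record table. This is only an illustration: the `RECORDS` table is copied from the dig answers quoted in the log, while the `resolve_chain` helper and its `max_hops` parameter are hypothetical names made up here, not part of any OpenDev tooling. It does no network queries.

```python
# Hypothetical sketch: follow a CNAME chain through a static record table.
# The records mirror the dig answers quoted in the log at 17:16.
RECORDS = {
    "etherpad.openstack.org.": ("CNAME", "etherpad.opendev.org."),
    "etherpad.opendev.org.": ("CNAME", "etherpad01.opendev.org."),
    "etherpad01.opendev.org.": ("A", "104.130.124.120"),
}


def resolve_chain(name, records, max_hops=8):
    """Return the list of names visited, ending with the final record value."""
    chain = [name]
    for _ in range(max_hops):
        current = chain[-1]
        if current not in records:
            raise LookupError(f"no record for {current}")
        rtype, value = records[current]
        chain.append(value)
        if rtype != "CNAME":
            # Reached a terminal record (an A record in this table).
            return chain
    # Guard against loops or unreasonably long chains (assumed limit).
    raise ValueError("CNAME chain too long or looping")


if __name__ == "__main__":
    print(" -> ".join(resolve_chain("etherpad.openstack.org.", RECORDS)))
```

Keeping the old name as a CNAME to the service name, rather than to the numbered host, means a later server replacement only needs the etherpad.opendev.org record to change.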