Saturday, 2018-12-22

corvusoh neat00:00
corvusapparently gerrit can create repos by pushing to them00:00
clarkbooh00:00
corvusbut gitea doesn't know about them00:00
fungiaccording to man ssh, "PreferredAuthentications can be used to change the default order"00:01
corvusi guess we'll see what happens when my script gets around to creating the repos that gerrit is pushing now00:01
fungiif it becomes necessary00:01
openstackgerritMerged openstack-infra/infra-specs master: Move project-hosting spec to completed  https://review.openstack.org/62260900:02
openstackgerritMerged openstack-infra/infra-specs master: Add opendev Gerrit spec  https://review.openstack.org/62303300:02
clarkbThe doc update should merge soon too00:03
fungiinfra-root: reminder, we'll want https://review.openstack.org/627018 merged before the next gerrit restart00:03
mordredclarkb: also, I think we were saying this to fungi earlier (and by we, I mean corvus) - but we should also walk y'all through all the various k8s things - maybe screencast walkthrough style - before going live with this bad boy00:04
corvushttp://38.108.68.66/openstack/steth is one that gerrit replicated to before my script created it; so we should check back on that later00:04
mordredcool00:04
corvusyeah, i totally volunteered mordred and me for that without asking :)00:04
corvusbut i figured mordred is excited about this and likes to talk, so it was a safe bet00:05
mordredI have learned many things about kubectl that are non-terrible00:05
clarkbmordred: ya that would be good though I feel like I have a high enough level understanding of it that I'd work my way through kubectl docs and be fine00:05
openstackgerritMerged openstack-infra/infra-specs master: Overhaul instructions in README.rst for clarity  https://review.openstack.org/62321100:05
fungii'll appreciate the show-n-tell00:06
clarkbwe should also make sure we size things appropriately as much as we can figure upfront00:06
clarkb(since the atomic-based k8s had a sad due to undersizing)00:06
mordredclarkb: yah - seeing as how I knew nothing a couple of weeks ago and now feel comfortable debugging stuff, I'm not worried about anybody - but yeah, show-n-tell should be nice00:06
mordredyah. although we're not running this on atomic anymore00:06
clarkbya it's xenial, ansible deployed00:07
mordredalso - things like "replace a node" and "resize a node" are all things we should do before it goes live00:07
corvusone of the things we can do during the show-and-tell is bootstrap the whole thing from nothing with everyone watching.  and drop nodes and watch ceph heal.00:07
mordred++00:07
clarkbmordred: ++ and upgrade k8s00:07
mordredyah. although we don't currently have a candidate k8s version to upgrade to00:08
mordredmaybe we could deploy it on 1.11.4 - and then upgrade to .500:08
mordredjust for the exercise00:08
clarkb++00:08
clarkbwe should also do a braindump on the networking. which driver did you choose?00:08
mordredI made no choices - the k8s-on-openstack author chose weave00:09
mordredbut yes - we should DEFINITELY talk about networking - especially about the octavia interface00:09
corvusmy script is at the beginning of the openstack/ alphabet, and gerrit is at the end00:10
corvusso they're decidedly out of sync still00:10
corvusopenstack/c vs openstack/x00:10
mordredhttp://38.108.68.66/explore/repos is fun though00:11
clarkbmordred: is weave open source? their website isn't clear on that00:11
clarkbthere is a trial version of something00:11
corvushttp://38.108.68.66/openstack-attic/akanda  has populated00:12
mordredyeah. it's one of the two 'default' options00:12
mordredakanda repo looks great!00:12
corvushttp://38.108.68.66/openstack-attic/akanda seems to show the kilo branch by default00:13
mordredI agree with that00:14
clarkblooks like weave net is open but weave cloud is not00:14
clarkbin any case should be fine for us to use00:14
*** jamesmcarthur has joined #openstack-infra00:14
corvusthere is a default branch setting in the ui00:15
mordredclarkb: yeah. I wouldn't mind learning a bit more and potentially looking at calico and ipv6 ... but for now I think weave is working fine00:15
mordredcorvus: the default branch is set to stable/kilo in the db00:16
corvusit seems to be set to kilo in akanda00:16
*** eernst has quit IRC00:16
mordredcorvus: maybe it's getting set by the force push - maybe gerrit decided to force push stable/kilo last?00:16
corvusthe only non-master values are on akanda and akanda-appliance-builder so far00:17
mordredcorvus: hrm. database-api is fine - it's on the ... yeah00:17
mordreddid the default branch for akanda get set to non-master in gerrit?00:17
corvusbut maybe those are the only repos that existed in the db during the push?00:18
mordredhrm. extra weird00:18
corvusi guess we'll see when things swing back around00:18
mordredyeah00:18
clarkbHEAD isn't pushed so i don't think gerrit can communicate that info00:19
clarkbthis is one reason removing master is "bad"00:19
*** jamesmcarthur has quit IRC00:21
mordredyeah - but that makes the default_branch field being set even weirder00:22
corvusmelange just got updated as well00:22
mordredcorvus: yes, I agree00:23
corvusso when everything is settled, let's set the defaults back to master, then see if gerrit's force-pushing updates them again00:23
corvusi'm going to afk for a bit while these things run.  i'll check back in later00:24
*** tosky has quit IRC00:25
clarkbhrm weave pods can be evicted by k8s if the node runs out of resources. We may need or want some simple monitoring for base functionality too00:27
clarkb(thats likely longer term need)00:27
openstackgerritMerged openstack-infra/system-config master: Add more \\ to launchpad-bug Gerrit tracking-id  https://review.openstack.org/62701800:29
clarkbmelange is a name I haven't heard in a while00:30
clarkbthe git spice must flow00:30
*** hwoarang has quit IRC00:31
mordredcorvus: ++ to checking after the first pass00:32
mordredI'd put money on it being something with the initial push00:32
*** hwoarang has joined #openstack-infra00:32
mordred(and might be a bug we can file if we can figure it out)00:33
*** whoami-rajat has quit IRC00:34
*** xek has quit IRC00:42
*** xek has joined #openstack-infra00:43
*** rh-jelabarre has quit IRC00:44
*** xek has quit IRC00:47
*** bobh has joined #openstack-infra00:51
*** bobh has quit IRC00:56
*** rascasoft has joined #openstack-infra00:57
*** rascasoft has quit IRC01:06
corvusokay, http://38.108.68.66/openstack/steth has been created.  gerrit pushed to it a long time ago, but there's no content there01:08
corvusso we'd probably either need another push, or to tell gitea to resync from disk in order for it to notice content01:08
corvusit's still chugging; i'll wait until it's done before i do anything like that01:08
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Add gitea k8s resource definitions and playbook  https://review.openstack.org/62675901:12
corvusthat's the result of today's work01:12
clarkbheh that ansible reads yaml into the playbook then writes yaml back to k8s :)01:13
clarkbpuppet suffered from a lot of this too. lack of passthrough serialized data01:14
corvusclarkb: which bit are you looking at?01:14
clarkbcorvus: the top of the playbook with the from yaml stuff01:15
corvusclarkb: ah, ansible does not suffer from that problem :)01:16
corvusclarkb: you can totally pass through data unaltered; see the examples at https://docs.ansible.com/ansible/latest/modules/k8s_module.html#examples01:16
clarkbis the from yaml unnecessary then?01:16
corvusclarkb: specifically the third from the bottom there ("Create a Deployment by reading the definition from a local file")01:16
corvusclarkb: it's there so that we can jinja-interpolate things into the yaml01:16
corvusclarkb: specifically for the secret.yaml file01:17
*** bobh has joined #openstack-infra01:17
clarkbya but namespace doesnt seem to be a template?01:17
corvusclarkb: i just kept the pattern for the other files, but we can switch to direct passthrough for the ones that don't use it.01:17
clarkbits valid k8s config as is01:17
clarkbah01:17
*** jamesmcarthur has joined #openstack-infra01:17
corvusi thought "always assume k8s files are jinja-templatable" was a neat idea, but i'm not married to it01:18
clarkbI think I prefer that since its more accurate to what we want (eg dont need templating just apply directly)01:18
corvusyeah, the downside is we may end up with a deployment with "{{ foobar }}" in it before we realize we need to turn on jinja01:18
corvusthough, i guess logically, i should use .j2 extensions to make it clear01:19
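[Editor's sketch] The risk corvus describes, a manifest reaching k8s with a literal "{{ foobar }}" still in it, could be caught by a small pre-apply check. The following is a minimal sketch under assumptions, not part of the playbook under review: the function name, the regular expression, and the sample Secret manifest are all hypothetical, and the check will also flag legitimate non-Jinja uses of the same delimiters.

```python
import re

# Matches Jinja2 expression, statement, and comment delimiters.
JINJA_MARKER = re.compile(r"\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\}", re.DOTALL)


def find_unrendered_jinja(text):
    """Return (line_number, marker) pairs for Jinja syntax left in text."""
    hits = []
    for match in JINJA_MARKER.finditer(text):
        # Line numbers are 1-based: count the newlines before the match.
        line_number = text.count("\n", 0, match.start()) + 1
        hits.append((line_number, match.group(0)))
    return hits


# Hypothetical manifest that was applied without being templated first.
manifest = """\
apiVersion: v1
kind: Secret
metadata:
  name: gitea-db
stringData:
  password: "{{ foobar }}"
"""

for line_number, marker in find_unrendered_jinja(manifest):
    print(f"line {line_number}: unrendered {marker}")
```

Running a check like this over files that lack a .j2 extension would be one way to make the "direct passthrough" choice safe by default.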
corvusPlaybook run took 0 days, 1 hours, 27 minutes, 52 seconds01:19
corvusthat's all the repos created in the db01:19
corvuspushes are still happening01:19
corvuswe're on openstack-infra/g there01:20
*** bobh has quit IRC01:21
*** hwoarang has quit IRC01:21
*** hwoarang has joined #openstack-infra01:22
*** jamesmcarthur has quit IRC01:23
clarkbcorvus: thinking out loud here we probably want all the k8s yaml to live in separate files so that you can use them without ansible easily too? the configmap config seems to be the exception to that rule in the playbook01:24
*** rascasoft has joined #openstack-infra01:33
*** rascasoft has quit IRC01:43
*** bobh has joined #openstack-infra01:49
*** bobh has quit IRC01:54
*** hwoarang has quit IRC02:04
*** hwoarang has joined #openstack-infra02:08
*** hamerins has joined #openstack-infra02:09
*** tinwood has quit IRC02:10
*** markvoelker has joined #openstack-infra02:10
*** rascasoft has joined #openstack-infra02:11
*** tinwood has joined #openstack-infra02:11
*** hamerins has quit IRC02:11
*** hamerins has joined #openstack-infra02:12
*** rascasoft has quit IRC02:18
*** jamesmcarthur has joined #openstack-infra02:19
*** hamerins has quit IRC02:20
*** jamesmcarthur has quit IRC02:24
*** _alastor_ has quit IRC02:28
*** bhavikdbavishi has joined #openstack-infra02:29
*** hamerins has joined #openstack-infra02:30
*** hamerins has quit IRC02:31
*** rfolco has quit IRC02:34
*** rfolco has joined #openstack-infra02:35
*** hamerins has joined #openstack-infra02:39
*** mriedem_away has quit IRC02:58
*** jamesmcarthur has joined #openstack-infra03:20
*** jamesmcarthur has quit IRC03:25
*** armax has quit IRC03:26
*** gouthamr has quit IRC03:36
*** gyee has quit IRC03:37
*** dmellado has quit IRC03:38
*** stevebaker has quit IRC03:38
*** dulek has quit IRC03:38
*** bhavikdbavishi has quit IRC03:38
*** armax has joined #openstack-infra03:48
*** gouthamr has joined #openstack-infra03:49
*** dmellado has joined #openstack-infra03:50
*** stevebaker has joined #openstack-infra03:51
*** jamesmcarthur has joined #openstack-infra03:51
*** dulek has joined #openstack-infra03:54
*** hamerins has quit IRC03:55
*** jamesmcarthur has quit IRC03:56
*** hamerins has joined #openstack-infra03:58
*** hamerins has quit IRC03:58
*** bhavikdbavishi has joined #openstack-infra03:59
*** armax has quit IRC04:01
*** stevebaker has quit IRC04:16
*** dmellado has quit IRC04:17
*** dulek has quit IRC04:17
*** gouthamr has quit IRC04:17
*** hwoarang has quit IRC04:21
*** hwoarang has joined #openstack-infra04:23
*** dmellado has joined #openstack-infra04:27
*** dulek has joined #openstack-infra04:27
*** gouthamr has joined #openstack-infra04:28
*** bobh has joined #openstack-infra04:28
*** stevebaker has joined #openstack-infra04:29
*** bobh has quit IRC04:44
*** jamesmcarthur has joined #openstack-infra04:52
*** jamesmcarthur has quit IRC04:54
*** jamesmcarthur has joined #openstack-infra04:54
*** jamesmcarthur has quit IRC04:57
*** jamesmcarthur has joined #openstack-infra04:58
*** jamesmcarthur has quit IRC05:17
*** jamesmcarthur has joined #openstack-infra05:17
*** jamesmcarthur has quit IRC05:48
*** bhavikdbavishi has quit IRC05:50
*** jamesmcarthur has joined #openstack-infra05:51
*** jamesmcarthur has quit IRC06:04
*** jamesmcarthur has joined #openstack-infra06:06
*** ykarel|away has joined #openstack-infra06:09
*** jamesmcarthur has quit IRC06:11
*** e0ne has joined #openstack-infra06:22
*** e0ne has quit IRC07:05
*** ccamacho has quit IRC07:05
*** ccamacho has joined #openstack-infra07:06
*** e0ne has joined #openstack-infra07:10
*** spa-87 has joined #openstack-infra07:14
*** e0ne has quit IRC07:43
*** ykarel|away has quit IRC07:46
*** agopi has joined #openstack-infra08:02
*** yboaron_ has joined #openstack-infra08:16
*** ykarel|away has joined #openstack-infra08:18
*** jtomasek has joined #openstack-infra08:21
*** spa-87 has quit IRC08:24
*** jtomasek has quit IRC08:28
*** hwoarang has quit IRC08:45
*** hwoarang has joined #openstack-infra08:51
*** hwoarang has quit IRC09:02
*** ykarel|away has quit IRC09:10
*** spa-87 has joined #openstack-infra09:12
*** ykarel|away has joined #openstack-infra09:15
*** hwoarang has joined #openstack-infra09:21
*** d0ugal has quit IRC09:26
*** agopi has quit IRC09:26
*** wolverineav has joined #openstack-infra09:31
*** jbadiapa has quit IRC09:33
*** wolverineav has quit IRC09:36
*** agopi has joined #openstack-infra09:48
*** hwoarang has quit IRC09:49
*** hwoarang has joined #openstack-infra09:53
*** agopi has quit IRC09:56
*** ykarel|away has quit IRC10:12
*** xek has joined #openstack-infra10:13
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Move postgres to tmpfs  https://review.openstack.org/62703510:14
*** yboaron_ has quit IRC10:21
*** evrardjp has quit IRC10:27
*** evrardjp_ has joined #openstack-infra10:27
*** agopi has joined #openstack-infra10:40
*** pbourke has quit IRC10:42
*** bhavikdbavishi has joined #openstack-infra10:42
*** pbourke has joined #openstack-infra10:44
*** bhavikdbavishi has quit IRC10:57
*** dkehn has quit IRC11:16
*** jaosorior has joined #openstack-infra11:18
*** agopi has quit IRC11:37
*** agopi has joined #openstack-infra11:38
*** ykarel|away has joined #openstack-infra11:41
*** ykarel|away has quit IRC11:49
*** dayou has quit IRC11:53
*** dayou has joined #openstack-infra11:57
*** agopi has quit IRC11:59
*** ykarel|away has joined #openstack-infra12:05
*** jamesmcarthur has joined #openstack-infra12:07
*** jamesmcarthur has quit IRC12:11
*** ykarel|away has quit IRC12:29
*** ykarel|away has joined #openstack-infra12:30
*** agopi has joined #openstack-infra12:31
*** tobiash has quit IRC13:19
*** tobiash has joined #openstack-infra13:23
*** agopi has quit IRC13:28
*** zul has quit IRC13:42
*** agopi has joined #openstack-infra13:46
*** markvoelker has quit IRC13:46
*** markvoelker has joined #openstack-infra13:46
*** e0ne has joined #openstack-infra13:54
*** agopi has quit IRC13:56
*** dkehn has joined #openstack-infra14:08
AJaegercorvus: looking at your gitea instance: On git.openstack.org we've hidden the retired repos, your gitea instance showed them earlier today. But now I cannot reach it, I get a "500 Internal Server Error"14:09
openstackgerritmelissaml proposed openstack-infra/infra-specs master: remove the repetition words  https://review.openstack.org/62703914:12
openstackgerritmelissaml proposed openstack-infra/infra-specs master: remove the repetition words in doc  https://review.openstack.org/62704014:19
*** bobh has joined #openstack-infra14:22
*** tosky has joined #openstack-infra14:22
fungiAJaeger: that reminds me, i've recently come to realize that hiding retired repos does make it harder for people coming later to find the references we've left in their readme files indicating they're no longer maintained or have moved elsewhere14:33
fungiin particular, someone reached out to osf staff trying to find out what had happened to the openstack salt formulas. the readme files in them clearly stated they had moved to the saltstack github org, but we aren't actually putting those readme files anywhere search engines will index14:34
fungii don't know whether we should be doing something different, mostly just pointing out the lack of discoverability we've engineered into it which seems to run counter to other aspects of the retirement process14:35
mordredAJaeger: I agree - I also cannot reach the new service ... I don't see anything wrong at the k8s level - it might be the octavia load balancer (which is a bit of a black box compared to everything else)14:36
AJaegerfungi: yes, indeed that is unfortunate. On the other hand, we have 542 retired repos, so filtering them out might be worthwhile. Definitely not a blocker for moving forward - but something to look at...14:39
AJaegermordred: thanks for checking - no urgency from my side ;)14:40
*** bobh has quit IRC14:40
*** bobh has joined #openstack-infra14:43
*** stevebaker has quit IRC14:51
*** dmellado has quit IRC14:51
*** stevebaker has joined #openstack-infra14:51
*** bobh has quit IRC14:52
*** dmellado has joined #openstack-infra14:53
*** hwoarang has quit IRC14:55
*** e0ne has quit IRC14:55
*** hwoarang has joined #openstack-infra14:57
*** e0ne has joined #openstack-infra14:59
*** bobh has joined #openstack-infra15:04
*** bobh has quit IRC15:04
corvuswe've nearly filled our cephfs: 10.110.59.103:6790,10.109.161.206:6790,10.103.118.117:6790:/  239G  227G   12G  95% /data15:09
corvusoh, i think we *have* filled it15:10
corvus[Macaron] PANIC: session(start): open /data/gitea/sessions/8/d/8dfbcd26c1f08dbe: no space left on device15:10
fungihah, nice15:11
fungii guess the replication got far enough to eat it all?15:11
corvusyeah.  plus we're indexing the repos, and i think they said indexes take 5x the repo space?  i'm getting numbers now.15:12
fungioof15:13
fungiraw repos on review.o.o take 12gb according to du15:13
fungiso call that 60gb minimum?15:14
corvusdrwx------ 1 git git 868M Dec 22 08:47 repos.bleve15:15
corvushrm, i must be missing something.15:15
*** Vadmacs has joined #openstack-infra15:23
*** ykarel|away has quit IRC15:25
corvusdu reports 4.1G used by repos.  and 1.2G by gitea (indexes, sessions).15:25
corvusthat's all there is on the filesystem.15:26
* corvus learns more about cephfs15:26
*** dhill_ has quit IRC15:43
corvusmordred, fungi: i think the current issue with availability is a ceph problem.  if i strace the gitea process and hit the url, i see it get the request and attempt to do work.  but i think there's a hung IO call on that machine: https://etherpad.openstack.org/p/OYJJLz0xlx15:51
corvus(i don't think the load balancer is implicated)15:51
fungigot it15:51
corvusoh yeah, there's a "ceph_fsync" in that call stack15:52
*** e0ne has quit IRC15:52
*** e0ne has joined #openstack-infra16:09
openstackgerritmelissaml proposed openstack-infra/infra-specs master: remove the repetition words in docs  https://review.openstack.org/62704016:14
*** tosky has quit IRC16:19
*** bhavikdbavishi has joined #openstack-infra16:24
*** e0ne has quit IRC16:25
openstackgerritmelissaml proposed openstack-infra/infra-specs master: remove the repetition words in docs  https://review.openstack.org/62704016:25
corvusapparently cephfs clients can hang indefinitely waiting to write data out when the filesystem is full (but should proceed when that is corrected)16:28
corvusso these 2 problems may be related16:29
corvusi do not understand how 5G used fills up 227G of space.16:29
*** eernst has joined #openstack-infra16:36
*** rkukura has quit IRC16:41
*** bhavikdbavishi has quit IRC16:52
*** markvoelker has quit IRC16:52
*** markvoelker has joined #openstack-infra16:53
*** markvoelker has quit IRC16:57
*** armax has joined #openstack-infra16:59
*** eernst has quit IRC17:02
*** armax has quit IRC17:04
*** hwoarang has quit IRC17:15
*** hwoarang has joined #openstack-infra17:16
*** spa-87 has quit IRC17:29
corvusokay, my theory is that ceph's bluestore driver is using 64k minimum allocation size (which is apparently its default on hdds) rather than 4k.  we have lots of small files, so that exacerbates the space used.  the math almost works out -- our number of objects*3*64k is 245GiB; we're using 227GiB.  i don't know if the estimated usage being greater than the actual means that is wrong, or if i'm just17:36
corvusmissing something.17:36
corvusat any rate, it's in the ballpark, and it's the only explanation i've come up with for the discrepancy.17:36
corvusi'd like to confirm the min_alloc_size, but it doesn't show up in our config file, so it must be using one of the implied defaults.  but i don't know how to verify what the value is.17:37
corvusand it's time for me to afk17:37
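[Editor's sketch] The arithmetic behind the theory above can be written out. This is an illustration under stated assumptions, not a description of BlueStore internals: it assumes each stored object is rounded up to a whole number of allocation units, and it reads the "3" in "objects*3*64k" as a replication factor. The object count is not given in the log, so the sketch only backs out the count implied by the 245GiB estimate; the function names are hypothetical.

```python
KIB = 1024
GIB = 1024 ** 3


def allocated_bytes(object_size, min_alloc_size):
    """Round object_size up to a whole number of allocation units.

    Assumes a non-empty object always occupies at least one unit.
    """
    units = max(1, (object_size + min_alloc_size - 1) // min_alloc_size)
    return units * min_alloc_size


def implied_object_count(total_bytes, replicas, min_alloc_size):
    """Object count that would yield total_bytes if every object took one unit."""
    return total_bytes / (replicas * min_alloc_size)


# A hypothetical 2 KiB file under the two allocation sizes discussed.
small_file = 2 * KIB
print(allocated_bytes(small_file, 4 * KIB))   # one 4 KiB unit
print(allocated_bytes(small_file, 64 * KIB))  # one 64 KiB unit, 16x larger

# Object count implied by the 245GiB estimate (assumed 3 replicas, 64 KiB units).
print(round(implied_object_count(245 * GIB, 3, 64 * KIB)))
```

Under these assumptions the estimate corresponds to roughly 1.3 million objects, which is at least plausible for a few hundred git repositories stored as loose files.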
persia245GiB is 228GB by my calculations.  Are you sure both are measured in GiB?17:38
corvuspersia: fairly; ceph seems to consistently use GiB17:39
persiaAlso, I fail at units.  228GiB approximates 245GB.17:39
fungiseems like a reasonable theory, yes17:45
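[Editor's sketch] The GB versus GiB exchange above is easy to check mechanically. A minimal sketch, assuming the usual definitions (1 GB = 10^9 bytes, 1 GiB = 2^30 bytes); the helper names are hypothetical.

```python
GB = 10 ** 9    # decimal gigabyte
GIB = 2 ** 30   # binary gibibyte


def gib_to_gb(gib):
    """Convert gibibytes to decimal gigabytes."""
    return gib * GIB / GB


def gb_to_gib(gb):
    """Convert decimal gigabytes to gibibytes."""
    return gb * GB / GIB


print(f"228 GiB = {gib_to_gb(228):.1f} GB")   # 244.8
print(f"245 GB  = {gb_to_gib(245):.1f} GiB")  # 228.2
print(f"245 GiB = {gib_to_gb(245):.1f} GB")   # 263.1
```

This agrees with persia's corrected statement: 228GiB is approximately 245GB, so an estimate of 245 and an observed 227 are only close to each other if the two are in different units.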
*** bobh has joined #openstack-infra17:57
*** jamesmcarthur has joined #openstack-infra18:19
*** bobh has quit IRC18:23
*** bobh has joined #openstack-infra18:28
*** markvoelker has joined #openstack-infra18:29
*** bhavikdbavishi has joined #openstack-infra18:30
*** bobh has quit IRC18:38
*** bobh has joined #openstack-infra18:41
*** dave-mccowan has joined #openstack-infra18:42
*** ociuhandu has joined #openstack-infra18:52
*** ociuhandu has quit IRC18:52
*** bobh has quit IRC18:56
*** spa-87 has joined #openstack-infra19:01
*** bhavikdbavishi has quit IRC19:12
*** jamesmcarthur has quit IRC19:24
*** spa-87 has quit IRC20:12
*** jamesmcarthur has joined #openstack-infra20:32
*** e0ne has joined #openstack-infra20:40
*** dave-mccowan has quit IRC20:55
*** Vadmacs has quit IRC21:06
*** markvoelker has quit IRC21:30
*** markvoelker has joined #openstack-infra21:30
*** slaweq has joined #openstack-infra21:33
*** e0ne has quit IRC21:33
*** markvoelker has quit IRC21:35
*** pcaruana has joined #openstack-infra21:53
*** jaosorior has quit IRC22:07
*** slaweq has quit IRC22:13
*** jamesmcarthur has quit IRC22:22
*** gary_perkins has quit IRC22:25
*** jamesmcarthur has joined #openstack-infra22:26
*** gary_perkins has joined #openstack-infra22:28
*** gary_perkins has quit IRC22:37
*** gary_perkins has joined #openstack-infra22:44
*** gary_perkins has quit IRC22:51
corvusokay, i feel like there must have been a better way to do this, but i think i have the answer i was looking for.22:53
corvusi deleted the k8s deployment and replicaset objects for the ceph osd1 pod with --cascade=false, so the pod was left running22:53
*** gary_perkins has joined #openstack-infra22:53
corvusthen i deleted the osd1 pod.  without the deployment and replicasets, the pod was not restarted22:54
*** pcaruana has quit IRC22:55
corvusi then made a copy of the osd1 directory (this directory is part of ceph which basically just has some metadata about where the actual storage is, along with some symlinks to the underlying storage device) and placed that copy on the ceph tools pod, after making sure the ceph tools pod was running on the k8s node that osd1 used to run on22:55
corvusall of that was so that i can run "ceph-kvstore-tool" which, when run against a bluestore, wants the bluestore to be un-mounted.22:55
corvusi ran "ceph-kvstore-tool bluestore-kv ./osd1/ get S min_alloc_size"22:55
corvusand it output:22:56
corvus(S, min_alloc_size)22:56
corvus00000000  00 00 01 00 00 00 00 00                           |........|22:56
corvuswhich, when i compensate for endianness seems to give me the number 6553622:56
corvus(i believe it's a uint64_t)22:56
corvusand now back to afking22:57
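[Editor's sketch] The endianness step above can be reproduced directly. This assumes, as corvus does, that the eight bytes in the hexdump are a little-endian unsigned 64-bit integer; the byte string is copied from the tool output quoted in the log.

```python
import struct

# The eight value bytes from the hexdump, in the order they were printed.
raw = bytes.fromhex("00 00 01 00 00 00 00 00")

# Interpret as a little-endian unsigned 64-bit integer.
little = int.from_bytes(raw, byteorder="little")
print(little)  # 65536, i.e. 64 KiB

# The same decoding via struct, "<Q" meaning little-endian uint64.
(via_struct,) = struct.unpack("<Q", raw)
print(via_struct == little)  # True

# For contrast, a big-endian reading gives an implausible 1 TiB.
print(int.from_bytes(raw, byteorder="big"))  # 1099511627776
```

A value of 65536 bytes matches the 64k minimum allocation size proposed earlier in the log.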
*** gary_perkins has quit IRC23:12
*** xek has quit IRC23:23
*** markvoelker has joined #openstack-infra23:31
*** Adri2000 has quit IRC23:47
*** Adri2000 has joined #openstack-infra23:53
