Tuesday, 2018-04-10

02:52 *** rbudden has quit IRC
08:04 *** paken has joined #scientific-wg
08:27 *** priteau has joined #scientific-wg
09:11 *** paken has quit IRC
09:25 *** paken has joined #scientific-wg
12:35 *** rbudden has joined #scientific-wg
13:42 *** KurtB has joined #scientific-wg
13:44 <KurtB> Good morning. Is anyone here running openstack controllers and/or compute on AMD EPYC?
13:44 <KurtB> Between that and trying to figure out how to size ceph, my head hurts. :-)
14:08 *** ildikov has quit IRC
14:09 *** ildikov has joined #scientific-wg
14:30 *** paken has quit IRC
14:39 <jmlowe_> I'm not running on amd
14:40 <jmlowe_> I've got a few years of ceph experience, maybe I can help?
14:40 <jmlowe_> KurtB: ^^^^
16:33 <KurtB> jmlowe_: Hey. Yeah. I'm trying not to bother you.
17:01 <KurtB> I have some (six) disk trays that I inherited from a failed experiment: 256G RAM, 52 4TB drives, 4 800G SSDs. I want to turn that into a ceph cluster. If I set those up with 3x replication, I'm wondering how to size the metadata servers.
17:09 <KurtB> Probably run MONs on the storage servers.
17:23 <jmlowe_> I'm not sure about metadata server sizing; they need very little storage, but I'd put the metadata pool on the SSDs
17:24 <jmlowe_> you should be ok in terms of memory per osd
18:07 <jmlowe_> thinking about this a little more, I'd use LVM and put all of the SSDs into one VG, slice that into 52 50GB logical volumes to use as --block.db, and that leaves you with 600GB left over to back an SSD-based OSD
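
A rough sketch of the layout jmlowe_ describes, assuming BlueStore OSDs. The device names (/dev/sdb through /dev/sde for the SSDs, /dev/sdf for the first spinning disk) and the VG/LV names are placeholders, not anything from the log:

```sh
# Pool the four 800G SSDs into one volume group (~3.2TB).
pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
vgcreate ceph-db /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Carve out one 50G block.db LV per spinning OSD (52 of them = 2600G).
for i in $(seq 0 51); do
    lvcreate -L 50G -n db-$i ceph-db
done

# The ~600G left over can back an all-flash OSD.
lvcreate -l 100%FREE -n ssd-osd ceph-db

# Create each HDD-backed OSD with its block.db on the SSD VG, e.g. the first one:
ceph-volume lvm create --bluestore --data /dev/sdf --block.db ceph-db/db-0
```
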
18:08 <jmlowe_> your mention of metadata makes me think you want to play with cephfs?
18:18 *** paken has joined #scientific-wg
19:03 <KurtB> cephfs... That's an option.
19:03 <KurtB> Do you use ceph as strictly block storage?
19:04 <jmlowe_> I've started dabbling with cephfs via manila with ganesha nfs exports
19:05 <jmlowe_> but for the most part block storage; we have a user or two that do s3/swift via radosgw
19:05 <KurtB> A buddy of mine has been struggling to get a ceph cluster running reliably for a couple of months. He's doing cephfs on top of an erasure-coded cluster.
19:05 <jmlowe_> https://zonca.github.io/2018/03/zarr-on-jetstream.html
19:06 <jmlowe_> that's an interesting thing a user is doing
19:06 <jmlowe_> so far so good testing cephfs on an erasure-coded pool with metadata on an nvme pool
19:06 <jmlowe_> single mds though
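
A minimal sketch of that kind of setup (replicated metadata pool on fast media, erasure-coded data pool). The pool names, PG counts, and the "nvme" CRUSH rule are assumptions rather than details from the log, and Ceph may insist on --force when the filesystem's default data pool is erasure-coded:

```sh
# Replicated metadata pool pinned to fast devices via a CRUSH rule (assumed here to be named "nvme").
ceph osd pool create cephfs_metadata 64 64 replicated nvme

# Erasure-coded data pool; overwrites must be enabled for CephFS (BlueStore OSDs only).
ceph osd pool create cephfs_data 512 512 erasure
ceph osd pool set cephfs_data allow_ec_overwrites true

# Create the filesystem; one active MDS by default, as in the log.
ceph fs new cephfs cephfs_metadata cephfs_data --force
```
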
19:07 <jmlowe_> The hdf guys have gotten their thingy working on our object store so you can consume hdf directly from the object store
19:08 <KurtB> He's been trying to run the MDSs on the storage servers, and doesn't have enuf RAM: 36 8TB drives each, with only 64G RAM
19:08 <jmlowe_> 99.9% block usage I'd say
19:08 <KurtB> Oh, that's cool!
19:08 <jmlowe_> I've pressed a compute node or two into service for mds, they can be kind of memory pigs
19:09 <KurtB> Are you exporting block to anything outside of OpenStack?
19:09 <jmlowe_> I am not
19:10 <jmlowe_> I don't think cephfs w/ manila will be ready for prime time until Queens and ganesha 2.7; I'm of the opinion it should be all HA, all the time
19:10 <KurtB> There are a couple of 32G servers I inherited with those storage trays.
19:11 <KurtB> Not sure if that's enuf for an MDS
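
For reference, MDS memory use is dominated by its metadata cache, which can be capped with mds_cache_memory_limit. A hedged sketch only: the 16 GiB figure and the MDS id are purely illustrative, not values from the log:

```sh
# Cap the MDS cache so the daemon fits on a 32G box with headroom.
# Note: this is a cache target, not a hard limit; actual RSS can run somewhat higher.
ceph tell mds.your-mds-id injectargs '--mds_cache_memory_limit=17179869184'  # 16 GiB

# To persist it, set the same option under [mds] in ceph.conf:
#   mds cache memory limit = 17179869184
```
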
19:11 <jmlowe_> my plan is ganesha/manila agent/radosgw/mds, on 3 x 24 core 128G nodes
19:12 <KurtB> Nice!
19:12 <jmlowe_> so far I only have 2 x radosgw and everything else is single instance
19:13 <KurtB> Seems there are a lot of knobs to turn to get ceph happy... but I think my buddy is under-provisioned for RAM.
19:13 <jmlowe_> bollig: that dask thingy might be interesting to you guys
19:14 <jmlowe_> or I should say Zarr for Dask
19:19 * KurtB is looking at dask now
19:20 <KurtB> zarr looks interesting
19:21 <KurtB> lemme get ceph working right, and then I'll hack on that! :-)
19:22 <jmlowe_> Do I remember correctly that you are at NIST in Boulder?
19:28 <KurtB> I'm at NREL (National Renewable Energy Laboratory) in Golden, CO.
19:28 <KurtB> I don't have a big enuf trust fund to live in Boulder! :-)
19:28 <jmlowe_> ah, ok, I was just out there for the SEA conference at UCAR
19:29 <KurtB> I'm like 20 miles south of UCAR. Really close.
19:46 *** paken has quit IRC
19:47 <bollig> jmlowe_: cool. thanks for the tip
19:57 <bollig> KurtB: I'm also interested in how well epyc performs as a hypervisor
20:05 <KurtB> bollig: One of my cohorts just gave me an EPYC to test. I'm going to build a test cluster and shove it in as a compute node. I'll let you know.
20:28 <bollig> jmlowe_: has zonca or anyone else used dask with kubernetes on jetstream?
20:30 <jmlowe_> We don't think so
20:30 <jmlowe_> There is a guy working on it
20:30 <jmlowe_> Kevin Paul from Pangeo
20:31 <bollig> ok great to know
20:32 <jmlowe_> Did you say you had barbican working?
20:34 <bollig> no, not yet.
20:59 *** priteau has quit IRC
21:00 *** priteau has joined #scientific-wg
21:04 *** priteau has quit IRC
21:58 *** priteau has joined #scientific-wg
22:08 *** priteau has quit IRC
23:50 *** rbudden has quit IRC
