openstackgerrit | gordon chung proposed openstack/ceilometer: drop individual names from copyright https://review.openstack.org/301983 | 00:05 |
---|---|---|
*** diogogmt has quit IRC | 00:14 | |
*** rbak has joined #openstack-telemetry | 00:25 | |
*** rbak has quit IRC | 00:28 | |
*** dave-mcc_ has joined #openstack-telemetry | 00:50 | |
*** dave-mccowan has quit IRC | 00:51 | |
*** jwcroppe has joined #openstack-telemetry | 01:06 | |
*** diogogmt has joined #openstack-telemetry | 01:07 | |
*** jwcroppe has quit IRC | 01:12 | |
*** thorst_ has quit IRC | 01:22 | |
*** thorst has joined #openstack-telemetry | 01:23 | |
*** cheneydc has joined #openstack-telemetry | 01:27 | |
*** thorst has quit IRC | 01:31 | |
*** liamji has joined #openstack-telemetry | 01:36 | |
*** ljxiash has joined #openstack-telemetry | 01:39 | |
*** ljxiash has quit IRC | 02:03 | |
*** ljxiash has joined #openstack-telemetry | 02:03 | |
*** ljxiash has quit IRC | 02:08 | |
*** Kevin_Zheng has joined #openstack-telemetry | 02:24 | |
*** thorst has joined #openstack-telemetry | 02:29 | |
*** jwcroppe has joined #openstack-telemetry | 02:35 | |
*** thorst has quit IRC | 02:36 | |
*** zqfan has joined #openstack-telemetry | 02:53 | |
*** ljxiash has joined #openstack-telemetry | 02:58 | |
*** jwcroppe_ has joined #openstack-telemetry | 03:11 | |
*** jwcroppe has quit IRC | 03:15 | |
*** ljxiash has quit IRC | 03:34 | |
*** thorst has joined #openstack-telemetry | 03:34 | |
*** ljxiash has joined #openstack-telemetry | 03:34 | |
*** ljxiash has quit IRC | 03:40 | |
*** thorst has quit IRC | 03:41 | |
*** anush_ has joined #openstack-telemetry | 03:49 | |
*** links has joined #openstack-telemetry | 03:52 | |
*** ljxiash has joined #openstack-telemetry | 04:13 | |
*** jwcroppe_ has quit IRC | 04:13 | |
*** jwcroppe has joined #openstack-telemetry | 04:14 | |
*** jwcroppe_ has joined #openstack-telemetry | 04:17 | |
*** jwcroppe has quit IRC | 04:19 | |
*** r-mibu has quit IRC | 04:29 | |
*** r-mibu has joined #openstack-telemetry | 04:30 | |
*** diogogmt has quit IRC | 04:33 | |
*** thorst has joined #openstack-telemetry | 04:39 | |
*** thorst has quit IRC | 04:47 | |
*** anush_ has quit IRC | 05:01 | |
*** nadya has joined #openstack-telemetry | 05:21 | |
*** jwcroppe_ has quit IRC | 05:25 | |
*** ljxiash has quit IRC | 05:26 | |
*** ljxiash has joined #openstack-telemetry | 05:26 | |
*** sekrit has quit IRC | 05:30 | |
*** ljxiash has quit IRC | 05:31 | |
*** ljxiash has joined #openstack-telemetry | 05:32 | |
*** ChrisBenson has joined #openstack-telemetry | 05:32 | |
*** thorst has joined #openstack-telemetry | 05:44 | |
*** thorst has quit IRC | 05:52 | |
*** dave-mcc_ has quit IRC | 05:56 | |
*** nadya has quit IRC | 05:58 | |
*** rcernin has joined #openstack-telemetry | 05:59 | |
*** ChrisBenson has quit IRC | 06:11 | |
openstackgerrit | Mehdi Abaakouk (sileht) proposed openstack/gnocchi: tests: tempest plugin https://review.openstack.org/301585 | 06:18 |
*** sekrit has joined #openstack-telemetry | 06:23 | |
*** jwcroppe has joined #openstack-telemetry | 06:28 | |
*** belmoreira has joined #openstack-telemetry | 06:30 | |
*** dave-mcc_ has joined #openstack-telemetry | 06:33 | |
*** jwcroppe has quit IRC | 06:35 | |
*** nadya has joined #openstack-telemetry | 06:47 | |
*** thorst has joined #openstack-telemetry | 06:49 | |
*** dave-mccowan has joined #openstack-telemetry | 06:49 | |
*** yprokule has joined #openstack-telemetry | 06:54 | |
*** dave-mcc_ has quit IRC | 06:54 | |
*** nadya has quit IRC | 06:56 | |
*** thorst has quit IRC | 06:57 | |
openstackgerrit | ZhiQiang Fan proposed openstack/aodh: use default option for notification topics https://review.openstack.org/302059 | 07:00 |
*** openstackgerrit has quit IRC | 07:02 | |
*** openstackgerrit has joined #openstack-telemetry | 07:03 | |
*** pcaruana has joined #openstack-telemetry | 07:06 | |
*** rcernin has quit IRC | 07:08 | |
*** rcernin has joined #openstack-telemetry | 07:09 | |
openstackgerrit | Julien Danjou proposed openstack/aodh: Use pbr wsgi_scripts to build aodh-api https://review.openstack.org/301850 | 07:35 |
*** shardy has joined #openstack-telemetry | 07:51 | |
*** shardy has quit IRC | 07:52 | |
*** shardy has joined #openstack-telemetry | 07:53 | |
*** thorst has joined #openstack-telemetry | 07:55 | |
*** thorst has quit IRC | 08:02 | |
openstackgerrit | Merged openstack/aodh: use default option for notification topics https://review.openstack.org/302059 | 08:10 |
openstackgerrit | Merged openstack/ceilometer: Updated from global requirements https://review.openstack.org/300756 | 08:14 |
openstackgerrit | Eyal proposed openstack/python-gnocchiclient: Remove redundant parentheses https://review.openstack.org/301473 | 08:24 |
*** jwcroppe has joined #openstack-telemetry | 08:33 | |
*** yassine has joined #openstack-telemetry | 08:34 | |
*** yassine is now known as Guest9848 | 08:34 | |
*** Guest9848 is now known as yassou | 08:35 | |
*** jwcroppe has quit IRC | 08:38 | |
*** dave-mccowan has quit IRC | 08:43 | |
*** thorst has joined #openstack-telemetry | 09:00 | |
*** thorst has quit IRC | 09:06 | |
*** cdent has joined #openstack-telemetry | 09:13 | |
*** ljxiash has quit IRC | 09:15 | |
*** ljxiash has joined #openstack-telemetry | 09:17 | |
*** ljxiash has quit IRC | 09:22 | |
openstackgerrit | Merged openstack/gnocchi: Use pbr WSGI feature to build gnocchi-api https://review.openstack.org/301611 | 09:51 |
*** ljxiash has joined #openstack-telemetry | 09:58 | |
*** cheneydc has quit IRC | 10:01 | |
*** thorst has joined #openstack-telemetry | 10:04 | |
*** nadya has joined #openstack-telemetry | 10:05 | |
*** ekarlso- has quit IRC | 10:09 | |
*** thorst has quit IRC | 10:11 | |
-openstackstatus- NOTICE: npm lint jobs are failing due to a problem with npm registry. The problem is under investigation, and we will update once the issue is solved. | 10:19 | |
*** ChanServ changes topic to "npm lint jobs are failing due to a problem with npm registry. The problem is under investigation, and we will update once the issue is solved." | 10:19 | |
*** ekarlso- has joined #openstack-telemetry | 10:21 | |
*** ekarlso- has quit IRC | 10:22 | |
*** ekarlso has joined #openstack-telemetry | 10:22 | |
*** cdent has quit IRC | 10:30 | |
*** jwcroppe has joined #openstack-telemetry | 10:36 | |
*** ljxiash has quit IRC | 10:39 | |
*** jwcroppe has quit IRC | 10:41 | |
*** ljxiash has joined #openstack-telemetry | 10:50 | |
*** ljxiash has quit IRC | 10:54 | |
*** cdent has joined #openstack-telemetry | 10:58 | |
*** nadya has quit IRC | 11:01 | |
*** thorst has joined #openstack-telemetry | 11:09 | |
*** thorst has quit IRC | 11:16 | |
*** Ashlyn has joined #openstack-telemetry | 11:18 | |
Ashlyn | Is there a mechanism in ceilometerclient to get a list of all the triggered alarms (basically those whose state has changed and is approaching the critical state)? | 11:20 |
*** thorst has joined #openstack-telemetry | 11:20 | |
*** ljxiash has joined #openstack-telemetry | 11:33 | |
*** ljxiash has quit IRC | 11:37 | |
*** gordc has joined #openstack-telemetry | 11:42 | |
*** nadya has joined #openstack-telemetry | 11:46 | |
openstackgerrit | Eyal proposed openstack/python-gnocchiclient: Remove redundant parentheses https://review.openstack.org/301473 | 12:17 |
openstackgerrit | gordon chung proposed openstack/ceilometer: re-org existing manually install notes https://review.openstack.org/301321 | 12:24 |
*** pradk_ has joined #openstack-telemetry | 12:26 | |
*** Liuqing has joined #openstack-telemetry | 12:27 | |
*** ljxiash has joined #openstack-telemetry | 12:32 | |
*** Liuqing has quit IRC | 12:37 | |
*** Liuqing has joined #openstack-telemetry | 12:38 | |
openstackgerrit | venkatamahesh proposed openstack/ceilometer: Update the Administrator Guide links https://review.openstack.org/302220 | 12:38 |
*** julim has joined #openstack-telemetry | 12:39 | |
*** jwcroppe has joined #openstack-telemetry | 12:39 | |
*** Liuqing has quit IRC | 12:42 | |
*** Liuqing has joined #openstack-telemetry | 12:43 | |
*** jwcroppe has quit IRC | 12:44 | |
*** Liuqing has quit IRC | 12:45 | |
*** julim has quit IRC | 12:45 | |
*** ljxiash has quit IRC | 12:48 | |
*** Liuqing has joined #openstack-telemetry | 12:49 | |
*** links has quit IRC | 12:56 | |
*** iberezovskiy_ is now known as iberezovskiy | 13:01 | |
*** liamji has quit IRC | 13:17 | |
*** jdowner has joined #openstack-telemetry | 13:20 | |
*** peristeri has joined #openstack-telemetry | 13:20 | |
*** Liuqing has quit IRC | 13:26 | |
*** Liuqing has joined #openstack-telemetry | 13:33 | |
*** ametts has joined #openstack-telemetry | 13:35 | |
*** ljxiash has joined #openstack-telemetry | 13:37 | |
*** diogogmt has joined #openstack-telemetry | 13:42 | |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: mongo: remove unused function https://review.openstack.org/302264 | 13:48 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: Remove the deprecated DB2 driver https://review.openstack.org/302265 | 13:48 |
*** pradk_ has quit IRC | 13:49 | |
openstackgerrit | Merged openstack/gnocchi: fix resource_type table migration https://review.openstack.org/301192 | 13:52 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: Remove unused context objects in Glance tests https://review.openstack.org/300374 | 13:54 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: Remove unused context object in test https://review.openstack.org/300373 | 13:54 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: Remove unused context object in vpnaas test https://review.openstack.org/300378 | 13:54 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: Remove unused context object lbaas test https://review.openstack.org/300376 | 13:54 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: Remove unused object from lbaas_v2 test https://review.openstack.org/300377 | 13:54 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: test: remove unused context object in FWaaS tests https://review.openstack.org/300380 | 13:54 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: Remove a useless usage of oslo.context in meters API https://review.openstack.org/300365 | 13:54 |
openstackgerrit | Julien Danjou proposed openstack/ceilometer: messaging: remove RequestContextSerializer https://review.openstack.org/300381 | 13:54 |
*** diogogmt has quit IRC | 13:55 | |
*** diogogmt has joined #openstack-telemetry | 13:56 | |
*** nadya has quit IRC | 13:57 | |
*** jwcroppe has joined #openstack-telemetry | 14:00 | |
*** cdent has quit IRC | 14:02 | |
openstackgerrit | Julien Danjou proposed openstack/gnocchi: Revert "Use pbr WSGI feature to build gnocchi-api" https://review.openstack.org/302278 | 14:16 |
*** rbak has joined #openstack-telemetry | 14:19 | |
*** cdent has joined #openstack-telemetry | 14:22 | |
*** diogogmt has quit IRC | 14:23 | |
*** diogogmt has joined #openstack-telemetry | 14:28 | |
*** flwang1 has quit IRC | 14:35 | |
*** flwang has joined #openstack-telemetry | 14:49 | |
*** nadya has joined #openstack-telemetry | 14:52 | |
*** jwcroppe has quit IRC | 15:01 | |
*** julim has joined #openstack-telemetry | 15:06 | |
*** nadya has quit IRC | 15:07 | |
*** cdent has quit IRC | 15:12 | |
*** Liuqing has quit IRC | 15:15 | |
*** dave-mccowan has joined #openstack-telemetry | 15:15 | |
*** belmoreira has quit IRC | 15:17 | |
*** yprokule has quit IRC | 15:28 | |
*** dave-mcc_ has joined #openstack-telemetry | 15:30 | |
*** shardy has quit IRC | 15:31 | |
*** dave-mccowan has quit IRC | 15:32 | |
*** ametts has quit IRC | 15:36 | |
*** drupalmonkey has joined #openstack-telemetry | 15:37 | |
*** drupalmonkey has left #openstack-telemetry | 15:37 | |
*** drupalmonkey has joined #openstack-telemetry | 15:43 | |
*** julim has quit IRC | 15:43 | |
*** nicodemus_ has joined #openstack-telemetry | 15:50 | |
*** rbak_ has joined #openstack-telemetry | 15:57 | |
openstackgerrit | Julien Danjou proposed openstack/gnocchi: carbonara: add a processing speed in debug logs https://review.openstack.org/302336 | 15:59 |
*** rbak has quit IRC | 15:59 | |
openstackgerrit | Merged openstack/gnocchi: Revert "Use pbr WSGI feature to build gnocchi-api" https://review.openstack.org/302278 | 16:01 |
*** cdent has joined #openstack-telemetry | 16:03 | |
jd__ | sileht: https://review.openstack.org/#/c/301553/ | 16:09 |
drupalmonkey | hi everyone: have a ceilometer issue.. getting this error in liberty: http://imgur.com/DbOpABz | 16:11 |
drupalmonkey | line 128 in that file is here: http://imgur.com/AW12dU0 | 16:11 |
drupalmonkey | this is what the data looks like that is giving the error: http://imgur.com/c0irulO | 16:11 |
drupalmonkey | duration_start and duration_end are None | 16:11 |
drupalmonkey | possible fix: http://imgur.com/MbmvMwE | 16:11 |
drupalmonkey | use period_end when duration_end is None.. is this the right way to go? | 16:12 |
*** anush_ has joined #openstack-telemetry | 16:13 | |
*** anush_ has quit IRC | 16:17 | |
*** jwcroppe has joined #openstack-telemetry | 16:18 | |
*** jwcroppe has quit IRC | 16:18 | |
*** jwcroppe has joined #openstack-telemetry | 16:18 | |
gordc | drupalmonkey: just fyi, you can paste your logs in paste.openstack.org... rather than screenshots. | 16:18 |
gordc | i'd suggest you open a bug. https://bugs.launchpad.net/ceilometer/+bugs?orderby=-id&start=0 | 16:19 |
gordc | probably against horizon... horizon uses ceilometer data (for now) but i can't speak to how they're using it. | 16:20 |
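Editor's note: the fallback drupalmonkey proposes above (use period_end when duration_end is None) might look roughly like the sketch below. The screenshots are not reproduced here, so the dict-shaped record and the helper name `effective_duration_end` are assumptions for illustration; only the field names duration_start, duration_end and period_end come from the discussion. This is not the actual Horizon or Ceilometer code.

```python
from datetime import datetime


def effective_duration_end(stat):
    """Return duration_end, falling back to period_end when it is None.

    `stat` is a hypothetical dict standing in for one statistics record.
    """
    end = stat.get("duration_end")
    return end if end is not None else stat.get("period_end")


# Hypothetical record with the None values described in the chat.
sample = {
    "period_end": datetime(2016, 4, 6, 16, 0),
    "duration_start": None,
    "duration_end": None,
}
print(effective_duration_end(sample))
```

Whether period_end is the semantically right substitute is exactly the open question raised in the chat; the sketch only shows the mechanics of the fallback.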
*** nadya has joined #openstack-telemetry | 16:25 | |
*** ljxiash has quit IRC | 16:29 | |
*** jwcroppe has quit IRC | 16:30 | |
nicodemus_ | hello | 16:31 |
*** thumpba has joined #openstack-telemetry | 16:31 | |
nicodemus_ | I'm seeing an ever-increasing number of measures that gnocchi-metricd acknowledges but doesn't process | 16:31 |
nicodemus_ | about 200 measures per day spread across 6 metrics | 16:32 |
*** lsmola has quit IRC | 16:38 | |
gordc | nicodemus_: what are the metrics? | 16:46 |
stevelle | Just curious, how do you do this diagnosis? (the number of measures waiting processing, and inspecting them) | 16:53 |
stevelle | I'm assuming this is the ceph store. | 16:53 |
nicodemus_ | gordc, one is memory.resident | 16:57 |
nicodemus_ | the others are disk.* | 16:57 |
nicodemus_ | stevelle, the metricd logs report that: "6544 measurements bundles across 6 metrics wait to be processed" | 16:58 |
nicodemus_ | since I'm polling every 20 minutes, there are several of these log lines, but metricd isn't processing them | 16:58 |
nicodemus_ | only when new measures arrive, metricd computes them | 16:59 |
nicodemus_ | down to the same number of unprocessed measures, or sometimes a little more | 16:59 |
nicodemus_ | the "measure_" objects for those metrics are on the ceph pool as well as in the xattrs of the "measure" object | 16:59 |
nicodemus_ | this happens with metricd versions 2.0.1.dev1 and 2.0.1.dev46 (our production deploy) | 17:00 |
nicodemus_ | would you like me to open a bug with this info? | 17:01 |
gordc | nicodemus_: sure. i'm about to head out for lunch | 17:03 |
gordc | it'd be helpful to include the metric listing in metric table (if present) | 17:03 |
nicodemus_ | me too. I'll provide as much info as possible in the bug report | 17:03 |
gordc | nicodemus_: cool cool.ttyl | 17:04 |
nicodemus_ | k, thanks! | 17:04 |
stevelle | nicodemus_: thx, haven't looked at the operational monitoring bits yet, digging now | 17:04 |
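Editor's note: for stevelle's question about how to diagnose the backlog, one low-tech option is to pull the counts out of the metricd log line nicodemus_ quotes and watch whether they shrink over time. A minimal sketch, assuming the message text matches the quoted line exactly; the function name `parse_backlog` is made up for illustration and is not part of Gnocchi.

```python
import re

# Pattern built from the log message quoted in the chat; the exact
# wording in other Gnocchi versions may differ (assumption).
BACKLOG_RE = re.compile(
    r"(\d+) measurements bundles across (\d+) metrics wait to be processed"
)


def parse_backlog(line):
    """Return (bundles, metrics) from a metricd log line, or None."""
    match = BACKLOG_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = "6544 measurements bundles across 6 metrics wait to be processed"
print(parse_backlog(line))
```

Feeding successive log lines through this and comparing the first number would show whether the backlog is stuck, as described above, or draining.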
sileht | gordc, I suspect that self.partition is always 0 here: https://github.com/openstack/gnocchi/blob/master/gnocchi/storage/ceph.py#L191 | 17:09 |
sileht | gordc, forget | 17:09 |
*** ljxiash has joined #openstack-telemetry | 17:21 | |
*** jwcroppe has joined #openstack-telemetry | 17:25 | |
*** marcin12345 has quit IRC | 17:27 | |
*** marcin1234 has joined #openstack-telemetry | 17:27 | |
*** ChrisBenson has joined #openstack-telemetry | 17:28 | |
*** nadya has quit IRC | 17:33 | |
marcin1234 | hey guys, what are your thoughts about the inefficiency of Gnocchi writing objects to the Swift container called "measure"? Maybe it would make sense to write to Redis instead, I think. | 17:34 |
marcin1234 | those objects are in there for a very, very short time, and it is an expensive operation for such a small object (mostly timestamp, id, value) | 17:35 |
*** david-lyle has quit IRC | 17:44 | |
*** ChrisBenson has quit IRC | 17:46 | |
stevelle | marcin1234: your issue there is adding yet another piece of shared infra, and secondly that isn't an HA solution so you will lose some of that data someday. | 17:52 |
stevelle | the first is a concern in small deploys, the second for large deploys | 17:53 |
*** iberezovskiy is now known as iberezovskiy_ | 17:53 | |
*** rcernin has quit IRC | 17:56 | |
gordc | sileht: :) | 17:57 |
openstackgerrit | Mehdi Abaakouk (sileht) proposed openstack/gnocchi: ceph: Don't fetch useless omap attributes https://review.openstack.org/302383 | 17:58 |
gordc | stevelle: you going to be in austin? | 17:58 |
stevelle | gordc: yes | 17:58 |
gordc | nice. will be good to put a face to the name | 17:59 |
gordc | planning on joining the telemetry design sessions? | 17:59 |
stevelle | marcin1234: my comment above isn't meant to shut down the idea, just to lay out some concerns that need to be managed. | 17:59 |
stevelle | gordc: I'm expecting to balance between telemetry and osa | 18:00 |
gordc | marcin1234: there's a bug i think that's compounding your issue: https://bugs.launchpad.net/gnocchi/+bug/1566940 | 18:00 |
openstack | Launchpad bug 1566940 in Gnocchi "objectstorage resource not processed correctly" [Undecided,New] | 18:00 |
gordc | i'm still debugging it. | 18:00 |
gordc | stevelle: cool cool. | 18:00 |
*** david-lyle has joined #openstack-telemetry | 18:02 | |
marcin1234 | stevelle: Redis is needed anyway for envs where you have more than one Ceilometer controller to coordinate, and it is HA; that data would most likely not be lost because Redis can persist to disk on shutdown. Also, that data lives for a very short time (minutes), so using disk for it is not optimal | 18:04 |
stevelle | marcin1234: any tooz driver that supports the lock use case is needed afaik | 18:05 |
stevelle | And redis persisting to disk is only good for planned outage, and from what I heard operators don't trust the disk persistence. | 18:07 |
stevelle | I agree that a clustered memory-based solution seems appropriate fwiw | 18:08 |
stevelle | just that all the implementations available seem to create pushback from ops | 18:09 |
stevelle | we need to prepare for that | 18:09 |
*** pradk has quit IRC | 18:14 | |
*** zqfan has quit IRC | 18:22 | |
*** pradk has joined #openstack-telemetry | 18:25 | |
*** KrishR has joined #openstack-telemetry | 18:28 | |
*** ljxiash has quit IRC | 18:29 | |
*** thumpba has quit IRC | 18:29 | |
*** thumpba has joined #openstack-telemetry | 18:31 | |
NotMyNameEither | Hi! We are currently evaluating Gnocchi, for a potential deployment in our production environment (4000 instances/Gnocchi site). We were wondering if Gnocchi was ever tested at this scale yet? With which storage drivers? | 18:32 |
gordc | err.. i'm going to go out on a limb and say probably not. | 18:35 |
gordc | ceph driver is probably your best bet. the swift driver will probably cause you some issues currently. | 18:36 |
*** jwcroppe has quit IRC | 18:39 | |
marcin1234 | what is the highest number version wise of Gnocchi that will work with Liberty Ceilometer? | 18:40 |
jmlowe | 2.0.2 works, I don't think there is a newer release yet | 18:40 |
*** ChrisBenson has joined #openstack-telemetry | 18:43 | |
gordc | officially we only test liberty against 1.3.x but that's really only because we had a requirements bump in 2.x that doesn't match with other stable/liberty stuff. | 18:44 |
*** yassou has quit IRC | 18:46 | |
NotMyNameEither | gordc: thank you for your answer. Could you elaborate a little bit about the Ceph vs Swift drivers? | 18:50 |
NotMyNameEither | I suppose this is related to the measure container being used as a queue before the metrics are processed. How is Ceph handling this better than Swift? | 18:51 |
gordc | NotMyNameEither: they actually both work pretty much identically. they both queue up unprocessed data in their respective backends, which then gets processed and the aggregates stored accordingly. | 18:52 |
gordc | NotMyNameEither: to avoid the whole ceph vs swift general debate, gnocchi dev team in general is more familiar with ceph so that's probably why it's performing better | 18:53 |
NotMyNameEither | Makes sense, no problem, not trying to start a fight :] | 18:53 |
gordc | NotMyNameEither: :) there's also a list of bugs open against swift which we ran into when building the driver. | 18:54 |
NotMyNameEither | We have been testing Gnocchi/Swift in a dev environment for about a week now and we are indeed seeing Swift suffering from the constant PUT and DELETE made to the measure container | 18:55 |
NotMyNameEither | Around 150PUT/s, we are seeing object-server errors related to the container update operation that happens after an object write | 18:56 |
gordc | NotMyNameEither: yeah, there is definitely a lot of i/o in gnocchi from read/write temp and computed storage | 18:57 |
NotMyNameEither | I was curious to understand how Ceph would handle it better. To my understanding, the journal is also updated after an object write before sending back an ACK to the client | 18:57 |
NotMyNameEither | Swift is actually keeping up with the objects write load, but the more load we are pushing, the more asynchronous container update operations we are seeing | 18:58 |
gordc | sileht: can probably answer exact specifics better... although he's based in Europe. | 18:58 |
gordc | NotMyNameEither: are you metering swift as well (with ceilometermiddleware)? | 18:58 |
NotMyNameEither | We are not yet no, we are only collecting instances metrics during this proof of concept at this point | 18:59 |
gordc | NotMyNameEither: i see. yeah. i haven't had a chance to benchmark the backends. | 19:03 |
gordc | from general feedback the ceph driver has been much more performant | 19:03 |
NotMyNameEither | While I understand the reason why storing the unprocessed data in a shared store makes sense to make them available to any Gnocchi aggregator (allowing to scale Gnocchi horizontally), it seems that this is creating a bottleneck that won't scale under high load | 19:06 |
NotMyNameEither | Both Ceph and Swift will be limited to the devices associated to the "measure" container (based on their hash) which, unless I am missing something, cannot be scaled | 19:08 |
NotMyNameEither | I might be wrong, this is only my current understanding of what we are observing! | 19:09 |
NotMyNameEither | No offense by the way, I am not here to criticize your work at all, just trying to understand the vision/design! | 19:10 |
sileht | With Ceph we use an omap database on the measure object, which lets us control the lock contention on this object; so if you have replica 3 for the pool that holds the 'measure' object, you are limited by 3x the devices | 19:10 |
gordc | no worries :) yeah that may be valid. we initially had an idea to write locally as well but we went down this route. | 19:11 |
gordc | doesn't mean we can't change it :) | 19:11 |
gordc | NotMyNameEither: good feedback though. | 19:11 |
gordc | jd__: ^^ fyi for future you. | 19:11 |
NotMyNameEither | sileht: gordc: thanks for your answers , much appreciated! | 19:12 |
sileht | NotMyNameEither, Also, Ceph allows reading the omap database without holding the lock, so workers can get a (perhaps outdated) version of the database without waiting for the journal or anything else | 19:13 |
NotMyNameEither | I see | 19:13 |
sileht | So I would say only gnocchi-api is a bit limited by the devices' performance, but gnocchi-metricd is not | 19:14 |
gordc | (for ceph) | 19:14 |
sileht | yes :) | 19:14 |
NotMyNameEither | I think most of that logic applies to Swift as well actually | 19:15 |
* gordc needs an emoji of confusion. | 19:15 | |
gordc | (for swift) | 19:15 |
NotMyNameEither | :) | 19:15 |
NotMyNameEither | I definitely take your word on the fact that Ceph shows better performances though | 19:16 |
NotMyNameEither | But both backends will indeed become your Bottleneck eventually (only for the measure/unprocessed data portion) | 19:17 |
gordc | NotMyNameEither: there's probably some weird inefficiencies we have in swift driver. if you have some expertise in it, contributions are welcomed :) | 19:17 |
cdent | NotMyNameEither: I've been wondering about this too: [t t53] | 19:17 |
purplerbot | <NotMyNameEither> While I understand the reason why storing the unprocessed data in a shared store makes sense to make them available to any Gnocchi aggregator (allowing to scale Gnocchi horizontally), it seems that this is creating a bottleneck that won't scale under high load [2016-04-06 19:06:35] [n t53] | 19:17 |
gordc | cdent: your bot is magic. | 19:17 |
cdent | I proposed at once point that that temp data should always use the local file store | 19:17 |
cdent | s/one/once/ | 19:17 |
cdent | and then feed to whatever the "real" store is | 19:18 |
sileht | I think we could separate the final storage backend from this temporary storage backend, so use the best tech for each | 19:18 |
sileht | s/so/to | 19:18 |
gordc | seems like a good design session. 'how to possibly improve temp store' | 19:18 |
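Editor's note: the idea floated by cdent and sileht above, keeping the temporary store for unprocessed measures separate from the final storage backend, can be sketched as two small interfaces. Everything below is hypothetical: the class and function names are invented for illustration, both stores are in-memory stand-ins, and the "aggregate" is just a mean. It is not Gnocchi's driver API.

```python
from collections import defaultdict


class InMemoryIncoming:
    """Hypothetical temporary store: holds raw measures until processed."""

    def __init__(self):
        self._pending = defaultdict(list)

    def add(self, metric_id, timestamp, value):
        self._pending[metric_id].append((timestamp, value))

    def pending_metrics(self):
        return list(self._pending)

    def drain(self, metric_id):
        # Remove and return everything queued for this metric.
        return self._pending.pop(metric_id, [])


class InMemoryAggregates:
    """Hypothetical final store: keeps one mean per metric."""

    def __init__(self):
        self.means = {}

    def store(self, metric_id, measures):
        values = [value for _, value in measures]
        self.means[metric_id] = sum(values) / len(values)


def process(incoming, aggregates):
    """Move queued measures from the temporary store to the final one."""
    for metric_id in incoming.pending_metrics():
        measures = incoming.drain(metric_id)
        if measures:
            aggregates.store(metric_id, measures)


incoming = InMemoryIncoming()
incoming.add("cpu", 1, 1.0)
incoming.add("cpu", 2, 3.0)
final = InMemoryAggregates()
process(incoming, final)
print(final.means)
```

With this split, the temporary side could be swapped for a local file store or a memory-based service without touching the final backend, which is the trade-off the design session would need to weigh.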
sileht | see you tomorrow guys | 19:20 |
NotMyNameEither | sileht: that would definitely make sense, but I am sure this complicates the parallel aggregation of data if stored locally | 19:20 |
gordc | sileht: laters | 19:20 |
NotMyNameEither | Bye sileht, thank you for your time and answers! | 19:20 |
NotMyNameEither | Oh my bad, "that would definitely make sense, but I am sure this complicates the parallel aggregation of data if stored locally" was meant for cdent actually :) | 19:22 |
cdent | he was saying much the same thing as me, or acking it and then expanding a bit | 19:22 |
cdent | but yeah, you are right | 19:22 |
NotMyNameEither | I like where this discussion is going! Appreciate your openness to feedback! | 19:22 |
* gordc doesn't want to say "let's use redis" | 19:24 | |
cdent | no way gordc, even more cumbersome would be to put them in a queue on the message bus | 19:26 |
* cdent imagines more horribleness | 19:26 | |
gordc | lol | 19:27 |
NotMyNameEither | :) | 19:28 |
*** ametts has joined #openstack-telemetry | 19:31 | |
*** KrishR has quit IRC | 19:36 | |
stevelle | I presume we don't want to go down the path of partitioning aggregation work | 19:40 |
stevelle | but I'm going to throw it out there to see it get obliterated | 19:40 |
cdent | stevelle: without j*d and s*leht that conversation will stall | 19:42 |
stevelle | cdent: I'll save that for summit then | 19:42 |
gordc | cdent: wah? we don't need them. i'm my own boss :P | 19:42 |
cdent | gordc: obliteration is jd's speciality | 19:43 |
stevelle | I fear that much of these internals are still not meaningfully differentiated from magic and so I won't understand a lot | 19:43 |
gordc | true true | 19:43 |
gordc | stevelle: coles notes: we do some partitioning currently. all the workers will process only their own block of unprocessed measures | 19:44 |
gordc | not sure if that's the partitioning you were thinking of | 19:45 |
stevelle | gordc: I believe I'm thinking of a little more partitioning than exists, but as I am but an egg maybe I'm misguided | 19:46 |
marcin1234 | finally read all your comments guys | 19:47 |
marcin1234 | current partitioning is still using the same container "measure", so it does not help with this issue | 19:47 |
*** thumpba has quit IRC | 19:47 | |
*** KrishR has joined #openstack-telemetry | 19:49 | |
*** KrishR has quit IRC | 19:53 | |
*** julim has joined #openstack-telemetry | 19:57 | |
*** nadya has joined #openstack-telemetry | 19:58 | |
gordc | marcin1234: yeah, we need to brainstorm. | 19:59 |
*** kapil has joined #openstack-telemetry | 19:59 | |
kapil | hi, i am sending compute.metrics.update event notifications to ceilometer from the compute node. However, some events are getting lost. | 20:00 |
*** drupalmonkey has quit IRC | 20:00 | |
kapil | I checked the logs for ceilometer-agent-notification and nova-compute, nova-compute is generating the events but some of them are not being received by ceilometer | 20:01 |
gordc | kapil: do you have something else listening to nova queue? | 20:03 |
*** KrishR has joined #openstack-telemetry | 20:04 | |
*** mgagne_ is now known as mgagne | 20:04 | |
kapil | like what ? | 20:05 |
kapil | i added two classes to /usr/lib/python2.7/dist-packages/ceilometer-2015.1.1.egg-info/entry_points.txt which generate the metric from the notification | 20:06 |
kapil | but that should not be an issue i think | 20:06 |
*** rcernin has joined #openstack-telemetry | 20:09 | |
gordc | do you get some compute.metrics.update notifications but not others? | 20:09 |
*** pradk_ has joined #openstack-telemetry | 20:12 | |
*** pradk_ has quit IRC | 20:13 | |
*** pcaruana has quit IRC | 20:14 | |
kapil | yes, i guess that is the case | 20:14 |
*** pradk has quit IRC | 20:14 | |
kapil | the ceilometer sample-list query doesn't show me the metrics for every interval | 20:15 |
kapil | but nova-compute logs show that an event was generated | 20:15 |
gordc | which metric are you missing specifically? | 20:15 |
*** pradk has joined #openstack-telemetry | 20:18 | |
*** nadya has quit IRC | 20:18 | |
kapil | i added a few metrics of my own which are sometimes missing and sometimes reported | 20:19 |
kapil | they should be reported every 1 min, as is the default | 20:19 |
kapil | but one of my metrics is being reported every 1 min as it should be | 20:21 |
gordc | not sure. i'd probably need to look at code to help. | 20:22 |
gordc | really depends on how you added new metric | 20:23 |
kapil | https://github.com/kapiliitr/openstack_project | 20:28 |
kapil | i pushed the code i have added as a plugin | 20:29 |
kapil | along with this, i made changes to rootwrap and entry_points.txt mainly | 20:29 |
kapil | i am getting the ipmi.power metric every 1 min, but others are randomly arriving | 20:30 |
marcin1234 | kapil: check your rabbit to see if you have a matching topic name on publisher and consumer | 20:32 |
kapil | the topic name is compute.metrics.update right ? i didn't change that | 20:33 |
kapil | ok, maybe i am getting closer to fixing the problem, maybe the messages sent on rabbit are distributed across multiple controller nodes | 20:36 |
kapil | and if one node is failing, then it doesn't try the other node | 20:36 |
*** ljxiash has joined #openstack-telemetry | 20:41 | |
kapil | yes, i just checked that is the problem. thanks guys | 20:42 |
*** kapil has quit IRC | 20:42 | |
*** jdowner has left #openstack-telemetry | 20:43 | |
*** ljxiash has quit IRC | 20:45 | |
*** KrishR has quit IRC | 20:55 | |
*** peristeri has quit IRC | 20:59 | |
*** jmlowe has quit IRC | 21:10 | |
*** cdent has quit IRC | 21:14 | |
*** julim has quit IRC | 21:14 | |
openstackgerrit | Merged openstack/ceilometer: mongo: remove unused function https://review.openstack.org/302264 | 21:22 |
*** rcernin has quit IRC | 21:27 | |
*** jwcroppe has joined #openstack-telemetry | 21:28 | |
*** nicodemus_ has quit IRC | 21:32 | |
*** jwcroppe has quit IRC | 21:33 | |
*** jwcroppe has joined #openstack-telemetry | 21:36 | |
*** drupalmonkey has joined #openstack-telemetry | 21:43 | |
*** thorst has quit IRC | 21:44 | |
*** thorst has joined #openstack-telemetry | 21:44 | |
*** ChrisBenson has quit IRC | 21:49 | |
*** thorst has quit IRC | 21:53 | |
*** rbak_ has quit IRC | 21:53 | |
*** ChrisBenson has joined #openstack-telemetry | 21:56 | |
*** ChrisBenson has quit IRC | 22:01 | |
*** rbak has joined #openstack-telemetry | 22:15 | |
*** ametts has quit IRC | 22:21 | |
*** pradk has quit IRC | 22:36 | |
*** aggaatul has joined #openstack-telemetry | 22:39 | |
*** drupalmonkey has quit IRC | 22:46 | |
*** aggaatul has quit IRC | 22:47 | |
*** thorst has joined #openstack-telemetry | 22:50 | |
*** thorst has quit IRC | 22:57 | |
*** gordc has quit IRC | 22:59 | |
*** jwcroppe has quit IRC | 23:46 | |
*** thorst has joined #openstack-telemetry | 23:55 | |
*** ljxiash has joined #openstack-telemetry | 23:56 |