*** yujunz has joined #openstack-performance | 00:21 | |
*** catintheroof has quit IRC | 00:29 | |
*** dimtruck is now known as zz_dimtruck | 00:55 | |
*** jkilpatr has quit IRC | 01:06 | |
*** yujunz is now known as yujunz[away] | 01:12 | |
*** yujunz[away] is now known as yujunz | 01:13 | |
*** zz_dimtruck is now known as dimtruck | 01:16 | |
*** tovin07_ has joined #openstack-performance | 03:41 | |
*** yujunz has quit IRC | 04:08 | |
*** dimtruck is now known as zz_dimtruck | 05:48 | |
*** yujunz has joined #openstack-performance | 05:59 | |
*** yujunz-zte has joined #openstack-performance | 07:00 | |
*** yujunz has quit IRC | 07:02 | |
*** pcaruana has joined #openstack-performance | 08:19 | |
*** yujunz-zte is now known as yujunz[away] | 08:20 | |
*** yujunz[away] is now known as yujunz-zte | 08:23 | |
*** msimonin has joined #openstack-performance | 08:24 | |
*** msimonin has quit IRC | 08:25 | |
openstackgerrit | yunfeng zhou proposed openstack/performance-docs: add CONTRIBUTING.rst https://review.openstack.org/412898 | 08:48 |
---|---|---|
*** yujunz-zte is now known as yujunz[away] | 09:04 | |
*** yujunz[away] is now known as yujunz-zte | 09:05 | |
*** yujunz-zte is now known as yujunz[away] | 09:05 | |
*** yujunz[away] is now known as yujunz-zte | 09:05 | |
*** yujunz-zte has quit IRC | 10:13 | |
*** tovin07_ has quit IRC | 10:15 | |
*** jkilpatr has joined #openstack-performance | 10:56 | |
*** msimonin has joined #openstack-performance | 11:09 | |
*** msimonin has quit IRC | 11:10 | |
*** msimonin has joined #openstack-performance | 11:13 | |
*** msimonin has quit IRC | 11:14 | |
*** jkilpatr has quit IRC | 11:28 | |
*** yujunz has joined #openstack-performance | 11:51 | |
*** jkilpatr has joined #openstack-performance | 12:07 | |
openstackgerrit | Ilya Shakhat proposed openstack/performance-docs: Kubernetes density testing https://review.openstack.org/413048 | 12:12 |
*** jkilpatr has quit IRC | 12:20 | |
*** jkilpatr has joined #openstack-performance | 12:20 | |
*** yujunz has quit IRC | 12:26 | |
*** catintheroof has joined #openstack-performance | 12:37 | |
*** pcaruana has quit IRC | 13:01 | |
*** pcaruana has joined #openstack-performance | 13:06 | |
*** yujunz has joined #openstack-performance | 13:29 | |
*** yujunz has quit IRC | 13:29 | |
*** yujunz has joined #openstack-performance | 13:30 | |
*** catinthe_ has joined #openstack-performance | 14:06 | |
*** catintheroof has quit IRC | 14:08 | |
*** pcaruana has quit IRC | 14:23 | |
*** pcaruana has joined #openstack-performance | 14:37 | |
*** Guest67717 is now known as med_ | 14:48 | |
*** med_ has quit IRC | 14:48 | |
*** med_ has joined #openstack-performance | 14:48 | |
*** zz_dimtruck is now known as dimtruck | 14:55 | |
*** dimtruck is now known as zz_dimtruck | 15:05 | |
*** tovin07_ has joined #openstack-performance | 15:11 | |
*** vbala has joined #openstack-performance | 15:14 | |
openstackgerrit | Igor Yozhikov proposed openstack/performance-docs: Test plan for k8s+OS+Cinder+Ceph https://review.openstack.org/411933 | 15:17 |
*** rcherrueau has joined #openstack-performance | 15:28 | |
DinaBelova | #startmeeting Performance Team | 15:30 |
openstack | Meeting started Tue Dec 20 15:30:14 2016 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:30 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:30 |
openstack | The meeting name has been set to 'performance_team' | 15:30 |
DinaBelova | hey folks! | 15:30 |
akrzos | Hey DinaBelova | 15:30 |
rcherrueau | o/ | 15:30 |
tovin07_ | o/ | 15:30 |
vbala | Hi | 15:31 |
DinaBelova | let's wait for a few moments to ensure everyone who wanted joined :) | 15:31 |
lezbar__ | o/ | 15:32 |
DinaBelova | hey lezbar__ o/ | 15:32 |
DinaBelova | so I guess we may get started | 15:32 |
DinaBelova | #topic Action Items | 15:32 |
DinaBelova | last time we had only one action item on me | 15:32 |
DinaBelova | regarding verification of what grafana backend Mirantis is using | 15:33 |
DinaBelova | in fact we're using right now plain Prometheus with its own database | 15:33 |
DinaBelova | we plan to add persistent time series storage (e.g. Cassandra or OpenTSDB) a bit later | 15:33 |
DinaBelova | to store old monitoring data | 15:34 |
DinaBelova | and then we'll need to modify our grafana boards a bit | 15:34 |
DinaBelova | to grab data from it | 15:34 |
DinaBelova | but right now it's plain prometheus | 15:34 |
DinaBelova | I don't remember who was asking this question, I believe it might be you, akrzos | 15:34 |
DinaBelova | so we may proceed to the current progress | 15:35 |
DinaBelova | #topic Current progress on the planned tests | 15:36 |
DinaBelova | rcherrueau it looks like you're the only guy from Inria today :) | 15:36 |
rcherrueau | Yes, msimonin is on holiday, so I will speak for him/Inria. | 15:36 |
DinaBelova | rcherrueau cool :) | 15:36 |
DinaBelova | please go ahead | 15:36 |
rcherrueau | We are working on two things. First, deploying a multi-region OpenStack with kolla. | 15:36 |
rcherrueau | This almost works. | 15:37 |
DinaBelova | any issues met? | 15:37 |
DinaBelova | probably we may list bugs here | 15:37 |
DinaBelova | if any | 15:37 |
rcherrueau | We have something we call the Administrative Region (AR) that contains Keystone, MariaDB (with Keystone tables) and Memcached. | 15:37 |
rcherrueau | This AR also contains one HAProxy since we deploy with kolla. | 15:38 |
rcherrueau | We then have n OpenStack Regions (OSRn), each containing Nova, Glance, Neutron, RabbitMQ, MariaDB and HAProxy | 15:38 |
rcherrueau | Each OSR registers itself with the AR Keystone. When an operator connects to Horizon, they have to choose among all the OSRs | 15:38 |
rcherrueau | To do so, we have to patch kolla a little bit. We plan to send a mail to the kolla mailing list to share our experience with the community | 15:39 |
DinaBelova | so you have keystone separated from the OSRs into a separate region? just to make sure | 15:40 |
rcherrueau | So, no special issues except patches we have to do on the kolla-ansible code. | 15:40 |
rcherrueau | Yes, exactly | 15:40 |
DinaBelova | rcherrueau ok, and those regions might be located in different locations theoretically | 15:40 |
rcherrueau | Yes, this is the idea | 15:41 |
DinaBelova | I think that keystone performance might be the issue in this case :/ | 15:41 |
DinaBelova | although I think you'll test it anyway :) | 15:41 |
rcherrueau | Yes we will, and this brings us to the second thing we are working on | 15:41 |
DinaBelova | ok, thank you rcherrueau - please keep us updated regarding your experiments :) | 15:42 |
rcherrueau | At the same time we are adding `netem` to our deployment and test tool | 15:42 |
DinaBelova | and the second? | 15:42 |
rcherrueau | `netem` is a Linux tool that lets you emulate network latency, low bandwidth, packet loss ... | 15:42 |
akrzos | what about setting latency via tc? | 15:43 |
rcherrueau | The idea is to make several multi-region deployments on our G5k platform, then use `netem` to simulate different locations with different latencies and bandwidths, and see how OpenStack behaves | 15:43 |
DinaBelova | #info Inria had to modify Kolla a bit to be able to proceed with their type of multisite deployment (Administrative Region and n OpenStack Regions) | 15:43 |
rcherrueau | akrzos: netem is tc ;) | 15:43 |
akrzos | ah | 15:44 |
akrzos | :D | 15:44 |
DinaBelova | #info the second part of work is oriented on adding `netem` to their deployment and test tool - to simulate different locations with different latencies, bandwidth and see how OpenStack behaves | 15:44 |
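As rcherrueau notes, `netem` is a queueing discipline driven through `tc`. The following is a minimal sketch of how a test tool might build such a command; the helper name `netem_command`, the interface `eth0`, and the delay, jitter and loss values are illustrative assumptions, not values from the Inria tooling. The command is only constructed, not executed (running it needs root on a Linux host).

```python
def netem_command(dev, delay_ms=None, jitter_ms=None, loss_pct=None, rate=None):
    """Build a `tc qdisc add ... netem` argument list (hypothetical helper).

    Nothing is executed here; pass the result to subprocess.run() as root
    to actually apply the emulation on a Linux host.
    """
    if jitter_ms is not None and delay_ms is None:
        raise ValueError("jitter requires a base delay")
    cmd = ["tc", "qdisc", "add", "dev", dev, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]
        if jitter_ms is not None:
            cmd.append(f"{jitter_ms}ms")  # jitter follows the base delay
    if loss_pct is not None:
        cmd += ["loss", f"{loss_pct}%"]
    if rate is not None:
        cmd += ["rate", rate]  # e.g. "10mbit"
    return cmd


# Assumed example: emulate a distant region with 100ms +/- 10ms and 1% loss.
print(" ".join(netem_command("eth0", delay_ms=100, jitter_ms=10, loss_pct=1)))
# prints: tc qdisc add dev eth0 root netem delay 100ms 10ms loss 1%
```

Building the argument list separately from running it keeps the latency and bandwidth profile of each emulated location easy to log and compare between runs.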
DinaBelova | ok, thanks rcherrueau | 15:44 |
rcherrueau | msimonin is working hard on this second part | 15:44 |
DinaBelova | hope to see him next week :) | 15:44 |
DinaBelova | akrzos any update from you sir? afair you got new HW for the telemetry testing :) | 15:45 |
akrzos | so I've been running into bottlenecks in telemetry services | 15:45 |
akrzos | first was too few metricd workers | 15:46 |
akrzos | this is with 3 controllers, 4 ceph nodes, 10 computes | 15:46 |
akrzos | booted 1k instances | 15:46 |
DinaBelova | #info akrzos has started work on telemetry testing following the test plan - http://docs.openstack.org/developer/performance-docs/test_plans/telemetry_scale/plan.html | 15:46 |
akrzos | gnocchi backlog continuously grows | 15:46 |
akrzos | $os_Workers limits metricd workers to 6 on my controllers | 15:47 |
akrzos | (24 logical cpu cores) | 15:47 |
akrzos | so i redeployed overriding it with 48 workers | 15:47 |
akrzos | so 48 workers on each controller | 15:47 |
akrzos | so 144 total metricd workers | 15:47 |
*** yujunz has quit IRC | 15:47 | |
akrzos | also reduced metric processing delay | 15:47 |
akrzos | from 60s to 30s | 15:47 |
akrzos | and 1k instances is now handled in realtime | 15:48 |
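The worker numbers akrzos quotes can be checked with a few lines of arithmetic. This is only a sketch of the sums stated in the log (3 controllers, 24 logical cores each, 6 default versus 48 overridden metricd workers); the function name is a hypothetical helper.

```python
CONTROLLERS = 3            # from the log
CORES_PER_CONTROLLER = 24  # logical CPU cores, from the log
DEFAULT_WORKERS = 6        # metricd workers per controller before the override
OVERRIDDEN_WORKERS = 48    # metricd workers per controller after the override


def total_metricd_workers(controllers, workers_per_controller):
    """Total metricd workers across all controllers (hypothetical helper)."""
    return controllers * workers_per_controller


print(total_metricd_workers(CONTROLLERS, DEFAULT_WORKERS))     # prints: 18
print(total_metricd_workers(CONTROLLERS, OVERRIDDEN_WORKERS))  # prints: 144
print(OVERRIDDEN_WORKERS / CORES_PER_CONTROLLER)               # prints: 2.0
```

So the override moves each controller from 6 workers to two workers per logical core, which is consistent with the later observation that controller load average exceeds the core count.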
akrzos | in ceph there are 36 osds | 15:48 |
akrzos | also needed to tune pgs to avoid ceph health_warn | 15:48 |
*** yujunz has joined #openstack-performance | 15:48 | |
akrzos | though the calculation for this is tricky using pgcalc | 15:48 |
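For readers unfamiliar with the placement-group sizing akrzos mentions, here is a simplified sketch of the rule of thumb behind Ceph's PGCalc as commonly described. The target of 100 PGs per OSD, the replica size of 3, and the single-pool data share are assumptions for illustration; the log does not say which values were used, and the real calculator applies additional minimums that are omitted here.

```python
def suggested_pg_count(osds, target_pgs_per_osd=100, replica_size=3, data_fraction=1.0):
    """Simplified PGCalc-style estimate for one pool (all defaults are assumptions).

    raw = osds * target_pgs_per_osd * data_fraction / replica_size, then
    rounded to a power of two: keep the lower power unless it falls more
    than 25% below the raw value, in which case use the next higher one.
    """
    raw = osds * target_pgs_per_osd * data_fraction / replica_size
    if raw < 1:
        return 1
    lower = 1 << (int(raw).bit_length() - 1)  # largest power of two <= raw
    if lower < raw * 0.75:
        return lower * 2
    return lower


# 36 OSDs as in the log; the other inputs are the assumed defaults above.
print(suggested_pg_count(36))  # prints: 1024
```

With several pools sharing the same OSDs, `data_fraction` would be split between them, which is part of why the calculation gets tricky in practice.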
akrzos | so with this tuning i can now sustain 1k instances in the cloud with gnocchi | 15:49 |
akrzos | on low archival policy | 15:49 |
*** harlowja has joined #openstack-performance | 15:49 | |
akrzos | i attempted to scale further | 15:49 |
akrzos | (wanted 2k) | 15:49 |
akrzos | and got to ~1.9k before hitting new problems | 15:49 |
akrzos | load avg on controllers is >core count | 15:49 |
DinaBelova | wow | 15:50 |
DinaBelova | it's huge load | 15:50 |
akrzos | memory is rising in both rabbitmq and ceilometer-collector | 15:50 |
akrzos | at this scale now | 15:50 |
akrzos | also | 15:50 |
akrzos | to get to 1.9 k | 15:50 |
akrzos | i had to tune threads in gnocchi | 15:50 |
akrzos | aggregation worker threads defaults to 1 | 15:50 |
DinaBelova | it looks like for ~2k VMs gnocchi and rabbit potentially need to be separated from each other onto different nodes - with more nodes given to the control plane side of the cloud | 15:50 |
akrzos | my concern now is the collector grows as i have seen in the past | 15:51 |
akrzos | i thought there was a patch put in to limit the # of messages it grabs off rabbit | 15:51 |
akrzos | to prevent growth | 15:51 |
akrzos | but i don't understand the problem enough right now | 15:51 |
DinaBelova | akrzos ack, thank you sir | 15:52 |
akrzos | so another factor | 15:52 |
akrzos | is the archival policy | 15:52 |
akrzos | high policy might actually mean less aggregations being "Recalculated" | 15:52 |
akrzos | and could actually be a lower workload | 15:52 |
akrzos | due to a finer grain "end" timeframe | 15:52 |
akrzos | so i should retest with a new archival policy | 15:53 |
akrzos | and maybe different number of aggregations | 15:53 |
akrzos | so lots to try still | 15:53 |
akrzos | another thing i can share with the community is a collectd plugin i wrote to monitor gnocchi backlog | 15:53 |
akrzos | #link https://review.openstack.org/#/c/411030/4/ansible/install/roles/collectd-openstack/files/collectd_gnocchi_status.py | 15:54 |
akrzos | I think that summarizes the chaos i've been working on as of last week pretty well :D | 15:54 |
DinaBelova | ack, really good job being done | 15:54 |
DinaBelova | thanks akrzos | 15:54 |
akrzos | thanks | 15:55 |
akrzos | also i agree separating telemetry from control plane for scale is a must | 15:55 |
DinaBelova | yeah, I believe this is needed | 15:56 |
DinaBelova | on that scale of monitored resources | 15:56 |
DinaBelova | ok, from the Mirantis side we've started uploading test plans / results for some recent research | 15:56 |
DinaBelova | #link https://review.openstack.org/411933 | 15:56 |
DinaBelova | #link https://review.openstack.org/413048 | 15:56 |
DinaBelova | the first one is regarding Cinder performance with Ceph backend - in case of running OpenStack services on k8s | 15:57 |
DinaBelova | Ceph is installed separately of course :) | 15:57 |
DinaBelova | the second one is related to max pods per host density testing | 15:58 |
DinaBelova | in fact what we got was a bit disappointing | 15:58 |
DinaBelova | once 200 pods are running on the host, the overall process of scheduling, etc. becomes really slow | 15:58 |
DinaBelova | so 400 pods is almost the limit here | 15:58 |
DinaBelova | we think we may miss some pool / whatever configuration parameter | 15:59 |
DinaBelova | as we did not expect degradations to start that early (200 pods/node density) | 15:59 |
DinaBelova | so that's still in progress | 15:59 |
DinaBelova | also right now we're still working on workloads testing | 16:00 |
DinaBelova | on 200 nodes | 16:00 |
DinaBelova | where we're deploying heat stacks with various apps running on VMs and planning to run locust.io workloads against them | 16:00 |
DinaBelova | still on the deployment phase for now | 16:00 |
DinaBelova | we observed some strange issues with Heat support in the fuel-ccp - really bad performance | 16:01 |
DinaBelova | so we're debugging it right now to see what might be the reason for this issue | 16:01 |
DinaBelova | and I think that's pretty much all from my side | 16:01 |
*** markvoelker_ has joined #openstack-performance | 16:01 | |
DinaBelova | anything else to cover in test plans / test results topic? | 16:02 |
DinaBelova | it looks like we may proceed to the Open Discussions | 16:02 |
DinaBelova | #topic Open Discussion | 16:02 |
DinaBelova | vbala tovin07_ I have an idea to finish the work on https://review.openstack.org/#/c/407967/ patch | 16:02 |
DinaBelova | and cut new osprofiler release | 16:03 |
*** markvoelker has quit IRC | 16:03 | |
akrzos | Any ptg updates? | 16:03 |
vbala | vmware ci posted the result on that patch | 16:03 |
DinaBelova | vbala tovin07 are you ok with it? | 16:03 |
tovin07_ | Yes, it’s from vbala | 16:03 |
vbala | i'm ok with it | 16:03 |
tovin07_ | I think it’s ok | 16:03 |
DinaBelova | ack, thanks :) | 16:04 |
DinaBelova | akrzos well :) from Mirantis side me and andreykurilin still coming :) | 16:04 |
andreykurilin | hi hi | 16:04 |
DinaBelova | akrzos were you able to discuss it within your team? | 16:04 |
*** markvoelker has joined #openstack-performance | 16:04 | |
DinaBelova | rcherrueau the same question to you sir :) any updates on PTG side? | 16:04 |
akrzos | we are still looking into budget, but in an ideal world, we would have myself, rook, sai and justin on our team come | 16:04 |
DinaBelova | akrzos yay :) I hope this will happen :) | 16:05 |
akrzos | and each have a performance topic we could cover/discuss | 16:05 |
rcherrueau | no not right now | 16:05 |
DinaBelova | akrzos I think we may start preparing agenda | 16:05 |
akrzos | so i was wondering if we would put together a schedule/agenda | 16:05 |
DinaBelova | lemme create an etherpad for those purposes | 16:05 |
akrzos | perfect | 16:05 |
tovin07_ | +1 | 16:05 |
DinaBelova | #action DinaBelova create an etherpad for PTG agenda collection | 16:06 |
DinaBelova | ack, cool | 16:06 |
rcherrueau | I have to discuss that with ad_rien | 16:06 |
DinaBelova | rcherrueau sure | 16:06 |
DinaBelova | please take your time | 16:06 |
DinaBelova | akrzos as said, I plan to focus on test ideas / tools roadmaps / etc. | 16:06 |
DinaBelova | ok, one more thing to cover | 16:07 |
DinaBelova | the holiday season is close | 16:07 |
*** markvoelker_ has quit IRC | 16:07 | |
akrzos | DinaBelova: got it | 16:07 |
DinaBelova | I wanted to check who's going to be available and when :) | 16:07 |
akrzos | we are out all next week, back january 3rd | 16:08 |
DinaBelova | I have a PTO for Dec 27 - Dec 30 | 16:08 |
DinaBelova | ok, so it looks like it makes sense to move our next meeting to Jan | 16:08 |
DinaBelova | rcherrueau and you folks? | 16:08 |
rcherrueau | Me too, I will be out next week. I don't know about msimonin | 16:08 |
DinaBelova | are you ok to meet on Jan 3rd? | 16:08 |
DinaBelova | ack, let's agree on next meeting on Jan 3rd, already in the new year :) | 16:09 |
rcherrueau | OK great | 16:09 |
DinaBelova | #info next meeting to be on Jan 3rd, usual time | 16:09 |
tovin07_ | got it | 16:09 |
akrzos | Great Thanks! | 16:09 |
DinaBelova | and I think that's all from my side | 16:10 |
DinaBelova | anything else to cover? | 16:10 |
DinaBelova | tovin07_ akrzos you're welcome :) | 16:10 |
DinaBelova | ok, thank you folks! see you next year :D | 16:10 |
DinaBelova | bye! | 16:10 |
tovin07_ | Bye | 16:10 |
DinaBelova | #endmeeting | 16:10 |
vbala | Bye | 16:10 |
openstack | Meeting ended Tue Dec 20 16:10:45 2016 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:10 |
rcherrueau | bye! | 16:10 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/performance_team/2016/performance_team.2016-12-20-15.30.html | 16:10 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/performance_team/2016/performance_team.2016-12-20-15.30.txt | 16:10 |
openstack | Log: http://eavesdrop.openstack.org/meetings/performance_team/2016/performance_team.2016-12-20-15.30.log.html | 16:10 |
akrzos | Happy Holidays all! | 16:11 |
*** rcherrueau has quit IRC | 16:14 | |
*** vbala has quit IRC | 16:17 | |
*** tovin07_ has left #openstack-performance | 16:33 | |
*** yujunz has quit IRC | 16:35 | |
*** catintheroof has joined #openstack-performance | 16:39 | |
*** catinthe_ has quit IRC | 16:42 | |
*** harlowja has quit IRC | 16:42 | |
* rook just saw the pings | 16:44 | |
rook | sorry | 16:44 |
rook | in other meetings | 16:44 |
rook | DinaBelova: it would be good to get eyes on https://review.openstack.org/#/c/412554/ | 16:44 |
*** zz_dimtruck is now known as dimtruck | 16:56 | |
*** dimtruck is now known as zz_dimtruck | 17:06 | |
*** msimonin has joined #openstack-performance | 17:17 | |
*** msimonin has quit IRC | 17:35 | |
*** harlowja has joined #openstack-performance | 17:53 | |
*** pcaruana has quit IRC | 17:59 | |
*** catinthe_ has joined #openstack-performance | 18:03 | |
*** catintheroof has quit IRC | 18:06 | |
openstackgerrit | Igor Yozhikov proposed openstack/performance-docs: Test plan for k8s+OS+Cinder+Ceph https://review.openstack.org/411933 | 18:10 |
*** zz_dimtruck is now known as dimtruck | 18:13 | |
DinaBelova | rook ack | 18:23 |
DinaBelova | I've seen it already, did not have a chance to review yet | 18:23 |
rook | DinaBelova: ack | 18:52 |
*** jkilpatr has quit IRC | 19:30 | |
*** harlowja has quit IRC | 19:35 | |
*** catintheroof has joined #openstack-performance | 19:37 | |
*** jkilpatr has joined #openstack-performance | 19:39 | |
*** catinthe_ has quit IRC | 19:40 | |
*** catintheroof has quit IRC | 20:47 | |
*** jkilpatr has quit IRC | 21:14 | |
*** jkilpatr has joined #openstack-performance | 21:33 | |
*** dimtruck is now known as zz_dimtruck | 23:52 |