Tuesday, 2017-11-14

*** nicovs_be has joined #ara		00:08
*** nicovs_be has quit IRC		00:12
ara-slack	<pilotmattk> @dmsimard We need the DB; we're running MariaDB. We're using this in a very large international shop. 200,000+ endpoints :grinning:. We had to build our own ansible control/cluster to slice inventory across nodes. Currently doing some load testing before opening the gates in Q1 next year	02:07
ara-slack	<pilotmattk> Without ARA, I can hit ~5K nodes in 3-6 minutes. With ARA it's more like 20 minutes, and I can only push about 32 commits per minute. DB is fairly close (same LAN). Currently troubleshooting both sides: the DB and the app.	02:08
ara-slack	<dmsimard> @pilotmattk I'm sure you could be surprised by the performance of sqlite, even at a large scale. If there's even 5ms roundtrip (10ms) for a MySQL database, if you're recording 20k task results (4 tasks on 5k hosts?) that's already 3 minutes worth of latency over the course of a playbook run	02:10
ara-slack	<dmsimard> There's certainly an overhead in running a callback that records data, especially that amount of data. I'm not particularly surprised by your numbers. I'd love to improve them, though.	02:11
ara-slack	<dmsimard> Out of curiosity, we could try benchmarking the current state of ARA 1.0 with your setup -- I'd love to test with that use case and find improvement opportunities. Not tonight, though :)	02:12
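The napkin math above can be written out as a short calculation. This is only a sketch under stated assumptions: one database write per task result, 10 ms of total latency per write, and writes happening one after another. The function name and its parameters are hypothetical and not part of ARA.

```python
def callback_latency_seconds(hosts, tasks_per_host, latency_ms, queries_per_result=1):
    """Estimate time spent waiting on the database, assuming sequential writes.

    All parameters are illustrative assumptions, not measured ARA values.
    """
    results = hosts * tasks_per_host
    return results * queries_per_result * latency_ms / 1000.0


# 4 tasks on 5k hosts = 20k task results, at an assumed 10 ms per write
total = callback_latency_seconds(hosts=5000, tasks_per_host=4, latency_ms=10)
print(f"{total:.0f} s, about {total / 60:.1f} minutes")
```

With these numbers the estimate is 200 seconds, a little over 3 minutes. Raising queries_per_result models the extra queries for hosts, files and plays.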
ara-slack	<pilotmattk> Yeah, I'm not at all surprised about callbacks adding time. 2-3X just seemed high. We're running playbooks all over the place inside docker containers (multiple clouds). Having SQLite files all over would be hard to track down. I wonder if there is a way to batch up results and send bulk commits? Maybe send things through a redis buffer as forks spawn/die. I'd be happy to give 1.0 a go, just replying as there is time	02:16
ara-slack	:slightly_smiling_face: Will check back tomorrow.	02:16
ara-slack	<dmsimard> @pilotmattk Another thing is that ara 1.0 will ship the notion of input drivers. Right now you have this callback <1.0 that does pure SQL queries, in >1.0, this callback is refactored to use an API instead (either "internal (offline)" or HTTP REST). However, this callback will be folded back as a "driver". The driver implementation will make it easier to add other means of inserting data into ARA. I see you mention redis but we already have	02:20
ara-slack	other message queues in mind like mqtt, rabbitmq, etc. This way, the data could be written to a low-latency bus and asynchronously processed to make it available in the interface.	02:20
ara-slack	<dmsimard> Definitely happy to spend some time narrowing down how we can improve this :slightly_smiling_face:	02:40
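The idea of batching up results and sending bulk commits, raised above, could look roughly like the following. This is a minimal sketch and not how ARA works: the BatchedRecorder class, the results table and the batch size are all hypothetical, and an in-memory SQLite database stands in for the remote MySQL/MariaDB server.

```python
import sqlite3


class BatchedRecorder:
    """Buffer task results in memory and write them in bulk (hypothetical helper)."""

    def __init__(self, conn, batch_size=500):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []
        self.commits = 0
        conn.execute(
            "CREATE TABLE IF NOT EXISTS results (host TEXT, task TEXT, status TEXT)"
        )

    def record(self, host, task, status):
        self.pending.append((host, task, status))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # One transaction (and one commit) for the whole batch
        with self.conn:
            self.conn.executemany(
                "INSERT INTO results VALUES (?, ?, ?)", self.pending
            )
        self.commits += 1
        self.pending = []


conn = sqlite3.connect(":memory:")
recorder = BatchedRecorder(conn, batch_size=500)
for i in range(1200):
    recorder.record(f"host{i}", "ping", "ok")
recorder.flush()  # write whatever is left at the end of the run
print(recorder.commits, "commits for 1200 results")
```

The trade-off is that results still sitting in the buffer are lost if the process dies before the next flush.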
*** bcoca has quit IRC		03:48
*** jparrill has joined #ara		06:59
*** nicovs_be has joined #ara		07:45
*** jclaret has joined #ara		07:57
*** jcl has joined #ara		07:57
*** twouters_ is now known as twouters		08:23
*** twouters has joined #ara		08:23
*** jcl has quit IRC		09:59
*** jclaret has quit IRC		10:00
*** jcl has joined #ara		10:00
*** sshnaidm is now known as sshnaidm|afk		10:06
*** jclaret has joined #ara		10:07
*** sshnaidm|afk is now known as sshnaidm		11:34
*** nicovs_b_ has joined #ara		12:02
*** nicovs_be has quit IRC		12:04
*** bcoca has joined #ara		13:35
*** jclaret has quit IRC		14:01
*** jcl has quit IRC		14:01
*** dmsimard|off is now known as dmsimard		14:05
*** jclaret has joined #ara		14:11
*** jcl has joined #ara		14:11
*** bcoca has quit IRC		14:18
*** bcoca has joined #ara		14:19
*** bcoca has quit IRC		14:19
*** bcoca has joined #ara		14:19
ara-slack	<pilotmattk> I have a few ideas to try. Just to double-check that latency estimate vs 3 minutes. If I understand callbacks correctly, each fork runs the callback with the parent running a final callback at the end (summary). Is this correct? There should be a high degree of concurrency at 500 forks (depends on db threads).	14:34
ara-slack	<pilotmattk> I see the 1.0 branch out on GitHub (your repo and openstack), how safe are the callbacks? Does the data model match 0.14.5?	14:36
*** tbielawa has joined #ara		14:41
ara-slack	<dmsimard> @pilotmattk the database model is very different and there's no upgrade path, it breaks backwards compatibility.	15:04
ara-slack	<dmsimard> The callback is safe (it is integration tested), I haven't yet fully tested the API with MySQL however.	15:05
*** jcl has quit IRC		15:06
ara-slack	<dmsimard> I'm not sure about the impact of forks, not familiar with the low level implementation in Ansible	15:06
*** jcl has joined #ara		15:06
ara-slack	<dmsimard> For yesterday's example, it was just napkin math, though. There are a few more queries involved than that: recording hosts, files, plays. The bulk of your time would be spent recording task results in your use case, however	15:08
ara-slack	<pilotmattk> OK, thank you. First order of business is to locate the DB in the same rack (or floor) as the worker nodes. Then thinking about ways to batch up / federate mysql (without the overhead of FederatedX). Might try writing to SQLite then dump and load to backhaul the data. Need some sort of local write cache / tmpfs. RabbitMQ is perfect going forward, seems 50/50 which in-mem store a python project will pick	15:21
ara-slack	<dmsimard> @pilotmattk I don't know what your use case is but we had scalability issues in OpenStack because we were generating static reports for every CI job. The static reports aren't large but it's a lot of smaller files. Anyway, we came up with a WSGI middleware to load arbitrary sqlite databases, which suits the use case for "ephemeral" CI reports well http://ara.readthedocs.io/en/latest/advanced.html	15:25
ara-slack	<pilotmattk> I think I remember that blog post... something about static reports in jenkins. To date we do not generate static reports... We could likely store the sqlite file as a blob inside our Deployment Orchestrator db (postgres).. That has some potential. Call postgres HTTP api to retrieve the report.	15:50
ara-slack	<pilotmattk> I'd have to think about how to collate the reports... eventually. There is a desire to have some visibility into exactly what is being deployed where and how often.	15:51
*** jparrill has quit IRC		15:54
*** nicovs_b_ has quit IRC		16:08
ara-slack	<dmsimard> @pilotmattk yeah, the feature is not so much to create static reports, but rather to dynamically load an arbitrary sqlite database	16:31
ara-slack	<dmsimard> So instead of generating a static report and storing the report files, you store the sqlite database(s) instead	16:31
ara-slack	<dmsimard> I recognize it's a niche use case though :slightly_smiling_face:	16:32
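The earlier idea of writing to SQLite locally and loading the data into a central store afterwards could be sketched with SQLite's ATTACH DATABASE. Everything here is an assumption for illustration: the merge_results helper, the results table and the file names are hypothetical, this is not ARA's schema, and a second SQLite file stands in for the central database.

```python
import os
import sqlite3
import tempfile


def merge_results(central_path, worker_path):
    """Append rows from a worker's SQLite file to a central one (hypothetical)."""
    conn = sqlite3.connect(central_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS results (host TEXT, task TEXT, status TEXT)"
        )
        # ATTACH exposes the worker file under the schema name "worker"
        conn.execute("ATTACH DATABASE ? AS worker", (worker_path,))
        with conn:  # copy all rows in a single transaction
            conn.execute(
                "INSERT INTO results SELECT host, task, status FROM worker.results"
            )
        conn.execute("DETACH DATABASE worker")
        return conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
    finally:
        conn.close()


# Build a small worker database, then merge it into the central one
tmp = tempfile.mkdtemp()
worker_path = os.path.join(tmp, "worker.sqlite")
central_path = os.path.join(tmp, "central.sqlite")

worker = sqlite3.connect(worker_path)
worker.execute("CREATE TABLE results (host TEXT, task TEXT, status TEXT)")
with worker:
    worker.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [("web1", "ping", "ok"), ("web2", "ping", "failed")],
    )
worker.close()

total = merge_results(central_path, worker_path)
print(total, "rows in the central database")
```

Running the merge a second time on the same worker file appends the rows again, so a real backhaul would need some way to mark files as already loaded.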
*** nicovs_be has joined #ara		16:52
*** nicovs_be has quit IRC		16:56
*** dougbtv__ has joined #ara		17:45
*** dougbtv_ has quit IRC		17:48
*** cliles has joined #ara		17:53
*** tbielawa has quit IRC		17:56
*** dougbtv__ has quit IRC		18:12
*** tbielawa has joined #ara		18:33
*** dougbtv__ has joined #ara		18:36
*** tbielawa is now known as tbielawa|caff		19:00
*** jclaret has quit IRC		19:21
*** jcl has quit IRC		19:21
*** tbielawa|caff is now known as tbielawa		19:28
*** resmo has joined #ara		20:22
*** resmo has quit IRC		20:24
*** tbielawa has quit IRC		21:07
*** nicovs_be has joined #ara		23:09
*** nicovs_be has quit IRC		23:13
