Wednesday, 2016-08-03

00:25  *** rudrajit has joined #openstack-dns
00:50  *** richm has quit IRC
01:23  *** puck has quit IRC
01:26  *** puck has joined #openstack-dns
02:24  *** rudrajit has quit IRC
02:26  *** rudrajit has joined #openstack-dns
02:29  *** elarson has quit IRC
03:09  *** sonuk has joined #openstack-dns
03:40  *** stupidnic has quit IRC
03:43  *** EricGonczer_ has quit IRC
03:43  *** stupidnic has joined #openstack-dns
04:30  *** Alex_Stef has joined #openstack-dns
04:35  *** stupidnic has quit IRC
04:39  *** stupidnic has joined #openstack-dns
06:28  *** Alex_Stef has quit IRC
06:55  *** Alex_Stef has joined #openstack-dns
07:21  *** pcaruana has joined #openstack-dns
07:29  *** ekarlso has quit IRC
07:37  *** ekarlso has joined #openstack-dns
07:49  *** lbrune has joined #openstack-dns
07:58  *** maskarat has joined #openstack-dns
08:10  <openstackgerrit> Davanum Srinivas (dims) proposed openstack/designate: [WIP] Testing latest u-c  https://review.openstack.org/318020
08:29  *** kbyrne has quit IRC
08:32  <sonuk> kljkljkljl
08:41  *** rudrajit has quit IRC
08:50  *** GonZo2000 has quit IRC
08:58  *** penchal has joined #openstack-dns
09:03  <openstackgerrit> qinchunhua proposed openstack/designate: Replace assertDictEqual() with assertEqual()  https://review.openstack.org/348733
09:04  *** amitkqed has quit IRC
09:04  *** amitkqed has joined #openstack-dns
09:07  *** GonZo2000 has joined #openstack-dns
09:07  *** GonZo2000 has quit IRC
09:07  *** GonZo2000 has joined #openstack-dns
09:21  *** kbyrne has joined #openstack-dns
09:34  *** nyechiel has joined #openstack-dns
09:44  *** simonmcc has quit IRC
09:47  *** simonmcc has joined #openstack-dns
10:02  *** GonZo2000 has quit IRC
10:10  *** amit213 has quit IRC
10:12  *** amit213 has joined #openstack-dns
10:15  *** serverascode has quit IRC
10:16  *** bauruine has quit IRC
10:17  *** serverascode has joined #openstack-dns
10:24  *** bauruine has joined #openstack-dns
11:11  *** kbyrne has quit IRC
11:12  *** kbyrne has joined #openstack-dns
11:13  <openstackgerrit> Graham Hayes proposed openstack/designate: Revert 372057bddb27716acd42a88591552a8dee7b519b  https://review.openstack.org/350206
11:20  *** abalutoiu has joined #openstack-dns
11:31  *** mehul has joined #openstack-dns
11:32  *** mehul has quit IRC
12:06  *** ducttape_ has joined #openstack-dns
12:27  *** leitan has joined #openstack-dns
12:27  *** trondham has joined #openstack-dns
12:45  <openstackgerrit> Merged openstack/designate-specs: Fix typo in zone-exists-event.rst  https://review.openstack.org/347650
13:11  *** ducttape_ has quit IRC
13:22  *** GonZo2000 has joined #openstack-dns
13:41  *** EricGonczer_ has joined #openstack-dns
13:43  *** richm has joined #openstack-dns
13:49  *** lbrune1 has joined #openstack-dns
13:51  *** lbrune has quit IRC
13:59  *** GonZo2000 has quit IRC
14:00  *** krot_sickleave is now known as krotscheck
14:06  *** ducttape_ has joined #openstack-dns
14:16  *** EricGonczer_ has quit IRC
14:17  *** EricGonczer_ has joined #openstack-dns
14:18  *** mlavalle has joined #openstack-dns
14:19  *** penchal has quit IRC
14:27  <openstackgerrit> Merged openstack/designate: Revert 372057bddb27716acd42a88591552a8dee7b519b  https://review.openstack.org/350206
14:31  *** EricGonczer_ has quit IRC
14:31  *** EricGonczer_ has joined #openstack-dns
14:39  <openstackgerrit> Tim Simmons proposed openstack/designate-tempest-plugin: Test that updating recordset TTL only modifies TTL  https://review.openstack.org/350235
14:40  <openstackgerrit> Tim Simmons proposed openstack/designate: Fix recordset changes so that they preserve object changes fields  https://review.openstack.org/350621
14:41  <openstackgerrit> Tim Simmons proposed openstack/designate: Make notifications pluggable  https://review.openstack.org/348535
14:46  *** pglass has joined #openstack-dns
14:57  *** lbrune1 has quit IRC
14:57  *** lbrune has joined #openstack-dns
14:58  *** lbrune has quit IRC
14:59  *** lbrune has joined #openstack-dns
14:59  *** lbrune has quit IRC
15:01  *** lbrune has joined #openstack-dns
15:06  *** leitan has quit IRC
15:09  *** leitan has joined #openstack-dns
15:40  *** rudrajit has joined #openstack-dns
15:44  *** james_li has joined #openstack-dns
15:49  *** rudrajit has quit IRC
15:52  <maskarat> hi, I am noticing that my machines lose their external DNS domain after a reboot
15:52  <maskarat> is this "normal"?
15:55  <maskarat> [centos@testvm1 ~]$ cat /etc/resolv.conf
15:55  <maskarat> ; generated by /usr/sbin/dhclient-script
15:55  <maskarat> search openstacklocal
15:55  <maskarat> nameserver 10.10.10.10
15:56  <maskarat> although I have defined a domain in my neutron.conf .. it seems that it is reverting back to default behaviour
15:56  <maskarat> with openstacklocal as the domain name
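
(Context: the "openstacklocal" search domain is Neutron's default dns_domain value; for instances to keep an external domain across DHCP renewals, a deployment would typically override it in neutron.conf and restart the DHCP agents. A hedged example - "example.com" is a placeholder, not maskarat's actual domain:)

    # /etc/neutron/neutron.conf -- illustrative snippet
    [DEFAULT]
    # Domain handed to instances via DHCP in place of the default
    # "openstacklocal". Use a trailing dot (e.g. "example.com.") if the
    # deployment integrates Neutron's internal DNS with Designate.
    dns_domain = example.com
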
16:06  *** lbrune has quit IRC
16:08  *** dxu_ has joined #openstack-dns
16:08  <openstackgerrit> OpenStack Proposal Bot proposed openstack/designate: Updated from global requirements  https://review.openstack.org/348619
16:20  *** leitan has quit IRC
16:32  *** leitan has joined #openstack-dns
16:37  *** abalutoiu has quit IRC
16:39  <greghaynes> mugsie: hey, you around to chat perf?
16:44  <mugsie> I am, but our IRC meeting is in 15 mins
16:45  <mugsie> I will be back at about 18:00 UTC ?
16:45  <greghaynes> works for me
16:45  <mugsie> cool
16:53  *** krotscheck is now known as kro_focused
16:58  *** rudrajit has joined #openstack-dns
17:00  *** pcaruana has quit IRC
17:01  *** jmcbride has joined #openstack-dns
17:04  *** nyechiel has quit IRC
17:06  *** openstackgerrit_ has joined #openstack-dns
17:08  *** openstackgerrit_ has quit IRC
17:13  *** Alex_Stef has quit IRC
17:14  *** james_li has quit IRC
17:15  *** pglass has quit IRC
17:16  *** lbrune has joined #openstack-dns
17:18  <openstackgerrit> Tim Simmons proposed openstack/designate: Change bind -> bind9 in docs, sample configs  https://review.openstack.org/350698
17:20  *** pglass has joined #openstack-dns
17:22  *** GonZo2000 has joined #openstack-dns
17:25  *** abalutoiu has joined #openstack-dns
17:35  *** GonZo2000 has quit IRC
17:35  *** pglass has quit IRC
17:35  *** maskarat has quit IRC
17:45  *** lbrune has quit IRC
17:49  *** lbrune has joined #openstack-dns
17:55  *** EricGonc_ has joined #openstack-dns
17:57  *** EricGonczer_ has quit IRC
17:59  *** haplo37__ has joined #openstack-dns
18:01  <mugsie> timsim: greghaynes around?
18:01  <timsim> o/
18:01  <greghaynes> ohai
18:01  <mugsie> hey
18:02  <mugsie> so timsim did the perf testing after the ML thread
18:02  <greghaynes> So, what I was testing was a very small zone transfer getting hit with a ton of requests, I'm curious how that differs from you all's tests
18:02  <mugsie> we were testing zones that were big enough to start using TCP afaik
18:02  <mugsie> is that right timsim ?
18:02  <greghaynes> and I have some theories on why we might not match, but need some info to figure out if they are correct
18:03  <timsim> Yeah I think it was about 2k recordsets
18:03  <mugsie> so we have 2 kinds of usual queries - the small lightweight SOA one, and heavier largish zones
18:04  <mugsie> SOA can definitely be improved, with just local caching of wire format, as that query will be done repeatedly
18:04  <mugsie> especially on zones that do not change much
18:05  <greghaynes> right. So my thinking was pretty simple - the bits where the process isn't blocking on i/o are the only areas where there can possibly be a perf diff between languages, and the larger the zone the more it should be an i/o bound issue unless there's something horribly different about encoding between languages
18:05  <greghaynes> which, there really shouldn't be
18:06  <greghaynes> all that to say - if zone size makes a difference then there's something horribly wrong either in encoding or design
18:06  <mugsie> we use dnspython for encoding
18:06  <mugsie> which could be a culprit
18:06  <timsim> Which is pure Python, right?
18:07  <greghaynes> and I suspect what's actually going on is either running with a ton of threads so thread starvation issues are coming in to play due to the GIL, or $something_silly_with_encoding
18:07  <mugsie> eh, i think so
18:07  *** dxu_ has quit IRC
18:07  <greghaynes> yea, but it's a single pass encode, it still should be fast relative to writing out to the wire
18:07  *** dxu_ has joined #openstack-dns
18:07  *** leitan has quit IRC
18:07  <mugsie> well, we don't do single pass afaik
18:07  <mugsie> we yield tcp packet sized chunks
18:08  <greghaynes> right - I think you read it all in to a rr array then loop over encoding it
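
(For reference, the chunked encoding mugsie describes looks roughly like this with dnspython's Renderer: pack answer rrsets into successive messages capped at max_size, and emit each with the 2-byte length prefix that DNS-over-TCP framing requires. A sketch under those assumptions, not designate's actual mdns code:)

    # Sketch of chunked AXFR encoding with dnspython: fill a message until
    # it would exceed max_size, flush it with the TCP length prefix, start
    # the next one. Assumes any single rrset fits within max_size; header
    # flags (QR/AA etc.) are elided for brevity.
    import struct
    import dns.exception
    import dns.renderer

    def axfr_chunks(query_id, rrsets, max_size=16384):
        renderer = dns.renderer.Renderer(id=query_id, max_size=max_size)
        for rrset in rrsets:
            try:
                renderer.add_rrset(dns.renderer.ANSWER, rrset)
            except dns.exception.TooBig:
                renderer.write_header()
                wire = renderer.get_wire()
                yield struct.pack("!H", len(wire)) + wire   # TCP framing
                renderer = dns.renderer.Renderer(id=query_id, max_size=max_size)
                renderer.add_rrset(dns.renderer.ANSWER, rrset)
        renderer.write_header()
        wire = renderer.get_wire()
        yield struct.pack("!H", len(wire)) + wire
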
18:08  <greghaynes> so, something I'd really like to know is - what specifically other than a largeish zone were you testing. how many threads / processes and how many parallel requests
18:09  *** pglass has joined #openstack-dns
18:09  <timsim> Just a sec
18:10  <greghaynes> because something I did was turn off threading after noticing it defaults to a 1k thread pool and that SO_REUSEPORT would 'just work' for scaling our processes, otherwise I suspect hitting the mdns with many parallel requests would just cause them to all get run on one core in parallel
18:10  <timsim> Yeah I think I was running mdns with one process and SO_REUSE on
18:11  <greghaynes> and the default thread pool count?
18:11  <timsim> Yep
18:12  *** lbrune has quit IRC
18:12  <greghaynes> ok, so this is something worth verifying, but that was kind of my original theory when you all mentioned perf falling off a cliff at larger scale - if you run 1k python threads in a single process that are all doing a bit of work the GIL will destroy throughput
18:12  *** rudrajit_ has joined #openstack-dns
18:13  *** lbrune has joined #openstack-dns
18:13  <timsim> That sounds about like what was happening.
18:15  <timsim> I think there's also probably some difference with dnspython's encoding vs the Go one too that would exacerbate that issue.
18:15  *** rudrajit has quit IRC
18:16  <mugsie> would doing apache style "lots of workers" help here? let the kernel deal with it?
18:16  <mugsie> (it may be a stupid question)
18:16  <greghaynes> For that - the go one will probably be a tiny bit faster but really the way it should pan out is once the zone size gets large enough the amount of time doing encoding work will be tiny compared to blocking on writing out to the network. Really the encoding issues come in to play when doing tons of small zone transfers, which is why I was trying to test that
18:17  <greghaynes> mugsie: yep, that's exactly what I did and the way mdns is coded it 'just works' - which is awesome
18:17  <greghaynes> you literally just start more workers and it's all good
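
(A minimal sketch of the pattern greghaynes describes: several worker processes all binding the same UDP port with SO_REUSEPORT, so the kernel spreads incoming queries across them. Names and numbers are illustrative and it assumes Linux 3.9+; this is not the actual designate-mdns startup code:)

    # Each worker process binds the same UDP port with SO_REUSEPORT; the
    # kernel load-balances packets across them, so scaling is just "start
    # more workers" with no shared state. Illustrative sketch only.
    import os
    import socket

    WORKERS = 4                  # tuning knob: one per core, roughly
    ADDR = ("0.0.0.0", 5354)

    def serve():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind(ADDR)
        while True:
            data, client = sock.recvfrom(4096)
            # ... decode the DNS query and build a real response here ...
            sock.sendto(data, client)   # placeholder echo

    for _ in range(WORKERS):
        if os.fork() == 0:       # child: run its own blocking recv loop
            serve()              # never returns

    for _ in range(WORKERS):     # parent just waits on the children
        os.wait()
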
18:17  <greghaynes> but about the encoding - really the tons of threads + a large zone is when you get in the death spiral with encoding, since the encoding can't release the GIL
18:18  <mugsie> were there issues with eventlet passing work to larger numbers of workers? I seem to remember an ML thread about that
18:18  <mugsie> can't seem to find it in my email though
18:18  <greghaynes> mugsie: eventlet has no idea about the number of processes, eventlet is only operating at the thread level
18:18  <greghaynes> oh, the swift issue?
18:19  <timsim> So a relatively small pool of threads per process to limit the number of parallel encodings would be ideal.
18:19  <greghaynes> yep, do you still have some kind of testing setup?
18:19  <greghaynes> mine was basically a 5-line bash script, I am sure anything is better
18:20  <greghaynes> mugsie: the eventlet issue swift was hitting is a lot more complicated - it really comes down to not being able to do async i/o on flat files in linux, which isn't an issue here
18:21  <timsim> I used this to send the queries https://github.com/rackerlabs/mdns/blob/master/cmd/bench.go
18:22  <timsim> Then I used this mysqldump, which has a small and a large zone https://github.com/rackerlabs/mdns/blob/master/test_resources/designate.sql
18:22  <greghaynes> ah, that's super helpful
18:22  <greghaynes> I spent way too long figuring out how to load up some data
18:23  <greghaynes> So, first is probably verifying that this death spiral with a ton of threads is really happening, which should be pretty easy to test. Then after that there's a bunch of different ways to cache the zone if you wanted to make the responses fast, and it really comes down to how much work I think you want to do
18:24  *** leitan has joined #openstack-dns
18:24  <greghaynes> it also could be used to fix the threading issue a bit, basically if every process did some kind of read-through cache of generating a zone then you would be doing 1k times less work
18:25  <mugsie> the problem with caching is that these services can be distributed geographically
18:25  <mugsie> and an AXFR will hit them once, or twice
18:25  <greghaynes> that's fine, even just an in-process cache that checks the db result and invalidates when it changes
18:25  <greghaynes> because the db query is not where this is falling over, clearly
18:26  <greghaynes> so you'd get identical output, just do 1/num_threads of the work
18:26  <timsim> Also I think there's some level of the wire format that you need to parameterize because it changes for every request
18:27  <Kiall> timsim: there is, but it's easy to "fix" as it's a fixed-size, fixed-position int within the wireformat
18:28  <greghaynes> Yep. That's kind of the minimum no-brainer way to do caching, if a bit more work was wanted I am sure we could come up with something a ton smarter for invalidating it
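
(The fixed-size, fixed-position int Kiall refers to is the 16-bit message ID in the first two bytes of the DNS header, per RFC 1035 section 4.1.1, so serving a cached wire-format response only needs that one field rewritten per request. A minimal sketch:)

    # Rewrite the transaction ID in a cached wire-format DNS response so it
    # matches the incoming query; everything after the first 2 bytes can be
    # reused verbatim. Sketch only.
    import struct

    def patch_response_id(cached_wire: bytes, query_wire: bytes) -> bytes:
        query_id = struct.unpack_from("!H", query_wire, 0)[0]
        return struct.pack("!H", query_id) + cached_wire[2:]
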
18:28  <mugsie> greghaynes: well, I don't think the wire format cache will get used much, if at all
18:29  <greghaynes> mugsie: so one reason I could see you really wanting it is actually for memory savings - if there are actually multiple-megabyte zones, as-is they are read in to memory 3 times per thread
18:29  <greghaynes> so for a 10mb zone it's 30mb ram minimum per thread
18:29  <mugsie> either all servers will hit the same node at about the same time after getting notified, and all threads will generate the cache, or the request will go to a completely different mdns server
18:30  <Kiall> mugsie: on every change, every NS (so 1 to say 15 or so before you tier things) will make the same AXFR query at nearly the same time, which means 15x CPU time vs doing it once, caching, and unblocking the 2nd to 15th query.
18:30  <greghaynes> mugsie: oh, no, you do front-of-line blocking. So all the other threads lock on the one generating the cache and then when it's done they all write it out
18:30  <mugsie> unless we take a lock on a zone
18:30  <greghaynes> mugsie: you lock on the cache entry
18:31  <mugsie> OK, yeah - that makes sense.
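
(The "lock on the cache entry" idea in sketch form: the first request for a zone renders it while concurrent requests for the same zone block on that zone's lock, then all of them serve the same bytes. render_zone and the cache layout are hypothetical stand-ins, not designate APIs:)

    # Front-of-line blocking on a cache entry: only one thread pays the
    # expensive, GIL-holding rendering cost per zone per serial; the rest
    # wait on the per-zone lock and reuse the result.
    import threading
    from collections import defaultdict

    _locks = defaultdict(threading.Lock)   # zone name -> lock
    _cache = {}                            # zone name -> (serial, wire bytes)

    def get_zone_wire(zone, serial, render_zone):
        with _locks[zone]:
            entry = _cache.get(zone)
            if entry is None or entry[0] != serial:      # miss or stale
                _cache[zone] = (serial, render_zone(zone))
            return _cache[zone][1]
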
18:31  <mugsie> Kiall: you wrote the current TCP stuff - how easy would it be to add caching to the stuff there?
18:31  <Kiall> That was at least a year or two ago ;)
18:32  <greghaynes> I'm also happy to help with some of this btw, I don't have a *ton* of time but if someone else was taking lead I could probably churn out a few patches
18:32  <Kiall> Caching of responses would be easy enough - we could actually drop in a new mDNS middleware that does it, in all likelihood..
18:32  <mugsie> greghaynes: that would be great - even reviews would help :)
18:33  <greghaynes> sure thing
18:33  <Kiall> cache AXFR's, block in the middleware if cache suggests an AXFR is in progress anywhere else, drop cache if/when SOA queries come in and result in a higher SOA serial
18:34  <mugsie> even a select serial would be ok, right?
18:34  <Kiall> We do hit issues like memcache's 1MB limit, but that can probably be fixed with key sharding
18:34  <Kiall> something somewhere has to hit the DB to invalidate caches
18:34  <mugsie> Kiall: redis. it will keep timsim happy
18:34  <Kiall> lol - we already use and have memcache libs in place ;)
18:35  <Kiall> (one day, we'll move to oslo.cache and let the deployer decide ;))
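
(The key sharding Kiall mentions for memcache's ~1MB value cap would look roughly like this: split a large blob across numbered sub-keys plus an index key. mc is assumed to be any memcached-style client exposing set()/get(); illustrative only:)

    # Shard a large wire-format blob across sub-keys so no single memcache
    # item exceeds the ~1MB limit; the base key stores the shard count.
    CHUNK = 1000 * 1000   # stay safely under the item size limit

    def shard_set(mc, key, blob):
        chunks = [blob[i:i + CHUNK] for i in range(0, len(blob), CHUNK)] or [b""]
        for i, chunk in enumerate(chunks):
            mc.set("%s.%d" % (key, i), chunk)
        mc.set(key, len(chunks))              # index: how many shards exist

    def shard_get(mc, key):
        count = mc.get(key)
        if count is None:
            return None
        parts = [mc.get("%s.%d" % (key, i)) for i in range(count)]
        return None if None in parts else b"".join(parts)
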
18:35  <mugsie> OK. I will file a bp, and write up a spec based on ^
18:36  <mugsie> thanks all for digging in!
18:36  <greghaynes> np :)
18:36  <Kiall> Anyway - I don't think any of this really helps in a meaningful way at the sorta scale RAX are running things at tho :)
18:36  <greghaynes> oh?
18:37  <greghaynes> so, I'm happy to do a bit more engineering to make this a 100% i/o bound problem
18:37  <Kiall> From memory, their usage profile is A) tiered, so 2x AXFR per change rather than say, 15.. and B) a stupid large number of zones, including some very heavy high-churn zones
18:37  <greghaynes> it's not hard I think
18:37  <Kiall> greghaynes: :)
18:38  <greghaynes> Yea, so really the only question is at what point does this become a purely i/o bound problem, because then it's going to be about the same speed no matter the language, it's simply a matter of having the i/o bandwidth
18:38  <timsim> Nah the bind slaves don't end up doing an axfr
18:38  <Kiall> timsim: ah - /me misremembers what you guys look like then ;)
18:39  <timsim> I think caching would work out ok
18:39  <pglass> caching should work well for us to tone down the number of db queries.
18:39  <timsim> It was mostly a matter of having many worker processes going during performance tests. We probably weren't spinning up enough, so they were thread starving
18:39  <greghaynes> pglass: with this caching setup there won't be fewer queries, that could be changed though...
18:40  <pglass> sorry. what's the point of the caching then?
18:40  <greghaynes> timsim: yea, and also turning the thread count way down
18:40  <Kiall> won't be fewer per non-cached AXFR I guess greg meant
18:40  <greghaynes> pglass: here it's purely so we don't encode things per thread
18:40  <mugsie> timsim: it would be interesting to see the perf results at differing numbers of workers
18:40  <Kiall> overall QPS on mysql would drop, as responses served from cache wouldn't hit the DB
18:40  <greghaynes> Kiall: oh, so what are you thinking for invalidating a cache?
18:41  <greghaynes> Kiall: like, when would we know that the db changed - just check serial?
18:41  <pglass> so we explode memory per process, in order to avoid re-encoding to dns wire format?
18:41  <greghaynes> (maybe I missed that?)
18:41  <mugsie> greghaynes: yeah, just checking serial is much lighter
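
(The "just check serial" invalidation is a single indexed lookup compared against the serial stored with the cache entry. A sketch; the table and column names are illustrative rather than designate's actual schema:)

    # Cheap staleness check: one SELECT of the zone serial, compared
    # against the serial recorded with the cached wire-format entry.
    # `db` is any DB-API style connection/cursor; `cache` maps
    # zone name -> (serial, wire_bytes).
    def cache_is_fresh(db, cache, zone_name):
        row = db.execute(
            "SELECT serial FROM zones WHERE name = %s", (zone_name,)
        ).fetchone()
        entry = cache.get(zone_name)
        return row is not None and entry is not None and entry[0] == row[0]
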
18:41  <greghaynes> pglass: it's actually less memory per process
18:41  <pglass> is it?
18:42  <pglass> things in the cache are never freed, are they?
18:42  <Kiall> well - memcache or w/e grows
18:42  <greghaynes> pglass: ah, disregard what I said about queries, if we're checking serial then it is a ton fewer queries
18:42  <mugsie> pglass: it should go to a cache, not in memory
18:42  <Kiall> mugsie: caches are usually in memory ;) semantics matter :P
18:42  <mugsie> and we can kick things out of the cache based on changed serial
18:42  <mugsie> Kiall: in process
18:42  <pglass> oh, i thought I saw like an in memory cache per process. i haven't read too closely. if it's memcached or something that's different.
18:42  <greghaynes> the cache will be in mem, basically the design right now wastes a _ton_ of memory - every thread is reading in the whole zone 3 times and each one stores its own copy and there's 1k threads
18:43  <greghaynes> so if the cache size is HUGE it'll be more mem, but really I think you just need a tiny cache here - the way load works for this service is an axfr goes out and everyone requests the same thing for a short period of time
18:43  <greghaynes> so caching any more than how many simultaneous axfr's are going on is a waste
18:44  <greghaynes> but that's also just a tuning knob folks can figure out afterwards
18:44  <pglass> oh. there's usually only one process with lots of threads.
18:44  <pglass> okay. that makes sense.
18:44  <Kiall> The flip side is SOA query caching, which is a high number of DNS requests per sec with a low-ish number of queries per request... which can also be cached, if invalidation can be worked into the right places.
18:45  <Kiall> (and SOA has a low encoding overhead, as we're talking like 50-200 byte responses)
18:46  <greghaynes> ah, so that's the kind of thing you probably want some kind of smarter system for
18:46  <greghaynes> but, baby steps
18:55  *** haplo37__ has quit IRC
19:07  *** haplo37__ has joined #openstack-dns
19:09  *** EricGonczer_ has joined #openstack-dns
19:13  *** EricGonc_ has quit IRC
19:20  *** abalutoiu has quit IRC
19:32  <openstackgerrit> Graham Hayes proposed openstack/designate: Don't hardcode options we pass to oslo.context  https://review.openstack.org/350758
19:42  *** lbrune has left #openstack-dns
19:44  <openstackgerrit> Merged openstack/designate-tempest-plugin: Test that updating recordset TTL only modifies TTL  https://review.openstack.org/350235
19:53  *** pglass has quit IRC
19:55  *** rudrajit has joined #openstack-dns
19:59  *** rudrajit_ has quit IRC
20:06  *** pglass has joined #openstack-dns
20:08  *** EricGonc_ has joined #openstack-dns
20:09  *** EricGonczer_ has quit IRC
20:16  *** leitan has quit IRC
20:43  *** jmcbride has quit IRC
20:48  *** GonZo2000 has joined #openstack-dns
20:48  *** GonZo2000 has joined #openstack-dns
20:57  *** v12aml has quit IRC
21:00  *** rudrajit_ has joined #openstack-dns
21:03  *** rudrajit has quit IRC
21:04  *** v12aml has joined #openstack-dns
21:19  *** EricGonc_ has quit IRC
21:21  *** EricGonczer_ has joined #openstack-dns
21:41  *** GonZo2000 has quit IRC
21:41  *** GonZo2K has joined #openstack-dns
21:45  *** rudrajit_ has quit IRC
21:54  *** rudrajit has joined #openstack-dns
21:55  *** GonZoPT has joined #openstack-dns
21:58  *** GonZo2K has quit IRC
22:11  *** pglass has quit IRC
22:31  *** tyr_ has joined #openstack-dns
22:32  <tyr_> Krenair: are you interested in contributing to a v2 UI ?
22:33  <Krenair> might be able to in future
22:33  <tyr_> I could use some help with converting a Horizon token into a v2 client. For testing I've just been hard coding the auth in the Horizon API layer.
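
(One common way to do the token-to-client conversion, sketched under assumptions: keystoneauth1 plus python-designateclient, with the designate endpoint URL as a placeholder. Not necessarily the approach Krenair had in mind:)

    # Build a designate v2 client from an existing Horizon token rather
    # than hard-coded credentials: wrap the token in a token_endpoint auth
    # plugin and hand the resulting session to the client.
    from keystoneauth1 import session
    from keystoneauth1.token_endpoint import Token
    from designateclient.v2 import client

    def designate_client_for(request, endpoint="http://controller:9001/v2"):
        # request.user.token.id is the scoped token Horizon already holds
        auth = Token(endpoint=endpoint, token=request.user.token.id)
        return client.Client(session=session.Session(auth=auth))
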
22:33  <Krenair> don't have my own local dev setup and am not really in a position to make one right now
22:34  <tyr_> ok, np. Are you using DNS functionality in Horizon? Perhaps you'd be a good reviewer to validate the usability of a new panel.
22:37  *** mlavalle has quit IRC
22:41  *** ducttape_ has quit IRC
22:44  <Krenair> tyr_, yes
22:45  <tyr_> do you mind if I add you as a reviewer on my patch? Not so much for the code review, but to remind me to ping you when I have a version ready for testing.
22:47  <openstackgerrit> Alexander Monk proposed openstack/designate: Fix api-ref methods for getting, updating and deleting recordsets  https://review.openstack.org/350817
22:47  <Krenair> tyr_, go for it
22:48  <Krenair> I'll have to figure out how to get it set up myself locally
22:48  <tyr_> ok. Thanks! We can burn that bridge once we get to it.
22:48  <Krenair> That doesn't sound very encouraging
22:48  <tyr_> lol
22:48  <Krenair> :)
22:49  <tyr_> I'm aiming to have this ready for Newton.
22:50  <tyr_> but it depends on a fair amount of new Horizon flux... so it's "dynamic"
22:54  *** EricGonczer_ has quit IRC
23:02  *** ducttape_ has joined #openstack-dns
23:07  *** jmcbride has joined #openstack-dns
23:08  <Krenair> yeah we're not actually on Mitaka yet
23:08  *** jmcbride has quit IRC
23:09  <Krenair> our 'production' openstack system just moved to Liberty, testing system is still on Liberty but I think there's an upgrade being planned for that to go to Mitaka soonish
23:14  *** jmcbride has joined #openstack-dns
23:24  *** ducttape_ has quit IRC
23:25  *** ducttape_ has joined #openstack-dns
23:31  *** tyr_ has quit IRC
23:38  <Krenair> oh, he quit :(
23:38  <Krenair> think I had a solution to the token-to-v2-client thing
23:44  <Krenair> sent an email instead
23:47  *** dxu_ has quit IRC
23:49  *** ducttape_ has quit IRC

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!