Thursday, 2017-04-20

*** jkilpatr has quit IRC00:01
*** jkilpatr has joined #zuul00:14
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Remove unused repos from repo init  https://review.openstack.org/45823200:28
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Remove project3  https://review.openstack.org/45823300:36
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Use 'org/project' for tests which only operate on one project  https://review.openstack.org/45823700:36
*** jkilpatr has quit IRC05:50
*** jkilpatr has joined #zuul05:51
*** isaacb has joined #zuul06:24
*** bhavik1 has joined #zuul06:49
*** jhesketh has quit IRC07:14
*** jamielennox has quit IRC07:28
*** jamielennox has joined #zuul07:42
*** jhesketh has joined #zuul07:44
*** bhavik1 has quit IRC08:26
*** hashar has joined #zuul08:56
*** jkilpatr has quit IRC10:41
*** jkilpatr has joined #zuul11:02
*** lennyb has quit IRC11:29
*** lennyb has joined #zuul11:30
*** openstackgerrit has quit IRC11:32
*** atarakt has joined #zuul12:15
*** atarakt has quit IRC12:15
*** atarakt has joined #zuul12:16
*** jkilpatr has quit IRC14:07
*** hashar has quit IRC14:11
*** hashar has joined #zuul14:13
*** jkilpatr has joined #zuul14:22
*** dkranz has joined #zuul14:26
*** isaacb has quit IRC14:54
jlkblah, so both of my changes for simple_layout failed. I'll dig into those today14:58
*** isaacb has joined #zuul15:11
jeblairjlk: i think i left a comment on the first about it15:27
*** jkilpatr has quit IRC15:44
*** isaacb has quit IRC16:08
*** openstackgerrit has joined #zuul16:11
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Move test_tags to simple_layout  https://review.openstack.org/45824016:11
Shrewsjeblair: mordred: I've encountered some roadblocks with changing the current design of the log streamer to be incorporated with the executor. I've put my thoughts up here: http://paste.openstack.org/show/607375/  If you have time, I'd appreciate any suggestions that I haven't considered.16:11
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Move some uses of updateConfigLayout to simple_layout  https://review.openstack.org/45826216:12
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Remove unused test git repos  https://review.openstack.org/45826316:12
Shrewsa bit pre-occupied with some other things atm, so very likely overlooked something easy16:12
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Remove outdated TODO  https://review.openstack.org/45608916:13
* Shrews steps away for lunch16:18
jeblairShrews: i feel like the phrase "daemon subprocess" is a specific term of art that i should know the meaning of (similarly, the admonition "daemon subprocesses should not start subprocesses of their own"); do you have a reference for what exactly those mean?16:22
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move semaphore tests to their own class  https://review.openstack.org/45858416:29
*** hashar has quit IRC16:41
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move semaphore tests to their own class  https://review.openstack.org/45858416:44
jeblairthat one ran in 7.5m16:45
Shrewsjeblair: sorry. That was from the viewpoint of using the multiprocessing module and setting the daemon=True attr16:51
jeblairShrews: got it, thanks.16:52
jeblairShrews: i will dig into that a little later16:53
*** jkilpatr has joined #zuul17:30
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move semaphore tests to their own class  https://review.openstack.org/45858417:31
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move timer tests to commitConfigUpdate  https://review.openstack.org/45859917:31
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move timer tests to commitConfigUpdate  https://review.openstack.org/45859917:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move test_repo_deleted to simple_layout  https://review.openstack.org/45826917:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: move test_disable_at to simple_layout  https://review.openstack.org/45825217:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move semaphore tests to their own class  https://review.openstack.org/45858417:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove org_unknown git repo from single-client test config  https://review.openstack.org/45827017:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move check_ignore_dependencies to simple_layout  https://review.openstack.org/45825517:34
jeblairjlk: i updated your patches17:34
jlkwell shoot I was just in the middle of that17:34
jlkI was running tests locally this time to see if they'd pass :)17:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move timer tests to commitConfigUpdate  https://review.openstack.org/45859917:35
jeblairjlk: sorry :(17:35
jlkno worries17:35
jlkI had it wrong anyway17:35
clarkbare files like https://git.openstack.org/cgit/openstack-infra/zuul/tree/tests/fixtures/layout-disable-at.yaml?h=feature/zuulv3 able to be removed?17:36
clarkbmy grepping shows that we may have quite a few of those layouts that are unused17:36
jlkall the patches for setting a hostname in source merged yesterday right?17:37
jeblairclarkb: yes, though some of them are still unported v2 tests; may be easiest to clean them all up when all the tests are done17:38
jeblairjlk: yes17:38
jlkjeblair: is that setting hostname stuff the last of the multi-connection bits, or is there more coming?17:38
jeblairjlk: i think that's it17:40
clarkbjeblair: wouldn't they show up in greps if part of unported tests?17:40
jlkalrighty, I'll start a rebase going. along the way I'll be moving all the tests to the simple_layout mechanism.17:40
jeblairclarkb: yep.  that should be safe.17:41
jeblairjlk: note that the "build a config scenario directory" is still a good way of handling configuration.  in the same way that we will end up with a bunch of scheduler tests with "a gerrit, one project, and a check and gate pipeline" operating from a config directory, if you end up with a situation with a bunch of github tests wanting a similar configuration, it may well be worth doing that, at least for some of the tests.17:45
jeblair(just wanted to make it clear that i think both techniques have their place; right tool for the job and all)17:46
jlknod17:46
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move timer tests to commitConfigUpdate  https://review.openstack.org/45859917:54
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move semaphore tests to their own class  https://review.openstack.org/45858417:54
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove org_unknown git repo from single-client test config  https://review.openstack.org/45827017:54
jlkjeblair: ahhh, so reading setupSimpleLayout() it assumes a source of gerrit.18:33
openstackgerritMerged openstack-infra/zuul master: Remove link to modindex  https://review.openstack.org/41549818:33
jlkjeblair: would you accept a change so that the driver can be passed in as an option, that defaults to gerrit?18:34
*** hashar has joined #zuul18:44
*** hashar is now known as hasharAway18:44
jeblairjlk: sounds good18:46
jlkworking up an isolated change for that to test with18:46
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Allow simple_layout to support custom drivers  https://review.openstack.org/45862419:06
jlkjeblair: ^^ that does the driver option. Seemed to pass tests locally.19:08
jlkI'm going to stack the rest of the github patch set on top of that one19:08
tobiashjeblair: that's correct, it's implicitly configured with max 1 to behave like a mutex19:22
tobiashjeblair: the reason for that was backwards compatibility for the v2 version of the patch19:23
tobiashjeblair: and for the v3 version of the semaphores I just left this behaviour19:24
tobiashjeblair: but I'm also fine making this explicitly required19:24
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Move two more tests to commitConfigUpdate  https://review.openstack.org/45864720:09
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove commitLayoutUpdate  https://review.openstack.org/45864820:13
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Allow simple_layout to support custom drivers  https://review.openstack.org/45862420:19
*** hasharAway has quit IRC20:19
*** hashar has joined #zuul20:20
jlkwoo.20:23
* jlk lunches20:23
*** hashar has quit IRC20:23
*** hashar has joined #zuul20:23
*** yolanda has quit IRC20:35
jeblairShrews: i'd avoid the multiprocessing module for this.  i'd prefer for us to be explicit about exactly what's happening, and multiprocessing hides a lot of stuff which i think we'd want to know about it (i really like it as a simple replacement for threads, but this is more nitty gritty than that).  another option to consider is to manage a subprocess like the scheduler does for the 'built-in' gear server.  that seems to work well enough most of ...20:39
jeblair... the time.  see zuul/cmd/scheduler for how it starts/stops it.  that may port over fairly well to zuul/cmd/executor.20:39
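A minimal sketch of the explicit approach jeblair describes: start a child process directly, keep its handle, and stop it deliberately instead of relying on the multiprocessing module. This is not the code in zuul/cmd/scheduler; the ManagedChild class and the sleeping child command are hypothetical stand-ins for illustration.

```python
import subprocess
import sys


class ManagedChild:
    """Hypothetical helper that owns exactly one child process."""

    def __init__(self, argv):
        self.argv = argv
        self.proc = None

    def start(self):
        # Popen returns immediately; the child runs alongside the parent.
        self.proc = subprocess.Popen(self.argv)

    def stop(self, timeout=5):
        if self.proc is None:
            return None
        # Ask politely first, then force the issue if the child ignores us.
        self.proc.terminate()
        try:
            self.proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            self.proc.kill()
            self.proc.wait()
        exit_code = self.proc.returncode
        self.proc = None
        return exit_code


# Stand-in for a long-running service such as a log streamer.
child = ManagedChild([sys.executable, "-c", "import time; time.sleep(60)"])
child.start()
is_running = child.proc.poll() is None
exit_code = child.stop()
```

Every step (spawn, signal, reap) is visible in the parent, which is the property being asked for here.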
jeblairShrews: i also left some comments on the current changes -- i think those comments are generally still forward compatible with ^^ if we do that.20:40
jeblairclarkb, mordred: I think 458252 -- 458648 are ready to go.20:41
jeblairSpamapS: 458648 deletes a method you wrote; it's replaced with a similar method, but just wanted to draw your attention to it.20:42
*** jkilpatr has quit IRC20:45
SpamapSjeblair: I'll take a gander20:53
clarkbjeblair: ya I had started at the bottom of that stack then lunch arrived20:56
clarkbpicking it up again20:56
clarkbjeblair: I don't see why the tag called init is necessary in https://review.openstack.org/#/c/458270/3/tests/base.py don't you just need a repository?21:02
jeblairclarkb: somewhere along the way addFakeChangeToRepo grew a requirement that there be a commit tagged init.21:07
*** dkranz has quit IRC21:21
*** jkilpatr has joined #zuul21:23
SpamapSjeblair: quite nice. Thanks for optimizing the tests.. :)21:27
clarkbjeblair: that's an interesting requirement :)21:27
jeblairclarkb: oh i think i get it.  we've had that requirement forever.  init_repo used to do it automatically.  but then i had it stop doing it because i wanted some other initialization path (i think probably the config directory system) to do it, and supply its own initial commit.  but then that's adding it back because there are some tests where we do want init_repo to handle it directly and create the initial commit.21:42
jeblair(i still don't understand why it's actually needed, but that explains that the change above isn't really much of a change)21:43
jeblairclarkb, jlk, SpamapS, mordred: btw, i just performed 3 test runs on my local workstation.  the testr elapsed time before the multi-connection stack was 367s.  at the end of the multi-connection stack (which added way more git ops) was 539s.  at the end of the test-cleanup stack it's now 251s.21:46
jlkwhoh21:46
SpamapSRan 219 (-1) tests in 852.227s (-93.742s)21:47
SpamapSFAILED (id=44, failures=2 (-2), skips=23)21:47
SpamapSThat's on my laptop21:47
SpamapSand that's testing 45864821:47
SpamapS'tox -r -epy27'21:47
SpamapSI got one Alarm Clock :-/21:47
jeblairweird, i ran 240 tests.21:48
* clarkb starts a local zk again and runs the tests21:48
SpamapSThe alarm clock kills all the rest of the tests in that process21:48
jeblairah21:48
SpamapSand likely I had one the last time I tried too21:48
SpamapSI shut down a few naughty slack windows .. maybe having that CPU back will help ;-)21:49
SpamapStestr should subtract the rounded 10 minute load average from its concurrency guess21:50
*** dkranz has joined #zuul21:51
SpamapSstill got an alarm clock even with a relatively quiet system. :-/21:55
clarkbI forgot to exclude the mysql tests... so those are failing. Still waiting for completion21:57
SpamapStests.unit.test_scheduler.TestScheduler.test_dependent_behind_dequeue22:02
SpamapSalways the worst of the bunch22:02
jlkthat's the one22:06
jeblairwe could probably have it run fewer jobs and still have it be a valid test22:07
jeblairi have to run out for a bit22:09
clarkbwoo finally think I have it running with all the sql tests excluded lets see if this is happy22:14
clarkbtestr run 'tests.unit.(?!test_connection.(TestSQLConnection|TestConnectionsBadSQL)).*' or tox -epy27 -- 'tests.unit.(?!test_connection.(TestSQLConnection|TestConnectionsBadSQL)).*' is the magic22:15
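The filter above relies on a negative lookahead to drop the two SQL test classes. A small sketch of how that pattern selects test ids, assuming the runner applies it as a regular-expression search against each id; the test method names below are made up for illustration.

```python
import re

# The filter quoted above, unchanged.
pattern = re.compile(
    r"tests.unit.(?!test_connection.(TestSQLConnection|TestConnectionsBadSQL)).*"
)

# Hypothetical test ids; only the class names come from the log.
test_ids = [
    "tests.unit.test_scheduler.TestScheduler.test_disable_at",
    "tests.unit.test_connection.TestSQLConnection.test_example",
    "tests.unit.test_connection.TestConnectionsBadSQL.test_example",
    "tests.unit.test_connection.TestConnections.test_example",
]

# Keep an id only if the pattern matches somewhere in it.
selected = [test_id for test_id in test_ids if pattern.search(test_id)]
```

The lookahead only rejects ids whose text right after "tests.unit." names one of the two SQL classes, so other classes in test_connection still run.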
*** hashar has quit IRC22:25
*** jamielennox is now known as jamielennox|away22:25
*** jamielennox|away is now known as jamielennox22:29
clarkbappears it's not going to finish quickly :/ I have a bunch of failures. This low power i5 from 6 years ago is not that great I guess22:44
SpamapSso even with OS_TEST_TIMEOUT=9999 there's something in that test that still causes an alarm clock. Very confusing.22:44
SpamapSmodel name: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz22:45
SpamapSand it still takes mine 900 or so seconds22:46
clarkbmodel name: Intel(R) Core(TM) i5-2400S CPU @ 2.50GHz22:46
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Allow using webapp from connections  https://review.openstack.org/43983122:46
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Support GitHub PR webhooks  https://review.openstack.org/43983422:46
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Add basic Github Zuul Reporter.  https://review.openstack.org/44332322:46
clarkbthe S means "Slow" (actually its the low power chip iirc but ya that means slow)22:46
SpamapSI think there's more to it than CPU22:48
SpamapS%Cpu(s): 29.7 us, 65.9 sy,  0.1 ni,  4.0 id,  0.1 wa,  0.0 hi,  0.2 si,  0.0 st22:48
SpamapS65.9% system CPU22:49
SpamapShttp://paste.ubuntu.com/24423387/22:50
SpamapSMost of its time spent in clone()22:50
clarkbmight be good to attach yappi to a test run22:51
SpamapSdunno what that is22:52
clarkbit's a python profiler that is generally better with threads and such22:52
SpamapSah, well honestly, clone() is likely because of all the git22:52
clarkbya22:53
SpamapSI wonder if something else is using alarm()22:53
clarkbalso 277 stat errors?22:53
SpamapSbecause it makes 0 sense that I'd get an Alarm Clock with OS_TEST_TIMEOUT=9999 in less than 300s.. but I just did22:53
SpamapSclarkb: that's just how os.path.exists() works22:53
clarkbah22:53
SpamapS"errors"22:53
SpamapSmeans "set errno"22:53
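A short sketch of why an existence check shows up as a stat "error" in a syscall summary: checking a missing path makes the underlying stat call fail with ENOENT, which is the normal way of learning the path is absent. The temporary directory and file name are illustrative only.

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "does-not-exist")

    # os.path.exists() swallows the failed stat and reports False.
    exists = os.path.exists(missing)

    # Calling os.stat() directly exposes the errno that the kernel set.
    try:
        os.stat(missing)
        stat_errno = None
    except OSError as exc:
        stat_errno = exc.errno
```

So a high count of failed stat calls in a trace can simply reflect many checks on paths that do not exist yet.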
clarkbI'm thinking my test run won't finish because it ate itself when it errored22:55
SpamapSit will22:58
SpamapSif it's using any CPU22:59
SpamapSyou just get less tests run22:59
clarkbit is still cpuing22:59
SpamapSsince any scheduled on that process after the fail don't run22:59
clarkbI shall be patient22:59
* SpamapS trying with gentle=True to see if that changes anything22:59
clarkbmost of the cputime does appear to be spent by a single process so maybe thats the last one doing any work?23:01
SpamapSyeah23:01
SpamapSthat's the pattern I see23:01
SpamapShttp://paste.ubuntu.com/24423454/23:06
SpamapSwith gentle timeout that's what I get23:06
clarkband thats before you hit the timeout right?23:07
clarkbso something else is SIGALARMing?23:07
SpamapSNo that's the timeout that is normally causing SIGALRM to kill the process23:07
SpamapSbut with gentle=True it catches it and logs it.23:07
clarkbright any alarm will cause that code to raise23:08
clarkbso wondering if it happened at expected time or later23:08
clarkber earlier23:08
SpamapSansible uses alarm() I think23:08
SpamapSbut it's in a different process23:08
SpamapSso I don't think that would propagate23:08
SpamapSoh I got another one23:09
SpamapS  File "tests/unit/test_scheduler.py", line 765, in test_patch_order23:09
SpamapSjust this time "timeout waiting for zuul to settle"23:10
SpamapS.tox/py27/lib/python2.7/site-packages/ansible/plugins/action/pause.py:                signal.alarm(seconds)23:11
SpamapSI wonder23:11
SpamapSno pauses in our playbooks23:11
SpamapSRan 240 (+27) tests in 969.116s (+241.735s)23:15
SpamapSFAILED (id=48, failures=2, skips=24)23:15
clarkbsetitimer uses the same clock and also raises sigalrm23:16
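A sketch of the interaction clarkb mentions, assuming Linux, where alarm() and setitimer(ITIMER_REAL) share one timer per process: arming one replaces the other, and both deliver SIGALRM. It must run in the main thread of a Unix process.

```python
import signal
import time

fired = []


def on_alarm(signum, frame):
    # Record the signal instead of letting the default action kill us.
    fired.append(signum)


previous_handler = signal.signal(signal.SIGALRM, on_alarm)

# Arm a 10 second alarm, then overwrite it through setitimer.
signal.alarm(10)
remaining, interval = signal.setitimer(signal.ITIMER_REAL, 0.05)

# Wait (at most 2 seconds) for the replacement timer to fire.
deadline = time.monotonic() + 2
while not fired and time.monotonic() < deadline:
    time.sleep(0.01)

# Disarm the timer and restore the original handler.
signal.setitimer(signal.ITIMER_REAL, 0)
signal.signal(signal.SIGALRM, previous_handler)
```

The value returned by setitimer is the time left on the alarm it displaced, which shows that the two calls operate on the same timer. Any library in the same process that arms either one can therefore trigger a test timeout handler early.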
clarkbmine is still running I think its been about an hour23:17
SpamapSoy23:17
SpamapSYeah I forgot about setitimer23:17
* SpamapS used it in pipemeter of all things23:17
clarkbSpamapS: I'm knee deep in manpages23:17
SpamapSnothing in the deps uses itimer23:18
clarkbyou know, I wonder if my btrfs choice was a poor one here >_>23:18
SpamapSclarkb: do you have a tmpfs for ZUUL_TEST_ROOT ?23:19
clarkbno23:19
SpamapSI mean, SSD should make that mostly meh23:20
clarkbthe gate jobs aren't using one as far as I know so it shouldn't be necessary23:20
clarkband yes ssd23:20
SpamapSbut still I haven't looked at how well btrfs handles concurrent metadata writes23:20
SpamapSif nothing else you're creating and destroying a lot of tmpdirs in one single dir23:21
clarkbI'm pretty sure it's much slower than ext4 across the board23:21
clarkbbut I wouldn't expect it to be that much slower that zuul just fails23:21
SpamapSI was always assured btrfs was comparable to ext4 except when it got really full23:22
clarkbhttps://www.phoronix.com/scan.php?page=news_item&px=Linux-4.7-FS-5-Way23:23
SpamapSso...23:23
SpamapSif one sets OS_TEST_TIMEOUT=023:23
SpamapSit should make test base class not even use the timeout fixture23:23
SpamapSand yet...23:23
clarkbbasically for db type stuff btrfs is slow23:23
clarkbSpamapS: how are you setting it? and are you using tox?23:26
SpamapSEverything is slower than XFS for databases.23:26
clarkbtox filters the env by default so you can hard code it in tox.ini or set it in .testr.conf23:26
clarkbSpamapS: ya but ext4 is also quicker than btrfs23:26
SpamapSThey all just fail so badly for concurrent writes to the same file except XFS.23:26
SpamapSbut yeah I see your point.23:27
clarkbext4 beats btrfs and xfs with pgbench too (though thats a specific test not to be taken alone)23:27
SpamapSwhat if it's .testrepository that is making yours slow?23:27
clarkboh maybe, just writing out all the datas?23:27
SpamapSrheeeeeeeealllly?23:27
SpamapSpg I'd expect to lurve some xfs23:27
clarkbSpamapS: bottom of the link I posted23:27
SpamapSdamnit.... ok23:28
SpamapSso..23:28
SpamapStox.ini sets OS_TEST_TIMEOUT to 12023:28
SpamapSapparently setenv sections don't respect already-set values. Damn.23:28
SpamapSso that's been my confusion23:28
SpamapSwe should remove that23:29
SpamapSand just set a default in the test suite23:29
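A minimal sketch of what a default inside the test suite could look like, so that an OS_TEST_TIMEOUT already set in the environment wins and 0 disables the timeout. The resolve_test_timeout helper and DEFAULT_TEST_TIMEOUT constant are hypothetical, not taken from zuul's tests/base.py; treating an unparsable value as "no timeout" is an assumption of this sketch.

```python
import os

# Assumed default, matching the 120 seconds mentioned for tox.ini.
DEFAULT_TEST_TIMEOUT = 120


def resolve_test_timeout(environ=None, default=DEFAULT_TEST_TIMEOUT):
    """Return the timeout in seconds, or None when no timeout applies."""
    if environ is None:
        environ = os.environ
    raw = environ.get("OS_TEST_TIMEOUT")
    if raw is None or raw == "":
        # Nothing set by the caller: fall back to the suite default.
        return default
    try:
        seconds = int(raw)
    except ValueError:
        # Unparsable value: run without a timeout (a choice in this sketch).
        return None
    # Zero or negative means "do not install a timeout at all".
    return seconds if seconds > 0 else None


unset = resolve_test_timeout({})
overridden = resolve_test_timeout({"OS_TEST_TIMEOUT": "9999"})
disabled = resolve_test_timeout({"OS_TEST_TIMEOUT": "0"})
```

A test base class could call such a helper once and skip installing its timeout fixture whenever the result is None.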
jlkhah oops23:29
SpamapSI mean, I'd like to know why my tests are going so slow that they finish 2x slower than on crappy cloud vms23:29
SpamapSbut .. that's why I set it =9999 ... cause I just want it to finish and then I'll debug23:30
clarkbthe default is set in .testr.conf iirc23:30
jlkin good news, my rebased github patches (only have completed 3 so far) are passing tests in CI!23:30
clarkbapparently not23:31
clarkbI think typically you'd have it set in .testr.conf so that it only affects testr runs23:31
clarkband can be overridden easily23:31
SpamapSoh yeah I think it should be there yeah23:31
SpamapSand with a ${OS_TEST_TIMEOUT:120}23:31
SpamapS:-120 I guess23:32
SpamapSthanks, bash, for using single characters to represent entire concepts, so we can't read ANYTHING23:32
clarkbya that23:32
clarkbI'm testing just tests.unit.test_scheduler.TestScheduler.test_disable_at with a clean testrepository and no timeout23:33
clarkbit passes in 24 seconds23:33
clarkbnow to run it a bunch and see if it regresses because .testrepository23:33
clarkbcurious if my 24s is really off the mark compared to SpamapS'23:34
SpamapSclarkb: test_disable_at? Let me look23:37
SpamapStests.unit.test_scheduler.TestScheduler.test_disable_at                         46.56823:37
SpamapSsomething's terribly wrong on my computer I think ;)23:38
clarkbit seems pretty stable around 24s on its own23:39
clarkbwhats the big test that fails for you?23:39
clarkblet me try that one next23:39
clarkbtests.unit.test_scheduler.TestScheduler.test_dependent_behind_dequeue ?23:40
SpamapSyeah that one23:41
SpamapSI don't know how long it really takes23:42
SpamapSbecause it's been dying at 120s23:42
clarkbRan 1 tests in 58.010s here23:42
SpamapSRan 240 (+72) tests in 1010.691s (+645.160s)23:42
SpamapSPASSED (id=50, skips=24)23:42
* clarkb tries a full run with concurrency=123:42
clarkbwondering if it's just terrible contention23:42
clarkband I forgot to disable the mysql tests again... oh well23:43
SpamapSclarkb: what OS and kernel are you on?23:44
SpamapSI'm on Xenial, 4.4.0-7223:44
clarkbSpamapS: suse tumbleweed, 4.10.1-2-default23:44
SpamapSentirely possible that the xenial kernel has some crapiness in it23:44
SpamapShmmmmmmmmm23:46
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Add 'push' and 'tag' github webhook events.  https://review.openstack.org/44394723:46
SpamapSI wonder if I only have one actual CPU23:47
SpamapSdmidecode suggests that I do23:47
SpamapSah 2 cores though23:48
jlkone cpu, 2 cores. so you _should_ be able to concurrency=223:48
jlkor it should pick that up23:48
clarkbI've got 4 cores and it runs 4way by default23:48
clarkbanyways trying it with concurrency=1 to see if thats stable23:49
SpamapSjlk: I'm getting concurrency=423:49
SpamapSbecause 4 threads23:49
SpamapSso maybe testr's doing it wrong23:49
jlkhow is it figuring on 4 threads?23:50
SpamapSlet's see what concurrency=2 does23:50
SpamapSjlk: SMT23:50
jlkoh, so you've got 1 package, 2 cores, hyperthreading?23:50
SpamapSyep23:51
clarkbmine is actually 4 cores not hyperthreaded23:51
SpamapSI'm trying with conc=223:51
clarkbI should clean out the case so that the turboboost can boost more :)23:53
SpamapSadd some stickers23:55
clarkbwell in theory for single cpu load its supposed to clock up to ~3.3ghz23:56
clarkbbut currently only at 2.650 and I think its thermally limited23:56
clarkbtests.unit.test_scheduler.TestScheduler.test_client_promote_dependent and tests.unit.test_scheduler.TestScheduler.test_client_promote each took like a minute and a half23:58
clarkbI could see those going over the timeout limit if things were really loaded up23:59
