*** hashar has joined #zuul | 07:06 | |
*** dmellado has joined #zuul | 07:08 | |
*** bhavik1 has joined #zuul | 08:11 | |
*** bhavik1 has quit IRC | 09:51 | |
tobiash | hi, just curious, was there a reasoning why zookeeper was chosen from the set zookeeper, etcd, consul, ...? | 10:11 |
*** jamielennox|away is now known as jamielennox | 11:25 | |
mordred | tobiash: yes - at the time we started looking, it was the only one of the three that implemented fair locking, which for us is important | 11:49 |
mordred | tobiash: it's my understanding that etcd3 may now have support, as they increased the scope for what they wanted etcd to be able to do for v3 | 11:49 |
tobiash | mordred: ah, good to know :) | 11:50 |
tobiash | thx | 11:50 |
mordred | sure thing! | 11:50 |
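For context on the fair-locking point above: ZooKeeper's lock recipe queues waiters as sequential ephemeral znodes and grants the lock in creation order, which is the "fair" behavior mordred refers to. A minimal sketch using the kazoo client (the connection string, path, and identifier are illustrative assumptions, not Zuul's actual code):

```python
# Minimal fair-lock sketch with kazoo (illustrative only, not Zuul code).
# Each waiter registers a sequential ephemeral znode under the lock path,
# so the lock is granted strictly in request order.
from kazoo.client import KazooClient

zk = KazooClient(hosts='127.0.0.1:2181')  # assumed local test ZooKeeper
zk.start()

lock = zk.Lock('/example/locks/resource', identifier='worker-1')
with lock:
    # Critical section: only the waiter at the head of the queue gets here.
    pass

zk.stop()
```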
tobiash | mordred: btw, I discovered a bug in my io sync optimization suggestion | 11:51 |
tobiash | mordred: left a comment in https://review.openstack.org/#/c/448591/ | 11:51 |
mordred | ooh - thanks | 11:55 |
openstackgerrit | Joshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Put test id in zk chroot path https://review.openstack.org/454925 | 12:36 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Run unitests twice to see if this is possible https://review.openstack.org/454931 | 13:01 |
*** openstackgerrit has quit IRC | 13:33 | |
clarkb | mordred: you can also try eatmydata, but at least for tempest we found it wasn't that much of a speedup (since it's largely CPU bound) | 14:53 |
mordred | clarkb: yah - I was mostly curious about the impact on the clouds where we're our own noisy neighbor - like infra-cloud (and the disks are slow there too) | 15:02 |
mordred | clarkb: I figure at places like osic we'll see no impact at all | 15:02 |
*** hashar is now known as hasharAway | 15:08 | |
clarkb | jlk: looks like http://logs.openstack.org/31/454931/2/check/gate-zuul-python27-ubuntu-xenial/12ce10f/console.html generally reproduces the issues we see locally. So good to know the test instances are doing something magical to get stuff to pass other than running the tests once then being deleted. | 16:32 |
mordred | clarkb: WOOT | 16:40 |
clarkb | mordred: I had initially expected not to merge ^ and just use it as a temporary test, but after seeing your +2 I wonder if it might not be a bad idea to have it run twice until things are less flaky | 16:41 |
clarkb | (of course that could make it much harder to make it less flaky) | 16:41 |
mordred | clarkb: yah - the more I've thought about it the more I'm on the fence about the best way to do it ... I think double-running in the gate is a good idea, but maybe doing that directly in the py27 env in tox makes -epy27 locally a bit annoying | 16:43 |
mordred | clarkb: maybe we should add an env that does it twice, or even just add a job that calls tox -epy27 twice in a row that we could start as non-voting while we get things stabilized again or something | 16:43 |
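A sketch of the second option mordred mentions (a job that simply calls the existing env twice); the env name and flow are assumptions about a typical non-voting job body, not an actual project job definition:

```bash
# Hypothetical job body: run the existing py27 env back to back so state
# leaking from the first run surfaces in the gate. Not a real Zuul job.
set -e
tox -e py27
tox -e py27
```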
jlk | clarkb: I'm confused by your statement. You mean CI IS doing something magical, or is the magic just running in a "clean room" each time? | 16:49 |
clarkb | sorry s/are/aren't/ basically the only magic is they run clean room each time | 16:50 |
clarkb | the job fails if you run tox -epy27 twice | 16:50 |
jlk | awesome. | 16:52 |
jlk | That confirms my / our suspicions. | 16:52 |
jlk | although that doesn't jibe with some things I'm seeing. More testing happening today for me. | 16:53 |
jlk | but first, coffee. | 16:53 |
clarkb | there are two main effects of clean room each time. The first is no preexisting .testrepository, which means test ordering is naive (alphabetical? something like that). The other is that locally the databases (mysql/zk) will potentially have leftover data; I have seen zk leak, however I don't think those leaks should affect other tests as the random test chroot code seems to work properly | 16:55 |
*** jlk has quit IRC | 17:30 | |
*** jlk has joined #zuul | 17:30 | |
*** jlk has quit IRC | 17:31 | |
*** jlk has joined #zuul | 17:31 | |
jlk | clarkb: what I've tried thus far was removing all of .testrepository, .tox, and all the contents of the zuul test root, re-running tools/test-setup.sh, and restarting zookeeper between each run. That may not be enough, might have to get more drastic. | 18:18 |
clarkb | jlk: it's possible that something in /tmp is leaking across jobs, but I had good luck just removing .testrepository and the zk files | 18:19 |
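Pulling the reset steps from this exchange into one place, a rough sketch of the local clean-room reset; the ZooKeeper service name and dataDir are assumptions about a typical dev box, not documented Zuul procedure:

```bash
# Rough local reset sketch (paths/service names assumed, adjust as needed).
rm -rf .testrepository                      # drop recorded test runs/ordering
sudo systemctl stop zookeeper
sudo rm -rf /var/lib/zookeeper/version-2    # assumed dataDir; jlk uses a tmpfs
sudo systemctl start zookeeper
tox -e py27                                 # re-run the suite from a clean slate
```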
jlk | kk | 18:21 |
mordred | jlk, clarkb: we're going to feel so good when we know what's wrong | 18:24 |
jlk | such knowledge usually leads to brown liquids. | 18:25 |
jlk | which initially feel great. | 18:25 |
*** hasharAway is now known as hashar | 18:34 | |
mordred | mmm. liquid | 19:21 |
jlk | clarkb: so even pruning all those files and restarting zookeeper, I can't get tip of feature/zuulv3 to pass locally. Rebooting to try again. | 19:48 |
clarkb | jlk: are you clearing the zk data when you restart too? | 20:12 |
jlk | so I did a stop of the service, rm -rf of the tmpfs where zookeeper works, and a start of the service. | 20:12 |
jlk | What size node does the gate use for these tests? | 20:47 |
clarkb | jlk: 8 vcpu, 8GB ram, and ~80GB of disk (though in some cases much more disk) | 20:51 |
clarkb | but what 8vcpu means and how fast that disk is varies greatly across clouds | 20:51 |
jlk | ah | 20:52 |
jlk | So I'm testing with 4 vcpu, maybe I should do more | 20:52 |
jlk | load average is between 4 and 8 | 20:52 |
clarkb | jlk: testr will use the number of cpus available to determine how many processes to run | 20:54 |
clarkb | so slightly above 4 is what I would expect for load average | 20:54 |
jlk | resizing up my test VM | 21:17 |
jlk | to a full 8 CPUs | 21:17 |
jlk | MAGIC it passes. | 21:27 |
jlk | wtf. | 21:27 |
*** hashar has quit IRC | 21:29 | |
clarkb | that can affect test ordering, so if it is an ordering issue that may be why it passed | 21:51 |
jlk | Anybody around who understands how trigger requirements work in 2.5? I'm trying to trace through the code to see how zuul decides whether a trigger event should apply to a pipeline or not. Specifically thinking about trigger filters/requirements as opposed to pipeline requirements. | 21:53 |
*** openstackgerrit has joined #zuul | 22:12 | |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool feature/zuulv3: Validate flavor specification in config https://review.openstack.org/451875 | 22:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool feature/zuulv3: Add ability to select flavor by name or id https://review.openstack.org/449784 | 22:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool feature/zuulv3: Cleanup from config syntax change https://review.openstack.org/451868 | 22:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool feature/zuulv3: Add support for specifying key-name per label https://review.openstack.org/455464 | 22:12 |
*** yolanda has quit IRC | 22:18 | |
*** yolanda has joined #zuul | 22:19 | |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Add support for specifying key-name per label https://review.openstack.org/455466 | 22:26 |
jlk | I want to cry now. | 22:33 |
jlk | the tip of my patch branch completely passes tests on a fresh platform, when using 8 CPUs | 22:33 |
jlk | Is there a way to tell testr to only use 4 cores instead of 8? | 22:33 |
jhesketh | Morning | 22:51 |
clarkb | jlk: yes tox -e py27 -- --concurrency=4 | 23:05 |
mordred | jlk: also - wow | 23:36 |