Tuesday, 2018-10-23

*** jimi|ansible has joined #zuul00:01
mordredSpamapS: re: markdownlint00:16
mordredSpamapS: the pre playbook runs npm - but there isn't anything ensuring npm exists ... we have an install-nodejs role, but that's for installing from nodesource and might be overkill00:17
mordredSpamapS: NEVERMIND - it has been taken care of and also covered in previous review comments00:18
mordredSpamapS: maybe one day I'll learn to read. probably not - but maybe00:18
clarkbreading is hard00:20
mordredclarkb: srrsly00:20
*** tobiash has quit IRC00:21
*** ssbarnea has joined #zuul00:22
*** tobiash has joined #zuul00:23
mordredShrews, corvus: I had this thought about nodepool config ... it happened as I was thinking about the changes to the rate-limiting config that are/will be possible once the current stack finishes landing00:27
mordredShrews, corvus: in the ansible modules, we recently added the ability to pass an entire clouds.yaml style cloud config dict to the cloud parameter00:27
mordredso - if it's a scalar, it's a cloud name, but if it's a dict, we pass it in to the constructor as kwargs00:27
mordredShrews, corvus: do you think adding a similar thing to nodepool would be an improvement or just confusing and terrible?00:28
mordred(the thing that made me start thinking about this in the first place is that, with the next openstacksdk release, it will be possible to configure rate limits in clouds.yaml - and also to configure them separately per-service)00:29
mordredI'm not sure whether adding per-service rate-limit support directly to nodepool's config language is a valuable thing though - or whether a passthrough cloud dict is either (we do it in ansible because people want to store their config in vault and whatnot and the clouds.yaml files get confusing in that context)00:31
mordredanyway- not urgent or anything - just thoughts happening while dealing with airplanes00:31
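
A minimal sketch of the scalar-or-dict pattern mordred describes, for illustration only: get_cloud_region is a hypothetical helper, though openstack.connect is real openstacksdk API.

    import openstack

    def get_cloud_region(cloud):
        """Accept either a named cloud (str) or a clouds.yaml-style dict."""
        if isinstance(cloud, dict):
            # Pass the whole cloud config dict through as constructor kwargs.
            return openstack.connect(**cloud)
        # A scalar is treated as the name of a cloud defined in clouds.yaml.
        return openstack.connect(cloud=cloud)
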
corvusmordred: but with rate limits going into openstacksdk(/keystone?) wouldn't the most desirable future state be one where we retire the use of rate limit stuff in nodepool and defer that entirely to clouds.yaml?  i see this as an opportunity to further simplify nodepool config...00:35
mordredcorvus: yah - that's probably the right option00:36
Shrewsi feel like moving that out of nodepool is the clearer thing to do, too00:36
corvusmordred: to answer the question another way -- if there's a big win by supporting the scalar-or-dict thing, i think it would work fine, however, right now it strikes me as complexity without gain, and configuring this stuff is hard enough as-is...  it'd be an easy sell though if we find the right use case00:36
mordred++00:37
mordredcool. less patches for me to write :)00:37
Shrewsthose were the exact words i had in my head00:37
Shrewscorvus: quick! what number am i thinking of?00:37
corvusShrews: -2?00:38
Shrewssorry, the answer was "blue". i feel safer now00:39
corvusShrews: are you sure it wasn't yellow?00:39
ShrewsO.o00:39
Shrewstin foil hat time00:39
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Consume rate limiting task manager from openstacksdk  https://review.openstack.org/61216900:46
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Remove task manager  https://review.openstack.org/61217000:46
*** gouthamr has joined #zuul00:47
*** dmellado has joined #zuul00:51
*** rlandy has quit IRC01:27
dmsimardI'm trying ansible-runner out of curiosity01:27
dmsimardand it's not bad01:28
*** bhavikdbavishi has joined #zuul01:43
*** bhavikdbavishi has quit IRC01:50
*** bhavikdbavishi has joined #zuul02:23
*** bhavikdbavishi has quit IRC02:40
*** bhavikdbavishi has joined #zuul03:35
*** mrhillsman is now known as openlab04:05
*** openlab is now known as mrhillsman04:06
*** rfolco|rover has quit IRC04:30
*** spsurya has joined #zuul05:48
AJaegerzuul cores, time to remove an ancient role from zuul-jobs - please review https://review.openstack.org/61038106:21
*** bhavikdbavishi1 has joined #zuul06:53
*** bhavikdbavishi has quit IRC06:55
*** bhavikdbavishi1 is now known as bhavikdbavishi06:55
*** pcaruana has joined #zuul06:56
*** themroc has joined #zuul07:24
*** sshnaidm|afk is now known as sshnaidm|pto07:41
*** electrofelix has joined #zuul08:29
*** gouthamr has quit IRC08:32
*** dmellado has quit IRC08:34
*** cbx33 has joined #zuul08:58
*** nilashishc has joined #zuul09:09
*** gouthamr has joined #zuul09:11
*** cbx33 has quit IRC09:17
*** dmellado has joined #zuul09:18
*** jpena|off is now known as jpena09:22
*** rfolco has joined #zuul10:12
openstackgerritMerged openstack-infra/zuul-jobs master: Remove the "emit-ara-html" role  https://review.openstack.org/61038110:14
*** ssbarnea_ has joined #zuul10:16
*** ianychoi has quit IRC10:22
*** ianychoi has joined #zuul10:25
*** rfolco is now known as rfolco|rucker11:23
*** panda is now known as panda|lunch11:27
*** jpena is now known as jpena|lunch11:33
*** nilashishc has quit IRC11:39
*** nilashishc has joined #zuul12:19
*** bhavikdbavishi has quit IRC12:22
*** rlandy has joined #zuul12:29
*** jpena|lunch is now known as jpena12:40
corvustobiash: do you still see the same error in https://review.openstack.org/597147 ?13:43
tobiashcorvus: will re-do the check13:50
openstackgerritMerged openstack-infra/zuul master: web: Increase height and padding of zuul-job-result  https://review.openstack.org/61098013:51
tobiashcorvus: regarding 610029: iterating over the node requests took so long even with caching (we probably have more providers in one instance than is good right now)13:52
tobiashbut nevertheless we had a very long list of open requests and quota calculation for each request took a few seconds even with znode caching13:53
tobiashso I think this safety net is still useful in case of overload situations13:53
openstackgerritMerged openstack-infra/zuul master: encrypt_secret: support OpenSSL 1.1.1  https://review.openstack.org/61141413:54
corvustobiash: ok.  i'm fine merging that and then continuing to work to speed things up.13:58
tobiashcorvus: thanks, that's also my main focus area atm13:59
tobiash(stability and performance)14:01
goerntobiash, are you running jobs on VM or in pods?14:01
tobiashgoern: our jobs run on vms14:01
tobiashzuul in pods14:01
goernuh, zuul itself in pods :)14:02
goernneed to talk to tristanC that he puts software-factory in pods :)14:02
*** panda|lunch is now known as panda14:04
* tobiash just increased the zuul-executor pods to 1214:04
goernand that's a real bottleneck for me right now... the vm running the executors is way too slow :/14:05
tobiashcorvus: I cannot reproduce the error in 597147 anymore. I guess the needed change in the api was not there yet in openstack at that time?14:10
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Ignore removed provider in _cleanupLeakedInstances  https://review.openstack.org/60867014:18
corvustobiash: that sounds plausible14:18
tobiashcorvus: so I changed -1 to +214:19
tristanCgoern: well the issue remains that zuul executor needs privileged pods, which iiuc is not acceptable with multi-tenant openshift deployment...14:23
tobiashtristanC: true, so I have my own 'single-tenant' openshift deployment14:39
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Cleanup node requests that are declined by all current providers  https://review.openstack.org/61091514:48
goerntristanC, tobiash do we know why and what privs the executors need?15:02
tobiashgoern: the executors use bwrap (a sandboxing tool) to jail the jobs15:03
openstackgerritMerged openstack-infra/zuul master: Exclude .keep files from .gitignore  https://review.openstack.org/61199015:03
openstackgerritMerged openstack-infra/zuul master: Add a sanity check for all refs returned by Gerrit  https://review.openstack.org/59901115:03
tobiashthat needs privs15:03
openstackgerritMerged openstack-infra/zuul master: Reload tenant in case of new project branches  https://review.openstack.org/60008815:03
goerntobiash, don't we run jobs in pods?! so why do we need bwrap?15:04
mordredgoern: there is a hypothesis that it could be possible to have bwrap request fewer capabilities - but to my knowledge nobody has had the time to investigate whether or not that is actually possible15:04
mordredgoern: we wrap the execution of ansible-playbook, which runs on the executor, in bwrap15:04
mordredas a defense in depth - to go along with ansible-level restrictions preventing local code execution15:05
tobiashgoern: no, the jobs run in the executor pods (and can do localhost stuff)15:05
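
For context, a hedged sketch of the executor-side idea under discussion: wrapping the ansible-playbook invocation in a bubblewrap jail. The bwrap flags are real, but this is illustrative, not Zuul's actual wrapper code.

    import subprocess

    def run_playbook_jailed(playbook, work_dir):
        cmd = [
            'bwrap',
            '--unshare-pid',                 # isolate the process tree
            '--ro-bind', '/usr', '/usr',     # read-only system dirs
            '--ro-bind', '/etc', '/etc',
            '--proc', '/proc',
            '--dev', '/dev',
            '--bind', work_dir, work_dir,    # only the job workspace is writable
            'ansible-playbook', playbook,
        ]
        return subprocess.run(cmd, check=True)
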
goernoh ja...15:06
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Cleanup node requests that are declined by all current providers  https://review.openstack.org/61091515:17
openstackgerritMerged openstack-infra/zuul master: Use merger to get list of files for pull-request  https://review.openstack.org/60328715:25
openstackgerritMerged openstack-infra/zuul master: Add support for authentication/STARTTLS to SMTP  https://review.openstack.org/60383315:25
openstackgerritMerged openstack-infra/zuul master: encrypt_secret: Allow file scheme for public key  https://review.openstack.org/58142915:25
tristanCgoern: my proposal was to spawn a new executor pod for each job to remove the need for local bwrap isolation, but that's a substantial refactor...15:28
goerntristanC, but I think that is the right way to move zuul forward into a cloud native world15:29
tristanCthere are more details in this thread: http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-July/000477.html15:31
corvusyep, if anyone has time to look into the bwrap capabilities question mordred described, that's the next step.15:34
corvustristanC: are you planning on looking at the nodepool k8s functional test failures?  it looks like it's catching a real bug in the k8s driver15:35
*** ianychoi_ has joined #zuul15:36
tristanCcorvus: i'm still on vacation atm, and i probably won't have time for that before the summit15:37
corvustristanC: oh vacation!  don't worry about it then!  no rush.  :)15:38
corvustristanC: you're just around so much it didn't seem like you were on vacation.  ;)15:38
*** ianychoi has quit IRC15:40
Shrewscorvus: speaking of that, does the direction in https://review.openstack.org/609515 seem sensible to you?15:48
Shrewsas far as organizing the tox tests15:48
Shrewstl;dr....  current tests -> tests/unit/   , driver func tests -> tests/functional/<driver>/15:50
Shrewsfunc tests will be a separate job per driver. didn't make sense to have a single job to set up *all* of the potential backends on a single node15:53
clarkbThe first paragraph of https://github.com/projectatomic/bubblewrap#user-namespaces is the important part re privilege vs unprivileged bwrap aiui15:55
clarkbI think if your container provider considered user namespaces secure for unprivileged users then bwrap would run fine15:55
clarkbfor example on my tumbleweed machine, where non-root users without setuid can run bwrap15:55
corvusShrews: that looks great15:57
*** themroc has quit IRC16:00
*** rfolco|rucker is now known as rfolco|brb16:01
tobiashgoern: btw, nodepool builder also needs privs16:01
openstackgerritMerged openstack-infra/zuul master: web: add config-errors notifications drawer  https://review.openstack.org/59714716:07
ShrewsHas anyone else noticed that zk in our tests seems to be getting less and less reliable?16:39
Shrewsat least in nodepool. not sure about zuul16:40
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: DNM: testing zookeeper oddities  https://review.openstack.org/61275016:46
clarkbShrews: ya one theory I had was that maybe running 8 test processes didn't give zookeeper time to get the cpu16:47
clarkbnot easy to test that, but did push up a change that runs numcpu - 1 test processes16:47
*** caphrim007 has joined #zuul16:48
Shrewsclarkb: well, in the case I just looked at, the zookeeper connection was established, but then it was suspended for whatever reason16:48
Shrewsmaybe still cpu issue? dunno16:49
clarkbCould that happen if your connection times out?16:49
clarkbI'm not sure how zk handles a timed out connection when you run without responding for too many ticks16:49
Shrewsmaybe. but it was less than 2 seconds between connection and suspension16:51
Shrewshttp://logs.openstack.org/98/605898/2/gate/tox-py36/9d5cf96/testr_results.html.gz16:51
Shrewsfor reference16:51
corvuszuul tests have been having issues too, which also are consistent with a cpu contention hypothesis16:52
Shrewsclarkb: where was your change you referenced?16:52
clarkbhttps://review.openstack.org/#/c/561037/16:52
Shrewsdoes that still work with stestr?16:53
clarkbit should, but likely have to move the config to the stestr config file. /me reads some docs16:53
tobiashShrews, corvus: do you think it's cpu or io contention?16:54
Shrewstobiash: we don't know what it is, thus our speculative poking  :)16:55
clarkbhttps://stestr.readthedocs.io/en/latest/MANUAL.html#user-config-files it may not dynamically evaluate it as shell though16:55
clarkbit is configurable, but only to a literal integer value, not a "run this command to get the value" option16:56
Shrewswe could also do the command line option i suppose16:57
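
A hedged sketch combining the "numcpu - 1" idea from above with the command-line route Shrews mentions; stestr's --concurrency flag is real, the wrapper function is illustrative.

    import multiprocessing
    import subprocess

    def run_tests():
        # Leave one CPU free so zookeeper/mysql aren't starved by test workers.
        workers = max(multiprocessing.cpu_count() - 1, 1)
        subprocess.run(['stestr', 'run', '--concurrency', str(workers)],
                       check=True)
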
tobiashI'm probably a little bit biased towards io since I'm mostly dealing with io bottlenecks here16:57
clarkbcheck the dstat addition to zuul jobs change?16:58
clarkbfwiw it did look like there was disk io contention on ovh but not the other clouds16:58
Shrewshrm, this was an ovh node16:59
clarkbI think ovh limits by iops16:59
clarkbso lots of small writes (probably what zk is doing actually) are "slow" but big writes are fast16:59
Shrewsbut a similar failure on rax17:00
caphrim007are there any zuul services which *don't* require the zuul.conf?17:02
tobiashcaphrim007: no, but they require different parts of the zuul conf17:03
caphrim007tobiash: thanks!17:04
*** gothicmindfood has quit IRC17:05
tobiashShrews, corvus, clarkb: the dstat of zuul looks more iops bound than cpu bound17:07
corvuscaphrim007: (it's designed so you can use the same conf file everywhere)17:07
tobiashhttp://logs.openstack.org/00/610100/3/check/tox-py36/6a8a11d/dstat.html.gz17:08
caphrim007corvus: roger that17:08
clarkbtobiash: at least for that particular job I don't think it is, if you scroll the window to the left you'll see there is a spike in iops17:09
clarkbtobiash: and then the rest of it runs under that spike17:09
clarkbimplying we don't hit a limit there17:09
*** gothicmindfood has joined #zuul17:09
clarkbat the same time the load average is very high for an 8vcpu host17:10
clarkbmaybe its both things!17:10
tobiashclarkb: but cpu is below 100% constantly17:11
clarkbya17:11
Shrewsif it isn't cpu, we'll have to look at setting up zookeeper with a tmpfs for our tests. but that's the harder thing to do, so let's try the cpu thing first17:11
clarkbanother thing I notice is that there are a lot of sockets open. Is it possible we are running into ulimit errors?17:11
tobiashclarkb: and the load is a combination of cpu and io on linux17:11
tobiashThat also might be a problem17:12
clarkbtobiash: the wai cpu time should indicate waiting on io (or other syscalls) right?17:12
tobiashNot neccessarily17:12
clarkbI guess if you are running async it wouldn't show up there?17:12
clarkbbeacuse you are polling17:12
corvusit should as long as there is some idle time (and there is), so it should be a fairly reliable indication of iowait17:13
corvus(if there's no idle time, you can't rely on iowait)17:13
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: DNM: testing zookeeper oddities  https://review.openstack.org/61275017:13
clarkbtrying to accommodate the number of sockets/fds is probably worth doing anyway to avoid ulimit problems on say your laptop17:14
tobiashBut nevertheless it's very io heavy as it's constantly doing between 500 and 1000 iops17:16
*** rfolco|brb is now known as rfolco|rucker17:16
clarkbyes17:16
Shrewsthose graphs would be handy to have for nodepool17:17
Shrewshow is that enabled?17:19
clarkbI think it's an unmerged change to the zuul tox job that adds it?17:19
tobiashShrews: https://review.openstack.org/#/c/61010017:19
clarkbfwiw running zk on a tmpfs may not be too difficult since we use the tools/test-setup.sh method of running zk17:19
clarkbI'll work on making that happen17:20
Shrewsclarkb: that isn't used in nodepool17:21
Shrewswe should actually remove that17:21
tobiashMaybe in a pre playbook like dstat?17:21
Shrewsoh, np doesn't have that17:21
clarkbShrews: remove the test setup? that is how mysql is configured17:22
clarkbShrews: and that is why nodepool probably doesn't have it17:22
clarkbI don't think you can remove it from zuul if you want to test the mysql reporting driver17:22
Shrewsclarkb: i was thinking of only nodepool17:22
corvusdo folks like the dstat thing?  should we start thinking about how to do it for realz?17:22
Shrewsbut looking in zuul code (duh)17:23
Shrewscorvus: seems handy for this use case, at least17:23
tobiashcorvus: we have something like this in the base job that can be enabled by a job var17:25
corvusif we want the data for nodepool, the most expeditious thing would just be to copy that change to the nodepool repo for now... i can do that real quick17:26
corvustobiash: what do you use to generate the report?17:26
tobiashWe use sar for gathering and 'sadf -g' for generating svg graphs17:26
corvusi really like the thing in my change because it's all static js that just gets concatenated into a single file.  it's very simple and self-contained.  however, it isn't published anywhere, so we'd have to figure out a distribution mechanism17:27
corvustobiash: is that something you could share?17:27
tobiashOf course17:27
Shrewscorvus: thx. i can rebase on top of that change17:27
tobiashcorvus: I can share that later (not at laptop atm)17:29
openstackgerritJames E. Blair proposed openstack-infra/nodepool master: WIP: Run dstat and generate graphs in unit tests  https://review.openstack.org/61276517:30
corvusShrews: ^17:30
corvustobiash: cool, thanks17:30
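
A sketch of the sar/sadf pipeline tobiash describes, assuming sysstat is new enough for SVG output (see the xenial caveat later in the log); the flags come from the sysstat man pages, the wrapper is illustrative.

    import subprocess

    def collect_and_graph(datafile='/tmp/sa.data', interval=1, count=600):
        # Sample system activity every `interval` seconds into a binary file.
        subprocess.run(['sar', '-o', datafile, str(interval), str(count)],
                       check=True)
        # Render the samples as SVG graphs (-u cpu, -r memory, -b io rates).
        svg = subprocess.run(['sadf', '-g', datafile, '--', '-u', '-r', '-b'],
                             capture_output=True, check=True).stdout
        with open('graphs.svg', 'wb') as f:
            f.write(svg)
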
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: DNM: testing zookeeper oddities  https://review.openstack.org/61275017:31
openstackgerritClark Boylan proposed openstack-infra/zuul master: Run zookeeper datadir on tmpfs during testing  https://review.openstack.org/61276617:33
clarkbit isn't too bad to use a tmpfs ^ if we want to go that route17:33
clarkbcould also look into eatmydata and other write() fast returners17:33
*** jpena is now known as jpena|off17:34
clarkbShrews: if ^ results in more stability we could do similar for nodepool17:34
Shrewsnot quite as simple in nodepool, but yeah. hope it produces good results17:35
corvusShrews: what sets up zk in the unit tests in nodepool?17:36
Shrewscorvus: nothing. it's installed from bindep17:36
Shrewsso we assume it's running17:37
corvusShrews: ah gotcha17:37
corvusShrews: i think that's the same for zuul, so it might just be a matter of adding a test-setup.sh script with the sed in it17:38
Shrewscorvus: yeah, that's what i was thinking17:38
Shrewscorvus: where is that script called in zuul?17:39
clarkbit is part of the base tox jobs iirc17:39
corvusya17:39
corvusso if it exists, it'll get run in tox-pyxx automatically17:39
Shrewsoh, that's convenient17:40
Shrewsso not so bad then17:40
clarkbwe might want to add noatime to that set of mount options too17:40
Shrewsclarkb: ++17:40
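
Expressed in Python for illustration, the kind of step a tools/test-setup.sh might perform; the datadir path is an assumption, and nodev,nosuid,noatime are the mount options discussed above.

    import subprocess

    def mount_zk_tmpfs(datadir='/var/lib/zookeeper'):  # path is an assumption
        # Put zookeeper's datadir on tmpfs so test-time fsync()s never
        # touch the real disk.
        subprocess.run(['sudo', 'mount', '-t', 'tmpfs',
                        '-o', 'nodev,nosuid,noatime', 'tmpfs', datadir],
                       check=True)
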
clarkblet's see if that actually works (I'm slightly worried about file permissions, but the mount that happened locally for me seemed to just work)17:42
Shrewsok, i've got an equivalent change for nodepool, but i'm not going to push it up just yet17:46
*** nilashishc has quit IRC17:59
SpamapSHey everyone, I'm starting to poke at converting my kubernetes yamls for zuul+nodepool (based on tobiash's openshift submission) into helm charts. Just wondering if anybody has headed down that route before I go there.18:01
*** chandankumar is now known as chkumar|off18:37
*** panda has quit IRC18:45
*** panda has joined #zuul18:45
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Merger: automatically add new hosts to the known_hosts file  https://review.openstack.org/60845318:57
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Merger: automatically add new hosts to the known_hosts file  https://review.openstack.org/60845318:59
*** jtanner has joined #zuul19:19
tobiashcorvus: do you still use xenial nodes in openstack?19:22
clarkbtobiash: yes we run zuul on xenial and many of the tests still run on xenial19:23
*** pcaruana has quit IRC19:23
tobiashcorvus: I'm asking because the svg generation of sadf needs sysstat 11.4.0 or newer, and xenial's is too old (I think 16.10 was the first release that shipped a sysstat version that can generate svg)19:24
tobiashso we're currently running the svg generation in docker in an alpine image ;)19:24
tobiashso to use that in openstack we'd maybe have to host the sadf binary somewhere (or add a newer version to the xenial nodes)19:25
tobiashor use a different method of svg generation19:25
corvustobiash: i think it's also ok if it only works on newer systems19:26
tobiashok, that makes it easier19:26
*** openstackgerrit has quit IRC20:06
*** caphrim007 has quit IRC20:30
SpamapScorvus: how would you feel about adding something to zuul that lets it read any config option from envvars (constructed from section+key)? Asking because that makes kubernetes deployments of zuul a lot simpler.20:32
SpamapSI had to jump through a lot of hoops to get secrets into the containers ... porting that to helm chart is shining a light on how this could be made quite a bit simpler.20:32
corvusSpamapS: to be honest, that sounds terrible -- like, we must be missing something.  surely kubernetes can handle running apps with *config files* ?20:33
SpamapSNope.20:36
SpamapSIt can handle config *file*20:36
*** ssbarnea_ has quit IRC20:36
dmsimardSpamapS: you can set environment for Ansible tasks, you don't need Zuul for that20:36
SpamapSBut when you have some bits coming from config maps, some from secrets, and others from other deployment pieces, you have to assemble that config file in a very frustrating way.20:36
SpamapSdmsimard: I am setting things like the github app secret.20:37
SpamapSAnsible isn't even in the picture yet20:37
SpamapSOr the mysql db password.20:37
SpamapSNow20:37
corvus(to elaborate on why it rubs me the wrong way -- i feel like we're finally getting to the point where we can somewhat concisely instruct people on how to set up zuul, and forking that process so there are two completely different ways of configuring zuul is counter-productive.  being able to talk with people about "the zuul config file" and not have that be a mystery depending on the deployment tech would be20:37
corvusgreat)20:37
SpamapSThe problem is that in order to build that file, you have to pull things from many different sources.20:38
SpamapSWe can also just make a wrapper that does what I described.20:38
SpamapSOr, we can enable ConfigParser environment interpolation.20:38
SpamapSWhich I discovered after asking that question.. just now. ;)20:38
dmsimardSpamapS: this would not work ? https://gist.github.com/dmsimard/7e8753b252de7cc9380c2b4d5ad2f6f920:39
SpamapShttp://paste.openstack.org/show/732855/ <-- this patch actually makes it so you can reference the environment in zuul.conf20:39
SpamapSwith %(ENV_VAR_NAME)s20:39
SpamapSso maybe that's a happy medium?20:40
SpamapSsince you still have "a zuul config file"20:40
corvusSpamapS: i see the problem you describe; that solution sounds like maybe a good compromise20:40
SpamapSbut you can feed variable things in via the environment20:40
corvusyeah, it's a bit more explicit and less magic -- it should be pretty easy to understand/debug20:40
SpamapSIndeed, and doesn't change by way of deployment tool.20:41
corvus++20:41
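
A minimal sketch of the %(ENV_VAR_NAME)s mechanism being discussed, not the actual patch: seeding ConfigParser's defaults with the process environment makes interpolation resolve against it. optionxform must preserve case for upper-case names, and real env values containing '%' would need escaping for BasicInterpolation.

    import configparser
    import os

    class EnvConfigParser(configparser.ConfigParser):
        def optionxform(self, optionstr):
            # Preserve case so %(ZUUL_DB_PASSWORD)s can resolve.
            return optionstr

    def read_config(path):
        # zuul.conf entries may now say e.g. password=%(ZUUL_DB_PASSWORD)s
        config = EnvConfigParser(defaults=os.environ)
        config.read(path)
        return config
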
dmsimardSpamapS: oh, it's not for a job, it's for actually deploying zuul ?20:41
SpamapSreading up on the caveats now20:41
SpamapSsince I literally just learned about this 120 seconds ago.20:41
SpamapSdmsimard: correct20:41
dmsimardok, my bad, I had the job use case in mind20:41
SpamapSYeah, and I'm explicitly avoiding ansible for any of it just to avoid ansibleception.20:42
dmsimardfair :)20:42
dmsimardso you're using puppet? jk20:42
SpamapS(though it looks like ansible+k8s should be a lot simpler in ansible 2.7)20:42
dmsimardyeah.. the awx installer uses ansible to deploy itself in k8s/openshift20:43
clarkbSpamapS: fwiw https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#add-configmap-data-to-a-volume implies you could write multiple config files20:43
clarkbthe value is the file content and the key is the filename at that volume mount point20:43
SpamapSclarkb: configmaps and secrets cannot land in the same dir, nor can they be symlinked reliably.20:46
SpamapSwhich is exactly the frustration I landed on20:46
SpamapSzuul.conf ended up having to be a secret entirely.20:46
clarkbhow are you suopposed to use secrets with config20:46
clarkbthat seems broken20:46
SpamapSYES20:46
SpamapSsecrets are carefully handled in such a way that they are very hard to compromise accidentally and don't end up on disk ever.20:47
SpamapSbut that makes them a bit rigid.20:47
SpamapSI believe you can probably figure out a way to make something like  zuul.conf in one dir, and zuul-secure.conf in the same dir.20:48
dmsimardsecrets are also very securely "encrypted" in base64 in etcd :(20:48
SpamapSbut.. I'm just thinking, I kind of like the idea of just sticking them in the environment.20:48
clarkbdmsimard: and etcd doesn't have read acls (or did v3 add that?)20:48
dmsimardnot sure, openshift has acls for sure but I'm not sure if that comes from etcd or k8s20:48
clarkbSpamapS: ya if that works and configparser supports it sanely, it seems like a reasonable approach20:49
SpamapSetcd3 does in fact have RBAC20:49
clarkbnice20:49
SpamapSbut IIRC the recommendation is to replace that secret storage with something better20:50
SpamapShttps://kubernetes.io/docs/concepts/configuration/secret/#protections20:53
SpamapSFYI, if interested20:53
*** openstackgerrit has joined #zuul20:54
openstackgerritMerged openstack-infra/zuul master: Run zookeeper datadir on tmpfs during testing  https://review.openstack.org/61276620:54
clarkbI guess ^ worked thats neat20:54
SpamapSand looks like they have a path to encrypted-at-rest secrets20:54
SpamapShttps://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/20:55
*** manjeets has joined #zuul21:01
SpamapShm.. so with a few more lines of code, instead of doing %(ENV_VAR)s, we could instead have $ENV_VAR and ${ENV_VAR} work..21:01
SpamapSthe latter would be less surprising. Like, people might just try that without reading the docs.21:02
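
And a hedged sketch of the ${ENV_VAR} variant SpamapS floats here, using a custom configparser interpolation built on string.Template; illustrative, not the eventual implementation.

    import configparser
    import os
    import string

    class EnvInterpolation(configparser.BasicInterpolation):
        def before_get(self, parser, section, option, value, defaults):
            # Expand $VAR / ${VAR} from the environment first, then let
            # normal %(...)s interpolation run.
            value = string.Template(value).safe_substitute(os.environ)
            return super().before_get(parser, section, option, value, defaults)

    config = configparser.ConfigParser(interpolation=EnvInterpolation())
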
*** caphrim00_ has joined #zuul21:03
*** caphrim00_ has quit IRC21:07
*** caphrim007 has joined #zuul21:08
caphrim007hey folks, can i ask nodepool questions here? or is that another channel?21:09
corvusSpamapS: that could be a thing21:09
corvuscaphrim007: this is the place21:09
caphrim007corvus: thanks. had a question. if I have a min-ready of 1 in my nodepool.conf and an existing node created in my openstack, and i manually deleted the node in openstack, i expected that nodepool would rectify that. was that an incorrect assumption? didn't seem to "work" until i did nodepool delete foo21:10
caphrim007the existing node had been created via nodepool btw21:11
corvuscaphrim007: ah nope, i don't think nodepool will detect that (it doesn't expect nodes to be deleted from under it)21:12
caphrim007ahh ok. thanks for the clarification!21:12
corvuscaphrim007: but nodepool delete will delete the node in nodepool and openstack21:12
clarkbit should eventually clean up the node from its db (and in the cloud if necessary) after the ready timeout is reached though21:12
clarkbbut that is a long timeout by default21:12
corvusclarkb: oh, right, that's true.  but that's like 8+ hours i think21:13
clarkb(8hours?)21:13
corvusit's more likely to just try to use the node and fail before then21:13
caphrim007ok. that brings me to another question actually corvus. how does zuul instruct nodepool to create something? does it use an api for that? because i see nothing like "nodepool create foo"21:13
corvuscaphrim007: yes, zuul puts requests into zookeeper and nodepool handles them.  there isn't an external interface to this api yet (probably some day, but we still consider it a private api for now while we iron it out)21:14
corvuscaphrim007: you can inspect it with 'nodepool request-list'21:14
corvusthat will show any pending requests from zuul21:15
caphrim007oh, interesting21:15
manjeetsHello zuul community I'm trying to set up a 3rd party job for an opensource project one of openstack projects, and found this https://zuul-ci.org/docs/zuul/admin/quick-start.html#quick-start21:17
manjeetsI can disable gerrit service from this and configure it to point to gerrit for opensource project ?21:17
corvuscaphrim007: here's the current output from openstack's nodepool if you are curious: http://paste.openstack.org/show/732858/21:18
corvusmanjeets: yes, you should be able to do that.  the zuul.conf file is bind-mounted into the container, so you can just edit it in your local directory.  you can remove the gerrit container from the docker-compose file to avoid running it.21:20
corvusmanjeets: note that it will leave "localhost" links for the logs, so you will also need to change the log url to something that other people can access.21:21
manjeetscorvus thanks, should i just remove gerrit from docker-compose, or do I have to delete the gerrit-config tag as well? If I understood correctly, i'll point zuul.conf at the gerrit stream of the upstream patches?21:22
corvuscaphrim007: i forgot (until i pasted that output) that even min-ready nodes go through the request system, so you should be able to see that too.21:22
caphrim007corvus: alrighty. thanks!21:23
corvusmanjeets: you can remove gerrit-config as well21:23
caphrim007corvus: is main.yaml used by all the zuul components too? or only select ones?21:23
corvusmanjeets: and yes, you can update zuul.conf to connect to an upstream gerrit instead of the local one.21:23
corvuscaphrim007: it's only used by the scheduler to bootstrap the rest of the configuration21:24
caphrim007kk21:24
caphrim007ahh yeah. there it is in zuul.conf. duh tim21:24
manjeetscorvus thanks, I'm building that right now; might come back to ask about issues if I run into any!21:24
corvusmanjeets: great, we're happy to help -- when you're done, maybe you can share your configuration -- other folks may be able to use it :)21:25
manjeetscorvus sure ! once I'm done I'll write documentation and make it public21:26
caphrim007corvus: is there a reference/code/anything that i can look at that covers the zuul.conf options?21:29
corvuscaphrim007: yes, i think they should all be covered on this page: https://zuul-ci.org/docs/zuul/admin/components.html21:33
corvuscaphrim007: (under each individual section)21:34
caphrim007corvus: ahh ok. is this config here, zuul_url, no longer a thing? https://github.com/openstack/windmill/blob/306602fc0c267837e2a4af68e510e1e7b705871b/config/zuul/zuul.conf.j2#L4221:37
corvuscaphrim007: correct -- i think that was for zuul v221:38
caphrim007k21:38
*** spsurya has quit IRC21:38
corvuscaphrim007: oh, there's also a little more zuul.conf option documentation in the drivers pages: https://zuul-ci.org/docs/zuul/admin/connections.html#drivers21:38
corvuscaphrim007: for example, to configure a github connection, here are the docs: https://zuul-ci.org/docs/zuul/admin/drivers/github.html#connection-configuration21:39
corvus(those are in separate files -- one per driver -- since the drivers are supposed to be self-contained)21:39
caphrim007ahh yes, right right. i'm just doing some rectifying between windmill and what i see in the current zuul docs21:40
clarkbianw: if you have a sec can you review https://review.openstack.org/#/c/609829/5 ? it adds the port cleanups to nodepool itself21:45
openstackgerritClark Boylan proposed openstack-infra/nodepool master: Run test zookeeper on top of tmpfs  https://review.openstack.org/61281621:49
clarkbShrews: corvus ^ didn't see one of those get pushed yet so went ahead and did it atop the dstat change to see if we notice a difference21:49
clarkblooking at http://logs.openstack.org/16/612816/1/check/tox-py36/f346739/dstat.html (tmpfs) vs http://logs.openstack.org/65/612765/1/check/tox-py36/d2d81c3/dstat.html (not tmpfs) we do drastically reduce the iops21:58
clarkbwhether or not that has a hand in making the tmpfs change pass vs the failure in the not-tmpfs run is hard to say21:58
clarkbthe dstat info for the not-tmpfs run doesn't actually look all that bad21:58
clarkbload is low, plenty of cpu idle time, low memory usage etc22:00
ianwclarkb: i will take a closer look today.  i was wondering if this is actually something nodepool should work around, or if it was a very specific problem22:06
clarkbianw: we've seen it on other clouds too (like packethost recently, but also hpcloud way back when iirc)22:07
clarkbI expect it will be a useful thing to have nodepool understand :/22:07
ianwi guess "this shouldn't happen but does" is the raison d'ĂȘtre of openstacksdk, and nodepool to some extent22:08
clarkbhttp://logs.openstack.org/16/612816/1/check/tox-py35/70a3157/job-output.txt.gz I think that rules out iowait as the cause of the zk problems in nodepool test suite22:09
clarkbI wonder what our ulimit is there22:09
*** caphrim007 has quit IRC22:21
clarkbreading the failed test logs again, the issue is in wait_for_threads22:32
clarkbwe actually do build the image and boot a node that we are waiting on but there must be some unexpected background thread running that holds us up22:33
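
For readers unfamiliar with this kind of check, a hedged sketch of what a wait_for_threads-style assertion does; the allow-list names are assumptions, not nodepool's actual list.

    import threading

    ALLOWED = ('MainThread', 'ZooKeeper')  # assumed allow-list

    def leaked_threads():
        # Any running thread not on the allow-list is treated as a leak
        # from a previous test.
        return [t for t in threading.enumerate()
                if not any(t.name.startswith(a) for a in ALLOWED)]
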
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul master: Add the process environment to zuul.conf parser  https://review.openstack.org/61282422:47
clarkbok I've tried to reproduce that locally and am having a very hard time. Seems to be fine from here22:56
clarkbran that specific test ~30 times and now running the full test suite to see if it is an interaction between test threads22:57
*** threestrands has joined #zuul23:02
openstackgerritClark Boylan proposed openstack-infra/nodepool master: Do not merge  https://review.openstack.org/61282823:08
clarkbI really dislike ^ but unsure of where else to look since local reproduction isn't working23:08
*** rlandy is now known as rlandy|bbl23:17
clarkbat least it caught one23:24
clarkbok I think the real-cloud thread is leaking across tests23:26
clarkbI'm going to guess this is a side effect of the openstacksdk release that happened23:29
clarkbbecause openstacksdk is going to run a thread for the api request throttling?23:29
ianwclarkb: so you're saying the new thread hasn't been skipped in the wait?23:51
clarkbianw: yes, though reading our task manager and nodepool fixtures I expect that this thread should be stopped23:56
clarkbianw: now that I have that info I'm going to try to hack together a reproduction locally by running the sdk integration test before the webapp test23:57
clarkbcurrently trying to figure out how to enforce test order23:57
