Wednesday, 2017-12-20

*** threestrands has joined #zuul00:26
*** dkranz has quit IRC00:54
openstackgerritJames E. Blair proposed openstack-infra/nodepool feature/zuulv3: Log provider names with quota  https://review.openstack.org/52917801:00
openstackgerritJames E. Blair proposed openstack-infra/nodepool feature/zuulv3: WIP: Fail on quota-exceeded (partial revert)  https://review.openstack.org/52917301:00
openstackgerritJames E. Blair proposed openstack-infra/nodepool feature/zuulv3: Assume a quota limit of -1 means unlimited  https://review.openstack.org/52918001:00
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Use yarn and webpack to manage zuul-web javascript  https://review.openstack.org/48753801:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Add babel transpiling enabling use of ES6 features  https://review.openstack.org/52829501:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Add StandardJS linting and analysis  https://review.openstack.org/52829601:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Fix source_url handling for jobs view  https://review.openstack.org/52837301:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Fix StandardJS warnings and turn them to errors  https://review.openstack.org/52829701:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Add bundle analysis to the lint target  https://review.openstack.org/52829801:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Make bundle of build web content  https://review.openstack.org/52837401:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Remove use strict  https://review.openstack.org/52843701:07
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Inject url endpoint information  https://review.openstack.org/52919301:07
mordredjeblair: ok. I got the unittest thing fixed more better - and also as of https://review.openstack.org/529193 the generated draft web code should totally work WRT internal links01:08
mordredtobiash: ^^01:08
mordredI have a few more followup thoughts on top of https://review.openstack.org/529193 that I want to explore01:09
tristanCmordred: jeblair: so to get the api documentation in sphinx doc, it seems like we need to go the other way around: write a swagger file, then use sphinxcontrib-openapi for the doc and aiohttp_swagger for the '/api/doc' route01:14
mordredtristanC: wow.01:14
tristanCopenapi sphinx extension looks pretty good to me, see: https://sphinxcontrib-openapi.readthedocs.io/01:15
openstackgerritJames E. Blair proposed openstack-infra/nodepool feature/zuulv3: Fail on quota-exceeded (partial revert)  https://review.openstack.org/52917301:17
openstackgerritJames E. Blair proposed openstack-infra/nodepool feature/zuulv3: Assume a quota limit of -1 means unlimited  https://review.openstack.org/52918001:17
mordredtristanC: oh - you know - I guess that's also because the sphinx plugin would not have any way of connecting the docs for a call with the route - since we register routes and methods in code01:17
mordredso that makes sense01:17
mordredI think going that route would allow us to have the normal live swagger api rest api doc thing, as well as the ability to craft a narrative document about the calls01:18
mordredhah. I like that they link the status codes to the w3c definition of the code :)01:19
tristanCi'll have a look, in any case, it seems better to have a standalone swagger file instead of doing it in docstring01:19
mordred++01:20
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Inject url endpoint information  https://review.openstack.org/52919301:34
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Make bundle of build web content  https://review.openstack.org/52837401:34
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Remove use strict  https://review.openstack.org/52843701:34
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Log provider names with quota  https://review.openstack.org/52917801:41
*** _ari_ is now known as ari|pto01:46
mordredwoot!01:56
mordredhttp://logs.openstack.org/93/529193/2/check/build-javascript-content/514252d/npm/html/status.html01:56
mordredtristanC: ^^ note that the jobs and builds links work properly - as do the websocket streaming links from the status page01:56
tristanCmordred: that's really neat, i'll rebase the new job page on top of it01:57
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Fail on quota-exceeded (partial revert)  https://review.openstack.org/52917302:04
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Fix build-javascript-content success-url  https://review.openstack.org/52920302:50
*** rlandy|rover has quit IRC03:01
*** threestrands has quit IRC03:03
*** threestrands has joined #zuul03:04
*** threestrands has quit IRC03:04
*** threestrands has joined #zuul03:04
*** threestrands has quit IRC03:05
*** threestrands has joined #zuul03:06
*** threestrands has quit IRC03:06
*** threestrands has joined #zuul03:06
*** threestrands has quit IRC03:07
*** threestrands has joined #zuul03:07
*** threestrands_ has joined #zuul03:57
*** threestrands has quit IRC03:57
*** threestrands_ has quit IRC03:58
*** threestrands_ has joined #zuul03:59
*** ianw is now known as ianw_pto04:57
tobiashjeblair: looking05:03
tobiashpabelanger: if you want old ready nodes being deleted you need to configure max-ready-age for the label05:04
tobiashThe default is infinite05:04
openstackgerritTobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Assume a quota limit of -1 means unlimited  https://review.openstack.org/52918005:15
*** bhavik has joined #zuul05:42
*** bhavik has quit IRC05:44
*** bhavik has joined #zuul05:45
*** bhavik has quit IRC05:54
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Assume a quota limit of -1 means unlimited  https://review.openstack.org/52918006:03
tobiashjeblair: that change just missed an 'import math' so I fixed it and approved based on the previous reviews ^^06:08
openstackgerritTobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Test that -1 works for infinite quota  https://review.openstack.org/52923106:19
tobiashjeblair: and that is a followup that would catch the missing import not only in pep8 but also in the unit tests ^^06:19
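A minimal sketch of the "-1 means unlimited" idea discussed above. It assumes the limit is normalized to `math.inf`, which the missing `import math` hints at but which the log does not confirm; the function names are hypothetical and are not nodepool's actual code.

```python
import math


def normalize_quota(limit):
    """Map a reported limit of -1 to infinity; pass other values through."""
    return math.inf if limit == -1 else limit


def has_capacity(limit, used, needed):
    """Return True if `needed` more units fit under the (normalized) limit."""
    return used + needed <= normalize_quota(limit)
```

With this normalization the comparison needs no special case for unlimited providers, and a unit test that passes -1 exercises the `math` import as well.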
tobiashjlk, SpamapS: I think you had a discussion about github caching issues some weeks ago. Did you solve your problems?06:35
tobiashI have a gate requirement to a label 'merge' and that matches sometimes and sometimes not non-deterministically06:36
*** threestrands_ has quit IRC06:57
*** flepied_ has quit IRC08:19
*** hashar has joined #zuul08:27
*** flepied has joined #zuul08:35
*** kmalloc has quit IRC08:41
*** jpena|off is now known as jpena08:44
*** flepied_ has joined #zuul09:51
*** flepied has quit IRC09:54
*** jeblair has quit IRC10:15
*** jeblair has joined #zuul10:16
*** flepied has joined #zuul10:17
*** flepied_ has quit IRC10:19
*** flepied has quit IRC10:24
*** flepied has joined #zuul10:25
*** jaianshu has joined #zuul10:37
jaianshuHi, does anyone know about this error: Ansible output: b"Can't find source path /opt/zuul-scripts:10:37
jaianshu No such file or directory"10:37
jaianshui'm able to run  ansible job but after that it fails - http://paste.openstack.org/show/629444/10:37
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Set remote url on every getRepo in merger  https://review.openstack.org/52929310:48
tobiashjaianshu: you probably configured trusted_ro_paths in zuul conf?10:49
tobiashthat is just an example how you can add files to the execution environment of ansible10:50
tobiashyou should just remove this config10:50
jaianshuyes, i did....ok10:50
*** hashar is now known as hasharAway10:54
*** yolanda__ has joined #zuul11:05
*** yolanda has quit IRC11:08
*** flepied_ has joined #zuul11:13
*** flepied has quit IRC11:14
openstackgerritRicardo Carrillo Cruz proposed openstack-infra/zuul feature/zuulv3: Add run_ansible_setup_on_start flag for executor  https://review.openstack.org/52930011:24
jaianshutobiash: now i'm getting  Ansible exit code: 2  ... Ansible complete, result RESULT_NORMAL code 2 for the Ansible Job, any idea what exit code: 2 means?11:24
rcarrillocruzmordred , jeblair : we talked about the runAnsibleSetup on start by default on all nodes not working for us (networking). I've been checking what we talked about maybe running that method to just nodes using ssh connection type (which relies upon the ansible_connection plumbing from nodepool up to zuul). However, that would be a bit complex, it would require I create a special purpose group on getHostList to11:27
rcarrillocruzput the groups to exclude on the inventory, so in runAnsibleSetup I can run the setup run against all excluding group not using ssh connection type11:27
rcarrillocruzlong story short: i think it is easier to have a general flag to disable it should it be needed11:27
rcarrillocruzwe would disable it, by default it will run setup on everything11:28
rcarrillocruzit's the patch above ^11:28
rcarrillocruzlet me know what you folks think when you get around11:28
rcarrillocruzother option would be changing runAnsibleSetup to do a 'ansible -m wait_for' , but in my experience is dodgy and it does not check end to end connectivity , which is a requirement for that function per jeblair11:30
openstackgerritRicardo Carrillo Cruz proposed openstack-infra/zuul feature/zuulv3: Add run_ansible_setup_on_start flag for executor  https://review.openstack.org/52930011:35
tobiashrcarrillocruz: that is an all-or-nothing config, what do you think about that as a job attribute (which could be set by the base job)?11:38
rcarrillocruzimho, i don't think that should be a per-job attribute. Zuul/Nodepool is responsible for giving a sane underlying infra to run tests (obviously, network can be lost in between, but still). If anything, maybe a node attribute?11:41
tobiashyes, maybe an attribute coming from nodepool11:42
tobiashthat also would allow mixed jobs with parts doing the setup and parts not11:42
rcarrillocruzi think all those options are not mutually exclusive tho. Thing is, v3 is supposed to be out of the door very soon, making schema changes to nodes and or jobs i don't think those will be in time11:44
rcarrillocruzthus my intent of having this all or nothing11:44
rcarrillocruzwhich is a valid use case too11:44
rcarrillocruzand getting it for 3.011:44
rcarrillocruzso it works for us11:44
tobiashrcarrillocruz: works for me11:44
tobiash(the all or nothing switch)11:45
rcarrillocruzi really think doing the per group inventory is not as elegant11:45
rcarrillocruzand moreover, we won't need that in the mid-term11:45
rcarrillocruzwe have plans to plugify gather_facts, so in case of networking it calls the needed modules11:45
rcarrillocruzlike11:46
rcarrillocruz'if gather_facts: yes and i'm on networking connection type, let's guess underlying OS. Oh good, it's IOS, then let's call ios_facts'11:46
rcarrillocruzthat kind of thing11:46
tobiashthat sounds good11:48
openstackgerritRicardo Carrillo Cruz proposed openstack-infra/zuul feature/zuulv3: Add run_ansible_setup_on_start flag for executor  https://review.openstack.org/52930012:06
openstackgerritRicardo Carrillo Cruz proposed openstack-infra/zuul feature/zuulv3: Add run_ansible_setup_on_start flag for executor  https://review.openstack.org/52930012:16
*** Wei_Liu has joined #zuul12:23
Wei_Liuhello, I used role "prepare-workspace" to sync executor src to work node, but it hang sometime when executing synchronize and got message "Output suppressed because no_log was given". Is there anyone who get same issue?12:24
Wei_Liuhello?12:36
*** yolanda__ is now known as yolanda12:53
*** jpena is now known as jpena|lunch12:58
Wei_Liuhello, I used role "prepare-workspace" to sync executor src to work node, but it hang sometime when executing synchronize and got message "Output suppressed because no_log was given". Is there anyone who get same issue?12:59
*** dmellado has quit IRC13:05
tobiashWei_Liu: you get the message because the task defines no_log to suppress log output13:07
tobiashWei_Liu: you can remove that in your deployment for debugging13:07
tobiashmordred: any idea what could be the cause of http://logs.openstack.org/93/529293/1/check/tox-py35/45c66c4/testr_results.html.gz ?13:08
tobiashthe log is not *that* helpful13:08
tobiashran the tests locally several times without problems :(13:08
*** openstackgerrit has quit IRC13:13
*** dmellado has joined #zuul13:15
*** dmellado has quit IRC13:19
*** dmellado has joined #zuul13:21
*** rlandy has joined #zuul13:23
*** rlandy is now known as rlandy|ruck13:24
*** jaianshu has quit IRC13:34
mordredtobiash: looking13:34
mordredtobiash: wow. that's ...13:36
tobiash... not helpful?13:36
tobiash;)13:36
tobiashI cannot reproduce that locally13:36
tobiashI did a recheck, let's see what this does13:37
mordredtobiash: yah - not helpful at all - I don't see anything obvious in the code ... trying to think of other things to try :(13:40
tobiashmordred: that looks pretty similar to the process-returncode failure we all see occasionally13:42
mordredyah13:43
mordredit's ... super unhelpful13:44
tobiashmordred: hrm, the recheck reproduced that13:46
tobiashmordred: there are two alarm clocks in the text log: http://logs.openstack.org/93/529293/1/check/tox-py35/8723364/job-output.txt.gz#_2017-12-20_13_25_07_77393413:48
tobiashthat might be somehow related13:48
mordredtobiash: oh - yah. that does seem like perhaps related13:49
tobiashmordred: a different change doesn't have the alarm clocks13:49
mordred:(13:49
tobiashand there are exactly two, like the two new tests13:49
tobiashmordred: one of these tests is relatively slow: http://paste.openstack.org/show/629457/13:51
*** rlandy|ruck is now known as rlandy|rover13:52
tobiashmaybe I disable that check see how that runs13:52
*** openstackgerrit has joined #zuul13:53
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Set remote url on every getRepo in merger  https://review.openstack.org/52929313:53
*** jpena|lunch is now known as jpena13:54
openstackgerritAndreas Jaeger proposed openstack-infra/zuul-jobs master: Do not use --ignore-missing-args for rsync  https://review.openstack.org/52932013:56
pabelangertobiash: Ah, thank you13:59
tobiashpabelanger: the nodepool patch?14:03
openstackgerritAndreas Jaeger proposed openstack-infra/zuul-jobs master: Do not use --ignore-missing-args for rsync  https://review.openstack.org/52932014:05
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Set remote url on every getRepo in merger  https://review.openstack.org/52929314:05
pabelangertobiash: re max-ready-age14:05
tobiashah ;)14:06
tobiashlost the context14:06
SpamapStobiash: I still have the caching issue.14:06
*** logan- has quit IRC14:08
tobiashSpamapS: :(14:10
*** logan- has joined #zuul14:10
SpamapStobiash: in debugging I got lost in trying to see if github3.py was actually using etags14:15
tobiashSpamapS: I think I'll first watch a curl to check if that's a github problem or github3.py problem14:16
SpamapStobiash: it _should_, but getting requests to log the reqs confused me.14:16
SpamapStobiash: yeah maybe try a curl and twiddle the labels.14:17
tobiashSpamapS: so the curl label watch changes pretty instantaneously14:22
tobiashI cannot see a noticeable delay there14:23
SpamapStobiash: is your curl using an etag?14:23
tobiashdoes github3.py itself do caching?14:23
tobiashno14:23
SpamapStobiash: github3 does I think14:23
SpamapSjust in-memory dict of resource<->etag14:24
SpamapSit uses requests' built-in caching14:24
tobiashSpamapS: looks like I'll first have to read what that does...14:24
SpamapSIIRC, you get an etag with every object. It stores that, and uses If-None-Match based on it.14:24
SpamapSI wonder if it's actually causing the problem.14:24
SpamapSInstead of using etag for invalidation, maybe github just keeps serving you the etag until it times out.14:25
SpamapSwhich would be... super annoying14:25
SpamapSbut we could probably fix things if that's true by invalidating cache on receiving webhooks for things.14:25
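A minimal sketch of the ETag flow SpamapS describes: store the ETag that arrives with each object, send it back as If-None-Match, and drop the cached entry when a webhook arrives. Everything here is an illustrative stand-in with no network access; `FakeServer`, `EtagClient` and the resource name are hypothetical and do not reflect how github3.py or requests implement caching.

```python
import hashlib


class FakeServer:
    """Stand-in for an API endpoint that supports ETags (hypothetical)."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.full_responses = 0

    def get(self, if_none_match=None):
        # The ETag is derived from the current content, so it changes
        # whenever the labels change.
        etag = hashlib.sha1(",".join(self.labels).encode()).hexdigest()
        if if_none_match == etag:
            return 304, etag, None  # Not Modified: no body is sent
        self.full_responses += 1
        return 200, etag, list(self.labels)


class EtagClient:
    """Keeps an in-memory dict of resource -> (etag, body)."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def get_labels(self, resource="issues/3/labels"):
        etag, body = self.cache.get(resource, (None, None))
        status, new_etag, new_body = self.server.get(if_none_match=etag)
        if status == 304:
            return body  # server confirmed the cached copy is current
        self.cache[resource] = (new_etag, new_body)
        return new_body

    def invalidate(self, resource="issues/3/labels"):
        """Drop the cached entry, e.g. when a webhook for it arrives."""
        self.cache.pop(resource, None)
```

In this model every lookup still reaches the server, so a label change is seen on the next call; staleness can only appear if some layer answers from the cache without sending a request at all.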
tobiashSpamapS: ok, curl watched the etag header and that also changes without delay14:27
tobiashalso the last modified header looks correct in the curl watch14:29
SpamapSso the next thing to wonder about is whether the internal caching turns around and just returns us the wrong thing without making a req at all.14:29
SpamapSassuming your curl is accurately reproducing what github3.py is requesitng14:29
tobiashI fear that means debugging github3.py14:30
SpamapSrequesting14:30
SpamapSyeah debugging github3.py and requests was hard14:30
SpamapSI just wanted to turn on debug logging and it was like o_O14:30
tobiashso that's my watch: watch -n0.1 'curl -v -s -H "Authorization: token xxxx" https://example.com/api/v3/repos/codecraft/cc-zuul-conf/issues/3/labels 2>&1'14:30
tobiashso next step probably would be a simple python script doing that loop with github3.py14:31
*** openstackgerrit has quit IRC14:33
*** openstackgerrit has joined #zuul14:34
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add consolidated role for processing subunit  https://review.openstack.org/52933914:34
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Remove testr and stestr specific roles  https://review.openstack.org/52934014:34
*** sshnaidm has quit IRC14:50
*** sshnaidm has joined #zuul14:50
tobiashSpamapS: I can reproduce it with a sample script: http://paste.openstack.org/show/629466/14:53
openstackgerritMerged openstack-infra/zuul-jobs master: Do not use --ignore-missing-args for rsync  https://review.openstack.org/52932015:00
*** dkranz has joined #zuul15:01
tobiashSpamapS: removing the cache control adapter fixes the caching issue15:01
SpamapStobiash: ugh.. well that's good and bad news15:05
SpamapSgood that it's fixed15:05
SpamapSbut bad that we have to figure out caching gaain15:05
SpamapSagain15:05
SpamapSbecause rate limiting is a pain15:05
tobiashSpamapS: so etag matched requests don't count to the rate limit?15:07
tobiashok, doc says that's true15:10
SpamapSRight that's why it was turned on I'm sure.15:11
SpamapSSo we just have to invalidate etags on webhook I think.15:11
tobiashSpamapS: I just tried out etags in curl15:14
jeblairtobiash: re nodepool quota -- i was thinking maybe we could have nodepool put the request back into the 'requested' state (without declining the request) after an unexpected over-quota error.  then the launcher can look at it again in the normal way and decide if it should take it based on its quota and other considerations.  it's a bit more work, but it's probably the closest we can get to letting the algorithm work as designed.15:14
tobiashand they seem to work correctly15:14
jeblairrcarrillocruz: what is your connection type?15:15
tobiashwhoa, context change15:15
rcarrillocruznetwork_cli15:15
rcarrillocruzcan be netconf too15:15
tobiashjeblair: I think that sounds good15:15
tobiashSpamapS: so I think the etag handling with the cachecontroladapter is somehow broken on client side15:16
*** sshnaidm is now known as sshnaidm|afk15:16
jeblairrcarrillocruz: i think i'd prefer to do something automatic based on the connection type.  we can re-write the inventory file (or make a second inventory file) so we don't have to have extra groups in it (which may affect how the job is run)15:17
jeblairrcarrillocruz: we eventually want things to work with mixed connection types -- a network appliance + 2 bare metal nodes, for instance -- so being able to have it adjust behavior to handle that would be good.15:19
mordredjeblair: I think making a second inventory file just for the setup filtered by connection type would be pretty straightforward15:19
openstackgerritFabien Boucher proposed openstack-infra/zuul feature/zuulv3: Attempt to improve tenant config loading in case of config issue  https://review.openstack.org/52906015:19
rcarrillocruzso15:19
rcarrillocruzleave one like15:19
rcarrillocruz'all'15:19
rcarrillocruzwith everything15:20
rcarrillocruzthen15:20
rcarrillocruz'ssh'15:20
rcarrillocruz'network_cli'15:20
rcarrillocruz...15:20
*** myoung is now known as myoung|bbl15:20
rcarrillocruzso at runAnsibleSetup we run it against 'ssh' inv file15:20
rcarrillocruzis that what you think of15:20
rcarrillocruz?15:20
mordredrcarrillocruz: I think it can be even easier ... we can either have a blacklist or a whitelist of connection types that setup should be run against - and write a setup inventory and a run inventory15:21
mordred(thinking out loud)15:22
jeblairyeah, i think that will be safer because we allow user-defined groups.  so if we kept one inv file, we would have to pick a name that would not collide ('_zuul_internal_ssh'), and then raise an error if the user named a group that way.15:23
mordredtobiash, SpamapS: I wrote a patch to os-service-types a few months ago that used that cachecontroladapter (because i'd seen it used in the gh driver) - and got feedback there that that cachecontrol adapter wasn't working fully correctly. I didn't complete the loop at the time because it seemed like it was working fine for our usecases ... but that could put some fuel on the 'the cachecontroladapter isn't15:23
mordredactually doing things right' fire15:23
jeblair2 inv files, we don't have to worry about that.  either way works though.15:23
mordredjeblair: yah - it also seems like polluting the test environment in a way we don't need to15:24
jeblairmordred: ++15:24
tobiashmordred: thanks for the info, exactly this seems to break us currently15:24
jeblairmordred: right, even if they don't create a group with that name, someone could still write a play against it, which is awkward.15:24
rcarrillocruztobiash: can you check my comment on https://review.openstack.org/#/c/501976/ when you get a chance15:25
jeblair2 files is best i think. :)15:25
rcarrillocruzi can't do the above without getting nodepool connection type up to zuul15:25
tobiashSpamapS, mordred: https://github.com/sigmavirus24/github3.py/issues/22615:25
tobiashso according to the open issues in github3.py caching is not yet fully supported15:27
tobiashSpamapS: so I suggest we first remove it and fix that when someone has time15:27
jeblairjlk: ^ fyi15:28
rcarrillocruzjeblair , mordred : whitelist/blacklist connection type, parameterizable via executor conf section or good to just hardcode now to whitelist = 'ssh'/'winrm' , blacklist=rest15:29
rcarrillocruzin other news , https://arstechnica.com/gadgets/2017/12/microsoft-quietly-snuck-an-ssh-client-and-server-into-the-latest-windows-10/15:30
tobiashrcarrillocruz: thanks, read that, but hadn't time to answer15:31
jeblairrcarrillocruz: let's hardcode for now.  we may need to make it configurable later, but let's see if we can keep up with the connection plugins for now, so it's easier for users (this is something where we know the "right" answer, so we should be able to do it without asking the user)15:32
rcarrillocruzk15:33
mordredjeblair, rcarrillocruz: ++ ... and I'd vote for blacklist fwiw15:33
tobiashrcarrillocruz: answered on 50197615:33
mordredsince running setup / gathering facts *probably* works on the majority of the connection types - and the ones where it doesn't are the exception15:34
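A minimal sketch of the two-inventory idea: one inventory for the job run that contains every host, and a second one for the setup step that leaves out hosts whose connection type is blacklisted. The data layout, the function name and the default of "ssh" for hosts without an explicit connection are assumptions for illustration; the blacklist values are simply the connection types named in this discussion, and none of this is Zuul's actual executor code.

```python
# Connection types for which the setup step is skipped (assumed values,
# taken from the connection types mentioned in the discussion).
SETUP_BLACKLIST = frozenset({"network_cli", "netconf"})


def split_inventories(hosts, blacklist=SETUP_BLACKLIST):
    """Return (setup_inventory, run_inventory) for a dict of host vars.

    `hosts` maps host name -> dict of variables. The run inventory keeps
    every host; the setup inventory drops blacklisted connection types.
    No extra groups are added, so user-defined group names cannot collide.
    """
    run_inventory = {name: dict(hostvars) for name, hostvars in hosts.items()}
    setup_inventory = {
        name: dict(hostvars)
        for name, hostvars in hosts.items()
        if hostvars.get("ansible_connection", "ssh") not in blacklist
    }
    return setup_inventory, run_inventory
```

Because the filtering happens when the files are written, a mixed node set (for example a network appliance plus two ordinary nodes) runs setup only where it can work, while the job itself still sees all hosts.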
rcarrillocruztobiash: fair enough, wanna rebase?15:40
rcarrillocruzi can too15:40
tobiashdoes it need a rebase?15:40
tobiashfeel free if you want15:40
rcarrillocruzyup, shows cannot merge15:40
rcarrillocruzk15:40
tobiashI'm just about pushing up a github change and have to run after that15:41
rcarrillocruzi can run it, np15:41
tobiashthanks15:41
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Remove github caching  https://review.openstack.org/52936115:42
openstackgerritRicardo Carrillo Cruz proposed openstack-infra/zuul feature/zuulv3: Use connection type supplied from nodepool  https://review.openstack.org/50197615:42
tobiashSpamapS, jlk: ^^15:42
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: DNM: Add github debugging script  https://review.openstack.org/52936215:44
tobiashSpamapS, jlk: this is a hacky debug script which helped me to find the issue ^^15:45
jeblairi'd be happy to merge that, fwiw15:45
tobiashjeblair: the debug script?15:46
jeblairyep15:46
tobiashthen I'll need to clean it up a bit to take parameters15:46
jeblairit seems like a nice bit of boilerplate for when we have a question about github15:46
tobiashk, I'll clean it up later and push without the DNM15:47
jeblairtobiash: either way -- i was just thinking it's something to start with any time someone needs to poke at the github api.  no big deal.  :)15:47
tobiashok, thinking as a debugging template that make sense as is15:48
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Add github debugging template  https://review.openstack.org/52936215:51
tobiashso that's a little nicer ^^15:51
rcarrillocruzah nice15:59
rcarrillocruzso the inventory you explicitly set it on the ansible.cfg the executor creates on the fly15:59
rcarrillocruzi.e. inventory.yaml15:59
rcarrillocruzso this will be shorter, i was worried the inventory was placed in a folder and runAnsible was just pointed to it, thus putting another file in there for the blacklist inventory may complicate my patch16:00
rcarrillocruzjeblair, mordred  : if you're cool with comments on https://review.openstack.org/#/c/501976/ , can we merge ?16:01
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Use python3 for docs publication  https://review.openstack.org/52571816:07
*** flepied__ has joined #zuul16:12
*** flepied_ has quit IRC16:15
SpamapStobiash: thanks for debugging. It's been driving my team mates crazy to label things and have zuul do nothing.16:27
tobiashSpamapS: it drove me crazy too :)16:34
tobiashwe'll go with v3 and github (with ghe apps) productive early january and I have to make this combination as useful as possible till then...16:36
SpamapStobiash: People don't like gerrit? ;)16:37
tobiashSpamapS: we have gerrit (in a very old version) but it was a decision to take github as the de-facto standard in open source world16:38
jlktobiash: oh man, that's really sad about caching. That's going to hurt our rate limits a lot :(16:38
tobiashit's part of a mind change we want to drive (more open and community driven)16:38
jlkI wonder if it's specifically a GHE problem, or something else?16:39
tobiashjlk: testing with curl, the ghe works correctly with etags16:39
tobiashso it must be something within the client16:39
jlktobiash: any hints as to what's borken on the github3.py side? I have commit rights there now...16:39
SpamapSjlk: I have it too with our GHE16:40
SpamapSseems like it is a problem with how github3.py uses etags16:40
tobiashjlk: unfortunately I have no clue how github3.py works internally16:40
tobiashmy current knowledge is that it doesn't work and I'm happy that I could fix that before going to production16:41
tobiashbut the debug script from above could help in fixing the cache side16:41
tobiashjust run it and toggle labels16:42
tobiashwith cache it takes 30s to notice the change, without cache it's without delay16:42
jlkmaybe we just weren't dealing with labels much in public github16:43
SpamapSwe did switch to reviews quickly16:43
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add role to fetch output from nodes  https://review.openstack.org/51184316:43
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Update upload-logs to process docs as well  https://review.openstack.org/51185316:44
jlkhrm, so I wonder if it's specifically an issue with labels16:44
SpamapSWell, I think it's an issue with fetching a single issue, labels or not.16:45
tobiashwhat I'm planning now is also to take over jamielennox's dynamic branch protection patch (471175) and make that work16:45
mordredSpamapS: is the cause that someone adds a label, which sends an event immediately which zuul notices, but then zuul has to fetch the issue in question to get the details and the fetched issue winds up not having the label?16:46
tobiashthat would give us the possibility to have a generic gate pipeline16:46
SpamapSmordred: exactly16:46
SpamapSbut we think maybe it's getting a 304 and not actually fetching16:46
SpamapSinstead of using the hook as a reason to drop the etag16:46
mordrednod. cool. I grok the issue then16:46
SpamapSit's kind of weird and hard to turn on requests debug logging16:47
SpamapSwhich it should not be16:47
mordredyah. it seems like if we get a hook event for an issue then we'd want to invalidate the cache for that issue16:47
jlkwell16:47
jlkI think we put faith in the GitHub API that it would know the data behind the GET has changed16:47
jlklike, that's the entire point of the ETAG16:47
mordredyah. good point16:48
jlkand as tobiash said, using CURL directly works16:48
tobiashmordred: github works perfectly there if we send the correct etag (tested via curl)16:48
SpamapSYeah I was speculating that it might be doing something different.16:48
jlkI had an open change somewhere I thought to add debug logging for github3.py16:48
tobiashI watched with curl in a loop getting 304, then toggled a label and got immediately the correct response after that16:48
SpamapSor maybe github3 is just not handling the cache miss properly.16:48
tobiashmaybe it's on a different level16:50
jlkcould be in requests somewhere16:50
tobiashgithub3.py makes 3 requests to get to the labels16:50
jlksince I think github3 just leans on that16:50
tobiashpull request -> issue -> label16:50
openstackgerritFabien Boucher proposed openstack-infra/zuul feature/zuulv3: Attempt to improve tenant config loading in case of config issue  https://review.openstack.org/52906016:50
mordredSpamapS: http://paste.openstack.org/show/629474/16:51
tobiashif pull request IS unchanged (hypothetical, didn't validate that yet) then github3.py might not request updates to the issue and labels at all16:51
mordredSpamapS: there's the recipe for enabling request level debug logging16:51
jlkafk for a moment, making breakfast.16:51
mordred(although obviously where it does logging.basicConfig() you'd want to do whatever logging config you're aiming for)16:51
jlktobiash: that could be it. Could be that because the data is indirect.16:52
tobiashmaybe16:52
jlkI can dig into this, but we're in the middle of refactoring these objects16:52
tobiashjlk: maybe the refactoring will fix that by accident ;)16:52
tobiashso if that's the case then something would need to be changed in github3.py anyway to not do separate caching (don't know if that's the case)16:54
jlkI plan to look into this today16:56
jlktobiash: thank you for uncovering this!16:58
SpamapSmordred: what i want is a recipe for logging.conf that does it :)16:58
tobiashjlk: thanks for looking into the caching :)16:59
tobiashjlk: maybe the problem is not github3.py but the cachecontrol adapter17:01
*** hasharAway is now known as hashar17:01
tobiashat least the log suggests that it does api requests regardless of the cachecontrol adapter17:02
mordredSpamapS: probably http://paste.openstack.org/show/629477/ ... but the "HTTPConnection.debuglevel = 1" is probably the missing piece17:02
tobiashhttp://paste.openstack.org/show/629478/17:03
mordredSpamapS: (so you might have to hack that in to place somewhere, which is sad17:03
tobiashthat's the same for each iteration17:03
tobiashso it looks like from point of github3.py it does requests17:03
tobiashmordred: that doesn't animate urllib to spit out logs :(17:05
tobiashmordred: the context is just urllib317:07
tobiashand now it's getting interesting:17:07
tobiashhttp://paste.openstack.org/show/629479/17:07
tobiashthat's with cache control adapter17:08
tobiashand that without the cachecontrol adapter: http://paste.openstack.org/show/629480/17:09
tobiashso the cache control adapter catches the requests and doesn't make one based on etags17:09
mordredtobiash: maybe add 'cachecontrol.controller' at the debug level ...17:10
tobiashtrying17:10
mordredtobiash: that seems to be where in the cachecontrol lib the logging is happening17:10
tobiashhttp://paste.openstack.org/show/629481/17:11
tobiashso it has a max age which we probably should suppress17:12
mordredtobiash: well, we have cache_etag=True, which http://cachecontrol.readthedocs.io/en/latest/etags.html#turning-off-equal-priority-caching indicates will use the etags17:25
tobiashmordred: yes, that's what the docs say17:25
tobiashbut not what it does...17:25
mordredtobiash: oh - actually - wait17:26
tobiashmordred: ah, I think github3.py would have to set the etag header17:26
mordredtobiash: there are two behaviors - Time Priority - ETag support only takes effect when the time has expired. and Equal Priority - decision to cache is either time based or due to the presence of an ETag17:27
mordredbut what *we* want is to always ignore time when there is an etag header17:27
mordredso an ETag Priority17:27
mordredfor that, I think we may need to do something like in http://cachecontrol.readthedocs.io/en/latest/custom_heuristics.html17:27
tobiashmordred: that's what I'm trying to do17:28
tobiashstripped out the cache-control header17:28
tobiashbut that defeats caching completely17:28
mordredtobiash: what if you strip expires? what all headers do you get from gh?17:28
tobiashlet me check17:29
tobiashah no, it might work17:30
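[Note: a sketch of the "strip the cache-control header" idea as a cachecontrol custom heuristic. The class name `DropCacheControlHeuristic` is hypothetical, and this is not the patch from the pastes or from review 529392. It assumes the heuristic interface described in the cachecontrol custom heuristics docs, where `update_headers(response)` is called before the response is cached. The import fallback exists only so the sketch runs without cachecontrol installed.]

```python
try:
    from cachecontrol.heuristics import BaseHeuristic
except ImportError:
    # Fallback so the sketch can be exercised without cachecontrol present.
    BaseHeuristic = object


class DropCacheControlHeuristic(BaseHeuristic):
    """Hypothetical heuristic: ignore max-age so only ETags drive caching."""

    def update_headers(self, response):
        # Remove the Cache-Control header (which carries max-age) in place,
        # leaving the ETag header untouched for later revalidation.
        if "cache-control" in response.headers:
            del response.headers["cache-control"]
        # No additional headers to merge into the response.
        return {}
```

Presumably an instance would be passed as the `heuristic=` argument of `cachecontrol.CacheControlAdapter`, next to `cache_etags=True`.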
tobiashhave to take a break17:30
tobiashbrb17:30
mordredsweet17:33
SpamapSI've almost never seen Expires work well.17:34
SpamapSEven when it's useful, it hides real problems.17:34
SpamapSPart of me is like "can we just fork a squid off?"17:35
*** apevec has joined #zuul17:36
pabelangerSo, looking to get some help with a deadlock issue we are seeing in nodepool for rdoproject. Today jpena has been working on debugging the core we have from the most recent deadlock in nodepool, and I'm curious if people would mind helping. https://review.rdoproject.org/etherpad/p/nodepool-core has the most recent info. We initially thought it might be paramiko causing the issue, but upgraded to 2.1.1 yesterday and17:38
pabelangerstill got a deadlock, right now I am not sure if this is a new deadlock or existing17:38
pabelangerso far, it seems a deadlock happens once a week, but this core is from less than a day17:39
pabelangerjpena feel free to add anything I might have left out17:39
jpenaso the symptoms are pretty similar to the paramiko issue pabelanger mentioned: nodepool hangs, stops sending any output to the logs, and kill -SIGUSR2 doesn't produce any output17:40
jpenawe got a core from this morning's hang, so I'm happy to get whatever additional debugging info is needed17:40
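[Note: for readers unfamiliar with the SIGUSR2 mechanism mentioned above, here is a minimal sketch of a signal-triggered thread stack dump. This is not nodepool's actual handler; the function names are hypothetical and only standard library calls are used.]

```python
import signal
import sys
import threading
import traceback


def format_thread_stacks():
    """Return the current Python stack of every thread as one string."""
    names = {t.ident: t.name for t in threading.enumerate()}
    out = []
    for ident, frame in sys._current_frames().items():
        out.append("Thread %s (%s)" % (ident, names.get(ident, "unknown")))
        out.append("".join(traceback.format_stack(frame)))
    return "\n".join(out)


def _on_sigusr2(signum, frame):
    # Write the dump to stderr when the process receives SIGUSR2.
    sys.stderr.write(format_thread_stacks() + "\n")


# SIGUSR2 does not exist on every platform, and handlers can only be
# installed from the main thread.
if hasattr(signal, "SIGUSR2") and \
        threading.current_thread() is threading.main_thread():
    signal.signal(signal.SIGUSR2, _on_sigusr2)
```

A Python-level handler like this only runs once the main thread returns to the interpreter loop, so a process stuck inside C code may stay silent, which would be consistent with the symptom jpena describes. The standard library's `faulthandler.register(signal.SIGUSR2, all_threads=True)` is a possible alternative on Unix, since it dumps tracebacks from a C-level handler.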
tobiashhooray17:46
tobiashI think I have it: http://paste.openstack.org/show/629483/17:46
jeblairjpena, pabelanger: so you think requests or urllib3 are stuck?17:46
jpenajeblair: that's my current suspect, it would explain why shade is stuck in self._finished.wait()17:47
tobiashthis is the changed cachecontrol: http://paste.openstack.org/show/629484/17:47
jpenajeblair: but I cannot confirm that17:48
jeblairjpena: what version of requests do you have?17:48
jpenajeblair: 2.11.117:49
openstackgerritJeremy Stanley proposed openstack-infra/zuul-base-jobs master: Add generic base and base-test jobs/playbooks  https://review.openstack.org/52614017:49
jeblairjpena: fwiw, we're running 2.18.4.  no idea if anything related has changed.17:50
pabelanger2.11.0 seems to have bug with hanging17:55
pabelangerhttps://pypi.python.org/pypi/requests17:55
pabelangerI should say, was fixed in 2.11.017:56
pabelangermight be worth upgrading rdo, is an older release17:57
apevecpython-requests-2.14.2 is in RDO >= Pike17:58
pabelangerrequests>=2.14.2  # Apache-2.0 is what global requirements is currently using, going to see if that is for a specific reason18:03
mordredapevec, pabelanger: also - we're working on getting the version of shade cut that includes the connection timeout bug fix18:06
pabelangermordred: great!18:07
jeblairjpena: what's thread 18 waiting on?18:07
mordredso it would be interesting to know if it's hanging in an initial network connection or not18:07
jeblairmordred: according to the etherpad, it's hanging during a socket close?  i think?18:08
mordredoh - I guess the stack shows me the answer to that - that seems to be in ssl close - so it's probably not hanging on initial connection18:08
mordredyah18:08
jeblairi'm confused though because thread 18 doesn't have a * or x next to it, but does seem to be waiting on something?18:08
jpenajeblair: the backtrace for thread 18 is interesting (and huge), see https://review.rdoproject.org/paste/show/61/18:09
jeblairPyThread_acquire_lock is the gil, right?18:09
jpenamordred: it's actually closing a connection that was reset. https://github.com/shazow/urllib3/blob/1.16/urllib3/connectionpool.py#L248-L25018:10
jeblairoh, i guess PyThread_acquire_lock could be locks other than the gil18:11
jeblairit's waiting on futex 0x1ba53f0 -- is there a way to find out what other thread has that lock?18:14
openstackgerritClark Boylan proposed openstack-infra/nodepool feature/zuulv3: Log unknown providers during quota calculation  https://review.openstack.org/52939118:15
jpenaI tried hard to get that, but I couldn't find it. That structure does not have an "owner" field18:15
jpenabrb, have to pick up kid18:15
jeblairwould a full bt of thread 31 confirm it's the other participant in the deadlock?18:17
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Fix github caching  https://review.openstack.org/52939218:17
tobiashjlk, SpamapS, mordred: this should fix (not remove) the github caching ^^18:17
mordredtobiash: so it did wind up being cache-control you needed to remove. neat18:22
tobiashmordred: that worked perfectly in my test script :)18:22
jeblairtobiash: does that warrant a # NOTE in the code?18:23
tobiashchecked the headers and I get correct 304 responses which proves that caching works18:23
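[Note: a toy, standard-library-only simulation of why a 304 response shows that ETag caching works. `FakeOrigin` and `EtagCache` are hypothetical stand-ins for the GitHub API and the caching adapter, and the header values are illustrative. The client always revalidates with `If-None-Match` and reuses the cached body when the server answers 304.]

```python
class FakeOrigin:
    """Stand-in server that honours If-None-Match."""

    def __init__(self, body, etag):
        self.body = body
        self.etag = etag
        self.full_responses = 0

    def get(self, headers):
        if headers.get("If-None-Match") == self.etag:
            # Resource unchanged: empty 304 instead of the full body.
            return 304, {"ETag": self.etag}, b""
        self.full_responses += 1
        resp_headers = {"ETag": self.etag,
                        "Cache-Control": "private, max-age=60"}
        return 200, resp_headers, self.body


class EtagCache:
    """Client cache that ignores max-age and always revalidates by ETag."""

    def __init__(self, origin):
        self.origin = origin
        self.entry = None  # (etag, body) of the last full response

    def fetch(self):
        headers = {}
        if self.entry is not None:
            headers["If-None-Match"] = self.entry[0]
        status, resp_headers, body = self.origin.get(headers)
        if status == 304:
            return self.entry[1]  # serve the cached body
        self.entry = (resp_headers["ETag"], body)
        return body


origin = FakeOrigin(b'{"labels": []}', '"v1"')
cache = EtagCache(origin)
first = cache.fetch()   # full 200 response, body cached
second = cache.fetch()  # revalidated with 304, cached body reused
```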
clarkband possibly a bug to cachecontrol?18:23
jeblair(to answer the question "why are we deleting headers?")18:23
tobiashjeblair: I'm just about to write that note ;)18:23
jeblaircool18:24
SpamapSossssuumm18:26
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Fix github caching  https://review.openstack.org/52939218:34
tobiashnow with a note ^^18:34
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Use connection type supplied from nodepool  https://review.openstack.org/50197618:35
jpenajeblair: full bt of thread 31: https://review.rdoproject.org/paste/show/62/18:43
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Add github debugging template  https://review.openstack.org/52936218:49
jeblairjpena: okay, so i think the interesting question is still what is thread 18 waiting for18:52
jeblairjpena: is it possible to get a full bt from all threads and see if any of them mention futex=0x1ba53f0 ?18:52
jeblair(other than 18)18:52
jpenajeblair: I've tried that, it's not elsewhere (I can upload the full bt, but it's > 600K)18:55
*** myoung|bbl is now known as myoung19:01
jlkugh, shakes fist at cachecontrol19:05
tobiashjlk: at least you can concentrate now on your refactoring ;)19:07
*** dkranz has quit IRC19:25
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Set remote url on every getRepo in merger  https://review.openstack.org/52929319:34
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: DNM: shall fail  https://review.openstack.org/52940319:34
SpamapStobiash: fantastic work chasing it down.19:35
tobiashSpamapS: thanks :)19:36
SpamapSI bounced off it twice.19:36
SpamapSSo bravo.19:36
*** jpena is now known as jpena|off19:37
*** rlandy|rover is now known as rlandy|rover|brb19:39
tobiashyeah, so I'm currently working hard getting this stuff running right19:42
tobiashI refused to go productive without gh apps, so now I have to deliver ;)19:42
tobiashbut now with that sorted out and the ghe apps patches and the merger url (529293) it looks pretty good now I think19:44
SpamapSYeah that's nice that the apps are there.19:48
SpamapSWebhooks kinda suck. ;)19:48
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: DNM: shall fail  https://review.openstack.org/52940319:49
openstackgerritMerged openstack-infra/zuul-jobs master: Fix build-javascript-content success-url  https://review.openstack.org/52920319:50
tobiashmordred: re strange build errors a few hours ago: http://logs.openstack.org/03/529403/2/check/tox-py35/c1e632d/testr_results.html.gz20:03
tobiashthat should fail one with an assertion and not fail the other test20:03
tobiashbut again both with the same non-log20:04
mordredtobiash: :(20:05
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Fix github caching  https://review.openstack.org/52939220:05
tobiashmordred: but ps1 where both shall fail works as expected: http://logs.openstack.org/03/529403/1/check/tox-py35/ea62da4/testr_results.html.gz20:06
tobiashthat is the test without simulating changing urls20:07
tobiashI wonder if that has something to do with this20:07
tobiashcan it be that file://foo:bar@/some/path doesn't work on the xenial nodes?20:08
tobiashlocally that works and gets just ignored (which I intended)20:08
mordredmaybe? lemme pull and try on my laptop real quick20:10
tobiashyou're on xenial?20:10
mordredyah20:10
mordredtobiash: well, for one thing I get this:20:12
mordredhttp://paste.openstack.org/show/629494/20:12
mordredtobiash: and then it hangs20:12
tobiashmordred: ah, so when it hangs that triggers alarm clock and inhibits log output20:13
tobiashthat makes sense20:13
tobiashah shit, choices is py3620:13
*** rlandy|rover|brb is now known as rlandy|rover20:14
mordredwhoops20:14
tobiashthanks mordred, I wouldn't have been able to find this20:14
mordredwell - at least that explains it!20:14
tobiash:)20:14
tobiashnow fixing is easy20:14
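[Note: assuming "choices" here means `random.choices`, which was added in Python 3.6 and so is missing on the py35 test nodes, a py35-compatible replacement could look like this sketch. The helper name `random_token` and its use for generating a string are hypothetical; the actual fix is in review 529293.]

```python
import random
import string


def random_token(length=8, alphabet=string.ascii_lowercase):
    """Build a random string without random.choices (works on Python 3.5)."""
    # random.choice picks one element per call, so repeat it `length` times.
    return "".join(random.choice(alphabet) for _ in range(length))


token = random_token()
```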
mordredsure thing - also - I should have started with pulling and running locally as well ... oh well20:14
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Set remote url on every getRepo in merger  https://review.openstack.org/52929320:19
tobiashhere we go, that should work ^^20:19
tobiashmordred: I've a comment on 51184320:28
tobiashmordred: and a question on 51185320:32
mordredtobiash: yes - on 511853 I was just having that same thought ... and am trying to come up with a 'good' way to tell if the dir is empty or not20:34
tobiashok20:35
tobiashmordred: yay, the remote url change is green now :)20:42
tobiashnote to myself: rtfm more precisely20:42
mordred\o/20:44
openstackgerritMerged openstack-infra/zuul-base-jobs master: Initial boilerplate, packaging and testing  https://review.openstack.org/52613921:06
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Log unknown providers during quota calculation  https://review.openstack.org/52939121:10
jeblairmordred: i left a couple of +0 style questions on 529193 when you have a sec21:12
*** threestrands_ has joined #zuul21:36
mordredjeblair: I do not see any questions?21:46
mordredjeblair: btw - did you notice that as of 529193 the links on the draft status page are fully working- including jobs, builds and the log streaming?21:48
jeblairmordred: er, sorry, 48753921:50
jeblairmordred: and yes!  or, well, i noticed in review that had happened, i hadn't actually looked myself yet.  but i believed you.  :)21:50
mordredjeblair: 487538 I hope? yes - there they are21:51
*** jappleii__ has joined #zuul22:35
*** threestrands_ has quit IRC22:36
*** apevec has quit IRC23:09
*** flepied__ has quit IRC23:30
*** hashar has quit IRC23:33
*** rlandy|rover is now known as rlandy|bbl23:47