Friday, 2018-06-15

*** GonZo2000_ has quit IRC00:04
*** GonZo2000_ has joined #zuul00:07
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: Fix paused handler exception handling  https://review.openstack.org/57551500:08
pabelangerjust catching up on zuul commits this week, is there any more info on https://review.openstack.org/#/c/574788/ ? from the commit it seems like a potential security issue. Are we looking to create 3.0.4 to include it?00:10
tristanCpabelanger: iirc, jeblair mentioned doing a 3.1.0 instead because of some other changes to the default behavior00:12
pabelangerk, that would get ansible 2.5 too00:12
*** GonZo2000_ has quit IRC00:17
*** rlandy is now known as rlandy|afk00:29
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: angular6 fix suggestion  https://review.openstack.org/57349401:30
*** rlandy|afk is now known as rlandy01:36
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Add support for Ansible extra-vars flag  https://review.openstack.org/54647402:10
tristanCtobiash: not sure what's going on, but current master version of zuul is failing localhost command task with "msg": "zuul_log_id missing"02:51
tristanCwith those zuul_json exceptions in logs: http://paste.openstack.org/show/723525/02:52
tristanC(i mean on my local deployment)02:56
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Add missing __init__.py files  https://review.openstack.org/57559103:09
tristanCtobiash: nevermind, found the issue: the action-general directory was missing without 57559103:10
*** rlandy has quit IRC03:27
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Add support for Ansible extra-vars flag  https://review.openstack.org/54647403:32
tobiashpabelanger: yes, I also planned to write a mail about the security fix but that has fallen down because of the other ansible problems04:07
tobiashtristanC: phew, my first thought was 'fuck, what did I miss' because the test cases worked04:08
*** gouthamr has quit IRC04:22
*** dmellado has quit IRC04:23
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Remove failed_when when creating /tmp/console-None.log  https://review.openstack.org/57467204:53
tristanCtobiash: sorry about that :-) Thanks for the many fixes you worked on!05:01
tobiashtristanC: pep8 doesn't like your change05:02
tobiashwe may need to change the directory name05:03
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Rename action-general to actiongeneral  https://review.openstack.org/57561705:09
tobiashtristanC: ^05:09
tobiashthat renames that directory and adds the __init__.py05:10
tobiashI'll leave the other __init__.py files to you as they're unrelated05:10
openstackgerritMerged openstack-infra/zuul master: Add supercedent pipeline manager  https://review.openstack.org/57193205:15
*** pcaruana has quit IRC05:18
*** gtema has joined #zuul05:34
*** gtema has quit IRC06:09
*** dmsimard has quit IRC06:13
*** dmsimard has joined #zuul06:27
*** gtema has joined #zuul06:41
*** pcaruana has joined #zuul06:44
*** swest has quit IRC07:33
*** swest has joined #zuul07:34
*** jpena|off is now known as jpena07:47
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: job: add ansible-tags and ansible-skip-tags attribute  https://review.openstack.org/57567207:50
*** hashar has joined #zuul07:51
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: zk: use kazoo retry facilities  https://review.openstack.org/53553708:18
*** threestrands has quit IRC08:20
*** gouthamr has joined #zuul09:02
*** electrofelix has joined #zuul09:03
*** GonZo2000_ has joined #zuul09:20
*** GonZo2000_ has quit IRC09:21
tobiashhrm, I'm currently trying to react to changes of the branch protection settings09:32
tobiashbut it seems that github doesn't offer any event for this09:32
tobiashjlk: is that correct?09:32
tobiashjlk: if yes, that would be an important feature in the long run09:33
openstackgerritFabien Boucher proposed openstack-infra/zuul master: Add tenant yaml validation option to scheduler  https://review.openstack.org/57426509:47
*** egonzalez has joined #zuul10:18
*** GonZo2000_ has joined #zuul10:27
*** yolanda has quit IRC11:03
*** yolanda has joined #zuul11:06
*** jpena is now known as jpena|lunch11:12
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: Fix paused handler exception handling  https://review.openstack.org/57551511:13
*** jpena|lunch is now known as jpena|off11:38
*** jpena|off is now known as jpena12:05
*** rlandy has joined #zuul12:19
*** rlandy_ has joined #zuul12:46
*** gouthamr has quit IRC12:46
*** rlandy_ has quit IRC12:48
*** rlandy has quit IRC12:48
*** rlandy_ has joined #zuul12:48
*** rlandy_ is now known as rlandy12:49
*** myoung|off is now known as myoung12:49
*** egonzalez has quit IRC12:54
Shrewsmordred: pabelanger: clarkb: corvus: we should try to get 575515 ^ merged12:57
Shrewsps1 shows the failure w/o the fix12:58
mordredShrews: lgtm - should we wait for corvus to look before landing? looks fairly straightforward to me13:03
Shrewsmordred: actually, that test has a slight race in the last assert... lemme fix13:04
mordredkk13:05
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: Fix paused handler exception handling  https://review.openstack.org/57551513:05
Shrewsok, that should do it13:05
Shrewsmordred: we can wait. just want it to go in in case we restart any launchers soonish13:05
mordredShrews: cool. I mean, it seems like one we can just land - but I know things related to locking/unlocking/etc benefit from all the eyeballs13:07
Shrewsall it really does is set self.paused = False (no locking involved), so we can just land it with another +213:08
Shrewstobiash: if you wanna do the honors ^^13:08
tobiashShrews: _a13:09
tobiash+a13:09
Shrews_a should totally be a gerrit op13:09
tobiashlol13:10
mordredtristanC: left some comments on https://review.openstack.org/#/c/57349413:14
*** elyezer has joined #zuul13:50
pabelangerShrews: great work on leak14:01
Shrewspabelanger: well, i *think* that fixes the leak. it definitely does fix the request lingering. i feel like they're related14:03
openstackgerritMerged openstack-infra/nodepool master: Fix paused handler exception handling  https://review.openstack.org/57551514:11
mordredtobiash, tristanC: I believe we need to squash your action-general/action-general patches because of pep814:32
mordredoh - nevermind - the tobiash change does the whole thing14:33
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Add missing __init__.py files  https://review.openstack.org/57559114:35
tobiashjust got an interesting zuul misbehavior14:43
tobiashone of my users has a pull request with more than 2000 files and a change to zuul.yaml14:43
tobiashand zuul ignores the dynamic zuul.yaml change14:44
tobiashit looks like zuul takes the changed file list from the github event and github caps that list at some point14:44
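[Editor's sketch] One way to guard against a capped file list is to compare the number of files actually listed with the total the pull request reports, and to assume a config change whenever the list looks incomplete. This is a minimal sketch and not Zuul's implementation: the function names are hypothetical, and it assumes the pull request data carries a `changed_files` count alongside the list of paths.

```python
# Hypothetical helpers; not part of Zuul. They assume the pull request
# data is a dict with a "changed_files" total, and that the file paths
# were collected separately (and may have been capped upstream).

CONFIG_FILES = ("zuul.yaml", ".zuul.yaml")
CONFIG_DIRS = ("zuul.d/", ".zuul.d/")


def file_list_is_complete(pull_request, listed_files):
    """Return True only if we hold as many paths as the PR reports."""
    expected = pull_request.get("changed_files")
    if expected is None:
        # Without a total we cannot prove the list is complete.
        return False
    return len(listed_files) >= expected


def may_touch_zuul_config(pull_request, listed_files):
    """Be conservative: an incomplete list may hide a config change."""
    if not file_list_is_complete(pull_request, listed_files):
        return True
    return any(
        path in CONFIG_FILES or path.startswith(CONFIG_DIRS)
        for path in listed_files
    )
```

With this shape, a truncated list costs an extra dynamic config load rather than silently skipping one.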
jlktobiash: hrm, that's a good question. Let me do some spelunking, but I'm pretty sure that's not an event.14:48
tobiashjlk: that just destroyed my plans to solve a few scaling issues we're starting to hit with github :(14:49
jlkYeah, you'd have to do it on a timer or something.14:49
tobiashjlk: I wanted to start caching of the project branches and that would be an important part in that concept14:49
jlkis it an API limit you're reaching, where we're doing an expensive API call too often?14:49
jlkI _thought_ that a way forward for that was to move to graphql and limit the number of transactions, but that's a long delayed project14:50
tobiashjlk: it's more of a zuul hangs often for a while during tenant reconfiguration querying hundreds of repos for their branches14:50
mordredyah - and if you could keep a cache that you update just from the PR event info it would be cheaper ...14:51
jlkoh.... why does it need every branch?14:51
tobiashmaybe I'll try out some stuff with graphql14:51
mordredbut if the pr event info is incomplete, then you'd still have to do a query14:51
tobiashjlk: it queries the project branches when building the global config14:51
jlkit has to look for .zuul files in every branch doesn't it.14:51
mordredtobiash: does the payload indicate that it's a truncated file list?14:51
mordredjlk: yah14:51
tobiashmordred: I didn't have a look at the payload yet14:51
jlkin theory with GraphQL you can ask for all branches in a set of repos with one query14:52
jlkthat query will have a cost. I don't know what that cost will be14:52
tobiashour github has no rate limit configured14:52
jlkGraphQL limits are based on query cost, not transactions14:52
jlkin theory the return will be faster because it's like one SQL query rather than 20014:52
tobiashbut I guess one big query is not as bad as hammering github for 30s while zuul is stalled in the eyes of the users14:53
mordredjlk: one would imagine that the graphql query for that would be less expensive than the multiple normal queries ... or would hope so :)14:53
jlkit should be less expensive overall, yes14:53
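[Editor's sketch] The single-query idea jlk describes could look roughly like this: build one GraphQL document that asks for the branch refs of every repository under its own alias, then read the branch names back out of the response. The helper names and the example organisation are hypothetical, the query cost is unknown (as noted above), and the sketch does not page past the first `per_repo` branches of a repository.

```python
import json


def build_branch_query(repos, per_repo=100):
    """Return one GraphQL document asking for the branches of every repo.

    repos is a list of (owner, name) tuples. Each repository gets an
    alias (r0, r1, ...) so all of them fit in a single request.
    """
    fields = []
    for index, (owner, name) in enumerate(repos):
        fields.append(
            "  r%d: repository(owner: %s, name: %s) {\n"
            "    refs(refPrefix: \"refs/heads/\", first: %d) {\n"
            "      nodes { name }\n"
            "    }\n"
            "  }" % (index, json.dumps(owner), json.dumps(name), per_repo)
        )
    return "query {\n" + "\n".join(fields) + "\n}"


def branches_from_response(repos, data):
    """Map "owner/name" to branch names from the response's data dict."""
    result = {}
    for index, (owner, name) in enumerate(repos):
        repo = data.get("r%d" % index)
        # A missing or inaccessible repository comes back as null.
        nodes = repo["refs"]["nodes"] if repo else []
        result["%s/%s" % (owner, name)] = [node["name"] for node in nodes]
    return result
```

For the 200 repositories with 5 branches each mentioned later in the discussion, this would be one request instead of 200, although a large document may need to be split into batches.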
jlktobiash: can you go on a great branch reaping mission? I know internally at GitHub we do all dev in the canonical repo, rather than forks, but we're very active about deleting our PR branches14:54
*** myoung is now known as myoung|bbl14:54
tobiashwell it's not that every project has 300 branches but we have now 200 repos with 5 branches each14:55
jlkI see, so definitely more about a single thread traversing a list of 200 projects14:55
tobiashjlk: and in fact the tenant reconfig only queries the repos for the branches (the config itself is cached)14:55
tobiashso it scales with the number of repos14:55
tobiashand that is growing now so it's getting a pain slowly14:56
jlkis it single threaded right there?14:56
tobiashyes14:56
tobiashand it blocks the scheduler event loop14:56
jlkcan we async it, parallel it? (that may run you into API limits much quicker)14:56
tobiashmaybe14:56
mnasercould we add [zuul] to the mailing list emails14:56
jlkmnaser: I'm curious why?14:57
mnasereasy to identify in my mailing list and filters14:57
tobiashbut I think that might be difficult as that would dramatically change how the tenant reconfiguration works14:57
tobiashso caching of the project branches would solve that issue14:57
jlkmnaser: but list-id is in the headers, which is way better for filtering.14:58
tobiashmaybe I can get something that is good enough for now14:58
mordredmnaser: List-Id: "Discussion of Zuul usage and development." <zuul-discuss.lists.zuul-ci.org>14:58
mordredmnaser: is what I filter on14:58
jlkor just simply the 'zuul-discuss.lists.zuul-ci.org' bit14:59
mordredyah14:59
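[Editor's sketch] Filtering on the List-Id header rather than a subject tag can be done with Python's standard `email` module. This is only an illustration: the function names are hypothetical, and the list identifier is the one mordred quotes above.

```python
import email
import re

# The identifier inside the angle brackets of the List-Id header.
ZUUL_LIST_ID = "zuul-discuss.lists.zuul-ci.org"


def list_id(raw_message):
    """Return the bare list identifier of a raw message, or None."""
    message = email.message_from_string(raw_message)
    header = message.get("List-Id", "")
    match = re.search(r"<([^>]+)>", header)
    return match.group(1).strip().lower() if match else None


def is_zuul_discuss(raw_message):
    """True if the message was delivered through the zuul-discuss list."""
    return list_id(raw_message) == ZUUL_LIST_ID
```

Matching only the part in angle brackets means the filter keeps working if the descriptive phrase in front of it changes.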
*** pcaruana has quit IRC15:01
*** gouthamr has joined #zuul15:19
*** elyezer has quit IRC15:23
rcarrillocruzhelo helo folks. Turns out we are writing a bunch of roles at ansible-networking, some of them will interact with cloud providers (like for example creating cloudformation stacks in AWS and the likes). I'm thinking the cloud creds can be stored on the zuul tenant as secret. As for the job, maybe the static driver node is the best match? The reason I'm asking is cos pre v3, the static node would have those kind15:27
rcarrillocruzof secrets stored in filesystem with config mgmt tools (i'm remembering for pypi upload), but given that now Zuul has secrets natively what's the advantage/disadvantage of using a static node vs an ephemeral node for such kind of jobs?15:27
clarkbrcarrillocruz: there isn't really one, openstack uses all dynamic ephemeral nodes now and we have deleted our static nodes15:31
clarkbrcarrillocruz: static nodes are more of a bootstrap without cloud option I think15:31
rcarrillocruzso static nodes now are merely 'i need to run jobs here cos it has a leg on $network' ? that's the use case i can just think of now15:32
clarkbrcarrillocruz: ya or maybe you are using zuul for deployment too and you statically define your production env or something15:33
clarkbI'm sure there are more use cases but they don't need to be tightly coupled to secrets anymore15:33
rcarrillocruzsweet, thanks for the prompt response (as always)15:33
*** gouthamr has quit IRC15:38
*** gouthamr has joined #zuul15:39
*** jpena is now known as jpena|off16:27
*** jpena|off is now known as jpena16:29
*** jpena is now known as jpena|away16:29
*** gouthamr has quit IRC16:50
*** gouthamr has joined #zuul16:50
mordredrcarrillocruz: fwiw, you can also run zuul jobs nodeless16:52
mordredrcarrillocruz: depending on what the content of the job is, of course, but if all the job is doing is calling to a remote web service, you don't *actually* need to spin up a VM or a static node to accomplish that16:53
mordredrcarrillocruz: you'd need to define those jobs in a config-project so that they'd be trusted jobs - but it's possible that's a useful option for you16:54
*** gouthamr_ has joined #zuul17:27
*** gouthamr has quit IRC17:27
*** gouthamr_ is now known as gouthamr17:27
*** jpena|away is now known as jpena17:47
*** electrofelix has quit IRC18:05
*** bhavik1 has joined #zuul18:11
*** jpena is now known as jpena|off18:13
*** bhavik1 has quit IRC18:18
corvusfyi, i'll be in china for a conference next week so my availability may be limited and different.  but i should be able to review changes on the plane with gertty :)18:24
corvusplease remember to fill in https://etherpad.openstack.org/p/zuul-update-email18:28
*** rlandy_ has joined #zuul18:31
*** rlandy_ has quit IRC18:31
*** rlandy_ has joined #zuul18:31
*** rlandy has quit IRC18:33
*** gtema has quit IRC18:34
*** rlandy_ is now known as rlandy18:36
rcarrillocruzmordred: yeah, it's just that i feel uneasy about doing that, what if the change has a bogus delegate_to localhost and reviewer overlooks it or the likes18:54
mordredrcarrillocruz: yah, totally18:54
mordredrcarrillocruz: just wanted to mention it for completeness18:54
rcarrillocruzhow post-review works, would it help for this case?18:55
rcarrillocruzalso, was thinking the other day about voting false in the GH scenario , is it a thing, providing there's no +1/-1 on GH18:55
clarkbcorvus: before the week ends are you still planning a new zuul release?18:59
clarkbI haven't explicitly followed up on the logging None issue with tobiash's changes deployed yesterday but haven't heard of problems18:59
*** rlandy has quit IRC19:15
*** rlandy has joined #zuul19:18
*** rlandy_ has joined #zuul19:39
*** rlandy has quit IRC19:41
corvushttps://review.openstack.org/564220 previously had the logging None issue, and the recheck i recently issued passed the devstack-multinode job, so i think that adds extra confirmation about the fix21:02
corvusi'd like to tag 057d664ecc2ce151789e2488250b5e1da36d48a3 as 3.1.021:02
corvusclarkb, tobiash, fungi, mordred, anyone else: ^ thoughts21:03
tobiashcorvus: which one is that?21:04
* tobiash is not at the computer21:04
clarkblooking at log21:04
corvustobiash: it's right before supercedent.  it's what we restarted openstack-infra with yesterday.  it includes all the logging fixes21:05
fungiwe seem to be running 06df3bb on our systems at the moment21:05
tobiashcorvus: fine for me21:05
clarkbMerge "Remove extra argument when logging logger timeout"21:05
corvusfungi: running, or have installed?21:05
clarkbI'm good with 057d664ecc2ce151789e2488250b5e1da36d48a3 as 3.1.021:06
fungioh, good point. installed not restarted since then21:06
* fungi rechecks assumptions21:06
corvusfungi: ya, the bottom of http://zuul.openstack.org/ has the sha the scheduler is running at least21:06
corvuswhich is 05721:06
mordred++ to tag21:07
fungiyeah our zuul-web says 057d66421:07
fungiokay, makes sense21:07
fungiso basically tagging the merge before master tip21:08
fungisounds great to me21:08
corvusyeah.  if we restart the scheduler now we'd get a new pipeline manager.  :)21:08
fungithat gets us the multinode console log fix and the unreachable node information leak fix as big ticket changes since last point release?21:09
clarkbfungi: yup21:09
corvuserm, i had some minor dyslexia when making the tag.  can folks double check this before i push it up?  http://paste.openstack.org/show/723564/21:10
clarkbcorvus: that commit is one I have locally and appears to be the one above21:11
fungilgtm21:11
corvus(to be clear, i caught my error and fixed it, i think -- i just want to make sure i really did get the version number and sha right)21:11
clarkbcorvus: version and sha lgtm21:11
clarkb3.1.0 because it adds ansible 2.521:12
clarkbsha1 is one above21:12
corvus3.1.0 pushed21:12
corvusand, happily, 0.3.1 still does not exist21:12
corvusi'm clearly not used to such large version numbers21:13
funginext up, zuul 10!21:13
corvusi only learned to count to 2 in sound engineering classes21:13
corvusand only up to 1 in computer science21:14
corvus3 just boggles the mind21:14
fungizuul hrair21:16
*** GonZo2000_ has quit IRC21:32
*** hashar has quit IRC22:10
*** rlandy_ is now known as rlandy22:15
jesusaurwhere can I update the zuul-ci docs? specifically I want to add an openSuse environment section to https://zuul-ci.org/docs/zuul/admin/zuul-from-scratch.html22:33
clarkbjesusaur: they are in the zuul repo itself22:34
clarkbzuul/doc/source/admin looks like22:34
jesusauraha, there it is, thanks!22:35
corvusi've observed a couple of things slowing down openstack-infra's zuul.  1) we run cat jobs on too many project-branches on tenant reconfiguration because we don't distinguish between a cache invalidation and not having any data to start with22:36
corvus(about 2k of our 6k project-branches have no zuul config on them, so we query the mergers for them every time)22:36
corvus2) we trigger tenant reconfiguration on branch deletions from random github repos where we aren't actually loading any config22:38
corvuspresumably we do the same for branch creation22:39
corvusi don't think even setting exclude_unprotected_branches would help here -- that's only honored during the configuration itself, it doesn't impact whether a configuration is triggered22:39
corvusthe fix for #1 is obvious22:40
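[Editor's sketch] The distinction behind issue 1 is between "we looked and this project-branch has no config" and "we have no data for it yet". A cache can keep the two apart with a sentinel, so that only genuinely unknown entries trigger a cat job. The class below is a hypothetical illustration and not Zuul's actual cache.

```python
# Sentinel meaning "never fetched, or invalidated since the last fetch".
_UNKNOWN = object()


class ConfigCache:
    """Hypothetical per-(project, branch) cache of Zuul config text."""

    def __init__(self):
        self._data = {}

    def store(self, project, branch, config_text):
        # config_text may be None: we looked, and there is no config.
        self._data[(project, branch)] = config_text

    def invalidate(self, project, branch):
        # Forget the entry so the next reconfiguration fetches it again.
        self._data.pop((project, branch), None)

    def get(self, project, branch):
        return self._data.get((project, branch), _UNKNOWN)

    def needs_cat_job(self, project, branch):
        # Only unknown entries need a merger query; a stored None does not.
        return self.get(project, branch) is _UNKNOWN
```

Under this scheme the roughly 2k project-branches without config would be queried once and then skipped until something invalidates them.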
corvusthe fix for #2 i'm not sure about -- should we special case repos where no configuration is loaded and ignore those events?  should we try to do something with exclude_unprotected_branches?  (i worry about being too aggressive there because even if no configuration files change, the configuration itself can differ whenever a repo creates the second branch (it goes from unbranched to branched)).22:44
corvusmaybe we could look at exclude_unprotected_branches but also be aware of the second-branch issue, and filter out useless create/delete events, unless it's creating or deleting the second branch22:45
clarkbcorvus: do pull requests generate useless branch creation events?22:49
clarkbor are those distinct?22:49
corvusclarkb: i believe those are distinct22:50
corvusthe events i observed in prod i traced back to real branch creation events.  one of the third-party repos we're watching is branch-happy22:50
clarkbah22:50
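[Editor's sketch] One reading of the filter corvus proposes for issue 2: ignore branch create/delete events from repos that load no configuration, ignore unprotected branches when exclude_unprotected_branches is set, but always reconfigure when a repo gains or loses its second branch. Everything here (the event class, its fields, the function name) is hypothetical and only illustrates the decision logic. It is not Zuul code, and whether the second-branch rule should override the unprotected-branch rule is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BranchEvent:
    """Hypothetical summary of a branch creation or deletion event."""
    project: str
    branch: str
    created: bool        # True for creation, False for deletion
    protected: bool      # whether the branch is protected
    branches_after: int  # number of branches once the event is applied


def should_reconfigure(event, loads_config, exclude_unprotected):
    """Decide whether a branch event should trigger a tenant reconfig."""
    if not loads_config:
        # No configuration is loaded from this repo, so nothing changes.
        return False
    # Going from one branch to two (or back) switches the repo between
    # unbranched and branched, which can change the effective config.
    second_branch_created = event.created and event.branches_after == 2
    second_branch_deleted = (not event.created) and event.branches_after == 1
    if second_branch_created or second_branch_deleted:
        return True
    if exclude_unprotected and not event.protected:
        return False
    return True
```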
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Log more information about what events trigger a reconfig  https://review.openstack.org/57587122:53
corvusi was able to deduce the info without that ^ but it took a long time.  that should make it clearer in the future22:53
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Log more information about what events trigger a reconfig  https://review.openstack.org/57587122:54
clarkbthat change prompted me to check zuul sched for the logging fix I did and we now have github delivery events in the logs properly22:57
corvusclarkb: yeah, those were really helpful :)23:14
