Wednesday, 2019-02-13

tristanCcorvus: indeed, it seems like the limit is somehow applied to the number of build?00:02
corvustristanC: i suspect it's that join to the build table00:03
corvustristanC: i'm still skeptical that we need to query by build name... so i think the easiest/best solution would be to drop that00:04
tristanCcorvus: it should work though, but your suggestion works for me too00:05
SpamapSjlk: fascinating. I used to rely on the status triggers though, so that I could require a Jenkins check, but if we failed ot merge because jenkins was lagging, we had a trigger source of the jenkins status check, to kick the gate off again.00:06
corvustristanC: yeah, i'm sure we can make it work, but unless we really think querying by job is important, this will make the code+query simpler and more efficient.00:06
SpamapSbut seems like a corner case00:06
corvustristanC: the *main* thing here is to get SpamapS his buildset status page ;)00:06
jlkSpamapS: hrm. If Jenkins was using the checkAPI though I think that would still work00:06
SpamapScorvus: yes! :) actually lately I've been pointing at builds with a change filter, and that has made my users happier.00:07
jlkSpamapS: because there would be a check suite for Jenkins, and you'd get check suite events saying Jenkins started or whatnot, and you could query to see if all check suites were complete.00:07
SpamapSbut a buildset would be ideal.00:07
tristanCcorvus: can you please revisit your -2 on https://review.openstack.org/573473 , the last PS correctly handle multi-parent jobs00:07
SpamapSjlk: ah yeah, that's the answer than.00:07
jlka built set == all the jobs done for a particular change?00:07
SpamapSjlk: correct (but don't get confused, corvus and tristanC are talking about other things. ;)00:08
corvustristanC: awesome!  i will do that first thing tomorrow, i have to EOD now00:08
SpamapSbuildset would actually map pretty well to the check run though.00:08
jlkSpamapS: right, but I'm just thinking that checks API solves that too.00:08
jlkeach job is represented as a check in the check run00:08
SpamapSya00:09
tristanCcorvus: and your -2 on https://review.openstack.org/555153 too please. Please explain why a job filter is a no-go.00:09
*** jamesmcarthur has joined #zuul00:10
jlker. sorry. Each job would be a check_run in the check_suite00:10
clarkbthats the other thing that the check api would give us iirc, better breakdown of individual jobs (though the buildset page in zuul side still a good idea)00:10
clarkbis the check api something available in GHE too? as thats another thing to consider with our GHE users00:11
jlkI can't recall just yet, but I _think_ so00:11
jlkhttps://developer.github.com/enterprise/2.16/v3/checks/00:11
jlkIt was beta in 2.14, GA in 2.1500:12
clarkbcool so more than one release even00:13
SpamapSya the features seem to flow into GHE at a pretty impressive pace00:14
*** jamesmcarthur has quit IRC00:16
jlk... it may be a while before the Actions API is available there.00:16
SpamapSmeh ;)00:16
*** manjeets has quit IRC00:18
*** manjeets has joined #zuul00:19
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: remove build and job_name filter from the buildset route  https://review.openstack.org/63650400:22
*** sdake has quit IRC00:26
*** sdake has joined #zuul00:47
*** openstackgerrit has quit IRC00:52
*** jamesmcarthur has joined #zuul00:53
*** jamesmcarthur has quit IRC00:54
*** jamesmcarthur has joined #zuul01:04
jheskethcorvus: did I miss the discussion about zuul at the PTG? What was the reason for not having an official room if you don't mind me asking?01:10
*** sdake has quit IRC01:14
*** sdake has joined #zuul01:17
*** sdake has quit IRC01:29
*** jamesmcarthur has quit IRC01:30
*** krasmussen has quit IRC01:30
*** sdake has joined #zuul01:33
*** ruffian_sheep has joined #zuul01:33
*** jamesmcarthur has joined #zuul01:33
ruffian_sheepHi,guys.I want to use zuul v3 to build a third party ci for cinder of openstack.01:34
ruffian_sheepI build it during the link:https://zuul-ci.org/docs/zuul/admin/zuul-from-scratch.html01:35
ruffian_sheepNow,there has some problem I don't know tow to sovle.It showed me :zuul-executor: ConnectionRefusedError: [Errno 111] Connection refused01:36
ruffian_sheepIs it related to the setting of nodepool-static?01:37
*** sdake has quit IRC01:39
SpamapSruffian_sheep: zuul-executor only connects to gearman.01:43
SpamapSruffian_sheep: which should be running/open on the scheduler.01:43
ruffian_sheepSpamapS:Get it ,but there is also appeared an error :zuul-scheduler.service stop-sigterm timed out. Killing.When I try to restart it01:45
ruffian_sheepSpamapS:What I can think of about the scehduler part is the setting of the zuul.conf file.01:46
SpamapSruffian_sheep: There should be other clues in the zuul-scheduler service output that tells you why it isn't working.01:47
ruffian_sheepSpamapS:Is it probably I set the wrong paraments?01:47
*** jamesmcarthur has quit IRC01:53
*** jamesmcarthur has joined #zuul01:54
ruffian_sheepSpamapS:I cat the file of zuul-scheduler.log.It load the main.yaml and show me zuul.Scheduler: Full reconfiguration complete01:57
SpamapSruffian_sheep: sounds good. But is it listening on 4730 ?01:58
ruffian_sheepSpamapS:I think yes.I didn't set the port of gearman_server.During the document ,It should be 4730 in default.02:01
*** jamesmcarthur has quit IRC02:05
SpamapSruffian_sheep: ok, I have to go AFK now, but check to make sure executor can reach the scheduler port 473002:05
*** jamesmcarthur has joined #zuul02:16
ruffian_sheepSpamapS:OK,thx dude.02:19
*** jamesmcarthur has quit IRC02:28
*** bjackman has joined #zuul02:40
*** saneax has joined #zuul02:45
*** jamesmcarthur has joined #zuul02:59
ruffian_sheepI make sure web can can reach the scheduler port 4730.zuul      7363  0.1  7.4 1011516 75104 ?       Ssl  10:00   0:03 /usr/bin/python3.5 /usr/bin/zuul-web03:03
ruffian_sheeptcp        0      0 127.0.0.1:34644         127.0.0.1:4730          ESTABLISHED 7363/python3.503:04
*** jamesmcarthur has quit IRC03:06
ruffian_sheepBut I don't why I cannot start the executor.03:06
*** sdake has joined #zuul03:07
ruffian_sheepWhat I set in the zuul.conf is that [executor] private_key_file=/var/lib/zuul/.ssh/id_rsa03:07
*** ruffian_sheep has quit IRC03:16
*** ruffian_sheep has joined #zuul03:23
*** rlandy has quit IRC03:24
*** jamesmcarthur has joined #zuul03:26
*** jamesmcarthur has quit IRC03:30
*** bjackman has quit IRC03:45
*** spsurya has joined #zuul03:55
*** bjackman has joined #zuul03:55
*** sdake has quit IRC04:15
*** openstackgerrit has joined #zuul04:24
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add buildsets page  https://review.openstack.org/63004104:24
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add /{tenant}/buildset/{uuid} route  https://review.openstack.org/63007804:24
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add buildset page  https://review.openstack.org/63007904:25
tristanCcorvus: topic:web-build should be good for review now04:26
tristanCSpamapS: if you could have look too ^04:26
*** bjackman has quit IRC04:54
*** bjackman has joined #zuul04:57
*** sanjayu_ has joined #zuul05:02
*** saneax has quit IRC05:04
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add /{tenant}/buildset/{uuid} route  https://review.openstack.org/63007805:08
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add buildset page  https://review.openstack.org/63007905:08
ruffian_sheeptristanC:Hi,tristanC.Do you know how to deal with the problem?06:07
tristanCruffian_sheep: hum, what is the problem again?06:12
ruffian_sheeptristanC:I want to use zuul v3 to build a third party ci for cinder of openstack.It showed me :zuul-executor: ConnectionRefusedError: [Errno 111] Connection refused06:18
ruffian_sheeptristanC:SpamapS told me to checked the scheduler.I also found that zuul-scheduler.service stop-sigterm timed out. Killing.When I try to restart it06:20
ruffian_sheeptristanC:I cat the file of zuul-scheduler.log.It load the main.yaml and show me zuul.Scheduler: Full reconfiguration complete06:20
ruffian_sheeptristanC:I make sure zuul-web can can reach the scheduler port 4730.               zuul      7363  0.1  7.4 1011516 75104 ?       Ssl  10:00   0:03 /usr/bin/python3.5 /usr/bin/zuul-web06:21
ruffian_sheeptristanC:tcp        0      0 127.0.0.1:34644         127.0.0.1:4730          ESTABLISHED 7363/python3.506:21
ruffian_sheeptristanC:But I don't why I cannot start the executor.What I set in the zuul.conf is that [executor] private_key_file=/var/lib/zuul/.ssh/id_rsa06:22
tristanCruffian_sheep: can the executor reach the scheduler port 4730?06:22
*** sshnaidm is now known as sshnaidm|afk06:22
ruffian_sheeptristanC:I cannot start the executor now06:23
ruffian_sheeptristanC:It appeared that zuul-executor[8845]: ConnectionRefusedError: [Errno 111] Connection refused06:23
ruffian_sheeptristanC:I dont know what's wrong with the setting.06:24
ruffian_sheeptristanC:I didn't change the setting about executor.It can be restarted before.06:25
tristanCruffian_sheep: please paste the executor logs06:25
ruffian_sheeptristanC:zuul.Executor: Starting log streamer   and   zuul.Executor: Stopping log streamer06:26
ruffian_sheeptristanC:I take it from /var/log/zuul/executor.log06:26
ruffian_sheeptristanC:Is it ?06:27
tristanCruffian_sheep: yes, the logs before the connection refused error06:28
*** sdake has joined #zuul06:28
ruffian_sheeptristanC:Do you know what the problem is ?Or how to solve it?06:29
ruffian_sheeptristanC:I didn't find a way to deal with06:29
tristanCruffian_sheep: it seems like the executor cannot connect to the scheduler gearman service06:29
ruffian_sheeptristanC:But what I set the parament about the executor is that [executor] private_key_file=/var/lib/zuul/.ssh/id_rsa06:34
ruffian_sheeptristanC:Is something I ignored?06:34
ruffian_sheeptristanC:Is the gearman related to nodepool?06:35
tristanCruffian_sheep: iirc, executor pulls gearman config from the [gearman] section06:36
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: zuul-runner: add quick-start integration test  https://review.openstack.org/63570106:41
ruffian_sheeptristanC:Almost the same as the executor.In zuul.conf :[gearman] server=127.0.0.106:43
tristanCruffian_sheep: is the executor running on the same host as the scheduler?06:44
ruffian_sheeptristanC:Yes.06:45
ruffian_sheeptristanC:I build is during the link:https://zuul-ci.org/docs/zuul/admin/zuul-from-scratch.html06:46
tristanCand is the gearman service running, e.g. is there something listening on port 4730?06:46
ruffian_sheeptristanC:Yes.tcp        0      0 0.0.0.0:4730            0.0.0.0:*               LISTEN      9638/python3.506:47
ruffian_sheeptristanC:zuul      9638  0.0  6.5 479232 66004 ?        Sl   14:45   0:00 /usr/bin/python3.5 /usr/bin/zuul-scheduler06:48
tristanCruffian_sheep: could you paste the full exception from the executor logs, "ConnectionRefusedError: [Errno 111] Connection refused" might be something else06:48
ruffian_sheeptristanC:Is it ?Feb 13 14:33:02 ubuntu zuul-executor: Traceback (most recent call last): Feb 13 14:33:02 ubuntu zuul-executor: File "/usr/bin/zuul-executor", line 10, in <module> Feb 13 14:33:02 ubuntu zuul-executor: sys.exit(main()) Feb 13 14:33:02 ubuntu zuul-executor: File "/usr/lib/python3.5/site-packages/zuul/cmd/executor.py", line 121, in main Feb 13 14:33:02 ubuntu zuul-executor: Executor().main() Feb 13 14:33:02 ubuntu06:49
*** quiquell|off is now known as quiquell|rover06:49
tristanCruffian_sheep: use a pasting service please06:50
*** bjackman has quit IRC06:50
*** bjackman has joined #zuul06:51
ruffian_sheeptristanC:Emmmm....Are you saying that you are going to paste it line by line? I am not very familiar with the use of IRC.06:51
quiquell|rovertristanC: o/06:52
tristanCruffian_sheep: paste the log in http://paste.openstack.org/ and share the resulting url here06:55
tristanCquiquell|rover: hey06:55
quiquell|rovertristanC: good morning sir06:56
quiquell|rovertristanC: so I found a correct way to escape jinja in the zuul.message06:56
tristanCquiquell|rover: nice06:57
quiquell|roverhttps://review.openstack.org/#/c/63393006:57
quiquell|roverIt's here but maybe is too complicated for the value it gives06:57
quiquell|roverIf you can take a look would be nice06:58
tobiashquiquell|rover: commented on ^07:02
quiquell|rovertobiash: thanks! So is this a proper way to do it ?07:04
quiquell|roverIt's a little ad-hoc07:05
ruffian_sheeptristanC:Can you open it? http://paste.openstack.org/show/744995/07:05
tobiashquiquell|rover: I think that makes sense and is reusable if we need it elsewhere07:05
quiquell|rovertobiash: ack07:06
tristanCruffian_sheep: so the connection error is related to /var/lib/zuul/executor.socket, is there a sock file leaked?07:07
tristanCperhaps tries to remove the file and restart the service07:07
*** sdake has quit IRC07:07
ruffian_sheeptristanC:I don't quite understand what you mean. What is it about?07:08
ruffian_sheeptristanC:I just changed the user and user group of a folder. His feedback has changed, is it related to this? http://paste.openstack.org/show/744996/07:11
ruffian_sheeptristanC:I had done it like you said.But it failed again.07:12
tristanCruffian_sheep: for some reasone, the executor process doesn't seems to be able to open the executor.socket file07:14
ruffian_sheeptristanC:yes,the executor.socket is belong to root:root before.The messages showed me that http://paste.openstack.org/show/744997/  .I change it.And it turns to be connection refused.07:19
ruffian_sheeptristanC:I'm confused what's wrong in the setting.I have no idea,and cann't find some similar case to solve07:21
tristanCruffian_sheep: sorry, i'm not familiar with the documentation you followed, not sure what is the issue here07:21
quiquell|rovertobiash: So at my system it's using cyaml, but still only works if y set yaml.SafeLoader.add_cons ... and the like :-/07:26
quiquell|rovertobiash: so looks like cyaml is still calling something from yaml for this07:26
*** jamesmcarthur has joined #zuul07:26
quiquell|rovertobiash: I suppose cyaml use the binding but the rest as from yaml07:27
tobiashquiquell|rover: hrm, ok07:28
quiquell|rovertobiash: so have to be as it's07:28
quiquell|rovertobiash: yaml.Safe... add_constructor there :-) will add a comment07:28
tobiashquiquell|rover: k07:28
*** swest has joined #zuul07:30
*** jamesmcarthur has quit IRC07:31
*** swest has quit IRC07:34
ruffian_sheeptristanC:ok,thx dude!07:35
openstackgerritQuique Llorente proposed openstack-infra/zuul master: Mark as unsafe commit message at inventory  https://review.openstack.org/63393007:44
quiquell|rovertobiash, tristanC: Added comment07:44
quiquell|roverbtw https://github.com/yaml/pyyaml/blob/4c2e993321ad29a02a61c4818f3cef9229219003/lib/yaml/cyaml.py#L4907:44
quiquell|roverLooks like the representer is from yaml07:44
quiquell|roverHumm we can call parent class from the multiple inheritance07:47
openstackgerritQuique Llorente proposed openstack-infra/zuul master: Mark as unsafe commit message at inventory  https://review.openstack.org/63393007:47
quiquell|roverCalling SafeReprenter works for cyaml and yaml07:50
*** gtema has joined #zuul07:55
quiquell|rovernah not working08:01
openstackgerritQuique Llorente proposed openstack-infra/zuul master: Mark as unsafe commit message at inventory  https://review.openstack.org/63393008:28
quiquell|rovertobiash, tristanC: Have just leave a comment ^, let me know if something else is missing, sorry about the harassing08:29
openstackgerritQuique Llorente proposed openstack-infra/zuul master: Mark as unsafe commit message at inventory  https://review.openstack.org/63393008:42
*** dmsimard has quit IRC08:44
*** dmsimard has joined #zuul08:45
*** jpena|off is now known as jpena08:54
openstackgerritFelix Schmidt proposed openstack-infra/zuul master: Retrieve full list of jobs with details per tenant via API  https://review.openstack.org/63571409:03
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add buildsets page  https://review.openstack.org/63004109:42
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add /{tenant}/buildset/{uuid} route  https://review.openstack.org/63007809:42
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add buildset page  https://review.openstack.org/63007909:42
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider  https://review.openstack.org/59033509:46
*** sshnaidm|afk is now known as sshnaidm09:59
*** ruffian_sheep has quit IRC09:59
*** fdegir has joined #zuul10:10
*** swest has joined #zuul10:52
quiquell|rovertobiash: zuul passing now, we can merge it https://review.openstack.org/#/c/633930/11:22
quiquell|rovertristanC: ^11:22
*** gtema has quit IRC11:52
*** panda is now known as panda|lunch12:00
*** gtema has joined #zuul12:01
*** sdake has joined #zuul12:04
*** sdake has quit IRC12:16
*** jpena is now known as jpena|lunch12:34
*** bjackman_ has joined #zuul12:53
*** rlandy has joined #zuul12:55
*** bjackman has quit IRC12:56
*** bjackman__ has joined #zuul12:57
*** bjackman_ has quit IRC13:00
*** panda|lunch is now known as panda13:12
*** jamesmcarthur has joined #zuul13:14
*** sanjayu_ has quit IRC13:19
*** bjackman__ has quit IRC13:19
*** swest has quit IRC13:19
*** sanjayu_ has joined #zuul13:19
*** swest has joined #zuul13:20
*** jamesmcarthur has quit IRC13:29
*** swest has quit IRC13:33
*** sanjayu__ has joined #zuul13:36
*** jpena|lunch is now known as jpena13:37
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider  https://review.openstack.org/59033513:39
*** sanjayu_ has quit IRC13:39
*** jamesmcarthur has joined #zuul13:50
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: executor: add log_stream_port and log_stream_file settings  https://review.openstack.org/53553813:55
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: executor: add log_stream_port and log_stream_file settings  https://review.openstack.org/53553813:55
*** sdake has joined #zuul14:08
*** sdake has quit IRC14:09
*** sdake has joined #zuul14:10
*** sdake has quit IRC14:13
*** sdake has joined #zuul14:14
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver  https://review.openstack.org/53555614:15
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider  https://review.openstack.org/59033514:17
*** rfolco has quit IRC14:18
*** pwhalen has quit IRC14:21
*** pwhalen has joined #zuul14:22
*** sdake has quit IRC14:23
*** sanjayu__ has quit IRC14:26
*** sanjayu__ has joined #zuul14:26
*** jamesmcarthur has quit IRC14:26
*** rlandy is now known as rlandy|afk14:26
*** rfolco has joined #zuul14:37
jkttristanC: thanks for your very fast runc followup14:49
jktI wonder why that log streaming goes over a static port in the first place; Ansible has ssh multiplexing, so why not launch another command which `tail -f` from that well-known file location on the build node?14:51
*** gtema has quit IRC14:51
*** jamesmcarthur has joined #zuul14:55
mordredjkt: we read the stream in the callback plugin which isn't in a position to spawn additional commands. I've got a half-finished replacement for the whole thing which gets rid of the file too - but it keeps being lower priority than other things to finish15:30
mordredalthough I need to check out bcoca's work upstream to allow modules to return intermediary results to see if we can use it instead15:31
*** quiquell|rover is now known as quiquell|off15:48
clarkbhttp://paste.openstack.org/show/745021/ github processing data16:05
clarkbthat was a recent event that was finished. It took about a minute and a half to process once it was its turn and we do seem to have a growing queue16:05
*** sdake has joined #zuul16:06
jktmordred: ack, thanks, looking forward to having it replaced :)16:07
*** ianychoi has quit IRC16:08
clarkbI think that confirms our suspicions that we grow a queue due to expensive serialized processing. I do note that it appears to take ~7 seconds to handle events from less big projects too so while ansible is really slow its not incredibly fast on the cheaper projects16:08
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: CLI: fail if trying to enqueue/dequeue a change for the wrong project  https://review.openstack.org/63666216:11
tobiashclarkb: that queue length is impressive...16:38
tobiashclarkb: maybe we should go the search route while knowing that the api limit can be a problem. But to fill that rate limit we need to be faster than two seconds per event anyway and in addition have a long queue in the same org16:41
clarkbya it does make me wonder if we can reduce the number/type of events we listen for16:42
clarkbI'm not familiar enough with how zuul plugs into workflows to understand if that is viable, but reducing total number of events gives us more breathing room on time to process an event16:42
*** sdake has quit IRC16:46
*** sdake has joined #zuul16:46
tobiashclarkb: do you gate on any github repo atm?16:46
tobiashclarkb: if not, you cound temporarily remove the status events16:47
clarkbtobiash: we don't but I think ttx is using labels on kata for something16:47
clarkbttx: ^ maybe you can confirm16:47
clarkbbut maybe we don't use them on ansible and that is the biggest traffic generator?16:47
tobiashlabel events should be even something different16:47
clarkbah16:47
tobiashclarkb: the event subscription is configured at app level16:48
tobiashnot per repo16:48
clarkboh I thought it was per installation?16:48
tobiashafaik not16:48
clarkbI guess the app defines what it needs and you either accept or deny that on installation?16:48
tobiashyepp16:48
tobiashthat's how it's supposed to work16:48
*** jamesmcarthur has quit IRC16:51
corvusquiquell|off, fungi: why use !unsafe instead of "{% raw %}" ?16:51
corvusis it for the double-escape case?  where the message itself contains endraw?16:52
fungiit's possible i misunderstand the point of that change. i thought it was avoiding performing any jinja2 substitutions at all on commit messages by marking them unsafe?16:54
corvusfungi: yes, but somewhere along its 28 patchset history, it changed from using raw/endraw to using !unsafe, and i'd like to make sure i fully understand why16:55
corvusi don't see any explicit comments about the switch16:55
*** gcutrini has joined #zuul16:56
fungithis is the first patchset for it i've reviewed. it had rather numerous revisions so i'll admit i didn't dig into them all16:56
fungidefinitely a good question16:56
corvusfungi: ack, i pinged you cause i thought you might have the extra context.  i can wait for quiquell|off16:56
fungiyeah, i unfortunately do not but am eager to learn the reason16:57
fungisorry16:57
corvusi think test_jinja2_message_raw supports that supposition16:57
corvusit's a unittest with {% endraw %} in the commit message16:57
clarkbfor more data handling PR events takes ~2 seconds on ansible16:58
*** sdake has quit IRC16:58
clarkbso ya ignoring status events could be a big win16:58
corvusclarkb: interesting, aside from looking up the PR, the status event takes 5 seconds16:58
corvusclarkb: so even the other parts of that event are slower than a pr event...16:59
corvusi guess the other things it's doing may be worth looking into as well16:59
*** jamesmcarthur has joined #zuul17:00
*** rlandy|afk is now known as rlandy17:00
clarkbhttp://paste.openstack.org/show/745027/ is the pr event I was looking at17:01
clarkbhttps://developer.github.com/v3/activity/events/types/#statusevent status event details. I was a big confused about this yesterday but rereading it it is scoped to CI status results17:01
clarkbWe likely can ignore those in our app at least for now17:02
clarkbsince we don't gate or trigger on the results of CI systems currently17:02
corvusclarkb: but to change the app permissions may mean asking everyone who has installed it to update; that's something i'd like to avoid.17:03
clarkbah17:03
corvus(perhaps asking for fewer perms doesn't require any intervention, but if we fix this and re-add it later, i'm almost certain it would)17:04
clarkbyup17:04
corvusclarkb: do you have the github3.py logs for the 90 seconds it was doing the sha->pr mapping?17:06
clarkbcorvus: not yet but I'm sure I can dig that up17:08
clarkbtil awk can set marks and print text between matching patterns17:11
clarkbwhy did I learn sed instaed of awk17:11
corvusclarkb: wow.  can you paste your awk command when you're done?17:12
ttxclarkb: yes we use label/unlabel events on the Kata pipelines17:14
ttxbut not the status ones :)17:15
clarkbcorvus: awk '/2019-02-13 15:56:45,615 DEBUG zuul.GithubEventConnector: \[delivery: 8ec60b1e-2f99-11e9-8d5b-7a80f8e7170b\] Handling status event/{flag=1;next}/2019-02-13 15:58:22,125 DEBUG zuul.GithubEventConnector: \[delivery: 8ec60b1e-2f99-11e9-8d5b-7a80f8e7170b\] Finished event processing/{flag=0}flag' debug.log | grep github317:16
clarkbcorvus: flag=1 sets a mark on first match then flag=0 unsets it. And the flag at the end of the command means print if flag is not zero (or something similar)17:16
clarkbthen I grep all the lines with github3 in it17:17
clarkbI don't know if any of the github query logs are sensitive should I just write them to disk for now? or you can run my comamnd above and see them for yourself :)17:17
corvusclarkb: i don't think so, but since i'm not sure, i'll look and summarize17:18
* clarkb is skimming them now and can paste if nothing sensitive is found17:18
corvusclarkb: oh yeah, i think we can just paste that.17:20
clarkbya doing so now17:20
clarkbhttp://paste.openstack.org/show/745028/17:21
clarkbone thing I notice is we seem to find the commit in the first couple of pages of PRs then we continue scanning to the end17:21
clarkb(I guess because there can be more than one PR for that commit)17:21
clarkbanother is that we check our rate limit a lot17:21
*** jamesmcarthur has quit IRC17:22
clarkbas expected the bulk of the cost is in that PR listing17:22
clarkbwe also get user data which maybe isn't necessary? (I don't know)17:23
corvusclarkb: i *think* we have some other actions in here17:23
corvus(from another thread)17:23
clarkboh maybe17:24
clarkbI'm not sure how to dedup that17:24
corvusit's not super-relevant, i just think there's a report happening at the same time17:24
corvusa few of these requests look unrelated and we can ignore them, but they're small anyway17:25
corvusclarkb: is it possible your assumption about finding the pr early was based on the simultaneous report happening?17:27
clarkbhttps://github.com/ansible/ansible/pull/52156 is the pull request and https://github.com/ansible/ansible/pull/52156/commits/0fb7693d44b31f867fab1fe7e35fbef7b17dbb14 the commit17:28
clarkbcorvus: it was based on the 0fb769 POST17:28
clarkbwhcih ya is unrelated to the page through I think17:28
corvusyeah, pretty sure that's the report17:28
corvus52156 doesn't show up until the scan is finished; i don't think we do anything special when we find a pr -- i think much of the data are already there17:29
clarkbthough the api is being asked to sort by created date descending which should put newer PRs near the top17:29
corvusright, so we *could* likely break earlier17:29
clarkb52156 is in the first page on the web ui with that sort17:30
clarkbcorvus: ya but we'd have to assume a 1:1 PR:commit relationship17:30
clarkbwhich is safe 99% of the time until you hit that 1% set of corner cases jlk and SpamapS were talking about yesterday17:30
clarkbstable backport where master and stable/foo haven't moved their branch heads yet so are the same17:31
clarkbA force push to another branch (though maybe zuul doesn't care about that)17:31
*** jamesmcarthur has joined #zuul17:34
clarkbtrying to reason about how your idea of an LRU cache for that mapping table would like like in that case. For example if PRA had commit 1 and we cached that, then PRB is created with commit 1 we'll find PRA and not B in the cache and ignore B17:34
clarkbso we'd have to curate teh cache based on PR created events17:35
clarkbwhich is maybe good enough? there could be windows (like zuul wasn't running when PRB was created but was when PRA was created) where we get out of sync17:35
corvushere's a summary of the requests and the time they took: http://paste.openstack.org/show/745030/17:36
corvusmaybe we should drop some of those rate limits :)17:36
corvusdo we get the PR twice?17:37
clarkbhrm does seem like we do17:38
clarkbwe also search issues and PRs for the PR, do we need to do that when we know a commit is involved (implying a PR not an issue?)17:39
openstackgerritMerged openstack-infra/zuul-jobs master: Rework upload-forge role to use module  https://review.openstack.org/63594117:39
clarkbre searching by sha1 https://help.github.com/articles/searching-issues-and-pull-requests/#search-by-commit-sha is the api to do that and comes with a much lower api rate limit aiui17:39
clarkbhttps://developer.github.com/v3/search/#rate-limit has rate limit details for search17:40
corvusright.  i think that's one of the options...17:40
corvusi still want to finish digging into the data first17:40
clarkb30 per minute is average 2s per17:40
clarkbcorvus: ++17:40
corvusi think i understand why we're fetching the PR twice -- i think that's an optimization we can make in the _updateChange method17:40
corvusyour question about issues/pulls is interesting17:41
clarkbcorvus: that part of the github api gets weird and we may have to do both for reasons.17:41
corvusi'm going to try interpolating the log lines back together to figure out what prompted those requests17:41
clarkbbasically PRs are issues and you don't get all of the issue related data from the PR side of the api so if you need those details you have to query the issue api iirc17:41
corvusok, both of those (/pulls/52156 and /issues/52156) happened between 'got user' and 'refreshed change'17:42
corvusso probably both inside the _updateChange method17:43
corvusclarkb: yeah, i don't see zuul calling 2 methods there, so that's probably github3.py internals17:47
corvusoh, no it is us17:47
corvusbut there's a comment explaining why it's necessary17:47
corvushttps://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/driver/github/githubconnection.py#n111417:47
clarkbya that would be a case of the leaky models there17:48
clarkbhttps://developer.github.com/v3/pulls/#get-a-single-pull-request does show labels listed in an example response17:51
clarkbmaybe the reponse from pulls gives us what we need now?17:51
corvusjlk, tobiash: ^ any ideas if we can drop the extra call to /issues/ ?17:52
jlkhrm, trying to catch scrollback17:57
jlkoooh17:57
jlkso the /issues/ call is to get the labels?17:58
jlkhrm17:58
jlkhrm hrm17:58
corvusthat's what the comment says...17:58
jlkyeah17:58
jlkbut...17:58
corvusand i think the code backs it up17:58
corvusat least, labels are the only thing we're getting from 'issueobj'17:59
* jlk looks at github3.py17:59
clarkbits probably a minor thing performance wise, but if we can clean that up and simplify the code it will probably make it easier to optomize the bits that do cost us18:00
corvusyeah, i confess neither of these will save us much, but there's no better time now that we see the full sequence18:01
corvusand this stuff does add up18:01
clarkbya the impact is largely on the human understanding of the system which does add up (more logs to wade through as well as code to undersatnd)18:01
clarkbbut also rate limits etc18:01
jlkSo yeah it does look like github3.py doesn't expose labels as part of the pull request object18:02
jlkand it does for an issue object18:02
jlkso yes, this appears to be a github3.py bug, which would take a fair amount of work to fix up18:03
jlkoutside of this, has there been any revelations about what's going on w/ the queue length?18:04
clarkbjlk: http://paste.openstack.org/show/745030/ is the semi processed log data18:04
clarkbjlk: that is for one ansible status event18:04
corvusjlk: well, that's actually great news -- there's an improvement that we can make :)  should i file a bug on github3.py?  (or would you like to do it and use better words?)18:05
jlkoof, so like 4 to 8 seconds for each of those 100 PR grabs18:05
jlkcorvus: I'm distracted with jury duty today, so feel free :D18:05
clarkbit does seem like using the search api even with the lower limit would be an improvement18:05
corvusjlk: will do!18:06
clarkbwe can process 30 status updates a minute with search api, but its a minute and a half for one with PR listing18:06
jlkugghh18:06
jlkI don't trust the search API though18:06
jlkbecause it's not necessarily live data18:06
SpamapSjlk: I thought this was actually part of the API.. that PR's are just "special" issues.18:07
*** bhavikdbavishi has joined #zuul18:07
clarkbanother approach could be to maintain a PR cache that is curated by PR events18:07
jlkSpamapS: yes ... and no18:07
clarkb(or maybe we do both to avoid search costs as much as possible)18:07
jlkSpamapS: there are different API endpoints, and at some point lables may not have been exposed on the PR endpoint, but they are now18:07
jlkclarkb: ooooh18:08
jlkIf we could thread getting all of those PRs.... :D18:08
jlkno reason to do all of those serially18:08
clarkbjlk: I don't think you know how many pages to page through unless you do them serially?18:08
jlkyeah that's a good point18:09
jlklet me do some chatting internally for a moment18:09
* tobiash is afk this evening18:12
*** jpena is now known as jpena|off18:13
corvusjlk: https://github.com/sigmavirus24/github3.py/issues/924 look sane?18:15
corvusor, rather, coherent :)18:15
jlkthat looks good to me!18:15
openstackgerritJan Kundrát proposed openstack-infra/nodepool master: runc: Use correct interpreter for Ansible  https://review.openstack.org/63670218:16
openstackgerritJan Kundrát proposed openstack-infra/nodepool master: runc: Support IPv6  https://review.openstack.org/63670318:16
openstackgerritJan Kundrát proposed openstack-infra/nodepool master: runc: Use the hypervisor's zuul-console location  https://review.openstack.org/63670418:16
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Re-use the github PR object when fetching reviews  https://review.openstack.org/63670518:16
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add comment about extra issues request  https://review.openstack.org/63670618:17
*** sdake has joined #zuul18:19
jlkI have learned that a certain other CI provider uses the search API and it is non-deterministic for them.18:23
fungislowness is preferable over nondeterminism if we have to pick one18:25
clarkbfungi: sort of, we don't actually need these events for the openstack instance18:25
fungioh?18:25
fungiyou mean the status events in particular18:26
clarkbfungi: these events are CI status updates and since we don't trigger jobs based on ci results and we aren't gating we can safely ignore them18:26
clarkbfungi: yup, they are the events that force us to map commits to PRs18:26
quiquell|offcorvus, fungi: the jinja review is just to not expand jinja stuff I the comments it break the jobs18:26
quiquell|offFirst approach was just using raw18:26
fungido we trigger jobs for ansible repositories based on status updates?18:26
fungido we foresee wanting to trigger jobs (as a third-party ci) on status updates?18:27
clarkbits possible we might wait for first party to succeed before running third party18:28
clarkbthere are definitrly use casesthere18:28
jlkI think you'd only trigger if you were doing a gate, or some kind of 2nd run CI18:28
*** quiquell|off is now known as quiquell|rover18:28
fungipresumably some github-using project might want to do that with their own zuul deployment though. would we want an option to ignore status updates entirely? or per tenant/per project (if we even get enough context to be able to do that)?18:29
fungibut also having it be more efficient would be friendly to high-volume github-using projects who want to run a zuul and have it gate their pull requests for them18:30
clarkbfungi ya maybe a per tenant config if we cant fogure out a better functioning approach18:30
corvusi would like for this to be a plausible thing for people to use, and therefore would prefer that we find a way to ameliorate this.18:30
fungiyes, i concur18:30
fungiit does seem like a blocker for, say, ansible if they decided they wanted to run a zuul to gate them18:31
corvusi think we had 3 main ideas: search api, cache, paralellization.  search api seems to be out of the running.  were there any others?18:31
jlkI'm still chatting internally18:33
fungithat's all i recall. of those cache sounds the least dangerous (but possibly also the least effective?)18:33
corvuscool, let's see if jlk comes up with anything else18:34
jlkhonestly I'm not hopeful. The person I'm talking to recognizes the problem, has pointed to a similar problem with some plan to fix the similar problem (though for GraphQL).18:35
*** sdake has quit IRC18:37
jlkI think I'm seeing some desire to enhance the /commit/ object within the API, so that a GET on the commit object would include a listing of PRs w/ the commit as the HEAD. SO we could get hook, find hash, get commit object via API, find our PRs18:39
corvusthat sounds like a good solution18:41
*** bhavikdbavishi has quit IRC18:42
jlkI don't know the timeline yet though18:43
jlkcould be months18:44
fungii thought replacing all apis with taco bell^W^Wgraphql was supposed to solve everything18:46
jlksadly the github3.py api doesn't even allow selecting which page to grab, so parallelizing would be some extra work to make URLs on our own18:46
corvusclarkb, jlk, fungi: my inclination would be to start with caching -- it's simpler, and it means less work for the cpu.  then if we *also* want to do parllel of processing of multiple events, each of those processes will already be faster.18:46
corvus(i worry that if we start with threading, we just end up with 100% cpu paging through github prs)18:47
fungiyep, i think that's a sane next step18:47
fungicaching, aside from arguing over cache invalidation, can't really hurt18:47
jlkI can't see in my head how the caching would work18:48
fungialso, it might get the performance to the point where it's "good enough for now" while github works to improve their api18:48
clarkbjlk: when a PR is created/updated we get the commit info right? we stick that in a cache table. Then when we get status update for commit we check that table first and if not found then do the scan then update cache18:49
jlkis it that as you do GETs on PRs we maintain a list of of HEADs to PRs, removing them when no longer open, then when we need PR from HASH we hit the cache first, if we find one yay, if we find more than one boo, and if we find none we resort to the long API calls?18:49
clarkbjlk: yup18:50
jlkokay.18:50
corvusi was about to type the same thing a third way but i stopped :)18:50
jlkhahah just wanted to be sure I understood the approach. I feel like this is a classic compsci thing18:50
fungiand we can assume the commit info will be immutable (or at least things like commit message, timestamps, author/committer and file contents anyway)18:50
fungibecause changes to any of those result in a new hash value18:51
jlkI don't think we need anything other than the hash18:51
corvusright, we really just need sha->pr_number18:51
jlkwe still have to GET the PR to have up to date info18:51
fungiand if the pr gets a new sha associated with it, that gives us a distinct event right?18:52
jlkwhich as a side effect updates the cache18:52
SpamapSHey you know what's good for parallelizing a bunch of on-demand cache-or-slow-backend requests? Gearman. :)18:52
jlkfungi: yes, a push event18:52
* SpamapS shakes fist "You'll never get rid of me and my gearman, mwahahahahaha"18:52
fungigearman is like a short-order cook18:53
jlkone day I'll learn what gearman is.18:53
* fungi doesn't have a punchline for that analogy18:53
SpamapSA short order cook that clones food when it is ready.18:53
clarkbsigmavirus has left a note about original_labels on the PR response18:54
clarkbappaerntly that might be a thing we can use from github318:54
SpamapSIn all seriousness, its entire reason for being is to coalesce multiple requesters for hot-but-not-cached items into a single backend slow request thread.18:54
SpamapSThe fact that geard doesn't actually support that mode of operation, and that gearman is actually useful for other things, is purely an accident. ;)18:54
corvusSpamapS: yeah.  to be fair, there was no documentation about that mode of operation when i wrote geard :)18:55
SpamapScorvus: Some day I'd like to do a talk about assumed/implicit qualities of software and how longevity compromises it. ;)18:55
corvusSpamapS: i would like to be an exhibit in that talk :)18:56
SpamapSgearman may be the only example tho ;)18:56
fungii am sort of curious what the original use case was which prompted its existence. did it emerge as a need from some existing community?18:56
corvusfungi: resizing pictures for livejournal18:56
SpamapSfungi: LiveJournal wrote it, and it sat between MySQL and Memcache.18:56
clarkbfungi: livejournal or some such service needed to ya that18:56
quiquell|rovercorvus: ping18:56
SpamapSand file storage too, right18:56
fungineat18:57
fungiin that case i can understand the multiple requesters bit18:57
SpamapSlivejournal would go "do we have a thumbnail for uniqueid=012345 in memcached?" no, ok imagemagick and store it and then return it but the magic was that each client who fetched that unique id would be sent to the same gearmand (hash the host list) and then each gearmand would only start one job for every unique id.18:58
fungisince you might have a bunch of hits around the same time all waiting on the same thumbnail18:58
fungiyeah, pretty awesome18:58
SpamapSIt completely mitigates thundering herd.18:58
SpamapS(unless you have a thundering herd of 100% unique requests)18:58
SpamapSBut what's really funny, is that nobody thought to *write that down*.18:59
fungisort of sad to see lj passing into obscurity. didn't they originate openid as well?18:59
SpamapSSo while the unique ID was there in the protocol, and useful for other things, it wasn't actually stated in the docs that gearmand implementations should coalesce.18:59
*** quiquell|rover is now known as quiquell|off19:02
jlkfwiw I've been told that the API team DOES want to do this change for the REST API and I'll get tagged on the issue when it gets created so I can follow along19:03
corvusjlk: does github ever make issues like that public?19:04
jlkrarely19:04
jlkactually I take that back19:04
jlkI honestly don't know.19:04
jlkI think there would always be an internal issue that would see most of the updates, and would have all kinds of business info and justification and customer names and stuff that we wouldn't want to share19:04
jlkthere might be an originating public issue19:05
jlkjury duty bbiab19:05
clarkblooking at the github3 pulls vs issues thing a bit more we get the full response body, we could deserialize that and grab labels from there I expect?19:08
SpamapSyeah I think it wasn't there before, but now it is.19:10
clarkboh hrm maybe thats the markdown formatted commit message/pr header19:11
clarkbnot the response body19:11
clarkbpr._json_data is the raw data but given the _ we should probably avoid accessing it directly19:14
clarkboh but as_dict returns it that way publicly19:14
clarkbok I think that is enough to make a change19:15
openstackgerritClark Boylan proposed openstack-infra/zuul master: Don't request PR issue data  https://review.openstack.org/63672819:18
clarkbI think something like ^ may work.19:18
clarkbis anyone working on the cache idea? or intending to? I'm not in a great spot for that due to needing to care for infra things as well as not being super familiar with the github API. But I am happy to continue to get logs as needed to help with that19:21
clarkbreading up on the rate limit logging it is purely informational in its current state and optional. Basically zuul won't backoff near the rate limit or stop and wait19:30
clarkbso openstack's zuul can set that to false to reduce api queries and log noise19:30
SpamapSso, if you guys want GitHub to do anything19:33
SpamapSI am officially a paying customer19:34
SpamapSso I can rattle chains19:34
SpamapSor whatever that is.19:34
SpamapSI'm sure tobiash can join me and we can amplify our customer-desires. :)19:34
clarkbSpamapS: from jlk it sounds like they already recognize this as an issue and there is at least some plans to add an api endpoint to alleviate it19:34
clarkbbasically add the commit lookup that will return list of PRs api19:34
SpamapSIndeed, just making sure we get the appropriate priority on the dev team backlog. :)19:35
clarkbhttps://docs.google.com/document/d/e/2PACX-1vRW41XdoY-HlKdgSzAZ4Dwm-OeAyi1zEkDvwIfw3FsZ5H4yhtPNaZPs-50lgG_BHsWLKixxlrHpske1/pub Gerrit first class support for CI systems just announced19:36
daniel2So I'm working with an old version of nodepool and keep getting this error: https://bpaste.net/show/578c082a3450 Unable to track down the cause, maybe someone can nudge me a little bit?19:37
clarkblooks like they will integrate a system to request checks (jobs/buildsets) in gerrit directly so recheck comments can go away19:38
clarkbdaniel2: paramiko seems to think that your ecdsa private key isn't valid for some reason. I would try sshing with it with command line ssh client and if that works maybe it is a bug in paramiko? You can try an rsa key instead (we know those work as it is what we use)19:39
clarkbthe initial pass of first class CI support in Gerrit won't support cross project CI systems :/19:40
daniel2They are all RSA keys, and I even regenerated them all, I have no idea what key its actually looking at.19:42
clarkbalso the events are basically a "hey there might be work for you to poll on now" events which is an interesting design19:43
clarkbdaniel2: that is the code that verifies the booted instance is up and running properly iirc. So it is going to try and use the default key of whatever user is running nodepool to ssh into the isntance. Maybe the daemon is running under an unexpected user and finding unexpected keys?19:44
clarkbalso is there an ssh agent that might be injected unexpected key(s)?19:46
*** smyers has quit IRC19:47
*** smyers has joined #zuul19:48
openstackgerritClark Boylan proposed openstack-infra/zuul master: Don't request PR issue data  https://review.openstack.org/63672819:53
*** jamesmcarthur has quit IRC19:55
fungidaniel2: the "SSHException: not a valid EC private key file" message is misleading. in actuality it tries a variety of means of identifying keys with which to attempt authentication, and is merely mentioning the last type it attempted to use19:57
daniel2oh okay thanks!19:58
fungiusual causes are that the username it's trying to use isn't valid for the remote host, the private key doesn't match a public key authorized for that user on the remote host, or (and this one is weird) that it found a local copy of the public key which doesn't actually correspond to the private key present19:59
SpamapSbtw.. replacing CircleCI badges in our code.. I wonder if anybody could do an actual pro job of making this: http://spamaps.org/files/zuul-gated-badge.png20:00
fungithat looks fairly professional to me!20:00
SpamapSIt's 20 minutes of gimping20:00
clarkbI can pass that along to james (he did the original logo work)20:01
corvusSpamapS: there's https://zuul-ci.org/docs/zuul/user/badges.html20:02
clarkboh nevermind its done20:02
corvusif folks like the logo better, we should update that20:02
clarkbjust without the logo20:02
SpamapSperfect20:02
SpamapSno I'm so happy somebody did it already20:02
SpamapSand I like that it's blue20:02
SpamapSgreen is kind of snarky ;)20:02
corvusi believe that shade of blue is even our secondary logo color20:03
*** panda is now known as panda|off20:04
corvusand since we host it on our site, someday we might be able to do analytics on it  (jamesmacarthur had a suggestion about anonymous analytics we might employ, but i haven't had a chance to look at it yet)20:04
*** jamesmcarthur has joined #zuul20:07
SpamapScorvus: Yeah that would be cool, and should be encouraged. If nothing else, it would be cool if that badge page had a little counter showing unique referrers in the last 30 days or something.20:08
corvusmaybe we can ask openstack if they'd like some more badges.20:09
SpamapSIt might make sense to make it a backend API call on the website too, so it can become dynamic.20:09
corvusclarkb, fungi: did one of you want to work on implementing that cache?20:10
SpamapSI still think it would be fun to have a "It has been # days since the build broke." and just have it count the days since the first time we saw that referrer.20:10
corvusSpamapS: haha :)20:10
clarkbcorvus: I noted above that I don't think I have the time or github api knowledge to do it, but am happy to work with whoever does work on it to get logs and similar data from openstack installation20:11
*** jamesmcarthur has quit IRC20:12
corvusclarkb: ok.  i don't think much github api knowledge is required (i don't expect any github api changes), and i think it can be written and satisfactorily tested entirely with the current unit test.  but no dispute about "not enough time".  :)20:12
clarkbcorvus: probably need enough knowledge to know when to update the cache? definitely something that can be learned from code and docs reading20:13
corvusclarkb: i think it's safe to update it every time we get a pr20:13
clarkbspeaking of docs reading. The Gerrit first class CI thing is interesting. It appears that they are trying to get away from needing the ssh event system as a requirement for CI with this design20:14
clarkbbasically you can poll the checks api without the ssh checks trigger events20:14
clarkbbut still receive the ssh checks trigger events to react without needing to always poll20:15
clarkbcorvus: re update every time we get a PR rgr. In that case maybe I will have time20:16
*** jamesmcarthur has joined #zuul20:16
fungii'm open to helping write the caching implementation, but am not familiar enough with zuul internals to know what existing cache solutions should be copied from (pretty sure there are some though, we cache gerrit change query results, right?)20:17
clarkbfungi: the github driver has a _change_cache20:18
clarkbwhcih is pretty simple. but its caching a derivative set of data (PRs mangled into "changes")20:18
fungioh, so some of the primitives there in the driver could presumably be reused at least?20:20
fungilooking20:21
clarkbI am not sure how reuseable it is20:21
clarkbits a dict of all the "changes" currently in pipelines20:21
fungiyeah, i suppose it is just an associative array plus the maintainCache() method20:22
fungiwe try to get() from the dict, and if the result evaluates false we query normally and set the result in the cache20:24
fungilooks like we also take exceptions in _updateChange() as a sign to remove entries from the cache20:25
SpamapSyeah it should be relatively straight forward to wrap the PR fetching in cache decorators, and then invalidate/update the cache based on events we get.20:26
fungiso figuring out what circumstances necessitate removal from the pr cache is i suppose the most complex bit20:26
SpamapSmight even be useful to use dogpile for it, since the cache could get rather large, and it would be useful to be able to farm it out to memcached.20:26
fungiwell, we maintain change caches in our other connection drivers too and those probably get fairly large as well i'd imagine20:27
funginot that i disagree about externalizing this cache, just that there seem to also be other caches in the scheduler which could presumably benefit from the same effort20:28
fungialso just using an associative array in memory like we're doing for _change_cache initially is probably simpler to prove efficacy of the idea20:29
*** jamesmcarthur has quit IRC20:35
*** sanjayu__ has quit IRC20:46
openstackgerritClark Boylan proposed openstack-infra/zuul master: First draft PR caching by commit sha  https://review.openstack.org/63675120:50
clarkbthat is super rough20:51
clarkband doesn't try to do anything fancy. More of a direct outline of how we'd want to maintain sucha cache I think20:51
fungithanks, i'll see if it does what i expect20:51
pabelangerWill the recent discussion around github performance block a potential release of zuul this week?21:02
clarkbpabelanger: the issue has been there all along so probably not21:03
pabelangerokay, neat!21:05
clarkbwe're just better at noticing now21:05
pabelangerI also won't mind if we tagged a new nodepool release (with https://review.openstack.org/636393/) but understand if we don't want with people out21:05
SpamapSa new nodepool release deserves some noise, since it will be the first one with the aws driver.21:11
corvusthere's a lot of new stuff in play that we haven't had a chance to exercise in production yet, so i'm not optimistic about a zuul release21:11
openstackgerritClark Boylan proposed openstack-infra/zuul master: First draft PR caching by commit sha  https://review.openstack.org/63675121:12
*** jamesmcarthur has joined #zuul21:25
*** jamesmcarthur has quit IRC21:30
openstackgerritJames E. Blair proposed openstack-infra/zuul master: WIP: alternative github sha cache  https://review.openstack.org/63676421:47
corvusclarkb: ^ i don't think we have to care about events -- since _updateChange gets called at the end of every event, this will ensure that after every event involving a PR, we update the cache.21:48
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Re-use the github PR object when fetching reviews  https://review.openstack.org/63670521:56
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add comment about extra issues request  https://review.openstack.org/63670621:56
clarkboh I see, ya22:03
clarkbcorvus: I left a few notes on your change based on what I had learned writing my change22:11
corvusclarkb: thx, replied22:20
clarkboh derp msised the exception thrower that already existed22:22
*** openstackgerrit has quit IRC22:22
*** pwhalen has quit IRC22:32
*** gouthamr has quit IRC22:35
*** gouthamr has joined #zuul22:36
*** pwhalen has joined #zuul22:38
*** jamesmcarthur has joined #zuul22:43
*** jamesmcarthur has quit IRC22:48
*** jamesmcarthur has joined #zuul22:51
*** jamesmcarthur has quit IRC23:07
*** jamesmcarthur has joined #zuul23:27
*** jamesmcarthur has quit IRC23:33
clarkbcorvus: https://review.openstack.org/#/c/636705/2 still fails tests, I think I tracked down why (details in comments)23:45
*** jamesmcarthur has joined #zuul23:49
*** jamesmcarthur has quit IRC23:53
*** jamesmcarthur has joined #zuul23:56

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!