mordred | looking | 00:03 |
---|---|---|
mordred | jeblair: +2 from me on both (+3 on the doc update) | 00:04 |
mordred | jeblair: https://review.openstack.org/#/c/498127/ is ready as well, while we're doing things that clean up logging on the executor | 00:05 |
mordred | jeblair: also - from having written it, I have come to really like yaml/json versions of logging config | 00:05 |
mordred | as well as the use of dictConfig in code | 00:05 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix job timeout docs https://review.openstack.org/498513 | 00:14 |
jeblair | mordred: yeah, i think we should switch to that form | 00:45 |
jeblair | mordred: couple minor comments on that | 00:54 |
mordred | jeblair: looking | 00:54 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add role for adding launchpadlib credentials https://review.openstack.org/498633 | 00:58 |
leifmadsen | I'm going to have to look at the zuul meeting agenda ahead of time and participate from the past via etherpad with follow up post-haste | 01:02 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write a logging config file and pass it to callbacks https://review.openstack.org/498127 | 01:06 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Remove default root handler fallback to console https://review.openstack.org/498635 | 01:07 |
mordred | jeblair: ^^ that second patch addresses your third comment, I added it as a followup so that we could have a place of discussion | 01:07 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add role for adding launchpadlib credentials https://review.openstack.org/498633 | 01:16 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add trigger-readthedocs job https://review.openstack.org/498626 | 01:17 |
mordred | jeblair: ZOMG your legacy job succeeded! why did it take 2h? (or should I ignore that for now) | 01:21 |
dmsimard | clarkb, jeblair: I think https://review.openstack.org/#/c/496935/ is good to go | 01:22 |
dmsimard | errrrrr | 01:23 |
dmsimard | it looks like there's part of that patch missing | 01:23 |
mordred | jeblair: wow. it seems like it took an hour from start of tempest to end of job | 01:23 |
dmsimard | how did that happen | 01:23 |
*** xinliang has quit IRC | 02:36 | |
jeblair | mordred: yeah, 1h transfer time for 2.3GB in infra-cloud is, sadly, about what i was expecting. that's about 5mbps. :( | 02:39 |
*** xinliang has joined #zuul | 02:49 | |
*** xinliang has quit IRC | 02:49 | |
*** xinliang has joined #zuul | 02:49 | |
dmsimard | clarkb: the dvr multinode job is failing even on completely unrelated changes so it's hard to tell if the failure is legit or not :( | 03:05 |
dmsimard | I checked various unrelated devstack-gate jobs and they're not consistently passing | 03:05 |
dmsimard | Would need an expert to tell me why they are failing :) | 03:06 |
clarkb | dmsimard: can you check experimental to see if the normal dvr tempest multinode job passes | 03:13 |
dmsimard | sure | 03:13 |
dmsimard | clarkb: this other job was supposed to be more reliable :/ | 03:13 |
dmsimard | triggered an experimental | 03:13 |
dmsimard | let's see tomorrow mornin | 03:14 |
*** SotK has quit IRC | 04:58 | |
*** SotK has joined #zuul | 04:58 | |
*** hashar has joined #zuul | 07:34 | |
*** hashar has quit IRC | 07:34 | |
*** hashar has joined #zuul | 07:34 | |
*** hashar has quit IRC | 07:41 | |
*** hashar has joined #zuul | 07:41 | |
*** electrofelix has joined #zuul | 08:55 | |
*** jkilpatr has quit IRC | 10:46 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: Add gearman server port configuration https://review.openstack.org/498753 | 10:47 |
*** hashar is now known as hasharLucnh | 10:54 | |
*** hasharLucnh is now known as hasharLunch | 10:54 | |
*** jkilpatr has joined #zuul | 11:04 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: Add gearman server port configuration https://review.openstack.org/498753 | 11:08 |
*** hasharLunch is now known as hashar | 11:35 | |
*** dkranz has joined #zuul | 12:51 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Include the prepared projects list in zuul variables https://review.openstack.org/498618 | 12:53 |
pabelanger | jeblair: mordred: is something like per region zuul-executors something that could help with infracloud bandwidth? | 14:00 |
jeblair | pabelanger: theoretically, but there's no way to make that happen until zuul v4. | 14:03 |
pabelanger | wfm! | 14:04 |
dmsimard | good morning o/ | 14:07 |
leifmadsen | o/ | 14:42 |
jeblair | clarkb, mordred: https://review.openstack.org/496935 lgtm | 14:47 |
jeblair | clarkb: https://review.openstack.org/498559 has 2x+2; but i wanted you to see it; can you +3 it? | 14:54 |
jeblair | pabelanger: can you +3 https://review.openstack.org/498270 ? | 14:55 |
dmsimard | jeblair: +1 on 498270, we definitely need a mechanism to do things like log recovery on jobs that have timed out | 14:57 |
dmsimard | jeblair: however, at what point does a "post" job time out ? :) | 14:57 |
jeblair | dmsimard: for now, it's just the same job timeout value (so we have *something*). but really we should work out how to add separate pre/post timeouts to jobs. | 14:58 |
jeblair | (the thing we need to decide there is how that works with multiple pre/post playbooks) | 14:58 |
dmsimard | jeblair: FWIW I might have discussed this with you a while back where retrieving logs on timeout'd jobs in v2 was not really possible except if you "manually" timeout'd from inside the job to allow time for log recovery | 14:59 |
* dmsimard looks for logs | 14:59 | |
jeblair | dmsimard: yes, we intend to correct that with v3 :) | 14:59 |
dmsimard | Ah, yup, I was asking about "postbuildscript" http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2017-08-08.log.html#t2017-08-08T18:58:54 | 15:00 |
jeblair | dmsimard: so to be clear, when 270 lands, we should have the behavior we all desire -- the main playbook will timeout, and then the post playbook will run. it also has a timeout -- the same value as the main timeout, but with the timer reset. | 15:00 |
jeblair | the thing to fix up in the future is to tune that post timeout so a 2h main job can have, say, a 10m post timeout, rather than a 2h post timeout as now. | 15:01 |
clarkb | jeblair: I think some jobs do depend on JOB_NAME but we aren't writing it to reproduce at all before so this should be fine | 15:03 |
jeblair | clarkb: ftr i have no problem breaking any job that depends on JOB_NAME. i think we've been very clear about that. :) | 15:04 |
clarkb | ya and reproduce isn't working for them anyways | 15:05 |
clarkb | I have approved | 15:05 |
jeblair | w00t | 15:05 |
jeblair | mordred: i've lost context on https://review.openstack.org/489780 what's going on there? | 15:09 |
jeblair | mordred: also, maybe we should make https://review.openstack.org/490216 a high priority? | 15:10 |
*** lennyb has quit IRC | 15:12 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: WIP: Add publish-openstack-python-docs to post pipeline https://review.openstack.org/498840 | 15:14 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: WIP: Add publish-openstack-python-docs to post pipeline https://review.openstack.org/498840 | 15:17 |
pabelanger | jeblair: we recently restarted zuulv3.o.o? | 15:21 |
jeblair | pabelanger: not that i'm aware. I'm waiting for you to approve 270 before i restart it. | 15:21 |
pabelanger | Oh, Hmm | 15:21 |
pabelanger | https://review.openstack.org/498621/ a syntax error for some reason in project-config from zuul | 15:21 |
pabelanger | jeblair:sorry for the delay on 270 +3 | 15:22 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add gearman server port configuration https://review.openstack.org/498753 | 15:23 |
*** lennyb has joined #zuul | 15:23 | |
jeblair | pabelanger: that's a curious error; openstack-doc-build seems to exist in openstack-zuul-jobs. | 15:25 |
pabelanger | jeblair: ya, it's been there for a while, since it is still calling run-docs.sh, but odd that it just started complaining. | 15:27 |
pabelanger | I'll check debug.log here in a moment | 15:27 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Run post playbooks on job timeout https://review.openstack.org/498270 | 15:30 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul-jobs master: Delete .pypirc file at end of task https://review.openstack.org/498843 | 15:36 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul-jobs master: Move .pypirc into tmpfs https://review.openstack.org/498844 | 15:36 |
mordred | jeblair: I've also lost context on 489780 - there was an issue a while ago - but I don't know what - and I'd rather just change all that anyway | 15:48 |
mordred | jeblair: how about I abandon it | 15:48 |
mordred | jeblair: also - oy, I will rebase 490216 right now | 15:49 |
jeblair | mordred: kk | 15:49 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Prevent execution of locally overridden core modules https://review.openstack.org/490216 | 15:50 |
mordred | jeblair: ^^ should be good now | 15:51 |
mordred | jeblair: (the rebase conflict was a single line) | 15:51 |
mordred | jeblair: fwiw, I pointed the release team at the release patches yesterday just to give them a headsup | 15:52 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add integration test for zuul_stream https://review.openstack.org/498209 | 15:53 |
mordred | pabelanger, jlk, SpamapS: could I sell any of you on +3ing https://review.openstack.org/#/c/498127 https://review.openstack.org/#/c/498209/ and https://review.openstack.org/#/c/484000/ ? | 15:53 |
jeblair | mordred, pabelanger, jlk: http://paste.openstack.org/show/619789/ | 15:56 |
jeblair | i'm concerned that the failed reconfiguration there may have left zuul with an incomplete configuration | 15:57 |
pabelanger | eep | 15:58 |
mordred | jeblair: me too - I also started looking at something related to those rate limit errors a day or two ago | 15:58 |
mordred | jeblair: I BELIEVE whats going on is that when we get a payload we do a look up to get the integration id and then cache that | 15:58 |
mordred | jeblair: but we're not first filtering to see if it's a project we're configured to do anything with | 15:58 |
pabelanger | posted this in openstack-infra by mistake: 2017-08-29 15:08:51,880 DEBUG is the first time Job openstack-doc-build not defined happened in debug.log today | 15:59 |
pabelanger | so, lines up with reconfigure failure | 15:59 |
jlk | looking | 15:59 |
mordred | jeblair: which means that things like those weird payloads from unknown repos may be eating our quota | 15:59 |
mordred | jlk: oh - there's another 'fun' thing github related - which is that we're getting some payloads from forks of ansible that we don't know anything about | 15:59 |
mordred | which makes me wonder if adding an Application to a project means that forks of that project ALSO send things | 16:00 |
jlk | that would be interesting | 16:02 |
jlk | I have to wonder if forking a repo drags along the integration, but we would need to examine these repos and see what's going on | 16:02 |
mordred | jlk: one of the forks we got something from seems to be a very old fork :( | 16:03 |
mordred | one sec, I'll get you a log snippet | 16:03 |
mordred | jlk: csc0714/ansible and cn-ansible/ansible | 16:04 |
mordred | jlk: http://paste.openstack.org/show/619794/ | 16:05 |
jlk | Maybe we're interpreting that payload wrong. In the github app configuration page (as the app owner) you have a pane that will show you the raw payloads of everything that's been delivered. Would be interesting to find that particular payload. | 16:05 |
jlk | Every repo is _supposed_ to get their own unique ID | 16:06 |
jlk | so if forking a repo drags along the apps, it should get a new ID for that new fork. And your app would have gotten an "install" payload saying the repo installed the app | 16:06 |
mordred | jlk: http://paste.openstack.org/show/619795/ are all of the no installation id messages from the current debug log | 16:07 |
mordred | jlk: what should I look for in the payloads page? | 16:09 |
jlk | okay that's saying that we're getting a project name that we don't have a mapped install ID for | 16:09 |
jlk | uh. | 16:10 |
jlk | mordred: pull up one of the payloads that mentions that project | 16:11 |
jeblair | 2017-08-24 16:39:24,681 ERROR zuul.GithubConnection: No installation ID available for project csc0714/ansible | 16:11 |
jeblair | that's the first time one of those shows up (it also shows up with the other 2) | 16:11 |
jeblair | i don't see anything interesting immediately preceding it in zuul debug logs | 16:12 |
jlk | what I'm thinking is that zuul is getting a payload, it's extracting what it thinks is the source project of the payload and trying to find a mapping of it in the configuration. We're missing that, so the operation I think ends right there. | 16:14 |
mordred | yah - so - I think it would be helpful to add debug logging of the X-GitHub-Delivery value - that's how the payloads are listed on the web page | 16:15 |
jeblair | jlk: i don't think it ends there -- http://paste.openstack.org/show/619796/ | 16:15 |
jlk | mordred: while you're on the github page, can you get a listing of your installations? | 16:15 |
jeblair | that's why mordred was thinking it's eating our unauthenticated api counts | 16:15 |
jlk | "Your Installations" tab. | 16:15 |
jlk | does it show the fork? | 16:15 |
mordred | jlk: no - that only shows the repos that I own that it's attached to | 16:16 |
jlk | jeblair: oh interesting. That was a theoretical problem that appears to be true. | 16:16 |
jlk | jeblair: that routine gets hit when we get a "status" event, something setting a github status on a commit hash | 16:16 |
jlk | we have to search github for any pull requests that have that hash as the head. | 16:17 |
jlk | oh hrm. I wonder... | 16:17 |
jlk | I wonder if in our search, we're creating github client objects to all these other repos, and that's where we're trying to find an install ID and failing | 16:18 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add X-Github-Delivery id to debug logs https://review.openstack.org/498853 | 16:18 |
mordred | jlk, jeblair: ^^ I think that'll help us track back to a given incoming payload | 16:18 |
jeblair | well, it looks like we're getting *one* webhook event, followed by 3 "no installation id" errors, followed by one "multiple pulls" exception | 16:18 |
jlk | mordred: +2 | 16:19 |
jlk | nod | 16:19 |
jlk | jeblair: so one webhook event is the status event, which leads to searching github, and maybe making a bunch of calls to new projects. I'm walking that code right now | 16:19 |
jlk | getPullBySha is the func | 16:19 |
mordred | oh - what do you want to bet there is a PR somewhere that someone has mentioned in a different repo (or maybe that those other repos have already merged) and there is some list in the payload and we're pulling the wrong thing? | 16:19 |
jeblair | maybe there are not multiple installs, we're getting a legit webhook event from the repo we're watching, but that pr shows up in 4 projects | 16:19 |
jlk | jeblair: that's my current hunch | 16:20 |
mordred | yup | 16:20 |
jlk | Yes yes we do | 16:20 |
mordred | and then since we don't get results for those, we're not caching them, so we're doing the search every time | 16:20 |
jlk | we create a new github client object that's based on the owner/project | 16:20 |
jlk | that's when we go hunting for an installID | 16:20 |
mordred | jlk: so we should maybe limit that to owner/project that we know about in our configuration? | 16:21 |
jlk | ugh, I think this is one thing that GraphQL was supposed to make better | 16:21 |
jlk | so that we could get all the data we need in the search rather than having to poke at the API again. | 16:21 |
jlk | mordred: yeah that seems like a quick easy limit. If we don't care about it, don't even look it up | 16:21 |
mordred | jlk: ++ | 16:22 |
jlk | need to think on this some more | 16:22 |
mordred | especially since we're rate limited - and are already storing an installation id cache - that should get us installation ids for the things we care about pretty quickly | 16:22 |
jeblair | okay, a final question to bring this back around to the error that bombed us -- why is getProjectBranches resulting in unauthenticated api calls? | 16:22 |
jlk | A status event, I really really really wish github would link that status event to a particular repo | 16:22 |
mordred | jlk: right? | 16:23 |
jlk | actually | 16:23 |
jlk | wait a sec | 16:23 |
jlk | I'm looking at a status event, and it's in the payload a "name" attribute | 16:24 |
jlk | which I'm pretty sure is the repo in question | 16:24 |
jlk | and there is a "repository" key too with all kinds of info | 16:24 |
jlk | trying to page more of this stuff into my head space. | 16:25 |
jlk | jeblair: I'm not ignoring that question just yet | 16:25 |
mordred | jlk: yah - I see a "base" key with the information on what the PR is targeted to, which also includes a repo key | 16:26 |
mordred | jlk, jeblair: http://paste.openstack.org/show/619800/ | 16:27 |
jeblair | jlk: that's fine; i was waiting for a context switch to drop that in, and you just tripped me up by having a eureka moment. :) | 16:27 |
mordred | that is a payload (it's not the one in question, but it's a copy-paste of a full payload from the admin panel) | 16:27 |
mordred | looks like it has installation id in the payload, as well as repository | 16:28 |
mordred | jeblair: yes - I believe we got an incomplete configuration- I'm getting "job multinode unknown" here: https://review.openstack.org/#/c/498209/13 | 16:29 |
jeblair | mordred: yeah, i think i understand how that happened; i'm formulating a response | 16:31 |
mordred | jlk: oh - that's not a status payload | 16:31 |
jlk | I'm doing an experiment | 16:32 |
jlk | oh hrm, I think our v3 is not responding :( | 16:32 |
mordred | http://paste.openstack.org/show/619801/ | 16:33 |
mordred | we may be at ratelimit for the hour | 16:33 |
mordred | jlk: I'm tailing log - wanna do the experiment again? or were you saying the bonny v3 isn't responding? | 16:33 |
jlk | bonny | 16:34 |
mordred | nod | 16:34 |
jlk | oh I opened it wrong | 16:34 |
mordred | that paste above is a status payload, fwiw | 16:34 |
mordred | also, fwiw: | 16:35 |
mordred | 2017-08-29 16:34:44,222 DEBUG zuul.GithubConnection: GitHub API rate limit remaining: 12357 reset: 1504026379 | 16:35 |
mordred | 2017-08-29 16:34:44,360 DEBUG zuul.GithubConnection: GitHub API rate limit remaining: 55 reset: 1504028033 | 16:35 |
mordred | I'm guessing that's auth'd and unauth'd ? | 16:35 |
Shrews | mordred: comment on https://review.openstack.org/498623 | 16:37 |
jlk | mordred: it might be yea. Using the APP id is one set of limits. We might be able to be more clear about which is which. | 16:38 |
jlk | Do you have any installs to sandbox like repos on the infra zuul v3? I don't want to open something against real ansible for our testing | 16:38 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add X-Github-Delivery id to debug logs https://review.openstack.org/498853 | 16:39 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Write a logging config file and pass it to callbacks https://review.openstack.org/498127 | 16:39 |
jlk | and I think our zuul v3 is not responding | 16:39 |
mordred | Shrews: for such a simple job I can't quite get it right can I? thanks - fixed | 16:39 |
mordred | jlk: yes - https://github.com/gtest-org/ansible | 16:40 |
jlk | ah okay | 16:40 |
jlk | mordred: incoming PR, which should generate a few things, including hopefully a pending status event | 16:44 |
jlk | https://github.com/gtest-org/ansible/pull/4 | 16:45 |
jlk | is there a configured pipeline? | 16:45 |
mordred | yup | 16:45 |
mordred | 2017-08-29 16:45:48,285 DEBUG zuul.GithubConnection: Scheduling github event from github: pull_request | 16:46 |
jlk | does that pipeline have a configuration to set status at all? | 16:46 |
jlk | (where is the pipeline config for this?) | 16:46 |
mordred | in openstack-infra/project-config | 16:47 |
mordred | jlk: http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul.yaml | 16:48 |
mordred | jlk: let's restart zuul scheduler with those patches from above and try again | 16:48 |
mordred | it's too hard to attempt to find the payload on the gh page without the id | 16:48 |
jlk | okay, I guess that zuul is busy a bit because i haven't seen the status come through | 16:48 |
jlk | (or we're hitting the bug in question preventing it from happening) | 16:48 |
mordred | jeblair: restarting zuul scheduler | 16:49 |
jlk | Funny that there's status: pending without quotes, but the other two statuses are quoted: 'success' and 'failure' | 16:49 |
jeblair | jlk: we can clean that up; quotes are optional in all 3 of those cases | 16:50 |
jeblair | (i'd prefer to drop them) | 16:50 |
mordred | scheduler restarted with logging and github delivery id patches applied | 16:52 |
mordred | (so that means we should also get base pre-playbook logged properly again) | 16:53 |
jlk | okay will re-open the PR | 16:53 |
jlk | re-opened, you should get the new logging. I'm concerned that it never reports status | 16:54 |
*** electrofelix has quit IRC | 16:57 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Update config cache only after all cat jobs complete https://review.openstack.org/498872 | 16:57 |
jeblair | mordred, jlk, pabelanger: that ^ should fix the problem where a failed reconfig causes further dynamic reconfigs to fail. with that, the github error "only" should have caused us to fail to reconfigure (which, of course, is still a critical error) | 16:58 |
jlk | mordred: any indication from the logs if / where it's failing to process the reopen event far enough to set a pending status? | 16:58 |
mordred | jeblair: startup issues - working on it | 16:59 |
jeblair | jlk: http://paste.openstack.org/show/619804/ | 17:00 |
jeblair | (that also suggests that the code on disk and in memory are not in sync) | 17:00 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Fix badly named function typo https://review.openstack.org/498877 | 17:00 |
jlk | seems that got cut off, I don't see the full trace | 17:00 |
mordred | jeblair: that ^^ - sorry | 17:00 |
jeblair | jlk: that was the full trace -- that's what i meant by the code not being in sync. | 17:01 |
jlk | ah :( | 17:01 |
jeblair | jlk: the actual line is probably nearby | 17:01 |
pabelanger | I know I had some issues getting -e git+https://github.com/sigmavirus24/github3.py.git@develop#egg=Github3.py to install properly, wonder if puppet is failing somehow now. | 17:02 |
mordred | ok. scheduler running again | 17:02 |
pabelanger | also, we really should ask sigmavirus24 for a release. I tried reaching out to him / her, but didn't hear anything back. Maybe others will have a better chance | 17:02 |
jlk | hrm I wonder if it's the new line that we just added, request.headers.get | 17:02 |
jlk | I just sent a "recheck" comment, which should trigger the pipeline | 17:03 |
mordred | cool | 17:03 |
jeblair | jlk: is this interesting? http://paste.openstack.org/show/619806/ | 17:04 |
mordred | 2017-08-29 17:03:35,202 DEBUG zuul.GithubWebhookListener: Github Webhook Received: fe3b9e94-8cdb-11e7-99e4-7f96d0d39b8a | 17:04 |
mordred | woot | 17:04 |
jlk | o_O | 17:04 |
mordred | that's actually not your thing | 17:05 |
jlk | jeblair: somewhat. 2 things. | 17:05 |
jlk | jeblair: 1) we do not currently handle review_comments (because those are different than just a comment on the pull request.) 2) I thought we gracefully dropped things we don't handle events for, looks like there is a problem in that graceful handling | 17:05 |
*** hashar is now known as hasharDinner | 17:06 | |
jlk | well, I think I got the data I need from our v2 install | 17:09 |
jlk | what I wanted to verify is that the "name" key in the status payload is the target repo, not the forked repo. We can narrow our search for PRs by just adding that into the search terms. | 17:09 |
mordred | jlk: I'm not seeing webhooks for those recheck comments at all | 17:09 |
jlk | okay I'll close/open | 17:09 |
jeblair | jlk: http://paste.openstack.org/show/619807/ there's the correct traceback | 17:10 |
mordred | jlk: also, we should figure out comments if we're not getting them | 17:10 |
jlk | yes | 17:10 |
jlk | jeblair: oh that makes some sense. We found no commit, so trying to do something on a None. boo. | 17:10 |
jlk | probably because we're failing to find the pr or something | 17:11 |
mordred | jlk: http://paste.openstack.org/show/619808/ | 17:11 |
mordred | jlk: that's the payload from you reopening the pr | 17:12 |
jlk | yeah but did it do anything on zuul side? | 17:12 |
mordred | jlk: yes - it triggered something then hit api-rate limit issue | 17:13 |
mordred | one sec | 17:13 |
mordred | jeblair: I can't get the scheduler started with server zuul-scheduler start - but I can if I run it with -d on the command line | 17:13 |
mordred | jlk: but | 17:14 |
mordred | gah | 17:14 |
mordred | jeblair: but I also don't see any errors when it doesn't start - any thoughts? | 17:14 |
pabelanger | pidfile still exist? | 17:14 |
jeblair | mordred: systemd may think it's still running; throw some 'stop' and 'status' at it. | 17:14 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Limit github PR search to project status is from https://review.openstack.org/498879 | 17:14 |
jlk | ^^ that may help a lot | 17:14 |
pabelanger | ya, I've found systemd using init.d script pretty odd. I've been meaning to write systemd unit files for zuul | 17:15 |
mordred | that was it | 17:15 |
jeblair | mordred: (i think especially if you stop it any way other than via systemd, systemd gets confused) | 17:15 |
mordred | pabelanger: yes please. the half-way-in-between state we're in now is pretty brittle. I don't like systemd, but I like systemd managing my non-systemd init things even less | 17:15 |
mordred | jeblair: yah | 17:15 |
jeblair | pabelanger: after ptg please :) | 17:16 |
pabelanger | :) | 17:16 |
mordred | jlk: http://paste.openstack.org/show/619810/ | 17:18 |
jeblair | pabelanger: some comments on 498588 | 17:18 |
mordred | jlk: that's the log entries for the pr being opened | 17:18 |
jlk | So I suspect what's happening is that a status for ansible/ansible comes in, mentioning a particular sha. Zuul then searches all of github for any pr that is open that has the sha as part of the PR, and it's finding other forks of ansible that have open PRs (why the fork has a PR I don't know) that also have that hash and it's getting confused | 17:19 |
jlk | mordred: oh, boo. we're just hitting API limits. | 17:20 |
mordred | jlk: maybe those forks just have the sha in their master branch? like it's a fork that's carrying local changes? | 17:20 |
jlk | probably because we're over searching | 17:20 |
jlk | mordred: I'm not sure yet, I'm going to replicate the search and see what I get | 17:20 |
mordred | jlk: yah - because of the other fork thing | 17:20 |
mordred | jlk: cool | 17:20 |
jlk | gotta find that line somewhere in the IRC log | 17:20 |
jeblair | jlk: if we narrow that search to just the project we're interested in, can that search use the auth api limits? | 17:21 |
mordred | jeblair, pabelanger: y'all ok with me restarting ze01 executor to pick up logging changes? | 17:21 |
jlk | jeblair: maybe? | 17:21 |
jlk | so I re-did one of the searches we logged about having multiple hits on. | 17:21 |
jlk | https://github.com/search?utf8=✓&q=8e6c0ca5996bf1057ec346d68ed85eec8b25ca11+type%3Apr+is%3Aopen&ref=simplesearch | 17:21 |
jlk | interestingly enough, it doesn't find the ansible/ansible one | 17:22 |
jlk | maybe because it's closed? | 17:22 |
pabelanger | mordred: no issue here | 17:22 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Limit github PR search to project status is from https://review.openstack.org/498879 | 17:22 |
jlk | jeblair: I think I already did that.. | 17:22 |
jlk | oh I see what you're doing. | 17:23 |
jeblair | jlk: i fixed a small thing in your change so you can keep digging | 17:23 |
jlk | nod | 17:23 |
jeblair | jlk: it also doesn't return anything from gtest-org/ansible | 17:24 |
jlk | Okay, so https://github.com/ansible/ansible/pull/26282 was the trouble PR | 17:24 |
mordred | jeblair: I did 'service zuul-executor stop' on ze01 and still show 5 zuul-executor processes running - should I figure out what they're doing? | 17:24 |
jlk | 11 hours ago there was a status that came in | 17:24 |
jlk | on that hash | 17:25 |
mordred | jeblair: stat("/var/lib/zuul/builds/f90ee26921054a2caca95a36a3dbabba/work/logs/job-output.txt | 17:25 |
mordred | jeblair: I think they're orphaned log streaming processes | 17:25 |
jeblair | mordred: :( | 17:25 |
mordred | jeblair: oh - that build dir is still there | 17:25 |
jeblair | mordred: keep! | 17:26 |
jlk | and the three open PRs it found were um... huge? | 17:26 |
jeblair | mordred: i bet 'keep' causes our log streaming cleanup detection to fail | 17:26 |
mordred | jeblair: yah. I'm guessing keep is on - so at shutdown we're not deleting those dirs so the log streaming doesn't notice that the job has stopped via the log file going away | 17:26 |
mordred | jeblair: yup | 17:26 |
jlk | somebody was struggling with github I think. Opening PRs to merge upstream devel with downstream fork devel. | 17:26 |
jeblair | mordred: anyway, i think you can kill 'em. | 17:27 |
mordred | jeblair: cool - we can fix that edge case later | 17:27 |
jeblair | mordred: and i don't need keep anymore; no need to turn it back on when you restart | 17:27 |
jlk | re search rate limits | 17:28 |
jlk | "The Search API has a custom rate limit. For requests using Basic Authentication, OAuth, or client ID and secret, you can make up to 30 requests per minute. For unauthenticated requests, the rate limit allows you to make up to 10 requests per minute." | 17:28 |
jlk | crap I had a disconnect, what was the last you all saw from me? | 17:29 |
mordred | jeblair: ok. executor restarted with logging changes and new version of ara | 17:30 |
mordred | jlk: to make up to 10 requests per minute." | 17:30 |
jlk | okay. Yeah. So, I think we definitely need to re-work this a bit to use authe'd github | 17:31 |
jlk | and... and maybe we could stop searching and instead just get a dump of all the PRs of the target repo instead, and examine them to see what the HEADs are | 17:31 |
jlk | I realize that our Depends-On stuff is using search as well | 17:32 |
jlk | so every time we deal with a PR we're going to be hitting the search API :( | 17:32 |
mordred | jlk: yah - that seems less scalable than we'd like | 17:32 |
*** Shrews has quit IRC | 17:33 | |
jlk | thankfully we cache that info | 17:33 |
jlk | er rather we cache the PR, it's unclear to me just yet if we'd search EVERY time we get an event regarding the same PR | 17:33 |
*** Shrews has joined #zuul | 17:34 | |
mordred | jlk: yah. so - if we did a single search of all PRs when we start and maintain a cache, then update that cache whenever we get an event - we should be able to search almost never at the cost of a big search at start? | 17:35 |
mordred | jlk: or something | 17:35 |
jlk | not exactly what I was thinking. | 17:35 |
jlk | I think I can eliminate the search we do when we get a status event | 17:36 |
mordred | oh - neat | 17:36 |
jlk | not touching the Depends-On bit yet | 17:39 |
jlk | jeblair: don't merge that change, I'm going to try something different | 17:39 |
mordred | jeblair: v3 is running weird and I think I may want help looking at it | 17:41 |
mordred | jeblair: https://review.openstack.org/#/c/498877 doesn't seem to be being enqueued - neither via an additional +A or a recheck comment | 17:41 |
jlk | crap, gotta go my car is ready. I'll be back at this in 30 minutes or so | 17:42 |
*** maxamillion has quit IRC | 17:53 | |
Shrews | hrm, we have some project-config zuulv3 jobs that are misconfigured. luckily, doesn't look like they are used | 17:57 |
*** maxamillion has joined #zuul | 17:57 | |
Shrews | https://review.openstack.org/498895 | 18:00 |
mordred | Shrews: thanks - that was my oops | 18:05 |
mordred | jeblair: ok - I think I've got things restarted happy - I think we were stuck in a weird place with bogus config due to the ratelimit error which was causing jobs to not trigger or something | 18:10 |
mordred | jeblair: you may want to look at executor-debug.log - as things were starting up there was a pile of: | 18:10 |
mordred | 2017-08-29 18:09:25,459 DEBUG zuul.ExecutorServer: Finished updating repo github/gtest-org/ansible | 18:10 |
mordred | 2017-08-29 18:09:25,536 DEBUG zuul.ExecutorServer: Got cat job: a6cf8c99cdc4467fa35eeda885e188f3 | 18:10 |
jeblair | mordred: hrm; i was just looking into the 498877 error and i didn't see any signs of a bogus config | 18:11 |
jeblair | mordred: what's unexpected about gtest-org/ansible cat jobs? | 18:12 |
mordred | jeblair: nothing necessarily - there were just a lot of them one after the other, so it triggered my eyeball 'maybe it's an issue' | 18:12 |
jeblair | mordred: ansible has a lot of branches | 18:12 |
mordred | jeblair: maybe it's just that when i've seen similar before in the scheduler log it says something about the branch it's doing - whereas this was just a sequence of cat jobs that seemed identical in the executor log | 18:13 |
mordred | jeblair: not important - just a thing to know as an admin that that's not a sign something is stuck in a loop | 18:13 |
jeblair | mordred: well, i don't know what was wrong earlier, but 498877 is working now | 18:14 |
mordred | jeblair: yes | 18:14 |
jeblair | mordred: next time it'd be nice to figure out what's going on though | 18:14 |
mordred | jeblair: I agree | 18:15 |
jeblair | mordred: next time, please avoid restarting until we've diagnosed the problem | 18:15 |
mordred | jeblair: will do | 18:16 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Make a docs-on-readthedocs project-template https://review.openstack.org/498903 | 18:30 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Make a docs-on-readthedocs project-template https://review.openstack.org/498903 | 18:43 |
jeblair | mordred: https://review.openstack.org/498877 is all about some post fail | 18:44 |
jeblair | mordred: is that because of the bug that change is fixing? | 18:44 |
jeblair | i'm rechecking and trying to catch the error on stream | 18:47 |
Shrews | just happened on https://review.openstack.org/498626 too | 18:48 |
Shrews | Could not clean up: 'AraCli' object has no attribute 'ara_context' | 18:50 |
Shrews | ?? | 18:50 |
jeblair | dmsimard: are we using your new point release? | 18:51 |
dmsimard | jeblair: the new dot release would be 0.14.1 | 18:51 |
jeblair | 2017-08-29 18:51:41.727936 | localhost | module 'logging' has no attribute 'config' | 18:51 |
jeblair | that's the line before what Shrews pasted | 18:52 |
dmsimard | jeblair: I guess it would need to be updated on the executor ? | 18:52 |
dmsimard | It's probably not automatically updated ? | 18:52 |
Shrews | http://paste.openstack.org/show/619822/ | 18:52 |
dmsimard | It's not installed for every job, right ? | 18:52 |
jeblair | ara==0.14.1 | 18:52 |
jeblair | dmsimard: it's updated whenever we update zuul | 18:52 |
jeblair | which is every time we land a change :) | 18:52 |
dmsimard | ok let me see | 18:53 |
dmsimard | bah there's no logs in that job ? | 18:53 |
dmsimard | you're looking from the executor directly ? | 18:53 |
jeblair | dmsimard: no logs on any job. Shrews and i were streaming output | 18:54 |
jeblair | dmsimard: but that's all there is; there's no traceback | 18:54 |
dmsimard | okay | 18:54 |
dmsimard | jeblair: the log config just landed in zuul too right ? | 18:54 |
* dmsimard looks at that patch again | 18:54 | |
dmsimard | this one https://review.openstack.org/#/c/498127/ | 18:55 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul feature/zuulv3: Add integration test for zuul_stream https://review.openstack.org/498209 | 18:55 |
dmsimard | rebased this patch which has integration tests to see what happens out of curiosity | 18:56 |
jeblair | dmsimard: no job will upload logs at this point | 18:56 |
dmsimard | http://zuulv3.openstack.org/static/stream.html?uuid=102ad366d10a4c74ac138646cc2bd2c1&logfile=console.log | 18:56 |
jeblair | dmsimard: but if that job outputs something useful in the stream, you'll see it there | 18:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Remove bindep_command and bindep_fallback references https://review.openstack.org/498913 | 18:57 |
jlk | back now | 18:58 |
jeblair | remote: https://review.openstack.org/498915 Import logging.config as well as logging | 18:59 |
jeblair | mordred, dmsimard: ^ | 18:59 |
dmsimard | bleh | 19:00 |
dmsimard | jeblair: FWIW, full logs http://paste.openstack.org/raw/619824/ | 19:00 |
dmsimard | the integration test passed | 19:00 |
dmsimard | but the "real" post job failed | 19:00 |
dmsimard | not sure what's going on there | 19:00 |
dmsimard | er, for some reason paste.o.o didn't pick up the whole thing | 19:01 |
dmsimard | https://paste.fedoraproject.org/paste/jE8DDlAuHKmkssVK2cZnVw/raw | 19:01 |
jeblair | dmsimard: does that job run "ara generate html" ? | 19:01 |
dmsimard | jeblair: it does: 2017-08-29 18:59:14.738256 | TASK [Generate ARA html] | 19:02 |
jeblair | (infra team meeting starting now in #openstack-meeting ; i have a zuulv3 topic about git caching) | 19:02 |
jeblair | dmsimard: does it use the new logging config? | 19:02 |
dmsimard | The first one passes (where the controller node runs it) and the second one fails (where the executor runs it) | 19:02 |
dmsimard | jeblair: I assume so, I rebased the patch | 19:02 |
dmsimard | jeblair, mordred: added a comment in https://review.openstack.org/#/c/498915/ | 19:16 |
mordred | dmsimard, jeblair: I think I know why the integration test works and prod doesn't | 19:19 |
dmsimard | go on | 19:19 |
mordred | we pass the env var to the invocation of ansible-playbook which sets it up properly for the callback plugin | 19:20 |
dmsimard | because that's the kind of issue I want to avoid with integration tests | 19:20 |
mordred | but we do not pass it to the generate html command | 19:20 |
dmsimard | not sure I follow who's doing what, you mean the executor doesn't currently pass the required env var when running the generate command and we do it in the integration tests ? | 19:25 |
mordred | hrm. no - I'm back to being confused. we pass and don't-pass things consistently across the two | 19:27 |
mordred | we don't set the env var for generate html in either place | 19:28 |
mordred | and we set the env var for ansible-playbook in both places | 19:28 |
jlk | hrm, what would I need to put into a logging config so that it just spews stuff to stdout? It used to do this a while ago, but since I last ran my dockers, everything is by default going into a log file | 19:29 |
dmsimard | mordred: where does this end up running, the executor ? https://github.com/openstack-infra/project-config/blob/master/playbooks/base/post-logs.yaml | 19:42 |
mordred | dmsimard: yes. that runs on the executor - but via ansible, so env vars passed to the invocation of ansible-playbook should not pass through | 19:42 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul-jobs master: Create upload-afs role https://review.openstack.org/498588 | 19:43 |
dmsimard | mordred: I guess we emulate the same thing here where the logging config is set for the ansible-playbook task but not the generate task https://review.openstack.org/#/c/498209/14/playbooks/zuul-stream/functional.yaml | 19:44 |
mordred | dmsimard: yah | 19:44 |
mordred | jlk: we actually JUST landed a change to how that works | 19:45 |
dmsimard | mordred: we're absolutely positive both zuul and ara are up to date on the executor, was it reloaded ? does it need a *restart* or something ? | 19:45 |
dmsimard | mordred: otherwise I'm missing something obvious as to why it works in the integration job and not "fo real" | 19:45 |
mordred | dmsimard: the executor has been restarted after being re-installed | 19:45 |
mordred | yah. this is a thing that very much confuses me | 19:45 |
dmsimard | mordred: the log config generation, does that happen automatically when running the executor ? | 19:46 |
dmsimard | mordred: because we seem to be doing it explicitly in the job | 19:46 |
mordred | yes. the executor generates the log config | 19:46 |
mordred | and then passes the path to it via the env var | 19:46 |
dmsimard | and I guess the logging patch is effective on the executor because we're no longer missing the first couple lines from the output and we're not seeing the alembic/migration/ara logging | 19:48 |
jlk | haha I have a file named /var/log/zuul/{server}.log | 19:48 |
dmsimard | jlk: oops | 19:48 |
dmsimard | bad substitution somewhere :) | 19:49 |
dmsimard | that makes me remember the days where ansible created a literal '$HOME' file | 19:49 |
dmsimard | mordred: yup, I just re-read the entire integration job log and I don't see any hints. | 19:50 |
mordred | jlk: https://etherpad.openstack.org/p/sI8hoI3Tah | 19:50 |
mordred | jlk: that should get you "log everything to stdout" | 19:51 |
jlk | and that goes in a logging config file somewhere? | 19:51 |
jeblair | jlk, mordred: if you pass -d shouldn't it go to stdout anyway? | 19:52 |
jeblair | (i'm trying to figure out why going to stdout *and* backgrounding are desirable) | 19:53 |
jlk | hrm. | 19:53 |
jlk | yeah I'm running -d and -c /path/to/conf | 19:53 |
mordred | jeblair: yes. with -d it SHOULD go to stdout | 19:53 |
jlk | you'd think with -d it should. but it's not | 19:53 |
jlk | it's defaulting to writing to a {server}.log file | 19:54 |
mordred | well- it certainly shouldn't do that :) | 19:54 |
dmsimard | mordred: any way we can test that jeblair's patch fixes the issue before I go ahead and rush a release ? | 19:54 |
mordred | jlk: does your zuul.conf file you're pointing to with -c have a logging config in it? | 19:55 |
jlk | no, those lines are commented out | 19:55 |
mordred | dmsimard: I mean - I have tested that trying to use logging.config without importing it is an error | 19:55 |
jlk | (side note, just realized that a webhook_token is now required. drats) | 19:55 |
mordred | jlk: we also updated the names of the parameters - not sure if that happened while you were away or not | 19:56 |
mordred | jlk: app_key= app_id= and webhook_token= now | 19:56 |
dmsimard | mordred: I believe you, just still wish we could reliably reproduce this in the integration test (which is the whole point) | 19:56 |
jeblair | mordred: i spot the logconfig error | 19:56 |
dmsimard | mordred: can we run the real *executor* instead of just "ansible-playbook" 6 | 19:57 |
dmsimard | s/6/?/ | 19:57 |
mordred | dmsimard: that's a WAY more complex thing to do - very unlikely we'll get that done before PTG | 19:57 |
dmsimard | mordred: the fact that we can do it is already good news, but sure, after the ptg | 19:58 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix typo in ServerLoggingConfig https://review.openstack.org/498922 | 19:58 |
jeblair | mordred, jlk: ^ that should fix the {server} thing | 19:58 |
mordred | jeblair: I agree - but I'm confused as to why jlk hit that if he doesn't have logging config configured | 19:59 |
dmsimard | mordred: but I mean, by exercising the executor instead of ansible-playbook, we would probably be able to reproduce the error | 19:59 |
jeblair | mordred: that's the default? | 19:59 |
jlk | that fixes the file names indeed | 20:00 |
jlk | which is certainly part of the problem :D | 20:00 |
mordred | if hasattr(self.args, 'nodaemon') and not self.args.nodaemon: | 20:00 |
mordred | logging_config = logconfig.ServerLoggingConfig() | 20:00 |
mordred | oh - well there it is | 20:00 |
jlk | ah | 20:01 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Fix a backwards boolean comparison for nodaemon logging https://review.openstack.org/498923 | 20:01 |
jlk | logic reverse | 20:01 |
jlk | firing up here with that change | 20:02 |
mordred | jeblair: ok. with those two patches that issue should be fixed. | 20:02 |
jlk | hrm. | 20:03 |
jlk | so with that change, I still get log files written to | 20:03 |
jlk | rather than console out | 20:03 |
mordred | jlk: when you run with -d ? | 20:03 |
jlk | yeah | 20:04 |
jlk | /srv/zuul/bin/zuul-scheduler -d -c /run/secrets/zuul.conf | 20:05 |
mordred | jlk: can you slam in a print in setup_logging in zuul/cmd/__init__.py to see what hasattr(self.args, 'nodaemon') and self.args.nodaemon are? | 20:06 |
jlk | sure | 20:06 |
jlk | erm | 20:09 |
mordred | jeblair: I have set keep on on ze01 | 20:09 |
jlk | well. I don't have log files written out | 20:09 |
mordred | jlk: but you still have no output on stdout? | 20:09 |
jlk | yeah | 20:10 |
mordred | grump | 20:10 |
jlk | not even the print statement. Not sure where that went | 20:10 |
mordred | jlk: are your dockers eating stdout somehow perhaps? | 20:10 |
jlk | I don't think so, I get plenty of stdout for zookeeper, and I was getting tracebacks on stdout | 20:10 |
jlk | maybe did the default level of logging change? | 20:11 |
jlk | like I need to add more verbosities ? | 20:11 |
jlk | there does not appear to be a command line option for more verbosities :( | 20:11 |
mordred | jlk: zuul/ansible/logconfig.py line 83 - bump that to DEBUG (and sorry for the hassle here) | 20:13 |
mordred | jlk: we have different levels set for different things, but then the console output is filtering to warning | 20:14 |
jlk | no worries, I had getting my dockers going again as a backburner that I let sit too long | 20:14 |
jlk | hasattr is True, and the args.nodaemon is True | 20:15 |
jlk | both boolean | 20:15 |
mordred | ok. I think it's the other thing | 20:15 |
jlk | trying the debug | 20:16 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Set better defaults for server logging https://review.openstack.org/498928 | 20:16 |
jlk | oh yeah that looks familiar | 20:16 |
jlk | although | 20:16 |
jlk | every message is duplicate | 20:16 |
mordred | jlk: ^^ that patch does a few more things | 20:16 |
jlk | there's two lines printed for every entry | 20:16 |
mordred | yah - one sec | 20:17 |
jeblair | mordred: can you make the defaults for -d be debug? that was the old behavior | 20:18 |
jeblair | (and i think desirable) | 20:18 |
mordred | jlk: https://review.openstack.org/#/c/498635 | 20:18 |
mordred | jeblair: yes I can - one sec | 20:18 |
dmsimard | mordred, jeblair: 0.14.2 pushed | 20:19 |
pabelanger | known issue? Could not clean up: 'AraCli' object has no attribute 'ara_context' | 20:20 |
jeblair | pabelanger: yes. scrollback. | 20:21 |
pabelanger | Thanks, see it now | 20:21 |
jeblair | pabelanger, mordred, dmsimard: i will update and restart ze01 | 20:21 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Raise default logging level to debug if nodaemon is passed https://review.openstack.org/498931 | 20:22 |
dmsimard | jeblair: 0.14.2 on pypi but surely not on -infra mirrors yet, not sure if the executor node uses the mirrors | 20:23 |
jeblair | dmsimard: nope, pypi | 20:23 |
dmsimard | ok | 20:23 |
jeblair | actually, we shouldn't need a restart should we? | 20:23 |
dmsimard | jeblair: thanks for finding (and fixing!) the issue | 20:23 |
jeblair | i upgraded ara | 20:24 |
dmsimard | for the restart, not entirely sure how it's loaded into the executor. | 20:25 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Remove default root handler fallback to console https://review.openstack.org/498635 | 20:25 |
mordred | jeblair: I don't think we're doing the copy trick with ara yet | 20:25 |
mordred | 268851 | 20:26 |
mordred | gah | 20:26 |
mordred | jeblair: which is to say I do not think you need to restart | 20:26 |
jlk | mordred: those three recent changes look good here | 20:27 |
mordred | woot | 20:27 |
dmsimard | jeblair: so, circling back to v3 integration testing with ara -- this is the kind of issue I hope to be catching either through executor integration tests (which, to my understanding, we don't have yet), zuul_stream tests (WIP) and integration tests directly from ara's gate | 20:27 |
dmsimard | I feel bad about ara breaking the gate a few times for what seems like things that could've been relatively easily caught | 20:27 |
*** jkilpatr has quit IRC | 20:28 | |
dmsimard | the other challenge is properly gating what ends up in zuul-jobs (especially things from base roles that include roles from zuul-jobs) | 20:30 |
mordred | dmsimard: yes - I totally agree with that (and I'm sad that the integration test we set up did not catch this and still want to know why) | 20:31 |
mordred | like - it's not like we didn't actually set up a specific job just for this case ... which makes me extra sad it broke | 20:31 |
jeblair | dmsimard: no worries. we know what we signed up for. i'm still happy to fix it after the ptg | 20:34 |
jeblair | the job i'm watching just passed ara-emit-html | 20:34 |
jeblair | http://logs.openstack.org/77/498877/1/check/tox-pep8/79f316d/ara/ | 20:35 |
dmsimard | yay | 20:35 |
dmsimard | well, that's a relief | 20:35 |
mordred | jeblair: woot! | 20:35 |
jlk | well, poop. | 20:42 |
jlk | I can't see where scheduler is sitting and spinning on processing of github events | 20:42 |
*** jkilpatr has joined #zuul | 20:47 | |
dmsimard | clarkb, jeblair, mordred: so now that inventory/overlay will be landing, any other tree you want me to bark at ? | 20:52 |
dmsimard | fix_disk_layout is also about to land, just missing a +3 https://review.openstack.org/#/c/496935/ | 20:52 |
mordred | dmsimard: done | 20:54 |
dmsimard | I was thinking just now that we should probably consider pinning to Ansible<2.4 instead of Ansible >=2.3.0.0 | 20:56 |
dmsimard | So they don't break stuff in our faces, makes sense ? | 20:56 |
clarkb | in d-g its an == | 20:57 |
clarkb | or do you mean with zuul? | 20:57 |
dmsimard | In Zuul, yes | 20:58 |
dmsimard | zuul currently has >=2.3.0.0 | 20:58 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add trigger-readthedocs job https://review.openstack.org/498626 | 20:59 |
dmsimard | 2.4 changes a LOT of things, I haven't yet spent time getting ara to work with it but I have non-voting jobs on devel (2.4) and they're broken | 20:59 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix badly named function typo https://review.openstack.org/498877 | 21:00 |
dmsimard | 2.4 is still planned for mid september, with a first release candidate sept 6th | 21:00 |
* dmsimard sends a patch, we can discuss it there if need be | 21:00 | |
jlk | blah anybody got tips on signing the json data going into a requests call? | 21:02 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul feature/zuulv3: Pin Ansible to <2.4 https://review.openstack.org/498941 | 21:05 |
jlk | n/m got it | 21:06 |
*** hasharDinner is now known as hashar | 21:07 | |
jlk | So I guess there is a question | 21:25 |
jlk | that I'm about to answer with data | 21:26 |
jlk | OKAY GOOD NEWS | 21:31 |
jlk | fetching all the pulls of a PR does not decrease the rate limit by the # of pulls. | 21:32 |
jlk | and getting the same event a second time hits all the good cache points and we don't reduce the rate limit at all | 21:32 |
jeblair | jlk: yay! | 21:34 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Pin Ansible to <2.4 https://review.openstack.org/498941 | 21:34 |
*** olaph1 has joined #zuul | 21:42 | |
*** olaph has quit IRC | 21:42 | |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Improve function to find PR from commit status https://review.openstack.org/498957 | 21:46 |
jlk | jeblair: mordred ^^ that should help somewhat with search API limits and of trying to hit repos we don't care about. | 21:48 |
mordred | jlk: looks great - other than tests failing | 21:56 |
*** olaph1 is now known as olaph | 22:05 | |
*** dkranz has quit IRC | 22:08 | |
*** hashar has quit IRC | 22:09 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add mirror-workspace-git-repos role https://review.openstack.org/498967 | 22:23 |
jlk | oh hah | 22:23 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add start-zuul-console role https://review.openstack.org/498968 | 22:26 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Improve function to find PR from commit status https://review.openstack.org/498957 | 22:32 |
mordred | jlk: that'll do it :) | 22:53 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul-jobs master: Add a role to remove an ssh private key https://review.openstack.org/498530 | 22:57 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add mirror-workspace-git-repos role https://review.openstack.org/498967 | 23:11 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add start-zuul-console role https://review.openstack.org/498968 | 23:11 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add mirror-workspace-git-repos role https://review.openstack.org/498967 | 23:14 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add start-zuul-console role https://review.openstack.org/498968 | 23:14 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add mirror-workspace-git-repos role https://review.openstack.org/498967 | 23:16 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add start-zuul-console role https://review.openstack.org/498968 | 23:16 |
mordred | jeblair: jlk's patch https://review.openstack.org/498957 is good to go and should help with our ratelimit fun | 23:18 |
mordred | jlk, jeblair: also, so that we don't lose track of the stack, https://review.openstack.org/#/c/498635 and the 3 before it are ready - and the first three each have a +2 from one of you | 23:22 |
mordred | jeblair: with your latest copying patch, you should be able to make the devstack-legacy job base on base-test right? or were you thinking of synthetically testing that instead | 23:23 |
jeblair | mordred: i was going to rebase devstack-legacy | 23:23 |
mordred | jeblair: cool | 23:25 |
pabelanger | https://review.openstack.org/498588/ is ready for review again, that is our upload-afs role | 23:25 |
pabelanger | https://review.openstack.org/498621/ is our new publish-openstack-python-docs jobs too | 23:26 |
pabelanger | actually, I still see an issue with 498621, let me update quickly | 23:26 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Improve function to find PR from commit status https://review.openstack.org/498957 | 23:33 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix a backwards boolean comparison for nodaemon logging https://review.openstack.org/498923 | 23:36 |
mordred | pabelanger: speaking of - https://review.openstack.org/#/c/498623/ has a publish-to-pypi project-template - since by the time we land it we'll have all the bits we need for that | 23:36 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Set better defaults for server logging https://review.openstack.org/498928 | 23:37 |
pabelanger | mordred: nice | 23:37 |
mordred | pabelanger: so - if I can work down my current stack, I believe we'll be 100% done with publish-to-pypi- complete with constraints patches and release announcements! | 23:38 |
pabelanger | mordred: woot! | 23:38 |
pabelanger | I am hoping afs jobs won't be much far behind | 23:39 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Raise default logging level to debug if nodaemon is passed https://review.openstack.org/498931 | 23:39 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Remove default root handler fallback to console https://review.openstack.org/498635 | 23:40 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Create upload-afs role https://review.openstack.org/498588 | 23:41 |
mordred | pabelanger: +100 | 23:41 |
mordred | jeblair, pabelanger: it's still in check, but https://review.openstack.org/#/c/498903 should be good to go now (pre-reqs have landed so recheck works) | 23:42 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add mirror-workspace-git-repos role https://review.openstack.org/498967 | 23:42 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add start-zuul-console role https://review.openstack.org/498968 | 23:43 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Make a docs-on-readthedocs project-template https://review.openstack.org/498903 | 23:50 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix typo in ServerLoggingConfig https://review.openstack.org/498922 | 23:50 |
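
Two of the logging problems discussed above (the `module 'logging' has no attribute 'config'` error and the question of how to get everything on stdout) can be illustrated with the standard library alone. The following is a minimal sketch, not Zuul's actual logging setup: the helper name `stdout_logging_config`, the logger name `example.scheduler`, and the format string are assumptions made for illustration. It relies on the fact that `logging.config` is a submodule and has to be imported explicitly, since `import logging` by itself does not load it.

```python
import logging
import logging.config  # explicit import; "import logging" alone does not load this submodule


def stdout_logging_config(level="DEBUG"):
    """Return a dictConfig dictionary that sends all records to stdout.

    Hypothetical helper for illustration; not part of Zuul.
    """
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "simple": {
                "format": "%(asctime)s %(levelname)s %(name)s: %(message)s",
            },
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "stream": "ext://sys.stdout",
                "level": level,
                "formatter": "simple",
            },
        },
        # One handler, attached only to the root logger.
        "root": {"handlers": ["console"], "level": level},
    }


logging.config.dictConfig(stdout_logging_config())
logging.getLogger("example.scheduler").debug("logging to stdout")
```

Attaching the single console handler only to the root logger is one way to avoid each record being printed twice, which can happen when the same handler is also attached to a child logger that propagates to the root.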