*** qwc has quit IRC | 00:05 | |
*** qwc has joined #zuul | 00:06 | |
*** xinliang has quit IRC | 02:23 | |
*** xinliang has joined #zuul | 02:35 | |
*** xinliang has quit IRC | 02:35 | |
*** xinliang has joined #zuul | 02:35 | |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: Move to dictionary list of projects zuul._projects https://review.openstack.org/513233 | 03:09 |
---|---|---|
*** yolanda has quit IRC | 03:23 | |
*** bhavik1 has joined #zuul | 04:42 | |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul feature/zuulv3: Convert zuul.projects to a dict https://review.openstack.org/514119 | 05:08 |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul feature/zuulv3: Convert zuul.projects to a dict https://review.openstack.org/514119 | 05:33 |
*** bhavik1 has quit IRC | 06:21 | |
kklimonda | SpamapS: well, there is at least one too smart for its own good firewall between executors and the scheduler - both client and server receive responses to tcp keepalives, but not from each other. | 07:29 |
kklimonda | SpamapS: and in the meantime NAT session/flow is deleted but keepalives keep coming | 07:30 |
kklimonda | SpamapS: my current theory is that I have 3 devices: A, B and C - A responds to the client keepalives, C responds to server keepalives, and B sees no traffic so it drops the session | 07:31 |
*** hashar has joined #zuul | 07:32 | |
ianw | i'm not too convinced the unit test failures on https://review.openstack.org/#/c/514119/ are caused by my change. in fact i saw the coverage test pass once | 08:40 |
ianw | i think it has to do with "- child finger://ubuntu-xenial-vexxhost-ca-ymq-1-0000380491/d2ca05bd0f6c4904aa7eb52afe58375b : FAILURE in 3s" | 08:41 |
ianw | that's the common thread i think | 08:41 |
*** sambetts|afk is now known as sambetts | 09:15 | |
*** electrofelix has joined #zuul | 09:31 | |
*** yolanda has joined #zuul | 09:34 | |
tobiash | ianw: there was a change merged in zuul which introduced a test with a race | 11:08 |
tobiash | ianw: https://review.openstack.org/#/c/514056/ fixes this | 11:09 |
*** yolanda has quit IRC | 11:09 | |
*** jkilpatr has joined #zuul | 11:19 | |
kklimonda | SpamapS: I'm now running with that patch and I'll see if that keeps the connection open/handles dropping connection https://github.com/kklimonda/gear/commit/f4c3b6193ac6a2a71b97640545a63599c00e5e22 | 11:22 |
kklimonda | Connection is probably not the place I wanted to do that, but that seemed to be the quickest approach. | 11:23 |
kklimonda | SpamapS: I'd love to get some form of that into zuul/gear so that we don't have to carry the delta around - suggestions welcome :) | 11:23 |
kklimonda | (other than not sending ECHO_REQ when we've recently heard from the server) | 11:26 |
*** weshay|bbiab is now known as weshay|ruck | 12:18 | |
*** yolanda has joined #zuul | 12:47 | |
*** lennyb has quit IRC | 12:59 | |
*** lennyb has joined #zuul | 13:05 | |
*** xinliang has quit IRC | 13:23 | |
*** lennyb has quit IRC | 13:23 | |
*** lennyb has joined #zuul | 13:27 | |
*** yolanda has quit IRC | 13:43 | |
leifmadsen | mordred: wanna schedule some more time to go over some zuul v3 docs? | 13:50 |
leifmadsen | maybe... next week or week after? | 13:51 |
leifmadsen | CC: jeblair ^^ | 13:51 |
*** yolanda has joined #zuul | 13:57 | |
Shrews | the week after, most folks will be in Sydney | 14:02 |
jeblair | leifmadsen: i'm available later this week or earlier next week. (early this week i should leave open for fixing openstack's zuul deployment; next friday i'm flying to sydney). | 14:16 |
jeblair | tobiash: it looks like we've both agreed on preferring the graph-order fix. thanks for finding that and exploring the options. | 14:17 |
leifmadsen | Shrews: ah, thought that was this week :) | 14:17 |
leifmadsen | I guess I should know better because not missing any of our team :) | 14:17 |
tobiash | jeblair: already abandoned the other one :) | 14:18 |
tobiash | jeblair: I learned something new about the zuul data model during debugging :) | 14:18 |
sambetts | Hi folks, in zuul v2 is there a way to achieve the same effect as the semaphores I see in the new documentation? | 14:23 |
jeblair | sambetts: there's a mutex (one job at a time), but not semaphores (multiple jobs) | 14:25 |
sambetts | :( I have a shared resource, which has a limited number of that resource and each job takes one so I wanted to limit my jobs to the amount of the resource I had | 14:26 |
tobiash | sambetts: if you run your own zuul v2 you can add this patch to it: https://review.openstack.org/#/c/386520/ | 14:27 |
tobiash | sambetts: that adds semaphores to zuulv2 | 14:27 |
tobiash | sambetts: I have this in my productive zuul v2 deployments | 14:27 |
sambetts | thanks, but I'm not sure I'm able to patch my zuul :( I think I might be able to achieve a similar thing in my environment by fiddling with nodepool config and limiting the number of available VMs under a certain label, but I was hoping to avoid it | 14:30 |
* sambetts files that patch away for later use though | 14:30 | |
*** yolanda has quit IRC | 14:36 | |
jeblair | sambetts: i think that should work too | 14:38 |
sambetts | just a shame that max-servers is only defined on providers, so I have to put my provider in twice, once for the jobs I have that I don't need to limit and then again for the labels I want to limit | 14:40 |
sambetts | :( | 14:40 |
* sambetts can't wait for zuulv3 to become prime time for third party CI... | 14:40 | |
sambetts | jeblair: can you help me out with this patch btw, https://review.openstack.org/#/c/512588/ I'm not sure if this is going to work, but now zuul/ansible are involved with checking out the right version of each project, I don't know if I can pin to a ref like I could pre-zuulv3 | 14:50 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Document executor/merger stats https://review.openstack.org/514343 | 14:50 |
jeblair | sambetts: replied on that change | 14:58 |
sambetts | jeblair: thanks | 14:58 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: experiment with late-binding inheritance https://review.openstack.org/511352 | 15:02 |
mordred | jeblair: morning! I just sent you an email - but tl;dr I'm in my plane seat and my flight does not have wifi | 15:11 |
jeblair | mordred: ack thx! see you on the other side! | 15:13 |
mordred | jeblair: I worked on removing the requirement for project name in config loader on the flight to ams - but it isn't working just yet - will finish debugging that and adding a few more tests - then likely hack on log streaming refactoring since that's a thing I can do self-contained on a long flight too | 15:13 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add log streaming logging and exception handling https://review.openstack.org/513811 | 15:13 |
jeblair | mordred: be aware of tobiash's change to add project name regex support (which i haven't reviewed yet): https://review.openstack.org/513368 | 15:15 |
mordred | jeblair: if I get BOTH of those done I have all of tristan and mnaser's current patches and I may try to rationalize both of them with the webpack/yarn patches - but I honestly doubt I'll get that far | 15:15 |
mordred | jeblair: oh - thanks for reminding me - I am aware of it but I wanted to grab it to make sure I'm not conflicting | 15:15 |
jeblair | mordred: kk | 15:15 |
dmsimard | Perhaps a silly question, but just making sure I understand -- with the git namespace flattening project that we want to do, does that just mean that we'll need to rename projects and modify the project layouts for each project in zuul v3 ? | 15:16 |
jeblair | mordred: i'm about to leave some questions on it, so i think some form of it will go in, but i'd suggest not depending on it | 15:16 |
dmsimard | jlk: https://review.openstack.org/#/c/513368/ came up when we were discussing how we could define (for example) the 'system-required' project template for every project without having to maintain 1000+ project layouts | 15:18 |
mordred | jeblair: ok. cool. I don't think it's super conflicty anyway | 15:18 |
jlk | ah okay | 15:18 |
jlk | dmsimard: thanks. Was there a spec attached, or did it not need one? | 15:19 |
mordred | dmsimard: I'm working on making project names go away from most in-repo project stanzas which will make that easier - but yah, required-projects entries will need to get renamed as will likely roles entries | 15:19 |
dmsimard | jlk: I don't think there was a spec, tobiash just felt inspired and wrote things I think. Maybe there could be one. | 15:19 |
dmsimard | mordred: doing that across 1000+ repos (and each branch) sounds like fun | 15:20 |
mordred | yah | 15:20 |
jlk | neat little bit of programming | 15:23 |
jlk | like when I did mass rebuilds (in a reasonable order) of Fedora packages. | 15:24 |
tobiash | Yeah, I thought I also might have a use case and wrote this as a proposal in my spare time | 15:24 |
tobiash | jeblair: i've replied to your comment | 15:32 |
tobiash | (on 513368) | 15:33 |
dmsimard | jeblair, mordred: I remember us discussing the concept of 'artifacts' (of jobs/builds) being a thing in Zuul v3 -- to easily expose files from a parent job to child jobs. Is that a thing yet ? Glancing through the v3 config docs I'm not sure it is. | 15:36 |
dmsimard | Maybe it's just zuul_return ? | 15:38 |
jeblair | dmsimard: it's not yet a thing. we'll talk about it more after things settle down. | 15:39 |
dmsimard | jeblair: ok, it's not a rush, it's a question I got that I wasn't sure about. | 15:40 |
dmsimard | We can fairly cleanly work around it by leveraging zuul_return with the location of the artifacts which would be pulled by the child jobs. | 15:40 |
jeblair | dmsimard: yeah; the buildset uuid may also be interesting (it uniquely identifies a particular run of a collection of jobs) | 15:42 |
jeblair | dmsimard: a missing piece for this kind of solution is cleanup though -- you'd need external cleanup for now, until we add some more things to zuul. | 15:43 |
dmsimard | jeblair: yeah we override the log path to use buildset uuid in v2 for RDO. | 15:47 |
jeblair | jlk: do you have a minute to +3 https://review.openstack.org/514056 ? | 15:50 |
*** jkilpatr has quit IRC | 15:55 | |
Shrews | did flake8 get an update or something? seeing pep8 failures for things I did not change: http://logs.openstack.org/11/513811/2/check/tox-pep8/ce4f011/job-output.txt.gz#_2017-10-23_15_34_18_760469 | 16:04 |
Shrews | ah, yep | 16:05 |
Shrews | new version updated 2017-10-23 | 16:05 |
Shrews | jeblair: how would you like to proceed with changing zuul for that ^^^? Add some ignores for those errors, or fix the actual errors, or some combination of both? | 16:07 |
Shrews | looks like just E722 and E741 | 16:08 |
jeblair | Shrews: i can't find e741 described... | 16:10 |
jeblair | ooooh | 16:10 |
jeblair | haha | 16:10 |
jeblair | do not use variables named ‘l’, ‘O’, or ‘I’ | 16:10 |
jeblair | it's for people with bad fonts. | 16:11 |
jeblair | Shrews: i think we should ignore E741 and fix E722. | 16:11 |
pabelanger | nice | 16:11 |
jeblair | pabelanger: if by 'nice' you mean 'absurd', i agree :) | 16:11 |
Shrews | lol | 16:11 |
pabelanger | I just use wingdings now | 16:12 |
jeblair | e722 is a legit issue though, and we really should work to avoid it. | 16:12 |
Shrews | jeblair: k. i'll work on that | 16:13 |
jeblair | Shrews: thanks! | 16:13 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Fix for pep8 E722 and ignore E741 https://review.openstack.org/514372 | 16:19 |
*** jkilpatr has joined #zuul | 16:32 | |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Only autohold failed builds https://review.openstack.org/513850 | 16:53 |
clarkb | tobiash: fyi I think ^ generally addresses your comments, lets see if zuul is happy with it | 16:56 |
clarkb | I think it will fail on pep because it needs 514372 though | 16:56 |
tobiash | clarkb: looking | 16:58 |
kklimonda | for zuulv3, when I deploy zuul-executor behind the firewall, there is no clean way of getting log streaming to work - so far I've figured out some sort of reverse tunnel (autossh) or vpn could be used - that is actually working, the last piece of the puzzle being finger:// urls - I've created a small patch for that http://paste.openstack.org/show/624392 (not really tested yet, however I think it shows what I'm looking at - each executor | 16:59 |
kklimonda | would be binding to finger on a different port and use the public hostname from config instead of its own) but perhaps there is a better way and/or someone had already been thinking about it. | 16:59 |
jeblair | kklimonda: the intent is eventually to have the executors run the streamer on a non-privileged port, and then serve the finger port from a multiplexer, the same way we do with websockets now. | 17:02 |
jeblair | kklimonda: would that address your current situation as well, since the multiplexer and executor would be on the same side as the firewall? you'd only need to make sure users can get to the multiplexer... | 17:03 |
jeblair | kklimonda: or i could be misunderstanding -- does websocket streaming work for you now? | 17:04 |
kklimonda | jeblair: web streaming will be working once we open the port - in that case I just have to run zuul-web behind a firewall, and open a port to the public, right? | 17:06 |
jeblair | kklimonda: yes | 17:06 |
kklimonda | mhm, so assuming we can get the IT team to open one or two ports we would be good with multiplexers | 17:07 |
jeblair | kklimonda: cool. hopefully someone will get around to implementing the finger multiplexer soon. :) then the finger urls can switch from finger://executor/job to finger://zuul/job, and no one has to see executor hostnames anymore | 17:08 |
kklimonda | (even if I can't get ports opened, multiplexer would let me not have to worry about executor hostnames for finger - I'd just have to create reverse tunnels so that multiplexer can connect to executors) | 17:09 |
tobiash | kklimonda: last week also https://review.openstack.org/#/c/512629/ was merged, this could help you in defining the executor hostname | 17:09 |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: add more debugging to the upload-pypi role https://review.openstack.org/514394 | 17:10 |
jlk | jeblair: done! | 17:10 |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: add more debugging to the upload-pypi role https://review.openstack.org/514394 | 17:11 |
kklimonda | tobiash: I've seen that patch, but it doesn't solve an issue of multiple executors running behind firewall - in that case finger hostname should be common (zuul-web hostname) but that shouldn't affect zuul-executor hostnames, as that's used at least for executor:stop:{hostname} and probably in other places | 17:12 |
tobiash | kklimonda: I see | 17:13 |
*** electrofelix has quit IRC | 17:14 | |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: add more debugging to the upload-pypi role https://review.openstack.org/514394 | 17:18 |
kklimonda | also, a way to limit a backlog when streaming over the web may be nice - trying to stream megs of logs with chrome has cooked my laptop. | 17:18 |
SpamapS | weird.. curling the webapp port is just hanging for me | 17:21 |
SpamapS | http://paste.openstack.org/show/624393/ | 17:22 |
SpamapS | just sits there | 17:22 |
jeblair | SpamapS: is the scheduler up and running correctly? | 17:22 |
jeblair | SpamapS: iirc, that may happen if it hasn't completed its initial configuration | 17:22 |
jeblair | SpamapS: (either it's still loading, or it bombed) | 17:22 |
*** sambetts is now known as sambetts|afk | 17:23 | |
SpamapS | jeblair: ahhh no executor was down | 17:24 |
SpamapS | 2017-10-23 10:19:33,003 DEBUG zuul.TenantParser: Waiting for cat job <gear.Job 0x7f39c80a6518 handle: b'H:127.0.0.1:1' name: merger:cat unique: 8da117682a494ad888526dbf7ddf3f8c> | 17:24 |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: add more debugging to the upload-pypi role https://review.openstack.org/514394 | 17:25 |
SpamapS | http://paste.openstack.org/show/624394/ <-- ERROR while starting executor | 17:26 |
SpamapS | jeblair: ^^ seen that? | 17:26 |
SpamapS | oh.. I bet I have an in-repo zuul.yaml that is bong | 17:27 |
*** hashar is now known as hasharDinner | 17:29 | |
jeblair | SpamapS: weird, that looks like a zuul bug...maybe that could happen if a merger has an error while trying to run a cat job? | 17:31 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: experiment with late-binding inheritance https://review.openstack.org/511352 | 17:39 |
SpamapS | jeblair: that was actually the scheduler log, I was mistaken. | 17:41 |
SpamapS | jeblair: the executor finished the cat job fine. I'm debugging now. | 17:41 |
SpamapS | jeblair: removing some half-baked zuul.yaml's from the equation at this point. | 17:41 |
Shrews | jeblair: is the test_data_return test failure a known thing? i've seen it fail more than once now | 17:43 |
Shrews | jeblair: most recently on the pep8 change | 17:43 |
jeblair | Shrews: yep. it's blocked by the pep8 change. | 17:44 |
jeblair | Shrews: https://review.openstack.org/514056 | 17:44 |
Shrews | neat | 17:44 |
jeblair | so we'll need to recheck-bash the pep8 change in, or squash the two, or update the test fix to pin flake8 | 17:44 |
SpamapS | jeblair: yes that was a very old zuul.yaml in an old experimental fork | 17:45 |
SpamapS | I almost feel like zuul should tag commits that it has gated and landed on layout changes and refuse to consider layouts that lack the tag or something. | 17:46 |
SpamapS | tho | 17:46 |
*** __zeus__ has joined #zuul | 17:46 | |
SpamapS | that's just a knee jerk reaction.. reality is: we were breaking stuff a lot for a reason and we won't do that anymore. | 17:47 |
jeblair | Shrews: tell you what, i'll add a skip add/remove to those 2 changes | 17:49 |
Shrews | jeblair: that'll work | 17:50 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix for pep8 E722 and ignore E741 https://review.openstack.org/514372 | 17:50 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix undefined sort order when applying parent data https://review.openstack.org/514056 | 17:51 |
jeblair | Shrews, tobiash: ^ that should get us moving | 17:51 |
tobiash | \o/ | 17:54 |
jeblair | tobiash: thinking about the project regex -- https://review.openstack.org/514127 is similar to the problem you point out. perhaps we should change all the implied regex options to only regex if they start with "^". (you can, of course, still write '^.*?something' if you really don't want it anchored at the start of the string) | 17:58 |
tobiash | jeblair: sounds good to me | 18:00 |
tobiash | jeblair: should we also make regexes full match by default (I did this in the regex patch)? | 18:01 |
jeblair | tobiash: probably so | 18:02 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Only autohold failed builds https://review.openstack.org/513850 | 18:09 |
fungi | digging into the sqlreporter exceptions which are blocking our smtpreporter from working... it looks like the inserts in the SQLReporter.report() method were written to assume change-based pipelines exclusively and fail when trying to insert empty strings for irrelevant fields into integer type columns (e.g. the "change" value for gerrit ref-updated pipelines) | 18:13 |
fungi | should i coerce those values to 0 or is there a "better" way? | 18:13 |
jlk | So with zuul-web, if things are running fine, should I be able to curl <ip>:9000 and get something other than a 404 back? | 18:13 |
pabelanger | might need tennat in URL | 18:14 |
pabelanger | tenant* | 18:14 |
jlk | oh, maybe my tenant config is jank | 18:14 |
pabelanger | but, should be able to curl a project public key | 18:15 |
fungi | yeah, we do some rewrite tricks in apache right now to hide the tenant part of the url | 18:15 |
fungi | but you probably want <ip>:9000/<tenant>/status.json or something | 18:15 |
jlk | huh, nothing but 404s. | 18:15 |
Shrews | jlk: afaik, the only route zuul-web serves is websocket based (assuming you're not going through apache) | 18:16 |
Shrews | so curl won't handle that (i don't think) | 18:16 |
jlk | that said, I don't see zuul-scheduler actually trying to download content to load the tenant | 18:16 |
jlk | Shrews: oooh, hrm. | 18:16 |
jlk | but, it has a listen address? | 18:16 |
jlk | https://docs.openstack.org/infra/zuul/feature/zuulv3/admin/components.html?highlight=logs#web-server | 18:16 |
pabelanger | oh, 8001 I think is default for web | 18:17 |
Shrews | <ip>:9000/console-stream is the only route at this point | 18:17 |
jlk | two different things | 18:17 |
jlk | there's "webapp" which runs on scheduler, and then there's the zuul-web service | 18:17 |
jlk | Shrews: ah okay. | 18:17 |
pabelanger | <ip>:8001/<tenant>/status.json | 18:17 |
pabelanger | I think that is the default | 18:18 |
fungi | jlk: ohhh... and you're trying to get to the zuul-web service which serves the console streams? | 18:18 |
jlk | yeah I was just trying to see if it was alive. I forgot that not much has landed there. I was testing before using in-flight code that hasn't merged yet | 18:19 |
Shrews | jlk: http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/web/__init__.py?h=feature/zuulv3#n182 | 18:19 |
Shrews | you need a websocket client | 18:19 |
Shrews | jlk: the websocket tests have an example client, if you need it | 18:19 |
Shrews | http://git.openstack.org/cgit/openstack-infra/zuul/tree/tests/unit/test_log_streamer.py?h=feature/zuulv3#n159 | 18:20 |
Shrews | jlk: also, iirc, i was using a chrome extension (simple websocket client) to test the websocket connection at one point | 18:23 |
Shrews | probably something similar for other browsers | 18:23 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul feature/zuulv3: Default change and patchset to 0 in SQLReporter https://review.openstack.org/514423 | 18:25 |
fungi | okay, convinced myself that 0 was correct there ^ | 18:25 |
jlk | alright, so I can't even call zuul cli to show me anything. My scheduler process is not exactly doing what it should be :( | 18:42 |
tobiash | jlk: is the scheduler started fully (and has printed its layout to the log)? | 18:44 |
jlk | it has not. | 18:44 |
tobiash | jlk: before that the scheduler doesn't react to status requests | 18:44 |
jlk | It seems to be stuck somewhere, either not able to communicate with zookeeper or gear or what. I can't seem to convince it to try to parse the tenant yaml | 18:45 |
*** __zeus__ has quit IRC | 18:45 | |
*** __zeus__ has joined #zuul | 18:45 | |
tobiash | so there are not cat job logs? | 18:46 |
tobiash | s/not/no/ | 18:46 |
jlk | not yet no. | 18:46 |
tobiash | jlk: do you use the zuul supplied gearman server or an external one? | 18:47 |
jlk | zuul supplied one | 18:47 |
jlk | I should make sure that's actually working | 18:47 |
jlk | is there a good way from the cli to see that status? | 18:47 |
tobiash | hm, I'm usually looking into the debug log of the scheduler | 18:48 |
jlk | yeah | 18:48 |
jlk | this is running in daemon mode, so in theory I should be getting debug on the stdout, and yet... | 18:48 |
tobiash | can you post the debug log? | 18:49 |
SpamapS | jlk: echo "status" | nc scheduler-ip 4730 | 18:49 |
jlk | only thing in logs is: 2017-10-23 18:32:55,577 DEBUG zuul.MergeClient: Connecting to gearman at zuul-gearman:4730 | 18:49 |
jlk | 2017-10-23 18:32:55,580 DEBUG zuul.MergeClient: Waiting for gearman | 18:49 |
jlk | blah, I don't have nc in the container | 18:49 |
tobiash | jlk: http://paste.openstack.org/show/624402/ | 18:51 |
tobiash | these are the first log lines of my (working) scheduler | 18:51 |
jlk | okay, so I'm stuck getting connected to zookeeper perhaps | 18:52 |
tobiash | so seems to be either the gearman connection or the zookeeper connection | 18:52 |
tobiash | unfortunately you don't see in the log when it's finished with gearman connection and starting zookeeper connection :/ | 18:52 |
*** harlowja has joined #zuul | 18:53 | |
jlk | yeah | 18:53 |
SpamapS | man | 19:12 |
SpamapS | the error message on zuul.yaml config problems is really nice | 19:12 |
SpamapS | jeblair: ^^ well done on that. It's really clear when I've messed something up horribly in a PR. ;) | 19:12 |
jeblair | \o/ | 19:13 |
jeblair | i can't wait to be able to do inline comments. and once v3 is released, maybe link to docs in the error msg. :) | 19:13 |
jlk | so something I'd like to work on at some point soon, better start-up debugging, of where things are at | 19:13 |
jeblair | (actually, we can probably link to docs once we merge the dashboard and start self-hosting generated docs) | 19:14 |
jeblair | jlk: ++ | 19:14 |
jeblair | it's very opaque now | 19:14 |
jlk | yeah, and I'm not convinced that the changes that went in to make debugging to stdout the default when running in nodaemon are actually doing the right thing. | 19:18 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Support upper-constraints in tox-siblings https://review.openstack.org/513199 | 19:26 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix for pep8 E722 and ignore E741 https://review.openstack.org/514372 | 19:42 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix undefined sort order when applying parent data https://review.openstack.org/514056 | 19:42 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Only autohold failed builds https://review.openstack.org/513850 | 19:53 |
clarkb | ok I think ^ is finally ready for review. simplified the test a bit (and got it working) | 19:53 |
ianw | tobiash: (from ages ago) thanks that looks like it | 20:04 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Document executor/merger stats https://review.openstack.org/514343 | 20:08 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: experiment with late-binding inheritance https://review.openstack.org/511352 | 20:08 |
jlk | SpamapS: I got | 20:14 |
jlk | echo "status" | nc localhost 4730 | 20:14 |
jlk | . | 20:14 |
jlk | is a period expected? | 20:14 |
jeblair | yeah, that's end-of-data | 20:15 |
jeblair | (so no jobs are registered) | 20:15 |
jeblair | but the server is up and running | 20:15 |
SpamapS | http://paste.openstack.org/show/624405/ | 20:15 |
SpamapS | that's my running zuul | 20:15 |
jlk | okay, interesting | 20:16 |
jlk | so hitting localhost works, but hitting the k8s "service" name does not work. | 20:16 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add log streaming logging and exception handling https://review.openstack.org/513811 | 20:18 |
Shrews | jeblair: rebased ^^^ and also added exception handling around the serve_forever() part that should hopefully catch an abnormal termination of the serving thread | 20:19 |
jeblair | Shrews: lgtm! | 20:20 |
Shrews | cool. hopefully we can catch something with that | 20:21 |
jlk | ugh I wonder if the Fedora container image is running iptables | 20:28 |
Shrews | jeblair: qq... is the error from zuul on https://review.openstack.org/513766 due to the fact that the depends-on change has not merged, or is zuul confused because there are two changes with change ID I22007434b38129379690f4e469a1981ed7dcb68c ? | 20:28 |
Shrews | jeblair: one of those changes with that ID is aborted | 20:28 |
Shrews | (it's still a mystery to me how i made two changes with the same ID, but whatever...) | 20:29 |
jeblair | Shrews: that's hard to say without a trip into the zuul logs. if you don't feel like spelunking, it may be worth starting with a recheck. | 20:29 |
jeblair | also, this may answer the question from fungi earlier about depends-on:abandoned behavior | 20:30 |
Shrews | jeblair: i shall do the easier option first | 20:30 |
jlk | nope, hrm. | 20:30 |
clarkb | jlk: firewall on the host maybe? | 20:31 |
clarkb | (we've seen host firewalls interact with lx(c|d) for osa in the past) | 20:32 |
SpamapS | hrm.. weird.. bwrap isn't working on centos7's kernel | 20:36 |
SpamapS | but it was 3 weeks ago | 20:36 |
pabelanger | 7.4 release? | 20:37 |
SpamapS | CentOS Linux release 7.4.1708 (Core) | 20:37 |
SpamapS | Hm | 20:37 |
SpamapS | that would be weird | 20:37 |
SpamapS | regressing non-user namespaces? | 20:37 |
* SpamapS will try an older kernel I guess | 20:38 | |
pabelanger | maybe tristanC know more, but think he is still on PTO | 20:38 |
SpamapS | trying 7.3 kernel now | 20:40 |
tristanC | SpamapS: yep, you likely need https://github.com/projectatomic/bubblewrap/commit/ec5093d57d8d55aa49525e26117ff4e43181a4d3 | 20:40 |
SpamapS | tristanC: oh 0.2.0 .. that's easier than kerneling ;) | 20:41 |
clarkb | can you change the max user namespaces as an alternative? | 20:41 |
clarkb | my local machine says I can have 63k of them | 20:42 |
tristanC | clarkb: probably, but (iirc) then you need to enable userns on kernel command line, it should be disabled by default on redhat kernel | 20:43 |
SpamapS | yeah it was pretty unstable until the most recent kernels | 20:43 |
SpamapS | still pretty annoying that this broke this way :-P | 20:44 |
SpamapS | also I'm running setuid so why is it even trying to USER_NS? | 20:44 |
clarkb | ah, well I'm running 4.13.1 and it seems happy enough | 20:45 |
clarkb | but getting newer kernel may be more of a pain on centos than just patching bwrap :) | 20:45 |
tristanC | SpamapS: it's trying because zuul uses --unshare-all | 20:47 |
SpamapS | clarkb: yeah, I already had to find bubblewrap.. might as well get latest | 20:47 |
SpamapS | btw we need to yell at the projectatomic people for not signing their tarball releases | 20:48 |
SpamapS | that doesn't work either | 20:50 |
SpamapS | 0.2.0 just as broken unfortunately :-/ | 20:51 |
SpamapS | oh hm | 20:52 |
SpamapS | maybe this is just zuul-bwrap being broken | 20:52 |
tobiash | SpamapS: with centos 7.4 you have to supply a boot arg to allow user namespaces | 21:05 |
SpamapS | tobiash: I don't want to use user ns | 21:05 |
SpamapS | I want to use setuid | 21:05 |
jlk | okay this is frustrating | 21:05 |
SpamapS | and have been :-P | 21:05 |
jlk | an nginx service works fine in k8s | 21:05 |
jlk | both the endpoint IP and the cluster IP, and the service name. But gearman, not so much | 21:06 |
jlk | maybe gearman just doesn't like the networking setup? | 21:06 |
SpamapS | gearman shouldn't really care | 21:06 |
SpamapS | jlk: what's your symptom at this point? | 21:07 |
jlk | does "nc" not do the proper things for going through proxies? | 21:07 |
SpamapS | nc is just a dumb TCP pipe | 21:07 |
SpamapS | connect() and then write/read | 21:07 |
jlk | SpamapS: zuul-scheduler start up is stalled somewhere, and I'm trying to determine what's stalling it. Trying to verify that scheduler is able to communicate with gearman | 21:07 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: experiment with late-binding inheritance https://review.openstack.org/511352 | 21:07 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: Add implied branch matchers on 'master' https://review.openstack.org/514459 | 21:07 |
SpamapS | jlk: if you're using the internal gearman (the default) zuul-scheduler just forks it off and listens on :::4730 ... | 21:08 |
jlk | yes, I'm able to nc into it | 21:08 |
jlk | but I'm running zuul-scheduler in one pod, and executor(s)/web(s) in other pods, that presumably have to connect to that gearman on scheduler | 21:09 |
jlk | so in the config, I have the gearman address as the service name (zuul-gearman) | 21:09 |
jlk | since that DNS maps throughout the cluster | 21:09 |
SpamapS | makes sense | 21:10 |
SpamapS | but the IP that the service has may be complicated to contact from the pod backing it? | 21:10 |
SpamapS | maybe try separate zuul-scheduler config that uses ::1 (or 127.0.0.1)? | 21:10 |
SpamapS | tobiash: anyway, 0.2.0 works | 21:11 |
SpamapS | tristanC: ^ thanks | 21:11 |
* SpamapS can move on | 21:11 | |
tobiash | ok | 21:11 |
SpamapS | would be nice if EPEL updated to it. | 21:12 |
SpamapS | so I don't have to maintain my own copy somewhere or link to rawhide urls ;) | 21:12 |
jlk | oh very interesting | 21:15 |
jlk | launch the executor, and it can see the service | 21:16 |
jlk | and then status shows things about the executor | 21:16 |
tobiash | Alpine is also pre 0.2 | 21:16 |
jlk | feels like there is something wrong with hairpinning | 21:16 |
clarkb | tristanC: trying to build bwrap locally (running make) fails when building the manpage because https://github.com/projectatomic/bubblewrap/blob/master/Makefile-docs.am#L4 seems to conflict with url at https://github.com/projectatomic/bubblewrap/blob/master/Makefile-docs.am#L12 is that a bug? | 21:16 |
SpamapS | jlk: yeah that's exactly how it feels. | 21:17 |
clarkb | SpamapS: ^ you may know since I think you are building bwrap too | 21:17 |
SpamapS | clarkb: yeah I hit that bug too | 21:17 |
SpamapS | --disable-man | 21:17 |
SpamapS | who needs manpages? | 21:17 |
clarkb | SpamapS: well I want the manpage :) | 21:17 |
clarkb | my local manpage from distro is broken | 21:17 |
clarkb | so want to make sure upstream works (seems to) | 21:17 |
clarkb | I'll make a pull request | 21:17 |
SpamapS | we're in a post-manpage world man | 21:18 |
jlk | hahaha fucking hell | 21:18 |
jlk | https://github.com/kubernetes/minikube/issues/1568 | 21:18 |
jlk | of course I have to minikube ssh in and set a promisc flag. | 21:19 |
jlk | that was absolutely what was stalling scheduler start up. Now I get errors in my project-config :) | 21:21 |
SpamapS | derp | 21:22 |
SpamapS | jlk: don't you have a bluemix k8s to play with? | 21:23 |
SpamapS | I'm sure it has 50 other issues but it wouldn't be minikube ;) | 21:23 |
jlk | sure, I could put in a request for that, and I might get it in 2 weeks. | 21:23 |
jlk | maybe with a working external IP | 21:23 |
clarkb | SpamapS: https://github.com/projectatomic/bubblewrap/pull/240 now you can have manpages too | 21:23 |
SpamapS | clarkb: well played sir | 21:25 |
SpamapS | jlk: seems legit | 21:25 |
SpamapS | You know what's a MASSIVE pain btw? | 21:25 |
SpamapS | adding all the hooks, and user perms, and and and, to github | 21:25 |
SpamapS | we have automation for the hooks, which is amazing | 21:26 |
SpamapS | but zomg still so much to do :-P | 21:26 |
jlk | oh hey neat | 21:28 |
SpamapS | I suppose the integrations/apps makes this quite a bit smoother. | 21:28 |
jlk | turns out you can just restart the scheduler (and gearman), and executor will just wait around and reconnect | 21:29 |
jlk | SpamapS: yeah, apps makes it a pretty much one-click thing | 21:29 |
jlk | can one-click for an entire org worth of repos too | 21:29 |
jlk | although on a side project I'm playing with ansible modules to manipulate repos, so you could write a playbook to do all your things... | 21:30 |
jlk | $ curl http://zuul-webapp:8001/z8s/status.json | 21:31 |
jlk | {"zuul_version": "2.5.3.dev1543" | 21:31 |
jlk | wheeee! | 21:31 |
SpamapS | woot | 21:32 |
SpamapS | oh look at that I'm 2 patches behind | 21:32 |
jlk | next, get the webhook feeding working | 21:34 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add log streaming logging and exception handling https://review.openstack.org/513811 | 21:35 |
Shrews | i think that fixes the test issues | 21:35 |
SpamapS | jlk: that's really cool. I definitely am interested in consuming what you produce. The Ansible from hoist is incredibly high quality.. but 5 - 7 minutes for every config change is annoying. | 21:35 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: experiment with late-binding inheritance https://review.openstack.org/511352 | 21:58 |
jeblair | it's zuul meeting time in #openstack-meeting-alt | 21:59 |
clarkb | SpamapS: wow it apparently is not a bug to have `make` not work | 21:59 |
clarkb | I might have to stop computering now | 21:59 |
jeblair | clarkb: you keep computering. it's the other people that already stopped. | 22:00 |
clarkb | what is the point of a makefile if `make` doesn't work | 22:01 |
*** __zeus__ has quit IRC | 22:01 | |
Shrews | not all makes are the same. i remember different versions that weren't compatible | 22:03 |
clarkb | well in this case it's because they have a broken command | 22:03 |
clarkb | so make properly fails and says go away | 22:03 |
clarkb | but that is intentional I guess | 22:03 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add log streaming logging and exception handling https://review.openstack.org/513811 | 22:14 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Only autohold failed builds https://review.openstack.org/513850 | 22:14 |
SpamapS | clarkb: well it should be exploding in configure | 22:17 |
clarkb | SpamapS: ya I think that is the actual bug based on the response | 22:18 |
clarkb | or make should disable man by default | 22:18 |
SpamapS | make isn't the culprit | 22:18 |
SpamapS | definitely configure | 22:18 |
SpamapS | autoconf should disable man page building if you don't have the components to build the manpages | 22:18 |
* SpamapS struggling with matchers on github statuses | 22:19 | |
SpamapS | is it user:tenant:pipeline or tenant:pipeline? :-P | 22:20 |
SpamapS | actually user:tenant/pipeline:result | 22:20 |
SpamapS | but still not working :-/ | 22:22 |
SpamapS | looks like maybe labeling pr's for requirements doesn't work | 22:38 |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: fix the path for the launchpad credentials file https://review.openstack.org/514484 | 22:43 |
SpamapS | jlk: did you ever notice any race conditions with the github driver? | 22:46 |
SpamapS | I think I'm seeing where statuses are lagging on fetch after an event that changes them | 22:46 |
SpamapS | as in, we get an event on the webhook "hey the status changed to X" and then querying it immediately shows status empty. | 22:47 |
SpamapS | but querying it a second later shows it filled in | 22:47 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add zuul.{pipeline,nodepool.provider,executor.hostname} to job header https://review.openstack.org/509436 | 22:55 |
Shrews | jeblair: fwiw, i've no idea why https://review.openstack.org/513811 keeps failing. passes for me locally on repeated runs. will investigate more in the morning | 22:59 |
dmsimard | jeblair: more broadly speaking, one of the topics we barely touched but will be scheduling a session on is what the jobs running on review.rdoproject.org will look like once we roll out Zuul v3 | 23:05 |
jeblair | dmsimard: forum session? | 23:05 |
dmsimard | jeblair: I won't be at the Forum, so if there's something there I won't attend that one :) | 23:05 |
jeblair | dmsimard: or another bluejeans meeting? | 23:05 |
jeblair | dmsimard: (wondering what you meant by 'scheduling a session on') | 23:06 |
dmsimard | jeblair: probably on bluejeans, it was something we meant to discuss back at the PTG but didn't get around to it | 23:06 |
dmsimard | jeblair: basically, we'd probably add project-config/zuul-jobs/openstack-zuul-jobs on review.rdoproject.org to be able to leverage the roles, playbooks and jobs that are defined there -- but then there's some question marks, like secrets, for example | 23:06 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: experiment with late-binding inheritance https://review.openstack.org/511352 | 23:08 |
dmsimard | a good amount of tripleo jobs will be running as third party from review.rdo so we'd like to avoid duplicating things as much as we can versus the jobs that run in -infra | 23:08 |
clarkb | dmsimard: third party jobs run against our gerrit though right? or maybe I'm misunderstanding the use of third party here | 23:09 |
dmsimard | clarkb: yeah, basically those RH1 jobs | 23:09 |
dmsimard | clarkb: against tripleo gerrit patches | 23:09 |
clarkb | in that case they can just use the upstream repos maybe? | 23:09 |
clarkb | I guess you'll have problems with secrets (mentioned earlier) | 23:09 |
dmsimard | clarkb: yeah, there's a couple question marks... secrets, logserver. I wonder to what extent the entirety of their job setup can live 100% upstream | 23:10 |
jeblair | dmsimard: cool, this will be a very good test of reusability. :) one thing to note is that currently we are building an api contract for zuul-jobs. we are not, at the moment, expecting to do that for openstack-zuul-jobs. so if we want to make those reusable, we'll have to be explicit about that. | 23:10 |
dmsimard | and then it gets weird, because if we add project-config, do we inherit from the zuul/main.yaml so we load every project in the universe ? | 23:11 |
jeblair | dmsimard: project-config is a very good candidate for not reusing. :) | 23:11 |
dmsimard | jeblair: yeah, I suspect we will have to define our own base jobs and secrets -- based on the same roles and everything. | 23:12 |
jeblair | dmsimard: fortunately (and not coincidentally), most of those roles are in zuul-jobs. | 23:12 |
dmsimard | There's some roles in project-config too, perhaps. It'll be interesting to write down what we actually need from where. | 23:13 |
dmsimard | More on that later :) | 23:13 |
* dmsimard & | 23:14 | |
*** hasharDinner has quit IRC | 23:15 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/zuul-jobs master: Revert "Add zuul.{pipeline,nodepool.provider,executor.hostname} to job header" https://review.openstack.org/514488 | 23:18 |
dmsimard | mnaser: why? | 23:18 |
mnaser | dmsimard http://logs.openstack.org/79/514479/2/check/puppet-openstack-lint/5dd9896/job-output.txt.gz | 23:19 |
mnaser | it broke things i think | 23:19 |
clarkb | Shrews: are you still around? can you see comments on https://review.openstack.org/#/c/512637/18 really quick if so? | 23:19 |
mnaser | nodepool is undefined somehow | 23:19 |
SpamapS | wow this is driving me crazy | 23:19 |
mnaser | and everything is failing in pre | 23:19 |
SpamapS | I wonder if there's some kind of caching between me and github's API | 23:19 |
dmsimard | mnaser: ok let's revert and figure it out after | 23:19 |
mnaser | dmsimard agreed | 23:20 |
dmsimard | mnaser: I took the time to write integration tests for that too, wtf.. | 23:20 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Revert "Add zuul.{pipeline,nodepool.provider,executor.hostname} to job header" https://review.openstack.org/514488 | 23:21 |
* mnaser shrugs | 23:21 | |
mnaser | i dont know much :p | 23:21 |
mnaser | i just know it broke things | 23:22 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Revert "Revert "Add zuul.{pipeline,nodepool.provider,executor.hostname} to job header"" https://review.openstack.org/514489 | 23:39 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Revert "Revert "Add zuul.{pipeline,nodepool.provider,executor.hostname} to job header"" https://review.openstack.org/514489 | 23:44 |
SpamapS | paste.openstack.org/show/624413 | 23:55 |
SpamapS | anyone see the syntax error there? | 23:55 |
SpamapS | Zuul isn't telling me what's wrong with it | 23:56 |
SpamapS | 2017-10-23 16:53:24,115 DEBUG zuul.ConfigLoader: Created layout id 2ada486dcbef40589fb5c12454d70a5e | 23:56 |
SpamapS | 2017-10-23 16:53:24,124 INFO zuul.Pipeline.GoDaddy.check: Configuration syntax error in dynamic layout | 23:56 |
SpamapS | just when I thought I understood how to write zuul.yaml's :-P | 23:56 |
pabelanger | SpamapS: line 28 is wrong, you can drop name | 23:58 |
pabelanger | and 31 | 23:59 |
jeblair | SpamapS: what did zuul report on the pr? | 23:59 |
SpamapS | jeblair: nothing | 23:59 |
jeblair | SpamapS: at the very least, it should have reported "unknown configuration error" or something | 23:59 |
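
The connectivity check discussed earlier in this log (nc as "just a dumb TCP pipe", connect() and then write/read) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of Zuul: the helper name `can_connect` is hypothetical, and the host `zuul-gearman` and port 4730 in the commented usage line are taken from the conversation. It only verifies that a TCP connection can be opened; it does not speak the gearman protocol.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to (host, port) can be opened.

    Hypothetical helper: the same connect() that nc performs,
    without sending or reading any data.
    """
    try:
        # create_connection resolves the name and performs connect();
        # the socket is closed again when the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


# Example use from inside a pod (service name and port as mentioned in the log):
# can_connect("zuul-gearman", 4730)
```

Running the same check from the scheduler pod against its own service name and from a different pod would separate a hairpin problem (only the backing pod fails) from a general service or DNS problem (every pod fails).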