openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add prepare-workspace-openshift role https://review.openstack.org/631402 | 00:16 |
---|---|---|
corvus | tristanC: hi! i noticed you've uploaded a few new revisions of https://review.openstack.org/573473 but they don't address the -2 comment... do you want to talk about other ways to accomplish what you want there? | 00:18 |
tristanC | corvus: hello, yes sure, please let me know how this could be accepted | 00:22 |
tristanC | corvus: it seems like i didn't comment back, but the jobs list is already showing the first variant information (well, just for the description now) | 00:23 |
*** sdake has quit IRC | 00:23 | |
tristanC | corvus: thus, i don't understand why also showing the parent of the first variant would break the data model | 00:24 |
tristanC | at least with regard to how the jobs list currently works | 00:24 |
*** sdake has joined #zuul | 00:24 | |
corvus | tristanC: the main thing is that we need to support multiple inheritance. that is currently accomplished with different variants having different parents; we might explicitly support multiple parents in the future to make using that easier. | 00:25 |
tristanC | how would multiple parents work? | 00:25 |
corvus | here's a real example of multiple inheritance using variants that works today: | 00:27 |
corvus | https://review.openstack.org/629983 | 00:27 |
corvus | this is what the same thing would look like if we added explicit support: https://review.openstack.org/630337 | 00:28 |
corvus | (rather, if we added explicit support for multiple parents) | 00:28 |
tristanC | oh wow, we can do that?! | 00:28 |
corvus | yeah :) sorry if my objection seemed a little too theoretical -- the algorithm has always supported it (on purpose), we just haven't really talked about it much, because it can be a bit confusing :) | 00:29 |
tristanC | it's rather an uncommon way to describe things | 00:30 |
tristanC | so, in tenant layout, those variants/to-be-mixed-in jobs are stored separately in the job list | 00:31 |
tristanC | right? | 00:32 |
corvus | yeah, i think you'd need to look at each variant and get its parent | 00:32 |
tristanC | and then there is the branch matcher, that is actually the only way to prevent the mix-in, correct? | 00:33 |
corvus | (also, one other use case is a job that changes parents on branches; that's not a mix-in, it's just separate variants. for example, a "devstack" job changing its parent from "devstack-legacy" to "devstack-v3" from one stable branch to the next) | 00:33 |
corvus | tristanC: right. multiple variants matching gets you a mix-in. multiple variants that don't match just get you a fork in the graph, basically. | 00:34 |
tristanC | ok, thus https://review.openstack.org/573473 needs to return a list of parents, and perhaps the main jobs list should only focus on the master branch, or should we also differentiate a job's forks? | 00:35 |
corvus | tristanC: it's a little easier for me to think about a visualization that shows a tree of a single job from the bottom up. there we can start with one root node, and branch upward as we add variants. | 00:36 |
tristanC | corvus: yes, but we are also interested in having a better view of the available jobs | 00:37 |
corvus | very quick and bad sketch: http://paste.openstack.org/show/744142/ | 00:38 |
corvus | that's the "easy" part. i *know* we can generate that data and draw that graph. :) | 00:38 |
corvus | tristanC: i just looked at http://logs.openstack.org/37/633437/1/check/zuul-build-dashboard-multi-tenant/9c74c2f/npm/html/t/ansible/jobs and see what you want to do there, with a top-down tree | 00:39 |
tristanC | for the per job graph, the issue is that we have to apply the branch matcher client side to know which fork to follow for each parent | 00:40 |
corvus | tristanC: or you could show the whole thing and annotate the variants with their matchers (say "branch:stable" instead of "variant1" in my quick sketch) | 00:41 |
corvus | tristanC: maybe you could do the same thing for a top-down tree as well? like this? http://paste.openstack.org/show/744143/ | 00:42 |
tristanC | corvus: that would work, though tree view doesn't work well if there can be multiple parents | 00:43 |
corvus | that would basically duplicate jobs, but it would only do so when necessary. and people would be able to see the variants and relationships. | 00:43 |
corvus | true | 00:43 |
tristanC | corvus: the issue remains that the relationships logic isn't easy to implement based on job definition | 00:43 |
tristanC | corvus: you have to match branch for each variant | 00:44 |
corvus | (if we did support explicit multiple parents, we could continue to duplicate jobs, just without the extra variant branch info) | 00:44 |
corvus | tristanC: i think the general approach i'm suggesting is "show all the variants". so on the per-job page, show the full graph with all variants, and same thing on the job listing page. | 00:46 |
tristanC | corvus: would the branch matcher be the only way for a variant to fork the graph? | 00:47 |
tristanC | i'm trying to figure out how to correctly identify a variant parent | 00:48 |
corvus | tristanC: yes; we used to have other selectors but we removed them. | 00:48 |
tristanC | that's good to know and helps understand the current model :) | 00:49 |
corvus | tristanC: i think on the api side, "job/{name}" returns a list, iterating over that gets you a list of possible parents (which you can annotate with branch matchers if you want) | 00:50 |
corvus | tristanC: if we wanted to do a bottom-up graph for the per-job page, we could either do that, or add a new endpoint which walks that graph and returns the data; might be a little faster. | 00:51 |
tristanC | corvus: i was going to walk the graph for the per-job page | 00:52 |
corvus | tristanC: for the top-down job listing, maybe we could return the list of possible parents along with their branch matchers? | 00:52 |
tristanC | corvus: but right now, users with lots of jobs really need the top-down view first | 00:52 |
tristanC | corvus: yes, so in the jobs list, a job variant only needs {'name': job-name, 'branch': branch, 'parents': [{parent-name: parent-branch}]} | 00:53 |
corvus | that would work, but would mean returning every variant in the jobs list, instead of just a single entry per job name... | 00:55 |
corvus | an alternative with the current format might be: [{name: 'child', description: "...", parents: [{name: parent1, branches: "master"}, {name: parent2, branches: "stable"}]}] | 00:55 |
corvus | either way, those solutions are pretty close :) | 00:56 |
tristanC | well since the job's metadata is variant specific, it seems like we have to return each variant in jobs list to improve the visualisation | 00:57 |
corvus | it's the most correct thing to do; i think the /jobs/ endpoint is mostly there to serve that page (which is why it doesn't include all variants now). i think we can retain the current system if we want, but if we need to change it to better support the job listing page, we can. and yes, the more data we return about jobs in that endpoint, the more it's going to need to be variant-specific. | 01:00 |
corvus | tristanC: hopefully that's enough to unblock you there; i have to run now. i like the result in http://logs.openstack.org/37/633437/1/check/zuul-build-dashboard-multi-tenant/9c74c2f/npm/html/t/ansible/jobs -- it's a good improvement that makes job relationships easier to understand. :) | 01:01 |
tristanC | corvus: thanks for the extra explanation, i'll look into adding the variants information properly to the jobs list | 01:01 |
*** rlandy is now known as rlandy|bbl | 01:15 | |
*** sdake has quit IRC | 01:28 | |
dmsimard | corvus: (totally unrelated), were you okay with the whitespace-- patch for the website https://review.openstack.org/#/c/617680/ ? | 01:29 |
dmsimard | tristanC: wow that new job view | 01:32 |
dmsimard | nice work | 01:32 |
*** bhavikdbavishi has joined #zuul | 01:33 | |
*** pvinci has joined #zuul | 01:48 | |
pvinci | Are there any resources available for troubleshooting zuul-jobs? | 01:50 |
*** bhavikdbavishi has quit IRC | 01:51 | |
*** bhavikdbavishi has joined #zuul | 01:51 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: model: remove unused addImpliedBranchMatcher procedure https://review.openstack.org/633643 | 01:53 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: model: remove unused job's BranchMatcher procedures https://review.openstack.org/633643 | 01:56 |
*** ruffian_sheep has joined #zuul | 02:01 | |
*** sdake has joined #zuul | 02:31 | |
*** sdake has quit IRC | 02:37 | |
*** bhavikdbavishi has quit IRC | 02:37 | |
*** sdake has joined #zuul | 02:38 | |
*** sdake has quit IRC | 02:58 | |
*** sdake has joined #zuul | 02:59 | |
*** rlandy|bbl is now known as rlandy | 03:04 | |
*** rlandy has quit IRC | 03:04 | |
*** sdake has quit IRC | 03:26 | |
*** sdake has joined #zuul | 03:29 | |
*** sdake has quit IRC | 03:42 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: config: add playbooks to job.toDict() https://review.openstack.org/621343 | 03:48 |
*** bhavikdbavishi has joined #zuul | 03:52 | |
*** chkumar|out is now known as chandankumar | 04:22 | |
*** pvinci has quit IRC | 04:26 | |
*** bhavikdbavishi has quit IRC | 04:38 | |
*** bhavikdbavishi has joined #zuul | 04:39 | |
*** sdake has joined #zuul | 04:49 | |
*** spsurya has joined #zuul | 05:05 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: scheduler: add job's variants to the rpc job_list method https://review.openstack.org/573473 | 05:16 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: switch jobs list to a tree view https://review.openstack.org/633437 | 05:16 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add jobs list filter https://review.openstack.org/633652 | 05:16 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: scheduler: add job's tags to the rpc job_list method https://review.openstack.org/633653 | 05:16 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add tags to jobs list https://review.openstack.org/633654 | 05:16 |
*** ruffian_sheep has quit IRC | 05:21 | |
*** bhavikdbavishi has quit IRC | 05:27 | |
*** sdake has quit IRC | 05:41 | |
*** sdake has joined #zuul | 05:49 | |
*** bhavikdbavishi has joined #zuul | 05:53 | |
*** strigazi has quit IRC | 06:42 | |
*** strigazi has joined #zuul | 06:43 | |
*** badboy has joined #zuul | 06:59 | |
*** quiquell|off is now known as quiquell | 07:01 | |
*** themroc has joined #zuul | 07:30 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: config: add tenant.toDict() method and REST endpoint https://review.openstack.org/621344 | 07:48 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add config page https://review.openstack.org/633667 | 07:48 |
*** sdake has quit IRC | 08:11 | |
*** gtema has joined #zuul | 08:31 | |
*** jpena|off is now known as jpena | 08:33 | |
*** electrofelix has joined #zuul | 08:35 | |
*** gtema has quit IRC | 08:36 | |
*** gtema has joined #zuul | 08:42 | |
*** avass has joined #zuul | 08:45 | |
*** pcaruana has joined #zuul | 08:51 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: config: add playbooks to job.toDict() https://review.openstack.org/621343 | 09:04 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: config: add tenant.toDict() method and REST endpoint https://review.openstack.org/621344 | 09:04 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add config page https://review.openstack.org/633667 | 09:04 |
*** panda is now known as panda|numb | 09:28 | |
*** gtema has quit IRC | 09:31 | |
*** luizbag has joined #zuul | 09:40 | |
*** rf0lc0 has joined #zuul | 10:27 | |
*** rfolco has quit IRC | 10:27 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add roles usage information to the build page https://review.openstack.org/633697 | 10:43 |
*** bhavikdbavishi has quit IRC | 10:45 | |
*** rf0lc0 has quit IRC | 10:53 | |
*** ssbarnea|bkp2 has quit IRC | 11:01 | |
*** ssbarnea|rover has joined #zuul | 11:02 | |
*** rfolco has joined #zuul | 11:03 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add roles usage information to the build page https://review.openstack.org/633697 | 11:04 |
tristanC | mordred: corvus: 633697 is a nice improvement to the build page, it lists the roles used by the job | 11:17 |
badboy | do I need to install finger on a nodepool worker node in order to see the output of a playbook in Zuul's web dashboard? | 11:21 |
tristanC | badboy: it shouldn't be needed, zuul executor collects the output through an ansible plugin called zuul_stream | 11:28 |
tristanC | badboy: then zuul-web console stream page should take care of streaming the logs from the executor | 11:28 |
badboy | tristanC: I've the issue | 11:30 |
tristanC | i.e., the zuul-web service acts as a finger client | 11:30 |
tristanC | badboy: what's your issue? | 11:30 |
badboy | tristanC: if your DNS doesn't work you need to set the executor.hostname as the IP of the executor | 11:30 |
badboy | tristanC: I'm playing with pause module in my playbook and the console output from web dashboard is full of this: 019-01-29 03:36:49.354728 | [ubuntu-bionic] Waiting on logger | 11:38 |
*** bhavikdbavishi has joined #zuul | 11:39 | |
badboy | tristanC: while the actual console.log on my worker node consists only of two lines | 11:39 |
tristanC | badboy: the zuul_stream plugin needs a connection from the executor to the test instance on port 19885 (tcp) | 11:39 |
tristanC | badboy: iirc, "Waiting on logger" is what you get if the executor can't connect to the nodepool worker node | 11:40 |
electrofelix | Does nodepool get some kind of notification from zookeeper that requests are looking for a particular node type and none was found matching it, or is it inferred by the queue in zookeeper for that type of node dropping below a certain threshold? | 11:40 |
electrofelix | looking at creating a custom driver that takes our static nodes and makes them available with dynamically determined labels that represent some other limited resource to limit the number of jobs needing those resources that can run | 11:42 |
electrofelix | but trying to understand whether i need to make one available of each type as a minimum and then replace as consumed up to the amount of the resource available, or whether there is something that can trigger making it available | 11:43 |
electrofelix | as the latter would reduce wastage of nodes not being used | 11:43 |
badboy | tristanC: by 'test instance' you mean the web dashboard? | 11:44 |
badboy | tristanC: take a look at that: http://paste.openstack.org/show/744165/ | 11:44 |
badboy | tristanC: I've set the pause for two minutes | 11:45 |
tristanC | badboy: the flow goes from [zuul-web] -|finger|-> [zuul-executor] -|tcp/19885|-> [slave] | 11:47 |
tristanC | badboy: can the executor connect with port 19885 to the slave? | 11:47 |
badboy | tristanC: during the playbook run? | 11:48 |
tristanC | badboy: well that shouldn't matter, the question is: are your nodepool slaves allowing ingress tcp/19885 connections? | 11:50 |
badboy | tristanC: when trying telnet <IP of the slave> 19885 from executor the connection is refused | 11:53 |
*** swest has joined #zuul | 11:53 | |
tristanC | badboy: is there a security group or a firewall at play? | 11:54 |
badboy | ufw is inactive and iptables is at accept all | 11:55 |
badboy | tristanC: and these are vms on the same subnet | 11:55 |
tristanC | badboy: so there is something else, perhaps the zuul_console failed to start, are you using a role such as: https://git.zuul-ci.org/cgit/zuul-jobs/tree/roles/prepare-workspace/tasks/main.yaml#n2 ? | 11:58 |
tobiash | badboy: connection refused indicates that you probably didn't start zuul_console | 12:09 |
*** fdegir is now known as fdegir_ | 12:10 | |
avass | if you have a pipeline triggered by a comment in gerrit is there any way to retrieve the comment in the job as a variable? | 12:11 |
badboy | tristanC: no, I am not | 12:13 |
*** jpena is now known as jpena|lunch | 12:15 | |
badboy | tristanC: ok, so the pre-run playbook is *REQUIRED* to make it all work | 12:24 |
tristanC | badboy: yes, console-stream needs the zuul-console, perhaps you should base your job on https://git.zuul-ci.org/cgit/zuul-base-jobs/tree/zuul.yaml#n2 | 12:27 |
badboy | tristanC: that's exactly what I did now | 12:28 |
badboy | tristanC: and it's working | 12:28 |
badboy | tristanC: one more question, where do the logs land? | 12:28 |
tristanC | badboy: iirc you need to set up a logserver and use the upload-logs role | 12:31 |
badboy | tristanC: what kind of logserver? | 12:31 |
*** fdegir_ has left #zuul | 12:33 | |
*** fdegir has joined #zuul | 12:33 | |
*** gtema has joined #zuul | 12:34 | |
tristanC | badboy: it could be scp+http or swift object store, there may be more options too | 12:34 |
tristanC | badboy: got to go now, good luck with your setup :) | 12:34 |
avass | also, is there any way to start the same job (or list of jobs dependent on each other) with different configurations (job variables set differently in project-config), like looping over a list or something like it? | 12:38 |
*** pcaruana has quit IRC | 12:40 | |
pabelanger | badboy: look at the upload-logs and upload-swift-logs roles for publishing logs | 12:43 |
pabelanger | avass: if I understand, you'll need to name each job uniquely, then add them into pipelines. | 12:44 |
*** quiquell is now known as quiquell|brb | 12:48 | |
avass | pabelanger: i thought so | 12:48 |
*** pcaruana has joined #zuul | 12:50 | |
badboy | tristanC: thx | 12:59 |
*** quiquell|brb is now known as quiquell | 13:26 | |
*** rlandy has joined #zuul | 13:30 | |
*** jpena|lunch is now known as jpena | 13:36 | |
badboy | is there an ansible playbook/role to install and configure apache/mysql for zuul? | 13:42 |
*** gtema has quit IRC | 13:44 | |
*** bhavikdbavishi has quit IRC | 13:48 | |
*** panda|numb is now known as panda | 13:56 | |
*** badboy has quit IRC | 14:01 | |
pabelanger | geerlingguy/ansible-role-apache / geerlingguy/ansible-role-mysql should work | 14:07 |
pabelanger | I've been testing with nginx, and haven't launched db server yet, just using sqlite for testing currently | 14:07 |
*** sean-k-mooney has joined #zuul | 14:08 | |
mrhillsman | is there a way to force dib to release/unmount and delete failed/successful builds | 14:19 |
mrhillsman | /tmp has quite a few loop devices mounted | 14:20 |
pabelanger | dib should try to unmount, but possible you are seeing a leak | 14:20 |
mrhillsman | or is there a timer | 14:20 |
pabelanger | nodepool-builder, should delete failed DIB builds also | 14:20 |
mrhillsman | using dib standalone | 14:20 |
pabelanger | on nb01.o.o and nb02.o.o we did leak some opensuse things recently, so we might need to dive more into it | 14:21 |
pabelanger | mrhillsman: ah, in that case you need to manage the clean-up yourself | 14:21 |
mrhillsman | ok cool, thx | 14:21 |
*** dkehn has joined #zuul | 14:27 | |
*** pcaruana has quit IRC | 14:45 | |
*** sdake has joined #zuul | 14:49 | |
*** bhavikdbavishi has joined #zuul | 14:50 | |
*** sdake has quit IRC | 14:51 | |
*** sdake has joined #zuul | 14:53 | |
*** pcaruana has joined #zuul | 14:53 | |
*** gtema has joined #zuul | 14:54 | |
*** gtema has quit IRC | 15:00 | |
*** gtema has joined #zuul | 15:01 | |
*** sshnaidm is now known as sshnaidm|mtg | 15:04 | |
Shrews | mordred, corvus, and anyone else that may have my personal email address in your contacts: i've pretty much finished migrating away from gmail for personal use, so you should replace my gmail address with my work email (shrews AT redhat.com) | 15:18 |
*** gtema has quit IRC | 15:21 | |
*** sdake has quit IRC | 15:32 | |
*** sshnaidm|mtg is now known as sshnaidm | 15:33 | |
ttx | Quick question on the Zuul ansible playbook "restricted execution" environment... The doc is a bit limited on what is allowed or not allowed. Would a playbook that runs a script: foobar.py that would call an API and parse the results be allowed on localhost? Is there a practical way to simulate that environment locally? | 15:33 |
Shrews | ttx: no. we have a custom version of the command and shell modules that prevent execution on localhost | 15:35 |
*** sdake has joined #zuul | 15:35 | |
*** gtema has joined #zuul | 15:36 | |
ttx | Shrews: are the jobs in zuul-jobs somehow immune from that ? | 15:38 |
ttx | Like shell being executed in http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/validate-dco-license/tasks/main.yaml | 15:39 |
pabelanger | trusted jobs are allowed access to localhost, but not untrusted | 15:39 |
pabelanger | some tasks, like the uri module, should be able to run in untrusted context | 15:40 |
pabelanger | we use that for the rtd job to trigger remote builds | 15:40 |
*** swest has quit IRC | 15:40 | |
*** gtema has quit IRC | 15:41 | |
ttx | pabelanger: ok, thanks! | 15:42 |
*** gtema has joined #zuul | 15:42 | |
*** bhavikdbavishi has quit IRC | 15:43 | |
Shrews | hrm, do we intentionally have the .pyi files under zuul/ansible/* added to the repo? | 15:46 |
Shrews | e.g.) http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/action | 15:47 |
*** gtema has quit IRC | 15:48 | |
clarkb | Shrews: they are used by the type checker iirc | 15:48 |
clarkb | so yes | 15:48 |
*** gtema has joined #zuul | 15:48 | |
Shrews | clarkb: ah, k | 15:49 |
Shrews | thx | 15:49 |
avass | pabelanger: do you know if there's some sort of job templating planned?? | 15:50 |
*** openstackgerrit has quit IRC | 15:51 | |
corvus | avass: no, the idea is to make a parent job which uses variables, then make a bunch of one-liner child jobs which just set those variables, but each has a unique name. | 15:51 |
sean-k-mooney | hi quick question what ansible version is currently supported by zuul | 15:51 |
corvus | avass: (of course, the job language is yaml, so if you have a truly staggering number of jobs, you could probably write a quick script to generate that) | 15:52 |
corvus | sean-k-mooney: 2.5. soon (maybe in a few weeks/months) to be all supported versions. | 15:52 |
avass | corvus: I'll take a look at it again and see if I can do something different instead then | 15:53 |
sean-k-mooney | corvus: i'm hitting an issue with the docker hub zuul image | 15:53 |
pabelanger | avass: to add what corvus said, yaml anchors do work. So that will offer some help | 15:54 |
sean-k-mooney | corvus: specifically the fetch-devstack-log-dir role is failing with rsync errors https://github.com/openstack-dev/devstack/blob/master/roles/fetch-devstack-log-dir/tasks/main.yaml | 15:55 |
*** bhavikdbavishi has joined #zuul | 15:55 | |
pabelanger | sean-k-mooney: do you have log of failure? | 15:56 |
avass | pabelanger, corvus: looks like we might have more than 130 jobs, 18 base jobs with 7 sets of parameters | 15:56 |
sean-k-mooney | pabelanger: yes one sec | 15:57 |
pabelanger | avass: yah, we spent a fair bit of time designing the base jobs for openstack.org, we went through a few iterations before selecting the current setup. It resulted in a few layers of jobs to help support all the jobs in openstack | 15:58 |
corvus | avass: if it's not secret, i'd love to see your job definitions when you're done (private email would be ok). i find it helpful seeing concrete use cases. no worries if that's not practical. | 16:00 |
avass | corvus: would love to but I don't think I'm allowed to do that sorry | 16:01 |
sean-k-mooney | pabelanger: they are here https://logs.seanmooney.info/41/1041/3/check/dvsm-base/4a67f89/ but specific error is here http://paste.openstack.org/show/744173/ | 16:01 |
avass | corvus, pabelanger: we are working on getting the legal department to let us contribute though | 16:01 |
corvus | avass: no prob. thanks. re contributing, just so you know, no CLA is required (just the ability to submit code under the apache license) | 16:02 |
sean-k-mooney | pabelanger: it may be related to this bug https://github.com/ansible/ansible/issues/23887 | 16:02 |
corvus | (i only mention that because there's still some ambiguity since we're hosted on openstack's infrastructure, and the openstack projects require a cla; we're working on separating that out even more so it's more clear) | 16:03 |
pabelanger | sean-k-mooney: what OS is your executor? | 16:03 |
sean-k-mooney | pabelanger: im using the docker image that the zuul project created and published to the docker hub but ill check | 16:04 |
pabelanger | sean-k-mooney: I want to say you might have an old version of rsync, | 16:04 |
pabelanger | failed: Invalid argument (22)\nrsync: chown | 16:05 |
avass | corvus: Oh, yeah we're fully aware of that. | 16:05 |
sean-k-mooney | im using rsync 3.1.2 | 16:05 |
pabelanger | sean-k-mooney: or, there is maybe an issue with mounts in docker, and rsync isn't able to chown the file | 16:05 |
corvus | yeah, my first instinct says that's some kind of container mount issue, but i don't know what it could be | 16:07 |
*** gtema has quit IRC | 16:07 | |
*** gtema has joined #zuul | 16:08 | |
corvus | i'll boot up the quick-start system and see if i can see | 16:08 |
pabelanger | need to step away for a few minutes | 16:08 |
sean-k-mooney | pabelanger: perhaps, but other synchronize tasks work | 16:08 |
*** gtema has quit IRC | 16:10 | |
*** dkehn has quit IRC | 16:10 | |
corvus | sean-k-mooney: if you run "zuul-executor keep" in the container, zuul will not delete the working directory at the end of the job. then you can go looking at the job's working dir in the executor container. | 16:10 |
sean-k-mooney | corvus: for all jobs or just running jobs | 16:11 |
corvus | (so, run "zuul-executor keep", then retrigger the job. when finished debugging, you can turn the option back off with "zuul-executor nokeep") | 16:11 |
corvus | sean-k-mooney: for all jobs, but only for jobs which start after you set it | 16:11 |
sean-k-mooney | i will also need to mark the instance for holding in nodepool too, right? | 16:12 |
corvus | sean-k-mooney: if you want to debug the remote end, yes. but if i'm reading the error right, the problem is on the receiving side (the executor). of course, i could be reading it wrong. :) | 16:13 |
sean-k-mooney | the rest of the job is all working fine and it's literally the last role in the devstack job that copies the logs that is failing, so it's kind of annoying | 16:13 |
*** quiquell is now known as quiquell|off | 16:16 | |
sean-k-mooney | corvus: i'll try that in any case and see if i can figure out what's going on. | 16:16 |
sean-k-mooney | corvus: by the way do you know what {{ ansible_user_dir }} typically is | 16:18 |
sean-k-mooney | i guess ubuntu@192.168.1.168:/home/ubuntu/logs | 16:18 |
corvus | sean-k-mooney: it should be the user ansible is running as on the remote node, so 'ubuntu' in your case, 'zuul' in openstack's case (because of our custom images) | 16:18 |
corvus | or rather, the home dir of that user | 16:19 |
corvus | /home/ubuntu /home/zuul | 16:19 |
sean-k-mooney | ya that makes sense | 16:19 |
sean-k-mooney | i'll retrigger and then try to run the command again myself from the executor container | 16:20 |
clarkb | iirc that chown is there to make sure we get all of the logs? | 16:24 |
*** fdegir has quit IRC | 16:26 | |
*** avass has quit IRC | 16:32 | |
*** chandankumar is now known as chkumar|out | 16:35 | |
corvus | sean-k-mooney: i just tested out a synchronize task with the zuul quick-start system (which uses the containers from docker hub), and it works. it's a very simple test, but i think it shows we haven't completely broken the images. i think the next thing is to figure out what is special about those files/directories that's causing the problem. so let me know what you find with the keep/recheck test. | 16:43 |
sean-k-mooney | corvus: i created my k8s scripts by reverse engineering the quickstart using the same images :) | 16:44 |
sean-k-mooney | i know other synchronize tasks work | 16:45 |
*** sdake has quit IRC | 16:46 | |
sean-k-mooney | the logs in /opt/stack/logs are owned by stack but they are being copied to /home/ubuntu/logs | 16:46 |
sean-k-mooney | corvus: do you think i would be better off deploying nodepool-builder and just using similar custom images as are used in the gate | 16:46 |
corvus | sean-k-mooney: it shouldn't be necessary. if you do want to do that, i'd recommend waiting until we solve this problem. | 16:48 |
*** themroc has quit IRC | 16:57 | |
*** bhavikdbavishi has quit IRC | 16:58 | |
*** openstackgerrit has joined #zuul | 16:59 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul-jobs master: install-nodejs: add support for RPM-based OSes https://review.openstack.org/631049 | 16:59 |
*** pcaruana has quit IRC | 17:01 | |
*** bhavikdbavishi has joined #zuul | 17:16 | |
*** sdake has joined #zuul | 17:17 | |
*** sdake has quit IRC | 17:18 | |
*** sdake has joined #zuul | 17:43 | |
sean-k-mooney | corvus: hum, that didn't really tell me anything new. i think for now i'm going to copy paste the devstack base jobs and define my own that copies the logs differently | 17:53 |
corvus | sean-k-mooney: can you paste a directory listing of the logs directory in the executor work dir? | 17:54 |
sean-k-mooney | am yes http://paste.openstack.org/show/744185 | 17:56 |
corvus | sean-k-mooney: and also the 'controller' directory under that one, and also controller/logs ? | 17:57 |
sean-k-mooney | sure one sec. in the interim most of the job logs are here https://logs.seanmooney.info/41/1041/3/check/dvsm-base/a4f5dba/ | 17:59 |
sean-k-mooney | am yes http://paste.openstack.org/show/744186 | 18:00 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP Revert "Revert "Add a timeout for the image build"" https://review.openstack.org/633792 | 18:02 |
sean-k-mooney | corvus: all of the files seem to be copied, it's just the chown that failed | 18:02 |
sean-k-mooney | corvus: one thing i had considered was submitting a patch to devstack to not copy the owner/group/perms of the files | 18:04 |
corvus | sean-k-mooney: yes.... i wonder if all those files are owned as ubuntu:ubuntu on the remote node, and it's just failing to chown them to ubuntu:ubuntu on the executor, since that user doesn't exist... | 18:04 |
electrofelix | corvus: so I had a chance to do more thinking overnight on how we're looking to make use of nodepool v3 to solve a current problem and as a step to migrating to zuulv3, and I've put some thoughts into an email | 18:04 |
corvus | (but this would work in openstack, since there's a zuul user in both places) | 18:04 |
sean-k-mooney | corvus: let me check my passwd for a sec | 18:04 |
corvus | sean-k-mooney: i think your patch to devstack might solve the problem | 18:05 |
sean-k-mooney | i do not have a user 1000 in that docker image let me check if that is the userid in the vm | 18:05 |
sean-k-mooney | ya it probably would as it would disable the chown | 18:06 |
electrofelix | I think what we'll need to do is create a custom driver taking ideas from the static pool and openstack drivers where instead of simply populating zk with all the static nodes, only add them when requested | 18:06 |
sean-k-mooney | just not sure if its the best solution | 18:06 |
corvus | pabelanger: with regard to the fetch-output role, did you put any thought into the issue of file ownership being different on remote host vs executor? (see conversation above with sean-k-mooney) | 18:06 |
corvus | electrofelix: i may need to spend some time reading your mail, because the static driver used to do that, and we had problems with that and changed it to the current behavior. | 18:07 |
electrofelix | so that when a request comes in for one needing the special resources the quota call can check to see if there is one available in our vault instance | 18:07 |
electrofelix | and if there is add suitable labels to one of the static nodes storing within an internal listing and then add that to zk | 18:07 |
electrofelix | corvus: nuts, I was worried that I might not be understanding how node requests were triggered | 18:08 |
electrofelix | was there a race condition with nodes getting added? | 18:09 |
electrofelix | end up adding more nodes than requested? | 18:09 |
sean-k-mooney | corvus: ya so looking at the base vm ubuntu is 1000:1000 but the zuul container does not have matching user | 18:09 |
corvus | electrofelix: the biggest problem is that it made nodepool behave in a way that nobody expected. having nodepool managing nodes that don't show up in 'nodepool list' was exceedingly frustrating. | 18:09 |
*** sdake has quit IRC | 18:10 | |
sean-k-mooney | the nfs share that backs my log dir would have however | 18:10 |
sean-k-mooney | so the copy from the workspace to logs server should be fine but chown from vm to executor node would fail | 18:11 |
corvus | sean-k-mooney: let's see if we can adjust the devstack job to work here. i think this is a general problem we need to solve. if we can fix this, we can start using the pattern elsewhere. | 18:11 |
sean-k-mooney | well i can push a patch | 18:11 |
sean-k-mooney | the synchronize role supports controlling permissions | 18:11 |
sean-k-mooney | we just use the defaults | 18:12 |
sean-k-mooney | *module | 18:12 |
*** fdegir has joined #zuul | 18:12 | |
corvus | electrofelix: i'll try to read and digest your email today and hopefully can speak intelligently on the subject by tomorrow | 18:12 |
sean-k-mooney | looking at https://docs.ansible.com/ansible/latest/modules/synchronize_module.html i think i can just add group: no and owner: no | 18:13 |
electrofelix | corvus: great, it might be that we can live with those limitations as long as there appears to be a path out of it at some point | 18:13 |
corvus | sean-k-mooney: assuming you have a connection to openstack's gerrit, you should be able to make a local change which depends-on the change to devstack and verify the fix for your install | 18:14 |
electrofelix | corvus: does zuulv3 require that it can ssh to any node in nodepool or is there a concept of a multi-node set that is used by a job where it only connects to one? if so that would be our path out and some hacked up driver to get us there would be acceptable | 18:14 |
sean-k-mooney | corvus: oh yes i can do that | 18:14 |
*** jpena is now known as jpena|off | 18:15 | |
corvus | electrofelix: currently it assumes that for openstack and static nodes; it does not for k8s resources. we generally are in favor of nodepool handling more generic resources, so that is a potential solution. | 18:15 |
corvus | electrofelix: (keeping in mind that i haven't fully internalized the problem yet :) | 18:16 |
electrofelix | ah, well, that would be sufficient, I might read some more of the k8s driver, but I think the connecting to all the nodes supplied might be a limitation of the jenkins nodepool agent and once we don't need Jenkins we can drop the custom driver for a more generic solution | 18:17 |
electrofelix | or maybe it's not required for Jenkins either and I'm just missing it, but sounds promising enough that I think we can start down this path while working out the remaining wrinkles | 18:18 |
corvus | electrofelix: i'll still try to read up on the actual problem and give you a more reasoned opinion later today/tomorrow :) | 18:19 |
clarkb | is jenkins a constraint that we can't ignore? eg you must keep jenkins? | 18:19 |
*** bhavikdbavishi has quit IRC | 18:20 | |
electrofelix | clarkb: short term yes, long term my hope is add nodepool, then bring up zuulv3 and start migrating away from Jenkins | 18:20 |
electrofelix | but need to solve it incrementally | 18:20 |
electrofelix | hence my feeling that if there are some limitations with such a driver in nodepool, we could live with it and then switch once no longer required | 18:21 |
pabelanger | corvus: guess we'd need to expose the chmod options to synchronize, like you discussed, and disable that. However, that might be a lot of changes to zuul-jobs. | 18:23 |
pabelanger | we are using same user in ansible-network, zuul, for executor and nodes. However, I think sf.io is using zuul-worker on nodes, would need to double check | 18:24 |
sean-k-mooney | pabelanger: i just submitted https://review.openstack.org/#/c/633796/ and https://review.seanmooney.info/c/test/+/1041 | 18:28 |
sean-k-mooney | pabelanger: the other thing i can try is to just have a k8s init-container create the user in my executor container | 18:29 |
clarkb | electrofelix: super hacky idea. Have static nodes (could even just be the same node with different ssh auth details) where each "node" maps to a set of resources | 18:31 |
clarkb | electrofelix: that mapping could be an rc file in the homedir of the user ssh'd in | 18:31 |
clarkb | electrofelix: then nodepool will broker the static nodes and transitively all of your special resources | 18:32 |
electrofelix | clarkb: I think that would indeed be plan B, lose some nodes from being available to provide that mapping, not efficient but workable | 18:33 |
electrofelix | heading now, at least it looks like there is a stepping stone | 18:34 |
*** electrofelix has quit IRC | 18:34 | |
*** sdake has joined #zuul | 18:37 | |
sean-k-mooney | you know, while i like zuul a lot, waiting for devstack to run to see if your change to the log copying will work is like watching paint dry | 18:41 |
sean-k-mooney | i think im going to use this time to have dinner instead brb | 18:41 |
pabelanger | sean-k-mooney: yah, creating the user is also a good idea too at run time | 18:44 |
corvus | sean-k-mooney, pabelanger: i don't believe we have intentionally made any assumptions about the remote user being the same as on the executor. i agree that might solve present issues quickly, but if sean-k-mooney is up for it, i'd prefer to fix this in jobs so it isn't necessary. | 18:56 |
pabelanger | I agree, we haven't stated they should be the same, but believe the assumption might be made by default by ansible. The archive defaults to true on synchronize tasks, which does include user / group permissions | 19:02 |
sean-k-mooney | so i tried using the new style depend on and it cloned the review i pushed https://review.openstack.org/#/c/633796/ | 19:49 |
sean-k-mooney | but it did not use the roles from that review | 19:50 |
sean-k-mooney | https://github.com/openstack-dev/devstack/blob/master/.zuul.yaml#L166-L169 | 19:50 |
sean-k-mooney | why does devstack prefix its required projects with git.openstack.org, by the way? i had to rename my gerrit connection to git.openstack.org | 19:50 |
sean-k-mooney | in order to be able to use this job at all | 19:51 |
sean-k-mooney | the old style depends-on with the change id seems to have been ignored | 19:51 |
sean-k-mooney | the old style change id is still running so i dont know if it will work but i just did not see the clone in the logs | 19:51 |
*** fdegir has quit IRC | 19:52 | |
pabelanger | sean-k-mooney: we opted for git.o.o since that is the canonical url for the roles, we've talked in the past of maybe making zuul a little smarter so we didn't need to do that | 19:56 |
pabelanger | sean-k-mooney: the main reason, is if you have 2 gerrit connections, both with openstack-dev/devstack, jobs won't work properly | 19:56 |
sean-k-mooney | actually i just looked at my config | 19:56 |
pabelanger | this is the workflow that rdoproject has | 19:56 |
sean-k-mooney | to be able to use the role i have to have git.openstack.org added as a separate connection | 19:57 |
pabelanger | yah | 19:57 |
pabelanger | you should only need a single connection to review.o.o, just named git.o.o connection | 19:57 |
sean-k-mooney | that makes it very hard to reuse these jobs | 19:57 |
sean-k-mooney | one sec ill share my config if you dont mind taking a look and saying if its sane | 19:58 |
pabelanger | k | 19:58 |
*** luizbag has quit IRC | 19:59 | |
sean-k-mooney | http://paste.openstack.org/show/744196/ | 20:01 |
pabelanger | sean-k-mooney: you can collapse line 28 and 41 into a single connection called git.o.o | 20:02 |
sean-k-mooney | ok that might fix my depends-on issue too | 20:02 |
pabelanger | line 36 could also use git.o.o today, since we share the same gerrit instance | 20:02 |
pabelanger | yes | 20:02 |
sean-k-mooney | oh the zuul connection | 20:03 |
pabelanger | you could load it from openstack-infra/zuul-jobs, to collapse a connection | 20:03 |
pabelanger | once opendev gerrit is online, some of this will change, but I believe you can still use a single gerrit connection for openstack / opendev / zuul projects | 20:04 |
sean-k-mooney | ya i think i'll do that. also i just realised in my tenant config i'm loading most of the projects from git currently instead of the openstack gerrit | 20:04 |
pabelanger | yes, that is correct | 20:04 |
sean-k-mooney | the one nice thing about running all this stuff with k8s is i just update the configs and delete all the containers and it deploys it all clean again but keeps my volumes so the data is still there | 20:06 |
pabelanger | yah, that is the workflow tobiash is doing today also | 20:06 |
sean-k-mooney | ya so there is very little documentation on how to set up a third party ci against openstack with zuul v3 so i'm just cobbling things together at the moment | 20:08 |
sean-k-mooney | i remember about 3 cycles ago pitching the idea internally of adding zuul v3 support to kolla-ansible so you could deploy a cloud and working ci in one command but i never got around to it | 20:09 |
*** fdegir_ has joined #zuul | 20:11 | |
*** irclogbot_3 has quit IRC | 20:15 | |
*** irclogbot_3 has joined #zuul | 20:19 | |
openstackgerrit | Paul Vinciguerra proposed openstack-infra/zuul master: configloader.py: Not all jobs have an .updated attribute. https://review.openstack.org/633259 | 20:43 |
*** sdake has quit IRC | 20:45 | |
*** fdegir_ is now known as fdegir | 20:55 | |
openstackgerrit | Sean McGinnis proposed openstack-infra/zuul-jobs master: Make sure urllib3[secure] is installed for Twine use https://review.openstack.org/633829 | 21:19 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: WIP Revert "Revert "Add a timeout for the image build"" https://review.openstack.org/633792 | 21:20 |
openstackgerrit | Sean McGinnis proposed openstack-infra/zuul-jobs master: Block installation of requests-toolbelt 0.9.0 https://review.openstack.org/633829 | 22:15 |
*** sdake has joined #zuul | 22:15 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul-jobs master: add-build-sshkey: remove previously authorized build-sshkey https://review.openstack.org/632620 | 23:06 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Block installation of requests-toolbelt 0.9.0 https://review.openstack.org/633829 | 23:09 |
*** saneax has joined #zuul | 23:14 | |
-openstackstatus- NOTICE: http://zuul.openstack.org is not working. https://zuul.openstack.org does work. Please use that while we investigate. | 23:15 | |
*** saneax has quit IRC | 23:20 | |
*** saneax has joined #zuul | 23:22 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add /{tenant}/buildsets route https://review.openstack.org/630035 | 23:28 |
*** sdake has quit IRC | 23:35 | |
*** sdake has joined #zuul | 23:37 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: WIP add provides/requires support https://review.openstack.org/633605 | 23:39 |
*** sdake has quit IRC | 23:55 | |
*** sdake has joined #zuul | 23:56 |
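
Note on the 18:02-19:02 file-ownership thread: pabelanger points out that synchronize defaults to archive mode, which carries user/group ownership, and sean-k-mooney proposes turning owner and group off. A minimal sketch of what that means at the rsync level follows. It is only an illustration: `build_rsync_cmd` is a hypothetical helper, the host and paths are made up, and it assumes (not confirmed in the log) that the synchronize options end up as the equivalent rsync switches.

```python
def build_rsync_cmd(src, dest, preserve_owner=True, preserve_group=True):
    """Hypothetical helper: build an rsync argv list for pulling job logs."""
    # rsync's --archive is shorthand for -rlptgoD, so by itself it asks
    # rsync to preserve the owner (-o) and group (-g) of every file.
    cmd = ["rsync", "--archive"]
    # --no-owner / --no-group switch those two behaviours back off while
    # keeping the rest of archive mode (recursion, times, perms, links).
    if not preserve_owner:
        cmd.append("--no-owner")
    if not preserve_group:
        cmd.append("--no-group")
    cmd += [src, dest]
    return cmd


# Illustrative source and destination only; not taken from the log.
SRC = "ubuntu@node:/opt/stack/logs/"
DEST = "/tmp/executor-logs/"

default_cmd = build_rsync_cmd(SRC, DEST)
relaxed_cmd = build_rsync_cmd(SRC, DEST, preserve_owner=False, preserve_group=False)

print(" ".join(default_cmd))
print(" ".join(relaxed_cmd))
```

With the relaxed form, files land on the executor owned by whichever user runs the copy, so no chown to a uid such as 1000 is attempted there. Whether that is the right long-term fix for the jobs is the open question corvus raises at 18:56.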