pabelanger | ianw: np, I think in large changes like that we need to get some sort of sign-off on other zuul operators. I am not sure how best do do that right now. | 00:30 |
---|---|---|
clarkb | pabelanger: I dont think that would be practical. And even small.changes like the ssh port one had to be reverted | 00:33 |
clarkb | testing is likely the only practical way to address this | 00:34 |
ianw | yeah, if we just make sure we stress both paths now we know they're in use | 00:44 |
logan- | i saw talk about the ability to freeze zuul-jobs to a tag or sha, that would have helped reduce the revert urgency.. but yeah better testing and/or 3pci seem like good things to target also | 00:44 |
*** rlandy is now known as rlandy|bbl | 00:52 | |
logan- | and yep ianw.. id think its pretty necessary to test a path with _no_ site variables defined in case there are others waiting to break things aside from the traceroute one. | 00:54 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Revert "validate-host: retry network tests and include unbound logs" https://review.openstack.org/618844 | 02:16 |
openstackgerrit | neilsun proposed openstack-infra/zuul master: Add type check for zuul conf https://review.openstack.org/591917 | 03:31 |
*** chandankumar has joined #zuul | 04:55 | |
*** chandankumar is now known as chkumar|ruck | 04:56 | |
*** bjackman has joined #zuul | 05:16 | |
*** rlandy|bbl is now known as rlandy | 05:26 | |
*** chkumar|ruck has quit IRC | 06:37 | |
*** spsurya has joined #zuul | 06:39 | |
*** rlandy has quit IRC | 06:39 | |
*** bjackman has quit IRC | 07:13 | |
*** chandankumar has joined #zuul | 07:14 | |
*** chandankumar is now known as chkumar|ruck | 07:14 | |
*** pcaruana has joined #zuul | 07:20 | |
*** quiquell|off is now known as quiquell | 07:28 | |
*** pcaruana has quit IRC | 07:34 | |
*** bjackman has joined #zuul | 07:36 | |
*** pcaruana has joined #zuul | 07:40 | |
bjackman | If you define a secret and an ansible role in a trusted project, and then a job in an untrusted project, and the role refers to the secret, is there any reason why the job shouldn't be able to import the role? | 07:42 |
bjackman | Or would that be insecure because of something about the way ansible works? | 07:43 |
*** snapiri has joined #zuul | 07:45 | |
*** quiquell is now known as quiquell|brb | 07:46 | |
tristanC | bjackman: secrets are not attached to roles, so an untrusted project's job can use the role, but it would have to provide its own secret | 07:50 |
*** chandankumar has joined #zuul | 07:52 | |
*** chkumar|ruck has quit IRC | 07:56 | |
*** chandankumar is now known as chkumar|ruck | 07:57 | |
bjackman | tristanC, Right OK | 07:57 |
bjackman | What I'm trying to figure out is whether I can have an untrusted project mostly define its own configuration, but just "call" into something in a trusted project when it needs to use an API key | 07:58 |
tristanC | bjackman: job using secrets has to be defined in config projects... perhaps the config project's job could read the trusted project configuration and call the api key using task defined in the config project? | 08:03 |
*** themroc has joined #zuul | 08:06 | |
*** quiquell|brb is now known as quiquell | 08:06 | |
tristanC | i meant, to be defined in the same* project | 08:07 |
*** gtema has joined #zuul | 08:26 | |
tristanC | bjackman: fwiw we just published a blog post about using secret in zuul: https://www.softwarefactory-project.io/zuul-hands-on-part-5-job-secrets.html | 08:28 |
AJaeger | bjackman: What I hear from you would allow a job in an untrused project to get a secret - and that's a no-go for us. We don't want a job to leak a secret like with "echo $secret" in a job. So, you might need to move both together into the config project. | 08:50 |
AJaeger | bjackman: keep in mind you can use the job in the untrusted project - but you cannot define it there... | 08:51 |
*** jpena|off is now known as jpena | 08:53 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: executor: enable zuul_return to update Ansible inventory https://review.openstack.org/590092 | 08:56 |
*** goern has joined #zuul | 09:11 | |
*** gtema has quit IRC | 09:19 | |
*** snapiri has quit IRC | 09:36 | |
quiquell | AJaeger, tristanC: Can I configure zuul to hold nodes at its yaml files? or is just a zuul client option ? | 09:44 |
*** snapiri has joined #zuul | 09:45 | |
tristanC | quiquell: it's just a zuul client option, though there is a spec to enable auto hold action through rest api: https://review.openstack.org/562321 | 09:46 |
*** hashar has joined #zuul | 09:58 | |
bjackman | tristanC, Awesome will have a read | 10:10 |
bjackman | AJaeger, Got it. But I wondered if the secret was only accessible from the tasks defined in the role (and the role is defined in the project containing the secret) then perhaps it would be safe | 10:11 |
bjackman | AJaeger, I will just define the job in the trusted project and use it from the untrusted project. It would have been nice to have the details of the tests be defined in the same repo as the code under test, but no biggie | 10:12 |
*** chkumar246 has joined #zuul | 10:13 | |
tristanC | tobiash: it seems like prometheus is not suitable for some of the zuul metrics, the per-job,per-project,per-pipeline,per-tenant (e.g. .SUCCESS) and per-branch,per-project,per-tenant (e.g. .resident_time) may yield too many labels and the /metrics endpoint response time is impacted... | 10:14 |
*** chkumar|ruck has quit IRC | 10:15 | |
tristanC | documentation says that cardinality over 100 should be moved away from the monitoring ( https://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels ) | 10:16 |
*** chkumar246 is now known as chkumar|ruck | 10:18 | |
*** gtema has joined #zuul | 10:23 | |
*** sshnaidm has quit IRC | 10:50 | |
*** chkumar246 has joined #zuul | 10:54 | |
*** chkumar|ruck has quit IRC | 10:57 | |
*** goncalo has quit IRC | 11:01 | |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul-jobs master: DNM - Negative test - Validate host role - Check third party CI SF https://review.openstack.org/618988 | 11:10 |
tobiash | tristanC: we don't see real problems with that at our scale right now (using statsd_exporter) | 11:15 |
tobiash | tristanC: the argumentation in your link is that this can become a problem if you have hundreds of labels and scrape that for hundreds of servers | 11:17 |
tobiash | the zuul metrics will be scraped only for one instance | 11:17 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul-jobs master: DNM - Validate the validate-host via the SF third party CI https://review.openstack.org/618989 | 11:18 |
fbo | ianw: pabelanger logan- We just added the validate-host role in the test-basic-roles job of the SF third party CI for zuul-jobs. Here is the negative test https://review.openstack.org/618988 with the ianw's change (the CI catch the failure) and the positive test https://review.openstack.org/618989 (current master) that pass the CI. | 11:22 |
*** chkumar246 is now known as chkumar|ruck | 11:30 | |
*** sshnaidm has joined #zuul | 11:41 | |
*** chkumar246 has joined #zuul | 11:53 | |
*** chkumar|ruck has quit IRC | 11:56 | |
tristanC | tobiash: using this test: http://paste.openstack.org/show/735721/ , the metrics endpoints output 6.5MB and localhost curl takes almost a second to complete... | 12:00 |
tristanC | tobiash: the culpit is that metrics doesn't expire, so we may need to implement a cleaning thread to purge metric not updated every 10 minute or something | 12:02 |
tobiash | tristanC: fair, that needs to be done | 12:04 |
tobiash | tristanC: atm in our prod system with statsd_exporter I get 19MB in 0.2s | 12:04 |
tristanC | having to manually expire the per-job and per-project metric seems like extra work which makes me wonder if prometheus is the right place to store those information | 12:12 |
tristanC | tobiash: but if you think that's worthy, then i can try to implement this too | 12:13 |
*** chkumar has joined #zuul | 12:15 | |
*** chkumar246 has quit IRC | 12:18 | |
*** hashar has quit IRC | 12:27 | |
*** bhavikdbavishi has joined #zuul | 12:28 | |
*** gtema has quit IRC | 12:33 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add support for zones in executors https://review.openstack.org/549197 | 12:36 |
*** jpena is now known as jpena|lunch | 12:40 | |
*** panda|rover is now known as panda|rover|lch | 12:48 | |
tobiash | tristanC: hrm, expiring might also break some analysis methods, so maybe we should leave out these two metrics if they become a performance bottleneck | 12:52 |
*** chkumar246 has joined #zuul | 12:54 | |
*** chkumar246 is now known as chkumar|ruck | 12:54 | |
*** bhavikdbavishi has quit IRC | 12:55 | |
*** quiquell is now known as quiquell|lunch | 12:55 | |
*** chkumar has quit IRC | 12:57 | |
*** hashar has joined #zuul | 13:06 | |
*** bjackman has quit IRC | 13:19 | |
*** rlandy has joined #zuul | 13:33 | |
*** quiquell|lunch is now known as quiquell | 13:36 | |
*** panda|rover|lch is now known as panda|rover | 13:37 | |
*** chkumar246 has joined #zuul | 13:49 | |
tobiash | pabelanger: +2 with comment about possible docs improvement ^ | 13:51 |
pabelanger | tobiash: thanks! looking | 13:51 |
*** chkumar|ruck has quit IRC | 13:52 | |
tobiash | pabelanger: so the thing that I'd like to emphasize is that the executor takes all jobs of that provider and it further doesn't execute any jobs from a different provider | 13:52 |
*** chkumar246 has quit IRC | 14:01 | |
pabelanger | tobiash: yes, exactly | 14:09 |
*** jpena|lunch is now known as jpena | 14:15 | |
*** gtema has joined #zuul | 14:21 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Improve resource usage with semaphores https://review.openstack.org/619004 | 14:21 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add support for zones in executors https://review.openstack.org/549197 | 14:37 |
pabelanger | tobiash: updated with your comments | 14:37 |
pabelanger | rcarrillocruz: if you also want to review, since you were driving the initial discussions for use case^ | 14:38 |
tobiash | pabelanger: thanks | 14:38 |
corvus | pabelanger: is there a nodepool change for that? | 14:48 |
*** chkumar246 has joined #zuul | 14:53 | |
*** hashar has quit IRC | 15:12 | |
pabelanger | corvus: as that patch is written, there isn't a change needed. As we use nodepool.provider for the gearman function name, however I initiall created https://review.openstack.org/618622/ but clarkb suggested we might not need if we use provider_name | 15:16 |
corvus | pabelanger: ah, i -1d the zuul change because i'm still in favor of the nodepool change | 15:18 |
corvus | clarkb, pabelanger: can you look at the comments in 549197 and see what you think? | 15:18 |
pabelanger | sure, I don't have a strong preference on which way we choose... looking | 15:19 |
dmsimard | corvus: btw I didn't want to bother you with that during summit week but for https://review.openstack.org/#/c/617680/ | 15:27 |
dmsimard | neither sass or node-sass comes available in a 3.5.5 version.. should I just send a patch which recompiles the scss with the latest version of sass ? | 15:27 |
dmsimard | sass: https://www.npmjs.com/package/sass?activeTab=versions and node-sass: https://www.npmjs.com/package/node-sass?activeTab=versions | 15:27 |
corvus | dmsimard: is this it? https://rubygems.org/gems/sass/versions/3.5.5 | 15:36 |
dmsimard | corvus: oh, a gem... I was looking at npm T_T | 15:37 |
corvus | dmsimard: yeah, that's not confusing at all is it? | 15:37 |
corvus | i feel like we should make a python sass now, since that's the cool thing to do | 15:37 |
dmsimard | it's already a thing | 15:38 |
dmsimard | https://pythonhosted.org/scss/ | 15:38 |
dmsimard | :p | 15:38 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-website master: Reduce padding to remove extraneous whitespace https://review.openstack.org/617680 | 15:44 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add second level cache of nodes https://review.openstack.org/619025 | 15:49 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add support for zones in executors https://review.openstack.org/549197 | 15:57 |
pabelanger | corvus: clarkb: tobiash: I've updated ^ to depend on https://review.openstack.org/618622/ from nodepool, which creates a provider executor-zone setting. I still needs to add some testing, but if you'd like to confirm it is on the right track | 15:59 |
tobiash | pabelanger: seems like I overlooked the discussion earlier about the original approach. executor-zones indeed sounds better and more generic | 16:00 |
pabelanger | yah, the are top level with provider right now, but I think maybe I should move them into provider pools, for more granularity | 16:01 |
* tobiash is looking again | 16:02 | |
tobiash | pabelanger: if we want to cover az's as well, then yes this should be moved down to pools | 16:03 |
clarkb | I think both of corvus' review items could be addressed with the alternative as well. for 1) make the zone provider.pool and for 2) allow an executor to register to multiple pools. Its mostly a matter of ease of configuration I think | 16:04 |
tobiash | pabelanger: I also can imagine further use cases when moving this to pools like having some labels without provider stickyness and having e.g. IO heavy jobs more locally | 16:04 |
clarkb | and if setting an explicit arbitrary human readable string is easier than implied behavior between zuul and nodepool I'm happy with that | 16:05 |
*** bjackman has joined #zuul | 16:08 | |
*** quiquell is now known as quiquell|off | 16:15 | |
*** bjackman has quit IRC | 16:15 | |
*** chkumar246 has quit IRC | 16:16 | |
clarkb | pabelanger: ok tried to dig into a the code a bit more and left some notes on 549197 | 16:20 |
clarkb | ok and I think I've gotten through the nodepool change as well | 16:33 |
*** themroc has quit IRC | 16:47 | |
*** gtema has quit IRC | 17:15 | |
pabelanger | k, just finishing up some food, will reply in a few | 17:16 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: mirror-workspace-git: use port when pushing git repo through ssh https://review.openstack.org/618859 | 17:26 |
*** tobias-urdin is now known as tobias-urdin_afk | 17:27 | |
pabelanger | clarkb: so, you are not okay with just ignoring the zone request if nodepool sends it, but we don't have a executor setup for that zone? My thoughts are, it would fall back to default execute:executor zone, where all jobs run | 17:40 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add support for zones in executors https://review.openstack.org/549197 | 17:41 |
pabelanger | clarkb: for example ^ | 17:41 |
clarkb | pabelanger: I think we should log it at least | 17:41 |
clarkb | Im not sure if it should be an error | 17:41 |
pabelanger | yah, I've updated the patch to include warning about when we don't find a function from gearman | 17:42 |
corvus | i like warn but not error; that way you can put zones freely in your nodepool config but not have to use them | 17:43 |
corvus | that could be nice for expressing a preference that a node be used by a certain executor even if it isn't required -- then the job still runs if all executors in that zone are offline. | 17:44 |
corvus | otoh, if, say, you were just rebooting executors in a zone, maybe you'd want the job to wait. | 17:45 |
corvus | we might be tweaking this later :) | 17:45 |
pabelanger | wfm! | 17:45 |
*** j^2 has joined #zuul | 17:55 | |
*** hashar has joined #zuul | 18:08 | |
*** bhavikdbavishi has joined #zuul | 18:22 | |
*** hashar_ has joined #zuul | 18:23 | |
*** hashar has quit IRC | 18:25 | |
*** jpena is now known as jpena|off | 18:38 | |
*** tobias-urdin_afk is now known as tobias-urdin | 18:38 | |
clarkb | I have made firefox very sad trying to open the py35 job log for pabelanger's most recent patchset | 18:43 |
clarkb | (warning to anyone else that may try to do the same) | 18:44 |
tobiash | my chrome is happy ;) | 18:45 |
*** pcaruana has quit IRC | 18:47 | |
*** hashar_ has quit IRC | 18:48 | |
*** bhavikdbavishi has quit IRC | 18:51 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add second level cache of nodes https://review.openstack.org/619025 | 19:06 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul master: Add support for zones in executors https://review.openstack.org/549197 | 19:12 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add second level cache of nodes https://review.openstack.org/619025 | 19:13 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Cache node request zNodes https://review.openstack.org/618806 | 19:24 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Update node request during locking https://review.openstack.org/618807 | 19:24 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add second level cache of nodes https://review.openstack.org/619025 | 19:24 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool master: Add second level cache to node requests https://review.openstack.org/619069 | 19:24 |
tobiash | corvus: that might be helpful for the dynamic priorities ^ | 19:25 |
pabelanger | clarkb: tobiash: corvus: what are your thougts on nodepool side of making executor-zone(s), a list vs string. Since availability-zones can be a list, might be worth doing the same idea on executors. | 19:28 |
pabelanger | I could see where you might want to have a executor in 2 zones | 19:28 |
*** j^2 has quit IRC | 19:29 | |
corvus | pabelanger: that would suggest that it needs to be a list zuul-side, not nodepool side. | 19:29 |
corvus | i like string in both places though | 19:29 |
corvus | (in meeting now, can chat later) | 19:30 |
pabelanger | ack | 19:30 |
pabelanger | no rush, something I was thinking off adding tests in nodepool | 19:30 |
tobiash | that even could be added later if someone has a need for this | 19:31 |
pabelanger | Yup | 19:32 |
*** sshnaidm has quit IRC | 19:32 | |
*** sshnaidm has joined #zuul | 19:40 | |
*** sshnaidm is now known as sshnaidm|afk | 19:40 | |
tobiash | clarkb: if you have time, 599567 in combination with 599573 would solve an operational problem we have in case the vm that hosts the scheduler goes down without a chance to terminate connections | 19:40 |
*** j^2 has joined #zuul | 20:25 | |
openstackgerrit | Paul Belanger proposed openstack-infra/nodepool master: Add executor-zone support for providers https://review.openstack.org/618622 | 20:29 |
clarkb | tobiash: I have to refresh my memory on how keepalives work on linux every time this comes up | 20:31 |
clarkb | tobiash: iirc you have to enable them on the host kernel too right? | 20:31 |
tobiash | clarkb: I think they are enabled on any system I've looked | 20:32 |
clarkb | ya looks like they are enabled by default but can be disabled. In this case probably good enough for gear | 20:33 |
clarkb | and your implementation uses the default values from linux | 20:33 |
clarkb | and it is disabled by default All of that looks fine to me | 20:34 |
clarkb | corvus: ^ is that something you want to review too? | 20:34 |
*** hashar_ has joined #zuul | 20:34 | |
tobiash | clarkb: thanks! I think I've used the same defaults as the server component of gear (from which I copy&pasted most of it) | 20:36 |
clarkb | https://review.openstack.org/#/c/599573/1 fails testing implying that tox siblings instals isn't working? | 20:36 |
tobiash | Looks like | 20:37 |
AJaeger | clarkb: don't we need gear in required-projects for tox siblings to work? | 20:38 |
*** dkehn has joined #zuul | 20:38 | |
clarkb | AJaeger: maybe? the depends-on pulled it in | 20:39 |
clarkb | AJaeger: the repo was on disk but maybe the role doesn't notice it unless it is a required-project | 20:39 |
tobiash | clarkb: ah interesting I already wondered why that failed to do the correct sibling install | 20:41 |
AJaeger | clarkb: http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/tox/library/tox_install_sibling_packages.py#n69 | 20:42 |
tobiash | AJaeger: thanks for the tipp | 20:42 |
AJaeger | but the code is not clear to me that it really does it - it's just what I remember... | 20:42 |
AJaeger | tobiash, clarkb http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/tox/tasks/siblings.yaml#n13 is the line that confirms my statement | 20:44 |
AJaeger | so, if you really want to test, add required-projects, could be a DNM ;) | 20:44 |
*** hashar_ has quit IRC | 22:06 | |
*** j^2 has quit IRC | 22:25 | |
*** mugsie has quit IRC | 23:06 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!