*** armstrongs has joined #zuul | 00:04 | |
*** armstrongs has quit IRC | 00:13 | |
*** jamesmcarthur has joined #zuul | 00:42 | |
*** jamesmcarthur has quit IRC | 01:07 | |
*** kerby has quit IRC | 01:55 | |
openstackgerrit | Mohammed Naser proposed zuul/zuul-operator master: Create zookeeper operator https://review.opendev.org/676458 | 02:08 |
*** jamesmcarthur has joined #zuul | 02:22 | |
*** bhavikdbavishi has joined #zuul | 02:33 | |
*** roman_g has quit IRC | 02:35 | |
*** jamesmcarthur has quit IRC | 02:38 | |
*** bhavikdbavishi1 has joined #zuul | 02:40 | |
*** bhavikdbavishi has quit IRC | 02:42 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 02:42 | |
*** jamesmcarthur has joined #zuul | 02:47 | |
*** noorul has joined #zuul | 03:12 | |
noorul | hi | 03:12 |
noorul | I am seeing the following error | 03:12 |
noorul | Sending result: {"result": "DISK_FULL", "warnings": [], "data": {}} | 03:12 |
noorul | But plenty of disk space is available on the node where zuul is running and on the slave nodes | 03:13 |
*** noorul has quit IRC | 03:18 | |
*** jamesmcarthur has quit IRC | 03:24 | |
*** jamesmcarthur has joined #zuul | 03:36 | |
*** noorul has joined #zuul | 03:41 | |
noorul | I see this warning too | 03:45 |
noorul | WARNING zuul.ExecutorDiskAccountant: /var/lib/zuul/builds/e27b42f0eff140bcb66b0ccb050e5b81 is using 280MB (limit=250) | 03:45 |
noorul | Is this related? | 03:45 |
clarkb | yes, there is a per job disk limit and you are hitting that. There is a config option to make the limit larger | 03:47 |
clarkb | it is there to avoid jobs filling disks on your executors and log servers | 03:47 |
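For reference, the per-build limit clarkb describes is the executor's disk_limit_per_job option (its 250 MB default matches the warning above). A minimal zuul.conf sketch, with an illustrative value; restarting the executor afterwards is likely needed for it to take effect:

```ini
# zuul.conf on each executor -- raise the per-build workspace limit
# (1024 MB here is only an example value)
[executor]
disk_limit_per_job=1024
```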
*** jamesmcarthur has quit IRC | 03:50 | |
*** noorul has quit IRC | 03:59 | |
openstackgerrit | Ian Wienand proposed zuul/zuul master: zuul_console: fix python 3 support https://review.opendev.org/682556 | 03:59 |
openstackgerrit | Ian Wienand proposed zuul/zuul master: Support nodes setting 'auto' python-path https://review.opendev.org/682275 | 03:59 |
*** bolg has joined #zuul | 03:59 | |
mnaser | https://review.opendev.org/#/c/676458/7 | 04:40 |
mnaser | What's the best way to consume the image built by the dependent job? | 04:41 |
mnaser | (and should we just build the image during the test run rather than out in a job before it?) | 04:41 |
*** pcaruana has joined #zuul | 04:46 | |
*** noorul has joined #zuul | 04:48 | |
mnaser | I wonder why the image is not being found | 05:02 |
noorul | On my slave node I am trying to use docker to run unit tests. What is the normal practice for this? Should we create a user in the docker image, or should root be used? When I have a user, the mount is not working | 05:31 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Set default python-path to "auto" https://review.opendev.org/682797 | 05:57 |
*** avass has joined #zuul | 06:13 | |
*** igordc has joined #zuul | 06:17 | |
*** igordc has quit IRC | 06:19 | |
*** avass has quit IRC | 06:32 | |
*** shachar has quit IRC | 06:35 | |
*** snapiri has joined #zuul | 06:35 | |
*** bolg has quit IRC | 06:50 | |
*** roman_g has joined #zuul | 07:00 | |
flaper87 | tobiash: how do you do the restart of the scheduler pod when you add a new project to the tenant? I've a configmap with the tenant config that is mounted in the scheduler's POD. I'd like to have a (safe?) way to signal the scheduler POD when the configmap changes | 07:04 |
flaper87 | I guess I could have a custom startup script | 07:04 |
flaper87 | but I'd rather use a more k8s method if there's one | 07:04 |
*** themroc has joined #zuul | 07:07 | |
*** hashar has joined #zuul | 07:13 | |
*** noorul has quit IRC | 07:17 | |
*** tosky has joined #zuul | 07:18 | |
mordred | mnaser: the zuul images jobs (with the opendev-buildset-registry job) should make the image stuff just work - as well as image promotion so that the image built in the gate is what gets published. it's a whole thing, but it works really well and works well with depends-on (in your copious free time we should make sure your zuul(s) are set up so that people can do speculative image jobs, because they're | 07:19 |
mordred | mindblowingly awesome, but they do take work from the zuul operator) | 07:19 |
mordred | mnaser: if you're having issues with images not showing up, let's for sure figure that out | 07:19 |
*** bolg has joined #zuul | 07:23 | |
*** sshnaidm|pto is now known as sshnaidm|rover | 07:24 | |
tobiash | flaper87: you can reload the scheduler without restart | 07:24 |
flaper87 | tobiash: yeah, but that requires getting into the pod and reloading the scheduler. was hoping to have that done automagically when the configmap is updated | 07:25 |
tobiash | flaper87: in that case you could run a helper script using inotify to automate that | 07:26 |
mordred | flaper87: I think you're yearning for the distributed scheduler work to be completed | 07:27 |
flaper87 | coolio, that's what I was going to do but I wanted to know if there was a more native way to do it. | 07:27 |
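A rough sketch of that helper-script idea as a sidecar in the scheduler pod. All names below are hypothetical; it assumes the image provides inotifywait and the zuul-scheduler entry point, and that the scheduler's zuul.conf and command-socket/state directory are visible to this container so full-reconfigure can reach the running scheduler:

```yaml
# Hypothetical sidecar container spec for the zuul-scheduler pod.
- name: tenant-config-watcher
  image: zuul/zuul-scheduler            # example image with zuul + inotify-tools
  command: ["/bin/sh", "-c"]
  args:
    - |
      # Watch the directory rather than the file: configmap updates in
      # Kubernetes swap files via symlinks, which plain file watches can miss.
      while inotifywait -e modify,create,move /etc/zuul; do
        zuul-scheduler full-reconfigure
      done
  volumeMounts:
    - name: zuul-tenant-config          # the configmap holding main.yaml
      mountPath: /etc/zuul
    - name: zuul-scheduler-state        # shared state dir with the command socket
      mountPath: /var/lib/zuul
```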
*** gtema_ has joined #zuul | 07:27 | |
flaper87 | mordred: oh, there's an ongoing work to make the scheduler distributed? | 07:27 |
flaper87 | I'd very much love that | 07:27 |
*** mmedvede has quit IRC | 07:31 | |
mordred | flaper87: yeah. https://review.opendev.org/#/c/621479/ is the spec | 07:32 |
*** arxcruz has quit IRC | 07:32 | |
*** mmedvede has joined #zuul | 07:34 | |
*** arxcruz has joined #zuul | 07:36 | |
*** jpena|off is now known as jpena | 07:41 | |
*** gtema_ has quit IRC | 07:45 | |
*** gtema_ has joined #zuul | 07:45 | |
*** AJaeger has quit IRC | 08:14 | |
*** recheck has quit IRC | 08:20 | |
*** noorul has joined #zuul | 08:22 | |
noorul | When the job log crossed the size limit, it failed with DISK_FULL error message | 08:23 |
noorul | testjob finger://aruba-virtual-machine/e27b42f0eff140bcb66b0ccb050e5b81 : DISK_FULL in 0s | 08:23 |
noorul | I am not sure what is the use of finger URL | 08:23 |
*** recheck has joined #zuul | 08:24 | |
*** AJaeger has joined #zuul | 08:27 | |
openstackgerrit | Merged zuul/zuul master: Add no-jobs reporter action https://review.opendev.org/681278 | 08:36 |
openstackgerrit | Merged zuul/zuul master: Add report time to item model https://review.opendev.org/681323 | 08:50 |
*** pcaruana has quit IRC | 08:57 | |
*** pcaruana has joined #zuul | 09:01 | |
mgoddard | FYI, had a weird issue in the persistent-firewall role: https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_7dd/681446/7/gate/ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode/7dd6514/job-output.txt | 09:06 |
openstackgerrit | Merged zuul/zuul master: Add Item.formatStatusUrl https://review.opendev.org/681324 | 09:11 |
*** panda|ruck|off is now known as panda|ruck | 09:37 | |
*** hashar has quit IRC | 09:41 | |
*** gtema_ has quit IRC | 09:46 | |
*** gtema_ has joined #zuul | 09:46 | |
*** noorul has quit IRC | 09:49 | |
*** bhavikdbavishi has quit IRC | 09:52 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - fix wrong commit gitweb url https://review.opendev.org/679946 | 10:02 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handle initial comment change event https://review.opendev.org/680310 | 10:02 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handle Pull Request tags (labels) metadata https://review.opendev.org/681050 | 10:02 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - reference pipelines add open: True requirement https://review.opendev.org/681252 | 10:02 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handles pull-request.closed event https://review.opendev.org/681279 | 10:02 |
*** openstackgerrit has quit IRC | 10:06 | |
*** zbr has quit IRC | 10:11 | |
*** zbr has joined #zuul | 10:12 | |
*** noorul has joined #zuul | 10:21 | |
fbo | corvus: tristanC mordred regarding Pagure, I've rebased the patch chain and removed the git.tag.creation patch. The chain should be ok to merge. For the tag.creation patch I'll discuss with pagure folks. | 10:23 |
fbo | https://review.opendev.org/#/q/topic:pagure-driver-update | 10:24 |
*** noorul has quit IRC | 10:26 | |
mordred | fbo: stack looks great to me | 10:29 |
*** openstackgerrit has joined #zuul | 10:30 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - add support for git.tag.creation event https://review.opendev.org/679938 | 10:30 |
fbo | mordred: thanks for the reviews! | 10:31 |
mordred | fbo: thanks for working on that - it's exciting to see the pagure support and that relationship moving forward | 10:31 |
fbo | mordred: thanks, yes :) and we are working on a set of default jobs for Fedora distgit, here's an example https://src.fedoraproject.org/rpms/python-gear/pull-request/8#comment-30592. I hope packagers will like it. | 10:41 |
*** brendangalloway has joined #zuul | 10:46 | |
*** noorul has joined #zuul | 10:51 | |
*** AJaeger has quit IRC | 10:55 | |
openstackgerrit | Merged zuul/zuul master: Pagure - fix wrong commit gitweb url https://review.opendev.org/679946 | 10:57 |
*** AJaeger has joined #zuul | 11:01 | |
*** noorul has quit IRC | 11:04 | |
*** noorul has joined #zuul | 11:05 | |
*** pcaruana has quit IRC | 11:19 | |
*** pcaruana has joined #zuul | 11:28 | |
*** jpena is now known as jpena|lunch | 11:35 | |
*** bhavikdbavishi has joined #zuul | 11:54 | |
*** bhavikdbavishi has quit IRC | 11:59 | |
mnaser | flaper87: you can run a second pod that watches the file for changes and fires the zuul full reconfigure command | 11:59 |
*** bhavikdbavishi has joined #zuul | 12:05 | |
*** jamesmcarthur has joined #zuul | 12:09 | |
*** jamesmcarthur has quit IRC | 12:16 | |
*** jamesmcarthur has joined #zuul | 12:22 | |
*** rlandy has joined #zuul | 12:25 | |
*** jamesmcarthur has quit IRC | 12:31 | |
*** jpena|lunch is now known as jpena | 12:31 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add scheduler config options for hold expiration https://review.opendev.org/682675 | 12:42 |
*** themroc has quit IRC | 12:42 | |
*** themroc has joined #zuul | 12:42 | |
Shrews | tristanC: if that ^^ looks good, i'll toss the rest of the stack on top of that again. Hopefully we can get this all merged today :/ | 12:43 |
Shrews | also corvus | 12:44 |
*** avass has joined #zuul | 12:47 | |
*** fdegir has quit IRC | 12:47 | |
*** fdegir has joined #zuul | 12:48 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add scheduler config options for hold expiration https://review.opendev.org/682675 | 12:48 |
Shrews | fixed up outdated comments ^^ | 12:48 |
*** jamesmcarthur has joined #zuul | 12:51 | |
*** hashar has joined #zuul | 12:51 | |
tristanC | Shrews: i'm in a meeting now, i can have a look in a couple of hours | 12:56 |
mnaser | If I want to use the same base job across different zuuls (for logs), would it pretty much require that all deployments at least include the keys used to build the secrets on that job and I should be good to go? | 12:59 |
mnaser | It sounds like it, I haven't tried it yet but it seems like that should be enough | 12:59 |
*** Goneri has joined #zuul | 13:09 | |
*** bolg has quit IRC | 13:12 | |
*** hashar has quit IRC | 13:30 | |
*** bhavikdbavishi has quit IRC | 13:30 | |
tristanC | Shrews: left a comment, it lgtm, though if you don't mind i'll wait for the rebase to test the stack again | 13:41 |
tristanC | fungi: re bwrap: containers usually setup cgroups, seccomp and selinux which we don't currently get from bwrap in zuul | 13:53 |
tristanC | seccomp in particular may be important, though it seems like bwrap can setup the filters thus we could get that for zuul. then zuul would be mostly missing selinux context by using bwrap instead of full containers | 14:00 |
*** swest has quit IRC | 14:01 | |
*** swest has joined #zuul | 14:02 | |
*** swest has quit IRC | 14:03 | |
corvus | mnaser: yes; as long as the keys are there, zuul won't generate them | 14:04 |
corvus | noorul: the finger url is just the only url zuul had for the job at the time. it does contain the build uuid so you can check that in logs. | 14:05 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add scheduler config options for hold expiration https://review.opendev.org/682675 | 14:07 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Mark nodes as USED when deleting autohold https://review.opendev.org/664060 | 14:12 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Auto-delete expired autohold requests https://review.opendev.org/663762 | 14:12 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add autohold delete/info commands to web API https://review.opendev.org/679057 | 14:12 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Remove outdated TODO https://review.opendev.org/682421 | 14:12 |
Shrews | tristanC: that should be the rest of the stack ^^ | 14:12 |
Shrews | there was a test fix in 679057 | 14:13 |
corvus | clarkb: https://review.opendev.org/680778 could use re+2 from you | 14:13 |
pabelanger | mordred: where is a good place to talk about how pbr does versioning of python things from git commits? Basically, I'd like to see how to add that support into galaxy cli command now for generating tarballs used by collections. So today, on galaxy side, it is very static. | 14:13 |
fungi | tristanC: do other container solutions actually apply selinux on platforms which don't ship with selinux enabled (for example, ubuntu)? | 14:14 |
pabelanger | so we can get a good story / versioning around speculative galaxy collections | 14:14 |
fungi | pabelanger: there is a thorough spec in the pbr docs | 14:15 |
pabelanger | Thanks! | 14:15 |
fungi | pabelanger: or are you talking about the actual implementation at the python level, not the versioning rules? | 14:15 |
pabelanger | fungi: yah, I wanted to see how to write the logic in python or maybe extract it from pbr so galaxy can use it | 14:16 |
pabelanger | or some sort of CLI command, where I can run and generate a proper version number | 14:16 |
fungi | i think pbr still relies on pkg_resources to obtain the metadata from disk, but could in theory be switched over to the newer packaging library | 14:16 |
*** noorul has quit IRC | 14:17 | |
pabelanger | right now, everything generated is using the same version number, which isn't ideal. | 14:17 |
*** avass has quit IRC | 14:18 | |
fungi | oh, pbr does have some functions for things like outputting equivalent rpm and deb version strings, so you just want something similar for its normal pep-440 version strings | 14:18 |
tristanC | fungi: i can't tell for ubuntu, but podman does set up a unique context label per container to restrict container access to the host and to other containers (when selinux is activated) | 14:19 |
fungi | ubuntu ships by default with apparmor, so maybe that gets used in similar ways by some containerization solutions | 14:20 |
pabelanger | fungi: yes. Basically, take git repo. Run command, output something like 3.9.1.dev183 a2018c5 (taken from zuul.o.o UI). I can then generate proper 'Manifest' file for galaxy or bonus points if galaxy did it directly | 14:20 |
*** openstackgerrit has quit IRC | 14:21 | |
fungi | seems like adding an additional pbr subcommand like `pbr version [--deb, --rpm, --sdist, --wheel]` might do the trick | 14:26 |
pabelanger | okay cool, that should give me starting place | 14:26 |
fungi | er, i guess it would be `pbr version [--deb, --rpm, --sdist, --wheel] <package>` | 14:26 |
fungi | so work like `pbr info <package>` does now | 14:27 |
fungi | there's also a `pbr sha <package>` | 14:27 |
fungi | i expect most of the plumbing is there, and i know the functions you need to generate equivalent version strings for multiple package formats are already in pbr | 14:28 |
fungi | "The version.SemanticVersion class can be used to query versions of a package and present it in various forms - debian_version(), release_string(), rpm_string(), version_string(), or version_tuple()." https://docs.openstack.org/pbr/latest/user/features.html#version | 14:30 |
pabelanger | fungi: yah, I think the change will be, there is no package. As these repos don't use setuptools. So, need to understand how pbr reads that info from git, to generate the version number | 14:30 |
fungi | the implementation would likely be no more than a few lines of python | 14:30 |
fungi | oh | 14:30 |
fungi | if there's no python packaging involved, then yeah you may want a separate took | 14:31 |
fungi | tool | 14:31 |
fungi | pbr is fairly tied to python packaging metadata | 14:31 |
pabelanger | ack | 14:31 |
fungi | the bits of pbr which deal with finding the highest tag in your history, obtaining the commit id and so on are in https://opendev.org/openstack/pbr/src/branch/master/pbr/git.py | 14:33 |
fungi | the actual version string format is likely also not too relevant outside of python packaging contexts, since it's specific to how pip and similar tools expect to order things like prerelease, postrelease and development identifiers in version strings | 14:34 |
fungi | that's why pbr also has functions to translate those to deb and rpm version formats... different platforms have different rules for how to encode such details and how they sort relative to each other | 14:35 |
*** openstackgerrit has joined #zuul | 14:36 | |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: DNM: test prepare-workspace-git base-test https://review.opendev.org/682912 | 14:36 |
brendangalloway | pabelanger: tried your connection reset suggestion from yesterday, but it fails due to what looks like ansible 2.5 compatibility issues. Do you know if that will work if we upgrade from SF 3.2 to 3.3? | 14:56 |
pabelanger | brendangalloway: don't know, but which version of zuul are you using? We have multi-ansible support | 14:56 |
brendangalloway | we also noticed that the static node returns to the 'ready' state in nodepool when it comes back, even though zuul still considers it in use. Not sure if that's related though | 14:56 |
pabelanger | so you could use job.ansible_version to ask for newer version | 14:57 |
brendangalloway | We're on Software Factory 3.2, not sure exactly which zuul version that is. Default ansible is 2.5 | 14:57 |
pabelanger | brendangalloway: look on your zuul dashboard UI | 14:58 |
pabelanger | bottom left, should be version info | 14:58 |
brendangalloway | pabelanger: 3.6.1-1.el7 | 14:58 |
pabelanger | 3.7.0 was when we added multiple ansible: https://zuul-ci.org/docs/zuul/releasenotes.html#relnotes-3-7-0 | 14:58 |
pabelanger | :( | 14:58 |
pabelanger | brendangalloway: yah, sounds like you might need to upgrade | 14:59 |
brendangalloway | Aah - the Multiple Ansible Versions doc made it sound like the feature was still work in progress, so we haven't tried to use it yet | 14:59 |
openstackgerrit | Merged zuul/zuul master: Pagure - handle initial comment change event https://review.opendev.org/680310 | 14:59 |
pabelanger | brendangalloway: nope! is production, works great | 14:59 |
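For anyone following along: on a new enough Zuul, requesting a particular Ansible version is a one-line job attribute (the YAML spelling is ansible-version, even though it is referred to above as job.ansible_version). A hedged sketch with example values:

```yaml
- job:
    name: my-unit-tests        # example job name
    ansible-version: "2.8"     # run this job's playbooks with Ansible 2.8
```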
brendangalloway | pabelanger: https://zuul-ci.org/docs/zuul/developer/specs/multiple-ansible-versions.html maybe needs an update then? It's the first thing that came up searching for 'zuul change ansible version' | 15:01 |
pabelanger | +1 | 15:01 |
brendangalloway | but yes, looks like next step is upgrading and seeing if that fixes things | 15:01 |
pabelanger | yah, we should consider updating it | 15:01 |
fungi | that's also somewhat of a risk with documenting our design specs in the documentation tree... can make it seem like those features aren't implemented yet if you go looking for them and end up at the design spec | 15:03 |
fungi | the pink warning at the top of the page is intended to convey that, but maybe it gives the opposite impression | 15:04 |
corvus | oh, if that's implemented, it should be removed | 15:04 |
tristanC | brendangalloway: zuul multi-ansible support is added to SF-3.3, you would need to follow this procedure: https://www.softwarefactory-project.io/docs/3.3/operator/upgrade.html | 15:05 |
fungi | corvus: the spec should be removed, or just the admonition at the top of it? | 15:05 |
corvus | fungi: the spec i think | 15:05 |
fungi | makes sense | 15:05 |
fungi | it's always available in git history after all | 15:05 |
tristanC | brendangalloway: though we are fixing minor issues regarding the recent centos-7.7 update, feel free to ask on #softwarefactory if you have any issue with that | 15:05 |
corvus | we should make sure that any documentation value it has is covered elsewhere, then remove it. (otherwise, we end up with documentation by spec which is less user friendly) | 15:06 |
openstackgerrit | Merged zuul/zuul-website master: Update to page titles and Users https://review.opendev.org/680459 | 15:06 |
*** arxcruz is now known as arxcruz|ruck | 15:08 | |
*** mattw4 has joined #zuul | 15:09 | |
*** themroc has quit IRC | 15:11 | |
openstackgerrit | Merged zuul/zuul master: zuul_console: fix python 3 support https://review.opendev.org/682556 | 15:17 |
*** michael-beaver has joined #zuul | 15:19 | |
clarkb | corvus: +2 reapplied | 15:19 |
*** noorul has joined #zuul | 15:23 | |
*** TxGirlGeek has joined #zuul | 15:26 | |
brendangalloway | I don't suppose there's any way for zuul to report how many times a certain job has been run in a certain pipeline? | 15:32 |
*** zbr has quit IRC | 15:36 | |
*** zbr has joined #zuul | 15:37 | |
noorul | Any idea how to copy a folder from one location to another using inbuilt ansible module? | 15:37 |
clarkb | noorul: you can use the synchronize module | 15:37 |
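A minimal sketch of what that could look like in a playbook (paths are placeholders); delegating to the node itself makes both sides of the copy local to that node instead of syncing from the executor:

```yaml
- name: Copy a folder from one location to another on the build node
  synchronize:
    src: /path/to/source/dir/          # placeholder path
    dest: /path/to/destination/dir     # placeholder path
  delegate_to: "{{ inventory_hostname }}"
```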
clarkb | brendangalloway: for all time? I believe that data is in the database but not currently exposed | 15:38 |
*** sshnaidm|rover is now known as sshnaidm | 15:39 | |
*** panda|ruck is now known as panda | 15:39 | |
brendangalloway | clarkb: Yeah, we've ported one of our build systems into zuul and we want to switch the old one off, but the idea of no longer having build numbers is ruffling feathers | 15:40 |
brendangalloway | Trying to find an easy way to provide something equivalent | 15:40 |
clarkb | in the opendev system I would point people to the statsd reporting since we expose that publicly through graphite and grafana | 15:42 |
*** jamesmcarthur has quit IRC | 15:45 | |
*** jamesmcarthur has joined #zuul | 15:47 | |
*** jamesmcarthur has quit IRC | 15:49 | |
*** noorul has quit IRC | 15:50 | |
*** jamesmcarthur has joined #zuul | 15:52 | |
*** noorul has joined #zuul | 15:53 | |
noorul | Does zuul support running a job in a container on a static node? | 15:55 |
pabelanger | noorul: yup, you can write a job to use docker for that | 15:57 |
*** jamesmcarthur has quit IRC | 15:57 | |
corvus | brendangalloway: in addition to stats reporting, you can run a build query -- eg http://zuul.opendev.org/t/zuul/builds?job_name=zuul-promote-image&pipeline=promote | 15:57 |
*** TxGirlGeek has quit IRC | 15:58 | |
*** jamesmcarthur has joined #zuul | 15:59 | |
*** recheck has quit IRC | 16:00 | |
*** recheck has joined #zuul | 16:00 | |
noorul | pabelanger: I think I can use https://docs.ansible.com/ansible/latest/modules/docker_container_module.html | 16:00 |
*** zbr is now known as zbr|ruck | 16:07 | |
*** gtema_ has quit IRC | 16:11 | |
noorul | I am trying something like this | 16:19 |
noorul | http://paste.openstack.org/show/777430/ | 16:19 |
noorul | But it has syntax error at line 11 | 16:19 |
noorul | did not find expected '-' indicator while parsing a block collection | 16:19 |
noorul | Is it not possible to use variables in a list? | 16:20 |
pabelanger | line 11 needs to quota all | 16:23 |
pabelanger | ":/tmp | 16:23 |
pabelanger | needs to also be inside quotas | 16:24 |
pabelanger | quotes* | 16:24 |
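In other words, the whole volume entry should be a quoted string so YAML doesn't treat the colon as a mapping separator. An illustrative snippet (task and image names are made up):

```yaml
- name: Run unit tests in a container
  docker_container:
    name: unit-tests
    image: example/test-image:latest     # placeholder image
    volumes:
      # Quoting the full "src:dest" string keeps YAML from choking on the colon.
      - "{{ zuul.project.src_dir }}:/tmp"
```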
noorul | pabelanger: Thank you | 16:25 |
noorul | ofosos: hi | 16:26 |
noorul | What is the procedure for using 3rd party ansible modules inside zuul ? | 16:34 |
fungi | noorul: what sort of third-party module, and where inside zuul? | 16:36 |
fungi | noorul: if you're talking about use of ansible modules on the executor, zuul shadows and filters some modules in the per-ansible trees like https://opendev.org/zuul/zuul/src/branch/master/zuul/ansible/2.8 | 16:38 |
pabelanger | I'd say nested ansible is likely a more flexible approach | 16:38 |
pabelanger | but, with the coming collections we'll need to make 3rd party modules a little easier | 16:39 |
fungi | and yes, having the executor's ansible invoke an ansible on a disposable build node gets you access to use any additional modules you want without compromising the security of the executor's contained ansible environment | 16:39 |
fungi | ansible wasn't really designed with the idea in mind that you might want to use it to run untrusted playbooks/roles/modules, so zuul cripples the ansible it's interfacing directly with to try and help preserve the security of its executors | 16:41 |
fungi | if you're considering expanding the set of modules the executor allows, they should be very carefully scrutinized for ways they could be abused to compromise the environment | 16:43 |
*** tosky has quit IRC | 16:49 | |
pabelanger | neat, https://github.com/theopenlab/labkeeper looks to be a fork of stuff we use to deploy zuul.a.c :) | 16:50 |
noorul | I am getting this error | 16:50 |
noorul | Failed to import docker or docker-py - No module named requests.exceptions. Try `pip install docker` or `pip install docker-py` (Python 2.6) | 16:50 |
noorul | Does zuul only use python 2? | 16:50 |
pabelanger | no, you can use python3 but you need to configure that in nodepool | 16:50 |
pabelanger | https://zuul-ci.org/docs/nodepool/configuration.html#attr-diskimages.python-path for example | 16:51 |
noorul | pabelanger: I am using static driver | 16:52 |
pabelanger | https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[static].pools.nodes.python-path | 16:52 |
noorul | It also has python path option | 16:52 |
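A sketch of what that looks like for a static provider (host and label names are placeholders):

```yaml
providers:
  - name: static-provider                  # placeholder provider name
    driver: static
    pools:
      - name: main
        nodes:
          - name: build-node.example.com   # placeholder host
            labels:
              - static-node
            python-path: /usr/bin/python3  # interpreter Ansible should use on this node
```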
openstackgerrit | Merged zuul/zuul master: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 16:59 |
noorul | pabelanger: http://paste.openstack.org/show/777430/ | 17:06 |
noorul | looks like the : in 10.29.12.160:7990 is creating problem | 17:06 |
noorul | I am getting following error Error creating container: 500 Server Error: Internal Server Error ("invalid mode: /tmp") | 17:07 |
*** brendangalloway has quit IRC | 17:07 | |
noorul | https://docs.ansible.com/ansible/latest/modules/docker_container_module.html | 17:08 |
fungi | is 10.29.12.160:7990 your connection name? | 17:08 |
pabelanger | should it be: "{{ zuul.projects['10.29.12.160:7990/ac/commonlib'].src_dir }}:/tmp" | 17:08 |
openstackgerrit | Merged zuul/zuul master: Update gerrit pagination test fixtures https://review.opendev.org/682114 | 17:08 |
fungi | ahh, it's the second : causing the issue | 17:08 |
fungi | not the earlier one | 17:09 |
noorul | Oops let me correct | 17:09 |
*** jamesmcarthur_ has joined #zuul | 17:09 | |
noorul | This is what I have now http://paste.openstack.org/show/777434/ | 17:09 |
noorul | That was the old one | 17:09 |
noorul | As per the syntax, the third one is mode | 17:10 |
noorul | Since I have colon in connection name itself | 17:10 |
noorul | My connection is defined here http://paste.openstack.org/show/777435/ | 17:11 |
pabelanger | noorul: what does your inventory file look like | 17:11 |
*** jamesmcarthur has quit IRC | 17:12 | |
clarkb | http://zuul.openstack.org/build/7dd6514fd3b24eaea2db05b09e1d0d26/log/job-output.txt#2904 seems to be the cause of the failure mgoddard pointed out earlier today | 17:12 |
clarkb | but I'm not seeing that module failure in the console of the job | 17:12 |
pabelanger | noorul: that will include zuul.projects variable | 17:12 |
clarkb | it says "See stdout/stderr for the exact error" | 17:13 |
clarkb | anyone know what might cause a module failure like that? | 17:13 |
clarkb | it has failed_when set to false which explains why that task doesn't fail the job | 17:14 |
noorul | pabelanger: ? | 17:14 |
noorul | I am not able to use zuul.projects['bitbucket/ac/commonlib'].src_dir }} | 17:16 |
pabelanger | noorul: no, I am asking to see your inventory file, it will list the projects the job is using | 17:16 |
pabelanger | wanted to see what zuul is expecting | 17:16 |
noorul | Did you mean main.yaml ? | 17:18 |
noorul | http://paste.openstack.org/show/777437/ | 17:18 |
pabelanger | clarkb: http://paste.openstack.org/show/777438/ | 17:18 |
pabelanger | usually ARA helps to expose some of that info | 17:18 |
*** hashar has joined #zuul | 17:18 | |
clarkb | pabelanger: yes the stdout problem is caused by the module failure I linked to (the task that hit the module failure registers that variable) | 17:18 |
clarkb | pabelanger: the job failed because stdout wasn't set. stdout wasn't set because the module that sets it failed | 17:18 |
pabelanger | we restarted zuul right | 17:19 |
pabelanger | did we pick up new version of ansible? | 17:19 |
*** jpena is now known as jpena|off | 17:19 | |
clarkb | I don't think we upgrade ansible due to how pip install works | 17:19 |
clarkb | though that gives me the idea of checking the executor logs | 17:20 |
clarkb | I'll go do that now | 17:20 |
noorul | pabelanger: I am wondering why is it using ip address instead of the name | 17:20 |
pabelanger | clarkb: Hmm, I guess something about ansible changed | 17:21 |
pabelanger | however | 17:22 |
pabelanger | https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/persistent-firewall/tasks/main.yaml | 17:22 |
pabelanger | I would consider switching that to include_task, and pass in iptables_rules / ip6tables_rules into the tasks | 17:22 |
pabelanger | let me check something | 17:22 |
clarkb | what is the difference? | 17:22 |
pabelanger | Oh | 17:23 |
pabelanger | http://paste.openstack.org/show/777440/ | 17:23 |
pabelanger | rc: -13 | 17:23 |
pabelanger | I've seen this before | 17:23 |
pabelanger | but, haven't figured it out | 17:23 |
pabelanger | I think that is SIGPIPE? | 17:24 |
pabelanger | basically, ansible is killing the task, IIRC, which raises -13 | 17:24 |
pabelanger | I never figured it out | 17:24 |
fungi | yes, manual for signal(7) confirms 13 is sigpipe | 17:25 |
fungi | oh, though this is an exit code not a signal | 17:25 |
clarkb | one thing I notice looking at the executor logs is that those tasks ran twice | 17:25 |
clarkb | the first time it runs it does so successfully | 17:25 |
pabelanger | clarkb: I can see it run twice, but on different nodes | 17:26 |
pabelanger | unless I missed | 17:26 |
pabelanger | fungi: yah, I think the - in -13 was a signal | 17:26 |
clarkb | pabelanger: it's twice per node for a total of 4 times | 17:26 |
pabelanger | at least if I remember talking to core | 17:26 |
clarkb | I think because the multinode bridge updates the firewall rules then repersists them after the initial pass | 17:26 |
clarkb | this happens 2 minutes and 15 seconds apart ish | 17:27 |
clarkb | likely not a race in that case | 17:27 |
clarkb | but that also means the command is present on the host and successfully ran at least once | 17:27 |
noorul | pabelanger: Any work around? | 17:28 |
pabelanger | clarkb: yah, something looks odd in that playbook, we call persistent-firewall multiple times | 17:28 |
pabelanger | oh | 17:29 |
clarkb | pabelanger: I don't think its wrong, we call it after making changes to the rules | 17:29 |
clarkb | and we make changes multiple times if doing multinode bridge setup | 17:29 |
pabelanger | this is test playbook? | 17:29 |
clarkb | no | 17:29 |
clarkb | unfortunately the ironic job doesn't seem to successfully collect syslog :/ | 17:29 |
pabelanger | k, I haven't really looked to much into multi-node playbooks | 17:31 |
pabelanger | but, I think we are hitting ansible issue | 17:31 |
pabelanger | would be cool if we can reproduce | 17:31 |
clarkb | it wouldn't surprise me if including the same task file multiple times is buggy in ansible | 17:31 |
clarkb | we've found that having nested includes has been able to cause weird errors | 17:31 |
pabelanger | yah, in fact, I think I'm including the same role multiple times too | 17:32 |
pabelanger | so, maybe you are on to something | 17:32 |
clarkb | might also be a problem with the register task. Like rerunning the same command task with a register breaks | 17:33 |
pabelanger | yah, I think passing vars into task via include_task might be something to try: https://docs.ansible.com/ansible/latest/user_guide/playbooks_reuse_includes.html#including-and-importing-task-files | 17:35 |
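Roughly what pabelanger is suggesting, as a sketch; the task file name and registered-variable names are made up, and only the idea of passing the rule lists in explicitly via include_tasks comes from the discussion:

```yaml
- include_tasks: persist-rules.yaml                           # hypothetical task file
  vars:
    iptables_rules: "{{ iptables_output.stdout_lines }}"      # hypothetical register
    ip6tables_rules: "{{ ip6tables_output.stdout_lines }}"    # hypothetical register
```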
openstackgerrit | Merged zuul/zuul master: Support HTTP-only Gerrit https://review.opendev.org/681936 | 17:35 |
*** zbr|ruck is now known as zbr | 17:36 | |
SpamapS | Watching a Circle CI demo.. we could totally steal a thing from them with their restore_cache/save_cache keywords. | 17:38 |
SpamapS | Simpler than building AMI's or docker images .. just have a way to save dirs and restore dirs. | 17:39 |
fungi | sounds like pbuilder in debian | 17:39 |
SpamapS | yeah exactly | 17:40 |
fungi | tar up the tree and archive it, then untar it for subsequent builds | 17:40 |
SpamapS | they just use a key to invalidate.. so you can hash things like your requirements.txt or something else. | 17:40 |
clarkb | requirements.txt won't work because it is too loose. constraints might | 17:41 |
SpamapS | Pipfile.lock would work | 17:41 |
clarkb | also worth noting (and this too is python specific) venvs are not portable across even minor python version updates | 17:42 |
clarkb | you'd also need to invalidate if your base image changed | 17:42 |
SpamapS | Yeah, that seems pretty simple to do. | 17:42 |
fungi | it goes deeper still | 17:43 |
SpamapS | I think you are optimizing for rare events. | 17:43 |
noorul | pabelanger: I shared main.yaml http://paste.openstack.org/show/777437/ | 17:43 |
SpamapS | but yes, of course, invalidation will be needed. | 17:44 |
SpamapS | also I didn't suggest venvs | 17:44 |
SpamapS | wheel cache for instance | 17:44 |
fungi | if your dependencies include things which build c extensions from sdist and don't provide wheels, then it won't be the same if the underlying libs/headers change | 17:44 |
fungi | oh, but yeah, a wheel cache, sure | 17:44 |
SpamapS | you are all thinking in terms of *BIG* infrastructure like opendev. Small shops won't want to maintain all the caches and such | 17:44 |
mordred | yeah. there's definitely ways in which such a technique could be really nice - and I think we have the general tools and plumbing to be able to do such a thing | 17:44 |
fungi | granted, we basically already build a wheel cache in our ci system and publish it from local hosts in each region | 17:44 |
SpamapS | Right I'm looking at what a team did that migrated away from Zuul | 17:45 |
mordred | and probably wiring up a similar mechanism using the plumbing we've got so that people could easy do such a thing would be really helpful to a range of folks | 17:45 |
pabelanger | noorul: so, each job will produce an inventory file, which should be collected via logs. For example: https://zuul.opendev.org/t/zuul/build/adf44f2115344c1bac7733f1ef22983c/log/zuul-info/inventory.yaml | 17:45 |
SpamapS | Because we didn't have time to build our own wheel cache. | 17:45 |
fungi | but i agree, not having to maintain a separate system for wheel caches would be nice, even for large deployments like opendev, absolutely | 17:45 |
pabelanger | in it you can confirm the syntax, but based on your yaml file, you likely need to use bitbucket/ac/commonlib | 17:46 |
fungi | SpamapS: the other concern which arises is trusted vs untrusted builds and cache poisoning. this is already a problem for systems like distcc | 17:46 |
fungi | er, well, ccache | 17:46 |
SpamapS | only do it on gate | 17:46 |
*** hashar has quit IRC | 17:46 | |
fungi | and scope it independently along project lines i suppose | 17:47 |
mordred | you could even use promote to push an updated version of the cache | 17:47 |
noorul | pabelanger: I have that file | 17:47 |
mordred | so that check and gate pull the cache, do $whatever to "update" it (which would frequently be noop) | 17:47 |
fungi | because project a doesn't necessarily trust caches created by changes approved by project b's maintainers | 17:47 |
mordred | then promote publishes an updated cache | 17:47 |
noorul | pabelanger: It has project names that I can't put in public | 17:47 |
pabelanger | noorul: that is fine, if you look at zuul.projects, it will be a list of projects you can use | 17:48 |
* mordred waves hands a bit - but it could be really cool | 17:48 | |
noorul | It has this key | 17:48 |
noorul | 10.29.12.160:7990/ac/commonlib: | 17:48 |
noorul | I am wondering why stash driver is using 10.29.12.160:7990 as prefix | 17:49 |
openstackgerrit | Clark Boylan proposed zuul/zuul-jobs master: DO NOT MERGE test cleanup phase playbook https://review.opendev.org/680178 | 17:49 |
noorul | Is it stash driver or Zuul ? | 17:49 |
pabelanger | that comes from your tenant config, IIRC | 17:50 |
pabelanger | but, you had bitbucket for connection | 17:50 |
noorul | Yes | 17:50 |
pabelanger | you can try setting canonical_hostname in your connect | 17:52 |
pabelanger | to the hostname of your service | 17:52 |
pabelanger | connection* | 17:52 |
fungi | makes me wonder if the bitbucket driver could be passing the connection address/port where it's supposed to be passing the connection name | 17:52 |
noorul | Any idea which interface method is used? | 17:53 |
pabelanger | SpamapS: curious how moved away from zuul, is that public info? | 17:54 |
pabelanger | asterisk project used to run zuul at one point, but moved to jenkins. They never did get to a point of gating asterisk, just using zuul with gerrit to test, then merge manually | 17:55 |
*** sshnaidm is now known as sshnaidm|bbl | 17:57 | |
SpamapS | pabelanger: It's a team here at Good Money. Experiment that so far is going well. They just didn't ever embrace Zuul/Ansible and CircleCI has a lot of bells and whistles for smaller teams/apps. | 17:57 |
pabelanger | ack | 17:58 |
fungi | i agree looking at what bells and whistles people find useful is a good opportunity to identify possible missing features | 17:58 |
mordred | ++ | 17:58 |
mordred | it's always good to learn about what things make people like other things | 17:58 |
openstackgerrit | Merged zuul/zuul master: Add autogenerated tag to Gerrit reviews https://review.opendev.org/682473 | 17:59 |
pabelanger | SpamapS: I do agree that setting up caches stuff is time consuming. Still need to do that for zuul.a.c. Starting now to get to point where regional mirrors of things will be helpful | 17:59 |
SpamapS | I mean, to put it in context.. we were too slow, taking 17 minutes to deploy. | 18:01 |
fungi | it's also possible to cache a lot more stuff on node images | 18:03 |
fungi | opendev used to do a bunch more of that than it does now, because the more you cache the easier it is for your cacheing to hide problems with dependency management | 18:04 |
SpamapS | We did that in response, but that took us, frankly, a month. | 18:04 |
SpamapS | Because nodepool+aws has no builder. | 18:04 |
SpamapS | So we had to create a packer build, get it working, move all the pre steps into it, and then make a zuul job that packer builds and uploads an AMI. | 18:05 |
pabelanger | yah, we started to cache ansible/ansible on images, which did speed up job runs a bit | 18:05 |
SpamapS | By that time they'd already moved to Circle and ripping that team's stuff out of our monolithic build took it from 17 minutes to 7, just in time for some new microservices to add 3 more minutes. ;) | 18:05 |
SpamapS | But the AMI having everything installed won us back those 3 minutes. | 18:06 |
pabelanger | I think I'd like to add a docker insecure registry, but also want to back it with swift... so in a holding pattern until we figure out zuul-registry | 18:06 |
SpamapS | (In reality, it happened a little different in terms of ordering, but we're at 7 minutes for our zuul build as of today, and that team that went to CircleCI is at 4 minutes. ;) | 18:07 |
pabelanger | yah, that is some of the concern I hear from awx team, and using static nodes and GCE. They get faster builds | 18:08 |
pabelanger | but, I also don't think they are rebuilding (base) images each time | 18:09 |
SpamapS | I think at scale, building AMI's with caches and having local mirrors .. all "the right thing". But making it simpler for a single build to solve its own problems may actually be the more important capability. | 18:09 |
SpamapS | (AMI's, or disk images, or whatever) | 18:10 |
mordred | SpamapS: the thing with a cache save/restore thing is that it doesn't just have to be used for caches | 18:10 |
clarkb | pabelanger: static nodes also have the potential for going sideways | 18:12 |
mordred | and it's a feature we already have in zuul - its just not exposed in a single save/restore pair like in circle | 18:12 |
* clarkb lived that life for a long time does not want to go back | 18:12 | |
SpamapS | mordred: is that the provides thing? | 18:12 |
pabelanger | clarkb: yah, agree. I totally think it is an education issue too. But so far, they seem happy to use that, regardless of the potential issues | 18:12 |
fungi | yeah, glad to no longer spend a good chunk of my day checking the health of all our static ci nodes and rebooting them | 18:12 |
SpamapS | I've never really looked at it. | 18:12 |
mordred | SpamapS: which is me saying - yeah - empowering people to be able to use the underlying power to solve their spefific problems without them knowing how all of the pieces work is super helpful | 18:12 |
mordred | SpamapS: exactly | 18:13 |
mordred | :) | 18:13 |
SpamapS | But isn't that between pipelines? | 18:13 |
pabelanger | clarkb: I really don't want to have jobs be blocked when infra is down, that is the power of zuul to me | 18:13 |
mordred | yeah - provides is - that's how a parent job could update a 'cache' if needed and have a child job pick up the updates. it's the other side of the coin to "I need one of these, it needs to be updated sometimes, and I need it as an input to my job" | 18:14 |
fungi | pabelanger: general leaks over time cause static nodes to start systemically failing jobs because they run out of disk/ram/whatever, but if you're running builds which require root privileges then it gets waaay worse since a build can absolutely shred the node or, worse, backdoor it and then steal things like credentials for subsequent builds | 18:14 |
mordred | the image build jobs are, I think, the most complete implementation of the pattern overall - but they're docker specific. the usage pattern though is essentially what one should have from a caching system in a world where depends-on and friends exist | 18:15 |
SpamapS | the general flow of things = restore_cache("key") or build_things("key")...save_cache("key") is the one that I like. | 18:15 |
pabelanger | fungi: agree! Maybe at ansiblfest you can help share that info :D | 18:15 |
fungi | happy to! | 18:15 |
SpamapS | Like in this case, we have a node_modules dir that only needs to be maintained when the yarn.lock changes. | 18:16 |
mordred | yah. totally | 18:16 |
SpamapS | so, Circle has a really nice way to just say "restore node_modules from cache if you can" and then "save node_modules to cache" | 18:16 |
SpamapS | I'd like to steal that for Zuul. :) | 18:16 |
mordred | just saying - that's the same effective pattern - so the tools are there to put together the thing you're talking about | 18:16 |
SpamapS | And one problem we have there, is that Zuul doesn't have a real abstraction for object storage/filesystem-caching/etc. | 18:17 |
fungi | i suppose we just needs some more specific roles/workflows around that use case | 18:17 |
SpamapS | It goes back to my idea that we don't need "nodepool" we need "thingpool" | 18:17 |
SpamapS | Anyway, just a thought. | 18:18 |
SpamapS | I also don't know how much I've missed in terms of roles that could make our builds simpler. | 18:19 |
openstackgerrit | Merged zuul/zuul master: Use robot_comments in Gerrit https://review.opendev.org/682487 | 18:19 |
SpamapS | There may be lots of roles out there that make Zuul life as simple as Circle life. | 18:20 |
noorul | After adding canonical_hostname everything stopped working :( | 18:22 |
mordred | SpamapS: yeah - I think that was kind of what I was getting at - I think the usage pattern you're talking about, while not implemented in exactly that way, is where several things either are headed or are already there, on a system-by-system basis - so I think broadly speaking that type of pattern is one that has a decent amount of work already done for it and I think it would mostly be some glue | 18:22 |
mordred | SpamapS: so like - a, "yeah, I agree, that's a great interface to provide" and "I don't think we're too far away from being able to do so" | 18:23 |
SpamapS | Cool. Just.. it needs to be really obvious and simple to consume. Right now I think Zuul is in that stage where it can do anything, but .. it's not super approachable for the few things that should be simple and easy. | 18:24 |
mordred | with a side helping of "the specific use case example is a good one to have in our heads as we continue to work on providing good sugar" | 18:24 |
SpamapS | Some things are, though. Like, if you embrace tox.ini, and all you want is CI on that.. ++ .. zuul crushes that. | 18:24 |
mordred | SpamapS: yeah - I think the power in sharable job definitions is that for anythign people have written good jobs for, things become trivial and uber powerful | 18:25 |
mordred | but we definitely need to pick up more reusable job content of the same depth as the tox support | 18:25 |
pabelanger | I wish all jobs were like tox jobs :) That would be huge, but also takes a lot of work. | 18:26 |
mordred | yup | 18:26 |
SpamapS | Another thing that Circle does more easily is secrets. | 18:26 |
mordred | because it also takes people who understand the ecosystem of the tools being supported. like - what are the right actions to take on behalf of someone when they are doing yarn/npm-based things | 18:26 |
clarkb | mordred: you are asking the wrong person about that. I still have to undelete or is it delete? the "I'm just here to hangout" file in zuuls js builds | 18:27 |
clarkb | seems like I get that wrong every time I push a new js related change | 18:27 |
SpamapS | They just have these things called "Contexts" and that's a group of envvars and other settings. Each CI step just calls out a context. That is totally doable in a Zuul shop with parent: prod-context parent: staging-context ... but.. it's not quite batteries-included for Zuul. | 18:27 |
mordred | clarkb: well - to be fair, that's a javascript project embedded in a python project - so it's a very zuul-specific one-off | 18:28 |
clarkb | SpamapS: i think at least some of that (env vars in particualr) is a desire to push that onto ansible so that we aren't rewriting what ansible can do | 18:29 |
*** hashar has joined #zuul | 18:29 | |
clarkb | maybe that has to change, maybe not. | 18:30 |
mordred | yeah - and the inability to tell ansible "just pass these env vars to every task please kthxbai" isn't not a thing that has annoyed me before :) | 18:30 |
noorul | even after removing canonical_hostname nothing works | 18:33 |
SpamapS | clarkb: I think a solid library of jobs that are one level up from base job, would do it. | 18:34 |
SpamapS | mordred: yeah, that would be a nice feature to add to Ansible. Something like playbook-level generics. | 18:34 |
*** openstackgerrit has quit IRC | 18:37 | |
SpamapS | like "- playbooks_defaults: { become: true, environment: { FOO: bar } }" | 18:42 |
*** mattw4 has quit IRC | 18:44 | |
noorul | somehow working after removing canonical_hostname | 18:46 |
noorul | Can someone help me to find a work around to http://paste.openstack.org/show/777434/ ? | 18:51 |
noorul | The colon in 10.29.12.160:7990 is conflicting with volumes syntax | 18:52 |
clarkb | noorul: can you share error messages? | 19:20 |
dmsimard | SpamapS: it sort of has something like that now -- module defaults: https://docs.ansible.com/ansible/latest/user_guide/playbooks_module_defaults.html | 19:29 |
SpamapS | dmsimard: I want it a level up | 19:29 |
SpamapS | that's at the play level | 19:29 |
SpamapS | I need it at the playbook level | 19:29 |
noorul | clarkb: Error creating container: 500 Server Error: Internal Server Error ("invalid mode: /tmp") | 19:29 |
dmsimard | SpamapS: it's at the play level | 19:29 |
dmsimard | yeah | 19:30 |
dmsimard | might be workaroundable with a play that imports other playbooks, not sure | 19:30 |
clarkb | noorul: did it log the argument it passed to docker? | 19:30 |
clarkb | noorul: if so that will help us udnerstand how things are getting interpolated there | 19:31 |
SpamapS | import playbook is at the playbook level | 19:31 |
dmsimard | ah, yeah, I'm mistaken | 19:31 |
noorul | Simple docker run http://paste.openstack.org/show/777447/ creates another problem | 19:32 |
noorul | the input device is not a TTY | 19:32 |
noorul | "volumes":[1 item | 19:34 |
noorul | 0:"src/10.29.12.160:7990/ac/commonlib: /tmp" | 19:34 |
noorul | ] | 19:34 |
mordred | dmsimard: still - that's cool! TIL | 19:35 |
noorul | clarkb: ^^ | 19:35 |
clarkb | I see so the problem is the port specification in the source side? | 19:36 |
clarkb | does docker have a way to escape that? : is a valid file name character | 19:36 |
noorul | by docker did you mean the ansible module? | 19:37 |
clarkb | no docker itself or whatever is consuming that input | 19:37 |
noorul | It is the ansible module | 19:38 |
*** openstackgerrit has joined #zuul | 19:41 | |
openstackgerrit | David Shrewsbury proposed zuul/nodepool master: Reduce upload threads in tests from 4 to 1 https://review.opendev.org/682977 | 19:41 |
*** tosky has joined #zuul | 19:43 | |
noorul | https://github.com/moby/moby/issues/8604 | 19:44 |
clarkb | noorul: if I'm reading that right you need the --mount flag ? | 19:45 |
noorul | clarkb: Not sure whether ansible module https://docs.ansible.com/ansible/latest/modules/docker_container_module.html supports that | 19:46 |
clarkb | noorul: ya you might have to run docker via the shell or command module instead | 19:46 |
clarkb | not sure | 19:46 |
noorul | I tried that, and I get the error "the input device is not a TTY" | 19:48 |
noorul | Is there any example in opendev which uses docker run | 19:48 |
noorul | ? | 19:48 |
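One possible workaround, sketched here with illustrative values: drive docker from the command module, use --mount (whose fields are comma-separated key=value pairs, so the colon in the connection name is harmless), and leave out -t since Ansible tasks have no TTY. A bind-mount source must be an absolute path, hence prefixing the relative src_dir with the home directory; the image and test command are placeholders, and become may or may not be needed depending on how docker is set up on the node:

```yaml
- name: Run unit tests in a container
  command: >-
    docker run --rm
    --mount type=bind,source={{ ansible_user_dir }}/{{ zuul.projects['10.29.12.160:7990/ac/commonlib'].src_dir }},target=/src
    example/test-image:latest make test
  become: true
```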
SpamapS | dmsimard: I want to echo what mordred said. I didn't know about module_defaults, and that's a handy thing | 19:49 |
Shrews | zuul-maint: it sure would be pleasant to get the autohold-revamp merged and remove this Sword of Damocles from over my head | 19:49 |
* Shrews starts singing from the Rocky Horror soundtrack | 19:49 | |
mordred | Shrews: but the dangling sword is so purty | 19:49 |
SpamapS | Shrews: is that a transvestite, transexual patch, from transylvania? | 19:50 |
noorul | In the devel branch mounts is supported | 19:50 |
noorul | https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/cloud/docker/docker_container.py#L378 | 19:50 |
Shrews | SpamapS: you know it | 19:50 |
*** mattw4 has joined #zuul | 19:50 | |
*** sshnaidm|bbl is now known as sshnaidm | 19:52 | |
*** Goneri has quit IRC | 19:52 | |
fungi | now we need to plan a zuul rocky horror stage performance | 19:59 |
Shrews | i call the part of Dr. Frank-N-Furter | 20:00 |
fungi | with the right baldwig i can do a decent riff raff | 20:00 |
corvus | ooh, i'm shivering with antici | 20:07 |
Shrews | .... ????????? | 20:07 |
corvus | pation | 20:07 |
Shrews | *phew* | 20:07 |
fungi | the suspense was unbearable | 20:08 |
*** pcaruana has quit IRC | 20:10 | |
*** jamesmcarthur_ has quit IRC | 20:12 | |
*** hashar has quit IRC | 20:12 | |
*** jamesmcarthur has joined #zuul | 20:17 | |
*** jamesmcarthur has quit IRC | 20:17 | |
*** jamesmcarthur has joined #zuul | 20:18 | |
dmsimard | mordred, SpamapS: I've been using it to set the openstack cloud credentials so we don't need to repeat it for every module, it's pretty neat | 20:26 |
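Roughly what that pattern looks like in a play (the cloud name, modules, and server details are illustrative):

```yaml
- hosts: localhost
  # Declare the cloud credentials argument once for the OpenStack modules
  # used in this play instead of repeating it on every task.
  module_defaults:
    os_server:
      cloud: mycloud
    os_volume:
      cloud: mycloud
  tasks:
    - name: Boot a test server
      os_server:
        name: test-node      # placeholder
        image: cirros
        flavor: m1.tiny
```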
*** noorul has quit IRC | 20:35 | |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: RFC: Generic cache implementation https://review.opendev.org/682992 | 20:38 |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: RFC: Generic cache implementation https://review.opendev.org/682992 | 20:39 |
corvus | SpamapS: I did a weird thing there -- i wrote up a sketch of how we might implement a cache system like you were describing, except that the way i originally thought of implementing it wouldn't work without some changes to how zuul handles secrets. but i wrote it anyway and pushed it as PS1. then i revised the plan to something that would work with what we have today, but has a different interface | 20:42 |
corvus | (shifting a bit more work into base jobs) and pushed that as PS2. | 20:42 |
corvus | SpamapS: when you have a minute, maybe take a look at those and let me know if you think that fits the use-case, and if so, whether PS2 is something we should work on now, or whether we should take this as design input for figuring out how to make PS1 work. | 20:43 |
corvus | anyone else too of course ^ | 20:45 |
*** Goneri has joined #zuul | 20:45 | |
corvus | Shrews: your stack is +3 up through the web api change | 20:51 |
corvus | er, up until it | 20:51 |
daniel2 | Someone gave me a link to a documentation on using cloud images instead of building them with nodepool. Can someone share that link again, I can't find it. | 20:59 |
corvus | daniel2: https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openstack].cloud-images (or similar for other drivers) | 21:11 |
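A hedged sketch of the cloud-image approach for the OpenStack driver (all names, flavors, and counts below are placeholders):

```yaml
providers:
  - name: my-openstack-provider
    cloud: mycloud
    cloud-images:
      - name: ubuntu-bionic
        image-name: Ubuntu 18.04 LTS      # the image as it is named in the cloud
    pools:
      - name: main
        max-servers: 4
        labels:
          - name: ubuntu-bionic
            cloud-image: ubuntu-bionic
            flavor-name: m1.small
```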
daniel2 | Thanks! | 21:11 |
Shrews | corvus: woooo | 21:18 |
*** Goneri has quit IRC | 21:22 | |
daniel2 | Does nodepool builder have any .service files? I swore I saw one before for nodepool builder but can't find it now. For systemd | 21:25 |
clarkb | daniel2: nodepool itself doesn't ship any but some of the various config management tools for managing nodepool have init scripts and unit files | 21:26 |
clarkb | the zuul from scratch docs may also have examples too /me looks | 21:26 |
*** Goneri has joined #zuul | 21:26 | |
daniel2 | ah yes, found it | 21:27 |
clarkb | https://zuul-ci.org/docs/zuul/admin/nodepool_install.html#service-file aha I was wrong | 21:27 |
clarkb | https://opendev.org/zuul/nodepool/src/branch/master/etc/nodepool-launcher.service | 21:27 |
daniel2 | I could probably edit that just to builder | 21:28 |
SpamapS | corvus: will take a look later today. Cool. | 21:28 |
daniel2 | launcher is running in docker. I'm going to run the builder on the host. | 21:28 |
corvus | i'm restarting opendev's zuul on HEAD right now; i noticed one minor issue which caused a non-fatal traceback on a gerrit connection without an http password. i don't think it will cause ongoing problems, and i'll push a fix in a minute. | 21:31 |
*** panda has quit IRC | 21:41 | |
*** panda has joined #zuul | 21:42 | |
corvus | and 2 tracebacks that are fatal | 21:44 |
corvus | i'm restarting our scheduler at 3.10.2 | 21:44 |
*** nhicher has joined #zuul | 21:47 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Store a list of held node per held build in hold request https://review.opendev.org/682466 | 21:47 |
*** michael-beaver has quit IRC | 21:49 | |
*** rlandy is now known as rlandy|bbl | 22:04 | |
*** jamesmcarthur has quit IRC | 22:06 | |
*** jamesmcarthur has joined #zuul | 22:09 | |
mnaser | i know "reviews welcome" but i've been struggling at searching zuul's codebase with gitea | 22:27 |
mnaser | it almost always returns 0 results | 22:27 |
SpamapS | isn't that what ripgrep i for? ;-) | 22:27 |
mnaser | aha | 22:28 |
mnaser | and on that note | 22:28 |
mnaser | i was trying to check which services are stateful vs stateless and noticed the executor actually has a state_dir config option | 22:28 |
mnaser | little research later shows we define it as `executor_state_root` but never reference it.. ever | 22:28 |
mnaser | besides just creating that directory | 22:28 |
mnaser | wait gr thats in the test code only | 22:29 |
mnaser | ok so looks like one reference to determine disk available on executor | 22:30 |
mnaser | and storing ansible's | 22:30 |
mnaser | maybe one day someone will read these chat logs and save themselves the source code reading :) | 22:30 |
*** jamesmcarthur has quit IRC | 22:33 | |
*** rfolco has quit IRC | 22:43 | |
clarkb | SpamapS: more specifically it is why we haven't deleted codesearch | 22:47 |
*** jamesmcarthur has joined #zuul | 22:50 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Fix gerrit errors from production https://review.opendev.org/683006 | 22:53 |
openstackgerrit | James E. Blair proposed zuul/zuul master: DNM: Use http for all gerrit tests https://review.opendev.org/683007 | 22:53 |
pabelanger | mnaser: yah, mostly executor.state_dir needs to be stateful | 23:06 |
pabelanger | git repos / ansible, etc | 23:06 |
mnaser | ok so statefulsets for everything except zuul-web | 23:06 |
pabelanger | and fingergw | 23:07 |
*** kerby has joined #zuul | 23:07 | |
pabelanger | nodepool-launchers are stateless, builders may need access to a cache for dib | 23:08 |
pabelanger | starting upgrade of zuul.a.c to 3.10.2 | 23:08 |
*** jamesmcarthur has quit IRC | 23:09 | |
mnaser | btw, i think we should get around to tagging releases on dockerhub | 23:09 |
mnaser | that'd be neat | 23:09 |
pabelanger | Hmm, maybe we just need to add job to release pipeline | 23:10 |
mnaser | well in the release pipeline we don't really have a built artifact (unlike promote i guess) | 23:15 |
pabelanger | Yup, we'd need to build it, then push with right version info | 23:16 |
mnaser | because the buildset registry would be long gone by then, so it would be a fresh rebuild | 23:17 |
pabelanger | other jobs do it today | 23:17 |
pabelanger | eg: loci | 23:17 |
mnaser | (in an ideal world it would be nice if it wasn't a fresh rebuild, and we just retag the one that's already uploaded) | 23:17 |
mnaser | and that we know is tested | 23:17 |
mnaser | because technically i guess the build that happens inside release isn't _exactly_ what we tested | 23:17 |
pabelanger | I mean, we could do that too. But I don't think we keep more than head in docker hub | 23:18 |
pabelanger | do we? | 23:18 |
mnaser | no only head is in dockerhub | 23:18 |
mnaser | actually the intermediate registry should always be running, so i'm wrong on that, i think it should be trivial, the trick is just figuring out the zuul version name in the job to tag | 23:19 |
pabelanger | I think we purge that each night | 23:20 |
mnaser | ah that breaks my plan in that case | 23:20 |
mnaser | because a release can come many days after the last change | 23:21 |
pabelanger | but yah, we could keep them around somehow | 23:21 |
pabelanger | then when we tag, also fetch original bits | 23:21 |
pabelanger | but, given for pypi we do a rebuild, docker should work the same | 23:21 |
pabelanger | just pulls in way more thing | 23:21 |
pabelanger | things* | 23:21 |
mnaser | well since we already upload to dockerhub, we can upload/tag every commit id there | 23:22 |
mnaser | and then its a matter of docker pull zuul/zuul-merger@<sha-of-current-released-commit> and retagging that with the tagged version | 23:22 |
mnaser | it also makes it easy for someone to point towards a very specific version of zuul based on the commit id | 23:23 |
pabelanger | yah, we did that for pypi for a bit | 23:23 |
pabelanger | I liked it, so I could fetch any wheel | 23:23 |
pabelanger | rather than building it myself | 23:24 |
mnaser | i can imagine those repos getting a little annoyed with us tagging everything in there :p | 23:24 |
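A rough shell sketch of that retag idea; the per-commit tag scheme and the release job it would live in are hypothetical, since no such job exists yet:

```sh
# Assumes every merged commit was already pushed as zuul/zuul-merger:<git sha>
# by the promote pipeline.
VERSION=3.10.2                               # example release tag
SHA=$(git rev-parse "${VERSION}^{commit}")   # commit the release tag points at
docker pull "zuul/zuul-merger:${SHA}"        # fetch the exact image that was tested
docker tag "zuul/zuul-merger:${SHA}" "zuul/zuul-merger:${VERSION}"
docker push "zuul/zuul-merger:${VERSION}"    # publish it under the release tag
```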
corvus | pabelanger, mnaser: what do you mean by stateful for executors? they benefit from cache but do not require it. | 23:26 |
mnaser | corvus: the idea that /var/lib/zuul (aka executor.state_dir) needs to be stable | 23:27 |
corvus | i mean, that if you shut one down or move it or whatever, if it has a git repo cache, great, it'll be faster. but if it doesn't, it can totally rebuild it from scratch. | 23:27 |
pabelanger | Zuul version: 3.10.2 ! | 23:27 |
mnaser | right ok, so the state is a nice to have cache, but it is not necessary | 23:27 |
pabelanger | yah, I was assuming mnaser was looking at doing a rolling upgrade in k8s. Agree, having cache isn't needed, just faster | 23:28 |
mnaser | it seems like a useful optimization to have if its possible imho | 23:28 |
corvus | mnaser: yep. let's pretend i've forgotten all the right k8s vocab. so just in general terms -- executors should be somewhat long-running so that they benefit from the cache, but for scaling up, down, or horizontal moves for eviction, etc, they can lose the state dir and be fine. | 23:30 |
mnaser | corvus: that makes perfect sense | 23:30 |
corvus | state dir for the scheduler is critical (secret decryption keys) | 23:30 |
corvus | oh, and everything said about executors applies to mergers, if you run any | 23:31 |
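In kubernetes terms, one way to read that: executors (and mergers) can run as an ordinary Deployment with a volume at the state dir, since its contents are only a rebuildable cache, while the scheduler genuinely needs persistent storage for its keys. A minimal sketch with illustrative names follows; real executor pods also need extra privileges for bubblewrap, omitted here.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zuul-executor
spec:
  replicas: 2
  selector:
    matchLabels:
      app: zuul-executor
  template:
    metadata:
      labels:
        app: zuul-executor
    spec:
      containers:
        - name: executor
          image: zuul/zuul-executor       # illustrative image reference
          volumeMounts:
            - name: state
              mountPath: /var/lib/zuul    # executor state dir: cache only, safe to lose
      volumes:
        - name: state
          emptyDir: {}                    # use a PVC instead if the cache should survive rescheduling
```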
pabelanger | so, for some reason that I don't fully understand, https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/log/job-output.txt seems to take some time to render the HTML | 23:34 |
pabelanger | I'm unsure where the bottleneck is happening | 23:34 |
clarkb | pabelanger: is it a very large file? | 23:34 |
*** jamesmcarthur has joined #zuul | 23:34 | |
mnaser | i think its the large file factor | 23:34 |
mnaser | my browser seems to hang | 23:34 |
clarkb | pabelanger: basically what happens is the js has to request the entire file and then scan it with regexes. Large files will be slow because network transfer is slow, as is regexing a large file | 23:34 |
pabelanger | 4.3mb? | 23:35 |
pabelanger | if I access file directly, it loads much faster | 23:35 |
pabelanger | https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_24/524/64bf07cadde572bf2ae5fa19bd0201b153f12f5a/promote/windmill-config-deploy/026dbb8/job-output.txt | 23:35 |
mnaser | i think its cause one does zero css/rendering/blah | 23:35 |
pabelanger | clarkb: ah, I see | 23:35 |
mnaser | i just profiled it with chrome | 23:36 |
pabelanger | could it be server side? | 23:36 |
mnaser | 2s for scripting, 4s rendering | 23:36 |
clarkb | pabelanger: if directly fetching it is fast then no it shouldn't be server side | 23:36 |
pabelanger | k, I'll have to dig more | 23:37 |
pabelanger | was thinking of disabling job-output.txt.html, like opendev did | 23:37 |
clarkb | I think mnaser just profiled why it is slow | 23:37 |
pabelanger | what does rendering? | 23:37 |
corvus | pabelanger: obviously, whatever workflow works for you, but i've basically stopped using the text console in favor of the json: https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console | 23:38 |
corvus | (i reserve the log viewer for logs other than the console output) | 23:39 |
mnaser | ok i see at least part of the reason why it is taking so long | 23:39 |
mnaser | we're building a table of N rows where N is the number of lines | 23:40 |
mnaser | in this case, a table with 39.4k rows | 23:40 |
*** tosky has quit IRC | 23:40 | |
mnaser | so the number of nodes explodes as the browser tries to render it all | 23:40 |
pabelanger | how do I see that? | 23:41 |
mnaser | i usually open the chrome developer tools | 23:42 |
mnaser | and then go into "performance" | 23:42 |
mnaser | hit the record button (make sure you have memory profiling enabled/checked too) | 23:42 |
mnaser | and then refresh the page, wait for it to be done, stop recording | 23:42 |
pabelanger | k | 23:42 |
mnaser | FWIW, GitHub does the same thing.. so i don't know if there's a better way to deal with this | 23:42 |
mnaser | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py a 10K github file probably gives your computer a workout too | 23:43 |
mnaser | and they even do syntax highlighting so its probably worse (if it was ~40k lines) | 23:43 |
pabelanger | that loads not too bad | 23:44 |
corvus | well, if it loads in 25% of the time it's comparable performance. | 23:44 |
pabelanger | true | 23:45 |
pabelanger | okay, thanks for help | 23:45 |
corvus | pabelanger: try out the console tab. keep in mind you can deep-link to individual results. | 23:45 |
corvus | eg https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console#3/0/1/localhost | 23:46 |
pabelanger | well, we have some large shell tasks, for nested ansible. Having some trouble figuring that out | 23:47 |
pabelanger | eg: https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console#3/0/1/localhost | 23:47 |
pabelanger | sorry | 23:48 |
pabelanger | https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console#3/1/6/bastion01.sjc1.vexxhost.zuul.ansible.com | 23:48 |
corvus | oh yeah, it's less great for nested ansible | 23:48 |
pabelanger | k | 23:49 |
pabelanger | mnaser: btw: we haven't seen issues with centos7.7 so far, but we're not doing much with it currently. Mostly a stale mirror | 23:52 |