Wednesday, 2019-09-18

*** armstrongs has joined #zuul00:04
*** armstrongs has quit IRC00:13
*** jamesmcarthur has joined #zuul00:42
*** jamesmcarthur has quit IRC01:07
*** kerby has quit IRC01:55
openstackgerritMohammed Naser proposed zuul/zuul-operator master: Create zookeeper operator  https://review.opendev.org/67645802:08
*** jamesmcarthur has joined #zuul02:22
*** bhavikdbavishi has joined #zuul02:33
*** roman_g has quit IRC02:35
*** jamesmcarthur has quit IRC02:38
*** bhavikdbavishi1 has joined #zuul02:40
*** bhavikdbavishi has quit IRC02:42
*** bhavikdbavishi1 is now known as bhavikdbavishi02:42
*** jamesmcarthur has joined #zuul02:47
*** noorul has joined #zuul03:12
noorulhi03:12
noorulI am seeing the following error03:12
noorulSending result: {"result": "DISK_FULL", "warnings": [], "data": {}}03:12
noorulBut plenty of disk space is available on the node where zuul is running and on the slave nodes03:13
*** noorul has quit IRC03:18
*** jamesmcarthur has quit IRC03:24
*** jamesmcarthur has joined #zuul03:36
*** noorul has joined #zuul03:41
noorulI see this warning too03:45
noorulWARNING zuul.ExecutorDiskAccountant: /var/lib/zuul/builds/e27b42f0eff140bcb66b0ccb050e5b81 is using 280MB (limit=250)03:45
noorulIs this related?03:45
clarkbyes, there is a per-job disk limit and you are hitting that. There is a config option to make the limit larger03:47
clarkbit is there to avoid jobs filling disks on your executors and log servers03:47
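(For reference: the limit clarkb describes is the disk_limit_per_job option in the [executor] section of zuul.conf; the value is in megabytes and 250 is the default.)

    [executor]
    disk_limit_per_job = 1024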
*** jamesmcarthur has quit IRC03:50
*** noorul has quit IRC03:59
openstackgerritIan Wienand proposed zuul/zuul master: zuul_console: fix python 3 support  https://review.opendev.org/68255603:59
openstackgerritIan Wienand proposed zuul/zuul master: Support nodes setting 'auto' python-path  https://review.opendev.org/68227503:59
*** bolg has joined #zuul03:59
mnaserhttps://review.opendev.org/#/c/676458/704:40
mnaserWhat's the best way to consume the image built by the dependant job?04:41
mnaser(and should we just build the image during the test run rather than out in a job before it?)04:41
*** pcaruana has joined #zuul04:46
*** noorul has joined #zuul04:48
mnaserI wonder why the image is not being found05:02
noorulOn my slave node I am trying to use docker to run unit tests. What is the normal practice for this? Should we create a user in the docker image, or should root be used? When I have a user, the mount is not working05:31
openstackgerritIan Wienand proposed zuul/nodepool master: Set default python-path to "auto"  https://review.opendev.org/68279705:57
*** avass has joined #zuul06:13
*** igordc has joined #zuul06:17
*** igordc has quit IRC06:19
*** avass has quit IRC06:32
*** shachar has quit IRC06:35
*** snapiri has joined #zuul06:35
*** bolg has quit IRC06:50
*** roman_g has joined #zuul07:00
flaper87tobiash: how do you do the restart of the scheduler pod when you add a new project to the tenant? I've a configmap with the tenant config that is mounted in the scheduler's POD. I'd like to have a (safe?) way to signal the scheduler POD when the configmap changes07:04
flaper87I guess I could have a custom startup script07:04
flaper87but I'd rather use a more k8s method if there's one07:04
*** themroc has joined #zuul07:07
*** hashar has joined #zuul07:13
*** noorul has quit IRC07:17
*** tosky has joined #zuul07:18
mordredmnaser: the zuul images jobs (with the opendev-buildset-registry job) should make the image stuff just work - as well as image promotion so that the image built in the gate is what gets published. it's a whole thing, but it works really well and works well with depends-on (in your copious free time we should make sure your zuul(s) are set up so that people can do speculative image jobs, because they're07:19
mordredmindblowingly awesome, but they do take work from the zuul operator)07:19
mordredmnaser: if you're having issues with images not showing up, let's for sure figure that out07:19
*** bolg has joined #zuul07:23
*** sshnaidm|pto is now known as sshnaidm|rover07:24
tobiashflaper87: you can reload the scheduler without restart07:24
flaper87tobiash: yeah, but that requires getting into the pod and reloading the scheduler. was hoping to have that done automagically when the configmap is updated07:25
tobiashflaper87: in that case you could run a helper script using inotify to automate that07:26
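(A minimal sketch of such a helper, assuming inotify-tools is available in the pod and the tenant config is mounted at /etc/zuul/tenant — both names are illustrative. Kubernetes updates a mounted configmap by atomically swapping a symlink, which surfaces as a moved_to event on the directory.)

    #!/bin/sh
    # Watch the mounted configmap and trigger a full reconfiguration
    # whenever kubernetes swaps in new content.
    while inotifywait -e create,moved_to,delete_self /etc/zuul/tenant; do
        zuul-scheduler full-reconfigure
    done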
mordredflaper87: I think you're yearning for the distributed scheduler work to be completed07:27
flaper87coolio, that's what I was going to do but I wanted to know if there was a more native way to do it.07:27
*** gtema_ has joined #zuul07:27
flaper87mordred: oh, there's an ongoing work to make the scheduler distributed?07:27
flaper87I'd very much love that07:27
*** mmedvede has quit IRC07:31
mordredflaper87: yeah. https://review.opendev.org/#/c/621479/ is the spec07:32
*** arxcruz has quit IRC07:32
*** mmedvede has joined #zuul07:34
*** arxcruz has joined #zuul07:36
*** jpena|off is now known as jpena07:41
*** gtema_ has quit IRC07:45
*** gtema_ has joined #zuul07:45
*** AJaeger has quit IRC08:14
*** recheck has quit IRC08:20
*** noorul has joined #zuul08:22
noorulWhen the job log crossed the size limit, it failed with a DISK_FULL error message08:23
noorultestjob finger://aruba-virtual-machine/e27b42f0eff140bcb66b0ccb050e5b81 : DISK_FULL in 0s08:23
noorulI am not sure what the use of the finger URL is08:23
*** recheck has joined #zuul08:24
*** AJaeger has joined #zuul08:27
openstackgerritMerged zuul/zuul master: Add no-jobs reporter action  https://review.opendev.org/68127808:36
openstackgerritMerged zuul/zuul master: Add report time to item model  https://review.opendev.org/68132308:50
*** pcaruana has quit IRC08:57
*** pcaruana has joined #zuul09:01
mgoddardFYI, had a weird issue in the persistent-firewall role: https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_7dd/681446/7/gate/ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode/7dd6514/job-output.txt09:06
openstackgerritMerged zuul/zuul master: Add Item.formatStatusUrl  https://review.opendev.org/68132409:11
*** panda|ruck|off is now known as panda|ruck09:37
*** hashar has quit IRC09:41
*** gtema_ has quit IRC09:46
*** gtema_ has joined #zuul09:46
*** noorul has quit IRC09:49
*** bhavikdbavishi has quit IRC09:52
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure - fix wrong commit gitweb url  https://review.opendev.org/67994610:02
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure - handle initial comment change event  https://review.opendev.org/68031010:02
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure - handle Pull Request tags (labels) metadata  https://review.opendev.org/68105010:02
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure - reference pipelines add open: True requirement  https://review.opendev.org/68125210:02
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure - handles pull-request.closed event  https://review.opendev.org/68127910:02
*** openstackgerrit has quit IRC10:06
*** zbr has quit IRC10:11
*** zbr has joined #zuul10:12
*** noorul has joined #zuul10:21
fbocorvus: tristanC mordred regarding Pagure, I've rebased the patch chain and removed the git.tag.creation patch. The chain should be ok to merge. For the tag.creation patch I'll discuss with pagure folks.10:23
fbohttps://review.opendev.org/#/q/topic:pagure-driver-update10:24
*** noorul has quit IRC10:26
mordredfbo: stack looks great to me10:29
*** openstackgerrit has joined #zuul10:30
openstackgerritFabien Boucher proposed zuul/zuul master: Pagure - add support for git.tag.creation event  https://review.opendev.org/67993810:30
fbomordred: thanks for the reviews!10:31
mordredfbo: thanks for working on that - it's exciting to see the pagure support and that relationship moving forward10:31
fbomordred: thanks, yes :) and we are working on a set of default jobs for Fedora distgit here an example https://src.fedoraproject.org/rpms/python-gear/pull-request/8#comment-30592. I hope packagers will like it.10:41
*** brendangalloway has joined #zuul10:46
*** noorul has joined #zuul10:51
*** AJaeger has quit IRC10:55
openstackgerritMerged zuul/zuul master: Pagure - fix wrong commit gitweb url  https://review.opendev.org/67994610:57
*** AJaeger has joined #zuul11:01
*** noorul has quit IRC11:04
*** noorul has joined #zuul11:05
*** pcaruana has quit IRC11:19
*** pcaruana has joined #zuul11:28
*** jpena is now known as jpena|lunch11:35
*** bhavikdbavishi has joined #zuul11:54
*** bhavikdbavishi has quit IRC11:59
mnaserflaper87: you can run a second pod that watches the file for changes and fires the zuul full reconfigure command11:59
*** bhavikdbavishi has joined #zuul12:05
*** jamesmcarthur has joined #zuul12:09
*** jamesmcarthur has quit IRC12:16
*** jamesmcarthur has joined #zuul12:22
*** rlandy has joined #zuul12:25
*** jamesmcarthur has quit IRC12:31
*** jpena|lunch is now known as jpena12:31
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Add scheduler config options for hold expiration  https://review.opendev.org/68267512:42
*** themroc has quit IRC12:42
*** themroc has joined #zuul12:42
ShrewstristanC: if that ^^ looks good, i'll toss the rest of the stack on top of that again. Hopefully we can get this all merged today  :/12:43
Shrewsalso corvus12:44
*** avass has joined #zuul12:47
*** fdegir has quit IRC12:47
*** fdegir has joined #zuul12:48
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Add scheduler config options for hold expiration  https://review.opendev.org/68267512:48
Shrewsfixed up outdated comments  ^^12:48
*** jamesmcarthur has joined #zuul12:51
*** hashar has joined #zuul12:51
tristanCShrews: i'm in a meeting now, i can have a look in a couple of hours12:56
mnaserIf I want to use the same base job across different zuuls (for logs), would it pretty much require that all deployments at least include the keys used to build the secrets on that job and I should be good to go?12:59
mnaserIt sounds like it, I haven't tried it yet but it seems like that should be enough12:59
*** Goneri has joined #zuul13:09
*** bolg has quit IRC13:12
*** hashar has quit IRC13:30
*** bhavikdbavishi has quit IRC13:30
tristanCShrews: left a comment, it lgtm, though if you don't mind i'll wait for the rebase to test the stack again13:41
tristanCfungi: re bwrap: containers usually set up cgroups, seccomp and selinux, which we don't currently get from bwrap in zuul13:53
tristanCseccomp in particular may be important, though it seems like bwrap can set up the filters, so we could get that for zuul. then zuul would mostly be missing selinux context by using bwrap instead of full containers14:00
*** swest has quit IRC14:01
*** swest has joined #zuul14:02
*** swest has quit IRC14:03
corvusmnaser: yes; as long as the keys are there, zuul won't generate them14:04
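(A rough sketch of seeding a second deployment with the same keys, assuming the default layout in which the scheduler keeps its per-project decryption keys under its state dir — the /var/lib/zuul/keys path here is an assumption; adjust to your state_dir.)

    # Copy the scheduler's project keys to another deployment so both
    # can decrypt the same job secrets.
    rsync -a /var/lib/zuul/keys/ other-scheduler:/var/lib/zuul/keys/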
corvusnoorul: the finger url is just the only url zuul had for the job at the time.  it does contain the build uuid so you can check that in logs.14:05
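(While the build is running, the finger URL can be consumed with a standard finger client, or with any raw TCP client if the gateway listens on a non-default port — the 7900 below is an assumption.)

    # Standard finger client, assuming the gateway is on port 79:
    finger e27b42f0eff140bcb66b0ccb050e5b81@aruba-virtual-machine
    # Or stream from a non-default port with netcat:
    echo e27b42f0eff140bcb66b0ccb050e5b81 | nc aruba-virtual-machine 7900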
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Add scheduler config options for hold expiration  https://review.opendev.org/68267514:07
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Mark nodes as USED when deleting autohold  https://review.opendev.org/66406014:12
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Auto-delete expired autohold requests  https://review.opendev.org/66376214:12
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Add autohold delete/info commands to web API  https://review.opendev.org/67905714:12
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Remove outdated TODO  https://review.opendev.org/68242114:12
ShrewstristanC: that should be the rest of the stack ^^14:12
Shrewsthere was a test fix in 67905714:13
corvusclarkb: https://review.opendev.org/680778 could use re+2 from you14:13
pabelangermordred: where is a good place to talk about how pbr does versioning of python things from git commits? Basically, I'd like to see how to add that support into galaxy cli command now for generating tarballs used by collections.  So today, on galaxy side, it is very static.14:13
fungitristanC: do other container solutions actually apply selinux on platforms which don't ship with selinux enabled (for example, ubuntu)?14:14
pabelangerso we can get a good story / versioning around speculative galaxy collections14:14
fungipabelanger: there is a thorough spec in the pbr docs14:15
pabelangerThanks!14:15
fungipabelanger: or are you talking about the actual implementation at the python level, not the versioning rules?14:15
pabelangerfungi: yah, I wanted to see how to write the logic in python, or maybe extract it from pbr so galaxy can use it14:16
pabelangeror some sort of CLI command, where I can run and generate a proper version number14:16
fungii think pbr still relies on pkg_resources to obtain the metadata from disk, but could in theory be switched over to the newer packaging library14:16
*** noorul has quit IRC14:17
pabelangerright now, everything generated is using the same version number, which isn't ideal.14:17
*** avass has quit IRC14:18
fungioh, pbr does have some functions for things like outputting equivalent rpm and deb version strings, so you just want something similar for its normal pep-440 version strings14:18
tristanCfungi: i can't tell for ubuntu, but podman does set up a unique context label per container to restrict container access to the host and to other containers (when selinux is activated)14:19
fungiubuntu ships by default with apparmor, so maybe that gets used in similar ways by some containerization solutions14:20
pabelangerfungi: yes. Basically, take git repo. Run command, output something like 3.9.1.dev183 a2018c5 (taken from zuul.o.o UI). I can then generate proper 'Manifest' file for galaxy or bonus points if galaxy did it directly14:20
*** openstackgerrit has quit IRC14:21
fungiseems like adding an additional pbr subcommand like `pbr version [--deb, --rpm, --sdist, --wheel]` might do the trick14:26
pabelangerokay cool, that should give me starting place14:26
fungier, i guess it would be `pbr version [--deb, --rpm, --sdist, --wheel] <package>`14:26
fungiso work like `pbr info <package>` does now14:27
fungithere's also a `pbr sha <package>`14:27
fungii expect most of the plumbing is there, and i know the functions you need to generate equivalent version strings for multiple package formats are already in pbr14:28
fungi"The version.SemanticVersion class can be used to query versions of a package and present it in various forms - debian_version(), release_string(), rpm_string(), version_string(), or version_tuple()." https://docs.openstack.org/pbr/latest/user/features.html#version14:30
pabelangerfungi: yah, I think the difference will be that there is no package, as these repos don't use setuptools. So, I need to understand how pbr reads that info from git to generate the version number14:30
fungithe implementation would likely be no more than a few lines of python14:30
fungioh14:30
fungiif there's no python packaging involved, then yeah you may want a separate tool14:31
fungipbr is fairly tied to python packaging metadata14:31
pabelangerack14:31
fungithe bits of pbr which deal with finding the highest tag in your history, obtaining the commit id and so on are in https://opendev.org/openstack/pbr/src/branch/master/pbr/git.py14:33
fungithe actual version string format is likely also not too relevant outside of python packaging contexts, since it's specific to how pip and similar tools expect to order things like prerelease, postrelease and development identifiers in version strings14:34
fungithat's why pbr also has functions to translate those to deb and rpm version formats... different platforms have different rules for how to encode such details and how they sort relative to each other14:35
*** openstackgerrit has joined #zuul14:36
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: DNM: test prepare-workspace-git base-test  https://review.opendev.org/68291214:36
brendangallowaypabelanger: tried your connection reset suggestion from yesterday, but it fails due to what looks like ansible 2.5 compatibility issues.  Do you know if that will work if we upgrade from SF 3.2 to 3.3?14:56
pabelangerbrendangalloway: don't know, but which version of zuul are you using? We have multi-ansible support14:56
brendangallowaywe also noticed that the static node returns to the 'ready' state in nodepool when it comes back, even though zuul still considers it in use.  Not sure if that's related though14:56
pabelangerso you could use job.ansible-version to ask for a newer version14:57
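(That knob is the ansible-version job attribute; a minimal sketch:)

    - job:
        name: my-job
        ansible-version: 2.8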
brendangallowayWe're on Software Factory 3.2, not sure exactly which zuul version that is.  Default ansible is 2.514:57
pabelangerbrendangalloway: look on your zuul dashboard UI14:58
pabelangerbottom left, should be version info14:58
brendangallowaypabelanger:  3.6.1-1.el714:58
pabelanger3.7.0 was when we added multiple ansible: https://zuul-ci.org/docs/zuul/releasenotes.html#relnotes-3-7-014:58
pabelanger:(14:58
pabelangerbrendangalloway: yah, sounds like you might need to upgrade14:59
brendangallowayAah - the Multiple Ansible Versions doc made it sound like the feature was still work in progress, so we haven't tried to use it yet14:59
openstackgerritMerged zuul/zuul master: Pagure - handle initial comment change event  https://review.opendev.org/68031014:59
pabelangerbrendangalloway: nope! it's in production, works great14:59
brendangallowaypabelanger: https://zuul-ci.org/docs/zuul/developer/specs/multiple-ansible-versions.html maybe needs an update then?  It's the first thing that came up searching for 'zuul change ansible version'15:01
pabelanger+115:01
brendangallowaybut yes, looks like next step is upgrading and seeing if that fixes things15:01
pabelangeryah, we should consider updating it15:01
fungithat's also somewhat of a risk with documenting our design specs in the documentation tree... can make it seem like those features aren't implemented yet if you go looking for them and end up at the design spec15:03
fungithe pink warning at the top of the page is intended to convey that, but maybe it gives the opposite impression15:04
corvusoh, if that's implemented, it should be removed15:04
tristanCbrendangalloway: zuul multi-ansible support is added to SF-3.3, you would need to follow this procedure: https://www.softwarefactory-project.io/docs/3.3/operator/upgrade.html15:05
fungicorvus: the spec should be removed, or just the admonition at the top of it?15:05
corvusfungi: the spec i think15:05
fungimakes sense15:05
fungiit's always available in git history after all15:05
tristanCbrendangalloway: though we are fixing minor issues regarding the recent centos-7.7 update, feel free to ask on #softwarefactory if you have any issue with that15:05
corvuswe should make sure that any documentation value it has is covered elsewhere, then remove it.  (otherwise, we end up with documentation by spec which is less user friendly)15:06
openstackgerritMerged zuul/zuul-website master: Update to page titles and Users  https://review.opendev.org/68045915:06
*** arxcruz is now known as arxcruz|ruck15:08
*** mattw4 has joined #zuul15:09
*** themroc has quit IRC15:11
openstackgerritMerged zuul/zuul master: zuul_console: fix python 3 support  https://review.opendev.org/68255615:17
*** michael-beaver has joined #zuul15:19
clarkbcorvus: +2 reapplied15:19
*** noorul has joined #zuul15:23
*** TxGirlGeek has joined #zuul15:26
brendangallowayI don't suppose there's any way for zuul to report how many times a certain job has been run in a certain pipeline?15:32
*** zbr has quit IRC15:36
*** zbr has joined #zuul15:37
noorulAny idea how to copy a folder from one location to another using inbuilt ansible module?15:37
clarkbnoorul: you can use the synchronize module15:37
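(A minimal sketch of copying a directory to another location on the same remote node with synchronize; delegating to the node itself makes it a local rsync there. The copy module with remote_src: true is another option. Paths are illustrative.)

    - name: Copy a folder to another location on the node
      synchronize:
        src: /path/to/source/
        dest: /path/to/destination/
      delegate_to: "{{ inventory_hostname }}"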
clarkbbrendangalloway: for all time? I believe that data is in the databse but not currently exposed15:38
*** sshnaidm|rover is now known as sshnaidm15:39
*** panda|ruck is now known as panda15:39
brendangallowayclarkb: Yeah, we've ported one of our build systems into zuul and we want to switch the old one off, but the idea of no longer having build numbers is ruffling feathers15:40
brendangallowayTrying to find an easy way to provide something equivalent15:40
clarkbin the opendev system I would point people to the statsd reporting since we expose that publicly through graphite and grafana15:42
*** jamesmcarthur has quit IRC15:45
*** jamesmcarthur has joined #zuul15:47
*** jamesmcarthur has quit IRC15:49
*** noorul has quit IRC15:50
*** jamesmcarthur has joined #zuul15:52
*** noorul has joined #zuul15:53
noorulDoes zuul support running a job in a container on a static node?15:55
pabelangernoorul: yup, you can write a job to use docker for that15:57
*** jamesmcarthur has quit IRC15:57
corvusbrendangalloway: in addition to stats reporting, you can run a build query -- eg http://zuul.opendev.org/t/zuul/builds?job_name=zuul-promote-image&pipeline=promote15:57
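(The same query is available from the REST API, which is handier for scripted counting; the tenant and filter values here are illustrative.)

    curl 'https://zuul.opendev.org/api/tenant/zuul/builds?job_name=zuul-promote-image&pipeline=promote&limit=50'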
*** TxGirlGeek has quit IRC15:58
*** jamesmcarthur has joined #zuul15:59
*** recheck has quit IRC16:00
*** recheck has joined #zuul16:00
noorulpabelanger: I think I can use https://docs.ansible.com/ansible/latest/modules/docker_container_module.html16:00
*** zbr is now known as zbr|ruck16:07
*** gtema_ has quit IRC16:11
noorulI am trying something like this16:19
noorulhttp://paste.openstack.org/show/777430/16:19
noorulBut it has a syntax error at line 1116:19
nooruldid not find expected '-' indicator while parsing a block collection16:19
noorulIs it not possible to use variables in a list?16:20
pabelangerline 11 needs to quote all16:23
pabelanger":/tmp16:23
pabelangerneeds to also be inside quotes16:24
noorulpabelanger: Thank you16:25
noorulofosos: hi16:26
noorulWhat is the procedure for using 3rd party ansible modules inside zuul ?16:34
funginoorul: what sort of third-party module, and where inside zuul?16:36
funginoorul: if you're talking about use of ansible modules on the executor, zuul shadows and filters some modules in the per-ansible trees like https://opendev.org/zuul/zuul/src/branch/master/zuul/ansible/2.816:38
pabelangerI'd say nested ansible is likely a more flexible approach16:38
pabelangerbut, with coming collections we'll need to make 3rd party modules a little easier16:39
fungiand yes, having the executor's ansible invoke an ansible on a disposable build node gets you access to use any additional modules you want without compromising the security of the executor's contained ansible environment16:39
fungiansible wasn't really designed with the idea in mind that you might want to use it to run untrusted playbooks/roles/modules, so zuul cripples the ansible it's interfacing directly with to try and help preserve the security of its executors16:41
fungiif you're considering expanding the set of modules the executor allows, they should be very carefully scrutinized for ways they could be abused to compromise the environment16:43
*** tosky has quit IRC16:49
pabelangerneat, https://github.com/theopenlab/labkeeper looks to be a fork of stuff we use to deploy zuul.a.c :)16:50
noorulI am getting this error16:50
noorulFailed to import docker or docker-py - No module named requests.exceptions. Try `pip install docker` or `pip install docker-py` (Python 2.6)16:50
noorulDoes zuul only use python 2?16:50
pabelangerno, you can use python3 but you need to configure that in nodepool16:50
pabelangerhttps://zuul-ci.org/docs/nodepool/configuration.html#attr-diskimages.python-path for example16:51
noorulpabelanger: I am using static driver16:52
pabelangerhttps://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[static].pools.nodes.python-path16:52
noorulIt also has python path option16:52
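(A minimal nodepool.yaml sketch for a static node with an explicit interpreter; the node name and label are hypothetical.)

    providers:
      - name: static-provider
        driver: static
        pools:
          - name: main
            nodes:
              - name: example-node.local
                labels:
                  - ubuntu-bionic
                python-path: /usr/bin/python3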
openstackgerritMerged zuul/zuul master: Add support for the Gerrit checks plugin  https://review.opendev.org/68077816:59
noorulpabelanger: http://paste.openstack.org/show/777430/17:06
noorullooks like the : in 10.29.12.160:7990 is creating a problem17:06
noorulI am getting following error Error creating container: 500 Server Error: Internal Server Error ("invalid mode: /tmp")17:07
*** brendangalloway has quit IRC17:07
noorulhttps://docs.ansible.com/ansible/latest/modules/docker_container_module.html17:08
fungiis 10.29.12.160:7990 your connection name?17:08
pabelangershould it be: "{{ zuul.projects['10.29.12.160:7990/ac/commonlib'].src_dir }}:/tmp"17:08
openstackgerritMerged zuul/zuul master: Update gerrit pagination test fixtures  https://review.opendev.org/68211417:08
fungiahh, it's the second : causing the issue17:08
funginot the earlier one17:09
noorulOops let me correct17:09
*** jamesmcarthur_ has joined #zuul17:09
noorulThis is what I have now http://paste.openstack.org/show/777434/17:09
noorulThat was the old one17:09
noorulAs per the syntax, the third one is mode17:10
noorulSince I have a colon in the connection name itself17:10
noorulMy connection is defined here http://paste.openstack.org/show/777435/17:11
pabelangernoorul: what does your inventory file look like17:11
*** jamesmcarthur has quit IRC17:12
clarkbhttp://zuul.openstack.org/build/7dd6514fd3b24eaea2db05b09e1d0d26/log/job-output.txt#2904 seems to be the cause of the failure mgoddard pointed out earlier today17:12
clarkbbut I'm not seeing that module failure in the console of the job17:12
pabelangernoorul: that will include zuul.projects variable17:12
clarkbit says "See stdout/stderr for the exact error"17:13
clarkbanyone know what might cause a module failure like that?17:13
clarkbit has failed_when set to false which explains why that task doesn't fail the job17:14
noorulpabelanger: ?17:14
noorulI am not able to use zuul.projects['bitbucket/ac/commonlib'].src_dir }}17:16
pabelangernoorul: no, I am asking to see your inventory file, it will list the projects the job is using17:16
pabelangerwanted to see what zuul is expecting17:16
noorulDid you mean main.yaml ?17:18
noorulhttp://paste.openstack.org/show/777437/17:18
pabelangerclarkb: http://paste.openstack.org/show/777438/17:18
pabelangerusually ARA helps to expose some of that info17:18
*** hashar has joined #zuul17:18
clarkbpabelanger: yes, the stdout problem is caused by the module failure I linked to (the task that hit the module failure registers that variable)17:18
clarkbpabelanger: the job failed because stdout wasn't set. stdout wasn't set because the module that sets it failed17:18
pabelangerwe restarted zuul right17:19
pabelangerdid we pick up new version of ansible?17:19
*** jpena is now known as jpena|off17:19
clarkbI don't think we upgrade ansible due to how pip install works17:19
clarkbthough that gives me the idea of checking the executor logs17:20
clarkbI'll go do that now17:20
noorulpabelanger: I am wondering why is it using ip address instead of the name17:20
pabelangerclarkb: Hmm, I guess something about ansible changed17:21
pabelangerhowever17:22
pabelangerhttps://opendev.org/zuul/zuul-jobs/src/branch/master/roles/persistent-firewall/tasks/main.yaml17:22
pabelangerI would consider switching that to include_tasks, and pass in iptables_rules / ip6tables_rules into the tasks17:22
pabelangerlet me check something17:22
clarkbwhat is the difference?17:22
pabelangerOh17:23
pabelangerhttp://paste.openstack.org/show/777440/17:23
pabelangerrc: -1317:23
pabelangerI've seen this before17:23
pabelangerbut, haven't figured it out17:23
pabelangerI think that is SIGPIPE?17:24
pabelangerbasically, ansible is killing the task, IIRC, which raises -1317:24
pabelangerI never figured it out17:24
fungiyes, manual for signal(7) confirms 13 is sigpipe17:25
fungioh, though this is an exit code not a signal17:25
clarkbone thing I notice looking at the executor logs is that those tasks ran twice17:25
clarkbthe first time it runs it does so successfully17:25
pabelangerclarkb: I can see it run twice, but on different nodes17:26
pabelangerunless I missed it17:26
pabelangerfungi: yah, I think the - in -13 was a signal17:26
clarkbpabelanger: its twice per node for a total of 4 times17:26
pabelangerat least if I remember talking to core17:26
clarkbI think because the multinode bridge updates the firewall rules then repersists them after the initial pass17:26
clarkbthis happens 2 minutes and 15 seconds apart ish17:27
clarkblikely not a race in that case17:27
clarkbbut that also means the command is present on the host and successfully ran at least once17:27
noorulpabelanger: Any work around?17:28
pabelangerclarkb: yah, something looks odd in that playbook, we call persistent-firewall multiple times17:28
pabelangeroh17:29
clarkbpabelanger: I don't think its wrong, we call it after making changes to the rules17:29
clarkband we make changes multiple times if doing multinode bridge setup17:29
pabelangerthis is test playbook?17:29
clarkbno17:29
clarkbunfortunately the ironic job doesn't seem to successfully collect syslog :/17:29
pabelangerk, I haven't really looked to much into multi-node playbooks17:31
pabelangerbut, I think we are hitting ansible issue17:31
pabelangerwould be cool if we can reproduce17:31
clarkbit wouldn't surprise me if including the same task file multiple times is buggy in ansible17:31
clarkbwe've found that nested includes can cause weird errors17:31
pabelangeryah, in fact, I think I'm including the same role multiple times too17:32
pabelangerso, maybe you are on to something17:32
clarkbmight also be a problem with the register task. Like rerunning the same command task with a register breaks17:33
pabelangeryah, I think passing vars into tasks via include_tasks might be something to try: https://docs.ansible.com/ansible/latest/user_guide/playbooks_reuse_includes.html#including-and-importing-task-files17:35
openstackgerritMerged zuul/zuul master: Support HTTP-only Gerrit  https://review.opendev.org/68193617:35
*** zbr|ruck is now known as zbr17:36
SpamapSWatching a Circle CI demo.. we could totally steal a thing from them with their restore_cache/save_cache keywords.17:38
SpamapSSimpler than building AMI's or docker images .. just have a way to save dirs and restore dirs.17:39
fungisounds like pbuilder in debian17:39
SpamapSyeah exactly17:40
fungitar up the tree and archive it, then untar it for subsequent builds17:40
SpamapSthey just use a key to invalidate.. so you can hash things like your requirements.txt or something else.17:40
clarkbrequirements.txt won't work because it is too loose. constraints might17:41
SpamapSPipfile.lock would work17:41
clarkbalso worth noting (and this too is python specific) venvs are not portable across even minor python version updates17:42
clarkbyou'd also need to invalidate if your base image changed17:42
SpamapSYeah, that seems pretty simple to do.17:42
fungiit goes deeper still17:43
SpamapSI think you are optimizing for rare events.17:43
noorulpabelanger: I shared main.yaml http://paste.openstack.org/show/777437/17:43
SpamapSbut yes, of course, invalidation will be needed.17:44
SpamapSalso I didn't suggest venvs17:44
SpamapSwheel cache for instance17:44
fungiif your dependencies include things which build c extensions from sdist and don't provide wheels, then it won't be the same if the underlying libs/headers change17:44
fungioh, but yeah, a wheel cache, sure17:44
SpamapSyou are all thinking in terms of *BIG* infrastructure like opendev. Small shops won't want to maintain all the caches and such17:44
mordredyeah. there's definitely ways in which such a technique could be really nice - and I think we have the general tools and plumbing to be able to do such a thing17:44
fungigranted, we basically already build a wheel cache in our ci system and publish it from local hosts in each region17:44
SpamapSRight I'm looking at what a team did that migrated away from Zuul17:45
mordredand probably wiring up a similar mechanism using the plumbing we've got so that people could easily do such a thing would be really helpful to a range of folks17:45
pabelangernoorul: so, each job will produce an inventory file, which should be collected via logs. For example: https://zuul.opendev.org/t/zuul/build/adf44f2115344c1bac7733f1ef22983c/log/zuul-info/inventory.yaml17:45
SpamapSBecause we didn't have time to build our own wheel cache.17:45
fungibut i agree, not having to maintain a separate system for wheel caches would be nice, even for large deployments like opendev, absolutely17:45
pabelangerin it you can confirm the syntax, but based on your yaml file, you likely need to use bitbucket/ac/commonlib17:46
fungiSpamapS: the other concern which arises is trusted vs untrusted builds and cache poisoning. this is already a problem for systems like distcc17:46
fungier, well, ccache17:46
SpamapSonly do it on gate17:46
*** hashar has quit IRC17:46
fungiand scope it independently along project lines i suppose17:47
mordredyou could even use promote to push an updated version of the cache17:47
noorulpabelanger: I have that file17:47
mordredso that check and gate pull the cache, do $whatever to "update" it (which would frequently be noop)17:47
fungibecause project a doesn't necessarily trust caches created by changes approved by project b's maintainers17:47
mordredthen promote publishes an updated cache17:47
noorulpabelanger: It has project names that I can't put in public17:47
pabelangernoorul: that is fine, if you look at zuul.projects, it will be a list of projects you can use17:48
* mordred waves hands a bit - but it could be really cool17:48
noorulIt has this key17:48
noorul10.29.12.160:7990/ac/commonlib:17:48
noorulI am wondering why stash driver is using 10.29.12.160:7990 as prefix17:49
openstackgerritClark Boylan proposed zuul/zuul-jobs master: DO NOT MERGE test cleanup phase playbook  https://review.opendev.org/68017817:49
noorulIs it stash driver or Zuul ?17:49
pabelangerthat comes from your tenant config, IIRC17:50
pabelangerbut, you had bitbucket for connection17:50
noorulYes17:50
pabelangeryou can try setting canonical_hostname in your connection17:52
pabelangerto the hostname of your service17:52
fungimakes me wonder if the bitbucket driver could be passing the connection address/port where it's supposed to be passing the connection name17:52
noorulAny idea which interface method is used?17:53
pabelangerSpamapS: curious who moved away from zuul, is that public info?17:54
pabelangerasterisk project used to run zuul at one point, but moved to jenkins. They never did get to a point of gating asterisk, just using zuul with gerrit to test, then merging manually17:55
*** sshnaidm is now known as sshnaidm|bbl17:57
SpamapSpabelanger: It's a team here at Good Money. Experiment that so far is going well. They just didn't ever embrace Zuul/Ansible and CircleCI has a lot of bells and whistles for smaller teams/apps.17:57
pabelangerack17:58
fungii agree looking at what bells and whistles people find useful is a good opportunity to identify possible missing features17:58
mordred++17:58
mordredit's always good to learn about what things make people like other things17:58
openstackgerritMerged zuul/zuul master: Add autogenerated tag to Gerrit reviews  https://review.opendev.org/68247317:59
pabelangerSpamapS: I do agree that setting up cache stuff is time consuming. Still need to do that for zuul.a.c. Starting now to get to the point where regional mirrors of things will be helpful17:59
SpamapSI mean, to put it in context.. we were too slow, taking 17 minutes to deploy.18:01
fungiit's also possible to cache a lot more stuff on node images18:03
fungiopendev used to do a bunch more of that than it does now, because the more you cache the easier it is for your caching to hide problems with dependency management18:04
SpamapSWe did that in response, but that took us, frankly, a month.18:04
SpamapSBecause nodepool+aws has no builder.18:04
SpamapSSo we had to create a packer build, get it working, move all the pre steps into it, and then make a zuul job that packer builds and uploads an AMI.18:05
pabelangeryah, we started to cache ansible/ansible on images, which did speed up job runs a bit18:05
SpamapSBy that time they'd already moved to Circle and ripping that team's stuff out of our monolithic build took it from 17 minutes to 7, just in time for some new microservices to add 3 more minutes. ;)18:05
SpamapSBut the AMI having everything installed won us back those 3 minutes.18:06
pabelangerI think I'd like to add a docker insecure registry, but also want to back it with swift... so in a holding pattern until we figure out zuul-registry18:06
SpamapS(In reality, it happened a little differently in terms of ordering, but we're at 7 minutes for our zuul build as of today, and that team that went to CircleCI is at 4 minutes. ;)18:07
pabelangeryah, that is some of the concern I hear from awx team, and using static nodes and GCE. They get faster builds18:08
pabelangerbut, I also don't think they are rebuilding (base) images each time18:09
SpamapSI think at scale, building AMI's with caches and having local mirrors .. all "the right thing". But making it simpler for a single build to solve its own problems may actually be the more important capability.18:09
SpamapS(AMI's, or disk images, or whatever)18:10
mordredSpamapS: the thing with a cache save/restore thing is that it doesn't just have to be used for caches18:10
clarkbpabelanger: static nodes also have the potential for going sideways18:12
mordredand it's a feature we already have in zuul - its just not exposed in a single save/restore pair like in circle18:12
* clarkb lived that life for a long time does not want to go back18:12
SpamapSmordred: is that the provides thing?18:12
pabelangerclarkb: yah, agree. I totally think it is an education issue too.  But so far, they seem happy to use that, regardless of the potential issues18:12
fungiyeah, glad to no longer spend a good chunk of my day checking the health of all our static ci nodes and rebooting them18:12
SpamapSI've never really looked at it.18:12
mordredSpamapS: which is me saying - yeah - empowering people to be able to use the underlying power to solve their spefific problems without them knowing how all of the pieces work is super helpful18:12
mordredSpamapS: exactly18:13
mordred:)18:13
SpamapSBut isn't that between pipelines?18:13
pabelangerclarkb: I really don't want to have jobs be blocked when infra is down, that is the power of zuul to me18:13
mordredyeah - provides is - that's how a parent job could update a 'cache' if needed and have a child job pick up the updates. it's the other side of the coin to "I need one of these, it needs to be updated sometimes, and I need it as an input to my job"18:14
fungipabelanger: general leaks over time cause static nodes to start systemically failing jobs because they run out of disk/ram/whatever, but if you're running builds which require root privileges then it gets waaay worse since a build can absolutely shred the node or, worse, backdoor it and then steal things like credentials for subsequent builds18:14
mordredthe image build jobs are, I think, the most complete implementation of the pattern overall - but they're docker specific. the usage pattern though is essentially what one should have from a caching system in a world where depends-on and friends exist18:15
SpamapSthe general flow of   things = restore_cache("key") or build_things("key")...save_cache("key") is the one that I like.18:15
pabelangerfungi: agree! Maybe at ansiblefest you can help share that info :D18:15
fungihappy to!18:15
SpamapSLike in this case, we have a node_modules dir that only needs to be maintained when the yarn.lock changes.18:16
mordredyah. totally18:16
SpamapSso, Circle has a really nice way to just say "restore node_modules from cache if you can" and then "save node_modules to cache"18:16
SpamapSI'd like to steal that for Zuul. :)18:16
mordredjust saying - that's the same effective pattern - so the tools are there to put together the thing you're talking about18:16
SpamapSAnd one problem we have there, is that Zuul doesn't have a real abstraction for object storage/filesystem-caching/etc.18:17
fungii suppose we just needs some more specific roles/workflows around that use case18:17
SpamapSIt goes back to my idea that we don't need "nodepool" we need "thingpool"18:17
SpamapSAnyway, just a thought.18:18
SpamapSI also don't know how much I've missed in terms of roles that could make our builds simpler.18:19
openstackgerritMerged zuul/zuul master: Use robot_comments in Gerrit  https://review.opendev.org/68248718:19
SpamapSThere may be lots of roles out there that make Zuul life as simple as Circle life.18:20
noorulAfter adding canonical_hostname everything stopped working :(18:22
mordredSpamapS: yeah - I think that was kind of what I was getting at - I think the usage pattern you're talking about, while not implemented in exactly that way, is where several things either are headed or are already there, on a system-by-system basis - so I think broadly speaking that type of pattern is one that has a decent amount of work already done for it and I think it would mostly be some glue18:22
mordredSpamapS: so like - a, "yeah, I agree, that's a great interface to provide" and "I don't think we're too far away from being able to do so"18:23
SpamapSCool. Just.. it needs to be really obvious and simple to consume. Right now I think Zuul is in that stage where it can do anything, but .. it's not super approachable for the few things that should be simple and easy.18:24
mordredwith a side helping of "the specific use case example is a good one to have in our heads as we continue to work on providing good sugar"18:24
SpamapSSome things are, though. Like, if you embrace tox.ini, and all you want is CI on that.. ++ .. zuul crushes that.18:24
mordredSpamapS: yeah - I think the power in sharable job definitions is that for anythign people have written good jobs for, things become trivial and uber powerful18:25
mordredbut we definitely need to pick up more reusable job content of the same depth as the tox support18:25
pabelangerI wish all jobs were like tox jobs :) That would be huge, but also takes a lot of work.18:26
mordredyup18:26
SpamapSAnother thing that Circle does more easily is secrets.18:26
mordredbecause it also takes people who understand the ecosystem of the tools being supported. like - what are the right actions to take on behalf of someone when they are doing yarn/npm-based things18:26
clarkbmordred: you are asking the wrong person about that. I still have to undelete or is it delete? the "I'm just here to hangout" file in zuuls js builds18:27
clarkbseems like I get that wrong every time I push a new js related change18:27
SpamapSThey just have these things called "Contexts" and that's a group of envvars and other settings. Each CI step just calls out a context. That is totally doable in a Zuul shop with parent: prod-context   parent: staging-context  ... but.. it's not quite batteries-included for Zuul.18:27
mordredclarkb: well - to be fair, that's a javascript project embedded in a python project - so it's a very zuul-specific one-off18:28
clarkbSpamapS: i think at least some of that (env vars in particular) is a desire to push that onto ansible so that we aren't rewriting what ansible can do18:29
*** hashar has joined #zuul18:29
clarkbmaybe that has to change, maybe not.18:30
mordredyeah - and the inability to tell ansible "just pass these env vars to every task please kthxbai" isn't not a thing that has annoyed me before :)18:30
nooruleven after removing canonical_hostname nothing works18:33
SpamapSclarkb: I think a solid library of jobs that are one level up from base job, would do it.18:34
SpamapSmordred: yeah, that would be a nice feature to add to Ansible. Something like playbook-level generics.18:34
*** openstackgerrit has quit IRC18:37
SpamapSlike "- playbooks_defaults: { become: true, environment: { FOO: bar } }"18:42
*** mattw4 has quit IRC18:44
noorulsomehow working after removing canonical_hostname18:46
noorulCan someone help me to find a work around to http://paste.openstack.org/show/777434/ ?18:51
noorulThe colon in 10.29.12.160:7990 is conflicting with volumes syntax18:52
clarkbnoorul: can you share error messages?19:20
dmsimardSpamapS: it sort of has something like that now -- module defaults: https://docs.ansible.com/ansible/latest/user_guide/playbooks_module_defaults.html19:29
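(A play-level example combining module_defaults, which needs Ansible >= 2.7, with the existing play-level environment and become keywords; the module and values are illustrative.)

    - hosts: all
      become: true
      environment:
        FOO: bar
      module_defaults:
        uri:
          force_basic_auth: true
      tasks:
        - name: Env vars set at the play level reach every task
          shell: echo "$FOO"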
SpamapSdmsimard: I want it a level up19:29
SpamapSthat's at the play level19:29
SpamapSI need it at the playbook level19:29
noorulclarkb: Error creating container: 500 Server Error: Internal Server Error ("invalid mode: /tmp")19:29
dmsimardSpamapS: it's at the play level19:29
dmsimardyeah19:30
dmsimardmight be workaroundable with a play that imports other playbooks, not sure19:30
clarkbnoorul: did it log the argument it passed to docker?19:30
clarkbnoorul: if so that will help us udnerstand how things are getting interpolated there19:31
SpamapSimport_playbook is at the playbook level19:31
dmsimardah, yeah, I'm mistaken19:31
noorulSimple docker run http://paste.openstack.org/show/777447/ creates another problem19:32
noorulthe input device is not a TTY19:32
noorul"volumes":[1 item19:34
noorul0:"src/10.29.12.160:7990/ac/commonlib: /tmp"19:34
noorul]19:34
mordreddmsimard: still - that's cool! TIL19:35
noorulclarkb: ^^19:35
clarkbI see so the problem is the port specification in the source side?19:36
clarkbdoes docker have a way to escape that? : is a valid file name character19:36
noorulby docker did you mean the ansible module?19:37
clarkbno docker itself or whatever is consuming that input19:37
noorulIt is the ansible module19:38
*** openstackgerrit has joined #zuul19:41
openstackgerritDavid Shrewsbury proposed zuul/nodepool master: Reduce upload threads in tests from 4 to 1  https://review.opendev.org/68297719:41
*** tosky has joined #zuul19:43
noorulhttps://github.com/moby/moby/issues/860419:44
clarkbnoorul: if I'm reading that right you need the --mount flag ?19:45
noorulclarkb: Not sure whether ansible module https://docs.ansible.com/ansible/latest/modules/docker_container_module.html supports that19:46
clarkbnoorul: ya you might have to run docker via the shell or command module instead19:46
clarkbnot sure19:46
noorulI tried that, and I get the error "the input device is not a TTY"19:48
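(That error usually means docker run was given -t or -it; there is no TTY under Ansible, so those flags have to go. A sketch using the command module with --mount, which sidesteps the colon parsing; the image name and test command are hypothetical, and ansible_user_dir is prepended because bind mounts need an absolute source path.)

    - name: Run unit tests in a container without allocating a TTY
      command: >
        docker run --rm
        --mount type=bind,source={{ ansible_user_dir }}/{{ zuul.projects['10.29.12.160:7990/ac/commonlib'].src_dir }},target=/tmp
        my-test-image make test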
noorulIs there any example in opendev which uses docker run19:48
noorul?19:48
SpamapSdmsimard: I want ot echo what mordred said. I didn't know about module_defaults, and that's a handy thing19:49
Shrewszuul-maint: it sure would be pleasant to get the autohold-revamp merged and remove this Sword of Damocles from over my head19:49
* Shrews starts singing from the Rocky Horror soundtrack19:49
mordredShrews: but the dangling sword is so purty19:49
SpamapSShrews: is that a transvestite, transexual patch, from transylvania?19:50
noorulIn the devel branch mounts is supported19:50
noorulhttps://github.com/ansible/ansible/blob/devel/lib/ansible/modules/cloud/docker/docker_container.py#L37819:50
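(A sketch of that mounts syntax, which needs an Ansible with the option — devel/2.9 at the time; the container and image names are hypothetical, and the source is made absolute since bind mounts require it.)

    - name: Run unit tests in a container
      docker_container:
        name: unit-tests
        image: my-test-image
        mounts:
          - type: bind
            source: "{{ ansible_user_dir }}/{{ zuul.projects['10.29.12.160:7990/ac/commonlib'].src_dir }}"
            target: /tmp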
ShrewsSpamapS: you know it19:50
*** mattw4 has joined #zuul19:50
*** sshnaidm|bbl is now known as sshnaidm19:52
*** Goneri has quit IRC19:52
funginow we need to plan a zuul rocky horror stage performance19:59
Shrewsi call the part of Dr. Frank-N-Furter20:00
fungiwith the right baldwig i can do a decent riff raff20:00
corvusooh, i'm shivering with antici20:07
Shrews.... ?????????20:07
corvuspation20:07
Shrews*phew*20:07
fungithe suspense was unbearable20:08
*** pcaruana has quit IRC20:10
*** jamesmcarthur_ has quit IRC20:12
*** hashar has quit IRC20:12
*** jamesmcarthur has joined #zuul20:17
*** jamesmcarthur has quit IRC20:17
*** jamesmcarthur has joined #zuul20:18
dmsimardmordred, SpamapS: I've been using it to set the openstack cloud credentials so we don't need to repeat it for every module, it's pretty neat20:26
*** noorul has quit IRC20:35
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: RFC: Generic cache implementation  https://review.opendev.org/68299220:38
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: RFC: Generic cache implementation  https://review.opendev.org/68299220:39
corvusSpamapS: I did a weird thing there -- i wrote up a sketch of how we might implement a cache system like you were describing, except that the way i originally thought of implementing it wouldn't work without some changes to how zuul handles secrets.  but i wrote it anyway and pushed it as PS1.  then i revised the plan to something that would work with what we have today, but has a different interface20:42
corvus(shifting a bit more work into base jobs) and pushed that as PS2.20:42
corvusSpamapS: when you have a minute, maybe take a look at those and let me know if you think that fits the use-case, and if so, whether PS2 is something we should work on now, or whether we should take this as design input for figuring out how to make PS1 work.20:43
corvusanyone else too of course ^20:45
*** Goneri has joined #zuul20:45
corvusShrews: your stack is +3 up through the web api change20:51
corvuser, up until it20:51
daniel2Someone gave me a link to documentation on using cloud images instead of building them with nodepool.  Can someone share that link again? I can't find it.20:59
corvusdaniel2: https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openstack].cloud-images  (or similar for other drivers)21:11
daniel2Thanks!21:11
Shrewscorvus: woooo21:18
*** Goneri has quit IRC21:22
daniel2Does nodepool builder have any .service files?  I swore I saw one before for nodepool builder but can't find it now.  For systemd21:25
clarkbdaniel2: nodepool itself doesn't ship any, but some of the various config management tools for managing nodepool have init scripts and unit files21:26
clarkbthe zuul from scratch docs may also have examples too /me looks21:26
*** Goneri has joined #zuul21:26
daniel2ah yes, found it21:27
clarkbhttps://zuul-ci.org/docs/zuul/admin/nodepool_install.html#service-file aha I was wrong21:27
clarkbhttps://opendev.org/zuul/nodepool/src/branch/master/etc/nodepool-launcher.service21:27
daniel2I could probably edit that just to builder21:28
SpamapScorvus: will take a look later today. Cool.21:28
daniel2launcher is running in docker.  I'm going to run the builder on the host.21:28
corvusi'm restarting opendev's zuul on HEAD right now; i noticed one minor issue which caused a non-fatal traceback on a gerrit connection without an http password.  i don't think it will cause ongoing problems, and i'll push a fix in a minute.21:31
*** panda has quit IRC21:41
*** panda has joined #zuul21:42
corvusand 2 tracebacks that are fatal21:44
corvusi'm restarting our scheduler at 3.10.221:44
*** nhicher has joined #zuul21:47
openstackgerritTristan Cacqueray proposed zuul/zuul master: Store a list of held node per held build in hold request  https://review.opendev.org/68246621:47
*** michael-beaver has quit IRC21:49
*** rlandy is now known as rlandy|bbl22:04
*** jamesmcarthur has quit IRC22:06
*** jamesmcarthur has joined #zuul22:09
mnaseri know "reviews welcome" but i've been struggling at searching zuul's codebase with gitea22:27
mnaserit almost always returns 0 results22:27
SpamapSisn't that what ripgrep i for? ;-)22:27
mnaseraha22:28
mnaserand on that note22:28
mnaseri was trying to check which services are stateful vs stateless and noticed the executor actually has a state_dir config option22:28
mnaserlittle research later shows we define it as `executor_state_root` but never reference it.. ever22:28
mnaserbesides just creating that directory22:28
mnaserwait, grr, that's in the test code only22:29
mnaserok so it looks like one reference to determine disk available on the executor22:30
mnaserand storing ansible's22:30
mnasermaybe one day someone will read these chat logs and save themselves the source code reading :)22:30
*** jamesmcarthur has quit IRC22:33
*** rfolco has quit IRC22:43
clarkbSpamapS: more specifically it is why we haven't deleted codesearch22:47
*** jamesmcarthur has joined #zuul22:50
openstackgerritJames E. Blair proposed zuul/zuul master: WIP: Fix gerrit errors from production  https://review.opendev.org/68300622:53
openstackgerritJames E. Blair proposed zuul/zuul master: DNM: Use http for all gerrit tests  https://review.opendev.org/68300722:53
pabelangermnaser: yah, mostly executor.state_dir needs to be stateful23:06
pabelangergit repos / ansible, etc23:06
mnaserok so statefulsets for everything except zuul-web23:06
pabelangerand fingergw23:07
*** kerby has joined #zuul23:07
pabelangernodepool-launchers are stateless, builders maybe need access to cache for dib23:08
pabelangerstarting upgrade of zuul.a.c to 3.10.223:08
*** jamesmcarthur has quit IRC23:09
mnaserbtw, i think we should get around tagging release on dockerhub23:09
mnaserthat'd be neat23:09
pabelangerHmm, maybe we just need to add job to release pipeline23:10
mnaserwell in the release pipeline we don't really have a built artifact (unlike promote i guess)23:15
pabelangerYup, we'd need to build it, then push with right version info23:16
mnaserbecause the buildset registry would be long gone by then, so it would be a fresh rebuild23:17
pabelangerother jobs do it today23:17
pabelangereg: loci23:17
mnaser(in an ideal world it would be nice if it wasn't a fresh rebuild, and we just retag the one that's already uploaded)23:17
mnaserand that we know is tested23:17
mnaserbecause technically i guess the build that happens inside release isn't _exactly_ what we tested23:17
pabelangerI mean, could do that too. But I don't think we keep more than head in docker hub23:18
pabelangerdo we?23:18
mnaserno only head is in dockerhub23:18
mnaseractually the intermediate registry should always be running, so i'm wrong on that, i think it should be trivial, the trick is just figuring out the zuul version name in the job to tag23:19
pabelangerI think we purge that each night23:20
mnaserah that breaks my plan in that case23:20
mnasercause release can come many days after no changes23:21
pabelangerbut yah, we could keep them around some how23:21
pabelangerthen when we tag, also fetch original bits23:21
pabelangerbut, given for pypi we do a rebuild, docker should work the same23:21
pabelangerjust pulls in way more things23:21
mnaserwell since we already upload to dockerhhub, we can upload/tag every commit id there23:22
mnaserand then it's a matter of docker pull zuul/zuul-merger@<sha-of-current-released-commit> and retagging that with the tagged version23:22
mnaserit also makes it easy for someone to point towards a very specific version of zuul based on the commit id23:23
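(A rough sketch of that retag-and-release flow; the tags are placeholders, and this assumes every tested commit was pushed to Docker Hub under a per-commit tag.)

    docker pull zuul/zuul-merger:<commit-sha>
    docker tag zuul/zuul-merger:<commit-sha> zuul/zuul-merger:3.10.2
    docker push zuul/zuul-merger:3.10.2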
pabelangeryah, we did that for pypi for a bit23:23
pabelangerI liked it, so I could fetch any wheel23:23
pabelangerrather than building it myself23:24
mnaseri can imagine those repos getting a little annoyed with us tagging everything in there :p23:24
corvuspabelanger, mnaser: what do you mean by stateful for executors?  they benefit from cache but do not require it.23:26
mnasercorvus: the idea that /var/lib/zuul (aka executor.state_dir) needs to be stable23:27
corvusi mean, that if you shut one down or move it or whatever, if it has a git repo cache, great, it'll be faster.  but if it doesn't, it can totally rebuild it from scratch.23:27
pabelangerZuul version: 3.10.2 !23:27
mnaserright ok, so the state is a nice to have cache, but it is not necessary23:27
pabelangeryah, I was assuming mnaser was looking at doing a rolling upgrade in k8s. Agree, having cache isn't needed, just faster23:28
mnaserit seems like a useful optimization to have if its possible imho23:28
corvusmnaser: yep.  let's pretend i've forgotten all the right k8s vocab.  so just in general terms -- executors should be somewhat long-running so that they benefit from the cache, but for scaling up, down, or horizontal moves for eviction, etc, they can lose the state dir and be fine.23:30
mnasercorvus: that makes perfect sense23:30
corvusstate dir for the scheduler is critical (secret decryption keys)23:30
corvusoh, and everything said about executors applies to mergers, if you run any23:31
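(In kubernetes terms that suggests executors and mergers can run as a plain deployment with a throwaway volume for the state dir, e.g. this pod-spec fragment:)

    volumes:
      - name: zuul-state
        emptyDir: {}   # cache only; safe to lose on reschedule, just slower to warm up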
pabelangerso, for some reason, that I don't fully understand, https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/log/job-output.txt seems to take some time to render the HTML23:34
pabelangerI'm unsure where the bottleneck is happening23:34
clarkbpabelanger: is it a very large file?23:34
*** jamesmcarthur has joined #zuul23:34
mnaseri think its the large file factor23:34
mnasermy browser seems to hang23:34
clarkbpabelanger: basically what happens is the js has to request the entire file then scan it with regexes. Large files will be slow because network transfer is slow as is regexing a large file23:34
pabelanger4.3mb?23:35
pabelangerif I access file directly, it loads much faster23:35
pabelangerhttps://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_24/524/64bf07cadde572bf2ae5fa19bd0201b153f12f5a/promote/windmill-config-deploy/026dbb8/job-output.txt23:35
mnaseri think its cause one does zero css/rendering/blah23:35
pabelangerclarkb: ah, I see23:35
mnaseri just profiled it with chrome23:36
pabelangercould it be server side?23:36
mnaser2s for scripting, 4s rendering23:36
clarkbpabelanger: if directly fetching it is fast then no it shouldn't be server side23:36
pabelangerk, I'll have to dig more23:37
pabelangerwas thinking of disabling job-output.txt.html, like opendev did23:37
clarkbI think mnaser just profiled why it is slow23:37
pabelangerwhat does rendering?23:37
corvuspabelanger: obviously, whatever workflow works for you, but i've basically stopped using the text console in favor of the json: https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console23:38
corvus(i reserve the log viewer for logs other than the console output)23:39
mnaserok i see a little reason why it is taking up a lot23:39
mnaserwe're building a table of N rows where N is the number of lines23:40
mnaserin this case, a table with 39.4k rows23:40
*** tosky has quit IRC23:40
mnaserso the number of nodes explodes as the browser tries to render it all23:40
pabelangerhow do I see that?23:41
mnaseri usually open the chrome developer tools23:42
mnaserand then go into "performance"23:42
mnaserhit the record button (make sure you have memory profiling enabled/checked too)23:42
mnaserand then refresh the page, wait for it to be done, stop recording23:42
pabelangerk23:42
mnaserFWIW, GitHub does the same thing.. so i don't know if there's a better way to deal with this23:42
mnaserhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py a 10K github file probably gives your computer a workout too23:43
mnaserand they even do syntax highlighting so its probably worse (if it was ~40k lines)23:43
pabelangerthat loads not too bad23:44
corvuswell, if it loads in 25% of the time it's comparable performance.23:44
pabelangertrue23:45
pabelangerokay, thanks for help23:45
corvuspabelanger: try out the console tab.  keep in mind you can deep-link to individual results.23:45
corvuseg https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console#3/0/1/localhost23:46
pabelangerwell, we have some large shell tasks, for nested ansible. Having some trouble figuring that out23:47
pabelangereg: https://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console#3/0/1/localhost23:47
pabelangersorry23:48
pabelangerhttps://dashboard.zuul.ansible.com/t/ansible/build/026dbb8307d5405d93e226286ade0650/console#3/1/6/bastion01.sjc1.vexxhost.zuul.ansible.com23:48
corvusoh yeah, it's less great for nested ansible23:48
pabelangerk23:49
pabelangermnaser: btw: we haven't seen issues with centos7.7 so far, but not doing much with it currently. Mostly a stale mirror23:52
