Monday, 2018-11-12

pabelanger	clarkb: Ack, it is pretty minimal. I'll see about proposing a patch this week to help provide a little more information. A user on Twitter is confusing contributing to zuul with having to sign the CLA for the OpenStack Foundation	00:53
*** irclogbot_3 has quit IRC		00:56
tristanC	corvus: it seems like we are not handling api status code correctly, i don't think it's related to the queue id thing	04:11
tristanC	corvus: the typo in pipeliens is odd though	04:11
openstackgerrit	Tristan Cacqueray proposed openstack-infra/zuul master: Add monitoring driver spec draft https://review.openstack.org/617220	04:40
*** chandankumar has joined #zuul		06:04
*** chandankumar is now known as chkumar\|rover		06:06
*** bjackman has joined #zuul		06:26
bjackman	I tried increasing the max-parallel-jobs field of the Nodepool static driver configuration, and I noticed that the working directory on the test node was still directly in the $HOME directory (i.e. the project under test was at ~/src/gerrit/foo) - this seems like it will be a problem when there are multiple jobs in parallel; they'll step on each other's working directory	06:31
bjackman	Do I need some extra config to set a unique workdir for each job?	06:31
*** themroc has joined #zuul		07:59
*** bjackman has quit IRC		08:04
clarkb	bjackman isn't here anymore but yes, I believe you need a different base job that is smart about where the job content goes on disk in the remote nodes	08:32
corvus	pabelanger: can you ask if the user read the readme?	08:34
corvus	pabelanger: it's two separate questions -- Is there enough information in the readme on how to contribute? vs Did the user find the information on how to contribute?	08:35
corvus	but, regardless, this is why we need to use a gerrit that is not hosted at openstack.org (whether that's review.opendev.org or review.zuul-ci.org, i'm not sure). but it's not the first time it's happened, and it's the main motivation for new top-level project hosting.	08:36
corvus	clarkb, fungi, mordred: ^ fyi	08:36
clarkb	it's unfortunate that that assumption gets made at all (but we can't control that, only make it clearer that this is openstack and that isn't, for various values of this and that)	08:37
corvus	tristanC: are you able to reproduce the issue? you may be able to by starting up zuul with no mergers or executors online	08:38
corvus	clarkb: i don't expect the assumption to be made once there are no openstack.org domains involved in zuul development	08:38
clarkb	corvus: yup	08:38
*** pcaruana has quit IRC		08:39
*** hashar has joined #zuul		08:41
*** bjackman has joined #zuul		08:44
*** jpena\|off is now known as jpena		08:51
bjackman	Ahhh I guess the answer to my previous question is to set the {{ zuul_workspace_root }} variable	09:33
bjackman	I guess there is some way to set it to the {{ zuul.build }} UUID	09:34
corvus	bjackman: yeah, the zuul_workspace_root is a convention used by jobs and roles in the zuul-jobs repo. it's not part of zuul itself (zuul doesn't do anything on remote nodes). but if you set that in your jobs, the setup-workspace role, and others, will use it. clarkb suggested after you left that you could do that in a base job.	09:37
bjackman	corvus: Got it, thanks.	09:37
bjackman	clarkb: Sorry I missed your response, I lost my network connection.	09:38
corvus	bjackman: eavesdrop.openstack.org has channel logs if you need them btw	09:47
bjackman	corvus: I actually had a look in there after I got back online and didn't see anything, maybe there was a race condition	09:47
tobiash	bjackman: I think it updates the logs every 15 minutes	09:49
clarkb	ya we could probably change it to render the html on demand or more frequently	09:49
clarkb	there is a raw log too, but I don't think we link those anywhere	09:50
clarkb	http://eavesdrop.openstack.org/irclogs/%23zuul/%23zuul.2018-11-12.log that should always be up to date	09:50
tobiash	good to know	09:51
*** chkumar\|rover is now known as chkumar\|ruck		10:51
openstackgerrit	Fabien Boucher proposed openstack-infra/zuul master: encrypt_secret: support self-signed certificates via --insecure argument https://review.openstack.org/617281	10:56
pabelanger	corvus: will try to confirm. So far I think it is like you say, users see review.o.o and think zuul is part of openstack CLA.	11:51
*** aluria has joined #zuul		12:05
*** bjackman has quit IRC		12:08
*** sshnaidm is now known as sshnaidm\|afk		12:09
*** quiquell has joined #zuul		12:25
quiquell	Hello	12:25
*** bjackman has joined #zuul		12:25
bjackman	Looks like there's no CLA to contribute to Zuul, is that right?	12:26
quiquell	I am trying to test "git" driver over a local repository	12:27
clarkb	bjackman: correct	12:27
quiquell	But zuul is not detecting the changes at my testing pipeline	12:27
quiquell	I am just using trigger.event: ref-updated	12:27
quiquell	Does it depend on the poll_delay ?	12:28
clarkb	quiquell: yes that determines how often it polls the repo. Looks like default is 2 hours?	12:29
quiquell	clarkb: changed it to just 30 seconds, creating a commit in this local repo is still not detected	12:29
quiquell	clarkb: with git show-ref I see the ref is updated	12:29
quiquell	clarkb: Do I have to use a bare repo and clone locally to add changes ?	12:30
clarkb	quiquell: you'll need to update the repo that is in your connection. it looks like it uses ls-remote to list the refs on the remote, wherever that is	12:32
clarkb	corvus may know more. I think corvus added the driver	12:34
*** jpena is now known as jpena\|lunch		12:35
quiquell	clarkb: so it looks for changes in the refspec but at remote ?	12:35
clarkb	quiquell: it lists refs in the repo listed in the connection	12:36
clarkb	and checks if there are new refs in that list, if any have been deleted, or if they point to new commits	12:37
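The polling clarkb describes can be sketched roughly as below. This is a minimal illustration, not the git driver's actual code: `parse_ls_remote` and `diff_refs` are hypothetical helper names, and the input is assumed to be the plain `<sha>\t<ref>` text that `git ls-remote` prints.

```python
def parse_ls_remote(text):
    """Turn `git ls-remote` output ("<sha>\t<ref>" per line) into {ref: sha}."""
    refs = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        refs[ref] = sha
    return refs


def diff_refs(old, new):
    """Compare two {ref: sha} snapshots taken one poll apart."""
    events = []
    for ref, sha in new.items():
        if ref not in old:
            events.append(("created", ref, None, sha))
        elif old[ref] != sha:
            events.append(("updated", ref, old[ref], sha))
    for ref, sha in old.items():
        if ref not in new:
            events.append(("deleted", ref, sha, None))
    return events


# Hypothetical snapshots: master moved, a feature branch appeared.
before = parse_ls_remote("aaa111\trefs/heads/master\n")
after = parse_ls_remote("bbb222\trefs/heads/master\nccc333\trefs/heads/feature\n")
for event in diff_refs(before, after):
    print(event)
```

Each "updated" entry corresponds to what would surface as a ref-updated event; if the repo named in the connection never changes, the two snapshots are identical and nothing fires.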
quiquell	clarkb: So if my connection points to a normal repository	12:37
quiquell	clarkb: Normal I mean a clone at my machine	12:38
quiquell	clarkb: And I just do a commit over this, it should trigger it ?	12:38
clarkb	yes I would expect so	12:38
clarkb	for example adding a commit to master should create a ref-updated event for the master branch updating	12:38
quiquell	clarkb: ack	12:41
corvus	clarkb, quiquell: yes that's how i would expect it to work. debug logs from the scheduler process may be helpful in debugging	12:42
quiquell	corvus: I think my issue is that I was trying it with docker-compose external volumes, and they don't get updated inside the running containers	12:42
quiquell	corvus: Will try with bind mount	12:43
corvus	quiquell: yep, that sounds probable	12:43
quiquell	corvus: Attaching to the images I see the pseudo-remote is not updated :-)	12:43
*** rlandy has joined #zuul		12:47
*** sshnaidm\|afk is now known as sshnaidm		12:48
*** quiquell is now known as quiquell\|brb		12:49
*** bjackman has quit IRC		12:50
*** EvilienM is now known as EmilienM		13:00
quiquell\|brb	corvus: If I don't set the 'ref' attribute, will it work, or do I have to add it ?	13:02
quiquell\|brb	corvus: Doing a git ls-remote shows differences	13:03
*** quiquell\|brb has quit IRC		13:09
*** chkumar has joined #zuul		13:10
*** chkumar is now known as chkumar\|rover		13:11
*** chkumar\|ruck has quit IRC		13:13
*** chkumar\|rover is now known as chkumar\|ruck		13:13
corvus	quiquell is not here, but if ref is omitted from the git trigger, it should trigger on all refs	13:31
*** jpena\|lunch is now known as jpena		13:31
*** dkehn has joined #zuul		13:52
*** chkumar\|ruck has quit IRC		14:22
*** hashar has quit IRC		14:37
tobiash	corvus, Shrews: do you have an idea how nodes can end up in ready & locked for a few hours?	14:42
*** frickler has quit IRC		14:43
*** frickler has joined #zuul		14:43
clarkb	tobiash: iirc there is a bug in zuul (I thought corvus fixed it though?) where if a job reserves/locks a node before using it, and then the job is removed from the pipeline by a new patchset that doesn't need the job, we leak that node	14:53
tobiash	clarkb: that sounds pretty similar to my issue	14:54
tobiash	it looks like zuul has a lock on these nodes	14:54
clarkb	let me see if I can find the patch that should've fixed it	14:54
tobiash	thx	14:54
*** bjackman has joined #zuul		14:55
clarkb	tobiash: https://review.openstack.org/#/c/605527/ that one I think	14:55
clarkb	maybe it didn't fully fix the issue	14:55
tobiash	hrm, that is for sure already in my deployment :(	14:56
tobiash	so it looks like something different	14:56
tobiash	clarkb, corvus: found the reason	15:03
tobiash	the job in question has a semaphore, but the semaphore seems to not be respected when requesting and locking resources	15:03
tobiash	that will be fun to sort this out	15:03
clarkb	tobiash: oh so it has reserved the instance and is waiting for the semaphore to allow it to proceed?	15:03
tobiash	yes	15:04
clarkb	that could cause deadlock in the system depending on ordering (if you can lock the node but not the semaphore and some other job locks the semaphore and not the node)	15:04
tobiash	and that has been going on for 3 hours with 20*16 core nodes :/	15:04
corvus	tobiash: what has the semaphore?	15:06
tobiash	we have a job x that has a semaphore with n=15	15:06
tobiash	now we have 30 of them in the pipelines	15:06
tobiash	15 are running and 15 queued according to the semaphore	15:06
tobiash	but there are already 30 nodes requested and locked which blocks other projects as we're at quota right now	15:07
corvus	tobiash: so you're not actually deadlocked, right?	15:07
tobiash	no, but wasting 1/4th of our resources atm	15:08
corvus	tobiash: gotcha. obviously a problem, just wanted to make sure i understood	15:08
corvus	tobiash, clarkb: we could have an option to wait for a semaphore before requesting nodes. do you think that would help in this case?	15:09
clarkb	corvus: is there any reason that shouldn't be the default or just how it always works?	15:09
tobiash	corvus: I even think this should not be an option	15:09
corvus	(i think it needs to be an option, because i'm sure in some cases the current behavior is preferable)	15:09
corvus	ha! we disagree :)	15:10
clarkb	I think it should be default at least, not sure if it needs to be toggleable yet	15:10
tobiash	corvus: do you have a specific case in mind where this would be preferable?	15:10
clarkb	(but maybe the behavior change is scary enough that for backward compat we don't make it default in the toggleable world)	15:10
openstackgerrit	Fabien Boucher proposed openstack-infra/zuul master: WIP - Pagure driver https://review.openstack.org/604404	15:11
corvus	i'm pretty open to changing the default. i don't have a specific use case, but often getting nodes is a bottleneck, and being able to run jobs in quick succession would be good.	15:12
corvus	actually, here's an example: in openstack, if we did afs publishing in a job, we might prefer the current behavior (the nodes aren't special, the semaphore is application-layer)	15:12
tobiash	corvus: ok, so we can make this configurable as a compromise	15:13
tobiash	corvus: would you make this togglable per semaphore or zuul config?	15:13
corvus	tobiash: semaphore i think. not global. maybe even per-job. (where the job specifies the semaphore)	15:13
tobiash	corvus: ok, so the (safe) default behavior would be to wait before request and some jobs/semaphores might override that for performance?	15:15
corvus	tobiash: sounds reasonable. this might be worth a mailing list post to get more feedback (people may use semaphores for widely variable reasons)	15:16
tobiash	ok, will write one	15:18
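The two orderings discussed above can be compared with a toy model. This is only a sketch under stated assumptions: `Job` and `schedule` are hypothetical names, a single scheduling pass is modelled, and nothing here reflects how zuul actually implements semaphores or node requests. The numbers (30 jobs, semaphore n=15) are the ones tobiash reported.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    has_semaphore: bool = False
    has_nodes: bool = False


def schedule(jobs, semaphore_max, semaphore_first):
    """Run one scheduling pass and return how many jobs hold nodes afterwards."""
    holders = 0
    for job in jobs:
        if semaphore_first:
            # Proposed ordering: only request nodes once the semaphore is held.
            if holders < semaphore_max:
                job.has_semaphore = True
                holders += 1
                job.has_nodes = True
        else:
            # Ordering described in the log: nodes are requested and locked
            # first, then the job waits for the semaphore.
            job.has_nodes = True
            if holders < semaphore_max:
                job.has_semaphore = True
                holders += 1
    return sum(job.has_nodes for job in jobs)


def make_jobs(count):
    return [Job(f"x-{i}") for i in range(count)]


print(schedule(make_jobs(30), 15, semaphore_first=False))
print(schedule(make_jobs(30), 15, semaphore_first=True))
```

With nodes requested first, all 30 jobs hold nodes even though only 15 can run; with the semaphore taken first, only the 15 runnable jobs hold nodes. corvus's AFS publishing example is the case where the first ordering may still be preferred, since the nodes there are not the scarce resource.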
*** quiquell has joined #zuul		15:22
quiquell	corvus: Going back to the git driver I have the following config	15:22
quiquell	[connection "git.openstack.org"]	15:22
quiquell	name=git.openstack.org	15:22
quiquell	driver=git	15:22
quiquell	baseurl=file:///projects/	15:22
quiquell	pool_delay=30	15:22
quiquell	And under /projects/ in the container a bind mount with all the projects	15:23
quiquell	is this ok ?	15:23
clarkb	why not point it at gerrit at that point?	15:24
clarkb	oh right the url is local logically it is git.openstack.org nevermind	15:24
corvus	quiquell: you can use paste.openstack.org in the future for pastes like that	15:24
clarkb	(jet lag brain happening here)	15:24
quiquell	clarkb: Doing some experiments about running zuul locally using local repos	15:24
quiquell	corvus: ack about paste sorry	15:25
corvus	quiquell: https://zuul-ci.org/docs/zuul/admin/drivers/git.html#attr-%3Cgit%20connection%3E.poll_delay -- looks like the delay setting is misspelled	15:25
corvus	quiquell: and yeah, you may want a different name to be less confusing :)	15:25
quiquell	corvus: Jojojo, so lame :-=), thanks	15:25
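A misspelled key like the one corvus spotted can be caught with a small check before starting zuul. This is a rough sketch, not a zuul tool: `unknown_keys` is a hypothetical helper, and `KNOWN_GIT_KEYS` is an assumed set built from the keys in the paste plus the `poll_delay` name from the linked docs, so it may not be the complete list of accepted options.

```python
import configparser

# Assumption: keys expected in a git connection section. Only poll_delay is
# confirmed by the linked docs; the rest are taken from the pasted config.
KNOWN_GIT_KEYS = {"name", "driver", "baseurl", "poll_delay"}


def unknown_keys(conf_text, section):
    """Return the keys in `section` that are not in the expected set."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return sorted(key for key in parser[section] if key not in KNOWN_GIT_KEYS)


# The connection section exactly as pasted in the channel.
CONF = """
[connection "git.openstack.org"]
name=git.openstack.org
driver=git
baseurl=file:///projects/
pool_delay=30
"""

print(unknown_keys(CONF, 'connection "git.openstack.org"'))
```

Run against the pasted section, the check flags `pool_delay`, which is the setting that was silently ignored until it was renamed to `poll_delay`.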
*** jimi\|ansible has joined #zuul		15:32
quiquell	working now :-)	15:36
*** chandankumar has joined #zuul		15:41
*** chandankumar is now known as chkumar\|ruck		15:41
*** hashar has joined #zuul		15:46
*** edleafe_ has joined #zuul		15:47
*** chkumar\|ruck has quit IRC		15:58
*** themroc has quit IRC		16:27
*** irclogbot_0 has joined #zuul		16:35
*** irclogbot_0 has quit IRC		16:36
*** pcaruana has joined #zuul		17:27
*** hashar has quit IRC		17:33
*** caphrim007 has joined #zuul		17:58
*** jpena is now known as jpena\|off		18:13
* SpamapS FOMO'ing soooo hard		18:20
*** quiquell is now known as quiquell\|off		18:48
*** bjackman has quit IRC		21:49
*** bjackman has joined #zuul		21:49
*** bjackman has quit IRC		21:53
*** bjackman has joined #zuul		21:53
*** pcaruana has quit IRC		22:58
*** dkehn has quit IRC		23:38
*** spsurya has quit IRC		23:40
*** dkehn has joined #zuul		23:44
