Monday, 2019-08-26

*** jamesmcarthur has joined #zuul		00:17
*** jamesmcarthur has quit IRC		01:33
*** jamesmcarthur has joined #zuul		02:14
*** jamesmcarthur has quit IRC		02:37
*** jamesmcarthur has joined #zuul		03:04
*** jamesmcarthur has quit IRC		03:33
*** bhavikdbavishi has joined #zuul		03:44
*** threestrands has joined #zuul		04:37
*** threestrands has quit IRC		04:37
*** threestrands has joined #zuul		04:37
*** raukadah is now known as chkumar|rover		04:37
*** dkehn has quit IRC		05:04
*** AJaeger has quit IRC		05:45
*** AJaeger has joined #zuul		05:48
*** aluria has joined #zuul		05:56
openstackgerrit	Jan Kubovy proposed zuul/zuul master: Fix weak dependencies to work with child_jobs https://review.opendev.org/677936	06:39
*** themroc has joined #zuul		06:43
openstackgerrit	Jan Kubovy proposed zuul/zuul master: Fix weak dependencies to work with child_jobs https://review.opendev.org/677936	06:50
openstackgerrit	Jan Kubovy proposed zuul/zuul master: Fix weak dependencies to work with child_jobs https://review.opendev.org/677936	07:10
*** jpena|off is now known as jpena		07:11
*** sshnaidm|afk is now known as sshnaidm		07:16
*** threestrands has quit IRC		07:24
*** hashar has joined #zuul		07:45
*** jangutter has joined #zuul		08:06
*** themroc has quit IRC		08:15
*** themroc has joined #zuul		08:16
*** bhavikdbavishi has quit IRC		09:45
*** saneax has joined #zuul		09:59
*** sanjayu_ has joined #zuul		10:43
*** saneax has quit IRC		10:43
*** badboy has joined #zuul		10:55
badboy	hi guys	10:56
badboy	quick question, is it possible to set Zuul up to be triggered by abandon event?	10:56
*** badboy has quit IRC		11:03
*** gtema_ has joined #zuul		11:20
*** jpena is now known as jpena|lunch		11:25
*** hashar has quit IRC		11:38
*** badboy has joined #zuul		11:44
*** gtema_ has quit IRC		11:46
tristanC	badboy: it seems like all event types ( https://www.gerritcodereview.com/cmd-stream-events.html#events ) are available for trigger, e.g. you could try change-abandoned	11:51
badboy	tristanC: thank you, will try that!	11:54
*** rlandy has joined #zuul		11:59
*** rlandy is now known as rlandy|ruck		11:59
*** badboy has quit IRC		12:01
*** weshay_MOD is now known as weshay		12:07
*** jpena|lunch is now known as jpena		12:30
*** mgoddard has quit IRC		12:47
*** jamesmcarthur has joined #zuul		12:47
*** mgoddard has joined #zuul		12:47
fungi	that ought to work. we run (or at least used to, i haven't checked) jobs on change-restored events which are the inverse of change-abandoned	12:53
fungi	gerrit considers abandoned changes to be closed though, so if you're trying to report to one it will only allow commenting and not voting	12:54
pabelanger	morning	13:31
pabelanger	we are seeing a newish error with swift upload logs	13:31
pabelanger	http://paste.openstack.org/raw/764544/	13:31
pabelanger	module 'keystoneauth1.exceptions.http' has no attribute 'HTTPError'	13:32
pabelanger	unsure if related to vexxhost or python lib	13:32
pabelanger	mnaser:^ in case you want to look	13:32
mnaser	seems like a 504 from our side that made that happen	13:33
mnaser	we saw some issues earlier, mainly because of the sheer amount of objects being uploaded all at once (see discussion on zuul-discuss)	13:33
mnaser	something like ~60-65% of all objects uploaded in a day (2.5m in opendev case) are ara-report files	13:33
pabelanger	okay, so the thought is opendev might be impacting swift?	13:34
fungi	we experienced similar issues with ara's raw files format initially as well, basically we ran out of inodes on the filesystem where we were trying to store them	13:36
pabelanger	that might explain why we've seen an increase in POST_FAILURES for zuul.a.c recently	13:36
fungi	as pointed out on the ml thread, it's not actually the identical static files (icons, scripts, stylesheets) which account for the bulk of those files, it's the data. think of it as a database format where every row for every table is in its own file	13:37
*** jeliu_ has joined #zuul		13:42
AJaeger	pabelanger: I saw a patch merged for this, let me find it...	13:43
AJaeger	pabelanger: I4afe8c9fc8239a31d62a2a1d09794211b5066472	13:43
*** jamesmcarthur has quit IRC		13:44
*** hashar has joined #zuul		13:45
*** jamesmcarthur has joined #zuul		13:47
pabelanger	odd, we should be running this now	13:48
AJaeger	pabelanger: so, is running it the problem? Meaning: Do you have another keystoneauth version running than we do?	13:49
openstackgerrit	Jens Harbott (frickler) proposed zuul/zuul-jobs master: Fix handling of dangling symlink on manifest generation https://review.opendev.org/678552	13:52
*** jamesmcarthur has quit IRC		13:52
pabelanger	AJaeger: let me check which version I have installed	13:52
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Check path for existance in generate_manisfest.py https://review.opendev.org/678553	13:52
pabelanger	keystoneauth1==3.16.0	13:53
AJaeger	pabelanger: I don't know what OpenDev uses...	13:55
pabelanger	okay, thanks. It sounds like, we might be exposing this issue, only if upload to vexxhost times out	13:55
AJaeger	3.17 just came out - a pull shows both 3.16 and 3.17	13:55
pabelanger	which, seems to happen more, now that opendev is also uploading there	13:56
pabelanger	due to volume of ara bits	13:56
*** mgoddard has quit IRC		13:56
openstackgerrit	Jens Harbott (frickler) proposed zuul/zuul-jobs master: Fix handling of dangling symlink on manifest generation https://review.opendev.org/678552	13:57
*** mgoddard has joined #zuul		13:58
AJaeger	zuul-maint, could you review this, please? ^	13:59
fungi	on ze01, pip list reports keystoneauth1 (3.17.0)	14:00
fungi	however, the executor daemon on it last restarted 17 days ago	14:00
fungi	so if it's importing keystoneauth at start, then it hasn't been using anything newer than that date	14:01
AJaeger	and the diff between 3.16 and 3.17 shows nothing related to this AFAIU	14:02
openstackgerrit	Merged zuul/zuul-jobs master: Fix handling of dangling symlink on manifest generation https://review.opendev.org/678552	14:14
fungi	well, 3.16 is also newer than the last executor restart	14:15
fungi	i have a feeling it's running on 3.15.0 comparing release dates for keystoneauth1 with process start times	14:16
mordred	fungi, AJaeger: keystoneauth1.exceptions.http.HttpError is the correct exception name	14:17
clarkb	fungi: because ansible is the result of new forks it should use current sdk install	14:17
mordred	so I think if we're trying to catch HTTPError that's a bug	14:17
mordred	there is also an HTTPClientError	14:18
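The failure mode discussed here can be reproduced in miniature. The sketch below is illustrative only: `fake_http` is a hypothetical stand-in for `keystoneauth1.exceptions.http`, and `upload`, `buggy` and `fixed` are made-up names, not code from zuul-jobs. It relies on the fact that Python evaluates the expression in an `except` clause only when an exception actually reaches it, which would be consistent with the error surfacing only when an upload fails.

```python
import types


# Hypothetical stand-in for keystoneauth1.exceptions.http; only the
# attribute name matters for this illustration.
class HttpError(Exception):
    pass


fake_http = types.SimpleNamespace(HttpError=HttpError)


def upload(fail):
    """Pretend to upload a log file; optionally simulate a 504."""
    if fail:
        raise HttpError("504 Gateway Timeout")
    return "uploaded"


def buggy(fail):
    try:
        return upload(fail)
    except fake_http.HTTPError:  # wrong capitalisation, looked up lazily
        return "retry"


def fixed(fail):
    try:
        return upload(fail)
    except fake_http.HttpError:  # correct attribute name
        return "retry"
```

With a successful upload `buggy` behaves normally, because the misspelled attribute is never looked up; when the upload fails, the lookup itself raises `AttributeError` and hides the original error.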
fungi	clarkb: ahh, this is being called by ansible, not imported by zuul? in that case i concur, it'll be 3.17.0	14:18
fungi	er, well, possibly not	14:19
fungi	that's the system context version	14:19
openstackgerrit	Mohammed Naser proposed zuul/zuul-jobs master: Add tests for manifest generation for missing files https://review.opendev.org/678573	14:19
fungi	ansible will use the one in the corresponding venv for its ansible version, right?	14:19
mnaser	^ this is not really that clean..	14:19
mnaser	i think it's ok enough, but if someone wants to polish it, please feel free to, i gotta dig into other things :<	14:20
*** jamesmcarthur has joined #zuul		14:21
*** michael-beaver has joined #zuul		14:21
openstackgerrit	Monty Taylor proposed zuul/zuul-jobs master: Update keystoneauth exception name https://review.opendev.org/678575	14:21
mordred	fungi, clarkb, corvus ^^	14:21
fungi	yeah, these ansible venvs have keystoneauth1 versions contemporary to their initial creation, so 3.13.1 for all the ansible venvs except the ansible 2.8 venv which has 3.14.0	14:21
corvus	that's untestable code that we just added, so it's not surprising it didn't work the first time out	14:24
*** jamesmcarthur has quit IRC		14:28
corvus	mnaser: test looks good -- left a nit and a question about which behavior we want	14:29
openstackgerrit	Merged zuul/zuul-jobs master: Update keystoneauth exception name https://review.opendev.org/678575	14:38
*** gtema_ has joined #zuul		14:41
*** gtema_ has quit IRC		14:44
*** sanjayu_ has quit IRC		15:11
clarkb	corvus: I wrote a bunch of zuul changes on friday. With exception of this first one https://review.opendev.org/678049 the others https://review.opendev.org/#/c/678286/ https://review.opendev.org/#/c/678312/ should be straightforward docs updates if you have a moment	15:14
clarkb	corvus: the docs updates were things that noorul ran into that we helped noorul work through so are real issues people are having	15:14
corvus	fungi, clarkb: mordred and i are talking to the folks working on the gerrit checks plugin tomorrow about closing some of the gaps so zuul could maybe use it	15:15
corvus	clarkb: thanks i'll take a look	15:16
fungi	as was https://review.opendev.org/678243 but looks like i have a docs build error	15:18
fungi	will update in a moment	15:18
*** jamesmcarthur has joined #zuul		15:18
openstackgerrit	Jeff Liu proposed zuul/zuul-operator master: Add PerconaXDB Cluster to Zuul-Operator https://review.opendev.org/677315	15:19
corvus	fungi: a new user wanted to use the logging config?	15:20
*** dkehn has joined #zuul		15:20
fungi	corvus: yes, to turn on debugging	15:20
corvus	i thought that was a flag?	15:20
*** jamesmcarthur has quit IRC		15:22
*** jamesmcarthur has joined #zuul		15:22
fungi	hrm, if so i didn't find it when searching the zuul docs	15:22
corvus	hrm, apparently that only happens if it runs in the foreground	15:22
fungi	nor did i find anything at all related to configuring service logging	15:23
corvus	honestly, i'd rather we finally fix that rather than encourage folks to add logging configs	15:23
fungi	ahh, okay	15:23
fungi	so proxy the logging module configuration via one of the existing service config files?	15:23
corvus	https://review.opendev.org/635649	15:24
corvus	fungi: ^	15:24
fungi	oh, neat	15:24
fungi	so when zuul is containerized, users don't expect it to create log files?	15:25
fungi	also, for the record, noorul was following the zfs instructions not the quickstart, since it was for something related to the bitbucket driver	15:26
corvus	fungi: that is true for many users of containers	15:26
fungi	so i don't know if the container stuff is relevant to the zfs instructions	15:26
fungi	but maybe manually starting the service with logging to foreground still would be	15:26
corvus	fungi: yeah, but that means that noorul would merely have to add "-d" to the invocation rather than learn the python logging file format	15:26
corvus	fungi: that change severs the two	15:26
corvus	fungi: all 4 cases in the matrix are supported :)	15:27
fungi	let me figure out where i left my bottle of red pills	15:28
corvus	hrm	15:28
corvus	wait, maybe that only handles three?	15:28
clarkb	worth noting that journald also works like docker in this case	15:29
corvus	fungi, clarkb, tobiash: ^ check my comment on that. it's late and i'm making small mistakes now	15:29
clarkb	they both grab all the stdout/stderr and record them	15:29
* corvus eods		15:31
fungi	thanks corvus! enjoy the cool se evening weather	15:31
clarkb	corvus: I think you generally don't want to write to stdout in the daemon case because daemonization closes the fds	15:32
openstackgerrit	Monty Taylor proposed zuul/zuul master: Apply changes to command module from ansible 2.6 https://review.opendev.org/678594	15:33
fungi	yes, closing inherited fds is necessary to be able to fully disassociate from the calling process	15:34
mordred	pabelanger: ^^ you might want to look at that one	15:34
fungi	obviously for container and systemd use cases you may choose not to daemonize	15:34
mordred	pabelanger: flaper87 noticed that we're missing a new parameter added in ansible 2.6	15:34
fungi	oh?	15:35
clarkb	fungi: ya in those two cases you are not supposed to daemonize and then if you write to stdout they collect those as logs	15:35
* flaper87 +1'd		15:35
fungi	ahh, 678594	15:35
*** noorul has joined #zuul		15:36
clarkb	mordred: does that need to be version specific so that < 2.6 don't get weird errors?	15:37
clarkb	oh I guess we convert argv to args so that may actually just backport the support to 2.5. May still cause problems if people test their 2.5 ansible with zuul and it works then deploy and it fails	15:38
*** mattw4 has joined #zuul		15:38
*** mattw4 has quit IRC		15:39
*** mattw4 has joined #zuul		15:39
*** mattw4 has quit IRC		15:42
*** noorul has quit IRC		15:42
*** mattw4 has joined #zuul		15:42
openstackgerrit	Merged zuul/nodepool master: openstack: handle safely invalid network name https://review.opendev.org/677501	15:47
*** mattw4 has quit IRC		15:52
*** stewie925 has joined #zuul		15:58
*** noorul has joined #zuul		16:00
*** jpena is now known as jpena|off		16:02
*** noorul has quit IRC		16:07
openstackgerrit	Jeff Liu proposed zuul/zuul-operator master: Add PerconaXDB Cluster to Zuul-Operator https://review.opendev.org/677315	16:07
*** noorul has joined #zuul		16:11
noorul	hi	16:14
noorul	Does merger care about all the branches in the repo?	16:14
openstackgerrit	Merged zuul/zuul master: Document js tool installation in scratch doc https://review.opendev.org/678286	16:14
fungi	noorul: yes, because a change could be proposed for any branch of a repository	16:15
fungi	it will only calculate merges for the target branches of changes in the set it's considering, but i believe it prepares all branches from the configured remote	16:16
fungi	local copies of all branches that is	16:16
clarkb	yes it does that to load configs for all branches	16:16
fungi	ahh, right, that too, for the cat jobs	16:17
noorul	I see this exception http://paste.openstack.org/show/764710/ in the log. But it is not marking the build as MERGE_FAILURE	16:20
fungi	if things were working correctly, the scheduler should then report that failure	16:23
fungi	have you checked the scheduler log around that timestamp?	16:23
noorul	It says merge failure, http://paste.openstack.org/show/764712/	16:28
fungi	does the change ahead of it which it's conflicting with still have builds in progress?	16:31
noorul	Yes	16:31
fungi	because zuul won't report until it decides for sure it won't be able to merge. if the change ahead of it fails a build and gets kicked out of the queue, then this change will no longer conflict and will be tested normally	16:31
noorul	fungi: I see	16:32
fungi	if the change ahead of it (with which it's conflicting) succeeds all its builds and merges, then this change will be kicked out because it can no longer be merged to the branch, and then the scheduler should report a merge failure on it	16:32
noorul	fungi: Is the scheduler responsible for updating build status as MERGE_FAILURE?	16:33
noorul	fungi: I see that in this scenario using Gerrit, the build status is set as MERGE_FAILURE	16:33
noorul	but not in the case of stash. I think that stash driver has nothing to do here.	16:34
fungi	yes, it will happen as part of the buildset reporting process. it will request an updated merge from the mergers if the change ahead of it is ejected	16:34
openstackgerrit	Merged zuul/zuul master: Set git user config in from scratch document https://review.opendev.org/678312	16:34
fungi	otherwise it will report the merge failure	16:34
*** igordc has joined #zuul		16:37
*** jamesmcarthur has quit IRC		16:40
corvus	clarkb: my intent wasn't to write to stdout in the debug case, but rather to change the debug level of the default file output handler which is configured if no logger config is specified. i only intended to suggest that the setDebug() method be called if the debug arg is present regardless of which output handler is used	16:48
clarkb	gotcha	16:48
corvus	i don't know what i actually wrote on the review because i'm tired, but that's what i meant :)	16:49
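A minimal sketch of the separation corvus describes, using only the standard `logging` module. `build_handler` and the log path are hypothetical names for illustration and are not Zuul's actual implementation: the output destination depends on whether the process stays in the foreground, while the debug flag only changes the level, whichever handler is in use.

```python
import logging
import sys


def build_handler(foreground, debug, log_path):
    """Pick an output handler, then apply the debug level independently.

    foreground: True when not daemonizing (containers, systemd), so
                stdout is still open and collected by the supervisor.
    debug:      True when the debug flag was passed on the command line.
    log_path:   hypothetical default log file used when daemonized.
    """
    if foreground:
        handler = logging.StreamHandler(sys.stdout)
    else:
        # delay=True defers opening the file until the first record.
        handler = logging.FileHandler(log_path, delay=True)
    # The level is set regardless of which handler was chosen.
    handler.setLevel(logging.DEBUG if debug else logging.INFO)
    return handler
```

All four combinations of foreground/daemon and debug/non-debug go through the same two independent decisions.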
*** hashar has quit IRC		16:49
noorul	corvus, clarkb: Is it possible for you to take a look at stash PR to figure out why I am not able to get the merge failure from Zuul?	16:53
*** hashar has joined #zuul		16:57
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Fix handling of dangling symlink https://review.opendev.org/678619	16:58
openstackgerrit	Merged zuul/zuul master: bindep: add unzip and bzip2 for rpm platform https://review.opendev.org/678433	16:58
*** bhavikdbavishi has joined #zuul		17:01
*** jamesmcarthur has joined #zuul		17:04
openstackgerrit	Andreas Jaeger proposed zuul/zuul-jobs master: Add tests for manifest generation for missing files https://review.opendev.org/678573	17:05
noorul	Sometimes both the PR tests run in parallel	17:09
noorul	https://imgur.com/a/0KBjOXA	17:09
mordred	clarkb: re: the patch for adding the argv to command module - https://review.opendev.org/#/c/650431/7 pabelanger has a patch up to remove 2.5 support anyway (since it's EOL)	17:09
mordred	clarkb: so I was kind of not worrying about it too much	17:09
clarkb	ya I'm on the side of the fence that just because ansible eols aggressively doesn't mean we have to. There is typically little reason to upgrade ansible as a user and comes with quite a bit of headache to do so	17:11
clarkb	I think if ansible was backward compatible and upgrades just worked I'd care less	17:11
clarkb	I expect zuul users don't want to do large amounts of job churn frequently either	17:12
mordred	yeah	17:14
mordred	clarkb: we could also copy the current command module to the 2.5 directory replacing the symlink	17:15
mordred	which shouldn't be too hard	17:15
*** hashar has quit IRC		17:18
openstackgerrit	Monty Taylor proposed zuul/zuul master: Apply changes to command module from ansible 2.6 https://review.opendev.org/678594	17:18
mordred	clarkb: there's doing that	17:18
*** armstrongs has joined #zuul		17:19
*** mattw4 has joined #zuul		17:19
*** noorul has quit IRC		17:20
*** chkumar|rover is now known as raukadah		17:21
mugsie	Shrews: re - https://review.opendev.org/#/c/554432/ - that is working for me locally :/	17:27
mugsie	and the unit test seems to be loading and running fine as well	17:27
Shrews	mugsie: I’m surprised since I didn’t see a config definition for ‘driver’, but I only did a quick pass before an appointment	17:28
armstrongs	Quick question: I noticed that when I run a shell command on a vm from nodepool it streams the stdout, but when you run it on a container using the kubernetes driver it doesn't stream the stdout and you have to view it in the json output or ara report. Is there any way to get the stdout to behave the same on containers as on vms?	17:28
*** jamesmcarthur has quit IRC		17:29
mugsie	I basically copied AWS's config layout, but I can update to add in the extra stuff you pointed out tomorrow at some point	17:29
clarkb	armstrongs: the way console streaming works we run a little daemon on the test node to collect and stream that data. As is I don't think people want to pollute containers with that by default, but that should all just be a base job config thing iirc	17:32
clarkb	armstrongs: there is long term ongoing work to have ansible do what zuul does for streaming out of the box and once that happens it shouldn't matter what platform is used	17:32
armstrongs	clarkb: could you point me towards what I need to put in the base job if you have an example. Also thanks for the info 😊	17:35
clarkb	armstrongs: zuul/zuul-jobs/roles/start-zuul-console	17:36
armstrongs	Ah awesome thanks again	17:37
clarkb	we have that in our base job pre run playbook	17:37
fungi	armstrongs: https://opendev.org/opendev/base-jobs/src/branch/master/playbooks/base/pre.yaml#L18	17:37
armstrongs	Cool will give it a go	17:38
tristanC	armstrongs: zuul-console may not work in kubernetes as it requires tcp access to the pod netns from the zuul-executor. iirc the kubectl connection doesn't have access to the pod netns and relies on the exec api of kubernetes	17:40
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Fix handling of dangling symlink https://review.opendev.org/678619	17:45
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Add tests for manifest generation for missing files https://review.opendev.org/678573	17:46
*** jamesmcarthur has joined #zuul		17:49
*** jamesmcarthur has quit IRC		17:58
clarkb	any other zuulian want to review https://review.opendev.org/#/c/676717/ I think that will improve memory overhead for running zuul jobs which will help opendev	18:02
*** armstrongs has quit IRC		18:08
*** jamesmcarthur has joined #zuul		18:12
*** bhavikdbavishi has quit IRC		18:32
*** noorul has joined #zuul		18:40
SpamapS	clarkb:I wonder if we could make the zuul_console daemon a sidecar that is automatically added to every pod.	18:51
*** noorul has quit IRC		18:52
SpamapS	Wouldn't be too hard.. emptyDir shares the socket, envvar added to the main pod with the ID.	18:52
clarkb	or we can push ansible to add support in tree	18:53
clarkb	which I believe they want anyway	18:53
clarkb	then it will work for ansible always hopefully	18:53
SpamapS	tristanC: that TCP access that the executor needs doesn't need anything other than a port to contact. Why would it need access in the netns?	18:53
clarkb	I'll add this to my list of things to bring up at ansiblefest dev day	18:54
SpamapS	Just grab the IP of the container and the sidecar should have the usual port defined.	18:54
clarkb	the other item is the exec per task	18:54
SpamapS	clarkb: That would be nice. :)	18:54
*** noorul has joined #zuul		18:57
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Fix handling of dangling symlink https://review.opendev.org/678619	19:01
*** noorul has quit IRC		19:01
SpamapS	Ugh.. https://opendev.org/zuul/nodepool/commit/da2701e0b19cbe75cdbd79cfeafaf7c643546fc7 broke us btw. I understand, using the Dockerfile in the repo may not be something that is part of releases. Just, FYI.. that broke us. We're having to redo a bunch of stuff to be able to deploy Nodepool 3.8.0. :-P	19:04
clarkb	SpamapS: it broke because the uid you had been using wasn't 10001?	19:06
SpamapS	clarkb:correct.	19:06
SpamapS	Permissions problems.. have to rework our Kubernetes pod specs	19:06
clarkb	I think we saw that as forward compatible because you can specify whatever you want it to be, but ya if you don't specify you get the default	19:07
*** noorul has joined #zuul		19:07
SpamapS	not a huge deal. I just want to raise that this was yet another thing that changed under us. I am not upset, or anything. I just .. it happens a lot.. I feel like we're doing the wrong thing or something.	19:08
*** armstrongs has joined #zuul		19:09
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Add tests for manifest generation for missing files https://review.opendev.org/678573	19:11
*** noorul has quit IRC		19:12
tristanC	SpamapS: i meant access to the tcp port of the zuul-console daemon, which is not exposed by default iiuc	19:14
SpamapS	tristanC:right but we can expose it in the k8s driver by running it as a sidecar and sharing the socket in. That way your pod image is clean, you don't even need python. :)	19:14
SpamapS	well.. n/m.. you do.. because ansible	19:15
tristanC	SpamapS: i'm not sure how that will work, the kubectl connection doesn't have an ip address, thus the zuul_stream wouldn't be able to contact the right ingress entrypoint. and even so, how would you map the default port to the zuul-console netns?	19:17
*** noorul has joined #zuul		19:17
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Fix handling of dangling symlink https://review.opendev.org/678619	19:19
tristanC	another solution to get the output in job-output.txt (but not live) would be to tweak zuul_stream and make it dump the result object when the connection is kubectl	19:22
*** noorul has quit IRC		19:22
SpamapS	tristanC: We could make a nodeport for it.	19:27
*** noorul has joined #zuul		19:28
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Fix handling of dangling symlink https://review.opendev.org/678619	19:28
Shrews	SpamapS: I, for one, wasn't really considering the nodepool Dockerfile a "production" piece of code, rather something that we used for our testing. While you obviously were. Perhaps that's the something that isn't quite right there and we need some more better testing around that.	19:29
Shrews	In which case, this feels more like a "packaging testing" issue and feels out of place within nodepool repo itself.	19:30
Shrews	But I'm often told that I'm a weird person	19:30
clarkb	ya I think some of the pain there is SpamapS is running zuul and nodepool completely disjoint from many of us. github not gerrit, aws/k8s with nodepool, kubernetes to host the services, etc. Different groups of us use pieces of that collection but there isn't full overlap iirc	19:31
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Add tests for manifest generation for missing files https://review.opendev.org/678573	19:31
clarkb	I think we can cover a lot of those gaps with better testing	19:31
*** noorul has quit IRC		19:32
tristanC	SpamapS: isn't nodeport unique per node? it seems like only the first job will be able to spawn the zuul-console service	19:34
tristanC	SpamapS: also, setting nodeport requires admin privilege on okd	19:34
*** armstrongs has quit IRC		19:35
*** noorul has joined #zuul		19:36
Shrews	I'd love to solve the us-breaking-SpamapS issues. Some areas we can improve with additional testing (like the Dockerfile thing), others, like the AWS driver, I'm not sure we'd ever be able to do anything about.	19:37
clarkb	Shrews: this is a bit of hand waving but openstack does/did have ec2 api layer	19:40
clarkb	I have no idea how good a stand in for aws that is, but is a potential option	19:40
fungi	seems it's still semi-active: https://opendev.org/openstack/ec2-api/commits/branch/master	19:42
Shrews	yeah, i don't know anything about that either	19:42
*** noorul has quit IRC		19:43
clarkb	I think logan- said they use it in some capacity, might have insight on applicability to this use case	19:43
*** stewie925 has quit IRC		19:44
fungi	https://review.opendev.org/650397 was the last commit of substance to merge and was related to testing, but was reviewed fairly quickly	19:45
fungi	~4 months ago	19:45
*** hashar has joined #zuul		19:46
fungi	and it's still considered an official team in openstack: https://governance.openstack.org/tc/reference/projects/ec2-api.html	19:46
Shrews	i would expect issues with a driver test using a translation API rather than the actual API most users would be using	19:47
Shrews	but also... something is better than nothing sometimes	19:47
clarkb	Shrews: ya that could happen. It might however catch bugs in improper use of the api?	19:47
*** noorul has joined #zuul		19:48
clarkb	probably won't know how useful it is until we try it and I'm not sure if the investment makes that worthwhile	19:48
Shrews	I think users of that driver (with a much more vested interest in it) would have to do the investing. But I think that's only 1 person, atm	19:49
*** noorul has quit IRC		19:52
*** jamesmcarthur has quit IRC		19:53
SpamapS	clarkb: I really hope to move to the kubernetes operator once it exists. That should help me align better.	19:53
SpamapS	Sign me up as a beta tester.	19:53
SpamapS	tristanC: You can't make a nodeport in a namespace you have control over?	19:54
SpamapS	tristanC: we may be talking about doing this on a different level. I'm suggesting that we make a way for Zuul to tell Nodepool that it wants these things when it asks for a pod.	19:54
SpamapS	like, in the nodepool request, when it asks for a label that is k8s based, it should be able to also tack on a little thing that tells it to run the sidecar and create a nodeport.	19:55
SpamapS	So, not talking about doing it from the ansible. Do it in nodepool and zuul.	19:56
*** mattw4 has quit IRC		19:58
openstackgerrit	Merged zuul/zuul-jobs master: Fix handling of dangling symlink https://review.opendev.org/678619	19:59
openstackgerrit	Jeff Liu proposed zuul/zuul-operator master: Add PerconaXDB Cluster to Zuul-Operator https://review.opendev.org/677315	20:00
tristanC	SpamapS: yes I understand it needs to be done by nodepool, but multiple zuul builds can run on one kubernetes host which may have only one public ip (nodeip)	20:02
openstackgerrit	Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Add tests for manifest generation for missing files https://review.opendev.org/678573	20:02
tristanC	SpamapS: thus the first build may get the correct nodeport, and we can somehow tell zuul that the console of the kubectl connection is the kubernetes nodeip, but that will only work for the first build	20:03
tristanC	SpamapS: okd doesn't let regular users use nodeport as they should be using regular ingress routes, which are either based on http vhost or dynamic port mapping, which isn't supported by the zuul_stream module.	20:04
SpamapS	tristanC:nodeports are unique ports. So you'd get back the node IP and a random port for zuul to use as the console port.	20:05
SpamapS	ingresses are for http.. :-P	20:05
SpamapS	We can make zuul_stream read an environment variable or something.	20:05
SpamapS	Anyway, point is, I think it can be done just with k8s primitives and some plumbing between zuul and nodepool. But, I defer to the implementors. I will not be able to work on this myself. :_/	20:06
*** hashar has quit IRC		20:08
tristanC	SpamapS: oh right, if we can make the zuul_stream callback and the zuul_console use an arbitrary port, that would work	20:11
tristanC	but i've not been able to change the zuul-console listening port without changing the ansible role vars (the python module that spawns the daemon doesn't have access to the environment or site vars)	20:12
clarkb	it can scan /proc for that info	20:14
clarkb	we did that at one point to find the pid iirc	20:14
clarkb	though maybe that means we can't find the pid to find the port	20:15
SpamapS	Yeah not claiming it's easy.. like clarkb said.. Ansible needs to put this in the core so we can simplify it.	20:15
SpamapS	I bet if it's done right we can drop the zuul_console daemon and just multiplex the output from the python modules that get uploaded.	20:16
* SpamapS kind of wishes he could just work on that for 3 months.		20:16
*** mattw4 has joined #zuul		20:17
*** mattw4 has quit IRC		20:28
*** mattw4 has joined #zuul		20:39
*** dolpher has joined #zuul		21:01
dolpher	How is the zuul user created in the nodepool image? Is there a dib element for doing that?	21:03
clarkb	dolpher: https://opendev.org/openstack/project-config/src/branch/master/nodepool/elements/zuul-worker that is the element we use	21:07
flaper87	Is this a valid job definition? http://paste.openstack.org/show/765019/ (specifically line #5 where I'm using an ansible variable that is defined in a previous ansible play using zuul_return)	21:07
flaper87	the zuul_return is called in the `run` playbook. I should probably try running this task in the `pre-run` and then consuming the returned value from the `run` playbook	21:13
clarkb	I don't think there is shared values between the phases	21:14
clarkb	zuul_return will share the info between jobs though	21:14
tristanC	flaper87: it depends, zuul_returns are passed between buildset dependent jobs	21:14
SpamapS	I've used a file on disk to pass info between pre/normal/post	21:15
flaper87	oh, mmh, then that's not what I need :/	21:15
dolpher	clarkb: thanks, looks like nodepool-base depends on it, so can I use the nodepool-base element in a 3rd party CI env? or any other suggested/required elements to include?	21:16
flaper87	I need to generate an auth token from a trusted playbook and pass the token to the job so that the job can use it to authenticate to a service	21:16
clarkb	dolpher: I think those elements are largely going to be geared towards opendev's images. In general you shouldn't need much more than an adduser and setting the ssh key (which you could even use something like cloud init to set at runtime too)	21:18
clarkb	if those elements work for you then great, they tend not to change much, with dns fiddling being the most likely changes based on history iirc	21:19
clarkb	flaper87: I believe that will make your auth token exposable	21:19
dolpher	clarkb: got it, thanks!	21:19
clarkb	flaper87: if you need to keep the token secure I think that the consumer of the token will also need to be in a trusted repo	21:20
clarkb	flaper87: if you don't need to keep the token secure then you can likely do what spamaps does and write to disk and load from there on subsequent playbooks	21:25
*** igordc has quit IRC		21:26
flaper87	clarkb: SpamapS gotcha, thanks! I'll play with that although I think the answer is that I do have to keep this secure.	21:26
*** mattw4 has quit IRC		21:28
*** jeliu_ has quit IRC		21:28
clarkb	flaper87: the problem with your plan of handing a secure set of data to an untrusted playbook is that I could push a change that is tested pre-review that cats that data to the console	21:29
clarkb	(or whatever method I want to sneak it off of the test node)	21:29
clarkb	if however it is fully in a trusted repo you can only do that post review	21:29
clarkb	the idea being it won't get approved if it does something bad like that	21:29
SpamapS	flaper87: The thing you said, generating a token, is exactly what I do. The pre has AWS access keys to generate an STS token. I write the token, which has a timeout of exactly the same as the job timeout, into ~/.aws/credentials.	21:31
SpamapS	Actually I've been meaning to open source that role	21:32
SpamapS	http://paste.openstack.org/show/765021/ <-- pretty simple actually	21:34
*** mattw4 has joined #zuul		21:36
SpamapS	clarkb: Oh this reminds me! I've been thinking a really cool enhancement to zuul_stream would be that you could feed it a list of hashes of secrets from the trusted phase, and it would XXXXX any strings that match those hashes in the output stream.	21:43
SpamapS	That wouldn't prevent malicious compromise (they can just encrypt it to a key), but it would prevent accidentally printing stuff.	21:43
fungi	i guess it also would be tricky to find what substrings to check against the hashes	21:51
fungi	since for any output stream of nontrivial length the possible substrings (even below a reasonably small max string length) would be nigh innumerable	21:52
clarkb	you'd have to tokenize based on whitespace or some other rule? otherwise ya arbitrary length strings	21:54
SpamapS	You could do some interesting optimization with run length.	22:02
SpamapS	like if you know hash=abc123 and len=47, you can stop checking when there are only 46 chars left. You only have to check the output of things from the console.. it's not that much data.	22:03
SpamapS	add in character classing (class==binary is check every byte, class==word means check non-whitespace, etc)	22:04
SpamapS	Anyway, just a fun thought.	22:04
clarkb	does specifying hash length compromise the one-wayness of the hash? (I think for short values it probably does?)	22:04
* SpamapS goes back to using the hell out of zuul as it is. ;)		22:04
clarkb	so many rainbow tables	22:05
SpamapS	clarkb: yeah, for short values RL would be a big help for rainbow tables and such.	22:05
SpamapS	Use a fast hash algorithm with a strong one to back it up, like rsync uses crc32 before others, and you can just check every damn byte with minimal overhead.	22:06
SpamapS	but.. yeah.. fantasyland is over	22:06
fungi	not exposing the raw length of secrets is one of the reasons for using oaep in our encoding choice	22:21
fungi	so we only leak a loosely quantized length	22:21
fungi	knowing exactly (or even almost) how long a secret is can mean a significant reduction in work factor for brute-force guessing	22:22
fungi	and even moreso for educated guessing	22:23
SpamapS	Indeed. I think just a fast, collision-prone-but-secure hash, followed by a very slow one like sha512, would allow efficient filtering w/o the length.	22:28
SpamapS	not sure what that faster hash is.. would be fun to play with a few and find the right balance.	22:28
SpamapS	Hm, but if something wrapped a secret, you'd have to accept that as a potential oops-around.	22:29
SpamapS	(The whole idea in my head assumes you'd print the secret on a single line)	22:29
SpamapS	n/m on this	22:29
SpamapS	fun mental exercise	22:29
* SpamapS withdraws		22:29
fungi	oh, and also, pkcs1 is a kdf, not just a raw hash, for a reason. publishing a plain hash of the secret also significantly speeds attempts at guessing the value because a simple hash is much faster to compute than applying a typical kdf	22:36
fungi	there's a massive market for special-purpose sha256 or sha512 hash generating processors	22:37
fungi	those hashes aren't designed to be slow or computationally intensive to calculate	22:37
SpamapS	I wonder if instead of focusing on hashing, the right thing is focusing on getting the secret into a secure place that can filter the output in real time.	22:37
SpamapS	But.. this is again just an oops-preventor.. so. not worth spending much time. :)	22:38
fungi	i like where you're going with the idea	22:38
fungi	but yes, the execution is going to be fraught with pitfalls, risking compromising the strength of the mechanisms protecting secrets it's attempting to catch leaks of	22:39
fungi	though i believe we do have some tests in zuul which encrypt a known secret, use it in some jobs, and then search the resulting logs for a leak of that secret	22:40
fungi	granted that sort of testing can miss unlikely branches in execution or similar corner cases, so it's also not a panacea	22:41
SpamapS	fungi: Oh hm, that technique would be an interesting way to test if a change is doing something naughty with a secret....	22:41
SpamapS	fungi: you could conceivably run the job w/ a known secret in the secret value, and do the same check.	22:42
fungi	right	22:42
SpamapS	Except it would probably fail when using said secret, which is usually the point of secrets.	22:42
SpamapS	(like, I mean, a real job)	22:42
fungi	yeah, would have to be a fairly abstract job without external interactions i suppose	22:43
SpamapS	Yeah that's not really the thing I want to prevent. I want to prevent somebody from leaving their debug stuff in. ;)	22:43
SpamapS	All we have in defense of that now is code review.	22:43
fungi	and you would likely need to mock up possible failure scenarios in it, since the most common problem is "i tried to log into this remote service and failed, here are the credentials i used..."	22:43
SpamapS	Yeah, no, the only thing I can think of is that you run a filter in front of the logs that knows the exact strings it's not supposed to print.	22:44
SpamapS	There's a company out there, Netskope, that does this with things like Google Docs and Dropbox. You securely feed them all of your secrets, with 0 context, and they will scan all of your google docs and dropbox files for those specific strings.	22:45
fungi	this is verging into the wonderful world of proxies (you love that planet, i know). i'm familiar with an entire industry built around scanning and blocking egress traffic which includes known trigger words/phrases	22:45
SpamapS	But you have to trust them to have those secrets.	22:45
fungi	yeah, the big enterprise solutions involve transparent proxies at your border which you trust (because you audited the source code in them yourself? doubtful, but anyway...)	22:46
fungi	basically application-layer gateways serving the "data loss prevention" concerned industrial sectors	22:48
fungi	you have to rub them down with snakeoil every morning	22:49
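
The log filter SpamapS describes at 22:44 (something in front of the logs that knows the exact strings it must not print) could look roughly like the sketch below. It is not part of zuul_stream or any Zuul API; the function name redact_stream, the XXXXX mask and the sample secret are all hypothetical. It assumes the filter is handed the plaintext secrets, so it has to run somewhere trusted, and it only catches a secret printed whole on a single line, which is the same limitation raised at 22:29.

```python
import io


def redact_stream(lines, secrets, mask="XXXXX"):
    """Yield each line with every known secret replaced by mask.

    Hypothetical helper for illustration; not a Zuul function.
    """
    # Longest first, so a secret that contains another one is masked whole.
    ordered = sorted((s for s in secrets if s), key=len, reverse=True)
    for line in lines:
        for secret in ordered:
            line = line.replace(secret, mask)
        yield line


if __name__ == "__main__":
    # Made-up console output containing a made-up secret.
    console = io.StringIO(
        "login to remote service failed\n"
        "credentials used: hunter2-token\n"
    )
    for out in redact_stream(console, ["hunter2-token"]):
        print(out, end="")
```

As noted in the discussion, this is only an oops-preventer: a malicious change can still encode or encrypt the secret before printing it, and a secret wrapped across lines passes through untouched.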
*** mattw4 has quit IRC		23:02
clarkb	ianw when your day starts thoughts on the latest patchset for https://review.opendev.org/#/c/678049/ would be great	23:24
