Monday, 2018-07-09

*** threestrands has joined #zuul00:03
*** hwoarang has quit IRC00:43
*** threestrands_ has joined #zuul01:18
*** threestrands_ has quit IRC01:18
*** threestrands_ has joined #zuul01:18
*** threestrands_ has quit IRC01:19
*** threestrands_ has joined #zuul01:20
*** threestrands_ has quit IRC01:21
*** threestrands_ has joined #zuul01:21
*** threestrands_ has quit IRC01:21
*** threestrands_ has joined #zuul01:21
*** threestrands has quit IRC01:21
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add role for installing docker and configuring registry mirror  https://review.openstack.org/58073003:39
*** chandankumar has joined #zuul05:08
*** chandankumar is now known as chkumar|ruck05:08
tobiashcorvus: I tried to reproduce the erratic behavior in a test environment so I don't have to break production during debugging but no luck so far05:10
*** pawelzny has joined #zuul05:39
*** yolanda has joined #zuul05:46
tobiashcorvus: I've now managed to reproduce it first in my test env and now also as a zuul test case :)06:07
tobiashcorvus: it seems that it's important to have two config projects, one depoyment specific and one shared by all deployments06:08
tobiashcorvus: it seems that it's happening if the according pipeline is defined in a different config project06:16
openstackgerritVu Cong Tuan proposed openstack-infra/zuul-jobs master: Switch to stestr  https://review.openstack.org/58002807:13
*** gtema has joined #zuul07:14
openstackgerritVu Cong Tuan proposed openstack-infra/zuul-jobs master: Switch to stestr  https://review.openstack.org/58002807:14
*** fbo|off is now known as fbo07:15
*** hashar has joined #zuul07:46
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Add test facility to add file contents in github tests  https://review.openstack.org/58094807:51
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Reproducer for master job contamination  https://review.openstack.org/58094907:51
tobiashcorvus: I have a reproducer now ^07:51
*** zigo has quit IRC08:03
*** electrofelix has joined #zuul08:04
*** aspiers[m] has quit IRC08:04
*** zigo has joined #zuul08:05
*** threestrands_ has quit IRC08:12
*** aspiers[m] has joined #zuul08:31
openstackgerritVu Cong Tuan proposed openstack-infra/zuul-jobs master: Remove testr  https://review.openstack.org/58002808:54
openstackgerritMarkus Hosch proposed openstack-infra/zuul master: Fix GitHub branch protection test  https://review.openstack.org/58096608:56
openstackgerritMarkus Hosch proposed openstack-infra/zuul master: Reduce number of reconfigurations on branch delete  https://review.openstack.org/58096708:56
*** pawelzny has quit IRC08:57
*** pawelzny has joined #zuul09:02
*** gtema has quit IRC09:02
*** sshnaidm|off is now known as sshnaidm|rover09:02
openstackgerritMarkus Hosch proposed openstack-infra/zuul master: Reduce number of reconfigurations on branch delete  https://review.openstack.org/58096709:25
*** gtema has joined #zuul10:22
*** gtema has quit IRC10:50
*** gtema has joined #zuul10:51
*** pawelzny has quit IRC10:54
*** pawelzny has joined #zuul10:57
*** gtema has quit IRC11:07
*** gtema has joined #zuul11:10
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add role for installing docker and configuring registry mirror  https://review.openstack.org/58073011:57
rcarrillocruzfolks, is there any zuul construct to say 'run this job unless neither of these list of jobs are running'12:08
*** sshnaidm|rover is now known as sshnaidm|afk12:08
rcarrillocruzthis job would be in periodic, checking of jobs not running on check/gate12:08
rcarrillocruzother than a mechanism specific to this, i can only think of a semaphore12:08
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: fix job contamination  https://review.openstack.org/58101912:17
tobiashcorvus: after hours of debugging I found out that not the cached layout is contaminated but the one that is attached to the pipeline is the wrong one and this is used under some circumstances12:17
tobiashthis is an easy fix for this by retrieving the layout from the tenant instead of the pipeline ^12:18
tobiashcorvus: while this fix is easy I think it's the wrong fix as this hides a deeper problem and maybe further uses of the wrong layout12:19
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Add job to build container images using pbrx  https://review.openstack.org/58016012:27
*** rlandy has joined #zuul12:34
openstackgerritcaoxufeng proposed openstack-infra/zuul master: Test for commit  https://review.openstack.org/58102712:46
mordredrcarrillocruz: semaphore would be the only thing for that I could think of12:49
mordredtobiash: ooh, that's exciting12:49
*** toabctl has quit IRC12:59
rcarrillocruzoki, thx12:59
*** toabctl has joined #zuul12:59
*** chkumar|ruck has quit IRC13:01
*** gtema has quit IRC13:09
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Add job to build container images using pbrx  https://review.openstack.org/58016013:22
rcarrillocruzmordred: is it possible for jobs to acquire multiple semaphores, or is it just one?13:23
*** sshnaidm|afk is now known as sshnaidm|rover13:42
tobiashrcarrillocruz: even if it would be possible I wouldn't recommend that. This would cry for deadlocks...13:45
tobiashrcarrillocruz: the data structure in zuul says it's one13:47
rcarrillocruzk, thx for confirming13:47
*** zxiiro-pto is now known as zxiiro14:08
trishnagHi14:10
trishnagCan we make zuul do squash and merge when merging PRs instead of creating new commits?14:10
rcarrillocruztobiash: you use GH, do you know ^ ?14:18
tobiashtrishnag, rcarrillocruz: afaik zuul doesn't support this yet14:18
rcarrillocruzack thx14:18
trishnagtobiash: ack thanks.14:19
mordredtrishnag: it probably shouldn't be _super_ hard to add I'd imagine, right?14:21
mordredgah14:21
mordredtobiash: ^^14:21
tobiashmordred: changing the reporter should be easy as it's just a flag14:22
corvuszuul has a facility for specifying alternative merge strategies for the merger, so it could be implemented without too much difficulty14:22
tobiashmordred: but I'm not sure how to do this for the correct merge-mode (is the right one already there?)14:22
corvushttp://git.zuul-ci.org/cgit/zuul/tree/zuul/merger/merger.py#n53114:23
mordredcorvus: I think this is more about the reporter - rather than clicking 'merge' - have zuul click 'squash-and-merge'14:23
corvusmordred: it's both14:24
mordredcorvus: fair point14:24
corvus(you want zuul to do the same thing github is going to do so that it actually tests the right thing)14:24
mordredyup14:24
mordredand implementing it in the merger also certainly sets us up for a future of zuul-push14:24
corvusyep14:25
mordredalthough - to make things more funner, it seems like merge-mode might be a project setting rather than a pipeline quality14:25
corvusthis may not sound very important, but before we managed to get zuul doing (almost) exactly the same kind of merge gerrit does (merge-resolve), we ran into problems frequently.14:25
corvusmordred: it is, in fact, a project setting14:26
mordred\o/14:26
mordredthen I am correctly caffinated14:26
corvusso the github reporter could look at the merge-mode of the project it's reporting and dtrt14:26
mordredcorvus: btw - I'm making progress on the container image build job - it has progressed from zuul error with no logs, to RETRY_LIMIT - to just having normal failures!14:26
corvustrishnag: ^ see followup conversation above14:27
mordredcorvus: ooh, that's a good idea14:27
corvusmordred: the system works!14:27
corvustobiash: i'm thrilled by your bug progress, and will start looking into it as soon as i've had breakfast!14:28
mordredcorvus: to summarize, we'd want to add a squash-and-merge strategy for the merger, then update the github driver to look at project merge-mode settings14:28
tobiashcorvus: I have a bad quick fix (see above) and a better but bigger fix that I can upload shortly14:28
corvusmordred: ++14:29
mordredcorvus: ooh look! http://logs.openstack.org/60/580160/10/check/zuul-build-container-images/bf13769/job-output.txt.gz#_2018-07-09_13_31_35_51578514:30
corvusmordred: a nice add-on might be to have the drivers validate merge-mode settings for projects, since i think we're about to have divergent choices between our drivers.14:30
mordredcorvus: it's actually building the images and is failing on actual bad content!14:30
mordredcorvus: yah14:30
mordredcorvus: honestly - is it time to start considering shifting to zuul-push? divergent click-the-merge-button settings across drivers start to seem ugly14:31
mordredcorvus: but yes, validation14:31
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Fix job contamination - alternate approach  https://review.openstack.org/58104714:31
tobiashcorvus: that is an alternative approach14:31
tobiashcorvus: so the thing is that it's not poisening the parsed or unparsed layout but some state of the layout is attached to the pipeline14:32
tobiashcorvus: the first quick fix fixed it by not taking the layout from the pipeline at two places but go deeper into the tenant and get it from there14:33
*** pabelanger has quit IRC14:33
*** pabelanger has joined #zuul14:33
tobiashcorvus: then I realized that it somehow sounds weird to have the layout in the pipeline and most of the layout accesses were to get the tenant anyway14:33
tobiashcorvus: so the second approach is to rip out the layout from the pipeline and replace it directly from the tenant14:34
tobiashcorvus: as it turned out just a few minutes we had these issues also in some subtile ways without any config project involved14:35
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: web: add tenant-scoped autohold, enqueue (change, ref)  https://review.openstack.org/57690714:45
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Update bindep file with compile profiles  https://review.openstack.org/58015914:46
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Add job to build container images using pbrx  https://review.openstack.org/58016014:46
*** hashar is now known as hasharAway14:48
rcarrillocruzneat14:49
rcarrillocruzmy jobs are now using a semaphore14:49
rcarrillocruzsince they use aws creds to create aws objects, i needed a way to not incur into quota issues14:50
mordredrcarrillocruz: ++14:50
rcarrillocruzby defaultl aws allow 5 vpcs per region14:50
mordredrcarrillocruz: and this is one of those cases where the thing you're doing is legit job content and not a thing that an aws nodepool driver could otherwise solve14:51
rcarrillocruzyeah, cos the jobs create 'things' not nodes... so the driver is not really applicable14:52
rcarrillocruznow we get into nodepool getting into thingpool14:52
rcarrillocruzShrews: wink wink14:52
tobiashcorvus: have to run now, I'll probably come back later14:57
ShrewsFYI, out doing some volunteer things this morning. Will be around in the afternoon.15:00
corvusrcarrillocruz: we also had the idea of a 'cleanup' job, which always gets run at the end of a buildset even if intermediate jobs get aborted.  that's not implemented yet, but that may be helpful for your use case15:02
rcarrillocruzwellll, that would be ideal... the reason why i asked about 'only run this periodic job if and only if no jobs in check/gate are running' was exactly for that15:04
rcarrillocruzthe tests on the repo clean up themselves15:04
rcarrillocruzbut there are chances they can be aborted15:04
rcarrillocruzthus cruft could be left behind15:04
rcarrillocruznow, i could create periodic 'cleanup' jobs that take semaphores tied to check/gate (that's what i plan to experiment with anyways), but if there was a cleanup concept that would be more than ideal for my use case15:04
pabelangermordred: corvus: to continue on with https://review.openstack.org/578230/ (zuul_return with zuul.child_jobs) we also discussed allowing a parent job to live longer to service child jobs, say for hosting artifacts (over pushing to external server), any thoughts or notes about how that could work?15:05
rcarrillocruzcorvus: does that sound as a reasonable 'cleanup' workaround till what you depicted is implemented? cannot think of another way of doing it tbh15:08
rcarrillocruzhowever, i would need to have $periodic_cleanup_jobs per semaphores, since it seems only one semaphore can be locked by any given job, which is not a huge deal for us anyway15:09
mordredpabelanger: there are thoughts on that concept written up in the container spec: https://review.openstack.org/#/c/56013615:12
pabelangermordred: thanks15:13
mordredrcarrillocruz: yah - I thnk periodic jobs on a semaphore is a good workaround for now15:13
rcarrillocruz\o/15:40
*** jiapei has quit IRC16:05
openstackgerritRonelle Landy proposed openstack-infra/zuul-jobs master: Update multi-node-bridge to install OVS from delorean deps repo  https://review.openstack.org/58108116:20
*** acozine1 has joined #zuul16:24
tobiashcorvus: regarding that but, it is unrelated to github and multi tenant so it even might have happened (maybe even unnoticed) also in openstack land16:34
tobiashs/but/bug16:34
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix job contamination by unmerged change  https://review.openstack.org/58094917:27
tobiashcorvus: squashed and tweaked the commit message, I hope it's understandable17:28
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix job contamination by unmerged change  https://review.openstack.org/58094917:30
*** fbo is now known as fbo|off17:52
SpamapShm18:25
SpamapStobiash: do you have any way to kick something out of gate if it is alread in the gate? Like, a label or something?18:25
SpamapSI have a thing called do-not-merge that will prevent it from enqueing via requirements.18:25
SpamapSbut I want something that actually dequeues it.18:25
tobiashSpamapS: that's currently not possible (other than closing the pr)18:25
corvustobiash: i'm still a little unclear as to the mechanics of the bug.  why was item.queue.pipeline.layout != tenant.layout?  if the config-project change merged, they should be the same thing18:26
tobiashcorvus: the config-project change didn't merge18:26
tobiashcorvus: however I didn't fully understand why it's not the same but still I think taking the layout from a fixed place is cleaner (and fixes the bug)18:27
corvustobiash: so did it re-use the active pipeline object when creating the new layout?18:27
tobiashcorvus: that might be the case18:28
SpamapSSo, with gerrit, we can remove the +A and it doesn't merge. Is that because gerrit will reject the "merge" API call without the +A?18:29
corvusSpamapS: yes18:29
SpamapSI wonder if there's a pipeline hack we can do where we make a "do-not-merge" pipeline that will set a status to failed if we drop a 'do-not-merge' label or remove the 'approved' label.18:29
corvusSpamapS: (in openstack-infra, folks push up a new patchset to cause a change to be dequeued from gate)18:30
SpamapSAh push new change would work18:30
tobiashSpamapS: you also could revoke the review to prevent zuul to merge18:30
corvustobiash: do you have a change with just your test addition?18:35
corvusi really need to understand the bug before we merge any fix18:35
tobiashcorvus: it's ps1 of that change18:35
corvustobiash: of 949?18:35
tobiashyes18:35
corvusthanks18:35
*** electrofelix has quit IRC18:36
corvustobiash: http://git.zuul-ci.org/cgit/zuul/tree/zuul/model.py#n299118:44
corvushrm, that's part of it, but it's not the whole story18:45
SpamapStobiash: we don't really use reviews18:45
SpamapSbecause we do need to self approve often18:45
tobiashcorvus: yes, that's the only place where the layout gets set (at least the only place I found)18:46
corvustobiash: oh, i think it's because it's a cached config object.  that's why this requires pipelines be in a different repo than the project-config being changed.  when we create the throwaway layout to validate the config for change A, we call layout.addPipeline on the pipeline objects.  if your pipelines and global config are in the same repo, the cache would be invalidated and you would get brand new18:52
corvuspipeline objects.  but in this case, we're using the cached pipeline objects (which are the *live* pipelines) and we add them to the layout.  because addPipeline has the side effect of setting the pipeline's layout, we've altered the pointer on the live pipeline object to point to a throwaway layout.18:52
corvustobiash: this kind of error would be caught for most config objects because they are made immutable when cached, but we don't do that for pipelines because we need the real pipeline objects to be added to dynamic layouts for untrusted changes.18:52
corvus(semaphores have similar characteristics)18:53
tobiashcorvus: so if we need the live pipelines and cannot make them immutable I think it makes sense that they should not contain any layout directly18:54
corvustobiash: i agree that since the tenant object is expected to be constant among all layouts within the tenant, then referencing that from the pipeline, rather than the layout, is an appropriate fix.18:56
tobiashcorvus: oh yes, I don't know how I've overlooked that, the second call to addPipeline alters the existing one18:56
tobiashI just thought it's a new one so I overlooked that detail18:56
tobiashthat would have saved me hours of debugging...18:57
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Switch to centos-release-openstack for ovspackage  https://review.openstack.org/58051818:57
corvustobiash: my print-based debug log: http://paste.openstack.org/show/725364/18:58
corvustobiash: i'm inclined to think that your change is the way we should fix it.  i want to think about it a little bit more over lunch in case any other subtleties jump out at me.  and we might want to move the pipeline.tenant assignment into the configloader, so that the modification side effect isn't as surprising18:59
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix job contamination by unmerged change  https://review.openstack.org/58094919:12
tobiashcorvus: like this ^ ?19:13
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Switch to centos-release-openstack-queens for ovs  https://review.openstack.org/58051819:14
*** adam_g has quit IRC19:35
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Require tenant in Pipeline constructor  https://review.openstack.org/58112920:22
corvustobiash: ^ that may be a little bit stronger of an assertion20:22
tobiashcorvus: lg20:25
tobiashT20:25
tobiashl20:25
tobiashlgtm20:25
* tobiash apologizes for fat finger spam20:26
corvustobiash: would you mind updating the commit message with the suggestions i left on there?  this is a complicated bug and if we have to revisit this at some point, i'd like the full story to make it easier.  other than that, change lgtm.20:29
tobiashSure, looking in a bit20:31
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix job contamination by unmerged change  https://review.openstack.org/58094920:40
tobiashcorvus: ^20:40
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Require tenant in Pipeline constructor  https://review.openstack.org/58112920:58
corvusrebase ^20:58
corvusclarkb: can you look at https://review.openstack.org/580948 and 2 children when you have a moment?20:59
corvusto clarify: 2 children of the change.  not just any 2 children.20:59
*** adam_g has joined #zuul21:46
*** acozine1 has quit IRC22:03
*** threestrands_ has joined #zuul22:20
*** threestrands_ has quit IRC22:20
*** threestrands_ has joined #zuul22:20
*** jappleii__ has joined #zuul22:23
*** jappleii__ has quit IRC22:24
*** jappleii__ has joined #zuul22:25
*** threestrands_ has quit IRC22:26
*** yolanda_ has joined #zuul23:06
*** hasharAway has quit IRC23:08
*** yolanda has quit IRC23:09
openstackgerritJames E. Blair proposed openstack-infra/zuul master: WIP: Report to gerrit over HTTP  https://review.openstack.org/57702723:20
*** ianychoi_ has joined #zuul23:21
*** goern_ has joined #zuul23:22
*** goern has quit IRC23:22
*** jimi|ansible has quit IRC23:23
*** jimi|ansible has joined #zuul23:24
*** ianychoi has quit IRC23:25

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!