Thursday, 2016-11-10

jeblair00:06 < openstackgerrit> James E. Blair proposed openstack-infra/infra-specs: Zuul v3: Add section on secrets  https://review.openstack.org/386281  00:07
jeblairi've updated that based both on conversations from the summit meetup as well as a subsequent conversation with mordred00:07
jheskethjeblair: left a comment if you're still around00:40
*** saneax is now known as saneax-_-|AFK00:53
jeblairjhesketh: thanks; replied and revision forthcoming00:54
jheskethcool, glad I wasn't completely off, will wait for the next iteration00:56
jeblair00:59 < openstackgerrit> James E. Blair proposed openstack-infra/infra-specs: Zuul v3: Add section on secrets  https://review.openstack.org/386281  01:01
jheskethresponded01:19
openstackgerritwatanabe isao proposed openstack-infra/nodepool: Add ssh timeout to client  https://review.openstack.org/329799  01:45
*** persia has quit IRC03:59
*** persia has joined #zuul04:01
*** bcoca has quit IRC05:58
*** abregman has joined #zuul06:05
*** harlowja_ has quit IRC06:33
*** openstackgerrit has quit IRC07:48
*** openstackgerrit has joined #zuul07:49
*** abregman_ has joined #zuul08:26
*** abregman_ has quit IRC08:26
*** abregman has quit IRC08:29
*** hashar has joined #zuul08:32
*** hashar has quit IRC08:32
*** abregman has joined #zuul08:32
*** hashar has joined #zuul08:38
*** hashar_ has joined #zuul08:41
*** hashar has quit IRC08:43
*** abregman has quit IRC08:52
*** hashar_ is now known as hashar09:06
*** abregman has joined #zuul09:08
*** yolanda has quit IRC09:31
*** yolanda has joined #zuul09:31
*** abregman is now known as abregman|afk10:47
*** abregman|afk is now known as abregman11:34
openstackgerritJan Hruban proposed openstack-infra/zuul: Support GitHub PR webhooks  https://review.openstack.org/163117  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Make the string representation of change transparent  https://review.openstack.org/238948  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Merge pull requests from github reporter  https://review.openstack.org/243250  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Encapsulate determining the event purpose  https://review.openstack.org/247487  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Fix job hierarchy bug.  https://review.openstack.org/192457  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: support github pull reqeust labels  https://review.openstack.org/247421  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Better merge message for GitHub pull reqeusts  https://review.openstack.org/280667  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Add 'push' and 'tag' github webhook events.  https://review.openstack.org/191207  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Add 'pr-comment' github webhook event  https://review.openstack.org/239203  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Support for dependent pipelines with github  https://review.openstack.org/247500  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Configurable SSH access to GitHub  https://review.openstack.org/239138  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: GitHub file matching support  https://review.openstack.org/292376  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Allow github trigger to match on branches/refs  https://review.openstack.org/258448  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Log GitHub API rate limit  https://review.openstack.org/292377  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Set filter according to PR/Change in URL  https://review.openstack.org/325300  13:01
openstackgerritJan Hruban proposed openstack-infra/zuul: Allow using webapp from connections  https://review.openstack.org/215642  13:02
openstackgerritJan Hruban proposed openstack-infra/zuul: Allow list values in template parameters.  https://review.openstack.org/191208  13:02
openstackgerritJan Hruban proposed openstack-infra/zuul: Add basic Github Zuul Reporter.  https://review.openstack.org/191312  13:02
openstackgerritJan Hruban proposed openstack-infra/zuul: Support for github commit status  https://review.openstack.org/239303  13:02
*** bcoca has joined #zuul14:30
openstackgerritMerged openstack-infra/nodepool: Remove OldNodePoolBuilder class  https://review.openstack.org/392884  14:48
*** yolanda has quit IRC15:11
dmsimardpabelanger: any news in regards to zuul merger namespacing ?15:22
pabelangerdmsimard: not yet, going to try and update feature/zuulv3 first. Make sure everybody agrees on that implementation, then see about back porting to zuulv2  15:25
*** yolanda has joined #zuul15:26
*** abregman has quit IRC15:33
openstackgerritMerged openstack-infra/zuul: Ansible launcher: move AFS publisher into a module  https://review.openstack.org/394658  16:20
*** harlowja has joined #zuul16:31
*** harlowja has quit IRC16:32
*** harlowja has joined #zuul16:34
*** hashar is now known as hasharAway16:40
SpamapSjeblair: so, I want to make sure we have only one source of truth. I was thinking I'd send an email to openstack-infra asking people to use storyboard for v3. How does that sound?17:59
* SpamapS should have done that back when the list went in as stories and tasks17:59
mordredSpamapS: that sounds completely sane to me18:00
jeblairSpamapS: context?18:03
jeblairwhat would the other source of truth be?18:03
SpamapSjeblair: pabelanger wasn't aware of storyboard and there was mention of etherpads18:04
SpamapSI just realized we haven't told everyone to do that.  18:04
pabelangerYa, I've been just using gerrit up until now18:05
jeblairSpamapS: yes.  now that i realize you are done setting up storyboard, i have retired the etherpad.18:05
pabelangerat least to check which tasks were done  18:05
jeblairSpamapS: i think an email would be good18:05
SpamapScool will send shortly18:09
mordredjeblair: weekly meeting is mon 2200 yeah?18:09
jeblairmordred: yes -- http://eavesdrop.openstack.org/#Zuul_Meeting  18:10
jeblairin -alt18:10
jeblairso everybody be sure to show up for that on monday :)18:11
*** Shuo_ has joined #zuul18:26
SpamapSsent18:27
*** jamielennox|away is now known as jamielennox18:37
ShrewsSpamapS: How come when I search https://storyboard.openstack.org/#!/search for the zuulv3 tag (as your email suggests), it does not return all items from your workboard (https://storyboard.openstack.org/#!/board/41), even though they are clearly marked with that tag?18:43
ShrewsAm I missing something about how storyboard works?18:43
jeblairShrews: i will note that the search result returns a very suspicious 10 results  18:46
Shrewsindeed18:47
jeblairthat's a number that shows up a lot in storyboard :/18:47
* jeblair heads over to #storyboard18:47
mordredzuulv3 search gets me a link to the workboard18:48
Shrewsmordred: try just the "tag" search. should show up as a popup in the search bar (with a tag icon next to it)18:49
jeblairmordred: yeah, i think that's a text search18:49
jeblairmordred: as compared with what Shrews said18:50
mordredAH18:51
mordredShrews: oh! I got a bunch of things18:52
mordred(more than 10)18:52
mordredjeblair: ^^18:52
Shrewsprofile says my page size is 10  18:53
Shrewsyet i cannot go to the next page18:53
jeblairyeah, if i set my profile to 100, i get more than 10  18:53
SpamapSwhen I search for the _tag_ I get all of it.18:54
SpamapSoh but yeah, I have it set to 100  18:54
jeblairmy profile resets to 10 every time i log in18:54
mordredjeblair: that's odd - fwiw, mine does not reset everytime I log in18:57
mordredseems like a bug18:57
Shrewsmine does reset18:58
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Combine ZKTestCase with DBTestCase  https://review.openstack.org/383962  19:03
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Supply ZK connection information to test configs  https://review.openstack.org/383963  19:03
jeblairShrews: what's still WIP about 592?19:04
Shrewsjeblair: i wanted to run nodepool-builder by hand to test the changes19:05
Shrewswhich i'm setting up to do now19:05
Shrewsjeblair: but so far it's working ok19:09
Shrews(CONNECTED) /nodepool/image/devstack-trusty/builds> get 000000000119:11
Shrews{"state": "building", "builder": "localhost.localdomain", "state_time": 1478804951}19:11
Shrewsy19:11
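
The build record Shrews pastes above is a JSON znode under /nodepool/image/devstack-trusty/builds. As a rough illustration (not nodepool's own code), reading and decoding it with kazoo, the ZooKeeper client library used by the nodepool ZK work, might look like the sketch below; the localhost ZooKeeper address is an assumption for a local test setup.

    # Sketch: read the build znode shown above. Assumes a ZooKeeper running
    # locally and the znode path/fields from the paste.
    import json

    from kazoo.client import KazooClient

    zk = KazooClient(hosts='localhost:2181')
    zk.start()

    data, stat = zk.get('/nodepool/image/devstack-trusty/builds/0000000001')
    build = json.loads(data.decode('utf8'))
    print(build['state'], build['builder'], build['state_time'])

    zk.stop()
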
mordredShrews: that seems positive19:13
Shrewsif post build data looks ok, and delete code runs ok, i'll un-WIP19:13
mordred\o/19:15
mordredShrews: zomg. that means we're pretty much like almost done with that and stuff amirite?19:16
clarkbwe should enable all the tests again too and get the integration test working19:16
Shrewsyeah, but did find a minor issue19:16
jeblairclarkb: yes, that is the plan19:22
jeblairShrews: cool, then ignore those two changes, i will re-rebase on 592 (like i said i would yesterday but just now remembered)19:26
Shuo_jeblair: what's the status of v3? (before alpha or ?)19:26
jeblairShuo_: yes, under heavy development and not able to actually be used at all yet19:27
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Transition ZK API from dict to object model  https://review.openstack.org/394592  19:37
Shrewsjeblair: ^^^  found a few minor issues19:38
Shrewsmain one is that 'formats' is an attribute of ImageBuild, not ImageUpload19:38
Shrewsduh19:38
Shrewsbut, un-WIP'd19:38
Shrewsand one place where i still treated it as a dict instead of obj19:40
Shrewsmordred: to be clear, i don't think we'll be "done with that" until we actually put it through the ringer with actual usage with actual providers and stuff. there HAVE to be bugs.19:47
Shrewsbut, i think we're at the point we need to start doing that19:48
mordredShrews: ++19:48
mordredyes - that's what I meant19:48
*** openstackgerrit has quit IRC19:48
clarkbit should be easy because we have much testing for it already19:49
*** openstackgerrit has joined #zuul19:49
clarkbfor the integration test I think you should be able to just update to make sure zk is installed and running and then that will tell you if the build, upload, boot, delete cycle works19:52
clarkbthough it might only delete the node not the image (adding an image delete too is easy though)19:52
Shrewsi'll have to spend some time learning the integration tests. haven't looked at that part yet19:53
mordredShrews: it's very similar to the shade tests - devstack plugin, runs nodepool against the devstack cloud19:54
clarkband then checks nodepool reaches expected steady states19:54
Shrewsthis is found in nodepool/devstack?19:55
clarkbyes19:55
clarkband then the check steady state script is in tools/ I think19:55
Shuo_jeblair: what's the estimated release time? If we setup a v2, what's the migration path?19:56
clarkbShrews: https://git.openstack.org/cgit/openstack-infra/nodepool/tree/tools/check_devstack_plugin.sh is the state check script, you probably want to add in image delete checking there too  20:01
clarkbShrews: and replace https://git.openstack.org/cgit/openstack-infra/nodepool/tree/devstack/plugin.sh#n309 with starting a zk20:01
clarkbShrews: and you can install zk by modifying files in https://git.openstack.org/cgit/openstack-infra/nodepool/tree/devstack/files20:01
Shrewsclarkb: well, gearman is still needed for part of nodepool, but i get what you mean. thx20:02
Shrewslet's hope zookeeper actually works  :)20:03
jeblairShrews: note that some of my changes are a pre-req for that; i will finish rebasing them after lunch20:04
Shrewsjeblair: ack20:05
openstackgerritIan Wienand proposed openstack-infra/nodepool: Add option to force image delete  https://review.openstack.org/396388  20:07
clarkbjeblair: the db test case reorg changes are the ones you mean that need a rebase?  20:16
* clarkb holds off on those for now then20:16
jeblairyep20:16
openstackgerritIan Wienand proposed openstack-infra/nodepool: Add option to force image delete  https://review.openstack.org/396388  20:31
jeblairShrews: 592 fails tests now20:48
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Transition ZK API from dict to object model  https://review.openstack.org/394592  20:51
jeblairShrews: ^ fixed, and also rebased to branch tip (which i needed to resolve a conflict in the stuff i'm stacking on it)20:52
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Combine ZKTestCase with DBTestCase  https://review.openstack.org/383962  20:53
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Supply ZK connection information to test configs  https://review.openstack.org/383963  20:53
Shrewsjeblair: bleh. thx20:54
clarkbjeblair: would mixing the zk test case into classes that need both be easier? also make it easier to untangle the two later if that becomes desirable?  20:59
jeblairclarkb: that was my first attempt.  testtools setup weirdness struck.21:06
openstackgerritIan Wienand proposed openstack-infra/nodepool: Add option to force image delete  https://review.openstack.org/396388  21:06
clarkbjeblair: huh ok21:07
jeblairclarkb: this seemed reasonable considering our trajectory (in reality, they will both almost always be used together until gearman is gone completely)21:07
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add getMostRecentBuildImageUpload method to zk  https://review.openstack.org/383964  21:08
clarkbya I think test_zk and the allocator tests are probably the odd ones out21:08
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Update waitForImage test method for ZK  https://review.openstack.org/383966  21:09
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Re-enable test_dib_image_list  https://review.openstack.org/383967  21:09
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Assume diskimage and image names are the same  https://review.openstack.org/383965  21:09
jeblairclarkb, greghaynes, Shrews: ^ there's the stack updated21:09
jeblair967 is actually the tip. we need to do that and also update the rest of the commands before the integration test since it uses them.  but we also still have to update the main nodepool daemon to get its image information from zk.  the getMostRecentBuildImageUpload method in 964 and having switched the model to use objects should make that relatively painless.21:12
jeblairoh, i think i need to update 967 more21:13
Shrewsjeblair: do you want to add a 'count' param to your new API to match the others?21:27
Shrewstotally don't have to, but just a thought21:27
clarkbanother random thought, its weird that you can't get all the uploads regardless of state (to me at least)21:28
jheskethMorning21:28
Shrewsyeah, the others have the option of ignoring state altogether21:29
jeblairShrews: i thought about it, but i'm leaning toward no because i think this is basically the primary output of the builder for the rest of the system.  nodepool launcher never needs more than the most recent upload21:29
jeblairit's the "get me the image i need to launch this node" method21:29
Shrewsjeblair: *nod*21:29
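
To illustrate the "most recent upload" lookup jeblair describes, a minimal sketch follows; the record fields are made up in the style of the build record pasted earlier, and this is not the actual zk API added in review 383964.

    def most_recent_ready_upload(uploads):
        """Return the newest upload record in the 'ready' state, or None."""
        ready = [u for u in uploads if u.get('state') == 'ready']
        if not ready:
            return None
        return max(ready, key=lambda u: u['state_time'])

    # Example with illustrative data:
    uploads = [
        {'state': 'ready', 'state_time': 1478804000, 'external_id': 'img-1'},
        {'state': 'uploading', 'state_time': 1478805000, 'external_id': 'img-2'},
        {'state': 'ready', 'state_time': 1478804951, 'external_id': 'img-3'},
    ]
    print(most_recent_ready_upload(uploads)['external_id'])  # -> img-3
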
clarkbreading the failed logs it seems to be getting the same build over and over and over even though the state is ready?21:32
jeblairi think we may not be uploading21:34
clarkbah21:34
clarkboh right it's builds/ that it is iterating through constantly so ya likely no upload data  21:35
jeblairyeah, and i don't see any log msgs from the uploader21:36
jeblairother than 'starting'21:36
clarkbhttp://logs.openstack.org/67/383967/4/check/gate-nodepool-python27-db-ubuntu-xenial/826558c/console.html#_2016-11-10_21_15_06_494643 and that confirms no nodes there21:37
Shuo_where can I find a good architecture description of zuul (hopefully one for v3 and one for existing one)21:52
openstackgerritCaleb Boylan proposed openstack-infra/nodepool: Fix subnode deletion  https://review.openstack.org/370455  21:54
jeblairclarkb, Shrews: it looks to be another manifestation of the diskimage name != image name problem21:55
jeblairIebe873bae1cf71a56ad9b791a9e6751a4e15043e is what broke it21:55
jeblairworking on a fix21:56
jeblairShuo_: http://docs.openstack.org/infra/publications/zuul/#(1) and http://docs.openstack.org/infra/zuul/quick-start.html#zuul-components may help with v2  22:01
jeblairShuo_: http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html may help with v3  22:02
Shuo_jeblair: thanks!22:02
jeblairShuo_: none of those are easily consumable though, sorry.  we will address that for the v3 release.22:02
Shuo_jeblair: hopefully get some whiteboard time to consume it :-)22:03
jeblairclarkb, Shrews: okay, i have a fix, but there are a couple more fixes i need to untangle.  will have patches soon.22:11
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Update waitForImage test method for ZK  https://review.openstack.org/383966  22:23
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Re-enable test_dib_image_list  https://review.openstack.org/383967  22:23
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add getMostRecentBuildImageUpload method to zk  https://review.openstack.org/383964  22:23
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Assume diskimage and image names are the same  https://review.openstack.org/383965  22:23
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Use diskimage name when looking up image on disk  https://review.openstack.org/396422  22:23
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Override the cleanup interval in builder fixture  https://review.openstack.org/396423  22:23
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add __repr__ methods to ZK objects  https://review.openstack.org/396424  22:23
jeblairclarkb, Shrews: ^ I think that should do it.22:23
Shuo_In the architecture of a zuul-based CI infrastructure setup, is nodepool's purpose/role to bring VMs with the right type of base image online (through the nova API)?  22:24
mordredShuo_: yes. and also to manage base-images that you'd use to boot vms22:28
mordredShuo_: and also to, based on config, keep a pool of available nodes so that tests don't  have to wait for nova to boot a vm22:28
clarkbit also does some scrubbing of instances to avoid using "bad ones" and can do per region configuration22:28
Shuo_mordred: 1) does nodepool maintain states (the answer seems a 'yes' from your "keep a pool of available nodes")?; 2) It sounds like the continuous integration infra team consumes a constant large pool of capacity, regardless of whether there are a lot of jobs in flight. is this understanding true?  22:31
clarkbyes it keeps states, building, ready, used, delete currently22:33
mordredthe consumed capacity does vary with demand22:33
mordredso although it does keep some amount of nodes around if there is no demand, it's a minimal amount compared to peak demand times22:34
clarkbright, we run "under demand" most of the time22:34
clarkbholidays, portions of weekends, and summits are just about the only times we drop below22:34
Shuo_To add to what I meant by 2), say I have set up a 400-machine dell openstack cluster, and use it as the capacity of the zuul + nodepool service. If I set up (say) 1000 VMs of 4GB in my nodepool config, then even if I have zero jobs in flight in the queue, I will be occupying 1000 VMs  22:35
clarkbjeblair: if you don't skip all those tests in test_commands.py do they all fail?22:35
clarkbShuo_: no, nodepool is demand based, you set a minimum level to be ready at any time and thats how many you get with zero jobs in flight22:36
clarkbShuo_: it will then expand capacity up to 1k VMs as demand requires it22:36
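
A toy sketch of the demand-based sizing clarkb describes (hold a configured minimum of ready nodes, grow with demand, never exceed the provider cap); the function and numbers are illustrative, not nodepool's actual allocator logic.

    def nodes_to_launch(min_ready, demand, ready, building, max_servers):
        want = max(min_ready, demand)   # hold the idle floor, or meet demand
        have = ready + building         # nodes already available or on the way
        return max(0, min(want - have, max_servers - have))

    # Zero jobs in flight: only the configured minimum is kept around.
    print(nodes_to_launch(min_ready=10, demand=0, ready=10, building=0, max_servers=1000))    # 0
    # Under load, the pool grows toward (but not past) the cap.
    print(nodes_to_launch(min_ready=10, demand=400, ready=10, building=5, max_servers=1000))  # 385
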
Shuo_clarkb: thanks and make sense to me now.22:37
Shuo_clarkb: any discussion/brainstorm on using a container-oriented cluster as the resource pool? Say, we have a cluster of Kubernetes or Mesos already, and we want to use that pool of machines (mesos/kubernetes cluster) as our zuul solution's capacity. Can it fit in?  22:41
Shuo_clarkb: if it can, replacing or extending what part?  22:41
clarkbI think in the past we have said that it would be nice if you continued to use the nova api because that can do baremetal, VMs, and containers and we don't have to change anything in nodepool  22:42
Shuo_http://www.ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/  22:42
clarkbIt would probably be a significant amount of work to add in non-openstack image management and instance boot/deletion  22:43
clarkbyou'd basically be rewriting most of nodepool to add that in22:43
jeblairclarkb: most of them will probably fail22:44
Shuo_clarkb: interesting, are you saying this can be done: keep all other parts as is and replace the nodepool component in the picture?  22:44
clarkbShuo_: its possible. It would be a large effort though22:46
clarkbI'm not sure how valuable it would be as a result (at least in their example they already have an openstack you could talk to)  22:46
mordredthat said - running k8s workloads is a thing we'd like to support - but using k8s or mesos to manage resources instead of openstack would not really be a win for us22:48
clarkbmordred: we already do support them :)22:48
Shuo_clarkb: "at least in their example they already have an oepnstack you could talk to", but our reality is we have a huge mesos cluster running. Can't bend the existing infrastructure too much.22:48
clarkbkolla at least is doing it today on top of zuul+nodepool22:48
clarkbI think magnum and higgins too?22:48
mordredclarkb: I mean without first spinning up a VM to install the k8s in22:49
*** hasharAway has quit IRC22:49
clarkbmordred: right its not consuming k8s resources, but it is testing k8s things22:49
mordredyah22:49
clarkbShuo_: I'm not sure I understand how you would put them together either - isn't mesos fairly static?  22:49
clarkbShuo_: not sure how nodepool would help there22:49
clarkbk8s too right?22:49
mordredShuo_: zuul+nodepool is very much focused on operating on things that behave like computers22:49
clarkbeg you wouldn't have nodepool do much with them22:50
mordrednot abstract container execution environments22:50
clarkbyou'd maybe have things register with zuul and zuul execute on them directly22:50
timrcyolanda made a nodepool-like service that spit out containers using k8s, Shuo_ -- totally randomly jumping in here w/out much context so not sure if that's useful or not :)22:50
clarkbat least from my first impression I think thats how I would try to structure it22:50
clarkbbasically nodepool registers openstack resources with zuul, $other thing could register mesos resources with zuul22:51
mordredclarkb: yah - that was somewhat what I was originally thinking - a nodepool provider that returned a k8s endpoint  22:51
mordredyah22:51
mordredit seems there are k8s namespaces - so a "single user k8s endpoint" could potentially be a namespace thing inside of an existing k8s22:52
mordredthis is all very handwavey at this point22:52
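
Sketching the "handwavey" provider idea above in code terms, purely as a hypothetical shape; the class and method names are invented for illustration and are not an existing nodepool or zuul interface.

    import abc

    class ResourceProvider(abc.ABC):
        """Hypothetical pluggable 'give me something to run a job on' interface."""

        @abc.abstractmethod
        def request_resource(self, label):
            """Return connection info (an IP, or a k8s namespace endpoint)."""

        @abc.abstractmethod
        def release_resource(self, resource):
            """Tear down whatever request_resource() handed out."""

    class KubernetesNamespaceProvider(ResourceProvider):
        """Toy example: a 'single user k8s endpoint' as a per-job namespace."""

        def __init__(self, api_url):
            self.api_url = api_url

        def request_resource(self, label):
            # A real implementation would create the namespace via the k8s API
            # and return credentials scoped to it.
            return {'endpoint': self.api_url, 'namespace': 'job-%s' % label}

        def release_resource(self, resource):
            # ...and delete the namespace here.
            pass
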
Shuo_clarkb: "isn't mesos fairly static?" No and Yes. No, because you can start containers of jenkins on demand (number of occupied capacity is very dynamic); Yes, although your 400 dell machines are all management by mesos (400 is a static thing, but that's the same for OpenStack as well)22:52
mordredin some ways, parts of this may be easy - ansible can talk to k8s and mesos just fine22:55
Shuo_clarkb: did that answer your question?22:55
mordredhowever, the hard part, from when we've talked with container people about this problem space22:55
clarkbShuo_: sort of, iirc mesos pre chunked its resources22:55
clarkbShuo_: so you basically consume the entire set at all times which is your problem with using openstack next to mesos because mesos has taken over the whole set22:55
mordredis that zuul v3 will want to prepare git repository states for a given job and then rsync those contents to a VM22:55
clarkbShuo_: I think a better way to describe it might be "mesos is single tenant"22:56
mordredso if things are operating in a container model, we're going to need to figure out the git repo data transfer  22:56
clarkbso yes while openstack consumes the entire set of compute resources it schedules on them dynamically in a multitenant manner  22:56
mordredit is on the list of things to figure out - but it's pretty low on the list right now to be honest22:56
Shuo_clarkb: no, mesos does not pre-chunk resource (it labels your machines, so that you can say things like "Job A can only run on machine with Label-I", but that's totally optional)22:56
mordredbecause of the things we need to do to get zuul v3 up and going just in the current non-container oriented world22:56
clarkbmordred: I mean it will just work in a container world too22:57
clarkbmordred: you just have to treat containers as instances not processes  22:57
clarkbsame as baremetal22:57
timrc"system containers"?22:57
mordredright. that's not actually doing the thing that k8s people want22:57
clarkb(apparently people are using nodepool with baremetal which is cool)22:57
mordredso isn't really a good analog22:57
clarkbmordred: sort of22:57
clarkbmordred: thats not what you want for deploying applications in production22:58
clarkbbut for CI I think we learn over and over and over again that you need to turn those knobs very often22:58
mordredclarkb: so - if you have a container-based appliation that is in git repos A and B22:58
clarkbbecause you aren't just running unittests, you are also testing that nginx/apache/haproxy/mysql/etc work together22:58
mordredto deploy that, the thing you need to do is build containers a and b from the content in git repos A and B22:59
timrcI don't think i'd abstract k8s as a nodepool provider if what I'm getting back is an endpoint to an immutable process container... might as well just have zuul interface k8s directly.22:59
clarkbyes that is what kolla is basically doing today22:59
mordredthe questions are "where do you build the containers", "where do you upload them" and "how do you actually run the deploy based on them"  22:59
clarkbtimrc: yup thats what I was saying earlier :)22:59
Shuo_clarkb: concept of tenancy in mesos is quite different from tenancy in openstack; though it's hard to have a 1-1 mapping, people can do some cheating.22:59
Shuo_do some cheating and achieve such mapping.23:00
mordredit's possible we could decide that we want to teach the zuul core about building containers23:00
timrcclarkb: sweet :)23:00
mordredbut that seems like a bad idea23:00
Shuo_fire alarm, ttyl23:00
clarkbtimrc: mordred I don't think zuul would build the containers23:00
clarkbor at least not as part of zuul itself23:00
mordredwhere do you think they would be built?23:00
clarkbbut the interaction of "run a job over here"  23:00
clarkbthat seems like something that zuul should figure out between k8s/mesos directly without a nodepool23:01
clarkbbecause its fairly inelastic23:01
clarkbk8s cluster is X big, use it  23:01
clarkbsame for mesos23:01
mordredso - let's ignore nodepool for a second23:01
mordredit's not the hard part23:01
mordredthe hard part is - how do you go from speculative git repository state to containers running in k8s23:02
clarkbmordred: the "easy" way is the way we do it today23:02
clarkbmordred: or at least the theoretical future state of how to do it better today23:02
mordredright. which requires either a VM or a container running multi-process pretending to be a VM  23:03
clarkbgive people "real" computers to do their work on and be flexible23:03
mordredright23:03
timrcThat ^^ btw would be useful for lint tests :)23:03
timrcBut I digress :)23:03
mordredbut there's this whole world of container people who explicitly do not want to be given 'real' computers23:03
mordredbut who want to express all of their things in terms of collections of interrelated containers23:04
timrcYeah so you have a job that uses k8s instead of nodepool and asks for pods with $image and you get back the ips into that pod?  23:04
clarkbtimrc: ya that is basically what I had in mind23:05
clarkbespecially with zuulv3 where its already "give me N of image foo" from openstack via nodepool23:05
clarkbyou get back a "primary" run your testing from that whatever it is that you need to do to assert state23:05
clarkbwhich leads into mordreds thing23:06
clarkbof how do we make the new image that you ask the things to boot23:06
mordredyah23:06
clarkbas part of a speculative git future world that may or may not exist23:06
mordredwithout having each user have to script doing that themselves23:06
mordredsince for computer based workloads we provide a VM that has a bunch of state shoved in to it for them23:06
mordredwe'll figure it out eventually :)23:06
* mordred has to run to dinner23:07
jeblairin the mean time, we need to get it to run.  at all.23:07
timrclol23:07
clarkbmordred: this is sort of related to that thing from earlier and the reason I like having entirely self contained units or things that appear that way is it makes it very easy to reproduce locally  23:07
clarkbwhich is why "job A to compile feeds into job B to build package feeds into X jobs to assert things" is something I am less a fan of  23:08
clarkbI like "Job 1 asserts X and also compiles and builds package and installs package. job 2 asserts Y and also compiles and builds package and installs package"  23:08
clarkbthen you figure out how to make the cost of those steps cheap when repeated. Because now I can just grab that same thing, run locally, and boom I reproduced  23:09
clarkbwithout needing to understand a complicated graph of build targets23:09
timrcJust like we have the ansible launcher to run ansible against newly provisioned nodes, I imagine you'd have some sort of k8slauncher thing that would create images/apply deltas and upload those images to k8s.  I dunno that sounds complicated :)  23:09
clarkbtimrc: that assumes I have a k8s right?23:10
clarkbbut if you flip it around and make having a k8s part of the process now its easy23:10
clarkb(which is the setup we use today)23:10
clarkb(unfortunately we haven't solved the make those steps cheap part)23:10
timrcHm, yeah.23:10
openstackgerritIan Wienand proposed openstack-infra/nodepool: Add option to force image delete  https://review.openstack.org/396388  23:11
clarkbI also fully acknowledge that if you want things to be performant taking those shortcuts is often necessary (at least due to time constraints)  23:12
Shuo_clarkb: following up on the topic of mesos/k8s. If we think of zuul's core value (great gating system, great serialized checking with speculative test runs), then it does not care whether it's an openstack cluster that provides the test machine capacity or a mesos cluster doing so -- it simply asks for capacity to execute speculative jobs.  23:19
Shuo_so although I don't particularly understand the architecture details yet, I feel there could be synergy here.  23:19
jeblairShuo_: yes, that's part of why there's an interface between zuul and nodepool, to allow for this kind of flexibility23:20
clarkbyup exactly23:20
clarkbthe interaction between zuul and nodepool is a fairly well defined api23:21
Shuo_one more thing: zuul may even be able to consume aws spot instances (I know this is an OpenStack community, but I am just sharing our current non-openstack perspective)  23:21
clarkband should be useable by things other than nodepool23:21
jeblairShuo_: agreed.  our plan is to focus first on openstack's usecase, then the ansible community (which is less tied to openstack and uses github), then other uses, including containers23:22
clarkband in fact I don't think zuul has ever required nodepool23:23
jeblairShuo_: right now, we're at a stage where we're learning about container-based workflows23:23
clarkbthe old jenkins based system could've used jenkins + mesos23:23
clarkband new system can plug into zuul's resource allocation system23:23
jeblairShrews, clarkb: 394592 through 383967 pass tests23:23
Shuo_To me (and thanks a lot to clarkb, mordred and jeblair for helping out with my rudimentary questions), zuul has two primary types of partners: 1) gerrit/gitlab/github -- the developer interface; 2) nodepool+openstack cluster / some-kind-of-Mesos_driver + mesos cluster / some-kind-of-aws-driver + EC2 account /....  23:25
clarkbShuo_: I don't know for sure because I haven't tested it but I would expect that zuul 2 + jenkins gearman + jenkins mesos would just work23:26
Shrewsjeblair: will review when less drunk. Which reminds me, where is olaph???23:27
clarkbyou might have to hack it so that jenkins mesos forces gearman to register jobs but otherwise that should do things  23:27
Shuo_and 2) is resource capacity interface.23:27
Shrewsolaph: we totes need to hang out23:27
Shuo_jeblair: I'd love to volunteer for the container workflow if that helps the zuul community -- I happened to have the chance to see both the nova/vm side of the story/workflow and the mesos/container side of the workflow/story  23:28
pabelangerclarkb: on the run-containers-on-a-VM thing, I do think it would be neat for nodepool to launch a VM, then somehow flag it to be static for x amount of runs, to let some container jobs do things. Then, have nodepool delete that vm and repeat.  23:29
pabelangerso, people get container things for lint jobs23:30
clarkbpabelanger: why not use nova's container drivers though?23:30
pabelangerbut we still don't have to manually create static nodes  23:30
clarkbI think you are solving it at the wrong layer by having nodepool do that23:30
olaphShrews: I'm on the west coast23:31
pabelangerclarkb: because not all our clouds support that?23:31
olaphgetting my protest on outside jeblair's house23:31
pabelangerI have no idea, if that is true23:31
clarkbmesos, k8s, openstack, docker swarm, etc all solve that problem23:31
pabelangernever tried using nova container23:31
clarkbit seems weird to try and solve it in nodepool  23:31
Shrewsolaph: bah. Come to the Mexican restaurant below my house and drink margaritas!23:32
clarkbbut also lint jobs run fine on the "bigger" VMs23:32
pabelangerwell, I know ansible has an lxc task, it would be neat to use that interface23:32
pabelangerbut I haven't tried it before23:32
clarkband they won't run faster on containers, so it feels like optimizing for a problem that doesn't exist23:32
pabelangerclarkb: right, I think in our case, we can start using smaller vms for lint tests23:32
Shuo_clarkb: "I don't know for sure because I haven't tested it but I would expect that zuul 2 + jenkins gearman + jenkins mesos would just work" -- that would be super! give me enough appetizer to recruit internal partners :-)23:32
pabelangerbut the common argument is people don't have the budget to keep a lot of VMs going  23:33
pabelangerclarkb: speaking of smaller VMs, when do we want to start on that?23:33
clarkbpabelanger: for us booting smaller VMs doesn't really help much23:33
clarkbpabelanger: since we have a fixed instance quota23:33
clarkbbut people can boot smaller VMs with nodepool23:33
clarkbif they want to reduce consumption of resources23:34
pabelangerya, capacity isn't an issue right now for sure23:34
clarkbeven if capacity was an issue for us using smaller VMs would not help23:34
clarkbbecause in all of our clouds we either have a fixed number of IPs or a fixed number of total instances or both23:34
pabelangerYa, ipv6 helps23:35
pabelangerbut, agreed23:35
Shuo_clarkb: in the container world, there is pretty good adoption of ip-per-container, so the number of IPs is not an issue (you can use a 10.0/16 and that gives you a huge runway)  23:36
clarkbif nova says you get 100 instances, making 100 small instances really doesn't do much except for the cloud scheduling23:36
clarkbShuo_: we require public IPs because we run things all over the world and don't want to deal with the headache of resolving a broken ipv4 world  23:36
clarkbShuo_: so as pabelanger says ipv6 is nice :)23:37
Shuo_oh, I missed that part.23:37
clarkbShuo_: we run in something like 7 clouds and 11 regions? I forget the current number (it fluctuates a bit) and in at least 3 countries on two continents23:38
Shuo_clarkb: but even for multiple clusters around the world, there could be a private IP space solution (we are doing that now on aws): say we have a 400-machine cluster in AWS EAST-1 and you have a 600-machine cluster in AWS WEST-1; we can set up the virtual gateway and peering (set up the layer 3 connectivity) in your private ip space. But I know AWS gave us more weapons to use -- not a comparable situation.  23:41
clarkbya its a bit different when you are dealing with a single cloud23:42
Shuo_clarkb: but it's great to hear you feel the current code base might just work :-)23:42
clarkbShuo_: ya I think it will come down to whether or not jenkins mesos gives jenkins gearman enough to register the jobs23:45
clarkbbut once the jobs are registered via gearman zuul should be able to schedule them and go crazy23:45
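
For reference, a hedged sketch of the gearman hand-off being described: with Zuul v2, the Jenkins gearman plugin registers a function per job and Zuul submits work to it. This uses the gear library Zuul itself uses and assumes a gearman server (e.g. zuul's geard) on 127.0.0.1; the "build:<job>" function name follows Zuul v2's convention, while the example job name, payload, and client/worker ids are made up.

    import json

    import gear

    # Worker side: what the Jenkins gearman plugin does for each configured job.
    worker = gear.Worker('example-worker')
    worker.addServer('127.0.0.1')
    worker.waitForServer()
    worker.registerFunction('build:example-job')

    # Client side: roughly how Zuul asks for a build to be run.
    client = gear.Client('example-client')
    client.addServer('127.0.0.1')
    client.waitForServer()
    params = json.dumps({'ZUUL_PROJECT': 'openstack-infra/zuul'}).encode('utf8')
    client.submitJob(gear.Job('build:example-job', params))

    # Worker picks up the request and reports a result.
    job = worker.getJob()
    job.sendWorkComplete(json.dumps({'result': 'SUCCESS'}).encode('utf8'))
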
