ianw | any idea why this pull request with a depends-on: for system-config -> https://github.com/philpep/testinfra/pull/494/commits/c9320bce0708fce074ecb3e1cb06591b99f4c0ee | 00:29 |
---|---|---|
ianw | ends up in -> https://zuul.opendev.org/t/openstack/build/54d18384d52d4f7f859a08bcb0f82b92/logs | 00:30 |
ianw | not seeming to notice the depends-on? | 00:30 |
ianw | https://169d8cf58d4d97c4b367-c86e09fdca941b34a3fcb115361b332e.ssl.cf2.rackcdn.com/494/c9320bce0708fce074ecb3e1cb06591b99f4c0ee/third-party-check/system-config-run-base-ansible-devel/54d1838/zuul-info/inventory.yaml | 00:30 |
clarkb | ianw depends on goes in the first header comment of the PR | 00:33 |
clarkb | not the commit message | 00:33 |
ianw | clarbk: does that mean i need to edit my comment in https://github.com/philpep/testinfra/pull/494#issue-314742565 ("don't merge this, testing only")? | 00:37 |
ianw | clarkb: sorry, ^ :) | 00:38 |
ianw | ahh, it appears the answer is yes! | 00:39 |
clarkb | yup | 00:41 |
ianw | https://github.com/philpep/testinfra/pull/494#issuecomment-529265364 | 00:59 |
ianw | i'm so happy to have that finally working i added a hooray emoji | 00:59 |
*** sgw has quit IRC | 01:22 | |
*** sgw has joined #zuul | 01:44 | |
mnaser | https://github.com/vexxhost/nodepool-provider - big "wip"/"poc" thing but i'm trying to work my way to get a nodepool provider with many resources. as of now it just prints out the zookeeper endpoints for the service in cli, but there's a bit more work left (imho) to get it at least to use the k8s cluster its running on | 02:39 |
mnaser | i'll probably have resources for each (i.e. NodepoolKubernetesProvider) or something and that will systematically update the nodepool config | 02:40 |
mnaser | the goal is to be nothing but building components, not a "distro" | 02:40 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Add a netconsole role https://review.opendev.org/680901 | 02:46 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Add a netconsole role https://review.opendev.org/680901 | 03:41 |
*** jank has joined #zuul | 04:11 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Evaluate CODEOWNERS settings during canMerge check https://review.opendev.org/644557 | 05:15 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Add a netconsole role https://review.opendev.org/680901 | 05:29 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Add a netconsole role https://review.opendev.org/680901 | 05:44 |
*** raukadah is now known as chandankumar | 05:46 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Add a netconsole role https://review.opendev.org/680901 | 06:02 |
*** jank has quit IRC | 06:16 | |
*** bolg has joined #zuul | 06:18 | |
*** bolg has quit IRC | 06:23 | |
*** bolg has joined #zuul | 06:24 | |
*** saneax has joined #zuul | 06:25 | |
openstackgerrit | Merged zuul/zuul-jobs master: Switch releasenotes to fetch-sphinx-tarball https://review.opendev.org/678429 | 06:28 |
*** snapiri has quit IRC | 06:35 | |
*** snapiri has joined #zuul | 06:36 | |
*** snapiri has quit IRC | 06:36 | |
*** mattw4 has joined #zuul | 06:44 | |
*** snapiri has joined #zuul | 06:49 | |
*** shachar has joined #zuul | 06:58 | |
*** mattw4 has quit IRC | 06:59 | |
*** snapiri has quit IRC | 07:00 | |
*** saneax has quit IRC | 07:08 | |
*** saneax has joined #zuul | 07:08 | |
*** aluria has quit IRC | 07:33 | |
*** themroc has joined #zuul | 07:33 | |
*** jpena|off is now known as jpena | 07:35 | |
*** aluria has joined #zuul | 07:38 | |
*** jangutter has joined #zuul | 07:47 | |
*** pcaruana has joined #zuul | 07:49 | |
*** sshnaidm|afk is now known as sshnaidm | 07:51 | |
*** sshnaidm is now known as sshnaidm|ruck | 07:51 | |
*** bjackman has joined #zuul | 07:54 | |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Fix timestamp race occuring on fast systems https://review.opendev.org/680937 | 08:04 |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Fix timestamp race occuring on fast systems https://review.opendev.org/680937 | 08:06 |
*** panda has quit IRC | 08:09 | |
*** panda has joined #zuul | 08:11 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Add a netconsole role https://review.opendev.org/680901 | 08:14 |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Fix timestamp race occurring on fast systems https://review.opendev.org/680937 | 08:59 |
*** arxcruz_pto is now known as arxcruz | 09:27 | |
*** shachar has quit IRC | 09:53 | |
*** snapiri has joined #zuul | 09:53 | |
*** spsurya has joined #zuul | 10:21 | |
*** hashar has joined #zuul | 10:25 | |
*** panda is now known as panda|rover | 10:42 | |
*** bjackman_ has joined #zuul | 10:46 | |
*** bjackman has quit IRC | 10:48 | |
flaper87 | just to confirm, all the zuul container images are the same except for their CMD, right? | 10:51 |
*** noorul has joined #zuul | 10:53 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support https://review.opendev.org/674092 | 10:54 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support https://review.opendev.org/674092 | 10:54 |
noorul | Hi | 10:55 |
noorul | Does one need to sign CLA for contributing to Zuul project? | 10:55 |
*** shachar has joined #zuul | 11:02 | |
*** snapiri has quit IRC | 11:03 | |
*** sshnaidm|ruck is now known as sshnaidm|afk | 11:15 | |
zbr | i was trying to investigate why zuul console does not wrap long lines and I made an interesting discovery: we have a table inside a pre element. AFAIK that was illegal in HTML. | 11:19 |
zbr | i found a less ideal trick to prevent the horizontal scroll from appearing, adding overflow-x: hidden; for html block. | 11:25 |
*** jpena is now known as jpena|lunch | 11:35 | |
*** shachar has quit IRC | 11:48 | |
*** bjackman_ has quit IRC | 11:49 | |
*** snapiri has joined #zuul | 12:02 | |
*** noorul has quit IRC | 12:02 | |
*** rlandy has joined #zuul | 12:03 | |
*** rfolco has joined #zuul | 12:06 | |
*** gtema has joined #zuul | 12:23 | |
*** jpena|lunch is now known as jpena | 12:31 | |
*** sshnaidm|afk is now known as sshnaidm|ruck | 12:32 | |
*** saneax has quit IRC | 12:40 | |
*** saneax has joined #zuul | 12:41 | |
*** saneax has quit IRC | 12:51 | |
*** saneax has joined #zuul | 12:52 | |
fungi | if noorul returns or looks at the channel log, there is no cla enforcement configured for zuul's repositories in gerrit | 13:14 |
*** gtema has quit IRC | 13:21 | |
Shrews | fungi: how did your property do? | 13:23 |
*** Miouge has joined #zuul | 13:29 | |
Shrews | flaper87: that would appear to be the case, except the zuul-executor has data in /usr/local/lib/zuul that the others may not, based on https://opendev.org/zuul/zuul/src/branch/master/Dockerfile#L47 | 13:31 |
*** jangutter_ has joined #zuul | 13:34 | |
*** jangutter has quit IRC | 13:35 | |
corvus | mnaser: i'm confused, what is https://github.com/vexxhost/nodepool-provider ? | 13:37 |
mnaser | wow, dont commit code when you're tired, that is meant to be nodepool-operator. | 13:38 |
mnaser | corvus: it is a (wip) kubernetes nodepool-operator that is much more low level and granular (rather than "get me a nodepool") | 13:38 |
mnaser | i.e. think more roles rather than a playbook | 13:39 |
corvus | mnaser: there is an effort in progress to develop a zuul and nodepool operator -- it is intended to support nodepool alone too -- is there a reason that wouldn't work for you? | 13:39 |
mnaser | corvus: the few things that seemed to be different was that the zuul and nodepool operators spec seems to imply its a "one resource to rule them all" | 13:40 |
mnaser | i.e. it does things like deploy postgres/mysql, etc | 13:40 |
corvus | mnaser: no, that's optional | 13:40 |
mnaser | it seemed pretty hard-wired in from what i saw in the playbooks | 13:41 |
corvus | mnaser: did you read the spec? | 13:41 |
mnaser | i did go over it, it didn't seem to be encouraging of things like creating a resource called "ZuulTenant" etc | 13:41 |
*** bjackman_ has joined #zuul | 13:41 | |
mnaser | maybe i misinterpreted things | 13:41 |
corvus | mnaser: okay, let's take things one at a time | 13:41 |
corvus | mnaser: here's the section on external deps: https://zuul-ci.org/docs/zuul/developer/specs/kubernetes-operator.html#external-dependencies | 13:42 |
corvus | mnaser: there should be a config override setting to allow a deployer to say “I already have one of these that’s located at Location, please don’t create one.” | 13:42 |
corvus | er that was a quote | 13:42 |
corvus | so that's the spec saying that we're not going to require that the operator control your rdbms, etc. you can provide it | 13:43 |
corvus | mnaser: regarding zuultenant -- that's correct, we absolutely don't want the operator to require you to create a zuultenant crd. there's a really good reason for this which we discussed while developing the spec | 13:43 |
corvus | the tenant config is already a yaml file | 13:43 |
mnaser | right, but what if you wanted to expose that ZuulTenant or ZuulProject resource so that certain specific users can add/remove projects without having a single owner of that tenant yaml config file? | 13:44 |
corvus | it's not useful to *require* users of the zuul operator to write a *different* (k8s) form of yaml, which we would then have to document, merely to have that transformed into the zuul version | 13:44 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: Add prepare-workspace-openshift role https://review.opendev.org/631402 | 13:45 |
corvus | mnaser: i could totally see building something like that on top of the operator | 13:45 |
mnaser | so zuul-...lifecycle..-operator or something along the lines? | 13:45 |
mnaser | (im just throwing a name out there) | 13:45 |
corvus | yeah (or heck, maybe even build it in to the operator as an optional thing?) -- the main thing is that there is no one right way to build a tenant.yaml file -- it sounds like you want to do it using k8s primitives and access control -- other folks wouldn't want to expose k8s access to the project "owners" and might want to have their own management app write out the yaml, or use code review, etc | 13:47 |
corvus | if we focus on making the zuul operator a good operator for the zuul application as it is, then we've got a good platform to build on top of | 13:47 |
corvus | now, the thing about the current source code in the zuul-operator repo is that it is an initial POC from tristan that was developed before the spec. basically, it got us to the point where we could discuss things in the spec, and what we came up with is different. so the current work effort there is going to be to update the code in the operator repo to match the spec | 13:49 |
mnaser | i thought the order was spec => zuul/zuul-operator | 13:50 |
mnaser | which explains a bit of my confusion and the "well this doesn't seem to add up, the code says otherwise" | 13:50 |
corvus | we had an intern working on that, but that fell through. SpamapS and pabelanger were interested on friday in contributing. if you can help out too, we could probably get something going pretty quickly | 13:50 |
corvus | ideally, yes | 13:50 |
tristanC | corvus: mnaser: we also prototyped a sf-operator that setup a gerrit and inject the configuration to zuul: https://softwarefactory-project.io/cgit/software-factory/sf-operator/tree/ansible/roles/deploy_zuul/tasks/main.yaml | 13:50 |
mnaser | i can put a significant amount of time working on it to be honest right now | 13:50 |
mnaser | personally, i have really enjoyed working with the controller-runtime in golang because it gives a ton more flexibility | 13:51 |
mnaser | like watching other events (i.e. services) and triggering non-owners for reconcile (i.e. watching for non-owned services autoscale events) | 13:51 |
mnaser | i.. dunno how much of that power we can get with the ansible stuff | 13:52 |
corvus | mnaser: how about i set up a quick storyboard with what i think are the big work items on it and we can take a look | 13:52 |
fungi | Shrews: very minor wind damage and limited runoff flooding in the downstairs, as best we can tell. all in all a non-event, stuff i can take care of myself for the most part | 13:52 |
*** bjackman_ has quit IRC | 13:52 | |
corvus | fungi: \o/ | 13:52 |
mnaser | corvus: id be happy to start picking them up. is the decision to use the ansible-variant of the operator an already decided one? | 13:53 |
fungi | finally got home late last night and it's torrential thunderstorms all today but hopefully tomorrow i can check more closely and take all the plywood back down off the windows | 13:53 |
corvus | mnaser: there are a couple advantages of ansible: 1 -- we have a lot of deployment tooling using ansible we can repurpose into the operator -- 2 tristanC has a bunch of ansible stuff in the current tree that probably just needs minor rearranging to work | 13:53 |
AJaeger_ | fungi: glad to hear! | 13:54 |
mnaser | something that i found rather tricky is how to watch another resource that you don't own for changes (i.e. if you specify `serviceName`) .. and im not sure how this can be done via ansible | 13:54 |
*** AJaeger_ is now known as AJaeger | 13:54 | |
tristanC | corvus: though mnaser is correct, golang based operator seems to be much more flexible for complex resources | 13:54 |
mnaser | while im generally happy to use ansible | 13:55 |
AJaeger | zuul-maint, want to switch tox-docs job to fetch-sphinx-tarball? See https://review.opendev.org/676430 | 13:56 |
mnaser | i cant imagine writing stuff to do something like this: https://github.com/vexxhost/nodepool-operator/blob/master/pkg/controller/nodepool/nodepool_controller.go#L45-L57 + https://github.com/vexxhost/nodepool-operator/blob/master/pkg/controller/nodepool/nodepool_controller.go#L91 + https://github.com/vexxhost/nodepool-operator/blob/master/pkg/controller/nodepool/nodepool_controller.go#L136-L169 with ansible | 13:56 |
corvus | i'm personally not opposed to changing it if we need to use the golang operator, i'd love to try to use the ansible operator if we can for now -- we seemed to think it was a pretty good idea at the time, and we have a lot of work already done on it. moreover, maybe we can make the ansible operator better? would there be a way to extend it to support the sort of thing you're talking about? | 13:56 |
mnaser | i think the hard thing with it is ansible isn't too good at lookups/logic | 13:57 |
mnaser | so while it can work, it will be a lot of register: + when: | 13:57 |
mnaser | and im not sure if we can do mapped watches with the ansible operator-sdk | 13:57 |
mnaser | where you can listen to $some_other_event and transform it and reconcile another resource | 13:58 |
mnaser | yeah no it just seems like a straight "when this change run that" | 13:59 |
mnaser | we seem to miss out on the ability of having a well defined schema too, golang one allows us to have openapi defined schema so invalid requests can be validated at the api layer | 14:00 |
corvus | mnaser: is this the sort of thing we would need to do for what's covered in the spec, or is it a more advanced followup? | 14:00 |
corvus | (er, that last was before the openapi thing, that i grok) | 14:00 |
pabelanger | sorry, meeting. yes, i was interested in help with zuul-operator, I may be able to find some time to work on it. | 14:01 |
pabelanger | I did like the idea of using ansible operator, given how much tooling we have with ansible | 14:02 |
pabelanger | but, TBH, i am not sure the differences between that and go based | 14:02 |
*** bolg has quit IRC | 14:03 | |
mnaser | i think going over the spec, we will struggle with watching non-owned resources for changes if we do this | 14:03 |
corvus | what's a non-owned resource we might watch? | 14:03 |
mnaser | secrets and endpoints | 14:03 |
mnaser | if you change a secret, zuul should change the config. if you scale up your zookeeper cluster (or the ip changes), zuul should update the zookeeper list | 14:04 |
mnaser | those are two off the top of my head | 14:04 |
mnaser | if we don't watch them, then we need to somehow adopt zuul itself to watch them but i dont know if we want to start carrying k8s constructs there | 14:04 |
mnaser | for services, you can get away by using the cluster dns for some (like db), but for a headless service like zookeeper, you will need to always have the ip of all nodes in there | 14:05 |
fungi | by "secret" i assume you're talking about credentials zuul's services are using to connect to things like zk and rdbms, not talking about job secrets (which are in zuul-managed git repos and update automatically) | 14:06 |
tristanC | i also noticed the ansible based operator takes a lot of time to go over all the tasks each time, resulting in auto-scaling not kicking in fast enough for merge tasks | 14:06 |
corvus | that sounds pretty basic -- the ansible operator can't watch arbitrary resources? | 14:06 |
mnaser | fungi: correct, im talking about the kubernetes 'secret' resource | 14:06 |
fungi | makes sense then | 14:06 |
corvus | and we're planning on building zuul.conf from k8s secrets (you'd put your gerrit password in there) | 14:07 |
mnaser | i'm positive that it can, but the issue becomes that now you start to have fairly complex code, because the code that watches the secret will trigger a playbook that a secret was updated | 14:08 |
mnaser | but then how does that trigger a 'zuul' updated event after, not sure the ansible one can do that | 14:08 |
mnaser | and then if you say, well lets just run the zuul playbook everytime we get a secret change.. but the issue is you cant do that because your playbooks assume that the vars are all for `kind` `Zuul` | 14:09 |
mnaser | and instead they just got vars for a secret, confusing it. what you'd need to do is listen to a secret change and re-enqueue a reconciliation for zuul (that has the secretName set there) | 14:10 |
mnaser | its quite the task, even in go.. i cant imagine it'll be easier in ansible, but i'll shush now with my wall of text :) | 14:10 |
tristanC | mnaser: well, you can short-circuit the watches logic and include the zuul-deploy role from the secret watch | 14:11 |
mnaser | tristanC: but the zuul-deploy role will be called with the "secret" info, it won't magically contain the cluster that maps to it :) | 14:13 |
mnaser | its tricky logic | 14:13 |
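The mapping mnaser describes (a Secret changes, so find the `Zuul` resources that name it and re-enqueue a reconcile for those) can be sketched in plain Python. This is only an illustration of the lookup, not operator-sdk or controller-runtime code; `ZuulResource`, its `secret_name` field, and `map_secret_to_zuuls` are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZuulResource:
    """Hypothetical stand-in for a `Zuul` custom resource."""
    namespace: str
    name: str
    secret_name: str  # the Secret this Zuul's config is built from (assumed field)


def map_secret_to_zuuls(secret_namespace, secret_name, zuuls):
    """Return (namespace, name) reconcile keys for every Zuul that
    references the Secret that just changed."""
    return [
        (z.namespace, z.name)
        for z in zuuls
        if z.namespace == secret_namespace and z.secret_name == secret_name
    ]


# Illustrative data only.
zuuls = [
    ZuulResource("ci", "zuul-prod", "gerrit-creds"),
    ZuulResource("ci", "zuul-staging", "staging-creds"),
]

# A change to the "gerrit-creds" Secret should re-enqueue only zuul-prod.
requests = map_secret_to_zuuls("ci", "gerrit-creds", zuuls)
print(requests)
```

The point of the sketch is that the handler receives the Secret's identity, not the Zuul's, so something has to translate one into the other before the Zuul reconcile runs with the right vars.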
openstackgerrit | Merged zuul/nodepool master: Fix Kubernetes driver documentation https://review.opendev.org/680879 | 14:13 |
openstackgerrit | Merged zuul/nodepool master: Add extra spacing to avoid monospace rendering https://review.opendev.org/680880 | 14:13 |
openstackgerrit | Merged zuul/nodepool master: Fix chroot type https://review.opendev.org/680881 | 14:13 |
*** jamesmcarthur has joined #zuul | 14:15 | |
corvus | mnaser: okay, i'm not familiar enough with either operator to help out on a technical level, so my affinity for the ansible operator is based only on assuming that we made a good initial decision, that the ansible operator folks would welcome improvements, and that with tristanC's poc, we have a chunk of code written to help get us to the initial spec. if that's wrong, i'm not opposed to changing it. maybe | 14:16 |
corvus | you can write a patch to the spec to switch, and include the justifications? we can ask tristanC, SpamapS, pabelanger, tobiash to look at that. mordred would be nice, but he's still afk this week. | 14:16 |
mnaser | sure, i can do that. | 14:16 |
corvus | i'll get that storyboard thing going | 14:16 |
mnaser | meanwhile a fun ops story | 14:17 |
mnaser | new ceph releases have a setting of osd_memory_target to manage their cache that defaults to 4gb | 14:17 |
mnaser | but if they detect that they are running inside a cgroup, they use `osd_memory_target_cgroup_limit_ratio` which defaults to 0.8 | 14:18 |
mnaser | cat /sys/fs/cgroup/memory/system.slice/system-ceph\\x2dosd.slice/memory.limit_in_bytes => 9223372036854775807 | 14:18 |
mnaser | ceph daemon osd.222 config get osd_memory_target => 7378697629483821056 | 14:18 |
corvus | that is > 4gb | 14:18 |
mnaser | well if it detects that its inside a cgroup | 14:18 |
mnaser | it does ~dynamic~ calculation | 14:19 |
mnaser | so i guess the math all went wrong and i ended up with ... 7378697629.483821 GB memory limit, and i wondered why my OSDs were running away with memory :) | 14:19 |
corvus | seems like maybe a min(4gb, 0.8*x) would be nice | 14:19 |
corvus | yeah, i bet they ran really fast until they didn't :) | 14:20 |
mnaser | and the fix https://github.com/ceph/ceph/pull/29581 | 14:23 |
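The numbers above can be reproduced with a few lines of Python: the cgroup limit is 2**63 - 1 (effectively "unlimited"), and multiplying it by the 0.8 ratio gives exactly the target ceph reported. The `capped_target` helper is a sketch of corvus's min(4gb, 0.8*x) suggestion only; it is not taken from the linked ceph pull request, and treating "4gb" as 4 GiB is an assumption.

```python
CGROUP_LIMIT = 9223372036854775807  # memory.limit_in_bytes from the log (2**63 - 1)
RATIO = 0.8                          # osd_memory_target_cgroup_limit_ratio default, per the log
DEFAULT_TARGET = 4 * 1024**3         # the 4gb default, taken here as 4 GiB (assumption)

# What the ratio-based calculation yields when the cgroup has no real limit.
derived = int(CGROUP_LIMIT * RATIO)
print(derived)  # 7378697629483821056, the value ceph reported


def capped_target(cgroup_limit, ratio=RATIO, default=DEFAULT_TARGET):
    """Sketch of the min(4gb, 0.8*x) idea: never exceed the default target."""
    return min(default, int(cgroup_limit * ratio))


print(capped_target(CGROUP_LIMIT))  # falls back to the 4 GiB default
print(capped_target(2 * 1024**3))   # a real 2 GiB limit yields 80% of it
```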
corvus | mnaser: can you look at the paragraph "The Operator will shard" in the spec -- under ansible, we figured the best way to do that would be a utility pod (ie, when the nodepool configmap changes, the operator runs the utility pod which has a python script that parses the nodepool.yaml and shards it and creates a bunch of new configmaps) -- if we switched to golang, do you think that would be internal operator | 14:28 |
corvus | logic? | 14:28 |
*** hashar has quit IRC | 14:28 | |
mnaser | corvus: yes, we could probably build out the config via yaml and write out the configmaps there and then trigger a redeploy (avoiding that utility pod entirely) | 14:30 |
corvus | mnaser: thanks. i'll factor that into my storyboard list -- and you might want to revise that in your spec patch | 14:31 |
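A rough sketch of the sharding step discussed above, whether it lives in a utility pod or inside the operator. The spec paragraph is not quoted in this log, so the sketch assumes one shard per provider, with every non-provider setting copied into each shard; `shard_by_provider` and the sample config are hypothetical, and writing the shards out as configmaps is left out.

```python
import copy


def shard_by_provider(config):
    """Split one parsed nodepool config (a dict) into one dict per provider.

    Assumption: each shard keeps all shared settings and exactly one provider.
    """
    shared = {k: v for k, v in config.items() if k != "providers"}
    shards = {}
    for provider in config.get("providers", []):
        shard = copy.deepcopy(shared)
        shard["providers"] = [copy.deepcopy(provider)]
        shards[provider["name"]] = shard
    return shards


# Illustrative config only; a real nodepool.yaml has many more keys.
config = {
    "zookeeper-servers": [{"host": "zk", "port": 2181}],
    "labels": [{"name": "small"}],
    "providers": [
        {"name": "cloud-a", "driver": "openstack"},
        {"name": "static-b", "driver": "static"},
    ],
}

shards = shard_by_provider(config)
print(sorted(shards))  # one shard name per provider
```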
*** saneax has quit IRC | 14:41 | |
*** saneax has joined #zuul | 14:41 | |
pabelanger | tobiash: do you think it is possible for github driver to support more than 1 github app, as the user to github? The issue is, today we have a single github app, with read / write permissions for commit. This is fine, if zuul is going to merge code, but in some cases (say ansible/ansible) we don't actually gate the code, we only do thirdparty CI (report results back and use statuses API). | 14:42 |
pabelanger | basically, we want to use a 2nd github app to drop the commit permissions needed on github app | 14:42 |
pabelanger | but, don't want to stand up another zuul | 14:42 |
tobiash | pabelanger: you still can configure two connections | 14:43 |
tobiash | With different app settings | 14:43 |
tristanC | iirc we did test successfully multiple apps on a single zuul | 14:43 |
pabelanger | Hmm, let me think | 14:43 |
tobiash | yes, that should just work | 14:44 |
pabelanger | can 2 connections have the same canonical_hostname? I guess they could | 14:44 |
pabelanger | I guess, if we create 2 tenants, it would be okay | 14:45 |
pabelanger | tenant A, would have read / write github.com, tenant B, read only github.com | 14:45 |
pabelanger | as long as both connections were not on the same tenant, should be fine | 14:46 |
pabelanger | tobiash: tristanC: thanks, let me test that out | 14:47 |
clarkb | connections are global, does tenancy matter here? | 14:49 |
openstackgerrit | Andy Ladjadj proposed zuul/zuul master: Fix: prevent usage of hashi_vault https://review.opendev.org/681041 | 14:50 |
*** themroc has quit IRC | 14:50 | |
corvus | mnaser, tristanC, pabelanger, SpamapS: okay i started a storyboard here: https://storyboard.openstack.org/#!/story/2006516 -- note that each task has a note (which you have to click the little triangle to see) with a little more detail. we may be able to split up the "implement nodepool" and "implement zuul" tasks a bit, but probably not until someone starts working on them to figure out how that might | 14:51 |
corvus | work. the rest of the tasks should be pretty reasonably non-overlapping. | 14:51 |
openstackgerrit | Andy Ladjadj proposed zuul/zuul master: Fix: prevent usage of hashi_vault https://review.opendev.org/681041 | 14:52 |
openstackgerrit | Andy Ladjadj proposed zuul/zuul master: Fix: prevent usage of hashi_vault https://review.opendev.org/681041 | 14:53 |
*** jangutter has joined #zuul | 14:54 | |
pabelanger | clarkb: I think it means, if 2 github connections, a project could not be in both. So today I have ansible/ansible in read / write connection, I would need to create 2nd read-only connection and move ansible/ansible into it | 14:54 |
pabelanger | but, I cam guessing here until I try | 14:54 |
pabelanger | am* | 14:54 |
*** jangutter_ has quit IRC | 14:55 | |
mnaser | corvus: ok cool, ill have a look shortly and see what i can do | 14:55 |
clarkb | pabelanger: I see, I'm not sure if putting them in different tenants will fix that | 14:56 |
clarkb | pabelanger: note you'll need different connection names which means different pipeline config too | 14:56 |
pabelanger | yah, different pipelines should be okay today, we have check (which we merge) and third-party-check (report statuses) pipelines now | 15:00 |
*** hashar has joined #zuul | 15:01 | |
AJaeger | zuul-maint, could you review https://review.opendev.org/674334, please? | 15:10 |
AJaeger | pabelanger: do we still need https://review.opendev.org/583350 and https://review.opendev.org/583346 - or is it time to abandon? | 15:10 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handle Pull Request tags (labels) metadata https://review.opendev.org/681050 | 15:34 |
pabelanger | AJaeger: done | 15:46 |
AJaeger | thanks | 15:46 |
*** rlandy is now known as rlandy|brb | 15:52 | |
*** chandankumar is now known as raukadah | 15:53 | |
SpamapS | corvus: neat! | 15:53 |
SpamapS | corvus: should have some spare time next week to pick a few tasks up. | 15:53 |
*** igordc has joined #zuul | 15:54 | |
mnaser | SpamapS: how do you feel about the moving to go based operator part of things? | 15:56 |
*** mattw4 has joined #zuul | 15:58 | |
*** sshnaidm|ruck is now known as sshnaidm|afk | 16:00 | |
*** mattw4 has quit IRC | 16:01 | |
*** mattw4 has joined #zuul | 16:01 | |
*** mattw4 has quit IRC | 16:05 | |
*** mattw4 has joined #zuul | 16:06 | |
*** igordc has quit IRC | 16:09 | |
*** mattw4 has quit IRC | 16:10 | |
openstackgerrit | Mohammed Naser proposed zuul/zuul master: spec: use operator-sdk for kubernetes operator https://review.opendev.org/681058 | 16:16 |
mnaser | i put that up for discussion | 16:17 |
mnaser | in the meantime i'll work with what we have and i can rebuild it (relatively easily) in golang if we decide to. i'd just like to have the operator up as quickly as possible :) | 16:17 |
*** bogdando has joined #zuul | 16:19 | |
bogdando | hi. I'm trying to make zuul executor fill in a non-empty hosts.primary.nodepool.private_ipv4 value. Not sure how to debug how it ends up null... Where does it come from when running a job? | 16:21 |
bogdando | pabelanger: ^^ perchance? | 16:21 |
clarkb | bogdando: it comes from the provider's returned info | 16:22 |
clarkb | the openstack driver should set private to the public value if private is null | 16:22 |
bogdando | clarkb: thanks | 16:23 |
bogdando | in the inventory.yaml it creates dynamically for jobs in /var/lib/zuul/builds/xxxxx/, I have public_ipv4 set though | 16:24 |
bogdando | clarkb: I'm using static-libvirt | 16:25 |
bogdando | multinode... | 16:25 |
clarkb | bogdando: static-libvirt is your nodepool provider? | 16:25 |
bogdando | not sure where to start fixing that info that provider returns... | 16:25 |
bogdando | clarkb: yea | 16:25 |
clarkb | I don't see that in nodepool | 16:25 |
bogdando | running it locally | 16:26 |
clarkb | so I can't help you with its behavior | 16:26 |
*** spsurya has quit IRC | 16:27 | |
bogdando | clarkb: it's here https://review.rdoproject.org/r/gitweb?p=rdo-infra/ansible-role-tripleo-ci-reproducer.git;a=blob;f=templates/nodepool-libvirt.yaml.j2;h=3b64df83aca33303845d039dabbd99b03db20ece;hb=HEAD :) | 16:29 |
bogdando | trying to compare that beast to nodepool-openstack.yaml.j2 now... | 16:29 |
bogdando | sorry for bothering with custom providers... :) | 16:29 |
Shrews | bogdando: what is the error you are getting at the zuul-executor? static driver should work fine (the IP or hostname used should come from pools.nodes.name (https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[static].pools.nodes.name)) | 16:31 |
bogdando | where does the code live for openstack provider? | 16:31 |
clarkb | Shrews: this isn't the static provider, it is some libvirt-static provider | 16:31 |
bogdando | it uses static driver | 16:31 |
clarkb | Shrews: I think the problem is they don't set the private ip to == the public ip if there is no private ip | 16:31 |
Shrews | driver: static | 16:32 |
Shrews | is in the config, so i'm more confused | 16:32 |
clarkb | I see so libvirt before was just noise? | 16:32 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-operator master: Create zookeeper operator https://review.opendev.org/676458 | 16:32 |
bogdando | a kind of | 16:32 |
Shrews | clarkb: perhaps they are using libvirt for the "static" nodes, which is fine. | 16:32 |
Shrews | i've done that on my local machine | 16:33 |
bogdando | Shrews: ++ | 16:33 |
bogdando | trying the same path | 16:33 |
Shrews | bogdando: perhaps it's best to start at the beginning and show us the executor error | 16:33 |
clarkb | bogdando: Shrews in that case I think the problem here is static nodes have a single IP address | 16:33 |
clarkb | but openstack nodes can have ~3 | 16:33 |
bogdando | clarkb: indeed the eth1 is down in VMs | 16:33 |
bogdando | subnodes | 16:34 |
clarkb | in that case you probably want to update the jobs to handle that case | 16:34 |
bogdando | clarkb: thanks I'll try that | 16:35 |
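The fallback clarkb describes for the openstack driver (use the public address when no private one is reported) is the kind of handling a job could apply itself for static nodes. A minimal sketch over the `nodepool` host vars seen in inventory.yaml; the `private_ipv4` and `public_ipv4` keys come from the discussion above, while `effective_private_ipv4` and the sample address are hypothetical.

```python
def effective_private_ipv4(nodepool_vars):
    """Return private_ipv4 if set, otherwise fall back to public_ipv4."""
    private = nodepool_vars.get("private_ipv4")
    return private if private else nodepool_vars.get("public_ipv4")


# A static node as described above: public address set, private address null.
static_node = {"public_ipv4": "192.0.2.10", "private_ipv4": None}
print(effective_private_ipv4(static_node))
```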
zbr | tristanC: pabelanger clarkb: i removed molecule from https://review.opendev.org/#/c/674092/ - ok now? | 16:40 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-operator master: Create zookeeper operator https://review.opendev.org/676458 | 16:40 |
openstackgerrit | Mohammed Naser proposed zuul/zuul-operator master: Deploy Zuul cluster using operator https://review.opendev.org/681065 | 16:40 |
*** rlandy|brb is now known as rlandy | 16:41 | |
*** hashar has quit IRC | 16:45 | |
openstackgerrit | Merged zuul/zuul-jobs master: Switch to fetch-sphinx-tarball for tox-docs https://review.opendev.org/676430 | 16:46 |
*** jpena is now known as jpena|off | 16:47 | |
*** mattw4 has joined #zuul | 16:56 | |
*** bogdando has quit IRC | 17:07 | |
*** saneax has quit IRC | 17:18 | |
*** jamesmcarthur has quit IRC | 17:28 | |
*** mhu has quit IRC | 17:33 | |
SpamapS | mnaser: I'm fine with Go, Ansible, Python, just want something we can all work on. | 17:33 |
pabelanger | mnaser: SpamapS: corvus: I am happy to move to what others would like to do. It does seem Go is what the majority is using, however I also see using the Ansible operator as a way to help that community grow more too | 17:36 |
*** jangutter has quit IRC | 17:38 | |
fungi | up side to ansible/python there is no need to precompile the source to get something useful | 17:41 |
SpamapS | Agreed. Mitigating it is that typically these go-based operators are pretty tiny and communities just maintain dockerhub images that can be mirrored pulled infrequently. | 17:43 |
SpamapS | mirrored/pulled | 17:43 |
SpamapS | Personally I think it will challenge our community involvement a lot. If we *can* get it done w/ Ansible, we probably should. | 17:44 |
pabelanger | I did share the link downstream with some ansible folks, my hope somebody more familiar on ansible-operator side to comment | 17:44 |
SpamapS | I was just looking at it.. seems fine. | 17:46 |
mnaser | fungi: when using golang, as long as you have a connection to a cluster active, you can just run operator-sdk up local --namespace=foo | 17:57 |
mnaser | so while it involves compiling, you dont have to build the image/etc to get it up *locally* | 17:57 |
mnaser | having said that, there are other options like kopf which allows building operators using python that might be interesting | 17:57 |
mnaser | but im not sure of how much that is adopted overall | 17:58 |
pabelanger | mnaser: I think your comment about disadvantages rings the most for me, I don't have golang XP, but am good at ansible. However, that said, I also am not good at k8s, so need to learn that too. | 18:00 |
fungi | i'm a fan of using the right tool for the job, so if that means applying a programming language i'm not familiar with yet, i'm sure i'll muddle through | 18:01 |
pabelanger | so, think I would get up to speed faster on ansible-operator but doesn't mean I can't use go | 18:01 |
pabelanger | fungi: yah | 18:01 |
SpamapS | Hrm.. one thing that kind of sucks about using Ansible to build these objects is that it's not super great about conditionally adding/subtracting things from the container spec. | 18:08 |
SpamapS | I can work around that with envvars in a configmap for the thing I'm currently hitting (conditionally adding AWS creds if they're passed in), but it may not work well for other things. | 18:09 |
* SpamapS decided to just plow through with an unexpected hour of free time today.. trying nodepool-launcher now. | 18:10 | |
tristanC | SpamapS: though you can use Python with a custom task/library to perform data mutation | 18:19 |
SpamapS | tristanC: indeed! | 18:25 |
SpamapS | I'm 90% ready to change my vote to -1 for golang. | 18:25 |
SpamapS | This thing is pretty good. | 18:25 |
SpamapS | For all of us, ansibling is pretty natural. | 18:26 |
tristanC | SpamapS: what both me with Ansible is that roles lack interfaces, which makes it difficult to combine things. Using a programming language such as go would give us types, which is great from a devel point of view :-) | 18:26 |
tristanC | bother* | 18:26 |
SpamapS | Yeah the structure for keeping things consistent is difficult in Ansible. | 18:26 |
SpamapS | But, I don't think we have that much complexity, it's mostly just plumbing configs from the right place to the right place. | 18:27 |
SpamapS | And the benefit of allowing anybody in the Zuul world who knows Ansible to write Ansible to deploy Zuul.. seems like a big win. | 18:28 |
tristanC | on the other hand, golang or ansible is an implementation detail for the user, you shouldn't have to know how an operator is written | 18:29 |
tristanC | with the operator-lifecycle, you can just click on a dashboard to deploy an operator | 18:30 |
SpamapS | You don't need to know, until you do. :) | 18:33 |
SpamapS | Zuul has quite a few user/operators. I am one of them. Using Ansible keeps the pool of folks who can contribute large. | 18:34 |
SpamapS | (I can totally golang.. but I'd rather not, and it would add friction for me. I imagine there are others less versed in Golang that would be completely unable to help.) | 18:35 |
tristanC | i think we should stick to ansible for the current spec, e.g. take a zuulYamlConfig and nodepoolYamlConfig and start the services | 18:36 |
*** jamesmcarthur has joined #zuul | 18:37 | |
SpamapS | ya, TBH I'm almost done w/ NodepoolLauncher in that mode. :) | 18:37 |
tristanC | but if we want to also manage more fine grained resources like ZuulTenant or NodepoolProviders, e.g. with custom logic/rbac, then golang's benefits may outweigh ansible's | 18:38 |
SpamapS | I don't want that. | 18:40 |
SpamapS | Nodepool and Zuul have their own configurations and I don't think we need to re-generate them in an operator. | 18:40 |
SpamapS | Just bolt the user's configs onto the stuff the operator needs to know. | 18:41 |
tristanC | SpamapS: if k8s api can enable rbac on those resources, then it may be interesting to let an operator generate the config | 18:42 |
SpamapS | Not sure that's what the k8s API is for, but ok. | 18:43 |
SpamapS | I can't say it's wrong either. :) | 18:43 |
SpamapS | Kinda feels like side-stepping the real problem which is that these configs are too static. | 18:45 |
SpamapS | Not all Zuul users will be in k8s. | 18:45 |
fungi | yeah, if that configuration were managed dynamically via a zuul api, then the problem could be solved more generally | 18:51 |
fungi | (like the configs-in-zk conversation over the weekend) | 18:51 |
fungi | people could presumably still use kubernetes as a frontend to that, and just have it talk to the zuul api | 18:52 |
SpamapS | ya, I dislike putting that cart before that horse. | 18:53 |
SpamapS | But I also wonder if this is really all that important. | 18:53 |
SpamapS | I know we have some use cases for dynamic tenant configuration. But nodepool pools.. seems less-so. | 18:53 |
fungi | it seemed to be important enough to mnaser to want to write a separate kubernetes operator on his own | 18:53 |
SpamapS | For pools? interesting. | 18:54 |
fungi | oh, the nodepool pools. i was referring to the tenant configs | 18:54 |
SpamapS | Yeah for tenant configs, I think any shop bigger than 5-6 people will have repos coming and going all the time and a dynamic tenant config service makes a ton of sense. | 18:55 |
SpamapS | GitOps is nice when you have it.. but not everybody will. :-P | 18:57 |
SpamapS | one thing I can't seem to find is how the ansible operators handle deletion of the CR | 19:03 |
clarkb | bringing the "how can zuul handle this better" discussion to here. I've discovered that if the root disk of a test node fills up then the next ansible playbook run exits 4 and zuul sees that as a network failure | 19:05 |
clarkb | I believe this is happening because ansible wants to be able to write data to /tmp on the remote host but cannot as the disk is full | 19:05 |
clarkb | however the host is still up and accessible and for debugging purposes it would be nice to be able to get some data (even if locked up in the executor log) | 19:05 |
clarkb | maybe we can have zuul attempt to execute a canned raw module playbook for checking ifconfig and df output if it gets an rc 4? | 19:06 |
*** hashar has joined #zuul | 19:06 | |
pabelanger | clarkb: could we do that in a cleanup job? | 19:07 |
clarkb | pabelanger: I don't think so because we need it to run each time a job is retried | 19:07 |
clarkb | cleanup jobs run for the entire buildset iirc | 19:08 |
tristanC | SpamapS: not sure what you meant by deletion of the CR, but each resource should have an owner, and when you delete the top resource, e.g. Nodepool, then k8s should remove everything attached to it | 19:09 |
corvus | clarkb: cleanup playbooks are job-level (i don't think we have any buildset-level playbooks) | 19:12 |
clarkb | oh playbook, sorry, I thought pabelanger meant a job that depended on the others in a buildset | 19:13 |
clarkb | so ya we could add a cleanup playbook to the base job that tries to use raw module to grab basic essentials maybe | 19:13 |
corvus | ++ | 19:13 |
pabelanger | yah, sorry. that is what I meant, add a cleanup playbook to base | 19:14 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Improve job and node information banner https://review.opendev.org/677971 | 19:33 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Improve job and node information banner https://review.opendev.org/677971 | 19:34 |
SpamapS | tristanC: how does k8s know that Nodepool owns that deployment though? | 19:36 |
SpamapS | tristanC: in the ansible operator .. it just has a role that creates. But there's no annotation in there that I can see that ties it to anything else. | 19:36 |
SpamapS | Maybe something in envvars or ansible-runner that magically adds things to k8s calls? | 19:36 |
tristanC | SpamapS: yeah maybe, don't remember how it happens | 19:37 |
tristanC | SpamapS: the ansible sdk does it, from the source: "The ansible operator will inject owner references unless this flag is false", the flag being inject-owner-ref | 19:39 |
clarkb | does cleanup-run happen after run playbook but before post? | 19:39 |
tristanC | clarkb: iirc after post | 19:40 |
tristanC | clarkb: right before the ssh-agent is stopped | 19:41 |
SpamapS | tristanC: thanks, that's what I was looking for. Neat. | 19:41 |
tristanC | SpamapS: when testing locally, i used a state|default('present') variable, and to cleanup, ran the playbook with -e state=absent | 19:42 |
SpamapS | One good reason to have config object kinds would be to be able to say "If there aren't any NodepoolImage resources, don't bother running nodepool-builder" | 19:42 |
SpamapS | tristanC: yeah that makes sense. | 19:42 |
clarkb | tristanC: thanks, I won't worry about logging it on the host then (which may be difficult with no disk space) | 19:44 |
tristanC | just found that etcd operator written with ansible, it seems to manage ha operation easily, using a couple of python plugins, e.g.: https://github.com/openshift/etcd-ha-operator/blob/master/roles/reconcile/lookup_plugins/etcd_member.py | 19:48 |
tristanC | and using tricky when conditions such as: https://github.com/openshift/etcd-ha-operator/blob/master/roles/reconcile/tasks/reconcile_pods.yaml#L128-L130 | 19:51 |
*** igordc has joined #zuul | 20:00 | |
*** michael-beaver has joined #zuul | 20:04 | |
*** hashar has quit IRC | 20:48 | |
*** jamesmcarthur has quit IRC | 21:29 | |
*** snapiri has quit IRC | 21:41 | |
*** armstrongs has joined #zuul | 21:46 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 21:49 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add enqueue reporter action https://review.opendev.org/681132 | 21:49 |
*** armstrongs has quit IRC | 21:56 | |
*** EmilienM is now known as little_script | 22:39 | |
*** little_script is now known as EmilienM | 22:42 | |
*** rlandy is now known as rlandy|bbl | 22:50 | |
*** threestrands has joined #zuul | 23:14 |
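
Editor's sketch, not from the channel: tristanC's suggestion at 18:19 (use Python with a custom task/library to do the data mutation Ansible handles awkwardly) could look roughly like the filter plugin below, applied to SpamapS's case of adding AWS creds to a container spec only when they are passed in. The filter name `with_optional_aws_creds`, the secret key names, and the example image are all hypothetical; only the `FilterModule`/`filters()` plugin shape and the Kubernetes `env`/`secretKeyRef` layout are standard.

```python
"""Hypothetical Ansible filter plugin (would live in filter_plugins/).

Conditionally adds AWS credential env vars to a container spec.
All names below are illustrative assumptions, not from any real operator.
"""
import copy


def with_optional_aws_creds(container, secret_name=None):
    """Return a copy of `container`, adding AWS env vars only if
    `secret_name` is given; otherwise the copy is unchanged."""
    result = copy.deepcopy(container)
    if not secret_name:
        return result
    env = result.setdefault("env", [])
    # The keys inside the secret (second tuple item) are assumed names.
    for var, key in (
        ("AWS_ACCESS_KEY_ID", "access_key_id"),
        ("AWS_SECRET_ACCESS_KEY", "secret_access_key"),
    ):
        env.append({
            "name": var,
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
        })
    return result


class FilterModule(object):
    """Entry point Ansible looks up when loading filter plugins."""

    def filters(self):
        return {"with_optional_aws_creds": with_optional_aws_creds}
```

In a role, the container dict would be piped through this filter before being handed to whatever task creates the Deployment, so the conditional lives in Python and the template stays flat.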