Monday, 2020-02-03

*** wxy-xiyuan has joined #zuul00:04
*** adamw has joined #zuul01:15
*** adamw has quit IRC01:19
*** adamw has joined #zuul01:20
*** jamesmcarthur has quit IRC02:43
*** jamesmcarthur has joined #zuul02:53
*** bhavikdbavishi has joined #zuul03:06
openstackgerritIan Wienand proposed zuul/zuul-jobs master: upload-afs: rename to upload-afs-roots; add afs-upload-synchronize  https://review.opendev.org/70536803:07
*** bhavikdbavishi1 has joined #zuul03:13
*** bhavikdbavishi has quit IRC03:15
*** bhavikdbavishi1 is now known as bhavikdbavishi03:15
openstackgerritIan Wienand proposed zuul/zuul-jobs master: upload-afs: rename to upload-afs-roots; add afs-upload-synchronize  https://review.opendev.org/70536803:18
openstackgerritIan Wienand proposed zuul/project-config master: Migrate to upload-afs-roots role  https://review.opendev.org/70537203:22
*** jamesmcarthur has quit IRC03:22
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Remove deprecated upload-afs role  https://review.opendev.org/70537303:24
*** jamesmcarthur has joined #zuul03:28
*** jamesmcarthur has quit IRC03:46
openstackgerritIan Wienand proposed zuul/zuul-jobs master: upload-afs: rename to upload-afs-roots; add afs-upload-synchronize  https://review.opendev.org/70536803:48
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Remove deprecated upload-afs role  https://review.opendev.org/70537303:48
*** jamesmcarthur has joined #zuul03:51
*** jamesmcarthur has quit IRC04:50
*** raukadah is now known as chkumar|rover04:55
*** evrardjp has quit IRC05:33
*** evrardjp has joined #zuul05:34
*** jamesmcarthur has joined #zuul06:01
*** jamesmcarthur has quit IRC06:06
*** sanjayu_ has joined #zuul06:12
*** mattw4 has joined #zuul06:17
*** sanjayu__ has joined #zuul06:22
*** sanjayu_ has quit IRC06:25
*** mattw4 has quit IRC06:43
*** felixedel has joined #zuul06:48
*** sanjayu__ has quit IRC06:56
*** saneax has joined #zuul06:58
AJaegerianw: what about sending the announcement for the removal already now so that people can review the whole stack? Or do you want to give folks a chance for first review today? corvus, any suggestions on the stack above? ^07:05
*** AJaeger has quit IRC07:06
*** AJaeger has joined #zuul07:08
*** felixedel has quit IRC07:27
*** pcaruana has joined #zuul07:43
*** tosky has joined #zuul08:29
ianwAJaeger: i don't know if i'd send something to the announce list about a *potential* deprecation, but if you'd like something to the main list i could08:31
ianwi feel like the intersection of people using zuul and uploading various artifacts to afs is probably just openstack, though?08:32
AJaegerianw: I guess, but we should announce it nevertheless. I suggest you chat with corvus when both of you are online ;) Since he wrote the role initially AFAIR, he might be a good reviewer as well.08:43
*** yolanda has joined #zuul09:03
openstackgerritFelix Schmidt proposed zuul/zuul master: Implement github checks API  https://review.opendev.org/70516809:18
mnaser^ this is cool!09:21
*** sshnaidm|off is now known as sshnaidm09:30
*** felixedel has joined #zuul09:53
*** felixedel has left #zuul10:02
*** bhavikdbavishi has quit IRC10:12
*** wxy-xiyuan has quit IRC10:42
*** mhu has joined #zuul10:43
*** bhavikdbavishi has joined #zuul11:10
tobiashcorvus: when you have time, I'd like to discuss a cherrypy related problem we have and possible options to resolve this11:40
webknjaz@tobiash: if it's generic enough, you can ask me about CherryPy11:42
tobiashtldr is that we're facing occasional connection resets when connecting to zuul-web when many requests in parallel are coming in. I can reproduce this problem also locally with a minimal cherrypy based server. The hypothesis is that the http server of cherrypy cannot accept connections fast enough so they are rejected by the kernel11:42
*** bhavikdbavishi1 has joined #zuul11:43
tobiashI've experimented locally with combining cherrypy with the tornado web server (see https://docs.cherrypy.org/en/latest/deploy.html#tornado)11:43
*** bhavikdbavishi has quit IRC11:44
*** bhavikdbavishi1 is now known as bhavikdbavishi11:44
webknjazdoes it happen w/o tornado?11:44
tobiashwhich resolves this issue, I also have a running poc with zuul that works except websockets11:44
webknjazwhat's the worker thread count setting?11:44
tobiashwebknjaz: I've experimented with default setting, and also higher settings (like 100)11:45
webknjazthere's recently been some refactoring in Cheroot concerning worker threads, try updating and see if it gets better11:46
tobiashI used latest versions which are on pypi for both cherrypy and cheroot11:46
webknjazif you can share a reproducer with just CherryPy, I could take a deeper look. maybe post that as a GH issue.11:47
tobiashtest example cherrypy only: http://paste.openstack.org/show/789060/11:48
*** hashar has joined #zuul11:48
tobiashtest example cherrypy with tornado: http://paste.openstack.org/show/789061/11:48
tobiashk, I'll open an issue11:48
webknjazthanks11:49
webknjazand also describe how you send requests and all the details there11:49
tobiashyes, thanks11:51
*** felixedel has joined #zuul11:58
tobiashwebknjaz: shall I open the ticket against cherrypy or cheroot?12:07
webknjazNot sure yet. It looks like a Cheroot issue but it's fine to open it in CherryPy too. I think, if I'll manage to come up with a Cheroot only reproducer, I'll just transfer the issue across repos myself (github will put a redirect from the old location so it's fine).12:09
tobiashok, thanks12:10
*** avass has joined #zuul12:11
*** rfolco has joined #zuul12:24
tobiashissue: https://github.com/cherrypy/cherrypy/issues/183913:01
webknjazthanks13:01
*** rlandy has joined #zuul13:01
tobiashthanks for helping :)13:02
openstackgerritMerged zuul/zuul master: Dockerfile: create a zuul user with uid 10001  https://review.opendev.org/65024613:09
*** bolg has joined #zuul13:10
mhuttx, thanks for the clarification on the history w/ CNCF & CDF - the whole github criteria is indeed unreasonable13:16
mhuand also weird since non open source offerings like Circle CI are on the landscape13:17
*** jamesmcarthur has joined #zuul13:21
*** jamesmcarthur has quit IRC13:34
*** Goneri has joined #zuul13:40
*** jamesmcarthur has joined #zuul13:46
hasharhi13:56
hasharmhu: I would guess it is to gauge the popularity of a software13:56
hasharsince most everything is on github nowadays13:57
mhuhashar, yes that was my assumption too13:57
hasharthe repository used to be monitored from gerrit to github, maybe that is enough13:57
hasharor one could check with them as to why github is a requirement in the first place13:58
ttxmhu: the weird part is that the github requirement is only for open source projects.14:08
ttxSo basically it's easier to get a proprietary solution listed.14:09
pabelanger'Projects must be open source and hosted on or mirrored to GitHub.'14:13
pabelangerhow does proprietary even get hosted?14:13
pabelangerhttps://github.com/cdfoundation/cdf-landscape#new-entries14:13
hasharoh14:19
hasharthey have a landscape.yml file with a list of projects14:19
hasharand then a bot auto-generates a processed_landscape.yml to add a bunch of metadata14:19
hasharsuch as # of commits, # of tweets etc14:19
webknjaztobiash: I've added comments on the issue14:19
mordredyeah. it's very much from a worldview of commercial mindset14:20
tobiashwebknjaz: thanks, I'm already trying this out14:20
hasharso I guess they require github for their bot to be able to process the project extra meta data14:20
tobiashthe sysctl values you mentioned are the same on the ubuntu box by default14:20
tobiashtrying now server.socket_queue_size with that value on ubuntu14:20
webknjazdo you have slow machines/a lot of things running in background?14:20
tobiashno, macos was idle otherwise and the ubuntu test is on a freshly spawned ubuntu aws machine14:21
*** jamesmcarthur has quit IRC14:21
*** jamesmcarthur has joined #zuul14:23
tobiashstill getting connection resets14:23
tobiashalso with somaxconn 1024 no change14:24
*** bolg has quit IRC14:29
webknjaztobiash: did you adjust `somaxconn` with `server.socket_queue_size` setting?14:33
tobiashyes, default is 128 so I tested with server.socket_queue_size and I did a second test run with both values 102414:34
*** jamesmcarthur has quit IRC14:35
*** bhavikdbavishi has quit IRC14:36
*** zxiiro has joined #zuul14:37
*** chkumar|rover is now known as raukadah14:37
webknjazplz try `cheroot<8.1.0` just in case14:41
hashartobiash: for sockets troubles, I often rely on the utility `ss` a good swiss army knife  to list/monitor socket states14:42
hasharso you can list all sockets in state time_wait with destination port 443  for example14:42
webknjazyeah, `ss` should be helpful here14:43
hasharwhich stands for "socket statistics"14:45
webknjazyep14:47
webknjazJust ran that reproducer in 4 parallel shells, still not reproducible on my machine14:47
hasharmight be different linux kernel tcp settings? :/14:48
webknjaztotally, taking into account it's Gentoo Linux :)14:48
tobiashI just retried the test with cheroot<8.1.0 and the resets are gone14:48
tobiashhowever it seems to get slow at some point14:48
webknjazalright, so we need to work towards creating a reproducer in pure Cheroot14:48
webknjazyep, it's slow because worker threads are per TCP connection, and not per HTTP request + the pool size is static (unless you use `cherrypy-dynpool`)14:50
webknjazmoving the issue to the Cheroot tracker then14:50
tobiashoh from the docs I thought it's growing automatically14:50
webknjaznope, the pool doesn't grow but there's an interface for external things to resize it.14:51
tobiashyepp, with 200 threads it's fast :)14:51
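The two knobs tried in this exchange map to CherryPy's `server.thread_pool` and `server.socket_queue_size` config keys. A minimal sketch follows, assuming the values mentioned in the chat; the names `TUNING` and `apply_tuning` are made up for illustration. Per the discussion, the larger thread pool only addressed the slowness seen with `cheroot<8.1.0`, and raising the queue size did not stop the resets.

```python
# Values tried in the chat; both keys are CherryPy global server settings.
TUNING = {
    "server.thread_pool": 200,          # static worker pool size
    "server.socket_queue_size": 1024,   # listen() backlog requested by the server
}


def apply_tuning(tuning=TUNING):
    """Apply the settings to CherryPy's global config (hypothetical helper)."""
    import cherrypy  # lazy import so the dict can be inspected without CherryPy

    cherrypy.config.update(tuning)
```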
webknjazI'm curious whether it's related to https://github.com/cherrypy/cheroot/issues/24914:55
*** fbo has quit IRC15:00
tobiashwebknjaz: shall I try a bisect?15:01
webknjazI mean, it's quite obvious that the regression is introduced by https://github.com/cherrypy/cheroot/pull/19915:02
webknjazWe need to (1) create a stable reproducer based on pure Cheroot (w/o CherryPy layer) and (2) figure out how to fix it.15:03
tobiashis there a minimal standalone example how to use it? Then I'll try to reproduce it directly with cheroot15:04
webknjazoh, it's just a WSGI server (well, on top of an HTTP server)15:05
webknjazfeed it with a dummy WSGI app15:05
webknjazsimple example: https://github.com/cherrypy/cheroot/blob/bee5df9/cheroot/wsgi.py#L7-L1515:06
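For readers without the link handy, a pure-Cheroot server with a dummy WSGI app can be sketched roughly as below. This follows the shape of the linked docstring example, but the app body, the port 5000 and the function names are assumptions for illustration, not the reproducer tobiash actually ran.

```python
def dummy_app(environ, start_response):
    """Smallest useful WSGI app: always answers 200 with a fixed body."""
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host="127.0.0.1", port=5000, numthreads=10):
    """Serve dummy_app with Cheroot only (no CherryPy layer); blocks."""
    from cheroot import wsgi  # lazy import: only needed when actually serving

    server = wsgi.Server((host, port), dummy_app, numthreads=numthreads)
    try:
        server.start()
    except KeyboardInterrupt:
        pass
    finally:
        server.stop()
```

Calling `serve()` starts the server in the foreground; `numthreads` is the static worker pool size discussed earlier.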
*** jamesmcarthur has joined #zuul15:12
tobiashalso reproduces with the same client script (needs a few more threads) and directly with ^ (only changed port)15:15
*** mattw4 has joined #zuul15:18
webknjazI'll try to hit it with `ab` locally, maybe if I use big enough parallelism value, it'll break...15:20
webknjazI guess it's because it'15:22
webknjazit's hard to hit limits with 32GB + i7 w/ 8 virtual cores15:23
tobiashwell, I tested on an 72 core aws ubuntu machine ;)15:24
tobiash(it's also used for other tests on an on-demand basis)15:24
*** mattw4 has quit IRC15:25
webknjazSo I increased numbers in the client script to ~10k and started hitting `ERROR:test:FAILED http://localhost:5000/info: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /info (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f028f3fe790>: Failed to establish a new connection: [Errno 16] Device or resource busy'))` but that seems like the client got stuck, not srv15:26
tobiashyes, I get a mixture of 'Connection reset by peer' and 'Remote end closed connection without response'15:27
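The client script itself lives in the pastes above and is not reproduced in this log. A rough stand-in, assuming nothing beyond the standard library, could fire many GET requests in parallel and tally how each one ended; `fetch` and `hammer` are hypothetical names.

```python
import collections
import concurrent.futures
import urllib.request


def fetch(url, timeout=5.0):
    """Return 'ok' or the exception class name for one GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
        return "ok"
    except Exception as exc:  # connection resets and early closes land here
        return type(exc).__name__


def hammer(url, total=1000, workers=100):
    """Send `total` GETs from `workers` threads and count the outcomes."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return collections.Counter(pool.map(fetch, [url] * total))
```

Something like `hammer("http://localhost:5000/info")` returns a `Counter` keyed by outcome, which makes it easy to see whether any request failed and with which exception type.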
webknjazoh, and the server was hitting "too many open files" for a while15:27
webknjazthat's different15:27
webknjazfacepalm15:27
webknjazI was testing against wrong cheroot version15:28
webknjazokay, now I see the error as described15:29
tobiash:)15:29
webknjazPBKAC15:29
*** zbr has quit IRC15:30
openstackgerritTobias Henkel proposed zuul/zuul master: Cap cheroot to fix issues with concurrent requests  https://review.opendev.org/70545915:32
tobiashzuul-maint: this avoids hitting https://github.com/cherrypy/cheroot/issues/263 until there is a fix available ^15:33
tobiashwebknjaz: thanks a lot for your help!15:33
webknjazyou're welcome :)15:34
*** zbr has joined #zuul15:34
*** hashar has quit IRC15:35
*** rfolco is now known as rfolco|eats15:43
mordredcorvus, tristanC: in https://review.opendev.org/#/c/702106 - what installs ZK now?15:46
*** electrofelix has joined #zuul15:47
tristanCmordred: the zookeeper service definition is https://review.opendev.org/#/c/702106/22/conf/zuul/resources.dhall@558 , and it's currently just spawning docker.io/library/zookeeper15:48
tristanCmordred: i mean, it is an optional service, that depends if the user provides a zk secret15:49
mordredtristanC: ah - cool - thanks!15:50
tristanCmordred: and this is not sufficient according to the spec, a follow-up should setup a more solid solution such as the zk operator or even the helm chart15:50
mordredyah - but what's there is an adequate first step I think15:50
tristanCyep, that's just the easiest thing to use right now to get it going15:50
*** felixedel has quit IRC15:55
mordredtristanC: ok - another one for you - in https://review.opendev.org/#/c/702716/7/roles/zuul-ensure-gearman-tls/tasks/main.yaml - why write the generated certs locally? just for debugging?15:56
tristanCmordred: just because it's easier to make the openssl cli write file15:57
tristanCmordred: the operator sdk setups a cwd per CR to avoid conflict, and iiuc this can be used for such tasks15:58
mordredtristanC: oh - no - what I mean is - line 36 - we don't seem to use those files for anything?15:59
mordred(I'm just wondering if I'm missing something - the openssl creation and then k8s secret creation makes sense)16:00
tristanCmordred: oh that, they are used in the next review, to dump the status queues16:00
tristanCmordred: https://review.opendev.org/#/c/703624/7/roles/zuul-restart-when-zuul-conf-changed/module_utils/gearlib.py@2416:01
mordredtristanC: ah - cool16:01
mordredtristanC: I really wish lookup syntax was different16:06
tristanCmordred: what do you mean?16:07
mordredtristanC: the zuul_conf_secret: "{{ lookup('k8s', api_version='v1', kind='Secret', namespace=namespace, resource_name=zuul_name + '-secret-zuul') }}" would be SO neat if it was more structured and less embedded in a string - but there's nothing we can really do about that :)16:10
tristanCmordred: indeed, that's odd we are not using them as regular ansible task16:14
mordredyeah. It's always been weird to me that it's not a more first-class construct. but oh well16:15
*** avass has quit IRC16:27
*** sshnaidm is now known as sshnaidm|afk16:27
*** mattw4 has joined #zuul16:31
*** mattw4 has quit IRC16:32
*** jamesmcarthur has quit IRC16:33
*** jamesmcarthur has joined #zuul16:37
*** zxiiro has quit IRC16:45
*** mattw4 has joined #zuul16:51
*** rfolco|eats is now known as rfolco16:51
*** mattw4 has quit IRC16:56
tristanCmordred: thanks for the review on the zuul-operator. The next thing we'll need is the zuul secret to get the image promotion: https://review.opendev.org/704187 needs a config core to zuul-encrypt the dockerhub password16:58
*** mattw4 has joined #zuul17:08
corvustristanC: can you update the nodepool dockerfile to match the user approach in zuul?17:33
*** evrardjp has quit IRC17:33
*** evrardjp has joined #zuul17:34
tristanCcorvus: yes, should we remove the entrypoint too?17:35
corvustristanC: yes, i think so.  it's probably not as dangerous there, but still could be problematic on a builder.17:37
openstackgerritTristan Cacqueray proposed zuul/nodepool master: Dockerfile: create a nodepool user with uid 10001  https://review.opendev.org/70549717:38
openstackgerritTristan Cacqueray proposed zuul/nodepool master: Dockerfile: remove the uid_entrypoint service  https://review.opendev.org/70524117:41
tristanCcorvus: in case the removal is found problematic, here are the required changes split in two ^17:42
clarkbtristanC: corvus I've gone through all but the last change in the operator stack and left thoughts17:49
clarkbFor the dhall itself it looks like it got cleaned up quite a bit which is nice17:49
clarkbstill a bit of nesting but looks like the vars from parent scope are used in child scopes and things are labeled well17:49
clarkbcorvus: https://review.opendev.org/#/c/703624/7/roles/zuul-restart-when-zuul-conf-changed/library/dump_zuul_changes.py has a question you might want to look at in particular because I think it is a general zuul thing17:51
corvusclarkb: tricky.  in the long run, the ha-scheduler work should obviate the need for any of this, and avoids that issue.  i don't want to invest too much time working around the status quo and would rather look ahead to implementing that.  however, adding a 'do-not-restore' flag isn't a big deal.  but really, you'd still need to be careful about pipelines like that anyway (if you added the flag, then a17:57
corvusrestart happened, you could be missing a bunch of release jobs).  so really, the best thing is probably for folks to just be careful about when they do things that cause scheduler lifecycle events (or else, just don't support that for now).17:57
corvusi pretty much answered that with stream-of-consciousness didn't it?17:58
corvusi guess to some up: there be dragons, and one way or another we either shrug off the difficulties, or just don't do that for now.17:58
corvussum up even17:58
clarkbcorvus: ya I think the difference is the operator hides that from the humans17:59
clarkbtoday without the operator you have to think about those consequences17:59
clarkbwith the operator you won't notice until it breaks your releases (or similar)17:59
*** electrofelix has quit IRC18:06
*** jamesmcarthur has quit IRC18:12
tristanCclarkb: thank you for the review, i replied to your comment18:16
*** jamesmcarthur has joined #zuul18:19
clarkbtristanC: I guess centos8 doesn't have openshift/kubectl packages yet?18:21
tristanCclarkb: it doesn't seem like there is one already. it isn't part of centos7 either, it's only provided by the PaaS SIG18:23
clarkbbut the paas sig packages for centos 718:24
clarkb(mostly just weird to be using old centos here since the base image is fedora iirc)18:24
clarkbI guess if it gets compiled into a statically linked binary from golang then it doesn't matter much18:25
clarkbI think we should add a note to https://review.opendev.org/#/c/703631/5/roles/zuul-reconfigure-tenant-when-conf-changed/library/k8s_exec.py about when we can remove it18:26
tristanCclarkb: operator-framework images are based on ubi (rhel8), and yes, static golang package install works across el7 and el818:27
tristanCclarkb: yes, let me update that file, and i can switch to tarball install from https://dl.k8s.io/v1.17.0/kubernetes-client-linux-amd64.tar.gz18:28
Shrewscorvus: line 425 of your change in https://review.opendev.org/#/c/705053/2/zuul/driver/gerrit/gerritconnection.py  ... i feel like that string interpolation is missing the 'age' param, or did i miss something?18:28
Shrewschanges = self.connection.simpleQueryHTTP("status:merged -age:%ss")18:29
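What Shrews is pointing at: without a `%` operand the `%s` stays in the query as literal text. A minimal sketch of the intended interpolation follows; `merged_since_query` and `age_seconds` are hypothetical names, and the log does not show how the change was actually fixed.

```python
def merged_since_query(age_seconds):
    """Build a Gerrit query for changes merged within the last N seconds."""
    # "%ss" is "%s" (the substituted value) followed by a literal "s" unit.
    return "status:merged -age:%ss" % (age_seconds,)
```

With `age_seconds=30` this yields `status:merged -age:30s`, whereas the un-interpolated string would be sent to Gerrit with `%s` still in it.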
openstackgerritTobias Henkel proposed zuul/zuul master: Add foreground option  https://review.opendev.org/63564918:30
openstackgerritTobias Henkel proposed zuul/zuul master: Deprecate -d switch for running in foreground  https://review.opendev.org/70518518:30
openstackgerritTobias Henkel proposed zuul/zuul master: Don't enforce foreground with -d switch  https://review.opendev.org/70518918:30
mordredShrews: I agree with you18:33
Shrewsi thought maybe there is some scary magic in simpleQueryHTTP... but does not appear to be the case18:33
*** armstrongs has joined #zuul18:39
*** armstrongs has quit IRC18:42
*** plaurin has joined #zuul18:47
*** jamesmcarthur has quit IRC18:54
plaurinHello irc people! Quick question, when using kubernetes with zuul, the log streams seem a bit different. I am reusing a job that used to work on static ssh nodes, but now I seem to have less output when it's running on a pod than on an ssh node18:55
plaurinrunning on a static node seems to be more verbose by default than kubectl/openshift18:56
tristanCplaurin: that's expected, the zuul_stream module used to stream console logs requires a tcp connection, which is not available with pods18:56
plaurinany possible workaround? That's a major impact for me if I cannot have proper log stream output (I have jobs that run for 4+ hours)18:58
plaurinother than using debug everywhere19:00
tobiashplaurin: with the current log streaming mechanism I don't see any easy way of improving this unfortunately19:04
tobiashit might be possible to leverage kubectl port-forward to implement this for kubectl connections19:05
tobiashfor this to work one would need to map an unused port to each host and override the log streaming target to localhost:<chosen port> for the according host19:06
tobiashand run a kubectl port forward process in parallel similar to the ssh-agent19:07
clarkbthinking out loud here, could you stream the `docker log $container` logs19:07
clarkbthat should capture stdout and stderr right?19:07
tobiashclarkb: zuul expects to get the log stream per command task19:08
clarkbtobiash: I thought it was per playbook.19:08
tobiashit's per task and also the log stream signals the end (and result code) of the task afaik19:09
plaurinoh that's why I lose my stream for my tasks that take 1+ hours19:09
clarkbI think I understand what you are saying. I think I'm suggesting that we don't run the ansible plugin that does that at all19:09
clarkband instead bypass ansible and look directly at the stdout from the container. But ansible probably eats all that I guess19:10
clarkbmaybe we can update the ansible plugin to write to stdout under k8s instead of to a file that is streamed?19:10
clarkbthen on the executor side we can stream it from `docker logs` or similar19:10
tobiashthere are also some nifty logging poc refactorings by mordred that might resolve this issue as well19:12
clarkbessentially what we've done is build our own docker log utility for VMs19:13
clarkband I think we should be able to use docker log (or equivalent) when running in k8s?19:13
corvusclarkb, tobiash: iirc, the architecture here is that zuul has a callback plugin which gets called on every task.  if the task is a "command" task, we start streaming from the remote host.  now we're talking about a kubectl exec task, right?  so maybe we can update that callback plugin to say "if the task is kubectl exec, start a 'kubectl logs -f' process"19:14
corvusclarkb: (kubectl logs == docker logs, but does not have to run on the same host)19:14
clarkbcorvus: ya I think that is the more specific version of what I'm trying to say19:14
corvusclarkb: yeah, sorry -- i should have specified i was trying to articulate your idea in more detail :)19:15
mordredcorvus, clarkb: that sounds like a decent and tractable idea19:15
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: Add tenant reconfiguration when main.yaml changed  https://review.opendev.org/70363119:15
plaurinseems like a real good idea to me, I would be the 'first customer' of such a feature19:16
tobiashyes, sounds doable as long as we keep outputting the same end markers19:16
tobiashftr, this is the mordred redesign logging stack: https://review.opendev.org/#/q/topic:zuul-stream-rework+(status:open+OR+status:merged)19:17
corvusactually, is this where we would implement it?  https://opendev.org/zuul/zuul/src/branch/master/zuul/ansible/base/callback/zuul_stream.py#L269-L27119:17
corvus(i guess it's a command task on a kubectl connection we're talking about?)19:18
tobiashcorvus: yes, that's the receiving side19:18
corvusso that becomes "start a kubectl logs thread" instead of "start a tcp receiver thread"19:19
tobiashI think if we go that route we should abstract the streamer here: https://opendev.org/zuul/zuul/src/branch/master/zuul/ansible/base/callback/zuul_stream.py#L27519:19
tobiashmaybe change that from a plain thread with a function to a streamer base class that is/has a thread and specializations for tcp and kubectl19:20
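The abstraction tobiash describes could look roughly like the sketch below: a base class that is a thread, with the source of the stream left to subclasses. All class and method names here are hypothetical and not taken from `zuul_stream.py`; a kubectl-based specialization would override `open_stream` in the same way as the TCP one.

```python
import socket
import threading


class Streamer(threading.Thread):
    """Hypothetical base: reads lines from a source and hands each one
    to a callback."""

    def __init__(self, on_line):
        super().__init__(daemon=True)
        self.on_line = on_line

    def open_stream(self):
        """Return a binary file-like object to read lines from."""
        raise NotImplementedError

    def run(self):
        with self.open_stream() as stream:
            for raw in stream:
                self.on_line(raw.decode("utf-8", "replace").rstrip("\n"))


class TcpStreamer(Streamer):
    """Specialization that reads from host:port over plain TCP."""

    def __init__(self, host, port, on_line):
        super().__init__(on_line)
        self.host, self.port = host, port

    def open_stream(self):
        sock = socket.create_connection((self.host, self.port))
        stream = sock.makefile("rb")
        sock.close()  # the real close is deferred until `stream` is closed
        return stream
```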
corvustobiash: good idea; that may help with mordred's rework too19:20
mordred++19:20
corvus(because then you could replace the tcp with domain sockets, and not touch the kubectl)19:20
clarkbtristanC: ianw can you check my comment on https://review.opendev.org/#/c/705337/1 and see what you think?19:21
tobiashcorvus: btw, hadn't we the same exception for winrm there? https://opendev.org/zuul/zuul/src/branch/master/zuul/ansible/base/callback/zuul_stream.py#L26919:22
tobiashwith winrm we don't support streaming yet as well19:23
clarkbtobiash: corvus we would still need to update the python that runs in ansible on the remote side to write to stdout instead of to the log file that is streamed right?19:23
plaurintobiash, corvus, mordred: if I can help in some way let me know, else you have my full support for this discussion around logs 😎️19:23
corvustobiash: that seems like a good idea -- do you get errors or warnings or anything for failed streaming attempts?19:23
corvusplaurin: are you interested in doing some python hacking?  :)19:23
tobiashcorvus: sometimes yes, and I thought I was pretty sure that I added this even before the kubectl thing19:24
tobiashclarkb: yes, that's the other thing that's needed19:24
corvusclarkb, tobiash, mordred: oh wait -- are we talking about the case where we run a pod that just sits there and does nothing as the main process while we run other commands on the pod?19:25
corvusbecause if that's the case, then kubectl logs isn't going to produce any useful output19:25
corvus(the pod-as-machine case or whatever we called it)19:25
clarkbcorvus: I think what happens is because we don't have log streaming all of the ansible stuff that happens on the pod is written to a file that we can't get at19:26
tobiashcorvus: oh yes I think so19:26
*** hashar has joined #zuul19:26
tobiashcorvus: but we could still do a kubectl port-forward and stream from localhost thing with the same abstraction in the callback19:26
clarkbif we changed that ansible plugin thing to write to stdout we would be able to get that data from the logs command19:26
tobiashin that case we wouldn't even need to modify the command module19:27
clarkbif however it isn't using that plugin thing then ya I think what tobiash says is correct19:27
tristanCclarkb: zj prefix wfm. though i've been using '_' successfully, and it's surprising the doc doesn't mention this as valid since it's common for python variable name.19:27
clarkbwe just grab the stdout anyway19:27
clarkbtristanC: I think my biggest concern would be the next release of ansible deprecating that it works :/19:28
clarkbtristanC: I can use zj_ as a prefix as that is valid according to the docs and should make it unique enough for us19:28
tobiashcorvus: phew windows is filtered out at a different point: https://review.opendev.org/#/c/615804/1/zuul/ansible/callback/zuul_stream.py19:28
tobiashI almost thought this got broken19:29
plaurincorvus: not really interested for contributing python, you would have to take me by the hand anyways to produce anything of value, but I'm willing to try and test patches on my setup19:29
corvusplaurin: ack.  it looks like we're back to the design phase anyway :/19:29
corvustobiash: i think i'm not following the forward idea19:29
openstackgerritMerged zuul/zuul master: Add gcloud_service auth option for Gerrit driver  https://review.opendev.org/70490419:30
corvusclarkb: so your idea is change the kubectl connection plugin to output whatever it gets to stdout?19:30
tobiashcorvus: instead of kubectl logs the streamer could execute kubectl port-forward <some local port>:<log port> and then stream from localhost:<some local port>19:30
tobiashthat would be possible as well if we do the streamer abstraction19:31
plaurincorvus: also for now I might be able to have a workaround using async tasks + debug statements, that might work19:31
tobiashto be precise: "kubectl port-forward pod/mypod :19885", parse the output to get local port and then stream from localhost:<parsed port>19:33
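The "parse the output" step might be sketched as follows. It assumes kubectl announces the forward with a line of the form `Forwarding from 127.0.0.1:<port> -> 19885`, which is worth verifying against the kubectl version in use; `parse_local_port` is a hypothetical helper.

```python
import re

# Assumed shape of the line kubectl prints, e.g.
#   Forwarding from 127.0.0.1:40123 -> 19885
_FORWARD_RE = re.compile(
    r"Forwarding from (?:127\.0\.0\.1|\[::1\]):(\d+) -> (\d+)"
)


def parse_local_port(line, remote_port=19885):
    """Return the local port chosen for `remote_port`, or None if no match."""
    match = _FORWARD_RE.search(line)
    if match and int(match.group(2)) == remote_port:
        return int(match.group(1))
    return None
```

A streamer would start `kubectl port-forward` as a child process, feed its stdout lines through this function until it returns a port, and then connect to `localhost` on that port.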
clarkbcorvus: similar to how the command stuff happens for VMs but instead of writing to a file write to stdout then it can be streamed from k8s19:34
corvustobiash: in this situation, does the modified command module still run?19:34
tobiashcorvus: yes, the same command module as we use right now19:34
tobiashand the zuul_console module as we use in vms to serve on 19885 as well19:35
corvusso when you do "command: /bin/true" on a vm, it copies an ansiball over ssh to the vm and executes it with python.  when you do the same for a host with a kubectl connection type, it also copies an ansiball over (via kubectl) and executes it with python?19:36
tobiashthat's what I expect19:36
corvusok, then yeah, the port-forward thing sounds like it would be feasible then19:37
tobiashotherwise you'd need kube_foo modules that work differently just as it is with windows19:37
tobiashwe could make the streamers even stackable so we can reuse the tcp streamer then19:37
corvusclarkb: ^ if that's the case, then i think that's why the 'stdout' approach wouldn't work -- the stdout would actually just be the json, just like a regular ssh connection19:38
openstackgerritJames E. Blair proposed zuul/zuul master: Gerrit: poll for merged changes if no stream events  https://review.opendev.org/70505319:38
openstackgerritJames E. Blair proposed zuul/zuul master: Add google-cloud-storage to executor ansible  https://review.opendev.org/70527919:38
clarkbcorvus: ah19:38
corvusShrews, mordred: ^ derp, thanks.19:39
clarkbactually wait19:39
clarkbwhat we write to a file on the VM isn't json19:39
corvusclarkb: no i mean what ansible sees is json19:39
clarkbso we should be able to write the same data to stdout instead?19:39
tobiashclarkb: you mean the pod's stdout, not ansible's stdout?19:40
clarkbyes19:40
tobiashyes, would be possible but makes things more complicated than needed19:40
clarkbwhere k8s will collect it19:40
clarkbI think that avoids the need for port 1988519:40
clarkbbecause stdout is already aggregated and streamable19:41
tobiashthat port would be only open inside the pod19:41
corvusclarkb: do you mean have the pod's main command be "tail -f /zuul_logfile"?19:41
tobiashaccess is via kubectl port-forward then19:41
tobiashso the port doesn't need to be reachable19:41
clarkbcorvus: that is one way to do it. Can't we just write to fd 0 instead of zuul_logfile though?19:42
tobiashand executing kubectl port-forward is as easy as executing kubectl log19:42
clarkber fd 119:42
corvusclarkb: in this situation, there are 2 processes on the pod19:42
corvusclarkb: the main process is basically "sleep infinity".  that's what kubernetes thinks it's running, and that's what you'd get with kubectl logs.  the other is what ansible runs via kubectl exec19:43
tobiashI'd prefer using port-forward over writing into stdout of a foreign process ;)19:43
clarkbcorvus: hrm, why aren't we just running the process directly?19:43
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: Add CONTRIBUTE file  https://review.opendev.org/70553519:43
corvusclarkb: because the pod terminates at the end of the process.19:43
mordredclarkb: because we need to run multiple commands in the same container19:44
corvusclarkb: we asked nodepool for a long-running pod.19:44
clarkbI see19:44
clarkbin that case having the long running process either tail the file or have it write to port 19885 makes sense to me19:44
corvusclarkb: there is another option, which is to ask nodepool for a namespace, then you can run as many pods as you want, but you're on your own there.19:44
clarkbya19:45
corvus(ie, zuul doesn't know anything about pods you create in that namespace)19:45
clarkbya I was ignoring the "native" case19:45
corvusit seems like at first blush, the port-forward will probably fit in easier with the existing code19:45
corvus(the "tail -f case" is actually a bit more complicated, because it's really "tail -f whatever the current file for the current command is")19:46
tobiashzuul-maint: this is a small fix that resolves occasional connection problems to zuul-web (which also can lead to missed events in case of github): https://review.opendev.org/70545919:53
*** irclogbot_3 has quit IRC20:05
*** irclogbot_2 has joined #zuul20:07
ianwclarkb: thinking about the loop variables -- "_item" would be a bad choice as again likely to conflict, maybe _zj_... actually satisfies everything (clearly don't use this, and also very unlikely to conflict)20:09
clarkbthe _ prefix is a problem according to the docs though20:09
clarkbI'm worried they'll start enforcing that rule of first char is letter20:10
ianwyeah, i wonder if that's really just a rule of thumb in the docs for a regular end-user, rather than people writing things to be used in a library situation20:11
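Note: the naming concern above can be illustrated with a small check. This is a sketch that assumes the rule is exactly as paraphrased in the conversation (first character is a letter, the rest are letters, digits, or underscores); it does not reflect what Ansible itself enforces. The candidate name `zj_item` is hypothetical.

```python
import re

# Rule as paraphrased in the chat: starts with a letter, followed by
# letters, digits, or underscores. This is an assumption, not Ansible's
# actual validation logic.
DOCS_RULE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def follows_docs_rule(name):
    """Return True if the variable name satisfies the paraphrased rule."""
    return DOCS_RULE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("_item", "_zj_item", "zj_item"):
        print(candidate, follows_docs_rule(candidate))
```

Under this reading, a prefix without the leading underscore keeps loop variables unlikely to collide while still satisfying the rule.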
*** mhu has quit IRC20:13
openstackgerritTobias Henkel proposed zuul/zuul master: DNM: Debug container build  https://review.opendev.org/70421520:19
*** jamesmcarthur has joined #zuul20:20
openstackgerritTobias Henkel proposed zuul/zuul master: Use apt mirror infrastructure during zuul-quick-start  https://review.opendev.org/64944820:22
openstackgerritTobias Henkel proposed zuul/zuul master: DNM: Debug container build  https://review.opendev.org/70421520:22
openstackgerritTobias Henkel proposed zuul/zuul master: Enable ansible cleanup  https://review.opendev.org/63601520:24
openstackgerritTristan Cacqueray proposed zuul/nodepool master: Dockerfile: create a nodepool user with uid 10001  https://review.opendev.org/70549720:29
*** jamesmcarthur has quit IRC20:38
openstackgerritIan Wienand proposed zuul/zuul-jobs master: upload-afs: rename to upload-afs-roots; add afs-upload-synchronize  https://review.opendev.org/70536820:42
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Remove deprecated upload-afs role  https://review.opendev.org/70537320:42
openstackgerritMerged zuul/zuul master: Gerrit: poll for merged changes if no stream events  https://review.opendev.org/70505320:43
openstackgerritMerged zuul/zuul master: Cap cheroot to fix issues with concurrent requests  https://review.opendev.org/70545920:54
*** Goneri has quit IRC21:16
fungiofosos: let me know how your connectivity explorations go (once you've sufficiently recovered from fosdem). i'm still en route home but should be around more usual hours on wednesday21:18
openstackgerritClark Boylan proposed zuul/zuul-jobs master: Use unique loop vars to avoid conflicts  https://review.opendev.org/70533721:35
openstackgerritMerged zuul/zuul master: Add google-cloud-storage to executor ansible  https://review.opendev.org/70527921:38
corvustristanC, mordred: can you +3 https://review.opendev.org/705313 soon?  it's trivial and having it in for the next restart would be helpful21:46
tristanCcorvus: done21:50
corvustobiash: +2 on scale-out-scheduler spec :)21:51
tobiashwohoo21:51
corvussuper cool.  when do we start? :)21:52
fungii think that means we just have started, right?21:53
*** jamesmcarthur has joined #zuul21:53
corvusor we started a year ago21:54
tobiashha scheduler is now one of our high prio topics. bolg is starting to work full time on this atm21:58
corvushuzzah!21:59
tobiashcorvus: maybe tomorrow we could discuss a way forward for the cyclic deps as well?22:00
corvustobiash: ah yes, i'll put that on my list :)22:00
tobiashthanks :)22:01
*** mattw4 has quit IRC22:09
*** mattw4 has joined #zuul22:09
pabelangerinterested in both HA scheduler and circular depends too22:24
*** saneax has quit IRC22:25
*** mattw4 has quit IRC22:29
*** mattw4 has joined #zuul22:29
*** jamesmcarthur has quit IRC22:32
*** rfolco has quit IRC22:36
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: use-buildset-registry: disable docker userland proxy  https://review.opendev.org/70275322:36
clarkbgoing over the operator stack again I've +2'd basically everything but https://review.opendev.org/#/c/702106/22 because tobiash left some really good points there and i'm not sure if we need to address those before merging or not22:43
openstackgerritMerged zuul/zuul master: Fix github app authentication to work with checks API endpoints  https://review.opendev.org/70516722:46
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: Add tenant reconfiguration when main.yaml changed  https://review.opendev.org/70363122:46
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: Add CONTRIBUTE file  https://review.opendev.org/70553522:46
tristanCclarkb: i'm happy to fix tobiash's comments in the review or as follow-ups, but those are also good learning opportunities22:49
tobiashI'm fine with followups, they are also easier to review22:51
tristanCthe operator stack also depends-on https://review.opendev.org/#/c/702753/ which may need a toggle before landing in zuul-jobs22:55
clarkbtristanC: I think we should set that config elsewhere since it isn't related to the registry right? It is zuul operator on docker specific (with gearman)22:57
clarkbtristanC: I believe the docker config manipulation there will honor other existing configs (it does a merge)22:57
clarkb(maybe in install-docker if it makes sense to have that as a global setting)22:58
openstackgerritMerged zuul/zuul master: Add some debug lines for provides/requires  https://review.opendev.org/70531323:00
*** jamesmcarthur has joined #zuul23:11
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: DNM: install-docker: enable setting docker userland proxy  https://review.opendev.org/70275323:16
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: Add OpenShift SCC and functional test  https://review.opendev.org/70275823:16
tristanCclarkb: ok, testing the new install-docker approach with 70275823:17
*** jamesmcarthur has quit IRC23:17
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: Add upload-logs-gcs role  https://review.opendev.org/70371123:20
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: Add index_links option to zuul manifest  https://review.opendev.org/70558023:20
*** hashar has quit IRC23:26
*** plaurin has quit IRC23:37
openstackgerritJames E. Blair proposed zuul/zuul master: web: link to index.html if index_links is set  https://review.opendev.org/70558523:45
