Friday, 2019-03-22

openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Add API endpoint to get frozen jobs  https://review.openstack.org/60707700:07
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Get executor job params  https://review.openstack.org/60707800:07
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Separate out executor server from runner  https://review.openstack.org/60707900:10
tristanC^ just fixing yet another rebase conflict00:10
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: runner: implement prep-workspace  https://review.openstack.org/60708200:11
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: runner: add configuration schema  https://review.openstack.org/64067200:11
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: runner: add execute sub-command  https://review.openstack.org/63094400:11
*** irclogbot_3 has joined #zuul00:18
*** jamesmcarthur has quit IRC00:30
*** jamesmcarthur has joined #zuul01:46
*** jamesmcarthur has quit IRC02:03
*** bjackman has joined #zuul02:34
*** jamesmcarthur has joined #zuul03:51
*** jamesmcarthur has quit IRC04:12
*** daniel2 has quit IRC04:30
*** wxy-xiyuan has quit IRC04:31
*** dcastellani has quit IRC04:31
*** wxy-xiyuan has joined #zuul04:31
*** spsurya has quit IRC04:31
*** maxamillion has quit IRC04:31
*** PrinzElvis has quit IRC04:31
*** gundalow has quit IRC04:32
*** kmalloc has quit IRC04:32
*** hogepodge has quit IRC04:32
*** jbryce has quit IRC04:33
*** spsurya has joined #zuul04:34
*** PrinzElvis has joined #zuul04:34
*** gundalow has joined #zuul04:34
*** hogepodge has joined #zuul04:34
*** jbryce has joined #zuul04:38
*** daniel2 has joined #zuul04:38
*** dcastellani has joined #zuul04:39
*** kmalloc has joined #zuul04:40
*** jamesmcarthur has joined #zuul04:52
*** jamesmcarthur has quit IRC04:57
*** raukadah is now known as chandankumar05:06
*** saneax has joined #zuul06:09
*** swest has joined #zuul06:13
*** bjackman has quit IRC06:41
*** bjackman has joined #zuul06:41
*** pcaruana has joined #zuul07:27
*** gtema has joined #zuul08:22
*** jpena|off is now known as jpena08:51
*** gtema has quit IRC09:00
openstackgerritMerged openstack-infra/nodepool master: Update docs for provider removal.  https://review.openstack.org/64522009:27
arxcruz|ptohey guys, can we have some love on https://review.openstack.org/#/c/607077/ ?09:28
*** bjackman has quit IRC09:42
*** saneax has quit IRC10:01
*** saneax has joined #zuul10:03
*** hashar has joined #zuul10:23
openstackgerritLuigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism  https://review.openstack.org/64523910:42
*** dcastellani has quit IRC10:59
*** spsurya has quit IRC10:59
*** dcastellani has joined #zuul11:00
*** spsurya has joined #zuul11:01
openstackgerritFabien Boucher proposed openstack-infra/zuul master: Elasticsearch Zuul reporter  https://review.openstack.org/64492711:12
*** arxcruz|pto is now known as arxcruz11:31
*** pcaruana has quit IRC11:53
*** Guest12731 has joined #zuul12:08
*** rlandy has joined #zuul12:14
*** logan- has quit IRC12:14
*** Guest12731 is now known as logan-12:14
openstackgerritLuigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism  https://review.openstack.org/64523912:19
*** jpena is now known as jpena|lunch12:36
*** pcaruana has joined #zuul12:54
*** hashar has quit IRC12:58
*** altlogbot_2 has quit IRC13:01
*** irclogbot_3 has quit IRC13:01
*** altlogbot_0 has joined #zuul13:03
*** irclogbot_0 has joined #zuul13:03
openstackgerritLuigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism  https://review.openstack.org/64523913:08
openstackgerritFabien Boucher proposed openstack-infra/zuul master: Elasticsearch Zuul reporter  https://review.openstack.org/64492713:09
openstackgerritLuigi Toscano proposed openstack-infra/zuul-jobs master: DNM Debug stage-output, change archival mechanism  https://review.openstack.org/64523913:26
*** jpena|lunch is now known as jpena13:36
fboHi, we have a Zuul user that was looking for a way to send build results into elasticsearch (then explore/graph via kibana). I did a reporter proposal here https://review.openstack.org/644927, do you think that is something relevant to have into Zuul ?14:26
*** smyers has quit IRC14:27
corvusarxcruz: the zuul-runner stack is our next priority -- we're finishing up the multi-ansible stuff this week (i think we need to make one more point release).  see the most recent project update email.14:28
arxcruzcorvus: no problem, i wait until now, i can wait a little bit more :)14:29
arxcruzjust wanted to ensure it's not missed :)14:29
corvusarxcruz: so close! :)  thanks!14:29
*** smyers has joined #zuul14:37
corvusfbo: i'll give it a quick look :)14:37
fbocorvus: thanks :)14:44
corvusfbo: that seems like a fine idea -- i left one quick question, and will give it a more detailed review later.  we'll definitely want clarkb to review that too :)14:46
mordredfbo: yeah - I saw that review come by yesterday - the concept seems like a potentially neat option for zuul users14:57
mordredfbo: (although I was on an airplane and haven't actually, you know, looked at it)14:57
pabelangerdoes anybody have thoughts on graylog? that came up in discussion recently for log mgmt15:08
*** jamesmcarthur has joined #zuul15:12
fbocorvus: mordred yes and we are excited to build nice dashboard on top of that data. We tried before based on the log artifacts exported to logstash/elastic but had to find a unique line of log like (Job console starting...) and that was cumbersome15:18
fbohaving both: build/buildset data + log artifacts in elk is a nice have15:20
*** altlogbot_0 has quit IRC15:21
fbopabelanger: problem with elk is the authentication (rely on x-pack extension (not free)) and it seems graylog have the support15:24
*** altlogbot_2 has joined #zuul15:26
clarkbfbo: corvus: two things I notice really quickly are that we only allow for a single uri? you may want to take a list so that you can have fallback nodes. Also I think you want indexes to rollover on some period for management purposes15:28
clarkbif an index becomes corrupt being able to delete a days worth of data is worthwhile. At least for log data when you are talking terabytes of data. The zuul data may be small enough that isn't a huge concern15:29
*** irclogbot_0 has quit IRC15:30
pabelangerfbo: yah, I haven't looked too much at graylog myself, aside from looking at the website.15:31
openstackgerritLuigi Toscano proposed openstack-infra/zuul-jobs master: stage-output: fix the archiving of all files  https://review.openstack.org/64523915:31
*** irclogbot_0 has joined #zuul15:33
*** saneax has quit IRC15:33
*** irclogbot_0 has quit IRC15:36
fboclarkb: yes good points the driver should take a comma separated list of el nodes (I kept it simple for the first implementation). Also yes the index should not grow as fast as with logs but splitting index might be useful for the reason you gave. We can think of having a strategy based on the number of docs in the index or simply by date to split the indexes.15:37
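A minimal sketch of the two ideas in this exchange (a comma-separated list of Elasticsearch nodes, and splitting indexes by date), in Python. The function names, the `zuul` index prefix, and the `prefix-YYYY.MM.DD` naming are illustrative assumptions; none of them are taken from the reporter proposed in https://review.openstack.org/644927.

```python
import datetime


def parse_hosts(value):
    """Split a comma-separated host string into a list, dropping blanks."""
    return [h.strip() for h in value.split(",") if h.strip()]


def index_for(prefix, day):
    """Return a per-day index name so one day of data can be deleted alone."""
    return "{}-{}".format(prefix, day.strftime("%Y.%m.%d"))


# Hypothetical values, for illustration only.
hosts = parse_hosts("es1:9200, es2:9200")
index = index_for("zuul", datetime.date(2019, 3, 22))
```

With per-day names, a corrupt index costs at most one day of build data, which is the management benefit clarkb describes above.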
*** irclogbot_3 has joined #zuul15:38
SpamapSfbo: regarding parsing the console.. did you try sending the json in as a document?15:49
SpamapSI guess it's a bunch of lists.. might not be useful in elastic15:50
SpamapSbut.. there should be no reason to parse the text. You have everything you need in the json.15:50
SpamapSAnd really I don't know that you need a reporter. You could make a post-run playbook that feeds into elastic pretty easily.15:50
*** hashar has joined #zuul15:52
fboSpamapS: that's not about parsing the console. You can see it as the same as the sql reporter but for elastic so only related to build and buildset data. In a post-run playbook some info will be missing like a SKIPPED job result.16:07
SpamapSfbo: there are skipped job results?16:12
SpamapSOh I guess if parents fail16:13
*** smyers has quit IRC16:14
SpamapSAnyway, IMO you can get pretty far just scraping the database and shoving it into elastic.16:14
SpamapSThe problem with reporters is they tax the scheduler, which is already really, really busy.16:14
*** smyers has joined #zuul16:15
SpamapS(of course, we could fix that with some gearman or zk job farming.. but... moar components?)16:15
pabelangercould you not get data from mqtt reporter? then offload that to some other publisher?16:19
fboSpamapS: yes skipped child job due to parent failure. Yes more load but still that is configurable/activable by pipeline16:20
fbopabelanger: yes that's possible but a bit more complicated to setup for a Zuul operator imo.16:22
pabelangeragree, would put more work on deployers16:22
pabelangermight be a good way to scale out too16:23
clarkbit's not necessarily more work on deployers if it is automatically forked like the geard process16:30
pabelangeroh, yah.  In my brain, I was thinking about how openstack logstash workers were setup16:32
fbook you mean still part of zuul, like a zuul-sql-reporter or zuul-elastic-reporter16:32
fboreading gearman report jobs from their own geard16:33
*** maxamillion has joined #zuul16:37
clarkbwell if the main scheduler process always ran an (internal) mqtt reporter then you could fork off reporters for the other backends16:40
clarkbthen the cpu required to report is limited to mqtt (or whatever other internal bus is chosen)16:40
clarkbI don't know if that is actually worthwhile, but is one approach that could be taken16:46
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Update component diagram to show statsd  https://review.openstack.org/64579816:58
pabelangerwould love to get some eyes on ^ for statsd integration. Looking at code, I could only see executors and scheduler sending data to statsd.16:58
*** chandankumar is now known as raukadah17:02
*** jamesmcarthur has quit IRC17:05
corvustobiash: shall i tag efae4deec5b538e90b88d690346a58538bd5cfff as 3.7.1 ?17:10
tobiashcorvus: ++17:12
tobiashcorvus: but I found another bug, default_ansible_version seems to be ignored in zuul.conf17:12
tobiashbut that might be less critical17:13
corvustobiash: think you might have a fix soon?  if so, we could wait for it; if not, we could do 3.7.1 today and 3.7.2 next week17:16
tobiashpabelanger: merger should send data to statsd too17:16
tobiashcorvus: define 'soon'17:17
corvustobiash: 3 hours? :)17:17
tobiashchallenge accepted17:18
corvustobiash: cool... would you mind adding a release note about that and also the uri module fix?  doesn't have to be much, but i just realized we don't have any release notes and it seems weird to have a release without at least one.17:18
tobiashk, will do17:19
tobiashwas uri only broken for 2.7?17:19
corvustobiash: yes17:19
pabelangertobiash: Hmm, I didn't see a code path for that, but also could just be blind :)17:22
pabelangerlooking again17:22
*** jamesmcarthur has joined #zuul17:22
tobiashpabelanger: the merger queue17:23
tobiashoh, the merger queue probably comes from the scheduler17:24
pabelangertobiash: http://git.zuul-ci.org/cgit/zuul/tree/zuul/scheduler.py#n39617:24
pabelangeryah17:24
tobiashthen I think you're right17:24
openstackgerritMerged openstack-infra/zuul-jobs master: Add fetch-sphinx-tarball role  https://review.openstack.org/64534617:26
openstackgerritMerged openstack-infra/zuul-jobs master: Add download artifact role  https://review.openstack.org/64538417:26
*** jamesmcarthur has quit IRC17:27
*** jamesmcarthur has joined #zuul17:34
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix ignored default ansible version  https://review.openstack.org/64581917:47
tobiashcorvus: ^17:47
Shrewsclarkb: you will be happy (or sad?) to hear that i have found multiple issues around the nodepool build process, none of which i have solutions for atm17:49
tobiashShrews: you mean the image build process?17:49
Shrewstobiash: yes17:49
Shrewstobiash: this is causing some image files to be left unmanaged on the builders17:50
tobiashwe're leaking massively images in our clouds btw17:52
tobiashneed to look into that as well17:52
tobiashevery few weeks I need to delete 10-20TB of images from our cloud :-/17:53
Shrewstobiash: the leak seems mostly related to losing the ZK session during the image build process17:53
pabelangertobiash: wow17:54
Shrewsthe other problem is that we are creating two different image build znodes for a single image build17:54
Shrewsthat one seems more easily fixed, but not sure its impact on the leak yet17:54
Shrewsif any17:54
tobiashShrews: do you think it's viable to fail and delete the image if we lost the lock?17:55
tobiashsure, it can be very expensive, but that image is probably lost anyway?17:55
clarkbif dib has been killed externally then there isn't really a good way to recover17:56
Shrewstobiash: it's supposed to do that already (state is BUILDING but no build lock). that might be due to the 2-znode problem17:56
tobiashah ok17:56
Shrewsi still see issues with that though17:57
Shrewsi think we can force kick the cleaning process when the build finishes, rather than let the cleanup thread do it17:58
fungithough is the cleanup thread also not working?18:03
Shrewsfungi: it works, but it's a timing thing. the lost zk session causes the cleanup thread to begin a build that may still be in process (thus new files may appear after we think we've deleted them)18:05
Shrewsbegin to cleanup* a build, that is18:05
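The check Shrews mentions (state is BUILDING but no build lock) can be sketched roughly as below, in Python. `BuildRecord` and `orphaned_builds` are hypothetical stand-ins written for illustration; they are not nodepool's classes or its ZooKeeper API.

```python
from dataclasses import dataclass


@dataclass
class BuildRecord:
    """Hypothetical stand-in for an image build record; not nodepool's class."""
    build_id: str
    state: str
    locked: bool


def orphaned_builds(records):
    """Return builds marked BUILDING that no builder holds a lock for."""
    return [r for r in records if r.state == "BUILDING" and not r.locked]


records = [
    BuildRecord("0001", "BUILDING", True),   # still being built
    BuildRecord("0002", "BUILDING", False),  # lock lost: cleanup candidate
    BuildRecord("0003", "READY", False),     # finished normally
]
stale = [r.build_id for r in orphaned_builds(records)]
```

As the discussion notes, a build whose ZK session was lost may still be writing files, so a check like this alone would not close the timing gap.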
*** jpena is now known as jpena|off18:13
corvustobiash: thanks!18:18
corvusi think we can issue the release after that lands without an openstack-infra burn-in18:19
*** jamesmcarthur has quit IRC18:19
*** jamesmcarthur has joined #zuul18:20
corvusSpamapS, fbo: it's true that reporters are run in the scheduler main thread, however, as long as we keep them simple, i don't think they should have too much of an impact -- a pipeline usually doesn't have too many, they don't run that often, and usually they're just 'fire and forget' -- should only take a few hundred ms.  i think we'll have scale-out schedulers before reporter cpu-time becomes a significant18:21
corvusproblem.18:21
corvusSpamapS, fbo: it's also true that many 'reporting' actions can be handled in jobs (indeed, we do that in openstack-infra with our logstash processor), there is a difference in the data available to them -- jobs necessarily report about themselves, whereas reporters have the big picture of a buildset.  so it's worth considering which one is right for a given application.18:23
*** jamesmcarthur has quit IRC18:24
SpamapScorvus: scaling out reporters would be another use case for the "cleanup job" concept.. have the cleanup job look at the whole tree and do all of the reporter work.18:24
SpamapSbut alas, the time.18:25
corvusSpamapS: yes... the infinite amount of time in the universe which we are unable to access... :(18:25
* SpamapS shakes fist at time and space18:26
*** pcaruana has quit IRC18:45
pabelangerQuestion: If you don't have zuul-fingergw running, is zuul-web smart enought to try to connect to zuul-executors streaming port directly? Or does it need to connect via zuul-fingergw?18:48
pabelangerenough*18:49
pabelangerI know the finger client wouldn't work, because the port is not 79/tcp18:49
Shrewspabelanger: iirc, it should get the streaming info (server and port), via gearman. it wouldn't go through the zuul-fingergw18:56
Shrewszuul-fingergw gets the info the same way and should just be for finger clients18:57
pabelangerShrews: ack, thanks!18:58
corvuspabelanger: any reason you can't run fingergw?19:07
pabelangercorvus: nope, booting it now. Was mostly curious if executors were public, if it was really needed19:09
pabelangerwas also trying to see flow for firewalls19:09
corvuspabelanger: it's needed so that "finger uuid@zuul" works, which i think is a useful feature :)  but it's just like web -- only the web and fingergw services need to be publicly accessible; executors never do -- they only need to be accessible by the web and fingergw processes.19:11
pabelangergreat, thats how I remembered it19:12
pabelangerthanks19:12
*** saneax has joined #zuul19:17
tobiashcorvus: do you think it makes sense to increase the default wait_timeout further to 90s?19:19
tobiashwe're currently rechecking all the time :(19:20
corvustobiash: maybe so19:21
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Add web / fingergw connections for components graph  https://review.openstack.org/64585219:21
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Increase default wait_timeout  https://review.openstack.org/64585319:22
SpamapSBTW, I failed to notice before so c'est la vie, but did we break config file compatibility again? I think we did by requiring the user to be set in fingergw20:02
corvusSpamapS: yes, we batched a couple of those changes up and made sure to highlight them in the release upgrade notes -- so 3.7.0 is the "you may need to pay attention to your deployment settings" release20:19
corvus3 changes total -- multi-ansible, zookeeper connection default, and fingergw user20:20
SpamapSYeah that's ok. Just making sure I understand.20:20
corvusmaybe we should name our releases like that :)20:20
SpamapSI don't use fingergw, so it doesn't affect me20:20
SpamapS(why would I need to use fingergw?)20:20
corvusSpamapS: because it's awesome?  :)  "finger uuid@zuul | grep -i error"20:20
SpamapScorvus: we should look for quotes from Ghostbusters to adequately convey that need. ;)20:21
corvusSpamapS: nice :)20:21
corvusi'll, um,  get right on that research project :)20:21
SpamapS3.7.0 - "I'm warning you, turning off these machines would be extremely hazardous."20:21
SpamapS4.0.0 can be "Many Shubs and Zuuls knew what it was to be roasted in the depths of a Sloar that day, I can tell you!"20:24
corvusgold20:25
SpamapSWe should rename nodepool to Sloar.20:27
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Don't assume secrets are text in encrypt_secret  https://review.openstack.org/64588821:22
*** saneax has quit IRC21:32
*** mgoddard has quit IRC21:47
*** mgoddard has joined #zuul21:47
openstackgerritJames E. Blair proposed openstack-infra/zuul-jobs master: Minor improvements to docker-image doc structure  https://review.openstack.org/64589721:49
openstackgerritJames E. Blair proposed openstack-infra/zuul-jobs master: Organize documentation by subject area  https://review.openstack.org/64595522:52
corvusAJaeger: ^ your feedback especially sought on that one22:52
clarkbcorvus: is the idea there that around the autojob loads you can write the narrative?22:55
corvusclarkb: yep, sort of like how i did the container images documentation for opendev/base-jobs (but probably with many fewer words)22:56
corvusso you could say "these next 3 roles are all about dealing with python releases", and put that in a subsection heading22:56
corvusthat way if a user is trying to find out what's available to help with a python project, they have a better tool than 'grep' :)22:57
corvus(or ctrl-f in browser)22:58
*** rlandy has quit IRC23:11
