clarkb | maybe they aren't connected to gearman? | 00:00 |
---|---|---|
clarkb | or is that a different error? | 00:00 |
krasmussen | For what it's worth I can at least see that the gearman port is up and taking connections. Again I'm new to this so I'm not sure what all I can do to better debug any potential issues on the gearman side of things though. | 00:02 |
clarkb | you can telnet/nc to the port and enter the 'status' command to see registered jobs | 00:02 |
krasmussen | Sweet thanks I'll try that. | 00:03 |
corvus | oh, yeah, we might be looking at a situation where there are no connected mergers and the job timed out | 00:04 |
corvus | i think the timeout is 5 minutes? so if it took 5 minutes to fail like that after starting, that's almost certainly it | 00:04 |
corvus | krasmussen: the "workers" command to gearman may also be useful | 00:05 |
krasmussen | Looks like gearman died with the scheduler start failure. I throw in a `status` via nc or telnet and it just closes the connection on me. | 00:07 |
corvus | krasmussen: is gearman using ssl? | 00:08 |
krasmussen | Ahh I think it is. | 00:08 |
corvus | krasmussen: here's the command we use to debug in our system; paths may need adjusting: `openssl s_client -connect localhost:4730 -cert /etc/zuul/ssl/client.pem -key /etc/zuul/ssl/client.key` | 00:09 |
krasmussen | Sweet, I was building a similar command via my google search :P | 00:09 |
krasmussen | Not overly sure how to read the gearman status but: https://pastebin.com/SVuRBN0J | 00:11 |
krasmussen | Going to google that now. | 00:11 |
SpamapS | beware folks, krasmussen is dealing with a Zuul that I built. ;-) | 00:15 |
krasmussen | That is true :P | 00:15 |
krasmussen | miss you @SpamapS | 00:16 |
SpamapS | ;) | 00:16 |
SpamapS | merger:cat 367 0 0 | 00:16 |
SpamapS | well there's yer problem | 00:16 |
SpamapS | Looks like no workers | 00:16 |
SpamapS | executors failing? | 00:16 |
SpamapS | krasmussen: maybe the gearman cert expired. | 00:17 |
SpamapS | ;) | 00:17 |
krasmussen | Nothing in the executors logs but maybe? | 00:17 |
krasmussen | gearman cert is good till Aug | 00:17 |
klindgren | German cert is valid for a few more months | 00:17 |
klindgren | :-D | 00:17 |
SpamapS | krasmussen: IIRC it uses systemd ... check systemd journal? | 00:17 |
SpamapS | klindgren: hah, you've been through this before I see :) | 00:17 |
klindgren | Might have just done something octavia | 00:17 |
SpamapS | I wouldn't be surprised if the executor logs are all in journald | 00:18 |
SpamapS | I also wouldn't be surprised if the journal were corrupted | 00:18 |
krasmussen | I was looking in the wrong spot for the executor logs as they got moved to a different node and I was still looking on the old node... Looking into that for real this time :) | 00:20 |
SpamapS | Oh right, the scale out :) | 00:21 |
*** sdake has quit IRC | 00:21 | |
klindgren | Only one that has logs in it was executor03 | 00:23 |
klindgren | 01 and 02 haven't had logs in 10+ days | 00:23 |
klindgren | Looks like it's doing things now | 00:27 |
krasmussen | Yeah this is looking much happier. | 00:30 |
SpamapS | krasmussen: hah, so you applied the patch on zuul0, but not the executors, I presume? | 00:31 |
SpamapS | krasmussen: If you want that to stick, I believe there's some stuff in the hoist repo to pull from a local github fork of zuul, so you can land the patch there. | 00:32 |
krasmussen | My guess is it was failing due to the executors not being happy for however long | 00:32 |
*** saneax has joined #zuul | 00:40 | |
* mordred waves to klindgren and krasmussen as he walks out to make some food ... wishing everyone good luck! | 00:40 | |
* klindgren waves back - I think things are proceeding in the right direction | 00:41 | |
* krasmussen waves ecstatically while trying to remember @mordred | 00:43 | |
klindgren | Thangs are working | 00:44 |
krasmussen | Looks like we have a working zuul again :) Thanks for all the assist everyone! | 00:44 |
*** EvilienM is now known as EmilienM | 00:58 | |
*** zbr has quit IRC | 01:00 | |
*** saneax has quit IRC | 01:34 | |
*** bhavikdbavishi has joined #zuul | 01:45 | |
*** sdake has joined #zuul | 01:58 | |
*** sdake has quit IRC | 02:28 | |
*** bhavikdbavishi has quit IRC | 02:37 | |
*** swest has quit IRC | 02:51 | |
*** panda has quit IRC | 02:53 | |
*** panda has joined #zuul | 02:54 | |
*** swest has joined #zuul | 03:05 | |
*** bhavikdbavishi has joined #zuul | 03:50 | |
*** bhavikdbavishi has quit IRC | 04:28 | |
* SpamapS watches Debian based zuul images build and is pleased | 04:47 | |
*** sdake has joined #zuul | 06:00 | |
*** sdake has quit IRC | 06:13 | |
*** bjackman has joined #zuul | 07:04 | |
*** daniel2 has quit IRC | 07:48 | |
tobiash | mordred: now I have executor memory statistics of 1.5 days for py36, py37 and py37+jemalloc on bionic: https://paste.pics/581cc286226407ab0be400b94951a7d9 | 08:27 |
tobiash | interestingly py37 shows the same behavior as py36 but uses generally even more memory | 08:27 |
tobiash | and py37+jemalloc seems to resolve our executor memleak... | 08:28 |
tobiash | all three executors had roughly the same workload during the day | 08:28 |
tobiash | I think we'll go for jemalloc... | 08:31 |
tobiash | SpamapS: thanks for the awesome hint about jemalloc :) | 08:31 |
SpamapS | tobiash: nice! | 08:35 |
tobiash | still awake? | 08:35 |
SpamapS | Yeah, just upgrading zuul actually ;) | 08:35 |
tobiash | like me ;) | 08:35 |
SpamapS | Got it to latest master | 08:35 |
tobiash | me2 | 08:36 |
SpamapS | The service worker thing was killing me | 08:36 |
SpamapS | And I was really happy to drop pbrx from my docker builds. | 08:36 |
tobiash | :) | 08:37 |
SpamapS | That is a truly stunning result for jemalloc btw | 08:38 |
SpamapS | wow | 08:38 |
SpamapS | Python should just link it directly. | 08:38 |
tobiash | indeed, I never expected such a clear result | 08:38 |
*** zbr|ssbarnea has joined #zuul | 08:39 | |
*** daniel2 has joined #zuul | 08:39 | |
*** zbr|ssbarnea has quit IRC | 08:42 | |
*** zbr|ssbarnea has joined #zuul | 08:53 | |
*** daniel2 has quit IRC | 08:55 | |
openstackgerrit | Tobias Urdin proposed openstack-infra/zuul-jobs master: Rework upload-puppetforge role to use module https://review.openstack.org/635941 | 09:27 |
openstackgerrit | Tobias Urdin proposed openstack-infra/zuul-jobs master: Rework upload-forge role to use module https://review.openstack.org/635941 | 09:29 |
openstackgerrit | Tobias Urdin proposed openstack-infra/zuul-jobs master: Rework upload-forge role to use module https://review.openstack.org/635941 | 09:32 |
*** daniel2 has joined #zuul | 10:18 | |
*** sshnaidm|off has quit IRC | 10:18 | |
*** sshnaidm|off has joined #zuul | 10:21 | |
*** bhavikdbavishi has joined #zuul | 10:24 | |
*** bhavikdbavishi has quit IRC | 10:47 | |
*** gtema has joined #zuul | 11:17 | |
*** bjackman has quit IRC | 11:28 | |
*** gtema has quit IRC | 11:36 | |
*** gtema has joined #zuul | 11:37 | |
tobiash | Shrews: responded on https://review.openstack.org/623927 | 11:50 |
*** bhavikdbavishi has joined #zuul | 12:18 | |
*** toabctl has quit IRC | 12:30 | |
tobiash | corvus: just noticed that runAnsibleCleanup in the executor is a noop: https://git.zuul-ci.org/cgit/zuul/tree/zuul/executor/server.py#n2013 | 12:58 |
tobiash | corvus: is that intended? | 12:58 |
*** gtema has quit IRC | 12:59 | |
tobiash | ah judging from comment and commit message this seems like something we wanted to have but wasn't possible back then with ansible 2.3 | 13:03 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Enable ansible cleanup https://review.openstack.org/636015 | 13:07 |
*** sdake has joined #zuul | 13:08 | |
*** snapiri has quit IRC | 13:25 | |
*** snapiri has joined #zuul | 13:25 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations https://review.openstack.org/631930 | 13:29 |
*** bhavikdbavishi has quit IRC | 13:35 | |
mordred | tobiash: oh wow! those are some great graphs | 13:43 |
*** sdake has quit IRC | 13:44 | |
tobiash | yes :) | 13:44 |
*** sdake has joined #zuul | 13:46 | |
mordred | tobiash: that kind of makes me want to add jemalloc to the python-base image and set the env var there | 13:47 |
tobiash | might make sense | 13:48 |
tobiash | at least it makes sense for the zuul image | 13:48 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Run python with jemalloc in containers https://review.openstack.org/635504 | 13:49 |
mordred | tobiash: well, let's take the WIP off of that ^^ | 13:49 |
tobiash | mordred: :) | 13:50 |
tobiash | however I have to note that I compiled the latest release from source, so using apt-get may result in different graphs | 13:51 |
mordred | tobiash: fair enough | 13:57 |
*** sdake has quit IRC | 14:08 | |
openstackgerrit | Monty Taylor proposed openstack/pbrx master: Remove container build jobs https://review.openstack.org/636019 | 14:10 |
*** sdake has joined #zuul | 14:20 | |
*** sdake has quit IRC | 14:21 | |
*** sdake has joined #zuul | 14:23 | |
fungi | gotta say, i'm thrilled to see freebsd's malloc getting increased uptake in other operating systems! ;) | 14:24 |
fungi | too bad they moved it to github and dropped their mailing lists in favor of gitter :/ | 14:26 |
*** sdake has quit IRC | 14:31 | |
mordred | fungi: sigh | 15:07 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations https://review.openstack.org/631930 | 15:10 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins https://review.openstack.org/636022 | 15:10 |
tobiash | dmsimard: thinking about multi ansible support and ara, do we have to take special care? Is the latest ara compatible with multiple ansible versions (say 2.5 - 2.7)? | 15:14 |
*** bhavikdbavishi has joined #zuul | 15:20 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 15:22 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 15:25 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations https://review.openstack.org/631930 | 15:31 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins https://review.openstack.org/636022 | 15:31 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 15:31 |
tobiash | dmsimard: ah ok, judging from the readme it's compatible from 2.5 - devel currently so that should just work | 15:35 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations https://review.openstack.org/631930 | 15:49 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins https://review.openstack.org/636022 | 15:49 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 15:49 |
*** needssleep is now known as TheJulia | 15:58 | |
*** sdake has joined #zuul | 16:08 | |
mnaser | it looks like the ara-report zuul role is broken with compression enabled, which is the default option (cc dmsimard ) | 16:18 |
mnaser | https://object-storage-ca-ymq-1.vexxhost.net/v1/296431177f204070bb3ba134fd51e6ca/zuul-dev-logs/1/1/211c1400a9d5cef18d5c3a5eaf198e02fe9a8b17/check/run-test-command/66ef1d6/ara/index.html.gz | 16:19 |
mnaser | it compresses everything but the html files are still looking to non-.gz'd paths | 16:19 |
corvus | mnaser: yeah, with the swift upload, you don't want to do any compression ahead of time | 16:24 |
corvus | mnaser: here's our (test) swift-logs job: https://review.openstack.org/635906 | 16:24 |
mnaser | corvus: thanks, that's really helpful. | 16:25 |
mnaser | something that i haven't hit yet which i'm trying to wrap my head around is cross-tenant base jobs .. aka trying to have the same base job which uploads logs to swift across multiple tenants | 16:26 |
mnaser | it looks like when i encrypted the secrets, it required the tenant name, which seems like it's binding to the tenant | 16:26 |
corvus | mnaser: it's actually bound to the project, independent of the tenant, so it'll work. it just needed the name of a tenant in order to get to the project via the api | 16:27 |
mnaser | corvus: oh okay, that was just an assumption based on my interaction with the tool, so that's good to know, i'm still trying to see the cleanest way to get around multitenancy | 16:28 |
corvus | mnaser: we're now doing this in opendev -- that repo (opendev/base-jobs) provides the base job for both of our tenants: http://zuul.opendev.org/tenants | 16:28 |
mnaser | ok wonderful, that'll be a good example | 16:28 |
tobiash | mnaser: the only thing you should take care of is that you gate that repo in only one tenant | 16:28 |
tobiash | so you need to exclude the project in all but one tenant | 16:29 |
pabelanger | https://github.com/ansible-network/zuul-config/blob/master/zuul.d/jobs.yaml#L29 is also an example of swift-upload-logs with vexxhost | 16:29 |
mnaser | correct, i don't want the other tenants to touch that repo | 16:29 |
pabelanger | keep in mind, the container you have in swift is unique, so don't call it logs | 16:29 |
mnaser | yeah i noticed infra already yoinked logs ;) | 16:29 |
* mnaser sudo rm -rfv and steals 'logs' | 16:30 | |
corvus | mnaser: i think what you describe is the right idea -- single base jobs repo that understands log uploading and other site-specific stuff. beyond that, there are lots of options to explore. we might end up with an "openstack-base" job which inherits from that and adds some openstack-specific stuff. you can define a "default parent" for each tenant, so the openstack-base job would still look like the default base job in the openstack tenant. | 16:30 |
corvus | mnaser: yeah, i think that was by accident. i don't expect we'll actually use that container :) | 16:30 |
mnaser | corvus: yeah, i'm trying to avoid having the user deal too much with a base job (for now) | 16:30 |
*** sdake has quit IRC | 16:30 | |
mnaser | as mucking about with base jobs is more of an 'advanced' user thing | 16:31 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations https://review.openstack.org/631930 | 16:31 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins https://review.openstack.org/636022 | 16:31 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 16:31 |
mnaser | (potentially later enabling users to have their own base jobs, but yeah) | 16:31 |
mnaser | i'm really really trying to figure out the on-boarding process and simplifying it as much as possible | 16:32 |
corvus | mnaser: ++ | 16:37 |
*** sdake has joined #zuul | 16:52 | |
*** SpamapS has quit IRC | 17:00 | |
*** SpamapS has joined #zuul | 17:01 | |
SpamapS | Seems like I got a bad copy of base jobs on my latest scheduler restart | 17:06 |
SpamapS | 2019-02-09 17:06:04.407932 | builder | "msg": "The conditional check 'zuul_temp_ssh_key_stat.stat.exists != True' failed. The error was: error while evaluating conditional (zuul_temp_ssh_key_stat.stat.exists != True): 'dict object' has no attribute 'stat'\n\nThe error appears to have been in '/tmp/tmpsmsv8tek/28a7b23751574999b9626f922b1860ea/trusted/project_1/git.zuul-ci.org/zuul-jobs/roles/add-build-sshkey/tasks/create-key-and-replace.yaml': line 1, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Create Temp SSH key\n ^ here\n" | 17:06 |
SpamapS | or zuul-jobs maybe, not sure | 17:07 |
SpamapS | http://paste.openstack.org/show/744788/ has the full fail | 17:08 |
SpamapS | the problem seems to be stat failing | 17:08 |
SpamapS | oh nope, this is new executor images breaking me... interesting | 17:10 |
SpamapS | 2019-02-09 17:08:07,791 DEBUG zuul.AnsibleJob: [build: bd2201c7f8864964a0233f5e567aa6e9] Ansible output: b'failed: [builder -> localhost] (item=/tmp/tmpsmsv8tek/bd2201c7f8864964a0233f5e567aa6e9/work/docs) => {"changed": false, "item": "/tmp/tmpsmsv8tek/bd2201c7f8864964a0233f5e567aa6e9/work/docs", "module_stderr": "/bin/sh: 1: /usr/bin/python3: not found\\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 127}' | 17:11 |
SpamapS | alpine put python3 in /usr/bin, but the builder images don't | 17:11 |
*** sdake has quit IRC | 17:47 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations https://review.openstack.org/631930 | 17:56 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins https://review.openstack.org/636022 | 17:56 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 17:56 |
*** sdake has joined #zuul | 18:06 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 18:09 |
mordred | SpamapS: yes - in the new images, python is in /usr/local/bin ... what was looking for python explicitly by /usr/bin/python3 ? | 18:12 |
mordred | corvus: ^^ heads up - the quickstart test seems to have potentially missed something | 18:13 |
SpamapS | mordred: I have a site variable of ansible_python_interpreter=/usr/bin/python3 | 18:14 |
SpamapS | mordred: because otherwise Ansible requires python2 or a symlink in every image | 18:14 |
SpamapS | so any localhost tasks failed | 18:15 |
mordred | SpamapS: ah | 18:15 |
*** bhavikdbavishi has quit IRC | 18:18 | |
*** bhavikdbavishi has joined #zuul | 18:20 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 18:25 |
*** sdake has quit IRC | 18:27 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 18:36 |
SpamapS | mordred: simple fix.. | 18:38 |
SpamapS | RUN [ -e /usr/bin/python3 ] || ln -s /usr/local/bin/python3 /usr/bin/python3 | 18:38 |
mordred | SpamapS: cool. think we should add that to the upstream dockerfile? | 18:41 |
SpamapS | mordred: Might simplify things for weird people like me who don't build custom images. ;) | 18:42 |
* SpamapS goes to saturday | 18:43 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 18:56 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 19:10 |
*** sdake has joined #zuul | 19:15 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 19:24 |
*** sdake has quit IRC | 19:35 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 19:41 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 20:05 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 20:26 |
*** sdake has joined #zuul | 20:27 | |
*** sdake has quit IRC | 20:27 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional https://review.openstack.org/636026 | 20:44 |
*** sdake has joined #zuul | 21:15 | |
*** krasmussen has quit IRC | 22:03 | |
*** sdake has quit IRC | 22:37 | |
dmsimard | mordred, corvus, tobiash: I'm really not excited about the performance of ara 0.x html reports in swift | 22:40 |
dmsimard | static html reports are horrible at scale -- I understand that disk space (or inodes) is not really a concern with swift but we still have to upload them for every job | 22:42 |
dmsimard | I haven't found a solution I'd be happy with for the swift use case in 1.0 yet but it's something that I've started writing about in this pad: https://etherpad.openstack.org/p/ara-1.0-in-zuul | 22:46 |
dmsimard | Need to kill IRC bouncer for a while, migrating stuff in home lab. Happy to chat about this monday. | 22:58 |
* SpamapS looking at using EC2 capacity reservations to avoid NODE_FAILURE's on EC2 | 23:03 | |
*** dmsimard has quit IRC | 23:06 | |
*** sdake has joined #zuul | 23:09 | |
*** sdake has quit IRC | 23:19 | |
*** sdake has joined #zuul | 23:24 | |
*** sdake has quit IRC | 23:53 | |
*** sdake has joined #zuul | 23:56 |
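
Editor's note on the 00:16 exchange: the line `merger:cat 367 0 0` that SpamapS quotes comes from gearman's `status` admin command, whose reply lists one function per line as name, total queued jobs, running jobs, and available workers, ending with a lone `.`. The following is a minimal sketch, not anything from Zuul itself, of how that output could be scanned for functions that have queued work but no workers. The names `FunctionStatus`, `parse_status`, and `starved` are hypothetical, and the `executor:execute` sample row is invented for illustration; only the `merger:cat` row appears in the log.

```python
from typing import List, NamedTuple


class FunctionStatus(NamedTuple):
    name: str
    queued: int   # total jobs queued for this function
    running: int  # jobs currently running
    workers: int  # workers registered as able to run it


def parse_status(text: str) -> List[FunctionStatus]:
    """Parse the reply to gearman's `status` admin command."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == ".":  # a lone "." terminates the reply
            continue
        # Split from the right so the three counters are always the last fields.
        name, queued, running, workers = line.rsplit(None, 3)
        rows.append(FunctionStatus(name, int(queued), int(running), int(workers)))
    return rows


def starved(rows: List[FunctionStatus]) -> List[FunctionStatus]:
    """Return functions that have queued jobs but no registered workers."""
    return [r for r in rows if r.queued > 0 and r.workers == 0]


if __name__ == "__main__":
    # First row is from the log above; second row is a made-up healthy function.
    sample = "merger:cat\t367\t0\t0\nexecutor:execute\t2\t1\t3\n.\n"
    for row in starved(parse_status(sample)):
        print(f"{row.name}: {row.queued} queued, no workers")
```

Run against the pasted status output, a check like this would have pointed at the missing mergers and executors directly, which is the conclusion the channel reached by eye.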
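
Editor's note on the 00:07 to 00:09 exchange: plain `nc` or `telnet` gets disconnected because the gearman server in that deployment speaks TLS, which is why corvus suggests `openssl s_client` with a client certificate. As a rough sketch only, the same admin query could be made from Python's standard `ssl` module. The function names `read_admin_reply` and `gearman_admin` and the `ca` parameter are assumptions made here for illustration; the host, port, and certificate paths are the ones from corvus's command and may need adjusting. The sketch also assumes the reply ends with a lone `.` line, as the `status` and `workers` replies do.

```python
import socket
import ssl


def read_admin_reply(stream):
    """Collect reply lines from a binary line iterator until the lone "." line."""
    lines = []
    for raw in stream:
        line = raw.decode("utf-8", "replace").rstrip("\r\n")
        if line == ".":
            break
        lines.append(line)
    return lines


def gearman_admin(command, host="localhost", port=4730,
                  cert="/etc/zuul/ssl/client.pem",
                  key="/etc/zuul/ssl/client.key",
                  ca=None):
    """Send one admin command (e.g. "status" or "workers") over TLS.

    `ca` is a hypothetical path to the CA bundle that signed the server
    certificate; with ca=None the system trust store is used, which will
    reject a privately signed gearman certificate.
    """
    ctx = ssl.create_default_context(cafile=ca)
    ctx.load_cert_chain(certfile=cert, keyfile=key)
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(command.encode("ascii") + b"\n")
            with tls.makefile("rb") as stream:
                return read_admin_reply(stream)


if __name__ == "__main__":
    for line in gearman_admin("status"):
        print(line)
```

Certificate verification stays enabled in this sketch, so an expired or mismatched gearman certificate (the failure SpamapS jokes about at 00:17) would surface as an `ssl.SSLError` rather than a silent disconnect.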