Saturday, 2019-02-09

clarkbmaybe they aren't connected to gearman?00:00
clarkbor is that a different error?00:00
krasmussenFor what it's worth I can at least see that the gearman port is up and taking connections. Again I'm new to this so I'm not sure what all I can do to better debug any potential issues on the gearman side of things though.00:02
clarkbyou can telnet/nc to the port and enter the 'status' command to see registered jobs00:02
krasmussenSweet thanks I'll try that.00:03
corvusoh, yeah, we might be looking at a situation where there are no connected mergers and the job timed out00:04
corvusi think the timeout is 5 minutes?  so if it took 5 minutes to fail like that after starting, that's almost certainly it00:04
corvuskrasmussen: the "workers" command to gearman may also be useful00:05
krasmussenLooks like gearman died with the scheduler start failure. I throw in a `status` via nc or telnet and it just closes the connection on me.00:07
corvuskrasmussen: is gearman using ssl?00:08
krasmussenAhh I think it is.00:08
corvuskrasmussen: here's the command we use to debug in our system; paths may need adjusting: openssl s_client -connect localhost:4730 -cert /etc/zuul/ssl/client.pem  -key /etc/zuul/ssl/client.key00:09
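[editor's note] The same check can be scripted. The following is a minimal sketch, assuming the gearman admin protocol ends each reply to "status" or "workers" with a line containing only "."; the function names and the optional ca parameter are hypothetical, and the cert/key paths are just the ones quoted in the command above and may need adjusting.

```python
import socket
import ssl


def read_admin_response(sock, bufsize=4096):
    """Read one admin reply; it is terminated by a line holding only '.'."""
    data = b""
    while not (data == b".\n" or data.endswith(b"\n.\n")):
        chunk = sock.recv(bufsize)
        if not chunk:  # the server closed the connection early
            break
        data += chunk
    return data.decode("utf-8", errors="replace")


def gearman_admin(command, host="localhost", port=4730,
                  cert="/etc/zuul/ssl/client.pem",
                  key="/etc/zuul/ssl/client.key",
                  ca=None, timeout=10):
    """Send an admin command such as 'status' or 'workers' over TLS."""
    context = ssl.create_default_context(cafile=ca)
    if ca is None:
        # Assumption: no CA file at hand, so skip server verification.
        # Acceptable for local debugging only.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    context.load_cert_chain(certfile=cert, keyfile=key)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(command.encode("ascii") + b"\n")
            return read_admin_response(tls)
```

Calling gearman_admin("status") on the scheduler host should return roughly what typing "status" into the openssl s_client session returns.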
krasmussenSweet, I was building a similar command via my google search :P00:09
krasmussenNot overly sure how to read the gearman status but: https://pastebin.com/SVuRBN0J00:11
krasmussenGoing to google that now.00:11
SpamapSbeware folks, krasmussen is dealing with a Zuul that I built. ;-)00:15
krasmussenThat is true :P00:15
krasmussenmiss you @SpamapS00:16
SpamapS;)00:16
SpamapSmerger:cat  367 0   000:16
SpamapSwell there's yer problem00:16
SpamapSLooks like no workers00:16
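[editor's note] In gearman "status" output each line is: function name, jobs in queue, jobs running, available workers. So the "merger:cat" line above means 367 queued jobs and zero workers. A small sketch for spotting that condition automatically (parse_status and starved_functions are hypothetical helper names, not part of Zuul or gearman):

```python
def parse_status(text):
    """Parse gearman 'status' output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == ".":  # '.' marks the end of the reply
            continue
        parts = line.rsplit(None, 3)  # name, queued, running, workers
        if len(parts) != 4:
            continue
        name, queued, running, workers = parts
        rows.append({
            "function": name,
            "queued": int(queued),
            "running": int(running),
            "workers": int(workers),
        })
    return rows


def starved_functions(rows):
    """Return functions that have queued jobs but no registered workers."""
    return [r["function"] for r in rows
            if r["queued"] > 0 and r["workers"] == 0]
```

A non-empty result from starved_functions is the situation described here: jobs are piling up for a function that no merger or executor has registered.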
SpamapSexecutors failing?00:16
SpamapSkrasmussen: maybe the gearman cert expired.00:17
SpamapS;)00:17
krasmussenNothing in the executors logs but maybe?00:17
krasmussengearman cert is good till Aug00:17
klindgrenGerman cert is valid for a few more months00:17
klindgren:-D00:17
SpamapSkrasmussen: IIRC it uses systemd ... check systemd journal?00:17
SpamapSklindgren: hah, you've been through this before I see :)00:17
klindgrenMight have just done something octavia00:17
SpamapSI wouldn't be surprised if the executor logs are all in journald00:18
SpamapSI also wouldn't be surprised if the journal were corrupted00:18
krasmussenI was looking in the wrong spot for the executor logs as they got moved to a different node and I was still looking on the old node... Looking into that for real this time :)00:20
SpamapSOh right, the scale out :)00:21
*** sdake has quit IRC00:21
klindgrenOnly one that has logs in it was executor0300:23
klindgren01 and 02 haven't had logs in 10+ days00:23
klindgrenLooks like it's doing things now00:27
krasmussenYeah this is looking much happier.00:30
SpamapSkrasmussen: hah, so you applied the patch on zuul0, but not the executors, I presume?00:31
SpamapSkrasmussen: If you want that to stick, I believe there's some stuff in the hoist repo to pull from a local github fork of zuul, so you can land the patch there.00:32
krasmussenMy guess is it was failing due to the executors not being happy for however long00:32
*** saneax has joined #zuul00:40
* mordred waves to klindgren and krasmussen as he walks out to make some food ... wishing everyone good luck!00:40
* klindgren waves back - I think things are proceeding in the right direction00:41
* krasmussen waves ecstatically while trying to remember @mordred 00:43
klindgrenThangs are working00:44
krasmussenLooks like we have a working zuul again :) Thanks for all the assist everyone!00:44
*** EvilienM is now known as EmilienM00:58
*** zbr has quit IRC01:00
*** saneax has quit IRC01:34
*** bhavikdbavishi has joined #zuul01:45
*** sdake has joined #zuul01:58
*** sdake has quit IRC02:28
*** bhavikdbavishi has quit IRC02:37
*** swest has quit IRC02:51
*** panda has quit IRC02:53
*** panda has joined #zuul02:54
*** swest has joined #zuul03:05
*** bhavikdbavishi has joined #zuul03:50
*** bhavikdbavishi has quit IRC04:28
* SpamapS watches Debian based zuul images build and is pleased04:47
*** sdake has joined #zuul06:00
*** sdake has quit IRC06:13
*** bjackman has joined #zuul07:04
*** daniel2 has quit IRC07:48
tobiashmordred: now I have executor memory statistics of 1.5 days for py36, py37 and py37+jemalloc on bionic: https://paste.pics/581cc286226407ab0be400b94951a7d908:27
tobiashinterestingly py37 shows the same behavior as py36 but uses generally even more memory08:27
tobiashand py37+jemalloc seems to resolve our executor memleak...08:28
tobiashall three executors had roughly the same workload during the day08:28
tobiashI think we'll go for jemalloc...08:31
tobiashSpamapS: thanks for the awesome hint about jemalloc :)08:31
SpamapStobiash: nice!08:35
tobiashstill awake?08:35
SpamapSYeah, just upgrading zuul actually ;)08:35
tobiashlike me ;)08:35
SpamapSGot it to latest master08:35
tobiashme208:36
SpamapSThe service worker thing was killing me08:36
SpamapSAnd I was really happy to drop pbrx from my docker builds.08:36
tobiash:)08:37
SpamapSThat is a truly stunning result for jemalloc btw08:38
SpamapSwow08:38
SpamapSPython should just link it directly.08:38
tobiashindeed, I never expected such a clear result08:38
*** zbr|ssbarnea has joined #zuul08:39
*** daniel2 has joined #zuul08:39
*** zbr|ssbarnea has quit IRC08:42
*** zbr|ssbarnea has joined #zuul08:53
*** daniel2 has quit IRC08:55
openstackgerritTobias Urdin proposed openstack-infra/zuul-jobs master: Rework upload-puppetforge role to use module  https://review.openstack.org/63594109:27
openstackgerritTobias Urdin proposed openstack-infra/zuul-jobs master: Rework upload-forge role to use module  https://review.openstack.org/63594109:29
openstackgerritTobias Urdin proposed openstack-infra/zuul-jobs master: Rework upload-forge role to use module  https://review.openstack.org/63594109:32
*** daniel2 has joined #zuul10:18
*** sshnaidm|off has quit IRC10:18
*** sshnaidm|off has joined #zuul10:21
*** bhavikdbavishi has joined #zuul10:24
*** bhavikdbavishi has quit IRC10:47
*** gtema has joined #zuul11:17
*** bjackman has quit IRC11:28
*** gtema has quit IRC11:36
*** gtema has joined #zuul11:37
tobiashShrews: responded on https://review.openstack.org/62392711:50
*** bhavikdbavishi has joined #zuul12:18
*** toabctl has quit IRC12:30
tobiashcorvus: just noticed that runAnsibleCleanup in the executor is a noop: https://git.zuul-ci.org/cgit/zuul/tree/zuul/executor/server.py#n201312:58
tobiashcorvus: is that intended?12:58
*** gtema has quit IRC12:59
tobiashah judging from comment and commit message this seems like something we wanted to have but wasn't possible back then with ansible 2.313:03
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Enable ansible cleanup  https://review.openstack.org/63601513:07
*** sdake has joined #zuul13:08
*** snapiri has quit IRC13:25
*** snapiri has joined #zuul13:25
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations  https://review.openstack.org/63193013:29
*** bhavikdbavishi has quit IRC13:35
mordredtobiash: oh wow! those are some great graphs13:43
*** sdake has quit IRC13:44
tobiashyes :)13:44
*** sdake has joined #zuul13:46
mordredtobiash: that kind of makes me want to add jemalloc to the python-base image and set the env var there13:47
tobiashmight make sense13:48
tobiashat least it makes sense for the zuul image13:48
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Run python with jemalloc in containers  https://review.openstack.org/63550413:49
mordredtobiash: well, let's take the WIP off of that ^^13:49
tobiashmordred: :)13:50
tobiashhowever I have to note that I compiled the latest release from source, so using apt-get may result in different graphs13:51
mordredtobiash: fair enough13:57
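[editor's note] Whichever way jemalloc ends up in the image, it is worth confirming that the running Python process actually has it mapped. The log does not name the env var; a common approach is LD_PRELOAD, which is an assumption here. A minimal Linux-only sketch that inspects /proc/self/maps (both function names are hypothetical):

```python
import os


def jemalloc_mappings(maps_text):
    """Return distinct mapped file paths whose file name mentions jemalloc."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) == 6 and "jemalloc" in os.path.basename(fields[5]):
            if fields[5] not in paths:
                paths.append(fields[5])
    return paths


def current_process_uses_jemalloc():
    """True if this process has a jemalloc shared object mapped (Linux only)."""
    try:
        with open("/proc/self/maps") as maps:
            return bool(jemalloc_mappings(maps.read()))
    except OSError:
        return False
```

This only shows that the library is loaded, not which jemalloc version it is, so the caveat about a source build versus a distro package still applies.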
*** sdake has quit IRC14:08
openstackgerritMonty Taylor proposed openstack/pbrx master: Remove container build jobs  https://review.openstack.org/63601914:10
*** sdake has joined #zuul14:20
*** sdake has quit IRC14:21
*** sdake has joined #zuul14:23
fungigotta say, i'm thrilled to see freebsd's malloc getting increased uptake in other operating systems! ;)14:24
fungitoo bad they moved it to github and dropped their mailing lists in favor of gitter :/14:26
*** sdake has quit IRC14:31
mordredfungi: sigh15:07
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations  https://review.openstack.org/63193015:10
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins  https://review.openstack.org/63602215:10
tobiashdmsimard: thinking about multi ansible support and ara, do we have to take special care? Is the latest ara compatible with multiple ansible versions (say 2.5 - 2.7)?15:14
*** bhavikdbavishi has joined #zuul15:20
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602615:22
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602615:25
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations  https://review.openstack.org/63193015:31
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins  https://review.openstack.org/63602215:31
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602615:31
tobiashdmsimard: ah ok, judging from the readme it's compatible from 2.5 - devel currently so that should just work15:35
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations  https://review.openstack.org/63193015:49
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins  https://review.openstack.org/63602215:49
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602615:49
*** needssleep is now known as TheJulia15:58
*** sdake has joined #zuul16:08
mnaserit looks like the ara-report zuul role is broken with (the default option) of compression enabled (cc dmsimard )16:18
mnaserhttps://object-storage-ca-ymq-1.vexxhost.net/v1/296431177f204070bb3ba134fd51e6ca/zuul-dev-logs/1/1/211c1400a9d5cef18d5c3a5eaf198e02fe9a8b17/check/run-test-command/66ef1d6/ara/index.html.gz16:19
mnaserit compresses everything but the html files are still looking to non-.gz'd paths16:19
corvusmnaser: yeah, with the swift upload, you don't want to do any compression ahead of time16:24
corvusmnaser: here's our (test) swift-logs job: https://review.openstack.org/63590616:24
mnasercorvus: thanks, that's really helpful.16:25
mnasersomething that i haven't hit yet which i'm trying to wrap my head around is cross-tenant base jobs .. aka trying to have the same base job which uploads logs to swift across multiple tenants16:26
mnaserit looks like when i encrypted the secrets, it required the tenant name which seems like it's binding to the tenant16:26
corvusmnaser: it's actually bound to the project, independent of the tenant, so it'll work.  it just needed the name of a tenant in order to get to the project via the api16:27
mnasercorvus: oh okay, that was just an assumption based on my interaction with the tool, so that's good to know, i'm still trying to see the cleanest way to get around multitenancy16:28
corvusmnaser: we're now doing this in opendev -- that repo (opendev/base-jobs) provides the base job for both of our tenants: http://zuul.opendev.org/tenants16:28
mnaserok wonderful, that'll be a good example16:28
tobiashmnaser: the only thing you should take care of is that you gate that repo only in one tenant16:28
tobiashso you need to exclude project in all but one tenant16:29
pabelangerhttps://github.com/ansible-network/zuul-config/blob/master/zuul.d/jobs.yaml#L29 is also an example of swift-upload-logs with vexxhost16:29
mnasercorrect, i dont want the other tenants to touch that repo16:29
pabelangerkeep in mind, the container you have in swift is unique, so don't call it logs16:29
mnaseryeah i noticed infra already yoinked logs ;)16:29
* mnaser sudo rm -rfv and steals 'logs'16:30
corvusmnaser: i think what you describe is the right idea -- single base jobs repo that understands log uploading and other site-specific stuff.  beyond that, there are lots of options to explore.  we might end up with an "openstack-base" job which inherits from that and adds some openstack-specific stuff. you can define a "default parent" for each tenant, so the openstack-base job would still look like the default base job in the openstack tenant.16:30
corvusmnaser: yeah, i think that was by accident.  i don't expect we'll actually use that container :)16:30
mnasercorvus: yeah, i'm trying to avoid having the user deal too much with a base job (for now)16:30
*** sdake has quit IRC16:30
mnaseras mucking about base jobs is a bit more 'advanced' user16:31
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations  https://review.openstack.org/63193016:31
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins  https://review.openstack.org/63602216:31
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602616:31
mnaser(potentially later enabling users to have their own base jobs, but yeah)16:31
mnaseri'm really really trying to figure out the on-boarding process and simplifying it as much as possible16:32
corvusmnaser: ++16:37
*** sdake has joined #zuul16:52
*** SpamapS has quit IRC17:00
*** SpamapS has joined #zuul17:01
SpamapSSeems like I got a bad copy of base jobs on my latest scheduler restart17:06
SpamapS2019-02-09 17:06:04.407932 | builder |   "msg": "The conditional check 'zuul_temp_ssh_key_stat.stat.exists != True' failed. The error was: error while evaluating conditional (zuul_temp_ssh_key_stat.stat.exists != True): 'dict object' has no attribute 'stat'\n\nThe error appears to have been in '/tmp/tmpsmsv8tek/28a7b23751574999b9626f922b1860ea/trusted/project_1/git.zuul-ci.org/zuul-jobs/roles/add-build-sshkey/tasks/create-key-and-replace.yaml': line 1, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Create Temp SSH key\n  ^ here\n"17:06
SpamapSor zuul-jobs maybe, not sure17:07
SpamapShttp://paste.openstack.org/show/744788/ has the full fail17:08
SpamapSthe problem seems to be stat failing17:08
SpamapSoh nope, this is new executor images breaking me... interesting17:10
SpamapS2019-02-09 17:08:07,791 DEBUG zuul.AnsibleJob: [build: bd2201c7f8864964a0233f5e567aa6e9] Ansible output: b'failed: [builder -> localhost] (item=/tmp/tmpsmsv8tek/bd2201c7f8864964a0233f5e567aa6e9/work/docs) => {"changed": false, "item": "/tmp/tmpsmsv8tek/bd2201c7f8864964a0233f5e567aa6e9/work/docs", "module_stderr": "/bin/sh: 1: /usr/bin/python3: not found\\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 127}'17:11
SpamapSalpine put python3 in /usr/bin, but the builder images don't17:11
*** sdake has quit IRC17:47
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Manage ansible installations  https://review.openstack.org/63193017:56
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Symlink ansible plugins  https://review.openstack.org/63602217:56
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602617:56
*** sdake has joined #zuul18:06
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602618:09
mordredSpamapS: yes - in the new images, python is in /usr/local/bin ... what was looking for python explicitly by /usr/bin/python3 ?18:12
mordredcorvus: ^^ heads up - the quickstart test seems to have potentially missed something18:13
SpamapSmordred: I have a site variable of ansible_python_interpreter=/usr/bin/python318:14
SpamapSmordred: because otherwise Ansible requires python2 or a symlink in every image18:14
SpamapSso any localhost tasks failed18:15
mordredSpamapS: ah18:15
*** bhavikdbavishi has quit IRC18:18
*** bhavikdbavishi has joined #zuul18:20
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602618:25
*** sdake has quit IRC18:27
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602618:36
SpamapSmordred: simple fix..18:38
SpamapSRUN [ -e /usr/bin/python3 ] || ln -s /usr/local/bin/python3 /usr/bin/python318:38
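[editor's note] The RUN line above creates /usr/bin/python3 as a symlink only when nothing exists at that path, so images that already ship python3 there are left alone. The same logic as a Python sketch, for anyone doing this outside a Dockerfile (ensure_interpreter_link is a hypothetical helper name):

```python
import os


def ensure_interpreter_link(expected, actual):
    """Create `expected` as a symlink to `actual` unless `expected` exists.

    Mirrors `[ -e expected ] || ln -s actual expected`. Returns True if a
    link was created, False if `expected` was already present.
    """
    if os.path.exists(expected):
        return False
    # Like `ln -s`, this raises if `expected` is a dangling symlink.
    os.symlink(actual, expected)
    return True
```

For the case discussed here the call would be ensure_interpreter_link("/usr/bin/python3", "/usr/local/bin/python3"), which needs write access to /usr/bin.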
mordredSpamapS: cool. think we should add that to the upstream dockerfile?18:41
SpamapSmordred: Might simplify things for weird people like me who don't build custom images. ;)18:42
* SpamapS goes to saturday18:43
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602618:56
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602619:10
*** sdake has joined #zuul19:15
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602619:24
*** sdake has quit IRC19:35
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602619:41
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602620:05
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602620:26
*** sdake has joined #zuul20:27
*** sdake has quit IRC20:27
openstackgerritTobias Henkel proposed openstack-infra/zuul master: DNM: Multi-ansible zuul-stream-functional  https://review.openstack.org/63602620:44
*** sdake has joined #zuul21:15
*** krasmussen has quit IRC22:03
*** sdake has quit IRC22:37
dmsimardmordred, corvus, tobiash: I'm really not excited about the performance of ara 0.x html reports in swift22:40
dmsimardstatic html reports are horrible at scale -- I understand that disk space (or inodes) is not really a concern with swift but we still have to upload them for every job22:42
dmsimardI haven't found a solution I'd be happy with for the swift use case in 1.0 yet but it's something that I've started writing about in this pad: https://etherpad.openstack.org/p/ara-1.0-in-zuul22:46
dmsimardNeed to kill IRC bouncer for a while, migrating stuff in home lab. Happy to chat about this monday.22:58
* SpamapS looking at using EC2 capacity reservations to avoid NODE_FAILURE's on EC223:03
*** dmsimard has quit IRC23:06
*** sdake has joined #zuul23:09
*** sdake has quit IRC23:19
*** sdake has joined #zuul23:24
*** sdake has quit IRC23:53
*** sdake has joined #zuul23:56
