Thursday, 2019-02-14

*** agopi|out has quit IRC00:00
clarkbjournalctl -u kube-apiserver has no entries00:01
clarkb-u kubelet has one entry00:01
*** jamesmcarthur has quit IRC00:03
corvusokay, i guess we don't get api request logs00:04
*** markvoelker has quit IRC00:05
fungiservers with logs are for grey-bearded old fogies. the hipster way to manage services is to just redeploy your containers over and over until some fogey with logs fixes things upstream00:06
*** smarcet has joined #openstack-infra00:07
*** rh-jelabarre has quit IRC00:07
corvusi'm giving serious thought to going back to the shared-nothing gitea idea...00:08
corvuswhen setting this up, i ran a lot of kubectl commands by hand which should be equivalent to this00:08
corvusand they never failed00:08
openstackgerritMerged openstack-infra/zuul master: web: remove build and job_name filter from the buildset route  https://review.openstack.org/63650400:09
corvusi ran lsof on the ansible process, it has no open network connections.00:09
*** rascasoft has joined #openstack-infra00:09
clarkbfwiw my reading of the module on the version of ansible we run is that it won't wait for anything00:09
clarkband wait is off by default in 2.8 (and we don't enable it)00:09
clarkbcorvus: fwiw the kubectl commands are written in one implementation in one language and the k8s ansible module is in an entirely different language, by a third party00:10
*** jamesmcarthur_ has joined #openstack-infra00:10
corvusyep -- my implication is that i lean toward the problem being ansible00:10
*** hwoarang has quit IRC00:11
*** weshay|ruck has quit IRC00:11
clarkbcorvus: maybe ansible -vvvvv will help narrow down where the problem is?00:12
*** jamesmcarthur_ has quit IRC00:12
corvusi'd love to get a stacktrace now :/00:13
*** rlandy is now known as rlandy|bbl00:13
*** hwoarang has joined #openstack-infra00:13
openstackgerritClark Boylan proposed openstack-infra/zuul master: Add Fake Github Review object to test suite  https://review.openstack.org/63678800:15
*** rascasoft has quit IRC00:16
clarkbcorvus: another approach might be to run an out of band k8s module change? though if it is specific to the resource types being created that may be tricky to narrow down from00:17
*** jamesmcarthur has joined #openstack-infra00:17
clarkbrereading that select, the first arg is 0 and the other args are NULL so it isn't really selecting anything00:18
corvusclarkb: http://paste.openstack.org/show/745046/  these are our stack traces00:18
clarkbwhich lines up with your no network connections observation00:18
corvusproc 13560 has 2 threads00:18
*** jamesmcarthur has quit IRC00:19
corvusoh wait should i be looking at 13759 ?00:19
*** jamesmcarthur has joined #openstack-infra00:20
corvusit also has no network connections00:20
clarkb13759 is the "remote" side of the local connection I think00:20
corvusstrace says restart_syscall(<... resuming interrupted poll ...>00:21
corvusit has 1 thread: http://paste.openstack.org/show/745047/00:21
*** jamesmcarthur has quit IRC00:22
corvusclarkb: oh, note these stacktraces are backwards (most recent call first)00:22
corvusoh *that* process has 13762 as a child00:24
corvusaha! python3 13762 root    8u  IPv4         4241073484      0t0        TCP bridge.openstack.org:42764->38.108.68.20:6443 (ESTABLISHED)00:24
fungi6443 is the api?00:24
*** mattw4 has quit IRC00:25
clarkbah ok 13762 is the remote local side00:25
corvusfungi: i don't know, but i assume so for now00:25
corvusstrace says futex(0x7f4764000e70, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff00:25
corvus1 thread, traceback is: http://paste.openstack.org/show/745048/00:26
*** markvoelker has joined #openstack-infra00:26
clarkbtcp6       0      0 :::6443                 :::*                    LISTEN      1680/kube-apiserver00:27
corvusit is... disheartening... to see a deadlock on a threadpool in a process with a single thread.00:29
clarkbI can curl that from my desktop (I get a json response saying I am not authenticating)00:29
*** mriedem has quit IRC00:29
clarkbcorvus: https://github.com/kubernetes-client/python/blob/v8.0.1/kubernetes/client/api_client.py#L76 is the code that is blocking00:30
*** jamesmcarthur has joined #openstack-infra00:31
corvusyep00:31
clarkbso it's made its requests and is now trying to clean up the request thread(s)00:31
clarkband that is deadlocked. Fun00:31
corvusbut there's only one thread running according to gdb00:31
clarkband the client itself is generated code so there isn't a useful history in the log00:32
clarkbhttps://github.com/kubernetes-client/python/blob/master/kubernetes/client/api_client.py#L78 new version is slightly different though00:32
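A minimal sketch of the pattern under discussion, assuming a client that only needs worker threads for asynchronous calls. `LazyPoolClient` and its methods are hypothetical names for illustration, not the kubernetes library's code: the pool is created on first asynchronous use, so cleanup never has to join a pool that a purely synchronous caller never started.

```python
from multiprocessing.pool import ThreadPool


class LazyPoolClient:
    """Hypothetical client; illustrative only, not kubernetes.client.ApiClient."""

    def __init__(self):
        # No worker threads exist until an async call asks for them.
        self._pool = None

    @property
    def pool(self):
        # Create the pool on first use instead of in __init__.
        if self._pool is None:
            self._pool = ThreadPool(processes=1)
        return self._pool

    def call(self, func, *args, async_req=False):
        # Synchronous calls run inline and never touch the pool.
        if not async_req:
            return func(*args)
        # Asynchronous calls return an AsyncResult from the pool.
        return self.pool.apply_async(func, args)

    def close(self):
        # Only join a pool that was actually created.
        if self._pool is not None:
            self._pool.close()
            self._pool.join()
            self._pool = None
```

With this shape, a caller that never passes `async_req=True` has nothing to clean up, which is why knowing how the ansible module calls the client matters.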
*** jamesmcarthur has quit IRC00:33
clarkbcorvus: maybe try upgrading the kubernetes lib?00:34
clarkbthere is only a beta for 9.0.0 though00:34
corvusclarkb: i'm trying to figure out if ansible k8s calls call_api with async true or false00:34
corvusi haven't found the call path yet00:35
clarkb++00:35
corvusi agree, if it calls it with false, upgrading it may help00:35
clarkbcorvus: https://github.com/openshift/openshift-restclient-python/blob/master/openshift/dynamic/client.py#L268 is where that happens via openshift00:35
clarkband I don't see anywhere on the ansible side setting async_req (though my search is via github search which may not be very exact)00:37
corvusyeah, i'm not finding it either00:38
corvusso maybe the upgrade will help00:38
corvus#status log manually ran "pip3 install kubernetes==9.0.0b1" on bridge to see if newer version avoids deadlock on k8s api calls00:41
openstackstatuscorvus: finished logging00:41
corvusi'm going to kill that process now00:41
corvusclarkb: thanks for your help!  i'm pleasantly surprised how far we were able to actually chase that down00:42
corvusand if this doesn't work, we'll just figure out how to run 'kubectl' for all of these00:43
corvus(the nice thing about the k8s module is the free jinja2 templating)00:43
corvusbut i'm sure we can do something with stdin00:43
clarkbya should be workable to figure out kubectl commands00:43
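A rough sketch of the "do something with stdin" fallback, assuming `kubectl` is on the PATH and already configured for the cluster. It uses the standard library's `string.Template` as a stand-in for jinja2, and the manifest and the helper names `render_manifest` and `kubectl_apply` are made up for illustration:

```python
import subprocess
from string import Template


def render_manifest(template_text, values):
    """Fill $placeholders in a manifest; raises KeyError if one is missing."""
    return Template(template_text).substitute(values)


def kubectl_apply(manifest, kubectl="kubectl"):
    """Feed a rendered manifest to `kubectl apply -f -` on stdin."""
    return subprocess.run(
        [kubectl, "apply", "-f", "-"],
        input=manifest,
        text=True,
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )


# Hypothetical manifest; a real one would come from a template file.
MANIFEST_TEMPLATE = """\
apiVersion: v1
kind: Namespace
metadata:
  name: $namespace
"""
```

Something like `kubectl_apply(render_manifest(MANIFEST_TEMPLATE, {"namespace": "gitea"}))` would then replace one k8s module task.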
mordredcorvus: if we have to kubectl, clint has a setup for templating he's fond of00:43
mordredbut fingers crossed that the upgrade just works00:44
fungii bet it involves gearman ;)00:44
* mordred waves from the supercharger in alexandria00:44
corvusmordred: do not lick the supercharger00:44
fungielectrifying!00:44
mordredcorvus: too late00:44
fungiyou're just around the corner from me00:45
mordredfungi: LA not VA00:45
fungiunless you mean egypt00:45
fungioh, or there00:45
mordred:)00:45
mordredegypt would also be cool00:45
fungii bet it still has a decent library?00:45
mordredfungi: I'm still closer to you than if I was in egypt though00:45
corvusyou never know with mordred00:45
fungiindeed00:45
openstackgerritMonty Taylor proposed openstack-infra/zuul-preview master: Update gitreview file with correct project name  https://review.openstack.org/63679100:46
openstackgerritMonty Taylor proposed openstack-infra/zuul-preview master: Add perf testing framework  https://review.openstack.org/63679200:46
openstackgerritMonty Taylor proposed openstack-infra/zuul-preview master: Reimplement in Rust  https://review.openstack.org/63679300:46
openstackgerritMonty Taylor proposed openstack-infra/zuul-preview master: Remove C++ version  https://review.openstack.org/63679400:46
openstackgerritMonty Taylor proposed openstack-infra/zuul-preview master: Use rust:slim base image  https://review.openstack.org/63679500:46
openstackgerritMonty Taylor proposed openstack-infra/zuul-preview master: Use slice matching for hostname unpacking  https://review.openstack.org/63679600:46
fungihe is in all alexandrias simultaneously, until you collapse his wave function00:47
* mordred bombs people with patches00:47
mordredfungi: ++00:47
mordredcorvus: feel free to do whatever is useful with any, all or none of those00:48
clarkbI was going to say virginia is the wrong direction00:48
*** jamesmcarthur has joined #openstack-infra00:49
mordredclarkb: virginia was the right direction when duke beat them this weekend :)00:49
*** jamesmcarthur has quit IRC00:51
*** betherly has joined #openstack-infra00:53
pabelangerclarkb: mgagne: looks like inap might be having some problems: http://grafana.openstack.org/dashboard/db/nodepool-inap00:54
openstackgerritClark Boylan proposed openstack-infra/zuul master: Add Fake Github Review object to test suite  https://review.openstack.org/63678800:54
pabelangerhappened to notice it when looking at grafana00:54
mgagnepabelanger: thanks for the info, I'm currently busy debugging something else. I will look into it asap.00:54
pabelangernp! mostly an FYI00:55
*** betherly has quit IRC00:58
*** wolverineav has quit IRC01:00
*** hwoarang has quit IRC01:11
*** jamesmcarthur has joined #openstack-infra01:12
*** hwoarang has joined #openstack-infra01:13
*** smarcet has quit IRC01:13
*** gyee has quit IRC01:15
*** smarcet has joined #openstack-infra01:15
*** jamesmcarthur has quit IRC01:17
*** sthussey has quit IRC01:17
*** eumel8 has quit IRC01:19
*** whoami-rajat has joined #openstack-infra01:19
*** jamesmcarthur has joined #openstack-infra01:20
*** ekultails has quit IRC01:20
*** jamesmcarthur has quit IRC01:24
*** wolverineav has joined #openstack-infra01:28
*** rascasoft has joined #openstack-infra01:30
*** bhavikdbavishi has joined #openstack-infra01:38
*** rascasoft has quit IRC01:39
*** jamesmcarthur has joined #openstack-infra01:40
*** jamesmcarthur has quit IRC01:46
*** jamesmcarthur has joined #openstack-infra01:52
*** jamesmcarthur has quit IRC02:00
*** hongbin has joined #openstack-infra02:07
*** jamesmcarthur has joined #openstack-infra02:15
*** jamesmcarthur has quit IRC02:15
*** jamesmcarthur has joined #openstack-infra02:16
*** jamesmcarthur has quit IRC02:21
*** wolverineav has quit IRC02:34
*** jamesmcarthur has joined #openstack-infra02:42
*** jamesmcarthur has quit IRC02:46
*** jamesmcarthur has joined #openstack-infra02:52
*** psachin has joined #openstack-infra02:54
*** jamesmcarthur has quit IRC03:00
*** betherly has joined #openstack-infra03:00
*** jamesmcarthur has joined #openstack-infra03:02
*** betherly has quit IRC03:04
*** rlandy|bbl is now known as rlandy03:10
*** rlandy has quit IRC03:13
*** rascasoft has joined #openstack-infra03:23
*** armax has quit IRC03:24
*** markvoelker has quit IRC03:27
*** markvoelker has joined #openstack-infra03:27
*** rascasoft has quit IRC03:30
*** markvoelker has quit IRC03:32
*** jamesmcarthur has quit IRC03:33
*** ykarel|away has joined #openstack-infra03:40
*** agopi|out has joined #openstack-infra03:40
*** ykarel|away is now known as ykarel03:48
*** diablo_rojo has quit IRC03:53
*** jamesmcarthur has joined #openstack-infra03:54
*** jamesmcarthur has quit IRC03:59
*** eernst has joined #openstack-infra03:59
*** jamesmcarthur has joined #openstack-infra04:04
*** smarcet has quit IRC04:05
*** ramishra has joined #openstack-infra04:09
*** jamesmcarthur has quit IRC04:10
*** armax has joined #openstack-infra04:13
*** wolverineav has joined #openstack-infra04:17
*** wolverineav has quit IRC04:21
*** jamesmcarthur has joined #openstack-infra04:27
*** markvoelker has joined #openstack-infra04:28
*** jamesmcarthur has quit IRC04:31
*** udesale has joined #openstack-infra04:36
*** wolverineav has joined #openstack-infra04:39
*** eernst has quit IRC04:42
*** hwoarang has quit IRC04:47
*** jamesmcarthur has joined #openstack-infra04:48
*** owalsh_ has joined #openstack-infra04:49
*** hwoarang has joined #openstack-infra04:50
*** owalsh has quit IRC04:52
*** jamesmcarthur has quit IRC04:52
*** hwoarang has quit IRC04:56
*** hwoarang has joined #openstack-infra04:56
*** ykarel has quit IRC04:57
*** markvoelker has quit IRC05:02
*** jamesmcarthur has joined #openstack-infra05:09
*** ykarel has joined #openstack-infra05:11
*** jamesmcarthur has quit IRC05:13
*** wolverineav has quit IRC05:18
*** jamesmcarthur has joined #openstack-infra05:30
*** jamesmcarthur has quit IRC05:35
*** hongbin has quit IRC05:44
*** jamesmcarthur has joined #openstack-infra05:51
*** jamesmcarthur has quit IRC05:56
*** yboaron_ has joined #openstack-infra05:58
*** markvoelker has joined #openstack-infra05:58
*** ramishra_ has joined #openstack-infra06:00
*** ramishra has quit IRC06:01
openstackgerritOpenStack Proposal Bot proposed openstack-infra/project-config master: Normalize projects.yaml  https://review.openstack.org/63683106:06
*** jamesmcarthur has joined #openstack-infra06:12
*** ramishra_ is now known as ramishra06:17
*** jamesmcarthur has quit IRC06:17
*** snapiri has joined #openstack-infra06:29
*** e0ne has joined #openstack-infra06:29
*** dpawlik has joined #openstack-infra06:31
*** markvoelker has quit IRC06:32
*** jamesmcarthur has joined #openstack-infra06:33
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675906:35
*** jamesmcarthur has quit IRC06:38
*** ccamacho has quit IRC06:41
*** e0ne has quit IRC06:46
*** quiquell|off is now known as quiquell|rover06:47
*** jamesmcarthur has joined #openstack-infra06:54
*** dpawlik has quit IRC06:55
*** jamesmcarthur has quit IRC07:00
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675907:00
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver  https://review.openstack.org/53555607:04
*** jamesmcarthur has joined #openstack-infra07:05
*** wolverineav has joined #openstack-infra07:05
*** dpawlik has joined #openstack-infra07:05
*** slaweq has joined #openstack-infra07:08
*** wolverineav has quit IRC07:09
*** jamesmcarthur has quit IRC07:09
*** janki has joined #openstack-infra07:11
*** janki has quit IRC07:13
*** janki has joined #openstack-infra07:13
openstackgerritTobias Henkel proposed openstack-infra/zuul-jobs master: Optionally silence git push in mirror-workspace-git-repos  https://review.openstack.org/63516607:16
*** bhavikdbavishi has quit IRC07:17
*** pgaxatte has joined #openstack-infra07:18
openstackgerritTobias Henkel proposed openstack-infra/zuul-jobs master: Optionally silence git in mirror-workspace-git-repos  https://review.openstack.org/63516607:21
openstackgerritMerged openstack-infra/project-config master: Normalize projects.yaml  https://review.openstack.org/63683107:21
openstackgerritTobias Henkel proposed openstack-infra/zuul-jobs master: Optionally silence git in mirror-workspace-git-repos  https://review.openstack.org/63516607:21
*** aojea has joined #openstack-infra07:21
*** jamesmcarthur has joined #openstack-infra07:25
*** markvoelker has joined #openstack-infra07:28
*** Adri2000 has quit IRC07:28
*** Adri2000 has joined #openstack-infra07:29
*** jamesmcarthur has quit IRC07:30
*** quiquell|rover is now known as quique|rover|brb07:31
*** lujinluo has joined #openstack-infra07:32
*** apetrich has joined #openstack-infra07:35
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675907:35
*** jtomasek has joined #openstack-infra07:35
*** jtomasek has quit IRC07:39
*** ykarel is now known as ykarel|lunch07:42
*** jtomasek has joined #openstack-infra07:44
*** jamesmcarthur has joined #openstack-infra07:46
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675907:46
*** jamesmcarthur has quit IRC07:51
openstackgerritTobias Henkel proposed openstack-infra/zuul-jobs master: Optionally silence git in mirror-workspace-git-repos  https://review.openstack.org/63516607:52
*** kjackal has joined #openstack-infra07:56
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675907:57
*** markvoelker has quit IRC08:01
*** e0ne has joined #openstack-infra08:05
*** jamesmcarthur has joined #openstack-infra08:07
*** ramishra has quit IRC08:08
*** e0ne has quit IRC08:08
*** tkajinam has quit IRC08:09
*** memoussati has joined #openstack-infra08:10
*** jamesmcarthur has quit IRC08:12
*** quique|rover|brb is now known as quiquell|rover08:15
*** rpittau has joined #openstack-infra08:17
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675908:17
*** e0ne has joined #openstack-infra08:18
*** e0ne has quit IRC08:18
*** ramishra has joined #openstack-infra08:18
*** ccamacho has joined #openstack-infra08:22
*** jamesmcarthur has joined #openstack-infra08:28
*** e0ne has joined #openstack-infra08:29
*** ykarel|lunch is now known as ykarel08:29
*** memoussati has quit IRC08:31
*** jamesmcarthur has quit IRC08:32
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675908:33
*** e0ne has quit IRC08:35
*** jpena|off is now known as jpena08:38
*** xek has joined #openstack-infra08:40
*** electrofelix has joined #openstack-infra08:43
*** tosky has joined #openstack-infra08:47
*** jamesmcarthur has joined #openstack-infra08:48
*** memoussati has joined #openstack-infra08:49
*** jpich has joined #openstack-infra08:53
*** jamesmcarthur has quit IRC08:53
tobias-urdininfra-root: these two are merged now https://review.openstack.org/#/c/635941/ https://review.openstack.org/#/c/635965/ but i need some help requeuing an old release to test it08:53
tobias-urdin19:40 < fungi> sudo zuul enqueue-ref --tenant=openstack --trigger=gerrit --pipeline=release --project=openstack/puppet-aodh --ref=refs/tags/14.2.0 --newrev=617ffad84b633618490ca1023f8a31d9694b31a908:53
*** wolverineav has joined #openstack-infra08:57
*** kopecmartin|off is now known as kopecmartin08:57
*** markvoelker has joined #openstack-infra08:58
*** panda|off is now known as panda09:00
fricklertobias-urdin: enqueued09:00
*** wolverineav has quit IRC09:01
*** memoussati has quit IRC09:03
*** jamesmcarthur has joined #openstack-infra09:06
fricklertobias-urdin: still no luck it seems "Forge API auth failed with code: 400"09:09
*** memoussati has joined #openstack-infra09:10
*** jamesmcarthur has quit IRC09:12
*** dtantsur|afk is now known as dtantsur09:20
*** memoussati has quit IRC09:21
*** jamesmcarthur has joined #openstack-infra09:27
*** markvoelker has quit IRC09:31
*** jamesmcarthur has quit IRC09:32
openstackgerritBrendan proposed openstack-infra/zuul-jobs master: Use zuul_workspace_root variable for Git workspace prep  https://review.openstack.org/63687009:32
*** memoussati has joined #openstack-infra09:33
openstackgerritBrendan proposed openstack-infra/zuul-jobs master: Use zuul_workspace_root variable for Git workspace prep  https://review.openstack.org/63687009:34
*** derekh has joined #openstack-infra09:37
*** luizbag has joined #openstack-infra09:38
*** stakeda has quit IRC09:40
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675909:42
*** memoussati has quit IRC09:43
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675909:43
*** jamesmcarthur has joined #openstack-infra09:49
*** jamesmcarthur has quit IRC09:53
*** rosmaita has quit IRC09:55
tobias-urdinfrickler: hm, i tried that ansible module with my own credentials without any issues, perhaps it's actually something wrong with credentials09:55
tobias-urdinor maybe i should add more debug output, let me check what the response is on code 40009:55
*** ociuhandu has joined #openstack-infra09:56
*** AJaeger has quit IRC10:01
*** memoussati has joined #openstack-infra10:03
*** fdegir has joined #openstack-infra10:04
*** AJaeger has joined #openstack-infra10:04
tobias-urdinfrickler: i would prefer we do not send the password out, but can somebody that has access to that password try http://paste.openstack.org/show/745081/10:06
*** jamesmcarthur has joined #openstack-infra10:10
*** bhavikdbavishi has joined #openstack-infra10:12
*** jamesmcarthur has quit IRC10:15
*** walshh__ has quit IRC10:16
tobias-urdinif that works it's something with the encrypted secret or the usage of the secret that is faulty in project-config/zuul.d/jobs.yaml or project-config/playbooks/publish/puppetforge.yaml10:17
*** whoami-rajat has quit IRC10:19
openstackgerritsahid proposed openstack-dev/pbr master: Change python3.5 job to python3.7 job on Stein+  https://review.openstack.org/61065910:24
*** markvoelker has joined #openstack-infra10:28
fricklertobias-urdin: that works with the password we have on record. so probably something wrong with the encrypted secret or the way it is used. maybe fungi can look into that later10:31
*** jamesmcarthur has joined #openstack-infra10:31
fricklerinfra-root: FYI, running "gpg-agent --daemon emacs" failed on bridge because there was an active gpg-agent. please remember to use this incantation to avoid leaving secrets active after you exit emacs10:32
*** bhavikdbavishi has quit IRC10:34
tobias-urdinfrickler: thanks! good then we know what is blocking10:34
*** jamesmcarthur has quit IRC10:36
*** yamamoto has quit IRC10:40
*** jbadiapa has quit IRC10:42
fricklertobias-urdin: hmm, just browsing things it seems that the api also returns 400 for things like bad module names, probably it would be good to have more logging available in that module anyway. https://tickets.puppetlabs.com/projects/FORGE/issues/FORGE-228?filter=allopenissues10:43
*** betherly has joined #openstack-infra10:44
*** betherly has quit IRC10:44
*** jamesmcarthur has joined #openstack-infra10:52
tobias-urdinfrickler: should be fine, the auth and actual upload call is separated so it will show all errors regarding bad module etc10:54
tobias-urdinhttps://github.com/openstack-infra/zuul-jobs/blob/master/roles/upload-forge/library/forge_upload.py#L16810:54
tobias-urdinwhen response code != 20110:54
tobias-urdinon 409 (exact module version already exists) it fails with "module already exists"10:55
*** jamesmcarthur has quit IRC10:57
*** udesale has quit IRC10:58
*** markvoelker has quit IRC11:01
fricklertobias-urdin: ah, right, that would give a different error msg. so probably someone will have to crosscheck that decrypting the encrypted secret gives the correct result. or otherwise just try to refresh with a newly encrypted version.11:03
tobias-urdinyeah, it's also entirely possible that i messed up the "secrets" passed on the "job", set the "name" on that and am doing it wrong in publish/puppetforge.yaml11:04
tobias-urdinbut i don't see any errors when staring at it11:04
*** roman_g has joined #openstack-infra11:12
*** jamesmcarthur has joined #openstack-infra11:14
fricklertobias-urdin: did some staring myself and that didn't help either ;)11:16
tobias-urdinhehe :)11:16
*** jamesmcarthur has quit IRC11:18
*** yamamoto has joined #openstack-infra11:21
*** yamamoto has quit IRC11:28
*** jamesmcarthur has joined #openstack-infra11:35
*** jamesmcarthur has quit IRC11:40
*** priteau has joined #openstack-infra11:43
*** vdrok_ has quit IRC11:44
*** vdrok has joined #openstack-infra11:47
*** janki has quit IRC11:48
*** janki has joined #openstack-infra11:48
*** quiquell|rover is now known as quique|rover|r--11:53
*** janki has quit IRC11:55
*** janki has joined #openstack-infra11:55
*** jamesmcarthur has joined #openstack-infra11:57
*** yamamoto has joined #openstack-infra11:57
*** rpittau has quit IRC11:58
*** jpena is now known as jpena|lunch11:58
*** markvoelker has joined #openstack-infra11:58
*** jamesmcarthur has quit IRC12:01
*** yamamoto has quit IRC12:10
*** armstrong has joined #openstack-infra12:11
*** yamamoto has joined #openstack-infra12:14
*** jamesmcarthur has joined #openstack-infra12:18
*** memoussati has quit IRC12:19
*** yamamoto has quit IRC12:22
*** jamesmcarthur has quit IRC12:23
*** markvoelker has quit IRC12:26
*** wolverineav has joined #openstack-infra12:33
*** smarcet has joined #openstack-infra12:34
*** priteau has quit IRC12:36
*** priteau has joined #openstack-infra12:38
*** wolverineav has quit IRC12:38
*** jamesmcarthur has joined #openstack-infra12:39
*** memoussati has joined #openstack-infra12:41
*** janki has quit IRC12:41
*** yamamoto has joined #openstack-infra12:42
*** smarcet has quit IRC12:43
*** jamesmcarthur has quit IRC12:44
*** udesale has joined #openstack-infra12:45
*** owalsh_ is now known as owalsh_afk12:51
*** rh-jelabarre has joined #openstack-infra12:53
*** rpittau has joined #openstack-infra12:56
*** yamamoto has quit IRC13:00
*** jamesmcarthur has joined #openstack-infra13:01
*** ekultails has joined #openstack-infra13:01
*** rosmaita has joined #openstack-infra13:03
*** yamamoto has joined #openstack-infra13:03
*** jamesmcarthur has quit IRC13:05
*** yamamoto has quit IRC13:06
*** yamamoto has joined #openstack-infra13:06
*** yamamoto has quit IRC13:07
*** yamamoto has joined #openstack-infra13:13
*** priteau has quit IRC13:13
*** trown|outtypewww is now known as trown13:16
*** jbadiapa has joined #openstack-infra13:17
*** yamamoto has quit IRC13:19
*** jamesmcarthur has joined #openstack-infra13:21
*** jamesmcarthur has quit IRC13:21
*** jamesmcarthur_ has joined #openstack-infra13:22
*** armstrong has quit IRC13:22
*** armstrong has joined #openstack-infra13:23
*** jpena|lunch is now known as jpena13:24
*** smarcet has joined #openstack-infra13:25
*** agopi|out is now known as agopi|brb13:28
*** jamesmcarthur_ has quit IRC13:29
openstackgerritsebastian marcet proposed openstack-infra/system-config master: Update puppet config for openstackid-dev node  https://review.openstack.org/63695213:30
*** agopi|brb has quit IRC13:32
*** weshay has joined #openstack-infra13:34
*** yboaron_ has quit IRC13:36
*** yboaron_ has joined #openstack-infra13:36
*** mriedem has joined #openstack-infra13:39
*** rlandy has joined #openstack-infra13:41
*** yamamoto has joined #openstack-infra13:46
*** yamamoto has quit IRC13:46
*** yamamoto has joined #openstack-infra13:46
*** yamamoto has quit IRC13:47
*** weshay is now known as weshay|ruck13:47
*** yamamoto has joined #openstack-infra13:47
*** jaosorior has quit IRC13:47
*** jamesmcarthur has joined #openstack-infra13:49
*** jaosorior has joined #openstack-infra13:51
*** jamesmcarthur has quit IRC13:52
*** jamesmcarthur has joined #openstack-infra13:52
*** quique|rover|r-- is now known as quiquell|rover13:53
*** memoussati has quit IRC13:54
*** priteau has joined #openstack-infra13:55
sshnaidmclarkb, pabelanger fungi do you know if there is a way to prevent merge of a patch if 3rd party CI failed?13:58
*** agopi|brb has joined #openstack-infra14:00
*** jamesmcarthur has quit IRC14:01
*** rfolco is now known as rfolco|off14:01
*** owalsh_afk is now known as owalsh14:01
*** rosmaita has quit IRC14:02
*** agopi_ has joined #openstack-infra14:04
*** agopi|brb has quit IRC14:06
fungisshnaidm: yes, don't approve it14:08
*** yamamoto has quit IRC14:09
sshnaidmfungi, and something more rough and tyrannical? :)14:09
fungisshnaidm: short of that, you might be able to write a zuul job which checked the vote details on the change under test and then return a failure result under specific conditions, though i haven't thought through what race conditions an implementation like that might imply14:09
sshnaidmfungi, cool, will think how to do it..14:10
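A minimal sketch of the vote check such a job might perform, assuming the change has already been fetched from Gerrit's REST API with detailed labels, so that `labels -> Verified -> all` lists one entry per voter. The function name and the CI account names are hypothetical, and the race fungi mentions still applies: a third-party vote can arrive after this check has run.

```python
def blocking_ci_votes(change, ci_names, label="Verified"):
    """Return the sorted names of the listed CI accounts that voted negative.

    `change` is a dict shaped like a Gerrit change with detailed labels;
    `ci_names` is the set of third-party CI account names to enforce.
    """
    votes = change.get("labels", {}).get(label, {}).get("all", [])
    return sorted(
        vote["name"]
        for vote in votes
        if vote.get("name") in ci_names and vote.get("value", 0) < 0
    )


# Hypothetical data: only "Example CI" is enforced, and it voted -1.
sample_change = {
    "labels": {
        "Verified": {
            "all": [
                {"name": "Zuul", "value": 1},
                {"name": "Example CI", "value": -1},
                {"name": "Other CI", "value": -1},
            ]
        }
    }
}
```

A job could fail itself whenever `blocking_ci_votes(sample_change, {"Example CI"})` returns a non-empty list.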
fungisshnaidm: though to answer the question you may have been trying to ask, zuul doesn't have extensible features for depending on the results of nor deferring to other ci systems14:11
sshnaidmfungi, yeah, seems like good feature to have in the future.14:12
*** memoussati has joined #openstack-infra14:12
fungii disagree, but it's worthy of debating14:13
sshnaidmespecially if the another ci is also zuul based14:13
fungiteams like neutron, cinder, ironic and nova who have potentially dozens of third-party ci systems manage to pay attention to the votes those cast and take them into account when deciding whether or not to approve a change14:13
fungialso, ci testing and gating aren't a substitute for reviewers' attention to detail14:15
*** agopi_ is now known as agopi14:18
*** ykarel is now known as ykarel|away14:18
*** yboaron_ has quit IRC14:20
*** jamesmcarthur has joined #openstack-infra14:22
*** psachin has quit IRC14:22
*** kjackal has quit IRC14:23
*** ykarel|away has quit IRC14:23
*** jamesmcarthur has quit IRC14:26
*** kjackal has joined #openstack-infra14:27
sshnaidmcan not disagree14:30
sshnaidmbut things happen14:31
openstackgerritNir Magnezi proposed openstack/diskimage-builder master: [wip] rhel8 beta support  https://review.openstack.org/62313714:32
tobias-urdinfungi: could you check the conversation i had with frickler a bit up, thanks! to summarize, there's something wrong with the secret or how the secret is passed/configured in jobs->playbook14:32
*** ekultails has quit IRC14:34
*** yboaron_ has joined #openstack-infra14:38
fungitobias-urdin: yep, saw it. i'll double-check that the secret decrypts to the same value we have on record, but it may be a bit14:38
*** ykarel|away has joined #openstack-infra14:40
*** yamamoto has joined #openstack-infra14:41
*** gfidente has joined #openstack-infra14:41
fricklerfungi: I had the same idea, but gave up after looking into the code to see how that would be done. do you happen to have a tool for it?14:41
*** yboaron_ has quit IRC14:42
*** yboaron_ has joined #openstack-infra14:42
fungifrickler: yeah, the openssl command line ought to be able to do it14:42
fungiit uses a standard protocol14:42
*** jamesmcarthur has joined #openstack-infra14:43
*** ykarel|away is now known as ykarel14:45
*** yamamoto has quit IRC14:46
*** jamesmcarthur has quit IRC14:47
*** nhicher has quit IRC14:48
*** nhicher has joined #openstack-infra14:50
smcginnissshnaidm, fungi: I believe if you give a CI voting rights then it can block patches from merging.14:51
sshnaidmsmcginnis, it can vote, but not block..14:51
smcginnisWe don't go that far in Cinder though since most CIs are not super reliable, so there's always manual evaluation needed.14:51
smcginnisI could have sworn we accidentally added a third party CI account to a voting group once and had to quickly change it back. And I thought Nova would let the VMware NSX CI vote for that reason.14:52
fungismcginnis: we grant third-party ci systems at most -1..+1 voting rights on the verified label. that doesn't block anything14:52
smcginnisPerhaps I'm wrong, or perhaps things have changed.14:52
smcginnisfungi: Ah, probably my misunderstanding then.14:52
*** electrofelix has quit IRC14:53
*** ekultails has joined #openstack-infra14:53
fungithe argument used by the cinder team a while back to not have their third-party ci systems vote is that a lot of those ci systems were unreliable and reviewers were skipping patches which had a verified -114:53
smcginnisMaybe I'm thinking of it showing up in the list of reviews as Verified-1, causing reviewers to skip it thinking it failed gate.14:53
smcginnisfungi: Yeah14:53
smcginnisContext switch: anyone know where things ended up with the LOCI post queue issue from yesterday?14:54
fungismcginnis: i think we were hoping to get up with hogepodge but the short-term solution is you can delete that job from your post pipeline in your project. longer term, if the job is valuable, is to move it to a trusted/config repository so other projects can resume running it14:55
smcginnisfungi: I would think it would be very valuable to the loci team, but I hadn't even realized we had it in the cinder job config.14:56
smcginnisI will remove from there for now and leave it to them to decide how they want to handle things I guess.14:56
smcginnisThanks14:56
fungiit's in the git history, so easy enough to add back once it's working again15:00
smcginnis++15:00
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver  https://review.openstack.org/53555615:00
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver  https://review.openstack.org/53555615:02
*** jamesmcarthur has joined #openstack-infra15:04
*** jamesmcarthur has quit IRC15:08
*** rosmaita has joined #openstack-infra15:14
*** wolverineav has joined #openstack-infra15:15
*** roman_g has quit IRC15:16
*** wolverineav has quit IRC15:19
*** udesale has quit IRC15:22
*** jamesmcarthur has joined #openstack-infra15:24
*** jamesmcarthur has quit IRC15:29
*** eernst has joined #openstack-infra15:34
*** memoussati has quit IRC15:35
*** jamesmcarthur has joined #openstack-infra15:45
*** jamesmcarthur has quit IRC15:50
*** jamesmcarthur has joined #openstack-infra15:52
*** jamesmcarthur has quit IRC15:54
*** jamesmcarthur has joined #openstack-infra15:54
clarkbfrickler: sorry that was probably me15:55
*** gfidente has quit IRC15:56
*** memoussati has joined #openstack-infra15:56
*** yboaron_ has quit IRC15:57
clarkbre gpg agent15:58
*** pgaxatte has quit IRC16:01
*** diablo_rojo has joined #openstack-infra16:01
*** eernst has quit IRC16:02
*** kaisers has quit IRC16:07
*** ramishra has quit IRC16:07
*** kaisers has joined #openstack-infra16:15
*** dtantsur is now known as dtantsur|afk16:16
zbrclarkb: fungi: a few days ago I was asked by pabelanger to stop using the bindep-fallback template due to deprecation. Does this look ok? https://review.openstack.org/#/c/636163/16:18
*** smarcet has quit IRC16:24
openstackgerritClark Boylan proposed openstack-infra/zuul master: Add Fake Github Review object to test suite  https://review.openstack.org/63678816:24
*** smarcet has joined #openstack-infra16:25
clarkbzbr: lgtm. I made a couple suggestions inline if you have to make a new patchset but I don't think they are critical16:27
*** smarcet has quit IRC16:28
*** smarcet has joined #openstack-infra16:30
*** memoussati has quit IRC16:32
clarkbcorvus: I think we are still experiencing the slow ansible runs due to broken k8s module (at least I don't expect these tasks to run this long)16:36
clarkbso the upgrade and changes to how thread pools were handled didn't help16:36
tobias-urdinfungi: thanks :)16:36
clarkbI wonder if this is a multiprocessing interaction between ansible use of multiprocessing and k8s module use of it16:36
clarkbanyone else seen this problem? dmsimard perhaps?16:36
*** markvoelker has joined #openstack-infra16:38
clarkbfungi: https://review.openstack.org/#/c/636681/ is a quick and easy one if you have a moment. I'll start figuring out the replacement pbx as soon as that merges and is on bridge16:39
dmsimardclarkb: it doesn't ring a bell for me, but if you are reproducing this with Zuul's Ansible, I would try to see if we can reproduce it with the latest 2.716:40
clarkbdmsimard: no this is with ansible 2.7.3 on bridge.openstack.org. The k8s module ends up deadlocking around thread pool cleanup16:40
clarkbthis also doesn't put much faith in swagger client generation :/16:41
dmsimardclarkb: have a link to the playbook/role where the tasks in question are used ?16:41
clarkbdmsimard: https://git.openstack.org/cgit/openstack-infra/system-config/tree/kubernetes/gitea/gitea-playbook.yaml the k8s tasks there16:42
*** ociuhand_ has joined #openstack-infra16:42
clarkboh now thats curious16:43
clarkbdouble checking which version of the python kubernetes lib we have installed I still see 8.0.116:43
clarkbsomething undid the 9.0.0 beta install maybe? we should rerun with that and double check if it fixes or not16:43
*** agopi is now known as agopi|FOOD16:44
*** ociuhandu has quit IRC16:45
*** trident has quit IRC16:46
*** ccamacho has quit IRC16:48
*** ociuhand_ has quit IRC16:48
*** dpawlik has quit IRC16:52
dmsimardclarkb: I see kubernetes==8.0.1 and openshift==0.8.2. I guess kubernetes is pulled by openshift: https://github.com/openshift/openshift-restclient-python/blob/master/requirements.txt16:53
*** ijw has joined #openstack-infra16:53
dmsimardclarkb: openshift python lib 0.8.5 was released yesterday fwiw: https://pypi.org/project/openshift/#history16:53
clarkbdmsimard: yes it is, and that is where corvus' debugging seemed to pinpoint the deadlock. Reviewing its code (which is generated) there were changes around thread pool management in 9.0.0 beta16:53
clarkbso we thought maybe an upgrade would fix things but we seem to have downgraded after the upgrade so we should try that again16:54
clarkbpossibly because of that openshift release yesterday causing a reinstall16:54
*** gyee has joined #openstack-infra16:55
dmsimardthere's a 9.0.0b1 of kubernetes that was released yesterday as well.. is that a coincidence ?16:55
clarkbya thats the one I thought we had installed so we'll just need to redo that again16:56
*** quiquell|rover is now known as quiquell|off16:57
openstackgerritFabien Boucher proposed openstack-infra/zuul master: URLTrigger driver time based - artifact change jobs triggering driver  https://review.openstack.org/63556717:00
*** ociuhandu has joined #openstack-infra17:02
openstackgerritFabien Boucher proposed openstack-infra/zuul master: URLTrigger driver time based - artifact change jobs triggering driver  https://review.openstack.org/63556717:02
*** sreejithp has joined #openstack-infra17:05
*** sthussey has joined #openstack-infra17:05
*** ociuhandu has quit IRC17:06
*** ianychoi has joined #openstack-infra17:06
*** ijw_ has joined #openstack-infra17:06
*** jpich has quit IRC17:08
*** ijw has quit IRC17:10
*** markvoelker has quit IRC17:12
openstackgerritMerged openstack-infra/openstackid-resources master: Update API code to work with Presentation Moderators collection (+N)  https://review.openstack.org/63619017:13
*** ociuhandu has joined #openstack-infra17:15
*** kopecmartin is now known as kopecmartin|off17:16
*** wolverineav has joined #openstack-infra17:17
*** wolverineav has quit IRC17:17
*** wolverineav has joined #openstack-infra17:17
*** armstrong has quit IRC17:21
*** mattw4 has joined #openstack-infra17:22
*** rpittau has quit IRC17:22
*** betherly has joined #openstack-infra17:23
*** ociuhandu has quit IRC17:24
*** memoussati has joined #openstack-infra17:27
*** aojea has quit IRC17:31
*** wolverineav has quit IRC17:32
*** wolverineav has joined #openstack-infra17:32
*** ijw_ has quit IRC17:34
*** wolverineav has quit IRC17:36
*** wolverineav has joined #openstack-infra17:36
*** NCLanceman has joined #openstack-infra17:43
*** Jason_Lee has joined #openstack-infra17:46
*** luizbag has quit IRC17:47
*** NCLanceman has quit IRC17:47
*** Jason_Lee has quit IRC17:48
*** NCLanceman has joined #openstack-infra17:48
*** trown is now known as trown|lunch17:49
fungiokay, i need to go out for lunch and some errands. on my return i'll try to decrypt the puppetforge creds we have in job config to make sure they match what's on record (tobias-urdin), and get the openstackid-dev mysql ssl keys into private hiera (smarcet)17:49
smarcetfungi: thx u! :)17:49
corvusclarkb, dmsimard: wow, what an unlucky coincidence.  did you re-install 9.0.0?17:50
fungithose are currently the top items to be popped off my (lengthy) to do list anyway17:50
clarkbcorvus: I have not reinstalled yet in case you had other ideas for why that may have happened17:52
*** ykarel is now known as ykarel|away17:54
corvusclarkb: what does ~= mean?17:56
clarkbcorvus: in what context? did I type that? usually if I type that I mean approximately17:56
corvusclarkb: sorry -- https://github.com/openshift/openshift-restclient-python/blob/master/requirements.txt#L317:57
clarkbhrm I think that means any 8.x version17:57
clarkbit won't upgrade them by default either17:57
clarkbexcept when you have 9.0 :/17:57
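For reference, `~=` is the PEP 440 "compatible release" operator. The sketch below is a hand-rolled approximation of its rule, not pip's implementation. It assumes a pin of the form `kubernetes ~= 8.0` (the exact requirements line is not quoted in this log), and the helper names `release_tuple` and `is_compatible` are hypothetical. It ignores pre-release ordering and only looks at the numeric release segment.

```python
import re


def release_tuple(version):
    """Return the leading numeric release segment, e.g. '9.0.0b1' -> (9, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_compatible(spec, candidate):
    """Approximate '~= spec': candidate >= spec, and candidate shares the
    prefix formed by dropping the last segment of spec."""
    base = release_tuple(spec)
    if len(base) < 2:
        raise ValueError("'~=' needs at least two release segments")
    cand = release_tuple(candidate)
    prefix = base[:-1]
    # Pad with zeros so that '8.1' and '8.1.0' compare as equal.
    width = max(len(base), len(cand))
    padded_base = base + (0,) * (width - len(base))
    padded_cand = cand + (0,) * (width - len(cand))
    return padded_cand[:len(prefix)] == prefix and padded_cand >= padded_base


# Under the assumed pin '~= 8.0', any 8.x release satisfies it; 9.0.0b1 does not.
for version in ("8.0.1", "8.5.0", "9.0.0b1"):
    print(version, is_compatible("8.0", version))
```

Under that assumed pin, an installer that re-resolves the openshift package's dependencies would replace a manually installed 9.0.0b1 with an 8.x release, which matches the downgrade seen here.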
corvusclarkb: meh.  maybe we should just upgrade and check back in a few hours :)17:58
corvusi'll do it17:58
clarkbwfm17:58
corvus(i still have the shell open from last time)17:58
corvuslooks like we currently have a stuck task.17:58
corvusi will kill it17:58
mgagnecurrently checking for an issue with orphan/zombie neutron ports in inap-mtl01. new instances fail to get an IP address since there are none free.17:59
*** ykarel|away has quit IRC18:00
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Re-use the github PR object when fetching reviews  https://review.openstack.org/63670518:03
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add comment about extra issues request  https://review.openstack.org/63670618:03
*** Jason_Lee has joined #openstack-infra18:07
openstackgerritSorin Sbarnea proposed openstack-infra/bindep master: Replace deprecated bindep-fallback testing  https://review.openstack.org/63616318:07
*** jpena is now known as jpena|off18:08
*** derekh has quit IRC18:09
*** NCLanceman has quit IRC18:09
openstackgerritSorin Sbarnea proposed openstack-infra/bindep master: Replace deprecated bindep-fallback testing  https://review.openstack.org/63616318:09
*** markvoelker has joined #openstack-infra18:09
*** priteau has quit IRC18:13
*** wolverineav has quit IRC18:14
clarkbcorvus: next run_all.sh just started18:15
corvusstill at 9.0.0b118:15
*** Jason_Lee has quit IRC18:16
*** wolverineav has joined #openstack-infra18:16
clarkbjust downgraded looks like18:16
clarkbTASK [install-ansible : Install openshift client] ****************************** and friends must do it18:16
corvusyeah :(18:16
clarkbmaybe we just delete that task for now?18:17
corvuswfm18:17
corvusclarkb: i'll patch18:18
*** wolverineav has quit IRC18:18
corvusi'm reinstalling manually18:18
*** wolverineav has joined #openstack-infra18:18
corvusit'll get used on this run18:18
openstackgerritClark Boylan proposed openstack-infra/system-config master: Stop install the openshift client  https://review.openstack.org/63702018:19
clarkbis the change if we want to do it longer term18:19
*** agopi|FOOD is now known as agopi18:19
clarkbcorvus: oh thats a good idea, we'll be able to confirm if it works and if it helps that way18:19
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Temporarily stop installing openshift  https://review.openstack.org/63702118:20
corvusclarkb: ^ can we do that instead so we don't forget?18:20
clarkbwfm18:20
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Make UnsafeTag self registering  https://review.openstack.org/63702318:21
*** NCLanceman has joined #openstack-infra18:24
*** smarcet has quit IRC18:24
*** Jason_Lee has joined #openstack-infra18:28
*** NCLanceman has quit IRC18:30
openstackgerritMerged openstack-infra/zuul master: Mark as unsafe commit message at inventory  https://review.openstack.org/63393018:32
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Upgrade all dev servers to puppet 4  https://review.openstack.org/63039118:34
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Upgrade some servers to puppet 4  https://review.openstack.org/63472618:34
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Upgrade git01.openstack.org to puppet 4  https://review.openstack.org/63472718:34
cmurphyclarkb: ^ i didn't realize that was in merge conflict18:34
cmurphyclarkb: also noticing that we didn't do https://review.openstack.org/616001 i think there was a good reason for that? is puppet turned back on for those?18:35
*** Jason_Lee has quit IRC18:35
clarkbcmurphy: it isn't turned on yet. smarcet is working on upgrades now (and is almost done on the dev server, next prod, then we can puppet 4 I think)18:35
cmurphyok18:36
cmurphyand ask.o.o?18:36
clarkbI don't know about ask18:36
clarkbfungi: ^ do you?18:36
cmurphyalso refstack https://review.openstack.org/628153 i think i took that out of the list for a reason, i think i found some channel history about refstack going to containers?18:37
clarkbya refstack needs some care/feeding18:37
cmurphyshould we puppet4 it or ignore it?18:37
clarkbI know hogepodge wanted to move it to a docker image based deployment (which we can now do) but unsure if progress was made on that yet18:37
clarkbcmurphy: I would probably ignore it for now18:38
cmurphykk18:38
cmurphyhoping to get all this done before the end of the cycle18:38
clarkbcorvus: ansible is running puppet afs playbook now so either newer kubernetes lib fixed things or it broke it faster :)18:39
*** shardy has quit IRC18:40
clarkbcorvus: I think it worked18:40
clarkbhttp://paste.openstack.org/show/745115/18:41
openstackgerritFabien Boucher proposed openstack-infra/zuul master: URLTrigger driver time based - artifact change jobs triggering driver  https://review.openstack.org/63556718:42
*** markvoelker has quit IRC18:42
*** sshnaidm is now known as sshnaidm|off18:47
*** dims has quit IRC18:47
*** ccamacho has joined #openstack-infra18:47
toskyuh, a "REMOTE HOST IDENTIFICATION HAS CHANGED" error led to a POST_FAILURE in a sahara job: http://logs.openstack.org/57/634757/4/check/python-saharaclient-tempest/f64602b/ara-report/18:48
toskythere are also other failures in the devstack run18:48
clarkbcorvus: if you confirm that it seems to have worked too then I think you should go ahead and approve https://review.openstack.org/#/c/637021/18:48
toskyand that job was not touched in a while - a transient error?18:48
clarkbtosky: that happens when clouds reuse IP addresses18:48
clarkbtosky: then the instances fight over the IP via ARP18:48
clarkband if the "wrong" host has the IP when we try to ssh bad things happen18:49
clarkbthe joys of dogfooding :)18:49
toskyI was lucky enough so far, it's the first time I see this18:49
clarkbya its not a super frequent thing but >018:49
toskyis this classified in some way, so that I can add the proper label after recheck?18:50
clarkbI think it is /me double checks. However there is no need to label things with rechecks.18:50
clarkbwe stopped doing that when we compared human generated data to elastic-recheck data and found humans are very error prone18:50
toskyI know, but at least this time it seems to be a known issue18:50
toskyah18:51
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Cache github PR shas  https://review.openstack.org/63676418:51
clarkbI'm not seeing it in e-r currently so maybe we need to add it or revert a removal if it was there18:51
corvusclarkb: lgtm +318:52
openstackgerritSorin Sbarnea proposed openstack-infra/bindep master: Fix tox python3 overrides  https://review.openstack.org/60561318:59
clarkbcorvus: I wonder if we should file bugs against openshift client and ansible (seems k8s client is already fixing it, but the others need to change their deps)19:00
corvusclarkb: for some reason i don't understand, the system-config updates take 2 pulses to take effect19:02
*** wolverineav has quit IRC19:02
corvusoh, and the change hasn't landed anyway19:03
corvusi re-installed 9.0.019:03
clarkbcorvus: we only update system-config at the very beginning of the run19:03
clarkbso you have to get in before that for most things19:03
corvusah, i must have had bad timing19:03
corvushopefully the openshift change will land between now and the next run19:03
*** wolverineav has joined #openstack-infra19:03
*** jamesmcarthur has quit IRC19:03
*** wolverineav has quit IRC19:06
*** wolverineav has joined #openstack-infra19:06
pabelangerwe seem to have a large backlog in executor queue at the moment, does anybody know why?19:08
clarkblooks like executors have been not accepting new jobs since about 1800UTC (small brief periods where they do)19:11
clarkbload averages have been high according to the grafana graphs19:12
clarkband the running builds spiked, but they haven't persisted at those levels so you'd expect job uptake to happen again19:12
openstackgerritClark Boylan proposed openstack-infra/zuul master: Cache github PR shas  https://review.openstack.org/63676419:13
pabelangerI am guessing the large stack at 636253 has something to do with it19:13
clarkbpabelanger: if you look at node usage graph we continue to use just as many or more nodes through that time period19:14
clarkbso I think its demand19:14
*** NCLanceman has joined #openstack-infra19:14
clarkbI don't think we're falling behind so much as going full steam ahead19:14
clarkb(and maybe falling behind when the current is stronger)19:15
pabelangerclarkb: yah, I think we might be capped at how fast we can launch jobs on the executors, I haven't looked at logs, but graphs seem to me to show we have the capacity (RAM / HDD / CPU) on current executors to launch more jobs at once19:16
clarkbpabelanger: it could also be the 4 jobs per second or whatever that number/period is for throttling19:17
clarkbpabelanger: but if you look at the node usage we are using on average more nodes over that time period19:17
clarkbso zuul is going as fast as it can I think19:17
*** ccamacho has quit IRC19:17
pabelangerclarkb: well, i think the nodes are ready, but we haven't started ansible-playbook process yet19:18
pabelangerdue to 4 jobs per sec limit19:18
clarkbthey say they are in use, do we flip that bit over before starting ansible?19:18
pabelangerso, nodes are idle not working in used state19:18
pabelangerclarkb: yah, I think so19:18
clarkbgotcha19:18
*** jamesmcarthur has joined #openstack-infra19:18
clarkbsome of the dip is due to the merger queue backlog I think (which would be explained by that large stack)19:19
clarkbthat roughly coincides with the dip in running jobs19:19
pabelangerah, I missed that19:20
clarkband the scheduler itself is consuming quite a bit of cpu. Those two things may end up being our bottlenecks?19:20
*** NCLanceman has quit IRC19:20
clarkbmerger to say here is the new configs, scheduler to parse them all and take action on them. Then once it does there is a lot of work for executors to pick up and we hit the 4 per second limit?19:20
clarkb(I don't have hard data for that other than the zuul-status page graphs)19:21
*** jamesmcarthur has quit IRC19:21
clarkbbut ya the biggest dip coincides with merger backlog. And scheduler tends to have to take action on post merger activity19:21
*** sreejithp_ has joined #openstack-infra19:22
*** wolverineav has quit IRC19:22
clarkband ya that nova stack rebase coincides with all that19:22
*** wolverineav has joined #openstack-infra19:23
*** NCLanceman has joined #openstack-infra19:23
pabelangeryah, starting builds governor likely could be allowed to open up a little more, to then let memory / cpu / hdd governor be our limit19:23
*** sreejithp has quit IRC19:24
*** ociuhandu has joined #openstack-infra19:24
*** ociuhandu has quit IRC19:30
cmurphyhuh i hit the remote host key change too http://logs.openstack.org/91/630391/5/check/tox-linters/9d053bf/job-output.txt.gz#_2019-02-14_19_08_09_41616219:30
*** jamesmcarthur has joined #openstack-infra19:30
*** Jason_Lee has joined #openstack-infra19:30
tobiashclarkb: are your executors on local disks or shared storage?19:32
clarkbtobiash: they are cinder block volumes, which are remote disks19:32
*** NCLanceman has quit IRC19:32
tobiashclarkb: so if the disks would become slower due to something something this would also count into the load and that spikes up leading to a deregister of the executors19:33
pabelangertobiash: loadavg / memory / swap all look okay in grafana19:35
tobiashpabelanger: how about io of the disks?19:35
pabelangers/swap/hdd19:35
tobiashI mean not disk space but iops19:36
tobiashwe see that our executors are mostly io limited19:36
tobiash(they're on ceph)19:36
pabelangertobiash: not sure, haven't looked at cacti19:36
clarkbtobiash: http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1&from=now-3h&to=now the starting builds graph seems to show its the main jobs per $timeframe limit I think19:36
pabelangerwe should log the data there19:36
clarkbstarting builds should trend lower than 4 if it is another limit19:37
clarkbI think19:37
corvushas anyone compared ze08 to the rest?19:38
clarkbcorvus: I looked the other day comparing swap and its swap usage is way down compared to the others19:38
clarkbhaven't checked today19:39
corvusreduced mem/swap should allow it to accept more jobs when we hit capacity19:39
corvusour executors should be able to handle more than they are, the reason they aren't is swapping activity19:39
openstackgerritMerged openstack-infra/system-config master: Trigger deployment with gitea 1.6.3  https://review.openstack.org/63501619:39
*** markvoelker has joined #openstack-infra19:39
clarkbya but I don't think the governing happens due to cpu or memory or disk, its just starting jobs limit? or maybe I misunderstand how that limit works?19:40
*** weshay|ruck has quit IRC19:40
corvusclarkb: that's correct -- but here's the thing -- if the system has enough headroom, then a starting job becomes a running job quickly19:41
corvusclarkb: think about how the system behaves when we restart the executors19:41
clarkbhttp://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=64537&rra_id=all ze08 swap http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=64465&rra_id=all ze05 swap19:41
corvusit goes from 0 to capacity in a very short time19:41
*** jamesmcarthur has quit IRC19:41
clarkb08 still looking much better on swap19:41
corvusthe starting jobs governor is still in place, but it doesn't unduly inhibit things19:41
*** wolverineav has quit IRC19:41
*** jamesmcarthur has joined #openstack-infra19:41
corvusbut after running a long time, the executors swap more and spend more system cpu time, and things just slow down.19:42
clarkbgotcha its the turnover on those 4 jobs that the resource contention affects not the raw limit19:42
corvusyep19:42
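The point above, that the start rate depends on how quickly a starting job becomes a running job rather than on the raw limit, can be sketched with Little's law. This is a toy model under stated assumptions, not Zuul's governor code: the limit of 4 concurrently starting builds and both startup times are illustrative numbers, and `max_start_rate` is a hypothetical helper.

```python
def max_start_rate(starting_limit, seconds_to_start):
    """Upper bound on builds started per second when at most
    `starting_limit` builds may be in the starting phase at once.

    Little's law: concurrency = rate * time, so rate = concurrency / time.
    """
    if starting_limit <= 0 or seconds_to_start <= 0:
        raise ValueError("both arguments must be positive")
    return starting_limit / seconds_to_start


# Illustrative numbers only: the same limit yields very different
# throughput once swapping stretches the time each build needs to start.
for label, seconds in (("freshly restarted executor", 5.0),
                       ("long-running, swapping executor", 60.0)):
    rate = max_start_rate(4, seconds)
    print(f"{label}: about {rate * 60:.0f} builds started per minute")
```

In this model the limit itself never changes; only the time spent in the starting phase does, which fits the observation that a freshly restarted executor goes from 0 to capacity quickly.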
clarkbze08 isn't running many more jobs than the other executors. It seems to fall right in the middle19:42
corvusbummer19:43
corvuswas there a really big reset i wonder?19:44
clarkbcorvus: yes huge nova stack rebase19:44
clarkband then some smaller gate resets due to the ssh key change problems :/19:45
clarkbcloudnull: re ssh key change problems ^ we still think those are likely to be due to duplicate IP/ IP reuse in rax regions19:45
corvusit could be that a high enough percentage of jobs had to stop at once, and the system was big enough to keep the starting governor at the min19:46
corvusrather, system was 'busy' enough19:46
*** rlandy is now known as rlandy|afk19:47
clarkbmgagne: fwiw those IPs seem to be from the inap cloud this time around example are 198.72.124.136 and 198.72.124.19119:47
openstackgerritChristian Berendt proposed openstack/diskimage-builder master: bootloader: add support for GRUB_CMDLINE_LINUX  https://review.openstack.org/63703619:47
clarkbmgagne: perhaps related to the ports issue you mentioned previously19:47
mgagneclarkb: I think Zuul looped on node creation because it couldn't find any free IPs. I'm not sure if there are any left zombie instances with "free" IPs. are there any issues atm since the cleanup?19:48
cloudnullclarkb on it19:49
corvusclarkb: there's still a lot of swap activity on ze0819:49
clarkbcloudnull: sorry I assumed it was still rax this time19:49
cloudnulloh , not rax?19:49
clarkbcloudnull: at least the two examples I just pulled were inap19:49
*** ssusteve has joined #openstack-infra19:50
cloudnullah, i see.19:50
* cloudnull off it :) 19:50
* cloudnull reading back 19:50
clarkbmgagne: http://logs.openstack.org/96/636696/1/gate/openstack-tox-py27/43988b0/job-output.txt.gz#_2019-02-14_19_39_38_475279 is a case from about 10 minutes ago19:50
*** rosmaita has left #openstack-infra19:50
mgagnelet me check19:50
clarkbcloudnull: that said if you know what might be causing that, that might be helpful info for mgagne19:50
clarkbcloudnull: I'm not sure how far your debugging got19:51
corvusinstalled 9.0.0 again19:51
cloudnulli just saw the ping and assumed a duplicate IP was a RAX problem19:51
clarkbcorvus: ya I'm guessing that total memory use is less because jemalloc packs better, but the hot spots are still there and we still swap them19:51
cloudnullI'd not really looked into it just yet19:51
clarkbcloudnull: ya no worries19:52
*** Jason_Lee has quit IRC19:52
cloudnullwhen we see those, in our case, its normally some zombie instance19:52
clarkbcloudnull: sometimes you discover $thing is a known problem and so and so fixed it and there is some change over here that needs backporting :)19:52
cloudnullin our case thats normally caused by a delete, while recorded as successful, fails to actually shutdown / remove the instance. Not that it happens often, but it happens.19:54
clarkbI've got to step out for a few, but the thrashing from the IP issues seems like it could be a big part of the busy system load19:54
clarkbsince that results in gate resets19:54
openstackgerritJames E. Blair proposed openstack-infra/zuul-preview master: Update gitreview file with correct project name  https://review.openstack.org/63679119:54
mgagne198.72.124.191 is no longer responding to ping and I can't find it on any compute nodes. 198.72.124.136 responds to ping but I can't find it with my usual method.19:55
openstackgerritJames E. Blair proposed openstack-infra/zuul-preview master: Update gitreview file with correct project name  https://review.openstack.org/63679119:55
openstackgerritMerged openstack-infra/zuul-preview master: Update gitreview file with correct project name  https://review.openstack.org/63679119:56
openstackgerritJames E. Blair proposed openstack-infra/zuul-preview master: WIP: test docker registry  https://review.openstack.org/63703720:02
fungiclarkb: cmurphy: i don't know anything about ask.o.o really. not sure anyone does. we puppet it, but it's mostly unmaintained at this point (from the underlying software point of view anyway)20:04
clarkbI think it was ianw that had started to look at it20:06
clarkbcan ask when awake20:06
mgagneclarkb: found it for 198.72.124.13620:07
*** yboaron_ has joined #openstack-infra20:07
fungicorvus: clarkb: also, the primary playbooks i think do require two rounds of updates because they're already read in at the start of the pulse and responsible for updating their source code thereafter?20:07
clarkbonly the playbook that updates system-config iirc. we use different processes as we go along to mostly address that20:07
fungicmurphy: tosky: those reused ip addresses generally happen because nova has lost track of an instance which remained running. usually if we track the specific ip addresses we'll see the same ones crop up over and over for that error20:08
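Tracking which addresses keep recurring, as described above, can be sketched with a simple tally. This is a minimal illustration, not an existing infra tool: `recurring_ips` is a hypothetical helper, and the sample addresses come from the 192.0.2.0/24 documentation range rather than from real job failures.

```python
from collections import Counter


def recurring_ips(failed_ips, threshold=2):
    """Return {ip: count} for addresses seen in at least `threshold`
    host-key-mismatch failures; repeat offenders suggest a leaked instance."""
    counts = Counter(failed_ips)
    return {ip: n for ip, n in counts.items() if n >= threshold}


# Hypothetical sample: one address shows up in two separate failures.
sample = ["192.0.2.10", "192.0.2.11", "192.0.2.10"]
print(recurring_ips(sample))
```

Addresses that clear the threshold are the ones worth handing to the cloud operator, since an instance the cloud has lost track of will keep answering on the same IP.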
corvusok, yeah, i think i just had a run of bad timing then -- possibly combined with the fact that runs were timing out due to k8s20:08
mgagnefound 198.72.124.191 too20:09
mgagneso far, Neutron Queens has been a huge bag of hurts for us... =(20:09
*** markvoelker has quit IRC20:12
fungiand now i'm caught up on scrollback and will dig into my to do list20:12
clarkbmgagne: thats we should keep an eye out for others20:13
clarkbfungi ^ re ssh failures20:13
fungineutron queens, got it20:13
clarkber thanks20:17
*** trown|lunch is now known as trown20:17
*** jamesmcarthur has quit IRC20:19
*** e0ne has joined #openstack-infra20:21
*** jamesmcarthur has joined #openstack-infra20:21
ianwclarkb / cmurphy / fungi : i have looked at ask.o.o in the past, but no current work20:22
ianwi think i probably got to the point of "wow this has bitrotted so far it's like starting new"20:22
fungisounds like a fair assessment20:23
*** wolverineav has joined #openstack-infra20:23
openstackgerritJames E. Blair proposed openstack-infra/zuul-preview master: WIP: test docker registry  https://review.openstack.org/63703720:23
*** jamesmcarthur has quit IRC20:24
*** jamesmcarthur has joined #openstack-infra20:25
cmurphyianw: i think this stack https://review.openstack.org/559178 and https://review.openstack.org/585196 were doing pretty good they just need some love20:29
*** wolverineav has quit IRC20:30
cmurphybut if it's not going to happen any time soon let's turn puppet back on for it so we can puppet4 it?20:31
ianwcmurphy: yeah, sorry i can't remember if it was that not getting reviews that stalled it, or i stopped pushing because I found even more problems20:32
ianwlooking back at my notes, i think i was looking at getting it into a virtualenv https://review.openstack.org/#/c/560696/20:34
*** memoussati has quit IRC20:34
*** dave-mccowan has joined #openstack-infra20:34
ianwgiven the intervening year, i guess docker is now the new virtualenv20:35
*** memoussati has joined #openstack-infra20:37
*** wolverineav has joined #openstack-infra20:42
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Upgrade some servers to puppet 4  https://review.openstack.org/63472620:44
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Upgrade git01.openstack.org to puppet 4  https://review.openstack.org/63472720:44
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Switch gitea to TLS  https://review.openstack.org/63704520:47
corvusinfra-root: \o/ http://38.108.68.64/ is running 1.6.3 -- our automatic upgrade driven by git and ansible worked!20:48
fungitobiash: frickler: if i cut out the list item for the password variable of the openstack_puppetforge_credentials secret and pipe it into `| base64 -d | sudo openssl rsautl -inkey /var/lib/zuul/k20:48
fungieys/secrets/project/gerrit/openstack-infra/project-config/0.pem -decrypt -oaep` on the scheduler, i get back the password corresponding to the "openstack" account we have on record20:48
fungier, sorry, that was for tobias-urdin not tobiash!20:48
corvusclarkb, fungi: can you look at https://review.openstack.org/637045 as an option for gitea ssl termination?20:48
fungi(also pardon the stray newline there)20:48
fungicorvus: checking now20:49
corvusfungi, ianw, clarkb: i found this interesting -- i don't think it's relevant for us, since it's just one service, but gitea has built-in support for letsencrypt.  https://docs.gitea.io/en-us/https-setup/20:50
fungiclarkb: 636681 seems to have run afoul of a post failure in one of the rspec jobs20:50
corvusfungi, clarkb: ^ another inap ssh error20:50
corvussome zuul changes hit it recently too20:50
funginoted20:50
fungicorvus: regarding gitea and letsencrypt, that's interesting... so it has some feature which just directly does acme negotiation (or vendors its own copy of certbot or something anyway)?20:52
corvusfungi: that's what i'm imagining.20:53
openstackgerritMerged openstack-infra/system-config master: Update to gitea 1.7.1  https://review.openstack.org/63456520:54
mgagnecorvus, fungi, clarkb: deleted orphan instance20:54
*** auristor has joined #openstack-infra20:57
*** memoussati has quit IRC21:00
*** memoussati has joined #openstack-infra21:01
*** memoussati has quit IRC21:01
openstackgerritJames E. Blair proposed openstack-infra/zuul-jobs master: Enable logging on registry/push/pull jobs  https://review.openstack.org/63704921:01
clarkbcorvus: you can reuse the opendev.org cert I already sorted out right?21:07
corvusclarkb: yep21:07
*** markvoelker has joined #openstack-infra21:09
*** kjackal has quit IRC21:09
clarkbmgagne: unsure of where you ended up with cleanup but returning from lunch I notice 198.72.124.117 198.72.124.131 198.72.124.72 198.72.124.183 all being sad recently21:12
mgagneclarkb: I just checked the IPs mentioned above, I'm context switching between inap-mtl01 and a forest fire atm.21:13
mgagneI will check those21:14
clarkb198.72.124.59 198.72.124.33 too21:16
clarkbmgagne: do you know if this is something we can clean up on our end?21:16
clarkbor are those rogue instances not going to be exposed to us?21:16
mgagneclarkb: if the instance is unmanaged by Nova, there is little you can do, you need to virsh destroy the instance on the compute node.21:16
clarkbroger21:16
clarkbI wonder if we need to do a whole audit of that /24 or whatever it is21:16
toskyclarkb, mgagne: I have two additional failures: 198.72.124.60 and 198.72.124.7021:17
mgagnewe usually have a tool to find those but so far I'm not having any luck running it. And since inap-mtl01 is a very busy region, there is a race between listing the instances and actually checking on the compute nodes.21:18
*** jamesmcarthur has quit IRC21:19
*** test_weshay has joined #openstack-infra21:20
*** jtomasek has quit IRC21:23
mgagnedone: 198.72.124.117 198.72.124.131 198.72.124.72 198.72.124.183 198.72.124.6021:23
*** eharney has quit IRC21:23
mgagnecouldn't find orphan 198.72.124.70, only one found is legit.21:24
clarkbmgagne: I am finding more :/ 198.72.124.184 198.72.124.151 198.72.124.17121:24
clarkbI wonder if we shouldn't disable that region so that it can be cleaned up without affecting jobs21:25
mgagneI think it's the best solution for now21:25
clarkbI'll get that patch up and we can decide since the impact seems to be pretty widespread21:25
mgagneok, better be slower than faster but failing.21:25
mgagneand it's gonna be easier to cleanup if the region isn't used that much.21:25
openstackgerritClark Boylan proposed openstack-infra/project-config master: Disable inap region due to duplicate IPs  https://review.openstack.org/63705421:26
clarkbinfra-root ^ fyi21:26
*** xek has quit IRC21:27
*** jamesmcarthur has joined #openstack-infra21:27
clarkbalso something like #status notice Jobs are failing due to ssh host key mismatches caused by duplicate IPs in test cloud region. We are disabling the region and will let you know when jobs can be rechecked.21:28
clarkbHow does that look for a notice?21:28
corvusclarkb: lgtm.  should we force-merge 637054?21:28
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Fix gitea repository root  https://review.openstack.org/63705521:29
clarkbcorvus: we probably should21:29
corvusclarkb, fungi: ^ got another small gitea patch -- our reward for getting the playbook working is that it broke the config since i left something out21:30
corvusclarkb: i'll do it21:30
clarkbcorvus: would you like to do that? I'll send the notice21:30
clarkbthanks!21:30
clarkb#status notice Jobs are failing due to ssh host key mismatches caused by duplicate IPs in a test cloud region. We are disabling the region and will let you know when jobs can be rechecked.21:30
openstackstatusclarkb: sending notice21:30
*** iurygregory has quit IRC21:30
openstackgerritMerged openstack-infra/project-config master: Disable inap region due to duplicate IPs  https://review.openstack.org/63705421:30
clarkbI'll go edit the nodepool launchers config by hand too to speed that up21:31
*** jamesmcarthur has quit IRC21:31
-openstackstatus- NOTICE: Jobs are failing due to ssh host key mismatches caused by duplicate IPs in a test cloud region. We are disabling the region and will let you know when jobs can be rechecked.21:31
clarkbthats done21:32
*** jtomasek has joined #openstack-infra21:32
openstackstatusclarkb: finished sending notice21:33
clarkbthank you openstackstatus21:34
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675921:35
*** agopi_ has joined #openstack-infra21:39
openstackgerritMerged openstack-infra/system-config master: Trigger deployment with gitea 1.7.1  https://review.openstack.org/63501721:39
*** whoami-rajat has joined #openstack-infra21:41
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Switch gitea to TLS  https://review.openstack.org/63704521:41
corvusi'm going to squash those 2 changes21:41
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Switch gitea to TLS  https://review.openstack.org/63704521:42
*** agopi has quit IRC21:42
*** markvoelker has quit IRC21:42
corvusclarkb, fungi: can you re-review that?  i squashed the app.ini change into it, and also fixed the gitea-init docker build (which i probably broke when i changed how the docker jobs work)21:42
fungisure21:43
clarkbyup looking21:43
fungilgtm, thanks!21:43
clarkbcorvus: why is the image alias important there if we aren't doing a multi stage build in that dockerfile?21:43
corvusclarkb: because if i change the .zuul.yaml, a bunch more jobs are going to run :(21:44
corvusthat may not have been your question, let me say more21:44
*** test_weshay has quit IRC21:44
clarkbcorvus: more curious how that would have an appreciable change on anything since that alias of gitea-init really only applies to the build there21:44
corvusbecause .zuul.yaml tells docker to build the 'gitea-init' target, there must be a named target in the dockerfile.  so we either need to remove that config from zuul, or add the target name.21:45
clarkbit isn't externally visible?21:45
corvuscorrect, not externally visible, it's just that the build fails with "failed to reach build target gitea-init in Dockerfile"21:45
clarkbgotcha it is due to asking for gitea-init and we pretend that it is there21:45
tobias-urdinfungi: hm, so maybe there is some spacing in the password or something?21:46
corvusbasically, we're running "make foo" and there's no makefile target for that :)  so we either define "foo", or change the command to "make"21:46
tobias-urdinor i'm doing it wrong in the playbook where i use the secret21:46
clarkbcorvus: ya I just always read FROM lines as take this image from over there and we'll modify it21:46
tobias-urdinsince frickler tried this http://paste.openstack.org/show/745081/ manually with the password, i'm not sure whats wrong21:46
clarkbbut then we never refer to that source again so the alias shouldn't matter. But I guess that impacts the output name as well?21:47
tobias-urdinfungi: did you have a minute to check the playbook as well to verify it looks correct where i use the secret?21:47
corvusclarkb: yeah, it's a little weird that it also means "at the end of this, the result will be called 'foo'"21:47
corvustobias-urdin: can you point at the playbook?21:47
clarkbcorvus: yup reading multistage docs again and the as names the end result not the source21:48
clarkbso that does do as you intend. Just slightly confusing as a language21:48
corvusclarkb: yeah, that's not what those words mean in english.21:48
corvusclarkb: the "AS" is ironic.21:48
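Editorial sketch of the point discussed above: a build that asks for a named target only works if some FROM line declares that name with AS. The Python below is a minimal illustration, not Docker's own parser. The helper names (stage_names, has_target) and the base image name are hypothetical, and it only handles the simple `FROM [--flag] <image> [AS <name>]` form.

```python
from typing import List


def stage_names(dockerfile: str) -> List[str]:
    """Collect the stage names declared with 'FROM <image> AS <name>'."""
    names = []
    for line in dockerfile.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].upper() != "FROM":
            continue
        # Ignore option flags such as --platform=... that may precede the image.
        args = [t for t in tokens[1:] if not t.startswith("--")]
        if len(args) == 3 and args[1].upper() == "AS":
            names.append(args[2])
    return names


def has_target(dockerfile: str, target: str) -> bool:
    """True if a build asking for 'target' would find a stage with that name."""
    return target in stage_names(dockerfile)


# Hypothetical Dockerfile contents; the base image name is made up.
without_alias = "FROM example/base-image\nRUN true\n"
with_alias = "FROM example/base-image AS gitea-init\nRUN true\n"

for text in (without_alias, with_alias):
    if has_target(text, "gitea-init"):
        print("target gitea-init found")
    else:
        # Mirrors the failure quoted in the discussion.
        print("failed to reach build target gitea-init in Dockerfile")
```

The first Dockerfile has no named stage, so the requested target cannot be found; adding the AS clause is what gives the build result its name.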
tobias-urdinsure, secret: https://github.com/openstack-infra/project-config/blob/master/zuul.d/secrets.yaml#L68321:48
tobias-urdinjob: https://github.com/openstack-infra/project-config/blob/master/zuul.d/jobs.yaml#L9121:48
*** e0ne has quit IRC21:48
*** dave-mccowan has quit IRC21:49
tobias-urdinplaybook https://github.com/openstack-infra/project-config/blob/master/playbooks/publish/puppetforge.yaml21:49
fungitobias-urdin: sure, looks like the secret is passed in via the release-openstack-puppet job with the name "puppetforge" and has a "user" key (which matches the username we have on record) and a "password" key (which decrypts to the corresponding password we have on record). i assume the way forge_username and forge_password are set to strings via substitution of "{{ puppetforge.user }}" and "{{21:52
fungipuppetforge.password }}" works but i'll check some similar working examples21:52
corvustobias-urdin: i don't see anything wrong with that :/21:53
*** jtomasek has quit IRC21:54
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675921:57
*** ekultails has quit IRC21:57
tobias-urdinyeah been staring at it as well, it must be something about that though21:58
fungihttps://git.zuul-ci.org/cgit/zuul-jobs/tree/roles/upload-forge/tasks/main.yaml seems to in turn use similar substitution to set the forge_username and forge_password contents as username and password variables to the forge_upload library module21:58
tobias-urdinsince we've tried using http://paste.openstack.org/show/745081/ with that account21:58
*** jamesmcarthur has joined #openstack-infra22:00
pabelangerdid you pass --strip to encrypt_secret?22:00
fungiand the library module does use username and password keys from the module.params dict and pass them into _forge_auth() in the anticipated order22:01
clarkbencrypt_secret strips input on the cli automatically22:01
pabelangerI thought you needed to use --strip22:02
clarkbpabelanger: but ya that might be a good thing to check, that the secret comes out with expected whitespace22:02
fungiwhen i decrypt the secret it does end in a newline22:02
clarkbpabelanger: you might if doing it with an input file, but input on command line it's fine22:02
*** trown is now known as trown|outtypewww22:02
pabelangerAh, that might be it22:02
pabelangerI was encrypting some clouds.yaml files for ansible-network, and pretty sure had whitespace issue until I used --strip22:02
fungii have no idea if any of the layers of string substitution in ansible/jinja2/yaml strips it22:02
clarkbcorvus: bridge has downgraded kubernetes lib again :/22:03
fungibut yeah, possible we're passing the trailing \n in as part of the password22:03
corvusfungi: they don't.  if you say it has a newline, it will send it.22:03
corvusclarkb: yep.  i've been re-upgrading it all day22:03
fungilikely a mistake on my part encrypting it in that case22:03
corvusi would really like that change to land.22:03
fungii'll upload a replacement secret in moments22:03
fungicorvus: which was the change to stop it upgrading?22:04
pabelangerI think you could use | trim(), but would need to call filter in playbook22:04
fungier, downgrading i suppose22:04
corvusno, just encrypt the right secret :)22:04
pabelanger+122:04
clarkbfungi: if you use the cli input it will strip for you22:04
fungiindeed22:04
fungii probably put it in a file22:04
clarkbyou do have to ^C^D or ^D^D though22:04
clarkbto close the input22:04
*** wolverineav has quit IRC22:05
corvusfungi: https://review.openstack.org/637021 is the change22:05
*** tosky has quit IRC22:05
clarkbI think its safe to recheck now22:05
corvusit keeps failing due to the inap stuff22:05
*** tosky has joined #openstack-infra22:05
clarkbI should send that notice, but still trying to be extra sure22:06
clarkboh it could be old inap servers we haven't rotated out22:06
* clarkb does a listing22:06
clarkball inap servers are in use or deleting and max-servers is set to 022:07
clarkbso I think it is "safe" now22:07
fungisounds safe to clear then22:07
corvusclarkb, fungi, pabelanger: there is no difference between cli and stdin.  in both cases, encrypt_secret will encrypt *exactly* what you give it.  so if you don't want a newline, don't give it one.  ^D^D after the value will cause it to terminate reading input without a newline.  again, in either case, you can add --strip to the command line and it will strip leading/trailing whitespace.22:07
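Editorial sketch of the behaviour corvus describes: the tool encrypts exactly the bytes it is given, so a trailing newline from `echo` becomes part of the secret unless it is stripped. The Python below only models that input handling. `prepare_secret` is a hypothetical helper, not code from encrypt_secret, and the sample value is made up.

```python
def prepare_secret(raw: bytes, strip: bool = False) -> bytes:
    """Return the bytes that would be encrypted (hypothetical helper).

    With strip=False the input is passed through untouched; with strip=True
    leading and trailing whitespace is removed, as the --strip option is
    described as doing in the discussion.
    """
    return raw.strip() if strip else raw


echo_output = b"s3cret\n"    # what `echo s3cret` writes: value plus newline
echo_n_output = b"s3cret"    # what `echo -n s3cret` writes: value only

# Plain echo: the newline survives and becomes part of the password.
print(prepare_secret(echo_output))

# Either fix yields the bare value.
print(prepare_secret(echo_output, strip=True))
print(prepare_secret(echo_n_output))
```

Both remedies mentioned in the channel end up with the same bytes; the difference is only whether the newline is avoided at the source or removed by the tool.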
clarkbhow about #status notice The test cloud region using duplicate IPs has been removed from nodepool. Jobs can be rechecked now.22:08
*** jamesmcarthur has quit IRC22:08
openstackgerritMerged openstack-infra/zuul-jobs master: Enable logging on registry/push/pull jobs  https://review.openstack.org/63704922:08
corvusclarkb: wfm22:08
clarkbcorvus: oh hrm I thought it had stripped the one secret I had to replace with opendev. Maybe the ^C^D thing did what I wanted and I assumed another thing22:09
*** auristor has quit IRC22:09
*** rlandy|afk is now known as rlandy22:09
fungiclarkb: corvus: hrm, yeah when i piped it through stdin i did indeed also get a trailing newline in there22:11
openstackgerritMerged openstack-infra/system-config master: Prep for pbx upgrade to xenial  https://review.openstack.org/63668122:12
clarkb#status notice The test cloud region using duplicate IPs has been removed from nodepool. Jobs can be rechecked now.22:12
openstackstatusclarkb: sending notice22:12
corvusfungi: piped?  from what?22:13
fungicorvus: echo22:13
corvusfungi: echo -n or echo?22:13
fungijust echo with no -n, so yeah unsurprising i suppose22:13
corvusi would expect echo -n to work22:13
-openstackstatus- NOTICE: The test cloud region using duplicate IPs has been removed from nodepool. Jobs can be rechecked now.22:14
*** jamesmcarthur has joined #openstack-infra22:14
fungii'll try that too, but expect --strip will do the trick22:14
*** ijw has joined #openstack-infra22:15
fungiyep, echo -n did what we want too22:15
fungiwithout --strip in the tool22:15
openstackstatusclarkb: finished sending notice22:15
openstackgerritJeremy Stanley proposed openstack-infra/project-config master: Update Puppetforge secret  https://review.openstack.org/63706722:17
fungitobias-urdin: clarkb: pabelanger: corvus: ^ there we go22:17
tobias-urdin\o/22:18
*** jamesmcarthur has quit IRC22:18
*** auristor has joined #openstack-infra22:18
*** jamesmcarthur has joined #openstack-infra22:18
fungionce it merges (hopefully in just a few minutes) i can retrigger the reenqueue that tag yet agani22:21
fungiagain too22:21
*** yboaron_ has quit IRC22:21
*** tjgresha has joined #openstack-infra22:21
*** dave-mccowan has joined #openstack-infra22:22
*** ijw has quit IRC22:24
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675922:27
clarkboh wow storyboard upgrade is tomorrow. fungi anything need to be done before then?22:28
fungiclarkb: not that i know of. as a reminder, the plan is at https://etherpad.openstack.org/p/gCj4NfcnbW22:29
fungiit's straightforward and should only be a few minutes downtime22:29
fungithe db dump/load ought to take less time than dns propagation, based on my evaluations with the (larger) storyboard-dev db on a smaller machine22:30
clarkbthats a fun size difference due to how we use them differently. Sounds good22:30
*** yamamoto has joined #openstack-infra22:30
*** wolverineav has joined #openstack-infra22:30
*** armax has quit IRC22:30
fungiyeah, the massive projects diablo_rojo has done import testing for is what leads sb-dev to be so much larger of a db22:30
*** wolverineav has quit IRC22:35
fungii've added the new hiera keys for https://review.openstack.org/636952 on bridge.o.o if another infra-root wants to review that. the osf is relocating the foundation member databases to vexxhost and adding ssl/tls client cert auth for mysql22:36
fungithat change is to basically do a dry run for openstackid-dev22:36
clarkbfungi: approved22:37
fungithat way they can be confident it's working as intended before they do the same for the production db a week from tuesday22:38
clarkb__22:38
fungithanks clarkb!22:38
clarkber22:38
clarkb++22:38
*** sreejithp_ has quit IRC22:39
*** markvoelker has joined #openstack-infra22:39
openstackgerritMerged openstack-infra/project-config master: Update Puppetforge secret  https://review.openstack.org/63706722:42
openstackgerritMerged openstack-infra/zuul master: Cache github PR shas  https://review.openstack.org/63676422:42
clarkbcorvus: things are moving again ^22:43
*** agopi_ has quit IRC22:43
clarkbnow do we want to try and restart now since things are backlogged/slow anyway or let them clear out22:43
openstackgerritMerged openstack-infra/system-config master: Switch gitea to TLS  https://review.openstack.org/63704522:45
fungihappy to help with a restart in the next 10 minutes, then i'm on a conference call for a while22:48
clarkblet me check when puppet will do the install for us22:48
*** wolverineav has joined #openstack-infra22:49
clarkbI think we are about 25 minutes away from a puppet install of zuul on zuul0122:49
fungiahh, happy to half-watch a restart in progress while on my conference call in that case22:49
clarkbI can probably do a restart after that as long as we don't think it will eat into backlog too much (which is my next thing to check)22:49
clarkbwe are headed in the right direction now with backlog so a restart probably isn't too terrible22:50
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675922:50
*** jamesmcarthur has quit IRC22:51
*** wolverineav has quit IRC22:53
fungitobias-urdin: i've reenqueued the puppet-aodh 14.2.0 tag now22:55
fungithat should complete before puppet updates zuul01.o.o anyway22:56
fungiso ought not get in the way of the restart22:56
*** wolverineav has joined #openstack-infra22:56
openstackgerritMerged openstack-infra/system-config master: Temporarily stop installing openshift  https://review.openstack.org/63702123:00
openstackgerritJames E. Blair proposed openstack-infra/zuul-jobs master: Fix pull-from-intermediate-registry artifacts error  https://review.openstack.org/63707223:00
corvusyay, maybe my sisyphean task will end soon!23:01
*** jamesmcarthur has joined #openstack-infra23:01
fungithere can't be *that* much of a boulder left now23:01
*** eernst has joined #openstack-infra23:01
fungiit's been worn down by all the rolling23:01
*** tkajinam has joined #openstack-infra23:02
openstackgerritJames E. Blair proposed openstack-infra/zuul-jobs master: Fix undefined attrs in registry push/pull roles  https://review.openstack.org/63707223:03
tobias-urdinfungi: yay successful upload23:03
fungimic drop!23:04
fungiexcellent work23:04
corvusyay!23:04
fungisorry my stray newline gummed up the last of it23:04
openstackgerritMerged openstack-infra/system-config master: Update puppet config for openstackid-dev node  https://review.openstack.org/63695223:05
tobias-urdinhere comes the annoying part though, i'm sorry to drop you all this crap but there is a lot of failed releases23:05
*** jamesmcarthur has quit IRC23:05
tobias-urdinthe last release was a bump of pretty much all modules23:05
*** jamesmcarthur has joined #openstack-infra23:06
tobias-urdinhere is the commit, it was the stein-2 milestone https://github.com/openstack/releases/commit/d815ac8ed2b4043aaa2cb3be2a4d2c3b398fc4df23:06
*** markvoelker has quit IRC23:12
openstackgerritJan Kundrát proposed openstack-infra/nodepool master: Implement a Runc driver  https://review.openstack.org/53555623:12
clarkbthere should be a rule we have with the release team where we release a single thing whenever there are job changes :)23:15
clarkbtobias-urdin: so all of those releases failed and need to be reenqueued except for the one that fungi just enqueued?23:15
tobias-urdinyeah, all of them except for the first one (puppet-aodh) that he reenqueued23:16
clarkbtobias-urdin: ok we need to gather some data (probably in an etherpad?) we need the project, tag, and tag sha1 (not the sha1 the tag points to but the sha1 of the tag itself)23:16
clarkbcorvus: fungi zuul==3.5.1.dev68  # git sha 275cbc9 is installed on zuul01 now which looks correct to me for the PR caching change23:18
clarkbdo we want to go ahead and do a restart now?23:18
clarkbI can do it just have to remember the steps23:18
clarkbfungi: you did have to restart the web process too ?23:18
clarkbthe way github renders that diff you can't just copy paste the text because you lose the +'s23:20
* clarkb does a git show locally23:20
openstackgerritIan Wienand proposed openstack-infra/system-config master: [dnm] testing...  https://review.openstack.org/63675923:22
tobias-urdinok, i'll compile a list23:23
openstackgerritMerged openstack-infra/system-config master: Update DNS documentation  https://review.openstack.org/63356923:23
*** wolverineav has quit IRC23:23
fungiclarkb: yeah, i ended up restarting zuul-web to get status content23:24
*** wolverineav has joined #openstack-infra23:26
tobias-urdinso sha1 of tag would be $(git show-ref -s 14.2.0) for example23:26
clarkbtobias-urdin: ya that looks right23:27
*** rh-jelabarre has quit IRC23:29
clarkbok I think I'm in a good spot to restart zuul scheduler23:30
clarkbprocess would be store queues, stop scheduler, start scheduler, wait for config to be loaded, reload queues. And restart zuul-web at some point23:31
clarkbfungi: ^ is that how you did it? corvus ^ any reason to not do that nowish?23:31
fungiwfm23:32
openstackgerritHunter Werkman proposed openstack-infra/project-config master: Add puppet-gitrepo project to puppet-OpenStack  https://review.openstack.org/63707623:32
fungilet #openstack-release know first?23:32
corvusclarkb: sgtm23:32
*** eernst has quit IRC23:33
clarkbok saving queues now23:33
fungiconfig-core reviewers: 637076 there is some university student interns, be gentle! ;)23:33
*** yamamoto has quit IRC23:33
clarkbI have asked the zuul-scheduler to stop23:34
corvusclarkb: re ^ maybe that should be non-infra?23:34
corvussorry, was re 07623:34
clarkband it is starting now23:34
clarkbcorvus: ya we can always consume a generic module if we like, doesn't have to be infra (and we are moving away from puppet)23:35
corvusyes -- i mostly suggest that because i don't expect us to use it23:35
clarkbindeed23:36
clarkbre zuul-web, I stopped and started it but the start doesn't seem to have worked. I'm going to assume it will fix itself if I stop start again once scheduler has configs loaded23:36
clarkbbut if that doesn't happen then we may need eyeballs23:37
corvusclarkb: remove the pidfile23:37
clarkbcorvus: will do23:37
clarkbthat was it23:38
clarkbenqueuing changes now23:39
tobias-urdincorvus: raw data http://paste.openstack.org/show/745127/23:44
tobias-urdinhere is commands http://paste.openstack.org/show/745128/23:44
tobias-urdin:)23:44
*** rlandy is now known as rlandy|bbl23:44
clarkbI assume that was for me23:45
clarkbfungi: do you have your puppet-aodh enqueue command nearby? I can apply it to the above pastes and we can run a mass enqueue assuming we are happy with the results of the aodh enqueue23:46
fungiclarkb: just a sec, sure23:46
fungisudo zuul enqueue-ref --tenant=openstack --trigger=gerrit --pipeline=release --project=openstack/puppet-aodh --ref=refs/tags/14.2.0 --newrev=617ffad84b633618490ca1023f8a31d9694b31a923:47
clarkbI think I found it23:47
clarkbya thanks23:47
fungiyou need to show-ref the tag to get the newrev unless you have the failure links handy23:47
clarkbfungi: ya tobias-urdin got all that data for me in the paste above23:48
clarkbI'll do puppet-barbican and double check things then do a larger chunk23:49
fungithe script in https://review.openstack.org/613676 can just be fed the url of a log for a previous run23:50
tobias-urdinclarkb: oh yeah23:50
fungiand will output the reenqueue command23:50
clarkbits pretty easy with visual mode vim and the columnar data too23:50
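Editorial sketch of the mass re-enqueue idea: given columnar data of project, tag, and tag sha1, each row maps onto the `zuul enqueue-ref` command fungi pasted above. The Python below is one possible way to generate those commands. The function name and the whitespace-separated three-column input format are assumptions (the paste contents are not shown in the log), and the only sample row is the puppet-aodh one quoted in the channel.

```python
from typing import List


def enqueue_commands(rows: str, tenant: str = "openstack",
                     pipeline: str = "release") -> List[str]:
    """Build one 'zuul enqueue-ref' command per 'project tag sha' row."""
    commands = []
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed rows rather than guess
        project, tag, tag_sha = fields
        commands.append(
            "sudo zuul enqueue-ref"
            f" --tenant={tenant} --trigger=gerrit --pipeline={pipeline}"
            f" --project={project} --ref=refs/tags/{tag} --newrev={tag_sha}"
        )
    return commands


# Assumed input layout: project, tag, sha1 of the tag object itself.
sample_rows = """
openstack/puppet-aodh 14.2.0 617ffad84b633618490ca1023f8a31d9694b31a9
"""

for command in enqueue_commands(sample_rows):
    print(command)
```

Printing the commands instead of running them leaves room to check one enqueue by hand, as was done with puppet-barbican, before doing the rest.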
*** tosky has quit IRC23:51
*** agopi_ has joined #openstack-infra23:51
tobias-urdinfungi: clean utility right there23:52
*** mattw4 has quit IRC23:53
tobias-urdini'll see if i can work with the release team to propose some pre-release check that queries puppetforge api to prevent failed release jobs if somebody duplicates versions23:54
tobias-urdinor maybe it fails on that already, then it's probably fine23:54
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Fix gitea k8s files  https://review.openstack.org/63708023:58
corvusclarkb, fungi: ^ i will be very happy when we figure out testing of k8s stuff ^23:59
clarkbtobias-urdin: zuul says puppet-barbican job ran successfully, want to double check things in puppetforge and if it looks good I'll enqueue the others23:59
tobias-urdinsure, give me a sec23:59
