Tuesday, 2019-07-23

*** mattw4 has quit IRC00:00
clarkbthat one is 5d31e134-3abf-4802-b7d5-df4824b79e9500:00
clarkbalso available00:00
fungiwacky00:00
*** eernst has quit IRC00:00
clarkbis it racing maybe?00:01
clarkbthe volume isn't ready for nova when nova tries to use it?00:01
fungi~clarkb/.bash_history doesn't seem to have any bfv examples in it00:01
*** eernst has joined #openstack-infra00:02
clarkbfungi: `sudo ./launch-node.py $FQDN --flavor "$FLAVOR" --cloud=$OS_CLOUD --region=$OS_REGION_NAME --image $IMAGE --boot-from-volume --volume-size 80 --config-drive --network public`00:02
fungioh, wait, theres one00:02
fungi--config-drive --network public00:02
fungidiffers from what i used00:02
fungido we need to specify --config-drive?00:03
clarkboh ya we need config drive (I wish that was just default standard nova behavior)00:03
clarkbbecause the minimal images use glean00:03
clarkbhowever that shouldn't affect the volume thing unless it is a race and that slows down nova enough00:03
clarkbfungi: you should delete the leaked volumes too or should I go ahead and do that?00:03
fungitrying but yeah i have a feeling it's not getting that far00:03
fungii can clean them up in a sec00:04
fungihow were you identifying them since they don't mention the server?00:04
clarkbk. The two 40GB volumes that don't appear attached are for gitea01 and gitea06 I think00:04
clarkbfungi: timestamp and size00:04
clarkbfungi: and the first one was in your error message00:04
fungi40gb? but i specified 8000:04
clarkbfungi: there are two leaked 40GB volumes (old 06 and 01) and two leaked 80GB volumes (your recent failures)00:04
clarkbthey are not attached to anything00:05
fungioh neat it's getting farther this time00:05
clarkband then using image name, timestamps and the error above you can kinda infer stuff00:05
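The inference clarkb describes (unattached, matching size, created around the time of a failed launch) can be sketched in Python. This is only an illustration: the volume IDs, the 80GB timestamps and the time windows are hypothetical placeholders rather than real API output, and `leaked_candidates` is a made-up helper name.

```python
from datetime import datetime

# Hypothetical volume records; IDs and the 80GB timestamps are placeholders.
volumes = [
    {"id": "vol-a", "size": 40, "status": "available", "attachments": [],
     "created_at": "2019-02-28T13:41:45"},
    {"id": "vol-b", "size": 40, "status": "available", "attachments": [],
     "created_at": "2019-02-28T15:58:19"},
    {"id": "vol-c", "size": 80, "status": "available", "attachments": [],
     "created_at": "2019-07-22T23:40:00"},
    {"id": "vol-d", "size": 80, "status": "available", "attachments": [],
     "created_at": "2019-07-22T23:55:00"},
    {"id": "vol-e", "size": 80, "status": "in-use",
     "attachments": [{"server_id": "srv-1"}],
     "created_at": "2019-07-23T00:05:00"},
]


def leaked_candidates(volumes, size_gb, after, before):
    """Return IDs of unattached volumes of a given size created in a window."""
    start = datetime.fromisoformat(after)
    end = datetime.fromisoformat(before)
    return [
        v["id"]
        for v in volumes
        if v["status"] == "available"
        and not v["attachments"]
        and v["size"] == size_gb
        and start <= datetime.fromisoformat(v["created_at"]) <= end
    ]


# Window assumed to cover the recent failed launch attempts.
recent_failures = leaked_candidates(
    volumes, 80, "2019-07-22T23:00:00", "2019-07-23T00:10:00"
)
print(recent_failures)
```

The attached 80GB volume is skipped because it is in use; only unattached volumes that match both size and time window are reported as candidates, and they still deserve a manual check before deletion.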
funginot sure if this is random race-winning or the options really made a difference00:05
*** jamesmcarthur has joined #openstack-infra00:05
clarkbI think its gonna be race winning00:05
clarkbbecause all you changed was use public network instead of public network uuid and add a config drive00:06
clarkbthe add a config drive step is likely making nova take long enough that cinder is ready with a volume in time00:06
openstackgerritMerged opendev/system-config master: Increate gerrit user connection limit by 50%  https://review.opendev.org/67218800:06
*** eernst has quit IRC00:06
fungioh, yep that makes some sense re: nova lag00:06
*** gyee has quit IRC00:07
*** eernst has joined #openstack-infra00:08
clarkbwe should consider setting config drive to true in launch node by default00:08
clarkbI always forget that one00:08
*** tkajinam has quit IRC00:10
ianwcorvus: just looping back to 669780 which sets up debug logging for nodepool; using "-d" is going to foreground the process.  i'm not sure on running " &" but it seems we should rely on nodepool's own daemonizing imo00:11
ianwi wonder if it used to run in a systemd service using run_process or if the old way just called it too00:12
clarkbfungi: there is a message to the infra list about opendev.org closing a connection unexpectedly. You deleted gitea01 right?00:12
clarkbI'm now remembering that haproxy doesn't auto reload its config because docker00:13
*** eernst has quit IRC00:13
clarkbit will notice the server is down after it goes away but any connections up at the time would be toast I think00:13
clarkbwould probably explain that00:13
clarkb(did we have to delete the old one due to quota?)00:13
ianwto answer above : yes it used to run via devstack which did it in a service : run_process nodepool-builder "$NODEPOOL_INSTALL/bin/nodepool-builder -c $NODEPOOL_CONFIG -l $NODEPOOL_LOGGING -d"00:14
*** eernst has joined #openstack-infra00:14
*** jamesmcarthur has quit IRC00:15
fungiclarkb: --config-drive should only be an issue on static.o.o where every last block device assignment counts (and the configdrive uses one we could otherwise attach a cinder volume at)00:16
fungiso in general i agree00:16
fungimaybe we can invert the option at least00:17
clarkbfungi: ya00:17
clarkbwell that and live migration potentially not working with config drive depending on the age of the cloud00:17
clarkbbut maybe we want that00:17
clarkbI've responded to the opendev connection question with my theory and asked for more info if it happens again00:17
fungiand yes, if haproxy says gitea01 is still in the table but disabled due to failures, that was probably me circa 23:25z00:17
clarkbI think we either want to teach ansible and docker how to gracefully restart an haproxy or stick to manually removing those backends, waiting for connections to drop, then making these changes in the future00:18
clarkb(it would've been fine if my change had been applied)00:18
fungiokay, new gitea01 is 84a24c36-b8fa-4e5f-8e1a-6d80767e52fc/38.108.68.172/2604:e100:3:0:f816:3eff:fe16:27400:19
fungitime to clean up stray cinder volumes00:19
*** eernst has quit IRC00:19
clarkbI've been asked to start dinner so going to go do that now00:20
clarkbremember to exclude the server in the remote_puppet_git.yaml playbook so that we don't configure projects on it automatically (instead want that to happen via db backup)00:20
fungii may be handing the rest of gitea01 off to others or picking it up in the morning. getting weird looks from guests wondering why i'm working past 8pm00:20
*** eernst has joined #openstack-infra00:21
fungithere are currently 4 volumes showing as "available" for that tenant in sjc100:21
fungithey should all be safe to delete, in theory00:22
fungithe two showing as 80gb were created at times corresponding with my prior failed launch attempts00:23
fungithe rest are the two 40gb volumes you mentioned00:23
clarkbthe two 80gb are definitely safe00:23
clarkbthe 40gb ones maybe we want to double check more?00:23
fungiwill do00:23
fungiokay, the two leaked 80gb volumes have been deleted now00:24
*** eernst has quit IRC00:25
fungii see a bit of a timeline with the two available 40gb volumes00:26
fungib9eb27fe-3dfe-49a8-85f4-c142b60aa06b was created 2019-02-28T13:41:45.000000 and updated 2019-02-28T13:41:50.000000 based on an image named "ubuntu-bionic-minimal"00:27
*** eernst has joined #openstack-infra00:27
fungi7cd5e56d-8d16-4a5e-b0d6-a4b982c61e80 was created 2019-02-28T15:58:19.000000 and updated 2019-06-27T22:06:47.000000 based on the same image00:29
fungi(comparing image uuid, not just name)00:29
fungihuh, that's weird that the second image was created the day after it was updated00:30
fungianyway, https://wiki.openstack.org/wiki/Infrastructure_Status says "2019-06-27 22:12:16 UTC Gitea06 had a corrupted root disk around the time of the Denver summit. It has been replaced with a new server and added back to the haproxy config."00:31
fungii'm willing to bet you tried to launch it more than once and leaked an extra image00:31
*** eernst has quit IRC00:32
*** eernst has joined #openstack-infra00:39
*** betherly has joined #openstack-infra00:42
*** eernst has quit IRC00:43
*** eernst has joined #openstack-infra00:45
*** betherly has quit IRC00:46
*** eernst has quit IRC00:50
*** eernst has joined #openstack-infra00:51
*** hongbin has joined #openstack-infra00:55
*** igordc has quit IRC00:55
*** eernst has quit IRC00:57
*** eharney has quit IRC00:57
*** eernst has joined #openstack-infra00:58
fungii was able to delete one of the two ubuntu-bionic-minimal images (d090dd9a-cd77-46d1-afa6-97a99f54dea8) since no volumes were using it01:00
fungiif i delete those two "available" 40gb volumes i should be able to delete the other ubuntu-bionic-minimal image (d0edcf7a-0779-476e-8285-bcab9043b616) which their metadata says they're based on01:01
*** eernst has quit IRC01:02
*** eernst has joined #openstack-infra01:04
*** eernst has quit IRC01:09
*** eernst has joined #openstack-infra01:11
*** eernst has quit IRC01:15
*** imacdonn has quit IRC01:18
*** imacdonn has joined #openstack-infra01:18
*** ricolin has joined #openstack-infra01:19
*** jamesmcarthur has joined #openstack-infra01:24
*** eernst has joined #openstack-infra01:25
*** eernst has quit IRC01:30
*** eernst has joined #openstack-infra01:32
*** jamesmcarthur_ has joined #openstack-infra01:32
*** betherly has joined #openstack-infra01:34
*** jamesmcarthur has quit IRC01:34
*** yamamoto has quit IRC01:34
*** eernst has quit IRC01:36
*** betherly has quit IRC01:38
*** eernst has joined #openstack-infra01:41
*** whoami-rajat has joined #openstack-infra01:43
*** rchurch has joined #openstack-infra01:43
*** eernst has quit IRC01:46
*** _erlon_ has quit IRC01:46
*** eernst has joined #openstack-infra01:48
*** jcoufal has joined #openstack-infra01:52
*** eernst has quit IRC01:52
*** rchurch has quit IRC01:53
*** eernst has joined #openstack-infra01:54
*** jcoufal has quit IRC01:56
*** yamamoto has joined #openstack-infra01:58
*** eernst has quit IRC01:58
*** eernst has joined #openstack-infra02:00
*** rchurch has joined #openstack-infra02:01
*** eernst has quit IRC02:05
*** eernst has joined #openstack-infra02:07
*** eernst has quit IRC02:11
*** eernst has joined #openstack-infra02:13
*** betherly has joined #openstack-infra02:15
*** yamamoto has quit IRC02:17
*** eernst has quit IRC02:17
*** eernst has joined #openstack-infra02:19
*** betherly has quit IRC02:20
*** yamamoto has joined #openstack-infra02:21
*** eernst has quit IRC02:25
*** eernst has joined #openstack-infra02:26
*** eernst has quit IRC02:31
*** eernst has joined #openstack-infra02:32
*** jamesmcarthur_ has quit IRC02:35
*** eernst has quit IRC02:37
*** bhavikdbavishi has joined #openstack-infra02:38
*** bhavikdbavishi1 has joined #openstack-infra02:40
*** bhavikdbavishi has quit IRC02:42
*** bhavikdbavishi1 is now known as bhavikdbavishi02:42
*** eernst has joined #openstack-infra02:43
*** eernst has quit IRC02:48
*** eernst has joined #openstack-infra02:50
*** rchurch has left #openstack-infra02:51
*** michael-beaver has quit IRC02:54
*** eernst has quit IRC02:54
*** eernst has joined #openstack-infra02:56
*** eernst has quit IRC03:01
*** ykarel has joined #openstack-infra03:23
*** psachin has joined #openstack-infra03:28
*** betherly has joined #openstack-infra03:29
*** betherly has quit IRC03:33
openstackgerritIan Wienand proposed zuul/nodepool master: [wip] dib_cmd  https://review.opendev.org/67219603:33
openstackgerritIan Wienand proposed zuul/nodepool master: [wip] dib_cmd  https://review.opendev.org/67219603:42
*** hongbin has quit IRC03:45
*** betherly has joined #openstack-infra03:49
*** hongbin has joined #openstack-infra03:50
*** hongbin has quit IRC03:50
ianwi feel like the job status for ^ is not showing up in http://zuul.openstack.org/status03:50
*** betherly has quit IRC03:54
clarkbianw zuul and nodepool are in their own tenant now03:58
clarkbso you have to go to zuul.opendev.org and follow zuul tenant links03:58
ianwahhh, got it thanks :)03:59
ianwyeah, i was a bit confused as it still appears under projects @ http://zuul.openstack.org/projects so wasn't sure if it was moved04:00
*** udesale has joined #openstack-infra04:01
clarkbI think we have it there to load the integration job config for glean and dib04:02
*** kjackal has joined #openstack-infra04:05
*** rcernin has quit IRC04:13
*** rcernin has joined #openstack-infra04:14
*** rcernin has quit IRC04:20
*** betherly has joined #openstack-infra04:31
openstackgerritIan Wienand proposed zuul/nodepool master: [wip] dib_cmd  https://review.opendev.org/67219604:34
*** betherly has quit IRC04:36
*** gfidente has quit IRC04:43
*** pcaruana has joined #openstack-infra04:43
*** raukadah is now known as chandankumar04:44
*** betherly has joined #openstack-infra04:47
*** betherly has quit IRC04:51
*** ramishra has quit IRC04:54
*** ykarel is now known as ykarel|afk04:55
*** AJaeger is now known as AJaeger_05:02
*** ykarel|afk has quit IRC05:02
*** ykarel|afk has joined #openstack-infra05:18
*** ykarel|afk is now known as ykarel05:18
*** bhavikdbavishi has quit IRC05:19
*** bhavikdbavishi has joined #openstack-infra05:25
*** ramishra has joined #openstack-infra05:29
AJaeger_I see "Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='opendev.org', port=443): Max retries exceeded with url: /openstack/requirements/raw/branch/master/upper-constraints.txt (Caused by ProtocolError('Connection aborted.', OSError(0, 'Error')))"05:30
AJaeger_example http://logs.openstack.org/31/672131/1/check/openstacksdk-functional-devstack-senlin/ebf3a80/job-output.txt.gz#_2019-07-23_04_38_46_14441505:30
*** Lucas_Gray has joined #openstack-infra05:31
AJaeger_That just happened 1 hour ago - is that still to be expected?05:31
*** armax has quit IRC05:34
openstackgerritIan Wienand proposed zuul/nodepool master: [wip] dib_cmd  https://review.opendev.org/67219605:35
*** notmyname has quit IRC05:41
*** notmyname has joined #openstack-infra05:42
*** dpawlik has joined #openstack-infra05:44
*** eernst has joined #openstack-infra05:46
openstackgerritIan Wienand proposed zuul/nodepool master: [wip] dib_cmd  https://review.opendev.org/67219605:46
*** eernst has quit IRC05:51
*** jamesmcarthur has joined #openstack-infra05:55
*** jamesmcarthur has quit IRC05:56
*** kjackal has quit IRC06:01
*** kjackal has joined #openstack-infra06:05
*** igordc has joined #openstack-infra06:08
*** Lucas_Gray has quit IRC06:09
*** jamesmcarthur has joined #openstack-infra06:09
*** Lucas_Gray has joined #openstack-infra06:10
*** jhesketh has quit IRC06:11
*** jhesketh has joined #openstack-infra06:11
*** jamesmcarthur has quit IRC06:13
*** igordc has quit IRC06:15
*** jamesmcarthur has joined #openstack-infra06:16
*** apetrich has quit IRC06:20
*** diablo_rojo has joined #openstack-infra06:27
*** jamesmcarthur_ has joined #openstack-infra06:28
*** jamesmca_ has joined #openstack-infra06:31
*** diablo_rojo has quit IRC06:32
*** jamesmcarthur has quit IRC06:32
*** jamesmcarthur_ has quit IRC06:35
*** jamesmca_ has quit IRC06:38
*** Lucas_Gray has quit IRC06:39
*** e0ne has joined #openstack-infra06:40
*** e0ne has quit IRC06:41
*** Lucas_Gray has joined #openstack-infra06:43
*** jamesmcarthur has joined #openstack-infra06:43
*** odicha has joined #openstack-infra06:44
*** piotrowskim has joined #openstack-infra06:47
*** dchen has quit IRC06:47
*** pgaxatte has joined #openstack-infra06:50
*** jamesmcarthur has quit IRC06:51
*** jamesmcarthur has joined #openstack-infra06:53
*** gfidente has joined #openstack-infra06:59
*** jtomasek has joined #openstack-infra07:04
*** yamamoto has quit IRC07:05
*** bhavikdbavishi has quit IRC07:05
*** yamamoto has joined #openstack-infra07:06
*** Goneri has joined #openstack-infra07:07
*** slaweq has joined #openstack-infra07:07
*** tesseract has joined #openstack-infra07:09
*** rpittau|afk is now known as rpittau07:09
*** iurygregory has joined #openstack-infra07:14
*** lucasagomes has joined #openstack-infra07:18
openstackgerritIan Wienand proposed zuul/nodepool master: Add a dib_cmd option for diskimages  https://review.opendev.org/67219607:18
*** zbr_ has joined #openstack-infra07:20
*** joeguo_ has joined #openstack-infra07:20
*** irclogbot_2 has quit IRC07:20
*** kaisers has quit IRC07:20
*** openstackstatus has quit IRC07:20
*** kaisers has joined #openstack-infra07:21
*** irclogbot_3 has joined #openstack-infra07:21
*** dansmith has quit IRC07:23
*** zbr has quit IRC07:23
*** joeguo has quit IRC07:23
*** dansmith has joined #openstack-infra07:24
*** Anticimex has quit IRC07:24
*** ginopc has joined #openstack-infra07:24
*** jamesmcarthur has quit IRC07:24
*** beagles has quit IRC07:26
*** tosky has joined #openstack-infra07:28
*** Anticimex has joined #openstack-infra07:29
noonedeadpunkI see a strange thing while connecting to opendev.org (probably it's due to my VPN, but 12 hours ago everything was ok): http://paste.openstack.org/show/754747/07:30
noonedeadpunkDoes anyone have any ideas why this may happen?07:31
gmannAJaeger_: clarkb fungi ianw can any of you remove the stable/stein branch for patrole project - https://opendev.org/openstack/patrole/src/branch/stable/stein.07:32
gmann release patch to remove the branch is merged now: https://review.opendev.org/#/c/670942/07:32
gmannit was created by mistake07:33
*** ykarel is now known as ykarel|lunch07:50
*** apetrich has joined #openstack-infra07:57
*** dtantsur|afk is now known as dtantsur07:59
*** priteau has joined #openstack-infra07:59
noonedeadpunkAnd we're eventually catching common things in CI http://logs.openstack.org/26/670126/1/gate/openstack-ansible-functional-centos-7/a2fda1c/job-output.txt.gz#_2019-07-23_07_55_36_28875108:06
*** Goneri has quit IRC08:09
*** bhavikdbavishi has joined #openstack-infra08:10
*** Goneri has joined #openstack-infra08:11
*** betherly has joined #openstack-infra08:20
*** yamamoto has quit IRC08:20
*** pkopec has joined #openstack-infra08:23
*** ralonsoh has joined #openstack-infra08:26
*** panda has quit IRC08:38
*** panda has joined #openstack-infra08:38
noonedeadpunkinfra-root ^08:40
*** e0ne has joined #openstack-infra08:46
*** yamamoto has joined #openstack-infra08:50
*** ykarel|lunch is now known as ykarel08:57
*** priteau has quit IRC09:00
*** Lucas_Gray has quit IRC09:01
*** yamamoto has quit IRC09:02
*** priteau has joined #openstack-infra09:03
*** pgaxatte has quit IRC09:11
*** Goneri has quit IRC09:11
*** betherly has quit IRC09:11
*** Lucas_Gray has joined #openstack-infra09:12
*** pgaxatte has joined #openstack-infra09:12
*** Goneri has joined #openstack-infra09:16
*** Lucas_Gray has quit IRC09:18
*** Lucas_Gray has joined #openstack-infra09:19
*** psachin has quit IRC09:20
*** ociuhandu has joined #openstack-infra09:31
*** psachin has joined #openstack-infra09:35
*** apetrich has quit IRC09:36
*** pgaxatte has quit IRC09:41
*** pgaxatte has joined #openstack-infra09:43
*** betherly has joined #openstack-infra09:46
*** ociuhandu has quit IRC09:48
*** ociuhandu has joined #openstack-infra09:50
*** jaosorior has joined #openstack-infra09:52
*** bhavikdbavishi has quit IRC09:59
*** dpawlik has quit IRC10:02
*** dpawlik has joined #openstack-infra10:03
jrosserwe have a bunch of jobs failing due to errors fetching files from git http://logs.openstack.org/89/667789/3/gate/openstack-ansible-deploy-aio_metal-debian-stable/14b42ef/job-output.txt.gz#_2019-07-23_07_50_24_82461310:10
*** kopecmartin|off is now known as kopecmartin10:12
noonedeadpunkthat's what I reported 2 hours ago :) And I have a similar thing when I'm connected through my VPN: http://paste.openstack.org/show/754747/10:13
*** traskat has quit IRC10:13
jrossernoonedeadpunk: interesting, changing curl to curl -4 and it works10:15
jrosserso looks like the ipv6 gremlins are at work10:15
*** lpetrut has joined #openstack-infra10:21
*** tdasilva has joined #openstack-infra10:21
*** yamamoto has joined #openstack-infra10:27
*** yamamoto has quit IRC10:28
*** yamamoto has joined #openstack-infra10:28
*** shachar has joined #openstack-infra10:29
*** snapiri has quit IRC10:32
*** ricolin_ has joined #openstack-infra10:34
*** ricolin has quit IRC10:37
*** yamamoto has quit IRC10:48
openstackgerritMonty Taylor proposed zuul/zuul master: Don't barf in dashboard on CORS violations for 404s  https://review.opendev.org/67226210:50
*** apetrich has joined #openstack-infra10:53
*** Lucas_Gray has quit IRC10:54
*** yamamoto has joined #openstack-infra10:57
*** yamamoto has quit IRC11:02
*** yamamoto has joined #openstack-infra11:05
*** b3nt_pin has joined #openstack-infra11:36
*** ykarel is now known as ykarel|afk11:40
*** ginopc has quit IRC11:40
*** markvoelker has quit IRC11:58
*** rh-jelabarre has joined #openstack-infra12:00
*** eharney has joined #openstack-infra12:01
*** yamamoto has quit IRC12:03
*** udesale has quit IRC12:04
*** udesale has joined #openstack-infra12:04
*** ricolin_ is now known as ricolin12:05
*** Goneri has quit IRC12:06
*** yamamoto has joined #openstack-infra12:08
*** pfallenop has joined #openstack-infra12:12
openstackgerritMonty Taylor proposed opendev/system-config master: Build gerrit images for 2.16 and 3.0 as well  https://review.opendev.org/67227312:15
*** markvoelker has joined #openstack-infra12:16
*** Goneri has joined #openstack-infra12:18
openstackgerritMonty Taylor proposed opendev/system-config master: Trim some bazel flags  https://review.opendev.org/67227412:18
*** pfallenop has quit IRC12:20
*** ccamacho has joined #openstack-infra12:22
*** goldyfruit has quit IRC12:26
*** ykarel|afk is now known as ykarel12:34
*** electrofelix has joined #openstack-infra12:36
*** mriedem has joined #openstack-infra12:38
*** joeguo_ has quit IRC12:42
*** siqbal has joined #openstack-infra12:48
*** pkopec has quit IRC12:57
*** bhavikdbavishi has joined #openstack-infra12:59
*** bhavikdbavishi1 has joined #openstack-infra13:02
*** sshnaidm has quit IRC13:03
*** bhavikdbavishi has quit IRC13:04
*** bhavikdbavishi1 is now known as bhavikdbavishi13:04
*** pkopec has joined #openstack-infra13:04
openstackgerritMonty Taylor proposed zuul/zuul master: Use cherrypy_cors to set cors headers  https://review.opendev.org/67228513:05
*** apetrich has quit IRC13:07
*** pkopec has quit IRC13:08
*** pkopec has joined #openstack-infra13:08
*** jamesmcarthur has joined #openstack-infra13:08
*** pkopec has quit IRC13:09
*** sshnaidm has joined #openstack-infra13:12
*** jamesmcarthur has quit IRC13:18
fungiAJaeger_: noonedeadpunk: jrosser: i have a feeling haproxy isn't taking the dead gitea01 backend out of the pools. i've manually disabled its backends for http and https now13:18
fungi#status log manually disabled http and https backends for missing gitea01 in haproxy13:19
noonedeadpunkfungi: worked for me13:19
fungiyeah, you have a 1/8 chance of being balanced to the dead backend (based on client address hash)13:19
fungistatusbot is apparently dead too. restarting it now13:20
noonedeadpunkI guess I had 100% for some reason, since today gitea never worked for me (from VPN)13:20
noonedeadpunkso my IP wasn't in lucky list:(13:21
*** openstackstatus has joined #openstack-infra13:21
*** ChanServ sets mode: +v openstackstatus13:21
fungiwell, the backend is chosen based on a hash of your client address, so you'll be consistently sent to the same backend, yeah13:21
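A minimal Python sketch of why one client keeps landing on the same backend under source-address hashing. The hash function and the backend names `gitea01` through `gitea08` are assumptions for illustration (the log only implies eight backends); haproxy's real source-hash algorithm differs.

```python
import hashlib
import ipaddress

# Assumed backend names; the log only says there is a 1/8 chance per backend.
BACKENDS = [f"gitea{n:02d}" for n in range(1, 9)]


def pick_backend(client_ip, backends=BACKENDS):
    """Map a client address to a backend deterministically (illustrative hash)."""
    packed = ipaddress.ip_address(client_ip).packed  # works for IPv4 and IPv6
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# The same address always hashes to the same backend, so a client mapped to a
# dead backend fails every time until that backend is disabled or removed.
print(pick_backend("203.0.113.7"))
print(pick_backend("2001:db8::1"))
```

Because the mapping depends only on the address, retrying from the same VPN exit does not help, which matches what noonedeadpunk observed.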
fungi#status log restarted statusbot after a 07:20z ctcp ping timeout13:22
openstackstatusfungi: finished logging13:22
fungi#status log manually disabled http and https backends for missing gitea01 in haproxy13:22
openstackstatusfungi: finished logging13:22
openstackgerritMonty Taylor proposed zuul/zuul master: Use cherrypy_cors to set cors headers  https://review.opendev.org/67228513:24
*** pkopec has joined #openstack-infra13:26
AJaeger_thanks, fungi13:30
ykarelfungi, AJaeger_ is below issue related to ^^ discussion:- ERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='opendev.org', port=443): Max retries exceeded with url: /openstack/requirements/raw/branch/master/upper-constraints.txt (Caused by SSLError(SSLEOFError(8, u'EOF occurred in violation of protocol (_ssl.c:618)'),))13:31
ykarelseeing in one of the job http://logs.openstack.org/97/672197/2/check/tripleo-ci-centos-7-containers-multinode/0dd1fe6/job-output.txt.gz#_2019-07-23_10_24_26_40017113:31
fungiykarel: probably, yes13:33
fungias to why that job doesn't have zuul provide the source code for it, i guess that's a bigger debate13:33
ykarelfungi, ack, so it should not be seen again from now, right?13:33
fungiykarel: as far as i know, yes. i'm not sure why haproxy didn't mark those backends down when they became unreachable, but am still catching up for the morning before i can dig deeper into that13:34
*** goldyfruit has joined #openstack-infra13:35
ykarelfungi, ack Thanks for the info, i will keep watching, if i see again will share with u13:35
fungiappreciated!13:35
*** bhavikdbavishi has quit IRC13:35
*** mriedem has quit IRC13:38
*** ykarel is now known as ykarel|away13:39
*** jamesmcarthur has joined #openstack-infra13:39
*** tosky_ has joined #openstack-infra13:40
*** tosky has quit IRC13:42
openstackgerritSlawek Kaplonski proposed openstack/project-config master: Rename "tripleo-ci-centos-7-scenario007-standalone" in Neutron  https://review.opendev.org/67229013:43
*** ykarel|away has quit IRC13:44
*** sreejithp has joined #openstack-infra13:45
petevgGood morning/afternoon/evening! Does anybody have any handy examples of adding artifacts to test output after a test failure? I want to gather and tar up some logs for inspection, and I'm not certain what the best practice for that sort of thing is in zuul ...13:47
*** AJaeger_ is now known as AJaeger13:50
*** aaronsheffield has joined #openstack-infra13:51
openstackgerritMonty Taylor proposed zuul/zuul master: WIP Use cherrypy_cors to set cors headers  https://review.opendev.org/67228513:52
*** yamamoto has quit IRC13:53
*** yamamoto has joined #openstack-infra13:54
*** iurygregory has quit IRC13:55
*** iurygregory has joined #openstack-infra13:55
*** yamamoto has quit IRC13:58
*** jamesmcarthur has quit IRC13:58
*** rpittau is now known as rpittau|afk14:03
openstackgerritSlawek Kaplonski proposed openstack/project-config master: Rename "tripleo-ci-centos-7-scenario007-standalone" in Neutron  https://review.opendev.org/67229014:07
*** michael-beaver has joined #openstack-infra14:07
*** tosky_ is now known as tosky14:08
*** ykarel|away has joined #openstack-infra14:11
*** jcoufal has joined #openstack-infra14:16
openstackgerritMonty Taylor proposed opendev/system-config master: Build gerrit images for 2.16 and 3.0 as well  https://review.opendev.org/67227314:21
openstackgerritMonty Taylor proposed opendev/system-config master: Trim some bazel flags  https://review.opendev.org/67227414:21
mordredcorvus: there's a change to build 2.16 and 3.0 too - maybe we'll get lucky and they'll build with no issues14:22
*** gyee has joined #openstack-infra14:23
*** apetrich has joined #openstack-infra14:26
fungipetevg: usually the simplest thing to do is if some action you expected to succeed fails, copy the additional logs/artifacts for it into the place where zuul expects to find them so it will slurp them up at the end of the job14:29
*** jeremy_houser has joined #openstack-infra14:31
petevgfungi: aha. So by default, if I just copy things into work/logs, it'll make them available to me?14:34
*** ykarel|away has quit IRC14:34
fungipetevg: i'll check the log collection role, but i believe so yes14:34
petevgfungi: cool. Thx.14:35
*** mriedem has joined #openstack-infra14:36
*** ykarel|away has joined #openstack-infra14:37
*** yamamoto has joined #openstack-infra14:37
*** kjackal has quit IRC14:38
*** kjackal has joined #openstack-infra14:40
*** dosaboy has joined #openstack-infra14:41
fungipetevg: so it's a combination of two roles usually... this one collects files from the job nodes and pulls them back to the executor workspace: https://zuul-ci.org/docs/zuul-jobs/log-roles.html#role-fetch-output14:41
fungipetevg: and then this one copies those to a site for publication: https://zuul-ci.org/docs/zuul-jobs/log-roles.html#role-upload-logs14:41
*** yamamoto has quit IRC14:43
fungiand yeah, the collect log output task in the fetch-output role looks in {{ zuul_output_dir }}/logs/ generally, unless overridden14:43
fungiwhatever's in there at the time the post playbook runs should get archived14:44
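As a rough sketch of fungi's suggestion, a failing test step could copy its extra logs into the directory that the fetch-output role collects from. `stash_logs` is a hypothetical helper, and the caller is assumed to pass whatever `zuul_output_dir` resolves to on the job node.

```python
import shutil
from pathlib import Path


def stash_logs(paths, output_dir):
    """Copy existing files into <output_dir>/logs and return the new paths."""
    logs_dir = Path(output_dir) / "logs"
    logs_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in map(Path, paths):
        if path.is_file():  # silently skip logs that were never written
            copied.append(Path(shutil.copy2(path, logs_dir / path.name)))
    return copied


# Example call; both paths below are assumptions, not taken from the log:
# stash_logs(["/tmp/mytest/debug.log"], Path.home() / "zuul-output")
```

Anything copied there before the post playbook runs should then be picked up along with the normal job logs.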
petevgfungi: got it. I'll give that a try. Thank you again!14:45
fungiany time!14:45
*** iurygregory has quit IRC14:45
*** iurygregory has joined #openstack-infra14:47
*** gfidente has quit IRC14:49
fungiokay, so i've figured out why gitea01 didn't get taken out of the haproxy pools when it became unreachable... none of the backends have any sort of checking enabled (status column is "no check" for all of them)14:55
fungiat least i think that's the reason14:57
fungistill trying to wrap my head around haproxy's socket forwarding model. we do have defaults set to redispatch with a variety of conditional timeouts and retrying14:57
fungiis the expectation that forwarding failures are used in lieu of service checks?14:58
openstackgerritMonty Taylor proposed zuul/zuul master: WIP Use cherrypy_cors to set cors headers  https://review.opendev.org/67228515:01
*** armax has joined #openstack-infra15:05
*** kjackal has quit IRC15:06
*** lseki has joined #openstack-infra15:06
*** kjackal has joined #openstack-infra15:06
*** odicha has quit IRC15:08
fungiso far i'm not finding any mention in the haproxy docs of an implicit checking feature15:09
*** pfallenop has joined #openstack-infra15:09
*** bhavikdbavishi has joined #openstack-infra15:10
clarkbfungi: I think this configuration was lost in the conversion to ansible + docker15:12
openstackgerritMonty Taylor proposed zuul/zuul master: WIP Do public cors without cherrypy_cors  https://review.opendev.org/67231315:12
clarkbfungi: you have to explicitly configure checks15:12
fungithanks, that's what i was getting from the haproxy docs as well15:12
fungipresumably we can get by with just a tcp socket check on the ports15:13
openstackgerritClark Boylan proposed opendev/system-config master: Actually check backends are alive in haproxy  https://review.opendev.org/67231415:13
clarkbfungi: something like ^15:14
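For reference, a plain TCP socket check of the kind fungi mentions amounts to attempting a connection and treating a refusal or timeout as "down". The Python sketch below only illustrates that idea; it is not how haproxy implements its checks, and `tcp_check` is a hypothetical name.

```python
import socket


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


# Example (hostname and port are placeholders):
# tcp_check("backend.example.org", 3000)
```

A check like this only proves that the port accepts connections; it says nothing about whether the service behind it is answering requests correctly.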
*** eernst has joined #openstack-infra15:14
mordredclarkb: ++15:15
clarkbalso a "listen" block is a combined frontend and backend block15:16
*** pfallenop has quit IRC15:16
clarkbso I'm 99% sure the backend directives are valid in that listen block15:16
*** trident has quit IRC15:18
*** trident has joined #openstack-infra15:20
fungithe examples i found looked exactly like that15:21
*** ccamacho has quit IRC15:21
*** rh-jelabarre has quit IRC15:23
*** rh-jelabarre has joined #openstack-infra15:25
*** dpawlik has quit IRC15:30
*** pgaxatte has quit IRC15:30
clarkbfungi: is there a change up to add new gitea01 back into the inventory yet?15:33
clarkbmordred: fungi re haproxy config changes the way you gracefully stop haproxy is to send some signal to it that says exit when all connections are closed, then you start a new daemon that will listen for new connections. I'm not really sure how to coordinate that with docker-compose, do you all have any idea?15:34
clarkb`docker kill -s HUP my-running-haproxy` is what the image docs say to do15:36
clarkbwhich addresses half of the problem (when only config updates and not the image)15:36
fungiclarkb: hah, i was just researching how to do that. i guess https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/haproxy/tasks/main.yaml#L18-L21 is where we need to trigger a configuration reload?15:36
clarkbfungi: that is one place where we need to; the other is when we do a docker-compose up and the haproxy image has updated15:37
clarkbfungi: I suppose we can start with the config updated case first and image updates are likely to be less frequent15:37
openstackgerritMonty Taylor proposed openstack/project-config master: Add additional gerrit plugin repos  https://review.opendev.org/67232015:38
clarkbfungi: the changes we've made in the last day or so would all be the first case anyway so that is likely good enough15:39
*** e0ne has quit IRC15:39
clarkbfungi: are you writing that change (adding a handler to the role to trigger that command?)15:39
openstackgerritMonty Taylor proposed opendev/system-config master: Build gerrit images for 2.16 and 3.0 as well  https://review.opendev.org/67227315:40
fungii'm learning ansible enough to find the pidfile and use that to template out the kill command15:40
*** david-lyle is now known as dklyle15:40
clarkbfungi: why not use docker or docker-compose for that?15:40
fungisince i guess we don't have normal systemd service management hooked up to containerized haproxy15:40
clarkbsorry let me link you to the docs I've got15:41
fungioh, because i didn't know docker-compose could do it ;)15:41
clarkbfungi: https://hub.docker.com/_/haproxy/15:41
mordredclarkb: I'm not sure what the answer is15:41
fungi(or really the first thing about docker compose, to be honest)15:41
mordredclarkb: oh - cool - that seems neat15:42
clarkbfungi: the command you want is likely `docker-compose -f /etc/haproxy-docker/docker-compose.yaml kill -s HUP haproxy`15:42
fungineat-o15:42
mordredclarkb, fungi: if you have a sec, 672320 is easy and needed for the gerrit image work15:42
fungiworst case, /var/haproxy/run/haproxy.pid does seem to contain the pid for the haproxy daemon15:43
clarkbfungi: their docs document the docker command equivalent but because docker-compose is managing our containers we'd have to look up whatever it called the haproxy container. Instead if we have docker compose run the command it knows how to map the logical thing we called 'haproxy' in our config to the running container for us15:43
fungioh, i take that back15:43
fungiot15:43
fungiit's the pid of the haproxy daemon within the process namespace15:43
fungi(so... 1)15:43
clarkbya that is why the docker tools exist15:44
clarkbfungi: you can also test that command by running it on opendev.org as root. Should result in a new pid for the haproxy process15:45
fungiyep, that command works (as root, not as an unprivileged user)15:45
portdirecthey - im wondering if we can 'choose' which pool of nodes some ci jobs run on15:45
clarkbif it somehow causes haproxy to stop functioning then docker-compose -f /that/same/file restart should get it back up and running15:45
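A minimal sketch of the reload discussed above, assuming the compose file path and service name quoted in the channel. The helper name `haproxy_signal` and the `DRY_RUN` variable are hypothetical, added only for illustration; they are not part of system-config.

```shell
#!/bin/sh
# Compose file and service name as quoted in the discussion above.
COMPOSE_FILE="/etc/haproxy-docker/docker-compose.yaml"
SERVICE="haproxy"

# haproxy_signal SIGNAL: hypothetical helper, not part of system-config.
# Sends a signal to the service's container through docker-compose, so the
# container name that compose generated never has to be looked up.
# With DRY_RUN=1 it only prints the command it would run.
haproxy_signal() {
    sig="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "docker-compose -f $COMPOSE_FILE kill -s $sig $SERVICE"
    else
        docker-compose -f "$COMPOSE_FILE" kill -s "$sig" "$SERVICE"
    fi
}
```

Running `DRY_RUN=1 haproxy_signal HUP` prints the command; running `haproxy_signal HUP` as root sends the signal. If haproxy stops functioning afterwards, the fallback mentioned in the channel is a `docker-compose -f` restart against the same file.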
clarkbportdirect: no15:45
fungiand caused it to finally remove the gitea01 pool entries we commented out of the config15:46
clarkbportdirect: there is a bit more to it than that (like arm64 resources only come from one location currently so that's an implicit choosing) but in general we try to have a generic pool of resources because resources come and go over time15:47
clarkbportdirect: is there something more specific you are trying to achieve?15:47
clarkbeg what is the goal with that?15:47
portdirectyeah - we'd like to do some dpdk checks15:48
portdirecti'll paste a snippet that may help/provide context15:48
portdirecthttps://www.irccloud.com/pastebin/J4JXDe1k/15:48
fungiwe have waaaay more than "two kinds" of virtual machines managed by nodepool15:49
portdirectyeah ;)15:49
clarkbthe two that seem to be detected there are rackspace (no nested virt + two interfaces) and not rackspace (always one interface sometimes nested virt)15:49
fungiportdirect: https://docs.openstack.org/infra/manual/testing.html tries to cover the variances you can expect15:49
clarkbportdirect: what are the requirements to test dpdk ?15:49
portdirectinterface with pci addr, and nested virt15:50
clarkbofficially we don't support nested virt because it has never worked reliably15:50
portdirectthats fair15:50
clarkbeven in clouds where it works today we've found that the next round of kernel updates to our images tend to break things15:50
clarkbthen we have to wait for the cloud to update all their hypervisor kernels15:50
fungiespecially in environments where we don't control the underlying hardware, host kernel or hypervisor15:51
fungi(which is basically all of them)15:51
portdirectthough here is not for real nested virt, its just vmx we need i think15:51
clarkbas for interface with pci addr does that require directed io or whatever it is called today (pci passthrough?)15:51
clarkbI don't think we've got that in any clouds15:51
fungiand we don't even have guarantees that all server instances booted in a particular provider/region will use the same sort of hypervisor hosts with consistent features15:52
*** trident has quit IRC15:52
*** jaosorior has quit IRC15:53
portdirectok - thanks clarkb and fungi15:53
clarkbAt one point we had a reasonably good setup for pushing the boundaries a bit on this with logan- (limestone) and mnaser (vexxhost) but I think we are still experiencing networking problems on limestone so are back down to one cloud region for that again15:53
clarkbit's possible fn may be able to help support some of that.15:54
clarkbI'd be happy for people to experiment more but I think the goal is likely to be "improve reliability of nested virt" and not "test dpdk" at least initially15:54
*** trident has joined #openstack-infra15:55
mnaserindeed. i had mentioned at the time that i was more than happy to work with whoever would volunteer to figure out what needs to be done on the host level15:56
mnaser(i.e. run this kernel or whatever)15:56
mnaserbut i cant really put time into digging that out myself unfortunately15:56
clarkbportdirect: if interested in helping to improve that base layer the people I know that have worked on it in the past are mnaser logan- johnsom rm_work and sean mooney15:57
*** kjackal has quit IRC15:58
clarkbI think the next step given where we are at now is to get a second cloud back up again that can support/assist direct debugging: either address networking issues in limestone and reenable there or see if donnyd thinks fortnebula can support it15:58
johnsomYeah, let me know if I can help with getting that enabled.15:59
clarkbThen add a flavor that allows us to run testing directed at exercising nested virt (boot cirros, ubuntu, centos smoke test? maybe more complicated than that) and that will allow us to track that directly15:59
portdirectI'll reach out to cheng1 and see if we can assist there15:59
portdirectthanks so much :)15:59
johnsomThe last issue that caused our project to turn it off was a nodepool instance kernel bug that came and went with the kernel releases.  Probably resolved now, but we still haven't turned it back on.16:00
clarkbjohnsom: it seems like the way those end up working out is guest kernel updates and breaks nested virt, then cloud updates their hypervisor kernel and it works again16:00
clarkbwhich is why the involvement from the cloud side has been so valuable16:00
johnsomWe narrowed it to not be related to the guest kernel or the host. It was the nodepool kernel. (working with limestone team)16:01
*** e0ne has joined #openstack-infra16:01
johnsomI should check if my kernel bug report is still open for that or not.16:02
*** iurygregory has quit IRC16:02
johnsomhttps://bugzilla.kernel.org/show_bug.cgi?id=19252116:03
openstackbugzilla.kernel.org bug 192521 in kvm "KVM: entry failed, hardware error 0x0" [High,New] - Assigned to virtualization_kvm16:03
openstackgerritJeremy Stanley proposed opendev/system-config master: Reload haproxy configuration when config changes  https://review.opendev.org/67232316:03
johnsomStill open, but who knows....16:03
clarkbjohnsom: yes in my context nodepool is the guest16:04
fungiclarkb: mordred: i feel like i've sort of cargo-culted 672323... don't really understand enough about ansible still to be confident that i'm understanding how file management works16:04
clarkbyou have cloud <- nodepool image <- nested image16:04
fungi(i'm assuming the template task only returns success when the file content changes)16:04
clarkband we know that updating the middle kernel does break things and usually updating the first kernel fixes it16:04
clarkbfungi: look up ansible handlers that is the "correct" way to do the association between tasks iirc16:05
clarkbfungi: I believe config_update will always be successful whether it writes bytes or not16:05
fungiahh16:05
fungistill mentally mapping puppet concepts onto ansible. sorry!16:05
fungiwill read more16:05
mordredfungi: you'll like handlers - they're nice and clean16:06
clarkbfungi: opendev/system-config/playbooks/roles/nameserver/handlers/main.yaml is a good example likely16:06
clarkbmordred: when they work16:06
*** mattw4 has joined #openstack-infra16:06
fungiheh16:06
clarkbmordred: when they don't work they cause ansible to exit 0 without running any subsequent tasks and you wonder why16:07
mordredclarkb: of course :)16:07
* clarkb is a bit grumpy about how unreliable ansible has been with handlers16:07
mordredclarkb: remember when we tried salt and it returned 0 on every invocation regardless of success or failure?16:07
clarkbya and puppet returns 2 on success16:07
mordredya16:07
*** tesseract has quit IRC16:07
donnydI should be able to help with that16:08
clarkbjohnsom: also we may need to set up some consistent terminology if we start to dig into this more. Nodepool doesn't run on these VMs nor does it have a special kernel16:09
donnydMy gear should be able to do dpdk, and I'm happy to enable nested virt16:09
clarkbso calling it the "nodepool kernel" implies nodepool is at fault when really it is normal upstream kernels being used as regular old VMs16:09
*** eernst has quit IRC16:09
johnsomAgreed, consistent terms would help.  Yep, it's distro kernels for sure.16:10
*** lucasagomes has quit IRC16:10
mordredclarkb, johnsom: terminology is always the hardest part16:10
*** pfallenop has joined #openstack-infra16:10
johnsomEven the "levels" terminology is troubled with obi-wan errors. lol16:11
clarkbGood news is that the linux kernel in 4.19 (I think) enabled nested virt by default on intel cpus16:11
clarkbwhich means that in a year or two maybe this will all just work in the wild16:11
johnsomYeah, it's been on by default for quite some time.16:12
clarkbit has been on for amd for forever but kashyap mentioned that was likely an oversight16:12
*** eernst_ has joined #openstack-infra16:12
johnsomWe had a good few years run with this stuff turned on without any issues. It was just this bug that stopped us.16:12
clarkbjohnsom: on the specific hardware we have and so on16:13
clarkbthere are a lot of variables at play and the kernel itself is looking at it more globally16:13
clarkb(so their stamp of approval implies that it is likely way more stable than we've seen it previously)16:13
clarkboh also kashyap is another person that is likely willing to help if people start pushing on this more16:14
logan-o/16:14
logan-johnsom: yep typically it breaks in limestone when guest kernels update but the host kernel does not. i've found the host kernel needs to be updated in lock step with nodepool for it to keep working.16:15
johnsomJoy16:15
donnydBe back later16:16
*** eernst_ has quit IRC16:18
*** lpetrut has quit IRC16:19
*** eernst has joined #openstack-infra16:19
openstackgerritMerged opendev/system-config master: Actually check backends are alive in haproxy  https://review.opendev.org/67231416:21
clarkblogan-: fwiw I believe the network problems came back as soon as we put workload on limestone. iirc fungi pulled it back out again16:23
clarkblogan-: that probably does imply it is something to do with our network traffic that triggers the card problems16:23
clarkb(not sure if you were caught up on that)16:23
fungiit didn't come back right away16:23
*** eernst has quit IRC16:23
fungior at least not that i saw16:23
fungithough this last time it was harder to identify since it didn't impact the part of the network between cacti and the mirror instance16:24
logan-it looked like a different problem. multinode job having problems SSHing between hosts, so it would not have been going thru the neutron gateways where we were seeing problems previously.16:24
clarkbah16:25
openstackgerritMonty Taylor proposed zuul/zuul master: WIP Use cherrypy_cors to set cors headers  https://review.opendev.org/67228516:33
openstackgerritMonty Taylor proposed zuul/zuul master: WIP Do public cors without cherrypy_cors  https://review.opendev.org/67231316:33
logan-the issue with the multinode job failure might have been a one off. we have continued running our local nodepool jobs on that cloud and they have not been impacted. we previously saw our jobs affected when the network node issues were occurring.16:34
*** ricolin has quit IRC16:34
fungiclarkb: i'm still not sure how to go about making it so that the template task only notifies the handler if it changed file content. it looks like you can do that with normal tasks instead using register and the .changed attribute though in examples i'm seeing16:35
fungialso i suppose i need to have it no-op if the service isn't running (maybe only when the pidfile exists?)16:36
*** jamesdenton has quit IRC16:36
*** jamesdenton has joined #openstack-infra16:36
*** dtantsur is now known as dtantsur|afk16:38
clarkbfungi: I think you just do a notify like the nameserver handler does16:42
clarkbthen ansible knows to notify only if the file was updated16:42
clarkb(that was my example above)16:43
*** yamamoto has joined #openstack-infra16:43
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: Add generate-zuul-manifest role  https://review.opendev.org/67187416:44
*** Goneri has quit IRC16:44
clarkbfungi: I think notify only fires if a task has the changed attribute set to true16:45
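The "only notify when the file content changed" idea above can be sketched in plain shell. This is an illustration of the semantics only, not how the Ansible role implements it; `install_if_changed` is a hypothetical helper.

```shell
#!/bin/sh
# install_if_changed NEW DEST: hypothetical helper mirroring the
# "changed, then notify" idea. Copies NEW over DEST only when the contents
# differ. Returns 0 when DEST was written ("changed"), 1 when left alone.
install_if_changed() {
    new="$1"
    dest="$2"
    if [ -f "$dest" ] && cmp -s "$new" "$dest"; then
        return 1   # identical content: nothing written, no reload needed
    fi
    cp "$new" "$dest"
    return 0
}
```

A caller would chain the reload onto the "changed" result, for example `install_if_changed "$rendered" "$live_config" && reload`, where `$rendered`, `$live_config` and `reload` are placeholders for the rendered template, the live config path and the reload command.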
*** yamamoto has quit IRC16:48
openstackgerritJeremy Stanley proposed opendev/system-config master: Reload haproxy configuration when config changes  https://review.opendev.org/67232316:48
fungiclarkb: ahh, in that case ^16:48
openstackgerritCarlos Goncalves proposed openstack/diskimage-builder master: Reduce yum-minimal based OS install size footprint  https://review.opendev.org/67232916:48
fungiit was what i had already cribbed together trying to reconcile documentation against the other examples16:49
*** yamamoto has joined #openstack-infra16:49
fungii just wasn't sure how to make certain it only did it on content changes16:49
*** betherly has quit IRC16:49
clarkbfungi: +216:50
*** armax has quit IRC16:50
clarkbfungi: might be worth manually running that command just to confirm that the container does the right thing16:51
fungii already did earlier16:51
fungiand it properly removed gitea01 from the pools16:51
clarkbhuh the processes are still from July 17 though16:51
clarkbmaybe the way it does restarts is different than I thought16:51
fungiit doesn't restart anything, just tells the daemon to reread its config16:52
mordredyeah16:52
clarkbok I distinctly remember reading a thing that said it doesn't reload the config but instead stops accepting new connections and you have to start a new process with the new config16:53
clarkbbut maybe that is an alternative method which ubuntu packaging employs or something16:53
clarkbya it definitely should make new processes16:54
clarkbbased on http://www.haproxy.org/download/1.7/doc/management.txt which the image docs link to16:55
openstackgerritMerged openstack/project-config master: Add additional gerrit plugin repos  https://review.opendev.org/67232016:55
fungiclarkb: i think you're referring to what happens when you send sigusr1?16:56
clarkbya ok we have three haproxy processes16:56
clarkbtwo from the 17th and one from an hour ago16:56
mordreddo the two from the 17th ever go away?16:56
clarkbmordred: I kind of expect the one with the -sf 6 to have been replaced by the new one that is -sf 616:57
clarkbbut maybe not16:57
fungiahh, so it does start a new daemon when rereading its config? that's strange16:57
clarkbfungi: yes16:57
clarkbthat is how you get the new config it is a new process16:57
clarkbcan we run the command again and see if we get a fourth process?16:57
*** siqbal90 has joined #openstack-infra16:57
fungisure, just a sec16:57
mordredmaybe it just tells the live one to stop accepting connections, and starts a new one that accepts - and getting rid of the old stale one is an exercise for the listener?16:57
fungiwell, documentation suggested you should be able to gracefully stop the old processes when doing socket takeover16:58
fungibut i assumed that was only for no-downtime restarts, not also for config changes16:58
mordredclarkb, fungi: since you're both enjoying docker at the moment, https://review.opendev.org/#/c/671457 is also ready for your enjoyment16:58
clarkbfungi: they are the same thing to haproxy16:58
*** siqbal has quit IRC16:58
fungiindeed, i find that a strange design choice on their part16:58
*** dancek has quit IRC16:59
mordredin particular, https://review.opendev.org/#/c/671457/13/docker/gerrit/2.13/Dockerfile "should" result in a container image that resembles what puppet would do on a server - minus the data and config files ... that's likely the most 'interesting' bit to look at17:00
*** siqbal90 has quit IRC17:02
fungiclarkb: okay, ran it as close as i can to how the handler will (cwd into /etc/haproxy-docker/ and not specifying a compose file)17:02
clarkbmordred: fungi ok that leaked another process17:02
fungiand it does seem to have started yet another haproxy process, yes17:02
clarkbI guess we have to check if eventually those older processes go away. It is possible those older processes are still handling clients17:02
clarkb(and so have not exited yet)17:02
clarkbhaproxy[25482]: proxy balance_git_https has no server available!17:03
clarkbthat seemed to have picked up the check?17:03
fungialso i find it interesting that 26683 is running as root but the others are running as uid 100017:03
*** roman_g has quit IRC17:03
*** udesale has quit IRC17:03
clarkbheres hoping it discovered quickly that tcp works17:03
johnsomAny idea why I can't open this page on opendev.org? https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/package-installs/post-install.d17:03
clarkbjohnsom: see the paste a couple lines above yours17:04
fungiConnection refused at initial connection step of tcp-check17:04
fungii'll undo the service checks17:04
fungimanually17:04
clarkbfungi: ok17:04
fungiokay, the backends are all back to no check now17:05
*** roman_g has joined #openstack-infra17:05
fungifungi@gitea-lb01:/etc/haproxy-docker$ telnet 38.108.68.122 308017:05
fungiTrying 38.108.68.122...17:05
fungitelnet: Unable to connect to remote host: Connection refused17:05
clarkbis it actually the backends that are sad?17:06
clarkbya ok17:06
clarkbgitea web restarted ~3 minutes ago17:06
funginow i can reach them from the lb17:07
fungiokay, that's... strange timing?17:07
clarkbfungi: so the checks should be fine17:07
fungialso i guess we don't have rolling restarts set up for gitea yet17:07
clarkbfungi: we don't but they should only restart if the images update17:07
*** trident has quit IRC17:07
fungiso did we just get new gitea images?17:08
*** chandankumar is now known as raukadah17:08
clarkbmariadb updated17:08
clarkbaccording to sudo docker image ls17:08
fungineat17:09
clarkbfungi: we should put the checks back17:10
*** trident has joined #openstack-infra17:10
clarkb(if they have been successfully removed)17:10
fungidone17:10
clarkband our older haproxy process from about an hour ago has gone away17:10
fungiLayer4 check passed17:10
clarkbso that may just be delay waiting for connections to die17:10
fungilooking okay so far17:11
*** yamamoto has quit IRC17:11
*** priteau has quit IRC17:11
clarkbya its working for me now17:11
clarkbthat was highly coincidental17:11
fungiconfusingly so, yes17:11
clarkbthe worst kind of wtf did it stop working :)17:11
clarkbmordred: I think https://review.opendev.org/#/c/672323/2 is good to go17:12
clarkbas the older process did go away17:12
johnsomIf you use the haproxy reload, it will spawn a new process for new connections and keep the old one around to finish out any active connections. Once they are all closed the old process will exit.17:12
clarkbjohnsom: yup17:12
clarkbjohnsom: we were just confirming that the docker image supervisor process actually ensures that happens17:12
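One rough way to watch the old processes drain after a reload is to count haproxy processes before and after. The sketch below assumes process names arrive one per line on stdin; `count_haproxy` is a hypothetical helper name.

```shell
#!/bin/sh
# count_haproxy: hypothetical helper. Reads one command name per line on
# stdin (for example from `ps -e -o comm=`) and prints how many of them are
# exactly "haproxy".
count_haproxy() {
    awk '$1 == "haproxy" { n++ } END { print n + 0 }'
}
```

For example, `ps -e -o comm= | count_haproxy` run right after a HUP and again later should show the count dropping once the old process has finished its remaining connections.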
*** yamamoto has joined #openstack-infra17:13
*** ociuhandu has quit IRC17:13
johnsomFYI, if you are using the HAProxy docker image, 2.0.3 and 1.9.9 haproxy is out today with a CVE fix.17:16
*** lseki has quit IRC17:16
mordredclarkb: I hopped onto a call right as things were unhappy - tldr was that it was just bad timing?17:16
clarkbjohnsom: doesn't appear to be on dockerhub yet17:16
fungii'm going to go get lunch, and then once i'm back i'll finish getting the gitea01 replacement into the mix17:16
fungi(managed to get the replacement server launched last night)17:17
johnsomHere is the announce e-mail for 2.0.3: https://www.mail-archive.com/haproxy@formilux.org/msg34586.html17:17
clarkbmordred: yes mariadb image updated which restarted all of the giteas and they need a few minutes to start up. At the same time fungi HUP'd haproxy to pick up the health checks and they reported no tcp connections17:17
clarkbmordred: so was coincidence after a couple minutes gitea was back and checks worked fine17:17
clarkbmordred: and we should be safe to approve the change to add the graceful restart on config updates17:17
fungiclarkb: also see scrollback from last night about the two available 40gb volumes in sjc1 control plane tenant and see if the timeline there makes sense for how we wound up with them. if you concur i'll delete them when i get back17:18
*** yamamoto has quit IRC17:18
mordredclarkb: awesome. out of curiosity - isn't an image update like that supposed to do one backend at a time?17:18
clarkbmordred: no our ansible does not serialize them17:18
clarkb(that would be the easy fix but would slowdown ansible runtime)17:18
mordredah - maybe we should serialize them, now that we have haproxy doing health checks17:18
fungibut sounds like a great enhancement17:18
fungiyes17:18
mordredand have it at least wait on the port being up17:18
clarkbjohnsom: I don't think we are affected we only forward tcp so there is no cookie parsing17:18
corvus++ can't think of a reason not to17:18
fungibbiaw17:19
mordredalthough - while we're talking about it - am I remembering right that gitea isn't fully up as soon as its port is up?17:19
openstackgerritMonty Taylor proposed opendev/system-config master: Trim some bazel flags  https://review.opendev.org/67227417:20
mordredcorvus: this is happening in remote_puppet_git right? it's the docker-compose pull in the gitea role that also causes the update?17:22
corvusyeah17:22
openstackgerritMerged zuul/nodepool master: static: add host-key-checking toggle  https://review.opendev.org/65367917:23
mordredhrm. well - that will make initial spinup for integration testing annoying17:23
mordred(serializing the gitea role)17:23
corvuswe only have one gitea during testing though17:23
mordredoh - duh17:23
clarkbmordred: corvus its also done separately from the create projects play17:23
clarkbso I think we should be able to do it fairly isolated17:23
clarkbmordred: are you working on that?17:24
corvuseven better17:24
mordredcool - then I'm less worried about it - because we also don't actually expect to run the playbook from scratch against 8 giteas too17:24
mordredclarkb: yes17:24
corvusagreed, that time is hopefully past17:24
*** ramishra has quit IRC17:24
*** _erlon_ has joined #openstack-infra17:25
clarkbmordred: k I'll stop looking too hard at it then and will await change to review17:25
openstackgerritMonty Taylor proposed opendev/system-config master: Serialize the gitea role  https://review.opendev.org/67233517:26
mordredclarkb: I believe it's easy like that17:26
clarkbmordred: should we add a wait until gitea accepts connection on 443 task too?17:27
mordredclarkb: well, there's already a "make sure root user exists" in the role17:27
clarkbaha17:27
clarkbok17:27
mordredclarkb: so I think we're covered17:27
clarkb+217:28
mordred\o/17:28
openstackgerritCarlos Goncalves proposed openstack/diskimage-builder master: Reduce yum-minimal based OS install size footprint  https://review.opendev.org/67232917:30
clarkbI have rechecked https://review.opendev.org/#/c/672323/2 as it failed on vcsrepo failing17:32
clarkbI think due to the short gitea outage17:32
*** armax has joined #openstack-infra17:32
clarkbinfra-root https://review.opendev.org/#/c/672335/1 is the fix for that outage if we can get a second reviewer on it17:32
*** e0ne has quit IRC17:33
clarkbonce we have those two changes the only outstanding issue is how to gracefully restart when haproxy image updates I think17:33
clarkbthere is a good chance that will happen soon too given the cve17:35
*** ykarel|away has quit IRC17:35
*** ralonsoh has quit IRC17:36
*** dancek has joined #openstack-infra17:38
openstackgerritMerged zuul/zuul-jobs master: Add generate-zuul-manifest role  https://review.opendev.org/67187417:39
*** panda is now known as panda|off17:45
*** sshnaidm is now known as sshnaidm|afk17:50
openstackgerritDavid Shrewsbury proposed zuul/nodepool master: Add build ID to failure message  https://review.opendev.org/67233717:52
*** Lucas_Gray has joined #openstack-infra17:58
*** jcoufal_ has joined #openstack-infra17:59
*** armax has quit IRC18:01
clarkbmordred: left comments on the gerrit dockerfile change18:02
clarkbmordred: mostly didn't review the 2.15 stuff since I have less context for that but tried to point out where there are deltas with the existing 2.13 things18:02
mordredclarkb: awesome - thanks! the 2.15 is fairly different so that's ok - the approach there is "build the plugins together with the war file in the first place"18:03
*** jcoufal has quit IRC18:03
*** Lucas_Gray has quit IRC18:03
*** armax has joined #openstack-infra18:03
mordredclarkb: (and if that's not sufficient, I expect we'll figure that out as we try doing upgrade testing)18:03
clarkbya18:03
corvusmordred, clarkb: can you +3 https://review.opendev.org/671893 to add the new manifest role to base-test?18:04
clarkbcorvus: done18:04
clarkb(its base-test so I just single core approved)18:04
corvussounds good, thx18:04
*** Lucas_Gray has joined #openstack-infra18:05
*** factor has joined #openstack-infra18:14
openstackgerritMerged opendev/base-jobs master: Test generate-zuul-manifest role  https://review.opendev.org/67189318:14
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: DNM: Test base-jobs  https://review.opendev.org/67189418:15
*** ociuhandu has joined #openstack-infra18:16
*** psachin has quit IRC18:19
*** bhavikdbavishi has quit IRC18:19
*** bhavikdbavishi has joined #openstack-infra18:19
*** Lucas_Gray has quit IRC18:20
*** Lucas_Gray has joined #openstack-infra18:22
*** igordc has joined #openstack-infra18:27
openstackgerritMerged opendev/system-config master: Serialize the gitea role  https://review.opendev.org/67233518:29
mordredclarkb: jeez /etc/init.d/gerrit18:33
clarkbmordred: ya its a big one18:33
clarkbI tried to pull out the highlights :)18:33
mordredclarkb: the best part is that it does a TON of things that are all elided by the container18:33
mordredclarkb: but it also does a ton of things that are not18:34
mordredand they're all intermingled18:34
*** ociuhandu has quit IRC18:34
mordredulimit, for instance, is a docker-level setting18:34
*** Lucas_Gray has quit IRC18:36
*** Lucas_Gray has joined #openstack-infra18:36
*** e0ne has joined #openstack-infra18:42
*** mriedem has quit IRC18:42
*** tdasilva has quit IRC18:51
*** jcoufal_ has quit IRC18:52
fungiokay, back and catching up before the meeting18:54
openstackgerritMerged opendev/system-config master: Reload haproxy configuration when config changes  https://review.opendev.org/67232318:55
*** bhavikdbavishi has quit IRC18:56
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: Fix typo in generate-zuul-manifest role  https://review.opendev.org/67234318:58
clarkboh right the meeting18:59
*** igordc has quit IRC19:00
openstackgerritMonty Taylor proposed opendev/system-config master: Build a docker images of gerrit  https://review.opendev.org/67145719:01
mordredclarkb, corvus: ^^ I think that addresses clarkb's comments - we'll need to translate some gerrit.config settings into docker-compose.yaml when we get there (most notably heap size - which we'll also clearly want to set differently for test)19:01
*** goldyfruit has quit IRC19:02
*** ociuhandu has joined #openstack-infra19:04
*** goldyfruit has joined #openstack-infra19:06
*** factor has quit IRC19:07
*** factor has joined #openstack-infra19:08
*** ociuhandu_ has joined #openstack-infra19:09
*** ociuhandu has quit IRC19:09
*** factor has quit IRC19:10
openstackgerritMerged zuul/zuul-jobs master: Fix typo in generate-zuul-manifest role  https://review.opendev.org/67234319:11
*** factor has joined #openstack-infra19:11
donnydOk I'm back19:11
*** factor has quit IRC19:13
*** factor has joined #openstack-infra19:13
*** kopecmartin is now known as kopecmartin|off19:16
*** Lucas_Gray has quit IRC19:22
*** whoami-rajat has quit IRC19:22
*** Wryhder has joined #openstack-infra19:22
*** panda|off has quit IRC19:23
*** Wryhder is now known as Lucas_Gray19:23
*** panda has joined #openstack-infra19:25
openstackgerritMerged zuul/nodepool master: Add build ID to failure message  https://review.opendev.org/67233719:29
*** Lucas_Gray has quit IRC19:34
*** Lucas_Gray has joined #openstack-infra19:35
*** joeguo has joined #openstack-infra19:44
*** kjackal has joined #openstack-infra19:45
*** tosky has quit IRC19:47
*** Lucas_Gray has quit IRC19:49
openstackgerritJames E. Blair proposed opendev/base-jobs master: Promote generate-zuul-manifest role to base  https://review.opendev.org/67234819:51
*** Lucas_Gray has joined #openstack-infra19:52
*** factor has quit IRC19:52
*** factor has joined #openstack-infra19:53
*** factor has quit IRC19:55
*** factor has joined #openstack-infra19:55
*** factor has quit IRC19:55
*** mriedem has joined #openstack-infra20:00
clarkbfungi: I've commented on the linaro flavor change. Thanks for pointing that out20:06
*** e0ne has quit IRC20:06
*** slaweq has quit IRC20:08
clarkbfungi: for gitea01 are we ready to add it to the inventory and get it ansibled?20:09
clarkbfungi: I was going to ask if it got a 8GB swapfile setup properly too20:09
fungigood point, i'll investigate20:10
fungii also haven't followed the steps to import data yet20:10
clarkbfungi: that happens after we add it to the inventory20:10
clarkbwe add it back to ansible to get gitea and everything installed but not configured, then we import the data to configure it, then we fully add it to the git playbook20:11
corvusclarkb, mordred, fungi: https://review.opendev.org/672348 is ready; then i can stop pestering for a while :)20:11
clarkblooking20:11
clarkbI wont single core approve that one :)20:11
fungiclarkb: what was your take on the two available 40gb volumes in the sjc1 control plane tenant, after my investigation last night?20:12
fungiyou okay with them being cleaned up?20:12
clarkbfungi: I think they are for old gitea06 and old gitea01. I seem to recall trying to delete the old gitea06 volume and it refused to so I filed that away for later20:12
clarkbmight have ended up in my #status notes /me looks20:12
*** roman_g has quit IRC20:12
openstackgerritMonty Taylor proposed zuul/nodepool master: Install libffi6 on dpkg platforms  https://review.opendev.org/67235220:13
clarkblooks like no :/20:13
clarkbfungi: I think they likely can be deleted20:14
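The "size plus not attached" identification described above can be sketched as a filter. It assumes input lines of the form `ID SIZE STATUS`; producing them with something like `openstack volume list -f value -c ID -c Size -c Status` is an assumption whose column names and order should be checked against the client in use. `unattached_volumes` is a hypothetical helper.

```shell
#!/bin/sh
# unattached_volumes SIZE_GB: hypothetical helper. Reads "ID SIZE STATUS"
# lines on stdin and prints the IDs of volumes of the given size whose
# status is "available" (that is, not attached to any server).
unattached_volumes() {
    awk -v size="$1" '$2 == size && $3 == "available" { print $1 }'
}
```

This only narrows the candidates; as in the discussion, timestamps and image names are still needed to confirm which server a volume belonged to before deleting anything.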
*** raissa has joined #openstack-infra20:14
fungiSwap:          8191           0        819120:14
fungithat's from `free -m` on the replacement gitea0120:14
clarkbyay the make swap updates worked then (I mean I tested them but still)20:14
fungiso i think we're all set on virtual memory20:14
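The swap check above can be scripted against `free -m` output. A minimal sketch; both helper names and the 8000 MiB threshold used in the usage note are assumptions for illustration.

```shell
#!/bin/sh
# swap_total_mb: hypothetical helper. Reads `free -m` output on stdin and
# prints the total swap in MiB (the second field of the "Swap:" row).
swap_total_mb() {
    awk '$1 == "Swap:" { print $2 }'
}

# has_swap_at_least MIN_MB: hypothetical helper. Succeeds when the total
# swap read from stdin is at least MIN_MB.
has_swap_at_least() {
    total="$(swap_total_mb)"
    [ -n "$total" ] && [ "$total" -ge "$1" ]
}
```

On the server this would be run as `free -m | has_swap_at_least 8000`; the row quoted above reports a total of 8191.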
fungiright, this was just me bumbling around trying to get a server to boot, so if they worked there then i think we can be pretty certain they're foolproof now20:15
fungii've deleted the two available 40gb volumes and the accompanying ubuntu-bionic-minimal base image they were blocking20:17
clarkbwere they blocking that image?20:17
fungino, wait, i haven't deleted that image20:17
clarkbI'm betting the other gitea servers are still on that image20:17
fungiyeah, i concur20:18
fungiFailed to delete image with name or ID 'd0edcf7a-0779-476e-8285-bcab9043b616': 409 Conflict: Image d0edcf7a-0779-476e-8285-bcab9043b616 could not be deleted because it is in use: The image cannot be deleted because it is in use through the backend store outside of Glance. (HTTP 409)20:18
mordredcorvus: done20:18
fungii bet you're right, it's also in use by the other gitea instance root filesystems20:18
fungionce those are all replaced we can clean it up20:18
clarkbyup20:19
*** jtomasek has quit IRC20:21
fungi#status log openstack/doc8 in github has been transferred to the PyCQA organization20:23
openstackstatusfungi: finished logging20:23
fungistephenfin: ^20:23
openstackgerritMerged opendev/base-jobs master: Promote generate-zuul-manifest role to base  https://review.opendev.org/67234820:26
openstackgerritJames E. Blair proposed zuul/zuul master: Add log browsing to build page  https://review.opendev.org/67190620:27
clarkbfungi: I'm going to have to pop out here soon for a bit. Did you want to get the inventory addition up soon if so I'll wait for that and review it20:28
openstackgerritJeremy Stanley proposed opendev/system-config master: Re-add gitea01 replacement to inventory  https://review.opendev.org/67235420:32
fungiclarkb: ^ sorry, was working on it20:32
*** ociuhandu_ has quit IRC20:32
clarkbfungi: need to exclude it from remote_puppet_git.yaml too20:32
*** ociuhandu has joined #openstack-infra20:33
clarkbfungi: https://review.opendev.org/#/c/667474/ is the example (I didn't capture the example in the docs but did make note of it, maybe we should make the docs more explicit)20:34
openstackgerritJeremy Stanley proposed opendev/system-config master: Re-add gitea01 replacement to inventory  https://review.opendev.org/67235420:34
fungiclarkb: like that? ^20:34
clarkb+2 that should do it20:35
*** Lucas_Gray has quit IRC20:36
corvusi bet we could use hostvars to get the ip addrs there20:40
clarkbcorvus: for the haproxy config you mean?20:42
corvusclarkb: yep20:46
*** pcaruana has quit IRC20:50
clarkband now I must pop out. Back later20:52
*** kjackal has quit IRC20:55
*** slaweq has joined #openstack-infra20:56
*** slaweq has quit IRC21:01
*** ociuhandu has quit IRC21:02
*** sreejithp has quit IRC21:12
openstackgerritMerged zuul/nodepool master: Install libffi6 on dpkg platforms  https://review.opendev.org/67235221:15
*** lpetrut has joined #openstack-infra21:22
*** e0ne has joined #openstack-infra21:22
*** lpetrut has quit IRC21:22
*** lpetrut has joined #openstack-infra21:23
*** lpetrut has quit IRC21:30
openstackgerritJames E. Blair proposed zuul/zuul master: Add log browsing to build page  https://review.opendev.org/67190621:31
*** altlogbot_2 has quit IRC21:33
*** irclogbot_3 has quit IRC21:33
*** altlogbot_0 has joined #openstack-infra21:33
*** irclogbot_1 has joined #openstack-infra21:34
*** rosmaita has left #openstack-infra21:39
fungii'm confused by the linters error on 672354 as it seems unrelated to the proposed change but i'm also unsure why it would have started spontaneously breaking21:42
openstackgerritJames E. Blair proposed zuul/zuul master: Fix sphinx error  https://review.opendev.org/67237221:44
*** yamamoto has joined #openstack-infra21:50
*** e0ne has quit IRC21:50
*** yamamoto has quit IRC21:55
*** irclogbot_1 has quit IRC21:59
*** altlogbot_0 has quit IRC22:01
dmsimardbtw, ara static report generation will land in the next release of 1.x -- need to iterate a bit on it but it works: http://logs.openstack.org/76/672376/2/check/ansible-role-ara-api-ubuntu-postgresql/2e6a610/logs/static/22:07
fungidmsimard: that's awesome news!22:08
openstackgerritJames E. Blair proposed zuul/zuul master: Move artifacts to their own section  https://review.opendev.org/67237922:09
dmsimardit turns out that I'm much more productive in pure html/css than in javascript :p22:09
clarkbfungi it is mad about the remote puppet git change but I don't understand it22:10
clarkbalso I'm only about at my halfway point on today's ride so phone debugging22:10
*** armax has quit IRC22:11
*** adriant has quit IRC22:11
*** iokiwi has quit IRC22:11
clarkbfungi it is the whitespacing22:12
clarkbyou need to dedent22:12
fungihrm, fun22:16
fungithat's new though? not something being changed there22:17
fungiand it's not the duplicate keys it's complaining about which are the problem?22:18
fungifound duplicate key "name" with value "Create repos on gitea servers" (original value: "Puppet-git: Collect the project-config ref")22:18
fungihttp://logs.openstack.org/54/672354/2/check/tox-linters/c4e534c/job-output.txt.gz#_2019-07-23_20_46_23_93288222:18
fungiahh, yeah, that's in playbooks/remote_puppet_git.yaml22:20
*** altlogbot_2 has joined #openstack-infra22:22
fungiokay, yeah i see where we have two "name" keys defined there but it's not clear what needs to happen with them. should they be in individual list elements?22:22
clarkbfungi the new bit is over indented22:22
fungioh! it's the hosts line22:22
*** gyee has quit IRC22:23
openstackgerritJeremy Stanley proposed opendev/system-config master: Re-add gitea01 replacement to inventory  https://review.opendev.org/67235422:23
* fungi swears audibly at his editor22:23
*** jeremy_houser has quit IRC22:24
fungiautoindent is a blight22:24
fungiyet another reason to turn it off22:24
clarkbYou have my +2 from a phone if you want to reapprove22:25
clarkb(not a +2 in gerrit because hard)22:25
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: Download-artifact: use the artifact type rather than name  https://review.opendev.org/67238122:26
openstackgerritJames E. Blair proposed zuul/zuul-jobs master: Use human-readable names for artifact returns  https://review.opendev.org/67238222:26
*** altlogbot_2 has quit IRC22:27
*** iokiwi has joined #openstack-infra22:28
*** goldyfruit has quit IRC22:30
*** goldyfruit has joined #openstack-infra22:30
fungithanks clarkb, my eyesight is suffering tonight22:32
corvusdmsimard: sweet!  i've picked up work on the zuul log display stuff, so we're getting closer to a place where i think we'll feel comfortable switching to swift (where we're going to want to use static generation)22:32
corvusfungi, clarkb +322:33
*** goldyfruit has quit IRC22:38
*** diablo_rojo has joined #openstack-infra22:41
*** tkajinam has joined #openstack-infra22:51
*** mriedem has quit IRC22:55
*** armax has joined #openstack-infra23:04
*** gyee has joined #openstack-infra23:10
clarkboh hah 672354 fails now for a different reason23:13
clarkbit is because gitea01 is the node we test in CI23:13
clarkband well that one is being untested with this change23:13
*** altlogbot_3 has joined #openstack-infra23:14
fungioh, likely so23:15
*** rcernin has joined #openstack-infra23:16
clarkbI've got a patch one sec23:17
openstackgerritClark Boylan proposed opendev/system-config master: Re-add gitea01 replacement to inventory  https://review.opendev.org/67235423:18
clarkbfungi: corvus ^ I think that might fix it23:18
fungihttp://logs.openstack.org/54/672354/3/check/system-config-run-gitea/0d26df1/job-output.txt.gz#_2019-07-23_22_39_50_31836123:18
fungii guess that's the error?23:18
*** altlogbot_3 has quit IRC23:19
fungihah, yep23:20
clarkbya we basically don't configure gitea in that change (intentionally)23:20
clarkbswitching to a host that we should never actually use in production should give us flexibility to test and also rotate these out like this23:20
*** altlogbot_1 has joined #openstack-infra23:28
dmsimardcorvus: ack23:28
clarkbfungi: I +2'd the latest ps I'll let you decide if we should keep trying to work on it today23:31
clarkb(if you want to approve it I mean)23:31
fungii saw and approved it23:31
fungistill hacking on it some23:31
fungialso winding down for the evening but will see how far i get23:31
clarkbit should be safe as that host isn't in the load balancer23:31
fungiyup23:31
clarkbso worst case we continue to have an unconfigured gitea23:31
*** irclogbot_0 has joined #openstack-infra23:32
*** igordc has joined #openstack-infra23:35
*** jamesmcarthur has joined #openstack-infra23:36
*** dchen has joined #openstack-infra23:47
*** aaronsheffield has quit IRC23:50
*** jamesmcarthur has quit IRC23:52
clarkbseems that it made it to the gate with those changes23:54
clarkb\o/23:54
*** diablo_rojo has quit IRC23:54
*** eernst has joined #openstack-infra23:57
fungi<mr_burns>exxxxxcellent</mr_burns>23:59
