*** caphrim007 has quit IRC | 00:01 | |
*** Swami has quit IRC | 00:03 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 00:11 |
---|---|---|
ianw | clarkb / fungi : ^ since we're talking load-balancing, etc, i think move to the "01" naming ... mirror01.iad.rax.o.o is up but needs puppeting | 00:12 |
*** gouthamr has joined #openstack-infra | 00:12 | |
ianw | i should probably turn down the ttl on mirror.iad.rax.o.o to facilitate a cut-over | 00:12 |
pabelanger | okay, reverse proxy cache for docker.io is working still. tripleo jobs are now hitting it | 00:13 |
clarkb | ianw: will the vhost answer to mirror.iad.rax.o.o if the hostname is something else? | 00:15 |
* clarkb looks | 00:15 | |
pabelanger | Wow, mirror.dfw.rax.o.o is only 2GB of RAM currently | 00:16 |
pabelanger | but seems to be holding up | 00:16 |
ianw | pabelanger: i thought that was intentional? i also brought up the new one as a 2gb instance | 00:16 |
clarkb | ianw: ya posted a comment I think the vhost name will be a problem as is | 00:16 |
ianw | i guess this was before we were doing a lot of reverse proxying | 00:17 |
clarkb | ianw: pabelanger ya I think only ord is the bigger flavor | 00:17 |
pabelanger | Ya, for bandwidth I think | 00:17 |
clarkb | since dfw and iad are temporarily larger quotas but are typically bigger? | 00:17 |
* clarkb double checks that | 00:19 | |
clarkb | ya iad and dfw are typically smaller but temporarily have quota bumps thanks to cloudnull | 00:20 |
clarkb | ord we increased the size of and is more permanently large | 00:20 |
ianw | clarkb: hmm, some sort of *'d serveralias might work? ... | 00:23 |
clarkb | ianw: ya left that in my comment on the change | 00:24 |
clarkb | I think you can do vhost_name => '*' instead of fqdn in site.pp | 00:24 |
ianw | clarkb: it looks like puppet-httpd supports serveraliases | 00:24 |
clarkb | and since each vhost is on different ports on that server and not based on name I think it is fine | 00:24 |
clarkb | oh you mean serveralias directive, I think since we supply our own template we have to update the template too, but then how do you make iad server only respond to mirror01.iad and mirror.iad and not mirror.dfw? | 00:25 |
clarkb | maybe we just let them respond to all the names and not care too much about it? | 00:25 |
clarkb | we could also have ruby in the erb do string munging to remove the 01 in the serveralias | 00:26 |
ianw | Would "ServerAlias mirror*.iad.rax.openstack.org" not match? | 00:27 |
*** pvaneck has quit IRC | 00:27 | |
*** yongwc has joined #openstack-infra | 00:27 | |
clarkb | it would but this is where puppet is painful, getting a different rule for each region | 00:28 |
*** markvoelker has quit IRC | 00:30 | |
fungi | i think using a default vhost should be fine | 00:30 |
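
The ServerAlias wildcard and the "remove the 01" munging discussed above can be sketched in a few lines. This is only an illustration: it approximates the wildcard with Python's shell-style `fnmatch`, which is an assumption and not Apache's own matcher, and the helper names `matches_alias` and `strip_host_index` are hypothetical.

```python
import fnmatch
import re

# Wildcard taken from ianw's question in the log.
ALIAS_PATTERN = "mirror*.iad.rax.openstack.org"


def matches_alias(hostname, pattern=ALIAS_PATTERN):
    """Approximate a wildcard alias match with shell-style globbing.

    Assumption: fnmatch stands in for the web server's matcher here.
    Note that '*' in fnmatch also matches dots.
    """
    return fnmatch.fnmatchcase(hostname.lower(), pattern.lower())


def strip_host_index(fqdn):
    """Drop a trailing numeric index from the first DNS label.

    mirror01.iad.rax.openstack.org -> mirror.iad.rax.openstack.org
    """
    first, sep, rest = fqdn.partition(".")
    return re.sub(r"\d+$", "", first) + sep + rest


if __name__ == "__main__":
    for name in (
        "mirror.iad.rax.openstack.org",
        "mirror01.iad.rax.openstack.org",
        "mirror.dfw.rax.openstack.org",
    ):
        print(name, matches_alias(name), strip_host_index(name))
```

Under these assumptions both iad names match and the dfw name does not, which is the per-region behaviour clarkb was asking about; a default vhost, as fungi suggests, avoids needing the rule at all.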
*** yongwc has quit IRC | 00:32 | |
*** hongbin has joined #openstack-infra | 00:35 | |
*** dave-mccowan has quit IRC | 00:38 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 00:41 |
ianw | clarkb / fungi : ^ untested but presented for discussion :) | 00:41 |
*** thorst has joined #openstack-infra | 00:42 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 00:43 |
*** cshastri has joined #openstack-infra | 00:48 | |
*** vhosakot has quit IRC | 00:50 | |
*** zhurong has joined #openstack-infra | 00:58 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 01:12 |
*** gouthamr has quit IRC | 01:13 | |
*** liujiong has joined #openstack-infra | 01:20 | |
*** claudiub has quit IRC | 01:20 | |
*** liusheng has quit IRC | 01:21 | |
pabelanger | surprisingly: we're capping out at 15Mb/s in mirror.dfw currently | 01:21 |
pabelanger | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=3104&rra_id=all | 01:21 |
pabelanger | mirror.ord.rax is at 400Mb/s http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=3063&rra_id=all | 01:22 |
pabelanger | 150Mbs for dfw* | 01:22 |
fungi | maybe not all that surprising... you did say it's only a 2gb instance right? | 01:23 |
pabelanger | Ya | 01:24 |
pabelanger | we're also caching a lot of things now too | 01:24 |
fungi | i think the 4gb instance in ord was capping out at 200mbps and then when i upgraded it to an 8gb instance we were able to spike higher (or maybe i'm getting some of those values mixed up, it's sorta late here) | 01:24 |
pabelanger | that sounds right | 01:24 |
fungi | so would make sense that a 2gb instance could have a still lower bw cap | 01:25 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 01:27 |
pabelanger | Ya, something to keep an eye on with more traffic flowing through mirrors, might result in long job run times | 01:27 |
*** dave-mccowan has joined #openstack-infra | 01:28 | |
ianw | fungi: so what size do we want the xenial instances? | 01:28 |
pabelanger | but so far, docker.io reverse proxy cache is working ! | 01:29 |
*** gouthamr has joined #openstack-infra | 01:30 | |
fungi | ianw: i think it depends a bit on provider constraints (like tying bw caps to flavors) and relative quotas | 01:31 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 01:31 |
ianw | fungi: well, for the specific instance of this rax.iad one ... 8gb? | 01:31 |
*** gcb has joined #openstack-infra | 01:32 | |
fungi | i'm looking to see how that matches up to ord quota-wise | 01:32 |
fungi | ~75% the size of ord's quota... so yeah i'd probably do an 8gb flavor there as well | 01:33 |
fungi | probably ought to consider the same for dfw too | 01:34 |
*** liusheng has joined #openstack-infra | 01:34 | |
fungi | since it's only 5 instances lower than the quota for iad | 01:34 |
*** esberglu has quit IRC | 01:34 | |
clarkb | those are temporary bumps though | 01:35 |
*** esberglu has joined #openstack-infra | 01:35 | |
clarkb | (though oversizing doesn't hurt) | 01:35 |
*** jamesmcarthur has joined #openstack-infra | 01:35 | |
*** yamahata has quit IRC | 01:35 | |
fungi | right, more concerned that we may be breaking some jobs due to packet loss when the proxies reach the bw caps for their present flavors | 01:36 |
fungi | seems to have been the case in ord anyway until i beefed up the mirror there | 01:36 |
*** dave-mcc_ has joined #openstack-infra | 01:37 | |
*** cuongnv has joined #openstack-infra | 01:38 | |
pabelanger | clarkb: fungi: we should likely calculate # jobs / available bandwidth for our mirror servers too. We might find we are over saturating some regions too. | 01:38 |
pabelanger | for example, mirror.ord I see ~ 18 Mb/s per host | 01:39 |
*** esberglu has quit IRC | 01:39 | |
pabelanger | in mirror.dfw, 21 Mb/s | 01:39 |
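
The "# jobs / available bandwidth" calculation pabelanger proposes can be sketched roughly as follows. The 400 Mb/s and 150 Mb/s figures are the observed mirror throughputs quoted earlier in the log; the node counts and the function names are hypothetical placeholders, not real quota numbers.

```python
def per_node_mbps(mirror_mbps, node_count):
    """Bandwidth share per test node if all nodes pull from the mirror at once."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return mirror_mbps / node_count


def is_oversubscribed(mirror_mbps, node_count, needed_per_node_mbps):
    """True when the mirror cannot give every node the share it needs."""
    return per_node_mbps(mirror_mbps, node_count) < needed_per_node_mbps


if __name__ == "__main__":
    # Throughput values come from the chat; node counts are made up.
    regions = {"ord": (400, 20), "dfw": (150, 20)}
    for region, (mbps, nodes) in regions.items():
        print(region, per_node_mbps(mbps, nodes))
```

Comparing the per-node share across regions would show which mirrors are most likely to hit their flavor's bandwidth cap first.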
*** dave-mccowan has quit IRC | 01:39 | |
*** jamesmcarthur has quit IRC | 01:39 | |
pabelanger | but, that is enough for tonight. I'm happy to see docker.io caching working | 01:41 |
*** slaweq has joined #openstack-infra | 01:42 | |
*** esberglu has joined #openstack-infra | 01:43 | |
*** esberglu has quit IRC | 01:43 | |
fungi | yeah, i'm about to disappear for the night myself, once the tc office hour wraps up in ~15 minutes | 01:44 |
*** lbragstad has quit IRC | 01:44 | |
*** tuanluong has joined #openstack-infra | 01:44 | |
*** Apoorva_ has joined #openstack-infra | 01:46 | |
*** slaweq has quit IRC | 01:47 | |
clarkb | also is bw in region typically limited? | 01:48 |
clarkb | in rax we use the public ip so it is | 01:48 |
clarkb | but in other clouds that may be less an issue | 01:48 |
*** Apoorva has quit IRC | 01:50 | |
*** Apoorva_ has quit IRC | 01:50 | |
fungi | right, which is why i was saying the requirements for sizing are going to vary by provider | 01:51 |
clarkb | ++ | 01:51 |
*** thorst has quit IRC | 01:59 | |
*** gongysh has joined #openstack-infra | 02:07 | |
*** r-daneel has joined #openstack-infra | 02:09 | |
*** EricGonczer_ has joined #openstack-infra | 02:15 | |
*** liujiong_lj has joined #openstack-infra | 02:25 | |
*** liujiong has quit IRC | 02:26 | |
*** liujiong_lj is now known as liujiong | 02:26 | |
*** liujiong has quit IRC | 02:29 | |
*** liujiong has joined #openstack-infra | 02:30 | |
*** markvoelker has joined #openstack-infra | 02:31 | |
*** gouthamr has quit IRC | 02:34 | |
*** yamahata has joined #openstack-infra | 02:34 | |
*** krtaylor has joined #openstack-infra | 02:42 | |
*** slaweq has joined #openstack-infra | 02:43 | |
*** ramishra has quit IRC | 02:44 | |
*** ramishra has joined #openstack-infra | 02:46 | |
*** gongysh has quit IRC | 02:47 | |
*** baoli has joined #openstack-infra | 02:47 | |
*** baoli has quit IRC | 02:48 | |
*** mat128 has quit IRC | 02:48 | |
*** slaweq has quit IRC | 02:48 | |
*** baoli has joined #openstack-infra | 02:48 | |
*** baoli has quit IRC | 02:49 | |
*** mwarad has joined #openstack-infra | 02:53 | |
*** gongysh has joined #openstack-infra | 02:53 | |
*** markvoelker has quit IRC | 03:04 | |
openstackgerrit | zhurong proposed openstack-infra/project-config master: Register solum-tempest-plugin jobs https://review.openstack.org/494064 | 03:08 |
*** dklyle has quit IRC | 03:12 | |
openstackgerrit | zhurong proposed openstack-infra/project-config master: Register solum-tempest-plugin jobs https://review.openstack.org/494064 | 03:13 |
*** sree has joined #openstack-infra | 03:21 | |
*** dave-mcc_ has quit IRC | 03:21 | |
*** sbezverk has joined #openstack-infra | 03:24 | |
*** LindaWang has joined #openstack-infra | 03:26 | |
*** tjones has joined #openstack-infra | 03:29 | |
*** tjones has left #openstack-infra | 03:30 | |
*** EricGonczer_ has quit IRC | 03:30 | |
*** david-lyle has joined #openstack-infra | 03:31 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 03:36 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 03:39 |
*** slaweq has joined #openstack-infra | 03:44 | |
*** udesale has joined #openstack-infra | 03:44 | |
*** links has joined #openstack-infra | 03:47 | |
*** makowals has quit IRC | 03:47 | |
*** slaweq has quit IRC | 03:49 | |
*** ykarel_ has joined #openstack-infra | 03:49 | |
*** wolverineav has joined #openstack-infra | 03:55 | |
*** ramishra has quit IRC | 03:55 | |
*** ramishra has joined #openstack-infra | 03:57 | |
*** makowals has joined #openstack-infra | 03:57 | |
*** markvoelker has joined #openstack-infra | 04:01 | |
*** david-lyle has quit IRC | 04:06 | |
*** david-lyle has joined #openstack-infra | 04:08 | |
*** nicolasbock has joined #openstack-infra | 04:13 | |
*** dklyle has joined #openstack-infra | 04:15 | |
*** david-lyle has quit IRC | 04:18 | |
*** hongbin has quit IRC | 04:24 | |
ianw | clarkb / fungi: i think mirror01.iad.rax.o.o is close to ready ... see my comments in -> https://review.openstack.org/#/c/494042 | 04:26 |
*** markvoelker has quit IRC | 04:34 | |
*** liujiong_lj has joined #openstack-infra | 04:38 | |
*** liujiong has quit IRC | 04:39 | |
*** slaweq has joined #openstack-infra | 04:45 | |
*** r-daneel has quit IRC | 04:50 | |
*** slaweq_ has joined #openstack-infra | 04:50 | |
*** slaweq has quit IRC | 04:50 | |
*** r-daneel has joined #openstack-infra | 04:54 | |
*** slaweq_ has quit IRC | 04:55 | |
*** Guest41024 has quit IRC | 05:03 | |
*** dhajare has joined #openstack-infra | 05:10 | |
*** Hal has joined #openstack-infra | 05:17 | |
*** liujiong_lj has quit IRC | 05:17 | |
*** Hal is now known as Guest55565 | 05:18 | |
*** knikolla has quit IRC | 05:22 | |
AJaeger | jeblair: 492715 is merging now - you caught all capitalizations | 05:23 |
*** liujiong has joined #openstack-infra | 05:25 | |
*** wolverineav has quit IRC | 05:30 | |
*** markvoelker has joined #openstack-infra | 05:32 | |
*** udesale__ has joined #openstack-infra | 05:35 | |
*** udesale has quit IRC | 05:35 | |
*** jamesmcarthur has joined #openstack-infra | 05:36 | |
*** liujiong has quit IRC | 05:37 | |
*** jamesmcarthur has quit IRC | 05:40 | |
*** wolverineav has joined #openstack-infra | 05:41 | |
*** pcaruana has joined #openstack-infra | 05:45 | |
AJaeger | is there a status page for zuul v3? The gate was entered 25 mins ago which looks long | 05:49 |
*** pgadiya has joined #openstack-infra | 05:50 | |
*** slaweq has joined #openstack-infra | 05:51 | |
*** tnovacik has joined #openstack-infra | 05:55 | |
*** slaweq has quit IRC | 05:56 | |
*** arturb has quit IRC | 06:03 | |
*** jbadiapa has joined #openstack-infra | 06:04 | |
*** markvoelker has quit IRC | 06:05 | |
*** namnh has joined #openstack-infra | 06:05 | |
hwoarang | good morning. has anyone reported any dns failures on jobs? i've just seen one for opensuse http://logs.openstack.org/66/493566/2/check/gate-openstack-ansible-os_keystone-ansible-func-opensuse-423-nv/02a37b0/console.html#_2017-08-16_05_19_13_576828 but i also saw another one yesterday http://logs.openstack.org/93/493893/1/check/gate-openstack-ansible-os_magnum-ansible-func-opensuse-423-nv/4637cfb/console.html#_2017-08-15_14_24_04_572983 | 06:12 |
*** liujiong has joined #openstack-infra | 06:16 | |
*** jamesdenton has quit IRC | 06:16 | |
*** zhurong has quit IRC | 06:19 | |
*** adisky__ has joined #openstack-infra | 06:19 | |
*** jamesdenton has joined #openstack-infra | 06:20 | |
*** zhurong has joined #openstack-infra | 06:22 | |
*** udesale has joined #openstack-infra | 06:23 | |
*** udesale__ has quit IRC | 06:23 | |
*** tumbarka has joined #openstack-infra | 06:25 | |
*** tushar has quit IRC | 06:25 | |
*** wolverin_ has joined #openstack-infra | 06:26 | |
*** wolverineav has quit IRC | 06:29 | |
*** bogdando has joined #openstack-infra | 06:31 | |
*** alexchadin has joined #openstack-infra | 06:31 | |
*** wolverin_ has quit IRC | 06:34 | |
*** wolverineav has joined #openstack-infra | 06:35 | |
*** udesale has quit IRC | 06:35 | |
*** dizquierdo has joined #openstack-infra | 06:36 | |
*** udesale has joined #openstack-infra | 06:36 | |
*** kjackal_ has quit IRC | 06:37 | |
*** florianf has joined #openstack-infra | 06:37 | |
*** slaweq has joined #openstack-infra | 06:40 | |
*** eroux has quit IRC | 06:43 | |
*** kjackal_ has joined #openstack-infra | 06:44 | |
*** danpawlik has quit IRC | 06:47 | |
*** camunoz has joined #openstack-infra | 06:48 | |
*** zhurong has quit IRC | 06:53 | |
*** coolsvap has joined #openstack-infra | 06:54 | |
*** pgadiya has quit IRC | 06:54 | |
*** danpawlik has joined #openstack-infra | 06:54 | |
*** rcernin has joined #openstack-infra | 06:57 | |
*** tnovacik has quit IRC | 07:02 | |
*** markvoelker has joined #openstack-infra | 07:02 | |
*** markus_z has joined #openstack-infra | 07:05 | |
*** kjackal_ has quit IRC | 07:06 | |
*** pgadiya has joined #openstack-infra | 07:07 | |
*** markus_z has quit IRC | 07:09 | |
*** shardy_afk is now known as shardy | 07:10 | |
*** markus_z has joined #openstack-infra | 07:11 | |
*** masuberu has joined #openstack-infra | 07:14 | |
*** aviau has quit IRC | 07:14 | |
*** aviau has joined #openstack-infra | 07:14 | |
*** rcernin has quit IRC | 07:16 | |
*** masber has quit IRC | 07:16 | |
*** masuberu has quit IRC | 07:18 | |
*** ccamacho has joined #openstack-infra | 07:18 | |
*** jpena|off is now known as jpena | 07:19 | |
*** aviau has quit IRC | 07:19 | |
*** isaacb has joined #openstack-infra | 07:19 | |
*** rcernin has joined #openstack-infra | 07:19 | |
*** aviau has joined #openstack-infra | 07:19 | |
odyssey4me | clarkb ah, you'll see later in the log that it does install it - we have a wheel mirror in the build, the first attempt is to use what it has, then it reaches out if it couldn't install anything... so not firewalling, but instead a pip config restricting where it sources wheels | 07:20 |
odyssey4me | that's only happening in the upgrade jobs | 07:20 |
odyssey4me | thanks for the ping though | 07:21 |
bogdando | o/ PTAL e-r queries https://review.openstack.org/#/c/493535/ https://review.openstack.org/#/c/493525/ https://review.openstack.org/#/c/493520/ to reduce unknown/uncategorised numbers | 07:25 |
*** jpich has joined #openstack-infra | 07:28 | |
*** masuberu has joined #openstack-infra | 07:28 | |
*** dmellado has joined #openstack-infra | 07:28 | |
*** kjackal_ has joined #openstack-infra | 07:29 | |
*** masuberu has quit IRC | 07:31 | |
*** shardy is now known as shardy_afk | 07:32 | |
*** markvoelker has quit IRC | 07:36 | |
*** alexchadin has quit IRC | 07:37 | |
*** eranrom has joined #openstack-infra | 07:38 | |
*** egonzalez has joined #openstack-infra | 07:41 | |
*** jklare has quit IRC | 07:49 | |
*** jklare has joined #openstack-infra | 07:50 | |
*** jklare_ has joined #openstack-infra | 07:52 | |
*** jklare has quit IRC | 07:52 | |
*** jklare_ is now known as jklare | 07:52 | |
*** slaweq_ has joined #openstack-infra | 07:53 | |
*** abelur_ has quit IRC | 07:54 | |
*** jklare has quit IRC | 07:55 | |
*** jklare has joined #openstack-infra | 07:56 | |
*** slaweq_ has quit IRC | 07:58 | |
*** Guest55565 has quit IRC | 07:59 | |
*** e0ne has joined #openstack-infra | 08:01 | |
*** thorst has joined #openstack-infra | 08:04 | |
*** ralonsoh has joined #openstack-infra | 08:04 | |
AJaeger | jeblair: so, 492715 has still not merged. But problem might be elsewhere now since Zuul reported "Starting gate jobs" | 08:05 |
*** lucas-afk is now known as lucasagomes | 08:06 | |
*** wolverineav has quit IRC | 08:07 | |
*** andymccr_ is now known as andymccr | 08:08 | |
*** thorst has quit IRC | 08:09 | |
*** derekh has joined #openstack-infra | 08:10 | |
*** ralonsoh has quit IRC | 08:12 | |
*** ralonsoh has joined #openstack-infra | 08:12 | |
*** alexchadin has joined #openstack-infra | 08:13 | |
*** efoley has joined #openstack-infra | 08:13 | |
*** isaacb has quit IRC | 08:14 | |
openstackgerrit | Sean Handley proposed openstack-infra/project-config master: Add IRC notifications for #openstack-publiccloud. https://review.openstack.org/493934 | 08:17 |
*** Hal has joined #openstack-infra | 08:17 | |
*** shardy_afk is now known as shardy | 08:17 | |
*** Hal is now known as Guest58951 | 08:17 | |
openstackgerrit | Merged openstack-infra/project-config master: Add Vitrage python 35 jobs as non-voting https://review.openstack.org/493758 | 08:18 |
openstackgerrit | Merged openstack-infra/project-config master: Publish monasca-events-api documentation https://review.openstack.org/492805 | 08:18 |
openstackgerrit | Merged openstack-infra/project-config master: Add gate jobs for new openstack-ansible Octavia scenario https://review.openstack.org/491109 | 08:18 |
openstackgerrit | Merged openstack-infra/project-config master: [Zun] Move etcd dsvm job back to check queue https://review.openstack.org/493564 | 08:19 |
openstackgerrit | Merged openstack-infra/project-config master: Revert "Temporarily start using the public registry again" https://review.openstack.org/493630 | 08:19 |
*** claudiub has joined #openstack-infra | 08:20 | |
openstackgerrit | Merged openstack-infra/project-config master: networking-midonet: Enable centos-7 jobs for stable/ocata https://review.openstack.org/487763 | 08:23 |
*** ramishra has quit IRC | 08:24 | |
*** ramishra has joined #openstack-infra | 08:27 | |
openstackgerrit | Merged openstack-infra/project-config master: Skip additional tests for Cinder doc changes https://review.openstack.org/492630 | 08:27 |
*** isaacb has joined #openstack-infra | 08:27 | |
openstackgerrit | Merged openstack-infra/project-config master: Add pypi-jobs to masakari and related projects https://review.openstack.org/493434 | 08:28 |
openstackgerrit | Dong Ma proposed openstack-infra/subunit2sql master: turn on warning-is-error in documentation build https://review.openstack.org/477155 | 08:29 |
openstackgerrit | Merged openstack-infra/project-config master: Add bandit integration job for glance_store https://review.openstack.org/441632 | 08:30 |
*** yamamoto has joined #openstack-infra | 08:33 | |
*** slaweq has quit IRC | 08:33 | |
*** markvoelker has joined #openstack-infra | 08:33 | |
*** jamesmcarthur has joined #openstack-infra | 08:36 | |
openstackgerrit | Markos Chandras (hwoarang) proposed openstack-infra/project-config master: zuul: layout: AIO: Add openSUSE Leap 42.3 as non-voting CI job https://review.openstack.org/494116 | 08:37 |
*** wolverineav has joined #openstack-infra | 08:38 | |
*** jamesmcarthur has quit IRC | 08:40 | |
openstackgerrit | Artur Basiak proposed openstack-infra/project-config master: Provide unified gate configuration https://review.openstack.org/490790 | 08:41 |
*** wolverineav has quit IRC | 08:43 | |
*** ykarel_ is now known as ykarel|lunch | 08:48 | |
*** yamamoto has quit IRC | 08:49 | |
openstackgerrit | Markos Chandras (hwoarang) proposed openstack-infra/project-config master: zuul: layout: AIO: Add openSUSE Leap 42.3 as non-voting CI job https://review.openstack.org/494116 | 08:49 |
*** jbadiapa_ has joined #openstack-infra | 08:50 | |
*** jbadiapa has quit IRC | 08:51 | |
*** slaweq has joined #openstack-infra | 08:54 | |
*** sambetts|afk is now known as sambetts | 08:58 | |
*** slaweq has quit IRC | 08:59 | |
*** Guest58951 has quit IRC | 09:01 | |
*** thorst has joined #openstack-infra | 09:05 | |
*** markvoelker has quit IRC | 09:06 | |
*** lucasagomes is now known as lucas-brb | 09:08 | |
*** thorst has quit IRC | 09:09 | |
*** electrofelix has joined #openstack-infra | 09:09 | |
openstackgerrit | Thierry Carrez proposed openstack-infra/puppet-ptgbot master: Update to Queens, add a site index https://review.openstack.org/484798 | 09:10 |
*** goutham has joined #openstack-infra | 09:14 | |
*** sree has quit IRC | 09:18 | |
*** sree has joined #openstack-infra | 09:19 | |
*** sree has quit IRC | 09:23 | |
*** alexchadin has quit IRC | 09:24 | |
*** tosky has joined #openstack-infra | 09:26 | |
*** mwarad has quit IRC | 09:33 | |
*** goutham has quit IRC | 09:37 | |
openstackgerrit | Merged openstack/diskimage-builder master: Increase timeout for removal https://review.openstack.org/493026 | 09:45 |
*** ykarel|lunch is now known as ykarel | 09:46 | |
*** cshastri has quit IRC | 09:47 | |
*** LindaWang has quit IRC | 09:52 | |
*** ociuhandu has quit IRC | 09:54 | |
*** slaweq has joined #openstack-infra | 09:55 | |
*** cuongnv has quit IRC | 09:58 | |
*** slaweq has quit IRC | 10:00 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: DONT REVIEW: test undercloud containers https://review.openstack.org/494151 | 10:01 |
*** dizquierdo has quit IRC | 10:02 | |
*** markvoelker has joined #openstack-infra | 10:03 | |
*** namnh has quit IRC | 10:04 | |
*** thorst has joined #openstack-infra | 10:05 | |
*** dtantsur|afk is now known as dtantsur | 10:06 | |
*** wolverineav has joined #openstack-infra | 10:10 | |
*** thorst has quit IRC | 10:11 | |
*** ociuhandu has joined #openstack-infra | 10:12 | |
*** liujiong has quit IRC | 10:15 | |
openstackgerrit | Omer Anson proposed openstack-infra/project-config master: Dragonflow: Add a gate-hook to tempest tests https://review.openstack.org/494155 | 10:16 |
*** lucas-brb is now known as lucasagomes | 10:18 | |
*** slaweq has joined #openstack-infra | 10:19 | |
*** yamahata has quit IRC | 10:21 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Create a whitelist for /etc configs https://review.openstack.org/493973 | 10:25 |
*** chandankumar is now known as chkumar|travel | 10:27 | |
*** LindaWang has joined #openstack-infra | 10:29 | |
*** udesale has quit IRC | 10:29 | |
*** jbadiapa_ has quit IRC | 10:30 | |
*** jbadiapa_ has joined #openstack-infra | 10:31 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Exclude list for logs collection https://review.openstack.org/494022 | 10:32 |
*** slaweq has quit IRC | 10:34 | |
*** slaweq has joined #openstack-infra | 10:36 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Create a whitelist for /etc configs https://review.openstack.org/493973 | 10:37 |
*** markvoelker has quit IRC | 10:38 | |
*** florianf has quit IRC | 10:42 | |
*** dizquierdo has joined #openstack-infra | 10:43 | |
*** ralonsoh has quit IRC | 10:44 | |
*** ralonsoh_ has joined #openstack-infra | 10:44 | |
*** priteau has joined #openstack-infra | 10:44 | |
*** kjackal_ has quit IRC | 10:47 | |
AJaeger | wow, we're already at our capacity of cloud nodes ;( | 10:49 |
*** jkilpatr has quit IRC | 10:54 | |
openstackgerrit | Dmitry Tantsur proposed openstack-infra/project-config master: Add missing publish-to-pypi to networking-baremetal https://review.openstack.org/494166 | 10:54 |
*** florianf has joined #openstack-infra | 10:57 | |
*** tuanluong has quit IRC | 11:04 | |
*** thorst has joined #openstack-infra | 11:04 | |
*** jpena is now known as jpena|lunch | 11:05 | |
*** isaacb has quit IRC | 11:09 | |
*** isaacb has joined #openstack-infra | 11:10 | |
*** lucasagomes is now known as lucas-hungry | 11:10 | |
*** szaher has joined #openstack-infra | 11:15 | |
*** kjackal_ has joined #openstack-infra | 11:18 | |
*** gcb has quit IRC | 11:20 | |
*** sree has joined #openstack-infra | 11:24 | |
*** jbadiapa_ has quit IRC | 11:27 | |
*** martinkopec has joined #openstack-infra | 11:29 | |
*** jkilpatr has joined #openstack-infra | 11:30 | |
*** markvoelker has joined #openstack-infra | 11:35 | |
*** ldnunes has joined #openstack-infra | 11:35 | |
*** pgadiya has quit IRC | 11:41 | |
*** jbadiapa_ has joined #openstack-infra | 11:42 | |
*** slaweq has quit IRC | 11:43 | |
*** rhallisey has joined #openstack-infra | 11:43 | |
*** slaweq has joined #openstack-infra | 11:43 | |
*** slaweq has quit IRC | 11:44 | |
*** kjackal_ has quit IRC | 11:44 | |
*** mat128 has joined #openstack-infra | 11:44 | |
*** slaweq has joined #openstack-infra | 11:45 | |
*** kjackal_ has joined #openstack-infra | 11:56 | |
*** slaweq_ has joined #openstack-infra | 11:57 | |
*** lucas-hungry is now known as lucasagomes | 11:59 | |
Shrews | AJaeger: there is a zuulv3 status page http://zuulv3.openstack.org/ | 12:00 |
*** slaweq_ has quit IRC | 12:02 | |
*** trown|outtypewww is now known as trown | 12:04 | |
*** slaweq has quit IRC | 12:05 | |
*** rlandy has joined #openstack-infra | 12:05 | |
*** slaweq has joined #openstack-infra | 12:05 | |
AJaeger | Shrews: thanks. | 12:06 |
AJaeger | do you know what happened to 492715 ? | 12:07 |
*** jamesdenton has quit IRC | 12:07 | |
*** markvoelker has quit IRC | 12:08 | |
*** jcoufal has joined #openstack-infra | 12:10 | |
*** slaweq has quit IRC | 12:10 | |
Shrews | AJaeger: unfortunately no | 12:10 |
*** dprince has joined #openstack-infra | 12:15 | |
*** slaweq has joined #openstack-infra | 12:18 | |
*** dizquierdo has quit IRC | 12:19 | |
numans | AJaeger, Hi, can you please add this to your review queue - https://review.openstack.org/#/c/490622/ | 12:21 |
*** dizquierdo has joined #openstack-infra | 12:22 | |
AJaeger | numans: please ask EmilienM to review this first . Once he's happy, I'll review. | 12:24 |
numans | AJaeger, sure. thanks. EmilienM can you please add this to your review queue - https://review.openstack.org/#/c/490622/ | 12:25 |
EmilienM | done ^ | 12:27 |
openstackgerrit | Dmitry Tantsur proposed openstack-infra/project-config master: Add missing publish-to-pypi to networking-baremetal https://review.openstack.org/494166 | 12:28 |
*** atarakt has quit IRC | 12:29 | |
*** jpena|lunch is now known as jpena | 12:29 | |
*** atarakt has joined #openstack-infra | 12:29 | |
*** efoley has quit IRC | 12:31 | |
fungi | AJaeger: i'll take a peek at the debug logs on zuulv3.o.o and see whether i can suss out where 492715 went | 12:35 |
fungi | AJaeger: looks like we lost the connection to the database service for the mysql reporter... i'll follow up in #zuul with tracebacks | 12:39 |
*** eharney has joined #openstack-infra | 12:41 | |
*** lucasagomes is now known as lucas-brb | 12:48 | |
openstackgerrit | Thierry Carrez proposed openstack-infra/puppet-ptgbot master: Update to Queens, add a site index https://review.openstack.org/484798 | 12:49 |
ttx | fungi: small last-minute adjustment ^ | 12:49 |
openstackgerrit | sebastian marcet proposed openstack-infra/openstackid-resources master: External Calendar Sync https://review.openstack.org/487683 | 12:49 |
*** spzala has quit IRC | 12:51 | |
fungi | ttx: thanks | 12:52 |
AJaeger | thanks, fungi for looking into zuulv3 | 12:56 |
fungi | AJaeger: looks like when the trove instance for the zuulv3 mysql reporter was created, our typical default configuration overrides were not applied (which include extending the wait_timeout to 28800 (the upstream mysql default value) instead of whatever absurdly low inactivity timeout rax sets for their deployments) | 12:58 |
*** slaweq_ has joined #openstack-infra | 12:58 | |
AJaeger | can we apply those now? | 12:59 |
fungi | however, that has exposed that our db query socket implementation in zuul v3 is not very robust in the face of disconnects | 12:59 |
AJaeger | yep | 12:59 |
*** markvoelker has joined #openstack-infra | 12:59 | |
fungi | AJaeger: not sure yet... i think go ahead and try a recheck, but i'm curious whether zuul will reestablish its connection without a restart now | 13:00 |
*** gongysh has quit IRC | 13:00 | |
*** gongysh has joined #openstack-infra | 13:00 | |
*** gongysh has quit IRC | 13:00 | |
*** Julien-z_ has joined #openstack-infra | 13:00 | |
*** gongysh has joined #openstack-infra | 13:00 | |
*** EricGonczer_ has joined #openstack-infra | 13:01 | |
AJaeger | restart issues, I see it on zuulv3.openstack.org | 13:01 |
*** lathiat has quit IRC | 13:01 | |
fungi | #status log trove configuration "sanity" created in rax dfw for mysql 5.7, setting our usual default overrides (wait_timeout=28800, character_set_server=utf8, collation_server=utf8_bin) | 13:01 |
openstackstatus | fungi: finished logging | 13:01 |
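
The disconnect problem fungi describes (an idle connection closed by a short server-side `wait_timeout`) is commonly handled by reconnecting and retrying once. The following is a minimal sketch of that general pattern, not Zuul's actual code; `ConnectionLost`, `FakeConnection`, and `ReconnectingClient` are hypothetical stand-ins so the example runs without a database.

```python
import time


class ConnectionLost(Exception):
    """Stand-in for a driver's 'server has gone away' error (hypothetical)."""


class FakeConnection:
    """Hypothetical connection that fails once the server has dropped it."""

    def __init__(self):
        self.alive = True

    def execute(self, statement):
        if not self.alive:
            raise ConnectionLost("server closed idle connection")
        return f"ok: {statement}"


class ReconnectingClient:
    """Retry a statement on a fresh connection after a disconnect."""

    def __init__(self, connect, retries=1, delay=0.0):
        self._connect = connect  # factory returning a new connection
        self._retries = retries
        self._delay = delay
        self._conn = connect()

    def execute(self, statement):
        attempt = 0
        while True:
            try:
                return self._conn.execute(statement)
            except ConnectionLost:
                if attempt >= self._retries:
                    raise
                attempt += 1
                time.sleep(self._delay)
                self._conn = self._connect()  # replace the dead connection


if __name__ == "__main__":
    client = ReconnectingClient(FakeConnection)
    client._conn.alive = False  # simulate the idle timeout firing
    print(client.execute("INSERT build result"))
```

Raising `wait_timeout` as in the status log makes the disconnect rarer; a retry like this covers the cases where it still happens.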
openstackgerrit | Monty Taylor proposed openstack-infra/shade master: Use new keystoneauth version discovery https://review.openstack.org/493582 | 13:01 |
*** slaweq_ has quit IRC | 13:03 | |
*** lathiat has joined #openstack-infra | 13:03 | |
fungi | AJaeger: i saw that you didn't know about the v3 status page being up... have you tried clicking the log link for an in-progress job? | 13:03 |
*** Julien-zte has quit IRC | 13:03 | |
*** isaacb has quit IRC | 13:04 | |
AJaeger | fungi: WOW! Thanks for showing that to me. | 13:04 |
fungi | this is going to be NICE | 13:04 |
*** isaacb has joined #openstack-infra | 13:04 | |
*** gongysh has quit IRC | 13:05 | |
AJaeger | yes, it is! | 13:06 |
openstackgerrit | Leticia Wanderley proposed openstack-infra/project-config master: Update LDAP domain driver CI job to run tempest full https://review.openstack.org/492223 | 13:06 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Use openstack-publish-artifacts base job https://review.openstack.org/492715 | 13:06 |
fungi | AJaeger: ^ working i guess | 13:06 |
AJaeger | fungi: yeah, finally! | 13:07 |
AJaeger | jeblair: it merged now ^ | 13:07 |
AJaeger | fungi: that single change needed two changes in project-config and fixing the database to merge. Good that we find this now... | 13:08 |
*** esberglu has joined #openstack-infra | 13:08 | |
mordred | AJaeger: yup! that's why we're running it on ourselves first :) | 13:09 |
AJaeger | agreed | 13:09 |
fungi | AJaeger: i also discovered last night that we hadn't yet granted zuul permission to leave verify -2..+2 votes nor submit to merge in gerrit | 13:09 |
fungi | so... we're ironing out a lot of configuration gotchas on our end this way | 13:10 |
*** spzala has joined #openstack-infra | 13:12 | |
*** spzala has quit IRC | 13:12 | |
*** spzala has joined #openstack-infra | 13:12 | |
*** kgiusti has joined #openstack-infra | 13:12 | |
*** gouthamr has joined #openstack-infra | 13:15 | |
pabelanger | morning | 13:18 |
*** jamesmcarthur has joined #openstack-infra | 13:19 | |
openstackgerrit | Davanum Srinivas (dims) proposed openstack-infra/devstack-gate master: Update grenade settings for stable/pike https://review.openstack.org/493057 | 13:19 |
openstackgerrit | Numan Siddique proposed openstack-infra/project-config master: Add TripleO scenario007-container experimental job for OVN https://review.openstack.org/490622 | 13:20 |
mnaser | morning pabelanger | 13:20 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Retry updating apt-cache https://review.openstack.org/494204 | 13:21 |
mnaser | fyi i've been monitoring # of jobs ran on our cloud so far vs timeouts of voting jobs and 7 out of 308 timed out in the past 24 hours | 13:21 |
mnaser | ill still try to see if we can minimize it even more | 13:21 |
*** ramishra has quit IRC | 13:25 | |
dtantsur | folks, could you please check https://review.openstack.org/#/c/494166/ ? it blocks releasing networking-baremetal | 13:26 |
pabelanger | clarkb: fungi: ianw: apache2 process in infracloud-chocolate has died 3 times this morning | 13:27 |
*** ramishra has joined #openstack-infra | 13:28 | |
*** jamesmcarthur has quit IRC | 13:28 | |
pabelanger | 8 times on infracloud-vanilla | 13:29 |
*** jamesmcarthur has joined #openstack-infra | 13:29 | |
pabelanger | ERROR: apport (pid 7398) Wed Aug 16 13:22:48 2017: apport: report /var/crash/_usr_sbin_apache2.0.crash already exists and unseen, doing nothing to avoid disk usage DoS | 13:31 |
*** dave-mccowan has joined #openstack-infra | 13:31 | |
*** chlong_ has joined #openstack-infra | 13:31 | |
fungi | that's thoughtful of it | 13:32 |
*** felipemonteiro has joined #openstack-infra | 13:35 | |
*** sree has quit IRC | 13:35 | |
*** sree has joined #openstack-infra | 13:36 | |
*** ykarel is now known as ykarel|afk | 13:36 | |
*** sree has quit IRC | 13:36 | |
*** felipemonteiro_ has joined #openstack-infra | 13:36 | |
*** sree has joined #openstack-infra | 13:36 | |
*** mat128 has quit IRC | 13:38 | |
pabelanger | fungi: clarkb: ianw: best I can get without running apache2-dbg: http://paste.openstack.org/show/618526/ | 13:39 |
pabelanger | guess we should speed up our upgrade to xenial | 13:39 |
*** felipemonteiro has quit IRC | 13:40 | |
*** ykarel|afk has quit IRC | 13:40 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/project-config master: Bump vexxhost max-servers to 40 https://review.openstack.org/494208 | 13:41 |
mnaser | pabelanger fungi ^ this should help a bit in clearing the queue | 13:41 |
mnaser | (slow bumps because i dont want to cause more of a mess and closely monitoring timeouts/etc) | 13:41 |
fungi | it should be built with debug symbols, so just installing that package (even temporarily) to provide resolution for them should allow you to get more useful detail out of the dump | 13:41 |
AJaeger | thanks, mnaser | 13:43 |
mnaser | or AJaeger too :D thank you | 13:43 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Recycle stale SQL connections https://review.openstack.org/494210 | 13:43 |
fungi | pabelanger: when you have a sec, any chance you have a moment to review 484798 (pretty small change) so ttx and diablo_rojo can announce the ptgbot is up and running? | 13:43 |
mnaser | oh my | 13:44 |
mnaser | those zuul v3 logs are super badass | 13:44 |
pabelanger | fungi: looking | 13:44 |
fungi | mnaser: super badass is the zuul v3 motto, i think | 13:44 |
mnaser | i am just curious about how this would scale | 13:45 |
fungi | so are we | 13:45 |
fungi | ;) | 13:45 |
mnaser | aha :p | 13:45 |
mnaser | but this makes for a cool new change eliminating the need for public facing ci workers | 13:45 |
fungi | the log muxing is decoupled a bit with the idea that we should be able to scale it horizontally if needed | 13:45 |
fungi | mnaser: the actual implementation is fascinating | 13:46 |
fungi | mnaser: zuul executors serve the console logs via finger protocol | 13:46 |
fungi | and then there's a websockets proxy which feeds them to your browser | 13:46 |
mnaser | as long as it only streams logs on demand, i dont think it should be an issue (only a few crazy people watch their CI jobs like me i think, aha) | 13:46 |
mnaser | yeah, i saw the websockets part, i did a little searching around :-P | 13:47 |
fungi | right, basically the proxy connects via finger to request the stream for a specific log | 13:47 |
mordred | mnaser: for scaling - the websocket streamer is actually scaleout/load-balancer-able | 13:47 |
*** sree_ has joined #openstack-infra | 13:47 | |
mnaser | mordred nice! | 13:48 |
mordred | mnaser: so we can run as many web frontends as needed - and they get the data from the executors - which can also be scaled out to handle load as needed | 13:48 |
*** sree_ is now known as Guest66109 | 13:48 | |
fungi | right, the main scaling concern is the finger server, though we batted around some thoughts on how to tackle that if needed | 13:48 |
mnaser | mordred are you saying its web scale? :-P | 13:48 |
fungi | more of a yagni situation though | 13:48 |
mordred | mnaser: so _hopefully_ it'll prove to be a pleasingly scalable system - however, we only have one so far :) | 13:48 |
mnaser | but that's awesome. i'm really excited for the transition. | 13:48 |
mordred | mnaser: if /dev/null is very fast, i'll put my data in it | 13:48 |
*** sree has quit IRC | 13:49 | |
mnaser | your apps should be stateless so /dev/null should be your storage, databases are so old school | 13:49 |
mordred | mnaser: it's step one on the path to cloud native :) | 13:49 |
fungi | it's the next step past nosql | 13:49 |
fungi | nodb | 13:49 |
mnaser | :P | 13:50 |
fungi | we'll eventually achieve nosoftware | 13:50 |
fungi | less is more! | 13:50 |
pabelanger | Ya, speaking of scale, I've been wondering if we'll set up regional zuul-executors (alongside mirrors) to help cut down on the delay of pushing git contents to nodes | 13:50 |
*** lucas-brb is now known as lucasagomes | 13:53 | |
*** dhajare has quit IRC | 13:55 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Recycle stale SQL connections https://review.openstack.org/494210 | 13:55 |
*** slaweq_ has joined #openstack-infra | 13:59 | |
*** chlong_ has quit IRC | 13:59 | |
mordred | pabelanger: it might be a thing to consider, but it would require some additional plumbing in zuul, as currently jobs are distributed to executors without any knowledge of location | 14:00 |
fungi | pabelanger: well, we had also talked about a future state where nodepool could add and remove executors as load dictates, and pairing that with region-specific scheduling could get weird/inefficient | 14:00 |
*** dimak has quit IRC | 14:01 | |
*** udesale has joined #openstack-infra | 14:01 | |
pabelanger | mordred: Ya, I was thinking about that too the other night. I think we'd need to somehow filter gearman requests to a specific region for executors (I think that is the right place). | 14:01 |
mordred | fungi: it could - although it has the possibility to be nice if done well | 14:01 |
pabelanger | fungi: Ya | 14:01 |
*** dimak has joined #openstack-infra | 14:01 | |
*** alexchadin has joined #openstack-infra | 14:02 | |
pabelanger | mordred: jeblair: I think we are ready to test zuulv3-dev uploading with secret: https://review.openstack.org/492671 both clarkb and fungi have reviewed, do you both mind looking again | 14:03 |
*** slaweq_ has quit IRC | 14:04 | |
AJaeger | pabelanger: hwoarang added a few links to timeouts to https://review.openstack.org/#/c/493986/ - do we want to merge the request or do you have an idea how to fix those? | 14:05 |
*** EricGonc_ has joined #openstack-infra | 14:06 | |
*** EricGonczer_ has quit IRC | 14:07 | |
*** dtantsur is now known as dtantsur|bbl | 14:10 | |
*** felipemonteiro_ has quit IRC | 14:10 | |
*** ykarel|afk has joined #openstack-infra | 14:11 | |
*** lbragstad has joined #openstack-infra | 14:13 | |
pabelanger | AJaeger: hwoarang: A quick look at logs shows it might be hitting trunk.rdoproject.org directly, hard to tell since not many logs on that job. Also see other yum repos like yum.mariadb.org, which we likely can mirror. But ya, increase in timeout okay with me, still under 90mins | 14:13 |
mordred | pabelanger: lgtm +2 - when we update it for real I think we should do fungi's shred thing too | 14:16 |
*** ralonsoh_ has quit IRC | 14:16 | |
pabelanger | AJaeger: hwoarang: hard to tell, but is job configured to use wheels too? I see it adding new pip.conf files | 14:17 |
hwoarang | i think it does use wheels | 14:18 |
mordred | pabelanger: wait - doesn't the job need to request the secret? | 14:18 |
pabelanger | mordred: Oh, maybe. Looking | 14:19 |
pabelanger | mordred: ya, let me fix | 14:20 |
*** mat128 has joined #openstack-infra | 14:20 | |
*** gongysh has joined #openstack-infra | 14:21 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create site_zuulv3_dev secret https://review.openstack.org/492671 | 14:22 |
mordred | pabelanger: lgtm. in the etherpad, you've got " convert playbooks/publish/openstack-tarball.yaml to a role (publish-to-tarballs) [pabelanger] | 14:24 |
mordred | with that change listed beside it - that still on your plate or you want me to do it when I do the playbook tasks below? (I'm fine either day) | 14:24 |
mordred | way | 14:24 |
pabelanger | mordred: sure, if you want to convert | 14:27 |
*** wolverineav has quit IRC | 14:28 | |
*** wolverineav has joined #openstack-infra | 14:29 | |
*** slaweq has quit IRC | 14:30 | |
*** rbrndt has joined #openstack-infra | 14:31 | |
*** LindaWang has quit IRC | 14:31 | |
*** links has quit IRC | 14:32 | |
*** wolverineav has quit IRC | 14:32 | |
openstackgerrit | Merged openstack-infra/project-config master: jenkins: jobs: ansible-role-jobs: Increase job timeout to 90 minutes https://review.openstack.org/493986 | 14:36 |
mordred | pabelanger: found one more bug | 14:37 |
*** wolverineav has joined #openstack-infra | 14:37 | |
jeblair | fungi, mordred, pabelanger: should we also have zuul shred the ansible variables files to which it writes secrets? | 14:39 |
fungi | does zuul remove those files after they're written? | 14:40 |
fungi | is that done just as a batch when performing ephemeral cleanup of the tmpdir? | 14:40 |
fungi | or are they explicitly removed as soon as they get used? | 14:40 |
jeblair | the batch cleanup | 14:41 |
*** felipemonteiro has joined #openstack-infra | 14:41 | |
jeblair | the only other special thing about them is that they are only bind-mounted into the bwrap container for the playbook run they're used for. so for the other playbook runs, they sit in the jobdir, but outside the container. | 14:41 |
fungi | overwriting sensitive file contents with random garbage and calling sync prior to unlinking mitigates some kinds of harvesting of secrets off discarded physical media or if the hypervisor fails to zero a disk before reusing blocks | 14:42 |
*** felipemonteiro_ has joined #openstack-infra | 14:42 | |
fungi | if they're written all the way through the local filesystem caching layer to the block device (as opposed to, say, using tmpfs) then it might be worthwhile as long as it's not a lot of added code complexity | 14:42 |
jeblair | now that's an interesting idea... let me see if we can tell bwrap to put them on a tmpfs | 14:43 |
*** jamesmcarthur has quit IRC | 14:43 | |
fungi | granted, it's far from foolproof as some kinds of filesystems or virtual storage layering will cause the overwriting to go to a new block (especially true of ssd or similar solid-state media) | 14:44 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create site_zuulv3_dev secret https://review.openstack.org/492671 | 14:44 |
pabelanger | mordred: thanks | 14:44 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Recycle stale SQL connections https://review.openstack.org/494210 | 14:44 |
mordred | jeblair: question about precedence ... | 14:45 |
*** srobert has joined #openstack-infra | 14:45 | |
mordred | jeblair: nevermind | 14:46 |
*** felipemonteiro has quit IRC | 14:46 | |
jeblair | fungi: i'll move the shred conversation to #zuul | 14:46 |
*** makowals has quit IRC | 14:48 | |
*** makowals has joined #openstack-infra | 14:48 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add two roles for publishing artifacts over ssh https://review.openstack.org/494230 | 14:49 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Use artifact publication roles from zuul-jobs https://review.openstack.org/494231 | 14:49 |
*** tjones has joined #openstack-infra | 14:49 | |
mordred | jeblair, pabelanger: ^^ that's a thought as a followup to pabelanger's patch - but it raises a question (that I thought would be easier to talk about with an example) - however, I'll move that question to #zuul too | 14:51 |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: DONT REVIEW: test removing fake image from multinode https://review.openstack.org/494233 | 14:53 |
*** alexchadin has quit IRC | 14:54 | |
*** xyang1 has joined #openstack-infra | 14:54 | |
*** Guest66109 has quit IRC | 14:56 | |
*** dtantsur|bbl is now known as dtantsur | 14:58 | |
*** ykarel|afk is now known as ykarel | 14:59 | |
clarkb | ianw: puppetry to do digited mirrors lgtm | 14:59 |
*** knikolla has joined #openstack-infra | 14:59 | |
clarkb | pabelanger: re speeding up upgrade to xenial, looks like ianw wants to use https://review.openstack.org/#/c/494042/8 in that process if you want to review that | 15:00 |
*** slaweq has joined #openstack-infra | 15:00 | |
mnaser | jeblair fungi regarding secrets, i think it could be interesting to have a look at trove. i believe some folks had similar concerns which were resolved by using ramfs to store secrets | 15:00 |
mnaser | the concern was 'what if someone snapshotted the vm during a run' | 15:00 |
mnaser | and given that in infra's use case at least, you don't really "trust" the infrastructure (well, we trust each other but you know what i mean :)) | 15:01 |
*** spzala has quit IRC | 15:01 | |
*** spzala has joined #openstack-infra | 15:01 | |
rybridges | Hello everyone. Is there anything else that you guys need me to add before we can merge this? ->https://review.openstack.org/#/c/492693/ | 15:02 |
fungi | mnaser: agreed. granted tmpfs is basically a ramfs just abusing the kernel's filesystem cache without providing a backing block device | 15:02 |
mnaser | fungi maybe im being extreme but tmpfs could potentially use swap and swap out to it i think | 15:03 |
clarkb | rybridges: I need to rereview it, I've added it to my list | 15:03 |
fungi | mnaser: ramfs doesn't ever get paged out? | 15:04 |
rybridges | Okay great! Thanks clarkb | 15:04 |
mnaser | fungi https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt i think so, there is a ramfs and tmpfs section here | 15:04 |
jeblair | mnaser, fungi: tmpfs does have the advantage of being supported by bubblewrap, so it's easy to add to the restricted environment we're running in | 15:04 |
fungi | mnaser: but also, copies of sensitive values in application memory are probably just as likely to get paged, so the real fix there is to couple it with encrypted swap | 15:04 |
jeblair | of course, we could make our own tmpfs/ramfs and then bindmount it in as a normal directory. i think. | 15:05 |
fungi | mnaser: or run swapless | 15:05 |
*** slaweq has quit IRC | 15:05 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add two roles for publishing artifacts over ssh https://review.openstack.org/494230 | 15:05 |
*** sshnaidm is now known as sshnaidm|afk | 15:05 | |
mnaser | fungi i don't fully have the context but i know that some jobs generate swap with sudo locally in infra afaik | 15:06 |
mnaser | (i know zuul v3 isn't just -infra's solutions but its a use case :p) | 15:06 |
jeblair | mnaser: in this case, the swap in question would be on the infra-managed server where zuul is running (not on a test node) | 15:06 |
*** rcernin has quit IRC | 15:07 | |
mnaser | jeblair oh i see | 15:08 |
mnaser | fair enough :D | 15:08 |
fungi | mnaser: yeah, jobs performing safe handling of secrets on disk is another issue entirely | 15:08 |
fungi | and one which is likely harder for us to control | 15:08 |
pabelanger | fungi: clarkb: mordred: jeblair: okay, I think https://review.openstack.org/492671 is ready now! :) zuulv3 secret for zuulv3-dev publishing | 15:09 |
mordred | fungi: yup- but - that's a user-content issue so caveat emptor :) | 15:10 |
clarkb | mordred: pabelanger jeblair and while I'm thinking about it a secret can only be used by a job in the same zuul.yaml file? | 15:11 |
clarkb | or is any trusted repo allowed to use any secret? | 15:11 |
pabelanger | If I understand correctly now, we pin secrets to just jobs now | 15:13 |
pabelanger | so only publish-openstack-artifacts has access to it | 15:14 |
openstackgerrit | Merged openstack-infra/puppet-ptgbot master: Update to Queens, add a site index https://review.openstack.org/484798 | 15:14 |
clarkb | pabelanger: right but could I push a change to a different trusted repo that used that secret? | 15:14 |
mordred | clarkb: I'm pretty sure it's same-repo | 15:14 |
pabelanger | clarkb: I'll defer to jeblair, but I don't think so | 15:14 |
mnaser | did rax-ord mirror have issues over yesterday/today? | 15:15 |
pabelanger | limited to zuul.yaml file | 15:15 |
mnaser | http://logs.openstack.org/88/493888/1/gate/gate-puppet-octavia-puppet-unit-3.6-legacy-centos-7/625e672/console.html#_2017-08-16_08_25_43_941296 -- this job failed with "transfer closed with 351490 bytes remaining to read" on two RPMs | 15:15 |
pabelanger | mnaser: possible, we are having some apache2 crashes | 15:15 |
clarkb | pabelanger: mordred cool thanks (that definitely helps with reviewing and not needing to expand context to all the trusted repos) | 15:15 |
mordred | pabelanger, clarkb: I'm not 100% sure about zuul.d dirs - I *think* those count as "same zuul.yaml file" since they're in the same repo | 15:16 |
mordred | but I do not know for 100% sure | 15:16 |
mordred | (I would hope they count as the same thing) | 15:16 |
pabelanger | mnaser: clarkb: Ya, mirror.ord.rax is unhappy. Core dumps of apache every 5mins | 15:17 |
*** dougs1 has joined #openstack-infra | 15:17 | |
*** dougs1 has left #openstack-infra | 15:17 | |
pabelanger | http://paste.openstack.org/show/618540/ | 15:17 |
bogdando | to elastic-recheck folks, PTAL https://review.openstack.org/#/c/493535/ https://review.openstack.org/#/c/493525/ https://review.openstack.org/#/c/493520/ to reduce unknown/uncategorised numbers | 15:17 |
*** dougs1 has joined #openstack-infra | 15:18 | |
pabelanger | [393640.798829] net_ratelimit: 1 callbacks suppressed | 15:18 |
pabelanger | see that in dmesg | 15:18 |
clarkb | pabelanger: I've +2'd the secrets change but not approved it in case jeblair wants to take a look. You now have three +2's so I think you are good to go ahead either way | 15:18 |
pabelanger | a lot | 15:18 |
clarkb | I believe the net_ratelimit thing is a xen behavior that is mostly just noise | 15:18 |
*** marst has joined #openstack-infra | 15:19 | |
fungi | mnaser: current plan is to try upgrading the mirrors to xenial before troubleshooting further. if memory serves we started noticing these segfaults on web servers we run after upgrading from precise to trusty, so expecting them to go away when upgrading from trusty to xenial isn't entirely insane | 15:19 |
clarkb | you can tune interface settings to make it go away iirc | 15:19 |
mnaser | clarkb pabelanger i wonder if the hypervisor cpu/memory has issues | 15:19 |
pabelanger | that is a new server this week | 15:19 |
*** markus_z has quit IRC | 15:20 | |
mnaser | ah okay | 15:20 |
clarkb | bigger new server too | 15:20 |
mnaser | now the other issue | 15:20 |
mnaser | https://github.com/openstack-infra/project-config/blob/master/jenkins/jobs/macros.yaml#L517-L535 | 15:20 |
clarkb | I've got to do some home things for a bit but maybe we can get ianw's change in then start a xenial replacement in ord? | 15:20 |
mnaser | sudo yum -y groupinstall "Development Tools" failed .. but zuul task didnt fail | 15:21 |
clarkb | or continue ahead with iad since I think it is unhappy there too | 15:21 |
mnaser | would anyone know why? does zuul go based on the exit code of the entire "shell" job? | 15:21 |
clarkb | (I'm not sure where ianw got iad yesterday) | 15:21 |
pabelanger | mnaser: clarkb: fungi: [Errno 14] curl#18 - \"transfer closed\" does seem limited to rackspace over the last 10 days. Only 3 hits in infracloud | 15:21 |
clarkb | mnaser: yum doesn't fail if at least one item successfully installs. And yes the return code of the command zuul executes determines job status | 15:22 |
mnaser | clarkb so i take it the || true after pip uninstall is hiding the yum exit status | 15:22 |
mnaser | because that will always return true | 15:22 |
pabelanger | mnaser: I wonder if you created a mirrorlist for yum, but added 10 entries for the same mirror. Would that cause the yum client to try 'another mirror' | 15:23 |
clarkb | mnaser: ya but not being set -e will mask all the yum return codes too | 15:23 |
pabelanger | EmilienM: mwhahaha: ^ see comment about mirrorlist | 15:23 |
clarkb | mnaser: so setting it errexit may be the simplest fix there | 15:23 |
clarkb | ok I've got to do some home things for a bit, back as soon as I can | 15:24 |
mwhahaha | pabelanger: this is before we update the mirrors in the ci code | 15:24 |
mwhahaha | pabelanger: this is purely infra stuff, we didn't even get to the puppet ci builder stuff. it failed in bindep processing | 15:24 |
mwhahaha | pabelanger: so you'd have to do that within the images | 15:25 |
pabelanger | mwhahaha: Right, I was mostly curious about the mirrorlist question, do you happen to know? | 15:25 |
*** udesale has quit IRC | 15:25 | |
pabelanger | My next test is going to create one, but add the same mirror hostname to it 10 times | 15:25 |
pabelanger | and see if yum will kick to the 'next mirror' and try to download the package | 15:26 |
pabelanger | but, really hitting the same mirror | 15:26 |
mwhahaha | pabelanger: yea i believe you can fake it by listing it multiple times | 15:26 |
pabelanger | Ya, I am starting to think this is likely the next step. mirrorlist to have yum client attempt multiple downloads | 15:27 |
*** coolsvap has quit IRC | 15:27 | |
openstackgerrit | Merged openstack-infra/project-config master: Bump vexxhost max-servers to 40 https://review.openstack.org/494208 | 15:27 |
*** shardy is now known as shardy_afk | 15:27 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/project-config master: Fail early if any Puppet preparation commands fail https://review.openstack.org/494244 | 15:28 |
mnaser | EmilienM mwhahaha ^ when you have any free time | 15:30 |
*** links has joined #openstack-infra | 15:33 | |
*** dizquierdo has quit IRC | 15:34 | |
*** dimak has quit IRC | 15:38 | |
*** dimak has joined #openstack-infra | 15:39 | |
pabelanger | mnaser: mwhahaha: EmilienM: so, CentOS-Base.repo would be setup to use mirrorlist: http://paste.openstack.org/show/618547/ | 15:39 |
pabelanger | I'll do some testing in a bit to see if that actually works, then maybe we can update configure_mirror.sh to do it for default repos | 15:40 |
jeblair | i'm going to try pushing some inap rename changes through today | 15:43 |
*** dougs1 has left #openstack-infra | 15:43 | |
jeblair | infra-root: so, if nodepool dies, that may be why. | 15:43 |
pabelanger | ack | 15:43 |
mnaser | pabelanger i wonder why yum just doesnt retry on its won | 15:43 |
mnaser | own | 15:43 |
pabelanger | mnaser: it does, 10 times by default | 15:44 |
openstackgerrit | Merged openstack-infra/project-config master: Create site_zuulv3_dev secret https://review.openstack.org/492671 | 15:44 |
pabelanger | mnaser: however, I am not sure it retries if the connection is broken | 15:44 |
*** yolanda has quit IRC | 15:44 | |
pabelanger | mnaser: I think it would fail over to next mirror, if setup | 15:44 |
*** yolanda has joined #openstack-infra | 15:44 | |
pabelanger | Yay, secrets merged! | 15:44 |
mnaser | pabelanger ah yes maybe failed downloads are not recoverable in yum (maybe) | 15:44 |
*** ccamacho has quit IRC | 15:44 | |
pabelanger | starting testing on zuulv3-dev.o.o | 15:44 |
*** dave-mcc_ has joined #openstack-infra | 15:45 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/project-config master: Fail early if any Puppet preparation commands fail https://review.openstack.org/494244 | 15:45 |
mnaser | AJaeger thanks for the comments, addressed :) | 15:45 |
*** ccamacho has joined #openstack-infra | 15:46 | |
*** dave-mccowan has quit IRC | 15:47 | |
*** yamahata has joined #openstack-infra | 15:47 | |
*** links has quit IRC | 15:49 | |
*** dave-mccowan has joined #openstack-infra | 15:50 | |
*** spzala has quit IRC | 15:51 | |
jeblair | infra-root: i propose to remove zl07, zl08 and zl09 from the current zuulv2 deployment and create ze02, ze03 and ze04 for the zuulv3 deployment so that we can test startup times with no net change in server footprint. | 15:51 |
*** spzala has joined #openstack-infra | 15:51 | |
jeblair | i'll note that 08 and 09 seem not to have ever gone into production for some (iptables?) reason, so it's only a net loss of one launcher from the v2 deployment. | 15:51 |
jeblair | how does that sound? | 15:52 |
fungi | jeblair: sounds fine to me since we've significantly dropped our aggregate quota recently anyway | 15:52 |
pabelanger | jeblair: mordred: fungi: clarkb: while the job failed, http://logs.openstack.org/89/489689/15/check/publish-openstack-python-branch-tarball/45a2698/ did properly connect to zuulv3-dev.o.o, if people would like to confirm logs | 15:52 |
*** isaacb has quit IRC | 15:52 | |
pabelanger | http://zuulv3-dev.openstack.org/logs/sandbox/ was created | 15:52 |
pabelanger | so, SSH key worked as expected | 15:52 |
*** dave-mcc_ has quit IRC | 15:52 | |
clarkb | jeblair: yes what fungi said, I expect it to be fine unless we fall into a lot more new cloud quota | 15:53 |
*** Sukhdev_ has joined #openstack-infra | 15:53 | |
openstackgerrit | Merged openstack-infra/system-config master: Add inap cloud definition https://review.openstack.org/493226 | 15:53 |
openstackgerrit | Gage Hugo proposed openstack-infra/project-config master: Skip non-doc jobs in certain cases https://review.openstack.org/494027 | 15:53 |
fungi | pabelanger: excellent! | 15:54 |
*** tumbarka__ has joined #openstack-infra | 15:54 | |
pabelanger | fungi: Ya, super exciting! | 15:54 |
clarkb | pabelanger: one thing I notice is we gather facts for the "logs" server. Might want to turn that off as hundreds of jobs all collecting facts seems unnecessary | 15:54 |
*** martinkopec has quit IRC | 15:55 | |
pabelanger | clarkb: Ya, we have fact caching right now, but that is limited per job. But agree, right now we are not using any facts on that playbook | 15:55 |
*** spzala has quit IRC | 15:55 | |
*** pcaruana has quit IRC | 15:57 | |
*** dizquierdo has joined #openstack-infra | 15:59 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Disable facts on publish-openstack-artifacts jobs https://review.openstack.org/494251 | 15:59 |
pabelanger | clarkb: ^ | 15:59 |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Remove zl07-zl09; add ze02-ze04 https://review.openstack.org/494252 | 16:00 |
*** wolverineav has quit IRC | 16:00 | |
jeblair | pabelanger: remind me why gathering facts is enabled by default? | 16:00 |
*** slaweq has joined #openstack-infra | 16:01 | |
pabelanger | jeblair: we enabled them because we started caching facts | 16:02 |
jeblair | pabelanger: i see the change where you turned it on, but there was no reason | 16:02 |
jeblair | pabelanger: as you point out, the caching is ineffective across job runs | 16:02 |
mordred | jeblair: we gather them because we need them for things like os_type | 16:02 |
mordred | in modules like 'package' | 16:02 |
pabelanger | right, but its possible to have multiple plays for a job, so in that case fact cache is helpful | 16:02 |
mordred | but we turn on caching and then gather them once in the first playbook, so that for any given job we only gather them once per node | 16:03 |
jeblair | mordred: okay, so it's required on the build nodes for some roles | 16:03 |
mordred | yes | 16:03 |
pabelanger | ya | 16:03 |
openstackgerrit | Gage Hugo proposed openstack-infra/project-config master: Skip non-doc jobs in certain cases https://review.openstack.org/494027 | 16:03 |
mordred | and for some very common ansible modules | 16:03 |
mordred | we also put in a blank file for localhost so that we don't fact-gather iirc | 16:03 |
kklimonda | can someone give me a brief explanation why did openstack drop puppetmaster/puppetdb and went with ansible playbooks in cron running agent? was it for performance reasons, or did you need more fine-grained orchestration for AFS and git/gerrit integration? | 16:04 |
jeblair | mordred: ya | 16:04 |
mordred | kklimonda: the second | 16:04 |
mordred | kklimonda: also, puppet agent had a tendency to hang when run in daemon mode | 16:04 |
*** wolverineav has joined #openstack-infra | 16:04 | |
mordred | kklimonda: but it was the sequencing that pushed us to using ansible to run puppet | 16:04 |
fungi | kklimonda: basically, puppet is not great at complex task orchestration so we went with ansible to provide that | 16:05 |
mordred | kklimonda: basically, when creating new projects, we need to create them on the mirrors first, then on the gerrit server, or else things go to the bad place | 16:05 |
*** jascott1 has joined #openstack-infra | 16:05 | |
fungi | but puppet does excel at declarative configuration management (as long as you keep all its actions idempotent) so we continued using it for that purpose | 16:05 |
mordred | yup. that and we have a large pile of it :) | 16:06 |
pabelanger | mordred: jeblair: technically we could share fact cache across multiple playbook runs, maybe something to discuss at ptg | 16:06 |
*** slaweq has quit IRC | 16:06 | |
jeblair | pabelanger: we do share it across multiple playbook runs | 16:06 |
kklimonda | thanks, another thing to consider :) | 16:07 |
jeblair | pabelanger: we don't share it across multiple jobs | 16:07 |
pabelanger | jeblair: sorry, yes, multiple job runs | 16:07 |
clarkb | jeblair: pabelanger right and that only really becomes a problem when collecting facts on shared nodes (like our logs/tarballs/archival servers) | 16:07 |
jeblair | pabelanger: we should not share facts across job runs. there is almost nothing in common. | 16:07 |
jeblair | except what clarkb says | 16:07 |
jeblair | which is the exception to the rule | 16:07 |
clarkb | because its a potential dos against that server | 16:07 |
*** ykarel is now known as ykarel|afk | 16:07 | |
mordred | yah. if we need facts for the log/tarball server - we could come up with a way to pre-populate the fact-cache with information about them | 16:08 |
pabelanger | Ya, facts on control plane servers would be the use case. Not an issue ATM, because we don't need any facts | 16:08 |
mordred | in fact, we could put those facts into the secret and have the job plop the info into the fact cache like it does with host keys | 16:08 |
pabelanger | ya, that is possible | 16:09 |
jeblair | i have invoked zuul-launcher graceful on zl07 | 16:09 |
*** bogdando has quit IRC | 16:09 | |
*** dmellado has quit IRC | 16:11 | |
jeblair | deleted zl08 and zl09 | 16:11 |
*** ramishra has quit IRC | 16:13 | |
*** trown is now known as trown|lunch | 16:13 | |
*** Apoorva has joined #openstack-infra | 16:14 | |
*** amoralej has joined #openstack-infra | 16:14 | |
openstackgerrit | Rob Cresswell proposed openstack-infra/irc-meetings master: Update Horizon meeting to reflect current PTLs https://review.openstack.org/494257 | 16:15 |
amoralej | pabelanger, it seems there is some issue with synchronization of buildlogs in http://mirror.dfw.rax.openstack.org:8080/buildlogs.centos/centos/7/cloud/x86_64/openstack-pike/ | 16:15 |
amoralej | it seems repo metadata is not properly synced | 16:15 |
pabelanger | amoralej: we don't sync, that is just a reverse proxy cache | 16:15 |
pabelanger | amoralej: what data is incorrect? | 16:16 |
pabelanger | amoralej: we are also hitting buildlogs.cdn.centos.org, so it is possible it has the issue | 16:16 |
amoralej | pabelanger, is there any way to force synchronization of http://mirror.dfw.rax.openstack.org:8080/buildlogs.centos/centos/7/cloud/x86_64/openstack-pike/repodata/repomd.xml ? | 16:16 |
amoralej | metadata doesn't come from cdn | 16:17 |
pabelanger | amoralej: http://buildlogs.cdn.centos.org/centos/7/cloud/x86_64/openstack-pike/repodata/repomd.xml | 16:17 |
pabelanger | that is where you are actually getting | 16:17 |
amoralej | lemme check redirects | 16:18 |
pabelanger | so, we'd need to check expire headers on the cache data | 16:18 |
pabelanger | possible apache is not refreshing it | 16:18 |
amoralej | pabelanger, in fact we shouldn't use cdn for metadata, that may be the issue | 16:19 |
* amoralej checking | 16:19 | |
*** Apoorva_ has joined #openstack-infra | 16:19 | |
pabelanger | amoralej: okay, this would need to be fixed upstream. Because we are just proxying the data | 16:20 |
*** pbourke has quit IRC | 16:21 | |
*** pbourke has joined #openstack-infra | 16:23 | |
*** Apoorva has quit IRC | 16:23 | |
amoralej | pabelanger, you proxy all requests to buildlogs.cdn.centos.org, right? | 16:23 |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-zuul master: Install ara on the executors https://review.openstack.org/494260 | 16:25 |
pabelanger | amoralej: yes because buildlogs.centos.org redirects to it. So we now directly use it: https://review.openstack.org/492256/ | 16:25 |
mordred | jeblair, pabelanger: ^^ that's a followup to https://review.openstack.org/#/c/487853 | 16:25 |
amoralej | pabelanger, buildlogs.centos.org doesn't redirect for metadata | 16:25 |
amoralej | only for rpms | 16:25 |
pabelanger | amoralej: so that is the issue, metadata on buildlogs.cdn.centos.org is stale | 16:25 |
pabelanger | and we are getting it | 16:25 |
amoralej | yeah | 16:26 |
amoralej | pabelanger, i'm trying to figure out if proxying to cdn for metadata should work | 16:26 |
amoralej | and we just need to fix that | 16:26 |
openstackgerrit | Merged openstack-infra/project-config master: Disable facts on publish-openstack-artifacts jobs https://review.openstack.org/494251 | 16:26 |
amoralej | or we should switch to not use cdn for metadata | 16:26 |
mnaser | possible infracloud-vanilla issues (network?): http://logs.openstack.org/17/494217/1/check/gate-puppet-openstack-integration-4-scenario002-tempest-centos-7/5e5f67e/console.html | 16:26 |
*** isaacb has joined #openstack-infra | 16:27 | |
pabelanger | amoralej: so, we can either revert https://review.openstack.org/492336/ which means extra 302 redirects for every buildlogs RPM attempt, or ask buildlogs.cdn.centos.org to maybe mirror faster | 16:27 |
*** ccamacho has left #openstack-infra | 16:27 | |
pabelanger | amoralej: lets see how long it takes before buildlogs.cdn.centos.org updates metadata | 16:28 |
pabelanger | Otherwise, we should be able to write an apache rule just to hit buildlogs.centos.org for the metadata | 16:28 |
*** ihrachys has quit IRC | 16:28 | |
amoralej | pabelanger, update in buildlogs was 14 hours ago or so | 16:30 |
pabelanger | amoralej: wow, guess they are slow | 16:30 |
*** Apoorva_ has quit IRC | 16:31 | |
clarkb | pabelanger: what is odd to me is it seems the yum mirror hits these problems more than anything else? | 16:31 |
*** Apoorva has joined #openstack-infra | 16:31 | |
*** camunoz has quit IRC | 16:32 | |
clarkb | pip hits it with the hash mismatch but far less frequently despite running far more jobs against it | 16:32 |
clarkb | I wonder if there is something specific to how we are mirroring centos repos that tickles this behavior | 16:32 |
*** florianf has quit IRC | 16:32 | |
pabelanger | clarkb: sorry, which issue are you referencing ATM | 16:32 |
clarkb | pabelanger: the apache segfaults | 16:32 |
clarkb | pabelanger: aiui they manifest in e-r as yum client no more mirrors to try and the pip hash mismatch error | 16:33 |
*** e0ne has quit IRC | 16:33 | |
pabelanger | clarkb: Right, so we are pushing more yum things via reverse proxy than apt ATM. So I am starting to think it might be related | 16:34 |
pabelanger | Pip would just hit afs cache | 16:34 |
clarkb | pabelanger: but aren't the segfaults caused by afs? | 16:34 |
clarkb | I thought that is what ianw found yesterday | 16:34 |
pabelanger | clarkb: Oh, I am not sure. I think we'd need to get apache debug symbols to look at the issue I saw this morning | 16:34 |
clarkb | pabelanger: pretty sure ianw tracked it to afs /me reads scrollback | 16:35 |
pabelanger | I don't think I saw anything specific to AFS during that time | 16:35 |
pabelanger | infracloud-chocolate is what I was looking at this morning | 16:35 |
clarkb | pabelanger: http://paste.openstack.org/show/618560/ | 16:36 |
*** yamahata has quit IRC | 16:36 | |
openstackgerrit | Paul Belanger proposed openstack-infra/system-config master: Revert "Replace buildlogs.centos with buildlogs.cdn.centos" https://review.openstack.org/494265 | 16:36 |
clarkb | so that is correlation not necessarily at fault, but looks really suspicious | 16:37 |
pabelanger | clarkb: Ya, I saw that last night. But I didn't see the same on infracloud | 16:37 |
pabelanger | only thing I see in dmesg for afs is | 16:37 |
pabelanger | [2535506.431981] afs: file server 23.253.73.143 in cell openstack.org is back up (code 0) (multi-homed address; other same-host interfaces may still be down) | 16:38 |
clarkb | which is from last week so older | 16:38 |
pabelanger | amoralej: see ^494265 | 16:38 |
jeblair | clarkb: i agree that EINTR and segv are related, but i honestly have no idea which is the cause. | 16:38 |
amoralej | pabelanger, i've forced fetching the file again and it's synced now | 16:38 |
jeblair | clarkb: i think it's equally likely that the EINTR is a result of apache receiving sigsegv | 16:39 |
felipemonteiro_ | anyone aware that http://apps.openstack.org/ is done? just wondering whether it's a known issue | 16:39 |
felipemonteiro_ | down* | 16:39 |
jeblair | clarkb: (based on not actually having tracked it down) | 16:39 |
pabelanger | clarkb: http://paste.openstack.org/show/618526/ was the best I could get from coredump this morning. We should likely add apache2-dbg | 16:39 |
pabelanger | amoralej: okay, where did you do that? | 16:40 |
pabelanger | amoralej: or how | 16:40 |
amoralej | curl http://mirror.dfw.rax.openstack.org:8080/buildlogs.centos/centos/7/cloud/x86_64/openstack-pike/repodata/repomd.xml?foo123 | 16:40 |
pabelanger | amoralej: Ah, okay. So manually. Ya, we can do 494265 for now, since it was an optimization | 16:40 |
amoralej | adding ?whatever forces to recheck | 16:40 |
amoralej | pabelanger, lemme check with centos team if we can trust on cdn or not for that | 16:41 |
amoralej | if it's issue in cdn we should fix it there | 16:41 |
pabelanger | amoralej: sure, that would be helpful | 16:41 |
clarkb | pabelanger: ya would need more info on what read is happening | 16:43 |
*** vhosakot has joined #openstack-infra | 16:43 | |
clarkb | felipemonteiro_: yes I believe it was intentionally shut down | 16:44 |
pabelanger | clarkb: it is quite possible that read operation was for AFS, and it just took too long or something networking related happened. | 16:46 |
pabelanger | apache proxy cache is on local filesystem | 16:46 |
*** rama_y has joined #openstack-infra | 16:47 | |
jlvillal | clarkb, Is there anything that actually acts upon DEVSTACK_GATE_TLSPROXY yet? | 16:47 |
jlvillal | clarkb, Doing a quick codesearch.openstack.org, I couldn't find anything. | 16:47 |
clarkb | felipemonteiro_: I'm trying to find record of discussion for that and failing so I may be wrong but looking | 16:47 |
jlvillal | clarkb, I see things that are setting DEVSTACK_GATE_TLSPROXY, but nothing that I can find that uses it. | 16:47 |
clarkb | jlvillal: devstack-gate features.yaml | 16:48 |
felipemonteiro_ | clarkb: I was thinking back to http://lists.openstack.org/pipermail/openstack-dev/2017-March/113362.html but was also looking for where the official note of it is if TC did in fact decide to shut it down | 16:48 |
jlvillal | clarkb, Yep, I see it. Thanks! | 16:48 |
clarkb | jlvillal: if its set we enable the service on newer branches via devstack-gate's feature thing | 16:48 |
amoralej | pabelanger, the recommendation from centos team is not to use cdn for metadata, in fact they are telling me to not use cdn but buildlogs.centos.org and follow redirects | 16:48 |
clarkb | jlvillal: it was done that way because it makes it easy to control what branches it is enabled for | 16:48 |
jlvillal | clarkb, Thanks! | 16:49 |
clarkb | amoralej: pabelanger out of curiosity why can't we just use the base OS repos? | 16:49 |
*** rcernin has joined #openstack-infra | 16:49 | |
pabelanger | amoralej: okay, so we should land 494265 | 16:49 |
amoralej | yeah | 16:50 |
clarkb | we already mirror all of centos and epel, and I think rdo is proxied and that works? | 16:50 |
pabelanger | clarkb: buildlogs contains pre-release packages which haven't landed in centos.org | 16:50 |
clarkb | pabelanger: why are we using prerelease packages? | 16:50 |
clarkb | we don't do that anywhere else aiui | 16:50 |
pabelanger | clarkb: tripleo does it for their workflow | 16:50 |
clarkb | because we aren't testing centos | 16:50 |
pabelanger | testing openstack RPM | 16:51 |
*** jpich has quit IRC | 16:51 | |
pabelanger | I don't know the history, but there is a complicated relationship for RPMs between RDO, DLRN and buildlogs | 16:52 |
mgagne | is there a way to unqueue a change from Zuul? Use case is: Zuul thinks Jenkins job is running but it's not for reasons. and now it waits forever | 16:52 |
clarkb | mgagne: you can push a new patchset | 16:52 |
pabelanger | just trying to make sure sites are reliable | 16:52 |
*** spzala has joined #openstack-infra | 16:52 | |
mgagne | clarkb: clever | 16:52 |
clarkb | pabelanger: ya I'm trying to understand why we rely on such a complicated setup :) | 16:52 |
clarkb | pabelanger: and it's because rdo doesn't host the prerelease openstack packages, those are hosted by buildlogs? | 16:53 |
pabelanger | clarkb: Yes, I think that is part of the issue also. RDO has limited infrastructure so they rely on buildlogs for things | 16:53 |
*** egonzalez has quit IRC | 16:53 | |
*** lucasagomes is now known as lucas-afk | 16:54 | |
pabelanger | clarkb: one of the things I am hoping to do at PTG is sit down with tripleo / puppet teams and better understand all the infrastructure they are using and why | 16:54 |
pabelanger | because some of it I also don't understand | 16:54 |
pabelanger | clarkb: fungi: where can I find our credentials to testpypi? I've been just using personal ones for the moment | 16:55 |
fungi | pabelanger: i don't know that we have any | 16:56 |
fungi | would need to create some | 16:56 |
*** rhallisey has quit IRC | 16:57 | |
*** dprince has quit IRC | 16:58 | |
openstackgerrit | Flavio Percoco proposed openstack-infra/project-config master: Split COE scenarios: 1 for each COE https://review.openstack.org/494272 | 16:58 |
fungi | felipemonteiro_: clarkb: announcement was here: http://lists.openstack.org/pipermail/openstack-operators/2017-July/013965.html | 16:58 |
clarkb | ah on operators list, that explains my failed searching. Thanks fungi for finding it | 16:59 |
fungi | yeah, it seemed far less relevant to the dev ml | 16:59 |
fungi | but some operators may have configured deployments to integrate that app catalog beta site | 17:00 |
*** strigazi_OFF is now known as strigazi | 17:00 | |
pabelanger | fungi: k | 17:00 |
felipemonteiro_ | fungi: thank you | 17:01 |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-zuul master: Zuulv3: move the job dir under /var/lib/zuul https://review.openstack.org/494273 | 17:01 |
fungi | felipemonteiro_: you're welcome | 17:02 |
*** rhallisey has joined #openstack-infra | 17:02 | |
*** rhallisey has quit IRC | 17:02 | |
clarkb | pabelanger: the infracloud read problems could also just be slow networking there | 17:02 |
jeblair | clarkb, fungi, pabelanger: i'd like to merge 494273 and perform the corresponding filesystem creation before performing the "close everything" startup test. | 17:02 |
pabelanger | clarkb: ya I think that is possible | 17:02 |
clarkb | pabelanger: that said, replacing mirrors in infracloud is likely far simpler than rax due to lack of cinder volumes. Do we want to just go ahead and replace those two mirrors after we get ianw's change in? | 17:02 |
*** rhallisey has joined #openstack-infra | 17:02 | |
*** slaweq has joined #openstack-infra | 17:03 | |
clarkb | pabelanger: we can leave the old one hanging around too since we reduced nodepool's max server in infracloud | 17:03 |
clarkb | (easy revert that way) | 17:03 |
*** shardy_afk is now known as shardy | 17:03 | |
*** Swami has joined #openstack-infra | 17:04 | |
pabelanger | jeblair: 1 question on 494273 | 17:04 |
pabelanger | clarkb: ya, upgrading them to xenial should be straightforward | 17:04 |
pabelanger | clarkb: fungi: do you mind helping land https://review.openstack.org/494265/ that will fix metadata issue amoralej is seeing | 17:06 |
*** derekh has quit IRC | 17:06 | |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-zuul master: Zuulv3: move the job dir under /var/lib/zuul https://review.openstack.org/494273 | 17:07 |
*** slaweq has quit IRC | 17:07 | |
jeblair | pabelanger, fungi: yeah, i don't think it matters but updated anyway ^ | 17:07 |
*** isaacb has quit IRC | 17:07 | |
clarkb | pabelanger: what rewrites the 302 content to point back at our proxy? | 17:07 |
pabelanger | jeblair: fungi: I fear linting issue on arrows | 17:09 |
*** Sukhdev_ has quit IRC | 17:09 | |
pabelanger | but +3 | 17:09 |
jeblair | for crying out loud | 17:09 |
pabelanger | clarkb: apache2 will rewrite it properly | 17:10 |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-zuul master: Zuulv3: move the job dir under /var/lib/zuul https://review.openstack.org/494273 | 17:10 |
clarkb | pabelanger: oh on the proxy pass reverse because that affects the entire vhost | 17:11 |
clarkb | pabelanger: got it | 17:11 |
pabelanger | ya, metadata will not 302, so we need that to hit buildlogs, everything does 302 to buildlogs.cdn, which we also catch | 17:11 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create testpypi_secret secret for zuulv3 https://review.openstack.org/494276 | 17:12 |
pabelanger | fungi: do you mind creating some credentials on testpypi.python.org and maybe update ^? | 17:12 |
*** ihrachys has joined #openstack-infra | 17:13 | |
fungi | pabelanger: i may not have time today. been trying for hours to free myself up for last-minute yardwork and packing before i head out of town | 17:14 |
pabelanger | fungi: okay, I don't mind doing it. Wanted to share the love with intree credentials :) | 17:15 |
fungi | pabelanger: though there's nothing requiring me specifically to create an openstackci account on testpypi as far as i know... anyone should be able to | 17:15 |
jeblair | pabelanger, fungi, clarkb: this is what i have done on ze02 in order to put all the git repos and build dirs on the same filesystem (the ephemeral disk): http://paste.openstack.org/show/618563/ | 17:15 |
jeblair | that look okay? | 17:15 |
clarkb | pabelanger: ya you should be able to do it, just add the info to the password file | 17:15 |
fungi | jeblair: looks right to me | 17:16 |
*** yamahata has joined #openstack-infra | 17:16 | |
openstackgerrit | Monty Taylor proposed openstack-infra/shade master: Use new keystoneauth version discovery https://review.openstack.org/493582 | 17:16 |
fungi | jeblair: i suppose if you cared about uptime you could move line 1 down to between 4 and 5 | 17:16 |
jeblair | not so much | 17:16 |
fungi | exactly | 17:16 |
fungi | lgtm | 17:17 |
clarkb | ++ also there is noatime :) | 17:17 |
fungi | jeblair: oh, and missing a `service zuul-executor start` at the end obviously | 17:17 |
jeblair | fungi: yeah. though i haven't run that part yet :) | 17:18 |
fungi | jeblair: only other thing i can think of is making sure to chmod/chown /mnt to match /var/lib/zuul | 17:18 |
fungi | in case initscripts don't take care of that automagically for us | 17:19 |
*** dtantsur is now known as dtantsur|afk | 17:19 | |
clarkb | puppet will get to that eventually | 17:19 |
jeblair | fungi: indeed that was incorrect, thanks :) | 17:19 |
fungi | or pupprt | 17:19 |
*** jascott1 has quit IRC | 17:19 | |
jeblair | http://paste.openstack.org/show/618564/ | 17:20 |
*** jascott1 has joined #openstack-infra | 17:20 | |
jeblair | okay, i'll do that to ze03 and 04 now, start them up, then do ze01 | 17:21 |
*** e0ne has joined #openstack-infra | 17:21 | |
*** ykarel|afk has quit IRC | 17:21 | |
jeblair | oh neat | 17:21 |
jeblair | zuulv3.o.o is not running iptables | 17:21 |
pabelanger | oh darn | 17:21 |
pabelanger | heh, used openstack-infra@lists.openstack.org as registration email for testpypi when I should have used infra-root | 17:22 |
jeblair | pabelanger: please change that | 17:22 |
pabelanger | yes, trying to do so now | 17:22 |
*** sambetts is now known as sambetts|afk | 17:23 | |
*** shardy has quit IRC | 17:23 | |
*** Apoorva has quit IRC | 17:23 | |
pabelanger | is somebody able to help me with mailman to not have that email get posted? | 17:23 |
pabelanger | validation link is likely en route to that | 17:23 |
*** Apoorva has joined #openstack-infra | 17:23 | |
jeblair | pabelanger: it should be held for moderation | 17:23 |
*** tjones has left #openstack-infra | 17:24 | |
*** jascott1 has quit IRC | 17:24 | |
pabelanger | great, thank you | 17:24 |
*** jascott1 has joined #openstack-infra | 17:25 | |
fungi | pabelanger: yeah, i'll discard in the moderation queue | 17:25 |
pabelanger | fungi: which email should I use for testpypi.python.org verification? | 17:26 |
pabelanger | safe with infra-root@o.o? | 17:26 |
*** dprince has joined #openstack-infra | 17:26 | |
fungi | and done | 17:27 |
*** rbrndt has quit IRC | 17:27 | |
fungi | pabelanger: yeah, that's fine | 17:27 |
fungi | also, i highly recommend someone besides just me sets up imap to watch that infra-root mailbox | 17:27 |
fungi | wouldn't be a bad idea to get a second moderator onto the infra ml as well | 17:27 |
pabelanger | ya, I do not have that setup. I'll do that this afternoon | 17:28 |
jeblair | fungi, clarkb, pabelanger: i fixed iptables on zuulv3.o.o which means i need https://review.openstack.org/494252 merged to proceed | 17:28 |
jeblair | mordred: ^ | 17:28 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create testpypi_secret secret for zuulv3 https://review.openstack.org/494276 | 17:29 |
*** jascott1 has quit IRC | 17:29 | |
clarkb | jeblair: what was wrong with iptables? | 17:30 |
*** eranrom has quit IRC | 17:30 | |
jeblair | Active: failed (Result: exit-code) since Sat 2017-07-08 01:12:00 UTC; 1 months 9 days ago | 17:30 |
jeblair | Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable. | 17:30 |
*** Apoorva_ has joined #openstack-infra | 17:30 | |
jeblair | clarkb: ^ i don't know :( i restarted it and it's up now | 17:31 |
clarkb | :( | 17:31 |
jeblair | maybe last time it started ze01 didn't resolve or something? | 17:31 |
clarkb | jeblair: we may want to check that journald is configured to log to disk and not use a ring buffer. | 17:31 |
clarkb | I think I checked that on centos images in rax when I found it was a problem but not on ubuntu because we didn't have xenial yet | 17:32 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Emit and publish ara logs if available https://review.openstack.org/494281 | 17:32 |
clarkb | jeblair: ya just confirmed its ring buffering | 17:33 |
clarkb | jeblair: in /etc/systemd/journald.conf we set it to auto which will log to disk if /var/log/journal exists but otherwise use ringbuffer | 17:33 |
*** Apoorva has quit IRC | 17:33 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Create a whitelist for /etc configs https://review.openstack.org/493973 | 17:33 |
clarkb | we may just want to add a /var/log/journal resource to puppet to make sure it's everywhere and we get persistent logging | 17:33 |
*** spzala has quit IRC | 17:34 | |
*** e0ne has quit IRC | 17:34 | |
*** spzala has joined #openstack-infra | 17:35 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Exclude list for logs collection https://review.openstack.org/494022 | 17:35 |
jeblair | clarkb: ++ | 17:36 |
clarkb | I'll get that patch up as soon as I determine the correct ownership and perms | 17:36 |
jeblair | cool, i'm going to take a short break while those patches bake | 17:36 |
*** rlandy is now known as rlandy|brb | 17:37 | |
*** bhavik1 has joined #openstack-infra | 17:38 | |
*** spzala has quit IRC | 17:39 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create testpypi_secret secret for zuulv3 https://review.openstack.org/494276 | 17:39 |
pabelanger | okay, testpypi credentials | 17:40 |
*** jtomasek is now known as jtomasek|afk | 17:41 | |
*** tosky has quit IRC | 17:41 | |
pabelanger | OpenStackCloudHTTPError: (403) Client Error for url: https://compute-ca-ymq-1.vexxhost.net/v2.1/86bbbcfa8ad043109d2d7af530225c72/servers Quota exceeded for ram: Requested 8192, but already used 204800 of 204800 ram' | 17:46 |
pabelanger | mnaser: looks like quota issue vexxhost | 17:47 |
mnaser | that'll stop it from hitting 40 | 17:47 |
mnaser | ill raise to 409600 | 17:47 |
mnaser | pabelanger done | 17:48 |
mnaser | there we go | 17:48 |
mnaser | launching 18 now | 17:49 |
pabelanger | Ya, seeing ready nodes now | 17:49 |
*** trown|lunch is now known as trown | 17:52 | |
pabelanger | mnaser: thanks, 40 online now | 17:52 |
mnaser | pabelanger sweet, i've been keeping an eye on timeout/job run ratio | 17:53 |
mnaser | 3-4 in past 24 hours out of 300 something so hopefully that stabilizes | 17:53 |
*** gongysh has quit IRC | 17:54 | |
*** ociuhandu has quit IRC | 17:55 | |
*** strigazi is now known as strigazi_OFF | 17:55 | |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Make journal logs persistent on disk https://review.openstack.org/494282 | 17:56 |
clarkb | that was actually far more reading than I expected it to be | 17:56 |
clarkb | pabelanger: ok I'm going to go grab some early lunch, but when I get back I think I am ready to boot some new xenial mirrors | 17:57 |
*** e0ne has joined #openstack-infra | 17:57 | |
pabelanger | clarkb: ack | 18:00 |
*** electrofelix has quit IRC | 18:00 | |
*** rlandy|brb is now known as rlandy | 18:02 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: WIP: test creating fake image in oooq extras https://review.openstack.org/494233 | 18:02 |
*** slaweq has joined #openstack-infra | 18:03 | |
*** spzala has joined #openstack-infra | 18:03 | |
*** dave-mccowan has quit IRC | 18:05 | |
*** e0ne has quit IRC | 18:07 | |
*** tosky has joined #openstack-infra | 18:07 | |
*** e0ne has joined #openstack-infra | 18:08 | |
*** e0ne has quit IRC | 18:08 | |
*** slaweq has quit IRC | 18:09 | |
*** dave-mcc_ has joined #openstack-infra | 18:09 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add sphinx-autodoc-typehits sphinx extension https://review.openstack.org/492557 | 18:09 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Collect logging information into ara callback https://review.openstack.org/487853 | 18:09 |
jeblair | clarkb: is that journald change missing a git add? | 18:09 |
*** pcaruana has joined #openstack-infra | 18:09 | |
openstackgerrit | Merged openstack-infra/system-config master: Remove zl07-zl09; add ze02-ze04 https://review.openstack.org/494252 | 18:10 |
openstackgerrit | Merged openstack-infra/bindep master: Add ability to list all deps https://review.openstack.org/492693 | 18:10 |
*** jascott1 has joined #openstack-infra | 18:11 | |
clarkb | jeblair: yes sorry | 18:11 |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Make journal logs persistent on disk https://review.openstack.org/494282 | 18:12 |
*** rhallisey has quit IRC | 18:13 | |
*** slaweq has joined #openstack-infra | 18:15 | |
*** rhallisey has joined #openstack-infra | 18:17 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Create nodepool.cloud inventory variable https://review.openstack.org/493088 | 18:18 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Use nodepool.cloud for zuul_site_mirror_fqdn https://review.openstack.org/494288 | 18:20 |
pabelanger | jeblair: ^should be fix for internap mirror URL | 18:20 |
pabelanger | just waiting until we restart zuulv3 with nodepool.cloud inventory to confirm | 18:20 |
*** makowals_ has joined #openstack-infra | 18:21 | |
*** ociuhandu has joined #openstack-infra | 18:21 | |
jeblair | pabelanger: okay, that's +2 from me for you to +3 after the restart | 18:21 |
*** dizquierdo has quit IRC | 18:27 | |
*** bhavik1 has quit IRC | 18:28 | |
openstackgerrit | Gage Hugo proposed openstack-infra/project-config master: Skip integration/non-doc jobs in certain cases https://review.openstack.org/494018 | 18:29 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Retry updating apt-cache https://review.openstack.org/494204 | 18:30 |
openstackgerrit | Merged openstack-infra/system-config master: Revert "Replace buildlogs.centos with buildlogs.cdn.centos" https://review.openstack.org/494265 | 18:30 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add two roles for publishing artifacts over ssh https://review.openstack.org/494230 | 18:31 |
*** rbrndt has joined #openstack-infra | 18:33 | |
*** marst_ has joined #openstack-infra | 18:34 | |
amoralej | pabelanger, will http://mirror.dfw.rax.openstack.org:8080/buildlogs.centos/centos/7/cloud/x86_64/openstack-pike/repodata/repomd.xml be refreshed after https://review.rdoproject.org/r/#/c/8618 is applied? | 18:34 |
*** rhallisey has quit IRC | 18:34 | |
pabelanger | amoralej: not sure I follow, that is a review from rdoproject | 18:35 |
amoralej | pabelanger, sorry, wrong paste, i meant https://review.openstack.org/#/c/494265/ | 18:36 |
ihrachys | I noticed, multiple times already, tempest jobs fail without timeout AND no logs uploaded, only thing we have then is console | 18:36 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 18:36 |
ihrachys | which is not helpful | 18:36 |
ihrachys | what could be the reason of that happening? | 18:36 |
ihrachys | example: http://logs.openstack.org/08/493108/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/21af49b/console.html | 18:36 |
pabelanger | amoralej: ya, that just merged a few minutes ago, so once mirrors update new requests should get refreshed for that | 18:36 |
amoralej | ok, thx | 18:37 |
pabelanger | amoralej: so, we hit buildlogs.centos.org first, then redirect to buildlogs.cdn.centos.org when they tell us to | 18:37 |
*** marst has quit IRC | 18:37 | |
pabelanger | ihrachys: setup_host failed: http://logs.openstack.org/08/493108/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/21af49b/logs/devstack-gate-setup-host.txt | 18:38 |
pabelanger | failed to ping mirror in internap | 18:38 |
*** thingee_ has joined #openstack-infra | 18:40 | |
ihrachys | pabelanger, ah I see. I always jump straight to devstack.log ;) | 18:45 |
openstackgerrit | Jakub Libosvar proposed openstack-infra/project-config master: Revert "Make neutron functional job non-voting" https://review.openstack.org/494295 | 18:45 |
*** amoralej is now known as amoralej|off | 18:47 | |
openstackgerrit | Merged openstack/gertty master: Handle approvals with no name https://review.openstack.org/493601 | 18:50 |
openstackgerrit | Merged openstack-infra/puppet-zuul master: Zuulv3: move the job dir under /var/lib/zuul https://review.openstack.org/494273 | 18:50 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Add publish-openstack-python-branch-tarball to post pipeline https://review.openstack.org/494296 | 18:50 |
*** Sukhdev has joined #openstack-infra | 18:52 | |
*** e0ne has joined #openstack-infra | 18:54 | |
*** nicolasbock has quit IRC | 18:55 | |
*** wolverineav has quit IRC | 18:57 | |
openstackgerrit | Merged openstack-infra/project-config master: Add inap cloud https://review.openstack.org/493072 | 19:00 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Revert "Retry updating apt-cache" https://review.openstack.org/494298 | 19:02 |
jeblair | infra-root: ^ i'm going to force merge that. the commit it reverts wedged zuulv3. | 19:02 |
jeblair | hrm, why doesn't project-bootstrappers let me +2 verify that? | 19:04 |
clarkb | are the verified perms on it exclusive? | 19:05 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Revert "Retry updating apt-cache" https://review.openstack.org/494298 | 19:05 |
jeblair | gertty let me | 19:05 |
clarkb | or you may just need a hard refresh | 19:05 |
clarkb | ya web ui caches vote categories | 19:05 |
jeblair | i did refresh :( | 19:05 |
jeblair | the full shift-control-open-apple-alt-meta-splat-R one too | 19:05 |
jeblair | anywho, it's in | 19:06 |
openstackgerrit | Jakub Libosvar proposed openstack-infra/project-config master: Revert "Make neutron functional job non-voting" https://review.openstack.org/494295 | 19:06 |
*** rkukura has quit IRC | 19:08 | |
*** lihi has quit IRC | 19:13 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create post pipeline for zuulv3.o.o https://review.openstack.org/494300 | 19:14 |
pabelanger | jeblair: mordred: fungi: clarkb: any objection for creating post pipeline for zuulv3.o.o?^ | 19:15 |
*** lihi has joined #openstack-infra | 19:16 | |
jeblair | lgtm | 19:19 |
fungi | pabelanger: i'm not where i can review it right now, but i'm fully in favor of the subject line | 19:21 |
*** adisky__ has quit IRC | 19:23 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 19:24 |
*** trown is now known as trown|brb | 19:29 | |
clarkb | pabelanger: approved | 19:29 |
clarkb | ok now to do infracloud mirror things. I'm guessing step zero is uploading an ubuntu xenial cloud image | 19:29 |
*** rhallisey has joined #openstack-infra | 19:30 | |
clarkb | ya at least for infracloud it needs a xenial image. I'm going to grab one of ubuntu's and upload it to both infraclouds | 19:30 |
pabelanger | Thanks | 19:31 |
clarkb | hrm openstack image create doesn't take a hash like glance client did? | 19:36 |
clarkb | maybe that is a property | 19:36 |
openstackgerrit | Merged openstack-infra/project-config master: Create post pipeline for zuulv3.o.o https://review.openstack.org/494300 | 19:38 |
clarkb | mordred: ^ do you know what the magical way to have glance verify the checksum is? | 19:39 |
*** nicolasbock has joined #openstack-infra | 19:39 | |
*** trown|brb is now known as trown | 19:39 | |
clarkb | I guess I can push it then check the checksum glance reports back but it seems like handing it one and letting it fail upfront is far more sane | 19:39 |
clarkb | hrm doesn't even look like shade does this | 19:40 |
clarkb | mordred: is that a bug? we don't seem to check hashes when uploading images to glance? | 19:42 |
*** Sukhdev has quit IRC | 19:43 | |
*** kjackal_ has quit IRC | 19:44 | |
mtreinish | clarkb: it's a manual thing via the api, but that would be a good shade flag to add | 19:45 |
mtreinish | clarkb: I have a doc patch to add that to the install guide: https://review.openstack.org/#/c/486674/ | 19:45 |
clarkb | mtreinish: I'm not seeing it in the api docs either fwiw | 19:45 |
mtreinish | because I was bit with that | 19:45 |
mtreinish | with/by/ | 19:45 |
clarkb | I see that glance will return a checksum to you though so I am just uploading and will check the sum that glance computes before booting | 19:45 |
mtreinish | yeah there is an md5sum in the image properties | 19:46 |
*** kjackal_ has joined #openstack-infra | 19:46 | |
* clarkb wonders why a doc change can't merge until after pike is cut... | 19:46 | |
mtreinish | I dunno | 19:47 |
mtreinish | I stopped asking questions... | 19:47 |
*** nicolasbock has quit IRC | 19:49 | |
clarkb | pabelanger: ok images are up in both regions. Is there one that you think would be better to start in? | 19:53 |
jeblair | #status log zuul v2 launchers zl07, zl08, zl09 have been deleted due to reduced cloud capacity and to make way for zuul v3 executors | 19:54 |
openstackstatus | jeblair: finished logging | 19:54 |
*** rcernin has quit IRC | 19:55 | |
jeblair | #status log zuul v3 executors ze02, ze03, ze04 are online | 19:55 |
openstackstatus | jeblair: finished logging | 19:55 |
*** jpena is now known as jpena|off | 19:58 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Fix documentation nits https://review.openstack.org/494310 | 20:03 |
clarkb | ianw: upon rereview of https://review.openstack.org/#/c/494042/8 I think I found a minor issue that should be fixed before merging | 20:04 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Fix documentation nits https://review.openstack.org/494310 | 20:05 |
clarkb | ianw: once that is addressed I think we can get that in and I will work on infracloud mirror replacements if you want to work on the rax one (pabelanger found infracloud also suffering from the segfaults) | 20:05 |
*** tnovacik has joined #openstack-infra | 20:06 | |
clarkb | pabelanger: looks like buildlogs issues have skyrocketed recently according to e-r | 20:09 |
clarkb | pabelanger: is that what you were looking to fix or is it potentially caused by the fix? | 20:09 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Update base-test to use site_logs secret https://review.openstack.org/494314 | 20:10 |
pabelanger | clarkb: ya, that is the stale repodata. Our revert should fix that | 20:11 |
pabelanger | clarkb: looking at logs now | 20:11 |
*** tumbarka__ has quit IRC | 20:12 | |
*** Sukhdev has joined #openstack-infra | 20:13 | |
pabelanger | Ya, last failure was 20mins ago: 2017-08-16T19:56:34.839Z | 20:13 |
pabelanger | mirrors should all be running revert now | 20:13 |
clarkb | ok | 20:13 |
*** srobert has quit IRC | 20:13 | |
clarkb | then I think next step is to have ianw respond to my review comments, and start booting some xenial mirrors | 20:14 |
pabelanger | kk | 20:15 |
*** mat128 has quit IRC | 20:17 | |
*** dprince has quit IRC | 20:18 | |
pabelanger | mordred: jeblair: clarkb: regarding: https://review.openstack.org/494314 since logs.o.o points to static.o.o, I believe it was suggested we create separate secrets for each fqdn we are going to access, even though they have same private SSH keys. Meaning, we'd have both site_logs and site_tarballs zuul secrets over directly using site_static with different paths. thoughts? | 20:18 |
jeblair | pabelanger: that sounds nicely future-proof | 20:20 |
clarkb | the only thing to be wary of in that setup is if we change logs' key without realizing that static needs updating too. But I think it more likely we'd split the hosts up rather than change a key and miss that | 20:21 |
mordred | clarkb: glance v1 lets you send a checksum in the upload payload, and shade sends a checksum to it | 20:22 |
mordred | clarkb: v2 has no documented mechanism to do the same thing for direct upload | 20:22 |
jeblair | ze01--ze04 are cloning all repos | 20:23 |
clarkb | mordred: thats ... ok | 20:23 |
mordred | clarkb: if there is a way to do checksum-on-upload we can happily add it | 20:23 |
clarkb | mordred: we probably want to check the checksum against the response to upload which includes the glance computed checksum | 20:23 |
mordred | however- it wouldn't be any less expensive than just checking the returned checksum, since you'll have to upload the whole thing before the checksum can get validated anyway | 20:23 |
clarkb | mordred: so we can fail on the shade side rather than on the glance side | 20:23 |
clarkb | ya | 20:24 |
mordred | so yah - we should definitely validate like that | 20:24 |
clarkb | mordred: it's just more of a "your api should handle these things for you" thing than anything else | 20:24 |
mordred | yah | 20:24 |
openstackgerrit | Major Hayden proposed openstack-infra/system-config master: Add packates.erlang-solutions.com reverse proxy https://review.openstack.org/494317 | 20:27 |
*** trown is now known as trown|outtypewww | 20:28 | |
openstackgerrit | Major Hayden proposed openstack-infra/project-config master: Add proxy host for erlang-solutions mirror https://review.openstack.org/494318 | 20:29 |
clarkb | mhayden: ^ is there a reason for not using the distro provided erlang packages? (we've used them for years so curious if there is some benefit to using upstream packages) | 20:32 |
mhayden | clarkb: according to cloudnull's research, rabbitmq recommends erlang 19.x for the current version of rabbitmq | 20:32 |
mhayden | but 16.04 only has 18 | 20:32 |
clarkb | mhayden: and you aren't using ubuntu's rabbitmq package? or their package has jumped ahead of their erlang? | 20:33 |
mhayden | we had some rabbitmq performance issues on some larger OpenStack clouds and updating the erlang version made things more stable | 20:33 |
mhayden | clarkb: we are using the upstream rabbitmq package at the moment - 3.6.9-1 | 20:34 |
clarkb | huh interesting | 20:34 |
mhayden | we found that rabbitmq did a lot better under load with 1) modern version from rabbitmq.org and 2) pinned erlang version from upstream erlang repos | 20:34 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: WIP: Create testpypi_secret secret for zuulv3 https://review.openstack.org/494276 | 20:34 |
mhayden | i didn't do the research myself, but that was the output | 20:34 |
mhayden | but the erlang mirror is somewhere in eastern europe with latency ~ 160ms to the central USA :/ | 20:35 |
openstackgerrit | Slawek Kaplonski proposed openstack-infra/shade master: Don't determine local IPv6 support if force_ip4=True https://review.openstack.org/494319 | 20:35 |
clarkb | mhayden: and is that similar situation for mariadb? why not use what is in centos repos? | 20:35 |
openstackgerrit | wes hayutin proposed openstack-infra/project-config master: Add oooq based undercloud-containers job. https://review.openstack.org/493715 | 20:35 |
mhayden | clarkb: we wanted some of the newer features from upstream mariadb/galera/percona | 20:36 |
mhayden | so we install from mariadb's upstream repos | 20:36 |
*** makowals_ has quit IRC | 20:36 | |
*** e0ne has quit IRC | 20:36 | |
*** jcoufal_ has joined #openstack-infra | 20:36 | |
openstackgerrit | Major Hayden proposed openstack-infra/system-config master: Add packages.erlang-solutions.com reverse proxy https://review.openstack.org/494317 | 20:36 |
clarkb | I'm worried that we are creating an unscalable future with all these changes particularly since we have mirror instability as it is (not that osa is at fault for that, everyone else is piling on too) | 20:36 |
mhayden | couldn't live with that typo in the commit message ;) | 20:36 |
clarkb | mhayden: it was correct in the proxy config :) | 20:37 |
mhayden | haha, not enough coffee this afternoon | 20:37 |
openstackgerrit | wes hayutin proposed openstack-infra/project-config master: Add oooq based undercloud-containers job. https://review.openstack.org/493715 | 20:37 |
mhayden | clarkb: makes sense | 20:37 |
mhayden | our other option is to pre-stage some of this stuff as early in our gate jobs as possible, but it might not be a great test of a production deploy | 20:37 |
mhayden | if it's preferred that we don't make more reverse proxies right now, i can go back and try out some other options | 20:38 |
openstackgerrit | Slawek Kaplonski proposed openstack-infra/shade master: Fix determining if IPv6 is supported when it's disabled https://review.openstack.org/494321 | 20:38 |
*** tnovacik has quit IRC | 20:38 | |
*** rkukura has joined #openstack-infra | 20:39 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 20:39 |
*** jcoufal has quit IRC | 20:39 | |
clarkb | mhayden: it might be good to wait a minute while we upgrade the mirrors in an attempt to get to a debuggable spot. ianw should be waking soon and was working on that and I plan on doing some of it today as well | 20:39 |
mhayden | alrighty | 20:40 |
clarkb | mhayden: then once we stabilize we can add things back on (as is, it's hard to know if we are making anything better if we keep adding backends while trying to fix things to keep up better) | 20:40 |
mhayden | totally understandable ;) | 20:40 |
clarkb | this conversation has also inspired an entry on the PTG ideas list which I am typing up now | 20:40 |
*** pcaruana has quit IRC | 20:41 | |
mhayden | ah okay -- i'll be at the PTG! | 20:41 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create testpypi_secret secret for zuulv3 https://review.openstack.org/494276 | 20:41 |
mhayden | i'll be the guy in the OSA room who is getting yelled at for constantly breaking gate jobs | 20:41 |
*** rkukura_ has joined #openstack-infra | 20:44 | |
clarkb | fwiw we mirror mariadb properly for ubuntu | 20:44 |
clarkb | so wouldn't be crazy to mirror it properly for centos as well rather than just proxying it | 20:44 |
mhayden | ah okay | 20:44 |
mhayden | i'm not sure how much of mariadb's yum repo would need to be mirrored -- that'd take some digging | 20:45 |
openstackgerrit | Eric Harney proposed openstack-dev/hacking master: Fix python 3.6 escape char warnings in strings https://review.openstack.org/494322 | 20:46 |
*** rkukura has quit IRC | 20:46 | |
*** rkukura_ is now known as rkukura | 20:46 | |
*** gouthamr has quit IRC | 20:46 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create testpypi_secret secret for zuulv3 https://review.openstack.org/494276 | 20:47 |
*** Apoorva_ has quit IRC | 20:47 | |
mhayden | clarkb: to be honest, if i could figure out the difference between erlang/OTP and erlang, i might not need a mirror or proxy :P | 20:47 |
*** Apoorva has joined #openstack-infra | 20:47 | |
dmsimard|off | mhayden: the mariadb off of RDO isn't good enough ? | 20:48 |
pabelanger | mordred: so, I know ^ isn't the full job yet needed for testpypi, some of that is depending on your (pre-)python-tarball logic. But might be worth it to land so we can start testing pip install commands inside bwrap on executor | 20:48 |
mhayden | dmsimard|off: would like to keep versions of mariadb synced between xenial/centos/suse | 20:48 |
*** felipemonteiro_ has quit IRC | 20:48 | |
dmsimard|off | mhayden: fair | 20:48 |
mhayden | dmsimard|off: you're off work, go enjoy a beer | 20:48 |
dmsimard|off | mhayden: who said I was working :) | 20:49 |
*** thingee_ has quit IRC | 20:49 | |
*** esberglu has quit IRC | 20:50 | |
*** esberglu has joined #openstack-infra | 20:50 | |
*** jcoufal_ has quit IRC | 20:52 | |
*** rhallisey has quit IRC | 20:54 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 20:54 |
*** esberglu has quit IRC | 20:55 | |
*** spzala has quit IRC | 20:58 | |
*** spzala has joined #openstack-infra | 21:00 | |
mordred | pabelanger: lgtm | 21:02 |
*** esberglu has joined #openstack-infra | 21:03 | |
*** spzala has quit IRC | 21:05 | |
*** slaweq has quit IRC | 21:06 | |
*** aviau has quit IRC | 21:06 | |
*** slaweq has joined #openstack-infra | 21:06 | |
*** aviau has joined #openstack-infra | 21:07 | |
jeblair | images have been uploaded to inap | 21:08 |
jeblair | i've approved the quota switch to move from using internap to using inap (493073) | 21:08 |
*** rkukura has quit IRC | 21:09 | |
pabelanger | ack | 21:09 |
*** slaweq has quit IRC | 21:11 | |
*** xarses_ has joined #openstack-infra | 21:12 | |
*** andreww has quit IRC | 21:15 | |
*** ldnunes has quit IRC | 21:17 | |
*** spligak has quit IRC | 21:17 | |
jeblair | i'm adding zuulv3.o.o to the emergency file | 21:17 |
jeblair | clarkb: are you done with mirror.mtl01.inap.openstack.org in emergency? | 21:18 |
openstackgerrit | Merged openstack-infra/project-config master: Stop using internap in favor of inap https://review.openstack.org/493073 | 21:18 |
*** rama_y has quit IRC | 21:18 | |
*** thorst has quit IRC | 21:19 | |
clarkb | jeblair: yes | 21:19 |
jeblair | removed | 21:19 |
ianw | clarkb: looking ... | 21:21 |
fungi | mhayden: come to the infra room and we can yell at you for breaking the infrastructure instead! er, i mean... totally help you out | 21:23 |
fungi | mhayden: we're probably going to break tons of eggs while making zuul v3 omelets at the ptg anyway, so won't be in much position to criticize ;) | 21:24 |
*** ihrachys has quit IRC | 21:25 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 21:25 |
pabelanger | ianw: do you think we are ready to rotate out fedora-25, now that fedora-26 has been online for a while? | 21:27 |
ianw | pabelanger: there's still one devstack thing i can push today, because it's got no comment | 21:27 |
ianw | https://review.openstack.org/#/c/490331/ | 21:27 |
ianw | unless you have better ideas on that | 21:27 |
*** sree has joined #openstack-infra | 21:28 | |
ianw | clarkb: i think that removing the "-"'s has padded things out a bit far, see /etc/apache2/sites-enabled/50-mirror01.iad.rax.openstack.org.conf on mirror01 . it also doesn't bother me; i just copied it from one of the other ones | 21:29 |
clarkb | ianw: I think you can avoid that by removing some of the built in leading whitespace in the interpolation strings? | 21:32 |
clarkb | or do you mean vertically? | 21:32 |
ianw | both, i think that's why the "<% end -%>" was there | 21:32 |
clarkb | ah | 21:32 |
*** sree has quit IRC | 21:33 | |
clarkb | ianw: you can add it back to the end tags as long as it is removed from the serveralias lines | 21:33 |
clarkb | I think one of the end tags has no - and is adding an extra newline | 21:33 |
clarkb | its correct as is just potentially ugly | 21:34 |
ianw | closer ... http://paste.openstack.org/show/618577/ | 21:34 |
*** slaweq has joined #openstack-infra | 21:36 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 21:36 |
*** felipemonteiro has joined #openstack-infra | 21:36 | |
clarkb | ianw: I think you may also have to lose the if else block formatting so it is all at the same level of indentation | 21:37 |
fungi | too many newlines (extra blank lines) in apache configs is fine... too few (to the point that different logical lines of configuration end up on the same actual line) is of course not great | 21:37 |
*** aeng has joined #openstack-infra | 21:38 | |
clarkb | fungi: ya fixing the too few problem is what my -1 was about, now addressed but resulting in uglier output | 21:38 |
fungi | meh | 21:38 |
fungi | as long as the template in git is readable, those few poor schmucks with direct access to look at the resulting configs on disk can get by | 21:39 |
fungi | readable in git and resulting in syntactically correct (if not aesthetically pleasing) conffiles is what i would focus on | 21:39 |
*** bobh has joined #openstack-infra | 21:40 | |
fungi | life's too short to care about extra rendered whitespace ;) | 21:40 |
*** slaweq has quit IRC | 21:42 | |
*** Apoorva_ has joined #openstack-infra | 21:43 | |
*** slaweq has joined #openstack-infra | 21:43 | |
clarkb | fungi: pabelanger can you review ianw's change https://review.openstack.org/494042 other than extra whitespace I believe it to be correct :) | 21:43 |
jeblair | deb-python-cassandradriver seems to have a large number of branches | 21:43 |
clarkb | but with that in I will go ahead and spin up mirror01.regionone.infracloud-vanilla.openstack.org | 21:43 |
*** yamamoto has joined #openstack-infra | 21:44 | |
*** yamamoto has quit IRC | 21:45 | |
pabelanger | clarkb: ianw +3 | 21:45 |
*** Apoorva has quit IRC | 21:46 | |
ianw | paste.o.o where are you? | 21:46 |
ianw | clarkb: i think sticking it all on the one line works the best -> https://paste.fedoraproject.org/paste/V7cRAXFKiIJmD6d7C6p82Q | 21:47 |
ianw | but whatever. i'll kill my cache warming stuff on mirror01.iad.rax | 21:47 |
clarkb | ianw: I think its fine all on one line but the -%> need to be removed | 21:47 |
*** slaweq has quit IRC | 21:47 | |
clarkb | otherwise if you have more than one alias you'll get them all on one line right? | 21:47 |
ianw | but it's got a \n in the string? | 21:48 |
ianw | anyway, i'm all for leaving well enough alone too :) | 21:48 |
clarkb | oh huh | 21:48 |
clarkb | sorry I totally missed that if that was there before and that is my bad | 21:48 |
ianw | i didn't think that moving the caches from the old trusty -> xenial was a good idea, seeing as versions of everything changed | 21:48 |
clarkb | ok | 21:49 |
*** eharney has quit IRC | 21:49 | |
ianw | and i also guess ttl on the mirror is of minimal importance, given everything starts fresh anyway | 21:49 |
ianw | although i guess the upstream dns could be holding onto it | 21:49 |
pabelanger | ianw: ya, I've just kept 60 TTL and things slowly moved over | 21:50 |
ianw | i think they call that load balancing :) | 21:50 |
pabelanger | Ya, could be better to prime empty caches too | 21:50 |
clarkb | also that pastebin got a facelift from the last time I saw it | 21:50 |
*** danieli has joined #openstack-infra | 21:50 | |
pabelanger | not have everything hit it at once | 21:50 |
*** yamamoto has joined #openstack-infra | 21:51 | |
ianw | pabelanger: yeah, i got about 50gb worth of centos/fedora/ubuntu in just by running md5's over the mirrors | 21:51 |
ianw | i figure reverse proxy was less of a cold-cache issue | 21:51 |
pabelanger | ianw: when I did sto2 yesterday, it had no issue priming from zero. Worked well actually | 21:52 |
*** thorst has joined #openstack-infra | 21:53 | |
ianw | cool; i'll get some breakfast while this all gets puppeted properly then see about inserting it and monitor closely | 21:53 |
*** thorst has quit IRC | 21:53 | |
clarkb | we should see results fairly quickly too I imagine? I wonder if we could just do all the servers tomorrow | 21:53 |
clarkb | that would be nice and I can make time for cranking them out | 21:54 |
*** felipemonteiro_ has joined #openstack-infra | 21:54 | |
pabelanger | clarkb: we could setup an elastic-recheck query to track http connection resets, at least from yum client | 21:57 |
clarkb | we sort of already do | 21:57 |
clarkb | I think there is strong correlation between those yum errors and that behavior | 21:57 |
clarkb | we'd just have to filter by region and see if results drop off, or check the apache logs for segfaults | 21:57 |
*** felipemonteiro has quit IRC | 21:57 | |
*** yamamoto has quit IRC | 21:58 | |
pabelanger | Ya, I think there is also networking issue there too, but is a good start | 21:58 |
*** priteau has quit IRC | 21:59 | |
*** jascott1 has quit IRC | 22:00 | |
*** jascott1 has joined #openstack-infra | 22:01 | |
ianw | hopefully it is not related to some weird afs just-non-posixy-enough-to-confuse-deep-logic-in-apache issue | 22:03 |
*** jascott1 has quit IRC | 22:05 | |
fungi | i'm more likely to blame apache for this than the other way around | 22:06 |
*** jascott1 has joined #openstack-infra | 22:06 | |
fungi | but yeah, who knows at this point | 22:06 |
*** iyamahat has joined #openstack-infra | 22:07 | |
clarkb | we need a trusty node | 22:07 |
*** jascott1 has quit IRC | 22:07 | |
clarkb | it is as if that release knows we are replacing it and is holding out | 22:07 |
*** jascott1 has joined #openstack-infra | 22:07 | |
jeblair | i've removed zuulv3 from the emergency file | 22:08 |
*** jkilpatr has quit IRC | 22:09 | |
*** xyang1 has quit IRC | 22:11 | |
*** jascott1 has quit IRC | 22:12 | |
clarkb | are half hour build times expected in internap? | 22:13 |
clarkb | wondering if that is fallout from the internap changes going on | 22:13 |
mgagne | clarkb: compared to what? what kind of build? | 22:14 |
clarkb | mgagne: I see it for xenial and trusty instances at least. I would expect it to take just a few minutes typically so wondering if something went sideways with the name changing? | 22:15 |
mgagne | what kind of job? like a lint job for instance? | 22:16 |
clarkb | mgagne: no this is just instance booting, no jobs yet | 22:16 |
mgagne | ouch | 22:16 |
pabelanger | compute nodes downloading images? | 22:16 |
mgagne | pabelanger: that's what I suspect | 22:17 |
pabelanger | clarkb: http://grafana.openstack.org/dashboard/db/nodepool that large spike for Time to Ready on internap is likely new images too | 22:17 |
jeblair | yeah just started looking at that | 22:17 |
mgagne | but I don't have enough details to verify | 22:17 |
pabelanger | hits upwards of 50Mins | 22:17 |
ianw | clarkb / pabelanger: how interesting ... another person caught that mpm_event segfault literally an hour ago -> https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1630413 | 22:18 |
openstack | Launchpad bug 1630413 in apache2 (Ubuntu) "segfault in server/mpm/event/event.c:process_socket" [Undecided,Confirmed] | 22:18 |
openstackgerrit | Ramamani Yeleswarapu proposed openstack-infra/devstack-gate master: [TESTING][DO NOT MERGE] Testing TLS in Ironic jobs https://review.openstack.org/492661 | 22:18 |
pabelanger | ianw: Ha, nice | 22:18 |
pabelanger | ianw: looks like possible SRU | 22:18 |
ianw | yeah, the tl;dr is use it from backports | 22:19 |
clarkb | pabelanger: mgagne ok it does look like it spikes up daily if I expand the grafana graph period | 22:19 |
clarkb | ianw: pabelanger so maybe xenial will fix it then \o/ | 22:19 |
ianw | but, adds some value to upgrading | 22:19 |
ianw | jinx :) | 22:19 |
*** esberglu has quit IRC | 22:22 | |
*** sree has joined #openstack-infra | 22:22 | |
jeblair | mgagne, clarkb, pabelanger, ianw: i spot checked inap nodes -- they really still are in building state in nova | 22:23 |
mgagne | have you swapped all quota to inap? | 22:23 |
jeblair | so that idea of image copying may be correct | 22:23 |
jeblair | mgagne: yes | 22:23 |
mgagne | could be that all compute nodes are downloading images at the same time | 22:23 |
clarkb | I shall practice patience | 22:24 |
*** thorst has joined #openstack-infra | 22:24 | |
*** jascott1 has joined #openstack-infra | 22:24 | |
*** sree has quit IRC | 22:27 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Add known_hosts from executor to all nodes https://review.openstack.org/494333 | 22:28 |
*** thorst has quit IRC | 22:29 | |
openstackgerrit | sebastian marcet proposed openstack-infra/openstackid-resources master: External Calendar Sync https://review.openstack.org/487683 | 22:30 |
openstackgerrit | Emilien Macchi proposed openstack-infra/project-config master: TEMP - Disable voting on tripleo upgrade jobs https://review.openstack.org/494334 | 22:33 |
clarkb | the number of inap building instances is 138 and has remained that way for a while so we aren't thrashing | 22:33 |
clarkb | likely that I just need to be patient | 22:33 |
*** felipemonteiro_ has quit IRC | 22:34 | |
pabelanger | mnaser: I am seeing about 23 jobs timeout in vexxhost using our e-r query, a few recently. | 22:34 |
jeblair | clarkb: yep; i'm tailing logs and have seen no further activity other than polling | 22:34 |
pabelanger | infracloud-vanilla seems to timeout more than infracloud-chocolate | 22:34 |
clarkb | pabelanger: vanilla is on the older hardware I think | 22:35 |
clarkb | might be somewhat slower | 22:35 |
pabelanger | clarkb: ya, I think that is right | 22:35 |
mgagne | clarkb: I checked 1 compute node and yes, it's downloading. the one I checked, compared image size to existing ones and it's done at 20% so far. | 22:36 |
fungi | i wonder if we've unleashed a thundering herd in the image storage network there | 22:37 |
mgagne | ^^' | 22:37 |
mgagne | you are only affecting yourself so there is that =) | 22:37 |
fungi | oh, good. as long as we're not impacting performance for anyone else there i don't much care. it'll resolve itself soon enough | 22:38 |
mgagne | I can only imagine our netadmin: "Are you downloading something?" "Yea, a cloud user is using the cloud." "well, he shouldn't!" | 22:39 |
jeblair | we get that a lot | 22:40 |
fungi | "wait, what, you're using this thing?!?" | 22:40 |
fungi | lesson: openstack expects lots of available bandwidth between nova compute nodes and glance stores | 22:41 |
mgagne | and now, it reminds me of the idea of bypassing glance so nova-compute can download multiple chunks from Swift in parallel | 22:41 |
openstackgerrit | Merged openstack-infra/system-config master: Add mirror01.iad.rax.o.o https://review.openstack.org/494042 | 22:41 |
mgagne | fungi: glance store is the key, now it's going through glance-api for no reason =) | 22:41 |
clarkb | we are down to 133 building | 22:42 |
clarkb | not sure if the 5 that changed state succeeded or failed though | 22:42 |
fungi | yeah, we also toyed around some years back (in tripleo context i think?) with the idea of a bittorrent sharing solution for updating nova image caches between commute nodes | 22:42 |
clarkb | oh what do you know irc says the change merged so must've changed state in the direction we want :) | 22:42 |
clarkb | ianw: so thats in now | 22:42 |
clarkb | fungi: ya | 22:42 |
fungi | s/commute/compute/ | 22:43 |
*** bobh has quit IRC | 22:43 | |
mgagne | fungi: I heard something similar from rax with deployment and some people forgot firewalls have limited number of sessions. | 22:43 |
fungi | hah | 22:43 |
*** bobh has joined #openstack-infra | 22:43 | |
clarkb | my favorite was when I got the 3am phone call because the default security group rule in hpcloud had nuked the database | 22:44 |
fungi | should of course clarify, _stateful_ firewalls have session limits | 22:44 |
mgagne | clarkb: oh, the default rule where other members are authorized and each new instance triggers an iptable update on all compute nodes? | 22:44 |
clarkb | mgagne: ya | 22:44 |
clarkb | down to 108 building so its moving relatively quickly now :) | 22:45 |
fungi | _stateless_ firewalls just inspect each packet on its own merits (but of course have far less effective rules and often run into packet rate issues too since they can't take state tracking shortcuts) | 22:45 |
jeblair | clarkb: i think we're hitting the nodepool build timeout | 22:45 |
clarkb | jeblair: oh maybe | 22:45 |
fungi | we'll eventually get those images primed onto all their compute nodes though ;) | 22:45 |
jeblair | yep confirmed in logs | 22:45 |
* clarkb checks where ianw's change gated | 22:45 | |
clarkb | ya neither trusty job was inap | 22:46 |
*** bobh has quit IRC | 22:48 | |
*** jkilpatr has joined #openstack-infra | 22:49 | |
*** rbrndt has quit IRC | 22:51 | |
clarkb | I am launching the xenial mirror in vanilla cloud now | 22:52 |
*** dizquierdo has joined #openstack-infra | 22:53 | |
clarkb | ianw: and we are doing mirror.foo.bar.o.o CNAME mirror01.foo.bar.o.o ? | 22:54 |
jeblair | the inap deletes are slow too; maybe we can't delete the instances which are still waiting on image downloads until they are complete | 22:55 |
pabelanger | clarkb: so, ya, I'm pretty sure I know the answer, but..... is doing a reverse proxy cache for github.com something we'd do? | 22:57 |
*** EricGonc_ has quit IRC | 22:57 | |
EmilienM | hey infra, I have an outstanding request: https://review.openstack.org/#/c/494334/ - everything is explained in the commit message - we need this asap. Thanks a lot | 22:58 |
clarkb | pabelanger: like I already told mhayden I think step 0 is getting reliable proxy/mirror for what we currently have | 22:58 |
clarkb | pabelanger: we keep piling on making it impossible to figure out if we've made anything better than yesterday | 22:58 |
clarkb | but once that is done I imagine it is something we could slowly test | 22:58 |
clarkb | pabelanger: my concern is that we are essentially trying to mirror the internet | 22:58 |
clarkb | and one tiny host per region is going to do a poor job of that | 22:59 |
pabelanger | clarkb: Ya, agree. Don't think we should add more until we stabilize current mirrors | 22:59 |
pabelanger | clarkb: ya, that is also my concern now too | 22:59 |
*** esberglu has joined #openstack-infra | 22:59 | |
pabelanger | where do we draw the line on things to proxy | 22:59 |
clarkb | ya I put an entry on the ptg ideas etherpad related to figuring ^ out | 23:00 |
clarkb | I think a big part of it will be getting information on what people think they need mirrored/proxied because I think we are learning that we haven't actually done that yet | 23:00 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 23:01 |
clarkb | then based on that feedback and knowledge of hopefully running more reliable servers we can figure out what makes sense | 23:01 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Use artifact publication roles from zuul-jobs https://review.openstack.org/494231 | 23:01 |
clarkb | github is probably special in that we already do a good job of hosting git repos. Are people wanting us to provide CI for github or can we just consume release artifacts from projects hosted on github via the current channels (pip, npm, gems, ubuntu, centos etc) | 23:02 |
*** aeng has quit IRC | 23:02 | |
clarkb | pabelanger: ^ is there a specific case where github would be useful? | 23:02 |
*** esberglu has quit IRC | 23:03 | |
clarkb | first build in vanilla failed. I can't tell if it thinks ssh timed out or if it didn't like the ssh host key for some reason | 23:03 |
pabelanger | clarkb: personally, I don't think we should proxy github.com. In the case I am looking at, infracloud seems to be having problems cloning some repos from github.com | 23:04 |
pabelanger | clarkb: this is because the DLRN tool requires some configuration data and rdoproject publishes it there | 23:04 |
pabelanger | clarkb: I think it would be possible to move this data into existing RDO proxy infra, but I need to work with #rdo | 23:05 |
openstackgerrit | Giulio Fidente proposed openstack-infra/tripleo-ci master: Copy Ceph logs from running containers https://review.openstack.org/494340 | 23:05 |
pabelanger | clarkb: so, for now, I think we have better tools to mirror git than using a reverse proxy cache | 23:05 |
pabelanger | https://github.com/redhat-openstack/rdoinfo is the repo in question too | 23:05 |
clarkb | ya seems like that is repo metadata that could be hosted in the rpm repo | 23:06 |
clarkb | (but I know little about hosting practices for rpm packages) | 23:06 |
*** marst_ has quit IRC | 23:06 | |
*** makowals has quit IRC | 23:06 | |
pabelanger | clarkb: yes, I agree. That is what I am going to ask #rdo about | 23:06 |
*** dizquierdo has quit IRC | 23:10 | |
*** makowals has joined #openstack-infra | 23:11 | |
*** thorst has joined #openstack-infra | 23:11 | |
*** thorst has quit IRC | 23:13 | |
clarkb | I'm debugging launch node fails in infracloud now. trying to figure out if it's our ssh key or maybe the wrong user being used? | 23:14 |
*** vhosakot has quit IRC | 23:14 | |
ianw | clarkb: i think the "01" is a good idea, because if we get to the point we do want load-balancing or something i think that gives us flexibility | 23:17 |
ianw | clarkb: want me to start an etherpad to track what's what? | 23:17 |
clarkb | ianw: sure | 23:17 |
clarkb | and I'm attempting to boot with 01 just curious if plan was to CNAME? | 23:17 |
ianw | https://etherpad.openstack.org/p/mirror-xenial | 23:18 |
ianw | i think yeah, have mirror -> mirror01 | 23:18 |
*** notmyname has quit IRC | 23:19 | |
*** aeng has joined #openstack-infra | 23:19 | |
clarkb | ok I think the error is I didn't enable config drive and infracloud doesn't have metadata service, retrying | 23:20 |
*** notmyname has joined #openstack-infra | 23:23 | |
clarkb | pabelanger: ok just discovered why removing dns settings on our network is bad, can't launch node the new mirror because we have to resolve the host for get-pip.py :) | 23:24 |
clarkb | pabelanger: thoughts on how we want to address that? | 23:24 |
pabelanger | clarkb: Ah, ya. That would do it | 23:25 |
clarkb | I could provide a user data script that wrote out a resolv.conf | 23:25 |
pabelanger | ya, was just thinking something like that | 23:25 |
clarkb | that is kind of hacky | 23:26 |
clarkb | but we only need it to work that one time | 23:26 |
pabelanger | clarkb: I think once I fix up https://review.openstack.org/493665/ and we land that on our images, we can revert the puppet change | 23:26 |
clarkb | doesn't look like launch node supports arbitrary user data yet | 23:26 |
clarkb | I'm going to hack it in | 23:27 |
*** gouthamr has joined #openstack-infra | 23:27 | |
pabelanger | kk | 23:27 |
clarkb | since we have WIP to fix it properly I don't feel too bad getting the host booted today | 23:27 |
clarkb | in the future it should just work :) | 23:27 |
pabelanger | One day we'll have DIB images for control plane, then we'd get unbound :) | 23:28 |
*** EricGonczer_ has joined #openstack-infra | 23:28 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config master: Create glean@.service.d/override.conf https://review.openstack.org/493665 | 23:29 |
clarkb | what I'm doing is adding an ssh command in launch node that appends nameserver 8.8.8.8 to resolv.conf | 23:29 |
clarkb | so it's a one liner hack in launch node that should work fine and will be undone by puppet | 23:30 |
*** EricGonczer_ has quit IRC | 23:30 | |
clarkb | if people think it's worth having that in launch node proper I can push it up once confirmed it works | 23:30 |
pabelanger | k | 23:30 |
*** tosky has quit IRC | 23:33 | |
*** Swami has quit IRC | 23:33 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add publish-openstack-python-branch-tarball to post pipeline https://review.openstack.org/494296 | 23:33 |
*** spzala has joined #openstack-infra | 23:34 | |
*** Apoorva_ has quit IRC | 23:35 | |
ianw | is mirror.dfw.rax.openstack.org really old? i can't log in and i wonder if it's my ecdsa key | 23:35 |
clarkb | ianw: I'll look | 23:35 |
*** Apoorva has joined #openstack-infra | 23:35 | |
clarkb | ianw: ed25519 key | 23:35 |
ianw | yeah, sorry | 23:36 |
fungi | the "djb curve" | 23:36 |
clarkb | let me take a look at what is in puppet and put that in there | 23:36 |
ianw | it caused issues with precise hosts | 23:36 |
clarkb | oh that is what is in puppet | 23:36 |
clarkb | is this a precise host that we missed somehow? | 23:37 |
clarkb | it has a 3.13 trusty kernel | 23:37 |
clarkb | let me see if auth log has more info | 23:37 |
*** spzala has quit IRC | 23:38 | |
clarkb | ianw: I see root attempts is that you? should be your normal username if so | 23:38 |
ianw | ohhh, haha it would help if i was in a window on a host with my key, doh | 23:39 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Use new sphinx roles in docs https://review.openstack.org/493250 | 23:40 |
clarkb | fatal: [mirror01.regionone.infracloud-vanilla.openstack.org]: FAILED! => {"changed": false, "failed": true, "module_stderr": "", "module_stdout": "/bin/sh: 1: /usr/bin/python: not found\r\n", "msg": "MODULE FAILURE", "parsed": false} | 23:40 |
ianw | and this is why i colour code terminals | 23:40 |
clarkb | so uh I guess ubuntu removed python from their base xenial cloud image | 23:40 |
ianw | oh, that would be the xenial not having python2 by default | 23:40 |
clarkb | ianw: ya guessing you haven't hit that because you are on rax's images which are different | 23:40 |
clarkb | so do I also hack in an apt-get install python? | 23:41 |
clarkb | I guess its worth doing :) | 23:41 |
ianw | yeah, dib had plenty of issues with that at the time :) | 23:41 |
jeblair | inap nodes are coming online i believe | 23:41 |
fungi | slow and steady wins the race | 23:41 |
jeblair | and all the previously deleted nodes have been cleared out | 23:42 |
jeblair | #status log renamed nodepool internap provider to inap. new mirror server in use. | 23:43 |
openstackstatus | jeblair: finished logging | 23:43 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Allow requesting secrets by a different name https://review.openstack.org/494343 | 23:43 |
jeblair | infra-root: now that inap is in use (replacing internap), we should watch out for unexpected fallout from the new mirror host | 23:44 |
mordred | ianw, clarkb: wow. I have clearly missed a giant pile of fun - ubuntu no longer ships python in their cloud images? | 23:45 |
ianw | mordred: python2 | 23:45 |
fungi | a.k.a. /usr/bin/python | 23:45 |
clarkb | which ansible needs | 23:45 |
mordred | yah | 23:45 |
clarkb | which launch-node needs | 23:45 |
clarkb | mordred: I love that ansible has now graduated to the language runtime problem that we have had with puppet at times | 23:46 |
ianw | it's what i like to call aspirational | 23:46 |
fungi | which things not python3-only will default to using | 23:46 |
mordred | clarkb: indeed. well - we can add a bootstrap python to our launch-node stuff | 23:46 |
clarkb | mordred: ya ssh_client.ssh('apt-get update && apt-get install -y python') seems to have worked | 23:47 |
pabelanger | connect=raw FTW | 23:47 |
mordred | clarkb: it's possible to have non-python tasks in ansible, which is the normal cantrip for getting python installed onto a node that doesn't have it and is otherwise managed with ansible | 23:47 |
pabelanger | connection* | 23:47 |
clarkb | I think we have a bootstrap script that that would be better in though | 23:47 |
clarkb | mordred: ah | 23:47 |
pabelanger | this is actually a good use case for zuulv3 and ubuntu cloud images | 23:47 |
mordred | clarkb: yah - just saying - it's possible to do ^^ what pabelanger said if we were using ansible and not that bootstrap script | 23:47 |
pabelanger | we'd need to do the same thing | 23:48 |
mordred | well - it's also a good case for "make our own base images for control plane" | 23:48 |
mordred | but, you know - hours in a day | 23:48 |
mordred | that's been on my list for what? 3 years now? | 23:48 |
clarkb | I'll take a look at the boot process when not trying to get a mirror up :) | 23:49 |
clarkb | its possible we just want to add an ansible step that does it without python | 23:49 |
clarkb | or put it in an existing bootstrap script | 23:49 |
*** aeng has quit IRC | 23:49 | |
openstackgerrit | Paul Belanger proposed openstack-infra/openstack-zuul-jobs master: Replace slash for tarball rename https://review.openstack.org/494344 | 23:51 |
*** gongysh has joined #openstack-infra | 23:52 | |
*** gongysh has quit IRC | 23:52 | |
clarkb | hrm we didn't break in install_puppet.sh which is where I would've expected a lack of python to first be a problem | 23:53 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 23:53 |
clarkb | and we default to setting up pip | 23:54 |
clarkb | now I'm extra confused | 23:54 |
pabelanger | actually, I thought I've launched a xenial node before | 23:55 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Allow requesting secrets by a different name https://review.openstack.org/494343 | 23:55 |
pabelanger | clarkb: Oh, now I remember | 23:56 |
pabelanger | https://review.openstack.org/#/c/450526/ is actually wrong | 23:56 |
dims | clarkb : fungi : mordred : Do you have a few mins to merge "Update grenade settings for stable/pike" review? pretty please - https://review.openstack.org/#/c/493057/ | 23:56 |
pabelanger | that needs to move down 1 level | 23:56 |
pabelanger | and not be in the if statement | 23:56 |
pabelanger | I thought I proposed a fix for that | 23:57 |
mordred | pabelanger: oh - hah. yeah - so it does | 23:57 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Replace slash for tarball rename https://review.openstack.org/494344 | 23:58 |
clarkb | pabelanger: it actually needs to go well before that because setup_pip happens before much of anything else | 23:58 |
clarkb | I'm also getting a lot of Aug 16 23:56:36 mirror01 puppet-agent[9836]: Could not request certificate: Failed to open TCP connection to puppet:8140 (getaddrinfo: Name or service not known) | 23:58 |
pabelanger | clarkb: I think I had this issue in vexxhost too | 23:58 |
pabelanger | when we did nb03 | 23:58 |
clarkb | pabelanger: the no python issue or the puppet-agent? | 23:58 |
mordred | Failed to open TCP connection to puppet:8140 ... that sounds like puppet.conf is not right | 23:58 |
clarkb | mordred: yes | 23:58 |
pabelanger | clarkb: no python | 23:59 |
mordred | and that something started agent | 23:59 |
clarkb | its trying to talk to a master | 23:59 |
mordred | yah | 23:59 |
pabelanger | clarkb: I must have just manually applied my fix and never pushed up the fix :( | 23:59 |
mordred | AND using the default value of 'puppet' - that's fantastic | 23:59 |
pabelanger | the puppet issue, I think I saw that but it didn't break anything on vexxhost | 23:59 |
mordred | dims: ++ from me | 23:59 |
clarkb | pabelanger: ok so maybe be patient then double check after its done? it hasn't failed yet | 23:59 |
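
The two one-off workarounds discussed above (appending a nameserver to resolv.conf, then installing python over plain ssh before ansible runs) could be sketched roughly as follows. This is only an illustration under stated assumptions: `bootstrap_commands`, `bootstrap`, and `RecordingSSHClient` are hypothetical names, not part of the real launch-node script; the only details taken from the log are the `ssh_client.ssh(...)` call shape, the 8.8.8.8 nameserver, and the `apt-get` command.

```python
# Hypothetical sketch; not the actual launch-node code.

def bootstrap_commands(nameserver="8.8.8.8", install_python=True):
    """Build the shell commands to run over ssh before ansible takes over."""
    commands = []
    if nameserver:
        # Temporary entry; per the log, puppet is expected to undo it later.
        commands.append(
            "echo 'nameserver %s' >> /etc/resolv.conf" % nameserver)
    if install_python:
        # Xenial cloud images may lack /usr/bin/python, which ansible needs.
        commands.append("apt-get update && apt-get install -y python")
    return commands


class RecordingSSHClient:
    """Stand-in for an ssh client: records commands instead of running them."""

    def __init__(self):
        self.sent = []

    def ssh(self, command):
        self.sent.append(command)


def bootstrap(ssh_client, **kwargs):
    """Send each bootstrap command through the client, in order."""
    for command in bootstrap_commands(**kwargs):
        ssh_client.ssh(command)


if __name__ == "__main__":
    client = RecordingSSHClient()
    bootstrap(client)
    for sent in client.sent:
        print(sent)
```

Ordering matters in this sketch: the resolver entry goes first because the package install needs working DNS. A raw (non-python) ansible task, as mentioned in the log, would be an alternative to running these over ssh directly.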
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!