*** jamesmcarthur has quit IRC | 00:00 | |
*** jamesmcarthur has joined #openstack-infra | 00:00 | |
*** jamesmcarthur has quit IRC | 00:02 | |
*** ijw has joined #openstack-infra | 00:08 | |
ianw | there's not enough space to do it even if we wanted to, and we should maybe clean up old volumes. i'll add a meeting item | 00:16 |
---|---|---|
ianw | fungi: it's already at a 5 minute ttl, so i think most things would be not really caching the remote address | 00:28 |
*** hwoarang has quit IRC | 00:32 | |
*** jamesmcarthur has joined #openstack-infra | 00:32 | |
*** hwoarang has joined #openstack-infra | 00:33 | |
ianw | #status log graphite.o.o A/AAAA records renamed to graphite-old.o.o, graphite.o.o now a CNAME to these until switch to graphite.opendev.org | 00:36 |
openstackstatus | ianw: finished logging | 00:36 |
openstackgerrit | Merged openstack-infra/project-config master: Add windmill-config project https://review.openstack.org/640510 | 00:52 |
*** wolverineav has joined #openstack-infra | 00:55 | |
fungi | ianw: that assumes applications re-resolve hostnames periodically at all | 00:56 |
fungi | for datagrams it might not be as prevalent as established tcp sockets though | 00:57 |
fungi | (continuing to assume the original address to which you resolved a name is relevant) | 00:57 |
*** wolverineav has quit IRC | 01:00 | |
*** jamesmcarthur has quit IRC | 01:01 | |
ianw | fungi: yeah, we only open udp 8125 and i think it all pretty much uses the statsd client, every packet blast will be a fresh lookup to the local resolver anyway | 01:01 |
*** markvoelker has joined #openstack-infra | 01:02 | |
fungi | ahh, in that case it's probably fine | 01:04 |
*** sdake has joined #openstack-infra | 01:06 | |
*** markvoelker has quit IRC | 01:07 | |
*** jesusaur has quit IRC | 01:19 | |
*** jesusaur has joined #openstack-infra | 01:25 | |
*** ijw has quit IRC | 01:27 | |
*** ijw_ has joined #openstack-infra | 01:27 | |
*** ijw_ has quit IRC | 01:28 | |
*** ijw has joined #openstack-infra | 01:29 | |
*** ijw has quit IRC | 01:30 | |
*** ijw_ has joined #openstack-infra | 01:30 | |
*** sdake has quit IRC | 01:44 | |
*** diablo_rojo has joined #openstack-infra | 01:48 | |
*** sdake has joined #openstack-infra | 01:50 | |
*** wolverineav has joined #openstack-infra | 01:56 | |
*** sdake has quit IRC | 01:59 | |
*** wolverineav has quit IRC | 02:01 | |
*** markvoelker has joined #openstack-infra | 02:03 | |
*** sdake has joined #openstack-infra | 02:05 | |
*** diablo_rojo has quit IRC | 02:06 | |
*** markvoelker has quit IRC | 02:06 | |
*** sdake has quit IRC | 02:08 | |
*** chason has quit IRC | 02:36 | |
*** chason has joined #openstack-infra | 02:38 | |
*** sdake has joined #openstack-infra | 02:41 | |
*** wolverineav has joined #openstack-infra | 02:42 | |
*** sdake has quit IRC | 02:44 | |
ianw | https://github.com/jsocol/pystatsd/blob/master/statsd/client/udp.py#L30 ... boo it's not fine, it looks up once | 02:47 |
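The difference ianw ran into (a client that resolves the hostname once versus one that looks it up again for every send) can be sketched roughly as follows. This is not the pystatsd code; the class names, the `target()` method, and the injectable `resolver` argument are all hypothetical and only illustrate why a DNS change may go unnoticed by a long-running process until it is restarted.

```python
import socket


def resolve_udp(host, port, resolver=socket.getaddrinfo):
    """Return (family, sockaddr) for the first UDP result of a lookup."""
    family, _type, _proto, _canon, sockaddr = resolver(
        host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM)[0]
    return family, sockaddr


class CachingSender:
    """Resolves the name once at construction time and reuses the address."""

    def __init__(self, host, port, resolver=socket.getaddrinfo):
        self.family, self.addr = resolve_udp(host, port, resolver)

    def target(self):
        # Always the address seen at startup, even if DNS has changed since.
        return self.family, self.addr


class ReresolvingSender:
    """Asks the resolver again for every datagram (hypothetical alternative)."""

    def __init__(self, host, port, resolver=socket.getaddrinfo):
        self.host, self.port, self.resolver = host, port, resolver

    def target(self):
        return resolve_udp(self.host, self.port, self.resolver)


def send_metric(sender, payload):
    """Send one UDP datagram to whatever address the sender currently targets."""
    family, addr = sender.target()
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, addr)
```

With the caching variant, a process started before the record change keeps sending to the old address until it is restarted, which fits the later note about restarting everything on the firewall list.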
*** sdake has joined #openstack-infra | 02:50 | |
*** wolverineav has quit IRC | 02:50 | |
*** psachin has joined #openstack-infra | 02:52 | |
*** ijw_ has quit IRC | 02:58 | |
*** ijw has joined #openstack-infra | 02:59 | |
*** jamesmcarthur has joined #openstack-infra | 03:01 | |
*** wolverineav has joined #openstack-infra | 03:05 | |
*** rkukura has quit IRC | 03:05 | |
*** jamesmcarthur has quit IRC | 03:05 | |
*** ijw has quit IRC | 03:08 | |
*** ijw has joined #openstack-infra | 03:09 | |
*** ykarel|away has joined #openstack-infra | 03:11 | |
*** yamamoto has joined #openstack-infra | 03:16 | |
*** ykarel|away is now known as ykarel | 03:17 | |
ianw | 03:21:35.787558 IP 117.114.139.162.59587 > graphite.openstack.org.8125: UDP, length 49 | 03:22 |
ianw | i can not log into this host, but yet it's making it past the graphite firewall, despite not being listed in iptables :/ ? | 03:22 |
ianw | in china somewhere, must be the old arm builder? | 03:23 |
*** sdake has quit IRC | 03:24 | |
*** sdake has joined #openstack-infra | 03:26 | |
*** jamesmcarthur has joined #openstack-infra | 03:27 | |
ianw | looks like a zuul executor? something to do with ustack? http://paste.openstack.org/show/747206/ | 03:29 |
clarkb | thats unexpected | 03:29 |
clarkb | is it in our firewall rules? the system config log may have ideas | 03:30 |
ianw | ok, so it's tcpdump, it's *not* making it past the firewall. but someone is misconfigured to send their stats to us | 03:30 |
ianw | yeah, it's something from https://www.ustack.com/ | 03:35 |
*** sdake has quit IRC | 03:41 | |
*** jamesmcarthur has quit IRC | 03:41 | |
fungi | sounds like someone deployed a ci system using a copy of our configuration management and didn't change the statsd destination variable | 03:43 |
*** wolverineav has quit IRC | 03:47 | |
*** wolverineav has joined #openstack-infra | 03:48 | |
*** janki has joined #openstack-infra | 03:52 | |
*** wolverineav has quit IRC | 04:14 | |
*** yamamoto has quit IRC | 04:18 | |
*** udesale has joined #openstack-infra | 04:20 | |
*** ramishra has joined #openstack-infra | 04:26 | |
*** diablo_rojo has joined #openstack-infra | 04:32 | |
*** yamamoto has joined #openstack-infra | 04:36 | |
*** ijw has quit IRC | 04:37 | |
*** udesale has quit IRC | 04:39 | |
*** ijw has joined #openstack-infra | 04:39 | |
*** udesale has joined #openstack-infra | 04:41 | |
*** ijw has quit IRC | 04:45 | |
ianw | yep, just had me confused because we did have a builder at one point in CN sending stats (moved to london though). but yeah, it's executor, builder, scheduler stats coming in so a misconfiguration | 04:50 |
*** ykarel is now known as ykarel|afk | 04:51 | |
*** wolverineav has joined #openstack-infra | 04:52 | |
*** Tengu_ is now known as Tengu | 04:58 | |
*** sdake has joined #openstack-infra | 05:09 | |
ianw | #status graphite.opendev.org now active replacement for graphite.openstack.org. everything on the firewall list that might need a restart to pickup new address has been done | 05:14 |
openstackstatus | ianw: unknown command | 05:14 |
ianw | #status log graphite.opendev.org now active replacement for graphite.openstack.org. everything on the firewall list that might need a restart to pickup new address has been done | 05:14 |
openstackstatus | ianw: finished logging | 05:15 |
*** ijw has joined #openstack-infra | 05:15 | |
*** snapiri has joined #openstack-infra | 05:19 | |
*** wolverineav has quit IRC | 05:23 | |
*** ykarel|afk is now known as ykarel | 05:25 | |
*** ricolin has joined #openstack-infra | 05:27 | |
*** udesale has quit IRC | 05:39 | |
*** udesale has joined #openstack-infra | 05:39 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Separate out executor server from runner https://review.openstack.org/607079 | 05:41 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: zuul-runner: implement prep-workspace https://review.openstack.org/607082 | 05:41 |
*** raukadah has quit IRC | 05:45 | |
*** chandankumar has joined #openstack-infra | 05:46 | |
*** sdake has quit IRC | 05:48 | |
*** sdake has joined #openstack-infra | 05:54 | |
*** sdake has quit IRC | 06:01 | |
*** sdake has joined #openstack-infra | 06:02 | |
*** sdake has quit IRC | 06:03 | |
*** diablo_rojo has quit IRC | 06:04 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Add API endpoint to get frozen jobs https://review.openstack.org/607077 | 06:26 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Get executor job params https://review.openstack.org/607078 | 06:26 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: Separate out executor server from runner https://review.openstack.org/607079 | 06:26 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: zuul-runner: implement prep-workspace https://review.openstack.org/607082 | 06:26 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: zuul-runner: add yaml based configuration file https://review.openstack.org/640672 | 06:26 |
*** apetrich has joined #openstack-infra | 06:30 | |
*** tkajinam_ has joined #openstack-infra | 06:36 | |
*** tkajinam has quit IRC | 06:38 | |
*** udesale has quit IRC | 06:46 | |
*** udesale has joined #openstack-infra | 06:47 | |
*** markvoelker has joined #openstack-infra | 06:48 | |
*** wolverineav has joined #openstack-infra | 06:54 | |
*** wolverineav has quit IRC | 07:02 | |
*** quiquell|off is now known as quiquell | 07:03 | |
*** udesale has quit IRC | 07:05 | |
*** udesale has joined #openstack-infra | 07:08 | |
*** udesale has quit IRC | 07:09 | |
*** udesale has joined #openstack-infra | 07:09 | |
*** auristor has quit IRC | 07:12 | |
*** slaweq has joined #openstack-infra | 07:15 | |
*** markvoelker has quit IRC | 07:21 | |
AJaeger | config-core, could you go over the queue, please? we have some changes since mid February waiting for a second +2... | 07:25 |
*** wolverineav has joined #openstack-infra | 07:27 | |
*** jtomasek has joined #openstack-infra | 07:30 | |
*** wolverineav has quit IRC | 07:31 | |
*** yamamoto has quit IRC | 07:40 | |
*** wolverineav has joined #openstack-infra | 07:41 | |
*** yamamoto has joined #openstack-infra | 07:41 | |
*** auristor has joined #openstack-infra | 07:41 | |
*** pgaxatte has joined #openstack-infra | 07:42 | |
*** quiquell is now known as quiquell|brb | 07:43 | |
*** rkukura has joined #openstack-infra | 07:44 | |
*** wolverineav has quit IRC | 07:47 | |
*** ginopc has joined #openstack-infra | 07:50 | |
*** snapiri has quit IRC | 07:52 | |
*** adriancz has joined #openstack-infra | 07:56 | |
*** kopecmartin|off is now known as kopecmartin | 08:04 | |
*** jesusaur has quit IRC | 08:06 | |
*** jesusaur has joined #openstack-infra | 08:10 | |
*** panda|ruck|off is now known as panda|ruck | 08:12 | |
*** rpittau|sardegna is now known as rpittau | 08:13 | |
*** dpawlik has joined #openstack-infra | 08:16 | |
*** markvoelker has joined #openstack-infra | 08:18 | |
*** quiquell|brb is now known as quiquell | 08:18 | |
*** rascasoft has joined #openstack-infra | 08:20 | |
*** pcaruana has joined #openstack-infra | 08:25 | |
*** mandre_away is now known as mandre | 08:25 | |
*** tosky has joined #openstack-infra | 08:27 | |
*** helenafm has joined #openstack-infra | 08:29 | |
*** rkukura has quit IRC | 08:31 | |
*** wolverineav has joined #openstack-infra | 08:33 | |
*** e0ne has joined #openstack-infra | 08:34 | |
*** wolverineav has quit IRC | 08:37 | |
*** iurygregory has joined #openstack-infra | 08:40 | |
*** jpich has joined #openstack-infra | 08:43 | |
*** dtantsur|afk is now known as dtantsur | 08:44 | |
*** tkajinam_ has quit IRC | 08:48 | |
*** markvoelker has quit IRC | 08:50 | |
*** jpena|off is now known as jpena | 08:56 | |
*** rossella_s has joined #openstack-infra | 08:56 | |
*** rkukura has joined #openstack-infra | 08:58 | |
*** roman_g has joined #openstack-infra | 08:59 | |
*** janki has quit IRC | 09:00 | |
*** janki has joined #openstack-infra | 09:00 | |
*** udesale has quit IRC | 09:04 | |
openstackgerrit | Merged openstack-infra/storyboard-webclient master: removes # for cards in automatic worklists https://review.openstack.org/640128 | 09:10 |
openstackgerrit | Roman Gorshunov proposed openstack-infra/irc-meetings master: Update Airship meeting time, change chair https://review.openstack.org/640359 | 09:15 |
*** ykarel is now known as ykarel|lunch | 09:21 | |
*** derekh has joined #openstack-infra | 09:34 | |
*** e0ne has quit IRC | 09:40 | |
*** jaosorior has joined #openstack-infra | 09:41 | |
*** jistr is now known as jistr|sick | 09:42 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration https://review.openstack.org/639855 | 09:43 |
*** markvoelker has joined #openstack-infra | 09:47 | |
*** jbadiapa has quit IRC | 09:51 | |
*** gfidente has joined #openstack-infra | 09:52 | |
*** IvensZambrano has joined #openstack-infra | 09:54 | |
*** gfidente has quit IRC | 10:00 | |
*** mwhahaha has quit IRC | 10:01 | |
*** mwhahaha has joined #openstack-infra | 10:01 | |
*** jpich has quit IRC | 10:05 | |
*** jpich has joined #openstack-infra | 10:06 | |
*** electrofelix has joined #openstack-infra | 10:07 | |
*** gfidente has joined #openstack-infra | 10:08 | |
*** ykarel|lunch is now known as ykarel | 10:10 | |
*** wolverineav has joined #openstack-infra | 10:10 | |
*** jpich has quit IRC | 10:10 | |
*** jpich has joined #openstack-infra | 10:12 | |
*** jbadiapa has joined #openstack-infra | 10:13 | |
*** wolverineav has quit IRC | 10:15 | |
*** e0ne has joined #openstack-infra | 10:16 | |
*** luizbag has joined #openstack-infra | 10:18 | |
*** dtantsur has quit IRC | 10:18 | |
*** yamamoto has quit IRC | 10:18 | |
*** dtantsur has joined #openstack-infra | 10:18 | |
Dobroslaw | Hello zuul magicians | 10:19 |
Dobroslaw | Is there any magic that needs to be done to run custom job on release pipeline? | 10:19 |
Dobroslaw | What is my problem: I want to push docker image with proper tag to docker hub on release | 10:19 |
Dobroslaw | I added it to `.zuul.yaml` https://github.com/openstack/monasca-common/blob/master/.zuul.yaml#L36-L38 | 10:19 |
Dobroslaw | but looks like it wasn't run | 10:19 |
Dobroslaw | this job is running only in periodic and post | 10:19 |
Dobroslaw | http://zuul.openstack.org/builds?project=openstack%2Fmonasca-common&job_name=docker-publish-monasca-base | 10:19 |
Dobroslaw | I checked the kolla repo and it looks like it has the same problem | 10:19 |
Dobroslaw | https://github.com/openstack/kolla/blob/master/.zuul.d/ubuntu.yaml#L26-L29 | 10:19 |
Dobroslaw | http://zuul.openstack.org/builds?project=openstack%2Fkolla&pipeline=release&job_name=kolla-publish-ubuntu-binary | 10:19 |
openstackgerrit | Slawek Kaplonski proposed openstack-infra/project-config master: Move openstack-tox-lower-constraints to UT jobs graph https://review.openstack.org/639321 | 10:19 |
*** markvoelker has quit IRC | 10:21 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: Proposed spec: tenant-scoped admin web API https://review.openstack.org/562321 | 10:25 |
*** yamamoto has joined #openstack-infra | 10:36 | |
*** yamamoto has quit IRC | 10:36 | |
*** yamamoto has joined #openstack-infra | 10:37 | |
openstackgerrit | Merged openstack-infra/project-config master: Add openstack-tox-py37 job to Neutron dashboard https://review.openstack.org/639588 | 10:37 |
*** yamamoto has quit IRC | 10:41 | |
openstackgerrit | Rico Lin proposed openstack-infra/project-config master: Add an openstack/auto-scaling-sig repository https://review.openstack.org/637125 | 10:42 |
*** ykarel is now known as ykarel|mtg | 10:43 | |
openstackgerrit | Slawek Kaplonski proposed openstack-infra/project-config master: Move openstack-tox-lower-constraints to UT jobs graph https://review.openstack.org/639321 | 10:48 |
openstackgerrit | Merged openstack-infra/project-config master: Set up placement project to use storyboard https://review.openstack.org/639445 | 10:48 |
AJaeger | Dobroslaw: I see the job run, see http://zuul.openstack.org/builds?job_name=docker-publish-monasca-base | 10:48 |
Dobroslaw | AJaeger: but not on release | 10:49 |
AJaeger | Dobroslaw: please follow our naming conventions for in-repo jobs at https://docs.openstack.org/infra/manual/drivers.html#consistent-naming-for-jobs-with-zuul-v3 | 10:49 |
AJaeger | It should be monasca-common-X - and not X-monasca-common. | 10:49 |
Dobroslaw | OK, will fix | 10:49 |
AJaeger | we share a namespace and that makes it easier to see where jobs come from. | 10:49 |
AJaeger | thanks | 10:50 |
*** ricolin has quit IRC | 10:50 | |
Dobroslaw | but it still is not running on release | 10:50 |
frickler | Dobroslaw: this is what I see in the zuul log, but I cannot find where the BranchMatcher:master comes from http://paste.openstack.org/show/747225/ | 10:50 |
cmurphy | infra-root I'm seeing some mirror issues with limestone regionone e.g. http://logs.openstack.org/14/640214/3/check/devstack-xenial/de9a63e/job-output.txt.gz http://logs.openstack.org/93/628193/21/check/openstack-tox-py35/117afc2/job-output.txt.gz | 10:50 |
frickler | Dobroslaw: may well be a zuul bug | 10:50 |
AJaeger | frickler: the job is defined on master - so might be implicit | 10:51 |
AJaeger | Dobroslaw, best ask later again once US westcoast is awake... | 10:52 |
Dobroslaw | AJaeger: frickler OK, will wait | 10:52 |
Dobroslaw | thank you | 10:52 |
Dobroslaw | but release is going from master so to me it looks like a zuul bug | 10:53 |
frickler | cmurphy: looking | 10:55 |
*** jangutter has joined #openstack-infra | 10:57 | |
frickler | [Mon Mar 04 06:39:15.952767 2019] [core:error] [pid 1507:tid 139635334301440] [client 2607:ff68:100:54:f816:3eff:fe8e:3c5d:55636] AH00037: Symbolic link not allowed or link target not accessible: /var/www/mirror/ubuntu | 11:03 |
frickler | infra-root: did we change our mirror setup somehow lately? ^^ | 11:04 |
frickler | in addition I'm also seeing flapping connectivity to that mirror node, like maybe a duplicate node issue | 11:05 |
frickler | hmm, actually the latter seems to be the more significant issue | 11:07 |
*** jangutter has quit IRC | 11:09 | |
*** e0ne has quit IRC | 11:10 | |
*** e0ne has joined #openstack-infra | 11:11 | |
*** yamamoto has joined #openstack-infra | 11:14 | |
*** udesale has joined #openstack-infra | 11:15 | |
ianw | frickler: yeah, no changes and http://grafana.openstack.org/d/ACtl1JSmz/afs?orgId=1 looks healthy | 11:18 |
*** markvoelker has joined #openstack-infra | 11:18 | |
*** jangutter has joined #openstack-infra | 11:26 | |
openstackgerrit | Jens Harbott (frickler) proposed openstack-infra/project-config master: Disable provider limestone https://review.openstack.org/640737 | 11:42 |
frickler | infra-root: ^^ let's disable limestone until we can get stable networking to the mirror node again | 11:43 |
*** markvoelker has quit IRC | 11:50 | |
*** udesale has quit IRC | 11:52 | |
*** udesale has joined #openstack-infra | 11:53 | |
*** wolverineav has joined #openstack-infra | 11:59 | |
*** e0ne has quit IRC | 12:00 | |
openstackgerrit | Merged openstack-infra/project-config master: Move openstack-tox-lower-constraints to UT jobs graph https://review.openstack.org/639321 | 12:02 |
*** wolverineav has quit IRC | 12:04 | |
*** e0ne has joined #openstack-infra | 12:08 | |
*** aojea has joined #openstack-infra | 12:09 | |
*** ykarel|mtg is now known as ykarel | 12:13 | |
*** janki has quit IRC | 12:16 | |
*** janki has joined #openstack-infra | 12:16 | |
*** udesale has quit IRC | 12:20 | |
*** udesale has joined #openstack-infra | 12:21 | |
*** dave-mccowan has joined #openstack-infra | 12:22 | |
frickler | there are also afs failures on mirror02.limestone, which probably explain the issues above: | 12:22 |
frickler | Mar 4 12:18:53 mirror02 kernel: [2541863.823809] afs: Lost contact with file server 104.130.138.161 in cell openstack.org (code -1) (all multi-homed ip addresses down for the server) | 12:22 |
*** priteau has joined #openstack-infra | 12:30 | |
*** jcoufal has joined #openstack-infra | 12:34 | |
openstackgerrit | Luigi Toscano proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-sahara-dashboard-dsvm-integration https://review.openstack.org/640751 | 12:38 |
*** jpena is now known as jpena|lunch | 12:38 | |
*** udesale has quit IRC | 12:38 | |
*** udesale has joined #openstack-infra | 12:41 | |
*** markvoelker has joined #openstack-infra | 12:48 | |
*** yamamoto has quit IRC | 12:48 | |
*** edmondsw has joined #openstack-infra | 12:49 | |
*** zbr has quit IRC | 12:53 | |
*** zbr|ssbarnea has joined #openstack-infra | 12:53 | |
*** zbr|ssbarnea has quit IRC | 12:54 | |
*** zbr has joined #openstack-infra | 12:55 | |
*** rlandy has joined #openstack-infra | 12:57 | |
*** jistr|sick is now known as jistr|sick|mtg | 13:01 | |
*** e0ne has quit IRC | 13:04 | |
*** udesale has quit IRC | 13:08 | |
*** udesale has joined #openstack-infra | 13:09 | |
*** yamamoto has joined #openstack-infra | 13:16 | |
*** rh-jelabarre has joined #openstack-infra | 13:17 | |
*** markvoelker has quit IRC | 13:21 | |
*** sdake has joined #openstack-infra | 13:23 | |
*** jamesmcarthur has joined #openstack-infra | 13:24 | |
*** jpena|lunch is now known as jpena | 13:31 | |
*** jamesmcarthur has quit IRC | 13:32 | |
*** jamesmcarthur has joined #openstack-infra | 13:33 | |
*** udesale has quit IRC | 13:36 | |
*** mhu has quit IRC | 13:36 | |
*** mhu has joined #openstack-infra | 13:37 | |
*** yamamoto has quit IRC | 13:37 | |
*** jamesmcarthur has quit IRC | 13:38 | |
*** yamamoto has joined #openstack-infra | 13:39 | |
*** wolverineav has joined #openstack-infra | 13:47 | |
*** sthussey has joined #openstack-infra | 13:47 | |
*** rfolco is now known as rfolco|pto | 13:50 | |
*** wolverineav has quit IRC | 13:52 | |
fungi | logan-: when you're around, we've been seeing intermittent but significant packet loss to the ipv4 address of our mirror server there (216.245.200.132) | 13:59 |
fungi | i was seeing upwards of 60% icmp packet loss a few minutes ago from multiple parts of the internet, though it's cleared up again for the moment | 14:00 |
fungi | i haven't observed any packet loss to its ipv6 address, but i wasn't testing both concurrently so that could just be luck | 14:01 |
*** jamesmcarthur has joined #openstack-infra | 14:03 | |
fungi | i suspect it was impacting v6 as well since i expect cacti is polling snmp on it over v6 and i see some prominent gaps in our graphs for it... take the root disk utilization graph for example http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=64934&rra_id=all | 14:06 |
*** jamesmcarthur has quit IRC | 14:06 | |
*** jamesmcarthur has joined #openstack-infra | 14:06 | |
frickler | fungi: yes, I was pinging v6 from bridge earlier and also seeing gaps in connectivity | 14:07 |
openstackgerrit | Merged openstack-infra/project-config master: Disable provider limestone https://review.openstack.org/640737 | 14:08 |
fungi | and now it's entirely unreachable | 14:13 |
fungi | fd3b:86c4:135e:d033::100 is returning no route to host for it | 14:13 |
*** sdake has quit IRC | 14:14 | |
fungi | that seems to be a non-globally-routable address somewhere in limestone's network, i guess on a router serial interface | 14:15 |
fungi | though i can reach it over ipv4 currently | 14:16 |
fungi | increasingly strange | 14:16 |
*** markvoelker has joined #openstack-infra | 14:18 | |
*** janki has quit IRC | 14:19 | |
fungi | probably coincidence, but i started pinging the v6 gateway from the mirror server (over a v4 ssh connection) and suddenly stopped getting ipv4 packets through | 14:20 |
fungi | but my previously hung v6 ssh session is suddenly working again | 14:20 |
*** janki has joined #openstack-infra | 14:24 | |
*** jamesmcarthur has quit IRC | 14:24 | |
*** sdake has joined #openstack-infra | 14:25 | |
*** sdake has quit IRC | 14:25 | |
mordred | fungi: maybe it's a poltergeist | 14:28 |
mordred | infra-root: the project creation playbook for gitea seems to be working! | 14:28 |
pabelanger | I see things! | 14:30 |
*** pcaruana has quit IRC | 14:31 | |
mordred | pabelanger: hopefully not dead people ... | 14:31 |
pabelanger | ++ | 14:31 |
mordred | https://review.openstack.org/#/c/640218/ and https://review.openstack.org/#/c/640431/ could use review | 14:32 |
*** beekneemech is now known as bnemec | 14:33 | |
*** mriedem has joined #openstack-infra | 14:36 | |
*** ykarel is now known as ykarel|away | 14:37 | |
*** sdake has joined #openstack-infra | 14:38 | |
*** jamesmcarthur has joined #openstack-infra | 14:38 | |
*** ykarel|away has quit IRC | 14:46 | |
*** ekultails has joined #openstack-infra | 14:47 | |
*** sdake has quit IRC | 14:49 | |
*** markvoelker has quit IRC | 14:50 | |
*** sdake has joined #openstack-infra | 14:56 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration https://review.openstack.org/639855 | 14:56 |
*** e0ne has joined #openstack-infra | 14:59 | |
fungi | i am mildly curious what happens if a project is renamed to the old name of a different already-renamed project | 14:59 |
fungi | whether gitea drops the old redirect at that point, or rejects the rename attempt | 14:59 |
fungi | (or option #3) | 14:59 |
*** 18WAAFJGJ has quit IRC | 15:00 | |
*** guilhermesp has joined #openstack-infra | 15:01 | |
fungi | and now we're back to ipv4 working to the limestone mirror but not v6 | 15:01 |
fungi | i wonder if i can replicate the earlier switch in behavior | 15:02 |
*** e0ne has quit IRC | 15:02 | |
*** jistr|sick|mtg is now known as jistr|sick | 15:02 | |
fungi | nevermind, now they're both unresponsive again | 15:02 |
mordred | fungi: I think corvus actually submitted a patch upstream related to that - but I don't remember what the sitch is | 15:03 |
fungi | and now v4 is responding for me again | 15:04 |
*** e0ne has joined #openstack-infra | 15:04 | |
*** eharney has joined #openstack-infra | 15:04 | |
fungi | pinging the ipv6 default gateway from the mirror over an ipv4 ssh session doesn't seem to have broken v4 connectivity this time | 15:05 |
fungi | however v6 is working again now | 15:05 |
fungi | i wonder if something is breaking neighbor discovery and/or arp | 15:06 |
corvus | fungi, mordred: yeah, that case should be handled in gitea master: https://github.com/go-gitea/gitea/pull/6216 merged | 15:07 |
*** kgiusti has joined #openstack-infra | 15:07 | |
fungi | neat! glad i'm not the only one who has these idle what-ifs | 15:07 |
fungi | since odds are at some point we would have run into that case | 15:08 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Remove TaskManager and just use keystoneauth https://review.openstack.org/640643 | 15:08 |
*** sdake has quit IRC | 15:11 | |
*** priteau has quit IRC | 15:11 | |
*** zul has joined #openstack-infra | 15:12 | |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Remove TaskManager and just use keystoneauth https://review.openstack.org/640643 | 15:12 |
corvus | fungi: yeah, i ran into it pretty quick in testing :) | 15:14 |
*** sdake has joined #openstack-infra | 15:15 | |
*** roman_g has quit IRC | 15:19 | |
*** roman_g has joined #openstack-infra | 15:19 | |
fungi | not surprised | 15:22 |
*** pcaruana has joined #openstack-infra | 15:25 | |
corvus | fungi: can you review https://review.openstack.org/636775 when you have a moment? it will improve run_all.sh a bit | 15:26 |
fungi | happy to! | 15:27 |
corvus | though, oops | 15:28 |
fungi | unapproved | 15:28 |
fungi | what's the oops i overlooked? | 15:28 |
corvus | fungi, mordred, clarkb: it looks like remote_puppet_git is taking 1.5 hours now | 15:28 |
fungi | oof | 15:28 |
fungi | that sounds pathological. any idea what task is taking so long? | 15:29 |
corvus | probably gitea creation :) | 15:29 |
fungi | yeah, just thinking maybe that's a one-time cost? | 15:30 |
fungi | or several-time until it gets all the way through? | 15:30 |
fungi | or is it in theory already done creating all the repos | 15:31 |
fungi | ? | 15:31 |
corvus | yeah... do we know if it's gotten through yet? | 15:31 |
mordred | corvus: it SEEMS like all the projects are in - it's at least not showing the one-project-per-run pattern from before and there are a lot in there | 15:32 |
*** chandankumar is now known as raukadah | 15:32 | |
corvus | mordred: there are failures in the log for the 'create repo' task | 15:32 |
mordred | corvus: ah. then maybe not | 15:32 |
corvus | thank goodness we still have the debug line in there that tells us what project it's doing :) | 15:34 |
*** wolverineav has joined #openstack-infra | 15:35 | |
*** armax has joined #openstack-infra | 15:36 | |
corvus | mordred, fungi: it looks like we're stuck. i've checked the last 3 runs, and gitea08 has bombed on stackforge/halthnmon | 15:36 |
corvus | i think because the description is too long | 15:36 |
corvus | so i think we just need to update the playbook to use a substring | 15:37 |
corvus | (i thought we did that already) | 15:37 |
openstackgerrit | Merged openstack-infra/system-config master: Add gitea to project rename playbook https://review.openstack.org/640218 | 15:37 |
corvus | and yeah, i think they're all stuck on that repo | 15:38 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Remove TaskManager and just use keystoneauth https://review.openstack.org/640643 | 15:38 |
fungi | i can't find a halthnmon | 15:38 |
corvus | healthnmon | 15:39 |
fungi | ahh, stackforge/healthnmon | 15:39 |
fungi | yeah, tried some obvious variations ;) | 15:39 |
mordred | good old healthnmon | 15:39 |
corvus | fungi: we're playing "spot the typo!" | 15:39 |
fungi | 296 characters, give or take a newline | 15:40 |
*** wolverineav has quit IRC | 15:40 | |
fungi | is that field limited to 256 now? | 15:40 |
mordred | I bet it is | 15:40 |
mordred | or probably 255 | 15:40 |
mordred | since it's probably a database field length | 15:40 |
fungi | limited in code or by db column width? | 15:40 |
fungi | ahh, yeah | 15:40 |
mordred | I'm guessing db column width | 15:40 |
mordred | corvus: you doing a patch or want me to? | 15:41 |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara https://review.openstack.org/640785 | 15:41 |
corvus | mordred: can you? i'll go double check the length | 15:41 |
mordred | yup | 15:41 |
*** janki has quit IRC | 15:42 | |
corvus | yeah, both repo and org descriptions are limited to 255 | 15:42 |
*** ricolin has joined #openstack-infra | 15:42 | |
fungi | limestone update, the mirror is now fairly responsive over both ipv4 and ipv6 again | 15:42 |
fungi | at least for my ssh sessions, but snmp still seems to not be getting much in the way of responses yet according to cacti | 15:43 |
fungi | ahh, yeah seeing packet loss again already. that didn't seem to last long | 15:44 |
corvus | fungi, mordred: did you know that our base playbook alternates between taking ~12 seconds and 20 minutes to run? | 15:44 |
fungi | weird... | 15:44 |
corvus | http://grafana.openstack.org/d/qzQ_v2oiz/bridge-runtime?panelId=6&fullscreen&orgId=1&from=now-7d&to=now | 15:44 |
fungi | and, no, i did not know that ;) | 15:44 |
corvus | oh there's a third time in there too... ~10m | 15:45 |
corvus | oh nm | 15:45 |
corvus | that's a bug i fixed in that patch | 15:45 |
fungi | quick! | 15:45 |
corvus | one of those is the "k8s_bootstrap" run | 15:45 |
corvus | so as far as timings go -- i think the git playbook isn't going to get any faster. we're already looking at it taking 1.5 hours to no-op its way from start to healthnmon | 15:47 |
*** quiquell is now known as quiquell|off | 15:47 | |
corvus | less says that's 98% through the file, so we *are* almost done | 15:47 |
*** markvoelker has joined #openstack-infra | 15:48 | |
*** zul has quit IRC | 15:48 | |
fungi | this is using api queries not direct to database, right? | 15:48 |
openstackgerrit | Rico Lin proposed openstack-infra/project-config master: Add an openstack/auto-scaling-sig repository https://review.openstack.org/637125 | 15:48 |
fungi | is it parallel across all the gitea servers or serialized? | 15:48 |
corvus | right, though most of the playbook was already doing that, the only thing we changed was updating the settings | 15:49 |
corvus | fungi: parallel | 15:49 |
fungi | oof, so ~1.5 hours just to check that ~2k projects don't need updating | 15:49 |
corvus | we do execute the settings update regardless | 15:50 |
corvus | we have task profiling turned on, but it's not really helping to point out where we're spending the time on this one. there are too many tasks | 15:50 |
fungi | so we're managing to average around 2.7 projects per second there | 15:50 |
corvus | we ran 65688 tasks across all 8 servers | 15:51 |
corvus | i can tell you the 50 slowest ones. | 15:51 |
fungi | er, i divided backwards. 2.7 seconds per project | 15:51 |
corvus | but i think we need average run time for each task | 15:51 |
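
The figures quoted in this exchange can be checked with a little arithmetic. This is only a sketch using the rounded numbers from the discussion (about 1.5 hours, about 2000 projects, 65688 tasks); the variable names are illustrative and not taken from any playbook.

```python
# Rounded figures from the discussion above; names are illustrative only.
runtime_seconds = 1.5 * 3600   # ~1.5 hours for a run that changes nothing
project_count = 2000           # "~2k projects"
total_tasks = 65688            # tasks run across all 8 gitea servers

# Wall-clock cost per project, and overall task throughput.
seconds_per_project = runtime_seconds / project_count
tasks_per_second = total_tasks / runtime_seconds

print(f"{seconds_per_project:.1f} seconds per project")
print(f"{tasks_per_second:.1f} tasks per second overall")
```

This confirms the corrected reading of 2.7 seconds per project (not projects per second). The throughput figure is an overall average only; it does not give the per-task averages corvus is asking for.
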
mordred | corvus: ok. I may need your ansible/jinja help | 15:52 |
mordred | corvus: project.description | default('') ... where does a string slice go in that? | 15:52 |
corvus | "{{ (project.description | default(''))[:255] }}" | 15:53 |
corvus | i've found that if you sprinkle extra parens, it turns the filters back into something more python-like :) | 15:54 |
mordred | corvus: wow. that's neat | 15:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config master: Limit project description to 255 characters https://review.openstack.org/640788 | 15:57 |
*** wolverineav has joined #openstack-infra | 15:57 | |
mordred | corvus, fungi: ^^ | 15:57 |
fungi | huh, nifty slicing | 15:58 |
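
The parenthesised Jinja expression above behaves much like an ordinary Python slice applied to a value with a fallback. A rough plain-Python analogue, assuming the project is a dict with an optional `description` key (the function name and the dict shape are hypothetical, not taken from the playbook):

```python
def truncated_description(project: dict, limit: int = 255) -> str:
    """Return the description cut to at most `limit` characters.

    A missing key falls back to '' in the same spirit as default('')
    in the Jinja expression; the slice then applies to the result.
    """
    return project.get("description", "")[:limit]


# A long description is cut to 255 characters, a missing one becomes ''.
print(len(truncated_description({"description": "x" * 300})))
print(repr(truncated_description({})))
```

As in the Jinja version, the grouping matters: the fallback is applied first and the slice is applied to the whole result, which is what the extra parentheses achieve in the template.
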
corvus | mordred, fungi: when we ask gitea for the repo list, we are told the description, but we don't have the other info (like tracker url, etc). but maybe under the circumstances, we should only run the update settings task if the description doesn't match. | 15:58 |
mordred | corvus: that works for me as a hack for now | 15:59 |
mordred | corvus: maybe we can make a PR to upstream to return more things in the API call - or have a 'detailed-repo-list' call or something | 15:59 |
corvus | ++ | 15:59 |
*** josephrsandoval has joined #openstack-infra | 15:59 | |
*** sdake has quit IRC | 16:00 | |
corvus | mordred: you want to work on the update if description changed change? | 16:00 |
mordred | corvus: should we set a different default description then? so that we know that if a description is there, it means we've run the settings update at least once? | 16:00 |
corvus | mordred: oh, we don't update the description with the settings | 16:01 |
corvus | oops, i just assumed we did. | 16:02 |
mordred | nod. so - that doesn't get us much it seems | 16:02 |
corvus | mordred: ok, how about we just go back to updating settings only when we create the project | 16:02 |
mordred | yeah. and if we want we can make a version of this playbook that can be run by hand to fix up any botched creations | 16:02 |
corvus | mordred: ya. the only project that will cause a problem with that at the moment is healthnmon. | 16:03 |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config master: Only update gitea project settings during creation https://review.openstack.org/640789 | 16:04 |
*** e0ne has quit IRC | 16:06 | |
*** wolverineav has quit IRC | 16:07 | |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config master: Add utility playbook for fixing gitea project settings https://review.openstack.org/640792 | 16:07 |
mordred | corvus: there's a utility playbook that we can use for a forced full sync | 16:08 |
*** luizbag has quit IRC | 16:08 | |
corvus | ++ | 16:08 |
openstackgerrit | Camila Moura proposed openstack-infra/storyboard master: Creates the tag with project name and priority https://review.openstack.org/640793 | 16:09 |
mordred | corvus: I also don't see an api call we can use to update a description (was going to add a task to update the description if it didn't match) | 16:10 |
*** e0ne has joined #openstack-infra | 16:12 | |
*** tobeass-urdin is now known as tobias-urdin | 16:17 | |
tosky | uhm, I have a zuul job (jobA) defined in the repository A and it uses the bindep role. If I try using the job in another repository (B), it fails because some binary dependencies are not installed | 16:18 |
tosky | so I guess that jobA tries to use the bindep.txt file from B, not from A | 16:18 |
tosky | is that expected? | 16:18 |
*** gfidente has quit IRC | 16:19 | |
clarkb | tosky: if you are using the infra provided parent job stuff that runs bindep then yes that is relative to the repo under test | 16:19 |
tosky | clarkb: isn't it a bit unexpected? Or at least it was for me - I would expect the job to be self-contained | 16:20 |
*** e0ne has quit IRC | 16:20 | |
tosky | so that I shouldn't need to copy the content of bindep.txt, as it is an implementation detail of the job | 16:20 |
clarkb | tosky: if you want the job to run against a fixed target you'll need to implement that in the job itself | 16:21 |
*** markvoelker has quit IRC | 16:21 | |
clarkb | tosky: most of our job design assumes reusable code that applies to arbitrary repos, not a job fixed against a repo that is mixed in with other repos | 16:21 |
clarkb | this is possible, you just have to set it up that way | 16:21 |
*** e0ne has joined #openstack-infra | 16:22 | |
tosky | so I guess I would need to tune the call to the bindep role | 16:22 |
clarkb | tosky: for example the job that sets up the bindep stuff is probably defined in openstack-zuul-jobs. We don't want to force every tox job to use openstack-zuul-jobs bindep file | 16:22 |
clarkb | to catch up on scrollback it sounds like we've disabled limestone due to network flakiness and the gitea project creation playbook works now but is really slow? | 16:23 |
*** e0ne has quit IRC | 16:23 | |
clarkb | looks like the gitea sign in box is still present. Do we need to retrigger docker image builds to rebuild that image after the docker promotion fixes? | 16:23 |
*** wolverineav has joined #openstack-infra | 16:24 | |
mordred | tosky: when you use the job in another repository, you can set zuul_work_dir in the job variables | 16:25 |
*** gfidente has joined #openstack-infra | 16:25 | |
mordred | tosky: when I make jobs that I expect to run in a consistent context regardless of who triggered them, I'll oftentimes set zuul_work_dir in the job definition | 16:26 |
mordred | tosky: in fact - thanks! you just caused me to realize I've got a job that's not doing what I thought it was | 16:29 |
tosky | thanks clarkb and mordred | 16:29 |
tosky | in fact the other role which is defined and used in the job uses a variable which points to the correct place, I just need to pass it to the bindep role | 16:29 |
mordred | tosky: remote: https://review.openstack.org/640804 Make tox tips job actually run sdk | 16:30 |
mordred | tosky: if you set zuul_work_dir it should make its way to the bindep role | 16:30 |
mordred | tosky: but cool! | 16:31 |
*** wolverineav has quit IRC | 16:31 | |
mordred | tosky: (that's the patch to openstacksdk that this discussion just made me realize I needed ... so thanks again) | 16:31 |
clarkb | if you do that it won't run bindep for the repo under test which might be needed if installing that repo as well | 16:31 |
*** ramishra has quit IRC | 16:31 | |
mordred | clarkb: indeed. we don't really have a bindep-siblings behavior defined particularly well | 16:32 |
*** gyee has joined #openstack-infra | 16:32 | |
tosky | something like "merge all the bindep dependencies" ? | 16:32 |
mordred | yeah. oh - actually - no, we should not add that | 16:34 |
*** sreejithp has joined #openstack-infra | 16:34 | |
mordred | clarkb: a repo will ultimately need to define transitive bindep depends in its own bindep file for real-world usage anyway | 16:34 |
mordred | because pip install foo does not know how to find the bindep file of foo and install it first | 16:34 |
mordred | so if we did bindep-siblings, we might actually wind up letting people develop things that have transitive bindep depends and work in the gate but not in the real world | 16:35 |
*** sreejithp has quit IRC | 16:35 | |
*** sreejithp has joined #openstack-infra | 16:35 | |
clarkb | ya | 16:35 |
*** aojea has quit IRC | 16:38 | |
clarkb | mordred: is https://review.openstack.org/#/c/640789/ the fix for slow gitea runs? | 16:38 |
*** sreejithp has quit IRC | 16:39 | |
*** sreejithp has joined #openstack-infra | 16:39 | |
mordred | clarkb: yeah | 16:39 |
mordred | clarkb: and the followup lets us cleanup if something deps | 16:39 |
mordred | derps | 16:39 |
clarkb | ok I'm caught up on that stack and it is now approved | 16:40 |
clarkb | any idea on the sign in link on the gitea UI? my hunch is we need to rerun image builds since the docker image building jobs were unhappy when the fix for that merged | 16:41 |
* clarkb finds tea now that fixes are approved | 16:41 | |
*** yamamoto has quit IRC | 16:41 | |
corvus | i'll take a look at that | 16:42 |
*** rpittau is now known as rpittau|afk | 16:42 | |
corvus | clarkb: i think the image is updated, but i don't think that docker-compose automatically redeploys it for us | 16:43 |
*** helenafm has quit IRC | 16:43 | |
corvus | here's the last build: http://zuul.openstack.org/build/0c842abbb7474874b406393a4e127108 | 16:45 |
openstackgerrit | Merged openstack-infra/system-config master: Limit project description to 255 characters https://review.openstack.org/640788 | 16:47 |
fungi | tosky: clarkb: remember that bindep is about declaring dependencies for a project, not dependencies for a ci job. if a cross-project job has specific distro packages it needs installed, it should take care of that itself | 16:48 |
mordred | corvus, clarkb: yes - I believe that is correct - I think we need to tell docker-compose to update | 16:48 |
*** ginopc has quit IRC | 16:49 | |
*** wolverineav has joined #openstack-infra | 16:49 | |
fungi | tosky: clarkb: though if the job needs to install packages which are dependencies declared for multiple projects, it seems reasonable to consider concatenating the bindep.txt from each of those projects and use that to determine what to install | 16:51 |
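
fungi's idea of concatenating the bindep.txt files from several projects could look roughly like the sketch below. It assumes the caller has already read each file's text; the function name is hypothetical and this is not an existing bindep or Zuul role feature. Note mordred's caution earlier in the discussion: merging siblings' files in the gate can hide dependencies that a project really needs to declare itself.

```python
def merge_bindep_texts(texts):
    """Concatenate several bindep.txt contents into one.

    Blank lines and '#' comment lines are dropped, and exact duplicate
    entries are kept only once, in first-seen order.
    """
    seen = set()
    merged = []
    for text in texts:
        for line in text.splitlines():
            entry = line.strip()
            if not entry or entry.startswith("#") or entry in seen:
                continue
            seen.add(entry)
            merged.append(entry)
    return "\n".join(merged) + ("\n" if merged else "")


# Hypothetical contents of two projects' bindep.txt files.
repo_a = "# build deps\nlibffi-dev [platform:dpkg]\ngcc\n"
repo_b = "gcc\nlibxml2-dev [platform:dpkg]\n"
print(merge_bindep_texts([repo_a, repo_b]), end="")
```

The merged text could then be written to a temporary file and handed to bindep in place of a single project's file.
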
frickler | corvus: there was a question by Dobroslaw earlier about why some jobs aren't running in the release queue. I found this in zuul.log, but have no idea where that BranchMatcher comes from, might be a bug? http://paste.openstack.org/show/747225/ | 16:51 |
*** pgaxatte has quit IRC | 16:51 | |
*** Vadmacs has joined #openstack-infra | 16:51 | |
*** yamamoto has joined #openstack-infra | 16:52 | |
clarkb | frickler: you can't have branch matchers on tag based pipelines since tags don't have a single branch (I think there is work to make this optionally work?). That said I agree I don't see where that branch matcher is set | 16:53 |
*** wolverineav has quit IRC | 16:54 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration https://review.openstack.org/639855 | 16:54 |
frickler | clarkb: yes, I was looking for a "branch: master" or similar in the zuul.yaml, but I didn't find it | 16:55 |
corvus | clarkb: https://review.openstack.org/578557 is the change you're thinking about, but i'm not sure that's the problem at issue here. | 16:55 |
*** dpawlik has quit IRC | 16:56 | |
corvus | clarkb, frickler: i stand corrected, i think that may be at issue here | 16:56 |
frickler | corvus: yeah, at least the commit message seems to match somehow | 16:57 |
*** yamamoto has quit IRC | 16:57 | |
frickler | or maybe even https://review.openstack.org/640272 | 16:58 |
corvus | frickler: here's our documentation that says "don't put tag jobs in-repo": https://docs.openstack.org/infra/manual/creators.html#central-config-exceptions | 16:58 |
dmsimard | AJaeger: eh, I need to send a patch to add the noop-jobs template to ara-plugins/server/clients in project-config because removing the zuul.yaml file from the repo leaves the projects with no jobs | 16:58 |
frickler | corvus: I see, would it make sense to add a note that this covers any release job? | 17:00 |
*** josephrsandoval has quit IRC | 17:00 | |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config master: Add noop-jobs to ara-{server,clients,plugins} https://review.openstack.org/640818 | 17:01 |
frickler | gotta leave, bbl | 17:02 |
fungi | frickler: release, pre-release and tag pipelines in this case | 17:04 |
*** kopecmartin is now known as kopecmartin|off | 17:04 | |
corvus | infra-root: i'd like to restart the nodepool launchers and all of zuul to pick up the affinity changes. Shrews, should i restart the builders as well? | 17:06 |
clarkb | corvus: does that include the base64 commit message escaping? | 17:07 |
clarkb | (thinking it would be nice to get that in if we can) | 17:07 |
corvus | lemme check that has landed | 17:07 |
mordred | corvus: we should keep our eyes peeled on the launchers/builders due to new openstacksdk release | 17:08 |
mordred | corvus: it should be fine - but, you know, a release happened, so being aware isn't terrible | 17:08 |
corvus | clarkb: yes, that is branch tip in fact | 17:08 |
Shrews | corvus: builders are optional for your purposes, but we should do so at some point | 17:08 |
mordred | (the image code in particular got moved around - although the code is all the same) | 17:08 |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara https://review.openstack.org/640785 | 17:08 |
Shrews | corvus: b/c of the things mordred said | 17:08 |
*** psachin has quit IRC | 17:09 | |
corvus | do we have a restart nodepool playbook? | 17:09 |
Shrews | infra-root: fyi, launchers will restart with the port cleanup code (again) | 17:09 |
*** wolverineav has joined #openstack-infra | 17:09 | |
Shrews | clarkb: you have a good way to monitor that? ^^^ | 17:09 |
clarkb | Shrews: it's tough because most of the time the clouds also do similar | 17:10 |
clarkb | Shrews: we only notice on nodepool side when cloud side stops keeping up and i think we are in a period of clouds keeping up right now | 17:10 |
clarkb | Shrews: but in general port list and grep for DOWN or Null attached ports | 17:10 |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara https://review.openstack.org/640785 | 17:10 |
clarkb | you should see that they go away within the port cleanup run period | 17:11 |
clarkb | mordred: does the sdk release include the changes to taskmanagers? | 17:11 |
clarkb | mordred: or are we not affected by that until we update nodepool? | 17:11 |
mordred | clarkb: no - that isn't in yet | 17:12 |
Shrews | corvus: no playbook that i know of | 17:13 |
corvus | Shrews: i'm writing one now! | 17:14 |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Add nodepool_restart playbook https://review.openstack.org/640822 | 17:15 |
*** agopi has joined #openstack-infra | 17:15 | |
dmsimard | config-core: an easy review to add noop-jobs which is a dependency in cleaning up stuff: https://review.openstack.org/#/c/640818/ | 17:16 |
openstackgerrit | Adam Coldrick proposed openstack-infra/storyboard master: Add a 'security' flag to Teams https://review.openstack.org/640823 | 17:16 |
*** wolverineav has quit IRC | 17:16 | |
clarkb | corvus: fwiw I think you can stop/start them all together as the ordering doesn't matter. That might make the waiting for timeouts slightly shorter | 17:16 |
corvus | clarkb: good point | 17:17 |
*** markvoelker has joined #openstack-infra | 17:18 | |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Add nodepool_restart playbook https://review.openstack.org/640822 | 17:20 |
corvus | clarkb: something like that? ^ | 17:20 |
clarkb | ya that should do it | 17:20 |
openstackgerrit | Merged openstack-infra/system-config master: Only update gitea project settings during creation https://review.openstack.org/640789 | 17:21 |
*** sdake has joined #openstack-infra | 17:22 | |
openstackgerrit | Merged openstack-infra/system-config master: Add utility playbook for fixing gitea project settings https://review.openstack.org/640792 | 17:22 |
corvus | okay, i'll get started on the restarts now. starting with nodepool first | 17:22 |
clarkb | I'm reading email and can be on standby for helping with things | 17:23 |
corvus | wow. the upload recency table in the builder debug log is.... a lot of numbers :) | 17:23 |
corvus | #status log restarted nodepool launchers and builder at commit 3561e278c6178436aa1d8d673f839a676598ea17 | 17:25 |
openstackstatus | corvus: finished logging | 17:25 |
corvus | builders even | 17:25 |
dmsimard | thanks for the quick reviews <3 | 17:26 |
*** jpich has quit IRC | 17:27 | |
openstackgerrit | Merged openstack-infra/project-config master: Add noop-jobs to ara-{server,clients,plugins} https://review.openstack.org/640818 | 17:28 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration https://review.openstack.org/639855 | 17:29 |
*** dtantsur is now known as dtantsur|afk | 17:29 | |
*** dpawlik has joined #openstack-infra | 17:30 | |
corvus | okay, that took a little longer than i expected, but i have found a node created and in-use since the restart | 17:31 |
mordred | corvus: \o/ | 17:32 |
*** rkukura has quit IRC | 17:32 | |
corvus | the only errors i'm seeing are over-quota errors in ord, and that could be restart related | 17:33 |
*** dpawlik has quit IRC | 17:34 | |
corvus | there are 3 dib images being built now, so we should see if they upload later | 17:35 |
*** roman_g has quit IRC | 17:36 | |
corvus | one of them just failed with this message: http://paste.openstack.org/show/747250/ | 17:37 |
corvus | pabelanger: ^ ? | 17:37 |
clarkb | http://git.openstack.org/cgit/openstack/windmill-config seems to confirm it is empty | 17:38 |
*** roman_g has joined #openstack-infra | 17:38 | |
fungi | logan-: not sure if you're around today, but in case you didn't see earlier, we're experiencing a lot of packet loss to mirror02.regionone.limestone.openstack.org over both ipv4 and ipv6 since 04:00 utc (we took that region out of service in our nodepool config starting around 15:15 utc) | 17:38 |
corvus | did we... approve a project creation while the git playbook is broken? | 17:38 |
pabelanger | that was approved by ianw this morning | 17:39 |
corvus | yes, it merged at Mon Mar 4 00:52:02 2019 +0000 | 17:39 |
corvus | okay. well, it's not going to exist and image builds aren't going to work until the git playbook runs to completion | 17:39 |
corvus | i would suggest that we hold off on creating more projects, but at this point, it doesn't actually matter, so i don't think we should issue any guidance for reviewers. | 17:40 |
corvus | we'll just have to check back on image uploads later | 17:40 |
corvus | meanwhile, i think we're ready to restart zuul now | 17:40 |
*** ijw has quit IRC | 17:41 | |
corvus | clarkb, pabelanger: are we using jemalloc on all executors? | 17:42 |
clarkb | corvus: we will after this restart | 17:42 |
corvus | let me rephrase -- are we configured to do so? and have we restarted them all with that configuration yet, or is this the first restart? | 17:42 |
fungi | the change to install it on all of them merged | 17:42 |
logan- | fungi: o/ was just catching up on that backscroll. I've had an ipv4 mtr open trying to see what's up, but no losses since I started that about 10 minutes ago. I'll start an ipv6 one here in a minute. no network issues reported in the racks where these nodes are, normal traffic levels, etc. i'll do some digging around on the nodes and see if I can find anything useful there too. | 17:43 |
corvus | clarkb: ok, thanks, i think that's what i was asking :) | 17:43 |
clarkb | corvus: we should be configured to do so but we have not restarted with that config anywhere but ze08 | 17:43 |
pabelanger | sorry, I haven't been following along on the jemalloc discussions | 17:43 |
fungi | corvus: this will be the first time except on (i believe) ze08 which was our proving ground | 17:43 |
dmsimard | Is zuul.o.o broken ? web is not displaying | 17:44 |
fungi | thanks logan-! i can reboot the node and see if it helps, just didn't know if you wanted to look into it first | 17:44 |
dmsimard | it kind of flashes briefly | 17:44 |
corvus | dmsimard: zuul is restarting | 17:44 |
openstackgerrit | Hervé Beraud proposed openstack-dev/pbr master: Fix error when keywords are defined as a list in cfg https://review.openstack.org/639661 | 17:44 |
dmsimard | ack. | 17:44 |
fungi | i've checked ze03 (chosen at random) and it has the libjemalloc1 package installed and the expected LD_PRELOAD exported in /etc/default/zuul-executor | 17:45 |
*** wolverineav has joined #openstack-infra | 17:46 | |
corvus | the scheduler has loaded its config, but the executors haven't finished stopping yet | 17:49 |
corvus | re-enqueueing | 17:49 |
logan- | fungi: thanks, i'll dig around for a bit and see what I can find before we try a reboot. | 17:49 |
*** gfidente has quit IRC | 17:50 | |
fungi | logan-: appreciated! | 17:51 |
*** markvoelker has quit IRC | 17:51 | |
corvus | #status log restarted zuul at commit d298cb12e09d7533fbf161448cf2fc297d9fd138 | 17:55 |
openstackstatus | corvus: finished logging | 17:55 |
openstackgerrit | Merged openstack-infra/zuul-website master: Add a promotional message banner and events list https://review.openstack.org/639871 | 17:58 |
*** e0ne has joined #openstack-infra | 17:59 | |
*** derekh has quit IRC | 18:05 | |
*** mriedem has quit IRC | 18:05 | |
*** wolverineav has quit IRC | 18:06 | |
*** ociuhandu has joined #openstack-infra | 18:11 | |
*** wolverineav has joined #openstack-infra | 18:14 | |
*** rkukura has joined #openstack-infra | 18:15 | |
*** e0ne has quit IRC | 18:17 | |
*** diablo_rojo has joined #openstack-infra | 18:18 | |
*** dpawlik has joined #openstack-infra | 18:19 | |
*** electrofelix has quit IRC | 18:22 | |
*** dpawlik has quit IRC | 18:24 | |
*** whoami-rajat has quit IRC | 18:24 | |
*** gfidente has joined #openstack-infra | 18:25 | |
*** harlowja has quit IRC | 18:27 | |
*** jpena is now known as jpena|off | 18:27 | |
*** jamesmcarthur_ has joined #openstack-infra | 18:32 | |
*** jamesmcarthur has quit IRC | 18:35 | |
clarkb | corvus: we expect windmill-config to be populated after ansible runs with the fix from mordred to only update settings on config updates? | 18:35 |
mordred | clarkb: yeah | 18:35 |
corvus | clarkb: strictly speaking, we expect it to be populated after the description length limit change, but they merged at ~ the same time, so... "yeah" :) | 18:35 |
clarkb | ok, I think ansible is running with that now since HEAD is the utility to fix things on bridge | 18:36 |
clarkb | and according to the log it is going through and parsing project names? | 18:37 |
corvus | clarkb: yeah, i think we're looking at ansible task overhead | 18:38 |
openstackgerrit | Merged openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara https://review.openstack.org/640785 | 18:38 |
fungi | so theory is most of the ~2.7 seconds per project is ansible spinning up tasks? | 18:40 |
*** jamesmcarthur_ has quit IRC | 18:41 | |
openstackgerrit | Merged openstack-infra/storyboard master: Add documentation for private stories https://review.openstack.org/636235 | 18:41 |
corvus | fungi: it'll be less now that we're not unconditionally doing a POST request, but yes. the only tasks it's doing right now are string comparison. | 18:41 |
*** panda|ruck is now known as panda|ruck|off | 18:41 | |
corvus | at the moment, i would not describe this communication as being faster than light. | 18:42 |
fungi | i know little about ansible internals. does it at least reuse a single python process for those? | 18:42 |
fungi | (so it's not spending most of its time creating and tearing down python interpreter processes) | 18:43 |
corvus | nope. | 18:43 |
corvus | it uses multiprocessing with forks behind the scenes, and at the moment, the subprocs it's using are turning over rather quickly. | 18:44 |
fungi | so i guess if this were a python script which opened a persistent https socket to the api and hammered it while iterating over the list of projects, it might complete far more quickly | 18:45 |
corvus | i think this is the same thing we noticed with the plain base playbook too -- where we have so many tasks across all our hosts that ansible bogs down and runs them unusually slowly. we didn't get far debugging that because at the time, the inventory plugin issue made debugging it impossible. | 18:46 |
corvus | fungi: yes, though, what this playbook is currently doing is spending 5 seconds "hammering the api", then 1.5 hours doing string comparisons. then terminating. | 18:47 |
fungi | aha. wow | 18:47 |
*** markvoelker has joined #openstack-infra | 18:48 | |
corvus | i wanted to debug that some more, this may be a really good playbook for that.... | 18:51 |
corvus | so maybe i should spend some time this afternoon poking at that (because if we find something that could be improved, it could make base.yaml run faster). but if we don't make any headway, we may just need to turn this into a python script :/ | 18:52 |
*** jamesmcarthur has joined #openstack-infra | 18:53 | |
fungi | that does sound like a good opportunity | 18:53 |
*** pcaruana has quit IRC | 18:56 | |
*** whoami-rajat has joined #openstack-infra | 18:58 | |
*** jlvillal has quit IRC | 18:58 | |
*** jamesmcarthur has quit IRC | 18:59 | |
*** jlvillal has joined #openstack-infra | 18:59 | |
*** jamesmcarthur has joined #openstack-infra | 18:59 | |
*** IvensZambrano has quit IRC | 19:01 | |
*** ricolin has quit IRC | 19:01 | |
*** electrofelix has joined #openstack-infra | 19:02 | |
clarkb | infra-root I've done an audit of our current afs volumes and written down a plan for upgrading the afs file servers (but not yet the DBs) at https://etherpad.openstack.org/p/201808-infra-server-upgrades-and-cleanup if you get time to read that over and check it for reasonableness that would be much appreciated. | 19:02 |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config master: Remove openstack-cover-jobs/docs-on-readthedocs templates for ara https://review.openstack.org/640837 | 19:02 |
*** ijw has joined #openstack-infra | 19:03 | |
*** jamesmcarthur has quit IRC | 19:04 | |
fungi | clarkb: did you notice whether the afs02.dfw.openstack.org/main02 cinder volume outage on saturday had any adverse effects? | 19:07 |
*** sdake has quit IRC | 19:07 | |
clarkb | fungi: I have not, but also not looked super closely. Just a vos listvldb so far | 19:08 |
clarkb | then rendered that into the tabular data | 19:08 |
fungi | ahh, okay | 19:08 |
*** ijw has quit IRC | 19:10 | |
*** electrofelix has quit IRC | 19:10 | |
*** mriedem has joined #openstack-infra | 19:11 | |
clarkb | I'd be happy to start on steps 1 and 2 ( and 5.1 if we need that ) if this appears like a viable upgrade path | 19:11 |
*** ijw has joined #openstack-infra | 19:12 | |
*** e0ne has joined #openstack-infra | 19:13 | |
*** wolverineav has quit IRC | 19:15 | |
*** ijw has quit IRC | 19:16 | |
*** wolverineav has joined #openstack-infra | 19:16 | |
*** agopi has quit IRC | 19:18 | |
*** agopi_ has joined #openstack-infra | 19:18 | |
*** ijw has joined #openstack-infra | 19:20 | |
*** markvoelker has quit IRC | 19:21 | |
*** jamesmcarthur has joined #openstack-infra | 19:22 | |
*** openstackgerrit has quit IRC | 19:23 | |
corvus | infra-root: i'm going to comment out the run_all cron job on bridge for manual debugging | 19:26 |
corvus | (unfortunately, since bridge only has 2G of ram, it swaps even when it runs a single copy of the playbook; there's no way i can debug a second copy with a load average of 10) | 19:27 |
clarkb | rgr | 19:27 |
*** ijw has quit IRC | 19:27 | |
fungi | yikes | 19:27 |
*** ijw has joined #openstack-infra | 19:27 | |
fungi | thanks for the heads up, and for looking into it | 19:27 |
corvus | and if anyone has any idea how to make bridge bigger, i'd love to hear it :) | 19:28 |
dmsimard | corvus: does it need to be more complicated than "openstack server resize <uuid> --flavor <new-flavor>" ? | 19:29 |
corvus | dmsimard: no. does that work on rax? | 19:31 |
dmsimard | corvus: it should -- I can test on a throwaway VM to make sure | 19:31 |
dmsimard | I want to say I remember doing it at least once in rax but if I really did, it was a good while ago so let me double check :D | 19:32 |
*** eharney has quit IRC | 19:33 | |
*** openstackgerrit has joined #openstack-infra | 19:34 | |
openstackgerrit | Merged openstack-infra/zuul master: quickstart: web and others wait on mysql to start https://review.openstack.org/640548 | 19:34 |
fungi | what are the main challenges with replacing bridge.o.o with a new instance? | 19:35 |
fungi | is its ip address trusted and baked into a lot of stuff? | 19:35 |
fungi | or is it just that there's a lot of things on the filesystem which aren't configuration-managed? | 19:36 |
mordred | fungi: I'd say both of those would be concerns I'd have | 19:37 |
dmsimard | rabbit holing to find a working openstackclient on bridge, is there one in a virtualenv somewhere ? | 19:38 |
dmsimard | python3 -m venv is telling me the python3-venv package isn't installed but pip3 freeze shows that virtualenv is installed... but it's not in /usr/bin or /usr/local/bin | 19:39 |
fungi | dmsimard: ~root/launch-env/bin/openstack | 19:39 |
fungi | though clarkb was talking about installing it globally | 19:39 |
dmsimard | fungi: I tried launch-env first, it's complaining about the lack of python datetime lib | 19:39 |
openstackgerrit | Riju Khatri proposed openstack-infra/storyboard-webclient master: adds activity indicator to new worklist modal and detail Task: 29730 https://review.openstack.org/640845 | 19:40 |
fungi | dmsimard: ahh, yep. the things i don't notice when i use ~fungi/launch-env/bin/openstack instead | 19:40 |
dmsimard | that one works \o/ | 19:41 |
dmsimard | corvus: is "corvustest" from rax DFW a good candidate for testing a resize ? :D | 19:41 |
fungi | dmsimard: what sort of resize are you wanting to test? afaik rackspace doesn't have server instance resizing allowed (at least for pvhvm flavors) | 19:42 |
dmsimard | fungi: corvus and I were wondering if resizing worked at all | 19:43 |
fungi | doesn't hurt to re-validate that previous determination | 19:43 |
fungi | i'll cross my fingers that something there has changed since the last few times we tried | 19:43 |
fungi | i don't recall being able to resize any servers since we moved off the rackspace legacy cloud years ago | 19:44 |
fungi | they used to have a kb article which said the only way to resize a server was to build a replacement | 19:46 |
*** agopi_ is now known as agopi | 19:50 | |
fungi | yeah, at least in their cloud dashboard the "resize" option is greyed out in the action drop-down for bridge.o.o and there's a link in the right-hand sidebar to https://support.rackspace.com/how-to/upgrading-resources-for-general-purpose-or-io-optimized-cloud-servers/ | 19:51 |
dmsimard | trying a resize on a VM I created, OS-EXT-STS:task_state | resize_migrating | 19:52 |
dmsimard | will let you know if it works or not | 19:53 |
fungi | what flavor did you choose? | 19:53 |
*** eernst has joined #openstack-infra | 19:54 | |
fungi | same as bridge? (2 GB Performance) | 19:54 |
*** eernst has quit IRC | 19:54 | |
fungi | and what image? | 19:54 |
dmsimard | fungi: I suspect live resize might (understandably) not work for bare metal servers | 19:54 |
fungi | bridge is a vm as far as i know | 19:54 |
dmsimard | I created a new VM with the same image and flavor as bridge and I am trying to resize to performance1-8 | 19:54 |
dmsimard | status has been in "RESIZE" for a few minutes now, not sure if it's doing anything | 19:55 |
fungi | cool. if it works via api and not dashboard that'll be quite the discovery! | 19:55 |
dmsimard | fungi: the reason why I mentioned bare metal is the wording used in the KB article you linked | 19:56 |
dmsimard | the general and i/o optimized nodes are flavors of bare metal nodes iirc | 19:56 |
fungi | i don't see any mention of bare metal or their onmetal service offering there | 19:56 |
fungi | ahh, weird that they would link that from a vm detail view with the words "help me with... resizing my server" | 19:57 |
fungi | dmsimard: looking at their flavor list, the bare metal flavors are the ones prefixed with "onmetal-" | 19:58 |
dmsimard | it doesn't seem like it works at first glance :( | 19:58 |
Shrews | infra-root: fyi, looks like the new port cleanup code in nodepool launchers has removed only 3 down ports (1 in limestone-regionone, 2 in inap-mtl01). that appears to be working (fyi tobiash) | 19:59 |
mordred | Shrews: \o/ | 19:59 |
mordred | Shrews: so it didn't delete all of the ports in all of the regions | 19:59 |
tobiash | Shrews: \o/ | 19:59 |
* mnaser builds a bridge on vexxhost :> | 19:59 | |
dmsimard | oh wait, maybe I spoke too soon, the VM stopped pinging | 19:59 |
mordred | Shrews: that's so much better than last time | 19:59 |
Shrews | mordred: right? | 19:59 |
dmsimard | oh, it's in verify_resize now | 20:00 |
mordred | Shrews: I love it when stuff doesn't just delete stuff | 20:00 |
dmsimard | with 8GB of RAM \o/ | 20:00 |
mordred | dmsimard: that's exciting | 20:00 |
* tobiash goes upgrading nodepool | 20:00 | |
dmsimard | for some reason it took like a good ~6 minutes to start the resize | 20:00 |
fungi | dmsimard: that is to say, they appear to have vm flavors for general and io so it seems like they intended for it to apply to virtual machines | 20:01 |
fungi | dmsimard: excellent! thanks for proving their documentation incorrect | 20:01 |
mordred | dmsimard: so, assuming it finishes properly, that seems workable - although we should probably also make a backup of the important things just in case | 20:01 |
fungi | oh, definitely take a snapshot | 20:01 |
dmsimard | fungi, mordred, corvus: commands I used for the resize: http://paste.openstack.org/show/747257/ | 20:04 |
fungi | thanks! this is an excellent discovery | 20:05 |
fungi | makes me wonder why they disable it in their dashboard and make no mention of it even being possible | 20:05 |
openstackgerrit | Gaëtan Trellu proposed openstack/diskimage-builder master: [lvm] Add Ubuntu bionic as supported distro https://review.openstack.org/640850 | 20:05 |
dmsimard | I would have been sad if it hadn't worked, live resizing VMs is kind of a big feature in OpenStack :/ | 20:06 |
dmsimard | It's another problem for bare metal flavors entirely but I understand that | 20:06 |
fungi | well, rackspace isn't entirely openstack either, as we've seen so many times in the past | 20:06 |
dmsimard | true | 20:06 |
fungi | you're able to ssh into that server and free shows the increased ram? | 20:06 |
fungi | have you tried rebooting it? (or did the resize reboot it?) | 20:07 |
corvus | fungi, dmsimard, mordred: i'm in favor of taking a snapshot and doing a resize to 4 or 8gb. if someone wants to do that now, it's a good time -- run_all has stopped and is commented out, and i'm about to get lunch. | 20:07 |
dmsimard | fungi: for some reason "infra-root-keys" didn't make it, I had to use the adminPass provided at the instance creation. I rebooted once after ~3 minutes to see if that would've kicked something into gear but the VM eventually rebooted by itself for the resize a few minutes later | 20:08 |
fungi | i'm strongly in favor of that. if dmsimard wants to do it, all the better! | 20:08 |
*** ijw has quit IRC | 20:08 | |
dmsimard | fungi: output from free before and after http://paste.openstack.org/show/747258/ | 20:09 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Log exception on module failure with empty stdout https://review.openstack.org/640650 | 20:09 |
*** wolverineav has quit IRC | 20:09 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul https://review.openstack.org/631930 | 20:09 |
mordred | corvus: I have an idea about a possible playbook optimization - not sure if it's where your brain was already going... | 20:09 |
dmsimard | corvus, fungi: I can do it -- do you snapshot using "server image create" or something else ? | 20:11 |
clarkb | dmsimard: re infra-root-keys do we add that list of keys to the ci account or just the jenkins/zuul account? | 20:11 |
clarkb | dmsimard: I usually do server image create --name server-fqdn-$date #server | 20:12 |
mordred | corvus: http://paste.openstack.org/show/747259/ - if we put the when on the loop, it applies the when to every loop iteration - and maybe we can avoid including the setup-repo.yaml tasks completely when not needed and keep the string comparison inside of the single task/process | 20:12 |
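The optimization mordred describes can be sketched roughly as follows. This is a plain-Python analogy rather than Ansible, and the repo names and the `needs_setup` check are hypothetical; it only illustrates why filtering the list before the loop means fewer includes than evaluating the condition on every iteration.

```python
def run_with_condition_on_loop(items, predicate):
    """Analogy for 'when' on the loop: every item triggers an include,
    and the condition is checked inside each iteration."""
    includes = 0
    ran = []
    for item in items:
        includes += 1  # an include happens for every item, needed or not
        if predicate(item):
            ran.append(item)
    return includes, ran


def run_with_prefiltered_loop(items, predicate):
    """Analogy for filtering first: the comparison happens once, up front,
    and only the items that pass are ever included."""
    selected = [item for item in items if predicate(item)]
    includes = 0
    ran = []
    for item in selected:
        includes += 1
        ran.append(item)
    return includes, ran


# Hypothetical data: four repos, only one of which needs setup.
repos = ["repo-a", "repo-b", "repo-c", "repo-d"]
needs_setup = {"repo-b"}.__contains__

print(run_with_condition_on_loop(repos, needs_setup))
print(run_with_prefiltered_loop(repos, needs_setup))
```

Both variants do the same useful work; the difference is only in how many includes are paid for along the way.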
fungi | clarkb: dmsimard: same here, or sometimes i'll add more detail like ...-before-resize-... | 20:13 |
fungi | just so we have additional context when looking at the list of snapshots later | 20:13 |
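A minimal sketch of the snapshot naming convention clarkb and fungi describe. The helper `snapshot_name` is hypothetical (not part of any OpenStack tool), and the `YYYYMMDD` date format is an assumption; it only builds the string one would pass to `server image create --name`.

```python
from datetime import date
from typing import Optional


def snapshot_name(server_fqdn: str, when: date, detail: Optional[str] = None) -> str:
    """Build '<fqdn>-<YYYYMMDD>' with an optional '-<detail>' suffix."""
    name = f"{server_fqdn}-{when:%Y%m%d}"
    if detail:
        # Extra context such as 'before-resize' helps when reading
        # the list of snapshots later.
        name += f"-{detail}"
    return name


print(snapshot_name("bridge.openstack.org", date(2019, 3, 4), "before-resize"))
```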
dmsimard | clarkb: not sure about the nova "infra-root-keys" keypair -- I'm in openstackci-rax dfw and there's just nothing in ubuntu or root's authorized keys | 20:13 |
dmsimard | I didn't want to rabbit hole into something else so I left it at that | 20:13 |
fungi | dmsimard: i wonder if cloud-init cleared them | 20:14 |
mordred | yeah - I don't think the keypair stuff will be necessary, because the keys we actually care about are installed by ansible | 20:14 |
mordred | dmsimard: did the IP stay the same? | 20:14 |
dmsimard | mordred: the keypair stuff was because I created a new VM for the sole purpose of testing this -- it wouldn't be an issue for bridge | 20:14 |
mordred | dmsimard: yah. totes | 20:15 |
dmsimard | everything stayed the same | 20:15 |
mordred | sweet | 20:15 |
dmsimard | kind of the point of resize | 20:15 |
fungi | sounds like a winner to me then | 20:15 |
mordred | magical ponies | 20:15 |
dmsimard | I really like "openstack server rebuild" too, one of my favorite commands :D | 20:15 |
mordred | dmsimard: yes indeed - but just because it's the thing that makes sense doesn't mean it's reality ;) | 20:15 |
mordred | dmsimard: there was a team at HP Public Cloud who wanted us to investigate using server rebuild in nodepool instead of delete/create | 20:16 |
mordred | but we didn't get around to it - it's hard to ponder - and hp public cloud is gone | 20:16 |
fungi | also they had an especially reassigning ip addresses | 20:17 |
fungi | er, had an especially hard time | 20:17 |
dmsimard | I hit rebuild on a development VM at home and it's pinging under 30 seconds from the original image, same IP, keypairs, everything. It's neat. | 20:17 |
fungi | then again, so has rackspace | 20:17 |
clarkb | note rebuild != resize | 20:17 |
dmsimard | clarkb: yes :) | 20:18 |
dmsimard | we don't want a rebuild on bridge haha | 20:18 |
*** markvoelker has joined #openstack-infra | 20:18 | |
dmsimard | Just saying like resize, it's another cool feature :) | 20:18 |
*** jamesmcarthur has quit IRC | 20:19 | |
dmsimard | ok I'll run the snapshot and let you know when I start the resize | 20:19 |
*** jamesmcarthur has joined #openstack-infra | 20:19 | |
*** dpawlik has joined #openstack-infra | 20:20 | |
clarkb | mordred: prior to kubernetes and nginx sarah novotny had a job running game servers on aws that was basically knowing when to reuse vs delete instances. Turns out it is a fun problem | 20:21 |
* mordred hands dmsimard a cake | 20:21 | |
mordred | clarkb: ++ | 20:21 |
clarkb | (you have to predict demand and in their case determine if waiting on an existing server to become free is cheaper than spinning up a new server) | 20:21 |
*** gfidente is now known as gfidente|afk | 20:21 | |
*** e0ne has quit IRC | 20:22 | |
mordred | yeah. also - I *believe* there is variability in terms of when rebuild is cheaper that maps to cloud deployment somehow - so it might not actually be cheaper on all of the clouds | 20:23 |
mordred | so I imagine it would get even more funner for us | 20:23 |
*** dpawlik has quit IRC | 20:25 | |
dmsimard | meanwhile, need a +3 on an easy project-config review: https://review.openstack.org/#/c/640837/ | 20:26 |
dmsimard | snapshot is still queued, I suppose it might take a while since it's not a volume ? | 20:26 |
pabelanger | didn't we have to stop intances first for snapshoting to work properly in rax? | 20:27 |
pabelanger | instances* | 20:28 |
dmsimard | I'll give it a while before giving up, it took almost 10 minutes for a resize to kick in earlier | 20:28 |
pabelanger | or maybe that was something to do with volumes | 20:28 |
pabelanger | (nevermind me :)) | 20:28 |
fungi | it will snapshot while the server is running, it just tends to do it more quickly if it's not (because the writes have already reached quiescence) | 20:29 |
dmsimard | It just transitioned from "queued" to "saving" so it'll work. I have a ping going to/from bridge to see if there's any blips in availability | 20:31 |
mriedem | clarkb: http://status.openstack.org/elastic-recheck/data/integrated_gate.html "Generated at: 2019-02-08" | 20:33 |
mriedem | uh oh | 20:33 |
mriedem | did something die? | 20:34 |
*** Vadmacs has quit IRC | 20:34 | |
clarkb | mriedem: hrm we've seen that where a run doesn't timeout | 20:36 |
clarkb | I'll look to see if the lock is being held | 20:36 |
corvus | mordred: ah, yeah, http://paste.openstack.org/show/747259/ looks like it'd be an improvement. mostly i wanted to dig into why ansible's task workers weren't being reused which is causing much slowness in general. so i'd still like to do that, but i have no idea what will come of that. meanwhile, maybe that change will be enough to get the runtime to something tolerable. | 20:36 |
clarkb | mriedem: nothing has the lock so it is probably failing. I can run it in the foreground | 20:37 |
clarkb | mriedem: 2019-03-04 20:37:55,442 [eruncategorized] WARNING: No failures found in group "integrated_gate". The default ALL_FAILS_QUERY might be broken. | 20:38 |
mordred | corvus: kk. I'll push it up as a change | 20:38 |
*** auristor has quit IRC | 20:39 | |
mriedem | hmmm | 20:40 |
mriedem | that was recently fixed... | 20:40 |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config master: Filter setup-repos loop before include_tasks https://review.openstack.org/640861 | 20:41 |
clarkb | mriedem: elastic-recheck==0.0.1.dev2223 # git sha 5ce47d0 is what we've got installed | 20:41 |
mriedem | https://github.com/openstack-infra/elastic-recheck/commit/cdf6ee031e9514b6a8751f0684c147dd3d500404 | 20:43 |
mriedem | POST-RUN END RESULT_NORMAL: [trusted : git.openstack.org/opendev/base-jobs/playbooks/base/post.yaml@master] | 20:43 |
mriedem | compared to | 20:43 |
mriedem | '(filename:"job-output.txt" AND message:"POST-RUN END" AND message:"project-config/playbooks/base/post.yaml")' # flake8: noqa | 20:43 |
clarkb | I'm guessing we broke it again some other way | 20:44 |
openstackgerrit | Merged openstack-infra/project-config master: Remove openstack-cover-jobs/docs-on-readthedocs templates for ara https://review.openstack.org/640837 | 20:45 |
mriedem | i'll dork with it locally | 20:45 |
gmann | clarkb: fungi ianw can you check this: setup of stable/stein - https://review.openstack.org/#/c/640641/ | 20:45 |
*** ijw has joined #openstack-infra | 20:45 | |
corvus | mriedem, clarkb: i haven't chased down all the links yet to find this myself -- but it looks like that query is sensitive to the playbook name, and we recently moved base-jobs to opendev/base-jobs. has that been updated? | 20:46 |
clarkb | corvus: probably not, but that is a likely explanation | 20:46 |
*** auristor has joined #openstack-infra | 20:46 | |
clarkb | I wonder if we could just match playbooks/base/post.yaml | 20:46 |
clarkb | though we shouldn't be moving it much from now on | 20:47 |
mriedem | that's what i'm trying | 20:47 |
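Why the old query stopped matching can be sketched as below. This is a simplification: it uses plain substring checks, whereas the real query is evaluated by Elasticsearch against indexed messages, so treat it as an illustration of the path mismatch only. The log line and the old fragment are the ones quoted above; the shorter fragment is the one clarkb suggests.

```python
# Log line as quoted by mriedem after base-jobs moved to opendev/base-jobs.
LOG_LINE = (
    "POST-RUN END RESULT_NORMAL: [trusted : "
    "git.openstack.org/opendev/base-jobs/playbooks/base/post.yaml@master]"
)

OLD_FRAGMENT = "project-config/playbooks/base/post.yaml"  # from the existing query
NEW_FRAGMENT = "playbooks/base/post.yaml"                 # repo-independent suggestion


def matches(line: str, fragment: str) -> bool:
    """Simplified stand-in for the query: both pieces must appear in the line."""
    return "POST-RUN END" in line and fragment in line


print(matches(LOG_LINE, OLD_FRAGMENT))
print(matches(LOG_LINE, NEW_FRAGMENT))
```

Matching on the repo-independent part of the path keeps working if the playbook's repository is renamed again.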
*** jtomasek has quit IRC | 20:48 | |
clarkb | fungi: AJaeger the docs.old afs volume was there to help facilitate the move from not afs to afs? | 20:50 |
clarkb | fungi: AJaeger I'm guessing we don't actually need redundant copies of that volume? | 20:50 |
*** markvoelker has quit IRC | 20:51 | |
*** sdake has joined #openstack-infra | 20:52 | |
dmsimard | infra-root: snapshot for bridge.o.o is finished, I'm briefly testing the image before proceeding with the resize | 20:52 |
clarkb | dmsimard: I guess the cron is disabled but normally we'd want to be careful doing that to avoid multiple independent runs of ansible happening | 20:53 |
clarkb | though I guess we restrict by IP so probably a non concern even then | 20:53 |
dmsimard | clarkb: yes, I'm doing this with the knowledge that run_all is commented out | 20:53 |
openstackgerrit | Matt Riedemann proposed openstack-infra/elastic-recheck master: Fix ALL_FAILS_QUERY https://review.openstack.org/640864 | 20:54 |
mriedem | clarkb: corvus: yeah ^ fixes it | 20:54 |
mriedem | ~40% categorization in the gate | 20:54 |
fungi | infra-root: in today's exciting episode of "openstack or opendev..." wiki-dev: is it openstack or opendev? i'm gearing up to launch a replacement for it | 20:54 |
mordred | fungi: oy | 20:54 |
mordred | fungi: that is the hardest question you've asked | 20:54 |
mordred | fungi: could I answer by panicking, running around in a circle and then knocking myself unconscious by running into a light post? | 20:55 |
*** dpawlik has joined #openstack-infra | 20:55 | |
clarkb | mriedem: looking | 20:55 |
fungi | mordred: i think when in doubt, maybe we just default to assuming it's not opendev | 20:56 |
mordred | fungi: I think I'd tend towards thinking openstack ... I don't think we're happy about being in the wiki business in the first place, so I don't think we likely want to expand our user audience | 20:56 |
fungi | preferable to the lightpost thing anyway | 20:56 |
mordred | fungi: yeah - what you said but with more words | 20:56 |
clarkb | I know the wiki is a thing people want, but ya also not sure how much effort we'd be able to invest in it given well experience | 20:56 |
clarkb | easy enough to change it later I suppose if we need to | 20:57 |
mordred | "would I want to excitedly point people towards it as an awesome feature of opendev and why they should use us - or would I prefer to quietly nudge it in to a dark corner and hope nobody notices it's there" | 20:57 |
*** IvensZambrano has joined #openstack-infra | 20:57 | |
fungi | clarkb: ianw: gmann: i suppose the reason features.yaml is still in devstack-gate is because we need it to be somewhere branchless? | 20:57 |
fungi | should that maybe move into openstack-zuul-jobs or something? | 20:57 |
openstackgerrit | Merged openstack-infra/storyboard-webclient master: Sort search results by updated_at by default https://review.openstack.org/638690 | 20:57 |
pabelanger | +1 no wiki.opendev.org | 20:57 |
*** anteaya has joined #openstack-infra | 20:58 | |
clarkb | fungi: the branchless vs not is what makes that complicated aiui. That said we should be able to make it branched if we really want to it just hasn't happened yet | 20:58 |
*** jamesmcarthur has quit IRC | 20:58 | |
clarkb | mriedem: I approved the change | 20:58 |
*** jamesmcarthur has joined #openstack-infra | 20:59 | |
*** jamesmcarthur has quit IRC | 20:59 | |
fungi | clarkb: yeah, i believe docs.old was from before the root-marker work and subsequent mass deletion of unmanaged content (and then we recalled a fair amount of more ancient unmanaged content back from it and readded it to the live volume) | 20:59 |
*** jamesmcarthur has joined #openstack-infra | 20:59 | |
clarkb | fungi: cool I'll mark that as not needing redundant copies | 21:00 |
*** dpawlik has quit IRC | 21:00 | |
ianw | fungi: isn't this just required for legacy jobs at this point? that was my understanding. | 21:00 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul https://review.openstack.org/631930 | 21:00 |
fungi | ianw: oh, do we not even use features.yaml in the new devstack jobs? how do we manage the feature matrix? last i recall that was the one bit we hadn't managed to rework yet | 21:00 |
fungi | but there's every chance i'm just behind the times | 21:01 |
corvus | fungi, clarkb: yes, features.yaml should be able to be factored out, but it will require some thought. | 21:01 |
corvus | ianw: ^ | 21:01 |
corvus | it *is* used in the new devstack jobs, though it never should have been. | 21:02 |
*** wolverineav has joined #openstack-infra | 21:02 | |
gmann | yeah currently it is used in new job. | 21:03 |
clarkb | infra-root I've annotated my afs volume ethercalc with a bit more specifics on which volumes will need to get redundant copies given my upgrade plan. | 21:03 |
*** whoami-rajat has quit IRC | 21:04 | |
clarkb | and with that i have gumbo to eat for lunch | 21:04 |
ianw | clarkb: maybe put in some reboots in the steps on the etherpad? | 21:05 |
openstackgerrit | Merged openstack-infra/zuul master: Optionally disable disk_limit_per_job https://review.openstack.org/638596 | 21:05 |
corvus | gmann, clarkb, fungi, ianw: the devstack pre playbook uses the test-matrix role defined in devstack-gate | 21:05 |
*** gfidente|afk has quit IRC | 21:05 | |
clarkb | ianw can do | 21:06 |
*** jamesmcarthur has quit IRC | 21:06 | |
openstackgerrit | Merged opendev/base-jobs master: Remove promote playbook from opendev-promote-docker-image https://review.openstack.org/640563 | 21:07 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul https://review.openstack.org/631930 | 21:08 |
openstackgerrit | Matt Riedemann proposed openstack-infra/elastic-recheck master: Add query for PlacementFixture bug 1818560 https://review.openstack.org/640869 | 21:09 |
openstack | bug 1818560 in OpenStack Compute (nova) "Nova test_report_client uses nova conf when starting placement intercept, causing missing config opts" [Critical,In progress] https://launchpad.net/bugs/1818560 - Assigned to Chris Dent (cdent) | 21:09 |
ianw | clarkb: otherwise maybe we can paste in the commands; trying to remember the last time i did any of the volume swizzling and i think it was http://lists.openstack.org/pipermail/openstack-infra/2018-May/005949.html | 21:09 |
dmsimard | infra-root: the "brief" test of the snapshot is turning out to take a while, sorry about that. VM is still trying to spawn with the image. I guess it takes a while to download the image back. | 21:10 |
ianw | yeah, it can take quite a while for rax to actually start a vm even when you upload an image | 21:11 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix missing wait-to-start playbook in quick start https://review.openstack.org/640871 | 21:11 |
fungi | i expect it's the time taken to copy that image to the hypervisor host's cache | 21:12 |
dmsimard | glance says the image's size is "10170112000" which sounds like bytes, so around 10GB ? not the end of the world but it's not small | 21:13 |
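A quick check of the size dmsimard quotes, assuming the glance figure is indeed in bytes. It shows the value in both decimal gigabytes and binary gibibytes.

```python
size_bytes = 10170112000  # value reported by glance, assumed to be bytes

size_gb = size_bytes / 10**9   # decimal gigabytes
size_gib = size_bytes / 2**30  # binary gibibytes

print(f"{size_gb:.2f} GB, {size_gib:.2f} GiB")
```

Either way the "around 10GB" estimate holds.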
ianw | clarkb: but other than that, plan LGTM, thanks. can help if you like | 21:13 |
openstackgerrit | Merged openstack-infra/elastic-recheck master: Fix ALL_FAILS_QUERY https://review.openstack.org/640864 | 21:25 |
*** eharney has joined #openstack-infra | 21:28 | |
*** jcoufal has quit IRC | 21:28 | |
openstackgerrit | Merged openstack-infra/elastic-recheck master: Add query for PlacementFixture bug 1818560 https://review.openstack.org/640869 | 21:28 |
openstack | bug 1818560 in OpenStack Compute (nova) "Nova test_report_client uses nova conf when starting placement intercept, causing missing config opts" [Critical,In progress] https://launchpad.net/bugs/1818560 - Assigned to Chris Dent (cdent) | 21:28 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix test race in test_client_dequeue_change_by_ref https://review.openstack.org/640878 | 21:30 |
*** yamamoto has joined #openstack-infra | 21:33 | |
corvus | mordred: if i'm reading the ansible source correctly, it starts a new multiprocessing.Process for every task that it runs.... | 21:33 |
corvus | mordred: it looks like that was not previously the way it worked, but something changed in the past few years... | 21:34 |
*** ijw has quit IRC | 21:36 | |
*** jcoufal has joined #openstack-infra | 21:37 | |
*** jamesmcarthur has joined #openstack-infra | 21:37 | |
*** yamamoto has quit IRC | 21:38 | |
corvus | mordred, fungi: i *think* this is when ansible switched to a pool of long-running worker processes draining tasks from a queue to running each task in its own worker process: https://github.com/ansible/ansible/commit/120b9a7ac6274c54d091291587b0c9ec865905a1 | 21:40 |
corvus | git tells me that would be ansible v2.1 | 21:40 |
*** sdake has quit IRC | 21:41 | |
dmsimard | VM with the snapshot is still in BUILDING... | 21:41 |
dmsimard | Makes me question the ability to restore from a snapshot | 21:41 |
openstackgerrit | Merged openstack-infra/storyboard master: Update docs on how to run the tests locally https://review.openstack.org/640233 | 21:41 |
dmsimard | be back in a few | 21:43 |
*** sdake has joined #openstack-infra | 21:43 | |
corvus | clarkb, fungi: can you +3 https://review.openstack.org/640861 | 21:47 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul https://review.openstack.org/631930 | 21:47 |
clarkb | corvus: done | 21:48 |
corvus | i ran a quick test of that in my test setup, and i think it will make the runtime of this playbook tolerable. | 21:48 |
fungi | lgtm | 21:48 |
corvus | my guess is just a few minutes total. | 21:49 |
*** markvoelker has joined #openstack-infra | 21:49 | |
*** ijw has joined #openstack-infra | 21:49 | |
*** mriedem has quit IRC | 21:50 | |
mordred | corvus: oh joy | 21:51 |
*** geguileo has quit IRC | 21:53 | |
*** smcginnis has quit IRC | 21:53 | |
*** zzzeek has quit IRC | 21:54 | |
*** kota_ has quit IRC | 21:54 | |
*** zzzeek has joined #openstack-infra | 21:54 | |
*** jamesmcarthur has quit IRC | 21:57 | |
*** kota_ has joined #openstack-infra | 21:57 | |
*** jamesmcarthur has joined #openstack-infra | 21:58 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine https://review.openstack.org/640884 | 21:59 |
dmsimard | infra-root: I have two ongoing attempts to spawn a VM from the bridge.o.o snapshot. The first, a rebuild of my resize test VM, has been going for >1hr and the other, a fresh VM, has been going for >30 minutes. The resize looked to work well enough in the test but I'm not sure about the snapshot part. | 22:01 |
*** jcoufal has quit IRC | 22:02 | |
*** ijw has quit IRC | 22:02 | |
dmsimard | Have we successfully used snapshots before ? | 22:02 |
clarkb | dmsimard: yes we have with the previous lists upgrade server testing | 22:02 |
clarkb | dmsimard: it's probably made a 40GB or bigger image that has to be copied? | 22:02 |
dmsimard | of course, just as I mention that, the rebuild *just* became active | 22:02 |
clarkb | ianw: I've updated the etherpad with the system level steps. Going to sort out afs management tasks to move RW volume around now | 22:03 |
*** wolverineav has quit IRC | 22:03 | |
*** eernst has joined #openstack-infra | 22:06 | |
*** wolverineav has joined #openstack-infra | 22:06 | |
*** eernst has quit IRC | 22:06 | |
*** eernst has joined #openstack-infra | 22:07 | |
*** betherly has joined #openstack-infra | 22:07 | |
*** dave-mccowan has quit IRC | 22:09 | |
*** maumont has joined #openstack-infra | 22:10 | |
corvus | clarkb: why step 5? | 22:11 |
corvus | let me rephrase | 22:11 |
*** wolverineav has quit IRC | 22:11 | |
corvus | clarkb: why change where the RW volumes are at all? | 22:11 |
clarkb | corvus: it means we don't have to track down every single writer | 22:11 |
clarkb | so mostly belts and suspenders | 22:11 |
corvus | clarkb: there aren't that many writers, and you've shut most of them off by disabling those cronjobs | 22:12 |
corvus | clarkb: the rest can just fail | 22:12 |
clarkb | corvus: ya thats true | 22:12 |
*** betherly has quit IRC | 22:12 | |
*** jamesmcarthur has quit IRC | 22:12 | |
corvus | clarkb: personally, i feel that an upgrade plan which doesn't require modifying the vldb or volumes themselves is easier and safer | 22:13 |
*** jamesmcarthur has joined #openstack-infra | 22:13 | |
clarkb | corvus: ok we'll need to update the files sites that only use RW volumes today (but planned to do that anyway) | 22:13 |
openstackgerrit | Merged openstack-infra/system-config master: Filter setup-repos loop before include_tasks https://review.openstack.org/640861 | 22:14 |
corvus | clarkb: ++ | 22:14 |
corvus | clarkb: we may need to add cronjobs for those | 22:14 |
corvus | or rather, add them to the docs auto-release cronjob | 22:15 |
clarkb | corvus: ya they'll need to be added to the afsrelease cronjob | 22:15 |
corvus | clarkb: i think you can skip opendev | 22:15 |
corvus | it's now in the gitea image | 22:15 |
clarkb | oh right | 22:15 |
clarkb | corvus: do you know what "service" volume is? | 22:16 |
corvus | clarkb: nothing mounted under it | 22:16 |
clarkb | can probably be skipped too then | 22:16 |
dmsimard | The snapshot isn't booting and is falling back to "emergency mode", ctrl+d eventually ends up rebooting the VM: https://i.imgur.com/Wite4dW.png | 22:16 |
dmsimard | Do we happen to have a root password for bridge.o.o already set ? | 22:16 |
clarkb | dmsimard: no we don't set root passwords | 22:17 |
dmsimard | not sure how to interrupt boot to set a password, I don't see a grub prompt | 22:18 |
*** eglute has joined #openstack-infra | 22:18 | |
*** wolverineav has joined #openstack-infra | 22:18 | |
clarkb | corvus: ianw: great looks like we have a list of volumes to update and get published properly. I'll probably poke at that starting tomorrow morning as I'll be able to fully page in all the afs and kerberos stuff :) Then maybe wednesday/thursday we try the upgrade on ord? | 22:18 |
dmsimard | This has gotten very rabbit holey. Do we want to move forward with the resize ? | 22:18 |
clarkb | ianw: and thank you for the offer of help | 22:18 |
clarkb | dmsimard: the usual method with openstack is you boot a recovery image that attaches your server image | 22:19 |
ianw | yeah, graphical boot doesn't help, i thought there was something we put in base puppet ages ago to turn that off | 22:20 |
*** markvoelker has quit IRC | 22:21 | |
clarkb | dmsimard: assuming that mordred's ansible fix above helps things we may not need the resize right this moment, but this is still likely to be a useful thing to do overall | 22:21 |
clarkb | so probably don't want to give up on it yet if we can avoid it | 22:22 |
dmsimard | I looked on bridge and GRUB_TIMEOUT is set to 0 and GRUB_TIMEOUT_STYLE is set to hidden which would explain why I don't see it | 22:22 |
corvus | well, bridge (and puppetmaster before it) has been out of memory, swapping, and ooming for *years* | 22:24 |
corvus | it's very frustrating, and it's why it often takes half a day or longer for changes to take effect | 22:25 |
*** eernst has quit IRC | 22:25 | |
corvus | so upgrading is definitely necessary despite mordred's improvement | 22:25 |
corvus | but if we need to do that by redeploying bridge from scratch again some other time, then i guess that's what we'll do | 22:26 |
dmsimard | I'd love to be confident that our snapshot works before doing the resize, all of this should've been simple and quick enough to do but here we are :D | 22:27 |
corvus | dmsimard: yeah, it's a good plan. i don't think we should make more work for ourselves by short-cutting | 22:28 |
corvus | dmsimard: if the snapshot doesn't boot, let's not proceed | 22:28 |
ianw | dmsimard: i didn't see it in the saved images list, did you remove it? | 22:29 |
clarkb | one thing to check with snapshots is that the host flavor is big enough for the disk | 22:29 |
dmsimard | ianw: bridge.openstack.org-20190304-before-resize (faf18644-1c6e-4844-9692-9856b512b8e0) | 22:29 |
dmsimard | clarkb: tried one with the 8gb flavor (which is the one failing to boot right now) -- the one with the 2gb flavor is still building | 22:29 |
openstackgerrit | Adam Coldrick proposed openstack-infra/storyboard master: Add a 'security' flag to Teams https://review.openstack.org/640823 | 22:30 |
dmsimard | I can spend some more cycles on this later, need to grab dinner. | 22:30 |
clarkb | dmsimard: the image min disk is set to 40 and the performance 8gb flavor does have a 40gb root disk | 22:31 |
clarkb | so ya that should be fine | 22:31 |
ianw | dmsimard: just taking the liberty of trying emergency mode with that other server, see if we can get any info | 22:34 |
dmsimard | ianw: go for it | 22:34 |
clarkb | corvus: should we reenable the cron on the running server and run it with mordred's fix or wait for possible resizing? | 22:35 |
corvus | clarkb, dmsimard: since dmsimard is going to return to this later, and the next step is verifying the snapshot anyway, i think we can go back into production... | 22:37 |
corvus | clarkb: where's your zuul-cd change at? | 22:38 |
clarkb | https://review.openstack.org/#/c/604925/ looks like it already conflicts again after recent rebase. I'll rebase again now | 22:38 |
*** kgiusti has left #openstack-infra | 22:40 | |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Add zuul user to bridge.openstack.org https://review.openstack.org/604925 | 22:40 |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Manage user ssh keys from urls https://review.openstack.org/604932 | 22:40 |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Run docker-compose pull before docker-compose up https://review.openstack.org/640889 | 22:41 |
corvus | clarkb: ^ there's the answer to your question from a while ago | 22:41 |
corvus | clarkb, dmsimard: i will re-enable the cron jobs on bridge | 22:42 |
clarkb | ah the image pull isn't implicit | 22:42 |
clarkb | but if you've got new images then the up implies a restart | 22:42 |
corvus | yep | 22:42 |
*** harlowja has joined #openstack-infra | 22:42 | |
corvus | i went ahead and did a git pull on system-config on bridge, and re-enabled the crons | 22:43 |
*** agopi has quit IRC | 22:44 | |
dmsimard | corvus: +1 | 22:44 |
mordred | corvus: ok to +A the docker-compose? | 22:45 |
corvus | clarkb: is the addition of test_ara in 604925 intentional? (i didn't see it mentioned in the commit, and it's not obvious why it's needed for this change, but it's fine on its own :) | 22:45 |
corvus | mordred: yep | 22:45 |
mordred | corvus: done! | 22:46 |
clarkb | corvus: that appears to be bad rebasing | 22:46 |
clarkb | corvus: I can remove it. I just rebased so may as well remove that unneeded function | 22:46 |
corvus | clarkb: ok either way. i +2d now, and will be happy to +2 the cleanup. | 22:47 |
corvus | assuming, of course, that test works :) | 22:47 |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Add zuul user to bridge.openstack.org https://review.openstack.org/604925 | 22:48 |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Manage user ssh keys from urls https://review.openstack.org/604932 | 22:48 |
clarkb | corvus: ^ should be cleaned up now | 22:48 |
*** sdake has quit IRC | 22:50 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine https://review.openstack.org/640884 | 22:51 |
*** slaweq has quit IRC | 22:53 | |
*** smcginnis has joined #openstack-infra | 22:54 | |
ianw | dmsimard: so it has a command-line "root=/dev/xvdb1" and the partitions seem to be /dev/xvda1 (cloudimg-rootfs) /xvda15 & 14 (efi stuff) | 22:54 |
ianw | bridge has everything on xvda | 22:55 |
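The mismatch ianw points out can be checked mechanically, as in this sketch. The `root=/dev/xvdb1` value and the partition names come from the messages above; the other tokens in the example command line are made up for illustration, and `root_device` is a hypothetical helper.

```python
from typing import Optional


def root_device(cmdline: str) -> Optional[str]:
    """Return the value of the root= parameter from a kernel command line."""
    for token in cmdline.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None


# Only root=/dev/xvdb1 is from the log; 'ro' and 'console=hvc0' are placeholders.
cmdline = "ro root=/dev/xvdb1 console=hvc0"
partitions = ["/dev/xvda1", "/dev/xvda14", "/dev/xvda15"]

root = root_device(cmdline)
print(root, root in partitions)
```

A root device that is absent from the partition list would be consistent with the snapshot dropping to emergency mode at boot.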
*** dpawlik has joined #openstack-infra | 22:56 | |
*** tkajinam has joined #openstack-infra | 22:56 | |
ianw | different kernel in the /proc/cmdline ... hrm | 22:57 |
*** jamesmcarthur has quit IRC | 22:57 | |
dmsimard | ianw: bridge.o.o (the real one) has /dev/xvda1, xvda14 and xvda15 (???) as well as an xvde2 for /opt | 22:58 |
*** jamesmcarthur has joined #openstack-infra | 22:58 | |
*** eernst has joined #openstack-infra | 23:00 | |
*** dpawlik has quit IRC | 23:00 | |
*** jamesmcarthur has quit IRC | 23:00 | |
*** slaweq has joined #openstack-infra | 23:01 | |
*** eernst has quit IRC | 23:02 | |
openstackgerrit | Logan V proposed openstack/diskimage-builder master: Add DIB_APT_MINIMAL_CREATE_INTERFACES toggle https://review.openstack.org/639865 | 23:03 |
ianw | yeah, i got up the initramfs ... closer ... | 23:04 |
clarkb | logan-: it looks like our ansible run may be having trouble talking to the mirror now | 23:05 |
*** slaweq has quit IRC | 23:05 | |
clarkb | logan-: not sure if that helps with debugging (just noticed that the ansible process to that node is slow/not finishing quickly) | 23:05 |
corvus | fungi: can you +3 https://review.openstack.org/604925 ? i'm in no rush for the followup if you want to discuss it more | 23:06 |
*** eernst has joined #openstack-infra | 23:06 | |
logan- | clarkb: yeah :/ it seems like there's some neutron weirdness going on in the controllers / network nodes. i'm rolling thru them doing openstack minor version upgrades and reboots and then we'll see where things stand | 23:07 |
fungi | yep, looks good | 23:07 |
fungi | thanks again, logan-! | 23:08 |
corvus | fungi: and i downgraded my vote in the followup to a +1 to reduce the chances of premature merging | 23:08 |
logan- | im tempted to leave it disabled this week and bring that cloud up to rocky. | 23:09 |
*** eernst has quit IRC | 23:10 | |
fungi | corvus: thanks. i'm open to being convinced this isn't really a risk in practice, but combine it with the fact that we know there's a *.openstack.org cert floating around outside our sphere of control and it just gets that much more questionable | 23:10 |
*** rascasoft has quit IRC | 23:10 | |
*** sdake has joined #openstack-infra | 23:12 | |
openstackgerrit | Adam Coldrick proposed openstack-infra/storyboard master: Use ColumnElements instead of strings in migration https://review.openstack.org/640892 | 23:15 |
*** betherly has joined #openstack-infra | 23:16 | |
*** eernst has joined #openstack-infra | 23:16 | |
*** eernst has joined #openstack-infra | 23:17 | |
*** markvoelker has joined #openstack-infra | 23:18 | |
*** eernst has quit IRC | 23:19 | |
*** betherly has quit IRC | 23:20 | |
*** eernst has joined #openstack-infra | 23:23 | |
corvus | JpMaxMan, clarkb, fungi, mordred: i think all the zuul stuff is in place for netlify-cms. see the most recent build of https://review.openstack.org/635924 -- if you follow the link there, it will take you directly to the preview site | 23:25 |
corvus | that change can be merged now if we want :) | 23:26 |
clarkb | neat | 23:27 |
*** eernst has quit IRC | 23:27 | |
*** openstackgerrit has quit IRC | 23:28 | |
*** tosky has quit IRC | 23:28 | |
JpMaxMan | Nice! Now what? | 23:29 |
clarkb | JpMaxMan: basically merge that change then all subsequent changes to that sandbox will get similar previewing builds | 23:30 |
clarkb | which can be used by reviewers to decide if they want to merge those changes or not | 23:31 |
*** eernst has joined #openstack-infra | 23:31 | |
JpMaxMan | Hah yeah I know speaking more macro like do we roll up all the project sites using this ? | 23:31 |
clarkb | puppet is going to run on review.o.o nowish. I think this should fix windmill-config which will fix our image builds | 23:31 |
*** eernst has quit IRC | 23:32 | |
corvus | JpMaxMan: yeah, i think maybe we're probably ready to either move starlingx or zuul to this for real... maybe we can try to regroup with mordred tomorrow, and jimmy too if he's around? | 23:35 |
fungi | JpMaxMan: probably once we see it work well for the stx site, i guess so | 23:35 |
*** sdake has quit IRC | 23:35 | |
mordred | corvus: yeah - I think that's a great next step | 23:35 |
clarkb | https://git.openstack.org/cgit/openstack/windmill-config has content now | 23:35 |
fungi | might at least be nice to go through the paces of a change pushed from netlify and a change pushed directly to gerrit just to make sure it's doing what we expect | 23:35 |
mordred | corvus: probably try moving starlingx for real first - since it's already using an appropriate framework and we'll need to do some rework on zuul | 23:35 |
fungi | not that zuul is really using much of a framework. it's just hand-edited html/css which were originally spat out of a templating engine | 23:36 |
mordred | but I think we can fast-follow starlingx with zuul pretty easily | 23:36 |
*** dpawlik has joined #openstack-infra | 23:37 | |
corvus | mordred: zuul has the publication framework, stx has the vuepress buildout. they're each halfway there :) | 23:37 |
fungi | good point | 23:37 |
mordred | corvus: indeed. :) | 23:38 |
corvus | mordred, clarkb: git playbook ran in 10m | 23:38 |
mordred | corvus: that's so much better than 1.5 hours | 23:39 |
fungi | that's a good order of magnitude improvement (down from nearly 100) | 23:39 |
*** dklyle has quit IRC | 23:39 | |
corvus | i'm trying to dig the result out :) | 23:39 |
JpMaxMan | fungi: yeah I think some more testing is in order for sure | 23:40 |
mordred | corvus: opendev.org seems to have repos all the way to the end of stackforge at least - so seems like it got past healthnmon | 23:40 |
corvus | zero failures :) | 23:40 |
mordred | but that looks like it was from 4 hours ago | 23:40 |
corvus | oh, hrm | 23:40 |
clarkb | corvus: mordred did we end up correcting the setup of the projects that made it in when the db stuff wasn't working? | 23:40 |
clarkb | (wondering if we want to do an out of band pass with the playbook just for this) | 23:41 |
mordred | clarkb: all of them should be fine - except for maybe healthnmon | 23:41 |
corvus | mordred: hrm, we must have run it to completion with the description length fix. xstatic-angular-animate is the last project in the list. | 23:41 |
*** dpawlik has quit IRC | 23:41 | |
corvus | so the state is correct | 23:41 |
mordred | clarkb: we were doing full re-runs already | 23:41 |
mordred | yah. oh - and in fact healthnmon should be fine too | 23:41 |
corvus | yep, looks like it has the correct settings | 23:42 |
mordred | corvus: \o/ | 23:42 |
mordred | do we have that new project that was landed this morning? | 23:42 |
clarkb | cool | 23:42 |
corvus | https://opendev.org/openstack/windmill-config | 23:42 |
clarkb | openstack/windmill-config | 23:42 |
mordred | woot | 23:42 |
clarkb | next step is replication? | 23:42 |
mordred | so now, in theory, we should be good to land the gerrit replicate patch, assuming we're happy with the playbook | 23:43 |
corvus | and https://review.openstack.org/#/admin/projects/openstack/windmill-config exists | 23:43 |
*** mriedem has joined #openstack-infra | 23:43 | |
corvus | mordred, clarkb: yes on replication | 23:43 |
mordred | woot! | 23:43 |
clarkb | I'll remove my wip | 23:43 |
clarkb | https://review.openstack.org/#/c/640431/ WIP removed | 23:43 |
corvus | clarkb: frickler had a q | 23:43 |
mordred | corvus, clarkb: anybody want to predict an over/under on how long it'll take for replication to catch up? | 23:44 |
mordred | :) | 23:44 |
corvus | mordred: i think *less* than 24h :) | 23:45 |
clarkb | corvus: frickler I actually did read the docs on that let me find a link to a manpage | 23:45 |
*** rlandy has quit IRC | 23:45 | |
clarkb | https://git-scm.com/docs/git-push#URLS ssh://[user@]host.xz[:port]/path/to/repo.git/ is the form we are using | 23:46 |
* mordred puts €1 on 7hrs | 23:46 | |
clarkb | it is possible we need to explicitly set the ssh:// though | 23:46 |
*** maumont has quit IRC | 23:46 | |
clarkb | [user@]host.xz:path/to/repo.git/ is the form we were using which didn't have the ssh:// | 23:46 |
corvus | yeah, but it's gerrit, not ssh, doing this | 23:47 |
clarkb | https://gerrit.googlesource.com/plugins/replication/+doc/master/src/main/resources/Documentation/config.md points to the git push url docs too | 23:47 |
clarkb | given that I'd expect ssh:// to be what we want | 23:47 |
corvus | well, i don't think we need ssh:// | 23:48 |
corvus | we don't have it in any other sections | 23:48 |
clarkb | ya but the other sections use the scp form I think (which doesn't have the ssh://) | 23:48 |
clarkb | but with scp form you can't set the port since scp uses : ? | 23:48 |
* clarkb double checks | 23:48 | |
* mordred believes in clarkb | 23:49 | |
clarkb | we can convert review-dev to the other form really quick and test if we want | 23:49 |
corvus | where does review-dev replicate to? | 23:49 |
clarkb | github iirc | 23:49 |
clarkb | gtest-org replicates to github | 23:49 |
* clarkb double checks that | 23:50 | |
corvus | ah, so put a :22 on there | 23:50 |
clarkb | yup | 23:50 |
clarkb | let me get that change up | 23:50 |
*** eharney has quit IRC | 23:51 | |
clarkb | rereading docs I'm fairly certain we want the ssh://. do we prefer I test what I've already pushed for review.o.o or test ssh:// then update review.o.o change to ssh:// if that works? | 23:51 |
*** markvoelker has quit IRC | 23:51 | |
*** hwoarang has quit IRC | 23:51 | |
corvus | clarkb: whichever you prefer | 23:51 |
*** wolverineav has quit IRC | 23:52 | |
*** mriedem has quit IRC | 23:52 | |
*** hwoarang has joined #openstack-infra | 23:53 | |
*** IvensZambrano has quit IRC | 23:53 | |
*** openstackgerrit has joined #openstack-infra | 23:53 | |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Use explicit ssh url with review-dev replication config https://review.openstack.org/640896 | 23:53 |
corvus | clarkb: can we just do that manually on review-dev real quick? | 23:54 |
clarkb | corvus: sure | 23:54 |
clarkb | I can do that real quick | 23:54 |
clarkb | review-dev.o.o has been restarted with that config. Now I'll merge something in gtest and we should see it replicate | 23:57 |
clarkb | https://github.com/gtest-org/test/commit/0e293c8fbf715d3e601be7be6ffcf56be6da10bd is there | 23:58 |
clarkb | https://review-dev.openstack.org/#/c/107956/ I just submitted that change | 23:58 |
clarkb | corvus: ^ if you think that looks right I'll update my change for prod | 23:58 |
openstackgerrit | Merged openstack-infra/system-config master: Add zuul user to bridge.openstack.org https://review.openstack.org/604925 | 23:59 |
openstackgerrit | Merged openstack-infra/system-config master: Run docker-compose pull before docker-compose up https://review.openstack.org/640889 | 23:59 |
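
The exchange above about `ssh://[user@]host.xz[:port]/path/to/repo.git/` versus the scp-like `[user@]host.xz:path/to/repo.git/` form can be illustrated with a small sketch. This is not Gerrit's or git's actual parser; `GitRemote`, `parse_remote`, and the `example.org` URLs are hypothetical names chosen for illustration, and IPv6 literals are not handled. It only shows why a port cannot be expressed in the scp-like form: everything after the first colon is read as the path.

```python
import re
from typing import NamedTuple, Optional


class GitRemote(NamedTuple):
    """Hypothetical container for the pieces of a git-over-ssh remote."""
    user: Optional[str]
    host: str
    port: Optional[int]
    path: str


# ssh://[user@]host[:port]/path  (explicit scheme, optional port)
_SSH_URL = re.compile(
    r"^ssh://(?:(?P<user>[^@/]+)@)?(?P<host>[^:/]+)(?::(?P<port>\d+))?/(?P<path>.*)$"
)
# [user@]host:path  (scp-like; no slash may appear before the first colon)
_SCP_LIKE = re.compile(r"^(?:(?P<user>[^@/:]+)@)?(?P<host>[^:/]+):(?P<path>.*)$")


def parse_remote(url: str) -> GitRemote:
    """Split a remote into user/host/port/path for the two forms discussed."""
    m = _SSH_URL.match(url)
    if m:
        port = int(m.group("port")) if m.group("port") else None
        return GitRemote(m.group("user"), m.group("host"), port, "/" + m.group("path"))
    m = _SCP_LIKE.match(url)
    if m:
        # The scp-like form has no port field: text after ':' is all path.
        return GitRemote(m.group("user"), m.group("host"), None, m.group("path"))
    raise ValueError(f"not an ssh:// or scp-like remote: {url!r}")


if __name__ == "__main__":
    print(parse_remote("ssh://git@example.org:22/org/repo.git"))
    print(parse_remote("git@example.org:org/repo.git"))
    # A "port" written in the scp-like form silently becomes part of the path.
    print(parse_remote("git@example.org:22/org/repo.git"))
```

Under these assumptions, a replication URL that needs a non-default port has to use the explicit `ssh://` form, which is what the review-dev test with `:22` exercised.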