Tuesday, 2017-12-12

00:02 <pabelanger> remote:   https://review.openstack.org/527272 Add pip as requirement
00:02 <pabelanger> should fix ptgbot
00:02 <jeblair> lgtm
00:02 <ianw> pabelanger: is that going to install pip3?
00:05 <ianw> yes, after looking at it :)
00:06 <pabelanger> ianw: actually, I think we might need pip::python3
00:06 <pabelanger> looking
00:06 <pabelanger> oh
00:06 <pabelanger> no, you are right
00:06 <pabelanger> we do both
00:07 <ianw> oh good, that was my reading of puppet-pip :)
00:07 <pabelanger> ianw: :) mind upgrading to a +3
00:08 <ianw> pabelanger: np, or just merge when ci returns
00:17 *** baoli_ has quit IRC
00:17 *** baoli has joined #openstack-sprint
00:22 *** baoli has quit IRC
00:41 <ianw> launch status01.o.o
01:00 <ianw> oh, bah, that probably would have worked if i fixed up the hiera groups, doing that now
01:03 <fungi> also apparently need 527280 for the subunit worker
01:21 <ianw> oh, status.o.o needs nodejs too
01:22 <ianw> i'm pretty sure https://review.openstack.org/#/c/526978/ (update puppet nodejs to 2.3.0) is safe ... but will test more
01:36 <Shrews> hrmm, anyone know why our dns TTLs are different on some logstash workers? the one i did earlier was 5m, the two I just did are 60m
01:38 <clarkb> Shrews: I think our script defaults to 60m but rax web ui defaults to 5m
01:38 <clarkb> so likely where the records were made?
01:39 <Shrews> clarkb: should I change them?
01:39 <clarkb> ya it's probably worthwhile to go to 60m (our default) but also not urgent I don't think
01:39 <Shrews> k
01:42 * Shrews notes that ttl is changed for logstash-worker02 from 5 to 60
01:47 *** baoli has joined #openstack-sprint
01:59 *** jhesketh has quit IRC
02:29 <Shrews> logstash-worker19 and logstash20 are done
02:29 * Shrews calls it a night
02:29 <Shrews> err, logstash-worker20
02:50 *** larainema has joined #openstack-sprint
02:54 *** baoli has quit IRC
02:55 *** baoli has joined #openstack-sprint
03:00 *** baoli has quit IRC
03:03 *** baoli has joined #openstack-sprint
03:08 *** ianychoi has joined #openstack-sprint
03:16 <dmsimard> jeblair: did I ever tell you those zuul graphs are beautiful? They're beautiful.
03:18 <dmsimard> I'll take logstash-worker04 through 06 since the logstash queue is low
03:27 <ianw> hmm, is there something magic in hiera/common.yaml
03:28 <dmsimard> ianw: I'm sure there's plenty of magic to be had
03:28 <dmsimard> :D
03:28 <ianw> Error: Could not find data item elasticsearch_nodes in any Hiera data file and no default supplied
03:28 <dmsimard> ianw: where are you seeing that?
03:28 <ianw> from a manual puppet run on the new status host
03:29 <dmsimard> ianw: hmm, unrelated but is paste.o.o loading for you?
03:30 <ianw> dmsimard: it appears to not be
03:31 <clarkb> ianw I think we set the hiera load path to find it but I'd have to read ansible-puppet to be sure
03:32 <dmsimard> hah, I'd paste you the error
03:32 <dmsimard> but hey, paste is down :D
03:32 <dmsimard> https://etherpad.openstack.org/p/XYJJyds9L8
03:33 <dmsimard> looks like it's back?
03:33 <ianw> yeah, it does that sometimes
03:34 <dmsimard> clarkb: unrelated but do you know if we're enabling the mqtt ansible callback on puppetmaster.o.o?
03:35 <dmsimard> clarkb: it looks like it tries to load the mqtt lib but it's not there http://paste.openstack.org/raw/628671/
03:37 <clarkb> I want to say that is a known issue
03:37 <clarkb> fungi and mtreinish likely know more
04:00 <ianw> inability to build netifaces ... fungi was that the problem you saw before?
04:01 <dmsimard> rings a bell
04:23 <clarkb> yes, with the subunit worker
04:26 *** baoli has quit IRC
04:30 <ianw> arrgghh damn it.  i just spent 20 minutes wondering why puppet doesn't work locally and it was because i didn't prefix it with 'sudo '
04:30 <ianw> anyway, my instructions for running puppet locally were wrong, you just want to use the environment
04:31 <ianw> updated on the etherpad ... confusingly it works if you *don't* access common hiera data
04:55 <dmsimard> infra-root: I wrote what was essentially my bash history in playbook format: https://review.openstack.org/#/c/527301/
04:55 <dmsimard> I wanted to do it only for the workers, but I figured it might be generic enough
04:56 <dmsimard> The payoff might not be immediate but guess what comes out in 18.04
04:56 <dmsimard> Oh, that's a server re-installation playbook by the way :D
04:58 * dmsimard sleep
05:06 <ianw> frickler: thanks for https://review.openstack.org/527144 ... progress has been painful but getting there slowly ... i think now it successfully installs nodejs and npm
05:06 <ianw> you might like to start looking at other ::nodejs users ; e.g. https://review.openstack.org/527302
05:07 *** skramaja has joined #openstack-sprint
05:29 *** baoli has joined #openstack-sprint
05:45 *** fungi has quit IRC
05:46 *** baoli has quit IRC
05:48 *** fungi has joined #openstack-sprint
05:55 <frickler> ianw: cool, I'm not sure about the symlink issue though, looking at the code it is set to true for 16.04, but should in fact only be for system packages
05:56 <frickler> ianw: I'd like to redeploy ethercalc01 from scratch to verify the new stuff is still working fine then
05:58 *** skramaja has quit IRC
06:14 <ianw> frickler: sure; one thing is it's a bit of a pain to deploy it all with all the dependent changes
06:18 <ianw> frickler: but feel free to destroy 93b2b91f-7d01-442b-8dff-96a53088654a ; or just recreate it from image or whatever.  any testing i have is now in changes
06:20 <frickler> ianw: o.k., I'm still processing backlog, but will do in a bit. probably launch another new instance first before removing that one
06:21 <frickler> ianw: on a different note, I think https://review.openstack.org/524459 could proceed now, zuul seems to have changed in the meantime
06:22 <ianw> yes, i meant to unwip them after a zuul restart
06:24 <ianw> thanks for reminding me :)
06:25 <frickler> oh, cool, just received the first bunch of root cron log mails
06:44 *** baoli has joined #openstack-sprint
06:49 *** baoli has quit IRC
08:44 *** AJaeger has joined #openstack-sprint
08:46 *** baoli has joined #openstack-sprint
08:49 -openstackstatus- NOTICE: Our CI system Zuul is currently not accessible. Wait with approving changes and rechecks until it's back online. Currently waiting for an admin to investigate.
08:51 *** baoli has quit IRC
09:09 -openstackstatus- NOTICE: Zuul is back online, looks like a temporary network problem.
09:22 *** baoli has joined #openstack-sprint
09:26 *** baoli has quit IRC
09:30 *** skramaja has joined #openstack-sprint
09:51 *** AJaeger has quit IRC
09:51 *** AJaeger has joined #openstack-sprint
10:04 *** jhesketh has joined #openstack-sprint
10:23 *** baoli has joined #openstack-sprint
10:27 *** baoli has quit IRC
11:24 *** baoli has joined #openstack-sprint
11:28 *** baoli has quit IRC
11:32 *** jkilpatr has quit IRC
11:40 *** baoli has joined #openstack-sprint
11:44 *** baoli has quit IRC
12:05 *** jkilpatr has joined #openstack-sprint
12:15 <frickler> deployed logstash-worker0[1789] and configured rdns, waiting with the remainder for someone to watch how I break things ;)
12:41 *** baoli has joined #openstack-sprint
12:45 *** baoli has quit IRC
13:37 *** baoli has joined #openstack-sprint
13:43 <Shrews> picking up logstash-worker 17 & 18
13:45 <dmsimard> Shrews: oi
13:45 <dmsimard> Shrews: I started writing this last night: https://review.openstack.org/#/c/527301
13:46 *** baoli has quit IRC
13:46 *** baoli has joined #openstack-sprint
13:46 <dmsimard> Shrews: It's WIP and actually not tested yet (I'll test it today), I wrote it after reinstalling three workers.. cause, you know, I'm not doing this by hand for all the servers (especially again once 18.04 is out)
13:47 <Shrews> i suspect you'll have issues automating the dns changes
13:47 <Shrews> because of the things clarkb explained yesterday
13:48 <dmsimard> I think we'll always want to do reverse but forward is manual, yeah
13:49 *** skramaja has quit IRC
13:49 <dmsimard> It's more than likely possible to do a delete and a create of the forward
13:49 <dmsimard> but we probably don't want to do that before there's been some amount of verification
13:51 <dmsimard> http://git.openstack.org/cgit/openstack-infra/system-config/tree/launch/dns.py already has the logic to show which commands to run and it's something available within Ansible already (server IPs, etc.)
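Whatever ends up automated, the verification step is safe to script today; a minimal sketch using dig (the hostname and address below are placeholders, not values taken from this log):

    # forward record should point at the new server
    dig +short logstash-worker04.openstack.org
    # reverse record for that address should point back at the hostname
    dig +short -x 203.0.113.10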
13:52 <Shrews> FWIW, after we get done with the logstash workers, I need to switch gears back to zuul things. There are some things that I MUST get done this week since I'll be off for a couple of weeks starting next week. If I get those done, I can switch back to helping with the upgrades.
13:55 <dmsimard> Shrews: feel free to focus on zuul after the two you're on
13:55 <dmsimard> I should be able to pick a few easily with the playbook :D
14:04 <fungi> ianw: where else did you encounter netifaces getting dragged in from pypi?
14:23 <frickler> ianw: 93b2b91f-7d01-442b-8dff-96a53088654a is the original ethercalc01, I assume we still need to migrate the data from there somehow.
14:25 <frickler> ianw: 079b34ea-4c2d-4c05-a984-a044ab69b0d8 is the one you launched which I think can be removed now, 43fa686e-12a4-4c51-ad3b-d613e2417ff3 is my second launch, which I now deployed successfully with some more fixes to our patch
14:28 <clarkb> frickler: o/ I'm around now if you want to walk through dns changes
14:33 <frickler> clarkb: cool, I think I can follow what is in backlog, I'd start by changing dns via web gui for the above four instances
14:34 <frickler> clarkb: although I've wondered whether it would make sense to stop the old instances first
14:35 <clarkb> frickler: it shouldn't make a difference because it is the firewall update that effectively switches "control" from the old instance to the new instance. And this way you can always roll back to the old instance if you need to (though at this point I think we have a lot of confidence in the new logstash-workers)
14:36 <clarkb> frickler: questions like that and the weird dns situation are why we don't really automate all of this, as it's largely per-service what makes sense
14:36 <clarkb> dmsimard: re automating dns, one of the biggest issues has been the client/api and the fact that we share the domain with the foundation
14:39 <frickler> clarkb: though maybe we could use a different domain for most instances? like openstack-infra.org? or something unbranded once we have the current specs proposal done
14:39 <clarkb> frickler: ya we've talked about that. I think that will likely happen as part of the host other name servers work in jeblair's spec
14:39 -openstackstatus- NOTICE: We're currently seeing an elevated rate of timeouts in jobs and the zuulv3.openstack.org dashboard is intermittently unresponsive, please stand by while we troubleshoot the issues.
14:40 <clarkb> (probably not immediately, but gives us a lot more control and ability to do things like that)
14:51 <frickler> clarkb: o.k., updated 8 dns records, confirmed with dmsimard's ansible command on all target hosts, will do firewall restarts next
15:00 <clarkb> also re dns I think the ideal would be that it went through code review or at least revision control of some sort, which we currently lack as well
15:01 <frickler> clarkb: yes, having the zones in git managed by gerrit sounds pretty nice I think. probably with a post job to take care of SOA updates and zone reloads, but that's fine tuning ;)
15:03 <clarkb> frickler: and re the email flood: before the summit I started sifting through it and attempting to address things that looked like problems, but that lost all momentum during travel for summit. That is probably something I should try and pick up again during the slow weeks around holidays
15:07 <clarkb> things like unattended upgrade package lists are noisy but probably useful (easy to grep my inbox for when things updated) and don't indicate issues
15:07 <frickler> clarkb: yeah, some cleanup would be nice
15:08 <frickler> clarkb: so I've done the firewall restart and refreshed hostkeys on puppetmaster, now checking what I might have missed
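Refreshing host keys on the puppetmaster amounts to dropping the stale entry and recording the new one; a minimal sketch, assuming the system-wide known_hosts file is the one in use (the worker name is a placeholder):

    # forget the key recorded for the replaced server
    sudo ssh-keygen -R logstash-worker10.openstack.org -f /etc/ssh/ssh_known_hosts
    # scan and append the new server's key
    ssh-keyscan logstash-worker10.openstack.org | sudo tee -a /etc/ssh/ssh_known_hosts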
15:16 <frickler> clarkb: so I think I'm done, ready to delete the remaining nodes, waiting for confirmation on that. I'd then continue with the remainder of the workers while everyone else chases zuul ;)
15:17 <clarkb> frickler: after a quick look around it seems like the new workers are happy. Logstash job queue is small too
15:17 <clarkb> frickler: I think you can go ahead and remove the trusty nodes when you are ready
15:18 <clarkb> frickler: did you catch my notes on that from yesterday? you have to use the uuid due to the name conflict. I personally like to show $uuid, confirm it's the right one, then delete $uuid
15:18 <clarkb> (the big drawback to non-unique names is this can get a little confusing)
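That show-then-delete flow maps to something like the following with openstackclient (the UUID is a placeholder, and the client is assumed to be configured for the right cloud and region):

    # confirm the UUID really is the old trusty node: check name, addresses, created date
    openstack server show 3a0a1b9e-0000-0000-0000-000000000000
    # only then remove it
    openstack server delete 3a0a1b9e-0000-0000-0000-000000000000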
15:19 <pabelanger> dmsimard: should look into cloud-launcher, there is some logic already created to bootstrap a server, but it was never finished.  The missing part is running puppet after the server is launched
15:19 <pabelanger> dmsimard: we'd then store the server info in yaml format
15:21 <frickler> clarkb: yeah, I saw that show/delete thing, seems useful
15:32 *** baoli has quit IRC
15:33 <frickler> clarkb: o.k., so there are 7 workers remaining now, should I do them all in one batch, split them in 3+4, leave some/all for others? or wait until zuul is fixed?
15:35 <Shrews> frickler: if you want to take the rest as a batch, i say go for it.
15:35 <clarkb> I would split them up if only to keep quota usage low. I don't think you need to wait for zuul
15:36 <frickler> o.k., so I'll start with the next 4
15:37 <Shrews> cool. i'm going to focus on some zuul things
15:37 <dmsimard> frickler: can you leave me a few? I'd like to test my playbook
15:38 <dmsimard> I'll claim them on the pad
15:44 <frickler> dmsimard: ack
16:03 * dmsimard current status: creating mail filters for 300 mails
16:09 <pabelanger> remote:   https://review.openstack.org/527447 Support ubuntu xenial plugins directory
16:09 <pabelanger> should fix puppet-meetbot plugins issue^
16:11 <clarkb> pabelanger: commented,
16:12 <pabelanger> clarkb: thanks!
16:12 <pabelanger> fixing
16:30 <pabelanger> remote:   https://review.openstack.org/527447 Support ubuntu xenial plugins directory
16:30 <pabelanger> clarkb: ^
17:14 <pabelanger> I've moved on to files02.o.o, until we can land ^ for eavesdrop01.o.o
17:14 <frickler> taking a break now, but planning to be back for the meeting
17:24 <clarkb> pabelanger: +2 on meetbot change
17:24 <fungi> dmsimard: did we mailbomb you with cronspam? welcome to infra root ;)
17:27 *** baoli has joined #openstack-sprint
17:30 <pabelanger> remote:   https://review.openstack.org/527469 Update files node to support numeric hostnames
17:30 <pabelanger> needed to bring files02.o.o online
17:42 <dmsimard> fungi: yeah, DoSing my inbox :)
18:04 <clarkb> pabelanger: can we abandon https://review.openstack.org/#/c/527469/ in favor of https://review.openstack.org/#/c/527186/ ?
18:06 <pabelanger> clarkb: yup, if you want to +3 the other, I'll abandon mine
18:06 <clarkb> ok
18:08 <fungi> dmsimard: i don't know what sort of mail filtering system you use, but if it helps here's the relevant snippet from my exim .forward filter file: http://paste.openstack.org/show/628758/
18:09 <fungi> though i have earlier rules which explicitly handle stuff coming from mailing lists and code review, so that's mostly a fallthrough catch-all
18:10 <clarkb> really need to make cacti graph creation quieter
18:10 <clarkb> maybe I'll make a patch for that today
18:10 <fungi> and that one's actually an elif branch in a long series
18:11 <clarkb> but also the exim warning emails we get from all over
18:21 <pabelanger> clarkb: yah, I pushed up something a few weeks ago to help cut down on the size of the emails
18:21 <pabelanger> clarkb: but for most of it we can just remove some debug print commands
18:21 <pabelanger> at least for the size of the email
18:21 <clarkb> pabelanger: or just log it on cacti instead and not email it?
18:22 <pabelanger> clarkb: yah, I'd be okay with that. I assumed somebody was actually looking at the emails
18:36 <fungi> more likely it's an indication nobody's looking at it, or it wouldn't still be so large
18:45 <dmsimard> fungi: gmail :/
18:46 <dmsimard> fungi: for stuff like this, you can just batch select a bunch of messages with similar patterns and there's an option "filter messages like these" and it proposes a filter based on what you selected (maybe it's a mailing list header, maybe it's a bunch of different "from", etc.)
18:48 <fungi> interesting
18:48 <fungi> curious to hear how that goes for you. i bet their ai is powerful
18:48 <fungi> given how much money they probably sink into developing that service
19:01 *** baoli has quit IRC
19:01 *** baoli has joined #openstack-sprint
19:03 <dmsimard> it's not really that smart, but I guess it's something that'll improve over time
19:03 <dmsimard> like most data-driven things
19:42 *** baoli has quit IRC
19:56 *** baoli has joined #openstack-sprint
20:01 <ianw> after some breakfast i'll take a look at codesearch
20:01 <ianw> and also review frickler's changes to ethercalc
20:02 <clarkb> I need lunch but ya was going to dig into sprint topic reviews after
20:02 <ianw> fungi: i'm certain i saw netifaces failing to build during one of our ethercalc runs ... but there's been a lot going on and i can't find it right now, but i'll keep an eye out
20:02 <pabelanger> going to look at files02.o.o and eavesdrop01.o.o again
20:04 <fungi> ianw: speaking of netifaces, still hammering on fixing it for puppet-subunit2sql... latest fix is https://review.openstack.org/527280
20:07 <frickler> dmsimard: I'd be ready for the next set of iptables restarts, have you started touching dns yet?
20:07 <dmsimard> frickler: no, I haven't touched my batch of 3 workers yet
20:07 <dmsimard> maybe in an hour or so
20:08 <frickler> dmsimard: o.k., so I'll do mine now and should be finished with that soon
20:08 <dmsimard> ack
20:08 <ianw> fungi: /usr/local/bin isn't in the default path?
20:09 <pabelanger> clarkb: https://review.openstack.org/507266/ too when you have a moment
20:09 <fungi> ianw: previously we hard-coded to /usr/bin/pip
20:10 <fungi> which doesn't exist (per the tracebacks i got when launching the server)
20:10 <pabelanger> fungi: ianw: https://review.openstack.org/527447/ could also use a review to address the plugins directory for puppet-meetbot
20:10 <fungi> i included the path in there on the assumption that puppet may not set one otherwise
20:10 <ianw> right, but is line 42 in https://review.openstack.org/#/c/527280/2/manifests/init.pp strictly necessary?
20:10 <pabelanger> https://review.openstack.org/527274/ is an easy +3 for somebody too
20:16 -openstackstatus- NOTICE: The zuul scheduler has been restarted after lengthy troubleshooting for a memory consumption issue; earlier changes have been reenqueued but if you notice jobs not running for a new or approved change you may want to leave a recheck comment or a new approval vote
20:18 <fungi> i had copied the original implementation from puppet-zuul and adapted it for python 2.7... not sure why it was calling /usr/bin/pip3 even though our normal pip is at /usr/local/bin/pip
20:21 <frickler> logstash-worker1[0-3] done and /me is done for today, too ;)
20:22 <fungi> thanks frickler!
20:26 <fungi> pabelanger: i had one minor concern noted inline on https://review.openstack.org/527447 but didn't block the change for that
20:27 <pabelanger> fungi: yah, I tried looking this morning, but didn't see a way to define an external plugin directory
20:28 <fungi> the cheap solution would just be to switch on version ranges like >=16.04
20:28 <fungi> which is what some changes in other modules ended up going with
20:28 <pabelanger> yah
20:28 <pabelanger> I am guessing python3.6 might be the next path too
20:29 <fungi> quite possibly, if meetbot even has/gets py3k support
20:33 *** jkilpatr has quit IRC
20:35 *** dteselkin has quit IRC
20:40 *** dteselkin has joined #openstack-sprint
20:44 <jeblair> i didn't hear any screams about grafana, so i will delete the old server now
20:44 <pabelanger> okay, trying files02.o.o again
20:44 <pabelanger> jeblair: +1
20:45 *** clarkb has quit IRC
20:46 *** clarkb has joined #openstack-sprint
20:48 <jeblair> done
20:58 <ianw> frickler: ok, my original ethercalc 079b34ea-4c2d-4c05-a984-a044ab69b0d8 deleted now
21:02 <ianw> https://172.99.116.13/ seems to be working
21:03 <pabelanger> okay, files02.o.o is online
21:05 <pabelanger> I'm going to update the DNS for files.o.o and point to files02.o.o
21:05 <pabelanger> unless somebody objects
21:08 <fungi> go for it
21:08 <pabelanger> done
21:08 <pabelanger> I can see some traffic already
21:08 <fungi> guess we should try hitting docs.o.o or something to make sure it comes up
21:08 <fungi> wfm and is resolving via cname
21:09 <pabelanger> yay
21:11 <clarkb> ianw: cool, does that mean nodejs is sorted out now?
21:12 <clarkb> lunch has been consumed, now on to the reviews. also apparently I lost connectivity to freenode?
21:13 <pabelanger> remote:   https://review.openstack.org/527519 Remove files01.o.o from hiera
21:13 <ianw> clarkb: we'll need the updated version from https://review.openstack.org/#/c/526978/
21:14 <pabelanger> I'll delete files01.o.o in an hour or so. Once http traffic has stopped hitting it
21:14 <clarkb> ianw: cool, I'll start reviewing there
21:14 <ianw> well that's just a version number bump ... i'm pretty sure it's backwards compatible but i'm not sure how to prove that
21:14 <clarkb> was just going to ask about that
21:15 <clarkb> ianw: due to how puppet modules are global, I think what we've done in the past is read the docs and do our best to check the interfaces we use to get to the "pretty sure" assertion, then check after the fact if anything broke
21:15 <clarkb> ianw: puppet tends to fail gracefully making it relatively safe to push those things in and see what happens
21:15 <clarkb> (because puppet when failing does nothing rather than something)
21:16 <clarkb> I've +2'd the change, if other people are ok with ^ then I think we can go ahead and approve it
21:17 <ianw> yeah, also any backwards compat issues would be with trusty hosts, so presumably temporary
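One way to back up that "pretty sure" assertion before the bump lands is a no-op run on an affected host; a minimal sketch, where the manifest and module paths are assumptions about the local checkout rather than anything stated here:

    # show what the newer module would change without applying anything
    sudo puppet apply --noop --modulepath=/etc/puppet/modules /opt/system-config/production/manifests/site.pp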
21:17 <ianw> if anyone has thoughts on migrating redis databases over and above https://redis.io/topics/persistence ... which suggests you can just move the rdb file ... i'm open to suggestions :)
21:17 <clarkb> ianw: my understanding is that is how you redis
21:18 <clarkb> I think what you would do is stop the service in front of redis so that it doesn't get updates, force a write (or just wait for one), then copy the db file
21:19 <clarkb> fungi: any reason this wasn't approved earlier https://review.openstack.org/#/c/527175/1 ? just the zuul situation?
21:19 <ianw> it does say it's safe to copy at any time due to it using rename().  so i'll do an initial ethercalc copy, test a few, and then if it's ok, shutdown apache on ethercalc, final copy, redirect dns
21:20 <clarkb> sounds good to me
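A minimal sketch of that copy, assuming the stock Ubuntu redis-server layout (dump.rdb under /var/lib/redis) and that apache/ethercalc has already been stopped; the target hostname is a placeholder:

    # old host: force a final synchronous snapshot to dump.rdb
    redis-cli save
    scp /var/lib/redis/dump.rdb ethercalc02:/tmp/dump.rdb
    # new host: stop redis, drop the snapshot in place with the right ownership, start it again
    sudo systemctl stop redis-server
    sudo install -o redis -g redis -m 0640 /tmp/dump.rdb /var/lib/redis/dump.rdb
    sudo systemctl start redis-server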
21:20 <fungi> ianw: oh, approved now. i probably just missed that it already had a +2
21:30 <ianw> clarkb: i'd say the ethercalc puppet is ready for review too -> https://review.openstack.org/#/c/527144
21:46 <clarkb> ianw: left some comments on that, the first set are for the -1
21:46 <clarkb> and I'm just now realizing I am blind
21:46 <clarkb> there are TWO anchors, only one went away
21:47 <clarkb> ianw: curious what you think about the persistent journald storage thing though. I remember poking at this in the past and deciding there didn't seem to be a great way to do that
21:52 <clarkb> ianw: aiui the two ways to do that are to make sure the /var/log/journal dir exists with all the correct permissions (and selinux labels), or to change the journald config to persistent (instead of the default auto) and then journald creates the dir for itself with the correct settings
21:53 <clarkb> The problem with the first is I think different distros do perms/users/etc differently. The problem with the second is doing the whole restart dance and making sure it picks it up and does the work for itself
21:53 <clarkb> but maybe we just do the second and restart the service and call it good
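A minimal sketch of that second option, assuming a stock /etc/systemd/journald.conf (journald creates /var/log/journal itself once Storage=persistent takes effect):

    # switch Storage from the commented default (auto) to persistent
    sudo sed -i 's/^#\?Storage=.*/Storage=persistent/' /etc/systemd/journald.conf
    sudo systemctl restart systemd-journald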
22:04 *** jkilpatr has joined #openstack-sprint
22:10 <fungi> our launch script reboots the server at the end anyway, so it'll be restarted at that point, right?
22:13 <clarkb> fungi: ya, so fresh servers will be fine but not existing ones like translate or logstash workers
22:13 <clarkb> I think we can sort it out, I just need enough time to sort out what the process is
22:13 <clarkb> or someone else, feedback welcome :)
22:18 <ianw> hmm, yeah i hadn't considered that.  setting the config seems best maybe
22:22 <ianw> i think maybe we redirect output like the upstart logs did for ethercalc, though
22:27 <ianw> frickler: 527144 updated the service file to put the logs in the same place, moved the rotation back out
22:29 <jeblair> 1 year memory max for paste.o.o is 486M.  do we want to stick with the current 2G flavor or drop to 1G?
22:29 <jeblair> (it has almost no cpu usage)
22:30 <clarkb> I guess that also drops to single cpu? but with little cpu usage that's probably fine
22:30 <jeblair> ya
22:33 <jeblair> remote:   https://review.openstack.org/527536 Create paste hiera group
22:33 <jeblair> easy +3
22:35 <pabelanger> remote:   https://review.openstack.org/527447 Support ubuntu xenial plugins directory
22:35 <clarkb> jeblair: in that case I think 1GB is fine
22:35 <pabelanger> need to run and eat, but ^ updated for puppet syntax issue. Would like a re-review please
22:35 <clarkb> I've queued up both changes for review
22:37 <jeblair> fungi: can you +3 527536?
22:37 <clarkb> pabelanger: I left a -1, can you take a look when you get back?
22:40 <jeblair> clarkb, pabelanger: i responded to clarkb but left a -1 for a different reason
22:41 <jeblair> (but related)
22:43 <clarkb> jeblair: you may be interested in my comment on https://review.openstack.org/#/c/526946/ as I think there is an intersection between retired projects and zuul v3 config
22:45 <jeblair> clarkb: yeah, why don't we do that in project-config, so the repo can still be empty
22:46 <clarkb> ya that may make it cleaner in the retired repo
22:46 <pabelanger> thanks, looking
22:47 <clarkb> I guess the end state is to remove the repo from zuul's config entirely?
22:47 <clarkb> so that would just be temporary no matter where it lives?
22:47 <pabelanger> jeblair: good point
