jhesketh | fungi: so with the new naming thing, is that meant to apply to dns too? e.g. I launch apps01.openstack.org, create a DNS record for apps01.openstack.org and use a CNAME for apps.openstack.org? | 00:15 |
---|---|---|
jhesketh | or do I just launch apps01.openstack.org as the server name and update the apps.openstack.org records? | 00:15 |
anteaya | jhesketh: I don't know the answer to the naming question | 00:27 |
anteaya | my nephew has arrived for tea | 00:27 |
anteaya | I may or may not be back online tonight | 00:27 |
fungi | jhesketh: i think cname, though we didn't explicitly discuss that part of the implementation | 00:35 |
fungi | jhesketh: though you likely want to make sure there are no fqdn-isms baked into puppet for the service you're replacing | 00:36 |
fungi | i'm leaning toward this being guidance going forward for new services we build out, but unless someone does the legwork to clean up hostname assumptions in our existing puppet we likely should continue doing replacements the old way when there's some question | 00:37 |
jhesketh | fungi: right, so if there are fqdn-isms (so to speak), how do you launch a new node? Do you do it with the same name as the existing one? | 00:42 |
fungi | yep | 00:43 |
fungi | and then just edit the dns records | 00:43 |
fungi | to switch to the replacement | 00:43 |
* jhesketh is embarrassed to admit he hasn't ever edited any of openstack's dns before... | 00:44 | |
jhesketh | fungi: what's the best way to do that.. through the web ui? (given the commands from dns.py are for creating new records rather than updating) | 00:45 |
fungi | jhesketh: yes, i end up doing dns changes through the rax web dashboard | 00:45 |
fungi | remember the tenant that domain is under is "openstack" not openstackci | 00:45 |
jhesketh | yeah I found them earlier | 00:46 |
fungi | login credentials for that are in our usual password list | 00:46 |
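For reference, the two approaches discussed above look roughly like this in BIND-style record form (addresses are placeholders; the actual edits happen in the Rackspace DNS dashboard, not a zone file):

```
; new-style: per-instance name, service name as a CNAME
apps01.openstack.org.  300  IN  A      203.0.113.10
apps01.openstack.org.  300  IN  AAAA   2001:db8::10
apps.openstack.org.    300  IN  CNAME  apps01.openstack.org.

; old-style replacement: keep the service name and just repoint its records
apps.openstack.org.    300  IN  A      203.0.113.10
apps.openstack.org.    300  IN  AAAA   2001:db8::10
```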
*** rfolco has joined #openstack-sprint | 00:49 | |
ianw | couple of jobs failed i think 2 hours ago with "timeout -s SIGINT 0 git fetch http://zm05.openstack.org/p/openstack-dev/devstack refs/zuul/master/Z6370fc143e874864a12f8061044c2f04" ... i'm gonna presume it was related to upgrades? | 01:01 |
ianw | e.g. http://logs.openstack.org/52/320152/2/check/gate-devstack-unit-tests/1b1e49a/console.html | 01:01 |
jhesketh | fungi: so just looking at the hostname stuff... if we did apps01.openstack.org we'd also need to update the site.pp to match there.. | 01:07 |
jhesketh | should we have something like "node /^apps\d*\.openstack\.org$/ {" for all the node definitions? | 01:07 |
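A minimal sketch of that site.pp idea; the regex is the one from the discussion, but the class and parameter names here are only illustrative:

```puppet
# matches apps.openstack.org as well as apps01.openstack.org, apps02..., etc.
node /^apps\d*\.openstack\.org$/ {
  class { 'openstack_project::apps_site':
    # hypothetical parameter: pass the service name in explicitly rather than
    # relying on $::fqdn, so the vhost stays apps.openstack.org on any replica
    vhost_name => 'apps.openstack.org',
  }
}
```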
*** baoli has joined #openstack-sprint | 01:09 | |
fungi | ianw: that sounds possible. pabelanger pretty consistently did a #status log of each of them so they should be timestamped on the infra status wiki | 01:12 |
fungi | jhesketh: yeah, though there may also be $::fqdn references in the corresponding class in system-config or in the puppet module instantiated by it | 01:13 |
fungi | you'll want to keep an eye out for that as well | 01:13 |
jhesketh | oh yeah, there's heaps of that | 01:13 |
jhesketh | most of those are vhost names so we should look at having the vhost name set in site.pp | 01:15 |
jhesketh | hmm, replacing $::fqdn is no small feat | 01:20 |
fungi | yep, that's why i say don't let the fqdn-isms in our puppet keep you from doing upgrades for now. people who have a strong interest in that refactoring effort can tackle it after we upgrade stuff, but the upgrades are more urgent than separating out all the hostname assumptions in our config management right now | 01:39 |
fungi | that was more or less the point of my second e-mail on that thread | 01:41 |
*** hieulq has quit IRC | 01:48 | |
jhesketh | yep that seems sensible... I looked at tackling the fqdn but I think upgrading the old way for now to move the sprint along is best | 01:49 |
fungi | i wound up taking the same route through the storyboard replacement build too | 01:50 |
anteaya | I'm out for the night | 02:01 |
anteaya | thanks jhesketh for taking on the apps.o.o operating system upgrade | 02:01 |
anteaya | :) | 02:01 |
anteaya | g'night | 02:01 |
jhesketh | anteaya: no trouble.. I've gotten side-tracked but will look at it this afternoon | 02:01 |
jhesketh | anteaya: have a good evening :-) | 02:01 |
anteaya | thank you :) | 02:02 |
anteaya | have a good day | 02:02 |
*** anteaya has quit IRC | 02:02 | |
*** hieulq has joined #openstack-sprint | 03:10 | |
*** baoli has quit IRC | 05:19 | |
*** ig0r_ has joined #openstack-sprint | 06:00 | |
*** ig0r_ has quit IRC | 06:17 | |
*** ig0r_ has joined #openstack-sprint | 08:20 | |
*** delatte has quit IRC | 12:12 | |
*** baoli_ has joined #openstack-sprint | 12:45 | |
*** ig0r__ has joined #openstack-sprint | 13:05 | |
*** ig0r_ has quit IRC | 13:08 | |
*** anteaya has joined #openstack-sprint | 13:24 | |
pabelanger | morning | 13:49 |
pabelanger | fungi: it looks like the volumes on graphite.o.o are reattached this morning. No updates on the support ticket jeblair created | 13:50 |
pabelanger | I'm going to try again to disconnect one | 13:50 |
fungi | thanks | 13:51 |
fungi | worth a shot | 13:51 |
pabelanger | will do 1 at a time this time | 13:52 |
pabelanger | 4b5d9f10-0427-4bfb-b45b-a4d682ac4ba5 detaching | 13:53 |
pabelanger | still detaching | 13:54 |
pabelanger | looks like same issue as yesterday | 13:57 |
pabelanger | I am going to move to status.o.o while we wait for RAX support, fungi anteaya jhesketh or anybody else. Do you mind +A https://review.openstack.org/#/c/320653/ | 14:03 |
anteaya | looking | 14:03 |
jhesketh | me too | 14:04 |
jhesketh | anteaya: Morning | 14:04 |
anteaya | jhesketh: hey there | 14:05 |
anteaya | what are you doing up? | 14:05 |
jhesketh | just checking on things before I head off | 14:05 |
jroll | pabelanger: what's the issue, volumes stuck detaching? ORD? | 14:05 |
anteaya | jhesketh: thank you, I haven't read back scroll yet, how did you do with apps.o.o? | 14:05 |
jhesketh | anteaya: I spun up a new 14.04 apps.openstack.org... I've left a few comments in the etherpad. Basically I think it's okay but would like somebody to take a quick look before doing the dns switchover | 14:06 |
jhesketh | probably good to get docaedo to take a look | 14:06 |
anteaya | jhesketh: fair enough, thank you | 14:06 |
pabelanger | jroll: Yes, volumes appear stuck when I detach them, this is in dfw | 14:06 |
anteaya | jhesketh: the etherpad has the new ip? | 14:06 |
jhesketh | yes | 14:06 |
jhesketh | anteaya: or http://104.239.149.86/ | 14:06 |
jroll | pabelanger: interesting. /me sees if this is a thing | 14:06 |
anteaya | jhesketh: thank you | 14:07 |
fungi | pabelanger: did you try halting (or rebooting) the server? | 14:07 |
pabelanger | fungi: I have not | 14:08 |
anteaya | jhesketh: great, thank you, I will share that with docaedo and get his approval | 14:08 |
jhesketh | anteaya: cool. I'm happy to switch the dns across, but maybe not right now | 14:08 |
anteaya | jhesketh: yup, understood thank you | 14:09 |
anteaya | jhesketh: have you lowered the ttl time on apps.o.o? | 14:09 |
anteaya | jhesketh: I pinged docaedo in -infra | 14:09 |
jhesketh | anteaya: I pinged docaedo earlier but he wasn't around... if you wouldn't mind checking there are no other api endpoints or anything that might be pointing to apps.o.o that'd be handy | 14:09 |
jhesketh | anteaya: oh, good point, I'll do that now | 14:10 |
anteaya | jhesketh: thanks | 14:10 |
jroll | pabelanger: if you have instance UUIDs I'm happy to dive in logs, too | 14:10 |
anteaya | no other api endpoints pointing to apps.o.o, I'd be happy to but would have to read or receive instruction as to how | 14:10 |
anteaya | as currently I don't know | 14:10 |
jhesketh | anteaya: any other aliases | 14:11 |
anteaya | morning jroll thanks for the help | 14:11 |
jroll | morning anteaya :) | 14:11 |
anteaya | jhesketh: I don't know how I would check that, I guess I will ask docaedo and hope he would know if there are | 14:11 |
pabelanger | jroll: sure: server 09696c29-3410-4baf-8813-05d0eb948be2 | 14:11 |
anteaya | jhesketh: in any case, leave it with me, enjoy some sleep | 14:12 |
jhesketh | anteaya: yeah, if you don't mind | 14:12 |
pabelanger | jroll: and thanks for helping | 14:12 |
anteaya | jhesketh: and thanks for your work here, much appreciated | 14:12 |
anteaya | jhesketh: I got it, sleep well | 14:12 |
jhesketh | anteaya: any idea what storage.apps.openstack.org might be? | 14:12 |
* jhesketh notes it in the dns records | 14:12 | |
anteaya | jhesketh: I personally have no idea, but will find out | 14:12 |
jroll | pabelanger: no problem, I'll see what I can find | 14:13 |
jhesketh | anteaya: thanks, much appreciated :-) | 14:13 |
anteaya | jhesketh: and I'm grateful for your work, see you tomorrow | 14:14 |
jhesketh | heh, it's like you're trying to get rid of me! | 14:14 |
jhesketh | not at all, it was my pleasure... sorry I didn't get it online, I just wanted some people to sanity check it first | 14:14 |
fungi | docaedo lives in portland oregon, so probably isn't online yet for the day | 14:15 |
anteaya | jhesketh: no no, not trying to get rid of you at all | 14:16 |
jhesketh | I'm teasing ;-) | 14:16 |
anteaya | jhesketh: but it must be what, midnight for you? | 14:16 |
jhesketh | correct | 14:16 |
anteaya | love your company, would keep you to 3am if I could | 14:16 |
anteaya | I miss having you around more | 14:16 |
anteaya | jhesketh: and I agree sanity checking is great | 14:17 |
jhesketh | would stay up until 3am if I could sleep in.. I'm much more productive at night | 14:17 |
anteaya | fungi: thank you | 14:17 |
jhesketh | naww, thanks | 14:17 |
anteaya | jhesketh: :) | 14:17 |
anteaya | jhesketh: well work it out with your wife and then stay up at night | 14:17 |
anteaya | that would work for me | 14:17 |
jhesketh | heh, I do some nights, it's more the morning meetings that constrain me... | 14:18 |
anteaya | ah yes | 14:18 |
jhesketh | (actually one of the reasons I am still up is my wife is on a night shift, so that works out well) | 14:18 |
anteaya | well get mikal to change his schedule too | 14:18 |
anteaya | oh nice | 14:18 |
anteaya | we can have mikal staying up all night too | 14:19 |
anteaya | would get to talk to him again | 14:19 |
anteaya | I miss him too | 14:19 |
jhesketh | heh, he's a morning person | 14:19 |
anteaya | dang | 14:19 |
jhesketh | but these meetings are with other rackers in the states or sometimes uk so it's also juggling timezones | 14:19 |
anteaya | ah I see | 14:19 |
anteaya | I thought it was your morning mikal meeting | 14:20 |
jhesketh | we have those too, but they are at 10am so they are easy | 14:20 |
anteaya | ah good | 14:23 |
jhesketh | okay, now I'm off | 14:32 |
jhesketh | night all! | 14:32 |
anteaya | night | 14:35 |
jroll | pabelanger: looks like a host thing, still not sure what's up but it looks specific to your instance or your host | 14:36 |
pabelanger | jroll: Hmm. Okay. Maybe I'll do what fungi suggests and stop the instances next time | 14:37 |
pabelanger | assuming it reattaches | 14:37 |
jroll | pabelanger: looks like it's going to retry for a couple hours :| | 14:38 |
pabelanger | jroll: what do you suggest? Should I be able to power down the instances now? or simply wait until the retries finish | 14:39 |
jroll | pabelanger: I'm not sure, I am far from a virt expert - trying to bug some people | 14:40 |
anteaya | docaedo has given the all clear to swap dns on apps.o.o if someone can do that | 14:47 |
anteaya | also don't change storage.apps.o.o | 14:48 |
anteaya | <docaedo> anteaya: yes, that storage DNS entry points to a rax cloudfiles instance, so should not be changed | 14:48 |
anteaya | and docaedo is not aware of any aliases | 14:49 |
anteaya | can I convince anyone to take up the dns switch for apps.o.o? | 15:16 |
*** yolanda has quit IRC | 15:20 | |
* anteaya delegates :) | 15:36 | |
anteaya | wrong channel | 15:36 |
* clarkb catches up on precise to trusty sprint | 16:08 | |
clarkb | fungi: are we supposed to boot things with digits always now and cname to them in dns with human names? | 16:09 |
fungi | clarkb: if possible that's the new exciting, i guess. but if you're replacing a server there's a good chance its existing manifest(s) assume the hostname in lots of places so we can punt on that and just do the old-style server names until someone has time to do that refactoring | 16:10 |
clarkb | ya the one I am worried about in the stack of hosts I would like to help with is logstash.o.o, I can look at its apache vhost config | 16:11 |
fungi | storyboard.o.o didn't pass the sniff test for me, so i just replaced it with another storyboard.o.o | 16:12 |
fungi | git grep fqdn turned up 4 instances of $::fqdn in the puppet-storyboard module and some in the system-config classes for the storyboard servers as well | 16:12 |
fungi | and i don't want to conflate the "is this thing working after moving to trusty?" problems with the "did i just screw up the puppet manifest?" ones | 16:13 |
clarkb | ya | 16:14 |
clarkb | fungi: the fixup is to listen on *:port and set servername to fqdn? | 16:17 |
fungi | clarkb: set servername to something which _isn't_ fqdn, passed in as a specific string constant to a class parameter instead | 16:22 |
clarkb | oh right because fqdn is different now | 16:22 |
fungi | VirtualHost *:port and ServerName are separate concerns | 16:22 |
clarkb | fungi: sort of, you can't safely colocate *'s on the same ports without servername iirc | 16:23 |
pabelanger | trying to launch status.o.o replacement again | 16:23 |
fungi | VirtualHost *:port ends up being necessary anyway, i think, because of that apache 2.4 behavior i found with the hostname mapping only to a loopback interface in /etc/hosts | 16:23 |
clarkb | fungi: maybe we should default to fqdn so that we continue to work on older hosts (if anyone else is using the current setup) | 16:27 |
fungi | yep | 16:28 |
pabelanger | wouldn't mind a review to fix apache on status.o.o replacement: https://review.openstack.org/#/c/321068/ | 16:31 |
fungi | in the case of the storyboard vhosts we were, luckily, already setting ServerName so the wildcard change to VirtualHost was sufficient to make this work on newer servers | 16:31 |
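A rough sketch of that pattern as an ERB vhost template, assuming the wrapping class takes a vhost_name parameter that defaults to $::fqdn (the real storyboard/logstash parameter names may differ):

```erb
# rendered by a class with: $vhost_name = $::fqdn  (overridable from site.pp)
<VirtualHost *:80>
  # '*' rather than the host's own name, which under apache 2.4 on trusty
  # may only resolve to a loopback entry in /etc/hosts
  ServerName <%= @vhost_name %>
  DocumentRoot /srv/static
</VirtualHost>
```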
clarkb | pabelanger: looking | 16:31 |
jroll | pabelanger: still got nothin on that volume thing :( | 16:32 |
pabelanger | jroll: :( | 16:32 |
clarkb | pabelanger: do we need to enable mod_version or whatever the module that does the if checks is called? | 16:32 |
pabelanger | jroll: I'll wait for it to reattach, then power off the server | 16:32 |
jroll | pabelanger: sounds good, if that doesn't work then O_o | 16:32 |
pabelanger | clarkb: I don't believe so. A quick test on the replacement server didn't require it | 16:33 |
clarkb | pabelanger: it may be enabled by default | 16:33 |
clarkb | pabelanger: we should probably explicitly enable it whenever we add a dep on it | 16:33 |
pabelanger | clarkb: sure, let me check | 16:33 |
clarkb | and it is mod_version | 16:33 |
pabelanger | clarkb: https://tickets.puppetlabs.com/browse/MODULES-1446 looks to be a builtin function now | 16:37 |
clarkb | pabelanger: but we support precise too | 16:38 |
clarkb | if its enabled there by default too I am less worried | 16:38 |
clarkb | I seem to recall we ran into this with the git0* hosts because centos6 to centos7 is a similar migration path for apache | 16:38 |
pabelanger | clarkb: that is true, let me see how to enable it for other OSes | 16:39 |
pabelanger | looks like include ::apache::mod::version should be enough | 16:41 |
pabelanger | Hmm, we are still using puppet-httpd | 16:41 |
pabelanger | let me see what is needed to migrate to puppet-apache | 16:41 |
pabelanger | https://review.openstack.org/#/c/321124/ fixes a race condition on status.o.o | 16:56 |
pabelanger | keeps bouncing on me | 16:56 |
fungi | clarkb: yeah, what we ran into with the centos6 migration is that the change to detect the apache version broke the centos6 production servers because it assumed mod_version was always present | 17:06 |
fungi | which apparently was the case on typical centos7 deployments or something | 17:06 |
fungi | so apache failed to start because the configuration explicitly referenced version directives in conditionals even though the module to provide them was absent | 17:07 |
pabelanger | we only need to worry about it on ubuntu-precise now | 17:09 |
pabelanger | centos7 and ubuntu-trusty and above have it baked in | 17:09 |
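The version conditionals in question are the usual 2.2/2.4 compatibility shim, which only parses if mod_version is available (hence the concern about precise and the old centos6 breakage):

```apache
<IfVersion >= 2.4>
  Require all granted
</IfVersion>
<IfVersion < 2.4>
  Order allow,deny
  Allow from all
</IfVersion>
```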
pabelanger | fungi: clarkb: mind reviewing https://review.openstack.org/#/c/321124/ | 17:28 |
pleia2 | lgtm | 17:29 |
pabelanger | pleia2: danke! | 17:30 |
pabelanger | pleia2: mind +A? | 17:31 |
pleia2 | yep, just saw that | 17:31 |
fungi | sorry, my primary computer picked this moment to overheat and power itself off, and then i discovered my alternate was frozen | 17:31 |
fungi | back now | 17:31 |
pleia2 | fungi: yeesh, quite a morning | 17:31 |
clarkb | fungi: did you apply ice to the alternate? | 17:32 |
fungi | clarkb: i'm allowing it to add to the thermal entropy of the room for a few minutes instead | 17:33 |
*** ig0r__ has quit IRC | 17:51 | |
pabelanger | fungi: okay, looks like volume has reattached | 17:59 |
fungi | pabelanger: yeah, try halting the server, and if that still doesn't work then we can try to openstack server reboot it | 18:00 |
fungi | hard reboot (whatever the osc syntax is for that anyway) | 18:00 |
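The osc syntax being alluded to, using the graphite server uuid pabelanger gave earlier:

```bash
# stop the instance first; if the volume still won't detach, try a hard reboot
openstack server stop 09696c29-3410-4baf-8813-05d0eb948be2
openstack server reboot --hard 09696c29-3410-4baf-8813-05d0eb948be2
```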
pabelanger | ack | 18:02 |
pabelanger | halting system | 18:02 |
pabelanger | fungi: okay, that worked. Both volumes are detached (available) | 18:05 |
pabelanger | and volumes attached to the new server | 18:07 |
pabelanger | \o/ | 18:07 |
fungi | excellent. i guess update/close the ticket | 18:07 |
pabelanger | fungi: sure. Do you mind stepping me through the attachment process for lvm? | 18:08 |
fungi | my guess is something about the old system wasn't responding to or updating metadata in the hypervisor in the way nova expected to indicate the volumes were unused | 18:08 |
fungi | i don't know exactly what signaling that process relies on normally | 18:08 |
fungi | pabelanger: sure, reattachment is easy | 18:09 |
fungi | pabelanger: see the openstack server volume attach syntax in our example guide | 18:09 |
fungi | after that, look in dmesg output, you'll likely see it discover the volume group on its own | 18:09 |
fungi | if lvs reports the existence of the logical volume, then you should be ready to mount it | 18:10 |
fungi | if not, we may need to vgchange -a y first | 18:10 |
fungi | or vgscan | 18:10 |
fungi | there are a few tools to help the kernel along if udev triggers don't automagically dtrt | 18:11 |
fungi | but in my experience this usually _just works_ | 18:11 |
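Put together, the reattachment sequence fungi describes looks roughly like this (the attach subcommand name varies with client version; the volume group/logical volume names follow the graphite example below):

```bash
# attach the cinder volume to the replacement server
# (older clients spell this "openstack server add volume", per the launch README)
openstack server add volume <server> <volume-uuid>

# the kernel normally discovers the LVM metadata on its own
dmesg | tail
lvs                      # the logical volume should be listed here

# if it is not, nudge LVM along
vgscan
vgchange -a y main

# then mount the filesystem
mount /dev/mapper/main-graphite /var/lib/graphite/storage
```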
pabelanger | lvs: http://paste.openstack.org/show/505433/ | 18:11 |
fungi | lgtm! | 18:12 |
pabelanger | and I see /dev/mapper/main-graphite | 18:12 |
pabelanger | okay, mounted! | 18:14 |
pabelanger | now to remember the fstab syntax | 18:15 |
pabelanger | looks like: | 18:15 |
pabelanger | /dev/mapper/main-graphite /var/lib/graphite/storage ext4 errors=remount-ro,noatime,barrier=0 0 2 | 18:15 |
pabelanger | okay, rebooting server to make sure things come back up properly | 18:16 |
pabelanger | then cutting over DNS | 18:16 |
anteaya | anyone who has the time to change the dns for apps.openstack.org to http://104.239.149.86/ that is the only thing left to do for that server | 18:20 |
pabelanger | DNS updated for graphite.o.o | 18:25 |
pabelanger | anteaya: I can do that now | 18:25 |
anteaya | pabelanger: thank you | 18:25 |
anteaya | note there is a dns entry for storage.apps.openstack.org, do nothing with that entry | 18:25 |
pabelanger | anteaya: There should also be an IPv6 address too. We'll need to update that | 18:26 |
pabelanger | anteaya: ack | 18:26 |
anteaya | thanks | 18:26 |
anteaya | I don't know how to get the ipv6 address for that server, sorry I didn't ask for it earlier | 18:26 |
pabelanger | np, I can find it quickly | 18:26 |
anteaya | thanks | 18:26 |
pabelanger | 2001:4800:7819:105:be76:4eff:fe04:70ae for your records | 18:27 |
anteaya | thank you | 18:27 |
pabelanger | okay, DNS changed | 18:27 |
anteaya | wonderful! | 18:27 |
anteaya | thanks pabelanger | 18:27 |
pabelanger | anteaya: the old server is still running. We should clean that up as soon as people are happy with the replacement | 18:28 |
anteaya | thanks, I just pinged docaedo in -infra | 18:28 |
fungi | similarly, i have the old storyboard.o.o halted, will be deleting it later today | 18:28 |
fungi | the new one seems to be in good shape per the denizens of #storyboard | 18:29 |
anteaya | yay | 18:29 |
pabelanger | okay, I restarted nodepoold to pick up the DNS change for graphite.o.o, looks to be logging correctly again | 18:35 |
pabelanger | we'll need to do the same for the other services | 18:35 |
anteaya | yay graphite | 18:36 |
pabelanger | So, shockingly, this is the output of iptables on graphite.o.o now | 18:37 |
pabelanger | http://paste.openstack.org/show/505440/ | 18:37 |
clarkb | pabelanger: does an iptables-persist or whatever it is called restart fix that? | 18:38 |
pabelanger | clarkb: YES | 18:39 |
clarkb | huh | 18:39 |
clarkb | I wonder if it is racing something at boot | 18:39 |
pabelanger | possible | 18:39 |
pabelanger | I should check other servers I upgraded | 18:39 |
pabelanger | okay, they all work as expected | 18:41 |
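The check/remedy amounts to something like this, assuming the trusty service name for the persistent-rules loader:

```bash
# ruleset comes up empty after the reboot
sudo iptables -L -n

# re-load the saved rules (kept under /etc/iptables/ by iptables-persistent)
sudo service iptables-persistent restart

sudo iptables -L -n    # should now show the expected rules again
```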
jeblair | i will see about fixing cacti | 18:43 |
pabelanger | off to pick up my daughter from school, then will try status.o.o again | 18:43 |
jeblair | oh, ha | 18:43 |
jeblair | i forgot to turn off debug logging, which is why the disk is full | 18:43 |
pleia2 | whoops | 18:44 |
jeblair | clarkb, pleia2: May 24 16:52:26 cacti puppet-user[26190]: (/Stage[main]/Apache/Apache::Vhost[default]/File[15-default.conf symlink]/ensure) created | 18:47 |
jeblair | so, puppet ensured that the default site was created... i can't imagine we actually wanted that to happen... | 18:47 |
jeblair | (this is why cacti.openstack.org returns the ubuntu default page) | 18:47 |
pleia2 | hrm | 18:50 |
pleia2 | looks like cacti is another we configure completely in system-config? | 18:51 |
jeblair | yep | 18:51 |
bkero | jeblair: that might be the default behavior for the apache module | 18:51 |
pleia2 | bkero: yeah, that's what I'm gathering | 18:52 |
jeblair | two questions: why didn't this break on the old host? how do we turn it off? | 18:52 |
bkero | Which shouldn't be a problem if you're doing name-based vhosts for the sites you actually care about | 18:52 |
pleia2 | could be apache 2.2 > 2.4 weirdness again | 18:53 |
jeblair | indeed, this is not an actual vhost | 18:53 |
jeblair | maybe we should make it a proper vhost | 18:53 |
pleia2 | yeah | 18:53 |
bkero | That sounds like the proper solution | 18:53 |
jeblair | hrm | 18:54 |
bkero | Hmm, it should be getting a proper vhost | 18:54 |
bkero | ::apache::vhost::custom { $::fqdn: | 18:54 |
bkero | hah | 18:54 |
jeblair | bkero: yeah, but we set custom content, which doesn't actually vhost | 18:54 |
bkero | jeblair: can you see if there's a file in /etc/apache2/sites-enabled/ for cacti? | 18:55 |
bkero | It should be triggered by http://localhost/cacti | 18:55 |
jeblair | bkero: yes, it looks like modules/openstack_project/templates/cacti<whatever> | 18:56 |
bkero | right | 18:56 |
jeblair | and cacti.openstack.org/cacti has been working | 18:56 |
jeblair | so that file is fine | 18:56 |
jeblair | i just wrapped it in a virtualhost on disk with a servername of cacti.openstack.org | 18:56 |
jeblair | and now cacti.openstack.org/ works | 18:57 |
jeblair | so i will puppet that up | 18:57 |
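The on-disk change being described amounts to wrapping the existing cacti config in something like:

```apache
<VirtualHost *:80>
  ServerName cacti.openstack.org
  # ... existing contents of the openstack_project cacti template go here
  #     unchanged (the /cacti aliases, directory stanzas, etc.) ...
</VirtualHost>
```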
bkero | cool | 18:57 |
* bkero prefers to avoid having to specify paths when a subdomain already exists | 18:57 | |
jeblair | this is an *old* server config :) | 18:57 |
bkero | I can feel the cobwebs | 18:58 |
jeblair | wow, it's not as old as i thought: https://review.openstack.org/#/c/14582/ it's a whole 1.25 years after gerrit... | 19:00 |
jeblair | we probably stood it up manually before then | 19:00 |
pleia2 | still before my time :) | 19:01 |
jeblair | oh, yes, in fact the first comment says as much | 19:01 |
jeblair | pleia2, bkero, clarkb: remote: https://review.openstack.org/321190 Add VirtualHost to cacti vhost | 19:07 |
bkero | jeblair: lgtm | 19:15 |
pabelanger | and back | 19:39 |
pabelanger | jeblair: I just noticed http://cacti.openstack.org/ landing page doesn't seem correct | 19:40 |
bkero | pabelanger: it will once the patch lands, I assume | 19:41 |
bkero | See earlier ^ | 19:41 |
pabelanger | bkero: sure, checking backscroll | 19:41 |
jeblair | pabelanger, bkero: yes, that patch addresses that | 19:41 |
pabelanger | Great, see it now | 19:41 |
pabelanger | launching status.o.o again | 19:58 |
jeblair | pabelanger: hrm, i'm not seeing some graphite data i'm expecting from zuul | 20:26 |
pabelanger | jeblair: which? | 20:27 |
pabelanger | jeblair: is it possible the service needs restarting to pick up the dns change? | 20:27 |
jeblair | pabelanger: oh hrm. nodepool has been restarted, and we have data from that | 20:27 |
jeblair | pabelanger: that could be it | 20:27 |
fungi | bkero: yeah, i think stats exports are going to whatever ip address zuul resolved from its config at start | 20:28 |
pabelanger | Ya, I did restart nodepool. But we need to schedule restarts for the other services | 20:28 |
fungi | er, jeblair ^ (sorry bkero) | 20:28 |
jeblair | well... i've been wondering if we want to restart zuul for https://review.openstack.org/318966 | 20:29 |
clarkb | yes the statsd lib does name resolution at import time iirc | 20:29 |
bkero | no worries | 20:29 |
fungi | bkero: you had changes up for the paste.o.o upgrade, right? i have bandwidth to tackle that one now | 20:31 |
bkero | fungi: I do! | 20:32 |
bkero | Lemme get the review link | 20:32 |
bkero | fungi: https://review.openstack.org/#/c/311235/ | 20:32 |
pabelanger | okay, moving to my next server. | 20:42 |
fungi | bkero: i think i see a problem with that change. see my inline comment | 20:45 |
pabelanger | going to do eavesdrop.o.o | 20:45 |
pabelanger | Hmm, this was going to be tricky | 20:46 |
pabelanger | need to rework the file system first | 20:46 |
fungi | oh, yeah eavesdrop has state in /var/lib | 20:48 |
fungi | and then we symlink into it from /srv | 20:48 |
bkero | fungi: ok, fixed | 20:48 |
pabelanger | yup | 20:48 |
fungi | pabelanger: we could make /var/lib/whatever a logical volume and /srv/whatever another volume | 20:49 |
fungi | and stick them in a main volume group on a cinder device | 20:49 |
fungi | bkero: coolness | 20:49 |
pabelanger | fungi: ya, I think that will be easiest. | 20:50 |
clarkb | hrm logstash.o.o will need some plumbing to use something other than fqdn in the vhost name | 20:50 |
pabelanger | fungi: so, 1 cinder volume, 2 partitions. I should be able to follow: http://docs.openstack.org/infra/system-config/sysadmin.html#adding-a-new-device but use 50% for parted command | 20:52 |
pabelanger | fungi: does that sound right? | 20:52 |
fungi | bkero: wrong name on that class parameter--it will likely fail tests anyway, but see inline new comment | 20:53 |
fungi | pabelanger: don't need two partitions. you can just create two logical volumes in the volume group you make | 20:54 |
bkero | fungi: ah, my bad. Fixed. | 20:55 |
pabelanger | fungi: gotcha. That's using lvcreate right? | 20:55 |
bkero | Hah, that's kind of curious. It should still work, but this is the default: https://github.com/openstack-infra/puppet-lodgeit/blob/master/manifests/site.pp#L12 | 20:58 |
bkero | So if we didn't set it in the lodgeit::site resource, it would do the right thing by default | 21:01 |
fungi | pabelanger: yep, just make sure to figure out how much space you want to assign to each lv so that you don't exceed the space available to the vg | 21:02 |
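A sketch of that layout with illustrative device, volume group, and LV names/sizes (the real mount points are eavesdrop's state under /var/lib and the published data under /srv):

```bash
# one 50GB cinder volume (shows up as e.g. /dev/xvdb) becomes a single PV/VG
pvcreate /dev/xvdb
vgcreate main /dev/xvdb

# two logical volumes rather than two partitions; keep some headroom in the VG
lvcreate -L 20G -n varlib main
lvcreate -L 25G -n srv main

mkfs.ext4 /dev/main/varlib
mkfs.ext4 /dev/main/srv

mount /dev/main/varlib /var/lib/<whatever>
mount /dev/main/srv    /srv/<whatever>
```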
pabelanger | fungi: okay, I created a 50GB SSD volume | 21:04 |
*** baoli_ has quit IRC | 21:04 | |
pabelanger | fungi: or I can go less | 21:04 |
pabelanger | we are only using about 10GB today | 21:04 |
fungi | pabelanger: for some reason i thought jeblair said 75gb was the smallest available size for ssd in rackspace | 21:05 |
jeblair | fungi: for sata | 21:05 |
fungi | ahh | 21:05 |
fungi | thought that was 100gb | 21:06 |
pabelanger | I am not sure what the smallest SSD is | 21:06 |
jeblair | the limit is smaller for ssd, but i don't recall | 21:06 |
jeblair | fungi: i did too, seems they lowered it? | 21:06 |
pabelanger | let me try 25GB | 21:06 |
fungi | bkero: oh, interesting. that's a strangely dysfunctional default whose memory i'd apparently repressed | 21:06 |
jeblair | (if i were to guess, i'd guess 50 for ssd) | 21:06 |
pabelanger | Invalid input received: 'size' parameter must be between 50 and 1024 (HTTP 400) (Request-ID: req-382a88ec-dfeb-44de-a84b-490cc92592ad) | 21:07 |
pabelanger | there we go | 21:07 |
pabelanger | 50 is the smallest | 21:07 |
pabelanger | jeblair wins! | 21:07 |
fungi | and just wait 'till you see what you've won! | 21:08 |
anteaya | a new car? | 21:11 |
bkero | Bees! | 21:15 |
fungi | a new car filled with bees | 21:20 |
bkero | http://www.theguardian.com/environment/2016/may/24/bee-swarm-clinging-to-car-boot-haverfordwest-wales | 21:20 |
bkero | Apparently that's what happens when a queen is accidentally trapped inside. | 21:20 |
fungi | (god save the queen?) | 21:21 |
pabelanger | okay, ready to mount the new volume on eavesdrop.o.o, but waiting until meetings are finished | 21:21 |
pabelanger | it would be cool to have a link on eavesdrop.o.o showing the current meetings in progress | 21:22 |
clarkb | I am going t owork on a new logstash.o.o shortly | 21:22 |
fungi | pabelanger: that's an awesome idea... i wonder if we should compute that from our schedule data or attempt to extract state info out of the bot | 21:24 |
fungi | leaning toward the former so that if there's no meeting we can see what/when the next meeting is supposed to be for each official channel | 21:24 |
pabelanger | fungi: parsing the bot would be interesting. But maybe we can use some JS magic to read the existing ical start times | 21:25 |
pabelanger | Looks like I have to wait a few hours for the mount | 21:27 |
pabelanger | I'll go grab some food and hang out with the family. I'll circle back later tonight to finish up | 21:28 |
fungi | awesome, thanks! | 21:29 |
clarkb | dropping TTL on logstash.o.o dns records now | 21:31 |
clarkb | current logstash.o.o is on a standard 2GB flavor, should I switch to performance or use the new "general" flavors | 21:38 |
clarkb | (anyone have an opinion? I lean towards using performance) | 21:38 |
fungi | i've confirmed that the ttl for paste.o.o is still at 5 minutes | 21:38 |
fungi | clarkb: i've been sticking with performance since it's what we've documented | 21:39 |
anteaya | clarkb: let's try performance | 21:39 |
clarkb | ok going to use performance 2GB which has the same memory and cpus | 21:39 |
anteaya | fungi: ah did we? | 21:39 |
fungi | anteaya: system-config launch/README example commands anyway | 21:40 |
fungi | i guess we're running a bit of a backlog in the check pipeline | 21:41 |
clarkb | just waiting on the change that tests the current logstash stuff on trusty. If that comes back green I'll start the boot | 21:56 |
anteaya | fungi: ah | 21:57 |
anteaya | fungi: yes, it appears we have a dearth of trusty nodes | 21:57 |
clarkb | hrm I might need to dig in and see if nodepool is happy | 22:27 |
clarkb | ENOTRUSTY | 22:27 |
clarkb | we have 450 ish used instances | 22:33 |
anteaya | that would make things unhappy | 22:33 |
anteaya | do we | 22:33 |
clarkb | which roughly lines up with our capacity if we are a couple clouds down | 22:34 |
* clarkb guesses bluebox and osic are still effectively off? | 22:34 | |
anteaya | I haven't heard anything to the contrary | 22:34 |
fungi | no, we got the fip cleanup patch online and working yesterday | 22:34 |
anteaya | ah wonderful | 22:35 |
fungi | though pabelanger spotted a bunch of stale used nodes not transitioning to delete i think? | 22:35 |
fungi | i wonder if that's an ongoing issue suddenly | 22:35 |
clarkb | fungi: the vast majority of instances in bluebox and osic are building | 22:36 |
fungi | oh, hrm | 22:36 |
clarkb | so I don't think its a state transition issue | 22:36 |
clarkb | in osic its 70:27 building:used | 22:36 |
clarkb | Forbidden: Quota exceeded for instances, ram: Requested 1, 8192, but already used 100, 819200 of 100, 819200 instances, ram (HTTP 403) (Request-ID: req-de1bc677-c2e6-45f1-b0f2-9ea776dd858b) | 22:37 |
clarkb | we do have 100 instances there, a bunch of them don't have fips | 22:39 |
clarkb | I wonder if we are just slowly going from building to ready due to fip situation | 22:39 |
clarkb | the rate for osic is set to .001 so that's not it. I wonder if we just have a really slow attach time period or a high fail rate? maybe racing against the cleanup cron | 22:42 |
clarkb | ok finally got the trusty tests to run, they passed so attempting to build new host now | 23:07 |
anteaya | yay | 23:08 |
pabelanger | fungi: clarkb: Ya, I cleaned up a bunch already today. Mostly because nodepoold was stopped for ~45mins | 23:12 |
pabelanger | must have been about 400 nodes leaked | 23:12 |
clarkb | OSError: [Errno 13] Permission denied: '/home/clarkb/.ansible/tmp' | 23:13 |
clarkb | am I doing this wrong? | 23:14 |
fungi | clarkb: you likely ran with sudo once and now you're running without? | 23:15 |
fungi | that happened to me once and so i blew away ~/.ansible | 23:15 |
clarkb | hrm ya | 23:16 |
* clarkb cleans up | 23:16 | |
fungi | then later discovered that there are a number of things still not working due to insufficient permissions if you try to launch-node.py as non-root anyway | 23:16 |
fungi | like hiera values not making it onto new servers | 23:16 |
clarkb | oh | 23:17 |
* clarkb prepares for it to fail again | 23:17 | |
clarkb | fungi: do I need to sudo -H it? | 23:17 |
fungi | clarkb: you probably also need to do something to keep sudo from filtering envvars if so? | 23:17 |
clarkb | oh ya | 23:18 |
fungi | i used an interactive root shell | 23:18 |
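The workaround being described, roughly (the checkout path and launch arguments are elided; see launch/README in system-config):

```bash
# clear the root-owned leftovers from the earlier sudo'd run
sudo rm -rf ~/.ansible

# use an interactive root shell so hiera values and environment survive
# (plain sudo filters the env, and non-root runs miss some permissions)
sudo -i

# now in the root shell, from the system-config checkout:
./launch/launch-node.py <server-name> ...
```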
clarkb | hrm puppet failed with exit code 6 but no logs printed | 23:25 |
clarkb | fungi: is ^ where you were saying you have to keep it and check syslog? | 23:26 |
fungi | yeah, add --keep | 23:27 |
fungi | and then ssh into the broken instance (if you can) and look in its syslog | 23:27 |
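Putting that together:

```bash
# keep the half-built server around on failure instead of auto-deleting it
./launch/launch-node.py <server-name> ... --keep

# then inspect the failed puppet run from the instance itself
ssh root@<new-instance-ip>
less /var/log/syslog    # the puppet output lands in syslog on ubuntu
```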
pabelanger | looks like meetings are done for a bit | 23:32 |
pabelanger | going to stop meetbot and move files to cinder volume | 23:33 |