fungi | ianw: cmurphy was heading up a bunch of our puppet 4 testing in infra so would help to get her input if possible | 00:08 |
clarkb | jeblair: fungi ianw ok http://paste.openstack.org/show/628906/ seems to work | 00:09 |
clarkb | annoyingly unbound is sys v | 00:10 |
* clarkb wonders what made the decision for sysv vs systemd unit in ubuntu | 00:10 | |
clarkb | we probably shouldn't be rebuilding any new services that require name based firewall rules until we figure this out? | 00:10 |
clarkb | or do we want to temporarily switch to ip addrs manually (thinking about zuul in particular) | 00:11 |
ianw | so that replaces its init.d script? | 00:16 |
clarkb | ianw: nothing, I think the init.d script is ignored entirely because systemd never tries to do a compatibility lookup as it already knows the name | 00:17 |
ianw | oh, ok, override | 00:18 |
clarkb | another option is to not override and just run our own unit later | 00:20 |
clarkb | so you'll get firewall early if no dns resolution necessary, then redundantly reapply rules | 00:20 |
clarkb | or you won't have firewall until unbound is running | 00:20 |
ianw | i feel like maybe it's better not to override, but not strongly | 00:21 |
clarkb | fungi: jeblair ^ do you have an opinion? | 00:22 |
clarkb | I'll test the supplemental-unit approach now | 00:22 |
fungi | our base rules give us access anyway without dns resolution, so waiting a little to grant anything beyond that seems fine (and safe) | 00:29 |
clarkb | well in this case the fallback behavior is no rules at all until we can apply the ones with names | 00:30 |
clarkb | so wide open :/ | 00:30 |
clarkb | I think I have supplemental working, I'll push up a patch we can look over | 00:31 |
fungi | k | 00:31 |
jeblair | maybe best to do the supplemental script as a stepping stone to automatic translation to ip addrs | 00:43 |
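The supplemental-unit idea discussed above could look roughly like the following. This is a hypothetical sketch only; the unit name, exec command, and dependency list are assumptions, not taken from the actual change (527821):

```ini
# /etc/systemd/system/iptables-supplemental.service (hypothetical name)
# Re-applies the full ruleset once the network and the local unbound
# resolver are up, so that hostname-based rules can resolve.
[Unit]
Description=Reapply iptables rules that need DNS resolution
After=network-online.target unbound.service
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/netfilter-persistent start

[Install]
WantedBy=multi-user.target
```

Because this unit orders after unbound rather than before networking, it closes the window where name-based rules fail to load, at the cost of the host being unprotected between boot and unbound startup.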
*** harlowja has quit IRC | 00:45 | |
clarkb | https://review.openstack.org/527821 is the change to do the supplemental service | 00:46 |
clarkb | I have not tested the puppet but have tested the unit | 00:46 |
clarkb | because that seems to be working on the es07 replacement I'm going to go ahead and bring it into the es cluster now | 00:47 |
clarkb | jeblair: fungi ianw if we decide to move ahead with 527821 we should watch zuulv3 closely and we can use one of the logstash-workers as a reboot tester | 00:47 |
clarkb | oh I didn't use the right topic on that change, will update | 01:03 |
clarkb | I'm going to delete the old es07 now | 01:12 |
clarkb | new one seems to be working | 01:12 |
clarkb | old es07 is deleted | 01:17 |
clarkb | I need to take a break and make dinner and stuff. I'll probably try to resume early tomorrow again to help frickler with the other es servers | 01:17 |
clarkb | ianw: thanks for review on the firewall thing | 01:18 |
pabelanger | just getting back now, not sure I have focus to rotate out volume for eavesdrop.o.o this evening | 01:30 |
pabelanger | I'll look in the morning for the next window between meetings | 01:30 |
*** jkilpatr has quit IRC | 02:00 | |
*** openstackstatus has quit IRC | 02:32 | |
*** openstackstatus has joined #openstack-sprint | 02:36 | |
*** barjavel.freenode.net sets mode: +v openstackstatus | 02:36 | |
*** skramaja has joined #openstack-sprint | 04:47 | |
ianw | frickler: progress! a real test -> http://logs.openstack.org/22/527822/5/check/legacy-puppet-beaker-rspec-infra/101b243/job-output.txt.gz#_2017-12-14_04_22_09_291878 | 05:06 |
ianw | so we did have the repo names wrong. i think we're getting very very close | 05:06 |
ianw | if ci comes back positive on 527822 i think the stack is good to go. more rspec tests would be welcome if you feel like it (connect to the service, etc) | 05:08 |
ianw | in terms of ethercalc migration, i looked into that. you just move the redis db in /var/lib/ ... it's safe to do at any time, but so people don't drop edits we should shut down apache before copying | 05:09 |
*** skramaja_ has joined #openstack-sprint | 08:56 | |
*** skramaja has quit IRC | 08:57 | |
*** skramaja has joined #openstack-sprint | 09:01 | |
*** skramaja_ has quit IRC | 09:01 | |
frickler | ianw: small issue on https://review.openstack.org/527144 I think, but I like the rspec stuff | 09:36 |
*** jkilpatr has joined #openstack-sprint | 10:46 | |
*** jkilpatr has quit IRC | 11:08 | |
ianw | frickler: cool, will look at tomorrow. hopefully that will be it! | 11:37 |
*** skramaja_ has joined #openstack-sprint | 12:35 | |
*** skramaja has quit IRC | 12:35 | |
*** skramaja has joined #openstack-sprint | 12:40 | |
*** skramaja_ has quit IRC | 12:40 | |
*** skramaja has quit IRC | 13:31 | |
clarkb | frickler: good morning/afternoon. I'm around if you want to go over the elasticsearch upgrade process | 14:06 |
clarkb | frickler: wrote it down at https://etherpad.openstack.org/p/elasticsearch-xenial-upgrade and went through it with elasticsearch07 so I think it is working. The one issue I found was related to the firewall. Details at https://review.openstack.org/#/c/527821/ | 14:08 |
*** baoli has joined #openstack-sprint | 14:35 | |
frickler | clarkb: I looked at your notes and they seem pretty clear to me, I can give them a go with es06, then | 14:43 |
clarkb | frickler: the only thing is we need to make sure the firewall is addressed first | 14:43 |
clarkb | I'd like 527821 in or something like it before we proceed just so that we can confirm that whatever the fix is is working | 14:44 |
clarkb | (it is important for elasticsearch because its api is fairly easy to abuse and there are no AAA features built into the open source version) | 14:44 |
frickler | clarkb: I'd like to use the new server to test that patch once more before merging it | 14:45 |
frickler | clarkb: or you can merge if you are confident enough and I'll test it implicitly | 14:45 |
clarkb | frickler: you mean you would manually run a puppet apply with that update applied after the initial install? | 14:46 |
clarkb | frickler: that's probably a good idea if you want to proceed with es06 | 14:46 |
frickler | clarkb: yes | 14:46 |
clarkb | ok lets do that then | 14:46 |
clarkb | just have to remember it is a reboot that we need to test (firewall rules should be in place after) | 14:48 |
frickler | clarkb: yes, launching node now | 14:50 |
fungi | curious to hear if that solves the race | 14:53 |
frickler | clarkb: I think you want to add a start action into the service definition, so that it doesn't need a reboot initially. waiting for results of the reboot now | 15:06 |
clarkb | frickler: well the regular iptables application should work the first time around. The problem we are trying to address is specifically that the system provided unit does not work on boot for us | 15:06 |
clarkb | so in this case I think it is ok to just have it be enabled for the next boot | 15:07 |
frickler | clarkb: on es06, there were no rules after the initial puppet run nor after running again with the patch. they were installed only after I explicitly started the new service | 15:08 |
frickler | clarkb: fungi: looking o.k. after a reboot | 15:09 |
frickler | so maybe need more investigation why the rules aren't enabled on the initial puppet run | 15:10 |
clarkb | frickler: that is a possibility. Though reading journald logs I'm fairly confident the main service is attempting to run on boot and failing | 15:12 |
clarkb | frickler: so guessing there is a separate issue with not enabling on first puppet run | 15:12 |
frickler | clarkb: but I guess I could continue with stopping things on the old es06 now, o.k.? | 15:13 |
clarkb | frickler: actually, you know what, we may not have noticed because launch node does a reboot so we will always get the rules applied that way | 15:13 |
clarkb | frickler: yes if iptables looks good to you after a reboot I think you can proceed with the rest of the process | 15:13 |
clarkb | frickler: for the disable shard allocation step that needs to be run from a node currently in the cluster | 15:15 |
clarkb | frickler: not sure if I made that clear (also you can run it multiple times safely) | 15:15 |
frickler | clarkb: I assumed that, yes, but might be worth another note | 15:16 |
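For reference, the disable-shard-allocation step being discussed follows the Elasticsearch cluster-settings API of that era. A hedged sketch; the endpoint URL and host are assumptions, and the actual call is shown as a comment rather than executed:

```python
import json

# Transient setting: cleared automatically on a full cluster restart.
payload = {"transient": {"cluster.routing.allocation.enable": "none"}}

# From a node already in the cluster (idempotent, safe to re-run):
#   curl -XPUT http://localhost:9200/_cluster/settings -d '<payload JSON>'
body = json.dumps(payload)
print(body)
```

Re-enabling afterwards is the same call with `"all"` instead of `"none"`.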
fungi | perhaps we just set elasticsearch to not start automatically, and so once we log in and manually start it the firewall rules have been long-since applied successfully? | 15:18 |
clarkb | fungi: it already does not start automatically | 15:19 |
clarkb | (this was intentional as joining cluster at improper time could cause disruption) | 15:19 |
frickler | whew, attaching the volume took a bit of time to execute for rax ... /me was already getting nervous about it not showing up ;) | 15:23 |
clarkb | the good news is we have a complete copy of the data on all of the other volumes too | 15:24 |
clarkb | er | 15:24 |
clarkb | it's not one complete copy on each of the other volumes, it is in aggregate one copy | 15:25 |
clarkb | but ya worst case we let elasticsearch recover itself | 15:25 |
frickler | clarkb: so just to confirm, for the "update DNS/firewalls" step, this is the same as for the worker earlier? i.e. add new rdns via cli, update existing records via webinterface, run fw-restart on all the nodes | 15:31 |
clarkb | frickler: correct, in this case you need to update the firewall on elasticsearch* | 15:31 |
clarkb | so that they can talk to each other | 15:32 |
frickler | clarkb: o.k., new node seems to be running fine. how long did the recovery take for you? | 15:49 |
frickler | given that we want to reboot the new nodes anyway before taking them into service, I'd be fine with merging https://review.openstack.org/#/c/527821/ then as is | 15:51 |
clarkb | frickler: I think recovery took a couple hours, but I ended up in a degraded state for longer than you did while I sorted out the firewall so hopefully yours has less time to recover | 15:52 |
clarkb | fungi: did you want to review the firewall change too? | 15:52 |
clarkb | (I think the more reviews on that one the better) | 15:52 |
fungi | clarkb: which one, 527821? i can take a look once tc office hour discussion winds down | 15:59 |
clarkb | fungi: yes that change | 16:01 |
clarkb | frickler: down to 2 initializing shards so I think it is close to being done | 16:20 |
frickler | clarkb: should I launch another node already? seems I could do another one and then either you take over or I continue with the remainder tomorrow my morning | 16:25 |
clarkb | frickler: once the cluster goes green you can do another one, but should wait until then | 16:32 |
frickler | clarkb: just went green, now relocating 2 shards. o.k. to also remove old es06 now? | 16:38 |
clarkb | frickler: yup | 16:39 |
frickler | clarkb: probably need to do that first, launching es05 hit a quota error | 16:39 |
clarkb | relocating shards are ok, if cluster is green you can take the next node offline (we have n+1 redundancy so can only handle one node outage when "green") | 16:39 |
clarkb | frickler: ya these are large nodes | 16:40 |
clarkb | they are essentially a giant in-memory database for searching billions of rows of free-form text data | 16:40 |
*** jkilpatr has joined #openstack-sprint | 16:51 | |
jeblair | https://review.openstack.org/527729 could use review -- the new paste server is in place, we should merge that before it reboots :) | 16:52 |
clarkb | jeblair: I think you may actually want network-online | 16:54 |
clarkb | jeblair: based on https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ | 16:54 |
jeblair | welp, if we're okay with a paste outage, i can try it :) | 16:57 |
jeblair | i know the current one works, and i copied that from other unit files we have | 16:57 |
clarkb | jeblair: test would be a reboot? | 16:57 |
jeblair | yep | 16:57 |
clarkb | I'm ok with that I think. The problem with After network.target is network may not be "up" according to the docs | 16:58 |
clarkb | so network.target works better as a Before | 16:58 |
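The distinction at issue here: per the freedesktop docs linked above, `After=network.target` only orders a unit relative to the startup of the network management stack, whereas `network-online.target` actively waits for connectivity. A minimal sketch of the ordering for a unit that genuinely needs the network up (unit-file fragment, names as in standard systemd):

```ini
[Unit]
# Wait for the network to actually be up, not merely for the
# network management service to have started:
After=network-online.target
Wants=network-online.target
```

The `Wants=` line matters: without it, `network-online.target` may not be pulled into the transaction at all, making the `After=` ordering a no-op.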
jeblair | ok | 16:58 |
frickler | clarkb: do we need to wait for shard relocation to finish, too, or will that become obsolete after the next replacement anyway? | 17:00 |
clarkb | frickler: you don't have to wait, shard relocation is just disk usage balancing. I expect that yes the upgrades will largely make that obsolete and it's the final state that we want to rebalance | 17:01 |
jeblair | ok that works too | 17:01 |
jeblair | clarkb: https://review.openstack.org/527729 updated | 17:03 |
clarkb | +2 | 17:03 |
clarkb | (I ended up reading far too much about this stuff when sorting out the firewall thing yesterday) | 17:04 |
jeblair | we should keep an eye out for other instances | 17:05 |
clarkb | ++ | 17:05 |
frickler | so I'll start to replace es05 now | 17:06 |
frickler | clarkb: I just noticed that the old es06 is still in state deleting, is that expected to take 30 minutes or longer? server id d1cc1c9e-0e3f-4663-aa8c-56a40873e387 | 17:11 |
clarkb | frickler: no, that is odd, but also not something on our end | 17:12 |
clarkb | frickler: we will have to watch it and if it persists I guess file a rax ticket for it | 17:13 |
clarkb | frickler: I wonder if a second delete request might unstick it? mordred might know I'll ask him over in -infra | 17:21 |
*** baoli has quit IRC | 17:27 | |
frickler | clarkb: seems I can still log into that node, maybe do a shutdown/poweroff? | 17:32 |
clarkb | frickler: my hunch is that it got lost somewhere on the cloud side and so the event never made it to the instance. We have had trouble with similar in the past which is why nodepool will retry deletes | 17:33 |
clarkb | frickler: we can retry the delete in a bit and if that doesn't work ya we can try the shutdown | 17:34 |
frickler | o.k., new es05 seems to be doing fine now, 12 shards initializing | 17:35 |
*** baoli has joined #openstack-sprint | 17:38 | |
fungi | clarkb: just to clarify earlier discussion on 527821, the upshot is that the instance briefly comes up with no packet filtering because iptables failed to load the ruleset due to missing dns resolution, and then shortly thereafter this reapplies the rules when name resolution is working and at that point we're covered? | 17:46 |
fungi | i wonder if there's some way to make it fail closed initially until dns works | 17:46 |
fungi | but regardless, we likely should acknowledge that the instances are briefly exposed/vulnerable for any services started before that point | 17:50 |
fungi | if that's what's happening | 17:50 |
clarkb | fungi: yes that upshot is correct | 17:55 |
clarkb | fungi: I haven't checked on a trusty node but I actually think that behavior of not having a firewall until some time later must not be a regression or it would've failed on trusty too | 17:55 |
clarkb | it's just that with the systemd change they are trying to clamp down on it much more and broke the dns-based rules | 17:56 |
frickler | having a failsafe mode that only allowed ssh instead of allowing everything still seems a good idea to me, but I'd say that that would be a followup project | 17:57 |
clarkb | ya, I also think that would require much more testing and I'm not sure it's really a regression, especially since ubuntu and debian don't firewall by default | 17:58 |
clarkb | the idea jeblair had was to have config management do the dns lookups and write ip addresses into the rules files so that the base service works | 17:58 |
clarkb | that will also address the concern | 17:58 |
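jeblair's idea could be sketched roughly like this. A hypothetical illustration only; the helper names are invented, though the rule template matches the rule format used elsewhere in this log:

```python
import socket

def resolve_v4(hostname):
    """Return the sorted IPv4 addresses a hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    return sorted({info[4][0] for info in infos})

def tcp_accept_rules(hostname, port):
    """Render one address-based ACCEPT rule per resolved address,
    so the persisted ruleset never needs DNS at load time."""
    template = ("-A openstack-INPUT -m state --state NEW -m tcp -p tcp "
                "-s {addr} --dport {port} -j ACCEPT")
    return [template.format(addr=addr, port=port)
            for addr in resolve_v4(hostname)]

for rule in tcp_accept_rules("localhost", 1234):
    print(rule)
```

The resolution happens when config management generates the rules file, so the boot-time iptables restore is purely address-based and no longer races unbound.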
jeblair | i'm going to poke at that real quick | 18:00 |
jeblair | the dnslookup thing | 18:00 |
pabelanger | so, just checking to see if we have any ongoing meetings right now | 18:01 |
fungi | ahh, right, i remember that suggestion now. yes, would be pretty cool | 18:02 |
pabelanger | if not, I have the time to migrate eavesdrop.o.o volume | 18:02 |
pabelanger | checking schedule again | 18:02 |
fungi | clarkb: i suppose it's not possible/easy to make unbound start before netfilter-persistent? | 18:02 |
clarkb | fungi: not with netfilter-persistent saying it must start before networking does. Unbound would be up but unable to resolve anything | 18:03 |
fungi | or, another alternative, possible to use remote resolvers until unbound starts? | 18:03 |
clarkb | fungi: that won't help either because the issue is netfilter-persistent is starting before networking | 18:03 |
fungi | ahh, right | 18:03 |
clarkb | and you can only append to the lists of dependencies with modifications to existing units | 18:03 |
clarkb | which is why I went with a different unit entirely | 18:04 |
fungi | basically we'd need [networking]>[unbound]>[netfilter-persistent] | 18:04 |
clarkb | fungi: yup | 18:04 |
pabelanger | looks like maybe a conflict for http://eavesdrop.openstack.org/#Group_Based_Policy_Team_Meeting | 18:04 |
pabelanger | confirming | 18:04 |
fungi | jeblair's idea is sounding better and better | 18:04 |
clarkb | so the idea with my change is it's a quick way to give us fairly good coverage of the problem so we don't have to halt upgrades | 18:05 |
pabelanger | okay, I don't see any traffic in meeting rooms, but we do have a project scheduled | 18:07 |
pabelanger | I think I can hold off another hour on migration to be safe | 18:08 |
clarkb | fridays tend to be very quiet as far as meetings go if today doesn't work | 18:09 |
clarkb | (but getting it done is probably best) | 18:10 |
*** baoli has quit IRC | 18:14 | |
*** baoli has joined #openstack-sprint | 18:16 | |
*** harlowja has joined #openstack-sprint | 18:29 | |
clarkb | frickler: just three more shards to go. Should I plan on deleting the old 05 node and continue with the others this afternoon or do you want to finish up tomorrow? | 18:38 |
frickler | clarkb: that's up to you, I think I could do 3+4 tomorrow and leave 2 for the grand finale, but if you want to continue today, that's fine for me, too, doesn't look like we'll run out of tasks soon ;) | 18:41 |
clarkb | ok | 18:45 |
clarkb | I'll probably continue with them today then as they take time and getting them done is worthwhile | 18:45 |
*** clayton has quit IRC | 18:46 | |
*** clayton has joined #openstack-sprint | 18:47 | |
frickler | o.k., good luck, I think I'm done for today anyway | 18:47 |
clarkb | frickler: ok I should delete the old 05 then? | 18:48 |
frickler | clarkb: sure, I'd have waited until the shards are done, but should be fine by now | 18:51 |
clarkb | oh I can wait just saying I won't wait on you to do it | 18:52 |
clarkb | you can go enjoy your evening | 18:52 |
jeblair | okay, i have a working impl of the dns thing. will push up patches shortly | 18:58 |
pabelanger | okay, i think we are clear on meetings at the moment | 19:13 |
pabelanger | any objections if I migrate eavesdrop volume? | 19:13 |
pabelanger | infra-root: ^ | 19:13 |
fungi | none from me | 19:13 |
fungi | i concur, no meetings seem to be running in official channels right now, at least | 19:14 |
clarkb | if no meetings then fine by me | 19:14 |
pabelanger | okay, will shutdown eavesdrop here in a moment | 19:14 |
pabelanger | then detach volume | 19:14 |
pabelanger | and reattach to eavesdrop01.o.o | 19:14 |
*** openstack has joined #openstack-sprint | 20:32 | |
*** ChanServ sets mode: +o openstack | 20:32 | |
jeblair | clarkb: thx, fixed | 20:33 |
pabelanger | okay, DNS updated | 20:34 |
pabelanger | I'm going try starting a meeting , then stop to confirm everything is working as expected | 20:34 |
pabelanger | suggestions for a project to test work? | 20:34 |
pabelanger | with* | 20:34 |
clarkb | jeblair: +2 thanks | 20:35 |
pabelanger | clarkb: are you okay with using infra to test startmeeting commands? | 20:36 |
clarkb | pabelanger: what about in here? | 20:37 |
clarkb | that keeps it on topic with the sprint | 20:37 |
pabelanger | clarkb: was going to use a meeting channel | 20:37 |
clarkb | oh you mean infra meeting? | 20:37 |
pabelanger | yah, just want to confirm bot works as expected on xenial | 20:38 |
clarkb | I think you can probably do something like #startmeeting test in here, or sure, one of the other meeting channels | 20:38 |
pabelanger | unless you are okay | 20:38 |
pabelanger | yah, test works | 20:38 |
clarkb | (I'd avoid adding meeting logs that aren't for proper meetings like infra) | 20:38 |
pabelanger | kk | 20:38 |
pabelanger | #startmeeting test | 20:38 |
openstack | Meeting started Thu Dec 14 20:38:39 2017 UTC and is due to finish in 60 minutes. The chair is pabelanger. Information about MeetBot at http://wiki.debian.org/MeetBot. | 20:38 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 20:38 |
*** openstack changes topic to " (Meeting topic: test)" | 20:38 | |
openstack | The meeting name has been set to 'test' | 20:38 |
pabelanger | thanks! | 20:38 |
pabelanger | #endmeeting | 20:38 |
*** openstack changes topic to "OpenStack Infra team Xenial upgrade sprint | Coordinating at https://etherpad.openstack.org/p/infra-sprint-xenial-upgrades" | 20:38 | |
openstack | Meeting ended Thu Dec 14 20:38:49 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:38 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/test/2017/test.2017-12-14-20.38.html | 20:38 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/test/2017/test.2017-12-14-20.38.txt | 20:38 |
openstack | Log: http://eavesdrop.openstack.org/meetings/test/2017/test.2017-12-14-20.38.log.html | 20:38 |
pabelanger | looks to be good | 20:38 |
fungi | pabelanger: you want to #status log something too? | 20:39 |
clarkb | that did what I expect | 20:39 |
pabelanger | fungi: yes | 20:39 |
*** pabelanger has quit IRC | 20:39 | |
*** pabelanger has joined #openstack-sprint | 20:39 | |
fungi | heh, weren't identified to nickserv? | 20:40 |
pabelanger | #status log eavesdrop01.o.o online and running xenial | 20:40 |
openstackstatus | pabelanger: finished logging | 20:40 |
pabelanger | woot | 20:40 |
pabelanger | https://wiki.openstack.org/wiki/Infrastructure_Status | 20:40 |
pabelanger | I'll keep eavesdrop.o.o online for a day or so, just to be safe | 20:41 |
pabelanger | but things look good | 20:41 |
fungi | jeblair: commented on 528043, wondering if you tested that syntax for udp | 20:42 |
fungi | (i honestly don't know whether that's valid or not) | 20:42 |
pabelanger | okay, I'll try cacti02.o.o now | 20:44 |
pabelanger | actually, let me confirm backups | 20:47 |
fungi | pabelanger: we probably want to avoid keeping the old eavesdrop server around for very long unless we do something to prevent the bots from getting started if the server gets rebooted | 20:49 |
jeblair | fungi: nope, missed that, thanks | 20:49 |
jeblair | fungi: so we should just stick an <if> in there i think | 20:50 |
fungi | yeah, that seems easiest | 20:50 |
fungi | use "-m state --state NEW" with tcp and omit it for anything else | 20:51 |
pabelanger | fungi: right, let me shutdown to be safe | 20:52 |
jeblair | -A openstack-INPUT -m state --state NEW -m tcp -p tcp -s 23.253.235.223 --dport 1234 -j ACCEPT | 20:52 |
jeblair | -A openstack-INPUT -m udp -p udp -s 104.239.137.167 --dport 1234 -j ACCEPT | 20:52 |
jeblair | fungi: how's that look? | 20:52 |
fungi | lgtm! | 20:53 |
jeblair | ok updated | 20:54 |
jeblair | remote: https://review.openstack.org/528043 Add support for resolving hostnames in rules | 20:54 |
fungi | thanks, +2 | 20:55 |
clarkb | I've also +2'd it but not approved as looks like we haven't decided yet to approve the stack? | 20:56 |
jeblair | i can help shepherd if we want to go for it | 20:56 |
clarkb | I'm in favor | 20:57 |
jeblair | we should get a +2 from fungi on 528045 | 20:57 |
fungi | i would volunteer to do it myself but i have to disappear for christine's office holiday party soon and so won't be around | 20:57 |
jeblair | which just happened | 20:57 |
fungi | just to confirm, this is piloting for graphite but then we can incrementally switch other classes over as we get to them? | 20:58 |
jeblair | fungi: yep | 20:58 |
fungi | i'm in favor | 20:58 |
jeblair | ok, i'll start the +Ws | 20:58 |
clarkb | ya I'll push up a change for elasticsearch stuff soon | 20:59 |
pabelanger | yah, looks good firewall changes | 21:00 |
fungi | i may not be around to review that, but sounds good regardless | 21:00 |
fungi | disappearing in another hour or so | 21:00 |
clarkb | jeblair: oh you know what | 21:01 |
clarkb | elasticsearch needs a range of ports | 21:01 |
* clarkb checks if that is compatible with this change (should be I guess?) | 21:01 | |
fungi | yeah, should work fine | 21:01 |
fungi | as long as the destination port is just treated as a string | 21:01 |
clarkb | ya I can just set port to 9200:9400 | 21:02 |
fungi | simply put in XXX:YYY | 21:02 |
fungi | right | 21:02 |
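The resulting rule for such a port range would look something like this hypothetical iptables-save fragment (the source address is a placeholder from the documentation range, not a real host):

```
-A openstack-INPUT -m state --state NEW -m tcp -p tcp -s 192.0.2.10 --dport 9200:9400 -j ACCEPT
```

iptables treats `--dport low:high` as an inclusive range, so the 9200:9400 span covers both the elasticsearch HTTP (9200) and transport (9300) ports used for inter-node traffic.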
*** openstack has joined #openstack-sprint | 21:12 | |
*** ChanServ sets mode: +o openstack | 21:12 | |
clarkb | jeblair: fungi pabelanger https://review.openstack.org/#/c/528087/1 that's the first step in converting elasticsearch to this I think | 21:18 |
pabelanger | I'm just looking at design-summit-prep.o.o, is that server needed still? I'm not sure how it works | 21:20 |
clarkb | fungi: ^ do you recall? I want to say its a ttx thing | 21:20 |
pabelanger | yah, I see ttx name in some apache configs | 21:21 |
pabelanger | seems to host https://github.com/ttx/summitsched | 21:22 |
pabelanger | is that something we should move to openstack-infra repo? | 21:22 |
pabelanger | maybe not | 21:22 |
pabelanger | summitsched - A proxy to edit the OpenStack Design Summit sched.org | 21:22 |
clarkb | pabelanger: I think we ask ttx and possibly just delete it tomorrow after confirmation from ttx | 21:23 |
fungi | pabelanger: we're not really managing that | 21:23 |
fungi | it was more that ttx needed a sandbox to stand up something on short notice | 21:23 |
pabelanger | kk | 21:24 |
pabelanger | will follow up | 21:24 |
clarkb | jeblair: are we just waiting on CI for the iptables changes to merge now? | 21:40 |
jeblair | clarkb: ya | 21:40 |
jeblair | clarkb: looks like a job ahead of it has gone kaput | 21:41 |
clarkb | ok, just making sure I can't help with anything else | 21:41 |
clarkb | pabelanger care to review https://review.openstack.org/#/c/528087/ | 21:42 |
jeblair | i'm going to push commit message updates | 21:42 |
pabelanger | looking | 21:42 |
clarkb | oh my change actually conflicts with jeblair's so I need to rebase | 21:42 |
clarkb | pabelanger: ^ so one sec | 21:42 |
* clarkb waits for new commit messages | 21:43 | |
pabelanger | ah | 21:43 |
pabelanger | -W | 21:43 |
jeblair | clarkb: go for it; i popped 528038 which was stuck and reapproved. i don't think i need to touch the iptables ones | 21:43 |
clarkb | jeblair: ok | 21:43 |
clarkb | pabelanger: jeblair ok rebased | 21:44 |
clarkb | working on updating the rules now | 21:44 |
pabelanger | +3 | 21:50 |
clarkb | https://review.openstack.org/528101 and https://review.openstack.org/528097 do the config change for elasticsearch and logstash | 22:11 |
dmsimard | FYI I'm picking up logstash-worker14 through 16 which I had not yet completed | 22:11 |
pabelanger | clarkb: looks good | 22:13 |
*** rwsu has quit IRC | 22:26 | |
pabelanger | giving cacti02.o.o another try now | 22:32 |
pabelanger | okay, booted that time | 22:39 |
*** rwsu has joined #openstack-sprint | 22:39 | |
jeblair | 2/3 iptables changes landed. i'm going to verify it doesn't change the firewall on graphite.o.o first, then approve the third | 22:44 |
pabelanger | okay, cacti02.o.o looks okay. I'm going to proceed with volume migration from cacti01.o.o | 22:45 |
jeblair | iptables rules are unchanged on graphite; approving 528045 | 22:46 |
clarkb | jeblair: sounds good | 22:46 |
pabelanger | okay, I've stopped apache2 on cacti01.o.o and placed it in emergency file | 22:48 |
pabelanger | going to comment out crontab to stop cacti from running | 22:48 |
pabelanger | volume attached, rebooting server to confirm | 22:55 |
*** baoli has quit IRC | 22:57 | |
ianw | if i could get a review on https://review.openstack.org/526978 (update puppet nodejs) i think ethercalc is ready to go | 23:00 |
ianw | i'm going to try codesearch now too, since it's back puppeting | 23:00 |
clarkb | ianw: I think you have the reviews you need, just a matter of approving when you can watch the js things | 23:00 |
pabelanger | okay, cacti02.o.o online, I've updated the database info manually. For some reason it doesn't look to be under control of puppet | 23:03 |
pabelanger | I can see crontab running now | 23:04 |
ianw | if we could look at https://review.openstack.org/528120 (sysconfig xenial update for codesearch) that would help | 23:04 |
pabelanger | I'll open firewall for cacti02 now | 23:05 |
ianw | thanks! | 23:07 |
ianw | frickler: as penance for that typo, i added a check for it in the rspec test :) | 23:08 |
pabelanger | remote: https://review.openstack.org/528122 Add cacti02.o.o to all snmp iptables rules | 23:11 |
pabelanger | clarkb: ianw: ^ | 23:11 |
pabelanger | I think we can also update that to new firewalls shortly | 23:11 |
pabelanger | http://cacti02.openstack.org/cacti/graph.php?local_graph_id=2374&rra_id=all works | 23:12 |
pabelanger | but toplevel page doesn't yet | 23:13 |
pabelanger | must be populating still | 23:13 |
pabelanger | okay, I've updated dns to cacti.o.o to cacti02.o.o, and poweroff cacti01 | 23:20 |
*** baoli has joined #openstack-sprint | 23:23 | |
*** baoli has quit IRC | 23:28 | |
ianw | pabelanger: https://review.openstack.org/528127 is probably of interest. removes npm mirroring bits. noticed because i don't think we want to bother updating mirror-update to use the new puppet-nodejs | 23:30 |
pabelanger | ianw: ah, ya. good call | 23:30 |
pabelanger | +2 | 23:31 |
*** harlowja has quit IRC | 23:32 | |
*** harlowja has joined #openstack-sprint | 23:39 | |
jeblair | pabelanger: it shouldn't need to populate anything -- should just be using the same database | 23:39 |
*** harlowja has quit IRC | 23:40 | |
*** jkilpatr has quit IRC | 23:40 | |
*** harlowja has joined #openstack-sprint | 23:42 | |
clarkb | ianw: looks like the npmrc thing was already not in use? | 23:42 |
clarkb | I can't find any reference to that template in that repo | 23:42 |
pabelanger | jeblair: yah, I see in the logs it finds the database, but trying to see why the tree isn't populated yet | 23:43 |
clarkb | ianw: I've approved the npm mirror cleanup | 23:43 |
pabelanger | both list view and preview seem to work | 23:43 |
jeblair | pabelanger: look at the page source, the tree is there | 23:43 |
jeblair | it looks like some jquery files are 404ing | 23:43 |
pabelanger | ah, okay | 23:43 |
pabelanger | cool | 23:43 |
pabelanger | I can look why in a min | 23:44 |
jeblair | http://cacti02.openstack.org/javascript/jquery/jquery.min.js | 23:44 |
jeblair | iptables 3/3 merged, i'm going to poke at that now | 23:45 |
jeblair | oh wow, it's already run and updated | 23:45 |
ianw | clarkb: thanks, i'll keep an eye as i don't think mirror-update has puppeted 100% successfully in a while, should be fine as i guess it just skipped the npm bits | 23:45 |
jeblair | old and busted: http://paste.openstack.org/raw/628994/ | 23:46 |
jeblair | new hotness: http://paste.openstack.org/raw/628995/ | 23:46 |
ianw | clarkb: the other one is etherpad. i noticed it doesn't have an acceptance test ... using what i've learnt i'll add one to try deploying on xenial, and we can start with that ... | 23:47 |
jeblair | the iptables stuff seems sane. i think we're good to go there. | 23:47 |
*** harlowja has quit IRC | 23:50 | |
clarkb | jeblair: pabelanger ianw https://review.openstack.org/528101 should pass testing now, they are ready for review if the graphite rules came out good | 23:50 |
clarkb | oh jeblair just said it looks sane so ya I think we are ready to review and possibly merge that stack | 23:50 |
jeblair | clarkb: oh thanks, i was just looking into that error, will refresh | 23:51 |
jeblair | clarkb: that seems like a layer violation but i'm fine not fixing it now :) | 23:52 |
clarkb | jeblair: ya it is preexisting... | 23:52 |
jeblair | (the reference of elasticsearch_nodes from within logstash_worker) | 23:52 |
jeblair | +2 | 23:53 |
jeblair | and i +3d the parent | 23:53 |
clarkb | thanks | 23:54 |
jeblair | pabelanger: be sure to update the base firewall rule for the new cacti server | 23:55 |
jeblair | it looks like it's not getting any data from any hosts right now | 23:55 |
pabelanger | jeblair: yah, 528122 should do that | 23:56 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!