*** jcooley_ has joined #tripleo | 00:03 | |
*** jcooley_ has quit IRC | 00:08 | |
*** ccrouch has quit IRC | 00:16 | |
*** CaptTofu has joined #tripleo | 00:18 | |
*** noslzzp has joined #tripleo | 00:19 | |
*** sdake has joined #tripleo | 00:22 | |
*** sdake has joined #tripleo | 00:22 | |
*** sdake has quit IRC | 00:25 | |
*** UtahDave has quit IRC | 00:26 | |
*** jcooley_ has joined #tripleo | 00:30 | |
*** CaptTofu has quit IRC | 00:40 | |
*** CaptTofu has joined #tripleo | 00:40 | |
*** CaptTofu has quit IRC | 00:45 | |
*** spzala has quit IRC | 00:51 | |
*** newell has quit IRC | 01:04 | |
lifeless | crickets | 01:17 |
*** CaptTofu has joined #tripleo | 01:23 | |
devananda | lol | 01:34 |
clarkb | lifeless: isn't it your weekend? :P | 01:35 |
devananda | hey lifeless, got a few minutes? | 01:35 |
clarkb | we have a long weekend this side of the planet | 01:35 |
devananda | clarkb: what's monday again? | 01:35 |
clarkb | devananda: mlk jr day | 01:35 |
devananda | ah right | 01:35 |
derekh | lifeless: got a node running the jjb stuff again, now rebuilding the TE before I clock off | 01:49 |
*** taps has quit IRC | 01:57 | |
*** noslzzp has quit IRC | 02:08 | |
*** noslzzp has joined #tripleo | 02:09 | |
*** CaptTofu has quit IRC | 02:24 | |
*** coolsvap_away has joined #tripleo | 02:52 | |
openstackgerrit | Derek Higgins proposed a change to openstack-infra/tripleo-ci: Switch test environment users https://review.openstack.org/67614 | 02:53 |
*** coolsvap has quit IRC | 02:54 | |
derekh | lifeless: looks like the TE host on the baremetal cloud can't contact the broker, have updated the etherpad with more up-to-date notes and will pick it back up tomorrow | 02:55 |
derekh | root@testenv-testenvconfig-lieghn64l4vq:/home/heat-admin# ping 192.168.1.1 | 02:56 |
derekh | 13 packets transmitted, 0 received, +7 errors, 100% packet loss, time 12065ms | 02:56 |
*** AaronGr is now known as AaronGr_Zzz | 02:57 | |
*** derekh has quit IRC | 02:59 | |
*** coolsvap_away has quit IRC | 03:15 | |
*** sdake has joined #tripleo | 03:28 | |
*** hewbrocca has quit IRC | 03:54 | |
*** hewbrocca has joined #tripleo | 03:54 | |
*** CaptTofu has joined #tripleo | 04:42 | |
*** CaptTofu has quit IRC | 05:15 | |
*** noslzzp has quit IRC | 05:17 | |
*** rushiagr has joined #tripleo | 05:24 | |
*** vkozhukalov has joined #tripleo | 05:30 | |
openstackgerrit | Tzu-Mainn Chen proposed a change to openstack/tuskar-ui: Add unstyled overcloud resource category page https://review.openstack.org/67632 | 05:59 |
*** rushiagr is now known as rushiagr_away | 06:01 | |
*** rushiagr_away is now known as rushiagr | 06:11 | |
*** ccrouch has joined #tripleo | 06:18 | |
*** ccrouch has quit IRC | 06:23 | |
*** tzumainn has quit IRC | 06:32 | |
*** CaptTofu has joined #tripleo | 07:16 | |
*** boris-42 has joined #tripleo | 07:16 | |
*** akuznetsov has quit IRC | 07:19 | |
*** rwsu has quit IRC | 07:19 | |
*** CaptTofu has quit IRC | 07:20 | |
*** rushiagr has joined #tripleo | 07:32 | |
*** akuznetsov has joined #tripleo | 07:50 | |
*** akuznetsov has quit IRC | 08:32 | |
*** akuznetsov has joined #tripleo | 09:10 | |
*** CaptTofu has joined #tripleo | 09:16 | |
*** CaptTofu has quit IRC | 09:21 | |
*** akuznetsov has quit IRC | 09:47 | |
*** hewbrocc` has joined #tripleo | 09:59 | |
*** uvirtbot has quit IRC | 10:03 | |
*** hewbrocca has quit IRC | 10:04 | |
*** boris-42 has quit IRC | 10:06 | |
*** e0ne_ has joined #tripleo | 10:10 | |
*** sdake has quit IRC | 10:11 | |
*** e0ne has quit IRC | 10:12 | |
*** akuznetsov has joined #tripleo | 10:12 | |
*** sdake has joined #tripleo | 10:14 | |
*** jcooley_ has quit IRC | 10:14 | |
*** boris-42 has joined #tripleo | 10:34 | |
*** derekh has joined #tripleo | 11:00 | |
*** akuznetsov has quit IRC | 11:15 | |
*** CaptTofu has joined #tripleo | 11:18 | |
*** CaptTofu has quit IRC | 11:22 | |
*** derekh has quit IRC | 11:44 | |
*** derekh has joined #tripleo | 11:50 | |
*** rbrady has joined #tripleo | 11:51 | |
*** akuznetsov has joined #tripleo | 11:52 | |
*** rushiagr has quit IRC | 12:08 | |
*** rushiagr has joined #tripleo | 12:11 | |
*** derekh has quit IRC | 12:16 | |
*** rushiagr2 has joined #tripleo | 12:18 | |
*** rushiagr has quit IRC | 12:21 | |
*** rushiagr2 has quit IRC | 12:41 | |
*** e0ne_ has quit IRC | 13:04 | |
*** e0ne has joined #tripleo | 13:04 | |
*** CaptTofu has joined #tripleo | 13:18 | |
*** CaptTofu has quit IRC | 13:21 | |
*** CaptTofu has joined #tripleo | 13:22 | |
*** derekh has joined #tripleo | 13:26 | |
*** e0ne has quit IRC | 13:33 | |
*** derekh has quit IRC | 13:34 | |
*** e0ne has joined #tripleo | 14:09 | |
*** e0ne has quit IRC | 14:13 | |
*** e0ne has joined #tripleo | 14:22 | |
*** e0ne has quit IRC | 14:25 | |
*** ccrouch has joined #tripleo | 14:48 | |
*** vkozhukalov has quit IRC | 15:01 | |
*** e0ne has joined #tripleo | 15:02 | |
*** e0ne has quit IRC | 15:06 | |
*** CaptTofu has quit IRC | 15:36 | |
*** CaptTofu has joined #tripleo | 15:37 | |
*** CaptTofu has quit IRC | 15:42 | |
*** vkozhukalov has joined #tripleo | 15:43 | |
*** derekh has joined #tripleo | 15:47 | |
*** lynxman has quit IRC | 15:49 | |
*** mordred has quit IRC | 15:49 | |
*** slagle has quit IRC | 15:49 | |
*** rpodolyaka has quit IRC | 15:49 | |
*** lynxman has joined #tripleo | 15:52 | |
*** mordred has joined #tripleo | 15:52 | |
*** slagle has joined #tripleo | 15:52 | |
*** rpodolyaka has joined #tripleo | 15:52 | |
*** lynxman has quit IRC | 15:56 | |
*** mordred has quit IRC | 15:56 | |
*** slagle has quit IRC | 15:56 | |
*** rpodolyaka has quit IRC | 15:56 | |
*** lynxman has joined #tripleo | 15:58 | |
*** mordred has joined #tripleo | 15:58 | |
*** slagle has joined #tripleo | 15:58 | |
*** rpodolyaka has joined #tripleo | 15:58 | |
*** akuznetsov has quit IRC | 16:40 | |
*** derekh has quit IRC | 16:45 | |
*** panda has joined #tripleo | 17:29 | |
*** rushiagr has joined #tripleo | 17:31 | |
*** panda__ has quit IRC | 17:32 | |
*** CaptTofu has joined #tripleo | 17:37 | |
*** CaptTofu has quit IRC | 17:42 | |
*** akuznetsov has joined #tripleo | 17:47 | |
*** marun has joined #tripleo | 18:01 | |
*** marun has quit IRC | 18:06 | |
*** jrist has quit IRC | 18:19 | |
*** UtahDave has joined #tripleo | 18:33 | |
*** jrist has joined #tripleo | 18:33 | |
*** akuznetsov has quit IRC | 18:51 | |
*** noslzzp has joined #tripleo | 19:10 | |
*** CaptTofu has joined #tripleo | 19:11 | |
lifeless | o/ | 19:21 |
lifeless | more crickets! | 19:21 |
lifeless | to the theme of 'more cowbell!' | 19:21 |
*** CaptTofu has quit IRC | 19:57 | |
*** taps has joined #tripleo | 19:58 | |
SpamapS | I got a fever | 20:21 |
*** UtahDave has quit IRC | 20:30 | |
*** akuznetsov has joined #tripleo | 20:35 | |
lifeless | I think we're going to have to debug this network performance thing asap | 20:39 |
lifeless | 12kBps is too slow | 20:39 |
phschwartz | What type of network issue? I have some cycles while testing of a new release is going on here, and I can take a look | 20:39 |
lifeless | phschwartz: on ci-overcloud.tripleo.org, which is a regular cd-overcloud just a different name, so we have a stable base for infra to run in | 20:40 |
lifeless | phschwartz: instances are getting 12kbps from the internet | 20:40 |
lifeless | phschwartz: gre overlay network | 20:40 |
phschwartz | gre+ovs I take it | 20:40 |
lifeless | yah | 20:41 |
phschwartz | I have had this issue a few times. What version of ovs is installed? | 20:41 |
lifeless | ml2 driver | 20:41 |
lifeless | I just tried clamping the mtu of an instance down, no discernible effect | 20:41 |
*** akuznetsov has quit IRC | 20:41 | |
lifeless | let me log into the plumbing and I'll answer the ovs question | 20:41 |
lifeless | phschwartz: ovs-vsctl --version | 20:42 |
lifeless | ovs-vsctl (Open vSwitch) 1.10.2 | 20:42 |
lifeless | Compiled Sep 23 2013 14:53:13 | 20:42 |
lifeless | on the network node | 20:42 |
phschwartz | No, that won't do it. One of the older ovs installs had an issue with gre networking that caused its in-memory datastore for ovs+gre routing to eat cpu and ram. It would clean the ram, but would leave cpu usage high, causing a reduction in traffic routing compute, which in turn slows down throughput | 20:43 |
lifeless | which is able to pull 20MB/s from the host I was testing against | 20:43 |
phschwartz | Let me check to see if that is the version with the issue or not | 20:43 |
lifeless | same version on the compute node | 20:43 |
phschwartz | ok, that is the one I had the issue with that over time would have the same problem. I was running the default ubuntu installed 1.10.2 on 13.04. | 20:45 |
lifeless | 1.10.2 is bad? | 20:45 |
phschwartz | Compiling for my local 1.11.0 fixed the issue. The other thing that helped was moving from using the python wrapper for root commands | 20:45 |
phschwartz | I found it to be with gre | 20:46 |
phschwartz | Works good with nvp | 20:46 |
lifeless | ok, that's super useful. Thanks! | 20:46 |
lifeless | phschwartz: is nvp open source? | 20:46 |
phschwartz | I found this before I started with Rax, but I think someone in Rax found the same, as they moved to nvp and, as far as I know, still run 1.10.2 | 20:46 |
phschwartz | no, it is not. | 20:46 |
lifeless | ah :) | 20:46 |
lifeless | ok, so we need to replace the openvswitch packages too | 20:47 |
lifeless | I can see us just building everything from scratch :/ | 20:47 |
phschwartz | That was what I did for the fix. Built my own and made a local repo for install | 20:47 |
phschwartz | I need to look at the bug backlog and work on a few when I have time like this. Haven't had much time lately. | 20:47 |
lifeless | I'm going to poke deeper on this, as there isn't a CPU problem today, just a throughput problem | 20:49 |
phschwartz | I think what I found on the ovs mailing lists when I had the issue was that it would eat ram and cpu, then kill the gre threads, and it would severely limit them when it respawns them and that is why it has the issue. | 20:50 |
lifeless | yeah but this is right from first vm on cloud ever | 20:50 |
lifeless | it would have to eat them spectacularly fast... | 20:50 |
phschwartz | I found it to happen very fast | 20:52 |
lifeless | ok | 20:52 |
lifeless | order of minutes? | 20:52 |
phschwartz | If he is on, kbringard in #openstack had the same issue in the ovs+gre setup that at&t was using and helped me locate the issue. He might have more in depth info still. | 20:53 |
phschwartz | yes, a matter of minutes | 20:53 |
lifeless | ok, cool | 20:53 |
lifeless | so - replacing the package version is going to be a little tricky right now, but will dig into it | 20:53 |
phschwartz | I would get the slow down starting within 2-5 min of quantum bringing up networking as a whole for my env. | 20:53 |
phschwartz | It will be in this case. I had the benefit of a small cluster at the time with no impact of stopping to redo it. | 20:54 |
lifeless | so one thing that's odd | 20:54 |
lifeless | when I wget from the instance to the world - slow | 20:54 |
lifeless | when I rsync up the same content from my home to the instance - fast | 20:54 |
lifeless | phschwartz: would restarting openvswitch temporarily fix things? | 20:55 |
phschwartz | defn the same issue that I had then. It was slowness in the computing of routing in the ovs namespaces that were using gre. | 20:55 |
phschwartz | That would work sometimes, but usually needed a host reboot. | 20:55 |
lifeless | righto, from the ip router netns I get 160Mbps of throughput to a static file in the UK | 20:56 |
lifeless | which isn't brilliant but is tolerable | 20:57 |
phschwartz | I would see hits not even to the net, but between external networks in the datacenter, where I would get 50-60kbps, and the core network for the env was 160gb and the clusters interconnect was 8 10g ports aggregated with 2 10g aggregated on each host. | 21:02 |
phschwartz | You can never be 100% positive, but defn sounds like the same issue I was having | 21:02 |
phschwartz | When I would get rid of namespacing it would improve, but that defeats the purpose | 21:03 |
lifeless | hmmm | 21:03 |
lifeless | trusty has 2.0 | 21:03 |
lifeless | that might be easier | 21:03 |
phschwartz | here is a mailing list thread from OS where someone had the same issue. http://lists.openstack.org/pipermail/openstack/2013-October/002265.html | 21:04 |
phschwartz | In their case, the only fix was setting up a proxy to get around the issue with the gre namespacing | 21:04 |
lifeless | yah | 21:05 |
lifeless | family time, shall dig in in detail this evening | 21:06 |
lifeless | thanks for the pointers | 21:06 |
phschwartz | Just had a network eng from LexisNexis (where I used to work) remind me that we also had to turn GRO off on the hardware side as the offloading made the problem happen a lot faster. | 21:06 |
phschwartz | no problem at all | 21:06 |
*** julim has quit IRC | 21:16 | |
*** taps has quit IRC | 21:19 | |
*** boris-42 has quit IRC | 21:19 | |
*** jhurlbert has quit IRC | 21:19 | |
*** sgrasley has quit IRC | 21:19 | |
*** taps has joined #tripleo | 21:20 | |
*** boris-42 has joined #tripleo | 21:20 | |
*** jhurlbert has joined #tripleo | 21:20 | |
*** sgrasley has joined #tripleo | 21:20 | |
*** boris-42 has quit IRC | 21:21 | |
*** boris-42 has joined #tripleo | 21:21 | |
*** d0ugal has joined #tripleo | 21:26 | |
*** derekh has joined #tripleo | 21:41 | |
*** vkozhukalov has quit IRC | 21:43 | |
*** rushiagr has quit IRC | 21:50 | |
*** taps has quit IRC | 22:35 | |
*** ccrouch has quit IRC | 22:37 | |
*** cody-somerville has quit IRC | 23:00 | |
*** e0ne has joined #tripleo | 23:00 | |
*** e0ne has quit IRC | 23:07 | |
*** e0ne has joined #tripleo | 23:20 | |
*** akuznetsov has joined #tripleo | 23:25 | |
derekh | need a bigger VM - Out of memory | 23:27 |
*** cody-somerville has joined #tripleo | 23:35 | |
*** e0ne has quit IRC | 23:46 |
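
The throughput figures quoted between 20:39 and 20:57 mix units: "12kBps" and "12kbps" for the instance, "20MB/s" for the network node, and "160Mbps" from the router netns. The sketch below puts them on a common bits-per-second scale to show how large the gap is under either reading of the instance figure. It assumes decimal (SI) prefixes and that "kBps"/"MB/s" mean bytes while "kbps"/"Mbps" mean bits; the log does not confirm either, and the names `to_bps` and `slowdown` are illustrative only.

```python
# Conversion factors to bits per second.
# Assumption: decimal (SI) prefixes, 8 bits per byte.
BITS_PER_SECOND = {
    "kbps": 1_000,
    "Mbps": 1_000_000,
    "kB/s": 8 * 1_000,
    "MB/s": 8 * 1_000_000,
}


def to_bps(value, unit):
    """Convert a throughput reading to bits per second."""
    return value * BITS_PER_SECOND[unit]


def slowdown(fast_bps, slow_bps):
    """Return how many times slower the slow path is than the fast path."""
    return fast_bps / slow_bps


# Readings quoted in the log.
router_netns = to_bps(160, "Mbps")      # from the ip router netns
network_node = to_bps(20, "MB/s")       # from the network node itself
instance_as_bytes = to_bps(12, "kB/s")  # if "12kBps" meant kilobytes/s
instance_as_bits = to_bps(12, "kbps")   # if "12kbps" meant kilobits/s

if __name__ == "__main__":
    print("router netns :", router_netns, "bit/s")
    print("network node :", network_node, "bit/s")
    print("slowdown if instance is 12 kB/s:",
          round(slowdown(router_netns, instance_as_bytes)))
    print("slowdown if instance is 12 kbps:",
          round(slowdown(router_netns, instance_as_bits)))
```

Under these assumptions the network node and router netns readings come out to the same rate, and the instance is three to four orders of magnitude slower whichever unit was meant, which is consistent with the problem sitting on the instance side of the GRE overlay rather than on the uplink.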