Wednesday, 2013-12-04

*** mlavalle has quit IRC00:01
*** bvandenh has quit IRC00:09
*** reaper has quit IRC00:10
*** SumitNaiksatam has quit IRC00:17
*** nati_ueno has joined #openstack-neutron00:18
*** nati_uen_ has quit IRC00:21
openstackgerritDane LeBlanc proposed a change to openstack/neutron: Improve unit test coverage for Cisco plugin model code  https://review.openstack.org/5812500:30
*** matsuhashi has joined #openstack-neutron00:31
*** nati_ueno has quit IRC00:33
*** carl_baldwin has quit IRC00:35
*** aymenfrikha has left #openstack-neutron00:39
*** balar has joined #openstack-neutron00:41
*** openstack has joined #openstack-neutron00:46
*** salv-orlando has quit IRC00:53
openstackgerritSalvatore Orlando proposed a change to openstack/neutron: Test commit for testing parallel job in experimental queue  https://review.openstack.org/5742000:55
*** openstackgerrit has quit IRC00:56
*** openstackgerrit has joined #openstack-neutron00:56
*** dims has quit IRC01:04
*** Abhishek has quit IRC01:10
*** unicell has joined #openstack-neutron01:12
*** dims has joined #openstack-neutron01:18
*** wcaban has quit IRC01:21
*** julim has quit IRC01:22
*** nati_ueno has joined #openstack-neutron01:26
*** nati_ueno has quit IRC01:32
*** nati_ueno has joined #openstack-neutron01:32
*** nati_ueno has quit IRC01:33
*** nati_ueno has joined #openstack-neutron01:33
*** banix has joined #openstack-neutron01:43
*** Abhishek has joined #openstack-neutron01:51
*** Abhishek has quit IRC02:00
*** dzyu has joined #openstack-neutron02:00
*** banix has quit IRC02:04
*** banix has joined #openstack-neutron02:06
*** dzyu has quit IRC02:09
*** gdubreui has quit IRC02:11
*** Abhishek has joined #openstack-neutron02:17
*** Jianyong has joined #openstack-neutron02:23
*** dims has quit IRC02:23
*** gdubreui has joined #openstack-neutron02:31
*** rwsu has quit IRC02:35
*** rwsu has joined #openstack-neutron02:39
*** marun has joined #openstack-neutron02:46
*** Abhishek has quit IRC02:52
*** gongysh has joined #openstack-neutron02:53
*** yfujioka has joined #openstack-neutron03:01
*** enikanorov__ has joined #openstack-neutron03:17
*** aveiga has joined #openstack-neutron03:17
*** AndreyGrebenniko has quit IRC03:17
*** gdubreui has quit IRC03:17
*** AndreyGrebenniko has joined #openstack-neutron03:19
*** harlowja has quit IRC03:19
*** pvo has quit IRC03:19
*** enikanorov_ has quit IRC03:20
*** gdubreui has joined #openstack-neutron03:20
*** pvo has joined #openstack-neutron03:22
*** suresh12 has quit IRC03:53
*** amotoki has joined #openstack-neutron03:55
*** aveiga has quit IRC03:55
*** nati_ueno has quit IRC04:08
*** netavenger-jr has joined #openstack-neutron04:26
*** x86brandon has joined #openstack-neutron04:28
*** gongysh has quit IRC04:35
*** chandankumar has joined #openstack-neutron04:36
*** banix has quit IRC04:38
*** jp_at_hp has quit IRC04:39
*** banix has joined #openstack-neutron04:44
*** suresh12 has joined #openstack-neutron05:04
*** Jianyong has quit IRC05:08
*** suresh12 has quit IRC05:09
*** SumitNaiksatam has joined #openstack-neutron05:12
*** SumitNaiksatam_ has joined #openstack-neutron05:17
*** amir1 has joined #openstack-neutron05:17
*** dcahill1 has joined #openstack-neutron05:17
*** aryan_ has joined #openstack-neutron05:19
*** gfa_ has joined #openstack-neutron05:19
*** marun has quit IRC05:22
*** dcahill has quit IRC05:22
*** asadoughi has quit IRC05:22
*** matrohon has quit IRC05:22
*** decede has quit IRC05:22
*** SumitNaiksatam has quit IRC05:22
*** SumitNaiksatam_ is now known as SumitNaiksatam05:22
*** gfa has quit IRC05:22
*** aryan has quit IRC05:22
*** pvo has quit IRC05:22
*** marun has joined #openstack-neutron05:22
*** pvo has joined #openstack-neutron05:22
*** matrohon has joined #openstack-neutron05:23
*** decede has joined #openstack-neutron05:23
*** amotoki has quit IRC05:30
*** x86brandon has quit IRC05:31
*** alex_klimov has joined #openstack-neutron05:36
*** yfujioka has quit IRC05:48
*** banix has quit IRC05:59
openstackgerritA change was merged to openstack/neutron: Add vpnaas and debug filters to setup.cfg  https://review.openstack.org/5987006:00
*** yfried has joined #openstack-neutron06:01
*** gongysh has joined #openstack-neutron06:05
*** suresh12 has joined #openstack-neutron06:05
*** towen27 has joined #openstack-neutron06:16
towen27I was wondering if anyone here could help me with a multi l3_agent setup.06:17
*** alex_klimov has quit IRC06:18
*** alex_klimov has joined #openstack-neutron06:18
*** alex_klimov1 has joined #openstack-neutron06:19
towen27everyone asleep?06:20
*** alex_klimov has quit IRC06:23
*** towen27 has quit IRC06:41
openstackgerritJenkins proposed a change to openstack/neutron: Imported Translations from Transifex  https://review.openstack.org/5963206:44
*** gdubreui has quit IRC06:49
*** amritanshu_RnD has joined #openstack-neutron06:49
*** nati_ueno has joined #openstack-neutron06:52
*** bashok has joined #openstack-neutron06:56
*** marun has quit IRC06:57
openstackgerritAnn Kamyshnikova proposed a change to openstack/neutron: Fix mistake in usage drop_constraint parameters  https://review.openstack.org/5991007:02
*** nati_ueno has quit IRC07:10
*** towen has joined #openstack-neutron07:20
towenHello, is anyone up?07:20
lifelesstowen: hi, try #openstack for support07:22
towenThank you07:22
*** towen has quit IRC07:22
*** jlibosva has joined #openstack-neutron07:22
*** rwsu has quit IRC07:24
*** yongli has quit IRC07:28
*** amotoki has joined #openstack-neutron07:33
*** rwsu has joined #openstack-neutron07:48
*** suresh12 has quit IRC08:04
*** amuller has joined #openstack-neutron08:04
*** marun has joined #openstack-neutron08:09
*** alagalah has joined #openstack-neutron08:17
*** jistr has joined #openstack-neutron08:18
*** alagalah has left #openstack-neutron08:19
openstackgerritArmando Migliaccio proposed a change to openstack/neutron: Handle exceptions on create_dhcp_port  https://review.openstack.org/5781208:20
openstackgerritRoman Podoliaka proposed a change to openstack/neutron: Fix a race condition in agents status update code  https://review.openstack.org/5881408:23
*** markmcclain has joined #openstack-neutron08:33
*** matsuhashi has quit IRC08:35
*** ygbo has joined #openstack-neutron08:36
*** matsuhashi has joined #openstack-neutron08:36
*** netavenger-jr has quit IRC08:39
*** networkstatic has joined #openstack-neutron08:40
*** fouxm has joined #openstack-neutron08:41
*** SumitNaiksatam has quit IRC08:42
*** SumitNaiksatam has joined #openstack-neutron08:42
*** smcavoy has quit IRC08:48
*** zigo has joined #openstack-neutron09:05
*** amotoki has quit IRC09:08
*** jpich has joined #openstack-neutron09:08
openstackgerritEvgeny Fedoruk proposed a change to openstack/neutron: Extending quota support for neutron LBaaS entities  https://review.openstack.org/5872009:10
*** metral_ has joined #openstack-neutron09:10
*** gongysh has quit IRC09:12
*** zigo_ has quit IRC09:12
*** metral has quit IRC09:12
*** metral_ is now known as metral09:12
*** markmcclain has quit IRC09:13
*** suresh12 has joined #openstack-neutron09:14
*** yongli has joined #openstack-neutron09:17
*** suresh12 has quit IRC09:19
marios_morning neutron09:20
*** pbeskow has joined #openstack-neutron09:29
*** rossella_s has joined #openstack-neutron09:29
pbeskowAnyone have any resources on how to make neutron-ovs-plugin and docker driver interoperate?09:31
*** salv-orlando has joined #openstack-neutron09:33
*** afazekas has joined #openstack-neutron09:36
openstackgerritAnn Kamyshnikova proposed a change to openstack/neutron: Sync models with migrations  https://review.openstack.org/5541109:38
openstackgerritA change was merged to openstack/python-neutronclient: Fix i18n messages in neutronclient  https://review.openstack.org/5752209:38
openstackgerritA change was merged to openstack/neutron: Imported Translations from Transifex  https://review.openstack.org/5963209:39
openstackgerritA change was merged to openstack/python-neutronclient: Updates .gitignore  https://review.openstack.org/5902609:52
marunsalv-orlando: ping09:57
salv-orlandohi marun09:57
marunhi salvatore09:57
marunquite the mess we have on our hands :\09:58
marunJust for kicks I tested with the dhcp notification being sent regardless of agent status.09:59
salv-orlandoI've seen your emails09:59
salv-orlandoand I've even read them, which is perhaps more incredible :)09:59
marunheh09:59
marunAm I making a mountain out of a molehill?09:59
marunThings just seem so...broken.09:59
*** salv-orlando_ has joined #openstack-neutron10:01
salv-orlando_marun: I'm back, what did I miss?10:02
marunsalv-orlando_: In a test where 75 vms were booted with dhcp notification always being sent, the hosts file ends up with 100 entries, but only 43 VMs actually made it all the way to active.10:03
salv-orlando_yes - all the vms made it to active according to nova, but only 43 got an IP thus really becoming active?10:03
*** salv-orlando has quit IRC10:03
*** salv-orlando_ is now known as salv-orlando10:03
*** jorisroovers has joined #openstack-neutron10:03
marunsalv-orlando: not quite.  out of 75 boot attempts, only 43 made it to ACTIVE according to nova.10:04
marunsalv-orlando: the others ended up in error state due to nova having trouble communicating with neutron.10:04
salv-orlandomarun: sounds like there is another problem beyond the skipped notification?10:05
*** steven-weston has joined #openstack-neutron10:06
salv-orlandomy thought was the the issue you identified with the dhcp agent being unwisely marked as down did not affect the vm boot workflow?10:06
marunsalv-orlando: yes.  when neutron is overloaded the nova integration fails while trying to boot vm's.10:06
salv-orlandowe are seeing that in the large_ops job as well.10:07
marunsalv-orlando: what is the large_ops job exactly?  I'm afraid I'm ignorant.10:07
salv-orlandoLaunchpad bug 125016810:07
salv-orlandohttps://bugs.launchpad.net/neutron/+bug/125016810:07
salv-orlandoIt does something similar to what you're doing (150 vms) but uses a fake virt driver10:08
marunfake virt driver, I should enable that!10:08
marun*sigh*10:08
salv-orlandomarun: Our understanding is that the job fails because nova is terribly slow when interacting with neutron, and we should reduce the chatter10:09
salv-orlandodims already did a great job in fixing an issue requiring a lot of round-trips to keystone10:09
salv-orlandobut something in the last week made the issue reappear10:09
marunsalv-orlando: Is someone actively working on the problem of optimization, then, and I should leave it to them?10:09
marunsalv-orlando: It sounds like something a profiler would be useful for.10:09
salv-orlandohowever, in your case, you said you had instances going into ERROR, which is not optimisation, but a real bug10:10
marunsalv-orlando: Find the hotspots and optimize.10:10
salv-orlandomarun: attaching a profiler is something I have on a post-it on my desk which is not covered in dust and stained with coffee10:10
salv-orlandowhich is *now10:10
marunsalv-orlando: so do you want me to leave it to you?10:10
salv-orlandoNope, because otherwise it will just stay on that postit10:11
marunsalv-orlando: ok, fair enough10:11
salv-orlandomarun: but am I right you were saying you saw instance go into ERROR state, which means at some point neutron started throwing 500s10:11
marunsalv-orlando: the errors I'm seeing in the logs, btw are 'Caught error: Connection to neutron failed: Maximum attempts reached'10:11
salv-orlandolike bug 129111510:11
salv-orlandohttps://bugs.launchpad.net/neutron/+bug/121191510:12
marunsalv-orlando: btw I tried the fix suggested in the QueuePool bug by garyk, to set the pool timeout to something low, and then I started seeing 500s from nova because neutron was throwing queuepool timeout errors.10:12
marunsalv-orlando: I really don't understand how fiddling with queuepool conf is supposed to work10:12
marunsalv-orlando: yeah, like that bug10:13
marunsalv-orlando: except the error was reported by nova api10:13
salv-orlandoreducing the queue pool timeout will make things better because it allows quicker recycle of db connections into the pool, but does not permanently solve the issue10:14
marunsalv-orlando: do you have a suggested value?  Setting it to '2' made things worse for me.10:14
salv-orlandobottom line in my opinion is that if you want to handle X concurrent requests in neutron your pool size should be at least X+110:14
marunsalv-orlando: ah, ok.10:14
salv-orlandomarun: I'm not talking about the timeout, but the pool size10:15
marunsalv-orlando: that suggests rate limiting then10:15
salv-orlandomarun: yeah, that's another post-it on my desk. We had a guy working on it, but then we did not see him anymore.10:15
salv-orlando*We I mean neutron, not my people at vmware10:15
marunsalv-orlando: right10:15
marunsalv-orlando: so, hypothetically, if we were to rate-limit maybe we could treat a rate limit failure differently than a connection failure and have a longer time between connection attempts in the client?10:17
marunsalv-orlando: maybe even progressively adjust the time between connection failures?10:17
salv-orlandoI was going to ask you the same thing. The approach seems reasonable to me.10:18
salv-orlandoDefinitely better than failing a request without retry or doing complex things like queueing requests in the neutron server10:18
marunsalv-orlando: I'm hoping we can keep the scope on a fix small enough to backport.10:19
marunsalv-orlando: though I would argue that an ideal solution will involve queueing requests.10:19
marunsalv-orlando: breaking apart the api from the 'conductor' is probably the way to go, longer term.  refusing client requests when we're at capacity does not seem like the best idea.10:20
marunsalv-orlando: but i digress10:20
marunsalv-orlando: ok, I guess there are two parallel requirements to fix the scaling issue.  rate-limiting (and handling this in the client) and optimizing common paths.10:21
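The progressive back-off marun and salv-orlando sketch above might look roughly like this. This is a hypothetical helper for illustration only: `ConnectionFailed` stands in for whatever exception the client raises on a failed connection attempt, and none of these names are neutronclient's actual API.

```python
import random
import time


class ConnectionFailed(Exception):
    """Stand-in for the client-side connection error discussed above."""


def call_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=10.0):
    """Retry func() with exponentially increasing, jittered delays.

    Waiting longer between attempts gives an overloaded server time to
    drain its backlog instead of being hammered with immediate retries.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except ConnectionFailed:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter avoids many clients retrying in lockstep.
            time.sleep(delay + random.uniform(0, delay / 2))
```

A rate-limit rejection (once the server can signal one) could be caught separately and given a longer base delay than a plain connection failure, which is the distinction marun proposes.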
*** Sreedhar has joined #openstack-neutron10:21
*** matsuhashi has quit IRC10:22
*** markmcclain has joined #openstack-neutron10:22
marunsalv-orlando: as far as the notification problem, is notifying regardless of agent status an option?10:23
SreedharHi All, During the concurrent VM deployment of 30 instances some instances are going into error state due to neutron rpc timeouts and some instances are getting duplicate fixed IPs, once we have around 150 instances already active10:26
marunSreedhar: Yeah, this is a known issue: https://bugs.launchpad.net/bugs/119238110:28
SreedharHave already tuned the sqlalchemy queuepool size and increased the agent_down_time10:29
SreedharHi Marun, Thanks, I am following that bug10:29
SreedharI had the same HW setup and did similar tests in Grizzly.  After tuning the sqlalchemy queuepool size and increasing the agent_down_time, all the instances were active, none went to error state in Grizzly. Also did not have the duplicate fixed IP issues10:30
salv-orlandomarun: I think it is valuable for fixing the issue in the short term and back porting, as you said.10:31
salv-orlandoAlternatively the only thing I see is changing the logic for declaring an agent down by allowing for more tolerance for missed notifications10:31
marunsalv-orlando: maybe both?10:31
marunsalv-orlando: What do you think of increasing the tolerance for missed notifications and logging when sending to an agent reported as down?10:33
salv-orlandomarun: I think it's worth doing.10:33
marunsalv-orlando: Is there any reason not to send a notification to a down agent?  Would the alternative be reporting an exception?10:33
marundown agent  -> 'agent reported as down'10:34
salv-orlandomarun: I don't see why it should be bad, but jog0 raised the point in the mailing list, so I'm trying to understand why it would be really bad10:34
*** markmcclain has quit IRC10:35
*** markmcclain has joined #openstack-neutron10:35
marunsalv-orlando: do you have a link to jog0's concern?  I'm afraid I missed it10:35
openstackgerritIsaku Yamahata proposed a change to openstack/neutron: l3-agent-consolidation(WIP): framework for consolidating l3 agents  https://review.openstack.org/5762710:36
salv-orlandoin the email thread he said that sending notifications to down agent is, in his opinion, as bad as not sending them10:37
jog0what was my concern?10:37
*** nati_ueno has joined #openstack-neutron10:37
salv-orlandosorry jog0 I think it was not you :(10:37
salv-orlandoI'm messing up email threads10:37
salv-orlandojog0: I hope I did not wake you up or distract from some other task10:38
jog0salv-orlando:  heh, I didn't think I chimed in on that thread. my only concern is icehouse-210:38
jog0salv-orlando: heh, no I am UTC+2 this week10:38
openstackgerritArmando Migliaccio proposed a change to openstack/neutron: Handle failures on update_dhcp_port  https://review.openstack.org/5966410:38
*** nigel_r_davos has joined #openstack-neutron10:38
markmcclainjog0: you already in Israel?10:38
jog0markmcclain: yup10:38
*** armax has joined #openstack-neutron10:38
markmcclainah cool… I'll be there Sunday10:38
jog0markmcclain: cool see you sunday10:39
jog0time to go find some lunch10:39
ygbomarkmcclain: Hi, do you have a second?10:40
markmcclainygbo: sure what's up?10:41
ygboI tried to add some docstring explaining why dnsmasq requires the --addn-hosts parameter: https://review.openstack.org/#/c/52930/ just let me know if anything is unclear.10:42
marunarmax: ping10:43
armaxmarun: pong10:43
marunarmax: regarding https://review.openstack.org/#/c/59664, is there a reason that update and create have to separately test all the failure cases?  Why not do test _port_action directly instead10:44
marun?10:44
*** matsuhashi has joined #openstack-neutron10:45
armaxIf you want complete coverage the number of combinations is the same10:46
*** jistr has quit IRC10:46
marunsalv-orlando: uh, not true10:46
marunsorry, armax: not true10:46
armaxall exceptions for each supported action, no?10:46
marunarmax: no10:46
armaxk10:46
marunthat's falacious10:46
marunarmax: if it were true, testing would be combinatorial for complete coverage10:47
salv-orlandomarun: if you want I can slap him in the face for writing untrue statements :)10:47
armaxindeed it is10:47
marunarmax: it does not need to be10:47
marunarmax: and it can't be, if we want our efforts to be useful10:47
armaxI'm happy to hear how I can improve it10:48
marunarmax: an alternative is testing the error conditions with just 'create_port'.  Then test golden-path (non-error condition) with 'update_port'10:48
marunarmax: these are whitebox tests - the error is set by mock anyway10:48
marunarmax: voila, coverage.10:49
armaxok10:49
armaxso are you saying I remove some test methods? sorry I don't follow you10:50
marunarmax: my suggested strategy, in general, is testing paths at as low a level as possible (i.e. test the error conditions by calling _port_action directly).10:50
ygbomarkmcclain: as you can see, dnsmasq does not resolve hosts defined in --dhcp-hostsfile if it did not give a lease for it (it is only a lease mapping and not a list of hosts to resolve). So if you have HA with 2 dnsmasq instances on the same subnet (the subnet being tunnelled between several network nodes), currently hosts resolve only other hosts on the same network which got their lease from the same dnsmasq instance.10:50
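The behaviour ygbo describes follows from the two different files dnsmasq consumes. A minimal illustration (the file paths and entries are made up; the flags themselves are dnsmasq's real options):

```shell
# /tmp/dhcp-hosts -- read via --dhcp-hostsfile; drives lease assignment
# only, one "MAC,name,IP" entry per host:
#   fa:16:3e:aa:bb:cc,host-10-0-0-3,10.0.0.3
#
# /tmp/addn-hosts -- read via --addn-hosts; entries are answered by
# dnsmasq's DNS resolver regardless of which instance gave out the lease:
#   10.0.0.3 host-10-0-0-3
dnsmasq --no-daemon \
    --dhcp-range=10.0.0.2,10.0.0.254 \
    --dhcp-hostsfile=/tmp/dhcp-hosts \
    --addn-hosts=/tmp/addn-hosts
```

Without --addn-hosts, a name is only resolvable by the dnsmasq instance that issued that host's lease, which is exactly the HA gap in the review above.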
armaxmaybe if you did your review on gerrit I might be able to follow more10:50
marunarmax: Yes, I'm suggesting removing the new tests that check that update_port handles the error conditions appropriately.10:51
marunarmax: Ok, I'll add on gerrit.10:51
armaxtnx10:51
marunarmax: apologies, I figured a conversation would move things quicker.10:51
armaxthat's okay10:51
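The strategy marun outlines, exercising error paths once at the lowest level and only the happy path through the public entry points, might look like this. These names merely mirror the discussion (a toy `_port_action`); this is not neutron's actual test code.

```python
from unittest import mock


class DhcpAgent:
    """Minimal stand-in for the agent code under review."""

    def _port_action(self, plugin_call):
        try:
            return plugin_call()
        except Exception:
            return None  # every error condition funnels through here

    def create_port(self):
        return self._port_action(lambda: "created")

    def update_port(self):
        return self._port_action(lambda: "updated")


agent = DhcpAgent()

# Error conditions: tested once, directly against _port_action, with the
# failure injected by mock -- no need to repeat per public method.
failing = mock.Mock(side_effect=RuntimeError("plugin down"))
assert agent._port_action(failing) is None

# Golden path: tested through each public entry point.
assert agent.create_port() == "created"
assert agent.update_port() == "updated"
```

This keeps coverage linear in the number of error conditions instead of combinatorial across every public method.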
marunsalv-orlando: I think you were talking about Clint Byrum's comments on that mailing list thread.  He was concerned that sending notifications blindly would be problematic.10:53
marunsalv-orlando: I figure logging warnings if agents are not up should alleviate some of that concern, and that the amqp queue will provide some assurance of eventual delivery if an agent is actually down.  The alternative would seem much more involved and hard to backport.10:54
salv-orlandomarun: correct. I don't know how I managed to mix the two of them10:54
salv-orlandomarun: I agree with you, I though it was worth digging into clint's concerns10:54
marunsalv-orlando: ok, cool.10:55
openstackgerritEvgeny Fedoruk proposed a change to openstack/python-neutronclient: Extending quota support neutron LBaaS entities  https://review.openstack.org/5919210:56
pbeskowam I correct in understanding that if I use the ml2 plugin for a dedicated neutron network node I should be able to use the neutron-linuxbridge plugin on one compute node and the neutron-ovs-plugin on another compute node?11:02
pbeskowand then be able to use the docker driver with neutron-linuxbridge to enable network connectivity?11:03
*** ygbo has quit IRC11:04
SreedharMarun: Per this bug https://bugs.launchpad.net/neutron/+bug/1160442, with sqlalchemy queuepool size, did not observe these duplicate fixed IPs in Grizzly but in Havana even with sqlalchemy queuepool tuning, still see the duplicate fixed IPs. Also per this https://bugs.launchpad.net/bugs/1192381 bug, instances are active but they are not getting IPs, but in my case instances are going to error state due to neutron ti11:05
*** ygbo has joined #openstack-neutron11:07
*** jistr has joined #openstack-neutron11:09
*** networkstatic is now known as networkstatic_zZ11:09
Sreedharmarun: In Grizzly, once we have more than 210 instance active during subsequent 30 parallel instance deployment, some instances are not able to get IP address  during their first boot. There is a considerable delay (close to 2min) in updating the port status (during security group rule update) due to which instances are not able to get their IP even though the port details are added in hosts file. Till the port status i11:11
*** bvandenh has joined #openstack-neutron11:12
*** jorisroovers has quit IRC11:12
salv-orlandoSreedhar: patch 57420 is rationalizing the ovs agent loop11:24
salv-orlandoor trying to make it less crappy11:24
salv-orlandothere is also another race being investigated when a port_create_end arrives before the sync_state routine in the dhcp agent processes the network and adds it to the cache.11:25
salv-orlandoAs a result, the port update is not processed until the next sync_state iteration, and DHCPDISCOVER from vms are not handled by dnsmasq as the entry is not added in the hosts file11:25
salv-orlandothis might cause the vm to timeout on dhcp requests on boot11:26
*** pcm_ has joined #openstack-neutron11:26
salv-orlandoSreedhar: ^^ and this is an example of it http://logs.openstack.org/20/57420/35/experimental/check-tempest-dsvm-neutron-isolated-parallel/cdf95ef/logs/11:26
*** pcm_ has quit IRC11:28
*** pcm_ has joined #openstack-neutron11:28
Sreedharsalv-orlando: Thanks for the info. I see some build failures in patch 57420. Is this complete, can i merge that code11:29
salv-orlandoSreedhar: patch 57420 is a wip - Once the builds become green I will extract several patches out of them and push them for review11:31
salv-orlandothere is a lot of LOG code in there11:31
*** jp_at_hp has joined #openstack-neutron11:31
Sreedharsalv-orlando: Thanks.11:32
*** KA has quit IRC11:38
openstackgerritArmando Migliaccio proposed a change to openstack/neutron: Handle failures on update_dhcp_port  https://review.openstack.org/5966411:39
*** armax has quit IRC11:40
*** bvandenh has quit IRC11:49
*** jorisroovers has joined #openstack-neutron12:03
anteayamlavalle: awesome job on the API tests gap analysis12:08
anteayaI have added a note to the etherpad encouraging anyone to select one item from the list and identify themselves and then create a launchpad bug for the item12:09
anteayathe person signing up does not have to file a patch for the bug12:09
anteayaonce the list is in the bug tracker it is easier to track progress, but Mark didn't want you to have to enter the list into launchpad yourself12:10
anteayahttps://etherpad.openstack.org/p/icehouse-summit-qa-neutron12:10
*** bvandenh has joined #openstack-neutron12:18
*** dims has joined #openstack-neutron12:18
*** salv-orlando_ has joined #openstack-neutron12:33
*** enikanorov_ has joined #openstack-neutron12:35
*** salv-orlando has quit IRC12:36
*** salv-orlando_ is now known as salv-orlando12:36
*** enikanorov has quit IRC12:38
Sreedharsalv-orlando: During the concurrent instance creation, getting these errors in the DHCP agent log  - TRACE neutron.agent.dhcp_agent Timeout: Timeout while waiting on RPC response - topic: "q-plugin", RPC method: "get_dhcp_port" info: "<unknown>"  and Timeout: Timeout while waiting on RPC response - topic: "q-plugin", RPC method: "get_active_networks_info" info: "<unknown>".  Any idea why these are coming and any tuning12:46
marunSreedhar: the load is simply too high.  did you say you were on grizzly?12:49
marunSreedhar: or can you use havana or trunk?12:49
Sreedharmarun: I have run similar tests on Grizzly but never had issues. I have installed the fresh Havana bits from Ubuntu Cloud and am seeing these issues12:50
marunThere is a patch introduced in icehouse that allows running wsgi workers for the neutron service in a separate process which should allow for more performant rpc handling as a side-effect: https://review.openstack.org/#/c/37131/12:51
*** safchain has joined #openstack-neutron12:51
Sreedharmarun: I was using the same HW and same network setup. Able to deploy 240+ instances with a concurrency of 30 in Grizzly without any issue with SQLpool tuning and increasing agent_down_time and report_interval. I have upgraded the setup with Havana (with fresh install) and since then I could not go more than 150 instances. Some of the instances are going into error state and some getting duplicate fixed IPs12:51
SreedharMarun: I am in the process of implementing those changes - adding more worker threads. But what puzzles me is how come it was working in Grizzly and not in Havana12:53
marunSreedhar: I'm seeing the same results in icehouse.  I'm surprised you were able to deploy so many instances in grizzly.  I've seen reports that booting lots of instances results in hosts not being configured via dhcp.12:53
marunSreedhar: The duplicate fixed ip and error state may be a new problem, though.12:54
marunSreedhar: What sql pool configuration have you found effective?  I posted on the mailing list for assistance with that one and didn't receive a good answer.12:54
SreedharMarun: I have done the tests more than 10 to 15 times. I was able to get IP addresses all the times up to 210 instances in Grizzly. Only when we crossed 210 instances did 2 or 3 instances get their IP address with a delay of 2-3min. But eventually all instances could get an IP address. This behavior was consistent across all runs12:55
beaglessalv-orlando, markmcclain, marun: I've added you guys as reviewers to 59542. In some respects, I consider this an "experimental" patch in that I've not examined thoroughly the implications and code paths, but there were several types of errors that disappeared in my environment after implementing the patch... the rest were NetworkNotFound related with get_dhcp_port() etc, but a fix for some of that was already merged by the time I "got to it"12:56
marunSreedhar: That is extremely surprising, and indicates that your configuration has performance headroom to spare.12:56
SreedharIn Grizzly I set these values for Quantum: sqlalchemy_pool_size = 60, sqlalchemy_max_overflow = 120, sqlalchemy_pool_timeout = 2. Same values were set in Havana as well: max_pool_size = 60, max_overflow = 120, pool_timeout = 212:56
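For reference, the Havana options Sreedhar quotes live in the [database] section of neutron.conf; the values below are simply his, not general recommendations:

```ini
[database]
# Havana option names; the Grizzly equivalents were sqlalchemy_pool_size,
# sqlalchemy_max_overflow and sqlalchemy_pool_timeout.
max_pool_size = 60
max_overflow = 120
pool_timeout = 2
```

Per salv-orlando's earlier rule of thumb, max_pool_size (plus overflow) should exceed the number of concurrent requests the server is expected to handle.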
*** jprovazn has joined #openstack-neutron12:58
marunSreedhar: thank you, I will try those settings.12:58
SreedharMarun: My configuration includes 16 compute nodes (each has 16 cores) - I could even go up to 300 instances with a concurrency of 30 parallel instance creations, but once we cross 240, most of the instances won't get an IP fast enough (there is a delay of close to 2min)12:59
*** amuller_ has joined #openstack-neutron12:59
marunbeagles: Excuse my ignorance, but is a network only ever configured on a single dhcp agent?12:59
SreedharMarun: With the performance enhancements in Havana, I was expecting neutron to perform better than Quantum but it's proving otherwise12:59
marunSreedhar: Frankly, I continue to be surprised when people attempt to use the ovs or linuxbridge plugins as more than a POC.  As we're discovering, there are many reasons to choose a better supported solution.13:00
SreedharMarun: With the above mentioned sql pool configuration, never seen duplicate fixed IP issues in Grizzly13:01
markmcclainbeagles: looking13:01
openstackgerritOleg Bondarev proposed a change to openstack/neutron: LBaaS: agent monitoring and instance rescheduling  https://review.openstack.org/5974313:01
*** amuller has quit IRC13:02
*** matsuhashi has quit IRC13:03
beaglesmarun, mmm... if we are doing distributed dhcp agents then that would be multiple agents per network, wouldn't it?13:05
*** bvandenh has quit IRC13:05
marunbeagles: the reason I ask is that that the approach in the task presumes to know when resyncing is happening based on local state only.13:06
maruntask -> patch13:06
*** bashok has quit IRC13:07
marunSreedhar: The likely culprit is a performance regression.  You have big enough hardware that you never saw the problem before, but it might have been there for people with slower hosts.  Certainly the VMs running the gate jobs and developers like me running in a VM on a laptop seem to replicate the problems with ease.13:07
beaglesmarun, but for the purposes of this patch it is only local state that is relevant... as the purpose is to avoid changes that are in conflict with activities that are "in progress" that affect a particular network setting.13:08
*** yamahata_ has joined #openstack-neutron13:09
beagless/setting/config/13:09
marunoh, duh13:09
marunbeagles: I think I see the problem.13:09
Sreedharmarun: how to proceed further. I feel instance going into error state and getting duplicate fixed IPs are serious problem,13:10
marunbeagles: the utils.synchronized decorator was intended to limit access to the function to a single caller at a time, but the use of spawn_n would fire off greenthreads that could be still running after the function was exited.13:11
marunSreedhar: I agree.  I'm working on it.13:12
Sreedharmarun: Thanks13:12
marunbeagles: I'm going to comment on the patch.13:13
marunmarkmcclain: are you seeing the same thing?13:13
markmcclainmarun: got pulled into another convo, so haven't gotten there yet13:14
markmcclaingo ahead and add your comment13:14
marunmarkmcclain: Ok, hopefully this will save you the time.13:15
markmcclainmarun: thanks for beating me to the review13:16
marun:)13:16
beaglesmarun, yeah, that's what I was thinkin'13:16
*** dims has quit IRC13:17
*** dims has joined #openstack-neutron13:17
beaglesmarun, further to that, since operations tend to involve external commands, they are outside of the eventlet scheduling, so it is quite easy to conceive of multiple threads "stacking up" for a given network and causing things to happen in an "insane order"13:18
beaglesmarun, what would be better (maybe) is a queue of updates on network ids so that sync_state operations per-network are always run in order13:19
beaglesmarun, if it were a pre-emptive multi-tasking environment, we'd be driven in that direction anyways because spawning threads per update would lead to scaling issues due to proliferation of threads13:20
marunbeagles: Is that complexity necessary?  The original design used a single lock.13:20
marunbeagles: remember that we are not using real threads, though.  greenthreads are free.13:20
marun(basically)13:21
beaglesmarun: relatively yeah... I only bring it up because an idiom that works in one often has applicability to the other13:21
marunbeagles: I'm going to look at the patch that introduced the greenthreading in the hopes of understanding why the spawn_n was introduced in the first place.13:21
beaglesmarun: yup13:21
marunHave you already done the same?  If not, maybe do the same in case I miss something?13:22
beaglesmarun: no I didn't13:22
beaglesmarun: although I can see how sync_state calls might timeout or pile up if threads aren't used there13:22
marunbeagles: right, io blocking13:23
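The race marun and beagles describe above can be sketched with stdlib threading standing in for eventlet (all names here are hypothetical, not Neutron's actual code): a lock held around sync_state does not cover workers launched fire-and-forget, the way spawn_n launches greenthreads, so the lock is released while the configuration work is still in flight.

```python
import threading
import time

lock = threading.Lock()
events = []

def configure(net_id):
    # stands in for blocking I/O (dnsmasq, ovs, ip commands)
    time.sleep(0.05)
    events.append(("configured", net_id))

def sync_state():
    # a decorator-style lock only covers the body of sync_state itself
    with lock:
        for net in ("net-a", "net-b"):
            # fire-and-forget, like eventlet.spawn_n: nothing waits on it
            threading.Thread(target=configure, args=(net,)).start()
        events.append(("lock released", None))
    # the configure() workers may still be running after the lock is gone

sync_state()
time.sleep(0.2)
print(events[0])  # ('lock released', None): the lock outlived none of the work
```

The ordering of `events` shows the problem: "lock released" lands before either network is configured, so a second sync_state could acquire the lock while the first one's workers are still touching the same resources.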
*** matsuhashi has joined #openstack-neutron13:32
markmcclainmarun, salv-orlando: just wanted to make sure you saw this: http://lists.openstack.org/pipermail/openstack-dev/2013-December/021127.html13:36
markmcclainlooks like we're still #113:36
marunmarkmcclain: cripes, that network isolation one is still at #2 :(13:37
*** markmcclain has quit IRC13:38
marunsalv-orlando: is there an exception that i should throw when there are no dhcp agents to notify?13:41
*** armax has joined #openstack-neutron13:47
*** jorisroovers has quit IRC13:48
*** jorisroovers has joined #openstack-neutron13:48
*** jorisroovers has quit IRC13:50
openstackgerritSean M. Collins proposed a change to openstack/neutron: Quality of Service API extension - RPC & Driver support  https://review.openstack.org/5997013:50
openstackgerritSean M. Collins proposed a change to openstack/neutron: Ml2 QoS API extension support  https://review.openstack.org/5997113:50
openstackgerritSean M. Collins proposed a change to openstack/neutron: QoS API and DB models  https://review.openstack.org/2831313:50
*** steven-weston has quit IRC13:51
*** jorisroovers has joined #openstack-neutron13:52
*** yfried has quit IRC13:54
*** safchain has quit IRC13:56
*** mengxd has joined #openstack-neutron14:00
*** markmcclain has joined #openstack-neutron14:02
*** mengxd has quit IRC14:02
dkehnhas the ml2 meeting changed14:02
dkehnmestery: ^^^^^14:02
mesterydkehn: Yes, it's at 1600UTC14:03
dkehnmestery: ok, thx14:03
mesterydkehn: And it's on #openstack-meeting-alt, FYI14:03
*** SushilKM has joined #openstack-neutron14:03
dkehnmestery: yaaaaaa14:03
pcm_mestery: Thanks. i was wondering too.14:04
*** aymenfrikha has joined #openstack-neutron14:04
*** amuller__ has joined #openstack-neutron14:04
mesteryNo worries, had sent email to openstack-dev, but that list has grown to an almost unimaginable size.14:04
*** amuller_ has quit IRC14:05
*** julim has joined #openstack-neutron14:05
*** yamahata_ has quit IRC14:07
*** yamahata_ has joined #openstack-neutron14:07
dkehnmestery: true, you might want to update the https://www.google.com/calendar/ical/bj05mroquq28jhud58esggqmh4@group.calendar.google.com/public/basic.ics, just a thought14:09
mesterydkehn: The calendar invite isn't updated? I thought somehow that happened automatically, but let me do that myself, thanks!14:10
*** safchain has joined #openstack-neutron14:10
*** heyongli has joined #openstack-neutron14:11
dkehnmestery: how'd you guys fair on the snow?14:12
mesterydkehn: About 2-3 inches so far, but mixed with rain, so it's frozen all over.14:12
mesteryHow about you?14:12
*** markmcclain has quit IRC14:12
dkehnmestery: still snowing here, about 6 inches down here, I hear the mountains got 28 inches, great for the skiing, but cold as hell, which we don't see too often14:13
*** amuller__ is now known as amuller14:13
*** armax has quit IRC14:15
*** armax has joined #openstack-neutron14:15
anteayasalv-orlando: this bug had 61 hits in the last 48 hours: https://bugs.launchpad.net/tempest/+bug/125389614:16
anteayasalv-orlando: any thoughts?14:16
openstackgerritMarios Andreou proposed a change to openstack/neutron: Validate CIDR given as ip-prefix in security-group-rule-create  https://review.openstack.org/5921214:16
mesterydkehn: Wow, that sounds cold and snowy :)14:18
salv-orlandoanteaya: I have to check what merged in the past 48 hours; I would rather avoid reverting if we can find easy fixes14:18
dkehnanteaya: responded via HP email about the sprint14:19
*** bvandenh has joined #openstack-neutron14:20
dkehnmestery: hard to get a snow day when working from home14:20
anteayasalv-orlando: sound reasonable to me, thank you14:22
anteayadkehn: great thanks14:22
*** SushilKM has quit IRC14:23
mesterydkehn: Agree.14:23
* mestery works from home as well.14:24
salv-orlandoanteaya: I think I have skewed the stats14:24
mesterydkehn: You planning to come to the Montreal sprint too?14:24
salv-orlandoon monday I've been launching a lot of parallel jobs which exhibit a failure like bug 125389614:24
salv-orlandobut the stats from the last 24 hours are of "only" 19 hits14:24
anteayahmmmmmm14:24
dkehnmestery: yes, trying to plan it14:25
mesterydkehn: Cool!14:25
dkehnanteaya: is the location confirmed14:25
anteayaany suggestions for how we might proceed to get an accurate reflection of what is 1253896? and what might be due to the parallel jobs?14:25
anteayadkehn: the location address I sent in the email is tentatively booked14:26
anteayaI am going to pay the deposit when I return home next week14:26
dkehnanteaya: ok, so it's not likely to change then, or should one wait until you confirm?14:26
anteayanot likely to change14:27
anteayathey will contact me if they get an inquiry for the same time14:27
anteayaand I will pay the deposit from where I am if that happens14:27
*** stackKid has joined #openstack-neutron14:27
anteayafor security reasons I would just like to do that from home14:28
anteayaI have my airfare booked for it since i am coming from Australia and had to book that14:28
anteayabut not my hotel or travel home yet14:28
anteayaso that is where I am personally14:28
*** nigel_r_davos has quit IRC14:28
salv-orlandoanteaya: I think it won't be straightforward with the log collection process. On the other hand I might stop randomly running jobs on the gate and use internal infrastructure where possible.14:29
anteayajog0: ping14:30
anteayajog0: any thoughts on salv-orlando's possible direction?14:30
dkehnanteaya: ok, thanks for the info, just want to get as close as possible to the venue with hotel14:30
anteayanot sure if jog0 is online right now or not14:31
anteayadkehn: understood14:31
*** ocherka_ has joined #openstack-neutron14:31
marunarmax: ping14:33
armaxmarun: pong14:33
marunarmax: Have you seen beagles' patch?14:34
armaxnot yet, I am a bit behind on reviews14:34
armaxwhich one is it?14:34
marunhttps://review.openstack.org/#/c/59542/14:35
marunGiven your focus on the dhcp issues I think it's important that you look at this.14:35
armaxit looks like I should look at this one14:35
armaxthanks for the heads-up14:35
armaxI'll give it a look14:36
marunDid you add the utils.synchronized decorator to sync_state?14:36
armaxI think I did14:36
marunI think there may be a problem with that approach (comment inline documents my concern)14:36
*** jlibosva1 has joined #openstack-neutron14:36
marunbeagles suggested that a queue might be a better solution than his current proposal, maybe you can speak on that issue as well.14:37
marunok, I'm done :)14:37
armaxI added it because I saw log traces that led me to believe that sometimes event handlers were preempted by the periodic sync worker14:38
marunarmax: your instincts were definitely correct14:38
*** jlibosva has quit IRC14:39
marunarmax: but the use of non-blocking spawn_n() calls to invoke the configuration doesn't allow a function-level lock to work14:39
armaxthat lock is file-based though14:39
armaxwouldn't that work still?14:39
marunarmax: I'm afraid not.14:39
marunarmax: function called -> greenthreads spawned -> function ended and lock released -> (possibly sometime later, greenthreads exit)14:40
marunarmax: it's the non-blocking nature of spawn_n that is the problem.14:40
*** jlibosva1 has quit IRC14:40
armaxmy understanding was that the decorator was dropping a cookie (i.e. a file) on the file system14:41
armaxand the method would yield if the cookie was already there14:41
marunarmax: that is correct.14:41
armaxso concurrent calls would still serialize14:41
*** jlibosva has joined #openstack-neutron14:41
armaxI am not overly expert14:41
marunarmax: concurrent calls to sync_state, yes.14:41
armaxof that piece of code14:41
armaxthough14:41
marunarmax: but greenthreads that have been spawned - that call safe_configure - are not guaranteed to have run to completion before the lock is relinquished14:42
armaxright14:42
marunarmax: the only way the lock would be effective would be if spawn() was used instead of spawn_n() and wait() was called on every greenthread in the pool.14:43
*** peristeri has joined #openstack-neutron14:43
marunarmax: which, frankly, might be a good idea.14:43
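marun's spawn()-plus-wait() suggestion can be sketched with stdlib threads standing in for a green pool (hypothetical code, assuming the goal is one sync_state at a time): joining every worker before leaving the lock makes the decorator effective again, at the cost of serializing full sync passes.

```python
import threading

lock = threading.Lock()
events = []

def configure(net_id):
    events.append(("configured", net_id))

def sync_state():
    with lock:
        # spawn the workers, then wait for all of them before releasing
        # the lock -- the threading analogue of spawn() + wait() on every
        # greenthread in the pool
        workers = [threading.Thread(target=configure, args=(n,))
                   for n in ("net-a", "net-b")]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        events.append(("lock released", None))

sync_state()
print(events[-1])  # ('lock released', None) is now last: the lock covers the work
```

With the join inside the `with` block, "lock released" can only appear after every per-network configuration has finished, so concurrent sync_state callers really do serialize.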
armaxso in a nutshell you're saying that the decorator is ineffective14:43
armaxso far14:43
marunarmax: As the function is currently written, yes.14:44
armaxand we need to tweak the body as you suggested14:44
armax?14:44
* beagles posits that the decorator is still a good thing^TM but it doesn't scope out the way you think14:44
marunarmax: If the goal is to only have one copy of sync_state running at a time, then yes, we can tweak the body.14:44
marunarmax: if the goal is to maximize concurrency of safe_configure, then beagles suggestion of queueing those activities might be a better choice.14:45
marunI'm not exactly sure why spawn_n is being used.14:45
beaglesarmax, marun: if the same operations are initiated through other means (other RPCs, etc), there will be other races. These should also be accounted for14:46
armaxI think it's safer to serialize14:46
armaxmy understanding of the use of spawn was to run the task in 'background'14:47
beagleskeep in mind that many of these operations end up invoking commands that are of indeterminate duration... serializing everything may be prohibitive from the scalability perspective14:48
marunarmax: arg, beagles is correct though.  enable_dhcp_helper calls configure_dhcp_for_network too, from all the network events14:48
beaglesconsidering sync_states are frequent and known and ad-hoc calls are arbitrary in frequency and timing14:48
armaxmy fear is that eventually those end up calling dnsmasq, ovs, and ip commands14:49
armaxif they end up playing with the same resource14:49
armaxthen there may be troubles14:49
marun:\14:49
beagleswould it make sense to serialize at those points then?14:49
marunarmax, beagles: I think the sync point needs to be configuring a given network14:50
beaglesand allow other concurrency to occur if the resources are not shared/contended for14:50
armaxfrankly I am very bad at visualizing the execution of concurrent code14:50
armaxin this case my approach would be trial and error14:50
beaglesarmax: anybody who says they are faultless at that is lying :)14:50
beaglesarmax, or deluded14:50
marunarmax: I don't think it's too complicated.14:50
armaxbeagles: I cannot disagree14:50
beaglesarmax: I've just "stuck my finger in the proverbial socket" quite a lot :)14:50
marunarmax: the os operations in question - dnsmasq etc - are isolated by network14:51
jog0anteaya: I am about to go offline14:51
marunarmax: so only a single configuration operation should be done at a time on a given network14:51
marunarmax: It should be safe to configure multiple networks at a time, though.14:51
beaglesI think ops vis-a-vis ovs probably won't walk all over each other either since one network's operations are isolated from another's pretty much by definition14:52
* beagles notices marun's "etc" and facepalms and shuts up14:52
armaxmarun: true14:53
armaxbut what if you get concurrent events on the same network14:53
marunarmax: queue14:53
marunarmax: or ??14:53
armaxindeed14:53
marunheh14:54
armaxbut a queue is a form of serialization, is it not?14:54
beaglesqueue per network14:54
armaxcorrect14:54
marun+114:55
armaxbottom line: dhcp agent v214:55
armax:)14:55
marunoy14:55
marunso we're stuck for havana?14:55
marunor do you think v2 could be backported?14:56
armaxit depends on the extent of changes required14:56
marunthe reason I ask is that we're trying to stabilize havana for release as RHOS 4.0 and it's got an awful lot of issues at present.14:56
beaglesusually for this kind of thing it is best to start really simple and functionally specific14:56
marunagreed14:57
beaglesin this case a map of network.id to updated info and have a set of eventlet threads processing that map14:57
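beagles' map-of-network-ids idea can be sketched with stdlib queue and threading (hypothetical names; eventlet greenthreads would be the real substrate): one worker drains a FIFO queue per network, so updates to the same network stay ordered while different networks proceed independently.

```python
import queue
import threading

# one FIFO queue per network id: updates to the same network run in
# order, while different networks can be processed concurrently
workers = {}
results = {}

def _worker(net_id, q):
    while True:
        update = q.get()
        if update is None:       # sentinel: shut this worker down
            break
        results.setdefault(net_id, []).append(update)

def submit(net_id, update):
    if net_id not in workers:
        q = queue.Queue()
        t = threading.Thread(target=_worker, args=(net_id, q))
        t.start()
        workers[net_id] = (t, q)
    workers[net_id][1].put(update)

for i in range(3):
    submit("net-a", i)
submit("net-b", "update-x")

for t, q in workers.values():
    q.put(None)
    t.join()

print(results)  # {'net-a': [0, 1, 2], 'net-b': ['update-x']}
```

This keeps the "sync point is a given network" property marun asks for below without serializing everything globally; a production version would also need to bound the number of workers and handle worker failure.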
anteayajog0: was just wondering what you thought about salv-orlando's assessment of Gate Bug #114:57
armaxyes, but the bigger picture should still be in sight14:57
* beagles nods14:58
marunarmax: what would the bigger picture look like?14:58
*** yfried has joined #openstack-neutron14:58
armaxyou mentioned this queuing system14:58
beaglesarmax, don't get stuck on "queue"14:58
beaglesarmax, think "task list" or "work queue" or "command pattern"14:59
armaxwe also talked about introducing more states to network and subnet resources14:59
armaxto address some concurrency issues we noticed14:59
beaglesarmax: that's a decent idea too... it is an approach used in nova for instances14:59
armaxI am all in for a piece-meal approach to improving things14:59
anteayajog0: he feels that the current hits might be due to his inclusion of some patches using parallel testing with a similar fingerprint to bug #114:59
marunarmax: +1 on introducing more states15:01
beaglesarmax, but... it is not a panacea, it just gives you a mechanism of knowing when not to "do something dangerous". It doesn't address the issue of "doing something". The task_states etc. give some sanity to things like get_dhcp_port() after things have been deleted.15:01
marunarmax: Being able to track success or failure of operations with more granularity is key to improving reliability.15:01
marunI don't think that's the immediate solution, though.  Limiting concurrency is the obvious stepping stone.15:02
beaglesI'd rephrase that as "controlling dangerous concurrency" :)15:02
*** jistr has quit IRC15:02
*** networkstatic_zZ has quit IRC15:02
beaglesor "maximizing good concurrency" or something15:03
beaglesand reducing too much coffee intake for beagles15:03
*** networkstatic has joined #openstack-neutron15:03
*** reaper has joined #openstack-neutron15:03
marunbeagles:  controlling dangerous concurrency => preventing dangerous concurrency, and you've got yourself a deal15:03
beaglesright on!15:04
jog0anteaya: ahh that is very possible15:04
*** jistr has joined #openstack-neutron15:04
*** jistr is now known as jistr|mtg15:04
jog0anteaya: thats easy to confirm15:04
jog0anteaya: if we see it in the gate, then he is wrong15:04
anteayajog0: hmmmm15:05
jog0anteaya: which bug just to be clear15:05
*** sbasam has joined #openstack-neutron15:05
anteayajog0: do we have a graph for this bug showing up in the gate?15:05
anteayahttps://bugs.launchpad.net/bugs/125389615:05
anteayasalv-orlando: ^^15:05
jog0anteaya: ohh that one, that fails for nova-networking as well15:06
*** armax_ has joined #openstack-neutron15:06
jog0although its possible the neutron failures are due to what you just described, checking15:06
anteayathanks15:06
*** wcaban has joined #openstack-neutron15:06
*** thedodd has joined #openstack-neutron15:07
marunarmax: dumb question, should it be possible to run more than 1 instance of an agent type (dhcp, l3, metadata) on a given host?15:07
*** ocherka_ has quit IRC15:08
jog0http://logstash.openstack.org/#eyJmaWVsZHMiOltdLCJzZWFyY2giOiJtZXNzYWdlOlwiU1NIVGltZW91dDogQ29ubmVjdGlvbiB0byB0aGVcIiBBTkQgbWVzc2FnZTpcInZpYSBTU0ggdGltZWQgb3V0LlwiIEFORCBmaWxlbmFtZTpcImNvbnNvbGUuaHRtbFwiIiwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJvZmZzZXQiOjAsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sIm1vZGUiOiJ0ZXJtcyIsImFuYWx5emVfZmllbGQiOiJidWlsZF9uYW1lIiwic3RhbXAiOjEzODYxNjk2NTkxMjh915:08
jog0anteaya: the numbers back up that explanation at first glance15:08
*** armax has quit IRC15:08
*** armax_ is now known as armax15:08
anteayajog0: so your sense is that salv-orlando's assessment of the situation is accurate?15:09
jog0http://logs.openstack.org/28/51228/11/check/check-tempest-dsvm-neutron/6e8adb4/console.html15:09
anteayajog0: can I get a shorter url for the logstash link?15:09
armaxmarun: cannot speak for l3 and metadata, but I think the dhcp one, configured properly, should be able to15:10
marunah, ok15:10
jog0anteaya: that failure is in a swift job15:10
armaxand by configuring properly I mean ensuring that they don't step on each other's toes15:11
marunarmax: https://review.openstack.org/#/c/58814 is presuming that is not the case15:11
armaxlike host files etc15:11
anteayajog0: looking at the log, what should I be seeing?15:11
marunarmax: so I'll -1 and ask the question of the submitter15:11
jog0anteaya: not sure .. still digging in logstash15:11
*** markmcclain has joined #openstack-neutron15:11
armaxI quickly looked at that but I needed more time to digest15:11
armaxit15:11
armaxI wasn't convinced by it though15:12
marunarmax: I'm going to suggest he raise the issue on the mailing list.  I don't think that kind of design decision should be made without lots of input.15:13
armaxseems fair15:13
openstackgerritBrian Haley proposed a change to openstack/neutron: Change l3-agent to periodically check children are running  https://review.openstack.org/5999715:13
jog0ahh here we go15:14
jog0anteaya: https://review.openstack.org/#/c/57420 that is responsible for 26% of the failures15:14
jog0(out of the last 100 neutron failures)15:14
jog0query message:"SSHTimeout: Connection to the" AND message:"via SSH timed out." AND filename:"console.html"  AND build_name:*neutron*15:14
jog0the next biggest one is https://review.openstack.org/#/c/5762715:15
*** amritanshu_RnD has quit IRC15:15
jog0anteaya: in short i see no gate failures for neutron for that bug15:15
*** jecarey has joined #openstack-neutron15:15
jog0so I think its safe to say salv-orlando is right15:15
jog0I'll remove neutron from that bug15:16
anteayathank you, jog015:16
*** Sreedhar has quit IRC15:16
*** edbak has quit IRC15:16
*** ywu has quit IRC15:16
enikanorov__marun: hi15:16
jog0anteaya: thank you, good digging15:17
marunenikanorov__: hi!15:18
marunenikanorov__: I bet I know what you want to talk about :)15:18
anteayajog0: I'm learning more about logstash everyday15:18
jog0anteaya: its really powerful15:19
jog0and confusing15:19
*** heyongli has quit IRC15:19
anteayayes15:20
enikanorov__marun: yep15:20
enikanorov__marun: i'd like to discuss https://review.openstack.org/#/c/58814/15:21
marunenikanorov__: go ahead15:21
enikanorov__i've posted general comment15:22
enikanorov__so the fix is not trying to allow several agents of the same type on 1 host15:22
*** jlibosva has quit IRC15:22
*** matsuhashi has quit IRC15:23
marunenikanorov__: the proposed fix makes it impossible for more than 1 instance of a given agent type to run on a host15:23
marunenikanorov__: I'm not against that, but that decision needs to have broad consensus that I don't think is possible on a review.15:23
enikanorov__ah, I misread your comment actually15:23
anteayamarun: can I get a status update on: https://bugs.launchpad.net/neutron/+bug/1251448 ?15:24
enikanorov__but isn't that how things have worked up to the moment?15:24
enikanorov__it's just that with a certain deployment sequence it leads to problems15:24
marunenikanorov__: It is not hard-coded as the fix proposes.15:24
marunenikanorov__: I'm not saying 'this can't be merged'.  I'm saying 'ask the community if this is an acceptable restriction'15:25
enikanorov__yeah, sure15:25
*** clev has joined #openstack-neutron15:25
marunenikanorov__: If it is not, then agents are going to have to have unique ids generated for them instead of using agent type+host15:25
enikanorov__that makes sense15:25
marunanteaya: I'm afraid I have nothing to report beyond what is already commented on the bug report.15:25
marunanteaya: I've been sidelined by trying to improve the dhcp agent's reliability, which has been contributing to a host of other bugs15:26
marunanteaya: if you can find a volunteer to take it over,  I think that would be best.15:27
*** networkstatic has quit IRC15:27
*** fouxm_ has joined #openstack-neutron15:31
*** wcaban is now known as Mr_W15:31
*** fouxm_ has quit IRC15:32
*** fouxm_ has joined #openstack-neutron15:33
*** fouxm has quit IRC15:34
anteayamarun: ack15:34
anteayaokay folks we need someone to take over https://bugs.launchpad.net/neutron/+bug/125144815:35
*** rpodolyaka has joined #openstack-neutron15:36
openstackgerritstephen-ma proposed a change to openstack/neutron: Delete duplicate internal devices in router namespace  https://review.openstack.org/5795415:36
markmcclain1251448 looks a combo of a tempest and a neutron bug right?15:37
*** amuller has quit IRC15:37
rpodolyakaHey all! marun raised an interesting question about allowing multiple agents of the same type to run on one host in this review https://review.openstack.org/#/c/58814/ . I thought it might be interesting for you, guys15:37
jog0https://jenkins02.openstack.org/job/check-tempest-dsvm-neutron/buildTimeTrend15:37
jog0that job just dumped out a bunch of fails15:38
marunrpodolyaka: Please make sure to ask on mailing list too, not everyone will see the question on irc.15:38
*** SushilKM has joined #openstack-neutron15:38
jog0https://jenkins01.openstack.org/job/check-tempest-dsvm-neutron/buildTimeTrend15:38
jog0same here15:38
jog0anteaya: ^15:38
rpodolyakamarun: ok, just wanted you to look at my comment first :)15:39
jog0neutron check-tempest-dsvm-neutron just crapped out15:39
*** safchain has quit IRC15:39
anteayajog0: :(15:39
jog0hmm its horizon15:39
anteayajog0: what bug is that?15:39
jog0anteaya: none yet15:40
jog0oh wait15:40
anteayaor does the bug exist15:40
* anteaya waits15:40
jog0its everything15:40
jog0https://jenkins01.openstack.org/job/check-tempest-dsvm-full/buildTimeTrend15:40
anteayaeverything?15:40
marunrpodolyaka: hmmm, good point15:41
*** stackKid is now known as dingye15:41
marunmarkmcclain: ping15:41
*** otherwiseguy has joined #openstack-neutron15:41
rpodolyakamarun: so that's the only reason, why I implemented it like this15:42
markmcclainmarun: pong15:42
marunmarkmcclain: So, only 1 instance of a given agent type allowed per host?15:42
dingyehello, i am a newcomer, i'd like to contribute on QA/tempest for neutron. where to start? could anyone give me some hints?15:42
markmcclainmarun: yes because most share the same state files15:43
markmcclainotherwise they will interfere with each other15:43
anteayajog0: had never seen that view for jenkins before15:43
anteayaexciting15:43
marunmarkmcclain: ok, fair enough.  For some reason I thought it was possible for more than 1 to exist, maybe because the l3 agent without namespaces being enabled would require an agent per router.15:43
*** safchain has joined #openstack-neutron15:44
anteayawelcome dingye15:44
*** safchain has quit IRC15:44
anteayadingye: take a look at this etherpad: https://etherpad.openstack.org/p/icehouse-summit-qa-neutron15:44
marunrpodolyaka: ok, I'll update my review15:44
rpodolyakamarun: thanks!15:44
markmcclainl3 without namespaces can15:45
markmcclainmarun: ^15:45
jog0anteaya: heh, moving the conversation to -qa since its universal15:45
anteayadingye: find this heading: API tests gap analysis (as of 2013-12-1)15:45
anteayayou will see some instructions below it, to select one of the api gap items and create a launchpad bug for it15:45
anteayadingye: does that sound like something you are comfortable doing?15:45
marunmarkmcclain: in that case, what do you think of adding a unique index on (agent_type,host)?15:46
anteayajog0: very good, moving to -qa15:46
marunmarkmcclain: this restriction is already encoded in _get_agent_by_type_and_host, which throws an exception if multiple results are returned15:47
markmcclainmarun: that works for most cases but fails for l3 agents that serve different upstream nets15:48
*** carl_baldwin has joined #openstack-neutron15:49
marunmarkmcclain: the only way that could work (as armax suggested), would be to use a different hostname (for same ip) for each l3 agent15:49
dingyeanteaya: i understand the usercases of that section and i have played with neutron with ovs plugin. i'd like to start to look at some test example. is this right approach?15:49
markmcclainyeah15:49
rkukuramarkmcclain, marun: Should be no (or less) reason to run multiple l3 agents on same node once https://review.openstack.org/#/c/59359/ merges15:49
marunmarkmcclain: https://github.com/openstack/neutron/blob/master/neutron/db/agents_db.py#L13115:49
markmcclainright15:50
markmcclaindb should enforce :)15:50
*** amir1 is now known as asadoughi15:50
marunmarkmcclain: ok, cool.15:50
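markmcclain's "db should enforce" point can be illustrated with an in-memory SQLite table (a sketch, not Neutron's actual schema or migration): a unique constraint on (agent_type, host) makes the database itself reject a second agent row of the same type on the same host, instead of relying on _get_agent_by_type_and_host raising after the fact.

```python
import sqlite3

# hypothetical miniature of the agents table, with the unique index
# marun proposes on (agent_type, host)
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE agents (
    id INTEGER PRIMARY KEY,
    agent_type TEXT NOT NULL,
    host TEXT NOT NULL,
    UNIQUE (agent_type, host))""")

conn.execute("INSERT INTO agents (agent_type, host) "
             "VALUES ('DHCP agent', 'node1')")
try:
    # a second DHCP agent row for the same host violates the constraint
    conn.execute("INSERT INTO agents (agent_type, host) "
                 "VALUES ('DHCP agent', 'node1')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

print(duplicate_allowed)  # False: the database rejects the duplicate
```

As markmcclain notes above, this only works if the one-per-type-per-host rule is actually acceptable; deployments running multiple l3 agents per node (e.g. without namespaces) would need distinct host values or a different key.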
marunrpodolyaka: question, is there a reason not to retry, say, 5 times instead of 2?  And maybe make it configurable?15:52
*** jistr|mtg has quit IRC15:53
rpodolyakamarun: I think, no. The only case this error can happen is two concurrent transactions trying to insert the same entry and commit. Whatever happens, one will be committed, and the second will fail. And when it's retried, it will UPDATE the existing entry, rather than insert a new one15:54
marunrpodolyaka: what makes you think there would ever only be 2 concurrent transactions?15:54
rpodolyakamarun: ok. But what changes if we have more than 2? only one succeeds, other ones are retried and issue UPDATEs15:55
markmcclaindingye: here's a sample scenario test https://review.openstack.org/#/c/56242/15:55
*** SushilKM has quit IRC15:55
markmcclaindingye: and another inflight for API tests15:56
markmcclainhttps://review.openstack.org/#/c/56680/15:56
*** jroovers has joined #openstack-neutron15:58
rpodolyakamarun: I mean, if this call is retried, then we already have an (agent_type, host) entry committed. There won't be new INSERTs, only UPDATEs15:58
rpodolyakamarun: and we guarantee we retry once15:58
*** jorisroovers has quit IRC15:58
marunrpodolyaka: I guess it will be fine until it isn't.  At some point in the near future any exceptions being logged will result in gate failures.15:59
dingyemarkmcclain: thanks, i will start to learn from these examples.15:59
*** jistr has joined #openstack-neutron15:59
*** jistr is now known as jistr|mtg16:00
rpodolyakamarun: if you prove me wrong, I'll be happy to update the patch :) but I can't see how we can receive the error more than once16:02
roaetrkukura: howdy, when was the ml2 meeting rescheduled for (just to make sure)?16:02
marunrpodolyaka: concurrency is a wonderful thing16:02
*** jroovers has quit IRC16:02
*** Abhishek_ has joined #openstack-neutron16:02
*** dingye has quit IRC16:03
rkukuraroaet: Is now on #openstack-meeting-alt16:03
rpodolyakamarun:  sure it is :) but only one transaction can be committed at the time16:03
roaetah yes, so it is16:04
marunrpodolyaka: wow, that's a lot of sleep() calls in that method16:04
rpodolyakamarun: so one commits, all others will fail, be rolled back and retried, on retry we'll update the existing entry (multiple times, yes)16:04
*** Abhishe__ has joined #openstack-neutron16:05
*** alagalah has joined #openstack-neutron16:05
rpodolyakamarun: yeah, that must be the reason this error with multiple agent entries was triggered more often16:05
marunrpodolyaka: yes, I think you're correct16:05
*** alagalah has left #openstack-neutron16:05
*** safchain has joined #openstack-neutron16:07
*** networkstatic has joined #openstack-neutron16:07
*** Abhishek_ has quit IRC16:08
marunrpodolyaka: I suggest updating the comment (without a NOTE so it is more visible) to better explain that 'DBDuplicateEntry' exceptions can only happen when two or more creates contend, and that subsequent operations will always be updates and more or less guaranteed to succeed.16:09
*** banix has joined #openstack-neutron16:09
rpodolyakamarun: got it, thanks!16:09
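rpodolyaka's argument above, that a duplicate-entry error can only fire when two or more creates contend and a retry then takes the UPDATE path, can be sketched with stdlib sqlite3 (hypothetical table and function names, not Neutron's real report_state code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE agents (
    agent_type TEXT, host TEXT, heartbeat INTEGER,
    UNIQUE (agent_type, host))""")

def report_state(agent_type, host, heartbeat):
    # create-or-update: only the first insert for a given
    # (agent_type, host) can succeed; a contending caller hits the
    # duplicate-entry error once and then falls back to UPDATE, which
    # cannot raise it again
    try:
        conn.execute("INSERT INTO agents VALUES (?, ?, ?)",
                     (agent_type, host, heartbeat))
    except sqlite3.IntegrityError:
        conn.execute("UPDATE agents SET heartbeat = ? "
                     "WHERE agent_type = ? AND host = ?",
                     (heartbeat, agent_type, host))

report_state("DHCP agent", "node1", 1)  # first call inserts
report_state("DHCP agent", "node1", 2)  # contending call falls back to update

rows = conn.execute("SELECT heartbeat FROM agents").fetchall()
print(rows)  # [(2,)] - a single row, updated in place
```

However many callers contend, exactly one insert wins and every retry resolves to an update of that single row, which is why a small retry count suffices here.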
*** otherwiseguy has quit IRC16:11
*** lori has joined #openstack-neutron16:11
*** Abhishe__ has quit IRC16:11
*** x86brandon has joined #openstack-neutron16:13
*** armax has quit IRC16:17
anteayaroaet can you take over this bug? https://bugs.launchpad.net/neutron/+bug/125144816:17
roaetlooking16:17
anteayathanks16:17
anteayathis is our worst gate blocking bug right now16:18
anteayaand I am trying to find someone to take it over for marun16:18
anteayayou can talk to him about it16:18
anteayaI just don't want to leave it hanging without someone championing it16:18
roaetanteaya: not sure if 'taking over' is a good idea, but I am definitely looking at it with mlavalle already16:18
anteayaand I am at the end of my day16:19
anteayaawesome16:19
*** markmcclain has quit IRC16:19
roaetthanks for the directive. it was blocking my changes so I have been trying to fit it into my schedule16:19
anteayacan you post status at the end of the day so when I look at it again, I get a sense of where you are?16:19
anteayaroaet: awesome16:19
anteayathanks16:19
anteayaI am afk for the night16:19
anteayasee you tomorrow, I hope16:20
roaetok. will try to remember, night16:20
*** rudrarugge has joined #openstack-neutron16:20
openstackgerritSylvain Afchain proposed a change to openstack/neutron: Fix Metering doesn't respect the l3 agent binding  https://review.openstack.org/6001616:22
*** bashok has joined #openstack-neutron16:24
*** alex_klimov1 has quit IRC16:25
*** Sreedhar has joined #openstack-neutron16:26
sbasammark, we spoke at the last summit about being able to support preserving neutron ports on an instance terminate. The bug report is at https://bugs.launchpad.net/neutron/+bug/116101516:29
*** reaper has quit IRC16:30
*** SushilKM has joined #openstack-neutron16:35
*** sbasam has quit IRC16:35
*** rudrarugge has quit IRC16:36
*** jgrimm has joined #openstack-neutron16:41
*** mlavalle has joined #openstack-neutron16:45
mlavalleamir: ping16:45
*** SumitNaiksatam has quit IRC16:48
*** marun has quit IRC16:50
*** armax has joined #openstack-neutron16:50
openstackgerritJon Grimm proposed a change to openstack/neutron: Openvswitch update_port should return updated port info  https://review.openstack.org/5884716:50
*** sbasam has joined #openstack-neutron16:51
*** otherwiseguy has joined #openstack-neutron16:52
*** jistr|mtg is now known as jistr16:54
*** armax has quit IRC16:55
*** armax has joined #openstack-neutron16:55
*** clev has quit IRC16:58
*** SushilKM__ has joined #openstack-neutron16:59
*** SushilKM has quit IRC17:01
asadoughirkukura: no official follow up discussion. i was thinking of adding agenda item for next week ml2 meeting17:02
rkukuraasadoughi: ok17:03
mesteryasadoughi: I think ML2 is the right place to discuss that, please add an item.17:03
asadoughimestery: oh, do i just edit https://wiki.openstack.org/wiki/Meetings/ML2 ?17:04
asadoughiwasn't sure if there was a more official way of asking for an agenda item17:05
mesteryasadoughi: Yes, please feel free to.17:06
mesteryJust add a new section at the top with the date for next week's meeting17:07
mesteryAnd go from there.17:07
asadoughimestery: done17:07
*** yamahata_ has quit IRC17:08
mesteryasadoughi: Awesome, thanks!17:08
*** jpich has quit IRC17:10
*** SumitNaiksatam has joined #openstack-neutron17:12
openstackgerritRoman Podoliaka proposed a change to openstack/neutron: Fix a race condition in agents status update code  https://review.openstack.org/5881417:16
*** armax has quit IRC17:17
pete5dear neutronians: I have a nova review for token sharing in nova.network.neutronv2 that helps a lot with instance creation time. Would appreciate some eyes on :-) https://review.openstack.org/#/c/58854/17:18
*** pcm_ has quit IRC17:19
*** pcm_ has joined #openstack-neutron17:19
*** jistr has quit IRC17:19
*** ywu has joined #openstack-neutron17:23
*** pcm_ has quit IRC17:24
*** safchain has quit IRC17:25
mesterypete5: Looking.17:26
*** armax has joined #openstack-neutron17:30
*** mlavalle has quit IRC17:32
*** chandankumar has quit IRC17:33
*** networkstatic has quit IRC17:38
<openstackgerrit> Yves-Gwenael Bourhis proposed a change to openstack/neutron: Make dnsmasq aware of all names  https://review.openstack.org/52930  17:41
*** garyk has joined #openstack-neutron17:45
<garyk> rkukura: ping  17:45
*** fouxm_ has quit IRC17:48
*** rossella_s has quit IRC17:52
<garyk> salv-orlando: ping  17:53
<salv-orlando> hi garyk  17:53
<garyk> salv-orlando: i am having some connectivity issues - can you look at the private message i sent you  17:54
*** ygbo has quit IRC17:59
<garyk> rkukura: you around?  18:06
*** yfried has quit IRC18:09
*** armax has quit IRC18:14
*** yfried has joined #openstack-neutron18:20
*** armax has joined #openstack-neutron18:23
*** armax has quit IRC18:24
*** Abhishek_ has joined #openstack-neutron18:34
*** Abhishek_ has quit IRC18:36
*** harlowja has joined #openstack-neutron18:37
*** Abhishek_ has joined #openstack-neutron18:37
<openstackgerrit> Sean M. Collins proposed a change to openstack/neutron: Create a new attribute for subnets, to store v6 dhcp options  https://review.openstack.org/52983  18:41
*** jprovazn has quit IRC18:42
*** pcm_ has joined #openstack-neutron18:43
*** jprovazn has joined #openstack-neutron18:44
*** pcm_ has quit IRC18:44
*** pcm_ has joined #openstack-neutron18:45
*** nati_ueno has quit IRC18:45
*** alagalah has joined #openstack-neutron18:46
*** alagalah has left #openstack-neutron18:46
<otherwiseguy> Hmm, so whatever generates the tarballs for releases messes up the whitespace in setup.cfg so it no longer matches what exists in the git tag (in addition to adding the [egg_info] section at the end).  18:46
<rkukura> garyk: pong  18:46
<otherwiseguy> This is currently screwing up my tools for packaging a backport of a setup.cfg change. :p  18:47
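The whitespace mismatch otherwiseguy describes (reflowed `setup.cfg` plus an appended `[egg_info]` section in the release tarball) can be worked around by comparing the two files semantically rather than byte-for-byte. Below is a hypothetical sketch of that idea using Python's stdlib `configparser`; the function name, the sample texts, and the choice to ignore `[egg_info]` are all assumptions for illustration, not part of anyone's actual packaging tooling.

```python
# Compare two setup.cfg texts at the section/option level so that whitespace
# reflow and a tarball-appended [egg_info] section don't register as real
# differences. Purely illustrative; names and samples are made up.
import configparser

def semantic_diff(cfg_a: str, cfg_b: str, ignore_sections=("egg_info",)):
    """Return the names of sections whose parsed options differ."""
    parsed = []
    for text in (cfg_a, cfg_b):
        cp = configparser.ConfigParser()
        cp.read_string(text)
        parsed.append({
            section: dict(cp.items(section))
            for section in cp.sections()
            if section not in ignore_sections
        })
    a, b = parsed
    return [s for s in sorted(set(a) | set(b)) if a.get(s) != b.get(s)]

# Hypothetical git-tag copy vs. tarball copy: extra spaces and an
# [egg_info] section, but the same effective metadata.
git_copy = "[metadata]\nname = neutron\nsummary = OpenStack Networking\n"
tarball_copy = (
    "[metadata]\nname =  neutron\nsummary =   OpenStack Networking\n"
    "\n[egg_info]\ntag_build = \n"
)

print(semantic_diff(git_copy, tarball_copy))  # -> [] (no semantic change)
```

Since `configparser` strips surrounding whitespace from values when parsing, the two copies above compare equal even though they differ byte-for-byte.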
* otherwiseguy waves at garyk  18:47
<garyk> rkukura: sent you a mail :)  18:47
<garyk> hey otherwiseguy - hope all is well.  18:47
<rkukura> garyk: will reply to that  18:47
<garyk> rkukura: thanks  18:48
<otherwiseguy> garyk: things are at least better. ;)  18:49
<otherwiseguy> I hope all is well with you too!  18:49
*** otherwiseguy has quit IRC18:55
<sc68cal> One of these days I won't wince when I push something that's a work in progress into gerrit....  18:55
*** alagalah has joined #openstack-neutron19:00
*** pete5 has quit IRC19:02
*** pete5 has joined #openstack-neutron19:02
*** pete5 has quit IRC19:02
*** pete5 has joined #openstack-neutron19:02
*** sbasam has quit IRC19:06
*** alagalah has left #openstack-neutron19:17
*** yfried has quit IRC19:18
*** sbasam has joined #openstack-neutron19:24
*** yfried has joined #openstack-neutron19:27
*** alex_klimov has joined #openstack-neutron19:31
*** shashank_ has joined #openstack-neutron19:34
*** clev has joined #openstack-neutron19:51
*** rudrarugge has joined #openstack-neutron19:55
*** alagalah_ has joined #openstack-neutron19:55
<openstackgerrit> Jon Grimm proposed a change to openstack/neutron: Openvswitch update_port should return updated port info  https://review.openstack.org/58847  20:04
*** alagalah_ has left #openstack-neutron20:04
*** armax has joined #openstack-neutron20:12
*** Abhishek_ has quit IRC20:12
*** armax has quit IRC20:15
*** Abhishek_ has joined #openstack-neutron20:24
*** Sreedhar has quit IRC20:24
*** clev_ has joined #openstack-neutron20:32
*** SushilKM__ has quit IRC20:34
*** clev has quit IRC20:34
<carl_baldwin> anteaya: ping  20:35
*** Abhishek_ has quit IRC20:40
*** otherwiseguy has joined #openstack-neutron20:47
<anteaya> hello carl_baldwin  20:48
<carl_baldwin> anteaya: Was wondering if you received my email from yesterday about the sprint.  20:48
<roaet> anteaya: i have confirmed that I will be able to, and have time to, work on https://bugs.launchpad.net/neutron/+bug/1251448 - I will keep you apprised as I go  20:49
<anteaya> carl_baldwin: I did  20:49
<anteaya> spent the day meeting with markmcclain  20:49
<anteaya> looking to send out an email to you and a few others shortly  20:50
<anteaya> roaet: dude you rock  20:50
<anteaya> roaet: are you coming to the code sprint?  20:50
<roaet> what is that?  20:50
* roaet is super disconnected from reality :(  20:50
<carl_baldwin> anteaya: thanks.  20:51
<anteaya> roaet: http://lists.openstack.org/pipermail/openstack-dev/2013-November/018907.html  20:51
<anteaya> carl_baldwin: thanks for the ping  20:51
<roaet> footwear heh  20:53
*** networkstatic has joined #openstack-neutron20:54
<roaet> anteaya: I would like to go, but my employer will not fund it. I will need to figure out all the prices to see if I can work it into my budget.  20:56
<anteaya> roaet: do your figuring and let me know how it goes  20:58
<anteaya> we are not above some public shaming if folks who are willing to do the work to shore up the gaps are not given the support they deserve from their employer  20:59
<anteaya> it isn't our first choice, but it is an option  20:59
<roaet> YMQ right?  21:00
<anteaya> YMQ?  21:00
<roaet> the airport  21:00
<roaet> in quebec  21:00
<anteaya> any airport you like  21:00
<roaet> anteaya: any idea if it is weekend/weekday?  21:00
<anteaya> http://en.wikipedia.org/wiki/List_of_airports_in_the_Montreal_area  21:01
<anteaya> Wednesday, Thursday, Friday  21:01
<anteaya> Dec 15, 16 and 17  21:01
<anteaya> YUL  21:02
<roaet> wait.. it said 2nd week of january  21:02
<anteaya> yes  21:02
<anteaya> second full week of January  21:02
<roaet> Not Dec 15, 16 and 17?  21:02
<roaet> ^^  21:02
<roaet> Jan 15 - 17  21:02
<anteaya> http://lists.openstack.org/pipermail/openstack-dev/2013-November/019973.html  21:02
<anteaya> sorry yes  21:02
<anteaya> Jan 15, 16, 17  21:03
<roaet> hrm. it isn't bad.  21:03
<anteaya> sorry I am tired  21:03
<anteaya> good  21:03
<anteaya> work your magic and tell me what support you need  21:03
<anteaya> I will do my best to provide it  21:03
<roaet> do you know where the conference thing is happening?  21:03
<roaet> sorry  21:03
<roaet> probably in the list  21:03
* roaet reads  21:04
<anteaya> Salle du Parc, New Residence  21:04
<anteaya> McGill University, 3625 Parc Avenue  21:04
<anteaya> np  21:04
*** harlowja has quit IRC21:09
*** harlowja has joined #openstack-neutron21:09
<roaet> anteaya: pretty much just need to have my passport eh? Seems ok.  21:11
<roaet> I'll talk to my manager and see how it goes :D  21:11
<anteaya> cool  21:15
<anteaya> thanks  21:15
<anteaya> keep me posted  21:15
<anteaya> yeah, for US citizens entering Canada it's just a passport  21:15
*** suresh12 has joined #openstack-neutron21:15
*** jprovazn has quit IRC21:17
*** julim has quit IRC21:25
*** Abhishek_ has joined #openstack-neutron21:27
*** bashok has quit IRC21:28
*** rudrarugge has quit IRC21:50
*** Abhishek_ has quit IRC21:51
*** Abhishek_ has joined #openstack-neutron21:51
*** Abhishe__ has joined #openstack-neutron21:54
*** Abhishek_ has quit IRC21:55
*** harlowja has quit IRC22:02
<openstackgerrit> dekehn proposed a change to openstack/neutron: extra_dhcp_opt add checks for empty strings  https://review.openstack.org/59858  22:08
*** dims has quit IRC22:10
*** otherwiseguy has quit IRC22:11
*** mlavalle has joined #openstack-neutron22:12
*** clev_ has quit IRC22:13
*** sbasam has quit IRC22:15
*** bvandenh has quit IRC22:16
*** aymenfrikha has quit IRC22:16
*** clev has joined #openstack-neutron22:22
*** peristeri has quit IRC22:24
*** harlowja has joined #openstack-neutron22:27
*** otherwiseguy has joined #openstack-neutron22:27
*** sbasam has joined #openstack-neutron22:30
*** sbasam has quit IRC22:34
*** x86brandon has quit IRC22:35
*** clev has quit IRC22:42
*** clev has joined #openstack-neutron22:43
*** rossella_s has joined #openstack-neutron22:49
*** clev has quit IRC22:54
*** alex_klimov has quit IRC22:55
*** nati_ueno has joined #openstack-neutron23:01
*** jecarey_ has joined #openstack-neutron23:06
*** jecarey has quit IRC23:07
*** networkstatic has quit IRC23:11
<openstackgerrit> dekehn proposed a change to openstack/neutron: extra_dhcp_opt add checks for empty strings  https://review.openstack.org/59858  23:11
*** Mr_W has quit IRC23:13
*** SumitNaiksatam has quit IRC23:13
*** SumitNaiksatam has joined #openstack-neutron23:14
*** jecarey_ has quit IRC23:15
*** SumitNaiksatam has quit IRC23:15
*** pcm_ has quit IRC23:16
*** gdubreui has joined #openstack-neutron23:18
*** rossella_s has quit IRC23:29
*** banix has quit IRC23:31
*** aymenfrikha has joined #openstack-neutron23:41
*** salv-orlando has quit IRC23:49
*** yamahata_ has joined #openstack-neutron23:50
*** gdubreui has quit IRC23:51
*** gdubreui has joined #openstack-neutron23:51
*** thedodd has quit IRC23:52
*** bgorski has joined #openstack-neutron23:57
<bgorski> Hi folks  23:57
<bgorski> I have a question. Assume I have a router with a gateway added. What is the best way to get the IP address of this gateway through the API?  23:58
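One common answer to bgorski's question: the router's `external_gateway_info` only names the external network, but the gateway IP itself lives on the router's gateway port, so you can list ports filtered by `device_id=<router id>` and `device_owner=network:router_gateway` (`GET /v2.0/ports?device_id=...&device_owner=network:router_gateway`) and read `fixed_ips` from the result. The sketch below runs against a hypothetical sample of that response rather than a live Neutron endpoint; the helper name and the sample IDs/addresses are made up for illustration.

```python
# Extract the gateway IP(s) from a Neutron port-list response. The
# device_owner value network:router_gateway marks the router's external
# gateway port; its fixed_ips carry the address bgorski is after.

def gateway_ips(ports_response: dict) -> list:
    """Pull every fixed IP off the router's gateway port(s)."""
    return [
        ip["ip_address"]
        for port in ports_response.get("ports", [])
        if port.get("device_owner") == "network:router_gateway"
        for ip in port.get("fixed_ips", [])
    ]

# Hypothetical sample of GET /v2.0/ports?device_id=<router>&device_owner=...
sample = {
    "ports": [
        {
            "id": "3a44f0b2-0000-0000-0000-000000000000",  # made-up id
            "device_owner": "network:router_gateway",
            "fixed_ips": [
                {"subnet_id": "made-up-subnet", "ip_address": "172.24.4.226"},
            ],
        },
    ],
}

print(gateway_ips(sample))  # -> ['172.24.4.226']
```

The same filter works through python-neutronclient's port listing; filtering server-side by `device_owner` avoids having to page through every port on the external network.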

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!