Tuesday, 2019-04-09

*** Swami has quit IRC00:08
*** abaindur has quit IRC00:56
*** ricolin has joined #openstack-lbaas01:02
sapd1I don't know why he did not continue implementing the l3-active-active feature: https://review.openstack.org/#/q/owner:yjf1970231893%2540gmail.com+status:open01:03
*** hongbin has joined #openstack-lbaas01:33
*** ricolin_ has joined #openstack-lbaas01:50
*** ricolin has quit IRC01:50
*** hongbin has quit IRC02:35
*** hongbin has joined #openstack-lbaas02:38
*** hongbin has quit IRC02:49
*** hongbin has joined #openstack-lbaas02:51
*** yamamoto has quit IRC03:40
*** yamamoto has joined #openstack-lbaas03:55
*** ramishra has joined #openstack-lbaas03:59
*** hongbin has quit IRC04:05
*** Vorrtex has joined #openstack-lbaas04:13
*** Vorrtex has quit IRC04:22
*** abaindur has joined #openstack-lbaas04:51
*** abaindur has quit IRC05:12
*** ricolin_ has quit IRC05:58
*** ccamposr has joined #openstack-lbaas06:10
*** ricolin has joined #openstack-lbaas06:22
rm_worksapd1: basically, that work was being done by walmart, and they ended up dropping the whole project IIRC06:27
rm_workjohnsom might be able to corroborate that or correct me06:27
sapd1worst06:30
*** pcaruana has joined #openstack-lbaas06:30
*** abaindur has joined #openstack-lbaas06:36
sapd1rm_work, do you know his IRC nickname?06:41
rm_worksapd1: not sure if he's still here anymore06:42
rm_workumm06:42
rm_worki think we interfaced with someone else there06:43
sapd1Ya. I see a topic about active/active for this PTG and am looking forward to hearing something new. I've been trying to follow some patches from him, but it seems like he does not work on it any more.06:47
openstackgerritGregory Thiemonge proposed openstack/octavia master: Set member initializing state as OFFLINE  https://review.openstack.org/65111107:02
*** ivve has joined #openstack-lbaas07:06
*** rpittau|afk is now known as rpittau07:12
*** happyhemant has joined #openstack-lbaas07:48
openstackgerritMerged openstack/octavia-lib master: Remove testtools from test-requirements.txt  https://review.openstack.org/64487607:59
*** rcernin has quit IRC08:01
*** luksky has joined #openstack-lbaas08:15
*** abaindur has quit IRC08:23
*** yamamoto has quit IRC08:30
*** yamamoto has joined #openstack-lbaas08:36
*** vishalmanchanda has joined #openstack-lbaas08:44
*** chungpht has joined #openstack-lbaas09:11
*** luksky has quit IRC09:15
*** yamamoto has quit IRC09:16
*** yamamoto has joined #openstack-lbaas09:44
*** yamamoto has quit IRC09:47
*** psachin has joined #openstack-lbaas09:59
*** salmankhan has joined #openstack-lbaas10:00
*** yamamoto has joined #openstack-lbaas10:02
*** livelace has joined #openstack-lbaas10:02
livelaceHello. I cannot find any information about the heartbeat logic. If an amphora sends its UDP packets through NAT, is that OK? Does the health manager work with such packets?10:04
*** yamamoto has quit IRC10:08
*** luksky has joined #openstack-lbaas10:08
livelaceWhy am I asking? Because I see that the amphora is in ERROR status, but it works fine.10:12
cgoncalveslivelace, hi. sadly it will not work with heartbeats being sent behind NAT. the health manager looks for the source address of the UDP packet10:15
*** yamamoto has joined #openstack-lbaas10:15
cgoncalveshttps://github.com/openstack/octavia/blob/372ff99a030e6b33dad11a35cb9d5c4058805c53/octavia/amphorae/drivers/health/heartbeat_udp.py#L18810:15
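The behaviour cgoncalves describes can be sketched with a minimal, stdlib-only model (the function and variable names here are illustrative, not Octavia's actual code): the listener identifies the amphora by the source address that `recvfrom()` reports, which behind NAT is the gateway's address rather than the amphora's, so the lookup fails.

```python
import socket

def receive_heartbeat(sock):
    # The health-manager pattern under discussion: trust the UDP source
    # address reported by recvfrom(). Behind NAT this is the NAT
    # gateway's address, so a lookup keyed on it cannot find the amphora.
    payload, srcaddr = sock.recvfrom(4096)
    return payload, srcaddr[0]

# Local demonstration over loopback:
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))
port = listener.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"heartbeat", ("127.0.0.1", port))

payload, src_ip = receive_heartbeat(listener)
sender.close()
listener.close()
print(payload, src_ip)
```

On loopback the reported address is the real sender; with NAT in the path, `src_ip` would be the translating device instead.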
livelacecgoncalves, Thanks. That means I should change my communication topology :(10:17
*** yamamoto has quit IRC10:18
*** yamamoto has joined #openstack-lbaas10:21
*** yamamoto has quit IRC10:21
*** yamamoto has joined #openstack-lbaas10:22
*** yamamoto has quit IRC10:24
*** yamamoto has joined #openstack-lbaas10:24
cgoncalvessorry for that. perhaps the heartbeat packet could include the IP address in the payload, and the health manager could later check if it's set and use that one, otherwise falling back to the srcaddr as it does today10:33
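cgoncalves' suggestion — prefer an address carried in the payload, falling back to the UDP source address — could look roughly like this (a hypothetical sketch, not a proposed Octavia patch; the `reported_ip` field name is invented):

```python
import json

def pick_amphora_address(payload, srcaddr):
    # Prefer an address carried in the heartbeat payload; otherwise keep
    # today's behaviour and use the UDP source address.
    try:
        msg = json.loads(payload)
    except ValueError:
        return srcaddr
    if isinstance(msg, dict) and msg.get("reported_ip"):
        return msg["reported_ip"]
    return srcaddr
```

Any such field would of course need to be covered by the heartbeat's authentication, or it would just move the spoofing problem into the payload.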
cgoncalveslivelace, could you please file a story on storyboard.openstack.org?10:34
livelacecgoncalves, I don't think so, really, because I have a very specific configuration and I don't know Octavia well. If I change my mind, I will file a story on storyboard (it is cool that such a board exists).10:41
*** livelace has quit IRC10:56
*** yamamoto has quit IRC11:01
*** yamamoto has joined #openstack-lbaas11:03
*** yamamoto has quit IRC11:07
*** yamamoto has joined #openstack-lbaas11:09
*** yamamoto has quit IRC11:09
*** yamamoto has joined #openstack-lbaas11:26
*** yamamoto has quit IRC11:41
openstackgerritGregory Thiemonge proposed openstack/octavia master: Set member initializing state as OFFLINE  https://review.openstack.org/65111111:47
rm_workcgoncalves: yeah we did it that way partly for security too, not just convenience11:51
rm_workkind of "proof" that it is the packet it says it is11:51
rm_worksince the only other auth bits are global (no per-amp encryption key for packets)11:51
rm_workthat way someone couldn't perform part of a DoS by spoofing "healthy" messages and then trying to take down an amp or something (i dunno)11:52
rm_workor at least make it more difficult11:52
rm_workbut maybe it doesn't actually help that much, dunno, would be good to hear from a seasoned network security person who might know if that's actually reasonable11:53
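The "global auth bits" rm_work mentions amount to a single HMAC key shared by all amphorae. A toy sketch of that scheme (key and names invented; Octavia's real packet format differs): a valid digest proves the sender holds the cluster-wide key, but not which amphora sent it, so the source address acts as the per-amphora identifier.

```python
import hashlib
import hmac

# One global key shared by every amphora (demo value only). There is no
# per-amphora key, which is the limitation discussed above.
HEARTBEAT_KEY = b"insecure-demo-key"

def sign(payload):
    # Append a SHA-256 HMAC of the payload under the shared key.
    return payload + hmac.new(HEARTBEAT_KEY, payload, hashlib.sha256).digest()

def verify(packet):
    # Split off the trailing 32-byte digest and check it in constant time.
    payload, digest = packet[:-32], packet[-32:]
    expected = hmac.new(HEARTBEAT_KEY, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(digest, expected) else None
```

With this layout, replaying a captured "healthy" packet from a different source address is exactly the attack that the source-address check makes harder.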
cgoncalvesrm_work, I didn't know about that decision (you guys are Octavia dinosaurs). makes sense :)11:53
rm_workwell, we could also have per-amp encryption keys11:54
* rm_work shrugs11:54
rm_workjust more work11:54
rm_workwe were going for "secure enough" but also "able to launch in a reasonable timeframe"11:54
rm_workturns out we were too late for RAX :/11:55
rm_workand also GD11:55
rm_workas it turns out T_T11:55
*** yamamoto has joined #openstack-lbaas11:59
*** yamamoto has quit IRC11:59
*** celebdor1 has joined #openstack-lbaas12:18
*** livelace has joined #openstack-lbaas12:21
*** celebdor1 has quit IRC12:22
*** celebdor1 has joined #openstack-lbaas12:23
johnsomYeah, the design does not support NAT.  It is a routable private network, so shouldn't need NAT.  But as rm_work said, it is a security feature. One of a few layers.12:24
livelaceCaught a difference between the centos and ubuntu amphora images: the ubuntu network namespace doesn't have a default route, while centos has default routes (which it takes from the network settings). Is it a problem with the ubuntu image?12:25
johnsomUbuntu has a default route, but you might have an image that has a recent bug in it.12:26
johnsomThe bug causes eth1 to not be up at all in the netns, the routes included.12:27
livelacejohnsom, That is why I use centos mostly. Ok, thanks.12:27
*** celebdor1 has quit IRC12:28
johnsomIn fairness, it was an Octavia bug we introduced in the RCs.  RC3 has it fixed.12:29
*** livelace has quit IRC12:32
*** yamamoto has joined #openstack-lbaas12:43
cgoncalvesreviewers: as we plan to cut a release for stable/queens soon, it would be great if we could also try to have https://review.openstack.org/#/c/650909/ in12:43
*** openstackgerrit has quit IRC12:44
*** HVT has quit IRC12:44
*** yamamoto has quit IRC13:10
*** yamamoto has joined #openstack-lbaas13:11
*** yamamoto has quit IRC13:11
*** vishalmanchanda has quit IRC13:50
*** yamamoto has joined #openstack-lbaas13:51
*** fnaval has joined #openstack-lbaas13:54
*** Vorrtex has joined #openstack-lbaas13:58
*** yamamoto has quit IRC14:02
*** openstackgerrit has joined #openstack-lbaas14:11
openstackgerritMerged openstack/octavia stable/queens: Fix the amphora base port coming up  https://review.openstack.org/65046914:11
*** celebdor1 has joined #openstack-lbaas14:13
openstackgerritMerged openstack/octavia stable/rocky: Fix the amphora base port coming up  https://review.openstack.org/65046814:15
*** celebdor1 has quit IRC14:20
*** boden has joined #openstack-lbaas14:30
cgoncalvesnmagnezi, dayou_: do you have some spare minutes to review https://review.openstack.org/#/c/650909/?14:30
*** gcheresh_ has joined #openstack-lbaas14:33
*** livelace has joined #openstack-lbaas14:42
*** lemko has joined #openstack-lbaas14:46
openstackgerritMerged openstack/octavia master: Fix the amphora base port coming up  https://review.openstack.org/65041714:46
openstackgerritGregory Thiemonge proposed openstack/octavia master: Set member initializing state as OFFLINE  https://review.openstack.org/65111114:46
*** gcheresh has joined #openstack-lbaas14:50
*** gcheresh_ has quit IRC14:54
livelacejohnsom, Were there problems with DNS resolution in CentOS-based amphora images?14:57
livelacejiteka, I see in nsswitch only: "hosts:      files  myhostname"14:58
johnsomDNS is disabled in all of the amphora images.14:58
livelacejiteka, Sorry14:58
livelacejohnsom, For what reason ?14:59
johnsomIt is not needed, it slows down a lot of processes, and there is typically no DNS resolver available to the amphora.14:59
johnsomNot to mention the security implications.15:00
johnsomIf you feel you do need it, you can always create a custom image that turns it back on and create a way for the amps to access a resolver.15:04
livelacejohnsom, Yes, I understand. Thanks for your comments.15:07
livelacejohnsom, I'm just investigating how I can speed up amphora initialization. Are there any prod installations that use containers for this purpose?15:09
johnsomNo.  The amps boot in around 30 seconds in most clouds, which seems reasonable for most.15:11
johnsomI have done a PoC using lxd, the patches are posted, but there are tradeoffs.15:12
cgoncalvesand if single topology, you may consider amphora spare pool for faster provisioning times15:12
livelaceMy initialization takes 160 seconds :(15:13
johnsomThe real issue is that we are plumbing, and the container platforms are focused on the app layers.15:13
johnsomlivelace: Then you have something very wrong in your deployment.15:13
cgoncalvesis your deployment on a nested virtualized environment?15:14
johnsomI can’t go into detail today as I am traveling, so can go more into detail later in the week on containers.15:14
livelacejohnsom, I see that the worker checks the availability of the agent; at that moment the VM is already up and I can connect to it over SSH.15:14
livelaceThese checks eat most of the time.15:15
livelacejohnsom, Ok, have a good trip :)15:16
*** gcheresh has quit IRC15:17
johnsomIf it is checking, the vm should be booted by then. It sounds like the environment has a problem.15:19
bcafarelcgoncalves: o/ can you +2 https://review.openstack.org/#/c/646673/ ? (I see you went through all the other ones)15:19
cgoncalvesbcafarel, +W'd15:19
bcafarelcgoncalves: thanks, kicking the complete set in gate run15:20
livelaceI caught another issue (for me): after shutting down the whole server (I'm testing on it), the amphora stays in "ERROR" state and the worker doesn't recreate it or create a new one. If I shut down the amphora correctly, the worker recreates a new one.15:22
*** sapd1_x has joined #openstack-lbaas15:22
livelaceIs there any way to keep checking the amphora state and retry creating a new instance?15:24
livelaceIt seems that the controller/worker "forgets" about the amphora if the controller/worker was restarted.15:25
cgoncalveslivelace, the health manager service should have triggered an amphora fail over. could you check the logs around the time the amphora went in to ERROR?15:26
cgoncalvesoh, you restarted the health manager service while it was failing over the amphora?15:26
*** sapd1_x has quit IRC15:27
colin-i'd be surprised if any of the octavia processes forgot about an amphora, you can verify that by checking the amphora table in the octavia database where load_balancer_id=<your id>15:28
colin-it will even track deleted ones there. if you have that visibility, it can be helpful in illustrating what octavia sees15:28
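colin-'s suggested lookup is a plain query against the `amphora` table. Sketched here against an in-memory SQLite stand-in for the octavia database (the real schema has many more columns):

```python
import sqlite3

# Tiny stand-in for the octavia database, just to show the shape of the
# query: one row per amphora, including DELETED ones, keyed by LB id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE amphora (id TEXT, load_balancer_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO amphora VALUES (?, ?, ?)",
    [("amp-1", "lb-1", "DELETED"), ("amp-2", "lb-1", "ERROR")],
)

rows = conn.execute(
    "SELECT id, status FROM amphora WHERE load_balancer_id = ? ORDER BY id",
    ("lb-1",),
).fetchall()
print(rows)
```

Against a real deployment the same `SELECT` run on the octavia database shows every amphora the controller has ever associated with that load balancer.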
livelaceStatus of amphora and lb right after shutdown of the server https://paste.fedoraproject.org/paste/IEYR26k~jBJQYR2jsAzCAg15:28
livelaceThe amphora vm is in Shutoff state15:29
livelacecgoncalves, I'm simulating a power loss: I shut down the whole server (shutdown -h now) while the amphora and the controller were running.15:31
johnsomYeah, that was probably the only compute node, the controller attempted to failover and repair, but nova failed to provision a replacement instance, so we stopped and marked it error.15:31
gthiemongeView All15:32
*** ivve has quit IRC15:32
gthiemongeoops, sorry15:32
cgoncalveslivelace, johnsom made a good point. could you please check the health manager service logs?15:34
livelaceDebug log https://paste.fedoraproject.org/paste/IBO1HrPcMWEQskeB4rgZAA15:38
*** ccamposr has quit IRC15:39
cgoncalveshmm, "ComputeDeleteException: Failed to delete compute instance.". the exception raised is not being handled15:39
livelaceAnd what next? The amphora and the LB are not working. What should we do if the cluster goes down? Recreate all LBs by hand? :)15:40
*** luksky has quit IRC15:42
livelacecgoncalves, I guess because the Nova services were still initializing at that time.15:43
livelaceMy experience suggests that we should do periodic checks and try to recreate amphorae and their LBs, because we don't know how the cluster was brought down and we don't know in what order things will come back up during initialization.15:46
colin-i can't see the log (gated?), which process logged the "ComputeDeleteException" message?15:51
colin-want to see if i've ever recorded it15:51
livelacecgoncalves, "is your deployment on a nested virtualized environment?" No, I don't use nested virt; it's a server where I can do some experiments before changes go to the prod cluster.15:51
cgoncalvesthe exception is thrown here: https://github.com/openstack/octavia/blob/12668dec63906628e2f01f651a9e57d9b2446e40/octavia/controller/worker/tasks/compute_tasks.py#L18715:51
colin-thanks15:52
cgoncalvesget_failover_flow isn't catching that though15:52
cgoncalveshttps://github.com/openstack/octavia/blob/147a340f4031d13bd196adb2fd7204db7a7bd5c5/octavia/controller/worker/flows/amphora_flows.py#L387-L38915:52
johnsomIt didn’t revert?  Did they turn that off in the config?15:54
cgoncalvesso, what I'm reading is that we trigger a nova instance delete to delete the old amphora and nova throws an error because the compute node is unavailable. am I missing something?15:54
cgoncalvesjohnsom, no revert defined for that task :/15:55
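The taskflow semantics under discussion, modeled in a few lines of plain Python (a simplified stand-in, not the real taskflow engine or Octavia's tasks): when a task raises, previously completed tasks are reverted in reverse order, and a task that defines no revert() contributes nothing to cleanup.

```python
class Task:
    def execute(self, ctx):
        raise NotImplementedError

    def revert(self, ctx):
        # Default: no cleanup, mirroring a task with no revert defined.
        pass

class MarkLBPending(Task):
    def execute(self, ctx):
        ctx["lb_status"] = "PENDING_UPDATE"

    def revert(self, ctx):
        # On failure anywhere later in the flow, mark the LB in ERROR.
        ctx["lb_status"] = "ERROR"

class DeleteComputeInstance(Task):
    def execute(self, ctx):
        # Stand-in for nova refusing to delete the instance.
        raise RuntimeError("ComputeDeleteException")

def run(flow, ctx):
    done = []
    try:
        for t in flow:
            t.execute(ctx)
            done.append(t)
    except Exception:
        # Revert completed tasks in reverse order, then re-raise.
        for t in reversed(done):
            t.revert(ctx)
        raise

ctx = {}
try:
    run([MarkLBPending(), DeleteComputeInstance()], ctx)
except RuntimeError:
    pass
print(ctx["lb_status"])
```

In this model the LB ends in ERROR only because an earlier task's revert() sets it; if no task in the flow does that, the failure bubbles up with the status unchanged, which matches the behaviour being diagnosed here.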
livelaceCould you guys give a verdict ?15:59
cgoncalveslivelace, at this point you may have to recreate the LB, I'm afraid16:00
livelacecgoncalves, No-no, I meant: what is your plan for this issue?16:01
livelaceWill you change this behaviour in the future or not?16:01
cgoncalveslivelace, once we confirm and root cause the issue, someone should work on a fix16:05
cgoncalvesthe first step would be to file a story. could you please open one on storyboard.openstack.org?16:06
livelacecgoncalves, johnsom Thanks for your replies!16:06
*** livelace has quit IRC16:06
johnsomIt should still revert even if we don’t define a revert step for that task.16:17
cgoncalvesah! our great new PTL has us covered!16:33
cgoncalveshttps://review.openstack.org/#/c/616287/16:33
cgoncalvesLB was not being marked in ERROR before16:33
cgoncalvesqueens and rocky backports merged in end of March/beginning of April16:34
*** salmankhan has quit IRC16:39
johnsomAh, so it failed prior to entering the flow?16:45
cgoncalvesjohnsom, no. it entered the flow, exception raised on nova delete, amp marked in ERROR, and bubbled up to failover_amphora which while catching any exception it was not marking LB in ERROR16:51
johnsomThat flow doesn’t have a capstone task that moves the lb to error?  Maybe the database was down by then and we couldn’t mark it as such.16:53
openstackgerritMerged openstack/neutron-lbaas stable/stein: Replace openstack.org git:// URLs with https://  https://review.openstack.org/64667416:54
openstackgerritMerged openstack/neutron-lbaas stable/rocky: Replace openstack.org git:// URLs with https://  https://review.openstack.org/64667316:54
cgoncalvesLoadBalancerToErrorOnRevertTask16:55
cgoncalvesapparently not, no16:55
*** gcheresh has joined #openstack-lbaas16:56
*** psachin has quit IRC17:00
cgoncalveshttps://github.com/openstack/octavia/blob/a728bc000f65e431dc57fef61d41e5ba63d72b02/octavia/controller/worker/tasks/lifecycle_tasks.py#L34-L3517:04
cgoncalvesno mark lb status error there17:05
cgoncalvesit should, considering also that it might be a spare amp and hence not associated with any LB17:05
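The guard cgoncalves describes — mark the LB in ERROR only when the amphora actually has one, since spare-pool amphorae are not attached to any load balancer — could be sketched like this (hypothetical helper and record shapes, not Octavia's lifecycle-task API):

```python
def mark_error_on_revert(amphora, set_status):
    # Always mark the amphora itself ERROR; only touch the LB when the
    # amphora is attached to one (spare amps have no load_balancer_id).
    set_status("amphora", amphora["id"], "ERROR")
    lb_id = amphora.get("load_balancer_id")
    if lb_id is not None:
        set_status("loadbalancer", lb_id, "ERROR")

# Minimal in-memory stand-in for the status-update call:
statuses = {}
def set_status(kind, obj_id, status):
    statuses[(kind, obj_id)] = status

mark_error_on_revert({"id": "amp-1", "load_balancer_id": "lb-1"}, set_status)
mark_error_on_revert({"id": "amp-2", "load_balancer_id": None}, set_status)
print(statuses)
```

The second call models a spare amphora: it is marked ERROR without attempting a load-balancer status update.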
*** jiteka1 has joined #openstack-lbaas17:13
*** ramishra has quit IRC17:19
*** gcheresh has quit IRC17:24
*** yamamoto has joined #openstack-lbaas17:28
*** livelace has joined #openstack-lbaas17:32
*** ricolin has quit IRC18:04
*** ceryx has joined #openstack-lbaas18:10
*** yamamoto has quit IRC18:19
*** rpittau is now known as rpittau|afk18:21
*** luksky has joined #openstack-lbaas18:22
*** happyhemant has quit IRC18:38
*** yamamoto has joined #openstack-lbaas18:58
openstackgerritMerged openstack/neutron-lbaas stable/queens: Choose correct log option by listener protocol  https://review.openstack.org/64768919:04
*** yamamoto has quit IRC19:09
*** abaindur has joined #openstack-lbaas19:35
openstackgerritMerged openstack/neutron-lbaas stable/queens: Fix proxy extension for neutron RBAC  https://review.openstack.org/64904820:17
openstackgerritMerged openstack/neutron-lbaas stable/pike: Replace openstack.org git:// URLs with https://  https://review.openstack.org/64667120:17
*** pcaruana has quit IRC20:31
*** pcaruana has joined #openstack-lbaas20:33
*** pcaruana has quit IRC20:36
*** pcaruana has joined #openstack-lbaas20:39
*** lemko has quit IRC20:44
livelacecgoncalves, Ok. I'm going to do it right now :)20:45
*** pcaruana has quit IRC20:47
*** Vorrtex has quit IRC20:47
livelacecgoncalves, https://storyboard.openstack.org/#!/story/200541721:07
*** boden has quit IRC21:08
*** luksky has quit IRC21:50
*** rcernin has joined #openstack-lbaas22:28
*** fnaval has quit IRC22:29
*** abaindur has quit IRC23:02
*** abaindur has joined #openstack-lbaas23:03
*** livelace has quit IRC23:14
*** fnaval has joined #openstack-lbaas23:42
*** fnaval has quit IRC23:45

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!