*** Swami has quit IRC | 00:08 | |
*** abaindur has quit IRC | 00:56 | |
*** ricolin has joined #openstack-lbaas | 01:02 | |
sapd1 | I don't know why he did not continue implementing the l3-active-active feature: https://review.openstack.org/#/q/owner:yjf1970231893%2540gmail.com+status:open | 01:03 |
*** hongbin has joined #openstack-lbaas | 01:33 | |
*** ricolin_ has joined #openstack-lbaas | 01:50 | |
*** ricolin has quit IRC | 01:50 | |
*** hongbin has quit IRC | 02:35 | |
*** hongbin has joined #openstack-lbaas | 02:38 | |
*** hongbin has quit IRC | 02:49 | |
*** hongbin has joined #openstack-lbaas | 02:51 | |
*** yamamoto has quit IRC | 03:40 | |
*** yamamoto has joined #openstack-lbaas | 03:55 | |
*** ramishra has joined #openstack-lbaas | 03:59 | |
*** hongbin has quit IRC | 04:05 | |
*** Vorrtex has joined #openstack-lbaas | 04:13 | |
*** Vorrtex has quit IRC | 04:22 | |
*** abaindur has joined #openstack-lbaas | 04:51 | |
*** abaindur has quit IRC | 05:12 | |
*** ricolin_ has quit IRC | 05:58 | |
*** ccamposr has joined #openstack-lbaas | 06:10 | |
*** ricolin has joined #openstack-lbaas | 06:22 | |
rm_work | sapd1: basically, that work was being done by walmart, and they ended up dropping the whole project IIRC | 06:27 |
rm_work | johnsom might be able to corroborate that or correct me | 06:27 |
sapd1 | worst | 06:30 |
*** pcaruana has joined #openstack-lbaas | 06:30 | |
*** abaindur has joined #openstack-lbaas | 06:36 | |
sapd1 | rm_work, do you know his IRC nickname? | 06:41 |
rm_work | sapd1: not sure if he's still here anymore | 06:42 |
rm_work | umm | 06:42 |
rm_work | i think we interfaced with someone else there | 06:43 |
sapd1 | Ya. I see a topic about active/active at this PTG; looking forward to hearing something new. I'm trying to follow some of his patches, but it seems like he does not work on it any more. | 06:47 |
openstackgerrit | Gregory Thiemonge proposed openstack/octavia master: Set member initializing state as OFFLINE https://review.openstack.org/651111 | 07:02 |
*** ivve has joined #openstack-lbaas | 07:06 | |
*** rpittau|afk is now known as rpittau | 07:12 | |
*** happyhemant has joined #openstack-lbaas | 07:48 | |
openstackgerrit | Merged openstack/octavia-lib master: Remove testtools from test-requirements.txt https://review.openstack.org/644876 | 07:59 |
*** rcernin has quit IRC | 08:01 | |
*** luksky has joined #openstack-lbaas | 08:15 | |
*** abaindur has quit IRC | 08:23 | |
*** yamamoto has quit IRC | 08:30 | |
*** yamamoto has joined #openstack-lbaas | 08:36 | |
*** vishalmanchanda has joined #openstack-lbaas | 08:44 | |
*** chungpht has joined #openstack-lbaas | 09:11 | |
*** luksky has quit IRC | 09:15 | |
*** yamamoto has quit IRC | 09:16 | |
*** yamamoto has joined #openstack-lbaas | 09:44 | |
*** yamamoto has quit IRC | 09:47 | |
*** psachin has joined #openstack-lbaas | 09:59 | |
*** salmankhan has joined #openstack-lbaas | 10:00 | |
*** yamamoto has joined #openstack-lbaas | 10:02 | |
*** livelace has joined #openstack-lbaas | 10:02 | |
livelace | Hello. I cannot find any information about the heartbeat logic. If an amphora sends its UDP packet through NAT, is that OK? Does the health manager work with such packets? | 10:04 |
*** yamamoto has quit IRC | 10:08 | |
*** luksky has joined #openstack-lbaas | 10:08 | |
livelace | Why am I asking? Because I see that the amphora is in ERROR status, but it works fine. | 10:12 |
cgoncalves | livelace, hi. sadly it will not work with heartbeats being sent behind NAT. the health manager looks for the source address of the UDP packet | 10:15 |
*** yamamoto has joined #openstack-lbaas | 10:15 | |
cgoncalves | https://github.com/openstack/octavia/blob/372ff99a030e6b33dad11a35cb9d5c4058805c53/octavia/amphorae/drivers/health/heartbeat_udp.py#L188 | 10:15 |
livelace | cgoncalves, Thanks. That means I should change my communication topology :( | 10:17 |
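A minimal sketch, assuming a plain stdlib UDP receiver and Octavia's default health-manager port of 5555 (port and names here are illustrative, not the actual Octavia code), of the behaviour cgoncalves points at above: the health record is keyed on the address that recvfrom() reports, so a NATed heartbeat carries the translator's address instead of the amphora's.

```python
import socket


def listen_for_heartbeats(bind_ip="0.0.0.0", bind_port=5555):
    """Yield (source_ip, payload) for each heartbeat packet received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_ip, bind_port))
    while True:
        data, (srcaddr, _srcport) = sock.recvfrom(64 * 1024)
        # The health manager looks the amphora up by srcaddr. Behind NAT,
        # srcaddr is the translated address, the lookup fails, and the
        # amphora eventually ends up failed over / marked ERROR even
        # though it is running fine.
        yield srcaddr, data
```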
*** yamamoto has quit IRC | 10:18 | |
*** yamamoto has joined #openstack-lbaas | 10:21 | |
*** yamamoto has quit IRC | 10:21 | |
*** yamamoto has joined #openstack-lbaas | 10:22 | |
*** yamamoto has quit IRC | 10:24 | |
*** yamamoto has joined #openstack-lbaas | 10:24 | |
cgoncalves | sorry about that. perhaps the heartbeat packet could include the IP address in the payload, and the health manager could check whether it's set and use that one, otherwise fall back to the srcaddr as it does today | 10:33 |
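A hedged sketch of the idea cgoncalves floats here; the reported_ip field is hypothetical and does not exist in the current heartbeat payload.

```python
def pick_amphora_ip(heartbeat: dict, srcaddr: str) -> str:
    # Prefer an address the amphora reports about itself inside the
    # (HMAC-protected) payload; otherwise keep today's behaviour of
    # trusting the UDP source address.
    return heartbeat.get("reported_ip") or srcaddr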
cgoncalves | livelace, could you please file a story on storyboard.openstack.org? | 10:34 |
livelace | cgoncalves, I don't think so, really, because I have a very specific configuration and I don't know Octavia well. If I change my mind, I will file a story on storyboard (it's cool that such a board exists). | 10:41 |
*** livelace has quit IRC | 10:56 | |
*** yamamoto has quit IRC | 11:01 | |
*** yamamoto has joined #openstack-lbaas | 11:03 | |
*** yamamoto has quit IRC | 11:07 | |
*** yamamoto has joined #openstack-lbaas | 11:09 | |
*** yamamoto has quit IRC | 11:09 | |
*** yamamoto has joined #openstack-lbaas | 11:26 | |
*** yamamoto has quit IRC | 11:41 | |
openstackgerrit | Gregory Thiemonge proposed openstack/octavia master: Set member initializing state as OFFLINE https://review.openstack.org/651111 | 11:47 |
rm_work | cgoncalves: yeah we did it that way partly for security too, not just convenience | 11:51 |
rm_work | kind of "proof" that it is the packet it says it is | 11:51 |
rm_work | since the only other auth bits are global (no per-amp encryption key for packets) | 11:51 |
rm_work | that way someone couldn't perform part of a DoS by spoofing "healthy" messages and then trying to take down an amp or something (i dunno) | 11:52 |
rm_work | or at least make it more difficult | 11:52 |
rm_work | but maybe it doesn't actually help that much, dunno, would be good to hear from a seasoned network security person who might know if that's actually reasonable | 11:53 |
cgoncalves | rm_work, I didn't know about that decision (you guys are Octavia dinosaurs). makes sense :) | 11:53 |
rm_work | well, we could also have per-amp encryption keys | 11:54 |
* rm_work shrugs | 11:54 | |
rm_work | just more work | 11:54 |
rm_work | we were going for "secure enough" but also "able to launch in a reasonable timeframe" | 11:54 |
rm_work | turns out we were too late for RAX :/ | 11:55 |
rm_work | and also GD | 11:55 |
rm_work | as it turns out T_T | 11:55 |
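For context on the "global key" point above, a rough illustration, not Octavia's actual code, of how heartbeat verification with a single shared HMAC key looks; the digest choice and function names are assumptions, and per-amphora keys would simply mean looking up a different key per sender before this check.

```python
import hashlib
import hmac


def verify_heartbeat(packet_bytes: bytes, digest: bytes, key: bytes) -> bool:
    # With one global key, any holder of that key can forge "healthy"
    # heartbeats for any amphora; per-amphora keys would narrow the blast
    # radius at the cost of a per-sender key lookup here.
    expected = hmac.new(key, packet_bytes, hashlib.sha256).digest()
    return hmac.compare_digest(expected, digest)
```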
*** yamamoto has joined #openstack-lbaas | 11:59 | |
*** yamamoto has quit IRC | 11:59 | |
*** celebdor1 has joined #openstack-lbaas | 12:18 | |
*** livelace has joined #openstack-lbaas | 12:21 | |
*** celebdor1 has quit IRC | 12:22 | |
*** celebdor1 has joined #openstack-lbaas | 12:23 | |
johnsom | Yeah, the design does not support NAT. It is a routable private network, so it shouldn’t need NAT. But as rm_work said, it is a security feature. One of a few layers. | 12:24 |
livelace | I caught a difference between the CentOS and Ubuntu amphora images. I see that the Ubuntu network namespace doesn't have a default route, while CentOS has default routes (which it takes from the network settings). It seems to be a problem with the Ubuntu image, isn't it? | 12:25 |
johnsom | Ubuntu has a default route, but you might have an image that has a recent bug in it. | 12:26 |
johnsom | The bug causes eth1 to not be up at all in the netns, including the routes. | 12:27 |
livelace | johnsom, That is why I use centos mostly. Ok, thanks. | 12:27 |
*** celebdor1 has quit IRC | 12:28 | |
johnsom | In fairness, it was an Octavia bug we introduced in the RCs. RC3 has it fixed. | 12:29 |
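A quick, hedged way to check for the symptom johnsom describes, assuming the amphora uses the usual amphora-haproxy network namespace (run it on the amphora itself; the helper name is illustrative).

```python
import subprocess


def netns_has_default_route(netns: str = "amphora-haproxy") -> bool:
    # Returns True if a default route exists inside the amphora's network
    # namespace; the bug discussed above leaves eth1 (and therefore its
    # routes) down inside the netns.
    out = subprocess.run(
        ["ip", "netns", "exec", netns, "ip", "route", "show", "default"],
        capture_output=True, text=True, check=True)
    return bool(out.stdout.strip())
```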
*** livelace has quit IRC | 12:32 | |
*** yamamoto has joined #openstack-lbaas | 12:43 | |
cgoncalves | reviewers: as we plan to cut a release for stable/queens soon, it would be great if we could also try to have https://review.openstack.org/#/c/650909/ in | 12:43 |
*** openstackgerrit has quit IRC | 12:44 | |
*** HVT has quit IRC | 12:44 | |
*** yamamoto has quit IRC | 13:10 | |
*** yamamoto has joined #openstack-lbaas | 13:11 | |
*** yamamoto has quit IRC | 13:11 | |
*** vishalmanchanda has quit IRC | 13:50 | |
*** yamamoto has joined #openstack-lbaas | 13:51 | |
*** fnaval has joined #openstack-lbaas | 13:54 | |
*** Vorrtex has joined #openstack-lbaas | 13:58 | |
*** yamamoto has quit IRC | 14:02 | |
*** openstackgerrit has joined #openstack-lbaas | 14:11 | |
openstackgerrit | Merged openstack/octavia stable/queens: Fix the amphora base port coming up https://review.openstack.org/650469 | 14:11 |
*** celebdor1 has joined #openstack-lbaas | 14:13 | |
openstackgerrit | Merged openstack/octavia stable/rocky: Fix the amphora base port coming up https://review.openstack.org/650468 | 14:15 |
*** celebdor1 has quit IRC | 14:20 | |
*** boden has joined #openstack-lbaas | 14:30 | |
cgoncalves | nmagnezi, dayou_: do you have some spare minutes to review https://review.openstack.org/#/c/650909/? | 14:30 |
*** gcheresh_ has joined #openstack-lbaas | 14:33 | |
*** livelace has joined #openstack-lbaas | 14:42 | |
*** lemko has joined #openstack-lbaas | 14:46 | |
openstackgerrit | Merged openstack/octavia master: Fix the amphora base port coming up https://review.openstack.org/650417 | 14:46 |
openstackgerrit | Gregory Thiemonge proposed openstack/octavia master: Set member initializing state as OFFLINE https://review.openstack.org/651111 | 14:46 |
*** gcheresh has joined #openstack-lbaas | 14:50 | |
*** gcheresh_ has quit IRC | 14:54 | |
livelace | johnsom, Were there problems with DNS resolution in CentOS-based amphora images? | 14:57 |
livelace | jiteka, I see in nsswitch only: "hosts: files myhostname" | 14:58 |
johnsom | DNS is disabled in all of the amphora images. | 14:58 |
livelace | jiteka, Sorry | 14:58 |
livelace | johnsom, For what reason ? | 14:59 |
johnsom | It is not needed, it slows down a lot of processes, and there is typically no DNS resolver available to the amphora. | 14:59 |
johnsom | Not to mention the security implications. | 15:00 |
johnsom | If you feel you do need it, you can always create a custom image that turns it back on and create a way for the amps to access a resolver. | 15:04 |
livelace | johnsom, Yes, I understand. Thanks for your comments. | 15:07 |
livelace | johnsom, I'm just investigating how I can increase the speed of amphora initialization. Are there any prod installations that use containers for this purpose? | 15:09 |
johnsom | No. The amps boot in around 30 seconds in most clouds, which seems reasonable for most. | 15:11 |
johnsom | I have done a PoC using lxd, the patches are posted, but there are tradeoffs. | 15:12 |
cgoncalves | and if you use the SINGLE topology, you may consider the amphora spare pool for faster provisioning times | 15:12 |
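For reference, the spare pool cgoncalves mentions is sized through the housekeeping section of octavia.conf; the option name below is from memory, so verify it against your release's configuration reference.

```ini
# Hedged example: keep a couple of pre-booted amphorae available so a new
# load balancer only needs plugging, not a full nova boot.
[house_keeping]
spare_amphora_pool_size = 2
```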
livelace | My initialization takes 160 seconds :( | 15:13 |
johnsom | The real issue is that we are plumbing, and the container platforms are focused on the app layers. | 15:13 |
johnsom | livelace: Then you have something very wrong in your deployment. | 15:13 |
cgoncalves | is your deployment on a nested virtualized environment? | 15:14 |
johnsom | I can’t go into detail today as I am traveling, so can go more into detail later in the week on containers. | 15:14 |
livelace | johnsom, I see that the worker checks the availability of the agent; at that moment the VM is available and I can connect to it over SSH. | 15:14 |
livelace | These checks eat most of the time. | 15:15 |
livelace | johnsom, Ok, have a good trip :) | 15:16 |
*** gcheresh has quit IRC | 15:17 | |
johnsom | If it is checking, the vm should be booted by then. It sounds like the environment has a problem. | 15:19 |
bcafarel | cgoncalves: o/ can you +2 https://review.openstack.org/#/c/646673/ ? (I see you went through all the other ones) | 15:19 |
cgoncalves | bcafarel, +W'd | 15:19 |
bcafarel | cgoncalves: thanks, kicking the complete set in gate run | 15:20 |
livelace | I caught another issue (for me): after shutting down the whole server (I'm testing on it), the amphora stays in "ERROR" state and the worker doesn't recreate it / create a new one. If I shut down the amphora correctly, the worker recreates a new one. | 15:22 |
*** sapd1_x has joined #openstack-lbaas | 15:22 | |
livelace | Is there any way to always check the amphora state and try to create a new instance? | 15:24 |
livelace | It seems that the controller/worker "forgets" about the amphora if the controller/worker was restarted. | 15:25 |
cgoncalves | livelace, the health manager service should have triggered an amphora failover. could you check the logs around the time the amphora went into ERROR? | 15:26 |
cgoncalves | oh, you restarted the health manager service while it was failing over the amphora? | 15:26 |
*** sapd1_x has quit IRC | 15:27 | |
colin- | i'd be surprised if any of the octavia processes forgot about an amphora, you can verify that by checking the amphora table in the octavia database where load_balancer_id=<your id> | 15:28 |
colin- | it will even track deleted ones there. if you have that visibility, it can be helpful in illustrating what octavia sees | 15:28 |
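A sketch of the database check colin- suggests; the column names are recalled from the Octavia schema and may differ slightly between releases.

```sql
-- Shows what Octavia thinks it knows about the amphorae of one LB,
-- including deleted ones.
SELECT id, compute_id, role, status, lb_network_ip
FROM amphora
WHERE load_balancer_id = '<your LB id>';
```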
livelace | Status of amphora and lb right after shutdown of the server https://paste.fedoraproject.org/paste/IEYR26k~jBJQYR2jsAzCAg | 15:28 |
livelace | The amphora vm is in Shutoff state | 15:29 |
livelace | cgoncalves, I'm simulating a power loss: I just shut down the whole server (shutdown -h now) while the amphora and the controller were running. | 15:31 |
johnsom | Yeah, that was probably the only compute node, the controller attempted to failover and repair, but nova failed to provision a replacement instance, so we stopped and marked it error. | 15:31 |
gthiemonge | View All | 15:32 |
*** ivve has quit IRC | 15:32 | |
gthiemonge | oops, sorry | 15:32 |
cgoncalves | livelace, johnsom made a good point. could you please check the health manager service logs? | 15:34 |
livelace | Debug log https://paste.fedoraproject.org/paste/IBO1HrPcMWEQskeB4rgZAA | 15:38 |
*** ccamposr has quit IRC | 15:39 | |
cgoncalves | hmm, "ComputeDeleteException: Failed to delete compute instance.". the exception raised there is not being handled | 15:39 |
livelace | And what next? The amphora and LB are not working. What should we do if the cluster is down? Recreate all the LBs by hand? :) | 15:40 |
*** luksky has quit IRC | 15:42 | |
livelace | cgoncalves, I guess because the Nova services were initializing at that time. | 15:43 |
livelace | My experience suggests that we should do periodic checks and try to recreate amphorae and their LBs, because we don't know how the cluster was taken down and we don't know in what order things will come back up during initialization. | 15:46 |
colin- | i can't see the log (gated?), which process logged the "ComputeDeleteException" message? | 15:51 |
colin- | want to see if i've ever recorded it | 15:51 |
livelace | cgoncalves, "is your deployment on a nested virtualized environment?" No, I don't use nested virt; it's a server where I can do some experiments before changes go to the prod cluster. | 15:51 |
cgoncalves | the exception is thrown here: https://github.com/openstack/octavia/blob/12668dec63906628e2f01f651a9e57d9b2446e40/octavia/controller/worker/tasks/compute_tasks.py#L187 | 15:51 |
colin- | thanks | 15:52 |
cgoncalves | get_failover_flow isn't catching that though | 15:52 |
cgoncalves | https://github.com/openstack/octavia/blob/147a340f4031d13bd196adb2fd7204db7a7bd5c5/octavia/controller/worker/flows/amphora_flows.py#L387-L389 | 15:52 |
johnsom | It didn’t revert? Did they turn that off in the config? | 15:54 |
cgoncalves | so, what I'm reading is that we trigger a nova instance delete to delete the old amphora and nova throws an error because the compute node is unavailable. am I missing something? | 15:54 |
cgoncalves | johnsom, no revert defined for that task :/ | 15:55 |
livelace | Could you guys give a verdict ? | 15:59 |
cgoncalves | livelace, at this point you may have to recreate the LB, I'm afraid | 16:00 |
livelace | cgoncalves, No, no, I meant what are your plans for this issue? | 16:01 |
livelace | Will you change the behaviour in the future or not? | 16:01 |
cgoncalves | livelace, once we confirm and root cause the issue, someone should work on a fix | 16:05 |
cgoncalves | the first step would be to file a story. could you please open one on storyboard.openstack.org? | 16:06 |
livelace | cgoncalves, johnsom Thanks for your replies! | 16:06 |
*** livelace has quit IRC | 16:06 | |
johnsom | It should still revert even if we don’t define a revert step for that task. | 16:17 |
cgoncalves | ah! our great new PTL has us covered! | 16:33 |
cgoncalves | https://review.openstack.org/#/c/616287/ | 16:33 |
cgoncalves | LB was not being marked in ERROR before | 16:33 |
cgoncalves | queens and rocky backports merged in end of March/beginning of April | 16:34 |
*** salmankhan has quit IRC | 16:39 | |
johnsom | Ah, so it failed prior to entering the flow? | 16:45 |
cgoncalves | johnsom, no. it entered the flow, the exception was raised on the nova delete, the amp was marked in ERROR, and it bubbled up to failover_amphora which, while catching any exception, was not marking the LB in ERROR | 16:51 |
johnsom | That flow doesn’t have a capstone task that moves the lb to error? Maybe the database was down by then and we couldn’t mark it as such. | 16:53 |
openstackgerrit | Merged openstack/neutron-lbaas stable/stein: Replace openstack.org git:// URLs with https:// https://review.openstack.org/646674 | 16:54 |
openstackgerrit | Merged openstack/neutron-lbaas stable/rocky: Replace openstack.org git:// URLs with https:// https://review.openstack.org/646673 | 16:54 |
cgoncalves | LoadBalancerToErrorOnRevertTask | 16:55 |
cgoncalves | apparently not, no | 16:55 |
*** gcheresh has joined #openstack-lbaas | 16:56 | |
*** psachin has quit IRC | 17:00 | |
cgoncalves | https://github.com/openstack/octavia/blob/a728bc000f65e431dc57fef61d41e5ba63d72b02/octavia/controller/worker/tasks/lifecycle_tasks.py#L34-L35 | 17:04 |
cgoncalves | no mark lb status error there | 17:05 |
cgoncalves | it should, considering also that it might be a spare amp and hence not associated to any LB | 17:05 |
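To make the gap concrete, a purely illustrative sketch, not the actual Octavia task, of a lifecycle revert that also marks the parent LB. It assumes Octavia's BaseLifecycleTask and its task_utils helpers; the method names are as recalled and should be verified against the tree linked above.

```python
class AmphoraToErrorOnRevertTask(BaseLifecycleTask):
    """Illustrative only: also flip the parent LB when the flow reverts."""

    def execute(self, amphora):
        pass

    def revert(self, amphora, *args, **kwargs):
        self.task_utils.mark_amphora_status_error(amphora.id)
        # A spare amphora has no parent LB, so guard before marking one.
        if getattr(amphora, 'load_balancer_id', None):
            self.task_utils.mark_loadbalancer_prov_status_error(
                amphora.load_balancer_id)
```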
*** jiteka1 has joined #openstack-lbaas | 17:13 | |
*** ramishra has quit IRC | 17:19 | |
*** gcheresh has quit IRC | 17:24 | |
*** yamamoto has joined #openstack-lbaas | 17:28 | |
*** livelace has joined #openstack-lbaas | 17:32 | |
*** ricolin has quit IRC | 18:04 | |
*** ceryx has joined #openstack-lbaas | 18:10 | |
*** yamamoto has quit IRC | 18:19 | |
*** rpittau is now known as rpittau|afk | 18:21 | |
*** luksky has joined #openstack-lbaas | 18:22 | |
*** happyhemant has quit IRC | 18:38 | |
*** yamamoto has joined #openstack-lbaas | 18:58 | |
openstackgerrit | Merged openstack/neutron-lbaas stable/queens: Choose correct log option by listener protocol https://review.openstack.org/647689 | 19:04 |
*** yamamoto has quit IRC | 19:09 | |
*** abaindur has joined #openstack-lbaas | 19:35 | |
openstackgerrit | Merged openstack/neutron-lbaas stable/queens: Fix proxy extension for neutron RBAC https://review.openstack.org/649048 | 20:17 |
openstackgerrit | Merged openstack/neutron-lbaas stable/pike: Replace openstack.org git:// URLs with https:// https://review.openstack.org/646671 | 20:17 |
*** pcaruana has quit IRC | 20:31 | |
*** pcaruana has joined #openstack-lbaas | 20:33 | |
*** pcaruana has quit IRC | 20:36 | |
*** pcaruana has joined #openstack-lbaas | 20:39 | |
*** lemko has quit IRC | 20:44 | |
livelace | cgoncalves, Ok. I'm going to do it right now :) | 20:45 |
*** pcaruana has quit IRC | 20:47 | |
*** Vorrtex has quit IRC | 20:47 | |
livelace | cgoncalves, https://storyboard.openstack.org/#!/story/2005417 | 21:07 |
*** boden has quit IRC | 21:08 | |
*** luksky has quit IRC | 21:50 | |
*** rcernin has joined #openstack-lbaas | 22:28 | |
*** fnaval has quit IRC | 22:29 | |
*** abaindur has quit IRC | 23:02 | |
*** abaindur has joined #openstack-lbaas | 23:03 | |
*** livelace has quit IRC | 23:14 | |
*** fnaval has joined #openstack-lbaas | 23:42 | |
*** fnaval has quit IRC | 23:45 |