Friday, 2019-02-01

colin-i'm getting that feedback from the driver even on IPs i believe are not in use and that don't have valid ARP responses, any idea why that might be? false positive maybe?00:29
johnsomNot likely. Can you do a "openstack port list" and search for the IP?00:30
rm_workjohnsom: i am officially nunned00:35
johnsomWell, I was the one that did it, so you should be throwing nun my way00:36
rm_workdid I escape nun-shaming for https://review.openstack.org/634302 immediately following agreeing that there was no time to deal with it before Stein? :P00:36
rm_workah00:36
rm_worki just figured it was me :P00:36
johnsomSigh00:37
rm_workit was only an hour or two distraction while i was building devstacks anyway <_<00:37
rm_worki had just forgotten00:37
johnsomcolin- FYI, that message is coming straight from neutron here: https://github.com/openstack/octavia/blob/master/octavia/network/drivers/neutron/allowed_address_pairs.py#L41400:37
johnsomOk, I have decided I need to override whole methods in openstacksdk to make failover work. Joy. Fun for tomorrow.00:39
johnsomIt was kindly silently passing the failover test even though it didn't even call the API00:39
johnsomAll because it's hard coded to always expect a header or body change in a PUT call. If nothing changed, return success and don't bother making the call!00:40
johnsomlol00:40
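The failure mode johnsom describes above (a client that skips the PUT when no header or body field changed, yet still reports success) can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeTransport`, `DirtyTrackingResource`, and `failover` are hypothetical names and are not openstacksdk classes or its actual implementation.

```python
# Hypothetical stand-ins; none of these names come from openstacksdk.
class FakeTransport:
    """Records every request so a test can check whether the API was hit."""

    def __init__(self):
        self.calls = []

    def put(self, url, body):
        self.calls.append(("PUT", url, body))
        return {"status": 202}


class DirtyTrackingResource:
    """Sends a PUT only when some attribute changed since the last commit."""

    def __init__(self, url, **attrs):
        self.url = url
        self._attrs = dict(attrs)
        self._dirty = set()

    def set(self, name, value):
        if self._attrs.get(name) != value:
            self._attrs[name] = value
            self._dirty.add(name)

    def commit(self, transport):
        if not self._dirty:
            # Nothing changed: report success without calling the API.
            return self
        body = {key: self._attrs[key] for key in self._dirty}
        transport.put(self.url, body)
        self._dirty.clear()
        return self


def failover(lb_id, transport):
    """An action-style PUT carries no changed fields, so commit() skips it."""
    resource = DirtyTrackingResource(f"/v2/lbaas/loadbalancers/{lb_id}/failover")
    resource.commit(transport)
    return True  # looks like success to the caller


transport = FakeTransport()
result = failover("lb-1", transport)
print(result, len(transport.calls))
```

A test that only checks the return value would pass here, while a test that also inspects `transport.calls` would notice that no request was ever sent.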
johnsomI need to stop early today, off to see W. Kamau Bell. He is in town tonight.00:42
xgermanhave fun!00:58
rm_workwat00:59
rm_workthat's .... wat00:59
rm_workok. whelp00:59
colin-thanks johnsom that will help figure out what's going on01:16
colin-going to pause here for the day, appreciate the quick replies01:16
*** Swami has quit IRC01:32
*** Dinesh_Bhor has joined #openstack-lbaas02:14
*** hongbin has joined #openstack-lbaas02:29
*** sapd1 has joined #openstack-lbaas02:34
*** psachin has joined #openstack-lbaas02:45
*** sapd1 has quit IRC02:50
*** dims has quit IRC02:53
*** ramishra has joined #openstack-lbaas03:57
*** Dinesh_Bhor has quit IRC03:58
*** Dinesh_Bhor has joined #openstack-lbaas04:08
*** hongbin has quit IRC05:24
*** numans has joined #openstack-lbaas05:43
*** AlexStaf has joined #openstack-lbaas05:58
*** AlexStaf has quit IRC06:14
*** ramishra has quit IRC06:48
*** psachin has quit IRC07:04
*** pcaruana has joined #openstack-lbaas07:19
*** ramishra has joined #openstack-lbaas07:23
*** psachin has joined #openstack-lbaas07:33
*** rpittau has joined #openstack-lbaas07:59
*** yamamoto has quit IRC08:00
*** yamamoto has joined #openstack-lbaas08:01
openstackgerritCarlos Goncalves proposed openstack/octavia-tempest-plugin master: Add octavia-v2-dsvm-scenario-fedora-latest job  https://review.openstack.org/60038108:06
*** ccamposr has joined #openstack-lbaas08:21
*** mkuf_ has quit IRC08:48
*** rcernin has joined #openstack-lbaas08:57
openstackgerritReedip proposed openstack/octavia-tempest-plugin master: Modify Member tests for Provider Drivers  https://review.openstack.org/59847608:58
*** salmankhan has joined #openstack-lbaas09:14
*** mkuf_ has joined #openstack-lbaas09:17
*** yamamoto has quit IRC09:18
*** salmankhan1 has joined #openstack-lbaas09:26
*** salmankhan has quit IRC09:26
*** salmankhan1 is now known as salmankhan09:26
*** pcaruana has quit IRC09:30
*** salmankhan has quit IRC09:34
*** salmankhan has joined #openstack-lbaas09:37
*** pcaruana has joined #openstack-lbaas09:42
*** mkuf_ has quit IRC09:48
*** mkuf_ has joined #openstack-lbaas09:58
*** Dinesh_Bhor has quit IRC10:01
*** Dinesh_Bhor has joined #openstack-lbaas10:06
*** yamamoto has joined #openstack-lbaas10:41
*** yamamoto has quit IRC10:51
openstackgerritMerged openstack/octavia-tempest-plugin master: Adds flavor profile API tests  https://review.openstack.org/63041110:51
openstackgerritMerged openstack/octavia-tempest-plugin master: Adds flavor API tests  https://review.openstack.org/63080410:51
*** Dinesh_Bhor has quit IRC11:01
*** rcernin has quit IRC11:12
*** pcaruana has quit IRC12:05
*** pcaruana has joined #openstack-lbaas12:19
*** psachin has quit IRC12:21
*** pcaruana|afk| has joined #openstack-lbaas12:25
*** pcaruana has quit IRC12:26
*** pcaruana|afk| is now known as pcaruana12:27
openstackgerritMerged openstack/octavia-tempest-plugin master: Modify Member tests for Provider Drivers  https://review.openstack.org/59847613:01
*** trown|outtypewww is now known as trown13:03
*** pcaruana has quit IRC13:13
*** yamamoto has joined #openstack-lbaas13:19
*** pcaruana has joined #openstack-lbaas13:20
*** dims has joined #openstack-lbaas13:41
*** celebdor has joined #openstack-lbaas13:43
*** dims has quit IRC14:38
*** dims has joined #openstack-lbaas14:44
*** dims has quit IRC15:01
tomtom001johnsom, xgerman: thanks so much for your help, I've just about got Octavia ACTIVE_STANDBY working 100%!!15:05
tomtom001One question is that the /0.5/info url keeps timing out when it re-builds after a failover: http://paste.openstack.org/show/744395/  I think it is not waiting long enough but I cannot find the setting to make it wait for the new instance to be up.  Can you tell me which setting that is?15:06
*** sapd1 has joined #openstack-lbaas15:09
*** dims has joined #openstack-lbaas15:14
*** dims has quit IRC15:19
*** dims has joined #openstack-lbaas15:20
*** yamamoto has quit IRC15:23
*** yamamoto has joined #openstack-lbaas15:23
*** yamamoto has quit IRC15:23
openstackgerritCorey Bryant proposed openstack/octavia master: Add missing import octavia/opts.py  https://review.openstack.org/63444115:51
*** sapd1 has quit IRC15:57
*** pcaruana has quit IRC15:59
johnsomtomtom001 Do you get an ERROR log there or just those WARNING messages?16:04
johnsomThose warnings are normal while we wait for nova to finish booting the service VM.16:04
tomtom001johnsom: yeah and eventually I get this(full traceback): http://paste.openstack.org/show/744401/16:11
johnsomtomtom001 Ok, so that means the amphora image is bad. There should be a log message with the explanation of when there was an error inside the amphora.16:12
tomtom001johnsom: wow, it's the same image I'm using for all the LB's, they work without error.  It's only on failover that I see this.16:17
tomtom001I will check the amphora image log though16:17
tomtom001johnsom: so you're saying look at amphora.log inside amphora?16:18
tomtom001just to be clear16:18
johnsomOk, so that is interesting actually. Any chance you can paste 20 lines before and 20 lines after that traceback?16:19
johnsomWell, the controller should capture a "why" log message for that internal error. Sometimes it doesn't and then we would need to look inside the amphora.log and the syslog/messages log inside the amphora.16:20
tomtom001Yes let me get them for you.16:21
*** yamamoto has joined #openstack-lbaas16:26
*** yamamoto has quit IRC16:34
tomtom001johnsom: ok, here is the octavia-worker log: 474cG3ao1hOrCyW2y5HdyPf88necVBS7zUOl0yINIfz0Ey6YOF7q\npX38XbSWueRRnXHFl9WtJl/UH3F2bA3/uubkY1UKfoWfr2c/lQM2B7fSTNSvQ5Wp\nF28KGwKBgQCeuYgcNSrUdC65jI1ADM9NaPXqb4mYqqvKLdadS9tRCyH2oO31GRPf\nOQRARyrBSg2/woNxB1Z7mmEFT/gCVJW262xAZGd0MpSMJ7NJUXLGb2cRwTSs9wIT\n80ekXoo2TdPBjdzuBX1SXva0t5F5HxkPQVo4Enlk1TlPnj4Cp6c2zA==\n-----END RSA PRIVATE KEY-----\n'}16:37
tomtom001                                                                                         |__Atom 'octavia-failover-amphora-flow-octavia-create-amp-for-lb-subflow-octavia-create-amphora-indb' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {}, 'provides': u'be18776e-9b14-49dd-8119-c2f0c538dfd3'}16:37
tomtom001                                                                                            |__Flow 'octavia-failover-amphora-flow-octavia-create-amp-for-lb-subflow'16:37
tomtom001                                                                                               |__Atom 'octavia-failover-amphora-flow-octavia-get-amphora-for-lb-subflow-octavia-mapload-balancer-to-amphora' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'server_group_id': u'e0af734a-5f76-4a87-a476-f4d1a1959b35', 'loadbalancer_id': u'8e92efeb-e2a3-4a1c-b61a-2b47f67de93b'}, 'provides':16:37
tomtom001None}16:37
tomtom001                                                                                                  |__Flow 'octavia-failover-amphora-flow-octavia-get-amphora-for-lb-subflow'16:37
tomtom001                                                                                                     |__Atom 'octavia.controller.worker.tasks.database_tasks.GetAmphoraDetails' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'amphora': <octavia.common.data_models.Amphora object at 0x7f05cb143490>}, 'provides': <octavia.common.data_models.Amphora object at 0x7f05c85f1a90>}16:38
tomtom001                                                                                                        |__Atom 'octavia.controller.worker.tasks.database_tasks.MarkAmphoraDeletedInDB' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'amphora': <octavia.common.data_models.Amphora object at 0x7f05cb143490>}, 'provides': None}16:38
tomtom001                                                                                                           |__Atom 'octavia.controller.worker.tasks.network_tasks.WaitForPortDetach' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'amphora': <octavia.common.data_models.Amphora object at 0x7f05cb143490>}, 'provides': None}16:38
tomtom001                                                                                                              |__Atom 'octavia.controller.worker.tasks.compute_tasks.ComputeDelete' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'amphora': <octavia.common.data_models.Amphora object at 0x7f05cb143490>}, 'provides': None}16:38
tomtom001                                                                                                                 |__Atom 'octavia.controller.worker.tasks.database_tasks.MarkAmphoraHealthBusy' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'amphora': <octavia.common.data_models.Amphora object at 0x7f05cb143490>}, 'provides': None}16:38
tomtom001                                                                                                                    |__Atom 'octavia.controller.worker.tasks.database_tasks.MarkAmphoraPendingDeleteInDB' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'amphora': <octavia.common.data_models.Amphora object at 0x7f05cb143490>}, 'provides': None}16:38
johnsomYeah, we don't need the tree, it's actually before that16:38
tomtom001                                                                                                                       |__Atom 'octavia.controller.worker.tasks.lifecycle_tasks.AmphoraToErrorOnRevertTask' {'intention': 'EXECUTE', 'state': 'SUCCESS', 'requires': {'amphora': <octavia.common.data_models.Amphora object at 0x7f05cb143490>}, 'provides': None}16:38
tomtom001                                                                                                                          |__Flow 'octavia-failover-amphora-flow': InternalServerError: Internal Server Error16:38
tomtom0012019-02-01 16:34:00.973 30045 ERROR octavia.controller.wor16:38
tomtom001ok16:38
tomtom001sorry, copy paste didn't work16:38
tomtom001johnsom: octavia-worker.log: http://paste.openstack.org/raw/744403/16:38
tomtom001got it that time.16:38
tomtom001johnsom: Ok, here is right before the tree started: http://paste.openstack.org/show/744408/16:40
johnsomThe line I am looking for should read: ERROR Amphora agent returned unexpected result code %s with response %s16:41
johnsomLol, ok. Yeah, it didn't get it.   "due to: Internal Server Error: InternalServerError: Internal Server Error"16:41
johnsomNot helpful. sigh, we need to work on that more.16:42
johnsomOk, so finding the root cause is going to mean saving the amphora, going into /var/log/syslog (messages on Red Hat-style systems), and searching for amphora-agent16:42
tomtom001oh ok, yeah I just grep for it and the phrase you're looking for doesn't exist16:42
johnsomYeah, the controller didn't capture the true error, just the repeated internal error string16:43
tomtom001ok, yeah I'll go onto the amphora instance and look for the error there.16:43
johnsomNow, cleanup may automatically delete the amphora that had the problem. If that is the case, I can help you with the configuration change that would allow it to remain16:43
tomtom001ok, yeah it does just loop endlessly, what config change can I make?16:44
johnsomOk, this is for testing only. It disables ALL of the error handling and cleanup. So don't set this on production, etc.16:44
tomtom001agreed will not implement in production16:45
johnsomBelieve it or not, we did have an operator do it in production.... So, I try to be very explicit about it.16:45
tomtom001thank you, it can happen.16:46
johnsomChange this setting to True: https://docs.openstack.org/octavia/latest/configuration/configref.html#task_flow.disable_revert16:46
johnsomThen restart your controller that is getting the "internal error" message: the controller worker (CW) if you are manually failing over, the health manager (HM) if it is automatic16:46
johnsomEverything will run the same, but when it hits the error it will not attempt to repair or cleanup the resources.16:47
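As a small guard around the testing-only setting discussed above, the following sketch checks whether a configuration text has `disable_revert` turned on in the `[task_flow]` section, which is the section and option named in the linked configuration reference. The sample text and the `revert_disabled` helper are assumptions for illustration, not part of Octavia.

```python
import configparser

# Sample fragment only; a real octavia.conf contains many more sections.
SAMPLE_CONF = """
[task_flow]
# Testing only: disables all revert/cleanup handling in the flows.
disable_revert = True
"""


def revert_disabled(conf_text):
    """Return True if [task_flow] disable_revert is enabled in conf_text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # A missing section or option is treated as the setting being off.
    return parser.getboolean("task_flow", "disable_revert", fallback=False)


if revert_disabled(SAMPLE_CONF):
    print("WARNING: disable_revert is on; failed resources will not be cleaned up")
```

A check like this could run before a deployment to catch the setting leaking into a production configuration.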
openstackgerritMerged openstack/octavia-tempest-plugin master: Adds provider API tests  https://review.openstack.org/63110516:56
*** ramishra has quit IRC16:56
johnsomYay, more tests merging!16:57
tomtom001johnsom: Here is the syslog from amphora: http://paste.openstack.org/show/744409/17:01
tomtom001johnsom: here is the haproxy.log from amphora: http://paste.openstack.org/show/744410/17:02
johnsomHmmm, that was fixed a long time ago....  Let me dig for that fix17:02
*** salmankhan1 has joined #openstack-lbaas17:04
tomtom001is this an amphora image bug, or one in octavia?17:04
johnsomAmphora image17:04
*** salmankhan has quit IRC17:04
*** salmankhan1 is now known as salmankhan17:04
tomtom001ok, thanks17:05
johnsomtomtom001 It was this bug: https://review.openstack.org/#/c/577238/17:05
johnsomIt was fixed in Rocky and backported to queens here: https://review.openstack.org/57801417:06
tomtom001ok, thank you17:06
colin-johnsom: worked out the explicit VIP address, thanks for the help17:08
johnsomOk cool. Something had it in use you didn't know about?17:08
colin-yes :)17:11
*** rpittau has quit IRC17:20
*** trown is now known as trown|lunch17:41
*** salmankhan has quit IRC17:48
*** salmankhan has joined #openstack-lbaas17:53
*** gcheresh has joined #openstack-lbaas18:27
*** salmankhan has quit IRC18:28
*** ccamposr has quit IRC18:46
tomtom001johnsom: I have applied the patch and now everything is functioning perfectly in my octavia deployment.  Thank you!! for your assistance.18:47
johnsomSure, NP. Glad you are up and running18:47
*** trown|lunch is now known as trown18:58
*** gcheresh has quit IRC19:40
*** salmankhan has joined #openstack-lbaas19:49
openstackgerritMerged openstack/octavia master: Add missing import octavia/opts.py  https://review.openstack.org/63444120:41
colin-is there an easy way to get detail about a listener in an ERROR provisioning state?21:10
colin-so far i'm going only on: ERROR21:11
colin-Provisioning failed21:11
johnsomYeah, the additional details are in the controller worker or health manager log file21:12
colin-got it21:12
*** celebdor has quit IRC21:32
*** salmankhan has quit IRC21:36
*** trown is now known as trown|outtypewww21:42
cgoncalvesjohnsom, do you have any flavor patch open that touches on the spare pool area? running master, HK can't create amps22:28
cgoncalves'revert' method on 'octavia.controller.worker.tasks.compute_tasks.CertComputeCreate==1.0' requires ['flavor'] but no other entity produces said requirements22:28
cgoncalves  MissingDependencies: 'execute' method on 'octavia.controller.worker.tasks.compute_tasks.CertComputeCreate==1.0' requires ['flavor'] but no other entity produces said requirements22:28
johnsomYes I do22:29
cgoncalvesoh, I see it22:29
cgoncalveshttps://review.openstack.org/#/c/632594/22:29
johnsomYep, that is the one22:29
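The MissingDependencies error quoted above comes from a requires/provides check: a task asks for `flavor`, and nothing earlier in the flow (or in the initial store) produces it. The sketch below mimics that check in plain Python. It is a simplified illustration, not the taskflow library's implementation, and the two-task flow is a hypothetical reduction of the real one.

```python
def missing_requirements(tasks, store=()):
    """Return {task_name: [unmet names]} for a linear flow.

    tasks is a list of (name, requires, provides) tuples run in order;
    store holds names that are available before the flow starts.
    """
    available = set(store)
    missing = {}
    for name, requires, provides in tasks:
        unmet = sorted(set(requires) - available)
        if unmet:
            missing[name] = unmet
        available.update(provides)
    return missing


# Hypothetical, reduced flow: the second task needs 'flavor'.
flow = [
    ("CreateAmphoraInDB", [], ["amphora_id"]),
    ("CertComputeCreate", ["amphora_id", "flavor"], ["compute_id"]),
]

print(missing_requirements(flow))
print(missing_requirements(flow, store=["flavor"]))
```

The first call reports `flavor` as unmet for `CertComputeCreate`; supplying it in the initial store, as in the second call, clears the report.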
cgoncalvesthanks. I'll just rollback to a pre-flavor hash and continue for now22:33
johnsomOr you can grab the top of the chain and test the last two patches....  grin22:35
johnsomhttps://review.openstack.org/#/c/63284222:35
cgoncalveswell, what I'd like to do would end up helping validate your spare pool patch22:37
cgoncalveshow receptive are folks with another new tempest job? :)22:38
johnsomYeah... spares, are ....  ummm...22:38
johnsomI think it might have to be a multi-node as I don't think we can have enough amps booted on a single node job.22:39
johnsomI still need to figure out why the ipv6 tests randomly fail on the multi-node tests. My guess is there is a neutron issue with ipv6, but haven't run it to ground yet22:40
*** KeithMnemonic has quit IRC22:40
cgoncalves2 amps should do it: 1 associated to a LB, 1 in spare pool. test would be verifying the one in spare pool takes over when active is failed over22:40
johnsomI was thinking it might peak at 322:43
cgoncalvescorrect, it might. we can do multi-node if we have to22:49
*** takamatsu has quit IRC23:26