johnsom | sorrison Left some comments on the SDK patch. | 00:06 |
---|---|---|
sorrison | thanks | 00:17 |
*** hongbin has quit IRC | 00:37 | |
*** ramishra has quit IRC | 00:40 | |
*** hongbin has joined #openstack-lbaas | 00:41 | |
*** gthiemonge has quit IRC | 00:58 | |
*** gthiemonge has joined #openstack-lbaas | 00:58 | |
*** spatel has joined #openstack-lbaas | 01:39 | |
openstackgerrit | Noah Mickus proposed openstack/octavia-lib master: Adding cipher list Support for provider drivers https://review.opendev.org/714558 | 01:44 |
*** spatel has quit IRC | 01:44 | |
*** ramishra has joined #openstack-lbaas | 02:07 | |
*** gthiemonge has quit IRC | 02:46 | |
*** gthiemonge has joined #openstack-lbaas | 02:46 | |
*** hongbin has quit IRC | 03:29 | |
openstackgerrit | Sam Morrison proposed openstack/octavia-dashboard master: Availability zone support https://review.opendev.org/714563 | 03:35 |
*** jamesdenton has quit IRC | 04:30 | |
*** jamesdenton has joined #openstack-lbaas | 05:03 | |
*** dulek has quit IRC | 05:11 | |
*** spatel has joined #openstack-lbaas | 05:40 | |
*** spatel has quit IRC | 05:46 | |
*** vishalmanchanda has joined #openstack-lbaas | 06:32 | |
*** ataraday_ has joined #openstack-lbaas | 06:56 | |
*** gcheresh has joined #openstack-lbaas | 07:17 | |
*** gcheresh has quit IRC | 07:50 | |
*** gcheresh has joined #openstack-lbaas | 07:56 | |
*** tkajinam has quit IRC | 08:05 | |
*** dulek has joined #openstack-lbaas | 08:12 | |
*** rpittau|afk is now known as rpittau | 08:26 | |
*** ccamposr__ has joined #openstack-lbaas | 08:36 | |
*** ccamposr has quit IRC | 08:39 | |
*** maciejjozefczyk has joined #openstack-lbaas | 08:56 | |
*** TMM has quit IRC | 09:14 | |
*** TMM has joined #openstack-lbaas | 09:14 | |
*** gcheresh has quit IRC | 09:21 | |
*** ccamposr has joined #openstack-lbaas | 09:27 | |
*** ccamposr__ has quit IRC | 09:29 | |
*** spatel has joined #openstack-lbaas | 09:42 | |
*** spatel has quit IRC | 09:47 | |
*** gcheresh has joined #openstack-lbaas | 09:55 | |
*** ccamposr__ has joined #openstack-lbaas | 11:34 | |
*** ccamposr has quit IRC | 11:37 | |
*** psachin has joined #openstack-lbaas | 11:44 | |
*** sapd1_x has joined #openstack-lbaas | 12:07 | |
*** ccamposr has joined #openstack-lbaas | 12:40 | |
*** ccamposr__ has quit IRC | 12:42 | |
*** psachin has quit IRC | 13:05 | |
*** ataraday_ has quit IRC | 13:36 | |
*** tkajinam has joined #openstack-lbaas | 13:57 | |
*** TrevorV has joined #openstack-lbaas | 14:09 | |
cgoncalves | FYI, Analysis of 2019 User Survey Feedback: https://governance.openstack.org/tc/user_survey/analysis-12-2019.html | 14:27 |
johnsom | That looks like only the feedback for the TC questions. I wonder if we got access to the Octavia question results. | 15:01 |
cgoncalves | right. I could also not find them in the survey report at https://www.openstack.org/analytics | 15:03 |
*** tkajinam has quit IRC | 15:04 | |
johnsom | Yeah, not surprised really | 15:05 |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs https://review.opendev.org/714680 | 15:41 |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs https://review.opendev.org/714680 | 15:45 |
*** dtruong has quit IRC | 15:47 | |
*** dtruong has joined #openstack-lbaas | 15:48 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs https://review.opendev.org/714680 | 15:48 |
*** gthiemonge has quit IRC | 16:07 | |
*** gthiemonge has joined #openstack-lbaas | 16:07 | |
*** sapd1_x has quit IRC | 16:08 | |
*** gcheresh has quit IRC | 16:34 | |
rm_work | would be cool if they'd email that stuff to the PTL or the liaison or something | 16:54 |
rm_work | <_< | 16:54 |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Fix amphora image build jobs https://review.opendev.org/714680 | 17:04 |
*** rpittau is now known as rpittau|afk | 17:18 | |
*** maciejjozefczyk has quit IRC | 17:21 | |
*** maciejjozefczyk has joined #openstack-lbaas | 17:21 | |
*** maciejjozefczyk has quit IRC | 17:26 | |
*** ianychoi has quit IRC | 17:45 | |
*** vesper11 has quit IRC | 17:45 | |
*** openstackstatus has quit IRC | 17:45 | |
*** ianychoi has joined #openstack-lbaas | 17:46 | |
*** vesper11 has joined #openstack-lbaas | 17:47 | |
*** openstackstatus has joined #openstack-lbaas | 17:47 | |
*** ChanServ sets mode: +v openstackstatus | 17:47 | |
*** irclogbot_1 has quit IRC | 17:47 | |
*** irclogbot_1 has joined #openstack-lbaas | 17:48 | |
*** ataraday_ has joined #openstack-lbaas | 18:14 | |
ataraday_ | rm_work, are you around? | 18:15 |
rm_work | yes | 18:15 |
ataraday_ | I added some details https://review.opendev.org/#/c/647406/96/octavia/controller/worker/v2/taskflow_jobboard_driver.py@47 - if you have time we can discuss this | 18:20 |
rm_work | kk, in a moment, finishing something up | 18:42 |
*** laerlingSAP has quit IRC | 19:04 | |
*** laerlingSAP has joined #openstack-lbaas | 19:06 | |
*** ataraday_ has quit IRC | 19:09 | |
*** gcheresh has joined #openstack-lbaas | 19:11 | |
*** maciejjozefczyk has joined #openstack-lbaas | 19:32 | |
*** rcernin|brb has quit IRC | 19:57 | |
*** vishalmanchanda has quit IRC | 19:59 | |
openstackgerrit | Adam Harwell proposed openstack/octavia master: Support HTTP and TCP checks in UDP healthmonitor https://review.opendev.org/589180 | 19:59 |
*** maciejjozefczyk has quit IRC | 20:20 | |
*** gcheresh has quit IRC | 20:39 | |
*** TrevorV has quit IRC | 20:47 | |
openstackgerrit | Michael Johnson proposed openstack/octavia-tempest-plugin master: Add skip_if_not_implemented to the service client https://review.opendev.org/714003 | 21:03 |
*** cjloader has quit IRC | 21:18 | |
sorrison | johnsom: Left some replies for https://review.opendev.org/#/c/714345/ I think we do need the id attribute for these resources | 22:14 |
johnsom | sorrison Yeah the "don't use \ for line wrap" is an Octavia team thing. I know other teams use it (mostly legacy). So, having it in SDK is fine per SDK hacking rules, but we don't typically allow it in Octavia repos. rm_work may have more comment. grin | 22:17 |
johnsom | sorrison lol, hmmm, yeah, I did add it in flavor. hmm, maybe we should leave those. Let me refresh my brain. | 22:19 |
sorrison | Can't figure out how you say "this is part of the resource but not used when creating one" | 22:20 |
*** gregwork has quit IRC | 22:21 | |
rm_work | sorrison: oh how do you actually ENABLE that healthcheck btw | 22:21 |
rm_work | ah nm i bet it's in the doc you committed :D | 22:21 |
johnsom | Yeah, in the past we have not included ID in the properties list, I thought because that implied it was settable. Most of the proxy methods will already take an id/object | 22:23 |
johnsom | If something is in the properties list, it ends up in the json body going over the wire. | 22:24 |
johnsom | Yeah, flavor might be wrong in having the ID there | 22:26 |
johnsom | It might also be needed for the query map though | 22:26 |
*** rcernin has joined #openstack-lbaas | 22:45 | |
lxkong | hi guys, I am wondering if it's possible that some amphorae will never be failed over, given octavia is always picking up the first unhealthy one from db? | 22:52 |
lxkong | https://www.irccloud.com/pastebin/vaRMVfZv/ | 22:52 |
lxkong | or do i miss something elsewhere? | 22:53 |
lxkong | stable/train | 22:53 |
*** tkajinam has joined #openstack-lbaas | 22:53 | |
johnsom | lxkong No, you will note the "busy" flag. This is set once an amphora is selected for failover, thus will not be in the results for the next health manager | 22:53 |
lxkong | johnsom: but what if the lb is in pending_update? the amphora will be skipped | 22:54 |
lxkong | and next time, it will still be picked? | 22:55 |
lxkong | the following amphora will never get a chance? | 22:55 |
johnsom | Yes, lb in pending_update means one of the controllers has ownership of the LB and all of its parts. No other controller will act on it. | 22:55 |
lxkong | so what about the following unhealthy amphorae? | 22:56 |
johnsom | If the LB is in pending_update due to a failover, the busy flag will be set | 22:56 |
lxkong | i mean, e.g. I have two unhealthy amphorae (am1 and am2), the lb of am1 is in pending_update, am2 will never be failed over, right? | 22:57 |
johnsom | Basically that busy flag is there to make sure it walks the list of amphora | 22:57 |
lxkong | am1 is picked first, set busy=1, but the lb is in pending_update, session rolled back, busy=0, break the loop | 22:59 |
lxkong | in the next loop, am1 is checked again | 22:59 |
johnsom | lxkong am2 will not get failed over until the pending_update has been removed. We intentionally only failover one amphora of a load balancer at a time to make sure there is at least one serving traffic and should the initial failover blow up for some reason, there is still a chance of a functioning load balancer. | 22:59 |
lxkong | hmm...some lbs in our cloud are stuck in pending_xxx, so that affects the octavia-healthmonitor service, right? | 23:00 |
lxkong | especially when the amphorae for those lbs are unhealthy | 23:01 |
johnsom | Yeah, pending_XXX means one of the controllers is already acting on it. | 23:01 |
lxkong | yeah, that's gonna be problematic | 23:01 |
johnsom | One of the controllers already has ownership of the object and no other should act on it | 23:01 |
lxkong | the point is, no controllers are actually working on those lbs | 23:02 |
johnsom | So, likely what happened is one of your controllers had a non-graceful shutdown where it didn't get a chance to release the "pending_*" lock. | 23:02 |
lxkong | yes, they are in pending_xxx forever | 23:02 |
johnsom | This is what we are working to fix now, sub-flow controller failures | 23:02 |
johnsom | Yeah, so someone did a kill -9 or powered off a controller without shutting it down. | 23:03 |
lxkong | or when octavia service is up and running, maybe scan the pending lbs first and do something? | 23:03 |
johnsom | Also check your systemd scripts to make sure they aren't configured to timeout and kill -9 | 23:03 |
lxkong | we are using `start-stop-daemon` | 23:04 |
johnsom | lxkong I think what you are reporting is similar to this: https://storyboard.openstack.org/#!/story/2007340 | 23:05 |
johnsom | Where you have accumulated PENDING_ that are no longer owned by the controller that locked it. | 23:05 |
lxkong | yeah, the same | 23:05 |
lxkong | From http://man7.org/linux/man-pages/man8/start-stop-daemon.8.html, I can see `All matching processes will be sent the TERM signal (or the one specified via --signal or --retry) if --stop is specified.` | 23:08 |
lxkong | probably that's the reason | 23:08 |
johnsom | Also check that your start-stop-daemon is using TERM/15 and not 9 | 23:08 |
johnsom | Just like systemd it will escalate to sending a KILL, so you need to make sure it gives those processes time to shut down before it escalates to a KILL. | 23:09 |
johnsom | You will want a --retry config | 23:11 |
*** gthiemonge has quit IRC | 23:13 | |
*** gthiemonge has joined #openstack-lbaas | 23:13 | |
lxkong | thanks johnsom, i will deal with the pending LBs first. | 23:16 |
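
The scenario lxkong describes at 22:59 (am1 is picked, its LB is in PENDING_UPDATE, the claim is rolled back, and the loop ends) can be sketched as a toy model. This is only an illustration of that description, not Octavia's actual health manager code; `Amphora`, `lb_status`, and `pick_for_failover` are hypothetical names, and the model assumes am1 and am2 belong to different load balancers.

```python
from dataclasses import dataclass


@dataclass
class Amphora:
    name: str
    lb_status: str       # provisioning status of the owning load balancer
    busy: bool = False   # the "busy" flag johnsom mentions


def pick_for_failover(unhealthy):
    """Toy model of the loop as lxkong describes it (hypothetical)."""
    for amp in unhealthy:
        if amp.busy:
            continue                     # already claimed, walk on
        amp.busy = True                  # claim the first candidate
        if amp.lb_status.startswith("PENDING_"):
            amp.busy = False             # rollback releases the claim
            return None                  # loop ends without a failover
        return amp
    return None


amphorae = [Amphora("am1", "PENDING_UPDATE"), Amphora("am2", "ACTIVE")]

# While am1's load balancer stays in PENDING_UPDATE, every pass stops at am1.
picked = [pick_for_failover(amphorae) for _ in range(3)]
print(picked)
```

In this model am2 is only reached once am1's load balancer leaves the pending state and am1 keeps its busy flag, which matches lxkong's concern about load balancers stuck in PENDING_* after a non-graceful controller shutdown.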
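
johnsom's point about TERM versus KILL can be illustrated with a minimal sketch: a process can catch SIGTERM and release what it holds before exiting, whereas SIGKILL cannot be caught at all. The names below (`lb_status`, `stop`, `run_worker`) are hypothetical, and treating "release" as a move back to ACTIVE is an assumption made for the example, not a statement about how Octavia controllers clean up.

```python
import os
import signal
import time

# Hypothetical stand-in for a load balancer row this controller has locked.
lb_status = {"lb1": "PENDING_UPDATE"}
stop = {"requested": False}


def on_term(signum, frame):
    """SIGTERM can be handled, so the worker gets a chance to clean up."""
    stop["requested"] = True


def run_worker():
    # Pretend to work until asked to stop, then release the lock.
    while not stop["requested"]:
        time.sleep(0.01)
    # Assumption for illustration: releasing means leaving PENDING_*.
    lb_status["lb1"] = "ACTIVE"


signal.signal(signal.SIGTERM, on_term)
os.kill(os.getpid(), signal.SIGTERM)   # simulate the service manager's TERM
run_worker()
print(lb_status)
```

If the supervisor escalates to KILL before `run_worker` reaches its cleanup step, the status is left as PENDING_UPDATE, which is why the conversation recommends giving the processes enough time (for example via `--retry`) before escalation.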