Tuesday, 2020-03-24

johnsomsorrison Left some comments on the SDK patch.00:06
sorrisonthanks00:17
*** hongbin has quit IRC00:37
*** ramishra has quit IRC00:40
*** hongbin has joined #openstack-lbaas00:41
*** gthiemonge has quit IRC00:58
*** gthiemonge has joined #openstack-lbaas00:58
*** spatel has joined #openstack-lbaas01:39
openstackgerritNoah Mickus proposed openstack/octavia-lib master: Adding cipher list Support for provider drivers  https://review.opendev.org/71455801:44
*** spatel has quit IRC01:44
*** ramishra has joined #openstack-lbaas02:07
*** gthiemonge has quit IRC02:46
*** gthiemonge has joined #openstack-lbaas02:46
*** hongbin has quit IRC03:29
openstackgerritSam Morrison proposed openstack/octavia-dashboard master: Availability zone support  https://review.opendev.org/71456303:35
*** jamesdenton has quit IRC04:30
*** jamesdenton has joined #openstack-lbaas05:03
*** dulek has quit IRC05:11
*** spatel has joined #openstack-lbaas05:40
*** spatel has quit IRC05:46
*** vishalmanchanda has joined #openstack-lbaas06:32
*** ataraday_ has joined #openstack-lbaas06:56
*** gcheresh has joined #openstack-lbaas07:17
*** gcheresh has quit IRC07:50
*** gcheresh has joined #openstack-lbaas07:56
*** tkajinam has quit IRC08:05
*** dulek has joined #openstack-lbaas08:12
*** rpittau|afk is now known as rpittau08:26
*** ccamposr__ has joined #openstack-lbaas08:36
*** ccamposr has quit IRC08:39
*** maciejjozefczyk has joined #openstack-lbaas08:56
*** TMM has quit IRC09:14
*** TMM has joined #openstack-lbaas09:14
*** gcheresh has quit IRC09:21
*** ccamposr has joined #openstack-lbaas09:27
*** ccamposr__ has quit IRC09:29
*** spatel has joined #openstack-lbaas09:42
*** spatel has quit IRC09:47
*** gcheresh has joined #openstack-lbaas09:55
*** ccamposr__ has joined #openstack-lbaas11:34
*** ccamposr has quit IRC11:37
*** psachin has joined #openstack-lbaas11:44
*** sapd1_x has joined #openstack-lbaas12:07
*** ccamposr has joined #openstack-lbaas12:40
*** ccamposr__ has quit IRC12:42
*** psachin has quit IRC13:05
*** ataraday_ has quit IRC13:36
*** tkajinam has joined #openstack-lbaas13:57
*** TrevorV has joined #openstack-lbaas14:09
cgoncalvesFYI, Analysis of 2019 User Survey Feedback: https://governance.openstack.org/tc/user_survey/analysis-12-2019.html14:27
johnsomThat looks like only the feedback for the TC questions. I wonder if we got access to the Octavia question results.15:01
cgoncalvesright. I could also not find them in the survey report at https://www.openstack.org/analytics15:03
*** tkajinam has quit IRC15:04
johnsomYeah, not surprised really15:05
openstackgerritCarlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs  https://review.opendev.org/71468015:41
openstackgerritCarlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs  https://review.opendev.org/71468015:45
*** dtruong has quit IRC15:47
*** dtruong has joined #openstack-lbaas15:48
openstackgerritCarlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs  https://review.opendev.org/71468015:48
*** gthiemonge has quit IRC16:07
*** gthiemonge has joined #openstack-lbaas16:07
*** sapd1_x has quit IRC16:08
*** gcheresh has quit IRC16:34
rm_workwould be cool if they'd email that stuff to the PTL or the liaison or something16:54
rm_work<_<16:54
openstackgerritCarlos Goncalves proposed openstack/octavia master: Fix amphora image build jobs  https://review.opendev.org/71468017:04
*** rpittau is now known as rpittau|afk17:18
*** maciejjozefczyk has quit IRC17:21
*** maciejjozefczyk has joined #openstack-lbaas17:21
*** maciejjozefczyk has quit IRC17:26
*** ianychoi has quit IRC17:45
*** vesper11 has quit IRC17:45
*** openstackstatus has quit IRC17:45
*** ianychoi has joined #openstack-lbaas17:46
*** vesper11 has joined #openstack-lbaas17:47
*** openstackstatus has joined #openstack-lbaas17:47
*** ChanServ sets mode: +v openstackstatus17:47
*** irclogbot_1 has quit IRC17:47
*** irclogbot_1 has joined #openstack-lbaas17:48
*** ataraday_ has joined #openstack-lbaas18:14
ataraday_rm_work, are you around?18:15
rm_workyes18:15
ataraday_I added some details https://review.opendev.org/#/c/647406/96/octavia/controller/worker/v2/taskflow_jobboard_driver.py@47 - if you have time we can discuss this18:20
rm_workkk, in a moment, finishing something up18:42
*** laerlingSAP has quit IRC19:04
*** laerlingSAP has joined #openstack-lbaas19:06
*** ataraday_ has quit IRC19:09
*** gcheresh has joined #openstack-lbaas19:11
*** maciejjozefczyk has joined #openstack-lbaas19:32
*** rcernin|brb has quit IRC19:57
*** vishalmanchanda has quit IRC19:59
openstackgerritAdam Harwell proposed openstack/octavia master: Support HTTP and TCP checks in UDP healthmonitor  https://review.opendev.org/58918019:59
*** maciejjozefczyk has quit IRC20:20
*** gcheresh has quit IRC20:39
*** TrevorV has quit IRC20:47
openstackgerritMichael Johnson proposed openstack/octavia-tempest-plugin master: Add skip_if_not_implemented to the service client  https://review.opendev.org/71400321:03
*** cjloader has quit IRC21:18
sorrisonjohnsom: Left some replies for https://review.opendev.org/#/c/714345/ I think we do need the id attribute for these resources22:14
johnsomsorrison Yeah the "don't use \ for line wrap" is an Octavia team thing. I know other teams use it (mostly legacy). So, having it in SDK is fine per SDK hacking rules, but we don't typically allow it in Octavia repos. rm_work may have more comment. grin22:17
johnsomsorrison lol, hmmm, yeah, I did add it in flavor. hmm, maybe we should leave those.  Let me refresh my brain.22:19
sorrisonCan't figure out how you say "this is part of the resource but not used when creating one"22:20
*** gregwork has quit IRC22:21
rm_worksorrison: oh how do you actually ENABLE that healthcheck btw22:21
rm_workah nm i bet it's in the doc you committed :D22:21
johnsomYeah, in the past we have not included ID in the properties list, I thought because that implied it was settable. Most of the proxy methods will already take an id/object22:23
johnsomIf something is in the properties list, it ends up in the json body going over the wire.22:24
johnsomYeah, flavor might be wrong in having the ID there22:26
johnsomIt might also be needed for the query map though22:26
*** rcernin has joined #openstack-lbaas22:45
lxkonghi guys, I am wondering if it's possible that some amphora will never be failed over, given octavia is always picking up the first unhealthy one from db?22:52
lxkonghttps://www.irccloud.com/pastebin/vaRMVfZv/22:52
lxkongor do i miss something elsewhere?22:53
lxkongstable/train22:53
*** tkajinam has joined #openstack-lbaas22:53
johnsomlxkong No, you will note the "busy" flag. This is set once an amphora is selected for failover, thus will not be in the results for the next health manager22:53
lxkongjohnsom: but what if the lb is in pending_update? the amphora will be skipped22:54
lxkongand next time, it will still be picked?22:55
lxkongthe following amphora will never get a chance?22:55
johnsomYes, lb in pending_update means one of the controllers has ownership of the LB and all of its parts. No other controller will act on it.22:55
lxkongso what about the following unhealthy amphorae?22:56
johnsomIf the LB is in pending_update due to a failover, the busy flag will be set22:56
lxkongi mean, e.g. I have two unhealthy amphorae (am1 and am2), the lb of am1 is in pending_update, am2 will never be failed over, right?22:57
johnsomBasically that busy flag is there to make sure it walks the list of amphora22:57
lxkongam1 is picked first, set busy=1, but the lb is in pending_update, session rolled back, busy=0, break the loop22:59
lxkongin the next loop, am1 is checked again22:59
johnsomlxkong am2 will not get failed over until the pending_update has been removed. We intentionally only failover one amphora of a load balancer at a time to make sure there is at least one serving traffic and should the initial failover blow up for some reason, there is still a chance of a functioning load balancer.22:59
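
The sequence lxkong describes above can be sketched as a toy in-memory model. This is not Octavia's code: the `Amphora` class, the `pick_for_failover` function, and the list standing in for the database are all hypothetical, and the real health manager may behave differently. The sketch only shows why a load balancer stuck in a pending state at the head of the list can keep later unhealthy amphorae from being reached.

```python
from dataclasses import dataclass

# Assumed set of "pending" provisioning states, for illustration only.
PENDING_STATES = {"PENDING_CREATE", "PENDING_UPDATE", "PENDING_DELETE"}


@dataclass
class Amphora:
    name: str
    lb_status: str        # provisioning status of the owning load balancer
    healthy: bool = False
    busy: bool = False


def pick_for_failover(amphorae):
    """Return the first unhealthy, non-busy amphora, or None.

    Mirrors the described sequence: mark the candidate busy, then undo
    that (the simulated rollback) and end the pass if its load balancer
    is in a pending state.
    """
    for amp in amphorae:
        if amp.healthy or amp.busy:
            continue
        amp.busy = True
        if amp.lb_status in PENDING_STATES:
            amp.busy = False   # simulated session rollback
            return None        # pass ends; later amphorae are not examined
        return amp
    return None


amps = [Amphora("am1", "PENDING_UPDATE"), Amphora("am2", "ACTIVE")]

first = pick_for_failover(amps)    # am1's LB is pending, nothing is picked
amps[0].lb_status = "ACTIVE"       # the pending state is finally cleared
second = pick_for_failover(amps)   # am1 is picked and stays busy
third = pick_for_failover(amps)    # only now is am2 reached
```

In this model am2 is reached only after am1's load balancer leaves the pending state, which is why stale PENDING_ states left behind by a killed controller matter.
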
lxkonghmm...some lbs in our cloud are stuck in pending_xxx, so that affects the octavia-healthmonitor service, right?23:00
lxkongespecially the amphorae for those lb are unhealthy23:01
johnsomYeah, pending_XXX means one of the controllers is already acting on it.23:01
lxkongyeah, that's gonna be problematic23:01
johnsomOne of the controllers already has ownership of the object and no other should act on it23:01
lxkongthe point is, no controllers are actually working on those lbs23:02
johnsomSo, likely what happened is one of your controllers had a non-graceful shutdown where it didn't get a chance to release the "pending_*" lock.23:02
lxkongyes, they are in pending_xxx forever23:02
johnsomThis is what we are working to fix now, sub-flow controller failures23:02
johnsomYeah, so someone did a kill -9 or powered off a controller without shutting it down.23:03
lxkongor when octavia service is up and running, maybe scan the pending lbs first and do something?23:03
johnsomAlso check your systemd scripts to make sure they aren't configured to timeout and kill -923:03
lxkongwe are using `start-stop-daemon`23:04
johnsomlxkong I think what you are reporting is similar to this: https://storyboard.openstack.org/#!/story/200734023:05
johnsomWhere you have accumulated PENDING_ that are no longer owned by the controller that locked it.23:05
lxkongyeah, the same23:05
lxkongFrom http://man7.org/linux/man-pages/man8/start-stop-daemon.8.html, I can see `All matching processes will be sent the TERM signal (or the one specified via --signal or --retry) if --stop is specified.`23:08
lxkongprobably that's the reason23:08
johnsomAlso check that your start-stop-daemon is using TERM/15 and not 923:08
johnsomJust like systemd it will escalate to sending a KILL, so you need to make sure it gives those processes time to shut down before it escalates to a KILL.23:09
johnsomYou will want a --retry config23:11
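
The TERM-then-KILL escalation johnsom describes can be illustrated with a small Python sketch. It is not how start-stop-daemon or systemd is implemented. The `stop_gracefully` helper and the 10-second grace period are assumptions made for the example, and the child process is a stand-in that simply sleeps.

```python
import subprocess
import sys


def stop_gracefully(proc, grace_seconds):
    """Ask a process to stop, escalating to a forced kill only on timeout.

    Returns "TERM" if the process exited within the grace period,
    otherwise "KILL".
    """
    proc.terminate()                  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "TERM"
    except subprocess.TimeoutExpired:
        proc.kill()                   # SIGKILL on POSIX; no cleanup is possible
        proc.wait()
        return "KILL"


# Stand-in for a long-running service process.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"]
)
result = stop_gracefully(child, grace_seconds=10)
```

The grace period is the important knob: if it is shorter than the time a controller needs to finish or hand back its work, the escalation to KILL leaves no chance to release the pending state.
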
*** gthiemonge has quit IRC23:13
*** gthiemonge has joined #openstack-lbaas23:13
lxkongthanks johnsom, i will deal with the pending LBs first.23:16
