Saturday, 2017-08-12

00:00 <johnsom> See, this is what I don't get: When deadlock detection is enabled (the default) and a deadlock does occur, InnoDB detects the condition and rolls back one of the transactions (the victim).
00:01 <johnsom> So, it should only roll back one. It should still let one complete.
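(The usual way to live with InnoDB picking a victim is to retry the losing transaction. A minimal sketch using oslo_db's wrap_db_retry decorator; update_health_record() is a hypothetical stand-in for whichever DB method is being rolled back, not octavia code:)

    from oslo_db import api as oslo_db_api


    @oslo_db_api.wrap_db_retry(max_retries=3, retry_on_deadlock=True)
    def update_health_record(session, amphora_id, busy):
        # If InnoDB chooses this transaction as the deadlock victim and
        # rolls it back, the decorator catches oslo_db's DBDeadlock and
        # re-runs the whole function instead of surfacing the error.
        # (Body omitted -- stand-in for whatever query is losing the race.)
        pass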
00:07 <johnsom> rm_work: Ah, I see why it just stops...
00:08 *** gongysh has joined #openstack-lbaas
00:08 *** sshank has quit IRC
00:09 *** gongysh has quit IRC
00:11 <openstackgerrit> Michael Johnson proposed openstack/octavia master: Fix health monitor DB locking.  https://review.openstack.org/493252
00:11 <johnsom> Doesn't answer the deadlock, but will cause it to not matter as much.
00:13 *** sshank has joined #openstack-lbaas
00:22 *** sshank has quit IRC
00:25 *** xingzhang has joined #openstack-lbaas
00:26 <rm_work> eugh
00:26 <rm_work> http://paste.openstack.org/show/618236/
00:26 <rm_work> followed by
00:27 <rm_work> http://paste.openstack.org/show/618237/
00:27 <rm_work> this is spectacular
00:27 <rm_work> so much bug
00:27 <rm_work> this is what i was talking about before i think
00:28 <rm_work> the first one is that failovers should be able to ignore status
00:30 <rm_work> so it does seem to ALLOW failovers now
00:30 <rm_work> but that is pretty lulzy
00:31 <xgerman_> yeah, looks like the wheels are coming off
00:31 <johnsom> The top try/catch block?
00:31 <johnsom> I mean, it should be ok for that health check to not get a lock, that is "normal" in a way
00:32 <johnsom> I probably should modify that get_stale try block to ignore the deadlock event.
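(A rough sketch of that change, assuming oslo_db's DBDeadlock exception. The repository call mirrors octavia's get_stale_amphora(), but the wrapper function here is invented for illustration:)

    from oslo_db import exception as db_exc


    def get_stale_amp(amp_health_repo, lock_session):
        """Return a stale amphora, or None if another process holds the lock."""
        try:
            return amp_health_repo.get_stale_amphora(lock_session)
        except db_exc.DBDeadlock:
            # Losing the row lock to another health manager is "normal";
            # roll back and skip this check cycle instead of logging a stack.
            lock_session.rollback()
            return None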
00:33 <xgerman_> it’s still saving the busy? I saw the commit in the func calling but…
00:34 <rm_work> that was because of a previous failed failover
00:34 <rm_work> but
00:34 <johnsom> As for the failover status, this is an interesting one.  It's locking the LB, which may have other healthy amps...
00:34 <rm_work> basically if it tries to failover when the state is PENDING_UPDATE
00:34 <rm_work> it fails
00:34 <rm_work> and yeah, the busy stays
00:34 <rm_work> i have to figure out the second one
00:35 <johnsom> Those revert issues are just missing kwargs
00:35 <rm_work> trying to figure out where
00:36 <johnsom> https://github.com/openstack/octavia/blob/master/octavia/controller/worker/tasks/database_tasks.py#L922
00:36 <rm_work> ah yeah there's one
00:36 <rm_work> and 907
00:36 <rm_work> we should fix up all of those
00:36 <johnsom> Should look more like: https://github.com/openstack/octavia/blob/master/octavia/controller/worker/tasks/database_tasks.py#L1058
00:37 <johnsom> Yeah, I fixed a ton of those at one point, but more must have slipped in
00:37 <johnsom> We probably need a hacking rule for that
00:38 <xgerman_> +1
00:39 <johnsom> Ha, that is currently the only one it looks like
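(For reference, the difference those two links contrast, shown on a hypothetical taskflow task. The convention in the L1058 example is to always accept *args/**kwargs, since taskflow can call revert() with extra keyword arguments such as the flow's failure information, and a signature without the catch-all can fail with a TypeError -- the sort of failure "missing kwargs" produces:)

    from taskflow import task


    class MarkSomethingBusy(task.Task):  # hypothetical task name
        def execute(self, amphora):
            pass

        # Broken shape (what the L907/L922 reverts looked like): no
        # catch-all arguments, so unexpected keyword arguments blow up
        # the revert itself.
        #
        #   def revert(self, amphora):
        #       ...

        # Fixed shape, matching the L1058 example:
        def revert(self, amphora, *args, **kwargs):
            pass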
00:39 <xgerman_> It makes sense to me to lock an LB during failover even if it has more than one amp — we can’t guarantee that updates will reach all amps at that point in time
00:40 <xgerman_> but we should ignore it when we failover another amp
00:41 <johnsom> Oh, I don't disagree that it should be locked, I'm just worried that if the update thread is still going on the other o-cw it's going to mess with the state machine, i.e. unlock it
00:41 <xgerman_> mmh
00:42 <johnsom> I mean it "should" fail out and go to ERROR instead of pending
00:42 <rm_work> yep lol
00:42 <rm_work> just the one spot
00:42 <rm_work> awesome >_>
00:43 <johnsom> So, either we don't failover when it's in PENDING_* and wait for it to exit that state or...
00:43 <xgerman_> well, we always need to failover - uptime is our ultimate goal
00:44 <johnsom> Yeah, but I don't want failover of one amp to cause failure of the other...
00:45 <xgerman_> ok, makes sense - so if we are not running SINGLE we can wait for the update (and hope it doesn’t crash by talking to the defunct amp)
00:45 <rm_work> but yeah that's what i was saying earlier -- "we always need to failover"
00:45 <rm_work> so blocking a failover because of an update is kinda >_>
00:45 <rm_work> but, yeah, easier said than done since it IS problematic
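(One way to express the first of the two options above, i.e. wait for the lock holder to finish instead of failing the failover outright. Everything here is an invented helper for illustration, not octavia code:)

    import time


    def wait_for_unlocked_lb(get_provisioning_status, lb_id,
                             timeout=300, interval=5):
        """Block until the LB leaves PENDING_*, or give up after timeout seconds."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            status = get_provisioning_status(lb_id)
            if not status.startswith('PENDING_'):
                return status
            time.sleep(interval)
        raise RuntimeError('load balancer %s stayed in PENDING_* for %ss; '
                           'refusing to fail over on top of an in-flight update'
                           % (lb_id, timeout))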
00:45 <rm_work> anyway this HELPS since now failovers *happen*, but now i'm just getting deadlocks like constantly
00:46 <johnsom> Did you ever find the deadlock log?
00:46 <rm_work> seriously, just spewing them
00:46 <rm_work> looking
00:47 <rm_work> oh err
00:47 <rm_work> wait
00:47 <rm_work> am i using INNODB?
00:47 <johnsom> I super hope so
00:47 <rm_work> err
00:47 <rm_work> how do i verify that
00:47 <johnsom> http://paste.openstack.org/show/618235/
00:47 <rm_work> i have this set up as percona+xtradb
00:47 <johnsom> Yeah, you are
00:48 <johnsom> It was in your status output
00:48 <rm_work> does XtraDB not override that or something
00:48 <rm_work> XtraDB is madness
00:48 <johnsom> sqlalchemy + mysql is madness
00:48 <rm_work> lol
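(For the "how do i verify that" question, a quick check of which storage engine the octavia tables actually use; the connection URL is a placeholder. Percona XtraDB is an enhanced InnoDB and still reports itself as InnoDB here:)

    from sqlalchemy import create_engine, text

    # Placeholder credentials/host -- point this at the octavia schema.
    engine = create_engine('mysql+pymysql://octavia:secret@127.0.0.1/octavia')

    with engine.connect() as conn:
        rows = conn.execute(text(
            "SELECT TABLE_NAME, ENGINE FROM information_schema.TABLES "
            "WHERE TABLE_SCHEMA = 'octavia'"))
        for table_name, table_engine in rows:
            print(table_name, table_engine)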
00:50 *** leitan has quit IRC
00:50 <rm_work> yeah i have no idea where the errors are going <_<
00:51 <rm_work> if anywhere
00:52 <johnsom> lsof?
00:55 <openstackgerrit> Michael Johnson proposed openstack/octavia master: Fix health monitor DB locking.  https://review.openstack.org/493252
00:55 <johnsom> That will shut it up
00:55 <xgerman_> ha
00:55 <rm_work> lol...
00:57 <rm_work> not sure that's ideal
00:58 <johnsom> Well, no.  We still need to figure out what is deadlocking.
01:12 <rm_work> this is dumb
01:13 <rm_work> maybe i need to explicitly configure a log location?
01:13 <rm_work> ah percona xtradb is Galera
01:14 <johnsom> There should be a mysql variable that defines the error log location
01:14 <johnsom> But didn't you see those "row too long" messages? That should have been the error log
01:15 <rm_work> yeah i found those on all nodes
01:15 <rm_work> but nothing about deadlocks
01:15 <rm_work> i don't know if setting that global is working right
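(A few server variables that help answer "where are the errors going": log_error is the error-log path, innodb_print_all_deadlocks makes every deadlock land in that log, and on a Galera/XtraDB cluster wsrep_log_conflicts logs the certification conflicts that multi-node writes are reported back to clients as deadlocks. A small sketch, again with a placeholder connection URL:)

    from sqlalchemy import create_engine, text

    engine = create_engine('mysql+pymysql://octavia:secret@127.0.0.1/octavia')

    with engine.connect() as conn:
        print(conn.execute(text("SHOW VARIABLES LIKE 'log_error'")).fetchall())
        print(conn.execute(text(
            "SHOW VARIABLES LIKE 'innodb_print_all_deadlocks'")).fetchall())
        print(conn.execute(text(
            "SHOW VARIABLES LIKE 'wsrep_log_conflicts'")).fetchall())
        # The most recent deadlock also shows up under LATEST DETECTED
        # DEADLOCK in the output of: SHOW ENGINE INNODB STATUS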
01:15 <rm_work> hmmmmmm
01:15 <rm_work> maybe I need to just ...
01:15 <rm_work> only send writes to one node <_<
01:15 <rm_work> one sec
01:16 <rm_work> doing that
01:16 <rm_work> man, what we ARE missing is active/passive
01:17 <rm_work> i want to have one node ONLY come up if the other is down
01:17 <rm_work> can't really do it with weights
01:19 <rm_work> johnsom: k i think that solves it -- so this is not really octavia's problem, so much as galera's optimistic locking and writing to more than one node
01:19 <johnsom> Are you kidding me?
01:20 <johnsom> Ugh, can't figure out why this regex doesn't work
01:21 <johnsom> (.)*def revert\(.+, (?!\*\*kwargs)\):
01:28 <rm_work> :3
01:28 <rm_work> this is a little odd
01:33 <rm_work> johnsom: so i would say: throw that revert fix into the same HM patch, *remove* the bits that hide the deadlock messages from logs, and we should merge that
01:34 <rm_work> since it does solve a problem
01:35 <johnsom> I would consider it if I can get this damn regex to work
01:35 *** yamamoto has joined #openstack-lbaas
01:37 <rm_work> yeah i poked at it
01:37 <rm_work> not sure wtf
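(On the regex: in (.)*def revert\(.+, (?!\*\*kwargs)\): the literal \): has to match immediately after the ", " and the lookahead, so it only matches an argument list that literally ends in ", ):" and never fires on the signatures it is meant to catch. Moving the negative lookahead to just after the opening paren works. Below is the general shape of a hacking/flake8-style check built on it; the check name and message code are made up, and a signature split across multiple lines would still slip through:)

    import re

    _REVERT_NO_KWARGS = re.compile(r'def revert\((?!.*\*\*kwargs).*\):')


    def check_revert_accepts_kwargs(logical_line):
        """Flag taskflow revert() definitions that do not accept **kwargs."""
        if _REVERT_NO_KWARGS.search(logical_line):
            yield 0, 'OXXX: revert() should accept *args/**kwargs'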
01:50 *** ssmith has quit IRC
02:03 *** yamamoto has quit IRC
02:43 *** gongysh has joined #openstack-lbaas
03:04 *** yamamoto has joined #openstack-lbaas
03:08 *** xingzhang has quit IRC
03:08 *** xingzhang has joined #openstack-lbaas
03:09 *** yamamoto has quit IRC
03:13 *** xingzhang has quit IRC
03:14 *** xingzhang has joined #openstack-lbaas
03:28 *** rajivk has quit IRC
03:28 *** reedip has quit IRC
03:29 *** yamamoto has joined #openstack-lbaas
04:13 <openstackgerrit> Michael Johnson proposed openstack/python-octaviaclient master: Improve error reporting for the octavia plugin  https://review.openstack.org/493273
04:14 <johnsom> Ok, that should pass through our fault strings to the user, giving better error strings than "Bad Request"
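(The gist of that client change, illustrated with plain requests rather than the actual python-octaviaclient code: surface the API's faultstring instead of the bare HTTP reason. The endpoint path, error type, and function name here are placeholders; the faultstring field is what octavia's JSON error bodies carry:)

    import requests


    def create_load_balancer(endpoint, token, body):
        resp = requests.post(endpoint + '/v2.0/lbaas/loadbalancers',
                             json=body, headers={'X-Auth-Token': token})
        if resp.status_code >= 400:
            try:
                detail = resp.json().get('faultstring', resp.reason)
            except ValueError:
                detail = resp.reason          # e.g. just "Bad Request"
            raise RuntimeError('Octavia API %s: %s'
                               % (resp.status_code, detail))
        return resp.json()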
04:39 *** yamamoto has quit IRC
04:48 *** yamamoto has joined #openstack-lbaas
04:55 *** yamamoto has quit IRC
06:22 *** gcheresh has joined #openstack-lbaas
06:36 *** gongysh has quit IRC
06:40 *** gcheresh has quit IRC
06:54 *** yamamoto has joined #openstack-lbaas
06:59 *** yamamoto has quit IRC
07:03 *** tesseract has joined #openstack-lbaas
07:24 *** KeithMnemonic has quit IRC
07:27 *** yamamoto has joined #openstack-lbaas
07:32 *** yamamoto has quit IRC
08:12 *** Alex_Staf has joined #openstack-lbaas
08:19 *** aojea has joined #openstack-lbaas
09:28 *** gongysh has joined #openstack-lbaas
09:28 *** gongysh has quit IRC
09:47 *** aojea has quit IRC
10:51 *** amotoki__away is now known as amotoki
10:53 *** aojea has joined #openstack-lbaas
11:00 *** aojea has quit IRC
11:02 *** dasanind has quit IRC
11:27 *** yamamoto has joined #openstack-lbaas
12:01 *** yamamoto has quit IRC
12:21 *** yamamoto has joined #openstack-lbaas
12:24 *** gcheresh has joined #openstack-lbaas
12:30 *** Alex_Staf has quit IRC
12:35 *** Alex_Staf has joined #openstack-lbaas
12:57 *** aojea has joined #openstack-lbaas
13:01 *** aojea has quit IRC
13:17 *** gcheresh has quit IRC
14:58 *** aojea has joined #openstack-lbaas
14:58 *** xingzhang has quit IRC
14:59 *** xingzhang has joined #openstack-lbaas
15:02 *** aojea has quit IRC
15:03 *** xingzhang has quit IRC
15:31 *** ajo has quit IRC
15:40 *** yamamoto has quit IRC
15:41 *** yamamoto has joined #openstack-lbaas
15:52 *** ipsecguy_ has joined #openstack-lbaas
15:56 *** ipsecguy has quit IRC
16:09 *** xingzhang has joined #openstack-lbaas
16:33 *** Alex_Staf has quit IRC
16:42 *** xingzhang has quit IRC
16:58 *** aojea has joined #openstack-lbaas
17:03 *** aojea has quit IRC
17:04 *** aojea has joined #openstack-lbaas
17:28 *** tesseract has quit IRC
17:42 *** xingzhang has joined #openstack-lbaas
18:05 *** Alex_Staf has joined #openstack-lbaas
18:12 *** xingzhang has quit IRC
18:13 *** aojea has quit IRC
18:57 <openstackgerrit> Michael Johnson proposed openstack/octavia master: Fix octavia logging to be more friendly  https://review.openstack.org/493328
19:12 *** xingzhang has joined #openstack-lbaas
19:30 *** D33P-B00K has joined #openstack-lbaas
19:30 *** D33P-B00K has left #openstack-lbaas
19:42 *** xingzhang has quit IRC
19:58 *** Alex_Staf has quit IRC
20:00 *** gcheresh has joined #openstack-lbaas
20:02 <johnsom> Nice, that works for the gates
20:42 *** xingzhang has joined #openstack-lbaas
20:42 *** aojea has joined #openstack-lbaas
20:58 <openstackgerrit> Merged openstack/neutron-lbaas master: Update reno for stable/pike  https://review.openstack.org/492872
21:04 *** gcheresh has quit IRC
21:12 *** xingzhang has quit IRC
21:37 *** aojea has quit IRC
21:37 *** aojea has joined #openstack-lbaas
22:12 *** xingzhang has joined #openstack-lbaas
22:42 *** xingzhang has quit IRC
23:31 *** aojea has quit IRC
23:42 *** xingzhang has joined #openstack-lbaas
