Monday, 2020-01-27

*** gthiemonge has quit IRC  01:30
*** gthiemon1e has joined #openstack-lbaas  01:31
*** yamamoto has quit IRC  01:48
*** openstackgerrit has quit IRC  02:04
*** armax has joined #openstack-lbaas  02:10
*** yamamoto has joined #openstack-lbaas  02:57
*** psachin has joined #openstack-lbaas  03:35
*** ramishra has joined #openstack-lbaas  03:46
*** gcheresh has joined #openstack-lbaas  06:41
*** tkajinam has quit IRC  07:02
*** tkajinam has joined #openstack-lbaas  07:04
*** gthiemon1e is now known as gthiemonge  07:16
*** luksky has joined #openstack-lbaas  07:29
*** tkajinam_ has joined #openstack-lbaas  07:52
*** tkajinam has quit IRC  07:55
*** tesseract has joined #openstack-lbaas  08:13
*** tkajinam_ has quit IRC  08:18
*** rpittau|afk is now known as rpittau  08:18
*** pcaruana has joined #openstack-lbaas  08:25
*** openstackgerrit has joined #openstack-lbaas  08:26
<openstackgerrit> Ann Taraday proposed openstack/octavia master: Jobboard based controller  https://review.opendev.org/647406  08:26
*** AlexStaf has joined #openstack-lbaas  08:27
*** ccamposr has joined #openstack-lbaas  08:30
*** vesper11 has quit IRC  09:01
*** vesper11 has joined #openstack-lbaas  09:05
*** yamamoto has quit IRC  09:13
<openstackgerrit> Ann Taraday proposed openstack/octavia master: Jobboard based controller  https://review.opendev.org/647406  09:18
*** etp has quit IRC  10:58
*** etp has joined #openstack-lbaas  11:00
*** rpittau is now known as rpittau|bbl  11:21
<openstackgerrit> Ann Taraday proposed openstack/octavia master: Jobboard based controller  https://review.opendev.org/647406  11:24
<openstackgerrit> Ann Taraday proposed openstack/octavia master: Testing  https://review.opendev.org/697213  11:30
*** luksky has quit IRC  11:55
*** yamamoto has joined #openstack-lbaas  12:04
*** yamamoto has quit IRC  12:09
*** yamamoto_ has joined #openstack-lbaas  12:09
<TMM> Does anyone happen to know if there's any version of the octavia dashboard that supports the new allowed-cidr options from Train?  12:22
<TMM> I upgraded horizon to train but it appears that there's no such option in horizon at least  12:22
<cgoncalves> TMM, allowed-cidr option has not been added to the dashboard yet  12:23
<TMM> OK, thanks for the confirmation, I'm not losing my mind :)  12:23
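[Editor's note: until the dashboard catches up, the feature is reachable from the Train CLI. A minimal sketch, assuming a python-octaviaclient recent enough to carry the --allowed-cidr flag; the listener ID is a placeholder:

    openstack loadbalancer listener set --allowed-cidr 192.0.2.0/24 <listener-id>

The flag can be repeated to allow several CIDRs on one listener.]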
*** rpittau|bbl is now known as rpittau  12:55
*** ramishra has quit IRC  13:20
*** ramishra has joined #openstack-lbaas  13:21
*** ramishra has quit IRC  13:21
*** ramishra has joined #openstack-lbaas  13:21
*** psachin has quit IRC  13:29
*** yamamoto_ has quit IRC  13:56
*** yamamoto has joined #openstack-lbaas  13:58
<TMM> Is there a way to tell octavia to retry some operations on an lbaas? I made an error updating octavia and now all my load balancers are either in PENDING or ERROR state :P (forgot to set the rabbit topic name)  14:01
<TMM> They are all still working fine, I still need to update their amphoras  14:01
<openstackgerrit> Gregory Thiemonge proposed openstack/octavia master: Support haproxy development snapshot version parsing  https://review.opendev.org/701823  14:07
*** haleyb has joined #openstack-lbaas  14:14
*** luksky has joined #openstack-lbaas  14:16
<johnsom> TMM that scenario might be tricky. For those in ERROR, you can use the failover API. For those in PENDING_*, it is likely the workers never got that message, so you need to set them to ERROR in the DB, then fail those over.  14:19
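[Editor's note: spelled out, that recovery is one database update plus the failover API. A minimal sketch, assuming direct access to the Octavia MySQL database and the Train-era load_balancer.provisioning_status column (verify both against your deployment); the connection string is a placeholder:

    # Flip stuck load balancers from PENDING_* to ERROR so the failover API
    # will accept them -- Octavia refuses to act on PENDING_* objects.
    from sqlalchemy import create_engine, text

    engine = create_engine("mysql+pymysql://octavia:secret@db-host/octavia")

    with engine.begin() as conn:
        conn.execute(text(
            "UPDATE load_balancer "
            "SET provisioning_status = 'ERROR' "
            "WHERE provisioning_status LIKE 'PENDING%'"
        ))

Afterwards, run `openstack loadbalancer failover <lb-id>` for each load balancer now in ERROR.]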
<TMM> johnsom: the openstack loadbalancer failover, or the amphora failover api?  14:20
<johnsom> Load balancer failover  14:20
<TMM> ok, thanks  14:21
<TMM> I appreciate it! :D  14:21
<TMM> hmm, after the failover the amphoras went to 'standalone'  14:22
<johnsom> It should do that temporarily during the failover process  14:22
<TMM> Ahh, ok  14:22
<johnsom> It builds them as standalone, then will update them to their proper role.  14:23
<TMM> clever :) I don't think it used to do that  14:23
<johnsom> A load balancer failover sequences the amphora replacements so that it minimizes downtime, etc.  14:23
<johnsom> There are also additional improvements to failover coming. I am working on that right now.  14:24
<TMM> hmm, I now have some amphora with a non-matching ssl cert? (Caused by SSLError(CertificateError("hostname u'fe56fc4a-71cb-416a-a4c5-bf8892d81879' doesn't match '6cf80b52-c839-4d05-a777-e72a1530e126'")  I wonder how I managed to do this  14:25
<TMM> @johnsom Thank you for your work! I generally really like octavia!  14:25
<johnsom> Excellent, glad to hear it.  14:25
<johnsom> That is very odd, but maybe the rabbit issue impacted nova or neutron too?  14:26
<TMM> Hmm, maybe, but I only touched octavia  14:27
<TMM> and it was just a new setting in oslo configs that I didn't set  14:27
<johnsom> This is a funny one: https://storyboard.openstack.org/#!/story/2007218  14:34
<johnsom> My guess is it is the standard openstackclient behavior, but I will take a look  14:35
<cgoncalves> https://docs.python.org/3/library/argparse.html#choices  14:36
<gthiemonge> is 65535 a reserved port? it's not in the choice list  14:37
<gthiemonge> choices=range(1, 65535) doesn't look good  14:39
<johnsom> Yep. I bet we can do better....  Interestingly enough, we don't validate the listener port #, just pass it through to the API to validate.  14:41
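[Editor's note: the range bug is easy to demonstrate: range(1, 65535) excludes its upper bound, so 65534 is the last accepted port, and argparse's choices= echoes every legal value in the error message. A short sketch of the problem and one common fix via a custom type= callable (an illustration, not necessarily the patch that was merged):

    import argparse

    parser = argparse.ArgumentParser()

    # Buggy form: rejects 65535 and dumps ~65k choices into the usage error.
    # parser.add_argument('--protocol-port', type=int, choices=range(1, 65535))

    def port_number(value):
        """Validate a TCP/UDP port without echoing every legal value."""
        port = int(value)
        if not 1 <= port <= 65535:
            raise argparse.ArgumentTypeError(
                "Value: '%s'. Value must be between 1 and 65535." % value)
        return port

    parser.add_argument('--protocol-port', type=port_number)
    print(parser.parse_args(['--protocol-port', '65535']))  # now accepted
]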
<TMM> Hmm, I appear to have at least 1 loadbalancer where the amphora is in MASTER role, but there is no slave amphora for it at all  14:47
<johnsom> Yeah, that can happen under strange circumstances. This is one of the things I am currently fixing.  14:47
<TMM> anything I can do about that now? :)  14:48
<johnsom> So, the short answer is, there is not an easy way to fix this. If it's not a critical LB, delete it and recreate it. If it's critical, there is likely a number of steps required to fix it enough that a failover will complete.  14:49
<TMM> well, I ran a 'loadbalancer failover' on a bunch of my loadbalancers and most of them are now in this state it seems  14:50
<TMM> would an amphora failover make octavia notice that there's some amphoras missing?  14:51
<johnsom> Are they back to Active or Error state, or still Pending?  14:51
<TMM> Everything is in 'active' state  14:52
<TMM> there's just amphoras missing, but nothing seems to be too concerned about this  14:52
*** TrevorV has joined #openstack-lbaas  14:53
<TMM> yeah, so right now all my loadbalancers are in provisioning_status ACTIVE, and ONLINE  14:55
<TMM> All my amphoras are ALLOCATED, but there's just a bunch of backups just kind of not there  14:55
<TMM> (this is octavia from Train btw)  14:56
<johnsom> Yeah, it's a bug where if the database records for the amphora somehow got removed, it doesn't notice there is one missing. This is the patch I am working on to fix now. I have it working in my lab, but still have work to do before I can publish it. The original authors assumed that scenario would never happen.  14:57
<TMM> I don't think I deleted any octavia db records  14:57
<johnsom> Yes, but the failover might have.  14:58
<TMM> ah, ok  14:58
<TMM> Do I have to manually undelete the amphora db record?  14:58
<johnsom> If you can't just delete/rebuild, you will need to, yes.  14:59
<TMM> I can't really delete them no  14:59
<johnsom> I think rm_work has a procedure for recreating those records, but I'm not sure if he is online at the moment.  15:01
<johnsom> If not, I can probably walk through it, but it might be a bit of a process...  15:03
<TMM> it's still in the database as DELETED  15:03
<TMM> so I just set it back to active, I'll try to do a failover now  15:04
<johnsom> Ah, that is good. Or try ERROR  15:04
<TMM> alright, I put the missing ones in error  15:07
<TMM> I'll do another failover, see if it'll fix them  15:07
*** gcheresh has quit IRC  15:09
<TMM> ok, yeah, so setting the deleted db records to 'ERROR' and running the lb failover twice seems to have fixed it  15:19
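[Editor's note: the resurrection itself is a single-column update. A minimal sketch, assuming the Train-era amphora table with a status column and a known record UUID; the connection string and ID are placeholders:

    from sqlalchemy import create_engine, text

    engine = create_engine("mysql+pymysql://octavia:secret@db-host/octavia")

    with engine.begin() as conn:
        # ERROR (rather than ALLOCATED) lets the failover logic treat the
        # amphora as broken and replace it instead of trying to talk to it.
        conn.execute(
            text("UPDATE amphora SET status = 'ERROR' WHERE id = :amp_id"),
            {"amp_id": "REPLACE-WITH-AMPHORA-ID"},
        )
]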
<TMM> the first time the lb itself went into ERROR mode, as octavia desperately tried to contact the non-existent amphora  15:20
<TMM> the second time it recovered the state  15:20
<johnsom> Oh good.  15:21
<TMM> Not sure if that was expected? :) But it worked for me at least  15:21
<johnsom> Yes, with the amp in error it should handle it better.  So, keep an eye out for future bug fix releases that will include a much improved failover capability.  15:23
<TMM> Awesome, thank you for your help. I really appreciate it.  15:24
<haleyb> johnsom: you want me to fix the port validation bug?  16:09
<johnsom> haleyb Almost done  16:09
<haleyb> johnsom: i'm already done with it :)  16:11
<haleyb> it's a race right?  16:12
<johnsom> Ha, well, I took assignment of the bug. But we can both post and see who did a better patch....  16:12
* johnsom throws the gauntlet  16:12
* cgoncalves is open to bribes  16:15
<openstackgerrit> Brian Haley proposed openstack/python-octaviaclient master: Do not print large usage message for port or weight  https://review.opendev.org/704348  16:19
<haleyb> untested though  16:19
*** ccamposr has quit IRC  16:34
<haleyb> johnsom: sigh, that doesn't exactly work ^^^  16:38
<openstackgerrit> Michael Johnson proposed openstack/python-octaviaclient master: Fix long CLI error messages  https://review.opendev.org/704355  16:47
<johnsom> haleyb ^^^^ This works.... (I still need to finish the cleanup/tests)  16:47
* haleyb shakes fist  16:48
<haleyb> johnsom: it would have helped if my devstack had octavia running, part of the problem was a 500 error  16:49
<johnsom> Invalid input for field/attribute 'protocol-port'. Value: '65536'. Value must be between 1 and 65535.  16:51
<johnsom> I made the error similar to the API error message  16:51
*** AlexStaf has quit IRC  16:51
<haleyb> johnsom: the only thing you forgot is tests :-p  16:52
<johnsom> Yep, still working on those  16:53
*** mithilarun has joined #openstack-lbaas  16:56
*** yamamoto has quit IRC  17:06
*** mithilarun has quit IRC  17:06
*** gregwork has joined #openstack-lbaas  17:15
<rm_work> TMM / johnsom: yep, resurrecting old amp records into ERROR state is the easiest way (though I just do an Amphora failover on that specific ID, not a LB failover) -- the hard way is copying the INSERT from the MASTER amp, and just changing all of the amp/compute/port ID fields to junk uuids so it will just see nothing there  17:16
<openstackgerrit> Michael Johnson proposed openstack/python-octaviaclient master: Fix long CLI error messages  https://review.opendev.org/704355  17:16
<rm_work> which is only necessary if you literally have no other records to work with  17:17
<rm_work> (like I did at the time I dealt with most of that)  17:17
<TMM> still waiting on the last lb to recover  17:17
<TMM> it's taking so friggin long to timeout on the non-existent amps  17:17
<rm_work> yeah it's safer/easier to do individual amp failovers  17:17
<rm_work> then you don't run into that  17:17
<TMM> ah  17:17
<TMM> well, now I know  17:18
<rm_work> and if you want to be extra sure, do the LB failover once the Amp failover succeeds and you have two active amps  17:18
<rm_work> less downtime that way too  17:18
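[Editor's note: rm_work's sequence, spelled out as commands. A sketch, assuming a python-octaviaclient new enough to expose the amphora failover subcommand (otherwise call the amphora failover API directly); the IDs are placeholders:

    openstack loadbalancer amphora failover <resurrected-amp-id>
    # wait until both amphorae are ALLOCATED and the LB is ACTIVE, then optionally:
    openstack loadbalancer failover <lb-id>
]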
<TMM> Yeah, this one lb has been down for like 20 minutes now  17:18
<TMM> Probably should've just recreated it  17:19
<TMM> oh well  17:19
<rm_work> :(  17:19
<rm_work> looking forward to johnsom's failover rework  17:19
<johnsom> If it makes you feel better, the new code won't do that  17:19
<TMM> computers are awful  17:19
<TMM> :P  17:19
<rm_work> ^^ yes  17:19
<rm_work> they do what we tell them to, it's horrible :D  17:20
<TMM> why is it even TRYING to contact the amp that's in ERROR mode  17:20
<TMM> just shoot it  17:20
<TMM> shoooot iiiitttt  17:20
<TMM> it's been 11 minutes now :P  17:21
<johnsom> Well, in defense of the original authors, we get differing views. Some want retries waiting for other services (nova for example) forever, others want fail fast.  17:21
<TMM> I just think that perhaps 11 minutes to wait on a node that's already in error state with a 'no route to host' error is maybe excessive  17:22
<johnsom> Yeah, the default is 25 minutes I think. That was because people were using virtualbox and some of the zuul test nodes don't have hardware virtualization. For example, one hosting provider can take up to 18 minutes to boot a VM using nova.  17:23
<johnsom> It's a poor default I think. Production should be much lower.  17:24
<TMM> I just killed octavia-worker and set everything to error except the lb, doing amphora failovers now  17:24
<TMM> I don't have another 50 minutes to wait  17:24
<johnsom> Yeah, be super careful killing the octavia processes. Currently that can halt other actions going on in the cloud and lead to PENDING_* states and broken LBs  17:25
<rm_work> yeah i was tempted to suggest that  17:25
<rm_work> but  17:25
<rm_work> yeah it's a little risky  17:25
<johnsom> They may even blow up in the future, not necessarily right away.  17:25
<TMM> Nothing really was happening at the time  17:25
<johnsom> There are patches in flight for that issue too  17:26
<TMM> at least the debug log of worker didn't seem to suggest it was doing anything except waiting on that one amp  17:26
*** luksky has quit IRC  17:27
<johnsom> haleyb Up for review... grin  17:27
*** tesseract has quit IRC  17:38
*** yamamoto has joined #openstack-lbaas  17:43
*** yamamoto has quit IRC  17:52
*** mithilarun has joined #openstack-lbaas  18:09
<openstackgerrit> Michael Johnson proposed openstack/python-octaviaclient master: Fix long CLI error messages  https://review.opendev.org/704355  18:11
*** yamamoto has joined #openstack-lbaas  18:14
*** rpittau is now known as rpittau|afk  18:18
<TMM> I learned that resurrecting two amp records that are both set to 'MASTER' is not a recipe for success  18:24
*** gcheresh has joined #openstack-lbaas  18:37
*** AlexStaf has joined #openstack-lbaas  18:40
<rm_work> ah no, you need to set one to BACKUP  18:40
<rm_work> and also fix the vrrp_priority field?  18:41
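[Editor's note: demoting one of the pair is another small update. A minimal sketch, assuming the Train-era amphora table's role and vrrp_priority columns and the usual 100/90 MASTER/BACKUP priority split (verify against a healthy ACTIVE_STANDBY pair in your own database first); the connection string and ID are placeholders:

    from sqlalchemy import create_engine, text

    engine = create_engine("mysql+pymysql://octavia:secret@db-host/octavia")

    with engine.begin() as conn:
        # Demote one resurrected amphora so the pair is MASTER/BACKUP
        # again before retrying the failover.
        conn.execute(
            text("UPDATE amphora SET role = 'BACKUP', vrrp_priority = 90 "
                 "WHERE id = :amp_id"),
            {"amp_id": "REPLACE-WITH-AMPHORA-ID"},
        )
]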
<rm_work> uhh... that might be the only other thing  18:41
*** gcheresh has quit IRC  18:44
*** gcheresh has joined #openstack-lbaas  18:46
*** AlexStaf has quit IRC  18:46
*** yamamoto has quit IRC  18:59
*** yamamoto has joined #openstack-lbaas  19:03
*** yamamoto has quit IRC  19:03
*** yamamoto has joined #openstack-lbaas  19:03
*** yamamoto has quit IRC  19:08
*** AlexStaf has joined #openstack-lbaas  19:08
<TMM> well, it appears to work now  19:15
*** gregwork has quit IRC  19:25
*** KeithMnemonic has joined #openstack-lbaas  19:26
*** AlexStaf has quit IRC  19:28
<openstackgerrit> Brian Haley proposed openstack/octavia-tempest-plugin master: Change to use memory_tracker variable  https://review.opendev.org/704202  19:29
<rm_work> you should definitely fix it so one is MASTER and one is BACKUP or failover will not work great right now  19:42
*** luksky has joined #openstack-lbaas  19:47
*** AlexStaf has joined #openstack-lbaas  20:17
*** openstackstatus has joined #openstack-lbaas  20:28
*** ChanServ sets mode: +v openstackstatus  20:28
*** gcheresh has quit IRC  20:29
*** mithilarun has quit IRC  21:36
*** TrevorV has quit IRC  21:36
*** mithilarun has joined #openstack-lbaas  21:37
*** mithilarun has quit IRC  21:52
*** rcernin has joined #openstack-lbaas  22:10
*** mithilarun has joined #openstack-lbaas  22:20
*** mithilarun has quit IRC  22:24
*** mithilarun has joined #openstack-lbaas  22:36
*** tkajinam has joined #openstack-lbaas  22:55
*** mithilarun has quit IRC  23:31
*** mithilarun has joined #openstack-lbaas  23:32
*** mithilarun has quit IRC  23:36
*** yamamoto has joined #openstack-lbaas  23:45
*** mithilarun has joined #openstack-lbaas  23:49
*** yamamoto has quit IRC  23:49

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!