Friday, 2020-08-28

*** yamamoto has joined #openstack-lbaas00:10
*** yamamoto has quit IRC00:33
openstackgerritMichael Johnson proposed openstack/octavia-tempest-plugin master: Adjust scenario tests for NotImplemented skip  https://review.opendev.org/71400400:37
*** yamamoto has joined #openstack-lbaas02:30
*** yamamoto has quit IRC02:35
*** ramishra has joined #openstack-lbaas02:40
*** sapd1_x has joined #openstack-lbaas02:45
*** ramishra has quit IRC02:47
*** armax has quit IRC02:48
*** yamamoto has joined #openstack-lbaas02:56
*** rcernin has quit IRC02:56
sorrisonrm_work, johnsom: We have finally switched fully over to octavia, migrated the last of the old neutron lbaas over a couple weeks ago02:57
sorrisongot about 150 LBs running02:57
johnsomNice!02:58
*** sapd1_x has quit IRC02:59
sorrisonmainly going ok, but about every day we get a bunch of amps going into ERROR status03:00
sorrisonbeen trying to figure out why03:00
*** rcernin has joined #openstack-lbaas03:04
johnsomHmm, check the health manager log03:04
johnsomThat is unusual for sure.03:04
johnsomAre these pretty stable LBs or do they have a high rate of changes going on?03:05
sorrisonboth :-)03:06
sorrisonAmphora %(id)s health message was processed too '03:06
sorrison                            'slowly: %(delay)ss! The system may be overloaded '03:06
sorrison                            'or otherwise malfunctioning. This heartbeat has '03:06
sorrison                            'been ignored and no update was made to the '03:06
sorrison                            'amphora health entry. THIS IS NOT GOOD.',03:06
sorrisonWe see this sometimes03:06
sorrisonI can't quite figure out why as the system is not overloaded03:07
johnsomOh! Yeah, huge red flag. Your database is overloaded03:07
sorrisonWe have 3 health manager03:07
sorrisonour DB is def not overloaded03:07
johnsomThat means a simple db query took longer than 10 seconds to respond.03:07
sorrisonphysical hardware less than a year old with nvme03:07
sorrisonwe have no slow queries03:08
johnsomI have seen this when people put 30+ containers on a single host, with the primary db and the master rabbit queue.03:08
sorrisonna we have dedicated hardware for our DB servers03:09
johnsomThen that is very odd. We know it handles 2000+03:11
johnsomFrom other deployments.03:11
sorrisonwell we have the odd slow query so I do lie, but hardly any `Slow queries: 0% (224K/13B)`03:11
sorrisonSo is it def a DB issue and couldn't be anything else?03:11
johnsomWhat version are you running?03:11
sorrisonoctavia is ussuri03:12
sorrisonwith a couple of other patches on top, including the fail over refactor03:12
johnsomAre there “dropped” messages or just the “this is not good” messages?03:13
sorrisonjust trying to find the log string to chuck in kibana, I can't see any log messages with dropped in the string03:14
johnsomIt is either a process/thread pool exhaustion or DB queries taking close to 10 seconds to respond. Assuming you haven’t changed the heartbeat interval.03:15
johnsomSorry on mobile so typos.03:16
sorrisonno, haven't changed the interval, something I've been thinking of but trying to figure this all out exactly first03:17
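The check behind the "processed too slowly" warning quoted above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Octavia's actual code: `HEARTBEAT_INTERVAL`, `processing_delay` and `should_ignore` are hypothetical names, and the 10 second value assumes the unchanged heartbeat interval mentioned in the discussion.

```python
# Hypothetical constant: the 10 second heartbeat interval from the discussion.
HEARTBEAT_INTERVAL = 10.0


def processing_delay(recv_time: float, now: float) -> float:
    """Seconds elapsed between receiving a heartbeat and handling it."""
    return now - recv_time


def should_ignore(recv_time: float, now: float,
                  interval: float = HEARTBEAT_INTERVAL) -> bool:
    """Return True when handling took at least one heartbeat interval.

    A heartbeat this old may already be superseded by a newer one, so a
    health manager would skip it instead of updating the health entry.
    """
    return processing_delay(recv_time, now) >= interval


# A heartbeat received at t=100.0 s and handled at t=111.5 s is ignored;
# one handled about 18 ms after arrival is processed normally.
print(should_ignore(100.0, 111.5))
print(should_ignore(100.0, 100.018))
```

Anything that stalls the handler between receipt and this check, such as a slow DB query or an exhausted process/thread pool, pushes the delay toward the interval.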
johnsomI can give you the db query to run for monitoring or testing tomorrow when I am back in the office.03:17
sorrisonok thanks, probably not in the middle of the work day like me :-)03:17
johnsomIt is a specially optimized query for this. I did a bunch of work optimizing that and scale testing it as we had 2000+ deployments that needed it.03:18
johnsomYeah, it is 8pm here. Watching a movie with the wife.03:19
johnsomOh, one other case we saw was an ha db deployment that was flapping master and having resync latency.03:21
*** ramishra has joined #openstack-lbaas03:22
sorrisonyeah thought of that as we run a cluster, but all our queries are going to the 1 server and there hasn't been any flipping etc.03:22
*** rcernin has quit IRC03:35
*** rcernin has joined #openstack-lbaas03:39
*** sapd1_x has joined #openstack-lbaas03:41
*** psachin has joined #openstack-lbaas03:51
*** rcernin has quit IRC03:54
sorrisonIt seems to happen in spikes so trying to track that one down. Changed the log level from debug -> info and now getting `Health Update finished in: 0.018291685730218887 seconds` so will monitor this and see when it happens again03:58
*** rcernin has joined #openstack-lbaas04:08
*** rcernin has quit IRC04:18
*** rcernin has joined #openstack-lbaas04:19
*** sapd1_x has quit IRC04:26
*** spatel has joined #openstack-lbaas04:38
*** spatel has quit IRC04:43
johnsomYeah, that is slower than my cloud, but still respectable. I usually get 0.006...04:55
johnsom1004:55
johnsom10 is where we start having problems with it.04:56
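One way to watch for the spikes is to scan the health manager log for the `Health Update finished in:` lines and flag any that creep toward the 10 second mark. The following is a minimal sketch, assuming the log line format pasted above; `slow_updates` and the 5 second default threshold are hypothetical choices for illustration, not part of Octavia.

```python
import re

# Matches the duration in lines like:
#   Health Update finished in: 0.018291685730218887 seconds
PATTERN = re.compile(
    r"Health Update finished in: (\d+(?:\.\d+)?(?:[eE][+-]?\d+)?) seconds"
)


def slow_updates(lines, threshold=5.0):
    """Yield (duration, line) for health updates taking >= threshold seconds.

    The 5.0 default is an arbitrary early-warning level, half of the
    10 seconds at which problems reportedly start.
    """
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not a health update timing line
        duration = float(match.group(1))
        if duration >= threshold:
            yield duration, line.rstrip("\n")


sample = [
    "Health Update finished in: 0.018291685730218887 seconds",
    "Health Update finished in: 9.7 seconds",
    "some unrelated log line",
]
for duration, line in slow_updates(sample):
    print(f"{duration:.3f}s  {line}")
```

Feeding it the lines of a real log file (for example via `open(path)`) works the same way, since the function only needs an iterable of strings.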
*** vishalmanchanda has joined #openstack-lbaas05:01
*** ccamposr has joined #openstack-lbaas06:36
*** ataraday_ has joined #openstack-lbaas07:18
*** ccamposr__ has joined #openstack-lbaas07:32
*** ccamposr has quit IRC07:34
openstackgerritAnn Taraday proposed openstack/octavia master: Add option to set default ssl ciphers in haproxy  https://review.opendev.org/68533708:11
*** kevinz has joined #openstack-lbaas08:32
*** gcheresh has joined #openstack-lbaas08:49
*** gcheresh has quit IRC08:55
*** gcheresh has joined #openstack-lbaas09:08
*** spatel has joined #openstack-lbaas09:20
*** spatel has quit IRC09:24
*** gcheresh has quit IRC09:27
*** vishalmanchanda has quit IRC09:27
cgoncalvesoctavia-tox-functional-py37-tips (voting) is failing because of a recent change in octavia-lib. https://review.opendev.org/#/c/744520/ fixes the gate. if you folks have some time to review it... :)10:12
*** yamamoto has quit IRC10:20
*** gcheresh has joined #openstack-lbaas10:32
*** gcheresh has quit IRC10:40
*** yamamoto has joined #openstack-lbaas10:51
*** yamamoto has quit IRC12:25
*** servagem has quit IRC12:54
*** rcernin has quit IRC12:56
*** servagem has joined #openstack-lbaas12:57
*** spatel has joined #openstack-lbaas13:00
*** spatel has quit IRC13:04
*** yamamoto has joined #openstack-lbaas13:05
*** yamamoto has quit IRC13:10
openstackgerritGregory Thiemonge proposed openstack/octavia master: Add SCTP support in API and Amphora  https://review.opendev.org/73838113:19
openstackgerritGregory Thiemonge proposed openstack/octavia-tempest-plugin master: WIP SCTP traffic scenario tests  https://review.opendev.org/73864313:20
*** vishalmanchanda has joined #openstack-lbaas13:38
openstackgerritGregory Thiemonge proposed openstack/python-octaviaclient master: Add SCTP support  https://review.opendev.org/74866713:49
*** ataraday_ has quit IRC13:59
*** TrevorV has joined #openstack-lbaas14:00
openstackgerritGregory Thiemonge proposed openstack/python-octaviaclient master: Add SCTP support  https://review.opendev.org/74866714:19
*** armax has joined #openstack-lbaas14:23
*** sapd1 has quit IRC14:40
openstackgerritGregory Thiemonge proposed openstack/octavia-dashboard master: Add support for SCTP  https://review.opendev.org/74868114:46
openstackgerritGregory Thiemonge proposed openstack/python-octaviaclient master: Add SCTP support  https://review.opendev.org/74866714:58
openstackgerritGregory Thiemonge proposed openstack/octavia-dashboard master: Add support for SCTP  https://review.opendev.org/74868115:00
*** yamamoto has joined #openstack-lbaas15:07
*** yamamoto has quit IRC15:12
*** gcheresh has joined #openstack-lbaas15:14
*** gcheresh has quit IRC15:40
openstackgerritMichael Johnson proposed openstack/octavia master: Add proxy v2 protocol support  https://review.opendev.org/74780116:10
johnsomcgoncalves the ALPN patch looks good, thanks!16:21
*** psachin has quit IRC16:24
*** ccamposr has joined #openstack-lbaas16:50
*** ccamposr__ has quit IRC16:53
*** yamamoto has joined #openstack-lbaas16:56
*** gcheresh has joined #openstack-lbaas16:58
openstackgerritBrian Haley proposed openstack/octavia-tempest-plugin master: Change pool create scenario test to wait for operating status  https://review.opendev.org/74596217:00
*** yamamoto has quit IRC17:01
openstackgerritBrian Haley proposed openstack/octavia master: Remove Neutron SDN-specific code  https://review.opendev.org/71819217:32
openstackgerritMichael Johnson proposed openstack/octavia-tempest-plugin master: Adjust scenario tests for NotImplemented skip  https://review.opendev.org/71400417:34
johnsomOk, that should be back in working order after the ACL patch merged.17:34
openstackgerritBrian Haley proposed openstack/octavia-tempest-plugin master: Change pool create scenario test to wait for operating status  https://review.opendev.org/74596217:35
openstackgerritMichael Johnson proposed openstack/octavia-tempest-plugin master: Adjust API tests for NotImplemented skip  https://review.opendev.org/74480518:09
openstackgerritGregory Thiemonge proposed openstack/octavia master: Fix nf_conntrack_buckets sysctl in Amphora  https://review.opendev.org/74874919:39
*** TrevorV has quit IRC19:57
*** jamesdenton has quit IRC20:39
*** jamesdenton has joined #openstack-lbaas20:40
openstackgerritCarlos Goncalves proposed openstack/octavia master: Switch to live from noop drivers  https://review.opendev.org/74816320:51
*** yamamoto has joined #openstack-lbaas20:58
*** servagem has quit IRC20:59
*** yamamoto has quit IRC21:03
*** jamesdenton has quit IRC21:07
*** jamesden_ has joined #openstack-lbaas21:07
*** vishalmanchanda has quit IRC21:25
*** yamamoto has joined #openstack-lbaas21:32
*** gcheresh has quit IRC21:36
openstackgerritMichael Johnson proposed openstack/octavia-tempest-plugin master: Adjust API tests for NotImplemented skip  https://review.opendev.org/74480522:04
*** armax has quit IRC22:05
-openstackstatus- NOTICE: A zuul server ended up with read only filesystems which caused many jobs to hit retry_limit. The server has been rebooted and appears happy. Jobs can be rechecked.22:13
*** yamamoto has quit IRC22:16
*** armax has joined #openstack-lbaas22:47
*** rcernin has joined #openstack-lbaas23:10
*** rcernin has quit IRC23:15
*** armax has quit IRC23:26
*** armax has joined #openstack-lbaas23:40
*** armax has quit IRC23:52
*** armax has joined #openstack-lbaas23:56
