Wednesday, 2021-10-27

opendevreview Gregory Thiemonge proposed openstack/octavia stable/wallaby: Fix failover of az-specific loadbalancers  https://review.opendev.org/c/openstack/octavia/+/815585 06:22
opendevreview Gregory Thiemonge proposed openstack/octavia stable/victoria: Fix failover of az-specific loadbalancers  https://review.opendev.org/c/openstack/octavia/+/815586 06:22
opendevreview Gregory Thiemonge proposed openstack/octavia stable/ussuri: Fix failover of az-specific loadbalancers  https://review.opendev.org/c/openstack/octavia/+/815587 06:23
opendevreview Gregory Thiemonge proposed openstack/octavia stable/xena: Fix failover of az-specific loadbalancers  https://review.opendev.org/c/openstack/octavia/+/815588 06:24
opendevreview Gregory Thiemonge proposed openstack/octavia master: Reconfigure amphora network interfaces seamlessly  https://review.opendev.org/c/openstack/octavia/+/812368 08:39
opendevreview Gregory Thiemonge proposed openstack/octavia master: Fix plugging member subnets on existing networks  https://review.opendev.org/c/openstack/octavia/+/665402 08:39
opendevreview Gregory Thiemonge proposed openstack/octavia master: Allow multiple VIPs per LB  https://review.opendev.org/c/openstack/octavia/+/660239 08:39
dulek Hi folks! In Kuryr we see an elevated rate of SCTP test failures when using an Amphora based on Ubuntu Focal. 13:56
dulek The symptoms are mostly that 0 or only a single backend responds when it should be 2. 13:57
dulek https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8bd/812588/12/check/kuryr-kubernetes-tempest-defaults/8bdf367/controller/index.html - example run. 13:58
dulek Hm, interestingly I see the LB moving from PENDING_UPDATE to PENDING_CREATE, which we don't really expect. 14:08
dulek Ah no, it's not this thing. :/ 14:15
johnsom dulek: By that message I am assuming you figured out what it was and it's not amphora related? 14:20
dulek johnsom: Nah, so looks like Kuryr correctly waits for Amphora to become ACTIVE and then allows running the SCTP connectivity tests. 14:28
dulek And then the test tries running connections to the LB assuming it'll reach all the backends. 14:28
dulek I think ~100 connections are tried. 14:29
dulek Yet it only reaches one despite the ROUND_ROBIN algorithm. 14:29
dulek And we do see that only on Amphora gates. 14:29
johnsom Hmm, does it wait for operating status ONLINE or just prov status ACTIVE? Not that 100 connections shouldn't allow all to come up in time and start the RR. 14:30
dulek johnsom: I think we wait for the provisioning_status only. Is that incorrect? Could explain quite a bit - those amps on gates are super slow. 14:30
dulek And maybe even slower with Focal? 14:31
johnsom I did a quick scan of the logs, there aren't any errors there that I saw. Not even a member dropping offline (though it won't log if they never come online). 14:31
johnsom Well, adding a member should only take a few seconds. Actually it's CentOS that is having major performance issues at the moment. 14:32
dulek johnsom: Those VMs often run on software virtualization, I'm fairly sure a few seconds isn't what we see. 14:33
johnsom So, provisioning status is when Octavia is done configuring things. Operating status is the "observed" status, i.e. the pod is responding, etc. 14:33
johnsom Yeah, software qemu is rough, very slow in the IO subsystem. 14:34
dulek johnsom: Do I need to check the operating_status only on the LB or also the member? 14:35
johnsom Since you have more than one member, I would check the pool. ONLINE==all members healthy, DEGRADED==one or more members not responding as expected, ERROR==all members are down. 14:36
johnsom Let me look through your test and the logs deeper to see if I can see what is up. 14:36
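For reference, the pool operating-status rule johnsom describes above can be sketched in a few lines of Python. This is only an illustration of the mapping as stated in the chat, not Octavia's implementation; the function name `pool_operating_status` and the boolean-per-member input are assumptions made for the example, and the empty-pool case is not covered by the discussion, so the sketch rejects it.

```python
from typing import Iterable


def pool_operating_status(members_healthy: Iterable[bool]) -> str:
    """Map per-member health to the pool status described in the chat.

    Hypothetical helper: one boolean per member, True meaning the member
    responds as expected.
    """
    states = list(members_healthy)
    if not states:
        # Assumption: the chat does not say what an empty pool reports.
        raise ValueError("pool has no members")
    if all(states):
        return "ONLINE"    # all members healthy
    if any(states):
        return "DEGRADED"  # one or more members not responding as expected
    return "ERROR"         # all members are down
```

With two members, a test would want to see ONLINE on the pool before assuming that round robin can reach both backends.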
johnsom dulek: So, it looks like the traffic test is starting before the second member is marked as provisioning status ACTIVE. 15:23
johnsom I see traffic through the LB at 10:57:00, but the second member isn't finished creating until 10:57:33. 15:24
johnsom It looks like only about 19 of the connections could have been in the RR and hit member 2. 15:25
johnsom Actually, no, none would have, looking at the timing. 15:25
johnsom It looks like the LB would have been fully provisioned at Oct 27 10:57:33.617640 15:27
johnsom So, yeah, something isn't waiting for the LB create/configuration to finish before starting the test. That is why the traffic is only hitting one member. 15:29
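The fix discussed above amounts to waiting for both statuses before sending test traffic: provisioning_status ACTIVE (Octavia is done configuring) and operating_status ONLINE (members are observed healthy). A minimal polling sketch follows, under stated assumptions: `fetch` is a hypothetical callable supplied by the caller that returns the current pool (or LB) as a mapping with `provisioning_status` and `operating_status` keys; the function name, timeout, and interval are illustrative and not taken from Kuryr or Octavia code.

```python
import time
from typing import Callable, Mapping


def wait_until_ready(
    fetch: Callable[[], Mapping[str, str]],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, str]:
    """Poll until the resource is both ACTIVE and ONLINE.

    `fetch` is a hypothetical caller-provided function returning the
    resource's current state; it stands in for whatever API call a test
    harness uses.
    """
    deadline = time.monotonic() + timeout
    while True:
        obj = fetch()
        prov = obj.get("provisioning_status")
        oper = obj.get("operating_status")
        if prov == "ERROR":
            raise RuntimeError("resource entered provisioning_status ERROR")
        # ACTIVE alone only means configuration finished; ONLINE means the
        # members were actually observed responding.
        if prov == "ACTIVE" and oper == "ONLINE":
            return obj
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"still {prov}/{oper} after {timeout} seconds"
            )
        sleep(interval)
```

Called on the pool after the last member is added, this would hold the connectivity test until every member is in rotation, at the cost of a longer wait on slow gate VMs.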
gthiemonge #startmeeting Octavia 16:00
opendevmeet Meeting started Wed Oct 27 16:00:41 2021 UTC and is due to finish in 60 minutes.  The chair is gthiemonge. Information about MeetBot at http://wiki.debian.org/MeetBot. 16:00
opendevmeet Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 16:00
opendevmeet The meeting name has been set to 'octavia' 16:00
gthiemonge Hi 16:00
johnsom o/ 16:00
gthiemonge #topic Announcements 16:02
gthiemonge Yoga PTG 16:02
gthiemonge The Yoga PTG for Octavia was last week 16:02
gthiemonge We had some good discussions there 16:02
gthiemonge I would like to highlight some features/fixes we want to get in the Y release 16:03
gthiemonge - amphorav2+persistence by default 16:03
gthiemonge - Fix plugging member subnets on existing networks 16:03
gthiemonge - Allow multiple VIPs per LB 16:03
gthiemonge and also improvements for AZ support would be great 16:03
gthiemonge anything else? any other announcements? 16:05
gthiemonge #topic Brief progress reports / bugs needing review 16:07
gthiemonge FYI I created one (last?) backport for stable/ussuri 16:07
gthiemonge #link https://review.opendev.org/c/openstack/octavia/+/815587 16:07
gthiemonge we need to merge it before Nov 12th if we want to include it in the final release for Ussuri 16:08
johnsom I think I reviewed the master patch for that yesterday. I will take a pass on the backports 16:08
gthiemonge I updated the two patches for the multi-subnet issue on member ports and for the multi-vip support (mentioned in the announcements) 16:08
gthiemonge johnsom: thanks 16:08
gthiemonge #link https://review.opendev.org/c/openstack/octavia/+/665402 16:08
gthiemonge #link https://review.opendev.org/c/openstack/octavia/+/660239 16:08
gthiemonge (last one is still WIP) 16:09
gthiemonge I also tried to update the octavia-grenade-ffu job 16:10
gthiemonge we want to update octavia from n-3 to n, without updating the other services (they probably don't support that jump) 16:10
gthiemonge I'm still failing to configure grenade to update only octavia 16:10
njohnston gthiemonge: I wonder if tosky can help, he has experience with grenade 16:11
gthiemonge njohnston: ok I can ask him 16:12
gthiemonge njohnston: thanks ;-) 16:13
gthiemonge #topic Open Discussion 16:15
gthiemonge I don't have any other topics 16:15
gthiemonge Ok Folks, thanks everyone! 16:18
gthiemonge #endmeeting 16:18
opendevmeet Meeting ended Wed Oct 27 16:18:28 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) 16:18
opendevmeet Minutes:        https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-10-27-16.00.html 16:18
opendevmeet Minutes (text): https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-10-27-16.00.txt 16:18
opendevmeet Log:            https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-10-27-16.00.log.html 16:18