Wednesday, 2021-11-03

05:58 *** redrobot2 is now known as redrobot
14:38 <maysams> hello folks, quick question about octavia lbs with ovn provider
14:38 <maysams> I just noticed a LB ACTIVE, but with pools with ERROR state and lb listeners with state PENDING_UPDATE. Shouldn't the LB be with some PENDING_* state or ERROR?
14:41 <gthiemonge> maysams: if a listener is PENDING_UPDATE, the load balancer should also be PENDING_UPDATE (both are set at the same time by the Octavia API)
14:43 <maysams> gthiemonge: oh, interesting. So I'm observing a different behavior, I will try to gather more info around that LB
14:43 <maysams> gthiemonge++
14:45 <gthiemonge> maysams: not an ovn-provider expert, but this doesn't look good: https://opendev.org/openstack/ovn-octavia-provider/src/branch/master/ovn_octavia_provider/helper.py#L1380-L1388
14:45 <gthiemonge> maysams: on error, in pool_delete, the provisioning_status of the pool is set to ERROR and the LB is set to ACTIVE, but the listener is unchanged
14:46 <gthiemonge> https://opendev.org/openstack/octavia/src/branch/master/octavia/api/v2/controllers/pool.py#L512-L514
14:47 <gthiemonge> ^ this call marks the LB and the listeners in PENDING_UPDATE state
14:48 <maysams> right, the scenario I see with kuryr is the following: kuryr is trying to delete a lb member, but it can't since the lb pool is with error and listener pending_update
14:48 <maysams> it fails with conflict
14:49 <gthiemonge> there was probably a previous error that triggered this invalid state
14:55 <maysams> seems the logs were rotated :/
16:00 <gthiemonge> #startmeeting Octavia
16:00 <opendevmeet> Meeting started Wed Nov  3 16:00:29 2021 UTC and is due to finish in 60 minutes.  The chair is gthiemonge. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00 <opendevmeet> The meeting name has been set to 'octavia'
16:00 <johnsom> o/
16:00 <gthiemonge> hi!
16:02 <gthiemonge> #topic Announcements
16:02 <gthiemonge> well I don't have any announcements
16:02 <gthiemonge> anyone?
16:03 <johnsom> I don't
16:03 <gthiemonge> #topic Brief progress reports / bugs needing review
16:03 <gthiemonge> I proposed a fix for a problem with some revert functions:
16:04 <gthiemonge> #link https://review.opendev.org/c/openstack/octavia/+/815973
16:04 <gthiemonge> in some tasks, in the revert function, we set the load balancer to ERROR and I believe that it is not good, because the provisioning_status of the LB acts as a lock on the resource
16:05 <gthiemonge> and in those revert functions, we unlock the LB too early, which may cause race conditions
16:05 <johnsom> Yep, unlocking too early
16:05 <gthiemonge> only the revert function of the first task of a flow (such as LoadBalancerToErrorOnRevertTask) should set a LB to ERROR in my opinion
16:05 <gthiemonge> I opened a story about the issue I got:
16:05 <gthiemonge> #link https://storyboard.openstack.org/#!/story/2009652
16:05 <johnsom> Yeah, it should roll up to the capstone task
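The locking argument above can be sketched in a few lines of plain Python. This is a simplified model, not Octavia or TaskFlow code: `Conflict`, `lock_for_update` and `run_flow` are hypothetical names, and the rule that only an ACTIVE load balancer accepts an update is an assumption made for the example. It relies on reverts running in reverse order, so the first task's revert runs last and is the only safe place to release the lock.

```python
class Conflict(Exception):
    """Stands in for the conflict error returned while a resource is locked."""


def lock_for_update(lb: dict) -> None:
    # Assumption for this sketch: only an ACTIVE load balancer accepts an update.
    if lb["provisioning_status"] != "ACTIVE":
        raise Conflict(f"load balancer is {lb['provisioning_status']}")
    lb["provisioning_status"] = "PENDING_UPDATE"


def run_flow(lb: dict, tasks) -> None:
    """Run (execute, revert) pairs in order; on failure revert in reverse order."""
    lock_for_update(lb)
    started = []
    try:
        for execute, revert in tasks:
            started.append(revert)
            execute()
    except Exception:
        for revert in reversed(started):
            revert()
        # The lock is released only here, after every revert has finished.
        lb["provisioning_status"] = "ERROR"
        return
    lb["provisioning_status"] = "ACTIVE"


def failing_task():
    raise RuntimeError("simulated task failure")


lb = {"provisioning_status": "ACTIVE"}
seen_during_revert = []


def record_status():
    # Each revert records the status it observes while it runs.
    seen_during_revert.append(lb["provisioning_status"])


run_flow(lb, [(lambda: None, record_status), (failing_task, record_status)])
```

Every revert in this model still sees PENDING_UPDATE, so a concurrent request would be rejected until the whole flow has unwound. A revert in the middle of the flow that set ERROR itself would open that window early.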
16:06 <gthiemonge> please note that an additional patch will be required for release <=stable/wallaby
16:07 <gthiemonge> because some tasks were removed from Xena (spare pool)
16:08 <gthiemonge> and FYI I also started using centos 9 stream amphora images. It works in my local env, but not in the CI
16:08 <johnsom> Nice
16:09 <johnsom> On the Octavia front I have only been doing bug fixes
16:11 <johnsom> sigh, reviews that is
16:11 <gthiemonge> #topic 200 vs 202 return codes in the API
16:12 <johnsom> Yeah, this is my topic item
16:12 <johnsom> https://review.opendev.org/c/openstack/octavia/+/816393
16:12 <johnsom> This patch raised an interesting issue. No idea how it went this long without being caught.
16:12 <johnsom> Though I'm pretty sure we talked about this long ago. Maybe rm_work remembers the discussion
16:13 <johnsom> So, the API reference says that all of the PUT methods return 202 status codes.
16:13 <johnsom> This makes sense given that the updates are asynchronous, i.e. need to go update the certs in the amps.
16:14 <johnsom> However, the actual API code is returning a 200 code for these calls.
16:14 <johnsom> #link https://review.opendev.org/c/openstack/octavia/+/816393
16:14 <johnsom> If you need a reference to the meanings:
16:14 <johnsom> #link https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
16:14 <johnsom> The patch proposes simply changing the API reference to show 200's
16:15 <johnsom> However, I raised the question of should the API really be returning 202 since they are async methods.
16:16 <johnsom> Thoughts?
16:17 <gthiemonge> that's a good topic :D
16:17 <johnsom> lol, yes
16:17 <gthiemonge> AFAICT I see a lot of good reasons to reply 200 to these calls, and a lot of good reasons to reply 202
16:18 <johnsom> Yeah, one could argue for 200 as I think the response includes the updated fields (though PENDING_UPDATE status).
16:19 <johnsom> You could argue 202 because they are in PENDING_UPDATE and not yet actually applied to the LBs
16:19 <gthiemonge> one concern raised in the review about fixing the code and not fixing the doc is that changing the code may break some clients/sdks
16:19 <johnsom> 202 is kind of a signal that you should poll for status updates
16:19 <johnsom> Yeah, changing status codes is.... ugly
16:20 <johnsom> I don't think it will break openstacksdk or our client
16:20 <johnsom> Both should be looking for a 2xx code and not specific sub-codes
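A minimal sketch of the client behaviour described above, assuming a client that accepts any 2xx response and decides whether to poll from the provisioning_status in the response body rather than from the exact status code. The helpers `is_success` and `needs_polling` and the sample body are hypothetical; this is not code from openstacksdk or python-octaviaclient.

```python
def is_success(status_code: int) -> bool:
    """Treat any 2xx response as success, so 200 and 202 are handled alike."""
    return 200 <= status_code < 300


def needs_polling(resource: dict) -> bool:
    """Decide whether to poll from the response body, not from the status code."""
    return resource.get("provisioning_status", "").startswith("PENDING_")


# Hypothetical example: a PUT answered with 200 whose body is still PENDING_UPDATE.
status_code = 200
body = {"id": "example-listener", "provisioning_status": "PENDING_UPDATE"}

accepted = is_success(status_code)
should_poll = needs_polling(body)
```

A client written this way behaves the same whether the API answers 200 or 202, which is why a change of status code would be unlikely to break it.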
16:21 <gthiemonge> I didn't find any clients/sdks in openstack that use the return code
16:24 <gthiemonge> I'm lazy so I would recommend fixing the doc :D
16:24 <johnsom> Yeah, this is one that is... unfortunate.
16:24 <johnsom> Maybe add it to the v3 api list
16:24 <gthiemonge> johnsom: but if you think that 202 is more appropriate, let's fix the code
16:25 <johnsom> I think the community should decide. I would like to hear what rm_work thinks too
16:25 <gthiemonge> do we have a v3 api todo list?
16:25 <johnsom> We should
16:26 <johnsom> Maybe we should create a wiki page
16:26 <johnsom> With a big warning at the top that v3 is not planned any time soon
16:26 <gthiemonge> +1
16:27 <johnsom> Ok, maybe we should table this topic and hope for more community feedback
16:28 <gthiemonge> yeah, async feedback
16:28 <gthiemonge> johnsom: thanks for raising this issue
16:28 <gthiemonge> #topic Open Discussion
16:28 <gthiemonge> Any other topics today?
16:29 <johnsom> I don't have anything
16:29 <gthiemonge> Ok!
16:29 <gthiemonge> Thanks everyone!
16:29 <gthiemonge> #endmeeting
16:29 <opendevmeet> Meeting ended Wed Nov  3 16:29:30 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
16:29 <opendevmeet> Minutes:        https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-11-03-16.00.html
16:29 <opendevmeet> Minutes (text): https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-11-03-16.00.txt
16:29 <opendevmeet> Log:            https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-11-03-16.00.log.html
16:29 <johnsom> o/
16:55 <opendevreview> Gregory Thiemonge proposed openstack/octavia master: Enable taskflow retry feature when waiting for compute  https://review.opendev.org/c/openstack/octavia/+/816535
17:26 <maysams> gthiemonge: hello again, getting back to the issue we chatted about earlier. I only found a call of "Sending lb create to octavia provider" nothing extra after that
17:27 <maysams> gthiemonge: regardless of it being with ovn provider the lb should have moved to pending_update state right?
17:27 <johnsom> That means Octavia handed the request off to the provider driver.
17:28 <maysams> right and I couldn't find some useful info on the provider logs
17:28 <johnsom> Yes, that happens. Octavia sets it to PENDING_UPDATE and hands it off to the provider driver. In the case of OVN, I think it immediately sets it ACTIVE as it doesn't do much state management. Not sure, as I'm not familiar with the OVN internals
17:28 <johnsom> Yeah, I don't know that OVN logs much.
17:29 <maysams> all right, thanks
17:29 <johnsom> You can ask for help in the neutron channel, they own the OVN provider and know more about it than most folks here.
17:29 <maysams> okay
20:02 <rm_work> Ugh yeah, I think we are now stuck with the current contract (returning 200)
20:02 <rm_work> Kinda sucks but there’s always v3 :D
