Tuesday, 2023-03-14

gthiemongeskraynev: please open a new story for this issue, if you can provide the Octavia worker and api logs, that would be really helpful07:58
skraynevgthiemonge: hi. I am working on it right now ;) I will notify you when I create it and share a link07:59
gthiemongeskraynev: I see that you're still using the amphorav1 driver in yoga, we would recommend switching to amphorav2 (or amphora, which is an alias), v1 is still supported (but will be removed in Bobcat) and it receives minimal maintenance08:00
gthiemongehmm when I see "for Task 'octavia-member-to-error-on-revert-flow-created'  transitioned into state 'REVERTING' from state 'SUCCESS'"08:04
gthiemongeit makes me think that there was an earlier error08:04
skraynevgthiemonge: https://storyboard.openstack.org/#!/story/2010646 09:36
skraynevthe main points here: yoga, amphora v1 and an unclear retry of batch_member_update after a successful revert of the failed task...09:37
skraynevgthiemonge: one more interesting thing: from time to time (not in all cases) this issue makes the LB get stuck in the PENDING_UPDATE state.09:41
gthiemongeskraynev: thanks, interesting... I'm looking at it09:42
gthiemongethe code that creates the flow of tasks for deleting members is different in v209:46
gthiemongein v1, I don't understand why we have those 2 calls: https://opendev.org/openstack/octavia/src/branch/master/octavia/controller/worker/v1/flows/member_flows.py#L158-L166 09:47
gthiemongeDeleteModelObject + DeleteMemberInDB09:47
gthiemongewe don't have it in v2: https://opendev.org/openstack/octavia/src/branch/master/octavia/controller/worker/v2/flows/member_flows.py#L154-L158 09:47
gthiemongeskraynev: I have one question: do you know when the API received those 2 calls?09:57
skraynevyes. I wrongly hid the request while formatting the message09:58
skraynevplease refresh story:09:58
skraynevit happened at 2023-03-13 04:12:40 on server1 and at 2023-03-13 04:12:39 on server2 09:58
skraynevI did not expect that ``` requires moving the text to a new line. I left the log on the same line and it was not displayed in the story. sorry for that09:59
gthiemongeah ok10:04
gthiemongeskraynev: I may have reproduced the issue locally10:10
gthiemongeskraynev: when updating the members the Octavia API should lock the access to the LB, but in this API call, that doesn't work properly10:10
skraynevgthiemonge: wow! it's great news that it's not my local issue :) I'm really happy that you got the same result10:11
skraynevI am not happy that the bug exists, but a repro is 50% of the solution.10:12
skraynevregarding getting stuck in PENDING_UPDATE: do I understand correctly that it could happen in some corner cases, like when the worker failed with a traceback after doing some of the work? 10:12
gthiemongeskraynev: in most of the cases, the worker should recover from those issues and mark the LB in ERROR10:35
skraynevgthiemonge: hm. is that also true for amphora v1, without jobboard?10:35
gthiemongeskraynev: but here, the workers processed 2 actions on the same LB at the same time, which triggered a bug in the error handling (in octavia-member-to-error-on-revert-flow-created)10:36
gthiemongeskraynev: well if the worker is killed (without jobboard), yeah you may have some resources in PENDING states10:36
skraynevgthiemonge: got it! thx10:37
gthiemongeskraynev: I updated the story with my findings, but ATM I have no idea13:02
gthiemongetweining: johnsom: if you want to take a look: https://storyboard.openstack.org/#!/story/2010646 13:02
gthiemongethe code looks fine BTW13:03
tweininggthiemonge: did you reproduce it with two api instances in devstack?13:03
gthiemongeno, only one instance, but the requests are processed concurrently13:05
tweiningI will try to reproduce it as well later13:07
gthiemongeI see the COMMIT for the first request, so the LB should be in PENDING_UPDATE in the DB, but the 2nd request queries the DB and the LB seems to be ACTIVE13:09
skraynevgthiemonge: thank you. it looks interesting. Do I understand correctly that the code should set PENDING_UPDATE for members and that this will block the second API call? 13:12
skraynevin the PUT method for batch update I see only _test_lb_and_listener_and_pool_statuses - which blocks the pool and listener only13:13
skraynevI actually thought, that it should be enough... 13:14
opendevreviewOmer Schwartz proposed openstack/octavia master: Fix pool creation with single LB create call  https://review.opendev.org/c/openstack/octavia/+/864204 13:14
skraynevgthiemonge: could the issue be related to the long code path under the session context manager? I mean that we call _test_lb_and_listener_and_pool_statuses at the start of the context manager, then do a lot of work for members (we could have 100 members to update, for example), and only after all these actions do we commit the session.13:23
gthiemongeskraynev: the call at 04:12:39,780.780 is fine, it was processed by the worker on server2 (between 04:12:39,955.955 and 04:12:41,085.085), that means that the call at 04:12:40,197.197 should have been denied, the LB should have been locked (with the PENDING_UPDATE provisioning_status)13:23
gthiemongeI don't think so, I'm reproducing it with small batch requests13:24
skraynevgthiemonge: yeap. I mean, that call should be denied based on LB and Pool statuses - so statuses for members do not matter for validation of immutability, right? 13:24
skraynevgthiemonge: hm.. you're right. if it happens with a small set, my theory does not hold.13:25
gthiemongeskraynev: no, I think the statuses of the members are not evaluated13:25
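The locking behaviour described above (the first request moves the LB to PENDING_UPDATE, so a concurrent second request should be denied) can be sketched roughly as follows. This is only an illustration with the standard library: `FakeLoadBalancerTable`, `ImmutableObject`, `batch_member_update` and `run_concurrent_requests` are hypothetical names, the in-memory dict stands in for the database, and the `threading.Lock` is an assumed stand-in for a row lock. It is not the Octavia API code.

```python
import threading

ACTIVE = "ACTIVE"
PENDING_UPDATE = "PENDING_UPDATE"


class ImmutableObject(Exception):
    """Hypothetical error: the LB is already locked by another request."""


class FakeLoadBalancerTable:
    """Hypothetical in-memory stand-in for the load balancer table."""

    def __init__(self):
        self._status = {}
        # Assumption: this lock plays the role of a database row lock.
        self._row_lock = threading.Lock()

    def add(self, lb_id):
        self._status[lb_id] = ACTIVE

    def test_and_set_pending_update(self, lb_id):
        # The status check and the status change happen under one lock,
        # so two concurrent requests cannot both observe ACTIVE.
        with self._row_lock:
            if self._status[lb_id] != ACTIVE:
                raise ImmutableObject(lb_id)
            self._status[lb_id] = PENDING_UPDATE


def batch_member_update(table, lb_id):
    """Return True if the request is accepted, False if it is denied."""
    try:
        table.test_and_set_pending_update(lb_id)
    except ImmutableObject:
        return False
    return True


def run_concurrent_requests(n=2):
    """Send n concurrent requests for one LB and return the sorted outcomes."""
    table = FakeLoadBalancerTable()
    table.add("lb1")
    results = []

    def request():
        results.append(batch_member_update(table, "lb1"))

    threads = [threading.Thread(target=request) for _ in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(results)
```

With this sketch, exactly one of the concurrent requests is accepted and the others are denied, which is the outcome the second API call at 04:12:40 was expected to get.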
skraynevgthiemonge: I read your comment. is the issue with sqlalchemy? or maybe with the session?15:54
gthiemongeskraynev: don't know yet16:12
opendevreviewGregory Thiemonge proposed openstack/octavia master: Fix ORM caching for with_for_update calls  https://review.opendev.org/c/openstack/octavia/+/877414 17:14
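The review title points at ORM caching, which would match the symptom reported earlier (the first request's COMMIT is visible, yet the second request still sees the LB as ACTIVE). A toy sketch of how an identity-map style cache can hand back a stale status is shown below. `ToySession` and `Row` are hypothetical classes written for illustration, not SQLAlchemy code and not the content of the patch; the `populate_existing` flag is only named after SQLAlchemy's `Query.populate_existing()`, and whether the patch uses that call is an assumption.

```python
class Row:
    """Hypothetical ORM object for one load balancer row."""

    def __init__(self, lb_id, provisioning_status):
        self.id = lb_id
        self.provisioning_status = provisioning_status


class ToySession:
    """Hypothetical session with an identity map keyed by primary key."""

    def __init__(self, db):
        self._db = db  # shared dict acting as the database: id -> status
        self._identity_map = {}

    def get(self, lb_id, populate_existing=False):
        # The query always reaches the database ...
        fresh_status = self._db[lb_id]
        cached = self._identity_map.get(lb_id)
        if cached is None:
            cached = Row(lb_id, fresh_status)
            self._identity_map[lb_id] = cached
        elif populate_existing:
            # ... but an already loaded object is only overwritten
            # when the caller explicitly asks for it.
            cached.provisioning_status = fresh_status
        return cached


def demo():
    db = {"lb1": "ACTIVE"}
    session = ToySession(db)
    session.get("lb1")            # object is loaded and cached as ACTIVE
    db["lb1"] = "PENDING_UPDATE"  # another request commits its change
    stale = session.get("lb1").provisioning_status
    fresh = session.get("lb1", populate_existing=True).provisioning_status
    return stale, fresh
```

In this toy model the plain lookup keeps reporting the cached status even though the stored value changed, and only the refreshing lookup sees PENDING_UPDATE.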
gthiemongethanks johnsom ;-)17:15
opendevreviewMichael Johnson proposed openstack/octavia-tempest-plugin master: Update Octavia tempest tests for no scoped tokens  https://review.opendev.org/c/openstack/octavia-tempest-plugin/+/876904 23:03
opendevreviewMichael Johnson proposed openstack/octavia master: Fix devstack policy overrides  https://review.opendev.org/c/openstack/octavia/+/877433 23:06
opendevreviewMichael Johnson proposed openstack/octavia-tempest-plugin master: Update Octavia tempest tests for no scoped tokens  https://review.opendev.org/c/openstack/octavia-tempest-plugin/+/876904 23:07
opendevreviewMichael Johnson proposed openstack/octavia master: Fix devstack policy overrides  https://review.opendev.org/c/openstack/octavia/+/877433 23:21
opendevreviewMichael Johnson proposed openstack/octavia-tempest-plugin master: Update Octavia tempest tests for no scoped tokens  https://review.opendev.org/c/openstack/octavia-tempest-plugin/+/876904 23:22
