Wednesday, 2023-08-09

opendevreview OpenStack Proposal Bot proposed openstack/octavia-dashboard master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/octavia-dashboard/+/882660 04:34
opendevreview Omer Schwartz proposed openstack/octavia master: Provide Amphora stats for Octavia no-op drivers  https://review.opendev.org/c/openstack/octavia/+/890814 12:54
opendevreview Omer Schwartz proposed openstack/octavia master: Provide Amphora stats for Octavia no-op drivers  https://review.opendev.org/c/openstack/octavia/+/890814 12:59
QG Hello, since we upgraded Octavia to Zed we have a strange issue with some resources staying in PENDING_CREATE, have you ever seen this behavior? 13:08
QG In the logs we only see, for example: INFO octavia.controller.queue.v2.endpoints [-] Creating member 'ca977118-94ad-4314-9ca5-89baa017771c'... 13:10
QG and when we check the octavia-persistence database there are no tasks in progress 13:11
tweining Hi QG, I assume this happens only when you create or failover resources? 13:11
QG Yes, exactly! 13:12
tweining AFAIR there were issues with resources stuck in PENDING_*, but I am not aware of any that are related to Zed. I'll have to search. 13:15
tweining also, that can happen for a lot of reasons 13:15
tweining the first thing I would probably do is to check if the amphora instance came up without errors and then whether the management network works. 13:22
tweining maybe it is related to https://storyboard.openstack.org/#!/story/2010426 13:24
QG Checking amphora and network, thanks! 13:26
QG I can see in the logs of the amphora agent some configuration reloads from some previous configuration changes (like a couple of seconds before) 13:29
QG [2023-08-09 12:44:23 +0000] [650] [DEBUG] PUT /1.0/loadbalancer/ff5783e5-15a5-49f7-932a-cadbce463810/reload 13:29
QG and 13:29
QG worker 2023-08-09 12:44:29.110 11 INFO octavia.controller.queue.v2.endpoints [-] Creating member 'ca977118-94ad-4314-9ca5-89baa017771c'... 13:29
QG we have backported https://storyboard.openstack.org/#!/story/2010426 13:34
johnsom QG Hi, PENDING_CREATE is an odd one. To clarify, you do see one of the worker processes pick it up off the rabbit queue and start working on it? 15:09
johnsom You might check your OctaviaConnectionMaxRetries and OctaviaBuildActiveRetries settings; the upstream default is VERY long (hours) due to the slow test gate hosts. We usually set those to lower numbers that are more user friendly. Basically those set how long we keep retrying nova/neutron failures. 15:10
QG Hi johnsom, yes I see the worker pick it up and start working on it, I even see the action on the amphora side, but it's like at the end of the task, when it needs to put the LB back in ACTIVE, it doesn't do it 15:10
johnsom Do you see retry warning statements in that worker log? 15:11
QG oh checking, thanks 15:11
johnsom PENDING_* states mean one of the controllers has ownership and is provisioning or retrying failure conditions 15:11
QG no, I don't see any retry warning 15:13
QG there is no task in octavia-persistence 15:13
johnsom So you have enabled jobboard? 15:14
QG yes we have 15:14
johnsom Hmm, I guess my next step would be to go through the worker log and follow its progression until it stops. If you want to share the worker log, I am happy to take a look. 15:15
QG yes sure, I will try to compile the logs and share them 15:39
opendevreview Merged openstack/octavia-dashboard master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/octavia-dashboard/+/882660 16:30
opendevreview Michael Johnson proposed openstack/octavia master: Remove unused wait_for_port_detach code  https://review.opendev.org/c/openstack/octavia/+/890958 19:51
