Nick | Message | Time |
---|---|---|
| | *** tamas_erdei is now known as terdei | 06:57 |
opendevreview | Gregory Thiemonge proposed openstack/octavia master: Preserve haproxy server states during reloads https://review.opendev.org/c/openstack/octavia/+/805955 | 07:32 |
opendevreview | Gregory Thiemonge proposed openstack/octavia-tempest-plugin master: Add new scenario test to create LB in specific AZ https://review.opendev.org/c/openstack/octavia-tempest-plugin/+/695349 | 14:21 |
mnaser | johnsom: the current issue that i'm dealing with is that it seems my octavia workers are getting stuck for some reason. | 14:29 |
mnaser | like, absolutely no movement even though i have a bunch of PENDING_DELETE | 14:30 |
johnsom | mnaser Did they pop off the rabbit queue or did rabbit lose them? | 14:33 |
mnaser | rabbit is fine, it's running all other openstack services with no issues, let me see if they did pop off though | 14:33 |
johnsom | I assume you are running durable queues | 14:34 |
mnaser | yes | 14:34 |
mnaser | 36 PENDING_DELETE lbs | 14:34 |
mnaser | so they are in PENDING_DELETE state | 14:35 |
mnaser | so i cant re-run a delete? | 14:35 |
johnsom | There are pretty much just two possibilities for that. 1. Rabbit lost the message. 2. The controller that was working on the delete got kill -9'd. | 14:35 |
mnaser | none of these were up :( | 14:35 |
mnaser | is there any way to get out of this state now without going into the db? | 14:35 |
johnsom | PENDING is a lock state meaning one of the controllers has "ownership" of the object. If none of the controllers are actively working (retrying), then I would set it to ERROR and retry the delete call. | 14:36 |
mnaser | but there's no nova equiv of reset-state ? | 14:36 |
johnsom | Nope. One of the controllers owns that object, so it is invalid to "reset" the state without making sure it's not actively being worked on. | 14:38 |
johnsom | We have made a lot of progress towards eliminating the #2 possibility though. It's just not quite ready yet. | 14:38 |
opendevreview | Merged openstack/octavia master: Fix pylint checks https://review.opendev.org/c/openstack/octavia/+/805861 | 14:48 |
rm_work | Yeah, there have been some requests to allow an admin call to reset the state to ERROR. Most of us have been united in the opinion that we should not do that. I've wavered a bit, and would probably cave if more pressure was applied, but I don't know if johnsom would. It really should be solved by the jobboard work. | 15:58 |
rm_work | The problem is how long that is taking. I go into the DB to do a state reset about once a month. | 15:58 |
gthiemonge | #startmeeting Octavia | 16:01 |
opendevmeet | Meeting started Wed Aug 25 16:01:08 2021 UTC and is due to finish in 60 minutes. The chair is gthiemonge. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:01 |
opendevmeet | The meeting name has been set to 'octavia' | 16:01 |
gthiemonge | Hi everyone | 16:01 |
rm_work | o/ | 16:01 |
johnsom | o/ | 16:01 |
gthiemonge | #topic Announcements | 16:02 |
gthiemonge | Xena-3 milestone | 16:03 |
gthiemonge | next week is Xena-3 milestone (Feature Freeze, Final release for client libraries) | 16:03 |
gthiemonge | the priority review etherpad is up to date: | 16:03 |
gthiemonge | #link https://etherpad.opendev.org/p/octavia-priority-reviews | 16:03 |
gthiemonge | we have to focus on [feature] reviews | 16:03 |
gthiemonge | for instance "Generic Network Interface Management" (wink wink) | 16:04 |
gthiemonge | #link https://review.opendev.org/c/openstack/octavia/+/761195 | 16:04 |
gthiemonge | it is important for CentOS/RHEL users, this commit is required for CentOS/RHEL 9 amphora images. | 16:04 |
gthiemonge | johnsom: rm_work: ^ | 16:04 |
rm_work | NINE? | 16:05 |
gthiemonge | CentOS Stream as well | 16:05 |
gthiemonge | yes | 16:05 |
gthiemonge | 9 | 16:05 |
johnsom | nueve | 16:06 |
rm_work | why does time keep moving | 16:07 |
gthiemonge | I think CentOS Stream 9 will be released... in 2021 | 16:07 |
rm_work | anyway ok, will take a look | 16:07 |
johnsom | neuf | 16:07 |
johnsom | It is an issue on the debian side too | 16:07 |
gthiemonge | thanks | 16:09 |
gthiemonge | next | 16:09 |
gthiemonge | amphorav2! | 16:09 |
gthiemonge | amphorav2 is now the default driver! | 16:09 |
gthiemonge | (amphora is now an alias for amphorav2) | 16:09 |
gthiemonge | (amphorav1 still exists) | 16:09 |
gthiemonge | Thanks to all the people involved in this work! | 16:10 |
rm_work | woot | 16:10 |
gthiemonge | note that persistence is not enabled by default | 16:10 |
gthiemonge | Any other announcements? | 16:12 |
gthiemonge | #topic Brief progress reports / bugs needing review | 16:14 |
johnsom | I have mostly been working on reviews | 16:15 |
gthiemonge | I've been working on an issue with the status of members. It is not kept after reloading the haproxy configuration | 16:15 |
gthiemonge | johnsom: +1 | 16:15 |
gthiemonge | a member with an ERROR operating_status can be switched to ONLINE after reconfiguring a LB (ex: adding another member) | 16:15 |
gthiemonge | I have a WIP patch that uses the server-state feature in haproxy: | 16:16 |
rm_work | mmm yeah because it resets haproxy and it remakes the member table, so healthchecks need to rerun | 16:16 |
rm_work | that seems expected | 16:16 |
gthiemonge | #link https://review.opendev.org/c/openstack/octavia/+/805955 | 16:16 |
rm_work | oh so there is a way to solve that? neat | 16:16 |
gthiemonge | If you have some concerns about using server-state, please comment in the review | 16:16 |
gthiemonge | Yeah but I read that using this file can be time-consuming with older haproxy releases, someone with 100k backends complained that it took 1h to launch haproxy :D | 16:17 |
rm_work | jeeze | 16:17 |
gthiemonge | this feature has been optimized since 2.1 | 16:18 |
gthiemonge | so if there's a way to improve the user experience with hm status, that's great, but if it introduces some issues, ... | 16:19 |
gthiemonge | #topic Open Discussion | 16:22 |
gthiemonge | Any other topics today? | 16:22 |
gthiemonge | ok | 16:24 |
gthiemonge | Thanks everyone! | 16:25 |
gthiemonge | #endmeeting | 16:25 |
opendevmeet | Meeting ended Wed Aug 25 16:25:13 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:25 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-08-25-16.01.html | 16:25 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-08-25-16.01.txt | 16:25 |
opendevmeet | Log: https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-08-25-16.01.log.html | 16:25 |
johnsom | Thanks Greg | 16:25 |
rm_work | o/ | 16:27 |
mnaser | rm_work, johnsom: i mean we have that in nova and, nova's been around for a while | 17:11 |
johnsom | bugs and all | 17:11 |
mnaser | yeah like reset state will always be a helpful tool to avoid going into the db | 17:12 |
mnaser | in an ideal world yes it should never exist | 17:12 |
johnsom | The challenge is you have to check every controller first. | 17:13 |
mnaser | johnsom: same challenge as nova :) | 17:15 |
mnaser | well, setting them to ERROR and deleting them again fixed it | 17:27 |
mnaser | but the fact i have to login to db to do that is really not ideal :\ | 17:27 |
rm_work | yes :/ | 18:09 |
rm_work | see above | 18:09 |
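
The workaround discussed above (set the stuck objects to ERROR, then retry the delete) might look roughly like the following sketch. It runs against an in-memory SQLite database with a simplified, hypothetical `load_balancer` table, not the real Octavia schema, and the function name and the age threshold are assumptions made for illustration. As johnsom points out, age alone does not prove that no controller still owns the object, so every controller should be checked before doing anything like this on a real deployment.

```python
import sqlite3

# Hypothetical, simplified table for illustration only; the real Octavia
# schema has more columns and constraints.
SCHEMA = """
CREATE TABLE load_balancer (
    id TEXT PRIMARY KEY,
    provisioning_status TEXT NOT NULL,
    updated_at INTEGER NOT NULL  -- epoch seconds (assumption)
)
"""


def reset_stuck_deletes(conn, now, min_age_seconds):
    """Move PENDING_DELETE rows untouched for min_age_seconds to ERROR.

    Returns the ids that were reset. The age check is only a crude
    stand-in for confirming that no controller is working on the object.
    """
    cutoff = now - min_age_seconds
    rows = conn.execute(
        "SELECT id FROM load_balancer "
        "WHERE provisioning_status = 'PENDING_DELETE' AND updated_at <= ? "
        "ORDER BY id",
        (cutoff,),
    ).fetchall()
    ids = [row[0] for row in rows]
    # Re-check the status in the UPDATE so a row that changed in the
    # meantime is left alone.
    conn.executemany(
        "UPDATE load_balancer SET provisioning_status = 'ERROR' "
        "WHERE id = ? AND provisioning_status = 'PENDING_DELETE'",
        [(lb_id,) for lb_id in ids],
    )
    conn.commit()
    return ids


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    now = 1_000_000
    conn.executemany(
        "INSERT INTO load_balancer VALUES (?, ?, ?)",
        [
            ("lb-old", "PENDING_DELETE", now - 7200),  # stuck for 2 hours
            ("lb-new", "PENDING_DELETE", now - 60),    # may still be in progress
            ("lb-active", "ACTIVE", now - 7200),       # not a delete at all
        ],
    )
    reset = reset_stuck_deletes(conn, now=now, min_age_seconds=3600)
    statuses = dict(
        conn.execute("SELECT id, provisioning_status FROM load_balancer")
    )
    return reset, statuses


if __name__ == "__main__":
    print(demo())
```

Once an object is in ERROR it is no longer locked, so the delete can be retried through the API (for example with `openstack loadbalancer delete`), which is what mnaser reports doing.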