opendevreview | Merged openstack/swift master: ec: Use replication network to get frags for reconstruction https://review.opendev.org/c/openstack/swift/+/812614 | 04:12 |
---|---|---|
mattoliver | reid_g: ^^ looks like that just merged, thanks for pointing it out! | 04:54 |
*** reid_g1 is now known as reid_g | 12:34 | |
reid_g | Nice. Won't be able to use it for a while though. | 12:34 |
reid_g | Anybody ever have problems restarting swift services? Exception: "swift-object[38407]: Could not bind to 0.0.0.0:6018 after trying for 30 seconds" | 17:13 |
reid_g | Nothing showed up when I checked `ss -Hntl '( sport >= :6000 and sport <= :6034 )'` | 17:13 |
reid_g | Running per port with 35 disks | 17:15 |
reid_g | We have to reboot the node when this happens | 17:16 |
timburke_ | reid_g, i'm surprised -- eventlet started setting SO_REUSEPORT a while ago, and anything other than EADDRINUSE should be getting raised directly rather than retried... | 17:59 |
timburke_ | what version of eventlet are you using? what kernel? | 17:59 |
timburke_ | (SO_REUSEPORT added in https://github.com/eventlet/eventlet/commit/f9a3074a3b75f17f76cc04a693dc48a367b99861, so 0.20.0) | 18:00 |
reid_g | python3-eventlet: | 18:00 |
reid_g | Installed: 0.25.1-2ubuntu1~cloud0 | 18:00 |
reid_g | Kernel is 5.4.0-80-generic | 18:01 |
timburke_ | very odd. you might try taking out the if block and always raise around https://github.com/openstack/swift/blob/master/swift/common/wsgi.py#L201-L204 the next time it comes up -- but it seems like it *must* be getting back EADDRINUSE, and with no socket stats in that range... i'm not sure where to look next | 18:04 |
timburke_ | i'm guessing other utils like lsof or netstat give the same info? nobody's listening? | 18:05 |
reid_g | correct. We don't see anything listening/open | 18:06 |
reid_g | The line in the trace before the error is https://github.com/openstack/swift/blob/stable/ussuri/swift/common/wsgi.py#L200 | 18:07 |
timburke_ | right; the 30s timeout is from us trying repeatedly to bind in the loop at https://github.com/openstack/swift/blob/stable/ussuri/swift/common/wsgi.py#L184-L195 and finally giving up | 18:09 |
timburke_ | maybe spot-check that the python socket module really *does* have SO_REUSEPORT defined? i'm kinda grasping at straws here | 18:16 |
reid_g | I see the ports showing up in /proc/net/tcp but wasn't able to find out what was starting them. | 19:13 |
timburke_ | aw, reid_g left. i think he might be able to take the inode column from /proc/net/tcp and find an owner (or more than one!) with something like `sudo find -L /proc/[0-9]*/fd -inum <inode>` | 23:05 |
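
The last two suggestions in the log (spot-checking `SO_REUSEPORT`, and mapping the inode column of `/proc/net/tcp` back to an owning process) can be sketched in Python roughly as follows. This is a minimal sketch, not anything from Swift itself: the helper names `parse_proc_net_tcp` and `find_owners` are made up for illustration, the port range 6000-6034 is taken from the `ss` filter quoted above, and it assumes a Linux host with the usual `/proc/net/tcp` column layout. Run it as root, otherwise file descriptors of other users' processes are not readable.

```python
import glob
import os
import socket

# Spot-check suggested in the log: does this Python build expose SO_REUSEPORT?
HAS_REUSEPORT = hasattr(socket, "SO_REUSEPORT")


def parse_proc_net_tcp(text, lo=6000, hi=6034):
    """Return (port, state, inode) for every row whose local port is in [lo, hi].

    `text` is the content of /proc/net/tcp. The state is the raw hex code
    from the `st` column ("0A" is LISTEN).
    """
    rows = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue  # blank or truncated row
        # local_address looks like "00000000:1782" (hex address, hex port)
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if lo <= port <= hi:
            rows.append((port, fields[3], int(fields[9])))
    return rows


def find_owners(inode):
    """Return sorted PIDs that hold an fd pointing at socket:[inode].

    Same idea as the `find -L /proc/[0-9]*/fd -inum <inode>` hint, done by
    reading the fd symlinks instead. An empty result for a non-zero inode
    may just mean insufficient privileges.
    """
    target = "socket:[%d]" % inode
    pids = set()
    for fd_path in glob.glob("/proc/[0-9]*/fd/*"):
        try:
            if os.readlink(fd_path) == target:
                pids.add(int(fd_path.split("/")[2]))
        except OSError:
            continue  # process exited or permission denied
    return sorted(pids)
```

To use it, pass the contents of `/proc/net/tcp` to `parse_proc_net_tcp`, then call `find_owners(inode)` for each returned row with a non-zero inode. More than one PID can come back for a single inode, since forked workers may share the same socket.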