mattoliverau | donnyd: how are you determining the list of stale entries? I.e. things that are in the queue but whose objects don't exist anymore in the cluster? Or objects that aren't being deleted? | 00:12 |
mattoliverau | is this using the legacy queue? I assume so because you say you have only 2 object expirers. | 00:13 |
mattoliverau | what are their `processes` and `process` settings? I assume both have something like `processes = 2` and then one has `process = 0` and the other `process = 1`? | 00:14 |
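(Editor's note: a minimal sketch of the two-node legacy expirer split being described; the file path and `interval` value are illustrative, but the `processes`/`process` semantics match the settings quoted above.)

```ini
# node 1: /etc/swift/object-expirer.conf (path assumed)
[object-expirer]
interval = 300
processes = 2   # total number of expirer processes across the cluster
process = 0     # this node works on slice 0 of the queue

# node 2: same file, only `process` differs
[object-expirer]
interval = 300
processes = 2
process = 1     # this node works on slice 1 of the queue
```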
seongsoocho | morning ~ | 00:15 |
mattoliverau | I guess the other question is, what version of Swift? If it's one with the general task queue (new method), every time you add a new expiration it'll be added to the general task queue, which I think assumes you have an object expirer running on each object server (because the work is split among everyone to scale better). | 00:16 |
mattoliverau | Which is why I ask.. what do we mean by scale? No stress though, we can work our way through it.. I should also probably go see what/if anything has changed on the expirer side, as I haven't looked at that code in a while :) | 00:17 |
mattoliverau | seongsoocho: morning | 00:17 |
*** gregwork has quit IRC | 01:05 | |
donnyd | Version is Stein | 01:09 |
donnyd | I only have two nodes | 01:09 |
tdasilva | donnyd: where/how did you get those stats? | 01:15 |
tdasilva | just curious, i haven't looked at the expirer in a while either so i'm just trying to understand what stale means in this case | 01:16 |
DHE | presumably a file that's expired but not actually deleted from disk yet... | 01:27 |
tdasilva | found it: https://gist.github.com/clayg/7f66eab2a61c77869e1e84ac4ed6f1df | 01:33 |
donnyd | Oh yes it was shared on here to check stats of the expirer | 01:47 |
donnyd | That is the one tdasilva | 01:49 |
donnyd | mattoliverau: you are correct on the processes / process settings | 01:51 |
donnyd | DHE: yes that is what I am trying to clean up before I put FN log storage back online. Also been super helpful to learn how to swift a bit better | 01:52 |
tdasilva | ok, IIUC, stale entries are objects that expired more than 24 hours ago, while pending ones just expired. Ideally stale should be close to 0, while having pending entries is fine. In your case all your objects are stale (that's because you are not getting new objects written to the cluster, correct?). I guess the question is, are you seeing that number go down? | 01:53 |
donnyd | tdasilva: that is correct, there are no new objects | 01:53 |
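(Editor's note: a purely illustrative sketch of the pending/stale distinction described above, not the script from the gist. It assumes the legacy queue's task-object naming, where the object name starts with the delete-at timestamp.)

```python
import time

STALE_AGE = 24 * 60 * 60  # "stale" = expired more than a day ago

def classify(task_obj_name, now=None):
    """Classify a legacy expirer task object name such as
    '1576000000-AUTH_test/container/obj' as not-yet-due, pending or stale."""
    now = now or time.time()
    delete_at = int(task_obj_name.split('-', 1)[0])
    if delete_at > now:
        return 'not yet due'
    return 'stale' if now - delete_at > STALE_AGE else 'pending'
```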
tdasilva | what is your reclaim_age set to? I'm guessing those stale objects are also past their reclaim age | 01:54 |
donnyd | The entries that were in pending were removed; there were about 4 million of them | 01:54 |
donnyd | It is set back to the default | 01:54 |
donnyd | Do I need to change the reclaim age to a much larger number? | 01:55 |
donnyd | 84600 iirc | 01:55 |
donnyd | I need to go back to like August or so | 01:55 |
donnyd | Lol | 01:56 |
tdasilva | default is 7 days ( 604800 ) | 01:57 |
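(Editor's note: spelled out, since 84600 / 86400 / 604800 are easy to mix up; 86400 is presumably what the figure quoted above was meant to be.)

```python
ONE_DAY = 24 * 60 * 60      # 86400 seconds
SEVEN_DAYS = 7 * ONE_DAY    # 604800 seconds -- the reclaim_age default tdasilva quotes
```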
tdasilva | do you have any lines like this in your logs: "Unexpected response while deleting object" ? | 01:58 |
donnyd | I get a bunch of 404's | 01:59 |
donnyd | "DELETE /sdf/930/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_7fa/679944/2/check/openstack-tox-functional-py36/7faca48/ara-report/result/2abbf12c-cf6b-47a7-841f-2eedfdf9353a/index.html" 404 70 "DELETE http://localhost/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_7fa/679944/2/check/openstack-tox-functional-py36/7faca48/ara-report/result/2abbf12c-cf6b-47a7-841f-2eedfdf9353a/index.html" | 02:04 |
donnyd | "txf6d90b15611e46138570a-005df6e621" "proxy-server 53952" 0.0128 "-" 2830 0 | 02:04 |
tdasilva | donnyd: are you seeing the total (and stale) numbers go down? are expired objects being deleted or not at all? | 02:19 |
DHE | Aside: that giant container of mine is basically done filling. 58 million objects.. | 03:12 |
*** psachin has joined #openstack-swift | 03:34 | |
*** openstackgerrit has quit IRC | 04:08 | |
*** diablo_rojo has quit IRC | 04:08 | |
*** Jeffrey4l has quit IRC | 04:08 | |
*** edausq has quit IRC | 04:08 | |
*** szaher has quit IRC | 04:08 | |
*** MooingLemur has quit IRC | 04:08 | |
*** openstackstatus has quit IRC | 04:11 | |
*** openstackstatus has joined #openstack-swift | 04:13 | |
*** ChanServ sets mode: +v openstackstatus | 04:13 | |
tdasilva | DHE: nice, how's container-sharding going for you? | 04:19 |
*** rcernin has quit IRC | 06:58 | |
*** pcaruana has joined #openstack-swift | 07:16 | |
*** tesseract has joined #openstack-swift | 07:48 | |
*** tkajinam has quit IRC | 08:06 | |
*** tesseract has quit IRC | 08:11 | |
*** rdejoux has joined #openstack-swift | 08:16 | |
*** rpittau|afk is now known as rpittau | 08:34 | |
*** rcernin has joined #openstack-swift | 08:54 | |
*** tesseract has joined #openstack-swift | 09:03 | |
*** tesseract has quit IRC | 09:04 | |
*** tesseract has joined #openstack-swift | 09:04 | |
*** mugsie has quit IRC | 09:19 | |
*** mugsie has joined #openstack-swift | 09:21 | |
*** rcernin has quit IRC | 09:36 | |
*** rcernin has joined #openstack-swift | 09:36 | |
*** Jeffrey4l has joined #openstack-swift | 10:05 | |
*** edausq has joined #openstack-swift | 10:06 | |
*** diablo_rojo has joined #openstack-swift | 10:07 | |
*** szaher has joined #openstack-swift | 10:07 | |
*** irclogbot_2 has quit IRC | 10:08 | |
*** irclogbot_0 has joined #openstack-swift | 10:10 | |
*** MooingLemur has joined #openstack-swift | 10:10 | |
*** rcernin has quit IRC | 10:16 | |
*** rpittau is now known as rpittau|bbl | 11:22 | |
donnyd | I restarted the expirer and it seems to be taking down the stale numbers now too | 11:36 |
*** pcaruana has quit IRC | 12:27 | |
*** pcaruana has joined #openstack-swift | 12:33 | |
donnyd | I am seeing this pop up in the logs every so often #012UnexpectedResponse: Unexpected response: 400 Bad Request | 12:39 |
*** rpittau|bbl is now known as rpittau | 13:09 | |
DHE | tdasilva: haven't started yet. was missing some systemd unit files for the sharding service, and the manage-shard-ranges app is giving errors | 13:36 |
*** psachin has quit IRC | 14:01 | |
*** zaitcev has joined #openstack-swift | 14:13 | |
*** ChanServ sets mode: +v zaitcev | 14:13 | |
*** onovy has quit IRC | 14:53 | |
*** psachin has joined #openstack-swift | 15:07 | |
*** openstackgerrit has joined #openstack-swift | 15:29 | |
openstackgerrit | Charles Hsu proposed openstack/python-swiftclient master: Support uploading Swift symlinks without content. https://review.opendev.org/694211 | 15:29 |
*** onovy has joined #openstack-swift | 15:43 | |
*** efried is now known as efried_pto | 16:00 | |
*** gregwork has joined #openstack-swift | 16:22 | |
*** gyee has joined #openstack-swift | 16:28 | |
*** rdejoux has quit IRC | 17:16 | |
*** psachin has quit IRC | 18:19 | |
openstackgerrit | Merged openstack/swift master: WSGI server workers must drop_privledges https://review.opendev.org/698563 | 18:48 |
timburke | donnyd, the 400 is curious -- could we get some more of the traceback/logs? | 18:50 |
donnyd | yea I can probably take a look a little later today and see what I can pull out of the logs | 18:51 |
timburke | is there anything *besides* the zuul logs in swift? iirc, those were set to expire after 30 days, is that right? | 18:52 |
timburke | i'm realizing that we're probably at about the point where *everything* should be expired, so it might be fastest to just format drives :-/ | 18:53 |
timburke | (much as i hate to say it) | 18:53 |
timburke | again, though, that only really works if there's *only* zuul logs in there | 18:54 |
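(Editor's note: for context, the expiry timburke mentions is just a header on upload; a hedged python-swiftclient sketch, with placeholder credentials and names, and the 30-day figure being the one recalled above.)

```python
from swiftclient.client import Connection

# placeholder auth details and object name, for illustration only
conn = Connection(authurl='http://proxy.example:8080/auth/v1.0',
                  user='test:tester', key='testing')
conn.put_object('zuul_opendev_logs_7fa', 'some/build/log.html',
                contents=b'...',
                headers={'X-Delete-After': str(30 * 24 * 60 * 60)})  # 30 days
```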
*** rpittau is now known as rpittau|afk | 18:55 | |
clayg | tdasilva: timburke: we might as well discuss p 697739 here yeah? | 19:00 |
patchbot | https://review.opendev.org/#/c/697739/ - swift - Have slo tell the object-server that it wants whol... - 3 patch sets | 19:00 |
clayg | I'm starting to think it's a solid win with little risk? why not +A? | 19:00 |
tdasilva | clayg, timburke: the EC part was what I found a bit more difficult to grok, so I was hoping someone else could have a look instead of just one +2 +A | 19:04 |
clayg | understood! | 19:04 |
timburke | yeah, EC caught me by surprise, too! wasn't expecting those failures on the first patch set ;-) | 19:05 |
*** tesseract has quit IRC | 19:06 | |
timburke | i'm not entirely pleased with the object controller popping off the Range header like that -- i should maybe either (a) do the same thing for the replicated controller, so we don't have this guessing game, or (b) find a way to get the EC iter kicked off with no ranges, then stuff them back in the WSGI environment | 19:07 |
clayg | tdasilva: wait... I'm not seeing an "AccountContext" in the object versioning patch - does that mean we don't capture account listing requests at all yet? | 19:09 |
tdasilva | clayg: correct, my assumption is that since the stats are rolled up to the same account, we didn't have to do anything special there, guess I missed something :/ | 19:12 |
clayg | gotcha - ok, no problem 👍 | 19:14 |
timburke | might be nice to have an API like `GET /v1/AUTH_test?versions` that would splice listings from the user and reserved namespaces and put more info in the container entries like total number of objects (including versions and links) and total bytes used across all versions. but we can probably put it off as future work -- should write down the idea somewhere, though | 19:28 |
clayg | timburke: in China we'd decided the most reasonable thing to do was make the bytes & objects match the container HEADs - i.e. in a versioned container bytes is sum(bytes + versioned_bytes) and count = unversioned_count | 19:47 |
clayg | if we don't expose the bytes in the versioned container from the get-go, I think we'll be forcing clients to HEAD all of their containers when a single X-Backend-Allow-Reserved-Names listing to the account db has all the information they're looking for | 19:48 |
clayg | right now I'm tracking down a 500 though... i used to be able to do null namespace container listings through my noauth pipeline 🤔 | 19:49 |
*** gyee has quit IRC | 19:59 | |
clayg | uhh... I guess I was wrong about static link to SLO - I'm working on a functional test now | 20:30 |
*** gregwork has quit IRC | 20:34 | |
*** gyee has joined #openstack-swift | 20:40 | |
tdasilva | clayg: could that 500 be due to https://review.opendev.org/#/c/682382/51/swift/common/middleware/versioned_writes/object_versioning.py@888 | 20:49 |
patchbot | patch 682382 - swift - New Object Versioning mode - 51 patch sets | 20:50 |
openstackgerrit | Clay Gerrard proposed openstack/swift master: New Object Versioning mode https://review.opendev.org/682382 | 20:59 |
*** rcernin has joined #openstack-swift | 21:21 | |
*** patchbot has quit IRC | 21:30 | |
*** patchbot has joined #openstack-swift | 21:31 | |
clayg | @tdasilva yes that was *exactly* it!? I ended up catching the ValueError there - HOW DID YOU DO THAT 🤣 | 21:37 |
clayg | I guess it's not going to make sense to review p 691877 until we get object versioning merged 🤔 | 21:38 |
patchbot | https://review.opendev.org/#/c/691877/ - python-swiftclient - object versioning features - 8 patch sets | 21:38 |
tdasilva | clayg: came across that on that container-sync patch, my solution was to check for RESERVED in the container name in the if statement...w/e | 21:38 |
*** pcaruana has quit IRC | 21:55 | |
timburke | *totally* makes sense to review it! just gotta do them together ;-) | 22:09 |
timburke | in fact, i was thinking about how it increasingly (to me) seems like we ought to make sure one follows the other pretty quickly | 22:19 |
*** rcernin has quit IRC | 22:39 | |
*** rcernin has joined #openstack-swift | 22:39 | |
zaitcev | Guys, I need a cross-check. How do I look up an object in a container? | 22:43 |
zaitcev | I'm thinking perhaps some kind of HEAD to the container server, but with the object in the path. We do this for updates, but with PUT. | 22:44 |
zaitcev | Unfortunately, our existing HEAD does not do this (it always ignores the obj), so I'm thinking about constructing some kind of zero UPDATE, which does not actually update anything | 22:45 |
timburke | GET with ?prefix=<obj> ? and maybe a &limit=1 | 22:46 |
timburke | (so you don't get cont/obj1, cont/obj2, ... when you're just looking for cont/obj) | 22:47 |
timburke | UPDATE really is just useful for updating -- don't think it returns anything (iirc) | 22:47 |
zaitcev | This is for dark data checker. | 22:48 |
timburke | still gotta manually check that the object returned is the object you expected, but if the prefix is the whole name, you've either got it or you don't | 22:48 |
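(Editor's note: a minimal sketch of the prefix/limit lookup timburke suggests, using python-swiftclient against the proxy; a watcher running inside the auditor would more likely hit the container servers directly, but the listing semantics are the same.)

```python
from swiftclient.client import Connection

def object_in_listing(conn, container, obj):
    """Return True if `obj` has an entry in `container`'s listing."""
    # ?prefix=<obj>&limit=1 -- listings are sorted, so if the object exists
    # it is the first entry; anything else means no listing entry (dark data?)
    _headers, listing = conn.get_container(container, prefix=obj, limit=1)
    return bool(listing) and listing[0].get('name') == obj
```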
zaitcev | I noticed that if I talk to the object server, it just immediately creates a broker and that's that. It will return a HEAD of a dark object. | 22:49 |
timburke | ah -- yeah, i see how an UPDATE might be useful -- kinda depends on whether you want to aggregate several objects per container before checking or do it one at a time... | 22:50 |
timburke | i wonder -- when we accept the PUT for an object entry, do we distinguish between a 201 vs a 202? it'd maybe be useful... | 22:51 |
timburke | nope, not really :-/ https://github.com/openstack/swift/blob/2.23.0/swift/container/server.py#L472 | 22:52 |
mattoliverau | morning | 22:53 |
timburke | i feel like that'd be a fairly modest extension to DatabaseBroker's API to have put_record return some indication of whether the record already exists or not... though maybe that's getting out of scope for what you're looking for, zaitcev | 22:54 |
zaitcev | timburke: I'm still mocking it up... I was hoping to just have a plugin for the API in patch 212824. | 22:56 |
patchbot | https://review.opendev.org/#/c/212824/ - swift - Let developers/operators add watchers to object audit - 18 patch sets | 22:56 |
timburke | i saw you rebased that ;-) | 23:00 |
zaitcev | I even copy-pasted the example from it into the conf and found a syntax error :-) | 23:01 |
*** tkajinam has joined #openstack-swift | 23:07 |