*** mariusv has quit IRC | 00:15 | |
*** mariusv has joined #openstack-swift | 00:15 | |
*** links has joined #openstack-swift | 00:27 | |
*** Jeffrey4l has quit IRC | 00:30 | |
*** zhugaoxiao has quit IRC | 00:30 | |
*** zhugaoxiao has joined #openstack-swift | 00:30 | |
*** Jeffrey4l has joined #openstack-swift | 00:31 | |
*** jerrygb has joined #openstack-swift | 01:26 | |
*** jerrygb_ has quit IRC | 01:27 | |
*** takashi has joined #openstack-swift | 01:30 | |
*** jerrygb_ has joined #openstack-swift | 01:32 | |
*** jerrygb has quit IRC | 01:35 | |
*** jerrygb_ has quit IRC | 01:39 | |
*** dmorita has quit IRC | 01:41 | |
*** jerrygb has joined #openstack-swift | 01:41 | |
*** jerrygb has quit IRC | 01:49 | |
*** jerrygb has joined #openstack-swift | 01:50 | |
*** chlong has joined #openstack-swift | 02:06 | |
*** jerrygb has quit IRC | 02:12 | |
*** jerrygb has joined #openstack-swift | 02:13 | |
*** jerrygb has quit IRC | 02:29 | |
*** jerrygb has joined #openstack-swift | 02:30 | |
kota_ | morning | 02:32 |
---|---|---|
*** winggundamth has quit IRC | 02:37 | |
*** jerrygb_ has joined #openstack-swift | 02:42 | |
*** winggundamth has joined #openstack-swift | 02:43 | |
*** jerrygb has quit IRC | 02:44 | |
*** jerrygb has joined #openstack-swift | 02:47 | |
*** jerrygb_ has quit IRC | 02:50 | |
*** dmorita has joined #openstack-swift | 03:42 | |
*** dmorita has quit IRC | 03:46 | |
*** vinsh has quit IRC | 04:09 | |
*** jerrygb has quit IRC | 04:24 | |
*** SkyRocknRoll has joined #openstack-swift | 04:54 | |
*** viktork has joined #openstack-swift | 05:01 | |
*** ppai has joined #openstack-swift | 05:08 | |
charz_ | kota_: good morning! | 05:13 |
openstackgerrit | ChangBo Guo(gcb) proposed openstack/swift: Enable DeprecationWarning in test environments https://review.openstack.org/381117 | 05:13 |
kota_ | charz_: hi ;-) | 05:13 |
openstackgerrit | Kota Tsuyuzaki proposed openstack/liberasurecode: Fix error handling on gf_ivnert_matrix in isa-l backend https://review.openstack.org/393595 | 05:33 |
kota_ | hmmm... i wanna write a few of patches for liberasurecode but i like to depends on patch 387879 because it should land asap to prevent regressions | 05:35 |
patchbot | https://review.openstack.org/#/c/387879/ - liberasurecode - Fix liberasurecode skipping a bunch of invalid_arg... | 05:35 |
kota_ | so maybe working to land patch 387879 first, then improving everything else, should be the fast way to go. | 05:36 |
patchbot | https://review.openstack.org/#/c/387879/ - liberasurecode - Fix liberasurecode skipping a bunch of invalid_arg... | 05:36 |
kota_ | thx tsg to add kmgreen to the reviewer. | 05:37 |
kota_ | tsg: (if you are in this channel) | 05:37 |
openstackgerrit | Kota Tsuyuzaki proposed openstack/liberasurecode: Fix error handling on gf_ivnert_matrix in isa-l backend https://review.openstack.org/393595 | 05:43 |
openstackgerrit | Kota Tsuyuzaki proposed openstack/liberasurecode: Fix error handling on gf_ivnert_matrix in isa-l backend https://review.openstack.org/393595 | 05:45 |
openstackgerrit | Kota Tsuyuzaki proposed openstack/liberasurecode: WIP: ISA-L Cauchy support https://review.openstack.org/393263 | 05:47 |
*** rcernin has joined #openstack-swift | 05:55 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/swift: Updated from global requirements https://review.openstack.org/88736 | 06:03 |
*** vinsh has joined #openstack-swift | 06:09 | |
*** vinsh has quit IRC | 06:14 | |
*** ChubYann has quit IRC | 06:27 | |
zaitcev | https://www.computerworlduk.com/cloud-computing/mark-shuttleworth-on-openstack-hpe-layoffs-prove-bs-as-service-theory-3648336/ - really, Shuttleworth? I can't wait to hear your opinion how Swift is unnecessary | 06:27 |
*** sams-gleb has joined #openstack-swift | 06:56 | |
*** sams-gleb has joined #openstack-swift | 06:57 | |
*** tesseract has joined #openstack-swift | 07:04 | |
*** tesseract is now known as Guest13194 | 07:04 | |
*** klrmn has quit IRC | 07:24 | |
ppai | zaitcev: swift being one of the cores, I don't think he meant that about swift. But I totally get the reference to sheer number of new peripheral projects that have popped up. Used to be just around 6 project code names. | 07:25 |
*** pcaruana has joined #openstack-swift | 07:33 | |
*** chlong has quit IRC | 07:33 | |
*** d0ugal has joined #openstack-swift | 08:07 | |
*** hseipp has joined #openstack-swift | 08:10 | |
*** rledisez has joined #openstack-swift | 08:18 | |
*** d0ugal has quit IRC | 08:28 | |
*** geaaru has joined #openstack-swift | 08:28 | |
*** amoralej|off is now known as amoralej | 08:29 | |
*** hseipp has quit IRC | 08:42 | |
*** ppai has quit IRC | 08:51 | |
*** cbartz has joined #openstack-swift | 08:55 | |
*** hseipp has joined #openstack-swift | 08:55 | |
*** hseipp has left #openstack-swift | 08:57 | |
*** d0ugal_ has joined #openstack-swift | 08:58 | |
*** hseipp has joined #openstack-swift | 09:18 | |
openstackgerrit | Kota Tsuyuzaki proposed openstack/pyeclib: Add greedy test for decode/reconstruct result solid https://review.openstack.org/393656 | 09:18 |
*** silor has joined #openstack-swift | 09:19 | |
*** jordanP has joined #openstack-swift | 09:20 | |
*** silor1 has joined #openstack-swift | 09:22 | |
*** silor has quit IRC | 09:24 | |
*** silor1 is now known as silor | 09:24 | |
*** kei_yama has quit IRC | 09:27 | |
kota_ | oh, notmyname already worked on the backporting for audit patch, https://review.openstack.org/#/c/389746 | 09:37 |
patchbot | patch 389746 - swift (stable/mitaka) - Make ECDiskFileReader check fragment metadata (MERGED) | 09:37 |
kota_ | good job | 09:37 |
kota_ | i just was searching the patches in open reviews. | 09:37 |
openstackgerrit | Kota Tsuyuzaki proposed openstack/pyeclib: WIP: ISA-L Cauchy support https://review.openstack.org/393276 | 09:38 |
*** acoles_ is now known as acoles | 09:43 | |
*** d0ugal_ is now known as d0ugal | 09:51 | |
*** d0ugal has joined #openstack-swift | 09:51 | |
acoles | timburke: thanks for doing the backport | 09:55 |
admin6 | acoles: kota_ Hi, thanks for the backport to mitaka for checking EC fragments metadata. I applied it, and it seems to work, but sometimes, I have errors on os.rename with no such file or directory : http://paste.openstack.org/show/587874/ | 09:59 |
*** takashi has quit IRC | 10:00 | |
openstackgerrit | Kota Tsuyuzaki proposed openstack/liberasurecode: Fix liberasurecode skipping a bunch of invalid_args tests https://review.openstack.org/387879 | 10:01 |
kota_ | admin6: looking | 10:01 |
kota_ | acoles: oh, backporting patch is stil open? | 10:02 |
kota_ | still | 10:02 |
kota_ | hi, acoles and admin6 anyway. | 10:02 |
*** dmorita has joined #openstack-swift | 10:03 | |
kota_ | dmorita!!?? | 10:04 |
kota_ | admin6: it looks like to fail to move the corrupted fragments to move quarantine dir but no info appeared which one doesn't exist... | 10:05 |
kota_ | s/fragments/fragment/ it's single. | 10:05 |
kota_ | either the fragment has been quarantined or no quarantine dir. | 10:05 |
kota_ | lemme check the code.. | 10:06 |
*** dmorita has quit IRC | 10:07 | |
admin6 | kota_ it seems to be only on device s02z2ecd03, other devices have good quarantine Invalid EC logs | 10:08 |
kota_ | could you quick check if quarantine dir exists or not in the device? | 10:09 |
*** vinsh has joined #openstack-swift | 10:11 | |
kota_ | AFAIK, the renamer can dig the directory structure if it's nested so it could happen either racing to make/delete the dir or no source file? | 10:12 |
kota_ | or failed to dig the dir? | 10:13 |
admin6 | kota_ the quarantine dir exist and has good permissions. The corrupted fragment is present in the quarantine dir : ls -al /srv/node/s02z2ecd03/quarantined/objects-1/91021a50d6c103235c1fc0a4fa0cebdd/ | 10:16 |
*** vinsh has quit IRC | 10:16 | |
admin6 | -rw------- 1 swift swift 2790878 Sep 25 14:14 1470952148.01588#3.data | 10:16 |
patchbot | https://review.openstack.org/#/c/25/ - openstack-infra/system-config - Manage apt and gems path with puppet. Add tarmac.c... (MERGED) | 10:16 |
admin6 | -rw-r--r-- 1 swift swift 0 Sep 25 14:14 1470952148.01588.durable | 10:16 |
patchbot | https://review.openstack.org/#/c/25/ - openstack-infra/system-config - Manage apt and gems path with puppet. Add tarmac.c... (MERGED) | 10:16 |
kota_ | you don't have to work, patchbot | 10:16 |
kota_ | admin6: the error is only in auditor? or object-server too? | 10:22 |
kota_ | so if the auditor detects the frag corrupted and at the same time, someone request to GET the object, both object-server and object-auditor could race to move it to quarantine dir | 10:23 |
kota_ | and the loser of the race might dump the error, in my current idea. | 10:23 |
kota_ | not sure | 10:23 |
kota_ | but curious, it happens only a device... | 10:24 |
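The race kota_ describes can be sketched outside Swift: when two actors try to move the same file into a quarantine directory, the first `os.rename` succeeds and the second fails with ENOENT even though the file is already quarantined. This is a minimal illustration using only the standard library; `quarantine` and `demo` are hypothetical names, not Swift's renamer or auditor code.

```python
import errno
import os
import tempfile


def quarantine(src, quarantine_dir):
    """Move src into quarantine_dir.

    Returns None on success, or the errno of the OSError on failure.
    Hypothetical helper for illustration only.
    """
    os.makedirs(quarantine_dir, exist_ok=True)
    try:
        os.rename(src, os.path.join(quarantine_dir, os.path.basename(src)))
    except OSError as err:
        return err.errno
    return None


def demo():
    """Quarantine the same file twice, as two racing actors would."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, 'frag.data')
        with open(src, 'wb') as f:
            f.write(b'corrupt')
        qdir = os.path.join(root, 'quarantined')
        first = quarantine(src, qdir)    # the winner moves the file
        second = quarantine(src, qdir)   # the loser finds no source file
        landed = os.path.exists(os.path.join(qdir, 'frag.data'))
        return first, second, landed


if __name__ == '__main__':
    print(demo())
```

If this is what happens in the cluster, the error would be log noise from the second attempt rather than a failure to quarantine, which matches the fragment being present in the quarantine dir.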
acoles | kota_: admin6 so the frag has been quarantined but we also get the traceback from renamer - I'm looking for path where we maybe try to quarantine the same frag twice, but can't see it yet :/ | 10:26 |
admin6 | I don’t see error in the object-server so far. however it doesn’t happen on only one device, but it happens always on the same devices | 10:26 |
admin6 | kota_ some device logs fine quarantine, but some always log this error when they quarantine frags. I doubt it could be a race condition with the object-server | 10:29 |
admin6 | kota_ I’ve also seen another type of error : " line 1800, in close#012 {'exc': e, 'stack': ''.join(traceback.format_stack()),#012" http://paste.openstack.org/show/587879/ I’m rolling back my servers to unpatched versions. | 10:34 |
*** vint_bra1 has joined #openstack-swift | 10:41 | |
kota_ | hmmm | 10:42 |
*** blair has quit IRC | 10:42 | |
*** janonymous has quit IRC | 10:42 | |
*** vint_bra has quit IRC | 10:43 | |
*** AndyWojo has quit IRC | 10:43 | |
*** nottrobin has quit IRC | 10:43 | |
*** philipw has quit IRC | 10:43 | |
*** fungi has quit IRC | 10:44 | |
*** tdasilva has quit IRC | 10:44 | |
*** calebb has quit IRC | 10:44 | |
*** tries_ has quit IRC | 10:44 | |
*** philipw has joined #openstack-swift | 10:45 | |
*** AndyWojo has joined #openstack-swift | 10:45 | |
*** nottrobin has joined #openstack-swift | 10:45 | |
acoles | so, do we somehow have a race between the reader quarantining the frag while reading and also attempting to quarantine while closing (in _handle_close_quarantine)?? One or other will fail, producing one or other of admin6 logs. The frag would be quarantined so if this is the case we're looking at log noise, not failure to quarantine. BUT I can't see how that can happen yet i.e. read a chunk, check it, quarantine *and* | 10:46 |
acoles | also call _handle_close_quarantine. It would mean read_to_eof is True. | 10:46 |
acoles | kota_: ^^ | 10:46 |
*** openstackgerrit has quit IRC | 10:47 | |
*** openstackgerrit has joined #openstack-swift | 10:48 | |
kota_ | acoles: could be? not yet get the path. | 10:52 |
*** blair has joined #openstack-swift | 10:53 | |
kota_ | if the bad frag appeared at the mid of the file, DiskFileQuarantined will be raised so it doesn't seem to reach read_to_eof == True | 10:54 |
kota_ | and if the bad frag is in the tail, check_frag doesn't do anything until it gets to the tail... | 10:55 |
acoles | kota_: agree, but see line 3 of this paste ... the frag has been quarantined, but the exception raised in close() suggests that the _handle_close_quarantine got called ?? | 10:56 |
admin6 | kota_ acoles: I checked 6 or 7 error logs on different servers and different devices, the corrupted frags always has been quarantined correctly | 10:56 |
kota_ | acoles: yeah, that's odd. | 10:56 |
*** tries_ has joined #openstack-swift | 10:57 | |
*** tries_ has quit IRC | 10:57 | |
*** tries_ has joined #openstack-swift | 10:57 | |
*** janonymous has joined #openstack-swift | 10:57 | |
acoles | admin6: good! I think/hope that the logger error is due to an unnecessary second attempt to quarantine. But I'd like to figure out why! | 10:58 |
*** calebb has joined #openstack-swift | 10:58 | |
*** fungi has joined #openstack-swift | 10:58 | |
kota_ | acoles: not sure if auditor calling the reader with closing context but the reader already has the final statement to close itself | 11:09 |
kota_ | acoles: does it trigger the duplication? | 11:09 |
*** CrackerJackMack has quit IRC | 11:10 | |
kota_ | so, if i got true, it could happen when 1. it doesn't include corrupted fragment metadata 2. it has invalid md5sum, maybe? | 11:11 |
kota_ | try to make sure the tests | 11:11 |
kota_ | no, 1 is probably false | 11:11 |
kota_ | since admin6's paste | 11:12 |
kota_ | the duplication could be, try to make sure.. | 11:13 |
*** hseipp has quit IRC | 11:16 | |
acoles | kota_: yeah I am looking at the same (two calls to reader.close) but not sure that explains the other traceback (renamer fails during read of a chunk) | 11:16 |
acoles | line 8 of paste | 11:17 |
*** CrackerJackMack has joined #openstack-swift | 11:18 | |
*** gopenshaw_ has quit IRC | 11:19 | |
*** vinsh has joined #openstack-swift | 11:23 | |
acoles | kota_: reader.close() sets self._fp None and also uses self._fp as a guard so it won't execute twidce | 11:39 |
*** tdasilva has joined #openstack-swift | 11:40 | |
*** d0ugal has quit IRC | 11:40 | |
acoles | twice* | 11:40 |
kota_ | acoles: sure | 11:40 |
acoles | kota_: so i am back to wondering if somehow check_frag is called during iter and also during close->handle_close_quarantine | 11:41 |
acoles | for same data file, which would result in a duplication of renamer calls | 11:42 |
acoles | kota_ should be going home ? | 11:42 |
kota_ | acoles: might be | 11:43 |
kota_ | not yet | 11:43 |
*** d0ugal has joined #openstack-swift | 11:45 | |
kota_ | still in fog :( | 11:59 |
acoles | kota_: another crazy thought...the renamer retries once if an OSError is raised...is it possible that the first rename does perform the file move but also raises an OSError (like I said, crazy thinking) so the second attempt fails. i.e. there is just one call to renamer?? | 12:02 |
kota_ | acoles: sounds crazy but could be possible but i cannot think it happens so frequently... | 12:04 |
acoles | kota_: yep. v unlikely. more likely something to do with the reader iter but I can't see it. seems like another call to the iter is needed AFTER the quarantine has occurred and the exception raised. Then, we'd see both of admin6 cases depending on whether there was more data to read (another call to check_frag) or no more data (so close calls handle_close_quarantine calls check_frag) | 12:10 |
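acoles's "crazy" retry hypothesis can be made concrete with a small simulation: if a rename moves the file but still raises an OSError, a single retry fails with ENOENT because the source is gone. Everything here is an assumption for illustration; `rename_with_retry` and `flaky_rename` are hypothetical and are not Swift's renamer.

```python
import errno
import os
import tempfile


def rename_with_retry(rename, src, dst):
    """Call rename(src, dst), retrying once if an OSError is raised."""
    try:
        rename(src, dst)
    except OSError:
        rename(src, dst)


def flaky_rename(src, dst):
    """Move the file, then raise anyway (the hypothesised odd case)."""
    os.rename(src, dst)
    raise OSError(errno.EIO, 'simulated error after a successful move')


def demo():
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, 'frag.data')
        dst = os.path.join(root, 'frag.quarantined')
        with open(src, 'wb') as f:
            f.write(b'corrupt')
        try:
            rename_with_retry(flaky_rename, src, dst)
        except OSError as err:
            # The error seen by the caller comes from the retry, not
            # from the simulated failure of the first attempt.
            return err.errno, os.path.exists(dst)
        return None, os.path.exists(dst)


if __name__ == '__main__':
    print(demo())
```

The caller sees "No such file or directory" while the file sits at its destination, which is the symptom admin6 reported; as both acoles and kota_ note, this path is unlikely to occur so frequently in practice.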
*** chlong has joined #openstack-swift | 12:13 | |
kota_ | acoles: probably similar thought to me but right now no idea where we catch the exception and retry the iter calls | 12:17 |
kota_ | acoles: one minor mistake could be found. | 12:22 |
*** jerrygb has joined #openstack-swift | 12:23 | |
kota_ | acoles: i don't think it is related to the admin6's issue though, https://github.com/openstack/swift/blob/master/swift/obj/diskfile.py#L2648 line may be unnecessary | 12:23 |
kota_ | acoles: it seems we can expect DiskFileQuarantined will be raised at self._quarantine with quarantine_hook | 12:24 |
kota_ | and it seems object-server doesn't want to raise an exception during transferring the data, maybe? | 12:25 |
kota_ | it calls without quarantine_hook | 12:25 |
acoles | I think clayg added that and i think it is so that when detecting corrupt frag during a GET (not audit) then the GET will terminate early | 12:26 |
acoles | quarantine_hook only installed by auditor | 12:26 |
openstackgerrit | Christian Hugo proposed openstack/swift: Raise ValueError if a config section does not exist https://review.openstack.org/393388 | 12:28 |
*** jerrygb has quit IRC | 12:31 | |
*** iurygregory has quit IRC | 12:32 | |
*** hseipp has joined #openstack-swift | 12:33 | |
kota_ | acoles: on the duplication call, i'm now wondering, how we can capture the *diskfile* log message? | 12:33 |
kota_ | acoles: unit test looks to capture the auditor's logger and assert only error lines... | 12:34 |
kota_ | diskfile's logger is from manager, let me check... | 12:34 |
kota_ | audit worker's logger seems to be passed to the diskfile router so the router pass it to diskfile, maybe? | 12:36 |
kota_ | hmm... | 12:36 |
kota_ | oh, yeah, trying to warning level log lines, it looks like there is the diskfile's line. | 12:37 |
acoles | yes the auditor creates DF manager with the same logger as auditor | 12:39 |
acoles | so we see the warning from diskfile | 12:39 |
kota_ | acoles: moving on another idea/question, can threadpool.run_in_thread return sort of None? | 12:40 |
kota_ | if something fails | 12:40 |
kota_ | if we still have self.frag_buf and the run_in_thread failed and return None, it seems to go close with read_to_eof = True | 12:41 |
kota_ | and then, with closing statement also trying to close??? | 12:42 |
*** jerrygb has joined #openstack-swift | 12:42 | |
*** vinsh has quit IRC | 12:45 | |
*** chlong has quit IRC | 12:46 | |
admin6 | kota_ acoles: I need to leave. thanks for looking to this error. I’ve re-applied the patch and I’ll let it run for a while as long as it seems to be only a "noise" log error (I hope so). I’ll survey the IRC for your comment tonight (in about 6-8 hours) Let me know if you need more logs/infos. | 12:48 |
acoles | kota_: yes, but close() will only call handle_close_quarantine once because self._fp is set to None in the finally. And the finally is executed *before* the DiskFileQuarantine exception is handled in the auditor, so the "with closing" context would find self._fp=None, I *think*?? | 12:48 |
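The ordering acoles reasons about here can be checked with a toy reader: a `finally` inside the iterator runs before the caller's `except` clause, and a guard on `self._fp` makes the later `close()` from the `closing` context a no-op. This is a sketch under stated assumptions; the `Reader` class and `audit` function are hypothetical stand-ins, not Swift's diskfile reader or auditor.

```python
from contextlib import closing


class Reader:
    """Toy reader with a close() guarded by self._fp."""

    def __init__(self, chunks):
        self._fp = object()      # stands in for an open file handle
        self._chunks = chunks
        self.events = []

    def __iter__(self):
        try:
            for chunk in self._chunks:
                if chunk is None:    # None marks a corrupt chunk here
                    self.events.append('quarantine-in-iter')
                    raise ValueError('quarantined')
                yield chunk
        finally:
            self.close()

    def close(self):
        if self._fp is None:
            return                   # already closed: do nothing
        try:
            self.events.append('close-body')
        finally:
            self._fp = None


def audit(reader):
    """Consume the reader inside a closing() context, like a caller would."""
    try:
        with closing(reader):
            for _ in reader:
                pass
    except ValueError:
        reader.events.append('handled')
    return reader.events


if __name__ == '__main__':
    print(audit(Reader([b'a', None, b'b'])))
```

The body of `close()` is recorded once, between the quarantine and the caller's handler, so under these assumptions two calls to `close()` alone would not explain a duplicated rename.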
acoles | admin6: ok, thanks for flagging this up | 12:48 |
acoles | kota_: I am away for lunch | 12:49 |
kota_ | acoles: ok, and i have to leave my office | 12:49 |
kota_ | will think more next week. | 12:49 |
*** hseipp has quit IRC | 12:52 | |
*** jerrygb has quit IRC | 13:01 | |
*** admin6 has quit IRC | 13:03 | |
*** StraubTW has joined #openstack-swift | 13:09 | |
*** StraubTW_ has joined #openstack-swift | 13:10 | |
*** hseipp has joined #openstack-swift | 13:10 | |
*** d0ugal has quit IRC | 13:12 | |
*** StraubTW has quit IRC | 13:13 | |
*** SkyRocknRoll has quit IRC | 13:13 | |
*** links has quit IRC | 13:13 | |
*** silor has quit IRC | 13:19 | |
*** d0ugal has joined #openstack-swift | 13:21 | |
*** silor has joined #openstack-swift | 13:26 | |
*** daemontool has joined #openstack-swift | 13:32 | |
*** amoralej is now known as amoralej|lunch | 13:32 | |
*** d0ugal has quit IRC | 13:35 | |
*** d0ugal has joined #openstack-swift | 13:37 | |
*** jerrygb has joined #openstack-swift | 13:39 | |
*** jerrygb has quit IRC | 13:39 | |
*** jroll is now known as jrollinhatin | 13:40 | |
*** jerrygb has joined #openstack-swift | 13:46 | |
*** doxavore has joined #openstack-swift | 13:47 | |
*** jerrygb has quit IRC | 13:48 | |
*** sweeper has joined #openstack-swift | 13:50 | |
*** d0ugal_ has joined #openstack-swift | 13:51 | |
sweeper | so | 13:53 |
*** d0ugal has quit IRC | 13:53 | |
sweeper | I've got this cluster that's gone through a rough week or so, with the colo deciding to turn off half of our rack one day, followed by the other half the next day | 13:54 |
sweeper | this shook out a drive or three, so we did a rebalance | 13:54 |
sweeper | now we're looking at this drive that's full to 100% when it should really be around 90% tops | 13:55 |
*** sgundur has joined #openstack-swift | 13:58 | |
*** Guest13194 has quit IRC | 13:58 | |
acoles | kota_: admin6 I filed a bug report here bug 1639244 , I have to work on some other stuff rest of today, we'll try to get some other minds to think on it | 14:02 |
openstack | bug 1639244 in OpenStack Object Storage (swift) "Auditor logs "No such file" errors when renaming corrupt EC fragments" [Undecided,New] https://launchpad.net/bugs/1639244 | 14:02 |
sweeper | is there a way to see if this is just a result of the ongoing rebalance? allocation in the ringfile looks right | 14:09 |
*** Guest13194 has joined #openstack-swift | 14:13 | |
openstackgerrit | Nandini Tata proposed openstack/swift: Multi Swift - Multiple Swift clusters on same h/w https://review.openstack.org/393794 | 14:17 |
*** donagh has quit IRC | 14:24 | |
*** vinsh has joined #openstack-swift | 14:27 | |
*** jistr is now known as jistr|call | 14:29 | |
acoles | sweeper: try asking later (west coast US time), someone there may be able to answer | 14:29 |
acoles | admin6: can you double check you don't have two auditor processes running on the same node? (I'm clutching at straws, but worth asking) | 14:31 |
*** d0ugal_ has quit IRC | 14:39 | |
*** d0ugal has joined #openstack-swift | 14:39 | |
*** amoralej|lunch is now known as amoralej | 14:46 | |
*** jerrygb_ has joined #openstack-swift | 14:52 | |
*** jerrygb has joined #openstack-swift | 15:01 | |
*** jistr|call is now known as jistr | 15:03 | |
*** jerrygb_ has quit IRC | 15:04 | |
*** zul has quit IRC | 15:07 | |
*** zul has joined #openstack-swift | 15:08 | |
*** jistr is now known as jistr|biab | 15:10 | |
*** zul has quit IRC | 15:10 | |
*** zul has joined #openstack-swift | 15:12 | |
*** jistr|biab is now known as jistr | 15:14 | |
*** Victor777 has joined #openstack-swift | 15:19 | |
*** Victor777 has quit IRC | 15:24 | |
*** sgundur has quit IRC | 15:24 | |
*** sgundur has joined #openstack-swift | 15:25 | |
*** rcernin has quit IRC | 15:25 | |
*** chsc has joined #openstack-swift | 15:25 | |
notmyname | good morning | 15:26 |
notmyname | acoles: west coast US time, you say? | 15:26 |
*** rcernin has joined #openstack-swift | 15:27 | |
acoles | notmyname: that's now is it? I think we're only 7 hours apart this week | 15:28 |
notmyname | yep. 8:30 ish here right now | 15:28 |
notmyname | so, for most people we work with on the west coast, that hardly counts as "morning" ;-) | 15:28 |
sweeper | damn hippies | 15:29 |
notmyname | fungi: I have a CI question. in general, if a project needed to pull in a dependency and build it from source (ie instead of pulling a distro package), how would one do that? is there anything in infra that currently supports that? | 15:29 |
acoles | lol | 15:29 |
fungi | notmyname: i think it would be very situationally dependent | 15:29 |
notmyname | sweeper: I'm likely not exactly the right person to answer your question, but I can give it a shot. you're trying to figure out if the drive fullness is because of rebalancing? or failures? or ...? | 15:30 |
fungi | notmyname: i mean, technically there are plenty of jobs that do that now. if you have a python project depend on pycrypto i'll pull down sdists of numpy that build c extensions against cffi.h | 15:30 |
notmyname | fungi: ah, right | 15:30 |
notmyname | fungi: specifically for swift I'm thinking of the EC libraries or the future golang stuff (if the TC sticks with "not in the swift repo") | 15:31 |
fungi | and we have jobs that build java-based projects with maven or similar tools | 15:31 |
fungi | so those are grabbing random java deps from the 'net before compiling | 15:32 |
notmyname | fungi: do they build a binary that's then available in the PATH? or are they building a distro package and then installing that? | 15:32 |
fungi | for the java case they typically build a jar, war, whatever and then run that under a jvm | 15:32 |
fungi | i think there's an effort starting with nova wanting to test against just-in-time builds of libvirt and qemu from trunk as well | 15:34 |
fungi | though i don't know if they've worked out the details on that yet | 15:34 |
notmyname | ah, interesting | 15:35 |
fungi | i know they were waffling between compiling in-job vs having a separate periodic job to build packages of those their jobs could grab | 15:35 |
notmyname | regardless of where the code lives (in or out of today's swift repo), we'll have to solve the build problem for golang stuff. ie build from source and test that instead of installing some distro package version | 15:35 |
notmyname | yeah, we considered that too. eg a nightly build that we use | 15:36 |
notmyname | I'm not a big fan of that. seems to be the worst of both worlds | 15:36 |
fungi | yep. the up-side is that it speeds up your tests because you don't have to wait for the build | 15:36 |
fungi | down side, which we see with projects that do this already to create things like service images, is that if your periodic update publishes a broken artifact your jobs are all instabroken until you solve that | 15:37 |
notmyname | our tests are currently slow because of devstack setup. that is, those are the longest running tests by far, so we've got a fairly generous "budget" to play with, if jobs run concurrently, before we increase the wall-time of a check | 15:38 |
*** Guest13194 is now known as tesseract- | 15:38 | |
fungi | yep, so as i said, highly situationally dependent | 15:38 |
fungi | it comes down to what tradeoffs you're willing to make | 15:39 |
notmyname | heh. our in-tree unit and functional tests each are about 5 minutes long. gate-tempest-dsvm-neutron-identity-v3-only-full-ubuntu-xenial-nv is 45 minutes | 15:40 |
notmyname | by the way, what is that test? | 15:40 |
notmyname | why does swift need to have an extra 15 minutes (next slowest is only 30 min) for testing neutron with keystone v3? | 15:40 |
fungi | i believe that's a devstack integration job using neutron instead of nova-net and keystone v3 api instead of v2, running the "full" set of tempest tests on ubuntu-xenial but not voting (because you're probably still relying on ubuntu-trusty and need to switch) | 15:41 |
notmyname | we're running gate-swift-dsvm-functional-identity-v3-only-ubuntu-xenial-nv (in ~15 min) and that's passing | 15:42 |
fungi | sure, that probably doesn't, e.g., set up glance and exercise it to make sure that changes to swift don't break glance's use of it as a storage backend | 15:43 |
notmyname | anyway, it's just something I didn't (don't) know about. thought I'd ask. if we're all running that to make sure of something, that's good for me | 15:44 |
fungi | agreed, those are questions better posed to the qa team | 15:44 |
fungi | from an infra perspective, i just want to make sure that if there are jobs you want to run, we have the support in place for you to be able to do so | 15:45 |
notmyname | thanks :-) | 15:45 |
notmyname | what I want to run is something that builds a binary from source. at some point soon, we need to figure that out together (I hope) | 15:46 |
notmyname | pdardeau: you've got keystone set up against your swift dev cluster(s) right? | 15:47 |
fungi | sure. ultimately what our ci platform gives you is a way to run shell scripts in a vm with root access. so really it's a blank slate if you want, but there are almost certainly modular bits of prior art we can draw from so you don't have to write it from the ground up | 15:47 |
notmyname | cool | 15:47 |
notmyname | sounds like `./configure && make && make install` is even reasonable. we put that in a script in our repo, then call it from the gate job, and done! | 15:48 |
fungi | yep | 15:48 |
notmyname | oh, speaking of gate changes, I've got one other thing on my plate | 15:49 |
notmyname | I need to make a new job template copied from the py27 base one that adds an XFS partition | 15:49 |
notmyname | likely will mount /tmp as XFS | 15:49 |
notmyname | just a heads-up that I'm hoping to get that done...hmm, it's already friday. maybe next week then | 15:50 |
notmyname | maybe today. we'll see what comes up | 15:50 |
fungi | sure, i remember discussing it with you. the tox-based python unit test jobs are pretty modular already, so shouldn't be hard. just need to add a custom shell builder prior to revoke-sudo that does your mkfs and mount | 15:51 |
notmyname | right | 15:51 |
notmyname | I'd prefer to do that in the existing job template, but I've been told that would be unacceptable | 15:51 |
*** stradling has joined #openstack-swift | 15:52 | |
fungi | well, the existing template is pretty hollow. a copy of that to be xfs-specific wouldn't be a problem | 15:52 |
fungi | the main sticking point, if i recall, was making sure the xfs-requiring tests are opportunistic so you can also continue to run an xfs-less unit test to meet the consistent testing interface guidelines | 15:54 |
notmyname | yeah, I'm not worried about the "difficulty" of making a new job template. that looks easy. but yeah, that | 15:54 |
*** bogdands has joined #openstack-swift | 15:54 | |
notmyname | it makes the existing py27 jobs skip about 1/3 of the tests and all the associated code in repo to handle that | 15:54 |
stradling | Hey, guys -- newb here. Looking for guidance on puppet-swift -- has development moved from puppetforge to Github definitively, or will the changes eventually flow back to puppetforge? | 15:55 |
fungi | alternative being to mock those interactions in your unit tests so they don't touch an actual local fs, and then do your real xfs testing in the functional job. presumably you have good reasons to discount that solution though | 15:55 |
notmyname | stradling: I have no idea. fungi, do you know where openstack puppet devs hang out? | 15:55 |
fungi | stradling: https://wiki.openstack.org/wiki/IRC says they have a #openstack-uppet channel | 15:56 |
fungi | er, #openstack-puppet | 15:56 |
stradling | Nice. Thanks! | 15:56 |
*** nikivi has joined #openstack-swift | 15:56 | |
notmyname | fungi: yeah, that's what we're actually replacing. we were halfway mocking a filesystem. so the option is either to write a pretty-close-to-real FS as a mock or use one that's provided to us on the test node. that second option is much preferred | 15:56 |
*** rcernin has quit IRC | 15:56 | |
*** Victor777 has joined #openstack-swift | 15:57 | |
notmyname | point being that in many ways swift's job is to write a file in a filesystem. makes it *really* hard to test if you don't actually have a filesystem to write to | 15:57 |
fungi | sure. just also makes it hard as a local dev to run swift unit tests without knowing that you need to do extra setup | 15:58 |
notmyname | but yeah. plan right now is to add all the skips and then make a new gate job | 15:58 |
*** pcaruana has quit IRC | 15:58 | |
fungi | anyway, we can definitely make it work. give me a heads up once you have a first stab and i'll give it a once-over | 15:58 |
*** tesseract- has quit IRC | 15:59 | |
bogdands | Hi everyone. I've been struggling with something for a few days now and figured this is my last chance of figuring it out. I want to have all users of a project with the role of _member_ being able to list containers, I thought you'd have to edit the policy.json file and add something like .."identity:list_containers": "rule:member", where member is "role:_member_", but it doesn't really work and I've tried several other | 15:59 |
bogdands | would be awesome if someone has any insight on this | 15:59 |
notmyname | I disagree with the level of difficulty added. but i also don't think it's too important to argue about. the path forward with the least issues is a new job template. makes our code more complex, but that's what we're paid to do as devs | 15:59 |
notmyname | fungi: thanks | 15:59 |
*** Victor777 has quit IRC | 15:59 | |
fungi | notmyname: you're welcome! always happy to help | 15:59 |
notmyname | acoles: is guy fawkes day tomorrow a thing that's celebrated? or is that something that americans think is a British holiday because of a movie that came out a few years ago? | 16:00 |
pdardeau | notmyname: no, we don't use keystone. we've taken very brief looks into it a couple of times. | 16:01 |
notmyname | pdardeau: ah. ok. I was wondering about the impact of fernet tokens | 16:01 |
notmyname | pdardeau: I thought you might have had some experience there or at least something set up to easily compare | 16:01 |
*** jordanP_ has joined #openstack-swift | 16:02 | |
acoles | notmyname: yes it is widely celebrated this weekend - bonfires and fireworks galore! | 16:02 |
notmyname | oh, fun! | 16:03 |
acoles | not if you are a dog ;) | 16:03 |
notmyname | lol | 16:03 |
*** jordanP has quit IRC | 16:04 | |
*** jordan__ has joined #openstack-swift | 16:04 | |
*** Victor777 has joined #openstack-swift | 16:04 | |
acoles | or, in the past, a cat, but that's not a pleasant tale | 16:05 |
*** 18VAABSMN has quit IRC | 16:06 | |
*** jordanP_ has quit IRC | 16:08 | |
*** Victor777 has quit IRC | 16:08 | |
*** raginbaj- has joined #openstack-swift | 16:09 | |
*** cebruns_ has quit IRC | 16:11 | |
*** cebruns_ has joined #openstack-swift | 16:14 | |
acoles | bogdands: swift doesn't use policy.json files. Any user with an operator_role on a project as defined in proxy-server.conf can list containers. Default operator_roles are admin and swiftoperator. | 16:20 |
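The role check acoles describes can be sketched roughly as follows. This is a minimal illustration, not Swift's actual keystoneauth code: the `[filter:keystoneauth]` section and `operator_roles` option mirror proxy-server.conf, while `SAMPLE_CONF`, `load_operator_roles`, and `is_operator` are hypothetical names made up for the example.

```python
import configparser

# Hypothetical proxy-server.conf fragment using the default roles acoles mentions.
SAMPLE_CONF = """
[filter:keystoneauth]
use = egg:swift#keystoneauth
operator_roles = admin, swiftoperator
"""


def load_operator_roles(conf_text):
    """Parse the comma-separated operator_roles option into a set of role names."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("filter:keystoneauth", "operator_roles")
    return {role.strip().lower() for role in raw.split(",") if role.strip()}


def is_operator(user_roles, operator_roles):
    """Return True if any of the user's roles is an operator role (assumed semantics)."""
    return any(role.lower() in operator_roles for role in user_roles)


roles = load_operator_roles(SAMPLE_CONF)
print(is_operator(["_member_"], roles))
print(is_operator(["swiftoperator"], roles))
```

Under these assumptions a user holding only `_member_` is not treated as an operator, which is why editing policy.json has no effect on container listing.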
bogdands | thanks for the reply, but can I have a user that cannot write/create containers and still be able to list them | 16:21 |
*** diogogmt has joined #openstack-swift | 16:27 | |
acoles | bogdands: you cannot achieve that with roles, but you can with container ACLs - you can use a read ACL to grant read access to users which allows them to list container and read its objects. The best description is here https://review.openstack.org/#/c/374215/4/doc/source/overview_acl.rst which is a patch we have yet to merge into the docs | 16:28 |
patchbot | patch 374215 - swift - Document access control lists (ACLs) | 16:28 |
acoles | bogdands: also see docs here http://docs.openstack.org/developer/swift/overview_auth.html#access-control-using-keystoneauth | 16:31 |
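A container read ACL of the kind acoles suggests could be assembled along these lines. This is a sketch only, assuming the keystoneauth ACL element form `<project-id>:<user-id>` described in the linked docs; the helper `read_acl_header` and the IDs are hypothetical.

```python
def read_acl_header(grants):
    """Build an X-Container-Read header from (project_id, user_id) pairs.

    Either element may be '*' as a wildcard (assumed keystoneauth ACL syntax).
    """
    elements = []
    for project_id, user_id in grants:
        if not project_id or not user_id:
            raise ValueError("project_id and user_id must be non-empty")
        elements.append(f"{project_id}:{user_id}")
    return {"X-Container-Read": ",".join(elements)}


# Hypothetical IDs: one user scoped to a project, one user in any project.
headers = read_acl_header([("proj123", "user456"), ("*", "user789")])
print(headers)
```

The resulting headers would then be sent in a POST to the container by an operator of the project; that step is left out so the sketch runs without a cluster.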
*** dmorita has joined #openstack-swift | 16:32 | |
*** dmorita has quit IRC | 16:34 | |
*** dmorita has joined #openstack-swift | 16:34 | |
bogdands | acoles: thanks, but I also need the users to be able to list all available containers, which I'm afraid won't be possible if done in this manner | 16:41 |
acoles | bogdands: correct, with keystone auth only users with operator_role can list all containers in a project, and those users can also modify them, so that does not achieve your goal. There is work in progress to add account level ACL support for keystone auth https://review.openstack.org/#/c/356715/ | 16:43 |
patchbot | patch 356715 - swift - Supporting Account ACL in keystoneauth | 16:43 |
clayg | weee | 16:44 |
bogdands | ok, thank you acoles. Really appreciate your answer. Cheers | 16:44 |
clayg | acoles used to be a cat? | 16:45 |
* acoles purrs | 16:46 | |
acoles | clayg: kota and I got completely stumped looking into this https://bugs.launchpad.net/swift/+bug/1639244 | 16:47 |
openstack | Launchpad bug 1639244 in OpenStack Object Storage (swift) "Auditor logs "No such file" errors when renaming corrupt EC fragments" [Undecided,New] | 16:47 |
acoles | admin6 applied the Mitaka backport, saw corrupt frags quarantined (good!) but also some error logs | 16:48 |
clayg | maybe the file is not very long/big (i.e. < 64K) | 16:48 |
clayg | I would have sworn I saw that once in dev when I was doing functional testing - but it was laborious - the patch changed - and I sort of forgot about it | 16:49 |
clayg | acoles: neway - i'll look | 16:49 |
acoles | couldn't see how that would cause duplicate attempts to quarantine | 16:49 |
acoles | oh, interesting | 16:49 |
clayg | i was sorta looking at it already - but I'm going to spend some time with liberasurecode stuff today too - so my head should be plenty in EC (no guarantee I'll see something you missed tho, probably the opposite) | 16:49 |
acoles | clayg: one of the culprit frags was listed by admin6, >2M http://eavesdrop.openstack.org/irclogs/%23openstack-swift/%23openstack-swift.2016-11-04.log.html#t2016-11-04T10:16:09 | 16:51 |
acoles | clayg: ok, I was just mentioning it cos sometimes fresh eyes produce an "aha" moment | 16:51 |
*** bogdands has quit IRC | 16:51 | |
*** jamielennox is now known as jamielennox|away | 16:51 | |
clayg | acoles: so you're saying asking clayg if he can offer anything useful is sort of a last ditch hail mary shot in the dark - but what have you got to lose? | 16:52 |
*** dmorita has quit IRC | 16:57 | |
*** dmorita has joined #openstack-swift | 16:57 | |
*** dmorita has quit IRC | 16:57 | |
acoles | lol | 16:59 |
*** dmorita has joined #openstack-swift | 16:59 | |
acoles | I'm admitting defeat | 16:59 |
*** nikivi has quit IRC | 17:01 | |
clayg | go enjoy some fireworks | 17:05 |
*** klrmn has joined #openstack-swift | 17:06 | |
*** rledisez has quit IRC | 17:06 | |
*** hseipp has quit IRC | 17:09 | |
*** stradling has quit IRC | 17:20 | |
clayg | ntata: I really thought the few places we hard code /etc/swift/swift.conf made multi-swift impossible to do w/o some patching? patch 393794 | 17:21 |
patchbot | https://review.openstack.org/#/c/393794/ - swift - Multi Swift - Multiple Swift clusters on same h/w | 17:21 |
*** ChubYann has joined #openstack-swift | 17:21 | |
clayg | timburke: man it's going to be *really* hard to keep probetests running with encryption if people aren't running saio's with encryption turned on (vsaio makes it easy to swap into encryption, but I leave it off mostly) | 17:25 |
clayg | ... makes me wonder about our auditor ec checksum patches and encryption :'( | 17:25 |
openstackgerrit | Clay Gerrard proposed openstack/swift: Add probetest for response with duplicate frags https://review.openstack.org/371771 | 17:29 |
clayg | ^ test only change - added back acoles +2 | 17:32 |
clayg | Although - I didn't run it with encryption turned on :\ | 17:32 |
*** sgundur has quit IRC | 17:33 | |
*** sgundur has joined #openstack-swift | 17:33 | |
*** cbartz has left #openstack-swift | 17:35 | |
*** d0ugal has quit IRC | 17:39 | |
acoles | I just did | 17:49 |
acoles | clayg: so is this a case where I should just go ahead and approve? seems like we could do that | 17:50 |
pdardeau | clayg: cschwede: i like the graphics you used in ring talk | 17:51 |
acoles | pdardeau: I want t-shirts with those graphics :) | 17:56 |
clayg | like with the parts in buckets and arrows? or the one with krik cursing rings? | 17:59 |
*** jordan__ has quit IRC | 18:01 | |
acoles | parts in buckets | 18:02 |
acoles | bedlinen, towels... ;) | 18:03 |
* acoles thinks it may be time to go. | 18:04 | |
*** acoles is now known as acoles_ | 18:05 | |
pdardeau | clayg: i liked all of them! i'll probably end up having to re-watch it about 20 times to grok the content, so i'll keep better track on next viewing. ;-) | 18:05 |
*** links has joined #openstack-swift | 18:09 | |
*** links has quit IRC | 18:10 | |
*** daemontool has quit IRC | 18:19 | |
*** klrmn has quit IRC | 18:34 | |
*** sgundur has quit IRC | 18:36 | |
*** sgundur has joined #openstack-swift | 18:37 | |
*** klrmn has joined #openstack-swift | 18:37 | |
*** klrmn has quit IRC | 18:38 | |
*** klrmn has joined #openstack-swift | 18:38 | |
*** jamielennox|away has quit IRC | 18:38 | |
*** dmorita has quit IRC | 18:40 | |
clayg | lol | 18:41 |
clayg | bedlinen, towels!? | 18:41 |
clayg | now *I* have to rewatch it?! | 18:41 |
*** dmorita has joined #openstack-swift | 18:41 | |
openstackgerrit | Merged openstack/swift: Add probetest for response with duplicate frags https://review.openstack.org/371771 | 18:42 |
openstackgerrit | Pete Zaitcev proposed openstack/liberasurecode: Fix liberasurecode skipping a bunch of invalid_args tests https://review.openstack.org/387879 | 18:42 |
*** jerrygb has quit IRC | 18:45 | |
*** jerrygb has joined #openstack-swift | 18:45 | |
*** david_c_ has joined #openstack-swift | 18:47 | |
*** geaaru has quit IRC | 18:54 | |
*** doxavore has quit IRC | 18:59 | |
clayg | timburke: do you have a system to keep track of the backports for all the high/critical bugs we've been fixing? | 19:03 |
openstackgerrit | Tim Burke proposed openstack/swift: Reduce backend requests for SLO If-Match / HEAD requests https://review.openstack.org/347538 | 19:03 |
openstackgerrit | Tim Burke proposed openstack/swift: Confirm receipt of SLO PUT with etag https://review.openstack.org/390901 | 19:03 |
timburke | clayg: i've got https://review.openstack.org/#/dashboard/?title=Open+Backports&foreach=is:open+branch:%255Estable/.*&Swift=project:openstack/swift&Swift+Client=project:openstack/python-swiftclient but that only surfaces the backports that have already been proposed. in terms of identifying patches that should be backported but haven't yet, no, i don't have a good solution beyond "someone said maybe we should backport this -- i should | 19:09 |
timburke | go propose it!" | 19:09 |
patchbot | Error: No closing quotation | 19:09 |
*** amoralej is now known as amoralej|off | 19:11 | |
*** jamielennox|away has joined #openstack-swift | 19:13 | |
*** jamielennox|away is now known as jamielennox | 19:13 | |
*** jerrygb has quit IRC | 19:14 | |
clayg | notmyname: maybe a good friday project would be to hack up patchbot so that the only responses he can give are the patch description lines? | 19:16 |
*** jerrygb has joined #openstack-swift | 19:16 | |
clayg | like `if "review.openstack.org" not in response: return` or something | 19:18 |
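clayg's one-liner could be fleshed out as a small filter like the one below. It is a sketch of the idea only; `filter_response` is a hypothetical helper and is not wired into supybot or patchbot in any way.

```python
def filter_response(response):
    """Pass through only responses that look like patch description lines."""
    if "review.openstack.org" not in response:
        # Anything else (error messages, other plugin output) is dropped.
        return None
    return response


print(filter_response("Error: No closing quotation"))
print(filter_response("https://review.openstack.org/#/c/393794/ - swift - Multi Swift"))
```

With this in place the "No closing quotation" error would be dropped, while patch description lines would still go out to the channel.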
timburke | clayg: i think it runs deeper than that, unfortunately. like, we need to rewrite some/all of supybot, which seems to have defaulted to enabling a bunch of things when notmyname changed the machine that was running patchbot | 19:21 |
pdardeau | here is what patchbot is currently thinking: https://www.youtube.com/watch?v=c8N72t7aScY | 19:24 |
zaitcev | this better not be a music video by Rick Astley | 20:00 |
*** jerrygb_ has joined #openstack-swift | 20:01 | |
*** jerrygb has quit IRC | 20:05 | |
*** jerrygb_ has quit IRC | 20:06 | |
*** vinsh has quit IRC | 20:06 | |
*** vinsh has joined #openstack-swift | 20:06 | |
*** sileht has quit IRC | 20:21 | |
clayg | timburke: yeah totally - find the method that sucks in supy bot and monkey patch the hell out of it | 20:23 |
clayg | like - there's probably only one place that actually writes to the socket that goes out to irc - i'm saying stuff some jank in that method right there | 20:24 |
*** sileht has joined #openstack-swift | 20:28 | |
timburke | clayg: this is how we get shit like https://github.com/openstack/swift/blob/2.10.0/swift/proxy/server.py#L220 and then we can't upgrade because socket._fileobject flat out *isn't a thing anymore* | 20:30 |
*** dmorita has quit IRC | 20:31 | |
*** dmorita has joined #openstack-swift | 20:43 | |
*** adb5a56 has joined #openstack-swift | 20:45 | |
*** 7ITAAOQ37 has joined #openstack-swift | 20:48 | |
*** 7ITAAOQ37 has quit IRC | 20:48 | |
*** tries_ has quit IRC | 20:48 | |
*** chsc has quit IRC | 20:52 | |
*** chsc has joined #openstack-swift | 21:03 | |
*** chsc has joined #openstack-swift | 21:03 | |
*** sams-gleb has quit IRC | 21:06 | |
*** sams-gleb has joined #openstack-swift | 21:06 | |
*** sams-gleb has quit IRC | 21:11 | |
clayg | timburke: *totally* | 21:15 |
clayg | timburke: where does socket keep its fileobject now then? | 21:15 |
timburke | no gd clue | 21:15 |
*** silor has quit IRC | 21:19 | |
clayg | timburke: oh look how it just *hands* every plugin the irc instance - it *wants* you to monkey patch takeMsg https://github.com/Supybot/Supybot/blob/159c1e7cd886cb5f8c68d1ba703b1d302a716e5d/src/drivers/Socket.py#L102 | 21:21 |
*** chsc has quit IRC | 21:28 | |
*** sgundur has quit IRC | 21:35 | |
*** jrollinhatin is now known as jroll | 21:36 | |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient: Add additional headers for HEAD/GET/DELETE requests. https://review.openstack.org/372656 | 21:39 |
*** StraubTW_ has quit IRC | 21:43 | |
*** jerrygb has joined #openstack-swift | 21:51 | |
*** vint_bra1 has quit IRC | 21:52 | |
*** jerrygb has quit IRC | 21:56 | |
ntata | clayg, that's a good question. I made those changes in my personal repo that I forked. Multi Swift pulls Swift from my forked Swift repo and works :). I used the same idea of exporting SWIFT_ROOT from env variables and use it in the code. | 22:20 |
ntata | I will propose upstream the changes I made for a custom swift dir to work. | 22:21 |
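The approach ntata describes, reading the config location from the environment instead of hard-coding /etc/swift/swift.conf, might look roughly like this. It is a sketch under assumptions: that `SWIFT_ROOT` names the configuration directory itself, and that `swift_conf_path` (a hypothetical helper) is how the lookup would be wrapped; the actual patch may differ.

```python
import os

# The location Swift uses today, per the discussion above.
DEFAULT_SWIFT_DIR = "/etc/swift"


def swift_conf_path(environ=None):
    """Return the swift.conf path, honoring SWIFT_ROOT if set (assumed semantics)."""
    environ = os.environ if environ is None else environ
    root = environ.get("SWIFT_ROOT") or DEFAULT_SWIFT_DIR
    return os.path.join(root, "swift.conf")


print(swift_conf_path({}))
print(swift_conf_path({"SWIFT_ROOT": "/srv/cluster2"}))
```

Falling back to the default keeps single-cluster installs working unchanged, while a second cluster on the same hardware could export a different directory.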
*** diogogmt has quit IRC | 22:39 | |
notmyname | clayg: I think you hinted at this a few days ago, but yikes! https://github.com/openstack/swift/blob/master/test/functional/__init__.py#L46-L49 | 22:39 |
notmyname | the rabbit hole goes deeper... | 22:40 |
*** dmorita has quit IRC | 22:44 | |
clayg | notmyname: cool right! | 22:45 |
notmyname | neat! | 22:45 |
clayg | ntata: yes yes yes! | 22:45 |
*** dmorita has joined #openstack-swift | 22:45 | |
*** amoralej|off is now known as amoralej | 22:46 | |
clayg | who else was looking at the ec/ssync/auditor corrupt frag bugs? I can't remember the easiest way to get corrupt frags on disk that match their etag metadata? | 22:46 |
openstackgerrit | Merged openstack/python-swiftclient: Enable code coverage report in console output https://review.openstack.org/388669 | 22:51 |
*** jerrygb has joined #openstack-swift | 22:52 | |
zaitcev | notmyname: impressive | 22:58 |
*** jerrygb has quit IRC | 22:58 | |
clayg | new way to run into lp bug #1558754 - if you increase your replica count! solution - let the builder do its code instead of having stupid cli's doing stupid cli things - workaround: tell the cli to gtfo w/ -f | 23:11 |
openstack | Launchpad bug 1558754 in OpenStack Object Storage (swift) "remove devices before min-part-hours requires --force" [Undecided,Confirmed] https://launchpad.net/bugs/1558754 - Assigned to Jethro Sun (shwsun) | 23:11 |
openstackgerrit | Nandini Tata proposed openstack/swift: Allow custom swift configuration directory https://review.openstack.org/393952 | 23:16 |
clayg | yay cschwede fixed it already! | 23:17 |
clayg | patch 326967 | 23:18 |
patchbot | https://review.openstack.org/#/c/326967/ - swift - Rebalance with min_part_seconds_left > 0 | 23:18 |
clayg | tdasilva: I have this vague memory of at one time trying to think about adding some sort of tag to bugs like "has a patch in gerrit" | 23:18 |
clayg | tdasilva: like "fixed-in-gerrit" or something? I'm wondering if we might be able to leverage it later for finding old bugs with fixes that need reviewing | 23:19 |
clayg | what is the official bug tracker for liberasure? | 23:51 |
clayg | i sorta think it's still -> https://bitbucket.org/tsg-/liberasurecode/issues | 23:53 |
clayg | maybe we can just use tdasilva's pyeclib project -> https://bugs.launchpad.net/pyeclib | 23:54 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/python-swiftclient: Updated from global requirements https://review.openstack.org/89250 | 23:58 |
notmyname | clayg: no, it's on openstack too | 23:58 |
notmyname | https://bugs.launchpad.net/liberasurecode | 23:58 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/swift: Updated from global requirements https://review.openstack.org/88736 | 23:58 |
clayg | yes! | 23:59 |
*** amoralej is now known as amoralej|off | 23:59 |