Tuesday, 2016-01-05

*** zengyingzhe_ has joined #openstack-smaug  00:05
*** zengyingzhe_ has quit IRC  00:19
*** zengyingzhe has joined #openstack-smaug  00:19
*** saggi has quit IRC  02:14
<openstackgerrit> Yingzhe Zeng proposed openstack/smaug: Proposed Smaug API v1.0  https://review.openstack.org/244756  02:17
*** saggi has joined #openstack-smaug  02:27
*** zengyingzhe has quit IRC  02:28
<openstackgerrit> zengchen proposed openstack/smaug: schedule service design  https://review.openstack.org/262649  03:42
<openstackgerrit> zengchen proposed openstack/smaug: operation engine design  https://review.openstack.org/262649  03:44
*** WANG_Feng has quit IRC  06:19
*** WANG_Feng has joined #openstack-smaug  06:19
*** CrayZee has quit IRC  07:43
*** gampel has joined #openstack-smaug  07:55
*** zengyingzhe has joined #openstack-smaug  08:14
*** c00281451 has joined #openstack-smaug  08:57
*** c00281451 is now known as chenzeng  08:58
<chenzeng> saggi: please review the bp:operation-engine-design in your free time. I hope for your feedback. Thanks. https://review.openstack.org/#/c/262649  09:00
<saggi> chenzeng: I will  09:01
<chenzeng> saggi: thanks.  09:01
<gampel> saggi: did you update everyone about the IRC meeting set for next week?  09:27
<zengyingzhe> I've told the team in China.  09:39
<yinwei> the meeting is next week?  09:40
<yinwei> I thought it was tonight  09:40
<zengyingzhe> And updated the meeting notice at https://wiki.openstack.org/wiki/Meetings/smaug  09:42
<zengyingzhe> yinwei, every even week.  09:43
<yinwei> saggi, I thought about the lease scenario today. In fact, what we need to solve is two cases: 1. delete unfinished zombie checkpoints, but not checkpoints actively being protected; 2. delete finished checkpoints, but not checkpoints currently being restored;  09:44
<yinwei> Both cases can be divided into two categories: delete/write or delete/read in the same site, or across different sites;  09:45
<yinwei> if they are from the same site, then a lease should be enough to synchronize;  09:45
<yinwei> if they are initiated in parallel across different sites, a delete based on a lease may need to wait long enough;  09:47
<yinwei> for the question of who will do the cleanup work, I suggest that normally one site has a single GC instance checking the checkpoints created by that site; here we need to build another index to list the checkpoints of one site.  09:50
<saggi> yinwei, actually shortly after you left I found a very simple solution. When you delete something you need to make sure you only mark a section as deleted if the delete was complete; if you have any issues you stop the delete. That way if we have two deletes running at once, they will stop once they start handling the same resource. Then, in the next GC cycle, they will continue where they left off as if there had been a crash.  09:50
<saggi> That way we don't need locking at all. We just need to make sure our algorithms are crash safe. Which we have to do anyway.  09:51
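The lock-free, crash-safe delete saggi describes could be sketched roughly as below. Everything here is an illustrative toy: the bank interface (`put_if_absent`, `put`) and the marker-key layout are assumptions, not Smaug's actual API. The key idea is that a section is marked deleted only after its delete fully completed, and a deleter that detects another deleter (or a crashed earlier run) on the same section simply stops and lets the next GC cycle resume.

```python
class CollisionError(Exception):
    """Raised when another deleter already claimed this section."""


class InMemoryBank:
    """Toy stand-in for the geo-replicated bank (hypothetical API)."""

    def __init__(self):
        self.data = {}

    def put_if_absent(self, key, value):
        # Atomic test-and-set; a real bank would need a CAS-style primitive.
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def put(self, key, value):
        self.data[key] = value


def delete_checkpoint(bank, checkpoint_id, sections):
    for section in sections:
        marker = "%s/%s.state" % (checkpoint_id, section)
        if not bank.put_if_absent(marker, "deleting"):
            # Collision with another deleter (or leftovers of a crashed
            # run): stop here; the next GC cycle resumes where we left off,
            # exactly as it would after a crash. No locking needed.
            raise CollisionError(section)
        # ... delete the section's actual data here ...
        # Mark the section deleted only once the delete fully completed.
        bank.put(marker, "deleted")


bank = InMemoryBank()
delete_checkpoint(bank, "cp1", ["vol1", "vol2"])
```

A second `delete_checkpoint(bank, "cp1", ...)` call would hit the existing markers and raise `CollisionError`, which is the "stop and retry next cycle" behavior rather than an error condition.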
<yinwei> actually, we could make use of the fact that, within the same site, keys are updated synchronously. So inside one site, each service instance could compete for the root lease to run the GC, which only checks the checkpoints of this site.  09:53
<yinwei> When the site fails, the admin notifies the other site that site A has failed, and it GCs all unfinished checkpoints left by site A.  09:54
<yinwei> I don't think competing for the root lease across sites will work, since that is the same kind of issue: the other site may see the root lease later.  09:55
<saggi> That is why I suggested abandoning the root lease. Just allow deletes anywhere. We just fail if we detect a collision and restart some other time.  09:57
<yinwei> But inside one site, root lease competition works. This would ensure one GC per site, where we don't need to think about GC workload partitioning...  09:58
<yinwei> across sites, where we don't have any strongly consistent cluster to notify of site failure, we could choose a manual way: let the admin notify that all unfinished checkpoints left by the failed site should be cleaned up  09:59
<saggi> how do you know which is the "master" site?  09:59
<yinwei> there's no master site  10:00
<yinwei> I suppose smaug should have an alert mechanism or a manual mechanism so the admin knows which site failed  10:00
<saggi> how would you implement this?  10:00
<saggi> ?  10:00
<saggi> yinwei, I've got to go, be back in 45 minutes.  10:01
<yinwei> me too  10:01
<yinwei> ping you later at night  10:01
<yinwei> I have another question to ask you: shall we grant a lease for reads? like restore  10:02
<yinwei> so let's talk later  10:02
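yinwei's per-site election scheme, where each service instance competes for a site-local root lease and the winner runs GC over that site's own checkpoints, could be sketched like this. The `LeaseStore` and all names are hypothetical stand-ins; the only property relied on is the one stated above, that key updates within one site are synchronous (so test-and-set is safe there).

```python
class LeaseStore:
    """Toy synchronous key store standing in for the site-local bank."""

    def __init__(self):
        self.leases = {}  # key -> (owner, expiry)

    def acquire(self, key, owner, ttl, now):
        holder, expiry = self.leases.get(key, (None, 0))
        if holder in (None, owner) or expiry <= now:
            # Free, expired, or already ours: (re)take the lease.
            self.leases[key] = (owner, now + ttl)
            return True
        return False


def run_gc_if_leader(store, site, instance_id, now):
    # Only the instance holding this site's root lease collects garbage,
    # so each site has exactly one GC and no workload partitioning is
    # needed. Losers simply try again on their next cycle.
    if not store.acquire("gc-root-lease/%s" % site, instance_id, ttl=30, now=now):
        return False
    # ... scan this site's own checkpoint index and delete zombies ...
    return True
```

Note that the same scheme is exactly what does not work across sites: a remote site may observe the lease key late, which is why the cross-site case falls back to collision detection or manual notification in the discussion above.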
<gampel> Hi yinwei  11:42
*** wei___ has joined #openstack-smaug  13:18
<wei___> hi, saggi  13:18
<saggi> wei___: hi  13:18
<saggi> wei___: how's it going  13:19
<wei___> fine, thanks  13:19
<wei___> are you free to talk?  13:19
<saggi> wei___: yes  13:20
<wei___> so what I propose is simple: since we don't have a centralized arbitration service, we let the admin notify one of the alive sites that another site has failed. Here we provide an API, like notify_site_failure; the site being notified then cleans up the garbage of the failed site.  13:21
<wei___> for the failed site, all unfinished checkpoints are garbage.  13:21
<saggi> That means the system is no longer self-correcting  13:21
<saggi> you always need someone to take care of it  13:22
<wei___> not really. for smaug, what it manages is separate openstacks which are only connected by the geo-replicated bank  13:23
<wei___> the admin should be aware of how many sites are managed by smaug  13:24
<saggi> wei___: So what you are saying is that we have the admin force one site to do the GC, and moving it safely is the admin's responsibility.  13:29
<wei___> thinking about it in detail, we have no control over the behavior of plugins. So each plugin may back up to different sites, say, vol1 backed up to site b, vol2 backed up to site c. Even if site b is able to delete a checkpoint located in the bank, it may not be able to delete the backup of vol2 located in site c. I mean, if the two backup backends are different, you can't have a backup driver delete a resource that doesn't belong to its own backend.  13:31
<wei___> It seems that the admin or some other service should be aware of all sites (openstacks) managed by smaug, and notify all parties that one site has failed and each one should delete the garbage produced by that failed site  13:33
<wei___> It seems we need another component to manage adding/deleting sites. I'm not sure if there are more scenarios that require such a service, but I feel like yes.  13:34
<saggi> A provider should be able to back up fully from the site you want to back up and restore fully from the site you want to restore.  13:35
<saggi> There is no other way to go about it  13:35
<wei___> but different tenants could pick different providers, right?  13:36
<saggi> yes  13:36
<saggi> of course  13:36
<wei___> do we only allow all providers to back up to one site?  13:36
<wei___> or does smaug only manage two sites?  13:37
<saggi> We don't limit it. But it doesn't make any sense, because if a site goes down you will only have partial data. I'd assume you would want to back up everything fully to one or more sites.  13:37
<saggi> It's the provider's job. If we back up the volumes to swift, then swift does the geo-replication.  13:38
<wei___> but one site is shared by multiple tenants; you can't limit tenant behavior  13:38
<wei___> say, tenant a chose provider a, tenant b chose provider b, where provider a backs up to site2 and provider b backs up to site3  13:39
<saggi> But this is OK  13:39
<wei___> when site1 fails, garbage is located in site b and site c  13:39
<saggi> you could also have a provider that backs up to both site2 and site3 at once  13:40
<saggi> depends on configuration  13:40
<wei___> by garbage I don't mean the checkpoint only, but also the backup data  13:40
<saggi> exactly  13:40
<wei___> in this case, you couldn't have site b delete garbage located in site c  13:41
<wei___> you have to notify site b to clean up the garbage left in its own site, and notify site c to clean up its own. site a may know how to clean up both, but it has failed, so you can't count on it. actually, when I worked on a distributed storage system, we always had a central configuration service to synchronize each site's status, and even to configure site replication pairs. I'm not sure why smaug doesn't need this, since it works like a distributed system.  13:46
<saggi> The bank is supposed to store the configuration  13:54
<saggi> If some storage needs any synchronization it should be handled outside.  13:55
<saggi> wei___:  13:55
<saggi> wei___: This is because actual distributed storage has its own configuration server. It can't use Smaug's.  13:59
<gampel> wei___: hi, are you still here?  14:12
*** wei___ has quit IRC  14:13
*** wei__ has joined #openstack-smaug  14:23
<wei__> hi  14:23
<gampel> hi  14:23
<gampel> hi, I will be in shenzhen on the 18th  14:24
<wei__> cool  14:24
<wei__> then we can talk face to face  14:24
<wei__> will saggi come with you?  14:24
<gampel> I will be there on the 18th and 19th  14:24
<saggi> wei__: no  14:25
<gampel> no, ayal and me  14:25
<wei__> ok  14:25
<wei__> welcome!  14:25
<gampel> I will be in chengdu next week and then in shenzhen  14:25
<wei__> saggi, welcome to china next time!  14:26
<wei__> yes, I heard the news, so lili agreed that if you guys won't come to shenzhen, then the three of us would come to chengdu.  14:26
<gampel> So it's up to you; it seems that I will be in shenzhen, so we could meet there  14:27
<wei__> hmm, shenzhen is good for me  14:28
<wei__> :)  14:29
<gampel> Ok great  14:30
<wei__> so for my last question: another case is restore. If site a fails, a tenant needs to restore a checkpoint created by a in any of the sites managed by smaug, since we suppose the sites are independent. If a tenant has to do that kind of work for restore, we could also have the tenant recycle the garbage left by the failed site in each site.  14:32
<gampel> wei__: do you think we can arrange a meet-up for Smaug in shenzhen?  14:32
<wei__> is anyone familiar with how to organize a meet-up in openstack?  14:33
<wei__> I will ask my colleagues tomorrow  14:33
<gampel> I am not sure how it is done in shenzhen, but I can ask the guys in hangzhou; we are doing a meet-up for dragonflow there next week  14:34
<wei__> to see if anyone here in shenzhen has that kind of experience.  14:34
<wei__> ok, pls give me their names  14:34
<gampel> Ok I will  14:34
<wei__> hmm, maybe chaoyi knows. Let me check tomorrow.  14:35
<wei__> time to go to bed  14:35
<wei__> bye, guys  14:35
<gampel> wei__: bye, talk to you tomorrow  14:36
*** wei__ has quit IRC  14:36
*** gampel has quit IRC  15:23
<openstackgerrit> Saggi Mizrahi proposed openstack/smaug: Pluggable protection provider doc  https://review.openstack.org/262264  15:29
<openstackgerrit> Saggi Mizrahi proposed openstack/smaug: Proposed Smaug API v1.0  https://review.openstack.org/244756  16:17
*** wei__ has joined #openstack-smaug  17:37
*** wei__ has quit IRC  17:41
*** gampel has joined #openstack-smaug  19:12
*** gampel1 has joined #openstack-smaug  20:02
*** gampel has quit IRC  20:03

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!