jbernard | #startmeeting cinder | 14:01 |
---|---|---|
opendevmeet | Meeting started Wed Dec 11 14:01:31 2024 UTC and is due to finish in 60 minutes. The chair is jbernard. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:01 |
opendevmeet | The meeting name has been set to 'cinder' | 14:01 |
jbernard | #topic roll call | 14:01 |
tosky | o/ | 14:01 |
jungleboyj | o/ | 14:01 |
jbernard | o/ | 14:01 |
simondodsley | o/ | 14:01 |
rosmaita | o/ | 14:01 |
akawai | o/ | 14:01 |
jbernard | #link https://etherpad.opendev.org/p/cinder-epoxy-meetings | 14:01 |
flelain | o/ | 14:01 |
vdhakad | o/ | 14:02 |
whoami-rajat | hey | 14:02 |
sp-bmilanov | o/ | 14:02 |
msaravan | Hi | 14:02 |
jbernard | ok, welcome everyone | 14:04 |
jbernard | #topic announcements | 14:05 |
jbernard | #link https://releases.openstack.org/epoxy/schedule.html | 14:05 |
jbernard | still in M1, M2 is early Jan (6-10) | 14:05 |
jbernard | generally we (or I at least) are focused on consolidating any pending specs in the next days | 14:06 |
jbernard | specifically the dm-clone spec from jan | 14:06 |
jbernard | but there could be something I've missed | 14:06 |
jbernard | otherwise, I hope everyone is having a good December so far | 14:06 |
simondodsley | It's December already!!! | 14:07 |
jungleboyj | Lol. 1/3 of the way through even! | 14:07 |
jbernard | actually we're almost half done! :/ | 14:07 |
simondodsley | must take more notice | 14:07 |
flelain | Happy Advent time to you all! | 14:07 |
jbernard | to you too | 14:08 |
jbernard | speaking of advent... :) | 14:08 |
jbernard | #link https://adventofcode.com/ | 14:08 |
jbernard | ^ this is fun if you don't have anything to do | 14:08 |
jhorstmann | o/ | 14:08 |
jbernard | #topic followup on dm-clone spec | 14:09 |
flelain | lol; got a couple of colleagues of mine already deeply involved in it :) Haven't found time for it so far! | 14:09 |
jbernard | #link https://review.opendev.org/c/openstack/cinder-specs/+/935347/6 | 14:09 |
jbernard | #link https://review.opendev.org/c/openstack/cinder-specs/+/935347 | 14:09 |
simondodsley | I'm reading this spec now and I'm not sure I understand the need, especially if Ceph is deployed | 14:09 |
jbernard | jhorstmann: we're getting some traction on your spec review, wanted to make space available now to raise any issues so far | 14:10 |
jbernard | simondodsley: if you have ceph, my understanding is that dm-clone would not be necessary; this spec rather gives you local storage with the ability to live migrate, something closer to distributed storage without the resource requirements and complexity | 14:11 |
jhorstmann | thank you, happy to answer any questions | 14:11 |
jbernard | jhorstmann: ^ please correct me if im wrong there | 14:11 |
simondodsley | but you are requiring a c-vol service on each compute node - that is more resources. I'm not sure that a c-vol on each hypervisor is a good methodology | 14:12 |
jhorstmann | simondodsley: the idea is to have more options. This is intended for deployments where you either want the performance of local storage or are resource-constrained and do not want to deploy a full storage solution | 14:13 |
jhorstmann | simondodsley: I agree that this is disadvantageous. It is one of the constraints if this is implemented as a pure cinder volume driver | 14:14 |
simondodsley | and how will this work with Nova AZs | 14:14 |
simondodsley | lol - you said pure - please capitalize it ... :) | 14:15 |
jhorstmann | :) | 14:15 |
simondodsley | i feel this may become problematic in core-edge scenarios where different edge nodes have no connectivity to other edge nodes | 14:16 |
whoami-rajat | simondodsley, IIUC it's a k8s over openstack use case where we require local volumes for the etcd service, we use ceph in HCI but i think it doesn't provide the desired latency | 14:16 |
whoami-rajat | simondodsley, that was one of my concerns, here all compute nodes need to be connected via a storage network | 14:17 |
simondodsley | if this is for k8s on openstack, then a CSI driver should be used, rather than a cinder driver, shouldn't it? | 14:17 |
flelain | jhorstmann, about this spec, just to make sure I got it right, does it aim to provide the block storage service directly on the local storage of the compute host running the VM? | 14:17 |
whoami-rajat | simondodsley, I'm not aware of all the details but do you mean cinder CSI driver? won't that issue API calls to cinder only? | 14:20 |
simondodsley | or even a true CNS solution - i have to be careful here as my company sells one of these so I may have to recuse myself from being involved in this | 14:20 |
simondodsley | not necessarily the cinder CSI driver. | 14:20 |
jhorstmann | simondodsley: so regarding the availability zones, you have cinder.cross_az_attach=True as a default, so the AZ is not relevant in that case, correct? If you want to have cross_az_attach=False you have to make sure that they overlap. I am probably missing the exact point of the question | 14:21 |
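For context on the option jhorstmann mentions: cinder.cross_az_attach is a nova setting, configured in nova.conf. A minimal fragment (default shown; see the nova configuration reference for details):

```ini
[cinder]
# Default is true: nova will attach a volume regardless of which AZ it
# lives in. With false, the instance's AZ must match the volume's AZ.
cross_az_attach = true
```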
simondodsley | it's more about DCN deployments | 14:22 |
jbernard | flelain: i believe that is true, I think of it as an LVM deployment, but with the ability to move volumes between compute nodes as instances are migrated | 14:22 |
jhorstmann | flelain: that is correct. The driver will provide storage local to the compute node and offer the possibility to transfer the data on attach and live-migration | 14:22 |
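For readers unfamiliar with the mechanism under discussion: the kernel's dm-clone target is configured with a device-mapper table of roughly the following shape (sizes in 512-byte sectors; the device paths here are placeholders, and the authoritative format is in the kernel's Documentation/admin-guide/device-mapper/dm-clone.rst):

```
# <start> <length> clone <metadata dev> <destination dev> <source dev> <region size>
0 2097152 clone /dev/vg0/meta /dev/vg0/local /dev/mapper/source 8
```

The destination is the local device being hydrated; the source is the remote copy that reads fall back to until hydration completes.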
flelain | then couldn't the instance disk, using the local compute node's storage, do the trick? (w/o being block storage though) | 14:23 |
simondodsley | i don't understand why you are referencing iSCSI or other network-attached storage as the source, because this would mean that external storage is available, and therefore why not just use the cinder driver for that storage? | 14:24 |
jhorstmann | simondodsley, whoami-rajat: yes, if you want to deploy a cloud with this driver you would need some sort of storage network between the compute nodes, but is there a deployment where this does not apply? | 14:26 |
simondodsley | I feel this is just live migration 2.0, using dm-clone instead of dd. | 14:26 |
simondodsley | how will you cater for network outages? does dm-clone have capabilities to automatically recover from these? | 14:27 |
jhorstmann | flelain: the idea is to have the full flexibility of the block storage service, so that you dynamically create volumes of the required size and are not bound to any instance lifecycle | 14:27 |
jhorstmann | simondodsley: the advantage is that using the dm-clone target, writes will always be local to the compute node. So you get local storage performance for writes, where it is most critical for applications like e.g. databases | 14:29 |
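A rough Python model of the read/write semantics jhorstmann describes (purely illustrative; the real dm-clone target tracks hydration per region in a kernel-side metadata device, not like this):

```python
class CloneDevice:
    """Toy model of a dm-clone target: reads fall back to the remote
    source until a region is hydrated; writes always land locally."""

    def __init__(self, source, region_count):
        self.source = source                    # remote (source node) data, per region
        self.local = [None] * region_count      # local destination device
        self.hydrated = [False] * region_count  # per-region hydration bitmap

    def write(self, region, data):
        # Writes go straight to local storage and mark the region
        # hydrated, so write latency is always that of the local disk.
        self.local[region] = data
        self.hydrated[region] = True

    def read(self, region):
        if not self.hydrated[region]:
            # Cold region: fetch from the source node once, keep a local copy.
            self.local[region] = self.source[region]
            self.hydrated[region] = True
        return self.local[region]


dev = CloneDevice(source=["a", "b", "c"], region_count=3)
dev.write(0, "x")       # local write, no network round trip
print(dev.read(0))      # -> x
print(dev.read(1))      # -> b (fetched once from the source)
print(dev.hydrated)     # -> [True, True, False]
```

This is also why a source-node outage only hurts reads of not-yet-hydrated regions, as discussed below.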
simondodsley | but what about a network outage during a migration | 14:29 |
whoami-rajat | the spec mentions all of these details | 14:31 |
whoami-rajat | #link https://review.opendev.org/c/openstack/cinder-specs/+/935347/7/specs/2025.1/add-dmclone-driver.rst | 14:31 |
whoami-rajat | L#528 | 14:31 |
flelain | jhorstmann: gotcha. Interesting concept to be carried on! | 14:31 |
jhorstmann | simondodsley: regarding network outages: yes, you can recover from those. Of course you will get read errors if the source node is not available. Usually this is handled at the filesystem level by remounting read-only. Once you recover the source node you can recover the volume and data will be transferred again | 14:31 |
flelain | jhorstmann I left comments in the spec, thank your for this proposal. | 14:32 |
jhorstmann | flelain: thank you | 14:32 |
rosmaita | also, the presentation from the PTG gives a good overview of the spec: https://janhorstmann.github.io/cinder-driver-dm-clone-presentation/000-toc.html | 14:32 |
jhorstmann | simondodsley: it is not an HA storage solution. If that is a requirement then this driver should not be used. As said before, I see this as an additional storage option, but it cannot replace most existing solutions | 14:34 |
whoami-rajat | i feel it would be best to leave the concerns as comments on the spec since Jan is pretty good and quick at responding on those | 14:37 |
simondodsley | One thing that concerns me is that you cannot extend a volume. This is a minimum requirement for any cinder driver, as defined in the documentation | 14:38 |
whoami-rajat | we can extend it, just not during the migration | 14:38 |
simondodsley | so does it meet all the required core functions? | 14:39 |
simondodsley | https://docs.openstack.org/cinder/latest/contributor/drivers.html | 14:39 |
jhorstmann | simondodsley: yes volume extension is possible when the data is "at rest" (no clone target loaded) | 14:39 |
whoami-rajat | so this is not specifically a new cinder driver but an extension of the LVM driver to provide local attachment + live migration support | 14:40 |
jbernard | these are all good questions, it might be good to have any outstanding concerns captured in the spec, Jan has been good about quick responses | 14:41 |
simondodsley | hmm - the spec doesn't really say that - it says it is a new driver | 14:41 |
simondodsley | Line #20 | 14:41 |
simondodsley | and the title says 'Add' which implies 'new' | 14:42 |
simondodsley | even the spec name is add | 14:42 |
simondodsley | isn't this more of a new replication capability with failover capabilities? | 14:43 |
whoami-rajat | i might need to clarify my understanding there, maybe jhorstmann can help | 14:43 |
whoami-rajat | but jbernard might have other discussion points so we can discuss it at some other place | 14:44 |
whoami-rajat | s/discussion/meeting | 14:44 |
jbernard | on the agenda there are only review requests, this was/is our primary topic for today | 14:44 |
jbernard | unless there is something not on the etherpad | 14:44 |
tosky | sooo, if I can: should we mark revert to snapshot as supported for Generic NFS? I've created https://review.opendev.org/c/openstack/cinder/+/936182 also to gather feedback: it seems the pending issues were addressed. Or is there still any gap that needs to be fixed? | 14:45 |
jbernard | #topic open discussion | 14:46 |
jbernard | simondodsley: those are all great questions, please if you have cycles today put any outstanding ones in the spec so that we have record | 14:46 |
jhorstmann | whoami-rajat, simondodsley: the driver inherits from the LVM driver for basic volume management, but needs to implement some methods differently. I am not sure where the line between extension and new driver is drawn | 14:46 |
jbernard | eharney: re tosky's question, you are frequently in the NFS codez | 14:47 |
whoami-rajat | jhorstmann, so basically if the base driver (LVM) + the new driver methods implement all required functionalities, we should be good | 14:48 |
rosmaita | jhorstmann: i think calling it a new driver is fine, we have a bunch of drivers like that | 14:48 |
eharney | jbernard: yeah i meant to +2 that one -- i'm doing some work in nfs world anyway right now so i'll see if there are any concerns there | 14:48 |
whoami-rajat | rosmaita, +1 | 14:49 |
jbernard | eharney: excellent, thanks | 14:50 |
whoami-rajat | tosky, i see in one of the recent runs of the nfs job, revert is enabled but I'm unable to find any test that exercises that code path | 14:50 |
tosky | whoami-rajat: the tests are in cinder-tempest-plugin; I have a review to enable these tests for NFS, which is blocked on the known NFS format bug | 14:51 |
tosky | let me dig it | 14:51 |
whoami-rajat | tosky, oh okay, though I can vouch it works since i worked on the fix and verified it | 14:52 |
whoami-rajat | (it would still be good to see it in gate run) | 14:52 |
tosky | whoami-rajat: https://review.opendev.org/c/openstack/devstack-plugin-nfs/+/898965 | 14:53 |
tosky | aaand we probably need fresh logs | 14:53 |
whoami-rajat | wait, that doc change might not be accurate | 14:54 |
whoami-rajat | so NFS doesn't support revert to snapshot | 14:54 |
tosky | oh | 14:54 |
whoami-rajat | but when it falls back to the generic revert mechanism, it still fails | 14:54 |
whoami-rajat | which is the scenario that i fixed | 14:54 |
whoami-rajat | so we cannot do driver assisted revert to snapshot but at least the generic code path works with nfs now | 14:55 |
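The distinction whoami-rajat draws can be sketched as follows (names and structure are hypothetical, for illustration only; the real logic lives in cinder's volume manager):

```python
class BackendWithRevert:
    # Hypothetical driver exposing a backend-assisted (optimized) revert.
    def revert_to_snapshot(self, volume, snapshot):
        volume["data"] = snapshot["data"]
        volume["assisted"] = True


class GenericBackend:
    # No revert_to_snapshot method: falls back to the generic copy path,
    # the path that now works for NFS.
    pass


def revert(driver, volume, snapshot):
    # Prefer the driver-assisted revert when the backend implements it;
    # otherwise fall back to the generic mechanism: copy the snapshot
    # data back over the volume.
    fn = getattr(driver, "revert_to_snapshot", None)
    if fn is not None:
        fn(volume, snapshot)
    else:
        volume["data"] = snapshot["data"]


vol = {"data": "new"}
revert(GenericBackend(), vol, {"data": "old"})
print(vol["data"])  # -> old, via the generic copy path
```

Under this reading, the "supported" flag tosky asks about would advertise the assisted path, while the generic path simply has to not break.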
tosky | so that feature is about driver-assisted revert, not just that it "just works"? | 14:55 |
tosky | but in fact having the tests enabled wouldn't hurt | 14:55 |
tosky | still it may need to be written down in a proper way somehow | 14:56 |
tosky | or just decided that yes, revert to snapshot works, just not optimized | 14:56 |
whoami-rajat | i think everything is expected to work in generic workflow but NFS proved us wrong :D | 14:56 |
tosky | up to whatever you people decide :) | 14:56 |
whoami-rajat | right, it is certainly an improvement over before, when we couldn't even do revert with NFS | 14:56 |
whoami-rajat | which also led to some complaints in the past | 14:57 |
jbernard | ok, last call | 14:59 |
whoami-rajat | I've left a -1 on this change but we can add it in some other doc place | 14:59 |
jbernard | #endmeeting | 15:00 |
opendevmeet | Meeting ended Wed Dec 11 15:00:00 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/cinder/2024/cinder.2024-12-11-14.01.html | 15:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/cinder/2024/cinder.2024-12-11-14.01.txt | 15:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/cinder/2024/cinder.2024-12-11-14.01.log.html | 15:00 |
jbernard | thanks everyone | 15:00 |
jbernard | have a good week | 15:00 |
jungleboyj | Thanks! | 15:00 |
sp-bmilanov | thanks! | 15:00 |
whoami-rajat | thanks! | 15:01 |