opendevreview | Goutham Pacha Ravi proposed openstack/manila master: RBAC: Enable "new" defaults and scope checks https://review.opendev.org/c/openstack/manila/+/916025 | 00:16 |
---|---|---|
carthaca | gouthamr: thanks for updating the milestones. There is a typo in the name though: https://launchpad.net/manila/+milestone/dalmation-1 should be 'dalmatian-1' ;) (same for '-2' and '-rc1') | 07:15 |
gouthamr | gah thanks for noticing this carthaca; fixed them | 08:15 |
opendevreview | Maurice Escher proposed openstack/manila master: NetApp cDOT only modify qos policy if throughput changed https://review.opendev.org/c/openstack/manila/+/916061 | 08:35 |
opendevreview | kiran pawar proposed openstack/manila master: Add share type encryption https://review.opendev.org/c/openstack/manila/+/909977 | 16:45 |
opendevreview | kiran pawar proposed openstack/manila master: Use encryption key id during share create https://review.opendev.org/c/openstack/manila/+/911089 | 16:45 |
ganso | gouthamr: ping | 17:29 |
gouthamr | hey ganso | 17:36 |
ganso | gouthamr: hey, I am wondering whether you have ever faced a situation in production where you have 3 manila-share services with the same backend configured, pointing to the same storage box (like a ceph cluster or netapp box), and one manila-share host goes down: the shares it is "owning" become inoperable (cannot extend, add/remove access rules, delete, etc.) despite the fact that the other | 17:38 |
ganso | 2 manila-share hosts could perfectly perform operations on the shares that were registered in the database as being owned by the host that went down | 17:39 |
ganso | gouthamr: cinder solves this with a cluster config option which makes cinder aware that the hosts are a cluster pointing to the same box and any of them can service requests and creates like a cluster@backend#pool entry instead of hostA@backend#pool, hostB@backend#pool etc | 17:40 |
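For context, the Cinder mechanism ganso describes is the `cluster` option in `cinder.conf`: every cinder-volume service configured with the same cluster name is treated as one logical service, and resources are recorded under `cluster@backend#pool` instead of a per-host string. A minimal sketch, with illustrative cluster and backend names:

```ini
[DEFAULT]
# Identical on every cinder-volume node that points at the same box;
# resources get tracked as ha-cluster@netapp#pool.
cluster = ha-cluster
enabled_backends = netapp

[netapp]
# ... backend connection options, identical on each node ...
```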
gouthamr | ganso: yes; there's no similar concept here.. but, does the host string not match right now? | 17:40 |
ganso | what do you mean by matching? were they supposed to match? | 17:41 |
gouthamr | ganso: so yes; this kind of HA with the manila-share service really is only possible if the host and backend names match exactly - which is terrible in terms of observability, but will solve the problem of an active service taking over the load from a backend that's down - i.e., manila will consider all matching hosts as the same host | 17:42 |
gouthamr | ganso: curious, what backend is your environment using? | 17:43 |
ganso | gouthamr: oh yea, so assuming you configure the host config option in each manila-share service to the same name, you achieve that behavior. But if you don't, you end up with the behavior I previously described. So is setting the host config to the same value across nodes acceptable? In cinder this can potentially create side effects and is not recommended | 17:44 |
ganso | gouthamr: I am using NetApp | 17:44 |
gouthamr | ganso: yes; that'd be it.. | 17:44 |
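The workaround gouthamr confirms here is Manila's `host` option: if every manila-share node sets the same `host` string (and identical backend sections), shares are recorded under one `host@backend#pool` and any node can service them. A minimal sketch of the idea, with illustrative names, and subject to the driver-safety caveats discussed below:

```ini
[DEFAULT]
# Identical on all three manila-share nodes, so shares are recorded as
# sharedhost@netapp1#pool regardless of which node created them.
host = sharedhost
enabled_share_backends = netapp1

[netapp1]
share_backend_name = netapp1
share_driver = manila.share.drivers.netapp.common.NetAppDriver
# ... identical NetApp connection options on every node ...
```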
ganso | gouthamr: I am a bit worried that internally either Manila or NetApp could face issues with such configs. Has this scenario been tested in production for the NetApp box, such as for replication, consistency groups, migrations, etc? | 17:45 |
gouthamr | ganso: i ask because backend driver maintainers aren't testing this; and there may be race conditions - for example, the NetApp backend has some periodic tasks - if multiple copies of these tasks run, i wonder if there's good error handling to prevent something adverse from happening | 17:45 |
ganso | gouthamr: yea exactly. Sounds like some unpredictable things could happen | 17:46 |
ganso | gouthamr: Given I am completely unaware of whether this is considered safe, I am wondering whether this has ever worked safely in production | 17:46 |
gouthamr | ganso: i've heard of some users in the wild; but, with the number of backends we have, and their various quirks, we'll certainly have issues.. there's no tests to validate this in the gate for instance | 17:47 |
ganso | gouthamr: it kinda sucks to transition to or back from that same-host-name setup, because once you have resources created you cannot transition anymore | 17:48 |
ganso | well, cannot *easily* transition | 17:49 |
ganso | gouthamr: but yea, too bad driver maintainers are not testing this | 17:50 |
ganso | gouthamr: another downside specific to DHSS=True is the insane number of share servers... every time a new share lands on a different host, even in the same share network, a new share server is created and 2 more ports are used... scalability seems pretty bad just as a consequence of this | 17:52 |
gouthamr | ganso: you can fix up the host if want; "manila-manage share update-host" | 17:54 |
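For reference, a hedged example of the `manila-manage` command gouthamr points to, which rewrites the host recorded on existing shares. The host strings below are illustrative, and flag spelling varies by release, so verify with `manila-manage share update_host --help` on yours:

```console
# Re-home shares recorded under hostA onto the shared host string
$ manila-manage share update_host \
    --currenthost hostA@netapp1#pool1 \
    --newhost sharedhost@netapp1#pool1
```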
ganso | gouthamr: I suspect that even if we had an equivalent to Cinder's cluster config in manila, we would still not have gate testing nor vendor validation, and everyone would be using it at their own risk | 17:54 |
ganso | gouthamr: in case of a node failure, yes, but as a way to prevent the creation of share servers it is not scalable. We would need to customize the scheduling to prefer the same host | 17:55 |
gouthamr | ganso: true; we could make it _easier_ to test by making the devstack setup more capable - i.e., if you're running multinode (or even running multiple services on the same node), devstack can set all this up | 17:56 |
gouthamr | ganso: i wonder if what carthaca brought up at PTG would be relevant to you | 17:57 |
gouthamr | ganso: https://etherpad.opendev.org/p/dalmatian-ptg-manila-sap | 17:57 |
gouthamr | ganso: he's proposing enhancements to the PoolWeigher for cases where you want to take the share network into account when stacking/spreading shares across share servers | 17:58 |
gouthamr | ganso: in your case of course, that solution has to be coupled with manila realizing that all those hosts are the same | 17:59 |
gouthamr | the netapp driver could be smarter about this too | 18:00 |
gouthamr | (i think) | 18:00 |
ganso | gouthamr: yes! I have just been thinking of tuning the scheduler to mitigate the problem of creating share servers unnecessarily, like making it prefer the same host | 18:00 |
ganso | but in carthaca's proposal it is optimizing it further at the share network level as well | 18:01 |
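To make the scheduler idea concrete: Manila weighers subclass `BaseHostWeigher` and implement `_weigh_object`, so an affinity tweak of the kind ganso describes could look like the sketch below. This is illustrative only, not carthaca's actual PoolWeigher proposal; the share-server lookup and the keys read from `weight_properties` are assumptions:

```python
# Hypothetical sketch of a share-network affinity weigher. The DB lookup
# and the weight_properties keys are assumptions, not the real PoolWeigher
# change proposed at the PTG.
from manila.db import api as db_api
from manila.scheduler.weighers import base_host


class ShareNetworkAffinityWeigher(base_host.BaseHostWeigher):
    """Prefer hosts that already hold a share server for the share network."""

    def _weigh_object(self, host_state, weight_properties):
        context = weight_properties.get('context')
        share_network_id = weight_properties.get('share_network_id')
        if not (context and share_network_id):
            return 0.0
        # Placeholder lookup: whatever API your release exposes for
        # listing share servers on a given host.
        servers = db_api.share_server_get_all_by_host(
            context, host_state.host)
        if any(getattr(s, 'share_network_id', None) == share_network_id
               for s in servers):
            return 1.0  # stack new shares onto the existing server
        return 0.0
```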
ganso | gouthamr: hmmm, I am not sure the NetApp driver could do anything about it, because once the driver running on host A accepts the request to create the share, it owns it; I am not sure you can return hostname info like "cluster@netapp#pool" | 18:02 |
gouthamr | ganso: yeah, true... i was thinking the driver can discern the fact that it's the same ONTAP, and the same aggr, and so on, and hence reuse the server when the share manager asks it for a share server.. but, we record the host info in the share server as well.. so it'll get confusing real fast | 18:04 |
tspyderboy[m] | gouthamr:... (full message at <https://matrix.org/_matrix/media/v3/download/matrix.org/TnRkfALnMvDgRDxZOLrRMwlv>) | 18:10 |
ganso | gouthamr: right, I forgot the driver has a say on whether a share_server is needed or not, but if I recall correctly manila still manages the host#backend value | 18:17 |
ganso | gouthamr: thanks for all the tips! | 18:18 |
ganso | I gotta go now | 18:18 |
gouthamr | ganso: yes! thanks for checking in here; and let me know what direction you take with this.. it'll likely be of interest to other operators, and a good gap to close | 18:18 |
gouthamr | tspyderboy[m]: o/ ack | 18:19 |
opendevreview | Elvis Kobi Acheampong proposed openstack/manila master: Adds "usedforsecurity=False" to veritas drivers https://review.opendev.org/c/openstack/manila/+/914400 | 18:36 |
opendevreview | Skylar Markegard proposed openstack/manila master: Enable Bandit testing in Manila https://review.opendev.org/c/openstack/manila/+/908191 | 18:48 |
opendevreview | Goutham Pacha Ravi proposed openstack/devstack-plugin-ceph master: Standalone nfs-ganesha with cephadm deployment https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/915212 | 23:10 |