Thursday, 2022-10-13

opendevreview XuQi proposed openstack/cinder master: Fujitsu Driver: Change the function of attach/detach  https://review.opendev.org/c/openstack/cinder/+/860997 03:40
*** amoralej is now known as amoralej|off 06:33
*** amoralej|off is now known as amoralej 06:37
opendevreview Masayuki Igawa proposed openstack/cinder stable/ussuri: Add warning message about slow volume backend  https://review.opendev.org/c/openstack/cinder/+/861057 07:34
opendevreview Raghavendra Tilay proposed openstack/cinder master: HPE 3PAR: test - please ignore  https://review.opendev.org/c/openstack/cinder/+/861001 10:28
*** amoralej is now known as amoralej|lunch 12:14
opendevreview Tushar Trambak Gite proposed openstack/cinder master: Deleting a volume in 'downloading' state  https://review.opendev.org/c/openstack/cinder/+/826607 12:28
*** amoralej|lunch is now known as amoralej 13:08
hemna so, fwiw, I filed this one yesterday https://bugs.launchpad.net/cinder/+bug/1992493 as a placeholder for some major problems that I face daily with customers that are using our deployment 14:22
hemna customers end up not being able to do basic operations on volumes due to cinder's inability to manage operations where pools are full, but there are many pools with capacity available in the same backend. 14:24
hemna this is turning into a major problem in our deployments.   can't do backups, clones, snapshots, extends 14:24
hemna we have patched cinder to try a migrate on a failed extend as a solution and that has worked 14:25
hemna snapshots are a bit more difficult as cinder doesn't allow migrations for volumes that have existing snaps 14:25
hemna and for backups, the first thing it does is snap the volume (attached) 14:26
hemna which fails 14:26
hemna so most of the cinder deployment is completely inoperative for customers in this state 14:26
jbernard hemna: are the other, non-full, pools considered to be equal to the full one in that particular deployment? 14:38
hemna all pools are available for provisioning in the same backend 14:38
hemna there is plenty of space in the backend, but certain pools are full and can't take any more.  the volumes being backed up, cloned, snapshotted are typically volumes on the pools that are full as they have been around the longest 14:39
hemna those operations shouldn't fail 14:39
jbernard i see, migrating seems like the solution there, could we not also migrate the snaps along with the volume? 14:40
hemna so what ends up happening is the customer complains, files a ticket, then I have to manually migrate the volume in question so they can do the backup/snap/clone/etc 14:40
hemna that process isn't 'cloud' like, nor scalable 14:40
jbernard what happens to the volume's snaps? 14:40
hemna in our case, snaps are full clones 14:40
hemna so, those could also get migrated 14:41
hemna but cinder just quits and says no to everything 14:41
jbernard would an operator complain about an automigrate function? it seems like a really nice-to-have imo 14:41
hemna for example, in one backend we have 52 pools, 22 of which are full. 14:42
hemna I think migrate should be built in to these operations. 14:42
hemna if it fails to find a host for the operation, try to find a host to migrate it to on the same backend 14:43
jbernard in considering an automatic migration, how would you select the target pool? random? 14:43
hemna just send it back through the scheduler and tell it to find a pool in the same backend 14:43
jbernard ahh 14:43
hemna that's what we did for our patch for extend 14:43
jbernard with available capacity 14:43
hemna extend fails because the current pool is full or can't take the new size.   so we ask the scheduler to migrate it with the new size, it picks one and migrates it, then we extend after the migration completes 14:44
hemna cinder should be doing that for all the operations where it can't find a host on a particular backend 14:44
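
[Editor's sketch] The migrate-then-extend fallback hemna describes above can be illustrated roughly as follows. This is a toy model under stated assumptions, not Cinder code and not the sapcc patch: the `Pool` and `Volume` classes, `pick_pool`, and `extend_with_fallback` are hypothetical names, the "scheduler" is reduced to choosing the same-backend pool with the most free space, and migration is modeled as simple capacity bookkeeping. The only detail borrowed from Cinder is the `host@backend#pool` host string layout.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    host: str      # Cinder-style "host@backend#pool" string
    free_gb: int   # free capacity currently reported for this pool


@dataclass
class Volume:
    name: str
    host: str      # pool the volume currently lives on
    size_gb: int


def backend_of(host: str) -> str:
    """Drop the '#pool' suffix, leaving 'host@backend'."""
    return host.split("#", 1)[0]


def pick_pool(volume, pools, needed_gb):
    """Hypothetical stand-in for the scheduler: choose another pool on the
    same backend that can hold needed_gb, preferring the emptiest one."""
    candidates = [
        p for p in pools
        if backend_of(p.host) == backend_of(volume.host)
        and p.host != volume.host
        and p.free_gb >= needed_gb
    ]
    return max(candidates, key=lambda p: p.free_gb, default=None)


def extend_with_fallback(volume, pools, new_size_gb):
    """Extend in place if the pool has room; otherwise migrate within the
    backend to a pool that fits the NEW size, then extend there."""
    if new_size_gb <= volume.size_gb:
        raise ValueError("new size must be larger than the current size")
    current = next(p for p in pools if p.host == volume.host)
    growth = new_size_gb - volume.size_gb
    if current.free_gb < growth:
        # Ask the "scheduler" for a pool sized for the extended volume.
        target = pick_pool(volume, pools, new_size_gb)
        if target is None:
            raise RuntimeError("no pool on this backend can hold the volume")
        # Model the migration: release space on the old pool, use it on the new.
        current.free_gb += volume.size_gb
        target.free_gb -= volume.size_gb
        volume.host = target.host
        current = target
    # Extend only after any migration has completed.
    current.free_gb -= growth
    volume.size_gb = new_size_gb
    return volume
```

Sizing the migration request for the new size, rather than the current one, mirrors the point made in the discussion: the target pool is chosen so that the extend which follows the migration cannot fail for lack of space.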
jbernard that would definitely be more consistent 14:45
hemna otherwise most of the real world operations are broken for users 14:45
hemna I have 20+ deployments of cinder around the world.  doing manual migrations isn't a 'solution' 14:46
jbernard yikes 14:46
jbernard ptg is in a few days, this might be a good time to air this idea across the group 14:47
hemna yah, I'll bring it up, which is also why I filed the bug.  the bug was unfortunately labeled as medium. 14:47
hemna I think it's a high priority.  cinder isn't really cloud like software if human intervention is needed for basic operations. 14:48
jbernard enriquetaso: ^ this might be worth considering for https://bugs.launchpad.net/cinder/+bug/1992493 14:48
hemna I think this was just a big oversight when we enabled pools way back when. 14:48
hemna heh, the bug was actually labeled 'wishlist' ! 14:49
hemna smh 14:49
* enriquetaso reading 14:53
hemna fwiw, this is what we did to fix extend https://github.com/sapcc/cinder/pull/134 14:54
*** dviroel_ is now known as dviroel 14:55
enriquetaso I mentioned 1992493 in yesterday's bug meeting. I'll re-target it to High priority. I think the best idea is to discuss it at the PTG; please add a topic, hemna. 15:02
hemna yes, we need to discuss it.  thank you 15:02
enriquetaso Thanks jbernard hemna 15:03
hemna posted it in the etherpad 15:28
*** tkajinam is now known as Guest2971 15:43
*** amoralej is now known as amoralej|off 16:16
enriquetaso hemna++ 16:42
*** dviroel is now known as dviroel|biab 19:55
*** dviroel|biab is now known as dviroel 20:51
*** dviroel is now known as dviroel|afk 21:33
