Monday, 2023-04-17

opendevreviewMerged openstack/nova master: db: Remove legacy migrations  https://review.opendev.org/c/openstack/nova/+/87242801:08
Ugglasean-k-mooney, gibi, bauzas I'm facing an issue. I can mount a share on the compute if I'am an admin, but not a regular user. I think it is something around privsep but I did not figure what it could be. Any idea what could be the issue ?09:05
gibiUggla: how this fails?09:06
UgglaI receive this error as a regular user : mount.nfs: mounting 192.168.122.76:/opt/stack/data/manila/mnt/share-7643c9cb-bf52-4263-b048-2b3589451cd9 failed, reason given by server: No such file or directory\n'09:07
Ugglahowever the directory exists.09:07
Ugglaas an example using the mount -t nfs 192.168.122.76:/opt/stack/data/manila/mnt/share-7643c9cb-bf52-4263-b048-2b3589451cd9 /opt/stack/data/nova/mnt/57c1c5f49444e0ccdcd1d46b5d640827 command as root is ok.09:08
opendevreviewStephen Finucane proposed openstack/nova master: doc: Update version info  https://review.opendev.org/c/openstack/nova/+/88061409:09
Ugglaas root09:09
Ugglagibi, https://paste.opendev.org/show/bypnmFjxs469LqyF1nsB/09:15
bauzasUggla: I'm unclear, is the mount failing ?09:16
Ugglabauzas, yes the mount is failing if I am an openstack "regular" user (demo). Not if I am an admin.09:18
bauzasI see09:18
bauzaswhich service is doing the mount ? 09:19
bauzasnova-compute right?09:19
Ugglabauzas, compute09:19
Ugglayes09:19
* bauzas needs to look at your driver code09:19
bauzassec09:19
gibiit is strange the mount call does not take the ctx so I'm not sure how this code can depend on the user token09:20
bauzasI'm confused, I'm trying to see where we internally syscall mount09:23
Ugglagibi, hopefully I have a witness (artom) prooving I'm not totally mad. ;)09:23
bauzasah, found it09:24
bauzasin https://review.opendev.org/c/openstack/nova/+/833090/29/nova/virt/libvirt/driver.py#411509:24
bauzasyou're calling nfs.LibvirtNFSVolumeDriver with connect_volume()09:25
bauzasUggla: amirite ?09:25
Ugglayes09:26
Ugglathen it is calling mount --> privsep --> fs09:26
bauzashttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/volume/fs.py#L11309:27
bauzasfound it09:27
bauzashttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/volume/mount.py#L262 this is what's eventually called09:28
bauzaswhich indeed calls a prevsep helper https://github.com/openstack/nova/blob/master/nova/virt/libvirt/volume/mount.py#L30809:29
bauzasand yeah look https://github.com/openstack/nova/blob/master/nova/privsep/fs.py#L3009:29
bauzasoh wait09:33
Ugglabauzas, yes and as gibi said there is no link with the context. That's the reason why I don't understand the issue.09:33
bauzasUggla: as a reminder, the privsep helper doesn't run as root09:33
bauzasit runs as the nova user with sudo rights, which is different09:34
bauzasiirc, rootwrap only runs privsep with https://github.com/openstack/nova/blob/master/etc/nova/rootwrap.d/compute.filters09:36
bauzasand then the privsep client connects to the privsep socket09:37
bauzasthat's why I'm suspecting you miss some privsep capability09:37
Ugglabauzas, hum you lost me. As privsep is a replacement for rootwrap in my mind.09:39
bauzasUggla: privsep is a deamon https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/daemon.py09:41
bauzasand iirc, we start it thru rootwrap https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/daemon.py#L3209:41
bauzasthen, the privsep client connects to the deamon which has escalated rights09:42
gibibauzas: if the same sequence works with and admin user token but not with the demo user token then I don't think this is a privsep capability issue but I have no better idea :/09:42
bauzasgibi: I haven't understand the same09:42
bauzasI thought Uggla was saying that the mount wasn't working with a nova user, but a mount command was working fine with sudo righrs09:43
bauzashence me thinking this was about a missing cap09:43
bauzasbut agreed, if an admin API call can mount thru nova, then yeah, this is a keystone context problem and not a privsep one09:44
Ugglabauzas, the sudo mount command was just to prove that it is not an issue with missing directory and that mount.nfs is possible on the host. And it works fine as an os admin and not as a os user.09:46
bauzasohok09:46
bauzasmy bad then09:46
bauzassoooo, pdb it09:47
stephenfinsqlalchemy-migrate is finally gone. Whoop! 🥳09:48
bauzasUggla: have you checked that all the attributes for the privsep.mount call are the same, between a standard user and an admin user?09:48
stephenfinNow to get the other projects to do the same09:48
stephenfinThanks for the reviews :)09:48
bauzasstephenfin: kudos for the win.09:48
stephenfinI hope nova jobs will start passing here shortly https://review.opendev.org/c/openstack/requirements/+/87974309:49
Ugglabauzas, unless if I'm wrong we are calling the same.09:49
Ugglabauzas, if you wish I can show you a demo in the beginning of the afternoon.09:50
bauzasUggla: I've seen a couple of log debug messages, should be quick to verify09:50
bauzasUggla: sure09:50
songwenpingbauzas, sean-k-mooney: hi guys, i have attach nvidia gpu(v100/A100) to the vm with 'virsh attach-device --persistent --live' , i can find the gpu in the vm with lspci, but nvidia-smi cnanot find the gpu, with the error 'NVRM: GPU 0000:05:00.0: RmInitAdapter failed! ', do you have any advices?10:26
opendevreviewRajesh Tailor proposed openstack/nova master: Fix case-sensitivity for metadata keys  https://review.opendev.org/c/openstack/nova/+/87390111:29
bauzassongwenping: this is unfortunately not a nova issue, IMHO12:07
bauzasif you can list the vgpu by lcpci, it looks to me a nvidia driver issue12:08
songwenpingbauzas: right, but where can we find some doc to prove it's nvidia driver issue?12:14
opendevreviewStephen Finucane proposed openstack/placement master: db: Replace use of deprecated API  https://review.opendev.org/c/openstack/placement/+/88062312:24
opendevreviewStephen Finucane proposed openstack/placement master: tests: Use base class for all functional tests  https://review.opendev.org/c/openstack/placement/+/88062412:24
opendevreviewStephen Finucane proposed openstack/placement master: tests: Warn on *any* SAWarning warning  https://review.opendev.org/c/openstack/placement/+/88062512:24
Ugglagibi, bauzas I have progressed a little. In fact I was fooled. The mount issue is not linked to user vs admin. But the first to the api is failing, but the second one is working, and this is not linked to timing.12:31
Ugglas/first/first call/12:32
bauzasUggla: let's discuss this directly by gmeet if you want ;)12:46
Ugglaok12:47
Ugglacleaning my env and calling you12:47
bauzassongwenping: honestly, nova just adds a mdev on the libvirt XML https://libvirt.org/drvnodedev.html#mediated-devices-mdevs12:48
bauzasUggla: ok12:48
artomUggla, oh, the privsep mount thing13:22
artomYeah, it's super weird, I am indeed a witness13:22
opendevreviewDan Smith proposed openstack/nova master: Remove silent failure to find a node on rebuild  https://review.opendev.org/c/openstack/nova/+/88063213:41
opendevreviewDan Smith proposed openstack/nova master: Stop ignoring missing compute nodes in claims  https://review.opendev.org/c/openstack/nova/+/88063313:41
dansmithbauzas: I pulled the RT stuff out into a separate set^13:41
bauzasdansmith: on a long call with Uggla but okay, I'll try13:42
dansmithand fixed another silent failure we ignore in evacuate13:42
bauzasgibi: dansmith: sean-k-mooney: fyi, we discussed with Uggla about his series and we discovered a potential large leak for ACLs in Manila access rights14:21
gibiack14:21
bauzasgibi: dansmith: sean-k-mooney: when adding an access-allow for the share, we pass the compute IP address to Manila14:21
dansmithwhich is visible by the user?14:21
bauzasthen the IP address can be seen by any user in the same project, and also any user can delete this ACL14:21
dansmiththat'd be bad14:22
bauzasso we need to tell the Manila folks to somehow hide those details14:22
dansmithyeah14:22
bauzasif the ACL is done by a service, it shouldn't be seen by an enduser, neither be able to delete it14:23
bauzasif we ask Manila to lock a share, that's a different concern14:23
sean-k-mooneybauzas: this is partly why i wanted to use the cert auth method not the ip one14:23
bauzasbecause users can delete ACLs without unlocking14:23
dansmith...yup14:24
bauzassean-k-mooney: it should also leak the certificate, right?14:24
bauzasand users could also delete the ACL14:24
bauzasthere are two problems to resolve : 1/ we need to hide the ACL, 2/ we need to make sure it's not possible to delete it by an user14:25
Ugglasean-k-mooney, also today you can do a manila show share that will reveal IP + export location14:25
sean-k-mooneythe cert would be a per vm cert generted for that insntace14:25
bauzaswe == Manila API14:25
sean-k-mooneyand it woudl only be the public key14:25
sean-k-mooneyso i dont think we care if that is visable14:25
dansmithdoes nova use only the user's token to talk to mania?14:26
bauzassean-k-mooney: then we would need to extend the lock mechanism to deny any ACL modification14:26
dansmith*manila14:26
sean-k-mooneyhttps://docs.openstack.org/api-ref/shared-file-system/?expanded=grant-access-detail#grant-access14:26
bauzasdansmith: we tested this also with admin 14:26
dansmithI'm asking14:27
sean-k-mooneyi would prefer to use cert or user for auth then ip14:27
bauzasdansmith: if you're an admin, you can generate a share and add an ACL 14:27
bauzasdansmith: but any user from the same project can both see the share and delete the ACL you created14:27
dansmithbauzas: can you answer my question?14:27
dansmithdoes nova use the user's token (only) to talk to manila?14:28
sean-k-mooneydansmith: yes i belive we use only the user token in the current proposal14:28
bauzasdansmith: IIRC today yes but Uggla knows better about it14:28
dansmithack14:28
sean-k-mooneybut we could add an admin token if we needed too. i dont think we use any admin api today14:28
dansmithwe have a lot of different places where we've done that in the past and it has come back to bite us14:28
sean-k-mooneybut we might for the lock api 14:28
bauzaswe discussed this at the PTG, we could use a service token if Manila adds it14:28
bauzasbut again, we missed at the PTG the ACL problem14:28
dansmithcinder, glance, etc.. so perhaps we should not replicate the same thing here14:28
sean-k-mooneybauzas: the probelm is leakign the ip/subnet of the nova comptue host14:29
bauzassince anyone from the same project can read anything about shares, then a service token needs to be related to a specific different project14:29
sean-k-mooneythe fix for that is to use a differnt grant type14:29
dansmithI know that even if nova uses admin or service, manila probably needs extra work to not allow a user to delete/see that thing14:29
dansmithbut I'm saying.. using the user's token puts us at a disadvantage for actually filtering those things14:29
dansmithsean-k-mooney: that may be better anyway, but I'm saying it might be a good idea to not go down this same road again14:30
gibiso when nova adds the compute IP to a request sent to manila with the user token, nova basically leaked the IP of the compute to that user (and project) 14:30
bauzassean-k-mooney: that's not only an IP leak problem, this is also a security problem since a malicious user can block the mount by deleting the ACL14:30
Ugglasean-k-mooney, we need to take care even with certs, I'm not sure that manila will not expose the IP in the export location as well.14:30
dansmitheven if we use cert auth, we're leaking something about the compute node14:30
sean-k-mooneybauzas: im aware but if we have locked the share you shoudl not be able to do that14:30
sean-k-mooneyso we can include that in the spec for the new lock api14:30
dansmithtwo users can conspire to compare certs about their hosts if they can see them and determine if they're on the same compute node, for example14:31
bauzasyup, + again, we allow anyone to block a mount14:31
dansmiththe cert probably has the hostname in it too right?14:31
sean-k-mooneydansmith not if the cert is per instance14:31
bauzasI think sean was proposing an instance-based cert14:31
sean-k-mooneybut otherwise yes14:31
bauzasbut that doesn't solve the ACL delete case14:31
dansmithack14:31
dansmithright14:31
bauzasand it adds a lot of complexity for little benefits14:31
bauzas+ I'm unsure that the export location doesn't show the IP address, even with certr14:32
dansmithI think manila is going to need finer granularity for this sort of thing, and nova is going to have to not use the user's token (alone) to do this stuff if we're going to distinguish between the user and nova14:32
bauzasdansmith: yup, and we need to round it back to the Manila team14:34
dansmithyeah14:34
bauzasthinking it more, the real problem is the fact that Manila exposes the export_location on the API for any user14:38
dansmithwell, for standalone usage it probably has to14:38
bauzassean-k-mooney: even with certs, the compute IP address would be shown in the export_location field of the manila share 14:38
bauzastryue14:38
dansmithhowever, I wonder if it should always only show a user their *own* exports14:38
bauzasthat's absolutely understandable from a standalone perspective14:38
sean-k-mooneyim not sure about that14:38
sean-k-mooneybut im in another meeting14:38
dansmiththat way, if nova uses its own user to create the export, the user would not be able to see it14:39
sean-k-mooneyso ill check back with this after14:39
bauzasdansmith: another user, another project, but yeah14:41
bauzasdansmith: it looks to me the granularity on CRUDs on per project14:42
dansmithbauzas: well, either really14:42
dansmithbauzas: we have per-user resources14:42
dansmithbut we need something14:42
dansmithproject would mean that the neutron user could see exports created by the nova user, which is probably not ever necessary14:42
dansmithalso not terrible, but..14:42
bauzasdansmith: well, we tested with Uggla, a devstack admin can create a share and add an access ACL, a devstack demo user can both delete the ACL and the share14:42
dansmithwait, what?14:43
bauzasthat yeah14:43
dansmithbut why?14:43
bauzasI'm saying again 14:43
bauzasf... no idea :)14:43
dansmiththat's fundamentally broken.. more than just "too coarse" right?14:43
bauzasI'm devstack admin, I create a share and I add access to 1.2.3.4/3214:44
bauzasthen,14:44
bauzasI switch to demo14:44
dansmithis manila in devstack configured with noauth or something/14:44
bauzasas demo, I can read that there is an existing share, I can read its ACL14:44
bauzasand I can delete both14:44
dansmithif that's true, this should have been a security bug no?14:44
bauzasdansmith: checking, but I think admin and demo share the same project14:45
dansmithoh, then that explains it, but.. they shouldn't be right?14:45
dansmithisn't admin in the admin project and demo in the demo project?14:45
bauzasdansmith: I'm just verifying that with Uggla14:45
bauzasdansmith: double-checking but yeah14:46
bauzasso, we tested with both an admin and a demo user from the same project14:53
opendevreviewMerged openstack/placement master: Move implemented specs for Xena and Yoga release  https://review.opendev.org/c/openstack/placement/+/85373015:11
opendevreviewStephen Finucane proposed openstack/nova master: db: Don't rely on branched connections  https://review.opendev.org/c/openstack/nova/+/88066316:07
opendevreviewStephen Finucane proposed openstack/nova master: db: Remove unnecessary 'insert()' argument  https://review.opendev.org/c/openstack/nova/+/88066416:07
opendevreviewStephen Finucane proposed openstack/nova master: tests: Pass parameters to sqlalchemy.text() as bindparams  https://review.opendev.org/c/openstack/nova/+/88066516:07
opendevreviewStephen Finucane proposed openstack/nova master: fixup! db: Remove unnecessary 'insert()' argument  https://review.opendev.org/c/openstack/nova/+/88066616:07
opendevreviewStephen Finucane proposed openstack/nova master: tests: Add missing args to sqlalchemy.Table  https://review.opendev.org/c/openstack/nova/+/88066716:07
opendevreviewStephen Finucane proposed openstack/nova master: db: Avoid relying on autobegin  https://review.opendev.org/c/openstack/nova/+/88066816:07
opendevreviewStephen Finucane proposed openstack/nova master: tests: Remove test for DB special characters  https://review.opendev.org/c/openstack/nova/+/88066916:07
opendevreviewStephen Finucane proposed openstack/nova master: db: Remove unnecessary 'insert()' argument  https://review.opendev.org/c/openstack/nova/+/88066416:08
opendevreviewStephen Finucane proposed openstack/nova master: tests: Pass parameters to sqlalchemy.text() as bindparams  https://review.opendev.org/c/openstack/nova/+/88066516:08
opendevreviewStephen Finucane proposed openstack/nova master: tests: Add missing args to sqlalchemy.Table  https://review.opendev.org/c/openstack/nova/+/88066716:08
opendevreviewStephen Finucane proposed openstack/nova master: db: Avoid relying on autobegin  https://review.opendev.org/c/openstack/nova/+/88066816:08
opendevreviewStephen Finucane proposed openstack/nova master: tests: Remove test for DB special characters  https://review.opendev.org/c/openstack/nova/+/88066916:08
opendevreviewMerged openstack/placement master: Update python testing as per zed cycle testing runtime  https://review.opendev.org/c/openstack/placement/+/84169016:41
opendevreviewDan Smith proposed openstack/nova master: Remove silent failure to find a node on rebuild  https://review.opendev.org/c/openstack/nova/+/88063217:11
opendevreviewDan Smith proposed openstack/nova master: Stop ignoring missing compute nodes in claims  https://review.opendev.org/c/openstack/nova/+/88063317:11
dansmithgibi: sean-k-mooney: bauzas: re our conversation this morning, I've seen this on plenty of volume tests lately as well: https://04e1a81b9004b13f1732-58d59663991bfd1d0dc3ba3167f83a98.ssl.cf1.rackcdn.com/880632/2/check/nova-ceph-multistore/f6ebaaf/testr_results.html19:30
dansmithi.e. several kernel crashes and, obviously, a failure to detach later19:30
dansmithit's this kind of thing that makes me highly suspect of cirros19:31
sean-k-mooneydansmith: if its the old cirror image  form the 5.2 release 22:15
sean-k-mooneyactully thts a differnt crash then i have seen in the past22:16

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!