> While researching container escape vulnerabilities, I recently came
> across CAP_DAC_READ_SEARCH and open_by_handle_at, which is a system call
> so insanely powerful it is outright banned in all but the root user
> namespace.  Or at least, it was.  10 months ago, in commit
> 620c266f394932e5decc4b34683a75dfc59dc2f4 of
> https://github.com/torvalds/linux, the requirements were relaxed so
> that, in certain cases, processes in non-root user namespaces could use
> open_by_handle_at.

The way ‘open_by_handle_at’ is documented (“half” of ‘openat’) does not
make it immediately obvious to me what makes it “powerful”.  I see the
risk of a confused deputy problem though because of the ‘mount_id’
argument in addition to ‘handle’.  Is that what you have in mind?

> The consequences of this for same-user containers are not clear to me
> yet, as I haven't studied the kernel source enough to know what exactly
> that commit message means by "privileges over the filesystem" or
> "privileges over a subtree".  I also haven't been able to test this
> behavior yet, because my kernel is actually too old (I do my rebases and
> upgrades rather less regularly than is recommended).  I'll try to look
> into this more once I update my system (and man-pages!), but figured I
> should mention it, because aside from that, and the aforementioned
> isInStore check, I can't think of any remaining concerns.

Alright.  I’ll send v8 with the change above.

Thanks again!

Ludo’.