Ludovic Courtès writes:

>> While researching container escape vulnerabilities, I recently came
>> across CAP_DAC_READ_SEARCH and open_by_handle_at, which is a system call
>> so insanely powerful it is outright banned in all but the root user
>> namespace. Or at least, it was. 10 months ago, in commit
>> 620c266f394932e5decc4b34683a75dfc59dc2f4 of
>> https://github.com/torvalds/linux, the requirements were relaxed so
>> that, in certain cases, processes in non-root user namespaces could use
>> open_by_handle_at.
>
> The way ‘open_by_handle_at’ is documented (“half” of ‘openat’) does not
> make it immediately obvious to me what makes it “powerful”. I see the
> risk of a confused deputy problem though because of the ‘mount_id’
> argument in addition to ‘handle’. Is that what you have in mind?

The handle is just a sequence of bytes supplied from user space, and it
is not namespaced whatsoever. In other words, the first "half" (that
is, name_to_handle_at) is completely optional, as long as you have a
good idea of what sort of handle values to try. This means that, if a
process has this capability in the root user namespace, it can
potentially access every file of any filesystem that has at least one
file visible to it.

Note that "filesystem" here is not the same thing as "mount point". If
you have a bind mount from the root filesystem in the container (or the
root filesystem of the container itself lives on the out-of-container
root filesystem), a process in the container but with
CAP_DAC_READ_SEARCH in the root user namespace could access *every file
on the real root filesystem*.

This is how an exploit for Docker named "shocker" worked
(http://stealth.openwall.net/xSports/shocker.c). It was possible
because Docker left CAP_DAC_READ_SEARCH available by default in
privileged containers.
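To make the "name_to_handle_at is optional" point concrete, here is a
rough sketch of building a handle by hand. It assumes an ext2/3/4-style
filesystem, where (as far as I know) the handle type FILEID_INO32_GEN is
just a 32-bit inode number followed by a 32-bit generation number, and
where the root directory is inode 2. The names forge_handle and try_open
are made up for illustration, and the ctypes call assumes a libc that
exports open_by_handle_at.

```python
import ctypes
import os
import struct

# From the kernel's include/linux/exportfs.h: the handle payload is a
# 32-bit inode number followed by a 32-bit generation number.
FILEID_INO32_GEN = 1


def forge_handle(inode, generation=0):
    """Build the bytes of a struct file_handle without name_to_handle_at.

    Layout (native byte order):
        unsigned int  handle_bytes;   size of f_handle
        int           handle_type;
        unsigned char f_handle[];     filesystem-specific payload
    """
    f_handle = struct.pack("=II", inode, generation)
    header = struct.pack("=Ii", len(f_handle), FILEID_INO32_GEN)
    return header + f_handle


def try_open(mount_fd, handle):
    """Pass a hand-made handle to open_by_handle_at (hypothetical helper).

    mount_fd is any open file descriptor on the target filesystem.
    Raises OSError on failure, which is the expected outcome for a caller
    without CAP_DAC_READ_SEARCH.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    buf = ctypes.create_string_buffer(handle, len(handle))
    fd = libc.open_by_handle_at(mount_fd, buf, os.O_RDONLY)
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return fd


# Guess at the handle for the root directory of an ext4 filesystem.
root_handle = forge_handle(2)
print(root_handle.hex())
```

Calling try_open(os.open("/", os.O_RDONLY), root_handle) as an ordinary
user should simply fail with EPERM. The point is only that nothing in
the handle itself ties it to the files the caller was allowed to see.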

I of course hope that the kernel's relaxing of the rules to also allow
open_by_handle_at in some situations in non-root user namespaces has
been carefully thought through to not open any holes like this, but it
would be good to keep an eye on it regardless.

- reepca