Ludovic Courtès writes:

> @@ -2013,23 +2057,36 @@ void DerivationGoal::runChild()
>
>          _writeToStderr = 0;
>
> +        if (readiness.writeSide > 0) readiness.writeSide.close();
> +
> +        if (readiness.readSide > 0) {
> +            /* Wait for the parent process to initialize the UID/GID mapping
> +               of our user namespace.  */
> +            char str[20] = { '\0' };
> +            readFull(readiness.readSide, (unsigned char*)str, 3);
> +            if (strcmp(str, "go\n") != 0)
> +                throw Error("failed to initialize process in unprivileged user namespace");
> +        }
> +
>          restoreAffinity();

(in patch 7/16, in nix/libstore/build.cc)

Strictly speaking we should check whether the fds are >= 0, not > 0,
since 0 is technically a valid file descriptor, and we use -1 to
indicate the absence of a file descriptor.

Also, readiness.readSide isn't explicitly closed in the child after
we're done with it.

> +
> +    # The unprivileged daemon cannot create the log directory by itself.
> +    mkdir /var/log/guix
> +    chown guix-daemon:guix-daemon /var/log/guix
> +    chmod 755 /var/log/guix

(in patch 15/16, in etc/guix-install.sh)

Should this be 'mkdir -p' or some other conditional creation?  If I
understand correctly this will fail when overwriting an existing
install using GUIX_ALLOW_OVERWRITE.

Concerning guix-install.sh, to be clear, is the intent to specifically
not support installing the rootful daemon on systemd systems?  For my
two cents, I do think that it's still a tradeoff - not just because of
the reliance on different kernel mechanisms for security, but also
because the rootless daemon currently causes visible changes to the
build environment (EROFS on /, and nothing owned by root, for example).
Which one should we consider the "canonical" build environment going
forward?

I decided to do some searching for container escapes / vulnerabilities
online, just to be extra careful, and found one that relies on the
container entry point being run as the user that owns the program that
execs the container entry point: CVE-2019-5736.
It exploits the fact that /proc/self/exe, despite being displayed like
a symlink in the output of ls, does not actually act like a symlink,
and indeed acts more like a hardlink that readlink happens to have some
associated data for.

The demo, modified for guix circumstances, would go something like
this:

1. A derivation is created whose builder is /proc/self/exe, and whose
   LD_PRELOAD environment variable points into a malicious store item
   for one of its shared libraries - for example, libc.

2. The daemon reads this in, and, to my knowledge, does no verification
   of the builder string.  Note that this aspect isn't actually
   necessary, as the builder could also be a symlink to /proc/self/exe
   from the store.

3. The daemon sets up the build environment, and execs /proc/self/exe.

4. An attacker-controlled load-time function gets run.

5. It opens /proc/self/exe, initially read-only because it can't be
   opened writable while a process is executing it.  It then execs
   another attacker-controlled process, which inherits the open file
   descriptor and subsequently opens it via /proc/self/fd/, this time
   read-write (it can do this because it owns the file, and even if
   it's not writable, a quick fchmod will fix that, and the filesystem
   it was originally opened from isn't read-only, because guix-daemon
   starts before gnu-store.mount bind-mounts /gnu/store to itself prior
   to making it read-only).  It then overwrites the resulting file
   descriptor with whatever contents it wants.

6. The next time guix-daemon is started outside the container, it runs
   attacker-controlled code.

There are several points at which that particular attack could be
stopped, and I'd like to try to stop it at as many of them as possible.
A good start would be canonicalizing the builder prior to executing it
and then checking to make sure it is in the store.
A more general solution could look like writing out and then executing
a tiny binary, something like /tmp/runbuilder, that does nothing but
unlink itself and then exec the actual program.

Here's a writeup of the CVE in question:

https://unit42.paloaltonetworks.com/breaking-docker-via-runc-explaining-cve-2019-5736/

Aside from all that, it looks good to me.

- reepca