GNU bug report logs -
#61095
possible misuse of posix_spawn API on non-linux OSes
Reported by: Omar Polo <op <at> omarpolo.com>
Date: Fri, 27 Jan 2023 11:53:01 UTC
Severity: normal
Tags: patch
Merged with 61079
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your bug report
#61095: possible misuse of posix_spawn API on non-linux OSes
which was filed against the guile package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 61095 <at> debbugs.gnu.org.
--
61095: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=61095
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Hi!
Omar Polo <op <at> omarpolo.com> skribis:
> On 2023/03/30 22:21:28 +0200, Josselin Poiret <dev <at> jpoiret.xyz> wrote:
>> Hi Ludo,
>>
>> Ludovic Courtès <ludo <at> gnu.org> writes:
>>
>> > Coming next is an updated patch series addressing this as proposed
>> > above. Let me know what y’all think!
>> >
>> > I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
>> >
>> > guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm
>>
>> I didn't test, but this LGTM! Maybe someone on OpenBSD could test this
>> patchset?
>
> % gmake check
> <snip />
> gmake[5]: Entering directory '/home/op/w/guile/test-suite/standalone'
> PASS: test-system-cmds
>
> it seems to work on OpenBSD 7.3 :)
Awesome! Pushed as 9cc85a4f52147fcdaa4c52a62bcc87bdb267d0a9.
> but note that our libc doesn't have posix_spawn_file_actions_addclosefrom_np,
> so this is using the "racy" code path.
Yeah, not great. :-/ I hope that function will be adopted by other
libcs, especially since ‘closefrom’ is already available.
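The "racy" fallback discussed above can be pictured roughly as follows. This is a minimal sketch and not the code from the commit: the helper name add_close_for_open_fds is hypothetical, and the sketch assumes that probing each descriptor with fcntl(F_GETFD) before queuing a close action is a fair approximation of that fallback.

```c
#include <assert.h>
#include <fcntl.h>
#include <spawn.h>

/* Hypothetical helper, not Guile's actual code: queue a close action
   for every descriptor strictly between 2 and MAX_FD that is open at
   the time of the call.  Returns the number of actions queued, or -1
   if an action could not be added.  */
int
add_close_for_open_fds (posix_spawn_file_actions_t *actions, int max_fd)
{
  int queued = 0;

  assert (actions != NULL);
  while (--max_fd > 2)
    {
      /* F_GETFD fails (EBADF) when MAX_FD is not open.  Another thread
         may open or close descriptors between this check and the
         spawn, which is what makes the approach racy.  */
      if (fcntl (max_fd, F_GETFD) == -1)
        continue;
      if (posix_spawn_file_actions_addclose (actions, max_fd) != 0)
        return -1;
      queued++;
    }
  return queued;
}
```

A caller would initialize the file actions, call the helper with an upper bound such as the RLIMIT_NOFILE soft limit, and then pass the actions to posix_spawn. Since only descriptors that are open get a close action, the child should not hit EBADF unless the descriptor table changes in between.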
> Just out of curiosity, as it's outside the scope of the bug, what's the
> reason posix_spawn was used instead of a more classic fork() +
> closefrom()?
There’s a long discussion at:
https://issues.guix.gnu.org/52835
Essentially, ‘fork’ is unusable in a multi-threaded context, in addition
to being inefficient.
Thanks,
Ludo’.
[Message part 3 (message/rfc822, inline)]
Hello,
I've noticed that test-system-cmds fails on OpenBSD-CURRENT while
testing the update to guile 3.0.9:
test-system-cmds: system* exit status was 127 rather than 42
FAIL: test-system-cmds
Here's an excerpt of the ktrace of the child process while executing
that specific test: (the first fork() is the one implicitly done by
posix_spawn(3))
5590 guile RET fork 0
[...]
5590 guile CALL dup2(0,3)
5590 guile RET dup2 3
5590 guile CALL dup2(1,4)
5590 guile RET dup2 4
5590 guile CALL dup2(2,5)
5590 guile RET dup2 5
5590 guile CALL dup2(3,0)
5590 guile RET dup2 0
5590 guile CALL dup2(4,1)
5590 guile RET dup2 1
5590 guile CALL dup2(5,2)
5590 guile RET dup2 2
5590 guile CALL close(1023)
5590 guile RET close -1 errno 9 Bad file descriptor
5590 guile CALL kbind(0x7f7ffffd51f8,24,0x2b5c5ced59893fa9)
5590 guile RET kbind 0
5590 guile CALL exit(127)
(if you prefer I can provide a full ktrace of guile executing that
test case)
My interpretation is that the sequence of dup2(2) calls comes from
posix_spawn_file_actions_adddup2 in do_spawn, while the strange
close(1023) comes from close_inherited_fds_slow. That file descriptor is
not open, so close(2) fails with EBADF and the posix_spawn machinery
exits prematurely. My current RLIMIT_NOFILE is 1024, so the number
makes sense.
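For anyone who wants to check this behaviour outside of guile, a small reproducer along these lines might help. It is only a sketch: the function name spawn_true_with_close is hypothetical, it assumes a "true" program is on PATH, and the outcome is expected to differ between libcs (the 127 exit seen above on OpenBSD, possibly a clean exit or an error from posix_spawnp itself elsewhere).

```c
#include <assert.h>
#include <spawn.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical reproducer: spawn "true" with a single close action on
   FD.  Returns the child's exit status, or -1 if the child could not
   be spawned or did not exit normally.  */
int
spawn_true_with_close (int fd)
{
  posix_spawn_file_actions_t actions;
  char *argv[] = { "true", NULL };
  pid_t pid;
  int status, err;

  assert (fd >= 0);
  if (posix_spawn_file_actions_init (&actions) != 0)
    return -1;

  /* Queue close(FD) to run in the child before the exec.  */
  err = posix_spawn_file_actions_addclose (&actions, fd);
  if (err == 0)
    err = posix_spawnp (&pid, "true", &actions, NULL, argv, environ);
  posix_spawn_file_actions_destroy (&actions);
  if (err != 0)
    return -1;

  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Calling it with a descriptor that is not open (for example one just below the RLIMIT_NOFILE soft limit) and comparing the result with a call on an open descriptor should show whether the local libc treats a failing close action as fatal.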
On OpenBSD I've tried to use the following patch to work around the
issue:
[[[
Index: libguile/posix.c
--- libguile/posix.c.orig
+++ libguile/posix.c
@@ -1325,6 +1325,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 static void
 close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
 {
+  max_fd = getdtablecount();
   while (--max_fd > 2)
     posix_spawn_file_actions_addclose (actions, max_fd);
 }
]]]
getdtablecount(2) returns the number of file descriptors currently open
by the process. Unfortunately, it doesn't seem to be portable (though,
to be fair, /proc/self/fd isn't portable either).
However, while this pleases the system* test, it breaks the pipe
tests:
Running popen.test
FAIL: popen.test: open-input-pipe: echo hello
FAIL: popen.test: pipeline - arguments: (expected-value ("HELLO WORLD\n" (0 0)) actual-value ("" (127 0)))
The reason seems to be similar:
74865 guile CALL dup2(7,3)
74865 guile RET dup2 3
74865 guile CALL dup2(10,4)
74865 guile RET dup2 4
74865 guile CALL dup2(2,5)
74865 guile RET dup2 5
74865 guile CALL dup2(3,0)
74865 guile RET dup2 0
74865 guile CALL dup2(4,1)
74865 guile RET dup2 1
74865 guile CALL dup2(5,2)
74865 guile RET dup2 2
74865 guile CALL close(8)
74865 guile RET close -1 errno 9 Bad file descriptor
74865 guile CALL kbind(0x7f7ffffcfa88,24,0x2125923bdf2ca9e)
74865 guile RET kbind 0
74865 guile CALL exit(127)
I guess it's trying to close the fd of the pipe that was closed.
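One way to read the trace above is that a count of open descriptors is not the same thing as the highest open descriptor: once there are gaps in the table, a loop running down from the count will hit numbers that are not open and may miss higher ones that are. The sketch below illustrates the difference. Both helper names are hypothetical, and the scan with fcntl(F_GETFD) is just a portable stand-in for getdtablecount(2) or /proc/self/fd.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Both public helpers are hypothetical and scan descriptors
   0 .. LIMIT - 1.  A non-positive LIMIT falls back to
   sysconf (_SC_OPEN_MAX), which must then be determinable.  */
static int
effective_limit (int limit)
{
  if (limit <= 0)
    limit = (int) sysconf (_SC_OPEN_MAX);
  assert (limit > 0);
  return limit;
}

/* Number of descriptors below LIMIT that are currently open.  */
int
count_open_fds (int limit)
{
  int fd, count = 0;

  limit = effective_limit (limit);
  for (fd = 0; fd < limit; fd++)
    if (fcntl (fd, F_GETFD) != -1)
      count++;
  return count;
}

/* Highest descriptor below LIMIT that is currently open, or -1.  */
int
highest_open_fd (int limit)
{
  int fd;

  limit = effective_limit (limit);
  for (fd = limit - 1; fd >= 0; fd--)
    if (fcntl (fd, F_GETFD) != -1)
      return fd;
  return -1;
}
```

With descriptors 0, 1, 2, 7 and 10 open, for instance, count_open_fds would report 5 while highest_open_fd would report 10, so neither value alone says which descriptors in between are safe to close.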
I'm not sure what to do from here, I'm not used to the posix_spawn_*
APIs. I'm happy to help testing diffs or by providing more info if
needed.
Thanks,
Omar Polo
This bug report was last modified 2 years and 105 days ago.