GNU bug report logs - #21694: 'clone' syscall binding unreliable
Reported by: ludo <at> gnu.org (Ludovic Courtès)
Date: Fri, 16 Oct 2015 20:41:02 UTC
Severity: normal
Done: ludo <at> gnu.org (Ludovic Courtès)
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your bug report
#21694: 'clone' syscall binding unreliable
which was filed against the guix package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 21694 <at> debbugs.gnu.org.
--
21694: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=21694
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
ludo <at> gnu.org (Ludovic Courtès) skribis:
> "Thompson, David" <dthompson2 <at> worcester.edu> skribis:
>
>> On Fri, Oct 16, 2015 at 4:39 PM, Ludovic Courtès <ludo <at> gnu.org> wrote:
[...]
>>> Now, there remains the question of CLONE_CHILD_SETTID and
>>> CLONE_CHILD_CLEARTID. Since we’re passing NULL for ‘ctid’, I expect
>>> that these flags have no effect at all.
>>
>> I added those flags in commit ee78d02 because they solved a real issue
>> I ran into. Adding those flags made 'clone' look like a
>> 'primitive-fork' call when examined with strace.
>
> Could you check whether removing these flags makes a difference now?
I removed them in commit after confirming that it affects neither the
test suite nor ‘guix system environment’ (on x86_64, with Linux-libre
4.2.3-gnu.)
Thanks,
Ludo’.
[Message part 3 (message/rfc822, inline)]
[Message part 4 (text/plain, inline)]
I’m reporting the problem and (hopefully) the solution, but I think we’d
better double-check this.
The problem: Running the test below in a loop sometimes gets a SIGSEGV
in the child process (on x86_64, libc 2.22.)
--8<---------------cut here---------------start------------->8---
(use-modules (guix build syscalls) (ice-9 match))

(match (clone (logior CLONE_NEWUSER
                      CLONE_CHILD_SETTID
                      CLONE_CHILD_CLEARTID
                      SIGCHLD))
  (0
   (throw 'x))                          ;XXX: sometimes segfaults
  (pid
   (match (waitpid pid)
     ((_ . status)
      (pk 'status status)
      (exit (not (status:term-sig status)))))))
--8<---------------cut here---------------end--------------->8---
Looking at (guix build syscalls) though, I see an ABI mismatch between
our definition and the actual ‘syscall’ C function, and between our
‘clone’ definition and the actual C function.
This leads to the attached patch, which also fixes the above problem for me.
[Message part 5 (text/x-patch, inline)]
diff --git a/guix/build/syscalls.scm b/guix/build/syscalls.scm
index 80b9d00..f931f8d 100644
--- a/guix/build/syscalls.scm
+++ b/guix/build/syscalls.scm
@@ -322,10 +322,16 @@ string TMPL and return its file name. TMPL must end with 'XXXXXX'."
 (define CLONE_NEWNET #x40000000)

 ;; The libc interface to sys_clone is not useful for Scheme programs, so the
-;; low-level system call is wrapped instead.
+;; low-level system call is wrapped instead. The 'syscall' function is
+;; declared in <unistd.h> as a variadic function; in practice, it expects 6
+;; pointer-sized arguments, as shown in, e.g., x86_64/syscall.S.
 (define clone
   (let* ((ptr        (dynamic-func "syscall" (dynamic-link)))
-         (proc       (pointer->procedure int ptr (list int int '*)))
+         (proc       (pointer->procedure long ptr
+                                         (list long          ;sysno
+                                               unsigned-long ;flags
+                                               '* '* '*
+                                               '*)))
          ;; TODO: Don't do this.
          (syscall-id (match (utsname:machine (uname))
                        ("i686" 120)
@@ -336,7 +342,10 @@ string TMPL and return its file name. TMPL must end with 'XXXXXX'."
       "Create a new child process by duplicating the current parent process.
 Unlike the fork system call, clone accepts FLAGS that specify which resources
 are shared between the parent and child processes."
-      (let ((ret (proc syscall-id flags %null-pointer))
+      (let ((ret (proc syscall-id flags
+                       %null-pointer               ;child stack
+                       %null-pointer %null-pointer ;ptid & ctid
+                       %null-pointer))             ;unused
             (err (errno)))
         (if (= ret -1)
             (throw 'system-error "clone" "~d: ~A"
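For comparison, here is a rough C sketch of the call shape the patched binding uses: the syscall number plus five further arguments, each widened to a pointer-sized long so that every slot holds a defined value. The helper names raw_clone and clone_and_wait are made up for illustration and are not part of Guix. Only SIGCHLD is passed, so the sketch does not depend on user namespaces being available.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork-like clone through the variadic 'syscall'
   function, with every argument passed as a long.  */
static long raw_clone(unsigned long flags)
{
    return syscall(SYS_clone,
                   (long) flags,
                   0L,    /* child stack: NULL, keep the current one */
                   0L,    /* ptid */
                   0L,    /* ctid */
                   0L);   /* unused here */
}

/* Hypothetical helper: the child exits with CHILD_EXIT_CODE; the parent
   returns that exit status, or -1 if anything went wrong.  */
static int clone_and_wait(int child_exit_code)
{
    long pid = raw_clone(SIGCHLD);
    int status;

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(child_exit_code);      /* child: leave without running atexit */

    if (waitpid((pid_t) pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling clone_and_wait(7) in the parent should return 7 when the child exits normally.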
[Message part 6 (text/plain, inline)]
Could you test this patch?
Now, there remains the question of CLONE_CHILD_SETTID and
CLONE_CHILD_CLEARTID. Since we’re passing NULL for ‘ctid’, I expect
that these flags have no effect at all.
Conversely, libc uses these flags to update the thread ID in the child
process (x86_64/arch-fork.h):
--8<---------------cut here---------------start------------->8---
#define ARCH_FORK() \
  INLINE_SYSCALL (clone, 4, \
                  CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0, \
                  NULL, &THREAD_SELF->tid)
--8<---------------cut here---------------end--------------->8---
This is certainly useful, but we’d have trouble doing it from the FFI…
It may be that this is fine if the process doesn’t use threads.
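One way to check the expectation about a NULL ‘ctid’ is a small C test, sketched here under assumptions: the helper child_sees_tid and the variable tid_slot are hypothetical names, the argument order is switched on __x86_64__ because other architectures are assumed to take tls before ctid, and <linux/sched.h> is assumed to be installed for the CLONE_* constants. The child compares the slot with its own ID right after clone returns.

```c
#define _GNU_SOURCE
#include <linux/sched.h>   /* CLONE_CHILD_SETTID, CLONE_CHILD_CLEARTID */
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Slot the kernel may store the child's thread ID into.  Volatile because
   the store happens outside the compiler's view.  */
static volatile pid_t tid_slot;

/* Hypothetical helper: returns 1 if the child found its own ID in tid_slot
   right after clone returned, 0 if it did not, -1 on error.  */
static int child_sees_tid(int pass_ctid)
{
    unsigned long flags = CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD;
    long ctid = pass_ctid ? (long) &tid_slot : 0L;
    long pid;
    int status;

    tid_slot = 0;
#if defined(__x86_64__)
    /* x86_64 order: flags, stack, ptid, ctid, tls.  */
    pid = syscall(SYS_clone, (long) flags, 0L, 0L, ctid, 0L);
#else
    /* Assumption: this architecture takes tls before ctid.  */
    pid = syscall(SYS_clone, (long) flags, 0L, 0L, 0L, ctid);
#endif
    if (pid == -1)
        return -1;
    if (pid == 0)
        /* Child: ask the kernel directly, in case libc caches the PID.  */
        _exit(tid_slot == (pid_t) syscall(SYS_getpid) ? 0 : 1);

    if (waitpid((pid_t) pid, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

If the expectation holds, child_sees_tid(1) returns 1 and child_sees_tid(0) returns 0, meaning the flags do nothing observable when ‘ctid’ is NULL.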
Ludo’.
This bug report was last modified 9 years and 264 days ago.