GNU bug report logs - #7597
multi-threaded sort can segfault (unrelated to the sort -u segfault)


Package: coreutils

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Thu, 9 Dec 2010 12:11:01 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


From: Jim Meyering <jim <at> meyering.net>
To: Chen Guo <chen.guo.0625 <at> gmail.com>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, bug-coreutils <at> gnu.org, DJ Lucas <dj <at> linuxfromscratch.org>, coreutils <at> gnu.org
Subject: bug#7597: [coreutils] multi-threaded sort can segfault (unrelated to the sort -u segfault)
Date: Thu, 09 Dec 2010 22:33:42 +0100

Jim Meyering wrote:
...
> With that, I've solved at least part of the problem.
> The segfault (and other strangeness we've witnessed)
> arises because each "node" struct is stored on the stack,
> and its address ends up being used by another thread after
> the thread that owns the stack in question has been "joined".
>
> My solution is to use the heap instead of the stack.
> However, for today I'm out of time and I have not yet found a
> way to free these newly-malloc'd "node" buffers.
>
> To test this, I've done the following:
>
> gensort -a 10000 > gensort-10k
> for i in $(seq 2000); do printf '% 4d\n' $i; valgrind --quiet src/sort -S 100K \
>   --parallel=2 gensort-10k > k; test $(wc -c < k) = 1000000 || break; done
> for i in $(seq 2000); do printf '% 4d\n' $i; src/sort -S 100K \
>   --parallel=2 gensort-10k > j; test $(wc -c < j) = 1000000 || break; done
>
> Without the patch, the first would show errors for more than 50% of
> the runs and the second would rarely get to i=100 without generating
> a core file.  With the patch, both complete error-free (not counting
> leaks).

FYI, while preparing a test, I've found that the latter test
(without valgrind) passes 2000/2000 tests when compiled with -g -O2,
yet fails in at least 10 of the 2000 when compiled with -ggdb3.




This bug report was last modified 6 years and 285 days ago.


