GNU bug report logs - #7597
multi-threaded sort can segfault (unrelated to the sort -u segfault)
Reported by: Jim Meyering <jim <at> meyering.net>
Date: Thu, 9 Dec 2010 12:11:01 UTC
Severity: normal
Tags: fixed
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
Jim Meyering wrote:
...
> With that, I've solved at least part of the problem.
> The segfault (and other strangeness we've witnessed)
> arises because each "node" struct is stored on the stack,
> and its address ends up being used by another thread after
> the thread that owns the stack in question has been "joined".
>
> My solution is to use the heap instead of the stack.
> However, for today I'm out of time and I have not yet found a
> way to free these newly-malloc'd "node" buffers.
>
> To test this, I've done the following:
>
> gensort -a 10000 > gensort-10k
> for i in $(seq 2000); do printf '% 4d\n' $i; valgrind --quiet src/sort -S 100K \
> --parallel=2 gensort-10k > k; test $(wc -c < k) = 1000000 || break; done
> for i in $(seq 2000); do printf '% 4d\n' $i; src/sort -S 100K \
> --parallel=2 gensort-10k > j; test $(wc -c < j) = 1000000 || break; done
>
> Without the patch, the first would show errors for more than 50% of
> the runs and the second would rarely get to i=100 without generating
> a core file. With the patch, both complete error-free (not counting
> leaks).
FYI, while preparing a test, I've found that the latter loop
(without valgrind) passes all 2000 iterations when compiled with -g -O2,
yet fails in at least 10 of the 2000 when compiled with -ggdb3.
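
The diagnosis quoted above (a stack-resident "node" whose address is still in use after its owning stack is gone, fixed by moving the node to the heap) can be illustrated with a small stand-alone sketch. This is not the coreutils code: `struct node`, `worker`, `spawn_range` and `sum_two_threads` are hypothetical names invented for illustration, and the work done (summing a range of integers) merely stands in for sort's merge step. The point it tries to show is that a node handed to another thread has to outlive the stack frame that created it, and that it can be freed only once no thread can still reach it.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for sort's "node"; NOT the coreutils struct. */
struct node {
    long lo, hi;    /* half-open range [lo, hi) to sum */
    long result;    /* written by the worker thread */
};

/* Worker: reads its node, does the work, stores the result in the node. */
static void *worker(void *arg)
{
    struct node *n = arg;
    long sum = 0;
    for (long i = n->lo; i < n->hi; i++)
        sum += i;
    n->result = sum;
    return NULL;
}

/*
 * Start a worker on a heap-allocated node.  Had the node been a local
 * variable here ("struct node n; ... &n"), its address would dangle as
 * soon as this function returned, while the worker might still use it.
 * Returns NULL on allocation or thread-creation failure.
 */
static struct node *spawn_range(pthread_t *tid, long lo, long hi)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->lo = lo;
    n->hi = hi;
    n->result = 0;
    if (pthread_create(tid, NULL, worker, n) != 0) {
        free(n);
        return NULL;
    }
    return n;
}

/* Sum the integers in [0, limit) with two threads; returns -1 on failure. */
long sum_two_threads(long limit)
{
    pthread_t t1, t2;
    long mid = limit / 2;

    struct node *a = spawn_range(&t1, 0, mid);
    if (a == NULL)
        return -1;
    struct node *b = spawn_range(&t2, mid, limit);
    if (b == NULL) {
        pthread_join(t1, NULL);
        free(a);
        return -1;
    }

    /* Only after both joins can no other thread touch the nodes. */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    long total = a->result + b->result;
    free(a);
    free(b);
    return total;
}
```

In this sketch the nodes are freed right after the joins because nothing else holds their addresses. The message above notes that no way to free the newly malloc'd nodes had yet been found in the real code, so the simple ownership shown here should be read as an assumption of the example, not as a description of the eventual fix. Build with `-pthread`.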
This bug report was last modified 6 years and 285 days ago.