GNU bug report logs - #7597
multi-threaded sort can segfault (unrelated to the sort -u segfault)

Package: coreutils

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Thu, 9 Dec 2010 12:11:01 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: Chen Guo <chen.guo.0625 <at> gmail.com>, bug-coreutils <at> gnu.org, DJ Lucas <dj <at> linuxfromscratch.org>, coreutils <at> gnu.org
Subject: bug#7597: multi-threaded sort can segfault (unrelated to the sort -u segfault)
Date: Sun, 12 Dec 2010 13:42:31 -0800

On 12/12/2010 07:41 AM, Jim Meyering wrote:
> That sounds good, assuming it triggers the bug reliably for you.
> I was hoping to find a way to reproduce it without relying on gensort,
> but won't object if you want to do that.

In my attempts, the problem has been pretty flaky to reproduce.
I think it depends on how busy the operating system is.
Sometimes I'd get failures all the time; sometimes, almost
never.  (This was with valgrind; I had much less luck without
valgrind.)

Anyway, I pushed this, which seemed to work well enough
on my host.  It prefers gensort if available, but falls
back on seq+shuf if not.
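
The fallback described above can be sketched as a small shell function. This is only an illustration of the pattern, not code from the patch: the function name make_input and the file name used in the example call are hypothetical, and gensort is assumed to be an optional external tool that may not be installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): write N lines of
# sort input to FILE, preferring gensort when it is installed.
make_input() {
  n=$1 out=$2
  # The subshell plus 2>/dev/null hides the shell's "not found"
  # diagnostic when gensort is missing.
  (gensort -a "$n" "$out") 2>/dev/null && return 0
  # Fallback: N numbers, each left-justified in a 98-column field,
  # shuffled into random order.
  seq -f %-98f "$n" | shuf > "$out"
}

# Example call: generate 1000 lines in a scratch file and count them.
tmp=$(mktemp) && make_input 1000 "$tmp" && wc -l < "$tmp"
rm -f "$tmp"
```

Either branch yields one record per line, so the rest of a test does not need to know which generator produced the input.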

From 63d1b425976ccc0b89159d743e33eb5da634de3c Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Sun, 12 Dec 2010 13:38:19 -0800
Subject: [PATCH] tests: test for access to stale thread memory

* tests/misc/sort-stale-thread-mem: New test.
* tests/Makefile.am (TESTS): Add it.
---
 tests/Makefile.am                |    1 +
 tests/misc/sort-stale-thread-mem |   44 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 0 deletions(-)
 create mode 100755 tests/misc/sort-stale-thread-mem

diff --git a/tests/Makefile.am b/tests/Makefile.am
index b573061..f7a8af8 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -238,6 +238,7 @@ TESTS =						\
   misc/sort-month				\
   misc/sort-rand				\
   misc/sort-spinlock-abuse			\
+  misc/sort-stale-thread-mem			\
   misc/sort-unique				\
   misc/sort-unique-segv				\
   misc/sort-version				\
diff --git a/tests/misc/sort-stale-thread-mem b/tests/misc/sort-stale-thread-mem
new file mode 100755
index 0000000..c4f4fcb
--- /dev/null
+++ b/tests/misc/sort-stale-thread-mem
@@ -0,0 +1,44 @@
+#!/bin/sh
+# Trigger a bug that would cause 'sort' to reference stale thread stack memory.
+
+# Copyright (C) 2010 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# written by Jim Meyering and Paul Eggert
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+print_ver_ sort
+
+expensive_
+
+valgrind --help >/dev/null || skip_ "requires valgrind"
+test "$(nproc)" = 1 && skip_ "requires a multi-core system"
+
+# gensort output seems to trigger the failure more often,
+# so prefer gensort if it is available.
+(gensort -a 10000 in) 2>/dev/null ||
+  seq -f %-98f 10000 | shuf > in ||
+  framework_failure_
+
+# With the bug, 'sort' would fail under valgrind about half the time,
+# on some circa-2010 multicore Linux platforms.  Run the test 100 times
+# so that the probability of missing the bug should be about 1 in
+# 2**100 on these hosts.
+fail=0
+for i in $(seq 100); do
+  valgrind --quiet --error-exitcode=3 \
+      sort -S 100K --parallel=2 in > /dev/null ||
+    { fail=$?; echo iteration $i failed; Exit $fail; }
+done
-- 
1.7.2
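
As a rough check of the arithmetic behind the test's loop: if each valgrind run fails independently with probability 1/2 when the bug is present, then all N runs pass with probability (1/2)^N. The sketch below computes that figure; the function name miss_probability is hypothetical, and independence between runs is an assumption that may not hold on a busy or idle host.

```shell
#!/bin/sh
# Hypothetical helper: probability that N independent runs all pass
# when each run fails with probability P (independence is assumed).
miss_probability() {
  awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3e\n", (1 - p) ^ n }'
}

miss_probability 0.5 10    # 10 iterations
miss_probability 0.5 100   # 100 iterations, as in the test's loop
```

With a 1/2 failure rate this prints about 9.766e-04 for 10 iterations and about 7.889e-31 (that is, 1 in 2**100) for 100 iterations.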





