GNU bug report logs - #7489
[coreutils] over aggressive threads in sort

Previous Next

Package: coreutils;

Reported by: DJ Lucas <dj <at> linuxfromscratch.org>

Date: Fri, 26 Nov 2010 19:40:02 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #80 received at 7489 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Chen Guo <chen.guo.0625 <at> gmail.com>
Cc: Jim Meyering <jim <at> meyering.net>,
	Pádraig Brady <P <at> draigbrady.com>,
	DJ Lucas <dj <at> linuxfromscratch.org>, 7489 <at> debbugs.gnu.org, coreutils <at> gnu.org
Subject: [PATCH] sort: fix bug on 64-bit hosts with at least 32768 processors
Date: Wed, 01 Dec 2010 21:57:11 -0800
On 11/30/2010 10:16 PM, Paul Eggert wrote:

> Invoke MAX_MERGE(total, level) with level == 15.
> 2 << level yields 65536, and 65536 * 65536 overflows to zero.

I managed to reproduce this bug on a (faked) host with
32768 processors, using a command like this:

  seq 1000000000 | sort --parallel=32768 -S 10G

The result was a floating point exception (actually, a division
by zero) and 'sort' crashed.

However, the bug is timing dependent and is very hard to
reproduce.  I tried many more times to reproduce it, and
they all failed.

This proved to my satisfaction that it is a real bug, though,
so I pushed the following patch.

From 1561c2b228d93a049e527824e14ad4fe8c256b52 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Wed, 1 Dec 2010 21:50:00 -0800
Subject: [PATCH] sort: fix bug on 64-bit hosts with at least 32768 processors

* src/sort.c (MAX_MERGE): Avoid integer overflow when on a machine
with (say) 32-bit int and 64-bit size_t and when level == 15.
Without this fix, on such a machine with 32768 or more processors,
the level computation could overflow on large input, and this
would result in division by zero.
---
 src/sort.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index 1aa1eb4..5c368cd 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -107,7 +107,7 @@ struct rlimit { size_t rlim_cur; };
 /* Maximum number of lines to merge every time a NODE is taken from
    the MERGE_QUEUE.  Node is at LEVEL in the binary merge tree,
    and is responsible for merging TOTAL lines. */
-#define MAX_MERGE(total, level) ((total) / ((2 << level) * (2 << level)) + 1)
+#define MAX_MERGE(total, level) (((total) >> (2 * ((level) + 1))) + 1)
 
 /* Heuristic value for the number of lines for which it is worth
    creating a subthread, during an internal merge sort, on a machine
-- 
1.7.2





This bug report was last modified 6 years and 202 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.