GNU bug report logs - #6600
[PATCH] sort: add --threads option to parallelize internal sort.

Previous Next

Package: coreutils;

Reported by: Pádraig Brady <P <at> draigBrady.com>

Date: Sat, 10 Jul 2010 01:09:02 UTC

Severity: normal

Tags: patch

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Pádraig Brady <P <at> draigBrady.com>
To: Chen Guo <chenguo4 <at> yahoo.com>
Cc: eggert <at> cs.ucla.edu, 6600 <at> debbugs.gnu.org, glen.lenker <at> gmail.com, mykphyre <at> gmail.com, quaker4lyf <at> gmail.com, cdickens <at> ucla.edu
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Date: Sat, 10 Jul 2010 02:07:37 +0100
On 08/03/10 10:39, Chen Guo wrote:
> Hi Padraig,
> 
>> You previously mentioned a thread bug with memcoll. Is that worked around?
>
> That happened when more than one instance of memcoll is called on the same
> line at once, since memcoll replaces the eolchar with '\0'. Under our approach,
> the same line shouldn't ever be compared at the same time, so we're fine.
> On top of that, Professor Eggert suggested NUL delimiting all lines as they're
> read in, so memcoll doesn't have to; hence the patch to gnulib, which introduces
> xmemcoll_nul and memcoll_nul, for when input is known to be NUL delimited, thus
> no replacement of the eolchar is needed, making memcoll threadsafe.

Note the current xmemcoll0() in gnulib requires the length
_including_ the terminating NUL to be passed, whereas one
usually does not include the terminating char in the length
passed to xmemcoll(). I accordingly updated the lengths
passed to xmemcoll0() by your latest patch.

However there are still writes done to the source text
in the keycompare() function. So I'm thinking of dropping
the whole xmemcoll0() thing altogether assuming your
statement above is correct, that a particular line will
not be used at the same time by multiple threads.
I did try to copy the text to the stack before comparing,
but that introduced a significant overhead noted below.

Your patch is still performing well on a single core machine:

----------- before ---------------------
$ time ./src/sort < nums.list >/dev/null
real    0m8.644s
user    0m8.307s
sys     0m0.292s

$ time ./src/sort -g < nums.list >/dev/null
real    0m11.046s
user    0m10.652s
sys     0m0.295s

$ time ./src/sort -n < nums.list >/dev/null
real    0m4.909s
user    0m4.567s
sys     0m0.298s

$ time LANG=C ./src/sort < nums.list >/dev/null
real    0m1.959s
user    0m1.657s
sys     0m0.285s


------------ after ---------------------
$ time ./src/sort < nums.list >/dev/null
real    0m8.686s
user    0m8.300s
sys     0m0.232s

$ time ./src/sort -g < nums.list >/dev/null
real    0m10.196s
user    0m9.850s
sys     0m0.221s

$ time ./src/sort -n < nums.list >/dev/null
real    0m2.958s
user    0m2.664s
sys     0m0.221s

$ time LANG=C ./src/sort < nums.list >/dev/null
real    0m1.985s
user    0m1.750s
sys     0m0.217s

After copying the text to the stack as mentioned above
there is a significant performance drop:

$ time ./src/sort -n < nums.list >/dev/null

real    0m4.086s
user    0m3.848s
sys     0m0.218s

cheers,
Pádraig.




This bug report was last modified 14 years and 313 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.