GNU bug report logs - #6600
[PATCH] sort: add --threads option to parallelize internal sort.

Package: coreutils;

Reported by: Pádraig Brady <P <at> draigBrady.com>

Date: Sat, 10 Jul 2010 01:09:02 UTC

Severity: normal

Tags: patch

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

Message #14 received at submit <at> debbugs.gnu.org:

From: Chen Guo <chen.guo.0625 <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Bug Coreutils <bug-coreutils <at> gnu.org>, Glen Lenker <glen.lenker <at> gmail.com>,
	Mike Nichols <mykphyre <at> gmail.com>, Gene Auyeung <quaker4lyf <at> gmail.com>,
	Chris Dickens <cdickens <at> ucla.edu>,
	Pádraig Brady <P <at> draigbrady.com>
Subject: Re: [PATCH] sort: add --threads option to parallelize internal sort.
Date: Fri, 9 Jul 2010 21:23:46 -0700

2010/7/9 Paul Eggert <eggert <at> cs.ucla.edu>:
> On 07/09/10 18:07, Pádraig Brady wrote:
>> Chen Guo wrote:
>>> That happened when more than one instance of memcoll is called on the same
>>> line at once, since memcoll replaces the eolchar with '\0'. Under our approach,
>>> the same line shouldn't ever be compared at the same time, so we're fine.
>
> Ah, sorry, I wasn't aware of that.
>
>> I'm thinking of dropping
>> the whole xmemcoll0() thing altogether assuming your
>> statement above is correct, that a particular line will
>> not be used at the same time by multiple threads.
>
> Yes, that makes sense.  We can revert that change from gnulib, since it
> makes gnulib bigger unnecessarily.
>

Actually, the '\0' saves about 5% off the runtime, last I checked. This is
because each time sort compares two lines, memcoll has to replace the last byte
of each. If we set them all to NUL at the start anyway, memcoll_nul wouldn't
need to do that replacement for each compare. When we output, we'd simply put
the \n back.

I could be wrong though; this is going from memory from 4-5 months ago. But 5%
is about what I remember when sorting 1M lines on 8 cores.
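
To make the idea concrete, here is a minimal sketch of it. It is not the
actual sort.c or gnulib code. The helper names (split_lines_nul,
compare_lines, write_lines) are made up for illustration, and it assumes
every line ends in '\n' and contains no embedded NUL bytes, so plain
strcoll can stand in for a NUL-terminated collation routine.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: overwrite each '\n' in BUF (LEN bytes) with NUL
   once, up front, and record the start of each line in LINES (capacity
   MAX).  Returns the number of lines found.  A trailing fragment with
   no '\n' is ignored (assumption: input is newline-terminated).  */
static size_t
split_lines_nul (char *buf, size_t len, char **lines, size_t max)
{
  assert (buf && lines);
  size_t n = 0;
  char *p = buf;
  char *end = buf + len;
  while (p < end && n < max)
    {
      char *eol = memchr (p, '\n', (size_t) (end - p));
      if (!eol)
        break;
      *eol = '\0';              /* done once per line, not once per compare */
      lines[n++] = p;
      p = eol + 1;
    }
  return n;
}

/* qsort comparator: the lines are already NUL-terminated, so there is
   nothing to save, overwrite, or restore around the collation call.  */
static int
compare_lines (const void *a, const void *b)
{
  char *const *la = a;
  char *const *lb = b;
  return strcoll (*la, *lb);
}

/* Hypothetical helper: write the lines out, putting the '\n' back.  */
static void
write_lines (FILE *out, char **lines, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      fputs (lines[i], out);
      putc ('\n', out);
    }
}
```

With these, a caller would run split_lines_nul over the input buffer, pass
the resulting array to qsort with compare_lines, and then call write_lines.
Because no comparison writes to the line data, it also doesn't matter whether
two threads look at the same line at once.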




This bug report was last modified 14 years and 313 days ago.

