GNU bug report logs - #7182
sort -R slow

Package: coreutils;

Reported by: Ole Tange <tange <at> gnu.org>

Date: Sat, 9 Oct 2010 13:12:02 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Davide Brini <dave_br <at> gmx.com>
To: bug-coreutils <at> gnu.org
Subject: Re: bug#7182: sort -R slow
Date: Sat, 9 Oct 2010 22:06:28 +0100

On Sat, 9 Oct 2010 14:52:41 +0200 Ole Tange <tange <at> gnu.org> wrote:

> I recently needed to randomize some lines. So I tried using 'sort -R'.
> I was astonished at how slow that was, so I tested how slow competing
> strategies are. GNU sort is two orders of magnitude slower than unsort
> and more than one order of magnitude slower than perl:
> 
> $ time unsort file
> real    0m1.388s
> 
> $ unsort --version
> unsort 1.1.2
> 
> $ time perl -e 'print sort { rand() <=> rand() } <>' file
> real    0m6.621s
> 
> $ time sort -R file
> real    4m8.403s
> 
> $ sort --version
> sort (GNU coreutils) 8.5
> 
> What is even scarier: sort without -R is faster than sort -R:
> 
> $ time sort file
> real    0m53.553s
> 
> I would expect sort -R to be faster than sort and faster than Perl if
> not as fast as unsort.

On my system, locale settings seem to impact the runtime significantly:

$ wc -l bigfile 
1000000 bigfile

$ time LC_ALL=en_US.utf8 sort -R bigfile > /dev/null

real	1m29.302s
user	1m21.009s
sys	0m0.155s

$ time LC_ALL=C sort -R bigfile > /dev/null

real	0m38.881s
user	0m35.276s
sys	0m0.118s


However, shuf is much faster, and seems mostly unaffected by the locale
used:

$ time shuf bigfile > /dev/null

real	0m1.044s
user	0m0.833s
sys	0m0.042s
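
The two tools are also not doing quite the same job, which may account for part of the gap. As documented, sort -R orders lines by a random hash of their keys, so identical lines always end up next to each other, whereas shuf emits a random permutation of the input lines. The following is only a rough sketch to make that difference visible; it assumes GNU coreutils is installed, and the temporary directory and file names are made up for illustration:

```shell
# Build a small input containing duplicate lines (file names are hypothetical).
tmp=$(mktemp -d)
printf '%s\n' a b a c b a > "$tmp/in"

# shuf: random permutation of the lines; duplicates may land anywhere.
shuf "$tmp/in" > "$tmp/shuf.out"

# sort -R: sorts by a random hash of each key, so equal lines stay adjacent.
LC_ALL=C sort -R "$tmp/in" > "$tmp/sortR.out"

# Both outputs should hold exactly the same lines as the input.
LC_ALL=C sort "$tmp/in" > "$tmp/in.sorted"
LC_ALL=C sort "$tmp/shuf.out"  | cmp -s - "$tmp/in.sorted" && echo "shuf: same lines as input"
LC_ALL=C sort "$tmp/sortR.out" | cmp -s - "$tmp/in.sorted" && echo "sort -R: same lines as input"

# Because sort -R groups equal lines, collapsing adjacent duplicates
# leaves one line per distinct value.
uniq "$tmp/sortR.out" | wc -l

# Remove the scratch directory when done: rm -rf "$tmp"
```

If duplicates must be scattered rather than grouped, shuf is the tool that does that; sort -R only randomizes the order of the distinct keys.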

-- 
D.





