GNU bug report logs - #7182: sort -R slow
Reported by: Ole Tange <tange <at> gnu.org>
Date: Sat, 9 Oct 2010 13:12:02 UTC
Severity: normal
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
Message #11 received at submit <at> debbugs.gnu.org:
On Sat, 9 Oct 2010 14:52:41 +0200 Ole Tange <tange <at> gnu.org> wrote:
> I recently needed to randomize some lines, so I tried using 'sort -R'.
> I was astonished at how slow that was, so I tested how slow competing
> strategies are. GNU sort is two orders of magnitude slower than unsort
> and more than one order of magnitude slower than perl:
>
> $ time unsort file
> real 0m1.388s
>
> $ unsort --version
> unsort 1.1.2
>
> $ time perl -e 'print sort { rand() <=> rand() } <>' file
> real 0m6.621s
>
> $ time sort -R file
> real 4m8.403s
>
> $ sort --version
> sort (GNU coreutils) 8.5
>
> What is even scarier: sort without -R is faster than sort -R:
>
> $ time sort file
> real 0m53.553s
>
> I would expect sort -R to be faster than sort and faster than Perl if
> not as fast as unsort.
On my system, locale settings seem to impact the runtime significantly:
$ wc -l bigfile
1000000 bigfile
$ time LC_ALL=en_US.utf8 sort -R bigfile > /dev/null
real 1m29.302s
user 1m21.009s
sys 0m0.155s
$ time LC_ALL=C sort -R bigfile > /dev/null
real 0m38.881s
user 0m35.276s
sys 0m0.118s
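For anyone who wants to repeat this comparison, a rough sketch follows. The
elapsed_ms helper, the temporary file, and the 100000-line input are
assumptions made for illustration and are not part of the measurements
above; the helper also relies on GNU date's %N format. It only compares the
C locale against whatever locale the environment already provides, so the
numbers will not match the ones quoted here.

```shell
# Hypothetical helper: run a command, discard its output, and print the
# elapsed wall-clock time in milliseconds (needs GNU date for %N).
elapsed_ms() {
    start=$(date +%s%N)
    "$@" > /dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Assumed test input: 100000 numbered lines, far smaller than bigfile.
testfile=$(mktemp)
seq 1 100000 > "$testfile"

# Same sort -R run, first with byte-wise comparison, then with the
# locale inherited from the environment.
echo "C locale:       $(elapsed_ms env LC_ALL=C sort -R "$testfile") ms"
echo "ambient locale: $(elapsed_ms sort -R "$testfile") ms"

rm -f "$testfile"
```

Timings from a single run are noisy, so it is worth repeating each
measurement a few times before drawing conclusions.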
However, shuf is much faster, and seems mostly unaffected by the locale
used:
$ time shuf bigfile > /dev/null
real 0m1.044s
user 0m0.833s
sys 0m0.042s
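Part of the gap may come from the two tools doing different jobs. Going by
the coreutils documentation, and not by anything measured in this thread,
sort -R sorts on a random hash of each key, so identical lines always end up
next to each other, while shuf simply emits a random permutation. A small
sketch of that difference, using a made-up four-line input:

```shell
# Four input lines containing only two distinct values (illustrative data).
input=$(printf 'a\nb\na\nb\n')

# sort -R keeps identical lines adjacent, so collapsing adjacent repeats
# with uniq leaves one line per distinct value.
groups=$(printf '%s\n' "$input" | sort -R | uniq | wc -l)
echo "sort -R adjacent groups: $groups"

# shuf permutes the lines; every line is kept, but identical lines may
# be separated from each other.
count=$(printf '%s\n' "$input" | shuf | wc -l)
echo "shuf output lines: $count"
```

If identical lines do not need to stay grouped, shuf is the closer match for
plain line randomization.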
--
D.
This bug report was last modified 13 years and 350 days ago.