GNU bug report logs - #18121
A bug in sort.

Package: coreutils;

Reported by: Tom Bryant <mainsequence <at> verizon.net>

Date: Mon, 28 Jul 2014 01:10:01 UTC

Severity: normal

Tags: notabug

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.


Message #8 received at 18121 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Tom Bryant <mainsequence <at> verizon.net>
Cc: 18121 <at> debbugs.gnu.org
Subject: Re: bug#18121: A bug in sort.
Date: Mon, 28 Jul 2014 09:41:08 +0100

On 07/28/2014 01:05 AM, Tom Bryant wrote:
> I issued a "sort -n hugeFile > sortedHugeFile" and it introduced a very occasional but destructive "x" into the data.
> 
> The original data consisted of numeric fields, separated by the vertical bar, "|", and +, - and spaces.  It was 25861964610 bytes in size.
> 
> The final file had around 10 "x" characters overwritten in it.  It too was  25861964610 bytes in size.  I copy the first few lines to give you an idea of what sort was sorting:
> 
> 0.01996377896414875189|-1.56937596815334989842|13950|13860|9|0|0|146|158|8|6|2|9697|59367|119|65406|159|161|1101364107|12467|12131|11963|5|5|5|2|3|3|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000076|1|2|1 1
> 0.05686181376938173604|-1.56865877357861105423|14858|14817|7|0|0|158|160|6|6|2|9584|16962|42|65512|167|167|1229086934|12870|12167|12014|5|5|5|2|2|2|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000185|1|7|1 2
> 0.08867460878463592766|-1.56748967932357308186|10400|10375|2|0|0|141|140|8|8|5|9290|56797|26|36|141|139|1181024763|7516|6675|6389|5|5|5|2|3|3|13182|10986|20000|20000|20000|99|99|99|99|99|2|310000001|0|0|1000431|1|10|1 3
> 0.13659213373632231314|-1.56927658619685916896|14012|13924|9|0|0|151|148|8|8|2|9611|52428|153|65530|160|159|1127037907|12431|12038|11937|5|5|5|2|3|3|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000084|1|14|1 4
> 0.15088146914756625505|-1.57030633530367280670|16079|16329|99|5|0|223|226|1|1|1|9874|37522|0|0|127|127|1085342271|15299|14894|14657|25|25|26|7|10|13|20000|20000|20000|20000|20000|99|99|99|99|99|0|0|0|0|1000007|0|0|1 5
> 0.17178172876255659585|-1.56903360727616081327|13032|12989|5|0|0|148|145|8|8|2|9647|57825|126|0|157|157|1085364212|11087|10514|10353|5|5|5|2|3|3|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000121|1|24|1 6
> 0.18379604688637316001|-1.56836539827576126882|15692|16287|39|0|0|195|200|2|2|2|9341|13621|65514|2|197|198|1085364149|14738|14268|13997|5|5|5|3|6|7|20000|20000|20000|20000|20000|99|99|99|99|99|4|1|0|0|1000243|0|0|1 7
> 
> The data, FWIW, is an ASCII representation of the UCAC4 star catalog.
> 
> Here is an example of a record with the "x" added to it by sort:
>                                                                                                                    V
> 2.04433377497687374102|0.22403821980488977661|16454|20000|99|1|0|23x24|1|1|2|8603|20560|141|65324|192|191|111893392|14129|13386|13099|25|2|2|4|99|99|20000|20000|20000|20000|20000|99|99|99|99|99|0|30|0|0|118728360|0|0|515 44588
> 
> I still have the original and flawed sort if you're interested.
> 
> The computer this error occurred on was a 16Gb machine with a 2TB drive and an Intel Quad core processor running Slackware Linux 13.0.

When processing large amounts of data (25G in this case)
and seeing corruption in the content but not in the size,
it's worth considering hardware errors.

This case might be indicative of single-bit errors in RAM,
as '|' (0x7C) and 'x' (0x78) differ in only a single bit.
I would first eliminate that possibility with a RAM checker.
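
The single-bit claim can be checked directly. A minimal sketch, assuming
plain ASCII characters; the helper name differing_bits is made up for
this illustration:

```python
def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where two ASCII characters differ."""
    diff = ord(a) ^ ord(b)  # XOR leaves a 1 wherever the two bytes disagree
    return [i for i in range(8) if (diff >> i) & 1]

print(hex(ord("|")), hex(ord("x")))  # 0x7c 0x78
print(differing_bits("|", "x"))      # [2]
```

A result with exactly one entry means one flipped bit is enough to turn
one character into the other.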

Note that sort uses a large memory buffer by default,
so it is more susceptible than most data processors to issues like this.

If you can reproduce the issue on another system, then
we can start looking at software errors.
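
Before moving to another system, it may help to check whether every
corrupted byte in the output is consistent with a single flipped bit.
The following is only a sketch under stated assumptions: the set of
expected bytes is taken from the description and sample lines in the
report (digits, '|', '+', '-', space, plus '.' and newline), and the
names EXPECTED, one_bit_neighbours and scan are hypothetical.

```python
import io

# Assumption: only these bytes legitimately occur in the data.
EXPECTED = frozenset(b"0123456789|+-. \n")

def one_bit_neighbours(byte):
    """Return the expected bytes that differ from `byte` in exactly one bit."""
    return sorted(
        byte ^ (1 << i) for i in range(8) if (byte ^ (1 << i)) in EXPECTED
    )

def scan(stream, chunk_size=1 << 20):
    """Yield (offset, byte, neighbours) for each unexpected byte in a binary stream."""
    offset = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for i, byte in enumerate(chunk):
            if byte not in EXPECTED:
                yield offset + i, byte, one_bit_neighbours(byte)
        offset += len(chunk)

# Small in-memory example; for the real file one would pass
# open("sortedHugeFile", "rb") instead of the BytesIO object.
sample = io.BytesIO(b"0|23x24|1|1\n")
for offset, byte, neighbours in scan(sample):
    print(offset, chr(byte), [chr(n) for n in neighbours])
```

An empty neighbour list for some byte would argue against the single-bit
explanation for that byte. Note that 'x' is one bit away from both '|'
and '8', so the scan alone cannot say which character was originally
there. A pure-Python byte loop will also be slow on a 25G file, so this
is better suited to a sample or to a faster reimplementation.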

thanks,
Pádraig.

P.S. Please also provide the version of sort you are using.




This bug report was last modified 11 years and 16 days ago.
