GNU bug report logs - #6020
coreutils-8.x: a simple feature enhancement, and how to do it

Package: coreutils;

Reported by: "Nelson H. F. Beebe" <beebe <at> math.utah.edu>

Date: Sat, 24 Apr 2010 01:31:01 UTC

Severity: normal

Fixed in version 8.6

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.


From: "Nelson H. F. Beebe" <beebe <at> math.utah.edu>
To: 6020 <at> debbugs.gnu.org
Cc: beebe <at> math.utah.edu
Subject: bug#6020: coreutils-8.x: a simple feature enhancement, and how to do it
Date: Fri, 23 Apr 2010 19:30:17 -0600 (MDT)
In 1981, 29 years ago, Intel introduced the 8087 floating-point
coprocessor that implemented an early draft of the 1985 IEEE 754
standard for binary floating-point arithmetic.  That chip and all
subsequent Intel IA-32, IA-64, EM64T, and AMD AMD64 (aka x86_64)
architectures provide three floating-point formats in hardware:

	32-bit	24-bit significand, number range ~= 1.40e-45 .. 3.40e+38,
		roughly 7 decimal digits
		C type float

	64-bit	53-bit significand, number range ~= 4.94e-324 .. 1.80e+308,
		roughly 16 decimal digits
		C type double

	80-bit	(variously stored in 10-, 12-, or 16-byte memory blocks)
		64-bit significand, number range ~= 3.64e-4951 .. 1.19e+4932,
		roughly 19 decimal digits
		C type long double

Several other CPU platforms provide a 128-bit format instead of the
80-bit format, with these properties:

	128-bit	113-bit significand, number range ~= 6.48e-4966 .. 1.19e+4932,
		roughly 34 decimal digits
		C type long double

In 2009, the IEEE 754 Standard was revised to include the above, plus
decimal arithmetic, the latter with these properties:

	32-bit	7 digits, number range 1e-101 .. 9.999_999e+96

	64-bit	16 digits, number range 1e-398 .. 9.999_999_999_999_999e+384

	128-bit	34 digits, number range 1e-6176 .. 9.999_999_999_999_999_999_999_999_999_999_999e+6144

At present, up to version 8.5, coreutils uses only type double in its
implementation of the -g sort-ordering option.  The result is that it
is unable to correctly sort files that use the entire number range of
IEEE 754 binary arithmetic; indeed, measured by decimal exponent, the
double format covers only about 6% of the widest binary range (308 of
4932 orders of magnitude), and 5% of the widest decimal range (308 of
6144).
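
To make the range limit concrete, here is a minimal sketch (not part
of coreutils) that reports whether a numeric string overflows each
type.  The helper names are hypothetical, and the long double result
depends on how the platform implements that type.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: return 1 if parsing TEXT as a double reports a
   range error (ERANGE), 0 otherwise.  */
int
out_of_double_range (const char *text)
{
  errno = 0;
  (void) strtod (text, NULL);
  return errno == ERANGE;
}

/* Same check for long double.  The answer depends on the platform's
   long double format.  */
int
out_of_long_double_range (const char *text)
{
  errno = 0;
  (void) strtold (text, NULL);
  return errno == ERANGE;
}
```

With double, every input above about 1.80e+308 collapses to the same
overflow value, so "1e4000" and "1e4001" cannot be told apart.  On a
system with the 80-bit format, the long double variant would be
expected to accept both.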

Please extend the next version of coreutils to use "long double"
instead of "double" in this operation.  Here is a patch that worked
for one recent coreutils release:

*** src/sort.c.~1~      Sun Jan  3 10:06:20 2010
--- src/sort.c  Mon Jan 18 08:24:18 2010
***************
*** 1792,1799 ****
--- 1792,1805 ----

    char *ea;
    char *eb;
+
+ #if 0
    double a = strtod (sa, &ea);
    double b = strtod (sb, &eb);
+ #else
+   long double a = strtold (sa, &ea);
+   long double b = strtold (sb, &eb);
+ #endif

    /* Put conversion errors at the start of the collating sequence.  */
    if (sa == ea)

The "long double" type is required by both C89 and C99, but the
strtold() function first appeared in C99 (although many vendors
supplied it before then).  If strtold() is absent, then

	long double x; if (sscanf(s, "%Lg", &x) == 1) {...}

is often a reasonable replacement.
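
As a sketch of that fallback, the hypothetical helper below wraps
sscanf() and uses "%n" to recover an end pointer, which the comparison
code needs in order to detect conversion errors.  It is only an
approximation of strtold(): the C standard leaves sscanf() behavior
undefined when the value is out of range, and the two functions do not
agree on every malformed input.

```c
#include <stdio.h>

/* Hypothetical stand-in for strtold(): parse the number at the start
   of S into *VALUE and set *END just past it.  Return 1 on success;
   on failure return 0 and set *END to S.  */
int
scan_long_double (const char *s, long double *value, const char **end)
{
  int consumed = 0;

  /* %n stores the number of characters read so far and does not
     count toward the return value of sscanf().  */
  if (sscanf (s, "%Lg%n", value, &consumed) == 1)
    {
      *end = s + consumed;
      return 1;
    }

  *end = s;
  return 0;
}
```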

However, note that some aberrant systems implement "long double" as
"double" (e.g., DEC Alpha OSF/1 4.x, Minix, and most *BSD
distributions), and some implement it in doubled-double format, which
increases the precision, but leaves the range at that of double.
Examples of the latter include Apple Mac OS X on PowerPC, IBM AIX on
PowerPC, and SGI IRIX on MIPS.
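
Whether a given system falls into one of these categories can be read
from <float.h>.  The following is a minimal sketch with hypothetical
function names; the macros themselves are standard C.

```c
#include <float.h>

/* Return 1 if long double has a wider exponent range than double.
   This is 0 both when long double is plain double and when it is a
   doubled-double.  */
int
long_double_extends_range (void)
{
  return LDBL_MAX_EXP > DBL_MAX_EXP;
}

/* Return 1 if long double carries more significand bits than double.
   This is 1 for a doubled-double even though its range is unchanged.  */
int
long_double_extends_precision (void)
{
  return LDBL_MANT_DIG > DBL_MANT_DIG;
}
```

Only the first test matters for sorting numbers beyond the range of
double; a doubled-double passes the second test but not the first.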

I suggest a configure-time check for strtold(), and if that works,
then use "long double" in sort.c.
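
On the C side, that could look roughly like the sketch below.  It
assumes the configure check defines HAVE_STRTOLD (the usual result of
an Autoconf function check); sort_number, PARSE_NUMBER, and
general_numeric_compare are hypothetical names, not the ones in
sort.c, and NaN handling is omitted.

```c
#include <stdlib.h>

/* HAVE_STRTOLD is assumed to come from the configure-time check.  */
#if HAVE_STRTOLD
typedef long double sort_number;
# define PARSE_NUMBER strtold
#else
typedef double sort_number;
# define PARSE_NUMBER strtod
#endif

/* Compare the numbers at the start of SA and SB.  Return a negative,
   zero, or positive value, as a sort comparison function does.  */
int
general_numeric_compare (const char *sa, const char *sb)
{
  char *ea;
  char *eb;
  sort_number a = PARSE_NUMBER (sa, &ea);
  sort_number b = PARSE_NUMBER (sb, &eb);

  /* Put conversion errors at the start of the collating sequence.  */
  if (sa == ea)
    return sb == eb ? 0 : -1;
  if (sb == eb)
    return 1;

  return (a > b) - (a < b);
}
```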

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe <at> math.utah.edu  -
- 155 S 1400 E RM 233                       beebe <at> acm.org  beebe <at> computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------




