GNU bug report logs - #75606
undefined behaviour in sort.c


Package: coreutils;

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Thu, 16 Jan 2025 16:20:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Bruno Haible <bruno <at> clisp.org>
To: 75606 <at> debbugs.gnu.org
Subject: bug#75606: undefined behaviour in sort.c
Date: Thu, 16 Jan 2025 17:18:53 +0100
Testing the current coreutils with the current gnulib, I see undefined
behaviour in sort.c, in or around the functions
  debug_line
  debug_key
  debug_width

 ------------------------------------------------------------------------------

Found by building on Ubuntu 24.04, with clang 19,

CC="clang -fsanitize=address,undefined,signed-integer-overflow,shift,integer-divide-by-zero -fno-sanitize-recover=undefined"

and running the test suite. The log shows this:

+ printf 'A\tchr10\nB\tchr1\n'
+ sort -s -k2.4b,2.3n --debug
sort: text ordering performed using simple byte comparison
sort: leading blanks are significant in key 1; consider also specifying 'b'
sort: note numbers use '.' as a decimal point in this locale
../lib/mbswidth.c:60:26: runtime error: addition of unsigned offset to 0x76f80b601805 overflowed to 0x76f80b601804
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/mbswidth.c:60:26 
../tests/sort/sort-debug-keys.sh: line 292: 345166 Done                    printf 'A\tchr10\nB\tchr1\n'
     345167 Aborted                 | sort -s -k2.4b,2.3n --debug

 ------------------------------------------------------------------------------

How to reproduce without clang and UBSAN:

$ printf 'A\tchr10\nB\tchr1\n' > in
$ gdb src/sort
(gdb) break mbsnwidth
(gdb) run -s -k2.4b,2.3n --debug < in

The first time the mbsnwidth function is invoked:
Breakpoint 1, mbsnwidth (string=0x51d000000a80 "A\tchr10", nbytes=5, flags=0) at ../lib/mbswidth.c:59

The second time the mbsnwidth function is invoked:
Breakpoint 1, mbsnwidth (string=0x51d000000a85 "10", nbytes=18446744073709551615, flags=0) at ../lib/mbswidth.c:59

The nbytes value is obviously bogus: 18446744073709551615 is SIZE_MAX on
this 64-bit platform, that is, (size_t) -1. The documentation of mbsnwidth() says:
  /* Returns the number of screen columns needed for the NBYTES bytes
     starting at BUF.  */
  extern int mbsnwidth (const char *buf, size_t nbytes, int flags);

Stack trace at the second invocation:

(gdb) where
#0  mbsnwidth (string=0x51d000000a85 "10", nbytes=18446744073709551615, flags=0) at ../lib/mbswidth.c:59
#1  0x00005555556bff9e in debug_width (text=0x51d000000a85 "10", lim=0x51d000000a84 "r10") at ../src/sort.c:2326
#2  0x00005555556bfee6 in debug_key (line=0x51d000001280, key=0x5070000001e0) at ../src/sort.c:2415
#3  0x00005555556beabc in debug_line (line=0x51d000001280) at ../src/sort.c:2427
#4  0x00005555556b487f in write_line (line=0x51d000001280, fp=0x7ffff78045c0 <_IO_2_1_stdout_>, output_file=0x0) at ../src/sort.c:2942
#5  0x00005555556cc3be in write_unique (line=0x51d000001280, tfp=0x7ffff78045c0 <_IO_2_1_stdout_>, temp_output=0x0) at ../src/sort.c:3577
#6  0x00005555556d15c8 in mergelines_node (node=0x51d000001500, total_lines=2, tfp=0x7ffff78045c0 <_IO_2_1_stdout_>, temp_output=0x0)
    at ../src/sort.c:3624
#7  0x00005555556cf27f in merge_loop (queue=0x7ffff59096c0, total_lines=2, tfp=0x7ffff78045c0 <_IO_2_1_stdout_>, temp_output=0x0)
    at ../src/sort.c:3708
#8  0x00005555556cbff7 in sortlines (lines=0x51d0000012a0, nthreads=8, total_lines=2, node=0x51d000001500, queue=0x7ffff59096c0, 
    tfp=0x7ffff78045c0 <_IO_2_1_stdout_>, temp_output=0x0) at ../src/sort.c:3825
#9  0x00005555556ae58f in sort (files=0x502000000418, nfiles=0, output_file=0x0, nthreads=8) at ../src/sort.c:4124
#10 0x00005555556a2d20 in main (argc=4, argv=0x7fffffffd0c8) at ../src/sort.c:4900

Bruno







This bug report was last modified 120 days ago.
