Package: coreutils
Reported by: Med Maatallah <hotelsmaatallahrecemail <at> gmail.com>
Date: Tue, 20 May 2025 11:47:02 UTC
Severity: normal
Done: Pádraig Brady <P <at> draigBrady.com>
Message #8 received at 78507 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Med Maatallah <hotelsmaatallahrecemail <at> gmail.com>, 78507 <at> debbugs.gnu.org
Subject: Re: bug#78507: [Security] Heap Buffer Overflow in GNU Coreutils sort (CWE-122)
Date: Tue, 20 May 2025 16:15:26 +0100

On 20/05/2025 10:31, Med Maatallah wrote:
> Dear GNU Coreutils Maintainers,
>
> I am reporting a heap buffer overflow vulnerability (CWE-122) I've
> discovered in the GNU Coreutils sort utility. This issue affects the
> traditional key specification syntax processing and leads to an
> out-of-bounds read.
>
> Vulnerability Details
>
> The vulnerability occurs when the traditional key specification syntax
> (+POS1[.C1][OPTS]) is used with UINTMAX_MAX as the character position value.
> The begfield() function in src/sort.c performs unsafe pointer arithmetic
> that leads to integer wraparound, resulting in a pointer that points one
> byte before the start of an allocated heap buffer.
>
> The vulnerable code is in the begfield() function in src/sort.c:
>
>   static char *begfield (struct line const *line, struct keyfield const *key)
>   {
>     char *ptr = line->text, *lim = ptr + line->length - 1;
>     size_t sword = key->sword;
>     size_t schar = key->schar;
>
>     /* The leading field separator itself is included in a field when -t
>        is absent.  */
>
>     if (tab != TAB_DEFAULT)
>       while (ptr < lim && sword--)
>         {
>           while (ptr < lim && *ptr != tab)
>             ++ptr;
>           if (ptr < lim)
>             ++ptr;
>         }
>     else
>       while (ptr < lim && sword--)
>         {
>           while (ptr < lim && blanks[to_uchar (*ptr)])
>             ++ptr;
>           while (ptr < lim && !blanks[to_uchar (*ptr)])
>             ++ptr;
>         }
>
>     /* If we're ignoring leading blanks when computing the Start of
>        the field, skip past them here.  */
>     if (key->skipsblanks)
>       while (ptr < lim && blanks[to_uchar (*ptr)])
>         ++ptr;
>
>     /* Advance PTR by SCHAR (if possible), but no further than LIM.  */
>     ptr = MIN (lim, ptr + schar);
>
>     return ptr;
>   }
>
> The issue lies in the expression ptr + schar when schar is set to
> UINTMAX_MAX (18446744073709551615 on 64-bit systems). This triggers integer
> wraparound due to size_t arithmetic, causing the calculation to effectively
> become ptr - 1. As a result, the function returns a pointer that's one byte
> before the start of the allocated buffer.
>
> The vulnerability is exploitable when:
>
> 1. A user passes the key specification in traditional format
>    (+0.18446744073709551615R)
> 2. During command-line parsing in main(), this sets key->schar to
>    UINTMAX_MAX
> 3. In fillbuf(), the begfield() function is called to precompute key
>    positions
> 4. The underflow occurs during the line key pointer calculation
> 5. The function returns a pointer before the buffer start
> 6. This invalid pointer is later passed through the call chain:
>    - keycompare() function assigns the pointer to texta
>    - When using -R (random sort), it calls compare_random()
>    - compare_random() calls xstrxfrm() with the invalid pointer
>    - xstrxfrm() calls strxfrm() on the out-of-bounds address
>    - strxfrm() attempts to read the byte before the buffer, triggering
>      the overflow
>
> Technical Impact
>
> This is a heap buffer overflow (read) that accesses memory one byte before
> an allocated buffer. The vulnerability could lead to program crashes and
> potentially information disclosure depending on the memory layout.
>
> Proof of Concept
>
> The vulnerability can be reliably reproduced with this simple test case:
>
>   # Create a test file with any content
>   echo -e "aa\nbb" > poc_input.txt
>   # Execute vulnerable command (traditional key format + random sort option)
>   ./sort +0.18446744073709551615R poc_input.txt
>
> When compiled with AddressSanitizer, this command produces the following
> error:
>
> [image: image.png]
>
> The ASan output clearly shows that the issue is a READ one byte before a
> 672-byte heap-allocated region. The call stack confirms the path from
> begfield() through keycompare() and compare_random() to strxfrm().
>
> Proposed Fix
>
> A proper fix would involve checking for integer overflow before performing
> the pointer arithmetic in begfield(). Here's a suggested fix:
>
>   /* Inside begfield() */
>   /* Advance PTR by SCHAR (if possible), but no further than LIM. */
>   if (schar > 0) {
>     /* Check if adding schar would overflow or wrap negatively */
>     if (SIZE_MAX - (uintptr_t)ptr < schar) {
>       /* If it would overflow, safest is to set to end of current segment */
>       ptr = lim;
>     } else {
>       ptr = MIN(lim, ptr + schar);
>     }
>   } else {
>     /* Original behavior for schar == 0 */
>     ptr = MIN(lim, ptr + schar);
>   }
>
> This fix guards against the integer overflow by checking if ptr + schar
> would exceed the maximum representable size_t value, which indicates a
> wraparound would occur.
>
> Affected Versions
>
> This vulnerability affects GNU Coreutils through at least version, and
> potentially earlier versions. I've confirmed the issue in the current
> development code.
>
> CVE Request
>
> I would like to request a CVE for this vulnerability.
>
> Thank you for your attention to this matter.
>
> Sincerely, Mohamed Maatallah (@Zephkek)

Indeed. I introduced this in coreutils 7.2 (2009).
One can repro on Fedora for e.g. with:

  _POSIX2_VERSION=200809 LC_ALL=C valgrind sort +0.18446744073709551615R poc_input.txt
  ==984625== Memcheck, a memory error detector
  ==984625== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
  ==984625== Command: sort +0.18446744073709551615R poc_input.txt
  ==984625==
  ==984625== Invalid read of size 1

Going back to the more verbose code from coreutils 7.1 avoids the issue.
I'll test a bit more here and post a full patch in a while.

thanks!
Pádraig