From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 16 21:23:19 2013 Received: (at submit) by debbugs.gnu.org; 17 Dec 2013 02:23:19 +0000 Received: from localhost ([127.0.0.1]:54722 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VskJW-0004wl-Km for submit@debbugs.gnu.org; Mon, 16 Dec 2013 21:23:19 -0500 Received: from eggs.gnu.org ([208.118.235.92]:42415) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VskJQ-0004wX-8X for submit@debbugs.gnu.org; Mon, 16 Dec 2013 21:23:12 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VskJK-0002xo-2B for submit@debbugs.gnu.org; Mon, 16 Dec 2013 21:23:08 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM, T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:51298) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VskJJ-0002xk-V4 for submit@debbugs.gnu.org; Mon, 16 Dec 2013 21:23:01 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44108) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VskJD-0006uR-Ud for bug-coreutils@gnu.org; Mon, 16 Dec 2013 21:23:01 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VskJ7-0002v3-RF for bug-coreutils@gnu.org; Mon, 16 Dec 2013 21:22:55 -0500 Received: from mail-pb0-x22e.google.com ([2607:f8b0:400e:c01::22e]:59183) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VskJ7-0002uq-Jg for bug-coreutils@gnu.org; Mon, 16 Dec 2013 21:22:49 -0500 Received: by mail-pb0-f46.google.com with SMTP id md12so6295395pbc.19 for ; Mon, 16 Dec 2013 18:22:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:message-id:subject:to:content-type; bh=KDieia8zmNfpJPzlsqS1xBZHtNrLaTjuLEJDNMoFTTY=; b=JE5+8K5ujjcc2ovcDqb1TSYG35TpSQbQDiHYYfWlQvpEsmqxXv37UkbmwmrnB/AOS9 9wshFcuFKL9nmBLgRHOaMInQiWiIzJQwHRahp+wkklFwXcTVK5fKGOllG74vl2zJlUmG kS2pso3+BljoVacfpzwPzRntflox2GxQWvV8Wr9vyRdshLf24211MavQImpSWHFAekym ye1llIG0sN6l+j6A4W1qSEwbLvD4GA8kih4wvR2XNL0txgejkhqz8N1h/2LdDEPI7lQ/ 1GCvV7rNXZgeuVogWbR9zoBDaONqVXgRUZNVuaIYM2H11jQQ58d076jbB5GtqswzDGfN PI0w== X-Received: by 10.66.192.74 with SMTP id he10mr24177212pac.126.1387246967962; Mon, 16 Dec 2013 18:22:47 -0800 (PST) MIME-Version: 1.0 Received: by 10.68.6.66 with HTTP; Mon, 16 Dec 2013 18:22:27 -0800 (PST) From: Jim Meyering Date: Mon, 16 Dec 2013 18:22:27 -0800 X-Google-Sender-Auth: Q_Z9sph8ACL-lXK2nUnZsb-Rg90 Message-ID: Subject: ptx: heap buffer overrun, when run with two file arguments To: bug-coreutils@gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) Hi, I built like this using just-built 4.9.0 20131216 (but it probably would work as well with 4.8.x): make check AM_CFLAGS='-ggdb3 -static-libasan -fsanitize=address' AM_LDFLAGS='-fsanitize=address -static-libasan -lpthread -ldl' and then I ran this, echo a > a && echo b > b && ./ptx -g1 -w1 a b 2>&1 | asan_symbolize.py -d and include its output below. That output shows a heap-read overrun bug that arises because ptx was designed to process only one input file, yet was later extended to process more than, but without some important adjustments. The underlying problem is that swallow_file_in_memory (called from main) is setting the contents of the global text_buffer for the first file, then updating it (clobbering old value) for the second file. Yet, some pointers to the initial buffer have been squirreled away and later, one of them (keyafter) is presumed to point into the new "text_buffer", which it does not. The subsequent SKIP_WHITE_BACKWARDS use backs up "cursor" until it is goes out of bounds. ================================================================= ==3156==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000e90f at pc 0x409612 bp 0x7fff67a426c0 sp 0x7fff67a426b8 READ of size 1 at 0x60200000e90f thread T0 #0 0x409611 in generate_all_output /cu/src/ptx.c:1425 #1 0x7f3080a89994 in __libc_start_main /home/aurel32/eglibc/eglibc-2.17/csu/libc-start.c:276 #2 0x409bb4 in _start ??:? 0x60200000e90f is located 1 bytes to the left of 3-byte region [0x60200000e910,0x60200000e913) allocated by thread T0 here: #0 0x4458e7 in __interceptor_malloc _asan_rtl_ #1 0x462c8b in fread_file /cu/lib/read-file.c:73 SUMMARY: AddressSanitizer: heap-buffer-overflow ??:0 ?? Shadow bytes around the buggy address: 0x0c047fff9cd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9ce0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9cf0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9d00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9d10: fa fa fa fa fa fa fd fd fa fa 03 fa fa fa fd fd =>0x0c047fff9d20: fa[fa]03 fa fa fa 00 00 fa fa 04 fa fa fa 04 fa 0x0c047fff9d30: fa fa fd fa fa fa fd fa fa fa 00 fa fa fa fd fa 0x0c047fff9d40: fa fa 00 fa fa fa 00 fa fa fa fd fa fa fa fd fa 0x0c047fff9d50: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa 0x0c047fff9d60: fa fa fd fa fa fa fd fa fa fa fd fa fa fa 00 fa 0x0c047fff9d70: fa fa 00 fa fa fa 00 fa fa fa 00 fa fa fa 04 fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Contiguous container OOB:fc ASan internal: fe ==3156==ABORTING From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 17 05:29:35 2013 Received: (at 16171) by debbugs.gnu.org; 17 Dec 2013 10:29:35 +0000 Received: from localhost ([127.0.0.1]:54968 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VsruB-00040t-6j for submit@debbugs.gnu.org; Tue, 17 Dec 2013 05:29:35 -0500 Received: from mail5.vodafone.ie ([213.233.128.176]:39261) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Vsru8-00040j-Aq for 16171@debbugs.gnu.org; Tue, 17 Dec 2013 05:29:33 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApMBAConsFJtTvXM/2dsb2JhbAANTINCuTyBNIMZAQEBBDIBRhALDQsJFg8JAwIBAgFFBg0BBwEBiAUIrz2ZGReNGIF6B4Q2AQOZRoU8jlI Received: from unknown (HELO [192.168.1.79]) ([109.78.245.204]) by mail3.vodafone.ie with ESMTP; 17 Dec 2013 10:29:30 +0000 Message-ID: <52B02788.3030604@draigBrady.com> Date: Tue, 17 Dec 2013 10:29:28 +0000 From: =?ISO-8859-1?Q?P=E1draig_Brady?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 MIME-Version: 1.0 To: Jim Meyering Subject: Re: bug#16171: ptx: heap buffer overrun, when run with two file arguments References: In-Reply-To: X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 16171 Cc: 16171@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On 12/17/2013 02:22 AM, Jim Meyering wrote: > Hi, > > I built like this using just-built 4.9.0 20131216 > (but it probably would work as well with 4.8.x): > > make check AM_CFLAGS='-ggdb3 -static-libasan -fsanitize=address' > AM_LDFLAGS='-fsanitize=address -static-libasan -lpthread -ldl' > > and then I ran this, > > echo a > a && echo b > b && > ./ptx -g1 -w1 a b 2>&1 | asan_symbolize.py -d > > and include its output below. > That output shows a heap-read overrun bug that arises > because ptx was designed to process only one input file, yet > was later extended to process more than, but without some > important adjustments. > > The underlying problem is that swallow_file_in_memory (called from main) > is setting the contents of the global text_buffer for the first file, > then updating it (clobbering old value) for the second file. > Yet, some pointers to the initial buffer have been squirreled away > and later, one of them (keyafter) is presumed to point into > the new "text_buffer", which it does not. The subsequent > SKIP_WHITE_BACKWARDS use backs up "cursor" until it is goes > out of bounds. Nice. This is a good illustration how test coverage can be leveraged by (future) run time checks. I see it here too (as the only failure in make check with -fsanitize=address $ rpm -q gcc gcc-4.8.2-1.fc20.x86_64 $ yum install libasan # http://bugzilla.redhat.com/991003 $ rm src/ptx.o $ make check AM_CFLAGS='-fsanitize=address' TESTS=tests/misc/ptx.pl SUBDIRS=. VERBOSE=yes $ failure identified in tests/test-suite.log ... $ src/ptx -g1 -w1 <(echo a) <(echo b) | asan_symbolize.py -d thanks! Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 28 09:52:43 2014 Received: (at 16171-done) by debbugs.gnu.org; 28 Apr 2014 13:52:43 +0000 Received: from localhost ([127.0.0.1]:43722 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Welz8-0003Fc-1H for submit@debbugs.gnu.org; Mon, 28 Apr 2014 09:52:43 -0400 Received: from mail2.vodafone.ie ([213.233.128.44]:14917) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Welyx-0003F8-Qm for 16171-done@debbugs.gnu.org; Mon, 28 Apr 2014 09:52:40 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApMBAB9cXlNtTXOj/2dsb2JhbAANTINVxT6BK4MZAQEBBCdSEAsNBAMBAgEJFg8JAwIBAgE9CAYNAQUCAQGIQgilHqRaF44QOBEHCYQwBI58ggWEGYUshUuPC4Fp Received: from unknown (HELO [192.168.1.79]) ([109.77.115.163]) by mail2.vodafone.ie with ESMTP; 28 Apr 2014 14:52:24 +0100 Message-ID: <535E5D17.3050700@draigBrady.com> Date: Mon, 28 Apr 2014 14:52:23 +0100 From: =?ISO-8859-1?Q?P=E1draig_Brady?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 MIME-Version: 1.0 To: Jim Meyering Subject: Re: bug#16171: ptx: heap buffer overrun, when run with two file arguments References: <52B02788.3030604@draigBrady.com> In-Reply-To: <52B02788.3030604@draigBrady.com> X-Enigmail-Version: 1.6 Content-Type: multipart/mixed; boundary="------------090002090303070204090404" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 16171-done Cc: 16171-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) This is a multi-part message in MIME format. --------------090002090303070204090404 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit On 12/17/2013 10:29 AM, Pádraig Brady wrote: > On 12/17/2013 02:22 AM, Jim Meyering wrote: >> Hi, >> >> I built like this using just-built 4.9.0 20131216 >> (but it probably would work as well with 4.8.x): >> >> make check AM_CFLAGS='-ggdb3 -static-libasan -fsanitize=address' >> AM_LDFLAGS='-fsanitize=address -static-libasan -lpthread -ldl' >> >> and then I ran this, >> >> echo a > a && echo b > b && >> ./ptx -g1 -w1 a b 2>&1 | asan_symbolize.py -d >> >> and include its output below. >> That output shows a heap-read overrun bug that arises >> because ptx was designed to process only one input file, yet >> was later extended to process more than, but without some >> important adjustments. >> >> The underlying problem is that swallow_file_in_memory (called from main) >> is setting the contents of the global text_buffer for the first file, >> then updating it (clobbering old value) for the second file. >> Yet, some pointers to the initial buffer have been squirreled away >> and later, one of them (keyafter) is presumed to point into >> the new "text_buffer", which it does not. The subsequent >> SKIP_WHITE_BACKWARDS use backs up "cursor" until it is goes >> out of bounds. > > Nice. This is a good illustration how test coverage > can be leveraged by (future) run time checks. > > I see it here too (as the only failure in make check with -fsanitize=address The attached should address this. I'll push later. thanks, Pádraig. --------------090002090303070204090404 Content-Type: text/x-patch; name="ptx-whitespace-heap-overflow.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="ptx-whitespace-heap-overflow.patch" >From 2f1d17b846f8078b8d03eba3b2a4ad517f9aff38 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Mon, 28 Apr 2014 13:29:41 +0100 Subject: [PATCH] ptx: fix whitespace trimming with multiple files This issue was identified by running the test suite with http://code.google.com/p/address-sanitizer/ which is included in GCC 4.8 and enabled with -fsanitize=address This was checked on Fedora 20 with GCC 4.8 as follows: $ yum install libasan # http://bugzilla.redhat.com/991003 $ rm -f src/ptx.o $ make check AM_CFLAGS='-fsanitize=address' SUBDIRS=. VERBOSE=yes $ failure identified in tests/test-suite.log To see this particular failure triggered with multiple files: $ src/ptx <(echo a) <(echo a) 2>&1 | asan_symbolize.py -d ================================================================= ==32178==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000e74f at pc 0x435442 bp 0x7fffe8a1b290 sp 0x7fffe8a1b288 READ of size 1 at 0x60200000e74f thread T0 #0 0x435441 in define_all_fields coreutils/src/ptx.c:1425 #1 0x7fa206d31d64 in __libc_start_main ??:? #2 0x42f77c in _start ??:? 0x60200000e74f is located 1 bytes to the left of 3-byte region [0x60200000e750,0x60200000e753) allocated by thread T0 here: #0 0x421809 in realloc ??:? #1 0x439b4e in fread_file coreutils/lib/read-file.c:97 Shadow bytes around the buggy address: 0x0c047fff9c90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9ca0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9cb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9cc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c047fff9cd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fd fd =>0x0c047fff9ce0: fa fa 03 fa fa fa fd fd fa[fa]03 fa fa fa 00 00 0x0c047fff9cf0: fa fa 04 fa fa fa 04 fa fa fa fd fa fa fa fd fa 0x0c047fff9d00: fa fa 00 fa fa fa fd fa fa fa 00 fa fa fa 00 fa 0x0c047fff9d10: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa 0x0c047fff9d20: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa 0x0c047fff9d30: fa fa fd fa fa fa 00 fa fa fa 00 fa fa fa 00 fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 ASan internal: fe ==32178==ABORTING The initial report and high level analysis were from Jim Meyering... "The underlying problem is that swallow_file_in_memory() is setting the contents of the global text_buffer for the first file, then updating it (clobbering old value) for the second file. Yet, some pointers to the initial buffer have been squirreled away and later, one of them (keyafter) is presumed to point into the new "text_buffer", which it does not. The subsequent SKIP_WHITE_BACKWARDS use backs up "cursor" and goes out of bounds." * src/ptx.c (text_buffers): Maintain references for the limits of each buffer corresponding to each file, rather than just the last processed. (struct OCCURS): Add a member to map back to the corresponding file. Note normally this could be computed from the "reference" member rather than needing the extra storage, however this is not possible when in --references mode. (find_occurs_in_text): Reference the array rather than a single entry. (define_all_fields): Likewise. Also avoid computing the file index since this is now stored directly. (main): Update text_buffers[] array rather than a single text_buffer. * tests/misc/ptx-overrun.sh: Even though this issue is already triggered with AddressSanitizer, add a new case to demonstrate the whitespace trimming issue, and to trigger without AddressSanitizer. Fixes https://bugs.gnu.org/16171 --- src/ptx.c | 68 +++++++++++++++++++++++++-------------------- tests/misc/ptx-overrun.sh | 13 +++++++- 2 files changed, 49 insertions(+), 32 deletions(-) diff --git a/src/ptx.c b/src/ptx.c index be342ee..b074704 100644 --- a/src/ptx.c +++ b/src/ptx.c @@ -166,7 +166,7 @@ static int total_line_count; /* total number of lines seen so far */ static const char **input_file_name; /* array of text input file names */ static int *file_line_count; /* array of 'total_line_count' values at end */ -static BLOCK text_buffer; /* file to study */ +static BLOCK *text_buffers; /* files to study */ /* SKIP_NON_WHITE used only for getting or skipping the reference. */ @@ -232,6 +232,7 @@ typedef struct DELTA left; /* distance to left context start */ DELTA right; /* distance to right context end */ int reference; /* reference descriptor */ + size_t file_index; /* corresponding file */ } OCCURS; @@ -744,7 +745,7 @@ digest_word_file (const char *file_name, WORD_TABLE *table) `----------------------------------------------------------------------*/ static void -find_occurs_in_text (void) +find_occurs_in_text (size_t file_index) { char *cursor; /* for scanning the source text */ char *scan; /* for scanning the source text also */ @@ -760,6 +761,8 @@ find_occurs_in_text (void) char *word_end; /* end of word */ char *next_context_start; /* next start of left context */ + const BLOCK *text_buffer = &text_buffers[file_index]; + /* reference_length is always used within 'if (input_reference)'. However, GNU C diagnoses that it may be used uninitialized. The following assignment is merely to shut it up. */ @@ -775,19 +778,19 @@ find_occurs_in_text (void) found inside it. Also, unconditionally assigning these variable has the happy effect of shutting up lint. */ - line_start = text_buffer.start; + line_start = text_buffer->start; line_scan = line_start; if (input_reference) { - SKIP_NON_WHITE (line_scan, text_buffer.end); + SKIP_NON_WHITE (line_scan, text_buffer->end); reference_length = line_scan - line_start; - SKIP_WHITE (line_scan, text_buffer.end); + SKIP_WHITE (line_scan, text_buffer->end); } /* Process the whole buffer, one line or one sentence at a time. */ - for (cursor = text_buffer.start; - cursor < text_buffer.end; + for (cursor = text_buffer->start; + cursor < text_buffer->end; cursor = next_context_start) { @@ -805,11 +808,11 @@ find_occurs_in_text (void) This test also accounts for the case of an incomplete line or sentence at the end of the buffer. */ - next_context_start = text_buffer.end; + next_context_start = text_buffer->end; if (context_regex.string) switch (re_search (&context_regex.pattern, cursor, - text_buffer.end - cursor, - 0, text_buffer.end - cursor, &context_regs)) + text_buffer->end - cursor, + 0, text_buffer->end - cursor, &context_regs)) { case -2: matcher_error (); @@ -915,7 +918,7 @@ find_occurs_in_text (void) total_line_count++; line_scan++; line_start = line_scan; - SKIP_NON_WHITE (line_scan, text_buffer.end); + SKIP_NON_WHITE (line_scan, text_buffer->end); reference_length = line_scan - line_start; } else @@ -956,6 +959,7 @@ find_occurs_in_text (void) occurs_cursor = occurs_table[0] + number_of_occurs[0]; + /* Define the refence field, if any. */ if (auto_reference) @@ -973,7 +977,7 @@ find_occurs_in_text (void) total_line_count++; line_scan++; line_start = line_scan; - SKIP_NON_WHITE (line_scan, text_buffer.end); + SKIP_NON_WHITE (line_scan, text_buffer->end); } else line_scan++; @@ -1007,6 +1011,7 @@ find_occurs_in_text (void) occurs_cursor->key = possible_key; occurs_cursor->left = context_start - possible_key.start; occurs_cursor->right = context_end - possible_key.start; + occurs_cursor->file_index = file_index; number_of_occurs[0]++; } @@ -1356,9 +1361,10 @@ define_all_fields (OCCURS *occurs) char *left_context_start; /* start of left context */ char *right_context_end; /* end of right context */ char *left_field_start; /* conservative start for 'head'/'before' */ - int file_index; /* index in text input file arrays */ const char *file_name; /* file name for reference */ int line_ordinal; /* line ordinal for reference */ + const char *buffer_start; /* start of buffered file for this occurs */ + const char *buffer_end; /* end of buffered file for this occurs */ /* Define 'keyafter', start of left context and end of right context. 'keyafter' starts at the saved position for keyword and extend to the @@ -1371,6 +1377,9 @@ define_all_fields (OCCURS *occurs) left_context_start = keyafter.start + occurs->left; right_context_end = keyafter.start + occurs->right; + buffer_start = text_buffers[occurs->file_index].start; + buffer_end = text_buffers[occurs->file_index].end; + cursor = keyafter.end; while (cursor < right_context_end && cursor <= keyafter.start + keyafter_max_width) @@ -1422,13 +1431,13 @@ define_all_fields (OCCURS *occurs) if (truncation_string) { cursor = before.start; - SKIP_WHITE_BACKWARDS (cursor, text_buffer.start); + SKIP_WHITE_BACKWARDS (cursor, buffer_start); before_truncation = cursor > left_context_start; } else before_truncation = 0; - SKIP_WHITE (before.start, text_buffer.end); + SKIP_WHITE (before.start, buffer_end); /* The tail could not take more columns than what has been left in the left context field, and a gap is mandatory. It starts after the @@ -1443,7 +1452,7 @@ define_all_fields (OCCURS *occurs) if (tail_max_width > 0) { tail.start = keyafter.end; - SKIP_WHITE (tail.start, text_buffer.end); + SKIP_WHITE (tail.start, buffer_end); tail.end = tail.start; cursor = tail.end; @@ -1489,7 +1498,7 @@ define_all_fields (OCCURS *occurs) if (head_max_width > 0) { head.end = before.start; - SKIP_WHITE_BACKWARDS (head.end, text_buffer.start); + SKIP_WHITE_BACKWARDS (head.end, buffer_start); head.start = left_field_start; while (head.start + head_max_width < head.end) @@ -1520,21 +1529,16 @@ define_all_fields (OCCURS *occurs) { /* Construct the reference text in preallocated space from the file - name and the line number. Find out in which file the reference - occurred. Standard input yields an empty file name. Insure line - numbers are one based, even if they are computed zero based. */ - - file_index = 0; - while (file_line_count[file_index] < occurs->reference) - file_index++; + name and the line number. Standard input yields an empty file name. + Ensure line numbers are 1 based, even if they are computed 0 based. */ - file_name = input_file_name[file_index]; + file_name = input_file_name[occurs->file_index]; if (!file_name) file_name = ""; line_ordinal = occurs->reference + 1; - if (file_index > 0) - line_ordinal -= file_line_count[file_index - 1]; + if (occurs->file_index > 0) + line_ordinal -= file_line_count[occurs->file_index - 1]; sprintf (reference.start, "%s:%d", file_name, line_ordinal); reference.end = reference.start + strlen (reference.start); @@ -2034,6 +2038,7 @@ main (int argc, char **argv) input_file_name = xmalloc (sizeof *input_file_name); file_line_count = xmalloc (sizeof *file_line_count); + text_buffers = xmalloc (sizeof *text_buffers); number_input_files = 1; input_file_name[0] = NULL; } @@ -2042,6 +2047,7 @@ main (int argc, char **argv) number_input_files = argc - optind; input_file_name = xmalloc (number_input_files * sizeof *input_file_name); file_line_count = xmalloc (number_input_files * sizeof *file_line_count); + text_buffers = xmalloc (number_input_files * sizeof *text_buffers); for (file_index = 0; file_index < number_input_files; file_index++) { @@ -2060,6 +2066,7 @@ main (int argc, char **argv) number_input_files = 1; input_file_name = xmalloc (sizeof *input_file_name); file_line_count = xmalloc (sizeof *file_line_count); + text_buffers = xmalloc (sizeof *text_buffers); if (!*argv[optind] || STREQ (argv[optind], "-")) input_file_name[0] = NULL; else @@ -2126,11 +2133,12 @@ main (int argc, char **argv) for (file_index = 0; file_index < number_input_files; file_index++) { + BLOCK *text_buffer = text_buffers + file_index; - /* Read the file in core, than study it. */ + /* Read the file in core, then study it. */ - swallow_file_in_memory (input_file_name[file_index], &text_buffer); - find_occurs_in_text (); + swallow_file_in_memory (input_file_name[file_index], text_buffer); + find_occurs_in_text (file_index); /* Maintain for each file how many lines has been read so far when its end is reached. Incrementing the count first is a simple kludge to diff --git a/tests/misc/ptx-overrun.sh b/tests/misc/ptx-overrun.sh index 693a180..7bf7271 100755 --- a/tests/misc/ptx-overrun.sh +++ b/tests/misc/ptx-overrun.sh @@ -1,5 +1,4 @@ #!/bin/sh -# Trigger a heap-clobbering bug in ptx from coreutils-6.10 and earlier. # Copyright (C) 2008-2014 Free Software Foundation, Inc. @@ -19,12 +18,12 @@ . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src print_ver_ ptx +# Trigger a heap-clobbering bug in ptx from coreutils-6.10 and earlier. # Using a long file name makes an abort more likely. # Even with no file name, valgrind detects the buffer overrun. f=01234567890123456789012345678901234567890123456789 touch $f empty || framework_failure_ - # Specifying a regular expression ending in a lone backslash # would cause ptx to write beyond the end of a malloc'd buffer. ptx -F '\' $f < /dev/null > out || fail=1 @@ -32,4 +31,14 @@ ptx -S 'foo\' $f < /dev/null >> out || fail=1 ptx -W 'bar\\\' $f < /dev/null >> out || fail=1 compare out empty || fail=1 + +# Trigger an invalid heap reference noticed by gcc -fsanitize=address +# from coreutils-8.22 and earlier. As well as an invalid memory reference, +# the issue can be seen in the output, with invalid whitespace trimming +# when multiple files are specified. +printf '%s\n' 'This is a ptx whitespace Trimming test' > ws.in +ptx ws.in ws.in | sort | uniq -u > out +compare /dev/null out || fail=1 + + Exit $fail -- 1.7.7.6 --------------090002090303070204090404-- From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 28 17:01:22 2014 Received: (at 16171) by debbugs.gnu.org; 28 Apr 2014 21:01:22 +0000 Received: from localhost ([127.0.0.1]:44156 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wesfx-0000ax-LS for submit@debbugs.gnu.org; Mon, 28 Apr 2014 17:01:21 -0400 Received: from mout.kundenserver.de ([212.227.17.13]:53570) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wesfv-0000aZ-02 for 16171@debbugs.gnu.org; Mon, 28 Apr 2014 17:01:20 -0400 Received: from [192.168.1.20] (p548E555D.dip0.t-ipconnect.de [84.142.85.93]) by mrelayeu.kundenserver.de (node=mreue105) with ESMTP (Nemesis) id 0M2d6r-1WvS3K0ijl-00sPMY; Mon, 28 Apr 2014 23:01:07 +0200 Message-ID: <535EC192.3020207@bernhard-voelker.de> Date: Mon, 28 Apr 2014 23:01:06 +0200 From: Bernhard Voelker User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0 MIME-Version: 1.0 To: 16171@debbugs.gnu.org, P@draigBrady.com, jim@meyering.net Subject: Re: bug#16171: ptx: heap buffer overrun, when run with two file arguments References: <52B02788.3030604@draigBrady.com> <535E5D17.3050700@draigBrady.com> In-Reply-To: <535E5D17.3050700@draigBrady.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Provags-ID: V02:K0:pYxRPvUoUjGq82yfcBjApd8RH6MNufETxVKzSf42EDG 0hhfznms8gw1cqtaDv6UclDEfEkZS6Zn/hHyHQ2+F+uOVV7YhN JKVrNzfj6ih8mcw/MLfIL/1NA3zvZFS8iWfMRahPAd/fvTi5rz QSnfkSXQuYt9jKyiQNWl2uZmfa1Q+83V6/UJhpsOG1xM5vtEFb EE6KfJQstKJXc9+EwFvcl6gjv2tDXw86uI2MG2D87JRX1BQ615 i7WJwL6UPrsvBs9j7uNi4jTkEnOTvo/XxWEXBX7ux9HPAiNJsY uErp01XFuyHP2Q1C7foJS8LEmBfWhS2Plra0Jm7+4z531C+paP DLK/ZHulNT9Zhl1qmstJ9Xie0spyqCisIXvf69AxY X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 16171 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On 04/28/2014 03:52 PM, Pádraig Brady wrote: > diff --git a/tests/misc/ptx-overrun.sh b/tests/misc/ptx-overrun.sh > +# Trigger an invalid heap reference noticed by gcc -fsanitize=address > +# from coreutils-8.22 and earlier. As well as an invalid memory reference, > +# the issue can be seen in the output, with invalid whitespace trimming > +# when multiple files are specified. > +printf '%s\n' 'This is a ptx whitespace Trimming test' > ws.in > +ptx ws.in ws.in | sort | uniq -u > out > +compare /dev/null out || fail=1 Isn't this a user-visible change, i.e., worth a NEWS entry? +1 otherwise. BTW: I noticed that v8.21 produces a different result than v8.22: $ /tmp/cu/coreutils-8.21/src/ptx ws.in ws.in | sort | uniq -u test This is a ptx whitespace Trimming test This is a ptx whitespace Trimming $ /tmp/cu/coreutils-8.22/src/ptx ws.in ws.in | sort | uniq -u is a ptx whitespace Trimming test This is a ptx whitespace Trimming test This ... although there hasn't been a change in src/ptx.c - or I don't see it: "git diff -r v8.21..v8.22 src/ptx.c" only shows a change in usage(). Any idea why? Have a nice day, Berny From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 28 17:36:34 2014 Received: (at 16171) by debbugs.gnu.org; 28 Apr 2014 21:36:34 +0000 Received: from localhost ([127.0.0.1]:44166 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WetE1-0001gJ-KB for submit@debbugs.gnu.org; Mon, 28 Apr 2014 17:36:33 -0400 Received: from mail2.vodafone.ie ([213.233.128.44]:5836) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WetDy-0001g2-99 for 16171@debbugs.gnu.org; Mon, 28 Apr 2014 17:36:31 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApMBAJTJXlNtTXOj/2dsb2JhbAANTINVxUGBMoMZAQEBAwEyAUYFCwsNAQoJFg8JAwIBAgFFBg0BBwEBFogfDaQspDkXjlkHhDkBA6ARjws Received: from unknown (HELO [192.168.1.79]) ([109.77.115.163]) by mail2.vodafone.ie with ESMTP; 28 Apr 2014 22:36:23 +0100 Message-ID: <535EC9D4.4030901@draigBrady.com> Date: Mon, 28 Apr 2014 22:36:20 +0100 From: =?ISO-8859-1?Q?P=E1draig_Brady?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 MIME-Version: 1.0 To: Bernhard Voelker Subject: Re: bug#16171: ptx: heap buffer overrun, when run with two file arguments References: <52B02788.3030604@draigBrady.com> <535E5D17.3050700@draigBrady.com> <535EC192.3020207@bernhard-voelker.de> In-Reply-To: <535EC192.3020207@bernhard-voelker.de> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 16171 Cc: 16171@debbugs.gnu.org, jim@meyering.net X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On 04/28/2014 10:01 PM, Bernhard Voelker wrote: > On 04/28/2014 03:52 PM, Pádraig Brady wrote: >> diff --git a/tests/misc/ptx-overrun.sh b/tests/misc/ptx-overrun.sh > >> +# Trigger an invalid heap reference noticed by gcc -fsanitize=address >> +# from coreutils-8.22 and earlier. As well as an invalid memory reference, >> +# the issue can be seen in the output, with invalid whitespace trimming >> +# when multiple files are specified. >> +printf '%s\n' 'This is a ptx whitespace Trimming test' > ws.in >> +ptx ws.in ws.in | sort | uniq -u > out >> +compare /dev/null out || fail=1 > > Isn't this a user-visible change, i.e., worth a NEWS entry? Good point. I'll add a NEWS entry. > BTW: I noticed that v8.21 produces a different result than v8.22: > > $ /tmp/cu/coreutils-8.21/src/ptx ws.in ws.in | sort | uniq -u > test This is a ptx whitespace Trimming > test This is a ptx whitespace Trimming > $ /tmp/cu/coreutils-8.22/src/ptx ws.in ws.in | sort | uniq -u > is a ptx whitespace Trimming test This > is a ptx whitespace Trimming test This It's basically undefined behavior when trimming whitespace depending on values on the heap. So if you look at the full output it should be largely the same apart from the whitespace. cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 29 01:13:33 2014 Received: (at 16171-done) by debbugs.gnu.org; 29 Apr 2014 05:13:33 +0000 Received: from localhost ([127.0.0.1]:44306 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wf0MG-0007Ap-Nq for submit@debbugs.gnu.org; Tue, 29 Apr 2014 01:13:32 -0400 Received: from mail-yk0-f174.google.com ([209.85.160.174]:60988) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wf0MD-0007AX-QB for 16171-done@debbugs.gnu.org; Tue, 29 Apr 2014 01:13:30 -0400 Received: by mail-yk0-f174.google.com with SMTP id 20so6563429yks.33 for <16171-done@debbugs.gnu.org>; Mon, 28 Apr 2014 22:13:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type:content-transfer-encoding; bh=KKa90/dXee92A+R0nDDoGu+6VIBDHK8pZekgyWMdYVI=; b=YrovJOWBpipg3Y2r+aOSjYeg118QinicHS+S5ZTF+CDql8i7cukKRsqtrRNk0x1sVH Vv5th8XvpZe2QQIuAWOIA9UoWjrmdY4/Czboxn9gqI4JnRL0Z55nQvpCW/Rmg4SlxIfH oG6mSZ6MuFSWKDrCXoTkEA8pNBYQsrmb4BIQ131lkuOVQ+vaVzPsBmx7NIRi4FrugnRA um7wwbHJOJBnfXCa+c9vbC2YKvQDujQaaDyRnTU+WmcihxqcpFv+8u6S2+9OWjg8993+ OIcgW3S/b4z4hCd8Wo5+rgyZjb9HvttvrkXK+aleCSKvA6EwHlbYp0frJU8k1IuEZwCX e38Q== X-Received: by 10.236.197.68 with SMTP id s44mr75118yhn.109.1398748404172; Mon, 28 Apr 2014 22:13:24 -0700 (PDT) MIME-Version: 1.0 Received: by 10.170.127.18 with HTTP; Mon, 28 Apr 2014 22:13:04 -0700 (PDT) In-Reply-To: <535E5D17.3050700@draigBrady.com> References: <52B02788.3030604@draigBrady.com> <535E5D17.3050700@draigBrady.com> From: Jim Meyering Date: Mon, 28 Apr 2014 22:13:04 -0700 X-Google-Sender-Auth: sZ3ButIvDe1wlBQyuawYFThlzYc Message-ID: Subject: Re: bug#16171: ptx: heap buffer overrun, when run with two file arguments To: =?ISO-8859-1?Q?P=E1draig_Brady?= Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 16171-done Cc: 16171-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Mon, Apr 28, 2014 at 6:52 AM, P=E1draig Brady wrote: ... >> I see it here too (as the only failure in make check with -fsanitize=3Da= ddress > > The attached should address this. Thanks for taking the time to address that. The patch looks fine, modulo what looks like an accidentally-added empty line all by itself in one chunk, just before this comment: /* Define the refence field, if any. */ From unknown Fri Aug 15 15:35:26 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Tue, 27 May 2014 11:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator