Package: sed
Reported by: Saito Takaaki <tails.saito <at> gmail.com>
Date: Thu, 30 Aug 2018 14:44:01 UTC
Severity: normal
Tags: fixed
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
Message #17 received at 32592 <at> debbugs.gnu.org (full text, mbox):
From: Assaf Gordon <assafgordon <at> gmail.com>
To: Saito Takaaki <tails.saito <at> gmail.com>, 32592 <at> debbugs.gnu.org, bug-gnulib <at> gnu.org
Cc: bill-auger <bill-auger <at> peers.community>, Eric Blake <eblake <at> redhat.com>, Jim Meyering <jim <at> meyering.net>
Subject: bug#32592: heap-use-after-free in regex module (was: s with i modifier seems to work incorrectly)
Date: Wed, 5 Sep 2018 01:32:27 -0600

(adding gnulib)

On 04/09/18 07:02 PM, Saito Takaaki wrote:
[... discussing a sed bug ...]
> However, a friend showed me a more complex case which is
> problematic even with sed 4.4 on ideone. The last two lines of the
> output (for the identical input lines) are particularly interesting.
> https://ideone.com/Sq5xJX
>
> I hope this helps even a bit.

Thank you for persisting with this bug.

The linked snippet you provided exposed a heap-use-after-free bug in
gnulib's regex module (possibly in glibc as well).

A simple way to reproduce with latest sed:

  cd sed
  ./bootstrap
  ./configure --with-included-regex
  make
  echo 'abcdefghijklmns!!!!!!!!!!' \
    | valgrind ./sed/sed -E 'h;G;s/((.).+(.))(.*\n.*\1)/\2-\3\4/i'

Results in a use-after-free relating to the back-references
(valgrind output below).

There's some interplay with the input length - if the exclamation
marks are removed, the bug is not triggered.
The bug does not trigger without the case-insensitive flag (s///i).

This is easier to trigger with gnulib (hence --with-included-regex)
but happens also with glibc's regex module.

This could also mean that the bug you previously reported and I
surmised was fixed is not fixed at all - could be that it was just
much harder to trigger with later sed versions.

I'm still learning the code so don't have a fix yet.

comments welcomed,
 - assaf

=========================

==13408== Memcheck, a memory error detector
==13408== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==13408== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==13408== Command: ./sed/sed -E h;G;s/((.).+(.))(.*\\n.*\\1)/\\2-\\3\\4/i
==13408==
==13408== Invalid read of size 1
==13408==    at 0x123857: get_subexp (regexec.c:2747)
==13408==    by 0x123857: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==  Address 0x56096d0 is 16 bytes inside a block of size 42 free'd
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x11CBBA: clean_state_log_if_needed (regexec.c:1697)
==13408==    by 0x123967: get_subexp (regexec.c:2778)
==13408==    by 0x123967: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==  Block was alloc'd at
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x124A1A: check_matching (regexec.c:1125)
==13408==    by 0x124A1A: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==
==13408== Invalid read of size 1
==13408==    at 0x12385C: get_subexp (regexec.c:2747)
==13408==    by 0x12385C: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==  Address 0x56096ea is 0 bytes after a block of size 42 free'd
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x11CBBA: clean_state_log_if_needed (regexec.c:1697)
==13408==    by 0x123967: get_subexp (regexec.c:2778)
==13408==    by 0x123967: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==  Block was alloc'd at
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x124A1A: check_matching (regexec.c:1125)
==13408==    by 0x124A1A: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==
a-!!!!!!!!!!
abcdefghijklmns!!!!!!!!!!
==13408==
==13408== HEAP SUMMARY:
==13408==     in use at exit: 1,840 bytes in 5 blocks
==13408==   total heap usage: 1,131 allocs, 1,126 frees, 205,127 bytes allocated
==13408==
==13408== LEAK SUMMARY:
==13408==    definitely lost: 0 bytes in 0 blocks
==13408==    indirectly lost: 0 bytes in 0 blocks
==13408==      possibly lost: 0 bytes in 0 blocks
==13408==    still reachable: 1,840 bytes in 5 blocks
==13408==         suppressed: 0 bytes in 0 blocks
==13408== Rerun with --leak-check=full to see details of leaked memory
==13408==
==13408== For counts of detected and suppressed errors, rerun with: -v
==13408== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)