GNU bug report logs - #18806
grep -rP getline crashes prematurely (without displaying all results) on invalid UTF-8 input with LC_ALL=en_US.UTF-8


Package: grep;

Reported by: Shlomi Fish <shlomif <at> shlomifish.org>

Date: Thu, 23 Oct 2014 11:16:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #40 received at 18806 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>, 18806 <at> debbugs.gnu.org, 
 Shlomi Fish <shlomif <at> shlomifish.org>
Cc: 18806-done <at> debbugs.gnu.org, Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Subject: Re: bug#18806: grep -rP getline crashes prematurely (without
 displaying all results) on invalid UTF-8 input with LC_ALL=en_US.UTF-8
Date: Sat, 25 Oct 2014 16:11:33 -0700
Jim Meyering wrote:
> after your change,
> our pcre-invalid-utf8-input hangs. That happens because the following
> infloops (stuck in pcre_exec) on a CentOS6 system:
>
>    printf 'j\202j\nj\nk\202\n' > in; LC_ALL=en_US.utf8 src/grep -P 'k$' in
>
> That binary was linked with the libpcre from this package:
>
>    pcre-7.8-4.el6.x86_64

I'm getting a failure in pcre-invalid-utf8-input both before and after the
change, with CentOS 6.5 and pcre-7.8-6.el6.x86_64.  In my case the failures are
segmentation violations; perhaps 7.8-4 has a different failure mode, or perhaps
there's some other minor change to your platform that causes libpcre to infloop.
Either way, this appears to be a PCRE bug that grep can't be expected to work
around.
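
For reference, here is a minimal Python sketch of why the reproducer's input counts as invalid UTF-8. It is not part of grep or its test suite, and the function name invalid_utf8_lines is made up for illustration. The octal escape \202 in the printf command is the byte 0x82, a UTF-8 continuation byte that appears here without a preceding lead byte, so the lines containing it cannot be decoded.

```python
# Bytes produced by: printf 'j\202j\nj\nk\202\n'   (octal \202 == 0x82)
data = b"j\x82j\nj\nk\x82\n"


def invalid_utf8_lines(buf: bytes) -> list[int]:
    """Return the 1-based numbers of the lines in buf that are not valid UTF-8.

    Hypothetical helper for illustration only; grep itself does not work this way.
    """
    bad = []
    for lineno, line in enumerate(buf.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            # A stray continuation byte such as 0x82 lands here.
            bad.append(lineno)
    return bad


print(invalid_utf8_lines(data))
```

The line that 'k$' is meant to match (the third one) is itself one of the invalid lines, so the matcher cannot simply skip the bad input.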

Does the attached patch cause the test to fail reliably for you, instead of looping?

By the way, I'm not sure why tests distinguish between require_en_utf8_locale_ 
and require_compiled_in_MB_support; the latter requires the former, and there's 
no point requiring the former unless we also require the latter.

[pcre.diff (text/plain, attachment)]

This bug report was last modified 10 years and 210 days ago.


