GNU bug report logs -
#62483
echo a | grep -E -w '((()|a)|())*' # does not terminate
Previous Next
Full log
Message #50 received at 62483 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
I found it when I was testing various new regular expression algorithms
against grep (which I used as the golden standard for this).
I used a random generator for regular expressions (using the QuickCheck
framework) and then shrinking/delta debugging to automatically find the
smallest failing test case.
BTW, if you are interested, I could do a larger more targeted effort stress
testing grep like this and possibly find more test cases with unexpected
behavior. I would need some guidance on where to put most effort in order
to be as useful as this can be. I could find a MSc student to help out with
that. Let me know if this sounds like an interesting thing to do!
kind regards,
/Koen
On Sun, Apr 2, 2023 at 6:15 AM Jim Meyering <jim <at> meyering.net> wrote:
> On Mon, Mar 27, 2023 at 6:15 AM Koen Claessen <koen <at> chalmers.se> wrote:
> > Running the command:
> >
> > echo a | grep -E -w '((()|a)|())*'
> >
> > does not terminate, and uses a LOT of processor time, for all versions of
> > grep I have tried.
> >
> > This is the smallest case that could be found; simplifying anything in
> the
> > input and/or expression leads to correct behavior.
>
> Thank you! How did you find that?
>
> FYI, this strikes grep-3.10 (on Fedora 37/glibc-2.36-9.fc37.x86_64)
> when using LC_ALL=en_US.UTF-8, but not with LC_ALL=C.
> I.e., this infloops:
> echo a | LC_ALL=en_US.UTF-8 grep -E -w '((()|a)|())*'
>
> but this works as expected and promptly prints its line of input:
> echo a | LC_ALL=C grep -E -w '((()|a)|())*'
>
> For now, I've added an expected-failing test case for this bug:
>
[Message part 2 (text/html, inline)]
This bug report was last modified 2 years and 74 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.