GNU bug report logs - #16481
dfa.c and Rational Range Interpretation

Package: grep;

Reported by: Aharon Robbins <arnold <at> skeeve.com>

Date: Fri, 17 Jan 2014 13:41:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: arnold <at> skeeve.com
To: eggert <at> cs.ucla.edu, bonzini <at> gnu.org, arnold <at> skeeve.com, 16481 <at> debbugs.gnu.org
Subject: bug#16481: dfa.c and Rational Range Interpretation
Date: Mon, 10 Feb 2014 02:00:07 -0700

Paolo Bonzini <bonzini <at> gnu.org> wrote:

> On 10/02/2014 03:35, Paul Eggert wrote:
> > Paolo Bonzini wrote:
> >> The correct course of action for grep is to defer range interpretation
> >> to regex, because otherwise you can get mismatches between regexes with
> >> backreferences and those without.
> >
> > It depends on what one means by "correct".  POSIX doesn't say what to do
> > in this situation, so it's OK as far as POSIX is concerned for grep to
> > use RRI in the typical case (i.e., without backreferences), and for grep
> > to use some other interpretation in the rare cases when backreferences
> > are used.
> >
> > The documentation for 'grep' attempts to address this issue, perhaps not
> > as clearly as it could.  Maybe the installation instructions should talk
> > about it as well, and suggest --with-included-regex for people who care
> > about this sort of thing.
>
> Yeah, that makes sense.  I will revert the commit.

I think this is the wrong course of action. Paul suggested updating the
doc to be more clear, not reverting the code.

Personally, I think grep should always use the included regex so that
the behavior is consistent across all platforms; this is why gawk
always uses its own regex.

If the only way to use collating sequences and equivalence classes is
with GLIBC, then I think it'd be better to pull the __LIBC bits out into
the standalone regex somehow.

In response to another question: Making GLIBC's regex support RRI isn't
hard. Getting the GLIBC maintainers to accept the patch is. :-(
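
To illustrate what the difference amounts to, here is a minimal sketch.
It is not code from dfa.c or from GLIBC; both function names are made up
for this example. Under RRI a range is a plain comparison of character
code values, while the locale-dependent interpretation asks the
collation order, here through wcscoll:

```c
#include <stdbool.h>
#include <wchar.h>

/* Hypothetical helper, not from dfa.c or GLIBC: under RRI a range
   LO-HI matches C when C's code value lies between the endpoints.  */
static bool
rri_in_range (wchar_t lo, wchar_t hi, wchar_t c)
{
  return lo <= c && c <= hi;
}

/* Hypothetical helper: the locale-dependent alternative, which compares
   by the current locale's collation order via wcscoll.  */
static bool
coll_in_range (wchar_t lo, wchar_t hi, wchar_t c)
{
  wchar_t l[2] = { lo, L'\0' };
  wchar_t h[2] = { hi, L'\0' };
  wchar_t x[2] = { c, L'\0' };
  return wcscoll (l, x) <= 0 && wcscoll (x, h) <= 0;
}
```

In the C locale the two agree. In a locale whose collation order
interleaves upper and lower case, coll_in_range (L'a', L'z', L'B') may
be true, whereas rri_in_range gives the same answer in every locale.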

My two cents: Jim & Paul will have to decide.

Thanks,

Arnold




This bug report was last modified 11 years and 132 days ago.
