GNU bug report logs - #20657
Traditional range expression not accepted in regex/dfa
Reported by: arnold <at> skeeve.com
Date: Tue, 26 May 2015 02:43:02 UTC
Severity: wishlist
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Hi Paul.
Thanks for this. The patch looks good. I will (eventually) merge it
into gawk instead of my change.
I plan to add a test to gawk; perhaps grep would benefit from one as well?
Thanks,
Arnold
Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 4/21/22 00:57, Arnold Robbins wrote:
>
> > As far as my testing indicates, dfa.c doesn't need a patch, it seems
> > to accept "---" inside brackets for a single minus.
>
> Yes, a brief perusal of the dfa.c source code suggests you're right.
> Thanks for looking into this. I tend to agree with you that POSIX is not
> likely to outlaw this extension.
>
>
> > If there are no objections, can we get this into Gnulib?
>
> Although the basic idea looks good, I see a few places where the patch
> can be improved.
>
> * The two calls to re_string_peek_byte might go past the end of the
> pattern (a subscript violation). This is possible because the pattern is
> not necessarily null-terminated.
>
> * The two calls to re_string_fetch_byte can be simplified into a single
> call to re_string_skip_bytes.
>
> * No need to assign to token->opr.c, as it already has the correct value.
>
> * Can fall through to the default case to save a bit of duplicate code.
>
> * glibc still uses comments /* like this */ for style reasons, and we
> should stick to that.
>
> I wrote a patch with these improvements in mind and installed it into
> Gnulib (see attached); hope it works for Gawk too.
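
The first two review points above (bounds-checking the lookahead, and skipping in one step) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch installed in Gnulib: `struct pattern`, `peek_is_triple_minus` and `take_triple_minus` are hypothetical names standing in for the real re_string machinery, and the sketch assumes the pattern is a length-delimited byte buffer that may lack a terminating null.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a length-delimited pattern.  This is NOT
   the real re_string_t from regex_internal.h.  */
struct pattern
{
  const char *bytes;  /* Pattern bytes; not necessarily null-terminated.  */
  size_t len;         /* Number of valid bytes in BYTES.  */
  size_t cur;         /* Index of the next unread byte.  */
};

/* Return true if the next three unread bytes are "---".  The length
   check comes first, so no byte past the end of the buffer is read.  */
static bool
peek_is_triple_minus (const struct pattern *p)
{
  assert (p->cur <= p->len);  /* Assumed invariant of this sketch.  */
  if (p->len - p->cur < 3)
    return false;
  return (p->bytes[p->cur] == '-'
          && p->bytes[p->cur + 1] == '-'
          && p->bytes[p->cur + 2] == '-');
}

/* If the next three bytes are "---", consume them with a single skip
   (rather than byte-by-byte fetches) and return true, so the caller can
   treat them as one literal minus.  Otherwise leave P unchanged and
   return false.  */
static bool
take_triple_minus (struct pattern *p)
{
  if (!peek_is_triple_minus (p))
    return false;
  p->cur += 3;
  return true;
}
```

With a four-byte buffer holding `a---` and no terminator, calling `take_triple_minus` at index 1 succeeds and advances to the end. If the length is 3 instead, the call refuses without reading the byte beyond the buffer.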