#20638 - BUG: standard & extended RE's don't find NUL's :-(

GNU bug report logs - #20638
BUG: standard & extended RE's don't find NUL's :-(

Package: grep;

Reported by: "L. A. Walsh" <gnu <at> tlinx.org>

Date: Sun, 24 May 2015 00:06:02 UTC

Severity: normal

Tags: notabug

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Message #35 received at 20638 <at> debbugs.gnu.org (full text, mbox):

From: Linda Walsh <gnu <at> tlinx.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 20638 <at> debbugs.gnu.org, Eric Blake <eblake <at> redhat.com>
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
Date: Mon, 25 May 2015 19:13:06 -0700

Paul Eggert wrote:
> Linda Walsh wrote:
>
>> Perhaps you want to tell me where the documentation on the
>> standard and/or extended RE's is that you use?
----
Here is another:

*POSIX Extended Regular Expression Syntax:
(http://www.boost.org/doc/libs/1_43_0/libs/regex/doc/html/boost_regex/syntax/basic_extended.html)

Escapes

The POSIX standard defines no escape sequences for POSIX-Extended
regular expressions, except that:

* Any special character preceded by an escape shall match itself.
* The effect of any ordinary character being preceded by an escape is
  undefined.
* An escape inside a character class declaration shall match itself: in
  other words the escape character is not "special" inside a character
  class declaration; so [\^] will match either a literal '\' or a '^'.

However, that's rather restrictive, so the following standard-compatible
extensions are also supported by Boost.Regex:

Escapes matching a specific character

The following escape sequences are all synonyms for single characters:

Escape      Character
\a          '\a'
\e          0x1B
\f          \f
\n          \n
\r          \r
\t          \t
\v          \v
\b          \b (but only inside a character class declaration).
\cX         An ASCII escape sequence - the character whose code point
            is X % 32
\xdd        A hexadecimal escape sequence - matches the single character
            whose code point is 0xdd.
\x{dddd}    A hexadecimal escape sequence - matches the single character
            whose code point is 0xdddd.
\0ddd       An octal escape sequence - matches the single character
            whose code point is 0ddd.
\N{Name}    Matches the single character which has the symbolic name
            name. For example \\N{newline} matches the single
            character \n.
*

>
> We're talking about grep, so the relevant documentation is the grep
> manual, not the awk manual or other random stuff you might find on the
> Internet.  Type 'info grep'.  Or if you're in Emacs, type 'C-h i m
> grep RET'.
-----
Again another example of \000 octal and \x hex.
Most descriptions of the chars grep takes say it was designed so that awk, sed, tr (any core Linux util that takes regexes) would be *the same*, so people didn't have to learn a different syntax for each tool.
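
For illustration only, here is a minimal sketch of how the hex, octal, and control-character escapes listed in the quoted Boost.Regex table can address a NUL byte. It uses Python's `re` module as a stand-in regex engine, which is an assumption made for this example: it says nothing about what grep itself accepts, and the sample input `data` is hypothetical.

```python
import re

# Hypothetical sample input containing a single NUL byte at offset 3.
data = b"abc\x00def"

# Hexadecimal escape, in the style of the \xdd row of the quoted table.
hex_match = re.search(rb"\x00", data)

# Octal escape, in the style of the \0ddd row of the quoted table.
oct_match = re.search(rb"\000", data)

# The table defines \cX as the character with code point X % 32.
# Python's re has no \cX escape, so the arithmetic is done by hand:
# '@' is code point 64, and 64 % 32 == 0, which is NUL.
ctrl_at = bytes([ord("@") % 32])

print(hex_match.start(), oct_match.start(), ctrl_at == b"\x00")
```

Both searches locate the same byte, so in an engine that supports these extensions the two spellings are interchangeable. Whether a given tool supports them is exactly what its own manual has to settle.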

This bug report was last modified 9 years and 363 days ago.

