GNU bug report logs -
#36251
Regex library doesn't recognize ']' in a character class
Previous Next
Full log
Message #11 received at 36251 <at> debbugs.gnu.org (full text, mbox):
Hi,
Abdulrahman Semrie <hsamireh <at> gmail.com> writes:
> I am using the pattern [\\[\\]a-zA-Z]+ to match a string with left or
> right bracket in it. However, the string-match function doesn’t match
> the ‘]’ character. To demonstrate with an example, try the following
> funciton:
>
> (string-match "[\\[\\]a-zA-Z]+" "Text[ab]”)
>
> The result for the above function should have been a match structure
> with Text[ab] matched. However, the string-match returns #f which is
> incorrect. To test if the pattern I am using was right, I tried on
> regex101.com and it works. Here (https://regex101.com/r/VAl6aI/1) is
> the link that demonstrates that it works.
It turns out that there are several flavors of regular expressions in
common use, with different features and syntax. The link you provided
is using PCRE (PHP) regular expressions (see the "flavor" pane on the
left), and there are three other supported flavors on that web site.
Guile's (ice-9 regex) module provides a simpler flavor of regexps known
as "POSIX extended regular expressions", implemented as a thin wrapper
around your system's POSIX regular expression library ('regcomp' and
'regexec'). The web site you referenced does not appear to support
POSIX extended regular expressions, but here are some links about them:
https://en.wikibooks.org/wiki/Regular_Expressions/POSIX-Extended_Regular_Expressions
https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_04
One of the notable differences is that in POSIX extended regular
expressions, character classes do not support backslash escapes, but
instead use a more ad-hoc approach as <tomas <at> tuxteam.de> described.
Regards,
Mark
This bug report was last modified 6 years and 15 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.