There are issues (mostly shared by both tools, but some not) using a regexp like
this:
^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1$
with GNU grep and GNU sed, hence my contacting both mailing lists.
Apologies if that was the wrong starting point.
This started out as a question on StackOverflow
(https://stackoverflow.com/questions/77820540/searching-palindromes-with-grep-e-egrep/77861446?noredirect=1#comment137299746_77861446),
but my "answer" and some comments from there are copied below so you
don't have to look anywhere else for a description of the issues.
Given this input file:
a
ab
abba
abcdef
abcba
zufolo
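For anyone reproducing this, the input file can be created in one step. This is a minimal sketch, assuming a POSIX shell and that the file is named `sample` in the current directory, as in the commands below:

```shell
# Write the six test lines, one per line, to a file named "sample".
printf '%s\n' a ab abba abcdef abcba zufolo > sample
```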
Removing the `$` from the end of the regexp (i.e. making it less restrictive) produces fewer matches, which is the opposite of what it should do:
a) With the `$` at the end of the regexp:
$ grep -E '^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1$' sample
a
abba
abcba
zufolo
b) Without the `$` at the end of the regexp:
$ grep -E '^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1' sample
a
abba
abcba
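As an independent reference for what the anchored regexp describes, here is a minimal sketch in plain POSIX shell that uses no regexp engine at all. With both `^` and `$` present, the five groups, the optional middle character and the five back-references can only describe a palindrome of at most 11 characters. `is_palindrome` is a hypothetical helper written for this illustration, not part of grep or sed:

```shell
# is_palindrome STRING: hypothetical helper; succeeds if STRING reads the same reversed.
is_palindrome() {
    s=$1 rev=
    while [ -n "$s" ]; do
        rest=${s#?}            # everything after the first character
        rev=${s%"$rest"}$rev   # prepend the first character to the reversed copy
        s=$rest
    done
    [ "$rev" = "$1" ]
}

# Same six lines as the sample file.
printf '%s\n' a ab abba abcdef abcba zufolo |
while IFS= read -r line; do
    # 5 groups + 1 optional middle character + 5 back-references = at most 11 characters.
    if [ "${#line}" -le 11 ] && is_palindrome "$line"; then
        printf '%s\n' "$line"
    fi
done
```

This prints `a`, `abba` and `abcba`, so `zufolo` should not be selected when the `$` is present. Without the `$`, as far as I can tell, the regexp can match the empty string at the start of any line (every group empty, so every back-reference empty), so I'd expect all six lines to be selected rather than three.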
It's not just GNU grep that behaves strangely. GNU sed shows the same behavior from the question as GNU `grep` does when just matching with `sed -nE '/.../p' sample`, AND sed behaves differently depending on whether we're just doing a match or doing a match + replace.
For example here's `sed` doing a match+replacement and behaving the same way as `grep` above:
a) With the `$` at the end of the regexp:
$ sed -nE 's/^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1$/&/p' sample
a
abba
abcba
zufolo
b) Without the `$` at the end of the regexp:
$ sed -nE 's/^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1/&/p' sample
a
abba
abcba
but here's sed just doing a match and behaving differently from any of the above:
a) With the `$` at the end of the regexp (note the extra `ab` in the output):
$ sed -nE '/^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1$/p' sample
a
ab
abba
abcba
zufolo
b) Without the `$` at the end of the regexp (note the extra `ab` and `abcdef` in the output):
$ sed -nE '/^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1/p' sample
a
ab
abba
abcdef
abcba
zufolo
Also interestingly this:
$ sed -nE 's/^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1$/<&>/p' sample
outputs:
<a>
<abba>
<abcba>
<>zufolo
the last line of which means the regexp is apparently matching the start of the line and ignoring the `$` end-of-string metachar present in the regexp!
The odd behavior isn't just associated with using `-E`, though. If I remove `-E` and just use [POSIX compliant BREs](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03) then:
a) With the `$` at the end of the regexp:
$ grep '^\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\).\{0,1\}\5\4\3\2\1$' sample
a
abba
abcba
zufolo
$ sed -n 's/^\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\).\{0,1\}\5\4\3\2\1$/&/p' sample
a
abba
abcba
zufolo
b) Without the `$` at the end of the regexp:
$ grep '^\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\).\{0,1\}\5\4\3\2\1' sample
a
abba
abcba
$ sed -n 's/^\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\).\{0,1\}\5\4\3\2\1/&/p' sample
a
abba
abcba
and again just doing a match in sed below behaves differently from the sed match+replacements above:
a) With the `$` at the end of the regexp:
$ sed -n '/^\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\).\{0,1\}\5\4\3\2\1$/p' sample
a
ab
abba
abcba
zufolo
b) Without the `$` at the end of the regexp:
$ sed -n '/^\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\)\(.\{0,1\}\).\{0,1\}\5\4\3\2\1/p' sample
a
ab
abba
abcdef
abcba
zufolo
The above shows that, given the same regexp, sed is apparently matching different strings depending on whether it's doing a substitution or not.
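To make that comparison easy to repeat for other regexps, here is a minimal sketch of a shell function that runs sed in both modes and reports whether the selected lines agree. `compare_sed_modes` is a hypothetical helper name, and the sketch assumes the regexp is an ERE containing no `/` characters, since `/` is used as the delimiter:

```shell
# compare_sed_modes REGEXP FILE (hypothetical helper, not part of sed)
# Prints "same" if match-only and match+replace select the same lines,
# otherwise prints "differ".
# Assumption: REGEXP is an ERE and contains no "/" characters.
compare_sed_modes() {
    re=$1 file=$2
    match_only=$(sed -nE "/$re/p" "$file")      # address match only
    match_subst=$(sed -nE "s/$re/&/p" "$file")  # match + replace with itself
    if [ "$match_only" = "$match_subst" ]; then
        echo same
    else
        echo differ
    fi
}
```

It could be called as `compare_sed_modes '^(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1$' sample`. For a correct regexp engine the two modes should always select the same lines, so any `differ` result points at the inconsistency described above.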
These are the versions I was using when testing above:
$ grep --version | head -1
grep (GNU grep) 3.11
$ sed --version | head -1
sed (GNU sed) 4.9
It was later pointed out that grep in git-bash produces an error message and core dumps given the original regexp above.
Regards,
Ed Morton