GNU bug report logs - #33763
RE backtrack for last slash fails when backslashblank involved
Reported by: Peter Benjamin <pete <at> peterbenjamin.com>
Date: Sat, 15 Dec 2018 23:05:02 UTC
Severity: normal
Tags: notabug
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
Message #10 received at control <at> debbugs.gnu.org:
tags 33763 notabug
close 33763
stop
Hello,
On 2018-12-15 3:07 p.m., Peter Benjamin wrote:
> Backtrack last slash RE does not work when there are "\ " involved.
>
> RE:
> sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
>
> $ cat findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> /media/userid/data/movies/movie4.m4v
> /media/userid/data2/movies/data.m4v
>
> STDOUT
>
> $ sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> movie4.m4v /media/userid/data/movies/movie4.m4v
> data.m4v /media/userid/data2/movies/data.m4v
>
> ------------------------
>
> Same backtrack last slash RE in perl works:
>
> perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
>
> STDOUT
> movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
> movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
> movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
> movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
> movie4.m4v /media/userid/data/movies/movie4.m4v
> data.m4v /media/userid/data2/movies/data.m4v
>
Thank you for providing such clear and reproducible examples -
it makes the troubleshooting much easier.
First, let's enable sed's extended regular expression syntax
(by adding "-E") to make the comparison simpler.
The following "sed -E" command is equivalent to the one you used above
and produces the same (unsatisfying) results; your perl command is
repeated below it for comparison:
sed -E -e 's/^(.*)\/([^\/]*)$/\2\t\1\/\2/' findm
perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
Now, the culprit lies in the bracket expression:
[^\/]
The POSIX definition of regular expression bracket expressions says:
"The special characters '.', '*', '[', and '\' (period, asterisk,
left-bracket, and backslash, respectively) shall lose their special
meaning within a bracket expression."
(from section 9.3.5 subitem 1, last sentence in the paragraph:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_05
)
Meaning, the bracket expression "[^\/]" is not "every character except
slash" (with the slash escaped by a backslash).
Instead, it means "every character except slash or backslash".
Since the last path component of the first four file names contains
backslashes, "[^\/]*" cannot span it, and the regex does not match
those lines.
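A quick way to see this in isolation is the following minimal sketch.
The sample string and the variable names are made up for illustration,
and "|" is used as the s-command delimiter only so that no slash has to
be escaped:

```shell
# Hypothetical sample input containing a backslash and a slash: a\b/c
input='a\b/c'

# POSIX bracket expression with a backslash in it:
# excludes BOTH the backslash and the slash.
both=$(printf '%s\n' "$input" | sed 's|[^\/]|X|g')

# Without the backslash: excludes only the slash.
slash_only=$(printf '%s\n' "$input" | sed 's|[^/]|X|g')

printf '%s\n' "$both"        # X\X/X
printf '%s\n' "$slash_only"  # XXX/X
```

In the first result the backslash survives untouched, which shows that
it is part of the excluded set.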
If the backslash is removed, the results are as you expected:
$ sed -E -e 's/^(.*)\/([^/]*)$/\2\t\1\/\2/' findm
movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
movie4.m4v /media/userid/data/movies/movie4.m4v
data.m4v /media/userid/data2/movies/data.m4v
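As a side note, one way to avoid the temptation to escape slashes at all
is to pick a different delimiter for the "s" command. The sketch below
assumes GNU sed (the "\t" in the replacement is a GNU extension) and
uses one path from your sample file; the variable names are only for
illustration:

```shell
# One path from the report's sample file (the backslash-space is literal text).
path='/media/userid/data/movies/movie\ 2.m4v'

# With '|' as the delimiter, the slashes in the pattern need no escaping,
# so there is no reason to put a backslash inside the bracket expression.
line=$(printf '%s\n' "$path" | sed -E 's|^(.*)/([^/]*)$|\2\t\1/\2|')

# Prints the file name, a tab, then the full path.
printf '%s\n' "$line"
```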
As such, I conclude that it is not a sed bug.
Perl, on the other hand, treats a backslash inside a bracket expression
as an escape character, so there "[^\/]" excludes only the slash,
which leads to this apparent difference.
I'm closing this as "not a bug",
but discussion can continue by replying to this thread.
regards,
- assaf
This bug report was last modified 6 years and 240 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.