GNU bug report logs - #33763: RE backtrack for last slash fails when backslashblank involved
Reported by: Peter Benjamin <pete <at> peterbenjamin.com>
Date: Sat, 15 Dec 2018 23:05:02 UTC
Severity: normal
Tags: notabug
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 33763 in the body.
You can then email your comments to 33763 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-sed <at> gnu.org: bug#33763; Package sed. (Sat, 15 Dec 2018 23:05:02 GMT)
Acknowledgement sent to Peter Benjamin <pete <at> peterbenjamin.com>: New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Sat, 15 Dec 2018 23:05:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Backtrack last slash RE does not work when there are "\ " involved.
RE:
sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
$ cat findm
/media/userid/data/movies/movie\ 1\ a.m4v
/media/userid/data/movies/movie\ 1\ a.extra.m4v
/media/userid/data/movies/movie\ 2.m4v
/media/userid/data/movies/movie\ 3.m4v
/media/userid/data/movies/movie4.m4v
/media/userid/data2/movies/data.m4v
STDOUT
$ sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
/media/userid/data/movies/movie\ 1\ a.m4v
/media/userid/data/movies/movie\ 1\ a.extra.m4v
/media/userid/data/movies/movie\ 2.m4v
/media/userid/data/movies/movie\ 3.m4v
movie4.m4v /media/userid/data/movies/movie4.m4v
data.m4v /media/userid/data2/movies/data.m4v
----------------------------------------
Ubuntu 16.04
$ sed --version
sed (GNU sed) 4.2.2
$ uname -a
Linux *** 4.4.0-140-generic #166-Ubuntu SMP Wed Nov 14 20:09:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
------------------------
Same backtrack last slash RE in perl works:
perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
STDOUT
movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
movie4.m4v /media/userid/data/movies/movie4.m4v
data.m4v /media/userid/data2/movies/data.m4v
The End
Information forwarded to bug-sed <at> gnu.org: bug#33763; Package sed. (Sun, 16 Dec 2018 20:51:02 GMT)
Message #8 received at 33763 <at> debbugs.gnu.org (full text, mbox):
tags 33763 notabug
close 33763
stop
Hello,
On 2018-12-15 3:07 p.m., Peter Benjamin wrote:
> Backtrack last slash RE does not work when there are "\ " involved.
>
> RE:
> sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
>
> $ cat findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> /media/userid/data/movies/movie4.m4v
> /media/userid/data2/movies/data.m4v
>
> STDOUT
>
> $ sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> movie4.m4v /media/userid/data/movies/movie4.m4v
> data.m4v /media/userid/data2/movies/data.m4v
>
> ------------------------
>
> Same backtrack last slash RE in perl works:
>
> perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
>
> STDOUT
> movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
> movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
> movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
> movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
> movie4.m4v /media/userid/data/movies/movie4.m4v
> data.m4v /media/userid/data2/movies/data.m4v
>
Thank you for providing such clear and reproducible examples;
they make troubleshooting much easier.
First, let's enable sed's extended regular expression syntax (by adding "-E")
to make the comparison with Perl simpler.
The following "sed -E" command is equivalent to the one you used above
and produces the same (unsatisfying) results. The Perl command is repeated
below it for comparison:
sed -E -e 's/^(.*)\/([^\/]*)$/\2\t\1\/\2/' findm
perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
Now, the culprit lies in the bracket expression:
[^\/]
The POSIX definition of a regular expression bracket expression says:
"The special characters '.', '*', '[', and '\' (period, asterisk,
left-bracket, and backslash, respectively) shall lose their special
meaning within a bracket expression."
(from section 9.3.5, item 1, last sentence of the paragraph:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_05 )
This means the bracket expression "[^\/]" is not "every character except
slash" (with the slash escaped by a backslash).
Instead, it means "every character except slash or backslash".
Since the first four file names contain a backslash in their final path
component (after the last slash), the regex does not match them.
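The effect can be seen in isolation with a one-line input. This is only a minimal sketch, assuming GNU sed and a POSIX shell; the sample string a\b and the variable names are made up for illustration and do not come from the report:

```shell
# Inside a bracket expression the backslash is an ordinary character,
# so [^\/] excludes both '\' and '/', while [^/] excludes only '/'.

# The input contains a backslash, so this pattern does not match
# and nothing is printed.
with_backslash=$(printf '%s\n' 'a\b' | sed -n '/^[^\/]*$/p')

# Without the backslash in the brackets, only '/' is excluded,
# so the line matches and is printed.
without_backslash=$(printf '%s\n' 'a\b' | sed -n '/^[^/]*$/p')

printf 'with [^\\/]: "%s"\n' "$with_backslash"
printf 'with [^/]:  "%s"\n' "$without_backslash"
```

The first result is empty and the second is the input line, which mirrors what happens to the four file names containing "\ ".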
If the backslash is removed, the results are as you expected:
$ sed -E -e 's/^(.*)\/([^/]*)$/\2\t\1\/\2/' findm
movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
movie4.m4v /media/userid/data/movies/movie4.m4v
data.m4v /media/userid/data2/movies/data.m4v
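As an aside, the need to escape slashes can be avoided entirely by choosing a different delimiter for the "s" command. The following is a sketch under the same assumptions (GNU sed, which also provides the "\t" escape used in the replacement); the "|" delimiter and the variable names are illustrative choices, not something proposed in this thread:

```shell
# One of the file names from the report, backslash included.
line='/media/userid/data/movies/movie\ 2.m4v'

# With '|' as the delimiter, '/' needs no escaping anywhere,
# so there is no temptation to write [^\/] in the first place.
result=$(printf '%s\n' "$line" | sed -E 's|^(.*)/([^/]*)$|\2\t\1/\2|')

printf '%s\n' "$result"
```

The output should be the file name, a tab, and the full path, the same as the corresponding line produced by the corrected command above.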
As such, I conclude that it is not a sed bug.
Perhaps Perl's parsing requires escaping the slash,
which leads to this apparent difference.
I'm closing this as "not a bug",
but discussion can continue by replying to this thread.
regards,
- assaf
Added tag(s) notabug. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 16 Dec 2018 20:51:03 GMT)
bug closed, send any further explanations to 33763 <at> debbugs.gnu.org and Peter Benjamin <pete <at> peterbenjamin.com>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 16 Dec 2018 20:51:03 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 14 Jan 2019 12:24:04 GMT)
This bug report was last modified 6 years and 240 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.