GNU bug report logs - #33763
RE backtrack for last slash fails when backslashblank involved

Package: sed;

Reported by: Peter Benjamin <pete <at> peterbenjamin.com>

Date: Sat, 15 Dec 2018 23:05:02 UTC

Severity: normal

Tags: notabug

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


Message #8 received at 33763 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Peter Benjamin <pete <at> peterbenjamin.com>, 33763 <at> debbugs.gnu.org
Subject: Re: bug#33763: RE backtrack for last slash fails when backslashblank
 involved
Date: Sun, 16 Dec 2018 13:49:52 -0700
tags 33763 notabug
close 33763
stop

Hello,

On 2018-12-15 3:07 p.m., Peter Benjamin wrote:
> Backtrack last slash RE does not work when there are "\ " involved.
> 
> RE:
> sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
> 
> $ cat findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> /media/userid/data/movies/movie4.m4v
> /media/userid/data2/movies/data.m4v
> 
> STDOUT
> 
> $ sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> movie4.m4v	/media/userid/data/movies/movie4.m4v
> data.m4v	/media/userid/data2/movies/data.m4v
> 
> ------------------------
> 
> Same backtrack last slash RE in perl works:
> 
> perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
> 
> STDOUT
> movie\ 1\ a.m4v	/media/userid/data/movies/movie\ 1\ a.m4v
> movie\ 1\ a.extra.m4v	/media/userid/data/movies/movie\ 1\
> a.extra.m4v
> movie\ 2.m4v	/media/userid/data/movies/movie\ 2.m4v
> movie\ 3.m4v	/media/userid/data/movies/movie\ 3.m4v
> movie4.m4v	/media/userid/data/movies/movie4.m4v
> data.m4v	/media/userid/data2/movies/data.m4v
> 

Thank you for providing such clear and reproducible examples;
they make troubleshooting much easier.

First,
let's enable sed's extended regular expression syntax (by adding "-E"),
to make the comparison simpler.
The following "sed -E" command (shown above your perl command for
comparison) is equivalent to the one you used above,
and produces the same (unsatisfying) results:

       sed -E -e 's/^(.*)\/([^\/]*)$/\2\t\1\/\2/'             findm
perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
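
As a quick sanity check of that equivalence, here is a minimal sketch on a
made-up two-character input (the input "ab" and the variable names are
illustrative assumptions, not taken from the report). Both syntaxes should
swap the two captured groups in the same way:

```shell
# Basic syntax: groups are written \( \).
bre=$(printf '%s\n' 'ab' | sed -e 's/\(a\)\(b\)/\2\1/')

# Extended syntax (-E): groups are written ( ).
ere=$(printf '%s\n' 'ab' | sed -E -e 's/(a)(b)/\2\1/')

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
```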

Now, the culprit lies in the bracket expression:
   [^\/]

The POSIX definition of regular expression bracket expression says:

  "The special characters '.', '*', '[', and '\' (period, asterisk,
  left-bracket, and backslash, respectively) shall lose their special
  meaning within a bracket expression."

(from section 9.3.5, item 1, last sentence of the paragraph:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_05)

Meaning, the bracket expression "[^\/]" does not mean "every character
except slash" (with the slash escaped by a backslash).
Instead, it means "every character except slash or backslash".
Since the first four file names contain backslashes, the "[^\/]*$" part
cannot cover their last path component, and the regex does not match them.
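
The effect can be seen in isolation with a minimal sketch. It assumes a
POSIX-conforming sed; the sample name and the variable names are
illustrative, and "#" is used as the "s" command delimiter (an assumption
of this sketch, not part of the original command) so that the slash needs
no escaping at all:

```shell
# A name containing a backslash, like the first four lines of findm.
name='movie\ 1.m4v'

# Backslash is literal inside brackets: [^\/] excludes both '\' and '/',
# so the pattern cannot span the whole name and nothing is replaced.
with_backslash=$(printf '%s\n' "$name" | sed -E -e 's#^[^\/]*$#MATCH#')

# Without the backslash only '/' is excluded, so the whole name matches.
without_backslash=$(printf '%s\n' "$name" | sed -E -e 's#^[^/]*$#MATCH#')

printf '%s\n' "$with_backslash" "$without_backslash"
```

The first line printed should be the unchanged name, the second the word
MATCH.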

If the backslash is removed, the results are as you expected:

  $ sed -E -e 's/^(.*)\/([^/]*)$/\2\t\1\/\2/' findm
  movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
  movie\ 1\ a.extra.m4v   /media/userid/data/movies/movie\ 1\ a.extra.m4v
  movie\ 2.m4v    /media/userid/data/movies/movie\ 2.m4v
  movie\ 3.m4v    /media/userid/data/movies/movie\ 3.m4v
  movie4.m4v      /media/userid/data/movies/movie4.m4v
  data.m4v        /media/userid/data2/movies/data.m4v
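
Another way to sidestep the issue, sketched here under the assumption of
GNU sed (for "\t" in the replacement, as in the commands above), is to pick
a different delimiter for the "s" command so that no slash needs escaping
anywhere. The "#" delimiter and the variable names are chosen here purely
for illustration:

```shell
# One of the problematic lines from findm.
line='/media/userid/data/movies/movie\ 2.m4v'

# With '#' as the delimiter, '/' is an ordinary character in the regex
# and in the replacement, so no backslash is needed before it.
result=$(printf '%s\n' "$line" | sed -E -e 's#^(.*)/([^/]*)$#\2\t\1/\2#')

printf '%s\n' "$result"
```

This should print the file name, a tab, and then the full path.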

As such, I conclude that it is not a sed bug.
Perl, unlike POSIX, treats a backslash inside a bracket expression as an
escape character, so there "[^\/]" means "every character except slash",
which explains the apparent difference.

I'm closing this as "not a bug",
but discussion can continue by replying to this thread.


regards,
 - assaf





This bug report was last modified 6 years and 240 days ago.

