From unknown Tue Sep 09 12:43:16 2025
Subject: bug#33763: RE backtrack for last slash fails when backslashblank involved
X-GNU-PR-Message: report 33763
X-GNU-PR-Package: sed
To: 33763@debbugs.gnu.org
X-Debbugs-Original-To: bug-sed@gnu.org
Message-ID: <1544911628.7759.12.camel@peterbenjamin.com>
From: Peter Benjamin
Date: Sat, 15 Dec 2018 14:07:08 -0800
Content-Type: multipart/alternative; boundary="=-+keD4oYtt8la8I0coBiE"
Mime-Version: 1.0

--=-+keD4oYtt8la8I0coBiE
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit

Backtrack last slash RE does not work when there are "\ " involved.

RE:
sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm

$ cat findm
/media/userid/data/movies/movie\ 1\ a.m4v
/media/userid/data/movies/movie\ 1\ a.extra.m4v
/media/userid/data/movies/movie\ 2.m4v
/media/userid/data/movies/movie\ 3.m4v
/media/userid/data/movies/movie4.m4v
/media/userid/data2/movies/data.m4v

STDOUT

$ sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
/media/userid/data/movies/movie\ 1\ a.m4v
/media/userid/data/movies/movie\ 1\ a.extra.m4v
/media/userid/data/movies/movie\ 2.m4v
/media/userid/data/movies/movie\ 3.m4v
movie4.m4v /media/userid/data/movies/movie4.m4v
data.m4v /media/userid/data2/movies/data.m4v

----------------------------------------

Ubuntu 16.04

$ sed --version
sed (GNU sed) 4.2.2

$ uname -a
Linux *** 4.4.0-140-generic #166-Ubuntu SMP Wed Nov 14 20:09:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

------------------------

Same backtrack last slash RE in perl works:

perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm

STDOUT
movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
movie4.m4v /media/userid/data/movies/movie4.m4v
data.m4v /media/userid/data2/movies/data.m4v

The End

--=-+keD4oYtt8la8I0coBiE--

From unknown Tue Sep 09 12:43:16 2025
Subject: bug#33763: RE backtrack for last slash fails when backslashblank involved
X-GNU-PR-Message: followup 33763
X-GNU-PR-Package: sed
To: Peter Benjamin, 33763@debbugs.gnu.org
References: <1544911628.7759.12.camel@peterbenjamin.com>
From: Assaf Gordon
Message-ID: <18cf64cf-1603-de28-c071-22c0d9d34ee1@gmail.com>
Date: Sun, 16 Dec 2018 13:49:52 -0700
In-Reply-To: <1544911628.7759.12.camel@peterbenjamin.com>
Content-Type: text/plain; charset=utf-8; format=flowed

tags 33763 notabug
close 33763
stop

Hello,

On 2018-12-15 3:07 p.m., Peter Benjamin wrote:
> Backtrack last slash RE does not work when there are "\ " involved.
>
> RE:
> sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
>
> $ cat findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> /media/userid/data/movies/movie4.m4v
> /media/userid/data2/movies/data.m4v
>
> STDOUT
>
> $ sed -e 's/^\(.*\)\/\([^\/]*\)$/\2\t\1\/\2/' findm
> /media/userid/data/movies/movie\ 1\ a.m4v
> /media/userid/data/movies/movie\ 1\ a.extra.m4v
> /media/userid/data/movies/movie\ 2.m4v
> /media/userid/data/movies/movie\ 3.m4v
> movie4.m4v /media/userid/data/movies/movie4.m4v
> data.m4v /media/userid/data2/movies/data.m4v
>
> ------------------------
>
> Same backtrack last slash RE in perl works:
>
> perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm
>
> STDOUT
> movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
> movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
> movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
> movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
> movie4.m4v /media/userid/data/movies/movie4.m4v
> data.m4v /media/userid/data2/movies/data.m4v
>

Thank you for providing such clear and reproducible examples - it makes
the troubleshooting much easier.

First, let's enable sed's extended regular expression syntax (by adding
"-E") to make the comparison simpler. The following "sed -E" command is
equivalent to the one you used above and produces the same
(unsatisfying) results; it now uses the same regex as the perl command:

  sed -E -e 's/^(.*)\/([^\/]*)$/\2\t\1\/\2/' findm

  perl -n -e 'chomp;s/^(.*)\/([^\/]*)$/\2\t\1\/\2/;print"$_\n"' findm

Now, the culprit lies in the bracket expression:

  [^\/]

The POSIX definition of regular expression bracket expressions says:

  "The special characters '.', '*', '[', and '\' (period, asterisk,
  left-bracket, and backslash, respectively) shall lose their special
  meaning within a bracket expression."

(from section 9.3.5 subitem 1, last sentence in the paragraph:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_03_05 )

This means the bracket expression "[^\/]" is not "every character
except slash" (with the slash character escaped by a backslash).
Instead, it means "every character except slash or backslash".

Since the first four file names contain a backslash after the last
slash, the regex does not match them, and sed prints those lines
unchanged.

If the backslash is removed from the bracket expression, the results
are as you expected:

  $ sed -E -e 's/^(.*)\/([^/]*)$/\2\t\1\/\2/' findm
  movie\ 1\ a.m4v /media/userid/data/movies/movie\ 1\ a.m4v
  movie\ 1\ a.extra.m4v /media/userid/data/movies/movie\ 1\ a.extra.m4v
  movie\ 2.m4v /media/userid/data/movies/movie\ 2.m4v
  movie\ 3.m4v /media/userid/data/movies/movie\ 3.m4v
  movie4.m4v /media/userid/data/movies/movie4.m4v
  data.m4v /media/userid/data2/movies/data.m4v

As such, I conclude that this is not a sed bug.
Perl treats a backslash as an escape character even inside a bracket
expression, so its "[^\/]" excludes only the slash, which explains the
apparent difference.

I'm closing this as "not a bug", but discussion can continue by
replying to this thread.

regards,
 - assaf
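
The bracket-expression rule quoted in the reply can be checked in isolation with a minimal sketch. It assumes a sed that accepts "-E" (GNU sed does). The two sample paths, the '#' delimiter and the " <= " separator are illustrative choices made here, not taken from the thread; '#' is used only so that no slash needs escaping and the two commands differ in the bracket expression alone.

```shell
#!/bin/sh
# Illustrative input (not the reporter's file): one path containing a
# backslash-escaped space, one path without any backslash.
input='/media/movies/movie\ 2.m4v
/media/movies/movie4.m4v'

# [^\/] : inside a bracket expression the backslash is literal, so this
# excludes both "/" and "\". The first path cannot match and is left as is.
posix_literal=$(printf '%s\n' "$input" | sed -E 's#^(.*)/([^\/]*)$#\2 <= \1/\2#')

# [^/] : excludes only "/", so both paths match and are rewritten.
slash_only=$(printf '%s\n' "$input" | sed -E 's#^(.*)/([^/]*)$#\2 <= \1/\2#')

printf '%s\n' "$posix_literal"
printf '%s\n' '--'
printf '%s\n' "$slash_only"
```

With the first command only the movie4.m4v line is rewritten; with the second, both lines are. Choosing a delimiter other than '/' is also a simple way to avoid writing "\/" at all, which removes the temptation to carry the backslash into the bracket expression.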