From: "Don Crissti"
To: bug-sed@gnu.org
Subject: bug: empty regex exits with error when following 2-address like LINENO,/RE/
Date: Thu, 15 Mar 2018 21:20:37 +0100

The manual states that
 
"the empty regular expression ‘//’ repeats the last regular expression match"
 
however this does not work when the empty regex follows a 2-address of the form LINE_NUMBER,/REGEX/
e.g.
 
# printf %s\\n {1..5} | sed '2,/5/{//!d}'
 
fails with
 
"sed: -e expression #1, char 0: no previous regular expression"
 
instead of printing
 
1
5
 
If it matters, a 2-address like /REGEX/,LINE_NUMBER works as expected e.g.:
 
# printf %s\\n {1..5} | sed '/2/,5{//!d}'
 
correctly prints
 
1
2
 
This is with vanilla GNU sed 4.4 on Arch Linux.
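
For reference, the working invocation above can be run next to a variant of the failing one. This is only a sketch: it spells the closing brace as a separate -e expression for portability, and the third command, which writes the regex out instead of using '//', is a hypothetical rewrite that is not part of the report. The variable names are illustrative.

```shell
# Input from the report: the numbers 1 to 5, one per line.
input=$(printf '%s\n' 1 2 3 4 5)

# /REGEX/,LINE_NUMBER form: /2/ has already been executed by the time
# '//' is reached on line 2, so the empty regex has something to repeat.
regex_first=$(printf '%s\n' "$input" | sed -e '/2/,5{//!d' -e '}')

# LINE_NUMBER,/REGEX/ form with the regex written out instead of '//'
# (hypothetical rewrite, not taken from the report).
explicit=$(printf '%s\n' "$input" | sed -e '2,/5/{/5/!d' -e '}')

printf 'regex first:\n%s\n' "$regex_first"
printf 'explicit regex:\n%s\n' "$explicit"
```

The first command should print 1 and 2, as described above; the second should print 1 and 5, which is the output the report expected from the '//' version.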

From: Assaf Gordon
To: Don Crissti
Cc: 30829-done@debbugs.gnu.org
Subject: Re: bug#30829: bug: empty regex exits with error when following 2-address like LINENO, /RE/
Date: Thu, 15 Mar 2018 16:34:07 -0600
Message-ID: <20180315223407.GE29079@tomato>

Hello,

On Thu, Mar 15, 2018 at 09:20:37PM +0100, Don Crissti wrote:
> "the empty regular expression ‘//’ repeats the last regular expression
> match"
>
> however this does not work when the empty regex follows a 2-address of
> the form LINE_NUMBER,/REGEX/
> e.g.
>
> # printf %s\\n {1..5} | sed '2,/5/{//!d}'
>
> fails with
>
> "sed: -e expression #1, char 0: no previous regular expression"

Thanks for reporting this bug and providing an easy way to reproduce.

Before deciding if it's a bug or not, it's worth comparing to other sed's.
(I'm using a slightly different sed program because multiple commands
on the same line is a GNU extension.)

FreeBSD/OpenBSD/NetBSD:

    $ printf "%s\n" 1 2 3 4 5 | sed -n -e '2,/5/p' -e '//p'
    sed: first RE may not be empty

BusyBox and ToyBox (output seems incorrect):

    $ printf "%s\n" 1 2 3 4 5 | sed -n -e '2,/5/p' -e '//p'
    1
    2
    2
    3
    3
    4
    4
    5
    5

Heirloom (http://heirloom.sf.net/):

    $ seq 5 | sed-heirloom -n -e '2,/5/p' -e '//p'
    2
    3
    4
    5
    5

And surprisingly, GNU sed version 3.02:

    $ seq 5 | sed-gnu-3.02 -n -e '2,/5/p' -e '//p'
    2
    3
    4
    5
    5

GNU sed 4.0 and later:

    $ seq 5 | sed -n -e '2,/5/p' -e '//p'
    sed: -e expression #2, char 0: no previous regular expression

=====

Now to why it happens:

POSIX says
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html):

  "If an RE is empty (that is, no pattern is specified) sed shall
  behave as if the last RE used in the last command applied (either as
  an address or as part of a substitute command) was specified."

And the interpretation (of both GNU sed >= 4.0 and *BSD's sed) is
that the "last RE used in the last command *applied*" means the last RE
*executed* - not the last regex that precedes the empty regex in the
program.

And so in this command:

    sed -n -e '2,/5/p' -e '//p'

On the first line, the address 2 is checked (it doesn't match on line 1,
obviously). The regex '/5/' is *not* executed (because 2 didn't match).
Then sed tries '//p' - but there was no RE executed - hence the error.

The reason for this is that the empty (last) regex can change during
runtime, based on the input.
Consider the following (contrived) example:

    $ printf "%s\n" a ab ab ab \
       | sed '1s/a/X/
              tq
              1s/b/Y/
              :q
              s//*/'
    X
    *b
    *b
    *b

    $ printf "%s\n" b ab ab ab \
       | sed '1s/a/X/
              tq
              1s/b/Y/
              :q
              s//*/'
    Y
    a*
    a*
    a*

The flow is:
1. If line 1 contains 'a' - replace 'a' with 'X' and skip the next check
   ('tq' means "jump to label :q if the last substitution matched").
2. If line 1 contains 'b' - replace 'b' with 'Y'.
3. For every line, replace the last regex with '*'.

And so you see that the last regex changes dynamically during runtime,
based on whether the first line contained 'a' or 'b'.
In the first case, the three 'a's are replaced with '*'.
In the second case, the three 'b's are replaced with '*'.

I therefore think this is not a bug (and I'm marking it as 'done').
However discussion can continue by replying to this thread,
and if there are different opinions we can always re-open it.

regards,
 - assaf

From: Assaf Gordon
To: Don Crissti
Cc: 30829-done@debbugs.gnu.org
Subject: Re: bug#30829: bug: empty regex exits with error when following 2-address like LINENO, /RE/
Date: Thu, 15 Mar 2018 17:16:34 -0600
Message-ID: <20180315231634.GF29079@tomato>
In-Reply-To: <20180315223407.GE29079@tomato>

Follow-up:

On Thu, Mar 15, 2018 at 04:34:07PM -0600, Assaf Gordon wrote:
> On Thu, Mar 15, 2018 at 09:20:37PM +0100, Don Crissti wrote:
[...]
> > # printf %s\\n {1..5} | sed '2,/5/{//!d}'
> >
> > fails with
> >
> > "sed: -e expression #1, char 0: no previous regular expression"
> [...]
> And the interpretation (of both GNU sed >= 4.0 and *BSD's sed) is
> that the "last RE used in the last command *applied*" means the last RE *executed*
> - not the last regex that precedes the empty regex in the program.

The previous examples were needlessly complicated.
Here's a simpler example:

    $ printf "%s\n" ccbb aabb | sed -e '/a/!s/b/X/' -e 's//*/'
    ccX*
    *abb

Whether the 'last regex' is /a/ or /b/ depends on whether the line
contains 'a' or not.

> I therefore think this is not a bug (and I'm marking it as 'done').
> However discussion can continue by replying to this thread,
> and if there are different opinions we can always re-open it.

One could argue that the behavior you're expecting (which is what
sed-heirloom and sed-gnu-3.02 do) is to silently treat a missing
"last regex" as 'no match'.

That's easy to implement but I don't think that's a good change.
The current behavior is better.

Moreover, I suspect sed-heirloom's behavior is just buggy:

    $ seq 1 | sed-heirloom -n -e '2p' -e '//p'
    [no output]

    $ seq 2 | sed-heirloom -n -e '2p' -e '//p'
    2
    2

    $ seq 2 | sed-heirloom -n -e '2,5p' -e '//p'
    2

    $ seq 2 | sed-heirloom -n -e '//p'
    First RE may not be null

regards,
 - assaf

From: "Don Crissti"
To: 30829@debbugs.gnu.org
Date: Fri, 16 Mar 2018 00:09:05 +0100

Thanks for the prompt reply!

While I understand your explanation I think we are talking about
slightly different things. Your example is different from mine.
Let me re-write my code so as to be portable:

    printf %s\\n 1 2 3 4 5 | sed -e '2,/5/{//!d' -e'}'

Now, as you can see, the main difference between my sample above and
your sample

    printf "%s\n" 1 2 3 4 5 | sed -n -e '2,/5/p' -e '//p'

is the braces (the command grouping). In other words, you are
unconditionally using an empty regex while I'm only using it for lines
that meet certain criteria. Based on your explanation (i.e. applied
regex = executed regex, no executed regex = no previous regex on line 1,
hence fail) I can understand why your code exits with error.

However, my code uses the empty regex on condition (only for a certain
range of lines). It is logical that '//!d' should not be executed for
lines outside that range. If I used a plain 'd' instead of '//!d' would
sed unconditionally delete all lines from the file?
No. It would delete only the lines in that range. Similarly, sed should
not even attempt to evaluate the empty regex in '{//!d' for lines
outside that range.

Unless I'm missing something I still see this as a bug.

From: Assaf Gordon
To: Don Crissti
Cc: 30829@debbugs.gnu.org
Subject: Re: bug#30829: (no subject)
Date: Thu, 15 Mar 2018 18:16:48 -0600
Message-ID: <20180316001648.GG29079@tomato>

Hello,

On Fri, Mar 16, 2018 at 12:09:05AM +0100, Don Crissti wrote:
[...]
> However, my code uses empty regex on condition (only for a certain
> range of lines). It is logical that '//!d' should not be executed for
> lines outside that range. If I used a plain 'd' instead of '//!d'
> would sed unconditionally delete all lines from the file ? No. It
> would delete only the lines in that range. Similarly, sed should not
> even attempt to evaluate the empty regex in '{//!d' for lines outside
> that range.
>
> Unless I'm missing something I still see this as a bug.

There is a subtle issue here:
when using 2 addresses, and the second address is an RE,
the first line matching the first address (line 2 in your case)
will *never* be checked against the RE.

And so, even though the '//!d' is run conditionally,
the condition is true (line 2, before regex is checked),
and then '//!d' is executed but there is no 'last regex' yet.

This is documented in the manual:
https://www.gnu.org/software/sed/manual/sed.html#Range-Addresses
(in the second paragraph, starting with "if the second address is a regexp").

Also,
In the POSIX standard, the relevant text is:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html
(under "addresses in sed" section).

"An editing command with two addresses shall select the inclusive
range from the first pattern space that matches the first address
through the next pattern space that matches the second. "

The interpretation is that the second address is checked against
"the next pattern space" - implying that the first time the second
address is checked is not on the first line that matches, but
on the 'next pattern space' (meaning starting at the line following
the line that matched the first address).
[phew, that is a bit confusing....]

Does this clarify the issue?

-assaf

From: "Don Crissti"
To: 30829@debbugs.gnu.org
Subject: Re: bug#30829: (no subject)
Date: Fri, 16 Mar 2018 01:39:45 +0100
In-Reply-To: <20180316001648.GG29079@tomato>

Damn... No, it's not confusing at all. It's actually crystal clear. Not sure what I was thinking...
I even knew that the 1st line of a range is never checked against the 2nd RE, that's why stuff like
 
sed '1,/RE/d'
 
deletes all the lines when the only RE match is on the 1st line.
I guess my ADHD got the best of me...
 
Thanks for your time and for your detailed explanation !
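
The '1,/RE/d' case mentioned above can be checked with a short sketch. The input lines and variable names here are made up for illustration; only the sed command itself comes from the message.

```shell
# 'RE' occurs only on line 1, which opens the range. The end address
# /RE/ is first tried on line 2, never matches afterwards, and the range
# therefore runs to the end of the input: every line is deleted.
only_first=$(printf '%s\n' RE a b | sed '1,/RE/d')

# With a second 'RE' on line 3 the range closes there, so 'b' survives.
second_match=$(printf '%s\n' RE a RE b | sed '1,/RE/d')

printf 'RE only on line 1: [%s]\n' "$only_first"
printf 'RE also on line 3: [%s]\n' "$second_match"
```

The first result should be empty and the second should contain only 'b'.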
 
 
 
On Thu, Mar 15, 2018 at 06:16:48PM -0600, Assaf Gordon wrote:
> There is a subtle issue here:
> when using 2 addresses, and the second address is an RE,
> the first line matching the first address (line 2 in your case)
> will *never* be checked against the RE.
>
> And so, even though the '//!d' is run conditionally,
> the condition is true (line 2, before regex is checked),
> and then '//!d' is executed but there is no 'last regex' yet.
>
> This is documented in the manual:
> https://www.gnu.org/software/sed/manual/sed.html#Range-Addresses
> (in the second paragraph, starting with "if the second address is a regexp").
>
> Also,
> In the POSIX standard, the relevant text is:
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html
> (under "addresses in sed" section).
>
> "An editing command with two addresses shall select the inclusive
> range from the first pattern space that matches the first address
> through the next pattern space that matches the second. "
>
> The interpretation is that the second address is checked against
> "the next pattern space" - implying that the first time the second
> address is checked is not on the first line that matches, but
> on the 'next pattern space' (meaning starting at the line following
> the line that matched the first address).
> [phew, that is a bit confusing....]
>
> Does this clarify the issue?
>
> -assaf
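
The range rule discussed in this thread can also be seen when both addresses are regexes. The following is a minimal sketch; the input lines and the variable name are invented for illustration and do not come from the thread.

```shell
# Both addresses are the same regex. Line 2 ('x') opens the range. If the
# end address were also tried on that line, the range would close at once
# and only the first 'x' would be printed. Instead the end address is
# first tried on line 3, so the range runs through the second 'x'.
range=$(printf '%s\n' a x b x c | sed -n '/x/,/x/p')

printf '%s\n' "$range"
```

This should print x, b and x on separate lines, which matches the reading that the second address is only checked starting with the line after the one that opened the range.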

From: Debbugs Internal Request
To: internal_control@debbugs.gnu.org
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 13 Apr 2018 11:24:04 +0000

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator