From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 13 13:53:08 2018 Received: (at submit) by debbugs.gnu.org; 13 Jun 2018 17:53:08 +0000 Received: from localhost ([127.0.0.1]:46879 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fT9xE-0000fL-Ca for submit@debbugs.gnu.org; Wed, 13 Jun 2018 13:53:08 -0400 Received: from eggs.gnu.org ([208.118.235.92]:42854) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fT9BL-0007zv-TX for submit@debbugs.gnu.org; Wed, 13 Jun 2018 13:03:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fT9BF-0007On-FQ for submit@debbugs.gnu.org; Wed, 13 Jun 2018 13:03:34 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM, HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:34749) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fT9BF-0007Nz-BQ for submit@debbugs.gnu.org; Wed, 13 Jun 2018 13:03:33 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33820) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fT9BD-0001Yl-Rk for bug-sed@gnu.org; Wed, 13 Jun 2018 13:03:33 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fT9BC-0007G3-Mi for bug-sed@gnu.org; Wed, 13 Jun 2018 13:03:31 -0400 Received: from mail-wr0-x22f.google.com ([2a00:1450:400c:c0c::22f]:40755) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fT9BC-0007BE-Fh for bug-sed@gnu.org; Wed, 13 Jun 2018 13:03:30 -0400 Received: by mail-wr0-x22f.google.com with SMTP id l41-v6so3511416wre.7 for ; Wed, 13 Jun 2018 10:03:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:reply-to:from:date:message-id:subject:to; 
bh=JUyEzgL03V4WrgOoXUdlpUKWm56dRoQ27zbfO3u1iBw=; b=F6JND3fTxwxEloBeSRTvkiMqfyRrHUzrdsjp0ucG7+jjt5dycv9RAZa7L1aD0/rVQ5 6ep4TPRD8vtUnRAjFO8Vc8okiH//IX2kq9AahygxyC8YQk43BO1izaglDnKa7vnxAywO MM6N3n7HZKkTiXjpV5rNir0+0aULn6XOaVNXux4YQtyfZh34s7+Kv+5/FYKgAGHlGxgO m4qmTa4hkTBMKGrUPTvNtLxNK5GGHuYW7FMzZkicL7OEDGdRUpNScFOdOPztGGGFG7tz msSA3XFJub7TTYEX/uhAc3DMVh3RceJmx9scq5mdBLGkdCEV3vrjvbappwAywc2iqV5Z pBPw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:reply-to:from:date:message-id :subject:to; bh=JUyEzgL03V4WrgOoXUdlpUKWm56dRoQ27zbfO3u1iBw=; b=VdIx/h2HjCEa4vctKVB/3GxMZNthAi2+jIgFQBeYBTznchekGd8gwnDzAO72n9/W+W 81/VXN7lLXpi0yE0YUhFm729ZIXmXSb44wo/2ThyzHT7CpwW9u9Qp2VFyU7Y60ss8mkd pC5q3r7CU8RarKrhlDXKpEDPg8mkeCh+d93xKuflvK8+GCVk48KaqAy+2k3saOy2w5IM yzKmS2G2AlIBT6d4pOjuUa/aS4byK9orLcoMrgro+CB19Nrvt4yXFtFYoETEcppgABKy /pvYuOdR2nCFBio8QtRxnukwc18VEJbhQyUmpCO9fIF+VpKdMmJg0H/EejdLhjcWu4j7 SDCg== X-Gm-Message-State: APt69E3/yN8C4bSvHEXVFtbwcS8zK5RgRmWq4zTxUQGTik00hTXDa23Z 87MC6sYGNwgrT8iW9dQbY5+0C7BbfGCNPgGskfWMKA== X-Google-Smtp-Source: ADUXVKLeYGLQHKozdeqFmMmB+UUFs5q+6HsHz/mlJdV/HtnDskq6Fm3tRTlFBhipJkkBQ3xmbesRDAngQTnkQ7bdOPs= X-Received: by 2002:adf:84c2:: with SMTP id 60-v6mr4998053wrg.167.1528909408352; Wed, 13 Jun 2018 10:03:28 -0700 (PDT) MIME-Version: 1.0 From: Mark Otto Date: Wed, 13 Jun 2018 13:03:16 -0400 Message-ID: Subject: Saved Sub String Only Saves Last To: bug-sed@gnu.org Content-Type: multipart/alternative; boundary="0000000000002dbc81056e88f618" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-Debbugs-Envelope-To: submit
X-Mailman-Approved-At: Wed, 13 Jun 2018 13:53:06 -0400
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: Mark.Ot2o@gmail.com
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -5.0 (-----)

--0000000000002dbc81056e88f618
Content-Type: text/plain; charset="UTF-8"

If I use a saved substring it should capture the maximum number of
characters that fit the pattern, in this case [0-9][0-9]*.

echo "I'm 2254 years old"|sed "s/^..*\([0-9][0-9]*\) /She's \1 /"
She's 4 years old"

She should be 2254 years old.

It does search correctly because without the substring it replaces all the
digits:

echo "I'm 2287 years old"|sed "s/^..*[0-9][0-9]*/She's many/"
She's many years old"

Here is my version information:

sed --version # On Windows 10
sed (GNU sed) 4.4
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Jay Fenlason, Tom Lord, Ken Pizzini,
and Paolo Bonzini.
GNU sed home page: <http://www.gnu.org/software/sed/>.
General help using GNU software: <http://www.gnu.org/gethelp/>.
E-mail bug reports to: <bug-sed@gnu.org>.

--0000000000002dbc81056e88f618
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000002dbc81056e88f618-- From debbugs-submit-bounces@debbugs.gnu.org Mon Jun 18 16:04:32 2018 Received: (at control) by debbugs.gnu.org; 18 Jun 2018 20:04:32 +0000 Received: from localhost ([127.0.0.1]:55316 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fV0O8-0006D6-2L for submit@debbugs.gnu.org; Mon, 18 Jun 2018 16:04:32 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:44128 helo=mx1.redhat.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fV0O5-0006Cn-Up; Mon, 18 Jun 2018 16:04:30 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 20EC97D84D; Mon, 18 Jun 2018 20:04:24 +0000 (UTC) Received: from [10.10.125.119] (ovpn-125-119.rdu2.redhat.com [10.10.125.119]) by smtp.corp.redhat.com (Postfix) with ESMTP id C46527C39; Mon, 18 Jun 2018 20:04:23 +0000 (UTC) Subject: Re: bug#31816: Saved Sub String Only Saves Last To: Mark.Ot2o@gmail.com, 31816-done@debbugs.gnu.org, GNU bug control References: From: Eric Blake Organization: Red Hat, Inc. 
Message-ID:
Date: Mon, 18 Jun 2018 15:04:23 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Mon, 18 Jun 2018 20:04:24 +0000 (UTC)
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Mon, 18 Jun 2018 20:04:24 +0000 (UTC) for IP:'10.11.54.5' DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'eblake@redhat.com' RCPT:''
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -3.3 (---)

tag 31816 notabug
thanks

On 06/13/2018 12:03 PM, Mark Otto wrote:
> If I use a saved substring it should capture the maximum number of
> characters that fit the pattern, in this case [0-9][0-9]*.

Sed already does that (an operator is as greedy as possible, given what
has already been matched earlier in the line). However, you are
misunderstanding how greedy operators work.

>
> echo "I'm 2254 years old"|sed "s/^..*\([0-9][0-9]*\) /She's \1 /"
> She's 4 years old"

That is correct output. Remember, in sed, every pattern is evaluated
from left to right to find the longest possible substring that will
match, where patterns on the left use a shorter substring only if
patterns on the right are not possible with the longest substring.
Since .* is a greedy pattern, you have matched:

"I" "'m 225" "4"
^.  .*       \([0-9][0-9]*\)

>
>
> She should be 2254 years old.
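
The split described above can be checked directly. What follows is only a
minimal sketch, assuming a POSIX shell and a POSIX-compatible sed. The
variable names line, greedy and parts are illustrative and do not come from
the report, and printf is used instead of echo so that no quote characters
end up in the data. The second expression is an addition for illustration:
it captures what ..* consumed so the boundary becomes visible.

```shell
# Input line from the report, without surrounding quote characters.
line="I'm 2254 years old"

# Expression from the report: the greedy ..* takes as much as it can and
# leaves only the last digit for the \(...\) group.
greedy=$(printf '%s\n' "$line" | sed "s/^..*\([0-9][0-9]*\) /She's \1 /")
printf '%s\n' "$greedy"    # She's 4 years old

# Illustrative variant: capture both pieces to see where the split falls.
parts=$(printf '%s\n' "$line" | sed 's/^\(..*\)\([0-9][0-9]*\) .*/[\1][\2]/')
printf '%s\n' "$parts"     # [I'm 225][4]
```

The first bracket holds everything ^..* matched, the second the single digit
left over for the group.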

If you want the second pattern to match longer as a higher priority than
the first .* pattern being greedy, you have to use some other pattern on
the first use, such as:

echo "I'm 2254 years old" | sed "s/^..*[^0-9]\([0-9][0-9]*\)/She's \1/"

which matches as:

"I" "'m" " "     "2254"
^.  .*   [^0-9]  \([0-9][0-9]*\)

where my explicit match of a non-digit forced the .* to be less greedy.

Or, you can use other languages, like perl, which have the extension of
non-greedy operators, as in:

echo "I'm 2254 years old" | perl -pe "s/^..*?([0-9]+) /She's \1/"

perl is more like 'sed -E', but has the additional '.*?' non-greedy
counterpart to '.*' that sed lacks.

>
> It does search correctly because without the substring it replaces all the
> digits:
>
> echo "I'm 2287 years old"|sed "s/^..*[0-9][0-9]*/She's many/"
> She's many years old"

That output is still correct, but wasn't doing what you claimed it was
doing. Again, it was matching:

"I" "'m 228" "7"
^.  .*       [0-9][0-9]*

then replacing that entire match.

As such, I'm marking this as not a bug. But feel free to comment
further if you still need help.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
+1-919-301-3266 Virtualization: qemu.org | libvirt.org From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 20 11:13:58 2018 Received: (at 31816) by debbugs.gnu.org; 20 Jun 2018 15:13:58 +0000 Received: from localhost ([127.0.0.1]:57809 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fVeo1-00024i-Vo for submit@debbugs.gnu.org; Wed, 20 Jun 2018 11:13:58 -0400 Received: from mail-wr0-f178.google.com ([209.85.128.178]:34190) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fVbUh-00081c-Lp for 31816@debbugs.gnu.org; Wed, 20 Jun 2018 07:41:48 -0400 Received: by mail-wr0-f178.google.com with SMTP id a12-v6so2975155wro.1 for <31816@debbugs.gnu.org>; Wed, 20 Jun 2018 04:41:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:reply-to:from:date:message-id :subject:to; bh=k+l2fFHCdBDcT63xw4Wx8xAImQk6es6tmRi2jWvIIc4=; b=UzCegxISxH69iVE7LVYlYHy2APsx+NRLFBV50FFe1PheVFmSOQneVqv1nd2jAea64b rI46CXbQADIdbWYsc6LLLACb8U/fS8wX8y8DWZX96BgPAj5/mePZwP7a3nEB7qYFptfQ AyJKr+MIUvaxbZ4zEGHamq19O8w+SCviQhSIvKfRNmYbMZVZt5dsekmJm4rzhWycYBLY ObshwerFp2zpyeG6lv15aXazue3cdxGFF7fq9MKnOhdb/z5BcftdE8ulLhy5xzJbB0WF Vq/yj2E6wpmIYoQYASxpb5DVb+TOg+tPlaT77SuEfV9HpblBanRn2kmYoFmmV9UBjrXG MXFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:reply-to :from:date:message-id:subject:to; bh=k+l2fFHCdBDcT63xw4Wx8xAImQk6es6tmRi2jWvIIc4=; b=mYGdT532ulWY1EerO8jHawJnEHGcAPltHxCMlTW7wEQI4eAeE1p4f4ygTJgftiqdV4 15Lb3Fi7gdg39Jo4O1BDg9lEBiOE2cfVhx8NUoMocQhO5aYymczogl20LaWARGfsCs+c Bz378TQVHMbPKjIr/KHG1BMWjbKniDuaGi79QRu41PKTVgC5rYx19zJCm3CRFvUPaIO8 XTpsnujEP6JqWixiiE0EU26rTRgP0MBD9MWZUpw+yF8+u4R1xHmZ6z0QC9HQC5e26sdr zmfVb8l+hHoxhBbj3Nlj1xwEQVOjGW7OeGaR6ETuCuW+nVzRngX5Og0P3e8ylF8PNR9W vmRg== X-Gm-Message-State: APt69E03zYKuTOiFdtklN2FGW9ms63YnqBXtG/rd7uRfGS/nR5Ld1fRA 
 w+mE3MtnNI4DystKesjQeQt/R4gKTJN+5qL0yRQz4w==
X-Google-Smtp-Source: ADUXVKL3YYQkYeuJKlxN4cYB8XzixF0whsfDEHgOnkXTWpSG61CIDZPgu/RejzkZPgwP+wc+jb1EQ4lwTr/c9tHhFRg=
X-Received: by 2002:adf:e084:: with SMTP id c4-v6mr17014091wri.199.1529494901578; Wed, 20 Jun 2018 04:41:41 -0700 (PDT)
MIME-Version: 1.0
References:
In-Reply-To:
From: Mark Otto
Date: Wed, 20 Jun 2018 07:41:29 -0400
Message-ID:
Subject: Re: bug#31816: closed (Re: bug#31816: Saved Sub String Only Saves Last)
To: 31816@debbugs.gnu.org, Eric Blake
Content-Type: multipart/alternative; boundary="0000000000004b62ec056f114880"
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 31816
X-Mailman-Approved-At: Wed, 20 Jun 2018 11:13:57 -0400
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: Mark.Ot2o@gmail.com
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -1.0 (-)

--0000000000004b62ec056f114880
Content-Type: text/plain; charset="UTF-8"

Dear Eric,

Thank you for your thorough explanation of the greediness of sed. If I was
thinking about sed's greediness, I should have thought that it would be
consistent at every point, including being greedy before my back reference.
The nongreedy perl operators are intuitive, but their matching process still
needs to be thought through.

I found an explanation of the difference between greedy and non-greedy here:

Consider the input 101000000000100. Using 1.*1, * is greedy. It will match
all the way to the end, and then backtrack until it can match a 1, leaving
you with 1010000000001. .*? is non-greedy. * will match nothing, but then
will try to match extra characters until it matches a 1, eventually matching
101. All quantifiers have a non-greedy mode: .*?, .+?, .{2,6}?, and even .??.

Sed is a UNIX standard, so I could think harder about how it works rather
than jumping to "It's a bug!"
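
The greedy half of that example can be reproduced with sed itself. Since sed
has no .*? operator, the shortest match is approximated here with a negated
bracket expression, which is a swapped-in technique and not a true
non-greedy quantifier. This is only a minimal sketch, assuming a POSIX shell
and a POSIX-compatible sed. The variable names input, greedy, shortest and
age are illustrative. The last command reuses the expression Eric gave
earlier in this thread.

```shell
input=101000000000100

# Greedy: .* runs to the end of the line, then backs off to the last 1.
greedy=$(printf '%s\n' "$input" | sed 's/\(1.*1\).*/\1/')
printf '%s\n' "$greedy"      # 1010000000001

# Approximation of 1.*?1: [^1]* cannot step over a 1, so it stops at the
# first one it meets.
shortest=$(printf '%s\n' "$input" | sed 's/\(1[^1]*1\).*/\1/')
printf '%s\n' "$shortest"    # 101

# The same idea applied to the reported input, using the expression from
# the thread: [^0-9] keeps ..* from swallowing the leading digits.
age=$(printf '%s\n' "I'm 2254 years old" |
  sed "s/^..*[^0-9]\([0-9][0-9]*\)/She's \1/")
printf '%s\n' "$age"         # She's 2254 years old
```

The bracket trick works only when the stopping point is a single character
class. Stopping at a multi-character delimiter needs a different approach or
a tool with real non-greedy quantifiers.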

Best wishes,
Mark

On Mon, Jun 18, 2018 at 4:05 PM GNU bug Tracking System <
help-debbugs@gnu.org> wrote:

> Your bug report
>
> #31816: Saved Sub String Only Saves Last
>
> which was filed against the sed package, has been closed.
>
> The explanation is attached below, along with your original report.
> If you require more details, please reply to 31816@debbugs.gnu.org.
>
> --
> 31816: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=31816
> GNU Bug Tracking System
> Contact help-debbugs@gnu.org with problems
>
>
>
> ---------- Forwarded message ----------
> From: Eric Blake
> To: Mark.Ot2o@gmail.com, 31816-done@debbugs.gnu.org, GNU bug control <
> control@debbugs.gnu.org>
> Cc:
> Bcc:
> Date: Mon, 18 Jun 2018 15:04:23 -0500
> Subject: Re: bug#31816: Saved Sub String Only Saves Last
> tag 31816 notabug
> thanks
>
> On 06/13/2018 12:03 PM, Mark Otto wrote:
> > If I use a saved substring it should capture the maximum number of
> > characters that fit the pattern, in this case [0-9][0-9]*.
>
> Sed already does that (an operator is as greedy as possible, given what
> has already been matched earlier in the line). However, you are
> misunderstanding how greedy operators work.
>
> >
> > echo "I'm 2254 years old"|sed "s/^..*\([0-9][0-9]*\) /She's \1 /"
> > She's 4 years old"
>
> That is correct output. Remember, in sed, every pattern is evaluated
> from left to right to find the longest possible substring that will
> match, where patterns on the left use a shorter substring only if
> patterns on the right are not possible with the longest substring.
> Since .* is a greedy pattern, you have matched:
>
> "I" "'m 225" "4"
> ^.  .*       \([0-9][0-9]*\)
>
> >
> >
> > She should be 2254 years old.
>
> If you want the second pattern to match longer as a higher priority than
> the first .* pattern being greedy, you have to use some other pattern on
> the first use, such as:
>
> echo "I'm 2254 years old" | sed "s/^..*[^0-9]\([0-9][0-9]*\)/She's \1/"
>
> which matches as:
>
> "I" "'m" " "     "2254"
> ^.  .*   [^0-9]  \([0-9][0-9]*\)
>
> where my explicit match of a non-digit forced the .* to be less greedy.
>
> Or, you can use other languages, like perl, which have the extension of
> non-greedy operators, as in:
>
> echo "I'm 2254 years old" | perl -pe "s/^..*?([0-9]+) /She's \1/"
>
> perl is more like 'sed -E', but has the additional '.*?' non-greedy
> counterpart to '.*' that sed lacks.
>
> >
> > It does search correctly because without the substring it replaces all
> the
> > digits:
> >
> > echo "I'm 2287 years old"|sed "s/^..*[0-9][0-9]*/She's many/"
> > She's many years old"
>
> That output is still correct, but wasn't doing what you claimed it was
> doing. Again, it was matching:
>
> "I" "'m 228" "7"
> ^.  .*       [0-9][0-9]*
>
> then replacing that entire match.
>
> As such, I'm marking this as not a bug. But feel free to comment
> further if you still need help.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3266
> Virtualization: qemu.org | libvirt.org
>
>
>
>
> ---------- Forwarded message ----------
> From: Mark Otto
> To: bug-sed@gnu.org
> Cc:
> Bcc:
> Date: Wed, 13 Jun 2018 13:03:16 -0400
> Subject: Saved Sub String Only Saves Last
> If I use a saved substring it should capture the maximum number of
> characters that fit the pattern, in this case [0-9][0-9]*.
>
> echo "I'm 2254 years old"|sed "s/^..*\([0-9][0-9]*\) /She's \1 /"
> She's 4 years old"
>
>
> She should be 2254 years old.
>
> It does search correctly because without the substring it replaces all the
> digits:
>
> echo "I'm 2287 years old"|sed "s/^..*[0-9][0-9]*/She's many/"
> She's many years old"
>
>
> Here is my version information:
>
> sed --version # On Windows 10
> sed (GNU sed) 4.4
> Copyright (C) 2017 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <
> http://gnu.org/licenses/gpl.html>.
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
>
> Written by Jay Fenlason, Tom Lord, Ken Pizzini,
> and Paolo Bonzini.
> GNU sed home page: <http://www.gnu.org/software/sed/>.
> General help using GNU software: <http://www.gnu.org/gethelp/>.
> E-mail bug reports to: <bug-sed@gnu.org>.
>

--0000000000004b62ec056f114880
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000004b62ec056f114880-- From unknown Tue Jun 17 22:28:40 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Thu, 19 Jul 2018 11:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator