From unknown Sat Jun 21 05:16:18 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#5970 <5970@debbugs.gnu.org> To: bug#5970 <5970@debbugs.gnu.org> Subject: Status: regex won't do lazy matching Reply-To: bug#5970 <5970@debbugs.gnu.org> Date: Sat, 21 Jun 2025 12:16:18 +0000 retitle 5970 regex won't do lazy matching reassign 5970 coreutils submitter 5970 a g severity 5970 normal thanks From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 18 22:01:36 2010 Received: (at submit) by debbugs.gnu.org; 19 Apr 2010 02:01:36 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3gIp-0001Tb-G2 for submit@debbugs.gnu.org; Sun, 18 Apr 2010 22:01:36 -0400 Received: from mail.gnu.org ([199.232.76.166] helo=mx10.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3frn-0001Ik-3B for submit@debbugs.gnu.org; Sun, 18 Apr 2010 21:33:39 -0400 Received: from lists.gnu.org ([199.232.76.165]:36051) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1O3frj-0002BA-21 for submit@debbugs.gnu.org; Sun, 18 Apr 2010 21:33:35 -0400 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1O3fri-0001vC-MO for bug-coreutils@gnu.org; Sun, 18 Apr 2010 21:33:34 -0400 Received: from [140.186.70.92] (port=51179 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1O3frh-0001ta-AD for bug-coreutils@gnu.org; Sun, 18 Apr 2010 21:33:34 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM, HTML_MESSAGE,RCVD_IN_DNSWL_NONE,T_DKIM_INVALID,T_TO_NO_BRKTS_FREEMAIL autolearn=unavailable version=3.3.0 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 
1O3frf-0000E1-PM for bug-coreutils@gnu.org; Sun, 18 Apr 2010 21:33:33 -0400 Received: from qw-out-1920.google.com ([74.125.92.145]:25941) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3frf-0000Dp-L7 for bug-coreutils@gnu.org; Sun, 18 Apr 2010 21:33:31 -0400 Received: by qw-out-1920.google.com with SMTP id 14so963187qwa.24 for ; Sun, 18 Apr 2010 18:33:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:received:message-id :subject:from:to:content-type; bh=9Crg3HbfmM9w1Wg0caZMaXvQVPqYMC4yHb7vZByvndw=; b=JTkCat7yMOm3+GyA1v8dxGPwrix5WBIj8mpgS/h5TYTo2PzjKvqSFiZXZV/KNM76LX dSi7I9bvmhWtHiCzYfJj6VatZvB+0ZEbnrC44YIThujY1BedIyPSmBL3Vd+L+pPCI/JZ aIUXq0d271UMDmnoXCb4jd2sggTtHDSE+t8Yo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=c/fG824zMUuL3bcWFSDlfR9U3UnpcuFSiBDs/0GDdU7dmWmK6ctM0AC30cn3XjLfHS yTxy21pu0uwxSBCk9iFuYGBR6wZzogEfdrfrEIwrNJsFJoHH7mEBsZqwuafV959OEu+n dhlJS23QrZbIZaADl1PCqmosv/ZidkvnIG/xY= MIME-Version: 1.0 Received: by 10.229.232.204 with HTTP; Sun, 18 Apr 2010 18:33:30 -0700 (PDT) Date: Sun, 18 Apr 2010 21:33:30 -0400 Received: by 10.229.190.213 with SMTP id dj21mr1032790qcb.66.1271640810349; Sun, 18 Apr 2010 18:33:30 -0700 (PDT) Message-ID: Subject: regex won't do lazy matching From: a g To: bug-coreutils@gnu.org Content-Type: multipart/alternative; boundary=001636283b68c8664f04848cf202 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2) X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Sun, 18 Apr 2010 22:01:35 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -4.6 (----)

--001636283b68c8664f04848cf202
Content-Type: text/plain; charset=ISO-8859-1

This may be a usage problem, but it does not exist with other regex packages (such as slre) and I can't find anything in the documentation to indicate that the syntax should be different for coreutils. I am using coreutils 8.4 on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher to do lazy matching. Here is my code:

/**** regex_test.cpp ***/
#include <stdio.h>
#include <stdlib.h>
#include "xalloc.h"
#include "regex.h"

// compile:
// gcc -I coreutils-8.4/lib/ -c regex_test.cpp
// g++ -o regex_test regex_test.o coreutils-8.4/lib/xmalloc.o coreutils-8.4/lib/xalloc-die.o coreutils-8.4/lib/exitfail.o coreutils-8.4/lib/regex.o

void print_regerror (int errcode, regex_t *compiled)
{
  size_t length = regerror (errcode, compiled, NULL, 0);
  char *buffer = (char *)xmalloc (length);
  if(!buffer) printf("error: regerror malloc failed!\n");
  else {
    (void) regerror (errcode, compiled, buffer, length);
    printf("error: %s\n", buffer);
    free(buffer);
  }
}

int main(int argc, char *argv[]){
  if(argc < 3) printf("usage: regex_test pattern string\n");
  else {
    regex_t rx;
    int err;
    if((err = regcomp(&rx, argv[1], REG_EXTENDED)))
      print_regerror(err, &rx);
    else {
      regmatch_t matches[4];
      if(!regexec(&rx, argv[2], 4, matches, 0)) {
        int i;
        printf("match! \n");
        for(i = 0; i < 4; i++) {
          if(matches[i].rm_so != -1)
            printf("  s:%i, e:%i", matches[i].rm_so, matches[i].rm_eo);
        }
        printf("\n");
      } else printf("match failed.\n");
      regfree(&rx);
    }
  }
}
/********/

Here is the problem. If you execute:
  regex_test "a[^x]*?a" "a1a2a"

then you get:
  match!
    s:0, e:5

But, you should get the same result as when you execute *regex_test "a[^x]*?a" "a1a"*-- that is:
  match!
    s:0, e:3

Please advise. Thank you!

--001636283b68c8664f04848cf202
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
This may be a usage problem, but it does not exist with other regex packages (such as slre) and I can't find anything in the documentation to indicate that the syntax should be different for coreutils. I am using coreutils 8.4 on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher to do lazy matching. Here is my code:

/**** regex_test.cpp ***/
#include <stdio.h>
#include <stdlib.h>
#include "xalloc.h"
#include "regex.h"

// compile:
// gcc -I coreutils-8.4/lib/ -c regex_test.cpp
// g++ -o regex_test regex_test.o coreutils-8.4/lib/xmalloc.o coreutils-8.4/lib/xalloc-die.o coreutils-8.4/lib/exitfail.o coreutils-8.4/lib/regex.o

void print_regerror (int errcode, regex_t *compiled)
{
  size_t length = regerror (errcode, compiled, NULL, 0);
  char *buffer = (char *)xmalloc (length);
  if(!buffer) printf("error: regerror malloc failed!\n");
  else {
    (void) regerror (errcode, compiled, buffer, length);
    printf("error: %s\n", buffer);
    free(buffer);
  }
}

int main(int argc, char *argv[]){
  if(argc < 3) printf("usage: regex_test pattern string\n");
  else {
    regex_t rx;
    int err;
    if((err = regcomp(&rx, argv[1], REG_EXTENDED)))
      print_regerror(err, &rx);
    else {
      regmatch_t matches[4];
      if(!regexec(&rx, argv[2], 4, matches, 0)) {
        int i;
        printf("match! \n");
        for(i = 0; i < 4; i++) {
          if(matches[i].rm_so != -1)
            printf("  s:%i, e:%i", matches[i].rm_so, matches[i].rm_eo);
        }
        printf("\n");
      } else printf("match failed.\n");
      regfree(&rx);
    }
  }
}
/********/

Here is the problem. If you execute:
  regex_test "a[^x]*?a" "a1a2a"

then you get:
  match!
    s:0, e:5

But, you should get the same result as when you execute regex_test "a[^x]*?a" "a1a"-- that is:
  match!
    s:0, e:3

Please advise. Thank you!
--001636283b68c8664f04848cf202-- From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 19 17:36:57 2010 Received: (at 5970) by debbugs.gnu.org; 19 Apr 2010 21:36:57 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3yeG-0005jN-0s for submit@debbugs.gnu.org; Mon, 19 Apr 2010 17:36:56 -0400 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3yeE-0005jI-5B for 5970@debbugs.gnu.org; Mon, 19 Apr 2010 17:36:54 -0400 Received: from int-mx03.intmail.prod.int.phx2.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.16]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o3JLan0W009613 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Mon, 19 Apr 2010 17:36:49 -0400 Received: from [10.3.252.118] (vpn-252-118.phx2.redhat.com [10.3.252.118]) by int-mx03.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id o3JLamW1003846; Mon, 19 Apr 2010 17:36:49 -0400 Message-ID: <4BCCCCF6.1010301@redhat.com> Date: Mon, 19 Apr 2010 15:36:54 -0600 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100330 Fedora/3.0.4-1.fc12 Lightning/1.0b1 Thunderbird/3.0.4 MIME-Version: 1.0 To: a g Subject: Re: bug#5970: regex won't do lazy matching References: In-Reply-To: X-Enigmail-Version: 1.0.1 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enigB7F0B02204FCEF054724EF9C" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.16 X-Spam-Score: -10.2 (----------) X-Debbugs-Envelope-To: 5970 Cc: 5970@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.2 (----------) This is an 
OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigB7F0B02204FCEF054724EF9C
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On 04/18/2010 07:33 PM, a g wrote:
> This may be a usage problem, but it does not exist with other regex packages
> (such as slre) and I can't find anything in the documentation to indicate
> that the syntax should be different for coreutils. I am using coreutils 8.4
> on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher to
> do lazy matching.

Thanks for the report. However, coreutils does not maintain regex code.
Rather, it uses an upstream version from gnulib, which in turn borrows
from glibc.

Perhaps the best course of action would be to try your test app compiled
against glibc, and if that still doesn't meet your needs, then open a bug
report against glibc. And if glibc works, then open a bug report against
gnulib that gnulib and glibc disagree.

-- 
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org

--------------enigB7F0B02204FCEF054724EF9C
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Public key at http://people.redhat.com/eblake/eblake.gpg
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBCAAGBQJLzMz2AAoJEKeha0olJ0Nqo40H/jnGs40kXHBfwbTI2JyuJ6oa
1e06uboHhwVmplluiVCIgrxvL29VxHXoIBKEsT51FCKDnDtrrM/fUGVLlTPgqLG1
X2WUB4V1Z0T4Os/tjatfmzV7WHFVy0Kkf6jHhmAxFCAOnAE3R/ySUyzYgezwwU5F
TGRixkXwxwY7Qw6p7phZZLvTZmv3+zdQgrKcPCXPT7YK3U9x+tgWzfZsw/o1byCr
kILijRt/wiTVtYg8cB+O3zmZz2ycAk61Z733mH2B1MIPQ0ha1Z7/BQ2Q+nVc5ztp
QhQrYCIzZLelzvAlOuQnxNfw9Ds0ZMRMLu7kFZFE1a1Uxu+iULxdqM8TFXJ/ZRg=
=e9Nq
-----END PGP SIGNATURE-----

--------------enigB7F0B02204FCEF054724EF9C--

From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 19 18:01:55
2010 Received: (at 5970) by debbugs.gnu.org; 19 Apr 2010 22:01:55 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3z2Q-0005uE-Ve for submit@debbugs.gnu.org; Mon, 19 Apr 2010 18:01:55 -0400 Received: from c-98-226-122-10.hsd1.in.comcast.net ([98.226.122.10] helo=kosh.dhis.org) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1O3yxK-0005ri-1k for 5970@debbugs.gnu.org; Mon, 19 Apr 2010 17:56:38 -0400 Received: (qmail 20107 invoked by uid 1000); 19 Apr 2010 21:56:33 -0000 Message-ID: <20100419215633.20106.qmail@kosh.dhis.org> From: "Alan Curry" Subject: Re: bug#5970: regex won't do lazy matching To: mewalig@gmail.com (a g) Date: Mon, 19 Apr 2010 16:56:33 -0500 (GMT+5) In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: 1.9 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: a g writes: > > This may be a usage problem, but it does not exist with other regex packages > (such as slre) and I can't find anything in the documentation to indicate > that the syntax should be different for coreutils. I am using coreutils 8.4 > on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher to > do lazy matching. Here is my code: [...] 
Content analysis details: (1.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL [98.226.122.10 listed in zen.spamhaus.org] 0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address [98.226.122.10 listed in dnsbl.sorbs.net] 0.0 BAYES_50 BODY: Bayesian spam probability is 40 to 60% [score: 0.4863] 0.1 RDNS_DYNAMIC Delivered to trusted network by host with dynamic-looking rDNS X-Debbugs-Envelope-To: 5970 X-Mailman-Approved-At: Mon, 19 Apr 2010 18:01:53 -0400 Cc: 5970@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: 0.6 (/) a g writes: > > This may be a usage problem, but it does not exist with other regex packages > (such as slre) and I can't find anything in the documentation to indicate > that the syntax should be different for coreutils. I am using coreutils 8.4 > on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher to > do lazy matching. Here is my code: By "lazy" do you mean non-greedy? > Here is the problem. If you execute: > regex_test "a[^x]*?a" "a1a2a" The non-greedy quantifiers like *? are not part of standard regex, they are extensions found in perl, and in other packages inspired by perl. 
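
For what it's worth, the shortest-match result asked for here can usually be had in plain POSIX ERE without any non-greedy operator: exclude the closing character from the bracket expression, so the greedy `*` cannot run past the first closing `a`. What follows is only a minimal sketch under stated assumptions. It assumes the system `<regex.h>` (the standard `regcomp`/`regexec`/`regfree` interface) rather than anything specific to coreutils, and `match_span` is a hypothetical helper name introduced for illustration, not part of any library.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a library function): compile `pattern` as a
 * POSIX extended regular expression, run it against `text`, and report
 * the byte offsets of the overall match through *start and *end.
 * Returns 0 on a match, or the nonzero regcomp/regexec error code.
 */
int match_span(const char *pattern, const char *text, long *start, long *end)
{
    regex_t rx;
    regmatch_t m[1];

    int rc = regcomp(&rx, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;              /* pattern did not compile */

    rc = regexec(&rx, text, 1, m, 0);
    if (rc == 0) {
        *start = (long) m[0].rm_so;
        *end = (long) m[0].rm_eo;
    }

    regfree(&rx);
    return rc;
}
```

Called as `match_span("a[^xa]*a", "a1a2a", &s, &e)`, this reports the span 0 to 3, the same span as for the input "a1a", because `[^xa]` can never consume the `a` that ends the match. The trick only applies when the terminator is a single character (or a small character class); for anything more involved, a Perl-style engine is the simpler route.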
-- Alan Curry From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 19 18:28:53 2010 Received: (at 5970) by debbugs.gnu.org; 19 Apr 2010 22:28:53 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3zSX-00064x-9T for submit@debbugs.gnu.org; Mon, 19 Apr 2010 18:28:53 -0400 Received: from mail-out.m-online.net ([212.18.0.10]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O3zSU-00064s-Vx for 5970@debbugs.gnu.org; Mon, 19 Apr 2010 18:28:52 -0400 Received: from mail01.m-online.net (mail.m-online.net [192.168.3.149]) by mail-out.m-online.net (Postfix) with ESMTP id 0AA351C00176; Tue, 20 Apr 2010 00:28:45 +0200 (CEST) Received: from localhost (dynscan1.mnet-online.de [192.168.8.164]) by mail.m-online.net (Postfix) with ESMTP id D7EC6903C6; Tue, 20 Apr 2010 00:28:45 +0200 (CEST) X-Virus-Scanned: amavisd-new at mnet-online.de Received: from mail.mnet-online.de ([192.168.3.149]) by localhost (dynscan1.mnet-online.de [192.168.8.164]) (amavisd-new, port 10024) with ESMTP id 2kZSHtenf9sq; Tue, 20 Apr 2010 00:28:44 +0200 (CEST) Received: from igel.home (ppp-88-217-100-218.dynamic.mnet-online.de [88.217.100.218]) by mail.mnet-online.de (Postfix) with ESMTP; Tue, 20 Apr 2010 00:28:44 +0200 (CEST) Received: by igel.home (Postfix, from userid 501) id A429BCA297; Tue, 20 Apr 2010 00:28:44 +0200 (CEST) From: Andreas Schwab To: a g Subject: Re: bug#5970: regex won't do lazy matching References: X-Yow: I'm into SOFTWARE! Date: Tue, 20 Apr 2010 00:28:44 +0200 In-Reply-To: (a. 
g.'s message of "Sun, 18 Apr 2010 21:33:30 -0400") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.95 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: -2.6 (--) X-Debbugs-Envelope-To: 5970 Cc: 5970@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.6 (--) a g writes: > regex_test "a[^x]*?a" "a1a2a" : 9.4.6 EREs Matching Multiple Characters [...] The behavior of multiple adjacent duplication symbols ('+', '*', '?', and intervals) produces undefined results. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From debbugs-submit-bounces@debbugs.gnu.org Sat Apr 24 16:34:14 2010 Received: (at 5970) by debbugs.gnu.org; 24 Apr 2010 20:34:14 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O5m3K-0003Bw-CV for submit@debbugs.gnu.org; Sat, 24 Apr 2010 16:34:14 -0400 Received: from joseki.proulx.com ([216.17.153.58]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O5m3I-0003Bo-9c; Sat, 24 Apr 2010 16:34:13 -0400 Received: from dementia.proulx.com (dementia.proulx.com [192.168.230.115]) by joseki.proulx.com (Postfix) with ESMTP id 48A0321363; Sat, 24 Apr 2010 14:34:10 -0600 (MDT) Received: by dementia.proulx.com (Postfix, from userid 1000) id 3CA813CC3A0; Sat, 24 Apr 2010 14:34:10 -0600 (MDT) Date: Sat, 24 Apr 2010 14:34:10 -0600 From: Bob Proulx To: a g Subject: Re: bug#5970: regex won't do lazy matching Message-ID: <20100424203410.GA7258@dementia.proulx.com> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
User-Agent: Mutt/1.5.18 (2008-05-17) X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: 5970 Cc: 5970@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.5 (--)

tags 5970 + feedback
thanks

a g wrote:
> This may be a usage problem, but it does not exist with other regex packages
> (such as slre) and I can't find anything in the documentation to indicate
> that the syntax should be different for coreutils. I am using coreutils 8.4
> on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher to
> do lazy matching. Here is my code:

I read this and was somewhat confused by it. Could you clarify and
educate me as to your use? Coreutils is not a "regex package" in the
same way as pcre or slre. It does use regular expressions such as in
'expr'. So of course I wondered if you were using a command like
'expr' or were trying to extend coreutils with an additional command.

Also, since the bug-coreutils mailing list is attached to a bug
tracking system, every message thread of discussion opens an issue
ticket in the bug tracker. I believe this issue has been resolved
satisfactorily by the subsequent responses. Do you agree?

Thanks, Bob From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 25 21:56:26 2010 Received: (at 5970) by debbugs.gnu.org; 26 Apr 2010 01:56:26 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6DYg-00028O-0r for submit@debbugs.gnu.org; Sun, 25 Apr 2010 21:56:26 -0400 Received: from mail-qy0-f171.google.com ([209.85.221.171]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6DYd-00028I-Ar for 5970@debbugs.gnu.org; Sun, 25 Apr 2010 21:56:24 -0400 Received: by qyk1 with SMTP id 1so14823160qyk.15 for <5970@debbugs.gnu.org>; Sun, 25 Apr 2010 18:56:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=Vs/ejGnTAA+h9h03nqF4EMt32hBEd72rR+Tb7/lIFXw=; b=m7OGzaPmMbE8RGLaDil94+8mY1Ju9KXMnhJVMKhfFt9ToyEX3bfOV58zsPwYemyPu5 W3dnAjXtYtKOFgUL0W0q5G5veeM7c06kbpM64PnwDCWu7x/U2XqzjfSU3N7GZP6pPMgn D5ng/z3XMeYsfWnL3hwk6fZbOHvsujlThukFc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=BTOX4mWXvFRrbV5y7Fkp8cz1nleKLPvmEfOZxALez/RnQKdSnqw9LPGvOrUpUbWLMg uDOx7UNkMcV6azLuN0KENBBB3KM4z+ZuIgMgE/1sd57eMNZAVM6moGJX+APZtRK0CHaa yCxzZabkcnPXegyX7Q/x+Y3GOJr34+12mX2v0= MIME-Version: 1.0 Received: by 10.229.223.140 with SMTP id ik12mr3940987qcb.98.1272246978975; Sun, 25 Apr 2010 18:56:18 -0700 (PDT) Received: by 10.229.232.204 with HTTP; Sun, 25 Apr 2010 18:56:18 -0700 (PDT) In-Reply-To: <20100424203410.GA7258@dementia.proulx.com> References: <20100424203410.GA7258@dementia.proulx.com> Date: Sun, 25 Apr 2010 21:56:18 -0400 Message-ID: Subject: Re: bug#5970: regex won't do lazy matching From: a g To: Bob Proulx Content-Type: multipart/alternative; boundary=00163630ebdd3f996a04851a1589 X-Spam-Score: -2.6 (--) X-Debbugs-Envelope-To: 5970 Cc: 
5970@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.6 (--) --00163630ebdd3f996a04851a1589 Content-Type: text/plain; charset=ISO-8859-1 thanks for everyone's help. I agree this can be closed, but not for the reasons mentioned (though I appreciate them and they gave me the info I needed to find the answer: osdir.com/ml/lib.gnulib.bugs/2005-04/msg00027.html). Off to try emacs' regex.c instead. It would be nice if coreutils could offer the emacs regex.c (or some other regex that supported non-greedy matching) at least as an additional library if not the standard one... but, not my call... On Sat, Apr 24, 2010 at 4:34 PM, Bob Proulx wrote: > tags 5970 + feedback > thanks > > a g wrote: > > This may be a usage problem, but it does not exist with other regex > packages > > (such as slre) and I can't find anything in the documentation to indicate > > that the syntax should be different for coreutils. I am using coreutils > 8.4 > > on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher > to > > do lazy matching. Here is my code: > > I read this and was somewhat confused by it. Could you clarify and > educate me as to your use? Coreutils is not a "regex package" in the > same way as pcre or slre. It does use regular expressions such as in > 'expr'. So of course I wondered if you were using a command like > 'expr' or were trying to extend coreutils with an additional command. > > Also, since the bug-coreutils mailing list is attached to a bug > tracking system every message thread of discussion opens an issue > ticket in the bug tracker. I believe this issue has been resolved > satisfactorily by the subsequence responses. Do you agree? 
> > Thanks, > Bob >
--00163630ebdd3f996a04851a1589-- From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 26 05:36:35 2010 Received: (at 5970) by debbugs.gnu.org; 26 Apr 2010 09:36:35 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6Kjy-0005Se-TZ for submit@debbugs.gnu.org; Mon, 26 Apr 2010 05:36:35 -0400 Received: from smtp1-g21.free.fr ([212.27.42.1]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6Kjv-0005SW-9v for 5970@debbugs.gnu.org; Mon, 26 Apr 2010 05:36:32 -0400 Received: from smtp1-g21.free.fr (localhost [127.0.0.1]) by smtp1-g21.free.fr (Postfix) with ESMTP id D6331940008 for <5970@debbugs.gnu.org>; Mon, 26 Apr 2010 11:36:25 +0200 (CEST) Received: from mx.meyering.net (mx.meyering.net [82.230.74.64]) by smtp1-g21.free.fr (Postfix) with ESMTP id 03AE09400AF for <5970@debbugs.gnu.org>; Mon, 26 Apr 2010 11:36:23 +0200 (CEST) Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000) id CF3A2C62; Mon, 26 Apr 2010 11:36:22 +0200 (CEST) From: Jim Meyering To: a g Subject: Re: bug#5970: regex won't do lazy matching In-Reply-To: (a. g.'s message of "Sun, 25 Apr 2010 21:56:18 -0400") References: <20100424203410.GA7258@dementia.proulx.com> Date: Mon, 26 Apr 2010 11:36:22 +0200 Message-ID: <87fx2i7dyh.fsf@meyering.net> Lines: 17 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: -3.1 (---) X-Debbugs-Envelope-To: 5970 Cc: 5970@debbugs.gnu.org, Bob Proulx X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.1 (---) a g wrote: > thanks for everyone's help. 
I agree this can be closed, but not for the > reasons mentioned (though I appreciate them and they gave me the info I > needed to find the answer: > osdir.com/ml/lib.gnulib.bugs/2005-04/msg00027.html). Off to try emacs' > regex.c instead. > > It would be nice if coreutils could offer the emacs regex.c (or some other > regex that supported non-greedy matching) at least as an additional library > if not the standard one... but, not my call... BTW, why do you care what regex code coreutils uses? Because of expr? Its use of regexp is tightly specified by POSIX, so we cannot change it without a very good reason. There are a few other coreutils programs that use regex.c functions, but they are not used as frequently. From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 26 06:18:55 2010 Received: (at 5970) by debbugs.gnu.org; 26 Apr 2010 10:18:55 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6LOx-0005mR-0j for submit@debbugs.gnu.org; Mon, 26 Apr 2010 06:18:55 -0400 Received: from mail-out.m-online.net ([212.18.0.9]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6LOv-0005mL-7y for 5970@debbugs.gnu.org; Mon, 26 Apr 2010 06:18:53 -0400 Received: from mail01.m-online.net (mail.m-online.net [192.168.3.149]) by mail-out.m-online.net (Postfix) with ESMTP id 6ECCD1C15579; Mon, 26 Apr 2010 12:18:50 +0200 (CEST) Received: from localhost (dynscan1.mnet-online.de [192.168.8.164]) by mail.m-online.net (Postfix) with ESMTP id 5D11790EFA; Mon, 26 Apr 2010 12:18:50 +0200 (CEST) X-Virus-Scanned: amavisd-new at mnet-online.de Received: from mail.mnet-online.de ([192.168.3.149]) by localhost (dynscan1.mnet-online.de [192.168.8.164]) (amavisd-new, port 10024) with ESMTP id UMBxuPHszSaU; Mon, 26 Apr 2010 12:18:49 +0200 (CEST) Received: from igel.home (ppp-88-217-126-225.dynamic.mnet-online.de [88.217.126.225]) by mail.mnet-online.de (Postfix) with ESMTP; Mon, 26 Apr 2010 12:18:49 +0200 (CEST) 
Received: by igel.home (Postfix, from userid 501) id 4BF85CA297; Mon, 26 Apr 2010 12:18:49 +0200 (CEST) From: Andreas Schwab To: Jim Meyering Subject: Re: bug#5970: regex won't do lazy matching References: <20100424203410.GA7258@dementia.proulx.com> <87fx2i7dyh.fsf@meyering.net> X-Yow: Oh, I get it!! ``The BEACH goes on,'' huh, SONNY?? Date: Mon, 26 Apr 2010 12:18:49 +0200 In-Reply-To: <87fx2i7dyh.fsf@meyering.net> (Jim Meyering's message of "Mon, 26 Apr 2010 11:36:22 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.96 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: -2.6 (--) X-Debbugs-Envelope-To: 5970 Cc: a g , 5970@debbugs.gnu.org, Bob Proulx X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.6 (--) Jim Meyering writes: > Because of expr? Its use of regexp is tightly specified by POSIX, > so we cannot change it without a very good reason. In the OP's use of regexp POSIX defines nothing. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." 
From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 27 18:30:34 2010 Received: (at 5970-done) by debbugs.gnu.org; 27 Apr 2010 22:30:34 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6tIX-0004bC-H7 for submit@debbugs.gnu.org; Tue, 27 Apr 2010 18:30:33 -0400 Received: from joseki.proulx.com ([216.17.153.58]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O6tIU-0004b4-Se for 5970-done@debbugs.gnu.org; Tue, 27 Apr 2010 18:30:31 -0400 Received: from dementia.proulx.com (dementia.proulx.com [192.168.230.115]) by joseki.proulx.com (Postfix) with ESMTP id B763121363; Tue, 27 Apr 2010 16:30:26 -0600 (MDT) Received: by dementia.proulx.com (Postfix, from userid 1000) id AECCB3CC39C; Tue, 27 Apr 2010 16:30:26 -0600 (MDT) Date: Tue, 27 Apr 2010 16:30:26 -0600 From: Bob Proulx To: a g Subject: Re: bug#5970: regex won't do lazy matching Message-ID: <20100427223026.GA19682@dementia.proulx.com> References: <20100424203410.GA7258@dementia.proulx.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.18 (2008-05-17) X-Spam-Score: -2.5 (--) X-Debbugs-Envelope-To: 5970-done Cc: 5970-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.5 (--) a g wrote: > thanks for everyone's help. I agree this can be closed, but not for the > reasons mentioned (though I appreciate them and they gave me the info I > needed to find the answer: > osdir.com/ml/lib.gnulib.bugs/2005-04/msg00027.html). Off to try emacs' > regex.c instead. Okay. I will close the bug. 
> It would be nice if coreutils could offer the emacs regex.c (or some other
> regex that supported non-greedy matching) at least as an additional library
> if not the standard one... but, not my call...

Though you did not answer my question as to your use of the regex
engine in coreutils, I assume by this that you are trying to use it as
some type of general-purpose library. That really isn't its intended
purpose. In coreutils it is there to support the regular expression
matching done in 'expr'. For a general-purpose library, it would be
better if you were to use gnulib or pcre or one of the other libraries
that are intended to be used as a library.

Bob

From debbugs-submit-bounces@debbugs.gnu.org Mon May 03 02:27:38 2010 Received: (at 5970) by debbugs.gnu.org; 3 May 2010 06:27:38 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O8p7y-0007UD-5h for submit@debbugs.gnu.org; Mon, 03 May 2010 02:27:38 -0400 Received: from 173-164-175-65-sfba.hfc.comcastbusiness.net ([173.164.175.65] helo=Ishtar.sc.tlinx.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1O8p27-0007Ra-W1 for 5970@debbugs.gnu.org; Mon, 03 May 2010 02:21:43 -0400 Received: from [192.168.3.12] (Athenae [192.168.3.12]) by Ishtar.sc.tlinx.org (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id o436LHdw007625; Sun, 2 May 2010 23:21:23 -0700 Message-ID: <4BDE6B5D.5080306@tlinx.org> Date: Sun, 02 May 2010 23:21:17 -0700 From: Linda Walsh User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.8.1.24) Gecko/20100228 Lightning/0.9 Thunderbird/2.0.0.24 Mnenhy/0.7.6.666 MIME-Version: 1.0 To: Andreas Schwab Subject: Re: bug#5970: regex won't do lazy matching References: In-Reply-To: X-Stationery: 0.5.1 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: 0.7 (/) X-Debbugs-Envelope-To: 5970 X-Mailman-Approved-At: Mon, 03 May 2010 02:27:37 -0400 Cc: a g , 5970@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -0.6 (/)

Andreas Schwab wrote:
> a g writes:
>
>> regex_test "a[^x]*?a" "a1a2a"
>
> :
>
> 9.4.6 EREs Matching Multiple Characters
> [...]
> The behavior of multiple adjacent duplication symbols ('+', '*', '?',
> and intervals) produces undefined results.
>
> Andreas.
---
Sorry, late to the conversation, but reading email.

There was a time in QA when "undefined results" and "bug" were synonymous.

That a spec would say that is lame.

Personally, I think if it isn't compatible with the Perl Regex, it's a bug,
but that's purely informed personal bias. :-)

-l

From unknown Sat Jun 21 05:16:18 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Mon, 31 May 2010 11:24:03 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator