From unknown Mon Aug 18 19:26:15 2025
From: Adil Mujeeb
To: 5812@debbugs.gnu.org
X-Debbugs-Original-To: bug-coreutils@gnu.org
Subject: bug#5812: expr: Difference in behavior of match and :
Date: Wed, 31 Mar 2010 18:35:16 +0530
X-GNU-PR-Message: report 5812
X-GNU-PR-Package: coreutils

Hello team,

I tried the following snippet in a bash script:

-bash-3.1$userid=`expr "uid=11008(ADILM) gid=1200(cvs),1400(build)" : ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"`
-bash-3.1$echo $userid
ADILM
-bash-3.1$

To my knowledge it should not be able to extract ADILM, as the regex
does not include uppercase letters (A-Z).

In the expr man page it is mentioned that:

-----8<----------
match STRING REGEXP
 same as STRING : REGEXP
-----8<----------

So I tried the following snippet:

-bash-3.1$ userid=`expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"`
-bash-3.1$ echo $userid

-bash-3.1$

I changed the regex and added uppercase letters:

-bash-3.1$ userid=`expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[0-9]*(\(.[0-9A-Za-z]*\)) .*"`
-bash-3.1$ echo $userid
ADILM
-bash-3.1$

So it seems that match is not the same as ":". As per my observation,
":" uses case-insensitive matching while match is strictly
case-sensitive.

Can you update the man page OR let me know if I am doing anything wrong?

Package:
-bash-3.1$ rpm -qf /usr/bin/expr
coreutils-5.97-12.1.el5
-bash-3.1$

Thanks and Regards,
Adil Mujeeb


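
The man page statement quoted in the report can be checked directly, as long as both forms run in the same environment. What follows is a minimal sketch, assuming GNU expr (the `match` keyword is a GNU extension rather than POSIX) and pinning LC_ALL=C as an assumption so that both calls share one locale. `compare_forms` is a hypothetical helper name, not part of coreutils.

```shell
#!/bin/sh
# compare_forms STRING REGEX  (hypothetical helper, not part of coreutils)
# Runs the same match through both spellings and succeeds when the
# captured text is identical.
compare_forms() {
    # LC_ALL=C is an assumption: it pins one locale for both calls.
    # expr exits 1 when nothing matches; "|| :" ignores that status so
    # the comparison below still runs (also under "set -e").
    out_colon=$(LC_ALL=C expr "$1" : "$2") || :
    out_match=$(LC_ALL=C expr match "$1" "$2") || :
    [ "$out_colon" = "$out_match" ]
}

id_line='uid=11008(ADILM) gid=1200(cvs),1400(build)'

# Lowercase-only range, then the range with uppercase added.
compare_forms "$id_line" '.*uid=[0-9]*(\(.[0-9a-z]*\)) .*' && echo "same result"
compare_forms "$id_line" '.*uid=[0-9]*(\(.[0-9A-Za-z]*\)) .*' && echo "same result"
```

Running the two forms from one function rules out differences between terminal sessions, which is the one variable the report does not control.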

From unknown Mon Aug 18 19:26:15 2025
From: Eric Blake
Organization: Red Hat
To: Adil Mujeeb
Cc: 5812@debbugs.gnu.org
Subject: bug#5812: expr: Difference in behavior of match and :
Date: Wed, 31 Mar 2010 15:48:56 -0600
Message-ID: <4BB3C348.5060001@redhat.com>
X-GNU-PR-Message: followup 5812

On 03/31/2010 07:05 AM, Adil Mujeeb wrote:
> Hello team,
>
> I tried the following snippet in a bash script:
>
> -bash-3.1$userid=`expr "uid=11008(ADILM) gid=1200(cvs),1400(build)" :
> ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"`
> -bash-3.1$echo $userid
> ADILM
> -bash-3.1$

I cannot repeat your results with 7.6 (the version in fedora 12) or the
latest coreutils.git.

$ expr "uid=11008(ADILM) gid=1200(cvs),1400(build)" : \
  ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"

$ expr "uid=11008(ADILM) gid=1200(cvs),1400(build)" : \
  ".*uid=[0-9]*(\(.[0-9a-zA-Z]*\)) .*"
ADILM

Perhaps you have a locale issue at play?

> -bash-3.1$ rpm -qf /usr/bin/expr
> coreutils-5.97-12.1.el5

That's rather old. Perhaps it is a bug that has been fixed in the
meantime, in which case you would want to upgrade to 8.4. At any rate,
there's nothing in the source code that introduces any case
insensitivity, and the documentation is correct that match and : behave
identically.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

From unknown Mon Aug 18 19:26:15 2025
From: Bob Proulx
To: Adil Mujeeb
Cc: 5812@debbugs.gnu.org
Subject: bug#5812: expr: Difference in behavior of match and :
Date: Sat, 3 Apr 2010 16:33:53 -0600
Message-ID: <20100403223353.GA20406@dementia.proulx.com>
X-GNU-PR-Message: followup 5812

tags 5812 + moreinfo unreproducible
thanks

Adil Mujeeb wrote:
> I tried the following snippet in a bash script:
>
> -bash-3.1$userid=`expr "uid=11008(ADILM) gid=1200(cvs),1400(build)" : ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"`
> -bash-3.1$echo $userid
> ADILM
> -bash-3.1$
>
> To my knowledge it should not be able to extract ADILM, as the regex
> does not include uppercase letters (A-Z).

Thank you for the bug report. It stands out as being exceptionally well
written and covering the needed information.

However, I believe what you are seeing is intended behavior. It is an
effect of the character collation sequence chosen by your locale
setting. What is your locale?

  $ locale

Your sort order depends upon your locale. You didn't say what your
locale was and therefore I assume that you were not aware that it had
an effect. If your locale is set to a dictionary collation sequence
such as en_US.UTF-8 then this is the expected (not necessarily desired
but expected) behavior.

You probably expected a US-ASCII sort ordering, but the powers that be
(in the system, in libc, not in coreutils) have decided that the
collation ordering (sort ordering) for data should be dictionary sort
ordering. In dictionary ordering case is folded together and
punctuation is ignored. By having LANG set to any of the "en*" locales
the system is instructed to use dictionary sort ordering.

This affects almost everything on the system that sorts. This includes
commands such as 'ls' and also your shell (e.g. 'echo *'), plus things
like 'expr'.

The collation sequence of [a-z] in dictionary ordering is really
"aAbBcC...xXyYzZ" and not "abc...z". So when you say "[a-z]" you are
getting "aAbBcC...xXyYz" without 'Z', and when you say "[A-Z]" you are
really getting "AbBcC...xXyYzZ" without 'a'!

Here is what I see with your example:

  $ LC_ALL=C expr "uid=11008(ADILM) gid=1200(cvs),1400(build)" : ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"
  ...no output...

  $ LC_ALL=en_US.UTF-8 expr "uid=11008(ADILM) gid=1200(cvs),1400(build)" : ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"
  ADILM

> In the expr man page it is mentioned that:
> -----8<----------
> match STRING REGEXP
>  same as STRING : REGEXP
> -----8<----------
> So I tried the following snippet:
> -bash-3.1$ userid=`expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[0-9]*(\(.[0-9a-z]*\)) .*"`
> -bash-3.1$ echo $userid
> -bash-3.1$
> I changed the regex and added uppercase letters:
> -bash-3.1$ userid=`expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[0-9]*(\(.[0-9A-Za-z]*\)) .*"`
> -bash-3.1$ echo $userid
> ADILM
> -bash-3.1$
> So it seems that match is not the same as ":". As per my observation,
> ":" uses case-insensitive matching while match is strictly
> case-sensitive.

I cannot reproduce this behavior. But I am impressed that you went
looking for it. :-)

Was this perhaps tested on different machines? Or on a different login
account where different locale settings may have been in effect?

  $ LC_ALL=C expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[0-9]*(\(.[0-9a-z]*\))"
  ...no output...

  $ LC_ALL=en_US.UTF-8 expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[0-9]*(\(.[0-9a-z]*\))"
  ADILM

In addition to setting LC_ALL=C in scripts that need standard behavior,
you may want to use POSIX character classes here. They may help with
situations such as yours.

  $ LC_ALL=C expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[[:digit:]]*(\(.[[:digit:][:upper:]]*\))"
  ADILM

  $ LC_ALL=en_US.UTF-8 expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[[:digit:]]*(\(.[[:digit:][:upper:]]*\))"
  ADILM

  $ LC_ALL=C expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[[:digit:]]*(\(.[[:digit:][:lower:]]*\))"
  ...no output...

  $ LC_ALL=en_US.UTF-8 expr match "uid=11008(ADILM) gid=1200(cvs),1400(build)" ".*uid=[[:digit:]]*(\(.[[:digit:][:lower:]]*\))"
  ...no output...

> Can you update the man page OR let me know if I am doing anything wrong?

This behavior is so global that the problem becomes where to document
it. It shouldn't be documented everywhere. It is a libc behavior and
everything that uses libc (everything!) will get the same behavior.
But 'sort' has taken the full force of it and so you might look there
for the best explanations. The sort documentation says:

     Unless otherwise specified, all comparisons use the character
  collating sequence specified by the `LC_COLLATE' locale.(1)
  ...
     (1) If you use a non-POSIX locale (e.g., by setting `LC_ALL' to
  `en_US'), then `sort' may produce output that is sorted differently
  than you're accustomed to. In that case, set the `LC_ALL' environment
  variable to `C'. Note that setting only `LC_COLLATE' has two
  problems. First, it is ineffective if `LC_ALL' is also set. Second,
  it has undefined behavior if `LC_CTYPE' (or `LANG', if `LC_CTYPE' is
  unset) is set to an incompatible value. For example, you get
  undefined behavior if `LC_CTYPE' is `ja_JP.PCK' but `LC_COLLATE' is
  `en_US.UTF-8'.

Personally I have the following in my $HOME/.bashrc file.

  export LANG=en_US.UTF-8
  export LC_COLLATE=C

That sets most of my locale to a UTF-8 one but forces sorting to be
standard C/POSIX. This probably won't work in the general case since I
have no idea how that would interact with all character sets.

You may want to look at the FAQ.

  http://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021

Notes:

 * You don't need to include a trailing ".*" or " .*" in your pattern.
   It won't affect your match and it will be slightly more efficient
   without it.

 * You don't need to capture the output with backticks and then echo
   it. You can just run the command and display the output.

Thanks again for the very nice bug report!
Bob

From unknown Mon Aug 18 19:26:15 2025
From: Adil Mujeeb
To: Bob Proulx
Cc: 5812@debbugs.gnu.org
Subject: bug#5812: expr: Difference in behavior of match and :
Date: Mon, 5 Apr 2010 10:02:26 +0530
In-Reply-To: <20100403223353.GA20406@dementia.proulx.com>
X-GNU-PR-Message: followup 5812

Thanks Bob for such a nice explanation, and your instinct is right. It
is a locale problem.

-bash-3.1$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
-bash-3.1$

And the other point you made is also right. I didn't realize that I was
using another session, which has a different locale, for comparing the
result with match:

-bash-3.1$
LANG=ja_JP.UTF-8
LC_CTYPE="ja_JP.UTF-8"
LC_NUMERIC="ja_JP.UTF-8"
LC_TIME="ja_JP.UTF-8"
LC_COLLATE="ja_JP.UTF-8"
LC_MONETARY="ja_JP.UTF-8"
LC_MESSAGES="ja_JP.UTF-8"
LC_PAPER="ja_JP.UTF-8"
LC_NAME="ja_JP.UTF-8"
LC_ADDRESS="ja_JP.UTF-8"
LC_TELEPHONE="ja_JP.UTF-8"
LC_MEASUREMENT="ja_JP.UTF-8"
LC_IDENTIFICATION="ja_JP.UTF-8"
LC_ALL=
-bash-3.1$

I never knew that locale has an effect on the behavior. We can close
this bug.

Thank you so much for your time and details, I have learnt a new thing :)

Also, thanks for correcting my regex.

Thanks and Regards,
Adil Mujeeb

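
Following the advice in this thread, an extraction that does not depend on the caller's session can pin the locale for the one expr call and use POSIX character classes instead of ranges. A minimal sketch: `extract_login` is a hypothetical helper name, and the choice of `[[:alnum:]]` is an assumption about which characters a login name may contain. The pattern also drops the trailing " .*", as suggested in the notes above.

```shell
#!/bin/sh
# extract_login ID_LINE  (hypothetical helper)
# Prints the name in parentheses that follows "uid=NNN".
extract_login() {
    # LC_ALL=C overrides LANG and every other LC_* variable, but only
    # for this single command. expr anchors the pattern at the start of
    # the string, so no leading ".*" is needed here.
    LC_ALL=C expr "$1" : 'uid=[[:digit:]]*(\([[:alnum:]]*\))'
}

extract_login 'uid=11008(ADILM) gid=1200(cvs),1400(build)'   # prints ADILM
extract_login 'uid=1000(bob) gid=1000(bob)'                   # prints bob
```

Setting the variable as a command prefix leaves the rest of the script, and the user's interactive locale, untouched.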

From unknown Mon Aug 18 19:26:15 2025
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Adil Mujeeb
Subject: bug#5812 closed by Bob Proulx (Re: bug#5812: expr: Difference in behavior of match and :)
Reply-To: 5812@debbugs.gnu.org
Date: Mon, 05 Apr 2010 04:43:02 +0000
X-Gnu-PR-Message: they-closed 5812
X-Gnu-PR-Package: coreutils

This is an automatic notification regarding your bug report
which was filed against the coreutils package:

#5812: expr: Difference in behavior of match and :

It has been closed by Bob Proulx.

Their explanation is attached below along with your original report.
If this explanation is unsatisfactory and you have not received a
better one in a separate message then please contact Bob Proulx by
replying to this email.

-- 
5812: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=5812
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

From: Bob Proulx
To: Adil Mujeeb
Cc: 5812-done@debbugs.gnu.org
Subject: Re: bug#5812: expr: Difference in behavior of match and :
Date: Sun, 4 Apr 2010 22:42:13 -0600
Message-ID: <20100405044212.GA22010@dementia.proulx.com>

Adil Mujeeb wrote:
> Thanks Bob for such a nice explanation, and your instinct is right. It
> is a locale problem.
> ...
> And the other point you made is also right. I didn't realize that I was
> using another session, which has a different locale, for comparing the
> result with match:

I thought it might have been something like that.

> I never knew that locale has an effect on the behavior. We can close
> this bug.

I will close the bug with this message then.

> Thank you so much for your time and details, I have learnt a new thing :)

I am glad to have helped!

> Also, thanks for correcting my regex.

Sure thing!

Bob


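
The two-session mix-up that resolved this report can be caught early by printing which collation locale a shell would use before comparing results across sessions. The sketch below follows the POSIX precedence (LC_ALL first, then LC_COLLATE, then LANG). `effective_collate` is a hypothetical helper name, and the fallback to "C" when nothing is set is an assumption, since the default locale is implementation-defined.

```shell
#!/bin/sh
# effective_collate  (hypothetical helper)
# Prints the locale name that governs collation, and therefore bracket
# ranges such as [a-z], in the current environment.
effective_collate() {
    if [ -n "${LC_ALL:-}" ]; then
        # LC_ALL, when set and non-empty, overrides everything else.
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_COLLATE:-}" ]; then
        printf '%s\n' "$LC_COLLATE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumption: treat "nothing set" as the C locale.
        printf 'C\n'
    fi
}

echo "collation locale in effect: $(effective_collate)"
```

Running this in each terminal before comparing `expr` output shows at once whether the sessions differ, as the en_US.UTF-8 and ja_JP.UTF-8 sessions in this thread did.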