From debbugs-submit-bounces@debbugs.gnu.org Sat May 23 20:05:46 2015
Message-ID: <556115A2.2020404@tlinx.org>
Date: Sat, 23 May 2015 17:04:50 -0700
From: "L. A. Walsh"
User-Agent: Thunderbird
MIME-Version: 1.0
To: bug-grep@gnu.org
Subject: BUG: standard & extended RE's don't find NUL's :-(
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"

the standard & extended RE's don't find NUL's:

> dd if=/dev/zero of=zeros bs=4k count=1
> command grep -Pq '\000\000' zeros && echo "badness"
badness
> command grep -Eq '\000\000' zeros && echo "badness"
> command grep -Gq '\000\000' zeros && echo "badness"
> command grep -q '\000\000' zeros && echo "badness"
> rpm -q grep
grep-2.20-2.4.1.x86_64

From debbugs-submit-bounces@debbugs.gnu.org Sun May 24 08:59:03 2015
Message-ID: <5561CB0B.9090409@redhat.com>
Date: Sun, 24 May 2015 06:58:51 -0600
From: Eric Blake
Organization: Red Hat, Inc.
To: "L. A. Walsh", 20638@debbugs.gnu.org
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org>
In-Reply-To: <556115A2.2020404@tlinx.org>

On 05/23/2015 06:04 PM, L. A. Walsh wrote:
> the standard & extended RE's don't find NUL's:

Because NULs imply binary data, and grepping binary data has unspecified
results per POSIX. What's more, the NEWS for 2.21 documents that grep
is now taking the liberty of treating NUL as a line terminator when -a
is not in effect, thanks to the behavior being otherwise unspecified by
POSIX.

Try using 'grep -a' to force grep to treat the file as non-binary, in
spite of the NULs.

-- 
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 02:48:19 2015
Message-ID: <5562C5A3.7010301@tlinx.org>
Date: Sun, 24 May 2015 23:48:03 -0700
From: Linda Walsh
To: Eric Blake
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com>
In-Reply-To: <5561CB0B.9090409@redhat.com>
Cc: 20638@debbugs.gnu.org

Eric Blake wrote:
> On 05/23/2015 06:04 PM, L. A. Walsh wrote:
>
>> the standard & extended RE's don't find NUL's:
>>
>
> Because NULs imply binary data,

I can think of multiple cases where at least 1 'nul' would be found in
text data -- the most prime example being that it is a Microsoft Text
file. While MS usually uses a BOM at the beginning of files, since NT's
original format was only LSB/UCS-2, one still runs into the occasional
file -- but just rare enough that I don't have the vim command to change
it in the buffer to a compat format that I waste time looking it up.

But more to the point some unix files were designed to work on file --
not just limited to text -- 'strings' for example.

Right now, it seems grep has lost much in the 'robust' category --
I had one file that it bailed on
saying it has an invalid UTF-8 encoding -- but the line was
recursive starting from '.' -- and it didn't name the file

"-a" doesn't work, BTW:

Ishtar:/tmp> grep -a '\000\000' zeros
Ishtar:/tmp> echo $?
1
Ishtar:/tmp> grep -P '\000\000' zeros Binary file zeros matches

But there it is -- if grep wasn't meant to handle binary files,
it wouldn't know to call 'zeroes' a binary file.
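For comparison, a minimal sketch (in Python rather than grep, so it says nothing about grep's own matching rules) of testing a file for two adjacent NUL bytes, which is what the '\000\000' pattern in the session above was meant to find. The helper name `has_double_nul`, the chunk size and the temporary file paths are illustrative assumptions; the 4096-byte file of zeros mirrors the `dd` command from the original report.

```python
import os
import tempfile


def has_double_nul(path, chunk_size=65536):
    """Return True if the file at `path` contains two adjacent NUL bytes.

    Hypothetical helper for illustration; not part of grep or coreutils.
    """
    prev_tail = b""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return False
            # Prepend the last byte of the previous chunk so that a pair
            # straddling a chunk boundary is still detected.
            if b"\x00\x00" in prev_tail + chunk:
                return True
            prev_tail = chunk[-1:]


with tempfile.TemporaryDirectory() as tmp:
    # Recreate the 4 KiB file of zeros (dd if=/dev/zero bs=4k count=1).
    zeros = os.path.join(tmp, "zeros")
    with open(zeros, "wb") as fh:
        fh.write(b"\x00" * 4096)
    found = has_double_nul(zeros)

    # A file holding the ASCII digit "0" repeated contains no NUL bytes.
    text = os.path.join(tmp, "text")
    with open(text, "wb") as fh:
        fh.write(b"0000\n")
    found_in_text = has_double_nul(text)

print(found, found_in_text)
```

Reading in chunks keeps memory use flat on large files, and carrying over the last byte of each chunk catches a NUL pair that straddles two reads.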
Many of the coreutils have worked equally well on binary as well as txt.
(cat, split, tr, wc to name a few).

But how can 'shuf' claim to work on input lines yet have this allowed:

   -z, --zero-terminated
          line delimiter is NUL, not newline.

'nl' claims the file, 'zeros' (4k of nulls -- created by bash, that can
write a file of zeros, but not read it) is 1 line.

'pr' will print it (though not too well).

'xargs':

> and grepping binary data has unspecified
> results per POSIX. What's more, the NEWS for 2.21 documents that grep
> is now taking the liberty of treating NUL as a line terminator when -a
> is not in effect, thanks to the behavior being otherwise unspecified by
> POSIX.
>
----
With a "-0" switch, I presume (not default behavior -- that
would be ungood :^/ )

> Try using 'grep -a' to force grep to treat the file as non-binary, in
> spite of the NULs.
>
doesn't work -- as mentioned above.

I'd say it's a bug fair and square...

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 11:19:19 2015
Message-ID: <55633D60.10907@cs.ucla.edu>
Date: Mon, 25 May 2015 08:18:56 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Linda Walsh, Eric Blake
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com>
 <5562C5A3.7010301@tlinx.org>
In-Reply-To: <5562C5A3.7010301@tlinx.org>
Cc: 20638@debbugs.gnu.org

Linda Walsh wrote:
> I had one file that it bailed on
> saying it has an invalid UTF-8 encoding -- but the line was
> recursive starting from '.' -- and it didn't name the file

That's pretty vague. Can you reproduce that problem? I don't observe it:

$ mkdir d
$ printf 'a\200\n' >d/f
$ printf 'b\200\n' >d/g
$ grep -r a d
Binary file d/f matches

> "-a" doesn't work, BTW:
>
> Ishtar:/tmp> grep -a '\000\000' zeros
> Ishtar:/tmp> echo $?
> 1

That's the way 'grep' has always behaved. The regular expression '\0'
matches the string "0", not the NUL byte.

> Ishtar:/tmp> grep -P '\000\000' zeros Binary file zeros matches

I don't follow this example; perhaps some text was omitted? Anyway, -P
has always treated files containing zeros as binary files too, ever
since -P has been introduced. It's the same as without -P.

> But there it is -- if grep wasn't meant to handle binary files,
> it wouldn't know to call 'zeroes' a binary file.

Obviously, grep *is* meant to handle binary files; it's documented to
handle them in a particular way.

> how can 'shuf' claim to work on input lines yet have this allowed:
>
> -z, --zero-terminated
> line delimiter is NUL, not newline.

I don't follow this point. -z is a nice feature; we don't want to get
rid of it.

> People argue to dumb down POSIX
> utils, because some corp wants to get a posix label but
> has a few shortcomings -- so they donate enough money and
> posix changes its rules.

I'm afraid you've gone off the deep end here.

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 15:46:43 2015
Message-ID: <55637C0F.9050807@tlinx.org>
Date: Mon, 25 May 2015 12:46:23 -0700
From: Linda Walsh
To: Paul Eggert
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com>
 <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu>
In-Reply-To: <55633D60.10907@cs.ucla.edu>
Cc: 20638@debbugs.gnu.org, Eric Blake

Paul Eggert wrote:
> Linda Walsh wrote:
>
>> I had one file that it bailed on
>> saying it has an invalid UTF-8 encoding -- but the line was
>> recursive starting from '.' -- and it didn't name the file
----
I didn't report that as 'a bug', because when I went back to
reproduce it -- low level physics took over -- i.e. the closer I
looked, the more uncertain the problem became! I did change the
grep * into a for i in *;do echo file;grep file;...but couldn't find
the file that gave the message...Grrr. I will bet it was with the
'-P' option, since the standard Regex in perl complains about such
things and since I was only interested in status (was using -q
_because_ I was searching for a binary pattern -- the '\000\000')
I got the warning but nothing else. If I run into it again, maybe
I can find it w/o looking too closely then that uncertainty principle
won't kick in... ;-)

>
> That's pretty vague. Can you reproduce that problem? I don't observe
> it:
>
> $ mkdir d
> $ printf 'a\200\n' >d/f
> $ printf 'b\200\n' >d/g
> $ grep -r a d
> Binary file d/f matches
>
>> "-a" doesn't work, BTW:
>>
>> Ishtar:/tmp> grep -a '\000\000' zeros
>> Ishtar:/tmp> echo $?
>> 1
>
> That's the way 'grep' has always behaved. The regular expression '\0'
> matches the string "0", not the NUL byte.
>
>> Ishtar:/tmp> grep -P '\000\000' zeros Binary file zeros matches
>
> I don't follow this example; perhaps some text was omitted? Anyway,
> -P has always treated files containing zeros as binary files too, ever
> since -P has been introduced. It's the same as without -P.
>
>> But there it is -- if grep wasn't meant to handle binary files,
>> it wouldn't know to call 'zeroes' a binary file.
>
> Obviously, grep *is* meant to handle binary files; it's documented to
> handle them in a particular way.
---
Nevertheless, it is documented, that '\ddd' or '\xHH' can be used
to match a single character of the value specified.

'\000\000' is found in 'zeroes' (as mentioned in the original
report -- a file filled with 4k of nulls), with the -P switch, but
not the -a switch. That behavior violates the documentation.

>
>> how can 'shuf' claim to work on input lines yet have this allowed:
>>
>> -z, --zero-terminated
>> line delimiter is NUL, not newline.
>
> I don't follow this point. -z is a nice feature; we don't want to get
> rid of it.
----
Nice of you to not read the previous notes. The argument was that
a NUL in a file made it non-text -- therefore it wouldn't be a "line".

>
>> People argue to dumb down POSIX
>> utils, because some corp wants to get a posix label but
>> has a few shortcomings -- so they donate enough money and
>> posix changes its rules.
>
> I'm afraid you've gone off the deep end here.

I didn't bring up POSIX, Eric did. Again, nice of you to jump in
the middle of a conversation and not read the earlier notes... :-)

*Cheers* Paul...(et al).
-linda

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 15:54:47 2015
Message-ID: <55637DFD.1070707@cs.ucla.edu>
Date: Mon, 25 May 2015 12:54:37 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Linda Walsh
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com>
 <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu>
 <55637C0F.9050807@tlinx.org>
In-Reply-To: <55637C0F.9050807@tlinx.org>
Cc: 20638@debbugs.gnu.org, Eric Blake

Linda Walsh wrote:
> it is documented, that '\ddd' or '\xHH' can be used
> to match a single character of the value specified.

I don't see where it's documented to behave that way. Perhaps you're
looking at the wrong documentation?

> The argument was that
> a NUL in a file made it non-text -- therefore it wouldn't be a "line".

Obviously -z changes the definition of a line. -z is explicitly
designed to operate on files containing NUL bytes. So that argument
was not coherent.

>> I'm afraid you've gone off the deep end here.
> I didn't bring up POSIX, Eric did.

Eric's comments didn't incorporate conspiracy theories about corporate
payoffs; yours did.

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 18:22:39 2015
Message-ID: <5563A0A6.8060703@tlinx.org>
Date: Mon, 25 May 2015 15:22:30 -0700
From: Linda Walsh
To: Paul Eggert
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com>
 <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu>
 <55637C0F.9050807@tlinx.org> <55637DFD.1070707@cs.ucla.edu>
In-Reply-To: <55637DFD.1070707@cs.ucla.edu>
Cc: 20638@debbugs.gnu.org, Eric Blake

Paul Eggert wrote:
> Linda Walsh wrote:
>> it is documented, that '\ddd' or '\xHH' can be used
>> to match a single character of the value specified.
>
> I don't see where it's documented to behave that way. Perhaps you're
> looking at the wrong documentation?

Perhaps you want to tell me where the documentation on the
standard and/or extended RE's is that you use?

I think I was referred to a number of different manpages...
it's the first reference under "See Also" at the bottom of the grep
page: awk.

From the awk manpage:

   String Constants
       String constants in AWK are sequences of characters enclosed
       between double quotes (like "value"). Within strings, certain
       escape sequences are recognized, as in C. These are:

       \\   A literal backslash.
       \a   The "alert" character; usually the ASCII BEL character.
       \b   Backspace.
       \f   Form-feed.
       \n   Newline.
       \r   Carriage return.
       \t   Horizontal tab.
       \v   Vertical tab.
       \xhex digits
            The character represented by the string of hexadecimal
            digits following the \x. As in ISO C, all following
            hexadecimal digits are considered part of the escape
            sequence. (This feature should tell us something about
            language design by committee.) E.g., "\x1B" is the ASCII
            ESC (escape) character.
       \ddd The character represented by the 1-, 2-, or 3-digit
            sequence of octal digits. E.g., "\033" is the ASCII ESC
            (escape) character.

>> The argument was that
>> a NUL in a file made it non-text -- therefore it wouldn't be a "line".
>
> Obviously -z changes the definition of a line. -z is explicitly
> designed to operate on files containing NUL bytes. So that argument
> was not coherent.
---
That is my opinion, also, but nevertheless, that '\000' implies
binary was said early in this bug-discussion -- I was refuting that.

The other thing that corrupts some tools is not working well if
there is no terminating LF at the end of a page of text. (i.e. some
editors will alter text-based files by adding an extra LF at the end,
which can cause problems with config files in some cases.)

>
>>> I'm afraid you've gone off the deep end here.
>> I didn't bring up POSIX, Eric did.
>
> Eric's comments didn't incorporate conspiracy theories about corporate
> payoffs; yours did.
---
I am stating facts. The ones who had the most influence on posix
in the past were the largest "gold sponsors". Now, it's fewer of them
and more 'silver'.... but they, historically have had the most
influence on such standards organizations.

I will remind you that POSIX described its initial mission statement as
"descriptive" -- not "prescriptive". That changed ~ 2003 or so when
they started telling implementors what they had to remove to be
posix compliant. The worst violation I can think of is removing the
ability for rm to be used easily and safely to remove everything under
a specific directory: "rm -fr --one-file-system ." -- It might be good
to have a 1 char name for that. For some reason I remember "-x" being
a reasonable choice. "rm" was always described to do a depth-first
traversal, which means it shouldn't even look at top-paths except to
descend into them. That was changed making coreutils rm's that follow
that standard, unreliable for removing dir contents (w/o removing the
dir).

I have good reasons -- not conspiracy, but capitalistic reasons for
what I say, and if you don't believe money and capitalism run this
country, I'd have to say it was you, who had gone off the deep end.

But if you had -- I can probably welcome you -- I think I live in the
deep end...
;-)
linda

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 18:58:58 2015
Message-ID: <5563A928.4050707@cs.ucla.edu>
Date: Mon, 25 May 2015 15:58:48 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Linda Walsh
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com>
 <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu>
 <55637C0F.9050807@tlinx.org> <55637DFD.1070707@cs.ucla.edu>
 <5563A0A6.8060703@tlinx.org>
In-Reply-To: <5563A0A6.8060703@tlinx.org>
Cc: 20638@debbugs.gnu.org, Eric Blake

Linda Walsh wrote:
> Perhaps you want to tell me where the documentation on the
> standard and/or extended RE's is that you use?

We're talking about grep, so the relevant documentation is the grep
manual, not the awk manual or other random stuff you might find on the
Internet. Type 'info grep'. Or if you're in Emacs, type 'C-h i m
grep RET'.

> I have good reasons -- not conspiracy, but capitalistic reasons for
> what I say

Whether you do or not, they're irrelevant to this discussion and to be
honest that tinfoil-hat stuff isn't helping your case.

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 21:19:25 2015
Message-ID: <5563CA15.10905@tlinx.org>
Date: Mon, 25 May 2015 18:19:17 -0700
From: Linda Walsh
To: Paul Eggert
Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com>
 <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu>
 <55637C0F.9050807@tlinx.org> <55637DFD.1070707@cs.ucla.edu>
 <5563A0A6.8060703@tlinx.org> <5563A928.4050707@cs.ucla.edu>
In-Reply-To: <5563A928.4050707@cs.ucla.edu>
Cc: 20638@debbugs.gnu.org, Eric Blake

Paul Eggert wrote:
> Linda Walsh wrote:
>
>> Perhaps you want to tell me where the documentation on the
>> standard and/or extended RE's is that you use?
>
> We're talking about grep, so the relevant documentation is the grep
> manual, not the awk manual or other random stuff you might find on the
> Internet. Type 'info grep'. Or if you're in Emacs, type 'C-h i m
> grep RET'.
----
From the coreutils-5.97 info page:

   Backslash escapes
       A backslash followed by a character not listed below causes an
       error message.

       `\a'   Control-G.
       `\b'   Control-H.
       `\f'   Control-L.
       `\n'   Control-J.
       `\r'   Control-M.
       `\t'   Control-I.
       `\v'   Control-K.
       `\OOO' The character with the value given by OOO, which is 1 to
              3 octal digits,
       `\\'   A backslash.
----
It didn't have 'hex' back then. But you've broken backward
compatibility. That would normally be a regression.

You like to think that I wear a tinfoil hat -- but I just have a good
memory for how grep used to operate. Maybe you should do some memory
strengthening exercises (though I admit my memory isn't what it always
was, it was in this case).

Should I file this as a 2nd bug, that grep broke backward compat?

It *used* to be compatible with 'awk's regex, which is why it is the
first entry in the "See also".

>
> Whether you do or not, they're irrelevant to this discussion and to be
> honest that tinfoil-hat stuff isn't helping your case.
----
no tinfoil hat -- just a good memory, something you might find useful
to work on!
;-)
-linda

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 21:29:45 2015 Received: (at 20638) by debbugs.gnu.org; 26 May 2015 01:29:45 +0000 Received: from localhost ([127.0.0.1]:56024 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yx3ge-0002n9-Sd for submit@debbugs.gnu.org; Mon, 25 May 2015 21:29:45 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:45929) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yx3gb-0002mt-VJ for 20638@debbugs.gnu.org; Mon, 25 May 2015 21:29:42 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 6B36C39E801B; Mon, 25 May 2015 18:29:35 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XOekSdHeohsy; Mon, 25 May 2015 18:29:35 -0700 (PDT) Received: from [192.168.1.9] (pool-100-32-155-148.lsanca.fios.verizon.net [100.32.155.148]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id EC3F439E8017; Mon, 25 May 2015 18:29:34 -0700 (PDT) Message-ID: <5563CC7E.3020002@cs.ucla.edu> Date: Mon, 25 May 2015 18:29:34 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: Linda Walsh Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-( References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com> <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu> <55637C0F.9050807@tlinx.org> <55637DFD.1070707@cs.ucla.edu> <5563A0A6.8060703@tlinx.org> <5563A928.4050707@cs.ucla.edu> <5563CA15.10905@tlinx.org> In-Reply-To: <5563CA15.10905@tlinx.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 20638 Cc: 20638@debbugs.gnu.org, Eric Blake X-BeenThere:
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--)

> From the coreutils-5.97 info page:

Like I said, we're talking about grep, so you need to look at the grep manual. grep is not part of coreutils, so you're barking up the wrong tree again.

> It *used* to be compatible with 'awk's regex

No, that's never been true. It wasn't true even back in the late 1970s, when I first used grep and awk. If you want to blame someone, blame the Bell Labs hackers who wrote them in the first place.

> no tinfoil hat -- just a good memory, something you might find useful to
> work on! ;-)

In this particular case I'm afraid your memory has played tricks on you.

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 22:13:15 2015 Received: (at 20638) by debbugs.gnu.org; 26 May 2015 02:13:15 +0000 Received: from localhost ([127.0.0.1]:56032 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yx4Mk-0003oK-E9 for submit@debbugs.gnu.org; Mon, 25 May 2015 22:13:14 -0400 Received: from ishtar.tlinx.org ([173.164.175.65]:44811 helo=Ishtar.hs.tlinx.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yx4Mg-0003oA-R6 for 20638@debbugs.gnu.org; Mon, 25 May 2015 22:13:11 -0400 Received: from [192.168.4.12] (Athenae [192.168.4.12]) by Ishtar.hs.tlinx.org (8.14.7/8.14.4/SuSE Linux 0.8) with ESMTP id t4Q2D6So038473; Mon, 25 May 2015 19:13:08 -0700 Message-ID: <5563D6B2.3090708@tlinx.org> Date: Mon, 25 May 2015 19:13:06 -0700 From: Linda Walsh User-Agent: Thunderbird MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-( References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com> <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu> <55637C0F.9050807@tlinx.org>
<55637DFD.1070707@cs.ucla.edu> <5563A0A6.8060703@tlinx.org> <5563A928.4050707@cs.ucla.edu> In-Reply-To: <5563A928.4050707@cs.ucla.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 20638 Cc: 20638@debbugs.gnu.org, Eric Blake X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

Paul Eggert wrote:
> Linda Walsh wrote:
>
>> Perhaps you want to tell me where the documentation on the
>> standard and/or extended RE's is that you use?
----
Here is another:

*POSIX Extended Regular Expression Syntax:
(http://www.boost.org/doc/libs/1_43_0/libs/regex/doc/html/boost_regex/syntax/basic_extended.html)

Escapes

The POSIX standard defines no escape sequences for POSIX-Extended regular expressions, except that:

* Any special character preceded by an escape shall match itself.
* The effect of any ordinary character being preceded by an escape is undefined.
* An escape inside a character class declaration shall match itself: in other words the escape character is not "special" inside a character class declaration; so [\^] will match either a literal '\' or a '^'.

However, that's rather restrictive, so the following standard-compatible extensions are also supported by Boost.Regex:

Escapes matching a specific character

The following escape sequences are all synonyms for single characters:

Escape      Character
\a          '\a'
\e          0x1B
\f          \f
\n          \n
\r          \r
\t          \t
\v          \v
\b          \b (but only inside a character class declaration).
\cX         An ASCII escape sequence - the character whose code point is X % 32
\xdd        A hexadecimal escape sequence - matches the single character whose code point is 0xdd.
\x{dddd}    A hexadecimal escape sequence - matches the single character whose code point is 0xdddd.
\0ddd       An octal escape sequence - matches the single character whose code point is 0ddd.
\N{Name}    Matches the single character which has the symbolic name name. For example \\N{newline} matches the single character \n.
*

>
> We're talking about grep, so the relevant documentation is the grep
> manual, not the awk manual or other random stuff you might find on the
> Internet. Type 'info grep'. Or if you're in Emacs, type 'C-h i m
> grep RET'.
-----
Again another example of \000 octal and \x hex.

Most descriptions of the chars grep takes say it was designed so that awk, sed, tr -- any core linux util that takes regexes - to be *the same* so people didn't have to learn a different syntax for each tool.

From debbugs-submit-bounces@debbugs.gnu.org Mon May 25 22:30:21 2015 Received: (at 20638) by debbugs.gnu.org; 26 May 2015 02:30:21 +0000 Received: from localhost ([127.0.0.1]:56038 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yx4dI-0004Dk-3X for submit@debbugs.gnu.org; Mon, 25 May 2015 22:30:20 -0400 Received: from ishtar.tlinx.org ([173.164.175.65]:45191 helo=Ishtar.hs.tlinx.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yx4dF-0004Cq-I4 for 20638@debbugs.gnu.org; Mon, 25 May 2015 22:30:18 -0400 Received: from [192.168.4.12] (Athenae [192.168.4.12]) by Ishtar.hs.tlinx.org (8.14.7/8.14.4/SuSE Linux 0.8) with ESMTP id t4Q2UCkj039618; Mon, 25 May 2015 19:30:15 -0700 Message-ID: <5563DAB5.6080005@tlinx.org> Date: Mon, 25 May 2015 19:30:13 -0700 From: Linda Walsh User-Agent: Thunderbird MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#20638: BUG: standard & extended RE's don't find NUL's :-( References: <556115A2.2020404@tlinx.org> <5561CB0B.9090409@redhat.com> <5562C5A3.7010301@tlinx.org> <55633D60.10907@cs.ucla.edu> <55637C0F.9050807@tlinx.org> <55637DFD.1070707@cs.ucla.edu> <5563A0A6.8060703@tlinx.org> <5563A928.4050707@cs.ucla.edu> <5563CA15.10905@tlinx.org> <5563CC7E.3020002@cs.ucla.edu> In-Reply-To:
<5563CC7E.3020002@cs.ucla.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 20638 Cc: 20638@debbugs.gnu.org, Eric Blake X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

Paul Eggert wrote:
> In this particular case I'm afraid your memory has played tricks on you
---
You may be right ..;-(

I found these:
http://unix.stackexchange.com/questions/19491/how-to-specify-characters-using-hexadecimal-codes-in-grep
http://stackoverflow.com/questions/6319878/using-grep-to-search-for-hex-strings-in-a-file
and two others which pointed to the '-P' option as being the only way in newer grep's..

Needless to say, I am scandalized...

However, one could always ask that those be added so as to be compatible w/sed, awk....etc? I.e. an RFE??

I'm pretty sure the grep didn't have backreferences in it before either. You going to tell me those date back to Bell labs as well?

I.e.-- if you look at earlier info pages, there wasn't a separate regex for grep section (that I could find).... many of the regex-taking utils pointed at each other for more clarification. I thought it was from those that grep had the same notation.

Not exactly a faulty memory, but improbable logic?
From debbugs-submit-bounces@debbugs.gnu.org Sat May 30 16:04:40 2015 Received: (at control) by debbugs.gnu.org; 30 May 2015 20:04:41 +0000 Received: from localhost ([127.0.0.1]:33783 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yymzo-0007X7-Fr for submit@debbugs.gnu.org; Sat, 30 May 2015 16:04:40 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:53645) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yymzm-0007Wi-MZ for control@debbugs.gnu.org; Sat, 30 May 2015 16:04:39 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 3C93E39E801B for ; Sat, 30 May 2015 13:04:33 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9SwSbBss7Jia for ; Sat, 30 May 2015 13:04:32 -0700 (PDT) Received: from [192.168.1.9] (pool-100-32-155-148.lsanca.fios.verizon.net [100.32.155.148]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 4CC9C39E8016 for ; Sat, 30 May 2015 13:04:32 -0700 (PDT) Message-ID: <556A17D0.4000303@cs.ucla.edu> Date: Sat, 30 May 2015 13:04:32 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: control@debbugs.gnu.org Subject: grep bug maintainance Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--)

tag 20605 notabug
close 20605
severity 20657 wishlist
tag 20638 notabug
close 20638
merge 20526 19985 19230
tag 19837 notabug
close 19837
merge 16444 19777
close 19563
close 19486
tag 19330 notabug
close 19330
tag 19193 notabug
close 19193
tag 19071 notabug
close 19071
tag 19005 notabug
close 19005
close 19000
tag 18888 notabug
close 18888

From unknown Mon Jun 23 07:51:58 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sun, 28 Jun 2015 11:24:06 +0000 User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator