From unknown Sat Jun 21 05:15:39 2025 X-Loop: help-debbugs@gnu.org Subject: bug#21989: grep search by ASCII code unsuccessful Resent-From: Shivanshu Goyal Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 23 Nov 2015 07:57:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 21989 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: 21989@debbugs.gnu.org X-Debbugs-Original-To: bug-grep@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.144826537411673 (code B ref -1); Mon, 23 Nov 2015 07:57:02 +0000 Received: (at submit) by debbugs.gnu.org; 23 Nov 2015 07:56:14 +0000 Received: from localhost ([127.0.0.1]:48889 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0lyd-00031S-3d for submit@debbugs.gnu.org; Mon, 23 Nov 2015 02:56:13 -0500 Received: from eggs.gnu.org ([208.118.235.92]:41853) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0jc5-0007R0-5V for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:24:47 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a0jc3-0000F8-VS for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:24:28 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=BAYES_40, FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:38836) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jc3-0000F3-Rp for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:24:27 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51855) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jc2-0004xU-Oq for bug-grep@gnu.org; Mon, 23 Nov 2015 00:24:27 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a0jc1-0000Er-Uh for bug-grep@gnu.org; Mon, 23 Nov 2015 00:24:26 
-0500 Received: from mail-oi0-x22c.google.com ([2607:f8b0:4003:c06::22c]:34018) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jc1-0000Ed-PI for bug-grep@gnu.org; Mon, 23 Nov 2015 00:24:25 -0500 Received: by oies6 with SMTP id s6so112615187oie.1 for ; Sun, 22 Nov 2015 21:24:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=TtA78HbGqw0CtaiAHpabLNtVuztzrPw4qcom2TsWqnc=; b=dGvwjc87J+ykjuS30jxoeeYf6c6cK9yV/Kyy3/v2dvCD2EsgiLDekh2205K7SF9rCB MQBWD62kXgwH401XuBWwm3vKnEb9UuplVk7I0WNDhlJ8hNpjKPgGINLusIZ9On6Jmfcn apdrYMXrNujQhZke1EPKvpCodz3WCTUAJ2TyJs0zdf2RSvvjubpRN5RpILApaKN/faKq kYcas7Rnn0MeCWFdZMzH6UpI+cEkwxhNJEdsvamTyCLTOZxowW5X1nVe6P4309mjJCLn 7X69PIPut9nJ/JO7C1c+/LgkVHnkkpkwzxSQfnvUOxwkNpUdsScgKc2DZEL7uJWIyrK8 Jm5g== X-Received: by 10.60.77.34 with SMTP id p2mr15136070oew.21.1448256264410; Sun, 22 Nov 2015 21:24:24 -0800 (PST) MIME-Version: 1.0 Received: by 10.60.59.193 with HTTP; Sun, 22 Nov 2015 21:24:05 -0800 (PST) From: Shivanshu Goyal Date: Sun, 22 Nov 2015 21:24:05 -0800 Message-ID: Content-Type: multipart/alternative; boundary=047d7b33cac22f1f6805252e70a0 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). 
X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -3.8 (---) X-Mailman-Approved-At: Mon, 23 Nov 2015 02:55:54 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.8 (---) --047d7b33cac22f1f6805252e70a0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi, I think I found a bug which did not exist in version 2.14, but does seem to exist in versions 2.16 and 2.22. I have not tested any other versions. Say there is a file with the following contents: shivanshu@thetis:tmp$ cat temp | xxd 0000000: 68e2 8093 680a h...h. The following is the grep 2.14 command and output: shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' h=E2=80=93h The following is the grep 2.16/2.22 command and output: shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' d1y8@thetis:tmp$ Thanks, Shivanshu Goyal shivanshu.ca --047d7b33cac22f1f6805252e70a0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi,

I think I found a bug which did not exist in version 2.14, but does seem to exist in versions 2.16 and 2.22. I have not tested any other versions.

Say there is a file with the following contents:

shivanshu@thetis:tmp$ cat temp | xxd
0000000: 68e2 8093 680a                           h...h.

The following is the grep 2.14 command and output:

shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93'
h–h

The following is the grep 2.16/2.22 command and output:

shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93'
d1y8@thetis:tmp$

Thanks,
Shivanshu Goyal
shivanshu.ca
--047d7b33cac22f1f6805252e70a0-- From unknown Sat Jun 21 05:15:39 2025 X-Loop: help-debbugs@gnu.org Subject: bug#21989: grep search by ASCII code unsuccessful Resent-From: Stephane Chazelas Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 23 Nov 2015 15:06:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 21989 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Shivanshu Goyal Cc: 21989@debbugs.gnu.org Received: via spool by 21989-submit@debbugs.gnu.org id=B21989.14482911221856 (code B ref 21989); Mon, 23 Nov 2015 15:06:02 +0000 Received: (at 21989) by debbugs.gnu.org; 23 Nov 2015 15:05:22 +0000 Received: from localhost ([127.0.0.1]:49643 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0sgD-0000Tj-5T for submit@debbugs.gnu.org; Mon, 23 Nov 2015 10:05:21 -0500 Received: from mail-wm0-f48.google.com ([74.125.82.48]:38131) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0sg9-0000TZ-58 for 21989@debbugs.gnu.org; Mon, 23 Nov 2015 10:05:17 -0500 Received: by wmec201 with SMTP id c201so109105924wme.1 for <21989@debbugs.gnu.org>; Mon, 23 Nov 2015 07:05:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=hAoS2AGS0T2EHYiBPkYYnYMEgGhWf62xt+UJMOC90cQ=; b=ukoCyzO8jQiCMv+QzrgUQtxFFgYozGAebWwvThXrE8cNrVBrqs/PoWXJFAt7i/y7TL 5mVk8zuYAN2UJrnQzt3iVxVPvVqsUW7LS+FF0lcv9ZcOHqLtBAGdpyNNMeIF9NSjBMVo FAL4u1hniBqInttqxuyKjHnl7MENMOcPj/3zUAl8WBOb/YVdT5nI6AiOAgWB1lbsvNW1 EF4ms555MQW9MdxCvYsw7LWzz46McyniWcDtW1ZI878o5mB8DjbXRZ2FCBSK4nb0I5A8 grloa+OMPqMMc/nG2JQzgdmiWmzZn2CW4zEyjueGzVe+dDbVGNZOyOcm6WuQ+Ej0ZOkf XhaA== X-Received: by 10.194.84.4 with SMTP id u4mr36939331wjy.149.1448291116575; Mon, 23 Nov 2015 07:05:16 -0800 (PST) Received: from chaz.gmail.com ([2.121.21.200]) by 
smtp.gmail.com with ESMTPSA id bh6sm2691118wjb.0.2015.11.23.07.05.15 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 23 Nov 2015 07:05:15 -0800 (PST) Date: Mon, 23 Nov 2015 15:05:14 +0000 From: Stephane Chazelas Message-ID: <20151123150514.GB18811@chaz.gmail.com> References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) 2015-11-22 21:24:05 -0800, Shivanshu Goyal: [...] > I think I found a bug which did not exist in version 2.14, but does seem to > exist in versions 2.16 and 2.22. I have not tested any other versions. > > Say there is a file with the following contents: > > shivanshu@thetis:tmp$ cat temp | xxd > 0000000: 68e2 8093 680a h...h. > > The following is the grep 2.14 command and output: > > shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' > h–h > > The following is the grep 2.16/2.22 command and output: > > shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' > d1y8@thetis:tmp$ [...] If you read the pcrepattern man page, you'll see that \xe2 doesn't match the byte e2, but the character of code e2. If you're in a UTF-8 locale, \xe2 would match the character of Unicode code point e2 (LATIN SMALL LETTER A WITH CIRCUMFLEX) which in UTF-8 is written as the bytes c3 a2. The sequence e2 80 93 is actually the one character U+2013 (EN DASH). 
So, here, you either want:

LC_ALL=C grep -P '\xe2\x80\x93'

That is use a locale where characters are single-byte and their
code is the byte value, or assuming the current locale is UTF-8,
use:

grep -P '\x{2013}'

Or, regardless of the locale:

grep -P '(*UTF8)\x{2013}'

-- 
Stephane

From unknown Sat Jun 21 05:15:39 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.503 (Entity 5.503)
X-Loop: help-debbugs@gnu.org
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Shivanshu Goyal
Subject: bug#21989: closed (Re: bug#21989: grep search by ASCII code unsuccessful)
Message-ID:
References: <56533BE0.9070706@cs.ucla.edu>
X-Gnu-PR-Message: they-closed 21989
X-Gnu-PR-Package: grep
Reply-To: 21989@debbugs.gnu.org
Date: Mon, 23 Nov 2015 16:17:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1448295422-15803-1"

This is a multi-part message in MIME format...

------------=_1448295422-15803-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Your bug report #21989: grep search by ASCII code unsuccessful
which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 21989@debbugs.gnu.org.
--=20 21989: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D21989 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1448295422-15803-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 21989-done) by debbugs.gnu.org; 23 Nov 2015 16:16:39 +0000 Received: from localhost ([127.0.0.1]:49673 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0tnC-000468-H1 for submit@debbugs.gnu.org; Mon, 23 Nov 2015 11:16:38 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:38541) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0tnA-00045y-U6 for 21989-done@debbugs.gnu.org; Mon, 23 Nov 2015 11:16:37 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id CC6C91605AF; Mon, 23 Nov 2015 08:16:35 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id BLnHDqM0zQh1; Mon, 23 Nov 2015 08:16:35 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 3725A160779; Mon, 23 Nov 2015 08:16:35 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id x1eY0KpkZowT; Mon, 23 Nov 2015 08:16:35 -0800 (PST) Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 1D0571605AF; Mon, 23 Nov 2015 08:16:35 -0800 (PST) Subject: Re: bug#21989: grep search by ASCII code unsuccessful To: Stephane Chazelas , Shivanshu Goyal References: <20151123150514.GB18811@chaz.gmail.com> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <56533BE0.9070706@cs.ucla.edu> Date: Mon, 23 Nov 2015 08:16:32 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 
Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <20151123150514.GB18811@chaz.gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.6 (/) X-Debbugs-Envelope-To: 21989-done Cc: 21989-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) Thanks, Stephane, for diagnosing the problem. Closing the bug. ------------=_1448295422-15803-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 23 Nov 2015 07:56:14 +0000 Received: from localhost ([127.0.0.1]:48889 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0lyd-00031S-3d for submit@debbugs.gnu.org; Mon, 23 Nov 2015 02:56:13 -0500 Received: from eggs.gnu.org ([208.118.235.92]:41853) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0jc5-0007R0-5V for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:24:47 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a0jc3-0000F8-VS for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:24:28 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=BAYES_40, FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:38836) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jc3-0000F3-Rp for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:24:27 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51855) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jc2-0004xU-Oq for bug-grep@gnu.org; Mon, 23 Nov 2015 00:24:27 -0500 Received: from 
Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a0jc1-0000Er-Uh for bug-grep@gnu.org; Mon, 23 Nov 2015 00:24:26 -0500 Received: from mail-oi0-x22c.google.com ([2607:f8b0:4003:c06::22c]:34018) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jc1-0000Ed-PI for bug-grep@gnu.org; Mon, 23 Nov 2015 00:24:25 -0500 Received: by oies6 with SMTP id s6so112615187oie.1 for ; Sun, 22 Nov 2015 21:24:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=TtA78HbGqw0CtaiAHpabLNtVuztzrPw4qcom2TsWqnc=; b=dGvwjc87J+ykjuS30jxoeeYf6c6cK9yV/Kyy3/v2dvCD2EsgiLDekh2205K7SF9rCB MQBWD62kXgwH401XuBWwm3vKnEb9UuplVk7I0WNDhlJ8hNpjKPgGINLusIZ9On6Jmfcn apdrYMXrNujQhZke1EPKvpCodz3WCTUAJ2TyJs0zdf2RSvvjubpRN5RpILApaKN/faKq kYcas7Rnn0MeCWFdZMzH6UpI+cEkwxhNJEdsvamTyCLTOZxowW5X1nVe6P4309mjJCLn 7X69PIPut9nJ/JO7C1c+/LgkVHnkkpkwzxSQfnvUOxwkNpUdsScgKc2DZEL7uJWIyrK8 Jm5g== X-Received: by 10.60.77.34 with SMTP id p2mr15136070oew.21.1448256264410; Sun, 22 Nov 2015 21:24:24 -0800 (PST) MIME-Version: 1.0 Received: by 10.60.59.193 with HTTP; Sun, 22 Nov 2015 21:24:05 -0800 (PST) From: Shivanshu Goyal Date: Sun, 22 Nov 2015 21:24:05 -0800 Message-ID: Subject: grep search by ASCII code unsuccessful To: bug-grep@gnu.org Content-Type: multipart/alternative; boundary=047d7b33cac22f1f6805252e70a0 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). 
X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -3.8 (---) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Mon, 23 Nov 2015 02:55:54 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.8 (---) --047d7b33cac22f1f6805252e70a0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi, I think I found a bug which did not exist in version 2.14, but does seem to exist in versions 2.16 and 2.22. I have not tested any other versions. Say there is a file with the following contents: shivanshu@thetis:tmp$ cat temp | xxd 0000000: 68e2 8093 680a h...h. The following is the grep 2.14 command and output: shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' h=E2=80=93h The following is the grep 2.16/2.22 command and output: shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' d1y8@thetis:tmp$ Thanks, Shivanshu Goyal shivanshu.ca --047d7b33cac22f1f6805252e70a0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--047d7b33cac22f1f6805252e70a0-- ------------=_1448295422-15803-1-- From unknown Sat Jun 21 05:15:39 2025 X-Loop: help-debbugs@gnu.org Subject: bug#21989: grep search by ASCII code unsuccessful Resent-From: Shivanshu Goyal Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 23 Nov 2015 16:45:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 21989 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: 21989@debbugs.gnu.org X-Debbugs-Original-To: bug-grep@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.144829705218834 (code B ref -1); Mon, 23 Nov 2015 16:45:03 +0000 Received: (at submit) by debbugs.gnu.org; 23 Nov 2015 16:44:12 +0000 Received: from localhost ([127.0.0.1]:49704 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0uDs-0004tc-Ae for submit@debbugs.gnu.org; Mon, 23 Nov 2015 11:44:12 -0500 Received: from eggs.gnu.org ([208.118.235.92]:43019) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1a0jj2-0007dA-DC for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:31:40 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a0jj0-0001i2-U0 for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:31:39 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: * X-Spam-Status: No, score=1.1 required=5.0 tests=BAYES_50, FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:57403) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jj0-0001hy-QQ for submit@debbugs.gnu.org; Mon, 23 Nov 2015 00:31:38 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53024) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jiz-0006mK-PK for bug-grep@gnu.org; Mon, 23 Nov 2015 00:31:38 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) 
id 1a0jiy-0001hb-OL for bug-grep@gnu.org; Mon, 23 Nov 2015 00:31:37 -0500 Received: from mail-oi0-x22a.google.com ([2607:f8b0:4003:c06::22a]:33591) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a0jiy-0001hX-JW for bug-grep@gnu.org; Mon, 23 Nov 2015 00:31:36 -0500 Received: by oixx65 with SMTP id x65so111483736oix.0 for ; Sun, 22 Nov 2015 21:31:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :content-type; bh=bTgkVdf0ZF7j2VY6WI8X3kbtc7/LtXCutSG2Ky8hbio=; b=UJRBmiNWFkVlZc3dNY163fPPeJJVr4yaD+K032qiDs5W6LjzohqyM9CmimA+09pB3e R15dakcr1VTlS6mVQf9O+tLdI0Ftst+LFNUKeWCs8EyRgJUYxxRMgeBA0HBHjlYPhCBm 4wWj+XCW9/lodHw4d1ts64IUX1EbEqQID7YyEC0djcC6Nx5etwkZk6mONafj00YQgiSE SwEoG9ZQVuok4nPC8IUOHLUEqnxCveyr/9Qm4MAlwIxzhPZ2EGObJpGeNfNAaVSmzTL3 7vgIcoFq99xOluZrinhkRG8EZzK+AtqB1Ydd+8hbfxBrmUK8rIOgtG4astvPZKCmPNxV mC4g== X-Received: by 10.60.65.6 with SMTP id t6mr15307894oes.47.1448256696188; Sun, 22 Nov 2015 21:31:36 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Shivanshu Goyal Date: Mon, 23 Nov 2015 05:31:26 +0000 Message-ID: Content-Type: multipart/alternative; boundary=001a11c1a328eb899d05252e890c X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). 
X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -3.8 (---) X-Mailman-Approved-At: Mon, 23 Nov 2015 11:44:02 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.8 (---) --001a11c1a328eb899d05252e890c Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Correction: The following is the grep 2.16/2.22 command and output: (It doesn't output anything) shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' shivanshu@thetis:tmp$ On Sun, Nov 22, 2015 at 9:24 PM Shivanshu Goyal wrote: > Hi, > > I think I found a bug which did not exist in version 2.14, but does seem > to exist in versions 2.16 and 2.22. I have not tested any other versions. > > Say there is a file with the following contents: > > shivanshu@thetis:tmp$ cat temp | xxd > 0000000: 68e2 8093 680a h...h. > > The following is the grep 2.14 command and output: > > shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' > h=E2=80=93h > > The following is the grep 2.16/2.22 command and output: > > shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93' > d1y8@thetis:tmp$ > > Thanks, > Shivanshu Goyal > shivanshu.ca > --001a11c1a328eb899d05252e890c Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Correction:

The following is the grep 2.16/2.22 command and output:
(It doesn't output anything)

shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93'
shivanshu@thetis:tmp$

On Sun, Nov 22, 2015 at 9:24 PM Shivanshu Goyal <shivanshu3@gmail.com> wrote:
> Hi,
>
> I think I found a bug which did not exist in version 2.14, but does seem to exist in versions 2.16 and 2.22. I have not tested any other versions.
>
> Say there is a file with the following contents:
>
> shivanshu@thetis:tmp$ cat temp | xxd
> 0000000: 68e2 8093 680a                           h...h.
>
> The following is the grep 2.14 command and output:
>
> shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93'
> h–h
>
> The following is the grep 2.16/2.22 command and output:
>
> shivanshu@thetis:tmp$ cat temp | grep -P '\xe2\x80\x93'
> d1y8@thetis:tmp$
>
> Thanks,
> Shivanshu Goyal
--001a11c1a328eb899d05252e890c--
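
Stephane's diagnosis earlier in the thread (a \xHH escape names a character code, not a byte, once the matcher works on characters) can be illustrated outside grep. The following is only a rough sketch in Python, whose re module is not PCRE and is used here purely as an analogy: a bytes pattern stands in for the LC_ALL=C case and a str pattern stands in for the UTF-8 locale case. The variable names are made up for this illustration.

```python
import re

# The six bytes shown by xxd in the report: 'h', e2 80 93, 'h', newline.
data = bytes.fromhex("68e28093680a")

# Decoded as UTF-8, the middle three bytes form one character, U+2013 (EN DASH).
text = data.decode("utf-8")

# Code point U+00E2 (LATIN SMALL LETTER A WITH CIRCUMFLEX) is a different
# character; its UTF-8 encoding is the two bytes c3 a2.
a_circumflex_bytes = "\u00e2".encode("utf-8")

# Byte-wise matching (analogy for LC_ALL=C): each \xHH escape is one byte.
byte_match = re.search(rb"\xe2\x80\x93", data)

# Character-wise matching (analogy for a UTF-8 locale): each \xHH escape is a
# code point, so this asks for U+00E2 U+0080 U+0093, which the text lacks.
char_match_wrong = re.search(r"\xe2\x80\x93", text)

# Character-wise matching on the intended code point.
char_match_right = re.search(r"\u2013", text)

print(a_circumflex_bytes.hex())        # c3a2
print(byte_match is not None)          # True
print(char_match_wrong is not None)    # False
print(char_match_right is not None)    # True
```

The second result mirrors the empty output reported for grep 2.16/2.22, and the first and third mirror the LC_ALL=C and \x{2013} suggestions respectively. How a particular grep build behaves still depends on its PCRE support and the active locale.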