From unknown Sun Jun 22 04:30:13 2025 X-Loop: help-debbugs@gnu.org Subject: bug#13947: bug report for core-utils command : OD Resent-From: Marc Grondin Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 13 Mar 2013 20:25:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 13947 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: 13947@debbugs.gnu.org Cc: Mark.Jaeger@oracle.com X-Debbugs-Original-To: Received: via spool by submit@debbugs.gnu.org id=B.13632062923625 (code B ref -1); Wed, 13 Mar 2013 20:25:02 +0000 Received: (at submit) by debbugs.gnu.org; 13 Mar 2013 20:24:52 +0000 Received: from localhost ([127.0.0.1]:51931 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UFsEF-0000wP-Ty for submit@debbugs.gnu.org; Wed, 13 Mar 2013 16:24:52 -0400 Received: from eggs.gnu.org ([208.118.235.92]:55723) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UFs7J-0000lV-5a for submit@debbugs.gnu.org; Wed, 13 Mar 2013 16:17:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UFs66-00044e-8s for submit@debbugs.gnu.org; Wed, 13 Mar 2013 16:16:28 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-6.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD autolearn=unavailable version=3.3.2 Received: from lists.gnu.org ([208.118.235.17]:35826) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UFs66-00044M-6J for submit@debbugs.gnu.org; Wed, 13 Mar 2013 16:16:26 -0400 Received: from eggs.gnu.org ([208.118.235.92]:38837) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UFs63-0005pl-4V for bug-coreutils@gnu.org; Wed, 13 Mar 2013 16:16:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UFs60-00042g-97 for bug-coreutils@gnu.org; Wed, 13 Mar 2013 
16:16:23 -0400 Received: from userp1040.oracle.com ([156.151.31.81]:18458) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UFs60-00041y-1q for bug-coreutils@gnu.org; Wed, 13 Mar 2013 16:16:20 -0400 Received: from ucsinet22.oracle.com (ucsinet22.oracle.com [156.151.31.94]) by userp1040.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with ESMTP id r2DKGHD6024774 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Wed, 13 Mar 2013 20:16:18 GMT Received: from acsmt358.oracle.com (acsmt358.oracle.com [141.146.40.158]) by ucsinet22.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r2DKGG6G018625 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 13 Mar 2013 20:16:17 GMT Received: from abhmt111.oracle.com (abhmt111.oracle.com [141.146.116.63]) by acsmt358.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id r2DKGG8T007861 for ; Wed, 13 Mar 2013 15:16:16 -0500 MIME-Version: 1.0 Message-ID: Date: Wed, 13 Mar 2013 13:16:16 -0700 (PDT) From: Marc Grondin X-Mailer: Zimbra on Oracle Beehive Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline X-Source-IP: ucsinet22.oracle.com [156.151.31.94] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 208.118.235.17 X-Spam-Score: -3.5 (---) X-Mailman-Approved-At: Wed, 13 Mar 2013 16:24:50 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.2 (------) Good Afternoon,=20 My client was attempting to run the command : od -c on this xml file (sampl= e only)=20 ---------------------------------------------------------------------------= --- =E4=B8=B8 =E4=B8=B8 =F0=A0=84=8C ? ? 
?=E4=B8=B8 ??=E4=B8=B8 ---------------------------------------------------------------------------= --- note : this system is a : 2.6.18-164.0.0.0.1.el5xen #1 SMP Thu Sep 3 00:34:= 43 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux He was getting this output :=20 ---------------------------------------------------------------------------= --- 0000000 < ? x m l v e r s i o n =3D =20 0000020 ' 1 . 0 ' e n c o d i n g =3D 0000040 ' U T F - 8 ' ? > \n < t o p > 0000060 \n < x > =EF=BF=BD =EF=BF=BD =EF=BF=BD <= / x > \n =20 0000100 < y > =EF=BF=BD =EF=BF=BD =EF=BF=BD 201 < /= y > \n =20 0000120 < z > =EF=BF=BD =EF=BF=BD 204 214 < / z > \n= =20 0000140 < x > ? < / x > \n < x > ? 0000160 < / x > \n < x > ? =EF=BF=BD =EF= =BF=BD =EF=BF=BD 201 0000200 < / x > \n < x > ? ? =EF=BF=BD = =EF=BF=BD =EF=BF=BD 0000220 201 < / x > \n < / t o p > \n ---------------------------------------------------------------------------= --- Instead of this :=20 ---------------------------------------------------------------------------= --- 000000 < ? x m l v e r s i o n =3D =20 0000020 ' 1 . 0 ' e n c o d i n g =3D 0000040 ' U T F - 8 ' ? > \n < t o p > 0000060 \n < x > 344 270 270 < / x > \n =20 0000100 < y > 360 257 240 201 < / y > \n =20 0000120 < z > 360 240 204 214 < / z > \n =20 0000140 < x > ? < / x > \n < x > ? 0000160 < / x > \n < x > ? 360 257 240 201 0000200 < / x > \n < x > ? ? 360 257 240 0000220 201 < / x > \n < / t o p > \n 0000235 ---------------------------------------------------------------------------= --- This all based on the LANG env. 
He was using :=20 LANG=3Den_US.iso88591, instead of LANG=3Den_US.UTF-8=20 ---------------------------------------------------------------------------= --- Question :=20 Since the output is based on the ASCII character set, should it not, in bot= h cases give a numerical output (as it did in scenario #2)=20 for a symbol outside the ascii/extended-ascii character set ?=20 ---------------------------------------------------------------------------= --- Regards,=20 Marc Grondin,=20 __________________________________ Oracle - Quebec city, Qc. Senior System Administrator, PDIT --------------------------------- 400-330 St-Vallier, G1K 9C5 418.524.5665 # 1256 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D From unknown Sun Jun 22 04:30:13 2025 X-Loop: help-debbugs@gnu.org Subject: bug#13947: bug report for core-utils command : OD Resent-From: Eric Blake Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 13 Mar 2013 21:36:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 13947 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Marc Grondin Cc: Mark.Jaeger@oracle.com, 13947@debbugs.gnu.org Received: via spool by 13947-submit@debbugs.gnu.org id=B13947.136321053510330 (code B ref 13947); Wed, 13 Mar 2013 21:36:02 +0000 Received: (at 13947) by debbugs.gnu.org; 13 Mar 2013 21:35:35 +0000 Received: from localhost ([127.0.0.1]:52010 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UFtKf-0002gX-1S for submit@debbugs.gnu.org; Wed, 13 Mar 2013 17:35:34 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53387) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UFtKa-0002gH-MJ for 13947@debbugs.gnu.org; Wed, 13 Mar 2013 17:35:32 -0400 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by mx1.redhat.com (8.14.4/8.14.4) with 
ESMTP id r2DLYFAg006766 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 13 Mar 2013 17:34:15 -0400 Received: from [10.3.113.40] (ovpn-113-40.phx2.redhat.com [10.3.113.40]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r2DLYEkv026633; Wed, 13 Mar 2013 17:34:14 -0400 Message-ID: <5140F0D6.6070103@redhat.com> Date: Wed, 13 Mar 2013 15:34:14 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130219 Thunderbird/17.0.3 MIME-Version: 1.0 References: In-Reply-To: X-Enigmail-Version: 1.5.1 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="----enig2VQVWRHJQTIJOBIQBAQVQ" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23 X-Spam-Score: -7.4 (-------) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -9.3 (---------) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) ------enig2VQVWRHJQTIJOBIQBAQVQ Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 03/13/2013 02:16 PM, Marc Grondin wrote: > Good Afternoon,=20 Hello, and thanks for the report. >=20 > My client was attempting to run the command : od -c on this xml file (s= ample only)=20 > -----------------------------------------------------------------------= ------- > > > =E4=B8=B8 Here, you are representing a character in UTF-8 > He was getting this output :=20 > -----------------------------------------------------------------------= ------- > 0000000 < ? x m l v e r s i o n =3D = =20 > 0000020 ' 1 . 0 ' e n c o d i n g =3D= > 0000040 ' U T F - 8 ' ? 
> \n < t o p >= > 0000060 \n < x > =EF=BF=BD =EF=BF=BD =EF=BF=BD= < / x > \n =20 and here, you were running od in a different character set: > This all based on the LANG env. He was using :=20 > LANG=3Den_US.iso88591, instead of > LANG=3Den_US.UTF-8=20 In ISO-88591, every byte is a character, and those particular bytes happen to be printable, so od was faithfully replaying the character as printable, only to then be shown by your UTF-8 terminal as an invalid UTF-8 sequence. Mismatching character sets between your program and your terminal is always a recipe for confusion. However, you HAVE identified a bug, in our documentation. >=20 > -----------------------------------------------------------------------= ------- >=20 > Question :=20 > Since the output is based on the ASCII character set, should it not, in= both cases give a numerical output (as it did in scenario #2)=20 > for a symbol outside the ascii/extended-ascii character set ?=20 Our documentation is lying. Here's what POSIX says about od -c: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/od.html "Interpret bytes as characters specified by the current setting of the LC_CTYPE category. Certain non-graphic characters appear as C escapes: "NUL=3D\0" , "BS=3D\b" , "FF=3D\f" , "NL=3D\n" , "CR=3D\r" , "HT=3D\t" ; = others appear as 3-digit octal numbers." Nothing in there restricts the output to ASCII only. The bytes that are showing up as =EF=BF=BD are graphic characters in your current choice of LC_CTYPE, so there is no escaping performed (since escaping is permitted only on non-graphic characters). If your terminal was using the same character set as you ran od under, you would see proper graphical characters in the ISO-88591 set (but then again, you wouldn't see the nice =E4=B8=B8 character that the UTF-8 was representing). Coreutils is properly obeying the locale, what is wrong is the info documentation which stated: `-c' Output as ASCII characters or backslash escapes. 
In reality, that should state something like:

Output as characters in the current locale, using octal sequences
or backslash escapes for all non-graphic bytes.

Meanwhile, if you want to guarantee ASCII-only output from od, you have
to use a different format, such as -b or -tx1, or use LC_ALL=C on a
system where the C locale does not treat non-ascii bytes as graphical
characters (most glibc systems, including the one you are using, fit
this bill).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

From unknown Sun Jun 22 04:30:13 2025
From: Pádraig Brady
To: Eric Blake
Cc: Mark.Jaeger@oracle.com, Marc Grondin, 13947@debbugs.gnu.org
Subject: bug#13947: bug report for core-utils command : OD
Date: Wed, 13 Mar 2013 21:53:39 +0000
Message-ID: <5140F563.1010906@draigBrady.com>
In-Reply-To: <5140F0D6.6070103@redhat.com>

On 03/13/2013 09:34 PM, Eric Blake
wrote: > On 03/13/2013 02:16 PM, Marc Grondin wrote: >> Good Afternoon,=20 >=20 > Hello, and thanks for the report. >=20 >> >> My client was attempting to run the command : od -c on this xml file (= sample only)=20 >> ----------------------------------------------------------------------= -------- >> >> >> =E4=B8=B8 >=20 > Here, you are representing a character in UTF-8 >=20 >> He was getting this output :=20 >> ----------------------------------------------------------------------= -------- >> 0000000 < ? x m l v e r s i o n =3D = =20 >> 0000020 ' 1 . 0 ' e n c o d i n g = =3D >> 0000040 ' U T F - 8 ' ? > \n < t o p = > >> 0000060 \n < x > =EF=BF=BD =EF=BF=BD =EF=BF=BD= < / x > \n =20 >=20 > and here, you were running od in a different character set: >=20 >> This all based on the LANG env. He was using :=20 >> LANG=3Den_US.iso88591, instead of >> LANG=3Den_US.UTF-8=20 >=20 > In ISO-88591, every byte is a character, and those particular bytes > happen to be printable, so od was faithfully replaying the character as > printable, only to then be shown by your UTF-8 terminal as an invalid > UTF-8 sequence. Mismatching character sets between your program and > your terminal is always a recipe for confusion. >=20 > However, you HAVE identified a bug, in our documentation. >=20 >> >> ----------------------------------------------------------------------= -------- >> >> Question :=20 >> Since the output is based on the ASCII character set, should it not, i= n both cases give a numerical output (as it did in scenario #2)=20 >> for a symbol outside the ascii/extended-ascii character set ?=20 >=20 > Our documentation is lying. Here's what POSIX says about od -c: >=20 > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/od.html > "Interpret bytes as characters specified by the current setting of the > LC_CTYPE category. 
Certain non-graphic characters appear as C escapes: > "NUL=3D\0" , "BS=3D\b" , "FF=3D\f" , "NL=3D\n" , "CR=3D\r" , "HT=3D\t" = ; others > appear as 3-digit octal numbers." >=20 > Nothing in there restricts the output to ASCII only. The bytes that ar= e > showing up as =EF=BF=BD are graphic characters in your current choice o= f > LC_CTYPE, so there is no escaping performed (since escaping is permitte= d > only on non-graphic characters). If your terminal was using the same > character set as you ran od under, you would see proper graphical > characters in the ISO-88591 set (but then again, you wouldn't see the > nice =E4=B8=B8 character that the UTF-8 was representing). >=20 > Coreutils is properly obeying the locale, what is wrong is the info > documentation which stated: >=20 > `-c' > Output as ASCII characters or backslash escapes. I agree. Thanks for the detailed description. > In reality, that should state something like: > Output as characters in the current locale, using octal sequences > or backslash escapes for all non-graphic bytes. Note we output spaces, so I'd s/non-graphic/non-printable/. Also multi byte is always going to be problematic displaying in a grid like this, so we'll probably continue to do as we do now for the utf8 example above and output octal and dots. So therefore s/characters/single byte characters/. >=20 > Meanwhile, if you want to guarantee ASCII-only output from od, you have > to use a different format, such as -b or -tx1, or use LC_ALL=3DC on a > system where the C locale does not treat non-ascii bytes as graphical > characters (most glibc systems, including the one you are using, fit > this bill). >=20 cheers, P=C3=A1draig. 
From unknown Sun Jun 22 04:30:13 2025
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Marc Grondin
Subject: bug#13947: closed (Re: bug#13947: bug report for core-utils command : OD)
Reply-To: 13947@debbugs.gnu.org
Date: Fri, 22 Mar 2013 15:48:03 +0000
References: <514C7CAD.8010403@draigBrady.com>

Your bug report

#13947: bug report for core-utils command : OD

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 13947@debbugs.gnu.org.

-- 
13947: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=13947
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

From: Pádraig Brady
To: Eric Blake
Cc: Mark.Jaeger@oracle.com, Marc Grondin, 13947-done@debbugs.gnu.org
Subject: Re: bug#13947: bug report for core-utils command : OD
Date: Fri, 22 Mar 2013 15:45:49 +0000
Message-ID: <514C7CAD.8010403@draigBrady.com>
In-Reply-To: <5140F563.1010906@draigBrady.com>

On 03/13/2013 09:53 PM, Pádraig Brady wrote:
> On 03/13/2013 09:34 PM, Eric Blake wrote:
>> In reality, that should state something like:
>
>> Output as characters in the current locale, using octal sequences
>> or backslash escapes for all non-graphic bytes.
>
> Note we output spaces, so I'd s/non-graphic/non-printable/.
>
> Also multi byte is always going to be problematic displaying
> in a grid like this, so we'll probably continue to do as
> we do now for the utf8 example above and output octal and dots.
> So therefore s/characters/single byte characters/.

Hopefully the attached clarifies things.

thanks,
Pádraig.

Content-Disposition: attachment; filename="od-printable.patch"

>From 1f9a6a48bbc05599092cb5da2429ab3ccfe87631 Mon Sep 17 00:00:00 2001
From: Pádraig Brady
Date: Fri, 22 Mar 2013 15:33:57 +0000
Subject: [PATCH] doc: clarify the printable characters output by od

* src/od.c (usage): Mention any printable character is output,
Not just ASCII.
* doc/coreutils.texi (od invocation): Further clarify that only
single byte characters are output (due to the alignment requirement).
Reported in http://bugs.gnu.org/13947 --- doc/coreutils.texi | 6 +++--- src/od.c | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 8f1df45..d2f3b21 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -1908,14 +1908,14 @@ of each output line using each of the data types that you specified, in the order that you specified. Adding a trailing ``z'' to any type specification appends a display -of the ASCII character representation of the printable characters +of the single byte character representation of the printable characters to the output line generated by the type specification. @table @samp @item a named character, ignoring high-order bit @item c -ASCII character or backslash escape, +printable single byte character or backslash escape, @item d signed decimal @item f @@ -2003,7 +2003,7 @@ Output as octal bytes. Equivalent to @samp{-t o1}. @item -c @opindex -c -Output as ASCII characters or backslash escapes. Equivalent to +Output as printable single byte characters or backslash escapes. Equivalent to @samp{-t c}. @item -d diff --git a/src/od.c b/src/od.c index e7d881b..e8cab46 100644 --- a/src/od.c +++ b/src/od.c @@ -339,7 +339,7 @@ suffixes may be . 
for octal and b for multiply by 512.\n\ Traditional format specifications may be intermixed; they accumulate:\n\ -a same as -t a, select named characters, ignoring high-order bit\n\ -b same as -t o1, select octal bytes\n\ - -c same as -t c, select ASCII characters or backslash escapes\n\ + -c same as -t c, select printable characters or backslash escapes\n\ -d same as -t u2, select unsigned decimal 2-byte units\n\ "), stdout); fputs (_("\ @@ -355,7 +355,7 @@ Traditional format specifications may be intermixed; they accumulate:\n\ \n\ TYPE is made up of one or more of these specifications:\n\ a named character, ignoring high-order bit\n\ - c ASCII character or backslash escape\n\ + c printable character or backslash escape\n\ "), stdout); fputs (_("\ d[SIZE] signed decimal, SIZE bytes per integer\n\ -- 1.7.7.6 --------------000401040709070800060007-- ------------=_1363967283-17624-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 13 Mar 2013 20:24:52 +0000 Received: from localhost ([127.0.0.1]:51931 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UFsEF-0000wP-Ty for submit@debbugs.gnu.org; Wed, 13 Mar 2013 16:24:52 -0400 Received: from eggs.gnu.org ([208.118.235.92]:55723) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UFs7J-0000lV-5a for submit@debbugs.gnu.org; Wed, 13 Mar 2013 16:17:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UFs66-00044e-8s for submit@debbugs.gnu.org; Wed, 13 Mar 2013 16:16:28 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-6.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD autolearn=unavailable version=3.3.2 Received: from lists.gnu.org ([208.118.235.17]:35826) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UFs66-00044M-6J for submit@debbugs.gnu.org; Wed, 
13 Mar 2013 16:16:26 -0400 Received: from eggs.gnu.org ([208.118.235.92]:38837) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UFs63-0005pl-4V for bug-coreutils@gnu.org; Wed, 13 Mar 2013 16:16:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UFs60-00042g-97 for bug-coreutils@gnu.org; Wed, 13 Mar 2013 16:16:23 -0400 Received: from userp1040.oracle.com ([156.151.31.81]:18458) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UFs60-00041y-1q for bug-coreutils@gnu.org; Wed, 13 Mar 2013 16:16:20 -0400 Received: from ucsinet22.oracle.com (ucsinet22.oracle.com [156.151.31.94]) by userp1040.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with ESMTP id r2DKGHD6024774 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Wed, 13 Mar 2013 20:16:18 GMT Received: from acsmt358.oracle.com (acsmt358.oracle.com [141.146.40.158]) by ucsinet22.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r2DKGG6G018625 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 13 Mar 2013 20:16:17 GMT Received: from abhmt111.oracle.com (abhmt111.oracle.com [141.146.116.63]) by acsmt358.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id r2DKGG8T007861 for ; Wed, 13 Mar 2013 15:16:16 -0500 MIME-Version: 1.0 Message-ID: Date: Wed, 13 Mar 2013 13:16:16 -0700 (PDT) From: Marc Grondin To: Subject: bug report for core-utils command : OD X-Mailer: Zimbra on Oracle Beehive Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline X-Source-IP: ucsinet22.oracle.com [156.151.31.94] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 208.118.235.17 X-Spam-Score: -3.5 (---) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Wed, 13 Mar 2013 16:24:50 -0400 Cc: Mark.Jaeger@oracle.com X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.2 (------) Good Afternoon,=20 My client was attempting to run the command : od -c on this xml file (sampl= e only)=20 ---------------------------------------------------------------------------= --- =E4=B8=B8 =E4=B8=B8 =F0=A0=84=8C ? ? ?=E4=B8=B8 ??=E4=B8=B8 ---------------------------------------------------------------------------= --- note : this system is a : 2.6.18-164.0.0.0.1.el5xen #1 SMP Thu Sep 3 00:34:= 43 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux He was getting this output :=20 ---------------------------------------------------------------------------= --- 0000000 < ? x m l v e r s i o n =3D =20 0000020 ' 1 . 0 ' e n c o d i n g =3D 0000040 ' U T F - 8 ' ? > \n < t o p > 0000060 \n < x > =EF=BF=BD =EF=BF=BD =EF=BF=BD <= / x > \n =20 0000100 < y > =EF=BF=BD =EF=BF=BD =EF=BF=BD 201 < /= y > \n =20 0000120 < z > =EF=BF=BD =EF=BF=BD 204 214 < / z > \n= =20 0000140 < x > ? < / x > \n < x > ? 0000160 < / x > \n < x > ? =EF=BF=BD =EF= =BF=BD =EF=BF=BD 201 0000200 < / x > \n < x > ? ? =EF=BF=BD = =EF=BF=BD =EF=BF=BD 0000220 201 < / x > \n < / t o p > \n ---------------------------------------------------------------------------= --- Instead of this :=20 ---------------------------------------------------------------------------= --- 000000 < ? x m l v e r s i o n =3D =20 0000020 ' 1 . 0 ' e n c o d i n g =3D 0000040 ' U T F - 8 ' ? > \n < t o p > 0000060 \n < x > 344 270 270 < / x > \n =20 0000100 < y > 360 257 240 201 < / y > \n =20 0000120 < z > 360 240 204 214 < / z > \n =20 0000140 < x > ? < / x > \n < x > ? 0000160 < / x > \n < x > ? 360 257 240 201 0000200 < / x > \n < x > ? ? 
360 257 240 0000220 201 < / x > \n < / t o p > \n 0000235 ---------------------------------------------------------------------------= --- This all based on the LANG env. He was using :=20 LANG=3Den_US.iso88591, instead of LANG=3Den_US.UTF-8=20 ---------------------------------------------------------------------------= --- Question :=20 Since the output is based on the ASCII character set, should it not, in bot= h cases give a numerical output (as it did in scenario #2)=20 for a symbol outside the ascii/extended-ascii character set ?=20 ---------------------------------------------------------------------------= --- Regards,=20 Marc Grondin,=20 __________________________________ Oracle - Quebec city, Qc. Senior System Administrator, PDIT --------------------------------- 400-330 St-Vallier, G1K 9C5 418.524.5665 # 1256 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D ------------=_1363967283-17624-1-- From unknown Sun Jun 22 04:30:13 2025 X-Loop: help-debbugs@gnu.org Subject: bug#13947: bug report for core-utils command : OD Resent-From: Eric Blake Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Fri, 22 Mar 2013 16:06:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 13947 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: =?UTF-8?Q?P=C3=A1draig?= Brady Cc: Mark.Jaeger@oracle.com, Marc Grondin , 13947-done@debbugs.gnu.org Received: via spool by 13947-done@debbugs.gnu.org id=D13947.136396835119274 (code D ref 13947); Fri, 22 Mar 2013 16:06:01 +0000 Received: (at 13947-done) by debbugs.gnu.org; 22 Mar 2013 16:05:51 +0000 Received: from localhost ([127.0.0.1]:41094 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UJ4TW-00050o-MZ for submit@debbugs.gnu.org; Fri, 22 Mar 2013 12:05:51 -0400 Received: from mx1.redhat.com ([209.132.183.28]:40597) by debbugs.gnu.org with esmtp (Exim 4.72) 
(envelope-from ) id 1UJ4TS-00050e-NO for 13947-done@debbugs.gnu.org; Fri, 22 Mar 2013 12:05:48 -0400 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r2MG3mnI018914 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 22 Mar 2013 12:03:48 -0400 Received: from [10.3.113.168] (ovpn-113-168.phx2.redhat.com [10.3.113.168]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r2MG3lvC027132; Fri, 22 Mar 2013 12:03:48 -0400 Message-ID: <514C80E3.5030203@redhat.com> Date: Fri, 22 Mar 2013 10:03:47 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4 MIME-Version: 1.0 References: <5140F0D6.6070103@redhat.com> <5140F563.1010906@draigBrady.com> <514C7CAD.8010403@draigBrady.com> In-Reply-To: <514C7CAD.8010403@draigBrady.com> X-Enigmail-Version: 1.5.1 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="----enig2KNCSPNGTGCVXXCIQWMFH" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23 X-Spam-Score: -9.4 (---------) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -9.4 (---------)

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
------enig2KNCSPNGTGCVXXCIQWMFH
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 03/22/2013 09:45 AM, Pádraig Brady wrote:
> Hopefully the attached clarifies things.

> * src/od.c (usage): Mention any printable character is output,
> Not just ASCII.
> * doc/coreutils.texi (od invocation): Further clarify that only
> single byte characters are output (due to the alignment requirement).
> Reported in http://bugs.gnu.org/13947

Yes, this looks good to me. It could go in as-is, but see my question
below...

> ---
> doc/coreutils.texi | 6 +++---
> src/od.c | 4 ++--
> 2 files changed, 5 insertions(+), 5 deletions(-)
>
> @table @samp
> @item a
> named character, ignoring high-order bit
> @item c
> -ASCII character or backslash escape,
> +printable single byte character or backslash escape,

Hmm, we output octal sequences without a backslash; should the info page
be any more verbose that it is one of: a single-byte printable
character, a C backslash escape, or an octal sequence? Or does that
just clutter things (seeing three octal digits, even without a
backslash, still makes it easy to determine that it can be used as an
escape sequence).

> +++ b/src/od.c
> @@ -339,7 +339,7 @@ suffixes may be . for octal and b for multiply by 512.\n\
> Traditional format specifications may be intermixed; they accumulate:\n\
> -a same as -t a, select named characters, ignoring high-order bit\n\
> -b same as -t o1, select octal bytes\n\
> - -c same as -t c, select ASCII characters or backslash escapes\n\
> + -c same as -t c, select printable characters or backslash escapes\n\

For the --help output, terse is good, so I don't see any improvements to
your change here.
--=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org ------enig2KNCSPNGTGCVXXCIQWMFH Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJRTIDjAAoJEKeha0olJ0NqRHQH/3D+/LChCRgRPK0/pPbr4nqG QLlBq2/2EWCzFJ2so3O/zXBkSgLiIU0mYJ3P64WJWEavfMXSuwVc1wE0KIZg44Cm cP437TY+uVrJDmdYgNWGuHe7tDj98AJWoR2ptOn4MziOjN1JCvxfToZoYl1MOslv F960kqUqJ2wygD5n4qye1BSodJEo+oVxgOBOhKiOMmtRZjo5ROI7N6rXp7v9mD2r 3nGR9gJfN4KnUhCJb7P4lj+iEkegyw8DAPsW5QzEjkrn3jT2GDs/AWLj6B4Tjphc pRSH0qkoZya2QMcACCkaQwCZMCHavabyQKQHQvjDm/+5C8jbAiqLtNETfSQbsoE= =EDAK -----END PGP SIGNATURE----- ------enig2KNCSPNGTGCVXXCIQWMFH-- From unknown Sun Jun 22 04:30:13 2025 X-Loop: help-debbugs@gnu.org Subject: bug#13947: bug report for core-utils command : OD Resent-From: =?UTF-8?Q?P=C3=A1draig?= Brady Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Fri, 22 Mar 2013 16:16:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 13947 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Eric Blake Cc: Mark.Jaeger@oracle.com, Marc Grondin , 13947@debbugs.gnu.org Received: via spool by 13947-submit@debbugs.gnu.org id=B13947.136396893420132 (code B ref 13947); Fri, 22 Mar 2013 16:16:02 +0000 Received: (at 13947) by debbugs.gnu.org; 22 Mar 2013 16:15:34 +0000 Received: from localhost ([127.0.0.1]:41102 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UJ4cw-0005Ef-2n for submit@debbugs.gnu.org; Fri, 22 Mar 2013 12:15:34 -0400 Received: from mx1.redhat.com ([209.132.183.28]:11377) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 
1UJ4cu-0005EY-9W for 13947@debbugs.gnu.org; Fri, 22 Mar 2013 12:15:33 -0400 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r2MGDYJN031542 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 22 Mar 2013 12:13:34 -0400 Received: from [10.36.116.25] (ovpn-116-25.ams2.redhat.com [10.36.116.25]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r2MGDVnD026330 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 22 Mar 2013 12:13:32 -0400 Message-ID: <514C832A.4010006@draigBrady.com> Date: Fri, 22 Mar 2013 16:13:30 +0000 From: =?UTF-8?Q?P=C3=A1draig?= Brady User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 MIME-Version: 1.0 References: <5140F0D6.6070103@redhat.com> <5140F563.1010906@draigBrady.com> <514C7CAD.8010403@draigBrady.com> <514C80E3.5030203@redhat.com> In-Reply-To: <514C80E3.5030203@redhat.com> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=UTF-8 X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by mx1.redhat.com id r2MGDYJN031542 X-Spam-Score: -5.0 (-----) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.9 (------)

On 03/22/2013 04:03 PM, Eric Blake wrote:
> On 03/22/2013 09:45 AM, Pádraig Brady wrote:
>> @table @samp
>> @item a
>> named character, ignoring high-order bit
>> @item c
>> -ASCII character or backslash escape,
>> +printable single byte character or backslash escape,
>
> Hmm, we output octal sequences without a backslash; should the info page
> be any more verbose that it is one of: a
single-byte printable
> character, a C backslash escape, or an octal sequence? Or does that
> just clutter things (seeing three octal digits, even without a
> backslash, still makes it easy to determine that it can be used as an
> escape sequence).

Good point.
I'll make that clarification in the same commit as it at least confirms the behavior is intended.
POSIX is explicit about the three possibilities.

thanks,
Pádraig.

From unknown Sun Jun 22 04:30:13 2025 X-Loop: help-debbugs@gnu.org Subject: bug#13947: bug report for core-utils command : OD Resent-From: Eric Blake Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 27 Mar 2013 18:47:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 13947 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Mark.Jaeger@oracle.com Cc: marc.grondin@oracle.com, P@draigBrady.com, 13947-done@debbugs.gnu.org Received: via spool by 13947-done@debbugs.gnu.org id=D13947.136441000131071 (code D ref 13947); Wed, 27 Mar 2013 18:47:02 +0000 Received: (at 13947-done) by debbugs.gnu.org; 27 Mar 2013 18:46:41 +0000 Received: from localhost ([127.0.0.1]:48074 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UKvMt-000852-Ql for submit@debbugs.gnu.org; Wed, 27 Mar 2013 14:46:40 -0400 Received: from mx1.redhat.com ([209.132.183.28]:20640) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UKvMq-00084s-Kd for 13947-done@debbugs.gnu.org; Wed, 27 Mar 2013 14:46:38 -0400 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r2RIi9Ie023720 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 27 Mar 2013 14:44:09 -0400 Received: from [10.3.113.179] (ovpn-113-179.phx2.redhat.com [10.3.113.179]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r2RIi97S011151;
Wed, 27 Mar 2013 14:44:09 -0400 Message-ID: <51533DF8.7080106@redhat.com> Date: Wed, 27 Mar 2013 12:44:08 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4 MIME-Version: 1.0 References: <5140F0D6.6070103@redhat.com> <5140F563.1010906@draigBrady.com> <514C7CAD.8010403@draigBrady.com> <514C80E3.5030203@redhat.com> In-Reply-To: X-Enigmail-Version: 1.5.1 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="----enig2MKHHMGNOHOJJPNVGWJWB" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 X-Spam-Score: -8.2 (--------) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -8.2 (--------)

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
------enig2MKHHMGNOHOJJPNVGWJWB
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 03/27/2013 12:39 PM, Mark JAEGER wrote:
> Hello Eric,
>
> The terms "single-byte character" and "single-byte
> printable character" do not sound precise to me.

They are precise - they are characters in the encoding determined by the
current setting of LC_CTYPE.

>
> A byte is just a byte. It is NOT a character.
> I.e., it is an octet, or an 8-bit quantity.
>
> It CAN be interpreted as a character, but only in
> the context of a particular ENCODING.

Yes, but the ENCODING is always known, thanks to the rules on LC_* and
locale handling.

>
> The help text as it stands now IS precise in talking
> about ASCII, which IS a particular encoding.
>
> Please don't use the term "single-byte ... character"
> without being precise about what encoding it uses.

The encoding is whatever encoding you asked for.
--=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org ------enig2MKHHMGNOHOJJPNVGWJWB Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJRUz34AAoJEKeha0olJ0NqIhgH+wVJ1uGSySfBzcvoaGTmdwPw WPQVFNnOAEluQsU+00lJ44vjyJ325rNe7zDGKNYPuOr3vGSL4LGyhJVgPmR7Ph3o 3H6fZ0ruv0XCww27Su2r95uftXflahDHjRvEqumaBvjUG/KSYNP8dm6EqfzjePFF 4GJKEZJaqsNuukNprYUHmry5KErtdRijKl8SQ3IhGyRJVqk+wfiIF4RO233DKzAh UF21fdSYM+9Hglbs5A2Hzz0iMR/QuF9o2Q7bFWtVqqZ7ajYCd6FobnfAh8mt3JXI eMK6nD8yKhdXUrcxiJ4pU2Lv0LavoMVhBarpgGjxYzSq34whdGxpj88g/2Vkk5Q= =OLfq -----END PGP SIGNATURE----- ------enig2MKHHMGNOHOJJPNVGWJWB-- From unknown Sun Jun 22 04:30:13 2025 X-Loop: help-debbugs@gnu.org Subject: bug#13947: bug report for core-utils command : OD Resent-From: Mark JAEGER Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 27 Mar 2013 19:25:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 13947 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: eblake@redhat.com Cc: marc.grondin@oracle.com, P@draigBrady.com, 13947-done@debbugs.gnu.org Reply-To: Mark.Jaeger@oracle.com Received: via spool by 13947-done@debbugs.gnu.org id=D13947.13644122882719 (code D ref 13947); Wed, 27 Mar 2013 19:25:02 +0000 Received: (at 13947-done) by debbugs.gnu.org; 27 Mar 2013 19:24:48 +0000 Received: from localhost ([127.0.0.1]:48148 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1UKvxn-0000hn-3h for submit@debbugs.gnu.org; Wed, 27 Mar 2013 15:24:47 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:33882) by debbugs.gnu.org with esmtp 
(Exim 4.72) (envelope-from ) id 1UKvI2-0007w2-UG for 13947-done@debbugs.gnu.org; Wed, 27 Mar 2013 14:41:40 -0400 Received: from ucsinet21.oracle.com (ucsinet21.oracle.com [156.151.31.93]) by aserp1040.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with ESMTP id r2RIdABY011672 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 27 Mar 2013 18:39:11 GMT Received: from acsmt356.oracle.com (acsmt356.oracle.com [141.146.40.156]) by ucsinet21.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r2RId920025099 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 27 Mar 2013 18:39:09 GMT Received: from abhmt118.oracle.com (abhmt118.oracle.com [141.146.116.70]) by acsmt356.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id r2RId8SY019347; Wed, 27 Mar 2013 13:39:08 -0500 Received: from adc2190842.us.oracle.com (/10.228.220.117) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 27 Mar 2013 11:39:08 -0700 Date: Wed, 27 Mar 2013 11:39:08 -0700 (PDT) From: Mark JAEGER X-X-Sender: mjaeger@adc2190842.us.oracle.com In-Reply-To: <514C80E3.5030203@redhat.com> Message-ID: References: <5140F0D6.6070103@redhat.com> <5140F563.1010906@draigBrady.com> <514C7CAD.8010403@draigBrady.com> <514C80E3.5030203@redhat.com> User-Agent: Alpine 2.00 (LRH 1167 2008-08-23) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-469076516-2006786642-1364409548=:21517" X-Source-IP: ucsinet21.oracle.com [156.151.31.93] X-Spam-Score: -5.5 (-----) X-Mailman-Approved-At: Wed, 27 Mar 2013 15:24:45 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -5.5 (-----) This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. 
---469076516-2006786642-1364409548=:21517
Content-Type: TEXT/PLAIN; charset=UTF-8; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE

Hello Eric,

The terms "single-byte character" and "single-byte
printable character" do not sound precise to me.

A byte is just a byte. It is NOT a character.
I.e., it is an octet, or an 8-bit quantity.

It CAN be interpreted as a character, but only in
the context of a particular ENCODING.

The help text as it stands now IS precise in talking
about ASCII, which IS a particular encoding.

Please don't use the term "single-byte ... character"
without being precise about what encoding it uses.

Regards,

--Mark JAEGER phone: 312-651-8329
Sustaining Engineering (formerly DDR)
Server Technologies, Oracle e-mail: Mark.Jaeger@oracle.com

On Fri, 22 Mar 2013, Eric Blake wrote:

> Date: Fri, 22 Mar 2013 10:03:47 -0600
> From: Eric Blake
> To: Pádraig Brady
> Cc: Mark.Jaeger@oracle.com, Marc Grondin ,
> 13947-done@debbugs.gnu.org
> Subject: Re: bug#13947: bug report for core-utils command : OD
>
> On 03/22/2013 09:45 AM, Pádraig Brady wrote:
>> Hopefully the attached clarifies things.
>
>> * src/od.c (usage): Mention any printable character is output,
>> Not just ASCII.
>> * doc/coreutils.texi (od invocation): Further clarify that only
>> single byte characters are output (due to the alignment requirement).
>> Reported in http://bugs.gnu.org/13947
>
> Yes, this looks good to me. It could go in as-is, but see my question
> below...
>
>> ---
>> doc/coreutils.texi | 6 +++---
>> src/od.c | 4 ++--
>> 2 files changed, 5 insertions(+), 5 deletions(-)
>>
>
>> @table @samp
>> @item a
>> named character, ignoring high-order bit
>> @item c
>> -ASCII character or backslash escape,
>> +printable single byte character or backslash escape,
>
> Hmm, we output octal sequences without a backslash; should the info page
> be any more verbose that it is one of: a single-byte printable
> character, a C backslash escape, or an octal sequence?
Or does that
> just clutter things (seeing three octal digits, even without a
> backslash, still makes it easy to determine that it can be used as an
> escape sequence).
>
>> +++ b/src/od.c
>> @@ -339,7 +339,7 @@ suffixes may be . for octal and b for multiply by 512.\n\
>> Traditional format specifications may be intermixed; they accumulate:\n\
>> -a same as -t a, select named characters, ignoring high-order bit\n\
>> -b same as -t o1, select octal bytes\n\
>> - -c same as -t c, select ASCII characters or backslash escapes\n\
>> + -c same as -t c, select printable characters or backslash escapes\n\
>
> For the --help output, terse is good, so I don't see any improvements to
> your change here.
>
> --
> Eric Blake eblake redhat com +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
>
---469076516-2006786642-1364409548=:21517--
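
Illustration of the behavior discussed in this thread. The following is a minimal sketch in Python, not the code in src/od.c. It models the three possibilities named above for each byte shown by od -c: a C backslash escape, a printable single-byte character, or a three-digit octal value. The names od_c_cell, od_c_row, is_printable_ascii and is_printable_latin1 are hypothetical. The two predicates are assumptions standing in for what the C library reports as printable under each locale: the outputs in the original report suggest that bytes 0xA0-0xFF count as printable under en_US.iso88591, and that no byte above 0x7F does under en_US.UTF-8.

```python
# Simplified model of how `od -c` might render each input byte.
# Assumption: printability is decided per byte by a locale-dependent
# predicate; the two predicates below only approximate real locales.

# Backslash escapes for the control bytes that have a C escape.
C_ESCAPES = {
    0x00: r"\0", 0x07: r"\a", 0x08: r"\b", 0x09: r"\t",
    0x0A: r"\n", 0x0B: r"\v", 0x0C: r"\f", 0x0D: r"\r",
}


def is_printable_ascii(byte):
    """Assumed model for a locale where only ASCII bytes are printable."""
    return 0x20 <= byte <= 0x7E


def is_printable_latin1(byte):
    """Assumed model for an ISO-8859-1 locale: 0xA0-0xFF are printable too."""
    return is_printable_ascii(byte) or 0xA0 <= byte <= 0xFF


def od_c_cell(byte, is_printable):
    """Return the text shown for one byte: escape, character, or octal."""
    if byte in C_ESCAPES:
        return C_ESCAPES[byte]
    if is_printable(byte):
        # Emit the byte itself; decoded as Latin-1 here only for display.
        return bytes([byte]).decode("latin-1")
    # Three octal digits, with no leading backslash.
    return format(byte, "03o")


def od_c_row(data, is_printable):
    """Render every byte of `data` as od -c style cells."""
    return [od_c_cell(b, is_printable) for b in data]


if __name__ == "__main__":
    sample = b"\xf0\xaf\xa0\x81"  # the bytes 360 257 240 201 from the report
    print(od_c_row(sample, is_printable_ascii))
    print(od_c_row(sample, is_printable_latin1))
```

With the ASCII-only predicate all four bytes come out as octal (360 257 240 201), as in the output the reporter expected. With the Latin-1 predicate the first three bytes are emitted as raw characters and only 201 stays octal, which is consistent with the first output in the report, where a UTF-8 terminal would show each lone high byte as a replacement character.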