From unknown Sat Aug 16 18:37:37 2025
Subject: bug#9346: wc does not conform to POSIX (additional spaces)
From: Vincent Lefevre
To: 9346@debbugs.gnu.org
X-Debbugs-Original-To: bug-coreutils@gnu.org
Date: Tue, 23 Aug 2011 02:39:09 +0200
Message-ID: <20110823003909.GD28945@xvii.vinc17.org>
X-GNU-PR-Message: report 9346
X-GNU-PR-Package: coreutils

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wc.html
says:

  STDOUT

    By default, the standard output shall contain an entry for each
    input file of the form:

    "%d %d %d %s\n", <newlines>, <words>, <bytes>, <file>

But wc from GNU coreutils 8.12 adds spaces:

  $ echo | wc
        1       0       1

Setting POSIXLY_CORRECT=1 doesn't even have any effect here.

-- 
Vincent Lefèvre - Web:
100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

From unknown Sat Aug 16 18:37:37 2025
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Vincent Lefevre
Subject: bug#9346: closed (Re: bug#9346: wc does not conform to POSIX (additional spaces))
Message-ID:
References: <4E52FD6F.1050808@draigBrady.com> <20110823003909.GD28945@xvii.vinc17.org>
Reply-To: 9346@debbugs.gnu.org
Date: Tue, 23 Aug 2011 01:11:02 +0000
X-Gnu-PR-Message: they-closed 9346
X-Gnu-PR-Package: coreutils
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----------=_1314061862-26227-1"

This is a multi-part message in MIME format...

------------=_1314061862-26227-1
Content-Disposition: inline
Content-Type: text/plain; charset="utf-8"

Your bug report

#9346: wc does not conform to POSIX (additional spaces)

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 9346@debbugs.gnu.org.

-- 
9346: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=9346
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1314061862-26227-1
Content-Type: message/rfc822
Content-Disposition: inline

Message-ID: <4E52FD6F.1050808@draigBrady.com>
Date: Tue, 23 Aug 2011 02:07:59 +0100
From: Pádraig Brady
To: Vincent Lefevre, 9346-done@debbugs.gnu.org
Subject: Re: bug#9346: wc does not conform to POSIX (additional spaces)
References: <20110823003909.GD28945@xvii.vinc17.org>
In-Reply-To: <20110823003909.GD28945@xvii.vinc17.org>

tags 9346 + notabug

On 08/23/2011 01:39 AM, Vincent Lefevre wrote:
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wc.html
> says:
>
>   STDOUT
>
>     By default, the standard output shall contain an entry for each
>     input file of the form:
>
>     "%d %d %d %s\n", <newlines>, <words>, <bytes>, <file>
>
> But wc from GNU coreutils 8.12 adds spaces:
>
>   $ echo | wc
>         1       0       1
>
> Setting POSIXLY_CORRECT=1 doesn't even have any effect here.
>

POSIX refers to the printf format above as a pseudo-printf format,
to contrast with the format used in SYS V of "%7d%7d%7d %s\n".
Notice the lack of spaces there, hence problems with big numbers.
So I take the POSIX printf format you referenced to mean only
that at least 1 space is guaranteed between counts.

Also, for any kind of portability, one will need to
deal with a variable number of spaces.
GNU wc uses a dynamic width (try it on a small file),
while also ensuring at least 1 space is present.

cheers,
Pádraig.

------------=_1314061862-26227-1--

From unknown Sat Aug 16 18:37:37 2025
Subject: bug#9346: wc does not conform to POSIX (additional spaces)
From: Eric Blake
Organization: Red Hat
To: 9346@debbugs.gnu.org, P@draigBrady.com
Date: Mon, 22 Aug 2011 19:47:55 -0600
Message-ID: <4E5306CB.6060703@redhat.com>
References: <20110823003909.GD28945@xvii.vinc17.org> <4E52FD6F.1050808@draigBrady.com>
In-Reply-To: <4E52FD6F.1050808@draigBrady.com>
X-GNU-PR-Message: followup 9346
X-GNU-PR-Package: coreutils

On 08/22/2011 07:07 PM, Pádraig Brady wrote:
> tags 9346 + notabug
>
> On 08/23/2011 01:39 AM, Vincent Lefevre wrote:
>> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wc.html
>> says:
>>
>>   STDOUT
>>
>>     By default, the standard output shall contain an entry for each
>>     input file of the form:
>>
>>     "%d %d %d %s\n", <newlines>, <words>, <bytes>, <file>
>>
>> But wc from GNU coreutils 8.12 adds spaces:
>>
>>   $ echo | wc
>>         1       0       1
>>
>> Setting POSIXLY_CORRECT=1 doesn't even have any effect here.

Correct, because it is not a POSIX violation.

>>
>
> POSIX refers to the printf format above as a pseudo-printf format,
> to contrast with the format used in SYS V of "%7d%7d%7d %s\n".

The official wording is here:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05

  3. The following characters have the following special meaning in
     the format string:

     ' '  (An empty character position.) Represents one or more
          <blank> characters.

     Δ    Represents exactly one <space> character.

Since the POSIX specification for wc uses space, and not the special
delta symbol, it is intended to be an arbitrary amount of blanks (spaces or
tabs), according as the tool designers think fit, and you cannot
portably rely on an exact number, but can rely on the fact that no
matter how large the numbers are, the columns will not run into one another.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
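
The portability point made in this thread is that a consumer of wc output should split on runs of blanks, not expect exactly one space. Here is a minimal Python sketch of that. The helper name `parse_wc_line` and the sample strings are assumptions made for illustration; they are not taken from coreutils or from the messages above. The sketch only handles the default three-count output.

```python
def parse_wc_line(line):
    """Parse one line of default `wc` output into (lines, words, bytes, name).

    Hypothetical helper, not part of coreutils. str.split() with no separator
    splits on runs of whitespace and skips leading blanks, so the padding
    width chosen by a particular wc implementation does not matter.
    """
    fields = line.split(None, 3)  # at most 3 splits, so a name may contain blanks
    lines, words, nbytes = (int(field) for field in fields[:3])
    name = fields[3].rstrip("\n") if len(fields) > 3 else None
    return lines, words, nbytes, name


# Single-space and padded forms parse to the same counts.
print(parse_wc_line("1 0 1\n"))
print(parse_wc_line("      1       0       1\n"))

# The SysV-style "%7d%7d%7d" format has no separator of its own, so a count
# wider than 7 digits runs into its neighbour and the fields cannot be recovered.
sysv = "%7d%7d%7d" % (1, 12345678, 3)
print(repr(sysv), len(sysv.split()))
```

Splitting on whitespace relies only on the guarantee discussed in the thread, namely that at least one blank separates the counts, so it should behave the same whether an implementation pads the columns or not.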