From unknown Fri Jun 20 07:16:10 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#16468 <16468@debbugs.gnu.org> To: bug#16468 <16468@debbugs.gnu.org> Subject: Status: join Reply-To: bug#16468 <16468@debbugs.gnu.org> Date: Fri, 20 Jun 2025 14:16:10 +0000 retitle 16468 join reassign 16468 coreutils submitter 16468 barry kesner severity 16468 normal tag 16468 notabug thanks From modockesner@gmail.com Thu Jan 16 10:29:37 2014 Received: (at submit) by debbugs.gnu.org; 16 Jan 2014 17:06:21 +0000 Received: from eggs.gnu.org ([208.118.235.92]:51805) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3osy-00043t-OY for submit@debbugs.gnu.org; Thu, 16 Jan 2014 10:29:37 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W3osx-00071U-Fw for submit@debbugs.gnu.org; Thu, 16 Jan 2014 10:29:36 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM, HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:34048) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W3osx-00071L-Cf for submit@debbugs.gnu.org; Thu, 16 Jan 2014 10:29:35 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53515) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W3osw-00071z-DQ for bug-coreutils@gnu.org; Thu, 16 Jan 2014 10:29:35 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W3osu-00070z-A3 for bug-coreutils@gnu.org; Thu, 16 Jan 2014 10:29:34 -0500 Received: from mail-pd0-x234.google.com ([2607:f8b0:400e:c02::234]:52358) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W3osu-0006zd-2m for bug-coreutils@gnu.org; Thu, 16 Jan 2014 10:29:32 -0500 Received: by 
mail-pd0-f180.google.com with SMTP id x10so1406931pdj.25 for ; Thu, 16 Jan 2014 07:29:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=czPmy1iABNUE42GAfVPM4qBZ/iJYN7NtpRtPvOfvLso=; b=p1++wLtbAkcylyHiG6bCoyleHGMT1Ct4PVIaJTGw1V8SMv8F/2qNHwiFXgj0AYBrRg 0THB4JlJJEONCLGWe/eAfmezwMl7qtOaJwTeR3J1j7iW9wwJG+hgfeyQNeYPAKytbYVh XtBBYn5CP5AdL+Z0/QMp/X7E3X83grAFai1zg0FZCZ112duaFJmmogrCDPw7Sj+4h2yv mj0vV0+XIfnCg+7swqWurmDogdseHt3LCPLp3qPXTVKwvqBs9gzEfMv8uc0xHUtk4hAZ Ig4wZ4KpXPKX+CeDiJ/3s1EuPqCNhl0NaDU+6kGU6XZ2B4bdY1eblNyke11kprH5wrnU fQRA== MIME-Version: 1.0 X-Received: by 10.69.20.11 with SMTP id gy11mr6050317pbd.64.1389886167579; Thu, 16 Jan 2014 07:29:27 -0800 (PST) Received: by 10.70.41.231 with HTTP; Thu, 16 Jan 2014 07:29:27 -0800 (PST) Date: Thu, 16 Jan 2014 10:29:27 -0500 Message-ID: Subject: join From: barry kesner To: bug-coreutils@gnu.org Content-Type: multipart/alternative; boundary=001a113673444c0fc404f018173c X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit --001a113673444c0fc404f018173c Content-Type: text/plain; charset=ISO-8859-1 join is failing on large numbers somehow I have 2 files to join file 1 99910287 1 99978720 1 99980081 1 99980180 2 99980281 1 99980406 1 99980932 1 99982402 1 100002132 1 100002162 2 100002166 3 file 2 contains 99980081 1 100002129 1 100002136 2 100002162 3 Join fails to join properly only giving 99980081 if I prefix the 9's with a 0 join does not fail Barry --001a113673444c0fc404f018173c Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
join is failing on large numbers somehow

I have 2 files to join
file 1
99910287    1
99978720    1
99980081    1
99980180    2
99980281    1
99980406    1
99980932    1
99982402    1
100002132   1
100002162   2
100002166   3
file 2 contains
99980081    1
100002129   1
100002136   2
100002162   3

Join fails to join properly only giving 99980081
if I prefix the 9's with a 0 join does not fail

Barry
--001a113673444c0fc404f018173c-- From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 16 12:15:12 2014 Received: (at 16468) by debbugs.gnu.org; 16 Jan 2014 17:15:12 +0000 Received: from localhost ([127.0.0.1]:53901 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3qX9-0006zp-Lq for submit@debbugs.gnu.org; Thu, 16 Jan 2014 12:15:12 -0500 Received: from mx1.redhat.com ([209.132.183.28]:1145) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3qX5-0006zd-Mr for 16468@debbugs.gnu.org; Thu, 16 Jan 2014 12:15:09 -0500 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s0GHF47t005939 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 16 Jan 2014 12:15:05 -0500 Received: from [10.3.113.148] (ovpn-113-148.phx2.redhat.com [10.3.113.148]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s0GHF4rI026574; Thu, 16 Jan 2014 12:15:04 -0500 Message-ID: <52D81398.60203@redhat.com> Date: Thu, 16 Jan 2014 10:15:04 -0700 From: Eric Blake Organization: Red Hat, Inc. 
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: barry kesner , 16468@debbugs.gnu.org Subject: Re: bug#16468: join References: In-Reply-To: X-Enigmail-Version: 1.6 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="LHOgxak8U9NniQUivAj6OWAfI44mbOVLB" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 X-Spam-Score: -5.3 (-----) X-Debbugs-Envelope-To: 16468 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.3 (-----) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --LHOgxak8U9NniQUivAj6OWAfI44mbOVLB Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

On 01/16/2014 08:29 AM, barry kesner wrote:
> join is failing on large numbers somehow
>
>
> Join fails to join properly only giving 99980081
> if I prefix the 9's with a 0 join does not fail

Sounds to me like you didn't heed this advice in the --help text:

Important: FILE1 and FILE2 must be sorted on the join fields.
E.g., use "sort -k 1b,1" if 'join' has no options,
or use "join -t ''" if 'sort' has no options.
Note, comparisons honor the rules specified by 'LC_COLLATE'.
If the input is not sorted and some lines cannot be joined, a
warning message will be given.

Does running 'LC_ALL=C join' change the behavior for you, in which case
it was an issue of your choice of LC_COLLATE?
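The sorting requirement quoted above can be checked directly. What follows is only a minimal sketch, assuming GNU sort and join are installed; the temporary directory and file names are hypothetical, and the data simply reuses the keys from the report. It shows that the numerically ordered input is not in the order join compares in, and that re-sorting with "sort -k 1b,1" lets both common keys match.

```shell
# Hypothetical scratch files holding the keys from the bug report.
tmp=$(mktemp -d)
printf '%s\n' '99910287 1' '99978720 1' '99980081 1' '99980180 2' \
  '99980281 1' '99980406 1' '99980932 1' '99982402 1' \
  '100002132 1' '100002162 2' '100002166 3' > "$tmp/file1"
printf '%s\n' '99980081 1' '100002129 1' '100002136 2' '100002162 3' \
  > "$tmp/file2"

# The files are in numeric order, but join compares keys as strings:
# "100002129" sorts before "99980081" because '1' < '9'.
# sort -c exits non-zero when the input is not already sorted.
if LC_ALL=C sort -c -k 1b,1 "$tmp/file2" 2>/dev/null; then
  order=sorted
else
  order=unsorted
fi
echo "file2 is $order as far as join is concerned"

# Re-sort both inputs the way the --help text recommends, then join.
LC_ALL=C sort -k 1b,1 "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort -k 1b,1 "$tmp/file2" > "$tmp/file2.sorted"
joined=$(LC_ALL=C join "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

Using LC_ALL=C for both sort and join keeps the two tools on the same collation rules, which is the point of the LC_COLLATE remark above.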
--=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --LHOgxak8U9NniQUivAj6OWAfI44mbOVLB Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJS2BOYAAoJEKeha0olJ0NqZacH/ip+GHRKgPgUt9yu2ZsInYP5 nQwENU0WzFY9GfwCXjiuRXwmb0M60UrjqP4ifqKGROiEYYp4yBCOZ0L+2bX9a9Q8 8k+gkZjL8DoeNxeeF8KGU5HDAV9TKmnbPZXT8pJO/b81kfSFVnkhNTbeicBJafz/ ySkMzoWfCLmyj/sQAQhp1OcdEYAzjG/u8ehvVRRVd0+bumf4HYG60SDlKCWIcqqH YR+Cn8B1tbi/VEeu9bFv+3ZhV0eQNn/L88Xr0ym3VsJrq1vzPVGWqiJzcWFYehN8 2PIQjmxWG3x2KZdrfowtiO4g76FX0k1dWQcibhWlRcwBVV/i5uioUHNFSCE4+VE= =iMs9 -----END PGP SIGNATURE----- --LHOgxak8U9NniQUivAj6OWAfI44mbOVLB-- From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 16 13:10:19 2014 Received: (at 16468) by debbugs.gnu.org; 16 Jan 2014 18:10:19 +0000 Received: from localhost ([127.0.0.1]:53946 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3rOV-00009V-5z for submit@debbugs.gnu.org; Thu, 16 Jan 2014 13:10:19 -0500 Received: from mx1.redhat.com ([209.132.183.28]:49088) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3rOR-00009H-4w for 16468@debbugs.gnu.org; Thu, 16 Jan 2014 13:10:17 -0500 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s0GIACMK027680 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 16 Jan 2014 13:10:14 -0500 Received: from [10.3.113.148] (ovpn-113-148.phx2.redhat.com [10.3.113.148]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s0GIACf2023946; Thu, 16 Jan 2014 13:10:12 -0500 Message-ID: <52D82083.3010309@redhat.com> 
Date: Thu, 16 Jan 2014 11:10:11 -0700 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: barry kesner , 16468@debbugs.gnu.org Subject: Re: bug#16468: join References: <52D81398.60203@redhat.com> In-Reply-To: X-Enigmail-Version: 1.6 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="XDjdKbPcQEOPXlB4VoUPgJ7rhOrgq8VT4" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 X-Spam-Score: -5.3 (-----) X-Debbugs-Envelope-To: 16468 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.3 (-----) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --XDjdKbPcQEOPXlB4VoUPgJ7rhOrgq8VT4 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

[re-adding the list, with permission]

On 01/16/2014 10:46 AM, barry kesner wrote:
> Eric,
> Thanks for response.
> I now realize it wants sorted alpha input not numerical. 999 1000 1001 is
> how it is sorted.

I think there have been requests in the past to enhance 'join' so that
it can have more fine-tuned control over how its fields are selected.
Maybe something like sharing code so that 'join -1 k1,1n' would behave
like it were using 'sort -k1,1n' sorting on file 1. But right now, that
functionality doesn't exist.

>
> How do you tell join this without resorting. The files are huge!

Unfortunately, there isn't any really good way, short of re-processing
the files to make the data appear sorted in the order join expects.
That said, it certainly appears that for your given data, you can write
a sed filter that can reprocess on a line-by-line basis, and feed that
into join, without the penalty of having to re-sort the entire file and
without having to have the processed file stored in your file system all
at once. It also seems possible to write a post filter to get back to
the style of the line in the original file. Here, extensions such as bash's
  join <(infilter file1) <(infilter file2) | outfilter
make it easier to type (where the trick is to now write the correct sed
scripts to serve as infilter and outfilter) than the alternative of
having to use named fifos for limiting yourself to just POSIX semantics.

>
> I can't find LC_COLLATE?

It's an environment variable, like LC_ALL, that affects your locale.
Running 'locale' will show you your current locale settings, including
LC_COLLATE. Setting LC_ALL in the environment is shorthand that forces
all other categories to behave the same, so it's easier to test whether
'LC_ALL=C command' has an effect than it is to figure out which locale
category(ies) matter.
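One possible shape for the infilter/outfilter idea, offered as a sketch rather than a tested recipe for the real files: the input filter zero-pads the join key so that string order matches numeric order, and the output filter strips the padding again. The function bodies, the pad width of 12, and the scratch file names are all assumptions made for illustration, and awk is used for the padding instead of sed. Named FIFOs are used here so the sketch runs under plain POSIX sh; in bash the same filters would slot into "join <(infilter file1) <(infilter file2) | outfilter".

```shell
# infilter (hypothetical): zero-pad field 1 to a fixed width.
# Width 12 is an assumption; it must cover the longest key in the data.
# Note that awk rebuilds the line with single spaces between fields.
infilter() { awk '{ $1 = sprintf("%012d", $1); print }' "$1"; }

# outfilter (hypothetical): drop the leading zeros from the join field,
# always keeping at least one digit.
outfilter() { sed 's/^0*\([0-9]\)/\1/'; }

# Scratch files in numeric order, as in the report.
tmp=$(mktemp -d)
printf '%s\n' '99910287 1' '99978720 1' '99980081 1' '99980180 2' \
  '99980281 1' '99980406 1' '99980932 1' '99982402 1' \
  '100002132 1' '100002162 2' '100002166 3' > "$tmp/file1"
printf '%s\n' '99980081 1' '100002129 1' '100002136 2' '100002162 3' \
  > "$tmp/file2"

# Stream the filtered data through FIFOs, so no padded copy of either
# file is stored and neither file is re-sorted.
mkfifo "$tmp/f1" "$tmp/f2"
infilter "$tmp/file1" > "$tmp/f1" &
infilter "$tmp/file2" > "$tmp/f2" &
result=$(LC_ALL=C join "$tmp/f1" "$tmp/f2" | outfilter)
wait
printf '%s\n' "$result"

rm -rf "$tmp"
```

This only works because the inputs are already in numeric order and all keys are non-negative integers no longer than the pad width; padded keys of equal length then compare the same way as the numbers they represent.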
--=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --XDjdKbPcQEOPXlB4VoUPgJ7rhOrgq8VT4 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJS2CCDAAoJEKeha0olJ0NqW8wH/jW+8Mfyd9ZimoqK/cz5oJ0X sxPpjKBjM5Xw6i55u4+qi+21xdCNk5TtYkjkTnqLpBowKVLaogvkn+2GNyPt2g2L qc8DDbSne8GxVZn/fQRNjoE4D1G1ZOtLrIqZsVB8Y+jk5yYi+x4uhMETbO1HM0qk F6W64oz6SfjA51RMR5rKDMj0x50EcwclvEf3oQeEBAHSSXGusj13tOw2stzE0yZF h82pfOF0cQgDVERP8SjepTlHVe8ZozvJ7V3CIFN3F13KrPJdnSRV8MeDPzSvJC8N nxN4KsSjVEB0P5MyJGHCszFmEXJbYtXOzwAHTMq/CCHhAhIDFOoKsbxoYvawpKw= =AUR2 -----END PGP SIGNATURE----- --XDjdKbPcQEOPXlB4VoUPgJ7rhOrgq8VT4-- From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 16 14:32:04 2014 Received: (at 16468) by debbugs.gnu.org; 16 Jan 2014 19:32:04 +0000 Received: from localhost ([127.0.0.1]:53999 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3sfc-0002SY-7D for submit@debbugs.gnu.org; Thu, 16 Jan 2014 14:32:04 -0500 Received: from mail2.vodafone.ie ([213.233.128.44]:14587) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3sfZ-0002S4-P2 for 16468@debbugs.gnu.org; Thu, 16 Jan 2014 14:32:02 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApQBALMy2FJtTvfM/2dsb2JhbAANTINDg1S4EoElgxkBAQEEAQIgDwFGEAsNAQoCAgUWCwICCQMCAQIBFi8GDQEHAQGIBQind3acKxeBKY1WB4JvgUkElDuBFYQChT2OVg Received: from unknown (HELO [192.168.1.79]) ([109.78.247.204]) by mail2.vodafone.ie with ESMTP; 16 Jan 2014 19:32:00 +0000 Message-ID: <52D833AF.2060506@draigBrady.com> Date: Thu, 16 Jan 2014 19:31:59 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 
Thunderbird/17.0.2 MIME-Version: 1.0 To: Eric Blake Subject: Re: bug#16468: join References: <52D81398.60203@redhat.com> <52D82083.3010309@redhat.com> In-Reply-To: <52D82083.3010309@redhat.com> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 16468 Cc: barry kesner , 16468@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On 01/16/2014 06:10 PM, Eric Blake wrote: > [re-adding the list, with permission] > > On 01/16/2014 10:46 AM, barry kesner wrote: >> How do you tell join this without resorting. The files are huge! > > Unfortunately, there isn't any really good way, short of re-processing > the files to make the data appear sorted in the order join expects. Note we are working on merging sort, uniq, and join key selection and comparison code, to support this directly. http://lists.gnu.org/archive/html/coreutils/2013-09/msg00047.html thanks, Pádraig. 
From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 16 19:00:15 2014 Received: (at 16468) by debbugs.gnu.org; 17 Jan 2014 00:00:15 +0000 Received: from localhost ([127.0.0.1]:54205 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3wr8-0003r8-Mn for submit@debbugs.gnu.org; Thu, 16 Jan 2014 19:00:14 -0500 Received: from moutng.kundenserver.de ([212.227.126.171]:57836) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3wr6-0003qu-Uj for 16468@debbugs.gnu.org; Thu, 16 Jan 2014 19:00:13 -0500 Received: from [192.168.1.11] (p57A5CC82.dip0.t-ipconnect.de [87.165.204.130]) by mrelayeu.kundenserver.de (node=mreu2) with ESMTP (Nemesis) id 0MVqHM-1VoRDW2PH6-00XErk; Fri, 17 Jan 2014 01:00:10 +0100 Message-ID: <52D8728A.1010904@bernhard-voelker.de> Date: Fri, 17 Jan 2014 01:00:10 +0100 From: Bernhard Voelker User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: Eric Blake , barry kesner , 16468@debbugs.gnu.org Subject: Re: bug#16468: join References: <52D81398.60203@redhat.com> <52D82083.3010309@redhat.com> In-Reply-To: <52D82083.3010309@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Provags-ID: V02:K0:vSbVdtq14ndWriASSymab5tkgeB7Ia/lf10L3+I+fqu tG8DUeQfwYEH4I9P+oZd42tQ9CFmbhveiSyxPwHvYCsvlGrgg2 fRttxP8CqCl4s08Ey5gXZ7K0QHcONqv0k0XLBClGIL0/wGUc0R NB4A/6zZrz7qCnACBC7Y+nQ7bYuPRQpHlCqdZugqubps9+JrhY q1GBYhwTs3JxjykuPbicH9GIu93E2JmqPoIR43S7RDbZxMi+kc sn1qeXWNISWamJqym1G7pp1Tw6TdBawo2V2OcRYhO3h/ENDRb0 rGaYl64kDr3+r4WQk1COZgkC9BgiNPycOunZGkcOOkONki2zK1 M5rIWCoUTblk//Zrnky3mbyqrIzkjRjIPI3X6E/sa X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 16468 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) On 01/16/2014 07:10 PM, Eric Blake 
wrote: > On 01/16/2014 10:46 AM, barry kesner wrote: >> How do you tell join this without resorting. The files are huge! > > Unfortunately, there isn't any really good way, short of re-processing > the files to make the data appear sorted in the order join expects. > That said, it certainly appears that for your given data, you can write > a sed filter that can reprocess on a line-by-line basis, and feed that > into join, without the penalty of having to re-sort the entire file and > without having to have the processed file stored in your file system all > at once. It also seems possible to write a post filter to get back to > the style of the line in the original file. Here, extensions such as bash's > join <(infilter file1) <(infilter file2) | outfilter > make it easier to type (where the trick is to now write the correct sed > scripts to serve as infilter and outfilter) than the alternative of > having to use named fifos for limiting yourself to just POSIX semantics. Hum, isn't such number conversion filtering exactly what numfmt wasn't designed for? But wait ... $ numfmt --field 1 --format='%020f' < f2 99980081 1 100002129 1 100002136 2 100002162 3 ... it doesn't support leading zeros, unfortunately. ;-/ Wouldn't this be a nice enhancement? 
Have a nice day, Berny From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 16 21:21:39 2014 Received: (at 16468) by debbugs.gnu.org; 17 Jan 2014 02:21:39 +0000 Received: from localhost ([127.0.0.1]:54242 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3z3z-0001Bt-2k for submit@debbugs.gnu.org; Thu, 16 Jan 2014 21:21:39 -0500 Received: from mail2.vodafone.ie ([213.233.128.44]:1269) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3z3w-0001Bh-MW for 16468@debbugs.gnu.org; Thu, 16 Jan 2014 21:21:37 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApQBAFVf2FJtTCcs/2dsb2JhbAANTIcXtQuDCIElgxkBAQEEIw8BRhALDQEKAgIFFAILAgIJAwIBAgFFBg0BBwEBiAWoKXacHReBKY1WB4JvgUkBA5Q7ilSOVg Received: from unknown (HELO [192.168.1.79]) ([109.76.39.44]) by mail2.vodafone.ie with ESMTP; 17 Jan 2014 02:21:35 +0000 Message-ID: <52D893AD.9070602@draigBrady.com> Date: Fri, 17 Jan 2014 02:21:33 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 MIME-Version: 1.0 To: Bernhard Voelker Subject: Re: bug#16468: join References: <52D81398.60203@redhat.com> <52D82083.3010309@redhat.com> <52D8728A.1010904@bernhard-voelker.de> In-Reply-To: <52D8728A.1010904@bernhard-voelker.de> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 16468 Cc: barry kesner , Eric Blake , 16468@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On 01/17/2014 12:00 AM, Bernhard Voelker wrote: > On 01/16/2014 07:10 PM, Eric Blake wrote: >> On 01/16/2014 10:46 AM, barry kesner wrote: >>> How do you tell join this without resorting. The files are huge! 
>> >> Unfortunately, there isn't any really good way, short of re-processing >> the files to make the data appear sorted in the order join expects. >> That said, it certainly appears that for your given data, you can write >> a sed filter that can reprocess on a line-by-line basis, and feed that >> into join, without the penalty of having to re-sort the entire file and >> without having to have the processed file stored in your file system all >> at once. It also seems possible to write a post filter to get back to >> the style of the line in the original file. Here, extensions such as bash's >> join <(infilter file1) <(infilter file2) | outfilter >> make it easier to type (where the trick is to now write the correct sed >> scripts to serve as infilter and outfilter) than the alternative of >> having to use named fifos for limiting yourself to just POSIX semantics. > > Hum, isn't such number conversion filtering exactly what numfmt > wasn't designed for? But wait ... > > $ numfmt --field 1 --format='%020f' < f2 > 99980081 1 > 100002129 1 > 100002136 2 > 100002162 3 > > ... it doesn't support leading zeros, unfortunately. ;-/ > Wouldn't this be a nice enhancement? Yes it really should support standard formatting directives. leading zeros, precision in the format, etc. thanks, Pádraig. 
From debbugs-submit-bounces@debbugs.gnu.org Wed Apr 30 19:53:31 2014 Received: (at 16468) by debbugs.gnu.org; 30 Apr 2014 23:53:32 +0000 Received: from localhost ([127.0.0.1]:46900 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WfeJe-0004Dg-S3 for submit@debbugs.gnu.org; Wed, 30 Apr 2014 19:53:31 -0400 Received: from mail2.vodafone.ie ([213.233.128.44]:4216) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WfeJY-0004DB-Ui for 16468@debbugs.gnu.org; Wed, 30 Apr 2014 19:53:28 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApQBAGWMYVNda70d/2dsb2JhbAANTINVgz7BZoE2gxkBAQEEI1YQCw0BAwMBAgEJFgsCAgkDAgECAT0IBg0BBQIBARaILKR6dqQSF44DPREHCYJmgUoEhUiLUYE4gmKFMIVOhWKJMQ Received: from unknown (HELO [192.168.1.79]) ([93.107.189.29]) by mail2.vodafone.ie with ESMTP; 01 May 2014 00:53:17 +0100 Message-ID: <53618CEC.9090209@draigBrady.com> Date: Thu, 01 May 2014 00:53:16 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 MIME-Version: 1.0 To: Bernhard Voelker Subject: Re: bug#16468: join References: <52D81398.60203@redhat.com> <52D82083.3010309@redhat.com> <52D8728A.1010904@bernhard-voelker.de> In-Reply-To: <52D8728A.1010904@bernhard-voelker.de> X-Enigmail-Version: 1.6 Content-Type: multipart/mixed; boundary="------------050800080003030005090203" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 16468 Cc: barry kesner , Eric Blake , 16468@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) This is a multi-part message in MIME format. 
--------------050800080003030005090203 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 01/17/2014 12:00 AM, Bernhard Voelker wrote: > On 01/16/2014 07:10 PM, Eric Blake wrote: >> On 01/16/2014 10:46 AM, barry kesner wrote: >>> How do you tell join this without resorting. The files are huge! >> >> Unfortunately, there isn't any really good way, short of re-processing >> the files to make the data appear sorted in the order join expects. >> That said, it certainly appears that for your given data, you can write >> a sed filter that can reprocess on a line-by-line basis, and feed that >> into join, without the penalty of having to re-sort the entire file and >> without having to have the processed file stored in your file system all >> at once. It also seems possible to write a post filter to get back to >> the style of the line in the original file. Here, extensions such as bash's >> join <(infilter file1) <(infilter file2) | outfilter >> make it easier to type (where the trick is to now write the correct sed >> scripts to serve as infilter and outfilter) than the alternative of >> having to use named fifos for limiting yourself to just POSIX semantics. > > Hum, isn't such number conversion filtering exactly what numfmt > wasn't designed for? But wait ... > > $ numfmt --field 1 --format='%020f' < f2 > 99980081 1 > 100002129 1 > 100002136 2 > 100002162 3 > > ... it doesn't support leading zeros, unfortunately. ;-/ > Wouldn't this be a nice enhancement? I've needed this a few times so I added it in the attached. thanks, Pádraig. 
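For readers who want to try the new modifier, here is a small sketch that mirrors the "%06f" case from the test suite in the attached patch. It assumes the installed numfmt already contains this change; older builds do not produce the leading zeros.

```shell
# Assumes a numfmt that includes the zero-padding change from this thread.
# A leading 0 in the width asks for zero padding up to that width.
padded=$(numfmt --format='%06f' 1234)
printf '%s\n' "$padded"
```

For the join problem earlier in the thread, the same modifier could presumably serve as the input filter, for instance "numfmt --field 1 --format='%012f' < file2", where the width of 12 and the file name are assumptions and the exact spacing of the remaining fields has not been checked here.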
--------------050800080003030005090203 Content-Type: text/x-patch; name="numfmt-leading-zeros.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="numfmt-leading-zeros.patch" >From a6c62fba1867f1df9b8d2a8c9264606784fddc77 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Wed, 30 Apr 2014 15:05:15 +0100 Subject: [PATCH] numfmt: support zero padding using --format="%010f" * src/numfmt.c (setup_padding_buffer): Simplify the code by not explicitly dealing with heap exhaustion. (parse_format_string): Likewise. Handle multiple grouping modifiers as does the standard printf. Handle the new leading zero --format modifier. (double_to_human): Use more defensive coding against overwriting stack buffers. Honor the leading zeros width. (usage): Mention the leading zero --format modifier. (main): Allow --padding in combo with a --format (width), as the number of leading zeros are useful independent of the main field width. * doc/coreutils.texi (numfmt invocation): Likewise. * tests/misc/numfmt.pl: Add new test cases. * NEWS: Mention the improvement. --- NEWS | 3 ++ doc/coreutils.texi | 6 ++- src/numfmt.c | 94 ++++++++++++++++++++++++++++++++------------------ tests/misc/numfmt.pl | 25 +++++++++++--- 4 files changed, 87 insertions(+), 41 deletions(-) diff --git a/NEWS b/NEWS index 7855a48..904aace 100644 --- a/NEWS +++ b/NEWS @@ -66,6 +66,9 @@ GNU coreutils NEWS -*- outline -*- causing name look-up errors. Also look-ups are first done outside the chroot, in case the look-up within the chroot fails due to library conflicts etc. + numfmt supports zero padding of numbers using the standard --printf + syntax of a leading zero, for example --format="%010f". 
+
   shred now supports multiple passes on GNU/Linux tape devices by rewinding
   the tape before each pass, avoids redundant writes to empty files,
   uses direct I/O for all passes where possible, and attempts to clear
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 12002fc..a949ffc 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2293,10 +2293,12 @@ Convert the number in input field @var{n} (default: 1).
 @item --format=@var{format}
 @opindex --format
 Use printf-style floating FORMAT string. The @var{format} string must contain
-one @samp{%f} directive, optionally with @samp{'}, @samp{-}, or width
+one @samp{%f} directive, optionally with @samp{'}, @samp{-}, @samp{0}, or width
 modifiers. The @samp{'} modifier will enable @option{--grouping}, the @samp{-}
 modifier will enable left-aligned @option{--padding} and the width modifier will
-enable right-aligned @option{--padding}.
+enable right-aligned @option{--padding}. The @samp{0} width modifier
+(without the @samp{-} modifier) will generate leading zeros on the number,
+up to the specified width.
 
@item --from=@var{unit} @opindex --from diff --git a/src/numfmt.c b/src/numfmt.c index 63411f3..c744875 100644 --- a/src/numfmt.c +++ b/src/numfmt.c @@ -169,6 +169,7 @@ static int grouping = 0; static char *padding_buffer = NULL; static size_t padding_buffer_size = 0; static long int padding_width = 0; +static long int zero_padding_width = 0; static const char *format_str = NULL; static char *format_str_prefix = NULL; static char *format_str_suffix = NULL; @@ -272,7 +273,7 @@ suffix_power (const char suf) } static inline const char * -suffix_power_character (unsigned int power) +suffix_power_char (unsigned int power) { switch (power) { @@ -705,6 +706,21 @@ double_to_human (long double val, int precision, char *buf, size_t buf_size, enum scale_type scale, int group, enum round_type round) { + int num_size; + char fmt[64]; + verify (sizeof (fmt) > (INT_BUFSIZE_BOUND (zero_padding_width) + + INT_BUFSIZE_BOUND (precision) + + 10 /* for %.Lf etc. */)); + + char *pfmt = fmt; + *pfmt++ = '%'; + + if (group) + *pfmt++ = '\''; + + if (zero_padding_width) + pfmt += snprintf (pfmt, sizeof (fmt) - 1, "0%ld", zero_padding_width); + devmsg ("double_to_human:\n"); if (scale == scale_none) @@ -717,9 +733,10 @@ double_to_human (long double val, int precision, " no scaling, returning (grouped) value: %'.*Lf\n" : " no scaling, returning value: %.*Lf\n", precision, val); - int i = snprintf (buf, buf_size, (group) ? "%'.*Lf" : "%.*Lf", - precision, val); - if (i < 0 || i >= (int) buf_size) + stpcpy (pfmt, ".*Lf"); + + num_size = snprintf (buf, buf_size, fmt, precision, val); + if (num_size < 0 || num_size >= (int) buf_size) error (EXIT_FAILURE, 0, _("failed to prepare value '%Lf' for printing"), val); return; @@ -761,11 +778,16 @@ double_to_human (long double val, int precision, devmsg (" after rounding, value=%Lf * %0.f ^ %d\n", val, scale_base, power); - snprintf (buf, buf_size, (show_decimal_point) ? 
"%.1Lf%s" : "%.0Lf%s", - val, suffix_power_character (power)); + stpcpy (pfmt, show_decimal_point ? ".1Lf%s" : ".0Lf%s"); + + /* buf_size - 1 used here to ensure place for possible scale_IEC_I suffix. */ + num_size = snprintf (buf, buf_size - 1, fmt, val, suffix_power_char (power)); + if (num_size < 0 || num_size >= (int) buf_size - 1) + error (EXIT_FAILURE, 0, + _("failed to prepare value '%Lf' for printing"), val); if (scale == scale_IEC_I && power > 0) - strncat (buf, "i", buf_size - strlen (buf) - 1); + strncat (buf, "i", buf_size - num_size - 1); devmsg (" returning value: %s\n", quote (buf)); @@ -798,10 +820,7 @@ setup_padding_buffer (size_t min_size) return; padding_buffer_size = min_size + 1; - padding_buffer = realloc (padding_buffer, padding_buffer_size); - if (!padding_buffer) - error (EXIT_FAILURE, 0, _("out of memory (requested %zu bytes)"), - padding_buffer_size); + padding_buffer = xrealloc (padding_buffer, padding_buffer_size); } void @@ -906,8 +925,8 @@ UNIT options:\n"), stdout); fputs (_("\n\ FORMAT must be suitable for printing one floating-point argument '%f'.\n\ Optional quote (%'f) will enable --grouping (if supported by current locale).\n\ -Optional width value (%10f) will pad output. Optional negative width values\n\ -(%-10f) will left-pad output.\n\ +Optional width value (%10f) will pad output. Optional zero (%010f) width\n\ +will zero pad the number. 
Optional negative values (%-10f) will left align.\n\ "), stdout); printf (_("\n\ @@ -967,6 +986,7 @@ parse_format_string (char const *fmt) size_t suffix_pos; long int pad = 0; char *endptr = NULL; + bool zero_padding = false; for (i = 0; !(fmt[i] == '%' && fmt[i + 1] != '%'); i += (fmt[i] == '%') + 1) { @@ -977,13 +997,24 @@ parse_format_string (char const *fmt) } i++; - i += strspn (fmt + i, " "); - if (fmt[i] == '\'') + while (true) { - grouping = 1; - i++; + size_t skip = strspn (fmt + i, " "); + i += skip; + if (fmt[i] == '\'') + { + grouping = 1; + i++; + } + else if (fmt[i] == '0') + { + zero_padding = true; + i++; + } + else if (! skip) + break; } - i += strspn (fmt + i, " "); + errno = 0; pad = strtol (fmt + i, &endptr, 10); if (errno == ERANGE) @@ -992,6 +1023,9 @@ parse_format_string (char const *fmt) if (endptr != (fmt + i) && pad != 0) { + if (debug && padding_width && !(zero_padding && pad > 0)) + error (0, 0, _("--format padding overridding --padding")); + if (pad < 0) { padding_alignment = MBS_ALIGN_LEFT; @@ -999,8 +1033,12 @@ parse_format_string (char const *fmt) } else { - padding_width = pad; + if (zero_padding) + zero_padding_width = pad; + else + padding_width = pad; } + } i = endptr - fmt; @@ -1009,7 +1047,7 @@ parse_format_string (char const *fmt) if (fmt[i] != 'f') error (EXIT_FAILURE, 0, _("invalid format %s," - " directive must be %%['][-][N]f"), + " directive must be %%[0]['][-][N]f"), quote (fmt)); i++; suffix_pos = i; @@ -1020,19 +1058,9 @@ parse_format_string (char const *fmt) quote (fmt)); if (prefix_len) - { - format_str_prefix = xstrndup (fmt, prefix_len); - if (!format_str_prefix) - error (EXIT_FAILURE, 0, _("out of memory (requested %zu bytes)"), - prefix_len + 1); - } + format_str_prefix = xstrndup (fmt, prefix_len); if (fmt[suffix_pos] != '\0') - { - format_str_suffix = strdup (fmt + suffix_pos); - if (!format_str_suffix) - error (EXIT_FAILURE, 0, _("out of memory (requested %zu bytes)"), - strlen (fmt + suffix_pos)); - } + 
format_str_suffix = xstrdup (fmt + suffix_pos); devmsg ("format String:\n input: %s\n grouping: %s\n" " padding width: %ld\n alignment: %s\n" @@ -1462,8 +1490,6 @@ main (int argc, char **argv) if (format_str != NULL && grouping) error (EXIT_FAILURE, 0, _("--grouping cannot be combined with --format")); - if (format_str != NULL && padding_width > 0) - error (EXIT_FAILURE, 0, _("--padding cannot be combined with --format")); /* Warn about no-op. */ if (debug && scale_from == scale_none && scale_to == scale_none diff --git a/tests/misc/numfmt.pl b/tests/misc/numfmt.pl index ca3c896..dfb4b2e 100755 --- a/tests/misc/numfmt.pl +++ b/tests/misc/numfmt.pl @@ -695,11 +695,11 @@ my @Tests = {EXIT=>1}], ['fmt-err-4', '--format "%d"', {ERR=>"$prog: invalid format '%d', " . - "directive must be %['][-][N]f\n"}, + "directive must be %[0]['][-][N]f\n"}, {EXIT=>1}], ['fmt-err-5', '--format "% -43 f"', {ERR=>"$prog: invalid format '% -43 f', " . - "directive must be %['][-][N]f\n"}, + "directive must be %[0]['][-][N]f\n"}, {EXIT=>1}], ['fmt-err-6', '--format "%f %f"', {ERR=>"$prog: format '%f %f' has too many % directives\n"}, @@ -708,9 +708,6 @@ my @Tests = {ERR=>"$prog: invalid format '%123456789012345678901234567890f'". " (width overflow)\n"}, {EXIT=>1}], - ['fmt-err-8', '--format "%f" --padding 20', - {ERR=>"$prog: --padding cannot be combined with --format\n"}, - {EXIT=>1}], ['fmt-err-9', '--format "%f" --grouping', {ERR=>"$prog: --grouping cannot be combined with --format\n"}, {EXIT=>1}], @@ -748,6 +745,17 @@ my @Tests = ['fmt-15', '--format "--%100000f--" --to=si 4200', {OUT=>"--" . " " x 99996 . 
"4.2K--" }], + # --format padding overrides --padding + ['fmt-16', '--format="%6f" --padding=66 1234',{OUT=>" 1234"}], + + # zero padding + ['fmt-17', '--format="%06f" 1234',{OUT=>"001234"}], + # also support spaces (which are ignored as spacing is handled separately) + ['fmt-18', '--format="%0 6f" 1234',{OUT=>"001234"}], + # handle generic padding in combination + ['fmt-22', '--format="%06f" --padding=7 1234',{OUT=>" 001234"}], + ['fmt-23', '--format="%06f" --padding=-7 1234',{OUT=>"001234 "}], + ## Check all errors again, this time with --invalid=fail ## Input will be printed without conversion, @@ -881,6 +889,13 @@ my @Locale_Tests = ['lcl-fmt-4', '--format "--%-10f--" --to=si 5000000', {OUT=>"--5,0M --"}, {ENV=>"LC_ALL=$locale"}], + # handle zero/grouping in combination + ['lcl-fmt-5', '--format="%\'06f" 1234',{OUT=>"01 234"}, + {ENV=>"LC_ALL=$locale"}], + ['lcl-fmt-6', '--format="%0\'6f" 1234',{OUT=>"01 234"}, + {ENV=>"LC_ALL=$locale"}], + ['lcl-fmt-7', '--format="%0\'\'6f" 1234',{OUT=>"01 234"}, + {ENV=>"LC_ALL=$locale"}], ); if ($locale ne 'C') -- 1.7.7.6 --------------050800080003030005090203-- From debbugs-submit-bounces@debbugs.gnu.org Thu May 01 18:26:16 2014 Received: (at 16468) by debbugs.gnu.org; 1 May 2014 22:26:16 +0000 Received: from localhost ([127.0.0.1]:47759 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WfzQl-0006DC-CO for submit@debbugs.gnu.org; Thu, 01 May 2014 18:26:15 -0400 Received: from mout.kundenserver.de ([212.227.126.187]:60885) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WfzQh-0006Cv-QV for 16468@debbugs.gnu.org; Thu, 01 May 2014 18:26:12 -0400 Received: from [192.168.1.20] (p5084F89F.dip0.t-ipconnect.de [80.132.248.159]) by mrelayeu.kundenserver.de (node=mreue007) with ESMTP (Nemesis) id 0M6c88-1X2u1m3bUK-00wXGy; Fri, 02 May 2014 00:26:03 +0200 Message-ID: <5362C9FA.3020101@bernhard-voelker.de> Date: Fri, 02 May 2014 00:26:02 +0200 From: Bernhard Voelker User-Agent: 
 Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0
MIME-Version: 1.0
To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?=
Subject: Re: bug#16468: join
References: <52D81398.60203@redhat.com> <52D82083.3010309@redhat.com>
 <52D8728A.1010904@bernhard-voelker.de> <53618CEC.9090209@draigBrady.com>
In-Reply-To: <53618CEC.9090209@draigBrady.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Provags-ID: V02:K0:iM93RWWEnL9t1FcpRJJs/MdEf+exFubbKnNqV3wV9TS
 rCWMT3N2VUFQtgkAU6gOxagy7yCD7jxSUS7U9K3y3++Qk9Pgf+
 3QwOeUOdHfiPZvYKFXUhuop35nPoPZ3l6nvNyc6Y9d5iOGtqTs
 b7OClkOOcr7BBuNHQz+fXyoy5clsr2I5ooJlEf7OpUWYNGqln/
 sK8uGe/Px8U8o1SS7vTtqc40JGVOkLxJQ/QXFfKE8u5gqfJDxi
 XiRCMNw50iHID9lsEpmPd8kpN/6MNwI/e9GgT2NMlfm4osTM1r
 yVwzc7Ohy5P7uMWhD55WTVIqyVj6suNeMJY1z7eLDtdJCz8bOm
 GTm2agWEpNiyp2AkRK8NNZPCcWb27S5Yg2J55ilTa
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 16468
Cc: barry kesner , Eric Blake , 16468@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.0 (/)

On 05/01/2014 01:53 AM, Pádraig Brady wrote:
> I added it in the attached.

Thanks, great stuff.

> diff --git a/NEWS b/NEWS
> index 7855a48..904aace 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -66,6 +66,9 @@ GNU coreutils NEWS -*- outline -*-
>    causing name look-up errors. Also look-ups are first done outside the chroot,
>    in case the look-up within the chroot fails due to library conflicts etc.
>
> +  numfmt supports zero padding of numbers using the standard --printf
> +  syntax of a leading zero, for example --format="%010f".
> +

s/--printf/printf/

> diff --git a/src/numfmt.c b/src/numfmt.c
> index 63411f3..c744875 100644
> --- a/src/numfmt.c
> +++ b/src/numfmt.c
...
> @@ -992,6 +1023,9 @@ parse_format_string (char const *fmt)
>
>    if (endptr != (fmt + i) && pad != 0)
>      {
> +      if (debug && padding_width && !(zero_padding && pad > 0))
> +        error (0, 0, _("--format padding overriding --padding"));
> +

In --debug mode, it seems odd that the format with the new
zero-padding does not lead to a warning ...

$ src/numfmt --debug --format="%09f" --padding=2 1234
000001234

while a format without does:

$ src/numfmt --debug --format="%9f" --padding=2 1234
src/numfmt: --format padding overriding --padding
     1234

+1 otherwise.

Thanks & have a nice day,
Berny

From debbugs-submit-bounces@debbugs.gnu.org Thu May 01 20:59:01 2014
Received: (at 16468) by debbugs.gnu.org; 2 May 2014 00:59:01 +0000
Received: from localhost ([127.0.0.1]:47820 helo=debbugs.gnu.org)
 by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from )
 id 1Wg1ob-0003Fu-1D for submit@debbugs.gnu.org; Thu, 01 May 2014 20:59:01 -0400
Received: from mail6.vodafone.ie ([213.233.128.184]:12663)
 by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from )
 id 1Wg1oY-0003Fa-82 for 16468@debbugs.gnu.org; Thu, 01 May 2014 20:58:59 -0400
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ApQBAGvtYlNtTMOA/2dsb2JhbAANTYcTvmSDD4EqgxkBAQEDASMPAUYFCwsNAQoCAgUWCwICCQMCAQIBRQYNAQcBARaIHw2lF3ejdBeBKo0oB4JvgUoBA4VHmnKFZYk0
Received: from unknown (HELO [192.168.1.79]) ([109.76.195.128])
 by mail3.vodafone.ie with ESMTP; 02 May 2014 01:58:50 +0100
Message-ID: <5362EDCA.6050407@draigBrady.com>
Date: Fri, 02 May 2014 01:58:50 +0100
From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?=
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
MIME-Version: 1.0
To: Bernhard Voelker
Subject: Re: bug#16468: join
References: <52D81398.60203@redhat.com> <52D82083.3010309@redhat.com>
 <52D8728A.1010904@bernhard-voelker.de> <53618CEC.9090209@draigBrady.com>
 <5362C9FA.3020101@bernhard-voelker.de>
In-Reply-To: <5362C9FA.3020101@bernhard-voelker.de>
X-Enigmail-Version: 1.6
Content-Type:
 text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 16468
Cc: barry kesner , Eric Blake , 16468@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.0 (/)

On 05/01/2014 11:26 PM, Bernhard Voelker wrote:
> On 05/01/2014 01:53 AM, Pádraig Brady wrote:
>> I added it in the attached.
>
> Thanks, great stuff.
>
>> diff --git a/NEWS b/NEWS
>> index 7855a48..904aace 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -66,6 +66,9 @@ GNU coreutils NEWS -*- outline -*-
>>    causing name look-up errors. Also look-ups are first done outside the chroot,
>>    in case the look-up within the chroot fails due to library conflicts etc.
>>
>> +  numfmt supports zero padding of numbers using the standard --printf
>> +  syntax of a leading zero, for example --format="%010f".
>> +
>
> s/--printf/printf/

done

>> diff --git a/src/numfmt.c b/src/numfmt.c
>> index 63411f3..c744875 100644
>> --- a/src/numfmt.c
>> +++ b/src/numfmt.c
> ...
>> @@ -992,6 +1023,9 @@ parse_format_string (char const *fmt)
>>
>>    if (endptr != (fmt + i) && pad != 0)
>>      {
>> +      if (debug && padding_width && !(zero_padding && pad > 0))
>> +        error (0, 0, _("--format padding overriding --padding"));
>> +
>
> In --debug mode, it seems odd that the format with the new
> zero-padding does not lead to a warning ...
>
> $ src/numfmt --debug --format="%09f" --padding=2 1234
> 000001234

In this case the number of leading zeros and --padding are separate.
Since they're zero padded you can freely move around the numbers
in a field like:

  numfmt --header --field=2 --format="%010f" --padding=-15 < /proc/interrupts

> while a format without does:
>
> $ src/numfmt --debug --format="%9f" --padding=2 1234
> src/numfmt: --format padding overriding --padding
>      1234

Here the --padding is overridden hence the warning.

thanks for the review!
I've now pushed it.

Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 11 18:15:48 2018
Received: (at control) by debbugs.gnu.org; 11 Oct 2018 22:15:48 +0000
Received: from localhost ([127.0.0.1]:45654 helo=debbugs.gnu.org)
 by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from )
 id 1gAjFE-0002PD-5i for submit@debbugs.gnu.org; Thu, 11 Oct 2018 18:15:48 -0400
Received: from mail-pf1-f181.google.com ([209.85.210.181]:39391)
 by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from )
 id 1gAjFC-0002Ow-D5 for control@debbugs.gnu.org; Thu, 11 Oct 2018 18:15:46 -0400
Received: by mail-pf1-f181.google.com with SMTP id c25-v6so5102440pfe.6
 for ; Thu, 11 Oct 2018 15:15:46 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=to:from:message-id:date:user-agent:mime-version:content-language
 :content-transfer-encoding;
 bh=aCMPf+7kyoxN+rBTekFZz3oWrRjDvMuc6vTuYT21Ayk=;
 b=jQSTj1oIOh/kU+0QgvVa9jAf+WwO4i158LaHlPg5KcDpTdIIMVsexf46xGQWHnYLM8
 +zReHcVThWS5DKY2PtiaYQGoq0/qHoDTNNNQ7BTSHUqQwf5HTyh5a6RobQNQugQFtydJ
 Rt2cbxuXfVQHEozp/uVR62CVqhNPR6aaaYGnlGU20ZoIjt4uN5rnGh2BcOMDV5upRjvo
 6dwGWMSXjJC8ddSe+KJfFeLw6KnDTzBrUeAiZj247qIw5PooKZAmz9jkgQfx2LmRWPJC
 98xDJquUSuqjjWj758gJwUlPLaWqlEn+H43oIvxpqV7epCT+uuams8HD7MkjZy1uo3/8
 nwXQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:to:from:message-id:date:user-agent:mime-version
 :content-language:content-transfer-encoding;
 bh=aCMPf+7kyoxN+rBTekFZz3oWrRjDvMuc6vTuYT21Ayk=;
 b=XAcJmAEOey9bb2WzHT6+srgmRjRzR4cDJbuTkbhLNQxG/XSET2F5+c/jjtle2ISd0V
YsQW4UwM1Z1NbdhHaf5HYWoZ0bZ6A7oO7bzRAWXn4000cNozruOCjhAqqpwg2RAkePfP CboI+2L8zn9621bO6EApDM3/7ZZVOJt7Ca3HO7PZ4EXgZXAorXCKS93G/O64ggjklOud 3XKuBNBnYabnsHdE00C3fOPEUO9WVB+MFDzrf7X37xHyNiFxeTz24yLJJxTZ5gEYlMWQ s0j/PGkwzaDWVWEWjs5iKC2GQvuq9P92h+5Y5lCllSnlj8/pChugcBAQtQhPz3RZnIbG h/8A== X-Gm-Message-State: ABuFfogJB04BZVDOwb0VfdxCQKygSQpusXbFwqFvlObU9/zrOBl+BIHK lIYX2UAUTb546doifRv5Qh/++IfM X-Google-Smtp-Source: ACcGV63/p1ZxnixeZlKyAhWdRojntCOIqhMBML/uGDJBsBc9xOlhDy1myTnczOQ7SPvTJ1b7zRMyWA== X-Received: by 2002:a62:3384:: with SMTP id z126-v6mr3277051pfz.85.1539296139769; Thu, 11 Oct 2018 15:15:39 -0700 (PDT) Received: from tomato.housegordon.com (moose.housegordon.com. [184.68.105.38]) by smtp.googlemail.com with ESMTPSA id t15-v6sm66768319pfj.7.2018.10.11.15.15.37 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 11 Oct 2018 15:15:38 -0700 (PDT) To: control@debbugs.gnu.org From: Assaf Gordon Message-ID: Date: Thu, 11 Oct 2018 16:15:36 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: 2.0 (++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: tags 15308 notabug close 15308 tags 15634 notabug close 15634 tags 16004 notabug severity 16004 wishlist close 16004 [...] 
 Content analysis details: (2.0 points, 10.0 required)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
 -0.0 SPF_PASS               SPF: sender matches SPF record
  0.0 FREEMAIL_FROM          Sender email is commonly abused enduser mail provider
                             (assafgordon[at]gmail.com)
 -0.0 RCVD_IN_DNSWL_NONE     RBL: Sender listed at http://www.dnswl.org/, no trust
                             [209.85.210.181 listed in list.dnswl.org]
  0.0 RCVD_IN_MSPIKE_H3      RBL: Good reputation (+3)
                             [209.85.210.181 listed in wl.mailspike.net]
  1.8 MISSING_SUBJECT        Missing Subject: header
  0.2 NO_SUBJECT             Extra score for no subject
  0.0 RCVD_IN_MSPIKE_WL      Mailspike good senders
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)

tags 15308 notabug
close 15308
tags 15634 notabug
close 15634
tags 16004 notabug
severity 16004 wishlist
close 16004
tags 16245 notabug
close 16245
tags 16249 notabug
close 16249
tags 16249 notabug
close 16249
close 16309
tags 16468 notabug
close 16468
tag 16530 notabug
close 16530
tags 16718 notabug
close 16718
tags 16742 +moreinfo
close 16742
tags 16831 wontfix
close 16831
tags 16838 wontfix
close 16838
tags 16872 fixed
close 16872
close 16945
close 17224
tags 17503 + notabug
close 17503
close 17546
tags 17904 notabug
close 17904

From unknown Fri Jun 20 07:16:10 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 09 Nov 2018 12:24:08 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
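
The padding behaviour discussed in this thread (the zero padding
requested via --format is applied to the number itself, and --padding
then pads the already zero-padded result) can be sketched in C roughly
as below.  This is only an illustrative sketch and not code from
numfmt.c: the helper name format_padded and its parameters are
hypothetical, it handles plain integers only, and it ignores grouping,
unit suffixes and multibyte alignment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of numfmt.c).
   Step 1: zero-pad VALUE to at least ZERO_WIDTH characters.
   Step 2: pad the result with spaces to |PADDING| characters,
           right-aligned if PADDING > 0, left-aligned if PADDING < 0.
   Returns the length written to OUT, or -1 if a buffer is too small.  */
int
format_padded (char *out, size_t out_size, long value,
               int zero_width, int padding)
{
  char digits[64];
  int n;

  assert (zero_width >= 0);

  /* The '0' flag with a '*' width gives the leading zeros.  */
  n = snprintf (digits, sizeof digits, "%0*ld", zero_width, value);
  if (n < 0 || (size_t) n >= sizeof digits)
    return -1;

  /* Space padding is applied to the zero-padded string as a whole.  */
  if (padding < 0)
    n = snprintf (out, out_size, "%-*s", -padding, digits);
  else
    n = snprintf (out, out_size, "%*s", padding, digits);
  if (n < 0 || (size_t) n >= out_size)
    return -1;

  return n;
}
```

With zero_width 6 and padding 7 the value 1234 comes out as " 001234",
and with padding -7 as "001234 ", which is what tests fmt-22 and fmt-23
in the patch expect from numfmt itself.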