GNU bug report logs - #10924
Updated patch for Bug #6366

Previous Next

Package: coreutils;

Reported by: Drew Frank <drewfrank <at> gmail.com>

Date: Fri, 2 Mar 2012 07:21:02 UTC

Severity: wishlist

Tags: patch

Merged with 6366, 12264

To reply to this bug, email your comments to 10924 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-coreutils <at> gnu.org:
bug#10924; Package coreutils. (Fri, 02 Mar 2012 07:21:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Drew Frank <drewfrank <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 02 Mar 2012 07:21:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Drew Frank <drewfrank <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: Updated patch for Bug #6366
Date: Thu, 1 Mar 2012 20:05:57 -0800
[Message part 1 (text/plain, inline)]
Hi,

Bug #6366 was reported over a year ago to address a deficiency in "join":
it is unable to join on fields that are sorted numerically (rather than
lexicographically). The bug report has a patch attached -- I applied it to
the current Git head, cleaned up a few things, and added a couple tests.

In response to previous discussions on the issue, my two cents:
* "sort" includes several other sorting criteria, but they're not likely to
be useful in this context.
* This patch doesn't implement the functionality of sort's "-g" option, so
"-n" is appropriate.
* If two values have different string representations but compare as equal
due to lack of precision...well, the person doing the join should be aware
of the limits of floating point representations and use the "sort, join,
sort -n" strategy. I doubt this would come up much in practice though.

The lack of this option seems like an unusually conspicuous wart and I'd
love to see it addressed. Please let me know if I can be of any more help
to that effect.

Best,
Drew Frank
[Message part 2 (text/html, inline)]
[join.numeric.sort.diff (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#10924; Package coreutils. (Wed, 21 Mar 2012 17:21:01 GMT) Full text and rfc822 format available.

Message #8 received at 10924 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Drew Frank <drewfrank <at> gmail.com>
Cc: 10924 <at> debbugs.gnu.org
Subject: Re: bug#10924: Updated patch for Bug #6366
Date: Wed, 21 Mar 2012 16:50:11 +0000
forcemerge 6366 10924




Information forwarded to bug-coreutils <at> gnu.org:
bug#10924; Package coreutils. (Wed, 21 Mar 2012 18:28:02 GMT) Full text and rfc822 format available.

Message #11 received at 10924 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Drew Frank <drewfrank <at> gmail.com>
Cc: 10924 <at> debbugs.gnu.org
Subject: Re: bug#10924: Updated patch for Bug #6366
Date: Wed, 21 Mar 2012 10:59:21 -0600
[Message part 1 (text/plain, inline)]
On 03/01/2012 09:05 PM, Drew Frank wrote:
> Hi,
> 
> Bug #6366 was reported over a year ago to address a deficiency in "join":
> it is unable to join on fields that are sorted numerically (rather than
> lexicographically). The bug report has a patch attached -- I applied it to
> the current Git head, cleaned up a few things, and added a couple tests.
> 
> In response to previous discussions on the issue, my two cents:
> * "sort" includes several other sorting criteria, but they're not likely to
> be useful in this context.
> * This patch doesn't implement the functionality of sort's "-g" option, so
> "-n" is appropriate.
> * If two values have different string representations but compare as equal
> due to lack of precision...well, the person doing the join should be aware
> of the limits of floating point representations and use the "sort, join,
> sort -n" strategy. I doubt this would come up much in practice though.
> 
> The lack of this option seems like an unusually conspicuous wart and I'd
> love to see it addressed. Please let me know if I can be of any more help
> to that effect.
> 
> Best,
> Drew Frank

> * src/join.c: add new flags and implement numeric comparison feature.
> * tests/misc/join: add two tests for numerically sorted key fields.
> This patch is based on code written by Alex Shinn
> <Alex Shinn <at> gmail.com>
> ---
>  src/join.c      |   22 +++++++++++++++++++---
>  tests/misc/join |    6 ++++++
>  2 files changed, 25 insertions(+), 3 deletions(-)

Missing documentation in NEWS and doc/coreutils.texi.

Based on just the diffstat, this patch is non-trivial, so we'd need
copyright assignment to the FSF from both you and from Alex Shinn, as
co-authors of this material.  Is that something you are interested in
pursuing?  I'm refraining from reviewing the patch itself, in case we
need to come up with a clean-room reimplementation.

-- 
Eric Blake   eblake <at> redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#10924; Package coreutils. (Wed, 21 Mar 2012 20:52:01 GMT) Full text and rfc822 format available.

Message #14 received at 10924 <at> debbugs.gnu.org (full text, mbox):

From: Drew Frank <drewfrank <at> gmail.com>
To: Eric Blake <eblake <at> redhat.com>
Cc: 10924 <at> debbugs.gnu.org
Subject: Re: bug#10924: Updated patch for Bug #6366
Date: Wed, 21 Mar 2012 13:20:18 -0700
[Message part 1 (text/plain, inline)]
On Wed, Mar 21, 2012 at 9:59 AM, Eric Blake <eblake <at> redhat.com> wrote:
>
> Missing documentation in NEWS and doc/coreutils.texi.
>
> Based on just the diffstat, this patch is non-trivial, so we'd need
> copyright assignment to the FSF from both you and from Alex Shinn, as
> co-authors of this material.  Is that something you are interested in
> pursuing?  I'm refraining from reviewing the patch itself, in case we
> need to come up with a clean-room reimplementation.

Apologies for creating this mess on the mailing list -- the patch you
glanced at was the first of two that I sent (see bug #6366 for the
second). I'm including a third here. Compared to the first, I've made
the following changes:
- Added documentation (both NEWS and doc/coreutils.texi).
- Added tests.
- Reimplemented the numeric comparison functionality by copying code
from sort.c.
Due to the reimplementation, just about all of Alex's patch has been
replaced (except for the bits where there's obviously one right way to
do it, e.g. parsing the new options). Also, since I copied most of the
code from sort.c, there's very little truly original code in the whole
patch (mostly old code blocks in new places). If a copyright
assignment is needed, I'd be happy to provide that.

Drew
[join_numeric_sort_3.diff (text/plain, attachment)]

Forcibly Merged 6366 10924 12264. Request was from era eriksson <era <at> iki.fi> to control <at> debbugs.gnu.org. (Thu, 30 Aug 2012 08:08:02 GMT) Full text and rfc822 format available.

Severity set to 'wishlist' from 'normal' Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 09 Oct 2018 20:20:02 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 260 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.