GNU bug report logs - #9780
sort -u throws out non-duplicates


Package: coreutils

Reported by: Bernhard Rosenkraenzer <bero <at> bero.eu>

Date: Tue, 18 Oct 2011 01:04:02 UTC

Severity: normal

Tags: moreinfo

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.



Message #61 received at 9780 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 9780 <at> debbugs.gnu.org, Rasmus Borup Hansen <rbh <at> intomics.com>
Subject: Re: bug#9780: sort -u throws out non-duplicates
Date: Fri, 17 Aug 2012 21:36:32 +0200

Paul Eggert wrote:
> On 08/16/2012 02:03 PM, Jim Meyering wrote:
>> * src/sort.c (saved_line): New static/global, renamed and moved from...
>> (write_unique): ...here.
>
> I see a couple of problems with this patch.  Pedantically,
> the behavior of 'overlap' is undefined on hosts that
> use a segmented architecture, because '<=' is not reliable
> on pointers into different buffers.  (I have the vague recollection
> that some compilers even rely on this to generate faster code
> on flat architectures....)

I pushed the change seconds before your message arrived.
But that's probably best.  If you can change it to do the job
reliably even on fringe systems, that would be welcome.
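
To make the portability concern concrete, here is a minimal sketch of
an overlap test that avoids applying '<=' directly to pointers into
different buffers.  It is not the 'overlap' function from the patch:
the name ranges_overlap and its signature are hypothetical, and it
assumes a host that provides uintptr_t and where converting a pointer
to an integer preserves address order.  That holds on flat
architectures but is not guaranteed by the C standard on segmented ones.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the 'overlap' from the patch.
   Return true if the byte ranges [a, a + alen) and [b, b + blen)
   share at least one byte.  The pointers are compared as integers,
   so no relational operator is applied to pointers into different
   objects.  Assumption: the integer values reflect address order.  */
static bool
ranges_overlap (void const *a, size_t alen, void const *b, size_t blen)
{
  uintptr_t a0 = (uintptr_t) a;
  uintptr_t b0 = (uintptr_t) b;

  /* An empty range cannot share a byte with anything.  */
  if (alen == 0 || blen == 0)
    return false;

  /* Two half-open intervals intersect exactly when each one
     starts before the other one ends.  */
  return a0 < b0 + blen && b0 < a0 + alen;
}
```

A caller would pass the start and length of the saved line and of the
region about to be overwritten, and copy the line only when the result
is true.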

> More importantly, suppose the
> buffer is reallocated (because it grows)?  Won't 'overlap'
> do the wrong thing after that?

How?  The first time the safe_text buffer is allocated
it will have to be disjoint from the line.text buffer
and from the buffer into which we're about to fread.
Thereafter, regardless of reallocation, overlap should
always be false.
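
The disjointness argument above can be sketched as follows.  This is
an illustration only, not the code in src/sort.c: struct saved and
save_copy are hypothetical names, and the sketch assumes the caller
never passes text that already lives in the save buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for a privately owned copy of one line.  */
struct saved
{
  char *text;     /* heap buffer owned by this struct, or NULL */
  size_t length;  /* bytes currently stored */
  size_t alloc;   /* bytes allocated */
};

/* Copy LENGTH bytes from TEXT into S's own buffer, growing it if
   needed.  Return 0 on success, -1 if allocation fails.  The buffer
   comes from realloc, so it is a distinct object from the caller's
   input buffer; growing it later yields another distinct object.  */
static int
save_copy (struct saved *s, char const *text, size_t length)
{
  if (s->alloc < length)
    {
      char *p = realloc (s->text, length);
      if (!p)
        return -1;
      s->text = p;
      s->alloc = length;
    }
  if (length)
    memcpy (s->text, text, length);
  s->length = length;
  return 0;
}
```

Once the line lives in such a buffer, refilling or overwriting the
input buffer no longer affects it, which is why the overlap test is
expected to stay false from then on.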

> And it'd be nice if we didn't
> have to worry about making a copy of that line.

It appears that the need to copy a line (overlap)
is very rare, in practice.  If you find a way to avoid it,
it seems like it would have to be small and simple to be
worthwhile.

> I'll see if I can come up with something that addresses these
> objections.

Thanks!
And thanks for the review.




This bug report was last modified 12 years and 278 days ago.


