GNU bug report logs - #6643
[PATCH] sort: fix a bug with sort -u and xmemcoll0, and tune keycompare

Previous Next

Package: coreutils;

Reported by: Paul Eggert <eggert <at> CS.UCLA.EDU>

Date: Thu, 15 Jul 2010 23:34:02 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> CS.UCLA.EDU>
Cc: 6643 <at> debbugs.gnu.org, chenguo4 <at> yahoo.com
Subject: bug#6643: [PATCH] sort: fix a bug with sort -u and xmemcoll0, and tune keycompare
Date: Fri, 16 Jul 2010 01:46:29 +0100
On 16/07/10 00:33, Paul Eggert wrote:
> Well, after my brave words about xmemcoll0 unlikely to be misused in 'sort'
> I found a misuse of it.  With sort -u, write_bytes replaces the trailing NUL
> with a newline but it doesn't change it back after writing.  In some cases
> "sort -u" will output a line and then later use it as an argument for
> comparison, which will mess up if the trailing NUL has been replaced.
> I pushed the following patch to work around the problem.

That looks correct.
Sorry for not spotting it.
I had intended to look at the -u paths but didn't
when I considered that all the -u tests had passed.
But I now realize that they're run under LC_ALL=C
I'll add a few basic tests I think to run under $mb_locale

> Chen, can you please verify that "sort -u" does not access the same
> line from multiple threads, even saved lines?  If it does, then even this patch
> is not enough: we would need to alter write_bytes so that it does not
> modify its argument at all.
> 
> This patch also tunes keycompare slightly.  I should have
> broken it up into a different patch, but it sort of snuck in.

I didn't do that because then you're just doing what xmemcoll()
is doing anyway.  I.E. the other xmemcoll0() is more cache
efficient because it doesn't need to write anything in the
normal case for each comparison.

> By the way, any objection if I put 'const' after the types that it
> qualifies, e.g., "char const *" rather than "const char *"?  That
> was the usual style here.

Yep that's fine.

cheers,
Pádraig.




This bug report was last modified 14 years and 41 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.