GNU bug report logs - #22155
Wrong char count with UTF8 in sort -k

Package: coreutils;

Reported by: Holger Klene <h.klene <at> gmx.de>

Date: Sat, 12 Dec 2015 22:55:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

Message #13 received at 22155-done <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Holger Klene <h.klene <at> gmx.de>, 22155-done <at> debbugs.gnu.org
Subject: Re: bug#22155: Wrong char count with UTF8 in sort -k
Date: Sun, 13 Dec 2015 02:32:51 +0000
On 13/12/15 01:32, Pádraig Brady wrote:
> On 12/12/15 22:53, Holger Klene wrote:
>>> sort sort.bug.txt -u -s -k 1.20 -b --debug
>> sort: using ‘de_DE.UTF-8’ sorting rules
>> 05. Mär 2015 13:30 ./mess.jpg
>>                    __________
>> 07. Feb 2015 15:57 ./mess.jpg
>>                    __________
>>
>> In fact, it does correct the underlines, but -u still outputs both lines, although I want it to discard the second one. You can add more lines for the same file, but sort insists on keeping exactly two: one with an umlaut and one without.
> 
> That's a bug in --debug because the implementation was split
> from the actual processing done during the sort (for performance reasons).
> Therefore we'll need to fix --debug to show what's actually being done.

Patch attached.

thanks,
Pádraig.
[sort-debug-b.patch (text/x-patch, attachment)]
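
For readers following the thread, here is a minimal sketch of why the two lines might compare as different even though --debug underlines the same text. It assumes (this is not confirmed in the message above) that the character offset in -k 1.20 is counted in bytes rather than characters, so the two-byte "ä" in "Mär" shifts the key start by one. The helpers key_by_chars and key_by_bytes are hypothetical names used only for illustration; they do not reproduce sort's implementation.

```python
# Illustrative model only: compares a character-based and a byte-based
# reading of the key start position 1.20. Not sort's actual code.

def key_by_chars(line: str, start: int) -> str:
    """Return the key starting at 1-based character position `start`."""
    return line[start - 1:]

def key_by_bytes(line: str, start: int) -> str:
    """Return the key starting at 1-based byte position `start` (UTF-8)."""
    raw = line.encode("utf-8")
    return raw[start - 1:].decode("utf-8", errors="replace")

lines = [
    "05. Mär 2015 13:30 ./mess.jpg",  # "ä" occupies two bytes in UTF-8
    "07. Feb 2015 15:57 ./mess.jpg",  # pure ASCII
]

for line in lines:
    # Show both readings of the key side by side.
    print(repr(key_by_chars(line, 20)), repr(key_by_bytes(line, 20)))
```

Under the byte-counting assumption, the key of the "Mär" line begins with a blank while the key of the "Feb" line does not. That would be consistent with -u keeping both lines, if the comparison does not skip that blank while --debug does.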
