GNU bug report logs - #32472
sort doesn't sort and uniq loses data for many non-Latin scripts on UTF-8 locales

Package: coreutils

Reported by: Vaayda Yaasra <vaaydayaasra <at> gmail.com>

Date: Sat, 18 Aug 2018 16:05:02 UTC

Severity: normal

Tags: notabug

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


Message #15 received at control <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Vaayda Yaasra <vaaydayaasra <at> gmail.com>, 32472 <at> debbugs.gnu.org
Subject: Re: bug#32472: sort doesn't sort and uniq loses data for many
 non-Latin scripts on UTF-8 locales
Date: Mon, 29 Oct 2018 21:54:59 -0600

tags 32472 notabug
close 32472
stop


On 2018-08-18 11:34 a.m., Paul Eggert wrote:
> Vaayda Yaasra wrote:
>> Here’s an example in Syriac:
>>
>> ܡܠܬܐ
>> ܒܝܬܐ
>> ܒܪܢܫܐ
>> ܡܠܬܐ
>>
>> Sort produces the following:
>>
>> ܡܠܬܐ
>> ܒܝܬܐ
>> ܡܠܬܐ
>> ܒܪܢܫܐ
> 
> This is a property of your locale, so I suggest sending a bug report to 
> whoever maintains your locale. You should be able to reproduce the 
> problem by bypassing GNU 'sort' entirely and using the C strcoll function.
> 
> For what it's worth, I observe the problem on Ubuntu 18.04 but not on 
> Fedora 28. As Fedora tends to be more up-to-date, perhaps the problem is 
> fixed already in glibc.

Given the above, and with no further comments,
I'm closing this bug.

-assaf
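
Paul Eggert's suggestion above, to reproduce the behaviour without GNU 'sort' by calling strcoll directly, could be sketched roughly as follows. This is a minimal illustration only: it uses Python's locale module, which wraps the C library's collation routines, rather than calling the C function itself. The helper names (with_collation, collate_sort, collation_ties) are made up for this example and do not come from coreutils or glibc. The result under a UTF-8 locale depends on the installed locale data, so no expected output is shown.

```python
import functools
import locale

# The four Syriac words from the original report.
WORDS = ["ܡܠܬܐ", "ܒܝܬܐ", "ܒܪܢܫܐ", "ܡܠܬܐ"]


def with_collation(locale_name, func):
    """Run func() with LC_COLLATE temporarily set to locale_name."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return func()
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


def collate_sort(words, locale_name=""):
    """Sort words using strcoll under the given locale ("" = environment)."""
    key = functools.cmp_to_key(locale.strcoll)
    return with_collation(locale_name, lambda: sorted(words, key=key))


def collation_ties(words, locale_name=""):
    """Return pairs of distinct strings that strcoll reports as equal."""
    distinct = sorted(set(words))

    def find():
        return [(a, b) for i, a in enumerate(distinct)
                for b in distinct[i + 1:] if locale.strcoll(a, b) == 0]

    return with_collation(locale_name, find)


if __name__ == "__main__":
    try:
        # Uses the locale from the environment (LANG, LC_ALL, LC_COLLATE).
        print("sorted:", collate_sort(WORDS))
        print("ties:  ", collation_ties(WORDS))
    except locale.Error as exc:
        print("could not set locale from environment:", exc)
```

If the list printed under "ties" is non-empty, the locale's collation data treats distinct strings as equal, which would point at the locale rather than at 'sort' or 'uniq'. Running the same script with LC_ALL=C in the environment gives a plain code-point ordering to compare against.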




This bug report was last modified 6 years and 289 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.