GNU bug report logs - #19142
sort not working with LANG set to language_country.encoding


Package: coreutils;

Reported by: Roland Sieker <ospalh <at> gmail.com>

Date: Fri, 21 Nov 2014 16:49:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Eric Blake <eblake <at> redhat.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#19142: closed (sort not working with LANG set to
 language_country.encoding)
Date: Fri, 21 Nov 2014 17:00:05 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 21 Nov 2014 09:59:20 -0700
with message-id <546F6F68.1040004 <at> redhat.com>
and subject line Re: bug#19142: sort not working with LANG set to language_country.encoding
has caused the debbugs.gnu.org bug report #19142,
regarding sort not working with LANG set to language_country.encoding
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
19142: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19142
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Roland Sieker <ospalh <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: sort not working with LANG set to language_country.encoding
Date: Fri, 21 Nov 2014 12:24:56 +0100
[Message part 3 (text/plain, inline)]
Hi.

I have noticed that sort seems to have problems when the LANG environment
variable is set with language and country.

As a test case, I tried to sort

a
b
a
⺌
⺕
⺌

It sorts OK like this, with LANG just the language.encoding:
( setenv LANG en.UTF-8 ; echo 'a\nb\na\n⺌\n⺕\n⺌' | sort )
a
a
b
⺌
⺌
⺕

But not with LANG as language_country.encoding:
( setenv LANG en_GB.UTF-8 ; echo 'a\nb\na\n⺌\n⺕\n⺌' | sort )
⺌
⺕
⺌
a
a
b




sort: sort (GNU coreutils) 8.21
Shell: tcsh 6.18.01 (Astron) 2012-02-14 (x86_64-unknown-linux) options wide,nls,dl,al,kan,rh,color,filec
Fedora Linux 20

Regards, ospalh
[Message part 5 (message/rfc822, inline)]
From: Eric Blake <eblake <at> redhat.com>
To: Roland Sieker <ospalh <at> gmail.com>, 19142-done <at> debbugs.gnu.org
Subject: Re: bug#19142: sort not working with LANG set to
 language_country.encoding
Date: Fri, 21 Nov 2014 09:59:20 -0700
[Message part 6 (text/plain, inline)]
tag 19142 notabug
thanks

On 11/21/2014 04:24 AM, Roland Sieker wrote:
> Hi.
> 
> I have noticed that sort seems to have problems when the LANG environment
> variable is set with language and country.
> 

Thanks for the report.  The whole point of locales is that each locale
is free to choose the collation sequences that make the most sense for
that locale.


> It sorts OK like this, with LANG just the language.encoding:
> ( setenv LANG en.UTF-8 ; echo 'a\nb\na\n⺌\n⺕\n⺌' | sort )

[I'm translating your csh syntax into more-reliable sh syntax]
Try turning on sort debugging:

$ printf 'a\nb\na\n⺌\n⺕\n⺌' | LC_ALL=en.UTF-8 sort --debug
sort: using simple byte comparison
a
_
a
_
b
_
⺌
___
⺌
___
⺕
___
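
The "simple byte comparison" line indicates that sort applied no locale-specific collation rules in this run, which is the same behaviour that LC_ALL=C selects explicitly. A minimal sketch for reproducing that ordering regardless of which locales are installed follows; the helper name sort_bytewise is made up for illustration and is not part of coreutils.

```shell
# sort_bytewise is a hypothetical helper name, not part of coreutils.
# LC_ALL=C overrides LANG and every other LC_* variable, so sort
# compares lines byte by byte regardless of the user's locale.
sort_bytewise() {
    LC_ALL=C sort "$@"
}

# Same input as in the report, with a trailing newline added.
printf 'a\nb\na\n⺌\n⺕\n⺌\n' | sort_bytewise
```

In byte order the ASCII letters come before the multi-byte UTF-8 sequences, so this prints a, a, b, ⺌, ⺌, ⺕, matching the first result in the report.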


> But not with LANG as language_country.encoding:

$ printf 'a\nb\na\n⺌\n⺕\n⺌' | LC_ALL=en_GB.UTF-8 sort --debug
sort: using ‘en_GB.UTF-8’ sorting rules
⺌
__
⺕
__
⺌
__
a
_
a
_
b
_


That just means that whoever wrote the en_GB.UTF-8 locale picked a
different collation sequence for non-ASCII characters than the person
that wrote the generic en.UTF-8 locale.  That's not a bug in sort, so
I'm closing this as not a bug from coreutils' perspective.  Feel free to
raise it as a glibc bug (the owner of locale definitions on GNU/Linux
systems) if you have a strong reason why different locales should be
more consistent in their choice of collation sequences.  And feel free
to reply further to this bug with more questions or comments, even
though it has been closed.
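
Before comparing two locales, it can help to confirm that both names are actually known to the system. The sketch below assumes a glibc system where `locale -a` prints one installed locale per line with a normalized codeset (for example en_GB.utf8 for en_GB.UTF-8); the helper name name_in_locale_list is hypothetical.

```shell
# name_in_locale_list is a hypothetical helper, not a standard tool.
# It reads a locale list (one name per line, e.g. from `locale -a`) on
# stdin and succeeds if the name in $1 appears in it. Case and hyphens
# are ignored, since glibc lists en_GB.UTF-8 as en_GB.utf8.
name_in_locale_list() {
    wanted=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')
    tr '[:upper:]' '[:lower:]' | tr -d '-' | grep -Fxq -- "$wanted"
}

# Check the two names used in this report.
for name in en.UTF-8 en_GB.UTF-8; do
    if locale -a 2>/dev/null | name_in_locale_list "$name"; then
        echo "$name: listed by locale -a"
    else
        echo "$name: not listed by locale -a"
    fi
done
```

A name that is not listed cannot be selected by setlocale, in which case a program such as sort stays in the default C locale. That would be consistent with the "simple byte comparison" message in the en.UTF-8 run, though the report itself does not confirm whether en.UTF-8 was installed.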

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


This bug report was last modified 10 years and 218 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.