GNU bug report logs - #28539
different outputs of sort


Package: coreutils

Reported by: "He, Yihan" <yihan.he <at> intel.com>

Date: Thu, 21 Sep 2017 15:24:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.



Message #12 received at 28539-done <at> debbugs.gnu.org:

From: Eric Blake <eblake <at> redhat.com>
To: "He, Yihan" <yihan.he <at> intel.com>, 28539-done <at> debbugs.gnu.org
Cc: "Ji, Zhenlong Z" <zhenlong.z.ji <at> intel.com>, "Wang,
 Tao W" <tao.w.wang <at> intel.com>
Subject: Re: bug#28539: different outputs of sort
Date: Thu, 21 Sep 2017 10:36:12 -0500
tag 28539 notabug
thanks

On 09/21/2017 01:56 AM, He, Yihan wrote:
> Hi,
> 
> We are facing an issue when using sort. Could you please help us understand why? Thanks a lot.

Thanks for your report.  However, I suspect that the difference you are
seeing is the result of a FAQ:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
namely, that you have different locale settings between your two machines.
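
One way to compare the two machines is to look at the locale variables
that sort consults; running `locale` on each server prints them. The
sketch below is only illustrative: the helper name `effective_collate`
is hypothetical, and it merely mirrors the usual POSIX precedence
(LC_ALL, then LC_COLLATE, then LANG, falling back to C). It does not
check whether the named locale is actually installed.

```shell
# Hypothetical helper: report which locale name governs collation,
# following the POSIX precedence LC_ALL > LC_COLLATE > LANG > "C".
# Empty variables are treated the same as unset ones.
effective_collate() {
  printf '%s\n' "${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}"
}

# Run this on each server and compare the results.
effective_collate
```

If the two servers print different names here, differing sort output is
the expected result rather than a defect.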

> 
> We ran the command below on two servers but got different outputs. I attached the source file (property_contexts) and the two output files (sort_property_context_local.txt & sort_property_context_slave.txt).
> $ sort -u property_contexts -o sort_property_context.txt
> 
> We checked that both servers use the same version of sort:
> local:~ $ sort --version
> sort (GNU coreutils) 8.21

Note that this version of sort includes 'sort --debug', which is useful
in diagnosing issues like this.

Here's a trimmed-down example that produces similar behavior for me,
with the locale being explicit:

$ printf 'a\n# b\nc\n' | LC_ALL=C sort --debug
sort: using simple byte comparison
# b
___
a
_
c
_

$ printf 'a\n# b\nc\n' | LC_ALL=en_US.utf8 sort --debug
sort: using ‘en_US.utf8’ sorting rules
a
_
# b
___
c
_


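That is, in the en_US.utf8 locale, the collation rules tell sort to
ignore punctuation and case on the first comparison pass (they act only
as tie-breakers), so '# b' collates as 'b' and lands between 'a' and
'c', whereas the C locale sorts by strict byte order and puts '#' (0x23)
ahead of the letters.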
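
If identical output on both servers is what matters, one option is to
pin the locale for the sort invocation itself. This is a minimal sketch;
the wrapper name `sort_bytes` is hypothetical, and it assumes that
byte-order (C locale) collation is acceptable for the data.

```shell
# Hypothetical wrapper: run sort with the C locale forced, so the result
# does not depend on the LANG/LC_* settings of the calling environment.
sort_bytes() {
  LC_ALL=C sort "$@"
}

# Same three-line input as in the examples above; the output is in byte
# order on any machine: "# b", then "a", then "c".
printf 'a\n# b\nc\n' | sort_bytes -u
```

Applied to the original command, that would be
`sort_bytes -u property_contexts -o sort_property_context.txt`, or
equivalently the same sort command prefixed with `LC_ALL=C`.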

As such, I'm tagging this issue as not a bug, but feel free to follow up
with further information, including any evidence you might have that
there is a real bug once locales have been accounted for.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


This bug report was last modified 7 years and 239 days ago.
