On Fri, Aug 31, 2018 at 11:59 AM, Eric Blake <eblake@redhat.com> wrote:
tag 32603 notabug
thanks


On 08/31/2018 11:44 AM, Paul Eggert wrote:
"sort --help" says:

*** WARNING ***
The locale specified by the environment affects sort order.
Set LC_ALL=C to get the traditional sort order that uses
native byte values.

and that's what you have run into.

To expound on Paul's answer:

> $ sort <foo
> t.co
> tec.co
> te.co

Let's run that with --debug to make it obvious:

$ printf 't.co\ntec.co\nte.co\n' | sort --debug
sort: using ‘en_US.UTF-8’ sorting rules
t.co
____
tec.co
______
te.co
_____

and realize that en_US.UTF-8 is a locale where punctuation is ignored when determining collation order (thus, 'tco' < 'tecco' < 'teco' once you strip out the ignored '.').
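
For comparison, here is a small sketch of the two orderings using only the C locale, so it does not depend on en_US.UTF-8 being installed. The `tr -d '.'` step is an illustrative stand-in for "punctuation is ignored", not what the locale's collation tables literally do, and the variable names `sorted_c` and `stripped` are made up for this example:

```shell
# Byte-value ordering: LC_ALL=C applies to this one sort invocation only.
sorted_c=$(printf 't.co\ntec.co\nte.co\n' | LC_ALL=C sort)
printf '%s\n' "$sorted_c"
# t.co
# te.co
# tec.co

# Rough emulation of a punctuation-ignoring comparison: drop the '.'
# first, then sort the remaining characters by byte value.
stripped=$(printf 't.co\ntec.co\nte.co\n' | tr -d '.' | LC_ALL=C sort)
printf '%s\n' "$stripped"
# tco
# tecco
# teco
```

In the C locale '.' (0x2E) sorts before 'c' (0x63), so te.co comes before tec.co; with the '.' removed, 'tecco' sorts before 'teco', which matches the order the original report saw.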


I keep seeing these sort "bugs" pop up; they seem to be very popular. Would the default behavior ever be seen as needing to change?

I'm not sure why I'd want to ignore special characters by default, for example...

Cheers,
    R0b0t1