On 04/05/2014 01:19 PM, Nikos Balkanas wrote:
>>
>> No, earlier distributions merely defaulted to LC_ALL=C instead of
>> LC_ALL=en_US.UTF-8. This complaint is the same as your previous one,
>> and the solution is the same - if you want sorting by bytes, then ensure
>> that your locale is set to C rather than en_US.UTF-8.
>>
> Thank you all. As I explained in my previous mail, an update of the man
> pages is essential. A change in the UI would also be desirable,
> if the standards allow it. Sorry about my attitude, but I was getting
> pretty desperate. Thanks for not flaming.
>
> To make up for it, I will look into updating the man pages ;-)

But the man page ALREADY says this:

    *** WARNING *** The locale specified by the environment affects sort
    order. Set LC_ALL=C to get the traditional sort order that uses native
    byte values.

What more are you proposing?

>
> A suggestion. I think that sort should sort text based on the LOCALE of
> the file, not the system. Couldn't it detect automatically from the text,
> whether it is dealing with UTF-8 or ISO?

Unfortunately, no, this is not possible. You're welcome to try and write
a patch to prove me wrong, but people have already had years of
experience of using environment variables as the way to tell a program
what encoding an input file uses, precisely because there is no other
obvious way of determining a file's locale.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
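
For illustration only, here is a minimal Python sketch of the two points
made above. It does not call sort; it only mimics the byte-value ordering
that sort uses under LC_ALL=C, and it shows why the bytes of a file do not
reveal their encoding. The function names byte_order_sort and
decode_both_ways are made up for this example.

```python
def byte_order_sort(lines):
    """Sort strings by their encoded byte values.

    This mimics the ordering sort uses under LC_ALL=C for UTF-8 input:
    plain byte comparison, so all uppercase ASCII letters come before
    all lowercase ones.
    """
    return sorted(lines, key=lambda s: s.encode("utf-8"))


def decode_both_ways(raw):
    """Decode the same bytes as UTF-8 and as ISO-8859-1.

    Every byte sequence is valid ISO-8859-1, so valid UTF-8 input always
    has a second, equally "legal" reading. Nothing in the bytes says
    which one the author meant.
    """
    return raw.decode("utf-8"), raw.decode("iso-8859-1")


if __name__ == "__main__":
    words = ["banana", "Apple", "apple", "Banana"]
    print(byte_order_sort(words))
    # ['Apple', 'Banana', 'apple', 'banana']

    raw = b"\xc3\xa9"  # the UTF-8 encoding of e-acute
    as_utf8, as_latin1 = decode_both_ways(raw)
    print(len(as_utf8), len(as_latin1))
    # 1 2  (one character as UTF-8, two characters as ISO-8859-1)
```

A locale such as en_US.UTF-8 typically interleaves upper and lower case
instead, which is the difference the man page warning describes. The
second function shows the ambiguity behind the "not possible" answer: a
tool can guess the encoding from the text, but it cannot know it, which
is why the environment is used to state it.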