GNU bug report logs - #18291: Unix Sort Bug Report
Reported by: NTENTOS STAVROS <ntentos <at> inf.uth.gr>
Date: Mon, 18 Aug 2014 15:36:02 UTC
Severity: normal
Tags: notabug
Done: Eric Blake <eblake <at> redhat.com>
Bug is archived. No further changes may be made.
On 08/18/2014 09:57 AM, Pádraig Brady wrote:
> On 08/18/2014 09:55 AM, NTENTOS STAVROS wrote:
>>
>> Hello developers,
>>
>> Recently, while using the sort utility, I ran into a problem. While I cannot disclose the file in question, I will try to explain the issue:
>> On a Windows-created file (line ending: \r\n) I ran sort, which moved the last entry to an earlier position in the output. The last line did not have a line ending of any kind, and sort added a Unix-like ending (\r), which afterwards causes a parsing problem with the file.
>
> Well, a \n is actually inserted, not \r, but yes, that is a problem on Windows.
> This demonstrates the behavior:
>
> $ printf '2\r\n1' | sort | od -Ax -tx1z -v
> 000000 31 0a 32 0d 0a >1.2..<
>
> The \n is inserted so as to delimit the reordered item appropriately,
> which is set here:
>
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/sort.c;h=c2493192;hb=HEAD#l178
>
> It seems that this should be set to '\r\n' on cygwin builds
> (with other adjustments to handle multiple chars).
If the file was opened in text mode, then sort only sees \n line endings
on input (cygwin already shortened \r\n to \n before handing the line to
sort), and on output all \n are automatically converted back to \r\n.
If the file was opened in binary mode, then cygwin CANNOT second guess
what line endings you wanted. It sounds like your file lives on a
binary mount point, when you want it to live on a text mount point
instead; at which point cygwin should do the right thing (although I
admit I did not actually try this on cygwin, because I seldom use cygwin
text mounts). But that is probably more a question for cygwin
downstream, not for upstream coreutils (POSIX requires that text and
binary file modes be identical, so any system like cygwin where they
differ is already non-POSIX, which raises the question of whether
pushing upstream fixes for a downstream-only problem is maintainable).
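
To make the two cases concrete, here is a small Python model of the
behavior described above. It is only a sketch: sort_binary and sort_text
are hypothetical names, the byte-wise comparison assumes something like
the C locale, and the \r\n translation is a simplification of what a
text-mode stream is described as doing. None of this is code from
coreutils or cygwin.

```python
def sort_binary(data: bytes) -> bytes:
    # Model of sort reading a binary-mode stream: only LF delimits lines,
    # so a CR before the LF stays attached to its line.
    if not data:
        return b""
    lines = data.split(b"\n")
    if lines[-1] == b"":
        lines.pop()  # input ended with LF; drop the empty tail
    # Every output line is terminated with LF, including a final input
    # line that had no terminator at all.
    return b"".join(line + b"\n" for line in sorted(lines))


def sort_text(data: bytes) -> bytes:
    # Model of the same sort behind a text-mode stream (assumed behavior):
    # CRLF is shortened to LF on input and LF is expanded to CRLF on output.
    unix = data.replace(b"\r\n", b"\n")
    return sort_binary(unix).replace(b"\n", b"\r\n")


if __name__ == "__main__":
    sample = b"2\r\n1"  # same input as the printf example in this thread
    print(sort_binary(sample))
    print(sort_text(sample))
```

With the sample input, the binary-mode model yields b'1\n2\r\n', the
same bytes as the od output quoted earlier (31 0a 32 0d 0a), while the
text-mode model yields b'1\r\n2\r\n' with consistent line endings.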
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org