From unknown Thu Jun 19 14:07:07 2025
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
From: bug#18291 <18291@debbugs.gnu.org>
To: bug#18291 <18291@debbugs.gnu.org>
Subject: Status: Unix Sort Bug Report
Reply-To: bug#18291 <18291@debbugs.gnu.org>
Date: Thu, 19 Jun 2025 21:07:07 +0000

retitle 18291 Unix Sort Bug Report
reassign 18291 coreutils
submitter 18291 NTENTOS STAVROS
severity 18291 normal
tag 18291 notabug
thanks

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 18 11:35:04 2014
Date: Mon, 18 Aug 2014 11:55:21 +0300
Message-ID: <20140818115521.Horde.JnRI2dtzkALhGh6Gh7j28w8@webmail.uth.gr>
From: NTENTOS STAVROS
To: bug-coreutils@gnu.org
Subject: Unix Sort Bug Report
User-Agent: Internet Messaging Program (IMP) H5 (6.1.7)
Content-Type: text/plain; charset=UTF-8; format=flowed; DelSp=Yes
MIME-Version: 1.0

Hello developers,

Recently, using the sort utility I run into an omission. While I cannot
disclose the file in question, I will try to explain the issue:
On a Windows-created file (line ending: \r\n) I tried to perform a
sorting, which happened to sort the last entry somewhere above. The last
line did not have a line ending of any kind, and sort created a
Unix-like ending (\r), which afterwards creates a parsing problem with
the file.

-- 
Ntentos Stavros

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 18 11:57:48 2014
Message-ID: <53F22270.1070304@draigBrady.com>
Date: Mon, 18 Aug 2014 16:57:36 +0100
From: Pádraig Brady
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
MIME-Version: 1.0
To: NTENTOS STAVROS
Cc: 18291@debbugs.gnu.org
Subject: Re: bug#18291: Unix Sort Bug Report
References: <20140818115521.Horde.JnRI2dtzkALhGh6Gh7j28w8@webmail.uth.gr>
In-Reply-To: <20140818115521.Horde.JnRI2dtzkALhGh6Gh7j28w8@webmail.uth.gr>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 08/18/2014 09:55 AM, NTENTOS STAVROS wrote:
>
> Hello developers,
>
> Recently, using the sort utility I run into an omission. While I cannot disclose the file in question, I will try to explain the issue:
> On a Windows-created file (line ending: \r\n) I tried to perform a sorting, which happened to sort the last entry somewhere above. The last line did not have a line ending of any kind, and sort created a Unix-like ending (\r), which afterwards creates a parsing problem with the file.

Well a \n is inserted actually, not \r, but yes that is a problem on windows.
This demonstrates the behavior:

  $ printf '2\r\n1' | sort | od -Ax -tx1z -v
  000000 31 0a 32 0d 0a                                   >1.2..<

The \n is inserted so as to delimit the reordered item appropriately,
which is set here:

  http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/sort.c;h=c2493192;hb=HEAD#l178

It seems that this should be set to '\r\n' on cygwin builds,
(with other adjustments to handle multiple chars).

thanks,
Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 18 12:27:56 2014
Message-ID: <53F22986.7070905@redhat.com>
Date: Mon, 18 Aug 2014 10:27:50 -0600
From: Eric Blake
Organization: Red Hat, Inc.
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0
MIME-Version: 1.0
To: NTENTOS STAVROS, 18291-done@debbugs.gnu.org
Subject: Re: bug#18291: Unix Sort Bug Report
References: <20140818115521.Horde.JnRI2dtzkALhGh6Gh7j28w8@webmail.uth.gr>
In-Reply-To: <20140818115521.Horde.JnRI2dtzkALhGh6Gh7j28w8@webmail.uth.gr>
OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="pkq4ljoerioRXwN3N9NjExAEGrgjMAKg3"

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--pkq4ljoerioRXwN3N9NjExAEGrgjMAKg3
Content-Type: text/plain; charset=UTF-8

tag 18291 notabug
thanks

On 08/18/2014 02:55 AM, NTENTOS STAVROS wrote:
>
> Hello developers,
>
> Recently, using the sort utility I run into an omission. While I cannot
> disclose the file in question, I will try to explain the issue:
> On a Windows-created file (line ending: \r\n) I tried to perform a
> sorting, which happened to sort the last entry somewhere above. The last
> line did not have a line ending of any kind, and sort created a
> Unix-like ending (\r), which afterwards creates a parsing problem with
> the file.

(Unix line ending is \n, not \r)

Per POSIX, sort(1) is only required to operate on text files with one
exception:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html
"The input files shall be text files, except that the sort utility shall
add a <newline> to the end of a file ending with an incomplete last line."

and the POSIX definition of a text file is one that is either empty or
has a trailing newline to begin with:

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
"3.397 Text File"
"A file that contains characters organized into zero or more lines. The
lines do not contain NUL characters and none can exceed {LINE_MAX} bytes
in length, including the <newline> character. Although POSIX.1-2008 does
not distinguish between text files and binary files (see the ISO C
standard), many utilities only produce predictable or meaningful output
when operating on text files. The standard utilities that have such
restrictions always specify "text files" in their STDIN or INPUT FILES
sections."

As such, coreutils is doing what is already required by POSIX, and the
bug is more on you for providing a non-text file without a trailing
newline and expecting sane behavior.  I seriously doubt cygwin can
second-guess your intention to use only windows line endings; you are
better off guaranteeing that you have a text file with the desired line
ending already in place than relying on sort's requirement to add a \n
if the file was not a text file merely because it had an incomplete
last line.

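A minimal sketch of that advice, assuming a POSIX shell and a file that is meant to use \r\n throughout: complete the last line yourself before handing the data to sort, so that sort never has to add its own \n. The helper name complete_crlf is hypothetical and not part of coreutils, and the sketch does not handle a file whose last byte is a lone \r.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a coreutils tool):
# copy a CRLF file to stdout, appending "\r\n" if the last line is incomplete.
complete_crlf() {
    cat -- "$1"
    # tail -c 1 yields the last byte; od renders it as two hex digits.
    last=$(tail -c 1 -- "$1" | od -An -tx1 | tr -d ' \n')
    # A non-empty file whose last byte is not LF (0a) ends with an
    # incomplete line, so finish it with the ending the file already uses.
    if [ -n "$last" ] && [ "$last" != "0a" ]; then
        printf '\r\n'
    fi
}

# Demo with the sample input used earlier in this thread.
tmp=$(mktemp)
printf '2\r\n1' > "$tmp"
complete_crlf "$tmp" | LC_ALL=C sort | od -An -tx1
rm -f "$tmp"
```

With the last line completed, every line sort emits ends in 0d 0a, so the demo prints the bytes 31 0d 0a 32 0d 0a instead of the mixed endings shown in the od output above.
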
-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--pkq4ljoerioRXwN3N9NjExAEGrgjMAKg3
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Public key at http://people.redhat.com/eblake/eblake.gpg

iQEcBAEBCAAGBQJT8imGAAoJEKeha0olJ0NqfRAH/3XzQ4j5h/okb1F3ERRcg8p9
IKE6RgACssR0yENIyZLXMctywzSTrwcxPl3WE/BcEmNA+56F0exH4an2KgxTPdej
1G/whH8B+cR421zIAk0C3kdQoteX+o2g5LiM8mtpISTCPlnE1R5YgXVTV3/YpAa8
5r/g3Adg8UQsXSuZTQDLcEN2mnoNwQuMKBRXhyTSVXyltg80oHkjNsG/FpTP831G
+kecblSuHiF8K937rneWDFEb3sNE17TXE7xlIR+GhdWU/KLsMyED6hrBRu2ZoE0Y
d/2fX4dd1LzAT7ave8hsURhD+ebOns8CYso13cA72ht+Hm7n5h73MdQxTXvQlTo=
=mE0W
-----END PGP SIGNATURE-----

--pkq4ljoerioRXwN3N9NjExAEGrgjMAKg3--

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 18 12:32:19 2014
Message-ID: <53F22A87.2000509@redhat.com>
Date: Mon, 18 Aug 2014 10:32:07 -0600
From: Eric Blake
Organization: Red Hat, Inc.
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0
MIME-Version: 1.0
To: Pádraig Brady, NTENTOS STAVROS
Cc: 18291@debbugs.gnu.org
Subject: Re: bug#18291: Unix Sort Bug Report
References: <20140818115521.Horde.JnRI2dtzkALhGh6Gh7j28w8@webmail.uth.gr>
 <53F22270.1070304@draigBrady.com>
In-Reply-To: <53F22270.1070304@draigBrady.com>
OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="VcG2EvBme26t9X5gShRN6vi3EGcHcO7jv"

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--VcG2EvBme26t9X5gShRN6vi3EGcHcO7jv
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 08/18/2014 09:57 AM, Pádraig Brady wrote:
> On 08/18/2014 09:55 AM, NTENTOS STAVROS wrote:
>>
>> Hello developers,
>>
>> Recently, using the sort utility I run into an omission. While I cannot disclose the file in question, I will try to explain the issue:
>> On a Windows-created file (line ending: \r\n) I tried to perform a sorting, which happened to sort the last entry somewhere above. The last line did not have a line ending of any kind, and sort created a Unix-like ending (\r), which afterwards creates a parsing problem with the file.
>
> Well a \n is inserted actually, not \r, but yes that is a problem on windows.
> This demonstrates the behavior:
>
>   $ printf '2\r\n1' | sort | od -Ax -tx1z -v
>   000000 31 0a 32 0d 0a                                   >1.2..<
>
> The \n is inserted so as to delimit the reordered item appropriately,
> which is set here:
>
>   http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/sort.c;h=c2493192;hb=HEAD#l178
>
> It seems that this should be set to '\r\n' on cygwin builds,
> (with other adjustments to handle multiple chars).

If the file was opened in text mode, then sort only sees \n line endings
on input (cygwin already shortened \r\n to \n before handing the line to
sort), and on output all \n are automatically converted back to \r\n.
If the file was opened in binary mode, then cygwin CANNOT second guess
what line endings you wanted.

It sounds like your file lives on a binary mount point, when you want it
to live on a text mount point instead; at which point cygwin should do
the right thing (although I admit I did not actually try this on cygwin,
because I seldom use cygwin text mounts).  But that is probably more a
question for cygwin downstream, not for upstream coreutils (the POSIX
requirement is that text and binary file modes are identical, so any
system like cygwin where they are not is already non-POSIX and starts to
get into a question of whether pushing upstream fixes for a
downstream-only problem is maintainable).

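The text-mode round trip described above can be approximated on a binary mount with an explicit pipeline. This is only a rough emulation under stated assumptions, not how cygwin implements text mounts: the function name sort_like_text_mode is hypothetical, and tr -d '\r' drops every \r in the input, including any stray one in the middle of a line, which real text-mode translation would not do.

```shell
#!/bin/sh
# Hypothetical wrapper (an assumption for illustration): emulate text-mode
# I/O around sort when the data is read and written in binary mode.
sort_like_text_mode() {
    # Input side: remove CR so sort sees only LF line endings.
    # Middle: sort adds the missing LF to an incomplete last line.
    # Output side: write every line back out with a CRLF ending.
    tr -d '\r' | LC_ALL=C sort | awk '{ printf "%s\r\n", $0 }'
}

# Demo with the sample input used earlier in this thread.
printf '2\r\n1' | sort_like_text_mode | od -An -tx1
```

For that sample input the bytes come out as 31 0d 0a 32 0d 0a: the \n that sort adds to the incomplete last line is converted on the output side like any other, so the result has consistent \r\n endings.
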
-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--VcG2EvBme26t9X5gShRN6vi3EGcHcO7jv
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Public key at http://people.redhat.com/eblake/eblake.gpg

iQEcBAEBCAAGBQJT8iqHAAoJEKeha0olJ0NqYsMH/0PE8/Y/kScuOZ58amQ89DFv
y9gB2IQ5RQyYh7150VPmaLdke+h8u7JkCPEeJOAoK8iKN2ZlmxUF5AGcSXZbYwSv
sgw6hPxH6o8rWVTxSDxyGPo6LA/x0voFPxkikamRk5PXDUiqPpSauyEXoz54h3dn
NNBj5VVIDdeG2+Cp5fHmDCUo4RMufqhaZm30yBJth74LO16rGuwy17iZ9Vfx/7e7
IKH2sBRZkWQZRdfogxqIzJcW8gTi4op/kOH/IHspo3k5MPDOiYNKjl9dJJB54ifz
ju22LTMG+b9YB9djTuGdPjyjSLVsCWx6aoYHBjLF20J6ibeqqIZdQigK6hnJnsA=
=xZWE
-----END PGP SIGNATURE-----

--VcG2EvBme26t9X5gShRN6vi3EGcHcO7jv--

From unknown Thu Jun 19 14:07:07 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Tue, 16 Sep 2014 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator