From unknown Sat Jun 21 05:11:48 2025
Subject: bug#9580: sort 8.5 bug?
Resent-From: Sean Sun
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Thu, 22 Sep 2011 21:46:01 +0000
X-GNU-PR-Message: report 9580
X-GNU-PR-Package: coreutils
To: 9580@debbugs.gnu.org
X-Debbugs-Original-To: Bug-coreutils@gnu.org
Message-ID: <32503840.post@talk.nabble.com>
Date: Thu, 22 Sep 2011 13:55:05 -0700 (PDT)
From: Sean Sun
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

#########################################################
Ubuntu 11.04
2.6.38-11-generic-pae

sort --version
sort (GNU coreutils) 8.5
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.
##########################################################

I created two testing files: File_A and File_B.

cat File_A
.
BAD.

sort File_A
.
BAD.

cat File_B
.s
BAD.s

sort File_B
BAD.s
.s

So basically, appending a letter after ‘.’ would reverse the sort order.
That doesn't look quite right. Is there an explanation for this behavior?
I've tried the same on a Mac, and their sort (5.93) works just fine.

I've also tried set LC_ALL='C'. Just in case it's a funky locale problem,
but didn't make a difference.

-- 
View this message in context: http://old.nabble.com/sort-8.5-bug--tp32503840p32503840.html
Sent from the Gnu - Coreutils - Discuss mailing list archive at Nabble.com.

From debbugs-submit-bounces@debbugs.gnu.org Thu Sep 22 18:02:15 2011
Message-ID: <4E7BB048.9070801@redhat.com>
Date: Thu, 22 Sep 2011 16:01:44 -0600
From: Eric Blake
Organization: Red Hat
MIME-Version: 1.0
To: Sean Sun
Subject: Re: bug#9580: sort 8.5 bug?
References: <32503840.post@talk.nabble.com>
In-Reply-To: <32503840.post@talk.nabble.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Cc: 9580-done@debbugs.gnu.org

tag 9580 notabug
thanks

On 09/22/2011 02:55 PM, Sean Sun wrote:

> So basically, appending a letter after ‘.’ would reverse the sort order.
> That doesn't look quite right. Is there an explanation for this behavior?
> I've tried the same on a Mac, and their sort (5.93) works just fine.

Thanks for the report, but this is not a bug in sort.  Actually, both
versions that you tried (8.5 and 5.93) sort in the same way, where the
difference is in your choice of locale, and you are hitting this FAQ:

https://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021

Newer coreutils added a --debug option to help you learn why the bug is
in your expectations and not in sort (8.13 is current, but --debug has
been present since 8.6).  So let's use it:

$ printf '.\nBAD.\n.s\nBAD.s\n' | sort --debug
sort: using `en_US.UTF-8' sorting rules
.
_
BAD.
____
BAD.s
_____
.s
__
$ printf '.\nBAD.\n.s\nBAD.s\n' | LC_ALL=C sort --debug
sort: using simple byte comparison
.
_
.s
__
BAD.
____
BAD.s
_____

Remember, the en_US.UTF-8 locale uses dictionary order collation, which
treats punctuation as insignificant, and blends case.  That is, 's' and
'.s' collate as the same string, and '.s' is larger than 'BAD.'
since 's' comes later in the alphabet than 'B'.

On the other hand, the C locale uses ASCII ordering, where every byte is
significant, and '.' sorts before 'B'.

>
> I've also tried set LC_ALL='C'. Just in case it's a funky locale problem,
> but didn't make a difference.

Are you sure you used the correct syntax?  The way you wrote it, it
looks like you tried:

$ set LC_ALL='C'

But that is neither sh (export LC_ALL=C) nor csh (setenv LC_ALL C)
syntax.  And your problem is absolutely explained by locale, and would
indeed be "solved" if you indeed had set LC_ALL=C like you meant to do.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

From unknown Sat Jun 21 05:11:48 2025
MIME-Version: 1.0
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Sean Sun
Subject: bug#9580: closed (Re: bug#9580: sort 8.5 bug?)
References: <4E7BB048.9070801@redhat.com> <32503840.post@talk.nabble.com>
X-Gnu-PR-Message: they-closed 9580
X-Gnu-PR-Package: coreutils
X-Gnu-PR-Keywords: notabug
Reply-To: 9580@debbugs.gnu.org
Date: Thu, 22 Sep 2011 22:03:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1316728982-22379-1"

This is a multi-part message in MIME format...

------------=_1316728982-22379-1
Content-Disposition: inline
Content-Type: text/plain; charset="utf-8"

Your bug report

#9580: sort 8.5 bug?

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 9580@debbugs.gnu.org.

-- 
9580: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=9580
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1316728982-22379-1--
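
For readers following the thread, here is a minimal sketch of the two points in Eric Blake's reply, assuming a POSIX sh-compatible shell (such as bash or dash) and a `sort` on the PATH. The variable names `input`, `first_arg`, and `c_sorted` are illustrative and do not come from the thread. It shows what `set LC_ALL='C'` actually does in sh, and the two forms that do change the locale `sort` sees.

```shell
# The same four lines used in the demonstration from the reply.
input=$(printf '.\nBAD.\n.s\nBAD.s\n')

# In sh, `set NAME=value` does not assign a variable. It replaces the
# positional parameters, so "$1" becomes the literal string LC_ALL=C
# and the environment seen by sort is unchanged.
set LC_ALL='C'
first_arg=$1
printf 'positional parameter 1 is: %s\n' "$first_arg"

# Per-command override: only this sort invocation runs with LC_ALL=C,
# so lines are compared byte by byte and '.' sorts before 'B'.
c_sorted=$(printf '%s\n' "$input" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"

# Session-wide override for sh-compatible shells
# (csh users would write: setenv LC_ALL C).
export LC_ALL=C
printf '%s\n' "$input" | sort
```

With LC_ALL=C in effect, both sort invocations should list `.` and `.s` ahead of `BAD.` and `BAD.s`, matching the byte-comparison output shown in the reply. The per-command form is usually the safer choice in scripts, since it leaves the locale of the rest of the session alone.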