From unknown Sat Aug 09 13:00:06 2025 X-Loop: help-debbugs@gnu.org Subject: bug#7961: sort Resent-From: Francesco Bettella Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 02 Feb 2011 14:42:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 7961 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: 7961@debbugs.gnu.org X-Debbugs-Original-To: bug-coreutils@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.129665767727036 (code B ref -1); Wed, 02 Feb 2011 14:42:02 +0000 Received: (at submit) by debbugs.gnu.org; 2 Feb 2011 14:41:17 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkdtU-00071v-Ki for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:41:17 -0500 Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkdYd-0006W9-R1 for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:19:44 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Pkdga-0007ft-DL for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:28:09 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: ** X-Spam-Status: No, score=2.5 required=5.0 tests=BAYES_20, RECEIVED_FROM_WINDOWS_HOST, T_RP_MATCHES_RCVD autolearn=no version=3.3.1 Received: from lists.gnu.org ([199.232.76.165]:44300) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Pkdga-0007fO-7f for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:27:56 -0500 Received: from [140.186.70.92] (port=35264 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PkdgN-0001zo-MC for bug-coreutils@gnu.org; Wed, 02 Feb 2011 09:27:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Pkc29-000740-NH for bug-coreutils@gnu.org; Wed, 02 Feb 2011 07:42:10 -0500 Received: 
from mx0.decode.is ([212.126.224.32]:2113) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Pkc29-00073S-BB for bug-coreutils@gnu.org; Wed, 02 Feb 2011 07:42:05 -0500 Received: from snote.decode.is (Not Verified[172.17.152.10]) by mx0.decode.is id ; Wed, 02 Feb 2011 12:42:01 +0000 Received: from lws14.decode.is ([172.17.112.14]) by snote.decode.is (Lotus Domino Release 8.0.2FP1) with ESMTP id 2011020212420210-87484 ; Wed, 2 Feb 2011 12:42:02 +0000 From: Francesco Bettella Organization: deCODE Date: Wed, 2 Feb 2011 12:42:01 +0000 User-Agent: KMail/1.9.4 MIME-Version: 1.0 Message-Id: <201102021242.01910.francesb@decode.is> X-MIMETrack: Itemize by SMTP Server on DecodeDom/Decode/IS(Release 8.0.2FP1|January 12, 2009) at 02.02.2011 12:42:02, Serialize by Router on DecodeDom/Decode/IS(Release 8.0.2FP1|January 12, 2009) at 02.02.2011 12:42:02, Serialize complete at 02.02.2011 12:42:02 Content-Type: Multipart/Mixed; boundary="Boundary-00=_ZEVSN9Ujbbcj3lK" X-deCODE-Mailmarshal: Check X-detected-operating-system: by eggs.gnu.org: Windows 2000 SP4, XP SP1+ X-Received-From: 212.126.224.32 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2) X-Received-From: 199.232.76.165 X-Spam-Score: -5.9 (-----) X-Mailman-Approved-At: Wed, 02 Feb 2011 09:41:14 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -5.9 (-----) --Boundary-00=_ZEVSN9Ujbbcj3lK Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline hi, I may have bumped into an undesired feature/bug of sort, which appears to be still present in the version 8.9 of coreutils. 
I'm issuing the following sort commands (see attached files): [prompt1] > sort -k 1.4,1n asd1 > asd1.sorted [prompt2] > sort -k 2.4,2n asd2 > asd2.sorted the first one works as I would expect, the second one doesn't. cheers. Francesco --Boundary-00=_ZEVSN9Ujbbcj3lK Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; name="asd1" Content-Disposition: attachment; filename="asd1" chr coding_gene chr1 PRAMEF1 chr1 PRAMEF4 chr1 PPT1 chr1 B4GALT2 chr1 BTBD19 chr1 PPM1J chr1 AMPD1 chr1 HSD3B1 chr1 LY9 chr1 DUSP27 chr1 MYBPH chr1 DISC1 chr1 DISC1 chr10 ANXA7 chr10 KILLIN chr10 RNLS chr10 WNT8B chr10 SEC31B chr11 INSC chr11 PRSS23 chr11 DDX10 chr12 TEAD4 chr12 GXYLT1 chr12 TBX5 chr12 PIWIL1 chr14 C14orf106 chr14 C14orf50 chr15 OR4N4 chr15 AKAP13 chr16 TBC1D10B chr16 FAM38A chr18 HDHD2 chr19 USHBP1 chr19 MLL4 chr19 MEGF8 chr19 SPHK2 chr19 SIGLEC10 chr19 ZNF83 chr19 LILRA6 chr2 TAF1B chr2 KHK chr2 PPM1G chr2 KCTD18 chr21 SIK1 chr22 SGSM1 chr3 CCR5 chr3 SETD2 chr3 ZNF717 chr3 SNX4 chr3 GMPS chr3 C3orf34 chr5 MTRR chr5 SEMA5A chr5 PCDHB16 chr5 CLK4 chr6 C6orf105 chr6 OR5V1 chr6 MUC21 chr6 PKHD1 chr7 CDHR3 chr8 CSMD1 chr8 FER1L6 chr8 KIAA0196 chr9 CNTLN chr9 PLAA chr9 SLC25A25 --Boundary-00=_ZEVSN9Ujbbcj3lK Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; name="asd1.sorted" Content-Disposition: attachment; filename="asd1.sorted" chr coding_gene chr1 AMPD1 chr1 B4GALT2 chr1 BTBD19 chr1 DISC1 chr1 DISC1 chr1 DUSP27 chr1 HSD3B1 chr1 LY9 chr1 MYBPH chr1 PPM1J chr1 PPT1 chr1 PRAMEF1 chr1 PRAMEF4 chr2 KCTD18 chr2 KHK chr2 PPM1G chr2 TAF1B chr3 C3orf34 chr3 CCR5 chr3 GMPS chr3 SETD2 chr3 SNX4 chr3 ZNF717 chr5 CLK4 chr5 MTRR chr5 PCDHB16 chr5 SEMA5A chr6 C6orf105 chr6 MUC21 chr6 OR5V1 chr6 PKHD1 chr7 CDHR3 chr8 CSMD1 chr8 FER1L6 chr8 KIAA0196 chr9 CNTLN chr9 PLAA chr9 SLC25A25 chr10 ANXA7 chr10 KILLIN chr10 RNLS chr10 SEC31B chr10 WNT8B chr11 DDX10 chr11 INSC chr11 PRSS23 chr12 GXYLT1 chr12 PIWIL1 chr12 TBX5 chr12 TEAD4 chr14 
C14orf106 chr14 C14orf50 chr15 AKAP13 chr15 OR4N4 chr16 FAM38A chr16 TBC1D10B chr18 HDHD2 chr19 LILRA6 chr19 MEGF8 chr19 MLL4 chr19 SIGLEC10 chr19 SPHK2 chr19 USHBP1 chr19 ZNF83 chr21 SIK1 chr22 SGSM1 --Boundary-00=_ZEVSN9Ujbbcj3lK Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; name="asd2.sorted" Content-Disposition: attachment; filename="asd2.sorted" AKAP13 chr15 AMPD1 chr1 ANXA7 chr10 B4GALT2 chr1 BTBD19 chr1 C14orf106 chr14 C14orf50 chr14 C3orf34 chr3 C6orf105 chr6 CCR5 chr3 CDHR3 chr7 CLK4 chr5 CNTLN chr9 CSMD1 chr8 DDX10 chr11 DISC1 chr1 DISC1 chr1 DUSP27 chr1 FAM38A chr16 FER1L6 chr8 GMPS chr3 GXYLT1 chr12 HDHD2 chr18 HSD3B1 chr1 INSC chr11 KCTD18 chr2 KHK chr2 KIAA0196 chr8 KILLIN chr10 LILRA6 chr19 LY9 chr1 MEGF8 chr19 MLL4 chr19 MTRR chr5 MUC21 chr6 MYBPH chr1 OR4N4 chr15 OR5V1 chr6 PCDHB16 chr5 PIWIL1 chr12 PKHD1 chr6 PLAA chr9 PPM1G chr2 PPM1J chr1 PPT1 chr1 PRAMEF1 chr1 PRAMEF4 chr1 PRSS23 chr11 RNLS chr10 SEC31B chr10 SEMA5A chr5 SETD2 chr3 SGSM1 chr22 SIGLEC10 chr19 SIK1 chr21 SLC25A25 chr9 SNX4 chr3 SPHK2 chr19 TAF1B chr2 TBC1D10B chr16 TBX5 chr12 TEAD4 chr12 USHBP1 chr19 WNT8B chr10 ZNF717 chr3 ZNF83 chr19 coding_gene chr --Boundary-00=_ZEVSN9Ujbbcj3lK Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; name="asd2" Content-Disposition: attachment; filename="asd2" coding_gene chr PRAMEF1 chr1 PRAMEF4 chr1 PPT1 chr1 B4GALT2 chr1 BTBD19 chr1 PPM1J chr1 AMPD1 chr1 HSD3B1 chr1 LY9 chr1 DUSP27 chr1 MYBPH chr1 DISC1 chr1 DISC1 chr1 ANXA7 chr10 KILLIN chr10 RNLS chr10 WNT8B chr10 SEC31B chr10 INSC chr11 PRSS23 chr11 DDX10 chr11 TEAD4 chr12 GXYLT1 chr12 TBX5 chr12 PIWIL1 chr12 C14orf106 chr14 C14orf50 chr14 OR4N4 chr15 AKAP13 chr15 TBC1D10B chr16 FAM38A chr16 HDHD2 chr18 USHBP1 chr19 MLL4 chr19 MEGF8 chr19 SPHK2 chr19 SIGLEC10 chr19 ZNF83 chr19 LILRA6 chr19 TAF1B chr2 KHK chr2 PPM1G chr2 KCTD18 chr2 SIK1 chr21 SGSM1 chr22 CCR5 chr3 SETD2 chr3 ZNF717 chr3 SNX4 chr3 GMPS chr3 C3orf34 chr3 MTRR chr5 
SEMA5A chr5 PCDHB16 chr5 CLK4 chr5 C6orf105 chr6 OR5V1 chr6 MUC21 chr6 PKHD1 chr6 CDHR3 chr7 CSMD1 chr8 FER1L6 chr8 KIAA0196 chr8 CNTLN chr9 PLAA chr9 SLC25A25 chr9 --Boundary-00=_ZEVSN9Ujbbcj3lK-- From unknown Sat Aug 09 13:00:06 2025 X-Loop: help-debbugs@gnu.org Subject: bug#7961: sort Resent-From: Eric Blake Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 02 Feb 2011 17:36:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 7961 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Francesco Bettella Cc: 7961@debbugs.gnu.org Received: via spool by 7961-submit@debbugs.gnu.org id=B7961.12966681419755 (code B ref 7961); Wed, 02 Feb 2011 17:36:02 +0000 Received: (at 7961) by debbugs.gnu.org; 2 Feb 2011 17:35:41 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkgcG-0002XE-K1 for submit@debbugs.gnu.org; Wed, 02 Feb 2011 12:35:41 -0500 Received: from qmta09.emeryville.ca.mail.comcast.net ([76.96.30.96]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkgcD-0002X2-BF for 7961@debbugs.gnu.org; Wed, 02 Feb 2011 12:35:39 -0500 Received: from omta18.emeryville.ca.mail.comcast.net ([76.96.30.74]) by qmta09.emeryville.ca.mail.comcast.net with comcast id 35Pf1g0041bwxycA95k3SG; Wed, 02 Feb 2011 17:44:03 +0000 Received: from [192.168.0.6] ([24.10.251.25]) by omta18.emeryville.ca.mail.comcast.net with comcast id 35k11g00S0ZdyUg8e5k1UG; Wed, 02 Feb 2011 17:44:02 +0000 Message-ID: <4D4997E0.7020506@redhat.com> Date: Wed, 02 Feb 2011 10:44:00 -0700 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.7 MIME-Version: 1.0 References: <201102021242.01910.francesb@decode.is> In-Reply-To: 
<201102021242.01910.francesb@decode.is> X-Enigmail-Version: 1.1.2 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enigF656B57ED3D6835C9D3335B7" X-Spam-Score: -1.5 (-) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -1.5 (-)

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigF656B57ED3D6835C9D3335B7
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 02/02/2011 05:42 AM, Francesco Bettella wrote:
> hi,
> I may have bumped into an undesired feature/bug of sort, which appears to be
> still present in the version 8.9 of coreutils.

Thanks for the report. However, this is a feature, and not a bug, of sort.

>
> I'm issuing the following sort commands (see attached files):
>
> [prompt1] > sort -k 1.4,1n asd1 > asd1.sorted
>
> [prompt2] > sort -k 2.4,2n asd2 > asd2.sorted

If I'm correct, asd1 and asd2 have the same contents, except that you
have swapped columns 1 and 2 between the two and resorted the lines.
And your desired goal is that the output matches asd1.sorted, again with
the columns swapped for asd2.sorted.

>
> the first one works as I would expect, the second one doesn't.

Let's examine why:

$ head -3 asd1 | sort -k 1.4,1n --debug
sort: using `en_US.UTF-8' sorting rules
sort: leading blanks are significant in key 1; consider also specifying `b'
chr>coding_gene
   ^ no match for key
_______________
chr1>PRAMEF1
   _
____________
chr1>PRAMEF4
   _
____________
$ head -3 asd1 | LC_ALL=C sort -k 1.4,1n --debug
sort: using simple byte comparison
sort: leading blanks are significant in key 1; consider also specifying `b'
chr>coding_gene
   ^ no match for key
_______________
chr1>PRAMEF1
   _
____________
chr1>PRAMEF4
   _
____________

In both cases, when there is no match for a key but numeric sorting was
requested, then that line sorts first; meanwhile, you get the fallback
sort of the complete line after the first key has been sorted, so that
the end result matches asd1.sorted whether you use the C locale or
dictionary sorting.

But notice that warning about not using -b, and how it affects asd2 (and
also, how the difference in dictionary vs. byte-ordering plays a role in
the secondary sorting):

$ head -3 asd2 | sort -k 2.4,2n --debug
sort: using `en_US.UTF-8' sorting rules
sort: leading blanks are significant in key 1; consider also specifying `b'
coding_gene>chr
              ^ no match for key
_______________
PRAMEF1>chr1
          ^ no match for key
____________
PRAMEF4>chr1
          ^ no match for key
____________
$ head -3 asd2 | LC_ALL=C sort -k 2.4,2n --debug
sort: using simple byte comparison
sort: leading blanks are significant in key 1; consider also specifying `b'
PRAMEF1>chr1
          ^ no match for key
____________
PRAMEF4>chr1
          ^ no match for key
____________
coding_gene>chr
              ^ no match for key

But when you add -b (note, b is the one option you have to add to the
start field, since it affects start and end fields specially; all other
options can be added to start, end, or both, and affect the entire key):

$ head -3 asd2 | sort -k 2.4b,2n --debug
sort: using `en_US.UTF-8' sorting rules
coding_gene>chr
               ^ no match for key
_______________
PRAMEF1>chr1
           _
____________
PRAMEF4>chr1
           _
____________
$ head -3 asd2 | LC_ALL=C coreutils/src/sort -k 2.4b,2n --debug
coreutils/src/sort: using simple byte comparison
coding_gene>chr
               ^ no match for key
_______________
PRAMEF1>chr1
           _
____________
PRAMEF4>chr1
           _
____________

That is, your expectations were insufficient - without telling sort
enough additional information, sort correctly followed what you told it
to do, but what you told it was not what you meant. And the --debug
option is your [new] friend :)

-- 
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org

--------------enigF656B57ED3D6835C9D3335B7
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Public key at http://people.redhat.com/eblake/eblake.gpg
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBCAAGBQJNSZfgAAoJEKeha0olJ0NqioIH/07pVriKrt+wDVHvRO5l0Vto
tKiESft6uW3j7yyQdkbP/AQ2KVbUQCBAp+LzqgDmY1ZYx/Dc5wbjUj5Y7KhnnP9R
cnR2vu02sUKJ4/0cSg4hjaSv+nzEdOpjQKTBcP+aIDofmjyortN3RoyjvXQzsdIe
locV0t4U1NxSMfpbM7NiJEpeQxKEFwmkaTHvC94IwrqtQfOITAgqj5nLKYMPHN6Q
YcyLi5x/yH2B60Br93+SE81yE7I7tNO1hRpCpjCK2lnVqAtSynu6RM9Xsk2JWG0R
6ItYIdYlbv4Eb09JHDDTpgtCq3SvLiLXel03hDYlWOig+LomLCKEoVZEtxIYE6g=
=hJlW
-----END PGP SIGNATURE-----

--------------enigF656B57ED3D6835C9D3335B7--

From unknown Sat Aug 09 13:00:06 2025 X-Loop: help-debbugs@gnu.org Subject: bug#7961: sort Resent-From: Assaf Gordon Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 02 Feb 2011 19:04:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 7961 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Francesco Bettella Cc: 7961@debbugs.gnu.org Received: via spool by 7961-submit@debbugs.gnu.org 
id=B7961.129667339420162 (code B ref 7961); Wed, 02 Feb 2011 19:04:03 +0000 Received: (at 7961) by debbugs.gnu.org; 2 Feb 2011 19:03:14 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pkhz0-0005F9-0F for submit@debbugs.gnu.org; Wed, 02 Feb 2011 14:03:14 -0500 Received: from mail-qw0-f44.google.com ([209.85.216.44]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pkhed-00040J-Jp for 7961@debbugs.gnu.org; Wed, 02 Feb 2011 13:42:12 -0500 Received: by qwi2 with SMTP id 2so279344qwi.3 for <7961@debbugs.gnu.org>; Wed, 02 Feb 2011 10:50:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:disposition-notification-to:date :from:user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=oImDb3snW+rN+Tg4YClosfgYlQcAf/jYRL09yBA93Fs=; b=Bt3Vdv3Z/6BAH3AHPvFJnF670d/QkJTyP4J1Rt8Gmtm+0eMwa0tilSFrhDE00opFuT cdig8ub1q66t+DEQTDS5fbYX6pEqs6koMfsaCwDlULRT8ACh1nVGXn/pM9ocXl2WXvwV Iow64aBI3M7xPVoVSfClbD5tRyMjOqTKCDOsQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:disposition-notification-to:date:from:user-agent :mime-version:to:cc:subject:references:in-reply-to:content-type :content-transfer-encoding; b=QGO+9qYD5/BXRdJnPX9PGS8WnU5bc5P52IlGAUUj096t5che3y9jZx8jevAzeO78jR hCuZJLvFq31E3ix8hX/K84DXk+6qvKJGTDx9QVsyIq4YuAVYMmQHYVnynchVFEEPkn8o ywIfx4WNLGDWryA4FguvczRWoztEBJ/+cT0lA= Received: by 10.224.37.78 with SMTP id w14mr8839753qad.215.1296672637468; Wed, 02 Feb 2011 10:50:37 -0800 (PST) Received: from [143.48.11.9] (tango.cshl.edu [143.48.11.9]) by mx.google.com with ESMTPS id y17sm16730818qci.45.2011.02.02.10.50.36 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 02 Feb 2011 10:50:36 -0800 (PST) Message-ID: <4D49A77B.7070202@gmail.com> Date: Wed, 02 Feb 2011 13:50:35 -0500 From: Assaf Gordon User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; 
rv:1.9.2.9) Gecko/20100918 Icedove/3.1.4 MIME-Version: 1.0 References: <201102021242.01910.francesb@decode.is> In-Reply-To: <201102021242.01910.francesb@decode.is> Content-Type: text/plain; charset=windows-1255 Content-Transfer-Encoding: 7bit X-Spam-Score: -3.6 (---) X-Mailman-Approved-At: Wed, 02 Feb 2011 14:03:12 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.6 (---)

On a somewhat off-topic note,

Francesco Bettella wrote, On 02/02/2011 07:42 AM:
>
> I'm issuing the following sort commands (see attached files):
> [prompt1] > sort -k 1.4,1n asd1 > asd1.sorted
> [prompt2] > sort -k 2.4,2n asd2 > asd2.sorted
>
> the first one works as I would expect, the second one doesn't.

When sorting chromosome names, the version sort option (-V, introduced
in coreutils 7.0) sorts as you would expect, saving you the need to skip
three characters in the sort key, and also accommodating mixing letters
and numbers.

Example:

$ cat chrom.txt
chr1
chrUn_gl000232
chrY
chr2
chr13
chrM
chrUn_gl000218
chr6_hap
chr2R
chr16
chr10
chr6_dbb_hap3
chr4
chr3L
chr4_ctg9_hap1
chr3R
chr3
chrX

$ sort -k1,1V chrom.txt
chr1
chr2
chr2R
chr3
chr3L
chr3R
chr4
chr4_ctg9_hap1
chr6_dbb_hap3
chr6_hap
chr10
chr13
chr16
chrM
chrUn_gl000218
chrUn_gl000232
chrX
chrY

-gordon

From unknown Sat Aug 09 13:00:06 2025 X-Loop: help-debbugs@gnu.org Subject: bug#7961: sort Resent-From: Francesco Bettella Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-coreutils@gnu.org Resent-Date: Wed, 02 Feb 2011 19:04:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 7961 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Eric Blake Cc: 7961@debbugs.gnu.org Received: via spool by 7961-submit@debbugs.gnu.org id=B7961.129667340320180 (code B ref 7961); Wed, 02 Feb 2011 19:04:03 +0000 Received: (at 7961) by debbugs.gnu.org; 2 Feb 2011 19:03:23 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pkhz8-0005FP-Bc for submit@debbugs.gnu.org; Wed, 02 Feb 2011 14:03:23 -0500 Received: from mx0.decode.is ([212.126.224.32]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pkht7-00056T-FM for 7961@debbugs.gnu.org; Wed, 02 Feb 2011 13:57:10 -0500 Received: from snote.decode.is (Not Verified[172.17.152.10]) by mx0.decode.is id ; Wed, 02 Feb 2011 19:05:33 +0000 Received: from lws14.decode.is ([172.17.112.14]) by snote.decode.is (Lotus Domino Release 8.0.2FP1) with ESMTP id 2011020219053338-88512 ; Wed, 2 Feb 2011 19:05:33 +0000 From: Francesco Bettella Organization: deCODE Date: Wed, 2 Feb 2011 19:05:33 +0000 User-Agent: KMail/1.9.4 References: <201102021242.01910.francesb@decode.is> <4D4997E0.7020506@redhat.com> In-Reply-To: <4D4997E0.7020506@redhat.com> MIME-Version: 1.0 Message-Id: <201102021905.33371.francesb@decode.is> X-MIMETrack: Itemize by SMTP Server 
on DecodeDom/Decode/IS(Release 8.0.2FP1|January 12, 2009) at 02.02.2011 19:05:33, Serialize by Router on DecodeDom/Decode/IS(Release 8.0.2FP1|January 12, 2009) at 02.02.2011 19:05:33, Serialize complete at 02.02.2011 19:05:33 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="utf-8" Content-Disposition: inline X-deCODE-Mailmarshal: Check X-Spam-Score: -4.3 (----) X-Mailman-Approved-At: Wed, 02 Feb 2011 14:03:21 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.7 (---) thank you very much for your time. and sorry for the trouble. if I understand this right, specifying 'b' in the start field spares me the fallback sort of the complete line. and this actually does the trick. I remain a little in the dark regarding the dictionary vs. byte (POSIX vs. C) ordering. I've tried both on asd2 (without the 'b') with the same result. but I trust you on this one. Francesco P.S.: just got Gordon's reply. thank you for that. On Wed February 2 2011 17:44, Eric Blake wrote: > On 02/02/2011 05:42 AM, Francesco Bettella wrote: > > hi, > > I may have bumped into an undesired feature/bug of sort, which appears to be > > still present in the version 8.9 of coreutils. > > Thanks for the report. However, this is a feature, and not a bug, of sort. > > > > > I'm issuing the following sort commands (see attached files): > > > > [prompt1] > sort -k 1.4,1n asd1 > asd1.sorted > > > > [prompt2] > sort -k 2.4,2n asd2 > asd2.sorted > > If I'm correct, asd1 and asd2 have the same contents, except that you > have swapped columns 1 and 2 between the two and resorted the lines. > And your desired goal is that the output matches asd1.sorted, again with > the columns swapped for asd2.sorted. 
> > > > > the first one works as I would expect, the second one doesn't. > > Let's examine why: > > $ head -3 asd1 | sort -k 1.4,1n --debug > sort: using `en_US.UTF-8' sorting rules > sort: leading blanks are significant in key 1; consider also specifying `b' > chr>coding_gene > ^ no match for key > _______________ > chr1>PRAMEF1 > _ > ____________ > chr1>PRAMEF4 > _ > ____________ > $ head -3 asd1 | LC_ALL=C sort -k 1.4,1n --debug > sort: using simple byte comparison > sort: leading blanks are significant in key 1; consider also specifying `b' > chr>coding_gene > ^ no match for key > _______________ > chr1>PRAMEF1 > _ > ____________ > chr1>PRAMEF4 > _ > ____________ > > In both cases, when there is no match for a key but numeric sorting was > requested, then that line sorts first; meanwhile, you get the fallback > sort of the complete line after the first key has been sorted, so that > the end result matches asd1.sorted whether you use the C locale or > dictionary sorting. > > But notice that warning about not using -b, and how it affects asd2 (and > also, how the difference in dictionary vs. 
byte-ordering plays a role in > the secondary sorting): > > $ head -3 asd2 | sort -k 2.4,2n --debug > sort: using `en_US.UTF-8' sorting rules > sort: leading blanks are significant in key 1; consider also specifying `b' > coding_gene>chr > ^ no match for key > _______________ > PRAMEF1>chr1 > ^ no match for key > ____________ > PRAMEF4>chr1 > ^ no match for key > ____________ > $ head -3 asd2 | LC_ALL=C sort -k 2.4,2n --debug > sort: using simple byte comparison > sort: leading blanks are significant in key 1; consider also specifying `b' > PRAMEF1>chr1 > ^ no match for key > ____________ > PRAMEF4>chr1 > ^ no match for key > ____________ > coding_gene>chr > ^ no match for key > > But when you add -b (note, b is the one option you have to add to the > start field, since it affects start and end fields specially; all other > options can be added to start, end, or both, and affect the entire key): > > $ head -3 asd2 | sort -k 2.4b,2n --debug > sort: using `en_US.UTF-8' sorting rules > coding_gene>chr > ^ no match for key > _______________ > PRAMEF1>chr1 > _ > ____________ > PRAMEF4>chr1 > _ > ____________ > $ head -3 asd2 | LC_ALL=C coreutils/src/sort -k 2.4b,2n --debug > coreutils/src/sort: using simple byte comparison > coding_gene>chr > ^ no match for key > _______________ > PRAMEF1>chr1 > _ > ____________ > PRAMEF4>chr1 > _ > ____________ > > That is, your expectations were insufficient - without telling sort > enough additional information, sort correctly followed what you told it > to do, but what you told it was not what you meant. 
And the --debug > option is your [new] friend :) > > -- > Eric Blake eblake@redhat.com +1-801-349-2682 > Libvirt virtualization library http://libvirt.org > >

From unknown Sat Aug 09 13:00:06 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.427 (Entity 5.427) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Francesco Bettella Subject: bug#7961: closed (Re: bug#7961: sort) Message-ID: References: <4D49E1B6.104@draigBrady.com> <201102021242.01910.francesb@decode.is> X-Gnu-PR-Message: they-closed 7961 X-Gnu-PR-Package: coreutils Reply-To: 7961@debbugs.gnu.org Date: Wed, 02 Feb 2011 22:53:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1296687182-13448-1"

This is a multi-part message in MIME format...

------------=_1296687182-13448-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Your bug report

#7961: sort

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 7961@debbugs.gnu.org.

-- 
7961: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=7961
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1296687182-13448-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

Received: (at 7961-done) by debbugs.gnu.org; 2 Feb 2011 22:52:52 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PklZE-0003UY-K0 for submit@debbugs.gnu.org; Wed, 02 Feb 2011 17:52:52 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PklZB-0003UM-VS for 7961-done@debbugs.gnu.org; Wed, 02 Feb 2011 17:52:50 -0500 Received: (qmail 42561 invoked from network); 2 Feb 2011 23:01:15 -0000 Received: from unknown (HELO ?192.168.2.25?) 
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 2 Feb 2011 23:01:15 -0000 Message-ID: <4D49E1B6.104@draigBrady.com> Date: Wed, 02 Feb 2011 22:59:02 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Eric Blake Subject: Re: bug#7961: sort References: <201102021242.01910.francesb@decode.is> <4D4997E0.7020506@redhat.com> In-Reply-To: <4D4997E0.7020506@redhat.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7961-done Cc: 7961-done@debbugs.gnu.org, Francesco Bettella X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--)

On 02/02/11 17:44, Eric Blake wrote:
> $ head -3 asd2 | LC_ALL=C sort -k 2.4,2n --debug
> sort: using simple byte comparison
> sort: leading blanks are significant in key 1; consider also specifying `b'
> PRAMEF1>chr1
>           ^ no match for key
> ____________
> PRAMEF4>chr1
>           ^ no match for key
> ____________
> coding_gene>chr
>               ^ no match for key
>
> But when you add -b (note, b is the one option you have to add to the
> start field, since it affects start and end fields specially; all other
> options can be added to start, end, or both, and affect the entire key):
>
> $ head -3 asd2 | sort -k 2.4b,2n --debug
> sort: using `en_US.UTF-8' sorting rules
> coding_gene>chr
>                ^ no match for key
> _______________
> PRAMEF1>chr1
>            _

Yep. The 'b' option is one of the main reasons for --debug.
Note, sort --debug will warn until you put it in the right place.

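As an illustration of the point about where 'b' goes, here is a minimal sketch. It uses a hypothetical four-line, tab-separated sample shaped like asd2; the sample rows and the variable names `sample`, `without_b` and `with_b` are assumptions for the example, not the thread's attachments. The locale is pinned to C so the fallback whole-line comparison is plain byte order:

```shell
#!/bin/sh
# Hypothetical sample shaped like asd2: gene name, a tab, then the chromosome.
sample=$(printf 'coding_gene\tchr\nPRAMEF1\tchr1\nANXA7\tchr10\nTAF1B\tchr2\n')

# Without 'b' the leading tab counts as character 1 of field 2, so the key
# starts at "r": no number is found, every key compares equal, and only the
# fallback whole-line comparison decides the order.
without_b=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 2.4,2n)

# With 'b' on the start position the tab is skipped first, so character 4
# is the first digit after "chr" and 'n' compares the chromosome numbers.
with_b=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 2.4b,2n)

printf 'without b:\n%s\n\nwith b:\n%s\n' "$without_b" "$with_b"
```

In the second result the header line comes first because its key is empty, and chr2 sorts before chr10. Appending --debug to either command (GNU sort only) shows which characters were actually used as the key.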
Hmm, I just noticed a bug with --debug, introduced with bdde34f9:

$ printf "A\tchr10\nB\tchr1\n" | ./sort -s --debug -k2.4b,2.3n 2>/dev/null
A>chr10
     __
B>chr1
     _

This should fix it up:

diff --git a/src/sort.c b/src/sort.c
index 06b0d95..365634d 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2214,7 +2214,9 @@ debug_key (struct line const *line, struct keyfield const *key)
 
           char *tighter_lim = beg;
 
-          if (key->month)
+          if (lim < beg)
+            tighter_lim = lim;
+          else if (key->month)
             getmonth (beg, &tighter_lim);
           else if (key->general_numeric)
             ignore_value (strtold (beg, &tighter_lim));

cheers,
Pádraig.

------------=_1296687182-13448-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

Received: (at submit) by debbugs.gnu.org; 2 Feb 2011 14:41:17 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkdtU-00071v-Ki for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:41:17 -0500 Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkdYd-0006W9-R1 for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:19:44 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Pkdga-0007ft-DL for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:28:09 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: ** X-Spam-Status: No, score=2.5 required=5.0 tests=BAYES_20, RECEIVED_FROM_WINDOWS_HOST, T_RP_MATCHES_RCVD autolearn=no version=3.3.1 Received: from lists.gnu.org ([199.232.76.165]:44300) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Pkdga-0007fO-7f for submit@debbugs.gnu.org; Wed, 02 Feb 2011 09:27:56 -0500 Received: from [140.186.70.92] (port=35264 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PkdgN-0001zo-MC for bug-coreutils@gnu.org; Wed, 02 Feb 2011 09:27:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 
4.71) (envelope-from ) id 1Pkc29-000740-NH for bug-coreutils@gnu.org; Wed, 02 Feb 2011 07:42:10 -0500 Received: from mx0.decode.is ([212.126.224.32]:2113) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Pkc29-00073S-BB for bug-coreutils@gnu.org; Wed, 02 Feb 2011 07:42:05 -0500 Received: from snote.decode.is (Not Verified[172.17.152.10]) by mx0.decode.is id ; Wed, 02 Feb 2011 12:42:01 +0000 Received: from lws14.decode.is ([172.17.112.14]) by snote.decode.is (Lotus Domino Release 8.0.2FP1) with ESMTP id 2011020212420210-87484 ; Wed, 2 Feb 2011 12:42:02 +0000 From: Francesco Bettella Organization: deCODE To: bug-coreutils@gnu.org Subject: sort Date: Wed, 2 Feb 2011 12:42:01 +0000 User-Agent: KMail/1.9.4 MIME-Version: 1.0 Message-Id: <201102021242.01910.francesb@decode.is> X-MIMETrack: Itemize by SMTP Server on DecodeDom/Decode/IS(Release 8.0.2FP1|January 12, 2009) at 02.02.2011 12:42:02, Serialize by Router on DecodeDom/Decode/IS(Release 8.0.2FP1|January 12, 2009) at 02.02.2011 12:42:02, Serialize complete at 02.02.2011 12:42:02 Content-Type: Multipart/Mixed; boundary="Boundary-00=_ZEVSN9Ujbbcj3lK" X-deCODE-Mailmarshal: Check X-detected-operating-system: by eggs.gnu.org: Windows 2000 SP4, XP SP1+ X-Received-From: 212.126.224.32 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2) X-Received-From: 199.232.76.165 X-Spam-Score: -5.9 (-----) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Wed, 02 Feb 2011 09:41:14 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -5.9 (-----) --Boundary-00=_ZEVSN9Ujbbcj3lK Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline hi, I may have bumped into an undesired feature/bug of sort, which appears to be still present in 
the version 8.9 of coreutils.
I'm issuing the following sort commands (see attached files):
[prompt1] > sort -k 1.4,1n asd1 > asd1.sorted
[prompt2] > sort -k 2.4,2n asd2 > asd2.sorted
the first one works as I would expect, the second one doesn't.
cheers.
Francesco

--Boundary-00=_ZEVSN9Ujbbcj3lK
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; name="asd1"
Content-Disposition: attachment; filename="asd1"

chr coding_gene
chr1 PRAMEF1
chr1 PRAMEF4
chr1 PPT1
chr1 B4GALT2
chr1 BTBD19
chr1 PPM1J
chr1 AMPD1
chr1 HSD3B1
chr1 LY9
chr1 DUSP27
chr1 MYBPH
chr1 DISC1
chr1 DISC1
chr10 ANXA7
chr10 KILLIN
chr10 RNLS
chr10 WNT8B
chr10 SEC31B
chr11 INSC
chr11 PRSS23
chr11 DDX10
chr12 TEAD4
chr12 GXYLT1
chr12 TBX5
chr12 PIWIL1
chr14 C14orf106
chr14 C14orf50
chr15 OR4N4
chr15 AKAP13
chr16 TBC1D10B
chr16 FAM38A
chr18 HDHD2
chr19 USHBP1
chr19 MLL4
chr19 MEGF8
chr19 SPHK2
chr19 SIGLEC10
chr19 ZNF83
chr19 LILRA6
chr2 TAF1B
chr2 KHK
chr2 PPM1G
chr2 KCTD18
chr21 SIK1
chr22 SGSM1
chr3 CCR5
chr3 SETD2
chr3 ZNF717
chr3 SNX4
chr3 GMPS
chr3 C3orf34
chr5 MTRR
chr5 SEMA5A
chr5 PCDHB16
chr5 CLK4
chr6 C6orf105
chr6 OR5V1
chr6 MUC21
chr6 PKHD1
chr7 CDHR3
chr8 CSMD1
chr8 FER1L6
chr8 KIAA0196
chr9 CNTLN
chr9 PLAA
chr9 SLC25A25

--Boundary-00=_ZEVSN9Ujbbcj3lK
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; name="asd1.sorted"
Content-Disposition: attachment; filename="asd1.sorted"

chr coding_gene
chr1 AMPD1
chr1 B4GALT2
chr1 BTBD19
chr1 DISC1
chr1 DISC1
chr1 DUSP27
chr1 HSD3B1
chr1 LY9
chr1 MYBPH
chr1 PPM1J
chr1 PPT1
chr1 PRAMEF1
chr1 PRAMEF4
chr2 KCTD18
chr2 KHK
chr2 PPM1G
chr2 TAF1B
chr3 C3orf34
chr3 CCR5
chr3 GMPS
chr3 SETD2
chr3 SNX4
chr3 ZNF717
chr5 CLK4
chr5 MTRR
chr5 PCDHB16
chr5 SEMA5A
chr6 C6orf105
chr6 MUC21
chr6 OR5V1
chr6 PKHD1
chr7 CDHR3
chr8 CSMD1
chr8 FER1L6
chr8 KIAA0196
chr9 CNTLN
chr9 PLAA
chr9 SLC25A25
chr10 ANXA7
chr10 KILLIN
chr10 RNLS
chr10 SEC31B
chr10 WNT8B
chr11 DDX10
chr11 INSC
chr11 PRSS23
chr12 GXYLT1
chr12 PIWIL1
chr12 TBX5
chr12 TEAD4
chr14 C14orf106
chr14 C14orf50
chr15 AKAP13
chr15 OR4N4
chr16 FAM38A
chr16 TBC1D10B
chr18 HDHD2
chr19 LILRA6
chr19 MEGF8
chr19 MLL4
chr19 SIGLEC10
chr19 SPHK2
chr19 USHBP1
chr19 ZNF83
chr21 SIK1
chr22 SGSM1

--Boundary-00=_ZEVSN9Ujbbcj3lK
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; name="asd2.sorted"
Content-Disposition: attachment; filename="asd2.sorted"

AKAP13 chr15
AMPD1 chr1
ANXA7 chr10
B4GALT2 chr1
BTBD19 chr1
C14orf106 chr14
C14orf50 chr14
C3orf34 chr3
C6orf105 chr6
CCR5 chr3
CDHR3 chr7
CLK4 chr5
CNTLN chr9
CSMD1 chr8
DDX10 chr11
DISC1 chr1
DISC1 chr1
DUSP27 chr1
FAM38A chr16
FER1L6 chr8
GMPS chr3
GXYLT1 chr12
HDHD2 chr18
HSD3B1 chr1
INSC chr11
KCTD18 chr2
KHK chr2
KIAA0196 chr8
KILLIN chr10
LILRA6 chr19
LY9 chr1
MEGF8 chr19
MLL4 chr19
MTRR chr5
MUC21 chr6
MYBPH chr1
OR4N4 chr15
OR5V1 chr6
PCDHB16 chr5
PIWIL1 chr12
PKHD1 chr6
PLAA chr9
PPM1G chr2
PPM1J chr1
PPT1 chr1
PRAMEF1 chr1
PRAMEF4 chr1
PRSS23 chr11
RNLS chr10
SEC31B chr10
SEMA5A chr5
SETD2 chr3
SGSM1 chr22
SIGLEC10 chr19
SIK1 chr21
SLC25A25 chr9
SNX4 chr3
SPHK2 chr19
TAF1B chr2
TBC1D10B chr16
TBX5 chr12
TEAD4 chr12
USHBP1 chr19
WNT8B chr10
ZNF717 chr3
ZNF83 chr19
coding_gene chr

--Boundary-00=_ZEVSN9Ujbbcj3lK
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; name="asd2"
Content-Disposition: attachment; filename="asd2"

coding_gene chr
PRAMEF1 chr1
PRAMEF4 chr1
PPT1 chr1
B4GALT2 chr1
BTBD19 chr1
PPM1J chr1
AMPD1 chr1
HSD3B1 chr1
LY9 chr1
DUSP27 chr1
MYBPH chr1
DISC1 chr1
DISC1 chr1
ANXA7 chr10
KILLIN chr10
RNLS chr10
WNT8B chr10
SEC31B chr10
INSC chr11
PRSS23 chr11
DDX10 chr11
TEAD4 chr12
GXYLT1 chr12
TBX5 chr12
PIWIL1 chr12
C14orf106 chr14
C14orf50 chr14
OR4N4 chr15
AKAP13 chr15
TBC1D10B chr16
FAM38A chr16
HDHD2 chr18
USHBP1 chr19
MLL4 chr19
MEGF8 chr19
SPHK2 chr19
SIGLEC10 chr19
ZNF83 chr19
LILRA6 chr19
TAF1B chr2
KHK chr2
PPM1G chr2
KCTD18 chr2
SIK1 chr21
SGSM1 chr22
CCR5 chr3
SETD2 chr3
ZNF717 chr3
SNX4 chr3
GMPS chr3
C3orf34 chr3
MTRR chr5
SEMA5A chr5
PCDHB16 chr5
CLK4 chr5
C6orf105 chr6
OR5V1 chr6
MUC21 chr6
PKHD1 chr6
CDHR3 chr7
CSMD1 chr8
FER1L6 chr8
KIAA0196 chr8
CNTLN chr9
PLAA chr9
SLC25A25 chr9

--Boundary-00=_ZEVSN9Ujbbcj3lK--
------------=_1296687182-13448-1--

From unknown Sat Aug 09 13:00:06 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#7961: sort
Resent-From: Jim Meyering
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Thu, 03 Feb 2011 08:12:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 7961
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: 7961@debbugs.gnu.org
Cc: P@draigBrady.com
Received: via spool by 7961-submit@debbugs.gnu.org id=B7961.129672071128044 (code B ref 7961); Thu, 03 Feb 2011 08:12:02 +0000
Received: (at 7961) by debbugs.gnu.org; 3 Feb 2011 08:11:51 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkuIB-0007IG-3c for submit@debbugs.gnu.org; Thu, 03 Feb 2011 03:11:51 -0500
Received: from mx.meyering.net ([82.230.74.64]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkuI8-0007I1-Up for 7961@debbugs.gnu.org; Thu, 03 Feb 2011 03:11:49 -0500
Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000) id CB9C0600B3; Thu, 3 Feb 2011 09:20:14 +0100 (CET)
From: Jim Meyering
In-Reply-To: <4D49E1B6.104@draigBrady.com> ("=?UTF-8?Q?P=C3=A1draig?= Brady"'s message of "Wed, 02 Feb 2011 22:59:02 +0000")
References: <201102021242.01910.francesb@decode.is> <4D4997E0.7020506@redhat.com> <4D49E1B6.104@draigBrady.com>
Date: Thu, 03 Feb 2011 09:20:14 +0100
Message-ID: <87k4hho7gh.fsf@meyering.net>
Lines: 65
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: -5.8 (-----)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -5.8 (-----)

Pádraig Brady wrote:
> On 02/02/11 17:44, Eric Blake wrote:
>> $ head -3 asd2 | LC_ALL=C sort -k 2.4,2n --debug
>> sort: using simple byte comparison
>> sort: leading blanks are significant in key 1; consider also specifying `b'
>> PRAMEF1>chr1
>> ^ no match for key
>> ____________
>> PRAMEF4>chr1
>> ^ no match for key
>> ____________
>> coding_gene>chr
>> ^ no match for key
>>
>> But when you add -b (note, b is the one option you have to add to the
>> start field, since it affects start and end fields specially; all other
>> options can be added to start, end, or both, and affect the entire key):
>>
>> $ head -3 asd2 | sort -k 2.4b,2n --debug
>> sort: using `en_US.UTF-8' sorting rules
>> coding_gene>chr
>> ^ no match for key
>> _______________
>> PRAMEF1>chr1
>> _
>
>
> Yep. The 'b' option is one of the main reasons for --debug.
> Note, sort --debug will warn until you put it in the right place.
>
> Hmm, I just noticed a bug with --debug, introduced with bdde34f9:
>
> $ printf "A\tchr10\nB\tchr1\n" | ./sort -s --debug -k2.4b,2.3n 2>/dev/null
> A>chr10
> __
> B>chr1
> _
>
> This should fix it up:

Good catch.
That looks right and works for me:

$ printf "A\tchr10\nB\tchr1\n" | ./sort -s --debug -k2.4b,2.3n 2>/dev/null
A>chr10
^ no match for key
B>chr1
^ no match for key

If you have time, please push that today.

> diff --git a/src/sort.c b/src/sort.c
> index 06b0d95..365634d 100644
> --- a/src/sort.c
> +++ b/src/sort.c
> @@ -2214,7 +2214,9 @@ debug_key (struct line const *line, struct keyfield const *key)
>
>        char *tighter_lim = beg;
>
> -      if (key->month)
> +      if (lim < beg)
> +        tighter_lim = lim;
> +      else if (key->month)
>          getmonth (beg, &tighter_lim);
>        else if (key->general_numeric)
>          ignore_value (strtold (beg, &tighter_lim));
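
For reference, the point quoted above about `b` can be tried on a tiny input. This is only a sketch: it assumes a POSIX-compatible sort, uses made-up three-line sample data rather than the reporter's asd2 file, and the helper name sort_by_chr_number is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): sort "GENE<TAB>chrN" lines by N.
# The key starts at character 4 of field 2. The `b` modifier on the start
# position skips the leading blank that otherwise counts as part of field 2.
sort_by_chr_number() {
    LC_ALL=C sort -k 2.4b,2n
}

# Made-up sample input laid out like asd2 (name, tab, chromosome).
printf 'A\tchr10\nB\tchr2\nC\tchr1\n' | sort_by_chr_number
```

With `b` the keys are 10, 2 and 1, so the lines come out as C, B, A. Without it (`-k 2.4,2n`), character 4 of the field is the `r` of `chr`, every key is non-numeric, and the lines fall back to last-resort byte order.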