From unknown Thu Sep 11 09:17:53 2025 X-Loop: help-debbugs@gnu.org Subject: bug#14988: sort enhancement request Resent-From: Danny Nicholas Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Tue, 30 Jul 2013 21:45:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 14988 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: 14988@debbugs.gnu.org X-Debbugs-Original-To: "bug-coreutils@gnu.org" Received: via spool by submit@debbugs.gnu.org id=B.137522070127221 (code B ref -1); Tue, 30 Jul 2013 21:45:02 +0000 Received: (at submit) by debbugs.gnu.org; 30 Jul 2013 21:45:01 +0000 Received: from localhost ([127.0.0.1]:59176 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Hj1-00074x-8J for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:45:00 -0400 Received: from eggs.gnu.org ([208.118.235.92]:55079) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4HDI-0005to-C9 for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:12:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1V4HD8-0002j4-4P for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:12:07 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: *** X-Spam-Status: No, score=3.3 required=5.0 tests=BAYES_50,HTML_MESSAGE, RECEIVED_FROM_WINDOWS_HOST autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:59969) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V4HD8-0002iz-26 for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:12:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56764) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V4HD3-0004sG-6A for bug-coreutils@gnu.org; Tue, 30 Jul 2013 17:12:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1V4HCy-0002hN-Cx for bug-coreutils@gnu.org; Tue, 30 Jul 2013 17:11:57 -0400 Received: 
from mail.pinnacledatasystems.com ([97.65.18.95]:11375) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V4HCy-0002gH-5o for bug-coreutils@gnu.org; Tue, 30 Jul 2013 17:11:52 -0400 Received: from PDS-BHM-MAIL1.pinnacledata.local ([::1]) by PDS-BHM-MAIL1.pinnacledata.local ([::1]) with mapi id 14.03.0123.003; Tue, 30 Jul 2013 15:51:09 -0500 From: Danny Nicholas Thread-Topic: sort enhancement request Thread-Index: Ac6NZoQIomh8iPUwRDizMBDglsGOpA== Date: Tue, 30 Jul 2013 20:51:08 +0000 Message-ID: <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-originating-ip: [168.162.167.198] x-tm-as-product-ver: SMEX-10.1.0.2244-7.000.1014-20046.006 x-tm-as-result: No--43.546000-5.000000-31 x-tm-as-user-approved-sender: No x-tm-as-user-blocked-sender: No Content-Type: multipart/related; boundary="_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_"; type="multipart/alternative" MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -3.4 (---) X-Mailman-Approved-At: Tue, 30 Jul 2013 17:44:57 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---) --_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: multipart/alternative; boundary="_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_" --_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi guys, I am presently using version 7.1 on a Solaris box. 
I downloaded 8.21 and really love the improvement in speed (almost 50% in some tests). I am looking to replace the commercial product NSORT and would like this feature in the source instead of a wrapper. If I have a file

XXXX300001XXXX
XXXX300002XXXX
XXXX300003XXXX
XXXX300003XXXX
XXXX300003XXXX
XXXX300003XXXX
XXXX300004XXXX
XXXX300005XXXX
XXXX300006XXXX
XXXX300007XXXX

NSORT keeps the 4 300003 records together in entry sequence. My present work-around is to use a Python script that reads in the whole file and creates a pseudo-key that is 30000X plus an 8 digit sequence number (I process millions of records). What I am thinking of is an -es (--entry-sequence) that would add a hidden -k to process on this internal sequence. If I figure out how to do this on my own, I will submit it to you.

Thanks,
Danny Nicholas
Applications Programmer
Pinnacle Data Systems L.L.C.
Office: (205) 307-6874
danny.nicholas@pinnacledatasystems.com
www.pinnacledatasystems.com

Follow us on LinkedIn and Twitter

CONFIDENTIALITY: This email (including any attachments) may contain confidential, proprietary and privileged information, and unauthorized disclosure or use is prohibited. If you received this email in error, please notify the sender and delete this email from your system.

--_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

--_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_--
--_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_--

From unknown Thu Sep 11 09:17:53 2025 X-Loop: help-debbugs@gnu.org Subject: bug#14988: sort enhancement request Resent-From: Eric Blake Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Tue, 30 Jul 2013 22:35:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 14988 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: Danny Nicholas Cc: 14988@debbugs.gnu.org Received: via spool by 14988-submit@debbugs.gnu.org id=B14988.13752236411331 (code B ref 14988); Tue, 30 Jul 2013 22:35:01 +0000 Received: (at 14988) by
debbugs.gnu.org; 30 Jul 2013 22:34:01 +0000 Received: from localhost ([127.0.0.1]:59209 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4IUS-0000LD-RV for submit@debbugs.gnu.org; Tue, 30 Jul 2013 18:34:01 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42135) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4IUQ-0000Kw-EG; Tue, 30 Jul 2013 18:33:59 -0400 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r6UMXumB018157 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 30 Jul 2013 18:33:56 -0400 Received: from [10.3.113.111] (ovpn-113-111.phx2.redhat.com [10.3.113.111]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r6UMXtvW011419; Tue, 30 Jul 2013 18:33:55 -0400 Message-ID: <51F83F53.2060208@redhat.com> Date: Tue, 30 Jul 2013 16:33:55 -0600 From: Eric Blake Organization: Red Hat, Inc. 
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7 MIME-Version: 1.0 References: <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> In-Reply-To: <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> X-Enigmail-Version: 1.5.2 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="DTJ7aol5672oChQBMewBfritP4BmWg9ai" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 X-Spam-Score: -6.5 (------) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.5 (------) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --DTJ7aol5672oChQBMewBfritP4BmWg9ai Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

tag 14988 needinfo
thanks

On 07/30/2013 02:51 PM, Danny Nicholas wrote:
> Hi guys,

[can you convince your mailer to wrap long lines?]

> I am presently using version 7.1 on a Solaris box. I downloaded 8.21 and really love the improvement in speed (almost 50% in some tests). I am looking to replace the commercial product NSORT and would like this feature in the source instead of a wrapper. If I have a file
> XXXX300001XXXX
> XXXX300002XXXX
> XXXX300003XXXX
> XXXX300003XXXX
> XXXX300003XXXX
> XXXX300003XXXX
> XXXX300004XXXX
> XXXX300005XXXX
> XXXX300006XXXX
> XXXX300007XXXX

As written, your example is already in sorted order, and with no other distinguishing features on the line, you haven't proven that sort isn't outputting lines in the order you want. I also can't tell if the XXXX represent the actual bytes you are sorting, or if you meant them as placeholders for a sanitized version of your actual data set.
You'll need to give us an actual example of lines that are sorted differently by nsort and GNU sort, and the command line options you attempted for GNU sort, before we can tell you what to try next.

>
> NSORT keeps the 4 300003 records together in entry sequence. My present work-around is to use a Python script that reads in the whole file and creates a pseudo-key that is 30000X plus an 8 digit sequence number (I process millions of records). What I am thinking of is an -es (--entry-sequence) that would add a hidden -k to process on this internal sequence. If I figure out how to do this on my own, I will submit it to you.

Short options must be one letter long; writing your proposed 'sort -es' would be the same as 'sort -e -s'. Also, we are reluctant to burn short options; these days, it's better to add a long option only, until it proves its popularity, so that we don't collide with any future standardized short options.

It SOUNDS like you are merely asking for a stable sort option. Have you tried the -s/--stable option? That effectively adds an invisible key of last resort that says if two lines otherwise compare equal, sort them so that the line occurring first in input also occurs first in output.

At any rate, I'm marking this bug as 'needinfo' so that we can get more feedback on whether --stable already meets your needs, or at least so we can get a test case that we can play with to see what you are really asking for.

Also, have you played with 'sort --debug'? It shows you a lot more details on EXACTLY what sort is looking at.
For example, I am able to do a numeric sort on JUST the 6 digits in between the XXXX fillers of the example you listed:

$ printf 'XXXX300002XXXX\nXXXX300001XXXX\n' \
  | LC_ALL=C sort --debug -k1.5,1.10n -s
sort: using simple byte comparison
XXXX300001XXXX
    ______
XXXX300002XXXX
    ______

>
> CONFIDENTIALITY: This email (including any attachments) may contain confidential,

Sorry, but this disclaimer is unenforceable on publicly archived lists. It is considered poor netiquette to use your employer's email if they insist on adding this on your behalf, and you may be better off sending the mail from a personal account.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--DTJ7aol5672oChQBMewBfritP4BmWg9ai--

From unknown Thu Sep 11 09:17:53 2025 X-Loop: help-debbugs@gnu.org Subject: bug#14988: sort enhancement request Resent-From: Eric Blake Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Tue, 30 Jul 2013 22:44:05 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 14988 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: moreinfo Cc: Danny Nicholas ,
14988@debbugs.gnu.org Received: via spool by 14988-submit@debbugs.gnu.org id=B14988.13752242142792 (code B ref 14988); Tue, 30 Jul 2013 22:44:05 +0000 Received: (at 14988) by debbugs.gnu.org; 30 Jul 2013 22:43:34 +0000 Received: from localhost ([127.0.0.1]:59220 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Idg-0000iw-Ao for submit@debbugs.gnu.org; Tue, 30 Jul 2013 18:43:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42014) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Idc-0000ik-8Z for 14988@debbugs.gnu.org; Tue, 30 Jul 2013 18:43:29 -0400 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r6UMhRNW021506 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 30 Jul 2013 18:43:27 -0400 Received: from [10.3.113.111] (ovpn-113-111.phx2.redhat.com [10.3.113.111]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r6UMhQFv014798; Tue, 30 Jul 2013 18:43:27 -0400 Message-ID: <51F8418E.5050506@redhat.com> Date: Tue, 30 Jul 2013 16:43:26 -0600 From: Eric Blake Organization: Red Hat, Inc. 
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7 MIME-Version: 1.0 References: <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> <51F83F53.2060208@redhat.com> In-Reply-To: <51F83F53.2060208@redhat.com> X-Enigmail-Version: 1.5.2 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="4txfIw4gbxFDTG9AbMFiHb8sDvctErOiQ" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 X-Spam-Score: -5.3 (-----) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.3 (-----) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --4txfIw4gbxFDTG9AbMFiHb8sDvctErOiQ Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

On 07/30/2013 04:33 PM, Eric Blake wrote:
> It SOUNDS like you are merely asking for a stable sort option. Have you
> tried the -s/--stable option? That effectively adds an invisible key of
> last resort that says if two lines otherwise compare equal, sort them so
> that the line occurring first in input also occurs first in output.

An alternative view of how -s works (that more closely matches what you will see in 'sort --debug' output) is that POSIX requires that 'sort' behave as if a key of -k1 were always the last key present (if two lines otherwise compare equal, sort them by a strcoll() comparison of the entire line), where -s is the GNU extension that disables this POSIX implicit full-line sort key, so that you are left with a stable sort. Since 'adding an option to remove a key' sounds a little fishy on the surface, you can see why I gave my first explanation instead.
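The two views of -s described above can be checked with a small experiment. This is only a minimal sketch: it assumes the `sort` found first in PATH is GNU sort (since -s is a GNU extension), and the three input lines are made up for illustration rather than taken from the reporter's data.

```shell
# Three lines share the same first field ("a"); the second field
# arrives in the order c, a, b.
input=$(printf 'a c\na a\na b\n')

# Without -s: ties on the -k1,1 key fall back to a whole-line
# comparison, so the lines come out as "a a", "a b", "a c".
without_s=$(printf '%s\n' "$input" | LC_ALL=C sort -k1,1)

# With -s: the last-resort comparison is disabled, so tied lines
# keep their input order: "a c", "a a", "a b".
with_s=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k1,1)

printf 'without -s:\n%s\n\nwith -s:\n%s\n' "$without_s" "$with_s"
```

Either mental model predicts the same two outputs, which is the point of the comparison.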
But at the end of the day, all that matters is the result, so pick whichever mental representation you find easier to understand :)

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--4txfIw4gbxFDTG9AbMFiHb8sDvctErOiQ--

From debbugs-submit-bounces@debbugs.gnu.org Wed Jul 31 09:59:31 2013 Received: (at control) by debbugs.gnu.org; 31 Jul 2013 13:59:31 +0000 Received: from localhost ([127.0.0.1]:60970 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Ww6-0007Jz-8t for submit@debbugs.gnu.org; Wed, 31 Jul 2013 09:59:30 -0400 Received: from mx1.redhat.com ([209.132.183.28]:36373) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Ww1-0007JW-T9; Wed, 31 Jul 2013 09:59:27 -0400 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r6VDxOVO025029 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 31 Jul 2013 09:59:24 -0400 Received: from [10.3.113.111] (ovpn-113-111.phx2.redhat.com [10.3.113.111]) by
int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r6VDxN3D020190; Wed, 31 Jul 2013 09:59:24 -0400 Message-ID: <51F9183B.5000504@redhat.com> Date: Wed, 31 Jul 2013 07:59:23 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7 MIME-Version: 1.0 To: Danny Nicholas , 14988-done@debbugs.gnu.org Subject: Re: bug#14988: sort enhancement request References: <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> <51F83F53.2060208@redhat.com> <51F8418E.5050506@redhat.com> <1F727E6927E3F34787AEC8C1DCFAB965029D5FA3@PDS-BHM-MAIL1.pinnacledata.local> In-Reply-To: <1F727E6927E3F34787AEC8C1DCFAB965029D5FA3@PDS-BHM-MAIL1.pinnacledata.local> X-Enigmail-Version: 1.5.2 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="NatQ8b7UXIfNkxMHeBemoN0DbEREUg6S7" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 X-Spam-Score: -6.5 (------) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.5 (------) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --NatQ8b7UXIfNkxMHeBemoN0DbEREUg6S7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

tag 14988 notabug
thanks

[re-adding the list; and please don't top-post on technical lists]

On 07/31/2013 07:19 AM, Danny Nicholas wrote:
> Thank you Eric. We have two sorts on our system. Our /usr/bin/sort does not support the -s option,

Makes sense - the '-s' option is a GNU extension, and your /usr/bin/sort is probably not GNU sort.
If you want stable sorting using only POSIX features, then you have to supply enough sort keys so that no two lines ever compare equal (since POSIX has no way to disable the full-line sort of last resort). Depending on the input to be sorted, this may indeed require a pre-filter run that adds line numbering (by the way, sed's '=' command can do this much more efficiently than a python script), then sorting, then a post-filter run that removes the line number.

> but our /usr/local/bin/sort does.

Indeed - life is simpler if you can write your script to ensure that it always sets PATH to use the full power of the GNU tools.

> Unfortunately, that did not resolve the issue. Here is a portion of the file I'm trying to sort

Thank you - THIS makes much more sense for understanding your problem.

> 010_000001_0000731_00001_200000081610_
> 010_000001_0000731_00002_200000081610_ 4102 LANGUAGE EN</CCODEPAGE>
> 010_000001_0000731_00003_200000081610_ YES
> 010_000001_0000731_00003_200000081610_ 010
> 010_000001_0000731_00003_200000081610_ 06/12/2013</lastpaymentdate>
> 010_000001_0000731_00003_200000081610_ 277.59
> 010_000001_0000731_00003_200000081610_
> 010_000001_0000731_00003_200000081610_ PAGE1
> 010_000001_0000731_00004_200000081610_ REGULAR
> 010_000001_0000731_00005_200000081610_ PRINTER
> 010_000001_0000731_00006_200000081610_ S
> 010_000001_0000731_00007_200000081610_ PRINTER
> 010_000001_0000731_00008_200000081610_ R3P
>
> What I am executing is /usr/local/bin/sort -k 1,36 -s file -o file2

So, with "-k1,36" you asked sort to treat as its sort key the portion of the line ranging from the first field to the 36th field. I only see 2 fields in most of the lines (a few have more, but none of them with 36 fields), so you are basically sorting by the entire line. You didn't provide any other keys, but since your first key is already botched as the ENTIRE line, there were no lines that compared equal for -s to make any difference.
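The POSIX-only route mentioned above (number the lines, sort, then strip the numbers) can be sketched as follows. This is an illustration rather than a recipe from the thread: the function name `stable_sort_by_prefix` is hypothetical, awk does the numbering here instead of sed, and the sketch assumes the key is a fixed-width prefix of each line that contains no tab characters.

```shell
# Hypothetical helper: stable sort of stdin on the first WIDTH bytes of
# each line, using only POSIX awk, sort and cut (no GNU -s needed).
stable_sort_by_prefix() {
  width=$1
  # Decorate: key prefix, TAB, zero-padded input line number, TAB, line.
  awk -v w="$width" '{ printf "%s\t%08d\t%s\n", substr($0, 1, w), NR, $0 }' |
    # Sort on the key, then on the line number. The line number is
    # unique, so no two lines ever tie and the full-line comparison
    # of last resort never decides the order.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n |
    # Undecorate: drop the two helper fields.
    cut -f3-
}

# Usage with a 2-byte key; the two "a1" lines keep their input order.
result=$(printf 'b2 second\na1 zeta\na1 alpha\n' | stable_sort_by_prefix 2)
printf '%s\n' "$result"
```

For the file in this report the width would presumably be the length of the record prefix, and the 8-digit padding limits the sketch to inputs of fewer than 100 million lines.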
Again, sort --debug makes this clear (using a subset of just two lines of your input):

>> $ printf '010_000001_0000731_00003_200000081610_ \n010_000001_0000731_00003_200000081610_ PAGE1\n' \
>>   | LC_ALL=C sort --debug -k1,36 -s
>> sort: using simple byte comparison
>> 010_000001_0000731_00003_200000081610_ PAGE1
>> _______________________________________________________________________
>> 010_000001_0000731_00003_200000081610_
>> ________________________________________________________________________________________________________

But it appears that what you WANTED was to sort on just the first 36 bytes, with a stable sort of the results. If so, then ASK for that, by using the correct -k option:

>> $ printf '010_000001_0000731_00003_200000081610_ \n010_000001_0000731_00003_200000081610_ PAGE1\n' \
>>   | LC_ALL=C sort --debug -k1,1.36 -s
>> sort: using simple byte comparison
>> 010_000001_0000731_00003_200000081610_
>> ____________________________________
>> 010_000001_0000731_00003_200000081610_ PAGE1
>> ____________________________________

Note how I asked for a sort key -k1,1.36, which says to start in the first field, and end 36 bytes into the first field (hmm, it looks like you actually want 38 bytes - but I'll leave that for you to decide).
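As a sketch of the 38-byte variant hinted at above: the record prefix in the sample is 38 bytes long, so a key of -k1,1.38 covers all of it. The example assumes GNU sort (for -s) and uses three lines abbreviated from the sample in this thread; the text after the prefix is only there to make the input order visible.

```shell
# Two records share the prefix ending in 00003_...; one earlier record
# (00002) arrives between them and must sort first.
sorted=$(printf '%s\n' \
  '010_000001_0000731_00003_200000081610_ YES' \
  '010_000001_0000731_00002_200000081610_ 4102' \
  '010_000001_0000731_00003_200000081610_ 010' |
  LC_ALL=C sort -s -k1,1.38)

# The 00002 record comes first; the two 00003 records stay in input
# order (YES before 010) because -s disables the full-line tie-break.
printf '%s\n' "$sorted"
```

Without -s the last two lines would swap, since "010" sorts before "YES" in a whole-line byte comparison.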
Also note that -s now makes a difference: when the content of that first sort key is identical, the last-resort full-line comparison swaps unequal lines if -s is not used:

>> $ printf '010_000001_0000731_00003_200000081610_ \n010_000001_0000731_00003_200000081610_ PAGE1\n' \
>>   | LC_ALL=C sort --debug -k1,1.36
>> sort: using simple byte comparison
>> 010_000001_0000731_00003_200000081610_ PAGE1
>> ____________________________________
>> _______________________________________________________________________
>> 010_000001_0000731_00003_200000081610_
>> ____________________________________
>> ________________________________________________________________________________________________________

As this is a case of you not passing the correct command line arguments, rather than a bug in sort, I am marking this bug as closed. However, feel free to continue to comment on the topic (preferably on-list) if you have more questions.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--NatQ8b7UXIfNkxMHeBemoN0DbEREUg6S7--

From unknown Thu Sep 11 09:17:53 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.503
(Entity 5.503) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Danny Nicholas Subject: bug#14988: closed (Re: bug#14988: sort enhancement request) Message-ID: References: <51F9183B.5000504@redhat.com> <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> X-Gnu-PR-Message: they-closed 14988 X-Gnu-PR-Package: coreutils X-Gnu-PR-Keywords: notabug moreinfo Reply-To: 14988@debbugs.gnu.org Date: Wed, 31 Jul 2013 14:00:13 +0000 Content-Type: multipart/mixed; boundary="----------=_1375279213-28307-1" This is a multi-part message in MIME format... ------------=_1375279213-28307-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Your bug report

#14988: sort enhancement request

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report. If you require more details, please reply to 14988@debbugs.gnu.org.

--
14988: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=14988
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1375279213-28307-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 14988-done) by debbugs.gnu.org; 31 Jul 2013 13:59:30 +0000 Received: from localhost ([127.0.0.1]:60968 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Ww5-0007Ju-HR for submit@debbugs.gnu.org; Wed, 31 Jul 2013 09:59:30 -0400 Received: from mx1.redhat.com ([209.132.183.28]:36373) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Ww1-0007JW-T9; Wed, 31 Jul 2013 09:59:27 -0400 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r6VDxOVO025029 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 31 Jul 2013 09:59:24 -0400 Received: from
[10.3.113.111] (ovpn-113-111.phx2.redhat.com [10.3.113.111]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r6VDxN3D020190; Wed, 31 Jul 2013 09:59:24 -0400 Message-ID: <51F9183B.5000504@redhat.com> Date: Wed, 31 Jul 2013 07:59:23 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7 MIME-Version: 1.0 To: Danny Nicholas , 14988-done@debbugs.gnu.org Subject: Re: bug#14988: sort enhancement request References: <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> <51F83F53.2060208@redhat.com> <51F8418E.5050506@redhat.com> <1F727E6927E3F34787AEC8C1DCFAB965029D5FA3@PDS-BHM-MAIL1.pinnacledata.local> In-Reply-To: <1F727E6927E3F34787AEC8C1DCFAB965029D5FA3@PDS-BHM-MAIL1.pinnacledata.local> X-Enigmail-Version: 1.5.2 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="NatQ8b7UXIfNkxMHeBemoN0DbEREUg6S7" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 X-Spam-Score: -6.5 (------) X-Debbugs-Envelope-To: 14988-done X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.5 (------) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --NatQ8b7UXIfNkxMHeBemoN0DbEREUg6S7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable tag 14988 notabug thanks [re-adding the list; and please don't top-post on technical lists] On 07/31/2013 07:19 AM, Danny Nicholas wrote: > Thank you Eric. We have two sorts on our system. Our /usr/bin/sort do= es not support the -s option, Makes sense - the '-s' option is a GNU extension, and your /usr/bin/sort is probably not GNU sort. 
If you want stable sorting using only POSIX features, then you have to
supply enough sort keys so that no two lines ever compare equal (since
POSIX has no way to disable the full-line sort of last resort).  And
depending on your input to be sorted; this may indeed require a
pre-filter run that adds line numbering (by the way, sed's '=' command
can do this much more efficiently than a python script), then sorting,
then a post-filter run that removes the line number.

> but our /usr/local/bin/sort does.

Indeed - life is simpler if you can write your script to ensure that it
always sets PATH to use the full power of the GNU tools.

> Unfortunately, that did not resolve the issue.  Here is a portion of the file I'm trying to sort

Thank you - THIS makes much more sense for understanding your problem.

> 010_000001_0000731_00001_200000081610_
> 010_000001_0000731_00002_200000081610_ 4102 LANGUAGE EN</CCODEPAGE>
> 010_000001_0000731_00003_200000081610_ YES
> 010_000001_0000731_00003_200000081610_ 010
> 010_000001_0000731_00003_200000081610_ 06/12/2013</lastpaymentdate>
> 010_000001_0000731_00003_200000081610_ 277.59
> 010_000001_0000731_00003_200000081610_
> 010_000001_0000731_00003_200000081610_ PAGE1
> 010_000001_0000731_00004_200000081610_ REGULAR
> 010_000001_0000731_00005_200000081610_ PRINTER
> 010_000001_0000731_00006_200000081610_ S
> 010_000001_0000731_00007_200000081610_ PRINTER
> 010_000001_0000731_00008_200000081610_ R3P
> 
> What I am executing is /usr/local/bin/sort -k 1,36 -s file -o file2

So, with "-k1,36" you asked sort to treat as its sort key the portion of
the line ranging from the first field to the 36th field.  I only see 2
fields in most of the lines (a few have more, but none of them with 36
fields), so you are basically sorting by the entire line.  You didn't
provide any other keys, but since your first key is already botched as
the ENTIRE line, there were no lines that compared equal for -s to make
any difference.
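
For readers limited to a POSIX sort, the pre-filter / sort / post-filter
idea described earlier in this message might look roughly like the
sketch below.  It is only an illustration and not part of the original
thread: the function name posix_stable_sort is made up, awk is used for
the numbering step instead of sed's '=' (which prints each number on a
line of its own), and it assumes the first 36 bytes of every line
contain no tab character.

```sh
# Sketch: stable sort on the first 36 bytes of each line, POSIX tools only.
# posix_stable_sort is a hypothetical helper name, not an existing command.
# Assumption: no tab character occurs within the first 36 bytes of a line.
posix_stable_sort() {
  tab=$(printf '\t')
  # Pre-filter: prefix every line with its line number and a tab.
  awk '{ printf "%d\t%s\n", NR, $0 }' |
    # Primary key: bytes 1-36 of the original line (now field 2).
    # Secondary key: the line number, so no two lines compare equal.
    LC_ALL=C sort -t "$tab" -k2.1,2.36 -k1,1n |
    # Post-filter: drop the line number again.
    cut -f2-
}

# Possible use: posix_stable_sort < file > file2
```

Because the line number is unique, no two decorated lines ever compare
equal, so the full-line comparison of last resort never gets a chance to
reorder records that share the same 36-byte key.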

Again, sort --debug makes this clear (using a subset of just two lines
of your input):

>> $ printf '010_000001_0000731_00003_200000081610_ \n010_000001_0000731_00003_200000081610_ PAGE1\n' \
>>   | LC_ALL=C sort --debug -k1,36 -s
>> sort: using simple byte comparison
>> 010_000001_0000731_00003_200000081610_ PAGE1
>> _______________________________________________________________________
>> 010_000001_0000731_00003_200000081610_
>> ________________________________________________________________________________________________________

But it appears that what you WANTED was to sort on just the first 36
bytes, with a stable sort of the results.  If so, then ASK for that, by
using the correct -k option:

>> $ printf '010_000001_0000731_00003_200000081610_ \n010_000001_0000731_00003_200000081610_ PAGE1\n' \
>>   | LC_ALL=C sort --debug -k1,1.36 -s
>> sort: using simple byte comparison
>> 010_000001_0000731_00003_200000081610_
>> ____________________________________
>> 010_000001_0000731_00003_200000081610_ PAGE1
>> ____________________________________

Note how I asked for a sort key -k1,1.36, which says to start in the
first field, and end 36 bytes into the first field (hmm, it looks like
you actually want 38 bytes - but I'll leave that for you to decide).
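
To relate this back to the sample in the original report (records such
as XXXX300003XXXX), here is a small sketch of a character-offset key
combined with -s.  It assumes GNU sort, that the key occupies bytes 5
through 10 of each line, and that the lines contain no blanks; the
function name and the trailing letters on the sample records are made up
purely to make the input order visible.

```sh
# Sketch: sort on bytes 5-10 of each line only.  -s (a GNU extension)
# keeps records with equal keys in their original input order.
# sort_by_record_id is a hypothetical helper name.
sort_by_record_id() {
  LC_ALL=C sort -s -k1.5,1.10
}

# Hypothetical input: the trailing letter marks the entry sequence.
printf '%s\n' XXXX300003XXXXc XXXX300001XXXXz XXXX300003XXXXa XXXX300003XXXXb |
  sort_by_record_id
```

With -s the three 300003 records should come out in their entry order
(c, a, b) after the 300001 record; without -s the last-resort full-line
comparison would reorder them to a, b, c.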

Also note that -s now makes a difference, when the content of that first
sort key is identical so the last-resort full-line comparison swaps
unequal lines when -s is not used:

>> $ printf '010_000001_0000731_00003_200000081610_ \n010_000001_0000731_00003_200000081610_ PAGE1\n' \
>>   | LC_ALL=C sort --debug -k1,1.36
>> sort: using simple byte comparison
>> 010_000001_0000731_00003_200000081610_ PAGE1
>> ____________________________________
>> _______________________________________________________________________
>> 010_000001_0000731_00003_200000081610_
>> ____________________________________
>> ________________________________________________________________________________________________________

As this is a case of you not passing the correct command line arguments,
rather than a bug in sort, I am marking this bug as closed.  However,
feel free to continue to comment on the topic (preferably on-list) if
you have more questions.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--NatQ8b7UXIfNkxMHeBemoN0DbEREUg6S7 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJR+Rg7AAoJEKeha0olJ0Nq3jkIAKDRV0cDy5bA9f2xMBae+8pd yxEgusk9Qdw6K4jouow7X2e5aWivEj4rQGgoqwrxW6+s1ZFxUEJvbSLyotB8081W I1EMRTWH+cIzfc0Lb3NfVW3sqSg97AfdU9D47hh5O9HEBEaL6ZKr7FUpVBJ3oT7h jJi26N05ZN5GkEwIYMwML79tJqFBEMFet4ha4pjQwIga+2T30M/WFxnRK4+HGNtJ mN2dPAU1Ku4JHkRWnqS6x6FFNyz4cNjjUtM6jUSbMqn3EnD5bwhI2+SFyOm6u3IS RkuInxDcnCkwzSbR0x12eMzRr9lV/Di7fC53vpeclyG8Pl7LKnrEYMIT4BslRJ4= =Hg8n -----END PGP SIGNATURE-----

--NatQ8b7UXIfNkxMHeBemoN0DbEREUg6S7-- ------------=_1375279213-28307-1 Content-Type: message/rfc822 Content-Disposition: 
inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 30 Jul 2013 21:45:01 +0000 Received: from localhost ([127.0.0.1]:59176 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4Hj1-00074x-8J for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:45:00 -0400 Received: from eggs.gnu.org ([208.118.235.92]:55079) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1V4HDI-0005to-C9 for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:12:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1V4HD8-0002j4-4P for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:12:07 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: *** X-Spam-Status: No, score=3.3 required=5.0 tests=BAYES_50,HTML_MESSAGE, RECEIVED_FROM_WINDOWS_HOST autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:59969) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V4HD8-0002iz-26 for submit@debbugs.gnu.org; Tue, 30 Jul 2013 17:12:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56764) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V4HD3-0004sG-6A for bug-coreutils@gnu.org; Tue, 30 Jul 2013 17:12:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1V4HCy-0002hN-Cx for bug-coreutils@gnu.org; Tue, 30 Jul 2013 17:11:57 -0400 Received: from mail.pinnacledatasystems.com ([97.65.18.95]:11375) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V4HCy-0002gH-5o for bug-coreutils@gnu.org; Tue, 30 Jul 2013 17:11:52 -0400 Received: from PDS-BHM-MAIL1.pinnacledata.local ([::1]) by PDS-BHM-MAIL1.pinnacledata.local ([::1]) with mapi id 14.03.0123.003; Tue, 30 Jul 2013 15:51:09 -0500 From: Danny Nicholas To: "bug-coreutils@gnu.org" Subject: sort enhancement request Thread-Topic: sort enhancement request Thread-Index: Ac6NZoQIomh8iPUwRDizMBDglsGOpA== Date: Tue, 30 
Jul 2013 20:51:08 +0000 Message-ID: <1F727E6927E3F34787AEC8C1DCFAB965029D5F19@PDS-BHM-MAIL1.pinnacledata.local> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-originating-ip: [168.162.167.198] x-tm-as-product-ver: SMEX-10.1.0.2244-7.000.1014-20046.006 x-tm-as-result: No--43.546000-5.000000-31 x-tm-as-user-approved-sender: No x-tm-as-user-blocked-sender: No Content-Type: multipart/related; boundary="_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_"; type="multipart/alternative" MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -3.4 (---) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Tue, 30 Jul 2013 17:44:57 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---)

--_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: multipart/alternative; boundary="_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_"

--_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

Hi guys,

I am presently using version 7.1 on a Solaris box.  I downloaded 8.21 and really love the improvement in speed (almost 50% in some tests).  I am looking to replace the commercial product NSORT and would like this feature in the source instead of a wrapper.  If I have a file

XXXX300001XXXX
XXXX300002XXXX
XXXX300003XXXX
XXXX300003XXXX
XXXX300003XXXX
XXXX300003XXXX
XXXX300004XXXX
XXXX300005XXXX
XXXX300006XXXX
XXXX300007XXXX

NSORT keeps the 4 300003 records together in entry sequence.
My present work-around is to use a Python script that reads in the whole file and creates a pseudo-key that is 30000X plus an 8 digit sequence number (I process millions of records).  What I am thinking of is an -es (--entry-sequence) that would add a hidden -k to process on this internal sequence.  If I figure out how to do this on my own, I will submit it to you.

Thanks,
Danny Nicholas
Applications Programmer
Pinnacle Data Systems L.L.C.
Office: (205) 307-6874
danny.nicholas@pinnacledatasystems.com
www.pinnacledatasystems.com

[Description: Description: Description: https://encrypted-tbn1.google.com/images?q=tbn:ANd9GcRglmT5RwJEUk-1ZNPo_FI8y_udB6BL29pkwTt-Qh442v-FI1gH] [Description: Description: Description: https://encrypted-tbn0.google.com/images?q=tbn:ANd9GcSfD26ooDfMWD_xWRaMfbMcaBmkIKcG2oRxlaj6tBGYguC_aD71lw]
Follow us on LinkedIn and Twitter

CONFIDENTIALITY: This email (including any attachments) may contain confidential, proprietary and privileged information, and unauthorized disclosure or use is prohibited.  If you received this email in error, please notify the sender and delete this email from your system.

--_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

--_000_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_-- --_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: image/jpeg; name="image001.jpg" Content-Description: image001.jpg Content-Disposition: inline; filename="image001.jpg"; size=976; creation-date="Tue, 30 Jul 2013 20:51:08 GMT"; modification-date="Tue, 30 Jul 2013 20:51:08 GMT" Content-ID: Content-Transfer-Encoding: base64 [base64 image data omitted] --_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_ Content-Type: image/jpeg; name="image002.jpg" Content-Description: image002.jpg Content-Disposition: inline; filename="image002.jpg"; size=924; creation-date="Tue, 30 Jul 2013 20:51:08 GMT"; 
modification-date="Tue, 30 Jul 2013 20:51:08 GMT" Content-ID: Content-Transfer-Encoding: base64 [base64 image data omitted] --_005_1F727E6927E3F34787AEC8C1DCFAB965029D5F19PDSBHMMAIL1pinn_-- ------------=_1375279213-28307-1--