From unknown Thu Sep 11 12:41:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32603: sort bug? Resent-From: Michael Bartman Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Fri, 31 Aug 2018 16:36:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 32603 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: 32603@debbugs.gnu.org X-Debbugs-Original-To: bug-coreutils@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.153573334032520 (code B ref -1); Fri, 31 Aug 2018 16:36:01 +0000 Received: (at submit) by debbugs.gnu.org; 31 Aug 2018 16:35:40 +0000 Received: from localhost ([127.0.0.1]:40832 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmOY-0008SO-4O for submit@debbugs.gnu.org; Fri, 31 Aug 2018 12:35:38 -0400 Received: from eggs.gnu.org ([208.118.235.92]:36294) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvlce-0007C4-1w for submit@debbugs.gnu.org; Fri, 31 Aug 2018 11:46:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fvlcX-0003d9-V0 for submit@debbugs.gnu.org; Fri, 31 Aug 2018 11:46:02 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,HTML_MESSAGE, T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:44350) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fvlcX-0003cc-Jq for submit@debbugs.gnu.org; Fri, 31 Aug 2018 11:46:01 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56803) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fvlcW-00066j-JK for bug-coreutils@gnu.org; Fri, 31 Aug 2018 11:46:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fvlRN-0006jY-1o for bug-coreutils@gnu.org; Fri, 31 Aug 2018 11:34:29 -0400 
Received: from mail-pg1-x52e.google.com ([2607:f8b0:4864:20::52e]:39311) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fvlRM-0006jB-KQ for bug-coreutils@gnu.org; Fri, 31 Aug 2018 11:34:28 -0400 Received: by mail-pg1-x52e.google.com with SMTP id g20-v6so5613865pgv.6 for ; Fri, 31 Aug 2018 08:34:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sparkpost.com; s=google; h=mime-version:from:date:message-id:subject:to; bh=CK4wMIbNS4A06osISXpobtk0uweuoD0tGHW66l7XRKk=; b=c6H8U5bR2jeRjoDwXBKP+IEGd0hmMj7bHkGMZy52DlvQua0PJjbgcmNqeC20lQbyla rgPDtQxTVljiGdXtU1v/dw40qvPrZGk4k3fp+dcvFLXwuR6JvtdmIii3czIq6S5MTtXr eMwGm+Sxq1nLWJwPNWKouMtWH54QI+qH9ipwg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=CK4wMIbNS4A06osISXpobtk0uweuoD0tGHW66l7XRKk=; b=GvKy9a2J1thkWbN3ebcJcbODKS0ubTBOi/zQfYSvmjSoNAEvqpNf9O9jaAiXeAHcFL EWgRQe7NSk/YOg1SyJ7kDGsCWAyL7iGeGCsnuYM6NDUWbvO2SRzGQR+NVXeJ7N+SJuLI rT0uE1YALV3wKfa2aFwtMiBKb8vnm3inpu3r3u4D+IB/88DhGsu3lrYFufydoXeUvV8y SDph4rCWTKW3WVG6Rw5lV7U5L63cvu6PNgBC/KsN/uhiTp9tVUdwdoCEg7CZ21gw3Nfd pWrN6iNx+/PVZ+I9wTg88neWRMyYqK3pT9FMGz3tRcyBJF66sCTXX1ZnO68x0naMVYAa wEQA== X-Gm-Message-State: APzg51BMHrFqdnh/WamRcGIbGxJT49CMi2OqhxQdB4D4pSOdW+AEAJQu XXJcp+7zfktLGBdS1sa3BgFDV7JJvhmhsMRE4f3o5NZsFBc= X-Google-Smtp-Source: ANB0Vdbvxf35XToF1WYVzzBVN/833ELf+N2D96thJZrTCqxz/04v1ptkKFp1XwAbTRIY8yGdkHlC7s5wwasFZSVBo94= X-Received: by 2002:a62:6d02:: with SMTP id i2-v6mr16794896pfc.218.1535729666919; Fri, 31 Aug 2018 08:34:26 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a17:90a:709:0:0:0:0 with HTTP; Fri, 31 Aug 2018 08:34:26 -0700 (PDT) From: Michael Bartman Date: Fri, 31 Aug 2018 11:34:26 -0400 Message-ID: Content-Type: multipart/alternative; boundary="00000000000044a4370574bced2c" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Mailman-Approved-At: Fri, 31 Aug 2018 12:35:36 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

--00000000000044a4370574bced2c
Content-Type: text/plain; charset="UTF-8"

My version of sort seems to have unpredictable behavior, based on the data being sorted:

$ sort <foo
t
te
tec

$ sort <foo
t.co
tec.co
te.co

$ sort <foo
t.c
te.c
tec.c

$ sort <foo
t.co
tec.co
te.co

$ sort <foo
tec.o
te.o
t.o
$ sort --version
sort (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and Paul Eggert.
--

Using flags -d, -n, -R, -r, and -i had no effect on this behavior.

*Mike Bartman*
*senior software engineer - platform*

*tel* (415)-578-5222 x492
*email *michael.bartman@sparkpost.com

--00000000000044a4370574bced2c-- From unknown Thu Sep 11 12:41:57 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Michael Bartman Subject: bug#32603: closed (Re: bug#32603: sort bug?) Message-ID: References: <6959e399-1f37-026e-ebb4-1fbe9942e3d5@cs.ucla.edu> X-Gnu-PR-Message: they-closed 32603 X-Gnu-PR-Package: coreutils Reply-To: 32603@debbugs.gnu.org Date: Fri, 31 Aug 2018 16:45:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1535733902-983-1" This is a multi-part message in MIME format... ------------=_1535733902-983-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #32603: sort bug? which was filed against the coreutils package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 32603@debbugs.gnu.org. --=20 32603: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D32603 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1535733902-983-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 32603-done) by debbugs.gnu.org; 31 Aug 2018 16:44:44 +0000 Received: from localhost ([127.0.0.1]:40837 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmXM-0000FD-2j for submit@debbugs.gnu.org; Fri, 31 Aug 2018 12:44:44 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:46806) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmXK-0000F0-7p for 32603-done@debbugs.gnu.org; Fri, 31 Aug 2018 12:44:42 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 734EA160806; Fri, 31 Aug 2018 09:44:36 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) 
with ESMTP id IybGmVCPzkd3; Fri, 31 Aug 2018 09:44:35 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id C5081160DFC; Fri, 31 Aug 2018 09:44:35 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id hRtcpT4QuyDd; Fri, 31 Aug 2018 09:44:35 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-242-74-103.socal.res.rr.com [23.242.74.103]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id A39C0160806; Fri, 31 Aug 2018 09:44:35 -0700 (PDT) Subject: Re: bug#32603: sort bug? To: Michael Bartman , 32603-done@debbugs.gnu.org References: From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <6959e399-1f37-026e-ebb4-1fbe9942e3d5@cs.ucla.edu> Date: Fri, 31 Aug 2018 09:44:35 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 32603-done X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) "sort --help" says: *** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values. and that's what you have run into. 
------------=_1535733902-983-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 31 Aug 2018 16:35:40 +0000 Received: from localhost ([127.0.0.1]:40832 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmOY-0008SO-4O for submit@debbugs.gnu.org; Fri, 31 Aug 2018 12:35:38 -0400 Received: from eggs.gnu.org ([208.118.235.92]:36294) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvlce-0007C4-1w for submit@debbugs.gnu.org; Fri, 31 Aug 2018 11:46:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fvlcX-0003d9-V0 for submit@debbugs.gnu.org; Fri, 31 Aug 2018 11:46:02 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,HTML_MESSAGE, T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:44350) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fvlcX-0003cc-Jq for submit@debbugs.gnu.org; Fri, 31 Aug 2018 11:46:01 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56803) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fvlcW-00066j-JK for bug-coreutils@gnu.org; Fri, 31 Aug 2018 11:46:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fvlRN-0006jY-1o for bug-coreutils@gnu.org; Fri, 31 Aug 2018 11:34:29 -0400 Received: from mail-pg1-x52e.google.com ([2607:f8b0:4864:20::52e]:39311) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fvlRM-0006jB-KQ for bug-coreutils@gnu.org; Fri, 31 Aug 2018 11:34:28 -0400 Received: by mail-pg1-x52e.google.com with SMTP id g20-v6so5613865pgv.6 for ; Fri, 31 Aug 2018 08:34:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sparkpost.com; s=google; 
h=mime-version:from:date:message-id:subject:to; bh=CK4wMIbNS4A06osISXpobtk0uweuoD0tGHW66l7XRKk=; b=c6H8U5bR2jeRjoDwXBKP+IEGd0hmMj7bHkGMZy52DlvQua0PJjbgcmNqeC20lQbyla rgPDtQxTVljiGdXtU1v/dw40qvPrZGk4k3fp+dcvFLXwuR6JvtdmIii3czIq6S5MTtXr eMwGm+Sxq1nLWJwPNWKouMtWH54QI+qH9ipwg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=CK4wMIbNS4A06osISXpobtk0uweuoD0tGHW66l7XRKk=; b=GvKy9a2J1thkWbN3ebcJcbODKS0ubTBOi/zQfYSvmjSoNAEvqpNf9O9jaAiXeAHcFL EWgRQe7NSk/YOg1SyJ7kDGsCWAyL7iGeGCsnuYM6NDUWbvO2SRzGQR+NVXeJ7N+SJuLI rT0uE1YALV3wKfa2aFwtMiBKb8vnm3inpu3r3u4D+IB/88DhGsu3lrYFufydoXeUvV8y SDph4rCWTKW3WVG6Rw5lV7U5L63cvu6PNgBC/KsN/uhiTp9tVUdwdoCEg7CZ21gw3Nfd pWrN6iNx+/PVZ+I9wTg88neWRMyYqK3pT9FMGz3tRcyBJF66sCTXX1ZnO68x0naMVYAa wEQA== X-Gm-Message-State: APzg51BMHrFqdnh/WamRcGIbGxJT49CMi2OqhxQdB4D4pSOdW+AEAJQu XXJcp+7zfktLGBdS1sa3BgFDV7JJvhmhsMRE4f3o5NZsFBc= X-Google-Smtp-Source: ANB0Vdbvxf35XToF1WYVzzBVN/833ELf+N2D96thJZrTCqxz/04v1ptkKFp1XwAbTRIY8yGdkHlC7s5wwasFZSVBo94= X-Received: by 2002:a62:6d02:: with SMTP id i2-v6mr16794896pfc.218.1535729666919; Fri, 31 Aug 2018 08:34:26 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a17:90a:709:0:0:0:0 with HTTP; Fri, 31 Aug 2018 08:34:26 -0700 (PDT) From: Michael Bartman Date: Fri, 31 Aug 2018 11:34:26 -0400 Message-ID: Subject: sort bug? To: bug-coreutils@gnu.org Content-Type: multipart/alternative; boundary="00000000000044a4370574bced2c" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Fri, 31 Aug 2018 12:35:36 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

--00000000000044a4370574bced2c
Content-Type: text/plain; charset="UTF-8"

My version of sort seems to have unpredictable behavior, based on the data being sorted:

$ sort <foo
t
te
tec

$ sort <foo
t.co
tec.co
te.co

$ sort <foo
t.c
te.c
tec.c

$ sort <foo
t.co
tec.co
te.co

$ sort <foo
tec.o
te.o
t.o
$ sort --version
sort (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and Paul Eggert.
--

Using flags -d, -n, -R, -r, and -i had no effect on this behavior.

*Mike Bartman*
*senior software engineer - platform*

*tel* (415)-578-5222 x492
*email *michael.bartman@sparkpost.com

--00000000000044a4370574bced2c-- ------------=_1535733902-983-1-- From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 31 13:00:06 2018 Received: (at control) by debbugs.gnu.org; 31 Aug 2018 17:00:06 +0000 Received: from localhost ([127.0.0.1]:40851 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmmA-0000bT-H2 for submit@debbugs.gnu.org; Fri, 31 Aug 2018 13:00:06 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:34330 helo=mx1.redhat.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmm4-0000aZ-7f; Fri, 31 Aug 2018 12:59:56 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 552AD87A70; Fri, 31 Aug 2018 16:59:50 +0000 (UTC) Received: from [10.10.123.80] (ovpn-123-80.rdu2.redhat.com [10.10.123.80]) by smtp.corp.redhat.com (Postfix) with ESMTP id B21FC10FFE79; Fri, 31 Aug 2018 16:59:49 +0000 (UTC) Subject: Re: bug#32603: sort bug? To: 32603-done@debbugs.gnu.org, eggert@cs.ucla.edu, michael.bartman@sparkpost.com References: <6959e399-1f37-026e-ebb4-1fbe9942e3d5@cs.ucla.edu> From: Eric Blake Organization: Red Hat, Inc. 
Message-ID: Date: Fri, 31 Aug 2018 11:59:49 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <6959e399-1f37-026e-ebb4-1fbe9942e3d5@cs.ucla.edu> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 31 Aug 2018 16:59:50 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 31 Aug 2018 16:59:50 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'eblake@redhat.com' RCPT:'' X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

tag 32603 notabug
thanks

On 08/31/2018 11:44 AM, Paul Eggert wrote:
> "sort --help" says:
>
> *** WARNING ***
> The locale specified by the environment affects sort order.
> Set LC_ALL=C to get the traditional sort order that uses
> native byte values.
>
> and that's what you have run into.

To expound on Paul's answer:

> $ sort <foo
> t.co
> tec.co
> te.co

Let's run that with --debug to make it obvious:

$ printf 't.co\ntec.co\nte.co\n' | sort --debug
sort: using ‘en_US.UTF-8’ sorting rules
t.co
____
tec.co
______
te.co
_____

and realize that en_US.UTF-8 is a locale where punctuation is ignored when
determining collation order (thus, 'tco' < 'tecco' < 'teco' once you strip
out the ignored '.').

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

From unknown Thu Sep 11 12:41:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32603: sort bug?
Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Fri, 31 Aug 2018 17:10:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32603 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: notabug To: R0b0t1 , Eric Blake Cc: 32603-done@debbugs.gnu.org, michael.bartman@sparkpost.com Received: via spool by 32603-done@debbugs.gnu.org id=D32603.15357353423307 (code D ref 32603); Fri, 31 Aug 2018 17:10:02 +0000 Received: (at 32603-done) by debbugs.gnu.org; 31 Aug 2018 17:09:02 +0000 Received: from localhost ([127.0.0.1]:40881 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmus-0000rH-CZ for submit@debbugs.gnu.org; Fri, 31 Aug 2018 13:09:02 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:51320) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmup-0000qi-38 for 32603-done@debbugs.gnu.org; Fri, 31 Aug 2018 13:08:59 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id A249A160EC0; Fri, 31 Aug 2018 10:08:53 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id zymoR0z_4hQM; Fri, 31 Aug 2018 10:08:53 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id F2E49160EF0; Fri, 31 Aug 2018 10:08:52 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id ZJGT8b2MZuKO; Fri, 31 Aug 2018 10:08:52 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-242-74-103.socal.res.rr.com [23.242.74.103]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id C94B1160EC0; Fri, 31 Aug 2018 10:08:52 -0700 (PDT) References: <6959e399-1f37-026e-ebb4-1fbe9942e3d5@cs.ucla.edu> From: Paul Eggert Organization: UCLA Computer Science 
Department Message-ID: <3fda7427-77a3-fa55-bb66-09f3d4c5ddef@cs.ucla.edu> Date: Fri, 31 Aug 2018 10:08:52 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) R0b0t1 wrote: > I keep seeing these sort "bugs" pop up, they seem to be very popular. At > any point would the default behavior be seen as needing change? No matter what the default behavior is, it won't work for some applications, and "bugs" will pop up. From unknown Thu Sep 11 12:41:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32603: sort bug? Resent-From: R0b0t1 Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Fri, 31 Aug 2018 18:01:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32603 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: notabug To: Eric Blake Cc: Paul Eggert , 32603-done@debbugs.gnu.org, michael.bartman@sparkpost.com Received: via spool by 32603-done@debbugs.gnu.org id=D32603.153573841515653 (code D ref 32603); Fri, 31 Aug 2018 18:01:02 +0000 Received: (at 32603-done) by debbugs.gnu.org; 31 Aug 2018 18:00:15 +0000 Received: from localhost ([127.0.0.1]:40935 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvniQ-00044O-Ov for submit@debbugs.gnu.org; Fri, 31 Aug 2018 14:00:15 -0400 Received: from mail-lj1-f174.google.com ([209.85.208.174]:36651) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvmry-0000m7-Gn for 32603-done@debbugs.gnu.org; Fri, 31 Aug 2018 13:06:03 -0400 Received: by 
mail-lj1-f174.google.com with SMTP id v26-v6so10636686ljj.3 for <32603-done@debbugs.gnu.org>; Fri, 31 Aug 2018 10:06:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=0khUjN3y4B3i6EdxLM/HPNRxwl8KLGM0M3uQD6nn7Gk=; b=T6J8pUP/1Xvy4Cy++ztGzrR6YDv41+iacFAI+tQ9XcCpHxUGFnzXkQpUp5qb6SoBbs U1HZwddPG+AEtcaHIVN+BSEPu47uMwGq9Gw0CdccB1g3809pQNhbGfhdjT84DCqB/2JG wwjpFAcLuUuqXvSEyHUcvWzHut8cdZ2ZhvLo79+glUe1unqTJwhl++n71CulsY9EyR7I HX6Aw7KhVhwgy6FlNNv6PpwNQTSTW/xepgcZwfbapDlwU7aqKvx+VBUTzJrQGp4VgzCO t9mDPdHgJK4wqY8D9aSR6EFLmS2FZC/yHJL/2fFnJEw6MjELmm1czx3OwGJuZlnHJvyX pB5g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=0khUjN3y4B3i6EdxLM/HPNRxwl8KLGM0M3uQD6nn7Gk=; b=m6+SYZeny/htQW5sGtgIoA/h2DjsZmpEysKAVPT3KlQi4i6JfsP+Pe2uvyf7n+Da/7 kWz12UG79r2LWUYrIxSFMqtT/F9Gd83N8Si9PEIROzVOeSDWEhZHRpLy1zybFVv/NX/R /r0wWCoRicvKIBylP2+Zy8KcKujt0r0wH33+oVjmnye+roApWCpRMWuyAZ51JQnFx24o E0VTu3QCoGZ2a+q+r9Z4l63OYeQpD3C4JLxEd8AREHgTxDqFQsF2Cvdk2XS7jOBgIfR/ IQSw/Qr4Gr5bkFRt5uJLJyNAdJsETAThrbMKe/RcQDpf1kBvpj9cWAKhS7IofoOGE+RH TMow== X-Gm-Message-State: APzg51DX6MFopbRak4tM1Bl8xWJAg+g3yItvc+zgdG3LHAV9pQcmTOg6 bxGx36Tj1vbXyr0itfL+6j/GbOW1+qegS83HHxg= X-Google-Smtp-Source: ANB0Vdap5+NvXp6VgpuHm3Aq25yP6GxKZrmx3UU6nj5JhaPlLY8Lyiv4Gam0oZswjc9VRqFuPGuA8eqBhOKpi332Wm4= X-Received: by 2002:a2e:380d:: with SMTP id f13-v6mr11639021lja.74.1535735156541; Fri, 31 Aug 2018 10:05:56 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a2e:8647:0:0:0:0:0 with HTTP; Fri, 31 Aug 2018 10:05:56 -0700 (PDT) In-Reply-To: References: <6959e399-1f37-026e-ebb4-1fbe9942e3d5@cs.ucla.edu> From: R0b0t1 Date: Fri, 31 Aug 2018 12:05:56 -0500 Message-ID: Content-Type: multipart/alternative; boundary="000000000000798e540574be34ec" X-Spam-Score: 0.3 (/) X-Mailman-Approved-At: Fri, 31 Aug 
2018 14:00:12 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

--000000000000798e540574be34ec
Content-Type: text/plain; charset="UTF-8"

On Fri, Aug 31, 2018 at 11:59 AM, Eric Blake <eblake@redhat.com> wrote:

> tag 32603 notabug
> thanks
>
> On 08/31/2018 11:44 AM, Paul Eggert wrote:
>
>> "sort --help" says:
>>
>> *** WARNING ***
>> The locale specified by the environment affects sort order.
>> Set LC_ALL=C to get the traditional sort order that uses
>> native byte values.
>>
>> and that's what you have run into.
>
> To expound on Paul's answer:
>
> > $ sort <foo
> > t.co
> > tec.co
> > te.co
>
> Let's run that with --debug to make it obvious:
>
> $ printf 't.co\ntec.co\nte.co\n' | sort --debug
> sort: using ‘en_US.UTF-8’ sorting rules
> t.co
> ____
> tec.co
> ______
> te.co
> _____
>
> and realize that en_US.UTF-8 is a locale where punctuation is ignored when
> determining collation order (thus, 'tco' < 'tecco' < 'teco' once you strip
> out the ignored '.').

I keep seeing these sort "bugs" pop up, they seem to be very popular. At
any point would the default behavior be seen as needing change?

I'm not sure why I'd want to ignore special characters by default, for
example...

Cheers,
    R0b0t1

--000000000000798e540574be34ec-- From unknown Thu Sep 11 12:41:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32603: Thank you for the quick response and answer! References: In-Reply-To: Resent-From: Michael Bartman Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Fri, 31 Aug 2018 18:43:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32603 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: notabug To: 32603@debbugs.gnu.org Received: via spool by 32603-submit@debbugs.gnu.org id=B32603.153574092119765 (code B ref 32603); Fri, 31 Aug 2018 18:43:01 +0000 Received: (at 32603) by debbugs.gnu.org; 31 Aug 2018 18:42:01 +0000 Received: from localhost ([127.0.0.1]:41020 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvoMq-00058c-UP for submit@debbugs.gnu.org; Fri, 31 Aug 2018 14:42:01 -0400 Received: from mail-pf1-f172.google.com ([209.85.210.172]:45034) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fvoMo-00058M-IZ for 32603@debbugs.gnu.org; Fri, 31 Aug 2018 14:41:59 -0400 Received: by mail-pf1-f172.google.com with SMTP id k21-v6so5886498pff.11 for <32603@debbugs.gnu.org>; Fri, 31 Aug 2018 11:41:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sparkpost.com; s=google; h=mime-version:from:date:message-id:subject:to; bh=JKQuggE4yp0NnuVnx5OWGmPX4mvANPfMmb/sOFxVCA8=; b=WLlGEBM2obFhmVMLSJqmI7m1eQzzqE7y/2rA6T7l7BxVOgq38lJ0pFP3GixH5hf3l1 kqNHBELQmtEt/DTInKJ3hAaRzWogeGKA3BQFS6entEbtH5CUtI1dpVb33AV90nG1QK1y eftSyedDQMd/DbSycSyVSWcJePX/p1j+hAMJI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=JKQuggE4yp0NnuVnx5OWGmPX4mvANPfMmb/sOFxVCA8=; b=X4eRkTbtXP8WBjBkkBW8ueMMlJy/uop1iS0S2MweS/sVDWr+aFJtN8gh6QAJ1auioy K5RawTApuxBuh9k4XVrEo6lRQpT/Y9IGsMBWPG2EkfB8c8iq2OYLkzAaQJSXBgbI4Dfz 
XOa+fsQmX9EP26bvee+bEjMwAtFOnuIVpzaIrxhhQAir3rFZd5MgSW4SuAiUCKTEAQF/ fhMN/fWA3RVsOOLJMYlY30PpYkPUKieFpT2SWzxjQ3uNeZaguQ1CYIq7Pqf4V/znLY9b n3tmQXZIA8TTWNO+IpqyU0TKxnHSatnKyXBTQ7oel8OVMarVMKWJqIgBsQ7tK8zMiw1i /Zlg== X-Gm-Message-State: APzg51D6O7LrpMhkXYHConLoKVWkrFhDH8pME/yljLsEi/OQVGbY2mob t164MHocDV1yAUHXFFdiWS7AYVgMuSw+38wvd42MsEAg X-Google-Smtp-Source: ANB0VdZJdTfpcstXF+oVQVLJGXsFe1fIZJxz7I1+2iPdjCetG1FCUtJW7H24VFkXxQ5Zqs8XdMrXC2isHRsCKph91iI= X-Received: by 2002:a63:a745:: with SMTP id w5-v6mr15506035pgo.374.1535740912249; Fri, 31 Aug 2018 11:41:52 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a17:90a:709:0:0:0:0 with HTTP; Fri, 31 Aug 2018 11:41:51 -0700 (PDT) From: Michael Bartman Date: Fri, 31 Aug 2018 14:41:51 -0400 Message-ID: Content-Type: multipart/alternative; boundary="0000000000008ac3020574bf8b0e" X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --0000000000008ac3020574bf8b0e Content-Type: text/plain; charset="UTF-8" While the behavior of ignoring parts of the data is unexpected and confusing, the explanation is clear and useful, and the LC_ALL=C setting does result in the expected results. Thank you to all respondents. The explanation of LC_ALL use in the "sort --help" output could perhaps be clearer however to reduce the number of future "bug" reports. Perhaps something like this: "The locale specified by the environment affects sort order, and some locale specifications or defaults may ignore certain characters, such as punctuation. If you see unexpected sort output orderings, try setting LC_ALL=C to get the traditional sort order that uses native byte values." 
-- *Mike Bartman* *senior software engineer - platform* *tel* (415)-578-5222 x492 *email *michael.bartman@sparkpost.com --0000000000008ac3020574bf8b0e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--0000000000008ac3020574bf8b0e-- From unknown Thu Sep 11 12:41:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32603: Thank you for the quick response and answer! Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-coreutils@gnu.org Resent-Date: Sat, 01 Sep 2018 07:53:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32603 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: notabug To: Michael Bartman , 32603@debbugs.gnu.org Received: via spool by 32603-submit@debbugs.gnu.org id=B32603.153578833329027 (code B ref 32603); Sat, 01 Sep 2018 07:53:01 +0000 Received: (at 32603) by debbugs.gnu.org; 1 Sep 2018 07:52:13 +0000 Received: from localhost ([127.0.0.1]:41205 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fw0hX-0007Y5-7r for submit@debbugs.gnu.org; Sat, 01 Sep 2018 03:52:12 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:56414) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fw0hU-0007Xn-W7 for 32603@debbugs.gnu.org; Sat, 01 Sep 2018 03:52:09 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id EB03216161D; Sat, 1 Sep 2018 00:52:02 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id hce4bad3MrFr; Sat, 1 Sep 2018 00:52:02 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 365B616161F; Sat, 1 Sep 2018 00:52:02 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id ruaSwc8JZTMU; Sat, 1 Sep 2018 00:52:02 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-242-74-103.socal.res.rr.com [23.242.74.103]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 13A9116161D; Sat, 1 Sep 2018 00:52:02 -0700 (PDT) References: 
From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <45cb3945-9884-18c8-ca83-48ff1ee7d329@cs.ucla.edu> Date: Sat, 1 Sep 2018 00:52:01 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Michael Bartman wrote: > try setting > LC_ALL=C to get the traditional sort order that uses native byte values." LC_ALL=C is not guaranteed to do that. There is no requirement that it use native byte values; on the contrary, it is required to not use native byte values in some circumstances (e.g., z/OS EBCDIC environments). This is a complicated area, unfortunately, and it's not something that can easily be condensed into a single line in a help message.
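
The orderings discussed in this thread can be reproduced with a small Python sketch. It is only an approximation under stated assumptions: the helper names are hypothetical, the "punctuation-blind" key simply drops ASCII punctuation before comparing (real locale collation is more elaborate), and cp037 is used as one example of an EBCDIC code page, not necessarily the one a given z/OS system uses. None of this is coreutils or glibc code.

```python
import string


def byte_order(lines):
    """Order lines by their UTF-8 bytes.

    This is what LC_ALL=C typically yields on ASCII-based systems,
    although (as noted in the thread) that is not guaranteed everywhere.
    """
    return sorted(lines, key=lambda s: s.encode("utf-8"))


def punctuation_blind_order(lines):
    """Approximate the en_US.UTF-8 result reported in the thread.

    ASCII punctuation is dropped before comparing; the original text is
    used only to break ties between otherwise equal keys.
    """
    drop_punct = str.maketrans("", "", string.punctuation)
    return sorted(lines, key=lambda s: (s.translate(drop_punct), s))


def ebcdic_byte_order(lines):
    """Order lines by their bytes in cp037, an EBCDIC code page (assumed example)."""
    return sorted(lines, key=lambda s: s.encode("cp037"))


if __name__ == "__main__":
    names = ["te.co", "t.co", "tec.co"]
    print(byte_order(names))               # ['t.co', 'te.co', 'tec.co']
    print(punctuation_blind_order(names))  # ['t.co', 'tec.co', 'te.co']

    # Native byte values depend on the character encoding:
    print(byte_order(["1", "A", "a"]))         # ['1', 'A', 'a']
    print(ebcdic_byte_order(["1", "A", "a"]))  # ['a', 'A', '1']
```

The second result matches the order in the original report, where 'tco' < 'tecco' < 'teco' once the '.' is ignored. The last two lines hint at why "native byte values" is not a portable description of the C locale: the same three characters sort differently when the underlying encoding is EBCDIC rather than ASCII.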