From unknown Sat Aug 16 21:18:12 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#9995 <9995@debbugs.gnu.org> To: bug#9995 <9995@debbugs.gnu.org> Subject: Status: problem about sort -u -k Reply-To: bug#9995 <9995@debbugs.gnu.org> Date: Sun, 17 Aug 2025 04:18:12 +0000 retitle 9995 problem about sort -u -k reassign 9995 coreutils submitter 9995 =E5=A4=8F=E5=87=AF severity 9995 normal tag 9995 notabug thanks From debbugs-submit-bounces@debbugs.gnu.org Tue Nov 08 12:24:42 2011 Received: (at submit) by debbugs.gnu.org; 8 Nov 2011 17:24:42 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RNpPd-0000PM-Ft for submit@debbugs.gnu.org; Tue, 08 Nov 2011 12:24:42 -0500 Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RNn2b-0002rN-4A for submit@debbugs.gnu.org; Tue, 08 Nov 2011 09:52:46 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RNmzh-0001NZ-Ub for submit@debbugs.gnu.org; Tue, 08 Nov 2011 09:49:47 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,FREEMAIL_FROM, RCVD_IN_DNSWL_LOW,T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from lists.gnu.org ([140.186.70.17]:35270) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RNmzh-0001NP-Sn for submit@debbugs.gnu.org; Tue, 08 Nov 2011 09:49:45 -0500 Received: from eggs.gnu.org ([140.186.70.92]:34845) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RNmzd-00061e-5o for bug-coreutils@gnu.org; Tue, 08 Nov 2011 09:49:45 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RNmzX-0001Lm-GY for bug-coreutils@gnu.org; Tue, 08 Nov 2011 
09:49:41 -0500 Received: from mail-fx0-f41.google.com ([209.85.161.41]:56398) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RNmzX-0001LX-Bv for bug-coreutils@gnu.org; Tue, 08 Nov 2011 09:49:35 -0500 Received: by faaq16 with SMTP id q16so768468faa.0 for ; Tue, 08 Nov 2011 06:49:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:from:date:message-id:subject:to:content-type; bh=WSSdbu86laAvliRCPdZ51sID93QIre9Qn/avOgjiVEY=; b=QrN2nGQJU8LPs53QOOqPPEufS8wxx91lUfbWdMMJC0NHy5vN/tvJvk1nBQmegP6HXk b3j9L830an4nyWUJH+qzC85hM/b/0HVZp/96NcdQUInrgCj+DO+pzMHMKvifJVwYc7p6 KqpGDNxPls3bkVHEoN1O0N8xGCpQ3kSap8Cu8= Received: by 10.223.75.15 with SMTP id w15mr55555825faj.9.1320763773175; Tue, 08 Nov 2011 06:49:33 -0800 (PST) MIME-Version: 1.0 Received: by 10.223.81.68 with HTTP; Tue, 8 Nov 2011 06:49:12 -0800 (PST) From: =?UTF-8?B?5aSP5Yev?= Date: Tue, 8 Nov 2011 22:49:12 +0800 Message-ID: Subject: problem about sort -u -k To: bug-coreutils@gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 140.186.70.17 X-Spam-Score: -5.9 (-----) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Tue, 08 Nov 2011 12:24:40 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -5.9 (-----) when i use sort command with -k and -n together, i got the wrong result: 22:41:21#tp#~> LC_ALL=C 22:41:39#tp#~> /usr/local/bin/sort -u -k1,3 a 1 a q 1 a w 3 a w 22:41:48#tp#~> /usr/local/bin/sort -u -k3 a 1 a q 1 a w 22:41:49#tp#~> cat a 1 a q 1 a w 3 a w 22:41:52#tp#~> /usr/local/bin/sort --version sort (GNU coreutils) 8.14 Copyright (C) 2011 Free Software 
Foundation, Inc. License GPLv3+: GNU GPL version 3 or later . This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Written by Mike Haertel and Paul Eggert. 22:41:57#tp#~> why is that? i read http://www.gnu.org/s/coreutils/manual/html_node/sort-invocation.html, but got nothing about this. any help is appreciate. -- contact me: MSN: walkerxk@gmail.com GTALK: walkerxk@gmail.com From debbugs-submit-bounces@debbugs.gnu.org Tue Nov 08 13:54:28 2011 Received: (at control) by debbugs.gnu.org; 8 Nov 2011 18:54:28 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RNqoV-0002aH-P6 for submit@debbugs.gnu.org; Tue, 08 Nov 2011 13:54:28 -0500 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RNqoR-0002Zx-BY; Tue, 08 Nov 2011 13:54:25 -0500 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id pA8IsImX022683 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 8 Nov 2011 13:54:18 -0500 Received: from [10.3.113.131] (ovpn-113-131.phx2.redhat.com [10.3.113.131]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id pA8IsI4G021337; Tue, 8 Nov 2011 13:54:18 -0500 Message-ID: <4EB97AD9.4050502@redhat.com> Date: Tue, 08 Nov 2011 11:54:17 -0700 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110928 Fedora/3.1.15-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.4 Thunderbird/3.1.15 MIME-Version: 1.0 To: =?UTF-8?B?5aSP5Yev?= Subject: Re: bug#9995: problem about sort -u -k References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to 
quoted-printable by mx1.redhat.com id pA8IsImX022683 X-Spam-Score: -10.3 (----------) X-Debbugs-Envelope-To: control Cc: 9995-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.3 (----------)

tag 9995 notabug
thanks

On 11/08/2011 07:49 AM, 夏凯 wrote:
> when i use sort command with -k and -n together, i got the wrong result:

Thanks for the report; however, this is most likely not a bug in sort, but in your usage patterns. Your sentence mentioned -k and -n together, but your example and subject line mentioned -u and -k together; so I'll assume that you got surprised by -u, not -n.

> 22:41:21#tp#~> LC_ALL=C

Unless you also did 'export LC_ALL' at some point, this does not guarantee that child processes will see this setting in their environment.

> 22:41:39#tp#~> /usr/local/bin/sort -u -k1,3 a
> 1 a q
> 1 a w
> 3 a w
> 22:41:48#tp#~> /usr/local/bin/sort -u -k3 a
> 1 a q
> 1 a w
> 22:41:49#tp#~> cat a
> 1 a q
> 1 a w
> 3 a w
> 22:41:52#tp#~> /usr/local/bin/sort --version
> sort (GNU coreutils) 8.14

That's new enough that you can use the --debug option to see what was really going on:

$ LC_ALL=C ../coreutils/src/sort --debug -u -k1,3 a
sort: using simple byte comparison
1 a q
_____
1 a w
_____
3 a w
_____

Here, you compared all three lines, which were all distinct.

$ LC_ALL=C ../coreutils/src/sort --debug -u -k3 a
sort: using simple byte comparison
1 a q
   __
1 a w
   __

Here, you told sort to only look at a key of field three onwards, and to uniquify the results (that is, don't display multiple lines if they had the same sort key). Since two lines both have the string " w" as the -k3 key, sort -u picked one of those lines (namely "3 a w") to be discarded on output.
This behavior matches POSIX rules.

Since you didn't tell us what output you were hoping to get, I can't tell you the proper command line that would match your expected output. Feel free to reply, even while this bug is closed, if you need more help in getting the output you want. Also, if you can prove that sort is doing something wrong, then feel free to reopen this bug with more evidence of why it is a bug in sort, including --debug output to back up your claim (but be aware that more than 90% of "bug" reports against sort have been debunked as user error rather than an actual bug in sort).

-- 
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org

From debbugs-submit-bounces@debbugs.gnu.org Tue Nov 08 14:45:24 2011 Received: (at 9995) by debbugs.gnu.org; 8 Nov 2011 19:45:24 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RNrbn-0004YN-Qi for submit@debbugs.gnu.org; Tue, 08 Nov 2011 14:45:24 -0500 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RNrbk-0004Y4-0G for 9995@debbugs.gnu.org; Tue, 08 Nov 2011 14:45:22 -0500 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id pA8JjC3c020184 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 8 Nov 2011 14:45:12 -0500 Received: from [10.3.113.131] (ovpn-113-131.phx2.redhat.com [10.3.113.131]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id pA8JjBbN001613; Tue, 8 Nov 2011 14:45:12 -0500 Message-ID: <4EB986C7.6070504@redhat.com> Date: Tue, 08 Nov 2011 12:45:11 -0700 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110928 Fedora/3.1.15-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.4 Thunderbird/3.1.15 
MIME-Version: 1.0 To: 9995@debbugs.gnu.org, =?UTF-8?B?5aSP5Yev?= Subject: Re: bug#9995: problem about sort -u -k References: <4EB97AD9.4050502@redhat.com> In-Reply-To: <4EB97AD9.4050502@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 X-Spam-Score: -10.3 (----------) X-Debbugs-Envelope-To: 9995 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.3 (----------) On 11/08/2011 11:54 AM, Eric Blake wrote: >> 22:41:39#tp#~> /usr/local/bin/sort -u -k1,3 a >> 1 a q >> 1 a w >> 3 a w >> 22:41:48#tp#~> /usr/local/bin/sort -u -k3 a >> 1 a q >> 1 a w > Since you didn't tell us what output you were hoping to get, I can't > tell you the proper command line that would match your expected output. > Feel free to reply, even while this bug is closed, if you need more help > in getting the output you want. 
I'll give a preemptive attempt at guessing what you meant, as well:

If you wanted to sort on just the third and subsequent fields, but then strip duplicate lines only if the entire line is duplicate, then you have to use two processes:

    sort [-s] -k3 a | uniq

If you don't mind a two-key sort, where the primary key is the third and subsequent fields, but where the secondary key is the entire line so as to force sort -u to consider the entire line when determining uniqueness, then one process will do:

    sort -u -k3 -k1 a

To see the difference, and remembering that sort -u implies sort -s, consider these contents for a:

$ cat a
1 a q
2 a q
1 a q
1 a w
3 a w
$ sort -u -k3 -k1 a
1 a q
2 a q
1 a w
3 a w
$ sort -s -k3 a | uniq
1 a q
2 a q
1 a q
1 a w
3 a w
$ sort -k3 a | uniq
1 a q
2 a q
1 a w
3 a w

That is, if the stable sort of just -k3 leaves identical lines that are not adjacent ("1 a q" in my example), then the separate uniq process won't filter them; while using sort -u with -k1 as the means to force the entire line as a secondary sort key loses the ability to leave identical lines separated by a distinct line. Likewise, omitting both -s and -u lets sort imply a last-resort -k1, at which point uniq sees the same line order as sort -u sees.

>> i read http://www.gnu.org/s/coreutils/manual/html_node/sort-invocation.html,
>> but got nothing about this.

Actually, it does - under the option -u, I see:

    The commands sort -u and sort | uniq are equivalent, but this equivalence does not extend to arbitrary sort options. For example, sort -n -u inspects only the value of the initial numeric string when checking for uniqueness, whereas sort -n | uniq inspects the entire line. See uniq invocation.
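The comparison above can be reproduced with a short shell sketch. It recreates the five-line sample file in a scratch directory and runs the one-process and two-process variants side by side. The use of mktemp and the variable names one_process and two_process are illustrative assumptions, not part of the original report, and LC_ALL=C is exported so the locale does not affect ordering.

```shell
# Recreate the sample file from the message above in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' '1 a q' '2 a q' '1 a q' '1 a w' '3 a w' > "$tmpdir/a"

# Export the locale so that the child sort processes really see it.
export LC_ALL=C

# One process: -k3 is the primary key, -k1 (the whole line) breaks ties,
# so -u only drops lines that are identical in full.
one_process=$(sort -u -k3 -k1 "$tmpdir/a")

# Two processes: stable sort on -k3 alone, then uniq drops only
# duplicate lines that happen to be adjacent.
two_process=$(sort -s -k3 "$tmpdir/a" | uniq)

printf 'sort -u -k3 -k1:\n%s\n\n' "$one_process"
printf 'sort -s -k3 | uniq:\n%s\n' "$two_process"

rm -rf "$tmpdir"
```

The first variant prints four lines and the second prints five, because the stable sort keeps "2 a q" between the two copies of "1 a q" and uniq never sees them as neighbors.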
-- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 09 09:03:08 2011 Received: (at 9995) by debbugs.gnu.org; 9 Nov 2011 14:03:08 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RO8k4-0006xK-4Z for submit@debbugs.gnu.org; Wed, 09 Nov 2011 09:03:08 -0500 Received: from mail-fx0-f44.google.com ([209.85.161.44]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RO8jy-0006wo-2I for 9995@debbugs.gnu.org; Wed, 09 Nov 2011 09:03:02 -0500 Received: by faas12 with SMTP id s12so1647908faa.3 for <9995@debbugs.gnu.org>; Wed, 09 Nov 2011 06:02:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type:content-transfer-encoding; bh=e6T0EIhdLlYm765e4gWsHEe6U5PBIl44NORVUs5GN+c=; b=Y3/9o69yXuUvFrELSbRCloW8zvKm5r4/PC3ikbuAwr8SXNVS64n/0zdeh2alJH+vSg mUb9bG6/DXJUaidz8hQaqNuU6kVlkYodLSOkAcSx+a/WOUvutOV5iRSbpvPfBseD3myQ hBmVlNtQiauozk4nwFEoVHwl3tl8t/Tnd4HgY= Received: by 10.223.75.15 with SMTP id w15mr5199351faj.9.1320847367144; Wed, 09 Nov 2011 06:02:47 -0800 (PST) MIME-Version: 1.0 Received: by 10.223.81.68 with HTTP; Wed, 9 Nov 2011 06:02:26 -0800 (PST) In-Reply-To: <4EB986C7.6070504@redhat.com> References: <4EB97AD9.4050502@redhat.com> <4EB986C7.6070504@redhat.com> From: =?UTF-8?B?5aSP5Yev?= Date: Wed, 9 Nov 2011 22:02:26 +0800 Message-ID: Subject: Re: bug#9995: problem about sort -u -k To: 9995@debbugs.gnu.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -4.8 (----) X-Debbugs-Envelope-To: 9995 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -4.4 (----)

thanks for your reply.
if i want to use the entire line as a key, and sort by the third field, should i use sort -u -k3 -k1 -k2 a to do that?

On Wed, Nov 9, 2011 at 03:45, Eric Blake wrote:
> On 11/08/2011 11:54 AM, Eric Blake wrote:
>>>
>>> 22:41:39#tp#~> /usr/local/bin/sort -u -k1,3 a
>>> 1 a q
>>> 1 a w
>>> 3 a w
>>> 22:41:48#tp#~> /usr/local/bin/sort -u -k3 a
>>> 1 a q
>>> 1 a w
>
>> Since you didn't tell us what output you were hoping to get, I can't
>> tell you the proper command line that would match your expected output.
>> Feel free to reply, even while this bug is closed, if you need more help
>> in getting the output you want.
>
> I'll give a preemptive attempt at guessing what you meant, as well:
>
> If you wanted to sort on just the third and subsequent fields, but then
> strip duplicate lines only if the entire line is duplicate, then you have to
> use two processes:
>
> sort [-s] -k3 a | uniq
>
> If you don't mind a two-key sort, where the primary key is the third and
> subsequent fields, but where the secondary key is the entire line so as to
> force sort -u to consider the entire line when determining uniqueness, then
> one process will do:
>
> sort -u -k3 -k1 a
>
> To see the difference, and remembering that sort -u implies sort -s,
> consider these contents for a:
>
> $ cat a
> 1 a q
> 2 a q
> 1 a q
> 1 a w
> 3 a w
> $ sort -u -k3 -k1 a
> 1 a q
> 2 a q
> 1 a w
> 3 a w
> $ sort -s -k3 a | uniq
> 1 a q
> 2 a q
> 1 a q
> 1 a w
> 3 a w
> $ sort -k3 a | uniq
> 1 a q
> 2 a q
> 1 a w
> 3 a w
>
> That is, if the stable sort of just -k3 leaves identical lines that are not
> adjacent ("1 a q" in my example), then the separate uniq process won't
> filter them; while using sort -u with -k1 as the means to force the entire
> line as a secondary sort key loses the ability to leave identical lines
> separated by a distinct line. Likewise, omitting both -s and -u lets sort
> imply a last-resort -k1, at which point uniq sees the same line order as
> sort -u sees.
>
>>> i read
>>> http://www.gnu.org/s/coreutils/manual/html_node/sort-invocation.html,
>>> but got nothing about this.
>
> Actually, it does - under the option -u, I see:
>
> The commands sort -u and sort | uniq are equivalent, but this equivalence
> does not extend to arbitrary sort options. For example, sort -n -u inspects
> only the value of the initial numeric string when checking for uniqueness,
> whereas sort -n | uniq inspects the entire line. See uniq invocation.
>
> --
> Eric Blake eblake@redhat.com +1-801-349-2682
> Libvirt virtualization library http://libvirt.org
>

-- 
contact me: MSN: walkerxk@gmail.com GTALK: walkerxk@gmail.com

From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 09 09:58:59 2011 Received: (at 9995) by debbugs.gnu.org; 9 Nov 2011 14:58:59 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RO9c6-0008IB-LU for submit@debbugs.gnu.org; Wed, 09 Nov 2011 09:58:58 -0500 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1RO9c1-0008I0-MP for 9995@debbugs.gnu.org; Wed, 09 Nov 2011 09:58:54 -0500 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id pA9Ewh7j001879 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 9 Nov 2011 09:58:43 -0500 Received: from [10.3.113.109] (ovpn-113-109.phx2.redhat.com [10.3.113.109]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id pA9Ewgkq018055; Wed, 9 Nov 2011 09:58:42 -0500 Message-ID: <4EBA9522.10409@redhat.com> Date: Wed, 09 Nov 2011 07:58:42 -0700 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110928 
Fedora/3.1.15-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.4 Thunderbird/3.1.15 MIME-Version: 1.0 To: =?UTF-8?B?5aSP5Yev?= , 9995@debbugs.gnu.org Subject: Re: bug#9995: problem about sort -u -k References: <4EB97AD9.4050502@redhat.com> <4EB986C7.6070504@redhat.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by mx1.redhat.com id pA9Ewh7j001879 X-Spam-Score: -10.3 (----------) X-Debbugs-Envelope-To: 9995 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.3 (----------)

[Let's keep the list in the loop]

On 11/08/2011 07:58 PM, 夏凯 wrote:
> thanks for your reply.
> if i want to get my result, should i use sort -u -k3 -k1 -k2 a
> to do that?
>

I'm still not quite sure what result you want.

    sort -u -k3 -k1 -k2 a

says to sort with three keys - from field 3 to the end of the line, from field 1 to the end of the line (aka the entire line), and from field 2 to the end of the line (that -k2 is useless, since sorting by field 1 to the end of the line already sorted everything so that there are no longer any distinguishing factors from field 2 to the end of the line). Then, after sorting, sort discards any lines where all three keys are identical, and since the -k1 key was the entire line, you are discarding only duplicate lines. But I don't know if that is what you wanted.
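One way to check the claim that the trailing -k2 adds nothing is to run the command with and without it and compare the results. The sketch below reuses the five-line sample file from earlier in the thread; the scratch directory and the variable names with_k2 and without_k2 are assumptions made for illustration only.

```shell
# Recreate the sample file used earlier in the thread.
tmpdir=$(mktemp -d)
printf '%s\n' '1 a q' '2 a q' '1 a q' '1 a w' '3 a w' > "$tmpdir/a"

# Export the locale so the comparison is byte-wise in the sort processes.
export LC_ALL=C

# Same command with and without the trailing -k2 key.
with_k2=$(sort -u -k3 -k1 -k2 "$tmpdir/a")
without_k2=$(sort -u -k3 -k1 "$tmpdir/a")

# Two lines that tie on -k1 (the whole line) necessarily tie on -k2 too,
# so the extra key can never change the order or the -u decision.
if [ "$with_k2" = "$without_k2" ]; then
    echo "-k2 made no difference"
else
    echo "-k2 changed the output"
fi

rm -rf "$tmpdir"
```

For this sample the script reports that -k2 made no difference, which is consistent with the explanation above.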
--=20 Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 09 22:09:05 2011 Received: (at 9995) by debbugs.gnu.org; 10 Nov 2011 03:09:05 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1ROL0i-0001Bz-Kf for submit@debbugs.gnu.org; Wed, 09 Nov 2011 22:09:05 -0500 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1ROL0e-0001BV-Fo for 9995@debbugs.gnu.org; Wed, 09 Nov 2011 22:09:02 -0500 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id pAA38kD7016534 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for <9995@debbugs.gnu.org>; Wed, 9 Nov 2011 22:08:46 -0500 Received: from [10.3.113.23] (ovpn-113-23.phx2.redhat.com [10.3.113.23]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id pAA38jQZ000563 for <9995@debbugs.gnu.org>; Wed, 9 Nov 2011 22:08:46 -0500 Message-ID: <4EBB403D.2040708@redhat.com> Date: Wed, 09 Nov 2011 20:08:45 -0700 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110928 Fedora/3.1.15-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.4 Thunderbird/3.1.15 MIME-Version: 1.0 To: 9995@debbugs.gnu.org Subject: Re: bug#9995: problem about sort -u -k References: <4EB97AD9.4050502@redhat.com> <4EB986C7.6070504@redhat.com> <4EBA9522.10409@redhat.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by mx1.redhat.com id pAA38kD7016534 X-Spam-Score: -10.3 (----------) X-Debbugs-Envelope-To: 9995 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: 
list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.3 (----------)

[top-posting on technical lists is generally frowned on]
[re-adding the list - it's always wiser to keep the list in the loop]

On 11/09/2011 07:25 PM, 夏凯 wrote:
> actually, i just want the result of sort -sk3 a|uniq, we can't just
> use -u to instead of uniq?

Nope, and I already explained why and gave a sample file to demonstrate it. These two are equivalent:

    sort -k3 a | uniq
    sort -u -k3 -k1 a

but there is no way to get both stable sorting that leaves fields 1 and 2 unsorted and in the original order, as well as stripping adjacent duplicate lines, without also involving a separate uniq process. That is, there is no one-process counterpart to:

    sort -s -k3 a | uniq

The reason is that the only way to match uniq behavior is to have the sort key cover the entire line, but the moment you add -k1 to cover the entire line, your sort is no longer stable on your original sort of just -k3.

Also, you may want to consider whether -k3 is what you really meant, or if you want to use -k3,3 (that is, whether sorting by the entire line except for the first two fields, or sorting by just the third field while ignoring any fourth or later field). Note that I intentionally used -k1 as shorthand for the entire line.

-- 
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org

From unknown Sat Aug 16 21:18:12 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Thu, 08 Dec 2011 12:24:03 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs administrator