From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 14:32:46 2016
Date: Tue, 31 May 2016 18:32:27 GMT
Message-Id: <201605311832.u4VIWRkw007923@freefriends.org>
From: Karl Berry
To: bug-coreutils@gnu.org
Subject: spaces in keys: doc, --debug in LC_ALL=C

Consider this three-line source file, say /tmp/foo:
M  Build/zfile
M  Master/mfile
MM Build/afile

There are two spaces after the M on the first two lines (and no trailing
spaces on any line).  I was trying to sort on the second "field".

I run
  LC_ALL=en_US.UTF-8 sort --debug -k 2 /tmp/foo  # or -k 2,2 et al.
And get the nicely explanatory output for the "surprising" result:
  sort: using ‘en_US.UTF-8’ sorting rules
  sort: leading blanks are significant in key 1; consider also specifying 'b'
  MM Build/afile
  ...

However, if I run that same command in the C locale:
  LC_ALL=C sort --debug -k 2 /tmp/foo  # or -k 2,2 et al.
the output lacks that crucial commentary line:
  sort: leading blanks are significant ...

But the information is just as valid in C as in UTF-8, so far as I can
see.  Thus it would be nice for it to be present.

It would also be nice if the definition of "key 1" was stated.
Awfully easy to misread that as "field 1".

More importantly, I urge that the documentation for sort give an example
of this.  The idea that following blanks after the first become part of
the next field is highly counter-intuitive.  The information is
implicitly there in "non-blank to blank transition", but it is a common
confounding of expectations and deserves explicit mention, IMHO.  (If
it's there, sorry, I didn't see it.)

This is with coreutils 8.25 (from original source).

Thanks,
Karl

From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 15:11:18 2016
Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C
To: Karl Berry, 23665@debbugs.gnu.org
References: <201605311832.u4VIWRkw007923@freefriends.org>
From: Assaf Gordon
Message-ID: <574DE1CE.2090706@gmail.com>
Date: Tue, 31 May 2016 15:11:10 -0400
In-Reply-To: <201605311832.u4VIWRkw007923@freefriends.org>

Hello Karl!

On 05/31/2016 02:32 PM, Karl Berry wrote:
> I run
>   LC_ALL=en_US.UTF-8 sort --debug -k 2 /tmp/foo  # or -k 2,2 et al.
> And get the nicely explanatory output for the "surprising" result:
[...]

Just to verify, the surprising result is in C locale?

I'm seeing the following; for "en_US.UTF-8" it's the order I'd expect,
but the "C" order is surprising:

    $ cat -A k.txt
    M  Build/zfile$
    M  Master/mfile$
    MM Build/afile$

    $ LC_ALL=en_US.UTF-8 sort -k2 k.txt
    MM Build/afile
    M  Build/zfile
    M  Master/mfile

    $ LC_ALL=C sort -k2 k.txt
    M  Build/zfile
    M  Master/mfile
    MM Build/afile

> But the information is just as valid in C as in UTF-8, so far as I can
> see.  Thus it would be nice for it to be present.

If I understand correctly, one could argue the warning is even more
important in the C locale than in UTF-8 locales,
as collating rules for UTF-8 make leading spaces less significant.

As in:

    $ cat -A s.txt
    M A$
    M B$
    M D$
    M C$

UTF-8 makes leading spaces less important:

    $ LC_ALL=en_US.UTF-8 sort -k2 s.txt
    M A
    M B
    M C
    M D

In the C locale, spaces (as simple bytes) do matter:

    $ LC_ALL=C sort -k2 s.txt
    M D
    M B
    M C
    M A

-b skips leading spaces:

    $ LC_ALL=C sort -k2b s.txt
    M A
    M B
    M C
    M D

> More importantly, I urge that the documentation for sort give an example
> of this.  The idea that following blanks after the first become part of
> the next field is highly counter-intuitive.

I agree,
I can add the above example to the documentation (also possibly to the FAQ or Gotcha pages?).
What do you think?

The condition to print this message is here:
http://lingrok.org/xref/coreutils/src/sort.c#2435
I can try to suggest a patch to print it in C locale as well (hopefully tonight).

> It would also be nice if the definition of "key 1" was stated.
> Awfully easy to misread that as "field 1".

How about "leading blanks are significant in sort key [...]" ?
(in http://lingrok.org/xref/coreutils/src/sort.c#2439 )

regards,
 - assaf

From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 18:46:51 2016
Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C
To: Assaf Gordon, Karl Berry, 23665@debbugs.gnu.org
References: <201605311832.u4VIWRkw007923@freefriends.org> <574DE1CE.2090706@gmail.com>
From: Pádraig Brady
Message-ID: <574E1457.3080500@draigBrady.com>
Date: Tue, 31 May 2016 23:46:47 +0100
In-Reply-To: <574DE1CE.2090706@gmail.com>

On 31/05/16 20:11, Assaf Gordon wrote:
> [...]
> The condition to print this message is here:
> http://lingrok.org/xref/coreutils/src/sort.c#2435
> I can try to suggest a patch to print it in C locale as well (hopefully tonight).

The warning was suppressed in this case as one might be using
such a command to sort right-aligned indexes:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.5-40-g63761c0

Now I was probably overthinking that a bit,
so I'd be happy for the removal of the maybe_space_aligned
from the condition.

cheers,
Pádraig.

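The trade-off described above (leading blanks are sometimes wanted, sometimes a trap) can be sketched with a few commands. This is only an illustration under stated assumptions: a POSIX shell and `sort`, and made-up inputs (`x bar`, `x  foo`, and three right-aligned numbers) that do not come from the thread.

```shell
#!/bin/sh
# Illustrative data only; none of these inputs appear in the original report.

# 1. Right-aligned indexes: the padding blanks make plain byte comparison
#    agree with numeric order, so significant leading blanks are wanted here.
aligned=$(printf '%3d\n' 10 9 100 | LC_ALL=C sort)
printf '%s\n' "$aligned"

# 2. Field 2 starts at the blank(s) that follow field 1, so in the C locale
#    the line with more blanks sorts first, although "bar" < "foo".
without_b=$(printf 'x bar\nx  foo\n' | LC_ALL=C sort -k2)
printf '%s\n' "$without_b"

# 3. The 'b' modifier skips the leading blanks before comparing the key.
with_b=$(printf 'x  foo\nx bar\n' | LC_ALL=C sort -k2b)
printf '%s\n' "$with_b"
```

In case 2 the `x  foo` line comes out first, which is the kind of result the debug warning is meant to explain; case 3 restores the order most people expect.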
From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 19:15:25 2016
Date: Tue, 31 May 2016 23:15:21 GMT
Message-Id: <201605312315.u4VNFLon026226@freefriends.org>
From: Karl Berry
To: assafgordon@gmail.com
Cc: 23665@debbugs.gnu.org
Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C
In-Reply-To: <574DE1CE.2090706@gmail.com>

    Just to verify, the surprising result is in C locale?

Yes.

    as collating rules for UTF-8 make leading spaces less significant.

Yes, which is a different problem, in itself.  Let me ask this: Are the
collation rules for en_US.UTF-8 documented or even reasonably
comprehensively described anywhere?  Or just buried in the bowels of
libc code?  I looked online and found nothing usable (surprisingly).

    I can add the above example to the documentation

Yes please.  The C example.  It can probably be cut down to two lines
and "foo" vs. "bar" or whatever.

    (also possibly to the FAQ or Gotcha pages?).
    What do you think?

You (or Paul, Jim, Bob, ...) would know better than me what deserves to
be on those pages (wherever they are).  My feeling would be yes.  Along
with mentioning --debug to, well, debug such things.

    How about "leading blanks are significant in sort key [...]" ?

I'm not sure what you mean by [...].  The %lu?
Are you proposing to just add the word "sort"?  That's not needed IMHO.

What I was thinking was something like this:

sort: leading blanks are significant in key 1 [-k 2]; consider also specifying 'b'

Since this is debugging output, the more information the better, in
theory, seems to me.  Maybe it is not feasible.  No biggie.

    (in http://lingrok.org/xref/coreutils/src/sort.c#2439 )

I don't see the change.  Sorry.

--thanks, karl.

From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 19:36:21 2016
Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C
To: Karl Berry, assafgordon@gmail.com
Cc: 23665@debbugs.gnu.org
References: <201605311832.u4VIWRkw007923@freefriends.org> <201605312315.u4VNFLon026226@freefriends.org>
From: Paul Eggert
Organization: UCLA Computer Science Department
Message-ID: <95271152-2318-8b6b-12e2-7beebc4e11c7@cs.ucla.edu>
Date: Tue, 31 May 2016 16:36:13 -0700
In-Reply-To: <201605312315.u4VNFLon026226@freefriends.org>

On 05/31/2016 04:15 PM, Karl Berry wrote:
> Are the
> collation rules for en_US.UTF-8 documented or even reasonably
> comprehensively described anywhere?

Although I think they are taken from ISO/IEC 14651, I expect they've
diverged from the standard by now, as a new version of the standard came
out this year. I don't know of any documentation other than the glibc
source code itself.

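Since the locale-specific rules are hard to look up, one practical way to see how much they matter is to compare against the C locale, where `sort` reports "using simple byte comparison". A minimal sketch with made-up input follows; only the C-locale result is captured, because the en_US.UTF-8 order depends on the locale data installed on the system.

```shell
#!/bin/sh
# Illustrative input, not taken from the thread.
# In the C locale sort compares raw bytes, so all upper-case ASCII letters
# come before all lower-case ones.
c_order=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort)
printf '%s\n' "$c_order"
```

Running the same pipeline under another locale with GNU sort's `--debug` option shows which sorting rules are actually in effect, as in the first message of this thread.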
From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 20:38:23 2016
Content-Type: multipart/mixed; boundary="Apple-Mail=_24E304D2-F98D-4C34-A189-1E50678B5C1E"
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2102\))
Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C
From: Assaf Gordon
In-Reply-To: <201605312315.u4VNFLon026226@freefriends.org>
Date: Tue, 31 May 2016 20:38:11 -0400
Message-Id: <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com>
References: <201605312315.u4VNFLon026226@freefriends.org>
To: Karl Berry
Cc: 23665@debbugs.gnu.org

--Apple-Mail=_24E304D2-F98D-4C34-A189-1E50678B5C1E
Content-Type: text/plain; charset=us-ascii

Hello Karl and all,

> On May 31, 2016, at 19:15, Karl Berry wrote:
[...]
> I'm not sure what you mean by [...].  The %lu?
> Are you proposing to just add the word "sort"?  That's not needed IMHO.

I was suggesting exactly that :)

Also, the word "key" appears in a few other messages; the attached patch
adds "sort" to them all. This is of course just a suggestion, and we can
use another syntax like the one you listed.

Attached are 3 patches, not finalized but good as a starting point for
comments.

1. rephrase "key" as "sort key". Perhaps this is superfluous - thoughts ?
2. add a bit more verbose progress information to the
   'sort-debug-warn.sh' test - just so it'll be easier to discuss the
   changed messages.
3. removes the 'maybe_space_aligned' and modifies the condition a bit.

Expanding on the third patch:
In the following two tests, the "leading space" warning is now printed
(the numbers refer to the numbers added in patch 2):

    #7  sort -gbr -k1,1n -k1,1r --debug /dev/null
    #11 sort -k1,1r --debug /dev/null

I would say this is correct, as spaces do matter for LC_ALL=C with
"-r" sorting:

    $ cat -A 1.txt
    x A$
    x B$
    x C$

    $ LC_ALL=C ./src/sort -k2,2r 1.txt
    x C
    x A
    x B

    $ LC_ALL=C ./src/sort -k2b,2r 1.txt
    x C
    x B
    x A

The "leading space" warning is removed from the last test, because for
keys that have "zero width" and are ignored there's no point in printing
the warning.

Comments welcomed,
 - assaf

--Apple-Mail=_24E304D2-F98D-4C34-A189-1E50678B5C1E
Content-Disposition: attachment; filename=0001-sort-clearify-key-meaning-in-debug-warnings.patch
Content-Type: application/octet-stream; name="0001-sort-clearify-key-meaning-in-debug-warnings.patch"

From bb94b560b7aa8e37469913382c0e383a76593c45 Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Tue, 31 May 2016 20:03:03 -0400
Subject: [PATCH 1/3] sort: clearify 'key' meaning in debug warnings

Avoids possible confusion between sorting keys and input fields.
Suggested by Karl Berry in http://bugs.gnu.org/23665 .

* src/sort.c: (key_warnings): rephrase 'key' to 'sort key' in warning
messages.
* tests/misc/sort-debug-warn.sh: adjust tests accordingly.
---
 src/sort.c                    |  9 +++++----
 tests/misc/sort-debug-warn.sh | 20 ++++++++++----------
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index aa52b75..72ee995 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2419,13 +2419,14 @@ key_warnings (struct keyfield const *gkey, bool gkey_only)
                       umaxtostr (eword + 1
                                  + (key->echar == SIZE_MAX), tmp));
             }
-          error (0, 0, _("obsolescent key %s used; consider %s instead"),
+          error (0, 0, _("obsolescent sort key %s used; consider %s instead"),
                  quote_n (0, obuf), quote_n (1, nbuf));
         }
 
       /* Warn about field specs that will never match.  */
       if (key->sword != SIZE_MAX && key->eword < key->sword)
-        error (0, 0, _("key %lu has zero width and will be ignored"), keynum);
+        error (0, 0, _("sort key %lu has zero width and will be ignored"),
+               keynum);
 
       /* Warn about significant leading blanks.  */
       bool implicit_skip = key_numeric (key) || key->month;
@@ -2436,7 +2437,7 @@ key_warnings (struct keyfield const *gkey, bool gkey_only)
           && ((!key->skipsblanks && !(implicit_skip || maybe_space_aligned))
               || (!key->skipsblanks && key->schar)
               || (!key->skipeblanks && key->echar)))
-        error (0, 0, _("leading blanks are significant in key %lu; "
+        error (0, 0, _("leading blanks are significant in sort key %lu; "
                        "consider also specifying 'b'"), keynum);
 
       /* Warn about numeric comparisons spanning fields,
@@ -2449,7 +2450,7 @@ key_warnings (struct keyfield const *gkey, bool gkey_only)
           if (!sword)
             sword++;
           if (!eword || sword < eword)
-            error (0, 0, _("key %lu is numeric and spans multiple fields"),
+            error (0, 0, _("sort key %lu is numeric and spans multiple fields"),
                    keynum);
         }
 
diff --git a/tests/misc/sort-debug-warn.sh b/tests/misc/sort-debug-warn.sh
index a31132d..da86d37 100755
--- a/tests/misc/sort-debug-warn.sh
+++ b/tests/misc/sort-debug-warn.sh
@@ -21,11 +21,11 @@ print_ver_ sort
 
 cat <<\EOF > exp
 sort: using simple byte comparison
-sort: key 1 has zero width and will be ignored
+sort: sort key 1 has zero width and will be ignored
 sort: using simple byte comparison
-sort: key 1 has zero width and will be ignored
+sort: sort key 1 has zero width and will be ignored
 sort: using simple byte comparison
-sort: key 1 is numeric and spans multiple fields
+sort: sort key 1 is numeric and spans multiple fields
 sort: using simple byte comparison
 sort: options '-bghMRrV' are ignored
 sort: using simple byte comparison
@@ -41,12 +41,12 @@ sort: option '-b' is ignored
 sort: using simple byte comparison
 sort: using simple byte comparison
 sort: using simple byte comparison
-sort: leading blanks are significant in key 1; consider also specifying 'b'
+sort: leading blanks are significant in sort key 1; consider also specifying 'b'
 sort: using simple byte comparison
-sort: leading blanks are significant in key 1; consider also specifying 'b'
+sort: leading blanks are significant in sort key 1; consider also specifying 'b'
 sort: option '-d' is ignored
 sort: using simple byte comparison
-sort: leading blanks are significant in key 1; consider also specifying 'b'
+sort: leading blanks are significant in sort key 1; consider also specifying 'b'
 sort: option '-i' is ignored
 sort: using simple byte comparison
 sort: using simple byte comparison
@@ -77,10 +77,10 @@ compare exp out || fail=1
 
 cat <<\EOF > exp
 sort: using simple byte comparison
-sort: key 1 is numeric and spans multiple fields
-sort: obsolescent key '+2 -1' used; consider '-k 3,1' instead
-sort: key 2 has zero width and will be ignored
-sort: leading blanks are significant in key 2; consider also specifying 'b'
+sort: sort key 1 is numeric and spans multiple fields
+sort: obsolescent sort key '+2 -1' used; consider '-k 3,1' instead
+sort: sort key 2 has zero width and will be ignored
+sort: leading blanks are significant in sort key 2; consider also specifying 'b'
 sort: option '-b' is ignored
 sort: option '-r' only applies to last-resort comparison
 EOF
-- 
2.7.0

--Apple-Mail=_24E304D2-F98D-4C34-A189-1E50678B5C1E
Content-Disposition: attachment; filename=0002-tests-sort-debug-warn-add-progress-information-lines.patch
Content-Type: application/octet-stream; name="0002-tests-sort-debug-warn-add-progress-information-lines.patch"

From 313c4749cf607b8e22a36cc860c138e0be76ee59 Mon Sep 17 00:00:00 2001
From: Assaf Gordon 
Date: Tue, 31 May 2016 20:11:34 -0400
Subject: [PATCH 2/3] tests: sort-debug-warn: add progress information lines

Easier troubleshooting of individual 'sort --debug' messages.

* tests/misc/sort-debug-warn.sh: add progress number before each sort
invocation.
---
 tests/misc/sort-debug-warn.sh | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/tests/misc/sort-debug-warn.sh b/tests/misc/sort-debug-warn.sh
index da86d37..4c0ef58 100755
--- a/tests/misc/sort-debug-warn.sh
+++ b/tests/misc/sort-debug-warn.sh
@@ -20,57 +20,93 @@
 print_ver_ sort
 
 cat <<\EOF > exp
+1
 sort: using simple byte comparison
 sort: sort key 1 has zero width and will be ignored
+2
 sort: using simple byte comparison
 sort: sort key 1 has zero width and will be ignored
+3
 sort: using simple byte comparison
 sort: sort key 1 is numeric and spans multiple fields
+4
 sort: using simple byte comparison
 sort: options '-bghMRrV' are ignored
+5
 sort: using simple byte comparison
 sort: options '-bghMRV' are ignored
 sort: option '-r' only applies to last-resort comparison
+6
 sort: using simple byte comparison
 sort: option '-r' only applies to last-resort comparison
+7
 sort: using simple byte comparison
 sort: options '-bg' are ignored
+8
 sort: using simple byte comparison
+9
 sort: using simple byte comparison
 sort: option '-b' is ignored
+10
 sort: using simple byte comparison
+11
 sort: using simple byte comparison
+12
 sort: using simple byte comparison
 sort: leading blanks are significant in sort key 1; consider also specifying 'b'
+13
 sort: using simple byte comparison
 sort: leading blanks are significant in sort key 1; consider also specifying 'b'
 sort: option '-d' is ignored
+14
 sort: using simple byte comparison
 sort: leading blanks are significant in sort key 1; consider also specifying 'b'
 sort: option '-i' is ignored
+15
 sort: using simple byte comparison
+16
 sort: using simple byte comparison
+17
 sort: using simple byte comparison
+18
 sort: failed to set locale; using simple byte comparison
 EOF
 
+echo 1 >> out
 sort -s -k2,1 --debug /dev/null 2>>out
+echo 2 >> out
 sort -s -k2,1n --debug /dev/null 2>>out
+echo 3 >> out
 sort -s -k1,2n --debug /dev/null 2>>out
+echo 4 >> out
 sort -s -rRVMhgb -k1,1n --debug /dev/null 2>>out
+echo 5 >> out
 sort -rRVMhgb -k1,1n --debug /dev/null 2>>out
+echo 6 >> out
 sort -r -k1,1n --debug /dev/null 2>>out
+echo 7 >> out
 sort -gbr -k1,1n -k1,1r --debug /dev/null 2>>out
+echo 8 >> out
 sort -b -k1b,1bn --debug /dev/null 2>>out # no warning
+echo 9 >> out
 sort -b -k1,1bn --debug /dev/null 2>>out
+echo 10 >> out
 sort -b -k1,1bn -k2b,2 --debug /dev/null 2>>out # no warning
+echo 11 >> out
 sort -r -k1,1r --debug /dev/null 2>>out # no warning for redundant options
+echo 12 >> out
 sort -i -k1,1i --debug /dev/null 2>>out # no warning
+echo 13 >> out
 sort -d -k1,1b --debug /dev/null 2>>out
+echo 14 >> out
 sort -i -k1,1d --debug /dev/null 2>>out
+echo 15 >> out
 sort -r --debug /dev/null 2>>out #no warning
+echo 16 >> out
 sort -rM --debug /dev/null 2>>out #no warning
+echo 17 >> out
 sort -rM -k1,1 --debug /dev/null 2>>out #no warning
+echo 18 >> out
 LC_ALL=missing sort --debug /dev/null 2>>out
 
 compare exp out || fail=1
-- 
2.7.0

--Apple-Mail=_24E304D2-F98D-4C34-A189-1E50678B5C1E
Content-Disposition: attachment; filename=0003-sort-modify-leading-spaces-debug-warning-scenarios.patch
Content-Type: application/octet-stream; name="0003-sort-modify-leading-spaces-debug-warning-scenarios.patch"
Content-Transfer-Encoding: quoted-printable

=46rom=20a9972d82869e1f31de857e30d29fe18417e8b4c9=20Mon=20Sep=2017=20=
00:00:00=202001=0AFrom:=20Assaf=20Gordon=20=0A=
Date:=20Tue,=2031=20May=202016=2020:19:38=20-0400=0ASubject:=20[PATCH=20= 3/3]=20sort:=20modify=20'leading=20spaces'=20debug=20warning=20scenarios=0A= =0APrint=20warning=20regardless=20of=20locale,=20avoid=20warning=20if=20= key=20is=20zero=20width.=0A=0A*=20src/sort.c:=20(key_warnings):=20change=20= conditions=20for=20'leading=20spaces'=0Awarning.=0A*=20= tests/misc/sort-debug-warn.sh:=20adjust=20tests=20accordingly.=0A---=0A=20= src/sort.c=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20|=20= 9=20++++-----=0A=20tests/misc/sort-debug-warn.sh=20|=203=20++-=0A=202=20= files=20changed,=206=20insertions(+),=206=20deletions(-)=0A=0Adiff=20= --git=20a/src/sort.c=20b/src/sort.c=0Aindex=2072ee995..5d9ecdd=20100644=0A= ---=20a/src/sort.c=0A+++=20b/src/sort.c=0A@@=20-2424,17=20+2424,16=20@@=20= key_warnings=20(struct=20keyfield=20const=20*gkey,=20bool=20gkey_only)=0A= =20=20=20=20=20=20=20=20=20}=0A=20=0A=20=20=20=20=20=20=20/*=20Warn=20= about=20field=20specs=20that=20will=20never=20match.=20=20*/=0A-=20=20=20= =20=20=20if=20(key->sword=20!=3D=20SIZE_MAX=20&&=20key->eword=20<=20= key->sword)=0A+=20=20=20=20=20=20bool=20zero_width=20=3D=20key->sword=20= !=3D=20SIZE_MAX=20&&=20key->eword=20<=20key->sword;=0A+=20=20=20=20=20=20= if=20(zero_width)=0A=20=20=20=20=20=20=20=20=20error=20(0,=200,=20= _("sort=20key=20%lu=20has=20zero=20width=20and=20will=20be=20ignored"),=0A= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20keynum);=0A=20=0A=20=20=20= =20=20=20=20/*=20Warn=20about=20significant=20leading=20blanks.=20=20*/=0A= =20=20=20=20=20=20=20bool=20implicit_skip=20=3D=20key_numeric=20(key)=20= ||=20key->month;=0A-=20=20=20=20=20=20bool=20maybe_space_aligned=20=3D=20= !hard_LC_COLLATE=20&&=20default_key_compare=20(key)=0A-=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20&&=20!(key->schar=20||=20key->echar);=0A=20=20=20=20=20=20=20bool=20= line_offset=20=3D=20key->eword=20=3D=3D=200=20&&=20key->echar=20!=3D=20= 
0;=20/*=20-k1.x,1.y=20=20*/=0A-=20=20=20=20=20=20if=20(!gkey_only=20&&=20= tab=20=3D=3D=20TAB_DEFAULT=20&&=20!line_offset=0A-=20=20=20=20=20=20=20=20= =20=20&&=20((!key->skipsblanks=20&&=20!(implicit_skip=20||=20= maybe_space_aligned))=0A+=20=20=20=20=20=20if=20(!zero_width=20&&=20= !gkey_only=20&&=20tab=20=3D=3D=20TAB_DEFAULT=20&&=20!line_offset=0A+=20=20= =20=20=20=20=20=20=20=20&&=20((!key->skipsblanks=20&&=20!implicit_skip)=0A= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20||=20(!key->skipsblanks=20= &&=20key->schar)=0A=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20||=20= (!key->skipeblanks=20&&=20key->echar)))=0A=20=20=20=20=20=20=20=20=20= error=20(0,=200,=20_("leading=20blanks=20are=20significant=20in=20sort=20= key=20%lu;=20"=0Adiff=20--git=20a/tests/misc/sort-debug-warn.sh=20= b/tests/misc/sort-debug-warn.sh=0Aindex=204c0ef58..903bb8d=20100755=0A= ---=20a/tests/misc/sort-debug-warn.sh=0A+++=20= b/tests/misc/sort-debug-warn.sh=0A@@=20-41,6=20+41,7=20@@=20sort:=20= using=20simple=20byte=20comparison=0A=20sort:=20option=20'-r'=20only=20= applies=20to=20last-resort=20comparison=0A=207=0A=20sort:=20using=20= simple=20byte=20comparison=0A+sort:=20leading=20blanks=20are=20= significant=20in=20sort=20key=202;=20consider=20also=20specifying=20'b'=0A= =20sort:=20options=20'-bg'=20are=20ignored=0A=208=0A=20sort:=20using=20= simple=20byte=20comparison=0A@@=20-51,6=20+52,7=20@@=20sort:=20option=20= '-b'=20is=20ignored=0A=20sort:=20using=20simple=20byte=20comparison=0A=20= 11=0A=20sort:=20using=20simple=20byte=20comparison=0A+sort:=20leading=20= blanks=20are=20significant=20in=20sort=20key=201;=20consider=20also=20= specifying=20'b'=0A=2012=0A=20sort:=20using=20simple=20byte=20comparison=0A= =20sort:=20leading=20blanks=20are=20significant=20in=20sort=20key=201;=20= consider=20also=20specifying=20'b'=0A@@=20-116,7=20+118,6=20@@=20sort:=20= using=20simple=20byte=20comparison=0A=20sort:=20sort=20key=201=20is=20= 
numeric=20and=20spans=20multiple=20fields=0A=20sort:=20obsolescent=20= sort=20key=20'+2=20-1'=20used;=20consider=20'-k=203,1'=20instead=0A=20= sort:=20sort=20key=202=20has=20zero=20width=20and=20will=20be=20ignored=0A= -sort:=20leading=20blanks=20are=20significant=20in=20sort=20key=202;=20= consider=20also=20specifying=20'b'=0A=20sort:=20option=20'-b'=20is=20= ignored=0A=20sort:=20option=20'-r'=20only=20applies=20to=20last-resort=20= comparison=0A=20EOF=0A--=20=0A2.7.0=0A=0A= --Apple-Mail=_24E304D2-F98D-4C34-A189-1E50678B5C1E-- From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 20:54:47 2016 Received: (at 23665) by debbugs.gnu.org; 1 Jun 2016 00:54:47 +0000 Received: from localhost ([127.0.0.1]:48869 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b7uQp-0004Wc-Dj for submit@debbugs.gnu.org; Tue, 31 May 2016 20:54:47 -0400 Received: from mail.magicbluesmoke.com ([82.195.144.49]:47670) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b7uQk-0004WQ-Az for 23665@debbugs.gnu.org; Tue, 31 May 2016 20:54:45 -0400 Received: from [192.168.1.80] (unknown [109.79.38.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.magicbluesmoke.com (Postfix) with ESMTPSA id B33A11EF; Wed, 1 Jun 2016 01:54:40 +0100 (IST) Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C To: Assaf Gordon , Karl Berry References: <201605312315.u4VNFLon026226@freefriends.org> <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: <574E3250.30501@draigBrady.com> Date: Wed, 1 Jun 2016 01:54:40 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 23665 Cc: 23665@debbugs.gnu.org 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On 01/06/16 01:38, Assaf Gordon wrote: > Hello Karl and all, > >> On May 31, 2016, at 19:15, Karl Berry wrote: > [...] >> I'm not sure what you mean by [...]. The %lu? >> Are you proposing to just add the word "sort"? That's not needed IMHO. > > I was suggesting exactly that :) > Also, the word "key" appears in few other messages, the attached patch adds "sort" to them all. This is of course just a suggestion, and we can use another syntax like the one you listed. > > Attached are 3 patches, not finalized but good as a starting point for comments. > > 1. rephrase "key" with "sort key". Perhaps this is superfluous - thoughts ? > > 2. add a bit more verbose progress information to the 'sort-debug-warn.sh' test - just so it'll be easier to discuss to the changed messages. > > 3. removes the 'maybe_space_aligned' and modifies the condition a bit. I'm 50:50 on 1. 2 and 3 are good to push. thanks! 
From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 21:52:35 2016 Received: (at 23665) by debbugs.gnu.org; 1 Jun 2016 01:52:35 +0000 Received: from localhost ([127.0.0.1]:48894 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b7vKl-0005wd-Me for submit@debbugs.gnu.org; Tue, 31 May 2016 21:52:35 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:44346) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b7vKj-0005wJ-O6 for 23665@debbugs.gnu.org; Tue, 31 May 2016 21:52:34 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 0B008161404; Tue, 31 May 2016 18:52:27 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id uJqLIN4Nyxs5; Tue, 31 May 2016 18:52:26 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 5EA3D161408; Tue, 31 May 2016 18:52:26 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id jnWEurGO8IOR; Tue, 31 May 2016 18:52:26 -0700 (PDT) Received: from [192.168.1.9] (unknown [100.32.155.148]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 3EE54161404; Tue, 31 May 2016 18:52:26 -0700 (PDT) Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C To: Assaf Gordon , Karl Berry References: <201605312315.u4VNFLon026226@freefriends.org> <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <574E3FD5.6020303@cs.ucla.edu> Date: Tue, 31 May 2016 18:52:21 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0 MIME-Version: 1.0 In-Reply-To: <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed 
Content-Transfer-Encoding: 7bit X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: 23665 Cc: 23665@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.4 (-) Assaf Gordon wrote: > 1. rephrase "key" with "sort key". Perhaps this is superfluous - thoughts ? I would leave it alone. This is the 'sort' program, after all, so it's hard to misinterpret "key". Plus, "obsolescent sort key" might be misinterpreted as meaning a key for an obsolescent sort, not a sort key that is obsolescent. From debbugs-submit-bounces@debbugs.gnu.org Tue May 31 22:15:34 2016 Received: (at 23665) by debbugs.gnu.org; 1 Jun 2016 02:15:34 +0000 Received: from localhost ([127.0.0.1]:48903 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b7vgz-0006Tl-SH for submit@debbugs.gnu.org; Tue, 31 May 2016 22:15:34 -0400 Received: from mail-qg0-f53.google.com ([209.85.192.53]:34153) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b7vgx-0006TY-O5 for 23665@debbugs.gnu.org; Tue, 31 May 2016 22:15:32 -0400 Received: by mail-qg0-f53.google.com with SMTP id p34so31609616qgp.1 for <23665@debbugs.gnu.org>; Tue, 31 May 2016 19:15:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=SNQdE5d1wCsofw17xKsFLny5IXmeGUGNOaJSDfsb7RI=; b=nvmkS0UX+EfiV9CZhbSW92AOhxZZpj65WfcX/8tMohEQfPw7GuBLs2aTcezlt+5ICX X7kkgTwW1bxiHJa5h2436j1x4S0EoPsNv04W03Nd5VJANJgNsZfzrBm6hHuSTMecnaL0 UGfnv9w32g/bhlS8pgOACL9V69obM+1ikXPVLamAmpnbGbkRvrzD2uYCoriIPhET4F8U QjA+RhayyPS41lyzAPeeisslqcBLkXiNdc4Ht5YSAwBs9pBc2L03AdgAF06VBaDrBVLf QfTcZd6DILpcS2g+JWT9OgC03b+Gw3uAcW4cgiNm8PDtf8SLo3HhtbYg5pbaS4EW4UL8 HWMg== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=SNQdE5d1wCsofw17xKsFLny5IXmeGUGNOaJSDfsb7RI=; b=DAwpHB6HhVoZRE4nPeKjAlrncc1rvhCkHQ8gBktC2Br8leXU6919odtrVhDf1N5zEz nZ+ef1O4z4BFZJKQdRiQWT1lqEPsIJoMvUSOLmkTWAG4wcb9D2Jh/grJzgn+1jaWKI10 PAm+ETIpjxO9aGG9LsUXPD7oTyp8UJLjFLocOz4tLECSEBnoMqIiv5N3pZxlQI1E9Myz XgW7qK/EONogG1tCmFXkM0k0hGyXU3k2T6olp6UvtmDfdLTR+eVZIb/tF8TLW68PcR/a RoAZL2qFGU8saAIPAQSfUWZWhvwlXDBQ0iICkWNajF/IsbXp/GFlm/6LE7bksXlqzMDf b0rQ== X-Gm-Message-State: ALyK8tI7EWfl+wzmLaaSfvVnz/XVopjFSAs/b7EAtbi4rrNGNqTNrEhgkkN44UB35mr19A== X-Received: by 10.140.34.133 with SMTP id l5mr25653767qgl.28.1464747326325; Tue, 31 May 2016 19:15:26 -0700 (PDT) Received: from ix.home (pool-100-38-105-55.nycmny.fios.verizon.net. [100.38.105.55]) by smtp.gmail.com with ESMTPSA id c94sm9081955qge.36.2016.05.31.19.15.25 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 31 May 2016 19:15:25 -0700 (PDT) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2102\)) Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C From: Assaf Gordon In-Reply-To: <574E3250.30501@draigBrady.com> Date: Tue, 31 May 2016 22:15:22 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: References: <201605312315.u4VNFLon026226@freefriends.org> <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com> <574E3250.30501@draigBrady.com> To: =?windows-1252?Q?P=E1draig_Brady?= X-Mailer: Apple Mail (2.2102) X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 23665 Cc: 23665@debbugs.gnu.org, Karl Berry X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/) > On May 31, 2016, at 20:54, P=E1draig Brady wrote: >=20 > On 01/06/16 01:38, Assaf Gordon wrote: 
>>
>> 2. add a bit more verbose progress information to the 'sort-debug-warn.sh' test - just so it'll be easier to discuss to the changed messages.
>>
>> 3. removes the 'maybe_space_aligned' and modifies the condition a bit.
>
> 2 and 3 are good to push.

Thank you, pushed in:
http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=d548f87595a193e21b170368bc8fc2ded4dadb73
http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=6223bf94bfeac75fb4252095864a80545ba00a0d

====

Regarding documentation: how about the following?

'sort' without '-t' separates fields by whitespace (tab and space characters) and considers the whitespace characters to be part of the field's text. Use the 'b' sorting option to skip leading spaces.

Example:

$ cat s.txt
M A
M  C
M   D
M  B

Without 'b', leading spaces affect the sorting order of the second field:

$ LC_ALL=C sort -k2 s.txt
M   D
M  B
M  C
M A

With 'b', leading spaces are skipped:

$ LC_ALL=C sort -k2b s.txt
M A
M  B
M  C
M   D

For troubleshooting, use 'sort --debug':

$ LC_ALL=C ./src/sort --debug -k2 s.txt
sort: using simple byte comparison
sort: leading blanks are significant in key 1; consider also specifying 'b'
M   D
 ____
_____
M  B
 ___
____
M  C
 ___
____
M A
 __
___

=========

Should such an example go in the documentation, or in the new 'gotcha' page?

I can shorten the example (e.g. with only two letters, such as 'printf "A\n B\n"'), but perhaps a slightly longer, more verbose example would help readers understand the issue at a glance.

The fixed first field "M" is there to make it visually clear where the spaces are.
comments welcomed, - assaf From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 01 07:05:12 2016 Received: (at 23665) by debbugs.gnu.org; 1 Jun 2016 11:05:13 +0000 Received: from localhost ([127.0.0.1]:49106 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b83xY-0005k2-Kb for submit@debbugs.gnu.org; Wed, 01 Jun 2016 07:05:12 -0400 Received: from mail.magicbluesmoke.com ([82.195.144.49]:49613) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1b83xV-0005js-QX for 23665@debbugs.gnu.org; Wed, 01 Jun 2016 07:05:11 -0400 Received: from [192.168.1.80] (unknown [109.78.232.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.magicbluesmoke.com (Postfix) with ESMTPSA id 726D0986C; Wed, 1 Jun 2016 12:05:08 +0100 (IST) Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C To: Assaf Gordon References: <201605312315.u4VNFLon026226@freefriends.org> <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com> <574E3250.30501@draigBrady.com> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: <574EC163.5010601@draigBrady.com> Date: Wed, 1 Jun 2016 12:05:07 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------080305080108090107080406" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 23665 Cc: 23665@debbugs.gnu.org, Karl Berry X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) This is a multi-part message in MIME format. 
--------------080305080108090107080406 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit On 01/06/16 03:15, Assaf Gordon wrote: > >> On May 31, 2016, at 20:54, Pádraig Brady wrote: >> >> On 01/06/16 01:38, Assaf Gordon wrote: >>> >>> 2. add a bit more verbose progress information to the 'sort-debug-warn.sh' test - just so it'll be easier to discuss to the changed messages. >>> >>> 3. removes the 'maybe_space_aligned' and modifies the condition a bit. >> >> 2 and 3 are good to push. > > Thank you, pushed in: > http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=d548f87595a193e21b170368bc8fc2ded4dadb73 > http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=6223bf94bfeac75fb4252095864a80545ba00a0d > > ==== > > Regarding documentation: how about the following? > > 'sort' without '-t' separates fields by whitespace (tab and space characters) and considers the whitespace characters to be part of the field's text. Use 'b' sorting option to skip leading spaces. > I've added essentially that summary to the --key description in the attached. > Example: I think these examples are a bit verbose for the info docs for the amount of info they convey. There are so many combinations of field handling options that it's best to give the rules and defer to the --debug option for what's actually happening. What I have done is to expand this discussion on field handling in: http://www.pixelbeat.org/docs/coreutils-gotchas.html#sort and drilling down from there have given an example of a comparison where leading blanks are significant (and useful) at the bottom of: http://www.pixelbeat.org/patches/coreutils/sort-debug/ cheers, Pádraig. 
--------------080305080108090107080406 Content-Type: text/x-patch; name="sort-key-docs.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="sort-key-docs.patch" >From a3311c966e34f2d9f8aa6b1de31b211124803d02 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Wed, 1 Jun 2016 11:56:47 +0100 Subject: [PATCH] doc: clarify sort --key handling of default field separators * doc/coreutils.texi (sort invocation): Mention in the summary discussion that --key is used to specify fields. Give a summary in the --key description, of the most common use case of specifying a field, and that by default those fields include the blank separators at the start of each field in the comparisons. --- doc/coreutils.texi | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 6a671bb..47c63db 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -4022,7 +4022,7 @@ sort [@var{option}]@dots{} [@var{file}]@dots{} Many options affect how @command{sort} compares lines; if the results are unexpected, try the @option{--debug} option to see what happened. A pair of lines is compared as follows: -@command{sort} compares each pair of fields, in the +@command{sort} compares each pair of fields (see @option{--key}), in the order specified on the command line, according to the associated ordering options, until a difference is found or no fields are left. If no key fields are specified, @command{sort} uses a default key of @@ -4332,7 +4332,14 @@ Specify a sort field that consists of the part of the line between @var{pos1} and @var{pos2} (or the end of the line, if @var{pos2} is omitted), @emph{inclusive}. 
-Each @var{pos} has the form @samp{@var{f}[.@var{c}][@var{opts}]}, +In its simplest form @var{pos} specifies a field number (starting with 1), +with fields being separated by runs of blank characters, and by default +those blanks being included in the comparison at the start of each field. +To adjust the handling of blank characters see the @option{-b} and +@option{-t} options. + +More generally, +each @var{pos} has the form @samp{@var{f}[.@var{c}][@var{opts}]}, where @var{f} is the number of the field to use, and @var{c} is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in -- 2.5.5 --------------080305080108090107080406-- From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 28 02:02:39 2018 Received: (at 23665) by debbugs.gnu.org; 28 Oct 2018 06:02:39 +0000 Received: from localhost ([127.0.0.1]:46173 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gGe9n-0001So-By for submit@debbugs.gnu.org; Sun, 28 Oct 2018 02:02:39 -0400 Received: from mail-pg1-f177.google.com ([209.85.215.177]:35607) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gGe9l-0001OZ-Bk; Sun, 28 Oct 2018 02:02:37 -0400 Received: by mail-pg1-f177.google.com with SMTP id 32-v6so2370197pgu.2; Sat, 27 Oct 2018 23:02:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:from:to:references:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=h65qwHhaw++ewUm62CzXEdKtAHGtWLMIJP3y7UUTBDE=; b=JPRvGpqhq4zcWoTquses3PBwbvgttc1ApkgizM6tfpEpFMeT7XRC2g+uIyeCPsO8zK oolv6hQtkMf/hIifWpfw5e0Q1hA3v2wJcTw8Mk0i2yjZZR4YuaLQq/g72NYGpsJC1uyQ fQeHS2bINhNZV2zQKvI49gmFK2/3Qqxk9ttj8vQ8BLRn7okm6cLlvvuxujFApJShK2Ca yCUQt6Jsjeo75MTeYuIhuWjE8768rsXTWRRdeUolKPifGS/ABP/JAO6MPn9hoiO93x0+ 6qKuIeuwdUkR8hrdXfv+ACVT0G9r4d+JhXu0FT3i2flfTL/aK6xtsX52leHoEOuDIWbd r4ag== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:from:to:references:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=h65qwHhaw++ewUm62CzXEdKtAHGtWLMIJP3y7UUTBDE=; b=hsF4HVqwYnfy/Tw+9Bo8O8wNW3qSYpabWgUQyprX4NkfnJW2ujUImYMfIuqusNY+XL 4aukB70CBOJwhxtMsEbA7KZyQ41I1HYoa4khtgGNd3cKF/GfYD6NXPBNL/PTYbD27xsj 3XvcY+lVwFM/bPQEtN+K6sQE65deYoIh2fqA89m/X3NaKL8vU+d2pfqqH3GfnMKRktd/ uVPLa1VSV4w0vR3iwhByFHe0L9f7iutZ6up8M+xzMEYPuUKTQEneRoFEVVnaQvl9/CXX F2ZfrGRqeteuyt3t9c4RGFayRXnl8APPObjNp5+yFlMNGlhElSp3c8Vbm78iVXdNhcEJ YOLA== X-Gm-Message-State: AGRZ1gKgvYq8v2FOXi70kxHFQFynU6v4ol9WPY4RJKLxCEpxcvg+IHRP m+4a64e8nOTZXnmWoe8xWxs/m2zR X-Google-Smtp-Source: AJdET5dCPiUgaX0v//u+EtLIkFRDQhpBGWbOA/NlkQsS++PhFRET/vtM4lanWLSFg90ea4NReh1VDw== X-Received: by 2002:a62:2803:: with SMTP id o3-v6mr3595487pfo.57.1540706550915; Sat, 27 Oct 2018 23:02:30 -0700 (PDT) Received: from tomato.housegordon.com (moose.housegordon.com. 
[184.68.105.38]) by smtp.googlemail.com with ESMTPSA id m25-v6sm8465919pgb.67.2018.10.27.23.02.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 27 Oct 2018 23:02:29 -0700 (PDT) Subject: Re: bug#23665: spaces in keys: doc, --debug in LC_ALL=C From: Assaf Gordon To: 23665@debbugs.gnu.org References: <201605312315.u4VNFLon026226@freefriends.org> <675BD18E-C99F-4D4E-8B71-77303D0D1C7F@gmail.com> <574E3250.30501@draigBrady.com> Message-ID: <80170433-913f-c286-bb3b-4e5242d921b5@gmail.com> Date: Sun, 28 Oct 2018 00:02:28 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 23665 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) tags 23665 fixed close 23665 stop (triaging old bugs) On 2016-05-31 8:15 p.m., Assaf Gordon wrote: > >> On May 31, 2016, at 20:54, Pádraig Brady wrote: >> >> On 01/06/16 01:38, Assaf Gordon wrote: >>> >>> 2. add a bit more verbose progress information to the 'sort-debug-warn.sh' test - just so it'll be easier to discuss to the changed messages. >>> >>> 3. removes the 'maybe_space_aligned' and modifies the condition a bit. >> >> 2 and 3 are good to push. > > Thank you, pushed in: > http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=d548f87595a193e21b170368bc8fc2ded4dadb73 > http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=6223bf94bfeac75fb4252095864a80545ba00a0d > With no further follow-ups in 2 years, I'm closing this as fixed. 
-assaf From unknown Sat Jul 26 21:31:40 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sun, 25 Nov 2018 12:24:10 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator
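The blank-handling rule discussed in this thread can be checked with a small sketch along the following lines. It is only an illustration, not part of any pushed patch: it assumes a POSIX shell and a 'sort' that follows the field rules described above, it reuses the three-line file from the original report, and the variable names (input, plain, skipped) are made up for the example.

```shell
# The three-line file from the original report: two spaces after "M" on the
# first two lines, one space after "MM" on the third. printf keeps the
# blanks explicit.
input=$(printf 'M  Build/zfile\nM  Master/mfile\nMM Build/afile\n')

# Without 'b': field 2 begins with its leading blanks. In the C locale a
# blank compares lower than any letter, so the two-blank lines sort first.
plain=$(printf '%s\n' "$input" | LC_ALL=C sort -k2)

# With 'b': leading blanks are skipped, so the comparison starts at the
# first non-blank character of field 2.
skipped=$(printf '%s\n' "$input" | LC_ALL=C sort -k2b)

printf 'sort -k2:\n%s\n\nsort -k2b:\n%s\n' "$plain" "$skipped"
```

With these inputs, 'sort -k2' leaves the lines in their original order (the "surprising" result from the report), while 'sort -k2b' puts "MM Build/afile" first, followed by "M  Build/zfile" and "M  Master/mfile".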