From unknown Sun Jun 22 11:33:36 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#6176 <6176@debbugs.gnu.org> To: bug#6176 <6176@debbugs.gnu.org> Subject: Status: [PATCH 1/2] sort: add a --debug option to highlight key extents Reply-To: bug#6176 <6176@debbugs.gnu.org> Date: Sun, 22 Jun 2025 18:33:36 +0000 retitle 6176 [PATCH 1/2] sort: add a --debug option to highlight key extents reassign 6176 coreutils submitter 6176 P=C3=A1draig Brady severity 6176 normal tag 6176 patch thanks From debbugs-submit-bounces@debbugs.gnu.org Tue May 11 18:58:48 2010 Received: (at submit) by debbugs.gnu.org; 11 May 2010 22:58:48 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OByPX-0006oS-KX for submit@debbugs.gnu.org; Tue, 11 May 2010 18:58:48 -0400 Received: from mx10.gnu.org ([199.232.76.166]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OByPT-0006oN-Mk for submit@debbugs.gnu.org; Tue, 11 May 2010 18:58:45 -0400 Received: from lists.gnu.org ([199.232.76.165]:44928) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1OByPP-0000VJ-Li for submit@debbugs.gnu.org; Tue, 11 May 2010 18:58:39 -0400 Received: from [140.186.70.92] (port=33634 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OByPM-0004X8-T3 for bug-coreutils@gnu.org; Tue, 11 May 2010 18:58:39 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=unavailable version=3.3.1 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OByPH-0001WU-NA for bug-coreutils@gnu.org; Tue, 11 May 2010 18:58:36 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]:9793) by eggs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OByPH-0001Vd-4t for bug-coreutils@gnu.org; Tue, 11 May 2010 18:58:31 -0400 Received: (qmail 68351 invoked from network); 11 May 2010 22:58:28 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 11 May 2010 22:58:28 -0000 Message-ID: <4BE9E0B1.5030308@draigBrady.com> Date: Tue, 11 May 2010 23:56:49 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Report bugs to Subject: [PATCH 1/2] sort: add a --debug option to highlight key extents X-Enigmail-Version: 1.0.1 Content-Type: multipart/mixed; boundary="------------010204060602020007040800" X-detected-operating-system: by eggs.gnu.org: FreeBSD 4.6-4.9 X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) X-Spam-Score: -3.6 (---) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -4.9 (----) This is a multi-part message in MIME format. --------------010204060602020007040800 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The attached patch allows one to: $ printf "one 2 3e3e" | .sort --debug -k3,3g -k2,2 one 2 3e3e ___ __ __________ cheers, Pádraig --------------010204060602020007040800 Content-Type: text/x-patch; name="0001-sort-add-a-debug-option-to-highlight-key-extents.patch" Content-Transfer-Encoding: 8bit Content-Disposition: attachment; filename*0="0001-sort-add-a-debug-option-to-highlight-key-extents.patch" >From b22891eea88a333387a678fcccabc731327680af Mon Sep 17 00:00:00 2001 From: =?utf-8?q?P=C3=A1draig=20Brady?= Date: Tue, 23 Feb 2010 08:43:04 +0000 Subject: [PATCH 1/2] sort: add a --debug option to highlight key extents * src/sort (usage): Add description for --debug. (write_bytes): Pass a line structure so it can subsequently be passed to compare to highlight the keys when in debug mode. Also transform TAB and NUL characters written to stdout so that the highlighting in debug mode aligns correctly. (human_numcompare): Pass an "endptr" so we can record the extent of the number matched. (general_numcompare): Likewise. (find_unit_order): Likewise. (getmonth): Likewise. (numcompare): Likewise. Note we reuse find_unit_order() for this, which is a good enough appropixmation, and means we don't need to change the strnumcmp() interface. (check_mixed_SI_IEC): Return whether iec_present, so that can be used to set the "endptr" in find_unit_order. Also make the key parameter optional, which will be the case from numcompare(). (count_tabs): A new function to determine how much to adjust the mbswidth() values by (TABs don't have a width). (mark_key): A new function to output the key highlighting to stdout. (debug_key): A new function to determine the offset and width of the key highlighting. (key_compare): Pass the show_debug parameter so the key highlighting is only displayed when explicitly called. For each key type, set the length (lena) and whether leading blanks are auto skipped (skipb) which are then used by debug_key() to highlight the portion of the key used in the comparison. (compare): Pass the show_debug parameter so the key highlighting is only displayed when explicitly called. Call debug_key() to highlight the last resort comparison. (check): Output highlighting for disorder line to stdout. (main): Process the --debug option and make it mutually exlusive with the -o option as I don't see it useful there, even potentially harmful if someone left a --debug in by mistake when updating a file. Also restricting debug output to stdout, simplifies the logic for dealing with temporary files. * doc/coreutils.texi (sort invocation): Describe the --debug option, and reference it from the --key description. * tests/misc/sort-debug-keys: A new test for highlighting keys. * tests/Makefile.am: Reference the new test. * NEWS: Mention the new feature. --- NEWS | 5 + doc/coreutils.texi | 5 + src/sort.c | 289 ++++++++++++++++++++++++++++++++--------- tests/Makefile.am | 1 + tests/misc/sort-debug-keys | 317 ++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 557 insertions(+), 60 deletions(-) create mode 100755 tests/misc/sort-debug-keys diff --git a/NEWS b/NEWS index 070f338..4c2da67 100644 --- a/NEWS +++ b/NEWS @@ -2,6 +2,11 @@ GNU coreutils NEWS -*- outline -*- * Noteworthy changes in release ?.? (????-??-??) [?] +** New features + + sort now accepts the --debug option, to highlight the part of the + line significant in the sort. + ** Changes in behavior sort -g now uses long doubles for greater range and precision. diff --git a/doc/coreutils.texi b/doc/coreutils.texi index c8ba53c..6714ada 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -3939,6 +3939,11 @@ multiple fields. Example: To sort on the second field, use @option{--key=2,2} (@option{-k 2,2}). See below for more notes on keys and more examples. +See also the @option{--debug} option to help determine the part +of the line being used in the sort. + +@item --debug +Highlight the portion of each line used for sorting. @item --batch-size=@var{nmerge} @opindex --batch-size diff --git a/src/sort.c b/src/sort.c index 54b97e2..65866b9 100644 --- a/src/sort.c +++ b/src/sort.c @@ -34,6 +34,7 @@ #include "hash.h" #include "ignore-value.h" #include "md5.h" +#include "mbswidth.h" #include "physmem.h" #include "posixver.h" #include "quote.h" @@ -291,6 +292,9 @@ static struct keyfield *keylist; /* Program used to (de)compress temp files. Must accept -d. */ static char const *compress_program; +/* Annotate the output with extra info to aid the user. */ +static bool debug; + /* Maximum number of files to merge in one go. If more than this number are present, temp files will be used. */ static unsigned int nmerge = NMERGE_DEFAULT; @@ -371,6 +375,7 @@ Other options:\n\ -C, --check=quiet, --check=silent like -c, but do not report first bad line\n\ --compress-program=PROG compress temporaries with PROG;\n\ decompress them with PROG -d\n\ + --debug annotate the part of the line used to sort\n\ --files0-from=F read input from the files specified by\n\ NUL-terminated names in file F;\n\ If F is - then read names from standard input\n\ @@ -429,6 +434,7 @@ enum { CHECK_OPTION = CHAR_MAX + 1, COMPRESS_PROGRAM_OPTION, + DEBUG_PROGRAM_OPTION, FILES0_FROM_OPTION, NMERGE_OPTION, RANDOM_SOURCE_OPTION, @@ -442,6 +448,7 @@ static struct option const long_options[] = {"ignore-leading-blanks", no_argument, NULL, 'b'}, {"check", optional_argument, NULL, CHECK_OPTION}, {"compress-program", required_argument, NULL, COMPRESS_PROGRAM_OPTION}, + {"debug", no_argument, NULL, DEBUG_PROGRAM_OPTION}, {"dictionary-order", no_argument, NULL, 'd'}, {"ignore-case", no_argument, NULL, 'f'}, {"files0-from", required_argument, NULL, FILES0_FROM_OPTION}, @@ -1111,13 +1118,6 @@ open_temp (const char *name, pid_t pid) return fp; } -static void -write_bytes (const char *buf, size_t n_bytes, FILE *fp, const char *output_file) -{ - if (fwrite (buf, 1, n_bytes, fp) != n_bytes) - die (_("write failed"), output_file); -} - /* Append DIR to the array of temporary directory names. */ static void add_temp_dir (char const *dir) @@ -1734,30 +1734,19 @@ fillbuf (struct buffer *buf, FILE *fp, char const *file) } } -/* Compare strings A and B as numbers without explicitly converting them to - machine numbers. Comparatively slow for short strings, but asymptotically - hideously fast. */ - -static int -numcompare (const char *a, const char *b) -{ - while (blanks[to_uchar (*a)]) - a++; - while (blanks[to_uchar (*b)]) - b++; - - return strnumcmp (a, b, decimal_point, thousands_sep); -} - /* Exit with an error if a mixture of SI and IEC units detected. */ -static void +static bool check_mixed_SI_IEC (char prefix, struct keyfield *key) { int iec_present = prefix == 'i'; - if (key->iec_present != -1 && iec_present != key->iec_present) - error (SORT_FAILURE, 0, _("both SI and IEC prefixes present on units")); - key->iec_present = iec_present; + if (key) + { + if (key->iec_present != -1 && iec_present != key->iec_present) + error (SORT_FAILURE, 0, _("both SI and IEC prefixes present on units")); + key->iec_present = iec_present; + } + return iec_present; } /* Return an integer which represents the order of magnitude of @@ -1766,7 +1755,7 @@ check_mixed_SI_IEC (char prefix, struct keyfield *key) Negative numbers return a negative unit order. */ static int -find_unit_order (const char *number, struct keyfield *key) +find_unit_order (const char *number, struct keyfield *key, char const **endptr) { static const char orders [UCHAR_LIM] = { @@ -1822,7 +1811,13 @@ find_unit_order (const char *number, struct keyfield *key) /* For valid units check for MiB vs MB etc. */ if (order) - check_mixed_SI_IEC (*(p + 1), key); + { + p++; + p += check_mixed_SI_IEC (*p, key); + } + + if (endptr) + *endptr = p; return sign * order; } @@ -1833,25 +1828,50 @@ find_unit_order (const char *number, struct keyfield *key) i.e. input will never have both 6000K and 5M. */ static int -human_numcompare (const char *a, const char *b, struct keyfield *key) +human_numcompare (const char *a, const char *b, struct keyfield *key, + char const **ea) { while (blanks[to_uchar (*a)]) a++; while (blanks[to_uchar (*b)]) b++; - int order_a = find_unit_order (a, key); - int order_b = find_unit_order (b, key); + int order_a = find_unit_order (a, key, ea); + int order_b = find_unit_order (b, key, NULL); return (order_a > order_b ? 1 : order_a < order_b ? -1 : strnumcmp (a, b, decimal_point, thousands_sep)); } +/* Compare strings A and B as numbers without explicitly converting them to + machine numbers. Comparatively slow for short strings, but asymptotically + hideously fast. */ + static int -general_numcompare (const char *sa, const char *sb) +numcompare (const char *a, const char *b, char const **ea) +{ + while (blanks[to_uchar (*a)]) + a++; + while (blanks[to_uchar (*b)]) + b++; + + if (debug) + { + /* Approximate strnumcmp extents with find_unit_order. */ + if (find_unit_order (a, NULL, ea)) + { + *ea -= 1; /* ignore the order letter */ + *ea -= (**ea == 'i'); /* and IEC prefix */ + } + } + + return strnumcmp (a, b, decimal_point, thousands_sep); +} + +static int +general_numcompare (const char *sa, const char *sb, char const **ea) { - /* FIXME: add option to warn about failed conversions. */ /* FIXME: maybe add option to try expensive FP conversion only if A and B can't be compared more cheaply/accurately. */ @@ -1863,13 +1883,12 @@ general_numcompare (const char *sa, const char *sb) # define strtold strtod #endif - char *ea; char *eb; - long_double a = strtold (sa, &ea); + long_double a = strtold (sa, (char **) ea); long_double b = strtold (sb, &eb); /* Put conversion errors at the start of the collating sequence. */ - if (sa == ea) + if (sa == *ea) return sb == eb ? 0 : -1; if (sb == eb) return 1; @@ -1889,7 +1908,7 @@ general_numcompare (const char *sa, const char *sb) Return 0 if the name in S is not recognized. */ static int -getmonth (char const *month, size_t len) +getmonth (char const *month, size_t len, char const **ea) { size_t lo = 0; size_t hi = MONTHS_PER_YEAR; @@ -1913,7 +1932,11 @@ getmonth (char const *month, size_t len) for (;; m++, n++) { if (!*n) - return monthtab[ix].val; + { + if (ea) + *ea = m; + return monthtab[ix].val; + } if (m == monthlim || fold_toupper[to_uchar (*m)] < to_uchar (*n)) { hi = ix; @@ -2070,11 +2093,76 @@ compare_version (char *restrict texta, size_t lena, return diff; } +/* For debug mode, count tabs in the passed string + so we can adjust the widths returned by mbswidth. + FIXME: Should we generally be counting non printable chars? */ + +static size_t +count_tabs (char const *text, const size_t len) +{ + size_t tabs = 0; + size_t tlen = strnlen (text, len); + + while (tlen--) + { + if (*text++ == '\t') + tabs++; + } + + return tabs; +} + +/* For debug mode, "underline" a key at the + specified offset and screen width. */ + +static void +mark_key (size_t offset, size_t width) +{ + printf ("%*s", offset, ""); + + if (!width) + printf (_("^ no match for key\n")); + else + { + while (width--) + putchar ('_'); + putchar ('\n'); + } +} + +/* For debug mode, determine the screen offset and width + to highlight for a key, and then output the highlight. */ + +static void +debug_key (char const *sline, char const *sfield, char const *efield, + size_t flen, bool skipb) +{ + char const *sa = sfield; + + if (skipb) /* This key type implicitly skips leading blanks. */ + { + while (sa < efield && blanks[to_uchar (*sa)]) + { + sa++; + if (flen) + flen--; /* This assumes TABs same width as SPACEs. */ + } + } + + size_t offset = mbsnwidth (sline, sfield - sline, 0) + (sa - sfield); + offset += count_tabs (sline, sfield - sline); + + size_t width = mbsnwidth (sa, flen, 0); + width += count_tabs (sa, flen); + + mark_key (offset, width); +} + /* Compare two lines A and B trying every key in sequence until there are no more keys or a difference is found. */ static int -keycompare (const struct line *a, const struct line *b) +keycompare (const struct line *a, const struct line *b, bool show_debug) { struct keyfield *key = keylist; @@ -2091,6 +2179,7 @@ keycompare (const struct line *a, const struct line *b) { char const *translate = key->translate; bool const *ignore = key->ignore; + bool skipb = false; /* Whether key type auto skips leading blanks. */ /* Treat field ends before field starts as empty fields. */ lima = MAX (texta, lima); @@ -2107,21 +2196,42 @@ keycompare (const struct line *a, const struct line *b) else if (key->numeric || key->general_numeric || key->human_numeric) { char savea = *lima, saveb = *limb; + char const* ea = lima; *lima = *limb = '\0'; - diff = (key->numeric ? numcompare (texta, textb) - : key->general_numeric ? general_numcompare (texta, textb) - : human_numcompare (texta, textb, key)); + diff = (key->numeric ? numcompare (texta, textb, &ea) + : key->general_numeric ? general_numcompare (texta, textb, + &ea) + : human_numcompare (texta, textb, key, &ea)); + if (show_debug) + { + lena = ea - texta; + skipb = true; + } *lima = savea, *limb = saveb; } else if (key->version) diff = compare_version (texta, lena, textb, lenb); else if (key->month) - diff = getmonth (texta, lena) - getmonth (textb, lenb); + { + char const *ea = lima; + + int amon = getmonth (texta, lena, &ea); + diff = amon - getmonth (textb, lenb, NULL); + + if (show_debug) + { + lena = amon ? ea - texta : 0; + skipb = true; + } + } /* Sorting like this may become slow, so in a simple locale the user can select a faster sort that is similar to ascii sort. */ else if (hard_LC_COLLATE) { + /* FIXME: for debug, account for skipped chars, while handling mb chars. + Generally perhaps xmemfrm could be used to determine chars that are + excluded from the collating order? */ if (ignore || translate) { char buf[4000]; @@ -2165,6 +2275,8 @@ keycompare (const struct line *a, const struct line *b) } else if (ignore) { + char *savea = texta; + #define CMP_WITH_IGNORE(A, B) \ do \ { \ @@ -2192,6 +2304,10 @@ keycompare (const struct line *a, const struct line *b) translate[to_uchar (*textb)]); else CMP_WITH_IGNORE (*texta, *textb); + + /* We only need to restore this for debug_key + in which case the keys being compared are equal. */ + texta = savea; } else if (lena == 0) diff = - NONZERO (lenb); @@ -2201,6 +2317,8 @@ keycompare (const struct line *a, const struct line *b) { if (translate) { + char *savea = texta; + while (texta < lima && textb < limb) { diff = (to_uchar (translate[to_uchar (*texta++)]) @@ -2208,6 +2326,10 @@ keycompare (const struct line *a, const struct line *b) if (diff) goto not_equal; } + + /* We only need to restore this for debug_key + in which case the keys being compared are equal. */ + texta = savea; } else { @@ -2221,6 +2343,9 @@ keycompare (const struct line *a, const struct line *b) if (diff) goto not_equal; + if (show_debug) + debug_key (a->text, texta, lima, lena, skipb); + key = key->next; if (! key) break; @@ -2258,7 +2383,7 @@ keycompare (const struct line *a, const struct line *b) depending on whether A compares less than, equal to, or greater than B. */ static int -compare (const struct line *a, const struct line *b) +compare (const struct line *a, const struct line *b, bool show_debug) { int diff; size_t alen, blen; @@ -2268,7 +2393,7 @@ compare (const struct line *a, const struct line *b) and unadorned sort -r. */ if (keylist) { - diff = keycompare (a, b); + diff = keycompare (a, b, show_debug); if (diff || unique || stable) return diff; } @@ -2277,6 +2402,9 @@ compare (const struct line *a, const struct line *b) fall through to the default comparison. */ alen = a->length - 1, blen = b->length - 1; + if (show_debug) + debug_key (a->text, a->text, a->text + alen, alen, false); + if (alen == 0) diff = - NONZERO (blen); else if (blen == 0) @@ -2289,6 +2417,38 @@ compare (const struct line *a, const struct line *b) return reverse ? -diff : diff; } +static void +write_bytes (const struct line *line, FILE *fp, char const *output_file) +{ + char const *buf = line->text; + size_t n_bytes = line->length; + + /* Convert TABs to '>' and \0 to \n when -z specified. */ + if (debug && fp == stdout) + { + char const *ebuf = buf + n_bytes; + char const *c = buf; + + while (c < ebuf) + { + char wc = *c++; + if (wc == '\t') + wc = '>'; + else if (wc == 0 && eolchar == 0) + wc = '\n'; + if (fputc (wc, fp) == EOF) + die (_("write failed"), output_file); + } + + compare (line, line, true); + } + else + { + if (fwrite (buf, 1, n_bytes, fp) != n_bytes) + die (_("write failed"), output_file); + } +} + /* Check that the lines read from FILE_NAME come in order. Return true if they are in order. If CHECKONLY == 'c', also print a diagnostic (FILE_NAME, line number, contents of line) to stderr if @@ -2317,7 +2477,7 @@ check (char const *file_name, char checkonly) /* Make sure the line saved from the old buffer contents is less than or equal to the first line of the new buffer. */ - if (alloc && nonunique <= compare (&temp, line - 1)) + if (alloc && nonunique <= compare (&temp, line - 1, false)) { found_disorder: { @@ -2330,8 +2490,10 @@ check (char const *file_name, char checkonly) fprintf (stderr, _("%s: %s:%s: disorder: "), program_name, file_name, umaxtostr (disorder_line_number, hr_buf)); - write_bytes (disorder_line->text, disorder_line->length, - stderr, _("standard error")); + if (debug) + fputc ('\n', stderr); + write_bytes (disorder_line, debug ? stdout : stderr, + debug ? _("standard out") : _("standard error")); } ordered = false; @@ -2341,7 +2503,7 @@ check (char const *file_name, char checkonly) /* Compare each line in the buffer with its successor. */ while (linebase < --line) - if (nonunique <= compare (line, line - 1)) + if (nonunique <= compare (line, line - 1, false)) goto found_disorder; line_number += buf.nlines; @@ -2470,7 +2632,7 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, for (i = 0; i < nfiles; ++i) ord[i] = i; for (i = 1; i < nfiles; ++i) - if (0 < compare (cur[ord[i - 1]], cur[ord[i]])) + if (0 < compare (cur[ord[i - 1]], cur[ord[i]], false)) t = ord[i - 1], ord[i - 1] = ord[i], ord[i] = t, i = 0; /* Repeatedly output the smallest line until no input remains. */ @@ -2482,10 +2644,10 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, an identical series of lines. */ if (unique) { - if (savedline && compare (savedline, smallest)) + if (savedline && compare (savedline, smallest, false)) { savedline = NULL; - write_bytes (saved.text, saved.length, ofp, output_file); + write_bytes (&saved, ofp, output_file); } if (!savedline) { @@ -2514,7 +2676,7 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, } } else - write_bytes (smallest->text, smallest->length, ofp, output_file); + write_bytes (smallest, ofp, output_file); /* Check if we need to read more lines into core. */ if (base[ord[0]] < smallest) @@ -2568,7 +2730,7 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, while (lo < hi) { - int cmp = compare (cur[ord0], cur[ord[probe]]); + int cmp = compare (cur[ord0], cur[ord[probe]], false); if (cmp < 0 || (cmp == 0 && ord0 < ord[probe])) hi = probe; else @@ -2589,7 +2751,7 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, if (unique && savedline) { - write_bytes (saved.text, saved.length, ofp, output_file); + write_bytes (&saved, ofp, output_file); free (saved.text); } @@ -2634,7 +2796,7 @@ mergelines (struct line *t, struct line const *hi, size_t nhi) { for (;;) - if (compare (lo - 1, hi - 1) <= 0) + if (compare (lo - 1, hi - 1, false) <= 0) { *--t = *--lo; if (! --nlo) @@ -2676,7 +2838,7 @@ sortlines (struct line *lines, size_t nlines, struct line *temp) { if (nlines == 2) { - if (0 < compare (&lines[-1], &lines[-2])) + if (0 < compare (&lines[-1], &lines[-2], false)) { struct line tmp = lines[-1]; lines[-1] = lines[-2]; @@ -2712,7 +2874,7 @@ sortlines_temp (struct line *lines, size_t nlines, struct line *temp) /* Declare `swap' as int, not bool, to work around a bug in the IBM xlc 6.0.0.0 compiler in 64-bit mode. */ - int swap = (0 < compare (&lines[-1], &lines[-2])); + int swap = (0 < compare (&lines[-1], &lines[-2], false)); temp[-1] = lines[-1 - swap]; temp[-2] = lines[-2 + swap]; } @@ -2994,9 +3156,9 @@ sort (char * const *files, size_t nfiles, char const *output_file) do { line--; - write_bytes (line->text, line->length, tfp, temp_output); + write_bytes (line, tfp, temp_output); if (unique) - while (linebase < line && compare (line, line - 1) == 0) + while (linebase < line && compare (line, line - 1, false) == 0) line--; } while (linebase < line); @@ -3459,6 +3621,10 @@ main (int argc, char **argv) compress_program = optarg; break; + case DEBUG_PROGRAM_OPTION: + debug = true; + break; + case FILES0_FROM_OPTION: files_from = optarg; break; @@ -3715,6 +3881,9 @@ main (int argc, char **argv) check_ordering_compatibility (); + if (debug && outfile) + error (SORT_FAILURE, 0, _("options -o and --debug are incompatible")); + reverse = gkey.reverse; if (need_random) diff --git a/tests/Makefile.am b/tests/Makefile.am index 1049b2b..46d388a 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -224,6 +224,7 @@ TESTS = \ misc/sort \ misc/sort-compress \ misc/sort-continue \ + misc/sort-debug-keys \ misc/sort-files0-from \ misc/sort-float \ misc/sort-merge \ diff --git a/tests/misc/sort-debug-keys b/tests/misc/sort-debug-keys new file mode 100755 index 0000000..0437678 --- /dev/null +++ b/tests/misc/sort-debug-keys @@ -0,0 +1,317 @@ +#!/bin/sh +# Test annotation of sort keys + +# Copyright (C) 2010 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +if test "$VERBOSE" = yes; then + set -x + sort --version +fi + +. $srcdir/test-lib.sh + +number() { cat -n | sed 's/^ *//'; } + +cat <<\EOF > exp +1 + ^ no match for key + +^ no match for key +44 + ^ no match for key +33 + ^ no match for key +2 + ^ no match for key +1 + ^ no match for key + +^ no match for key +44 + ^ no match for key +33 + ^ no match for key +2 + ^ no match for key + +^ no match for key +1 +_ +2 +_ +33 +__ +44 +__ +2> + ^ no match for key +3>1 + _ +1>2 + _ +1 + ^ no match for key + +^ no match for key +44 + ^ no match for key +33 + ^ no match for key +2 + ^ no match for key +1 + ^ no match for key + +^ no match for key +44 + ^ no match for key +33 + ^ no match for key +2 + ^ no match for key + +^ no match for key +1 +_ +2 +_ +33 +__ +44 +__ +2> + ^ no match for key +3>1 + _ +1>2 + _ +1 + ^ no match for key + +^ no match for key +44 + ^ no match for key +33 + ^ no match for key +2 + ^ no match for key +1 + ^ no match for key + +^ no match for key +44 + ^ no match for key +33 + ^ no match for key +2 + ^ no match for key + +^ no match for key +1 +_ +2 +_ +33 +__ +44 +__ +2> + ^ no match for key +3>1 + _ +1>2 + _ + +^ no match for key +JAN +___ +FEB +___ +FEB + ^ no match for key + +^ no match for key +JAN + ^ no match for key +JAZZ +^ no match for key + +^ no match for key +JAN +___ +FEB +___ +2>JAZZ + ^ no match for key +3> + ^ no match for key +4>JAN + ___ +1>FEB + ___ + +^ no match for key +JANZ +___ +JAN +___ +FEB +___ +3> + ^ no match for key +2>JANZ + ___ +4>JAN + ___ +1>FEB + ___ + 1.2ignore + ___ + 1.1e4ignore + _____ +>>a +___ +>b +__ +a + ^ no match for key + +^ no match for key +a +_ +b +_ +-3 +__ +-2 +__ +-0 +__ +--Mi-1 +_ +-0 +__ +1 +_ + 1 + _ +__ +1 +_ +_ + 1 + _ +1 +_ + 1 +__ +1 +_ +2,5 +_ +2.4 +___ +2.,,3 +_ +2.4 +___ +2,,3 +_ +2.4 +___ +1a +_ +2b +_ +EOF + +( +for type in n h g; do + printf "1\n\n44\n33\n2\n" | sort -s -k2$type --debug + printf "1\n\n44\n33\n2\n" | sort -s -k1.3$type --debug + printf "1\n\n44\n33\n2\n" | sort -s -k1$type --debug + printf "2\n\n1\n" | number | sort -s -k2g --debug +done + +printf "FEB\n\nJAN\n" | sort -s -k1M --debug +printf "FEB\n\nJAN\n" | sort -s -k2,2M --debug +printf "FEB\nJAZZ\n\nJAN\n" | sort -s -k1M --debug +printf "FEB\nJAZZ\n\nJAN\n" | number | sort -s -k2,2M --debug +printf "FEB\nJANZ\n\nJAN\n" | sort -s -k1M --debug +printf "FEB\nJANZ\n\nJAN\n" | number | sort -s -k2,2M --debug + +printf " 1.2ignore\n 1.1e4ignore\n" | sort -s -g --debug + +printf "\tb\n\t\ta\n" | sort -s -d --debug # ignore = 1 + +printf "a\n\n" | sort -s -k2,2 --debug #lena = 0 + +printf "b\na\n" | sort -s -k1 --debug #otherwise key compare + +printf -- "-0\n1\n-2\n--Mi-1\n-3\n-0\n" | sort -s --debug -k1,1h + +printf " 1\n1\n" | sort -b --debug +printf " 1\n1\n" | sort -sb --debug +printf " 1\n1\n" | sort --debug + +# strnumcmp is a bit weird, so we don't match exactly +printf "2,5\n2.4\n" | sort -s -k1n --debug +printf "2.,,3\n2.4\n" | sort -s -k1n --debug +printf "2,,3\n2.4\n" | sort -s -k1n --debug + +# -z means we convert \0 to \n +printf "1a\x002b\x00" | sort -s -n -z --debug +) > out + +compare out exp || fail=1 + +cat <<\EOF > exp + 1²---++3 1,234 Mi + _ + _________ +________________________ + 1²---++3 1,234 Mi + _____ + ________ +_______________________ ++1234 1234Gi 1,234M +^ no match for key +_____ +^ no match for key + ____ + ____ + ______ + _____ + _____ + ______ +___________________ +EOF + +( +if test "$LOCALE_FR_UTF8"; then + echo " 1²---++3 1,234 Mi" | + LC_ALL=C sort --debug -k2g -k1b,1 + echo " 1²---++3 1,234 Mi" | + LC_ALL=$LOCALE_FR_UTF8 sort --debug -k2g -k1b,1 + echo "+1234 1234Gi 1,234M" | + LC_ALL=$LOCALE_FR_UTF8 sort --debug -k1,1n -k1,1g \ + -k1,1h -k2,2n -k2,2g -k2,2h -k3,3n -k3,3g -k3,3h +fi +) > out + +compare out exp || fail=1 + +Exit $fail -- 1.6.2.5 --------------010204060602020007040800-- From debbugs-submit-bounces@debbugs.gnu.org Tue May 11 19:41:23 2010 Received: (at 6176) by debbugs.gnu.org; 11 May 2010 23:41:23 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OBz4k-00076Y-Gw for submit@debbugs.gnu.org; Tue, 11 May 2010 19:41:23 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OBz4h-00076R-TK for 6176@debbugs.gnu.org; Tue, 11 May 2010 19:41:21 -0400 Received: (qmail 75885 invoked from network); 11 May 2010 23:41:15 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 11 May 2010 23:41:15 -0000 Message-ID: <4BE9EAB8.8010703@draigBrady.com> Date: Wed, 12 May 2010 00:39:36 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 6176@debbugs.gnu.org Subject: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> In-Reply-To: <4BE9E0B1.5030308@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: multipart/mixed; boundary="------------090907000806040204010703" X-Spam-Score: -2.0 (--) X-Debbugs-Envelope-To: 6176 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) This is a multi-part message in MIME format. --------------090907000806040204010703 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The attached patch gives warnings about questionable option combinations. For example: $ sort --debug -rb -k1,1n /dev/null ! options `-b' are ignored ! option `-r' only applies to last-resort comparison cheers, Pádraig. --------------090907000806040204010703 Content-Type: text/x-patch; name="0002-sort-debug-output-data-independent-key-warnings.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0002-sort-debug-output-data-independent-key-warnings.patch" >From 2256de4bdc458ef9e9d92e2009f255bfd3fa2e36 Mon Sep 17 00:00:00 2001 From: =?utf-8?q?P=C3=A1draig=20Brady?= Date: Tue, 11 May 2010 18:46:21 +0100 Subject: [PATCH 2/2] sort: --debug: output data independent key warnings * src/sort.c (usage): Mention --debug can output warnings to stderr. (default_key_compare): A new function refactored from main(), and now also called from the new key_warnings() function. (key_to_opts): A new function refactored from incompatible_options(), and now also called from the new key_warnings() function. (key_warnings): A new function to output warnings to stderr, about questionable use of various options. Currently it warns about zero length keys and ineffective global options. (incompatible_options): Refactor out key_to_opts() (main): Use key_init() to initialize gkey. Refactor out default_key_compare(). Call key_warnings() in debug mode. * doc/coreutils.texi (sort invocation): Mention that warnings are output by --debug. * tests/misc/sort-debug-warn: A new test for debug warnings. * tests/Makefile.am: Reference the new test. * NEWS: Mention the new feature --- NEWS | 2 +- doc/coreutils.texi | 1 + src/sort.c | 173 ++++++++++++++++++++++++++++++-------------- tests/Makefile.am | 1 + tests/misc/sort-debug-warn | 53 ++++++++++++++ 5 files changed, 174 insertions(+), 56 deletions(-) create mode 100755 tests/misc/sort-debug-warn diff --git a/NEWS b/NEWS index 4c2da67..5d7b81f 100644 --- a/NEWS +++ b/NEWS @@ -5,7 +5,7 @@ GNU coreutils NEWS -*- outline -*- ** New features sort now accepts the --debug option, to highlight the part of the - line significant in the sort. + line significant in the sort, and warn about questionable options. ** Changes in behavior diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 6714ada..cd99bd0 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -3944,6 +3944,7 @@ of the line being used in the sort. @item --debug Highlight the portion of each line used for sorting. +Also issue warnings about questionable usage to stderr. @item --batch-size=@var{nmerge} @opindex --batch-size diff --git a/src/sort.c b/src/sort.c index 65866b9..66a00ef 100644 --- a/src/sort.c +++ b/src/sort.c @@ -375,7 +375,8 @@ Other options:\n\ -C, --check=quiet, --check=silent like -c, but do not report first bad line\n\ --compress-program=PROG compress temporaries with PROG;\n\ decompress them with PROG -d\n\ - --debug annotate the part of the line used to sort\n\ + --debug annotate the part of the line used to sort,\n\ + and warn about questionable usage to stderr\n\ --files0-from=F read input from the files specified by\n\ NUL-terminated names in file F;\n\ If F is - then read names from standard input\n\ @@ -2158,6 +2159,108 @@ debug_key (char const *sline, char const *sfield, char const *efield, mark_key (offset, width); } +/* Return whether sorting options specified for key. */ + +static bool +default_key_compare (struct keyfield const *key) +{ + return ! (key->ignore + || key->translate + || key->skipsblanks + || key->skipeblanks + || key->month + || key->numeric + || key->version + || key->general_numeric + || key->human_numeric + || key->random + /* || key->reverse */ + ); +} + +/* Convert a key to the short options used to specify it. + The returned string must be freed. */ + +static char* +key_to_opts (struct keyfield const *key) +{ + /* The following is too big, but guaranteed to be "big enough". */ + char *opts = xstrdup (short_options); + char *p = opts; + + if (key->skipsblanks || key->skipeblanks) + *p++ = 'b';/* either disables global -b */ + if (key->ignore == nondictionary) + *p++ = 'd'; + if (key->ignore == nonprinting) + *p++ = 'i'; + if (key->translate) + *p++ = 'f'; + if (key->general_numeric) + *p++ = 'g'; + if (key->human_numeric) + *p++ = 'h'; + if (key->month) + *p++ = 'M'; + if (key->numeric) + *p++ = 'n'; + if (key->version) + *p++ = 'V'; + if (key->random) + *p++ = 'R'; + if (key->reverse) + *p++ = 'r'; + *p = '\0'; + + return opts; +} + +/* Output data independent key warnings to stderr. */ + +static void +key_warnings (struct keyfield const *gkey) +{ + struct keyfield const *key; + struct keyfield ugkey = *gkey; + + for (key = keylist; key; key = key->next) + { + /* Output a warning for field specs that will never match. */ + if (key->sword != SIZE_MAX && key->eword < key->sword) + fprintf (stderr, _("! zero width field ignored\n")); + + /* Flag global options not copied or specified in any key. */ + if (ugkey.ignore && (ugkey.ignore == key->ignore)) + ugkey.ignore = NULL; + if (ugkey.translate && (ugkey.translate == key->translate)) + ugkey.translate = NULL; + ugkey.skipsblanks &= !key->skipsblanks; + ugkey.skipeblanks &= !key->skipeblanks; + ugkey.month &= !key->month; + ugkey.numeric &= !key->numeric; + ugkey.general_numeric &= !key->general_numeric; + ugkey.human_numeric &= !key->human_numeric; + ugkey.random &= !key->random; + ugkey.version &= !key->version; + ugkey.reverse &= !key->reverse; + } + + /* Warn about ignored global options flagged above. */ + if (!default_key_compare (&ugkey) || (stable && ugkey.reverse)) + { + bool ugkey_reverse = ugkey.reverse; + if (!stable) + ugkey.reverse = false; + char *opts = key_to_opts (&ugkey); + fprintf (stderr, _("! options `-%s' are ignored\n"), opts); + free (opts); + ugkey.reverse = ugkey_reverse; + } + if (!stable && ugkey.reverse) + fprintf (stderr, + _("! option `-r' only applies to last-resort comparison\n")); +} + /* Compare two lines A and B trying every key in sequence until there are no more keys or a difference is found. */ @@ -2418,7 +2521,7 @@ compare (const struct line *a, const struct line *b, bool show_debug) } static void -write_bytes (const struct line *line, FILE *fp, char const *output_file) +write_bytes (struct line const *line, FILE *fp, char const *output_file) { char const *buf = line->text; size_t n_bytes = line->length; @@ -3235,36 +3338,18 @@ incompatible_options (char const *opts) static void check_ordering_compatibility (void) { - struct keyfield const *key; + struct keyfield *key; for (key = keylist; key; key = key->next) if ((1 < (key->random + key->numeric + key->general_numeric + key->month + key->version + !!key->ignore + key->human_numeric)) || (key->random && key->translate)) { - /* The following is too big, but guaranteed to be "big enough". */ - char opts[sizeof short_options]; - char *p = opts; - if (key->ignore == nondictionary) - *p++ = 'd'; - if (key->translate) - *p++ = 'f'; - if (key->general_numeric) - *p++ = 'g'; - if (key->human_numeric) - *p++ = 'h'; - if (key->ignore == nonprinting) - *p++ = 'i'; - if (key->month) - *p++ = 'M'; - if (key->numeric) - *p++ = 'n'; - if (key->version) - *p++ = 'V'; - if (key->random) - *p++ = 'R'; - *p = '\0'; + /* Clear flags we're not interested in. */ + key->skipsblanks = key->skipeblanks = key->reverse = false; + char *opts = key_to_opts (key); incompatible_options (opts); + free (opts); } } @@ -3494,14 +3579,8 @@ main (int argc, char **argv) /* The signal mask is known, so it is safe to invoke exit_cleanup. */ atexit (exit_cleanup); - gkey.sword = gkey.eword = SIZE_MAX; - gkey.ignore = NULL; - gkey.translate = NULL; - gkey.numeric = gkey.general_numeric = gkey.human_numeric = false; - gkey.iec_present = -1; - gkey.random = gkey.version = false; - gkey.month = gkey.reverse = false; - gkey.skipsblanks = gkey.skipeblanks = false; + key_init (&gkey); + gkey.sword = SIZE_MAX; files = xnmalloc (argc, sizeof *files); @@ -3836,17 +3915,7 @@ main (int argc, char **argv) /* Inheritance of global options to individual keys. */ for (key = keylist; key; key = key->next) { - if (! (key->ignore - || key->translate - || (key->skipsblanks - || key->reverse - || key->skipeblanks - || key->month - || key->numeric - || key->version - || key->general_numeric - || key->human_numeric - || key->random))) + if (default_key_compare (key) && !key->reverse) { key->ignore = gkey.ignore; key->translate = gkey.translate; @@ -3856,24 +3925,15 @@ main (int argc, char **argv) key->numeric = gkey.numeric; key->general_numeric = gkey.general_numeric; key->human_numeric = gkey.human_numeric; + key->version = gkey.version; key->random = gkey.random; key->reverse = gkey.reverse; - key->version = gkey.version; } need_random |= key->random; } - if (!keylist && (gkey.ignore - || gkey.translate - || (gkey.skipsblanks - || gkey.skipeblanks - || gkey.month - || gkey.numeric - || gkey.general_numeric - || gkey.human_numeric - || gkey.random - || gkey.version))) + if (!keylist && !default_key_compare (&gkey)) { insertkey (&gkey); need_random |= gkey.random; @@ -3884,6 +3944,9 @@ main (int argc, char **argv) if (debug && outfile) error (SORT_FAILURE, 0, _("options -o and --debug are incompatible")); + if (debug) + key_warnings (&gkey); + reverse = gkey.reverse; if (need_random) diff --git a/tests/Makefile.am b/tests/Makefile.am index 46d388a..a732c63 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -225,6 +225,7 @@ TESTS = \ misc/sort-compress \ misc/sort-continue \ misc/sort-debug-keys \ + misc/sort-debug-warn \ misc/sort-files0-from \ misc/sort-float \ misc/sort-merge \ diff --git a/tests/misc/sort-debug-warn b/tests/misc/sort-debug-warn new file mode 100755 index 0000000..5295b4b --- /dev/null +++ b/tests/misc/sort-debug-warn @@ -0,0 +1,53 @@ +#!/bin/sh +# Test warnings for sort options + +# Copyright (C) 2010 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +if test "$VERBOSE" = yes; then + set -x + sort --version +fi + +. $srcdir/test-lib.sh + +cat <<\EOF > exp +! zero width field ignored +! options `-bghMVRr' are ignored +! options `-bghMVR' are ignored +! option `-r' only applies to last-resort comparison +! option `-r' only applies to last-resort comparison +! options `-bg' are ignored +! options `-b' are ignored +! options `-d' are ignored +! options `-i' are ignored +EOF + +sort -s -k2,1 --debug /dev/null 2>>out +sort -s -rRVMhgb -k1,1n --debug /dev/null 2>>out +sort -rRVMhgb -k1,1n --debug /dev/null 2>>out +sort -r -k1,1n --debug /dev/null 2>>out +sort -gbr -k1,1n -k1,1r --debug /dev/null 2>>out +sort -b -k1b,1bn --debug /dev/null 2>>out # no warning +sort -b -k1,1bn --debug /dev/null 2>>out +sort -b -k1,1bn -k2b,2 --debug /dev/null 2>>out # no warning +sort -r -k1,1r --debug /dev/null 2>>out # no warning for redundant options +sort -i -k1,1i --debug /dev/null 2>>out # no warning +sort -d -k1,1b --debug /dev/null 2>>out +sort -i -k1,1d --debug /dev/null 2>>out + +compare out exp || fail=1 + +Exit $fail -- 1.6.2.5 --------------090907000806040204010703-- From debbugs-submit-bounces@debbugs.gnu.org Wed May 12 05:23:36 2010 Received: (at 6176) by debbugs.gnu.org; 12 May 2010 09:23:36 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OC8AB-0002mC-RV for submit@debbugs.gnu.org; Wed, 12 May 2010 05:23:36 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OC8A9-0002m6-Pk for 6176@debbugs.gnu.org; Wed, 12 May 2010 05:23:34 -0400 Received: (qmail 72953 invoked from network); 12 May 2010 09:23:29 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 12 May 2010 09:23:29 -0000 Message-ID: <4BEA732C.8090608@draigBrady.com> Date: Wed, 12 May 2010 10:21:48 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 6176@debbugs.gnu.org Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> In-Reply-To: <4BE9EAB8.8010703@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 6176 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 12/05/10 00:39, Pádraig Brady wrote: > The attached patch gives warnings about questionable > option combinations. For example: > > $ sort --debug -rb -k1,1n /dev/null > ! options `-b' are ignored > ! option `-r' only applies to last-resort comparison Oops, The previous patch warned about -r even when no keys specified, and didn't so the sort -ru special case right either. This should fix it up. diff --git a/src/sort.c b/src/sort.c index 66a00ef..211415d 100644 --- a/src/sort.c +++ b/src/sort.c @@ -2245,18 +2245,20 @@ key_warnings (struct keyfield const *gkey) ugkey.reverse &= !key->reverse; } - /* Warn about ignored global options flagged above. */ - if (!default_key_compare (&ugkey) || (stable && ugkey.reverse)) + /* Warn about ignored global options flagged above. + Note if gkey is the only one in the list, all flags are cleared. */ + if (!default_key_compare (&ugkey) + || (ugkey.reverse && (stable || unique) && keylist)) { bool ugkey_reverse = ugkey.reverse; - if (!stable) + if (!(stable || unique)) ugkey.reverse = false; char *opts = key_to_opts (&ugkey); fprintf (stderr, _("! options `-%s' are ignored\n"), opts); free (opts); ugkey.reverse = ugkey_reverse; } - if (!stable && ugkey.reverse) + if (ugkey.reverse && !(stable || unique) && keylist) fprintf (stderr, _("! option `-r' only applies to last-resort comparison\n")); } diff --git a/tests/misc/sort-debug-warn b/tests/misc/sort-debug-warn index 5295b4b..c2ff01d 100755 --- a/tests/misc/sort-debug-warn +++ b/tests/misc/sort-debug-warn @@ -47,6 +47,9 @@ sort -r -k1,1r --debug /dev/null 2>>out # no warning for redundant options sort -i -k1,1i --debug /dev/null 2>>out # no warning sort -d -k1,1b --debug /dev/null 2>>out sort -i -k1,1d --debug /dev/null 2>>out +sort -r --debug /dev/null 2>>out #no warning +sort -rM --debug /dev/null 2>>out #no warning +sort -rM -k1,1 --debug /dev/null 2>>out #no warning compare out exp || fail=1 From debbugs-submit-bounces@debbugs.gnu.org Wed May 12 07:08:49 2010 Received: (at 6176-done) by debbugs.gnu.org; 12 May 2010 11:08:49 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OC9o0-0003XF-SE for submit@debbugs.gnu.org; Wed, 12 May 2010 07:08:49 -0400 Received: from smtp1-g21.free.fr ([212.27.42.1]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OC9nx-0003X9-Ql for 6176-done@debbugs.gnu.org; Wed, 12 May 2010 07:08:46 -0400 Received: from smtp1-g21.free.fr (localhost [127.0.0.1]) by smtp1-g21.free.fr (Postfix) with ESMTP id DEB7F940145 for <6176-done@debbugs.gnu.org>; Wed, 12 May 2010 13:08:37 +0200 (CEST) Received: from mx.meyering.net (mx.meyering.net [82.230.74.64]) by smtp1-g21.free.fr (Postfix) with ESMTP for <6176-done@debbugs.gnu.org>; Wed, 12 May 2010 13:08:36 +0200 (CEST) Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000) id E8129DC8; Wed, 12 May 2010 13:08:35 +0200 (CEST) From: Jim Meyering To: =?utf-8?Q?P=C3=A1draig?= Brady Subject: Re: bug#6176: [PATCH 1/2] sort: add a --debug option to highlight key extents In-Reply-To: <4BE9E0B1.5030308@draigBrady.com> (=?utf-8?Q?=22P=C3=A1draig?= Brady"'s message of "Tue, 11 May 2010 23:56:49 +0100") References: <4BE9E0B1.5030308@draigBrady.com> Date: Wed, 12 May 2010 13:08:35 +0200 Message-ID: <87vdatml70.fsf@meyering.net> Lines: 18 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -3.1 (---) X-Debbugs-Envelope-To: 6176-done Cc: 6176-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.1 (---) P=C3=A1draig Brady wrote: > The attached patch allows one to: > > $ printf "one 2 3e3e" | .sort --debug -k3,3g -k2,2 > one 2 3e3e > ___ > __ > __________ Nice. Thanks for writing all of that. These changes are sure to be useful. I have merely glanced through them (look fine so far), but would not mind at all if you were to push the series as-is. Jim PS: I've gone ahead and marked this as "done". From debbugs-submit-bounces@debbugs.gnu.org Wed May 12 07:40:52 2010 Received: (at 6176-done) by debbugs.gnu.org; 12 May 2010 11:40:52 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCAJ2-0004Iy-8t for submit@debbugs.gnu.org; Wed, 12 May 2010 07:40:52 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OCAJ0-0004Io-4g for 6176-done@debbugs.gnu.org; Wed, 12 May 2010 07:40:50 -0400 Received: (qmail 8982 invoked from network); 12 May 2010 11:40:43 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 12 May 2010 11:40:43 -0000 Message-ID: <4BEA9355.2090309@draigBrady.com> Date: Wed, 12 May 2010 12:39:01 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Jim Meyering Subject: Re: bug#6176: [PATCH 1/2] sort: add a --debug option to highlight key extents References: <4BE9E0B1.5030308@draigBrady.com> <87vdatml70.fsf@meyering.net> In-Reply-To: <87vdatml70.fsf@meyering.net> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 6176-done Cc: 6176-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 12/05/10 12:08, Jim Meyering wrote: > Pádraig Brady wrote: >> The attached patch allows one to: >> >> $ printf "one 2 3e3e" | .sort --debug -k3,3g -k2,2 >> one 2 3e3e >> ___ >> __ >> __________ > > Nice. Thanks for writing all of that. > These changes are sure to be useful. > > I have merely glanced through them (look fine so far), > but would not mind at all if you were to push the series as-is. > > Jim > > PS: I've gone ahead and marked this as "done". Thanks. The changes shouldn't be risky as everything is protected with if (debug). I was also wondering about informational class messages that could be output by --debug. Perhaps stats on the number of comparisons done could be useful? Also mentioning the oft confusing collating order with something like: if (debug) { if (hard_LC_COLLATE) fprintf (stderr, _("# Using %s locale rules\n"), quote(setlocale (LC_COLLATE, NULL)); else fprintf (stderr, _("# Using simple byte comparison\n"); } Anyway, something for another day. cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Wed May 12 09:53:50 2010 Received: (at 6176) by debbugs.gnu.org; 12 May 2010 13:53:50 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCCNi-0005Hs-EO for submit@debbugs.gnu.org; Wed, 12 May 2010 09:53:50 -0400 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCCNf-0005Hn-Iv for 6176@debbugs.gnu.org; Wed, 12 May 2010 09:53:48 -0400 Received: from int-mx04.intmail.prod.int.phx2.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.17]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o4CDrfkK028636 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 12 May 2010 09:53:41 -0400 Received: from [10.3.241.85] (vpn-241-85.phx2.redhat.com [10.3.241.85]) by int-mx04.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id o4CDrfP1028608; Wed, 12 May 2010 09:53:41 -0400 Message-ID: <4BEAB2D3.3010801@redhat.com> Date: Wed, 12 May 2010 07:53:23 -0600 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100430 Fedora/3.0.4-3.fc13 Lightning/1.0b1 Mnenhy/0.8.2 Thunderbird/3.0.4 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> In-Reply-To: <4BE9EAB8.8010703@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enig795737A7AD791F2C361F0B7A" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.17 X-Spam-Score: -10.0 (----------) X-Debbugs-Envelope-To: 6176 Cc: 6176@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.0 (----------) This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig795737A7AD791F2C361F0B7A Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 05/11/2010 05:39 PM, P=C3=A1draig Brady wrote: > The attached patch gives warnings about questionable > option combinations. For example: >=20 > $ sort --debug -rb -k1,1n /dev/null > ! options `-b' are ignored > ! option `-r' only applies to last-resort comparison That looks awkward, both when compared to the GCS convention of listing the program name rather than !, and in respect to plurality: sort: option `-b' is ignored sort: option `-r' only applies to last-resort comparison --=20 Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org --------------enig795737A7AD791F2C361F0B7A Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBCAAGBQJL6rLcAAoJEKeha0olJ0NqGbwH/2njpjRKu7ahuk+fvH9+a8pT J7/s4nBaLFtjTkLIfM+x8o6e2wG9jYFDZBdjEZPJZLOBMni4sEauQ4G++2ix0U0M s19M1XNtBlpNsKbWjiMuenF52GklL0+BUwTVliAAvAa0/GTVfmJbTL+PxPvAQYV7 rtmOy/RveN6WP2BU6XBaj8eXFkA9lvrBV4yv53tc1nCD6D4u2PK2rFQnkmkh/SzM g2AQ+WtFv+GJG0KU/zYXHYg8JfYviVI48tODH7AeuBwElnjnfwKNcouio4tqTzQU swXzN6I4KQ9zcvD0hhv+zG3JyfQ3gCDY1yck9NysvhfYWYRaFyJzKoR7Q4sCKto= =91RN -----END PGP SIGNATURE----- --------------enig795737A7AD791F2C361F0B7A-- From debbugs-submit-bounces@debbugs.gnu.org Wed May 12 09:56:03 2010 Received: (at 6176) by debbugs.gnu.org; 12 May 2010 13:56:04 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCCPr-0005JE-OO for submit@debbugs.gnu.org; Wed, 12 May 2010 09:56:03 -0400 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCCPq-0005Iq-3L for 6176@debbugs.gnu.org; Wed, 12 May 2010 09:56:02 -0400 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o4CDtuJZ030656 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 12 May 2010 09:55:56 -0400 Received: from [10.3.241.85] (vpn-241-85.phx2.redhat.com [10.3.241.85]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id o4CDttiS015024; Wed, 12 May 2010 09:55:56 -0400 Message-ID: <4BEAB363.1070009@redhat.com> Date: Wed, 12 May 2010 07:55:47 -0600 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100430 Fedora/3.0.4-3.fc13 Lightning/1.0b1 Mnenhy/0.8.2 Thunderbird/3.0.4 MIME-Version: 1.0 To: Eric Blake Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> In-Reply-To: <4BEAB2D3.3010801@redhat.com> X-Enigmail-Version: 1.0.1 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enigD37350B5184F2D89DB00928F" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 X-Spam-Score: -10.0 (----------) X-Debbugs-Envelope-To: 6176 Cc: 6176@debbugs.gnu.org, =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.0 (----------) This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigD37350B5184F2D89DB00928F Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 05/12/2010 07:53 AM, Eric Blake wrote: > On 05/11/2010 05:39 PM, P=C3=A1draig Brady wrote: >> The attached patch gives warnings about questionable >> option combinations. For example: >> >> $ sort --debug -rb -k1,1n /dev/null >> ! options `-b' are ignored >> ! option `-r' only applies to last-resort comparison >=20 > That looks awkward, both when compared to the GCS convention of listing= > the program name rather than !, and in respect to plurality: >=20 > sort: option `-b' is ignored > sort: option `-r' only applies to last-resort comparison Or, to put it more concretely, > + fprintf (stderr, _("! options `-%s' are ignored\n"), opts); > + free (opts); > + ugkey.reverse =3D ugkey_reverse; > + } > + if (!stable && ugkey.reverse) > + fprintf (stderr, > + _("! option `-r' only applies to last-resort comparison\n= ")); Why are we using fprintf(stderr) instead of error()? --=20 Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org --------------enigD37350B5184F2D89DB00928F Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBCAAGBQJL6rNjAAoJEKeha0olJ0NqvGUH/RV0RaF8nbqb0TOa4erErgdD WqGKXIJLTD0XuXa5Ic9K5AalB9TVYAN2jRQW3UfH5SYKPEzsOpn15rXbQdl4cGR8 Rkz2zDBsKP0ui1NZ40OXP1keOC7xc0YWorG1HJQTLlNYevfHIqlg4T4Ftj4eJBTs ZD/30zRcs8i1XU406NF6dHGuC/k6Jy3DflkO/0qcKWyt/fSFKaNCKPJswpO7Im3A NBBDLvVyUAeL3OSCplWJfqJKHDZaKex1Iwzn1vw2plG8K9RL4dVEMQzFP8rN0ROY O4w76Di0dR10ckSfnkH4lprfwcIYDQgLVXTWeUWyuWYLMH6GWOqBxToXXMXIxEw= =Pevw -----END PGP SIGNATURE----- --------------enigD37350B5184F2D89DB00928F-- From debbugs-submit-bounces@debbugs.gnu.org Wed May 12 10:08:56 2010 Received: (at 6176) by debbugs.gnu.org; 12 May 2010 14:08:56 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCCcJ-0005Ox-V9 for submit@debbugs.gnu.org; Wed, 12 May 2010 10:08:56 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OCCcH-0005Os-UN for 6176@debbugs.gnu.org; Wed, 12 May 2010 10:08:54 -0400 Received: (qmail 47730 invoked from network); 12 May 2010 14:08:48 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 12 May 2010 14:08:48 -0000 Message-ID: <4BEAB609.6050201@draigBrady.com> Date: Wed, 12 May 2010 15:07:05 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Eric Blake Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> In-Reply-To: <4BEAB363.1070009@redhat.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 6176 Cc: 6176@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 12/05/10 14:55, Eric Blake wrote: > On 05/12/2010 07:53 AM, Eric Blake wrote: >> On 05/11/2010 05:39 PM, Pádraig Brady wrote: >>> The attached patch gives warnings about questionable >>> option combinations. For example: >>> >>> $ sort --debug -rb -k1,1n /dev/null >>> ! options `-b' are ignored >>> ! option `-r' only applies to last-resort comparison >> >> That looks awkward, both when compared to the GCS convention of listing >> the program name rather than !, and in respect to plurality: >> >> sort: option `-b' is ignored >> sort: option `-r' only applies to last-resort comparison > > Or, to put it more concretely, > >> + fprintf (stderr, _("! options `-%s' are ignored\n"), opts); >> + free (opts); >> + ugkey.reverse = ugkey_reverse; >> + } >> + if (!stable && ugkey.reverse) >> + fprintf (stderr, >> + _("! option `-r' only applies to last-resort comparison\n")); > > Why are we using fprintf(stderr) instead of error()? > I was thinking it was redundant to print the command name when explicitly asking for warnings,but yes I think you're right, I'll just error(). I'll fix up the plurality also. BTW I'm not intending to push this second patch for at least a day. cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Wed May 12 10:15:13 2010 Received: (at 6176) by debbugs.gnu.org; 12 May 2010 14:15:13 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCCiP-0005Ru-IT for submit@debbugs.gnu.org; Wed, 12 May 2010 10:15:13 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OCCiM-0005Ro-46 for 6176@debbugs.gnu.org; Wed, 12 May 2010 10:15:12 -0400 Received: (qmail 49439 invoked from network); 12 May 2010 14:15:04 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 12 May 2010 14:15:04 -0000 Message-ID: <4BEAB781.1070007@draigBrady.com> Date: Wed, 12 May 2010 15:13:21 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 6176@debbugs.gnu.org Subject: Re: bug#6176: [PATCH 1/2] sort: add a --debug option to highlight key extents References: <4BE9E0B1.5030308@draigBrady.com> <87vdatml70.fsf@meyering.net> In-Reply-To: <87vdatml70.fsf@meyering.net> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 6176 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 12/05/10 12:08, Jim Meyering wrote: > Pádraig Brady wrote: >> The attached patch allows one to: >> >> $ printf "one 2 3e3e" | .sort --debug -k3,3g -k2,2 >> one 2 3e3e >> ___ >> __ >> __________ > > Nice. Thanks for writing all of that. > These changes are sure to be useful. > > I have merely glanced through them (look fine so far), > but would not mind at all if you were to push the series as-is. I pushed the first, and I then pushed the following to fix a test failure pointed out by our continuous integration box http://hydra.nixos.org/jobset/gnu/coreutils-master I'll probably push patch 2 tomorrow. cheers, Pádraig. diff --git a/tests/misc/sort-debug-keys b/tests/misc/sort-debug-keys index 0437678..0f05025 100755 --- a/tests/misc/sort-debug-keys +++ b/tests/misc/sort-debug-keys @@ -300,8 +300,8 @@ _____ ___________________ EOF -( if test "$LOCALE_FR_UTF8"; then + ( echo " 1²---++3 1,234 Mi" | LC_ALL=C sort --debug -k2g -k1b,1 echo " 1²---++3 1,234 Mi" | @@ -309,9 +309,8 @@ if test "$LOCALE_FR_UTF8"; then echo "+1234 1234Gi 1,234M" | LC_ALL=$LOCALE_FR_UTF8 sort --debug -k1,1n -k1,1g \ -k1,1h -k2,2n -k2,2g -k2,2h -k3,3n -k3,3g -k3,3h + ) > out + compare out exp || fail=1 fi -) > out - -compare out exp || fail=1 Exit $fail From debbugs-submit-bounces@debbugs.gnu.org Fri May 14 09:12:19 2010 Received: (at 6176) by debbugs.gnu.org; 14 May 2010 13:12:19 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCugc-0005EQ-R1 for submit@debbugs.gnu.org; Fri, 14 May 2010 09:12:19 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OCugZ-0005EL-4K for 6176@debbugs.gnu.org; Fri, 14 May 2010 09:12:16 -0400 Received: (qmail 62465 invoked from network); 14 May 2010 13:12:11 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 14 May 2010 13:12:11 -0000 Message-ID: <4BED4BBB.4000908@draigBrady.com> Date: Fri, 14 May 2010 14:10:19 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 6176@debbugs.gnu.org Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> <4BEAB609.6050201@draigBrady.com> In-Reply-To: <4BEAB609.6050201@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: multipart/mixed; boundary="------------040200000409030800080602" X-Spam-Score: -1.6 (-) X-Debbugs-Envelope-To: 6176 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.8 (--) This is a multi-part message in MIME format. --------------040200000409030800080602 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Latest version of the patch attached with new warnings and info. Example output... $ sort --debug -rb -k2n +2 -1b /dev/null sort: using `en_US.utf8' sorting rules sort: obsolescent key formats used. Consider using `-k' sort: key 1 is numeric and spans multiple fields sort: key 2 has zero width and will be ignored sort: leading blanks are significant in key 2. Consider also specifying `b' sort: option `-b' is ignored sort: option `-r' only applies to last-resort comparison $ LANG=en_USA sort --debug -srb -k1,1n /dev/null sort: using simple byte comparison sort: options `-br' are ignored I'll commit tonight sometime I think. cheers, Pádraig. --------------040200000409030800080602 Content-Type: text/x-patch; name="0001-sort-debug-output-data-independent-key-warnings.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0001-sort-debug-output-data-independent-key-warnings.patch" >From f82a746c8e8ca342273e31ccd49f6f75f14ee2e6 Mon Sep 17 00:00:00 2001 From: =?utf-8?q?P=C3=A1draig=20Brady?= Date: Tue, 11 May 2010 18:46:21 +0100 Subject: [PATCH] sort: --debug: output data independent key warnings * src/sort.c (usage): Mention --debug can output warnings to stderr. (default_key_compare): A new function refactored from main(), and now also called from the new key_warnings() function. (key_to_opts): A new function refactored from incompatible_options(), and now also called from the new key_warnings() function. (key_numeric): A new function refactored to test if key is numeric. (key_warnings): A new function to output warnings to stderr, about questionable use of various options. Currently it warns about zero length keys and ineffective global options. (incompatible_options): Refactor out key_to_opts() (main): Use key_init() to initialize gkey. Refactor out default_key_compare(). Call key_warnings() in debug mode. * doc/coreutils.texi (sort invocation): Mention that warnings are output by --debug. * tests/misc/sort-debug-warn: A new test for debug warnings. * tests/Makefile.am: Reference the new test. * NEWS: Mention the new feature --- NEWS | 2 +- doc/coreutils.texi | 1 + src/sort.c | 227 +++++++++++++++++++++++++++++++++----------- tests/Makefile.am | 1 + tests/misc/sort-debug-warn | 91 ++++++++++++++++++ 5 files changed, 264 insertions(+), 58 deletions(-) create mode 100755 tests/misc/sort-debug-warn diff --git a/NEWS b/NEWS index 4c2da67..5d7b81f 100644 --- a/NEWS +++ b/NEWS @@ -5,7 +5,7 @@ GNU coreutils NEWS -*- outline -*- ** New features sort now accepts the --debug option, to highlight the part of the - line significant in the sort. + line significant in the sort, and warn about questionable options. ** Changes in behavior diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 6714ada..cd99bd0 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -3944,6 +3944,7 @@ of the line being used in the sort. @item --debug Highlight the portion of each line used for sorting. +Also issue warnings about questionable usage to stderr. @item --batch-size=@var{nmerge} @opindex --batch-size diff --git a/src/sort.c b/src/sort.c index 38c5ed2..f2b1124 100644 --- a/src/sort.c +++ b/src/sort.c @@ -131,6 +131,9 @@ static bool hard_LC_COLLATE; static bool hard_LC_TIME; #endif +/* Nonzero if the obsolescent key option format is used. */ +static bool obsolete_used; + #define NONZERO(x) ((x) != 0) /* The kind of blanks for '-b' to skip in various options. */ @@ -375,7 +378,8 @@ Other options:\n\ -C, --check=quiet, --check=silent like -c, but do not report first bad line\n\ --compress-program=PROG compress temporaries with PROG;\n\ decompress them with PROG -d\n\ - --debug annotate the part of the line used to sort\n\ + --debug annotate the part of the line used to sort,\n\ + and warn about questionable usage to stderr\n\ --files0-from=F read input from the files specified by\n\ NUL-terminated names in file F;\n\ If F is - then read names from standard input\n\ @@ -2158,6 +2162,143 @@ debug_key (char const *sline, char const *sfield, char const *efield, mark_key (offset, width); } +/* Testing if a key is numeric is done in various places. */ + +static inline bool +key_numeric (struct keyfield const *key) +{ + return (key->numeric || key->general_numeric || key->human_numeric); +} + +/* Return whether sorting options specified for key. */ + +static bool +default_key_compare (struct keyfield const *key) +{ + return ! (key->ignore + || key->translate + || key->skipsblanks + || key->skipeblanks + || key_numeric (key) + || key->month + || key->version + || key->random + /* || key->reverse */ + ); +} + +/* Convert a key to the short options used to specify it. + The returned string must be freed. */ + +static char* +key_to_opts (struct keyfield const *key) +{ + /* The following is too big, but guaranteed to be "big enough". */ + char *opts = xstrdup (short_options); + char *p = opts; + + if (key->skipsblanks || key->skipeblanks) + *p++ = 'b';/* either disables global -b */ + if (key->ignore == nondictionary) + *p++ = 'd'; + if (key->ignore == nonprinting) + *p++ = 'i'; + if (key->translate) + *p++ = 'f'; + if (key->general_numeric) + *p++ = 'g'; + if (key->human_numeric) + *p++ = 'h'; + if (key->month) + *p++ = 'M'; + if (key->numeric) + *p++ = 'n'; + if (key->random) + *p++ = 'R'; + if (key->reverse) + *p++ = 'r'; + if (key->version) + *p++ = 'V'; + *p = '\0'; + + return opts; +} + +/* Output data independent key warnings to stderr. */ + +static void +key_warnings (struct keyfield const *gkey, bool gkey_only) +{ + struct keyfield const *key; + struct keyfield ugkey = *gkey; + size_t keynum = 1; + + if (obsolete_used) + error (0, 0, _("obsolescent key formats used. Consider using `-k'")); + + for (key = keylist; key; key = key->next, keynum++) + { + /* Warn about field specs that will never match. */ + if (key->sword != SIZE_MAX && key->eword < key->sword) + error (0, 0, _("key %zu has zero width and will be ignored"), keynum); + + /* Warn about significant leading blanks. */ + if (!gkey_only && tab == TAB_DEFAULT && !key->skipsblanks + && !key_numeric (key) && !key->month) + error (0, 0, _("leading blanks are significant in key %zu. Consider " + "also specifying `b'"), keynum); + + + /* Warn about numeric comparisons spanning fields, + as field delimiters could be interpreted as part + of the number (maybe only in other locales). */ + if (!gkey_only && key_numeric (key)) + { + size_t sword = key->sword + 1; + size_t eword = key->eword + 1; + if (!sword) + sword++; + if (sword != eword) + error (0, 0, _("key %zu is numeric and spans multiple fields"), + keynum); + } + + /* Flag global options not copied or specified in any key. */ + if (ugkey.ignore && (ugkey.ignore == key->ignore)) + ugkey.ignore = NULL; + if (ugkey.translate && (ugkey.translate == key->translate)) + ugkey.translate = NULL; + ugkey.skipsblanks &= !key->skipsblanks; + ugkey.skipeblanks &= !key->skipeblanks; + ugkey.month &= !key->month; + ugkey.numeric &= !key->numeric; + ugkey.general_numeric &= !key->general_numeric; + ugkey.human_numeric &= !key->human_numeric; + ugkey.random &= !key->random; + ugkey.version &= !key->version; + ugkey.reverse &= !key->reverse; + } + + /* Warn about ignored global options flagged above. + Note if gkey is the only one in the list, all flags are cleared. */ + if (!default_key_compare (&ugkey) + || (ugkey.reverse && (stable || unique) && keylist)) + { + bool ugkey_reverse = ugkey.reverse; + if (!(stable || unique)) + ugkey.reverse = false; + char *opts = key_to_opts (&ugkey); + error (0, 0, + ngettext ("option `-%s' is ignored", + "options `-%s' are ignored", + select_plural (strlen (opts))), opts); + free (opts); + ugkey.reverse = ugkey_reverse; + } + if (ugkey.reverse && !(stable || unique) && keylist) + error (0, 0, _("option `-r' only applies to last-resort comparison")); +} + /* Compare two lines A and B trying every key in sequence until there are no more keys or a difference is found. */ @@ -2193,7 +2334,7 @@ keycompare (const struct line *a, const struct line *b, bool show_debug) if (key->random) diff = compare_random (texta, lena, textb, lenb); - else if (key->numeric || key->general_numeric || key->human_numeric) + else if (key_numeric (key)) { char savea = *lima, saveb = *limb; char const* ea = lima; @@ -3235,36 +3376,18 @@ incompatible_options (char const *opts) static void check_ordering_compatibility (void) { - struct keyfield const *key; + struct keyfield *key; for (key = keylist; key; key = key->next) - if ((1 < (key->random + key->numeric + key->general_numeric + key->month - + key->version + !!key->ignore + key->human_numeric)) + if ((1 < (key->random + key_numeric (key) + key->month + key->version + + !!key->ignore)) || (key->random && key->translate)) { - /* The following is too big, but guaranteed to be "big enough". */ - char opts[sizeof short_options]; - char *p = opts; - if (key->ignore == nondictionary) - *p++ = 'd'; - if (key->translate) - *p++ = 'f'; - if (key->general_numeric) - *p++ = 'g'; - if (key->human_numeric) - *p++ = 'h'; - if (key->ignore == nonprinting) - *p++ = 'i'; - if (key->month) - *p++ = 'M'; - if (key->numeric) - *p++ = 'n'; - if (key->version) - *p++ = 'V'; - if (key->random) - *p++ = 'R'; - *p = '\0'; + /* Clear flags we're not interested in. */ + key->skipsblanks = key->skipeblanks = key->reverse = false; + char *opts = key_to_opts (key); incompatible_options (opts); + free (opts); } } @@ -3391,6 +3514,7 @@ main (int argc, char **argv) struct keyfield *key; struct keyfield key_buf; struct keyfield gkey; + bool gkey_only = false; char const *s; int c = 0; char checkonly = 0; @@ -3494,14 +3618,8 @@ main (int argc, char **argv) /* The signal mask is known, so it is safe to invoke exit_cleanup. */ atexit (exit_cleanup); - gkey.sword = gkey.eword = SIZE_MAX; - gkey.ignore = NULL; - gkey.translate = NULL; - gkey.numeric = gkey.general_numeric = gkey.human_numeric = false; - gkey.iec_present = -1; - gkey.random = gkey.version = false; - gkey.month = gkey.reverse = false; - gkey.skipsblanks = gkey.skipeblanks = false; + key_init (&gkey); + gkey.sword = SIZE_MAX; files = xnmalloc (argc, sizeof *files); @@ -3574,6 +3692,7 @@ main (int argc, char **argv) N_("stray character in field spec")); } insertkey (key); + obsolete_used = true; } } } @@ -3836,17 +3955,7 @@ main (int argc, char **argv) /* Inheritance of global options to individual keys. */ for (key = keylist; key; key = key->next) { - if (! (key->ignore - || key->translate - || (key->skipsblanks - || key->reverse - || key->skipeblanks - || key->month - || key->numeric - || key->version - || key->general_numeric - || key->human_numeric - || key->random))) + if (default_key_compare (key) && !key->reverse) { key->ignore = gkey.ignore; key->translate = gkey.translate; @@ -3856,25 +3965,17 @@ main (int argc, char **argv) key->numeric = gkey.numeric; key->general_numeric = gkey.general_numeric; key->human_numeric = gkey.human_numeric; + key->version = gkey.version; key->random = gkey.random; key->reverse = gkey.reverse; - key->version = gkey.version; } need_random |= key->random; } - if (!keylist && (gkey.ignore - || gkey.translate - || (gkey.skipsblanks - || gkey.skipeblanks - || gkey.month - || gkey.numeric - || gkey.general_numeric - || gkey.human_numeric - || gkey.random - || gkey.version))) + if (!keylist && !default_key_compare (&gkey)) { + gkey_only = true; insertkey (&gkey); need_random |= gkey.random; } @@ -3884,6 +3985,18 @@ main (int argc, char **argv) if (debug && outfile) error (SORT_FAILURE, 0, _("options -o and --debug are incompatible")); + if (debug) + { + /* Always output the locale in debug mode, since this + is such a common source of confusion. */ + if (hard_LC_COLLATE) + error (0, 0, _("using %s sorting rules"), + quote (setlocale (LC_COLLATE, NULL))); + else + error (0, 0, _("using simple byte comparison")); + key_warnings (&gkey, gkey_only); + } + reverse = gkey.reverse; if (need_random) diff --git a/tests/Makefile.am b/tests/Makefile.am index 46d388a..a732c63 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -225,6 +225,7 @@ TESTS = \ misc/sort-compress \ misc/sort-continue \ misc/sort-debug-keys \ + misc/sort-debug-warn \ misc/sort-files0-from \ misc/sort-float \ misc/sort-merge \ diff --git a/tests/misc/sort-debug-warn b/tests/misc/sort-debug-warn new file mode 100755 index 0000000..6da3d9b --- /dev/null +++ b/tests/misc/sort-debug-warn @@ -0,0 +1,91 @@ +#!/bin/sh +# Test warnings for sort options + +# Copyright (C) 2010 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +if test "$VERBOSE" = yes; then + set -x + sort --version +fi + +. $srcdir/test-lib.sh + +cat <<\EOF > exp +sort: using simple byte comparison +sort: key 1 has zero width and will be ignored +sort: leading blanks are significant in key 1. Consider specifying `b' also +sort: using simple byte comparison +sort: options `-bghMRrV' are ignored +sort: using simple byte comparison +sort: options `-bghMRV' are ignored +sort: option `-r' only applies to last-resort comparison +sort: using simple byte comparison +sort: option `-r' only applies to last-resort comparison +sort: using simple byte comparison +sort: leading blanks are significant in key 2. Consider specifying `b' also +sort: options `-bg' are ignored +sort: using simple byte comparison +sort: using simple byte comparison +sort: option `-b' is ignored +sort: using simple byte comparison +sort: using simple byte comparison +sort: leading blanks are significant in key 1. Consider specifying `b' also +sort: using simple byte comparison +sort: leading blanks are significant in key 1. Consider specifying `b' also +sort: using simple byte comparison +sort: leading blanks are significant in key 1. Consider specifying `b' also +sort: option `-d' is ignored +sort: using simple byte comparison +sort: leading blanks are significant in key 1. Consider specifying `b' also +sort: option `-i' is ignored +sort: using simple byte comparison +sort: using simple byte comparison +sort: using simple byte comparison +EOF + +sort -s -k2,1 --debug /dev/null 2>>out +sort -s -rRVMhgb -k1,1n --debug /dev/null 2>>out +sort -rRVMhgb -k1,1n --debug /dev/null 2>>out +sort -r -k1,1n --debug /dev/null 2>>out +sort -gbr -k1,1n -k1,1r --debug /dev/null 2>>out +sort -b -k1b,1bn --debug /dev/null 2>>out # no warning +sort -b -k1,1bn --debug /dev/null 2>>out +sort -b -k1,1bn -k2b,2 --debug /dev/null 2>>out # no warning +sort -r -k1,1r --debug /dev/null 2>>out # no warning for redundant options +sort -i -k1,1i --debug /dev/null 2>>out # no warning +sort -d -k1,1b --debug /dev/null 2>>out +sort -i -k1,1d --debug /dev/null 2>>out +sort -r --debug /dev/null 2>>out #no warning +sort -rM --debug /dev/null 2>>out #no warning +sort -rM -k1,1 --debug /dev/null 2>>out #no warning + +compare out exp || fail=1 + +cat <<\EOF > exp +sort: using simple byte comparison +sort: Obsolescent key formats used. Consider using `-k' +sort: key 1 is numeric and spans multiple fields +sort: key 2 has zero width and will be ignored +sort: leading blanks are significant in key 2. Consider specifying `b' also +sort: option `-b' is ignored +sort: option `-r' only applies to last-resort comparison +EOF + +sort --debug -rb -k2n +2 -1b /dev/null 2>out + +compare out exp || fail=1 + +Exit $fail -- 1.6.2.5 --------------040200000409030800080602-- From debbugs-submit-bounces@debbugs.gnu.org Fri May 14 11:10:02 2010 Received: (at 6176) by debbugs.gnu.org; 14 May 2010 15:10:03 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCwWY-00063K-Li for submit@debbugs.gnu.org; Fri, 14 May 2010 11:10:02 -0400 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCwWT-000634-U7 for 6176@debbugs.gnu.org; Fri, 14 May 2010 11:10:01 -0400 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o4EF9toW011511 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 14 May 2010 11:09:55 -0400 Received: from [10.3.241.64] (vpn-241-64.phx2.redhat.com [10.3.241.64]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id o4EF9snI031930; Fri, 14 May 2010 11:09:55 -0400 Message-ID: <4BED67BA.3010605@redhat.com> Date: Fri, 14 May 2010 09:09:46 -0600 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100430 Fedora/3.0.4-3.fc13 Lightning/1.0b1 Mnenhy/0.8.2 Thunderbird/3.0.4 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> <4BEAB609.6050201@draigBrady.com> <4BED4BBB.4000908@draigBrady.com> In-Reply-To: <4BED4BBB.4000908@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enig23D27CE4956A02D040D61857" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 X-Spam-Score: -10.0 (----------) X-Debbugs-Envelope-To: 6176 Cc: 6176@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.0 (----------) This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig23D27CE4956A02D040D61857 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 05/14/2010 07:10 AM, P=C3=A1draig Brady wrote: > Latest version of the patch attached with new warnings and info. > Example output... >=20 > $ sort --debug -rb -k2n +2 -1b /dev/null > sort: using `en_US.utf8' sorting rules > sort: obsolescent key formats used. Consider using `-k' > sort: key 1 is numeric and spans multiple fields > sort: key 2 has zero width and will be ignored > sort: leading blanks are significant in key 2. Consider also specifyin= g `b' > sort: option `-b' is ignored > sort: option `-r' only applies to last-resort comparison *Nice!* > +/* Nonzero if the obsolescent key option format is used. */ > +static bool obsolete_used; s/Nonzero/True/ > + > #define NONZERO(x) ((x) !=3D 0) > =20 > /* The kind of blanks for '-b' to skip in various options. */ > @@ -375,7 +378,8 @@ Other options:\n\ > -C, --check=3Dquiet, --check=3Dsilent like -c, but do not report fi= rst bad line\n\ > --compress-program=3DPROG compress temporaries with PROG;\n\ > decompress them with PROG -d\n\ > - --debug annotate the part of the line used to sort= \n\ > + --debug annotate the part of the line used to sort= ,\n\ > + and warn about questionable usage to std= err\n\ > --files0-from=3DF read input from the files specified by\n= \ > NUL-terminated names in file F;\n\ > If F is - then read names from standard in= put\n\ This makes for a pretty long translation string; time to break it in two?= > +static inline bool > +key_numeric (struct keyfield const *key) > +{ > + return (key->numeric || key->general_numeric || key->human_numeric);= Redundant (). > + for (key =3D keylist; key; key =3D key->next, keynum++) > + { > + /* Warn about field specs that will never match. */ > + if (key->sword !=3D SIZE_MAX && key->eword < key->sword) > + error (0, 0, _("key %zu has zero width and will be ignored"), = keynum); This requires vfprintf-posix to guarantee that %zu will work; I'm not sure we have that guarantee, and Jim has been reluctant to globally turn on gnulib printf replacements. > + > + /* Warn about significant leading blanks. */ > + if (!gkey_only && tab =3D=3D TAB_DEFAULT && !key->skipsblanks > + && !key_numeric (key) && !key->month) > + error (0, 0, _("leading blanks are significant in key %zu. Co= nsider " > + "also specifying `b'"), keynum); > + > + > + /* Warn about numeric comparisons spanning fields, Why two blank lines? > @@ -3884,6 +3985,18 @@ main (int argc, char **argv) > if (debug && outfile) > error (SORT_FAILURE, 0, _("options -o and --debug are incompatible= ")); Why? --=20 Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org --------------enig23D27CE4956A02D040D61857 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBCAAGBQJL7We6AAoJEKeha0olJ0Nq8PQH/RPXn8S7HHXYh91BDUnDOhKG nO76zOqU8h5XslmRovu+ioc7VS2uBPTrIXeMBenZ4olSCqDy4w8KdxESBAfJY4gF qAdedpK8YGT7rup/hS5dJckddiWuDSbknp9W35m44zWIIOLmCX2St7fhQuKOfWkU vCsDN6JXPWHTrStoWNKBcyoR4i9V19GPSyMOK2n5yLJqkIdbxlElEXOtK/nQuWHr wptUD8wgfIQXb1/z7jLpQW+hxxYoEx4c2t3PDlwg83E4hNiilbGIpuV+GV+0tyL5 sPPtVqJ210+4Qg+eMGJcuT8A4ptsNvUK8k3rihuiJsXcT7tWu8wMRyEgsX7Ykzk= =//Do -----END PGP SIGNATURE----- --------------enig23D27CE4956A02D040D61857-- From debbugs-submit-bounces@debbugs.gnu.org Fri May 14 11:54:09 2010 Received: (at 6176) by debbugs.gnu.org; 14 May 2010 15:54:09 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OCxDE-0006Nl-IQ for submit@debbugs.gnu.org; Fri, 14 May 2010 11:54:09 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OCxDC-0006NK-5U for 6176@debbugs.gnu.org; Fri, 14 May 2010 11:54:06 -0400 Received: (qmail 93816 invoked from network); 14 May 2010 15:54:03 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 14 May 2010 15:54:03 -0000 Message-ID: <4BED71AA.8010305@draigBrady.com> Date: Fri, 14 May 2010 16:52:10 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 6176@debbugs.gnu.org Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> <4BEAB609.6050201@draigBrady.com> <4BED4BBB.4000908@draigBrady.com> <4BED67BA.3010605@redhat.com> In-Reply-To: <4BED67BA.3010605@redhat.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.8 (--) X-Debbugs-Envelope-To: 6176 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.8 (--) On 14/05/10 16:09, Eric Blake wrote: > On 05/14/2010 07:10 AM, Pádraig Brady wrote: >> >> /* The kind of blanks for '-b' to skip in various options. */ >> @@ -375,7 +378,8 @@ Other options:\n\ >> -C, --check=quiet, --check=silent like -c, but do not report first bad line\n\ >> --compress-program=PROG compress temporaries with PROG;\n\ >> decompress them with PROG -d\n\ >> - --debug annotate the part of the line used to sort\n\ >> + --debug annotate the part of the line used to sort,\n\ >> + and warn about questionable usage to stderr\n\ >> --files0-from=F read input from the files specified by\n\ >> NUL-terminated names in file F;\n\ >> If F is - then read names from standard input\n\ > > This makes for a pretty long translation string; time to break it in two? OK will do. >> + for (key = keylist; key; key = key->next, keynum++) >> + { >> + /* Warn about field specs that will never match. */ >> + if (key->sword != SIZE_MAX && key->eword < key->sword) >> + error (0, 0, _("key %zu has zero width and will be ignored"), keynum); > > This requires vfprintf-posix to guarantee that %zu will work; I'm not > sure we have that guarantee, and Jim has been reluctant to globally turn > on gnulib printf replacements. Oops, right. I've been doing too much C99 lately. I quickly grepped but didn't notice the existing %zu was inside #if DEBUG. I'll use something more standard. >> @@ -3884,6 +3985,18 @@ main (int argc, char **argv) >> if (debug && outfile) >> error (SORT_FAILURE, 0, _("options -o and --debug are incompatible")); > > Why? That was from the previous commit, and the log message there said: (main): Process the --debug option and make it mutually exlusive with the -o option as I don't see it useful there, even potentially harmful if someone left a --debug in by mistake when updating a file. Also restricting debug output to stdout, simplifies the logic for dealing with temporary files. I'll add a comment to the code also. thanks for the review. Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Fri May 14 17:23:17 2010 Received: (at 6176) by debbugs.gnu.org; 14 May 2010 21:23:17 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OD2Lk-00011S-Ty for submit@debbugs.gnu.org; Fri, 14 May 2010 17:23:17 -0400 Received: from kiwi.cs.ucla.edu ([131.179.128.19]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OD2Li-00011K-PM for 6176@debbugs.gnu.org; Fri, 14 May 2010 17:23:15 -0400 Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by kiwi.cs.ucla.edu (8.13.8+Sun/8.13.8/UCLACS-6.0) with ESMTP id o4ELNCjm016850; Fri, 14 May 2010 14:23:12 -0700 (PDT) Message-ID: <4BEDBF40.60407@cs.ucla.edu> Date: Fri, 14 May 2010 14:23:12 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> <4BEAB609.6050201@draigBrady.com> <4BED4BBB.4000908@draigBrady.com> In-Reply-To: <4BED4BBB.4000908@draigBrady.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by kiwi.cs.ucla.edu id o4ELNCjm016850 X-Spam-Score: -2.6 (--) X-Debbugs-Envelope-To: 6176 Cc: 6176@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.6 (--) On 05/14/10 06:10, P=C3=A1draig Brady wrote: > - if ((1 < (key->random + key->numeric + key->general_numeric + key-= >month > - + key->version + !!key->ignore + key->human_numeric)) > + if ((1 < (key->random + key_numeric (key) + key->month + key->vers= ion > + + !!key->ignore)) This change doesn't look right, since it won't catch the error of specifying both numeric and general_numeric options. Am I missing something? > sort: obsolescent key formats used. Consider using `-k' Something like the following diagnostic would be far more helpful for users who are not 'sort' experts: sort: obsolescent key `+2 -4' used; consider `-k 3,4' instead Can you please arrange for that? > +static char* Missing space before "*". > + /* The following is too big, but guaranteed to be "big enough". */ > + char *opts =3D xstrdup (short_options); This unnecessarily copies short_options. Better would be: char *opts =3D xmalloc (sizeof short_options); But, come to think of it, the interface for key_to_opts is awkward. Callers must currently do this: char *opts =3D key_to_opts (key); F (opts); free (opts); where F is some function. It'd be nicer for callers to do something like this instead: char opts[sizeof short_options]; key_to_opts (key, opts); F (opts); This is a bit faster and is easier to understand (at least, for me). From debbugs-submit-bounces@debbugs.gnu.org Fri May 14 17:49:38 2010 Received: (at 6176) by debbugs.gnu.org; 14 May 2010 21:49:38 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OD2lG-0001By-M1 for submit@debbugs.gnu.org; Fri, 14 May 2010 17:49:38 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OD2lF-0001Bp-4L for 6176@debbugs.gnu.org; Fri, 14 May 2010 17:49:37 -0400 Received: (qmail 46885 invoked from network); 14 May 2010 21:49:34 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 14 May 2010 21:49:34 -0000 Message-ID: <4BEDC4FD.8080707@draigBrady.com> Date: Fri, 14 May 2010 22:47:41 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> <4BEAB609.6050201@draigBrady.com> <4BED4BBB.4000908@draigBrady.com> <4BEDBF40.60407@cs.ucla.edu> In-Reply-To: <4BEDBF40.60407@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.8 (--) X-Debbugs-Envelope-To: 6176 Cc: 6176@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.8 (--) On 14/05/10 22:23, Paul Eggert wrote: > On 05/14/10 06:10, Pádraig Brady wrote: > >> - if ((1 < (key->random + key->numeric + key->general_numeric + >> key->month >> - + key->version + !!key->ignore + key->human_numeric)) >> + if ((1 < (key->random + key_numeric (key) + key->month + >> key->version >> + + !!key->ignore)) > > This change doesn't look right, since it won't catch the error of > specifying both numeric and general_numeric options. Am I missing > something? Well spotted test/misc/sort::h7 had caught my silliness also > >> sort: obsolescent key formats used. Consider using `-k' > > Something like the following diagnostic would be far more helpful for > users who are not 'sort' experts: > > sort: obsolescent key `+2 -4' used; consider `-k 3,4' instead > > Can you please arrange for that? Yes that would be more useful. I'll have a look. >> + /* The following is too big, but guaranteed to be "big enough". */ >> + char *opts = xstrdup (short_options); > > This unnecessarily copies short_options. Better would be: > char opts[sizeof short_options]; > key_to_opts (key, opts); > F (opts); > > This is a bit faster and is easier to understand (at least, for me). Yes that's better. thanks a lot! Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Fri May 14 20:28:19 2010 Received: (at 6176) by debbugs.gnu.org; 15 May 2010 00:28:19 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OD5Ep-0002Cn-9i for submit@debbugs.gnu.org; Fri, 14 May 2010 20:28:19 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OD5En-0002Ch-4y for 6176@debbugs.gnu.org; Fri, 14 May 2010 20:28:17 -0400 Received: (qmail 65031 invoked from network); 15 May 2010 00:28:15 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 15 May 2010 00:28:15 -0000 Message-ID: <4BEDEA2D.5040102@draigBrady.com> Date: Sat, 15 May 2010 01:26:21 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 6176@debbugs.gnu.org Subject: Re: bug#6176: [PATCH 2/2] sort: --debug: output data independent key warnings References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> <4BEAB609.6050201@draigBrady.com> <4BED4BBB.4000908@draigBrady.com> <4BEDBF40.60407@cs.ucla.edu> <4BEDC4FD.8080707@draigBrady.com> In-Reply-To: <4BEDC4FD.8080707@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -1.9 (-) X-Debbugs-Envelope-To: 6176 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -1.8 (-) On 14/05/10 22:47, Pádraig Brady wrote: > On 14/05/10 22:23, Paul Eggert wrote: >> Something like the following diagnostic would be far more helpful for >> users who are not 'sort' experts: >> >> sort: obsolescent key `+2 -4' used; consider `-k 3,4' instead >> >> Can you please arrange for that? I did that using this chunk. Note that it doesn't reproduce char offsets or flags on the old or new formats, but they're the same on both formats, so it's not worth the complexity I think. if (key->obsolete_used) { /* obsolescent syntax +A.x -B.y is equivalent to: -k A+1.x+1,B.y (when y = 0) -k A+1.x+1,B+1.y (when y > 0) */ char obuf[INT_BUFSIZE_BOUND (size_t) * 2 + 4]; /* +# -# */ char nbuf[INT_BUFSIZE_BOUND (size_t) * 2 + 5]; /* -k #,# */ char *po = obuf; char *pn = nbuf; size_t sword = key->sword; size_t eword = key->eword; if (sword == SIZE_MAX) sword++; po += sprintf (po, "+%" PRIuMAX, (uintmax_t) sword); pn += sprintf (pn, "-k %" PRIuMAX, (uintmax_t) sword + 1); if (key->eword != SIZE_MAX) { po += sprintf (po, " -%" PRIuMAX, (uintmax_t) eword + 1); pn += sprintf (pn, ",%" PRIuMAX, (uintmax_t) eword + 1 + (key->echar == SIZE_MAX)); } error (0, 0, _("obsolescent key `%s' used; consider `%s' instead"), obuf, nbuf); } That latest patch is at http://url.ie/660o cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Sat May 22 09:24:00 2010 Received: (at 6176) by debbugs.gnu.org; 22 May 2010 13:24:00 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OFogJ-0003kE-A8 for submit@debbugs.gnu.org; Sat, 22 May 2010 09:23:59 -0400 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1OFogH-0003k9-39 for 6176@debbugs.gnu.org; Sat, 22 May 2010 09:23:57 -0400 Received: (qmail 37905 invoked from network); 22 May 2010 13:23:51 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 22 May 2010 13:23:51 -0000 Message-ID: <4BF7DA50.5020003@draigBrady.com> Date: Sat, 22 May 2010 14:21:20 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 6176@debbugs.gnu.org Subject: [PATCH] sort: adjust the leading blanks --debug warning References: <4BE9E0B1.5030308@draigBrady.com> <4BE9EAB8.8010703@draigBrady.com> <4BEAB2D3.3010801@redhat.com> <4BEAB363.1070009@redhat.com> <4BEAB609.6050201@draigBrady.com> <4BED4BBB.4000908@draigBrady.com> <4BEDBF40.60407@cs.ucla.edu> <4BEDC4FD.8080707@draigBrady.com> <4BEDEA2D.5040102@draigBrady.com> In-Reply-To: <4BEDEA2D.5040102@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -1.6 (-) X-Debbugs-Envelope-To: 6176 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.8 (--) I was reviewing some tricky sorting I've done previously to see did the new --debug option warn me about the issues I encountered. One thing I missed was that if one specifies character offsets, then we should warn, even if the type implicitly skips spaces when sorting. For example, one has to specify 'b' to both start and end positions in the second key in the example below (which sorts apache access.log by date). printf "127.0.0.1 - - [01/Jan/2008:02:08:26 +0000] ...\n" | sort --debug -b -k4.9,4.12 -k4.5b,4.7Mb -k4.2,4.3 -k4.14,4 One caveat with always warning about a missing 'b' when character offsets are specified, is that this form is now warned about: -k1.x,1.y That is commonly used to specify offsets into a fixed spaced file. I.E. it can be seen as a line offset, rather than a field offset. So I've excluded that form from any warnings. Also previously we would warn about a missing 'b' when it could have been legitimately unused to support sorting right aligned indexes, aligned with spaces. I'm wary about being "too clever" with these warnings, but I think it worth suppressing this warning in the above 2 fairly common cases, as the user can still see the blanks being included when the key annotations are displayed. cheers, Pádraig. diff --git a/src/sort.c b/src/sort.c index 8a9309a..bebbd11 100644 --- a/src/sort.c +++ b/src/sort.c @@ -2261,8 +2261,13 @@ key_warnings (struct keyfield const *gkey, bool gkey_only) error (0, 0, _("key %lu has zero width and will be ignored"), keynum); /* Warn about significant leading blanks. */ - if (!gkey_only && tab == TAB_DEFAULT && !key->skipsblanks - && !key_numeric (key) && !key->month) + bool implicit_skip = key_numeric (key) || key->month; + bool support_space_aligned = !hard_LC_COLLATE && default_key_compare (key) && !key->schar; + bool line_offset = key->eword == 0 && key->echar != 0; /* -k1.x,1.y */ + if (!gkey_only && tab == TAB_DEFAULT && !line_offset + && ((!key->skipsblanks && !(implicit_skip || support_space_aligned)) + || (!key->skipsblanks && key->schar) + || (!key->skipeblanks && key->echar))) error (0, 0, _("leading blanks are significant in key %lu; " "consider also specifying `b'"), keynum); From unknown Sun Jun 22 11:33:36 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sun, 20 Jun 2010 11:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator