GNU bug report logs - #22782
[PATCH] all: be less strict about usage if POSIX 2008

Package: coreutils

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Tue, 23 Feb 2016 09:08:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Report forwarded to bug-coreutils <at> gnu.org:
bug#22782; Package coreutils. (Tue, 23 Feb 2016 09:08:01 GMT)

Acknowledgement sent to Paul Eggert <eggert <at> cs.ucla.edu>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 23 Feb 2016 09:08:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: bug-coreutils <at> gnu.org
Cc: Paul Eggert <eggert <at> cs.ucla.edu>
Subject: [PATCH] all: be less strict about usage if POSIX 2008
Date: Tue, 23 Feb 2016 01:06:34 -0800

sort, tail, and uniq now support traditional usage like 'sort +2'
and 'tail +10' on systems conforming to POSIX 1003.1-2008 and later.
* NEWS: Document this.
* doc/coreutils.texi (Standards conformance, tail invocation)
(sort invocation, uniq invocation, touch invocation):
Document new behavior, or behavior's dependence on POSIX 1003.1-2001.
* src/sort.c (struct keyfield.traditional_used):
Rename from obsolete_used, since implementations are now allowed
to support it.  All uses changed.
(main): Allow traditional usage if _POSIX2_VERSION is 200809.
* src/tail.c (parse_obsolete_option): Distinguish between
traditional usage (which POSIX 2008 and later allows) and obsolete
(which it still does not).
* src/uniq.c (strict_posix2): New function.
(main): Allow traditional usage if _POSIX2_VERSION is 200809.
* tests/misc/tail.pl: Test for new behavior.
---
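
Reader's note, not part of the original submission: the version test that
the commit message describes can be summarized in a small standalone sketch.
The function names ending in `_for` are hypothetical and take the version as
a parameter; the actual patch calls coreutils' `posix2_version ()` directly,
so treat this as an illustration of the predicates rather than the real code.

```c
#include <stdbool.h>

/* Illustrative only: these mirror the predicates in the patch, but take
   the POSIX version as an argument instead of calling posix2_version ().  */

/* True when the version lies in [200112, 200809), the range in which an
   argument like '+2' must be treated as a file name.  */
static bool
strict_posix2_for (int posix_ver)
{
  return 200112 <= posix_ver && posix_ver < 200809;
}

/* sort and uniq: traditional usage is allowed outside the strict range.  */
static bool
traditional_usage_for (int posix_ver)
{
  return ! strict_posix2_for (posix_ver);
}

/* tail: obsolete usage applies only before POSIX 1003.1-2001.  */
static bool
tail_obsolete_usage_for (int posix_ver)
{
  return posix_ver < 200112;
}

/* tail: traditional usage ('tail +10') applies before 2001 and again
   from POSIX 1003.1-2008 onward.  */
static bool
tail_traditional_usage_for (int posix_ver)
{
  return tail_obsolete_usage_for (posix_ver) || 200809 <= posix_ver;
}
```

Under these definitions, 199209 and 200809 both permit traditional usage,
while 200112 does not; only 199209 counts as obsolete usage for tail.
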
 NEWS               |  7 +++++++
 doc/coreutils.texi | 55 +++++++++++++++++++++++++++++-------------------------
 src/sort.c         | 17 +++++++++--------
 src/tail.c         | 11 ++++++-----
 src/uniq.c         |  9 ++++++++-
 tests/misc/tail.pl |  5 ++++-
 6 files changed, 64 insertions(+), 40 deletions(-)

diff --git a/NEWS b/NEWS
index 3b2e461..ae42581 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,13 @@ GNU coreutils NEWS                                    -*- outline -*-
    stty --help no longer outputs extraneous gettext header lines
    for translated languages. [bug introduced in coreutils-8.24]
 
+** Changes in behavior
+
+   sort, tail, and uniq now support traditional usage like 'sort +2'
+   and 'tail +10' on systems conforming to POSIX 1003.1-2008 and later.
+   The 2008 edition of POSIX dropped the requirement that arguments
+   like '+2' must be treated as file names.
+
 
 * Noteworthy changes in release 8.25 (2016-01-20) [stable]
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index a07e46e..45706bd 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -1506,10 +1506,11 @@ probably do not need to define @env{POSIXLY_CORRECT}.
 Newer versions of POSIX are occasionally incompatible with older
 versions.  For example, older versions of POSIX required the
 command @samp{sort +1} to sort based on the second and succeeding
-fields in each input line, but starting with POSIX 1003.1-2001
+fields in each input line, but in POSIX 1003.1-2001
 the same command is required to sort the file named @file{+1}, and you
 must instead use the command @samp{sort -k 2} to get the field-based
-sort.
+sort.  To complicate things further, POSIX 1003.1-2008 allows an
+implementation to have either the old or the new behavior.
 
 @vindex _POSIX2_VERSION
 The GNU utilities normally conform to the version of POSIX
@@ -1520,10 +1521,10 @@ the year and month the standard was adopted.  Three values are currently
 supported for @env{_POSIX2_VERSION}: @samp{199209} stands for
 POSIX 1003.2-1992, @samp{200112} stands for POSIX
 1003.1-2001, and @samp{200809} stands for POSIX 1003.1-2008.
-For example, if you have a newer system but are running software
-that assumes an older version of POSIX and uses @samp{sort +1}
-or @samp{tail +10}, you can work around any compatibility problems by setting
-@samp{_POSIX2_VERSION=199209} in your environment.
+For example, if you have a POSIX 1003.1-2001 system but are running software
+containing traditional usage like @samp{sort +1} or @samp{tail +10},
+you can work around the compatibility problems by setting
+@samp{_POSIX2_VERSION=200809} in your environment.
 
 @c This node is named "Multi-call invocation", not the usual
 @c "coreutils invocation", so that shell commands like
@@ -3060,17 +3061,18 @@ by 512-byte blocks, bytes, or lines, optionally followed by @samp{f}
 which has the same meaning as @option{-f}.
 
 @vindex _POSIX2_VERSION
-On older systems, the leading @samp{-} can be replaced by @samp{+} in
-the obsolete option syntax with the same meaning as in counts, and
-obsolete usage overrides normal usage when the two conflict.
-This obsolete behavior can be enabled or disabled with the
+On systems not conforming to POSIX 1003.1-2001, the leading @samp{-}
+can be replaced by @samp{+} in the traditional option syntax with the
+same meaning as in counts, and on obsolete systems predating POSIX
+1003.1-2001 traditional usage overrides normal usage when the two
+conflict.  This behavior can be controlled with the
 @env{_POSIX2_VERSION} environment variable (@pxref{Standards
 conformance}).
 
-Scripts intended for use on standard hosts should avoid obsolete
+Scripts intended for use on standard hosts should avoid traditional
 syntax and should use @option{-c @var{num}[b]}, @option{-n
 @var{num}}, and/or @option{-f} instead.  If your script must also
-run on hosts that support only the obsolete syntax, you can often
+run on hosts that support only the traditional syntax, you can often
 rewrite it to avoid problematic usages, e.g., by using @samp{sed -n
 '$p'} rather than @samp{tail -1}.  If that's not possible, the script
 can use a test like @samp{if tail -c +1 </dev/null >/dev/null 2>&1;
@@ -4536,23 +4538,24 @@ is counted from the first nonblank character of the field.
 
 @vindex _POSIX2_VERSION
 @vindex POSIXLY_CORRECT
-On older systems, @command{sort} supports an obsolete origin-zero
+On systems not conforming to POSIX 1003.1-2001,
+@command{sort} supports a traditional origin-zero
 syntax @samp{+@var{pos1} [-@var{pos2}]} for specifying sort keys.
-The obsolete sequence @samp{sort +@var{a}.@var{x} -@var{b}.@var{y}}
+The traditional command @samp{sort +@var{a}.@var{x} -@var{b}.@var{y}}
 is equivalent to @samp{sort -k @var{a+1}.@var{x+1},@var{b}} if @var{y}
 is @samp{0} or absent, otherwise it is equivalent to @samp{sort -k
 @var{a+1}.@var{x+1},@var{b+1}.@var{y}}.
 
-This obsolete behavior can be enabled or disabled with the
+This traditional behavior can be controlled with the
 @env{_POSIX2_VERSION} environment variable (@pxref{Standards
 conformance}); it can also be enabled when @env{POSIXLY_CORRECT} is
-not set by using the obsolete syntax with @samp{-@var{pos2}} present.
+not set by using the traditional syntax with @samp{-@var{pos2}} present.
 
-Scripts intended for use on standard hosts should avoid obsolete
+Scripts intended for use on standard hosts should avoid traditional
 syntax and should use @option{-k} instead.  For example, avoid
 @samp{sort +2}, since it might be interpreted as either @samp{sort
 ./+2} or @samp{sort -k 3}.  If your script must also run on hosts that
-support only the obsolete syntax, it can use a test like @samp{if sort
+support only the traditional syntax, it can use a test like @samp{if sort
 -k 1 </dev/null >/dev/null 2>&1; then @dots{}} to decide which syntax
 to use.
 
@@ -4911,7 +4914,7 @@ a null string for comparison if a line has fewer than @var{n} fields.  Fields
 are sequences of non-space non-tab characters that are separated from
 each other by at least one space or tab.
 
-For compatibility @command{uniq} supports an obsolete option syntax
+For compatibility @command{uniq} supports a traditional option syntax
 @option{-@var{n}}.  New scripts should use @option{-f @var{n}} instead.
 
 @item -s @var{n}
@@ -4923,11 +4926,12 @@ for comparison if a line has fewer than @var{n} characters.  If you use both
 the field and character skipping options, fields are skipped over first.
 
 @vindex _POSIX2_VERSION
-On older systems, @command{uniq} supports an obsolete option syntax
+On systems not conforming to POSIX 1003.1-2001,
+@command{uniq} supports a traditional option syntax
 @option{+@var{n}}.
-This obsolete behavior can be enabled or disabled with the
+Although this traditional behavior can be controlled with the
 @env{_POSIX2_VERSION} environment variable (@pxref{Standards
-conformance}), but portable scripts should avoid commands whose
+conformance}), portable scripts should avoid commands whose
 behavior depends on this variable.
 For example, use @samp{uniq ./+10} or @samp{uniq -s 10} rather than
 the ambiguous @samp{uniq +10}.
@@ -10981,7 +10985,8 @@ On the atypical systems that support leap seconds, @var{ss} may be
 @end table
 
 @vindex _POSIX2_VERSION
-On older systems, @command{touch} supports an obsolete syntax, as follows.
+On systems predating POSIX 1003.1-2001,
+@command{touch} supports an obsolete syntax, as follows.
 If no timestamp is given with any of the @option{-d}, @option{-r}, or
 @option{-t} options, and if there are two or more @var{file}s and the
 first @var{file} is of the form @samp{@var{mmddhhmm}[@var{yy}]} and this
@@ -10989,9 +10994,9 @@ would be a valid argument to the @option{-t} option (if the @var{yy}, if
 any, were moved to the front), and if the represented year
 is in the range 1969--1999, that argument is interpreted as the time
 for the other files instead of as a file name.
-This obsolete behavior can be enabled or disabled with the
+Although this obsolete behavior can be controlled with the
 @env{_POSIX2_VERSION} environment variable (@pxref{Standards
-conformance}), but portable scripts should avoid commands whose
+conformance}), portable scripts should avoid commands whose
 behavior depends on this variable.
 For example, use @samp{touch ./12312359 main.c} or @samp{touch -t
 12312359 main.c} rather than the ambiguous @samp{touch 12312359 main.c}.
diff --git a/src/sort.c b/src/sort.c
index 62acb62..aa52b75 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -224,7 +224,7 @@ struct keyfield
   bool month;			/* Flag for comparison by month name. */
   bool reverse;			/* Reverse the sense of comparison. */
   bool version;			/* sort by version number */
-  bool obsolete_used;		/* obsolescent key option format is used. */
+  bool traditional_used;	/* Traditional key option format is used. */
   struct keyfield *next;	/* Next keyfield to try. */
 };
 
@@ -2394,7 +2394,7 @@ key_warnings (struct keyfield const *gkey, bool gkey_only)
 
   for (key = keylist; key; key = key->next, keynum++)
     {
-      if (key->obsolete_used)
+      if (key->traditional_used)
         {
           size_t sword = key->sword;
           size_t eword = key->eword;
@@ -4183,7 +4183,8 @@ main (int argc, char **argv)
   size_t nthreads = 0;
   size_t nfiles = 0;
   bool posixly_correct = (getenv ("POSIXLY_CORRECT") != NULL);
-  bool obsolete_usage = (posix2_version () < 200112);
+  int posix_ver = posix2_version ();
+  bool traditional_usage = ! (200112 <= posix_ver && posix_ver < 200809);
   char **files;
   char *files_from = NULL;
   struct Tokens tok;
@@ -4288,13 +4289,13 @@ main (int argc, char **argv)
     {
       /* Parse an operand as a file after "--" was seen; or if
          pedantic and a file was seen, unless the POSIX version
-         predates 1003.1-2001 and -c was not seen and the operand is
+         is not 1003.1-2001 and -c was not seen and the operand is
          "-o FILE" or "-oFILE".  */
       int oi = -1;
 
       if (c == -1
           || (posixly_correct && nfiles != 0
-              && ! (obsolete_usage
+              && ! (traditional_usage
                     && ! checkonly
                     && optind != argc
                     && argv[optind][0] == '-' && argv[optind][1] == 'o'
@@ -4315,8 +4316,8 @@ main (int argc, char **argv)
             {
               bool minus_pos_usage = (optind != argc && argv[optind][0] == '-'
                                       && ISDIGIT (argv[optind][1]));
-              obsolete_usage |= minus_pos_usage && !posixly_correct;
-              if (obsolete_usage)
+              traditional_usage |= minus_pos_usage && !posixly_correct;
+              if (traditional_usage)
                 {
                   /* Treat +POS1 [-POS2] as a key if possible; but silently
                      treat an operand as a file if it is not a valid +POS1.  */
@@ -4356,7 +4357,7 @@ main (int argc, char **argv)
                             badfieldspec (optarg1,
                                       N_("stray character in field spec"));
                         }
-                      key->obsolete_used = true;
+                      key->traditional_used = true;
                       insertkey (key);
                     }
                 }
diff --git a/src/tail.c b/src/tail.c
index 2a72a93..caa5407 100644
--- a/src/tail.c
+++ b/src/tail.c
@@ -1981,7 +1981,6 @@ parse_obsolete_option (int argc, char * const *argv, uintmax_t *n_units)
   const char *p;
   const char *n_string;
   const char *n_string_end;
-  bool obsolete_usage;
   int default_count = DEFAULT_N_LINES;
   bool t_from_start;
   bool t_count_lines = true;
@@ -1994,7 +1993,9 @@ parse_obsolete_option (int argc, char * const *argv, uintmax_t *n_units)
          || (3 <= argc && argc <= 4 && STREQ (argv[2], "--"))))
     return false;
 
-  obsolete_usage = (posix2_version () < 200112);
+  int posix_ver = posix2_version ();
+  bool obsolete_usage = posix_ver < 200112;
+  bool traditional_usage = obsolete_usage || 200809 <= posix_ver;
   p = argv[1];
 
   switch (*p++)
@@ -2003,8 +2004,8 @@ parse_obsolete_option (int argc, char * const *argv, uintmax_t *n_units)
       return false;
 
     case '+':
-      /* Leading "+" is a file name in the non-obsolete form.  */
-      if (!obsolete_usage)
+      /* Leading "+" is a file name in the standard form.  */
+      if (!traditional_usage)
         return false;
 
       t_from_start = true;
@@ -2014,7 +2015,7 @@ parse_obsolete_option (int argc, char * const *argv, uintmax_t *n_units)
       /* In the non-obsolete form, "-" is standard input and "-c"
          requires an option-argument.  The obsolete multidigit options
          are supported as a GNU extension even when conforming to
-         POSIX 1003.1-2001, so don't complain about them.  */
+         POSIX 1003.1-2001 or later, so don't complain about them.  */
       if (!obsolete_usage && !p[p[0] == 'c'])
         return false;
 
diff --git a/src/uniq.c b/src/uniq.c
index 0e118da..896ce93 100644
--- a/src/uniq.c
+++ b/src/uniq.c
@@ -226,6 +226,13 @@ Also, comparisons honor the rules specified by 'LC_COLLATE'.\n\
   exit (status);
 }
 
+static bool
+strict_posix2 (void)
+{
+  int posix_ver = posix2_version ();
+  return 200112 <= posix_ver && posix_ver < 200809;
+}
+
 /* Convert OPT to size_t, reporting an error using MSGID if OPT is
    invalid.  Silently convert too-large values to SIZE_MAX.  */
 
@@ -533,7 +540,7 @@ main (int argc, char **argv)
           {
             unsigned long int size;
             if (optarg[0] == '+'
-                && posix2_version () < 200112
+                && ! strict_posix2 ()
                 && xstrtoul (optarg, NULL, 10, &size, "") == LONGINT_OK
                 && size <= SIZE_MAX)
               skip_chars = size;
diff --git a/tests/misc/tail.pl b/tests/misc/tail.pl
index 0d9bc48..57ed62d 100755
--- a/tests/misc/tail.pl
+++ b/tests/misc/tail.pl
@@ -116,12 +116,15 @@ foreach my $t (@tv)
     $ret
       and push @$e, {EXIT=>$ret}, {ERR=>$err_msg}, {ERR_SUBST=>$err_sub};
 
-    $test_name =~ /^(obs-plus-|minus-)/
+    $test_name =~ /^minus-/
       and push @$e, {ENV=>'_POSIX2_VERSION=199209'};
 
     $test_name =~ /^(err-6|c-2)$/
       and push @$e, {ENV=>'_POSIX2_VERSION=200112'};
 
+    $test_name =~ /^obs-plus-/
+      and push @$e, {ENV=>'_POSIX2_VERSION=200809'};
+
     $test_name =~ /^f-pipe-/
       and push @$e, {ENV=>'POSIXLY_CORRECT=1'};
 
-- 
2.5.0





Bug closed; send any further explanations to 22782 <at> debbugs.gnu.org and Paul Eggert <eggert <at> cs.ucla.edu>. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Tue, 23 Feb 2016 09:13:02 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 22 Mar 2016 11:24:04 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.