GNU bug report logs - #15483
POSIXLY_CORRECT documentation vis a vis some simple EREs

Package: grep

Reported by: gdg <at> zplane.com

Date: Sat, 28 Sep 2013 18:24:09 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#15483: closed (POSIXLY_CORRECT documentation vis a vis some
 simple EREs)
Date: Sun, 29 Sep 2013 23:54:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Sun, 29 Sep 2013 16:53:21 -0700
with message-id <5248BD71.9050309 <at> cs.ucla.edu>
and subject line Re: bug#15483: POSIXLY_CORRECT documentation vis a vis some simple EREs
has caused the debbugs.gnu.org bug report #15483,
regarding POSIXLY_CORRECT documentation vis a vis some simple EREs
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
15483: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15483
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Glenn Golden <gdg <at> zplane.com>
To: bug-grep <at> gnu.org
Subject: POSIXLY_CORRECT documentation vis a vis some simple EREs
Date: Sat, 28 Sep 2013 11:52:38 -0600
--
Regarding EREs having leading repetition operators, e.g. '*xyz':

Section 9.5.3 of

    http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html 

supplies the grammar for POSIX-conforming EREs. From the notes at the very
bottom:

    -----------------------------------------------------------------------
    The ERE grammar does not permit several constructs that previous
    sections specify as having undefined results:

	[ ... ]

        * One or more ERE_dupl_symbols appearing first in an ERE, or [ ... ]

    Implementations are permitted to extend the language to allow these.
    Conforming applications cannot use such constructs. 
    -----------------------------------------------------------------------


To my eyes, the last sentence seems to say that a conforming implementation
must not accept EREs like '*xyz'. But egrep [grep 2.14] does accept them,
even with POSIXLY_CORRECT defined, e.g.

   $ export POSIXLY_CORRECT=1
   $ echo 'abcdefghi' | egrep --color=auto '*def'

matches 'def'.  In contrast, POSIX regex(3) rejects such EREs with "invalid
preceding regular expression".

Not sure whether this is a POSIX conformance issue or not; it depends on the
intended semantics of POSIXLY_CORRECT.

To my eyes, the man page is a bit ambiguous, since it first says that it
"behaves as POSIX.2 requires", but then goes on to list only some specific
behaviors related to option processing.  It wasn't clear to me whether listing
the option-related behavior was intended to limit the scope of the
POSIXLY_CORRECT-ness to only those aspects, or if they were listed just
because they are (for example) often confusing to users, hence worthwhile to
call out explicitly.

In summary, there are a few questions/branches to this:

   1. If POSIXLY_CORRECT is intended to be conforming only in the specific
      respects listed, I'd suggest that the name of the associated envar be
      changed to reflect that (e.g., something like POSIXLY_CORRECT_OPTS),
      and also to change the man page text to read something like:

       POSIXLY_CORRECT_OPTS
	  If set, grep conforms with POSIX.2 with regard to the following
          option processing behaviors: [ description of option behaviors ]

   2. If POSIXLY_CORRECT is intended to mean 'fully conforming in all respects'
      then it seems like the present behavior is in technical violation.

   3. If (2) is the case, and the decision is made to change the behavior of
      grep accordingly, it might be worthwhile to also change the doc for
      POSIXLY_CORRECT to something like this:

       POSIXLY_CORRECT
	  If set, grep conforms with POSIX.2 in all respects.  In particular,
          [ description of option-related behaviors and/or other behaviors
	    that are deemed worthwhile to call out explicitly ]

   4. If (2) is the case, but the decision is made not to change the behavior
      of grep (i.e. accept the non-conformance) it might be worthwhile to
      change the doc for POSIXLY_CORRECT to something like this:

       POSIXLY_CORRECT
	  If set, grep conforms with POSIX.2 in almost all respects.  In
          particular, [ description of option-related behaviors and/or other
          behaviors that are deemed worthwhile to call out explicitly ]. But
          it does not conform precisely regarding EREs like '*xyz' [ and
          whatever other ways are known to be non-conforming. ]

To pre-answer an expected question, asked of a submitter (Roman Donchenko) in
a similar POSIX violation bug report (#37737): 

    Are you encountering this problem in a real-world usecase, or are you
    simply reporting a violation of the standard?

My response is essentially the same as Roman gave: I am reporting it only as a
violation, but otoh, the POSIX-mandated behavior makes a lot more sense to me
than the current behavior, since expressions like '*xyz' are almost always user
error; the intent is usually '.*xyz'.  So if such expressions were rejected by
egrep, it would IMO be a behavioral improvement for users (like, ummm... me)
who chronically mis-remember how '*' is interpreted by bash vs. grep. 
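
The difference between '*xyz' and the usually intended '.*xyz' can be checked mechanically before a pattern ever reaches grep. What follows is only a minimal sketch in Python, not something grep or POSIX provides. The function name starts_with_dupl_symbol is hypothetical, and it covers only the one case quoted from the grammar notes above (a duplication symbol appearing first in the ERE), on the assumption that '*', '+', '?' and an opening '{' are the characters that can begin such a symbol.

```python
# Characters that can begin an ERE duplication symbol: the operators
# '*', '+', '?' and the opening brace of an interval such as {2,3}.
DUPL_START = ("*", "+", "?", "{")


def starts_with_dupl_symbol(ere: str) -> bool:
    """Return True if the ERE begins with a repetition operator.

    Hypothetical helper for illustration only; it looks at the first
    character and does not parse the rest of the expression.
    """
    return ere.startswith(DUPL_START)


if __name__ == "__main__":
    for pattern in ("*def", ".*def", "+x", "a{2}"):
        verdict = "suspicious" if starts_with_dupl_symbol(pattern) else "ok"
        print(f"{pattern!r}: {verdict}")
```

A wrapper script could warn on the "suspicious" patterns and still pass them to grep unchanged, which would leave grep's own behavior alone.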



[Message part 3 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: gdg <at> zplane.com, 15483-done <at> debbugs.gnu.org
Subject: Re: bug#15483: POSIXLY_CORRECT documentation vis a vis some simple
 EREs
Date: Sun, 29 Sep 2013 16:53:21 -0700
Glenn Golden wrote:
> Per the final sentence of 9.5.3, "conforming applications cannot use
> 	[constructs like '*xyz']"

This is making the incorrect assumption that 'grep'
internally must be implemented as a strictly conforming
POSIX application.  POSIX does not require that, and the
rest of your conclusions therefore do not follow.

Eric explained the intent of POSIXLY_CORRECT pretty well.
Occasionally people ask for a different feature, where
a GNU application diagnoses the use of any extension to
POSIX.  The need for such a feature is less, though, and
the hassle is greater, and so it's typically not worth
the aggravation.

As this does not seem to be a bug in grep I'm going to
take the liberty of marking it 'done'.


This bug report was last modified 11 years and 315 days ago.

