GNU bug report logs - #15157
join doesn't follow norms and dies instead of doing something useful w/duplicate options


Package: coreutils;

Reported by: Linda Walsh <coreutils <at> tlinx.org>

Date: Wed, 21 Aug 2013 21:46:02 UTC

Severity: normal


From: Linda Walsh <coreutils <at> tlinx.org>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 15127 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>, 15157 <at> debbugs.gnu.org
Subject: bug#15157: join doesn't follow norms and dies instead of doing something useful w/duplicate options
Date: Wed, 21 Aug 2013 21:54:49 -0700

Pádraig Brady wrote:
> On 08/21/2013 10:44 PM, Linda Walsh wrote:
>> Historically, options specified on the command line take precedence
>> over options in an init/rc-file or in the ENV.  Many utils
>> in a build process build up command lines in pieces -- with the
>> expectation that later values take precedence, allowing for
>> higher level make files to set defaults, while allowing make's
>> in sub directories to override options set in a parent.
>>
>> Defaulting to "fail", rather than proceed with latest input
>> data, is rarely useful for humans.  It's arguable whether or
>> not it is useful for machines in most cases.
>>
>> In the past, unix utils have tried to do what the user meant
>> rather than deliberately playing "stupid" and pretending to have
>> no clue about what was likely expected.
> 
> Right, to support subsequent specification of scripts etc.
> it's useful to allow options to be overridden.
> In addition, this is how other systems behave with respect to
> input field separator options, for example.
> 
> Now on the other hand, the ambiguity being diagnosed here
> in such an obtuse manner, is that one might think that _any_
> of the specified separators are supported in the input,
> rather than the last taking precedence.
----
	There are other utilities that are not officially
part of the 'coreutils' project, but that definitely fall
under the "core unix utilities" definition.  One of those,
which started me looking at the inconsistencies, is
grep (+flavors).

	There, you have the ENV var GREP_OPTIONS, which
I would argue should take the least precedence when compared
with the 'command name' and 'options on the command line'.

	The "-[FEPG]" options are mutually exclusive and can
easily override each other w/o harm.

	To add spice, "egrep" uses GREP_OPTIONS, but
isn't really compatible with 'grep' (as it is now) with regard
to, for example, the search-type switches.  I'm not sure
why, but egrep, right now, refuses all pattern options.  This
is a real kicker:
egrep -E foo
egrep: egrep can only use the egrep pattern syntax
----
Ok, thank you for "sharing", but doesn't '-E' mean egrep pattern syntax?
That even '-E' fails, with a message telling users that they can only
use the syntax they are specifying, seems "abusive".  That
other options in grep DO take the 'last' option, but the
syntax options are disallowed, is inconsistent, unhelpful, and
breaks existing scripts that don't know they
should clear GREP_OPTIONS in order for egrep and fgrep to
function correctly.

There is no reason why "last specified" shouldn't apply there
as well: the ENV is specified before the command is entered and
thus has lowest priority; the command name is the 2nd thing typed
and has next priority; and the options given to the command are
the last thing typed, applied left to right.
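
The ordering argued for above can be sketched in a few lines of Python. This is only an illustration of "last specified wins" as described in this message, not how grep is implemented; the function name `resolve_matcher` and both lookup tables are hypothetical, and the sketch ignores bundled switches such as `-iE` and the `--` end-of-options marker.

```python
import shlex

# Mutually exclusive matcher switches; the last one seen wins.
MATCHER_FLAGS = {"-G": "basic", "-E": "extended", "-F": "fixed", "-P": "perl"}

# Switch implied by the command name (hypothetical table for this sketch).
# Plain "grep" implies nothing, so GREP_OPTIONS can still set its matcher.
NAME_DEFAULTS = {"egrep": "-E", "fgrep": "-F"}


def resolve_matcher(command_name, argv, grep_options=""):
    """Return the matcher selected by the last-specified source.

    Increasing priority: GREP_OPTIONS (set before the command was typed),
    then the command name, then command-line words from left to right.
    """
    candidates = shlex.split(grep_options)
    name_flag = NAME_DEFAULTS.get(command_name)
    if name_flag is not None:
        candidates.append(name_flag)
    candidates.extend(argv)

    chosen = "basic"  # assumed default when nothing selects a matcher
    for word in candidates:
        if word in MATCHER_FLAGS:
            chosen = MATCHER_FLAGS[word]
    return chosen
```

Under these assumptions `resolve_matcher("egrep", ["-E", "foo"])` selects the extended matcher rather than failing, and a stray `-F` in GREP_OPTIONS is overridden by the `egrep` command name.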

It so happens that 'join' was used as a justification for
this behavior in 'grep', which was one of the reasons why I looked
at join (along with sort, and a few others): to note where there
might be inconsistencies, and whether there is a trend of "fail"
taking precedence over the deterministic, working behaviors that
have been defined as normal for as long as I can remember on *nix.


Do you see a reason why grep(+e/f) should fail -- or, especially,
why e/fgrep should fail due to conflicting options in a GREP env
var... or reject specification of a correct format?


> 
> New users of these tools may be caught out though.
---
	They wouldn't have any previous history to be caught
by.  When I came to *nix, I read the man pages and noted that nearly
all of the utilities showed the same behavior (with the exception
of sort, which might have its options confused as applying to different
fields; I'm not sure how likely that is).  I have come to rely on
option-override working in a number of utils, with config files
taking the lowest priority (they are present before the user logs
in), followed by ENV vars (set each session), then the command name
and switches (usually the command name isn't part of that list, but
it is included here to make things consistent).


> We could display a warning but that would negate most of
> the benefit of allowing overriding the option.
> I suppose we could support --debug on all utils to diagnose
> ambiguities like this, rather than disallowing them.
> I'll look at doing both of the above.
----
	--lint?

	debug has other connotations... or --anal^h^h^h^hstrict ?
;^)
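
One way to picture the "override, but diagnose on request" idea is the sketch below. It is a hypothetical illustration in Python, not join's code: `parse_separator` is an invented name, only the `-t` option is handled, and the `--lint` switch mentioned in the comments is just the suggestion from this thread, not an existing option.

```python
def parse_separator(argv):
    """Return (separator, notes) for a join-like argument list.

    The last -t value wins.  Each time an earlier, different value is
    overridden, a note is recorded instead of raising a fatal error.
    A hypothetical --lint/--debug switch could print the notes to stderr;
    without it they would simply be discarded.
    """
    separator = None
    notes = []
    args = iter(argv)
    for word in args:
        if word == "-t":
            # Separate form: -t CHAR
            value = next(args, None)
            if value is None:
                raise ValueError("option -t requires an argument")
        elif word.startswith("-t") and len(word) > 2:
            # Attached form: -tCHAR
            value = word[2:]
        else:
            continue
        if separator is not None and separator != value:
            notes.append(f"-t {separator!r} overridden by -t {value!r}")
        separator = value
    return separator, notes
```

For the argument list `["-t", ",", "-t", ":", "a.txt", "b.txt"]` this returns `":"` as the separator together with one note about the override, so the command can proceed while the ambiguity remains visible to anyone who asks for it.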
> 
> thanks,
> Pádraig.
--
Ditto.  I also need to know how to phrase the problem for the kernel
folks, as they have quite a few places calling grep where they don't check
the exit status (let alone allow for conflicting options now affecting it)...
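
For callers that want to tell "no match" apart from "grep itself failed", a minimal sketch follows. It relies on grep's documented exit statuses (0 when a line is selected, 1 when none is, 2 on error); the helper names `classify_grep_status` and `grep_matches` are hypothetical, and the sketch is written in Python purely for illustration, not taken from any kernel script.

```python
import subprocess


def classify_grep_status(returncode):
    """Map a grep exit status to 'match', 'no match', or 'error'."""
    if returncode == 0:
        return "match"
    if returncode == 1:
        return "no match"
    return "error"  # 2 or greater: bad options, unreadable file, etc.


def grep_matches(pattern, path, grep="grep"):
    """Return True on a match, False on no match; raise if grep failed."""
    result = subprocess.run(
        [grep, "-q", "-e", pattern, "--", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    status = classify_grep_status(result.returncode)
    if status == "error":
        # A rejected or conflicting option lands here instead of being
        # silently mistaken for "pattern not found".
        raise RuntimeError(f"{grep} failed: {result.stderr.strip()}")
    return status == "match"
```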

Could 15127 also be re-opened, as it was closed unilaterally in the
presence of obvious bugs?  Thanks...






This bug report was last modified 11 years and 301 days ago.
