GNU bug report logs - #17722
Makefile rule fix and cleanup patches

Package: grep

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Fri, 6 Jun 2014 23:49:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17722 in the body.
You can then email your comments to 17722 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-grep <at> gnu.org:
bug#17722; Package grep. (Fri, 06 Jun 2014 23:49:02 GMT)

Acknowledgement sent to Jim Meyering <jim <at> meyering.net>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 06 Jun 2014 23:49:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: bug-grep <at> gnu.org
Subject: Makefile rule fix and cleanup patches
Date: Fri, 6 Jun 2014 16:48:12 -0700

[Message part 1 (text/plain, inline)]

I nearly omitted the second, since using scripts
for egrep and fgrep may be removed, but left it
in on principle: set a good example.

[0001-build-don-t-redirect-directly-to.patch (application/octet-stream, attachment)]

[0002-build-improve-rule-to-generate-egrep-fgrep-scripts.patch (application/octet-stream, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#17722; Package grep. (Sat, 07 Jun 2014 00:17:01 GMT)

Message #8 received at 17722 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>, 17722 <at> debbugs.gnu.org
Subject: Re: bug#17722: Makefile rule fix and cleanup patches
Date: Fri, 06 Jun 2014 17:16:02 -0700

Jim Meyering wrote:
> using scripts for egrep and fgrep may be removed

Let's not remove the scripts, as they're better on platforms where 
they're supported.  Users of a script can more-easily understand and 
modify what it does, which is a better match for the GNU project's 
overarching goals.

Instead, we should use scripts on GNUish platforms, and fall back on 
binary executables only on platforms that don't have a working shell.

bug closed, send any further explanations to 17722 <at> debbugs.gnu.org and Jim Meyering <jim <at> meyering.net> Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Sat, 07 Jun 2014 00:17:03 GMT)

Information forwarded to bug-grep <at> gnu.org:
bug#17722; Package grep. (Sat, 07 Jun 2014 01:57:02 GMT)

Message #13 received at 17722 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17722 <at> debbugs.gnu.org
Subject: Re: bug#17722: Makefile rule fix and cleanup patches
Date: Fri, 6 Jun 2014 18:55:49 -0700

On Fri, Jun 6, 2014 at 5:16 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Jim Meyering wrote:
>>
>> using scripts for egrep and fgrep may be removed
>
>
> Let's not remove the scripts, as they're better on platforms where they're
> supported.  Users of a script can more-easily understand and modify what it
> does, which is a better match for the GNU project's overarching goals.
>
> Instead, we should use scripts on GNUish platforms, and fall back on binary
> executables only on platforms that don't have a working shell.

That might fly: use the work-around (separate build rules)
only where needed -- assuming the code changes end up being
small and clean. Though if they're that small and clean, why
wouldn't we want to keep the build rules simple and the same
for everyone?  IMHO, being able to look at the contents of
obsolete wrapper scripts has no added value.

Information forwarded to bug-grep <at> gnu.org:
bug#17722; Package grep. (Sat, 07 Jun 2014 06:57:02 GMT)

Message #16 received at 17722 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: 17722 <at> debbugs.gnu.org
Subject: Re: bug#17722: Makefile rule fix and cleanup patches
Date: Fri, 06 Jun 2014 23:56:46 -0700

Jim Meyering wrote:

> why wouldn't we want to keep the build rules simple and the same
> for everyone?

We wouldn't want to do that if it entailed bloated and opaque 
executables on all platforms, or if it entailed too-complicated C 
programs on all platforms.  We should be able to avoid both problems.

Perhaps we could supply a different build recipe for platforms that lack 
a working shell.  Such platforms can't run shell scripts, after all, so 
maybe it would suffice to supply a text file or a comment containing 
build instructions for such platforms.

Information forwarded to bug-grep <at> gnu.org:
bug#17722; Package grep. (Sat, 07 Jun 2014 11:16:01 GMT)

Message #19 received at 17722 <at> debbugs.gnu.org:

From: behoffski <behoffski <at> grouse.com.au>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Jim Meyering <jim <at> meyering.net>
Cc: 17722 <at> debbugs.gnu.org
Subject: Re: bug#17722: Makefile rule fix and cleanup patches
Date: Sat, 07 Jun 2014 20:45:06 +0930

On 06/07/14 16:26, Paul Eggert wrote:
> Jim Meyering wrote:
>
>> why wouldn't we want to keep the build rules simple and the same
>> for everyone?
>
> We wouldn't want to do that if it entailed bloated and opaque executables on all platforms, or if it entailed too-complicated C programs on all platforms.  We should be able to avoid both problems.
>
> Perhaps we could supply a different build recipe for platforms that lack a working shell.  Such platforms can't run shell scripts, after all, so maybe it would suffice to supply a text file or a comment containing build instructions for such platforms.
>
>

I've already anticipated at least part of this in my untangle'd code,
as the parser only knows about the lexer via two functions:

    lex ()      -- get the next lexical token; and
    exchange () -- Exchange information between the lexer and the
                   parser, according to the opcodes defined in
                   proto_lexparse.h:

    typedef enum proto_lexparse_opcode_enum
    {
      /* Possible future opcode: PROTO_LEXPARSE_OP_GET_LOCALE, entire
         locale from uselocale/duplocale.  */
      PROTO_LEXPARSE_OP_GET_IS_MULTIBYTE_LOCALE,
      PROTO_LEXPARSE_OP_GET_REPMN_MIN,
      PROTO_LEXPARSE_OP_GET_REPMN_MAX,
      PROTO_LEXPARSE_OP_GET_WIDE_CHAR_LIST_MAX,
      PROTO_LEXPARSE_OP_GET_WIDE_CHARS,
      PROTO_LEXPARSE_OP_GET_DOTCLASS,
      PROTO_LEXPARSE_OP_GET_MBCSET,
    } proto_lexparse_opcode_t;

The addresses of the lex and exchange functions are not known at link
time; they are explicitly provided at runtime, via the function
fsaparse_lexer in fsaparse.h.  [The function description omits to
mention the exchange parameter; this is a glitch on my part.]

    /* Receive a lexer function, plus lexer instance context pointer, for use by
       the parser.  Although not needed initially, this plug-in architecture may
       be useful in the future, and it breaks up some of the intricate
       connections that made the original dfa.c code so daunting.  */
    extern void
    fsaparse_lexer (fsaparse_ctxt_t *parser,
                    void *lexer_context,
                    proto_lexparse_lex_fn_t *lex_fn,
                    proto_lexparse_exchange_fn_t *lex_exchange_fn);
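
To illustrate the runtime binding described above, here is a minimal sketch of a parser that reaches its lexer only through two function pointers. Every demo_ name is hypothetical and merely mirrors the shape of the fsaparse_lexer declaration; the real proto_lexparse types and signatures are not shown in this message, so the ones below are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real proto_lexparse.h definitions.  */
typedef int demo_token_t;
enum { DEMO_TOKEN_END = -1 };

typedef enum { DEMO_OP_GET_IS_MULTIBYTE_LOCALE } demo_opcode_t;

/* Function types for the two entry points the parser knows about.  */
typedef demo_token_t demo_lex_fn_t (void *lexer_context);
typedef int demo_exchange_fn_t (void *lexer_context, demo_opcode_t opcode,
                                void *param);

typedef struct
{
  void *lexer_context;
  demo_lex_fn_t *lex_fn;
  demo_exchange_fn_t *exchange_fn;
} demo_parser_t;

/* Bind a lexer to the parser at runtime; nothing is resolved at link time.  */
void
demo_parser_set_lexer (demo_parser_t *parser, void *lexer_context,
                       demo_lex_fn_t *lex_fn, demo_exchange_fn_t *exchange_fn)
{
  assert (parser != NULL && lex_fn != NULL && exchange_fn != NULL);
  parser->lexer_context = lexer_context;
  parser->lex_fn = lex_fn;
  parser->exchange_fn = exchange_fn;
}

/* Toy lexer: its context is a cursor into a NUL-terminated string.  */
typedef struct { const char *cursor; } demo_string_lexer_t;

demo_token_t
demo_string_lex (void *lexer_context)
{
  demo_string_lexer_t *lexer = lexer_context;
  if (*lexer->cursor == '\0')
    return DEMO_TOKEN_END;
  return (unsigned char) *lexer->cursor++;
}

/* Toy exchange function: answers the one opcode it knows, else -1.  */
int
demo_string_exchange (void *lexer_context, demo_opcode_t opcode, void *param)
{
  (void) lexer_context;
  (void) param;
  switch (opcode)
    {
    case DEMO_OP_GET_IS_MULTIBYTE_LOCALE:
      return 0;
    }
  return -1;
}

/* The "parser" here only drains the lexer and counts tokens.  */
size_t
demo_parser_count_tokens (demo_parser_t *parser)
{
  size_t count = 0;
  while (parser->lex_fn (parser->lexer_context) != DEMO_TOKEN_END)
    count++;
  return count;
}
```

A caller would fill in a demo_string_lexer_t, pass it with the two functions to demo_parser_set_lexer, and then call demo_parser_count_tokens; swapping in another lexer needs no change to the parser side.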

It would be trivial to build a "fgrep-only" lexer module that
merely fetches the next character in the input pattern (perhaps
using mbrtowc), and returns it as either a WCHAR list, as a
"self-token", or as a charclass for case-folded unibyte chars.

Some of the opcodes above (e.g. REPMN*, MBCSET) would become no-ops
in the fgrep lexer, as it would never emit those tokens.  At present,
case folding for wide characters is handled via the WCHAR token
allowing a list of equivalents to be specified alongside the
original; the parser ORs these together in the parse tree.
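
As a rough sketch of the fgrep-only lexer idea, the code below decodes one character at a time with mbrtowc and, when case folding is requested, attaches the lower- and upper-case forms as a list of equivalents for the parser to OR together. All demo_ names, the token layout, and the handling of undecodable bytes are assumptions made for illustration; the sketch collapses the WCHAR list, self-token, and charclass cases into one token kind and is not the fsalex interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical names throughout; this is not the fsalex/fsaparse API.  */
enum { DEMO_MAX_FORMS = 3 };  /* the character itself, lower, upper */

typedef enum
{
  DEMO_FGREP_END,          /* pattern exhausted */
  DEMO_FGREP_WCHAR,        /* one character, plus any case equivalents */
  DEMO_FGREP_INVALID_BYTE  /* byte that does not decode in this locale */
} demo_fgrep_kind_t;

typedef struct
{
  demo_fgrep_kind_t kind;
  unsigned char byte;             /* valid for DEMO_FGREP_INVALID_BYTE */
  wchar_t forms[DEMO_MAX_FORMS];  /* forms[0] is the original character */
  size_t nforms;
} demo_fgrep_token_t;

typedef struct
{
  const char *cursor;
  const char *end;
  int fold_case;
  mbstate_t state;
} demo_fgrep_lexer_t;

void
demo_fgrep_lexer_init (demo_fgrep_lexer_t *lexer, const char *pattern,
                       size_t length, int fold_case)
{
  assert (lexer != NULL && pattern != NULL);
  lexer->cursor = pattern;
  lexer->end = pattern + length;
  lexer->fold_case = fold_case;
  memset (&lexer->state, 0, sizeof lexer->state);
}

/* Append WC to TOKEN's list unless it is already present.  */
static void
demo_add_form (demo_fgrep_token_t *token, wchar_t wc)
{
  for (size_t i = 0; i < token->nforms; i++)
    if (token->forms[i] == wc)
      return;
  token->forms[token->nforms++] = wc;
}

/* Return the next token.  No character is special: '.', '*', and '['
   come back as ordinary characters, which is the point of fgrep.  */
demo_fgrep_token_t
demo_fgrep_lex (demo_fgrep_lexer_t *lexer)
{
  demo_fgrep_token_t token = { DEMO_FGREP_END, 0, { 0 }, 0 };
  if (lexer->cursor == lexer->end)
    return token;

  wchar_t wc = L'\0';
  size_t n = mbrtowc (&wc, lexer->cursor,
                      (size_t) (lexer->end - lexer->cursor), &lexer->state);
  if (n == (size_t) -1 || n == (size_t) -2)
    {
      /* Invalid or truncated sequence: hand back one raw byte and
         reset the conversion state so decoding can resume.  */
      token.kind = DEMO_FGREP_INVALID_BYTE;
      token.byte = (unsigned char) *lexer->cursor++;
      memset (&lexer->state, 0, sizeof lexer->state);
      return token;
    }

  lexer->cursor += (n == 0 ? 1 : n);  /* an embedded NUL consumes one byte */
  token.kind = DEMO_FGREP_WCHAR;
  demo_add_form (&token, wc);
  if (lexer->fold_case)
    {
      demo_add_form (&token, (wchar_t) towlower ((wint_t) wc));
      demo_add_form (&token, (wchar_t) towupper ((wint_t) wc));
    }
  return token;
}
```

Because such a lexer never produces repetition or bracket-expression tokens, an exchange function paired with it could answer the corresponding opcodes with fixed values.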

While the selection could be deferred until runtime (link both
versions and bind when the parser/lexer link is established), the
linker could be tailored to exclude the larger lexer (fsalex) in the
fgrep version (including avoiding e.g. the parse_bracket_exp
infrastructure).  This leads to both easier-to-read code and a
smaller executable.
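
The two options in the paragraph above (bind at runtime, or leave the larger lexer out of an fgrep-only build) might look roughly like the sketch below. It uses a preprocessor macro, DEMO_FGREP_ONLY, as a simple stand-in for tailoring the link; that macro and every demo_ name are assumptions for illustration, and the two lexers are stubs that return a fixed value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lexer entry-point type.  */
typedef int demo_lex_fn_t (void *context);

/* Stub for the small fgrep-only lexer; always available.  */
int
demo_fgrep_lex (void *context)
{
  (void) context;
  return 1;
}

#ifndef DEMO_FGREP_ONLY
/* Stub for the full lexer; compiled out of an fgrep-only build.  */
int
demo_full_lex (void *context)
{
  (void) context;
  return 2;
}
#endif

/* Choose a lexer by matcher name at runtime.  Returns NULL when the
   requested lexer is unknown or was not built into this executable.  */
demo_lex_fn_t *
demo_select_lexer (const char *matcher)
{
  assert (matcher != NULL);
  if (strcmp (matcher, "fgrep") == 0)
    return demo_fgrep_lex;
#ifndef DEMO_FGREP_ONLY
  if (strcmp (matcher, "grep") == 0)
    return demo_full_lex;
#endif
  return NULL;
}
```

Built normally, both lexers are linked and the choice is made when the parser/lexer link is established; built with -DDEMO_FGREP_ONLY, the full lexer and anything only it references drop out of the executable.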

In the distant future, new tokens, such as:
    STRING
    STRING_CASE_INSENSITIVE
    WCHAR_STRING
could be added as part of a rework of the token structure; in this
case, the use of CAT in the parser could be reduced, and the work
needed to extract "musts" from the tree could be reduced.  However,
I'd prefer to do this as part of a complete revamp of the token
structure, not by tacking more options onto an existing structure.

cheers,

behoffski
Programmer, Grouse Software

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 05 Jul 2014 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 11 years and 45 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
