GNU bug report logs - #40868
Grep C library for multi-string pattern matching?


Package: grep

Reported by: noloader <at> gmail.com

Date: Sun, 26 Apr 2020 13:59:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 40868 in the body.
You can then email your comments to 40868 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-grep <at> gnu.org:
bug#40868; Package grep. (Sun, 26 Apr 2020 13:59:01 GMT)

Acknowledgement sent to noloader <at> gmail.com:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Sun, 26 Apr 2020 13:59:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jeffrey Walton <noloader <at> gmail.com>
To: bug-grep <at> gnu.org
Subject: Grep C library for multi-string pattern matching?
Date: Sun, 26 Apr 2020 09:58:10 -0400
Hi Everyone,

I need to perform multi-string pattern matching in C. The problem I am
working on does not allow a shell script. I'm looking for a library
that implements Aho–Corasick or Commentz-Walter (or similar).

Does Grep provide a library that exposes its multi-string pattern
matching? If not, can someone recommend an implementation?

Thanks in advance.




Information forwarded to bug-grep <at> gnu.org:
bug#40868; Package grep. (Sun, 26 Apr 2020 19:20:02 GMT)

Message #8 received at 40868 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: noloader <at> gmail.com
Cc: 40868 <at> debbugs.gnu.org
Subject: Re: bug#40868: Grep C library for multi-string pattern matching?
Date: Sun, 26 Apr 2020 12:19:40 -0700
On 4/26/20 6:58 AM, Jeffrey Walton wrote:

> Does Grep provide a library that exposes its multi-string pattern
> matching?

No, and that's partly by design: Grep is GPLed rather than LGPLed. I don't know
of any free library that does anything similar.




Information forwarded to bug-grep <at> gnu.org:
bug#40868; Package grep. (Sun, 26 Apr 2020 23:55:02 GMT)

Message #11 received at submit <at> debbugs.gnu.org:

From: "Paul Jackson" <pj <at> usa.net>
To: bug-grep <at> gnu.org
Subject: Re: bug#40868: Grep C library for multi-string pattern matching?
Date: Sun, 26 Apr 2020 18:53:26 -0500
Perhaps you could use fork, exec, pipe, read, write, and similar system calls,
to execute grep and feed data through it, without resorting to any shell or
any shell wrapper such as the system(3) library call.

Or, if that would work, except for being rather fussy to code,
then consider Colin Watson's libpipeline:

  http://libpipeline.nongnu.org
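
The fork, exec, and pipe approach described above might be sketched roughly as follows. This is only an illustration, not code from grep or libpipeline: the helper name `grep_fixed` and its parameters are hypothetical, it assumes a `grep` executable on the PATH, and it assumes the input and output each fit in a pipe buffer (larger data would need poll() or a separate writer process to avoid deadlock).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "grep -F -e pat1 -e pat2 ..." on `input` and
   copy the matching lines into `out` (NUL-terminated).
   Returns the child's exit status (0 = match, 1 = no match, 127 = exec
   failed), or -1 on error.  No shell is involved.
   Assumption: input and output are small enough to fit in a pipe buffer. */
int grep_fixed(const char *input, const char *const *pats, size_t npats,
               char *out, size_t outcap)
{
    assert(input && pats && npats > 0 && out && outcap > 0);

    int to_child[2], from_child[2];
    if (pipe(to_child) < 0)
        return -1;
    if (pipe(from_child) < 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wire the pipes to stdin/stdout, then exec grep directly. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);

        /* argv: grep -F -e p1 -e p2 ... NULL */
        char **argv = calloc(2 * npats + 3, sizeof *argv);
        if (!argv)
            _exit(127);
        size_t n = 0;
        argv[n++] = "grep";
        argv[n++] = "-F";
        for (size_t i = 0; i < npats; i++) {
            argv[n++] = "-e";
            argv[n++] = (char *)pats[i];
        }
        argv[n] = NULL;
        execvp("grep", argv);
        _exit(127);               /* only reached if exec failed */
    }

    /* Parent: keep only the write end to the child and the read end from it. */
    close(to_child[0]);
    close(from_child[1]);

    size_t len = strlen(input), off = 0;
    while (off < len) {
        ssize_t w = write(to_child[1], input + off, len - off);
        if (w <= 0)
            break;
        off += (size_t)w;
    }
    close(to_child[1]);           /* signals end of input to grep */

    size_t used = 0;
    ssize_t r;
    while (used + 1 < outcap &&
           (r = read(from_child[0], out + used, outcap - 1 - used)) > 0)
        used += (size_t)r;
    out[used] = '\0';
    close(from_child[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Passing each string with its own -e option and using -F asks grep to treat the patterns as fixed strings, so no escaping of regex metacharacters is needed.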

-- 
                Paul Jackson
                pj <at> usa.net




Information forwarded to bug-grep <at> gnu.org:
bug#40868; Package grep. (Mon, 27 Apr 2020 01:09:01 GMT)

Message #14 received at submit <at> debbugs.gnu.org:

From: "Paul Jackson" <pj <at> usa.net>
To: bug-grep <at> gnu.org
Subject: Re: bug#40868: Grep C library for multi-string pattern matching?
Date: Sun, 26 Apr 2020 20:08:21 -0500
P.S. -- on multi-core systems (which most are these days),
piping data between executables working in parallel can
be a good way to reduce the elapsed clock time of a job,
albeit at the expense of higher system CPU utilization.

-- 
                Paul Jackson
                pj <at> usa.net




Information forwarded to bug-grep <at> gnu.org:
bug#40868; Package grep. (Mon, 27 Apr 2020 06:54:02 GMT)

Message #17 received at 40868 <at> debbugs.gnu.org:

From: Shlomi Fish <shlomif <at> shlomifish.org>
To: Jeffrey Walton <noloader <at> gmail.com>
Cc: 40868 <at> debbugs.gnu.org
Subject: Re: bug#40868: Grep C library for multi-string pattern matching?
Date: Mon, 27 Apr 2020 09:52:53 +0300
Hi Mr. Walton!

On Sun, 26 Apr 2020 09:58:10 -0400
Jeffrey Walton <noloader <at> gmail.com> wrote:

> Hi Everyone,
> 
> I need to perform multi-string pattern matching in C. The problem I am
> working on does not allow a shell script. I'm looking for a library
> that implements Aho–Corasick or Commentz-Walter (or similar).
> 
> Does Grep provide a library that exposes its multi-string pattern
> matching? If not, can someone recommend an implementation?
> 

There is an implementation of Aho-Corasick in C++ here:
https://www.geeksforgeeks.org/aho-corasick-algorithm-pattern-searching/ (under
CC-by-sa).
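
For readers who want plain C, the Aho-Corasick idea can be sketched in a few dozen lines. The code below is a minimal illustration written for this thread, not taken from grep or from the page linked above. The names `ac_automaton`, `ac_build`, and `ac_search` are hypothetical, and it assumes at most 32 patterns whose combined length stays under `AC_MAX_STATES`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AC_MAX_STATES 256   /* assumption: small pattern sets only */
#define AC_ALPHABET   256   /* one transition per byte value */

typedef struct {
    int next[AC_MAX_STATES][AC_ALPHABET]; /* transition table */
    int fail[AC_MAX_STATES];              /* failure links */
    unsigned out[AC_MAX_STATES];          /* bitmask of patterns ending here */
    int nstates;
} ac_automaton;

/* Build the automaton.  Returns 0 on success, -1 if the patterns need
   more than AC_MAX_STATES states. */
int ac_build(ac_automaton *ac, const char *const *pats, size_t npats)
{
    assert(ac && pats && npats <= 32);
    memset(ac, 0, sizeof *ac);
    ac->nstates = 1;                      /* state 0 is the root */

    /* Phase 1: insert every pattern into a trie.  No trie edge ever
       points back to the root, so 0 doubles as "no edge yet". */
    for (size_t i = 0; i < npats; i++) {
        int s = 0;
        for (const unsigned char *p = (const unsigned char *)pats[i]; *p; p++) {
            if (ac->next[s][*p] == 0) {
                if (ac->nstates == AC_MAX_STATES)
                    return -1;
                ac->next[s][*p] = ac->nstates++;
            }
            s = ac->next[s][*p];
        }
        ac->out[s] |= 1u << i;
    }

    /* Phase 2: breadth-first pass that sets failure links and fills in
       the missing transitions, turning the trie into a full automaton. */
    int queue[AC_MAX_STATES], head = 0, tail = 0;
    for (int c = 0; c < AC_ALPHABET; c++)
        if (ac->next[0][c])
            queue[tail++] = ac->next[0][c];  /* depth-1 states fail to root */
    while (head < tail) {
        int s = queue[head++];
        for (int c = 0; c < AC_ALPHABET; c++) {
            int t = ac->next[s][c];
            if (t) {
                ac->fail[t] = ac->next[ac->fail[s]][c];
                ac->out[t] |= ac->out[ac->fail[t]];  /* inherit suffix matches */
                queue[tail++] = t;
            } else {
                ac->next[s][c] = ac->next[ac->fail[s]][c];
            }
        }
    }
    return 0;
}

/* Scan `text` once and return a bitmask with bit i set if pattern i
   occurs anywhere in it. */
unsigned ac_search(const ac_automaton *ac, const char *text)
{
    unsigned found = 0;
    int s = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p; p++) {
        s = ac->next[s][*p];
        found |= ac->out[s];
    }
    return found;
}
```

The dense transition table keeps the search loop to one lookup per input byte at the cost of memory (about 256 KB here), so the automaton is better declared static or heap-allocated than placed on a small stack. A production version would likely use a sparser representation.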

Furthermore, you may wish to look at FOSS grep-like tools:

* https://wiki.freebsd.org/BSDgrep

* https://github.com/ggreer/the_silver_searcher

* https://beyondgrep.com/more-tools/

> Thanks in advance.
> 
> 
> 



-- 

Shlomi Fish       https://www.shlomifish.org/
Let’s talk about restores instead of backups - https://is.gd/WatQqu

“Stop reinventing wheels, start building space rockets.”
    — The motto of the Comprehensive Perl Archive Network

Please reply to list if it's a mailing list post - https://shlom.in/reply .




Information forwarded to bug-grep <at> gnu.org:
bug#40868; Package grep. (Mon, 27 Apr 2020 15:14:01 GMT)

Message #20 received at 40868 <at> debbugs.gnu.org:

From: Jeffrey Walton <noloader <at> gmail.com>
To: 40868 <at> debbugs.gnu.org
Subject: Re: bug#40868: Grep C library for multi-string pattern matching?
Date: Mon, 27 Apr 2020 11:12:53 -0400
On Sun, Apr 26, 2020 at 9:59 AM Jeffrey Walton <noloader <at> gmail.com> wrote:
>
> I need to perform multi-string pattern matching in C. The problem I am
> working on does not allow a shell script. I'm looking for a library
> that implements Aho–Corasick or Commentz-Walter (or similar).
>
> Does Grep provide a library that exposes its multi-string pattern
> matching? If not, can someone recommend an implementation?

Thanks everyone.




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Mon, 21 Sep 2020 19:29:01 GMT)

Notification sent to noloader <at> gmail.com:
bug acknowledged by developer. (Mon, 21 Sep 2020 19:29:01 GMT)

Message #25 received at 40868-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: noloader <at> gmail.com
Cc: 40868-done <at> debbugs.gnu.org
Subject: Re: bug#40868: Grep C library for multi-string pattern matching?
Date: Mon, 21 Sep 2020 12:28:28 -0700
Discussion on this old grep bug report has died down (and it wasn't a grep bug
anyway), so I'm closing the bug report.




Information forwarded to bug-grep <at> gnu.org:
bug#40868; Package grep. (Tue, 22 Sep 2020 02:20:01 GMT)

Message #28 received at 40868 <at> debbugs.gnu.org:

From: Bruno Haible <bruno <at> clisp.org>
To: 40868 <at> debbugs.gnu.org, Jeffrey Walton <noloader <at> gmail.com>
Subject: Re: bug#40868: Grep C library for multi-string pattern matching?
Date: Tue, 22 Sep 2020 04:19:06 +0200
> Does Grep provide a library that exposes its multi-string pattern
> matching? If not, can someone recommend an implementation?

I don't know exactly what you mean, but the GNU grep algorithms are
used as a library in GNU gettext:
  https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=tree;f=gettext-tools/libgrep

In hindsight, though, I am not sure that simply using a regex would not
have been just as good.
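
As an illustration of the regex alternative mentioned here, a POSIX extended regular expression with alternation can stand in for a multi-string matcher on small pattern sets. This is a minimal sketch, not code from gettext or grep. The helper name `match_any` is hypothetical, and it assumes the caller has already escaped any regex metacharacters in the fixed strings before joining them with `|`.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` matches the POSIX ERE
   `pattern` (for example "foo|bar|baz"), 0 if it does not, and -1 if
   the pattern fails to compile. */
int match_any(const char *pattern, const char *text)
{
    assert(pattern && text);

    regex_t re;
    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Compiling the pattern on every call keeps the sketch short. Code that matches many inputs against the same set of strings would normally compile once and reuse the `regex_t`.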

Bruno





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 20 Oct 2020 11:24:13 GMT)



