GNU bug report logs - #29909
non-greedy matching (RE2)


Package: sed

Reported by: Shawn Landden <slandden <at> gmail.com>

Date: Sat, 30 Dec 2017 17:18:02 UTC

Severity: wishlist

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#29909: closed (non-greedy matching (RE2))
Date: Sat, 30 Dec 2017 23:56:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Sat, 30 Dec 2017 16:55:11 -0700
with message-id <c378e73a-c10e-e475-930e-9729649ff414 <at> gmail.com>
and subject line Re: bug#29909: non-greedy matching (RE2)
has caused the debbugs.gnu.org bug report #29909,
regarding non-greedy matching (RE2)
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
29909: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=29909
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Shawn Landden <slandden <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: non-greedy matching (RE2)
Date: Sat, 30 Dec 2017 05:01:12 -0800
It is well known that sed lacks non-greedy regular expression matches.
This means that sed can only match a subset of regular languages[1].
Would a proper patch to add re2 support[2], so that sed implements ALL
regular languages correctly, in O(n) time, be considered?

Thanks,

Shawn Landden

[1] https://en.wikipedia.org/wiki/Regular_language#Location_in_the_Chomsky_hierarchy
And because that link isn't very good, 28c3: The Science of Insecurity
https://www.youtube.com/watch?v=3kEfedtQVOY
[2] https://github.com/google/re2


[Message part 3 (message/rfc822, inline)]
From: Assaf Gordon <assafgordon <at> gmail.com>
To: Shawn Landden <slandden <at> gmail.com>, 29909-done <at> debbugs.gnu.org
Subject: Re: bug#29909: non-greedy matching (RE2)
Date: Sat, 30 Dec 2017 16:55:11 -0700
severity 29909 wishlist
stop

Hello Shawn,

On 2017-12-30 06:01 AM, Shawn Landden wrote:
> It is well known that sed lacks non-greedy regular expression matches.
> This means that sed can only match a subset of regular languages[1].
> Would a proper patch to add re2 support[2], so that sed implements ALL
> regular languages correctly, in O(n) time, be considered?
> 
> [2] https://github.com/google/re2

First,
A working patch is worth 1000 emails :)
If you already have something working, that will go a long way
towards considering this feature.

However,
From a cursory look, I would say using RE2 in GNU sed is not likely.
RE2 is a C++ library, and while there is a C wrapper for it,
it will make compiling GNU sed much more complicated than it is today.

It could be added as an optional dependency,
but GNU sed is included in many "minimal" installations, and those will
likely opt not to add additional libraries to their minimal setup -
so by default most users won't benefit from RE2 at all.

There was an attempt to add PCRE support to GNU sed (which has been
shelved for now). PCRE is much more commonly available than RE2,
and if any effort is made in this direction, I would think focusing
on reviving the PCRE patch would be more effective.

As such, I'm marking this ticket as a "wishlist" item and closing it,
but discussion can continue by replying to this thread.

regards,
 - assaf
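
The greedy versus non-greedy distinction raised in this thread can be
illustrated with a minimal sketch. It uses Python's `re` module purely as
an assumption for demonstration (neither sed nor RE2 is involved), and the
sample string and the helper name `first_match` are hypothetical. It also
shows the negated-character-class idiom, which uses only POSIX syntax and
is therefore available in sed as well.

```python
import re

# Hypothetical sample input: two HTML-like tags on one line.
text = "<b>bold</b> and <i>italic</i>"


def first_match(pattern, s):
    """Return the text of the first match of pattern in s, or None."""
    m = re.search(pattern, s)
    return m.group(0) if m else None


# Greedy: '.*' consumes as much as possible, so the match runs from the
# first '<' to the last '>' on the line.
greedy = first_match(r"<.*>", text)

# Non-greedy: '.*?' consumes as little as possible, so the match stops at
# the first '>'. This '?' modifier is Perl-style syntax, not POSIX.
lazy = first_match(r"<.*?>", text)

# Negated character class: '[^>]*' cannot cross a '>', which gives the
# same result as the non-greedy form here using only POSIX syntax.
negated = first_match(r"<[^>]*>", text)

print(greedy)
print(lazy)
print(negated)
```

In sed, the third form could be written as, for example, s/<[^>]*>//,
which removes only the first tag rather than everything up to the last '>'.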


This bug report was last modified 7 years and 207 days ago.
