GNU bug report logs - #29909
non-greedy matching (RE2)

Previous Next

Package: sed;

Reported by: Shawn Landden <slandden <at> gmail.com>

Date: Sat, 30 Dec 2017 17:18:02 UTC

Severity: wishlist

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 29909 in the body.
You can then email your comments to 29909 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-sed <at> gnu.org:
bug#29909; Package sed. (Sat, 30 Dec 2017 17:18:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Shawn Landden <slandden <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Sat, 30 Dec 2017 17:18:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Shawn Landden <slandden <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: non-greedy matching (RE2)
Date: Sat, 30 Dec 2017 05:01:12 -0800
It is well known that sed lacks non-greedy regular expression matches.
This means that sed can only match a subset of regular languages[1].
Would a proper patch to add re2 support[2], so that sed implements ALL
regular languages correctly, in O(n) time, be considered?

Thanks,

Shawn Landden

[1] https://en.wikipedia.org/wiki/Regular_language#Location_in_the_Chomsky_hierarchy
And because that link isn't very good, 28c3: The Science of Insecurity
https://www.youtube.com/watch?v=3kEfedtQVOY
[2] https://github.com/google/re2




Severity set to 'wishlist' from 'normal' Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 30 Dec 2017 23:56:02 GMT) Full text and rfc822 format available.

Reply sent to Assaf Gordon <assafgordon <at> gmail.com>:
You have taken responsibility. (Sat, 30 Dec 2017 23:56:02 GMT) Full text and rfc822 format available.

Notification sent to Shawn Landden <slandden <at> gmail.com>:
bug acknowledged by developer. (Sat, 30 Dec 2017 23:56:02 GMT) Full text and rfc822 format available.

Message #12 received at 29909-done <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Shawn Landden <slandden <at> gmail.com>, 29909-done <at> debbugs.gnu.org
Subject: Re: bug#29909: non-greedy matching (RE2)
Date: Sat, 30 Dec 2017 16:55:11 -0700
severity 29909 wishlist
stop

Hello Shawn,

On 2017-12-30 06:01 AM, Shawn Landden wrote:
> It is well known that sed lacks non-greedy regular expression matches.
> This means that sed can only match a subset of regular languages[1].
> Would a proper patch to add re2 support[2], so that sed implements ALL
> regular languages correctly, in O(n) time, be considered?
> 
> [2] https://github.com/google/re2

First,
A working patch is worth 1000 emails :)
if you already have something working, that will go a long way
towards considering this feature.

However,
From a cursory look, I would say using RE2 in GNU sed is not likely.
RE2 is a C++ library, and while there is a C wrapper for it,
it will make compiling GNU sed much more complicated than it is today.

It could be added as an optional dependency,
but GNU sed is included in many "minimal" installation, and those will 
likely opt not to add additional libraries to their minimal setup -
so by default most users won't benefit from RE2 at all.

There was an attempt to add PCRE support for GNU sed (which has been 
shelved for now). PCRE is much more commonly available than RE2,
and if any effort is done in this direction, I would think focusing
on reviving the PCRE patch would be more effective.

As such, I'm marking this ticket as a "wishlist" item and closing it,
but discussion can continue by replying to this thread.

regards,
 - assaf




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 28 Jan 2018 12:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 7 years and 205 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.