GNU bug report logs - #5393
Patch for lookaround assertion in regexp


Package: emacs;

Reported by: Tomohiro MATSUYAMA <t.matsuyama.pub <at> gmail.com>

Date: Fri, 15 Jan 2010 18:46:02 UTC

Severity: wishlist

Tags: patch

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

Forwarded to http://lists.gnu.org/archive/html/emacs-devel/2012-01/msg00732.html



Message #1 received at quiet <at> debbugs.gnu.org (full text, mbox):

From: Tomohiro MATSUYAMA <t.matsuyama.pub <at> gmail.com>
To: quiet <at> debbugs.gnu.org
Subject: Patch for lookaround assertion in regexp
Date: Thu, 4 Jun 2009 08:04:25 +0900
[Message part 1 (text/plain, inline)]
Severity: wishlist
Tags: patch

[ resent from
  http://lists.gnu.org/archive/html/emacs-devel/2009-06/msg00094.html ]

Hi, all

I have attached a patch that enables you to
use lookaround assertions in regexps
with the following syntax:

* Positive lookahead assertion
    \(?=...\)
* Negative lookahead assertion
    \(?!...\)
* Positive lookbehind assertion
    \(?<=...\)
* Negative lookbehind assertion
    \(?<!...\)

Basically, it works the same as Perl's.
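
For readers unfamiliar with the four assertions, here is a small sketch of their Perl-style semantics. It uses Python's re module (an assumption made purely for illustration; it is a different engine from the one the patch modifies), which spells them (?=...), (?!...), (?<=...) and (?<!...) without the backslashes. The sample string is made up.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "100USD 200EUR 300USD"

# Positive lookahead: digits followed by "USD"; "USD" itself is not consumed.
followed_by_usd = re.findall(r"[0-9]+(?=USD)", text)

# Negative lookahead: digits NOT followed by "USD". The extra [0-9]
# alternative stops the engine from backtracking to a shorter number
# such as "10" inside "100USD".
not_followed_by_usd = re.findall(r"[0-9]+(?![0-9]|USD)", text)

# Positive lookbehind: letters immediately preceded by "200".
preceded_by_200 = re.findall(r"(?<=200)[A-Z]+", text)

# Negative lookbehind: a digit NOT preceded by another digit,
# i.e. the first digit of each number.
leading_digits = re.findall(r"(?<![0-9])[0-9]", text)

print(followed_by_usd, not_followed_by_usd, preceded_by_200, leading_digits)
```

In every case the text inside the assertion is only tested, never included in the match.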

Spec:
* Any pattern is allowed in a lookahead assertion.
* Nested lookaround assertions are allowed.
* Capturing is allowed only in positive lookahead/lookbehind assertions.
* Duplication is allowed after such an assertion.
* Variable-length patterns are NOT yet allowed in lookbehind assertions.
  [x] \(?<=[0-9]+\)MB
  [o] \(?<=[0-9][0-9][0-9][0-9]\)MB
* A lookahead assertion over the start bound is not allowed in re-search-backward.
  (re-search-backward "\\(?<=a\\)b") for buffer "abca_|_b"
  will seek to the first "ab".
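
Some of the points above can be tried out in any engine with Perl-style lookaround. The sketch below again uses Python's re module as a stand-in (an assumption for illustration only; it says nothing about how the patch is implemented). Python happens to have the same fixed-width restriction on lookbehind, so the [x] and [o] patterns behave as marked. The test strings are made up.

```python
import re

# [x] Variable-length lookbehind: Python's re rejects it at compile time.
try:
    re.compile(r"(?<=[0-9]+)MB")
    variable_length_ok = True
except re.error:
    variable_length_ok = False

# [o] Fixed-length lookbehind: exactly four digits must precede "MB".
fixed = re.search(r"(?<=[0-9][0-9][0-9][0-9])MB", "1024MB")

# Capturing inside a positive lookahead: the group records the digits
# even though the overall match stops before them.
captured = re.search(r"[a-z]+(?=([0-9]+))", "abc42")

# Nested assertions: "a" followed by "b", where that "b" is not followed by "c".
nested = [m.start() for m in re.finditer(r"a(?=b(?!c))", "abc abd")]

print(variable_length_ok, fixed.span(), captured.group(0), captured.group(1), nested)
```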

As for performance, I think there is no problem with lookahead assertions,
but lookbehind assertions are somewhat costly.
You can check that this patch works properly with the test case I have
attached, and also measure its performance:
    src/emacs --script regex-test.el perf

I saw that a lookbehind assertion takes 5 times as long as an ordinary
lookbehind-like regexp. I think I have to improve its performance.

Anyway, please try it and review it.
And if you like it, please merge it.
I believe that some people really want to use it.

Regards,
MATSUYAMA Tomohiro
[regex-test.el (application/octet-stream, attachment)]
[emacs-regex.patch (application/octet-stream, attachment)]

This bug report was last modified 4 years and 279 days ago.


