GNU bug report logs - #35119
Zero-length regexp matches at point when doing re-search-backward


Package: emacs;

Reported by: Sam Halliday <sam.halliday <at> gmail.com>

Date: Wed, 3 Apr 2019 11:20:01 UTC

Severity: normal

Found in version 26.1


From: Sam Halliday <sam.halliday <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 35119 <at> debbugs.gnu.org
Subject: bug#35119: 26.1; narrow-to-region loses word-start/symbol-start information at end
Date: Wed, 3 Apr 2019 13:30:40 +0100

Hi Eli,

Sorry, that was a terrible bug report.

This affects me in `looking-back'. Here's an interactive snippet to
demonstrate the problem (not minimised down to `narrow-to-region'):

(defun look-for-35119 ()
  (interactive)
  (if (looking-back
       (rx (: word-end ":" word-start))
       ;;(rx (: word-end ":"))
       (- (point) 1) t)
      (message "hit")
    (message "miss")))

In emacs-lisp-mode, which defines `:' as a non-word character, run
M-x look-for-35119 with point just after the colon in this example
text:

  wibble:wobble

I would expect to see "hit", but we get "miss". To demonstrate that
the word-start is the cause of the problem, swap in the commented-out
regexp and try again: you'll get "hit", but of course that regexp is
not what is intended. For example, it would also match between the two
colons in the following:

  wibble::wobble

The cause is that the `narrow-to-region' call inside `looking-back'
drops the zero-length word-start match at the beginning of "wobble".
This may or may not be a bug in `narrow-to-region', but I'm quite sure
it's a bug in `looking-back'. There is most likely a similar example
showing that zero-length matches are lost at the start of the narrowed
region as well as at the end.
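
To isolate just the narrowing effect described above, here is a minimal
sketch. The function name `narrowing-demo-35119' is hypothetical (it is
not part of Emacs), and the sketch assumes that ending the accessible
region at point is what hides the following word character. It tests
only `word-start' with `looking-at', not `looking-back' itself.

```elisp
(defun narrowing-demo-35119 ()
  "Return a list (UNNARROWED NARROWED) of `word-start' matches at point.
Hypothetical helper for illustration only; not part of Emacs."
  (with-temp-buffer
    ;; Use the emacs-lisp-mode syntax table, where `:' is not a word char.
    (with-syntax-table emacs-lisp-mode-syntax-table
      (insert "wibble:wobble")
      ;; Put point just after the colon, i.e. right before "wobble".
      (goto-char (point-min))
      (search-forward ":")
      (list
       ;; Whole buffer accessible: the "w" after point is visible.
       (looking-at (rx word-start))
       ;; Accessible region ends at point, so nothing follows point.
       (save-restriction
         (narrow-to-region (point-min) (point))
         (looking-at (rx word-start)))))))
```

Evaluating (narrowing-demo-35119) should, if the analysis above is
right, give a non-nil first element and a nil second element.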

I've tried several alternative implementations of `looking-back', but
none of them work for me. Probably the best workaround I can think of
is to extend the `narrow-to-region' call by one more character at the
start and the end. Dealing with the start is easy, we just goto-char
limit+1, but dealing with the end is difficult: we need to put an
any-character matcher (`.') in the doctored regexp, and then the
match-end is off by one from what the user expects, so then we have to
doctor that too, and then all hell breaks loose.

Does that make sense?


On Wed, 3 Apr 2019 at 12:25, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > From: Sam Halliday <sam.halliday <at> gmail.com>
> > Date: Wed, 03 Apr 2019 12:19:08 +0100
> >
> > If the function `narrow-to-region' (as it is in `looking-back') is used
> > to restrict the region prior to an invocation of re-search-forward or
> > looking-at, then zero length regexp patterns are lost at the boundaries.
>
> Could you please provide a recipe to reproduce the issue?  I'm not
> sure I understand what is the problem you are describing.
>
> Thanks.




This bug report was last modified 3 years and 311 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.