GNU bug report logs - #34525
replace-regexp missing some matches


Packages: cc-mode, emacs;

Reported by: Daniel Lopez <daniel.lopez999 <at> gmail.com>

Date: Mon, 18 Feb 2019 08:31:01 UTC

Severity: normal

Done: Alan Mackenzie <acm <at> muc.de>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Alan Mackenzie <acm <at> muc.de>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#34525: closed (replace-regexp missing some matches)
Date: Fri, 01 Mar 2019 17:48:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 1 Mar 2019 17:42:42 +0000
with message-id <20190301174242.GA10816 <at> ACM>
and subject line Re: bug#34525: replace-regexp missing some matches
has caused the debbugs.gnu.org bug report #34525,
regarding replace-regexp missing some matches
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
34525: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=34525
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Daniel Lopez <daniel.lopez999 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: replace-regexp missing some matches
Date: Mon, 18 Feb 2019 08:28:35 +0000
[Message part 3 (text/plain, inline)]
Reproduce:

- Start "emacs -Q" and open the file BitmapFontFace.h
- Evaluate the expression (replace-regexp "\\<Bitmap\\>" "SharedBitmap")
- The text "Replaced 8 occurrences" appears in the echo area.

Problem:

There were actually 12 occurrences (i.e. of the word "Bitmap" surrounded
by word boundaries) in the file that should have been replaced. If I now 
move point back to the start of the buffer and evaluate the expression 
again, it says "Replaced 4 occurrences".
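
A minimal sketch of how the mismatch could be measured from Lisp, for
anyone wanting to check the numbers above. It assumes BitmapFontFace.h
is in `default-directory'; the file name comes from this report, and
everything else is illustrative. Whether the bug triggers in a buffer
that is not displayed is not established here, so the counts may simply
agree.

```elisp
;; Sketch only: compare the number of matches with the number replaced.
;; Assumes BitmapFontFace.h is in `default-directory'.
(with-current-buffer (find-file-noselect "BitmapFontFace.h")
  (let ((expected (how-many "\\<Bitmap\\>" (point-min) (point-max)))
        (size-before (buffer-size)))
    (goto-char (point-min))
    (replace-regexp "\\<Bitmap\\>" "SharedBitmap")
    ;; "SharedBitmap" is 6 characters longer than "Bitmap", so the growth
    ;; in buffer size divided by 6 is the number of replacements made.
    (message "expected %d, replaced %d"
             expected
             (/ (- (buffer-size) size-before) 6))))
```

The buffer is modified but not saved, so it can be killed afterwards
without touching the file on disk.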

The exact number of missed replacements seems to vary over time.
That is, I can test it five times in a row and get 8 initial replacements
each time, but after trying some other search terms, messing with the
file, restarting Emacs etc., I try my initial test again and then maybe
it consistently replaces 10 the first time, for a while. So your exact
numbers may vary.

I debugged the Lisp as far as I could and it appears to be wrong answers 
coming out of the re-search-forward C call that is in 
isearch-search-fun-default.

The bug filters up to a number of string replacement user actions - I 
first noticed it when trying to do this replacement interactively with 
query-replace on word boundaries (C-u M-%), entering "Bitmap" as search 
string, then "SharedBitmap" as replacement string. Trying now, as I 
press space repeatedly about once a second to confirm each one, I see 
the pink highlight skip valid matches to ask me about one that is 
further down even while I see the skipped one highlighted in blue a few 
lines above, and in the end it may have replaced only 6-8 of the 
occurrences. Though, if I press 'n' instead of space to skip without 
making any replacements, it does visit all of the occurrences.

I see from the Lisp that plain (non-regexp) query-replace on word 
boundaries gets preprocessed into the equivalent regexp search as in my 
initial example. I don't think there are any problems with plain string 
search and replacement.

Some more experimental observations:

- The replacement text can be any string instead of "SharedBitmap", e.g.
"qwertyasdfgh", "qwer", etc, and the bug still happens. The number of 
matches seems to be related to the length of the replacement string. 
Currently 12 character replacement strings are causing replace-regexp to 
make 8 replacements on the first call for me, while 4 character strings 
cause 7 replacements. 6 character replacement strings - i.e. the same length
as "Bitmap" - always work, replacing all 12 occurrences.

- The bug doesn't happen in fundamental-mode, nor c-mode, js-mode, 
text-mode or any other major modes I tried.

- I've seen this happen in other C++ files of mine where I was making the
same replacement, so the problem's not precisely unique to this one. 
I've been trying to simplify this one but haven't found anything much 
more revealing so far. For example if I delete all the comments and 
blank lines, then the first replacement finds 9 occurrences out of 10. 
If I cut the file in half by deleting line 140 onwards, the first 
replacement finds 3 occurrences out of 6. But if I do something very 
simple like just pasting "Bitmap<PixelType>" on 100 consecutive lines, 
it's not fooled and it replaces them all.
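
The observations above could be checked more systematically with
something like the following sketch. The helper name
`my-count-replacements' is hypothetical, and the choice of `c++-mode'
as the affected mode is an assumption (the report only says the files
are C++). The loop uses `re-search-forward' with `replace-match'
rather than `replace-regexp', so it may not exercise exactly the same
code path.

```elisp
;; Sketch only; `my-count-replacements' is a hypothetical helper.
(defun my-count-replacements (file mode replacement)
  "Replace the word Bitmap in FILE under MODE; return the count.
FILE is read into a temporary buffer, so nothing on disk changes."
  (with-temp-buffer
    (insert-file-contents file)
    (funcall mode)
    (goto-char (point-min))
    (let ((case-fold-search nil)        ; match "Bitmap" exactly
          (count 0))
      (while (re-search-forward "\\<Bitmap\\>" nil t)
        (replace-match replacement t t) ; fixed case, literal text
        (setq count (1+ count)))
      count)))

;; Try each mode with 4-, 6- and 12-character replacement strings.
(dolist (mode '(fundamental-mode c-mode c++-mode))
  (dolist (replacement '("qwer" "qwerty" "SharedBitmap"))
    (message "%s with %S: %d replacements"
             mode replacement
             (my-count-replacements "BitmapFontFace.h"
                                    mode replacement))))
```

If the length dependence described above holds, the 6-character
replacement should report the full count in every mode.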

I've tried this in GNU Emacs 26.1 on Arch Linux and 25.2.1 on Windows 7 
and am seeing the same behaviour in both.

Thanks,
Daniel
[BitmapFontFace.h (text/x-chdr, attachment)]
[Message part 5 (message/rfc822, inline)]
From: Alan Mackenzie <acm <at> muc.de>
To: 34525-done <at> debbugs.gnu.org
Cc: Daniel Lopez <daniel.lopez999 <at> gmail.com>
Subject: Re: bug#34525: replace-regexp missing some matches
Date: Fri, 1 Mar 2019 17:42:42 +0000
Bug fixed in master.

Closing.

-- 
Alan Mackenzie (Nuremberg, Germany).


This bug report was last modified 6 years and 86 days ago.
