GNU bug report logs -
#31796
26.1; dired-do-find-regexp-and-replace fails to find multiline regexps
Message #107 received at 31796 <at> debbugs.gnu.org (full text, mbox):
On 24.11.2020 22:16, Eli Zaretskii wrote:
>> Cc: abela <at> chalmers.se, 31796 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dgutov <at> yandex.ru>
>> Date: Tue, 24 Nov 2020 21:43:22 +0200
>>
>> How about https://debbugs.gnu.org/cgi/bugreport.cgi?bug=31796#23 ?
>
> The idea sounds fine to me.
>
>> Someone more familiar with existing ports of Grep on different systems
>> should weigh in on it.
>
> I don't think it's necessary. We just need to probe Grep for support
> of these switches, and then use it. The result cannot be worse than
> it is now.

Now that I've dug in a little, the situation seems difficult.

-Pz does work, but it forces Grep to consider the file as one long
string. As a consequence, if we ask it to output the line number, the
number will always be 1. That's not a helpful mode of operation.
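
To illustrate the point: once a file is treated as one long string, a match is only known by its offset, and a usable line number has to be recovered by counting the newlines that precede it. The following is a minimal Python sketch of that bookkeeping, not a description of anything Grep does; the helper name `multiline_matches` is hypothetical.

```python
import re


def multiline_matches(text, pattern):
    """Yield (line_number, matched_text) for each match of pattern in text.

    The whole file content is searched as a single string, so the regexp
    may span newlines. The 1-based line number of the match start is
    derived from the number of newlines before the match offset.
    """
    for m in re.finditer(pattern, text):
        line = text.count("\n", 0, m.start()) + 1
        yield line, m.group(0)


if __name__ == "__main__":
    sample = 'a\nnames"\n   b\nc\n'
    for line, matched in multiline_matches(sample, r'names"\n *'):
        print(line, repr(matched))
```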

Even if it worked differently, -P imposes a significant performance
penalty from what I see, even when the extra syntax is not actually
used. So we couldn't enable it by default.

There is a similar program called pcregrep which outputs in the expected
format:
$ pcregrep -MHn "names\"\n *" lisp/progmodes/project.el
lisp/progmodes/project.el:772: :type '(choice (const :tag "Read with
completion from relative names"
project--read-file-cpd-relative)
lisp/progmodes/project.el:774: (const :tag "Read with
completion from absolute names"
project--read-file-absolute)

...but it doesn't seem to have a way to reliably detect where a match
result ends. When we're talking multiline, perhaps the searched file
includes a string like "file-name/etc:number"? Some of our tests
probably do. Grep has a flag -Z (or --null) which adds a null byte
after file names, but pcregrep doesn't.
And anyway, pcregrep isn't usually installed by default.
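
The ambiguity can be shown with a small Python sketch. It assumes a parser that treats any line of the form `file:number:text` as the start of a new result and everything else as a continuation; the names `HEADER_RE` and `split_records` are hypothetical, and the sample output is made up for illustration.

```python
import re

# A line that looks like "path:line:text" is taken as a new result.
HEADER_RE = re.compile(r"^([^:\n]+):(\d+):(.*)$")


def split_records(output):
    """Split colon-delimited search output into [path, line, text] records.

    Lines that do not look like a header are appended to the previous
    record as continuation lines of a multiline match.
    """
    records = []
    for line in output.splitlines():
        m = HEADER_RE.match(line)
        if m:
            records.append([m.group(1), int(m.group(2)), m.group(3)])
        elif records:
            records[-1][2] += "\n" + line
    return records


if __name__ == "__main__":
    # One real multiline match whose second line happens to look like a header.
    output = 'project.el:772:(const :tag "names"\nfile-name/etc:12: more text\n'
    for record in split_records(output):
        print(record)
```

Here the single match is reported as two results, the second one attributed to a file named `file-name/etc`, which is the kind of misparse a null byte after the file name would rule out.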

ripgrep, OTOH, seems to combine both good features here:
$ rg -Hn --multiline --null "names\"\n *" lisp/progmodes/project.el
lisp/progmodes/project.el772: :type '(choice (const :tag "Read with
completion from relative names"
773: project--read-file-cpd-relative)
774: (const :tag "Read with completion from absolute names"
775: project--read-file-absolute)

And it also disables the multiline mode automatically if the regexp
can't match a newline (the multiline mode is significantly slower).
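
As a rough sketch of how such output could be consumed, the Python below assumes that a file name, whenever it is printed, is followed by a null byte, and that every output line then has the form `number:text`. That layout is inferred from the transcript above rather than taken from ripgrep's documentation, and the names `Hit` and `parse_null_separated` are hypothetical.

```python
import re
from typing import Iterator, NamedTuple, Optional


class Hit(NamedTuple):
    path: str
    line: int
    text: str


# Each output line is assumed to be "<line number>:<text>".
_LINE_RE = re.compile(r"(\d+):(.*)", re.DOTALL)


def parse_null_separated(output: str) -> Iterator[Hit]:
    """Yield one Hit per output line of a null-separated search result.

    A null byte can only come from the file name separator, so splitting
    on it is unambiguous even if the matched text contains "file:12:".
    """
    current_path: Optional[str] = None
    for raw in output.split("\n"):
        if "\0" in raw:
            # The file name is everything before the first null byte.
            current_path, raw = raw.split("\0", 1)
        if not raw or current_path is None:
            continue
        m = _LINE_RE.match(raw)
        if m:
            yield Hit(current_path, int(m.group(1)), m.group(2))


if __name__ == "__main__":
    sample = 'lisp/a.el\x00772:  :type "names"\n773:  file-name/etc:12: text\n'
    for hit in parse_null_separated(sample):
        print(hit)
```

The same function also accepts output in which the file name and null byte are repeated on every line, since the current path is simply updated each time a null byte is seen.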

To sum up, there are options, but I don't see a working solution that is
based on GNU Grep. And that's the most portable search program we have,
I think.

The other recommendations I see (here:
https://unix.stackexchange.com/questions/112132/how-can-i-grep-patterns-across-multiple-lines)
include bespoke scripts in sed or perl in command mode. These seem less
portable, but if someone would like to try their hand at one that would
also output file names and line numbers in the expected format, I'd be
happy to benchmark it.
This bug report was last modified 4 years and 246 days ago.