GNU bug report logs -
#31796
26.1; dired-do-find-regexp-and-replace fails to find multiline regexps
Message #161 received at 31796 <at> debbugs.gnu.org (full text, mbox):
On 02.12.2020 19:39, Eli Zaretskii wrote:
>> Cc: abela <at> chalmers.se, 31796 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dgutov <at> yandex.ru>
>> Date: Wed, 2 Dec 2020 19:17:06 +0200
>>
>> On 02.12.2020 16:56, Eli Zaretskii wrote:
>>> The point is that our heuristics for detecting encoding are not
>>> perfect, so they could fail.
>>
>> Do you imagine Grep could use a more reliable detection algorithm?
>
> No, I don't. But it could allow the user to specify a different
> encoding for each file, as in
>
> grep --encoding=FOO FILES1* --encoding=BAR FILES2*

Not sure we can call it like that in an automated fashion (i.e. in
project-find-regexp). But hey, somebody else could.

> etc. And even if it just did the job of the same quality as we do, it
> will do it faster, which is why we use Grep in the first place, right?

That's true.

> The important part of the "enhancement" I described is actually the
> fact that the output gets encoded in a single encoding, no matter what
> was the encoding of the original files. This makes reading and
> decoding the output simple and always correct.

Yes, OK.

>> Although... since it has to scan the full file anyway, it could first do
>> a quick detection, and then maybe rescan from the beginning if the
>> encoding turns out to be something else.
>
> That'd be too late, as some matches were already output.

It could buffer them until the full file has been parsed. Encoding
detection and conversion must add a certain overhead anyway, so I'm not
sure how expensive the extra buffering would be in comparison.

As a bonus, per-file buffering like that would allow easier
parallelization of searches.
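
To make the buffering idea concrete, here is a rough sketch in Python. It is
not how Grep or project-find-regexp works; the function names, the
"utf-8 first, latin-1 as fallback" guess order, and the line-by-line matching
are all assumptions made for illustration. Matches are held per file until the
whole file has decoded cleanly; if the quick guess turns out to be wrong, the
buffered matches are dropped and the file is rescanned with the next encoding.
Results come back as ordinary strings, so the caller can emit everything in a
single encoding.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def search_file_bytes(name, data, pattern, encodings=("utf-8", "latin-1")):
    """Return buffered "name:line:text" matches for one file.

    `encodings` is a hypothetical guess order: a quick first guess,
    then a fallback. latin-1 decodes any byte sequence, so the last
    attempt always succeeds.
    """
    regex = re.compile(pattern)
    for encoding in encodings:
        buffered = []  # nothing is reported until the whole file is parsed
        try:
            for lineno, raw in enumerate(data.splitlines(), start=1):
                line = raw.decode(encoding)
                if regex.search(line):
                    buffered.append(f"{name}:{lineno}:{line}")
        except UnicodeDecodeError:
            continue  # wrong guess: discard the buffer and rescan
        return buffered
    return []


def search_many(files, pattern):
    """Search each file independently; per-file buffers keep output ordered."""
    with ThreadPoolExecutor() as pool:
        per_file = list(
            pool.map(lambda item: search_file_bytes(item[0], item[1], pattern),
                     files.items())
        )
    return [match for matches in per_file for match in matches]


if __name__ == "__main__":
    files = {
        "a.txt": "héllo\nworld\n".encode("utf-8"),
        "b.txt": "hello ok\ncafé héllo\n".encode("latin-1"),
    }
    # All output is encoded once, in one encoding, whatever the inputs were.
    print("\n".join(search_many(files, "h.llo")).encode("utf-8"))
```

In this sketch the first line of b.txt matches under the utf-8 guess, but that
match is never emitted early: the second line fails to decode, the buffer is
thrown away, and the file is rescanned as latin-1. Because each file carries
its own buffer, the files can be searched concurrently without their output
interleaving.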
This bug report was last modified 4 years and 246 days ago.