GNU bug report logs - #31796
26.1; dired-do-find-regexp-and-replace fails to find multiline regexps


Package: emacs;

Reported by: Žygimantas Bruzgys <me <at> zygi.xyz>

Date: Tue, 12 Jun 2018 07:56:03 UTC

Severity: minor

Found in version 26.1


From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: abela <at> chalmers.se, 31796 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: bug#31796: 27.1; dired-do-find-regexp-and-replace fails to find multiline regexps
Date: Tue, 01 Dec 2020 17:46:19 +0200
> From: Richard Stallman <rms <at> gnu.org>
> Cc: eliz <at> gnu.org, abela <at> chalmers.se, 31796 <at> debbugs.gnu.org
> Date: Tue, 01 Dec 2020 00:20:12 -0500
> 
> Can people think of a new feature that would be easy to add to GNU grep
> that would make it easy for Dired to handle all cases correctly?

Yes: it should detect encoding of each input file (and have a way of
letting the user specify encoding for each file), convert the file's
contents to some internal encoding (probably UTF-8), then report the
hits encoded in UTF-8, regardless of the file's original encoding (and
regardless of the current locale's codeset).
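As a rough illustration of the behavior described above (not of anything GNU grep does today), here is a minimal Python sketch. The function name grep_utf8 and the sample file are hypothetical; the encoding is passed in by the caller, which corresponds to the "user specifies the encoding" path, and automatic detection is left out entirely.

```python
import re


def grep_utf8(raw: bytes, encoding: str, pattern: str, name: str = "<input>") -> bytes:
    """Search RAW (bytes in ENCODING) line by line; return the hits as UTF-8.

    Hypothetical helper: the encoding is supplied by the caller, no
    detection is attempted.
    """
    text = raw.decode(encoding)          # convert to an internal representation
    regex = re.compile(pattern)
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if regex.search(line):
            hits.append(f"{name}:{lineno}:{line}")
    # The output encoding is fixed, whatever the input encoding was.
    return "\n".join(hits).encode("utf-8")


# A Latin-1 encoded file: the e-acute is the single byte 0xE9 here.
latin1_file = "caf\u00e9 au lait\nplain tea\n".encode("latin-1")
out = grep_utf8(latin1_file, "latin-1", "caf\u00e9", name="menu.txt")
print(out.decode("utf-8"))
```

With output in a fixed encoding, a front end can decode the hits without knowing anything about the original file or the locale.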

> I don't know what the problem is, but if it has to do with parsing the
> grep output, here's an idea: an option to tell GNU grep to use quoting
> on file names and the match strings, perhaps in the same way GNU ls
> does.

The problem is not with file names, it's with the matches.  But since
you mention it: Grep should, in this new mode, also report file names
recoded into UTF-8.  In short, it should arrange for its output to be
in a single encoding known in advance, so that front ends like Emacs
won't need to guess the encoding.

> Another idea is an option to output numerical byte positions in the
> file instead of the lines that are matched.  Emacs can feed those byte
> positions into byte-to-position to convert them into buffer positions.

AFAIU, there's already such an option: -b.  However, byte-to-position
works only with UTF-8 encoded files; we need filepos-to-bufferpos
(which requires knowing the file's encoding, so we are back at the
same problem).
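To make the dependency on the encoding concrete, here is a small Python sketch (not Emacs code). The helper byte_offset_to_char_index is hypothetical, uses 0-based indices, and assumes the byte offset falls on a character boundary. It shows that the same match sits at different byte offsets depending on how the file is encoded, so a byte offset alone cannot be turned into a character position.

```python
def byte_offset_to_char_index(raw: bytes, byte_offset: int, encoding: str) -> int:
    """Map a 0-based byte offset in RAW to a 0-based character index.

    Hypothetical helper: decoding the prefix raises UnicodeDecodeError
    if the offset splits a multibyte character.
    """
    return len(raw[:byte_offset].decode(encoding))


text = "na\u00efve caf\u00e9"            # i-diaeresis precedes the match
utf8 = text.encode("utf-8")              # i-diaeresis takes two bytes
latin1 = text.encode("latin-1")          # i-diaeresis takes one byte

# Byte offsets of the match, as a byte-oriented search would report them.
utf8_offset = utf8.find("caf\u00e9".encode("utf-8"))
latin1_offset = latin1.find("caf\u00e9".encode("latin-1"))

# The byte offsets differ, but both map to the same character index
# once the right encoding is used for the conversion.
utf8_index = byte_offset_to_char_index(utf8, utf8_offset, "utf-8")
latin1_index = byte_offset_to_char_index(latin1, latin1_offset, "latin-1")
print(utf8_offset, latin1_offset, utf8_index, latin1_index)
```

Treating the Latin-1 offset as if the file were UTF-8 (or the reverse) would either land on the wrong character or fail to decode, which is why the encoding has to be known first.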




This bug report was last modified 4 years and 246 days ago.


