GNU bug report logs - #46859
28.0.50; [PATCH]: Add option to truncate long lines in xref.el
Reported by: Theodor Thornhill <theo <at> thornhill.no>
Date: Mon, 1 Mar 2021 20:42:01 UTC
Severity: normal
Tags: patch
Found in version 28.0.50
Fixed in version 28.1
Done: Dmitry Gutov <dgutov <at> yandex.ru>
Bug is archived. No further changes may be made.
> Date: Thu, 04 Mar 2021 09:19:50 +0000
> From: Gregory Heytings <gregory <at> heytings.org>
> cc: 46859 <at> debbugs.gnu.org
>
> > While you discuss all those possibilities, please be aware that byte
> > offsets have one more problem: converting them to character offsets or
> > columns might not be trivial, especially if the encoding of the file is
> > not UTF-8. (Apologies if you already discussed this.)
> >
>
> We did not discuss this, thanks for pointing that out.
>
> Is this not easy to do with byte-to-position?

No. byte-to-position works for text in an Emacs buffer, whereas we
are talking about the text in its original file on disk. Unless that
file is encoded in UTF-8, byte-to-position will give you wrong
results. You need to use filepos-to-bufferpos, and you will need to
specify the file's encoding. And it's relatively slow for non-UTF-8
encoded files.
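
To illustrate why the file's encoding has to be known, here is a minimal Python sketch. It is not Emacs Lisp and does not show how filepos-to-bufferpos is implemented; the helper name file_byte_to_char_offset and the sample text are hypothetical, and the sketch assumes the byte offset falls on a character boundary, as the start of a grep match would.

```python
def file_byte_to_char_offset(data: bytes, byte_offset: int, encoding: str) -> int:
    """Return the number of characters that precede byte_offset in data.

    Hypothetical helper for illustration only. Assumes byte_offset lies on
    a character boundary in the given encoding.
    """
    return len(data[:byte_offset].decode(encoding))


text = "café au lait"
latin1_file = text.encode("latin-1")  # 'é' is stored as 1 byte
utf8_file = text.encode("utf-8")      # 'é' is stored as 2 bytes

# The same match ("au") sits at different byte offsets in the two files...
latin1_offset = latin1_file.index(b"au")
utf8_offset = utf8_file.index(b"au")

# ...but decoding the prefix with the right encoding gives the same
# character offset in both cases.
char_from_latin1 = file_byte_to_char_offset(latin1_file, latin1_offset, "latin-1")
char_from_utf8 = file_byte_to_char_offset(utf8_file, utf8_offset, "utf-8")
```

The point of the sketch is only that a byte offset is meaningless without the encoding that produced the bytes.
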
> What I would suggest is to use "grep -nbo '.\{0,50\}PATTERN.\{0,50\}'", to
> hide the byte position in the xref buffer, and when the user jumps to an
> occurrence to use something like (goto-char (byte-to-position
> (get-byte-position))). Does that make sense?

Yes, but see above about encodings other than UTF-8. For example, if
the original file is in Latin-1, each character is 1 byte, but in an
Emacs buffer non-ASCII Latin-1 characters will take 2 bytes.
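
The mismatch can be shown with a small Python sketch. It uses plain UTF-8 as a stand-in for the buffer's internal representation (an assumption made for illustration; the Emacs internal encoding is a superset of UTF-8), and the sample text is made up.

```python
text = "café au lait"

file_bytes = text.encode("latin-1")   # on-disk Latin-1 file: 1 byte per character
buffer_bytes = text.encode("utf-8")   # stand-in for the buffer's internal bytes

# Byte offset of the match within the file, as "grep -b -o" would report it.
file_offset = file_bytes.index(b"au")

# Reusing that offset against the buffer's bytes lands one byte too early,
# because 'é' occupies 2 bytes there instead of 1.
at_offset_in_file = file_bytes[file_offset:file_offset + 2]
at_offset_in_buffer = buffer_bytes[file_offset:file_offset + 2]
```

Here the file offset points at "au" in the Latin-1 bytes but at " a" in the UTF-8 bytes, and the error grows by one byte for every non-ASCII character that precedes the match.
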
This bug report was last modified 4 years and 89 days ago.