#44486 - 27.1; C-@ chars corrupt elisp buffer

GNU bug report logs - #44486
27.1; C-@ chars corrupt elisp buffer

Package: emacs

Reported by: Thierry Volpiatto <thievol <at> posteo.net>

Date: Fri, 6 Nov 2020 15:24:02 UTC

Severity: minor

Found in version 27.1

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: thievol <at> posteo.net, handa <at> gnu.org, larsi <at> gnus.org, schwab <at> linux-m68k.org, 44486 <at> debbugs.gnu.org
Subject: bug#44486: 27.1; C-@ chars corrupt elisp buffer
Date: Sat, 14 Nov 2020 20:08:04 +0200

> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: larsi <at> gnus.org, thievol <at> posteo.net, handa <at> gnu.org,
>  schwab <at> linux-m68k.org, 44486 <at> debbugs.gnu.org
> Date: Sat, 14 Nov 2020 12:55:51 -0500
>
> >> AFAIK `prefer-utf-8` is only ever used for files which are known to
> >> contain text and should almost always contain UTF-8 text.
> > For those, we should use utf-8, not prefer-utf-8.
>
> No, `utf-8` should be used when other coding systems should be
> considered as errors (i.e. not "almost always" but "always")

Why?

> whereas `prefer-utf-8` is for use when utf-8 is the most likely one
> and other coding systems should be tried only when there's some
> evidence that the file actually doesn't use utf-8.
>
> `prefer-utf-8` was introduced specifically for `.el` files (and I don't
> know of any other use of that encoding so far).

Maybe that was the history, but the reality is different.
prefer-utf-8 is the same as 'undecided' with coding-systems'
priorities tampered to prefer UTF-8.

> If `utf-8` is preferable over `prefer-utf-8` for this usage I think
> the problem is in `prefer-utf-8` since it was introduced
> specifically for that.

The implementation doesn't support your POV.

> >> I believe if there's a NUL byte in such a file but it otherwise doesn't
> >> contain any invalid UTF-8 byte sequence, it will result in better
> >> behavior if we treat it as UTF-8 than as binary.
> > We treat null bytes as the _single_ telltale sign of a binary file.
>
> A .el file should *never* be a binary file.

We are not talking about .el files, we are talking about _any_ file
read using prefer-utf-8.  For .el files, we can always bind
inhibit-null-byte-detection to t when we load or visit such files.

> > If we disable that in coding-systems that are supposed to _detect_
> > encoding, we will never be able to detect binary files.
>
> In which scenario would it be beneficial to detect a `.el` file as being
> binary instead of utf-8?

I'm not talking about .el files.  The coding-system's applicability is
wider than that.
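
For illustration, the binding suggested above for .el files could look roughly like the sketch below. It is not code from this thread: the helper name `my-find-file-ignoring-nul' is hypothetical, and the sketch assumes that a dynamic `let' around the visiting call is enough, because decoding happens while the file is being read. The variable is spelled as in the message; newer Emacs versions spell it `inhibit-nul-byte-detection'.

```elisp
;; Sketch only: `my-find-file-ignoring-nul' is a hypothetical helper,
;; not an existing Emacs function.
(defun my-find-file-ignoring-nul (file)
  "Visit FILE without treating NUL bytes as a sign of binary data.
Return the buffer visiting FILE."
  ;; Dynamically bind the variable named in the message, so that
  ;; coding-system detection during this read skips the NUL-byte check.
  (let ((inhibit-null-byte-detection t))
    (find-file-noselect file)))
```

The binding is in effect only for the duration of the call, so detection of binary files elsewhere is left unchanged.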

This bug report was last modified 4 years and 248 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
