GNU bug report logs - #59341
29.0.50; Lisp files with other encoding than UTF-8?
Reported by: Stefan Kangas <stefankangas <at> gmail.com>
Date: Thu, 17 Nov 2022 19:39:02 UTC
Severity: normal
Found in version 29.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Message #11 received at 59341 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> No. AFAIR, they are in utf-8-emacs because they include characters
> beyond the Unicode range, which UTF-8 cannot encode. See, for
> example, the codepoints that start around line 645 in ind-util.el,
> which are used for converting between IS 13194 (ISCII) and Unicode.
I see, thanks.
Do we need these characters to be raw bytes in the source code though?
I was thinking of a change similar to the below, which would
incidentally make it a bit easier to read the code.
diff --git a/lisp/language/ind-util.el b/lisp/language/ind-util.el
index e2a21820f4..16161319ef 100644
--- a/lisp/language/ind-util.el
+++ b/lisp/language/ind-util.el
@@ -644,9 +644,9 @@ indian-dev-aiba-decode-region
;;Unicode vs IS13194 ;; only Devanagari is supported now.
((ucs-devanagari-to-is13194-alist
'((?\x0900 . "[U+0900]")
- (?\x0901 . " ")
- (?\x0902 . " ")
- (?\x0903 . " ")
+ (?\x0901 . "?\x180000")
+ (?\x0902 . "?\x180001")
+ (?\x0903 . "?\x180002")
(?\x0904 . "[U+0904]")
[and so on]
This change would also avoid confusing external tools. For example, the
code is completely unreadable in many external viewers, such as:
https://github.com/emacs-mirror/emacs/blob/master/lisp/language/ind-util.el#L647
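As a rough illustration of the escape-based spelling proposed above, the sketch below checks that a hex escape in a string literal yields a single character above the Unicode range without any raw bytes in the file. The value #x180000 is simply the one from the diff and may not be the code point ind-util.el actually needs, and `my/raw-char-string' is a hypothetical name. Note that inside a string literal a leading `?' is an ordinary character, so the string would presumably be spelled "\x180000" rather than "?\x180000".

```elisp
;; Sketch only: #x180000 is the example value from the diff above, not
;; necessarily the code point ind-util.el really uses.
(defconst my/raw-char-string "\x180000"
  "Hypothetical one-character string written with a hex escape.")

(list (length my/raw-char-string)               ; number of characters
      (= (aref my/raw-char-string 0) #x180000)  ; same code point as the escape?
      (> (aref my/raw-char-string 0) #x10FFFF)  ; above the last Unicode code point?
      (length "?\x180000"))                     ; `?' counts as its own character
```

Evaluating the form in a scratch buffer should show whether the escaped spelling round-trips to the intended character before any such change is applied to the file.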