GNU bug report logs - #12598
24.2; utf-8 codepoints in doc-strings and compression of .el and .elc files
Reported by: Achim Gratz <Stromeko <at> nexgo.de>
Date: Sun, 7 Oct 2012 17:46:01 UTC
Severity: normal
Tags: moreinfo
Found in version 24.2
Fixed in version 29.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #23 received at 12598 <at> debbugs.gnu.org:
> I've just removed some utf-8 codepoints from docstrings in org-mode
> because when I compress either the source (.el.gz) or the resulting
> byte-compiled file (.elc.gz), the loader fails after the first function
I can't reproduce this problem for the .el.gz case (indeed, I think
it's specific to byte-compiled files).
> So, any codepoint that is more than a single byte will throw the
> byte-compiler off, not just any utf-8 codepoint. Since this has been in
> Emacs likely ever since unicode strings have been introduced, I'd
> suggest adding a *strong* warning in some prominent place in the
> documentation about this even when it gets fixed in a newer version of
> Emacs. Otherwise it's all too easy to produce libraries that have
> mysterious failures depending on whatever Emacs was used to compile or
> run them.
I think the problem lies between load-with-code-conversion and
eval-buffer, so it dates back to the introduction of
load-with-code-conversion, which IIRC predates the internal use
of Unicode.
Fixing `eval-buffer' so that it skips bytes when it sees #@NN is tricky,
so the best fix is probably to change load-with-code-conversion so that
(if the file is byte-compiled) it saves the buffer to a temp file and
passes that to `load'.
Stefan
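
[Illustration, not part of the original message.] The mismatch described above can be sketched outside Emacs. The sketch assumes that the NN in #@NN counts bytes of the file as stored on disk, while a reader working on an already-decoded buffer advances by characters. The payload layout and the helper names (skip_by_bytes, skip_by_chars) are hypothetical and much simpler than a real .elc file. This is Python, not Emacs's implementation.

```python
# Illustrative sketch only: a simplified stand-in for "skip NN, then read code".
# Assumption: the skip count was computed in bytes of the UTF-8 encoded file.

def skip_by_bytes(data: bytes, count: int) -> bytes:
    """Skip `count` bytes of raw file contents (reading straight from disk)."""
    return data[count:]

def skip_by_chars(text: str, count: int) -> str:
    """Skip `count` characters of decoded text (reading from a buffer)."""
    return text[count:]

doc = "naïve docs"        # 10 characters, but 11 bytes in UTF-8 ('ï' is 2 bytes)
code = "(defun f ())"     # what the reader should see after the skip
payload = doc + code

count = len(doc.encode("utf-8"))   # the count as a byte length

raw = payload.encode("utf-8")
after_bytes = skip_by_bytes(raw, count).decode("utf-8")
after_chars = skip_by_chars(payload, count)

print(repr(after_bytes))  # lands exactly at the start of the code
print(repr(after_chars))  # overshoots by one character per extra byte
```

With a docstring that contains only single-byte characters, both skips land in the same place, which fits the observation in the report that only codepoints longer than one byte cause trouble.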
This bug report was last modified 3 years and 76 days ago.