GNU bug report logs - #12598
24.2; utf-8 codepoints in doc-strings and compression of .el and .elc files


Package: emacs

Reported by: Achim Gratz <Stromeko <at> nexgo.de>

Date: Sun, 7 Oct 2012 17:46:01 UTC

Severity: normal

Tags: moreinfo

Found in version 24.2

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


Message #23 received at 12598 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Achim Gratz <Stromeko <at> nexgo.de>
Cc: 12598 <at> debbugs.gnu.org
Subject: Re: bug#12598: 24.2; utf-8 codepoints in doc-strings and compression of .el and .elc files
Date: Thu, 31 Jan 2013 13:15:20 -0500

> I've just removed some utf-8 codepoints from docstrings in org-mode
> because when I compress either the source (.el.gz) or the resulting
> byte-compiled file (.elc.gz), the loader fails after the first function

I can't reproduce this problem for the .el.gz case (indeed, I think
it's specific to byte-compiled files).

> So, any codepoint that is more than a single byte will throw the
> byte-compiler off, not just any utf-8 codepoint.  Since this has been in
> Emacs likely ever since unicode strings have been introduced, I'd
> suggest adding a *strong* warning in some prominent place in the
> documentation about this even when it gets fixed in a newer version of
> Emacs. Otherwise it's all too easy to produce libraries that have
> mysterious failures depending on whatever Emacs was used to compile or
> run them.

I think the problem lies between load-with-code-conversion and
eval-buffer, so it dates back to the introduction of
load-with-code-conversion, which IIRC predates the internal use
of Unicode.

Fixing `eval-buffer' so that it skips bytes when it sees #@NN is tricky,
so the best fix is probably to change load-with-code-conversion so that
(if the file is byte-compiled) it saves the buffer to a temp file and
passes that to `load'.
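
The byte/character mismatch behind this can be sketched as follows. This is a minimal illustration in Python rather than Emacs Lisp, so it can be run standalone. It assumes a simplified record layout (`#@N`, then N bytes of text to skip, then the next form), which is not the exact .elc layout, and the helper names `make_record`, `skip_by_bytes`, and `skip_by_chars` are hypothetical, not Emacs functions. It only shows why a count written in bytes skips too far once the text has been decoded into characters.

```python
import re


def make_record(doc: str, form: str) -> bytes:
    """Build a simplified '#@N<text>' record followed by a Lisp form.

    Assumption: N counts the UTF-8 *bytes* of the skipped text.
    """
    payload = (" " + doc).encode("utf-8")
    return b"#@" + str(len(payload)).encode("ascii") + payload + form.encode("utf-8")


def skip_by_bytes(raw: bytes) -> str:
    """Reader working on the raw file contents: skip N bytes after '#@N'."""
    m = re.match(rb"#@(\d+)", raw)
    n = int(m.group(1))
    return raw[m.end() + n:].decode("utf-8")


def skip_by_chars(raw: bytes) -> str:
    """Reader working on an already-decoded buffer: skip N characters."""
    text = raw.decode("utf-8")
    m = re.match(r"#@(\d+)", text)
    n = int(m.group(1))
    return text[m.end() + n:]


ascii_rec = make_record("plain doc", "(defun f ())")
utf8_rec = make_record("naïve doc", "(defun f ())")  # 'ï' is 2 bytes in UTF-8

print(skip_by_bytes(ascii_rec))  # (defun f ())
print(skip_by_chars(ascii_rec))  # (defun f ())
print(skip_by_bytes(utf8_rec))   # (defun f ())
print(skip_by_chars(utf8_rec))   # defun f ())
```

With an ASCII-only doc-string both readers agree. With a single two-byte character, the character-based reader skips one character too many and swallows the opening parenthesis of the following form, which is consistent with the loader failing right after the first function.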


        Stefan

This bug report was last modified 3 years and 76 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
