GNU bug report logs - #74922
29.4; copy_string_contents doesn't always produce a valid utf-8


Package: emacs

Reported by: Evgeny Kurnevsky <kurnevsky <at> gmail.com>

Date: Tue, 17 Dec 2024 06:09:01 UTC

Severity: normal

Found in version 29.4

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #11 received at 74922 <at> debbugs.gnu.org (full text, mbox):

From: Evgeny Kurnevsky <kurnevsky <at> gmail.com>
To: 74922 <at> debbugs.gnu.org
Subject: Fwd: bug#74922: 29.4; copy_string_contents doesn't always produce a
 valid utf-8
Date: Tue, 17 Dec 2024 13:31:57 +0000
Yes, that's a binary file, not a UTF-8 string. From the comment in the
module_copy_string_contents implementation I guessed that in such cases
Emacs should signal an error, but instead it just passes this invalid
string to the dynamic library, which caused this bug in emacs-module-rs (see
https://ubolonton.github.io/emacs-module-rs/latest/type-conversions.html#strings
). So if this is expected, maybe it should be stated explicitly in the docs
of copy_string_contents here
https://www.gnu.org/software/emacs/manual/html_node/elisp/Module-Values.html
? They just say that it stores the UTF-8 encoded text, which gives the
impression that the result is always a valid UTF-8 string.

On Tue, Dec 17, 2024 at 1:18 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: Evgeny Kurnevsky <kurnevsky <at> gmail.com>
> > Date: Tue, 17 Dec 2024 06:08:30 +0000
> >
> > According to the docs and comment inside module_copy_string_contents it
> > should always produce a valid utf-8 string that can be used in dynamic
> > modules, but it seems it's not always the case. I encountered an emacs
> > crash when using emacs-module-rs because it always expects a valid
> > utf-8 for strings. To reproduce you can call:
> >
> > (some-function-from-dynamic-library
> >  (encode-coding-string (f-read-text "wg-private-pc.age") 'utf-8 t))
> >
> > The file is
> > https://github.com/kurnevsky/nixfiles/raw/0b3de016dac551398627a55788b80d4809afcbf9/secrets/wg-private-pc.age
>
> This string includes raw bytes, it isn't a text string, as far as I
> could see.  It definitely isn't UTF-8 encoded text.  What did you
> expect to happen with it when you copy such a string from Emacs?
>
> > See https://github.com/ubolonton/emacs-module-rs/issues/58 for
> > additional details.
>
> Can't say there are too many details there...
>
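
The situation discussed in this thread, where a module receives bytes from
copy_string_contents that are not well-formed UTF-8, can be guarded against
on the module side. The following is only a minimal sketch of such a check,
not anything taken from Emacs or emacs-module-rs: the helper name
is_valid_utf8 is hypothetical, and it assumes the module wants strict
RFC 3629 validation (no overlong forms, no surrogates, nothing above
U+10FFFF).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Emacs module API).
   Returns true if buf[0..len) is well-formed UTF-8 per RFC 3629. */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t need;                        /* continuation bytes expected */
        unsigned char lo = 0x80, hi = 0xBF; /* allowed range of the first one */

        if (b <= 0x7F) {                    /* ASCII */
            i++;
            continue;
        } else if (b >= 0xC2 && b <= 0xDF) {
            need = 1;
        } else if (b >= 0xE0 && b <= 0xEF) {
            need = 2;
            if (b == 0xE0) lo = 0xA0;       /* reject overlong 3-byte forms */
            else if (b == 0xED) hi = 0x9F;  /* reject surrogates */
        } else if (b >= 0xF0 && b <= 0xF4) {
            need = 3;
            if (b == 0xF0) lo = 0x90;       /* reject overlong 4-byte forms */
            else if (b == 0xF4) hi = 0x8F;  /* reject values above U+10FFFF */
        } else {
            return false;  /* 0x80..0xC1 and 0xF5..0xFF cannot start a sequence */
        }

        if (len - i - 1 < need)
            return false;                   /* sequence truncated at end of buffer */
        if (buf[i + 1] < lo || buf[i + 1] > hi)
            return false;
        for (size_t k = 2; k <= need; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;               /* not a continuation byte */

        i += need + 1;
    }
    return true;
}
```

A module could call this on the buffer filled by copy_string_contents
(excluding the trailing null byte) and signal an error or fall back to
treating the data as raw bytes when it returns false, rather than assuming
the contents are valid UTF-8.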

This bug report was last modified 136 days ago.


