GNU bug report logs - #74922
29.4; copy_string_contents doesn't always produce a valid utf-8


Package: emacs;

Reported by: Evgeny Kurnevsky <kurnevsky <at> gmail.com>

Date: Tue, 17 Dec 2024 06:09:01 UTC

Severity: normal

Found in version 29.4

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


From: Evgeny Kurnevsky <kurnevsky <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 74922 <at> debbugs.gnu.org
Subject: bug#74922: Fwd: bug#74922: 29.4; copy_string_contents doesn't always produce a valid utf-8
Date: Tue, 17 Dec 2024 14:46:28 +0000
The module can certainly do that check, but I guess emacs-module-rs doesn't
do it by default for performance reasons: validating every string can be
quite costly in some cases, and it wasn't clear whether Emacs could ever
pass an invalid string. So currently this case triggers undefined behavior
in the module, which crashes Emacs.
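
For illustration only, here is a minimal sketch of the trade-off described
above. It is not taken from emacs-module-rs: the helper names are
hypothetical, and it uses only the Rust standard library. It assumes the
module already holds the bytes written by copy_string_contents, including
the terminating null byte, and shows a checked and a lossy conversion in
place of an unchecked one.

```rust
// Hypothetical helpers, not part of emacs-module-rs or the Emacs module API.

/// Drop the terminating null byte that copy_string_contents writes.
fn strip_nul(mut buf: Vec<u8>) -> Vec<u8> {
    if buf.last() == Some(&0) {
        buf.pop();
    }
    buf
}

/// Checked conversion: costs one validation pass over the bytes, but
/// returns an error instead of invoking undefined behavior on bad input.
fn copied_bytes_to_string(buf: Vec<u8>) -> Result<String, std::string::FromUtf8Error> {
    String::from_utf8(strip_nul(buf))
}

/// Lossy conversion: invalid sequences become U+FFFD, so this never fails.
fn copied_bytes_to_string_lossy(buf: Vec<u8>) -> String {
    String::from_utf8_lossy(&strip_nul(buf)).into_owned()
}
```

With the checked variant, the contents of a binary file surface as an
ordinary Err that the module can turn into a Lisp error, at the price of
scanning each string once.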

On Tue, Dec 17, 2024 at 2:24 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: Evgeny Kurnevsky <kurnevsky <at> gmail.com>
> > Date: Tue, 17 Dec 2024 13:31:57 +0000
> >
> > Yes, that's a binary file that is not a UTF-8 string. From the comment
> > in the module_copy_string_contents implementation I guessed that in
> > such cases Emacs should signal an error, but instead it just passes
> > this invalid string to the dynamic library, which caused this bug in
> > emacs-module-rs (see
> > https://ubolonton.github.io/emacs-module-rs/latest/type-conversions.html#strings
> > ). So if this is expected, then maybe the docs of copy_string_contents at
> > https://www.gnu.org/software/emacs/manual/html_node/elisp/Module-Values.html
> > should say so explicitly? They just say that it stores the UTF-8
> > encoded text, which gives the impression that the result is always a
> > valid UTF-8 string.
>
> I could look into the internals, but I actually wonder why the module
> doesn't check the text before relying on such subtle behaviors.  We
> didn't document the fact that it signals an error for a reason.
>
> So: why cannot the module code or the application which uses it test
> up front that the string it copies is human-readable text, not some
> binary junk?
>


-- 
Best regards, Evgeny Kurnevsky.

This bug report was last modified 136 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.