GNU bug report logs - #27270
display-raw-bytes-as-hex generates ambiguous output for Emacs strings

Package: emacs;

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Wed, 7 Jun 2017 03:59:01 UTC

Severity: wishlist

Tags: moreinfo

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Message #100 received at 27270 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, v.schneidermann <at> gmail.com,
 27270 <at> debbugs.gnu.org, npostavs <at> users.sourceforge.net
Subject: Re: bug#27270: display-raw-bytes-as-hex generates ambiguous output
 for Emacs strings
Date: Sun, 24 Apr 2022 15:35:53 -0700

On 4/24/22 04:24, Lars Ingebrigtsen wrote:

> The likelihood of anybody actually encountering this issue is ... small.

Sure, if strings are random. But strings from opponents aren't random.

I'll readily grant that it's a much smaller exposure than SQL injection. 
Still, like SQL injection it's an exposure and should be fixed.


> You want to quote all %c as if they were raw bytes?  Or only following a
> raw byte?

Closer to the latter, but even less than the latter. I am being 
conservative and am proposing that Emacs do what it does now unless the 
resulting output would be misinterpreted on input. So I wouldn't change 
how all characters are quoted; only how characters are quoted when the 
result would be interpreted incorrectly.
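
A minimal sketch of the rule described above, in Emacs Lisp. The function name my-quote-unibyte is hypothetical and not part of Emacs. It handles only unibyte strings and escapes every non-printable byte as \xNN, which is a simplification for illustration and not a description of the real printer. What it shows is the conditional step: the "\ " terminator is emitted only when the next character is a hex digit that would otherwise be absorbed into the preceding escape.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-quote-unibyte (s)
  "Return a double-quoted form of unibyte string S that reads back as S."
  (let ((out "\"")
        (prev-hex nil))          ; non-nil when the last item emitted was \xNN
    (dolist (c (append s nil))   ; iterate over the bytes of S
      (cond
       ;; Control characters, DEL, and raw bytes: emit a hex escape.
       ((or (< c 32) (> c 126))
        (setq out (concat out (format "\\x%02x" c))
              prev-hex t))
       (t
        ;; Terminate the escape only if C would otherwise extend it.
        (when (and prev-hex (string-match-p "[0-9a-fA-F]" (string c)))
          (setq out (concat out "\\ ")))
        ;; Quote the characters that are special inside a string.
        (when (memq c '(?\" ?\\))
          (setq out (concat out "\\")))
        (setq out (concat out (string c))
              prev-hex nil))))
    (concat out "\"")))

(princ (my-quote-unibyte "\x9e\ f"))   ; prints "\x9e\ f"
(princ (my-quote-unibyte "\x9eg"))     ; prints "\x9eg" (no terminator needed)
```

Under these assumptions, output changes only in the ambiguous case; a byte followed by a non-hex character is printed with no extra terminator.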


> what about (format "%cf" #x9e)

Since that returns a multibyte string, I suggest "\u009ef" which is 
multibyte. For its unibyte counterpart (encode-coding-string (format 
"%cf" #x9e) 'iso-latin-1) I suggest the syntax "\x9e\ f" which is 
unibyte. (These are not the only possibilities; for example, the former 
could be "\u009e\ f" if you think that's clearer.)

This string syntax is already supported by Emacs, so this wouldn't 
change the Lisp reader.
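
To make the ambiguity concrete, the following sketch compares how the existing reader treats the three spellings discussed here. It uses only standard functions (length, aref, multibyte-string-p); the variable names are illustrative.

```elisp
(let ((ambiguous "\x9ef")     ; the hex escape absorbs the trailing f
      (unibyte   "\x9e\ f")   ; "\ " ends the escape; f is a separate character
      (multibyte "\u009ef"))  ; \u takes exactly four hex digits
  (list (length ambiguous) (aref ambiguous 0)
        (length unibyte) (multibyte-string-p unibyte)
        (length multibyte) (multibyte-string-p multibyte)))
;; => (1 2543 2 nil 2 t)
```

The first string is a single character, #x9ef (decimal 2543), which is why printing a raw byte followed by "f" without a terminator does not read back as the original two-element string.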


> it creates
> very confusing displayed strings.

These examples are not *that* confusing. And although they may not be 
beautiful, correct strings are less confusing than incorrect strings.




This bug report was last modified 3 years and 109 days ago.

