GNU bug report logs - #27270
display-raw-bytes-as-hex generates ambiguous output for Emacs strings
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Wed, 7 Jun 2017 03:59:01 UTC
Severity: wishlist
Tags: moreinfo
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #100 received at 27270 <at> debbugs.gnu.org (full text, mbox):
On 4/24/22 04:24, Lars Ingebrigtsen wrote:
> The likelihood of anybody actually encountering this issue is ... small.
Sure, if strings are random. But strings from opponents aren't random.
I'll readily grant that it's a much smaller exposure than SQL injection.
Still, like SQL injection it's an exposure and should be fixed.
> You want to quote all %c as if they were raw bytes? Or only following a
> raw byte?
Closer to the latter, but even less than the latter. I am being
conservative and am proposing that Emacs do what it does now unless the
resulting output would be misinterpreted on input. So I wouldn't change
how all characters are quoted; only how characters are quoted when the
result would be interpreted incorrectly.
> what about (format "%cf" #x9e)
Since that returns a multibyte string, I suggest "\u009ef", which is
multibyte. For its unibyte counterpart,
(encode-coding-string (format "%cf" #x9e) 'iso-latin-1),
I suggest the syntax "\x9e\ f", which is unibyte. (These are not the only
possibilities; for example, the former could be "\u009e\ f" if you think
that's clearer.)
This string syntax is already supported by Emacs, so this wouldn't
change the Lisp reader.
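
As a rough illustration of the reader behavior relied on above, the following sketch can be evaluated in a scratch buffer. It shows only how the existing Lisp reader parses these literals; it says nothing about what the printer emits, and the comments are added here for explanation rather than taken from the thread.

```elisp
;; A hex escape consumes every hex digit that follows it, so the
;; trailing "f" is absorbed and the literal is ONE character, U+09EF.
(length "\x9ef")                  ; => 1
(= (aref "\x9ef" 0) #x9ef)        ; => t

;; Backslash-space contributes no character but terminates the
;; preceding hex escape, so this is the byte #x9e followed by ?f.
(length "\x9e\ f")                ; => 2
(aref "\x9e\ f" 0)                ; => 158 (#x9e)
(aref "\x9e\ f" 1)                ; => 102 (?f)
(multibyte-string-p "\x9e\ f")    ; => nil (unibyte)

;; \u takes exactly four hex digits, so the "f" cannot be absorbed;
;; the result is U+009E followed by ?f, and the string is multibyte.
(length "\u009ef")                ; => 2
(multibyte-string-p "\u009ef")    ; => t
```

The first form is the ambiguous case: printing a raw byte #x9e followed by "f" as "\x9ef" yields text that reads back as a different, one-character string.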
> it creates
> very confusing displayed strings.
These examples are not *that* confusing. And although they may not be
beautiful, correct strings are less confusing than incorrect strings.
This bug report was last modified 3 years and 109 days ago.