#27270 - display-raw-bytes-as-hex generates ambiguous output for Emacs strings

GNU bug report logs - #27270
display-raw-bytes-as-hex generates ambiguous output for Emacs strings

Package: emacs;

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Wed, 7 Jun 2017 03:59:01 UTC

Severity: wishlist

Tags: moreinfo

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Message #63 received at 27270 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: v.schneidermann <at> gmail.com, 27270 <at> debbugs.gnu.org, npostavs <at> users.sourceforge.net
Subject: Re: bug#27270: display-raw-bytes-as-hex generates ambiguous output for Emacs strings
Date: Sun, 11 Jun 2017 17:48:04 +0300

> Cc: npostavs <at> users.sourceforge.net, 27270 <at> debbugs.gnu.org,
>  v.schneidermann <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sat, 10 Jun 2017 17:04:40 -0700
>
> On 06/10/2017 12:24 AM, Eli Zaretskii wrote:
> > So your proposal would mean a change to the Lisp reader to support
> > such escapes, right?  If so, isn't such a change
> > backward-incompatible?
>
> Yes, but only in the sense that undocumented escapes evaluate to
> themselves, e.g., "\F" is currently the same as "F" in Emacs Lisp
> because there is no escape sequence \F currently defined for character
> constants. But there's nothing new here, e.g., when we added "\N{...}"
> last year we changed the interpretation of the formerly-undocumented \N
> escape.

Then maybe the new hex display should use the \N{U+nnn} format?

> >> Also, display-raw-bytes-as-hex would cause raw bytes to be displayed with this
> >> new X escape, rather than with the x escape.
> > It could only do that for codepoints below 256 decimal, so that
> > limitation should be taken into account when deciding on the proposal.
>
> Ouch, I hadn't thought of that.
>
> Wait -- doesn't that mean that "display-raw-bytes-as-hex" is a
> misleading name, because it affects the display not only of raw bytes,
> but of other undisplayable characters?

That's true, but since the chances of a _user_ changing the
printable-chars char-table are pretty slim, I didn't think it was
justified to obfuscate the name.

> Shouldn't we change its name to
> something more generic and more accurate, like "display-characters-as-hex"?

Codepoints whose printable-chars entry is nil cannot in good faith be
called "characters", IMO.  "Codepoints", maybe?  But again, that makes
the discoverability harder, so I'm not sure it's worth the hassle.

> Anyway, to address the point you raised: how about a different idea? We
> extend the existing \x syntax in strings so that \x{dddd} has the same
> meaning as "\xdddd", except that the "}" terminates the escape. This
> syntax is used by Perl and so is in the same family as \N{...}. We also
> change display-raw-bytes-as-hex to use this new syntax when a character
> is immediately followed by a hexadecimal digit. That way, most
> characters are displayed as before, but my problematic example is
> displayed as "x\x{90}5y", which is a good visual cue of the unusual
> situation.

See above: why not \N{U+...}?  The only downside is that it's much
longer than \xNN.  Could be another option, perhaps.
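
The ambiguity under discussion, and the brace-delimited form proposed in the quoted text, can be illustrated with a small model. This is only a sketch in Python, not Emacs code: the function name `render`, its `braces_when_ambiguous` flag, and the representation of a string as a list of single characters and raw-byte integers are all assumptions made for illustration. It reproduces just the rule described in the message, namely that the braced form is used only when the escaped byte is immediately followed by a hexadecimal digit.

```python
import string

# Characters that a reader would absorb into a preceding \x escape.
HEX_DIGITS = set(string.hexdigits)


def render(items, braces_when_ambiguous=False):
    """Render a sequence of items as escaped text.

    Each item is either a one-character str (shown as-is) or an int
    standing for a raw byte (shown as a hex escape). This is a
    hypothetical model of the display rule, not Emacs's implementation.
    """
    out = []
    for i, item in enumerate(items):
        if isinstance(item, str):
            out.append(item)
            continue
        nxt = items[i + 1] if i + 1 < len(items) else None
        # Ambiguous when the next displayed character is itself a hex digit.
        ambiguous = isinstance(nxt, str) and nxt in HEX_DIGITS
        if braces_when_ambiguous and ambiguous:
            out.append("\\x{%x}" % item)
        else:
            out.append("\\x%x" % item)
    return "".join(out)


example = ["x", 0x90, "5", "y"]
print(render(example))                              # x\x905y
print(render(example, braces_when_ambiguous=True))  # x\x{90}5y
```

In the first output the raw byte 0x90 followed by "5" cannot be told apart from a single escape \x905; in the second, the braces mark where the escape ends, while a byte not followed by a hex digit is still rendered as before.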

This bug report was last modified 3 years and 109 days ago.

