GNU bug report logs -
#27270
display-raw-bytes-as-hex generates ambiguous output for Emacs strings
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Wed, 7 Jun 2017 03:59:01 UTC
Severity: wishlist
Tags: moreinfo
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #63 received at 27270 <at> debbugs.gnu.org (full text, mbox):
> Cc: npostavs <at> users.sourceforge.net, 27270 <at> debbugs.gnu.org,
> v.schneidermann <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sat, 10 Jun 2017 17:04:40 -0700
>
> On 06/10/2017 12:24 AM, Eli Zaretskii wrote:
> > So your proposal would mean a change to the Lisp reader to support
> > such escapes, right? If so, isn't such a change
> > backward-incompatible?
>
> Yes, but only in the sense that undocumented escapes evaluate to
> themselves, e.g., "\F" is currently the same as "F" in Emacs Lisp
> because there is no escape sequence \F currently defined for character
> constants. But there's nothing new here, e.g., when we added "\N{...}"
> last year we changed the interpretation of the formerly-undocumented \N
> escape.
Then maybe the new hex display should use the \N{U+nnn} format?
> >> Also, display-raw-bytes-as-hex would cause raw bytes to be displayed with this
> >> new X escape, rather than with the x escape.
> > It could only do that for codepoints below 256 decimal, so that
> > limitation should be taken into account when deciding on the proposal.
>
> Ouch, I hadn't thought of that.
>
> Wait -- doesn't that mean that "display-raw-bytes-as-hex" is a
> misleading name, because it affects the display not only of raw bytes,
> but of other undisplayable characters?
That's true, but since the chances of a _user_ changing the
printable-chars char-table are pretty slim, I didn't think it was
justified to obfuscate the name.
> Shouldn't we change its name to
> something more generic and more accurate, like "display-characters-as-hex"?
Codepoints whose printable-chars entry is nil cannot in good faith be
called "characters", IMO. "Codepoints", maybe? But again, that makes
discoverability harder, so I'm not sure it's worth the hassle.
> Anyway, to address the point you raised: how about a different idea? We
> extend the existing \x syntax in strings so that \x{dddd} has the same
> meaning as "\xdddd", except that the "}" terminates the escape. This
> syntax is used by Perl and so is in the same family as \N{...}. We also
> change display-raw-bytes-as-hex to use this new syntax when a character
> is immediately followed by a hexadecimal digit. That way, most
> characters are displayed as before, but my problematic example is
> displayed as "x\x{90}5y", which is a good visual cue of the unusual
> situation.
See above: why not \N{U+...}? The only downside is that it's much
longer than \xNN. Could be another option, perhaps.
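
For readers following the thread, the ambiguity under discussion can be modelled in a few lines. This is only a sketch in Python, not Emacs code: `is_plain`, `render`, and `read` are hypothetical helpers, "printable" is assumed to mean ASCII 0x20-0x7E minus the backslash, and the `\x{...}` form is the syntax proposed above, not an existing feature. The reader mimics the way a `\x` escape in an Emacs Lisp string consumes every hex digit that follows it.

```python
import re

HEX_DIGITS = "0123456789abcdefABCDEF"

# \x followed either by a braced hex number (the proposed form)
# or by a greedy run of hex digits (the existing form).
ESCAPE = re.compile(r"\\x(?:\{([0-9a-fA-F]+)\}|([0-9a-fA-F]+))")


def is_plain(code):
    """Assumption: only ASCII 0x20-0x7E is shown as-is; backslash is escaped."""
    return 0x20 <= code <= 0x7E and code != 0x5C


def render(codes, braces=False):
    """Render a list of code points, escaping the non-plain ones as hex.

    With braces=True, an escape that is immediately followed by a plain
    hex digit is written as \\x{NN} so the digit cannot be absorbed.
    """
    out = []
    for i, code in enumerate(codes):
        if is_plain(code):
            out.append(chr(code))
            continue
        nxt = codes[i + 1] if i + 1 < len(codes) else None
        followed_by_hex = (
            nxt is not None and is_plain(nxt) and chr(nxt) in HEX_DIGITS
        )
        if braces and followed_by_hex:
            out.append("\\x{%x}" % code)
        else:
            out.append("\\x%x" % code)
    return "".join(out)


def read(text):
    """Read the rendered text back, taking hex digits greedily after \\x."""
    codes = []
    pos = 0
    while pos < len(text):
        m = ESCAPE.match(text, pos)
        if m:
            codes.append(int(m.group(1) or m.group(2), 16))
            pos = m.end()
        else:
            codes.append(ord(text[pos]))
            pos += 1
    return codes


# The problematic example: "x", raw byte 0x90, "5", "y".
original = [ord("x"), 0x90, ord("5"), ord("y")]
print(render(original))               # unbraced form
print(read(render(original)))         # reads back as three code points
print(render(original, braces=True))  # braced form
print(read(render(original, braces=True)) == original)
```

In this model the unbraced rendering reads back as three code points, with 0x905 in the middle, while the braced rendering round-trips to the original four.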
This bug report was last modified 3 years and 109 days ago.