GNU bug report logs - #17133
json-encode-string incorrectly encodes extra-BMP characters

Package: emacs

Reported by: Nathan Trapuzzano <nbtrap <at> nbtrap.com>

Date: Fri, 28 Mar 2014 22:24:01 UTC

Severity: normal

Done: Simen Heggestøyl <simenheg <at> gmail.com>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#17133: closed (json-encode-string incorrectly encodes
 extra-BMP characters)
Date: Sun, 04 Oct 2015 15:56:01 +0000
[Message part 1 (text/plain, inline)]
Your message dated Sun, 04 Oct 2015 17:55:22 +0200
with message-id <87egha7hn9.fsf <at> gmail.com>
and subject line Re: bug#17133: json-encode-string incorrectly encodes extra-BMP characters
has caused the debbugs.gnu.org bug report #17133,
regarding json-encode-string incorrectly encodes extra-BMP characters
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
17133: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=17133
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Nathan Trapuzzano <nbtrap <at> nbtrap.com>
To: bug-gnu-emacs <at> gnu.org
Subject: json-encode-string incorrectly encodes extra-BMP characters
Date: Fri, 28 Mar 2014 18:22:25 -0400

M-: (princ (json-encode "\U0001d11e"))
==> "\u1d11e"  ;; should be "\ud834\udd1e" or "𝄞"

From ECMA-404:

  To escape a code point that is not in the Basic Multilingual Plane,
  the character is represented as a twelve-character sequence, encoding
  the UTF-16 surrogate pair. So for example, a string containing only
  the G clef character (U+1D11E) may be represented as "\uD834\uDD1E".
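
The arithmetic behind the quoted passage can be sketched as follows. This is only an illustration of the surrogate-pair calculation, not the code in json.el; the helper name `my-json-escape-char` is hypothetical and introduced here purely for the example.

```elisp
;; Hypothetical helper (NOT part of json.el): return the JSON \u escape
;; for CHAR, using a UTF-16 surrogate pair for code points above U+FFFF.
(defun my-json-escape-char (char)
  "Return the JSON \\u escape sequence for CHAR as a string."
  (if (<= char #xFFFF)
      ;; Inside the BMP a single four-digit escape is enough.
      (format "\\u%04x" char)
    ;; Outside the BMP: subtract #x10000, then split the remaining
    ;; 20 bits into a high and a low 10-bit half.
    (let* ((offset (- char #x10000))
           (high (+ #xD800 (ash offset -10)))
           (low (+ #xDC00 (logand offset #x3FF))))
      (format "\\u%04x\\u%04x" high low))))

;; G clef, U+1D11E: offset is #xD11E, so high = #xD834 and low = #xDD1E.
(princ (my-json-escape-char #x1D11E))
```

For U+1D11E this prints the twelve-character sequence \ud834\udd1e, the lowercase form of the ECMA-404 example, whereas formatting the code point directly with a single \u escape gives the five-digit \u1d11e shown above.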


[Message part 3 (message/rfc822, inline)]
From: Simen Heggestøyl <simenheg <at> gmail.com>
To: Nathan Trapuzzano <nbtrap <at> nbtrap.com>
Cc: 17133-done <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#17133: json-encode-string incorrectly encodes extra-BMP
 characters
Date: Sun, 04 Oct 2015 17:55:22 +0200

Nathan Trapuzzano <nbtrap <at> nbtrap.com> writes:
> M-: (princ (json-encode "\U0001d11e"))
> ==> "\u1d11e"  ;; should be "\ud834\udd1e" or "𝄞"
>
> From ECMA-404:
>
>   To escape a code point that is not in the Basic Multilingual Plane,
>   the character is represented as a twelve-character sequence, encoding
>   the UTF-16 surrogate pair. So for example, a string containing only
>   the G clef character (U+1D11E) may be represented as "\uD834\uDD1E".

This seems to be working as expected in master now; (json-encode
"\U0001d11e") produces "𝄞" as described.

-- Simen


This bug report was last modified 9 years and 291 days ago.
