GNU bug report logs - #55777
[PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'

Package: emacs;

Reported by: Richard Hansen <rhansen <at> rhansen.org>

Date: Fri, 3 Jun 2022 06:21:02 UTC

Severity: minor

Tags: patch

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

From: Richard Hansen <rhansen <at> rhansen.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 55777 <at> debbugs.gnu.org
Subject: bug#55777: [PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'
Date: Fri, 3 Jun 2022 23:28:51 -0400

On 6/3/22 03:02, Eli Zaretskii wrote:
> Thanks, but please explain the motivation for these changes.

The motivation is in the commit message, which I revised in the
attached patch to hopefully make it more clear.

> In particular, why would we need to describe in a doc string such 
> intimate details of our current implementation?

There is a fair amount of implementation detail right now; the patch
doesn't significantly change that. But I revised the patch to remove
some of the detail.

> If there was some situation where you needed these details for some 
> Lisp program, please describe that situation.

I'm trying to understand some inconsistent behavior I'm observing
while writing code to process binary data, and I found the existing
documentation lacking.

    ;; Unibyte vs. multibyte characters:
    (eq ?\xff ?\x3fffff)                           ; t (ok)
    (eq (aref "\x3fffff" 0) (aref "\xff" 0))       ; t (ok)
    (eq (aref "\x3fffff 😀" 0) (aref "\xff 😀" 0)) ; t (ok)
    (eq (aref "\xff" 0) (aref "\xff 😀" 0))        ; nil (expected t)

    ;; Unibyte vs. multibyte strings:
    (multibyte-string-p "\xff")                    ; nil (ok)
    (multibyte-string-p "\x3fffff")                ; nil (ok???)
    (string= "\xff" (string-to-multibyte "\xff"))  ; nil (expected t)

    ;; Char code vs. Unicode codepoint:
    (string= "😀\xff" "😀\x3fffff")                ; t (ok)
    (string= "😀\N{U+ff}" "😀\xff")                ; nil (ok)
    (string= "😀\N{U+ff}" "😀\x3fffff")            ; nil (ok)
    (string= "😀ÿ" "😀\N{U+ff}")                   ; t (ok)
    (string= "😀ÿ" "😀\xff")                       ; nil (ok)
    (string= "😀ÿ" "😀\x3fffff")                   ; nil (ok)
    (eq ?\N{U+ff} ?\xff)                           ; t (expected nil)
    (eq ?\N{U+ff} ?\x3fffff)                       ; t (expected nil)
    (eq ?ÿ ?\xff)                                  ; t (expected nil)
    (eq ?ÿ ?\x3fffff)                              ; t (expected nil)
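
The unibyte/multibyte mismatch in the examples above can be made explicit
with a short round trip. This is only a sketch, assuming the usual raw-byte
mapping in which byte 0xFF in a unibyte string corresponds to character
code #x3fffff in a multibyte string; the variable names `uni', `multi' and
`back' are illustrative and do not come from the patch.

```elisp
;; Sketch: how one raw byte is represented in unibyte vs. multibyte strings.
(require 'cl-lib)

(let* ((uni "\xff")                      ; unibyte string holding byte 0xFF
       (multi (string-to-multibyte uni)) ; same byte as a raw-byte character
       (back (string-to-unibyte multi))) ; raw-byte character back to a byte
  (cl-assert (not (multibyte-string-p uni)))
  (cl-assert (multibyte-string-p multi))
  ;; In the unibyte string, the element is the byte value itself ...
  (cl-assert (= (aref uni 0) #xff))
  ;; ... while in the multibyte string it is the raw-byte character code.
  (cl-assert (= (aref multi 0) #x3fffff))
  ;; Converting back yields a unibyte string equal to the original.
  (cl-assert (not (multibyte-string-p back)))
  (cl-assert (string= uni back)))
```

Comparing `aref' results across the two representations therefore compares
#xff with #x3fffff, which is consistent with the `nil' results reported above.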
[0001-Improve-documentation-of-string-to-multibyte-string-.patch (text/x-patch, attachment)]
[OpenPGP_signature (application/pgp-signature, attachment)]

This bug report was last modified 2 years and 363 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.