#55777 - [PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'

GNU bug report logs - #55777
[PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'

Package: emacs;

Reported by: Richard Hansen <rhansen <at> rhansen.org>

Date: Fri, 3 Jun 2022 06:21:02 UTC

Severity: minor

Tags: patch

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

Message #23 received at 55777 <at> debbugs.gnu.org (full text, mbox):

From: Richard Hansen <rhansen <at> rhansen.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 55777 <at> debbugs.gnu.org
Subject: Re: bug#55777: [PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'
Date: Sun, 5 Jun 2022 22:00:35 -0400

On 6/5/22 01:37, Eli Zaretskii wrote:
> Could you please state what is confusing in the current wording?

* "Raw 8-bit bytes" isn't really defined. It's mentioned earlier in the chapter -- the term is even in a @dfn{} -- but there's no definition there.
* The term "raw 8-bit bytes" is misleading. It suggests binary data (bytes with values 0-255) but it's actually meant to only cover 128-255.
* The term "raw 8-bit bytes" is not used consistently. Sometimes "8" is spelled out as "eight", sometimes "raw" comes after "8-bit", and sometimes it refers to all byte values 0-255 (see the first sentence under `@cindex unibyte text`).
* It's not clear whether "raw 8-bit bytes" is meant to refer to bytes with values 128-255, or to the *characters* that map to those byte values.
* The following phrasing is weird: "The function assumes that @var{string} includes ASCII characters and raw 8-bit bytes". The purpose of "raw 8-bit bytes" is to cover non-ASCII byte values, so by definition that assumption is always true. By saying "the function assumes", the reader is left wondering about the cases where that assumption is not true, which in turn causes the reader to question whether "raw 8-bit bytes" fully covers non-ASCII byte values, which in turn causes the reader to wonder how to handle those non-covered values (whatever they are).

Maybe something like this:

    By definition, unibyte strings contain only @acronym{ASCII} characters (bytes with values 0-127) and raw 8-bit bytes (bytes with values 128-255); the latter are converted to their corresponding multibyte representations in the @code{eight-bit} character set (@pxref{Text Representations, codepoints}).
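
The distinction drawn above between byte values 128-255 and the characters that represent them can be checked with a short Emacs Lisp sketch. It is not part of the proposed patch. The function name `demo-raw-byte-conversion' is hypothetical, invented only for this illustration, and the sketch assumes a reasonably recent Emacs in which raw bytes map to the codepoints #x3FFF80..#x3FFFFF.

```elisp
;; Illustrative sketch only; `demo-raw-byte-conversion' is a made-up name.
(defun demo-raw-byte-conversion ()
  "Return a list showing how byte 192 behaves across representations."
  (let* ((uni (unibyte-string 97 192))       ; unibyte: bytes 97 (`a') and 192
         (multi (string-to-multibyte uni)))  ; same text, multibyte form
    (list (multibyte-string-p uni)           ; nil
          (multibyte-string-p multi)         ; t
          (aref multi 0)                     ; 97: ASCII is unchanged
          (aref multi 1)                     ; #x3FFFC0: eight-bit char for byte 192
          (equal (string-to-unibyte multi) uni) ; t: the conversion round-trips
          ;; U+00E9 is an ordinary character, not a raw byte, so
          ;; `string-to-unibyte' signals an error for it.
          (condition-case nil
              (string-to-unibyte "\u00e9")
            (error 'not-convertible)))))
```

Evaluating `(demo-raw-byte-conversion)' should return `(nil t 97 4194240 t not-convertible)', where 4194240 is #x3FFFC0. The last element shows that a character with code 233 and a raw byte with value 233 are different things, which is the ambiguity the fourth bullet points at.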

This bug report was last modified 2 years and 363 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
