GNU bug report logs - #55777
[PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'
Reported by: Richard Hansen <rhansen <at> rhansen.org>
Date: Fri, 3 Jun 2022 06:21:02 UTC
Severity: minor
Tags: patch
Done: Stefan Kangas <stefankangas <at> gmail.com>
Bug is archived. No further changes may be made.
On 6/5/22 01:37, Eli Zaretskii wrote:
> Could you please state what is confusing in the current wording?
* "Raw 8-bit bytes" isn't really defined. It's mentioned earlier in
the chapter -- the term is even in a @dfn{} -- but there's no
definition there.
* The term "raw 8-bit bytes" is misleading. It suggests binary data
(bytes with values 0-255) but it's actually meant to cover only
128-255.
* The term "raw 8-bit bytes" is not used consistently. Sometimes "8"
is spelled out as "eight", sometimes "raw" comes after "8-bit",
and sometimes the term refers to all byte values 0-255 (see the
first sentence under `@cindex unibyte text`).
* It's not clear whether "raw 8-bit bytes" is meant to refer to
bytes with values 128-255, or to the *characters* that map to
those byte values.
* The following phrasing is weird: "The function assumes that
@var{string} includes ASCII characters and raw 8-bit bytes". The
purpose of "raw 8-bit bytes" is to cover non-ASCII byte values, so
by definition that assumption is always true. The phrase "the
function assumes" leaves the reader wondering about cases where
the assumption does not hold, then questioning whether "raw 8-bit
bytes" fully covers the non-ASCII byte values, and finally
wondering how to handle any values that are not covered (whatever
they are).
Maybe something like this:
By definition, unibyte strings contain only @acronym{ASCII}
characters (bytes with values 0-127) and raw 8-bit bytes
(bytes with values 128-255); the latter are converted to their
corresponding multibyte representations in the
@code{eight-bit} character set (@pxref{Text Representations,
codepoints}).
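
To make the proposed wording concrete, here is a rough Python model of the mapping it describes. This is only a sketch, not Emacs code: the names `to_multibyte`, `to_unibyte`, and `RAW_BYTE_BASE` are made up for illustration, and the offset assumes the range that the Text Representations node gives for raw bytes (#x3FFF80..#x3FFFFF).

```python
from typing import List

# Hypothetical constant for this sketch: raw byte 0x80 is modeled as
# character code 0x3FFF80 and raw byte 0xFF as 0x3FFFFF.
RAW_BYTE_BASE = 0x3FFF00


def to_multibyte(data: bytes) -> List[int]:
    """Model of string-to-multibyte: unibyte bytes to character codes.

    ASCII bytes (0-127) keep their value; bytes 128-255 become the
    corresponding eight-bit (raw byte) character codes.
    """
    return [b if b < 0x80 else RAW_BYTE_BASE + b for b in data]


def to_unibyte(codes: List[int]) -> bytes:
    """Model of string-to-unibyte: character codes back to bytes.

    Only ASCII and eight-bit characters are accepted; anything else
    raises an error, standing in for the error Emacs signals.
    """
    out = bytearray()
    for c in codes:
        if 0 <= c < 0x80:
            out.append(c)
        elif 0x3FFF80 <= c <= 0x3FFFFF:
            out.append(c - RAW_BYTE_BASE)
        else:
            raise ValueError("not ASCII or eight-bit: %#x" % c)
    return bytes(out)
```

In this model the byte 0xE9 and the character with code 0xE9 are different things: `to_multibyte(b"\xe9")` yields the raw-byte character, while `to_unibyte([0xE9])` is rejected. That is the bytes-versus-characters distinction the current wording leaves unclear.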
This bug report was last modified 2 years and 363 days ago.