GNU bug report logs - #70988
(read FUNCTION) uses Latin-1 [PATCH]
Message #38 received at 70988 <at> debbugs.gnu.org:
"Eli Zaretskii" <eliz <at> gnu.org> writes:
>> Date: Wed, 12 Feb 2025 16:42:43 +0000
>> From: Pip Cet <pipcet <at> protonmail.com>
>> Cc: Mattias Engdegård <mattias.engdegard <at> gmail.com>, 70988 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, monnier <at> iro.umontreal.ca
>>
>> The alternative patch would look something like this:
>>
>> From bbc65c9be7ccebf034f4d10f018a076ef1e8a4e9 Mon Sep 17 00:00:00 2001
>> From: Pip Cet <pipcet <at> protonmail.com>
>> Subject: [PATCH] Auto-detect multibyteness of readchar funs (bug#70988)
>>
>> * src/lread.c (readchar): Set *MULTIBYTE if we detect a multibyte
>> character. Return -1 for non-characters rather than crashing.
>> ---
>> src/lread.c | 7 +++++--
>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/lread.c b/src/lread.c
>> index 6af95873bb8..c18c1be3cf5 100644
>> --- a/src/lread.c
>> +++ b/src/lread.c
>> @@ -398,9 +398,12 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
>>
>>    tem = call0 (readcharfun);
>>
>> -  if (NILP (tem))
>> +  if (!CHARACTERP (tem))
>>      return -1;
>> -  return XFIXNUM (tem);
>> +  if (multibyte && !ASCII_CHAR_P (XFIXNAT (tem)))
>> +    *multibyte = true;
>> +
>> +  return XFIXNAT (tem);
>
> AFAIU, the proposed patch was just a bugfix, whereas the above also
> changes behavior in backward-incompatible ways.

The other way around, I think: the first proposed patch changed the
behavior of readchar to always set the multibyte flag when a function
was used, resulting in the creation of symbols whose ASCII names are
multibyte strings. The previous behavior was never to set the multibyte
flag, which was correct for ASCII strings but not multibyte ones.

This patch retains the previous behavior for ASCII symbols, but sets the
multibyte flag for non-ASCII symbols, which seems the best we can do if
we're given a simple function.
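
To make the behavior described above concrete, here is a standalone C model of the patched branch. It is only a sketch and not Emacs code: `struct toy_value`, `toy_characterp`, `toy_readchar`, `toy_array_stream` and `toy_stream_is_multibyte` are hypothetical stand-ins for `Lisp_Object`, `CHARACTERP`, the function-stream branch of `readchar`, and a reader function. The model assumes a character is a non-negative fixnum no larger than 0x3FFFFF and that ASCII means a code point below 0x80.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Lisp value; NOT the real Lisp_Object.  */
enum toy_kind { TOY_NIL, TOY_FIXNUM, TOY_OTHER };
struct toy_value { enum toy_kind kind; long n; };

#define TOY_MAX_CHAR 0x3FFFFF	/* assumed upper bound for a character */

/* Model of CHARACTERP: a fixnum in the range 0..TOY_MAX_CHAR.  */
static bool
toy_characterp (struct toy_value v)
{
  return v.kind == TOY_FIXNUM && v.n >= 0 && v.n <= TOY_MAX_CHAR;
}

/* Hypothetical reader function: returns the next value of a stream.  */
typedef struct toy_value (*toy_readcharfun) (void *state);

/* Model of the patched branch: anything that is not a character ends
   the read with -1, and the first non-ASCII character sets *MULTIBYTE.
   The flag is never cleared, and MULTIBYTE may be NULL.  */
static int
toy_readchar (toy_readcharfun fn, void *state, bool *multibyte)
{
  assert (fn != NULL);
  struct toy_value tem = fn (state);

  if (!toy_characterp (tem))
    return -1;
  if (multibyte && tem.n >= 0x80)
    *multibyte = true;

  return (int) tem.n;
}

/* Example stream: yields the code points of a 0-terminated array,
   then a nil-like value.  STATE points to the array cursor.  */
static struct toy_value
toy_array_stream (void *state)
{
  const long **cursor = state;
  if (**cursor == 0)
    return (struct toy_value) { TOY_NIL, 0 };
  return (struct toy_value) { TOY_FIXNUM, *(*cursor)++ };
}

/* Read CHARS to the end; report whether the multibyte flag was set.  */
static bool
toy_stream_is_multibyte (const long *chars)
{
  bool multibyte = false;
  const long *cursor = chars;
  while (toy_readchar (toy_array_stream, &cursor, &multibyte) != -1)
    continue;
  return multibyte;
}
```

In this model a stream yielding only the code points of "foo" leaves the flag false, as before the patch, while a stream that also yields 0xE9 sets it. A negative or otherwise non-character value ends the read with -1 instead of being passed on as a character.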

If we want to change symbol names to always be multibyte strings, we can
do that, but then we probably want to do that for all streams.

It also fixes yet another XFIXNUM crash, but those (there are more in
lread.c, it seems) should be fixed independently. However, it does give
us the ability to extend the API so readcharfun could return a single
character string, unibyte or multibyte, to be handled appropriately.

Pip
This bug report was last modified 10 days ago.