GNU bug report logs - #70988 (read FUNCTION) uses Latin-1 [PATCH]
> Date: Wed, 12 Feb 2025 20:27:58 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: stefankangas <at> gmail.com, mattias.engdegard <at> gmail.com, 70988 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
>
> "Eli Zaretskii" <eliz <at> gnu.org> writes:
>
> >> --- a/src/lread.c
> >> +++ b/src/lread.c
> >> @@ -398,9 +398,12 @@ readchar (Lisp_Object readcharfun, bool *multibyte)
> >>
> >>    tem = call0 (readcharfun);
> >>
> >> -  if (NILP (tem))
> >> +  if (!CHARACTERP (tem))
> >>      return -1;
> >> -  return XFIXNUM (tem);
> >> +  if (multibyte && !ASCII_CHAR_P (XFIXNAT (tem)))
> >> +    *multibyte = true;
> >> +
> >> +  return XFIXNAT (tem);
> >
> > AFAIU, the proposed patch was just a bugfix, whereas the above also
> > changes behavior in backward-incompatible ways.
>
> The other way around, I think: the first proposed patch changed the
> behavior of readchar to always set the multibyte flag when a function
> was used, resulting in the creation of symbols whose ASCII names are
> multibyte strings. The previous behavior was never to set the multibyte
> flag, which was correct for ASCII strings but not multibyte ones.
>
> This patch retains the previous behavior for ASCII symbols, but sets the
> multibyte flag for non-ASCII symbols, which seems the best we can do if
> we're given a simple function.

I'm talking about the CHARACTERP test (why not FIXNUMP?), and the
addition of the ASCII_CHAR_P test (why would we want an ASCII character
to never be considered multibyte?).

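For reference, the three flag behaviors mentioned in this exchange can
be modeled outside Emacs roughly as below.  This is only a sketch under
stated assumptions: the names flag_policy, model_ascii_char_p and
model_multibyte_flag are hypothetical and do not exist in lread.c, and
the model only tracks whether the flag would end up set after reading a
sequence of character codes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three behaviors discussed in the thread.  */
enum flag_policy
{
  POLICY_NEVER,      /* previous behavior: the flag is never set */
  POLICY_ALWAYS,     /* first proposed patch: the flag is always set */
  POLICY_NON_ASCII   /* second patch: set only for non-ASCII characters */
};

/* Stand-in for ASCII_CHAR_P: true for codes below 0x80.  */
static bool
model_ascii_char_p (int c)
{
  return 0 <= c && c < 0x80;
}

/* Return whether the multibyte flag would be set after reading the N
   character codes in CHARS under POLICY.  */
static bool
model_multibyte_flag (const int *chars, size_t n, enum flag_policy policy)
{
  bool multibyte = false;

  for (size_t i = 0; i < n; i++)
    switch (policy)
      {
      case POLICY_NEVER:
        break;
      case POLICY_ALWAYS:
        multibyte = true;
        break;
      case POLICY_NON_ASCII:
        if (!model_ascii_char_p (chars[i]))
          multibyte = true;
        break;
      }

  return multibyte;
}
```

In this model an all-ASCII name such as "foo" gets the flag only under
POLICY_ALWAYS, while a name containing U+00E9 gets it under both
POLICY_ALWAYS and POLICY_NON_ASCII.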
> If we want to change symbol names to always be multibyte strings, we can
> do that, but then we probably want to do that for all streams.

I don't understand why you are talking about symbols: AFAIU this code
is used in many other cases as well. But even for symbols: why change
the current behavior of making their names multibyte?

> It also fixes yet another XFIXNUM crash, but those (there are more in
> lread.c, it seems) should be fixed independently.

I'm okay with adding a FIXNUMP test (which happens in the debugging
builds anyway, so any violations probably never happen), but using
CHARACTERP changes behavior.

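One way to see where the three candidate tests could differ is the
stand-alone model below.  It is a sketch, not Emacs code: struct
model_value and the accept_* functions are hypothetical, the only value
taken from Emacs is MAX_CHAR (0x3FFFFF), and MODEL_UNCHECKED merely
marks the case where the real code would call XFIXNUM on a non-fixnum.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the Lisp value returned by readcharfun.  */
enum model_kind { MODEL_NIL, MODEL_FIXNUM, MODEL_OTHER };

struct model_value
{
  enum model_kind kind;
  long long n;   /* meaningful only when kind == MODEL_FIXNUM */
};

#define MODEL_MAX_CHAR 0x3FFFFF   /* same value as Emacs's MAX_CHAR */
#define MODEL_EOF (-1LL)          /* readchar's "no more input" result */
#define MODEL_UNCHECKED (-2LL)    /* real code would misuse XFIXNUM here */

/* Stand-in for CHARACTERP: a fixnum in the range 0..MAX_CHAR.  */
static bool
model_characterp (struct model_value v)
{
  return v.kind == MODEL_FIXNUM && 0 <= v.n && v.n <= MODEL_MAX_CHAR;
}

/* Existing test: only nil ends input; everything else is extracted.  */
static long long
accept_if_not_nil (struct model_value v)
{
  if (v.kind == MODEL_NIL)
    return MODEL_EOF;
  return v.kind == MODEL_FIXNUM ? v.n : MODEL_UNCHECKED;
}

/* FIXNUMP variant: any non-fixnum ends input; fixnums pass unchanged.  */
static long long
accept_if_fixnum (struct model_value v)
{
  return v.kind == MODEL_FIXNUM ? v.n : MODEL_EOF;
}

/* CHARACTERP variant: out-of-range fixnums also end input.  */
static long long
accept_if_character (struct model_value v)
{
  return model_characterp (v) ? v.n : MODEL_EOF;
}
```

In this model the FIXNUMP and CHARACTERP variants agree on nil,
non-fixnums, and valid character codes.  They differ only for fixnums
that are negative or above MAX_CHAR, which the FIXNUMP variant returns
as-is and the CHARACTERP variant turns into end of input.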
> However, it does give us the ability to extend the API so
> readcharfun could return a single character string, unibyte or
> multibyte, to be handled appropriately.

This is also a change in behavior.

This bug report was last modified 10 days ago.