GNU bug report logs - #10627
char-ready? is broken for multibyte encodings
Reported by: Mark H Weaver <mhw <at> netris.org>
Date: Sat, 28 Jan 2012 10:24:02 UTC
Severity: normal
Done: Andy Wingo <wingo <at> pobox.com>
Bug is archived. No further changes may be made.
Message #17 received at 10627 <at> debbugs.gnu.org:
Andy Wingo <wingo <at> pobox.com> writes:
> On Sun 24 Feb 2013 21:14, Mark H Weaver <mhw <at> netris.org> writes:
>
>> Maybe I'm missing something, but I don't see any semantic problem here,
>> and it seems straightforward to implement. 'char-ready?' should simply
>> read bytes until either a complete character is available, or no more
>> bytes are ready. In either case, all the bytes should then be 'unget'
>> before returning. What's the problem?
>
> The problem is that char-ready? should not read anything.
Okay, but if all bytes read are later *unread*, and the reads never
block, then why does it matter? The reads in my proposed implementation
are just an internal implementation detail, and it seems to me that the
user cannot tell the difference, as long as he does not peek underneath
the Scheme port abstraction.
If you prefer, perhaps a nicer way to think about it is that
'char-ready?' looks ahead in the putback buffer and/or the read buffer
(refilling it in a non-blocking mode if needed), and returns #t iff a
complete character is present in the buffer(s), or EOF is reached.
However, it seems to me that implementing this in terms of read-byte and
unget-byte is simpler, because it avoids duplication of the logic
regarding putback buffers and refilling of buffers. Maybe there's some
reason why this is a bad idea, but I haven't heard one.
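
To make the read-then-unget idea above concrete, here is a minimal sketch in Python rather than in Guile's own C or Scheme. It is only a toy model under stated assumptions: the port encoding is UTF-8, the byte source never blocks, and a truncated sequence at EOF counts as "ready" because a subsequent read would not block. The names `BytePort`, `read_byte_nonblocking`, `unget_byte`, `utf8_length`, and `EOF` are hypothetical and are not Guile APIs.

```python
from collections import deque

EOF = object()  # hypothetical sentinel for end of stream


class BytePort:
    """Toy port: a putback stack in front of a non-blocking byte source."""

    def __init__(self, data=b"", closed=True):
        self._source = deque(data)   # bytes that have already arrived
        self._putback = []           # most recently ungotten byte is last
        self._closed = closed        # True: an empty source means EOF

    def feed(self, data):
        """Model more bytes arriving from the writer."""
        self._source.extend(data)

    def read_byte_nonblocking(self):
        """Return a byte (int), EOF, or None if nothing is ready yet."""
        if self._putback:
            return self._putback.pop()
        if self._source:
            return self._source.popleft()
        return EOF if self._closed else None

    def unget_byte(self, b):
        self._putback.append(b)


def utf8_length(lead):
    """Expected sequence length for a UTF-8 lead byte (1 if invalid)."""
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 1  # ASCII, or an invalid lead byte treated as a 1-byte error


def char_ready(port):
    """True iff a complete character (or EOF) is available without blocking."""
    taken = []
    try:
        first = port.read_byte_nonblocking()
        if first is None:
            return False
        if first is EOF:
            return True              # EOF counts as ready
        taken.append(first)
        # Continuation bytes are counted, not validated, in this sketch.
        while len(taken) < utf8_length(first):
            b = port.read_byte_nonblocking()
            if b is None:
                return False         # partial character, more bytes pending
            if b is EOF:
                return True          # assumption: truncated input will not block
            taken.append(b)
        return True
    finally:
        # Unget everything in reverse so the port state is unchanged.
        for b in reversed(taken):
            port.unget_byte(b)
```

In this model, a port holding only the first byte of a two-byte character reports not ready, and reports ready once the second byte has been fed. In both cases the bytes consumed during the check are pushed back, so a later read sees them in their original order.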
I agree that 'char-ready?' is an antiquated interface, but it is
nonetheless part of the R5RS (and Guile since approximately forever),
and it is the only way to do a non-blocking read in portable R5RS. It
seems to me that we ought to try to implement it as well as we can, no?
> If you want to peek, use peek-char.
Okay, but that's a totally different tool with a different use case.
It cannot be used to do non-blocking reads.
> Note that if the stream is at EOF, char-ready? should return #t.
Agreed.
More thoughts?
Thanks,
Mark
This bug report was last modified 8 years and 340 days ago.