GNU bug report logs - #10627
char-ready? is broken for multibyte encodings

Package: guile

Reported by: Mark H Weaver <mhw <at> netris.org>

Date: Sat, 28 Jan 2012 10:24:02 UTC

Severity: normal

Done: Andy Wingo <wingo <at> pobox.com>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Andy Wingo <wingo <at> pobox.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#10627: closed (char-ready? is broken for multibyte encodings)
Date: Mon, 20 Jun 2016 19:24:01 +0000
[Message part 1 (text/plain, inline)]
Your message dated Mon, 20 Jun 2016 21:23:35 +0200
with message-id <87y45z8y14.fsf <at> pobox.com>
and subject line Re: bug#10627: char-ready? is broken for multibyte encodings
has caused the debbugs.gnu.org bug report #10627,
regarding char-ready? is broken for multibyte encodings
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
10627: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=10627
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Mark H Weaver <mhw <at> netris.org>
To: bug-guile <at> gnu.org
Subject: char-ready? is broken for multibyte encodings
Date: Sat, 28 Jan 2012 05:21:24 -0500
The R5RS specifies that if 'char-ready?' returns #t, then the next
'read-char' operation is guaranteed not to hang.  This is not currently
the case for ports using a multibyte encoding.

'char-ready?' currently returns #t whenever at least one _byte_ is
available.  This is not correct in general.  It should return #t only if
there is a complete _character_ available.
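
The byte-versus-character distinction can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not Guile code; the names `byte_ready` and `complete_char_ready` are hypothetical. It assumes UTF-8 input and uses an incremental decoder to tell whether the buffered bytes amount to a full character.

```python
import codecs


def byte_ready(buffered: bytes) -> bool:
    """The reported behaviour: true as soon as any byte is buffered."""
    return len(buffered) > 0


def complete_char_ready(buffered: bytes, encoding: str = "utf-8") -> bool:
    """The behaviour the report asks for: true only for a whole character."""
    # errors="replace" is an assumption: an invalid byte then counts as a
    # (replacement) character rather than raising an exception.
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    # final=False tells the decoder that more bytes may follow, so an
    # incomplete trailing sequence is held back instead of being decoded.
    return len(decoder.decode(buffered, final=False)) > 0


euro = "\u20ac".encode("utf-8")   # EURO SIGN, three bytes in UTF-8
first_byte = euro[:1]

print(byte_ready(first_byte))            # a byte is available
print(complete_char_ready(first_byte))   # is a complete character available?
print(complete_char_ready(euro))         # same check with all three bytes
```

With only the first byte of a three-byte character buffered, the byte-level check succeeds while the character-level check does not, which is the situation in which a subsequent 'read-char' could hang.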

     Mark


[Message part 3 (message/rfc822, inline)]
From: Andy Wingo <wingo <at> pobox.com>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 10627-done <at> debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Mon, 20 Jun 2016 21:23:35 +0200
On Tue 26 Feb 2013 20:59, Andy Wingo <wingo <at> pobox.com> writes:

> On Tue 26 Feb 2013 20:50, Mark H Weaver <mhw <at> netris.org> writes:
>
>> Andy Wingo <wingo <at> pobox.com> writes:
>>> Are you proposing that `char-ready?' do a nonblocking read if
>>> the buffer is empty?  That could work.
>>
>> Yes.  I suspect that something along these lines is already implemented,
>> because I don't see how 'u8-ready?' could work properly without it.
>
> It does a poll with a timeout of 0.

In the end I added this to the manual:

    Note that @code{char-ready?} only works reliably for terminals and
    sockets with one-byte encodings.  Under the hood it will return
    @code{#t} if the port has any input buffered, or if the file descriptor
    that backs the port polls as readable, indicating that Guile can fetch
    more bytes from the kernel.  However being able to fetch one byte
    doesn't mean that a full character is available; @xref{Encoding}.  Also,
    on many systems it's possible for a file descriptor to poll as readable,
    but then block when it comes time to read bytes.  Note also that on
    Linux kernels, all file ports backed by files always poll as readable.
    For non-file ports, this procedure always returns @code{#t}, except for
    soft ports, which have a @code{char-ready?} handler.  @xref{Soft Ports}.

    In short, this is a legacy procedure whose semantics are hard to
    provide.  However it is a useful check to see if any input is buffered.
    @xref{Non-Blocking I/O}.
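
The "polls as readable" check described in the note can be sketched as a poll with a zero timeout. This is a Python illustration only, assuming a POSIX system and a pipe standing in for a terminal or socket; `fd_polls_readable` is a hypothetical name, and Guile's own implementation is not shown here.

```python
import os
import select


def fd_polls_readable(fd: int) -> bool:
    """Zero-timeout poll: report readiness without ever blocking."""
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)


read_fd, write_fd = os.pipe()

# Nothing has been written yet, so the descriptor should not be readable.
ready_before = fd_polls_readable(read_fd)

# Write only the first byte of a three-byte UTF-8 character.
os.write(write_fd, "\u20ac".encode("utf-8")[:1])

# The descriptor now polls as readable although no full character exists.
ready_after = fd_polls_readable(read_fd)

os.close(read_fd)
os.close(write_fd)
```

A readable result here says only that at least one byte can be fetched, which is why the note restricts the reliable case to one-byte encodings.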

We could try a non-blocking read, but at that point we should just
provide a non-blocking read-char and allow users to unread-char.  That
would be a different bug :)
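
As a rough idea of what a non-blocking read-char with unread support might look like, here is a sketch in Python. It is not a proposal for Guile's API: the class `NonBlockingCharReader` and its methods are hypothetical, it assumes a POSIX file descriptor and UTF-8, and it returns `None` when no complete character is available yet and an empty string at end of file.

```python
import codecs
import os
from collections import deque
from typing import Optional


class NonBlockingCharReader:
    """Hypothetical reader that never blocks waiting for a character."""

    def __init__(self, fd: int, encoding: str = "utf-8") -> None:
        os.set_blocking(fd, False)
        self._fd = fd
        # The incremental decoder keeps partial byte sequences between calls.
        self._decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
        self._pending = deque()   # decoded characters not yet consumed
        self._eof = False

    def read_char(self) -> Optional[str]:
        """Return one character, None if it would block, or "" at EOF."""
        while not self._pending and not self._eof:
            try:
                chunk = os.read(self._fd, 4096)
            except BlockingIOError:
                return None               # no bytes available right now
            if chunk == b"":
                self._eof = True          # flush any dangling partial sequence
                self._pending.extend(self._decoder.decode(b"", final=True))
            else:
                self._pending.extend(self._decoder.decode(chunk))
        return self._pending.popleft() if self._pending else ""

    def unread_char(self, ch: str) -> None:
        """Push a character back so the next read_char returns it."""
        self._pending.appendleft(ch)


read_fd, write_fd = os.pipe()
reader = NonBlockingCharReader(read_fd)
euro = "\u20ac".encode("utf-8")

os.write(write_fd, euro[:1])
partial_result = reader.read_char()   # only one of three bytes has arrived

os.write(write_fd, euro[1:])
full_result = reader.read_char()      # the remaining bytes have arrived

reader.unread_char(full_result)
reread_result = reader.read_char()    # the pushed-back character
```

A caller could then peek at the next character by reading it and immediately unreading it, which covers the use that 'char-ready?' was meant to serve without the risk of hanging.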

Andy


This bug report was last modified 8 years and 340 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.