GNU bug report logs - #30066
'get-bytevector-some' returns only 1 byte from unbuffered ports


Package: guile

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Wed, 10 Jan 2018 15:03:02 UTC

Severity: normal

Tags: notabug

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.


From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: Andy Wingo <wingo <at> igalia.com>, 30066 <at> debbugs.gnu.org
Subject: bug#30066: 'get-bytevector-some' returns only 1 byte from unbuffered ports
Date: Thu, 11 Jan 2018 16:55:38 -0500

ludo <at> gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw <at> netris.org> writes:
>
>> ludo <at> gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>>> +    {
>>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>>> +      size_t read;
>>> +
>>> +      bv = scm_c_make_bytevector (max_buffer_size);
>>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>>> +                            avail, cur, avail);
>>> +
>>> +      read = scm_i_read_bytes (port, bv, avail,
>>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>>
>> Here's the R6RS specification for 'get-bytevector-some':
>>
>>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>>    available from BINARY-INPUT-PORT or until an end of file is reached.
>>    If bytes become available, 'get-bytevector-some' returns a freshly
>>    allocated bytevector containing the initial available bytes (at least
>>    one), and it updates BINARY-INPUT-PORT to point just past these
>>    bytes.  If no input bytes are seen before an end of file is reached,
>>    the end-of-file object is returned."
>>
>> By my reading of this, we should block only if necessary to ensure that
>> we return at least one byte (or EOF).  In other words, if we can return
>> at least one byte (or EOF), then we must not block, which means that we
>> must not initiate another 'read'.
>
> Indeed.  So perhaps the condition above should be changed to:
>
>   if (SCM_UNBUFFEREDP (port) && (avail == 0))
>
> ?

That won't work, because the earlier call to 'scm_fill_input' will have
already initiated a 'read' if the buffer was empty.  The read buffer
size will determine the maximum number of bytes read, which will be 1 in
the case of an unbuffered port.  So, at the point of this condition,
'avail == 0' will occur only if EOF was encountered, in which case you
must return EOF without attempting another 'read'.
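
To illustrate the point above, here is a minimal, self-contained C model of
a fill step on a port whose read buffer holds a single byte.  It is only a
sketch: 'toy_port', 'toy_read', 'toy_fill_input' and 'toy_init' are
hypothetical names invented for this example and are not Guile's internal
API.  The model assumes that the fill step issues one 'read' of at most the
read buffer size when the buffer is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a port: a byte source plus a read buffer.
   None of these names are Guile's; they only model the behaviour.  */
struct toy_port
{
  const unsigned char *src;     /* bytes the underlying device will deliver */
  size_t src_len, src_pos;
  unsigned char buf[64];        /* read buffer storage */
  size_t buf_size;              /* usable size; 1 models an unbuffered port */
  size_t cur, end;              /* buffered bytes live in buf[cur..end) */
  int has_eof;                  /* set when a 'read' returned 0 bytes */
  int reads;                    /* number of 'read' calls issued so far */
};

static void
toy_init (struct toy_port *p, const char *data, size_t buf_size)
{
  assert (buf_size >= 1 && buf_size <= sizeof p->buf);
  memset (p, 0, sizeof *p);
  p->src = (const unsigned char *) data;
  p->src_len = strlen (data);
  p->buf_size = buf_size;
}

/* One 'read' on the underlying device: delivers at most COUNT bytes.  */
static size_t
toy_read (struct toy_port *p, unsigned char *dst, size_t count)
{
  size_t left = p->src_len - p->src_pos;
  size_t n = count < left ? count : left;

  memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  p->reads++;
  return n;
}

/* Model of the fill step: if the buffer is empty and EOF has not been
   seen, issue one 'read' of at most buf_size bytes.  Returns the number
   of buffered bytes afterwards.  */
static size_t
toy_fill_input (struct toy_port *p)
{
  if (p->cur == p->end && !p->has_eof)
    {
      size_t n = toy_read (p, p->buf, p->buf_size);

      p->cur = 0;
      p->end = n;
      if (n == 0)
        p->has_eof = 1;
    }
  return p->end - p->cur;
}
```

In this model, calling 'toy_fill_input' on a port initialized with
'toy_init (&p, "hello", 1)' leaves exactly one byte buffered after a single
'read', even though five bytes were pending.  The result is zero only when
the source is already exhausted, which corresponds to the EOF case.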

In order to avoid unnecessary blocking, there must be only one 'read'
call, and it must be initiated only if the buffer was already empty.

So, in order to accomplish your goal here, I don't see how you can use
'scm_fill_input', unless you temporarily increase the size of the read
buffer beforehand.

Instead, I think you need to first check if the read buffer contains any
bytes.  If so, empty the buffer and return them.  If the buffer is
empty, the next thing to check is 'scm_port_buffer_has_eof_p'.  If it's
set, then clear that flag and return EOF.

Otherwise, if the buffer is empty and 'scm_port_buffer_has_eof_p' is
false, then you must do what 'scm_fill_input' would have done, except
using your larger buffer instead of the port's internal read buffer.  In
particular, you must first switch the port to "reading" mode, flushing
the write buffer if 'rw_random' is set.
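
The three cases described above can be sketched as follows.  This is a
self-contained C model, not a patch: 'toy_port', 'toy_get_some',
'toy_read', 'toy_flush' and 'toy_init' are hypothetical names that stand in
for the port internals, and the 'write_pending' flag is an assumption made
only to show where the flush would happen.  A return value of 0 stands for
the end-of-file object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a port; these are not Guile's data structures.  */
struct toy_port
{
  const unsigned char *src;     /* bytes the underlying device will deliver */
  size_t src_len, src_pos;
  unsigned char buf[8];         /* the port's own (small) read buffer */
  size_t cur, end;              /* buffered bytes live in buf[cur..end) */
  int has_eof;                  /* EOF was seen by an earlier 'read' */
  int rw_random;                /* port is random-access read/write */
  int write_pending;            /* assumed flag: write buffer is non-empty */
  int reads, flushes;           /* counters, for illustration only */
};

static void
toy_init (struct toy_port *p, const char *data)
{
  memset (p, 0, sizeof *p);
  p->src = (const unsigned char *) data;
  p->src_len = strlen (data);
}

/* One 'read' on the underlying device: delivers at most COUNT bytes.  */
static size_t
toy_read (struct toy_port *p, unsigned char *dst, size_t count)
{
  size_t left = p->src_len - p->src_pos;
  size_t n = count < left ? count : left;

  memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  p->reads++;
  return n;
}

static void
toy_flush (struct toy_port *p)
{
  p->write_pending = 0;
  p->flushes++;
}

/* Store up to OUT_SIZE bytes in OUT and return how many were stored,
   or 0 for EOF.  At most one 'read' is issued, and only when the read
   buffer is empty and no EOF is pending.  */
static size_t
toy_get_some (struct toy_port *p, unsigned char *out, size_t out_size)
{
  size_t avail = p->end - p->cur;

  assert (out_size >= 1);

  /* Case 1: bytes are already buffered.  Return them without reading.  */
  if (avail > 0)
    {
      size_t n = avail < out_size ? avail : out_size;

      memcpy (out, p->buf + p->cur, n);
      p->cur += n;
      return n;
    }

  /* Case 2: EOF was already seen.  Clear the flag and report EOF.  */
  if (p->has_eof)
    {
      p->has_eof = 0;
      return 0;
    }

  /* Case 3: switch to "reading" mode, flushing pending writes on a
     random-access port, then issue the single 'read' directly into the
     caller's larger buffer instead of the port's own buffer.  */
  if (p->rw_random && p->write_pending)
    toy_flush (p);
  return toy_read (p, out, out_size);
}
```

With a 16-byte 'out' buffer and a source holding "hello", the first call in
this model returns all five bytes from one 'read'.  If a byte is already
buffered, it is returned with no 'read' at all, which is the property that
avoids unnecessary blocking.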

Also, I'd prefer to move this code to ports.c in order to avoid adding
more internal declarations to ports.h and changing more functions from
'static' to global functions.

>> Out of curiosity, is there a reason why you're using an unbuffered port
>> in your use case?
>
> It’s to implement redirect à la socat:
>
>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447

Why is an unbuffered port being used here?  Can we change it to a
buffered port?

      Mark



