GNU bug report logs - #22901
drain-input doesn't decode
Reported by: Zefram <zefram <at> fysh.org>
Date: Fri, 4 Mar 2016 03:11:01 UTC
Severity: normal
Done: Taylan Kammer <taylan.kammer <at> gmail.com>
Bug is archived. No further changes may be made.
Report forwarded to bug-guile <at> gnu.org: bug#22901; Package guile. (Fri, 04 Mar 2016 03:11:01 GMT)
Acknowledgement sent to Zefram <zefram <at> fysh.org>: New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Fri, 04 Mar 2016 03:11:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
The documentation for drain-input says that it returns a string of
characters, implying that the result is equivalent to what you'd get
from calling read-char some number of times. In fact it differs in a
significant respect: whereas read-char decodes input octets according to
the port's selected encoding, drain-input ignores the selected encoding
and always decodes according to ISO-8859-1 (thus preserving the octet
values in character form).
$ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port)))) (if (eof-object? c) (reverse l) (r (cons c l))))))) (newline)'
"UCS-2BE"
(353 610 867)
$ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (peek-char (current-input-port)) (write (map char->integer (string->list (drain-input (current-input-port))))) (newline)'
"UCS-2BE"
(1 97 2 98 3 99)
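The two results above follow directly from how the same six octets are interpreted. As a language-neutral cross-check (a sketch in Python, not Guile code; the codec names `utf-16-be` and `latin-1` are Python's, and `utf-16-be` is assumed here as a stand-in for UCS-2BE, which agrees with it for these code points), decoding the octets both ways reproduces the two lists:

```python
# The six octets produced by: echo -n $'\1a\2b\3c'
octets = b"\x01a\x02b\x03c"

# Interpret each pair of octets as one big-endian 16-bit character.
# Assumption: Python's "utf-16-be" stands in for UCS-2BE here.
as_ucs2 = [ord(ch) for ch in octets.decode("utf-16-be")]

# Interpret each octet as one ISO-8859-1 character, which preserves
# the octet values in character form.
as_latin1 = [ord(ch) for ch in octets.decode("latin-1")]

print(as_ucs2)    # matches the read-char result: [353, 610, 867]
print(as_latin1)  # matches the drain-input result: [1, 97, 2, 98, 3, 99]
```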
The practical upshot is that the input returned by drain-input can't
be used in the same way as regular input from read-char. It can still
be used if the code doing the reading is totally aware of the encoding,
so that it can perform the decoding manually, but this seems a failure
of abstraction. The value returned by drain-input ought to be coherent
with the abstraction level at which it is specified.
I can see that there is a reason for drain-input to avoid performing
decoding: the problem that occurs if the buffer ends in the middle
of a character. If drain-input is to return decoded characters then
presumably in this case it would have to read further octets beyond the
buffer contents, in an unbuffered manner, until it reaches a character
boundary. If this is too unpalatable, perhaps drain-input should be
permitted only on ports configured for single-octet character encodings.
If, on the other hand, it is decided to endorse the current non-decoding
behaviour, then the break of abstraction needs to be documented.
-zefram
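The buffer-boundary problem described in the message above can be illustrated outside Guile. The following is only a sketch in Python, assuming a hypothetical situation in which the port's buffer holds the first three of the six octets; the variable names are invented for illustration, and `utf-16-be` again stands in for UCS-2BE. An incremental decoder can return only the complete characters and must hold back the trailing octet until more input arrives:

```python
import codecs

# Hypothetical scenario: the buffer ends in the middle of the second
# 16-bit character; the remaining octets have not been read yet.
buffered = b"\x01a\x02"
not_yet_read = b"b\x03c"

decoder = codecs.getincrementaldecoder("utf-16-be")()

# Decoding the buffer alone yields only the one complete character;
# the trailing octet 0x02 is retained inside the decoder.
first = decoder.decode(buffered)

# The split character can be completed only once further octets arrive.
rest = decoder.decode(not_yet_read, final=True)

print([ord(ch) for ch in first])  # [353]
print([ord(ch) for ch in rest])   # [610, 867]
```

A decoding drain-input would face the same choice: either leave the partial character in the buffer, or read further octets until a character boundary is reached.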
Information forwarded to bug-guile <at> gnu.org: bug#22901; Package guile. (Mon, 20 Jun 2016 16:14:01 GMT)
Message #8 received at 22901 <at> debbugs.gnu.org:
On Fri 04 Mar 2016 04:09, Zefram <zefram <at> fysh.org> writes:
> The documentation for drain-input says that it returns a string of
> characters, implying that the result is equivalent to what you'd get
> from calling read-char some number of times. In fact it differs in a
> significant respect: whereas read-char decodes input octets according to
> the port's selected encoding, drain-input ignores the selected encoding
> and always decodes according to ISO-8859-1 (thus preserving the octet
> values in character form).
>
> $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding!
> (current-input-port) "UCS-2BE") (write (port-encoding
> (current-input-port))) (newline) (write (map char->integer (let r ((l
> '\''())) (let ((c (read-char (current-input-port)))) (if (eof-object?
> c) (reverse l) (r (cons c l))))))) (newline)'
> "UCS-2BE"
> (353 610 867)
> $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding!
> (current-input-port) "UCS-2BE") (write (port-encoding
> (current-input-port))) (newline) (peek-char (current-input-port))
> (write (map char->integer (string->list (drain-input
> (current-input-port))))) (newline)'
> "UCS-2BE"
> (1 97 2 98 3 99)
Thanks for the test case! FWIW, this is fixed in Guile 2.1.3. I am not
sure what we should do about Guile 2.0. I guess we should make it do
the documented thing though!
Andy
Information forwarded to bug-guile <at> gnu.org: bug#22901; Package guile. (Sun, 26 Feb 2017 17:47:02 GMT)
Message #11 received at 22901 <at> debbugs.gnu.org:
I put together a test and tried on 2.1.7 - my test fails. See attached.
(pass-if "encoded input"
  (let ((fn (test-file))
        (nc "utf-8")
        (st "\u03b2\u03b1\u03b4 \u03b1\u03c3\u03c3 am I.")
        ;;(st "hello, world\n")
        )
    (let ((p1 (open-output-file fn #:encoding nc)))
      ;;(display st p1)
      (string-for-each (lambda (ch) (write-char ch p1)) st)
      (close p1))
    (let* ((p0 (open-input-file fn #:encoding nc))
           (s0 (begin (unread-char (read-char p0) p0) (drain-input p0))))
      (simple-format #t "~S\n" s0)
      (equal? s0 st))))
[port-di.test (application/octet-stream, attachment)]
Information forwarded to bug-guile <at> gnu.org: bug#22901; Package guile. (Sun, 26 Feb 2017 17:59:01 GMT)
Message #14 received at 22901 <at> debbugs.gnu.org:
> On Feb 26, 2017, at 9:46 AM, Matt Wette <matt.wette <at> gmail.com> wrote:
>
> I put together a test and tried on 2.1.7 - my test fails. See attached.
>
> (pass-if "encoded input"
> (let ((fn (test-file))
> (nc "utf-8")
> (st "\u03b2\u03b1\u03b4 \u03b1\u03c3\u03c3 am I.")
> ;;(st "hello, world\n")
> )
> (let ((p1 (open-output-file fn #:encoding nc)))
> ;;(display st p1)
> (string-for-each (lambda (ch) (write-char ch p1)) st)
> (close p1))
> (let* ((p0 (open-input-file fn #:encoding nc))
> (s0 (begin (unread-char (read-char p0) p0) (drain-input p0))))
> (simple-format #t "~S\n" s0)
> (equal? s0 st))))
>
My bad. The failure was on guile-2.0.13. It seems to work on guile-2.1.7:
mwette$ guile-2.1.7-dev3/meta/guile port-di.test
"βαδ ασσ am I."
PASS: drain-input: encoded input
Information forwarded to bug-guile <at> gnu.org: bug#22901; Package guile. (Sun, 16 May 2021 17:56:01 GMT)
Message #17 received at 22901 <at> debbugs.gnu.org:
Are we still maintaining 2.0, or can this issue be closed?
--
Taylan
Reply sent to Taylan Kammer <taylan.kammer <at> gmail.com>: You have taken responsibility. (Wed, 19 May 2021 11:42:01 GMT)
Notification sent to Zefram <zefram <at> fysh.org>: bug acknowledged by developer. (Wed, 19 May 2021 11:42:01 GMT)
Message #22 received at 22901-done <at> debbugs.gnu.org:
Closing this since it's 5 years old and fixed in Guile 2.1 and higher.
--
Taylan
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 17 Jun 2021 11:24:04 GMT)