From unknown Sat Aug 16 16:21:37 2025
From: Zefram
To: 22901@debbugs.gnu.org
X-Debbugs-Original-To: bug-guile@gnu.org
Subject: bug#22901: drain-input doesn't decode
Date: Fri, 4 Mar 2016 03:09:44 +0000
Message-ID: <20160304030944.GA1318@fysh.org>

The documentation for drain-input says that it returns a string of
characters, implying that the result is equivalent to what you'd get
from calling read-char some number of times.  In fact it differs in a
significant respect: whereas read-char decodes input octets according to
the port's selected encoding, drain-input ignores the selected encoding
and always decodes according to ISO-8859-1 (thus preserving the octet
values in character form).

$ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port)))) (if (eof-object? c) (reverse l) (r (cons c l))))))) (newline)'
"UCS-2BE"
(353 610 867)
$ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (peek-char (current-input-port)) (write (map char->integer (string->list (drain-input (current-input-port))))) (newline)'
"UCS-2BE"
(1 97 2 98 3 99)

The practical upshot is that the input returned by drain-input can't be
used in the same way as regular input from read-char.  It can still be
used if the code doing the reading is totally aware of the encoding, so
that it can perform the decoding manually, but this seems a failure of
abstraction.  The value returned by drain-input ought to be coherent
with the abstraction level at which it is specified.

I can see that there is a reason for drain-input to avoid performing
decoding: the problem that occurs if the buffer ends in the middle of a
character.  If drain-input is to return decoded characters then
presumably in this case it would have to read further octets beyond the
buffer contents, in an unbuffered manner, until it reaches a character
boundary.  If this is too unpalatable, perhaps drain-input should be
permitted only on ports configured for single-octet character encodings.

If, on the other hand, it is decided to endorse the current non-decoding
behaviour, then the break of abstraction needs to be documented.

-zefram

From unknown Sat Aug 16 16:21:37 2025
From: Andy Wingo
To: Zefram
Cc: 22901@debbugs.gnu.org
Subject: bug#22901: drain-input doesn't decode
Date: Mon, 20 Jun 2016 18:12:50 +0200
Message-ID: <87d1nbg7p9.fsf@pobox.com>
In-Reply-To: <20160304030944.GA1318@fysh.org>
References: <20160304030944.GA1318@fysh.org>

On Fri 04 Mar 2016 04:09, Zefram writes:

> The documentation for drain-input says that it returns a string of
> characters, implying that the result is equivalent to what you'd get
> from calling read-char some number of times.  In fact it differs in a
> significant respect: whereas read-char decodes input octets according to
> the port's selected encoding, drain-input ignores the selected encoding
> and always decodes according to ISO-8859-1 (thus preserving the octet
> values in character form).
>
> $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding!
> (current-input-port) "UCS-2BE") (write (port-encoding
> (current-input-port))) (newline) (write (map char->integer (let r ((l
> '\''())) (let ((c (read-char (current-input-port)))) (if (eof-object?
> c) (reverse l) (r (cons c l))))))) (newline)'
> "UCS-2BE"
> (353 610 867)
> $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding!
> (current-input-port) "UCS-2BE") (write (port-encoding
> (current-input-port))) (newline) (peek-char (current-input-port))
> (write (map char->integer (string->list (drain-input
> (current-input-port))))) (newline)'
> "UCS-2BE"
> (1 97 2 98 3 99)

Thanks for the test case!  FWIW, this is fixed in Guile 2.1.3.

I am not sure what we should do about Guile 2.0.  I guess we should make
it do the documented thing though!

Andy

From unknown Sat Aug 16 16:21:37 2025
Subject: bug#22901: drain-input doesn't decode
References: <20160304030944.GA1318@fysh.org>
In-Reply-To: <20160304030944.GA1318@fysh.org>
To: 22901@debbugs.gnu.org
From: Matt Wette
Date: Sun, 26 Feb 2017 09:46:14 -0800

I put together a test and tried on 2.1.7 - my test fails.  See attached.

(pass-if "encoded input"
  (let ((fn (test-file))
        (nc "utf-8")
        (st "\u03b2\u03b1\u03b4 \u03b1\u03c3\u03c3 am I.")
        ;;(st "hello, world\n")
        )
    (let ((p1 (open-output-file fn #:encoding nc)))
      ;;(display st p1)
      (string-for-each (lambda (ch) (write-char ch p1)) st)
      (close p1))
    (let* ((p0 (open-input-file fn #:encoding nc))
           (s0 (begin (unread-char (read-char p0) p0) (drain-input p0))))
      (simple-format #t "~S\n" s0)
      (equal? s0 st))))

[Attachment: port-di.test]

;; port-di.text -*- scheme -*-

(add-to-load-path "guile-2.1.7-dev3/test-suite")
(use-modules (test-suite lib))

(define (test-file)
  (string-append (getcwd) "/ports-test.tmp"))

(with-test-prefix "drain-input"

  (pass-if "encoded input"
    (let ((fn (test-file))
          (nc "utf-8")
          (st "\u03b2\u03b1\u03b4 \u03b1\u03c3\u03c3 am I.")
          ;;(st "hello, world\n")
          )
      (let ((p1 (open-output-file fn #:encoding nc)))
        ;;(display st p1)
        (string-for-each (lambda (ch) (write-char ch p1)) st)
        (close p1))
      (let* ((p0 (open-input-file fn #:encoding nc))
             (s0 (begin (unread-char (read-char p0) p0) (drain-input p0))))
        (simple-format #t "~S\n" s0)
        (equal? s0 st))))

  )

;; --- last line ---

From unknown Sat Aug 16 16:21:37 2025
Subject: bug#22901: drain-input doesn't decode
To: 22901@debbugs.gnu.org
From: Matt Wette
Date: Sun, 26 Feb 2017 09:58:42 -0800
Message-Id: <4496E82C-4E88-4CDE-B73E-655CC3179E25@gmail.com>

> On Feb 26, 2017, at 9:46 AM, Matt Wette wrote:
>
> I put together a test and tried on 2.1.7 - my test fails.  See attached.
>
> (pass-if "encoded input"
>   (let ((fn (test-file))
>         (nc "utf-8")
>         (st "\u03b2\u03b1\u03b4 \u03b1\u03c3\u03c3 am I.")
>         ;;(st "hello, world\n")
>         )
>     (let ((p1 (open-output-file fn #:encoding nc)))
>       ;;(display st p1)
>       (string-for-each (lambda (ch) (write-char ch p1)) st)
>       (close p1))
>     (let* ((p0 (open-input-file fn #:encoding nc))
>            (s0 (begin (unread-char (read-char p0) p0) (drain-input p0))))
>       (simple-format #t "~S\n" s0)
>       (equal? s0 st))))
>

My bad.  The failure was on guile-2.0.13.  It seems to work on guile-2.1.7:

mwette$ guile-2.1.7-dev3/meta/guile port-di.test
"βαδ ασσ am I."
PASS: drain-input: encoded input

From unknown Sat Aug 16 16:21:37 2025
From: Taylan Kammer
To: 22901@debbugs.gnu.org, Zefram, Andy Wingo
Subject: bug#22901: drain-input doesn't decode
Date: Sun, 16 May 2021 19:55:07 +0200
In-Reply-To: <20160304030944.GA1318@fysh.org>
References: <20160304030944.GA1318@fysh.org>

Are we still maintaining 2.0, or can this issue be closed?

--
Taylan

From unknown Sat Aug 16 16:21:37 2025
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Zefram
Subject: bug#22901: closed (drain-input doesn't decode)
Date: Wed, 19 May 2021 11:42:01 +0000
Reply-To: 22901@debbugs.gnu.org
References: <9042a763-6e68-4233-efab-a1c1116b1d59@gmail.com> <20160304030944.GA1318@fysh.org>

Your bug report

#22901: drain-input doesn't decode

which was filed against the guile package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 22901@debbugs.gnu.org.

--
22901: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22901
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

To: 22901-done@debbugs.gnu.org
From: Taylan Kammer
Subject: drain-input doesn't decode
Message-ID: <9042a763-6e68-4233-efab-a1c1116b1d59@gmail.com>
Date: Wed, 19 May 2021 13:41:26 +0200

Closing this since it's 5 years old and fixed in Guile 2.1 and higher.

--
Taylan

Date: Fri, 4 Mar 2016 03:09:44 +0000
From: Zefram
To: bug-guile@gnu.org
Subject: drain-input doesn't decode
Message-ID: <20160304030944.GA1318@fysh.org>

[Body identical to the original report in the first message of this thread.]
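
The two results in the original report can be reproduced without any
port, which makes the nature of the mismatch easier to see.  What
follows is only a sketch, not something taken from the thread: it
assumes a Guile that ships (ice-9 iconv) and an iconv that knows
"UCS-2BE", and the helper name decode-to-code-points is made up for
illustration.  It decodes the same six octets once as UCS-2BE (what
read-char does on the port) and once as ISO-8859-1 (what drain-input
effectively did on Guile 2.0).

```scheme
(use-modules (rnrs bytevectors)   ; u8-list->bytevector
             (ice-9 iconv))       ; bytevector->string

;; The six octets that  echo -n $'\1a\2b\3c'  writes to the pipe.
(define octets (u8-list->bytevector '(1 97 2 98 3 99)))

;; Hypothetical helper: decode BV under ENCODING and return the
;; resulting characters as a list of code points.
(define (decode-to-code-points bv encoding)
  (map char->integer
       (string->list (bytevector->string bv encoding))))

;; Two octets per character, big-endian: matches the read-char output.
(write (decode-to-code-points octets "UCS-2BE"))     ; (353 610 867)
(newline)

;; One octet per character: matches the drain-input output on 2.0.
(write (decode-to-code-points octets "ISO-8859-1"))  ; (1 97 2 98 3 99)
(newline)
```

Each UCS-2BE code point is simply the pair of octets combined, for
example 1 * 256 + 97 = 353, so code that received the ISO-8859-1 form
would have to redo this pairing by hand to recover the characters.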