From unknown Fri Jun 20 18:10:33 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#62290 <62290@debbugs.gnu.org> To: bug#62290 <62290@debbugs.gnu.org> Subject: Status: Error when handling invalid unicode with suspendable ports Reply-To: bug#62290 <62290@debbugs.gnu.org> Date: Sat, 21 Jun 2025 01:10:33 +0000 retitle 62290 Error when handling invalid unicode with suspendable ports reassign 62290 guile submitter 62290 Christopher Baines severity 62290 normal thanks From debbugs-submit-bounces@debbugs.gnu.org Mon Mar 20 05:12:08 2023 Received: (at submit) by debbugs.gnu.org; 20 Mar 2023 09:12:08 +0000 Received: from localhost ([127.0.0.1]:53686 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1peBYm-0005TG-Ev for submit@debbugs.gnu.org; Mon, 20 Mar 2023 05:12:08 -0400 Received: from lists.gnu.org ([209.51.188.17]:37714) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1peBYl-0005T9-3U for submit@debbugs.gnu.org; Mon, 20 Mar 2023 05:12:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1peBYk-0001d8-DY for bug-guile@gnu.org; Mon, 20 Mar 2023 05:12:06 -0400 Received: from mira.cbaines.net ([212.71.252.8]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1peBYg-0003OU-7c for bug-guile@gnu.org; Mon, 20 Mar 2023 05:12:05 -0400 Received: from localhost (unknown [IPv6:2a02:8010:68c1:0:54d1:d5d4:280e:f699]) by mira.cbaines.net (Postfix) with ESMTPSA id AA1EB16F1F for ; Mon, 20 Mar 2023 09:11:59 +0000 (GMT) Received: from felis (localhost [127.0.0.1]) by localhost (OpenSMTPD) with ESMTP id 077924cf for ; Mon, 20 Mar 2023 09:11:58 +0000 (UTC) User-agent: mu4e 1.8.13; emacs 28.2 From: Christopher Baines To: bug-guile@gnu.org Subject: Error when handling invalid unicode 
with suspendable ports Date: Mon, 20 Mar 2023 09:09:14 +0000 Message-ID: <874jqf6b35.fsf@cbaines.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=212.71.252.8; envelope-from=mail@cbaines.net; helo=mira.cbaines.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.4 (--) Here's a simple reproducer: (use-modules (ice-9 binary-ports) (ice-9 suspendable-ports) (rnrs bytevectors)) (define (test) (let* ((sequence '(#xf4 #xa4 #xbd #xa4)) (p (open-bytevector-input-port (u8-list->bytevector sequence)))) (set-port-encoding! p "UTF-8") (set-port-conversion-strategy! p 'substitute) (peek (read-char p)))) (test) (install-suspendable-ports!) (test) If you run it, it outputs #\=EF=BF=BD as expected the first time, but then = using suspendable ports, it raises an exception. The behaviour should be the same. ;;; (#\=EF=BF=BD) Backtrace: In ice-9/boot-9.scm: 1752:10 8 (with-exception-handler _ _ #:unwind? 
_ # _) In unknown file: 7 (apply-smob/0 #) In ice-9/boot-9.scm: 724:2 6 (call-with-prompt ("prompt") # =E2=80=A6) In ice-9/eval.scm: 619:8 5 (_ #(#(#))) In ice-9/boot-9.scm: 2836:4 4 (save-module-excursion #) 4388:12 3 (_) In /home/chris/Projects/Guile/guile/bad-unicode.scm: 12:10 2 (test) In ice-9/suspendable-ports.scm: 591:33 1 (read-char _) 499:12 0 (peek-char-and-next-cur/utf8 _ _ _ _) ice-9/suspendable-ports.scm:499:12: In procedure peek-char-and-next-cur/utf= 8: In procedure integer->char: Argument 1 out of range: 1199972 From debbugs-submit-bounces@debbugs.gnu.org Mon Mar 20 05:15:17 2023 Received: (at 62290) by debbugs.gnu.org; 20 Mar 2023 09:15:17 +0000 Received: from localhost ([127.0.0.1]:53692 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1peBbp-0005YY-1L for submit@debbugs.gnu.org; Mon, 20 Mar 2023 05:15:17 -0400 Received: from mira.cbaines.net ([212.71.252.8]:42404) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1peBbm-0005YQ-Ue for 62290@debbugs.gnu.org; Mon, 20 Mar 2023 05:15:15 -0400 Received: from localhost (unknown [IPv6:2a02:8010:68c1:0:54d1:d5d4:280e:f699]) by mira.cbaines.net (Postfix) with ESMTPSA id 56CDE16F21 for <62290@debbugs.gnu.org>; Mon, 20 Mar 2023 09:15:14 +0000 (GMT) Received: from localhost (localhost [local]) by localhost (OpenSMTPD) with ESMTPA id 03fb9ae7 for <62290@debbugs.gnu.org>; Mon, 20 Mar 2023 09:15:14 +0000 (UTC) From: Christopher Baines To: 62290@debbugs.gnu.org Subject: [PATCH] Fix some invalid unicode handling issues with suspendable ports. 
Date: Mon, 20 Mar 2023 09:15:13 +0000
Message-Id: <20230320091513.10817-1-mail@cbaines.net>

Based on the implementation in ports.c. I don't understand what this
code is really doing, but the suspendable ports implementation differs
from the similar C code for a couple of inequalities.

* module/ice-9/suspendable-ports.scm (decode-utf8, bad-utf8-len): Flip a
couple of inequalities.
* test-suite/tests/ports.test ("string ports"): Add additional invalid
UTF-8 test case.
---
 module/ice-9/suspendable-ports.scm | 8 ++++----
 test-suite/tests/ports.test        | 7 +++++++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/module/ice-9/suspendable-ports.scm b/module/ice-9/suspendable-ports.scm
index a823f1d37..9fac1df62 100644
--- a/module/ice-9/suspendable-ports.scm
+++ b/module/ice-9/suspendable-ports.scm
@@ -419,7 +419,7 @@
                 (= (logand u8_2 #xc0) #x80)
                 (case u8_0
                   ((#xe0) (>= u8_1 #xa0))
-                  ((#xed) (>= u8_1 #x9f))
+                  ((#xed) (<= u8_1 #x9f))
                   (else #t)))
            (kt (integer->char
                 (logior (ash (logand u8_0 #x0f) 12)
@@ -436,7 +436,7 @@
                 (= (logand u8_3 #xc0) #x80)
                 (case u8_0
                   ((#xf0) (>= u8_1 #x90))
-                  ((#xf4) (>= u8_1 #x8f))
+                  ((#xf4) (<= u8_1 #x8f))
                   (else #t)))
            (kt (integer->char
                 (logior (ash (logand u8_0 #x07) 18)
@@ -462,7 +462,7 @@
         ((< buffering 2) 1)
         ((not (= (logand (ref 1) #xc0) #x80)) 1)
         ((and (eq? first-byte #xe0) (< (ref 1) #xa0)) 1)
-        ((and (eq? first-byte #xed) (< (ref 1) #x9f)) 1)
+        ((and (eq? first-byte #xed) (> (ref 1) #x9f)) 1)
         ((< buffering 3) 2)
         ((not (= (logand (ref 2) #xc0) #x80)) 2)
         (else 0)))
@@ -471,7 +471,7 @@
         ((< buffering 2) 1)
         ((not (= (logand (ref 1) #xc0) #x80)) 1)
         ((and (eq? first-byte #xf0) (< (ref 1) #x90)) 1)
-        ((and (eq? first-byte #xf4) (< (ref 1) #x8f)) 1)
+        ((and (eq? first-byte #xf4) (> (ref 1) #x8f)) 1)
         ((< buffering 3) 2)
         ((not (= (logand (ref 2) #xc0) #x80)) 2)
         ((< buffering 4) 3)
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 66e10e3dd..1b30e1a68 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -1059,6 +1059,13 @@
      eof))
 
   (test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
+    (error ;; 2nd byte should be in the 90..BF range
+     error ;; 88: not a valid starting byte
+     error ;; 88: not a valid starting byte
+     error ;; 88: not a valid starting byte
+     eof))
+
+  (test-decoding-error (#xf4 #xa4 #xbd #xa4) "UTF-8"
     (error ;; 2nd byte should be in the 90..BF range
      error ;; 88: not a valid starting byte
      error ;; 88: not a valid starting byte
-- 
2.39.1


From debbugs-submit-bounces@debbugs.gnu.org Mon Mar 20 18:27:41 2023
From: Ludovic Courtès
To: Christopher Baines
Cc: 62290-done@debbugs.gnu.org
Subject: Re: bug#62290: Error when handling invalid unicode with suspendable ports
References: <874jqf6b35.fsf@cbaines.net> <20230320091513.10817-1-mail@cbaines.net>
Date: Mon, 20 Mar 2023 23:27:32 +0100
In-Reply-To: <20230320091513.10817-1-mail@cbaines.net> (Christopher Baines's message of "Mon, 20 Mar 2023 09:15:13 +0000")
Message-ID: <87pm932h4b.fsf_-_@gnu.org>

Hello,

Christopher Baines skribis:

> Based on the implementation in ports.c. I don't understand what this
> code is really doing, but the suspendable ports implementation differs
> from the similar C code for a couple of inequalities.
>
> * module/ice-9/suspendable-ports.scm (decode-utf8, bad-utf8-len): Flip a
> couple of inequalities.
> * test-suite/tests/ports.test ("string ports"): Add additional invalid
> UTF-8 test case.

Pushed as cba2e7e3fec3c781230570f5d1ef070625eeeda8.

Thanks for documenting the problem and providing a perfect patch!

Ludo’.


From unknown Fri Jun 20 18:10:33 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Tue, 18 Apr 2023 11:24:08 +0000

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
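
For readers wondering why the flipped comparisons in the patch matter,
here is a minimal sketch, assuming the standard UTF-8 well-formedness
rules (RFC 3629). The procedure names valid-second-byte? and
decode-4-byte are hypothetical helpers written for this note; they are
not Guile's actual decode-utf8 or bad-utf8-len. The sketch shows that
the reproducer's bytes decode to 1199972, the out-of-range value from
the backtrace, and that a second-byte check of the corrected form
rejects them before integer->char is reached.

```scheme
;; Hypothetical helper (not part of Guile): combine the payload bits of
;; a 4-byte UTF-8 sequence without validating anything.
(define (decode-4-byte u8_0 u8_1 u8_2 u8_3)
  (logior (ash (logand u8_0 #x07) 18)
          (ash (logand u8_1 #x3f) 12)
          (ash (logand u8_2 #x3f) 6)
          (logand u8_3 #x3f)))

;; Hypothetical helper (not part of Guile): is U8_1 acceptable as the
;; second byte after lead byte U8_0?  Assumes U8_0 is a 3- or 4-byte
;; lead byte.
(define (valid-second-byte? u8_0 u8_1)
  (and (= (logand u8_1 #xc0) #x80)     ; must be a continuation byte
       (case u8_0
         ((#xe0) (>= u8_1 #xa0))       ; no overlong 3-byte encodings
         ((#xed) (<= u8_1 #x9f))       ; no surrogates U+D800..U+DFFF
         ((#xf0) (>= u8_1 #x90))       ; no overlong 4-byte encodings
         ((#xf4) (<= u8_1 #x8f))       ; nothing above U+10FFFF
         (else #t))))

;; The reproducer's sequence: the raw value exceeds #x10ffff (1114111).
(display (decode-4-byte #xf4 #xa4 #xbd #xa4))   ; prints 1199972
(newline)
(display (valid-second-byte? #xf4 #xa4))        ; prints #f
(newline)
(display (valid-second-byte? #xf4 #x8f))        ; prints #t
(newline)
```

With the comparison for #xf4 written as >= instead of <=, the second
byte #xa4 would pass the check, which is consistent with the
integer->char error reported at the top of this thread.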