From debbugs-submit-bounces@debbugs.gnu.org Sat Jan 28 05:23:16 2012
From: Mark H Weaver
To: bug-guile@gnu.org
Subject: char-ready? is broken for multibyte encodings
Date: Sat, 28 Jan 2012 05:21:24 -0500
Message-ID: <87ipjwktzv.fsf@netris.org>

The R5RS specifies that if 'char-ready?' returns #t, then the next
'read-char' operation is guaranteed not to hang. This is not currently
the case for ports using a multibyte encoding.

'char-ready?' currently returns #t whenever at least one _byte_ is
available. This is not correct in general. It should return #t only if
there is a complete _character_ available.

    Mark

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 14:13:33 2013
From: Andy Wingo
To: Mark H Weaver
Cc: 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Sun, 24 Feb 2013 20:11:50 +0100
Message-ID: <87d2vpedc9.fsf@pobox.com>
In-Reply-To: <87ipjwktzv.fsf@netris.org>

On Sat 28 Jan 2012 11:21, Mark H Weaver writes:

> The R5RS specifies that if 'char-ready?' returns #t, then the next
> 'read-char' operation is guaranteed not to hang. This is not currently
> the case for ports using a multibyte encoding.
>
> 'char-ready?' currently returns #t whenever at least one _byte_ is
> available. This is not correct in general. It should return #t only if
> there is a complete _character_ available.

This procedure is omitted in the R6RS because it is not a good
interface. Besides its semantic difficulties, can you think of a sane
implementation for multibyte characters?

I suggest we document that this procedure only works correctly in
encodings with 1-byte characters and recommend that people use
u8-ready? instead.

Andy
-- 
http://wingolog.org/

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 15:16:02 2013
From: Mark H Weaver
To: Andy Wingo
Cc: 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Sun, 24 Feb 2013 15:14:05 -0500
Message-ID: <87ip5h79ma.fsf@tines.lan>
In-Reply-To: <87d2vpedc9.fsf@pobox.com>

Hi Andy,

Andy Wingo writes:

> On Sat 28 Jan 2012 11:21, Mark H Weaver writes:
>
>> The R5RS specifies that if 'char-ready?' returns #t, then the next
>> 'read-char' operation is guaranteed not to hang. This is not currently
>> the case for ports using a multibyte encoding.
>>
>> 'char-ready?' currently returns #t whenever at least one _byte_ is
>> available.
>> This is not correct in general. It should return #t only if
>> there is a complete _character_ available.
>
> This procedure is omitted in the R6RS because it is not a good
> interface. Besides its semantic difficulties, can you think of a sane
> implementation for multibyte characters?

Maybe I'm missing something, but I don't see any semantic problem here,
and it seems straightforward to implement. 'char-ready?' should simply
read bytes until either a complete character is available, or no more
bytes are ready. In either case, all the bytes should then be 'unget'
before returning. What's the problem?

The only reason I haven't yet fixed this is because it will require some
refactoring in ports.c. I guess the most straightforward approach is to
generalize 'get_codepoint', 'get_utf8_codepoint', and
'get_iconv_codepoint' to support a non-blocking mode of operation.

What do you think?

    Regards,
      Mark

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 17:17:15 2013
From: Andy Wingo
To: Mark H Weaver
Cc: 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Sun, 24 Feb 2013 23:15:33 +0100
Message-ID: <87y5ed8ika.fsf@pobox.com>
In-Reply-To: <87ip5h79ma.fsf@tines.lan>

Hi :)

On Sun 24 Feb 2013 21:14, Mark H Weaver writes:

> Andy Wingo writes:
>
>> On Sat 28 Jan 2012 11:21, Mark H Weaver writes:
>>
>>> The R5RS specifies that if 'char-ready?' returns #t, then the next
>>> 'read-char' operation is guaranteed not to hang.
>>> This is not currently
>>> the case for ports using a multibyte encoding.
>>>
>>> 'char-ready?' currently returns #t whenever at least one _byte_ is
>>> available. This is not correct in general. It should return #t only if
>>> there is a complete _character_ available.
>>
>> This procedure is omitted in the R6RS because it is not a good
>> interface. Besides its semantic difficulties, can you think of a sane
>> implementation for multibyte characters?
>
> Maybe I'm missing something, but I don't see any semantic problem here,
> and it seems straightforward to implement. 'char-ready?' should simply
> read bytes until either a complete character is available, or no more
> bytes are ready. In either case, all the bytes should then be 'unget'
> before returning. What's the problem?

The problem is that char-ready? should not read anything. If you want
to peek, use peek-char.

Note that if the stream is at EOF, char-ready? should return #t.

Andy
-- 
http://wingolog.org/

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 19:08:28 2013
From: Mark H Weaver
To: Andy Wingo
Cc: 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Sun, 24 Feb 2013 19:06:30 -0500
Message-ID: <87a9qt6yux.fsf@tines.lan>
In-Reply-To: <87y5ed8ika.fsf@pobox.com>

Andy Wingo writes:

> On Sun 24 Feb 2013 21:14, Mark H Weaver writes:
>
>> Maybe I'm missing something, but I don't see any semantic problem here,
>> and it seems straightforward to implement. 'char-ready?' should simply
>> read bytes until either a complete character is available, or no more
>> bytes are ready. In either case, all the bytes should then be 'unget'
>> before returning. What's the problem?
>
> The problem is that char-ready? should not read anything.

Okay, but if all bytes read are later *unread*, and the reads never
block, then why does it matter? The reads in my proposed implementation
are just an internal implementation detail, and it seems to me that the
user cannot tell the difference, as long as he does not peek underneath
the Scheme port abstraction.

If you prefer, perhaps a nicer way to think about it is that
'char-ready?' looks ahead in the putback buffer and/or the read buffer
(refilling it in a non-blocking mode if needed), and returns #t iff a
complete character is present in the buffer(s), or EOF is reached.

However, it seems to me that implementing this in terms of read-byte and
unget-byte is simpler, because it avoids duplication of the logic
regarding putback buffers and refilling of buffers. Maybe there's some
reason why this is a bad idea, but I haven't heard one.

I agree that 'char-ready?' is an antiquated interface, but it is
nonetheless part of the R5RS (and Guile since approximately forever),
and it is the only way to do a non-blocking read in portable R5RS. It
seems to me that we ought to try to implement it as well as we can, no?

> If you want to peek, use peek-char.

Okay, but that's a totally different tool with a different use case. It
cannot be used to do non-blocking reads.

> Note that if the stream is at EOF, char-ready? should return #t.

Agreed.

More thoughts?

    Thanks,
      Mark

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 20:25:25 2013
From: Daniel Hartwig
To: Mark H Weaver
Cc: Andy Wingo, 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Mon, 25 Feb 2013 09:23:45 +0800
In-Reply-To: <87a9qt6yux.fsf@tines.lan>

On 25 February 2013 08:06, Mark H Weaver wrote:
> Andy Wingo writes:
>
>> On Sun 24 Feb 2013 21:14, Mark H Weaver writes:
>>
>>> Maybe I'm missing something, but I don't see any semantic problem here,
>>> and it seems straightforward to implement. 'char-ready?' should simply
>>> read bytes until either a complete character is available, or no more
>>> bytes are ready. In either case, all the bytes should then be 'unget'
>>> before returning. What's the problem?
>>
>> The problem is that char-ready? should not read anything.
>
> Okay, but if all bytes read are later *unread*, and the reads never
> block, then why does it matter?

Taking care to still use sf_input_waiting for soft ports? Reading bytes
from a soft port could have side effects (e.g. logging action or
similar).

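Annotation (not part of the original messages): the look-ahead idea
discussed so far, "return #t iff a complete character is present in the
buffer(s), or EOF is reached", can be sketched for UTF-8 as a pure
inspection of bytes that are already buffered, which also sidesteps the
soft-port side-effect concern because nothing is read. This is only an
illustrative Python sketch, not Guile's ports.c; the function name
utf8_char_ready and its parameters are hypothetical, and it does not
reject overlong or surrogate encodings.

```python
def utf8_char_ready(buffered: bytes, at_eof: bool = False) -> bool:
    """Return True if decoding one character from `buffered` needs no more input.

    `buffered` stands for bytes already sitting in a port's buffers
    (hypothetical model); nothing is consumed or read here.
    """
    if not buffered:
        # Empty buffer: only EOF lets a read return immediately.
        return at_eof

    lead = buffered[0]
    if lead < 0x80:
        need = 1                      # ASCII, single byte
    elif 0xC2 <= lead <= 0xDF:
        need = 2                      # two-byte sequence
    elif 0xE0 <= lead <= 0xEF:
        need = 3                      # three-byte sequence
    elif 0xF0 <= lead <= 0xF4:
        need = 4                      # four-byte sequence
    else:
        # Invalid lead byte: a decoder can report the error right away.
        return True

    # A non-continuation byte inside the sequence is also a decoding
    # error that can be reported without waiting for more input.
    for b in buffered[1:need]:
        if b & 0xC0 != 0x80:
            return True

    # Either the whole sequence is buffered, or EOF truncated it.
    return len(buffered) >= need or at_eof
```

With only the first byte of a two-byte character buffered, the check
answers "not ready", which is the case where a byte-level test would
wrongly answer "ready".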
From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 25 03:57:29 2013
From: Andy Wingo
To: Mark H Weaver
Cc: 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Mon, 25 Feb 2013 09:55:44 +0100
Message-ID: <87ehg493hr.fsf@pobox.com>
In-Reply-To: <87a9qt6yux.fsf@tines.lan>

Hi Mark,

Are you proposing that `char-ready?' do a nonblocking read if
the buffer is empty? That could work.

On Mon 25 Feb 2013 01:06, Mark H Weaver writes:

> However, it seems to me that implementing this in terms of read-byte and
> unget-byte is simpler, because it avoids duplication of the logic
> regarding putback buffers and refilling of buffers.

Could work, if the port is nonblocking to begin with.

> I agree that 'char-ready?' is an antiquated interface, but it is
> nonetheless part of the R5RS (and Guile since approximately forever),
> and it is the only way to do a non-blocking read in portable R5RS. It
> seems to me that we ought to try to implement it as well as we can, no?

Do what you like to do :) But if it were my time, I would simply
document that it checks for a byte and not a character and move on.

Andy
-- 
http://wingolog.org/

From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 26 14:52:58 2013
From: Mark H Weaver
To: Andy Wingo
Cc: 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Tue, 26 Feb 2013 14:50:43 -0500
Message-ID: <87fw0i26ss.fsf@tines.lan>
In-Reply-To: <87ehg493hr.fsf@pobox.com>

Andy Wingo writes:
> Are you proposing that `char-ready?' do a nonblocking read if
> the buffer is empty? That could work.

Yes. I suspect that something along these lines is already implemented,
because I don't see how 'u8-ready?' could work properly without it.

> Do what you like to do :) But if it were my time, I would simply
> document that it checks for a byte and not a character and move on.

I'd like to fix it properly. Let's keep this bug open until it's done.

    Thanks,
      Mark

From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 26 15:01:18 2013
From: Andy Wingo
To: Mark H Weaver
Cc: 10627@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Tue, 26 Feb 2013 20:59:25 +0100
Message-ID: <87wqtuyhgi.fsf@pobox.com>
In-Reply-To: <87fw0i26ss.fsf@tines.lan>

On Tue 26 Feb 2013 20:50, Mark H Weaver writes:

> Andy Wingo writes:
>> Are you proposing that `char-ready?' do a nonblocking read if
>> the buffer is empty? That could work.
>
> Yes. I suspect that something along these lines is already implemented,
> because I don't see how 'u8-ready?' could work properly without it.

It does a poll with a timeout of 0.

Andy
-- 
http://wingolog.org/

From debbugs-submit-bounces@debbugs.gnu.org Mon Jun 20 15:23:49 2016
From: Andy Wingo
To: Mark H Weaver
Cc: 10627-done@debbugs.gnu.org
Subject: Re: bug#10627: char-ready? is broken for multibyte encodings
Date: Mon, 20 Jun 2016 21:23:35 +0200
Message-ID: <87y45z8y14.fsf@pobox.com>
In-Reply-To: <87wqtuyhgi.fsf@pobox.com>

On Tue 26 Feb 2013 20:59, Andy Wingo writes:

> On Tue 26 Feb 2013 20:50, Mark H Weaver writes:
>
>> Andy Wingo writes:
>>> Are you proposing that `char-ready?' do a nonblocking read if
>>> the buffer is empty? That could work.
>>
>> Yes. I suspect that something along these lines is already implemented,
>> because I don't see how 'u8-ready?' could work properly without it.
>
> It does a poll with a timeout of 0.

In the end I added this to the manual:

    Note that @code{char-ready?} only works reliably for terminals and
    sockets with one-byte encodings. Under the hood it will return
    @code{#t} if the port has any input buffered, or if the file
    descriptor that backs the port polls as readable, indicating that
    Guile can fetch more bytes from the kernel. However being able to
    fetch one byte doesn't mean that a full character is available;
    @xref{Encoding}. Also, on many systems it's possible for a file
    descriptor to poll as readable, but then block when it comes time
    to read bytes.
    Note also that on Linux kernels, all file ports backed by files
    always poll as readable. For non-file ports, this procedure always
    returns @code{#t}, except for soft ports, which have a
    @code{char-ready?} handler. @xref{Soft Ports}.

    In short, this is a legacy procedure whose semantics are hard to
    provide. However it is a useful check to see if any input is
    buffered. @xref{Non-Blocking I/O}.

We could try a non-blocking read but at that point we should just
provide a non-blocking read-char, and allow users to unread-char. That
would be a different bug :)

Andy

From unknown Sun Jun 22 11:35:47 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Tue, 19 Jul 2016 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
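
Annotation (not part of the original messages): the gap the manual text
describes, "being able to fetch one byte doesn't mean that a full
character is available", can be reproduced outside Guile. The sketch
below is Python on a Unix-like system, not Guile code; it uses a pipe
and select() with a zero timeout as a stand-in for the zero-timeout
poll mentioned in the thread, and the helper name byte_ready is
hypothetical.

```python
import codecs
import os
import select


def byte_ready(fd: int) -> bool:
    """Zero-timeout readiness check: True if at least one byte can be read."""
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)


encoded = "\u00e9".encode("utf-8")    # e-acute: two bytes in UTF-8
r, w = os.pipe()

before = byte_ready(r)                # nothing has been written yet

os.write(w, encoded[:1])              # deliver only the first byte
after = byte_ready(r)                 # the descriptor now polls as readable

# An incremental decoder shows that one byte is not yet one character.
decoder = codecs.getincrementaldecoder("utf-8")()
partial = decoder.decode(os.read(r, 1))    # empty string: still incomplete

os.write(w, encoded[1:])              # deliver the remaining byte
complete = decoder.decode(os.read(r, 1))   # now the full character

os.close(r)
os.close(w)
```

Here the descriptor is readable after the first write even though the
decoder has produced no character, which is the situation in which a
byte-level char-ready? answers #t and a following read-char may still
wait.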