From debbugs-submit-bounces@debbugs.gnu.org Mon Mar 29 12:34:50 2010
Received: (at submit) by debbugs.gnu.org; 29 Mar 2010 16:34:50 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1NwHvN-0001jp-1K for submit@debbugs.gnu.org; Mon, 29 Mar 2010 12:34:50 -0400
Received: from mail.gnu.org ([199.232.76.166] helo=mx10.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1NwGp5-0000nI-Rt for submit@debbugs.gnu.org; Mon, 29 Mar 2010 11:24:17 -0400
Received: from lists.gnu.org ([199.232.76.165]:51456) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1NwGoz-0003b0-Cm for submit@debbugs.gnu.org; Mon, 29 Mar 2010 11:24:09 -0400
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1NwGoy-0008Sa-Bz for bug-gnu-emacs@gnu.org; Mon, 29 Mar 2010 11:24:08 -0400
Received: from [140.186.70.92] (port=45707 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NwGov-0006Kh-Dy for bug-gnu-emacs@gnu.org; Mon, 29 Mar 2010 11:24:07 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.0
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1NwGaj-0004nD-KC for bug-gnu-emacs@gnu.org; Mon, 29 Mar 2010 11:09:27 -0400
Received: from aristotle.tamu.edu ([128.194.75.5]:27894) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1NwGaj-0004n8-CX for bug-gnu-emacs@gnu.org; Mon, 29 Mar 2010 11:09:25 -0400
Received: from localhost (localhost [127.0.0.1]) by aristotle.tamu.edu (Postfix) with ESMTP id 2B731E041C for ; Mon, 29 Mar 2010 10:09:19 -0500 (CDT)
Date: Mon, 29 Mar 2010 10:09:19 -0500 (CDT)
Message-Id: <20100329.100919.319083499807539873.rasmith@aristotle.tamu.edu>
To: bug-gnu-emacs@gnu.org
Subject: 23.1; search-forward in
 unibyte buffer for \377
From: rasmith@tamu.edu
X-Mailer: Mew version 6.3 on Emacs 23.1 / Mule 6.0 (HANACHIRUSATO)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4)
X-Spam-Score: -4.2 (----)
X-Debbugs-Envelope-To: submit
X-Mailman-Approved-At: Mon, 29 Mar 2010 12:34:47 -0400
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
Reply-To: rasmith@tamu.edu
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -5.4 (-----)

Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the bug-gnu-emacs@gnu.org mailing list,
and to the gnu.emacs.bug news group.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

search-forward fails to find a unibyte \377 in a raw unibyte buffer.

I use "cgreek", a package written by Naoto Takahashi for handling
polytonic (ancient, fully accented) Greek. It includes a file,
cgreek-tlg.el, for processing the files in the Thesaurus Linguae
Graecae, which have their own unique formats. In these files, the byte
\377 is used as a string terminator.

Prior to emacs23, these files could be processed by reading the file in
with insert-file-contents-literally, making the buffer unibyte with
(set-buffer-multibyte nil), and searching for the string terminator with
(search-forward (char-to-string ?\xff)). However, that search now fails
to find a single byte \377 and instead matches on the two-byte sequence
\231\277.

Changing the search function to (search-forward (unibyte-string ?\377))
has the same result.

On investigation, I see the following:

After further investigation, I'm not certain it's a bug: it may be an
intentional part of the modifications to accommodate utf-8. Here are the
details:

In a multibyte buffer (set-buffer-multibyte t),

  (search-forward (char-to-string ?\xff)) matches utf-8 "ÿ" (i.e. \303\277)
  (search-forward (char-to-string ?\377)) matches utf-8 "ÿ"
  (search-forward (unibyte-string ?\377)) matches byte \377

In a unibyte buffer (set-buffer-multibyte nil)

  (search-forward (char-to-string ?\xff)) matches \231\277
  (search-forward (char-to-string ?\377)) matches \231\277
  (search-forward (unibyte-string ?\377)) matches \231\277

In other words, search-forward cannot find byte \377 when searching in a
*unibyte* buffer, but it can find that same byte if the buffer is
changed to multibyte. The reason is that in a unibyte buffer,
search-forward apparently changes byte \377 to a two-byte representation
(but not to utf-8, which would be \303\277).

This may be exactly the intended behavior of search-forward, but it
breaks scripts expecting search-forward to be able to find a single high
8-bit byte in a unibyte buffer. In context, changing the buffer to
multibyte is not a solution. The code in which I found this error can be
fixed by replacing

  (search-forward (char-to-string ?\xff))

with

  (skip-chars-forward "^\377")
  (forward-char 1)

(fix provided by Naoto Takahashi)

However, that means that scripts counting on the old behavior of
search-forward will have to be modified.

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
    `bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/local/share/emacs/23.1/etc/DEBUG for instructions.

In GNU Emacs 23.1.1 (amd64-portbld-freebsd8.0, GTK+ Version 2.18.7)
 of 2010-03-25 on aristotle.tamu.edu
Windowing system distributor `The X.Org Foundation', version 11.0.10605000
configured using `configure '--with-x-toolkit=gtk' '--x-libraries=/usr/local/lib' '--x-includes=/usr/local/include' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/info/' '--build=amd64-portbld-freebsd8.0' 'build_alias=amd64-portbld-freebsd8.0' 'CC=cc' 'CFLAGS=-O2 -pipe -fno-strict-aliasing' 'LDFLAGS=-L/usr/local/lib -lintl' 'CPPFLAGS=-I/usr/local/include''

Important settings:
  value of $LC_ALL: en_US.UTF-8
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
o C-q 0 0 0 C-q 3 7 7 C-x C-e C-x o
C-q 2 3 1 ] C-q 2 7 7 C-e C-x C-e C-x C-e
C-k C-y C-y t C-x C-e C-x C-e C-x o C-x C-e
( s e a r c h - f o r w a r d SPC ( c h a r -
t o - s t r i o n g g g SPC n g SPC
? \ x f f ) ) C-x C-e C-x o C-x
C-e C-e C-x C-e C-e C-x C-e C-x C-e
C-e C-x C-e C-e C-x C-e C-x o
C-q 3 7 7 C-x C-e C-x C-e C-x C-e
C-x C-e C-e C-x C-e C-e C-x C-e C-e C-x C-e
M-x r e p o r t b

Recent messages:
Entering debugger...
326
Entering debugger...
nil
369 [3 times]
t
Entering debugger...
374 [2 times]
366
nil
369 [3 times]

From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 31 14:00:58 2010
Received: (at control) by debbugs.gnu.org; 31 Mar 2010 18:00:59 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Nx2Dq-0008Gy-OB for submit@debbugs.gnu.org; Wed, 31 Mar 2010 14:00:58 -0400
Received: from fencepost.gnu.org ([140.186.70.10]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Nx2Dp-0008Gs-12 for control@debbugs.gnu.org; Wed, 31 Mar 2010 14:00:57 -0400
Received: from rgm by fencepost.gnu.org with local (Exim 4.69) (envelope-from ) id 1Nx2Dj-0000uY-FD; Wed, 31 Mar 2010 14:00:51 -0400
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <19379.36307.342420.168332@fencepost.gnu.org>
Date: Wed, 31 Mar 2010 14:00:51 -0400
From: Glenn Morris
To: control
Subject: control
X-Attribution: GM
X-Mailer: VM (www.wonderworks.com/vm), GNU Emacs (www.gnu.org/software/emacs)
X-Hue: cyan
X-Ran: |xb+Qi!x3@qp^&UV8ipxV]x}wjI#jV~r]2Qrm_)s{j
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -5.1 (-----)

merge 5797 5799
severity 5808 minor
reassign 5811 emacs,ns
unarchive 5365
forcemerge 5365 5810

From debbugs-submit-bounces@debbugs.gnu.org Sun Sep 18 16:09:59 2011
Received: (at control) by debbugs.gnu.org; 18 Sep 2011 20:10:00 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1R5Ngc-0002Kc-0d for submit@debbugs.gnu.org; Sun, 18 Sep 2011 16:09:59 -0400
Received: from hermes.netfonds.no ([80.91.224.195]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1R5NgZ-0002KT-Cu for control@debbugs.gnu.org; Sun, 18 Sep 2011 16:09:56 -0400
Received: from cm-84.215.51.58.getinternet.no ([84.215.51.58]
 helo=stories.gnus.org) by hermes.netfonds.no with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1R5Nbf-00058I-KQ for control@debbugs.gnu.org; Sun, 18 Sep 2011 22:04:51 +0200
Date: Sun, 18 Sep 2011 22:01:30 +0200
Message-Id:
To: control@debbugs.gnu.org
From: Lars Magne Ingebrigtsen
Subject: control message for bug #5799
X-MailScanner-ID: 1R5Nbf-00058I-KQ
X-Netfonds-MailScanner: Found to be clean
X-Netfonds-MailScanner-From: larsi@gnus.org
MailScanner-NULL-Check: 1316981091.90978@XNzrbiYqY0fI4P/hrs6+Gg
X-Spam-Status: No
X-Spam-Score: -2.6 (--)
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -2.6 (--)

tags 5799 fixed
close 5799 24.1

From unknown Mon Aug 18 11:17:05 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Mon, 17 Oct 2011 11:24:08 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator