From debbugs-submit-bounces@debbugs.gnu.org Sun May 31 13:52:52 2015 Received: (at submit) by debbugs.gnu.org; 31 May 2015 17:52:52 +0000 Received: from localhost ([127.0.0.1]:35064 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yz7Pm-0002AU-JI for submit@debbugs.gnu.org; Sun, 31 May 2015 13:52:51 -0400 Received: from eggs.gnu.org ([208.118.235.92]:46525) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yz4d6-0006Zp-OO for submit@debbugs.gnu.org; Sun, 31 May 2015 10:54:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Yz4d0-0004FF-NW for submit@debbugs.gnu.org; Sun, 31 May 2015 10:54:19 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:48378) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yz4d0-0004FB-Ka for submit@debbugs.gnu.org; Sun, 31 May 2015 10:54:18 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60022) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yz4cz-0004sL-BQ for bug-gnu-emacs@gnu.org; Sun, 31 May 2015 10:54:18 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Yz4cw-0004Dc-54 for bug-gnu-emacs@gnu.org; Sun, 31 May 2015 10:54:17 -0400 Received: from tower.recompile.se ([88.80.28.95]:55630) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yz4cv-0004DN-Uf for bug-gnu-emacs@gnu.org; Sun, 31 May 2015 10:54:14 -0400 Received: from localhost (localhost [127.0.0.1]) by tower.recompile.se (Postfix) with ESMTP id A654736E530; Sun, 31 May 2015 16:54:09 +0200 (CEST) Received: by tower.recompile.se (Postfix, from userid 1000) id 7858336E4C9; Sun, 31 May 2015 16:54:09 +0200 (CEST) From: Teddy Hogeborn To: bug-gnu-emacs@gnu.org Subject: info.el bug fix; Interprets Info format wrongly 
Date: Sun, 31 May 2015 16:54:05 +0200 Message-ID: <87d21gpzle.fsf@tower.recompile.se> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Sun, 31 May 2015 13:52:48 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable The Info file format (see (texinfo)Info Format Tag Table.) is documented as having the reference position in bytes. However, the info.el functions "Info-find-in-tag-table-1", "Info-read-subfile", and "Info-search" read the byte value and add it to (point-min), which is a character position, not a byte position. This causes the Emacs Info reader to jump to the wrong position in Info files with a lot of non-ASCII characters. Solution: Convert the read value to position using byte-to-position: diff --git a/lisp/info.el b/lisp/info.el index 80428e7..b179510 100644 --- a/lisp/info.el +++ b/lisp/info.el @@ -1020,7 +1020,8 @@ which the match was found." (beginning-of-line) (when (re-search-forward regexp nil t) (list (string-equal "Ref:" (match-string 1)) - (+ (point-min) (read (current-buffer))) + (+ (point-min) (byte-to-position + (read (current-buffer)))) major-mode))))) (defun Info-find-in-tag-table (marker regexp &optional strict-case) @@ -1523,7 +1524,9 @@ is non-nil)." 
thisfilepos thisfilename) (search-forward ": ") (setq thisfilename (buffer-substring beg (- (point) 2))) - (setq thisfilepos (+ (point-min) (read (current-buffer)))) + (setq thisfilepos (+ (point-min) + (byte-to-position + (read (current-buffer))))) ;; read in version 19 stops at the end of number. ;; Advance to the next line. (forward-line 1) @@ -2013,9 +2016,11 @@ If DIRECTION is `backward', search in the reverse direction." (re-search-backward "\\(^.*\\): [0-9]+$") (re-search-forward "\\(^.*\\): [0-9]+$")) (goto-char (+ (match-end 1) 2)) - (setq list (cons (cons (+ (point-min) - (read (current-buffer))) - (match-string-no-properties 1)) + (setq list (cons (cons + (+ (point-min) + (byte-to-position + (read (current-buffer)))) + (match-string-no-properties 1)) list)) (goto-char (if backward (1- (match-beginning 0)) Suggested ChangeLog: ---- Convert reference byte positions from Info file to character position. * lisp/info.el (Info-find-in-tag-table-1, Info-read-subfile) (Info-search): Convert position read from Info file from bytes to character position. Patch by Teddy Hogeborn . 
---- /Teddy Hogeborn --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBCAAGBQJVayCNAAoJEOubGwHeBXLk9cYQAJ6KYtVDeXZxTT7Nx9ISK0zQ soMH6XJbjeWVCzounDoVCS/fZdrLbmmrPQO9SLlC34AWpfeE/CjHqHQlCFO22M68 6N/tJjQ5Zy7t6nH2i9p3KRfoohmZpewCMwvmW9EY7BapwDBezG8xzkpr3mhgYfUV S+0DyLB3pElkxDhemcrjVLsGbtSqvBnjyIFWV//jLNmy2c/edMWP+RaNI3h/6x2I uvX0d+4cOLIgVLGGdUkO2EYTn9Vtzqb3tOjLTkDkACywtma+bgDMzsap9vcQjI+l oioi9HPs+awW99hDs19lfiuKx2Qi3kglSKbkJeffTbO9fFg+VdfufKTosA2/GOP3 KSHWIKuWwYlr5nCpcWpBd3/pn6xXL6QKFBMwnK0vZRglJgtFQpXGB/h/CLH6khzj GANWnm0wV5R6l/w5zxF5MLVH2wqUKZJjm8klK00iPQtrY/m+0WF6iRk5dO9MjOla Z0iQgLTBH1B81yCCisXEMld44GgQAFb9shQN4/8yYAhnQ53mftkMgZiq6wmoOl/S AxR362KfWpRrKn/zDdbDH0/lrzUS8XNKFH/pCX7gxoxuB4Isdt9oBypw+2mLjhjt RuiFWRztPTEoaWsev0nMBhzEDdXrKvTrkfsFWDUqHHBpqTYQF1sobo32MsKq7/NB UAZlLhjc5xs8jFJzFLzV =Ccju -----END PGP SIGNATURE----- --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Sun May 31 14:35:47 2015 Received: (at control) by debbugs.gnu.org; 31 May 2015 18:35:48 +0000 Received: from localhost ([127.0.0.1]:35090 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yz85L-0003Cr-Ci for submit@debbugs.gnu.org; Sun, 31 May 2015 14:35:47 -0400 Received: from tower.recompile.se ([88.80.28.95]:54727 ident=postfix) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yz85H-0003Ci-W2 for control@debbugs.gnu.org; Sun, 31 May 2015 14:35:44 -0400 Received: from localhost (localhost [127.0.0.1]) by tower.recompile.se (Postfix) with ESMTP id DDF4E36E530; Sun, 31 May 2015 20:35:42 +0200 (CEST) Received: by tower.recompile.se (Postfix, from userid 1000) id BF1E936E4C9; Sun, 31 May 2015 20:35:42 +0200 (CEST) From: Teddy Hogeborn To: control@debbugs.gnu.org Content-Transfer-Encoding: 8bit Subject: 20704 is duplicate of 13431 Organization: Recompile Date: Sun, 31 May 2015 20:35:42 +0200 Message-ID: <871thwppc1.fsf@tower.recompile.se> User-Agent: Gnus/5.13 (Gnus v5.13) 
Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) forcemerge 13431 20704 tags 13431 +patch tags 20704 +patch stop From debbugs-submit-bounces@debbugs.gnu.org Sun May 31 14:38:28 2015 Received: (at control) by debbugs.gnu.org; 31 May 2015 18:38:28 +0000 Received: from localhost ([127.0.0.1]:35099 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yz87v-0003HJ-VA for submit@debbugs.gnu.org; Sun, 31 May 2015 14:38:28 -0400 Received: from tower.recompile.se ([88.80.28.95]:54750 ident=postfix) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Yz87u-0003HB-3D for control@debbugs.gnu.org; Sun, 31 May 2015 14:38:26 -0400 Received: from localhost (localhost [127.0.0.1]) by tower.recompile.se (Postfix) with ESMTP id 4ACA736E530; Sun, 31 May 2015 20:38:25 +0200 (CEST) Received: by tower.recompile.se (Postfix, from userid 1000) id E15D036E4C9; Sun, 31 May 2015 20:38:24 +0200 (CEST) From: Teddy Hogeborn To: control@debbugs.gnu.org Content-Transfer-Encoding: 8bit Subject: Bug 20704 contains a patch Organization: Recompile Date: Sun, 31 May 2015 20:38:24 +0200 Message-ID: <87twusoan3.fsf@tower.recompile.se> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) tags 20704 + patch stop From debbugs-submit-bounces@debbugs.gnu.org Mon Jun 01 
10:02:10 2015 Received: (at 20704) by debbugs.gnu.org; 1 Jun 2015 14:02:10 +0000 Received: from localhost ([127.0.0.1]:36206 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YzQI5-0000IH-Qf for submit@debbugs.gnu.org; Mon, 01 Jun 2015 10:02:10 -0400 Received: from ironport2-out.teksavvy.com ([206.248.154.181]:54090) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YzQI1-0000Hj-Uz for 20704@debbugs.gnu.org; Mon, 01 Jun 2015 10:02:06 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0ArEwA731xV/3K9xEVcgxCEAoVVuzcJh0sEAgKBPDkUAQEBAQEBAYEKQQWDXQEBBFYjEAs0EhQYDSSIP88jAQEBAQYBAQEBHos6hQUHhC0FnxeGaY0/gUUjgWaCLiKCeAEBAQ X-IPAS-Result: A0ArEwA731xV/3K9xEVcgxCEAoVVuzcJh0sEAgKBPDkUAQEBAQEBAYEKQQWDXQEBBFYjEAs0EhQYDSSIP88jAQEBAQYBAQEBHos6hQUHhC0FnxeGaY0/gUUjgWaCLiKCeAEBAQ X-IronPort-AV: E=Sophos;i="5.13,465,1427774400"; d="scan'208";a="123771618" Received: from 69-196-189-114.dsl.teksavvy.com (HELO ceviche.home) ([69.196.189.114]) by ironport2-out.teksavvy.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 01 Jun 2015 10:01:59 -0400 Received: by ceviche.home (Postfix, from userid 20848) id B1F8B6618B; Mon, 1 Jun 2015 10:01:59 -0400 (EDT) From: Stefan Monnier To: Teddy Hogeborn Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly Message-ID: References: <87d21gpzle.fsf@tower.recompile.se> Date: Mon, 01 Jun 2015 10:01:59 -0400 In-Reply-To: <87d21gpzle.fsf@tower.recompile.se> (Teddy Hogeborn's message of "Sun, 31 May 2015 16:54:05 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 20704 Cc: 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/) Thanks, > + (+ 
(point-min) (byte-to-position > + (read (current-buffer)))) Hmm... this only works if the Info file is encoded in UTF-8. I guess in the case of Info, 99% of the files are just ASCII and there's a chance that the vast majority of the rest is (or will be) UTF-8, so maybe this hack works well in practice. But I think we should define an `Info-bytepos-to-charpos' function for that. It can be defined as an alias for byte-to-position, but at least it concentrates this utf-8 assumption at a single place where we can place a clear comment. Stefan From debbugs-submit-bounces@debbugs.gnu.org Mon Jun 01 11:12:58 2015 Received: (at 20704) by debbugs.gnu.org; 1 Jun 2015 15:12:58 +0000 Received: from localhost ([127.0.0.1]:36305 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YzROc-0001ya-F2 for submit@debbugs.gnu.org; Mon, 01 Jun 2015 11:12:58 -0400 Received: from mtaout20.012.net.il ([80.179.55.166]:55027) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YzROZ-0001yD-8a for 20704@debbugs.gnu.org; Mon, 01 Jun 2015 11:12:56 -0400 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0NP900E00TN0H300@a-mtaout20.012.net.il> for 20704@debbugs.gnu.org; Mon, 01 Jun 2015 18:12:49 +0300 (IDT) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NP900EF1U9CGT30@a-mtaout20.012.net.il>; Mon, 01 Jun 2015 18:12:49 +0300 (IDT) Date: Mon, 01 Jun 2015 18:12:35 +0300 From: Eli Zaretskii Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <83382btqcc.fsf@gnu.org> References: <87d21gpzle.fsf@tower.recompile.se> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 20704 Cc: teddy@recompile.se, 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Date: Mon, 01 Jun 2015 10:01:59 -0400 > Cc: 20704@debbugs.gnu.org > > Thanks, > > > + (+ (point-min) (byte-to-position > > + (read (current-buffer)))) > > Hmm... this only works if the Info file is encoded in UTF-8. > I guess in the case of Info, 99% of the files are just ASCII and there's > a chance that the vast majority of the rest is (or will be) UTF-8, > so maybe this hack works well in practice. Using byte-to-position would make things worse for Latin-1 and the likes. But it shouldn't be hard to add a simple test of buffer-file-coding-system: if it states fixed-size encoding, like any of the 8-bit encodings, or UTF-16, the conversion to character position is trivial. AFAIR, the only problems will be with ISO-2022 derived encodings, and those are really rare in Info. So IMO adding such a simple test would go a long way towards making the solution almost perfect. > But I think we should define an `Info-bytepos-to-charpos' function for that. > It can be defined as an alias for byte-to-position, but at least it > concentrates this utf-8 assumption at a single place where we can place > a clear comment. Right. Thanks. 
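The two suggestions above (a single `Info-bytepos-to-charpos'-style wrapper, plus a test of the coding system) could be combined roughly as follows. This is only a sketch under stated assumptions: the name `my-info-bytepos-to-charpos' is hypothetical and not part of info.el, the buffer is assumed to be multibyte and not narrowed, and BOMs, CRLF line endings and multi-byte charsets are ignored.

```elisp
;; Hypothetical helper for illustration; not the function that was
;; eventually installed in Emacs.
(defun my-info-bytepos-to-charpos (byte-offset coding-system)
  "Return the buffer position for 0-based BYTE-OFFSET in the file.
CODING-SYSTEM is the coding system the file was decoded with.
Return nil when no cheap conversion is known."
  (let ((type (coding-system-type coding-system)))
    (cond
     ;; Single-byte charsets such as Latin-1: one byte per character
     ;; on disk, so the byte offset is already a character offset.
     ;; (Assumption: the charset really is single-byte.)
     ((eq type 'charset) (1+ byte-offset))
     ;; UTF-8: for valid text the buffer's internal representation
     ;; uses the same byte sequences, so `byte-to-position' applies.
     ((eq type 'utf-8) (byte-to-position (1+ byte-offset)))
     ;; Anything else (e.g. ISO-2022 variants): give up.
     (t nil))))

;; Usage: "N" is at file offset 3 in Latin-1 but at offset 5 in UTF-8;
;; both map to buffer position 4.
(with-temp-buffer
  (insert "\u00e9\u00e9\nNode")
  (list (my-info-bytepos-to-charpos 3 'latin-1)
        (my-info-bytepos-to-charpos 5 'utf-8)))
;; => (4 4)
```

Keeping the encoding assumption in one function means the comment about it lives in one place, which is the point made above.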
From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 09 07:09:27 2015 Received: (at 20704) by debbugs.gnu.org; 9 Jun 2015 11:09:27 +0000 Received: from localhost ([127.0.0.1]:45782 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2HPK-0008Df-QH for submit@debbugs.gnu.org; Tue, 09 Jun 2015 07:09:27 -0400 Received: from tower.recompile.se ([88.80.28.95]:46534 ident=postfix) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2HPI-0008DW-SM for 20704@debbugs.gnu.org; Tue, 09 Jun 2015 07:09:25 -0400 Received: from localhost (localhost [127.0.0.1]) by tower.recompile.se (Postfix) with ESMTP id 46D4C36F0C6; Tue, 9 Jun 2015 13:09:23 +0200 (CEST) Received: by tower.recompile.se (Postfix, from userid 1000) id 27F4336D85C; Tue, 9 Jun 2015 13:09:23 +0200 (CEST) From: Teddy Hogeborn To: Eli Zaretskii References: <87d21gpzle.fsf@tower.recompile.se> <83382btqcc.fsf@gnu.org> Content-Transfer-Encoding: 8bit Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly Organization: Recompile Date: Tue, 09 Jun 2015 13:09:08 +0200 In-Reply-To: <83382btqcc.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 01 Jun 2015 18:12:35 +0300") Message-ID: <87h9qhf8a3.fsf@tower.recompile.se> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 20704 Cc: Stefan Monnier , 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Eli Zaretskii writes: > > > + (+ (point-min) (byte-to-position > > > + (read (current-buffer)))) > > > > Hmm... 
this only works if the Info file is encoded in UTF-8. I > > guess in the case of Info, 99% of the files are just ASCII and > > there's a chance that the vast majority of the rest is (or will be) > > UTF-8, so maybe this hack works well in practice. > > Using byte-to-position would make things worse for Latin-1 and the > likes. No, byte-to-position already checks for that: ---- src/marker.c, line 302 /* If this buffer has as many characters as bytes, each character must be one byte. This takes care of the case where enable-multibyte-characters is nil. */ if (best_above == best_above_byte) return bytepos; ---- Therefore, an Info file in Latin-1 should work just fine. > But it shouldn't be hard to add a simple test of > buffer-file-coding-system: if it states fixed-size encoding, like any > of the 8-bit encodings, or UTF-16, > the conversion to character position is trivial. I think you mean UTF-32 instead of UTF-16, since UTF-16 is variable-length. /Teddy Hogeborn --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBCAAGBQJVdslVAAoJEOubGwHeBXLkjiAP/09dbBj0SdNbVDQMFEf6s9gh IRaRxUBFQ1Q4SlgKF+kjtjL34eboWNVzwtA20qITgkVJbeBw0qZqbUhXikeGVQ2k 7/5eYBXoYv4jqel1rq0gy8h0Ywjp/ktlJLQAi24tfKWStrVH85cQJcp3Rb9Z8rZ9 M940EI6J47/SJzl2OoZ5s1c+RIeOVmRxx1hqjFiT0OVaOTaNSCBR5PNjdMgKF5E+ xgPFmTx3MtY+3MxY92S+W1SuYzkxq5QnYsYtMDk0cY3eI5bzYkvRZFgSP/CLA13r 0hcT2YB6DGwytjCI0c1Zb8TlpAsD6M6/jmY7NejKcpuoHrfu7ncc9r/HJHb+rNlz 8ppdDMZfqj4Rbo/D1sANs7PLOyRlAIh1GDkP//izC306eS+PhmYwpJBRxxl288BQ srUI6Q67J2plsIA/ryC0kCtJQY+na7p/ifZRQVq7kaY70VAT7JK0KOTOoqx6JFHN FLQAmlF7UEyY2fiAj65TI/XNibYPrmRa85LRehDeyBPBjmmEZMgLaXdAjn4i6tOu h0Cm/NUe4Tk3T8nUGxNc2h7fm3gwbxLTCSl5APKWnD/ATzTdQLcHnsVFxfiy/iwV mTRxQfK8auLubxhgdSdvoUu4PkYTBEcSDOXY+dd8oLudkgOkf81ErBSY5swEdXZP 7xvvoJKjRmFSUTvHY2GD =Dt/N -----END PGP SIGNATURE----- --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 09 10:29:34 2015 Received: (at 20704) by debbugs.gnu.org; 9 
Jun 2015 14:29:35 +0000 Received: from localhost ([127.0.0.1]:46335 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2KX0-00064T-5Q for submit@debbugs.gnu.org; Tue, 09 Jun 2015 10:29:34 -0400 Received: from mtaout21.012.net.il ([80.179.55.169]:49948) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2KWw-00064D-Pg for 20704@debbugs.gnu.org; Tue, 09 Jun 2015 10:29:31 -0400 Received: from conversion-daemon.a-mtaout21.012.net.il by a-mtaout21.012.net.il (HyperSendmail v2007.08) id <0NPO00000LKUV600@a-mtaout21.012.net.il> for 20704@debbugs.gnu.org; Tue, 09 Jun 2015 17:29:21 +0300 (IDT) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout21.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NPO00098LKWU940@a-mtaout21.012.net.il>; Tue, 09 Jun 2015 17:29:21 +0300 (IDT) Date: Tue, 09 Jun 2015 17:29:09 +0300 From: Eli Zaretskii Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly In-reply-to: <87h9qhf8a3.fsf@tower.recompile.se> X-012-Sender: halo1@inter.net.il To: Teddy Hogeborn Message-id: <83mw09t0p6.fsf@gnu.org> References: <87d21gpzle.fsf@tower.recompile.se> <83382btqcc.fsf@gnu.org> <87h9qhf8a3.fsf@tower.recompile.se> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 20704 Cc: monnier@iro.umontreal.ca, 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Teddy Hogeborn > Cc: Stefan Monnier , 20704@debbugs.gnu.org > Date: Tue, 09 Jun 2015 13:09:08 +0200 > > Eli Zaretskii writes: > > > > > + (+ (point-min) (byte-to-position > > > > + (read (current-buffer)))) > > > > > > Hmm... this only works if the Info file is encoded in UTF-8. 
I > > > guess in the case of Info, 99% of the files are just ASCII and > > > there's a chance that the vast majority of the rest is (or will be) > > > UTF-8, so maybe this hack works well in practice. > > > > Using byte-to-position would make things worse for Latin-1 and the > > likes. > > No, byte-to-position already checks for that: > > ---- src/marker.c, line 302 > /* If this buffer has as many characters as bytes, > each character must be one byte. > This takes care of the case where enable-multibyte-characters is nil. */ > if (best_above == best_above_byte) > return bytepos; > ---- I think you are misreading the code: the above snippet is for unibyte buffers, whereas a Latin-1 encoded Info file will be read into a multibyte buffer (and decoded into the internal Emacs representation of characters during the read). So this optimization is not going to work in that case. IOW, what matters for byte-to-position is the encoding used in representing characters in Emacs buffers, not the one used externally by the Info file on disk. > Therefore, an Info file in Latin-1 should work just fine. > > > But it shouldn't be hard to add a simple test of > > buffer-file-coding-system: if it states fixed-size encoding, like any > > of the 8-bit encodings, or UTF-16, > > the conversion to character position is trivial. > > I think you mean UTF-32 instead of UTF-16, since UTF-16 is variable- > length. UTF-16 is fixed length for characters in the BMP. 
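The distinction drawn above, between the encoding on disk and the representation inside the buffer, can be seen in a small experiment. The following is an illustrative sketch only: the function name `my-latin1-offset-demo' and the sample text are made up, and the default multibyte temporary buffer is assumed.

```elisp
;; Illustration only: name and sample text are hypothetical.
(defun my-latin1-offset-demo ()
  "Compare a Latin-1 file offset with buffer byte and char positions."
  (let* ((text "\u00e9\u00e9ab\nNode")   ; two e-acute, then ASCII
         (on-disk (encode-coding-string text 'latin-1))
         (file-offset 5))                ; 0-based offset of "N" on disk
    (with-temp-buffer
      (insert text)
      (list
       ;; Bytes the text occupies in a Latin-1 file.
       (string-bytes on-disk)
       ;; Bytes the same text occupies inside the multibyte buffer.
       (1- (position-bytes (point-max)))
       ;; Wrong: treat the file offset as a buffer byte position.
       (char-to-string
        (char-after (byte-to-position (1+ file-offset))))
       ;; Right for a single-byte encoding: the offset counts characters.
       (char-to-string (char-after (1+ file-offset)))))))

(my-latin1-offset-demo)
;; => (9 11 "b" "N")
```

Each non-ASCII Latin-1 character takes one byte on disk but two in the buffer, so feeding the file offset to byte-to-position lands before the intended character, and the error grows with the number of such characters that precede it.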
From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 09 12:01:45 2015 Received: (at 20704) by debbugs.gnu.org; 9 Jun 2015 16:01:46 +0000 Received: from localhost ([127.0.0.1]:46392 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2LyD-0008Hk-A8 for submit@debbugs.gnu.org; Tue, 09 Jun 2015 12:01:45 -0400 Received: from mercure.iro.umontreal.ca ([132.204.24.67]:48792) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2LyC-0008Ha-0A for 20704@debbugs.gnu.org; Tue, 09 Jun 2015 12:01:44 -0400 Received: from hidalgo.iro.umontreal.ca (hidalgo.iro.umontreal.ca [132.204.27.50]) by mercure.iro.umontreal.ca (Postfix) with ESMTP id 0AAF09C133; Tue, 9 Jun 2015 12:01:43 -0400 (EDT) Received: from lechon.iro.umontreal.ca (lechon.iro.umontreal.ca [132.204.27.242]) by hidalgo.iro.umontreal.ca (Postfix) with ESMTP id 09F411E5B99; Tue, 9 Jun 2015 12:01:19 -0400 (EDT) Received: by lechon.iro.umontreal.ca (Postfix, from userid 20848) id DC127B416C; Tue, 9 Jun 2015 12:01:18 -0400 (EDT) From: Stefan Monnier To: Teddy Hogeborn Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly Message-ID: References: <87d21gpzle.fsf@tower.recompile.se> <83382btqcc.fsf@gnu.org> <87h9qhf8a3.fsf@tower.recompile.se> Date: Tue, 09 Jun 2015 12:01:18 -0400 In-Reply-To: <87h9qhf8a3.fsf@tower.recompile.se> (Teddy Hogeborn's message of "Tue, 09 Jun 2015 13:09:08 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-DIRO-MailScanner-Information: Please contact the ISP for more information X-DIRO-MailScanner: Found to be clean X-DIRO-MailScanner-SpamCheck: n'est pas un polluriel, SpamAssassin (score=-2.82, requis 5, autolearn=not spam, ALL_TRUSTED -2.82, MC_TSTLAST 0.00) X-DIRO-MailScanner-From: monnier@iro.umontreal.ca X-Spam-Status: No X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 20704 Cc: Eli Zaretskii , 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) >> Using byte-to-position would make things worse for Latin-1 and the >> likes. > No, byte-to-position already checks for that: > ---- src/marker.c, line 302 > /* If this buffer has as many characters as bytes, > each character must be one byte. > This takes care of the case where enable-multibyte-characters is nil. */ > if (best_above == best_above_byte) > return bytepos; > ---- > Therefore, an Info file in Latin-1 should work just fine. No, because the representation in the buffer will still be a utf-8 derivative, so best_above will generally not be equal to best_above_byte. Stefan From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 10 13:50:38 2015 Received: (at 20704) by debbugs.gnu.org; 10 Jun 2015 17:50:39 +0000 Received: from localhost ([127.0.0.1]:47432 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2k97-0002ZS-I0 for submit@debbugs.gnu.org; Wed, 10 Jun 2015 13:50:38 -0400 Received: from pruche.dit.umontreal.ca ([132.204.246.22]:50132) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2k93-0002ZC-Tm for 20704@debbugs.gnu.org; Wed, 10 Jun 2015 13:50:35 -0400 Received: from ceviche.home (lechon.iro.umontreal.ca [132.204.27.242]) by pruche.dit.umontreal.ca (8.14.1/8.14.1) with ESMTP id t5AHoTrN000811; Wed, 10 Jun 2015 13:50:30 -0400 Received: by ceviche.home (Postfix, from userid 20848) id C51D466166; Wed, 10 Jun 2015 13:50:29 -0400 (EDT) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly Message-ID: References: <87d21gpzle.fsf@tower.recompile.se> <83382btqcc.fsf@gnu.org> Date: Wed, 10 Jun 2015 13:50:29 -0400 In-Reply-To: <83382btqcc.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 01 Jun 2015 18:12:35 +0300") User-Agent: 
Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-NAI-Spam-Flag: NO X-NAI-Spam-Threshold: 5 X-NAI-Spam-Score: 0 X-NAI-Spam-Rules: 1 Rules triggered RV5333=0 X-NAI-Spam-Version: 2.3.0.9393 : core <5333> : inlines <3178> : streams <1453233> : uri <1955083> X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: 20704 Cc: teddy@recompile.se, 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.3 (-) > Using byte-to-position would make things worse for Latin-1 and the likes. There's also the problem of EOL encoding, but I'll just ignore it for now. Could someone test the patch below? Stefan diff --git a/lisp/info.el b/lisp/info.el index 9602337..0de7f1e 100644 --- a/lisp/info.el +++ b/lisp/info.el @@ -1020,7 +1020,7 @@ which the match was found." (beginning-of-line) (when (re-search-forward regexp nil t) (list (string-equal "Ref:" (match-string 1)) - (+ (point-min) (read (current-buffer))) + (filepos-to-bufferpos (read (current-buffer)) 'approximate) major-mode))))) (defun Info-find-in-tag-table (marker regexp &optional strict-case) @@ -1187,7 +1187,8 @@ is non-nil)." (when found ;; FOUND is (ANCHOR POS MODE). - (setq guesspos (nth 1 found)) + (setq guesspos (filepos-to-bufferpos (nth 1 found) + 'approximate)) ;; If this is an indirect file, determine which ;; file really holds this node and read it in. @@ -1203,8 +1204,7 @@ is non-nil)." (throw 'foo t))))) ;; Else we may have a node, which we search for: - (goto-char (max (point-min) - (- (byte-to-position guesspos) 1000))) + (goto-char (max (point-min) (- guesspos 1000))) ;; Now search from our advised position (or from beg of ;; buffer) to find the actual node. First, check @@ -1523,7 +1523,9 @@ is non-nil)." 
thisfilepos thisfilename) (search-forward ": ") (setq thisfilename (buffer-substring beg (- (point) 2))) - (setq thisfilepos (+ (point-min) (read (current-buffer)))) + (setq thisfilepos + (filepos-to-bufferpos (read (current-buffer)) + 'approximate)) ;; read in version 19 stops at the end of number. ;; Advance to the next line. (forward-line 1) @@ -1554,7 +1556,7 @@ is non-nil)." ;; Don't add the length of the skipped summary segment to ;; the value returned to `Info-find-node-2'. (Bug#14125) (if (numberp nodepos) - (+ (- nodepos lastfilepos) (point-min))))) + (- nodepos lastfilepos)))) (defun Info-unescape-quotes (value) "Unescape double quotes and backslashes in VALUE." @@ -2013,8 +2015,9 @@ If DIRECTION is `backward', search in the reverse direction." (re-search-backward "\\(^.*\\): [0-9]+$") (re-search-forward "\\(^.*\\): [0-9]+$")) (goto-char (+ (match-end 1) 2)) - (setq list (cons (cons (+ (point-min) - (read (current-buffer))) + (setq list (cons (cons (filepos-to-bufferpos + (read (current-buffer)) + 'approximate) (match-string-no-properties 1)) list)) (goto-char (if backward diff --git a/lisp/international/mule-util.el b/lisp/international/mule-util.el index eae787b..1f7df0b 100644 --- a/lisp/international/mule-util.el +++ b/lisp/international/mule-util.el @@ -313,6 +313,35 @@ per-character basis, this may not be accurate." (throw 'tag3 charset))) charset-list) nil))))))))) + +;;;###autoload +(defun filepos-to-bufferpos (byte &optional quality coding-system) + "Try to return the buffer position corresponding to a particular file position. +The file position is given as a BYTE count. +The function presumes the file is encoded with CODING-SYSTEM, which defaults +to `buffer-file-coding-system'. +QUALITY can be: + `approximate', in which case we may cut some corners to avoid + excessive work. + nil, in which case we may return nil rather than an approximation." + ;; `exact', in which case we may end up re-(en|de)coding a large + ;; part of the file. 
+ (unless coding-system (setq coding-system buffer-file-coding-system)) + (let ((eol (coding-system-eol-type coding-system)) + (type (coding-system-type coding-system)) + (pm (save-restriction (widen) (point-min)))) + (pcase (cons type eol) + (`(utf-8 . ,(or 0 2)) + (let ((bom-offset (coding-system-get coding-system :bom))) + (byte-to-position + (+ pm (max 0 (- byte (if bom-offset 3 0))))))) + ;; FIXME: What if it's a 2-byte charset? Are there such beasts? + (`(charset . ,(or 0 2)) (+ pm byte)) + (_ + (pcase quality + (`approximate (+ pm (byte-to-position byte))) + ;; (`exact ...) + ))))) (provide 'mule-util) From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 10 14:21:40 2015 Received: (at 20704) by debbugs.gnu.org; 10 Jun 2015 18:21:40 +0000 Received: from localhost ([127.0.0.1]:47444 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2kd9-0003vk-Tf for submit@debbugs.gnu.org; Wed, 10 Jun 2015 14:21:40 -0400 Received: from mtaout24.012.net.il ([80.179.55.180]:54983) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2kd6-0003vW-TG for 20704@debbugs.gnu.org; Wed, 10 Jun 2015 14:21:38 -0400 Received: from conversion-daemon.mtaout24.012.net.il by mtaout24.012.net.il (HyperSendmail v2007.08) id <0NPQ00M00QI65T00@mtaout24.012.net.il> for 20704@debbugs.gnu.org; Wed, 10 Jun 2015 21:13:18 +0300 (IDT) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout24.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NPQ00JMMQM67730@mtaout24.012.net.il>; Wed, 10 Jun 2015 21:13:18 +0300 (IDT) Date: Wed, 10 Jun 2015 21:21:25 +0300 From: Eli Zaretskii Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <83twufs9ui.fsf@gnu.org> References: <87d21gpzle.fsf@tower.recompile.se> <83382btqcc.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 20704 Cc: teddy@recompile.se, 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: teddy@recompile.se, 20704@debbugs.gnu.org > Date: Wed, 10 Jun 2015 13:50:29 -0400 > > > Using byte-to-position would make things worse for Latin-1 and the likes. > > There's also the problem of EOL encoding, but I'll just ignore it for now. That was never a problem before Texinfo 5.x: makeinfo didn't count the CR characters in the CRLF EOLs, and the Info readers removed the CR characters when reading the Info files. But Texinfo 5.x and later does count the CR characters, so the stand-alone Info reader was recently changed to account for that. Which means that Emacs will now have a problem, whereby the byte counts in the tag tables will be inaccurate, and our only hope is the 1000-character tolerance we use to look for the node around the position stated in the tag table will be large enough. 
Read the gory details about that in this thread:

  http://lists.gnu.org/archive/html/bug-texinfo/2014-12/msg00068.html

From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 10 23:02:21 2015 Received: (at 20704) by debbugs.gnu.org; 11 Jun 2015 03:02:21 +0000 Received: from localhost ([127.0.0.1]:50018 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2sl2-0004Mb-Nr for submit@debbugs.gnu.org; Wed, 10 Jun 2015 23:02:21 -0400 Received: from chene.dit.umontreal.ca ([132.204.246.20]:49366) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z2sl1-0004MR-6G for 20704@debbugs.gnu.org; Wed, 10 Jun 2015 23:02:19 -0400 Received: from ceviche.home (lechon.iro.umontreal.ca [132.204.27.242]) by chene.dit.umontreal.ca (8.14.1/8.14.1) with ESMTP id t5B32Gkm009005; Wed, 10 Jun 2015 23:02:17 -0400 Received: by ceviche.home (Postfix, from userid 20848) id 5849B6614A; Wed, 10 Jun 2015 23:02:16 -0400 (EDT) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly Message-ID: References: <87d21gpzle.fsf@tower.recompile.se> <83382btqcc.fsf@gnu.org> <83twufs9ui.fsf@gnu.org> Date: Wed, 10 Jun 2015 23:02:16 -0400 In-Reply-To: <83twufs9ui.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 10 Jun 2015 21:21:25 +0300") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-NAI-Spam-Flag: NO X-NAI-Spam-Threshold: 5 X-NAI-Spam-Score: 0 X-NAI-Spam-Rules: 1 Rules triggered RV5333=0 X-NAI-Spam-Version: 2.3.0.9393 : core <5333> : inlines <3181> : streams <1453440> : uri <1955529> X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details.
Content preview: > But Texinfo 5.x and later does count the CR characters, so the > stand-alone Info reader was recently changed to account for that. > Which means that Emacs will now have a problem, whereby the byte > counts in the tag tables will be inaccurate, and our only hope is the > 1000-character tolerance we use to look for the node around the > position stated in the tag table will be large enough. [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium trust [132.204.246.20 listed in list.dnswl.org] -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay domain 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 2.7 GUARANTEED_100_PERCENT BODY: One hundred percent guaranteed X-Debbugs-Envelope-To: 20704 Cc: teddy@recompile.se, 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.4 (+)

> But Texinfo 5.x and later does count the CR characters, so the
> stand-alone Info reader was recently changed to account for that.
> Which means that Emacs will now have a problem, whereby the byte
> counts in the tag tables will be inaccurate, and our only hope is the
> 1000-character tolerance we use to look for the node around the
> position stated in the tag table will be large enough.

If needed, I think we could make it work reasonably cheaply with
something along the lines of (100% guaranteed untested code):

    (let (pos lines (eol-offset 0))
      (while
          (progn
            (setq pos (byte-to-position (+ pm byte (- eol-offset))))
            (setq lines (1- (line-number-at-pos pos)))
            (not (= lines eol-offset)))
        (setq eol-offset (+ eol-offset lines)))
      pos))

-- Stefan

From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 11 09:12:00 2015 Received: (at 20704) by debbugs.gnu.org; 11 Jun 2015 13:12:00 +0000 Received: from localhost ([127.0.0.1]:50462 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z32H1-0003dG-KX for submit@debbugs.gnu.org; Thu, 11 Jun 2015 09:11:59 -0400 Received: from mtaout20.012.net.il ([80.179.55.166]:38499) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Z32Gz-0003d2-5G for 20704@debbugs.gnu.org; Thu, 11 Jun 2015 09:11:58 -0400 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0NPS00L00789GO00@a-mtaout20.012.net.il> for 20704@debbugs.gnu.org; Thu, 11 Jun 2015 16:11:23 +0300 (IDT)
Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NPS00LFC7AZ6D80@a-mtaout20.012.net.il>; Thu, 11 Jun 2015 16:11:23 +0300 (IDT) Date: Thu, 11 Jun 2015 16:11:16 +0300 From: Eli Zaretskii Subject: Re: bug#20704: info.el bug fix; Interprets Info format wrongly In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <83lhfqs83v.fsf@gnu.org> References: <87d21gpzle.fsf@tower.recompile.se> <83382btqcc.fsf@gnu.org> <83twufs9ui.fsf@gnu.org> X-Spam-Score: 3.7 (+++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: > From: Stefan Monnier > Cc: teddy@recompile.se, 20704@debbugs.gnu.org > Date: Wed, 10 Jun 2015 23:02:16 -0400 > > > But Texinfo 5.x and later does count the CR characters, so the > > stand-alone Info reader was recently changed to account for that. > > Which means that Emacs will now have a problem, whereby the byte > > counts in the tag tables will be inaccurate, and our only hope is the > > 1000-character tolerance we use to look for the node around the > > position stated in the tag table will be large enough. > > If needed, I think we could make it work reasonably cheaply with > something along the lines of (100% guaranteed untested code): [...] 
Content analysis details: (3.7 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [80.179.55.166 listed in list.dnswl.org] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 2.7 GUARANTEED_100_PERCENT BODY: One hundred percent guaranteed X-Debbugs-Envelope-To: 20704 Cc: teddy@recompile.se, 20704@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 3.7 (+++)

> From: Stefan Monnier
> Cc: teddy@recompile.se, 20704@debbugs.gnu.org
> Date: Wed, 10 Jun 2015 23:02:16 -0400
>
> > But Texinfo 5.x and later does count the CR characters, so the
> > stand-alone Info reader was recently changed to account for that.
> > Which means that Emacs will now have a problem, whereby the byte
> > counts in the tag tables will be inaccurate, and our only hope is the
> > 1000-character tolerance we use to look for the node around the
> > position stated in the tag table will be large enough.
>
> If needed, I think we could make it work reasonably cheaply with
> something along the lines of (100% guaranteed untested code):

Sure, but this needs to be conditioned on the EOL encoding we actually
found when we read the file.
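
As a rough illustration of the point above, here is a minimal sketch of a
byte-to-position conversion that applies the CR correction only when the
file was decoded with DOS line ends.  The function name
my-info-file-byte-to-pos and its EOL-TYPE argument are hypothetical; they
are not part of info.el or of any patch in this thread.  The sketch
assumes the file is UTF-8 without a BOM, so that file bytes match the
buffer's internal bytes once the CRs are accounted for.  It also swaps
the additive loop quoted above for a bisection on the number of stripped
CRs, because a plain fixed-point iteration can oscillate on runs of
empty lines.

```elisp
;; Hypothetical helper: the name and signature are illustrative only.
(defun my-info-file-byte-to-pos (byte &optional eol-type)
  "Map 0-based file offset BYTE to a buffer position.
EOL-TYPE is as returned by `coding-system-eol-type' (1 means DOS);
it defaults to the EOL type of `buffer-file-coding-system'.
Assumes the file is UTF-8 without a BOM."
  (let ((eol (or eol-type
                 (coding-system-eol-type buffer-file-coding-system))))
    (save-restriction
      (widen)
      (let ((to-pos (lambda (b)
                      ;; `byte-to-position' is 1-based and returns nil
                      ;; when the byte is past the end of the buffer.
                      (or (byte-to-position (+ (point-min) b))
                          (point-max)))))
        (if (not (eql eol 1))
            ;; LF-only or CR-only line ends take one byte in the file
            ;; and one in the buffer, so no correction is needed.
            (funcall to-pos byte)
          ;; CRLF: each line before the target lost one CR on decoding.
          ;; Bisect for the smallest CR count C such that the number of
          ;; lines before offset (BYTE - C) does not exceed C.
          (let ((lo 0) (hi byte))
            (while (< lo hi)
              (let* ((mid (/ (+ lo hi) 2))
                     (lines (1- (line-number-at-pos
                                 (funcall to-pos (- byte mid))))))
                (if (> lines mid)
                    (setq lo (1+ mid))
                  (setq hi mid))))
            (funcall to-pos (- byte lo))))))))
```

For example, in a buffer holding "ab\ncd\nef" that was decoded from a
CRLF file, file offset 8 (the "e") maps to buffer position 7 when
EOL-TYPE is 1.  Each bisection step calls line-number-at-pos, so this
trades speed for simplicity; the 1000-character tolerance mentioned
earlier would still be needed for non-UTF-8 encodings.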
From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 27 07:44:19 2019 Received: (at control) by debbugs.gnu.org; 27 Jun 2019 11:44:19 +0000 Received: from localhost ([127.0.0.1]:37553 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hgSp9-0006A0-Ek for submit@debbugs.gnu.org; Thu, 27 Jun 2019 07:44:19 -0400 Received: from quimby.gnus.org ([80.91.231.51]:40708) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hgSp7-00069s-Gu for control@debbugs.gnu.org; Thu, 27 Jun 2019 07:44:17 -0400 Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=stories) by quimby.gnus.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hgSp5-0004o4-0m for control@debbugs.gnu.org; Thu, 27 Jun 2019 13:44:16 +0200 Date: Thu, 27 Jun 2019 13:44:14 +0200 Message-Id: To: control@debbugs.gnu.org From: Lars Ingebrigtsen Subject: control message for bug #13431 X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. 
Content preview: close 13431 quit Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) close 13431 quit From unknown Mon Jun 23 02:24:17 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Fri, 26 Jul 2019 11:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator