From unknown Sun Aug 17 22:10:07 2025
Subject: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
X-GNU-PR-Message: report 8528
To: 8528@debbugs.gnu.org
From: Evans Winner
Date: Wed, 20 Apr 2011 15:04:06 -0600
Message-ID: <87bp00iqih.fsf@gmail.com>

My understanding is that a 32-bit GNU Emacs should be able
to open files up to 512 M.  If I am wrong about that, please
let me know.  I have compiled Emacs trunk from source
several times in the last couple of months and somewhere in
the last month or so it seems that the limit on my machine
has become 128 M.  My math could be off, but on the
assumption that 128 Mebibytes = 2^27 bytes = 1024 * 131072
bytes, and starting with emacs -Q I tried:

$ dd if=/dev/zero of=testfile bs=1024 count=131072

and tried to open the file, and got: "Maximum buffer size
exceeded".  Then I tried one K less:

$ dd if=/dev/zero of=testfile bs=1024 count=131071

and the buffer opened.  I have verified using the `top'
command that there is sufficient free memory for the files.
Also, for what it's worth:

ELISP> most-positive-fixnum
==> 536870911

I discovered this as a result of not being able to open a
large (~160 MB) .pdf file that I had earlier been able to
open.

Please let me know if there is any other information I can
provide, or if there is something simple I am doing wrong.

In GNU Emacs 24.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.0.8)
 of 2011-04-19 on braintron
Windowing system distributor `The X.Org Foundation', version 11.0.11001000
configured using `configure  '--with-x-toolkit=gtk3''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

From unknown Sun Aug 17 22:10:07 2025
Subject: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
X-GNU-PR-Message: followup 8528
To: Evans Winner, Paul Eggert
Cc: 8528@debbugs.gnu.org
Reply-To: Eli Zaretskii
Date: Thu, 21 Apr 2011 08:52:34 +0300
From: Eli Zaretskii
In-reply-to: <87bp00iqih.fsf@gmail.com>
Message-id: <83r58w2lst.fsf@gnu.org>
References: <87bp00iqih.fsf@gmail.com>

> From: Evans Winner
> Date: Wed, 20 Apr 2011 15:04:06 -0600
>
> My understanding is that a 32-bit GNU Emacs should be able
> to open files up to 512 M.  If I am wrong about that, please
> let me know.  I have compiled Emacs trunk from source
> several times in the last couple of months and somewhere in
> the last month or so it seems that the limit on my machine
> has become 128 M.  My math could be off, but on the
> assumption that 128 Mebibytes = 2^27 bytes = 1024 * 131072
> bytes, and starting with emacs -Q I tried:
>
> $ dd if=/dev/zero of=testfile bs=1024 count=131072
>
> and tried to open the file, and got: "Maximum buffer size
> exceeded".

This happens because of the following test in insert-file-contents:

      /* Arithmetic overflow can occur if an Emacs integer cannot represent the
         file size, or if the calculations below overflow.  The calculations below
         double the file size twice, so check that it can be multiplied by 4
         safely.

         Also check whether the size is negative, which can happen on a platform
         that allows file sizes greater than the maximum off_t value.  */
      if (! not_regular
          && ! (0 <= st.st_size && st.st_size <= MOST_POSITIVE_FIXNUM / 4))
        error ("Maximum buffer size exceeded");

This test was commented out for the last 2 years, but lately it was
uncommented by Paul Eggert in revision 103841 on the trunk.
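
The arithmetic behind the 128 MiB cutoff can be checked with a small sketch.
It assumes the 32-bit value of most-positive-fixnum given in the report
(536870911).  The names SKETCH_MOST_POSITIVE_FIXNUM and sketch_size_ok are
made up for illustration and are not Emacs identifiers, and the function only
mirrors the quoted condition, not the rest of insert-file-contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 32-bit build, where most-positive-fixnum is 536870911
   as shown in the original report.  */
#define SKETCH_MOST_POSITIVE_FIXNUM INT64_C (536870911)

/* Hypothetical helper mirroring the quoted test for a regular file:
   returns true if a file of SIZE bytes would pass the check.  */
static bool
sketch_size_ok (int64_t size)
{
  /* Integer division: 536870911 / 4 == 134217727, one byte short of 2^27.  */
  return 0 <= size && size <= SKETCH_MOST_POSITIVE_FIXNUM / 4;
}
```

Under these assumptions the limit is 134217727 bytes, so a file of
1024 * 131072 = 134217728 bytes is rejected while 1024 * 131071 bytes is
accepted, which matches the two dd experiments in the report.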

Paul, could you please tell where do you see twice doubling of the
file size in insert-file-contents?  Back in 1999, when this test was
first introduced, there was indeed such doubling.  But even then it
was only when the REPLACE argument was non-nil (according to my
reading of the code).  In any case, that part of code was completely
rewritten since then, and I don't believe we double the file size even
once.

By disabling that test, I was able to visit a 260-MB file on a 32-bit
machine.  So it seems like this test could be removed, if I'm not
missing anything.

From unknown Sun Aug 17 22:10:07 2025
Subject: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
X-GNU-PR-Message: followup 8528
To: Eli Zaretskii
Cc: 8528@debbugs.gnu.org, Evans Winner
Message-ID: <4DAFCC4F.1080900@cs.ucla.edu>
Date: Wed, 20 Apr 2011 23:18:55 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
References: <87bp00iqih.fsf@gmail.com> <83r58w2lst.fsf@gnu.org>
In-Reply-To: <83r58w2lst.fsf@gnu.org>

On 04/20/11 22:52, Eli Zaretskii wrote:
> Paul, could you please tell where do you see twice doubling of the
> file size in insert-file-contents?

I assumed that it was because the internal buffers contain an
Emacs-encoded version of the file, which could be as long as four
times the actual file size, because a single byte in the file
might expand to 4 bytes inside Emacs in some cases.

That would explain the behavior that you saw: if your file's
internal encoding was the same as the external, you wouldn't observe any
problem.  The problem would be exhibited only with files containing
many characters that bloat when read into memory.

However, I didn't investigate the matter thoroughly; perhaps
someone who's more expert on how Emacs encodes things internally
could speak up.

From unknown Sun Aug 17 22:10:07 2025
Subject: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
X-GNU-PR-Message: followup 8528
To: Paul Eggert
Cc: 8528@debbugs.gnu.org, ego111@gmail.com
Reply-To: Eli Zaretskii
Date: Thu, 21 Apr 2011 09:40:26 +0300
From: Eli Zaretskii
In-reply-to: <4DAFCC4F.1080900@cs.ucla.edu>
Message-id: <83mxjk2jl1.fsf@gnu.org>
References: <87bp00iqih.fsf@gmail.com> <83r58w2lst.fsf@gnu.org> <4DAFCC4F.1080900@cs.ucla.edu>

> Date: Wed, 20 Apr 2011 23:18:55 -0700
> From: Paul Eggert
> CC: Evans Winner, 8528@debbugs.gnu.org
>
> On 04/20/11 22:52, Eli Zaretskii wrote:
> > Paul, could you please tell where do you see twice doubling of the
> > file size in insert-file-contents?
>
> I assumed that it was because the internal buffers contain an
> Emacs-encoded version of the file, which could be as long as four
> times the actual file size, because a single byte in the file
> might expand to 4 bytes inside Emacs in some cases.

Actually, it could potentially expand even 5-fold (because Emacs
extends UTF-8 to codepoints as large as 0x3FFFFF).

But we test the buffer size and avoid overflowing it in many other
places, both further down in insert-file-contents and in insdel.c.  If
those are not enough, we could add more such tests, particularly after
decoding the file's contents, where we know the full buffer size in
bytes.

So I think artificially limiting the maximum size of a file that can
be visited in that particular place in insert-file-contents is too
harsh.

> That would explain the behavior that you saw: if your file's
> internal encoding was the same as the external, you wouldn't observe any
> problem.  The problem would be exhibited only with files containing
> many characters that bloat when read into memory.

Right, but wouldn't you agree that such a limitation is too stringent?
E.g., I should be able to use find-file-literally to visit a 512MB
file, but currently I cannot.
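
To illustrate where a 5-byte figure can come from: under the generic
UTF-8 length rule extended to 5-byte sequences, a codepoint above 0x1FFFFF no
longer fits in 4 bytes.  The following is a minimal sketch assuming only that
generic rule.  sketch_utf8_like_len is a hypothetical helper, not an Emacs
function, and Emacs's own macros may treat parts of this range specially.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes a codepoint C occupies under the
   generic UTF-8 scheme extended to 5-byte sequences.  Returns -1 if C
   is outside the range this sketch covers.  */
static int
sketch_utf8_like_len (uint32_t c)
{
  if (c < 0x80)      return 1;  /* 7 payload bits  */
  if (c < 0x800)     return 2;  /* 11 payload bits */
  if (c < 0x10000)   return 3;  /* 16 payload bits */
  if (c < 0x200000)  return 4;  /* 21 payload bits */
  if (c < 0x4000000) return 5;  /* 26 payload bits */
  return -1;
}
```

Under this rule the first codepoint that needs five bytes is 0x200000, so an
encoding that reaches past that value can, in the worst case, turn one input
byte into five bytes of internal text.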

From unknown Sun Aug 17 22:10:07 2025
Subject: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
X-GNU-PR-Message: followup 8528
To: Eli Zaretskii
Cc: 8528@debbugs.gnu.org, ego111@gmail.com
Message-ID: <4DAFD59C.5090602@cs.ucla.edu>
Date: Wed, 20 Apr 2011 23:58:36 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
References: <87bp00iqih.fsf@gmail.com> <83r58w2lst.fsf@gnu.org> <4DAFCC4F.1080900@cs.ucla.edu> <83mxjk2jl1.fsf@gnu.org>
In-Reply-To: <83mxjk2jl1.fsf@gnu.org>

On 04/20/11 23:40, Eli Zaretskii wrote:

> Right, but wouldn't you agree that such a limitation is too stringent?

Yes, absolutely, the limit should be removed if possible.

In a brief look at the code, it appeared to me that there were
places where it does not check for integer overflow
in size calculations when converting external to internal form.  So
it could well be that this preliminary check may be needed to
avoid catastrophe later.  I have not checked this out carefully,
though, and I could be wrong.  (One way to find out would be
to test it with a worst-case-bloat file, but I haven't had time
to do that.)

> E.g., I should be able to use find-file-literally to visit a 512MB
> file, but currently I cannot.

If we know that byte bloat cannot occur, which is the case with
find-file-literally, then the divide-by-4 limit should not be needed.
That case should be easy, in that it shouldn't require a lot
of analysis to fix that case safely.

From unknown Sun Aug 17 22:10:07 2025
Subject: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
X-GNU-PR-Message: followup 8528
To: Paul Eggert
Cc: 8528@debbugs.gnu.org, ego111@gmail.com
Reply-To: Eli Zaretskii
Date: Thu, 21 Apr 2011 16:20:54 +0300
From: Eli Zaretskii
In-reply-to: <4DAFD59C.5090602@cs.ucla.edu>
Message-id: <838vv33fm1.fsf@gnu.org>
References: <87bp00iqih.fsf@gmail.com> <83r58w2lst.fsf@gnu.org> <4DAFCC4F.1080900@cs.ucla.edu> <83mxjk2jl1.fsf@gnu.org> <4DAFD59C.5090602@cs.ucla.edu>

> Date: Wed, 20 Apr 2011 23:58:36 -0700
> From: Paul Eggert
> CC: ego111@gmail.com, 8528@debbugs.gnu.org
>
> On 04/20/11 23:40, Eli Zaretskii wrote:
> > Right, but wouldn't you agree that such a limitation is too stringent?
>
> Yes, absolutely, the limit should be removed if possible.
>
> In a brief look at the code, it appeared to me that there were
> places where it does not check for integer overflow
> in size calculations when converting external to internal form.  So
> it could well be that this preliminary check may be needed to
> avoid catastrophe later.  I have not checked this out carefully,
> though, and I could be wrong.

I have now reviewed the code involved in this, and I think the limit
can be lifted.  In general, there could be two ways for us to insert
text into the buffer as a result of calling insert-file-contents:

 (a) directly, by reading from the file into its buffer (or some
     temporary buffer used as part of processing); or

 (b) indirectly, by decoding inserted text through various functions
     in coding.c, which write the decoded text into the destination
     buffer.

I found that both of these ways include tests for potential overflows
of the buffer size.  Inserting text directly is protected because it
enlarges the buffer's gap before inserting text, and make_gap which
does that errors out if the new size will overflow (actually, it
errors out 2000 bytes too early, because it wants some extra space).
Insertion by decoding text is also protected because it makes sure the
destination buffer has enough space before it writes another chunk of
decoded text into it.  It assures that by enlarging the gap, which
again goes through make_gap.
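
The kind of guard described for make_gap can be sketched as follows.  This is
not the code of make_gap itself.  It assumes a maximum buffer size equal to
the 32-bit most-positive-fixnum from the report (536870911) and takes the
2000-byte slack from the description above.  SKETCH_MAX_BUFFER_BYTES,
SKETCH_GAP_SLACK and sketch_can_grow are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only: the size cap and the slack.  */
#define SKETCH_MAX_BUFFER_BYTES INT64_C (536870911)
#define SKETCH_GAP_SLACK INT64_C (2000)

/* Hypothetical helper: true if a buffer holding CURRENT_BYTES may be
   enlarged by EXTRA_BYTES without exceeding the cap minus the slack.  */
static bool
sketch_can_grow (int64_t current_bytes, int64_t extra_bytes)
{
  if (current_bytes < 0 || extra_bytes < 0)
    return false;
  /* Compare by subtraction so the sum current_bytes + extra_bytes is
     never formed and therefore cannot overflow.  */
  return extra_bytes
         <= SKETCH_MAX_BUFFER_BYTES - SKETCH_GAP_SLACK - current_bytes;
}
```

Because every insertion path asks for room first, a check of this shape at the
point of growth rejects an oversized result no matter how much the text
expanded during decoding, which is the reason given for dropping the up-front
divide-by-4 test.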

I found only one place where we were not protected from overflowing
MOST_POSITIVE_FIXNUM (not sure if it is relevant to
insert-file-contents), and one other place where I wasn't sure we were
protected, so I added a suitable protection in both those places.  See
the proposed patch below.

If no one objects, I will commit these changes in a week or so.

> > E.g., I should be able to use find-file-literally to visit a 512MB
> > file, but currently I cannot.
>
> If we know that byte bloat cannot occur, which is the case with
> find-file-literally, then the divide-by-4 limit should not be needed.
> That case should be easy, in that it shouldn't require a lot
> of analysis to fix that case safely.

Yes, definitely.  But I think the patch below solves this problem as
well, so there's no need for special treatment for unibyte or pure
ASCII files.

Here's the proposed patch.  Evans, I'd appreciate it if you could try
it and see if it solves the original problem for you.

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2011-04-19 10:48:30 +0000
+++ src/ChangeLog	2011-04-21 12:35:30 +0000
@@ -1,3 +1,16 @@
+2011-04-21  Eli Zaretskii
+
+	* coding.c (coding_alloc_by_realloc): Error out if destination
+	will grow beyond MOST_POSITIVE_FIXNUM.
+	(decode_coding_emacs_mule): Abort if there isn't enough place in
+	charbuf for the composition carryover bytes.  Reserve an extra
+	space for up to 2 characters produced in a loop.
+	(decode_coding_iso_2022): Abort if there isn't enough place in
+	charbuf for the composition carryover bytes.
+
+	* fileio.c (Finsert_file_contents): Don't limit file size to 1/4
+	of MOST_POSITIVE_FIXNUM.
+
 2011-04-19  Eli Zaretskii
 
 	* syntax.h (SETUP_SYNTAX_TABLE_FOR_OBJECT): Fix setting of

=== modified file 'src/coding.c'
--- src/coding.c	2011-04-14 05:04:02 +0000
+++ src/coding.c	2011-04-21 12:35:33 +0000
@@ -1071,6 +1071,8 @@ coding_set_destination (struct coding_sy
 static void
 coding_alloc_by_realloc (struct coding_system *coding, EMACS_INT bytes)
 {
+  if (coding->dst_bytes > MOST_POSITIVE_FIXNUM - bytes)
+    error ("Maximum size of buffer or string exceeded");
   coding->destination = (unsigned char *) xrealloc (coding->destination,
                                                     coding->dst_bytes + bytes);
   coding->dst_bytes += bytes;
@@ -2333,7 +2335,9 @@ decode_coding_emacs_mule (struct coding_
   /* We may produce two annotations (charset and composition) in one
      loop and one more charset annotation at the end.  */
   int *charbuf_end
-    = coding->charbuf + coding->charbuf_size - (MAX_ANNOTATION_LENGTH * 3);
+    = coding->charbuf + coding->charbuf_size - (MAX_ANNOTATION_LENGTH * 3)
+      /* We can produce up to 2 characters in a loop.  */
+      - 1;
   EMACS_INT consumed_chars = 0, consumed_chars_base;
   int multibytep = coding->src_multibyte;
   EMACS_INT char_offset = coding->produced_char;
@@ -2348,6 +2352,8 @@ decode_coding_emacs_mule (struct coding_
     {
       int i;
 
+      if (charbuf_end - charbuf < cmp_status->length)
+        abort ();
       for (i = 0; i < cmp_status->length; i++)
         *charbuf++ = cmp_status->carryover[i];
       coding->annotated = 1;
@@ -3479,6 +3485,8 @@ decode_coding_iso_2022 (struct coding_sy
 
   if (cmp_status->state != COMPOSING_NO)
     {
+      if (charbuf_end - charbuf < cmp_status->length)
+        abort ();
       for (i = 0; i < cmp_status->length; i++)
         *charbuf++ = cmp_status->carryover[i];
       coding->annotated = 1;

=== modified file 'src/fileio.c'
--- src/fileio.c	2011-04-14 20:20:17 +0000
+++ src/fileio.c	2011-04-21 12:07:44 +0000
@@ -3245,15 +3245,10 @@ variable `last-coding-system-used' to th
 
   record_unwind_protect (close_file_unwind, make_number (fd));
 
-  /* Arithmetic overflow can occur if an Emacs integer cannot represent the
-     file size, or if the calculations below overflow.  The calculations below
-     double the file size twice, so check that it can be multiplied by 4
-     safely.
-
-     Also check whether the size is negative, which can happen on a platform
-     that allows file sizes greater than the maximum off_t value.  */
+  /* Check whether the size is too large or negative, which can happen on a
+     platform that allows file sizes greater than the maximum off_t value.  */
   if (! not_regular
-      && ! (0 <= st.st_size && st.st_size <= MOST_POSITIVE_FIXNUM / 4))
+      && ! (0 <= st.st_size && st.st_size <= MOST_POSITIVE_FIXNUM))
     error ("Maximum buffer size exceeded");
 
   /* Prevent redisplay optimizations.  */

From unknown Sun Aug 17 22:10:07 2025
MIME-Version: 1.0
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Evans Winner
Subject: bug#8528: closed (Re: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit)
References: <83y62s6doj.fsf@gnu.org> <87bp00iqih.fsf@gmail.com>
X-Gnu-PR-Message: they-closed 8528
Reply-To: 8528@debbugs.gnu.org
Date: Fri, 29 Apr 2011 19:50:03 +0000
Content-Type: multipart/mixed; boundary="----------=_1304106603-5646-1"

This is a multi-part message in MIME format...

------------=_1304106603-5646-1
Content-Type: text/plain; charset="utf-8"

Your bug report

#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 8528@debbugs.gnu.org.
--=20 8528: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D8528 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1304106603-5646-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 8528-done) by debbugs.gnu.org; 29 Apr 2011 19:49:26 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QFtgr-0001SH-If for submit@debbugs.gnu.org; Fri, 29 Apr 2011 15:49:25 -0400 Received: from mtaout20.012.net.il ([80.179.55.166]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QFtgp-0001S1-M3 for 8528-done@debbugs.gnu.org; Fri, 29 Apr 2011 15:49:24 -0400 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0LKF00A00ISGKT00@a-mtaout20.012.net.il> for 8528-done@debbugs.gnu.org; Fri, 29 Apr 2011 22:49:16 +0300 (IDT) Received: from HOME-C4E4A596F7 ([77.124.150.132]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0LKF00996J220UQ0@a-mtaout20.012.net.il>; Fri, 29 Apr 2011 22:49:16 +0300 (IDT) Date: Fri, 29 Apr 2011 22:49:16 +0300 From: Eli Zaretskii Subject: Re: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit In-reply-to: <838vv33fm1.fsf@gnu.org> X-012-Sender: halo1@inter.net.il To: eggert@cs.ucla.edu, ego111@gmail.com Message-id: <83y62s6doj.fsf@gnu.org> References: <87bp00iqih.fsf@gmail.com> <83r58w2lst.fsf@gnu.org> <4DAFCC4F.1080900@cs.ucla.edu> <83mxjk2jl1.fsf@gnu.org> <4DAFD59C.5090602@cs.ucla.edu> <838vv33fm1.fsf@gnu.org> X-Spam-Score: -2.1 (--) X-Debbugs-Envelope-To: 8528-done Cc: 8528-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.1 (--) > 
Date: Thu, 21 Apr 2011 16:20:54 +0300
> From: Eli Zaretskii
> Cc: 8528@debbugs.gnu.org, ego111@gmail.com
>
> If no one objects, I will commit these changes in a week or so.

No one objected, so I installed this.

------------=_1304106603-5646-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

Received: (at submit) by debbugs.gnu.org; 20 Apr 2011 21:04:20 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QCeZP-000411-QS for submit@debbugs.gnu.org; Wed, 20 Apr 2011 17:04:20 -0400
Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QCeZO-00040p-0L for submit@debbugs.gnu.org; Wed, 20 Apr 2011 17:04:18 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QCeZH-0002Uy-Q7 for submit@debbugs.gnu.org; Wed, 20 Apr 2011 17:04:12 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-1.0 required=5.0 tests=BAYES_00, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, RFC_ABUSE_POST, T_DKIM_INVALID,T_TO_NO_BRKTS_FREEMAIL autolearn=no version=3.3.1
Received: from lists.gnu.org ([140.186.70.17]:58515) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QCeZH-0002Uu-O9 for submit@debbugs.gnu.org; Wed, 20 Apr 2011 17:04:11 -0400
Received: from eggs.gnu.org ([140.186.70.92]:55252) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QCeZG-0001UB-Vq for bug-gnu-emacs@gnu.org; Wed, 20 Apr 2011 17:04:11 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QCeZG-0002Uk-9t for bug-gnu-emacs@gnu.org; Wed, 20 Apr 2011 17:04:10 -0400
Received: from mail-pw0-f41.google.com ([209.85.160.41]:33161) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QCeZG-0002Ue-59 for bug-gnu-emacs@gnu.org; Wed, 20 Apr 2011 17:04:10 -0400
Received: by pwi10 with SMTP id 10so882324pwi.0 for ; Wed, 20 Apr 2011 14:04:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
        h=domainkey-signature:from:to:subject:date:message-id:mime-version
         :content-type;
        bh=Dlrqjsz0zC9DnS/i49lVYKUqSstLsBkAoxRkIajIpRM=;
        b=V6/9rhxAnqugEytwHEWTnyv83tWXWiPdinrqaUYGhYUvBwooIwYxYNFJbceHVK37nG
         nI+n/hOTF1qT/poCBXjw7gUb0jjH8g4A/J3H8p2KJ00xJVbatk+MWVG5/xRcbJ01MXjX
         SsI/lP/67vZ2M7dy0IRr0BjC8QS0Fsea7TzT8=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
        h=from:to:subject:date:message-id:mime-version:content-type;
        b=hbMW2T0uQoKTiqbNf4dyyQ4MtBMJWmqgCIEBWgglWbf+OSwJ19QTopsPtRFnltioyY
         ra2oaiZZWtWW7VJ655LNALH8L35rKVA8QhCeYPOULG+dYIzdVZ17d4hC/fil442Iym2e
         PX4/h8a0X1chxAn5UT3sIOuvr2eonhKfMjTt8=
Received: by 10.68.23.33 with SMTP id j1mr11165809pbf.443.1303333449137; Wed, 20 Apr 2011 14:04:09 -0700 (PDT)
Received: from braintron.67.42.142.120 ([67.42.142.120]) by mx.google.com with ESMTPS id d3sm840246pbh.73.2011.04.20.14.04.07 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 20 Apr 2011 14:04:08 -0700 (PDT)
From: Evans Winner
To: bug-gnu-emacs@gnu.org
Subject: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
Date: Wed, 20 Apr 2011 15:04:06 -0600
Message-ID: <87bp00iqih.fsf@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 140.186.70.17
X-Spam-Score: -5.9 (-----)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -5.9 (-----)

My understanding is that a 32-bit GNU Emacs should be able to open files up to 512 M. If I am wrong about that, please let me know.
I have compiled Emacs trunk from source several times in the last couple of months and somewhere in the last month or so it seems that the limit on my machine has become 128 M.

My math could be off, but on the assumption that 128 Mebibytes = 2^27 bytes = 1024 * 131072 bytes, and starting with emacs -Q I tried:

$ dd if=/dev/zero of=testfile bs=1024 count=131072

and tried to open the file, and got: "Maximum buffer size exceeded". Then I tried one K less:

$ dd if=/dev/zero of=testfile bs=1024 count=131071

and the buffer opened. I have verified using the `top' command that there is sufficient free memory for the files. Also, for what it's worth:

ELISP> most-positive-fixnum ==> 536870911

I discovered this as a result of not being able to open a large (~160Mb) .pdf file that I had earlier been able to open.

Please let me know if there is any other information I can provide, or if there is something simple I am doing wrong.

In GNU Emacs 24.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.0.8)
 of 2011-04-19 on braintron
Windowing system distributor `The X.Org Foundation', version 11.0.11001000
configured using `configure '--with-x-toolkit=gtk3''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

------------=_1304106603-5646-1--

From unknown Sun Aug 17 22:10:07 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#8528: 24.0.50; 32-bit Emacs with apparent 128M buffer size limit
Resent-From: Stefan Monnier
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Mon, 02 May 2011 14:54:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 8528
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords:
To: 8528@debbugs.gnu.org
Cc: eliz@gnu.org
Received: via spool by 8528-submit@debbugs.gnu.org id=B8528.13043480159533 (code B ref 8528); Mon, 02 May 2011 14:54:02 +0000
Received: (at 8528) by debbugs.gnu.org; 2 May 2011 14:53:35 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QGuVC-0002Ti-N4 for submit@debbugs.gnu.org; Mon, 02 May 2011 10:53:35 -0400
Received: from fencepost.gnu.org ([140.186.70.10]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QGuVA-0002TX-QU for 8528@debbugs.gnu.org; Mon, 02 May 2011 10:53:33 -0400
Received: from 121-249-126-200.fibertel.com.ar ([200.126.249.121]:51784 helo=ceviche.home) by fencepost.gnu.org with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1QGuV5-0003nq-8t; Mon, 02 May 2011 10:53:27 -0400
Received: by ceviche.home (Postfix, from userid 20848) id 74E4B66119; Mon, 2 May 2011 11:53:24 -0300 (ART)
From: Stefan Monnier
Message-ID:
References: <87bp00iqih.fsf@gmail.com> <83r58w2lst.fsf@gnu.org> <4DAFCC4F.1080900@cs.ucla.edu> <83mxjk2jl1.fsf@gnu.org> <4DAFD59C.5090602@cs.ucla.edu> <838vv33fm1.fsf@gnu.org> <83y62s6doj.fsf@gnu.org>
Date: Mon, 02 May 2011 11:53:24 -0300
In-Reply-To: <83y62s6doj.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 29 Apr 2011 22:49:16 +0300")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -6.0 (------)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -6.0 (------)

> No one objected, so I installed this.

Thanks,

        Stefan
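
The arithmetic in the original report can be double-checked with a short sketch. This is not part of the thread: it is a minimal Python illustration, the variable names are hypothetical, and the only inputs are the numbers quoted in the report (the two dd invocations and the most-positive-fixnum value).

```python
# Numbers copied from the bug report; names are illustrative only.
KIB = 1024

failing_size = KIB * 131072        # dd bs=1024 count=131072: "Maximum buffer size exceeded"
working_size = KIB * 131071        # dd bs=1024 count=131071: buffer opened
most_positive_fixnum = 536870911   # value reported at the ELISP> prompt

# The reporter's assumption: 128 MiB = 2^27 bytes = 1024 * 131072 bytes.
assert failing_size == 2**27

# The reported fixnum limit is 2^29 - 1, one byte short of 512 MiB.
assert most_positive_fixnum == 2**29 - 1

print(failing_size)                                 # 134217728
print(working_size)                                 # 134216704
print((most_positive_fixnum + 1) // failing_size)   # 4
```

So the reporter's math holds: the failing file is exactly 2^27 bytes, which is one quarter of most-positive-fixnum + 1. The report alone does not establish whether that ratio is the cause of the limit.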