From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 11 12:07:18 2018
To: bug-gnu-emacs@gnu.org
Subject: 26.1.90; nhexl-mode performance
From: Guido Kraemer
Message-ID: <8d6352ac-4bca-fb76-8c2f-dd597e481e8e@bgc-jena.mpg.de>
Date: Tue, 11 Dec 2018 18:06:40 +0100

Filing a bug report because of the discussion on
https://emacs.stackexchange.com/questions/46492/how-to-search-for-a-sequence-of-bytes-in-hexl-mode/

nhexl-mode performance is really bad in large files. Occurred in the
files of the Bitcoin blockchain. The beginning of every block is marked
with the byte sequence `f9beb4d9`. Searching for this byte sequence in
nhexl-mode is really slow.

In case you do not have the bitcoin client installed, I uploaded the
first file of the blockchain here (134MB, link will be valid for 30
days): https://ufile.io/z08bl

Thanks for looking into this.

Guido.

In GNU Emacs 26.1.90 (build 1, x86_64-pc-linux-gnu, X toolkit, Xaw scroll bars)
 of 2018-12-01 built on uranus
Repository revision: 7851ae8b443c62a41ea4f4440512aa56cc87b9b7
Windowing system distributor 'The X.Org Foundation

From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 14 13:40:46 2018
From: Stefan Monnier
To: Guido Kraemer
Cc: 33708@debbugs.gnu.org
Subject: Re: bug#33708: 26.1.90; nhexl-mode performance
References: <8d6352ac-4bca-fb76-8c2f-dd597e481e8e@bgc-jena.mpg.de>
Date: Fri, 14 Dec 2018 13:40:36 -0500
In-Reply-To: <8d6352ac-4bca-fb76-8c2f-dd597e481e8e@bgc-jena.mpg.de> (Guido Kraemer's message of "Tue, 11 Dec 2018 18:06:40 +0100")

> Occurred in the files of the Bitcoin blockchain. The beginning of every
> block is marked with the byte sequence `f9beb4d9`. Searching for this
> byte sequence in nhexl-mode is really slow.

Duh, indeed it's painful, and it's a plain performance bug in nhexl-mode.
I just installed the patch below which seems to fix it (along with
another less severe bug). The corresponding 1.2 package should appear
soon on GNU ELPA.

Thanks for the test case (tho it was painful to get due to having to go
through my browser and coerce it to run non-free Javascript code).


        Stefan


diff --git a/packages/nhexl-mode/nhexl-mode.el b/packages/nhexl-mode/nhexl-mode.el
index 89d91182f..a52a90081 100644
--- a/packages/nhexl-mode/nhexl-mode.el
+++ b/packages/nhexl-mode/nhexl-mode.el
@@ -807,22 +807,31 @@ Return the corresponding nibble, if applicable."
         (push (string-to-number (substring string i (+ i 2)) 16)
               chars)
         (setq i (+ i 2)))
-      (let* ((base (regexp-quote (apply #'string (nreverse chars))))
-             (newstr
-              (if (>= i (length string))
-                  base
-                (cl-assert (= (1+ i) (length string)))
-                (let ((nibble (string-to-number (substring string i) 16)))
-                  ;; FIXME: if one of the two bounds is a special char
-                  ;; like `]` or `^' we can get into trouble!
-                  (format "%s[%c-%c]" base
-                          (* 16 nibble)
-                          (+ 15 (* 16 nibble)))))))
+      (let* ((base (regexp-quote (apply #'unibyte-string (nreverse chars))))
+             (re
+              (concat (if (>= i (length string))
+                          base
+                        (cl-assert (= (1+ i) (length string)))
+                        (let ((nibble (string-to-number (substring string i) 16)))
+                          ;; FIXME: if one of the two bounds is a special char
+                          ;; like `]` or `^' we can get into trouble!
+                          (concat base
+                                  (unibyte-string ?\[ (* 16 nibble) ?-
+                                                  (+ 15 (* 16 nibble)) ?\]))))
+                      ;; We also search for the literal hex string here, so the
+                      ;; search stops as soon as one is found, otherwise we too
+                      ;; easily fall into the trap of bug#33708 where at every
+                      ;; cycle we first search unsuccessfully through the whole
+                      ;; buffer with one kind of search before trying the
+                      ;; other search.
+                      ;; Don't bother regexp-quoting the string since we know
+                      ;; it's only made of hex chars!
+                      "\\|" string)))
         (let ((case-fold-search nil))
           (funcall (if isearch-forward
                        #'re-search-forward
                      #'re-search-backward)
-                   newstr bound noerror)))))
+                   re bound noerror)))))
 
 (defun nhexl--isearch-search-fun (orig-fun)
   (let ((def-fun (funcall orig-fun)))
@@ -830,9 +839,18 @@ Return the corresponding nibble, if applicable."
       (unless bound
         (setq bound (if isearch-forward (point-max) (point-min))))
       (let ((startpos (point))
-            (def (funcall def-fun string bound noerror)))
-        ;; Don't search further than what `def-fun' found.
-        (if def (setq bound (match-beginning 0)))
+            def)
+        ;; Hex address search.
+        (when (and nhexl-isearch-hex-addresses
+                   (> (length string) 1)
+                   (string-match-p "\\`[[:xdigit:]]+:?\\'" string))
+          ;; Could be a hexadecimal address.
+          (goto-char startpos)
+          (let ((newdef (nhexl--isearch-match-hex-address string bound noerror)))
+            (when newdef
+              (setq def newdef)
+              (setq bound (match-beginning 0)))))
+        ;; Hex bytes search
         (when (and nhexl-isearch-hex-bytes
                    (> (length string) 1)
                    (string-match-p "\\`[[:xdigit:]]+\\'" string))
@@ -842,12 +860,10 @@ Return the corresponding nibble, if applicable."
             (when newdef
               (setq def newdef)
               (setq bound (match-beginning 0)))))
-        (when (and nhexl-isearch-hex-addresses
-                   (> (length string) 1)
-                   (string-match-p "\\`[[:xdigit:]]+:?\\'" string))
-          ;; Could be a hexadecimal address.
+        ;; Normal search.
+        (progn
           (goto-char startpos)
-          (let ((newdef (nhexl--isearch-match-hex-address string bound noerror)))
+          (let ((newdef (funcall def-fun string bound noerror)))
             (when newdef
               (setq def newdef)
               (setq bound (match-beginning 0)))))
@@ -909,17 +925,19 @@ Return the corresponding nibble, if applicable."
             #'nhexl--isearch-highlight-cleanup)
 (defun nhexl--isearch-highlight-cleanup (&rest _)
   (when (and nhexl-mode nhexl-isearch-hex-highlight)
-    (dolist (ol isearch-lazy-highlight-overlays)
-      (when (and (overlayp ol) (eq (overlay-buffer ol) (current-buffer)))
-        (put-text-property (overlay-start ol) (overlay-end ol)
-                           'fontified nil)))))
+    (with-silent-modifications
+      (dolist (ol isearch-lazy-highlight-overlays)
+        (when (and (overlayp ol) (eq (overlay-buffer ol) (current-buffer)))
+          (put-text-property (overlay-start ol) (overlay-end ol)
+                             'fontified nil))))))
 
 (advice-add 'isearch-lazy-highlight-match :after
             #'nhexl--isearch-highlight-match)
 (defun nhexl--isearch-highlight-match (&optional mb me)
   (when (and nhexl-mode nhexl-isearch-hex-highlight
              (integerp mb) (integerp me))
-    (put-text-property mb me 'fontified nil)))
+    (with-silent-modifications
+      (put-text-property mb me 'fontified nil))))
 
 (defun nhexl--line-width-watcher (_sym _newval op where)
   (when (eq op 'set)

From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 14 17:52:40 2018
From: Stefan Monnier
To: Guido Kraemer
Subject: Re: bug#33708: 26.1.90; nhexl-mode performance
References: <8d6352ac-4bca-fb76-8c2f-dd597e481e8e@bgc-jena.mpg.de>
Date: Fri, 14 Dec 2018 17:52:31 -0500
In-Reply-To: (Guido Kraemer's message of "Fri, 14 Dec 2018 23:15:40 +0100")

> - Thanks for fixing this, performance is great now!

Thanks for confirming.

> - Sorry for using that weird site for file sharing, you could have installed
>   the bitcoin client (which is MIT licensed) and downloaded the
>   blockchain ;-),

I hesitated to do that, indeed.

>   what would you use for ad-hoc file sharing?

I'd put it on my web server, not linked from any page.

> - I think there is another minor bug: When the cursor is at the very
>   beginning of the buffer and you search for the byte sequence at the very
>   beginning of the file, search will jump to the second occurrence. Happens in
>   the example of the original bug report.

Yeah, it's a misfeature that I'm not sure how to fix:
When you type `C-s f a`, you first search for `f` and this one is not
treated as a hex-search so it jumps to the first `f` char, so when you
get to type `a` Isearch keeps searching from that `f` rather than
restarting from the beginning of the buffer.

I could change the rule so that `C-s f` already treats `f` as
a hex-search, I guess.


        Stefan

From unknown Mon Aug 18 11:12:01 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Sat, 12 Jan 2019 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
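
Note for readers of this archived thread: the regexp that the patch in
bug#33708 builds for hex-byte searches can be tried in isolation. What
follows is only a sketch modeled on the `+` lines of the diff, not the
code shipped in nhexl-mode. The function name `my/hex-to-byte-regexp` is
hypothetical, and the sketch assumes its argument contains nothing but
hexadecimal digits. It shows the two ideas from the patch: bytes are
assembled with `unibyte-string`, and the literal hex text is added as
a second alternative so that one pass over the buffer can stop at
whichever kind of match comes first.

```elisp
(defun my/hex-to-byte-regexp (hex)
  "Return a regexp matching the bytes spelled by HEX, or HEX as text.
HEX must be a string of hexadecimal digits.  Hypothetical helper,
written for illustration; it is not part of nhexl-mode."
  (let ((i 0)
        (bytes '()))
    ;; Consume the digits two at a time: each pair is one byte.
    (while (< (1+ i) (length hex))
      (push (string-to-number (substring hex i (+ i 2)) 16) bytes)
      (setq i (+ i 2)))
    (let ((base (regexp-quote (apply #'unibyte-string (nreverse bytes)))))
      (concat
       (if (>= i (length hex))
           base
         ;; Odd number of digits: the last digit is a high nibble, so
         ;; accept any byte in the range [nibble*16, nibble*16 + 15].
         (let ((nibble (string-to-number (substring hex i) 16)))
           (concat base
                   (unibyte-string ?\[ (* 16 nibble) ?-
                                   (+ 15 (* 16 nibble)) ?\]))))
       ;; Second alternative: the hex digits as ordinary text.
       "\\|" hex))))

;; (my/hex-to-byte-regexp "41")                         ; => "A\\|41"
;; (string-match-p (my/hex-to-byte-regexp "4142") "xxABxx")  ; => 2
```

Folding both alternatives into one regexp is the design choice the
patch comment describes: with two separate searches, each Isearch
cycle could scan the whole buffer unsuccessfully with one kind of
search before the other was tried.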