GNU bug report logs - #37393
26.2.90; [PATCH] Speed up 'csv-align-fields'

Package: emacs;

Reported by: Simen Heggestøyl <simenheg <at> gmail.com>

Date: Thu, 12 Sep 2019 17:08:01 UTC

Severity: normal

Tags: patch

Found in version 26.2.90

Done: Simen Heggestøyl <simenheg <at> gmail.com>

Bug is archived. No further changes may be made.

Full log



From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Simen Heggestøyl <simenheg <at> gmail.com>
Subject: bug#37393: closed (Re: 26.2.90; [PATCH] Speed up 'csv-align-fields')
Date: Wed, 09 Oct 2019 16:34:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#37393: 26.2.90; [PATCH] Speed up 'csv-align-fields'

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 37393 <at> debbugs.gnu.org.

-- 
37393: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=37393
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Simen Heggestøyl <simenheg <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 37393-done <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Wed, 09 Oct 2019 18:33:16 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> Then I guess this technique won't help in your case.  Sorry for
> distracting you.

No problem, thanks for the suggestions.

Closing this bug now as the original patch has been installed (with some
changes suggested by Stefan).

-- Simen

[Message part 3 (message/rfc822, inline)]
From: Simen Heggestøyl <simenheg <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, Leo Liu <sdl.web <at> gmail.com>
Subject: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Thu, 12 Sep 2019 19:07:33 +0200
[Message part 4 (text/plain, inline)]
The attached patch attempts to speed up the 'csv-align-fields' command
by avoiding expensive calls to 'current-column', instead reusing field
widths already computed by 'csv--column-widths'.

I felt an urge to speed up the command a bit while working with large
(100 000+ lines) CSV files. Below are benchmarks produced by running

  (benchmark 3 '(csv-align-fields nil (point-min) (point-max)))

on three real-world CSV files of various sizes. In these cases the
speedup is between roughly 1.4x and 2x.

~400 line file:
  Before: Elapsed time: 0.175867s
  After:  Elapsed time: 0.086809s

~50 000 line file:
  Before: Elapsed time: 34.665853s (7.480686s in 35 GCs)
  After:  Elapsed time: 24.349081s (7.154716s in 27 GCs)

~110 000 line file:
  Before: Elapsed time: 82.444038s (19.799686s in 51 GCs)
  After:  Elapsed time: 40.184331s (9.037813s in 25 GCs)

(I've CC'd the two of you who seem to have done most of the work on
this mode lately; I hope that's OK.)

-- Simen
[0001-Speed-up-csv-align-fields.patch (text/x-diff, inline)]
From 4fc82f1f66c736bcfbc15d20ff53bd3e21e8a8e1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Simen=20Heggest=C3=B8yl?= <simenheg <at> gmail.com>
Date: Thu, 12 Sep 2019 18:54:28 +0200
Subject: [PATCH] Speed up 'csv-align-fields'

* packages/csv-mode/csv-mode.el: Bump version number and make the
dependency on Emacs 24.1 or higher explicit.
(csv--column-widths): Return the field widths as well.
(csv-align-fields): Speed up by using the field widths already computed
by 'csv--column-widths'.
---
 packages/csv-mode/csv-mode.el | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/packages/csv-mode/csv-mode.el b/packages/csv-mode/csv-mode.el
index 40f70330a..dc2555687 100644
--- a/packages/csv-mode/csv-mode.el
+++ b/packages/csv-mode/csv-mode.el
@@ -4,7 +4,8 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright <at> qmul.ac.uk>
 ;; Time-stamp: <23 August 2004>
-;; Version: 1.7
+;; Version: 1.8
+;; Package-Requires: ((emacs "24.1"))
 ;; Keywords: convenience
 
 ;; This package is free software; you can redistribute it and/or modify
@@ -969,24 +970,26 @@ The fields yanked are those last killed by `csv-kill-fields'."
   (and (overlay-get o 'csv) (delete-overlay o)))
 
 (defun csv--column-widths ()
-  (let ((widths '()))
+  (let ((column-widths '())
+        (field-widths '()))
     ;; Construct list of column widths:
     (while (not (eobp))                   ; for each record...
       (or (csv-not-looking-at-record)
-          (let ((w widths)
+          (let ((w column-widths)
                 (col (current-column))
-                x)
+                field-width)
             (while (not (eolp))
               (csv-end-of-field)
-              (setq x (- (current-column) col)) ; Field width.
+              (setq field-width (- (current-column) col))
+              (push field-width field-widths)
               (if w
-                  (if (> x (car w)) (setcar w x))
-                (setq w (list x)
-                      widths (nconc widths w)))
+                  (if (> field-width (car w)) (setcar w field-width))
+                (setq w (list field-width)
+                      column-widths (nconc column-widths w)))
               (or (eolp) (forward-char))  ; Skip separator.
               (setq w (cdr w) col (current-column)))))
       (forward-line))
-    widths))
+    (list column-widths (nreverse field-widths))))
 
 (defun csv-align-fields (hard beg end)
   "Align all the fields in the region to form columns.
@@ -1017,23 +1020,22 @@ If there is no selected region, default to the whole buffer."
       (narrow-to-region beg end)
       (set-marker end nil)
       (goto-char (point-min))
-      (let ((widths (csv--column-widths)))
+      (pcase-let ((`(,column-widths ,field-widths) (csv--column-widths)))
 
 	;; Align fields:
 	(goto-char (point-min))
 	(while (not (eobp))		; for each record...
 	  (unless (csv-not-looking-at-record)
-            (let ((w widths)
+            (let ((w column-widths)
                   (column 0))    ;Desired position of left-side of this column.
               (while (and w (not (eolp)))
                 (let* ((beg (point))
                        (align-padding (if (bolp) 0 csv-align-padding))
                        (left-padding 0) (right-padding 0)
-                       (field-width
-                        (- (- (current-column)
-                              (progn (csv-end-of-field) (current-column)))))
+                       (field-width (pop field-widths))
                        (column-width (pop w))
                        (x (- column-width field-width))) ; Required padding.
+                  (csv-end-of-field)
                   (set-marker end (point)) ; End of current field.
                   ;; beg = beginning of current field
                   ;; end = (point) = end of current field
-- 
2.23.0
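
The idea behind the patch can be illustrated outside of a buffer. The
following is a minimal sketch, not the csv-mode implementation: it
assumes records are already split into lists of strings, measures each
field once with `string-width' (the real code subtracts `current-column'
values instead), and then reuses the cached widths when padding. The
`demo-' function names are hypothetical and exist only for this example.

```elisp
;; Illustrative sketch only; the `demo-' names are hypothetical and
;; are not part of csv-mode.

(defun demo-column-and-field-widths (records)
  "Return (COLUMN-WIDTHS FIELD-WIDTHS) for RECORDS, a list of lists of strings.
COLUMN-WIDTHS holds the widest field of each column.  FIELD-WIDTHS holds
the width of every field in reading order, so that a later pass does not
have to measure the fields again."
  (let ((column-widths '())
        (field-widths '()))
    (dolist (record records)
      (let ((w column-widths))          ; Tail of COLUMN-WIDTHS for this field.
        (dolist (field record)
          (let ((width (string-width field)))
            (push width field-widths)
            (if w
                ;; Column already known: widen it if necessary.
                (when (> width (car w)) (setcar w width))
              ;; First field seen in this column: append a new cell.
              (setq w (list width)
                    column-widths (nconc column-widths w)))
            (setq w (cdr w))))))
    (list column-widths (nreverse field-widths))))

(defun demo-align (records &optional padding)
  "Return RECORDS as a list of lines with fields padded into columns.
PADDING is the number of spaces between columns (default 1)."
  (pcase-let ((`(,column-widths ,field-widths)
               (demo-column-and-field-widths records)))
    (mapcar
     (lambda (record)
       (let ((w column-widths))
         (mapconcat
          (lambda (field)
            ;; Reuse the cached width instead of measuring FIELD again.
            (let ((field-width (pop field-widths))
                  (column-width (pop w)))
              (concat field
                      (make-string (- column-width field-width) ?\s))))
          record
          (make-string (or padding 1) ?\s))))
     records)))
```

Calling (demo-align '(("a" "bbb") ("cccc" "d"))) measures the four
fields once in the first pass; the second pass only pops the stored
widths. This mirrors how the patched `csv-align-fields' pops from
`field-widths' rather than calling `current-column' for every field.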


This bug report was last modified 5 years and 285 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.