GNU bug report logs - #37393
26.2.90; [PATCH] Speed up 'csv-align-fields'


Package: emacs;

Reported by: Simen Heggestøyl <simenheg <at> gmail.com>

Date: Thu, 12 Sep 2019 17:08:01 UTC

Severity: normal

Tags: patch

Found in version 26.2.90

Done: Simen Heggestøyl <simenheg <at> gmail.com>

Bug is archived. No further changes may be made.



Message #8 received at submit <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: bug-gnu-emacs <at> gnu.org, Leo Liu <sdl.web <at> gmail.com>
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Thu, 12 Sep 2019 13:46:36 -0400
> The attached patch attempts to speed up the 'csv-align-fields' command
> by avoiding expensive calls to 'current-column', instead reusing field
> widths already computed by 'csv--column-widths'.

Sounds good.  I rarely use large CSV files, but I know the operation is slow.

I'm OK with the patch, tho please see my comment below.

> I felt an urge to speed up the command a bit while working with large
> (100 000+ lines) CSV files. Below are benchmarks produced by running
>
>   (benchmark 3 '(csv-align-fields nil (point-min) (point-max)))
>
> in three CSV files from the real world of various sizes. In these cases
> the speedup seems to be around 1.5x-2x.
>
> ~400 line file:
>   Before: Elapsed time: 0.175867s
>   After:  Elapsed time: 0.086809s
>
> ~50 000 line file:
>   Before: Elapsed time: 34.665853s (7.480686s in 35 GCs)
>   After:  Elapsed time: 24.349081s (7.154716s in 27 GCs)
>
> ~110 000 line file:
>   Before: Elapsed time: 82.444038s (19.799686s in 51 GCs)
>   After:  Elapsed time: 40.184331s (9.037813s in 25 GCs)

40s is still slow, but a factor of 2 is good, thanks.
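
As a quick check on the "factor of 2" figure, the speedups can be recomputed from the elapsed times quoted above. This is only a sketch: the names `my/csv-align-timings` and `my/csv-align-speedups` are made up for illustration and are not part of csv-mode.

```elisp
;; Elapsed times in seconds, copied from the benchmarks quoted above.
;; Each entry is (LABEL BEFORE AFTER).  The names are hypothetical.
(defconst my/csv-align-timings
  '(("~400 lines"     0.175867  0.086809)
    ("~50 000 lines"  34.665853 24.349081)
    ("~110 000 lines" 82.444038 40.184331)))

(defun my/csv-align-speedups (timings)
  "Return an alist of (LABEL . SPEEDUP) for TIMINGS.
Each element of TIMINGS has the form (LABEL BEFORE AFTER), and
SPEEDUP is BEFORE divided by AFTER."
  (mapcar (lambda (row)
            (cons (nth 0 row) (/ (nth 1 row) (nth 2 row))))
          timings))

;; Evaluate this to see the three ratios:
;; (my/csv-align-speedups my/csv-align-timings)
```

The smallest and largest files come out at roughly 2x, while the ~50 000 line file is closer to 1.4x.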

If you're interested in this line, I think there are two avenues to
improve the behavior further:
- align lazily via jit-lock (this way the time is determined by the
  amount of text displayed rather than the total file size).
- make 'align-fields' into a mode, where fields are kept aligned even while
  the buffer is modified.
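
To make the first avenue concrete, here is a rough sketch of what hooking alignment into jit-lock might look like. Everything prefixed with `my/` is hypothetical, and the sketch assumes that 'csv-align-fields' accepts (HARD BEG END) as in the benchmark call quoted above. One known weakness: each chunk would be aligned using only its own field widths, so column positions could differ between chunks unless the widths are computed or cached for the whole buffer.

```elisp
(require 'jit-lock)

(defvar my/lazy-align-function
  (lambda (beg end)
    ;; Assumption: csv-mode is loaded and `csv-align-fields' takes
    ;; (HARD BEG END), as in the benchmark call quoted earlier.
    (when (fboundp 'csv-align-fields)
      (csv-align-fields nil beg end)))
  "Function called with BEG and END to align one chunk of text.")

(defun my/lazy-align-region (beg end)
  "Align the whole lines overlapping BEG..END; meant for jit-lock."
  (save-excursion
    (let ((start (progn (goto-char beg) (line-beginning-position)))
          (stop  (progn (goto-char end) (line-end-position))))
      ;; Alignment here is purely visual, so avoid marking the
      ;; buffer as modified or recording undo entries.
      (with-silent-modifications
        (funcall my/lazy-align-function start stop))
      ;; Tell jit-lock which region was actually handled.
      `(jit-lock-bounds ,start . ,stop))))

(define-minor-mode my/lazy-align-mode
  "Hypothetical minor mode that aligns only the text being displayed."
  :lighter " LazyAlign"
  (if my/lazy-align-mode
      (jit-lock-register #'my/lazy-align-region)
    (jit-lock-unregister #'my/lazy-align-region)))
```

With something like this, the cost of alignment would depend on the amount of text jit-lock asks for, not on the size of the file. Keeping fields aligned while editing (the second avenue) would come for free to some extent, since jit-lock re-runs its functions on modified text.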

>  (defun csv--column-widths ()
> -  (let ((widths '()))
> +  (let ((column-widths '())
> +        (field-widths '()))

I think the return value is now sufficiently complex that the function
deserves a docstring describing it.
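
For illustration only, this is the kind of docstring I have in mind, shown on a simplified stand-in. The function `my/csv-column-widths` is hypothetical: it works on a string, splits on commas without handling quotes, and its return shape is an assumption, not necessarily what the patched 'csv--column-widths' returns.

```elisp
(defun my/csv-column-widths (text)
  "Return (COLUMN-WIDTHS FIELD-WIDTHS) for the CSV string TEXT.
COLUMN-WIDTHS is a list whose Nth element is the width of the
widest field in column N.  FIELD-WIDTHS is a list with one
element per line, each a list of the widths of that line's
fields, in order.
Fields are split on commas only; quoting is not handled."
  (let ((column-widths '())
        (field-widths '()))
    (dolist (line (split-string text "\n" t))
      (let ((widths (mapcar #'string-width (split-string line ","))))
        (push widths field-widths)
        ;; Merge this line's widths into the per-column maxima.
        (let ((merged '())
              (cols column-widths)
              (ws widths))
          (while (or cols ws)
            (push (max (or (car cols) 0) (or (car ws) 0)) merged)
            (setq cols (cdr cols)
                  ws (cdr ws)))
          (setq column-widths (nreverse merged)))))
    (list column-widths (nreverse field-widths))))

;; (my/csv-column-widths "a,bb\nccc,d,e")
;; => ((3 2 1) ((1 2) (3 1 1)))
```

The point is that a caller can tell from the docstring alone which list holds per-column maxima and which holds per-line widths.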


        Stefan




