GNU bug report logs - #66796
[PATCH] lint: Speed up the formatting linter.

Previous Next

Package: guix-patches;

Reported by: Christopher Baines <mail <at> cbaines.net>

Date: Sat, 28 Oct 2023 14:37:01 UTC

Severity: normal

Tags: patch

Done: Christopher Baines <mail <at> cbaines.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 66796 in the body.
You can then email your comments to 66796 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to guix <at> cbaines.net, dev <at> jpoiret.xyz, ludo <at> gnu.org, othacehe <at> gnu.org, rekado <at> elephly.net, zimon.toutoune <at> gmail.com, me <at> tobias.gr, guix-patches <at> gnu.org:
bug#66796; Package guix-patches. (Sat, 28 Oct 2023 14:37:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Christopher Baines <mail <at> cbaines.net>:
New bug report received and forwarded. Copy sent to guix <at> cbaines.net, dev <at> jpoiret.xyz, ludo <at> gnu.org, othacehe <at> gnu.org, rekado <at> elephly.net, zimon.toutoune <at> gmail.com, me <at> tobias.gr, guix-patches <at> gnu.org. (Sat, 28 Oct 2023 14:37:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: guix-patches <at> gnu.org
Subject: [PATCH] lint: Speed up the formatting linter.
Date: Sat, 28 Oct 2023 15:35:14 +0100
By storing the bytes to seek to for the start of each line the first time you
want to check a package in a file, rather than figuring out where the package
starts each time.

This cuts down the time to run guix lint -c formatting from 450 seconds to 13
seconds.

* guix/lint.scm (report-formatting-issues): If %check-formatting-seek-lookup
is a hash table, store vlist's in it to map from a line number to a byte to
seek to.
(%check-formatting-seek-lookup): New parameter.
* guix/scripts/lint.scm (guix-lint): Enable faster formatting linting, when
linting all packages.

Change-Id: I34e4d3acfbb1e14e026d2e7f712ba8d22b56c147
---
 guix/lint.scm         | 44 ++++++++++++++++++++++++++++++++++++++++++-
 guix/scripts/lint.scm |  3 +++
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/guix/lint.scm b/guix/lint.scm
index 7ccf52dec1..d94b4026c6 100644
--- a/guix/lint.scm
+++ b/guix/lint.scm
@@ -68,6 +68,7 @@ (define-module (guix lint)
                                     svn-multi-reference-user-name
                                     svn-multi-reference-password)
   #:use-module (guix import stackage)
+  #:use-module (ice-9 vlist)
   #:use-module (ice-9 match)
   #:use-module (ice-9 regex)
   #:use-module (ice-9 format)
@@ -109,6 +110,7 @@ (define-module (guix lint)
             check-license
             check-vulnerabilities
             check-for-updates
+            %check-formatting-seek-lookup
             check-formatting
             check-archival
             check-profile-collisions
@@ -1839,6 +1841,40 @@ (define* (report-formatting-issues package file starting-line
                                    #:key (reporters %formatting-reporters))
   "Report white-space issues in FILE starting from STARTING-LINE, and report
 them for PACKAGE."
+  (define (seek-to-line port line)
+    (let ((offset
+           (vlist-ref
+            (or (hash-ref (%check-formatting-seek-lookup) file)
+                (call-with-input-file file
+                  (lambda (port)
+                    (let* ((buf-length 80)
+                           (buf (make-string buf-length)))
+                      (let loop ((byte-lookup-list '(0)))
+                        (let* ((rv (%read-delimited! "\n" buf #t port))
+                               (terminator (car rv))
+                               (nchars (cdr rv)))
+                          (cond
+                           ((eof-object? terminator)
+                            (let ((byte-lookup-vlist
+                                   (list->vlist
+                                    (reverse byte-lookup-list))))
+                              (hash-set! (%check-formatting-seek-lookup)
+                                         file
+                                         byte-lookup-vlist)
+                              byte-lookup-vlist))
+
+                           ((not terminator)
+                            (loop byte-lookup-list))
+
+                           (nchars
+                            (loop (cons
+                                   (ftell port)
+                                   byte-lookup-list))))))))))
+            (- line 1))))
+      (set-port-line! port line)
+      (seek port offset SEEK_SET)
+      line))
+
   (define (sexp-last-line port)
     ;; Return the last line of the sexp read from PORT or an estimate thereof.
     (define &failure (list 'failure))
@@ -1857,7 +1893,10 @@ (define* (report-formatting-issues package file starting-line
 
   (call-with-input-file file
     (lambda (port)
-      (let loop ((line-number 1)
+      (let loop ((line-number
+                  (if (%check-formatting-seek-lookup)
+                      (seek-to-line port starting-line)
+                      1))
                  (last-line #f)
                  (warnings '()))
         (let ((line (read-line port)))
@@ -1879,6 +1918,9 @@ (define* (report-formatting-issues package file starting-line
                                            (report package line line-number))
                                          reporters)))))))))))
 
+(define %check-formatting-seek-lookup
+  (make-parameter #f))
+
 (define (check-formatting package)
   "Check the formatting of the source code of PACKAGE."
   (let ((location (package-location package)))
diff --git a/guix/scripts/lint.scm b/guix/scripts/lint.scm
index ee3de51fb1..219c3b91be 100644
--- a/guix/scripts/lint.scm
+++ b/guix/scripts/lint.scm
@@ -222,6 +222,9 @@ (define-command (guix-lint . args)
          (lambda (store)
            (cond
             ((null? args)
+             ;; Enable fast seeking to lines for the check-formatting linter
+             (%check-formatting-seek-lookup (make-hash-table))
+
              (fold-packages (lambda (p r) (run-checkers p checkers
                                                         #:store store)) '()))
             (else

base-commit: c3cf04d05b452fee549bb84b323d056fd30cef45
-- 
2.41.0





Information forwarded to guix-patches <at> gnu.org:
bug#66796; Package guix-patches. (Sun, 05 Nov 2023 14:37:02 GMT) Full text and rfc822 format available.

Message #8 received at 66796 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 66796 <at> debbugs.gnu.org, Josselin Poiret <dev <at> jpoiret.xyz>,
 Simon Tournier <zimon.toutoune <at> gmail.com>, Mathieu Othacehe <othacehe <at> gnu.org>,
 Tobias Geerinckx-Rice <me <at> tobias.gr>, Ricardo Wurmus <rekado <at> elephly.net>,
 Christopher Baines <guix <at> cbaines.net>
Subject: Re: [bug#66796] [PATCH] lint: Speed up the formatting linter.
Date: Sun, 05 Nov 2023 15:35:22 +0100
Hi,

Christopher Baines <mail <at> cbaines.net> skribis:

> By storing the bytes to seek to for the start of each line the first time you
> want to check a package in a file, rather than figuring out where the package
> starts each time.
>
> This cuts down the time to run guix lint -c formatting from 450 seconds to 13
> seconds.

Excellent!

> +  (define (seek-to-line port line)
> +    (let ((offset
> +           (vlist-ref
> +            (or (hash-ref (%check-formatting-seek-lookup) file)
> +                (call-with-input-file file
> +                  (lambda (port)
> +                    (let* ((buf-length 80)
> +                           (buf (make-string buf-length)))
> +                      (let loop ((byte-lookup-list '(0)))
> +                        (let* ((rv (%read-delimited! "\n" buf #t port))
> +                               (terminator (car rv))
> +                               (nchars (cdr rv)))
> +                          (cond
> +                           ((eof-object? terminator)
> +                            (let ((byte-lookup-vlist
> +                                   (list->vlist
> +                                    (reverse byte-lookup-list))))
> +                              (hash-set! (%check-formatting-seek-lookup)
> +                                         file
> +                                         byte-lookup-vlist)
> +                              byte-lookup-vlist))
> +
> +                           ((not terminator)
> +                            (loop byte-lookup-list))
> +
> +                           (nchars
> +                            (loop (cons
> +                                   (ftell port)
> +                                   byte-lookup-list))))))))))
> +            (- line 1))))
> +      (set-port-line! port line)
> +      (seek port offset SEEK_SET)
> +      line))

Two things: (1) it’s a bit hard to read, in part due to long
identifiers, and (2) it would be nice to keep this facility orthogonal,
outside the checker.

As it turns out, a similar facility is available in (guix utils),
exposed via ‘go-to-location’.  Would it be possible to use it here?

> +      (let loop ((line-number
> +                  (if (%check-formatting-seek-lookup)
> +                      (seek-to-line port starting-line)
> +                      1))

Answering myself: I guess ‘seek-to-line’ can be replaced by
(go-to-location port starting-line 0).

Thanks!

Ludo’.




Reply sent to Christopher Baines <mail <at> cbaines.net>:
You have taken responsibility. (Sun, 05 Nov 2023 18:19:02 GMT) Full text and rfc822 format available.

Notification sent to Christopher Baines <mail <at> cbaines.net>:
bug acknowledged by developer. (Sun, 05 Nov 2023 18:19:02 GMT) Full text and rfc822 format available.

Message #13 received at 66796-done <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 66796-done <at> debbugs.gnu.org
Subject: Re: [bug#66796] [PATCH] lint: Speed up the formatting linter.
Date: Sun, 05 Nov 2023 18:17:31 +0000
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Christopher Baines <mail <at> cbaines.net> skribis:
>
>> By storing the bytes to seek to for the start of each line the first time you
>> want to check a package in a file, rather than figuring out where the package
>> starts each time.
>>
>> This cuts down the time to run guix lint -c formatting from 450 seconds to 13
>> seconds.
>
> Excellent!
>
>> +  (define (seek-to-line port line)
>> +    (let ((offset
>> +           (vlist-ref
>> +            (or (hash-ref (%check-formatting-seek-lookup) file)
>> +                (call-with-input-file file
>> +                  (lambda (port)
>> +                    (let* ((buf-length 80)
>> +                           (buf (make-string buf-length)))
>> +                      (let loop ((byte-lookup-list '(0)))
>> +                        (let* ((rv (%read-delimited! "\n" buf #t port))
>> +                               (terminator (car rv))
>> +                               (nchars (cdr rv)))
>> +                          (cond
>> +                           ((eof-object? terminator)
>> +                            (let ((byte-lookup-vlist
>> +                                   (list->vlist
>> +                                    (reverse byte-lookup-list))))
>> +                              (hash-set! (%check-formatting-seek-lookup)
>> +                                         file
>> +                                         byte-lookup-vlist)
>> +                              byte-lookup-vlist))
>> +
>> +                           ((not terminator)
>> +                            (loop byte-lookup-list))
>> +
>> +                           (nchars
>> +                            (loop (cons
>> +                                   (ftell port)
>> +                                   byte-lookup-list))))))))))
>> +            (- line 1))))
>> +      (set-port-line! port line)
>> +      (seek port offset SEEK_SET)
>> +      line))
>
> Two things: (1) it’s a bit hard to read, in part due to long
> identifiers, and (2) it would be nice to keep this facility orthogonal,
> outside the checker.
>
> As it turns out, a similar facility is available in (guix utils),
> exposed via ‘go-to-location’.  Would it be possible to use it here?
>
>> +      (let loop ((line-number
>> +                  (if (%check-formatting-seek-lookup)
>> +                      (seek-to-line port starting-line)
>> +                      1))
>
> Answering myself: I guess ‘seek-to-line’ can be replaced by
> (go-to-location port starting-line 0).

Cool, this simplifies the change a lot, I've pushed the amended version
to master as aa98a976079101318d53b412fef6c722bb4332c9.

Thanks,

Chris
[signature.asc (application/pgp-signature, inline)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 04 Dec 2023 12:24:11 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 290 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.