GNU bug report logs - #47283
Performance regression in narinfo fetching

Previous Next

Package: guix;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Sat, 20 Mar 2021 17:39:01 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#47283: closed (Performance regression in narinfo fetching)
Date: Sat, 27 Mar 2021 21:59:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Sat, 27 Mar 2021 22:58:39 +0100
with message-id <87mtuoxnqo.fsf_-_ <at> gnu.org>
and subject line Re: bug#47288: [PATCH] guix: http-client: Tweak http-multiple-get error handling.
has caused the debbugs.gnu.org bug report #47283,
regarding Performance regression in narinfo fetching
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
47283: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=47283
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: <bug-guix <at> gnu.org>
Subject: Performance regression in narinfo fetching
Date: Sat, 20 Mar 2021 18:38:39 +0100
[Message part 3 (text/plain, inline)]
Hello!

As reported on guix-devel, ‘guix weather’ has become extremely slow.
Specifically, in the narinfo-fetching phase, it runs at 100% CPU, even
though that part should be network-bound (pipelined HTTP GETs).

A profile of the ‘report-server-coverage’ call would show this:

--8<---------------cut here---------------start------------->8---
%     cumulative   self             
time   seconds     seconds  procedure
 62.50      1.06      1.06  fluid-ref*
  6.25      0.11      0.11  regexp-exec
  3.13      0.05      0.05  ice-9/boot-9.scm:1738:4:throw
  2.08      0.04      0.04  string-index
  2.08      0.04      0.04  write
  1.04    568.08      0.02  ice-9/boot-9.scm:1673:4:with-exception-handler
  1.04      0.02      0.02  %read-line
  1.04      0.02      0.02  guix/ci.scm:78:0:json->build
  1.04      0.02      0.02  string-append
--8<---------------cut here---------------end--------------->8---

More than half of the time spent in ‘fluid-ref*’—sounds fishy.

Where does that that call come from?  There seems to be a single caller,
in boot-9.scm:

    (define* (raise-exception exn #:key (continuable? #f))
      (define (capture-current-exception-handlers)
        ;; FIXME: This is quadratic.
        (let lp ((depth 0))
          (let ((h (fluid-ref* %exception-handler depth)))
            (if h
                (cons h (lp (1+ depth)))
                (list fallback-exception-handler)))))
      ;; …
      )

We must be abusing exceptions somewhere…

Indeed, there’s one place on the hot path where we install exception
handlers: in ‘http-multiple-get’ (from commit
205833b72c5517915a47a50dbe28e7024dc74e57).  I don’t think it’s needed,
is it?  (But if it is, let’s find another approach, this one is
prohibitively expensive.)

A simple performance test is:

  rm -rf ~/.cache/guix/substitute/
  time ./pre-inst-env guix weather $(guix package -A|head -500| cut -f1)

After removing this ‘catch’ in ‘http-multiple-get’, the profile is
flatter:

--8<---------------cut here---------------start------------->8---
%     cumulative   self             
time   seconds     seconds  procedure
  8.33      0.07      0.07  string-index
  8.33      0.07      0.07  regexp-exec
  5.56      0.05      0.05  anon #x154af88
  5.56      0.05      0.05  write
  5.56      0.05      0.05  string-tokenize
  5.56      0.05      0.05  read-char
  5.56      0.05      0.05  set-certificate-credentials-x509-trust-data!
  5.56      0.05      0.05  %read-line
--8<---------------cut here---------------end--------------->8---

There’s also this ‘call-with-connection-error-handling’ call in (guix
substitute), around an ‘http-multiple-get’ call, that may not be
justified.

Attached is a diff of the tweaks I made to test this.

WDYT, Chris?

Ludo’.

[Message part 4 (text/x-patch, inline)]
diff --git a/guix/http-client.scm b/guix/http-client.scm
index 4b4c14ed0b..a28523201e 100644
--- a/guix/http-client.scm
+++ b/guix/http-client.scm
@@ -219,42 +219,21 @@ returning."
              (remainder
               (connect p remainder result))))
           ((head tail ...)
-           (catch #t
-             (lambda ()
-               (let* ((resp   (read-response p))
-                      (body   (response-body-port resp))
-                      (result (proc head resp body result)))
-                 ;; The server can choose to stop responding at any time,
-                 ;; in which case we have to try again.  Check whether
-                 ;; that is the case.  Note that even upon "Connection:
-                 ;; close", we can read from BODY.
-                 (match (assq 'connection (response-headers resp))
-                   (('connection 'close)
-                    (close-port p)
-                    (connect #f                       ;try again
-                             (drop requests (+ 1 processed))
-                             result))
-                   (_
-                    (loop tail (+ 1 processed) result))))) ;keep going
-             (lambda (key . args)
-               ;; If PORT was cached and the server closed the connection
-               ;; in the meantime, we get EPIPE.  In that case, open a
-               ;; fresh connection and retry.  We might also get
-               ;; 'bad-response or a similar exception from (web response)
-               ;; later on, once we've sent the request, or a
-               ;; ERROR/INVALID-SESSION from GnuTLS.
-               (if (or (and (eq? key 'system-error)
-                            (= EPIPE (system-error-errno `(,key ,@args))))
-                       (and (eq? key 'gnutls-error)
-                            (eq? (first args) error/invalid-session))
-                       (memq key
-                             '(bad-response bad-header bad-header-component)))
-                   (begin
-                     (close-port p)
-                     (connect #f      ; try again
-                              (drop requests (+ 1 processed))
-                              result))
-                   (apply throw key args))))))))))
+           (let* ((resp   (read-response p))
+                  (body   (response-body-port resp))
+                  (result (proc head resp body result)))
+             ;; The server can choose to stop responding at any time,
+             ;; in which case we have to try again.  Check whether
+             ;; that is the case.  Note that even upon "Connection:
+             ;; close", we can read from BODY.
+             (match (assq 'connection (response-headers resp))
+               (('connection 'close)
+                (close-port p)
+                (connect #f                       ;try again
+                         (drop requests (+ 1 processed))
+                         result))
+               (_
+                (loop tail (+ 1 processed) result))))))))))
 
 
 ;;;
diff --git a/guix/scripts/weather.scm b/guix/scripts/weather.scm
index 5164fe0494..3d8d50d5e3 100644
--- a/guix/scripts/weather.scm
+++ b/guix/scripts/weather.scm
@@ -184,9 +184,10 @@ Return the coverage ratio, an exact number between 0 and 1."
   (let/time ((time narinfos requests-made
                    (lookup-narinfos
                     server items
-                    #:make-progress-reporter
-                    (lambda* (total #:key url #:allow-other-keys)
-                      (progress-reporter/bar total)))))
+                    ;; #:make-progress-reporter
+                    ;; (lambda* (total #:key url #:allow-other-keys)
+                    ;;   (progress-reporter/bar total))
+                    )))
     (format #t "~a~%" server)
     (let ((obtained  (length narinfos))
           (requested (length items))
@@ -504,6 +505,7 @@ SERVER.  Display information for packages with at least THRESHOLD dependents."
 ;;; Entry point.
 ;;;
 
+(use-modules (statprof))
 (define-command (guix-weather . args)
   (synopsis "report on the availability of pre-built package binaries")
 
@@ -551,9 +553,11 @@ SERVER.  Display information for packages with at least THRESHOLD dependents."
         (exit
          (every* (lambda (server)
                    (define coverage
-                     (report-server-coverage server items
-                                             #:display-missing?
-                                             (assoc-ref opts 'display-missing?)))
+                     (statprof
+                      (lambda ()
+                       (report-server-coverage server items
+                                               #:display-missing?
+                                               (assoc-ref opts 'display-missing?)))))
                    (match (assoc-ref opts 'coverage)
                      (#f #f)
                      (threshold
diff --git a/guix/substitutes.scm b/guix/substitutes.scm
index 08f8c24efd..04bf70caaa 100644
--- a/guix/substitutes.scm
+++ b/guix/substitutes.scm
@@ -59,8 +59,6 @@
   #:use-module (guix http-client)
   #:export (%narinfo-cache-directory
 
-            call-with-connection-error-handling
-
             lookup-narinfos
             lookup-narinfos/diverse))
 
@@ -235,14 +233,11 @@ if file doesn't exist, and the narinfo otherwise."
        (let* ((requests (map (cut narinfo-request url <>) paths))
               (result   (begin
                           (start-progress-reporter! progress-reporter)
-                          (call-with-connection-error-handling
-                           uri
-                           (lambda ()
-                             (http-multiple-get uri
-                                                handle-narinfo-response '()
-                                                requests
-                                                #:open-connection open-connection
-                                                #:verify-certificate? #f))))))
+                          (http-multiple-get uri
+                                             handle-narinfo-response '()
+                                             requests
+                                             #:open-connection open-connection
+                                             #:verify-certificate? #f))))
          (stop-progress-reporter! progress-reporter)
          result))
       ((file #f)
[Message part 5 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 47283-done <at> debbugs.gnu.org, 47288-done <at> debbugs.gnu.org
Subject: Re: bug#47288: [PATCH] guix: http-client: Tweak http-multiple-get
 error handling.
Date: Sat, 27 Mar 2021 22:58:39 +0100
Hi!

I went ahead and pushed 45fce38fb0b6c6796906149ade145b8d3594c1c6 along
these lines.

‘guix weather’ runs to completion and things seem to work fine.  Let me
know if you notice anything wrong.

Thank you for your work and for your patience on this issue!

Ludo’.


This bug report was last modified 4 years and 55 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.