GNU bug report logs -
#47283
Performance regression in narinfo fetching
Reported by: Ludovic Courtès <ludo@gnu.org>
Date: Sat, 20 Mar 2021 17:39:01 UTC
Severity: important
Done: Ludovic Courtès <ludo@gnu.org>
Message #22 received at 47283@debbugs.gnu.org:
Ludovic Courtès <ludo@gnu.org> writes:
> Christopher Baines <mail@cbaines.net> writes:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Indeed, there’s one place on the hot path where we install exception
>>> handlers: in ‘http-multiple-get’ (from commit
>>> 205833b72c5517915a47a50dbe28e7024dc74e57). I don’t think it’s needed,
>>> is it? (But if it is, let’s find another approach, this one is
>>> prohibitively expensive.)
>>
>> I think the exception handling has moved around, but I guess the
>> exceptions that could be caught in http-multiple-get could happen,
>> right? I am really just guessing here, as Guile doesn't help tell you
>> about possible exceptions, and I haven't spent enough time to read all
>> the possible code involved to find out if these are definitely possible.
>
> Yeah.
>
> Commit 205833b72c5517915a47a50dbe28e7024dc74e57 added a ‘catch’ block
> that catches the same things as ‘with-cached-connection’ did (it would
> be better to not duplicate it IMO). That includes EPIPE, gnutls-error,
> bad-response & co.
So, my intention here was to move the error handling, to allow
separating out the connection caching code from the code I wanted to
move out to the (guix substitutes) module. I don't think there's
currently any duplication in the error handling for the code path
involving http-multiple-get, at least for the exceptions in question
here.
> Earlier, commit be5a75ebb5988b87b2392e2113f6590f353dd6cd (“substitute:
> Reuse connections for '--query'.”) did not add such a ‘catch’ block in
> ‘http-multiple-get’. Instead, it wrapped its call in ‘do-fetch’ in
> ‘fetch-narinfos’:
>
> (define (do-fetch uri)
>   (case (and=> uri uri-scheme)
>     ((http https)
> -    (let ((requests (map (cut narinfo-request url <>) paths)))
> -      (match (open-connection-for-uri/maybe uri)
> -        (#f
> -         '())
> -        (port
> -         (update-progress!)
>           ;; Note: Do not check HTTPS server certificates to avoid depending
>           ;; on the X.509 PKI.  We can do it because we authenticate
>           ;; narinfos, which provides a much stronger guarantee.
> -         (let ((result (http-multiple-get uri
> +    (let* ((requests (map (cut narinfo-request url <>) paths))
> +           (result (call-with-cached-connection uri
> +                     (lambda (port)
> +                       (if port
> +                           (begin
> +                             (update-progress!)
> +                             (http-multiple-get uri
>                                                  handle-narinfo-response '()
>                                                  requests
> +                                                #:open-connection
> +                                                open-connection-for-uri/cached
>                                                  #:verify-certificate? #f
> -                                               #:port port)))
>
> This bit is still there in current ‘master’, so I think it’s not
> necessary to catch these exceptions in ‘http-multiple-get’ itself, and I
> would just remove the ‘catch’ wrap altogether.
>
> WDYT?
I'm not sure what you're referring to as still being there on the master
branch?
Looking at this particular code path and how it's affected by my recent
changes, starting at lookup-narinfos, before:
- lookup-narinfos calls fetch-narinfos, which calls do-fetch
- call-with-cached-connection is used, which catches a number of
exceptions relating to requests, and will retry PROC once upon a
matching exception (roughly as sketched just after this list)
- open-connection-for-uri/maybe is also used, which is like
open-connection-for-uri/cached, except it includes error handling for
establishing connections to substitute servers
- http-multiple-get doesn't include error handling
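(Purely for illustration, that retry-once behaviour looked roughly like
the sketch below.  The names call-with-retry-once and open-port are
made up for this example, and the real code was more careful, e.g.
checking specifically for EPIPE rather than treating every
system-error as retryable.)

  (define (call-with-retry-once open-port proc)
    ;; Call PROC with a (possibly cached) connection returned by
    ;; OPEN-PORT.  On a "connection died" style exception, retry PROC
    ;; exactly once with a fresh port; a second failure, or any other
    ;; kind of exception, propagates to the caller.
    (catch #t
      (lambda ()
        (proc (open-port)))
      (lambda (key . args)
        (if (memq key '(system-error gnutls-error
                        bad-response bad-header bad-header-component))
            (proc (open-port))          ;retry once, then give up
            (apply throw key args)))))
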
After:
- lookup-narinfos calls fetch-narinfos, which calls do-fetch
- call-with-connection-error-handling is used, which performs the same
role as the error handling previously within
open-connection-for-uri/maybe, catching exceptions relating to
establishing connections to substitute servers (see the sketch after
this list)
- http-multiple-get now includes error handling similar to what was
previously done by call-with-cached-connection, although it's more
complicated since it's done with knowledge of what http-multiple-get
is doing
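(Again just as an illustration, the connection-establishment error
handling has roughly this shape; the name and details below are made
up rather than being the actual call-with-connection-error-handling,
which handles more kinds of connection errors than just 'system-error.)

  (use-modules (ice-9 format)
               (web uri))

  (define (with-connection-error-handling uri thunk)
    ;; Run THUNK, handling only errors raised while *establishing* a
    ;; connection to URI (ECONNREFUSED and the like); anything that
    ;; happens once requests are in flight is the request loop's
    ;; business.  On failure, warn and give up on this server, here by
    ;; returning the empty list.
    (catch 'system-error
      thunk
      (lambda args
        (format (current-error-port) "warning: ~a is unreachable~%"
                (uri->string uri))
        '())))
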
I think that the error handling now in http-multiple-get isn't covered
elsewhere. Moving this error handling back into fetch-narinfos is
possible, but then we'd be back to handling connection caching in that
code, and avoiding that is what led to this refactoring in the first
place.
Also, apart from the implementation problems, I do think that the error
handling here is better than before. Previously, if you called
lookup-narinfos and a connection problem occurred, processing all the
requests would start from scratch (as call-with-cached-connection calls
PROC a second time). If a second connection error then happened, it
wouldn't be caught, since call-with-cached-connection only handles one
error.
I think it's possible that http-multiple-get will be making thousands of
requests, for example when running guix weather with no cached results.
The error handling in http-multiple-get is smarter than the previous
approach: on a connection error it redoes as little work as possible,
and it's not limited to catching a single exception.
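(To make that concrete, the "redo as little as possible" idea is
roughly the following.  This is a made-up sketch, not the actual
http-multiple-get, which pipelines requests and therefore has more
involved recovery logic.)

  (define (fetch-all open-port fetch-one requests)
    ;; Send each request in REQUESTS over a port obtained from
    ;; OPEN-PORT, collecting the responses returned by FETCH-ONE.  If
    ;; the connection dies part-way through, reconnect and carry on
    ;; with only the requests that haven't been answered yet.  (No cap
    ;; on reconnection attempts, to keep the sketch short.)
    (let loop ((port (open-port))
               (remaining requests)
               (responses '()))
      (if (null? remaining)
          (reverse responses)
          (let ((response
                 (catch #t
                   (lambda ()
                     (fetch-one port (car remaining)))
                   (lambda (key . args)
                     (if (memq key '(system-error gnutls-error
                                     bad-response bad-header))
                         'connection-lost   ;sentinel: reconnect below
                         (apply throw key args))))))
            (if (eq? response 'connection-lost)
                (loop (open-port) remaining responses)
                (loop port (cdr remaining)
                      (cons response responses)))))))

With thousands of requests and a connection dropped near the end, only
the unanswered tail gets re-requested rather than the whole list.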