GNU bug report logs - #47126
[PATCH 0/7] Add 'generic-html' updater
Reported by: Ludovic Courtès <ludo <at> gnu.org>
Date: Sat, 13 Mar 2021 21:44:02 UTC
Severity: normal
Tags: fixed, patch
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 47126 in the body.
You can then email your comments to 47126 AT debbugs.gnu.org in the normal way.
Report forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:44:02 GMT)
Acknowledgement sent to Ludovic Courtès <ludo <at> gnu.org>: New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Sat, 13 Mar 2021 21:44:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hi!
These patches raise ‘guix refresh’ coverage from 78% to 88%, as
reported by ‘guix refresh --list-updaters’ (both figures are probably
slightly overestimated), by adding a new ‘generic-html’ updater.
The updater crawls the web page where the package’s source tarball
is stored, using Guile-Lib’s (htmlprag), which we have depended on since
commit 02e2e093e858e8a0ca7bd66c1f1f6fd0a1705edb. Among other things,
it handles freedesktop.org packages.
Feedback welcome!
Thanks,
Ludo’.
Ludovic Courtès (7):
gnu-maintenance: Use (htmlprag) for 'latest-html-release'.
gnu-maintenance: 'latest-html-release' considers non-relative URLs.
gnu-maintenance: 'release-file?' rejects checksum files.
gnu-maintenance: 'latest-html-release' can determine signature file
name.
gnu-maintenance: 'latest-html-release' better computes version number.
gnu-maintenance: Add 'generic-html' updater.
gnu: hwloc: Add 'release-monitoring-url' property.
doc/guix.texi | 6 +-
gnu/packages/mpi.scm | 6 ++
guix/gnu-maintenance.scm | 136 ++++++++++++++++++++++++++++-----------
3 files changed, 108 insertions(+), 40 deletions(-)
--
2.30.1
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:47:02 GMT)
Message #8 received at 47126 <at> debbugs.gnu.org:
* guix/gnu-maintenance.scm (html->sxml): Remove. Autoload (htmlprag)
instead.
* doc/guix.texi (Requirements): Mention 'guix refresh' for the Guile-Lib
dependency.
---
doc/guix.texi | 3 ++-
guix/gnu-maintenance.scm | 13 +------------
2 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/doc/guix.texi b/doc/guix.texi
index 4cf241c56a..97094a7d0a 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -865,7 +865,8 @@ the @code{crate} importer (@pxref{Invoking guix import}).
@item
@uref{https://www.nongnu.org/guile-lib/doc/ref/htmlprag/, Guile-Lib} for
-the @code{go} importer (@pxref{Invoking guix import}).
+the @code{go} importer (@pxref{Invoking guix import}) and for some of
+the ``updaters'' (@pxref{Invoking guix refresh}).
@item
When @url{http://www.bzip.org, libbz2} is available,
diff --git a/guix/gnu-maintenance.scm b/guix/gnu-maintenance.scm
index 9e393d18cd..febed57c3a 100644
--- a/guix/gnu-maintenance.scm
+++ b/guix/gnu-maintenance.scm
@@ -38,6 +38,7 @@
#:use-module (guix upstream)
#:use-module (guix packages)
#:autoload (zlib) (call-with-gzip-input-port)
+ #:autoload (htmlprag) (html->sxml) ;from Guile-Lib
#:export (gnu-package-name
gnu-package-mundane-name
gnu-package-copyright-holder
@@ -447,18 +448,6 @@ hosted on ftp.gnu.org, or not under that name (this is the case for
;;; Latest HTTP release.
;;;
-(define (html->sxml port)
- "Read HTML from PORT and return the corresponding SXML tree."
- (let ((str (get-string-all port)))
- (catch #t
- (lambda ()
- ;; XXX: This is the poor developer's HTML-to-XML converter. It's good
- ;; enough for directory listings at <https://kernel.org/pub> but if
- ;; needed we could resort to (htmlprag) from Guile-Lib.
- (call-with-input-string (string-replace-substring str "<hr>" "<hr />")
- xml->sxml))
- (const '(html))))) ;parse error
-
(define (html-links sxml)
"Return the list of links found in SXML, the SXML tree of an HTML page."
(let loop ((sxml sxml)
--
2.30.1
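
To illustrate what the crawler works on: 'html->sxml' returns an SXML tree, and the candidate links are the 'href' attributes of its 'a' elements. The following is a minimal sketch in plain Guile. It is not the 'html-links' procedure from guix/gnu-maintenance.scm; the name 'collect-links' and the hand-written 'page' tree are assumptions made for this example, so that it runs without Guile-Lib.

```scheme
(use-modules (ice-9 match)
             (srfi srfi-1))

(define (collect-links sxml)
  "Return the 'href' values of all the <a> elements in SXML, in document
order."
  (reverse
   (let loop ((sxml sxml)
              (links '()))
     (match sxml
       ;; An <a> element with an attribute list: record its 'href', if any,
       ;; then keep walking its children.
       (('a ('@ attributes ...) body ...)
        (match (assq 'href attributes)
          (('href url) (fold loop (cons url links) body))
          (_ (fold loop links body))))
       ;; Any other element (or attribute list): walk its children.
       (((? symbol?) body ...)
        (fold loop links body))
       ;; Strings and other leaves carry no links.
       (_ links)))))

;; Hypothetical directory listing, written by hand as SXML.
(define page
  '(html (body (h1 "Index of /pub/foo")
               (hr)
               (a (@ (href "foo-1.0.tar.gz")) "foo-1.0.tar.gz")
               (a (@ (href "foo-1.0.tar.gz.sig")) "foo-1.0.tar.gz.sig")
               (a (@ (name "top")) "anchor without href"))))

(collect-links page)
;; => ("foo-1.0.tar.gz" "foo-1.0.tar.gz.sig")
```

With a real parser the tree would come from the fetched page instead of a quoted literal; the traversal stays the same.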
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:47:02 GMT)
Message #11 received at 47126 <at> debbugs.gnu.org:
* guix/gnu-maintenance.scm (latest-html-release): Allow for URL to be an
arbitrary URL rather than a relative URL reference.
---
guix/gnu-maintenance.scm | 30 ++++++++++++++++--------------
1 file changed, 16 insertions(+), 14 deletions(-)
diff --git a/guix/gnu-maintenance.scm b/guix/gnu-maintenance.scm
index febed57c3a..98d326e500 100644
--- a/guix/gnu-maintenance.scm
+++ b/guix/gnu-maintenance.scm
@@ -1,5 +1,5 @@
;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès <ludo <at> gnu.org>
+;;; Copyright © 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo <at> gnu.org>
;;; Copyright © 2012, 2013 Nikita Karetnikov <nikita <at> karetnikov.org>
;;; Copyright © 2021 Simon Tournier <zimon.toutoune <at> gmail.com>
;;;
@@ -479,19 +479,21 @@ return the corresponding signature URL, or #f it signatures are unavailable."
(port (http-fetch/cached uri #:ttl 3600))
(sxml (html->sxml port)))
(define (url->release url)
- (and (string=? url (basename url)) ;relative reference?
- (release-file? package url)
- (let-values (((name version)
- (package-name->name+version
- (tarball-sans-extension url)
- #\-)))
- (upstream-source
- (package name)
- (version version)
- (urls (list (string-append base-url directory "/" url)))
- (signature-urls
- (list (file->signature
- (string-append base-url directory "/" url))))))))
+ (let* ((base (basename url))
+ (url (if (string=? base url)
+ (string-append base-url directory "/" url)
+ url)))
+ (and (release-file? package base)
+ (let-values (((name version)
+ (package-name->name+version
+ (tarball-sans-extension base)
+ #\-)))
+ (upstream-source
+ (package name)
+ (version version)
+ (urls (list url))
+ (signature-urls
+ (list (file->signature url))))))))
(define candidates
(filter-map url->release (html-links sxml)))
--
2.30.1
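
The link handling introduced by this patch can be tried in isolation. This is a sketch only: 'resolve-link' is a hypothetical helper name, and the example.org URLs are made up. It applies the same test as the patch, namely that a link equal to its own basename is a bare file name relative to the crawled directory.

```scheme
(define (resolve-link base-url directory link)
  "Return an absolute URL for LINK, an 'href' value found on the page at
BASE-URL/DIRECTORY/."
  ;; A bare file name equals its own basename; anything else is returned
  ;; unchanged, on the assumption that it is already a complete URL.
  (if (string=? (basename link) link)
      (string-append base-url directory "/" link)
      link))

(resolve-link "https://example.org" "/pub/foo" "foo-1.0.tar.gz")
;; => "https://example.org/pub/foo/foo-1.0.tar.gz"

(resolve-link "https://example.org" "/pub/foo"
              "https://cdn.example.org/foo/foo-1.0.tar.gz")
;; => "https://cdn.example.org/foo/foo-1.0.tar.gz"
```

Note that in this sketch a link containing a path but no scheme, such as "/pub/foo/foo-1.0.tar.gz", is also returned unchanged.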
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:47:03 GMT)
Message #14 received at 47126 <at> debbugs.gnu.org:
* guix/gnu-maintenance.scm (release-file?): Reject ".md5sum",
".sha1sum", and ".sha256sum".
---
guix/gnu-maintenance.scm | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/guix/gnu-maintenance.scm b/guix/gnu-maintenance.scm
index 98d326e500..a8b24fa336 100644
--- a/guix/gnu-maintenance.scm
+++ b/guix/gnu-maintenance.scm
@@ -247,7 +247,9 @@ network to check in GNU's database."
(define (release-file? project file)
"Return #f if FILE is not a release tarball of PROJECT, otherwise return
true."
- (and (not (member (file-extension file) '("sig" "sign" "asc")))
+ (and (not (member (file-extension file)
+ '("sig" "sign" "asc"
+ "md5sum" "sha1sum" "sha256sum")))
(and=> (regexp-exec %tarball-rx file)
(lambda (match)
;; Filter out unrelated files, like `guile-www-1.1.1'.
--
2.30.1
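
The extension check can be sketched on its own as follows. Guix has its own 'file-extension' helper; here a stand-in called 'file-extension*' is defined so that the snippet runs in plain Guile, and 'signature-or-checksum?' is likewise a hypothetical name. The list of extensions is the one from the patch.

```scheme
(define (file-extension* file)
  "Return the part of FILE after its last dot, or #f if there is no dot."
  (let ((dot (string-rindex file #\.)))
    (and dot (substring file (+ dot 1)))))

(define %non-release-extensions
  ;; Detached signatures and checksum files, as listed in the patch.
  '("sig" "sign" "asc" "md5sum" "sha1sum" "sha256sum"))

(define (signature-or-checksum? file)
  "Return #t if FILE looks like a signature or checksum file, #f otherwise."
  (and (member (file-extension* file) %non-release-extensions)
       #t))

(signature-or-checksum? "foo-1.0.tar.gz.sha256sum")  ;; => #t
(signature-or-checksum? "foo-1.0.tar.gz.asc")        ;; => #t
(signature-or-checksum? "foo-1.0.tar.gz")            ;; => #f
```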
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:47:03 GMT)
Message #17 received at 47126 <at> debbugs.gnu.org:
* guix/gnu-maintenance.scm (latest-html-release): #:file->signature
defaults to #f.
[file->signature/guess]: New procedure.
[url->release]: Use it when FILE->SIGNATURE is #f.
Introduce 'links' variable.
(url-prefix-rewrite): Check whether URL is true before calling
'string-prefix?'.
(latest-savannah-release): Adjust comment about detached signatures.
---
guix/gnu-maintenance.scm | 36 ++++++++++++++++++++++++------------
1 file changed, 24 insertions(+), 12 deletions(-)
diff --git a/guix/gnu-maintenance.scm b/guix/gnu-maintenance.scm
index a8b24fa336..3bffa4d11e 100644
--- a/guix/gnu-maintenance.scm
+++ b/guix/gnu-maintenance.scm
@@ -470,16 +470,29 @@ hosted on ftp.gnu.org, or not under that name (this is the case for
#:key
(base-url "https://kernel.org/pub")
(directory (string-append "/" package))
- (file->signature (cut string-append <> ".sig")))
+ file->signature)
"Return an <upstream-source> for the latest release of PACKAGE (a string) on
SERVER under DIRECTORY, or #f. BASE-URL should be the URL of an HTML page,
typically a directory listing as found on 'https://kernel.org/pub'.
-FILE->SIGNATURE must be a procedure; it is passed a source file URL and must
-return the corresponding signature URL, or #f it signatures are unavailable."
- (let* ((uri (string->uri (string-append base-url directory "/")))
- (port (http-fetch/cached uri #:ttl 3600))
- (sxml (html->sxml port)))
+When FILE->SIGNATURE is omitted or #f, guess the detached signature file name,
+if any. Otherwise, FILE->SIGNATURE must be a procedure; it is passed a source
+file URL and must return the corresponding signature URL, or #f it signatures
+are unavailable."
+ (let* ((uri (string->uri (string-append base-url directory "/")))
+ (port (http-fetch/cached uri #:ttl 3600))
+ (sxml (html->sxml port))
+ (links (delete-duplicates (html-links sxml))))
+ (define (file->signature/guess url)
+ (let ((base (basename url)))
+ (any (lambda (link)
+ (any (lambda (extension)
+ (and (string=? (string-append base extension)
+ (basename link))
+ (string-append url extension)))
+ '(".asc" ".sig" ".sign")))
+ links)))
+
(define (url->release url)
(let* ((base (basename url))
(url (if (string=? base url)
@@ -495,10 +508,10 @@ return the corresponding signature URL, or #f it signatures are unavailable."
(version version)
(urls (list url))
(signature-urls
- (list (file->signature url))))))))
+ (list ((or file->signature file->signature/guess) url))))))))
(define candidates
- (filter-map url->release (html-links sxml)))
+ (filter-map url->release links))
(close-port port)
(match candidates
@@ -614,7 +627,7 @@ releases are on gnu.org."
(define (url-prefix-rewrite old new)
"Return a one-argument procedure that rewrites URL prefix OLD to NEW."
(lambda (url)
- (if (string-prefix? old url)
+ (if (and url (string-prefix? old url))
(string-append new (string-drop url (string-length old)))
url)))
@@ -646,9 +659,8 @@ releases are on gnu.org."
(directory (dirname (uri-path uri)))
(rewrite (url-prefix-rewrite %savannah-base
"mirror://savannah")))
- ;; Note: We use the default 'file->signature', which adds ".sig", but not
- ;; all projects on Savannah follow that convention: some use ".asc" and
- ;; perhaps some lack signatures altogether.
+ ;; Note: We use the default 'file->signature', which adds ".sig", ".asc",
+ ;; or whichever detached signature naming scheme PACKAGE uses.
(and=> (latest-html-release package
#:base-url %savannah-base
#:directory directory)
--
2.30.1
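
The guessing logic of 'file->signature/guess' can be exercised outside Guix. The sketch below mirrors its structure but takes the list of links as an explicit argument; the name 'guess-signature-url' and the sample links are assumptions made for illustration.

```scheme
(use-modules (srfi srfi-1))

(define %signature-extensions
  ;; Candidate suffixes for detached signatures, as in the patch.
  '(".asc" ".sig" ".sign"))

(define (guess-signature-url url links)
  "Return the signature URL for URL if LINKS, the links found on the same
page, contain its file name followed by a known signature extension; return
#f otherwise."
  (let ((base (basename url)))
    (any (lambda (link)
           (any (lambda (extension)
                  (and (string=? (basename link)
                                 (string-append base extension))
                       (string-append url extension)))
                %signature-extensions))
         links)))

;; Hypothetical links from a directory listing.
(define links
  '("foo-1.0.tar.gz" "foo-1.0.tar.gz.asc" "bar-2.0.tar.gz"))

(guess-signature-url "https://example.org/pub/foo-1.0.tar.gz" links)
;; => "https://example.org/pub/foo-1.0.tar.gz.asc"

(guess-signature-url "https://example.org/pub/bar-2.0.tar.gz" links)
;; => #f
```

Returning #f when nothing matches lets the caller treat the release as unsigned rather than pointing at a signature URL that does not exist.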
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:47:03 GMT)
Message #20 received at 47126 <at> debbugs.gnu.org:
* guix/gnu-maintenance.scm (latest-html-release): Use 'tarball->version'
rather than 'package-name->name+version' to extract the version number.
This fixes problems with packages like 'netsurf' and 'libdom' that have
"-src" in their tarball name, where "src" would be taken as the new
version number.
---
guix/gnu-maintenance.scm | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/guix/gnu-maintenance.scm b/guix/gnu-maintenance.scm
index 3bffa4d11e..5aa16acfde 100644
--- a/guix/gnu-maintenance.scm
+++ b/guix/gnu-maintenance.scm
@@ -499,12 +499,9 @@ are unavailable."
(string-append base-url directory "/" url)
url)))
(and (release-file? package base)
- (let-values (((name version)
- (package-name->name+version
- (tarball-sans-extension base)
- #\-)))
+ (let ((version (tarball->version base)))
(upstream-source
- (package name)
+ (package package)
(version version)
(urls (list url))
(signature-urls
--
2.30.1
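
The "-src" problem described in this commit message is easy to reproduce. The sketch below is not Guix's 'tarball->version'; both procedures and the regular expression are assumptions written for this example, and the file names are illustrative. It contrasts splitting at the last hyphen with a pattern that expects a numeric version and tolerates a trailing "-src".

```scheme
(use-modules (ice-9 regex))

(define (naive-version name)
  "Return whatever follows the last hyphen in NAME, or #f."
  (let ((dash (string-rindex name #\-)))
    (and dash (substring name (+ dash 1)))))

(define %tarball-version-rx
  ;; NAME, a hyphen or underscore, a dotted numeric version, an optional
  ;; "-src" suffix, then a tarball or zip extension.
  (make-regexp
   "^(.*)[-_]([0-9][0-9.]*)(-src)?\\.(tar\\.[a-z0-9]+|tgz|zip)$"))

(define (tarball-version file)
  "Return the version string embedded in FILE, or #f if none is found."
  (let ((m (regexp-exec %tarball-version-rx file)))
    (and m (match:substring m 2))))

(naive-version "netsurf-3.10-src")          ;; => "src"
(tarball-version "netsurf-3.10-src.tar.gz") ;; => "3.10"
(tarball-version "hwloc-2.4.1.tar.bz2")     ;; => "2.4.1"
```

This simplified pattern only accepts digits and dots in the version, so a name such as "foo-1.0rc1.tar.gz" yields #f here.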
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:47:04 GMT)
Message #23 received at 47126 <at> debbugs.gnu.org:
* gnu/packages/mpi.scm (hwloc-1)[properties]: New field.
---
gnu/packages/mpi.scm | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/gnu/packages/mpi.scm b/gnu/packages/mpi.scm
index 53ee6ef1cd..a8ebd8aeb8 100644
--- a/gnu/packages/mpi.scm
+++ b/gnu/packages/mpi.scm
@@ -66,6 +66,12 @@
(sha256
(base32
"0za1b9lvrm3rhn0lrxja5f64r0aq1qs4m0pxn1ji2mbi8ndppyyx"))))
+
+ (properties
+ ;; Tell the 'generic-html' updater to monitor this URL for updates.
+ `((release-monitoring-url
+ . "https://www-lb.open-mpi.org/software/hwloc/current")))
+
(build-system gnu-build-system)
(outputs '("out" ;'lstopo' & co., depends on Cairo, libx11, etc.
"lib" ;small closure
--
2.30.1
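
How the 'release-monitoring-url' property changes the crawled location can be sketched in plain Guile. The procedure name 'monitoring-location' and the example.org source URL are assumptions; the computation follows the 'base' and 'directory' bindings of 'latest-html-updatable-release' in this series, with the package record replaced by a source URL and a property alist.

```scheme
(use-modules (web uri))

(define (monitoring-location source-url properties)
  "Return two values, the base URL and the directory to crawl, for a package
whose source is at SOURCE-URL and whose properties are the alist PROPERTIES."
  (let ((uri    (string->uri source-url))
        (custom (assoc-ref properties 'release-monitoring-url)))
    (if custom
        ;; An explicit monitoring URL is used as is, with no directory.
        (values custom "")
        ;; Otherwise crawl the directory that contains the tarball.
        (values (string-append (symbol->string (uri-scheme uri))
                               "://" (uri-host uri))
                (dirname (uri-path uri))))))

;; Without the property: scheme and host, then the tarball's directory.
(call-with-values
    (lambda ()
      (monitoring-location "https://www.example.org/pub/foo/foo-1.0.tar.gz"
                           '()))
  list)
;; => ("https://www.example.org" "/pub/foo")

;; With the property from the hwloc patch.
(call-with-values
    (lambda ()
      (monitoring-location
       "https://www.example.org/pub/foo/foo-1.0.tar.gz"
       '((release-monitoring-url
          . "https://www-lb.open-mpi.org/software/hwloc/current"))))
  list)
;; => ("https://www-lb.open-mpi.org/software/hwloc/current" "")
```

In both cases 'latest-html-release' then fetches the page at base URL, directory, and a trailing "/" appended together.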
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Sat, 13 Mar 2021 21:47:04 GMT)
Message #26 received at 47126 <at> debbugs.gnu.org:
This brings total updater coverage, as reported by 'guix refresh
--list-updaters', from 78% to 88.3%. Among many other things, it covers
freedesktop.org packages.
* guix/gnu-maintenance.scm (html-updatable-package?)
(latest-html-updatable-release): New procedures.
(%generic-html-updater): New variable.
* doc/guix.texi (Invoking guix refresh): Document it.
---
doc/guix.texi | 3 +++
guix/gnu-maintenance.scm | 58 +++++++++++++++++++++++++++++++++++++++-
2 files changed, 60 insertions(+), 1 deletion(-)
diff --git a/doc/guix.texi b/doc/guix.texi
index 97094a7d0a..89c8c58295 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -11693,6 +11693,9 @@ the updater for @uref{https://www.stackage.org, Stackage} packages.
the updater for @uref{https://crates.io, Crates} packages.
@item launchpad
the updater for @uref{https://launchpad.net, Launchpad} packages.
+@item generic-html
+a generic updater that crawls the HTML page where the source tarball of
+the package is hosted, when applicable.
@end table
For instance, the following command only checks for updates of Emacs
diff --git a/guix/gnu-maintenance.scm b/guix/gnu-maintenance.scm
index 5aa16acfde..ced5497b37 100644
--- a/guix/gnu-maintenance.scm
+++ b/guix/gnu-maintenance.scm
@@ -28,6 +28,7 @@
#:use-module (srfi srfi-1)
#:use-module (srfi srfi-11)
#:use-module (srfi srfi-26)
+ #:use-module (srfi srfi-34)
#:use-module (rnrs io ports)
#:use-module (system foreign)
#:use-module (guix http-client)
@@ -66,7 +67,8 @@
%gnu-ftp-updater
%savannah-updater
%xorg-updater
- %kernel.org-updater))
+ %kernel.org-updater
+ %generic-html-updater))
;;; Commentary:
;;;
@@ -697,6 +699,53 @@ releases are on gnu.org."
#:file->signature file->signature)
(cut adjusted-upstream-source <> rewrite))))
+(define html-updatable-package?
+ ;; Return true if the given package may be handled by the generic HTML
+ ;; updater.
+ (let ((hosting-sites '("github.com" "github.io" "gitlab.com"
+ "notabug.org" "sr.ht"
+ "gforge.inria.fr" "gitlab.inria.fr"
+ "ftp.gnu.org" "download.savannah.gnu.org"
+ "pypi.org" "crates.io" "rubygems.org"
+ "bioconductor.org")))
+ (url-predicate (lambda (url)
+ (match (string->uri url)
+ (#f #f)
+ (uri
+ (let ((scheme (uri-scheme uri))
+ (host (uri-host uri)))
+ (and (memq scheme '(http https))
+ (not (member host hosting-sites))))))))))
+
+(define (latest-html-updatable-release package)
+ "Return the latest release of PACKAGE. Do that by crawling the HTML page of
+the directory containing its source tarball."
+ (let* ((uri (string->uri
+ (match (origin-uri (package-source package))
+ ((? string? url) url)
+ ((url _ ...) url))))
+ (custom (assoc-ref (package-properties package)
+ 'release-monitoring-url))
+ (base (or custom
+ (string-append (symbol->string (uri-scheme uri))
+ "://" (uri-host uri))))
+ (directory (if custom
+ ""
+ (dirname (uri-path uri))))
+ (package (package-upstream-name package)))
+ (catch #t
+ (lambda ()
+ (guard (c ((http-get-error? c) #f))
+ (latest-html-release package
+ #:base-url base
+ #:directory directory)))
+ (lambda (key . args)
+ ;; Return false and move on upon connection failures.
+ (unless (memq key '(gnutls-error tls-certificate-error
+ system-error))
+ (apply throw key args))
+ #f))))
+
(define %gnu-updater
;; This is for everything at ftp.gnu.org.
(upstream-updater
@@ -737,4 +786,11 @@ releases are on gnu.org."
(pred (url-prefix-predicate "mirror://kernel.org/"))
(latest latest-kernel.org-release)))
+(define %generic-html-updater
+ (upstream-updater
+ (name 'generic-html)
+ (description "Updater that crawls HTML pages.")
+ (pred html-updatable-package?)
+ (latest latest-html-updatable-release)))
+
;;; gnu-maintenance.scm ends here
--
2.30.1
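
The test performed by 'html-updatable-package?' boils down to a check on the source URL: the scheme must be HTTP or HTTPS and the host must not be one of the listed hosting sites. Below is a standalone sketch of that check. 'crawlable-url?' is a hypothetical name, the host list is an abbreviated subset of the one in the patch, and the sample URLs are made up.

```scheme
(use-modules (web uri))

(define %excluded-hosts
  ;; Abbreviated subset of the hosting sites listed in the patch.
  '("github.com" "gitlab.com" "ftp.gnu.org" "pypi.org"))

(define (crawlable-url? url)
  "Return #t if URL is an HTTP or HTTPS URL whose host is not in
%EXCLUDED-HOSTS, and #f otherwise."
  (let ((uri (string->uri url)))
    (and uri
         (memq (uri-scheme uri) '(http https))
         (not (member (uri-host uri) %excluded-hosts)))))

(crawlable-url? "https://www.freedesktop.org/software/foo/foo-1.0.tar.xz")
;; => #t

(crawlable-url? "https://github.com/foo/bar/archive/v1.0.tar.gz")
;; => #f

(crawlable-url? "mirror://gnu/hello/hello-2.10.tar.gz")
;; => #f
```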
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Wed, 17 Mar 2021 10:20:01 GMT)
Message #29 received at 47126 <at> debbugs.gnu.org:
Hello!
That's awesome, thanks a lot Ludo!!
I am wondering: does this handle cases where there's a versioned
subfolder that in turn contains a versioned tarball?
Like GNOME for example:
https://download.gnome.org/sources/NetworkManager/1.31/NetworkManager-1.31.1.tar.xz
I see this is a generic solution, and I see you made some options
available to customize it per package as needed, but can we get as
precise/reliable as Debian's watch/uscan with that?
Or, if I understand correctly, should we always point it to a page where
the link for the latest release is always published? That last thing
really sounds nice!
Léo
Information forwarded to guix-patches <at> gnu.org: bug#47126; Package guix-patches. (Wed, 17 Mar 2021 13:53:01 GMT)
Message #32 received at 47126 <at> debbugs.gnu.org:
Hi Léo,
Léo Le Bouter <lle-bout <at> zaclys.net> writes:
> That's awesome, thanks a lot Ludo!!
Just pushed this series as fe96f64110676f28b948f0d31a1726501abdae0e.
Unleash your update powers, comrades! :-)
> I am wondering: does this handle cases where there's a versioned
> subfolder that in turn contains a versioned tarball?
>
> Like GNOME for example:
> https://download.gnome.org/sources/NetworkManager/1.31/NetworkManager-1.31.1.tar.xz
>
> I see this is a generic solution, and I see you made some options
> available to customize it per package as needed, but can we get as
> precise/reliable as Debian's watch/uscan with that?
There’s a ‘gnome’ updater for GNOME:
https://guix.gnu.org/manual/en/html_node/Invoking-guix-refresh.html
And yes, it actually works. :-)
In the case of NetworkManager, there’s a bug right now:
--8<---------------cut here---------------start------------->8---
$ guix refresh network-manager
following redirection to 'https://download.gnome.org/sources/NetworkManager/cache.json'...
following redirection to 'https://fr2.rpmfind.net/linux/gnome.org/sources/NetworkManager/cache.json'...
gnu/packages/gnome.scm:7648:13: network-manager would be upgraded from 1.24.0 to rc2
--8<---------------cut here---------------end--------------->8---
I’ll see what’s up. But otherwise ‘guix refresh -t gnome’ produces
sensible results.
At any rate, updaters sometimes bitrot and produce buggy results, as in
the example above. Please do use ‘guix refresh’ and report any issues!
Also, there are still ~12% of packages to which none of the updaters
applies. We should investigate and see how we can bring that down to
zero.
Thanks for your feedback!
Ludo’.
Added tag(s) fixed. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 17 Mar 2021 13:54:02 GMT)
bug closed, send any further explanations to 47126 <at> debbugs.gnu.org and Ludovic Courtès <ludo <at> gnu.org>. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 17 Mar 2021 13:54:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 15 Apr 2021 11:24:07 GMT)
This bug report was last modified 4 years and 127 days ago.