Package: guix-patches
Reported by: Maxime Devos <maximedevos <at> telenet.be>
Date: Mon, 2 Aug 2021 15:48:02 UTC
Severity: normal
Tags: patch
Done: Leo Prikler <leo.prikler <at> student.tugraz.at>
Bug is archived. No further changes may be made.
From: Leo Prikler <leo.prikler <at> student.tugraz.at>
To: Maxime Devos <maximedevos <at> telenet.be>, 49828 <at> debbugs.gnu.org
Subject: [bug#49828] [PATCH 06/20] guix: Add ContentDB importer.
Date: Thu, 05 Aug 2021 18:41:00 +0200

Hi,

On Monday, 02.08.2021 at 17:50 +0200, Maxime Devos wrote:
> * guix/import/contentdb.scm: New file.
> * guix/scripts/import/contentdb.scm: New file.
> * tests/contentdb.scm: New file.
> * Makefile.am (MODULES, SCM_TESTS): Register them.
> * po/guix/POTFILES.in: Likewise.
> * doc/guix.texi (Invoking guix import): Document it.
> [...]
> diff --git a/doc/guix.texi b/doc/guix.texi
> index 43c248234d..d06c9b73c5 100644
> --- a/doc/guix.texi
> +++ b/doc/guix.texi
> @@ -11313,6 +11313,31 @@ and generate package expressions for all those packages that are not yet
>  in Guix.
>  @end table
>
> +@item contentdb
> +@cindex ContentDB
> +Import metadata from @uref{https://content.minetest.net, ContentDB}.
> +Information is taken from the JSON-formatted metadata provided through
> +@uref{https://content.minetest.net/help/api/, ContentDB's API} and
> +includes most relevant information, including dependencies. There are
> +some caveats, however. The license information on ContentDB does not
> +distinguish between GPLvN-only and GPLvN-or-later. The commit id is
> +sometimes missing. The descriptions are in the Markdown format, but
> +Guix uses Texinfo instead. Texture packs and subgames are unsupported.

What is the "commit id"? Is it the hash? A tag? Anything that resolves
to a commit? Also, since ContentDB sounds fairly generic (a database of
content?), perhaps we ought to call this the "minetest" importer
instead?

> [...]
> +;; The ContentDB API is documented at
> +;; <https://content.minetest.net>.
> +
> +(define %contentdb-api
> +  (make-parameter "https://content.minetest.net/api/"))
> +
> +(define (string-or-false x)
> +  (and (string? x) x))
> +
> +(define (natural-or-false x)
> +  (and (exact-integer? x) (>= x 0) x))
> +
> +;; Descriptions on ContentDB use carriage returns, but Guix doesn't.
> +(define (delete-cr text)
> +  (string-delete #\cr text))
> +
> +;; Minetest package.
> +;;
> +;; API endpoint: /packages/AUTHOR/NAME/
> +(define-json-mapping <package> make-package package?
> +  json->package
> +  (author package-author) ; string
> +  (creation-date package-creation-date ; string
> +                 "created_at")
> +  (downloads package-downloads) ; integer
> +  (forums package-forums "forums" natural-or-false) ; natural | #f

This comment and some others like it seem to simply repeat information
that is already present. Is there a use for them? Should we instead
provide a third argument on every field to verify/enforce the type?

> +  (issue-tracker package-issue-tracker "issue_tracker") ; string
> +  (license package-license) ; string
> +  (long-description package-long-description "long_description") ; string
> +  (maintainers package-maintainers ; list of strings
> +               "maintainers" vector->list)
> +  (media-license package-media-license "media_license") ; string
> +  (name package-name) ; string
> +  (provides package-provides ; list of strings
> +            "provides" vector->list)
> +  (release package-release) ; integer
> +  (repository package-repository "repo" string-or-false) ; string | #f
> +  (score package-score) ; flonum
> +  (screenshots package-screenshots "screenshots" vector->list) ; list of strings
> +  (short-description package-short-description "short_description") ; string
> +  (state package-state) ; string
> +  (tags package-tags "tags" vector->list) ; list of strings
> +  (thumbnail package-thumbnail) ; string
> +  (title package-title) ; string
> +  (type package-type) ; string
> +  (url package-url) ; string
> +  (website package-website "website" string-or-false)) ; string | #f
> +
> +(define-json-mapping <release> make-release release?
> +  json->release
> +  (commit release-commit "commit" string-or-false) ; string | #f
> +  (downloads release-downloads) ; integer
> +  (id release-id) ; integer
> +  (max-minetest-version release-max-minetest-version) ; string | #f
> +  (min-minetest-version release-min-minetest-version) ; string | #f
> +  (release-date release-data) ; string
> +  (title release-title) ; string
> +  (url release-url)) ; string
> +
> +(define-json-mapping <dependency> make-dependency dependency?
> +  json->dependency
> +  (optional? dependency-optional? "is_optional") ; #t | #f

Also known as "boolean".

> +  (name dependency-name) ; string
> +  (packages dependency-packages "packages" vector->list)) ; list of strings
> +
> +(define (contentdb-fetch author name)
> +  "Return a <package> record for package NAME by AUTHOR, or #f on failure."
> +  (and=> (json-fetch
> +          (string-append (%contentdb-api) "packages/" author "/" name "/"))
> +         json->package))

Is there a reason for author and name to be separate keys? To me it
makes more sense to take AUTHOR/NAME as a single search string from
users and then perform queries based on that. If ContentDB allows
searching, we might also resolve NAME to a single package where
possible and otherwise error out, telling the user to choose one.

> [...]
> +
> +(define (important-dependencies dependencies author name)
> +  (define dependency-list
> +    (assoc-ref dependencies (string-append author "/" name)))
> +  (filter-map
> +   (lambda (dependency)
> +     (and (not (dependency-optional? dependency))
> +          ;; "default" must be provided by the 'subgame' in use
> +          ;; and does not refer to a specific minetest mod.
> +          ;; "doors", "bucket" ... are provided by the default minetest
> +          ;; subgame.
> +          (not (member (dependency-name dependency)
> +                       '("default" "doors" "beds" "bucket" "doors" "farming"
> +                         "flowers" "stairs" "xpanes")))
> +          ;; Dependencies often have only one implementation.
> +          (let* ((/name (string-append "/" (dependency-name dependency)))
> +                 (likewise-named-implementations
> +                  (filter (cut string-suffix? /name <>)
> +                          (dependency-packages dependency)))
> +                 (implementation
> +                  (and (not (null? likewise-named-implementations))
> +                       (first likewise-named-implementations))))
> +            (and implementation
> +                 (apply cons (string-split implementation #\/))))))
> +   dependency-list))

What exactly does the likewise-named-implementations bit do here?

> +(define (contentdb-recursive-import author name)
> +  ;; recursive-import expects upstream package names to be strings,
> +  ;; so do some conversions.
> +  (define (split-author/name author/name)
> +    (string-split author/name #\/))

+1 for my author/name splitting, as it's already required for recursive
imports.

> +  (define (author+name->author/name author+name)
> +    (string-append (car author+name) "/" (cdr author+name)))
> +  (define* (contentdb->guix-package* author/name #:key repo version)
> +    (receive (package . maybe-dependencies)
> +        (apply contentdb->guix-package (split-author/name author/name))
> +      (and package
> +           (receive (dependencies)
> +               (apply values maybe-dependencies)
> +             (values package
> +                     (map author+name->author/name dependencies))))))
> +  (recursive-import (author+name->author/name (cons author name))
> +                    #:repo->guix-package contentdb->guix-package*
> +                    #:guix-name
> +                    (lambda (author/name)
> +                      (contentdb->package-name
> +                       (second (split-author/name author/name))))))
> +
> +;; A list of license names is available at
> +;; <https://content.minetest.net/api/licenses/>.
> +(define (string->license str)
> +  "Convert the string STR into a license object."
> +  (match str
> +    ("GPLv3" license:gpl3)
> +    ("GPLv2" license:gpl2)
> +    ("ISC" license:isc)
> +    ;; "MIT" means the Expat license on ContentDB,
> +    ;; see <https://github.com/minetest/contentdb/issues/326#issuecomment-890143784>.
> +    ("MIT" license:expat)
> +    ("CC BY-SA 3.0" license:cc-by-sa3.0)
> +    ("CC BY-SA 4.0" license:cc-by-sa4.0)
> +    ("LGPLv2.1" license:lgpl2.1)
> +    ("LGPLv3" license:lgpl3)
> +    ("MPL 2.0" license:mpl2.0)
> +    ("ZLib" license:zlib)
> +    ("Unlicense" license:unlicense)
> +    (_ #f)))

The link mentions that ContentDB now supports all SPDX identifiers. Do
we have an SPDX->Guix converter lying around in some other importer that
we could use as the default case here (especially w.r.t. "or later")?
WDYT?
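
For illustration, a minimal sketch of what such a default case could look like. It is only a sketch under stated assumptions: licenses are modelled as plain symbols so that it runs in plain Guile without Guix on the load path, and `spdx->license/stub` is a hypothetical placeholder. In Guix itself the fallback would presumably be an existing converter such as `spdx-string->license` from `(guix import utils)`; whether that one fits this importer is an assumption, not something checked against this patch.

```scheme
(use-modules (ice-9 match))

;; Hypothetical stand-in for an SPDX converter.  It knows only a few
;; identifiers and returns #f for everything else.
(define (spdx->license/stub str)
  (match str
    ("GPL-3.0-only" 'gpl3)
    ("GPL-3.0-or-later" 'gpl3+)
    ("Apache-2.0" 'asl2.0)
    (_ #f)))

(define* (string->license str #:optional (fallback spdx->license/stub))
  "Convert the ContentDB license name STR into a license (a symbol in this
sketch), deferring to FALLBACK for names without a special case."
  (match str
    ;; "MIT" means the Expat license on ContentDB, so keep the special case.
    ("MIT" 'expat)
    ;; Legacy ContentDB spelling that is not an SPDX identifier.
    ("CC BY-SA 4.0" 'cc-by-sa4.0)
    ;; Everything else goes through the SPDX fallback, which can tell
    ;; "-only" from "-or-later" when upstream provides that information.
    (_ (fallback str))))

(string->license "MIT")                 ; → expat
(string->license "GPL-3.0-or-later")    ; → gpl3+
(string->license "Some Custom License") ; → #f
```

Keeping the ContentDB-specific names in front of the fallback means the special cases still win, while any identifier the converter understands no longer ends up as #f.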