From unknown Sun Jun 22 11:32:48 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#68741 <68741@debbugs.gnu.org> To: bug#68741 <68741@debbugs.gnu.org> Subject: Status: [PATCH 0/6] Content-addressed downloads from Software Heritage Reply-To: bug#68741 <68741@debbugs.gnu.org> Date: Sun, 22 Jun 2025 18:32:48 +0000 retitle 68741 [PATCH 0/6] Content-addressed downloads from Software Heritage reassign 68741 guix-patches submitter 68741 Ludovic Court=C3=A8s severity 68741 normal tag 68741 patch thanks From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:17:14 2024 Received: (at submit) by debbugs.gnu.org; 26 Jan 2024 17:17:14 +0000 Received: from localhost ([127.0.0.1]:52619 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPpH-0004Kj-39 for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:17:14 -0500 Received: from lists.gnu.org ([2001:470:142::17]:39826) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPpC-0004JH-1E for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:17:09 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPoz-0001Zv-Tk for guix-patches@gnu.org; Fri, 26 Jan 2024 12:16:54 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPoy-0005bb-V5; Fri, 26 Jan 2024 12:16:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:Date:Subject:To:From:in-reply-to: references; bh=F5IC4AS13r+TdM2wuBoeVvNNlZcvTYULpiRynO2msVw=; b=L+MCrO9yQZymRo ywXbLil33dkD2qU8HsDpRlPaY7dP6heXCnlPgtSnaEzZee0NnGEZeH+jYTRjv5Z/sX6FB1sr5GRoE oDGa8mi1VBjugIsw86KO1e2TE1vjavky17TGDzg3wy432nrRbFH9IHgzRag3m7GWsac+TEXIZ3boM 
JjSmLN3T568XnnouSwpTGjhNsWulQuUGIJa2U2JlLxvRGTJVBlhP554hQns4V1cWHplC5IAmCX8PM PhtDKIX+XRVhdI7GGnWZvme93+hBUKxUOaHtuN5nm6rDtI81sIPpsHRcFuRck7Qy5EfkDqasFtSS3 e7AUne3YCxWPLpSTjXIQ==; From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: guix-patches@gnu.org Subject: [PATCH 0/6] Content-addressed downloads from Software Heritage Date: Fri, 26 Jan 2024 18:16:40 +0100 Message-ID: X-Mailer: git-send-email 2.41.0 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-Debbugs-Cc: Christopher Baines , Josselin Poiret , Ludovic Courtès , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice Content-Transfer-Encoding: 8bit X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: submit Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Hello Guix! For those who’ve been following along, you might remember that the main impedance mismatch between SWH and Guix is that SWH uses Git tree SHA1 hashes to identify directories whereas Guix uses nar SHA256 hashes (and possibly other hash functions in the future): https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/ Because of this, the SWH fallback path for ‘git-download’ had two options: 1. If ‘git-reference’ specifies a full SHA1 commit ID, it would look it up on SWH and fetch it. 2. If ‘git-reference’ specifies a tag, which is perhaps the majority of cases, Guix would ask SWH the commit that once corresponded to that tag at that URL, and then fetch it. Case #1 is ideal: it’s content-addressed. Case #2 is brittle: we’re hoping that the tag hasn’t been modified and that the URL hasn’t been reused for something else; if that’s not the case, SWH might return the “wrong” commit and we end up fetching something unrelated. 
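To make the two cases above concrete, here is a minimal sketch in plain Guile of how a reference could be classified. The names ‘full-commit-id?’ and ‘fallback-kind’ are hypothetical and used only for illustration; they are not procedures from (guix swh), and the tag name "v0.1.3" is an assumed example.

```scheme
;; Illustrative sketch only: `full-commit-id?' and `fallback-kind' are
;; hypothetical helpers, not part of (guix swh).
(use-modules (srfi srfi-14))            ;character sets

(define (full-commit-id? reference)
  "Return true if REFERENCE, a string, looks like a full SHA1 commit ID,
that is, exactly 40 hexadecimal digits."
  (and (= (string-length reference) 40)
       (string-every char-set:hex-digit reference)))

(define (fallback-kind reference)
  "Classify REFERENCE according to the two cases described above."
  (if (full-commit-id? reference)
      'content-addressed                ;case #1: look up the commit itself
      'tag-lookup))                     ;case #2: ask SWH what the tag pointed to

;; A full commit ID falls under case #1 ...
(fallback-kind "8bee6bb9aaaf35c36fe325675d1eb2daebd69c25") ;⇒ content-addressed
;; ... whereas a tag name (assumed here) falls under case #2.
(fallback-kind "v0.1.3")                                   ;⇒ tag-lookup
```

Only the first branch identifies the source by its content; the second depends on what the archive recorded for that tag at that URL.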
The good news is that our friends at SWH have just deployed a new
version of their code that lets us look up directories by some “external
identifier” (“ExtID”), among which there’s ‘nar-sha256’:

  https://archive.softwareheritage.org/api/1/extid/doc/

And that, my friends, makes a huge difference: the impedance mismatch is
gone, we can now use content-addressing to fetch our stuff from SWH!!
And that works not just for Git, but also for Mercurial, SVN, CVS, etc.

Well, there’s a caveat: currently the ‘nar-sha256’ is added only on new
visits and it’s apparently not being added yet for Mercurial for unclear
reasons.  So right now, we can get guile-sqlite3 0.1.3 (Git) by
nar-sha256, but we cannot get guile-wisp (hg) nor in fact most things.
That’ll improve over time though, and SWH comrades are open to adding
those ExtIDs retroactively.

The patches that follow do several things:

  1. Follow redirects in the Vault: (guix swh) previously did not do
     that (oops!) but the newly-deployed Vault now responds with 302
     redirects so we have to handle that.

  2. Add bindings for the ExtID HTTP interface.

  3. Add ‘swh-download-directory-by-nar-hash’, which does what it says.

  4. Use that as the preferred fallback method for ‘git-fetch’.
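As an illustration of the ExtID lookup described above, the following plain-Guile sketch builds the lookup URL for a hash given as a bytevector. The helpers ‘bytevector->hex’ and ‘extid-url’ and the variable ‘%base-url’ are hypothetical names; the "hex:" prefix and the trailing slash are assumptions taken from the code and the example URL in patch 2/6, and the sketch performs no HTTP request.

```scheme
;; Illustrative sketch only: `bytevector->hex', `extid-url', and `%base-url'
;; are hypothetical names, not the bindings added to (guix swh).
(use-modules (rnrs bytevectors)
             (ice-9 format))

(define %base-url "https://archive.softwareheritage.org")

(define (bytevector->hex bv)
  "Return the lowercase hexadecimal representation of BV, a bytevector."
  (string-concatenate
   (map (lambda (byte)
          (format #f "~2,'0x" byte))    ;two hex digits, zero-padded
        (bytevector->u8-list bv))))

(define (extid-url type hash)
  "Return the ExtID lookup URL for HASH, a bytevector, of the given TYPE
(for example \"nar-sha256\")."
  (string-append %base-url "/api/1/extid/" type
                 "/hex:" (bytevector->hex hash) "/"))

;; A real nar-sha256 hash has 32 bytes; two bytes keep the example short.
(extid-url "nar-sha256" (u8-list->bytevector '(171 205)))
;; ⇒ "https://archive.softwareheritage.org/api/1/extid/nar-sha256/hex:abcd/"
```

A successful reply carries a "target" field holding the SWHID of the directory, which is what the download procedure then fetches from the Vault.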
Here’s a REPLshot:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (lookup-external-id "nar-sha256" (content-hash-value(origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) )
$43 = #<<external-id> value: "0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63" type: "nar-sha256" version: 0 target: "swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153" target-url: "https://archive.softwareheritage.org/swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153">
scheme@(guile-user)> (swh-download-directory-by-nar-hash (content-hash-value(origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) 'sha256 "/tmp/gsql")
SWH: found directory with nar-sha256 hash 0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63 at 'swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153'
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/.gitignore
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/AUTHORS
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING.LESSER
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/ChangeLog
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/Makefile.am
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/NEWS
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/README
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/guile.am
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/test-driver.scm
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/configure.ac
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/env.in
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/sqlite3.scm.in
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/
swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/basic.scm
$46 = #t
--8<---------------cut here---------------end--------------->8---

Huge thanks to everyone
over at #swh-devel for helping me out over the past few days! Next tasks: implement download fallback for ‘hg-fetch’, change ‘guix lint -c archival’ to make ‘save-origin’ requests not just for Git repos, assess the situation with SVN and sub-directories to see what can be done. Thoughts? Ludo’. PS: Apologies for the wall of text! Ludovic Courtès (6): swh: ‘vault-fetch’ follows redirects. swh: Add bindings for the “ExtID” API. swh: Add ‘swh-download-directory-by-nar-hash’. lint: archival: Check with ‘lookup-directory-by-nar-hash’. git-download: Download from SWH by nar hash when possible. swh: Fix docstring of ‘lookup-directory’. guix/build/git.scm | 20 ++++-- guix/git-download.scm | 4 +- guix/lint.scm | 28 +++++--- guix/scripts/perform-download.scm | 4 +- guix/swh.scm | 113 ++++++++++++++++++++++++++---- tests/lint.scm | 33 +++++++-- tests/swh.scm | 21 +++++- 7 files changed, 189 insertions(+), 34 deletions(-) base-commit: 8bee6bb9aaaf35c36fe325675d1eb2daebd69c25 -- 2.41.0 From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:25:44 2024 Received: (at 68741) by debbugs.gnu.org; 26 Jan 2024 17:25:44 +0000 Received: from localhost ([127.0.0.1]:52648 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxX-0004fF-DP for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:44 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:47870) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxU-0004ef-PJ for 68741@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:41 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPxI-0008SK-Sh; Fri, 26 Jan 2024 12:25:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=uZJgLjo/h8ho0gAmTFFlAg5ftkZjinmccurE++XcOSg=; b=JiPV43iiFgewA/0HUG0j 
mJkfRGCTRJG9iyqwtdis3E1uYf79QiID3ItBdPP+es31JVmkveMvllg1IrrLg9RWLwe9b4PpPvMrc yY77oCXlUAX6JRhhR6flpQkLTi7XKJFx6nbpkUWTK8MDc2A7zGngeIFMNOIZCLMPzJQRPOPYMCVZx bv8z6Mynf3W6ckw5AQFAs0q5hsoi2Id0HUJnJJZi6lgaXSe6tMWuiJC51UkwpT3GidRF4KanXOj0g h1fgvE5AMMqPVmCJOx61Sq+ZNVDpG2JEKVlv810ewZeIY/8nqNe5n6eGjfnQo+Ya/Zm86/YFfRcRT iG85q0QRiFQXCg==;
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?=
To: 68741@debbugs.gnu.org
Subject: [PATCH 2/6] =?UTF-8?q?swh:=20Add=20bindings=20for=20the=20?= =?UTF-8?q?=E2=80=9CExtID=E2=80=9D=20API.?=
Date: Fri, 26 Jan 2024 18:25:02 +0100
Message-ID: <848b0eb1d2ee9d7a31940c9e1867b8decde6ae3f.1706287537.git.ludo@gnu.org>
X-Mailer: git-send-email 2.41.0
In-Reply-To:
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-Debbugs-Cc: Christopher Baines , Josselin Poiret , Ludovic Courtès , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 68741
Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?=
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -3.3 (---)

This interface was deployed at archive.softwareheritage.org a few days
ago.  Our main use case will be looking up directories by “nar-sha256”
hashes.

* guix/swh.scm (<external-id>): New JSON-mapped record type.
(lookup-external-id, lookup-directory-by-nar-hash): New procedures.
* tests/swh.scm (%external-id): New variable.
("lookup-directory-by-nar-hash"): New test.

Change-Id: Ib671c7798aeb6f8132ac78f2b06b9285da8e7bd5
---
 guix/swh.scm  | 35 +++++++++++++++++++++++++++++++++++
 tests/swh.scm | 21 ++++++++++++++++++++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/guix/swh.scm b/guix/swh.scm
index 4e71bdb045..60e97c6d38 100644
--- a/guix/swh.scm
+++ b/guix/swh.scm
@@ -78,6 +78,14 @@ (define-module (guix swh)
             lookup-revision
             lookup-origin-revision
 
+            external-id?
+            external-id-value
+            external-id-type
+            external-id-version
+            external-id-target
+            lookup-external-id
+            lookup-directory-by-nar-hash
+
             content?
             content-checksums
             content-data-url
@@ -382,6 +390,15 @@ (define-json-mapping <directory-entry> make-directory-entry directory-entry?
   (permissions directory-entry-permissions "perms")
   (target-url directory-entry-target-url "target_url"))
 
+;;
+(define-json-mapping <external-id> make-external-id external-id?
+  json->external-id
+  (value external-id-value "extid")
+  (type external-id-type "extid_type")
+  (version external-id-version "extid_version")
+  (target external-id-target)
+  (target-url external-id-target-url "target_url"))
+
 ;;
 (define-json-mapping <save-reply> make-save-reply save-reply?
   json->save-reply
@@ -436,6 +453,24 @@ (define (json->directory-entries port)
   (map json->directory-entry
        (vector->list (json->scm port))))
 
+(define (lookup-external-id type id)
+  "Return the external ID record for ID, a bytevector, of the given TYPE
+(currently one of: \"bzr-nodeid\", \"hg-nodeid\", \"nar-sha256\",
+\"checksum-sha512\")."
+  (call (swh-url "/api/1/extid" type
+                 (string-append "hex:" (bytevector->base16-string id)))
+        json->external-id))
+
+(define* (lookup-directory-by-nar-hash hash #:optional (algorithm 'sha256))
+  "Return the SWHID of a directory---i.e., prefixed by \"swh:1:dir\"---for the
+directory with the given HASH (a bytevector), assuming nar serialization
+and use of ALGORITHM."
+ ;; example: + ;; https://archive.softwareheritage.org/api/1/extid/nar-sha256/base64url:0jD6Z4TLMm5g1CviuNNuVNP31KWyoT_oevfr8TQwc3Y/ + (and=> (lookup-external-id (string-append "nar-" (symbol->string algorithm)) + hash) + external-id-target)) + (define (origin-visits origin) "Return the list of visits of ORIGIN, a record as returned by 'lookup-origin'." diff --git a/tests/swh.scm b/tests/swh.scm index a36f951241..e7ced6b50c 100644 --- a/tests/swh.scm +++ b/tests/swh.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2019, 2020, 2021 Ludovic Courtès +;;; Copyright © 2019-2021, 2024 Ludovic Courtès ;;; ;;; This file is part of GNU Guix. ;;; @@ -18,6 +18,7 @@ (define-module (test-swh) #:use-module (guix swh) + #:use-module (guix base32) #:use-module (guix tests http) #:use-module (web response) #:use-module (srfi srfi-19) @@ -56,6 +57,16 @@ (define %directory-entries \"length\": 456, \"dir_id\": 2 } ]") +(define %external-id + "{ \"extid_type\": \"nar-sha256\", + \"extid\": +\"0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63\", + \"version\": 0, + \"target\": \"swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153\", + \"target_url\": +\"https://archive.softwareheritage.org/swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153\" + }") + (define-syntax-rule (with-json-result str exp ...) (with-http-server `((200 ,str)) (parameterize ((%swh-base-url (%local-url))) @@ -98,6 +109,14 @@ (define-syntax-rule (with-json-result str exp ...) 
(directory-entry-length entry))) (lookup-directory "123")))) +(test-equal "lookup-directory-by-nar-hash" + "swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153" + (with-json-result %external-id + (lookup-directory-by-nar-hash + (nix-base32-string->bytevector + "0qqygvlpz63phdi2p5p8ncp80dci230qfa3pwds8yfxqqaablmhb") + 'sha256))) + (test-equal "rate limit reached" 3000000000 (let ((too-many (build-response -- 2.41.0 From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:25:45 2024 Received: (at 68741) by debbugs.gnu.org; 26 Jan 2024 17:25:45 +0000 Received: from localhost ([127.0.0.1]:52654 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxY-0004fN-6M for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:45 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:47864) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxU-0004ee-KB for 68741@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:41 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPxH-0008Qq-Kx; Fri, 26 Jan 2024 12:25:27 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=t3vEQHw0zSebWWR31SkRHVjibufh19E6tE2BvamB6og=; b=EeZ5762wiGRfzJe/qQDm dQ+0V843LWdXUyh0r/KoA3gyx3YKCN7NtqqIe17vEl2BdEGKItO/SNhS1dm748aqaseOZ6OSsWRSq SzlxDTiatVnvQwCYPV1/JrI0dAmuCPaaB7QSyy32PWph10mtff9++nLt6ETMfBPVMRyiZCFqZEkS8 PaX/2Smg8tY/BFRF7Y60j2Sn2KrrOhFyzgQPraX5k58+5XlQQ5kMIT9vWk6HbkN/NAUTmyvv4jeGV otvBJ7KDRHabGcjlc9mKAPQx228duBST7sUdrkNvTXgyDGBcOYBczs9iCWXZlPt2hwVYdNsnwA+5t 8MmG7ng2u9w7ww==; From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 68741@debbugs.gnu.org Subject: [PATCH 1/6] =?UTF-8?q?swh:=20=E2=80=98vault-fetch=E2=80=99=20foll?= =?UTF-8?q?ows=20redirects.?= Date: Fri, 26 Jan 2024 18:25:01 +0100 Message-ID: 
<8d1ecdcbef60d2643f1aad455e9a2525e6c9d78a.1706287537.git.ludo@gnu.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-Debbugs-Cc: Christopher Baines , Josselin Poiret , Ludovic Courtès , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 68741 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Today, URLs like https://archive.softwareheritage.org/api/1/vault/flat/swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/raw/ redirect to https://swhvaultstorage.blob.core.windows.net/…. This change fixes ‘vault-fetch’ to follow these. * guix/swh.scm (http-get/follow): New procedure. (vault-fetch): Use it instead of ‘http-get*’. Change-Id: Id6b9585a9ce6699a2274b99c9a6d4edda1018b02 --- guix/swh.scm | 52 +++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 41 insertions(+), 11 deletions(-) diff --git a/guix/swh.scm b/guix/swh.scm index c7c1c873a2..4e71bdb045 100644 --- a/guix/swh.scm +++ b/guix/swh.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2018, 2019, 2020, 2021 Ludovic Courtès +;;; Copyright © 2018-2021, 2024 Ludovic Courtès ;;; Copyright © 2020 Jakub Kądziołka ;;; Copyright © 2021 Xinglu Chen ;;; Copyright © 2021 Simon Tournier @@ -583,6 +583,41 @@ (define* (request-cooking id #:optional kind #:key (archive-type 'flat)) json->vault-reply http-post*)) +(define* (http-get/follow url + #:key + (verify-certificate? (%verify-swh-certificate?))) + "Like 'http-get' but follow redirects (HTTP 30x). On success, return two +values: an input port to read the response body and its 'Content-Length'. 
On +failure return #f and #f." + (define uri + (if (string? url) (string->uri url) url)) + + (let loop ((uri uri)) + (define (resolve-uri-reference target) + (if (and (uri-scheme target) (uri-host target)) + target + (build-uri (uri-scheme uri) #:host (uri-host uri) + #:port (uri-port uri) + #:path (uri-path target)))) + + (let*-values (((response port) + (http-get* uri #:streaming? #t + #:verify-certificate? verify-certificate?)) + ((code) + (response-code response))) + (case code + ((200) + (values port (response-content-length response))) + ((301 ; moved permanently + 302 ; found (redirection) + 303 ; see other + 307 ; temporary redirection + 308) ; permanent redirection + (close-port port) + (loop (resolve-uri-reference (response-location response)))) + (else + (values #f #f)))))) + (define* (vault-fetch id #:optional kind #:key @@ -604,16 +639,11 @@ (define* (vault-fetch id (match (vault-reply-status reply) ('done ;; Fetch the bundle. - (let-values (((response port) - (http-get* (swh-url (vault-reply-fetch-url reply)) - #:streaming? #t - #:verify-certificate? - (%verify-swh-certificate?)))) - (if (= (response-code response) 200) - port - (begin ;shouldn't happen - (close-port port) - #f)))) + (let-values (((port length) + (http-get/follow (swh-url (vault-reply-fetch-url reply)) + #:verify-certificate? + (%verify-swh-certificate?)))) + port)) ('failed ;; Upon failure, we're supposed to try again. 
(format log-port "SWH vault: failure: ~a~%" -- 2.41.0 From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:25:45 2024 Received: (at 68741) by debbugs.gnu.org; 26 Jan 2024 17:25:45 +0000 Received: from localhost ([127.0.0.1]:52656 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxZ-0004fi-AR for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:45 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:47882) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxV-0004eh-79 for 68741@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:41 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPxJ-0008Sc-C1; Fri, 26 Jan 2024 12:25:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=U5yArds2yjafEJvvwVaCX8GTZi2fDqN3Z/ThSwp+MTs=; b=FOaueB01BzBYRcKhDd6a 5XQC6qrCI9D81ISfM71EH/D7E7HzDWu7s82JsD9DSvF9fOdTq2adLLxVKtncnLETqqZv9bX5dKJfA yNFEU5iYFMbvSoW2IzARyaJD9ZLvp/Lbb8W+vOOyFPB0Zc1beVvqZ4Mtiw9GQaQC6LUe6qUQl0bik KVzRaKm7wO36sbrLX96Q3Ez6U/raRHDRYqsv/Qv88yRE/mi808s59MN3O4wGzSc4xtoBEoDDqqhcq Vriwy661KMQmUFxOBvUeIuGCrXeDt/hL5tBJ/QZVGWon7sAeW0cBIA/ENKQSZvd1DH1iaYs1HOFOn JgMDBEGowIlDeg==; From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 68741@debbugs.gnu.org Subject: [PATCH 3/6] =?UTF-8?q?swh:=20Add=20=E2=80=98swh-download-director?= =?UTF-8?q?y-by-nar-hash=E2=80=99.?= Date: Fri, 26 Jan 2024 18:25:03 +0100 Message-ID: <4b8ebf96980377ae0a83b2702d6fc93600da8a74.1706287537.git.ludo@gnu.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-Debbugs-Cc: Christopher Baines , Josselin Poiret , Ludovic Courtès , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) 
X-Debbugs-Envelope-To: 68741 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) This allows us to take advantage of content addressing by giving SWH the expected nar hash. * guix/swh.scm (swh-download-directory-by-nar-hash): New procedure. Change-Id: I0494ee15a3cde390a22552de7c2246e0314ba7b5 --- guix/swh.scm | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/guix/swh.scm b/guix/swh.scm index 60e97c6d38..be1eb7d151 100644 --- a/guix/swh.scm +++ b/guix/swh.scm @@ -123,6 +123,7 @@ (define-module (guix swh) commit-id? swh-download-directory + swh-download-directory-by-nar-hash swh-download)) ;;; Commentary: @@ -805,3 +806,26 @@ (define* (swh-download url reference output "SWH: revision ~s originating from ~a could not be found~%" reference url) #f))) + +(define* (swh-download-directory-by-nar-hash hash algorithm output + #:key + (log-port (current-error-port))) + "Download from Software Heritage the directory with the given nar HASH for +ALGORITHM (a symbol such as 'sha256), and unpack it in OUTPUT. Return #t on +success and #f on failure. + +This procedure uses the \"vault\", which contains \"cooked\" directories in +the form of tarballs. If the requested directory is not cooked yet, it will +wait until it becomes available, which could take several minutes." 
+ (match (lookup-directory-by-nar-hash hash algorithm) + (#f + (format log-port + "SWH: directory with nar-~a hash ~a not found~%" + algorithm (bytevector->base16-string hash)) + #f) + (swhid + (format log-port "SWH: found directory with nar-~a hash ~a at '~a'~%" + algorithm (bytevector->base16-string hash) swhid) + (swh-download-archive swhid output + #:archive-type 'flat ;SWHID denotes a directory + #:log-port log-port)))) -- 2.41.0 From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:25:55 2024 Received: (at 68741) by debbugs.gnu.org; 26 Jan 2024 17:25:55 +0000 Received: from localhost ([127.0.0.1]:52660 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxi-0004g7-Nw for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:55 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:39632) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxW-0004ek-79 for 68741@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:43 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPxK-0008TX-90; Fri, 26 Jan 2024 12:25:30 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=QpUFUjwVjCIIWeWTnCfOj1QPj71m+/H2PZ2L8z2iaSg=; b=WKiqW5CBc9K9N1Ego+uc qTMaEvsb77HqKfMg+GpZnCSkq7HgHjQxYJwz+/FfcEKFmgNxffcyWHVWeInWKGbtVIoykVQU5p1Fk TYzld9fP0r1V4+GU5ISQsFCExtFyV2c4wKFnBr36lTtG5BhmiNf73BfH6Ig1h+XC0IQdgrJ3dsSDJ yywUFsAMmV5Uy4OWcDVsmjM1Gn68SIwUI5F9sEDSEUlemA7X/2w3eCnr6ZZEzfYerhZ2iKYSVVSoV esWJtMkGYYey4cKniPlpanpvBFHxv8gcQCufsjMkHWws/xbMiMMxtZToB6IuaFGzMsPmA6lIVuMZa 4hY/iV4IXxMo2w==; From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 68741@debbugs.gnu.org Subject: [PATCH 4/6] =?UTF-8?q?lint:=20archival:=20Check=20with=20?= =?UTF-8?q?=E2=80=98lookup-directory-by-nar-hash=E2=80=99.?= Date: Fri, 26 Jan 2024 18:25:04 +0100 Message-ID: 
<2d4e7a22fdea270a184fe36be4ddf2267c938b84.1706287537.git.ludo@gnu.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-Debbugs-Cc: Christopher Baines , Josselin Poiret , Ludovic Courtès , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 68741 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) While this method is new and nar-sha256 ExtIDs are currently available only for new visits, it is fundamentally more reliable than the other methods, which is why it comes first. * guix/lint.scm (check-archival)[lookup-by-nar-hash]: New procedure. Call ‘lookup-by-nar-hash’ before the other lookup methods. * tests/lint.scm ("archival: content available") ("archival: content unavailable but disarchive available") ("archival: missing revision") ("archival: revision available"): Add a 404 response corresponding to the ‘lookup-external-id’ request. * tests/lint.scm ("archival: nar-sha256 extid available"): New test. 
Change-Id: I4a81d6e022a3b72e6484726549d7fbae627f8e73 --- guix/lint.scm | 28 ++++++++++++++++++---------- tests/lint.scm | 33 ++++++++++++++++++++++++++++----- 2 files changed, 46 insertions(+), 15 deletions(-) diff --git a/guix/lint.scm b/guix/lint.scm index 861e352b93..c95de85e69 100644 --- a/guix/lint.scm +++ b/guix/lint.scm @@ -1,7 +1,7 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2014 Cyril Roelandt ;;; Copyright © 2014, 2015 Eric Bavier -;;; Copyright © 2013-2023 Ludovic Courtès +;;; Copyright © 2013-2024 Ludovic Courtès ;;; Copyright © 2015, 2016 Mathieu Lirzin ;;; Copyright © 2016 Danny Milosavljevic ;;; Copyright © 2016 Hartmut Goebel @@ -1658,24 +1658,31 @@ (define (check-archival package) (or (not (request-rate-limit-reached? url method)) (throw skip-key #t))) + (define (lookup-by-nar-hash hash) + (lookup-directory-by-nar-hash (content-hash-value hash) + (content-hash-algorithm hash))) + (parameterize ((%allow-request? skip-when-limit-reached)) (catch #t (lambda () (match (package-source package) (#f ;no source '()) - ((and (? origin?) + ((and (? origin? origin) (= origin-uri (? git-reference? reference))) (define url (git-reference-url reference)) (define commit (git-reference-commit reference)) + (define hash + (origin-hash origin)) - (match (if (commit-id? commit) - (or (lookup-revision commit) - (lookup-origin-revision url commit)) - (lookup-origin-revision url commit)) - ((? revision? revision) + (match (or (lookup-by-nar-hash hash) + (if (commit-id? commit) + (or (lookup-revision commit) + (lookup-origin-revision url commit)) + (lookup-origin-revision url commit))) + ((or (? string?) (? revision?)) '()) (#f ;; Revision is missing from the archive, attempt to save it. 
@@ -1704,9 +1711,10 @@ (define (check-archival package) (if (and=> (origin-hash origin) ;XXX: for ungoogled-chromium content-hash-value) ;& icecat (let ((hash (origin-hash origin))) - (match (lookup-content (content-hash-value hash) - (symbol->string - (content-hash-algorithm hash))) + (match (or (lookup-by-nar-hash hash) + (lookup-content (content-hash-value hash) + (symbol->string + (content-hash-algorithm hash)))) (#f ;; If SWH doesn't have HASH as is, it may be because it's ;; a hand-crafted tarball. In that case, check whether diff --git a/tests/lint.scm b/tests/lint.scm index a52a82237b..87213fcc78 100644 --- a/tests/lint.scm +++ b/tests/lint.scm @@ -1,7 +1,7 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2012, 2013 Cyril Roelandt ;;; Copyright © 2014, 2015, 2016 Eric Bavier -;;; Copyright © 2014-2023 Ludovic Courtès +;;; Copyright © 2014-2024 Ludovic Courtès ;;; Copyright © 2015, 2016 Mathieu Lirzin ;;; Copyright © 2016 Hartmut Goebel ;;; Copyright © 2017 Alex Kost @@ -1358,7 +1358,8 @@ (define (package-with-phase-changes changes) ;; https://archive.softwareheritage.org/api/1/content/ (content "{ \"checksums\": {}, \"data_url\": \"xyz\", \"length\": 42 }")) - (with-http-server `((200 ,content)) + (with-http-server `((404 "") ;extid + (200 ,content)) (parameterize ((%swh-base-url (%local-url))) (check-archival (dummy-package "x" (source origin))))))) @@ -1378,7 +1379,8 @@ (define (package-with-phase-changes changes) \"type\": \"file\", \"name\": \"README\" \"length\": 42 } ]")) - (with-http-server `((404 "") ;lookup-content + (with-http-server `((404 "") ;lookup-directory-by-nar-hash + (404 "") ;lookup-content (200 ,disarchive) ;Disarchive database lookup (200 ,directory)) ;lookup-directory (mock ((guix download) %disarchive-mirrors (list (%local-url))) @@ -1397,7 +1399,8 @@ (define (package-with-phase-changes changes) \"save_request_date\": \"2014-11-17T22:09:38+01:00\", \"save_request_status\": \"accepted\", \"save_task_status\": 
\"scheduled\" }") - (warnings (with-http-server `((404 "No revision.") ;lookup-revision + (warnings (with-http-server `((404 "No extid.") ;lookup-directory-by-nar-hash + (404 "No revision.") ;lookup-revision (404 "No origin.") ;lookup-origin (200 ,save)) ;save-origin (parameterize ((%swh-base-url (%local-url))) @@ -1415,7 +1418,27 @@ (define (package-with-phase-changes changes) ;; https://archive.softwareheritage.org/api/1/revision/ (revision "{ \"author\": {}, \"parents\": [], \"date\": \"2014-11-17T22:09:38+01:00\" }")) - (with-http-server `((200 ,revision)) + (with-http-server `((404 "No directory.") ;lookup-directory-by-nar-hash + (200 ,revision)) + (parameterize ((%swh-base-url (%local-url))) + (check-archival (dummy-package "x" (source origin))))))) + +(test-equal "archival: nar-sha256 extid available" + '() + (let* ((origin (origin + (method git-fetch) + (uri (git-reference + (url "http://example.org/foo.git") + (commit "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"))) + (sha256 (make-bytevector 32)))) + ;; https://archive.softwareheritage.org/api/1/extid/doc/ + (extid "{ \"extid_type\": \"nar-sha256\", + \"extid\": \"1234\", + \"extid_version\": 0, + \"target\": \"swh:1:dir:cabba93\", + \"target_url\": \"boo\" + }")) + (with-http-server `((200 ,extid)) (parameterize ((%swh-base-url (%local-url))) (check-archival (dummy-package "x" (source origin))))))) -- 2.41.0 From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:25:56 2024 Received: (at 68741) by debbugs.gnu.org; 26 Jan 2024 17:25:56 +0000 Received: from localhost ([127.0.0.1]:52662 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxj-0004gB-GA for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:39636) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxX-0004eo-Nh for 68741@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:44 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by 
eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPxL-0008Ts-5E; Fri, 26 Jan 2024 12:25:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=mgt89WmrVc/WRNOl2Mz9pD9Ulapju1MAcEe06SrJhCk=; b=LCwA3Nvv4ydkRm2AKMX/ 3cYCcA8IRBrvxdVlHIErH5Xj9h1m9sBFXOPkDJe8Dq5Kkt0jIvbVNVzdBl5yMc1np9wuKDg6QdXfr dF8RDWMH4ipRBnQv15SxTzyEriT4CadVXNwN5d4VmjoF5HwBPjNUQdb+k8t66IVfQtvMq6kS7Bwbx ZJjrbZDG5JTydmLmK6sl0tgzlnVNbnvrvdZjM6UaQGpvPMIyMmAvsftapjr61Fq5iobFnWcF5uIi5 FeSmEnZz5Rd7iMZt3DVlc0xlKvgQiIYQYQMXAnARbmhZWAfsg4KUS6b7dKqZxBz81Azo2CNIXkTDk FyL456rgN/HGiw==; From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 68741@debbugs.gnu.org Subject: [PATCH 5/6] git-download: Download from SWH by nar hash when possible. Date: Fri, 26 Jan 2024 18:25:05 +0100 Message-ID: <805362f89ad92114e4902bd6aab886007e8e9f00.1706287537.git.ludo@gnu.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-Debbugs-Cc: Christopher Baines , Josselin Poiret , Ludovic Courtès , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 68741 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) From: Ludovic Courtès * guix/build/git.scm (git-fetch-with-fallback): Add #:hash and #:hash-algorithm. Try ‘swh-download-directory-by-nar-hash’ before ‘swh-download’ when #:hash is provided. * guix/git-download.scm (git-fetch/in-band*): Pass #:hash and #:hash-algorithm to ‘git-fetch-with-fallback’. 
* guix/scripts/perform-download.scm (perform-git-download): Likewise. Change-Id: Ic875a7022fd78c9fac32e92ad4f8ce4d81646ec5 --- guix/build/git.scm | 20 ++++++++++++++++---- guix/git-download.scm | 4 +++- guix/scripts/perform-download.scm | 4 +++- 3 files changed, 22 insertions(+), 6 deletions(-) diff --git a/guix/build/git.scm b/guix/build/git.scm index 867cade2c4..4c69365a7b 100644 --- a/guix/build/git.scm +++ b/guix/build/git.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2014, 2016, 2019, 2023 Ludovic Courtès +;;; Copyright © 2014, 2016, 2019, 2023-2024 Ludovic Courtès ;;; Copyright © 2023 Maxim Cournoyer ;;; ;;; This file is part of GNU Guix. @@ -20,7 +20,9 @@ (define-module (guix build git) #:use-module (guix build utils) #:autoload (guix build download-nar) (download-nar) - #:autoload (guix swh) (%verify-swh-certificate? swh-download) + #:autoload (guix swh) (%verify-swh-certificate? + swh-download + swh-download-directory-by-nar-hash) #:use-module (srfi srfi-34) #:use-module (ice-9 format) #:export (git-fetch @@ -91,10 +93,13 @@ (define* (git-fetch url commit directory (define* (git-fetch-with-fallback url commit directory #:key (git-command "git") + hash hash-algorithm lfs? recursive?) "Like 'git-fetch', fetch COMMIT from URL into DIRECTORY, but fall back to alternative methods when fetching from URL fails: attempt to download a nar, -and if that also fails, download from the Software Heritage archive." +and if that also fails, download from the Software Heritage archive. When +HASH and HASH-ALGORITHM are provided, they are interpreted as the nar hash of +the directory of interest and are used as its content address at SWH." (or (git-fetch url commit directory #:lfs? lfs? #:recursive? recursive? 
@@ -110,7 +115,14 @@ (define* (git-fetch-with-fallback url commit directory (format (current-error-port) "Trying to download from Software Heritage...~%") - (swh-download url commit directory) + ;; First try to look up and download the directory corresponding + ;; to HASH: this is fundamentally more reliable than looking up + ;; COMMIT, especially when COMMIT denotes a tag. + (or (and hash hash-algorithm + (swh-download-directory-by-nar-hash hash hash-algorithm + directory)) + (swh-download url commit directory)) + (when (file-exists? (string-append directory "/.gitattributes")) ;; Perform CR/LF conversion and other changes diff --git a/guix/git-download.scm b/guix/git-download.scm index 3de6ae970d..aadcbd234c 100644 --- a/guix/git-download.scm +++ b/guix/git-download.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2014-2021, 2023 Ludovic Courtès +;;; Copyright © 2014-2021, 2023-2024 Ludovic Courtès ;;; Copyright © 2017 Mathieu Lirzin ;;; Copyright © 2017 Christopher Baines ;;; Copyright © 2020 Jakub Kądziołka @@ -165,6 +165,8 @@ (define* (git-fetch/in-band* ref hash-algo hash (git-fetch-with-fallback (getenv "git url") (getenv "git commit") #$output + #:hash #$hash + #:hash-algorithm '#$hash-algo #:lfs? lfs? #:recursive? recursive? #:git-command "git"))))) diff --git a/guix/scripts/perform-download.scm b/guix/scripts/perform-download.scm index 9aa0e61e9d..e7eb3b2a1f 100644 --- a/guix/scripts/perform-download.scm +++ b/guix/scripts/perform-download.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2016-2018, 2020, 2023 Ludovic Courtès +;;; Copyright © 2016-2018, 2020, 2023-2024 Ludovic Courtès ;;; ;;; This file is part of GNU Guix. ;;; @@ -115,6 +115,8 @@ (define* (perform-git-download drv output (setenv "PATH" "/run/current-system/profile/bin:/bin:/usr/bin") (git-fetch-with-fallback url commit output + #:hash hash + #:hash-algorithm algo #:recursive? recursive? 
#:git-command %git)))) -- 2.41.0 From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:25:56 2024 Received: (at 68741) by debbugs.gnu.org; 26 Jan 2024 17:25:56 +0000 Received: from localhost ([127.0.0.1]:52664 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxk-0004gK-3u for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:39650) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxY-0004eq-4l for 68741@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:44 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPxM-0008UP-0l; Fri, 26 Jan 2024 12:25:32 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=LH3qPPPq9hPu+GBQob+mKqd1SmUcqHnoRnRCLRvBxA0=; b=ajTPtI5Y9P15+ugU5vQo lex6Zg2rapO302Fw6pyhAyNNUPdiddxMpYwFerntNwcIFTHFKi3LLUPmW8Xi/EAGDQEogF2iUrUPH Wm/p1Hu8rH/77y9afT3EOWrPzVvpWFfMwaLHLn9e812+o+r4eWYZHHBaz3b7+C4DoLUWEK2j02Dyf 3TjQZDJW7b64Xx8b5t4Pa+z2bY0aaTdvxNe3YW3c8SwRGlwyXgCzQ0xuT1QQwSML0wKlomlykCJLP BBGXRqgdP5Dl3PLecv01ms56AetjqvNJPsNhbQ+ZFSoWxTRBtVZhmj/O9RPSRQ3HYdv0zF2P+Pcpd KAVcDUJ97soYzw==; From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 68741@debbugs.gnu.org Subject: [PATCH 6/6] =?UTF-8?q?swh:=20Fix=20docstring=20of=20=E2=80=98look?= =?UTF-8?q?up-directory=E2=80=99.?= Date: Fri, 26 Jan 2024 18:25:06 +0100 Message-ID: <3967ad1dbbb623437d8c931488b1e384f9c962e2.1706287537.git.ludo@gnu.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-Debbugs-Cc: Christopher Baines , Josselin Poiret , Ludovic Courtès , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 68741 Cc: 
=?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * guix/swh.scm (lookup-directory): Fix docstring. Change-Id: Ia1fd9b2bc9184364cebbd30ee84c9fdea4ba897c --- guix/swh.scm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/guix/swh.scm b/guix/swh.scm index be1eb7d151..04cecd854c 100644 --- a/guix/swh.scm +++ b/guix/swh.scm @@ -446,7 +446,7 @@ (define-query (lookup-revision id) json->revision) (define-query (lookup-directory id) - "Return the directory with the given ID." + "Return the list of entries of the directory with the given ID." (path "/api/1/directory" id) json->directory-entries) -- 2.41.0 From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 12:26:08 2024 Received: (at 68741) by debbugs.gnu.org; 26 Jan 2024 17:26:08 +0000 Received: from localhost ([127.0.0.1]:52705 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxv-0004iY-Il for submit@debbugs.gnu.org; Fri, 26 Jan 2024 12:26:08 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:50928) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rTPxh-0004fx-49 for 68741@debbugs.gnu.org; Fri, 26 Jan 2024 12:25:56 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rTPxT-00004y-Ua; Fri, 26 Jan 2024 12:25:40 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:Date:References:In-Reply-To:Subject:To: From; bh=Lwb26KY2vEcJEDw9p47u6Xvyqyw92y9f72CIRcaz1RE=; b=jVymmtjE8iWENk5x0uk2 ApbcbuBSgnGabA8WbKKT4xsoL3PgkHI15iT75xc86RrHucAueIIly4SKJ5dgZz+QcTJQrjPLKioGA 
xVEoNFjZAFdAL/cuC5l1QFvDb0Okx0DlN+ODRoAYKPkm+Bm6uyJ+Dgm9GPwcLv24QWNcOq/4Hp67j voiMGYWAzxarXVgG5mNYal79SPlyrrAYROok8lFS3GRhRBmUf+Sak0x1p0B8eVGFzcx5fRfwkRghF SzHZDdDWoSB1fPOm70T6blNaUA6ZYgswDQ5CkqBWihZgv3AxMlcM8j6l8qi3Q/89ge3wIsjV0u/qf MfEfcsGdpz65Vg==; From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: 68741@debbugs.gnu.org Subject: Re: [bug#68741] [PATCH 0/6] Content-addressed downloads from Software Heritage In-Reply-To: ("Ludovic =?utf-8?Q?Court?= =?utf-8?Q?=C3=A8s=22's?= message of "Fri, 26 Jan 2024 18:16:40 +0100") References: Date: Fri, 26 Jan 2024 18:25:37 +0100 Message-ID: <87y1ccm2ge.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 68741 Cc: Timothy Sample , Antoine R. Dumont (@ardumont) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

Oops, I forgot to Cc: the fine people for the cover letter; fixed! See .

Ludovic Courtès skribis:

> Hello Guix!
>
> For those who’ve been following along, you might remember that the
> main impedance mismatch between SWH and Guix is that SWH uses Git
> tree SHA1 hashes to identify directories whereas Guix uses nar SHA256
> hashes (and possibly other hash functions in the future):
>
>   https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/
>
> Because of this, the SWH fallback path for ‘git-download’ had two
> options:
>
>   1. If ‘git-reference’ specifies a full SHA1 commit ID, it would
>      look it up on SWH and fetch it.
>
>   2.
If ‘git-reference’ specifies a tag, which is perhaps the
>      majority of cases, Guix would ask SWH the commit that once
>      corresponded to that tag at that URL, and then fetch it.
>
> Case #1 is ideal: it’s content-addressed. Case #2 is brittle: we’re
> hoping that the tag hasn’t been modified and that the URL hasn’t been
> reused for something else; if that’s not the case, SWH might return
> the “wrong” commit and we end up fetching something unrelated.
>
> The good news is that our friends at SWH have just deployed a new
> version of their code that lets us look up directories by some
> “external identifier” (“ExtID”), among which there’s ‘nar-sha256’:
>
>   https://archive.softwareheritage.org/api/1/extid/doc/
>
> And that, my friends, makes a huge difference: the impedance mismatch
> is gone, we can now use content-addressing to fetch our stuff from SWH!!
> And that works not just for Git, but also for Mercurial, SVN, CVS, etc.
>
> Well, there’s a caveat: currently the ‘nar-sha256’ is added only on
> new visits and it’s apparently not being added yet for Mercurial for
> unclear reasons. So right now, we can get guile-sqlite3 0.1.3 (Git) by
> nar-sha256, but we cannot get guile-wisp (hg) nor in fact most things.
> That’ll improve over time though, and SWH comrades are open to adding
> those ExtIDs retroactively.
>
> The patches that follow do several things:
>
>   1. Follow redirects in the Vault: (guix swh) previously did not
>      do that (oops!) but the newly-deployed Vault now responds with
>      302 redirects so we have to handle that.
>
>   2. Add bindings for the ExtID HTTP interface.
>
>   3. Add ‘swh-download-directory-by-nar-hash’, which does what it
>      says.
>
>   4. Use that as the preferred fallback method for ‘git-fetch’.
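
To make item 4 concrete, the fallback order can be sketched roughly as
below.  This is only an illustration under stated assumptions, not the
patched code: 'by-nar-hash' and 'by-commit' are hypothetical stubs
standing in for 'swh-download-directory-by-nar-hash' and 'swh-download',
they perform no network access, and the hash is a plain string here
rather than a bytevector.

```scheme
;; Hypothetical stub for 'swh-download-directory-by-nar-hash': pretend the
;; archive knows exactly one nar hash.  Returns #t on success, #f otherwise.
(define (by-nar-hash hash algorithm directory)
  (and (eq? algorithm 'sha256)
       (equal? hash "known-hash")))

;; Hypothetical stub for 'swh-download': pretend that looking up the
;; commit or tag at URL always succeeds.
(define (by-commit url commit directory)
  #t)

;; Hypothetical wrapper mirroring the 'or'/'and' shape of the fallback.
(define* (download-with-fallback url commit directory
                                 #:key hash hash-algorithm)
  "Return a symbol naming the method that succeeded, or #f if none did."
  (or (and hash hash-algorithm
           (by-nar-hash hash hash-algorithm directory)
           'nar-hash)                   ;content-addressed lookup worked
      (and (by-commit url commit directory)
           'commit)))                   ;fell back to the commit or tag

;; With a hash the stub knows, this evaluates to the symbol nar-hash.
(download-with-fallback "https://example.org/foo.git" "v1.0" "/tmp/foo"
                        #:hash "known-hash" #:hash-algorithm 'sha256)

;; Without a hash, this evaluates to the symbol commit.
(download-with-fallback "https://example.org/foo.git" "v1.0" "/tmp/foo")
```

Trying the nar hash first matters for the reason given above: it does
not depend on the tag still pointing at the same commit, whereas the
commit or tag lookup is kept only as a second resort.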
>
> Here’s a REPLshot:
>
> scheme@(guile-user)> (lookup-external-id "nar-sha256" (content-hash-value (origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) )
> $43 = #< value: "0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63" type: "nar-sha256" version: 0 target: "swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153" target-url: "https://archive.softwareheritage.org/swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153">
> scheme@(guile-user)> (swh-download-directory-by-nar-hash (content-hash-value(origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) 'sha256 "/tmp/gsql")
> SWH: found directory with nar-sha256 hash 0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63 at 'swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153'
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/.gitignore
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/AUTHORS
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING.LESSER
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/ChangeLog
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/Makefile.am
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/NEWS
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/README
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/guile.am
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/test-driver.scm
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/configure.ac
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/env.in
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/sqlite3.scm.in
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/basic.scm
> $46 = #t
>
> Huge thanks to everyone over at #swh-devel for helping me out
> over the
past few days!
>
> Next tasks: implement download fallback for ‘hg-fetch’, change
> ‘guix lint -c archival’ to make ‘save-origin’ requests not just
> for Git repos, assess the situation with SVN and sub-directories
> to see what can be done.
>
> Thoughts?
>
> Ludo’.
>
> PS: Apologies for the wall of text!
>
> Ludovic Courtès (6):
>   swh: ‘vault-fetch’ follows redirects.
>   swh: Add bindings for the “ExtID” API.
>   swh: Add ‘swh-download-directory-by-nar-hash’.
>   lint: archival: Check with ‘lookup-directory-by-nar-hash’.
>   git-download: Download from SWH by nar hash when possible.
>   swh: Fix docstring of ‘lookup-directory’.
>
>  guix/build/git.scm                |  20 ++++--
>  guix/git-download.scm             |   4 +-
>  guix/lint.scm                     |  28 +++++---
>  guix/scripts/perform-download.scm |   4 +-
>  guix/swh.scm                      | 113 ++++++++++++++++++++++++++----
>  tests/lint.scm                    |  33 +++++++--
>  tests/swh.scm                     |  21 +++++-
>  7 files changed, 189 insertions(+), 34 deletions(-)
>
>
> base-commit: 8bee6bb9aaaf35c36fe325675d1eb2daebd69c25

From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 12 06:42:51 2024 Received: (at 68741-done) by debbugs.gnu.org; 12 Feb 2024 11:42:51 +0000 Received: from localhost ([127.0.0.1]:50509 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rZUi2-0002sh-QW for submit@debbugs.gnu.org; Mon, 12 Feb 2024 06:42:51 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:55108) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rZUi1-0002rt-0Y for 68741-done@debbugs.gnu.org; Mon, 12 Feb 2024 06:42:49 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rZUP1-0003oo-NB; Mon, 12 Feb 2024 06:23:11 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; 
h=MIME-Version:Date:References:In-Reply-To:Subject:To: From; bh=eOmGzOy4W0Y6XIgK68AzxZSuq07rnvR1van8CiyCxDE=; b=chfthm34/R7HfPCJOByO PE98bnqacjWGsEmJLZGyQf89MBldA6gYR3SkVPEbv5sWhfDoO4+8iXEBJdGCfk7Uyl83dFeN8VPBZ bHrK8lC0WqJktJaq5cLfq+z6izGwVeUPsCdk96qN0c5YTO/aXzVB8j0NHiEomFymj4bUoGU6elJ2m E31yK9ylAo1Vz8satHU9HA3CWHGKM/33+DurCeeiXrnmlQDeTVCLruOpDejSq7mefCbFgUdiFmzrV qvWOucX6yp+/5bgRj9XZoxfhXe0m0jYMEg4IXXkVVxzaVGM6HSdpf5nw2kpbKn7smQNto2NsDINL9 rLe+Ka9MHzudGw==; From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: 68741-done@debbugs.gnu.org Subject: Re: [bug#68741] [PATCH 0/6] Content-addressed downloads from Software Heritage In-Reply-To: ("Ludovic =?utf-8?Q?Court?= =?utf-8?Q?=C3=A8s=22's?= message of "Fri, 26 Jan 2024 18:16:40 +0100") References: Date: Mon, 12 Feb 2024 12:23:07 +0100 Message-ID: <87le7qexk4.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 68741-done Cc: Josselin Poiret , Simon Tournier , Mathieu Othacehe , Tobias Geerinckx-Rice , Ricardo Wurmus , Christopher Baines X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

Hi,

Ludovic Courtès skribis:

>   swh: ‘vault-fetch’ follows redirects.
>   swh: Add bindings for the “ExtID” API.
>   swh: Add ‘swh-download-directory-by-nar-hash’.
>   lint: archival: Check with ‘lookup-directory-by-nar-hash’.
>   git-download: Download from SWH by nar hash when possible.
>   swh: Fix docstring of ‘lookup-directory’.

Pushed as 5a61ce6bcfbd0882956e40457232da737776abe7.
> Next tasks: implement download fallback for ‘hg-fetch’, change
> ‘guix lint -c archival’ to make ‘save-origin’ requests not just
> for Git repos, assess the situation with SVN and sub-directories
> to see what can be done.

Let’s make it happen!

Ludo’.

From unknown Sun Jun 22 11:32:48 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Tue, 12 Mar 2024 11:24:20 +0000 User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator