GNU bug report logs - #66650
[PATCH] git: Shell out to ‘git gc’ when necessary.


Package: guix-patches;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Fri, 20 Oct 2023 16:17:01 UTC

Severity: normal

Tags: patch

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#66650: closed ([PATCH] git: Shell out to ‘git gc’ when necessary.)
Date: Wed, 22 Nov 2023 16:01:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Wed, 22 Nov 2023 17:00:07 +0100
with message-id <87h6ldkcc8.fsf <at> gnu.org>
and subject line Re: [bug#66650] bug#65720: Guile-Git-managed checkouts grow way too much
has caused the debbugs.gnu.org bug report #66650,
regarding [PATCH] git: Shell out to ‘git gc’ when necessary.
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
66650: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=66650
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: guix-patches <at> gnu.org
Cc: Ludovic Courtès <ludo <at> gnu.org>, 65720 <at> debbugs.gnu.org,
 Josselin Poiret <dev <at> jpoiret.xyz>, Simon Tournier <zimon.toutoune <at> gmail.com>
Subject: [PATCH] git: Shell out to ‘git gc’ when necessary.
Date: Fri, 20 Oct 2023 18:15:12 +0200
Fixes <https://issues.guix.gnu.org/65720>.

This fixes a bug whereby libgit2-managed checkouts would keep growing as
we fetch.

* guix/git.scm (packs-in-git-repository, maybe-run-git-gc): New
procedures.
(update-cached-checkout): Call ‘maybe-run-git-gc’.
---
 guix/git.scm | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

Hi!

This is a radical fix/workaround for the unbounded Git checkout growth
problem, shelling out to ‘git gc’ when it’s likely needed (“too many”
pack files around).

I thought we might be able to implement a ‘git gc’ approximation using
the libgit2 “packbuilder” interface, but I haven’t got around to doing
it: <https://libgit2.org/libgit2/#HEAD/search/pack>.

Once again, shelling out is not my favorite option, but it’s a bug we
should fix sooner rather than later, hence this compromise.

Thoughts?

Ludo’.

diff --git a/guix/git.scm b/guix/git.scm
index b7182305cf..d704b62333 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2017, 2020 Mathieu Othacehe <m.othacehe <at> gmail.com>
-;;; Copyright © 2018-2022 Ludovic Courtès <ludo <at> gnu.org>
+;;; Copyright © 2018-2023 Ludovic Courtès <ludo <at> gnu.org>
 ;;; Copyright © 2021 Kyle Meyer <kyle <at> kyleam.com>
 ;;; Copyright © 2021 Marius Bakke <marius <at> gnu.org>
 ;;; Copyright © 2022 Maxime Devos <maximedevos <at> telenet.be>
@@ -29,15 +29,16 @@ (define-module (guix git)
   #:use-module (guix cache)
   #:use-module (gcrypt hash)
   #:use-module ((guix build utils)
-                #:select (mkdir-p delete-file-recursively))
+                #:select (mkdir-p delete-file-recursively invoke/quiet))
   #:use-module (guix store)
   #:use-module (guix utils)
   #:use-module (guix records)
   #:use-module (guix gexp)
   #:autoload   (guix git-download)
   (git-reference-url git-reference-commit git-reference-recursive?)
+  #:autoload   (guix config) (%git)
   #:use-module (guix sets)
-  #:use-module ((guix diagnostics) #:select (leave warning))
+  #:use-module ((guix diagnostics) #:select (leave warning info))
   #:use-module (guix progress)
   #:autoload   (guix swh) (swh-download commit-id?)
   #:use-module (rnrs bytevectors)
@@ -428,6 +429,35 @@ (define (delete-checkout directory)
     (rename-file directory trashed)
     (delete-file-recursively trashed)))
 
+(define (packs-in-git-repository directory)
+  "Return the number of pack files under DIRECTORY, a Git checkout."
+  (catch 'system-error
+    (lambda ()
+      (let ((directory (opendir (in-vicinity directory ".git/objects/pack"))))
+        (let loop ((count 0))
+          (match (readdir directory)
+            ((? eof-object?)
+             (closedir directory)
+             count)
+            (str
+             (loop (if (string-suffix? ".pack" str)
+                       (+ 1 count)
+                       count)))))))
+    (const 0)))
+
+(define (maybe-run-git-gc directory)
+  "Run 'git gc' in DIRECTORY if needed."
+  ;; XXX: As of libgit2 1.3.x (used by Guile-Git), there's no support for GC.
+  ;; Each time a checkout is pulled, a new pack is created, which eventually
+  ;; takes up a lot of space (lots of small, poorly-compressed packs).  As a
+  ;; workaround, shell out to 'git gc' when the number of packs in a
+  ;; repository has become "too large", potentially wasting a lot of space.
+  ;; See <https://issues.guix.gnu.org/65720>.
+  (when (> (packs-in-git-repository directory) 25)
+    (info (G_ "compressing cached Git repository at '~a'...~%")
+          directory)
+    (invoke/quiet %git "-C" directory "gc")))
+
 (define* (update-cached-checkout url
                                  #:key
                                  (ref '())
@@ -515,6 +545,9 @@ (define* (update-cached-checkout url
                    seconds seconds
                    nanoseconds nanoseconds))))
 
+       ;; Run 'git gc' if needed.
+       (maybe-run-git-gc cache-directory)
+
        ;; When CACHE-DIRECTORY is a sub-directory of the default cache
        ;; directory, remove expired checkouts that are next to it.
        (let ((parent (dirname cache-directory)))

base-commit: 6b0a32196982a0a2f4dbb59d35e55833a5545ac6
-- 
2.41.0
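
For readers who want to try the heuristic by hand, here is a minimal shell sketch of the same idea as the patch above: count the '.pack' files under '.git/objects/pack' and run 'git gc' only when there are more than 25 of them.  The function names 'count_packs' and 'maybe_gc' and the optional threshold argument are hypothetical, chosen for illustration; the patch itself does this in Scheme with 'packs-in-git-repository' and 'maybe-run-git-gc'.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the default threshold of 25
# mirrors the value used in the patch.

# Print the number of *.pack files under REPO/.git/objects/pack.
# Print 0 when the directory does not exist, like the (const 0) handler.
count_packs() {
    pack_dir="$1/.git/objects/pack"
    [ -d "$pack_dir" ] || { echo 0; return 0; }
    find "$pack_dir" -maxdepth 1 -type f -name '*.pack' | wc -l | tr -d ' '
}

# Run 'git gc' in REPO when it holds more than THRESHOLD packs.
maybe_gc() {
    repo="$1"
    threshold="${2:-25}"
    if [ "$(count_packs "$repo")" -gt "$threshold" ]; then
        echo "compressing Git repository at '$repo'..." >&2
        git -C "$repo" gc --quiet
    fi
}
```

With these definitions, 'maybe_gc /path/to/checkout' (a placeholder path) would leave a checkout with 25 packs or fewer untouched and would run 'git gc' on one with more.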



[Message part 3 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: Simon Tournier <zimon.toutoune <at> gmail.com>
Cc: Josselin Poiret <dev <at> jpoiret.xyz>, Christopher Baines <mail <at> cbaines.net>,
 65720-done <at> debbugs.gnu.org, 66650-done <at> debbugs.gnu.org
Subject: Re: [bug#66650] bug#65720: Guile-Git-managed checkouts grow way too
 much
Date: Wed, 22 Nov 2023 17:00:07 +0100
Hi,

Simon Tournier <zimon.toutoune <at> gmail.com> skribis:

> Somehow I was expressing: my view probably falls into the “Premature
> optimization is the root of all evil” category.  In other words, I have no
> objection, and I will revisit the issue when I am on fire, if I ever am,
> or annoyed for real.

Alright!

Pushed as b150c546b04c9ebb09de9f2c39789221054f5eea.

Let’s see how it behaves and if there are problems we had overlooked…

Ludo’.


This bug report was last modified 1 year and 182 days ago.


