GNU bug report logs - #76081
OCI provisioning service

Previous Next

Package: guix-patches;

Reported by: paul <goodoldpaul <at> autistici.org>

Date: Wed, 5 Feb 2025 22:01:03 UTC

Severity: normal

Tags: moreinfo

Full log


Message #214 received at 76081 <at> debbugs.gnu.org (full text, mbox):

From: paul <goodoldpaul <at> autistici.org>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>,
 Gabriel Wicki <gabriel <at> erlikon.ch>, 76081 <at> debbugs.gnu.org
Subject: Re: [bug#76081] [PATCH v10 4/7] services: Add oci-service-type.
Date: Sun, 1 Jun 2025 02:18:16 +0200
[Message part 1 (text/plain, inline)]
Hi,

On 5/16/25 03:38, Maxim Cournoyer wrote:
>>   @lisp
>> -(service oci-container-service-type
>> -         (list
>> -          (oci-container-configuration
>> -           (network "host")
>> -           (image
>> -            (oci-image
>> -             (repository "guile")
>> -             (tag "3")
>> -             (value (specifications->manifest '("guile")))
>> -             (pack-options '(#:symlinks (("/bin/guile" -> "bin/guile"))
>> -                             #:max-layers 2))))
>> -           (entrypoint "/bin/guile")
>> -           (command
>> -            '("-c" "(display \"hello!\n\")")))
>> -          (oci-container-configuration
>> -           (image "prom/prometheus")
>> -           (ports
>> -             '(("9000" . "9000")
>> -               ("9090" . "9090"))))
>> -          (oci-container-configuration
>> -           (image "grafana/grafana:10.0.1")
>> -           (network "host")
>> -           (volumes
>> -             '("/var/lib/grafana:/var/lib/grafana")))))
>> +(simple-service 'oci-provisioning
>> +                oci-service-type
>> +                (oci-extension
>> +                  (networks
>> +                    (list
>> +                      (oci-network-configuration (name "monitoring"))))
>> +                  (containers
>> +                   (list
>> +                    (oci-container-configuration
>> +                     (network "monitoring")
>> +                     (image
>> +                      (oci-image
>> +                        (repository "guile")
>> +                        (tag "3")
>> +                        (value (specifications->manifest '("guile")))
>> +                        (pack-options '(#:symlinks (("/bin/guile" -> "bin/guile"))
>> +                                        #:max-layers 2))))
>> +                     (entrypoint "/bin/guile")
>> +                     (command
>> +                      '("-c" "(display \"hello!\n\")")))
>> +                    (oci-container-configuration
>> +                      (image "prom/prometheus")
>> +                      (network "host")
>> +                      (ports
>> +                       '(("9000" . "9000")
>> +                         ("9090" . "9090"))))
>> +                    (oci-container-configuration
>> +                      (image "grafana/grafana:10.0.1")
>> +                      (network "host")
>> +                      (volumes
>> +                       '("/var/lib/grafana:/var/lib/grafana")))))))
> I was curious: was is the gain to have by using the extension mechanism
> instead of directly accepting a list of 'oci-configuration' or similarly
> named record to the oci-service-type?
One can always pass a list of 
'oci-{container,network,volume}-configuration's to an 'oci-configuration':

(service oci-service-type
         (oci-configuration
          (runtime 'podman)
          (containers
           (list
            (oci-container-configuration ...)))))


but you loose the deduplication checks done by the extension mechanisms. 
This is useful especially for containers: both System/Home services and 
users of plain 'simple-service' in the system configuration can add 
Shepherd services for different reasons, if users do not set an explicit 
value to the 'provision' field of 'oci-container-configuration', the 
Shepherd service name will be derived from the image. So the 
'oci-extension' suggestion in the documentation is to take care of 
duplicates.
>> +(define (oci-object-exists? runtime runtime-cli object verbose?)
>> +  #~(lambda* (name #:key (format-string "{{.Name}}"))
>> +      (use-modules (ice-9 format)
>> +                   (ice-9 match)
>> +                   (ice-9 popen)
>> +                   (ice-9 rdelim)
>> +                   (srfi srfi-1))
>> +
>> +      (define (read-lines file-or-port)
>> +        (define (loop-lines port)
>> +          (let loop ((lines '()))
>> +            (match (read-line port)
>> +              ((? eof-object?)
>> +               (reverse lines))
>> +              (line
>> +               (loop (cons line lines))))))
>> +
>> +        (if (port? file-or-port)
>> +            (loop-lines file-or-port)
>> +            (call-with-input-file file-or-port
>> +              loop-lines)))
>> +
>> +      #$(if (eq? runtime 'podman)
>> +            #~(let ((command
>> +                     (list #$runtime-cli
>> +                           #$object "exists" name)))
>> +                (when #$verbose?
>> +                  (format #t "Running~{ ~a~}~%" command))
>> +                (define exit-code (status:exit-val (apply system* command)))
>> +                (when #$verbose?
>> +                  (format #t "Exit code: ~a~%" exit-code))
>> +                (equal? EXIT_SUCCESS exit-code))
>> +            #~(let ((command
>> +                     (string-append #$runtime-cli
>> +                                    " " #$object " ls --format "
>> +                                    "\"" format-string "\"")))
>> +                (when #$verbose?
>> +                  (format #t "Running ~a~%" command))
>> +                (member name (read-lines (open-input-pipe command)))))))
> That's a complex piece of staged code; have you considered placing it in
> a supporting build-side module, e.g. gnu/build/oci-containers.scm ?
> Then it could be more simply imported in the gexp environment and used.
> See the gnu/build/jami-service.scm for an example.
Thank you I was not aware of this option, I tried to move as much as I 
could to gnu/build/oci-containers.scm.
>> +(define* (oci-image-loader runtime runtime-cli name image tag #:key (verbose? #f))
>> +  "Return a file-like object that, once lowered, will evaluate to a program able
>> +to load IMAGE through RUNTIME-CLI and to tag it with TAG afterwards."
>> +  (let ((tarball (lower-oci-image name image)))
>>       (with-imported-modules '((guix build utils))
>> -      (program-file (format #f "~a-image-loader" name)
>> +      (program-file
>> +       (format #f "~a-image-loader" name)
>>          #~(begin
>>              (use-modules (guix build utils)
>> +                        (ice-9 match)
>>                           (ice-9 popen)
>> -                        (ice-9 rdelim))
>> -
>> -           (format #t "Loading image for ~a from ~a...~%" #$name #$tarball)
>> -           (define line
>> -             (read-line
>> -              (open-input-pipe
>> -               (string-append #$docker " load -i " #$tarball))))
>> -
>> -           (unless (or (eof-object? line)
>> -                       (string-null? line))
>> -             (format #t "~a~%" line)
>> -             (let ((repository&tag
>> -                    (string-drop line
>> -                                 (string-length
>> -                                   "Loaded image: "))))
>> -
>> -               (invoke #$docker "tag" repository&tag #$tag)
>> -               (format #t "Tagged ~a with ~a...~%" #$tarball #$tag))))))))
>> -
>> -(define (oci-container-shepherd-service config)
>> -  (define (guess-name name image)
>> -    (if (maybe-value-set? name)
>> -        name
>> -        (string-append "docker-"
>> -                       (basename
>> -                        (if (string? image)
>> -                            (first (string-split image #\:))
>> -                            (oci-image-repository image))))))
>> -
>> -  (let* ((docker (file-append docker-cli "/bin/docker"))
>> -         (actions (oci-container-configuration-shepherd-actions config))
>> +                        (ice-9 rdelim)
>> +                        (srfi srfi-1))
>> +           (define object-exists?
>> +             #$(oci-object-exists? runtime runtime-cli "image" verbose?))
>> +           (define load-command
>> +             (string-append #$runtime-cli
>> +                            " load -i " #$tarball))
>> +
>> +           (if (object-exists? #$tag #:format-string "{{.Repository}}:{{.Tag}}")
>> +               (format #t "~a image already exists, skipping.~%" #$tag)
>> +               (begin
>> +                 (format #t "Loading image for ~a from ~a...~%" #$name #$tarball)
>> +                 (when #$verbose?
>> +                   (format #t "Running ~a~%" load-command))
>> +                 (let ((line (read-line
>> +                              (open-input-pipe load-command))))
>> +                   (unless (or (eof-object? line)
>> +                               (string-null? line))
>> +                     (format #t "~a~%" line)
>> +                     (let* ((repository&tag
>> +                             (string-drop line
>> +                                          (string-length
>> +                                           "Loaded image: ")))
>> +                            (tag-command
>> +                             (list #$runtime-cli "tag" repository&tag #$tag))
>> +                            (drop-old-tag-command
>> +                             (list #$runtime-cli "image" "rm" "-f" repository&tag)))
>> +
>> +                       (unless (string=? repository&tag #$tag)
>> +                         (when #$verbose?
>> +                           (format #t "Running~{ ~a~}~%" tag-command))
>> +
>> +                         (let ((exit-code
>> +                                (status:exit-val (apply system* tag-command))))
>> +                           (format #t "Tagged ~a with ~a...~%" #$tarball #$tag)
>> +
>> +                           (when #$verbose?
>> +                             (format #t "Exit code: ~a~%" exit-code))
>> +
>> +                           (when (equal? EXIT_SUCCESS exit-code)
>> +                             (when #$verbose?
>> +                               (format #t "Running~{ ~a~}~%" drop-old-tag-command))
>> +                             (let ((drop-exit-code
>> +                                    (status:exit-val (apply system* drop-old-tag-command))))
>> +                               (when #$verbose?
>> +                                 (format #t "Exit code: ~a~%" drop-exit-code))))))))))))))))
>> +
>> +(define (oci-container-run-invocation runtime runtime-cli name command image-reference
>> +                                      options runtime-extra-arguments run-extra-arguments)
> That's a lot of arguments!  Perhaps it could accept a record instead?
> Please also mind the 80 chars max width.
Good idea, I made an intermediate oci-container-invocation record.
>> +  "Return a list representing the OCI runtime
>> +invocation for running containers."
>> +  ;; run [OPTIONS] IMAGE [COMMAND] [ARG...]
>> +  `(,runtime-cli ,@runtime-extra-arguments "run" "--rm"
>> +    ,@(if (eq? runtime 'podman)
>> +          ;; This is because podman takes some time to
>> +          ;; release container names.  --replace seems
>> +          ;; to be required to be able to restart services.
>> +          '("--replace")
>> +          '())
>> +    "--name" ,name
>> +    ,@options ,@run-extra-arguments
>> +    ,image-reference ,@command))
>> +
>> +(define* (oci-container-entrypoint runtime runtime-cli name image image-reference
>> +                                   invocation #:key (verbose? #f) (pre-script #~()))
> The default value for a keyword argument is #f, so I'd omit the explicit
> default for verbose?.
>> +              (begin
>> +                (when #$verbose?
>> +                  (format #t "Running~{ ~a~}~%" invocation))
>> +                (let ((exit-code (status:exit-val (apply system* invocation))))
>> +                  (when #$verbose?
>> +                    (format #t "Exit code: ~a~%" exit-code))))))
>> +        (list #$@invocations)))))
>> +
>> +(define* (oci-object-shepherd-service object runtime runtime-cli name requirement invocations
> Max width :-).

I should have dropped the default #f for keywords in all procedure, thanks.

>> +  "Return a file-like object that, once lowered, will evaluate to the entrypoint
>> +for the Shepherd service that will run IMAGE through RUNTIME-CLI."
>> +  (program-file
>> +   (string-append "oci-entrypoint-" name)
>> +   #~(begin
>> +       (use-modules (ice-9 format)
>> +                    (srfi srfi-1))
>> +       (when #$verbose?
>> +         (format #t "Running in verbose mode...~%")
>> +         (format #t "Current user: ~a ~a~%"
>> +                 (getuid) (passwd:name (getpwuid (getuid))))
>> +         (format #t "Current group: ~a ~a~%"
>> +                 (getgid) (group:name (getgrgid (getgid))))
>> +         (format #t "Current directory ~a~%" (getcwd)))
> I'd use a single format call; you can format the string naturally like
>
>              (format #t "\
> Running in verbose mode...
> Current user: ~a ~a
> ...
> " arg1 arg2, ...)
>
>
>> +       (define invocation (list #$@invocation))
>> +       #$@pre-script
>> +       (when #$verbose?
>> +         (format #t "Running~{ ~a~}~%" invocation))
>> +       (apply execlp `(,(first invocation) ,@invocation)))))
> You can more simply: (apply execlp (first invocation) invocation)
Thank you, I have addressed this in gnu/build/oci-containers.scm
>> +
>> +(define* (oci-container-shepherd-service runtime runtime-cli config
>> +                                         #:key
>> +                                         (runtime-environment #~())
>> +                                         (runtime-extra-arguments '())
>> +                                         (oci-requirement '())
>> +                                         (user #f)
>> +                                         (group #f)
>> +                                         (verbose? #f))
>> +  "Return a Shepherd service object that will run the OCI container represented
>> +by CONFIG through RUNTIME-CLI."
>> +  (let* ((actions (oci-container-configuration-shepherd-actions config))
>>            (auto-start?
>>             (oci-container-configuration-auto-start? config))
>> -         (user (oci-container-configuration-user config))
>> -         (group (oci-container-configuration-group config))
>> +         (maybe-user (oci-container-configuration-user config))
>> +         (maybe-group (oci-container-configuration-group config))
>> +         (user (if (maybe-value-set? maybe-user)
>> +                    maybe-user
>> +                    user))
>> +         (group (if (maybe-value-set? maybe-group)
>> +                    maybe-group
>> +                    group))
>>            (host-environment
>>             (oci-container-configuration-host-environment config))
>>            (command (oci-container-configuration-command config))
>>            (log-file (oci-container-configuration-log-file config))
>> -         (provision (oci-container-configuration-provision config))
>>            (requirement (oci-container-configuration-requirement config))
>>            (respawn?
>>             (oci-container-configuration-respawn? config))
>>            (image (oci-container-configuration-image config))
>>            (image-reference (oci-image-reference image))
>>            (options (oci-container-configuration->options config))
>> -         (name (guess-name provision image))
>> +         (name
>> +          (oci-container-shepherd-name runtime config))
>>            (extra-arguments
>> -          (oci-container-configuration-extra-arguments config)))
>> +          (oci-container-configuration-extra-arguments config))
>> +         (invocation
>> +          (oci-container-run-invocation
>> +           runtime runtime-cli name command image-reference
>> +           options runtime-extra-arguments extra-arguments))
>> +         (container-action
>> +          (lambda* (command #:key (environment-variables #f))
>> +            #~(lambda _
>> +                (fork+exec-command
>> +                 (list #$@command)
>> +                 #$@(if user (list #:user user) '())
>> +                 #$@(if group (list #:group group) '())
>> +                 #$@(if (maybe-value-set? log-file)
>> +                        (list #:log-file log-file)
>> +                        '())
>> +                 #$@(if (and user (eq? runtime 'podman))
>> +                        (list #:directory
>> +                              #~(passwd:dir (getpwnam #$user)))
>> +                        '())
>> +                 #$@(if environment-variables
>> +                        (list #:environment-variables
>> +                              environment-variables)
>> +                        '()))))))
> That's a very long let.  Since many values appear to simply be accessed
> from theh oci-container-configuration, match-record could be used to
> great effect.  For readability, I'd also define the action procedures
> using nested 'define' inside the let/match-record.
I tried to address this but it's really difficult to trim much, please 
let me know if there's some better way.
>> +(define (oci-service-subids config)
>> +  "Return a subids-extension record representing subuids and subgids required by
>> +the rootless Podman backend."
>> +  (define (delete-duplicate-ranges ranges)
>> +    (delete-duplicates ranges
>> +                       (lambda args
>> +                         (apply string=? (map subid-range-name ranges)))))
> I don't understand how the above 'delete-duplicate-ranges' procedure
> works; the 2nd argument of delete-duplicates is normally a predicate
> that accepts two values, but here it's a lambda that disregards the
> arguments: it seems it'd be wasteful to compute the same result over and
> over?
It is, I just replace this with a more descriptive error that allows the 
users to drop the duplicate without silently dropping a random one 
between the two.
>> +                          (response1
>> +                           (slurp bash "-c"
>> +                                  (string-append "ls -la /sys/fs/cgroup | "
>> +                                                 "grep -E ' \\./?$' | awk '{ print $4 }'")))
>> +                          (response2 (slurp bash "-c"
>> +                                            (string-append "ls -l /sys/fs/cgroup/cgroup"
>> +                                                           ".{procs,subtree_control,threads} | "
>> +                                                           "awk '{ print $4 }' | sort -u"))))
> Is there perhaps an easier way to check that the podman services were
> started successfully?  Perhaps using the podman command directly to
> query the state of the running containers?
>
> 'ls' is usually frowned for scripting; it's meant for interactive use.
> Perhaps you could use the shell wildcard feature or the find command
> instead?  I comment explaining what the above does/check exactly would
> be useful; it's not obvious.
I think this test can be dropped anyway since we later wait for Shepherd 
services to be started, it is a development leftover.
>> +             (let loop ((attempts 0))
>> +               (if (= attempts 60)
>> +                   (error "Services didn't come up after more than 60 seconds")
>> +                   (if (equal? '("cgroup" "cgroup")
>> +                               (run-test))
>> +                       #t
>> +                       (begin
>> +                         (sleep 1)
>> +                         (format #t "Services didn't come up yet, retrying with attempt ~a~%"
>> +                                 (+ 1 attempts))
>> +                         (loop (+ 1 attempts))))))))
> Note that we have a 'with-retries' definition in (gnu build
> jami-service) which could be used here;  perhaps it should be relocated
> elsewhere.

Thank you for the hint, I created gnu/build/utils.scm and tested my 
changes with:

guix shell -CPWN --writable-root nss-certs -m manifest.scm -- make 
check-system TESTS="pounce jami ngircd"

>> +                               (lambda (port)
>> +                                 (display (string-join response "\n") port)))))
>> +                         (lambda ()
>> +                           (primitive-exit 127))))
> The fork is used to be able to run the podman command with the right
> passwd and uid, right?  Maybe you could use the shepherd's provided
> 'command' or 'fork+exec-command', which have a nice API to do this.
> Also, is the dynamic-wind unnecessary?  I would have thought that if a
> forked children aborts, its exit status should be reflected accordingly?
You bring two good points, addressed by dropping dynamic-wind and 
refactoring duplicate code in tests.
> The idiom of forking ot invoke podman is used so commonly it should be
> extracted to a procedure or syntax and reused.
This should be addressed now.
> Phew, that was a large chunk!  I'll continue my review with the
> remaining commits, but when I'm done, could you please submit a v11,
> taking into account my comments?
Thank you a lot for taking the time to go through all of this, I hope I 
addressed all your comments successfully. Please let me know if 
something seems off.

I'll send a v11 soon.

cheers,

giacomo
[Message part 2 (text/html, inline)]

This bug report was last modified 12 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.