Package: guix-patches
Reported by: kyle <kyle <at> posteo.net>
Date: Wed, 22 Feb 2023 05:18:02 UTC
Severity: normal
Tags: patch
From: kyle <kyle <at> posteo.net>
To: 61701 <at> debbugs.gnu.org
Cc: Kyle Andrews <kyle <at> posteo.net>
Subject: [bug#61701] [PATCH] doc: Propose new cookbook section for reproducible research.
Date: Wed, 22 Feb 2023 05:17:29 +0000
From: Kyle Andrews <kyle <at> posteo.net>

The intent was to cover the most common cases where researchers using R
and Python could rapidly achieve the benefits of reproducibility.
---
 doc/guix-cookbook.texi       | 176 +++++++++++++++++++++++++++++++++++
 guix/build-system/python.scm |   1 +
 2 files changed, 177 insertions(+)

diff --git a/doc/guix-cookbook.texi b/doc/guix-cookbook.texi
index b9fb916f4a..8a10bcbec7 100644
--- a/doc/guix-cookbook.texi
+++ b/doc/guix-cookbook.texi
@@ -114,6 +114,7 @@ Top
 
 Environment management
 
+* Reproducible Research in Practice::  Write manifests to create reproducible environments
 * Guix environment via direnv::  Setup Guix environment with direnv
 
 Installing Guix on a Cluster
@@ -3538,9 +3539,184 @@ Environment management
 demonstrate such utilities.
 
 @menu
+* Reproducible Research in Practice::  Write manifests to create reproducible environments
 * Guix environment via direnv::  Setup Guix environment with direnv
 @end menu
 
+@node Reproducible Research in Practice
+@section Common scientific software environments
+
+Many researchers write applied scientific software supported by a
+mixture of more generic tools developed by teams working within the R
+and Python ecosystems and supporting shell utilities.  Even researchers
+who predominantly stick to using just R or just Python often have to use
+both R and Python at the same time when collaborating with others.  This
+tutorial covers strategies for creating manifests to handle such
+situations.
+
+Widely used R packages are hosted on CRAN, which employs a strict test
+suite backed by continuous integration infrastructure for the latest R
+version.  A positive result of this rigid discipline is that most R
+packages from the same period of time will interoperate well together
+when used with a particular R version.  This means there is a clear
+low-complexity target for achieving a reproducible environment.
+
+Writing a manifest for packaging R code alone requires only minimal
+knowledge of the Guix infrastructure.  This stub should work for most
+cases involving the R packages already in Guix.
+
+@example
+(use-modules
+ (gnu packages cran)
+ (gnu packages statistics))
+
+(packages->manifest
+ (list r r-tidyverse))
+@end example
+
+R packages are defined predominantly in the @file{gnu/packages/cran.scm}
+and @file{gnu/packages/statistics.scm} files of the Guix source repository.
+
+This manifest can be used with the basic @command{guix shell} command:
+
+@example
+guix shell --manifest=manifest.scm --container
+@end example
+
+Please remember at the end to pin your channels so that others in the
+future know how to recover your exact Guix environment.
+
+@example
+guix describe --format=channels > channels.scm
+@end example
+
+Others can then recreate that environment with @command{guix time-machine}:
+
+@example
+guix time-machine --channels=channels.scm \
+  -- guix shell --manifest=manifest.scm --container
+@end example
+
+In contrast, the Python scientific ecosystem is far less
+standardized.  There is no effort made to integrate all Python packages
+together.  While there is a latest Python version, it is less
+dominantly used for various reasons, such as the fact that Python tends
+to be employed by much larger teams than R is.  This makes packaging up
+reproducible Python environments much more difficult.  Adding R together
+with Python as a mixture complicates things still further.  However, we
+have to be mindful of the goals of reproducible research.
+
+If reproducibility becomes an end in itself and not a catalyst towards
+faster discovery, then Guix will be a non-starter for scientists.  Their
+goal is to develop useful understanding about particular aspects of the
+world.
+
+Thankfully, three common scenarios cover the vast majority of
+needs.  These are:
+
+@itemize
+@item
+combining standard package definitions with custom package definitions
+@item
+combining package definitions from the current revision with other revisions
+@item
+combining package variants which need a modified build-system
+@end itemize
+
+In the rest of the tutorial we develop a manifest which tackles all
+three of these common issues.  The hope is that if researchers see the
+hardest common situation as being readily solvable without writing
+thousands of lines of code, they will clearly see it as worth the
+effort, which will not pose a significant detour from the main line of
+their research.
+
+@example
+(use-modules
+ (guix packages)
+ (guix download)
+ (guix licenses)
+ (guix profiles)
+ (gnu packages)
+ (gnu packages cran)
+ (gnu packages statistics)
+ (guix inferior)
+ (guix channels)
+ (guix build-system python))
+
+;; guix import pypi APTED
+(define python-apted
+  (package
+    (name "python-apted")
+    (version "1.0.3")
+    (source (origin
+              (method url-fetch)
+              (uri (pypi-uri "apted" version))
+              (sha256
+               (base32
+                "1sawf6s5c64fgnliwy5w5yxliq2fc215m6alisl7yiflwa0m3ymy"))))
+    (build-system python-build-system)
+    (home-page "https://github.com/JoaoFelipe/apted")
+    (synopsis "APTED algorithm for the Tree Edit Distance")
+    (description "APTED algorithm for the Tree Edit Distance")
+    (license expat)))
+
+(define last-guix-with-python-3.6
+  (list
+   (channel
+    (name 'guix)
+    (url "https://git.savannah.gnu.org/git/guix.git")
+    (commit
+     "d66146073def03d1a3d61607bc6b77997284904b"))))
+
+(define connection-to-last-guix-with-python-3.6
+  (inferior-for-channels last-guix-with-python-3.6))
+
+(define first car)
+
+(define python-3.6
+  (first
+   (lookup-inferior-packages
+    connection-to-last-guix-with-python-3.6 "python")))
+
+(define python3.6-numpy
+  (first
+   (lookup-inferior-packages
+    connection-to-last-guix-with-python-3.6 "python-numpy")))
+
+(define included-packages
+  (list r r-reticulate))
+
+(define inferior-packages
+  (list python-3.6 python3.6-numpy))
+
+(define package-with-python-3.6
+  (package-with-explicit-python python-3.6
+    "python-" "python3.6-" 'python3-variant))
+
+(define custom-variant-packages
+  (list (package-with-python-3.6 python-apted)))
+
+(concatenate-manifests
+ (map packages->manifest
+      (list
+       included-packages
+       inferior-packages
+       custom-variant-packages)))
+@end example
+
+This should produce a profile with the latest R and an older Python
+3.6.  These should be able to interoperate with code like:
+
+@example
+library(reticulate)
+use_python("python")
+apted = import("apted")
+t1 = '@{a@{b@}@{c@}@}'
+t2 = '@{a@{b@{d@}@}@}'
+metric = apted$APTED(t1, t2)
+distance = metric$compute_edit_distance()
+@end example
+
 @node Guix environment via direnv
 @section Guix environment via direnv
 
diff --git a/guix/build-system/python.scm b/guix/build-system/python.scm
index c8f04b2298..d4aaab906d 100644
--- a/guix/build-system/python.scm
+++ b/guix/build-system/python.scm
@@ -36,6 +36,7 @@ (define-module (guix build-system python)
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
   #:export (%python-build-system-modules
+            package-with-explicit-python
             package-with-python2
             strip-python2-variant
             default-python
-- 
2.37.2