GNU bug report logs - #68789
[PATCH 0/2] gnu: Add python-pyjanitor.

Package: guix-patches;

Reported by: Troy Figiel <troy <at> troyfigiel.com>

Date: Sun, 28 Jan 2024 22:51:01 UTC

Severity: normal

Tags: patch

Done: Sharlatan Hellseher <sharlatanus <at> gmail.com>

Bug is archived. No further changes may be made.


Report forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Sun, 28 Jan 2024 22:51:01 GMT)

Acknowledgement sent to Troy Figiel <troy <at> troyfigiel.com>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Sun, 28 Jan 2024 22:51:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: guix-patches <at> gnu.org
Subject: [PATCH 0/2] gnu: Add python-pyjanitor.
Date: Sun, 28 Jan 2024 23:49:30 +0100
This patch series adds python-pyjanitor and its dependency python-unyt.

Troy Figiel (2):
  gnu: Add python-unyt.
  gnu: Add python-pyjanitor.

 gnu/packages/python-science.scm | 88 +++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)


base-commit: 08ed3ec64ecd571d92d497b2493f5c0225102c99
-- 
2.42.0





Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Sun, 28 Jan 2024 22:54:02 GMT)

Message #8 received at 68789 <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: 68789 <at> debbugs.gnu.org
Subject: [PATCH 1/2] gnu: Add python-unyt.
Date: Sun, 28 Jan 2024 22:47:17 +0100
* gnu/packages/python-science.scm (python-unyt): New variable.
---
 gnu/packages/python-science.scm | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index 9d72608de4..3013c77c34 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -42,6 +42,7 @@
 (define-module (gnu packages python-science)
   #:use-module ((guix licenses) #:prefix license:)
   #:use-module (gnu packages)
+  #:use-module (gnu packages astronomy)
   #:use-module (gnu packages base)
   #:use-module (gnu packages bioinformatics)
   #:use-module (gnu packages boost)
@@ -1217,6 +1218,34 @@ (define-public python-statannot
 annotations on an existing boxplots and barplots generated by seaborn.")
     (license license:expat)))
 
+(define-public python-unyt
+  (package
+    (name "python-unyt")
+    (version "3.0.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "unyt" version))
+       (sha256
+        (base32 "00900bw24rxgcgwgxp9xlx0l5im96r1n5hn0r3mxvbdgc3lyyq48"))))
+    (build-system pyproject-build-system)
+    (propagated-inputs (list python-h5py ;optional import
+                             python-matplotlib ;optional import
+                             python-numpy
+                             python-sympy))
+    ;; python-astropy and python-pint are also optional imports, but we do not
+    ;; propagate them due to their sizes.
+    (native-inputs (list python-astropy python-pint python-pytest))
+    (home-page "https://unyt.readthedocs.io")
+    (synopsis "Library for working with data that has physical units")
+    (description
+     "Writing code that deals with data with physical units can be confusing.
+A function might return an array but at least with plain @code{numpy}, there
+is no way to easily tell what the units of the data are without somehow
+knowing a priori.  @code{unyt} handles this problem by providing a subclass of
+the @code{ndarray} class in @code{numpy} that is unit aware.")
+    (license license:bsd-3)))
+
 (define-public python-upsetplot
   (package
     (name "python-upsetplot")
-- 
2.42.0





Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Sun, 28 Jan 2024 22:54:02 GMT)

Message #11 received at 68789 <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: 68789 <at> debbugs.gnu.org
Subject: [PATCH 2/2] gnu: Add python-pyjanitor.
Date: Sun, 28 Jan 2024 23:13:19 +0100
* gnu/packages/python-science.scm (python-pyjanitor): New variable.
---
 gnu/packages/python-science.scm | 59 +++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index 3013c77c34..00b7e6cae1 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -48,6 +48,7 @@ (define-module (gnu packages python-science)
   #:use-module (gnu packages boost)
   #:use-module (gnu packages build-tools)
   #:use-module (gnu packages check)
+  #:use-module (gnu packages chemistry)
   #:use-module (gnu packages cpp)
   #:use-module (gnu packages crypto)
   #:use-module (gnu packages databases)
@@ -771,6 +772,64 @@ (define-public python-pandera
 @end itemize")
     (license license:expat)))
 
+(define-public python-pyjanitor
+  (package
+    (name "python-pyjanitor")
+    (version "0.26.0")
+    (source
+     (origin
+       ;; The build requires the mkdocs directory for the description in
+       ;; setup.py. This is not included in the PyPI tarball.
+       (method git-fetch)
+       (uri (git-reference
+             (url "https://github.com/pyjanitor-devs/pyjanitor")
+             (commit (string-append "v" version))))
+       (file-name (git-file-name name version))
+       (sha256
+        (base32 "1f8xbl1k9l2z56bapp7v6bd3016zrk48igcaz6hb553r6yfl7vfx"))))
+    (build-system pyproject-build-system)
+    ;; Pyjanitor has an extensive test suite. For quick debugging, the tests
+    ;; marked turtle can be skipped using "-m" "not turtle".
+    (arguments
+     (list
+      #:test-flags '(list
+                     ;; Tries to connect to the internet.
+                     "-k"
+                     "not test_is_connected"
+
+                     ;; PySpark has not been packaged yet.
+                     "--ignore"
+                     "tests/spark")
+      #:phases #~(modify-phases %standard-phases
+                   (add-before 'check 'set-env-ci
+                     (lambda _
+                       ;; Some tests are skipped if the JANITOR_CI_MACHINE
+                       ;; variable is not set.
+                       (setenv "JANITOR_CI_MACHINE" "1"))))))
+    (propagated-inputs (list python-multipledispatch
+                             python-natsort
+                             python-pandas-flavor
+                             python-scipy
+
+                             ;; Optional imports.
+                             python-biopython ;biology submodule
+                             python-unyt)) ;engineering submodule
+    (native-inputs (list python-pytest
+
+                         ;; Optional imports. We do not propagate them due to
+                         ;; their size.
+                         python-numba ;speedup of joins
+                         rdkit)) ;chemistry submodule
+    (home-page "https://github.com/pyjanitor-devs/pyjanitor")
+    (synopsis "Tools for cleaning and transforming pandas DataFrames")
+    (description
+     "@code{pyjanitor} provides a set of data cleaning routines for
+@code{pandas} DataFrames.  These routines extend the method chaining API
+defined by @code{pandas} for a subset of its methods.  Originally, this
+package was a port of the R package by the same name and it is inspired by the
+ease-of-use and expressiveness of the @code{dplyr} package.")
+    (license license:expat)))
+
 (define-public python-pythran
   (package
     (name "python-pythran")
-- 
2.42.0





Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Mon, 29 Jan 2024 14:27:01 GMT)

Message #14 received at 68789 <at> debbugs.gnu.org:

From: Sharlatan Hellseher <sharlatanus <at> gmail.com>
To: Troy Figiel <troy <at> troyfigiel.com>
Cc: 68789 <at> debbugs.gnu.org
Subject: [PATCH 0/2] gnu: Add python-pyjanitor.
Date: Mon, 29 Jan 2024 14:26:09 +0000
Hi Troy,

Thank you for the patches!

I'm in the process of packaging python-yt in (gnu packages astronomy)
and I've noticed that python-unyt is part of it, which brought me here
:-) So I started reviewing this issue.

One note: you introduced a module cycle that did not exist before,
astronomy->python-science->astronomy. If the requirement on
python-astropy is soft, let's silence it for now.

Also, I've already updated the whole chain depending on python-astropy
after its update to 6.0.0; I'm letting you know in case your work requires
a fresh Astropy version. It will be in review on the 20th of next month.

What do you think?

Regards,
Oleg

Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Mon, 29 Jan 2024 17:20:01 GMT)

Message #17 received at 68789 <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: Sharlatan Hellseher <sharlatanus <at> gmail.com>
Cc: 68789 <at> debbugs.gnu.org
Subject: Re: [PATCH 0/2] gnu: Add python-pyjanitor.
Date: Mon, 29 Jan 2024 18:18:57 +0100
Hi Oleg,

Thanks for the check!

On 2024-01-29 15:26, Sharlatan Hellseher wrote:
> One note: you introduced a module cycle that did not exist before,
> astronomy->python-science->astronomy. If the requirement on
> python-astropy is soft, let's silence it for now.

Removing the python-astropy dependency should be fine for python-unyt. I
agree that avoiding module cycles would be better. If I recall
correctly, Astropy was only used in tests, because it has a similar
submodule dealing with physical units.

The build was successful and the cycle did not show up in the linter.
How did you find it? Did you happen to notice it when you saw the imports?

Best wishes,

Troy

Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Mon, 29 Jan 2024 17:33:02 GMT)

Message #20 received at 68789 <at> debbugs.gnu.org:

From: Sharlatan Hellseher <sharlatanus <at> gmail.com>
To: Troy Figiel <troy <at> troyfigiel.com>
Cc: 68789 <at> debbugs.gnu.org
Subject: Re: [PATCH 0/2] gnu: Add python-pyjanitor.
Date: Mon, 29 Jan 2024 17:31:56 +0000
Hi,


> How did you find it? Did you happen to notice it when you saw the imports?

It usually pops up in issues about efforts to break module cycles,
e.g. https://issues.guix.gnu.org/54539.

I'm not quite sure how critical it is right now, but there was a discussion
that cycles in modules slow down ~guix pull~.

Let's comment astropy out with a note about the optional test dependency
and the potential module cycle.

Looking forward to v2; the patches look good.

If you have a wider plan of upcoming patches, please share it so we can
coordinate efforts ;-).

Regards,
Oleg

Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Mon, 29 Jan 2024 18:15:01 GMT)

Message #23 received at 68789 <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: Sharlatan Hellseher <sharlatanus <at> gmail.com>
Cc: 68789 <at> debbugs.gnu.org
Subject: Re: [PATCH 0/2] gnu: Add python-pyjanitor.
Date: Mon, 29 Jan 2024 19:13:42 +0100
Now that you mention it, there are quite a few cycles. To name a few:

- astronomy->python-science->python-xyz->astronomy
- databases->python-xyz->databases
- bioinformatics->python-science->bioinformatics

On 2024-01-29 18:31, Sharlatan Hellseher wrote:
> If you have a wider plan of upcoming patches, please share it so we can
> coordinate efforts ;-).

There is only the guix-devel list, right? No Python-specific list?

When it comes to the Python ecosystem, I have been looking at

- python-shap
- python-cocotb (#68153)
- ruff

Unfortunately, ruff has caused me some headaches since it uses a Rust
workspace definition. I will probably have to write to guix-devel for
advice sooner or later.

I also still have some Golang packages on my radar, since long-term I
would like to see opentofu and gotenberg included. That might be going
off-topic a bit :-)
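
To make the module cycles named in this message concrete, here is a minimal Guile sketch of a depth-first search that reports a path leading from a module back to itself. It is an illustration only: the `import-graph` table is toy data that mirrors the edges listed in this thread, not something extracted from the Guix source tree, and `imports-of` and `find-cycle` are hypothetical helper names.

```scheme
(use-modules (srfi srfi-1))             ;for `any'

;; Toy import graph: each entry is (module . modules-it-imports).
;; The edges mirror the cycles named in this thread (assumption: this is
;; not the real import graph of gnu/packages).
(define import-graph
  '((astronomy      . (python-science))
    (python-science . (python-xyz bioinformatics))
    (python-xyz     . (astronomy databases))
    (databases      . (python-xyz))
    (bioinformatics . (python-science))))

;; Return the list of modules imported by MODULE, or '() if unknown.
(define (imports-of module)
  (or (assq-ref import-graph module) '()))

;; Depth-first search from START.  Return the first path found that leads
;; back to START, as a list beginning and ending with START, or #f if
;; there is none.
(define (find-cycle start)
  (let loop ((module start) (path (list start)))
    (any (lambda (next)
           (cond ((eq? next start) (reverse (cons next path)))
                 ((memq next path) #f)  ;already on this path, skip it
                 (else (loop next (cons next path)))))
         (imports-of module))))
```

With this toy data, `(find-cycle 'astronomy)` evaluates to `(astronomy python-science python-xyz astronomy)`. On a real checkout, `guix graph --type=module PACKAGE` prints the graph of package modules, which is probably a better starting point than a hand-written table.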

Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Mon, 29 Jan 2024 18:21:01 GMT)

Message #26 received at 68789 <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: 68789 <at> debbugs.gnu.org
Subject: [PATCH v2 0/2] gnu: Add python-pyjanitor.
Date: Mon, 29 Jan 2024 19:18:22 +0100
This is the updated patch series. I have rebased it on the current master and made the suggested changes.

Troy Figiel (2):
  gnu: Add python-unyt.
  gnu: Add python-pyjanitor.

 gnu/packages/python-science.scm | 88 +++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)


base-commit: 21e4d6cd6913eca131f2c0fd0cd509fc843c7eb8
-- 
2.42.0





Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Mon, 29 Jan 2024 18:22:02 GMT)

Message #29 received at 68789 <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: 68789 <at> debbugs.gnu.org
Subject: [PATCH 1/2] gnu: Add python-unyt.
Date: Mon, 29 Jan 2024 19:16:54 +0100
* gnu/packages/python-science.scm (python-unyt): New variable.
---
 gnu/packages/python-science.scm | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index f775d46349..3390b918a4 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -1287,6 +1287,35 @@ (define-public python-statannot
 annotations on an existing boxplots and barplots generated by seaborn.")
     (license license:expat)))
 
+(define-public python-unyt
+  (package
+    (name "python-unyt")
+    (version "3.0.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "unyt" version))
+       (sha256
+        (base32 "00900bw24rxgcgwgxp9xlx0l5im96r1n5hn0r3mxvbdgc3lyyq48"))))
+    (build-system pyproject-build-system)
+    ;; Astropy is an optional import, but we do not include it as it creates a
+    ;; module cycle: astronomy->python-science->astronomy.
+    (propagated-inputs (list python-h5py ;optional import
+                             python-matplotlib ;optional import
+                             python-numpy
+                             python-sympy))
+    ;; Pint is optional, but we do not propagate it due to its size.
+    (native-inputs (list python-pint python-pytest))
+    (home-page "https://unyt.readthedocs.io")
+    (synopsis "Library for working with data that has physical units")
+    (description
+     "Writing code that deals with data with physical units can be confusing.
+A function might return an array but at least with plain @code{numpy}, there
+is no way to easily tell what the units of the data are without somehow
+knowing a priori.  @code{unyt} handles this problem by providing a subclass of
+the @code{ndarray} class in @code{numpy} that is unit aware.")
+    (license license:bsd-3)))
+
 (define-public python-upsetplot
   (package
     (name "python-upsetplot")
-- 
2.42.0





Information forwarded to guix-patches <at> gnu.org:
bug#68789; Package guix-patches. (Mon, 29 Jan 2024 18:22:02 GMT)

Message #32 received at 68789 <at> debbugs.gnu.org:

From: Troy Figiel <troy <at> troyfigiel.com>
To: 68789 <at> debbugs.gnu.org
Subject: [PATCH 2/2] gnu: Add python-pyjanitor.
Date: Mon, 29 Jan 2024 19:17:14 +0100
* gnu/packages/python-science.scm (python-pyjanitor): New variable.
---
 gnu/packages/python-science.scm | 59 +++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/gnu/packages/python-science.scm b/gnu/packages/python-science.scm
index 3390b918a4..643fb69f3f 100644
--- a/gnu/packages/python-science.scm
+++ b/gnu/packages/python-science.scm
@@ -47,6 +47,7 @@ (define-module (gnu packages python-science)
   #:use-module (gnu packages boost)
   #:use-module (gnu packages build-tools)
   #:use-module (gnu packages check)
+  #:use-module (gnu packages chemistry)
   #:use-module (gnu packages cpp)
   #:use-module (gnu packages crypto)
   #:use-module (gnu packages databases)
@@ -840,6 +841,64 @@ (define-public python-pandera
 @end itemize")
     (license license:expat)))
 
+(define-public python-pyjanitor
+  (package
+    (name "python-pyjanitor")
+    (version "0.26.0")
+    (source
+     (origin
+       ;; The build requires the mkdocs directory for the description in
+       ;; setup.py. This is not included in the PyPI tarball.
+       (method git-fetch)
+       (uri (git-reference
+             (url "https://github.com/pyjanitor-devs/pyjanitor")
+             (commit (string-append "v" version))))
+       (file-name (git-file-name name version))
+       (sha256
+        (base32 "1f8xbl1k9l2z56bapp7v6bd3016zrk48igcaz6hb553r6yfl7vfx"))))
+    (build-system pyproject-build-system)
+    ;; Pyjanitor has an extensive test suite. For quick debugging, the tests
+    ;; marked turtle can be skipped using "-m" "not turtle".
+    (arguments
+     (list
+      #:test-flags '(list
+                     ;; Tries to connect to the internet.
+                     "-k"
+                     "not test_is_connected"
+
+                     ;; PySpark has not been packaged yet.
+                     "--ignore"
+                     "tests/spark")
+      #:phases #~(modify-phases %standard-phases
+                   (add-before 'check 'set-env-ci
+                     (lambda _
+                       ;; Some tests are skipped if the JANITOR_CI_MACHINE
+                       ;; variable is not set.
+                       (setenv "JANITOR_CI_MACHINE" "1"))))))
+    (propagated-inputs (list python-multipledispatch
+                             python-natsort
+                             python-pandas-flavor
+                             python-scipy
+
+                             ;; Optional imports.
+                             python-biopython ;biology submodule
+                             python-unyt)) ;engineering submodule
+    (native-inputs (list python-pytest
+
+                         ;; Optional imports. We do not propagate them due to
+                         ;; their size.
+                         python-numba ;speedup of joins
+                         rdkit)) ;chemistry submodule
+    (home-page "https://github.com/pyjanitor-devs/pyjanitor")
+    (synopsis "Tools for cleaning and transforming pandas DataFrames")
+    (description
+     "@code{pyjanitor} provides a set of data cleaning routines for
+@code{pandas} DataFrames.  These routines extend the method chaining API
+defined by @code{pandas} for a subset of its methods.  Originally, this
+package was a port of the R package by the same name and it is inspired by the
+ease-of-use and expressiveness of the @code{dplyr} package.")
+    (license license:expat)))
+
 (define-public python-pythran
   (package
     (name "python-pythran")
-- 
2.42.0





Reply sent to Sharlatan Hellseher <sharlatanus <at> gmail.com>:
You have taken responsibility. (Mon, 29 Jan 2024 23:02:03 GMT)

Notification sent to Troy Figiel <troy <at> troyfigiel.com>:
bug acknowledged by developer. (Mon, 29 Jan 2024 23:02:04 GMT)

Message #37 received at 68789-done <at> debbugs.gnu.org:

From: Sharlatan Hellseher <sharlatanus <at> gmail.com>
To: 68789-done <at> debbugs.gnu.org
Subject: [PATCH 0/2] gnu: Add python-pyjanitor.
Date: Mon, 29 Jan 2024 23:01:37 +0000
Modifications applied:

- python-unyt :: rephrase description, partly sourced and combined from
  <https://packages.debian.org/sid/python3-unyt>,
  <https://unyt.readthedocs.io/en/stable/>

- python-pyjanitor :: speed up tests with python-pytest-xdist (about 3x
  faster on 16 threads), remove blank lines, disable exact tests
  related to PySpark.

Pushed as 370b79b4f5..cde0adaacd to master.

Thanks,
Oleg
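
For readers wondering what the python-pytest-xdist change could look like, here is a hedged sketch of the relevant package fields. The code that was actually pushed is not shown in this thread and may well differ; this fragment only assumes the usual Guix pattern of passing pytest-xdist's `-n` option together with `parallel-job-count` from `(guix build utils)`, and it keeps the test flags and native inputs of the v2 patch otherwise.

```scheme
;; Hypothetical fragment of a python-pyjanitor definition.  It only
;; illustrates running the test suite through python-pytest-xdist; it is
;; not the committed code.
(arguments
 (list
  #:test-flags
  #~(list "-n" (number->string (parallel-job-count)) ;one worker per build job
          ;; Tries to connect to the internet.
          "-k" "not test_is_connected"
          ;; PySpark has not been packaged yet.
          "--ignore" "tests/spark")))
(native-inputs
 (list python-numba python-pytest python-pytest-xdist rdkit))
```

The `set-env-ci` phase from the patch would stay as it is; it is left out here only to keep the fragment short.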

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 Feb 2024 12:24:10 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.