GNU bug report logs - #27905
changes for openmpi

Package: guix-patches

Reported by: Dave Love <fx <at> gnu.org>

Date: Tue, 1 Aug 2017 12:55:02 UTC

Severity: normal

Done: ludovic.courtes <at> inria.fr (Ludovic Courtès)

Bug is archived. No further changes may be made.


Report forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Tue, 01 Aug 2017 12:55:02 GMT)

Acknowledgement sent to Dave Love <fx <at> gnu.org>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Tue, 01 Aug 2017 12:55:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Dave Love <fx <at> gnu.org>
To: guix-patches <at> gnu.org
Subject: changes for openmpi
Date: Tue, 01 Aug 2017 13:54:24 +0100
Here's a series with suggestions for openmpi.  I hope the log messages
are sufficiently explanatory, otherwise I can comment.  The hwloc and
valgrind changes are in line with what I'm used to using with RHEL and
Debian packaging.  (I know you won't particularly want to follow them,
but they make sense from the point of view of a user.)

I think the last one will need to be used for gfortran-specific
variants, as suggested on -devel.  That will take the closure back up
somewhat, but what I get now is:

  store item                                                       total    self
  /gnu/store/la6mj9kh7fwws233955wyp80x39ag88w-openmpi-1.10.7         134.1     9.7   7.2%
  /gnu/store/b8ni7680lh6j8z26dam7ki9z6f9y6pnz-hwloc-1.11.7-nogui      89.9     2.9   2.1%
  /gnu/store/h7mx27bl0wynlz8vjszzykqqldccfwm5-ncurses-6.0             74.3     5.7   4.2%
  /gnu/store/w1mrskd2ddgvkr727r9241g8dlkf0rlf-gfortran-5.4.0-lib      73.0    34.5  25.7%
  /gnu/store/lsidb1rk8z24c516pqw99anm57cpm8r1-numactl-2.0.11          68.9     0.3   0.2%
  /gnu/store/4vdik5cc02yh2hypwnwi6n6799j6srgn-libpciaccess-0.13.5     68.7     0.1   0.1%
  /gnu/store/dhc2iy059hi91fk55dcv79z09kp6500y-gcc-5.4.0-lib           68.6    30.1  22.4%
  /gnu/store/k7029k5va68lkapbzcycdzj7m5bjb4b8-bash-4.4.12             50.9     5.4   4.1%
  /gnu/store/hvyk1qyph1hihfmym1w271ygp84adb0v-readline-7.0            45.5     1.3   1.0%
  /gnu/store/q1x4v3x8v2g59d244hl7k0i1n4h83c9a-ncurses-6.0             44.2     5.7   4.2%
  /gnu/store/rmjlycdgiq8pfy5hfi42qhw3k7p6kdav-glibc-2.25              38.5    37.1  27.7%
  /gnu/store/02426nwiy32cscm4h83729vn5ws1gs2i-bash-static-4.4.12       1.4     1.4   1.1%
  total: 134.1 MiB

[0002-gnu-Add-openmpi-thread-multiple-and-modify-openmpi-a.patch (text/x-diff, attachment)]
[0003-gnu-openmpi-Remove-static-output.patch (text/x-diff, attachment)]
[0004-gnu-hwloc-Replace-lib-output-with-nogui-containing-a.patch (text/x-diff, attachment)]
[0005-gnu-valgrind-Add-doc-and-openmpi-outputs.patch (text/x-diff, attachment)]
[0006-gnu-openmpi-Modify-configuration-to-reduce-closure.patch (text/x-diff, attachment)]
[0007-gnu-openmpi-Configure-without-vampirtrace.patch (text/x-diff, attachment)]
[0008-gnu-openmpi-Remove-references-to-compiler-pathnames-.patch (text/x-diff, attachment)]

Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Mon, 21 Aug 2017 15:13:02 GMT)

Message #8 received at 27905 <at> debbugs.gnu.org:

From: ludovic.courtes <at> inria.fr (Ludovic Courtès)
To: Dave Love <fx <at> gnu.org>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Mon, 21 Aug 2017 17:12:06 +0200
Hi!

I’ve applied most of the patches.  I have a few remaining questions:

Dave Love <fx <at> gnu.org> skribis:

> From 67a59e734dd451d1e64d450dcebeb23d60996f3e Mon Sep 17 00:00:00 2001
> From: Dave Love <dave.love <at> manchester.ac.uk>
> Date: Mon, 31 Jul 2017 14:58:39 +0100
> Subject: [PATCH] gnu: hwloc: Replace "lib" output with "nogui", containing 
>  all but lstopo.
>
> A compute node typically wants the non-GUI programs available, which still
> have a small closure.
>
> * mpi.scm (hwloc)[outputs]: Replace lib with nogui.
> (hwloc)[arguments]: Change configure --prefix; use "nogui" output,
> not "lib"; populate "all" output.
> (openmpi)[inputs]: Use hwloc-nogui.

The downside of this is that the “nogui” output is less discoverable
(and it’s another user-visible breakage.)

Also, it shouldn’t make any difference to the closure size of openmpi
anyway, no?

> From 1772aa47c3bc71521340d7f569d4d906ab7f53e9 Mon Sep 17 00:00:00 2001
> From: Dave Love <dave.love <at> manchester.ac.uk>
> Date: Mon, 31 Jul 2017 15:01:59 +0100
> Subject: [PATCH] gnu: valgrind: Add "doc" and "openmpi" outputs.
>
> Also don't configure openmpi with valgrind; rely on the wrapper
> library from the valgrind package, like Fedora and Debian.
>
> * gnu/packages/valgrind.scm (valgrind)[outputs]: New field.
> [arguments]: Add install-doc and install-openmpi phases.
> [description]: Mention openmpi output.
> * gnu/packages/mpi.scm (openmpi)[arguments]: Don't configure with
> valgrind.

I’ve installed the doc-output-for-valgrind part, as a separate patch.

Regarding the rest:

> +         (add-after 'install 'install-openmpi
> +           (lambda* (#:key outputs #:allow-other-keys)
> +             (let ((dest (format #f "~a/lib/valgrind"
> +                                 (assoc-ref outputs "openmpi"))))
> +               (mkdir-p dest)
> +               (zero?
> +                (system (format #f "mv ~a/lib/valgrind/libmpiwrap* ~a"
> +                                (assoc-ref outputs "out") dest)))))))))

Why move it to a separate output?  After all, we can keep it in “out”
since all it costs is the size of libmpiwrap.so, right?

Also, I assume that this is functionally equivalent to Open MPI’s
built-in Valgrind support, is it?

> From ec65c9d847c30d51bf83b49b397bc1ca20b7ca11 Mon Sep 17 00:00:00 2001
> From: Dave Love <dave.love <at> manchester.ac.uk>
> Date: Mon, 31 Jul 2017 17:15:19 +0100
> Subject: [PATCH 8/8] gnu: openmpi: Remove references to compiler pathnames in
>  "_info" programs.
>
> This reduces the closure greatly, but note that the Fortran .mod files are
> gfortran version-specific, so there should probably be development packages
> for each incompatible version.  (The runtime is supposed to be more-or-less
> version-independent unless the libgfortran soname changes.)  There may still
> be a case for a separate runtime output.
>
> * gnu/packages/mpi.scm (openmpi)[arguments]: Add "remove-absolute" phase.

Great, I added the URL of previous discussions on this topic.

With the changes I pushed the closure size is already at 378.5 instead
of 700.8 MiB, pretty cool!

Thank you,
Ludo’.
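
For context on why removing compiler pathnames from the "_info" programs shrinks the closure: Guix works out an output's run-time references by scanning its files for the hashes of store items, so a store file name that merely appears inside a string is enough to keep that item in the closure. The Guile sketch below imitates such a scan with a regular expression. It is a simplification and not Guix's actual scanner; the pattern, the procedure name store-references, the sample strings, and the all-zero hash are assumptions made up for illustration.

```scheme
(use-modules (ice-9 regex)   ; list-matches, match:substring
             (srfi srfi-1))  ; delete-duplicates

;; Matches "/gnu/store/<32-character hash>-<name>".  Simplified pattern,
;; assumed for illustration only.
(define store-item-rx
  (make-regexp "/gnu/store/[0-9a-z]{32}-[-+._0-9A-Za-z]+"))

(define (store-references text)
  "Return the distinct store items mentioned in TEXT, in order of appearance."
  (delete-duplicates
   (map match:substring (list-matches store-item-rx text))))

;; Hypothetical string of the kind an "_info" program might embed.
(define with-absolute-name
  "Fort compiler abs: /gnu/store/00000000000000000000000000000000-gfortran-5.4.0/bin/gfortran")

;; The same information with the absolute file name left out.
(define without-absolute-name
  "Fort compiler: gfortran")

(display (store-references with-absolute-name))
(newline)
(display (store-references without-absolute-name))
(newline)
```

The first string yields one reference (the gfortran store item) and the second yields none, which is the effect the "remove-absolute" phase is after.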




Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Wed, 23 Aug 2017 13:01:01 GMT)

Message #11 received at 27905 <at> debbugs.gnu.org:

From: Dave Love <fx <at> gnu.org>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Wed, 23 Aug 2017 14:00:53 +0100
Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:

>> * mpi.scm (hwloc)[outputs]: Replace lib with nogui.
>> (hwloc)[arguments]: Change configure --prefix; use "nogui" output,
>> not "lib"; populate "all" output.
>> (openmpi)[inputs]: Use hwloc-nogui.
>
> The downside of this is that the “nogui” output is less discoverable
> (and it’s another user-visible breakage.)

I don't think that's a problem, as people who want to avoid the GUI
stuff will look for an alternative.

> Also, it shouldn’t make any difference to the closure size of openmpi
> anyway, no?

No, but I think you should be able to run the hwloc programs on compute
nodes without requiring X support, and you sometimes need to run openmpi
programs specifically with openmpi (for memory affinity, for instance).

> > +         (add-after 'install 'install-openmpi
> > +           (lambda* (#:key outputs #:allow-other-keys)
> > +             (let ((dest (format #f "~a/lib/valgrind"
> > +                                 (assoc-ref outputs "openmpi"))))
> > +               (mkdir-p dest)
> > +               (zero?
> > +                (system (format #f "mv ~a/lib/valgrind/libmpiwrap* ~a"
> > +                                (assoc-ref outputs "out") dest)))))))))
> 
> Why move it to a separate output?  After all, we can keep it in “out”
> since all it costs is the size of libmpiwrap.so, right?

That would still pull in valgrind, which drags in a lot else (gdb, perl,
python...).  The support isn't commonly used as far as I can tell, and
isn't configured by default -- I forgot about the performance hit
<https://www.open-mpi.org/faq/?category=debugging#memchecker_overhead>.

> Also, I assume that this is functionally equivalent to Open MPI’s
> built-in Valgrind support, is it?

I think so, basically, but I can ask the question.  (Actually it's
occurred to me that the wrapper uses the profiling interface, so it
won't work together with profiling tools without pnmpi multiplexing, but
you're unlikely to want to stack one with memory debugging.)  Anyhow, I
vote against a performance hit generally.  Also, versions of the wrapper
library could be provided for other MPIs when they're packaged.  If
memchecker really needs to be built-in, I think it should be packaged
separately from openmpi, as for thread-multiple support.

I'll report any reply I get about the built-in support.
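
The quoted 'install-openmpi' phase shells out to mv so that the shell expands the libmpiwrap* glob. As a hedged alternative, the same move can be written in plain Guile. The procedures make-directories and move-files-with-prefix below are hypothetical helpers written for this sketch; inside a real build phase, mkdir-p from (guix build utils) would take the place of the first one.

```scheme
(use-modules (ice-9 ftw))   ; scandir

;; Hypothetical helper: create DIR together with any missing parents.
(define (make-directories dir)
  (unless (file-exists? dir)
    (make-directories (dirname dir))
    (mkdir dir)))

;; Hypothetical helper: move every file in FROM whose name starts with
;; PREFIX into TO, and return the list of moved file names.
;; rename-file only works within one file system, which is assumed here
;; because all outputs of a package live in the store.
(define (move-files-with-prefix prefix from to)
  (make-directories to)
  (let ((names (or (scandir from
                            (lambda (name) (string-prefix? prefix name)))
                   '())))
    (for-each (lambda (name)
                (rename-file (string-append from "/" name)
                             (string-append to "/" name)))
              names)
    names))

;; In the phase this would be called roughly as follows (untested
;; assumption):
;;   (move-files-with-prefix
;;    "libmpiwrap"
;;    (string-append (assoc-ref outputs "out") "/lib/valgrind")
;;    (string-append (assoc-ref outputs "openmpi") "/lib/valgrind"))
```

Avoiding the shell means a failure raises an exception directly instead of relying on the exit status checked with zero?.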




Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Thu, 31 Aug 2017 07:59:02 GMT)

Message #14 received at 27905 <at> debbugs.gnu.org:

From: ludovic.courtes <at> inria.fr (Ludovic Courtès)
To: Dave Love <fx <at> gnu.org>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Thu, 31 Aug 2017 09:58:43 +0200
Hi Dave,

Dave Love <fx <at> gnu.org> skribis:

> Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:
>
>>> * mpi.scm (hwloc)[outputs]: Replace lib with nogui.
>>> (hwloc)[arguments]: Change configure --prefix; use "nogui" output,
>>> not "lib"; populate "all" output.
>>> (openmpi)[inputs]: Use hwloc-nogui.
>>
>> The downside of this is that the “nogui” output is less discoverable
>> (and it’s another user-visible breakage.)
>
> I don't think that's a problem, as people who want to avoid the GUI
> stuff will look for an alternative.
>
>> Also, it shouldn’t make any difference to the closure size of openmpi
>> anyway, no?
>
> No, but I think you should be able to run the hwloc programs on compute
> nodes without requiring X support, and you sometimes need to run openmpi
> programs specifically with openmpi (for memory affinity, for instance).

OK so the gain over the current status (with the “lib” output) is that
people would be able to get, say, ‘hwloc-bind’, without getting the full
‘lstopo’ and its dependencies, right?

I guess that makes sense, though at the same time ‘lstopo’ is probably
the most widely used program in hwloc.  Perhaps we should keep the
current “lib” separation, and instead provide an “hwloc-minimal”
package that does not depend on X11/Cairo?

>> > +         (add-after 'install 'install-openmpi
>> > +           (lambda* (#:key outputs #:allow-other-keys)
>> > +             (let ((dest (format #f "~a/lib/valgrind"
>> > +                                 (assoc-ref outputs "openmpi"))))
>> > +               (mkdir-p dest)
>> > +               (zero?
>> > +                (system (format #f "mv ~a/lib/valgrind/libmpiwrap* ~a"
>> > +                                (assoc-ref outputs "out") dest)))))))))
>> 
>> Why move it to a separate output?  After all, we can keep it in “out”
>> since all it costs is the size of libmpiwrap.so, right?
>
> That would still pull in valgrind, which drags in a lot else (gdb, perl,
> python...).  The support isn't commonly used as far as I can tell, and
> isn't configured by default -- I forgot about the performance hit
> <https://www.open-mpi.org/faq/?category=debugging#memchecker_overhead>.

The hunk above is within Valgrind, so I don’t understand what you mean
by “that would still pull in valgrind.”

My suggestion was to:

  1. Remove Valgrind from the inputs of Open MPI;
  2. Not add the ‘install-openmpi’ phase above to Valgrind.

Does that make sense?

Thank you,
Ludo’.
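
The “hwloc-minimal” idea above could look roughly like the following package variant. This is only a sketch under stated assumptions: the variant is hypothetical and was not part of any patch in this thread, and it assumes that hwloc's inputs are labelled "cairo" and "libx11" and that hwloc's configure script simply builds a text-only lstopo when those libraries are absent.

```scheme
(use-modules (guix packages)      ; package, package-inputs
             (gnu packages mpi)   ; hwloc
             (srfi srfi-1))       ; fold, alist-delete

;; Hypothetical variant: same as hwloc, minus the graphical inputs.
;; The input labels "cairo" and "libx11" are assumptions.
(define hwloc-minimal
  (package
    (inherit hwloc)
    (name "hwloc-minimal")
    (inputs (fold alist-delete
                  (package-inputs hwloc)
                  '("cairo" "libx11")))))

;; Returning the package lets the file be passed to "guix build -f".
hwloc-minimal
```

Because the variant inherits everything else, the existing “lib” output stays in place, so packages that already depend on it would not need to change.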




Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Fri, 01 Sep 2017 11:07:02 GMT)

Message #17 received at 27905 <at> debbugs.gnu.org:

From: Dave Love <fx <at> gnu.org>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Fri, 01 Sep 2017 12:06:14 +0100
Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:

>> * mpi.scm (hwloc)[outputs]: Replace lib with nogui.
>> (hwloc)[arguments]: Change configure --prefix; use "nogui" output,
>> not "lib"; populate "all" output.
>> (openmpi)[inputs]: Use hwloc-nogui.
>
> The downside of this is that the “nogui” output is less discoverable
> (and it’s another user-visible breakage.)

I don't understand why it's worse than currently.  "hwloc" will provide
the same as before, won't it?  I guess developer breakage could be fixed
by retaining the lib output if it matters.

Maybe it's helpful to try to document what sort of stability is expected
currently?

> Also, it shouldn’t make any difference to the closure size of openmpi
> anyway, no?

Right.  It wasn't for openmpi specifically.

>> +         (add-after 'install 'install-openmpi
>> +           (lambda* (#:key outputs #:allow-other-keys)
>> +             (let ((dest (format #f "~a/lib/valgrind"
>> +                                 (assoc-ref outputs "openmpi"))))
>> +               (mkdir-p dest)
>> +               (zero?
>> +                (system (format #f "mv ~a/lib/valgrind/libmpiwrap* ~a"
>> +                                (assoc-ref outputs "out") dest)))))))))
>
> Why move it to a separate output?  After all, we can keep it in “out”
> since all it costs is the size of libmpiwrap.so, right?
>
> Also, I assume that this is functionally equivalent to Open MPI’s
> built-in Valgrind support, is it?

This is probably moot.  It isn't entirely equivalent but, more
importantly, the builtin support apparently doesn't have the performance
hit which was documented; I haven't checked experimentally.  See this
thread, though not all my questions were answered:
<https://www.mail-archive.com/users@lists.open-mpi.org//msg31459.html>.

The wrapper library may still be relevant for mpich-y MPIs, if they get
used -- I don't know.




Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Fri, 01 Sep 2017 11:25:01 GMT)

Message #20 received at 27905 <at> debbugs.gnu.org:

From: Dave Love <fx <at> gnu.org>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Fri, 01 Sep 2017 12:24:30 +0100
Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:

> Hi Dave,
>
> Dave Love <fx <at> gnu.org> skribis:
>
>> Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:
>>
>>>> * mpi.scm (hwloc)[outputs]: Replace lib with nogui.
>>>> (hwloc)[arguments]: Change configure --prefix; use "nogui" output,
>>>> not "lib"; populate "all" output.
>>>> (openmpi)[inputs]: Use hwloc-nogui.
>>>
>>> The downside of this is that the “nogui” output is less discoverable
>>> (and it’s another user-visible breakage.)
>>
>> I don't think that's a problem, as people who want to avoid the GUI
>> stuff will look for an alternative.
>>
>>> Also, it shouldn’t make any difference to the closure size of openmpi
>>> anyway, no?
>>
>> No, but I think you should be able to run the hwloc programs on compute
>> nodes without requiring X support, and you sometimes need to run openmpi
>> programs specifically with openmpi (for memory affinity, for instance).
>
> OK so the gain over the current status (with the “lib” output) is that
> people would be able to get, say, ‘hwloc-bind’, without getting the full
> ‘lstopo’ and its dependencies, right?

Right.

> I guess that makes sense, though at the same time ‘lstopo’ is probably
> the most widely used program in hwloc.

I'm surprised at that, since it's something you'd only normally run once
on a node, and I'd normally only have the nogui variant on them anyhow.
(You can dump the topology and display it separately if necessary, but
the graphical output is often unwieldy on recent compute nodes -- or
even not-so-recent ones.)

> Perhaps we should keep the
> current “lib” separation, and instead provide an “hwloc-minimal”
> package that does not depend on X11/Cairo?

I've no strong feelings, but I'd still call it "nogui", or similar --
more descriptive.  (Debian calls it -nox and Fedora has hwloc and
hwloc-gui.)  The no-GUI lstopo adds little to the closure.

>>> > +         (add-after 'install 'install-openmpi
>>> > +           (lambda* (#:key outputs #:allow-other-keys)
>>> > +             (let ((dest (format #f "~a/lib/valgrind"
>>> > +                                 (assoc-ref outputs "openmpi"))))
>>> > +               (mkdir-p dest)
>>> > +               (zero?
>>> > +                (system (format #f "mv ~a/lib/valgrind/libmpiwrap* ~a"
>>> > +                                (assoc-ref outputs "out") dest)))))))))
>>> 
>>> Why move it to a separate output?  After all, we can keep it in “out”
>>> since all it costs is the size of libmpiwrap.so, right?
>>
>> That would still pull in valgrind, which drags in a lot else (gdb, perl,
>> python...).  The support isn't commonly used as far as I can tell, and
>> isn't configured by default -- I forgot about the performance hit
>> <https://www.open-mpi.org/faq/?category=debugging#memchecker_overhead>.
>
> The hunk above is within Valgrind, so I don’t understand what you mean
> by “that would still pull in valgrind.”

Sorry -- confused by the time lag.  The rationale was that it either
depends upon, or needs when it's used -- I can't remember -- the rest,
which involves gdb's closure.

I'm happy to drop libmpiwrap support anyhow, at least until another MPI
might need it, if I was wrong that the openmpi built-in is a performance
issue.  Apologies for the misinformation and wasted time, though it was
based on what's on the openmpi web site.

> My suggestion was to:
>
>   1. Remove Valgrind from the inputs of Open MPI;
>   2. Not add the ‘install-openmpi’ phase above to Valgrind.
>
> Does that make sense?
>
> Thank you,
> Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Mon, 04 Sep 2017 15:11:01 GMT)

Message #23 received at 27905 <at> debbugs.gnu.org:

From: ludovic.courtes <at> inria.fr (Ludovic Courtès)
To: Dave Love <fx <at> gnu.org>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Mon, 04 Sep 2017 17:10:24 +0200
Dave Love <fx <at> gnu.org> skribis:

> Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:
>
>>> * mpi.scm (hwloc)[outputs]: Replace lib with nogui.
>>> (hwloc)[arguments]: Change configure --prefix; use "nogui" output,
>>> not "lib"; populate "all" output.
>>> (openmpi)[inputs]: Use hwloc-nogui.
>>
>> The downside of this is that the “nogui” output is less discoverable
>> (and it’s another user-visible breakage.)
>
> I don't understand why it's worse than currently.  "hwloc" will provide
> the same as before, won't it?  I guess developer breakage could be fixed
> by retaining the lib output if it matters.
>
> Maybe it's helpful to try to document what sort of stability is expected
> currently?

Concretely, I have a bunch of packages for linear algebra software
developed at work.  When we add/remove an output to hwloc, those
packages may fail to build (for instance, currently they expect the
“lib” output of hwloc.)

Likewise, “guix package -u” doesn’t deal with output changes (we do have
a mechanism to deal with package renames, but not with output changes.)

>> Also, it shouldn’t make any difference to the closure size of openmpi
>> anyway, no?
>
> Right.  It wasn't for openmpi specifically.
>
>>> +         (add-after 'install 'install-openmpi
>>> +           (lambda* (#:key outputs #:allow-other-keys)
>>> +             (let ((dest (format #f "~a/lib/valgrind"
>>> +                                 (assoc-ref outputs "openmpi"))))
>>> +               (mkdir-p dest)
>>> +               (zero?
>>> +                (system (format #f "mv ~a/lib/valgrind/libmpiwrap* ~a"
>>> +                                (assoc-ref outputs "out") dest)))))))))
>>
>> Why move it to a separate output?  After all, we can keep it in “out”
>> since all it costs is the size of libmpiwrap.so, right?
>>
>> Also, I assume that this is functionally equivalent to Open MPI’s
>> built-in Valgrind support, is it?
>
> This is probably moot.  It isn't entirely equivalent but, more
> importantly, the builtin support apparently doesn't have the performance
> hit which was documented; I haven't checked experimentally.  See this
> thread, though not all my questions were answered:
> <https://www.mail-archive.com/users@lists.open-mpi.org//msg31459.html>.
>
> The wrapper library may still be relevant for mpich-y MPIs, if they get
> used -- I don't know.

OK.

So to me that means we can apply the patch below and be done with it.
Fine with you?

Thanks,
Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/gnu/packages/mpi.scm b/gnu/packages/mpi.scm
index 93157e269..ded9d4fda 100644
--- a/gnu/packages/mpi.scm
+++ b/gnu/packages/mpi.scm
@@ -36,8 +36,7 @@
   #:use-module (gnu packages xml)
   #:use-module (gnu packages perl)
   #:use-module (gnu packages ncurses)
-  #:use-module (gnu packages pkg-config)
-  #:use-module (gnu packages valgrind))
+  #:use-module (gnu packages pkg-config))
 
 (define-public hwloc
   (package
@@ -126,8 +125,7 @@ bind processes, and much more.")
      `(("hwloc" ,hwloc "lib")
        ("gfortran" ,gfortran)
        ("libfabric" ,libfabric)
-       ("rdma-core" ,rdma-core)
-       ("valgrind" ,valgrind)))
+       ("rdma-core" ,rdma-core)))
     (native-inputs
      `(("pkg-config" ,pkg-config)
        ("perl" ,perl)))
@@ -142,8 +140,6 @@ bind processes, and much more.")
                            ;; it reduces the closure size considerably.
                            "--disable-vt"
 
-                           ,(string-append "--with-valgrind="
-                                           (assoc-ref %build-inputs "valgrind"))
                            ,(string-append "--with-hwloc="
                                            (assoc-ref %build-inputs "hwloc")))
        #:phases (modify-phases %standard-phases

Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Thu, 07 Sep 2017 16:15:01 GMT)

Message #26 received at 27905 <at> debbugs.gnu.org:

From: Dave Love <fx <at> gnu.org>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Thu, 07 Sep 2017 17:14:43 +0100
Ludovic Courtès <ludovic.courtes <at> inria.fr> writes:

>>> Also, I assume that this is functionally equivalent to Open MPI’s
>>> built-in Valgrind support, is it?
>>
>> This is probably moot.  It isn't entirely equivalent but, more
>> importantly, the builtin support apparently doesn't have the performance
>> hit which was documented; I haven't checked experimentally.  See this
>> thread, though not all my questions were answered:
>> <https://www.mail-archive.com/users@lists.open-mpi.org//msg31459.html>.
>>
>> The wrapper library may still be relevant for mpich-y MPIs, if they get
>> used -- I don't know.
>
> OK.
>
> So to me that means we can apply the patch below and be done with it.
> Fine with you?

No, I now think it shouldn't be changed, since the valgrind integration
is supposed not to impose a significant speed penalty, and I can remove
valgrind from the closure simply.  I'll send a new patch later.

> Thanks,
> Ludo’.
>
>
> diff --git a/gnu/packages/mpi.scm b/gnu/packages/mpi.scm
> index 93157e269..ded9d4fda 100644
> --- a/gnu/packages/mpi.scm
> +++ b/gnu/packages/mpi.scm
> @@ -36,8 +36,7 @@
>    #:use-module (gnu packages xml)
>    #:use-module (gnu packages perl)
>    #:use-module (gnu packages ncurses)
> -  #:use-module (gnu packages pkg-config)
> -  #:use-module (gnu packages valgrind))
> +  #:use-module (gnu packages pkg-config))
>  
>  (define-public hwloc
>    (package
> @@ -126,8 +125,7 @@ bind processes, and much more.")
>       `(("hwloc" ,hwloc "lib")
>         ("gfortran" ,gfortran)
>         ("libfabric" ,libfabric)
> -       ("rdma-core" ,rdma-core)
> -       ("valgrind" ,valgrind)))
> +       ("rdma-core" ,rdma-core)))
>      (native-inputs
>       `(("pkg-config" ,pkg-config)
>         ("perl" ,perl)))
> @@ -142,8 +140,6 @@ bind processes, and much more.")
>                             ;; it reduces the closure size considerably.
>                             "--disable-vt"
>  
> -                           ,(string-append "--with-valgrind="
> -                                           (assoc-ref %build-inputs "valgrind"))
>                             ,(string-append "--with-hwloc="
>                                             (assoc-ref %build-inputs "hwloc")))
>         #:phases (modify-phases %standard-phases




Information forwarded to guix-patches <at> gnu.org:
bug#27905; Package guix-patches. (Mon, 11 Sep 2017 20:25:01 GMT)

Message #29 received at 27905 <at> debbugs.gnu.org:

From: Dave Love <fx <at> gnu.org>
To: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Cc: 27905 <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Mon, 11 Sep 2017 21:24:05 +0100
I wrote: 

>> So to me that means we can apply the patch below and be done with it.
>> Fine with you?
>
> No, I now think it shouldn't be changed, since the valgrind integration
> is supposed not to impose a significant speed penalty, and I can remove
> valgrind from the closure simply.  I'll send a new patch later.

Here it is, eventually, which gets rid of a lot from the closure.

[0002-gnu-openmpi-Remove-valgrind-from-closure.patch (text/x-diff, attachment)]

Reply sent to ludovic.courtes <at> inria.fr (Ludovic Courtès):
You have taken responsibility. (Tue, 12 Sep 2017 07:01:01 GMT)

Notification sent to Dave Love <fx <at> gnu.org>:
bug acknowledged by developer. (Tue, 12 Sep 2017 07:01:04 GMT)

Message #34 received at 27905-done <at> debbugs.gnu.org:

From: ludovic.courtes <at> inria.fr (Ludovic Courtès)
To: Dave Love <fx <at> gnu.org>
Cc: 27905-done <at> debbugs.gnu.org
Subject: Re: [bug#27905] changes for openmpi
Date: Tue, 12 Sep 2017 09:00:22 +0200
Dave Love <fx <at> gnu.org> skribis:

> I wrote: 
>
>>> So to me that means we can apply the patch below and be done with it.
>>> Fine with you?
>>
>> No, I now think it shouldn't be changed, since the valgrind integration
>> is supposed not to impose a significant speed penalty, and I can remove
>> valgrind from the closure simply.  I'll send a new patch later.
>
> Here it is, eventually, which gets rid of a lot from the closure.
>
> From 6b47b2ce671bfbdab3c5f4f2546f02bcfee66d68 Mon Sep 17 00:00:00 2001
> From: Dave Love <fx <at> gnu.org>
> Date: Mon, 4 Sep 2017 18:04:21 +0100
> Subject: [PATCH 2/2] gnu openmpi: Remove valgrind from closure.
>
> * mpi.scm (openmpi)[arguments]: Elide romio config info to avoid valgrind
> path.

Awesome!  I tweaked the commit log and pushed.

Now we’re down to 156 MiB for the whole closure, which is much better.
There’s still room for optimization (Bash, xz, util-linux?), but we’ll
get there:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix size openmpi
store item                                                       total    self
/gnu/store/n6nvxlk2j8ysffjh3jphn1k5silnakh6-glibc-2.25              38.5    37.1  23.7%
/gnu/store/8j1h29zcgrg13dc2md7lalxliv1jrq2p-gfortran-5.4.0-lib      73.0    34.5  22.0%
/gnu/store/3x53yv4v144c9xp02rs64z7j597kkqax-gcc-5.4.0-lib           68.6    30.1  19.2%
/gnu/store/z77nhww8zh96w6lb5ak6h3jb4niain3b-eudev-3.2.2            103.2    14.1   9.0%
/gnu/store/dy81cx0yshq8vban59vjsdl4rvxnwxab-util-linux-2.30         87.6    12.0   7.7%
/gnu/store/jk8bcr9q79cj6j97xb6rdil1fw0g8hd6-openmpi-1.10.7         156.5    10.1   6.5%
/gnu/store/09j7scnl3hahcmql986fsjpzj6gqsmzv-ncurses-6.0             74.3     5.7   3.6%
/gnu/store/bhawz0mpfdjhwq423q6kk2jz34dpcsx5-libnl-3.3.0             72.3     3.6   2.3%
/gnu/store/n2k1kmwj0rswq6qija8v8kz9ramj2a83-rdma-core-14           108.8     2.0   1.3%
/gnu/store/808hmh1bp6khhbfrbljcsnly9497bxvy-libfabric-1.4.1        110.4     1.6   1.0%
/gnu/store/zhrajv6qf2hzn9c3g2bb07559hyrz5xp-bash-static-4.4.12       1.4     1.4   0.9%
/gnu/store/g3nari57wcfnm00kv9bnpyzdzfq4h8pk-xz-5.2.2                70.7     1.1   0.7%
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12     39.5     1.0   0.6%
/gnu/store/hf6k2i6aqqs50p181bs1aa7xw49kd6xn-hwloc-1.11.8-lib        72.8     0.6   0.4%
/gnu/store/ljzqi3ajkc6l5r8hwdz7kr1zwbli3i7y-pciutils-3.5.5          71.8     0.5   0.3%
/gnu/store/sfx1wh27i6gsrk21p87rdyikc64v7d51-zlib-1.2.11             69.0     0.4   0.2%
/gnu/store/bdys6wm9hwd7akd5mc00xw0y4cz0j1fg-numactl-2.0.11          68.9     0.3   0.2%
/gnu/store/insr5wrif9pn1mlqa5rl9k3sr5qf2q1y-kmod-24                 71.3     0.3   0.2%
/gnu/store/0p4gxh2xiz31v2zx8mg43nv2djjyfwmn-libpciaccess-0.13.5     71.9     0.1   0.1%
total: 156.5 MiB
--8<---------------cut here---------------end--------------->8---

Thanks!

Ludo’.
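
As a quick check of the figures in this thread, the Guile sketch below adds up the "self" column of the listing above and computes the relative savings between the closure sizes mentioned (700.8, 378.5 and 156.5 MiB). The numbers are copied from the messages; the procedure names round-to-tenth and percent-saved are made up for this example.

```scheme
;; "self" column (MiB) of the `guix size openmpi` listing above,
;; top to bottom.
(define self-sizes
  '(37.1 34.5 30.1 14.1 12.0 10.1 5.7 3.6 2.0 1.6
    1.4 1.1 1.0 0.6 0.5 0.4 0.3 0.3 0.1))

(define (round-to-tenth x)
  (/ (round (* x 10)) 10))

(define (percent-saved before after)
  "Return the reduction from BEFORE to AFTER as a percentage of BEFORE."
  (round-to-tenth (* 100 (/ (- before after) before))))

;; Each store item is counted once, so the "self" sizes add up to the
;; total closure size.
(define closure-total
  (round-to-tenth (apply + self-sizes)))

(format #t "sum of self sizes: ~a MiB~%" closure-total)
(format #t "700.8 -> 378.5 MiB: ~a% smaller~%" (percent-saved 700.8 378.5))
(format #t "378.5 -> 156.5 MiB: ~a% smaller~%" (percent-saved 378.5 156.5))
(format #t "700.8 -> 156.5 MiB: ~a% smaller~%" (percent-saved 700.8 156.5))
```

The "self" sizes sum to the reported total of 156.5 MiB, and the reductions work out to roughly 46%, 59% and 78% respectively.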




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 10 Oct 2017 11:24:04 GMT)
