GNU bug report logs - #21909 Segfault with eigen in R
Reported by: Kyle Meyer <kyle <at> kyleam.com>
Date: Fri, 13 Nov 2015 16:07:02 UTC
Severity: normal
Done: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21909 in the body.
You can then email your comments to 21909 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Fri, 13 Nov 2015 16:07:03 GMT)
Acknowledgement sent to Kyle Meyer <kyle <at> kyleam.com>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Fri, 13 Nov 2015 16:07:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello,
With R 3.2.2 built from r in statistics.scm (guix 0.9.0), I'm seeing a
segfault when eigen is called with a matrix over some size. I can
trigger the error with the following code [1]:
> M <- 50
> N <- 500
> eigen(crossprod(matrix(rnorm(M * N), M, N)))
*** caught segfault ***
address 0xfb0, cause 'memory not mapped'
Traceback:
1: eigen(crossprod(matrix(rnorm(M * N), M, N)))
Can others reproduce this?
Thanks.
[1] This is a down-sized version of the snippet from an ATLAS bug report
in 2011 for a similar error with R 2.14.
http://sourceforge.net/p/math-atlas/support-requests/792/
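For anyone trying to reproduce this non-interactively, the snippet can be run through Rscript and the exit status inspected. The following is only a sketch: it assumes Rscript is on PATH (it does nothing otherwise), describe_status is a hypothetical helper, and it relies on the usual shell convention that a status above 128 means the process was killed by signal (status - 128), so a segfault would be expected to show up as signal 11.

```shell
# Hypothetical helper: turn a shell exit status into a short description.
describe_status() {
  # Convention: status > 128 means "killed by signal (status - 128)".
  if [ "$1" -gt 128 ]; then
    echo "killed by signal $(( $1 - 128 ))"
  else
    echo "exited with status $1"
  fi
}

# Run the reproducer from the report only if Rscript is available.
if command -v Rscript >/dev/null 2>&1; then
  status=0
  Rscript -e 'M <- 50; N <- 500; invisible(eigen(crossprod(matrix(rnorm(M * N), M, N))))' || status=$?
  describe_status "$status"
fi
```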
--
Kyle
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Fri, 13 Nov 2015 20:27:01 GMT)
Message #8 received at 21909 <at> debbugs.gnu.org (full text, mbox):
Kyle Meyer <kyle <at> kyleam.com> writes:
> With R 3.2.2 built from r in statistics.scm (guix 0.9.0), I'm seeing a
> segfault when eigen is called with a matrix over some size. I can
> trigger the error with the following code [1]:
>
> > M <- 50
> > N <- 500
> > eigen(crossprod(matrix(rnorm(M * N), M, N)))
>
> *** caught segfault ***
> address 0xfb0, cause 'memory not mapped'
>
> Traceback:
> 1: eigen(crossprod(matrix(rnorm(M * N), M, N)))
>
> Can others reproduce this?
I can reproduce this running R 3.2.2 on GuixSD on an x86_64 machine.
~~ Ricardo
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Thu, 19 Nov 2015 00:38:02 GMT)
Message #11 received at 21909 <at> debbugs.gnu.org (full text, mbox):
Ricardo Wurmus <rekado <at> elephly.net> writes:
> Kyle Meyer <kyle <at> kyleam.com> writes:
>
>> With R 3.2.2 built from r in statistics.scm (guix 0.9.0), I'm seeing a
>> segfault when eigen is called with a matrix over some size. I can
>> trigger the error with the following code [1]:
>>
>> > M <- 50
>> > N <- 500
>> > eigen(crossprod(matrix(rnorm(M * N), M, N)))
>>
>> *** caught segfault ***
>> address 0xfb0, cause 'memory not mapped'
>>
>> Traceback:
>> 1: eigen(crossprod(matrix(rnorm(M * N), M, N)))
>>
>> Can others reproduce this?
>
> I can reproduce this running R 3.2.2 on GuixSD on an x86_64 machine.
I haven't had any luck resolving this aside from just using R's internal
BLAS by removing "--with-blas=openblas" and "--with-lapack" from the
configure flags.
--
Kyle
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Thu, 19 Nov 2015 09:40:02 GMT)
Message #14 received at 21909 <at> debbugs.gnu.org (full text, mbox):
I can reproduce the bug with:
guix environment --pure --ad-hoc r -- R
and then typing “1” at R's post-segfault prompt to get a core dump, which gives this:
--8<---------------cut here---------------start------------->8---
Core was generated by `/gnu/store/zci2lb9jlc9hlck3x3hc04ab3y86fzf9-r-3.2.2/lib/R/bin/exec/R'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f34616450e0 in dgemv_t_SANDYBRIDGE () from /gnu/store/hw9p1zyn1nh8pbm1cl69nm0i391lk6c7-openblas-0.2.15/lib/libopenblas.so.0
[Current thread is 1 (Thread 0x7f34651307c0 (LWP 3399))]
(gdb) bt
#0 0x00007f34616450e0 in dgemv_t_SANDYBRIDGE () from /gnu/store/hw9p1zyn1nh8pbm1cl69nm0i391lk6c7-openblas-0.2.15/lib/libopenblas.so.0
#1 0x00000000000000a2 in ?? ()
[...]
(gdb) disassemble
Dump of assembler code for function dgemv_t_SANDYBRIDGE:
[...]
0x00007f34616450cf <+207>: jle 0x7f3461645140 <dgemv_t_SANDYBRIDGE+320>
0x00007f34616450d1 <+209>: nopl 0x0(%rax,%rax,1)
0x00007f34616450d6 <+214>: nopw %cs:0x0(%rax,%rax,1)
=> 0x00007f34616450e0 <+224>: movsd (%r9),%xmm0
[...]
(gdb) p $r9
$1 = 4016
--8<---------------cut here---------------end--------------->8---
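As a cross-check, the %r9 value printed by gdb (4016) is the same number as the fault address R reported (0xfb0), so the movsd at +224 is loading through the small, unmapped address held in %r9. A minimal shell sketch of the conversion, with variable names chosen here purely for illustration:

```shell
# Value of %r9 as printed by gdb (decimal) and the fault address from R (hex).
r9=4016
fault_addr=0xfb0

# Convert the decimal register value to hex and compare the two.
r9_hex=$(printf '0x%x' "$r9")
if [ "$r9_hex" = "$fault_addr" ]; then
  echo "%r9 ($r9_hex) matches the fault address"
else
  echo "%r9 ($r9_hex) differs from $fault_addr"
fi
```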
My CPU seems to be a Sandy Bridge:
--8<---------------cut here---------------start------------->8---
$ cat /proc/cpuinfo | grep ^model | head -2
model : 42
model name : Intel(R) Core(TM) i5-2540M CPU @ 2.60GHz
--8<---------------cut here---------------end--------------->8---
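For context, on Intel family 6 CPUs model 42 (and 45) denotes Sandy Bridge, which is consistent with the dgemv_t_SANDYBRIDGE kernel seen in the backtrace. A rough sketch of that lookup follows; intel_model_name is a hypothetical helper that covers only these two model numbers and assumes CPU family 6.

```shell
# Hypothetical helper: map an Intel family-6 model number to a name.
# Only the Sandy Bridge models are covered; anything else is "unknown".
intel_model_name() {
  case "$1" in
    42|45) echo "Sandy Bridge" ;;
    *)     echo "unknown" ;;
  esac
}

# Read the first "model" line (not "model name") from /proc/cpuinfo, if present.
model=$(awk -F': *' '/^model[ \t]*:/ {print $2; exit}' /proc/cpuinfo 2>/dev/null) || model=""
echo "model ${model:-?}: $(intel_model_name "$model")"
```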
Might be useful to report it upstream?
Thanks,
Ludo’.
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Fri, 20 Nov 2015 00:06:02 GMT)
Message #17 received at 21909 <at> debbugs.gnu.org (full text, mbox):
ludo <at> gnu.org (Ludovic Courtès) writes:
> I can reproduce the bug with:
>
> guix environment --pure --ad-hoc r -- R
[...]
> Might be useful to report it upstream?
Yes. In preparing to do so, I figured that I should reproduce the issue
with a build outside of Guix. However, when I tried with a manual build
on an Arch Linux system, the snippet ran fine. This was with gcc
version 5.2.0, so I switched the gfortran input of openblas over to
gfortran-5, which seems to fix the issue.
While the issue should still be reported upstream, would it be OK to
update the gfortran input to gfortran-5?
--
Kyle
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Fri, 20 Nov 2015 09:59:02 GMT)
Message #20 received at 21909 <at> debbugs.gnu.org (full text, mbox):
Kyle Meyer <kyle <at> kyleam.com> skribis:
> ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> I can reproduce the bug with:
>>
>> guix environment --pure --ad-hoc r -- R
>
> [...]
>
>> Might be useful to report it upstream?
>
> Yes. In preparing to do so, I figured that I should reproduce the issue
> with a build outside of Guix. However, when I tried with a manual build
> on an Arch Linux system, the snippet ran fine. This was with gcc
> version 5.2.0, so I switched the gfortran input of openblas over to
> gfortran-5, which seems to fix the issue.
Interesting.
> While the issue should still be reported upstream, would it be OK to
> update the gfortran input to gfortran-5?
The problem is that this leads to an R linked against GCC 4.9’s libgcc_s
and other run-time support libraries, and also against those of GCC 5.2,
via OpenBLAS. I think we’d rather avoid it.
An additional data point:
--8<---------------cut here---------------start------------->8---
$ guix environment --pure --ad-hoc r valgrind -- R -d valgrind
/gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/bin/R: line 8: uname: command not found
==3198== Memcheck, a memory error detector
==3198== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3198== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==3198== Command: /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/bin/exec/R
==3198==
R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> M <- 50
> N <- 500
> eigen(crossprod(matrix(rnorm(M * N), M, N)))
==3198== Invalid read of size 8
==3198== at 0x8E400E0: dgemv_t_SANDYBRIDGE (in /gnu/store/hw9p1zyn1nh8pbm1cl69nm0i391lk6c7-openblas-0.2.15/lib/libopenblasp-r0.2.15.so)
==3198== by 0x183AED48: dlatrd_ (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libRlapack.so)
==3198== by 0x18461F92: dsytrd_ (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libRlapack.so)
==3198== by 0x184B9540: dsyevr_ (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libRlapack.so)
==3198== by 0x1B742D5E: La_rs (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/modules/lapack.so)
==3198== by 0x1B745B96: mod_do_lapack (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/modules/lapack.so)
==3198== by 0x4F35635: bcEval (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F432DF: Rf_eval (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F48F4B: Rf_applyClosure (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F4345E: Rf_eval (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F6B0D9: Rf_ReplIteration (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== by 0x4F6B430: R_ReplConsole (in /gnu/store/jb11p396a277rndb52da20ygdksccji8-r-3.2.2/lib/R/lib/libR.so)
==3198== Address 0xfb0 is not stack'd, malloc'd or (recently) free'd
==3198==
*** caught segfault ***
address 0xfb0, cause 'memory not mapped'
--8<---------------cut here---------------end--------------->8---
This suggests an R issue more than an OpenBLAS problem: the invalid read in dgemv_t_SANDYBRIDGE is reached from dlatrd_ in R's own libRlapack.so.
Ludo’.
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Mon, 30 Nov 2015 23:27:02 GMT)
Message #23 received at 21909 <at> debbugs.gnu.org (full text, mbox):
I've opened an issue in the OpenBLAS repo [1].
https://github.com/xianyi/OpenBLAS/issues/703
I'm trying to answer their questions, but, as is apparent in that
thread, I'm not really familiar with debugging these sorts of problems.
Since others are able to reproduce the error, any help over there would
be much appreciated.
[1] The last post suggested it may be an R issue rather than an OpenBLAS
one, but I wanted to be more sure of that before going to R
developers, especially given their comment about the --with-lapack
flag:
"Please do bear in mind that using --with-lapack is 'definitely not
recommended': it is provided only because it is necessary on some
platforms and because some users want to experiment with claimed
performance improvements. Reporting problems where it is used
unnecessarily will simply irritate the R helpers."
https://cran.r-project.org/doc/manuals/r-release/R-admin.html#LAPACK
--
Kyle
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Wed, 02 Mar 2016 10:42:01 GMT)
Message #26 received at 21909 <at> debbugs.gnu.org (full text, mbox):
Kyle Meyer <kyle <at> kyleam.com> writes:
> I've opened an issue in the OpenBLAS repo [1].
>
> https://github.com/xianyi/OpenBLAS/issues/703
>
> I'm trying to answer their questions, but, as is apparent in that
> thread, I'm not really familiar with debugging these sorts of problems.
> Since others are able to reproduce the error, any help over there would
> be much appreciated.
>
> [1] The last post suggested it may be an R issue rather than an OpenBLAS
> one, but I wanted to be more sure of that before going to R
> developers, especially given their comment about the --with-lapack
> flag:
>
> "Please do bear in mind that using --with-lapack is 'definitely not
> recommended': it is provided only because it is necessary on some
> platforms and because some users want to experiment with claimed
> performance improvements. Reporting problems where it is used
> unnecessarily will simply irritate the R helpers."
>
> https://cran.r-project.org/doc/manuals/r-release/R-admin.html#LAPACK
We have since removed OpenBLAS from the inputs of R and just use the
internal BLAS instead. Without OpenBLAS I cannot reproduce this bug
any more.
~~ Ricardo
bug closed, send any further explanations to 21909 <at> debbugs.gnu.org and Kyle Meyer <kyle <at> kyleam.com>.
Request was from Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> to control <at> debbugs.gnu.org. (Wed, 02 Mar 2016 10:43:02 GMT)
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Wed, 02 Mar 2016 21:11:01 GMT)
Message #31 received at 21909 <at> debbugs.gnu.org (full text, mbox):
Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> skribis:
> We have since removed OpenBLAS from the inputs of R and just use the
> internal BLAS instead. Without OpenBLAS I cannot reproduce this bug
> any more.
Nice. So we can close?
Ludo'.
Information forwarded to bug-guix <at> gnu.org: bug#21909; Package guix. (Thu, 03 Mar 2016 20:55:01 GMT)
Message #34 received at 21909 <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> writes:
> Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> skribis:
>
>> We have since removed OpenBLAS from the inputs of R and just use the
>> internal BLAS instead. Without OpenBLAS I cannot reproduce this bug
>> any more.
>
> Nice. So we can close?
I had already closed it in a subsequent email to
control <at> debbugs.gnu.org.
~~ Ricardo
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 01 Apr 2016 11:24:03 GMT)
This bug report was last modified 9 years and 76 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.