GNU bug report logs - #14756
threads - par-map - multicore issue

Package: guile;

Reported by: David Pirotte <david <at> altosw.be>

Date: Sun, 30 Jun 2013 18:02:02 UTC

Severity: normal

To reply to this bug, email your comments to 14756 AT debbugs.gnu.org.


Report forwarded to bug-guile <at> gnu.org:
bug#14756; Package guile. (Sun, 30 Jun 2013 18:02:02 GMT)

Acknowledgement sent to David Pirotte <david <at> altosw.be>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sun, 30 Jun 2013 18:02:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: David Pirotte <david <at> altosw.be>
To: <bug-guile <at> gnu.org>
Subject: threads - par-map - multicore issue
Date: Sun, 30 Jun 2013 15:00:54 -0300
Hello,

	guile --version
	guile (GNU Guile) 2.0.9.20-10454

It seems that the problem of par-map not using all cores has somehow been reintroduced?

	guile -c '(begin (use-modules (ice-9 threads)) (par-map 1+ (iota 400000)))'

only uses 1 core: it seems to use some others (maybe all, I can't tell) for a couple of
milliseconds, then drops to 1 core only.

Thanks,
David

;; -- 

david <at> idefix:~ 16 $ guile -c '(begin
>     (use-modules (ice-9 threads))
>     (par-map 1+ (iota 400))
>     (display (current-processor-count)) (display "\n")
>     (display (length (@@ (ice-9 futures) %workers))) (display "\n"))'
12
11




Information forwarded to bug-guile <at> gnu.org:
bug#14756; Package guile. (Tue, 21 Jun 2016 06:52:01 GMT)

Message #8 received at 14756 <at> debbugs.gnu.org:

From: Andy Wingo <wingo <at> pobox.com>
To: ludo <at> gnu.org
Cc: 14756 <at> debbugs.gnu.org, David Pirotte <david <at> altosw.be>
Subject: Re: bug#14756: threads - par-map - multicore issue
Date: Tue, 21 Jun 2016 08:51:09 +0200
I see this, but I'm not quite sure what's going on.  What I do see is
that par-map of 1+ on a list is horribly slow, both on 2.0 and master.
Ludovic, do you know what's going on here?

Andy





Information forwarded to bug-guile <at> gnu.org:
bug#14756; Package guile. (Tue, 21 Jun 2016 08:35:02 GMT)

Message #11 received at 14756 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Wingo <wingo <at> pobox.com>
Cc: 14756 <at> debbugs.gnu.org, David Pirotte <david <at> altosw.be>
Subject: Re: bug#14756: threads - par-map - multicore issue
Date: Tue, 21 Jun 2016 10:33:47 +0200
Andy Wingo <wingo <at> pobox.com> skribis:

> I see this, but I'm not quite sure what's going on.  What I do see is
> that par-map of 1+ on a list is horribly slow, both on 2.0 and master.
> Ludovic do you know what's going on here?

As David put it, only one core is being used, which is clearly a bug.

I believe the bug was introduced by
8a177d316c0062afe74f9a761ef460e297435e59 (however, before that commit,
you would hit a stack overflow when doing ‘par-map’ on a large-enough
list.)

What happens is that ‘par-mapper’ creates nested futures whose
dependency graph forms a comb-shaped tree; thus we quickly hit
%MAX-NESTING-LEVEL.

This is fine in itself, but for some reason, it ends up evaluating most
of those futures in one thread while the other threads apparently remain
stuck in ‘wait-condition-variable’ in ‘process-futures’.
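
For illustration, the comb shape described above can be sketched as
follows.  This is a minimal sketch, not the actual ‘par-mapper’ source:
‘comb-map’ is a hypothetical name, and it only assumes the ‘future’ and
‘touch’ forms exported by (ice-9 futures).  Each call computes one
element itself and hands the whole rest of the list to a single nested
future, so every tooth of the comb hangs off the previous one.

```scheme
(use-modules (ice-9 futures))   ; provides `future' and `touch'

;; Hypothetical helper, for illustration only.
;; Map PROC over LST, delegating the tail to a nested future at every step.
;; The nesting depth of the futures grows with the length of LST.
(define (comb-map proc lst)
  (if (null? lst)
      '()
      (let ((tail (future (comb-map proc (cdr lst))))  ; nested future for the rest
            (head (proc (car lst))))                   ; this element, computed here
        (cons head (touch tail)))))

;; Kept small on purpose: the recursion is not a tail call.
(display (comb-map 1+ (iota 10)))
(newline)
```

With a shape like this there is never more than one pending future per
level, which is consistent with the nesting limit being reached quickly.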

I’ve looked into it a bit but that needs more time…

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#14756; Package guile. (Tue, 28 Feb 2017 09:54:01 GMT)

Message #14 received at 14756 <at> debbugs.gnu.org:

From: Andy Wingo <wingo <at> pobox.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 14756 <at> debbugs.gnu.org, David Pirotte <david <at> altosw.be>
Subject: Re: bug#14756: threads - par-map - multicore issue
Date: Tue, 28 Feb 2017 10:53:39 +0100
On Tue 21 Jun 2016 10:33, ludo <at> gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wingo <at> pobox.com> skribis:
>
>> I see this, but I'm not quite sure what's going on.  What I do see is
>> that par-map of 1+ on a list is horribly slow, both on 2.0 and master.
>> Ludovic do you know what's going on here?
>
> As David put it, only one core is being used, which is clearly a bug.
>
> I believe the bug was introduced by
> 8a177d316c0062afe74f9a761ef460e297435e59 (however, before that commit,
> you would hit a stack overflow when doing ‘par-map’ on a large-enough
> list.)

Given that Guile 2.2 doesn't have a stack limit problem, I have
reverted this commit on master (though I kept the tests).

FWIW Guile 2.0 with this test

   $ time ../guile-2.0/meta/guile -c '(begin (use-modules (ice-9 threads)) (par-map 1+ (iota 40000)))'

   real	1m45.282s
   user	1m45.208s
   sys	0m0.036s


Guile 2.1.x with the stack-limit stuff:

   $ time /opt/guile/bin/guile -c '(begin (use-modules (ice-9 threads)) (par-map 1+ (iota 40000)))'

   real	0m51.738s
   user	1m2.720s
   sys	0m0.116s

Guile 2.1.x after reverting the patch:

   $ time meta/guile -c '(begin (use-modules (ice-9 threads)) (par-map 1+ (iota 40000)))'

   real	0m1.403s
   user	0m1.396s
   sys	0m0.024s
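
A rough way to read the three runs above is to divide CPU time
(user + sys) by wall-clock time (real), which approximates the average
number of busy cores.  The sketch below only redoes that arithmetic on
the figures copied from the ‘time’ output; ‘busy-cores’ and ‘timings’
are hypothetical names introduced for illustration.

```scheme
(use-modules (ice-9 format))   ; full `format', needed for ~,2f

;; Each row: (label real user sys), in seconds, copied from the runs above.
(define timings
  '(("Guile 2.0"                   105.282 105.208 0.036)
    ("Guile 2.1.x, stack-limit"     51.738  62.720 0.116)
    ("Guile 2.1.x, patch reverted"   1.403   1.396 0.024)))

;; Approximate average number of busy cores: CPU time over wall-clock time.
(define (busy-cores real user sys)
  (/ (+ user sys) real))

(for-each
 (lambda (row)
   (format #t "~a: ~,2f~%" (car row) (apply busy-cores (cdr row))))
 timings)
```

The ratios come out at roughly 1.0, 1.2 and 1.0, so none of the three
runs keeps much more than one core busy on average.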

Note that I took a zero off the original test in all examples above.
However!  I still have the problem that mostly only one core is used.  I
would imagine that is because the thread that builds the spine is more
costly than the threads that actually do the workload (the 1+ in this
case).  But maybe that is wrong.  Certainly there are improvements that
can be made in the futures implementation in 2.2 with atomic boxes.
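
One way to probe the hypothesis above is to give each future more work
per element of the spine.  The following is a minimal sketch under that
assumption, not something proposed in this thread or provided by Guile:
‘take-up-to’, ‘chunk’ and ‘chunked-par-map’ are hypothetical helpers,
and the chunk size of 5000 is an arbitrary choice.

```scheme
(use-modules (ice-9 threads)   ; par-map
             (srfi srfi-1))    ; concatenate

;; Hypothetical helper: return two values, the first N elements of LST
;; (or fewer if LST is shorter) and the remaining elements.
(define (take-up-to lst n)
  (let loop ((lst lst) (n n) (head '()))
    (if (or (zero? n) (null? lst))
        (values (reverse head) lst)
        (loop (cdr lst) (- n 1) (cons (car lst) head)))))

;; Hypothetical helper: split LST into sublists of at most SIZE elements.
;; SIZE must be a positive integer.
(define (chunk lst size)
  (let loop ((lst lst) (acc '()))
    (if (null? lst)
        (reverse acc)
        (call-with-values
            (lambda () (take-up-to lst size))
          (lambda (head rest)
            (loop rest (cons head acc)))))))

;; Run PROC sequentially inside each chunk and spread the chunks over
;; par-map, so the parallel spine has one node per chunk, not per element.
(define (chunked-par-map proc lst size)
  (concatenate
   (par-map (lambda (part) (map proc part))
            (chunk lst size))))

(display (length (chunked-par-map 1+ (iota 40000) 5000)))
(newline)
```

If the spine is the bottleneck, a version like this should spend most of
its time in the per-chunk ‘map’ calls; whether it actually spreads over
several cores would still need to be measured.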

Andy




This bug report was last modified 8 years and 107 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.