GNU bug report logs - #14599
An option to make vector allocation aligned


Package: guile;

Reported by: Jan Schukat <shookie <at> email.de>

Date: Wed, 12 Jun 2013 13:38:02 UTC

Severity: wishlist


From: Jan Schukat <shookie <at> email.de>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 14599 <at> debbugs.gnu.org
Subject: bug#14599: An option to make vector allocation aligned
Date: Wed, 12 Jun 2013 23:14:31 +0200
Thought a bit about it, and it would really be nice to have an aligned 
uniform vector API.

At the moment all of them are 8-byte aligned, so you would probably also 
want at least 16- and 32-byte alignment (Intel's AVX has 256-bit 
registers that work better with aligned data).
But even 64 bytes and more could be useful for cache line alignment, 
although that would have to be a separate alignment setting, because the 
benefits of cache line alignment are largely defeated if the header is 
in a different cache line.

So I guess just one alignment, namely that of the first element, is 
feasible without wasting whole cache lines. If you really need cache 
line alignment you can still use the take_*vector functions, and it's 
pretty rare to do such things anyway. But being able to control the 
alignment of the first element allows you to properly use SIMD 
instructions on those vectors.
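
A minimal sketch of aligning the first element: over-allocate by 
alignment - 1 bytes, skip the header, then round the address up. The 
names align_up and alloc_aligned_contents are hypothetical and not part 
of libguile, and plain malloc stands in for the GC allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libguile: round ADDR up to the next
   multiple of ALIGN, which must be a power of two.  */
static uintptr_t
align_up (uintptr_t addr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr + (align - 1)) & ~(align - 1);
}

/* Allocate HEADER + LEN bytes plus ALIGN - 1 bytes of slack and return a
   pointer to the first element, aligned to ALIGN.  The start of the raw
   block is stored in *BLOCK so the caller can free it.  malloc is only a
   stand-in for the GC allocator here.  */
static unsigned char *
alloc_aligned_contents (size_t header, size_t len, size_t align,
                        void **block)
{
  unsigned char *raw = malloc (header + len + align - 1);

  if (raw == NULL)
    return NULL;
  *block = raw;
  /* Align the address just past the header, not the block itself.  */
  return (unsigned char *) align_up ((uintptr_t) (raw + header), align);
}
```

Rounding up moves the pointer by at most alignment - 1 bytes, so that 
much slack is enough to keep all LEN bytes inside the block.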

You don't even really need any more space to store alignment 
information, since that can be directly inferred from the bytevector 
content pointer, although the bytevector flags still have more than 
enough space to store it.
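
As an illustration of inferring the alignment from the pointer alone, 
the lowest set bit of an address is the largest power of two that 
divides it. The function name below is hypothetical. Note that the 
result is the alignment the address happens to have, which can be 
larger than the one that was requested.

```c
#include <stdint.h>

/* Hypothetical helper: return the largest power of two that divides
   ADDR, i.e. the alignment the address happens to have.  Returns 0 for
   a null address.  */
static uintptr_t
alignment_of_address (uintptr_t addr)
{
  /* addr & -addr isolates the lowest set bit; written with ~addr + 1
     to stay within unsigned arithmetic.  */
  return addr & (~addr + 1);
}
```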

Extending the programming API to support this is a bit more tricky. I 
guess the most straightforward and backward-compatible way would be to 
just add a set of make-aligned-*vector, aligned-*vector and 
*->aligned-*vector functions, and their scm_* versions, with an 
additional alignment parameter. Optional alignment parameters on the old 
functions could be nice too, but I guess that is just asking for 
compatibility trouble.

The other question is the read syntax (one of the primary reasons I'm 
doing all this). If alignment is something that should be preserved in 
the permanent representation, you also need to store it in the flags, 
since the content pointer can be aligned by coincidence. I haven't 
looked at the compiling of bytevectors yet, to see if alignment can be 
handled easily there.

As for the text representation, I think the simplest way is to add 
another reserved character followed by the alignment number, which would 
work for uniform vectors and arrays, like #vu8>8(1 2 3 4 5 6) to have 
the first element at 8-byte alignment. Right now the allocation pretty 
much ensures 4-byte alignment of the first element on 32-bit machines 
and 8-byte alignment on 64-bit machines, because gc_malloc returns 
8-byte aligned blocks but the array starts at cell word 3. Any vector of 
a 64-bit type, like double and long, is already guaranteed to be 
misaligned on 32-bit platforms. That would be even more unfortunate on 
Linux x32 ABI systems, which use efficient 64-bit ints with 32-bit 
pointers, since cell size is determined by pointer size.
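
The arithmetic behind those numbers can be sketched as follows, 
assuming (as stated above) that blocks are 8-byte aligned and the data 
starts three words into the block. The function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: alignment guaranteed for data that starts
   HEADER_WORDS words of WORD_SIZE bytes into a block that is itself
   aligned to BLOCK_ALIGN bytes (a power of two).  */
static size_t
guaranteed_alignment (size_t block_align, size_t header_words,
                      size_t word_size)
{
  size_t offset = header_words * word_size;
  size_t align = block_align;

  /* Halve the alignment until it divides the header offset.  */
  while (align > 1 && offset % align != 0)
    align /= 2;
  return align;
}
```

With 4-byte words the offset is 12 bytes, so guaranteed_alignment (8, 3, 
4) gives 4, which is too little for an 8-byte double. With 8-byte words 
the offset is 24 bytes and the result is 8, even if the block itself 
were 16-byte aligned.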

Or, to construct arrays of 4-element SIMD vectors, 
#2f32:2:4>16((1 2 3 4)(1 2 3 4)). Maybe even have a default alignment of 
16 when you just use > without a number, so that 
#2f32:2:4>((1 2 3 4)(1 2 3 4)) is the same thing. Or, even more 
convenient, #m128((1 2 3 4)(1.0 1.0 1.0 1.0) (2.0 2.0)), where you can 
freely mix the underlying types and the size of the elements is inferred 
from the number of them in each group.
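
A rough sketch of how a reader could pick up the proposed >N suffix, 
including the default of 16 for a bare >. This is only an illustration 
of the proposal: the function is hypothetical, it is not how Guile's 
reader is structured, and it does not check that the number is a power 
of two.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical parser for the proposed alignment suffix.  S points just
   past the vector tag (e.g. past "#vu8").  Returns the number of
   characters consumed and stores the alignment in *ALIGN; 0 means no
   alignment was requested.  A bare '>' selects the default of 16.  */
static size_t
parse_alignment_suffix (const char *s, unsigned long *align)
{
  size_t i = 0;

  *align = 0;
  if (s[i] != '>')
    return 0;                   /* no suffix present */
  i++;
  if (!isdigit ((unsigned char) s[i]))
    {
      *align = 16;              /* '>' without a number */
      return i;
    }
  while (isdigit ((unsigned char) s[i]))
    {
      *align = *align * 10 + (unsigned long) (s[i] - '0');
      i++;
    }
  return i;
}
```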


So if there is interest in something like this in mainline Guile, I 
will make the patches. If not, I'll just stick to my crude hack for now 
and see if I need the full shebang :).


Regards

Jan Schukat


On 06/12/2013 04:59 PM, Ludovic Courtès wrote:
> severity 14599 wishlist
> thanks
>
> Hi!
>
> Jan Schukat <shookie <at> email.de> skribis:
>
>> If you want to access native uniform vectors from c, sometimes you
>> really want guarantees about the alignment.
> [...]
>
>> This isn't necessarily true for vectors created from pre-existing
>> buffers (the take_*vector functions), but there you have control over
>> the pointer you pass, so you can make it true if needed.
>>
>> So if there is interest, maybe this could be integrated into the build
>> system as a configuration like this:
>>
>>
>> --- libguile/bytevectors.c    2013-04-11 02:16:30.000000000 +0200
>> +++ bytevectors.c    2013-06-12 14:45:16.000000000 +0200
>> @@ -223,10 +223,18 @@
>>
>>         c_len = len * (scm_i_array_element_type_sizes[element_type] / 8);
>>
>> +#ifdef SCM_VECTOR_ALIGN
>> +      contents = scm_gc_malloc_pointerless
>> (SCM_BYTEVECTOR_HEADER_BYTES + c_len + SCM_VECTOR_ALIGN,
>> +                        SCM_GC_BYTEVECTOR);
>> +      ret = PTR2SCM (contents);
>> +      contents += SCM_BYTEVECTOR_HEADER_BYTES;
>> +      contents += (addr + (SCM_VECTOR_ALIGN - 1)) & -SCM_VECTOR_ALIGN;
>> +#else
>>         contents = scm_gc_malloc_pointerless
>> (SCM_BYTEVECTOR_HEADER_BYTES + c_len,
>>                           SCM_GC_BYTEVECTOR);
>>         ret = PTR2SCM (contents);
>>         contents += SCM_BYTEVECTOR_HEADER_BYTES;
>> +#endif
>>
>>         SCM_BYTEVECTOR_SET_LENGTH (ret, c_len);
>>         SCM_BYTEVECTOR_SET_CONTENTS (ret, contents);
> I don’t think it should be a compile-time option, because it would be
> inflexible and inconvenient.
>
> Instead, I would suggest using the scm_take_ functions if allocating
> from C, as you noted.
>
> In Scheme, I came up with the following hack:
>
> --8<---------------cut here---------------start------------->8---
> (use-modules (system foreign)
>               (rnrs bytevectors)
>               (ice-9 match))
>
> (define (memalign len alignment)
>    (let* ((b (make-bytevector (+ len alignment)))
>           (p (bytevector->pointer b))
>           (a (pointer-address p)))
>      (match (modulo a alignment)
>        (0 b)
>        (padding
>         (let ((p (make-pointer (+ a (- alignment padding)))))
>           ;; XXX: Keep a weak reference to B or it can be collected
>           ;; behind our back.
>           (pointer->bytevector p len))))))
> --8<---------------cut here---------------end--------------->8---
>
> Not particularly elegant, but it does the job.  ;-)
>
> Do you think there’s additional support that should be provided?
>
> Thanks,
> Ludo’.





This bug report was last modified 12 years and 61 days ago.
