[clug] Looking for Linux on CBE programming expert help

Jeremy Kerr jk at ozlabs.org
Sun Aug 30 04:32:11 MDT 2009


Hi Sanders,

> PS3 Cell Broadband Engine - Linux Programming

I'm not sure how many other CBE hackers are on this list, so you may want to 
use cbe-oss-dev in future (https://ozlabs.org/mailman/listinfo/cbe-oss-dev). 
Regardless, in answer to your questions:

> In the SPE program I have declared and allocated local memory to take the
> data structure, like this:
>
>     float *p1 __attribute__ ((aligned (128)));

You don't need the aligned attribute there; it only aligns the pointer 
variable itself, not the data it points to.

>     p1 = (float*) malloc(floatArrayByteCount + 128 + 128); // Note:
> floatArrayByteCount = 128.  // +128+128 is to allow for 128 byte boundary
> alignment (below)
>
> I am using the following to make sure that my pointers start on a 128 byte
> boundary.  I know that there must be a better way but I don't know it:
>
>     while ((unsigned int)p1 % 128 != 0) {p1 += 1;}

The usual trick here is to add size - 1, then mask off the low bits with 
~(size - 1). This works for any power-of-two size. Something like:

#define ALIGN(x, size) \
	((typeof(x))(((unsigned long)(x) + ((size) - 1)) & ~((unsigned long)(size) - 1)))

p1 = ALIGN(p1, 128);
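
For anyone who would rather avoid the typeof extension, here is a rough 
sketch of the same add-and-mask idea as plain functions. The names align_up 
and align_ptr_128 are made up for this example, and it assumes the size is a 
power of two and that the buffer was over-allocated by at least 127 bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of size.
 * Assumption: size is a power of two (128 in the case discussed here). */
static uintptr_t align_up(uintptr_t addr, uintptr_t size)
{
    return (addr + (size - 1)) & ~(size - 1);
}

/* Hypothetical helper: first 128-byte-aligned address at or after p.
 * The caller must keep the original pointer to pass to free() later. */
static float *align_ptr_128(void *p)
{
    return (float *)align_up((uintptr_t)p, 128);
}
```

An address that is already aligned comes back unchanged, so only 127 bytes 
of slack are needed rather than 256.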

However, if you already know the size of the local buffer at compile time, I'd 
ditch the malloc and just use a statically allocated buffer, aligned to the 
right address:

static float buf[floatArrayByteCount / sizeof(float)] __attribute__((aligned(128)));

Beware that there is a bug in gcc that means that the 'aligned' attribute 
won't work for variables declared on the stack, so you'll need to declare 
this outside of a function. I assume a static var declared within a function 
will work too, but haven't tried it.
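
If you want to check how your own toolchain treats these cases, a small 
sketch along the following lines might help. FLOAT_ARRAY_BYTE_COUNT, 
is_aligned_128 and get_dma_buffer are names invented for this example, and 
the 128-byte size is taken from your description:

```c
#include <assert.h>
#include <stdint.h>

#define FLOAT_ARRAY_BYTE_COUNT 128  /* byte count from the question */

/* File-scope buffer: the attribute applies to the array storage itself. */
static float buf[FLOAT_ARRAY_BYTE_COUNT / sizeof(float)]
    __attribute__((aligned(128)));

/* Returns nonzero if p sits on a 128-byte boundary. */
static int is_aligned_128(const void *p)
{
    return ((uintptr_t)p & 127) == 0;
}

/* Function-local static: it has static storage, so it is not on the stack. */
float *get_dma_buffer(void)
{
    static float local_buf[FLOAT_ARRAY_BYTE_COUNT / sizeof(float)]
        __attribute__((aligned(128)));
    return local_buf;
}
```

Calling is_aligned_128() on both buffers at startup is a cheap way to catch 
an alignment problem before it turns into a failed DMA.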

If you don't know the size of the buffer at compile time, I'd suggest using 
memalign() rather than malloc().
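
A minimal sketch of the memalign() route, assuming your C library declares 
it in <malloc.h> (glibc does; check the headers of your SPE toolchain). The 
wrapper name alloc_dma_buffer is made up for this example:

```c
#include <assert.h>
#include <malloc.h>   /* memalign() */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>   /* free() */

/* Allocate byte_count bytes starting on a 128-byte boundary.
 * Returns NULL on failure; release the buffer with free(). */
static float *alloc_dma_buffer(size_t byte_count)
{
    return (float *)memalign(128, byte_count);
}
```

With this there is no need to over-allocate or to adjust the pointer by 
hand, and the pointer you get back is the one you pass to free().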

> I am lining up the main memory source pointer (passed from the PPE program)
> to a 128 byte boundary in a similar way.
>
> Then I start the transfer with no problem, using:
>
>     mfc_get(p1, ppe_p1_start_address, floatArrayByteCount, 31, 0, 0); //
> ppe_p1_start_address
>
> However when I wait for the transfer to finish the program stalls
> (apparently forever) with no error:
>
>     mfc_write_tag_mask(1<<31);
>     mfc_read_tag_status_all(); // Stalls here

I'm pretty sure you only have 16 tags in the SPE DMA command queue. Try using 
1<<15 here (and update the call to mfc_get) and see if that helps.
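
Whichever tag you settle on, it helps to derive the mask from the same 
constant that you pass to mfc_get, so the two cannot drift apart. A rough 
sketch, where DMA_TAG and tag_mask are names invented for this example and 
the MFC calls are shown only in a comment because they need the SPE 
toolchain:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag ID for illustration; 15 is the value suggested above. */
enum { DMA_TAG = 15 };

/* Build the tag-group mask for a single tag ID.
 * The shift is done on an unsigned value so that a high tag number
 * does not shift a bit into the sign position of an int. */
static uint32_t tag_mask(unsigned tag)
{
    return (uint32_t)1 << tag;
}

/* On the SPE, the same constant would then feed both calls:
 *
 *     mfc_get(p1, ppe_p1_start_address, floatArrayByteCount, DMA_TAG, 0, 0);
 *     mfc_write_tag_mask(tag_mask(DMA_TAG));
 *     mfc_read_tag_status_all();
 */
```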

Regards,


Jeremy

