endboss
01-13-2010, 12:00 PM
Hi everybody, I'm new here.
I'm currently developing some high-performance audio DSP code.
While playing around with Scratchbox and the on-device SDK, I stumbled across some severe bugs in the SDK.
The first bug is that automatic vector code generation fails.
Compiling
void NeonTest(float * __restrict a, float * __restrict b, float * __restrict z)
{
    int i;
    for (i = 0; i < 200; i++) {
        z[i] = a[i] * b[i];
    }
}
with gcc -O3 -march=armv7-a -mtune=cortex-a8 -mfpu=neon -ftree-vectorize -mfloat-abi=softfp does not produce NEON-optimized code.
So I hand-crafted the code and ran into the next bug...
The second bug is a segfault in the GNU assembler when assembling certain NEON instructions (e.g. vmul.f32).
I have already filed a bug report here:
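For anyone who wants to try the same thing without going through the assembler, here is a minimal sketch of the loop above written with NEON intrinsics from arm_neon.h. It is not the hand-crafted code from this post (that was assembly). The name NeonTestIntrinsics and the extra length parameter n are made up for illustration, and the scalar fallback is only there so the file also builds on a non-NEON host.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: element-wise z[i] = a[i] * b[i] for n floats. */
void NeonTestIntrinsics(const float * __restrict a, const float * __restrict b,
                        float * __restrict z, size_t n)
{
    size_t i = 0;
    assert(a && b && z);
#if HAVE_NEON
    /* Process four floats per iteration in one quad register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(z + i, vmulq_f32(va, vb));
    }
#endif
    /* Scalar tail (or the whole loop when NEON is not available). */
    for (; i < n; i++) {
        z[i] = a[i] * b[i];
    }
}
```

Calling NeonTestIntrinsics(a, b, z, 200) would correspond to the original NeonTest(a, b, z).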
http://www.mail-archive.com/bug-binutils@gnu.org/msg08819.html
I worked around that by recompiling gas with the segfaulting case disabled. For some weird reason the file then runs through and assembles correctly.
The third bug is in Scratchbox when running in armel mode. The Scratchbox environment emulates the ARM code, but it has bugs when emulating certain NEON instructions.
In particular, multiplying a quad register by a float scalar gives wrong results. That was really weird: I was working on some matrix-times-vector code which gave wrong results in Scratchbox but the correct result when running on the device.
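A small self-checking test along the following lines might make a useful attachment to a bug report. This is only a sketch, not my original matrix code: the function names, the column-major 4x4 layout and the test values are all assumptions chosen for illustration. It computes a matrix-times-vector product once with plain scalar code and once with quad-by-scalar intrinsics (vmulq_n_f32 and vmlaq_n_f32), then counts how many results differ. On a build without NEON the second path simply calls the reference, so the comparison is only meaningful when compiled for ARM with -mfpu=neon.

```c
#include <assert.h>

#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Scalar reference: out = M * v, with M stored as a 4x4 column-major array. */
static void MatVecRef(const float m[16], const float v[4], float out[4])
{
    int r, c;
    for (r = 0; r < 4; r++) {
        out[r] = 0.0f;
        for (c = 0; c < 4; c++) {
            out[r] += m[c * 4 + r] * v[c];
        }
    }
}

/* Same product using quad-register-times-scalar operations. */
static void MatVecQuad(const float m[16], const float v[4], float out[4])
{
    assert(m && v && out);
#if HAVE_NEON
    float32x4_t acc = vmulq_n_f32(vld1q_f32(m), v[0]);   /* col0 * v0 */
    acc = vmlaq_n_f32(acc, vld1q_f32(m + 4), v[1]);      /* += col1 * v1 */
    acc = vmlaq_n_f32(acc, vld1q_f32(m + 8), v[2]);      /* += col2 * v2 */
    acc = vmlaq_n_f32(acc, vld1q_f32(m + 12), v[3]);     /* += col3 * v3 */
    vst1q_f32(out, acc);
#else
    MatVecRef(m, v, out);  /* host fallback: nothing to compare against */
#endif
}

/* Returns the number of result elements where the two paths disagree. */
int CountMismatches(void)
{
    /* Small integer values, so every product and sum is exact in float. */
    const float m[16] = { 1,  2,  3,  4,
                          5,  6,  7,  8,
                          9, 10, 11, 12,
                         13, 14, 15, 16};
    const float v[4] = {1, 2, 3, 4};
    float ref[4], quad[4];
    int i, bad = 0;

    MatVecRef(m, v, ref);
    MatVecQuad(m, v, quad);
    for (i = 0; i < 4; i++) {
        if (ref[i] != quad[i]) {
            bad++;
        }
    }
    return bad;
}
```

If CountMismatches() returns a non-zero value inside Scratchbox but zero on the device, that would point at the emulation rather than at the code.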
I just wanted to let you guys know.
Where should I report the Scratchbox bug?