[sdiy] Raspberry Pi 2 Synthesizer Project
Andrew Simper
andy at cytomic.com
Wed Feb 10 08:49:16 CET 2016
On 8 February 2016 at 00:27, Scott Gravenhorst <music.maker at gte.net> wrote:
> Eric Brombaugh <ebrombaugh1 at cox.net> wrote:
> Achim wrote this:
>> For that kind of code it should be beneficial to use VFPv4 and NEONv2,
>> which is available on the Pi2.
>
> Yes, I saw that NEON is in the IC, but I need to research how to turn it on
> with gcc as I believe it is not used by default. It would be interesting to
> see if there's much of a difference. In the past, all of my synth designs
> (such as for dsPIC and FPGA) have used fixed point format. This particular
> synth uses the sin() function, which uses double float arithmetic. Voice
> summing is done with fixed pt. A future KS synth will be float or double at
> first and then converted to fixed point just to see if there's a significant
> difference.
>
> -- ScottG
Here is how to turn on NEON, but you're not going to get speedups just
by flicking the compiler switch. You need to re-code the entire
algorithm for it:
https://gcc.gnu.org/onlinedocs/gcc-4.4.1/gcc/ARM-NEON-Intrinsics.html
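A quick way to confirm that the switch actually took effect is to test the predefined macro at compile time. This is only a sketch: the function name neon_enabled is made up for illustration, and the flags mentioned in the comment (-mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard) are an assumption for a Pi 2 running a hard-float distribution, not something stated in this thread.

```cpp
// Assumed build line for a Pi 2 (not taken from the thread):
//   g++ -O2 -mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard check.cpp

// Returns true when this file was compiled with NEON enabled.
// GCC and Clang predefine __ARM_NEON (older releases: __ARM_NEON__)
// only when the selected FPU includes NEON.
constexpr bool neon_enabled()
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return true;
#else
    return false;
#endif
}
```

If neon_enabled() comes back false on the Pi, the intrinsics in arm_neon.h will not compile either, so it is worth checking before rewriting any DSP code.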
You basically get SSE1-style 4 x float ops in parallel, so if you have
synth voices you can run around 3x as many of them (there is a little
overhead, so you'll never get the full 4x speedup).
So you need to code your voicing engine to pack the voices into
4 x float blocks, so that every operation runs on four voices in
parallel, e.g.:
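init (on voice change):
    #include <arm_neon.h>
    float32x4_t param;
    float32x4_t state;
    // one voice per lane; param[i] relies on the GCC/Clang vector
    // subscript extension
    for (int i=0; i<4; i++)
    {
        const VoiceData& vdata = voices[voices_index[i]];
        param[i] = vdata.param;
        state[i] = vdata.state;
    }
tick:
    // advances all four voices with a single NEON add
    state = vaddq_f32 (state, param);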
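The snippet above can be fleshed out into something that compiles on its own. This is a minimal sketch, not Andy's actual engine: the VoiceData layout (a phase increment and a phase), the function name tick_four and the SYNTH_USE_NEON macro are all assumptions for illustration. It gathers through small float arrays and vld1q_f32 instead of lane subscripting, and falls back to plain scalar code when NEON is not enabled so the same file also builds on a desktop.

```cpp
#include <cassert>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SYNTH_USE_NEON 1
#else
#define SYNTH_USE_NEON 0
#endif

// Hypothetical per-voice data: a phase accumulator and its increment.
struct VoiceData
{
    float param;  // phase increment per sample
    float state;  // current phase
};

// Advance the four voices selected by voices_index by num_samples ticks.
void tick_four(VoiceData* voices, const int voices_index[4], int num_samples)
{
    assert(num_samples >= 0);

    // init: gather four voices into contiguous arrays, one voice per lane
    float p[4], s[4];
    for (int i = 0; i < 4; i++)
    {
        const VoiceData& vdata = voices[voices_index[i]];
        p[i] = vdata.param;
        s[i] = vdata.state;
    }

#if SYNTH_USE_NEON
    float32x4_t param = vld1q_f32(p);
    float32x4_t state = vld1q_f32(s);
    // tick: one vector add per sample updates all four voices
    for (int n = 0; n < num_samples; n++)
        state = vaddq_f32(state, param);
    vst1q_f32(s, state);
#else
    // scalar fallback with the same result, four adds per sample
    for (int n = 0; n < num_samples; n++)
        for (int i = 0; i < 4; i++)
            s[i] += p[i];
#endif

    // scatter the updated state back to the voices
    for (int i = 0; i < 4; i++)
        voices[voices_index[i]].state = s[i];
}
```

The gather and scatter loops are the overhead mentioned earlier, which is why they belong in the voice-change path or around a whole block of samples rather than around every single tick.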
Or if you want a fixed CPU load then you could always use 4 x floats for
your data and run all the voices all the time, so you don't need to fiddle
around with which voice is currently playing.
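The fixed-CPU variant could look roughly like the sketch below. Everything in it is an assumption for illustration: the pool size of 8, the structure-of-arrays VoicePool, the note_on / note_off / tick_all names, and the output being just gain times phase as a stand-in for a real oscillator (there is no phase wrapping). The point is only that silent voices keep running with a gain of zero, so the per-sample cost never changes and there is no voice index table to maintain.

```cpp
#include <cassert>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SYNTH_USE_NEON 1
#else
#define SYNTH_USE_NEON 0
#endif

constexpr int kNumVoices = 8;  // hypothetical pool size, multiple of 4

// Structure-of-arrays layout so four neighbouring voices load as one vector.
struct VoicePool
{
    alignas(16) float phase[kNumVoices];
    alignas(16) float inc[kNumVoices];
    alignas(16) float gain[kNumVoices];  // 0 means the voice is silent
};

void note_on(VoicePool& pool, int slot, float inc, float gain)
{
    assert(slot >= 0 && slot < kNumVoices);
    pool.phase[slot] = 0.0f;
    pool.inc[slot] = inc;
    pool.gain[slot] = gain;
}

void note_off(VoicePool& pool, int slot)
{
    assert(slot >= 0 && slot < kNumVoices);
    pool.gain[slot] = 0.0f;  // the voice keeps ticking, it just adds nothing
}

// Advance every voice by one sample and return the gain-weighted mix.
float tick_all(VoicePool& pool)
{
    static_assert(kNumVoices % 4 == 0, "pool must fill whole 4 x float blocks");
#if SYNTH_USE_NEON
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (int i = 0; i < kNumVoices; i += 4)
    {
        float32x4_t ph = vaddq_f32(vld1q_f32(pool.phase + i),
                                   vld1q_f32(pool.inc + i));
        vst1q_f32(pool.phase + i, ph);
        acc = vmlaq_f32(acc, ph, vld1q_f32(pool.gain + i));  // acc += ph * gain
    }
    float lanes[4];
    vst1q_f32(lanes, acc);
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
#else
    float mix = 0.0f;
    for (int i = 0; i < kNumVoices; i++)
    {
        pool.phase[i] += pool.inc[i];
        mix += pool.phase[i] * pool.gain[i];
    }
    return mix;
#endif
}
```

The trade-off is that idle voices still burn cycles, but the worst case is also the only case, which makes it easy to know in advance whether the audio callback will meet its deadline.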
Andy