[sdiy] The Owl: ARM fx pedal

rsdio at sounds.wa.com
Wed Apr 24 20:46:19 CEST 2013


On Apr 24, 2013, at 11:15, Martin Klang wrote:
> On 24 Apr 2013, at 18:43, Paul Beckmann wrote:
>> The benchmarks listed on the hoxtonowl.com web site aren't  
>> accurate.  The Cortex-M4 has basic DSP instructions but lacks a  
>> lot of the capabilities of a traditional DSP.  A DSP can do a  
>> Biquad in 5 instructions; the Cortex-M4 takes about 13 cycles in  
>> the inner loop.  If you are open to changing processors, you might  
>> look at the M4 family from NXP.  The device runs at up to 204 MHz.
>
> Can you elaborate on this, please?
> I'm not an expert, but I understand a biquad filter requires 5
> multiply and accumulate (MAC) operations per sample.
> The Cortex M4 does single-cycle MAC - see here:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0439b/CHDDIGAC.html
>
> Therefore I don't think our information is inaccurate, though it  
> could be more precise. Please correct me if you think I'm wrong!

You're looking only at the MAC instruction itself, which only  
operates on data in registers, and handles no looping. That's not  
nearly the whole picture.

A full DSP like the TMS320 or OMAP can fetch two full audio samples,  
calculate the MAC, store the result, increment all three memory  
pointers, wrap the pointers for the buffer size range, decrement a  
loop counter, and then decide to repeat the instruction without  
having to fetch a new opcode. All of the operations listed occur in a  
single cycle on a proper DSP. Note that the TMS320 has three
full-width read busses and two full-width write busses that can operate at
the same time during a single cycle, as well as on-board SRAM that  
can support multiple, simultaneous accesses (provided that you  
allocate your buffers so they don't impinge on each other).
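
To make that list concrete, here is a rough C sketch of one output
sample of a FIR filter over a circular delay line. The names (fir_t,
fir_step) and the Q15 sample format are assumptions made for
illustration, not code from any vendor library, and saturation is left
out. Each commented step in the loop is separate work for a
general-purpose core; the DSP described above folds all of them into
one repeated instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical FIR state: Q15 coefficients and a circular delay line. */
typedef struct {
    const int16_t *coeff;  /* 'taps' coefficients, coeff[0] applies to newest sample */
    int16_t *delay;        /* circular buffer holding the last 'taps' inputs */
    size_t taps;
    size_t pos;            /* index of the newest sample in delay[] */
} fir_t;

/* Push one input sample and compute one output sample. */
int16_t fir_step(fir_t *f, int16_t in)
{
    assert(f->taps > 0);

    /* Store the new sample, advancing and wrapping the write index. */
    f->pos = (f->pos + 1) % f->taps;
    f->delay[f->pos] = in;

    int64_t acc = 0;
    size_t idx = f->pos;
    for (size_t n = f->taps; n != 0; n--) {            /* loop counter */
        /* two operand fetches plus one multiply-accumulate */
        acc += (int32_t)f->coeff[f->taps - n] * f->delay[idx];
        /* step the data pointer backwards and wrap it */
        idx = (idx == 0) ? f->taps - 1 : idx - 1;
    }
    return (int16_t)(acc >> 15);                       /* Q30 product back to Q15 */
}
```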

The same operations in the ARM would require several instructions.  
Sure, the MAC only takes a cycle, but: You have to get the data into  
the registers. You have to store the results somewhere. You have to  
manage pointers and loop variables. I haven't checked Paul's  
calculations, but I assume he's correct that it would take about 13  
cycles in an inner loop to accomplish the complete set of tasks that  
a DSP handles in a single cycle. You also have to be prepared for
the compiler not always emitting the most efficient instructions,
at which point your loop takes well over 13 cycles.
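
As a minimal sketch of where the extra cycles go, here is a direct
form I biquad in plain C. The struct and function names are
hypothetical, and single-precision float is an assumption; this is not
the code behind either set of numbers in this thread. The arithmetic
is the five multiply-accumulates, but every sample also needs a load,
a store, four state updates and the loop bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical direct form I biquad: coefficients plus two samples of history. */
typedef struct {
    float b0, b1, b2, a1, a2;  /* a0 is assumed normalised to 1 */
    float x1, x2, y1, y2;      /* previous inputs and outputs */
} biquad_df1_t;

void biquad_df1_process(biquad_df1_t *s, const float *in, float *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {        /* loop counter, compare, branch */
        float x = in[i];                    /* load the input sample */

        /* the five multiply-accumulates */
        float y = s->b0 * x
                + s->b1 * s->x1
                + s->b2 * s->x2
                - s->a1 * s->y1
                - s->a2 * s->y2;

        /* shuffle the history; a compiler may or may not keep these in registers */
        s->x2 = s->x1;
        s->x1 = x;
        s->y2 = s->y1;
        s->y1 = y;

        out[i] = y;                         /* store the output sample */
    }
}
```

How many cycles this costs on a given core depends on the compiler and
its settings, which is exactly the uncertainty described above.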

By the way, frequently-used DSP routines are often written in
assembly, so their performance doesn't change when you switch
compilers. Also, processors like the TMS320 can fetch an entire
subroutine of multiple instructions into cache and loop without tying  
up the memory bus fetching the same opcodes over and over. Of course,  
other processors have instruction cache, but a DSP often allows very  
precise control over this.

I'd actually recommend the Analog Devices SHARC or Texas Instruments  
OMAP. The latter combines an ARM core with a TMS320 DSP core, such  
that you have both the efficiency of DSP instructions for audio  
processing plus the ease of use of ARM for the high-level control.  
Then again, dual core programming with asymmetrical processing like  
the OMAP might be too challenging for beginners, unless a very solid  
operating system with reusable building blocks is built up as the  
basis for your pedal.


> What we want to leverage is the skills and talents of all the  
> coders who are currently doing VST plugins and other software-based  
> solutions.

Existing VST code will not be directly usable on your platform
because it will be such a different environment (and should be, if  
you want lower latency). It's hard enough to convince VST plugin  
developers to support AudioUnits, which represent a huge market  
compared to an fx pedal.

There is a huge difference between single-sample-based DSP and
buffer-based (block-at-a-time) DSP. The mindset is quite similar, but
the code doesn't readily port across. Not only that, but having a
graphical user interface for parameter changes is very different from
running on a hardware platform with only physical controls and no graphics.
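
A small sketch of the two calling styles, using a one-pole lowpass
with hypothetical names (onepole_t, onepole_sample, onepole_block).
It only shows the difference in shape for the simplest possible case;
it assumes nothing about any plugin API and does not address the host
and GUI dependencies that real plugin code carries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-pole lowpass: y += a * (x - y). */
typedef struct {
    float a;  /* smoothing coefficient, 0..1 */
    float y;  /* previous output */
} onepole_t;

/* Sample-based: called once per incoming sample, e.g. from a codec interrupt. */
float onepole_sample(onepole_t *s, float x)
{
    s->y += s->a * (x - s->y);
    return s->y;
}

/* Block-based: the caller hands over n samples at once, the way plugin
 * hosts typically do. Output for the first sample is not available
 * until the whole block has been collected. */
void onepole_block(onepole_t *s, const float *in, float *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = onepole_sample(s, in[i]);
}
```

With n samples per block, the caller has to buffer n samples before
processing can start, which is where the latency difference comes from.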

Brian Willoughby
Sound Consulting



