[sdiy] Hardware convolution box?

cheater00 cheater00 cheater00 at gmail.com
Wed Feb 15 00:39:47 CET 2017


On Tue, 14 Feb 2017 21:40 , <rsdio at audiobanshee.com> wrote:
>
> I recommend against trying to bend video shaders into audio processing, especially when there are better processors for audio signals.
>
> I'm sure you could get great performance out of a video accelerator, but you'd be using the wrong tool and would find less expertise to help along the way. The video chip manufacturers will not have sample code or support for questions about audio processing, whereas the DSP vendors have been helping people implement convolution for decades.
>
> It's not just the raw instruction speed, but also that many other aspects of DSP chip hardware are optimized for getting low-latency signals streaming through the hardware. The TMS320 has 5 busses that can operate at the same time to read 3 separate pieces of data and write 2 more. There are also DMA channels which can shuttle audio samples from the ADC to memory without tying up the processor execution, and matching DMA to shuttle from memory to the DAC. Also, DSP chips tend to support the digital audio formats that audio converters need, meaning that you can get audio data in and out without writing code to manually translate between different communication formats. It's highly unlikely that a video accelerator chip will have the ability to directly access digital audio streams without another processor wasting cycles to needlessly shuttle data around. On a DSP, everything is efficiently linked together. They've been learning how to optimize the very thing you're doing over several decades of experience. These chips are highly evolved (just as video accelerators are highly evolved for moving 3D data structures into video frame buffers).
>
> My most recent TMS320 firmware processes 6 million analog conversions per second, including about 16 thousand fixed-point FFTs per second. Latency is 1 or 2 ms, but that is largely constrained by USB and could potentially be lower with overlapping FFT windows. That's handling 32 output channels and 16 input channels at 125 kHz sample rate per channel.

Those are some very valid points!
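As a sanity check on the quoted latency figure: at 125 kHz per channel, a block-based FFT scheme buffers one analysis hop before it can process, which lands right in the 1-2 ms range, and overlapping windows shrink the hop. A rough back-of-the-envelope sketch (the window sizes below are my own illustrative assumptions, not from the post):

```python
# Rough latency arithmetic for block-based FFT processing.
# Window sizes are illustrative assumptions; only the 125 kHz
# sample rate comes from the quoted TMS320 figures.
def block_latency_ms(window_samples, sample_rate_hz, overlap=1):
    """Time to fill one analysis hop (window / overlap samples)."""
    hop = window_samples / overlap
    return 1000.0 * hop / sample_rate_hz

rate = 125_000  # Hz per channel, per the quoted figures

for window in (128, 256):
    print(window, round(block_latency_ms(window, rate), 3))
# 128-sample window -> 1.024 ms, 256-sample window -> 2.048 ms

# With 2x-overlapped windows the hop halves, and so does the
# buffering delay:
print(round(block_latency_ms(256, rate, overlap=2), 3))  # 1.024 ms
```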

I have done some research into using the GLES shaders on the Raspberry
Pi, and it turns out the hardware has to be accessed through Broadcom's
driver. That means the board has to run a mainstream OS, so the whole
setup won't be realtime-optimized:

https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=10167&p=117038&hilit=gles+opengl

Of course, the driver situation may have changed in the five years
since those posts were written.

If you started with one of the lower-end TI DSPs, say the C55x or
C674x, would the asm code carry over to the C66x? My feeling is that
the answer is a resounding "no". If that's the case, maybe it would be
best to start with a lower-end C66x? Those chips start at roughly $40,
which really isn't that much, and provide 20 GMACS, which is already
plenty; the family goes up to 358 GMACS. The C55x tops out at 0.6
GMACS and the C674x at 3.6 GMACS, and most likely neither is enough of
an increase to challenge the ARM boards many people already have.
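To put those GMACS numbers in perspective: direct-form convolution costs one multiply-accumulate per filter tap per output sample, so the affordable tap count scales directly with MAC throughput. A minimal sketch of that inner loop (the tap count and sample rate in the comment are illustrative assumptions, not from this thread):

```python
# Why MACs-per-second is the figure of merit: a direct-form FIR
# filter costs one multiply-accumulate per tap per output sample.
def fir(x, h):
    """Direct-form convolution: y[n] = sum_k h[k] * x[n-k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]  # one MAC per tap
        y.append(acc)
    return y

# Illustrative budget: a 2048-tap FIR at a 192 kHz sample rate needs
# 2048 * 192_000 ~= 0.39 GMACS -- a few percent of a 20 GMACS part.
print(fir([1.0, 0.0, 0.0, 1.0], [0.5, 0.25]))
# -> [0.5, 0.25, 0.0, 0.5]
```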

I have had a longer chat about the capabilities of the Raspberry Pi's
Broadcom QPU. It can be hacked to do realtime DSP, but, well, it's a
hack. It turns out the QPUs are 32-bit floating-point units, and the
APU (which contains the QPUs) already runs an RTOS of some sort. I
will include the log in a separate email under a different subject.

> On Feb 14, 2017, at 12:17 PM, cheater00 cheater00 <cheater00 at gmail.com> wrote:
> > Thanks. I wonder what other cheap computer boards have hardware shader engines?
> >
> > On Tue, 14 Feb 2017 14:40 Neil Johnson, <neil.johnson71 at gmail.com> wrote:
> >> Hi,
> >>
> >> Neil Johnson wrote:
> >> > If you can work with the slightly odd floating point format in single
> >> > precision GLES shader language then the shader engines in the Raspberry Pi
> >> > (1, 2 & 3) will give you about 24 GFLOPS.
> >>
> >> cheater00 cheater00 wrote:
> >> > Neil aren't the shader engines just in software on arm? Isn't it
> >> > better to just use the arm directly?
> >>
> >> No, they're hardware:
> >>
> >> Broadcom doc: https://docs.broadcom.com/docs/12358545
> >> Other: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV---BCM2835-Overview
> >




More information about the Synth-diy mailing list