[sdiy] Hardware convolution box?

Terry Shultz thx1138 at earthlink.net
Fri Feb 10 06:49:58 CET 2017


http://www.st.com/en/development-tools/apworkbench.html?sc=apworkbench

Use a Cortex-M7 Discovery board for more performance than the Cortex-M4.

regards,

Terry
> On Feb 9, 2017, at 9:24 PM, Terry Shultz <thx1138 at earthlink.net> wrote:
> 
> Another guy I know is Jason Kridner, who started BeagleBoard.org. See https://beagleboard.org/x15/
> 
> This is a monster performer: http://www.ti.com/product/am5728
> 
> Processor: TI AM5728 2×1.5-GHz ARM® Cortex-A15
>   2GB DDR3 RAM
>   4GB 8-bit eMMC on-board flash storage
>   2D/3D graphics and video accelerators (GPUs)
>   2×700-MHz C66 digital signal processors (DSPs)
>   2×ARM Cortex-M4 microcontrollers (MCUs)
>   4×32-bit programmable real-time units (PRUs)
> 
> Connectivity:
>   2×Gigabit Ethernet
>   3×SuperSpeed USB 3.0 host
>   HighSpeed USB 2.0 client
>   eSATA (500mA)
>   full-size HDMI video output
>   microSD card slot
>   Stereo audio in and out
>   4×60-pin headers with PCIe, LCD, mSATA
>   and much more: http://elinux.org/Beagleboard:BeagleBoard-X15
> 
> Software compatibility:
>   Debian
>   Android
>   Ubuntu
>   Cloud9 IDE on Node.js
>   plus much more
> 
> http://uk.rs-online.com/web/p/processor-microcontroller-development-kits/8874764/
> Cost is approx. 207.49, but it is backordered at this site until 18/04/2017.
> 
> More than enough horsepower for Linux and a convolution engine, I should think.
> 
> regards,
> 
> Terry
> 
> 
>> On Feb 9, 2017, at 6:26 PM, Terry Shultz <thx1138 at earthlink.net> wrote:
>> 
>> Check out my friend Dr. Paul Beckman’s site for tools: https://www.dspconcepts.com/audio-weaver
>> 
>> and my friend Tony Rouget’s site in Hong Kong: https://www.minidsp.com
>> 
>> and my pal Al Clark’s site: https://www.danvillesignal.com and https://www.danvillesignal.com/landing-pages/snowbird-audio
>> 
>> These are good examples of audio products that are better than what the DSP manufacturers themselves can build.
>> 
>> and lastly my old friend from MIT, Dr. Bill Gardner:
>> 
>> https://www.audiobuildersworkshop.com
>> 
>> hope this helps you guys a bit more.
>> 
>> regards,
>> 
>> Terry
>> 
>> 
>> 
>> 
>>> On Feb 9, 2017, at 5:20 PM, cheater00 cheater00 <cheater00 at gmail.com> wrote:
>>> 
>>> That makes sense. It's also a very solid way to do things, if one manages to dot all the i's so that the result is an accurate copy of the naive method.
>>> 
>>> 
>>> On Fri, 10 Feb 2017 02:01 Olivier Gillet, <ol.gillet at gmail.com> wrote:
>>> I think you're vastly overestimating how much computational resources
>>> this requires.
>>> 
>>> A well-known trick is to partition the head of the IR into small
>>> blocks (say 32 samples long, about 0.67 ms at 48kHz, if you want
>>> sub-millisecond latency), and use larger blocks for the tail of the IR
>>> (latency is not a problem for the tail). The whole convolution can be
>>> decomposed as a sum of convolutions by each of the blocks, each of
>>> which can be evaluated in the frequency domain: DFT of the input,
>>> complex multiplication by the DFT of the IR block, and inverse DFT.
>>> 
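The decomposition described above can be checked with a small offline sketch. It is only an illustration under stated assumptions: a pure-Python radix-2 FFT, block sizes chosen arbitrarily (2, 4, 8), and whole-signal processing rather than the real-time block scheduling a low-latency engine would need. The function names are hypothetical and do not come from the thread or from Gardner's paper.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_convolve(a, b):
    """Linear convolution of two real sequences via zero-padded FFTs."""
    n_out = len(a) + len(b) - 1
    size = 1
    while size < n_out:
        size *= 2
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    y = fft([p * q for p, q in zip(fa, fb)], inverse=True)
    return [v.real / size for v in y[:n_out]]  # scale the inverse transform


def direct_convolve(x, h):
    """Reference time-domain ("naive" MAC) convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


def partitioned_convolve(x, h, block_sizes):
    """Split h into consecutive blocks and sum the delayed per-block results."""
    assert sum(block_sizes) == len(h)
    y = [0.0] * (len(x) + len(h) - 1)
    offset = 0
    for size in block_sizes:
        part = fft_convolve(x, h[offset:offset + size])
        for i, v in enumerate(part):
            y[offset + i] += v  # a block starting at `offset` is delayed by `offset`
        offset += size
    return y


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0, 0.25]
    h = [0.5 * ((-1) ** n) / (n + 1) for n in range(14)]
    ref = direct_convolve(x, h)
    got = partitioned_convolve(x, h, [2, 4, 8])
    print("max abs error:", max(abs(a - b) for a, b in zip(ref, got)))
```

Small blocks at the head and larger ones toward the tail mirror the layout suggested above. In a real-time version each block's spectrum would be precomputed once and the input processed block by block.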
>>> I did some back-of-the-envelope computations and arrived at a result
>>> of 40 MMACs for a sample rate of 48kHz and a 2s-long IR.
>>> 
>>> I found it a bit too good to be true, so I went back to the source:
>>> 
>>> http://www.cs.ust.hk/mjg_lib/bibs/DPSu/DPSu.Files/Ga95.PDF
>>> p. 132, just before the beginning of section 6:
>>> 
>>> "Thus a filter of size 128K samples will require approximately 427
>>> multiples per output sample".
>>> 
>>> This assumes that the DFT of all the blocks the IR is made of has been
>>> pre-computed; but this can be done in faster than real-time when the
>>> IR is loaded, assuming you've got enough RAM.
>>> 
>>> Of course there's the issue of scheduling and a lot of additional
>>> bookkeeping, but at the very worst the order of magnitude we're in is
>>> hundreds of MMACs and a couple of MBytes of RAM.
>>> 
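The figures in this message can be reproduced with a few lines of arithmetic. This is a rough sketch, not a measurement. The 427 multiplies per output sample is the figure quoted above for a 128K-sample filter (131072 samples, about 2.7 s at 48 kHz), so it over-covers a 2 s IR. The RAM estimate assumes each IR block of N samples is zero-padded to 2N before a real FFT and stored as 32-bit floats; both are assumptions made here, and input buffers are not counted.

```python
SAMPLE_RATE_HZ = 48_000


def block_latency_ms(block_len, sample_rate_hz=SAMPLE_RATE_HZ):
    """Time needed to fill one input block, in milliseconds."""
    return 1000.0 * block_len / sample_rate_hz


def multiplies_per_second(mults_per_sample, sample_rate_hz=SAMPLE_RATE_HZ):
    """Multiply rate implied by a per-output-sample multiply count."""
    return mults_per_sample * sample_rate_hz


def spectra_ram_bytes(ir_seconds, sample_rate_hz=SAMPLE_RATE_HZ, bytes_per_float=4):
    """Rough storage for precomputed IR block spectra.

    Assumption: an N-sample block zero-padded to 2N yields about 2N floats
    of spectrum, so the total is about twice the IR length in floats.
    """
    ir_samples = int(ir_seconds * sample_rate_hz)
    return 2 * ir_samples * bytes_per_float


if __name__ == "__main__":
    print("32-sample block latency (ms):", block_latency_ms(32))
    print("multiplies per second at 427/sample:", multiplies_per_second(427))
    print("spectra RAM for a 2 s IR (bytes):", spectra_ram_bytes(2.0))
```

Under these assumptions the multiply rate is about 20.5 million per second and the stored spectra take well under 1 MB. Both are in line with the "hundreds of MMACs and a couple of MBytes" worst case.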
>>> On Fri, Feb 10, 2017 at 1:09 AM, cheater00 cheater00
>>> <cheater00 at gmail.com> wrote:
>>> > Found the right spot at the TI website. I've made a somewhat large
>>> > survey of AD and TI chips. I've uploaded the data to Google Docs (see
>>> > link at the end of this email).
>>> >
>>> > For a lot of power, TI can't be beat. Their chips are as cheap as
>>> > $0.78/GMACS, that's on TMS320C6678CYP, a chip with 8.5MB ram and 256
>>> > GMACS, $200 at Mouser.
>>> >
>>> > The cheapest TMS320C is TMS320C6652CZH6 with 19.2 GMACS, 1MB ram, at
>>> > $41.95 at Arrow.
>>> >
>>> > For cheap chips, AD is great. Their most powerful non-obsolete
>>> > offering is ADSP-BF561SKBCZ-5A, 2 GMACS, 328KB ram, $32.59 at Arrow,
>>> > for $16.30/GMACS. Some of their unusually cheap chips include:
>>> > ADSP-BF525BBCZ-5A, 1.2 GMACS, 132KB, $11.79 at Newark Element14 for $9.83/GMACS
>>> > ADSP-BF534BBCZ-4A, 1 GMACS, 134KB, $5.88 at Newark Element14 for $5.88/GMACS
>>> > ADSP-BF531SBBCZ400, 0.8 GMACS, 53KB, $4.44 at Avnet for $5.54/GMACS
>>> >
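The price-per-GMACS comparison above is simple division, sketched here with the part numbers, prices and GMACS figures exactly as listed in the message (the $200 Mouser price is the rounded figure given there). The names `CHIPS`, `usd_per_gmacs` and `ranked` are made up for this illustration.

```python
# (part number, price in USD, peak GMACS) as quoted in the message above
CHIPS = [
    ("TMS320C6678CYP", 200.00, 256.0),
    ("TMS320C6652CZH6", 41.95, 19.2),
    ("ADSP-BF561SKBCZ-5A", 32.59, 2.0),
    ("ADSP-BF525BBCZ-5A", 11.79, 1.2),
    ("ADSP-BF534BBCZ-4A", 5.88, 1.0),
    ("ADSP-BF531SBBCZ400", 4.44, 0.8),
]


def usd_per_gmacs(price_usd, gmacs):
    """Cost of one GMACS of peak throughput."""
    return price_usd / gmacs


def ranked(chips):
    """Return (part, usd_per_gmacs) pairs, cheapest throughput first."""
    rows = [(part, usd_per_gmacs(price, gmacs)) for part, price, gmacs in chips]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for part, cost in ranked(CHIPS):
        print(f"{part:22s} ${cost:6.2f}/GMACS")
```

Peak GMACS is a datasheet figure; sustained throughput in a real convolution loop depends on memory bandwidth and scheduling, which this ranking ignores.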
>>> > Those chips were noticeably (3-4x) cheaper than their close
>>> > counterparts, apparently Newark and Avnet have some sort of blowout.
>>> >
>>> > I stopped surveying AD chips around 1.2 GMACS. There are going to be
>>> > much cheaper ones than I found, I guess, but they just have so many
>>> > chips I'd spend 2 days figuring out the prices. It's obvious: their
>>> > stuff is cheap.
>>> >
>>> > AD are inexpensive, but clearly, if you need a lot of processing power
>>> > and/or a lot of memory the TI will be 5 to 10 times cheaper. $200
>>> > might not be so much if that's the majority of the cost of the box for
>>> > a DIY gamer.
>>> >
>>> > As far as evaluation boards go, the highest-powered TI board seems to
>>> > be the best value. The TMDSEVM6678L costs $399 on TI's website, has
>>> > 64MB Flash, 512 MB DDR3 SDRAM, gigabit Ethernet, USB mini-B, 80 IO
>>> > header and an AMC header with PCIe, an emulator port, a small FPGA for
>>> > configuration and booting, etc. See features at these two links:
>>> > http://www.ti.com/tool/tmdsevm6678#Technical%20Documents
>>> > http://www2.advantech.com/Support/TI-EVM/6678le_of.aspx
>>> >
>>> > I don't know if the USB can be used in host mode. Does anyone know?
>>> >
>>> > It is unclear to me which version of the chip this board has: the
>>> > 320 GMACS one at 1.25 GHz or the 256 GMACS one at 1 GHz.
>>> >
>>> > Finally, there is a version of this board that costs $599 (50% more)
>>> > and it has an XDS560V2 emulation mode. I understand that's a debugger.
>>> > I don't know why exactly it is significant. What advantages does this
>>> > bring for a developer?
>>> > Is the emulator port shown on advantech's website only available in
>>> > this more expensive version? If the cheaper version also has it, what
>>> > can it be used for if the XDS560V2 emulation mode is not available?
>>> >
>>> > Survey data is on Google Docs. Anyone can comment:
>>> >
>>> > https://docs.google.com/spreadsheets/d/1oT-9PVh8yZMMwAkpltqGo8mhSL1NLXZJiC-LH7swsbY/edit?usp=sharing
>>> >
>>> >
>>> > Have fun!
>>> >
>>> > On Thu, Feb 9, 2017 at 9:02 PM, cheater00 cheater00 <cheater00 at gmail.com> wrote:
>>> >> Yeah, USB host mode sounds super useful, unless SD would allow faster UI
>>> >> interaction.
>>> >>
>>> >> Do you know which TI chips have the most MMACS? I find the website
>>> >> confusing.
>>> >>
>>> >>
>>> >> On Thu, 9 Feb 2017 20:29 , <rsdio at audiobanshee.com> wrote:
>>> >>>
>>> >>> Based on your survey, I'd recommend the Analog Devices board, even though
>>> >>> I usually lean towards TMS320. The TMS320 family is huge, including both
>>> >>> fixed-point and floating-point, low-power and high-speed, old and new
>>> >>> designs, etc. Some of the TMS320 boards you listed are really geared more
>>> >>> towards motor control than audio, which is why they might be underpowered
>>> >>> for long impulse response convolution. I know that the AD SHARC family is
>>> >>> also large, and they're very popular, but I am less familiar with the
>>> >>> options.
>>> >>>
>>> >>> Don't forget to look at the chip manufacturer as a direct source for these
>>> >>> boards. I always buy directly from Texas Instruments because Digi-Key tends
>>> >>> to have a markup. Outside the US, maybe it's a different story due to
>>> >>> international availability.
>>> >>>
>>> >>> I'd recommend something like the 1MB 800 MMAC board and not worry about
>>> >>> external RAM. 1MB seems like plenty. I'd also recommend trying to implement
>>> >>> both the time domain convolution and the frequency domain version. There are
>>> >>> ways to reduce the latency of the frequency domain approach, and at least it
>>> >>> would allow for longer impulse responses to be supported. For IRs that are
>>> >>> short enough, the time domain approach would work. I've also seen papers on
>>> >>> combining the two, since LTI techniques can be run in parallel and summed.
>>> >>>
>>> >>> As for taking pairs of 16-bit samples to speed things up, be aware that
>>> >>> not all instructions can work that way. I think that most DSPs can do a few
>>> >>> simple operations on value pairs, but the most complex DSP instructions can
>>> >>> only handle full samples. DSP architectures have internal registers that are
>>> >>> much larger than the sample size, like 56-bit or higher. If you think about
>>> >>> all of the potential overflow when adding thousands of samples from an
>>> >>> impulse response, you can see why such large registers are needed. When
>>> >>> working in that model, it's not possible to handle the overflow from two
>>> >>> samples that are combined in a single 32-bit input value.
>>> >>>
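The overflow argument above can be made concrete with a worst-case bit count. This is a sketch under simple assumptions: signed two's-complement samples and coefficients of the same width, every product at its maximum magnitude, and no scaling or saturation. The accumulator widths compared (32, 40, 56 bits) are illustrative and not tied to any particular chip.

```python
def accumulator_bits(num_taps, sample_bits=16):
    """Signed accumulator width that can never overflow for num_taps MACs.

    Worst case: every product is (-2**(b-1)) * (-2**(b-1)) = 2**(2b-2),
    so the sum can reach num_taps * 2**(2b-2); one extra bit holds the sign.
    """
    max_magnitude = 1 << (sample_bits - 1)
    worst_sum = num_taps * max_magnitude * max_magnitude
    return worst_sum.bit_length() + 1


if __name__ == "__main__":
    taps = 96_000  # a 2 s impulse response at 48 kHz
    needed = accumulator_bits(taps)
    print("bits needed:", needed)
    for width in (32, 40, 56):
        print(width, "bit accumulator is", "enough" if width >= needed else "too small")
```

Real signals rarely hit this bound, but it shows why a 32-bit word holding two packed 16-bit samples leaves no headroom for a long sum of products.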
>>> >>> Finally, I think that nobody has made something like this because the user
>>> >>> interface would be rather difficult. It's a bit of a power-user effect. On
>>> >>> that note, some sort of SD card might be useful, so I can see why you're
>>> >>> looking into that. However, perhaps just a custom USB class device would be
>>> >>> enough of an interface to allow downloading impulse responses to the device.
>>> >>> At a minimum, you'll need a large Flash to store the current impulse
>>> >>> response, or some way to partition the program Flash to set aside room for
>>> >>> the data. The AD board with USB host mode could feasibly read directly from
>>> >>> a USB memory stick or Flash drive.
>>> >>>
>>> >>> Brian
>>> >>>
>>> >>>
>>> >>> On Feb 6, 2017, at 9:20 PM, cheater00 cheater00 <cheater00 at gmail.com> wrote:
>>> >>> > Brian, $50 is a steal. I've had a look at Digikey.
>>> >>> >
>>> >>> > This TI board is £24. It has ~150 KB on-chip RAM, but it has an
>>> >>> > integrated SDRAM interface.
>>> >>> >
>>> >>> > http://www.digikey.co.uk/product-detail/en/texas-instruments/LAUNCHXL-F28377S/296-42484-ND/5404239
>>> >>> >
>>> >>> >
>>> >>> > This TI board is £40. It has ~384 KB on-chip RAM and an integrated
>>> >>> > SRAM interface and SD card support. 200 MMACS.
>>> >>> >
>>> >>> > http://www.digikey.co.uk/product-detail/en/texas-instruments/TMDX5505EZDSP/296-24965-ND/2127652
>>> >>> >
>>> >>> > This AD board is £60. It has 1MB on-chip RAM and USB host mode; no
>>> >>> > idea about a RAM interface or SD card. 800 MMACS.
>>> >>> >
>>> >>> > http://www.digikey.co.uk/product-detail/en/analog-devices-inc/ADZS-BF706-EZMINI/ADZS-BF706-EZMINI-ND/5408943
>>> >>> >
>>> >>> > This last one sports 800 MMACS. Is this enough processing power for
>>> >>> > the 5-second convolution I mentioned above? It seemed to me like that
>>> >>> > would need 4608 MMACs. Maybe 2304 if we take pairs of 16 bit samples
>>> >>> > and treat them as 32 bit values. Are my numbers correct? Are there
>>> >>> > optimizations that can be done to lower this number, while still
>>> >>> > having zero latency? I understand FFT domain convolution introduces
>>> >>> > latency, which is not wanted in hardware. "Naive" MAC based
>>> >>> > convolution doesn't seem too far out of reach.
>>> >>> >
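The direct ("naive") cost asked about above is one MAC per IR tap per output sample, which a short sketch can evaluate. Note that 4608 MMACs corresponds to 96,000 taps at 48 kHz, that is, a 2-second IR; the sample rate and channel count behind the original 5-second figure are not stated in this part of the thread, so the values below are assumptions. The board count assumes perfect parallelisation at the quoted peak of 800 MMACS.

```python
import math


def direct_mmacs(ir_seconds, sample_rate_hz):
    """Millions of MACs per second for time-domain convolution, one channel."""
    taps = round(ir_seconds * sample_rate_hz)
    return taps * sample_rate_hz / 1e6


def chips_needed(required_mmacs, chip_mmacs=800.0):
    """Chips needed if the work splits perfectly and each runs at peak rate."""
    return math.ceil(required_mmacs / chip_mmacs)


if __name__ == "__main__":
    for seconds in (2.0, 5.0):
        need = direct_mmacs(seconds, 48_000)
        print(f"{seconds:.0f} s IR at 48 kHz: {need:.0f} MMACs,",
              chips_needed(need), "chips at 800 MMACS")
```

The cost grows linearly with IR length, so the direct method stays "zero latency" only at the price of several chips.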
>>> >>> > This TI board is £156. It has 256 KB on-chip RAM and support for DDR2
>>> >>> > SDRAM. No mention of MMACs but they say 3648 MIPS and I assume a
>>> >>> > pipelined MAC costs one instruction, would that be correct?
>>> >>> >
>>> >>> > http://www.digikey.co.uk/product-detail/en/texas-instruments/TMDSLCDK6748/TMDSLCDK6748-ND/5213032
>>> >>> >
>>> >>> >
>>> >>> > The more expensive boards don't seem to have more powerful DSP chips.
>>> >>> > And those chips don't really get much more powerful either. However,
>>> >>> > convolution is easily parallelised. So, worst case scenario, if you
>>> >>> > wanted really long impulse responses you'd have to use a few chips.
>>> >>> > However, even the really good 800 MMACS Blackfin ones are £15 unit
>>> >>> > price, so that's not so bad...
>>> >>> >
>>> >>> > So, tell me, why hasn't anyone made this yet?
>>> >>> >
>>> >
>>> > _______________________________________________
>>> > Synth-diy mailing list
>>> > Synth-diy at synth-diy.org
>>> > http://synth-diy.org/mailman/listinfo/synth-diy
