nice Sunday with convolution

Martin Czech czech at Micronas.Com
Mon Apr 17 19:22:20 CEST 2000


I wrote this nice mail earlier today, until the mail tool crashed!

Some people asked what all this raving about convolution is good for.
It is good for at least a nice Sunday, a splendid Sunday, I can tell you!
There are basically three things you can do:

1. short coefficient file

If the coefficient file is so short that it is shorter than the ear-brain
time resolution, we can describe the effect as filtering.  I.e. the
longer audio file is filtered with this specific filter.

Very sharp low pass, band pass, high pass and multiband filters are
possible.

E.g. a speech sample can be turned into a sine wave whose amplitude
more or less follows the speech input.

So far I have no filter creation software at home; I just guessed the
right filter impulse response and created it with Cool Edit.

A sine wave as coefficient file makes a bandpass. Several sine waves make
a multiband filter. The longer the filter, the sharper it is, and the
more ringing will be noticed.
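
The bandpass claim can be checked numerically. What follows is only a
minimal Python sketch, not the procedure used for the experiments here.
It assumes a sample rate of 8000 Hz and an unwindowed 1000 Hz sine burst
as coefficient file; leakage() is a hypothetical helper that compares the
gain 110 Hz off centre with the gain at the burst's own frequency.

```python
import cmath
import math

RATE = 8000  # assumed sample rate in Hz


def sine_burst(freq_hz, taps):
    """A plain sine burst used directly as the filter coefficients."""
    return [math.sin(2 * math.pi * freq_hz * n / RATE) for n in range(taps)]


def gain(h, freq_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / RATE
    return abs(sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(h)))


def leakage(h, centre_hz, off_hz):
    """Gain at off_hz relative to the gain at the burst's own frequency."""
    return gain(h, off_hz) / gain(h, centre_hz)


short = sine_burst(1000, 50)    # short coefficient file: broad bandpass
long_ = sine_burst(1000, 400)   # longer coefficient file: sharper bandpass

print("50 taps :", leakage(short, 1000, 1110))
print("400 taps:", leakage(long_, 1000, 1110))
```

With these assumed values the 400-tap burst lets through much less of
the 1110 Hz tone than the 50-tap burst, which is the "longer is sharper"
behaviour described above.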

Since this ringing is the predominant effect in vocoder speech, it is no
wonder that it sounds vocoderish.

Of course square and saw waves create a lot of bands at their
harmonics; this sounds much like resonant delays (think of wave guide
techniques to create harmonic waves).

As the number of filter coefficients grows, we get to case 2:

2. long coefficient file

Now it is possible to hear delay or reverb effects, provided that the
coefficient file has some peaks and notches.

This reminds me that I should have tried a long noise file with no
envelope... Must try it at home...

Ok, noise with a normal AD envelope gives reverb, an inverted
(time-reversed) envelope gives...  right! backward reverb, something
that was done with tape in the early days.  IIR reverb does it via time
variant coefficients.  Convolution does it in one pass.
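
A rough Python sketch of the noise experiment, under stated assumptions:
the envelope here is a plain exponential decay rather than a real AD
envelope, the lengths and the decay rate are made up, and noise_tail()
and energy() are hypothetical helpers. Reversing the coefficient list in
time turns the fading tail into a swelling one.

```python
import math
import random


def convolve(x, h):
    """Brute force linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def noise_tail(length, decay, seed=0):
    """White noise shaped by an exponential decay (assumed envelope)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) * math.exp(-decay * n) for n in range(length)]


def energy(samples):
    """Sum of squared samples."""
    return sum(v * v for v in samples)


reverb = noise_tail(400, 0.01)   # loud start, fading tail
backward = reverb[::-1]          # same noise, reversed in time

click = [1.0] + [0.0] * 99       # a single click as test input
wet = convolve(click, reverb)
wet_backward = convolve(click, backward)

half = len(wet) // 2
print("forward : early", energy(wet[:half]), "late", energy(wet[half:]))
print("backward: early", energy(wet_backward[:half]),
      "late", energy(wet_backward[half:]))
```

For the forward coefficients most of the energy sits in the first half
of the output, for the reversed ones in the second half.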

I tried a piece of silence, where I dropped a sine ping here and there.

Nice echoes with ping coloration. Of course, every echo time can be
chosen arbitrarily.

Next I tried some peaks in the silence that get closer and closer, i.e.
echoes that are speeding up. I should do this via software next time.
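
Doing this via software could look roughly like the Python sketch below.
The function accelerating_peaks() and its parameters (first gap in
samples, shrink ratio, number of peaks) are made up for illustration,
and the two-sample "ping" only stands in for a real input sound.

```python
def convolve(x, h):
    """Brute force linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def accelerating_peaks(first_gap, ratio, count):
    """Unit peaks in silence; each gap is `ratio` times the previous one."""
    positions = [0]
    gap = float(first_gap)
    for _ in range(count - 1):
        positions.append(positions[-1] + max(1, round(gap)))
        gap *= ratio
    coeffs = [0.0] * (positions[-1] + 1)
    for p in positions:
        coeffs[p] = 1.0
    return coeffs, positions


coeffs, positions = accelerating_peaks(first_gap=1000, ratio=0.8, count=6)
gaps = [b - a for a, b in zip(positions, positions[1:])]

ping = [1.0, -0.5]              # stand-in for a short input sound
out = convolve(ping, coeffs)    # one copy of the ping per peak

print("peak positions:", positions)
print("gaps          :", gaps)
```

Each peak in the coefficient file produces one copy of the input, so the
shrinking gaps are heard as echoes that speed up.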

3. neither

These are the cases when filtering and time domain effects are there at
the same time.

I took two speech samples. A complete mess resulted that I cannot
describe: each wave filters the other to some extent, and each peak
of one wave fires a new copy of the other wave. This is why the result
file is always about as long as the two individual files added
together.
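
A tiny worked example in Python shows both points: every peak of one
signal fires a scaled copy of the other, and the result has
len(x) + len(h) - 1 samples. The numbers are arbitrary.

```python
def convolve(x, h):
    """Brute force linear convolution of two sample lists."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


x = [1, 0, 0, 0, 2]   # two peaks of height 1 and 2
h = [3, 1]            # the "other wave"
y = convolve(x, h)

print(y)        # [3, 1, 0, 0, 6, 2]
print(len(y))   # 6, i.e. len(x) + len(h) - 1
```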

Next I created a sine siren, 10s rise from 10Hz to 3000Hz.
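
Such a siren can be generated in a few lines. This Python sketch assumes
a linear rise (the sweep shape is not stated above) and a sample rate of
8000 Hz; siren() and zero_crossings() are hypothetical helpers.

```python
import math


def siren(f_start, f_end, seconds, rate):
    """Sine sweep whose frequency rises linearly from f_start to f_end."""
    total = int(seconds * rate)
    sweep = []
    for n in range(total):
        t = n / rate
        # The phase is the integral of the instantaneous frequency over time.
        phase = 2 * math.pi * (f_start * t
                               + (f_end - f_start) * t * t / (2 * seconds))
        sweep.append(math.sin(phase))
    return sweep


def zero_crossings(samples):
    """Count sign changes, a crude measure of how fast the wave oscillates."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)


RATE = 8000                        # assumed sample rate in Hz
sweep = siren(10, 3000, 10, RATE)  # 10 s rise from 10 Hz to 3000 Hz

print("crossings in first second:", zero_crossings(sweep[:RATE]))
print("crossings in last second :", zero_crossings(sweep[-RATE:]))
```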

Together with speech this creates rising words, siren like, very
strange.

Next thing was a single word "six" and a drum loop. A diffuse floor
where a lot of sixes ride on the drum's peaks. Talking drums, I guess!

The problem with time varying filter coefficients is that they do not
work as expected. If you take a siren, you could perhaps expect a band
pass sweep. Yes, it is something like that, but the sentence is triggered
a lot of times during the whole sweep.

A lot of very nice results, you just need some understanding and lots
of fantasy to create something unique.

Well, I recently upgraded to 64MB DRAM, it helps a lot.

After some hours of thinking I decided to rewrite the fast convolution
part from scratch. So far it was brute force: ridiculous amounts of
storage and still slow.

Somebody asked if this could theoretically work in real time, and I
answered that the initial delay would be as long as the coefficient input.
That is not true!

Basically I use an overlap add algorithm, i.e. I take a short chunk of
audio and convolve it with the whole coefficient set. Then comes the
next chunk, but to its result I have to add the overlapping samples from
the end of the previous convolution.  This way a lot of partial
convolutions add up to the complete one.
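
The overlap add idea in a minimal Python sketch. This is not my actual
program: to keep it short, each chunk is convolved by brute force where
a fast version would use an FFT, and the block size and test data are
arbitrary.

```python
def convolve(x, h):
    """Brute force linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def overlap_add(x, h, block):
    """Convolve x with h one chunk of `block` input samples at a time."""
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        chunk = x[start:start + block]
        part = convolve(chunk, h)   # a fast version would use an FFT here
        for i, v in enumerate(part):
            # The tail of `part` reaches past this chunk and is added to
            # the output region that the following chunks also write to.
            y[start + i] += v
    return y


x = [float(v) for v in range(1, 24)]   # 23 input samples
h = [1.0, -2.0, 3.0, 0.5, 4.0]         # 5 coefficients

print(overlap_add(x, h, block=4) == convolve(x, h))
```

Only one chunk of audio and its partial result have to be held at a
time, which is why the complete audio file need not sit in memory.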

And I don't need to have the complete audio file in DRAM storage.  Now,
this can be applied also to the coefficient data.  This represents
a transversal filter, it can be cut into a lot of smaller filters if
the output is added in the right way.  So, next time I will do double
overlap add, for audio and coefficients. This will save memory, and it
will reduce FFT length by several orders of magnitude. This should save
a lot of computation time.
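
A sketch of the planned double overlap add in Python, again with brute
force convolution standing in for the FFT. The function name
partitioned_convolve() and the block sizes are made up; the point is
only that each partial result is added at the offset of its audio chunk
plus the offset of its coefficient chunk.

```python
def convolve(x, h):
    """Brute force linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def partitioned_convolve(x, h, x_block, h_block):
    """Split both audio and coefficients into chunks and add the pieces."""
    y = [0.0] * (len(x) + len(h) - 1)
    for h_start in range(0, len(h), h_block):
        h_part = h[h_start:h_start + h_block]      # one small filter
        for x_start in range(0, len(x), x_block):
            x_part = x[x_start:x_start + x_block]  # one audio chunk
            part = convolve(x_part, h_part)
            offset = x_start + h_start             # where this piece belongs
            for i, v in enumerate(part):
                y[offset + i] += v
    return y


x = [float(v) for v in range(1, 24)]                       # 23 samples
h = [1.0, -2.0, 3.0, 0.5, 4.0, -1.0, 2.0, 0.0, 1.5, -3.0]  # 10 coefficients

print(partitioned_convolve(x, h, x_block=4, h_block=3) == convolve(x, h))
```

Every partial convolution now involves only x_block + h_block - 1 output
samples, so the transform length no longer depends on the total length
of the coefficient file.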

If the DRAM is still too small for all the coefficient data, it could be
stored on disk and fetched on demand. This is much better than windooze
swapping and allows for "very long filter" applications, like convolving
entire pieces with each other (earthquake scientists need that too;
any earthquake hobbyists around here?).

So breaking everything down into bits and pieces makes it possible
to compute the very first output samples via direct convolution, i.e.
they drop out immediately; the rest follows by the usual double overlap
fast convolution. So the answer should have been: yes, real time is
possible, but I'm not doing this. It would mean getting too deep into
the guts of windooze, and they stink.
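
Why the first samples can drop out immediately is perhaps easiest to see
in code. The Python sketch below is only an illustration under
assumptions, not a real-time implementation and not a description of any
existing device: stream_convolve() is a made-up name, the coefficients
are split into a head of `block` taps and a tail, and the tail is
convolved by brute force where a fast version would use FFT blocks.

```python
def convolve(x, h):
    """Brute force linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def stream_convolve(x, h, block):
    """Emit each output sample as soon as its input sample has arrived."""
    head, tail = h[:block], h[block:]
    total = len(x) + len(h) - 1
    pending = [0.0] * total      # tail contributions scheduled for later
    out = []
    for start in range(0, len(x), block):
        chunk = x[start:start + block]
        # Head: direct convolution, needs only inputs that already arrived.
        for n in range(start, start + len(chunk)):
            acc = pending[n]
            for j, c in enumerate(head):
                if n - j >= 0:
                    acc += c * x[n - j]
            out.append(acc)
        # Tail: affects output only from start + block on, so it can be
        # computed block-wise after the chunk is complete.
        for i, v in enumerate(convolve(chunk, tail)):
            pending[start + block + i] += v
    # Input has ended: flush the remaining output samples.
    for n in range(len(x), total):
        acc = pending[n]
        for j, c in enumerate(head):
            if 0 <= n - j < len(x):
                acc += c * x[n - j]
        out.append(acc)
    return out


x = [float(v) for v in range(1, 24)]                            # 23 samples
h = [1.0, -2.0, 3.0, 0.5, 4.0, -1.0, 2.0, 0.0, 1.5, -3.0, 2.5]  # 11 taps

print(stream_convolve(x, h, block=4) == convolve(x, h))
```

The tail of the coefficients cannot influence the output before `block`
samples have passed, and that slack is the time available for the slower
block-wise part.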

I guess that's the way this expensive red Sony device works...


m.c.








More information about the Synth-diy mailing list