[sdiy] vocoder filters

Richie Burnett rburnett at richieburnett.co.uk
Sun Sep 1 21:22:20 CEST 2019

One way to control the spectral broadening due to the amplitude modulation 
is obviously to band-limit the VCA's control signals.  I guess most VCA CVs 
have a faster attack than release, so it's mostly limiting the attack speed 
that we're talking about here.  High-order LPFs applied to the VCA CVs give 
a more favourable trade-off between settling time and residual ripple.
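
To put rough numbers on that trade-off, here is a small Python sketch that 
smooths a full-wave rectified sine with cascaded one-pole low-pass sections 
and reports ripple and 90% settling time.  Everything in it (sample rate, 
200Hz test tone, the 20Hz and 100Hz cutoffs, the helper names) is an 
assumption chosen for illustration, not taken from any particular vocoder.

```python
import math

def smooth(signal, cutoff_hz, fs, order):
    """Pass `signal` through `order` cascaded one-pole low-pass sections."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)  # one-pole coefficient
    states = [0.0] * order
    out = []
    for x in signal:
        for i in range(order):
            states[i] += a * (x - states[i])
            x = states[i]
        out.append(x)
    return out

def ripple_and_settling(cv, fs, tail_samples):
    """Return (peak-to-peak ripple / mean, time to first reach 90% of mean)."""
    tail = cv[-tail_samples:]
    final = sum(tail) / len(tail)
    ripple = (max(tail) - min(tail)) / final
    settle = next(i for i, v in enumerate(cv) if v >= 0.9 * final) / fs
    return ripple, settle

fs = 48000.0   # assumed sample rate
f_in = 200.0   # assumed test tone, so the rectified ripple sits at 400 Hz
rectified = [abs(math.sin(2.0 * math.pi * f_in * n / fs)) for n in range(9600)]

# Measure over the last 1200 samples (ten full ripple periods).
one_pole = ripple_and_settling(smooth(rectified, 20.0, fs, 1), fs, 1200)
four_pole = ripple_and_settling(smooth(rectified, 100.0, fs, 4), fs, 1200)
print("1 pole  @  20 Hz: ripple %.4f, settles in %.1f ms" % (one_pole[0], 1e3 * one_pole[1]))
print("4 poles @ 100 Hz: ripple %.4f, settles in %.1f ms" % (four_pole[0], 1e3 * four_pole[1]))
```

With these assumed values the 4-pole smoother should come out ahead on both 
counts, which is the point about higher-order CV filtering.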

Another way to control the spectral broadening due to amplitude modulation 
in the VCA, and stop it bleeding into the adjacent bands, is to put the VCAs 
in-between the filter stages that make up each of the synthesis band 
filters.  This is what Roland did in their SVC-350 Vocoder.  It was probably 
done as much to filter out the PWM carrier frequency, but putting some of 
the vocoder's synthesis band filtering after the VCA helps to control 
out-of-band leakage caused by any fast VCA modulation.
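
As a rough illustration of why filtering after the VCA helps: a gain of 
1 + m*cos(2*pi*fm*t) applied to a partial at fc produces sidebands at 
fc - fm and fc + fm, each with amplitude m/2.  The Python sketch below checks 
that identity numerically and then evaluates how much an ideal 2-pole 
band-pass stage placed after the VCA would attenuate those sidebands.  The 
frequencies, modulation depth, and Q are assumed values, and the band-pass 
is a textbook idealisation, not the SVC-350's actual circuit.

```python
import math

def bp_gain_db(f, f0, q):
    """Gain in dB of an ideal 2-pole band-pass with unity gain at f0."""
    x = q * (f / f0 - f0 / f)
    return -10.0 * math.log10(1.0 + x * x)

f_c = 1000.0   # assumed carrier partial at the band centre, Hz
f_m = 200.0    # assumed frequency component in the VCA control signal, Hz
m = 0.5        # assumed modulation depth
q = 5.0        # assumed Q of the post-VCA filter stage

# Check the identity at an arbitrary instant: VCA output vs. carrier + sidebands.
t = 0.000123
vca_out = (1.0 + m * math.cos(2 * math.pi * f_m * t)) * math.cos(2 * math.pi * f_c * t)
three_lines = (math.cos(2 * math.pi * f_c * t)
               + 0.5 * m * math.cos(2 * math.pi * (f_c + f_m) * t)
               + 0.5 * m * math.cos(2 * math.pi * (f_c - f_m) * t))
print("identity error: %.2e" % abs(vca_out - three_lines))

for f_side in (f_c - f_m, f_c + f_m):
    print("sideband at %.0f Hz: post-VCA stage gain %.1f dB"
          % (f_side, bp_gain_db(f_side, f_c, q)))
```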

On a related note, the conundrum I always face when thinking about Vocoder 
analysis and synthesis filter banks is how to deal with the bottom few 
bands...  The job of the analysis filter bank is to determine the spectral 
profile of the speech at any particular instant in time.  So ideally I want 
shed loads of closely spaced bands so that I get a fine-resolution picture 
of what the spectral profile is doing across the full speech range.  For 
instance, I might choose to use 24 bands spaced 1/4 octave exponentially to 
cover the 6 octave range from 80Hz up to 5.12 kHz.  This seems sensible as 
there probably aren't many vocal tract resonances outside this range that 
noticeably affect speech intelligibility.  However, a problem arises with the 
bottom few bands.  All of the bands have the same Q and the same fractional 
bandwidth, but the absolute bandwidth of the lower bands becomes only a 
handful of Hertz.  For example the bottom bands in my setup would be:

160-190Hz, *
320-381Hz, *
538-640Hz, *
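
For reference, the band edges follow directly from the numbers given above 
(24 bands, 1/4 octave each, starting at 80Hz).  This short Python sketch 
prints the bottom twelve bands with their absolute widths, including the 
bands that sit between the starred ones.  The function name and defaults 
are just for illustration.

```python
def band_edges(f_low=80.0, bands_per_octave=4, n_bands=24):
    """Edges of exponentially spaced bands: f_low * 2**(k / bands_per_octave)."""
    return [f_low * 2.0 ** (k / bands_per_octave) for k in range(n_bands + 1)]

edges = band_edges()
for lo, hi in zip(edges[:12], edges[1:13]):
    print("%6.1f - %6.1f Hz   width %5.1f Hz" % (lo, hi, hi - lo))
```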

This can lead to the frustrating situation where harmonics of the speech 
signal only fall into certain analysis bands at the low-frequency end of the 
speech range.  For example a female voice might easily have a fundamental 
frequency of 180Hz.  This would only put energy into the analysis bands 
containing 180, 360, 540, 720 Hz, etc., marked with the stars above, 
with nothing in the intervening bands.  This situation persists until 
eventually we get to a point where more than one harmonic of the voice 
signal starts to fit into each analysis band at some higher frequency.  Then 
the holes in the analysis spectrum disappear.
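
The effect is easy to check numerically.  The sketch below (Python, with a 
made-up function name, and assuming ideal brick-wall band edges rather than 
real filter skirts) lists which of the 24 quarter-octave bands contain no 
harmonic of a given fundamental.  Bands lying entirely below the fundamental 
are skipped, since no voiced energy is expected there anyway.

```python
def empty_bands(f0, f_low=80.0, bands_per_octave=4, n_bands=24):
    """Return (lo, hi) for every band above f0 that holds no harmonic of f0."""
    edges = [f_low * 2.0 ** (k / bands_per_octave) for k in range(n_bands + 1)]
    harmonics = [n * f0 for n in range(1, int(edges[-1] // f0) + 1)]
    holes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        if hi <= f0:
            continue  # band is entirely below the fundamental
        if not any(lo <= h < hi for h in harmonics):
            holes.append((lo, hi))
    return holes

holes = empty_bands(180.0)
for lo, hi in holes:
    print("no harmonic of 180 Hz in %6.1f - %6.1f Hz" % (lo, hi))
```

Once a band is wider than the fundamental it must contain at least one 
harmonic, which is why the holes stop higher up the spectrum.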

Clearly the analogue filter bank in this example is doing a poor job of 
estimating the spectral shaping of the vocal tract when the bands are made 
too narrow because they start to resolve the actual discrete spectral 
harmonics of the vocal signal.  This then leads to the situation where the 
output signal from the vocoder is very quiet unless the pitches of the voice 
and carrier inputs are related, so that harmonics of the speech and carrier 
inputs both land in the same frequency bands (>.<)  This isn't right.

The problem is that I don't know the best way around this.  I could make the 
analysis and synthesis bands wider and use fewer bands, but that seems a 
waste when modern DSPs have got enough grunt to run 40 bands easily.  Or I 
could make the lower frequency bands wider in an attempt to make sure that 
some harmonics of a given voice signal end up in every band at the lower end 
of the spectrum.  This is what Sennheiser seem to have done in at least one 
of their Vocoders, and it makes sense.  However, I liked the idea of having 
filter bands that are evenly spaced in octaves because I can then shift 
formants up or down easily by routing the outputs from analysis bands to 
adjacent synthesis bands across the audio range.  This isn't as simple if I 
alter the fractional bandwidths of the filters across the audio spectrum. 
The only other idea I had was to try to come up with some sort of 
interpolation algorithm to work around the "holes" in the spectrum at the 
bottom end generated by the analysis filter bank, which are particularly bad 
with voices that have a relatively high fundamental.
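
One very simple form such an interpolation could take is sketched below in 
Python.  It is only a guess at an approach, not a tested vocoder algorithm: 
it assumes you already know which bands contain a harmonic (for example from 
a pitch estimate), and it fills the remaining bands by interpolating 
linearly in dB between the nearest occupied neighbours.  Because the bands 
are evenly spaced in octaves, the band index doubles as a log-frequency 
axis.  The function and argument names are hypothetical.

```python
def fill_holes(levels_db, occupied):
    """Replace levels of unoccupied bands by interpolating between neighbours.

    levels_db: measured envelope level of each band, in dB
    occupied:  True where the band is known to contain a voice harmonic
    """
    filled = list(levels_db)
    known = [i for i, ok in enumerate(occupied) if ok]
    if not known:
        return filled  # nothing to interpolate from
    for i in range(len(filled)):
        if occupied[i]:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = levels_db[right]   # below the first occupied band: hold
        elif right is None:
            filled[i] = levels_db[left]    # above the last occupied band: hold
        else:
            t = (i - left) / (right - left)
            filled[i] = levels_db[left] + t * (levels_db[right] - levels_db[left])
    return filled

example = fill_holes([-10.0, -60.0, -60.0, -60.0, -18.0],
                     [True, False, False, False, True])
print(example)
```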

I guess ultimately I need to accept that the analogue analysis filter-bank 
paradigm is a compromise.  The best way to estimate the vocal 
tract resonances is probably to window one complete pitch period of the 
speech signal and then FFT it to get the spectrum.  That way you get a 
continuous spectrum, instead of a line spectrum, out of the analysis process.
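
A minimal Python sketch of the single-period idea follows.  It assumes the 
pitch is already known, uses a synthetic "voice" consisting of one damped 
700Hz resonance excited once per period, and uses a plain zero-padded DFT in 
place of an FFT to stay dependency-free.  All values and names are 
assumptions for illustration.

```python
import cmath
import math

def one_period_spectrum(period, fs, n_fft=1024):
    """Magnitude of the zero-padded DFT of a single pitch period."""
    freqs, mags = [], []
    for k in range(n_fft // 2 + 1):
        step = cmath.exp(-2j * math.pi * k / n_fft)
        acc, phasor = 0j, 1 + 0j
        for sample in period:        # samples beyond the period are zero
            acc += sample * phasor
            phasor *= step
        freqs.append(k * fs / n_fft)
        mags.append(abs(acc))
    return freqs, mags

fs = 16000.0                 # assumed sample rate
f0 = 180.0                   # assumed, already-known pitch
n_period = round(fs / f0)    # samples in one pitch period
period = [math.exp(-n / (0.0015 * fs)) * math.sin(2 * math.pi * 700.0 * n / fs)
          for n in range(n_period)]

freqs, mags = one_period_spectrum(period, fs)
peak_hz = freqs[mags.index(max(mags))]
print("spectrum sampled every %.1f Hz, peak near %.0f Hz" % (freqs[1], peak_hz))
```

The spectrum comes out on a grid much finer than the 180Hz harmonic spacing, 
so the resonance shows up even though no harmonic lands exactly on it.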

Any thoughts and comments welcome.


-----Original Message----- 
From: Mattias Rickardsson
Sent: Sunday, September 01, 2019 4:54 PM
To: David G Dixon
Cc: Synth DIY
Subject: Re: [sdiy] vocoder filters

The next level is to ponder the unwanted higher-frequency AM effects from 
controlling a vocoder band VCA with the "best" (fastest) envelope follower, 
and whether a slower response could be closer to optimal. So much fun! :-)


On Sat, 31 Aug 2019 at 23:23, David G Dixon <dixon at mail.ubc.ca> wrote:

Not really.  It’s a full-wave rectifier followed by a standard LP filter 
stage tuned to about 1/4 of the expected incoming frequency, followed by a 
notch filter tuned to twice the incoming frequency.  I use a full-wave 
rectifier instead of the normal half-wave rectifier because I figure this 
gives faster integration.  This is why the ripple is at twice the incoming 
frequency, and a notch filter knocks it out nicely.  Through the judicious 
choice of gain at the LP filter, the envelope follows the waveform tops more 
or less exactly, and comes up to full strength within two periods of the 
incoming waveform, with ripple which is inconsequential.  For a 10Vpp 
waveform coming in, the envelope rides at 5V, which will turn on my favored 
linearized 2164 VCA design to unity gain.
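
For anyone who wants to experiment with that topology in software, here is a 
rough digital stand-in in Python.  It is not the analogue circuit described 
above.  It assumes a 2-pole Butterworth low-pass, a notch with Q = 1, biquad 
coefficients from the widely used RBJ audio EQ cookbook, and a gain of pi/2 
(the mean of a full-wave rectified sine is 2/pi of its peak, so this gain 
makes the envelope sit at the peak value).  All of those choices are mine.

```python
import math

def biquad_coeffs(kind, f0, q, fs):
    """RBJ-cookbook biquad coefficients, normalised so that a0 = 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    c = math.cos(w0)
    if kind == "lowpass":
        b = [(1.0 - c) / 2.0, 1.0 - c, (1.0 - c) / 2.0]
    elif kind == "notch":
        b = [1.0, -2.0 * c, 1.0]
    else:
        raise ValueError(kind)
    a0 = 1.0 + alpha
    return [v / a0 for v in b], [-2.0 * c / a0, (1.0 - alpha) / a0]

def run_biquad(x, b, a):
    """Direct form I: y = b0*x + b1*x1 + b2*x2 - a1*y1 - a2*y2."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        out.append(yn)
    return out

def envelope(signal, f_in, fs):
    """Full-wave rectify, low-pass at f_in/4, notch at 2*f_in, scale by pi/2."""
    rectified = [abs(v) for v in signal]
    lp = run_biquad(rectified, *biquad_coeffs("lowpass", f_in / 4.0, 0.7071, fs))
    notched = run_biquad(lp, *biquad_coeffs("notch", 2.0 * f_in, 1.0, fs))
    return [0.5 * math.pi * v for v in notched]

fs, f_in = 48000.0, 1000.0    # assumed sample rate and band frequency
sig = [5.0 * math.sin(2.0 * math.pi * f_in * n / fs) for n in range(2400)]  # 10Vpp
env = envelope(sig, f_in, fs)
tail = env[-480:]             # last 10 ms, well after settling
env_mean = sum(tail) / len(tail)
env_ripple = max(tail) - min(tail)
print("envelope %.3f V, ripple %.4f V p-p" % (env_mean, env_ripple))
```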

From: David Moylan [mailto:dave at expeditionelectronics.com]
Sent: Saturday, August 31, 2019 5:26 AM
To: David G Dixon
Cc: synth-diy at synth-diy.org
Subject: Re: [sdiy] vocoder filters

Curious about this envelope follower you mention. Trade secret?

On Aug 31, 2019 03:48, David G Dixon <dixon at mail.ubc.ca> wrote:

Well, I know that the higher-Q filters have a longer delay, so that they 
take longer to respond to the incoming waveform.  I’m thinking that a Q of 
about 3 is probably about right, and with that, only a 4-pole filter is 

I’ve got a nice design for an envelope follower which responds quickly and 
has little or no ripple, so that’s not a problem.

On a related note, does anyone here have problems getting the Bode plotter 
in Multisim to work consistently?  I am finding with this simulation that 
sometimes if I change the component values, the Bode plotter doesn’t work at 
all.  Also, for some simulations, changing the component values doesn’t 
change the filter response at all.  Multisim is sure glitchy.  It’s very 
frustrating.  I can sometimes fix it if I erase all of the passive 
components and load new ones with the new values, rather than just changing 
them, but that sort of thing is just complete bullshit.  Multisim is a sad 
excuse for a professional program.  There must be something better out there 

From: Paul Perry [mailto:paulfrancisperry at gmail.com]
Sent: Friday, August 30, 2019 10:22 PM
To: David G Dixon
Subject: Re: [sdiy] vocoder filters

I don't think there is a "right" answer. To my mind, it depends on what one 
wants to do with the unit. Think about what will happen when a single swept 
tone is used to modify white noise. The low pass filter on the VCAs probably 
has a significant effect as well.

paul perry Melbourne Australia

On Sat, 31 Aug 2019 at 14:50, David G Dixon <dixon at mail.ubc.ca> wrote:

Well, I think I might have answered my own question.  Looking again at the 
JH Living Vocoder, since Jurgen Haible’s filter responses overlap at about 
the 8dB mark, it really should not matter at all what’s going on around the 
skirt of the response, and higher-Q filters with two 2-pole stages should 
give very similar results to low-Q filters with four 2-pole stages (and be 
much much cheaper to build).

I’d still appreciate if anyone has any specific insights into this problem. 

From: Synth-diy [mailto:synth-diy-bounces at synth-diy.org] On Behalf Of David 
G Dixon
Sent: Friday, August 30, 2019 9:05 PM
To: synth-diy at synth-diy.org
Subject: [sdiy] vocoder filters

Hey SDIY Team!

I’m thinking about building a vocoder, and I have a general question about 
the bandpass filters.

I’ve looked at Jurgen Haible’s Living Vocoder, and he used 8-pole filters 
with low Q.  These give a reasonably broad band with fairly steep slopes. 
He makes the filters from two pairs of LP and HP.

I was thinking about using BP filter sections, but just 4-pole, and with 
higher Q (around 10).  This gives a somewhat narrower band, and the slope is 
steep near the corner, but fairly shallow around the skirt.  This idea uses 
a lot fewer components (about half as many).
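
To see the "steep near the corner, shallow around the skirt" behaviour in 
numbers, here is a small Python sketch of the 4-pole option.  It assumes two 
identical, synchronously tuned 2-pole band-pass sections with the textbook 
magnitude response, and compares Q = 10 with Q = 3 at an arbitrary 1kHz 
centre.  It does not model the LP/HP-pair filters of the Living Vocoder.

```python
import math

def cascade_bp_db(f, f0, q, stages):
    """Gain in dB of `stages` identical ideal 2-pole band-passes tuned to f0."""
    x = q * (f / f0 - f0 / f)
    return -10.0 * stages * math.log10(1.0 + x * x)

f0 = 1000.0   # assumed band centre, Hz
for octaves in (0.25, 0.5, 1.0, 2.0, 4.0):
    f = f0 * 2.0 ** octaves
    print("%4.2f oct above centre: Q=10 -> %6.1f dB, Q=3 -> %6.1f dB"
          % (octaves, cascade_bp_db(f, f0, 10.0, 2), cascade_bp_db(f, f0, 3.0, 2)))

# Far from the centre the slope tends to 6 dB/octave per 2-pole section.
far_slope = cascade_bp_db(16 * f0, f0, 10.0, 2) - cascade_bp_db(32 * f0, f0, 10.0, 2)
print("far-skirt slope: %.1f dB/octave" % far_slope)
```

Raising Q steepens the response close to the centre, but the far-skirt 
slope is set only by the number of poles.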

What I’m asking is, does anybody here have any insight into what the 
“proper” approach to vocoder filters would be?  What is the design goal? 
Do you want significant overlap from one band to the next, or should they be 
fairly distinct?  I guess I’m just looking for some general guidelines and 
conventional wisdom.


Dave Dixon

Synth-diy mailing list
Synth-diy at synth-diy.org
