[sdiy] Beat tones, Fourier analysis, and nonlinearities in the ear [was Phase shifts and instantaneous frequency]

Aaron Lanterman lanterma at ece.gatech.edu
Wed Jul 16 04:20:21 CEST 2008


On Jul 15, 2008, at 9:04 PM, Ian Fritz wrote:

> Again, the power spectrum is constant in time.  So doesn't linear  
> signal theory say you would not hear beats?  That's the first-order  
> picture, that the ear does a linear Fourier analysis.  And if you  
> listen at a low volume level you don't hear the beats, whereas they  
> are very pronounced at high volume levels.  How do you explain that  
> with a linear theory?  *Any* linear theory?

Oh, the ear and auditory system are all sorts of nonlinear. I'm not
disputing that. I'm just unsure about the theory that a 100 Hz tone and
a 201 Hz tone are multiplying in the ear and creating a 101 Hz spur
that beats against the 100 Hz tone, creating the 1 Hz that you
hear. That might be happening, or something else might be happening.
You can directly _see_ the 1 Hz pattern in the waveform of the plain
sum without having to go into your synthesis program and multiply
anything with anything else to simulate the _particular_ nonlinearity
that would give you a 101 Hz spur. You can see it without invoking any
models of the ear at all. Of course, there, we're invoking some models
of the eye. ;)
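
A rough way to check the "you can see it in the waveform" claim without
plotting anything: add the two cosines (no multiplication anywhere) and
track the largest sample in short windows. The sample rate, the 20 ms
window, and the names below are assumptions made for this sketch, and
both tones are taken to have equal amplitude and zero starting phase.

```python
import math

FS = 8000               # sample rate in Hz (arbitrary choice for this sketch)
F1, F2 = 100.0, 201.0   # the two tones from the discussion, in Hz

def window_peak(start_s, dur_s=0.02):
    """Largest sample of cos(2*pi*F1*t) + cos(2*pi*F2*t) in a short window."""
    n0 = round(start_s * FS)
    n1 = n0 + round(dur_s * FS)
    return max(
        math.cos(2 * math.pi * F1 * n / FS) + math.cos(2 * math.pi * F2 * n / FS)
        for n in range(n0, n1)
    )

# Peak of the plain sum in 20 ms windows (two cycles of 100 Hz),
# one window every 0.125 s over two seconds.
peaks = [window_peak(k * 0.125) for k in range(16)]
for k, p in enumerate(peaks):
    print(f"t = {k * 0.125:5.3f} s   peak = {p:.3f}")
```

The peak is near 2 around t = 0 and t = 1 s and drops to a little over
1.1 around t = 0.5 and t = 1.5 s, so the shape of the summed waveform
cycles once per second. This says nothing about how the ear picks that
up; it only shows the 1 Hz pattern is already in the sum.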

The ear may be perceiving that 1 Hz signal in all sorts of nonlinear  
ways, explaining the changes you perceive at different volumes. For  
instance, I could easily imagine it as popping out of some sort of  
envelope detection of a short-time Fourier transform.

Consider playing a 439 Hz tone against a 441 Hz tone. The sum is just
cos(2 pi 439 t) + cos(2 pi 441 t) = 2 cos(2 pi t) cos(2 pi 440 t), so
your ear perceives a 440 Hz tone whose amplitude follows a 1 Hz cosine.
You'll hear two loudness peaks per second, and if you plot a
_short-time_ Fourier transform of it with a short window, that's what
you'll see. You'll see it in the spectrogram and you'll hear it with
your ears, and you can describe that without ever having to multiply
cos(2 pi 439 t) with cos(2 pi 441 t).
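
Here is a minimal sketch of that short-time view, using a single
Hann-windowed Fourier bin at 440 Hz in place of a full spectrogram. The
sample rate, the 50 ms window, the 25 ms hop, and the function names are
all assumptions chosen for illustration. The input is the plain sum of
the two tones.

```python
import cmath
import math

FS = 8000    # sample rate in Hz (assumed for this sketch)
WIN = 400    # 50 ms analysis window, short compared with the beat
HOP = 200    # 25 ms between frame starts

def sample(n):
    """Plain sum of the two tones at sample index n; nothing is multiplied."""
    t = n / FS
    return math.cos(2 * math.pi * 439 * t) + math.cos(2 * math.pi * 441 * t)

def level_at_440(start):
    """Magnitude of one Hann-windowed Fourier bin at 440 Hz for one frame."""
    acc = 0j
    for k in range(WIN):
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * k / WIN)
        phase = -2j * math.pi * 440 * (start + k) / FS
        acc += hann * sample(start + k) * cmath.exp(phase)
    return abs(acc) / WIN

# Two seconds of frames, then the frame-center times of the local maxima.
levels = [level_at_440(i * HOP) for i in range(2 * FS // HOP)]
peak_times = [
    (i * HOP + WIN / 2) / FS
    for i in range(1, len(levels) - 1)
    if levels[i - 1] < levels[i] > levels[i + 1]
]
print(peak_times)
```

The maxima of the 440 Hz level land half a second apart, which is the
two-peaks-per-second pattern, and the level falls to nearly zero midway
between them.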

I think maybe an important distinction to make is between short-time
Fourier analysis and Fourier analysis of the whole signal at once.

You could write down an equation for an exponentially damped sinusoid  
x(t). Listening to it, you'll hear a tone that gradually decreases in  
volume.

You could then write down its Fourier transform, which would look like
two copies of 1/(a + j w), with a the decay rate, centered at +/- the
frequency of your sinusoid. In this representation, time's been
collapsed, and you imagine the wave as made up of a continuum of
frequencies - in fact, there'd be frequency content at ALL frequencies.
But you don't perceive all those frequencies.
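
As a sanity check on that transform, the sketch below compares the
closed form with a brute-force numerical integral. It assumes the signal
is exp(-a t) cos(w0 t) starting at t = 0, in which case each of the two
shifted copies of 1/(a + j w) carries a factor of 1/2. The values of A
and W0 and the function names are made up for illustration.

```python
import cmath
import math

A = 20.0                     # decay rate in 1/s (illustrative value)
W0 = 2 * math.pi * 50.0      # sinusoid frequency in rad/s (illustrative value)

def closed_form(w):
    """Two copies of 1/(A + j w), shifted to +W0 and -W0, each scaled by 1/2."""
    return 0.5 / (A + 1j * (w - W0)) + 0.5 / (A + 1j * (w + W0))

def numeric(w, t_end=1.0, dt=2e-5):
    """Trapezoidal approximation of the Fourier integral over [0, t_end]."""
    steps = round(t_end / dt)
    total = 0j
    for n in range(steps + 1):
        t = n * dt
        weight = 0.5 if n in (0, steps) else 1.0
        x = math.exp(-A * t) * math.cos(W0 * t)
        total += weight * x * cmath.exp(-1j * w * t)
    return total * dt

for f_hz in (0.0, 25.0, 50.0, 100.0):
    w = 2 * math.pi * f_hz
    print(f"{f_hz:6.1f} Hz   closed form {abs(closed_form(w)):.5f}"
          f"   numeric {abs(numeric(w)):.5f}")
```

The two columns should agree closely, and the magnitude is largest near
50 Hz but nonzero at every frequency tried, which is the "content at ALL
frequencies" point.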

Neither the time-domain nor the frequency-domain equation fully
describes what we perceive. For that, a short-time Fourier analysis is
best.
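
To make that concrete for the damped sinusoid, here is a rough
short-time sketch: one Hann-windowed Fourier bin at the tone frequency,
evaluated frame by frame. The tone frequency, decay rate, sample rate,
and 50 ms frame length are assumptions picked for illustration, not
values from the discussion.

```python
import cmath
import math

FS = 8000     # sample rate in Hz (assumed)
F0 = 400.0    # tone frequency in Hz (illustrative)
A = 3.0       # decay rate in 1/s (illustrative)
WIN = 400     # 50 ms frames, non-overlapping

def frame_level(start):
    """Magnitude of one Hann-windowed Fourier bin at F0 for one frame."""
    acc = 0j
    for k in range(WIN):
        t = (start + k) / FS
        x = math.exp(-A * t) * math.cos(2 * math.pi * F0 * t)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * k / WIN)
        acc += hann * x * cmath.exp(-2j * math.pi * F0 * t)
    return abs(acc) / WIN

# One second of frames: a single frequency whose level falls over time.
levels = [frame_level(i * WIN) for i in range(20)]
for i, level in enumerate(levels):
    print(f"frame at {i * WIN / FS:4.2f} s   level {level:.4f}")
```

Each frame shows energy at one frequency, with the level shrinking by
the same factor from frame to frame. That is much closer to "a tone that
gradually decreases in volume" than either the raw waveform or the
whole-signal transform.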

I also could be missing Ian's point...

- Aaron


