[sdiy] dx, chorus and Spock

Grant Richter grichter at asapnet.net
Fri Aug 16 22:31:07 CEST 2002

> I have a theory about our sense of hearing: I claim that we can't hear
> waveforms or harmonics very accurately at all, but we are remarkably
> sensitive to the subtleties of the *process* that's making the
> waveform.
> Why?  Millions of years of evolution and survival of the species.  If
> there's a bear hunting you down in the forest, the ability to
> accurately discern what's causing a funny noise is vital.  Identify
> the footsteps, the size of the animal, which direction, how far away,
> how fast, on what surface...  In such a situation, the level of the
> seventh harmonic is just not important.

I have been doing a lot of research into the physiology of the ear. The
engineers, physiologists and audiologists each have their own traditions and
analysis methods, and there does not seem to be much cross-fertilization
between disciplines.

The human ear is an amazing organ. As it is attached to the organ of
balance, and located deep inside the skull, speculation is that it evolved
very early relative to vision. It also has privileged neural pathways to a
large number of other brain structures (including emotional centers).

The outer hair cells number around 25,000 and appear to be active
amplifiers/discriminators, as they actually oscillate. The oscillations of
the outer hair cells are called "otoacoustic emissions" and are used to
test hearing in infants. In other words, the ear produces noises of its
own, with no input. They are not thought to be pitch sensitive, but appear
to increase dynamic range and pitch discrimination by sympathetic oscillation.

The inner hair cells, arranged on the organ of Corti, number between 2,000
and 3,000 and are thought to be the primary pitch discrimination mechanism. A
little math shows that this is too few to resolve the ~1 Hz pitch
discrimination of the ear, so considerable post-processing is indicated.
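The "little math" can be sketched roughly. This is my own back-of-envelope illustration, not the author's calculation: assume the inner hair cells would have to be spaced logarithmically across the audible range (20 Hz to 20 kHz), and that resolving ~1 Hz near 1 kHz requires adjacent cells no more than a 0.1% frequency ratio apart.

```python
import math

# Hypothetical check: how many log-spaced frequency "slots" would be needed
# for ~1 Hz resolution near 1 kHz (a 0.1% ratio step) across 20 Hz - 20 kHz?
step = math.log(1001 / 1000)                 # 0.1% ratio between adjacent cells
cells_needed = math.log(20000 / 20) / step   # total log range / step size
print(round(cells_needed))                   # roughly 6900
```

That is more than the 2,000-3,000 inner hair cells available, which is consistent with the claim that neural post-processing must sharpen the raw place code.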

If you consider that there must be some metabolic increase in an excited hair
cell, and that this cost is incurred in parallel across all hair cells, the
resulting function looks like Shannon's measure of information entropy. In
other words, the more hair cells that are excited, the larger the overall
metabolic drain. A sine wave produces the lowest spectral entropy, while
white noise produces the largest.
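That comparison is easy to demonstrate (my own illustration, not the author's code): take spectral entropy to be the Shannon entropy of the normalized power spectrum, which is near zero for a sine and near its maximum for white noise.

```python
import numpy as np

def spectral_entropy(signal):
    """Shannon entropy (bits) of the normalized power spectrum."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    p = power / power.sum()
    p = p[p > 0]                      # drop empty bins so log2 is defined
    return -np.sum(p * np.log2(p))

n = 1024
t = np.arange(n)
sine = np.sin(2 * np.pi * 8 * t / n)                  # energy in a single bin
noise = np.random.default_rng(0).standard_normal(n)   # energy spread everywhere

print(spectral_entropy(sine))    # close to 0 bits
print(spectral_entropy(noise))   # close to the maximum, log2(513) ~ 9 bits
```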

It's theoretically possible the ear is using a rapid estimate of amplitude
and/or spectral entropy as a threat assessment system. This would be useful
because it is an index of the amount of physical energy needed to produce
the sound signal, and therefore of the immediacy of the threat: a rapid
assessment of threat size and distance. When someone drops a metal cafeteria
tray and you jump, that response is probably hard-wired.

Mike Firman and I developed computer programs to calculate the amplitude
entropy and spectral entropy of Mini-Wave wavetables. We then mapped the two
numbers onto a 2-dimensional plot which, for lack of a better term, we
called an "entropometer".

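A sketch of such a calculation, under my own assumptions (not the original program): I take the two numbers to be the entropy of the sample-value histogram and the entropy of the power spectrum, computed on 256-sample, 8-bit tables like the Mini-Wave's.

```python
import numpy as np

def shannon_entropy(p):
    """Entropy in bits of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropometer_point(table):
    """Map one 8-bit wavetable to an (amplitude entropy, spectral entropy) pair."""
    # Amplitude entropy: distribution of sample values over the 256 levels.
    counts = np.bincount(table.astype(np.uint8), minlength=256)
    h_amp = shannon_entropy(counts / counts.sum())
    # Spectral entropy: normalized power spectrum of the zero-centred table.
    power = np.abs(np.fft.rfft(table - table.mean())) ** 2
    h_spec = shannon_entropy(power / power.sum())
    return h_amp, h_spec

ramp = np.arange(256, dtype=np.uint8)                   # sawtooth-like table
noise = np.random.default_rng(1).integers(0, 256, 256)  # noise table
print(entropometer_point(ramp))    # low spectral entropy
print(entropometer_point(noise))   # high on both axes
```

Each wavetable then lands at one point on the 2D plot, and tables can be compared by where they fall.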
It is interesting to note that vocal and musical tones map nearer to each
other than "noise" signals do. The evolution of a sound forms a "trajectory"
if the 2-dimensional plots are stacked up and connected to form a
3-dimensional line. If the entropy of the power cepstrum is also calculated
and plotted, each data point has 3 dimensions and could be overlaid to form
a 3D solid "sound sculpture" or ideogram.
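For the third axis, a power-cepstrum entropy might look like this. The post does not say which cepstrum definition was used; this sketch uses one common form, the squared magnitude of the FFT of the log power spectrum.

```python
import numpy as np

def cepstral_entropy(signal):
    """Shannon entropy (bits) of a normalized power cepstrum."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    log_power = np.log(power + 1e-12)              # small floor avoids log(0)
    cepstrum = np.abs(np.fft.rfft(log_power)) ** 2
    p = cepstrum / cepstrum.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Each sound frame then yields a 3-dimensional data point:
# (amplitude entropy, spectral entropy, cepstral entropy).
frame = np.random.default_rng(2).standard_normal(256)
print(cepstral_entropy(frame))
```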

This suggests a possible research method to look for categories of sound.
The problem is computationally intense: there are 2^2048 (roughly 10^616)
possible Mini-Wave tables, each needing an FFT and 2 entropy calculations.
This would at least define the area of the 2D entropometer itself.
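As a sanity check on that count (Python's arbitrary-precision integers make it easy to verify the order of magnitude):

```python
# Each 256-sample, 8-bit table is 2048 bits, so there are 2**2048 distinct tables.
n_tables = 2 ** 2048
print(len(str(n_tables)) - 1)    # 616, i.e. 2**2048 is roughly 10**616
```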

But since there have only been around 10^26 nanoseconds since the Big Bang,
I'm not waiting around for the results ;^)

Who knows if this is useful for anything, but it is a novel method of
visualizing audio data. The technique is applicable to any bit depth.

Any programmers interested in developing a real time application are invited
to contact me.
