[sdiy] physical modelling

Ethan Duni eduni at ucsd.edu
Tue Nov 13 21:18:22 CET 2001


>Now that we are starting to get a handle on information entropy in audio, I
>can clear up this misconception.
>
>Information entropy is measured in "bits" 

-Usually. You'll also see "nats", which are the same quantity measured with
natural (base-e) logarithms instead of base 2.

>and is an index of the minimum
>number of bits needed to encode a specific symbol set. 

-More accurately, it is the average number of bits needed to describe a
symbol from the alphabet, given the probability distribution over those symbols.
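
To make that concrete, here is a quick Python sketch (my own, with made-up
distributions, just for illustration) of the standard H = -sum(p*log2(p))
calculation:

import math

def entropy_bits(probs):
    # Shannon entropy: average number of bits per symbol for a given
    # probability distribution (zero-probability symbols contribute nothing).
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits
print(entropy_bits([0.25] * 4))                 # 2.0 bits (uniform is worst case)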

>So a continuous-time,
>undigitized analog signal represents the maximum information entropy level.

-Umm.. not really.. entropy is defined a little differently for continuous
alphabets, and the interpretation changes as well: in this case the entropy
is the number of bits needed, on average, to reduce ambiguity about the
symbol to an interval of length 1 (or a hypercube of volume one in higher
dimensions). If you think of a discrete alphabet as consisting of integers
on the real line, you can see that this definition encompasses the discrete
case, since unit ambiguity is all you need to perfectly localize an integer.

The bottom line is that the entropy still depends on the range of values the
alphabet contains and the probabilities thereof. An alphabet with a uniform
distribution over some interval has a finite entropy given by the log of the
length of the interval. Even something like a gaussian distribution has
finite entropy, although it takes on values over an infinite interval.
Furthermore, if we constrain the distribution to have a certain variance
(that is, constrain the signal power), we can prove that the gaussian with
that variance has the highest entropy of all distributions with that
variance. Since the entropy of a gaussian is finite, the entropy of any
finite-variance variable is therefore bounded above by a finite number.

The point is that there is no inherent ranking of discrete and continuous
alphabets in the conventional definition of entropy. What I think you are
talking about is the information required to *perfectly* reconstruct an
input signal, which is of course infinite for a continuous alphabet (that
is, we would need words of infinite bit depth to perfectly describe an
analog voltage).
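
If it helps, the two closed forms I'm leaning on above are standard results
(nothing specific to this thread); in Python:

import math

def h_uniform(length):
    # differential entropy (bits) of a uniform distribution on an interval
    return math.log2(length)

def h_gaussian(variance):
    # differential entropy (bits) of a gaussian: 0.5*log2(2*pi*e*variance)
    return 0.5 * math.log2(2 * math.pi * math.e * variance)

print(h_uniform(0.5))              # -1.0 bit: short intervals go negative
print(h_gaussian(1.0))             # ~2.05 bits
print(h_uniform(math.sqrt(12.0)))  # ~1.79 bits: same variance, less entropy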

Sorry if I'm being nit-picky, but I was up late doing information theory
homework last night... :]

>ANY process of quantization or time sampling will reduce the information
>entropy by reducing the size of the symbol set. This is something that CAN
>be documented mathematically.

-I'd add that reducing the size of the alphabet doesn't necessarily decrease
the entropy unless constraints on the new probability distribution are also
met. Imagine a set of 6 symbols, one of which has almost unity probability
(off by, say, epsilon), and the other five of which are uniformly
distributed (probability epsilon over 5). Deleting the highly probable
symbol and renormalizing gives you a GREATER entropy than before, even
though the alphabet got smaller.

That said, you are correct that an alphabet obtained by quantizing a
continuous alphabet has finite entropy, whereas the continuous alphabet
requires an infinite number of bits to describe perfectly. But recalling
the definition of entropy for continuous alphabets (usually called
"differential entropy", incidentally), you should be able to see that a
continuous alphabet taking values over a finite interval can well have
lower entropy than a discrete variable obtained by quantizing it. This is
because the entropy of a continuous alphabet corresponds to a unit
quantization, whereas the quantization we actually apply may be on a finer
scale. This doesn't change your philosophical understanding of the
situation, but we should be careful when we throw these terms around. The
point is that the usual entropy of a continuous alphabet is interpreted as
the number of bits required to localize the value to intervals of unit
length, which is clearly not the same as the number of bits needed to
describe it exactly; that will usually be infinite.
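
Here is a quick numerical sketch (toy numbers of my own) illustrating both
points:

import math

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# (1) deleting the near-certain symbol from the 6-symbol alphabet RAISES the entropy
eps = 1e-3
six = [1 - eps] + [eps / 5] * 5
five = [1.0 / 5] * 5                          # renormalized after the deletion
print(entropy_bits(six))                      # ~0.014 bits
print(entropy_bits(five))                     # ~2.32 bits, despite the smaller alphabet

# (2) quantizing a short interval finer than unit length beats the differential entropy
interval = 0.5                                # uniform over an interval of length 0.5
step = 1.0 / 64                               # quantizer step much finer than 1
levels = int(round(interval / step))          # 32 levels
print(math.log2(interval))                    # differential entropy: -1 bit
print(entropy_bits([1.0 / levels] * levels))  # quantized entropy: 5 bits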

>Additionally, the finite speed of the human nervous system represents a time
>sampling effect at the ear itself. So the very process of "hearing"
>represents a reduction in information entropy. It may be possible to
>quantify that reduction in "bits" by estimating the number of possible
>symbol states of the cochlea. That is, the number of nerve endings running to
>the cochlea itself represents the fundamental symbol states of auditory
>perception. So the theoretical resolution of hearing in "bits" CAN be
>calculated to some extent.

-What would be really interesting is if someone measured or computed the
information capacity of the human ear...

Ethan



