[sdiy] Enveloppe follower

Bernard Arthur Hutchins, Jr bah13 at cornell.edu
Sun Dec 8 04:51:47 CET 2019

When we, as musical engineers, think of a notion such as “pitch”, we usually approach it from several points of view.  At the highest level (engineering mathematics) we obtain pitch from the Continuous-Time Fourier Transform (CTFT), but this is usually not obtainable: the integral has infinite limits, and a mathematical expression for a musical sound s(t) is usually missing altogether, which absolutely prevents us from doing the integral!

At a second level, computation IS possible with a sampled signal s(nT), using the Discrete Fourier Transform (DFT, computed with the FFT), which also self-windows in time, but which has (at least) an uncertainty problem (in the Heisenberg sense for the FT) and is in general potentially problematic.
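As a rough illustration of that uncertainty, here is a minimal sketch that estimates the frequency of a short sampled tone from the largest DFT bin. The sample rate, block length, test frequency, and the helper name dft_peak_frequency are all assumptions chosen for the example, and the DFT is written out directly rather than as an FFT to keep it self-contained.

```python
import cmath
import math

def dft_peak_frequency(samples, fs):
    """Return the frequency (Hz) of the largest-magnitude DFT bin below Nyquist."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        # Direct DFT sum for bin k (an FFT would give the same value faster).
        acc = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * fs / n

fs = 8000.0          # assumed sample rate in Hz
n = 256              # assumed block length (the implicit time window)
tone = [math.sin(2 * math.pi * 440.0 * i / fs) for i in range(n)]

bin_spacing = fs / n                       # frequency resolution of this block
estimate = dft_peak_frequency(tone, fs)    # lands on a bin centre, not on 440 Hz
print(bin_spacing, estimate)
```

A longer block narrows the bin spacing but averages over more time, which is the trade-off referred to above.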

At a third level (psychoacoustics) we insist on answers that agree with what a human listener hears.  What the modeled pitch “should be” is irrelevant if it is wrong or unstable.

At the fourth level, we need to have a scheme that is practical in terms of construction/programmability as an end goal.  At this level, all sorts of “rule breaking” may appear and be applauded if they just WORK.

Do we find the same four levels if we are considering obtaining an ENVELOPE?   Yes and NO.  The first two levels (Hilbert transform) are so infrequently discussed:



that they do not come readily to mind.  So it is not apparent that there is any possibility of an “above all” engineering notion of a correct answer to compare to.  Instead, we have generally relied on levels three and four (what does the ear think is right; what can we build), which at least have a problem with time-averaging.

To get the “envelope” of a signal s(t), first form the “analytic signal” S(t) = s(t) + js’(t), where s’(t) is the Hilbert Transform (HT) of s(t).  Then take the magnitude of S(t) as the envelope.  The HT, like the CTFT, is an all-time integral transform, and is largely unobtainable at level 1 for the same reasons.  At level 2 (computability), however, there is a discrete-time HT that is a largely trivial standard FIR digital filter (see pages 10-12) of:


which is basically a 90 degree phase shifter: s’(nT) from a windowed s(nT).
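A minimal sketch of this level-2 recipe follows. It assumes the textbook FIR Hilbert transformer (taps 2/(pi k) for odd k, zero for even k) with a Hamming window; the tap count, the window choice, and the function names hilbert_fir and envelope are assumptions for illustration, not taken from the pages referenced above.

```python
import math

def hilbert_fir(num_taps):
    """Odd-length, Hamming-windowed FIR approximation of a Hilbert transformer."""
    assert num_taps % 2 == 1, "use an odd number of taps"
    mid = num_taps // 2
    taps = []
    for i in range(num_taps):
        k = i - mid
        ideal = 2.0 / (math.pi * k) if k % 2 else 0.0   # zero at even k (incl. k = 0)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def envelope(samples, num_taps=63):
    """Magnitude of s(nT) + j s'(nT), skipping the filter's edge regions."""
    taps = hilbert_fir(num_taps)
    mid = num_taps // 2
    out = []
    for n in range(mid, len(samples) - mid):
        # Delay-compensated convolution: s'(nT) aligned with s(nT).
        shifted = sum(taps[j] * samples[n + mid - j] for j in range(num_taps))
        out.append(math.hypot(samples[n], shifted))
    return out

# Test tone: amplitude 0.7 at one eighth of the sample rate (mid-band).
tone = [0.7 * math.sin(2 * math.pi * 0.125 * n) for n in range(400)]
env = envelope(tone)
print(min(env), max(env))
```

The FIR approximation is only accurate away from DC and Nyquist, so very low or very high frequency content would need more taps or a different design.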

Believing is perhaps a first step in understanding.  What if you had s(t) = A sin(t)?  Shifting by 90 degrees gives s’(t) = A cos(t) (up to a sign that depends on the convention), so S(t) = A sin(t) + jA cos(t) is the analytic signal, and its magnitude (square root of the sum of squares) is A, which everyone agrees is the correct answer and involves no low-pass filtering complications.  Note that the operation is non-linear, and in general you do need a full broadband HT, so this is just the simplest example.  MANY years ago Hal Chamberlin pointed out that the above example, while correct, did not validate the general expanded scheme!
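The single-sine case can be checked numerically with an exact 90 degree shifted copy. This is only a sketch: the amplitude and the number of sample points are assumed values, and the rectify-and-average figure is added purely as a point of comparison with a conventional envelope follower.

```python
import math

A = 0.8      # assumed amplitude
N = 1000     # assumed number of sample points over one period

times = [2 * math.pi * i / N for i in range(N)]
s = [A * math.sin(t) for t in times]
s_shifted = [A * math.cos(t) for t in times]   # 90 degree shifted copy; its sign does not affect the magnitude

# Magnitude of the analytic signal: constant, with no smoothing required.
magnitude = [math.hypot(a, b) for a, b in zip(s, s_shifted)]

# For comparison: full-wave rectify and average over one period.
rectified_mean = sum(abs(a) for a in s) / N    # tends to 2A/pi, not A

print(min(magnitude), max(magnitude), rectified_mean)
```

The magnitude is A at every instant, while the rectified average needs both time-averaging and a scale factor, which is the complication the analytic-signal view avoids for this one simple input.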

Just a few things to keep in mind.

