[sdiy] Question for those with musical ears

Mike Bryant mbryant at futurehorizons.com
Thu Apr 15 18:42:38 CEST 2021

Agreed on the information theory and signal-to-noise comments, but over 99% of the frequency bins are below the algorithm noise level and so are just set to zero.  You still have to test them all first, though, to find out both which ones are truly zero and which ones are only registering a signal because of a nearby bin.  That means looking for the local maximum around every frequency component and selecting those maxima to drive a modified Prony method.
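As a rough illustration of the "zero what is below the noise level, then keep only local maxima" step, here is a minimal Python sketch.  The function name `pick_peaks`, the `noise_floor` threshold and the toy magnitudes are all made up for this example; it is not the actual analyser, and it stops before any Prony-style refinement.

```python
def pick_peaks(magnitudes, noise_floor):
    """Zero bins at or below noise_floor, then list strict local maxima.

    Hypothetical helper for illustration only.
    """
    # Bins that do not rise above the noise floor are treated as empty.
    cleaned = [m if m > noise_floor else 0.0 for m in magnitudes]
    peaks = []
    # A bin counts as a peak only if it is non-zero and strictly larger
    # than both neighbours, so the skirts spilling into adjacent bins
    # are rejected.  The first and last bins are never reported.
    for i in range(1, len(cleaned) - 1):
        if cleaned[i] > 0.0 and cleaned[i - 1] < cleaned[i] > cleaned[i + 1]:
            peaks.append(i)
    return cleaned, peaks


# Toy magnitudes: two components, each smeared into its neighbours.
spectrum = [0.01, 0.02, 0.30, 0.90, 0.35, 0.02, 0.01, 0.15, 0.60, 0.20, 0.01]
cleaned, peaks = pick_peaks(spectrum, noise_floor=0.05)
print(peaks)
```

Only the bins at the two maxima would be passed on to the frequency estimator; the smeared neighbours survive the threshold but are not selected.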

Sampling is at 384 kHz, and the ADC needs to be run at its optimal level so that the input noise floor is as low as possible and does not add to the algorithm noise floor.

However, for the frequency determination, you slide the detector along through the samples so you get 10,000 overlapping windows on the signal.  Most of the data in each window is the same, but the window centre is the key point.  By analysing the whole series I can resolve 4-cent vibrato or glissando changes in a four-note chord, and can often just about resolve 1-cent changes in frequency, but only at a hugely increased processing cost, and obviously not in real time.  Sometimes the algorithm does make errors, so I have numerous tests throughout the algorithms to look for sensible frequency combinations.  For example, if you are sure the fundamental is at a certain frequency, then for real-world sounds it's quite likely the harmonics are at pure multiples of it, so check those first.
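The "check the pure multiples first" sanity test could look something like the following Python sketch.  The helper names, the 5-cent tolerance and the example frequencies are assumptions chosen for illustration, not values from the real system.

```python
import math


def cents_between(f_ref, f):
    """Interval from f_ref to f in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f / f_ref)


def harmonic_matches(fundamental, candidates, max_harmonic=8, tolerance_cents=5.0):
    """Map harmonic number n -> detected frequency near n * fundamental.

    Hypothetical helper: for each integer multiple of the fundamental,
    take the closest detected frequency and keep it only if it lies
    within tolerance_cents of the exact multiple.
    """
    matches = {}
    for n in range(2, max_harmonic + 1):
        target = n * fundamental
        best = min(candidates,
                   key=lambda f: abs(cents_between(target, f)),
                   default=None)
        if best is not None and abs(cents_between(target, best)) <= tolerance_cents:
            matches[n] = best
    return matches


# Made-up detections around middle C (261.63 Hz); 660.0 Hz is unrelated.
detected = [523.4, 660.0, 785.1, 1046.0]
matches = harmonic_matches(261.63, detected, max_harmonic=4)
print(sorted(matches))
```

Frequencies that land on a harmonic can be accepted cheaply, and anything left over (here the 660.0 Hz component) would go on to the more expensive checks.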

-----Original Message-----
From: mskala at ansuz.sooke.bc.ca [mailto:mskala at ansuz.sooke.bc.ca] 
Sent: 15 April 2021 13:59
To: Mike Bryant
Cc: synth-diy at synth-diy.org
Subject: Re: [sdiy] Question for those with musical ears

On Thu, 15 Apr 2021, Mike Bryant wrote:
> Thanks - interesting read.  What I'm doing is sort of like a 
> polyphonic version of that.  At the moment I use a large array of 
> processors to analyse the incoming signal in 100uS time intervals and 
> 1 cent pitch intervals, so 128,000,000 bins per second.  This produces 
> tables of

Information theory limits how much meaningful timing and frequency information you can extract from a signal when you're trying to get both at once.  If your signal comes in with a sampling rate of 48kHz, then you only get 48,000 numbers describing the signal per second, and it's intuitively reasonable that you just can't extract 128,000,000 numbers from that and have them all be useful information.  There just isn't that much in the signal to extract.  Stuff like the Nyquist theorem and the Gabor limit make this notion more formal; really doing the estimate properly requires involving the signal to noise ratio as well as just the sampling rate.  I wrote an informal article about it, mostly from the point of view of "latency" concerns that sometimes come up with hardware synth effects, here:

On your initial posting I considered commenting that since you can't really get an accurate frequency measurement more than about once per cycle - which is 262 times per second for middle C - then any glissando that would have more than one step per cycle is information theoretically the same as a pure smooth portamento.  I kept quiet because I didn't think that comment would really be helpful:  it works out to minimum glissando steps of 0.4 cents for your example of one semitone per second at middle C, and that's clearly finer than the human ear's resolution anyway.  But if you're talking about taking complete high-resolution spectra instead of just a frequency measurement, at 100us intervals, then you're pushing far beyond the information theoretic limits and this becomes the limiting factor after all.
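For anyone who wants to check the arithmetic behind these figures, here is a short Python sketch.  The numbers are the ones quoted in this thread; reading the 128,000,000 bins per second as time slots multiplied by 1-cent pitch bins is an assumption, and the variable names are just for illustration.

```python
MIDDLE_C_HZ = 261.63

# Glissando of one semitone (100 cents) per second at middle C,
# with at most one frequency measurement per cycle.
gliss_rate_cents_per_s = 100.0
cents_per_cycle = gliss_rate_cents_per_s / MIDDLE_C_HZ

# 100 us time intervals give this many analysis slots per second.
slots_per_second = 1_000_000 // 100

# Assumed reading: total bins = time slots * 1-cent pitch bins.
bins_per_second = 128_000_000
pitch_bins_per_slot = bins_per_second // slots_per_second

# Output values requested per input sample at the two sampling
# rates mentioned in the thread.
ratio_48k = bins_per_second / 48_000
ratio_384k = bins_per_second / 384_000

print(f"{cents_per_cycle:.2f} cents per cycle")
print(f"{slots_per_second} slots/s x {pitch_bins_per_slot} pitch bins")
print(f"{ratio_48k:.0f} bins per sample at 48 kHz")
print(f"{ratio_384k:.0f} bins per sample at 384 kHz")
```

The first figure is the "0.4 cents" above; the last two show how many output numbers are being asked for per input sample, which is the sense in which the table goes beyond what the signal can carry.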

Matthew Skala
mskala at ansuz.sooke.bc.ca                 People before tribes.
