[sdiy] Variable rate waveform playback in NED synclavier
Scott Nordlund
gsn10 at hotmail.com
Mon Sep 3 06:16:42 CEST 2012
> Let me check that I understand this right...
> The "Sample Rate Generator" consists of an accumulator with an increment N feeding a divider set to M. This produces a sample rate clock which is then fed to the "Phase Angle Incrementer" which is another simple accumulator to control the octave.
> Since they've got a minimum of 48 samples (24 harmonics), they could use wavelengths of 256, 128, or 64. I.e. this octave-shifting step only has a range of three octaves. So most of the work has to be done by the S.R.G. Is that how you understand it?
It's not just 3 octaves; it covers all of them. I think the phase angle counter has to be 12-ish bits, with the upper 8 driving the waveform RAM/ROM. You'd get 8 octaves by varying the increment from 1 to 128, with the total period correspondingly varying between 4096 and 32 clock pulses. So to sum up, the input to the "phase angle incrementer" sets the octave (it's 8 bits, but restricted to powers of 2), N is the "divisor" or "modulus" of the counter (coarse tuning), and M is the "increment" (fine tuning). It's a 12-bit counter, with 4 bits of the "increment" being fractional.
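To make the octave arithmetic concrete, here's a minimal sketch in Python. It assumes my guess above (a 12-bit phase counter whose upper 8 bits address a 256-entry table); the names are made up for illustration and nothing here is taken from the patent.

```python
# Assumed model: 12-bit phase counter, top 8 bits index a 256-entry table.
PHASE_BITS = 12
TABLE_BITS = 8
PHASE_MASK = (1 << PHASE_BITS) - 1

def pulses_per_cycle(increment):
    """Count sample-clock pulses until the phase counter wraps back to zero."""
    phase = 0
    pulses = 0
    while True:
        phase = (phase + increment) & PHASE_MASK
        pulses += 1
        if phase == 0:
            return pulses

def table_addresses(increment):
    """Table addresses visited during one full cycle of the waveform."""
    shift = PHASE_BITS - TABLE_BITS
    n = pulses_per_cycle(increment)
    return [((i * increment) & PHASE_MASK) >> shift for i in range(n)]

# Power-of-two increments 1..128 give eight octaves.
octave_periods = {1 << k: pulses_per_cycle(1 << k) for k in range(8)}
print(octave_periods)
```

With an increment of 16 every table entry is read exactly once per cycle; below that entries repeat, and above it entries are skipped.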
I've been thinking about this a little bit, and wondering what the difference is between a uniform sample skip with a non-uniform clock pulse and a non-uniform sample skip with a uniform clock pulse (compare the Synclavier to a standard phase accumulator oscillator with a fixed sample rate of ~400 kHz, which I think is roughly how the PPG stuff worked). They should be equivalent until you start skipping samples (F*N > Fs, where N is the length of one cycle in samples). The clock jitter is essentially aliasing too, but I guess it's less objectionable if you sort of preemptively "downsample" the waveform (ignoring the resultant aliasing since it's harmonic). And now that I think about it, I notice this happening in my Yamaha RX5 and Wersi MK1/EX20. It should be equivalent to quantizing the phase accumulator before the waveform table lookup. If that's actually how it works, it should be easier to emulate than I was thinking. I think I could even do that in Pure Data. It would be pretty easy to extend/improve it too.
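Here's a rough Python sketch of that equivalence, under a simplified model of my own (not the actual hardware): a rate accumulator that overflows at most once per master tick, and a 256-entry table read with a power-of-two skip. The function and parameter names are hypothetical.

```python
TABLE_LEN = 256

def jittered_clock_addresses(inc, modulus, skip, ticks):
    """Non-uniform clock, uniform skip: each accumulator overflow is one
    sample-clock pulse, and each pulse advances the address by `skip`.
    Assumes inc <= modulus, so there is at most one pulse per master tick."""
    acc = 0
    addr = 0
    out = []
    for _ in range(ticks):
        acc += inc
        if acc >= modulus:
            acc -= modulus
            addr = (addr + skip) % TABLE_LEN
        out.append(addr)
    return out

def quantized_phase_addresses(inc, modulus, skip, ticks):
    """Uniform clock: an ordinary phase accumulator running at the master
    rate, with the phase rounded down to a multiple of `skip` before lookup."""
    phase_num = 0  # phase in table steps, scaled by `modulus` to stay integer
    out = []
    for _ in range(ticks):
        phase_num += inc * skip
        steps = phase_num // modulus
        out.append((steps // skip) * skip % TABLE_LEN)
    return out

a = jittered_clock_addresses(37, 100, 4, 2000)
b = quantized_phase_addresses(37, 100, 4, 2000)
print(a == b)
```

In this toy model the two address sequences match tick for tick, which is what makes me think a quantized phase accumulator is enough for emulation.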
Reading further, patent 4680479 describes an improved pulse generator with less jitter. I get the impression they used this for the polyphonic sample playback (which otherwise worked similarly). I think it uses the standard clock generator (output clock rates 50 to 100 kHz), then corrects jitter by delaying some of the pulses. They don't say it explicitly, but it sort of spreads the error over two adjacent clock pulses, i.e. linearly interpolating the time intervals between pulses.
With that in mind, I wonder if there are other approaches. To truly eliminate jitter, you'd have to make every sample period an integer number of pulses of the master clock. An 80 MHz clock would be near ideal (20 kHz bandwidth, 0.423 cents max pitch error, and no jitter ever). But I was thinking you could sort of compromise and find a continuum between timing errors (jitter) and tuning errors. I thought of a sort of "dither" method that toggles between the nearest two "ideal" pitches (which are integer divisions of the input clock rate). Of course if you're toggling per clock pulse you get the same jitter as before (though you could maybe space it differently in time, i.e. randomly). At much slower rates it might make a sort of slight vibrato around the desired pitch. It still might need a high-ish clock rate, but it seems to me that this might be a much more benign artifact. If it's not more "natural" sounding, it's at least a lot more controllable, since both the rate and periodicity of pitch change can be explicitly controlled.
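A quick Python sketch of the integer-divider arithmetic, for anyone who wants to play with the numbers. The helper names are hypothetical, and the divider of 2048 (80 MHz / 2048 is about 39 kHz) is an assumption about the slowest sample rate, chosen because it lands close to the pitch error figure quoted above.

```python
import math

def cents(ratio):
    """Musical interval of a frequency ratio, in cents."""
    return 1200.0 * math.log2(ratio)

def worst_case_error_cents(divider):
    """Worst-case pitch error when the target lies midway (in log frequency)
    between the rates given by `divider` and `divider + 1`."""
    return 0.5 * cents((divider + 1) / divider)

def dither_plan(master_hz, target_hz):
    """Nearest two integer dividers, plus the fraction of sample periods that
    should use the longer one so the mean period matches the target."""
    ideal = master_hz / target_hz
    lo = math.floor(ideal)
    frac_hi = ideal - lo
    return lo, lo + 1, frac_hi

print(round(worst_case_error_cents(2048), 3))
lo, hi, frac_hi = dither_plan(80e6, 44100.0)
print(lo, hi, frac_hi)
```

How the longer periods get distributed in time (per pulse, in slow blocks, or randomly) is the knob that trades jitter against slow pitch wobble; the sketch only gives the proportion.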
Another thing that came up was the way they decimate the waveforms. A different wavetable size could have more factors and more ways to decimate. 256 has 9 factors while 240 has 20... I think this could be exploited for better pitch accuracy, but I'm not really sure how... Of course it's probably more reasonable to just use multisamples without decimating anything.
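The factor counts are easy to check in Python. This is just a brute-force divisor listing; `divisors` is a throwaway helper of mine.

```python
def divisors(n):
    """All positive integers that divide n exactly."""
    return [d for d in range(1, n + 1) if n % d == 0]

for size in (256, 240):
    print(size, len(divisors(size)), divisors(size))
```

Each divisor is a possible decimation step that still lands on a whole number of samples per cycle.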
I know some of this stuff is well beyond the scope of "authentic" emulation, but I'm thinking also of a more general case that can do this sort of thing (including other well known digital synths) in an ideal fashion.
> I've taken the BASIC program they give in the patent for finding values for N and M and converted it to PHP. It's still not entirely clear what it outputs, but some things come up in the code;
>
> 1) It tests clock periods between 155 ns (6.45 MHz) and 156 ns (6.41 MHz)
> 2) The "per channel sample rate" is regarded as 1/16th of this rate - e.g. around 400 kHz.
> 3) For a given frequency f1, the required sample rate is f1*128. This implies that NED regarded their waveforms as 128 samples long, and oversampled by a factor of two for the lower octaves.
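The arithmetic in the three points above can be checked with a few lines of Python. This is only a sanity check of the quoted numbers, not a port of the patent's BASIC program; the function names and the default arguments (16 channels, 128 samples per cycle) are taken from the quoted description.

```python
def per_channel_rate_hz(clock_period_ns, channels=16):
    """Master clock rate divided evenly among the channels."""
    return 1e9 / clock_period_ns / channels

def required_sample_rate_hz(f1, samples_per_cycle=128):
    """Sample rate needed to play a cycle of `samples_per_cycle` samples at f1 Hz."""
    return f1 * samples_per_cycle

for period_ns in (155, 156):
    print(period_ns, "ns ->", round(per_channel_rate_hz(period_ns)), "Hz per channel")

print(required_sample_rate_hz(440.0), "Hz needed for A440")
```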
I take this as a good sign that the patents correspond pretty well to what they actually implemented. Often they only disclose enough to illustrate the concept, and maybe deliberately obfuscate the rest. I think the choice of waveform resolution was also to improve the quality of the FM. I tried 128 samples in my Octave simulation previously and found it absurdly gritty sounding. On straight playback it sounds fine, it's just that the phase modulation is then quantized to those steps.