[sdiy] matrix vs direct I/O wiring & firmware (was - Re: MIDI velocity)

rsdio at audiobanshee.com rsdio at audiobanshee.com
Sun Apr 10 06:26:48 CEST 2016

Hi Steve,

On Apr 6, 2016, at 4:05 PM, sleepy_dog at gmx.de wrote:
> (I hope the initiator of this thread doesn't mind that I kinda hijacked it a bit :-D )
I changed the subject so the drift can be managed...

> "Depending upon your processor and the order of the GPIO registers, it might not even be possible to use a standard loop"
> Well from what I remember, this would work on stm32. The GPIO ports are consecutive in address space.
Consecutive in one sense. You'd have to increment your index register by 1024 to loop from Port A to Port B to Port C, etc., and you'd have to wire your keys on consecutive ports using all 16 bits of each port. That could be a challenge if your SPI peripheral eats away several bits of one or more Ports due to the pin function mux.
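
To make the 1024-byte stride concrete, here is a minimal sketch in C. The Port A base address is only an assumed example (it differs between STM32 families), and `gpio_port_base` and `key_to_port_bit` are hypothetical helper names, not anything from ST's headers. The key mapping assumes every port contributes all 16 bits in order, which is exactly the wiring constraint described above.

```c
#include <stdint.h>
#include <assert.h>

/* Assumed values for illustration only; check the reference manual. */
#define GPIO_BASE_ASSUMED  0x40020000u  /* hypothetical address of Port A */
#define GPIO_PORT_STRIDE   0x400u       /* 1024 bytes between ports */

/* Address of the register block for port `index` (0 = A, 1 = B, ...). */
static uint32_t gpio_port_base(uint32_t index)
{
    return GPIO_BASE_ASSUMED + index * GPIO_PORT_STRIDE;
}

/* Map a key number to (port, bit), assuming keys are wired to all
   16 bits of consecutive ports with no pins lost to other functions. */
static void key_to_port_bit(uint32_t key, uint32_t *port, uint32_t *bit)
{
    assert(port != 0 && bit != 0);
    *port = key / 16u;
    *bit  = key % 16u;
}
```

As soon as one pin is taken by a peripheral, the simple divide-by-16 mapping no longer holds and a lookup table is needed instead.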

But this (direct switch wiring) trades firmware simplicity for hardware complexity. The 11 ports are not all 16 bits wide. Port K is only 8 bits. Smaller packages do not have all ports: the 100-pin ST only has Ports A through E, and it only has 80 GPIO pins anyway, so you're not going to handle a 61-key velocity keyboard if you try to wire each switch directly, without a matrix. Then there's the issue that most port pins are not adjacent on the package. Most ports are split into 5 sections. Port D is the most well-grouped, but it still only has 8 pins adjacent in 2 sections. That means layout for 122 discrete switches would be a bit of work compared to a simpler matrix.

Don't forget that there's a $1 or $2 price increase for each larger package with more GPIO pins. Of course, you're talking about limiting each processor to only a handful of keys, so it might not be so bad.

> And as those guys probably thought themselves "what do we do with all those 4 GIG address space on a MCU? It would be wasteful not to make use of it!",
> I could probably even loop through every single pin with doing one native word access each, no shifting & masking required,
> as from what I remember, every damn thing in the STM32's is bit-banded, i.e. there is a 32bit address that addresses just one peripheral bit, or even hardware control register bit, to do efficient changes of single bits.
> Which I mention here not for relevance but just because I think it's pretty cool ;)
Are you sure? I thought the bit-banding was limited to a small subset of the total memory range at the beginning of the memory space; otherwise the virtual bit address would need more than 32 bits! The STM32 manual says, "Each I/O port bit is freely programmable, however the I/O port registers have to be accessed as 32-bit words, half-words or bytes." That implies to me that the bit addressing modes are not available for the GPIO ports. You have to read at least 8 bits, if not all 16. There are no 32-bit GPIO ports (on the STM32 model I looked at).
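
One way to check this for a given part is to compute the alias address yourself. The sketch below uses the peripheral bit-band window as defined for the ARM Cortex-M3/M4 cores (a 1 MB region starting at 0x40000000, aliased starting at 0x42000000). The function names are hypothetical, and whether a particular chip's GPIO registers actually fall inside that 1 MB window is something to confirm in its reference manual; the addresses passed in below are only examples.

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

#define PERIPH_BB_BASE   0x40000000u  /* start of peripheral bit-band region */
#define PERIPH_BB_SIZE   0x00100000u  /* region is 1 MB long */
#define PERIPH_BB_ALIAS  0x42000000u  /* start of the alias region */

/* True if `addr` lies inside the peripheral bit-band region. */
static bool in_periph_bitband(uint32_t addr)
{
    return addr >= PERIPH_BB_BASE && addr < PERIPH_BB_BASE + PERIPH_BB_SIZE;
}

/* Word address in the alias region that maps to one bit of `addr`.
   Each bit in the region gets its own 32-bit alias word. */
static uint32_t periph_bitband_alias(uint32_t addr, uint32_t bit)
{
    assert(in_periph_bitband(addr) && bit < 32u);
    return PERIPH_BB_ALIAS + (addr - PERIPH_BB_BASE) * 32u + bit * 4u;
}
```

Since each bit costs a full 32-bit alias word, the 1 MB region needs 32 MB of alias space, which is why only a small slice of the address map can be bit-banded.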

> I hope I'm not coming off too much as the boy raving about his favorite toy, the class of which he knows only one product line of and lacks reference ;-)

Ha! I should be careful that I don't confuse the many different processors I've designed for. I keep confusing the Analog Devices Blackfin GPIO implementation with the typical ARM GPIO implementation. The Blackfin has some clever GPIO access modes. Each Blackfin GPIO has an alias that only sets bits, another that only clears bits, another that only toggles bits, and then the usual direct mode. Each of the special functions implements the masking in hardware, so you can easily select bits to toggle, set or clear.
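
The effect of those four access modes can be modeled in a few lines of C. This is only a simulation of the semantics using an ordinary variable as the data latch; the function names are made up for illustration and are not the Blackfin register names. On the real hardware each mode is a separate register address and the read-modify-write happens inside the peripheral.

```c
#include <stdint.h>

/* Simulated 16-bit GPIO data latch (a stand-in for the real register). */
static uint16_t gpio_latch;

/* Direct mode: every bit is overwritten. */
static void gpio_write_direct(uint16_t value) { gpio_latch = value; }

/* Set alias: 1 bits in the mask are set, 0 bits are left alone. */
static void gpio_write_set(uint16_t mask)     { gpio_latch |= mask; }

/* Clear alias: 1 bits in the mask are cleared, 0 bits are left alone. */
static void gpio_write_clear(uint16_t mask)   { gpio_latch &= (uint16_t)~mask; }

/* Toggle alias: 1 bits in the mask are inverted, 0 bits are left alone. */
static void gpio_write_toggle(uint16_t mask)  { gpio_latch ^= mask; }
```

The practical benefit is that firmware issues a single write per operation, so an interrupt cannot land between the read and the write of a software read-modify-write sequence.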

> A friend, pro EE who used some classic MCUs like AVR, tends to be enthused by those things, though, so maybe it's worthy of mentioning.
I was taken by surprise when you referred to the AVR as "classic." In my book, chips like the 8049, 8051, Z-80, 6800 and 6502 used in vintage synths are classic. The AVR is currently in production. In any case, confirm that bit-banding is available for GPIO before you bring it up.

> Delay by serial communication between MCUs:
> If I used perhaps 12 Mbit/s SPI with DMA, but very small buffers, compared to 31250 bit/s or what it was for MIDI - a ratio of 384, on 24 or 48 MHz MCUs - I guess there won't be much of a delay? I don't know about electrical specs of SPI with regards to physical connection lengths. 3Mbit/s USART still seems fast enough I guess. I have used unclocked 3Mbit/s links with loosely hanging 30cm flat cable without problems.
With a 48 MHz processor like the PIC, the peripherals and instructions typically only operate at 1/4 the rate, maximum. So the fastest SPI clock is 12 MHz. With 8-bit data, you can execute 8 instructions before the serial peripheral can get the data to the next processor. 16-bit data would take 16 instructions. I point this out because you were worried about settling time of the switch matrix, and that's less than 1 instruction.
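
The arithmetic behind those figures, as a small C sketch. The helper name is hypothetical, and the rates are passed in as parameters since they vary by chip; the numbers in the note below are the PIC figures from this paragraph (12 MHz instruction rate, 12 MHz SPI clock).

```c
#include <stdint.h>
#include <assert.h>

/* Number of instruction cycles that elapse while one SPI word of
   `bits` bits is shifted out, rounded up to a whole cycle. */
static uint32_t instr_cycles_per_spi_word(uint32_t instr_hz,
                                          uint32_t spi_hz,
                                          uint32_t bits)
{
    assert(spi_hz != 0u);
    /* transfer time = bits / spi_hz; cycles = transfer time * instr_hz */
    return (uint32_t)(((uint64_t)bits * instr_hz + spi_hz - 1u) / spi_hz);
}
```

With `instr_hz` and `spi_hz` both at 12000000, an 8-bit word costs 8 instruction cycles and a 16-bit word costs 16. This counts only the shift time, not any firmware overhead on either end.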

For ARM, I had a hard time finding any chip that could exceed 25 MHz for the SPI. In fact, I never found one. The ARM also runs at a higher clock than a PIC, but there are usually the same restrictions that the peripheral clock is limited to 1/4 the master clock, or perhaps less. Even on the ARM, there will be several instructions of delay between writing data to SPI and when it arrives on the next chip.

Bottom line, the delay wouldn't be that much, but it's certainly more than the matrix settling time.

> I worried about analog mux IC switching time because I am only familiar with a few of the venerable CD40xx parts that I played with so far. I guess there are much faster ones around these days, I sometimes forget that beginner articles in those regards often still mention parts invented in the 1970's or so :-)
I may have mentioned this, but the analog mux inside an MCU has similar settling times to the external chips you might design around. There's no magic here: internal and external muxes are built from the same kinds of transistors, with the same limitations. If you use a single ADC input pin on your MCU, then you don't have to worry about the analog mux switching time. But as soon as you start using multiple ADC input pins, your firmware will have to take the analog mux switching times into consideration. A few MCU designs have as many as 3 independent ADC peripherals, so you could assign each to a different pin and read 3 signals without worrying about switching times.

In any case, internal or external analog mux, your firmware just needs to be programmed for the specific switch settling times of your analog mux. Your MCU should document the switching time of its analog mux.
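
In firmware terms, that usually means a select, wait, convert sequence for each channel. The sketch below shows the shape of it in C with the hardware access hidden behind function pointers, so the same loop works for an internal or an external mux. Everything here is hypothetical: `adc_ops_t`, `adc_scan` and the `sim_*` functions are illustration only, and the simulated ADC (which returns 0xFFFF if read before 10 cycles have passed) just stands in for a real converter whose settling time comes from the datasheet.

```c
#include <stdint.h>
#include <assert.h>

/* Hardware hooks; on a real MCU these would touch ADC/mux registers. */
typedef struct {
    void     (*select_channel)(uint8_t ch);
    void     (*delay_cycles)(uint32_t n);
    uint16_t (*convert)(void);
} adc_ops_t;

/* Read `count` channels starting at `first`, waiting `settle_cycles`
   after each mux change before starting the conversion. */
static void adc_scan(const adc_ops_t *ops, uint8_t first, uint8_t count,
                     uint32_t settle_cycles, uint16_t *out)
{
    assert(ops != 0 && out != 0);
    for (uint8_t i = 0; i < count; i++) {
        ops->select_channel((uint8_t)(first + i));
        ops->delay_cycles(settle_cycles);
        out[i] = ops->convert();
    }
}

/* Simulated hardware: the reading is garbage unless at least 10
   cycles have passed since the channel was selected. */
static uint8_t  sim_channel;
static uint32_t sim_waited;

static void sim_select(uint8_t ch) { sim_channel = ch; sim_waited = 0u; }
static void sim_delay(uint32_t n)  { sim_waited += n; }
static uint16_t sim_convert(void)
{
    return (sim_waited >= 10u) ? (uint16_t)(100u * sim_channel) : 0xFFFFu;
}
```

With a dedicated ADC per input pin, the select and delay steps drop out entirely and the loop reduces to back-to-back conversions.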

> As for having a multi-MCU system -> writing multiple firmwares. Well, since the difference between the firmwares would be tiny, I'd probably write one, and have them, on init, check one sacrificed GPIO pin to know whether they are master or slave MCU.
> Or better yet, use the one-time-writable flash (or eeprom?) area in the stm32, to make that marking.
> But sure, implementing the communication adds significant effort. If I already need code for similar communication - say, MIDI, almost the only extra effort would lie in the program size, though. Given that my mentioned MCUs also tend to throw embedded flash at you, that shouldn't matter here.
I wasn't really talking about compile time, or the number of firmware images. I'm talking about the time it takes to develop and test the code. Even if you roll both the master and slave firmware source into one project, you still have to write the master code and the slave code. In contrast, a single MCU design won't have any distinct master or slave code. There would just be the core code, which is less code overall. You could totally skip all of the SPI master and SPI slave code, and any bugs associated with that (SPI slave code that outputs data to the SPI master is not always easy to design). There are definitely reasons to design with multiple processors, but the settling time of a key-switch matrix is not one of them.

One aspect of your design that you may not have considered is that you'd need a master MCU with 3 to 5 SPI ports. That's harder to find than you might think. You could link each slave in a token ring sort of network, but that would increase the latency even more for each additional keyboard PCB.

I'm only going on about this because there really aren't any significant drawbacks to a switch matrix architecture when reading several octaves of velocity-sensitive keys.

Brian Willoughby
Sound Consulting
