[sdiy] Quick C query

Phillip Harbison alvitar at xavax.com
Sun Nov 21 22:10:45 CET 2010

Dealing with booleans is a tradeoff between code size and data
size. It's important to know the architecture of your CPU. For
example, in C the plain integer types are signed, and on many
compilers plain char is signed too (the standard leaves that
choice to the implementation). After loading a signed char
(byte) the value is sign-extended to a normal int. The 68000
had instructions to extend a byte to a word (EXT.W) and a word
to a long (EXT.L), but no instruction to extend a byte to a
long (EXTB.L only arrived with the 68020). Many compilers
used a 32-bit "long word" as their basic "int" type. If you
used a char to store a boolean, thinking you were saving 3
bytes, you just wasted 4 bytes (2 instructions) extending the
sign. That's fine if you have lots of ROM and very little RAM.
If your memory is homogeneous, it's a false economy.

Using an unsigned char for booleans is a bit more efficient,
especially if your architecture automatically zeroes out the
upper byte(s) of the register. If not, then you'll need one
extra instruction to clear the register. If that instruction
is only one byte, it's still a win. On a typical RISC CPU, an
instruction is 4 bytes, so it's another false economy unless
RAM is a precious resource and you have ROM to burn.
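
To make the signed/unsigned difference concrete, here is a
minimal sketch. The function names are made up for illustration.
It only shows what the C language requires when a byte-sized flag
is widened to int. Which instructions that costs depends on your
compiler and CPU.

```c
/* Hypothetical helpers: widen a byte-sized flag to a full int. */

/* A signed byte must be sign-extended, so a stored -1 (all bits
 * set) comes back as -1 in the wider register. */
int widen_signed(signed char flag)
{
    return flag;
}

/* An unsigned byte is zero-extended: the upper bits of the result
 * are simply cleared, which many CPUs do for free on a byte load. */
int widen_unsigned(unsigned char flag)
{
    return flag;
}
```

Compiling both with your own toolchain and reading the assembly
listing is the quickest way to see which one is cheaper on your
target.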

The code size really starts to mount when you use bit fields
in C. The compiler cannot guess your intent, so it is going
to load the register, mask out the other bits, shift what is
left to the least significant bits of the register, and then
possibly sign extend the result. If you're mapping a bit field
to a device register, that's probably OK. If you're just using
bit fields to store booleans, it's a false economy on almost
any architecture.
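
Here is a rough sketch of the work hiding behind a bit field
read. The struct and function names are hypothetical. The
hand-written version assumes the 2-bit field sits in bits 2..3 of
the word. Real bit-field layout is implementation-defined, so
treat that position as an assumption, not a rule.

```c
/* Hypothetical flags packed into bit fields. */
struct flags {
    unsigned ready : 1;
    unsigned error : 1;
    unsigned mode  : 2;
};

/* Reading a field looks like a plain member access in the source,
 * but the compiler has to load, shift, and mask to get at it. */
unsigned read_mode_bitfield(const struct flags *f)
{
    return f->mode;
}

/* The same kind of work written out by hand, ASSUMING the field
 * occupies bits 2..3: shift it down to bit 0, then mask off
 * everything else. */
unsigned read_mode_by_hand(unsigned word)
{
    return (word >> 2) & 0x3u;
}

/* The alternative when RAM is not scarce: one byte per flag, so
 * each access is a single byte load or store. */
struct plain_flags {
    unsigned char ready;
    unsigned char error;
    unsigned char mode;
};
```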

Most compilers for x86 and RISC architectures make use of
optimization techniques such as loop-invariant code motion
and common subexpression elimination. In my experience, compilers
for embedded processors are not quite so advanced, but that
may have changed since I did embedded work many years ago.
If you want to help the compiler out, the best thing you
can do is apply the aforementioned techniques yourself. That
way the compiler doesn't have to deduce your intentions. If
you have a common subexpression, calculate it once and put
the result in a temporary variable. Avoid array indexing in
loops. Instead create a pointer to the first array element
and increment the pointer each time through the loop. This
made a huge difference in code size for old compilers for
the Z80 and 6800/6809. It may not be true for modern
microprocessors. As always, your mileage may vary.
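
As a sketch of those two hand optimizations (all function names
here are invented for illustration), each pair below computes the
same result. The second of each pair spells out what an
optimizing compiler would otherwise have to work out for itself.
On a modern compiler the generated code may well be identical.

```c
#include <stddef.h>

/* Array indexing: a naive compiler recomputes the element
 * address from the base and index on every pass. */
long sum_indexed(const int *a, size_t n)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer walking: start at the first element and bump the
 * pointer each time through the loop. */
long sum_pointer(const int *a, size_t n)
{
    long total = 0;
    const int *p = a;
    const int *end = a + n;
    while (p != end)
        total += *p++;
    return total;
}

/* The common subexpression (a + b) is written twice. */
int combine_naive(int a, int b, int c, int d)
{
    return (a + b) * c + (a + b) * d;
}

/* The same expression with (a + b) calculated once and kept in
 * a temporary variable. */
int combine_manual(int a, int b, int c, int d)
{
    int t = a + b;
    return t * c + t * d;
}
```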

Now back to lurking mode. :-)

Phil Harbison
