[sdiy] was how_you_got_started...
Neil Johnson
neil.johnson71 at gmail.com
Tue Oct 6 11:07:50 CEST 2015
Not sure if this has been mentioned yet:
https://isocpp.org/blog/2015/09/bjarne-stroustrup-announces-cpp-core-guidelines
Looks like Bjarne is trying to follow the MISRA-C model: carve out a
safe subset of the language with a set of rules that can be turned
into automated checks (there are plenty of tools you can buy that
check code against the MISRA-C rules - an important consideration in a
professional environment).
Neil
On 29 September 2015 at 23:51, Dan Snazelle <subjectivity at hotmail.com> wrote:
> Thanks for this extremely interesting response!
>
>
>
> Sent from my iPhone
>
> On Sep 29, 2015, at 6:08 PM, "Olivier Gillet" <ol.gillet at gmail.com> wrote:
>
>>> At the lowest level, the main overhead of C++ is the virtual
>>> function table and the code to navigate that table, all of
>>> which is constructed by the compiler and linker. Otherwise,
>>> C++ code should perform about the same as C code.
>>
>> Note that this extra code is generated only if you use dynamic
>> polymorphism (virtual functions).
>>
>> Other C++ features that can possibly cause bloating:
>> * Exceptions (code size increase from function prologues/epilogues + extra
>> RAM for bookkeeping)
>> * RTTI (at least one extra pointer stored in each object + some tables
>> in RAM for each type)
>> http://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf
>>
>> None of these are really necessary - and there are a fair number of
>> large C++ projects that do not make use of these features (especially
>> the last two).
>>
>> Some of my favorite C++ features have no overhead at all, entirely
>> take place during compilation, and have nothing to do with objects:
>> * Data access policies (const, references)
>> * Evaluation of expressions during compilation
>> * Syntactic sugar (auto, range-based for loops, default arguments to functions)
>> * Functors (and now lambda expressions) leading to the generation of
>> very efficient code for filtering/traversing data structures (e.g.
>> filtering a list of MIDI events sorted by time in a sequencer).
>> * Cleanup through RAII ("Set the CS line high or finish the
>> transaction with this peripheral irrespective of the exit point of
>> this function") - see the sketch below.
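>>
>> A minimal sketch of that last point (the gpio_set/gpio_clear helpers
>> and the pin numbering are made up here, not taken from any particular
>> HAL): the chip-select line is released on every exit path of the
>> function, without duplicating the cleanup code.
>>
>> #include <cstdint>
>>
>> // Placeholder GPIO helpers - a real project would poke the port registers.
>> inline void gpio_set(uint32_t pin) { (void)pin; }
>> inline void gpio_clear(uint32_t pin) { (void)pin; }
>>
>> // Asserts CS on construction, releases it on destruction.
>> class ChipSelectGuard {
>>  public:
>>   explicit ChipSelectGuard(uint32_t pin) : pin_(pin) { gpio_clear(pin_); }
>>   ~ChipSelectGuard() { gpio_set(pin_); }
>>  private:
>>   uint32_t pin_;
>> };
>>
>> bool WriteRegister(uint32_t cs_pin, uint8_t address, uint8_t value) {
>>   ChipSelectGuard cs(cs_pin);  // CS pulled low here...
>>   if (address > 0x7f) {
>>     return false;              // ...released here,
>>   }
>>   // ... SPI transfer of address and value would go here ...
>>   (void)value;
>>   return true;                 // ...and here, with no extra cleanup code.
>> }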
>>
>> That's actually the point of the language and what makes it unique -
>> that all the towers of abstraction it provides can be made to collapse
>> during compilation. Best example is templates. Yes, they are
>> controversial - they can bloat code size, but they are my favourite
>> C++ feature for expressing computations in a way that the compiler can
>> see through.
>>
>> Let's say I want to build a granular sample player. It can work in a
>> mono and a stereo mode, using 8-bit and 16-bit samples, and depending
>> on some settings, with or without a complex envelope on the grains. If
>> I write the code naively in C or assembly, the code will be littered
>> with if statements checking for these conditions inside the audio
>> rendering loop. The tests will hamper performance (they take time to
>> execute, mess with the prefetcher, and waste registers). So in
>> such situations, you really want to create as many rendering loops as
>> there are combinations of parameters - each of these rendering loops
>> being tight and stripped of any test. While in C or assembly you'd
>> have to duplicate the loops (or maybe use some ugly pre-processor
>> macro tricks), you can do this in C++ with templates. The grain
>> rendering loop from my module Clouds looks like this:
>> https://github.com/pichenettes/eurorack/blob/master/clouds/dsp/grain.h#L120
>> . And this actually gets compiled into 2 x 3 x 2 variants for all
>> combinations of #channels, buffer resolution, and grain quality - each
>> of them very deeply inlined and almost branch-free. Of course this
>> generates 25 kB of highly redundant code - but factoring all the
>> configuration tests out of the loop gave a 20-40% boost in
>> performance.
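>>
>> To make the idiom concrete, here is a much simplified sketch of the
>> pattern (this is not the actual Clouds code - the names and the
>> envelope handling are invented): the configuration is passed as
>> template arguments, so the tests inside the loop are compile-time
>> constants and every instantiation collapses into a tight, branch-free
>> loop.
>>
>> #include <cstddef>
>>
>> enum class Resolution { k8Bit, k16Bit };
>>
>> template <int num_channels, Resolution resolution, bool use_envelope>
>> void RenderGrain(const void* samples, float* out, size_t size, float gain) {
>>   for (size_t i = 0; i < size; ++i) {
>>     float s;
>>     if (resolution == Resolution::k16Bit) {  // compile-time constant...
>>       s = static_cast<const short*>(samples)[i] / 32768.0f;
>>     } else {
>>       s = static_cast<const signed char*>(samples)[i] / 128.0f;
>>     }
>>     if (use_envelope) {                      // ...so these branches vanish.
>>       s *= gain;
>>     }
>>     for (int c = 0; c < num_channels; ++c) {
>>       out[i * num_channels + c] += s;
>>     }
>>   }
>> }
>>
>> // The runtime dispatch happens once, outside the loop:
>> void Render(bool stereo, bool sixteen_bit,
>>             const void* samples, float* out, size_t size) {
>>   if (stereo && sixteen_bit) {
>>     RenderGrain<2, Resolution::k16Bit, true>(samples, out, size, 0.5f);
>>   } else if (!stereo && sixteen_bit) {
>>     RenderGrain<1, Resolution::k16Bit, true>(samples, out, size, 0.5f);
>>   }
>>   // ... and so on for the remaining combinations.
>> }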
>>
>> I used a similar technique to write reusable oscillator code that is
>> configured at compile time (sketched from this:
>> https://github.com/pichenettes/polyblep/blob/master/dsp/oscillator.h#L193)
>> - you can configure whether parameters like frequency or waveshape are
>> provided at audio rate, linearly interpolated from control rate, or
>> set once per block at control rate. In the context of the development
>> of a module, you can quickly check which variant is feasible with your
>> computational budget - you get generic code that is "flattened" for
>> each project you use it in. Or you can include two variants, one
>> suitable for a higher quality, lower polyphony mode.
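>>
>> As a hypothetical sketch of the same idea (invented names, not the
>> actual polyblep code): whether the frequency is read at audio rate or
>> once per block is a template argument, so each variant is compiled as
>> its own flattened loop.
>>
>> #include <cstddef>
>> #include <cmath>
>>
>> enum class FrequencyUpdate { kPerSample, kPerBlock };
>>
>> template <FrequencyUpdate update>
>> void RenderSine(const float* frequency, float* out, size_t size,
>>                 float* phase) {
>>   float f = frequency[0];                        // per-block: read once
>>   for (size_t i = 0; i < size; ++i) {
>>     if (update == FrequencyUpdate::kPerSample) {
>>       f = frequency[i];                          // audio rate: read every sample
>>     }
>>     *phase += f;
>>     if (*phase >= 1.0f) *phase -= 1.0f;
>>     out[i] = std::sin(2.0f * 3.14159265f * *phase);
>>   }
>> }
>>
>> // Picking a variant is a one-line change per project:
>> // RenderSine<FrequencyUpdate::kPerBlock>(freq, out, block_size, &phase);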
>>
>> Another nice application is FFTs. Chip manufacturers give you these
>> nice little fft libraries in which the FFT size is an argument. But do
>> you often write programs in which the FFT size is decided only at run
>> time? If the FFT size is known at compile time, you can write your FFT
>> code as a template
>> (http://www.drdobbs.com/cpp/a-simple-and-efficient-fft-implementatio/199500857),
>> and the whole structure of the computation (including the root of
>> unity constants) can be worked out at compile time and put right there
>> in the code, giving a gigantic branch-free blob of computation that can
>> be unrolled and reorganized to the compiler's will. Obviously, the code
>> size will suffer, but it can be worth it in terms of run time. I have
>> an upcoming project in which I use a similar approach for the
>> compile-time generation and "unrolling" of sample rate conversion code
>> (polyphase FIR). Once all the computations were unrolled, the compiler
>> found an efficient way of working through several adjacent samples at
>> once using batched loads/stores that I wouldn't have figured out myself.
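>>
>> For illustration, a bare-bones sketch of the "FFT size as a template
>> parameter" idea (this is not the implementation from the article above
>> - that one also generates the twiddle factors at compile time through
>> series expansions, whereas here they still come from std::polar; N
>> must be a power of two): because the recursion is over template
>> instantiations and every loop bound is a constant, the compiler is
>> free to inline and unroll the whole transform.
>>
>> #include <array>
>> #include <complex>
>> #include <cstddef>
>>
>> template <size_t N>
>> struct Fft {
>>   static void Transform(std::complex<float>* data) {
>>     // Radix-2 decimation in time: split, recurse, combine.
>>     std::array<std::complex<float>, N / 2> even, odd;
>>     for (size_t i = 0; i < N / 2; ++i) {
>>       even[i] = data[2 * i];
>>       odd[i] = data[2 * i + 1];
>>     }
>>     Fft<N / 2>::Transform(even.data());
>>     Fft<N / 2>::Transform(odd.data());
>>     for (size_t k = 0; k < N / 2; ++k) {
>>       const std::complex<float> w =
>>           std::polar(1.0f, -2.0f * 3.14159265f * k / N) * odd[k];
>>       data[k] = even[k] + w;
>>       data[k + N / 2] = even[k] - w;
>>     }
>>   }
>> };
>>
>> template <>
>> struct Fft<1> {
>>   static void Transform(std::complex<float>*) {}  // size-1 FFT: no-op
>> };
>>
>> // Usage: Fft<512>::Transform(buffer);  // buffer: std::complex<float>[512]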
>>
>> Some of these tricks might be doable in hand-written assembly with a
>> lot of patience (or with python/perl scripts generating the assembly
>> code) - but to me it's much more convenient to do everything in a
>> single language... that I can compile, test, debug and instrument to
>> my heart's content on my workstation rather than on the target device.
>> It's really worth the price of having to occasionally peek at the
>> generated assembly code to make sure that the compiler gets a good
>> grasp of the flow of the computations the code is meant to perform.
>>
>> I really think this idiom of compile-time code
>> unrolling/specialization is worth knowing for signal processing
>> applications - embedded or not!
>>
>> Olivier
>> _______________________________________________
>> Synth-diy mailing list
>> Synth-diy at dropmix.xs4all.nl
>> http://dropmix.xs4all.nl/mailman/listinfo/synth-diy
--
http://www.njohnson.co.uk