<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix"><br>
      Mike Bryant:<br>
    </div>
    <blockquote
      cite="mid:8ed749595527457aa166c55c5268b16a@futurehorizons.com"
      type="cite">
      <div class="WordSection1">
        <span style="color:#1F497D">> </span>E.g. "<span
style="font-family:'Arial','sans-serif';color:#121212;background:white">For
          the vast majority of benchmarks the LLVM Clang vs. GCC
          performance was quite close"</span><br>
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-family:'Arial','sans-serif';color:#1F497D;background:white">>
          </span><span
style="font-family:'Arial','sans-serif';color:#121212;background:white">and
          </span>"<span
style="font-family:'Arial','sans-serif';color:#121212;background:white">The
            NCNN neural network inference library from Tencent was
            performing hugely better when built under the GCC compiler"</span><br>
          <span style="color:#1F497D">> </span><a
            moz-do-not-send="true"
href="https://www.phoronix.com/scan.php?page=article&item=apple-m1-compilers">https://www.phoronix.com/scan.php?page=article&item=apple-m1-compilers</a></p>
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">I’m
            afraid I see this sort of comment all the time. Since 2014,
            Apple, ARM, Google, IBM et al. have poured a fortune into
            developing LLVM, as development of gcc had gone down a bit of
            a dead end. Comments that they are ‘quite close’ tend to
            come from people using the free version of Clang, which is
            years out of date, rather than the paid-for versions such as
            ARM Compiler or Xcode, which are two generations newer with
            many improvements. And as GNU simply copied some of the
            optimisations of the LLVM project into gcc without even
            referencing those copies, it is quite possible that future
            improvements may never be introduced into the open source
            versions of LLVM, or will at least be kept back a few
            generations.</span></p>
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">In
            our applications I generally see gains of anything from 8% for
            general logic (an example you can try is the Circle RTOS for
            Raspberry Pis) to 34% for DSP functions (which is our main
            area of expertise, so definitely application-specific, but far
            too large a gain to throw away).</span></p>
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">However,
            I have one particular routine for a high-speed multiplexed
            digital drop and accumulate/insert bus where the improvement
            is over 200%, as no matter what -Ox I applied, gcc simply
            couldn’t see an obvious register optimisation that would
            stop it using the stack.</span></p>
      </div>
    </blockquote>
    <br>
    Interesting!<br>
    <br>
    <blockquote
      cite="mid:8ed749595527457aa166c55c5268b16a@futurehorizons.com"
      type="cite">
      <div class="WordSection1">
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">You
            can force it to use registers by setting -O0, but then it
            doesn’t optimise the logic, as you can’t switch optimisation
            levels inside functions.</span></p>
      </div>
    </blockquote>
    I guess one could refactor the code so that the bit that needs this
    trick becomes an inline function tagged with
    __attribute__((optimize("O0")))?<br>
    I've never tried what happens there with inlines, i.e. whether the
    optimization level of the surrounding function overrides the
    attribute. But just throwing this at GCC, it doesn't complain about
    the attribute being moot, as it tends to do when you do pointless
    stuff.<br>
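    <br>
    A minimal sketch of that refactoring, under loud assumptions: the
    names insert_sample and accumulate_bus and the arithmetic are made
    up for illustration and have nothing to do with the actual bus
    routine. Since the open question is whether inlining would override
    the per-function level, the sketch sidesteps it by also marking the
    helper noinline, at the cost of a real call per sample. As far as I
    know Clang does not implement the optimize attribute, so the macro
    is only enabled for GCC, and GCC's own documentation describes the
    attribute as meant for debugging rather than production code.<br>

```c
/* GCC-only: compile one helper at -O0 while the rest of the file uses
   whatever -Ox was given on the command line. Other compilers get an
   empty macro, so the code still builds there (without the trick). */
#if defined(__GNUC__) && !defined(__clang__)
#define UNOPTIMIZED __attribute__((optimize("O0"), noinline))
#else
#define UNOPTIMIZED
#endif

/* Hypothetical stand-in for "the bit that needs this trick". */
UNOPTIMIZED
static unsigned insert_sample(unsigned acc, unsigned sample)
{
    return acc + sample;
}

/* Hypothetical surrounding routine, optimised at the normal level. */
unsigned accumulate_bus(const unsigned *samples, unsigned n)
{
    unsigned acc = 0;
    for (unsigned i = 0; i < n; i++)
        acc = insert_sample(acc, samples[i]);
    return acc;
}
```

    Comparing the assembly from gcc -O2 -S with and without the
    attribute would show whether the helper really ends up compiled
    differently from its caller.<br>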
    <br>
    <blockquote
      cite="mid:8ed749595527457aa166c55c5268b16a@futurehorizons.com"
      type="cite">
      <div class="WordSection1">
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">But
            ARM Compiler saw the obvious straight away. This is the
            routine I’m cutting and pasting into RP2040 code.</span></p>
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">Since
            about 2017, I’ve only found one thing that was better with
            gcc, and that was a custom Linux build we used for an audio
            analyser product, which of course is the thing gcc is
            optimised for and on which it is tested. But Android, which
            is based on Linux of course, has been developed using ARM
            Compiler since 2014 and is usually compiled with it or
            Clang. Similarly with macOS and iOS.</span></p>
        <p class="MsoNormal" style="margin-bottom:12.0pt"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">The
            thing that amazes me is that people spend ages trying to
            overclock processors using expensive cooling systems to get
            maybe a 10% improvement, yet ignore similar gains available
            by just getting a decent compiler.</span></p>
      </div>
    </blockquote>
    <br>
    I guess some people don't like being dependent on the ever-changing
    terms (read: the whims) of commercial tool vendors when that is
    avoidable.<br>
    Especially smaller outfits.<br>
    <br>
    <br>
  </body>
</html>