Yahoo Groups archive

Lpc2000

Index last updated: 2026-04-28 23:31 UTC

Message

Re: A tale of timing and short pulses

2004-02-07 by Richard

Robert,
    Could you explain why this is a race condition in the timer?  I too 
am trying to absorb what you are describing.  Excellent description, I 
could feel the hair pulling, especially with the pulse behind the 
graticule!  Ugh!  

Richard

--- In lpc2100@yahoogroups.com, Robert Adsett <subscriptions@a...> 
wrote:
> I just spent a day tracing down a problem.  I thought I'd share the, umm,
> adventure, in the hopes that the information might be useful (or
> entertaining in a "thank god it happened to him not me" way).
> 
> I've been updating my timing routines and testing and documenting
> them.  One of the final steps was to test the one-wire communication
> library I've been working on.  Now this library is a port of the Dallas
> library adapted for the LPC, and they've just updated it, so I thought I'd
> bring over the changes at the same time.  That was probably a mistake.
> 
> Transferring the changes went without too many problems.  (It did reveal an
> apparent bug in PC-Lint, but I haven't managed to make a small enough test
> case that shows the problem.)  The first test of the functionality went
> without problem.
> 
> Feeling confident, I moved on to the second test.  Compile, load, run.  Looks
> OK.  Run first step; that's odd, where's the output?  No output.  I figured
> something had changed in the way the routines worked or the display
> routines and I hadn't noticed.  Some time later I still hadn't found anything.
> 
> Well, next step is to add debugging prints to figure out where it stops
> behaving as expected.  Add a few and narrow it down to an area where the
> devices are being enumerated.
> 
> Add debug prints at coarse level to enumeration.  Uh oh!  It suddenly starts
> working.  Now it's looking like some sort of memory problem or timing issue.
> 
> Feeling confident in the timing routines (I'd just finished testing them
> after all), I started down the road looking for memory issues, such as
> overflow, bad pointers (especially in newly modified code) and even code
> generation.  Hours later, no progress except rather more puzzlement.
> 
> Oscilloscope time.  The puzzle deepens.  With the debug prints in, the
> enumeration process is clearly proceeding, with each device responding in
> turn.  Take out the prints and the process collapses into something that is
> not even consistent between tests.  Eeekk!
> 
> Replace debug prints with calls to WaitUs (a delay routine).  With large
> waits between enumerations (5ms) everything appears to work.  I didn't see
> why that should be necessary; time to explore further.  Shorten the delay,
> and the whole thing collapses again.  Hmm, that's pointing back to timing again.
> 
> Look more carefully at the collapse.  Things make less sense than
> before.  A wait after the enumeration appears to affect the
> enumeration, i.e. enumerate(), wait for 5ms works, but enumerate(), wait for
> 1ms fails.  The program is managing to affect events after they
> occur?  Obviously I'm missing something.  Time to stop trying to solve the
> problem and just start gathering information, something I should have
> started earlier.
> 
> Add pin toggles around the individual enumerations so that the process is
> easy to correlate on the oscilloscope.  Looks right on the
> oscilloscope.  The enumeration bit pattern pauses at the pause between
> enumerations.
> 
> Remove the delay between enumerations.  It still works.  What the...!  The
> pin toggle appears not to happen between enumerations.  After a few more
> tests and fooling around I find it.  The pin toggle is just narrow enough
> to hide behind an oscilloscope graticule.
> 
> Remove the toggle between the enumerations.  It collapses again.  Good, at
> least it hasn't just disappeared on me.
> 
> Now, let's zoom in on the enumeration and see if I can't find out exactly
> where it's breaking down.  Gather up code to trace along the path as I
> manually decode.  (Note that each bit signal is 80-90uS long and there are
> 70-odd bits, so seeing each bit does involve going to a higher level of
> detail, and as it turns out the devil is in the detail.)  Repeat several
> measurements at higher resolution; the first half dozen or so bits are not
> consistently repeatable, and they seem to fall into two patterns.  Ahh, now
> I'm getting somewhere.
> 
> Freeze a sample and look in greater detail (thank heavens for DSOs).  Reset
> OK, bit 1 OK, bit 2 OK, bit 3 OK, bit 4 OK.  Wait a minute, what's that
> glitch doing there and why is bit 4 so much longer than the others?
> 
> Take a break.  An important part of the process, probably too long delayed.
> 
> Come back to the problem.  Take several measurements.  Save the first one as a
> reference and take others until I have one that is different from the first
> for comparison.  Hmm, the two measurements are very close.  Where I have a
> glitch on the first one, I have a full rise on my second
> measurement.  Check the timing, and indeed the glitch occurs where the
> output should normally rise, and the extended low period covers the low
> period for two bits.  It appears something is killing the output of the
> timer's match register as it occurs.
> 
> Check, read code, check documentation.  Some time later I notice the
> following sequence and realize its possible significance:
> 
>    - Set output low
>    - Set timer match to set output high in 75 uS
>    - Wait 15 uS
>    - Sample input
>    - Wait 60 uS
>    - Disable timer match (1)
>    - Wait additional time
> 
> The disable at (1) is there to prevent the match from occurring on wrap
> around after it's no longer needed.  Unfortunately, if the waits are
> accurate (and I just spent a fair amount of time ensuring they were), that
> disable happens at approximately the same time that the match actually
> occurs and cancels it just as it starts to affect the output.  It appears
> there is a race condition in the timer module.  The fix is simple
> enough: move the disable to later in the process.
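
To make the suspected race concrete, here is a minimal sketch in C that
models the sequence above as a small simulation.  It is not LPC2000 register
code: sim_timer, sim_tick and sim_bit_slot are hypothetical names, time is
counted in whole microseconds, and the rule that a disable landing in the
same tick as the match suppresses the rise is an assumption taken from the
behaviour described in the post, not from the timer documentation.  The 30nS
glitch is simplified to "the pin stays low".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one timer match channel driving an output pin.
   This is NOT the LPC2000 register interface; all names are illustrative. */
typedef struct {
    int  match_at_us;    /* time at which the match should set the pin high */
    bool match_enabled;  /* cleared by the "disable timer match" step */
    bool pin_high;       /* current state of the output pin */
} sim_timer;

/* Advance the model to time now_us.
   Assumption being modelled: if software disables the match in the same
   tick in which the match fires, the disable wins and the pin stays low. */
static void sim_tick(sim_timer *t, int now_us, int disable_at_us)
{
    if (now_us == disable_at_us)
        t->match_enabled = false;
    if (t->match_enabled && now_us == t->match_at_us)
        t->pin_high = true;
}

/* Run one bit slot: pin driven low at t = 0, match armed for match_us,
   software disables the match at disable_us, slot ends at slot_us.
   Returns true if the pin ended the slot high. */
bool sim_bit_slot(int match_us, int disable_us, int slot_us)
{
    assert(match_us >= 0 && disable_us >= 0 && slot_us >= 0);
    sim_timer t = { match_us, true, false };
    for (int now = 0; now <= slot_us; now++)
        sim_tick(&t, now, disable_us);
    return t.pin_high;
}
```

Under these assumptions, sim_bit_slot(75, 75, 90) returns false (the disable
coincides with the match, so the rise is lost), sim_bit_slot(75, 80, 90)
returns true (the disable moved later, as in the fix), and
sim_bit_slot(75, 70, 90) returns false (a shortened wait removes the rise
entirely).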
> 
> I don't think I could hit this consistently if I tried (I expect the timing
> would change if you looked at it crosswise), so why am I (relatively)
> confident that this is the source of the problem?  A couple of
> reasons.  The sensitivity of the effect to code changes: almost any change
> in code position or timing eliminates the problem.  Changing the timing
> using the fix suggested works, as does adding time to the waits before the
> disable.  Also, the glitch coming out of the CPU is only 30nS wide, and I've
> not run across anything in the micro that can produce a pulse that narrow
> under normal circumstances (hmm, if that could be made repeatable it might
> be useful).  Finally, reducing the wait time in the sequence above
> eliminates the rise entirely.
> 
> I don't suppose anyone can confirm that the timer has this apparent
> glitch?  I know whenever I see an issue like that mentioned I wonder "how
> did anyone ever run across that?".  Well, now I know :)
> 
> Robert
> 
> 
> " 'Freedom' has no meaning of itself.  There are always restrictions,
> be they legal, genetic, or physical.  If you don't believe me, try to
> chew a radio signal. "
> 
>                          Kelvin Throop, III
