Hi Robert,

I read through your email and tried to soak it all up. I haven't absorbed it
100% yet, but this way I'm sure that if I'm ever stuck with a pr*ck of a thing
like this, it'll ring a bell, and I'll study it more deeply :-)

I like the "hmm if that could be made repeatable it might be useful" !!!!!!!!
30 ns programmable pulses - Yippie !!!

OK, I think this again clearly confirms the definition of "TRIVIAL":
"Trivial - means you've found the problem ......"

Best regards,
Kris

----- Original Message -----
From: "Robert Adsett" <subscriptions@...>
To: <lpc2100@yahoogroups.com>
Sent: Saturday, February 07, 2004 3:16 AM
Subject: [lpc2100] A tale of timing and short pulses

> I just spent a day tracing down a problem. I thought I'd share the, umm,
> adventure, in the hopes that the information might be useful (or
> entertaining in a "thank god it happened to him not me" way).
>
> I've been updating my timing routines and testing and documenting
> them. One of the final steps was to test the one-wire communication
> library I've been working on. Now this library is a port of the Dallas
> library adapted for the LPC, and they've just updated it, so I thought I'd
> bring over the changes at the same time. That was probably a mistake.
>
> Transferring the changes went without too many problems. (It did reveal an
> apparent bug in PC-Lint, but I haven't managed to make a small enough test
> case that shows the problem.) The first test of the functionality went
> without problem.
>
> Feeling confident, I moved on to the second test. Compile, load, run. Looks
> OK. Run first step; that's odd, where's the output? No output. I figured
> something had changed in the way the routines worked or the display
> routines and I hadn't noticed. Some time later and I hadn't found anything.
>
> Well, next step is to add debugging prints to figure out where it stops
> behaving as expected. Add a few and narrow it down to an area where the
> devices are being enumerated.
>
> Add debug prints at a coarse level to the enumeration. Uh oh! It suddenly
> starts working. Now it's looking like some sort of memory problem or
> timing issue.
>
> Feeling confident in the timing routines (I'd just finished testing them,
> after all), I started down the road looking for memory issues, such as
> overflow, bad pointers (especially in newly modified code) and even code
> generation. Hours later, no progress except rather more puzzlement.
>
> Oscilloscope time. The puzzle deepens. With the debug prints in, the
> enumeration process is clearly proceeding, with each device responding in
> turn. Take out the prints and the process collapses into something that is
> not even consistent between tests. Eeekk!
>
> Replace debug prints with calls to WaitUs (a delay routine). With large
> waits between enumerations (5 ms) everything appears to work. I didn't see
> why that should be necessary; time to explore further. Shorten the delay,
> the whole thing collapses again. Hmm, that's pointing back to timing again.
>
> Look more carefully at the collapse. Things make less sense than
> before. A wait after the enumeration appears to affect the
> enumeration. I.e. enumerate(), wait for 5 ms works, but enumerate(), wait
> for 1 ms fails. The program is managing to affect events after they
> occur? Obviously I'm missing something. Time to stop trying to solve the
> problem and just start gathering information, something I should have
> started earlier.
>
> Add pin toggles around the individual enumerations so that the process is
> easy to correlate on the oscilloscope. Looks right on the
> oscilloscope. The enumeration bit pattern pauses at the pause between
> enumerations.
>
> Remove the delay between enumerations. It still works. What the...! The
> pin toggle appears not to happen between enumerations. After a few more
> tests and fooling around I find it. The pin toggle is just narrow enough
> to hide behind an oscilloscope graticule.
>
> Remove the toggle between the enumerations. It collapses again. Good, at
> least it hasn't just disappeared on me.
>
> Now, let's zoom in on the enumeration and see if I can't find out exactly
> where it's breaking down. Gather up code to trace along the path as I
> manually decode. (Note that each bit signal is 80-90 us long and there are
> 70-odd bits, so seeing each bit does involve going to a higher level of
> detail, and as it turns out the devil is in the detail.) Repeat several
> measurements at higher resolution. The first 1/2 dozen or so bits are not
> consistently repeatable; they seem to fall into two patterns. Ahh, now I'm
> getting somewhere.
>
> Freeze a sample and look in greater detail (thank heavens for DSOs). Reset
> OK, bit 1 OK, bit 2 OK, bit 3 OK, bit 4 OK. Wait a minute, what's that
> glitch doing there, and why is bit 4 so much longer than the others?
>
> Take a break. An important part of the process, probably too long delayed.
>
> Come back to the problem. Take several measurements. Save the first one as
> a reference and take others until I have one that is different from the
> first for comparison. Hmm, the two measurements are very close. Where I
> have a glitch on the first one, I have a full rise on my second
> measurement. Check the timing, and indeed the glitch occurs where the
> output should normally rise, and the extended low period covers the low
> period for two bits. It appears something is killing the output of the
> timer's match register as it occurs.
>
> Check, read code, check documentation. Some time later I notice the
> following sequence and realize its possible significance:
>
> - Set output low
> - Set timer match to set output high in 75 us
> - Wait 15 us
> - Sample input
> - Wait 60 us
> - Disable timer match (1)
> - Wait additional time
>
> The disable at (1) is there to prevent the match from occurring on wrap
> around after it's no longer needed.
> Unfortunately, if the waits are
> accurate (and I just spent a fair amount of time ensuring they were), that
> disable happens at approximately the same time that the match actually
> occurs and cancels it just as it starts to affect the output. It appears
> there is a race condition in the timer module. The fix is simple
> enough. Move the disable to later in the process.
>
> I don't think I could hit this consistently if I tried (I expect the timing
> would change if you looked at it crosswise), so why am I (relatively)
> confident that this is the source of the problem? A couple of
> reasons. The sensitivity of the effect to code changes: almost any change
> in code position or timing eliminates the problem. Changing the timing
> using the fix suggested works, as does adding time to the waits before the
> disable. Also, the glitch coming out of the CPU is only 30 ns wide, and I've
> not run across anything in the micro that can produce a pulse that narrow
> under normal circumstances (hmm if that could be made repeatable it might
> be useful). Finally, reducing the wait time in the sequence above
> eliminates the rise entirely.
>
> I don't suppose anyone can confirm that the timer has this apparent
> glitch? I know whenever I see an issue like that mentioned I wonder "how
> did anyone ever run across that?". Well, now I know :)
>
> Robert
>
> " 'Freedom' has no meaning of itself. There are always restrictions,
> be they legal, genetic, or physical. If you don't believe me, try to
> chew a radio signal. "
>
> Kelvin Throop, III
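
For anyone who wants to play with the ordering Robert lists, here is a minimal sketch of that sequence as a toy software model. It is not the LPC2100 timer interface: every name in it (SimTimer, sim_arm_match, sim_wait_us, sim_disable_match, sim_bit_slot) is made up for illustration, and the rule that a disable cancels a match still in flight is an assumption chosen to mimic what Robert saw on the scope, not confirmed hardware behaviour. The model only shows the rise being lost; it does not reproduce the 30 ns glitch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a timer match output. Hypothetical names throughout;
 * this is NOT the real LPC2100 register interface. */
typedef struct {
    uint32_t now_us;    /* model time in microseconds */
    uint32_t match_us;  /* time at which the match should raise the pin */
    bool     enabled;   /* match action armed */
    bool     pending;   /* match detected this tick, pin not yet updated */
    bool     pin_high;  /* state of the output pin */
} SimTimer;

/* "Set output low" and "Set timer match to set output high in N us". */
static void sim_arm_match(SimTimer *t, uint32_t delay_us)
{
    t->pin_high = false;
    t->match_us = t->now_us + delay_us;
    t->enabled  = true;
    t->pending  = false;
}

/* Advance model time one microsecond at a time. A match detected on one
 * tick only reaches the pin at the start of the next tick. */
static void sim_wait_us(SimTimer *t, uint32_t us)
{
    while (us-- > 0) {
        if (t->pending) {
            t->pin_high = true;
            t->pending  = false;
        }
        t->now_us++;
        t->pending = t->enabled && (t->now_us == t->match_us);
    }
}

/* ASSUMPTION: disabling the match also cancels a match still in flight. */
static void sim_disable_match(SimTimer *t)
{
    t->enabled = false;
    t->pending = false;
}

/* One bit slot following the listed sequence. Returns true if the pin
 * ends up high, i.e. the match did its job. */
bool sim_bit_slot(uint32_t second_wait_us, uint32_t extra_before_disable_us)
{
    SimTimer t = {0};
    sim_arm_match(&t, 75);                     /* output low, match in 75 us */
    sim_wait_us(&t, 15);                       /* wait 15 us */
    /* the input would be sampled here */
    sim_wait_us(&t, second_wait_us);           /* wait 60 us in the original */
    sim_wait_us(&t, extra_before_disable_us);  /* the fix: disable later */
    sim_disable_match(&t);                     /* step (1) */
    sim_wait_us(&t, 10);                       /* wait additional time */
    return t.pin_high;
}
```

In this model sim_bit_slot(60, 0) is the original ordering: the disable lands on the same tick as the match and the pin never rises. sim_bit_slot(60, 5) moves the disable later and the pin rises as intended. sim_bit_slot(50, 0) shortens the wait so the disable comes well before the match, which also leaves the pin low, in line with Robert's note that reducing the wait eliminates the rise entirely.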
Message
Re: [lpc2100] A tale of timing and short pulses
2004-02-06 by microbit