Hello,
first, many thanks for your explanation! We had a similar thread on this
mailing list on 1 Feb 2004 (subject: Optimization of capture routine...).
There, a trick was suggested: instead of doing the access once and then
jumping, do it several times in a row and only then check whether the loop is
finished (depending on what you want to do). That way you can go much higher:
> I am now at around 5.8956 MBytes/second, which is close to 5.898 MBytes/sec
> (= Fosc * 4 / 10).
> So the two operations (ldr ip, [r0, #0] and strb ip, [r2], #1) seem to
> take 10 cycles in total.
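
A minimal sketch of the trick in C, under stated assumptions: `capture_unrolled4`, `capture_rate`, the `port` pointer and the `loop_cycles` parameter are hypothetical names for illustration, not code from the earlier thread. The 10 cycles per ldr/strb pair come from the quote above; the cost of the loop test is not given there, so it is left as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capture routine: read a port register and store its low byte
 * four times per loop test, so the compare-and-branch cost is paid once per
 * four samples instead of once per sample.
 * `port` is an assumed pointer to a memory-mapped input register. */
void capture_unrolled4(volatile const uint32_t *port, uint8_t *dst, size_t count)
{
    assert(count % 4 == 0);          /* this sketch only handles multiples of 4 */
    uint8_t *end = dst + count;
    while (dst < end) {
        *dst++ = (uint8_t)*port;     /* one load + one byte store per sample */
        *dst++ = (uint8_t)*port;
        *dst++ = (uint8_t)*port;
        *dst++ = (uint8_t)*port;
    }
}

/* Bytes per second when `unroll` load/store pairs of `pair_cycles` clocks each
 * share a single loop test costing `loop_cycles` clocks (assumed value). */
double capture_rate(double cclk_hz, unsigned unroll,
                    unsigned pair_cycles, unsigned loop_cycles)
{
    assert(unroll > 0);
    return cclk_hz * unroll / ((double)unroll * pair_cycles + loop_cycles);
}
```

With `loop_cycles` set to 0 and a core clock of 58.98 MHz (the Fosc * 4 implied by the figures in the quote), `capture_rate` gives the 5.898 MBytes/sec limit; any nonzero loop cost is spread over more samples as `unroll` grows, which is why the measured rate approaches that limit.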
Regards,
Martin
----- Original Message -----
From: "philips_apps" <philips_apps@...>
To: <lpc2000@yahoogroups.com>
Sent: Wednesday, November 10, 2004 10:27 PM
Subject: [lpc2000] I/O Speed - An Explanation
>
>
> Here is an explanation of the I/O toggle speed that is observed in
> these devices.
>
> Richard
>
> The I/O speed has a maximum at ~3.7 MHz for several reasons,
> none specific to our parts. It is caused by interactions between the
> ARM pipeline, the VPB bus, the ARM AHB wrapper (interface between the
> ARM7TDMI-S core and the AHB bus), and the instruction timing itself.
> For the minimum 3-instruction loop below, a Store (Write to I/O pin)
> followed by another Store (toggle the I/O pin) and a Branch back to
> the first Store, the timing is as follows (Fe for Fetch, De for
> Decode, En for execution clock n):
>
> Pass1:
>
> STR: Fe-De-E1-E1-E2-E2-E2-E2
> STR: Fe-De----------------------------E1-E1-E2-E2-E2-E2
> B: Fe-----------------------------De----------------------E1-E1-E2-E3
>
> Pass2:
> STR: Fe-De
>
> And so on...
>
> An STR to VPB space takes 8 clocks because the last 2 phases (STR is
> a 4 phase instruction) are Non-Sequential (NS) accesses and the AHB
> wrapper adds one wait state for every NS access. This means the 3rd
> phase of the instruction takes 2 clocks, and the fourth phase takes 4
> because of the wait state and the VPB operations being 3 clocks.
>
> The second STR can be fetched and Decoded in the pipeline but will
> then stall because the execution pipeline stage is busy (the first
> Store has not completed yet). The Branch instruction can also be
> fetched in the Decode slot of the second STR but it will then stall
> because the Decode slot is occupied by the second STR.
>
> After the first STR completes, the second STR will start its
> execution phase and finally will allow the Branch instruction (which
> also has one NS phase) to proceed.
>
> End result: This takes 16 clocks (266.7 ns at 60 MHz with the VPB clock
> divider set to 1) with a duty cycle of 6:10.
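
The 16-clock figure can be checked with a little arithmetic. The sketch below is one reading of the explanation, not something the post spells out: in steady state each STR holds the execute stage for 2 + 4 clocks and the branch for 4, as the second STR and B lines of the diagram show. The function names are made up for illustration.

```c
#include <assert.h>

/* Execute-stage clocks per instruction, as read from the explanation:
 * STR = 2 (third phase) + 4 (fourth phase: 1 wait state + 3 VPB clocks),
 * B   = 4 (E1-E1-E2-E3 in the diagram). */
enum { STR_EXEC_CLOCKS = 2 + 4, BRANCH_EXEC_CLOCKS = 4 };

/* Clocks for one pass of the str / str / b loop. */
unsigned loop_clocks(void)
{
    return 2 * STR_EXEC_CLOCKS + BRANCH_EXEC_CLOCKS;
}

/* Period of the resulting square wave in nanoseconds. */
double toggle_period_ns(double cclk_hz)
{
    assert(cclk_hz > 0.0);
    return loop_clocks() * 1e9 / cclk_hz;
}

/* Frequency of the resulting square wave in MHz. */
double toggle_freq_mhz(double cclk_hz)
{
    assert(cclk_hz > 0.0);
    return cclk_hz / loop_clocks() / 1e6;
}
```

At a 60 MHz core clock this gives 16 clocks, about 266.7 ns per period, and 3.75 MHz, in line with the "~3.7 MHz" maximum quoted at the top of the explanation.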
>
>
> Code:
>
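> .loop:
>     str r2, [r7, #0]    @ first store (write to I/O pin)
>     str r2, [r6, #0]    @ second store (toggle the I/O pin)
>     b .loop             @ branch back to the first store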
>
Re: [lpc2000] I/O Speed - An Explanation
2004-11-11 by capiman@t-online.de