SPRAC21A June 2016 – June 2019 OMAP-L132, OMAP-L138, TDA2E, TDA2EG-17, TDA2HF, TDA2HG, TDA2HV, TDA2LF, TDA2P-ABZ, TDA2P-ACD, TDA2SA, TDA2SG, TDA2SX, TDA3LA, TDA3LX, TDA3MA, TDA3MD, TDA3MV
The C code for the pipeline copy function is:
extern volatile float wrDuration;

void pipeline_copy(int byte_cnt)
{
    long long *restrict dst = (long long *)ext_buf[1];
    long long *restrict src = (long long *)ext_buf[0];
    unsigned int wrStartTime, wrStopTime;
    int i;

    /* Tell the compiler both buffers are 8-byte aligned so it can use LDDW/STDW */
    _nassert((int)dst % 8 == 0);
    _nassert((int)src % 8 == 0);

    wrStartTime = CSL_tscRead();
    for (i = 0; i < byte_cnt / 8; i++) {
        dst[i] = src[i];    /* one 64-bit load and one 64-bit store per iteration */
    }
    WBINVALIDATE            /* cache writeback-invalidate macro, defined elsewhere */
    wrStopTime = CSL_tscRead();
    wrDuration = (float)(wrStopTime - wrStartTime) / (DSP_FREQ / 1000);
}
The compiler reports the following software pipeline information and scheduled iteration for this loop:
;*----------------------------------------------------------------------------*
;*   SOFTWARE PIPELINE INFORMATION
;*
;*      Loop found in file               : ../pipeline_loop.c
;*      Loop source line                 : 37
;*      Loop opening brace source line   : 44
;*      Loop closing brace source line   : 46
;*      Known Minimum Trip Count         : 1
;*      Known Max Trip Count Factor      : 1
;*      Loop Carried Dependency Bound(^) : 0
;*      Unpartitioned Resource Bound     : 1
;*      Partitioned Resource Bound(*)    : 1
;*      Resource Partition:
;*                                A-side   B-side
;*      .L units                     0        0
;*      .S units                     0        0
;*      .D units                     1*       1*
;*      .M units                     0        0
;*      .X cross paths               0        1*
;*      .T address paths             1*       1*
;*      Long read paths              0        0
;*      Long write paths             0        0
;*      Logical ops (.LS)            0        1     (.L or .S unit)
;*      Addition ops (.LSD)          0        0     (.L or .S or .D unit)
;*      Bound(.L .S .LS)             0        1*
;*      Bound(.L .S .D .LS .LSD)     1*       1*
;*
;*      Searching for software pipeline schedule at ...
;*         ii = 1  Schedule found with 7 iterations in parallel
;*----------------------------------------------------------------------------*
;*        SINGLE SCHEDULED ITERATION
;*
;*        $C$C330:
;*   0              LDDW    .D1T1   *A3++,A5:A4       ; |45|
;*   1              NOP             4
;*   5              DADD    .L2X    0,A5:A4,B5:B4     ; |45| Define a twin register
;*   6              STDW    .D2T2   B5:B4,*B6++       ; |45|
;*       ||         SPBR            $C$C330
;*   7              ; BRANCHCC OCCURS {$C$C330}       ; |37|
;*----------------------------------------------------------------------------*
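Figure 20 shows the resulting pipeline. With an iteration interval (ii) of 1, a new iteration starts every cycle, and each iteration takes 7 cycles from its LDDW to its STDW, so 7 iterations are in flight at once. Once the loop prologue (pipe up) completes, the code keeps both 64-bit paths busy, one load on .D1 and one store on .D2, every cycle until the loop begins to pipe down.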