SPRUIG3C January 2018 – August 2019 TDA4VM, TDA4VM-Q1
Any loads or stores that remain after exhausting the SE or SA resources are generated using
normal indexed addressing. The migration tool defines a variable corresponding to the Agen
whose type is __agen, which is a typedef for int. The agen variable represents the address
offset in bytes.
An access with base address b and agen a is generated as the following C expression:
*(<type>*)((char*)b + a)
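As a minimal sketch of the expression above, the following helpers apply a byte offset to a base pointer and then access the result as a typed element. The names agen_t, load_i32, and store_i32 are hypothetical and chosen for illustration only; agen_t stands in for the tool's __agen typedef, and the sketch assumes the offset keeps the access suitably aligned for the element type.

```c
#include <stdint.h>

/* Stand-in for the tool's __agen typedef (hypothetical name for this sketch). */
typedef int agen_t;

/* Load a 32-bit element located a bytes past base address b.
 * Assumes a is a multiple of sizeof(int32_t) so the access is aligned. */
static int32_t load_i32(const void *b, agen_t a)
{
    return *(const int32_t *)((const char *)b + a);
}

/* Store a 32-bit element a bytes past base address b (same assumption). */
static void store_i32(void *b, agen_t a, int32_t v)
{
    *(int32_t *)((char *)b + a) = v;
}
```

Because the offset is in bytes, stepping to the next 32-bit element means adding sizeof(int32_t) to the agen rather than 1.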
This generally results in no overhead for the access itself, because the compiler typically
keeps both b and a in registers for the duration of the loop nest, so the expression compiles
to a simple indirect operand such as *Rega[Regb].
The overhead results from having to update the agen as the loops iterate. These updates
generally involve adding a constant term at each loop level corresponding to the stride at
that level. For each level outside the innermost level, the stride is adjusted so as to rewind
the offset accumulated by the inner level. The agen adjustment values are computed in the
init() function from the coefficients in the Agen expressions and the trip counts, and are
stored in the tvals structure.
For example, the following code shows the addressing operations generated for a loop translated using indexed addressing:
__agen A0, A1, A2;                          // typedef int __agen
for (I1 ...)
{
    for (I2 ...)
    {
        for (I3 ...)
        {
            for (I4 ...)
            {
                Vreg0 = *(tvals->p4 + A0);  // load using A0
                Vreg1 = *(tvals->p5 + A1);  // load using A1
                A0 += 2;                    // update A0
                A1 += 2;                    // update A1
            }
            A0 += tvals->p8;                // outer loop updates
            A1 += tvals->p11;
        }
        *(tvals->p7 + A2) = Vdst;           // store using A2
        A0 += tvals->p9;                    // outer loop updates
        A1 += tvals->p12;
        A2 += 16;
    }
    A0 += tvals->p10;                       // outer loop updates
    A2 += tvals->p13;
}
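The rewind adjustment can be made concrete with a small sketch. It assumes each level has a byte stride and a trip count, and that the adjustment for a level is its own stride minus the total advance of the level directly inside it. The names compute_agen_adjust and check_offsets, and the fixed three-level nest, are hypothetical; the tool's actual init() code and tvals layout are not shown in this section.

```c
#define LEVELS 3

/* Hypothetical helper: per-level agen adjustments in bytes.
 * Level 0 is outermost, level LEVELS-1 is innermost. */
static void compute_agen_adjust(const int stride[LEVELS],
                                const int trip[LEVELS],
                                int adj[LEVELS])
{
    /* The innermost level simply advances by its own stride. */
    adj[LEVELS - 1] = stride[LEVELS - 1];

    /* Each outer level advances by its stride, minus the net distance
     * the next-inner level covered over all of its iterations. */
    for (int l = LEVELS - 2; l >= 0; --l)
        adj[l] = stride[l] - trip[l + 1] * stride[l + 1];
}

/* Walk the nest using only the adjustments and count how often the running
 * offset differs from the directly computed offset i*s0 + j*s1 + k*s2. */
static int check_offsets(const int stride[LEVELS], const int trip[LEVELS])
{
    int adj[LEVELS];
    int mismatches = 0;
    int a = 0; /* running agen value, in bytes */

    compute_agen_adjust(stride, trip, adj);
    for (int i = 0; i < trip[0]; ++i) {
        for (int j = 0; j < trip[1]; ++j) {
            for (int k = 0; k < trip[2]; ++k) {
                int expected = i * stride[0] + j * stride[1] + k * stride[2];
                if (a != expected)
                    ++mismatches;
                a += adj[2];
            }
            a += adj[1];
        }
        a += adj[0];
    }
    return mismatches;
}
```

With strides {64, 16, 2} and trip counts {2, 3, 4}, the adjustments come out as {16, 8, 2}: the innermost loop advances 4 * 2 = 8 bytes, so the middle level adds only 16 - 8 = 8 to reach the next 16-byte step.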
The agen updates are generated as statements of the form a += tvals->pN, positioned at the
end of the loop at the appropriate level. Each generally compiles to a single ADD instruction,
but because the updates appear in outer loops, they can hamper loop collapsing. For loops
collapsed with NLC, the NLC-supplied predicate can be used to predicate any agen adjustments
in the outer loop. The total overhead of the agen in this case is generally two instructions:
the instruction to fetch the predicate from the NLC, and the (predicated) add instruction to
update the agen. For loops not collapsed with NLC, the overhead is higher, because the
presence of the agen update in the outer loop usually prevents collapsing, necessitating
explicit loop control for that loop level.
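The effect of predicating the outer update can be illustrated in plain C. This is only a model: the predicate is computed here from a software counter, whereas on the device it would be supplied by the NLC, and no actual NLC intrinsics or instructions are used. The function names and parameters are hypothetical.

```c
/* Nested form: the outer adjustment sits after the inner loop. */
static int walk_nested(int n_outer, int n_inner,
                       int inner_step, int outer_adj, int *out)
{
    int a = 0, n = 0;
    for (int i = 0; i < n_outer; ++i) {
        for (int j = 0; j < n_inner; ++j) {
            out[n++] = a;      /* record the offset used by this access */
            a += inner_step;   /* innermost agen update */
        }
        a += outer_adj;        /* outer-level agen update */
    }
    return n;
}

/* Collapsed form: one loop, with the outer update applied under a predicate
 * that is true on the last iteration of each inner pass. */
static int walk_collapsed(int n_outer, int n_inner,
                          int inner_step, int outer_adj, int *out)
{
    int a = 0, n = 0, j = 0;
    int total = n_outer * n_inner;
    for (int t = 0; t < total; ++t) {
        int last_inner = (j == n_inner - 1); /* models fetching the predicate */
        out[n++] = a;
        a += inner_step;
        if (last_inner)
            a += outer_adj;                  /* models the predicated add */
        j = last_inner ? 0 : j + 1;
    }
    return n;
}
```

Both functions produce the same sequence of offsets; the collapsed version replaces the outer loop's control with one predicate computation and one conditional add per iteration, which mirrors the two-instruction overhead described above.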