SPRUIG3C January 2018 – August 2019 TDA4VM, TDA4VM-Q1
VCOP memory is arranged as lines of 8 32-bit banks, for a total of 256 bits per line. This layout complements the 8-way SIMD width of table lookup operations: each SIMD lane corresponds to a lookup in one memory bank. The table layout in VCOP memory reflects this architecture.
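The bank arrangement described above can be illustrated with a small C sketch. It assumes that consecutive 32-bit words are placed in consecutive banks (word interleaving), which the text implies but does not state outright. The `bank_pos` type and `locate_word` helper are hypothetical names introduced only for this illustration; they are not part of any VCOP or C7000 header.

```c
/* From the text: each memory line holds 8 banks of 32 bits (256 bits). */
#define NUM_BANKS 8
#define BANK_BITS 32
#define LINE_BITS (NUM_BANKS * BANK_BITS)

/* Hypothetical helper type: position of a 32-bit word in banked memory. */
typedef struct {
    unsigned line; /* which 256-bit line the word falls on */
    unsigned bank; /* which of the 8 banks holds the word  */
} bank_pos;

/*
 * Map a 32-bit word index to its line and bank.
 * Assumption: consecutive words are placed in consecutive banks,
 * wrapping to the next line after bank 7.
 */
static bank_pos locate_word(unsigned word_index)
{
    bank_pos p;
    p.line = word_index / NUM_BANKS;
    p.bank = word_index % NUM_BANKS;
    return p;
}
```

Under this assumption, words 0 through 7 occupy banks 0 through 7 of line 0, and word 8 wraps to bank 0 of line 1. Eight lookups that each target a different bank can therefore be serviced from one line-wide access, one per SIMD lane.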
The table layout cannot change between 8-way and 16-way SIMD. The producer of the table is independent of the consumer, so the producer need not be aware of whether the table will be read using 8-way SIMD or 16-way SIMD. So, for 16-way SIMD, only the lookup operation itself changes; the source table layout in L2 does not.
These considerations result in the following default scheme to support 16-way lookup and histogram:
- The table layout in L2 is determined by the num_tables parameter of the _LOOKUP directive. This layout is dictated by VCOP’s memory architecture and num_tables, and is independent of the SIMD width that will be used to read the table.
- In 16-way SIMD, the number of lanes used by the lookup operation is twice num_tables. For example, if num_tables==8, the lookup operation uses 16 lanes. If num_tables==2, the lookup operation uses 4 lanes.
- [...] num_tables).
A few considerations apply to this scheme.
- [...] VCOP_SIMD_WIDTH, rather than being hard-coded or, as is more commonly the case, in terms of num_tables.
This scheme is the default, but it is not beneficial for all lookup operations. For this reason, _LOOKUP has been given an additional parameter, duplication, to control this table duplication behavior:
_LOOKUP(num_tables, num_pts, duplication, table_size)
By default, duplication is VCOP_SIMD_WIDTH/8. That is, for 8-way it is 1 and for 16-way it is 2. In addition, num_pts may be up to VCOP_SIMD_WIDTH. This additional parameter is particularly useful for one-table configurations. Consider the case of:
_LOOKUP(1, 8)
If the desired effect is to look up as many points as possible, this specification may be rewritten as:
_LOOKUP(1, VCOP_SIMD_WIDTH, 1)
This configuration will look up 8 points in 8-way mode and 16 points in 16-way mode from a single table in L1D. If the desired effect is to look up sets of 8 points, the specification may instead be rewritten as:
_LOOKUP(1, 8, VCOP_SIMD_WIDTH/8)
This configuration will look up 8 points using one index from one table in 8-way mode. In 16-way mode, the table will be duplicated in memory, two index values will be used, and two sets of 8 points will be returned.
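The arithmetic behind these configurations can be sketched in plain C. This is only a model inferred from the examples in this section, not the migration tool's implementation. The function names are hypothetical, and the SIMD width is passed as an ordinary argument (`simd_width`) instead of using the tool's VCOP_SIMD_WIDTH macro, so that both modes can be compared side by side.

```c
/* Default value of the duplication parameter: 1 for 8-way, 2 for 16-way. */
static int default_duplication(int simd_width)
{
    return simd_width / 8;
}

/*
 * Number of index values consumed by one lookup.
 * Assumption inferred from the text: each table, and each duplicate
 * copy of it, is addressed by its own index value.
 */
static int index_values(int num_tables, int duplication)
{
    return num_tables * duplication;
}

/*
 * Number of points (SIMD lanes) returned by one lookup.
 * Assumption: every index value returns num_pts points.
 */
static int points_returned(int num_tables, int num_pts, int duplication)
{
    return index_values(num_tables, duplication) * num_pts;
}
```

With this model, `_LOOKUP(1, VCOP_SIMD_WIDTH, 1)` in 16-way mode corresponds to `points_returned(1, 16, 1)`, which gives 16 points from one index. `_LOOKUP(1, 8, VCOP_SIMD_WIDTH/8)` in 16-way mode corresponds to `index_values(1, 2)` and `points_returned(1, 8, 2)`, which give two index values and 16 points (two sets of 8). The default scheme with num_tables==8 and one point per table gives `points_returned(8, 1, 2)`, or 16 lanes, matching the example earlier in this section.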