SPRUIG3C January 2018 – August 2019 TDA4VM , TDA4VM-Q1
To preserve the dispatch API, the C7x translation uses a similar approach. Kernel-invariant expressions are precomputed by the `init()` function and stored in a memory block, which the translated kernel accesses when the block's address is passed to the `vloops()` function. However, the size, layout, and contents of the block are different. On C7x, restricting parameter elements to 16-bit values is needlessly inefficient; parameter expressions often represent constants, addresses, or even vectors (for example, Streaming Engine setup vectors). For this reason, the pblock object on C7x is defined as a structure whose fields are defined by the migration tool. The address of the structure is passed to the `init()` function, which assigns to the fields of the structure, and to the `vloops()` function, which accesses them. The structure has type `<kernel>_tvals_t` and is referred to simply as the tvals structure throughout this document.
Even though the translated parameter block object is a structure, the pblock pointer parameters to the `init()` and `vloops()` functions are still declared as `unsigned short*` to preserve API compatibility. Each function casts the pointer so that it points to the tvals structure.
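The cast described above can be sketched as follows. This is a minimal illustration, not the migration tool's actual output: the kernel name `mykernel`, the field names, the function signatures, and the arithmetic are all hypothetical. It also assumes the memory behind the pblock pointer is suitably aligned for the structure.

```c
#include <stdint.h>

/* Hypothetical tvals layout for an illustrative kernel named "mykernel".
   The real fields are chosen by the migration tool. */
typedef struct {
    const int16_t *in;     /* address parameter, stored at full pointer width */
    int32_t       *out;    /* address parameter */
    int32_t        width;  /* scalar parameter, not restricted to 16 bits */
    int32_t        bias;   /* kernel-invariant expression precomputed by init() */
} mykernel_tvals_t;

/* The pblock argument keeps its VCOP type; init() casts it and fills the fields. */
void mykernel_init(const int16_t *in, int32_t *out, int32_t width,
                   unsigned short *pblock)
{
    mykernel_tvals_t *tvals = (mykernel_tvals_t *)pblock;
    tvals->in    = in;
    tvals->out   = out;
    tvals->width = width;
    tvals->bias  = 2 * width;   /* computed once, reused by vloops() */
}

/* vloops() receives the same pointer, casts it, and only reads the fields. */
void mykernel_vloops(unsigned short *pblock)
{
    const mykernel_tvals_t *tvals = (const mykernel_tvals_t *)pblock;
    int32_t sum = tvals->bias;
    for (int32_t i = 0; i < tvals->width; i++) {
        sum += tvals->in[i];
    }
    *tvals->out = sum;
}
```

Client code in this sketch never sees the structure type; it only passes the same `unsigned short*` to both functions, as it did on VCOP.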
The `param_count()` API function allows the client to allocate the pblock rather than using the built-in statically allocated one. It returns the required size of the pblock in `unsigned short` units. For example, a user-managed pblock might be allocated as:
```c
typedef unsigned short ptype;
ptype *my_pblock = (ptype*)malloc(mykernel_param_count() * sizeof(ptype));
```
To preserve compatibility, the return value of the `param_count()` function on C7x is still expressed in `unsigned short` units, even though the underlying object is actually a structure. This allows client code that allocates the pblock as above to remain unchanged.
The tvals structure consists of one field for each kernel parameter to capture its value, plus a nested substructure for each vloop in the kernel to hold loop-specific expressions. Furthermore, each substructure is actually an array of substructures. This is to handle vloops that use the “repeat” feature, whereby the vloop runs multiple times using different parameter sets.
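The layout described above might look like the following sketch. Every name here is hypothetical (the kernel `example`, its parameters, the substructure names `loop1` and `loop2`, and the repeat count of 3), and the loop bodies are plain scalar C standing in for translated vector code. The single-element array for the non-repeated vloop is an assumption based on the statement that every substructure is an array.

```c
#include <stdint.h>

#define EXAMPLE_REPEAT_COUNT 3   /* hypothetical number of repeat iterations */

typedef struct {
    /* One field per kernel parameter. */
    const int16_t *in;
    int16_t       *out;
    int32_t        n;

    /* vloop 1 is inside a repeat loop: one parameter set per repetition. */
    struct {
        int32_t offset;
        int32_t count;
    } loop1[EXAMPLE_REPEAT_COUNT];

    /* vloop 2 is not repeated, but is still an array (assumed length 1). */
    struct {
        int32_t count;
    } loop2[1];
} example_tvals_t;

/* Precompute the loop-specific expressions for every repetition. */
void example_init(const int16_t *in, int16_t *out, int32_t n,
                  unsigned short *pblock)
{
    example_tvals_t *tvals = (example_tvals_t *)pblock;
    tvals->in  = in;
    tvals->out = out;
    tvals->n   = n;
    for (int32_t r = 0; r < EXAMPLE_REPEAT_COUNT; r++) {
        tvals->loop1[r].offset = r * n;
        tvals->loop1[r].count  = n;
    }
    tvals->loop2[0].count = n * EXAMPLE_REPEAT_COUNT;
}

/* Run vloop 1 once per parameter set, then vloop 2 once. */
void example_vloops(unsigned short *pblock)
{
    const example_tvals_t *tvals = (const example_tvals_t *)pblock;
    for (int32_t r = 0; r < EXAMPLE_REPEAT_COUNT; r++) {
        int32_t base = tvals->loop1[r].offset;
        for (int32_t i = 0; i < tvals->loop1[r].count; i++) {
            tvals->out[base + i] = (int16_t)(tvals->in[base + i] + r);
        }
    }
    for (int32_t i = 0; i < tvals->loop2[0].count; i++) {
        tvals->out[i] = (int16_t)(tvals->out[i] * 2);
    }
}
```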
Table 2-1 illustrates the migration tool-generated tvals structure for a simple kernel with two vloop commands, the first of which is inside a repeat loop.
Kernel-C Program | Generated Tvals Structure |
---|---|
|
|
Access to Parameters in vloops() Function | |
|
Compatibility Warning: Pblock Size |
---|
The parameter block tends to be larger on C7x than on VCOP. Programs that allocate the pblock (dynamically or statically) without using the kernel_pblock_size() function may fail if the allocated space is too small for the translated tvals structure. |
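
A client that keeps a fixed-size buffer can guard against this failure by comparing the buffer's capacity with the size reported by the kernel's size query before calling `init()`. The sketch below is only an illustration: the buffer name, its size of 64 elements, and the helper `checked_pblock()` are hypothetical, and the required size is passed in as a plain argument rather than obtained from any particular API function.

```c
#include <stddef.h>

#define CLIENT_PBLOCK_SHORTS 64   /* hypothetical size chosen for the VCOP build */

/* Hypothetical statically allocated, client-managed pblock. */
static unsigned short client_pblock[CLIENT_PBLOCK_SHORTS];

/* Returns the static buffer if it can hold required_shorts elements,
   otherwise NULL so the caller can fall back to another allocation. */
unsigned short *checked_pblock(unsigned int required_shorts)
{
    size_t capacity = sizeof(client_pblock) / sizeof(client_pblock[0]);
    return (required_shorts <= capacity) ? client_pblock : NULL;
}
```

A size check alone does not guarantee correct alignment. A static `unsigned short` array may be less strictly aligned than the tvals structure requires, so a real client would likely need to request suitable alignment as well.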