SPRUIG3C January 2018 – August 2019 TDA4VM , TDA4VM-Q1
The copy-out operation is essentially the reverse of the copy-in operation. The operation is performed by the LHT_copy_out::copy_table_out() method, which is called after a loop containing a histogram operation.
Like copy-in, the table is read and written one 1024-bit line at a time. Each line is read from L1D as a pair of 512-bit vectors using two LUTRD (lookup read) instructions, rearranged using two VPERM instructions, and written using normal vector stores. The loop pipelines at an initiation interval (ii) of 2, so one 1024-bit line completes every two cycles, resulting in a throughput of 512 bits per cycle.
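The per-line flow described above (two reads, two permutes, two stores) can be sketched as a portable scalar model. This is only an illustration under stated assumptions: it does not use C7x intrinsics, and the names copy_table_out_model, read_vector, permute, Perm, and bits_per_cycle are hypothetical. The permutation is modeled at 32-bit word granularity and is passed in by the caller, because the actual VPERM control pattern is not given in the text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model only. All names here are hypothetical, not part of the library.
constexpr std::size_t kWordsPerVector = 16;  // 16 x 32 bits = 512 bits
constexpr std::size_t kVectorsPerLine = 2;   // 2 x 512 bits = 1024 bits
constexpr std::size_t kWordsPerLine = kWordsPerVector * kVectorsPerLine;

using Vec512 = std::array<std::uint32_t, kWordsPerVector>;
using Perm = std::array<std::size_t, kWordsPerVector>;

// Stand-in for one LUTRD: fetch 16 consecutive 32-bit words from the table image.
Vec512 read_vector(const std::vector<std::uint32_t>& l1d, std::size_t word_offset) {
    Vec512 v{};
    for (std::size_t i = 0; i < kWordsPerVector; ++i) {
        v[i] = l1d[word_offset + i];
    }
    return v;
}

// Stand-in for one VPERM: out[i] = in[perm[i]] (assumed word-level pattern).
Vec512 permute(const Vec512& in, const Perm& perm) {
    Vec512 out{};
    for (std::size_t i = 0; i < kWordsPerVector; ++i) {
        out[i] = in[perm[i]];
    }
    return out;
}

// Copy the table out one 1024-bit line at a time:
// two reads, two permutes, two ordinary stores per line.
std::vector<std::uint32_t> copy_table_out_model(const std::vector<std::uint32_t>& l1d,
                                                const Perm& perm) {
    assert(l1d.size() % kWordsPerLine == 0);
    std::vector<std::uint32_t> out;
    out.reserve(l1d.size());
    for (std::size_t line = 0; line < l1d.size(); line += kWordsPerLine) {
        for (std::size_t half = 0; half < kVectorsPerLine; ++half) {
            const Vec512 raw = read_vector(l1d, line + half * kWordsPerVector);
            const Vec512 fixed = permute(raw, perm);
            out.insert(out.end(), fixed.begin(), fixed.end());
        }
    }
    return out;
}

// Throughput of a loop that moves line_bits every ii cycles.
constexpr std::size_t bits_per_cycle(std::size_t line_bits, std::size_t ii) {
    return line_bits / ii;
}

static_assert(bits_per_cycle(1024, 2) == 512, "1024-bit line at ii = 2");
```

With an identity permutation the model returns the table unchanged, which makes it a convenient reference when checking a candidate permutation pattern against known table contents.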
For purposes of the LUTRD used for the copy-out, the table is considered to be 16 parallel tables containing 32-bit elements. This enables each LUTRD instruction to read the maximal payload of 512 bits. The copy-out LTCR configuration is computed during the init() function by a call to the copy_out_config() method and stored in the tvals structure.
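The payload arithmetic behind this choice can be checked with a few lines of C++. This is a minimal sketch, assuming that each of the parallel tables contributes exactly one element per read. The helper names lutrd_payload_bits and lutrds_per_line are hypothetical and do not model the actual LTCR fields.

```cpp
#include <cassert>
#include <cstddef>

// Maximal LUTRD payload stated in the text.
constexpr std::size_t kMaxLutrdPayloadBits = 512;

// Bits returned by one read, assuming one element per parallel table.
constexpr std::size_t lutrd_payload_bits(std::size_t parallel_tables,
                                         std::size_t element_bits) {
    return parallel_tables * element_bits;
}

// Number of reads needed to cover one table line at a given payload size.
constexpr std::size_t lutrds_per_line(std::size_t line_bits,
                                      std::size_t payload_bits) {
    return line_bits / payload_bits;
}

// 16 parallel tables of 32-bit elements fill the 512-bit payload exactly,
// so a 1024-bit line needs two reads.
static_assert(lutrd_payload_bits(16, 32) == kMaxLutrdPayloadBits,
              "16 x 32-bit elements fill the payload");
static_assert(lutrds_per_line(1024, lutrd_payload_bits(16, 32)) == 2,
              "two reads per 1024-bit line");
```

Under the same assumption, a narrower view such as 16 parallel tables of 16-bit elements would return only 256 bits per read and would need four reads per line.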