SPRUHM9H October 2014 – May 2024 TMS320F28075 , TMS320F28075-Q1 , TMS320F28076
32-Bit Floating-Point Multiply with Parallel Add
MRa | CLA floating-point destination register for MMPYF32 (MR0 to MR3) MRa cannot be the same register as MRd |
MRb | CLA floating-point source register for MMPYF32 (MR0 to MR3) |
MRc | CLA floating-point source register for MMPYF32 (MR0 to MR3) |
MRd | CLA floating-point destination register for MADDF32 (MR0 to MR3) MRd cannot be the same register as MRa |
MRe | CLA floating-point source register for MADDF32 (MR0 to MR3) |
MRf | CLA floating-point source register for MADDF32 (MR0 to MR3) |
LSW: 0000 ffee ddcc bbaa
MSW: 0111 1010 0000 0000
Multiply the contents of two floating-point registers with parallel addition of two registers.
MRa = MRb * MRc;
MRd = MRe + MRf;
The destination register for the MMPYF32 and the MADDF32 must be unique. That is, MRa cannot be the same register as MRd.
This instruction modifies the following flags in the MSTF register:
Flag | TF | ZF | NF | LUF | LVF |
---|---|---|---|---|---|
Modified | No | No | No | Yes | Yes |
The MSTF register flags are modified as follows:
Both MMPYF32 and MADDF32 complete in a single cycle.
; Perform 5 multiply and accumulate operations:
;
; X and Y are 32-bit floating-point arrays
;
; 1st multiply: A = X0 * Y0
; 2nd multiply: B = X1 * Y1
; 3rd multiply: C = X2 * Y2
; 4th multiply: D = X3 * Y3
; 5th multiply: E = X3 * Y3
;
; Result = A + B + C + D + E
;
_Cla1Task1:
MMOVI16 MAR0, #_X ; MAR0 points to X array
MMOVI16 MAR1, #_Y ; MAR1 points to Y array
MNOP ; Delay for MAR0, MAR1 load
MNOP ; Delay for MAR0, MAR1 load
; <-- MAR0 valid
MMOV32 MR0, *MAR0[2]++ ; MR0 = X0, MAR0 += 2
; <-- MAR1 valid
MMOV32 MR1, *MAR1[2]++ ; MR1 = Y0, MAR1 += 2
MMPYF32 MR2, MR0, MR1 ; MR2 = A = X0 * Y0
|| MMOV32 MR0, *MAR0[2]++ ; In parallel MR0 = X1, MAR0 += 2
MMOV32 MR1, *MAR1[2]++ ; MR1 = Y1, MAR1 += 2
MMPYF32 MR3, MR0, MR1 ; MR3 = B = X1 * Y1
|| MMOV32 MR0, *MAR0[2]++ ; In parallel MR0 = X2, MAR0 += 2
MMOV32 MR1, *MAR1[2]++ ; MR1 = Y2, MAR2 += 2
MMACF32 MR3, MR2, MR2, MR0, MR1 ; MR3 = A + B, MR2 = C = X2 * Y2
|| MMOV32 MR0, *MAR0[2]++ ; In parallel MR0 = X3
MMOV32 MR1, *MAR1[2]++ ; MR1 = Y3
MMACF32 MR3, MR2, MR2, MR0, MR1 ; MR3 = (A + B) + C, MR2 = D = X3 * Y3
|| MMOV32 MR0, *MAR0 ; In parallel MR0 = X4
MMOV32 MR1, *MAR1 ; MR1 = Y4
MMPYF32 MR2, MR0, MR1 ; MR2 = E = X4 * Y4
|| MADDF32 MR3, MR3, MR2 ; in parallel MR3 = (A + B + C) + D
MADDF32 MR3, MR3, MR2 ; MR3 = (A + B + C + D) + E
MMOV32 @_Result, MR3 ; Store the result
MSTOP ; end of task