SSZTCS5 April 2015 TMS320C6678
In my previous blog, the first of a three-part series, I shared five applications of the C6678 DSP. Hopefully you now have several ideas on how you too can apply this multicore DSP to save power and improve performance in your applications.
As we drill a bit deeper, I’d like to share five great facts about the C6678 DSP and how they contribute to its exemplary performance. Here they are, each framed as a frequently asked question followed by the facts.
Can I really achieve the number of cores times single-core performance when all the cores are running at once? The C6678 KeyStone multicore architecture, implemented with the C66x generation of DSPs, maximizes the throughput of on-chip data flows to eliminate even the most remote possibility of bottlenecks, ensuring that the vast processing power of the device’s cores is fully used. As a matter of fact, we have demonstrated this on a C6678 EVM: video processing ran first on a single core, and then the other seven cores were added running the same code, with no video degradation on any of the eight cores!
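To make "the same code on every core" concrete, here is a minimal sketch of one common way to share a video frame among cores: each core computes its own band of rows from its core number. This is an illustration under stated assumptions, not the code used in the EVM demonstration. The names `row_range` and `rows_for_core` are hypothetical, and the core number is passed in as a plain parameter so the sketch runs on any host.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [begin, end) of frame rows owned by one core.
 * Hypothetical helper type, not part of any TI library. */
typedef struct {
    size_t begin;
    size_t end;
} row_range;

/* Split `rows` frame rows as evenly as possible over `n_cores` cores.
 * The first (rows % n_cores) cores each take one extra row.
 * Assumption: on a C66x device, core_id would typically come from the
 * DNUM control register; here it is an ordinary argument. */
row_range rows_for_core(size_t rows, unsigned core_id, unsigned n_cores)
{
    assert(n_cores > 0 && core_id < n_cores);

    size_t base  = rows / n_cores;   /* rows every core gets */
    size_t extra = rows % n_cores;   /* leftover rows to hand out */

    row_range r;
    r.begin = core_id * base + (core_id < extra ? core_id : extra);
    r.end   = r.begin + base + (core_id < extra ? 1 : 0);
    return r;
}
```

Because the number of cores is a parameter, the same function serves a two, four or eight core device without change: for a 1080-row frame on eight cores, each core owns a band of 135 rows.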
I have a family of products requiring various numbers of DSP cores; how can I optimize the solutions? One of the great aspects of the C667x product line is that it is available in two, four and eight core configurations, allowing optimized sizing for various designs. The software is easily portable from one device to another, making a family of scalable end products easy to roll out! Furthermore, the C6654 is a single core KeyStone™ DSP, extending this software scalability to the power optimized, single core end of the market.
I have a lot of simultaneous real-time signal processing; can I connect several C6678 DSPs to work in parallel? The C6678 has a variety of popular I/O, including four lanes of Serial RapidIO (SRIO) for high-speed interconnect to external devices, as well as the KeyStone architecture's HyperLink, which enables two C6678 devices to be efficiently connected to each other, doubling the performance.
Can I really program the C6678 EFFICIENTLY in C or C++? The TI C6x DSP has long been supported by a highly efficient compiler, and the C66x multicore DSP is no exception. Supported by CCStudio, the C66x compiler offers a wide variety of features that help both novice and experienced C software developers achieve high-performance code without a lengthy optimization cycle.
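As an illustration of the kind of plain C an optimizing DSP compiler can work with, here is a minimal sketch of a dot product over 16-bit samples. It is an example written for this discussion, not code from TI. The function name `dot_product` and the "at least 8 samples, in multiples of 8" contract are assumptions chosen for the example. The `restrict` qualifiers promise the compiler that the two buffers do not overlap, and the TI-specific `MUST_ITERATE` pragma, guarded so other compilers never see it, passes the trip-count contract to the TI compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of two 16-bit sample buffers, accumulated in an int.
 * Assumed contract for this sketch: n >= 8 and n is a multiple of 8. */
int dot_product(const short *restrict a, const short *restrict b, size_t n)
{
    int sum = 0;
    size_t i;

    /* Enforce the contract that the pragma below relies on. */
    assert(n >= 8 && n % 8 == 0);

#ifdef __TI_COMPILER_VERSION__
    /* TI compiler hint: at least 8 iterations, count a multiple of 8. */
    #pragma MUST_ITERATE(8, , 8)
#endif
    for (i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The loop body stays ordinary, readable C. The hints only give the compiler information it could not otherwise prove; how much they help depends on the compiler version and options in use.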
Can I run all the DSP cores at full speed AND do IP packet processing at the same time, in the same device? Applications implemented on C6678 multicore DSPs often involve the reception and transmission of IP-based real-time packets. It has been a real bonus for developers in this space to find that the C6678 offloads IP packet processing from the CPU cores, on-chip and in real time, via the Network Co-Processor (NetCP). In many cases, no external packet processor is required!
Hopefully you have begun exploring the C6678 and its features; I would love to hear about any valuable discoveries. In the final post of the series, I will discuss the efficiencies in programming this multicore C6678 DSP. Stay tuned!