Low Power Embedded Compilers and Architectures

The fast-growing market for embedded devices has large implications for design and technology. Shorter time-to-market requires programmable platforms with sufficient tool support. The inherent complexity of the applications demands high throughput from these platforms: realistic peak performance is expected to be around 100-1000 GOPS. Additionally, the battery-operated, portable nature of these systems implies stringent power requirements; the power consumption of platforms running these applications is expected to be on the order of 100-1000 mW. Combining the power and performance requirements, the computational efficiency has to be around 1000 MOPS/mW, together with high peak performance. In addition, as a consequence of technology scaling into nanometer dimensions, deep sub-micron effects can no longer be contained at the lower levels of system abstraction; countermeasures have to be taken at higher abstraction levels, namely in processor architecture and compilers. With multimedia and wireless applications becoming more complex (dynamic, heterogeneous and data/memory dominated), mapping them efficiently onto processor architectures is a non-trivial task. State-of-the-art power-efficient programmable processors and compilation techniques achieve up to 50 MOPS/mW (for single 32-bit arithmetic unit equivalent operations), which is at least a factor of 20 short of the target. Furthermore, the non-recurring engineering (NRE) costs of compilation and architecture exploration frameworks are becoming increasingly high. This implies that the desired solutions should be extensions that can be integrated into existing state-of-the-art architecture templates and compilation frameworks.
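The efficiency target and the factor-of-20 gap quoted above follow from simple unit arithmetic. The sketch below is only an illustration: it assumes the matching ends of the quoted ranges are paired (100 GOPS at 100 mW, or 1000 GOPS at 1000 mW), and the function name is hypothetical rather than taken from the original work.

```python
def efficiency_mops_per_mw(gops: float, milliwatts: float) -> float:
    """Convert a throughput in GOPS and a power budget in mW to MOPS/mW."""
    mops = gops * 1000.0  # 1 GOPS = 1000 MOPS
    return mops / milliwatts


# Assumption: the lower ends of both quoted ranges belong together.
target_mops_per_mw = efficiency_mops_per_mw(gops=100.0, milliwatts=100.0)

# State-of-the-art figure quoted in the text (32-bit equivalent operations).
state_of_the_art_mops_per_mw = 50.0

# How far current solutions fall short of the target.
gap_factor = target_mops_per_mw / state_of_the_art_mops_per_mw

print(f"target: {target_mops_per_mw:.0f} MOPS/mW, gap: {gap_factor:.0f}x")
```

Pairing the upper ends of the ranges (1000 GOPS at 1000 mW) gives the same ratio, which is why the text can state a single efficiency target for the whole range.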

To approach this very challenging task, all relevant architecture and compiler aspects are consolidated and categorized into primitives. The aim of this categorization is to manage the design complexity and to enable easier integration into state-of-the-art solutions. Because different state-of-the-art solutions have different limitations, the primitives also help identify the particular extensions that a given solution needs. The architecture and compiler primitives are:

  • Data-parallel background memory: organization of the data memories for effective data-parallel access, and the related compiler optimizations such as data layout and data locality
  • Data-parallel foreground memory: organization of the local data storage in the data-parallel processor, and the related compiler optimizations such as register allocation, data layout and data locality
  • Subword-parallel data-path: organization of the data-path units for effective subword support, and the related compiler optimizations such as instruction selection, scheduling and assignment
  • Data-parallel address path: organization of the address-path units for the heavily distributed data-parallel memory units, and the related compiler optimizations such as instruction selection, scheduling and assignment
  • Distributed instruction memory: organization of the instruction path in heavily distributed memories plus local controllers, and the related compiler optimizations such as distributed code layout, instruction compression and code selection

Graduate Students and Interns

During this period, as a long-term research coordinator, I worked with various PhD students and MS interns on this research (listed alphabetically).

  • Javed Absar [PhD, KULeuven]
  • Chaitanya Cherukuri [MS Intern]
  • Jean Baka Domelevo [MS Intern]
  • Yuki Kobayashi [PhD, Osaka University]
  • John (Ioannis) Koutras [MS Intern]
  • Nikolaos Kroupis [PhD, Univ Patras]
  • Andy Lambrechts [PhD, KULeuven]
  • Elena Perez [PhD Candidate, UMadrid]
  • Vasilis Porpodas [PhD Candidate]
  • Praveen Raghavan [PhD, KULeuven]
  • Estela Rey Ramos [MS Intern]
  • Nandhavel Sethubalasubramanian [MS Intern]
  • Guillermo Talavera [PhD Candidate, UABarcelona]
  • Ittetsu Taniguchi [PhD, Osaka University]

Selected Publications

Ultra-Low Energy Domain-Specific Instruction-Set Processors.
Francky Catthoor, Praveen Raghavan, Andy Lambrechts, Murali Jayapala, Angeliki Kritikakou and Javed Absar. Springer, 1st edition, XXI, 400 p., hardcover, 2010. [BIB]
[Springer Link]

Related Projects

  • SWANS: Silicon platforms for Wireless Advanced Networks of Sensors (1 May 2005 - 13 Mar 2008), IWT
  • FLEXWARE: Exploitation of flexible hardware platforms for massively parallel bio-informatics applications (1 Jan 2007 - 31 Dec 2010), IWT
  • HiPEAC-1: High-Performance Embedded Architectures and Compilers (1 Sep 2004 - 31 Aug 2008), FP6
  • Marie-Curie Actions: Human Resources and Mobility Activity (2004-2008), FP6
  • IMEC M4 Program (2004-2007)
  • IMEC Apollo Program (2007-2010)