Murali Jayapala - 2001-2005-Low Power Instruction Memory Optimizations

PhD Research Summary

Research Problem

Current embedded systems for multimedia applications, such as mobile and hand-held devices, are typically battery operated. Low energy consumption is therefore one of the key design goals of such systems. Many of them rely on Very Long Instruction Word (VLIW) Application Specific Instruction Set Processors (ASIPs). However, power analysis of such processors indicates that a significant amount of power is consumed in the instruction caches. Loop buffering, or L0 buffering, is an effective scheme to reduce energy consumption in the instruction memory hierarchy. In a typical multimedia application, a significant amount of execution time is spent in small program segments. Hence, by storing them in a small L0 buffer instead of the big instruction cache, energy can be reduced.
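
The saving from L0 buffering can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the research itself: the function name, the per-access energy values and the loop fraction are hypothetical, and the model ignores the cost of filling the buffer and of the control logic.

```python
def fetch_energy(total_fetches, loop_fraction, e_cache, e_l0):
    """Compare instruction-fetch energy without and with an L0 buffer.

    total_fetches: number of instruction fetches in the run
    loop_fraction: share of fetches that hit small loops held in the L0 buffer
    e_cache, e_l0: hypothetical energy per access (any consistent unit)
    Returns (baseline, buffered). Buffer fill cost is ignored (assumption).
    """
    if not 0.0 <= loop_fraction <= 1.0:
        raise ValueError("loop_fraction must lie in [0, 1]")
    # Without a buffer, every fetch goes to the instruction cache.
    baseline = total_fetches * e_cache
    # With a buffer, loop fetches are served by the cheaper L0 buffer.
    l0_fetches = total_fetches * loop_fraction
    cache_fetches = total_fetches - l0_fetches
    buffered = l0_fetches * e_l0 + cache_fetches * e_cache
    return baseline, buffered


# Illustrative numbers only: 90% of fetches in loops, L0 access 10x cheaper.
baseline, buffered = fetch_energy(1_000_000, 0.9, e_cache=1.0, e_l0=0.1)
print(f"relative fetch energy with L0 buffer: {buffered / baseline:.2f}")
```

With these made-up inputs the buffered case needs roughly a fifth of the baseline fetch energy; the real ratio depends on the application and on the memories used.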

While the reduction achieved by L0 buffering is substantial, further optimizations are still necessary to ensure high energy efficiency in future processors. As optimizations are applied to other parts of the processor, such as the datapath, register files and data memory hierarchy, the overall processor energy drops. The instruction memory energy, including that of the L0 buffer, is then bound to increase in relative terms or remain substantial. Of the two main contributors to energy consumption in the instruction memory hierarchy, the L0 buffers are the main bottleneck.

Approach

Our approach to this problem is to incorporate a fully clustered (distributed) instruction memory hierarchy. The L0 buffers are partitioned, and each partition is grouped with certain functional units in the datapath to form an L0 cluster. Similarly, the L1 instruction cache is partitioned, and each partition is grouped with certain L0 clusters to form an L1 cluster. Each cluster has its own local controller, which enables it to operate autonomously to a certain extent.
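
The grouping of functional units into L0 clusters can be pictured as a simple data structure. The sketch below is only an illustration: the class and function names and the functional-unit labels are hypothetical. It also assumes that a partition is accessed in a cycle only when at least one of its functional units is busy, which this page does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class L0Cluster:
    """One L0 buffer partition together with the functional units it feeds."""
    name: str
    functional_units: frozenset


def active_clusters(clusters, busy_units):
    """Names of the clusters that have at least one busy functional unit."""
    busy = set(busy_units)
    return [c.name for c in clusters if c.functional_units & busy]


def count_partition_accesses(clusters, schedule):
    """Count, per cluster, the cycles in which its partition is accessed.

    schedule is a list with one set of busy functional units per cycle.
    Assumption: idle clusters skip the access in that cycle.
    """
    counts = {c.name: 0 for c in clusters}
    for busy_units in schedule:
        for name in active_clusters(clusters, busy_units):
            counts[name] += 1
    return counts


# Hypothetical 4-issue datapath split into two L0 clusters.
clusters = [
    L0Cluster("A", frozenset({"alu0", "alu1"})),
    L0Cluster("B", frozenset({"mul0", "mem0"})),
]
schedule = [{"alu0"}, {"alu0", "mul0"}, set(), {"mem0"}]
print(count_partition_accesses(clusters, schedule))
```

Under that assumption, each partition in this toy schedule is accessed in two of the four cycles, whereas a single unpartitioned buffer would be accessed in every cycle that issues an operation.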

Various aspects of this memory hierarchy are being investigated, namely:

- The basic operation of the hierarchy
- Automatic generation or formation of L0 (L1) clusters
- Compiler scheduling for L0 (L1) clusters
- Synchronicity between L0 clusters and datapath (partitioned register file) clusters
- Support for execution of multiple loops in parallel (Simultaneous Loop Threading)

Associated tools

People I worked with

Advisors
- Henk Corporaal, TU Eindhoven, The Netherlands
- Francky Catthoor, IMEC vzw, Leuven, Belgium
- Geert Deconinck, K.U.Leuven, Belgium
Fellow PhD Students
- Tom Vander Aa
- Francisco Barat

Related Publications: Journals & Book Chapter

Clustered Loop Buffer Organization for Low Energy VLIW Embedded Processors.
Murali Jayapala, Francisco Barat, Tom Vander Aa, Francky Catthoor, Henk Corporaal and Geert Deconinck. IEEE Transactions on Computers, 54(6):672-683, June 2005. [BIB]
Instruction Buffering Exploration for Low Energy Embedded Processors.
Tom Vander Aa, Murali Jayapala, Francisco Barat, Geert Deconinck, Rudy Lauwereins, Henk Corporaal and Francky Catthoor. Journal of Embedded Computing, 1(3), 2004. [BIB]
Roadmaps on Selected topics in Embedded Systems Design from the ARTIST project.
Geert Deconinck, Murali Jayapala and Tom Vander Aa, Mar 2005. [BIB]

Related Publications: Conferences & Workshops

L0 Cluster Synthesis and Operation Shuffling.
Murali Jayapala, Tom Vander Aa, Francisco Barat, Francky Catthoor, Henk Corporaal and Geert Deconinck. In Proc of 14th International Workshop on Power And Timing Modeling, September 2004. [BIB]
Instruction Buffering Exploration for Low Energy VLIWs with Instruction Clusters.
Tom Vander Aa, Murali Jayapala, Francisco Barat, Geert Deconinck, Rudy Lauwereins, Francky Catthoor and Henk Corporaal. In Proc. of the Asian Pacific Design and Automation Conference 2004 (ASPDAC'2004), Yokohama, January 2004. [BIB]
Low Power Coarse-Grained Reconfigurable Instruction Set Processor.
Francisco Barat, Murali Jayapala, Tom Vander Aa, Geert Deconinck, Rudy Lauwereins and Henk Corporaal. In Proc of 13th International Conference on Field Programmable Logic and Applications (FPL), September 2003. [BIB]
Instruction Buffering Exploration for Low Energy Embedded Processors.
Tom Vander Aa, Murali Jayapala, Francisco Barat, Geert Deconinck, Rudy Lauwereins, Henk Corporaal and Francky Catthoor. In Proc of 13th International Workshop on Power And Timing Modeling, September 2003. [BIB]
Clustered L0 Buffer Organization for Low Energy Embedded Processors.
Murali Jayapala, Francisco Barat, Tom Vander Aa, Francky Catthoor, Geert Deconinck and Henk Corporaal. In Proc of 1st Workshop on Application Specific Processors (WASP), November 2002. [BIB]
Inner Loop Code Generation for Coarse-Grained Reconfigurable Instruction Set Processors.
Francisco Barat, Murali Jayapala, Pieter OpDeBeeck and Rudy Lauwereins. In Proc of International Workshop on Advanced Parallel Processing Techniques (APPT), September 2001. [BIB]
CRISP: A Template for Reconfigurable Instruction Set Processors.
Pieter OpDeBeeck, Francisco Barat, Murali Jayapala and Rudy Lauwereins. In Proc of International conference on Field Programmable Logic (FPL), August 2001. [BIB]
Reconfigurable Instruction Set Processors: An Implementation Platform for Interactive Multimedia Applications.
Francisco Barat, Murali Jayapala, Pieter OpDeBeeck and Geert Deconinck. In Proc of Asilomar Conference, November 2001. [BIB]
Low Energy Clustered Instruction Fetch and Split Loop Cache Architecture for Long Instruction Word Processors.
Murali Jayapala, Francisco Barat, Pieter OpDeBeeck, Francky Catthoor and Rudy Lauwereins. In Proc of the workshop on Compilers and Operating Systems for Low Power (COLP), September 2001. [BIB]
Software Pipelining for Coarse-Grained Reconfigurable Instruction Set Processors.
Francisco Barat, Murali Jayapala, Pieter OpDeBeeck and Geert Deconinck. In Proc of VLSI Design together with ASPDAC, January 2002. [BIB]
A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors.
Murali Jayapala, Francisco Barat, Pieter OpDeBeeck, Francky Catthoor, Geert Deconinck and Henk Corporaal. In Proc of 12th International Workshop on Power And Timing Modeling Optimization and Simulation (PATMOS), September 2002. [BIB]

Related Projects

MESA2 under MEDEA+ Program (European Project)
Artist: Network of Excellence