Approaching the Maximum Energy Saving on Embedded Systems with Multiple Voltages Shaoxiong Hua and Gang Qu Electrical and Computer Engineering Department and Institute for Advanced Computer Studies University of Maryland, College Park, MD 20742, USA {shua, gangqu}@eng.umd.edu Abstract Dynamic voltage scaling (DVS) is arguably the most effective energy reduction technique. The multiple-voltage DVS systems, which can operate only at pre-determined discrete voltages, are practical and have been well studied. However, one important unsolved problem is how many levels and at which values should voltages be implemented on a multiple-voltage DVS system to achieve the maximum energy saving. We refer this as the voltage set-up problem. In this paper, (1) we derive analytical solutions for dual-voltage system. (2) For the general case that does not have analytic solutions, we develop efficient numerical methods. (3) We demonstrate how to apply the proposed algorithms on system design. (4) Interestingly, the experimental results suggest that the multiple-voltage DVS system, when the voltages are set up properly, can reach DVS technique?s full potential in energy saving. Specifically, on the design of an ad hoc application- specific system and the design of the MPEG video encoder, we find that the best single-voltage systems consume 150% and 20% more energy than the tight theoretical lower bounds, respectively. However, our approach gives dual-, 3-, and 4-voltage DVS sys- tem settings that are only 17.6%, 4.9%, and 2.6% for the ad hoc system, and 4.0%, 1.1%, and 0.2% for the MPEG video encoder, over the same lower bounds. 1. Introduction Energy consumption has become a major design issue for modern embedded systems especially battery-operated portable devices. Although leakage power becomes more significant in such sys- tems, dynamic power still dominates in designs with the current technology such as most DSP systems. Dynamic voltage scaling (DVS) technique varies the clock frequency and supply voltage according to workload at run-time to reduce dynamic power and save energy. It achieves the highest energy efficiency for time- varying computational loads if voltage can be varied arbitrarily [1, 12]. However, physical constraints of CMOS circuit limit the applicability of having voltage varying continuously. Instead, it is more practical to make multiple discrete voltages simultaneously available for the system. Transmeta's Crusoe, AMD's Mobile Ath- lon and Intel's XScale are all examples of advanced high- performance microprocessors that support DVS for low power. Most existing work on multiple voltage DVS systems assume that voltage values are given and focus on how to utilize this op- tion to minimize system's energy consumption [3, 7, 8, 11, 14]. For example, Raje and Sarrafzadeh [14] proposed a multiple volt- age scheduling algorithm to assign voltage level to each operation in a data flow graph to minimize power consumption with a given computation time constraint. Dual-voltage (5.0V and 3.0V) and 3- voltage (5.0V, 3.0V, and 2.4V) were used for experimental pur- pose. Chang and Pedram [3] presented a dynamic programming based algorithm extending this to more general cases (such as cyclic graphs, throughput constraints). Four voltages (5.0V, 3.3V, 2.4V, and 1.5V) were used in the simulation for no specific rea- sons. Hua et al. [7] proposed some scheduling strategies for a multiple-voltage system in order to reduce the system?s energy consumption while providing non-perfect completion ratio guar- antees statistically. The experimental results were reported when the four voltages were set at 3.3V, 2.6V, 1.9V, and 1.2V. To the best of our knowledge, how to select the multiple voltage levels has been discussed only in the following context: Chen and Sar- rafzadeh [5] studied the power minimization problem on dual- voltage system at gate level, where 5.0V was used as the high voltage and different voltages from 2.0V to 4.2V were used as the low voltage. They suggested that the voltages should be chosen carefully based on the slack distribution of the circuits. Qu and Potkonjak [11] gave analytical solutions on how to build energy efficient communication pipelines under latency constraints by voltage scaling and careful packet fragmentation, where each pipeline stage receives one fixed voltage. Dhar and Maksimovic [6] considered the design of finite impulse response filters and applied Lagrangian method to find the 2N+1 voltages for power minimization, where N is the order of the filter. In this paper, we consider the application level voltage set-up problem: how to determine the number of voltages and each volt- age value on a multiple-voltage application-specific DVS system such that the system?s energy consumption is minimized? The differences between our work and the ones mentioned above are: 1) we do voltage scaling at application level instead of gate level, and 2) we solve the problem for any number of voltages instead of only dual voltages or levels tightly bounded to applications. We formulate and provide practical solutions to the voltage set-up problem that seeks the most energy efficient voltage set- tings in the design of multiple-voltage DVS systems. This work is a novel extension under the DVS research framework. Our main contributions include: (1) analytical solutions and a linear search algorithm for dual-voltage DVS systems, (2) a linear approxima- tion algorithm for the general multiple-voltage DVS systems. When we use these results to guide multiple-voltage DVS system design, interestingly, (3) we find that the 3- or 4-voltage systems are (almost) as energy efficient as the ideal DVS system that can vary voltage arbitrarily. The remainder of this paper is organized as follows. Section 2 formulates the voltage set-up problem. Section 3 explains our approaches. Validation of our methods and experimental results are presented in Section 4. We conclude the paper in Section 5. 2. Problem Formulation We consider the design of an embedded system to perform a set of applications (or a single application with uncertainties in execu- tion time). The system has multiple voltages to support DVS. 26 Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICCAD?03, November 11-13, 2003, San Jose, California, USA. Copyright 2003 ACM 1-58113-762-1/03/0011 ...$5.00. Proceedings of the International Conference on Computer Aided Design (ICCAD?03) 1092-3152/03 $ 17.00 ? 2003 ACM Input: n applications {(e i ,d i ,p i ): i =1,2,?,n} with their corresponding ideal voltages 00 0 12 ... n VV V???. Output: V 1 and V 2 that minimize the energy consumption. Algorithm: 1. V 2 = 0 n V ; 2. foreachk=2,3,?,n-1 3. { assume V 1 ? [] 00 1 , kk VV ? ; 4. solve the cubic equation (1) by replacing p 1 e 1 with g166 ? = 1 1 k i ii ep and p dp 22 with g166 ? = 1n ki ipi dp , where () ; )( )( 2 2 2 2 i thdd ddthi ip eV VrefV refVVVd d ? ? ? = 5. let V 1,k be the optimal voltage in that interval and E k be the energy; 6. } 7. report the voltage V 1,k that has the least E k as V 1 . 2.1 Application Model Each application has a (or a set of) specific amount of computa- tion requirement [13], or equivalently, a certain amount of CPU time to complete the computation before a deadline constraint. This situation occurs in systems (such as DSP systems) that run a single application characterized by the repetitive processing on periodically arriving input samples and each iteration must be completed during its period. It may also happen in systems that assign a fixed amount of time to each of the applications. Note that an application?s execution time can vary dramatically due to a number of factors such as data locality and correlation, I/D cache misses, or pipeline stalls. However, it is possible to obtain the application execution time distribution from the sys- tem?s detailed timing information or from simulation on the target hardware [15]. We adopt the assumption in [4] that the real execution time can be known as a priori, which is possible particularly in application-specific DSP systems. We also assume that the i-th application is characterized by triples , where e i is the execution time, d i is the deadline, and p i is the probability that the system executes the application. We mention that e i ?s can be the execution time for different applications or different execution time for the same application. 2.2 Multiple-Voltage DVS System Model We assume that the target multiple-voltage DVS system has m levels of supply voltage (V 1 1. These two lemmas not only identify non-optimal voltage set- tings, they are also fundamental for our proposed solutions to the voltage set-up problem. In the rest of this section, we first present the analytic solution and an exact approach for the dual-voltage DVS system. We then give an iterative approach and a linear ap- proximation algorithm for the general case with m voltages, where m is given, and the applications have n distinct possible execution time. Finally we discuss how to determine m, the number of volt- age levels, in order to achieve the maximum energy saving. 3.1 Case I: m=2 and n=3 We first consider a dual-voltage system (m=2) with three applica- tions (n=3). For simplicity, we assume that each application has one fixed execution time. Clearly, this is the simplest non-trivial case because one can simply use all the ideal voltages if m ? n. Let V 1 3 In this case, from Lemma 1 and 2 we know that 0 2 n VV= and Figure 1. Voltage set-up algorithm for Case II m=2, n?3. 27 Proceedings of the International Conference on Computer Aided Design (ICCAD?03) 1092-3152/03 $ 17.00 ? 2003 ACM 00 111 [, ] n VVV ? ? . We seek for V1 that minimizes the total energy consumption and meets all applications? deadlines. Figure 1 de- picts an O(n) algorithm that solves this problem optimally. The optimal V 1 that minimizes the energy consumption resides in one of the interval [] 00 1 , kk VV ? . Assuming that V 1 ? [] 00 1 , kk VV ? , the problem becomes to the same as Case I so we can solve it opti- mally (step 4) by Theorem 1. The loop from step 3 to step 6 finds the best V 1 . 3.3 Case III:m>2 When there are more than two voltages available, the system still uses at most two voltages to execute each application for energy efficiency [8,12]. Analytic solutions for this general case do not exist, we thus propose an iterative approach and an approximation algorithm to efficiently search the solution based on the convexity of the energy function. Lemma 3.Let{V k,1 ,V k,2 ,?,V k,k }and{V k+1,1 ,V k+1,2 ,?,V k+1,k+1 } be the optimal k- and (k+1)-voltage set-ups respectively, then we have V k+1,1 ? V k,1 ? V k+1,2 ? V k,2 ? ?,? V k,k =V k+1,k+1 = 0 n V . An Iterative Approach: ? Start with the single voltage system with voltage V 1,1 = 0 n V ; ? Apply the algorithm in Figure 1 to solve for V 2,1 and V 2,2 ,the best voltage set-up for dual-voltage system; ? Repetitively apply Lemma 3 for k-voltage (k>2) system as follows: let V k,k =V k-1,k-1 , search V k,i between V k-1,i-1 and V k-1,i for the most energy efficient setting. An Approximation Algorithm: ? Start with a random m-voltage set-up; ? Fix the (m-1) higher voltages and compute the lowest voltage V 1 by a procedure similar to the algorithm in Figure 1; ? Determine V 2 by fixing the obtained V 1 and the other (m-2) higher voltages; ? Continue till after we update the value of V m-1 , the second highest voltage; (This is one round of updating.) ? If there is energy improvement, go back to the second step with this new obtained voltage set-up; ? Report the optimal voltage set-up. 3.4 Determining the Number of Voltage Levels If we ignore the hardware overhead (e.g., the area and power on the voltage regulators or DC-DC converters) for supporting the multiple voltages, then we should use as many voltages as neces- sary to reduce the energy consumption [12]. However, supporting Figure 2. Determining the number of voltage levels. multiple voltages on the same chip does require additional hard- ware and will cause area, delay, and also power penalties. There- fore it becomes important to investigate the tradeoff between more voltage levels and the overhead they introduce. Figure 2 shows a scheme on how to determine the optimal number of voltage levels. The threshold energy cost E th,m measures the additional hardware cost in having (m+1) voltages instead of m voltages. We assume that this threshold cost can be obtained empirically. 4. Simulation Results There are two goals in our simulation: demonstrating the impor- tance of voltage set-up problem and validating our proposed ap- proaches. We formulate the voltage set-up problem in two exam- ples based on a set of randomly generated applications (Table 1) and the MPEG video encoder (Figure 3). The problems are then solved both analytically and numerically by using our approaches. Finally we compare the energy consumption under different volt- age set-ups obtained by using exhaustive simulation in Matlab in order to test the correctness of results and the effectiveness of our proposed methods. Note in this section the energy is in the unit of dissipation in one CPU unit at the reference voltage 3.3V. Table 1. Information on the two ad hoc applications. Application Deadline d i Execution Time: e i Probability p i Ideal Voltage (V) 9 0.03 3.0564 4 0.18 1.8124A10 3 0.39 1.5516 6 0.04 2.6888 4 0.10 2.0669 3 0.12 1.7479 B8 2 0.14 1.4176 Figure 3. MPEG video encoder execution time distribu- tions and deadlines in 10 4 cycles (redrawn from [10]). For each example, we apply the proposed approaches to find the best voltage set-ups for dual-voltage, 3-voltage, and 4-voltage DVS systems as reported in Table 2. We also list the energy con- sumption of the ideal DVS system, where we have one voltage for each possible execution time, and the best fixed-voltage system in the table for comparison. For the first example with two ad hoc applications, multiple- voltage DVS systems save significant amount of energy over the fixed-voltage system. The saving is more than 53% when we care- fully choose the second voltage on the dual-voltage system. With the addition of the third and fourth voltages, we see the continu- ous increase in energy reduction. On the other hand, comparing to the lower bound in the ideal system, the best fixed-voltage setting consumes more than 150% additional energy. But this "wasted" energy drops to 17.6%, 4.9%, and 2.6% for the dual-, 3-, and 4- 28 Proceedings of the International Conference on Computer Aided Design (ICCAD?03) 1092-3152/03 $ 17.00 ? 2003 ACM voltage system respectively. It indicates the effectiveness of mul- tiple-voltage DVS system's energy saving. Table 2. The optimal voltage set-ups and their corre- sponding average energy consumption per execution. (In the parenthesis of energy columns, ?-? is the energy saving over fixed-voltage system, ?+? is the ?wasted? energy comparing to the ideal DVS system.) 2-Application MPEG Encoder DVS Systems Voltages Energy Voltages Energy fixed- voltage 3.0564 2.9536 (+151.1%) 2.8934 26.7125 (+20.1%) dual- voltage 3.0564 1.8124 1.3833 (-53.2%) (+17.6%) 2.8934 1.8551 23.1478 (-13.3%) (+4.0%) 3-voltage 3.0564 2.0688 1.5514 1.2337 (-58.2%) (+4.9%) 2.8934 1.8558 1.3031 22.4958 (-15.8%) (+1.1%) 4-voltage 3.0564 2.0768 1.8119 1.5509 1.2071 (-59.1%) (+2.6%) 2.8934 2.6374 1.8554 1.3031 22.3020 (-16.5%) (+0.2%) ideal ? 1.1763 ? 22.2506 We have similar observations from the MPEG encoder exam- ple except that the energy saving (over the fixed-voltage system) is much lower, albeit a notable 13%. This is because that majority of the energy is consumed on the deterministic subtasks. How- ever, multiple-voltage systems again successfully reduced the "wasted" energy from more than 20% (for fixed voltage) to 4.0%, 1.1%, and 0.2%. To validate the correctness of our results, we use Matlab to simulate 100,000 iterations of each application under different voltage set-ups for dual-, 3-, and 4-voltage systems. In all the cases, this exhaustive search finds the same solution, within the precision of voltage increment 0.01V we set, as we reported in Table 2 by our methods. Figures 4 illustrates this for the dual- voltage system for the MPEG encoder. We set the high voltage V 2 to go from 0 n V (2.8934V) to 3.3V, and the low voltage V 1 to go from 1.0V to 3.3V, both with an increment of 0.01V. In the fig- ure, we see that the energy consumption is minimized at the same set-up as we obtained theoretically. Figure 4. Average energy consumption for the MPEG encoder with different dual-voltage set-ups. 5. Conclusion We consider the voltage set-up problem for application-specific multiple-voltage DVS systems. The problem seeks to determine the number of voltage levels and the voltage at each level to minimize the energy consumption for a set of applications. We give analytic optimal solution for the dual-voltage system and develop fast iterative and approximation approaches for the gen- eral case. The hardware overhead to supply multiple voltages, once obtained, can be conveniently integrated into our techniques to solve the voltage set-up problem. Simulation results show the correctness and efficiency of our approaches. We also observe that multiple-voltage systems can reduce energy consumption almost as much as the ideal DVS systems, which implies that the full potential of DVS can be reached by the practical multiple- voltage DVS systems. We are currently examining ways to extend our approaches to more complicated applications from real life. 6. References [1] T. D. Burd, T. A. Pering, A. J. Stratakos, and R. W. Broder- sen. A dynamic voltage scaled microprocessor system. IEEE J. Solid-State Circuits, 35(11):1571-1580, Nov. 2000. [2] A. Chandrakasan, S. Sheng, and R.W. Brodersen, Low- power CMOS digital design. IEEE J. Solid-State Circuits, 27(4):473-484, Apr. 1992. [3] J.-M.Chang and M. Pedram. Energy minimization using multiple supply voltages. ISLPED, pp. 157-162, 1996. [4] A. Chandrakasan, V. Gutnik, and T. Xanthopoulos. Data driven signal processing: an approach for energy efficient computing. ISLPED, pp. 347-352, 1996. [5] C. Chen and M. Sarrafzadeh. Provably good algorithm for low power consumption with dual supply voltages. ICCAD, pp. 76-79, 1999. [6] S. Dhar and D. Maksimovic. Low-power digital filtering using multiple voltage distribution and adaptive voltage scal- ing. ISLPED, pp. 207-209, 2000. [7] S. Hua, G. Qu, and S. S. Bhattacharyya. Energy reduction techniques for multimedia applications with tolerance to deadline misses. DAC, pp. 131-136, 2003. [8] T. Ishihara and H. Yasuura. Voltage scheduling problem for dynamically variable voltage processors. ISLPED, pp. 197- 202, 1998. [9] M. C. Johnson and K. Roy. Datapath scheduling with multi- ple supply voltages and level converters. ACM Trans. on De- sign Automation of Electronics Systems, 2(3):227-248, 1997. [10]A. Kalavade and P. Moghe. A tool for performance estima- tion of networked embedded end-systems. DAC, pp. 257- 262, 1998. [11]G. Qu and M. Potkonjak. Techniques for energy minimiza- tion of communication pipelines. ICCAD, pp.597-600, 1998. [12]G. Qu. What is the limit of energy saving by dynamic voltage scaling? ICCAD, pp. 560-563, 2001. [13]D. Mosse, H. Aydin, B. Childers, and R. Melhem. Compiler- assisted dynamic power-aware scheduling for real-time ap- plications. COLP, 2000. [14]S. Raje and M. Sarrafzadeh. Variable voltage scheduling. ISLPED, pp. 9-14, 1995. [15]T.-S.Tia,Z.Deng,M.Shankar,M.Storch,J.Sun,L.-C. Wu, and J.W.?S. Liu. Probabilistic performance guarantee for real-time tasks with varying computation times. RTAS, pp. 164-173, 1995. 29 Proceedings of the International Conference on Computer Aided Design (ICCAD?03) 1092-3152/03 $ 17.00 ? 2003 ACM