ABSTRACT Title of dissertation: DELAY MINIMIZATION IN ENERGY CONSTRAINED WIRELESS COMMUNICATIONS Jing Yang, Doctor of Philosophy, 2010 Dissertation directed by: Professor S?ennur Uluku?s Department of Electrical and Computer Engineering In wireless communications and networks, especially for many real-time ap- plications, the average delay packets experience is an important quality of service criterion. Therefore, it is imperative to design advanced transmission schemes to jointly address the goals of reliability, high rates and low delay. Achieving these objectives often requires careful allocation of given resources, such as energy, power, rate, among users. It also requires a close collaboration between physical layer, medium access control layer, and upper layers, and yields cross-layer solutions. We flrst investigate the problem of minimizing the overall transmission delay of packets in a multiple access wireless communication system, where the transmit- ters have average power constraints. We formulate the problem as a constrained optimization problem, and then transform it into a linear programming problem. We show that the optimal policy has a threshold structure: when the sum of the queue lengths is larger than a threshold, both users should transmit a packet during the current slot; when the sum of the queue lengths is smaller than a threshold, only one of the users, the one with the longer queue, should transmit a packet during the current slot. Then, we study the delay-optimal rate allocation in a multiple access wireless communication system. Our goal is to allocate rates to users, from the multiple access capacity region, based on their current queue lengths, in order to minimize the average delay of the system. We formulate the problem as a Markov decision problem (MDP) with an average cost criterion. We flrst show that the value function is increasing, symmetric and convex in the queue length vector. Taking advantage of these properties, we show that the optimal rate allocation policy is one which tries to equalize the queue lengths as much as possible in each slot, while working on the dominant face of the capacity region. Next, we extend the delay-optimal rate allocation problem to a communication channel with two transmitters and one receiver, where the underlying rate region is approximated as a general pentagon. We show that the delay-optimal policy has a switch curve structure. For the discounted-cost problem, we prove that the switch curve has a limit along one of the dimensions. The existence of a limit in the switch curve along one of the dimensions implies that, once the queue state is beyond the limit, the system always operates at one of the corner points, implying that the queues can be operated partially distributedly. Next, we shift our focus from the average delay minimization problem to trans- mission completion time minimization problem in energy harvesting communication systems. We flrst consider the optimal packet scheduling problem in a single-user energy harvesting wireless communication system. In this system, both the data packets and the harvested energy are modeled to arrive at the source node ran- domly. Our goal is to adaptively change the transmission rate according to the tra?c load and available energy, such that the time by which all packets are de- livered is minimized. Under a deterministic system setting, we develop an optimal ofi-line scheduling policy which minimizes the transmission completion time, under causality constraints on both data and energy arrivals. Then, we investigate the transmission completion time minimization problem in a two-user additive white Gaussian noise (AWGN) broadcast channel, where the transmitter is able to harvest energy from the nature. We flrst analyze the structural properties of the optimal transmission policy. We prove that the optimal total transmit power has the same structure as the optimal single-user transmit power. We also prove that there exists a cut-ofi power level for the stronger user. If the optimal total transmit power is lower than this level, all transmit power is allocated to the stronger user, and when the optimal total transmit power is larger than this level, all transmit power above this level is allocated to the weaker user. Based on these structural properties of the optimal policy, we propose an algorithm that yields the globally optimal ofi-line scheduling policy. Next, we investigate the transmission completion time minimization problem in a two-user AWGN multiple access channel. We flrst develop a generalized itera- tive backward waterfllling algorithm to characterize the maximum departure region of the transmitters for any given deadline. Then, based on the sequence of maximum departure regions at energy arrival epochs, we decompose the transmission comple- tion time minimization problem into a convex optimization problem and solve it e?ciently. Finally, we investigate the average delay minimization problem in a single-user communication channel with an energy harvesting transmitter. We consider three difierent cases. In the flrst case, both the data packets and the energy to be used to transmit them are assumed to be available at the transmitter at the beginning. In the second case, while the energy is available at the transmitter at the beginning, packets arrive during the transmissions. In the third case, the packets are available at the transmitter at the beginning and the energy arrives during the transmissions, as a result of energy harvesting. In each scenario, we flnd the structural properties of the optimal solution, and develop iterative algorithms to obtain the solution. DELAY MINIMIZATION IN ENERGY CONSTRAINED WIRELESS COMMUNICATIONS by Jing Yang Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulflllment of the requirements for the degree of Doctor of Philosophy 2010 Advisory Committee: Professor S?ennur Uluku?s, Chair/Advisor Professor Anthony Ephremides Professor Armand Makowski Professor Andr?e Tits Professor Richard Wentworth c Copyright by Jing Yang 2010 DEDICATION To my parents. ii ACKNOWLEDGEMENT First, I would like to thank my advisor Professor S?ennur Uluku?s, without whom this dissertation would not have been possible. I thank her for giving me an invaluable opportunity to work with her, and ofiering constant guidance and support through my PhD studies. I am grateful for her insightful advice, encouragement and patience. ThanksareduetoProfessorAnthonyEphremides, ProfessorArmandMakowski, Professor Andr?e Tits and Professor Richard Wenthworth for agreeing to serve on my dissertation committee and for their valuable comments and suggestions. I am especially thankful to Professor Armand Makowski for his kind concern and advice regarding my PhD studies. I also owe a word of thanks to my group members. I learned a lot from the presentations and discussions during our weekly group meetings. I give special thanks to Omur Ozel, for his input on the completion of our joint work. I am thankful to all of my friends at Maryland. I would like to thank Chao Wu, who has helped me a lot since the flrst day I came to Maryland. I would also like to thank Beiyu Rong and Jianwei Xie for all the cheerful discussions and chats during our tea breaks. I would like to thank Jiao Yu, Yiyuan Yin, Qiang Liu, Kai Li, Linjie Li, and many other friends who made my stay at Maryland fun and memorable. Finally, I would like to express my gratitude to my family. I am grateful to my parents for their unconditional love, undoubted trust and unwavering support iii during my life. I would like to thank my little sister for cheering me up when I was upset. I am thankful to Weiqiang Wu for his enduring love, caring, patience and support. He always stands by me, supports me to pursue my dreams and provides valuable advice on every aspect of my life. His in uence has shaped me into the person I am today. iv Table of Contents List of Figures ix 1 Introduction 1 2 Average Delay Minimization for Average Power Constrained Multiple Access Communications 10 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2.1 Physical Layer Model . . . . . . . . . . . . . . . . . . . . . . . 15 2.2.2 Medium Access Control Layer Model . . . . . . . . . . . . . . 16 2.3 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4 Analysis of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.5 The Modifled Optimization Problem and a Two-Step Solution . . . . 32 2.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.8 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.8.1 The Proof of Theorem 2.1 . . . . . . . . . . . . . . . . . . . . 46 2.8.2 The Proof of Theorem 2.2 . . . . . . . . . . . . . . . . . . . . 49 2.8.3 The Proof of Theorem 2.3 . . . . . . . . . . . . . . . . . . . . 54 3 Delay Minimization in a Symmetric Multiple Access Channel 59 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.2 System Model and Problem Formulation . . . . . . . . . . . . . . . . 62 v 3.2.1 Physical Layer Model . . . . . . . . . . . . . . . . . . . . . . . 62 3.2.2 Medium Access Control Layer Model . . . . . . . . . . . . . . 63 3.2.3 Formulation as an MDP . . . . . . . . . . . . . . . . . . . . . 64 3.3 The Discounted Cost Problem . . . . . . . . . . . . . . . . . . . . . . 65 3.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 4 Delay Minimization with a General Pentagon Rate Region 77 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 4.2 System Model and Problem Formulation . . . . . . . . . . . . . . . . 81 4.3 An Inductive Proof of the Switch Structure . . . . . . . . . . . . . . . 87 4.4 The Limit on the Switch Curve . . . . . . . . . . . . . . . . . . . . . 90 4.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 4.7 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 4.7.1 g(q) is increasing in q1 and q2. . . . . . . . . . . . . . . . . . . 100 4.7.2 g(q+x)?g(q) is increasing in q1 and q2 for any flxed x. . . . 101 4.7.3 (a1 ?a2)g(D1q)+(b1 ?b2)g(D2q)+?g(q) is increasing in q1. 104 4.7.4 (a1 ?a2)g(D1q)+(b1 ?b2)g(D2q)+?g(q) is decreasing in q2. 109 5 Optimal Packet Scheduling in a Single-User Energy Harvesting System 112 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 5.2 Scenario I: Packets Ready Before Transmission Starts . . . . . . . . . 115 5.3 Scenario II: Packets Arrive During Transmissions . . . . . . . . . . . 128 vi 5.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 5.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 5.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 5.6.1 The Proof of Theorem 5.1 . . . . . . . . . . . . . . . . . . . . 137 5.6.2 The Proof of Theorem 5.2 . . . . . . . . . . . . . . . . . . . . 140 5.6.3 The Proof of Theorem 5.3 . . . . . . . . . . . . . . . . . . . . 142 5.6.4 The Proof of Theorem 5.4 . . . . . . . . . . . . . . . . . . . . 143 6 Optimal Packet Scheduling in a Broadcast Channel with an Energy Harvest- ing Transmitter 146 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 6.2 System Model and Problem Formulation . . . . . . . . . . . . . . . . 151 6.3 Characterizing D(T): Largest (B1;B2) Region for a Given Deadline T 153 6.4 Minimizing the Transmission Completion Time T for a Given (B1;B2)164 6.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 6.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 7 Optimal Packet Scheduling in a Multiple Access Channel with Energy Har- vesting Transmitters 177 7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 7.2 System Model and Problem Formulation . . . . . . . . . . . . . . . . 181 7.3 Characterizing D(T): Largest (B1;B2) Region for a Given Deadline T 183 7.3.1 ?1 = ?2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 7.3.2 ?1 = 0 or ?2 = 0. . . . . . . . . . . . . . . . . . . . . . . . . . 190 vii 7.3.3 General ?1;?2 > 0. . . . . . . . . . . . . . . . . . . . . . . . . 194 7.4 Minimizing the Transmission Completion Time T for a Given (B1;B2)197 7.4.1 (B1;B2) lies on the at part of the dominant face. . . . . . . . 200 7.4.2 (B1;B2) lies on the vertical or horizontal part. . . . . . . . . . 202 7.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 7.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 8 Average Delay Minimization for an Energy Constrained Single-User Channel210 8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210 8.2 Scenario I: Packets and Energy Ready Before Transmission Starts . . 212 8.3 Scenario II: Random Packet Arrivals . . . . . . . . . . . . . . . . . . 216 8.3.1 An Iterative Approach . . . . . . . . . . . . . . . . . . . . . . 222 8.3.2 A Dynamic Programming Approach . . . . . . . . . . . . . . . 224 8.4 Scenario III: Random Energy Arrivals . . . . . . . . . . . . . . . . . . 226 8.4.1 A Dynamic Programming Approach . . . . . . . . . . . . . . . 231 8.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 8.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 9 Conclusions 240 viii List of Figures 2.1 System model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.2 Two-dimensional Markov chain. . . . . . . . . . . . . . . . . . . . . . 20 2.3 Feasible power region. . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.4 The transitions between diagonal groups when N = 3. . . . . . . . . . 30 2.5 Example: allocation within groups when N = 4. . . . . . . . . . . . . 35 2.6 The average delay versus average power in the symmetric scenario. . . 44 2.7 Allocation pattern within groups. . . . . . . . . . . . . . . . . . . . . 50 2.8 The transitions between states when n is even. . . . . . . . . . . . . . 51 2.9 The transitions between states when n is odd. . . . . . . . . . . . . . 52 3.1 The capacity region for a two-user multiple access system. . . . . . . 63 3.2 System model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3.3 Average delay versus arrival arrival rate. . . . . . . . . . . . . . . . . 75 4.1 The asymmetric pentagon rate region with a non-45? dominant face. Corner point 2 has larger sum-rate, i.e., a2 +b2 > a1 +b1. . . . . . . . 80 4.2 The system model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 4.3 The switch structure of the optimal policy. . . . . . . . . . . . . . . . 89 4.4 The switch structure for a symmetric Gaussian MAC. . . . . . . . . . 90 4.5 The switch curve of the discounted-cost MDP. . . . . . . . . . . . . . 98 4.6 The switch curves for the discounted-cost MDP. . . . . . . . . . . . . 99 4.7 We compare the values of g(q) at difierent states. . . . . . . . . . . . 101 ix 4.8 Two special policy patterns. . . . . . . . . . . . . . . . . . . . . . . . 106 4.9 The optimal operating points at A21q, A1q, A21A2q, A1A2q, A2q are 2;1;2;1;1, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . 107 4.10 The optimal operating points at A21q, A1q, A21A2q, A1A2q, A2q are 2;2;1;1;1, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . 108 4.11 The optimal operating points at A1q, A1A2q, A1A22q, A2q, A22q are 2;2;2;1;1, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . 110 5.1 An energy harvesting communication system model. . . . . . . . . . . 113 5.2 System model with random packet and energy arrivals. Data packets arrive at points denoted by ? and energies arrive (are harvested) at points denoted by ?. . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 5.3 System model with all bits available at the beginning. Energies arrive at points denoted by ?. . . . . . . . . . . . . . . . . . . . . . . . . . . 116 5.4 The sequence of transmission powers and durations. . . . . . . . . . . 117 5.5 The rate must remain constant between energy harvests. . . . . . . . 120 5.6 f(p) is concave in p. . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 5.7 An interpretation of transmission policies satisfying Lemmas 5.1, 5.2, 5.3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 5.8 An illustration of the algorithm. . . . . . . . . . . . . . . . . . . . . . 125 5.9 Optimal transmit powers p = [3;5;10;20] mW, with durations l = [5;3;1;0:5] s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 x 5.10 Optimal transmit powers p = [1;2;10] mW, with durations l = [5;5;1]?10?2 s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 5.11 Two difierent cases in the proof of Theorem 5.1. . . . . . . . . . . . . 139 5.12 Two difierent cases in the proof of Theorem 5.2. . . . . . . . . . . . . 142 6.1 An energy harvesting two-user broadcast channel. . . . . . . . . . . . 147 6.2 System model. (B1;B2) bits to be transmitted to users are available at the transmitter at the beginning. Energies arrive (are harvested) at points denoted by ?. T denotes the transmission completion time by which all of the bits are delivered to their respective destinations. 148 6.3 The capacity region of the two-user AWGN broadcast channel. . . . . 152 6.4 The maximum departure region and possible trajectories to reach the boundary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 6.5 Rates (r1n;r2n) and corresponding durations ln with a given deadline T. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 6.6 The value of the optimal transmit power is always equal to the curve on top. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 6.7 Optimally splitting total power between the signals that go to the two users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 6.8 Determining the optimal total power level of the flrst epoch. . . . . . 168 6.9 Search for the cutofi power level Pc iteratively. . . . . . . . . . . . . . 170 6.10 The maximum departure region of the broadcast channel for various T.174 xi 6.11 Cut-ofi power Pc = 1:933 mW. Optimal transmit rates are r1 = 1:552 Mbps, r2 = [0:274;0:680;1:369;1:834] Mbps, with durations l = [5;3;1;0:66] s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 6.12 Cut-ofi power Pc = 4:107 mW. Optimal transmit rates r1 = [2;2:353; 2:353;2:353] Mbps and r2 = [0;0:167;0:856;2:570] Mbps, with dura- tions l = [5;3;1;0:25] s. . . . . . . . . . . . . . . . . . . . . . . . . . . 176 7.1 (a) An energy harvesting MAC model with energy and data queues, and (b) the capacity region of the additive white Gaussian noise MAC.178 7.2 System model with all packets available at the beginning. Energies arrive at points denoted by ?. . . . . . . . . . . . . . . . . . . . . . . 179 7.3 The power/rate must remain constant between energy harvests. . . . 185 7.4 The transmit powers of individual user. . . . . . . . . . . . . . . . . . 190 7.5 The departure region D(T). . . . . . . . . . . . . . . . . . . . . . . . 190 7.6 The optimal transmit power for the second user to maximize its de- parture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 7.7 The minimum transmission completion time T to depart (B1;B2). . . 204 7.8 The maximum departure region of the multiple access channel for various T. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 7.9 Optimal transmit powers p1 = [2;0;5;2:5] mW, p2 = [1;5;0;2:5] mW, with durations l = [5;2;1;2] s. . . . . . . . . . . . . . . . . . . . 206 7.10 Optimaltransmitpowersp1 = [1:43;1:43;2:67]mW,p2 = [1;3:54;2:11] mW, with durations l = [5;2;3:75] s. . . . . . . . . . . . . . . . . . . 207 xii 7.11 Optimaltransmitpowersp1 = [1:86;0:35;3:63;3:03]mW,p2 = [1;4:43; 1:14;2:38] mW, with durations l = [5;2;1;2:1] s. . . . . . . . . . . . . 208 7.12 The maximum departure region of the multiple access channel for various T. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 8.1 System model with random packet and energy arrivals. Data packets arrive at points denoted by ? and energies arrive (are harvested) at points denoted by ?. . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 8.2 System model when all packets and energy are ready before the trans- mission starts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 8.3 System model with random packet arrivals. . . . . . . . . . . . . . . . 216 8.4 Four difierent cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 8.5 System model for the dynamic programming approach. . . . . . . . . 225 8.6 Average delay minimization with energy harvesting. . . . . . . . . . . 226 8.7 (a) The waiting time for the flrst packet, W1 and (b) the delay for the flrst packet, D1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 8.8 Tn(e;a) in the dynamic programming formulation. . . . . . . . . . . . 233 8.9 Overall delay as a function of the iteration index, when E = 48?10?2 mJ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236 8.10 Overall delay as a function of the iteration index, when E = 45?10?2 mJ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 8.11 The optimal energy allocation e = [10;10;10;10] ? 10?2 mJ, with duration ? = [1;1;1;1]?10?2 s, respectively. . . . . . . . . . . . . . . 237 xiii 8.12 The optimal energy allocation e = [10:5;9:5;10;10]?10?2 mJ, with duration ? = [0:89;1:15;1;1]?10?2 s, respectively. . . . . . . . . . . 238 8.13 The optimal energy allocation e = [12:6;11:8;10:9;9:7] ? 10?2 mJ, with duration ? = [0:63;0:70;0:82;1:07]?10?2 s, respectively. . . . . 238 xiv Chapter 1 Introduction In modern wireless communication systems, especially for many real-time appli- cations, the delay packets experience is an important quality of service criterion. Therefore, in such systems, allocating the given resources, such as average power, energy, etc., to minimize the average delay, becomes an important issue. Since power and energy are physical layer resources, and the delay is a medium access control layer issue, such resource allocation problems require close collaboration of physical and medium access control layers, and yield cross-layer solutions. In addition, in many circumstances, such problems require treatments that combine information theory and queueing theory to obtain optimal solutions [1]. References [2{7] analyze the trade-ofi relationship between power and delay for a single-user communication system. Random arrivals queue at the transmitter to wait to be transmitted. In each slot, the transmitter adapts its service rate, i.e., transmission rate, according to the queue length and the channel state, as well as the average power constraint, to minimize the average delay. Reference [2] (see also [3]) formulates the problem as a dynamic programming problem and develops a delay-power tradeofi curve. References [4] and [5] determine some structural prop- erties of the optimal power/rate allocation policy. Reference [6] derives bounds on the average delay in a system with a single queue concatenated with a multi-layer 1 encoder. Reference [7] proposes a dynamic programming formulation to flnd optimal power, channel coding and source coding policies with a delay constraint. Refer- ence [8] considers the delay-optimal transmission policy for the secondary user in a cognitive multiple access system. It formulates the problem as a one-dimensional Markov chain and derives an analytical result to minimize the average delay of the secondary user under an average power constraint. The delay-optimal rate allocation in multiple access channels has been investi- gated in the literature as well. Reference [9] considers a symmetric Gaussian multiple access channel (MAC), whose capacity region for two-users is a symmetric pentagon. Reference [9] proves that in order to minimize the packet delay, the system should operate at an extreme point of the MAC capacity region, i.e., at one of the two corner points of the symmetric pentagon. In particular, [9] determines explicitly the corner point the system should operate at as a function of the queue sizes, by proving that the larger rate should be given to the user with the larger queue size, hence the name of the proposed policy: longer-queue-higher-rate (LQHR). Reference [10] generalizes [9] to a potentially asymmetric setting, and proves that the system should again operate at one of the two corner points of the capacity region, which in this case is a potentially asymmetric pentagon. This proves that the delay-optimal policy has a switch structure, i.e., that the queue state space should be divided into two, and in each region, the system should operate at one of the two corner points. However, unlike the symmetric case in [9], the explicit form of the switch curve is unknown. Reference [11] develops a policy named \modifled LQHR" which works at a corner point of the pentagon when the queue lengths are difierent, and switches to 2 the mid-point of the dominant face of the pentagon when the queue lengths become equal. The \modifled LQHR" algorithm is shown to minimize the average bit delay in a symmetric system. The third chapter of [12] extends \modifled LQHR" to a symmetric M-user scenario. The trade-ofi relationship between delay and energy has been well studied in traditional battery powered (unrechargeable) systems. References [13{18] investi- gate energy minimization problems with various deadline constraints. Reference [13] considers the problem of minimizing the energy in delivering all packets to the destination by a deadline. It develops a lazy scheduling algorithm, where the transmission times of all packets are equalized as much as possible, subject to the deadline and causality constraints, i.e., all packets must be delivered by the deadline and no packet may be transmitted before it has arrived. This algorithm also elon- gates the transmission time of each packet as much as possible, hence the name, lazy scheduling. Under a similar system setting, [14] proposes an interesting novel cal- culus approach to solve the energy minimization problem with individual deadlines for each packet. Reference [15] develops dynamic programming formulations and determines optimality conditions for a situation where channel gain varies stochas- tically over time. Reference [16] considers energy-e?cient packet transmission with individual packet delay constraints over a fading channel, and develops a recursive algorithm to flnd an optimal ofi-line schedule. This optimal ofi-line scheduler equal- izes the energy-rate derivative function as much as possible subject to the deadline and causality constraints. References [17] and [18] extend the single-user problem to multi-user scenarios. 3 InChapter2, wegeneralize[8]toatwo-usermultipleaccesssystem, whereboth users have equal priority. Our goal is to minimize the average delay of the packets in the system under an average power constraint for each user. Unlike [2, 4, 5], where the rate per slot is a continuous variable, we restrict the transmission rate for each user in a slot to be either zero or one packet per slot. Our objective is to flnd a set of transmission probabilities according to the queue lengths of both users, so that the average delay in the system is minimized. Compared to [2, 4, 5], our model has a more restricted policy space at each stage, however, this model enables us to construct a two-dimensional discrete-time Markov chain and eventually gives us a closed-form optimal solution. We show that the optimal transmission policy has a threshold structure, i.e., if the sum of the queue lengths exceeds a threshold, both users transmit a packet from their queues, and if the sum of the queue lengths is smaller than a threshold, only one user, which has the larger queue length, transmits a packet from its queue, while the other user remains silent (equal queue length case is resolved by ip of a potentially biased coin). In Chapter 3, we aim to minimize the average delay through rate allocation in a symmetric MAC. We consider a time-slotted system. In each slot, bits arrive at the transmitters randomly according to some general distribution. At the beginning of each slot, we allocate transmission rates from within the MAC capacity region to the users, based on their current queue lengths, to minimize the average delay. We formulate the problem as an average cost Markov decision problem (MDP). We flrst analyze the corresponding discounted cost MDP, and obtain some properties of the value function. Based on these properties, we prove that the delay optimal rate 4 allocation policy for this discounted MDP is to equalize the queue lengths in each slot as much as possible. We then prove that this queue balancing policy is optimal for the average cost MDP as well. In Chapter 4, we extend the delay-optimal rate allocation problem into a com- munication channel with a general pentagon rate region. Difierentfrom theGaussian MAC capacity region, the pentagon we assume does not have a 45? dominant face. Through value iteration, we prove that a switch curve structure exists in the queue state space. Next, we prove that for the discounted-cost MDP, the switch curve has a limit on one of the queue lengths, i.e., when one of the queue lengths exceeds a threshold, the transmitters always operate at the corner point which has the larger sum-rate. That is, the delay-optimal policy favors throughput-optimality (i.e., larger sum-rate) unless the flrst queue gets close to empty, in which case, the policy favors balancing queue lengths. Our result has two practical implications: First, it gives a partial analytical characterization for the delay-optimal switch curve. Secondly, it implies that we can operate the queues partially distributedly, in that, if the current queue length of the flrst user is larger than the limit, then this user does not need to know the current queue length of the other user in order to decide about the rate point at which it should operate on the rate region. The optimal policy also indicates that always operating at the maximum sum-rate point is not optimal. The optimal rate allocation policy trades some of the instantaneously achievable sum- rate in favor of balancing the queue lengths, with the goal of minimizing the overall delay. In Chapters 5, 6, 7, 8, we consider wireless communication networks where 5 nodes are able to harvest energy from nature. In our work, we do not focus on how energy is harvested, instead, we focus on developing transmission methods that take into account the randomness both in the arrivals of the data packets as well as in the arrivals of the harvested energy. Since devising on-line algorithms that update the instantaneous transmission rate and power in real-time as functions of the current data and energy queue lengths is an intractable mathematical problem for now, here, we aim to develop optimal ofi-line algorithms instead. In Chapter 5, we consider a single-user communication channel with an energy harvesting transmitter. We assume that an initial amount of energy is available at t = 0. As time progresses, certain amounts of energies will be harvested. For the data arrivals, we consider two difierent scenarios. In the flrst scenario, we assume that packets have already arrived and are ready to be transmitted at the transmitter before the transmission starts. In the second scenario, we assume that packets arrive during the transmissions. However, as in the case of energy arrivals, we assume that we know exactly when and in what amounts data will arrive. Subject to the energy and data arrival constraints, our purpose is to minimize the time by which all packets are delivered to the destination through controlling the transmission rate and power. Since we do not know the exact amount of energy to be used in the transmissions, we develop an algorithm, which flrst obtains a good lower bound for the flnal total transmission duration at the beginning, and performs rate and power allocation based on this lower bound. The procedure works iteratively until all of the transmission rates and powers are determined. We prove that the transmission policy obtained through this algorithm is globally optimum. 6 In Chapter 6, we consider a multi-user extension of the problem studied in Chapter 5. In particular, we consider a wireless broadcast channel with an energy harvesting transmitter and two receivers. References [19, 20] investigate two related problems. The flrst problem is to maximize the throughput (number of bits trans- mitted) with a given deadline constraint, and the second problem is to minimize the transmission completion time with a given number of bits to transmit. These two problems are \dual" to each other in the sense that, with a given energy arrival proflle, if the maximum number of bits that can be sent by a deadline is B? in the flrst problem, then the minimum time to transmit B? bits in the second problem must be the deadline in the flrst problem, and the optimal transmission policies for these two problems must be identical. In Chapter 6, we will follow this \dual prob- lems" approach. We will flrst attack and solve the flrst problem to determine the structural properties of the optimal solution. We will then utilize these structural properties to develop an iterative algorithm for the second problem. Our iterative approach has the goal of reducing the two-user broadcast problem into a single-user problem as much as possible, and utilizing the single-user solution in Chapter 5. InChapter7, weconsiderthetransmissioncompletiontimeminimizationprob- lem in a two-user rechargeable multiple access channel. As in Chapter 6, we flrst aim to characterize the maximum number of bits both users can transmit for any given time T. We propose a generalized iterative backward waterfllling algorithm to achieve the boundary points of the maximum departure region for any given deadline T. Then, based on the solution of this \dual" problem, we go back to the transmission completion time minimization problem, simplify it into standard 7 convex optimization problems, and solve it e?ciently. In particular, we flrst char- acterize the maximum departure region for every energy arrival epoch, and based on the location of the given (B1;B2) on the maximum departure region, we narrow down the range of the minimum transmission completion time to be between two consecutive epochs. Based on this information, we propose to solve the problem in two steps. In the flrst step, we solve for the optimal power policy sequences to achieve the minimum T, so that (B1;B2) is on the maximum departure region for this T. This step can be formulated as a convex optimization problem. Then, with the optimal power policy obtained in the flrst step, we search for the optimal rate policy sequences from the capacity regions deflned by the power sequences to flnish B1;B2 bits. The second step is formulated as a linear programming problem. In addition, we further simplify the problem by exploiting the optimal structural properties for two special scenarios. In Chapter 8, we revisit the average delay minimization problem in an energy harvestingsingle-user system. Under adeterministic setting, ouraim istoadaptively allocate the energy over all packets according to the available amount of energy and number of packet at the transmitter, in a way to minimize the overall delay of the system. The most general version of the problem is complicated. In Chapter 8, we will consider three scenarios, starting with the simplest setting and proceeding with progressively more complicated settings. In the flrst scenario, we assume that the transmitter has a flxed number of packets to transmit, and a flxed amount energy to use in its transmissions. In the second scenario, we assume that the transmitter has a flxed amount of energy, but the packets arrive during the transmissions. In 8 the third scenario, we assume that the transmitter has a flxed number of packets available at the beginning, but the energy arrives during the transmissions. This last setting models an energy harvesting transmitter which harvests energy from the nature by using a rechargeable battery. For each scenario, we develop iterative algorithms and dynamic programming formulation to obtain the optimal solution. 9 Chapter 2 Average Delay Minimization for Average Power Constrained Multiple Access Communications 2.1 Introduction In many applications, the average delay packets experience is an important quality of service criterion. Therefore, it is important to allocate the given resources, e.g., average power, energy, etc., in a way to minimize the average delay packets experi- ence. Since power and energy are physical layer resources, and the delay is a medium access control layer issue, such resource allocation problems require close collabo- ration of physical and medium access control layers, and yield cross-layer solutions. Our goal in this chapter is to combine information theory and queueing theory to devise a transmission protocol which minimizes the average delay experienced by packets, subject to an average power constraint at each transmitter. Similar goals have been undertaken by various authors in recent years. Refer- ence [21] considers a time-slotted system with N queues and one server. The length of the slot is equal to the transmission time of a packet in the queue. In each slot, the controller allocates the server to one of the connected queues, such that the average delay in the system is minimized. The authors develop an algorithm named longest connected queue (LCQ), where the server is allocated to the longest of all 10 connected queues at any given slot. The authors prove that in a symmetric system, LCQ algorithm minimizes the average delay. Reference [21] does not consider the issue of power consumption during transmissions. Reference [22] combines information theory and queueing theory in a multi- access communication over an additive Gaussian noise channel. Authors consider a continuous time system, where the arrival of packets is a Poisson process, and the service time required for each packet is random. Once a packet arrives, it is transmitted immediately with a flxed power, i.e., there are no queues at the transmitters. Each transmitter-receiver pair treats the other active pairs as noise. Because of the interference from the other transmitters, the service rate for each packet is a function of the number of active users in the system. Reference [22] derives a relationship between the average delay and a flxed probability of error requirement. References [2{7, 9, 11, 23] consider the data transmission problem from both information theory and queueing theory perspectives. Reference [9] (see also [23], [11]) aims to minimize the average delay through rate allocation in a multiple access scenario in additive Gaussian noise. Unlike [22], in the setting of [9], packets arrive randomly into the bufiers of the transmitters. When the queue at a transmitter is not empty, it transmits a packet with a flxed power. Simultaneously achievable rates are characterized by the capacity region of a multiple access channel (MAC), which, for the non-fading Gaussian case, is a pentagon. The purpose of [9] is to flnd an operating point on the capacity region of the corresponding MAC such that the average delay is minimized. The author develops the longer-queue-higher-rate 11 (LQHR) allocation strategy in the symmetric MAC case, which is shown to minimize the average delay of the packets. The LQHR allocation scheme selects an extreme point (i.e., a corner point) in the MAC capacity region. Reference [2] (see also [3]) considers the problem of rate/power control in a single-user communication over a fading channel. It considers a discrete-time model, and investigates adapting rate/power in each slot according to the queue length, source state and channel state. The objective is to minimize the average power with delay constraints. It discusses two transmission models. In the flrst model, the transmission time of a codeword is flxed, while the rate varies from block to block. In the second model, the transmission time for each codeword varies. It formulates the problem into a dynamic programming problem and develops a delay-power tradeofi curve. References [4{7] have similar formulations. Reference [4] uses dynamic pro- gramming to numerically calculate the optimal power/rate control policies that min- imize the average delay in a single-user system under an average power constraint. Reference [6] derives bounds on the average delay in a system with a single queue concatenated with a multi-layer encoder. Reference [5] formulates the power con- strained average delay minimization problem into a Markov decision problem (MDP) and analyzes the structure of the optimal solution for a single-user fading channel. Reference [7] proposes a dynamic programming formulation to flnd optimal power, channel coding and source coding policies with a delay constraint. As in [2], in these papers as well, because of the large number of possible rate/power choices at each stage, it is almost impossible to get analytical optimal solutions. 12 Reference [8] considers a cognitive multiple access system. In the model of [8], the primary user (PU) always transmits a packet during a slot whenever its queue is not empty. The secondary user (SU) always transmits when the PU is idle, and it transmits with some probability (which is a function of its own queue length) when the PU is active. The receiver operates at the corner point of the MAC capacity region where the SU is decoded flrst and the PU is decoded next, so that even though the SU experiences interference from the PU, the PU is always decoded interference-free. Reference [8] aims to minimize the average delay through controlling the transmission probability of the SU. It formulates the problem as a one-dimensional Markov chain and derives an analytical result to minimize the average delay of the SU under an average power constraint. In this chapter, we generalize [8] to a two-user multi-access system, where both users have equal priority. Our goal is to minimize the average delay of the packets in the system under an average power constraint for each user. As in [2, 4, 5, 8], we consider a discrete-time model. We divide the transmission time into time slots. Packets arriving at the transmitters are stored in the queues at each transmitter. In each slot, each user decides on a transmission rate based on the current lengths of both queues. Unlike [2, 4, 5], where the rate per slot is a continuous variable, we restrict the transmission rate for each user in a slot to be either zero or one packet per slot. We deflne the probabilities of choosing each transmission rate pair, which can be (0;0), (0;1), (1;0) or (1;1), for each given pair of queue lengths. Our objective is to flnd a set of transmission probabilities that minimizes the average delay while satisfying the average power constraints for both users. As 13 in [8], there are two main reasons that we introduce transmission probabilities: First, a randomized policy is more general than a deterministic policy; probability selections of 0 and 1 correspond to a deterministic policy, which is a special case of the randomized policy. Secondly, since we cannot choose arbitrary departure rates in each slot, the use of transmission probabilities enables us to control the average rate per slot at a flner scale. Compared to [2, 4, 5], our model has a more restricted policy space at each stage, however, this model enables us to construct a two-dimensional discrete-time Markov chain and eventually gives us a closed-form optimal solution. In the rest of this chapter, we flrst express the average delay and the average power consumed for each user as functions of the transmission probabilities and steady state distribution of the queue lengths. We then transform our problem into a linear programming problem, and derive the optimal transmission scheme analytically. Weshowthattheoptimaltransmissionpolicyhasathresholdstructure, i.e., if the sum of the queue lengths exceeds a threshold, both users transmit a packet from their queues, and if the sum of the queue lengths is smaller than a threshold, only one user, which has the larger queue length, transmits a packet from its queue, while the other user remains silent (equal queue length case is resolved by ip of a potentially biased coin). We provide a rigorous mathematical proof for the optimality of the solution. We also provide numerical examples for both symmetric and asymmetric settings. 14 2.2 System Model 2.2.1 Physical Layer Model We consider a discrete-time additive Gaussian noise multiple access system with two transmitters and one receiver. The received signal is Y = p h1X1 + p h2X2 +Z (2.1) where Xi is the signal of user i, phi is the channel gain of user i, and Z is a Gaussian noise with zero-mean and variance 2. Here, h1 and h2 are real constants, with h1 6= h2 in general. In this two-user system, since the MAC capacity region is given as [24] R1 ? 12 log 1+ h1P1 2 ? (2.2) R2 ? 12 log 1+ h2P2 2 ? (2.3) R1 +R2 ? 12 log 1+ h1P1 +h2P2 2 ? (2.4) the region of feasible received powers is given by [18] h1P1 ? 2(22R1 ?1) (2.5) h2P2 ? 2(22R2 ?1) (2.6) h1P1 +h2P2 ? 2(22(R1+R2) ?1) (2.7) 15 In each slot, the transmitters adjust their transmitted powers to achieve the desired transmission rates. We assume that for each user, the average transmitted power over all of the slots must satisfy a constraint. We denote the average power constraints for the flrst and second user as P1avg and P2avg, respectively. 2.2.2 Medium Access Control Layer Model In the medium access control layer, we assume that packets arrive at the transmitters at a uniform size of B bits per packet. We partition the time into small slots such that we have at most one packet arrive and/or depart during each slot. Let a1[n] and a2[n] denote the number of packets arriving at the flrst and second transmitters, respectively, during time slot n; see Figure 2.1. We assume that the packet arrivals are i.i.d. from slot to slot, and the probabilities of arrivals are Prfai[n] = 1g = i (2.8) Prfai[n] = 0g = 1? i (2.9) where i is the arrival rate for user i, i = 1;2. 1 user 2 receiver user 1 2a a Figure 2.1: System model. 16 There is a bufier with capacity N at each transmitter to store the packets, where N is a flnite positive integer. Once the bufier is not empty, the transmitters decide to transmit one packet in the slot with some probability based on the current lengths of both queues. Let d1[n] and d2[n] denote the number of packets transmitted in time slot n. The queue length in each bufier evolves according to q1[n+1] = (q1[n]?d1[n])+ +a1[n] (2.10) q2[n+1] = (q2[n]?d2[n])+ +a2[n] (2.11) where (x)+ denotes max(0;x). The departure rate for each queue in each slot is either zero or one packet per slot, and the decision whether it should be zero or one packet per slot depends on the current queue lengths. When both queues are empty, the departure rates for both queues should be zero packet per slot. In all other situations where there is at least one packet in at least one of the queues, the departure rates for both queues should not be zero packet per slot simultaneously. This is because, keeping both trans- mitters idle does not save any power, but causes unnecessary delay. Therefore, in these situations, there are three possible departure rate pairs: (d1;d2) = (1;0);(0;1) or (1;1), i.e., one packet is transmitted from queue 1 and no packet is transmitted from queue 2; no packet is transmitted from queue 1 and one packet is transmit- ted from queue 2; or, one packet is transmitted from each queue. We enumerate them as d1;d2;d3. When the flrst queue length is i and the second queue length is j, we deflne the probabilities of choosing each pair of these departure rates as g1ij, 17 g2ij, g3ij, respectively. Note that g1ij + g2ij + g3ij = 1. We also note that g1ij, g2ij, g3ij, for i = 0;1;:::;N and j = 0;1;:::;N are the main parameters we aim to choose optimally in this chapter. The state space of the Markov chain consists of all possible pairs of queue lengths. We denote the state as q , (q1;q2). When both of the queues are empty, i.e., q[n] = (0;0), transmitters have no packet to send, and from (2.10)-(2.11), q[n+1] = a[n]. The corresponding transition probabilities in this case are: Prfq[n+1] = (0;0)jq[n] = (0;0)g = (1? 1)(1? 2) Prfq[n+1] = (1;0)jq[n] = (0;0)g = 1(1? 2) Prfq[n+1] = (0;1)jq[n] = (0;0)g = 2(1? 1) Prfq[n+1] = (1;1)jq[n] = (0;0)g = 1 2 (2.12) When one of the queues is empty, there is only one possible departure rate pair, which is either (0;1) or (1;0), depending on which queue is empty. Therefore, from our argument above, the departure probabilities should not be free parameters, but must be chosen as g1i0 = g20j = 1. The corresponding transition probabilities are: Prfq[n+1] = (i?1;0)jq[n] = (i;0)g = (1? 1)(1? 2) Prfq[n+1] = (i?1;1)jq[n] = (i;0)g = 2(1? 1) Prfq[n+1] = (i;0)jq[n] = (i;0)g = 1(1? 2) Prfq[n+1] = (i;1)jq[n] = (i;0)g = 1 2 (2.13) 18 A similar argument is valid when the flrst queue is empty, i.e., q[n] = (0;j). Tran- sition probabilities in this case can be written similar to (2.13). When neither of the queues is empty, i.e., for q[n] = (i;j), where 1 ? i;j ? N, the transition probabilities are: Prf(i?1;j ?1)j(i;j)g = g3ij(1? 1)(1? 2) Prf(i?1;j +1)j(i;j)g = g1ij 2(1? 1) Prf(i+1;j ?1)j(i;j)g = g2ij 1(1? 2) Prf(i;j +1)j(i;j)g = g1ij 1 2 (2.14) Prf(i+1;j)j(i;j)g = g2ij 1 2 Prf(i?1;j)j(i;j)g = g3ij 2(1? 1)+g1ij(1? 1)(1? 2) Prf(i;j ?1)j(i;j)g = g3ij 1(1? 2)+g2ij(1? 1)(1? 2) Prf(i;j)j(i;j)g = g1ij 1(1? 2)+g2ij 2(1? 1)+g3ij 1 2 For example, the flrst equation in (2.14) is obtained by noting that, for the next queue state to be (i?1;j?1), we need to transmit one packet from each queue and we should have no arrivals to either of the queues. The probability of this event is g3ij, probability of transmitting one packet from each queue, multiplied by (1? 1), probability of having no arrivals to queue 1, and (1? 2), probability of having no arrivals to queue 2. In this chapter, we assume that the average power constraints are large enough to prevent any packet losses. In order to prevent over ows, we always choose to 19 transmit one packet from a queue whenever its length reaches N. Therefore, we have g1iN = g2Nj = g3NN = 1. The two-dimensional Markov chain is shown in Figure 2.2. i,N?1 i?1, Ni?1,j?1 i?1,j+1 1 ,j+1 i+1, 1i+1, 0 0 , 0 0 , 1 1 , 0 1 , 1 0 ,j+1 i , 0 i , 1 i ,j?1 i?1, j 1 , j 0 , j i , j i+1, j i ,j+1 0 , N 1 , N i , N N , 1 N , jN , 0 N , N N?1, 0 1, N?1 i?1,N?1 0,N?1 i?1, 0 i?1, 1 N?1, 1 i+1,j+1 N?1,N i+1, N N?1,j N,j+1N,j?1 N,N?1 i+1,j?1 N?1,N?1N?1,j+1 0,j?1 1,j?1 N?1,j?1 i+1,N?1 Figure 2.2: Two-dimensional Markov chain. In [25], it is proven that, for all irreducible, positive recurrent discrete-time Markov chains with state space S, there exists a stationary distribution f?s;s 2 Sg, which is given by the unique solution to X s2S ?spsr = ?r; X s2S ?s = 1 (2.15) It is also stated that for a reducible Markov chain with a single closed positive 20 recurrent aperiodic class and a nonempty set T, where for any i 2 T, the probability of getting absorbed in the closed class starting from state i is 1, and the steady state distribution exists. In our problem, we flrst assume that the stationary distribution exists for the optimal solution. Once we determine the solution, we verify that the corresponding Markov chain has a unique stationary distribution. Let us deflne the steady state distribution of this Markov chain as ? = [?00;?01;??? ;?0N;?10; ??? ;?NN]. Then, the steady state distribution must satisfy ?P= ?; ?1 = 1 (2.16) wherePis the transition matrix deflned by the transition probabilities (2.12)-(2.14). We can express the average number of packets in the system as Pi;j ?ij(i + j). According to Little?s law [25], for a flxed sample path in a queueing system, if the limits of average waiting time W and average arrival rate ? exist as time goes to inflnity, then the limit of average queue length L exists and they are related as L = ?W. For our problem, in a system without over ow, these limits exist and the average delay D is equal to D = 1 1 + 2 X i;j ?ij(i+j) (2.17) where 1 + 2 is the average arrival rate for the system. 21 2.3 Problem Formulation The transmission rate for both transmitters during a slot is either one packet per slot orzeropacketperslot. Equivalently, thetransmissionrateiseitherB=? bits/channel use or 0 bits/channel use, where ? is the number of channel uses in each slot. We assume that in each slot we can use codewords with flnite block length to get arbitrarily close to the boundary of feasible powers and achieve a satisfactory level of reliability. Next, let us consider the power consumptions during each slot. When only one user transmits, since there is no interference from the other transmitter, the transmitted power for the active user needs to satisfy hiPi ? 2(22R ?1) , fi (2.18) where R = B=?. In order to minimize the power, the transmitted power for the active user should be fi=hi, depending on which user is transmitting. When both users transmit simultaneously, the received powers should additionally satisfy h1P1 +h2P2 ? 2(24R ?1) , fl (2.19) The feasible transmitted power region is shown in Figure 2.3. Let us denote the received power pair as (fl1;fl2). In order to minimize the transmit power, this pair should be on the dominant face of the feasible power region, i.e., fl1 + fl2 = fl. Then, the corresponding transmit power pair is (fl1=h1;fl2=h2). Note that difierent 22 operating points need difierent sum of transmit powers. fl2=h2 fl1=h1 fl=h2 fl=h1 P1 P2fi=h2 fi=h1 Figure 2.3: Feasible power region. Thus, for any state (i;j) 6= (0;0), the average power consumed for the flrst queue is 1h1(g1ijfi+g3ijfl1), while the average power consumed for the second queue is 1 h2(g 2 ijfi+g 3 ijfl2). Our goal is to flnd the transmission policy, i.e., the probabilities g k ij, k = 1;2;3, i = 0;1;:::;N, j = 0;1;:::;N along with the operating point (fl1;fl2), such that the average delay is minimized, subject to an average power constraint 23 for each user. Therefore, our problem can be expressed as: min g;fl1;fl2 1 1 + 2 X i;j ?ij(i+j) (2.20) s.t. 1h 1 X i;j ?ij(g1ijfi +g3ijfl1) ? P1avg (2.21) 1 h2 X i;j ?ij(g2ijfi +g3ijfl2) ? P2avg (2.22) ?P= ?; ?1 = 1 (2.23) g1ij +g2ij +g3ij = 1; i;j = 0;1;:::;N (2.24) gkij ? 0; i;j = 0;1;:::;N; k = 1;2;3 (2.25) We note that the state transition matrix P is fllled with variables in (2.12)-(2.14) which depend on gkijs. Also, through (2.23), ?ijs depend on gkijs, as well. Unlike [8], we have a two-dimensional Markov chain, and it does not admit closed-form expressions for the steady state distribution ?ijs in terms of gkijs. Therefore, solving the above optimization problem becomes rather di?cult. Our methodology will be to transform our optimization problem into a linear programming problem, and exploit its special structure to obtain the globally optimal solution analytically. 2.4 Analysis of the Problem Note that g1ij +g2ij +g3ij = 1 for any (i;j) 6= (0;0), therefore ?ij = ?ij(g1ij +g2ij +g3ij). Deflne x00 = ?00, xkij = ?ijgkij, k = 1;2;3, i = 0;1;:::;N, j = 0;1;:::;N. Since gkij is the conditional probability of choosing policy k when the system is in state (i;j), xkij can be interpreted as the unconditional probability of staying in state (i;j) and 24 choosing policy k. Our aim is to flnd optimal gkijs. However, as we will see, our analysis will be more tractable with variables xkijs. Once we flnd optimal xkijs, we can obtain optimal gkijs through gkij = x k ijP 3 k=1 x k ij (2.26) Let us construct a vector of all of our unknowns x = [x00;x101;x201;x301;:::;x3NN]T. First, we consider the average power consumption when average power con- straints for both users are large enough such that each user is able to transmit a packet during a slot whenever its queue is not empty. In this scenario, the corre- sponding Markov chain has four non-transient states, (0,0), (0,1), (1,0), (1,1), and the stationary distribution is ?01 = 2(1? 1); ?00 = (1? 1)(1? 2); ?10 = 1(1? 2); ?11 = 1 2 (2.27) The average power consumption for each queue is P1csmp = 1h 1 (?10fi +?11fl1) = 1h 1 ( 1(1? 2)fi + 1 2fl1) (2.28) P2csmp = 1h 2 (?01fi +?11fl2) = 1h 2 ( 2(1? 1)fi + 1 2fl2) (2.29) 25 We note that P1csmph1 +P2csmph2 = ( 1 + 2 ?2 1 2)fi + 1 2fl (2.30) From Figure 2.3, we note that fl1;fl2 ? fi, therefore, each individual term in (2.30) must additionally satisfy P1csmp ? 1h 1 1fi (2.31) P2csmp ? 1h 2 2fi (2.32) Therefore, if the average power constraints P1avg and P2avg satisfy the following inequalities P1avgh1 +P2avgh2 ? ( 1 + 2 ?2 1 2)fi + 1 2fl (2.33) P1avg ? 1h 1 1fi (2.34) P2avg ? 1h 2 2fi (2.35) then we can always flnd an operating point (fl1;fl2) such that P1csmp ? P1avg and P2csmp ? P2avg, and we achieve the minimal possible delay in the system, which is one slot. The available power in this case is so large that the solution is trivial. If P1avgh1 +P2avgh2 < ( 1 + 2 ?2 1 2)fi + 1 2fl (2.36) 26 and P1avg and P2avg are large enough to prevent any over ows, both power con- straints should be tight. Therefore, from (2.21)-(2.22), we have two equality power constraints, 1 h1 X i;j (x1ijfi +x3ijfl1) = P1avg (2.37) 1 h2 X i;j (x2ijfi +x3ijfl2) = P2avg (2.38) Because the average arrival rate must be equal to the average departure rate when there is no over ow, we also have X i;j (x1ij +x3ij) = 1 (2.39) X i;j (x2ij +x3ij) = 2 (2.40) Solving (2.37)-(2.40), we obtain fl1 = fi + (fl ?2fi)(P1avgh1 ? 1fi)P 1avgh1 +P2avgh2 ?( 1 + 2)fi (2.41) fl2 = fi + (fl ?2fi)(P2avgh2 ? 2fi)P 1avgh1 +P2avgh2 ?( 1 + 2)fi (2.42) X i;j x1ij = 1 ? P1avgh1 +P2avgh2 ?( 1 + 2)fifl ?2fi (2.43) X i;j x2ij = 2 ? P1avgh1 +P2avgh2 ?( 1 + 2)fifl ?2fi (2.44) X i;j x3ij = P1avgh1 +P2avgh2 ?( 1 + 2)fifl ?2fi (2.45) 27 By jointly considering the normalization equation in (2.23), we also have x00 = 1? ( 1 + 2)(fl ?fi)?(P1avgh1 +P2avgh2)fl ?2fi (2.46) Thus, we transform our optimization problem in (2.20)-(2.24) into minx X i;j ? 3X k=1 xkij(i+j) ! (2.47) s.t. x00 = 1?( 1+ 2)(fl?fi)?(P1avgh1 +P2avgh2)fl?2fi (2.48) X i;j x1ij = 1 ? P1avgh1 +P2avgh2 ?( 1 + 2)fifl ?2fi (2.49) X i;j x2ij = 2 ? P1avgh1 +P2avgh2 ?( 1 + 2)fifl ?2fi (2.50) X i;j x3ij = P1avgh1 +P2avgh2 ?( 1 + 2)fifl ?2fi (2.51) Qx = 0; xkij ? 0; i;j = 0;1;:::;N; k=1;2;3 (2.52) which is in terms of xkijs. Here, Q is a (N +1)2 ?(4(N +1)2 ?3) matrix deflned by matrix P. We get the equations in (2.52) from (2.23) by substituting ?ijgkij for xkij. The optimization problem in (2.47)-(2.52) is a linear programming problem. In addition, we observe that, in the objective function, all of the xkijs with the same sum of indices share the same weight i + j. If we look into the two-dimensional Markov chain, this corresponds to the states on the diagonals running from the upper right corner to the lower left corner. This motivates us to group the xkijs along the diagonals of the two-dimensional Markov chain in Figure 2.2 and deflne 28 their sum, for the nth diagonal, as yn = nX i=0 (x1i;n?i +x2i;n?i) (2.53) tn = nX i=0 x3i;n?i (2.54) Then, yn ? 0, tn ? 0, and the objective function in (2.47) is equivalent to 2NX n=1 (yn +tn)n (2.55) Wealsoget2N ow-in- ow-outequationsbetweenthediagonalgroups. Forn = 0;1, we have x00 ( 1 + 2 ? 1 2) = (y1 +t2)(1? 1)(1? 2) (2.56) (x00 +y1) 1 2 = (y2 +t3)(1? 1)(1? 2)+t2 (1? 1 2) (2.57) and for n = 2;3;:::;2N ?2, we have yn 1 2 = (yn+1+tn+2)(1? 1)(1? 2)+tn+1 (1? 1 2) (2.58) y2N?1 1 2 = t2N (1? 1 2) (2.59) Figure 2.4 shows the transitions between diagonal groups for a system with N = 3; we use difierent colors to distinguish the transitions caused by difierent departure rate pairs. 29 0 , 30 , 0 2 , 0 3 , 0 1 , 0 1 , 2 1 , 31 , 1 2 , 2 2 , 32 , 1 3 , 1 3 , 2 3 , 3 0 , 1 0 , 2 Figure 2.4: The transitions between diagonal groups when N = 3. We multiply both sides of the nth equation in (2.56)-(2.59) with zn and sum with respect to n to obtain x00( 1 + 2 ? 1 2 + 1 2z)+? 1 2 ?(1? 1)(1? 2)z?1? 2NX n=1 ynzn ??(1? 1 2)z?1 +(1? 1)(1? 2)z?2? 2NX n=1 tnzn = 0 (2.60) Taking the derivative of (2.60) with respect to z and letting z = 1, we have 2NX n=1 tnn = 12? 1 ? 2 ? ( 1 + 2 ?1) ?2NX n=1 ynn ! +(1? 1)(1? 2) ?2NX n=1 yn ! (2.61) + ?1? 1 2 +2(1? 1)(1? 2)? ?2NX n=1 tn ! +x00 1 2 ! 30 From the deflnition of yn and tn in (2.53) and (2.54), we note 2NX n=1 yn = 2NX n=1 nX i=0 (x1i;n?i +x2i;n?i) = X i;j (x1ij +x2ij) (2.62) 2NX n=1 tn = 2NX n=1 nX i=0 x3i;n?i = X i;j x3ij (2.63) From (2.62) and (2.63), and using (2.49)-(2.51), we conclude that P2Nn=1 yn and P2N n=1 tn are constants that depend on system parameters fi, fl, 1, 2 and P1avg, P2avg. Using (2.62) and (2.49)-(2.50), for future reference, let us deflne 2NX n=1 yn = X i;j x1ij + X i;j x2ij = 1 + 2 ? 2(P1avgh1 +P2avgh2 ?( 1 + 2)fi)fl ?2fi , ? (2.64) Using the deflnition of yn, tn and (2.61), the objective function in (2.47) be- comes 2NX n=1 (yn +tn)n = 12? 1 ? 2 ?2NX n=1 ynn ! +C (2.65) where C is a constant, and 12? 1? 2 is positive. Therefore, minimizing the original objective function in (2.47) is equivalent to minimizing P2Nn=1 ynn. Since from (2.64) the sum of yns is flxed, and yns are positive, intuitively, the optimization problem requires us to assign larger values to yns with smaller indices n, without con icting with the transition equation constraints. 31 2.5 The Modifled Optimization Problem and a Two-Step Solution In this section, we will prove the following main result of this chapter: If the average power constraints P1avg, P2avg are large enough to prevent any packet losses, the delay-optimal policy has a threshold structure. When the sum of the queue lengths is larger than the threshold, both users should transmit; when the sum of the queue lengths is smaller than the threshold, only the user with the longer queue should transmit; the equal queue length case can be resolved through ip of a potentially biased coin. We propose to solve our original optimization problem in two steps. In the flrst step, we will consider the optimization problem in terms of yns and tns, where the objective function is P2Nn=1 ynn, and the constraints are (2.64), (2.48), (2.56)-(2.59), and positivity constraints on yns and tns. The objective function of this optimiza- tion problem is exactly the same as that of our original optimization problem in (2.47)-(2.52), however, our constraints are more lenient than those of (2.47)-(2.52). First, (2.64) is weaker than (2.49)-(2.51), as it imposes a constraint on the sum while (2.49)-(2.51) impose constraints on individual terms. Secondly, the transition equa- tions in (2.52) are between all of the states in the two-dimensional Markov chain, while the transition equations in (2.56)-(2.59) are only between the diagonal groups in the Markov chain. Finally, we do not explicitly impose the sum constraint on tn on the new problem. These imply that, the result we obtain in the flrst step, in principle, may not be feasible for the original problem. Therefore, in the second step we allocate yns and tns we obtain from the flrst 32 step to xkijs in such a way that the remaining independent transition equations in (2.52) are satisfled. We note that (2.39) and (2.40) can be derived from (2.52), therefore, once (2.52) is satisfled, (2.39) and (2.40) will be satisfled. Together with (2.64), we can make sure that (2.49)-(2.51) are all satisfled. Therefore, if we can flnd a valid allocation in the second step, we will conclude that the solution found in the flrst step is a feasible solution to our original problem. Since the problem we solve in the flrst step has the cost function of our problem, but is subject to more lenient constraints, its solution, in principle, may be better than the solution of our original problem. However, when we prove that the solution we obtain in our flrst step is within the feasible set of our original problem, we will have solved our original problem. In addition, once we prove the optimality of the solution in the flrst step, it will be globally optimal for the original problem. First, we will minimize P2Nn=1 ynn subject to (2.64), (2.48), (2.56)-(2.59), and yn, tn ? 0. This means that we will allocate ? to yns in a way to minimizeP2Nn=1 ynn. This will require us to allocate larger values to yns with smaller n, while making sure that (2.64), (2.48), and (2.56)-(2.59) are satisfled. We state the result of our flrst step in the following theorem. Theorem 2.1 The optimal solution of the problem min 2NX n=1 ynn s. t. (2:64);(2:48);(2:56)?(2:59);and yn ? 0;tn ? 0;8n (2.66) has a threshold structure. In particular, there exists a threshold ?n such that for 33 n < ?n, tn = 0 and for n > ?n, yn = 0. The proof of this theorem is given in Appendix 2.8.1. In the following, we consider the transition equations within groups for each state. Since adding more constraints cannot improve the optimization result, if we can flnd a way to allocate yns and tns to xkijs, such that all of the remaining transition equations are satisfled, then we will conclude that the assignments we obtained in the flrst step are actually feasible for the original problem. Therefore, next, in our second step, we focus on the assignment of the yns and tns found in the flrst step to xkijs. First, we use a simple example to illustrate the procedure of allocation within each group, then, we generalize the procedure to arbitrary cases. In this simple example, we assume that N = 4. Assume that after the group allocation, we obtained y1;:::;y5 and t5;t6 6= 0, and the rest of the yns and tns are equal to zero. In order to keep the alloca- tion simple, when we assign y3, y5, t5 in each group, we assign them only to two states: (1;2);(2;1) and (2;3);(3;2), respectively; while we assign y4 to three states: (1;3);(2;2);(3;1), and we assign t6 to a single state (3;3). Figure 2.5 illustrates the allocation pattern within groups. We do not assign any values to the states with dot- ted circles. The dotted states will be transient states after the allocation. We need to guarantee that the nonzero-valued states only transit to other nonzero-valued states. This requires us to set x112 = x221 = x123 = x232 = 0, and x113 = x313 = x231 = x331 = 0. The valid transitions are represented as arrows in Figure 2.5. We can see that the 34 transitions are within the positive recurrent class. 4 , 1 0 , 0 0 , 1 0 , 2 0 , 4 1 , 41 , 31 , 21 , 11 , 0 2 , 0 2 , 1 2 , 2 2 , 3 2 , 4 3 , 43 , 33 , 23 , 13 , 0 4 , 0 4 , 2 4 , 3 4 , 4 0 , 3 Figure 2.5: Example: allocation within groups when N = 4. Then, let us examine each group and flnd transition equations to be satisfled for each state. For states (0;1);(0;2);(1;2);(1;3);(2;3), the transition equations to be satisfled are x201(1? 2(1? 1)) =(x00 +x110) 2(1? 1) +(x202 +x111)(1? 1)(1? 2) x202(1? 2(1? 1)) =x111 2(1? 1) x212(1? 2(1? 1)) =(x202 +x111) 1 2 +x121 2(1? 1) +(x213 +x122 +x323)(1? 1)(1? 2) x213(1? 2(1? 1)) =(x111 +x323) 2(1? 1) x223(1? 2(1? 1))+x323(1? 1 2) =(x213 +x122) 1 2 +(x132 +x333) 2(1? 1) (2.67) 35 We have flve more similar transition equations for states (0;1);(0;2);(1;2);(1;3); (2;3). All the unknown variables are interacting with each other through these equations. How to flnd an allocation satisfying all of these equations simultaneously becomes rather di?cult. After simple manipulations, equations in (2.67) become equivalent to x201 =(x00+x110+x201) 2(1? 1)+(x202 +x111)(1? 1)(1? 2) x202 =(x111 +x202) 2(1? 1) x212 =(x202 +x111) 1 2 +(x212 +x121) 2(1? 1)+(x213 +x122 +x323)(1? 1)(1? 2) x213 =(x122 +x213 +x323) 2(1? 1) x223 =(x213 +x122) 1 2 +(x132 +x223 +x333) 2(1? 1)?x323(1? 1 2) (2.68) Observing the right hand sides of (2.68), we note that, x00, x110+x210, x212+x121, x132+x223, x333 are known, therefore, the allocation for states (0;1);(0;2);(1;2);(1;3); (2;3) depends only on the values of x202 + x111, x122 + x213, and x323. Similarly, the allocation for states (1;0);(2;0);(2;1);(3;1);(3;2) also depends on the values of x120 +x211, x222 +x131, and x332 only. Since y2 = (x202 +x111)+(x120 +x211) (2.69) y4 = (x122 +x213)+(x222 +x131) (2.70) t5 = x323 +x332 (2.71) the allocation actually depends on how we split y2, y4 and t5 between (x202 + x111) 36 and (x120 + x211), (x122 + x213) and (x222 + x131), x323 and x332, respectively. Once we flx the values of x202 +x111, x122 +x213, and x323, we obtain the values of all of the states, completing the allocation. We note that there is more than one feasible allocation within groups, and for each feasible allocation, all of the transition equations are satisfled, and the power constraints are satisfled as well. In order to keep the solution simple, we let x202 +x111 = y2=2 (2.72) x122 +x213 = y4=2 (2.73) x323 = t5=2 (2.74) Plugging these into (2.68), we get x201 =(x00 +y1) 2(1? 1)+ 12y2(1? 1)(1? 2) x202 =12y2 2(1? 1) x212 =12y2 1 2 +y3 2(1? 1)+ 12(y4 +t5)(1? 1)(1? 2) x213 =12(y4 +t5) 2(1? 1) x223 =12y4 1 2 +(y5 +t6) 2(1? 1)? 12t5(1? 1 2) (2.75) 37 Going back to (2.72)-(2.73), we obtain x111 =12y2(1? 2(1? 1)) x122 =12y4 ? 12(y4 +t5) 2(1? 1) (2.76) Since yn ? tn+1?=?, we can easily verify that x223 ? 0, x122 ? 0. The allocation for the remaining half of the states has a similar structure. Thus, each state has a positive value, and the allocation is feasible. Once we obtain the values of xkijs, we can compute the transmission probabil- ities using gkij = xkijP3 k=1 xkij . Here, we have g111 = 1? 2(1? 1)2? 2(1? 1)? 1(1? 2) (2.77) g211 = 1? 1(1? 2)2? 2(1? 1)? 1(1? 2) (2.78) g122 = y4 ?(y4 +t5) 2(1? 1)2y 4 ?(y4 +t5)( 2(1? 1)+ 1(1? 2)) (2.79) g222 = y4 ?(y4 +t5) 1(1? 2)2y 4 ?(y4 +t5)( 2(1? 1)+ 1(1? 2)) (2.80) We observe that a threshold structure exists. In this example, the threshold is 5. When the sum of the two queue lengths is greater than 5, both users transmit during a slot. When the sum of the two queue lengths is less than 5, only one user with longer queue transmits during a slot; in this case, if both queue lengths are the same, users transmit according to probabilities in (2.77)-(2.80). Following steps similar to those in the example above, we can assign yns and tns 38 to xkijs and obtain a feasible allocation for general settings. The following theorem states this fact formally. Theorem 2.2 For the yns and tns obtained in the flrst step, there always exists a feasible xkij assignment, such that xkijs are positive and satisfy all of the transition equations. The proof of this theorem is given in Appendix 2.8.2. Since this is a constructive proof, it also gives the exact method by which yns and tns are assigned to xkijs. Therefore, in order to prove the optimality of the xkij assignment, it su?ces to prove the optimality of the solution of the flrst step. The following theorem proves the optimality of the flrst step. Theorem 2.3 The allocation scheme in Theorem 2.1 minimizes the average delay in the system. The proof of this theorem is given in Appendix 2.8.3. In summary, the two-step allocation scheme is feasible and optimal for our original problem. The transition probabilities can be computed once we determine the allocation for each sate. From our allocation, we note that there exists a thresh- old ?n, where ?n is the largest group index n such that yn 6= 0. We have tn > 0 only when n ? ?n. Since gkij = xkijP3 k=1 xkij , we have g3ij = 1 when n > ?n. When n < ?n, we 39 have g1ij = 1 if i > j and g2ij = 1 if i < j. Then, for n ? ?n, and n is even, we have g1n=2;n=2 = yn ?(yn +tn+1) 2(1? 1)2y n?(yn+tn+1)( 2(1? 1)+ 1(1? 2))+tn (2.81) g2n=2;n=2 = yn ?(yn +tn+1) 1(1? 2)2y n?(yn+tn+1)( 2(1? 1)+ 1(1? 2))+tn (2.82) g3n=2;n=2 = tn2y n?(yn+tn+1)( 2(1? 1)+ 1(1? 2))+tn (2.83) If tn;tn+1 = 0, which happens when n < ?n?1, (2.81)-(2.83) reduce to g1n=2;n=2 = 1? 2(1? 1)2? 2(1? 1)? 1(1? 2) (2.84) g2n=2;n=2 = 1? 1(1? 2)2? 2(1? 1)? 1(1? 2) (2.85) Therefore, if the sum of the two queue lengths is greater than ?n, both users should transmit one packet during the slot. If the sum of the two queue lengths is less than ?n, only the user with the longer queue transmits one packet in the slot and the other user remains silent; if in this case both queues have the same length, then the probability that the flrst user transmits one packet while the second user keeps silent is 1? 2(1? 1)2? 2(1? 1)? 1(1? 2), and the probability that the second user transmits one packet while the flrst user keeps silent is 1? 1(1? 2)2? 2(1? 1)? 1(1? 2). When the system is symmetric, i.e., 1 = 2, these probabilities become 1=2 and 1=2. 40 2.6 Numerical Results Here we give simple examples to show how our allocation scheme works. We choose N = 10, i.e., each queue has a bufier of size 10 packets. Therefore, the joint queue sates is represented by an 11?11 Markov chain. First, we consider the symmetric scenario, where 1 = 2 = , h1 = h2 = h and P1avg = P2avg = Pavg. We assume the arrival rate = 1=2, and the power levels fi = 1, fl = 3. Therefore, we have ? = 3, ? = 1, ? = 3. From the analysis, we know that if Pavg ? 5=8, the average delay is one slot, which is the minimal possible delay in the system. If Pavg = 9=16, we have x00 = 1=8, Pi;j x1ij = Pi;j x2ij = 3=8, Pi;j x3ij = 1=8. Therefore, ? = 3=4. Following our allocation scheme, we have y1 = 3=8, y2 = 3=8, t3 = 1=8. Then, we need to allocate these within groups. We start with y1. Because of the symmetry of the setting, we simply let x110 = x201 = y1=2 = 3=16, x312 = x321 = t3=2 = 1=16. Then, we consider y2. We also let x120 = x202, x111 = x211. This symmetric allocation guarantees that the ow equations for states (0;1) and (1;0) are satisfled. The values of x120 and x111 also depend on the allocation of t3. The state (2;0) must satisfy the transition equation x120? (1? )+ 2 +(1? )2? = (x211 +x321) (1? ) 41 Together with the symmetric allocation, we have x120 +x211 = y2=2 = 3=16 Solving these equations, we get the allocation for the second group as x120 = x202 = 1=16; x211 = x111 = 1=8 We see that the two values are positive, thus feasible. Then, the transmission probabilities are g111 = g211 = 1=2, g312 = g321 = 1. The threshold of the sum of the queue lengths is 2 in this case. If the sum of the queue lengths is greater than 2, both users transmit, if the sum of the queue lengths is less than or equal to 2, only the user with the longer queue transmits and the other user remains silent; if both queues have one packet in their queues, each queue transmits with probability 1/2 while the other queue remains silent. If Pavg = 17=32, we have x00 = 1=16, Pi;j x1ij = Pi;j x2ij = 7=16, Pi;j x3ij = 1=16. Therefore, ? = 7=8. Following our allocation scheme, we have y1 = 3=16, y2 = y3 = 1=4, y4 = 3=16, t5 = 1=16. Then, we assign these within groups. For y1, we simply let x110 = x201 = y1=2 = 1=32. Then, considering to allocate y2, we have x120 = x202 = 1=32, x211 = x111 = 3=32. After completing the allocation, we have x121 = x212 = 1=8, x131 = x213 = 1=32, x222 = x122 = 1=16, x323 = x332 = 1=32. The transmission probabilities are g111 = g211 = g122 = g222 = 1=2, g110 = g201 = g120 = g202 = g121 = g212 = g113 = g231 = g332 = g323 = 1. The threshold of the sum of the 42 queue lengths is 4 in this case. If the sum of the queue lengths is greater than 4, both users transmit, if the sum of the queue lengths is less than or equal to 4, only the user with the longer queue transmits and the other user remains silent; if both queues have equal length, which is either 1 or 2 in this case, each queue transmits with probability 1/2 while the other queue remains silent. We compute the average delay as a function of average power for = 0:5, = 0:48 and = 0:46, and plot them in Figure 2.6. We observe that it is a piecewise linear function, and each linear segment corresponds to the same threshold value. This is because based on our optimal allocation scheme, for a flxed threshold value, the objective function is a linear function in x00, thus it is linear in Pavg. If Pavg increases, Davg decreases, and the threshold decreases as well. The minimum value of Pavg on each curve corresponds to the maximum threshold, which is 19 in this example. This is also the minimum amount of average power required to prevent any over ows. We also observe that the delay-power tradeofi curve is convex, which is consistent with the result in [2]. We note that although these three values of are close to each other, the average delay varies signiflcantly. This is because the average delay is not a linear function of . For the asymmetric scenario, we assume 1 = 1=2, 2 = 1=3, then ? = 2, ? = 1=2, ? = 5=2. We assume h1 = 1, h2 = 2. From (2.33), we know that if P1avgh1+P2avgh2 ? 1, P1avg ? 1=2, P2avg ? 2=3, then each user can always transmit a packet whenever its queue is not empty, and the average delay is one slot. If P1avg = 19=36, P2avg = 13=18, then P1avgh1 +P2avgh2 = 8=9. Plugging these into (2.41)-(2.48), we have fl1 = 1=2, fl2 = 1=2, P1i;j x1ij = 4=9, P2i;j x1ij = 5=18, 43 0.46 0.48 0.5 0.52 0.54 0.56 0.58 0.6 0.62 0.641 2 3 4 5 6 7 8 9 10 Pavg D avg ?=0.50?=0.48 ?=0.46 Figure 2.6: The average delay versus average power in the symmetric scenario. P3 i;j x 1 ij = 1=18, x00 = 2=9. Then, ? = 13=18. Following the group allocation scheme, we have y1 = 4=9, y2 = 5=18, t3 = 1=18. Then, we need to assign them within groups. From (2.118)-(2.124), we get x201 = 1=6, x110 = 5=18, x202 = 1=36, x111 = 4=36, x211 = 3=36, x120 = 2=36, and x312 = x321 = 1=18. The transmission probabilities are g111 = 4=7, g211 = 3=7, g110 = g201 = g120 = g202 = g312 = g321 = 1. The threshold is 2. If the sum of the queue lengths is greater than 2, both users transmit, if the sum of the queue lengths is less than or equal to 2, only the user with the longer queue transmits and the other user remains silent; if both queues have one packet in their queues, the flrst queue transmits with probability 4/7, and the second queue transmits with probability 3/7. 44 2.7 Conclusions We investigated the average delay minimization problem for a two-user multiple access system with average power constraints for the general asymmetric scenario, where users have arbitrary powers, channel gains, and arrival rates. We considered a discrete-time model. In each slot, the arrivals at each queue follow a Bernoulli distri- bution, and we transmit at most one packet from each queue with some probability. Our objective is to flnd the optimal set of departure probabilities. We modeled the problem as a two-dimensional Markov chain, and minimized the average delay through controlling the departure probabilities in each time slot. We transformed the problem into a linear programming problem and found the optimal solution an- alytically. The optimal policy has a threshold structure. Whenever the sum of the queue lengths exceeds a threshold, both queues transmit one packet during the slot, otherwise, only one of the queues, which is longer, transmits one packet during the slot and the other queue remains silent; if both queues have the same length, only one of the queues transmits with a probability which depends on the arrival rates to both queues while the other queue remains silent. 45 2.8 Appendix 2.8.1 The Proof of Theorem 2.1 Let us deflne ? = 1 + 2 ? 1 2(1? 1)(1? 2) (2.86) ? = 1 2(1? 1)(1? 2) (2.87) ? = 1? 1 2(1? 1)(1? 2) (2.88) Then, (2.56)-(2.59) are equivalent to x00? = y1 +t2 (2.89) (x00 +y1)? = (y2 +t3)+t2? (2.90) and for n = 2;3;:::;2N ?2, yn? = (yn+1 +tn+2)+tn+1? (2.91) y2N?1? = t2N? (2.92) The optimization requires us to assign larger values to yns with smaller in- dices n as much as possible. Examining (2.89)-(2.92), we note that for flxed x00, 46 maximizing y1, y2, ::: requires us to set t2, t3, ::: to zero. Therefore, we choose y1 = x00? (2.93) y2 = (x00 +y1)? (2.94) yn = yn?1?; tn = 0; n = 1;2;:::;n? (2.95) where n? is the largest integer satisfying Pn?n=1 yn < ?. Let ? = ? ? Pn?n=1 yn. We need to check that all of the group transition equations are satisfled. Assume that n? > 2. If ? = yn???=(? +?), then let yn?+1 = ?; and yn = 0; n = n? +2;:::;N ?1 (2.96) tn?+2 = yn?+1?=?; and tn = 0; n 6= n? +2 (2.97) We can verify that after this allocation, group transition equations (2.56)-(2.59) are satisfled. We also note that ? is allocated to fyngn?+1n=1 , among which, fyngn?n=1 attain their maximum possible values. Therefore, the objective function achieves its minimal possible value for the flrst step. If ? 6= yn???=(? + ?), if we assign it to yn?+1 directly, the group transition equations are not satisfled automatically. In order to satisfy the group transition equations, we need to do some adjustments. If ? > yn???=(? + ?), we assign ? to yn?+1 and yn?+2 proportionally. Specifl- 47 cally, we let yn?+1 = ?(?+?)+yn??? 2 ?2 +??+? +? (2.98) yn?+2 = ?(?+?)??yn??? 2 ?2 +??+? +? (2.99) tn?+2 = yn??(??+? +?)??(?+?)?2 +??+? +? (2.100) tn?+3 = ?(?+?)? ?yn?? 2? ?2 +??+? +? (2.101) Since yn?? > ? > yn???=(?+?), we can verify that each value above is positive, and the sum constraint and the group transition equations are satisfled. Among the non- zero fyngn?+2n=1 , although fyngn?n=1 attain their maximum, yn?+1 does not. Therefore, difierent from the flrst scenario, in this case, we cannot immediately claim that the result is optimal. We will give the mathematical proof for the optimality of this assignment later. If ? < yn???=(? +?), we need to remove some value from yn? and assign it to yn?+1 to satisfy the equations. Deflne ?0 = ? + yn? and assign ?0 to yn? and yn?+1 as follows yn? = ? 0(?+?)+yn??1??2 ?2 +??+? +? (2.102) yn?+1 = ? 0(?+?)??yn??1??2 ?2 +??+? +? (2.103) tn?+1 = yn??1?(??+? +?)?? 0(?+?) ?2 +??+? +? (2.104) tn?+2 = ? 0(?+?)? ?yn??1?2? ?2 +??+? +? (2.105) 48 Since yn??1? < ?0 < yn??1?(??=(?+?)+1), we can also verify that each value above is positive, and the sum constraint and the group transition equations are satisfled. Similar to the second case, we cannot immediately claim that this result is optimal because after the adjustment, yn? does not achieve its maximum value. We will give the proof of optimality later. When n? = 1, the allocation will be in a difierent form. If ? ? (x00 + y1)??=(? + ?), then we need to use (x00 + y1) instead of yn? in (2.96)-(2.101). If ? < (x00 +y1)??=(? +?), then yn? = ?(?+?)+x00(? ??)??2 +??+? +? (2.106) yn?+1 = ?(?+?)??x00(? ??)??2 +??+? +? (2.107) tn?+1 = x00(???+?? +?? 2 +??)??(?+?) ?2 +??+? +? (2.108) tn?+2 = ?(?+?)? ?x00(? ??)??2 +??+? +? (2.109) When n? = 2, if ? ? yn???=(? +?), the allocation of ? has the same form as in (2.96)-(2.101). If ? < yn???=(? + ?), then we need to use (x00 + y1) instead of yn??1 in (2.102)-(2.105). 2.8.2 The Proof of Theorem 2.2 While we generalize the simple example to an arbitrary setting, we follow the same basic allocation pattern. If n is odd, we assign yn and tn only to two states?n+12 ; n?12 ? and ?n?12 ; n+12 ?; if n is even, we assign yn to three states: ?n2 +1; n2 ?1?, ?n2; n2?, 49 ?n 2 ?1; n 2 +1 ?, and we assign t n to a single state ?n 2; n 2 ?. We illustrate the allocation pattern in Figure 2.7. N+2 n= 21 3 N-1 . . . N-2. . . . . . 2N n* N+1 N40 2N-1 Figure 2.7: Allocation pattern within groups. We need to make sure that the transitions only happen within the positive recurrent class. Therefore, when n is odd, we let x1n?1 2 ; n+1 2 = x2n+1 2 ; n?1 2 = 0; when n is even, we let x1n 2?1; n 2+1 = x2n 2+1; n 2?1 = 0. Then, let us examine the transition equations for the states. For n = 1, we have x201(1? 2(1? 1)) =(x00 +x110 +x311) 2(1? 1) +(x202 +x111 +x312)(1? 1)(1? 2) (2.110) For n = 2;3;:::, if n is even, the transitions between states are illustrated in Fig- ure 2.8. The transition equation for state ?n2 ?1; n2 +1? is x2n 2?1; n 2+1 ? 2(1? 1)) =(x1n 2; n 2 +x3n 2; n 2+1 ) 2(1? 1) (2.111) 50 . . . . . . . . .. . . . . . . . . n2;n2 n2;n2+1 n2+1,n2-1 n2-1,n2+1 n2+1;n2 n2,n2-1 n2-1,n2 Figure 2.8: The transitions between states when n is even. If n is odd, the transitions between states are illustrated in Figure 2.9. The transition equation for state ?n?12 ; n+12 ? is x2n?1 2 ; n+1 2 (1? 2(1? 1))+x3n?1 2 ; n+1 2 (1? 1 2) =(x2n?3 2 ; n+1 2 +x1n?1 2 ; n?1 2 ) 1 2 +(x1n+1 2 ; n?1 2 +x3n+1 2 ; n+1 2 ) 2(1? 1) +(x1n+1 2 ; n+1 2 +x2n?1 2 ; n+3 2 +x3n+1 2 ; n+3 2 )(1? 1)(1? 2) (2.112) After a transformation, (2.110) is equivalent to x201 =(x00 +x110 +x201 +x311) 2(1? 1)+(x202 +x111 +x312)(1? 1)(1? 2) (2.113) where x00 is known, x110 +x201 = y1, x311 = t2. 51 . . . . . . . . . n?12 ;n+12 n+12 ;n?32 n?32 ;n+12 n?12 ;n+32 n+12 ;n+12 n+32 ;n?12 n+12 ;n?12 n?12 ;n?12 n+12 ;n+32 n+32 ;n+12 Figure 2.9: The transitions between states when n is odd. For n = 2;3;:::, when n is even, (2.111) is equivalent to x2n 2?1; n 2+1 =(x1n 2; n 2 +x2n 2?1; n 2+1 +x3n 2; n 2+1 ) 2(1? 1) (2.114) and when n is odd, (2.112) is equivalent to x2n?1 2 ; n+1 2 =(x2n?3 2 ; n+1 2 +x1n?1 2 ; n?1 2 ) 1 2 ?x3n?1 2 ; n+1 2 (1? 1 2) +(x1n+1 2 ; n+1 2 +x2n?1 2 ; n+3 2 +x3n+1 2 ; n+3 2 )(1? 1)(1? 2) +(x1n+1 2 ; n?1 2 +x2n?1 2 ; n+1 2 +x3n+1 2 ; n+1 2 ) 2(1? 1) (2.115) where x1n+1 2 ; n?1 2 +x2n?1 2 ; n+1 2 = yn, x3n+1 2 ; n+1 2 = tn+1. The transition equations for the remaining half of the recurrent states can be expressed in a similar form. Therefore, the values of xkijs are determined only by 52 the allocation of yn between x1n 2+1; n 2?1 + x2n 2; n 2 and x2n 2?1; n 2+1 + x1n 2; n 2 when n is even, and the allocation of tn to x3n+1 2 ; n?1 2 and x3n?1 2 ; n+1 2 when n is odd. If we let x1n 2; n 2 +x2n 2?1; n 2+1 = yn=2; when n is even (2.116) x3n?1 2 ; n+1 2 = tn=2; when n is odd (2.117) and solve equations (2.113)-(2.115), then, for n = 1, we obtain x201 =(x00 +y1 +t2) 2(1? 1)+ 12(y2 +t3)(1? 1)(1? 2) x110 =(x00 +y1 +t2) 1(1? 2)+ 12(y2 +t3)(1? 1)(1? 2) (2.118) For n = 2;3;:::, if n is even, we get x2n 2?1; n 2+1 =12(yn +tn+1) 2(1? 1) (2.119) x1n 2+1; n 2?1 =12(yn +tn+1) 1(1? 2) (2.120) x1n 2; n 2 =12yn ? 12(yn +tn+1) 2(1? 1) (2.121) x2n 2; n 2 =12yn ? 12(yn +tn+1) 1(1? 2) (2.122) 53 and if n is odd, we have x2n?1 2 ; n+1 2 =12yn?1 1 2 +(yn +tn+1) 2(1? 1) + 12(yn+1 +tn+2)(1? 1)(1? 2)? 12tn(1? 1 2) (2.123) x1n+1 2 ; n?1 2 =12yn?1 1 2 +(yn +tn+1) 1(1? 2) + 12(yn+1 +tn+2)(1? 1)(1? 2)? 12tn(1? 1 2) (2.124) This completes the allocation. Note that tn 6= 0 only when n is equal to n? + 1, n?+2, and/or n?+3, depending on the value of ?. When tn+1 = 0, it automatically disappears from the right hand sides of (2.118)-(2.124). From the group transition equations, we have yn ? tn+1?0=?0, and it is easy to verify that all states have nonnegative assignments and the transition equations are also satisfled in this case. Therefore, there always exists a feasible allocation to satisfy all of the transition equations with yns deflned through this allocation scheme. 2.8.3 The Proof of Theorem 2.3 In a convex optimization problem, where the inequality constraints are convex and the equality constraints are a?ne, if x? is such that there exists a set of Lagrange multipliers which together with x? satisfy the KKT conditions, then x? is a global minimizer for the problem [26][27]. In the flrst step, we have a linear objective function and linear constraints. Therefore, if we prove that the point achieved by the assignment satisfles the KKT conditions, then we can say that it is the global 54 minimizer for our problem. In the allocation scheme, if ? = yn???=(? + ?), then it is easy to prove that the resulting allocation is optimal, since every yn;n < n? achieves its max- imum possible value. However, this is not the case when ? 6= yn???=(? + ?), because the second to last nonzero yn does not achieve its maximum. In the following, we prove that our allocation is optimal for this case as well. Deflne y = [y1;y2;:::;y2N?1;t2;:::;tN?1;t2N]. Then, the linear equality constraints, in- cluding the 2N group transition equations and the sum constraint can be written as a (2N +1)?2(2N ?1) matrix form as follows 0 BB BB BB BB BB BB BB BB BB BB BB @ 1 0 0 ??? 0 1 0 0 ??? 0 ?? 1 0 ??? 0 ? 1 0 ??? 0 0 ?? 1 ??? 0 0 ? 1 ??? 0 ... ... ... ... 0 0 0 ??? 1 0 0 0 ??? 1 0 0 0 ??? ?? 0 0 0 ??? ? 1 1 1 ??? 1 0 0 0 ??? 0 1 CC CC CC CC CC CC CC CC CC CC CC A yT= 0 BB BB BB BB BB BB BB BB BB BB BB @ x00? x00? 0 ... 0 0 ? 1 CC CC CC CC CC CC CC CC CC CC CC A which we write equivalently as, AyT = b (2.125) by deflning bT = x00? x00?(1+?) x00?2(1+?) ??? x00?2N?1(1+?) ? ?T (2.126) 55 and A= 0 BB BB BB BB BB BB BB BB BB BB BB @ 1 0 0 ??? 0 1 0 0 ??? 0 0 1 0 ??? 0 ?+? 1 0 ??? 0 0 0 1 ??? 0 (?+?)? ?+? 1 ??? 0 ... ... ... ... 0 0 0 ??? 1 (?+?)?2N?3 (?+?)?2N?4 (?+?)?2N?5 ??? 1 0 0 0 ??? 0 (?+?)?2N?2 (?+?)?2N?3 (?+?)?2N?4 ??? ?+? 1 1 1 ??? 1 0 0 0 ??? 0 1 CC CC CC CC CC CC CC CC CC CC CC A (2.127) The Lagrangian is expressed as L(y;?;?) = cTy??T(Ay?b)??Ty (2.128) where c = [1;2;??? ;2N ?1;0;0;??? ;0], ? 2 R2N+1 and ? 2 R4N?2. We need to prove that there exists a set of??, ?? associated with our allocation y?, such that they satisfy ?? ? 0; ??Ty? = 0 (2.129) y? ? 0; Ay?T = b (2.130) c =AT?? +?? (2.131) Consider the y we obtained with the algorithm. Let us consider the case when ? < yn???=(? + ?) flrst. The allocation indicates that yn > 0 only when 56 n = 1;2;:::;n? + 1, and tn > 0 only when n = n? + 1;n? + 2. Because of the complementary slackness in (2.129), we obtain ?n = 0; n = 1;2;:::;n? +1;n? +2N ?1;n? +2N (2.132) Plugging this into (2.131), and solving the equations, we have ?n = 1?+1 +n?n? ?1; n = 1;2;:::;n? +1 ?2N+1 = ??+1 +n? ?n+2N?2 = ? ? ?n?1 +(?+?) n??1X i=n ?i?i?n +??n??n?n? ! ; n = 2;3;:::;n? (2.133) Thus, wehave?n < 0whenn ? n?, whichguaranteesthepositivenessoff?ngn?+2N?2n=2N . We also have 2NX i=n?+2 ?i?i?n??2 = ? 1(?+?)(?+1) (2.134) and ?n = 1?+1 +n?n? ?1??n; n = n? +2;:::;2N ?1 ?n = ? ? ?n?1 +(?+?) 2NX i=n ?i?i?n ! ; n = n? +2N +1;:::;4N ?2 (2.135) We can always flnd a set of negative f?ig2Ni=n?+2 to satisfy (2.134). Since they are all 57 negative, this guarantees that f?ng2N?1n=n?+2 and f?ng4N?2n=n?+2N+1 are positive. There- fore, at the point y?, we can always flnd a set of multipliers satisfying all of the KKT constraints. This proves that the allocation our algorithm gives is a global minimizer. 58 Chapter 3 Delay Minimization in a Symmetric Multiple Access Channel 3.1 Introduction Traditional information theory investigates transmission problems from a physical layer perspective. In the simplifled source-channel-destination model, information- theoretic approaches assume the availability of an inflnite number of bits at the transmitter before the transmission starts. The burstiness of the arrivals and the associated issue of delay are mostly ignored. In contrast, network theory gives so- phisticated analysis of network layer issues, such as random arrivals and network de- lay. However, in network-theoretic approaches, the underlying physical layer model is usually very simplifled, e.g., in most approaches simultaneous transmissions are not allowed, and even when they are allowed, a collision channel model is used, which is too simplistic to capture what can be achieved in the physical layer from an information-theoretic perspective. In recent years, many authors have taken efiorts to bridge the gap between information theory and network theory [1]. Reference [22] addresses the delay issue for an additive Gaussian noise multiple access channel (MAC). Packets with random sizes arrive according to a Poisson process, and are transmitted out immediately with a flxed power. At the physical layer, the receiver decodes a packet while treating other transmissions as noise. Consequently, the service rate becomes a 59 function of the number of active users in the system. Reference [22] derives the relationship between the average delay and a flxed probability of error requirement. References [2], [4] and [5] consider a discrete-time model for a power-constrained single-user communication channel. Random arrivals queue at the transmitter to wait to be transmitted. In each slot, the transmitter adapts its service rate, i.e., transmission rate, according to the queue length and the channel state, as well as the average power constraint, to minimize the average delay. Reference [2] formulates theproblem as a dynamic programming problem and develops a delay-powertradeofi curve. References [4] and [5] determine some structural properties of the optimal power/rate allocation policy. Reference [9] uses a continuous-time queueing model to model the network layer behavior of a multiple access system. The packets arrive at the transmitters according to independent Poisson processes, and the packet lengths are exponen- tially distributed. The physical layer is modeled as an additive Gaussian noise channel, whose capacity region is a pentagon for the two-user case. The goal of [9] is to select an operating rate point inside the MAC capacity region, as a function of the current queue lengths, in order to minimize the average packet delay. The transmission rates selected from the capacity region serve as the current service rates of the queues. Reference [9] develops the longer-queue-higher-rate (LQHR) al- location strategy, which selects an extreme point in the capacity region of the MAC (i.e., a corner point of the pentagon). Reference [9] shows that LQHR minimizes the average delay of a symmetric system. Reference [10] extends [9] to a poten- tially asymmetric setting, and proves that the delay-optimal policy has a threshold 60 (switch) structure. Reference [11] develops a policy named \modifled LQHR" which works at a corner point of the pentagon when the queue lengths are difierent, and switches to the mid-point of the dominant face of the pentagon when the queue lengths become equal. The \modifled LQHR" algorithm is shown to minimize the average bit delay in the system. The third chapter of [12] extends \modifled LQHR" to an M-user scenario. In this chapter, we consider a similar delay minimization problem. In or- der to track the relationship between the average delay and the transmission rates more accurately and also to consider more general arrivals, we adopt a discrete- time queueing model and consider the problem from a bit perspective rather than a packet perspective. We partition the time into small slots. In each slot, bits arrive at the transmitters randomly according to some general distribution. At the be- ginning of each slot, we allocate transmission rates from within the MAC capacity region to the users, based on their current queue lengths, to minimize the average delay. In our model, the number of bits transmitted in each slot is equal to the product of the transmission rate and the number of channel uses in each slot. We formulate the problem as an average cost Markov decision problem (MDP). We flrst analyze the corresponding discounted cost MDP, and obtain some properties of the value function. Based on these properties, we prove that the delay optimal rate allocation policy for this discounted MDP is to equalize the queue lengths in each slot as much as possible. We then prove that this queue balancing policy is optimal for the average cost MDP as well. Essentially, both the \modifled LQHR" and our policy aim to balance the 61 queue lengths as well as to maximize the throughput at any time. However, the continuous model in [11, 12] allows the rates to be changed at any time, while our model allows us to make decisions only at the beginning of each slot. Consequently, the resulting optimal policies are difierent: The operating point of the \modifled LQHR" algorithm is either one of the corner points or the mid-point of the dominant face of the pentagon, while the queue balancing policy here may operate at any point on the dominant face of the pentagon. 3.2 System Model and Problem Formulation 3.2.1 Physical Layer Model We consider a two-user AWGN multiple access system Y = X1 +X2 +Z (3.1) where Xi is the signal of user i, and Z is a Gaussian noise with zero-mean and variance 2. In this multiple access system, the capacity region is given by [24] R1 ? 12 log 1+ P1 2 ? , C1 (3.2) R2 ? 12 log 1+ P2 2 ? , C2 (3.3) R1 +R2 ? 12 log 1+ P1 +P2 2 ? , Cs (3.4) 62 The capacity region is a pentagon, as shown in Figure 3.1. In this chapter we consider a symmetric two-user system, where P1 = P2 = P. Our results can be generalized to the symmetric K-user case. C s Cs R2 C2 R1C1 Figure 3.1: The capacity region for a two-user multiple access system. 3.2.2 Medium Access Control Layer Model In the medium access control layer, we assume that the bits arriveat the transmitters in random numbers in each slot, see Figure 3.2. Let a1[n] and a2[n] denote the number of bits arriving at the flrst and the second transmitter, respectively, during time slot n. Here, a1[n] and a2[n] are two independent random variables with a common distribution Fa. We assume that the arrivals are i.i.d. in n. There is an inflnite capacity bufier at each transmitter to store the bits. Let q1[n] and q2[n] denote the number of bits in the flrst and the second bufier, re- spectively, at the beginning of the nth slot. At the beginning of each slot, the transmitters decide on how many bits to transmit in this slot based on the current lengths of the two queues. Let d1[n] and d2[n] denote the number of bits to be 63 receiver a1[n] a2[n] user2 user1 Figure 3.2: System model. transmitted from the flrst and the second queue, respectively, in the nth time slot. Let us deflne q[n] , (q1[n];q2[n]), d[n] , (d1[n];d2[n]), and a[n] , (a1[n];a2[n]). Then, the queue lengths evolve according to q[n+1] = (q[n]?d[n])+ +a[n] (3.5) where (x)+ denotes max(0;x). If the number of channel uses in a slot is ?, the transmission rate of user i becomes Ri[n] = di[n]=?. Consequently, the actual rates of the users that need to be selected from the capacity region described by (3.2)-(3.4), are proportional to d1[n] and d2[n], and therefore, (d1[n];d2[n]) can be viewed as (scaled) rates. In order to simplify the notation, we will call di[n] = Ri[n]? as the rate of user i for slot n. The corresponding scaled capacity region that (d1;d2) should reside in is described by (3.2)-(3.4) by multiplying right hand sides by ?. 3.2.3 Formulation as an MDP According to Little?s law [25], minimizing the average delay in the system is equiv- alent to minimizing the average number of bits in the system, which is the average 64 sum of queue lengths. If the system starts from state q[1], the delay minimization problem is to obtain optimal policy d[n], n = 1;2;::: to minimize limsup N!1 1 NE " NX n=1 (q1[n]+q2[n]) # (3.6) Therefore, this problem can be formulated as a standard average cost MDP. The state space consists of all possible queue length vectors, while the policy space is the set of operating points within the multiple access capacity region. In principle, the values of qi[n], di[n] can only be integers, however, for practical applications, one bit is a flne enough precision that we can use a uid model to reasonably approximate the original discrete-state system. 3.3 The Discounted Cost Problem Instead of considering the minimization problem with the average cost criterion in (3.6) directly, we flrst consider the following minimization problem with a total discounted cost criterion E " 1X n=1 fln(q1[n]+q2[n]) # (3.7) where 0 < fl < 1 is the discount factor. We will return to the average cost criterion in (3.6) by letting fl go to 1. Let us deflne V fl(q) to be the total discounted cost starting from an initial state q. Then, for the optimization problem with criterion (3.7), V fl(q) must satisfy 65 the following optimality condition [28] V fl(q) = min d2C 'q 1 +q2 +flE ?V fl((q?d)+ +a)?? (3.8) We will flrst start with a discounted cost problem over flnite horizon N. For this problem with an initial state q, the dynamic programming formulation is V flN(q) = min d2C n q1 +q2 +flE h V flN?1((q?d)+ +a) io (3.9) with V fl0 (?) = 0. Since the instantaneous cost q1[n]+q2[n] is positive, and the policy space is flnite [28] V flN(q) ! V fl(q) as N !1 (3.10) where V fl(?) is the unique bounded solution of (3.8). In the following, we will analyze the discounted cost problem and obtain struc- tural properties of the value function V fl(q). We will flnd these structural properties of V fl(q) by examining the structural properties of the flnite-horizon discounted cost problem V flN(q). Lemma 3.1 V fl(q) is increasing in q1 and q2. Proof: From (3.10), we know that proving V fl(q) is increasing in q1 and q2 is equivalent to proving V flN(q) is increasing in q1 and q2 for every N. We prove this through induction. First, when N = 0;1, this is trivially true. Next, we assume 66 that it is true for N ? 1. We will prove that V flN(q1 + 1;q2) > V flN(q1;q2) for any positive (q1;q2). V flN(q1 +1;q2) = q1 +q2 +1+flE h V flN?1((q1 +1?d?1)+ +a1;(q2 ?d?2)+ +a2) i (3.11) ? q1 +q2 +1+flE h V flN?1((q1 ?d?1)+ +a1;(q2 ?d?2)+ +a2) i (3.12) > min d2C n (q1 +q2)+flE h V flN?1?(q1 ?d1)+ +a1;(q2 ?d2)+ +a2) io (3.13) = V flN(q1;q2) (3.14) where (d?1;d?2) in (3.11) is the pointwithin the capacityregion that minimizes V flN(q1+ 1;q2), and (3.12) follows from the assumption that V flN?1(q1;q2) is increasing for every q1. Therefore, V flN(q) is increasing in q1 for every N. Using (3.10), this implies that V fl(q) is increasing in q1. Now, following the same procedure for q2, we can prove that V fl(q) is increasing in q2 as well. 2 Lemma 3.2 In (3.8), the optimal operating point d must be on the boundary of the capacity region C. Proof: For an initial state q, if the optimal operating point d = (d1;d2) is not on the boundary of the capacity region but on the interior of the capacity region, then, we can always flnd points ?d = (d01;d2), ~d = (d1;d02) that are on the boundary of the capacity region with d01 > d1, d02 > d2. Note that ?d ? d and ~d ? d. Then, 67 by Lemma 3.1, we have E?V fl((q? ?d)+ +a)?? E?V fl((q?d)+ +a)? (3.15) and E h V fl((q? ~d)+ +a) i ? E?V fl((q?d)+ +a)? (3.16) This contradicts the optimality of d. Thus, d must be on the boundary of the capacity region. 2 Lemma 3.3 V fl(q) is symmetric and jointly convex in q. Proof: The symmetry property can be proved by induction. Note that V flN(q) is symmetric for N = 0;1. Assuming that V flN?1(q) is symmetric, it is easy to see that V flN(q) would be symmetric. Now, taking the limit N ! 1, it follows that V fl(q) is symmetric. We prove the convexity of V fl(q) through induction as well. When N = 0;1, it is trivial to see that V flN(q) is convex in q. Next, we assume that V flN?1(q) is convex in q. Given two difierent queue length vectors x , (x1;x2) and y , (y1;y2), we 68 have ?V flN(x)+(1??)V flN(y) = ?(x1 +x2)+(1??)(y1 +y2)+?flE h V flN?1((x?b?)+ +a) i +(1??)flE h V flN?1((y?d?)+ +a) i (3.17) ? ?(x1 +x2)+(1??)(y1 +y2)+ flE h V flN?1??(x?b?)+ +(1??)(y?d?)+ +a? i (3.18) ? ?(x1 +x2)+(1??)(y1 +y2)+ flE h V flN?1???(x?b?)+(1??)(y?d?)?+ +a? i (3.19) ? min d2C n ?(x1 +x2)+(1??)(y1 +y2)+ flE h V flN?1???x+(1??)y?d?+ +a? io (3.20) = V flN(?x+(1??)y) (3.21) where b? and d? are the minimizers for V flN(x) and V flN(y), respectively. Here, (3.18) follows from the assumption of the convexity of V flN?1(?), (3.19) follows from the convexity of the function (?)+, and (3.20) is valid because b?;d? 2 C, and C is a convex set, implying ?b? +(1??)d? 2 C. 2 Before we move on to the next structural property of the function V fl(q), we need to introduce the concepts of majorization and Schur-convexity. Deflnition 3.1 ([29]) Given x;y 2 Rd, we say that x majorizes y, and we write 69 x ? y, if kX i=1 xi ? kX i=1 yi; k 2f1;:::;d?1g (3.22) dX i=1 xi = dX i=1 yi (3.23) where xi and yi are the ith largest elements of x and y, respectively. Deflnition 3.2 ([29]) A function f : Rd ! R is said to be Schur-convex when x ? y implies f(x) ? f(y). AfunctionisSchur-convexifitissymmetricandconvex[29]. UsingLemma3.3, we conclude that V fl(q) is Schur-convex. However, given that x ? y, we cannot di- rectly claim that V fl(x+a) ? V fl(y+a) for every a. This is because the randomness of a may reverse the majorization relationship between x+a and y +a. However, provided that V fl(q) is symmetric and convex, and a has i.i.d. components, we can prove that E[V fl(x+a)] ? E[V fl(y+a)] if x ? y. Lemma 3.4 For i.i.d. ais x ? y implies E[V fl(x+a)] ? E[V fl(y+a)]. Proof: When a1 = a2, clearly, x+a ? y +a, and V fl(x+a) ? V fl(y +a). When a1 6= a2, we evaluate the functions V fl(x+a) and V fl(y+a) at two symmetric points (c1;c2) and (c2;c1). In order to simply the notation, for any vector v = (v1;v2), we deflne ?v = (v2;v1). Because ais are i.i.d., the two points c, ?c have the same probability mass. Without loss of generality, we assume c1 > c2, x1 ? x2, y1 ? y2. Since x ? y, we have x1 ? y1 ? y2 ? x2. 70 Consider four vectors (x+c), (?x+c), (y+c), (?y+c). We see that they are four points on the line q1+q2 = x1+x2+c1+c2. Moreover, since x1 ? y1 ? y2 ? x2, (x+c) and (?x+c) are the two outer points, and the mid-point of these two points is the same as the mid-point of the other two points. Since V fl(q) is convex, we have V fl(x+c)+V fl(?x+c) ? V fl(y+c)+V fl(?y+c) (3.24) We also note that because of the symmetry property of V fl(q) we have V fl(?x+c) = V fl(x+?c). Similarly, we have V fl(?y+c) = V fl(y+?c). Therefore, (3.24) is equivalent to V fl(x+c)+V fl(x+ ?c) ? V fl(y+c)+V fl(y+ ?c) (3.25) Integrating over a1, a2, we get E[V fl(x+a)] = Z a1>a2 V fl(x+a)+ Z a1 0. To prove the claim by contradiction, let us assume that d is not optimal, but b = (b1;b2) is the optimal point on the dominant face. Since both d and b are on the dominant face of the capacity region: d1 + d2 = b1 + b2. Since with a flxed sum, the vector with identical components is majorized by any other vector [29], we have (q1 ? b1;q2 ? b2) ? (q1 ? d1;q2 ? d2). Without loss of generality, we assume q1 ? b1 > q2 ? b2, i.e., q1 ? b1 > q1 ? d1 = q2?d2 > q2?b2. Ifq2?b2 ? 0, wehave((q1?b1)+;(q2?b2)+) ? ((q1?d1)+;(q2?d2)+), and using Lemma 3.4, this implies E[V fl((q?b)+ +a)] ? E[V fl((q?d)+ +a)] (3.30) 72 On the other hand, if q2 ?b2 < 0, we have E[V fl((q?b)+ +a)] = E[V fl((q1 ?b1)+a1;a2)] (3.31) ? E[V fl(q1 ?d1 +a1;d1 ?b1 +a2)] (3.32) = E[V fl(q1 ?d1 +a1;b2 ?d2 +a2)] (3.33) > E[V fl(q1 ?d1 +a1;q2 ?d2 +a2)] (3.34) = E[V fl((q?d)+ +a)] (3.35) where (3.32) follows from (q1 ? b1;0) ? (q1 ? d1;d1 ? b1) and Lemma 3.4, (3.33) follows from the fact that d1 +d2 = b1 +b2, and (3.34) is valid because we assumed that q2 ? b2 < 0, thus q2 ? d2 > b2 ? d2, and we apply Lemma 3.1. The results in (3.30) and (3.35) contradict the optimality of b, and therefore, d must be the optimal operating point. Next, we prove that if there does not exist a point on the dominant face of the capacity region which equalizes the queue lengths, then the optimal operating point must be one of the corner points. Let us assume that the optimal operating point d = (d1;d2) is not a corner point, and without loss of generality, let us assume that (q1 ?d1)+ > (q2 ?d2)+. If q1 ?d1 > q2 ?d2 ? 0, we can always flnd a small enough ? > 0, such that the operating point (d1+?;d2??) is also on the dominant face, and q1?(d1+?) > q2?(d2??) > 0. Since (q1?d1;q2?d2) ? (q1?(d1+?);q2?(d2??)), based on Lemma 3.4, we have E[V fl((q?d)+ +a)] ? E[V fl(q1 ?(d1 +?)+a1;q2 ? (d2 ? ?) + a2)], and this contradicts the optimality of d. On the other hand, if 73 q1?d1 > 0 > q2?d2, we can also flnd a small enough ? > 0, such that q1?(d1+?) > 0 ? q2?(d2??), and (d1 +?;d2??) is on the dominant face as well. Therefore, we have 0 < q1 ?(d1 +?) < q1 ?d1, and (q2 ?d2)+ = (q2 ?d2 +?)+ = 0. According to Lemma 3.1, we have V fl(q1?d1+a1;a2) > V fl(q1?(d1+?)+a1;a2) for any value of a1 and a2. Therefore, E[V fl(q1?d1+a1;a2)] > E[V fl(q1?(d1+?)+a1;a2)], and this contradicts the optimality of d. Hence, the optimal operating point, in this case, must be one of the corner points. 2 Using Theorem 3.1, we express the optimal operating point d? = (d?1;d?2) as a function of the queue lengths q = (q1;q2) d? = 8 >>>> >>< >>> >>> : ?q 1?q2+Cs 2 ; q2?q1+Cs 2 ?; jq 1 ?q2j < 2C1 ?Cs (C1;Cs ?C1); q1 ?q2 > 2C1 ?Cs (Cs ?C2;C2); q1 ?q2 < Cs ?2C1 This optimal rate allocation scheme works on the dominant face of the capacity region and therefore maximizes the number of bits transmitted in each slot; and, at the same time, it tries to balance the queue lengths as much as possible, which, in turn, minimizes the probability that any one of the queues becomes empty in the upcoming slots. When a queue becomes empty, the system resources cannot be utilized most e?ciently, as even though the user with an empty queue has power to transmit, it does not have any bits to transmit. Finally, while we developed Theorem 3.1 for the discounted cost criterion, we can flnd a sub-sequence of discount factors fln such that fln ! 1 as n ! 1. 74 Therefore, the policy we developed is optimal for the average cost problem as well. 3.4 Numerical Results We consider a two-user AWGN multiple access channel, with C1 = C2 = 20 bits/slot and Cs = 30 bits/slot. The number of bits arriving at the transmitters in each slot follows a Poisson distribution with parameter ?. We compare two policies: the optimal policy developed in this chapter which tries to balance the queue lengths in each slot and the LQHR algorithm developed in [9] which chooses a corner point of the capacity region and allocates the larger rate to the longer queue. We plot the average delay versus ? in Figure 3.3. 6 7 8 9 10 11 12 13 14 15 1 1.5 2 2.5 Arrival rate (bits/slot) Average delay (slots/bit) LQHR Queue balancing Figure 3.3: Average delay versus arrival arrival rate. We observe that when ? is small, both the LQHR policy and the queue balanc- ing policy yield delay close to one slot, and the difierence between these two policies 75 is insigniflcant. This is because, the system has a light tra?c, and both policies empty both queues in almost all slots. When ? becomes very close to the boundary of the capacity region, the average delay grows rapidly under both policies, and again the difierence between the two policies becomes insigniflcant. This is because, the system has a heavy tra?c, and the probability that the queues become empty is very small under both policies, and the actual number of departures in each slot is almost the same for both policies. When ? is neither very small nor very large, the queue balancing policy outperforms the LQHR policy signiflcantly. This is because, equalizing the queue lengths minimizes the probability that one queue is large while the other queue is empty or close to empty, and consequently utilizes the system resources more e?ciently. 3.5 Conclusions In this chapter, we investigated the delay-optimal rate allocation problem in a sym- metric MAC. We formulated the problem as a discrete-time MDP, and analyzed the properties of the value function for the corresponding discounted cost MDP. Based on these properties, we proved that the delay optimal rate allocation policy is to equalize the queue lengths in each slot as much as possible. 76 Chapter 4 Delay Minimization with a General Pentagon Rate Region 4.1 Introduction In Chapter 3, we investigate the delay-optimal rate allocation in a symmetric AWGN multiple access channel (MAC), where the underlying capacity region is a symmetric pentagon. We prove that the queue length balancing policy, which minimizes the queue length difierence while working on the dominant face of the capacity region in each slot, minimizes the average bit delay in the system. The goal of this chapter is to use a general pentagon shaped underlying rate region (hence, non-time-divided transmissions) and determine the optimal rate allocation policy from this available rate region, as a function of the current queue sizes of the users, to minimize the delay. Delay minimization for a single-user communication channel has been inves- tigated in [2, 4, 5], where the structural properties of the optimum power/rate allocation policies, and relationships between average power and delay have been determined for fading channels, using dynamic programming and Markov decision process (MDP) formulations. In these works, due to the large number of possible rate/power choices at each channel state, it has been almost impossible to get ana- lytical closed-form optimal solutions. For multi-user systems, even the properties of the optimum rate allocation have been impossible to obtain, except for special rate 77 regions. Reference [9] considers a symmetric Gaussian MAC, and proves that in order to minimize the packet delay, the system should operate at an extreme point of the MAC capacity region, and the larger rate should be given to the user with the larger queue size, hence the name of the proposed policy: longer-queue-higher- rate (LQHR). Reference [10] generalizes [9] to a potentially asymmetric setting, and proves that the delay-optimal policy has a switch structure, i.e., the queue state space should be divided into two, and in each region, the system should operate at one of the two corner points. However, unlike the symmetric case in [9], the explicit form of the switch curve is unknown. Reference [11] develops a policy named \modifled LQHR" which works at a corner point of the pentagon when the queue lengths are difierent, and switches to the mid-point of the dominant face of the pentagon when the queue lengths become equal. The \modifled LQHR" algorithm is shown to minimize the average bit delay in a symmetric system. The third chapter of [12] extends \modifled LQHR" to a symmetric M-user scenario. From the literature above, we observe that the explicit solution of the queue- length based delay-minimization problem is known only for the symmetric Gaussian MAC, where the underlying rate region is a symmetric pentagon. Even for the asymmetric pentagon, the delay-minimizing policy is not known. The reason for this is that delay-minimization requires maximizing the throughput at the current time as well as maximizing the throughput in the future. These are often con icting objectives. The flrst objective requires maximizing the sum-rate while the second objective requires balancing the queue lengths. Unbalanced queue lengths increases 78 the likelihood of one of the queues becoming empty, which results in ine?ciency of transmission, as it decreases the future achievable sum-rates. Thanks to the special properties of the capacity region of the symmetric Gaussian MAC, these two objectives can be achieved simultaneously. However, having a symmetric pentagon as a capacity region is a peculiarity of the symmetric Gaussian MAC. The capacity region of a general (non-Gaussian) MAC is not a pentagon, it is a union of pentagons [24]. Likewise, the capacity regions of the fading Gaussian MAC [30], the Gaussian MAC with multiple antennas [31], or the Gaussian MAC with user cooperation [32, 33] are not pentagons. In this chapter, we will consider a two-user communication channel with a general pentagon rate region. Difierent from the Gaussian MAC capacity region, the pentagon we assume does not have a 45? dominant face. The motivation to study such a rate region is two-fold: First, it is the simplest extension of the rate regions studied so far, that changes a characteristic of the rate region in a fundamental way. This characteristic is that the two corner points on the dominant face do not have equal sum-rates. Therefore, in this example rate region, we are able to observe the tension between throughput optimality, i.e., the desire to work at the point that yields the largest sum-rate, and balancing the queue lengths, i.e., the desire to favor the longer queue over the shorter one, more explicitly. Secondly, this asymmetric pentagon with a non-45? dominant face can be seen as a crude approximation of a general rate region, as shown in Figure 4.1. That is, we can imagine this asymmetric pentagon to be the largest such shape fltting in a general rate region, which may belong to a MAC with fading, multiple antennas, or cooperation. 79 2 R1 (a2,b2) (a1,b1) 1 R2 Figure 4.1: The asymmetric pentagon rate region with a non-45? dominant face. Corner point 2 has larger sum-rate, i.e., a2 +b2 > a1 +b1. Our goal in this chapter is to assign rate pairs to users from the underlying rate region based on their current queue lengths in order to minimize the average delay in the system. We formulate the problem as an MDP and prove that the delay-optimal policy should operate at one of the two corner points of the rate region. Through value iteration, we prove that a switch curve structure exists in the queue state space. Next, we prove that for the discounted-cost MDP, the switch curve has a limit on one of the queue lengths, i.e., when one of the queue lengths exceeds a threshold, the transmitters always operate at the corner point which has the larger sum-rate (see Figure 4.5). That is, the delay-optimal policy favors throughput-optimality (i.e., larger sum-rate) unless the flrst queue gets close to empty, in which case, the policy favors balancing queue lengths. Our result has two practical implications: First, it gives a partial analytical characterization for the delay-optimal switch curve. Secondly, it implies that we can operate the queues partially distributedly, in that, if the current queue length of the flrst user is larger than the limit, then this user does not need to know the current queue length of the other user in order to decide 80 about the rate point at which it should operate on the rate region. Finally, we note that, according to the optimal policy, always operating at the maximum sum-rate point is not optimal. With the goal of maximizing the current sum-rate as well as the sum-rate in the future, depending on the current queue lengths, the optimal policy may switch from the maximum sum-rate point to the rate point that favors balancing the queue lengths. This action minimizes the probability that any one of the queues becomes empty in the future, hence maximizes the overall transmission rates, and consequently, minimizes the overall delay. Therefore, we observe that, the optimal rate allocation policy trades some of the instantaneously achievable sum-rate in favor of balancing the queue lengths, with the goal of minimizing the overall delay. 4.2 System Model and Problem Formulation We consider a communication system with two transmitters, and one receiver, as in Figure 4.2. The underlying rate region is a general pentagon as shown in Figure 4.1. We denote the two corner points of the rate region as points 1 and 2, with rate pairs (a1;b1) and (a2;b2), respectively. Without loss of generality, we assume that a2 +b2 > a1 +b1, i.e., that point 2 has a larger sum-rate. We denote the difierence between the two sum-rates by ? = a2 +b2 ?(a1 +b1). In the medium access control layer, we assume that packets arrive at the source nodes according to independent Poisson processes with parameters ?1 and ?2, see Figure 4.2. We also assume that the packet lengths are independent and 81 identically distributed exponential random variables with unit mean. Therefore, for a given transmission rate r, the transmission time for a packet is an exponential random variable with parameter r. There is a bufier with inflnite capacity at each transmitter, storing the packets until they are transmitted. Let q1(t), q2(t) denote the number of packets in the two bufiers at time t. The transmitters determine their transmission rates, which are the components of the rate vector r, where r is in the rate region, based on the current queue length vector q(t) = (q1(t);q2(t)). Therefore, on the medium access control layer, the queue lengths evolve according to a continuous-time Markov chain, whose transition rates are determined by the arrival and transmission rates. ?2 user1 receiver q1(t) q2(t) ?1 user2 Figure 4.2: The system model. According to Little?s law [25], minimizing the average delay in the system is equivalent to minimizing the average number of packets in the system. Assuming that the system starts from state q(0), the delay minimization problem is to obtain an optimal policy u, to minimize the long-term average cost: limsup t!1 1 tE ?Z t 0 q(s)Tedsjq(0) ? (4.1) 82 where e is the vector of all ones. Samplingthesystematcertainepoches, wecanconverttheoriginalcontinuous- time problem into a discrete-time problem [28]. Intuitively, we intend to sample the system at any epoch when an arrival or departure occurs. However, because the transition rates are difierent at difierent operating points, the sampling frequency may be difierent for difierent states. In order to sample the system at a uniform frequency, we adopt the normalization method in [34]. Since a2 + b2 is the maxi- mum sum of transmission rates, the maximum total transition rate of the system is ?1 + ?2 + a2 + b2, which we deflne as . Let us denote the transmission rates of the users as r1 and r2. If r1 + r2 < a2 + b2, we assume that there is a third transmitter transmitting a dummy packet with rate a2 + b2 ?(r1 + r2). Then, we sample at the epoches when either a packet arrives, or a packet (dummy or real) departs. Therefore, the sampling frequency for all of the states will be the same, and the corresponding discrete-time Markov chain will precisely represent the original system. After sampling and discretizing the continuous-time system, our goal will be to choose r at every transition epoch to minimize the average delay. Let us denote the indices of the transition epoches as n, n = 1;2;:::. Given the initial queue lengths q0, the delay minimization problem is to determine the optimal policy that minimizes: limsup N!1 1 NE "N?1X n=0 q[n]Tejq[0] = q0 # (4.2) 83 Let us deflne Ai and Di to be an arrival or (potential) departure at the ith queue, i = 1;2. For example, A1q = (q1+1;q2), D1q = ((q1?1)+;q2). We flrst deflne the corresponding discounted-cost problem with a discount factor fl, and obtain the dynamic programming formulation: V flN(q) = qTe+fl ?1 ? ?1V flN?1(A1q)+?2V flN?1(A2q) (4.3) +min r2C n r1V flN?1(D1q)+r2V flN?1(D2q)+(a2 +b2 ?r1 ?r2)V flN?1(q) o? where C is the rate region from which rates r1 and r2 are chosen. As N ! +1, V flN(q) ! V fl(q), which is the unique solution of the optimality equation: V fl(q) = qTe+fl ?1 ? ?1V fl(A1q)+?2V fl(A2q) (4.4) +min r2C n r1V fl(D1q)+r2V fl(D2q)+(a2 +b2 ?r1 ?r2)V fl(q) o? This is a two-dimensional MDP, which is di?cult to solve in general. We flrst determine some structural properties of the optimal policy. Lemma 4.1 V fl(q) is monotonically increasing in qi, i = 1;2. Proof: We prove this lemma using induction. First, since V fl0 (q) = 0, V flN(q) increases monotonically in q1 and q2 for N = 0. Then, we assume that this lemma 84 holds for V flN(q);N > 0, and prove it for N +1. Since V flN+1(q) = qTe+fl ?1 ? ?1V flN(A1q)+?2V flN(A2q) (4.5) +min r2C n r1V flN(D1q)+r2V flN(D2q)+(a2 +b2 ?r1 ?r2)V flN(q) o? Using the assumption that V flN(q) is monotonically increasing in q1 and q2 and the fact that qTe is also monotonically increasing in q1 and q2, in order to prove the monotonicity of V flN+1(q) in q1 and q2, we only need to show that min r2C n r1V flN(D1q)+r2V flN(D2q)+(a2 +b2 ?r1 ?r2)V flN(q) o (4.6) is monotonically increasing in q1 and q2. We compare the values of this expression at two states A1q and q as follows min r2C n r1V flN(D1A1q)+r2V flN(D2A1q)+(a2 +b2 ?r1 ?r2)V flN(A1q) o (4.7) = r?1V flN(D1A1q)+r?2V flN(D2A1q)+(a2 +b2 ?r?1 ?r?2)V flN(A1q) (4.8) ? r?1V flN(D1q)+r?2V flN(D2q)+(a2 +b2 ?r?1 ?r?2)V flN(q) (4.9) ? min r2C n r1V flN(D1q)+r2V flN(D2q)+(a2 +b2 ?r1 ?r2)V flN(q) o (4.10) where (r?1;r?2) minimizes the value of (4.6) at state A1q. Here the flrst inequality follows from the assumption that V flN(q) is monotonically increasing in q1 and q2, and the second inequality follows from the fact that (r?1;r?2) may not be the minimizer of the function in (4.10). 85 Comparing (4.7) and (4.10), we conclude that the function in (4.6) is mono- tonically increasing in q1 and q2 for N. Then, since this is true for any N, by taking the limit V fl(q) = limN!1V flN(q) is monotonically increasing in q1 and q2. 2 Lemma 4.2 The optimal operating point must lie on the boundary of the rate re- gion. In addition, it must be one of the two corner points. Proof: The flrst half of Lemma 4.2 can be proved using Lemma 4.1. If the optimal operating point (r1;r2) is not on the boundary but is in the interior of the rate region, then, we can always flnd another operating point (r01;r02) on the boundary, where r01 ? r1, and r02 ? r2. Then, based on Lemma 4.1, the resulting value of (4.6) will be strictly smaller when operating at (r01;r02) compared to the value when operating at (r1;r2). This contradicts with the optimality of (r1;r2). Thus, the optimal operating point must lie on the boundary of the rate region. Therefore, we only need to focus on the dominant face of the capacity region. Any point (r1;r2) on the dominant face can be expressed as a linear combination of the two corner points. Thus, we have min r2C n r1V flN?1(D1q)+r2V flN?1(D2q)+(a2 +b2 ?r1 ?r2)V flN?1(q) o = min ?2(0;1) n ? ? a1V flN?1(D1q)+b1V flN?1(D2q)+?V flN?1(q) ? +(1??) ? a2V flN?1(D1q)+b2V flN?1(D2q) ?o (4.11) = a2V flN?1(D1q)+b2V flN?1(D2q) +min n (a1 ?a2)V flN?1(D1q)+(b1 ?b2)V flN?1(D2q)+?V flN?1(q);0 o (4.12) 86 where the last equality follows from the fact that the minimizer for a linear function must be one of the end points. 2 Let T be an operator deflned on real-valued functions as: Tf(q) = qTe+fl ?1 ? ?1f(A1q)+?2f(A2q)+a2V flN?1(D1q)+b2V flN?1(D2q) +min n (a1 ?a2)f(D1q)+(b1 ?b2)f(D2q)+?f(q);0 o? (4.13) Therefore, the dynamic programming optimality equation can be written as V flN+1(q) = TV flN(q) (4.14) 4.3 An Inductive Proof of the Switch Structure In this section, we prove that the delay-optimal policy has a switch structure. In order to prove that, we flrst deflne a set of functions with properties which are su?cient to have a switch structure. We show that these properties are preserved under the operator T. Since V fl0 = 0 is within this set, using induction, we will show that V fl will be within this set. Let us deflne F to be the set of real-valued functions such that: 1. f(q) is increasing in q1 and q2. 2. f(q+x)?f(q) is increasing in q1 and q2 for any flxed x. 3. (a1 ?a2)f(D1q)+(b1 ?b2)f(D2q)+?f(q) is increasing in q1. 87 4. (a1 ?a2)f(D1q)+(b1 ?b2)f(D2q)+?f(q) is decreasing in q2. Then, we have the following lemma. Lemma 4.3 If f 2F, then Tf 2F. The proof of Lemma 4.3, when ? = 0, can be found in [35]. When ? 6= 0, the proof is difierent, and is provided in Appendix 4.7. Lemma 4.4 V fln (q) 2F for all n. This lemma can be verifled as follows. Since V fl0 = 0, V fl0 is in F. Using Lemma 4.3 recursively, we have V fln (q) 2F for n = 0;1;2;:::. We now deflne the switch function: sn(q1) = min'q2 :(a1 ?a2)f(D1q)+(b1 ?b2)f(D2q)+?f(q) ? 0? (4.15) A generic switch function is shown in Figure 4.3. As we state in the following theorem, the optimal rate assignment problem has a switch structure. Theorem 4.1 The optimal policy for the discounted-cost MDP has a switch struc- ture, i.e., sn(q1) is increasing for every n. This theorem can be proved using properties 3) and 4) of V fln (q). The switch curve partitions the queue state space into two parts, each corresponding to one of the two operating points (corner points of the pentagon). Following the arguments in [10, 35], we can prove that the switch structure still exists when fl ! 1, i.e., for the average cost problem. 88 q2 1 2 q1 sn(q1) Figure 4.3: The switch structure of the optimal policy. While we have proved that the optimal policy has a switch structure, i.e., that the queue state space is divided into two, where in each region the optimal policy operates the system at one of the two corner points, a closed form solution for this switch curve is not known in general. The switch curve is explicitly known only for one special case, which is the symmetric Gaussian MAC case, where the rate region is a symmetric pentagon with a 45? dominant face. In that case the switch curve is a 45? straight line emanating from the origin, i.e., sn(q1) = q1, as shown in Figure 4.4. This implies that the system operates at one of the corner points when q1 > q2, and at the other corner point when q1 < q2. This results in the LQHR policy in [9]. In the asymmetric Gaussian MAC case, where the rate region is an asymmetric pentagon, but with still a 45? dominant face, even though it is known that a switch curve structure exists, the explicit form of the switch curve is not known [10]. In the next section, we will show that, in this more general case where we have an asymmetric pentagon rate region with a non-45? dominant face, even though we do not have an explicit formula for the switch curve, we show that we have a limit on 89 the switch curve along one of the dimensions. q2 q1 1 2 Figure 4.4: The switch structure for a symmetric Gaussian MAC. 4.4 The Limit on the Switch Curve Although we have shown that the delay optimal policy has a switch structure, it is di?cult to obtain the exact switch curve analytically. In this section, we will show that the switch curve is bounded in the q1-dimension. In other words, we can flnd a threshold N, such that, for all q1 greater than this threshold, the optimal operating point is the second corner point of the pentagon. In order to prove this, we start from an initial function f0, which is linear in q1 +q2. We will use f0 to approximate V fl over a large portion of the state space. Speciflcally, this region includes states q with q1;q2 > N, where N is a large enough number. Let us deflne: f0(q) = 11?fl(q1 +q2)+ fl(1?fl)2 ?1 +?2 ?a2 ?b2? 1 +?2 +a2 +b2 (4.16) 90 Clearly, f0 2F. It is easy to verify that Tf0(q)?f0(q) = 8 >>> >>> >>>> < >>> >>>> >>> : 0 q1;q2 6= 0 fl(a2+b2) (1?fl) q = 0 fl(a1+?) (1?fl) q1 = 0 flb2 (1?fl) q2 = 0 (4.17) that is, Tf0 and f0 difier only on the boundary, and for all states away from the boundary, these two functions have the same value. This is a key property that will be essential in this section. Note that under the operator T, the difierence caused by the boundary only propagates into the interior region of the state space by one layer in each iteration; rest of the states are not afiected by the operator. Let us deflne: jfjk = maxff(q) : q1;q2 ? 0;q1 +q2 ? kg (4.18) which is the maximum value of the function f in the region where the sum of the queue lengths is less than k. Similarly, let us deflne jfj1 = supff(q) : q1;q2 ? 0g (4.19) which is allowed to be inflnity. Then, we have the following property. Lemma 4.5 For 8f;g 2F, jTf ?Tgjk ? fljf ?gjk+1. 91 Proof: Tf(q)?Tg(q) =fl ?1 ? ?1f(A1q)+?2f(A2q)??1g(A1q)??2g(A2q) +min n a1f(D1q)+b1f(D2q)+?f(q);a2f(D1q)+b2f(D2q) o ?min n a1g(D1q)+b1g(D2q)+?f(q);a2g(D1q)+b2g(D2q) o? (4.20) Since jminfa;bg?minfc;dgj? maxfja?cj;jb?djg, we have jTf ?Tgjk ? fl ?1 ? ?1jf ?gjk+1 +?2jf ?gjk+1 (4.21) +max n a1jf ?gjk?1 +b1jf ?gjk?1 +?jf ?gjk;a2jf ?gjk?1 +b2jf ?gjk?1 o? ? fl ?1(?1 +?2 +a2 +b2)jf ?gjk+1 (4.22) = fljf ?gjk+1 (4.23) completing the proof. 2 Lemma 4.6 Tnf0 converges to a function f as n ! +1, and Tf = f. 92 Proof: Since f0 2F, Tnf0 2F for any n > 0. jTn+1f0 ?Tnf0jk ? fljTnf0 ?Tn?1f0jk+1 (4.24) ? flnjTf0 ?f0jk+n (4.25) ? fl n+1(a2 +b2) (1?fl) (4.26) where (4.26) follows from (4.17). We observe that (4.26) does not depend on k, thus, jTn+1f0?Tnf0j1 is uniformly bounded by (4.26). Since fl < 1, the right hand side of (4.26) forms a Cauchy sequence, therefore, Tnf0 converges to a function f pointwise. In other words, for any ?, we can flnd an N1(?) such that when n ? N1(?), we have jf ?Tn?1f0j1 ? ?. Thus, for such n, we have jTf ?fj1 ?jTf ?Tnf0j1 +jTnf0 ?fj1 (4.27) ? fljf ?Tn?1f0j1 +jTnf0 ?fj1 (4.28) ? (fl +1)? = ?0 (4.29) Therefore, for any ?0, we can flnd a n > N1( ?0fl+1), such that jTf?fj1 ? ?0. In other words, Tf and f are arbitrarily close. Thus, Tf = f. 2 Lemma 4.7 Let V fl0 (q) = 0, then, V fln (q) = TnV fl0 (q) converges to V fl(q), and f(q) = V fl(q). Proof: In order to prove that f(q) = V fl(q) pointwise, we start from the 93 following: jf ?V fljk ?jf ?Tnf0jk +jTnf0 ?V fln jk +jV fln ?V fljk (4.30) ?jf ?Tnf0jk +fljTn?1f0 ?V fln?1jk+1 +jV fln ?V fljk (4.31) ?jf ?Tnf0jk +jV fln ?V fljk +flnjf0 ?V fl0 jk+n (4.32) = jf ?Tnf0jk +jV fln ?V fljk +fln n+k 1?fl + fl (1?fl)2 ?1 +?2 ?a2 ?b2 ?1 +?2 +a2 +b2 ? (4.33) ? ?1 +?2 +?3 (4.34) where (4.31) follows from Lemma 4.5, (4.33) follows from the deflnition of f0, and (4.34) follows from the fact that Tnf0 converges to f0, V fln converges to V fl, and flnn ! 0. Therefore, when n is large enough, we have the difierence bounded by (4.34). We note that (4.34) does not depend on k, thus f(q) = V fl(q) for any point q. 2 Lemma 4.5 means that starting from f0 and performing the iterations, V fl converges to the same function if we started from V fl0 = 0. The convergence point is the unique solution of the optimality equation (4.4). Next, we will prove that f(q) gets arbitrarily close to f0(q) when q1;q2 ! +1. Lemma 4.8 jf ?Tnf0j1 ? fln+1(a2+b2) (1?fl)2 . 94 Proof: jTn+pf0 ?Tnf0jk ?jTn+pf0 ?Tn+p?1f0jk +jTn+p?1f0 ?Tn+p?2f0jk +??? +jTn+1f0 ?Tnf0jk (4.35) ??fln+p?1 +fln+p?2 +???+fln?jTf0 ?f0jk+n+p (4.36) ? fl n(1?flp) 1?fl fl(a2 +b2) (1?fl) (4.37) Note that (4.37) does not depend on k, therefore, jTn+pf0 ? Tnf0j1 is uniformly bounded, and we have jf ?Tnf0j1 = lim p!1 jTn+pf0 ?Tnf0j1 (4.38) = fl n+1(a2 +b2) (1?fl)2 (4.39) 2 Theorem 4.2 f(q) gets arbitrarily close to f0(q) when q1;q2 ! +1. Therefore, the switch curve has a limit on q1. Proof: For any flxed state q, we have jf(q)?f0(q)j?jf(q)?Tnf0(q)j+jTnf0(q)?f0(q)j (4.40) Based on Lemma 4.8, we can see that for 8?, there exists N(?), such that jf ? 95 TN(?)f0j1 ? ?. From the deflnition in (4.19), jf(q)?TN(?)f0(q)j?jf ?TN(?)f0j1 ? ? (4.41) At the same time, from (4.17), we know that TN(?)f0(q) only difiers from f0(q) over the states which are within N(?) layers away from the boundary. Thus, for all q1 > N(?);q2 > N(?), TN(?)f0(q)?f0(q) = 0 (4.42) Therefore, combining (4.40)-(4.42), for any q, q1 > N(?);q2 > N(?), (4.40) is bounded by jf(q)?f0(q)j?jf ?f0j1 +0 = ? (4.43) i.e., ?? ? f(q)?f0(q) ? ?. Thus, in this region, as shown in Figure 4.5, we have a1f(D1q)+b1f(D2q)+?f(q)?a2f(D1q)?b2f(D2q) = (b1 ?b2)f(D2q)+?f(q)?(a2 ?a1)f(D1q) (4.44) ? (b1 ?b2)(f0(D2q)??)+?(f0(q)??)?(a2 ?a1)(f0(D1q)+?) (4.45) = ?1?fl ?2(a2 ?a1)? (4.46) 96 where the inequality follows from (4.43). Therefore, when ? ? ?2(a 2 ?a1)(1?fl) (4.47) (4.46) is always greater than zero, thus point 2 is always better than point 1. From Lemma 4.8, let ? = fl n+1(a2 +b2) (1?fl)2 = ? 2(a2 ?a1)(1?fl) (4.48) from which, we have N(?) = ? logfl ? (1?fl)2(a 2 +b2)(a2 ?a1) ? ?1 (4.49) Since we have proved in the previous section that the optimal policy must have a switch curve structure, for any q, such that q1 ? N(?), the optimal policy is always to operate the system at point 2. Thus, the switch curve has a limit. 2 The result implies that when both q1, q2 are large, the objective of maximiz- ing the sum-rate is more important than balancing the queue lengths in order to minimize the average delay. Thus, in this scenario, operating at point 2 is optimal. When one queue (q1 in this chapter) becomes close to empty, the objective of bal- ancing the queue lengths becomes more important, and the operating point must be switched from point 2 to point 1. 97 N N q2 q1 (q1,q2) Figure 4.5: The switch curve of the discounted-cost MDP. 4.5 Numerical Results We consider a system where the arrival rates for the flrst and second user are ?1 = 0:4 packets/unit time, ?2 = 0:3 packets/unit time, respectively. We assume that the packet sizes are exponentially distributed i.i.d. random variables with unit mean. We assume that the underlying rate region is a general pentagon, where the normalized coordinates of the flrst corner point is (0:3;0:5), and the normalized coordinates of the second corner point is (0:7;0:3). We flrst obtain the optimal policy with fl = 1, which corresponds to the average delay minimization policy. We observe that the optimal policy has a switch structure. Then, we vary the value of fl, and obtain the optimal policy for the corresponding discounted-cost problem. These curves are shown in Figure 4.6. We observe that for each curve, there is a limit on the dimension of q1, and all of these curves are lower bounded by the curve with fl = 1. This can be explained in this way: as fl increases, the weight of future cost increases. Thus, balancing the queue lengths becomes progressively 98 more important, and for some states, it overweighs maximizing the sum-rate at the current stage. Therefore, in this case, the set of states which operate at the flrst corner point enlarges. 0 2 4 6 8 10 12 14 16 18 200 10 20 30 40 50 60 70 80 90 100 q1 q 2 ?=0.999 ?=0.99 ?=0.98 ?=1 Figure 4.6: The switch curves for the discounted-cost MDP. 4.6 Conclusions In this chapter, we investigated the delay minimization problem in a two-user com- munication channel, where the underlying rate region is approximated as a general pentagon. We assumed that the corner points of this pentagon have difierent sum- rates. We formulated the problem as an MDP, and proved that the delay-optimal policy always operates at one of the two corner points, and has a switch structure. This implies that for some states, the optimal policy requires trading a portion of the sum-rate for balancing the queue lengths in order to minimize the average delay. 99 We also proved that for the discounted-cost problem, the switch curve is bounded in one of the dimensions. This implies that the queues can be operated partially distributedly. 4.7 Appendix We prove the properties 1) through 4) of Tf by induction. If f 2F, then obviously, qTe, f(A1q), f(A2q), f(D1q), f(D2q) are in F. Then, it su?ces to show that minf(b1 ?b2)f(D2q)+?f(q);(a2 ?a1)f(D1q)g is also in F. In order to simply the notation, we deflne g(q) = minf(b1 ?b2)f(D2q)+?f(q);(a2 ?a1)f(D1q)g (4.50) If (b1?b2)f(D2q)+?f(q) < (a2?a1)f(D1q), then, the optimal operating point for state q is corner point 1; otherwise, the optimal operating point is corner point 2. We will show that g(q) also possesses the properties 1) through 4) of f(q). 4.7.1 g(q) is increasing in q1 and q2. It is straight forward to prove this property. Hence, we omit its proof. 100 4.7.2 g(q+x)?g(q) is increasing in q1 and q2 for any flxed x. For this property, we will prove that g(A21q)?g(A1q) ? g(A1q)?g(q) (4.51) g(A22q)?g(A2q) ? g(A2q)?g(q) (4.52) g(A1A2q)?g(A2q) ? g(A1q)?g(q) (4.53) First, we evaluate function g at points q, A1q, A21q, as shown in Figure 4.7. q1 (q1,q2) q2 Figure 4.7: We compare the values of g(q) at difierent states. If the optimal operating point for state q, A1q, A21q is corner point 1, then, we have g(q) = (b1 ?b2)f(D2q)+?f(q) (4.54) g(A1q) = (b1 ?b2)f(D2A1q)+?f(A1q) (4.55) g(A21q) = (b1 ?b2)f(D2A21q)+?f(A21q) (4.56) 101 Comparing the difierence of values between two adjacent states, we have g(A21q)?g(A1q) = (b1 ?b2)?f(D2A21q)?f(D2A1q)?+??f(A21q)?f(A1q)? (4.57) ? (b1 ?b2)(f(D2A1q)?f(D2q))+?(f(A1q)?f(A1q)) (4.58) = g(A1q)?g(q) (4.59) where the inequality follows from the assumption that f(q) is in F. Similarly, if the optimal operating point for state q, A1q, A21q is corner point 2, i.e., g(q) = (a2 ?a1)f(D1q) (4.60) g(A1q) = (a2 ?a1)f(D1A1q) (4.61) g(A21q) = (a2 ?a1)f(D1A21q) (4.62) we still have g(A21q)?g(A1q) ? g(A1q)?g(q). If the optimal operating points for state q, A1q, A21q are corner points 1, 2, 2, respectively, then, we have g(q) = (b1 ?b2)f(D2q)+?f(q) (4.63) g(A1q) = (a2 ?a1)f(D1A1q) (4.64) g(A21q) = (a2 ?a1)f(D1A21q) (4.65) 102 and, g(A1q)?g(q) = (a2 ?a1)f(D1A1q)?(b1 ?b2)f(D2q)??f(q) (4.66) = (b1 ?b2)(f(q)?f(D2q)) (4.67) g(A21q)?g(A1q) ? (a2 ?a1)f(A1q)?(b1 ?b2)f(D2A1q)??f(A1q) (4.68) = (b1 ?b2)(f(A1q)?f(D2A1q)) (4.69) Therefore, based on the second property of function f, g(A21q)?g(A1q) ? g(A1q)? g(q) still holds. Similarly, if the optimal operating points for state q, A1q, A21q are corner points 1, 1, 2, respectively, g(q) = (b1 ?b2)f(D2q)+?f(q) (4.70) g(A1q) = (b1 ?b2)f(D2A1q)+?f(A1q) (4.71) g(A21q) = (a2 ?a1)f(D1A21q) (4.72) Since the operating point at state A1q is corner point 1, it implies that we have g(A1q)?g(q) ? (a2 ?a1)f(D1A1q)?((b1 ?b2)f(D2q)+?f(q)) (4.73) = (b1 ?b2)(f(D1A1q)?f(D2q)) (4.74) 103 On the other hand, we have g(A21q)?g(A1q) = (b1 ?b2)(f(A1q)?f(D2A1q)) (4.75) ? (b1 ?b2)(f(q)?f(D2q)) (4.76) ? g(A1q)?g(q) (4.77) where the flrst inequality follows from the second property of function f. Based on the assumption that f 2 F, if the optimal policy for any state q is to operate at corner point 2, then, because of the third property of f, all the states on its right should operate on point 2 also. In the analysis above, we discuss every possible policy at states q, A1q, A21q. For all possible cases, we have shown that g(A21q)?g(A1q) ? g(A1q)?g(q). Following similar procedure, we can prove that g(A22q) ? g(A2q) ? g(A2q) ? g(q), and g(A1A2q) ? g(A2q) ? g(A1q) ? g(q). In summary, we conclude that the property 2) holds for g(q). 4.7.3 (a1 ?a2)g(D1q)+(b1 ?b2)g(D2q)+?g(q) is increasing in q1. We need to show that (a1 ?a2)g(A1A2q)+(b1 ?b2)g(A21q)+?g(A21A2q) ? (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) (4.78) We evaluate function g at points A1A2q, A21q, A21A2q, A2q, A1q. First, we note that if the optimal operating points for states A1A2q, A21q, 104 A21A2q are corner points 1, 2, 2, respectively, as shown in Figure 4.8(a), g(A1A2q) = (b1 ?b2)f(D2A1A2q)+?f(A1A2q) (4.79) g(A21q) = (a2 ?a1)f(D1A21q) (4.80) g(A21A2q) = (a2 ?a1)f(D1A21A2q) (4.81) we have (a1 ?a2)g(A1A2q)+(b1 ?b2)g(A21q)+?g(A21A2q) = (a1 ?a2)?(b1 ?b2)f(D2A1A2q)+?f(A1A2q)? +(b1 ?b2)(a2 ?a1)f(D1A21q)+?(a2 ?a1)f(D1A21A2q) (4.82) = 0 (4.83) This is an important policy pattern, and we will use it often in the proof afterwards. Another important policy patten is to operate at corner point 1, 2, 1, for state A1A2q, A21q, A21A2q, respectively, as shown in Figure 4.8(b). In this scenario, we 105 observe that (a1 ?a2)g(A1A2q)+(b1 ?b2)g(A21q)+?g(A21A2q) (4.84) = (a1 ?a2)?(b1 ?b2)f(D2A1A2q)+?f(A1A2q)? +(b1 ?b2)(a2 ?a1)f(D1A21q) +??(b1 ?b2)f(D2A21A2q)+?f(A21A2q)? (4.85) = ??(a1 ?a2)f(A1A2q)+(b1 ?b2)f(A21q)+?f(A21A2q)? (4.86) 2 (q1,q2) q2 q1 1 2 (a) Pattern 1 1 (q1,q2) q2 q1 1 2 (b) Pattern 2 Figure 4.8: Two special policy patterns. Iftheoptimal operatingpointsatA21q, A1q, A21A2q, A1A2q, A2qare2;1;2;1;1, respectively, as shown in Figure 4.9. Then , if we switch the operating point at state A1A2q from corner point 1 to 2, the policy at point A2q, A1q, and A1A2q becomes 106 the policy pattern discussed above, and we have (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) ? (a1 ?a2)?(b1 ?b2)f(D2A2q)+?(a2 ?a1)f(D1A1A2q) +?f(A2q)?+(b1 ?b2)(a2 ?a1)f(D1A1q) (4.87) = 0 (4.88) = (a1 ?a2)g(A1A2q)+(b1 ?b2)g(A21q)+?g(A21A2q) (4.89) 2 (q1,q2) q2 q1 1 1 2 1 Figure 4.9: The optimal operating points at A21q, A1q, A21A2q, A1A2q, A2q are 2;1;2;1;1, respectively. Similarly, if the optimal operating points at A21q, A1q, A21A2q, A1A2q, A2q are 2;2;2;1;1, or 2;2;2;2;1, respectively, we can show that property 3) still holds. If if the optimal operating points at A21q, A1q, A21A2q, A1A2q, A2q are 107 2;2;1;1;1, as shown in Figure 4.10, we have (a1 ?a2)g(A1A2q)+(b1 ?b2)g(A21q)+?g(A21A2q) = ??(a1 ?a2)f(A1A2q)+(b1 ?b2)f(A21q)+?f(A21A2q)? (4.90) ? ?((a1 ?a2)f(A2q)+(b1 ?b2)f(A1q)+?f(A1A2q)) (4.91) = (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) (4.92) where the inequality follows from the property 3) of function f, and the last in- equality follows from the assumption that the policy at state A2q, A1q, A1A2q falls into the second policy pattern discussed above. 2 (q1,q2) q2 q1 1 1 2 1 Figure 4.10: The optimal operating points at A21q, A1q, A21A2q, A1A2q, A2q are 2;2;1;1;1, respectively. Similarly, if the optimal operating points at A21q, A1q, A21A2q, A1A2q, A2q are 2;1;1;1;1, we have (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) ? ?((a1 ?a2)f(A2q)+(b1 ?b2)f(A1q)+?f(A1A2q)) (4.93) 108 This is because g(A1q) = (b1 ?b2)f(D2A1q) + ?f(A1q) ? (a2 ?a1)f(q), and if we switch the policy from corner point 2 to corner point 1, it forms the second special policy pattern. Thus, the inequality still holds. In summary, for all possible cases, the function g preserves the property 3) of function f. 4.7.4 (a1 ?a2)g(D1q)+(b1 ?b2)g(D2q)+?g(q) is decreasing in q2. We will evaluate g at points A1q, A1A2q, A1A22q, A2q, A22q. If the optimal operating points are 2;2;2;2;2, or 1;1;1;1;1, respectively, it is straightforward to show that the property still holds. If the optimal operating points are 2;2;2;1;1, respectively, as shown in Figure 4.11, we note that the policy at these points is the flrst special policy patten discussed before, and (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) = (a1 ?a2)g(A22q)+(b1 ?b2)g(A1A2q)+?g(A1A22q) (4.94) = 0 (4.95) If the optimal operating points are 2;2;1;1;1, we have (a1 ?a2)g(A22q)+(b1 ?b2)g(A1A2q)+?g(A1A22q) = ??(a1 ?a2)f(A22q)+(b1 ?b2)f(A1A2q)+?f(A1A22q)? (4.96) ? 0 (4.97) = (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) (4.98) 109 2 (q1,q2) q2 q1 1 2 1 2 Figure 4.11: The optimal operating points at A1q, A1A2q, A1A22q, A2q, A22q are 2;2;2;1;1, respectively. where the inequality follows from the assumption that at point A1A22q, the optimal policy is to operate at corner point 1. Similarly, for cases where the optimal operating points are 2;2;2;2;1, or 2;2;1;2;1, the property 4) still holds for g, this is because (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) ? 0 (4.99) (a1 ?a2)g(A22q)+(b1 ?b2)g(A1A2q)+?g(A1A22q) ? 0 (4.100) If the optimal operating points are 2;2;1;1;1, we have (a1 ?a2)g(A22q)+(b1 ?b2)g(A1A2q)+?g(A1A22q) (4.101) ? ??(a1 ?a2)f(A22q)+(b1 ?b2)f(A1A2q)+?f(A1A22q)? (4.102) ? ?((a1 ?a2)f(A2q)+(b1 ?b2)f(A1q)+?f(A1A2q)) (4.103) ? (a1 ?a2)g(A2q)+(b1 ?b2)g(A1q)+?g(A1A2q) (4.104) 110 where the flrst inequality follows from the assumption that the optimal policy for point A1A2q is corner point 1, thus the sum is smaller than operating at corner point 2. The second inequality follows from the property 4) of function f, and the last inequality follows from the assumption that the optimal policy for point A2q is the corner point 2. In summary, for all possible cases, we have proven that properties 1) through 4) hold for g, thus, if f is in F, then Tf is in F. 111 Chapter 5 Optimal Packet Scheduling in a Single-User Energy Harvesting System 5.1 Introduction We consider wireless communication networks where nodes are able to harvest en- ergy from the nature. The nodes may harvest energy through solar cells, vibration absorption devices, water mills, thermoelectric generators, microbial fuel cells, etc. In this work, we do not focus on how energy is harvested, instead, we focus on developing transmission methods that take into account the randomness both in the arrivals of the data packets as well as in the arrivals of the harvested energy. As shown in Figure 5.1, the transmitter node has two queues. The data queue stores the data arrivals, while the energy queue stores the energy harvested from the environment. In general, the data arrivals and the harvested energy can be rep- resented as two independent random processes. Then, the optimal scheduling policy becomes that of adaptively changing the transmission rate and power according to the instantaneous data and energy queue lengths. While one ideally should study the case where both data packets and energy arrive randomly in time as two stochastic processes, and devise an on-line algorithm that updates the instantaneous transmission rate and power in real-time as functions 112 dataqueue energyqueue Ei Bi receivertransmitter Figure 5.1: An energy harvesting communication system model. of the current data and energy queue lengths, this, for now, is an intractable math- ematical problem. Instead, in order to have progress in this di?cult problem, we consider an idealized version of the problem, where we assume that we know exactly when and in what amounts the data packets and energy will arrive, and develop an optimal ofi-line algorithm. We leave the development of the corresponding on-line algorithm for future work. Speciflcally, we consider a single node shown in Figure 5.2. We assume that packets arrive at times marked with ? and energy arrives (is harvested) at points in time marked with ?. In Figure 5.2, Bi denotes the number of bits in the ith arriving data packet, and Ei denotes the amount of energy in the ith energy arrival (energy harvesting). Our goal then is to develop methods of transmission to minimize the time, T, by which all of the data packets are delivered to the destination. The most challenging aspect of our optimization problem is the causality constraints introduced by the packet and energy arrival times, i.e., a packet may not be delivered before it has arrived and energy may not be used before it is harvested. The trade-ofi relationship between delay and energy has been well investigated in traditional battery powered (unrechargeable) systems. References [13{18] inves- 113 ??? E1 B0 B1 B2 BM t0 t1 t2 tMsK Ts1 ??? E0 EK Figure 5.2: System model with random packet and energy arrivals. Data packets arrive at points denoted by ? and energies arrive (are harvested) at points denoted by ?. tigate energy minimization problems with various deadline constraints. References [2, 4, 5, 9{12] investigate delay optimal resource allocation problems under various difierent settings. References [2, 4, 5] consider average power constrained delay min- imization problem for a single-user system, while [9{12] minimize the average delay through rate allocation in a multiple access channel. In this chapter, we consider a single-user communication channel with an en- ergy harvesting transmitter. We assume that an initial amount of energy is avail- able at t = 0. As time progresses, certain amounts of energies will be harvested. While energy arrivals should be modeled as a random process, for the mathematical tractability of the problem, in this chapter, we assume that the energy harvesting procedure can be precisely predicted, i.e., that, at the beginning, we know exactly when and how much energy will be harvested. For the data arrivals, we consider two difierent scenarios. In the flrst scenario, we assume that packets have already arrived and are ready to be transmitted at the transmitter before the transmission starts. In the second scenario, we assume that packets arrive during the transmis- sions. However, as in the case of energy arrivals, we assume that we know exactly when and in what amounts data will arrive. Subject to the energy and data arrival 114 constraints, our purpose is to minimize the time by which all packets are delivered to the destination through controlling the transmission rate and power. This is similar to the energy minimization problem in [13], where the objective is to minimize the energy consumption with a given deadline constraint. In this chapter, minimizing the transmission completion time is akin to minimizing the deadline in [13]. However, the problems are difierent, because, we do not know the exact amount of energy to be used in the transmissions, even though we know the times and amounts of harvested energy. This is because, intuitively, using more energy reduces the transmission time, however, using more energy entails waiting for energy arrivals, which increases the total transmission time. Therefore, minimizing the transmission completion time in the system requires a sophisticated utilization of the harvested energy. To that end, we develop an algorithm, which flrst obtains a good lower bound for the flnal total transmission duration at the beginning, and performs rate and power allocation based on this lower bound. The procedure works progressively until all of the transmission rates and powers are determined. We prove that the transmission policy obtained through this algorithm is globally optimum. 5.2 Scenario I: Packets Ready Before Transmission Starts We assume that there are a total of B0 bits available at the transmitter at time t = 0. We also assume that there is E0 amount of energy available at time t = 0, and at times s1, s2, :::, sK, we have energies harvested with amounts E1, E2, :::, EK, respectively. This system model is shown in Figure 5.3. Our objective is to 115 minimize the transmission completion time, T. sK T E1 ??? t B0 s1 s2 E0 EKE2 0 Figure 5.3: System model with all bits available at the beginning. Energies arrive at points denoted by ?. We assume that the transmitter can adaptively change its transmission power and rate according to the available energy and the remaining number of bits. We assumethatthetransmission rateandtransmit powerarerelated throughafunction, f(p), i.e., r = f(p). We assume that f(p) satisfles the following properties: i) f(0) = 0 and f(p) ! 1 as p ! 1, ii) f(p) increases monotonically in p, iii) f(p) is strictly concave in p, iv) f(p) is continuously difierentiable, and v) f(p)=p decreases monotonically in p. Properties i)-iii) guarantee that f?1(r) exists and is strictly convex. Property v) implies that for a flxed amount of energy, the number of bits that can be transmitted increases as the transmission duration increases. It can be verifled that these properties are satisfled in many systems with realistic encoding/decoding schemes, such as optimal random coding in single-user additive white Gaussian noise channel, where f(p) = 12 log(1+p). Assuming the transmitter changes its transmission power N times before it flnishes the transmission, let us denote the sequence of transmission powers as p1, p2, :::, pN, and the corresponding transmission durations of each rate as l1, l2, :::, lN, respectively; see Figure 5.4. Then, the energy consumed up to time t, denoted as E(t), and the total number of bits departed up to time t, denoted as B(t), can 116 be related through the function g as follows: E(t) = ?iX i=1 pili +p?i+1 ? t? ?iX i=1 li ! (5.1) B(t) = ?iX i=1 f(pi)li +f(p?i+1) ? t? ?iX i=1 li ! (5.2) where ?i = maxfi : Pij=1 lj ? tg. B0 E1 ??? ??? sK t p1 pNp2 p3 s2 l1 l2 l3 lN s1 E0 EKE2 0 T Figure 5.4: The sequence of transmission powers and durations. Then, the transmission completion time minimization problem can be formu- lated as: min p;l T s.t. E(t) ? X i:si pi+1. The total energy consumed over this duration is pili +pi+1li+1. Let p0i = p0i+1 = pili +pi+1li+1l i +li+1 (5.4) r0i = r0i+1 = f p ili +pi+1li+1 li +li+1 ? (5.5) Then, we have p0i ? pi, p0i+1 ? pi+1. Since p0ili ? pili, the energy constraint is still satisfled, and thus, the new energy allocation is feasible. We use r0i;r0i+1 to replace ri;ri+1 in the transmission policy, and keep the rest of the rates the same. Then, the total number of bits transmitted over the duration li +li+1 becomes r0ili +r0i+1li+1 = f p ili +pi+1li+1 li +li+1 ? (li +li+1) ? f (pi) lil i +li+1 (li +li+1)+f (pi+1) li+1l i +li+1 (li +li+1) = rili +ri+1li+1 (5.6) where the inequality follows from the fact that f(p) is concave in p. Therefore, the new policy departs more bits by time Pi+1j=1 lj. Keeping the remaining transmission rates the same, the new policy will flnish the entire transmission over a shorter duration. Thus, the original policy could not be optimal. Therefore, the optimal policy must have monotonically increasing powers (and rates). 2 Lemma 5.2 The transmission power/rate remains constant between energy har- vests, i.e., the power/rate only potentially changes when new energy arrives. 118 Proof: Assume that the transmitter changes its transmission rate between two energy harvesting instances si, si+1. Denote the rates as rn, rn+1, and the instant when the rate changes as s0i, as shown in Figure 5.5. Now, consider the duration [si;si+1). The total energy consumed during the duration is pn(s0i ?si) + pn+1(si+1 ?s0i). Let p0 = pn(s 0 i ?si)+pn+1(si+1 ?s 0 i) si+1 ?si (5.7) r0 = f p n(s0i ?si)+pn+1(si+1 ?s0i) si+1 ?si ? (5.8) Now let us use r0 as the new transmission rate over [si;si+1), and keep the rest of the rates the same. It is easy to check that the energy constraints are satisfled under this new policy, thus this new policy is feasible. On the other hand, the total number of bits departed over this duration under this new policy is r0(si+1 ?si) = f p n(s0i ?si)+pn+1(si+1 ?s0i) si+1 ?si ? (si+1 ?si) ? f(pn) s 0 i ?si si+1 ?si +f(pn+1) si+1 ?s0i si+1 ?si ? (si+1 ?si) = rn(s0i ?si)+rn+1(si+1 ?s0i) (5.9) where the inequality follows from the fact that f(p) is concave in p. Therefore, the total number of bits departed under the new policy is larger than that under the original policy. If we keep all of the remaining rates the same, the transmission will be completed at an earlier time. This con icts with the optimality of the original policy. 2 119 ??? sprimei rn+1 Ei Ei+1 si+1 rprime rn ??? ??? t??? si Figure 5.5: The rate must remain constant between energy harvests. Lemma 5.3 Under the optimal policy, whenever the transmission rate changes, the energy consumed up to that instant equals the energy harvested up to that instant. Proof: From Lemma 5.2, we know that the transmission rate can change only at certain energy harvesting instances. Assume that the transmission rate changes at si, however, the energy consumed by si, which is denoted by E(si), is less than Pi?1 j=0 Ej. We denote the energy gap by ?. Let us denote the rates before and after si by rn, rn+1. Now, we can always have two small amounts of perturbations ?n, ?n+1 on the corresponding transmit powers, such that p0n = pn +?n (5.10) p0n+1 = pn+1 ??n+1 (5.11) ?nln = ?n+1ln+1 (5.12) We also make sure that ?n and ?n+1 are small enough such that ?nln < ?, and p0n ? p0n+1. If we keep the transmission rates over the rest of the duration the same, under the new transmission policy, the energy allocation will still be feasible. The 120 total number of bits departed over the duration (Pn?1i=1 li;Pn+1i=1 li) is f(p0n)ln +f(p0n+1)ln+1 ? f(pn)ln +f(pn+1)ln+1 (5.13) where the inequality follows from the concavity of f(p) in p, and the fact that pnln + pn+1ln+1 = p0nln + p0n+1ln+1, pn ? p0n ? p0n+1 ? pn+1, as shown in Figure 5.6. This con icts with the optimality of the original allocation. 2 pprimenpn primen+1 pn+1 p r Figure 5.6: f(p) is concave in p. We are now ready to characterize the optimum transmission policy. In order to simplify the expressions, we let i0 = 0, and let sm+1 = T if the transmission completion time T lies between sm and sm+1. Based on Lemmas 5.1, 5.2 and 5.3, we can characterize the optimal policy in the following way. For given energy arrivals, we plot the total amount of harvested energy as a function of t, which is a staircase curve as shown in Figure 5.7. The total energy consumed up to time t can also be represented as a continuous curve, as shown in Figure 5.7. In order to satisfy the feasibility constraints on the energy, energy consumption curve must lie below the energy harvesting curve at all times. Based on Lemma 5.2, we know that the optimal energy consumption curve must be 121 linear between any two consecutive energy harvesting instants, and the slope of the segment corresponds to the transmit power level during that segment. Lemma 5.3 implies that whenever the slope changes, the energy consumption curve must touch the energy harvesting curve at that energy harvesting instant. Therefore, the flrst linear segment of the energy consumption curve must be one of the lines connecting the origin and any corner point on the energy harvesting curve before t = T. Because of the monotonicity property of the power given in Lemma 5.1, among those lines, we should always pick the one with the minimal slope, as shown in Figure 5.7. p2 B0 p1 p3 t0 p3 s1 E1 E2 E3 s2 s3 p2 s4 E4 summationtextEi E0 sK ??? ??? EK p1 si1 Tsi2 Figure 5.7: An interpretation of transmission policies satisfying Lemmas 5.1, 5.2, 5.3. Otherwise, either the feasibility constraints on the energy will not be satisfled, or the monotonicity property given in Lemma 5.1 will be violated. For example, if we choose the line ending at the corner point at s3, this will violate the feasibility constraint, as the energy consumption curve will surpass the energy arrival curve. On the other hand, if we choose the line ending at the corner point at s1, then the monotonicity property in Lemma 5.1 will be violated, because in that case, the slope 122 of the following segment would be smaller. These properties must hold similarly for p2, p3, :::, pN. We also observe that, for given T, the optimal transmission policy is the tightest string below the energy harvesting curve connecting the origin and the total harvested energy by time T. This is similar to the structure in [14]. We state the structure of the optimal policy formally in the following theorem. Theorem 5.1 For a given B0, consider a transmission policy with power vector p = [p1;p2;:::;pN] and corresponding duration vector l = [l1;l2;:::;lN]. This policy is optimal if and only if it has the following structure: NX n=1 f(pn)ln = B0 (5.14) and for n = 1;2;:::;N, in = arg min i:si?Ts i>sin?1 (Pi?1 j=in?1 Ej si ?sin?1 ) (5.15) pn = Pin?1 j=in?1 Ej sin ?sin?1 (5.16) ln = sin ?sin?1 (5.17) where in is the index of the energy arrival epoch when the power pn switches to pn+1, i.e., at t = sin, pn switches to pn+1. The proof of this theorem is given in Appendix 5.6.1. Therefore, we conclude that if the overall transmission duration T is known, then the optimal transmission policy is known via Theorem 5.1. In particular, 123 optimal transmission policy is the one that yields the tightest piecewise linear energy consumption curve that lies under the energy harvesting curve at all times and touches the energy harvesting curve at t = T. On the other hand, the overall transmission time T is what we want to minimize, and we do not know its optimal value up front. Consequently, we do not know up front which energy harvests will be utilized. For example, if the number of bits is small, and E0 is large, then, we can empty the data queue before the arrival of E1, thus, the rest of the energy arrivals are not necessary. Therefore, as a flrst step, we flrst obtain a good lower bound on the optimal transmission duration. We flrst illustrate our algorithm through an example in Figure 5.8. We flrst compute the minimal energy required to flnish the transmission before s1. We denote it as A1, and it equals A1 = f?1 B 0 s1 ? s1 (5.18) Then, we compare it with E0. If A1 < E0, it implies that we can complete the transmission before the arrival of the flrst energy harvest, thus E1 is not necessary for the transmission. We allocate E0 evenly to B0 bits, and the duration A1 is the minimum transmission duration. On the other hand, if A1 > E0, which is the case in the example, the flnal transmission completion time should be longer than s1. Thus, we move on and compute A2, A3, A4, and flnd that A2 > P1i=0 Ei, A3 > P2i=0 Ei and A4 < P3i=0 Ei. This means that the total transmission completion time will be larger than s3 and energies E0, :::, E3 will surely be utilized. Then, we allocate 124 P3 i=0 Ei evenly to B0 bits and obtain a constant transmission power ~p1, which is the dotted line in the flgure. The corresponding transmission duration is T1. Based on our allocation, we know that the flnal optimal transmission duration T must be greater than T1. This is because, this allocation assumes that all E0, :::, E3 are available at the beginning, i.e., at time t = 0, which, in fact, are not. Therefore, the actual transmission time will only be larger. Thus, T1 is a lower bound for T. Next, we need to check the feasibility of ~p1. Observing the flgure, we flnd that ~p1 is not feasible since it is above the staircase energy harvesting curve for some duration. Therefore, we connect all the corner points on the staircase curve before t = T1 with the origin, and flnd the line with the minimum slope among those lines. This corresponds to the red solid line in the flgure. Then, we update ~p1 with the slope p1, and the duration for p1 is l1 = si1. We repeat this procedure at t = si1 and obtain p2, and continue the procedure until all of the bits are flnished. summationtextE i si1 T t0 E1 E2 E3 s2 s3 s4 E4 Tprime1T1 ?p1 B0 A4 A3A2 A1 s1 E0 sK ??? ??? EK p1 Figure 5.8: An illustration of the algorithm. We state our algorithm for the general scenario as follows: First, we compute 125 the amounts of energy required to flnish the entire transmission before s1, s2, :::, sK, respectively, at a constant rate. We denote these as Ai: Ai = f?1 B 0 si ? si; i = 1;2;:::;K (5.19) Then, we compare Ai withPi?1j=0 Ej, and flnd the smallest i such that Ai ?Pi?1j=0 Ej. We denote this i as ~i1. If no such ~i1 exists, we let ~i1 = K +1. Now, we assume that we can useP~i1?1j=0 Ej to transmit all B0 bits at a constant rate. We allocate the energy evenly to these bits, and the overall transmission time T1 is the solution of f ?P~i 1?1 j=0 Ej T1 ! T1 = B0 (5.20) and the corresponding constant transmit power is p1 = P~i1?1 j=0 Ej T1 (5.21) Next, we compare p1 with Pi?1 j=0 Ej si for every i < ~i1. If p1 is smaller than every term, then, maintaining p1 is feasible, and the optimal policy is to transmit at a constant transmission rate f(p1) with duration T1, which gives the smallest possible transmission completion time, si1 = s~i1. Otherwise, maintaining p1 is infeasible 126 under the given energy arrival realization. Thus, we update i1 = argmin i<~i1 (Pi?1 j=0 Ej si ) (5.22) p1 = Pi1?1 j=0 Ej si1 (5.23) i.e., over the duration [0;si1), we choose to transmit with power p1 to make sure that the energy consumption is feasible. Then, at time t = si1, the total number of bits departed is f(p1)si1, and the remaining number of bits is B0 ? f(p1)si1. Subsequently, with initial number of bits B0 ?f(p1)si1, we start from si1, and get another lower bound on the overall transmission duration T2, and repeat the proce- dure above. Through this procedure, we obtain p2;p3;:::;pN, and the corresponding i2;i3;:::;iN, until we flnish transmitting all of the bits. Based on our allocation algorithm, we know that p1 is optimum up to time T1, since it corresponds to the minimal slope line passing through the origin and any corner point before t = T1. However, the algorithm also implies that the flnal transmission duration T will be larger than T1. The question then is, whether p1 is still the minimum slope line up to time T. If we can prove that p1 is lower than the slopes of the lines passing through the origin and any corner point in [T1;T], then, using Theorem 5.1, we will claim that p1 is the optimal transmission policy, not only between [0;T1], but also between [0;T]. The fact that this will be the case can be illustrated through the example in Figure 5.8. We note that, clearly, T1 is a lower bound on the eventual T. If we 127 keep transmitting at power p1, if no additional energy arrives, the energy harvested up until s~i1, i.e., P~i1?1i=0 Ei, will be depleted by time T01. We will next prove that T01 is an upper bound on T. Because of the assumption that f(p)=p is decreasing in p, we can prove that under this policy, the number of bits departed up to time T01 is greater than B0. Therefore, since potentially additional energy will arrive, T01 provides an upper bound. Thus, we know that the optimal T lies between T1 and T01. We next note that if we connect the origin with any corner point of the staircase curve between T1 and T01, the slope of the resulting line will be larger than p1, thus, p1 will be the smallest slope not only up to time T1, which is a lower bound, but also up to time T01, which is an upper bound. This proves that while we do not know the optimal T, if we run the algorithm with respect to the lower bound on T, i.e., T1, it will still yield an optimal policy, in that the resulting policy will satisfy Theorem 5.1. We prove the optimality of the algorithm formally in the following theorem. Theorem 5.2 The allocation procedure described above gives the optimal transmis- sion policy. The proof of this theorem is given in Appendix 5.6.2. 5.3 Scenario II: Packets Arrive During Transmissions In this section, we consider the situation where packets arrive during transmissions. We assume that there is an E0 amount of energy available at time t = 0, and at times s1, s2, :::, sK, energy is harvested in amounts E1, E2, :::, EK, respectively, 128 as in the previous section. We also assume that at t = 0, we have B0 bits available, and at times t1, t2, :::, tM, bits arrive in amounts B1, B2, :::, BM, respectively. This system model is shown in Figure 5.2. Our objective is again to minimize the transmission completion time, T, which again is the time by which the last bit is delivered to the destination. Let us denote the sequence of transmission powers by p1, p2, :::, pN, and the corresponding transmission durations by l1, l2, :::, lN. Then, the optimization problem becomes: min p;l T s.t. E(t) ? X i:si ri+1, with duration li, li+1, respectively. If i+1 6= N, then, let r0i = r0i+1 = rili +ri+1li+1l i +li+1 (5.25) p0i = p0i+1 = f?1 r ili +ri+1li+1 li +li+1 ? (5.26) Since ri > r0i = r0i+1 > ri+1, pi > p0i = p0i+1 > pi+1, it is easy to verify that the new policy is feasible up to the end of li+1, from both the data and energy arrival points of view. On the other hand, based on the convexity of f?1, the energy spent over the duration li + li+1 is smaller than pili + pi+1li+1. If we allocate the saved energy over to the last transmission duration, without con icting any energy or data constraints, the transmission will be completed in a shorter duration. If i+1 = N, then, we let p0i = p0i+1 = pili +pi+1li+1l i +li+1 (5.27) r0i = r0i+1 = f p ili +pi+1li+1 li +li+1 ? (5.28) Then, from (5.6), under the new policy, the last bit will depart before the end of li+1. The energy and data arrival constraints are satisfled over the whole transmission duration. Consequently, the original policy could not be optimal. Therefore, the optimal policy must have monotonically increasing rates (and powers). 2 Lemma 5.5 Under the optimal policy, the transmission power/rate remains con- stant between two event epoches, i.e., the rate only potentially changes when new 130 energy is harvested or a new packet arrives. Proof: This lemma can be proved through a procedure similar to that in Lemma 5.2. If power/rate is not constant between two event epoches, then, by equalizing the rate over the duration while keeping the total departures flxed, we can save energy. Allocating this saved energy to the last transmission duration, we can shorten the whole transmission duration. Thus, if power/rate is not constant between two event epoches, the policy cannot be optimal. 2 Lemma 5.6 If the transmission rate changes at an energy harvesting epoch, then the energy consumed up to that epoch equals the energy harvested up to that epoch; if the transmission rate changes at a packet arrival epoch, then, the number of packets departed up to that epoch equals the number of packets arrived up to that epoch; if the event epoch has both energy and data arrivals at the same time, then, one of the causality constraints must be met with equality. Proof: This lemma can be proved through contradiction using techniques similar to those used in the proof of Lemma 5.3. When the transmission rate changes at an energy harvesting epoch, if the energy consumed up to that time is not equal to the total amount harvested, then, without con icting the energy causality constraint, we can always increase the rate before that epoch a little and decrease the rate after that epoch a little while keeping the total departures flxed. This policy would save some energy which can be used to shorten the transmission durations afterwards. Thus, the energy constraint at that epoch must be satisfled as an equality. The remaining situations can be proved similarly. 2 131 Based on Lemmas 5.4, 5.5 and 5.6, we can identify the structure of the unique optimal transmission policy as stated in the following theorem. In order to simplify the notation, we deflne ui to be the time epoch when the ith arrival (energy or data) happens, i.e., u1 = minfs1;t1g (5.29) u2 = minfsi;tj : si > u1;tj > u1g (5.30) and so on, until the last arrival epoch. Theorem 5.3 For a given energy harvesting and packet arrival proflle, the trans- mission policy with a transmission rate vector r = [r1;r2;:::;rN] and the correspond- ing duration vector l = [l1;l2;:::;lN] is optimal, if and only if it has the following structure: NX i=1 rili = MX i=0 Bi (5.31) r1 = min i:ui?T ( f ?P j:sj tM, then, as in the flrst scenario where packets have arrived and are ready before the transmission starts, some of the harvested energy may not be utilized to transmit the bits. In this case also, we need to get a lower bound for the flnal transmission completion time. Let un be the energy harvesting epoch right after tM. Then, starting from un, we compute the energy required to transmit PM j=0 Bj bits at a constant rate by ui, un ? ui ? uK+M, and compare them with the total energy harvested up to that epoch, i.e., Pj:sj Pi0?1 j=in?1 Ej si0 ?sin?1 , p 0 (5.38) Based on Lemma 5.3, the energy consumed up to sin?1 is equal to Pin?1?1j=0 Ej, i.e., there is no energy remaining at t = s?in?1. We consider two possible cases here. The flrst case is that si0 < sin, as shown in Figure 5.11(a). Under the optimal policy, the energy required to maintain a transmit power pn over the duration [sin?1;si0) is pn(si0 ? sin?1). Based on (5.38), this is greater than the total amount of energy harvested during [sin?1;si0), which is Pi0?1 j=in?1 Ej. Therefore, this energy allocation under this policy is infeasible. On the other hand, if si0 > sin, as shown in Figure 5.11(b), then the total amount of energy harvested over [sin;si0) is Pi0?1j=in Ej. From (5.38), we know pn = Pin?1 j=in?1 Ej sin ?sin?1 > Pi0?1 j=in?1 Ej si0 ?sin?1 > Pi0?1 j=in Ej si0 ?sin (5.39) Thus, under any feasible policy, there must exist a duration l [sin;si0), such that the transmit power over this duration is less than pn. This contradicts with Lemma 5.1. Therefore, this policy cannot be optimal. 138 siprime ??? t??? ??? Eiprime pn pprime ??? ??? sin Ein Ein+1 sin+1 ??? (a) si0 < sin siprime t??? ?????? Ein+1 sin+1 pn pprime Eiprime ??? sin Ein ??? ??? (b) si0 > sin Figure 5.11: Two difierent cases in the proof of Theorem 5.1. Next, we prove that if a policy with power vector p and duration vector l has the structure given above, then, it must be optimal. We prove this through contradiction. We assume that there exists another policy with power vector p0 and duration vector l0, and the transmission completion time T0 under this policy is smaller. We assume both of the policies are the same over the duration [0;sin?1), how- ever, the transmit policies right after sin?1, which are pn and p0n, with durations ln and l0n, respectively, are difierent. Based on the assumption, we must have pn < p0n. If ln < l0n, from Lemma 5.3, we know that the total energy available over [sin?1;sin) is equal to pnln. Since pn < p0n, p0n is infeasible over [sin?1;sin). Thus, policy p0 cannot be optimal. Then, we consider the case when ln > l0n. If T0 ? sin, then, the total energy spent over [sin?1;sin) under p0 is greater than pnln, since p0n > pn, and p0n+1 > p0n based on Lemma 5.1. If T0 < sin, since the power-rate function f is concave, the total number of bits departed over [sin?1;sin) under p is greater than that under p0. Thus, policy p0 cannot depart B0 bits over T0, and it cannot be optimal. In summary, a policy is optimal if and only if it has the structure given above, 139 completing the proof. 5.6.2 The Proof of Theorem 5.2 Let T be the flnal transmission duration given by the allocation procedure. Then, we have B(T) = B0. In order to prove that the allocation is optimal, we need to show that the flnal transmission policy has the structure given in Theorem 5.1. We flrst prove that p1 satisfles (5.16). Then, we can similarly prove that p2, p3, ::: satisfy (5.16). We know that if T = T1, then it is the minimum possible transmission comple- tion time. We know that this transmit policy will satisfy the structural properties in Theorem 5.1. Otherwise, the flnal optimal transmission time T is greater than T1, and more harvested energy may need to be utilized to transmit the remaining bits. From the allocation procedure, we know that p1 ? Pi?1 j=0 Ej si ; 8i < ~i1 (5.40) In order to prove that p1 satisfles (5.16), we need to show that p1 ? Pi?1 j=0 Ej si ; 8i : s~i1 ? si ? T (5.41) If we keep transmitting with power p1, then at T01 = P~i1?1 j=0 Ej p1 , the total number 140 of bits departed will be f(p1)T01 ? f ?P~i 1?1 j=0 Ej T1 ! T1 = B0 (5.42) where the inequality follows from the assumption that f(p)=p decreases in p. Then, (5.40) guarantees that this is a feasible policy. Thus, under the optimal policy, the transmission duration T will be upper bounded by T01, i.e., T ? P~i1?1 j=0 Ej p1 (5.43) which implies p1 ? P~i1?1 j=0 Ej T (5.44) If T ? s~i1, as shown in Figure 5.12(a), no future harvested energy is utilized for the transmissions. Then, (5.44) guarantees that (5.41) is satisfled. If T > s~i1, as shown in Figure 5.12(b), additional energy harvested after s~i1 should be utilized to transmit the data. We next prove that (5.41) still holds through contradiction. Assume that there exists i0 with s~i1 ? si0 ? T, such that (5.41) is not satisfled, i.e., p1 > Pi0?1 j=0 Ej si0 , p 0 (5.45) 141 Then, Pi0?1 j=0 Ej p1 < si0 (5.46) Combining this with (5.43), we have T < si0, which contradicts with the assumption that si0 ? T. Thus, (5.41) holds, p1 satisfles the requirement of the optimal structure in (5.40). We can then prove using similar arguments that p2, p3, ::: also satisfy the properties of the optimal solution. Based on Lemma 5.1, this procedure gives us the unique optimal policy. s?i1T1p 1 ??? ??? si1 T ??? ???s?i1?1 t E0 E?i1?1Ei1 E?i1 (a) T ? s~i1 t Eiprime si1 T pprime s?i1 siprime E?i1 T1 ?????? ??? ??? E0 Ei1 p1 (b) T > s~i1 Figure 5.12: Two difierent cases in the proof of Theorem 5.2. 5.6.3 The Proof of Theorem 5.3 First, we prove that for the optimal transmission policy, r1 must satisfy (5.32). We prove this through contradiction. If r1 does not satisfy (5.32), then, we can always flnd another ui0, such that r1 > min ( f ?P j:sj ui1, then, the transmitter cannot maintain a transmission rate that is always greater than r1 over [ui;ui0), from the energy point of view. This contradicts with Lemma 5.4. Similarly, if f P j:sj P j:tj u~i1, additional energy harvested after u~i1 should be utilized to transmit the data. We next prove that (5.49) still holds through contradiction. Assume that there exists i0 with u~i1 ? ui0 ? T, such that (5.49) is not satisfled, i.e., p1 > Pi0?1 j=0 Ej ui0 or r1 > Pi0?1 j=0 Bj ui0 (5.54) Then, we have Pi0?1 j=0 Ej p1 < ui0 or Pi0?1 j=0 Bj r1 < ui0 (5.55) Combining this with (5.52), we have T < ui0, which contradicts with the assumption that ui0 ? T. Thus, (5.49) holds, r1 satisfles the requirement of the optimal structure in (5.32). We can then prove using a similar argument that r2, r3, ::: also satisfy the structure of the optimal solution. Based on Theorem 5.3, this procedure gives us the unique optimal transmission policy. 145 Chapter 6 Optimal Packet Scheduling in a Broadcast Channel with an Energy Harvesting Transmitter 6.1 Introduction We consider a wireless communication network where users are able to harvest en- ergy from the nature. Such energy harvesting capabilities make sustainable and environmentally friendly deployment of wireless communication networks possible. While energy-e?cient scheduling policies have been well-investigated in traditional battery powered (un-rechargeable) systems [13{18], energy-e?cient scheduling in energy harvesting networks with nodes that have rechargeable batteries has only recently been considered in Chapter 5. Chapter 5 considers a single-user communi- cation system with an energy harvesting transmitter, and develop a packet schedul- ing scheme that minimizes the time by which all of the packets are delivered to the receiver. In this chapter, we consider a multi-user extension of the work in Chapter 5. In particular, we consider a wireless broadcast channel with an energy harvesting transmitter. As shown in Figure 6.1, we consider a broadcast channel with one transmitter and two receivers, where the transmitter node has three queues. The data queues store the data arrivals intended for the individual receivers, while the 146 energy queue stores the energy harvested from the environment. Our objective is to adaptively change the transmission rates that go to both users according to the instantaneous data and energy queue sizes, such that the total transmission completion time is minimized. TXB2 dataqueues RX2 B1 energyqueue E RX1 Figure 6.1: An energy harvesting two-user broadcast channel. In this chapter, we focus on flnding the optimum ofi-line schedule, by assuming that the energy arrival proflle at the transmitter is known ahead of time in an ofi-line manner, i.e., the energy harvesting times and the corresponding harvested energy amounts are known at time t = 0. We assume that there are a total of B1 bits that need to be delivered to receiver 1 and B2 bits that need to be delivered to receiver 2, available at the transmitter at time t = 0. As shown in Figure 6.2, energy arrives (is harvested) at points in time marked with ?; in particular, Ek denotes the amount of energy harvested at time sk. Our goal is to develop a method of transmission to minimize the time, T, by which all of the data packets are delivered to their respective receivers. The optimal packet scheduling problem in a single-user energy harvesting com- munication system is investigated in Chapter 5. In Chapter 5, we prove that the op- timal scheduling policy has a \majorization" structure, in that, the transmit power 147 sK T E1 ??? t (B1,B2) s1 s2 E0 EKE2 0 Figure 6.2: System model. (B1;B2) bits to be transmitted to users are available at the transmitter at the beginning. Energies arrive (are harvested) at points denoted by ?. T denotes the transmission completion time by which all of the bits are delivered to their respective destinations. is kept constant between energy harvests, the sequence of transmit powers increases monotonically, and only changes at some of the energy harvesting instances; when the transmit power changes, the energy constraint is tight, i.e., the total consumed energy equals the total harvested energy. In Chapter 5, we develop an algorithm to obtain the optimal ofi-line scheduling policy based on these properties. Reference [19] extends Chapter 5 to the case where rechargeable batteries have flnite sizes. We extend Chapter 5 in [20] to a fading channel. References [19, 20] investigate two related problems. The flrst problem is to maximize the throughput (number of bits transmitted) with a given deadline constraint, and the second problem is to minimize the transmission completion time with a given number of bits to transmit. These two problems are \dual" to each other in the sense that, with a given energy arrival proflle, if the maximum number of bits that can be sent by a deadline is B? in the flrst problem, then the minimum time to transmit B? bits in the second problem must be the deadline in the flrst problem, and the optimal transmission policies for these two problems must be identical. In this chapter, we will follow this \dual problems" approach. We will flrst attack and solve the flrst problem to determine the structural properties of the optimal 148 solution. We will then utilize these structural properties to develop an iterative algorithm for the second problem. Our iterative approach has the goal of reducing the two-user broadcast problem into a single-user problem as much as possible, and utilizing the single-user solution in Chapter 5. The second problem is also considered in the independent work [36] which uses convex optimization techniques to reduce the problem into local sub-problems that consider only two energy arrival epochs at a time. We flrst analyze the structural properties of the optimal policy for the flrst problem where our goal is to maximize the number of bits delivered to both users under a given deadline constraint. To that end, we flrst determine the maximum departure region with a given deadline constraint T. The maximum departure region is deflned as the set of all (B1;B2) that can be transmitted to users reliably with a given deadline. In order to do that, we consider the problem of maximizing ?1B1 + ?2B2 under the energy causality constraints for the transmitter, for all ?1;?2 ? 0. Varying ?1, ?2 traces the boundary of the maximum departure region. We prove that the optimal total transmit power policy is independent of the values of ?1, ?2, and it has the same \majorization" structure as the single-user non-fading solution. As for the way of splitting the total transmit power between the two users, we prove that there exists a cut-ofi power level for the stronger user, i.e., only the power above this cut-ofi power level is allocated to the weaker user. We then consider the second problem, where our goal is to minimize the time, T, by which a given (B1;B2) number of bits are delivered to their intended receivers. As discussed, since the second problem is \dual" to the flrst problem, the optimal 149 transmission policy in this problem has the same structural properties as in the flrst problem. Therefore, in the second problem as well, there exists a cut-ofi power level. The problem then becomes that of flnding an optimal cut-ofi power such that the transmission times for both users become identical and minimized. With these optimal structural properties, we develop an iterative algorithm that flnds the optimal schedule e?ciently. In particular, we flrst use the fact that the optimum transmit power has the same structural properties as the single-user problem, to obtain the flrst optimal total power, P1. Then, given the fact that there exists a cut-ofi power level, Pc, for the flrst user, the optimal transmit strategy depends on whether P1 is smaller or larger than Pc, which, at this point, is unknown. Therefore, we have two cases to consider. If Pc is smaller than P1, then the stronger user will always have a constant, Pc, portion of the total transmit power. This reduces the problem to a single-user problem for the second user, together with a flxed-point equation in a single variable (Pc) to be solved to ensure that the transmissions to both users end at the same time. On the other hand, if Pc is larger than P1, this means that all of P1 must be spent to transmit to the flrst user. In this case, the number of bits delivered to the flrst user in this time duration can be subtracted from the total number of bits to be delivered to the flrst user, and the problem can be started anew with the updated number of bits (B01;B2) after the flrst epoch. Therefore, in both cases, the broadcast channel problem is essentially reduced to single-user problems, and the approach in Chapter 5 is utilized recursively to solve the overall problem. 150 6.2 System Model and Problem Formulation The system model is as shown in Figures 6.1 and 6.2. The transmitter has an energy queue and two data queues (Figure 6.1). The physical layer is modeled as an AWGN broadcast channel, where the received signals at the flrst and second receivers are Y1 = X +Z1 (6.1) Y2 = X +Z2 (6.2) where X is the transmit signal, and Z1 is a Gaussian noise with zero-mean and unit- variance, and Z2 is a Gaussian noise with zero-mean and variance 2, where 2 > 1. Therefore, the second user is the degraded (weaker) user in our broadcast channel. Assuming that the transmitter transmits with power P, the capacity region for this two-user AWGN broadcast channel is [24] r1 ? 12 log2 (1+fiP) (6.3) r2 ? 12 log2 1+ (1?fi)PfiP + 2 ? (6.4) where fi is the fraction of power spent for the message transmitted to the flrst user. Let us denote f(p) , 12 log2 (1+p) for future use. Then, the capacity region is r1 ? f(fiP), r2 ? f ? (1?fi)P fiP+ 2 ? . This capacity region is shown in Figure 6.3. 151 R2 C2 R1C1 Figure 6.3: The capacity region of the two-user AWGN broadcast channel. Working on the boundary of the capacity region, we have P = 22(r1+r2) +( 2 ?1)22r2 ? 2 (6.5) , g(r1;r2) (6.6) As shown in Figure 6.1, the transmitter has B1 bits to transmit to the flrst user, and B2 bits to transmit to the second user. Energy is harvested at times sk with amounts Ek. Our goal is to select a transmission policy that minimizes the time, T, by which all of the bits are delivered to their intended receivers. The transmitter adapts its transmit power and the portions of the total transmit power used to transmit signals to the two users according to the available energy level and the remaining number of bits. The energy consumed must satisfy the causality constraints, i.e., at any given time, the total amount of energy consumed up to time t must be less than or equal to the total amount of energy harvested up to time t. Before we proceed to give a formal deflnition of the optimization problem and propose the solution, we start with the \dual" problem of this transmission 152 completion time minimization problem, i.e, instead of trying to flnd the minimal T, we aim to identify the maximum number of bits the transmitter can deliver to both users by any flxed time T. As we will observe in the next section, solving the \dual" problem enables us to identify the optimal structural properties for both problems, and these properties eventually help us reduce the original problem into simple scenarios, which can be solved e?ciently. 6.3 Characterizing D(T): Largest (B1;B2) Region for a Given Dead- line T In this section, our goal is to characterize the maximum departure region for a given deadline T. We deflne it in the following way. Deflnition 6.1 For any flxed transmission duration T, the maximum departure re- gion, denoted as D(T), is the union of (B1;B2) under any feasible rate allocation policy over duration [0;T), i.e., D(T) = Sr1(t);r2(t)(B1;B2)(r1(t);r2(t)), subject to the energy constraint Rt0 g(r1;r2)(?)d? ?Pi:si 0, and 2n = 0 if r2n > 0. Based on these KKT optimality conditions, we flrst prove an important property of the optimal policy. Lemma 6.3 The optimal total transmit power of the transmitter is independent of the value of ?1;?2, and it is the same as the single-user optimal transmit power. Speciflcally, in = arg min in?1 0, (6.16)- (6.19) imply g(r1n;r2n) = ?2PN j=n ?j ? 2 > ?1PN j=n ?j ?1 (6.20) When r2n = 0, we must have r1n > 0. Otherwise, if r1n = 0, we can always let the weaker user transmit with some power over this duration without contradicting with any energy constraints. Since there is no interference from the stronger user, the departure from the weaker user can be improved, thus it contradicts with the optimality of the policy. Therefore, when r2n = 0, 1n = 0, (6.16)-(6.19) imply g(r1n;r2n) = ?1PN j=n ?j ?1 > ?2PN j=n ?j ? 2 (6.21) 158 Therefore, we can express g(r1n;r2n) in the following way: g(r1n;r2n) = max ( ?1P N j=n ?j ?1; ?2PN j=n ?j ? 2 ) (6.22) Plotting these two curves in Figure 6.6, we note that the optimal transmit power is always the curve on the top. If ?2PN j=n ?j ? 2 > ?1PN j=n ?j ?1 for some ?n, then, we have ?2 ??1P N j=n ?j ? ?2 ??1PN j=?n ?j > 2 ?1; 8n > ?n (6.23) where the flrst inequality follows from the KKT condition that ?j ? 0 for j = 1;2;:::N. Therefore, we conclude that there exists an integer ?n, 0 ? ?n ? N, such that, when n ? ?n, r2n = 0; and when n > ?n, r2n > 0. Furthermore, (6.20)-(6.21) imply that, the energy constraint at t = s?n must be tight. Otherwise, ??n = 0, and (6.21) implies g(r1?n;r2?n) = ?1PN j=?n+1 ?j ?1 > ?2PN j=?n+1 ?j ? 2 = g(r1;?n+1;r2;?n+1) (6.24) which contradicts with (6.20). Therefore, in the following, when we consider the energy constraints, we only need to consider two segments [0;s?n) and [s?n+1;sN) separately. When n < s?n, based on (6.20), if ?n = 0, we have g(r1n;r2n) = g(r1;n+1;r2;n+1). Starting from n = 1, g(r1n;r2n) remains a constant until an energy constraint be- comes tight. Therefore, between any two consecutive epochs, when the energy con- 159 straints are tight, the power level remains constant. Similar arguments hold when n ? s?n. Therefore, the corresponding power level is Pn = Pin?1 j=in?1 Ej sin ?sin?1 (6.25) where sin?1 and sin are two consecutive epochs with tight energy constraint. Finally, we need to determine the epochs when the energy constraint becomes tight. Another observation is that g(r1?n;r2?n) must monotonically increase in n, as shown in Figure 6.6. This is because both of these two curves monotonically increase, and the maximum value of these two curves should monotonically increase also. Therefore, based on the monotonicity of the transmit power, we conclude that in = arg min in?1 2, we have r1n = 0 (6.32) r2n = 12 log2 1+ Pn 2 ? (6.33) In this scenario, all of the Pn is allocated to the second user. Let us deflne a constant power level as Pc = ? 1( 2 ?1) ?2 ??1 ?1 ?+ (6.34) Based on the solution of the local optimization problem (6.27), we establish another important property of the optimal policy as follows. Lemma 6.4 For flxed ?1, ?2, under the optimal power policy, there exists a constant cut-ofi power level, Pc, for the flrst user. If the total power level is below this cut-ofi power level, then, all the power is allocated to the flrst user; if the power level is higher than this level, then, all the power above this cut-ofi level is allocated to the second user. In the proof of Lemma 6.3, we note that the optimal power Pn monotonically increases in n. Combining Lemma 6.3 and Lemma 6.4, we illustrate the structure of the optimal policy in Figure 6.7. Moreover, the optimal way of splitting the power in each epoch is such that both user?s share of the power monotonically increases. 162 In particular, the second user?s share is monotonically increasing in time. Hence, the path followed in the (B1;B2) plane is such that it changes direction to get closer to the second user?s departure axis as shown in Figure 6.4. The dotted trajectory cannot be optimal, since the path does not get closer to the second user?s departure axis in the middle (second) power epoch. E0 sK ??? ??? EK t0 E1 E2 E3 s2 s3 s4 E4 s1 (B1,B2) T P Pc P3 P1 P2 si1 si2 Figure 6.7: Optimally splitting total power between the signals that go to the two users. Corollary 6.1 Under the optimal policy, the transmission rate for the flrst user, fr1ngNn=1, is either a constant sequence (zero or a positive constant), or an increasing sequence. Moreover, before r1n achieves its flnal constant value, r2n = 0; and when r1n becomes a constant, r2n monotonically increases in n. Based on Lemma 6.3, we observe that for flxed T, ?1 and ?2, the optimal power allocation is unique, i.e., does not depend on ?1 and ?2. However, the way the total power is split between the two users depends on ?1, ?2. In fact, the cut-ofi power level Pc varies depending on the value of ?2=?1. Therefore, for difierent values 163 of ?2=?1, the optimal policy achieves difierent boundary points on the maximum departure region, and varying the value of ?2=?1 traces the boundary of this region. In this section, we characterized the maximum departure region for any given time T. We proved that the optimal total transmit power is the same as in the single-user case, and there exists a cut-ofi power for splitting the total transmit power to both users. In the next section, we will use these structural properties to solve the original transmission completion minimization problem. 6.4 Minimizing the Transmission Completion Time T for a Given (B1;B2) In this section, our goal is to minimize the transmission completion time of both users for a given (B1;B2). The optimization problem can be formulated as minr 1;r2 T s.t. jX n=1 g(r1n;r2n)ln ? j?1X n=1 En; 8j : 0 < j ? N(T) N(T)X n=1 r1nln = B1; N(T)X n=1 r1nln = B2 (6.35) where N(T)?1 is the number of energy arrival epochs (excluding t = 0) over (0;T), and lN(T) = T ? sN(T)?1. Since N(T) depends on T, the optimization problem in (6.35) is not a convex optimization problem in general. Therefore, we cannot solve it using standard convex optimization tools. We flrst note that this is exactly the \dual" problem of maximizing the de- 164 parture region for flxed T. They are \dual" in the sense that, if the minimal trans- mission completion time for (B1;B2) is T, then (B1;B2) must lie on the boundary of D(T), and the transmission policy should be exactly the same for some (?1;?2). This is based on the fact the D(T) ? D(T0) for any T < T0. Assume (B1;B2) does not lie on the boundary of D(T). Then, either (B1;B2) cannot be achieved by T or (B1;B2) is strictly inside D(T) and hence (B1;B2) can be achieved by T0 < T. Therefore, if (B1;B2) does not lie on the boundary of D(T), then T cannot be the minimum transmission completion time. We have the following lemma. Lemma 6.5 When B1;B2 6= 0, under the optimal policy, the transmissions to both users must be flnished at the same time. Proof: This lemma can be proved based on Corollary 6.1. If the transmission completion time for both users is not the same, then over the last duration, we transmit only to one of the users, while the transmission rate to the other user is zero. This contradicts with the monotonicity of the transmission rates for both users. Therefore, under the optimal policy, the transmitter must flnish transmitting to both users at the same time. 2 This lemma is proved in [36] also, by using a difierent approach. The authors prove it in [36] mainly based on the convexity of the capacity region of the broadcast channel. For flxed (B1;B2), let us denote the transmission completion time for the flrst and second user, by T1, T2, respectively. We note that T1 and T2 depend on the 165 selection of the cut-ofi power level, Pc. In particular, T1 is monotonically decreasing in Pc, and T2 is monotonically increasing in Pc. Based on Lemma 6.5, the problem of optimal selection of Pc, can be viewed as solving a flxed point equation. In particular, Pc must be chosen such that, the resulting T1 equals T2. Therefore, we propose the following algorithm to solve the transmission completion time, T, minimization problem. Our basic idea is to try to identify the cut-ofi power level Pc in an e?cient way. Since the power allocation is similar to the single-user case (c.f. Lemma 6.3), our approach to flnd T will be similar to the method in Chapter 5. First, we aim to identify P1, the flrst total transmit power starting from t = 0 in the system. This is exactly the same as identiflcation of P1 in the corresponding single-user problem. For this, as in Chapter 5, we treat the energy arrivals as if they have arrived at time t = 0, and obtain a lower bound for the transmission completion time as in Chapter 5. In order to do that, flrst, we compute the amount of energy required to flnish (B1;B2) by s1. This is equal to g ? B1 s1 ; B2 s1 ? s1, denoted as A1. Then, we compare A1 with E0. If E0 is greater than A1, this implies that the transmitter can flnish the transmission before s1 with E0, and future energy arrivals are not needed. In this case, the minimum transmission completion time is the solution of the following equation g B 1 T ; B2 T ? T = E0 (6.36) If A1 is greater than E0, this implies that the flnal transmission completion time 166 is greater than s1, and some of the future energy arrivals must be utilized to complete the transmission. We calculate the amount of energy required to fln- ish (B1;B2) by s2, s3, ..., and denote them as A2, A3, ..., and compare these with E0 + E1;P2j=0 Ej;P3j=0 Ej, ..., until the flrst Ai that becomes smaller than Pi?1 j=0 Ej. We denote the corresponding time index as ~i1. Then, we assume that we can use P~i1?1i=0 Ei to transmit (B1;B2) at a constant rate. And, the corresponding transmission completion time is the solution of the following equation g B 1 T ; B2 T ? T = ~i1?1X i=0 Ei (6.37) We denote the solution to this equation as ~T, and the corresponding power as ~P1. From our analysis, we know that the solution to this equation is the minimum possible transmission completion time we can achieve. Then, we check whether this constant power ~P1 is feasible, when the actual energy arrival times are imposed. If it is feasible, it gives us the minimal transmission completion time; otherwise, we get P1 by selecting the minimal slope according to (6.15). That is to say, we draw all of the lines from t = 0 to the corner points of the energy arrival instances before eT, and choose the line with the smallest slope. We denote si 1 as the corresponding duration associated with P1. This is shown in Figure 6.8. Once P1 is selected, we know that it is the optimal total transmit power in our broadcast channel problem. Next, we need to divide this total power between the signals transmitted to the two users. Based on Lemma 6.4 and Corollary 6.1, if the cut-ofi power level Pc is higher than P1, then, the transmitter spends all P1 for 167 E1 E2 E3 s2 s3 s4 E4 summationtextE i A4A3 A2A1 s1 ?T T (B1,B2) ?P1 E0 sK ??? ??? EK P1 si1 t0 Figure 6.8: Determining the optimal total power level of the flrst epoch. the stronger user; otherwise, the flrst user flnishes its transmission with a constant power Pc. We will flrst determine whether Pc lies in [0;P1] or it is higher than P1. Assume Pc = P1. Therefore, the transmission completion time for the flrst (stronger) user is T1 = B1f(P 1) (6.38) Once Pc is flxed, we can obtain the minimum transmission completion time for the second user, T2, by subtracting the energy consumed by the flrst user, and treating P1 as an interference for the second user. This reduces the problem to the single-user problem for the second user with fading, where the fading level is P1+ 2 over [0;T1), and 2 afterwards. The single-user problem with fading is discussed in [20]. Since obtaining the minimal transmission completion time is not as straightforward for the fading channel, a more approachable way is to calculate the maximum number of bits departed from the second user by T1, denoted as D2(T1;Pc). In order to 168 do that, we flrst identify the optimal power allocation policy with flxed deadline T1. This can be done according to Lemma 6.3. Assume that the optimal power allocation gives us P1;P2;:::;PN(T1). Then, we allocate P1 to the flrst user over the whole duration, and allocate the remaining power to the second user. Based on (6.4), we calculate the transmission rate for the second user over each duration, and obtain D2(T1;Pc) according to D2(T1;Pc) = N(T1)X i=1 1 2 log 1+ Pn ?PcP c + 2 ? (sin ?sin?1) (6.39) We observe that, given Pc, D2(T1;Pc) is a monotonically increasing function of T1. Moreover, given T1, D2(T1;Pc) is a monotonically decreasing function of Pc. If D2(T1;Pc) is smaller than B2, it implies that T1 < T2, and we need to decrease the rate for the flrst user to make T1 and T2 equal. Based on Lemma 6.4, this also implies that the transmission power for the flrst user is a constant Pc < P1. In particular, Pc is the unique solution of the following equation. B2 = D2 B 1 f(Pc);Pc ? (6.40) Note that D2 ? B1 f(Pc);Pc ? is a continuous, strictly monotonically decreasing function of Pc, hence the solution for Pc in (6.40) is unique. Since T1 is a decreasing function of Pc and D2 ? B1 f(Pc);Pc ? is a decreasing function of Pc, we can use the bisection method to solve (6.40). In this case, the minimum transmission completion time is T = B1f(Pc). 169 If D2(T1;Pc) is larger than B2, that implies T2 < T1, and we need to increase the power allocated for the flrst user to make T1 = T2, i.e., Pc > P1. Therefore, from Lemma 6.4, over the duration [0;si1), the optimal policy is to allocate the entire P1 to the flrst user only. We allocate P1 to the flrst user, calculate the number of bits departed for the flrst user, and remove them from B1. This simply reduces the problem to that of transmitting (B01;B2) bits starting at time t = si1, where B01 = B1 ?f(P1)si1. The process is illustrated in Figure 6.9. Then, the minimum transmission completion time is T = siK + B1 ? PK i=1 f(Pk)(sik ?sik?1) f(Pc) (6.41) where K is the number of recursions needed to get Pc. sK ??? ??? EK t0 E1 E2 E3 s2 s3 s4 E4 s1 (B1,B2) T P P3 P1 P2 si1 si2 Pc E0 Figure 6.9: Search for the cutofi power level Pc iteratively. In both scenarios, we reduce the problem into a simple form, and obtain the flnal optimal policy. Before we proceed to prove the optimality of the algorithm, we introduce the following lemma flrst, which is useful in the proof of the optimality of 170 the algorithm. Lemma 6.6 f(E=T)T monotonically increases in T; f ? fiE=T (1?fiE=t)+ 2 ? T monotoni- cally increases in T also. Proof: The monotonicity of both functions can be verifled by taking deriva- tives, (f(E=T)T)0 = f(E=T)? E(2ln2)(T +E) (6.42) (f(E=T)T)00 = E2ln2 1 (T +E)2 ? 1 T(T +E) ? < 0 (6.43) where the last inequality follows since E > 0. Therefore, f(E=T)T is a strictly concave function, and its flrst derivative monotonically decreases when T increases. Since when limT!1(f(E=T)T)0 = 0, when T < 1, we have (f(E=T)T)0 > 0, therefore, the monotonicity follows. Similarly, we have f fiE=T (1?fiE=t)+ 2 ? T ?0 =12 log2? 2 +E=T?? 12 log2? 2 +(1?fi)E=T? ? E2ln2 EE + 2T + E2ln2 (1?fi)E(1?fi)E + 2T (6.44) f fiE=T (1?fiE=t)+ 2 ? T ?00 = E 2 2T ln2 1 ( 2T=(1?fi)+E)2 ? 1 ( 2T +E)2 ? <0 (6.45) Again, the concavity implies the flrst derivative is positive when T < 1, and the monotonicity follows. 2 171 Theorem 6.1 The algorithm is feasible and optimal. Proof: We flrst prove the optimality. In order to prove that the algorithm is optimal, we need to prove that P1 is optimal. Once we prove the optimality of P1, the optimality of P2, P3, ::: follows. Since the solution obtained using our algorithm always has the optimal structure described in Lemma 6.4, the optimality of the power allocation also implies the optimality of rate selection, thus, the optimality of the algorithm follows. Therefore, in the following, we prove that P1 is optimal. First, we note that P1 is the minimal slope up to eT. We need to prove that P1 is also the minimal slope up to the flnal transmission completion time, T. Let us deflne T0 as follows T0 = P~i1 n=0 En P1 (6.46) Assume that with ~P1, we allocate fi ~P1 to the flrst user, and flnish (B1;B2) using constant rates. Then, we allocate fiP1 to the flrst user, and the rest to the second user. Based on Lemma 6.6, we have f(fiP1)T0 ? f(fi ~P1)~T = B1 (6.47) f fiP 1 (1?fi)P1 + 2 ? T0 ? f ? fi ~P1 (1?fi)~P1 + 2 ! ^T = B2 (6.48) Therefore, T0 is an upper bound for the optimal transmission completion time. Since P1 is the minimal slope up to T0, we conclude that P1 is optimal throughout the transmission. Following similar arguments, we can prove the optimality of the rest 172 of the power allocations. This completes the proof of optimality. In order to prove that the allocation is feasible, we need to show that the power allocation for the flrst user is always feasible in each step. Therefore, in the following, we flrst prove that P1 is feasible when we assume that Pc = P1. The feasibility of P1 also implies the feasibility of the rest of the power allocation. With the assumption that Pc = P1, the flnal transmission time for the flrst user is T1 = B1f(P 1) ? B1f(fiP 1) (6.49) Based on (6.47) and (6.48), we know that T1 < T0. Since P1 is feasible up to T0, therefore, P1 is feasible when we assume that Pc = P1. The feasibility of the rest of the power allocations follows in a similar way. This completes the feasibility part of the proof. 2 6.5 Numerical Results We consider a band-limited AWGN broadcast channel, with bandwidth W = 1 MHz and the noise power spectral density N0 = 10?19 W/Hz. We assume that the path loss between the transmitter and the flrst receiver is about 100 dB, and the path loss between the transmitter and the second user is about 105 dB. Then, we have r1 = W log2 1+ fiPh1N 0W ? = log2 1+ fiP10?3 ? Mbps (6.50) r2 = W log2 1+ (1?fi)Ph2fiPh 2 +N0W ? = log2 1+ (1?fi)PfiP +10?2:5 ? Mbps (6.51) 173 Therefore, g(r1;r2) = 10?32r1+r2 +(10?2:5 ?10?3)2r2 ?10?2:5 W (6.52) For the energy harvesting process, we assume that at times t = [0;2;5;6;8;9;11] s, we have energy harvested with amounts E = [10;5;10;5;10;10;10] mJ. We flnd the maximum departure region D(T) for T = 6;8;9;10 s, and plot them in Figure 6.10. We observe that the maximum departure region is convex for each value of T, and as T increases, the maximum departure region monotonically expands. 0 5 10 15 20 250 2 4 6 8 10 12 14 B1 (Mbits) B 2 (Mbits) T=6 s T=8 sT=9 s T=10 s Figure 6.10: The maximum departure region of the broadcast channel for various T. Then, we aim to minimize the transmission completion time with (B1;B2) = (15;6) Mbits. Following our algorithm, we obtain the optimal transmission policy, which is shown in Figure 6.11. We note that the powers change only potentially at instances when energy arrives (Lemma 6.1); power sequence is monotonically 174 increasing and \majorized" over the whole transmission duration (Lemma 6.3). We also note that, for this case, the flrst user transmits at a constant rate, and the rate for the second user monotonically increases. The transmitter flnishes its transmis- sions to both users by time T = 9:66 s, and the last energy harvest at time t = 11 s is not used. (B1,B2)=(15,6) 10 0 2 t 10 10 1055 6 9 11 10 85 10 53 T 15 T=9.66 t P Figure 6.11: Cut-ofi power Pc = 1:933 mW. Optimal transmit rates are r1 = 1:552 Mbps, r2 = [0:274;0:680;1:369;1:834] Mbps, with durations l = [5;3;1;0:66] s. Next, we consider the example when (B1;B2) = (20;2) Mbits, we have the optimal transmission policy, as shown in Figure 6.12. In this example, the cut- ofi power is greater than P1, and therefore, P1 is allocated to the flrst user only over [0;5) s, and after t = 5 s, the flrst user keeps transmitting at a constant rate until all bits are transmitted. In this case, the transmission rates for both users monotonically increase. The transmitter flnishes its transmissions by time T = 9:25 s, and the last energy harvest is not used. 175 T 10 0 2 t 10 10 1055 6 9 11 10 85 10 5 t P (B1,B2)=(20,2) 3 40 T=9.25 Figure 6.12: Cut-ofi power Pc = 4:107 mW. Optimal transmit rates r1 = [2;2:353; 2:353;2:353] Mbps and r2 = [0;0:167;0:856;2:570] Mbps, with durations l = [5;3;1;0:25] s. 6.6 Conclusions Inthischapter, weinvestigatedthetransmission completion timeminimization prob- lem in an energy harvesting broadcast channel. We assumed that there are certain number of packets at the transmitter, ready to be transmitted to both users before the transmission starts. We flrst analyzed the structural properties of the optimal transmission policy, and proved that the optimal total transmission power has the same structure as that in the single-user communication channel. We also proved that there exists a cut-ofi power for the stronger user. If the optimal total transmis- sion power is lower than this cut-ofi level, all power is allocated to the stronger user, and when the optimal total transmission power is greater than this cut-ofi level, all power above this level is allocated to the weaker user. Based on these structural properties of the optimal policy, we developed an iterative algorithm to obtain the globally optimal ofi-line scheduling policy. 176 Chapter 7 Optimal Packet Scheduling in a Multiple Access Channel with Energy Harvesting Transmitters 7.1 Introduction E?cient energy management is crucial for wireless communication systems, as it increases the throughput and improves the delay performance. Energy e?cient scheduling policies have been well investigated in traditional battery powered (un- rechargeable) systems [13{18]. On the other hand, there exist systems where the transmitters are able to harvest energy from the nature. Such energy harvesting abilities make sustainable and environmentally friendly deployment of communi- cation systems possible. This renewable energy supply feature also necessitates a completely difierent approach to energy management. In this chapter, we consider a multi-user rechargeable wireless communication system, where data packets as well as the harvested energy arrive at the transmitters as random processes in time. As shown in Figure 7.1, we consider a two-user multiple access channel (MAC), where each transmitter node has two queues. The data queue stores the data arrivals, while the energy queue stores the energy harvested from the environment. Our objective is to adaptively change the transmission rate and power according to the instantaneous data and energy queue sizes, such that the 177 transmission completion time is minimized. E2 B1 user1 energyqueue B2 user2 receiver E1 dataqueue (a) Cs CsR2 C2 R1C1 (b) Figure 7.1: (a) An energy harvesting MAC model with energy and data queues, and (b) the capacity region of the additive white Gaussian noise MAC. In general, the arrival processes for the data and the harvested energy can be formulated as stochastic processes, and the problem requires an on-line solution that adapts transmission power and rate in real-time. Since this seems to be an intractable problem for now, we simplify the problem by assuming that the data packets and energy will arrive in a deterministic fashion, and we aim to develop an ofi-line solution instead. In this chapter, we consider the scenario where packets have already arrived before the transmissions start. Speciflcally, we consider two nodes as shown in Figure 7.2. For the tra?c load, we assume that there are a total of B1 bits and B2 bits available at the flrst and second transmitter, respectively, at time t = 0. We assume that energy arrives (is harvested) at points in time marked with ?. In Figure 7.2, E1k denotes the amount of energy harvested for the flrst user at time sk. Similarly, E2k denotes the amount of energy harvested for the second user at time sk. If there is no energy arrival at one of the nodes, we simply 178 let the corresponding amount be zero, which are denoted by the dotted arrows in Figure 7.2. Our goal then is to develop methods of transmission to minimize the time, T, by which all of the data packets from both of the nodes are delivered to the destination. B2 s2 E20 t0 s1 T E22 s2 E2K sK sK?1 sK ??? B1 E10 E1,K?1E11 t0 s1 sK?1 T ??? Figure 7.2: System model with all packets available at the beginning. Energies arrive at points denoted by ?. The optimal packet scheduling problem in a single-user energy harvesting com- munication system is investigated in Chapter 5. In Chapter 5, we prove that the op- timal scheduling policy has a \majorization" structure, in that, the transmit power is kept constant between energy harvests, the sequence of transmit powers increases monotonically, and only changes at some of the energy harvesting instances; when the transmit power changes, the energy constraint is tight, i.e., the total consumed energy equals the total harvested energy. In Chapter 5, we develop an algorithm to obtain the optimal ofi-line scheduling policy based on these properties. Reference [19] extends Chapter 5 to the case where rechargeable batteries have flnite sizes. We extend Chapter 5 in [20] to a fading channel. In the two-user MAC setting studied in this chapter, the scheduling problem is signiflcantly more complicated. This is because the two users interfere with each other, and we need to select the trans- 179 mission powers for both users as well as the rates from the resulting rate region, to solve the problem. In addition, because the tra?c load and the harvested energy for both users may not be well-balanced, the flnal transmission durations for the two users may not be the same, further complicating the problem. We flrst investigate a problem which is \dual" to the transmission completion time minimization problem. In this \dual" problem, we aim to characterize the maximum number of bits both users can transmit for any given time T. These two problems are \dual" to each in the sense that, if (B1;B2) lies on the boundary of the maximum departure region for time T?, then, T? must be the solution to the trans- mission completion time minimization problem with initial number of bits (B1;B2). We propose a generalized iterative backward waterfllling algorithm to achieve the boundary points of the maximum departure region for any given time T. Then, based on the solution of this \dual" problem, we go back to the transmission com- pletion time minimization problem, simplify it into standard convex optimization problems, and solve it e?ciently. In particular, we flrst characterize the maximum departure region for every energy arrival epoch, and based on the location of the given (B1;B2) on the maximum departure region, we narrow down the range of the minimum transmission completion time to be between two consecutive epochs. Based on this information, we propose to solve the problem in two steps. In the flrst step, we solve for the optimal power policy sequences to achieve the minimum T, so that (B1;B2) is on the maximum departure region for this T. This step can be formulated as a convex optimization problem. Then, with the optimal power policy obtained in the flrst step, we search for the optimal rate policy sequences 180 from the capacity regions deflned by the power sequences to flnish B1;B2 bits. The second step is formulated as a linear programming problem. In addition, we further simplify the problem by exploiting the optimal structural properties for two special scenarios. 7.2 System Model and Problem Formulation The system model is as shown in Figures 7.1 and 7.2. As shown in Figure 7.1, each user has a data queue and an energy queue. The physical layer is modeled as an additive white Gaussian noise channel, where the received signal is Y = X1 +X2 +Z (7.1) where Xi is the signal of user i, and Z is a Gaussian noise with zero-mean and unit-variance. The capacity region for this two-user MAC is [24] R1 ? f(P1) (7.2) R2 ? f(P2) (7.3) R1 +R2 ? f(P1 +P2) (7.4) where f(p) = 12 log(1+p). We denote the region deflned by these inequalities above as C(P1;P2). This region is shown on the right flgure in Figure 7.1. As shown in Figure 7.2, user i has Bi bits to transmit which are available at transmitter i at time t = 0. Energy is harvested at times sk with amounts Eik 181 at transmitter i. Our goal is to solve for the transmit power sequence, the rate sequence, and the corresponding duration sequence that minimize the time T by which all of the bits are delivered to the destination. We assume that the transmitters can adapt their transmit powers and rates according to the available energy level and number of bits remaining. The energy consumed must satisfy the causality constraints, i.e., for each user, the total amount of energy consumed up to time t must be less than or equal to the total amount of energy harvested up to time t by that user. Let us denote the transmit power for the flrst and second user at time t as p1(t) and p2(t), respectively. Then, the transmission rate pair (r1(t);r2(t)) must be within the capacity region deflned by p1(t) and p2(t), i.e., C(p1;p2)(t). For user i, i = 1;2, the energy consumed up to time t, denoted as Ei(t), and the total number of bits departed up to time t, denoted as Bi(t), can be written as: Ei(t) = Z t 0 pi(?)d?; Bi(t) = Z t 0 ri(?)d?; i = 1;2 (7.5) Here ri and powers pi are related through the f function as shown in (7.2)-(7.4). Then, the transmission completion time minimization problem can be formulated 182 as: minp 1;p2;r1;r2 T s.t. E1(t) ? X n:sn f(p1n)(s0i ?si)+f(p1;n+1)(si+1 ?s0i) f(p02)(si+1 ?si) > f(p2n)(s0i ?si)+f(p2;n+1)(si+1 ?s0i) f(p01 +p02)(si+1 ?si) > f(p1n +p2n)(s0i ?si)+f(p1;n+1;p2;n+1)(si+1 ?s0i) (7.9) 184 where the inequality follows from the fact that f(p) is strictly concave in p. We note that the right hand side of these inequalities characterizes the boundary of the departure region under the original policy over [si;si+1). Therefore, the departure region under the original policy is strictly inside the departure region under the new policy, which con icts with the optimality of the original policy. 2 ??? E1i E1,i+1 si+1 pprime1 p1n ??? ??? t??? ??? si sprimei p2,n+1 E2i E2,i+1 si+1 pprime2 p2n ??? ??? t??? si sprimei p1,n+1 Figure 7.3: The power/rate must remain constant between energy harvests. Therefore, in the following, we only consider policies where the rates are con- stant between any two consecutive energy arrivals. In order to simplify the notation, in this section, for any given T, we assume that there are N?1 energy arrival epochs (excluding t = 0) over (0;T). We denote the last energy arrival epoch before T as sN?1, and sN = T, with ln = T ?sn?1. Let us deflne (p1n;p2n) to be the transmit power over [sn?1;sn). Lemma 7.2 For any feasible transmit power sequences p1, p2 over over [0;T), the total number of bits departed from both of the users, denoted as B1 and B2, is a 185 pentagon deflned as 8> >>>> >< >>> >>>: (B1;B2) flfl flfl flfl flfl flfl flfl B1 ?PNn=1 f(p1n)ln B2 ?PNn=1 f(p2n)ln B1 +B2 ?PNn=1 f(p1n +g2n)ln 9> >>>> >= >>> >>>; (7.10) This lemma can be established based on the property of pentagon with 45? dominant face. Lemma 7.3 D(T) is a convex region. Proof: Consider two power policies (p1;p2) and (?p1; ?p2) over [0;T). Without loss of generality, we assume that NX n=1 f(p2n)ln > NX n=1 f(?p2n)ln (7.11) NX n=1 f(p1n +p2n)ln ? NX n=1 f(?p1n + ?p2n)ln (7.12) Let us construct a new policy as a linear combination of these two policies over [0;T), i.e., p0i = ?pi + (1 ? ?)?pi, i = 1;2, 0 < ? < 1. It is straightforward to check that the energy constraints are still satisfled, thus the new policy is feasible. Consider the upper corner points of the departure region under the policies (p1;p2) 186 and (?p1; ?p2). Because of the concave property of f(p) in p, we have NX n=1 f(p02n)ln > ? NX n=1 f(p2n)ln +(1??) NX n=1 f(?p2n)ln (7.13) NX n=1 f(p01n +p02n)ln > ? NX n=1 f(p1n +p2n)ln +(1??) NX n=1 f(?p1n + ?p2n)ln (7.14) i.e., the upper corner point of the departure region under the new policy is always above the line connecting these two upper corner points under policies (p1;p2) and (?p1; ?p2). Therefore, the union of (B1;B2) over all feasible power allocation policies is a convex region. 2 Lemma 7.4 For any T0 > T, D(T) is strictly inside D(T0). Proof: For any policy achieving the boundary point of D(T), let us flx the power sequence for one user, and change the transmit power of the other user by re- moving part of its energy consumed before T and spend it over the duration [T;T0). Since there is no interference over [T;T0), the departures for the user can be po- tentially improved while the departures for the other user is unchanged. Therefore, D(T) must be strictly inside D(T). 2 As a flrst step, we aim to explicitly characterize D(T) for any T. Similar to the capacity region of the fading Gaussian multiple access channel [30], where each boundary point is a solution to maxR2C ??R, here, in our problem, the boundary points also maximize ??B for some ?. First, let us examine three difierent cases separately. 187 7.3.1 ?1 = ?2. In this subsection, we consider the scenario where ?1 = ?2. Therefore, our problem becomes maxp1;p2 B1+B2. In Chapter 5, we examined the optimal packet scheduling policy for the single-user scenario. We observe that for any flxed T, the optimal power allocation policy has the \majorization" property. Speciflcally, we have in = arg min in?1 1, n ? 0 and np2n = 0. The optimal solution must satisfy p2n = ? 1 ?1 ?Pnj=1 ?j ?p ? 1n ?1 !+ ; n = 1;2;:::;N (7.21) 1 ?1?Pnj=1 ?j can be interpreted as the \water" level over [sn?1;sn), and p ? 1n +1 is the base water level. If ?n > 0, no energy ows across the epoch t = sn?1, and we have, 1 ?1 ?Pnj=1 ?j > 1 ?1 ?Pn?1j=1 ?j (7.22) i.e., the water level over [sn?1;sn) must be higher than that over [sn?2;sn?1). If ?n = 0, energy harvested before ows across the epoch t = sn?1, and we have, 1 ?1 ?Pnj=1 ?j = 1 ?1 ?Pn?1j=1 ?j (7.23) 192 i.e., the water level over [sn?1;sn) is equal to that over [sn?2;sn?1). Therefore, energy ows across the epoch t = sn?1 only when the water level [sn?2;sn?1) has the potential to surpass that over [sn?2;sn), and the energy ow makes the water levels even. A backward waterfllling process naturally leads to the optimal power policy. 2 The backward waterfllling procedure is shown in Figure 7.6. This power al- location deflnes another pentagon, and its lower corner point maximizes B1, which is point 3 in Figure 7.5. Similarly, we can obtain another pentagon whose upper corner point maximizes B2, which is point 4 in Figure 7.5. In general, points 3 and 4 do not coincide with the points 1 and 2, respectively, and consequently, there are curved parts connecting these corner points. B1 s4 P p13p 12 E10 ??? EK t0 s1 E11 E13 s2 s3 s4 sK E20 ??? EK t0 s1 E22 s3 s4 E24 sK B2 s2 T p11 s1 s2 s3 Figure 7.6: The optimal transmit power for the second user to maximize its depar- ture. 193 7.3.3 General ?1;?2 > 0. The curved parts can be characterized through the solution of maxB2D(T)??B for some ? > 0. Since each boundary point corresponds to a corner point on some pentagon, for ?1 > ?2, we need to solve the following problem: maxp 1;p2 (?1 ??2) X n f(p1n)ln +?2 X n f(p1n +p2n)ln s.t. jX n=1 p1nln ? j?1X n=0 E1n; 8j : 0 < j ? N jX n=1 p2nln ? j?1X n=0 E2n; 8j : 0 < j ? N (7.24) The problem in (7.24) is a convex optimization problem with linear constraints, therefore, the unique global solution satisfles the extended KKT conditions as fol- lows: ?1 ??2 1+p1n + ?2 1+p1n +p2n ? NX j=n ?j; 1 ? n ? N (7.25) ?2 1+p1n +p2n ? NX j=n flj; 1 ? n ? N (7.26) where the conditions in (7.25) and (7.26) are satisfled with equality if p1n;p1n > 0. When ?1 6= ?2, it is di?cult to obtain the optimal policy explicitly from the KKT conditions. Therefore, we adopt the idea of generalized iterative waterfllling in [37] to flnd the optimal policy. Speciflcally, given the power allocation of the second user, denoted as p?2, we optimize the power allocation of the flrst user, i.e., we aim to solve the following 194 optimization problem: maxp 1 (?1 ??2) NX n=1 f(p1n)ln +?2 NX n=1 f(p1n +p?2n)ln s.t. jX n=1 p1nln ? j?1X n=0 E1n; 0 < j ? N (7.27) Once the power allocation of the flrst user is obtained, denoted as p?1, we do a backward waterfllling for the second user to obtain its optimal power allocation. We perform the optimization for both users in an alternating way. Because of the concavity of the objective function and the Cartesian product form of the convex constraint set, it can be shown that the iterative algorithm converges to the global optimal solution, [38]. Because there is more than one term in the objective function of (7.27), the optimal policy for the flrst user does not have a backward waterfllling interpretation. However, using the method in [37], we can interpret the procedure for the flrst user as a generalized backward waterfllling operation. In order to see that, given p?2, we deflne a generalized water level bn(p1n) as the inverse of the left hand side of (7.25), i.e., bn(p1n) = ? 1 ??2 1+p1n + ?2 1+p1n +p?2n ??1 (7.28) and the base water level as bn(0), which can be seen as the modifled interference plus noise level over the duration [sn?1;sn). We generalize the form of the water level by taking the priority of users into account. Then, the KKT condition for this 195 single-user problem is 1 bn(p1n) ? NX j=n ~?j; n = 1;2;:::;N (7.29) We note that ~?j in general is difierent from the Lagrange multiplier ?j in (7.25), since p?2n need not be the optimal p2. However, because of the convergence of the iterative algorithm, ~?j converges to ?j eventually as well. Therefore, under the deflnition of the generalized water level bn(p1n), we can also interpret the optimal solution for the flrst user as a generalized backward wa- terfllling process. We flrst flll E1;N?1 over the duration [sN?1;sN), with the base water level bN(0). This step gives us an updated water level bN(E1;N?1=lN). Then, we move backward to the duration [sN?2;sN?1), and flll E1;N?2 over that duration until it is depleted, or the water level becomes equal to bN(E1;N?1=lN). Once the latter happens, we flll the remaining energy over the durations [sN?2;sN?1) and [sN?1;sN) in a way that the water level always becomes even. We repeat the steps until E10 is flnished. This allocation gives the optimal p1 when the power of the second user is flxed. The optimality of this procedure can be proved in the same way as in the proof of Theorem 7.1. Therefore, in this section, we determined the largest (B1;B2) region for any given T, i.e., D(T). We also determined the optimal power/rate allocation policy that achieves the points on the boundary of this (B1;B2) region. However, we recall that our goal is to flnd the minimum time, T, by which we can transmit given flxed number of bits (B1;B2). In the next section, we go back to our original problem, 196 and provide a solution for it, using our flndings in this section. 7.4 Minimizing the Transmission Completion Time T for a Given (B1;B2) For a given pair (B1;B2), in order to minimize the transmission completion time of both users, we need to obtain T such that (B1;B2) lies on the boundary of the departure region D(T), as shown in Figure 7.5. However, D(T) depends on T, which is the objective we want to minimize, and is unknown upfront. Therefore, in order to solve the problem, we flrst calculate D(t) for t = s1;s2;:::;sK. Then, we locate (B1;B2) on the maximum departure region. If (B1;B2) is exactly on the boundary of D(t) for some t = si, then, based on the \duality" of these two problems, we know that this si is exactly the minimum trans- mission completion time the system can achieve, and the corresponding power and rate allocation policy achieving this point is the optimal policy. If (B1;B2) is outside D(si) but inside D(si+1) for some si, then, we conclude that the minimum transmission completion time, T, must lie between these two energy arriving epoches, i.e., si < T < si+1. Therefore, T ?si, denoted as t here, is the duration we aim to minimize. We propose to solve this optimization problem in two steps. In the flrst step, we aim to flnd a set of power allocation policy to ensure that (B1;B2) is on the boundary of the departure region deflned by this power allocation policy. In the second step, with the power allocation obtained in the flrst step, we flnd a set 197 of rate allocation within its corresponding capacity region, such that B1;B2 are flnished by the minimal transmission duration obtained in the flrst step. The flrst step guarantees that such a rate allocation exists. Solving the problem through these two steps signiflcantly reduces the complexity for each problem, since the number of unknown variables is about half in each problem. In addition, as we will observe, the flrst step can be formulated as a standard convex optimization problem, and the second step becomes a linear programming problem. Therefore, both steps can be solved through standard optimization tools in an e?cient way. Let us deflne the energy spent over [sn?1;sn) by the flrst and second trans- mitter as e1n, e2n, respectively. Then, let e1 = [e11;e12;:::;e1;i+1], and e2 = [e21;e22;:::;e2;i+1], we formulate the optimization problem in the flrst step as follows mine 1;e2;t t s.t. jX n=1 e1n ? j?1X n=0 E1n; 0 < j ? i+1 jX n=1 e2n ? j?1X n=0 E2n; 0 < j ? i+1 B1 ? iX n=1 f e 1n ln ? ln +f ?e1;i+1 t ? t B2 ? iX n=1 f e 2n ln ln ? ln +f ?e2;i+1 t ? t B1 +B2 ? iX n=1 f e 1n +e2n ln ? ln +f e 1;i+1 +e2;i+1 t ? t (7.30) where the last three inequality constraints simply mean that (B1;B2) 2 D(si + t). We state the problem in this form, so that the constraint set becomes convex, and 198 the problem is transformed into a standard convex optimization problem. The joint concavity of f ?et?t in (e;t) can be proved through taking second derivatives of the function with respect to e and t, and observing that the Hessian is always negative semideflnite. Therefore, the right hand side of these inequality constraints are all jointly concave, thus the constraint set is convex. Once we obtain e1;e2 and t, we divide the energy by its corresponding dura- tion, and get the optimal power policy sequences p1 and p2. Next, we perform the rate allocation in the second step. Therefore, the problem becomes that of searching for r1 and r2 from the sequence of capacity regions deflned by the sequences p1 and p2 to depart B1 and B2. This solution may not be unique. Therefore, we formulate it as a linear programming problem as follows: minr 1;r2 r1;i+1 s.t. iX n=1 r1nln +r1;i+1t = B1 iX n=1 r2nln +r2;i+1t = B2 (r1n;r2n) 2C(p1n;p2n); 0 < n ? i+1 (7.31) Here the objective function can be any arbitrary linear function in r1 and r2, since our purpose is only to obtain a feasible solution satisfying the constraints. We choose the objective function to be r1;i+1 for simplicity. The solution of the optimization problem (7.30)-(7.31) gives us an optimal power and rate allocation policies, which minimize the transmission completion time for both users. 199 Obtaining D(si) for every si requires a large number of computations, and as we will see, it is not necessary. In order to reduce the computation complexity, we aim to explore two special cases of the problem, and use the algorithm in Chapter 5 to obtain a lower bound for T. 7.4.1 (B1;B2) lies on the at part of the dominant face. For a given pair of (B1;B2), the minimum possible transmission completion time can be achieved if it lies on the at part of the dominant face of D(T) for some T. This corresponds to the scenario discussed in Section 7.3.1. Therefore, we can also treat these two users as a single-user system, and identify the value of T through the method discussed in Chapter 5. Speciflcally, we calculate the minimum energy required to flnish B1+B2 by s1, this is equal to 22 ?B 1+B2 s1 ? ?1, denoted as A1. Then, we compare A1 with E10 +E20. If A1 is smaller than E10+E20, then, the minimum possible transmission completion time is the solution to the following equation f E 10 +E20 T ? = B1 +B2T (7.32) Inthiscase, themaximumdepartureregionD(T)isapentagondeflnedbyC?E10T ; E20T ? T. If B1 < f ?E10T ?T and B2 < f ?E20T ?T, then, we always select a rate from C?E10T ; E20T ? to achieve the minimum transmission completion time. If A1 is greater than E10 + E20, then, we continue to calculate the minimum energy required to flnish B1+B2 by s2, s3, ..., denoted as A2, A3, ..., and compare 200 these withP1j=0 E1j+E2j;P2j=0 E1j+E2j;:::, until the flrst Ai that becomes smaller than Pi?1j=0 E1j +E2j. Then, the minimum possible transmission completion time is the solution of f ?Pi?1 j=0 E1j +E2j T ! = B1 +B2T (7.33) Then, we need to determine whether this constant sum of transmit power is feasible when the energy arrival times are imposed. We merge the energy arrivals from both users and plot the sum of energy as a function of time. Then, we connect the corner points up to T with the origin, and the smallest slope among the lines gives us the flrst sum of the transmit power, p1. We repeat this process, to obtain p2, p3, ..., until all of B1+B2 bits are transmitted. This gives the shortest possible transmission completion time, T1, for the system. Next, we need to determine whether (B1;B2) lies on the at part of the dom- inant face of D(T1). We obtain the region D(T1) and flnd the corner points of the at part on its dominant face through the method described in Section 7.3.1, and compare them with (B1;B2). If (B1;B2) lies within the bound, as shown in Fig- ure 7.5, this means that it is feasible to empty both queues by time T1. The only remaining step is to identify a feasible power and rate allocation sequence to achieve this lower bound. In order to obtain a feasible power allocation, we simplify the optimization 201 problem in (7.30) into the following form minp 1;p2 p11 s.t. p1n +p2n = pn; 0 < n ? i+1 B1 ? iX n=1 f(p1n)ln +f(p1;i+1)(T1 ?si) B2 ? iX n=1 f(p2n)ln +f(p2;i+1)(T1 ?si) (7.34) Again, the objective function can be arbitrary since our purpose is only to obtain a feasible solution satisfying the constraints. We choose p11 for simplicity. Once the feasible power allocation is obtained, the optimal rate allocation can be obtained by solving (7.31). 7.4.2 (B1;B2) lies on the vertical or horizontal part. If (B1;B2) does not lie on the at part of the dominant face of D(T1), then, it either lies on the vertical or horizontal parts of the boundary of D(T) for some T, or lies on the curved part of the boundary of D(T) for some T. Speciflcally, we assume that (B1;B2) is beyond the lower corner point of the at part of the dominant face of D(T1), as shown in Figure 7.7. This implies that if we keep transmitting with any policy corresponding to the point on the at part of the boundary of D(T1), by T1, we have B2 bits departed from the second user, however, there are still some more bits left in the queue of the flrst user. This situation motivates us to put more priority on the flrst user. 202 Therefore, as the second step, we consider the scenario that (B1;B2) lies on the vertical part of the boundary of D(T), for some duration T. We flrst ignore the second user, and treat the flrst user as the only user in the system. This is exactly the same situation as in the single-user scenario. We apply the algorithm in Chapter 5, and obtain the transmission duration for the flrst user, denoted as T2. T2 is the shortest possible transmission completion time for given B1. If we can depart B2 bits from the second user by T2, then T2 is the shortest transmission completion time for both users; otherwise, we cannot flnish both data queues by T2, and the flnal transmission time should be greater than T2. With T2 flxed, we obtain the optimal energy allocation for the second user through the backward waterfllling procedure described in Section 7.3.2. Once p1n and p2n are determined, we can calculate the maximum number of bits departed from the second user under the assumption that the flrst user is the primary user. This gives us a number B02. If B02 ? B2, as shown in Figure 7.7, it implies that our assumption is valid, and we can empty both queues by T2, which is also the shortest possible transmission duration for the system. If B02 < B2, this implies that we cannot depart B2 bits from the second queue byT2, therefore, theflnaltransmissiondurationcouldnotbeT2 eitherforthesystem. This leaves us with the last possibility that (B1;B2) must be on the curved part of some other region with some duration T, where T > T1;T2. Therefore, up to this point, we obtained a lower bound for the transmission completion time T, which is max(T1;T2). In order to identify an upper bound for T, we only need to calculate the maximum departure region for the energy arriving 203 B1 B2 (B1,B2)D(T1) D(T2) Figure 7.7: The minimum transmission completion time T to depart (B1;B2). epochs right after max(T1;T2), until (B1;B2) is included for some t = si. 7.5 Numerical Results We consider a band-limited additive white Gaussian noise channel, with bandwidth W = 1 MHz and noise power spectral density N0 = 10?19 W/Hz. We assume that the distance between the transmitters and the receiver is 1 Km, and the path loss is about 110 dB. Then, we have f(p) = W log2 ? 1+ phN0W ? = log2?1+ p10?2? Mbps. For the energy harvesting process, we assume that at times t = [0;2;7;11] s, we have energy harvested with amounts E = [5;5;10;10] mJ for the flrst user; at times t = [0;5;8;12] s, we have energy harvested with amounts E = [5;10;5;10] mJ for the second user. We flnd the maximum departure region D(T) for T = 7;8;11;12 s, and plot them in Figure 7.8. We observe that the maximum departure region is convex for each value of T, each boundary consists of three difierent parts, and as T increases, the maximum departure region monotonically expands. 204 0 0.5 1 1.5 2 2.5 3 3.5 40 0.5 1 1.5 2 2.5 3 B1 (Mbits) B 2 (Mbits) T=7 s T=8 s T=11 s T=12 s Figure 7.8: The maximum departure region of the multiple access channel for various T. We assume that at t = 0, we have B1 = 2:5 Mbits from the flrst user and B2 = 2:32 Mbits from the second user to transmit. We choose the numbers in such a way that the solution is expressable in simple numbers, and can be plotted conveniently. Then, using the proposed algorithm, we obtain the optimal transmission policy, which is shown in Figure 7.9. We also determine the transmission rates as r1 = [0:263;0;0:585;0:3] Mbps and r2 = [0:1155;0:585;0;0:285] Mbps. We note that, for this case, the active transmission is completed by time T = 10 s, and the energy harvests at time t = 11 s and t = 12 s are not used. We also note that (B1;B2) lies on the at part of the dominant face of D(10), therefore, we flnish the transmission of both user simultaneously at t = 10 s. Since (B1;B2) is not at the corner point, the optimal policy is not unique. We may have difierent p1 and p2 and choose difierent 205 rates accordingly to have the same departure time. However, the sequence of the sum of transmit powers is unique. 1 0 7 11 850 5 12 0 5 10 10 10 10 5 5 0 T=10 T=10 2.552 5 2.5 2 Figure 7.9: Optimal transmit powers p1 = [2;0;5;2:5] mW, p2 = [1;5;0;2:5] mW, with durations l = [5;2;1;2] s. If (B1;B2) is not well-balanced, then, it may not be on the dominant face of D(10), even though the sum B1 +B2 is the same. For example, if B1 = 2:63 Mbits and B2 = 2:19 Mbits, a simple calculation indicates that (B1;B2) lies beyond the range of the dominant face of D(10), and we cannot flnish both queues at t = 10 s. Therefore, we take the flrst user as our primary user, and calculate the minimum possible transmission time for it. The optimal policy for the flrst user is p11 = 1:43 mW over [0;7) s, and p12 = 2:67 mW over [7;10:75) s. Based on this allocation, we perform the waterfllling procedure for the second user. The optimal allocation for the second user is shown in Figure 7.10, and the maximum number of bits departed from the second user is 2:22 Mbits, which is greater than B2. This implies that the minimum transmission duration for both users is T = 10:75 s, and the data queue of the second user will be emptied earlier than the flrst user. The value of (B1;B2) may be such that it is neither on the at part of the dominant face nor on the vertical part of the boundary of any D(T). For example, 206 2.11 5 20 7 11 850 3.54 12 5 10 10 10 10 5 5 1.43 2.67 T=10.75 1 p11 p12 Figure 7.10: Optimal transmit powers p1 = [1:43;1:43;2:67] mW, p2 = [1;3:54;2:11] mW, with durations l = [5;2;3:75] s. let B1 = 2:58 Mbits and B2 = 2:24 Mbits (note that the sum B1+B2 is the same as in the previous two examples). From our flrst example, we know that it is beyond the dominant face ofD(10). Then, we use the method for the second example to flnd the minimum transmission time for the flrst user by treating it as the primary user. Calculation indicates that the minimum transmission duration for the flrst user is T = 9:7 s, and the corresponding power allocation is p11 = 1:43 mW over [0;7) s, and p12 = 3:7 mW over [7;9:7) s. Then, since T < 10 s, and 10 s is the minimum possible transmission duration for the system, it implies that the total number of bits departed by T = 9:7 s is strictly less than B1 + B2. Therefore, we cannot flnish the second queue by T = 9:7 s. Based on this analysis, we conclude that (B1;B2) must be on the curved part of D(T) for some T. Then, since it lies within D(11), together with the lower bound max(10;9:7) = 10 s, we solve the optimization problem described in (7.31). The optimal policy is shown in Figure 7.11. We observe that the sum of the transmit powers is always increasing, even though they are not 207 monotonically increasing for each individual user. The power changes at t = 2 s and t = 8 s, where the energy constraints are satisfled with equality for the second user. T 7 11 850 12 5 10 10 10 10 5 5 1.86 0.35 3.03 2.381.144.431 T=10.13.63 5 20 Figure 7.11: Optimal transmit powers p1 = [1:86;0:35;3:63;3:03] mW, p2 = [1;4:43; 1:14;2:38] mW, with durations l = [5;2;1;2:1] s. These three pairs of (B1;B2) are plotted in Figure 7.12. Although the sum of B1;B2 is the same, they corresponds to difierent scenarios discussed before, and lies on difierent parts of the boundary of their corresponding maximum departure regions. 2.1 2.2 2.3 2.4 2.5 2.6 2.72.1 2.2 2.3 2.4 2.5 2.6 2.7 B1 (Mbits) B 2 (Mbits) T=10 sT=10.75 s T=11 s Figure 7.12: The maximum departure region of the multiple access channel for various T. 208 7.6 Conclusions In this chapter, we investigated the transmission completion time minimization problem in an energy harvesting multiple access communication system. We as- sumed that the packets have already arrived and are ready to be transmitted at the transmitter before the transmission starts. We flrst proposed a generalized iterative backward waterfllling algorithm and characterized the maximum departure region for any given deadline constraint T. Then, based on these flndings, we simplifled the transmission completion time minimization problem into convex optimization problems, and solved it e?ciently. 209 Chapter 8 Average Delay Minimization for an Energy Constrained Single-User Channel 8.1 Introduction Our objective in this chapter is to minimize the packet delay in a general energy constrained system, where the transmitter may harvest energy from the nature. We aim to develop optimal transmission policies that take into account the randomness both in the arrivals of the data packets as well as in the arrivals of harvested energy. As shown in Figure 8.1, we will consider a single node, where packets arrive at random times marked with ? and energy arrives (is harvested) at random points in time marked with ?. In Figure 8.1, Bi denotes the number of bits in the ith arriving data packet, and Ei denotes the amount of energy in the ith energy arrival (energy harvesting). our objective is to minimize the overall delay of the packets subject to the energy constraints on the transmitter. The delay includes both the queuing time and the transmission time for the packet. Our aim is to adaptively allocate the energy over all packets according to the available amount of energy and number of packet at the transmitter, in a way to minimize the overall delay of the system. The most general version of the problem is complicated. In this chapter, we will consider three scenarios, starting with the simplest setting and proceeding with 210 ??? E1 B0 B1 B2 BM t0 t1 t2 tMsK Ts1 ??? E0 EK Figure 8.1: System model with random packet and energy arrivals. Data packets arrive at points denoted by ? and energies arrive (are harvested) at points denoted by ?. progressively more complicated settings. In the flrst scenario, we assume that the transmitter has a flxed number of packets to transmit, and a flxed amount energy to use in its transmissions. We formulate the problem as a convex minimization problem. We use a Lagrangian based approach, and develop an iterative algorithm. The iterative algorithm is guaranteed to converge to the unique global optimum solution. In the second scenario, we assume that the transmitter has a flxed amount of energy, but the packets arrive during the transmissions. We also formulate the problem as a convex minimization problem. However, even though the overall cost function is convex in the energies allocated to the packets, it is not difierentiable. The reason for this is that the cost function takes difierent forms in difierent re- gions of allowable energy distributions. In other words, the energy allocated to a packet afiects the form of the cost function for later packets. For this setting, unlike [13], the problem does not admit a closed-form solution. Therefore, we develop an iterative algorithm that is based on the principle of decreasing the overall delay at each iteration. We prove that the proposed algorithm decreases the overall delay monotonically. However, due to the non-difierentiability of the overall delay func- 211 tion, the proposed algorithm may converge to a suboptimal flxed point. In order to overcome this problem, we use two modiflcations on our algorithm: increasing the dimensionality of the sub-problem solved at each iteration (i.e., considering more than two packets at any given iteration), and ?-perturbation of the sub-optimal flxed points. In addition, we develop a dynamic programming (DP) based formulation for the same problem. In the third scenario, we assume that the transmitter has a flxed number of packets available at the beginning, but the energy arrives during the transmissions. This models an energy harvesting transmitter which harvests energy from the na- ture by using a rechargeable battery. In this scenario, a certain amount of energy from the battery is allocated to a packet for its transmission. In order to shorten the transmission time, a packet may hold its transmission until the battery gathers enough energy. This on the other hand increases its waiting time in the queue. Therefore, in this scenario, there is a trade-ofi between the waiting time and trans- mission time for the packets. This problem is not convex in general, and we develop a DP formation to obtain the optimal solution. 8.2 ScenarioI:PacketsandEnergyReadyBeforeTransmissionStarts In many situations, such as multimedia communications, the source (video, music, etc.) may be available at the server waiting to be downloaded to their destinations. In sensor networks, a node may have gathered a number of packets before the transmission starts. In these scenarios, minimizing the overall transmission delay 212 with a given amount of energy is an important problem. We consider a non-fading single-user wireless channel. We assume that there are M packets available at the transmitter at t = 0; see Figure 8.2. The packets have a uniform size, which is B0 bits per packet. The transmitter has a total energy constraint which is denoted by E0. Let ei denote the energy allocated for the transmission of packet i, then PMi=1 ei ? E0. We can express the relationship between the transmission duration of ?i and the energy spent in its transmission ei, for packet i, as a deterministic function ?i = f(ei). Without loss of generality, as in [13, 18], we assume that f(e) satisfles the following properties: i) f(e) ? 0, ii) f(e) decreases monotonically in e, iii) f(e) is strictly convex in e, iv) f(e) is continuously difierentiable, and v) f(e) ! 1 as e ! 0. As shown in [13, 18], the flrst four conditions are satisfled in realistic channel coding schemes. The last condition is reasonable as a packet cannot be transmitted with zero energy. f(e2) f(e1) f(e3) ::: Figure 8.2: System model when all packets and energy are ready before the trans- mission starts. Therefore, for the ith packet, the delay Di can be expressed as Di = iX k=1 ?k = iX k=1 f(ek) (8.1) 213 Then, our optimization problem becomes min MX i=1 (M ?i+1)f(ei) s.t. MX i=1 ei ? E0 ei ? 0; i = 1;:::;M (8.2) We note that, since all the packets have arrived before the transmission starts, the cost function has a flxed form. This makes the optimization problem tractable. The problem in (8.2) is a convex optimization problem, and there exists a unique global optimum solution that satisfles the KKT optimality conditions. We note that because of property v) of f(e), no ei can be zero, as it would require the cost function to go to inflnity. As a result, the KKTs can be expressed as (M ?i+1)f0(ei)+? = 0 (8.3) i.e., as ei = f0?1 ?? M ?i+1 ? ; i = 1;2;:::;M (8.4) where ? is the non-negative Lagrange multiplier which is chosen such thatPik=1 ek = E0. In the following, we also devise an iterative algorithm to solve this problem. Initially, we allocate the total energy E0 to the flrst packet. Then, we consider the flrst two packets, and optimize the distribution of the total energy E0 over these 214 two packets, in a way to minimize the overall delay, while we keep the energies allocated to the rest of the packets flxed. We continue this process until we reach the last packet, then we return to the flrst packet. The local optimization in the kth iteration becomes min (M ?i+1)f(eki)+(M ?i)f(eki+1) s.t. eki +eki+1 = ek?1i +ek?1i+1; eki;eki+1 ? 0 (8.5) It is easy to prove that this algorithm converges to a flxed point, since the algorithm monotonically decreases the cost function which is lower bounded by zero. Assume that ek converges to a flxed point, ?e, we need to show that ?e is the solution to (8.2). From the KKTs of the local optimization, we have Mf0(?e1) = (M ?1)f0(?e2) = ::: = f0(?eM) (8.6) We also have PMi=1 ?ei = E0. Therefore, ?e satisfles the global KKT conditions in (8.3) and is the globally optimal point. Based on the properties of f(e), we know that f0(e) is negative and monotoni- cally increasing in e. From (8.6), we have f0(?e1) > f0(?e2) > ::: > f0(?eM). Therefore, at the optimal point, the energy spent for each packet monotonically decreases in the order of transmission. Thus, earlier packets are assigned larger energies and therefore, are transmitted quicker than the later ones. Therefore, this model for the delay minimization problem yields a solution which is in contrast with the principle 215 of lazy scheduling that the model in [13] resulted in. 8.3 Scenario II: Random Packet Arrivals We assume that M packets arrive at the transmitter during the transmissions at times t1;t2;:::;tM, where the inter-arrival times are denoted as d1;d2;:::;dM?1; see Figure 8.3. tM?1 tM dM?1 ::: DM t1 d3d2d1 t2 D1 D 2 D3 t3 t4 ::: DM?1 Figure 8.3: System model with random packet arrivals. Let Di denote the delay experienced by the ith packet, which includes the waiting time in the queue and the transmission time. Then, the delay experienced by each packet can be written recursively as, D1 = f(e1) D2 = (D1 ?d1)+ +f(e2) D3 = (D2 ?d2)+ +f(e3) ... DM = (DM?1 ?dM?1)+ +f(eM) (8.7) where (x)+ = max(0;x). Here, for the ith packet, (Di?1?di?1)+ denotes the waiting time in the queue, and f(ei) denotes the actual transmission time. Then, we can 216 express our optimization problem as min MX i=1 Di s.t. MX i=1 ei ? E0 ei ? 0; i = 1;2;:::;M (8.8) where the parameters of the optimization are the energies allocated to all packets, feigMi=1, and the givens of the optimization problem are the total energy E0 and the inter-arrival times of the packets fdigMi=1. Intuitively, the optimization problem in (8.8) is a convex optimization problem since function f(ei) is convex and a linear combination of convex functions is convex. However, the existence of (?)+ function complicates matters, and the joint convexity of the cost function with respect to all ei, i.e., with respect to e = [e1 e2 ::: eM]> needs to be proved. Theorem 8.1 The objective function in (8.8) is convex with respect to e. Proof: We will prove the convexity recursively. First, we note that D1 = f(e1) and f(e1) is convex in e1. We also note that D2 = (f(e1)?d1)+ + f(e2) and the function (f(e1)?d1)+ is convex in e because of the convexity of the function f(e1) in e1. Thus, D2 is convex in e also. Then, we look at D3 = ?(f(e1)?d1)+ +f(e2)?d2?+ +f(e3). We let F(e) = (f(e1)?d1)+ +f(e2)?d2. We note that F(e) itself is convex in e, and we need to prove that (F(e))+ is convex in e as well. Using the deflnition of (?)+, for any two 217 vectors e and e0 in the constraint set, we have ?F(e)+ +(1??)F(e0)+ ? ?F(e)+(1??)F(e0) ? F(?e+(1??)e0) (8.9) If F(?e+(1??)e0) is positive, we have ?F(e)+ +(1??)F(e0)+ ? F(?e+(1??)e0) = F(?e+(1??)e0)+ (8.10) If F(?e+(1??)e0) is negative, then F(?e+(1??)e0)+ = 0. Using the nonnegativity of the (?)+ function, we have ?F(e)+ +(1??)F(e0)+ ? 0 = F(?e+(1??)e0)+ (8.11) Therefore, using (8.10) and (8.11), we conclude that ?F(e)+ +(1??)F(e0)+ ? F(?e+(1??)e0)+ (8.12) which implies that (F(e))+ is convex in e. Therefore, D3 is convex in e as well. The convexity of (Di ?di)+ for i = 4;:::;M ? 1 can be proved in a similar 218 manner. Since the objective function can be expressed as MX i=1 Di = M?1X i=1 (Di ?di)+ + MX i=1 f(ei) (8.13) and since each term in the cost function is convex in e, the linear combination is convex in e as well. 2 Therefore, our problem is a convex minimization problem which has a convex objective function and linear constraints. However, there are two main di?culties in this optimization problem. First, since the overall delay includes both the queuing time and the transmission time of the packets, the transmission time for a packet afiects the queuing time of all of the following packets. This causes the queuing time of earlier packets to be multiply counted in the objective function. This leads to the varying coe?cients before f(ei)?s in the cost function, which implies that the convexity of f(?) alone will not provide us a closed-form solution; we note that the convexity of the cost function alone provided a closed-form solution in [13] due to the symmetry in the cost function. Secondly, because of the existence of (?)+ function in the overall delay expression, the cost function has non-difierentiable points. In addition, depending on whether the insides of (?)+ functions are negative or positive, we have 2M possible cost functions. Since the number of difierent cost functions to consider grows exponentially with the number of packets, standard Lagrangian method is not tractable here. In the following, we will use a simple 3-packet problem to illustrate the di?culties involved in solving this convex optimization problem. 219 Using the deflnition of Di in (8.7), the 3-packet problem is min f(e1)+(f (e1)?d1)+ +f(e2)+?(f (e1)?d1)+ +f(e2)?d2?+ +f(e3) s.t. e1 +e2 +e3 ? E0; e1;e2;e3 ? 0 (8.14) Opening the parentheses, we have four difierent possible cases: Case 1: Both the transmission of the flrst and second packets end before the arrival of the next packet, i.e., insides of both (?)+ functions are negative. This case is illustrated in Figure 8.4(a). In this case, we have min f(e1)+f(e2)+f(e3) s.t. f(e1) ? d1; f(e2) ? d2 e1 +e2 +e3 ? E0; e1;e2;e3 ? 0 (8.15) Case 2: The transmission of the flrst packet ends after the arrival of the second packet, while the transmission of the second packet ends before the arrival of the third packet. This case is illustrated in Figure 8.4(b). In this case, we have min 2f(e1)+f(e2)+f(e3)?d1 s.t. f(e1) > d1; f(e1)+f(e2) ? d1 +d2 e1 +e2 +e3 ? E0; e1;e2;e3 ? 0 (8.16) Case 3: The transmission of the flrst packet ends before the arrival of the 220 second packet, while the transmission of the second packet ends after the arrival of the third packet. This case is illustrated in Figure 8.4(c). In this case, we have min f(e1)+2f(e2)+f(e3)?d2 s.t. f(e1) ? d1; f(e2) > d2 e1 +e2 +e3 ? E0; e1;e2;e3 ? 0 (8.17) Case 4: The transmissions of both the flrst and the second packets end after the arrival of the next packet. This case is illustrated in Figure 8.4(d). In this case, we have min 3f(e1)+2f(e2)+f(e3)?2d1 ?d2 s.t. f(e1) > d1; f(e1)+f(e2) > d1 +d2 e1 +e2 +e3 ? E0; e1;e2;e3 ? 0 (8.18) As we see, the sub-problems in (8.15), (8.16), (8.17) and (8.18) are similar in structure, except for difierent coe?cients in front of the transmission delay times, f(ei), in the cost function. In addition, each problem has a difierent constraint set, which are all convex due to the monotonicity of f(ei) in ei. In order to solve the optimization problem in (8.14), we need to solve the four optimization problems in (8.15), (8.16), (8.17) and (8.18), and take the solution that gives us the smallest cost function, i.e., overall delay. Even though each problem is difierentiable and convex, the number of problems to be solved increases exponentially with the number of 221 packets, making this approach intractable for practical scenarios with many packet arrivals. t1 t2 t3 d2d1 D3D2D1 (a) Case 1. t1 t2 t3 d2d1 D3D1 D 2 (b) Case 2. t1 t2 t3 d2d1 D1 D2 D 3 (c) Case 3. t1 t2 t3 d2d1 D1 D3D2 (d) Case 4. Figure 8.4: Four difierent cases. 8.3.1 An Iterative Approach Because of the intractability of the global problem, in this section, we consider developing an iterative algorithm, which at any given iteration, considers a smaller local sub-problem. Similar to the FlowRight algorithm developed in [18], in this section, we consider optimizing two of the variables, the energies allocated to two consecutive packets, at any iteration, when the rest of the variables, the energies allocated to the rest of the packets, are flxed. We follow the procedure of iterative algorithm described in the previous sec- tion. Initially, we allocate the total energy E0 to the flrst packet. Then, we consider the flrst two packets, and optimize the distribution of the total energy E0 over these 222 two packets, in a way to minimize the overall delay, while we keep the energies al- located to the rest of the packets flxed. We continue this process until we reach the last packet, then we return to the flrst packet. We express the local optimization problem in terms of the energies of two consecutive packets, as follows min MX j=i Dj(eki;eki+1) s.t. eki +eki+1 = ek?1i +ek?1i+1; eki;eki+1 ? 0 (8.19) where ek?1i and ek?1i+1 denote the energies of the packets in the previous iteration. This problem can be solved relatively easily as it essentially is a single-variable optimization problem. Similarly, it is easy to prove that this algorithm converges to a flxed point, since the value of cost function monotonically decrease in each step, and it is lower bounded by zero. If the objective function had a flxed form and was twice- difierentiable, as in the previous section, we could be sure that the algorithm con- verges to the globally optimum solution. Since our cost function is not difieren- tiable at some points, the algorithm may converge to a strictly sub-optimal flxed point. Reference [38] proposes two approaches to solve the di?culty introduced by non-difierentiability in network ow problems: \multiple node relaxation method", and \?-relaxation method". We adopt these two methods here in order to escape sub-optimal flxed points. Following multiple node relaxation method, we consider sub-problems involving three or more packets, as opposed to two packets as we have done above. Similarly, following the ?-relaxation method, we move a small amount 223 of energy from one packet to another to perturb a sub-optimal flxed point. Ex- perimentally, we have observed that both methods improve the convergence of the algorithm. 8.3.2 A Dynamic Programming Approach In this section, we develop a DP approach to our delay minimization problem. In particular, we partition the problem into M stages, and deflne the state space to be E?A, whereE includesthepossibleamountsofenergyremainingatthecurrentstage and A is the set of possible queuing times associated with the packet. Speciflcally, in stage n, we deflne Sn(e;a) to be the minimal delay for the last M?n+1 packets, given the total energy remaining is e and the waiting time in the queue for the n-th packet is a, as shown in Figure 8.5. Then, we have the following recursive relationship Sn(e;a) = min 0?en?e 'a+f(e n)+Sn+1 ?e?e n;(a+f(en)?dn)+ ?? (8.20) for n = 1;2;:::;M, and SM+1(e;a) = 0. During the process of solving the recursive equations backwards, we keep track of en that leads to the minimum value. Let us denote the minimizing values as ^en(e;a) for n = 1;2;:::;M. After computing the functions fSn(e;a);0 ? e ? E0g in a backward recursion and obtaining the ^en(e;a), we get the optimal energy allocation strategy as e1 = 224 dn::: ::: f(en) ::: an ::: an+1 Figure 8.5: System model for the dynamic programming approach. ^e1(E0;0). For n = 2;:::;M, an = (an?1 +f(en?1)?dn?1)+ en = ^en ? E0 ? n?1X i=1 ei;ai ! (8.21) Since getting a closed form solution for the recursive equations appears to be intractable, we perform numerical approximation instead. To this end, we quantize the state space into a flnite number of discrete states. The step size of the quanti- zation decides the size of the state space. Speciflcally, if there are N levels for the energy and J levels for the waiting time, for each packet we have N ? J difierent states. The number of evaluations of a + f(en) + Sn+1[e?en;(a + f(en)?dn)+] is once per quantized en for each quantized state for each stage. Thus, the number of basic evaluations is N2JM, and the number of calculations grows linearly with the total number of packets M. We note that we can use the DP approach for more general cases where the packet arrivals are modeled as a random process, and the delays are calculated as expectations. In addition, we can incorporate the fading nature of the wireless channel, as well as develop online algorithms. 225 8.4 Scenario III: Random Energy Arrivals In this section, we consider the situation where M packets are ready to transmit at t = 0. We assume that the packets have the same size, which is B0 bits per packet. We also assume that there is E0 amount of energy available at time t = 0, and at times s1, s2, :::, sK, we have energies harvested with amounts E1, E2, :::, EK, respectively. This system model is shown in Figure 8.6. Our goal is to adaptively choose the transmit rate according to the available energy and tra?c level, in a way to minimize the average delay of the packets. s2s1 sK ??? ??? t f(e2) f(e3) E0 f(e1) D2 W2 E1 E2 EK 0 Figure 8.6: Average delay minimization with energy harvesting. In order to make the system consistent with the model we have discussed in the previous sections, we assume that, the transmit rate of a packet is kept constant during its transmission. This assumption guarantees that the transmission time of a packet ? is a function of the energy allocated to it, i.e., ?i = f(ei). Moreover, we assume that at the time epoch right before packet i?s transmission starts, we allocate a certain amount of energy ei to it, and ei cannot be greater than the total amount of energy available at that time epoch. This assumption is consistent with the causality constraint on energy, i.e., energy cannot be allocated before it has been harvested. This assumption also makes the stochastic optimal online algorithm possible, as we 226 will see later in this section. Again, we let Di denote the delay experienced by the ith packet, which includes the waiting time in the queue and the transmission time. Difierent from previous sections, where the waiting time only includes the time waiting for all of the packets in front of it in the queue depart from the system, in this scenario, the waiting time may also include the time spent waiting for energy to become available. Deflne Wi to be the earliest epoch when the energy allocated to the ith packet ei becomes available. Then, given e1;e2;:::;eM, we have W1 = min k ( sk : kX j=0 Ej ? e1 ) (8.22) W2 = min k ( sk : kX j=0 Ej ? e1 +e2 ) (8.23) ... WM = min k ( sk : kX j=0 Ej ? MX i=1 ei ) (8.24) From the deflnition, we note that if e1 ? E0, then, W1 = 0; otherwise, the flrst packet needs to wait for the arrivals of energy, until e1 amount of energy becomes available. Then, the delay experienced by each packet can be expressed recursively 227 as D1 = W1 +f(e1) (8.25) D2 = max(D1;W2)+f(e2) (8.26) ... DM = max(DM?1;WM)+f(eM) (8.27) where max(Di?1;Wi) denotes the waiting time in the queue, and f(ei) denotes the actual transmission time. Then, the average delay minimization problem becomes min MX i=1 Di s.t. ei ? 0; i = 1;2;:::;M (8.28) where the parameters of the optimization are the energies allocated to all packets, feigMi=1, and the givens of the optimization problem is the energy arrival proflle. Difierent from previous scenarios, where the optimization problems are convex, in this scenario, because of the existence of Wis in the cost function, the problem, in general, is not convex. From the deflnition of Wi and Di in (8.22)-(8.24), we note that Wi and Di are functions of e1;e2;:::;ei. Speciflcally, W1 is a staircase function of e1, and D1 is a piecewise convex function, as shown in Figure 8.7. The expressions of Wis and Dis for i > 1 have more complex forms. As we can see from W1 and D1, in general, they are not convex in e. However, for given Wis, the cost function is jointly convex in e. We illustrate this fact through a simple two-packet 228 case with two energy arrivals E0, E1 at times 0 and s1, respectively.0 W1 E0 s1 s3 s2 e1E0+E1 (a) E0+E1 D1 E0 s1 s3 s2 e1 (b) Figure 8.7: (a) The waiting time for the flrst packet, W1 and (b) the delay for the flrst packet, D1. Using the deflnition of Wi and Di, the optimization problem becomes min W1 +f(e1)+max(W1 +f(e1);W2)+f(e2) (8.29) s.t. W1 = min k ( sk : kX j=0 Ej ? e1 ) (8.30) W2 = min k ( sk : kX j=0 Ej ? e1 +e2 ) (8.31) e1;e2 ? 0 (8.32) Opening the parentheses of the min function in the constraint, we have three difierent possible cases: Case 1: W1 = W2 = 0, i.e., the second energy arrival is not utilized during 229 the transmission. The optimization problem becomes min 2f(e1)+f(e2) s.t. e1 ? E0; e1 +e2 ? E0; e1;e2 ? 0 (8.33) Case 2: W1 = 0, W2 = s1, i.e., the second packet is held until the second energy arrives, while the flrst packet?s transmission is started at the beginning. The optimization problem becomes min f(e1)+max(f(e1);s1)+f(e2) s.t. e1 ? E0; E0 < e1 +e2 ? E0 +E1; e1;e2 ? 0 (8.34) Depending on relative values f(e1) and s1 in the cost function, we may have two difierent cases as follows: Case 2a: min 2f(e1)+f(e2) s.t. f?1(s1) < e1 ? E0; E0 < e1 +e2 ? E0 +E1; e1;e2 ? 0 (8.35) Case 2b: min f(e1)+f(e2)+s1 s.t. e1 ? min(E0;f?1(s1)); E0 < e1 +e2 ? E0 +E1; e1;e2 ? 0 (8.36) 230 Case 3: W1 = W2 = s1, i.e., the flrst packet is held until the second energy is harvested, and the second packet?s transmission is started after the flrst packet departs. The optimization problem becomes min 2f(e1)+f(e2) s.t. e1 > E0; e1 +e2 ? E0 +E1; e1;e2 ? 0 (8.37) As wee see, the sub-problems in (8.33), (8.35), (8.36) and (8.37) are similar in structure, and are all convex. In order to solve the average delay minimization problem in (8.32), we need to solve the optimization problem for each difierent case, and take the solution that gives us the smallest average delay. In the random packet arrivals scenario discussed in Section 8.3, the number of sub-problems to be solved increases exponentially with the number of packet arrivals. In this scenario, depending on the value selection of Wis, there are KMM! difierent constraint sets, and each constraint set corresponds to multiple cost functions (Cases 2a, 2b in the example), which again increases the complexity of the problem. Therefore, solving the problem analytically becomes intractable with large number of packets and energy arrivals. 8.4.1 A Dynamic Programming Approach In this section, we develop a DP approach to our delay minimization problem. In particular, we partition the problem into M stages, and deflne the state space to be E ?A, where E includes the possible amounts of energy remaining at the current 231 stage and A is the set of possible epochs when the packet?s transmission is started. Speciflcally, at stage n, we deflne Sn(e;a) to be the minimal delay for the last M?n packets, given the total energy remaining at t = a is e. Assuming there is no packet transmitting at t = a. Then, the transmitter may start to transmit the nth packet immediately, or it may postpone the transmission until more energy is harvested. Then, the start time is either t = a, or t = si for some si > a. If we start to transmit nth packet at t = a with energy en, where en ? e, then, the transmission time for nth packet is f(en). Since this transmission duration afiects the queueing time of all these packets behind the nth packet, it should be counted M ? n + 1 times in the total delay. Once the transmission of the nth packet flnishes, the system enters another stage n+1, with state (e?en +Pi:a a, then, the waiting time si ?a should also be counted M ?n+1 times in the total delay. In order to simply the notation, we deflne Tn(e;a) as the total minimal delay for the rest M ?n + 1 packet if the transmitter starts to transmit the n-th packet at t = a. Then, we have Tn(e;a) = min0 a ) (8.39) 232 for n = 1;2;:::;M, and SM+1(e;a) = 0. The relationship between Tn(e;a) and Sn(e;a) is illustrated in Figure 8.8. ? ? ? ? ? ? Ei Ei+1 a tsi+1si Tn(e+Ei,si) Tn(e+Ei +Ei+1,si+1) Tn(e,a) ? ? ? Figure 8.8: Tn(e;a) in the dynamic programming formulation. During the process of solving the recursive equations backwards, we keep track of en and the start point an that leads to the minimum value of Sn(e;a). Let us denote the minimizing values as ^en(e;a) and ^an(e;a) for n = 1;2;:::;M. After computing the functions Tn(e;a);Sn(e;a) in a backward recursion and obtaining the ^en(e;a), and ^an(e;a), we get the optimal energy allocation strategy as e1 = ^e1(E0;0), a1 = ^a1(E0;0). For n = 2;:::;M, an = ^an 0 @ X sj?an?1+f(en?1) Ej ? n?1X i=1 ei;an?1 +f(en?1) 1 A en = ^en 0 @ X sj?an?1+f(en?1) Ej ? n?1X i=1 ei;an?1 +f(en?1) 1 A (8.40) First, let us examine the example with two packets and two energy arrivals 233 under the DP formulation. Based on (8.38)-(8.39), we have T2(e;a) =f(e) (8.41) S2(e;a) =min ( f(e);(si ?a)+f ? e+ X a a ) (8.42) = 8 >>< >>: minff(e);s1 ?a+f(e+E1)g; a < s1 f(e); a ? s1 (8.43) T1(e;a) = 8 >>> >>>< >>>> >>: min0?e1?e n 2f(e1)+ min?f(e?e1);s1 ?a?f(e1)+f(e?e1 +E1)? o ; a+f(e1) < s1 min0?e1?ef2f(e1)+f(e?e1 +E1)g; a+f(e1) ? s1 (8.44) S1(E0;0) =minfT(E0;0);2si +T1(E0 +E1;s1)g (8.45) After taking derivatives of the functions on the right hand side of T1(e;a) and obtain the minimizers for each possible case, we can plug them in the expression of S1(e;a), and solve the problem explicitly. Although getting a closed form solution for the recursive equations becomes intractable when M becomes large, we can still perform numerical approximation to obtain the optimal energy allocation policy. The complexity is about KMN2J, where there are N levels for the energy space and J levels for the time space. Based on the DP formulation, we can easily incorporate the random energy harvesting process to develop online algorithms. 234 8.5 Numerical Results We consider a band-limited additive white Gaussian noise channel, with bandwidth W = 1 MHz and the noise power spectral density N0 = 10?19 W/Hz. We assume that the path loss between the transmitter and the flrst receiver is about 110 dB. We assume that the packets have a uniform size of 10 Kbits. Since the transmission rate with given power p is equal to W log2 1+ phN 0W ? = 106 log2 ? 1+ e10?2? ? (8.46) the transmission time of a packet ? and the energy allocated to it e are related through the following equation 106 log2 ? 1+ e10?2? ? = 10 4 ? (8.47) Although we cannot express ? as an explicit function of e, we can prove that the relationship between ? and e satisfles all of the stated properties for f(e). We assume that at time t = [0;1:5;2;3:5;5:25]?10?2 s, we have packets arrive at the transmitter. We use flve algorithms, including our iterative algorithm, the versions of it with dimension relaxation, and ?-perturbation methods, DP based algorithm and built-in Matlab optimization functions. Simulation results indicate that DP based algorithm always converges to the solution that the built-in Matlab function flnds. In Figure 8.9, we observe that our iterative algorithm converges to the solution the built-in Matlab function flnds. 235 However, in Figure 8.10, we observe that there is a gap between the convergence point of our iterative algorithm and the Matlab solution. We note that, at the point thatouralgorithmconvergesto, thedeparturetimeofthethirdpacketcoincideswith the arrival time of the fourth packet. This means that our algorithm converges to a non-difierentiable sub-optimal flxed point. When we apply dimension-3 relaxation and ?-perturbation methods, we observe that the modifled version of our algorithm escapes the sub-optimal flxed point and converges to the optimal solution. 2 4 6 8 10 12 145.7 5.75 5.8 5.85 5.9 5.95 6 Iteration index Overall delay Iterative algorithm Optimal solution by Matlab Figure 8.9: Overall delay as a function of the iteration index, when E = 48?10?2 mJ. Forthethirdscenariowithenergyarrivals, weassumethatatt = [0;2;5;6;8;9]? 10?2 s, we have energy harvested with amount E = [10;5;10;5;10;10]?10?2 mJ. We assume that at t = 0, we have four packets to transmit. We apply the DP algo- rithm, and obtain the policy as shown in Figure 8.11. We observe that since there is only a small amount of energy available at t = 0, in order to minimize the average delay, all of the packets except the flrst one have to wait for the arrivals of energy 236 2 4 6 8 10 12 14 167.95 8 8.05 8.1 8.15 Iteration index Overall delay Iterative algorithm Optimal solution by Matlab Dimension?3 relaxation ??pertubation Figure 8.10: Overall delay as a function of the iteration index, when E = 45?10?2 mJ. to transmit their packet, and the objective function in this case is equivalent to minimize P4i=1 f(ei), i.e., the transmission time for each packet has the same weight in the cost function.Therefore they have the same value in the optimal solution. 10 0 2 10 10 1055 985 t6 ?1 ?2 ?4?3 Figure 8.11: The optimal energy allocation e = [10;10;10;10] ? 10?2 mJ, with duration ? = [1;1;1;1]?10?2 s, respectively. When E0 increases to 20?10?2 mJ and the rest Eis are the same, the optimal transmission policy is shown in Figure 8.12. We observe that in this scenario, the second packet starts its transmission right after the flrst packet, and last two packets flnish their transmission with the same energy amount. Although the second packet 237 has a longer transmission duration than the last two packet, the overall delay is minimized since the the waiting time for the second packet is avoided under this allocation. 10 10 1055 985 t 20 ?3 ?4 2 6 ?1 ?2 0 Figure 8.12: The optimal energy allocation e = [10:5;9:5;10;10] ? 10?2 mJ, with duration ? = [0:89;1:15;1;1]?10?2 s, respectively. When E0 increases to 40?10?2 mJ, the optimal transmission policy is shown in Figure 8.13. This policy is identical to the policy in the flrst scenario when 45?10?2 mJ is available at t = 0. Because of the multiple counting of the transmission time for each packet in the cost function, the optimal policy has monotonically increasing transmission duration for the packets. 10 1055 t 40 ?1 ?2 ?3 8 92 650 ?4 10 Figure 8.13: The optimal energy allocation e = [12:6;11:8;10:9;9:7]?10?2 mJ, with duration ? = [0:63;0:70;0:82;1:07]?10?2 s, respectively. 8.6 Conclusions In this chapter, we investigated the average delay minimization in a general energy harvesting system. Depending on the arrival proflles of the energy and data pack- ets, the average delay minimization problem becomes difierent. We consider three 238 difierent scenarios here. In the flrst scenario, we assume that both packets and energy are ready at the transmitter before the transmission starts. In the second scenario, we assume that packets may arrive during the transmission, while energy is ready at the transmitter before the transmission. These two scenarios correspond to traditional unrechargeable systems. We developed iterative approaches and DP formulation for both scenarios. For the third scenario, we assume that packets are ready before the transmission starts, and energy is harvested during the transmis- sion. We flrst analyzed the structural properties of the optimal transmission policy, and developed an iterative algorithm and/or DP formulation to obtain the o?ine scheduling policy. 239 Chapter 9 Conclusions In this dissertation, we investigated delay minimization problems in wireless com- munication channels with energy and power constraints at the transmitters. We combined queueing theory with information theory and designed queue-length based cross-layer transmission policies. We flrst studied the average delay minimization problem in a two-user multiple access system, where each transmitter has an average power constraint. We analyzed the trade-ofi between the average power constraints and the average delay, and proved that the optimal transmission policy has a threshold structure, i.e., if the sum of the queue lengths exceeds a threshold, both users transmit a packet from their queues, and if the sum of the queue lengths is smaller than a threshold, the user with the larger queue length transmits a packet from its queue. Delay-optimal rate allocation is another important research area in multi-user communications. We flrst studied the optimal rate allocation policies in a symmetric multiple access channel. We proved that the delay optimal rate allocation policy is to balance the queue lengths in each slot as much as possible. In order to observe the tension between maximizing the current throughput and balancing the queue lengths, we studied the optimal rate allocation policy in a system with a general pentagon rate region. We proved that a switch curve structure exists in the queue 240 state space, and the switch curve has a limit on one of the queue lengths. The optimal policy implies that we can operate the queues partially distributedly. It also implies that the system may need to trade sum-rate for balancing queue lengths in order to achieve the optimal delay performance. Next, we consider communication systems with rechargeable batteries, where the transmitters are able to harvest energy from the nature throughout the duration of the transmissions. We investigated the transmission completion time minimiza- tion problem in such systems. We flrst considered a single-user communication channel with an energy harvesting transmitter. We developed an iterative algo- rithm, and proved its global optimality. Then, we extended the single-user scenario to a broadcast channel and a multiple access channel. For these two scenarios, we flrst characterized the maximum departure region for a given deadline T, then, based on the \duality" between the departure region maximization and transmission com- pletion time minimization problems, we simplifled the transmission completion time minimization problem into simple single-user problems, and obtained the optimal scheduling policies e?ciently. Finally, we studied the average delay minimization problem in a single-user en- ergy harvesting communication channel. We investigated three difierent scenarios. For each scenario, we flrst analyzed the structural properties of the optimal trans- mission policy, and developed an iterative algorithm and/or dynamic programming formulation to obtain the o?ine scheduling policy. 241 Bibliography [1] A. Ephremides and B. Hajek, \Information theory and communication net- works: An unconsummated union," IEEE Transactions on Information Theory, vol. 44, pp. 2416{2434, Oct 1998. [2] R. A. Berry and R. G. Gallager, \Communication over fading channels with delay constraints," IEEE Transactions on Information Theory, vol. 48, pp. 1135{1149, May 2002. [3] R. Berry, \Power and delay trade-ofis in fading channels," Ph.D. dissertation, MIT, Jun 2000. [4] I. Bettesh and S. Shamai, \Optimal power and rate control for minimal aver- age delay: The single-user case," IEEE Transactions on Information Theory, vol. 52, pp. 4115{4141, Sep 2006. [5] M. Goyal, A. Kumar, and V. Sharma, \Power constrained and delay optimal policies for scheduling transmission over a fading channel," Proceedings of IEEE Infocom, vol. 1, pp. 311{ 320, Mar/Apr 2003. [6] A. Steiner and S. Shamai, \On queueing and multi-layer coding," IEEE Trans- actions on Information Theory, vol. 56, pp. 2392{2415, May 2010. [7] T. Holliday and A. Goldsmith, \Optimal power control and source-channel coding for delay constrained tra?c over wireless channels," Proceedings of IEEE International Conference on Communications (ICC), vol. 2, pp. 831{ 835, Apr/May 2002. 242 [8] W. Chen, K. B. Letaief, and Z. Cao, \A joint coding and scheduling method for delay optimal cognitive multiple access," Proceedings of IEEE International Conference on Communications (ICC), pp. 3558{3562, May 2008. [9] E. Yeh, \Delay-optimal rate allocation in multiaccess communications: A cross- layer view," IEEE Workshop on Multimedia Signal Processing, pp. 404{ 407, Dec 2002. [10] N. Ehsan and T. Javidi, \Delay optimal transmission policy in a wireless multi- access channel," IEEE Transactions on Information Theory, vol. 54, pp. 3745{ 3751, Aug 2008. [11] E. Yeh, \Minimum delay multi-access communication for general packet length distributions," 42nd Annual Allerton Conference on Communication, Control, and Computing, pp. 1536{1545, Sep/Oct 2004. [12] S. Musy, \Delay and coding in multiple-user communications," Ph.D. disserta- tion, EPFL, 2007. [13] E. Uysal-Biyikoglu, B. Prabhakar, and A. El Gamal, \Energy-e?cient packet transmission over a wireless link," IEEE/ACM Transactions on Networking, vol. 10, pp. 487{499, Aug 2002. [14] M. A. Zafer and E. Modiano, \A calculus approach to energy-e?cient data transmission with quality of service constraints," IEEE/ACM Transactions on Networking, vol. 17, pp. 898{911, Jun 2009. 243 [15] ||, \Delay-constrained energy e?cient data transmission over a wireless fad- ing channel," Information Theory and Applications Workshop, pp. 289{298, Jan/Feb 2007. [16] W. Chen, U. Mitra, and M. J. Neely, \Energy-e?cient scheduling with individ- ual delay constraints over a fading channel," WiOpt, pp. 1{10, Apr 2007. [17] A. El Gamal, C. Nair, B. Prabhakar, E. Uysal-Biyikoglu, and S. Zahedi, \Energy-e?cient scheduling of packet transmissions over wireless networks," Proceedings of IEEE Infocom, vol. 3, pp. 1773{1782, Nov 2002. [18] E. Uysal-Biyikoglu and A. El Gamal, \On adaptive transmission for energy ef- flciency in wireless data networks," IEEE Transactions on Information Theory, vol. 50, pp. 3081{3094, Dec 2004. [19] K. Tutuncuoglu and A. Yener, \Transmission completion time minimization for energy harvesting nodes with flnite battery capacity," submitted. [20] O. Ozel, K. Tutuncuoglu, J. Yang, S. Ulukus, and A. Yener, \Resource man- agement for fading wireless channels with energy harvesting nodes," submitted. [21] L. Tassiulas and A. Ephremides, \Dynamic server allocation to parallel queues with randomly varying connectivity," IEEE Transactions on Information The- ory, vol. 39, pp. 466{478, Mar 1993. [22] I. E. Telatar and R. G. Gallager, \Combining queueing theory with information theory for multiaccess," IEEE Journal on Selected Areas in Communications, vol. 13, pp. 963{969, Aug 1995. 244 [23] E. Yeh and A. Cohen, \Throughput optimal power and rate control for queued multi-access and broadcast communications," IEEE International Symposium on Information Theory, p. 112, Jun 2004. [24] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991. [25] V. G. Kulkarni, Modeling and Analysis of Stochastic Systems. Chapman and Hall/CRC, 1995. [26] A. Tits, Optimal control, University of Maryland, 2002, lecture notes. [27] S. Boyd and L. Vandenberghe, Convex Optimization. United Kingdom: Cam- bridge University Press, 2004. [28] J. Walrand, An Introduction to Queueing Networks. Prentice Hall, 1988. [29] A. Marshall and I. Olkin, Inequalities: Theory of Majorization and Its applica- tions. New York: Academic Press, 1979. [30] D. Tse and S. Hanly, \Multiaccess fading channels { Part I: Polymatroid struc- ture, optimal resource allocation and throughput capacities," IEEE Transac- tions on Information Theory, vol. 7, pp. 2796{2815, Nov 1998. [31] W. Yu, W. Rhee, S. Boyd, and J. Cio?, \Iterative water-fllling for Gaussian vector multiple access channels," IEEE Transactions on Information Theory, vol. 50, no. 1, pp. 145{151, Jan 2004. 245 [32] A. Sendonaris, E. Erkip, and B. Aazhang, \User cooperation diversity-Part I: System description," IEEE Transactions on Communications, vol. 51, no. 11, pp. 1927{1938, Nov 2003. [33] O. Kaya and S. Ulukus, \Power control for fading cooperative multiple access channels," IEEE Transactions on Wireless Communications, vol. 6, no. 8, pp. 2915{2923, Aug 2007. [34] W. Lin and P. R. Kumar, \Optimal control of a queueing system with two heterogenous servers," IEEE Transactions on Automatic Control, vol. 29, pp. 696{703, Aug 1984. [35] B. Hajek, \Optimal control of two interacting service stations," IEEE Trans- actions on Automatic Control, vol. 29, no. 6, pp. 491{499, Jun 1984. [36] M. A. Antepli and E. Uysal-Biyikoglu, \Optimal packet scheduling on an energy harvesting broadcast link," submitted. [37] O. Kaya and S. Ulukus, \Achieving the capacity region boundary of fading cdma channels via generalized iterative waterfllling," IEEE Transactions on Wireless Communications, vol. 5, no. 11, pp. 3215{3223, Nov 2006. [38] D.BertsekasandJ.Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Athena Scientiflc, 1997. 246