ABSTRACT

Title of Dissertation: SPATIAL PROCESSING, POWER CONTROL, AND CHANNEL ALLOCATION FOR OFDM WIRELESS COMMUNICATIONS

Masoud Olfat, Doctor of Philosophy, 2003

Dissertation directed by: Professor K. J. Ray Liu, Department of Electrical and Computer Engineering

OFDM is mainly designed to combat the effect of multipath reception by dividing the wide-band frequency-selective fading channel into many narrow-band flat subchannels. OFDM offers flexibility in adapting to time-varying channel conditions by accurately adapting the parameters at each subcarrier. The purpose of this work is to use this flexibility to study OFDM systems with power control and multiple transmit and receive antennas, the problem of Peak to Average Power Ratio (PAPR), and the effect of OFDM in providing QoS.

An OFDM uplink multiuser wireless network combined with power control and receive beamforming is proposed to achieve the desired SINR at each OFDM subchannel. Consequently, a better overall BER is achieved with the same total power. To reduce the receiver complexity, joint time-domain beamforming and power control is also provided. The proposed algorithm is also extended to COFDM.

We use distributed schemes to maximize the achievable data rate for each receiver in a multiuser downlink transmission using MIMO/OFDM, by finding the optimal transmit and receive weight vectors. We propose iterative algorithms to distribute the limited power (per carrier or per user) to multiple streams and multiple antennas in order to maximize the allocated rate per user. The game-theoretic analogy of the problem is stated and the convergence of the algorithms is discussed.

To increase the information rate of low-PAPR OFDM codes, we propose two frameworks. Super Golay sequences constructed from the 16-QAM constellation, with PAPR bounded by 3 dB, are defined and constructed by recursive structures.
Cyclic-Golay codes are also proposed and constructed by a framework that can be used to obtain the cyclic shift of any code represented by Boolean algebraic functions. These codes are in general a subset of generalized Reed-Muller codes, and have lower error correction capabilities than Golay sequences. An extension of the majority-logic Reed algorithm for decoding Reed-Muller codes of any order is provided. To reduce decoding complexity, recursive maximum-likelihood decoding schemes are also provided, and the complexity of these algorithms is analyzed.

We also address a scheduling algorithm for wireless networks that provides QoS for mobile users in a shared environment and at the same time utilizes the system resources efficiently. We introduce an income maximization notion, and propose optimal and suboptimal approaches that increase throughput, maintain the QoS for each user, and generate high income for the service provider. This notion is used to determine the optimal subcarrier allocation to different users of an OFDMA system based on their required QoS. Optimal and sub-optimal algorithms are presented and their performance and complexity are studied.

SPATIAL PROCESSING, POWER CONTROL, AND CHANNEL ALLOCATION FOR OFDM WIRELESS COMMUNICATIONS

by Masoud Olfat

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2003

Advisory Committee:
Professor K. J. Ray Liu, Chairperson/Advisor
Professor Steven Tretter
Professor Carlos Bernestein
Professor Sennur Ulukus
Professor Babis Papadopoulos

© Copyright by Masoud Olfat 2003

DEDICATION

TO MY PARENTS FOR THEIR ENCOURAGEMENT, GUIDANCE, SUPPORT, AND INFINITE LOVE, TO MY LOVELY WIFE MAHNAZ AND MY CHILDREN MAHBOD AND TARRA WHOSE LOVE IS THE MOTIVATION OF MY BEING ALIVE, AND TO ALL THOSE WHO HAVE CARED FOR ME.

ACKNOWLEDGEMENTS

I am deeply grateful to my advisor Professor K. J.
Ray Liu for his encouragement, guidance and support. Despite my difficulties, his generous support gave me the opportunity to pursue my Ph.D. His patience and his delicate questions and suggestions greatly influenced my work. His careful comments and criticisms have driven me to achieve more significant contributions. Professor Liu has been more than an academic advisor to me: he has developed a friendly relationship that goes beyond the advisor/student relation, and he treats his students as part of his family. I have also had the privilege of benefiting from his expertise in teaching sophisticated, high-level technical courses in a very professional manner.

My sincere gratitude goes to the members of my dissertation committee, Professor Steve Tretter, Professor Carlos Bernestein, Professor Sennur Ulukus, and Professor Babis Papadopoulos, for their time and careful consideration of my work and for their comments and suggestions to improve it. I especially express my thankfulness to Professor Tretter for his assistance throughout my studies.

I would like to express my profound gratitude to Professor Homayoun Hashemi at the Electrical Engineering department of Sharif University of Technology, without whom I would not have been able to pursue my higher education. His attitude has inspired me to be a better researcher and, more importantly, a better human.

I am also deeply grateful to my friends Dr. Farrokh Rashid Farrokhi and Dr. Mehdi Alasti, who have helped me greatly in developing the ideas and approaches contained in this dissertation. Without their sincere assistance, I would never have been able to achieve this goal.

Special thanks also go to my friends Professor Alejandra Mercado, Professor Hamid Jafarkhani, Dr. Kamran Etemad, Dr. Javad Razavilar, Dr. Vahid Tabatabaee, Dr. Babak Azimi Sadjadi, Dr. Majid Raissi Dehkordi, Dr. Tahereh Fazel, Dr.
Hassan Yaghoobi, and especially Professor Vahid Tarokh for their support, encouragement, and valuable insights, comments and suggestions throughout my Ph.D. studies.

My sincere gratitude goes to my parents-in-law, my brothers, and my sister, who have constantly given me their support, love and encouragement.

Above all, I am profoundly indebted to my parents, who made countless personal sacrifices for me and have patiently struggled and suffered through the enormous problems I have had throughout my life, and to my beautiful and extremely compassionate wife Mahnaz and my lovely and gracious children Mahbod and Tarra, who have been the main motivation of my life. Without all of them, I would not be who I am. Without their love, my life would have been meaningless. To all of them and to the future of my beloved country, Iran, I dedicate this dissertation.

TABLE OF CONTENTS

List of Tables
List of Figures

1 Introduction
  1.1 Broadband Wireless Communications
    1.1.1 Standardization and Frequency Bands
  1.2 Wireless Networks: The Layered Architecture
  1.3 Physical Layer
    1.3.1 Modulation Level
    1.3.2 Interleaving
    1.3.3 Channel Coding
  1.4 Multiple Access Control (MAC) Layer
    1.4.1 MAC Layer in IEEE802.11 WLAN Standard
  1.5 Network Layer
  1.6 Quality of Service
  1.7 Scheduling
  1.8 Interference and Capacity
  1.9 Wireless Propagation Models
    1.9.1 Wireless Channel Model
  1.10 Multiple Transmit and Receive Antenna
    1.10.1 Beamforming
    1.10.2 Space-Time Coding
    1.10.3 Spatial Multiplexing
  1.11 Contribution of this Dissertation
  1.12 Organization of the Dissertation

2 Orthogonal Frequency Division Multiplexing (OFDM)
  2.1 Motivation for introducing OFDM
  2.2 OFDM History
  2.3 Description of OFDM
    2.3.1 Advantages of OFDM
    2.3.2 Disadvantages of OFDM
    2.3.3 Single Carrier versus OFDM Comparison
  2.4 Loading Algorithms
  2.5 Peak to Average Power Ratio (PAPR)
    2.5.1 Statistical Properties of OFDM Signals
    2.5.2 Techniques for OFDM PAPR Reduction
  2.6 Orthogonal Frequency Division Multiple Access
    2.6.1 Channel Allocation

3 Power Allocation for OFDM using Adaptive Beamforming over Wireless Networks
  3.1 Motivation and Previous Works
  3.2 OFDM with Adaptive Power Control
    3.2.1 Background
    3.2.2 System Configuration
  3.3 Power Control and Frequency-Domain Beamforming
  3.4 Power Control and Time-Domain Beamforming
  3.5 Power Control and MMSE Time-Domain Beamforming
  3.6 Extension to COFDM
  3.7 Performance Results
  3.8 Summary of the Chapter

4 MIMO-OFDM Systems with Multi-User Interference
  4.1 Motivation and Previous Works
  4.2 System Model
  4.3 Achievable Rate with Known Interference Covariance Matrix
    4.3.1 Constant Power per Subcarrier
    4.3.2 Constant power per user
    4.3.3 Iterative Water-filling
    4.3.4 Game Theoretic approach for rate maximization
    4.3.5 Sub-optimal Solution; Same Cell Interference
  4.4 Single Stream SNR Maximization
  4.5 Simulation
  4.6 Summary

5 Low Peak to Average Power Ratio with modified Golay Sequences for OFDM Systems
  5.1 Motivation and Previous Works
  5.2 Golay Complementary Sequences for equal-power constellations
    5.2.1 Construction of equal-power Golay Sequences
    5.2.2 PAPR reduction for the non-equal power constellation
    5.2.3 Super-Golay 16QAM pairs from QPSK pairs
    5.2.4 Super-Golay 64QAM pairs from QPSK pairs
  5.3 Cyclic Golay Sequences
    5.3.1 Construction of Cyclic Golay Codes
    5.3.2 Maximum-likelihood Decoding of RM_{2^h}(r, m)
    5.3.3 Performance Results
    5.3.4 Summary of the Chapter

6 Service Level Agreement (SLA) Based Scheduling Algorithms and QoS-provisioned Channel Allocation for OFDMA
  6.1 Motivation and Previous Works
  6.2 SLA-Based Scheduling for Wireless Networks
    6.2.1 System Model
    6.2.2 Maximum Credit Scheduling (MCS)
    6.2.3 Maximum Throughput Scheduling (MTS)
    6.2.4 A Trade-off: SLA-Based Scheduling Algorithms
    6.2.5 Maximum Income Greedy Scheduling: A Suboptimal Solution
    6.2.6 Maximum Income Dynamic Programming Scheduling (MIDPS): Optimal Solution
    6.2.7 System Admission Control Policy, and Pricing
  6.3 QoS-Provisioned Channel Allocation for OFDMA
    6.3.1 OFDMA System Model
    6.3.2 OFDMA Scheduling Algorithms through Subcarrier Allocation
  6.4 Performance Results
    6.4.1 Simulation Results for SLA-Based Scheduling
    6.4.2 Simulation Results for OFDMA Channel Allocation
  6.5 Summary of the Chapter

7 Conclusion and Future Works
    7.0.1 Summary
    7.0.2 Future Works

Bibliography

LIST OF TABLES

3.1 The COST207 Typical Urban 6-ray power delay profile
5.1 List of repeated Golay sequences under cyclic shifts for m = 3 and h = 2
5.2 List of non-repeated cyclic shifts on Golay sequences for m = 3 and h = 2
5.3 K-Map for 4 variables
5.4 K-Map for 5 variables
5.5 Computational complexities of Decoders for codes from RM_4(2, 5)
5.6 Number of constructed 8-valued SGolay Pairs
5.7 Rates and PMEPRs for size 16
5.8 Rates and PMEPRs for size 32
5.9 Rate, Hamming and Lee distance of some codes with low PAPR

LIST OF FIGURES

1.1 An example of a convolutional encoder
1.2 The MAC Layer in IEEE802.11 as an interface between the PHY and LLC layers
1.3 The IEEE802.11 multiple access structure
1.4 Power fluctuation of the received signal due to different sources of fading
1.5 The effect of data rate in ISI
1.6 Narrowband Antenna array and beamformer
1.7 Wideband Antenna array and beamformer
1.8 Uniform Linear Array
2.1 (a) A typical FDM spectrum. (b) A typical OFDM spectrum, where B is the saving in bandwidth
2.2 (a) Spectra of OFDM signal
2.3 Basic Building block of an OFDM transmitter
2.4 Basic Building block of the jth OFDM Receiver
2.5 Partial Transmit Sequences for PAPR reduction in an OFDM
2.6 Selective Mapping for PAPR reduction in an OFDM
2.7 The extension of 16QAM constellation
2.8 OFDMA carrier segmentation
3.1 The ith OFDM transmitter using Adaptive Power Control
3.2 Frequency-Domain Beamforming in the jth OFDM receiver
3.3 Time-Domain Beamforming in the jth OFDM receiver
3.4 The 8-state 8-PSK TCM encoder
3.5 The 8-state 8-PSK TCM trellis
3.6 Bit error rate vs. SINR [dB] for single antenna cases for time-varying channel between power updates
3.7 Bit error rate vs. total network power [dBm] for single antenna cases assuming quasi-static channel
3.8 SINRs of different subchannels in the single antenna cases
3.9 Total network power [dBm] vs. desired SINR [dB] for adaptive power control cases
3.10 Coding gain of the TCM encoder depicted in Fig. 3.5
3.11 Total network power vs. desired SINR for coded and uncoded OFDM
3.12 Bit error rate vs. desired SINR for coded and uncoded OFDM
4.1 Cellular structure and distribution of users
4.2 Achievable rate CDF, for fixed power per carrier with 16 base stations, 2 mobile per cell, reuse factor of 7 and 2 streams
4.3 Achievable rate CDF, for fixed power per carrier with 16 base stations, 2 mobile per cell, reuse factor of 1 and 2 streams
4.4 Achievable rate CDF, for fixed power per user with 125 base stations, 1 mobile per cell, reuse factor of 3 and 2 streams
4.5 Achievable rate CDF of different number of streams, for fixed power per user with 100 base stations, 2 mobile per cell, and reuse factor of 3
4.6 Achievable rate CDF of different number of streams, for fixed power per user with 16 base stations, 2 mobile per cell, and reuse factor of 3
4.7 Achievable rate CDF of different number of streams, for fixed power per user with 16 base stations, 1 mobile per cell, and reuse factor of 3
5.1 Bit Error Rate (BER) vs. SNR for AWGN channels
5.2 Coding rate vs. SNR for AWGN channels
5.3 Bit Error Rate (BER) vs. SNR for Fading channels, when the channel is known at the receiver
5.4 Bit Error Rate (BER) vs. SNR for Fading channels, when the channel is not known at the receiver
5.5 OFDM transmitter with Golay and cyclic Golay encoder
5.6 Bit Error Rate (BER) vs. SNR Threshold for AWGN channels, when m = 4
5.7 Coding rate vs. SNR Threshold for AWGN channels, when m = 4
5.8 Bit error rate vs. coding rate for fixed values of SNR for AWGN channels, when m = 4
6.1 System block diagram
6.2 Metric generator block diagram for MIGS
6.3 SLA scheduler block diagram
6.4 Block diagram of the system
6.5 Scheduling tree of subcarrier allocation
6.6 Viterbi channel assignment
6.7 Throughput versus network load
6.8 Minimum assigned relative rate versus network load
6.9 Total income vs. network load
6.10 Throughput of MIDPS, MIGS, and MPS vs. network load
6.11 Minimum assigned relative rate of MIDPS, MIGS, and MPS vs. network load
6.12 Income of MIDPS, MIGS, and MPS vs. network load
6.13 QoS versus the network load for different penalty functions, α = 5
6.14 Throughput versus the network load for different penalty functions, α = 5
6.15 QoS versus the network load for different penalty functions, α = 500
6.16 Throughput versus the network load for different penalty functions, α = 500
6.17 CDFs of the optimal exhaustive search algorithm, iterative algorithm with 20 and 80 iterations
6.18 Throughput vs. load
6.19 Worst case actual to desired throughput vs. load
6.20 Total revenue vs. load

Chapter 1

Introduction

1.1 Broadband Wireless Communications

In recent years, wireless communication and networking have experienced rapid growth, and they promise to become a globally important infrastructure. Advances in integrated circuit technology and digital signal processing algorithms have made wireless communication technology accessible to millions of people. The light weight, long operational time, and affordable prices of portable devices have resulted in an ever-increasing demand for wireless services. The spectacular growth of video, voice, and data communication over the Internet, and the equally rapid pervasion of mobile telephony, justify great expectations for mobile multimedia.

However, current wireless communication systems and standards, such as the 3G cellular standard or the WLAN IEEE802.11 standard family, do not fully support the newly emerging multimedia applications. The quality of service (QoS) they can provide is not competitive with the QoS that wire-line service providers can offer. The large volume and high sensitivity of multimedia data require the development and deployment of wireless communication systems that can guarantee reliable data transmission at high data rates. Research and development are taking place all over the world to define the next generation of Wireless Broadband Multimedia Communication Systems (WBMCS), consisting of various components at different scales ranging from global networks to small residential networks.

The demand for wireless mobile, Internet, and multimedia communications is growing exponentially. Therefore it is imperative that wireless, Internet, and multimedia be brought together.
Thus, in the near future, wireless Internet Protocol (IP) and Wireless Asynchronous Transfer Mode (WATM) will play an important role in the development of WBMCS. While present communication systems are primarily designed for one specific application, such as speech on a mobile telephone or high-rate data in a Wireless Local Area Network (WLAN), the next generation of WBMCS will integrate various functions and applications.

Designing wireless communication systems that support large data rates with sufficient robustness to radio channel impairments requires devising modulation, coding, and signal processing techniques that can more effectively combat the adverse effects of the radio signal propagation environment, such as multipath fading and interference. To implement wireless broadband communication systems, the following challenges must be considered: frequency allocation and selection, channel characterization, multiple access techniques, protocols and networks, and finally system development with efficient modulation, coding and smart antenna techniques.

1.1.1 Standardization and Frequency Bands

Inspired by the successful application of the cellular concept, the wireless evolution has so far gone through two generations. First generation (1G) wireless systems (like AMPS, TACS) use analog transmission and support voice services. Second generation (2G) systems (like GSM, IS-95, PDC) employ digital technology and provide circuit-switched data communication services at low speeds in addition to voice. The so-called 2.5G systems (like EDGE/GPRS, HDR), which currently operate in most countries, support more advanced services, such as moderate-rate (up to 100 kbps) packet-switched data. In 1G and 2G systems, the main focus was on increasing system capacity in terms of the number of established connections, which have constant, low-rate streams.
However, recent developments in the telecommunications arena indicate a clear trend towards enhanced, rate-demanding services that are expected to flourish in the coming years. The idea of third generation (3G) systems arose from the need to support high and diverse data rates for heterogeneous applications, such as home networking, video conferencing, fast wireless/mobile Internet access and multimedia communications. 3G systems, such as UMTS and CDMA2000, are envisioned to support rates on the order of 1 or 2 Mbps [1].

There are several forums for the standardization of wireless broadband systems, namely IEEE 802.11 [2], European Telecommunication Standards Institute Broadband Radio Access Networks (ETSI BRAN) [3], Multimedia Mobile Access Communications (MMAC) [4], IEEE 802.16/WiMAX [5], and IEEE 802.20 [6]. IEEE 802.11 produced the first WLAN standard for the 2.4 GHz Industrial, Scientific, and Medical (ISM) band and the 5 GHz Unlicensed National Information Infrastructure (UNII) band. The legacy version of WLAN specifies the medium access control and three different physical layers: direct sequence spread spectrum, frequency hopping, and infrared, which give a data rate of up to 2 Mbps. Later, the committee proposed new versions, namely IEEE 802.11b, which uses a high-speed direct sequence spread spectrum physical layer for a speed of 11 Mbps; IEEE 802.11a, which uses Orthogonal Frequency Division Multiplexing (OFDM) for 54 Mbps in the 5 GHz band; and IEEE 802.11g, for speeds of up to 54 Mbps in the ISM band.

ETSI BRAN and MMAC jointly used OFDM for high-speed wireless transmission in the 5 GHz band. ETSI High Performance Local Area Network type 2 (HIPERLAN/2) consists of a family of standards, one of which is an OFDM-based standard that is very similar to IEEE 802.11a. MMAC is used in Japan and supports both the IEEE 802.11a and HIPERLAN/2 standards. Note that Japan has only 100 MHz available in the 5 GHz band, while the United States and Europe provide 300 and 455 MHz, respectively.
Fixed wireless technologies (also called fixed wireless access, wireless broadband access, or broadband wireless access) are not new, but because of recent advances this technology has been successful in rural communities that are out of reach of installed fixed lines. When used at high frequencies, fixed wireless can carry more data but has limited range and requires more complex equipment and line of sight. At lower frequencies, the range is longer and the equipment is cheaper, but the transmission rates are low. Multi-point Microwave Distribution Systems (MMDS) and Local Multi-point Distribution Systems (LMDS) were viewed as promising technologies, but a lack of uniform standards has hampered their deployment. IEEE 802.16 and 802.16a are new fixed-wireless standards that should be able to transmit over 32-56 km with maximum data rates close to 70 Mbit/s. Again, the higher frequencies require line of sight but provide high-capacity links. At lower frequencies, line of sight is not required but speeds are lower. This technology is a high-speed wireless backbone designed to link distant ISPs to the Internet. Wireless LAN technologies would then be used for the connection to the user.

The IEEE 802.16 standard is about to revolutionize the broadband wireless access industry. The 802.16 standard, the Air Interface for Fixed Broadband Wireless Access Systems, is also known as the IEEE Wireless-MAN (WMAN) air interface. This technology is designed from the ground up to provide wireless last-mile broadband access in the Metropolitan Area Network (MAN), delivering performance comparable to traditional cable, DSL, or T1 offerings. The principal advantages of systems based on 802.16 are multi-fold: the ability to quickly provision service, even in areas that are hard for wired infrastructure to reach; the avoidance of steep installation costs; and the ability to overcome the physical limitations
of traditional wired infrastructure. Providing a wired broadband connection to a currently underserved area through cable or DSL can be a time-consuming, expensive process, with the result that a surprisingly large number of areas in the US and throughout the world do not have access to broadband connectivity. 802.16 wireless technology provides a flexible, cost-effective, standards-based means of filling existing gaps in broadband coverage, and of creating new forms of broadband services not envisioned in a wired world.

Drawing on the expertise of hundreds of engineers from the communications industry, the IEEE has established a hierarchy of complementary wireless standards. These include IEEE 802.15 for the Personal Area Network (PAN), 802.11 for the Local Area Network (LAN), 802.16 for the Metropolitan Area Network, and the proposed IEEE 802.20 for the Wide Area Network (WAN). Each standard represents the optimized technology for a distinct market and usage model and is designed to complement the others. A good example is the proliferation of home and business wireless LANs and commercial hot spots based on the IEEE 802.11 standard.

WiMAX (the Worldwide Interoperability for Microwave Access Forum) is a non-profit corporation formed by equipment and component suppliers, including Intel Corporation, to promote the adoption of IEEE 802.16 compliant equipment by operators of broadband wireless access systems. The organization is working to facilitate the deployment of broadband wireless networks based on the IEEE 802.16 standard by helping to ensure the compatibility and interoperability of broadband wireless access equipment.

In an effort to bring interoperability to broadband wireless access, WiMAX is focusing its efforts on establishing a unique subset of baseline features, grouped in what is referred to as System Profiles, that all compliant equipment must satisfy.
These profiles will establish a baseline protocol that allows equipment from multiple vendors to interoperate, and that also provides system integrators and service providers with the ability to purchase equipment from more than one supplier. System Profiles can address the regulatory spectrum constraints faced by operators in different geographies.

WiMAX will establish a structured compliance procedure based upon a proven test methodology, resulting in a complete set of test tools available to equipment developers so that they can design for conformance and interoperability during the earliest possible phase of product development. Ultimately, the WiMAX suite will enable service providers to choose from multiple vendors of broadband wireless access equipment that conforms to the IEEE 802.16a standard and that is optimized for their unique operating environment.

By choosing interoperable, standards-based equipment, the operator reduces the risk of deploying broadband wireless access systems. Economies of scale enabled by the standard help reduce monetary risk. Operators are not locked in to a single vendor because base stations will interoperate with subscriber stations from different manufacturers. Ultimately, operators will benefit from lower-cost and higher-performance equipment, as equipment manufacturers rapidly create product innovations based on a common, standards-based platform.

In December 2002, the IEEE Standards Board approved the establishment of IEEE 802.20, the Mobile Broadband Wireless Access (MBWA) Working Group. The mission of IEEE 802.20 is to develop the specification for an efficient packet-based air interface that is optimized for the transport of IP-based services. The goal is to enable worldwide deployment of affordable, ubiquitous, always-on and interoperable multi-vendor mobile broadband wireless access networks that meet the needs of business and residential end user markets.
The MBWA scope is the specification of the physical and medium access control layers of an air interface for interoperable mobile broadband wireless access systems, operating in licensed bands below 3.5 GHz, optimized for IP-data transport, with peak data rates per user in excess of 1 Mbps. It supports various vehicular mobility classes up to 250 km/h in a MAN environment and targets spectral efficiencies, sustained user data rates and numbers of active users that are all significantly higher than achieved by existing mobile systems.

The 802.20 interface seeks to boost real-time data transmission rates in wireless metropolitan area networks to speeds that rival DSL and cable connections (1 Mbps or more) based on cell ranges of up to 15 kilometers or more, and it plans to deliver those rates to mobile users even when they are travelling at speeds up to 250 kilometers per hour (155 miles per hour). This would make 802.20 an option for deployment in high-speed trains. The 802.16e project authorization request specifies only that it will support subscriber stations moving at vehicular speeds. Essentially, 802.16e is looking at the mobile user walking around with a PDA or laptop, while 802.20 will address high-speed mobility issues.

1.2 Wireless Networks: The Layered Architecture

The inherent volatility of the wireless medium constitutes the major difficulty in the design of wireless networks. The quality of a narrow-band wireless link between a transmitter and a receiver depends both on radio propagation parameters (path loss, shadow fading, multipath fading) and on cochannel interference.

The OSI (Open Systems Interconnection) model defines a layered architecture in which the protocols defined in each layer are responsible for communicating with the same peer protocol layer running in the opposite computer, and for providing services to the layer above it (except for the top-level application layer).
The techniques of layered protocols were developed to logically decompose a complex network into smaller, more understandable parts (layers), to provide standard interfaces between network functions, and to allow each layer to perform the same functions as its counterpart in other nodes of the network. In the following we briefly describe the characteristics of the main layers of a wireless network.

1.3 Physical Layer

Physical layer-based techniques are employed on a link basis in order to achieve a high data rate while maintaining an acceptable Bit Error Rate (BER) at the receiver, irrespective of link quality. The parameters that are considered adaptable include modulation and coding, interleaving, transmission power level, and the use of multiple antennas.

1.3.1 Modulation Level

Modulation is a fundamental component of a digital communications system. It is the process of mapping the digital information to analog form so that it can be transmitted over the channel. Consequently, every digital communication system has a modulator that performs this task. Closely related to modulation is the inverse process, called demodulation, performed by the receiver to recover the transmitted digital information. Modulation is done by changing the amplitude, phase, or frequency of the transmitted Radio Frequency (RF) signal.

The main design issue of the modulator is the choice of the constellation, which is the set of $M$ points ($M$ being the constellation size) that can be transmitted in a single symbol. This choice affects several important properties of a communication system, for example BER, Peak to Average Power Ratio (PAPR), and RF spectrum shape. Each block of $b = \log_2 M$ bits from the coded bit stream constitutes a symbol, and each symbol is mapped to one of $M$ waveforms for transmission over the channel. The single most important parameter of a constellation is the "minimum distance", $d_{\min}$, which is the smallest distance between any two points in the constellation.
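The two quantities just defined, $b = \log_2 M$ and $d_{\min}$, can be computed directly for a concrete constellation. The following is a minimal sketch, assuming a square M-QAM constellation normalized to unit average power; the function names are illustrative and do not come from the text.

```python
import itertools
import math


def bits_per_symbol(M):
    """Number of coded bits b = log2(M) carried by one symbol."""
    return int(math.log2(M))


def square_qam(M):
    """Square M-QAM points, scaled so that the average symbol power is 1."""
    side = int(round(math.sqrt(M)))
    if side * side != M:
        raise ValueError("M must be a perfect square")
    levels = [2 * i - (side - 1) for i in range(side)]  # e.g. -3, -1, 1, 3
    points = [complex(i, q) for i in levels for q in levels]
    avg_power = sum(abs(p) ** 2 for p in points) / M
    scale = math.sqrt(avg_power)
    return [p / scale for p in points]


def min_distance(points):
    """Smallest Euclidean distance between any two constellation points."""
    return min(abs(a - b) for a, b in itertools.combinations(points, 2))


for M in (4, 16, 64):
    print(M, bits_per_symbol(M), round(min_distance(square_qam(M)), 4))
```

Under this normalization, each step to a larger constellation carries more bits per symbol but brings the points closer together, which is the trade-off governed by the minimum distance.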
The minimum distance depends on several factors: the constellation size, the average power, and the shape of the constellation. Modulation and demodulation can be done either coherently or non-coherently.

Coherent Modulation. Coherent modulation can be used by a communication system that maintains a phase lock between the transmitter and the receiver RF carrier wave. It improves the performance, but requires a more complex receiver structure than non-coherent systems. The performance gain of coherent modulation is significant when the system uses a large constellation. High-speed communication systems, like IEEE 802.11a, are usually coherent. The most common coherent modulations are listed below:

(1) Amplitude Shift Keying (ASK), where the information is transmitted by changing the amplitude of the carrier.
(2) Phase Shift Keying (PSK) (including BPSK, QPSK, and 8-PSK), where the information is transmitted by changing the phase of the carrier.
(3) M-ary Quadrature Amplitude Modulation (M-QAM), where both the amplitude and the phase of the carrier change; it is the combination of ASK and PSK. $b = \log_2 M$ bits are converted to one M-QAM symbol.

The symbol error rate of an M-QAM constellation is shown to be [7]

\[ P_s = 4\left(1 - \frac{1}{\sqrt{M}}\right) Q\!\left(\sqrt{\frac{3}{M-1}\,\frac{E_s}{N_0}}\right), \tag{1.1} \]

where $Q(\cdot)$ is the Gaussian tail (error) function defined as

\[ Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt, \quad x \ge 0, \tag{1.2} \]

$E_s$ is the symbol energy, and $N_0$ is the noise power spectral density. Therefore, the BER of M-QAM depends on the Signal to Noise Ratio (SNR) of the wireless link and on the constellation size. After the receiver has performed all the required synchronization operations, the demodulator tries to detect which of the $M$ symbols has been transmitted.

Non-Coherent Modulation. Non-coherent modulation can be used by a communication system that does not maintain a phase lock between transmitter and receiver, or that does not know the amplitude change of the transmitted symbol caused by the channel.
In such a system, the received symbols are rotated and scaled arbitrarily compared to the transmitted symbols. Therefore ASK, PSK, or QAM modulation cannot be used, because they require the received symbol phase and amplitude to be very close to the transmitted phase and amplitude. The solution is to use Differential PSK (DPSK) or Differential APSK (DAPSK) modulation. Differential modulations encode the transmitted information as a phase change, or a phase and amplitude change, from one transmitted symbol to the next. This encoding introduces memory into the signal, because each transmitted symbol depends on the previous symbols. As a consequence, the demodulator has to consider two consecutive symbols when making decisions.

The main benefit of differential encoding is a significantly simplified receiver structure. Several of the synchronization algorithms are not needed in a non-coherent receiver. Specifically, phase tracking and channel estimation are not needed, because absolute knowledge of the carrier phase and of the channel effects is not needed. In an OFDM system (discussed in the next chapter) carrier frequency estimation could also be removed, if the system can tolerate the performance loss due to inter-carrier interference caused by lost orthogonality between the subcarriers. However, many standards do not use this scheme because of the performance loss associated with differential approaches. In contrast, low data rate systems do use differential techniques, mainly DPSK modulation.

In Differential Phase Shift Keying (DPSK), the transmitter changes the carrier phase from its current value according to the data bits.

Differential Amplitude Phase Shift Keying (DAPSK) combines differential phase and differential amplitude modulation. The differential phase modulation is analogous to regular DPSK. Differential amplitude modulation, on the other hand, has to change the constellation shape compared to coherent amplitude modulation.
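The differential phase encoding described above can be illustrated with the binary case (DBPSK). The sketch below is a simplified illustration, not a receiver design: it assumes a noiseless channel that applies one unknown complex gain to every symbol, and the bit-to-phase mapping (0 for no change, 1 for a 180 degree change) is an assumption chosen for the example.

```python
import cmath


def dbpsk_encode(bits):
    """Encode bits as phase changes: 0 keeps the phase, 1 flips it by 180 degrees."""
    symbols = [1 + 0j]  # reference symbol transmitted first
    for b in bits:
        symbols.append(symbols[-1] * (-1 if b else 1))
    return symbols


def dbpsk_decode(received):
    """Decide each bit from two consecutive symbols; no absolute phase is needed."""
    bits = []
    for prev, cur in zip(received, received[1:]):
        # The product cur * conj(prev) cancels the common channel rotation.
        bits.append(1 if (cur * prev.conjugate()).real < 0 else 0)
    return bits


bits = [1, 0, 1, 1, 0, 0, 1]
tx = dbpsk_encode(bits)
channel = 0.3 * cmath.exp(1j * 2.1)  # unknown gain and phase rotation (illustrative)
rx = [channel * s for s in tx]
print(dbpsk_decode(rx) == bits)
```

The decoder never estimates the channel, which is the receiver simplification mentioned above. The amplitude part of DAPSK is not covered by this sketch, since differential amplitude modulation needs a different constellation shape than coherent amplitude modulation.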
The reason is the unknown scaling of the amplitude of the transmitted symbol caused by the channel. A general assumption when differential modulation is used is that the channel and carrier phase are constant during two consecutive symbols. Therefore, we can cancel the effect of the channel by dividing two consecutive symbols. The detection of differential modulation is done in two steps. First, the differential encoding is removed from the signal, and then a normal demodulation is performed, as is done for a regular PSK or QAM constellation. The performance loss of DPSK compared to coherent modulation varies with the size of the modulation: for DBPSK it is 1-2 dB, for DQPSK about 2.3 dB, and for large constellations 3 dB.

1.3.2 Interleaving

Interleaving aims to distribute transmitted bits in time or frequency or both to achieve a desirable bit error distribution after demodulation. What constitutes a desirable error distribution depends on the Forward Error Correction (FEC) code used. What kind of interleaving pattern is needed depends on the channel characteristics. If the system operates in a purely AWGN environment, no interleaving is needed, because the error distribution cannot be changed by relocating the bits.

Interleaving necessarily introduces delay into the system, because bits are not received in the same order as the information source transmits them. The overall communication system usually dictates some maximum delay the system can tolerate, hence restricting the amount of interleaving that can be used. For example, cellular telephone systems usually use time diversity, because the channels are fast fading (discussed later in this chapter). However, the maximum phone-to-phone delay is usually constrained to 20 ms or less, to prevent noticeable degradation in call quality. This means the maximum interleaving delay must be much less than 20 ms to allow for other delay sources in the system.
There are two ways to perform interleaving: block interleaving and convolutional interleaving. Block interleaving operates on one block of bits at a time. The number of bits in the block is called the interleaving depth, which defines the delay introduced by interleaving. A block interleaver can be described as a matrix to which data is written in columns and read in rows, or vice versa. Deinterleaving is the opposite operation of interleaving; that is, the bits are put back into the original order.

A convolutional interleaver is another possible interleaving solution, most suitable for systems that operate on a continuous stream of bits. The interleaver operates by writing the bits into a commutator on the left and reading bits out from a commutator on the right, the two commutators being connected by delay lines of increasing length. The main benefit of a convolutional interleaver is that it requires approximately half of the memory required by a block interleaver to achieve the same interleaving depth. This saving can be significant for long interleaver depths. Deinterleaving for a convolutional interleaver is achieved by flipping the interleaver along its horizontal axis. The structure is otherwise identical to the interleaver, except that the longest delay line is at the top and the no-delay line is last.

IEEE 802.11a has an interleaver depth of one OFDM symbol, because the channel is assumed to be quasi-static; that is, the channel is assumed to stay essentially the same for the duration of a transmitted packet. Therefore it is naturally a block interleaver.

1.3.3 Channel Coding

Channel codes are the most important component of any modern communication system, and they make today's effective and reliable communications possible. The basic measure of channel coding performance is the "coding gain", which is usually measured in dB as the reduction of the required SNR to achieve a certain symbol error rate in an AWGN channel. As an example, consider two ways in which an IEEE 802.11a system could achieve a 12 Mbit/s data rate.
The simplest way would be to use uncoded BPSK modulation on each OFDM subcarrier (48 bits' worth of information in each OFDM block). The symbol time is 4 µs, i.e. 250000 symbols per second, hence the overall data rate is 250000 × 48 = 12 Mbit/s.

Another way is to use QPSK and a rate 1/2 convolutional code. This results in a significantly lower required SNR to achieve a good BER performance. At a BER of $10^{-5}$, the coding gain is about 5.8 dB. This means that to achieve the same performance, the system that does not use channel coding has to spend 5.8 dB more energy for each transmitted symbol than the system that uses channel coding.

Another important parameter of channel coding is the "coding rate". The code rate is the ratio of the number of bits entering the encoder, called the "message word", to the number of bits leaving the encoder, called the "code word". This ratio is always less than or equal to one. Channel coding forces the system to use a larger constellation to keep the same data rate as an uncoded system. However, going to larger constellations reduces the minimum distance $d_{\min}$ of the constellation; this implies a higher BER at the output of the demodulator. Nevertheless, at the output of the channel decoder, the bit error rate is significantly reduced.

There are two main types of channel coding, namely "block coding" and "convolutional coding", which will be briefly described next. Note that the performance of channel codes is ultimately limited by the channel capacity formula:

\[ C = W \log_2(1 + \mathrm{SNR}), \tag{1.3} \]

where $W$ is the channel bandwidth. After about 50 years of research, "Turbo codes" [8] have finally emerged as a class of codes that can approach this ultimate limit in performance. Another innovation is Low Density Parity Check (LDPC) codes [9], which also have performance very close to the capacity.

Figure 1.1: An example of a convolutional encoder. (Block labels: inputs X1, X0; three delay elements D; coded outputs C2, C1, C0; signal mapping; output y.)
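The data-rate arithmetic of the 12 Mbit/s example and the capacity bound (1.3) can be checked numerically. The sketch below uses the 48 data subcarriers and 4 µs symbol time quoted above; the 20 MHz bandwidth and 10 dB SNR passed to the capacity function are illustrative assumptions, not values taken from the text.

```python
import math


def ofdm_data_rate(data_subcarriers, bits_per_symbol, code_rate, symbol_time_s):
    """Information bits per second for one OFDM configuration."""
    info_bits = data_subcarriers * bits_per_symbol * code_rate
    return info_bits / symbol_time_s


def capacity_bps(bandwidth_hz, snr_db):
    """Shannon capacity C = W log2(1 + SNR), with the SNR given in dB."""
    snr = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr)


uncoded_bpsk = ofdm_data_rate(48, 1, 1.0, 4e-6)  # 1 bit per subcarrier, no coding
coded_qpsk = ofdm_data_rate(48, 2, 0.5, 4e-6)    # 2 bits per subcarrier, rate 1/2
print(uncoded_bpsk / 1e6, coded_qpsk / 1e6)      # both in Mbit/s
print(capacity_bps(20e6, 10.0) / 1e6)            # assumed 20 MHz channel at 10 dB
```

Both configurations deliver the same information rate; the coded one doubles the constellation size to pay for the redundancy, which is the constellation trade-off described above.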
Convolutional Codes

Almost all the major cellular systems (GSM, IS-95), the IEEE 802.11a and HIPERLAN/2 WLAN standards, and many other standards use convolutional error correcting codes. A convolutional code is defined by a set of connections between the stages of one or more shift registers and the output bits of the encoder. If the number of shift registers is $k$ and the number of output bits is $n$, the coding rate is $k/n$. For each output bit there are $k$ connections that define how the value of the output bit is calculated from the state of the shift registers. The constraint length of a convolutional code is the maximum number of bits in a single output stream that can be affected by any input bit. It is the maximum number of taps on the shift registers in the encoder plus one, i.e.

\[ K = 1 + \max_i \{m_i\}, \tag{1.4} \]

where $m_i$ is the number of shift registers in branch $i$. Fig. 1.1 depicts a sample convolutional encoder, whose rate is 2/3 and whose constraint length is 3. The longer the shift registers, the more powerful the code is, but the more complexity is incurred at the decoder.

The performance of a convolutional code is determined by the "minimum free distance" of the code. The free distance is defined using the Hamming distance, which is the number of positions in which two codewords differ; for a convolutional code, the free distance is the minimum Hamming distance between two different codewords. An asymptotic coding gain at high SNR for a convolutional code can be calculated from the free distance and the rate of the code:

\[ \text{coding gain} = 10 \log_{10}(\text{rate} \times \text{free distance}). \tag{1.5} \]

Puncturing [10] is a very useful technique to generate different rates from a single convolutional code. The basic idea behind Rate Compatible Punctured Convolutional (RCPC) codes is to avoid transmitting some of the bits output by the convolutional encoder, thus increasing the rate of the code. This increase in rate decreases the free distance of the code.
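The encoder structure, puncturing, and the asymptotic coding gain (1.5) can be sketched together. The example below is not the encoder of Fig. 1.1: it assumes the textbook rate 1/2, constraint length 3 code with octal generators (7, 5), whose free distance is 5, and the puncturing pattern is an illustrative choice that raises the rate to 2/3.

```python
import math


def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Rate 1/n feedforward convolutional encoder (no tail bits appended)."""
    state = 0
    mask = (1 << constraint_length) - 1
    out = []
    for b in bits:
        state = ((state << 1) | b) & mask  # shift the new bit into the register
        for g in generators:
            # Each output bit is the parity of the tapped register stages.
            out.append(bin(state & g).count("1") % 2)
    return out


def puncture(coded, pattern=(1, 1, 1, 0)):
    """Drop the coded bits at positions where the repeating pattern holds a 0."""
    return [c for i, c in enumerate(coded) if pattern[i % len(pattern)]]


def asymptotic_coding_gain_db(rate, free_distance):
    """Equation (1.5): 10 log10(rate * free distance)."""
    return 10 * math.log10(rate * free_distance)


coded = conv_encode([1, 0, 1, 1])
print(coded, puncture(coded))
print(round(asymptotic_coding_gain_db(0.5, 5), 2))
```

With the assumed pattern, every four coded bits (two input bits) become three transmitted bits, so the rate rises from 1/2 to 2/3 at the cost of a smaller free distance.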
There are several algorithms that can be used to decode convolutional codes. The Viterbi algorithm has reached a dominant position as the method for decoding convolutional codes, especially in wireless communications. It is a maximum likelihood codeword estimator; that is, it provides the best possible estimate of the transmitted codeword.

Trellis Coded Modulation (TCM) [11] merges channel coding and modulation into a single integrated component. The benefit of this approach is that the code design is optimized for the constellation used. The most significant benefit is reached in the AWGN channel and at high spectral efficiencies, in other words with large constellations. It consists of two parts: a convolutional encoder and a modulator.

Block Codes

Block codes differ from convolutional codes in that the code has a definite codeword length $n$, instead of the variable codeword length of convolutional codes. The most popular class of block codes is that of cyclic block codes, such as Reed-Solomon (RS) codes [12] and BCH codes [13]. Another important difference between block codes and convolutional codes is that block codes are designed using algebraic properties of polynomials or curves over finite fields, whereas convolutional codes are designed using exhaustive computer search.

Other Codes

Concatenated codes are built by combining an outer code and an inner code. The outer code is usually a Reed-Solomon block code and the inner code a convolutional code. They have reached a performance that is only 2.2 dB from the channel capacity limit. Turbo codes [14] have a performance only 0.6 dB from the channel capacity. They are a combination of recursive systematic convolutional codes, interleaving, and iterative decoding.

1.4 Multiple Access Control (MAC) Layer

Multiple access schemes allow wireless systems to accommodate several users that need to access a common wireless channel, so that the channel is shared efficiently among them.
The users can be distinguished by time, frequency, code, or space. Multiple access methods can be connection-oriented (fixed assignment), connectionless (random access), or demand assignment methods. These methods are characterized by a trade-off between the overhead they incur and the reliability of transmission.

Connection-oriented multiple access methods are similar to telephone calls, where the connection is dedicated to the call after being established, even if there is nothing to talk about. In this case a separate connection is created for each user and maintained for the duration of the session, even if the user has no more data to transmit. The main connection-oriented methods are:

(1) Frequency Division Multiple Access (FDMA), where the whole bandwidth is divided into non-overlapping carrier frequencies and each user is assigned to one carrier. A special case of FDMA is Orthogonal Frequency Division Multiple Access (OFDMA), in which the subcarriers of an OFDM transmitter are divided into fixed segments and each segment is assigned to one user.
(2) Time Division Multiple Access (TDMA), where a time window is divided into short slots and each slot is assigned to one user.
(3) Code Division Multiple Access (CDMA), in which each user transmits all the time over the whole available frequency band, but is identified by a unique spreading code that is orthogonal to the other users' codes, so that the receiver can distinguish the user. The code modulates the data and spreads it over a wider frequency band, since the rate of the code is much higher than that of the data.
(4) Frequency Hopping Multiple Access (FHMA), where the carrier frequencies of different users are varied in a random fashion.
(5) Space Division Multiple Access (SDMA), where the separation of the users is performed in space by directing the emitted energy towards each intended user through directional beams created by multiple antenna arrays.
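The role of orthogonal spreading codes in CDMA, item (3) above, can be made concrete with a small sketch. It assumes Walsh-Hadamard codes, perfectly synchronized users, and a noiseless channel that simply adds the two chip streams; all of these are simplifying assumptions made for illustration.

```python
def walsh_codes(order):
    """Rows of a Hadamard matrix of size 2**order, with entries +1 and -1."""
    rows = [[1]]
    for _ in range(order):
        rows = [r + r for r in rows] + [r + [-c for c in r] for r in rows]
    return rows


def spread(symbols, code):
    """Replace every data symbol by the symbol multiplied by the whole code."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each block of chips with the code to recover one symbol."""
    n = len(code)
    return [sum(x * c for x, c in zip(chips[i:i + n], code)) / n
            for i in range(0, len(chips), n)]


codes = walsh_codes(2)                      # four mutually orthogonal codes
user_a, user_b = [1, -1, 1], [-1, -1, 1]    # BPSK data of two users
channel = [a + b for a, b in zip(spread(user_a, codes[1]),
                                 spread(user_b, codes[2]))]
print(despread(channel, codes[1]), despread(channel, codes[2]))
```

Each data symbol occupies four chips here, so the transmitted signal runs at four times the data rate, and correlation with the matching code removes the other user's contribution entirely.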
The connectionless multiple access methods are suitable for low traffic networks, where the streams to be transmitted could be bursty. In Carrier Sense Multiple Access (CSMA) methods, the user listens to the medium before transmission and tries to capture the control of the channel before transmitting. In ALOHA, a user transmits with a certain probability whenever it has data. These methods reduce the amount of overhead, but increase the risk of collision and interference.

In demand assignment techniques, the system switches between the random access and fixed assignment methods based on the network load. Under heavy loads the connection-oriented methods are used, while under low traffic the connectionless ones are exploited.

However, the wireless medium has special properties that make the design of MAC protocols different from, and more challenging than, that of wireline networks:

(1) Collision detection is not possible while sending data, and so Ethernet-like protocols cannot be used.
(2) The time-varying channel and multipath propagation necessitate handshaking between two nodes to test the wireless channel between them.
(3) Errors are more likely in wireless transmissions than in wireline transmissions. Packet loss due to burst errors can be minimized by using smaller packets, Forward Error Correcting (FEC) codes, or retransmission methods that use Acknowledgments (ACK) and NACK to detect packet errors.
(4) Carrier sensing is a function of the position of the receiver relative to the transmitter. The hidden node problem is an example of such a phenomenon.
The metrics used to compare MAC protocols, which can be specified as Quality of Service (QoS) parameters, are delay, jitter, throughput as a fraction of channel capacity, fairness in sharing the bandwidth among users (taking possible priorities into account), power consumption, robustness against channel fading, multi-stream support (the ability to handle different streams such as voice, video, and data, which have different requirements), and stability.

1.4.1 MAC Layer in IEEE802.11 WLAN Standard

The MAC layer corresponds to layer 2 of the Open System Interconnection (OSI) model. The MAC acts as an interface between the Logical Link Control (LLC) layer and the Physical layer (PHY), as shown in Fig. 1.2. The primary function of the MAC layer is to provide medium access control to applications that contend for the medium in such a way as to maximize the utilization of the channel. The MAC layer also provides power management and synchronization. The MAC handles three types of messages: data, control, and management.

Figure 1.2: The MAC Layer in IEEE802.11 as an interface between the PHY and LLC layers.

IEEE 802.11 specifies services that are used to support the delivery of MAC Service Data Units (MSDUs), which are the packets received by the MAC layer from the upper layers, between Stations (STAs), and to control WLAN medium access. These services are: (1) Data exchange services, which provide reliable transmission of MSDUs between two peer LLC entities, including broadcast and multicast transport. (2) Control services, which provide handshaking between MAC entities to indicate the availability of the wireless medium (WM) for their communication. The handshaking happens through the exchange of control messages between two entities.
The requesting entity sends a Request to Send (RTS) control frame, asking for confirmation from the peer entity, and receives a Clear to Send (CTS) frame as confirmation. This sets up the environment for reliable communication between the two peer MAC entities. (3) Management services, which cover all management messages such as authentication, de-authentication, association, disassociation, re-association, timing, and synchronization.

The multiple access scheme that helps the WLAN users to share the wireless medium is described in Fig. 1.3. The basic access mechanism, called the Distributed Coordination Function (DCF), is a Carrier Sense Multiple Access with Collision Avoidance mechanism (usually known as CSMA/CA). The MAC layer also incorporates an optional access method called the Point Coordination Function (PCF). The two methods are mutually exclusive and operate in different time frames, namely the Contention Period (CP) and the Contention Free Period (CFP).

• Contention Period (CP): the time frame allocated to the DCF access mechanism, in which all the stations (STAs) and the Access Point (AP) contend for the wireless medium by exchanging control information.
• Contention Free Period (CFP): the time frame allocated to the PCF access mechanism, in which the AP becomes the Point Coordinator (PC), or polling master, and has full control of allocating wireless medium access to the different STAs.

Figure 1.3: The IEEE802.11 multiple access structure.

Distributed Coordination Function (DCF)

The DCF is implemented in all STAs, for use within all 802.11 network configurations. The basic medium access protocol is a DCF that allows for automatic medium sharing between compatible PHYs through the use of CSMA/CA and a random backoff time following a busy medium condition. In addition, all directed traffic uses immediate positive acknowledgment (ACK frame), where the sender schedules a retransmission if no ACK is received.
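The effect of the random backoff time mentioned above can be illustrated with a toy contention model. This is a sketch under simplifying assumptions, not the 802.11 procedure itself: every station draws a uniform backoff in `[0, contention_window]` slots at the same instant, the station with the smallest value wins, and a tie counts as a collision. The window sizes and function names are hypothetical.

```python
import random


def draw_backoffs(num_stations, contention_window, rng):
    """Each station picks a uniform random number of slots to wait."""
    return [rng.randint(0, contention_window) for _ in range(num_stations)]


def resolve_contention(backoffs):
    """Index of the station whose backoff expires first, or None on a tie."""
    smallest = min(backoffs)
    winners = [i for i, b in enumerate(backoffs) if b == smallest]
    return winners[0] if len(winners) == 1 else None


def collision_rate(num_stations, contention_window, trials=2000, seed=1):
    """Fraction of contention rounds that end in a collision."""
    rng = random.Random(seed)
    collisions = sum(
        resolve_contention(draw_backoffs(num_stations, contention_window, rng)) is None
        for _ in range(trials))
    return collisions / trials


print(resolve_contention([3, 1, 4]))
print(collision_rate(5, 7), collision_rate(5, 63))
```

In this model a larger window spreads the stations over more slots and so lowers the collision probability, at the price of a longer average wait before the medium is used.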
CSMA/CA Concept: The CSMA/CA protocol is designed to reduce the collision probability between multiple STAs accessing a medium, at the point where collisions would most likely occur. The highest probability of a collision exists just after the medium becomes idle following a busy period, because multiple STAs could have been waiting for the medium to become available again. This is the situation that necessitates a random backoff procedure (explained later) to resolve medium contention conflicts.

Virtual CSMA Concept and Network Allocation Vector (NAV): Carrier sensing is performed by both physical (PHY) and virtual (MAC) mechanisms. The virtual carrier-sense mechanism is achieved by distributing reservation information announcing the impending use of the medium. This information is maintained by each STA in the NAV. The exchange of RTS and CTS frames prior to the actual data frame is one means of distributing the medium reservation information. The RTS and CTS frames contain a duration-ID field that defines the period of time for which the medium is to be reserved to transmit the actual data frame and the returning ACK frame. All STAs within the reception range of either the originating STA (which transmits the RTS) or the destination STA (which transmits the CTS) shall update the NAV with this information.

Another means of distributing the medium reservation information is the duration-ID field in directed frames (viz. MAC headers). This field gives the time for which the medium is reserved, either to the end of the immediately following ACK or, in the case of a fragment sequence, to the end of the ACK following the next fragment. The duration information is also available in the MAC headers of all frames sent during the Contention Period (CP) other than Power Save (PS)-Poll control frames. The NAV may be thought of as a counter, which counts down to zero at a uniform rate.
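The view of the NAV as a countdown counter can be sketched as a small class. This is an illustrative model only: the class and method names are hypothetical, time is counted in microseconds, and the rule of keeping the longer of the current and the newly announced reservation is stated here as an assumption of the sketch.

```python
class NetworkAllocationVector:
    """Toy model of the NAV: a reservation timer that counts down to zero."""

    def __init__(self):
        self.remaining_us = 0

    def update(self, duration_us):
        # Assumption: an overheard duration field only extends the reservation.
        self.remaining_us = max(self.remaining_us, duration_us)

    def tick(self, elapsed_us):
        # Count down at a uniform rate, never going below zero.
        self.remaining_us = max(0, self.remaining_us - elapsed_us)

    def medium_idle(self):
        # Virtual carrier sense reports idle only when the counter is zero.
        return self.remaining_us == 0


nav = NetworkAllocationVector()
nav.update(300)   # duration announced by an overheard RTS
nav.update(120)   # shorter duration in a later frame does not shorten the NAV
nav.tick(100)
print(nav.remaining_us, nav.medium_idle())
```

A station would combine this virtual indication with physical carrier sensing before deciding that it may contend for the medium.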
When the NAV counter is zero, the virtual carrier-sense indication is that the medium is idle; when it is nonzero, the indication is busy. This is shown in Figure 8. The medium is also determined to be busy whenever the STA is transmitting.

The time interval between frames is called the Inter Frame Space (IFS). A STA determines that the medium is idle through the use of the carrier-sense function for the interval specified. There are two main IFSs used in DCF:

SIFS: the shortest of the inter frame spaces. SIFS is used when STAs have seized the medium and need to keep it for the duration of the frame exchange sequence to be performed. Using the smallest gap between transmissions within the frame exchange sequence prevents other STAs, which are required to wait for the medium to be idle for a longer gap, from attempting to use the medium.

DIFS: a STA using the DCF is allowed to transmit if its carrier-sense mechanism determines that the medium has been idle for the DIFS interval. A STA may transmit after subsequent reception of an error-free frame, which re-synchronizes the STA.

Random Backoff Procedure: Whenever a STA wants to acquire the wireless medium, it checks the state of the medium, as indicated by the physical and virtual carrier-sense mechanisms, starting from NAV = 0. The highest probability of a collision exists just after the medium becomes idle following a busy period, since all the stations find the medium to be idle at the same time. Each station therefore waits for a random time duration (the backoff procedure) before contending for the medium.

Point Coordination Function (PCF)

The MAC layer also incorporates an optional access method called PCF, which is only usable in infrastructure network configurations. This access method uses a point coordinator (PC), which operates at the Access Point, to determine which STA currently has the right to transmit. The operation is essentially that of polling, with the PC performing the role of the polling master.
The PCF distributes information within Beacon management frames to gain control of the medium by setting the network allocation vector (NAV) in the STAs.

1.5 Network Layer

Two of the main issues in wireless ad-hoc networks are routing and traffic engineering. Routing is a central function in any communication network. The routing protocols meant for wired networks cannot be used for mobile ad hoc networks, because of the mobile nature of these networks. Ad hoc routing protocols can be divided into two categories, proactive (table-driven) and reactive (on-demand), based on when and how the routes are discovered. In table-driven routing protocols each terminal maintains one or more tables containing routing information to every other terminal in the network. All terminals update their tables so as to maintain a consistent and up-to-date view of the network: when the network topology changes, the terminals propagate update messages throughout the network in order to keep the routing information about the whole network consistent and current. In reactive routing, by contrast, routes are created only when desired by the source host.

Traffic engineering (TE) is also a powerful approach for providing quality of service (QoS) over packet networks. The motivation for TE is to distribute traffic flows over the network links so as to avoid the congestion caused by uneven network utilization [15]. One way to achieve this goal is to use QoS based routing. Given the QoS request of a flow or an aggregation of flows, QoS routing obtains the route that is most likely to meet the QoS requirements [16,17]. In wired networks, where the network topologies and link capacities are fixed, traffic engineering with QoS based routing finds a good distribution of traffic flows to support the requested qualities [15].

Mobile ad-hoc networks are different from wired networks in the sense that network topology and link capacities vary over time.
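The difference between shortest-path routing and QoS based routing can be sketched with a small example. It uses Dijkstra's algorithm with a pluggable link cost; the topology, the load values, and the load-aware cost `1 / (1 - load)` are all hypothetical choices made for illustration and are not the algorithms of [15-17].

```python
import heapq


def best_route(graph, source, target, cost):
    """Dijkstra's algorithm; cost(link) maps a link record to a non-negative weight."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == target:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (total + cost(link), neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical topology: a short but congested route and a longer, lightly loaded one.
graph = {
    "A": {"B": {"load": 0.9}, "C": {"load": 0.1}},
    "B": {"D": {"load": 0.9}},
    "C": {"E": {"load": 0.1}},
    "E": {"D": {"load": 0.1}},
}

hop_count = lambda link: 1.0                          # plain shortest path
load_aware = lambda link: 1.0 / (1.0 - link["load"])  # penalizes busy links

print(best_route(graph, "A", "D", hop_count)[1])
print(best_route(graph, "A", "D", load_aware)[1])
```

With the hop-count cost the two-hop route through the congested node is chosen, while the load-aware cost prefers the three-hop route. In a mobile ad-hoc network the loads and link capacities used as inputs here change over time.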
This causes the TE process to be more complicated. In wireless networks, lack of QoS can be caused either by a high bit error rate (BER) on wireless links (low signal to noise and interference ratio) or by congestion due to an uneven distribution of traffic flows, when some parts of the network are overloaded while other parts are lightly loaded. Therefore, unlike wired networks, where knowledge of the overall network traffic is enough to perform traffic engineering, in wireless networks we need to know the link capacities as well [18]. As in wired networks, a QoS based routing algorithm for wireless networks might come up with a longer but lightly loaded route in preference to a heavily loaded shortest path.

1.6 Quality of Service

The primary goal of a wireless communications system is the fulfillment of Quality of Service (QoS) requirements. The QoS parameters vary depending on the network structure and the communication layers. For single user transmission, they could be physical layer parameters such as an acceptable Signal to Noise Ratio (SNR) level or Bit Error Rate (BER) at the receiver. In the Data Link Layer or MAC layer, QoS could be expressed as a Packet Error Rate (PER), a minimum achievable rate, or a maximum tolerable delay guaranteed to each user. QoS could also be interpreted as throughput, delay, and jitter on a per-session basis, or even as fairness in rate allocation to different users (for one-to-many transmission).

The ability of the network infrastructure to satisfy such QoS requirements and ultimately enhance system capacity depends on procedures and mechanisms that span several communication layers. For shared media, an efficient multiple access mechanism must be employed. At the MAC layer, QoS could be guaranteed by appropriate scheduling strategies, as well as by resource management and reuse methods.
At the physical layer, adaptive transmission techniques provide the potential to adjust parameters such as transmission power, modulation level, and symbol rate to maintain acceptable quality at the link level. The use of multiple transmit and/or receive antennas (both in downlink and uplink) is one of the most important means of increasing capacity, and therefore of providing QoS guarantees more efficiently.

1.7 Scheduling

One of the main characteristics that distinguishes packet scheduling over mobile wireless networks from scheduling over wireline networks is that the wireless channel is time-varying due to multipath fading. In delay tolerant data systems it is thus possible, with the aid of channel condition feedback from the users, to schedule transmission to users when their fading conditions are favorable, thereby achieving multi-user diversity [19]. It is shown in [19] that, with a single transmit antenna, transmitting to the single best user during each scheduling interval is an efficient strategy. The performance of any scheduling algorithm critically depends on the transmission rates achieved in each scheduling interval, which in turn depend on the coding, the modulation, and the number of transmit antennas employed.

1.8 Interference and Capacity

Interference is one of the major factors that limit the performance of wireless networks. Interference in a receiver can be caused by any communication system which leaks energy into the frequency band the receiver is working in. Interference degrades the Signal to Interference and Noise Ratio (SINR) and therefore causes a higher error probability and possibly the termination of a transmission session. Interference has been recognized as a major bottleneck in increasing the capacity of a cellular system. In a cellular system the interference can be categorized as co-channel or adjacent channel.
To exploit the limited bandwidth, several cells in a given coverage area may use the same set of frequency channels. The interference between signals from these cells is the cochannel interference. Unlike thermal noise, which can be overcome by increasing the transmitter power, cochannel interference cannot be compensated by simply increasing the transmitter power. This is because, although an increase in one transmitter's power can increase its own SINR, it also increases the interference on other cells. To reduce cochannel interference, we can either physically separate the cochannel cells by a minimum distance, or use signal processing means for interference cancellation, such as power control or multiple receive antenna beamforming.

In almost all wireless networks, the amount of transmission power is a key performance issue, for two reasons. First, the lower the power, the less interference it causes on other receivers, and therefore the better the link quality that is achieved. Second, the limited battery life of mobile units forces us to be conservative in the amount of transmission power. On the other hand, a mobile must transmit enough power that the SINR at its corresponding receiver is at an acceptable level. As a result, power control is a key element of wireless systems that require shared bandwidth and time slots (cochannel transmission), like CDMA wireless networks [20-22].

As the demand for wireless services increases, the number of channels assigned to a cell (in cellular systems) or service set (in WLAN) eventually becomes insufficient to support the required number of users. At this point, careful designs are needed to accommodate more users using the limited resources. There are several techniques in cellular systems to increase the capacity, like cell splitting (dividing congested cells into smaller cells) and cell sectoring (using multiple directional antennas at the base station).
However, these methods suffer from increased handoff rates. Moreover, the directional antennas are fixed and cannot adjust themselves to the changing environment. Therefore, we can employ adaptive beamforming techniques rather than fixed antenna patterns. Array processing is a powerful technique which calls for replacing the single omni-directional antenna or directional sector with an array of antenna elements at the base station. Using adaptive beamforming techniques, one can place an antenna beam toward the signal of interest and place antenna nulls toward other cochannel interference sources. This results in a large reduction of the interference in the received signal and significantly increases the SINR for the signal of interest. This technique is called the space diversity combining method. We will discuss the increase in system capacity using different multiple transmit and receive antenna structures in Section 1.10.

Figure 1.4: Power fluctuation of the received signal due to different sources of fading.

1.9 Wireless Propagation Models

A signal transmitted in a wireless channel will experience three separable effects: path loss, shadow fading, and multipath fading. Path loss, or the mean propagation loss, comes from wave propagation, absorption, and vertical multipath. Shadow fading, or slow fading, is caused by large obstacles, such as buildings or hills, and is characterized by a log-normal distribution. Multipath fading, or fast fading, results from multipath scattering. If there is a non-fading direct path component, it is modelled by a Ricean distribution; otherwise it is modelled by a Rayleigh distribution. Figure 1.4 illustrates the power fluctuation of the received signal around the mean power due to the different sources of fading.

Path Loss: In free space the received power, expressed in dB, decays linearly with the logarithm of the transmitter-receiver separation.
The power received by a receiver antenna at a distance $d$ from the transmitter antenna in free space is given by [20]

$$P_r = \frac{P_t G_t G_r \lambda^2}{(4\pi)^2 d^2}, \tag{1.6}$$

where $P_t$ is the transmitter power, $G_t$ is the transmitter antenna gain, $G_r$ is the receiver antenna gain, $d$ is the transmitter-receiver separation distance in meters, and $\lambda$ is the wavelength in meters. This implies that the received power decays at a rate of 20 dB/decade with distance. The Path Loss (PL) is defined as the difference between the effective transmitted power and the received power in dB, and for free space is given by

$$PL = 10\log_{10}\frac{P_t}{P_r} = -10\log_{10}\left[\frac{G_t G_r \lambda^2}{(4\pi)^2 d^2}\right]. \tag{1.7}$$

Although Eq. (1.6) applies only to free space, in other environments the received power decays with the distance raised to some power $n$, i.e.

$$P_r(d) = P_r(d_0)\,(d/d_0)^{-n}, \tag{1.8}$$

where $P_r(d_0)$ is the received power at the reference distance $d_0$. Eq. (1.8) states that the path loss is proportional to the $n$th power of $d/d_0$. The value of $n$ depends on the specific propagation environment and ranges from 2 to 5. Path loss is also referred to as large-scale fading.

Shadow Fading: Slow fading, also known as log-normal shadow fading, is a result of diffraction and shadowing of the transmitted signal caused by large objects such as buildings, hills, cars, mountains, or other terrain configurations in the mobile wireless environment. The shadow fading follows a log-normal distribution (Gaussian in dB) with zero mean and variance $\sigma^2$. Accordingly, if $K$ denotes the effect of such fading, Eq. (1.8) is changed to

$$P_r(d) = K\,P_r(d_0)\,(d/d_0)^{-n}. \tag{1.9}$$

The values of $n$ and $\sigma$ are computed from measured data, using linear regression and mean square error methods. Typical values in urban cellular wireless environments are $n = 2.7$ and $\sigma = 11.8$ dB [20].

Multipath Fading: In mobile radio communications, fading occurs due to reflection from scatterers or from larger objects.
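As a brief numerical aside on the large-scale model, Eqs. (1.6)-(1.9) above can be evaluated with a few lines of Python. This is only a sketch: the function names, the 2.4 GHz carrier, and the unit antenna gains are assumptions made for illustration, and the path loss exponent is applied as $(d/d_0)^{-n}$ so that the received power decays with distance.

```python
import math

def free_space_received_power(p_t, g_t, g_r, wavelength, d):
    """Eq. (1.6): received power (linear units) at distance d in free space."""
    return p_t * g_t * g_r * wavelength ** 2 / ((4 * math.pi) ** 2 * d ** 2)

def path_loss_db(p_t, p_r):
    """Eq. (1.7): path loss in dB, PL = 10 log10(Pt / Pr)."""
    return 10 * math.log10(p_t / p_r)

def log_distance_received_power(p_r_d0, d0, d, n, k_shadow=1.0):
    """Eqs. (1.8)-(1.9): power at d from the reference power at d0.

    n is the path loss exponent; k_shadow is the shadowing factor K
    (K = 1 means no shadowing).
    """
    return k_shadow * p_r_d0 * (d / d0) ** (-n)

# Hypothetical link: 2.4 GHz carrier, unit gains, unit transmit power.
WAVELENGTH = 3e8 / 2.4e9
p_100 = free_space_received_power(1.0, 1.0, 1.0, WAVELENGTH, 100.0)
p_1000 = free_space_received_power(1.0, 1.0, 1.0, WAVELENGTH, 1000.0)

# Extra loss when the distance grows by one decade (100 m -> 1000 m).
extra_loss_db = path_loss_db(1.0, p_1000) - path_loss_db(1.0, p_100)

# The log-distance model with n = 2 should reproduce the free-space value.
p_1000_model = log_distance_received_power(p_100, 100.0, 1000.0, n=2.0)

print(f"extra loss per decade: {extra_loss_db:.2f} dB")
```

Setting `n=2.7` in `log_distance_received_power` gives the steeper decay quoted above for urban cellular environments.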
Reflection and diffraction by buildings and other urban obstacles cause the receiver to receive a number of copies of the transmitted signal. When the signals from the various paths sum constructively at the BS antenna, the received signal level is enhanced. A serious condition occurs when the multipath signals, i.e. the transmitted signal arriving via many paths, effectively sum to a small value. When this happens the received signal is said to be in a fade, and the phenomenon is called multipath fading. The delay spread of the channel may be considered as the length of the received pulse when an impulse is transmitted through the channel [7,20].

A scatterer is a small object that reflects a wireless signal. If no object blocks the direct path from the transmitter to the receiver, a Line of Sight (LOS) component might exist as well. If the scatterers and the mobile move relative to each other, the received power fluctuates; that is, the wireless channel causes time varying fading. This variation in the channel response gives rise to random frequency modulation due to the different Doppler shifts in each path. Because of the inherent randomness in the phase, amplitude, and time delay of the different multipath components, they may combine at the receiver either constructively or destructively.

A scatterer could be local to the transmitter, local to the receiver, or far from both of them. Local scatterers, along with the movement of the mobile, cause Doppler spread, or time selective fading. The delay spread due to local scattering is negligible. Scatterers local to the receiver cause angle spread, or space selective fading. Both angle and delay spread could be caused by remote scatterers; they cause frequency selective fading. The impulse response of a frequency selective fading channel has a multipath delay spread $\tau$ that is greater than the time duration $T_s$ of the transmitted signal waveform, i.e., $\tau > T_s$.
A signal transmitted through a wireless channel undergoes frequency selective fading if $T_s < \tau$. Depending on the signal bandwidth, symbol period, channel delay spread, and Doppler spread, the signal may experience different types of fading. If we transmit data at a slow rate, the data can easily be resolved at the receiver, and the fading is frequency non-selective, or flat. This is because the extension of a data pulse due to the multipath is completed before the next pulse is transmitted. However, if we increase the data transmission rate, a point will be reached where each data symbol significantly spreads into adjacent symbols, a phenomenon known as Inter-Symbol Interference (ISI). This phenomenon is illustrated in Fig. 1.5. ISI without equalization results in a very high Bit Error Rate (BER). In the frequency domain, this means that certain frequency components in the received signal spectrum have greater gains than others. In other words, some frequency components of the transmitted signal are attenuated severely.

Frequency selective fading channels are much more difficult to model than flat fading channels. For frequency selective fading, the spectrum $S(f)$ of the transmitted signal has a bandwidth which is greater than the channel coherence bandwidth $B_C$. The channel becomes frequency selective if the wireless channel gain is different for different frequency components. Frequency selective fading is caused by multipath delays which approach or exceed the symbol period of the transmitted signal. Frequency selective channels are also known as wideband channels, since the bandwidth of the signal $s(t)$ is wider than the bandwidth of the channel impulse response $h(\tau, t)$. As time varies, the channel varies in gain and phase across the spectrum of $s(t)$, resulting in time varying distortion in the received signal $r(t)$. For flat fading, the relative frequency components of the signal stay unchanged, while the time domain signal undergoes amplitude variation.
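The distinction between flat and frequency selective fading described above can be illustrated with a small numerical sketch. It assumes a hypothetical static two-path channel (a direct path plus one echo of relative amplitude 0.5 delayed by 1 microsecond) and measures how much the channel gain varies across the signal bandwidth, taken here as $1/T_s$. The function names and parameter values are illustrative only.

```python
import cmath
import math

def channel_gain_db(f, delays, amps):
    """Gain in dB at frequency f of the multipath channel sum_l a_l * delta(t - tau_l)."""
    h = sum(a * cmath.exp(-2j * math.pi * f * tau) for a, tau in zip(amps, delays))
    return 20 * math.log10(abs(h))

def gain_spread_db(symbol_period, delays, amps, points=201):
    """Max minus min channel gain (dB) over a band of width 1/Ts centred at 0 Hz."""
    bw = 1.0 / symbol_period
    gains = [channel_gain_db(-bw / 2 + bw * k / (points - 1), delays, amps)
             for k in range(points)]
    return max(gains) - min(gains)

# Hypothetical two-path channel: delay spread tau = 1 microsecond.
DELAYS = [0.0, 1e-6]
AMPS = [1.0, 0.5]

# Slow signalling: Ts = 100 us >> tau, so the gain is nearly constant (flat fading).
flat_spread = gain_spread_db(100e-6, DELAYS, AMPS)

# Fast signalling: Ts = 1 us, comparable to tau, so the gain varies strongly
# across the band (frequency selective fading).
selective_spread = gain_spread_db(1e-6, DELAYS, AMPS)

print(f"gain spread, Ts = 100 us: {flat_spread:.4f} dB")
print(f"gain spread, Ts = 1 us:   {selective_spread:.2f} dB")
```

With the slow symbol rate the gain changes by a tiny fraction of a dB across the band, whereas with the fast rate it swings between the constructive sum (1.5) and the destructive sum (0.5) of the two paths.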
Figure 1.5: The effect of data rate on ISI.

Most commonly, the amplitude of the received signal under flat fading is modelled according to a Rayleigh distribution. The pdf of a Rayleigh distributed random variable $x$ is

$$f(x) = \frac{x}{\sigma^2}\exp\left(-\frac{x^2}{2\sigma^2}\right). \tag{1.10}$$

If the Doppler spread is larger than the signal bandwidth, the channel impulse response will vary rapidly during the symbol period, and the signal will undergo fast fading. If the Doppler spread is much less than the bandwidth of the signal, the channel is essentially static during the symbol period. In this case the signal undergoes slow fading.

The amplitude of the received signal in frequency selective fading depends on whether or not there is a non-fading LOS component. If there is no such component, the amplitude is Rayleigh distributed (Eq. (1.10)); otherwise it has a Ricean distribution

$$f(x) = \frac{x}{\sigma^2}\exp\left(-\frac{x^2 + A^2}{2\sigma^2}\right) I_0\!\left(\frac{Ax}{\sigma^2}\right) u(x), \tag{1.11}$$

where $I_0$ is the modified Bessel function of the first kind and zero order, $A$ denotes the peak amplitude of the dominant signal, and $u(x)$ is the step function. As the amplitude of the dominant path decreases, the Ricean distribution approaches a Rayleigh distribution.

1.9.1 Wireless Channel Model

Let the transmitted signal be

$$s(t) = e^{j2\pi f_c t}\sum_n b_n\, g(t - nT), \tag{1.12}$$

where $f_c$ is the carrier frequency, $b_n$ is the $n$th transmitted symbol, $T$ is the symbol duration, and $g(\cdot)$ is the pulse shaping waveform. An example of a pulse shaping waveform is the rectangular pulse. Another example is the raised cosine pulse, given by

$$g(t) = \frac{\sin(\pi t/T)}{\pi t/T}\left[\frac{\cos(\alpha\pi t/T)}{1 - (2\alpha t/T)^2}\right], \tag{1.13}$$

where $\alpha$ is the roll-off factor, which sets the excess bandwidth used to avoid ISI. The received signal $\tilde{r}(t)$ is

$$\tilde{r}(t) = \sqrt{\xi(t)}\sum_n\sum_{l=1}^{L}\sqrt{G_l\,\alpha_l}\; b_n\, g(t - \tau_l - nT)\, e^{j[2\pi(f_c + f_d\cos\phi_l)t - 2\pi f\tau_l]} + \tilde{n}(t), \tag{1.14}$$

where $v$ is the speed of the mobile, $\phi_l$ is the direction of the $l$th propagation path, $f_d = v/\lambda$ is the maximum Doppler frequency, $\lambda$
is the wavelength, $L$ is the number of paths, $G_l$ represents the path loss, $\alpha_l$ is the $l$th path fading, $\tau_l$ is the propagation path delay, $\xi(t)$ is the shadow fading component, and $\tilde{n}(t)$ is the thermal noise. The delay spread is $\tau = \max_l \tau_l - \min_l \tau_l$. In the small spread case, $\tau$ is much smaller than the symbol period, or $\tau \ll 1/B$, where $B$ is the bandwidth of the signal. In this case, the different replicas of the pulse shaping waveform can be considered the same, and therefore the baseband equivalent of (1.14) can be approximated by

$$\tilde{r}(t) = \sqrt{\xi(t)\,G}\; e^{-j2\pi f\tau_0}\sum_n b_n\, g(t - \tau_0 - nT)\sum_{l=1}^{L}\sqrt{\alpha_l}\; e^{j2\pi f_d\cos(\phi_l)\,t} + \tilde{n}(t), \tag{1.15}$$

where $\tau_0$ is an approximation for the delay, and we have assumed the same path loss $G$ for all paths. From (1.14), the impulse response of a wireless wideband channel, $h(\tau, t)$, is represented by

$$h(\tau, t) = \sqrt{\xi(t)}\sum_{l=1}^{L}\sqrt{G_l\,\alpha_l(t)}\;\delta(\tau - \tau_l)\, e^{j[2\pi(f_d\cos\phi_l)t - 2\pi f\tau_l]}. \tag{1.16}$$

In the case of flat fading (small spread), the impulse response is given by

$$h(\tau, t) = \sqrt{\xi(t)\,G}\;\delta(\tau - \tau_0)\, e^{-j2\pi f\tau_0}\sum_{l=1}^{L}\sqrt{\alpha_l}\; e^{j2\pi(f_d\cos\phi_l)t}. \tag{1.17}$$

1.10 Multiple Transmit and Receive Antenna

A smart antenna system combines multiple antenna elements with a signal-processing capability to optimize its radiation and/or reception pattern automatically in response to the signal environment. Antennas have been the most neglected of all the components in personal communications systems. Yet the manner in which energy is distributed into and collected from the surrounding space has a profound influence on the efficient use of spectrum, the cost of establishing new networks, and the service quality provided by those networks. In this section, we briefly describe the essential concepts of smart antenna systems and their important advantages over conventional omnidirectional approaches.

Omnidirectional Antennas: Since the early days of wireless communications, there has been the simple dipole antenna, which radiates and receives equally well in all directions.
To find its users, this single-element design broadcasts omni-directionally in a pattern resembling ripples radiating outward in a pool of water. While adequate for simple RF environments where no specific knowledge of the users' whereabouts is available, this unfocused approach scatters signals, reaching desired users with only a small percentage of the overall energy sent out into the environment. This strategy reduces the spectral efficiency and limits frequency reuse.

Directional Antennas: A single antenna can also be constructed to have certain fixed preferential transmission and reception directions. This can be done by sectorizing the 360° area of the cell into three 120° subdivisions, each of which is covered by one directional antenna. Sector antennas provide increased gain over a restricted range of azimuths compared to an omnidirectional antenna. This is commonly referred to as antenna element gain and should not be confused with the processing gains associated with smart antenna systems. While sectorized antennas multiply the use of channels, they do not overcome the major disadvantages of standard omnidirectional antenna broadcast, such as cochannel interference.

Smart Antenna: Instead of one or more directional antennas, a set of antenna elements can form an antenna system designed to shift the signals before transmission at each of the successive elements, so that the array has a composite effect. This concept is known as the phased array antenna. The array can be exploited by creating sectorized antenna systems, which take a traditional cellular area and subdivide it into sectors that are covered using directional antennas looking out from the same base station location. The system can also incorporate two antenna elements at the base station, whose slight physical separation (space diversity) has been used historically to improve reception by counteracting the negative effects of multipath.
Space diversity offers an improvement in the effective strength of the received signal by using either switched diversity (each antenna points in one direction and the system continually switches between them) or diversity combining, in which the power of the signals coming from the two paths is combined effectively to produce gain. Maximal Ratio Combining (MRC) is another kind of combining, in which the outputs of all the antennas are combined to maximize the ratio of the combined received signal energy to noise.

As a matter of fact, antennas are not smart; antenna systems are smart. Generally, a smart antenna system combines an antenna array with a digital signal-processing capability to transmit and receive in an adaptive, spatially sensitive manner. In other words, such a system can automatically change the directionality of its radiation patterns in response to its signal environment. This can dramatically increase the performance characteristics (such as capacity) of a wireless system [23]. There are two major categories of smart antennas regarding the choice of transmit strategy. The first is the switched beam, with a finite number of fixed, predefined patterns or combining strategies (sectors). The second is the adaptive antenna array, with an infinite number of patterns that are adjusted in real time. Using a variety of new signal-processing algorithms, the adaptive system takes advantage of its ability to effectively locate and track various types of signals to dynamically minimize interference and maximize intended signal reception.

Figure 1.6: Narrowband antenna array and beamformer.

Both adaptive and switched systems attempt to increase gain according to the location of the user; however, only the adaptive system provides optimal gain while simultaneously identifying, tracking, and minimizing interfering signals.
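The difference between switched (selection) diversity and Maximal Ratio Combining mentioned above can be made concrete with a short sketch. It assumes a standard flat fading branch model, $y_k = h_k s + n_k$ with equal noise variance on every branch, which is not spelled out in the text; the function names and the channel values are hypothetical.

```python
def mrc_combine(received, channel):
    """Weight each branch by the conjugate of its channel gain and add the branches."""
    return sum(h.conjugate() * y for h, y in zip(channel, received))

def mrc_output_snr(channel, noise_var, symbol_power=1.0):
    """Post-combining SNR of MRC: the sum of the per-branch SNRs."""
    return symbol_power * sum(abs(h) ** 2 for h in channel) / noise_var

def selection_output_snr(channel, noise_var, symbol_power=1.0):
    """SNR when only the strongest branch is used (switched/selection diversity)."""
    return symbol_power * max(abs(h) ** 2 for h in channel) / noise_var

# Hypothetical two-branch example, noiseless for clarity.
channel = [0.8 + 0.3j, -0.2 + 0.5j]
symbol = 1 + 0j
received = [h * symbol for h in channel]

# Dividing by the total channel energy recovers the transmitted symbol.
channel_energy = sum(abs(h) ** 2 for h in channel)
estimate = mrc_combine(received, channel) / channel_energy

noise_var = 0.1
snr_mrc = mrc_output_snr(channel, noise_var)
snr_sel = selection_output_snr(channel, noise_var)
print(f"MRC SNR: {snr_mrc:.2f}, selection SNR: {snr_sel:.2f}")
```

Under this model MRC never does worse than picking the best branch, because it adds the energy of every branch coherently.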
The benefits of adaptive antenna arrays are: signal gain, and therefore a better SNR; increased transmission coverage area, obtained by focusing the energy in specific directions; interference rejection; spatial diversity, which minimizes the detrimental effects of multipath fading; power efficiency; and reduced expense. Moreover, by exploiting an adaptive antenna array, we can develop techniques such as determining the direction and location of a transmitter, using MUltiple SIgnal Classification (MUSIC) [24] and Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) [25]. Other than these applications, multiple antennas are used in three major scenarios: beamforming, space-time coding (to gain diversity), and spatial multiplexing.

Figure 1.7: Wideband antenna array and beamformer.

1.10.1 Beamforming

In receive beamforming, the antenna array is used to receive signals radiating from some specific directions and to attenuate signals radiating from other directions of no interest. Fig. 1.6 shows the narrowband beamformer, in which the beamforming is performed at one time instant, and Fig. 1.7 depicts the wideband beamformer. The outputs of the array elements are weighted and added by a beamformer, as shown in Fig. 1.6, to place nulls in the directions of the sources of interference and to steer toward the direction of the target signal by maintaining constant gain in this direction. Assuming a Uniform Linear Array (ULA) as shown in Fig. 1.8, if the number of antenna elements is $K$, $d$ is the separation between adjacent antennas, and $\theta$ is the direction of arrival of a signal, the response of the $i$th antenna element, $\psi_i(\theta)$, is

$$\psi_i(\theta) = e^{-j2\pi f_c\, i\, d\sin\theta / c}, \qquad i = 0, 1, \ldots, K-1, \tag{1.18}$$

where $c$ is the propagation velocity and $f_c$ is the carrier frequency. The antenna array response is therefore $\boldsymbol{\psi}(\theta) = [\psi_0(\theta), \psi_1(\theta), \ldots, \psi_{K-1}(\theta)]$.

Figure 1.8: Uniform Linear Array.

Now consider a cochannel set consisting of $M$ transmitter and receiver pairs.
Let receiver $i$ be assigned to transmitter $i$. Assuming negligible delay spreads and a slow fading channel, where the channel response can be assumed constant over several symbol intervals, the received vector at the $i$th array can be written as

$$\mathbf{x}_i(t) = \sum_{j=1}^{M}\sqrt{P_j G_{ji}}\sum_{l=1}^{L}\alpha_{ji}^{l}\,\boldsymbol{\psi}_j(\theta_l)\, s_j(t - \tau_j) + \mathbf{n}_i(t), \tag{1.19}$$

where $s_j(t)$ is the message signal transmitted from the $j$th user, $\tau_j$ is the corresponding time delay, $\mathbf{n}_i(t)$ is the thermal noise vector at the input of the antenna array at the $i$th receiver, and $P_j$ is the power of the $j$th transmitter. The $K \times 1$ vector $\mathbf{a}_{ji}$, given by

$$\mathbf{a}_{ji} = \sum_{l=1}^{L}\alpha_{ji}^{l}\,\boldsymbol{\psi}_j(\theta_l), \tag{1.20}$$

is called the spatial signature, the array response, or the steering vector of the $i$th receiver to the $j$th source.

The received signal at receiver $i$, after matched filtering and sampling at the symbol intervals, is given by

$$\mathbf{x}_i(n) = \sum_{j=1}^{M}\sqrt{P_j G_{ji}}\;\mathbf{a}_{ji}\, b_j(n) + \mathbf{n}_i(n). \tag{1.21}$$

The output of the beamformer and the average output power at the $i$th receiver are given by

$$e_i(n) = \mathbf{w}_i^H\mathbf{x}_i(n), \qquad \varepsilon_i = \mathbf{w}_i^H E\{\mathbf{x}_i(n)\mathbf{x}_i^H(n)\}\mathbf{w}_i = \mathbf{w}_i^H\mathbf{R}_i\mathbf{w}_i, \tag{1.22}$$

where $\mathbf{R}_i$ is the correlation matrix of the received vector $\mathbf{x}_i(n)$. Assuming that the message signals $s_j(t)$ are uncorrelated and zero mean, the correlation matrix $\mathbf{R}_i$ is given by

$$\mathbf{R}_i = \left[\sum_{j\neq i} P_j G_{ji}\,\mathbf{a}_{ji}\mathbf{a}_{ji}^H + N_i\mathbf{I}_K\right] + P_i G_{ii}\,\mathbf{a}_{ii}\mathbf{a}_{ii}^H, \tag{1.23}$$

in which the term inside the brackets is the received interference plus noise, and the second term is the energy of the desired signal. The goal of beamforming is to find a weight vector $\mathbf{w}_i$ that minimizes the total received energy (Eq. (1.22)) subject to a constant response toward the desired signal ($\mathbf{w}_i^H\mathbf{a}_{ii} = 1$). Since the desired signal passes through unchanged, this optimization problem effectively minimizes the interference plus noise, and is called Minimum Variance Distortionless Response (MVDR) beamforming. MVDR is equivalent to placing main lobes toward the desired mobile and nulls toward the interferers.
It can be shown that the unique solution to the MVDR problem is given by [26]

$$\mathbf{w}_i = \frac{\mathbf{R}_i^{-1}\mathbf{a}_{ii}}{\mathbf{a}_{ii}^H\mathbf{R}_i^{-1}\mathbf{a}_{ii}}. \tag{1.24}$$

The antenna gain for the signal of interest is unity. That is, the desired signal is unaffected by beamforming.

Here we assumed that the array response to the source of interest, given by (1.20), is known. The array response can be obtained by estimating the Direction Of Arrival (DOA) of the signal of interest from the different paths. In wireless networks, the number of cochannel and multipath signals is usually much larger than the number of array elements. As a result, conventional DOA estimation methods like ESPRIT and MUSIC are not applicable. However, there exist schemes that can be used to estimate the array response in non-spread spectrum [27] and spread spectrum systems [28] without the need to estimate the DOA. Further, if we use a training sequence, there is no need to estimate the array response.

Transmit beamforming is performed by finding the antenna weight vectors such that (1) the transmitted energy toward the desired mobile is maximized, and (2) the transmitted energy toward other mobiles is minimized to avoid interference. Note that a base station may transmit to more than one mobile with different beamforming weight vectors. Denote the transmit diversity vector for the $i$th mobile by $\mathbf{v}_i$. The received signal at each mobile is a superposition of the transmitted signals from the different base stations and their delayed versions through the multipath channel. The $k$th received symbol at the $i$th mobile is then given by

$$y_i(k) = \sum_{b=1}^{M}\sum_{n=0}^{N-1}\mathbf{v}_b^H\mathbf{h}_{ib}(n)\sqrt{P_b}\; s_b(k - n) + n_i(k), \tag{1.25}$$

where $s_b$ is the message signal transmitted from the $b$th base station to its associated mobile, $\mathbf{h}_{ib}(n)$ is the $n$th tap of the multipath channel from the $b$th base station to the $i$th mobile, and $n_i(k)$ is the thermal noise at the $i$th mobile. $P_b$ can be considered as the signal power before the beamformer. Instead of absorbing this factor into the beamforming weight vector, we use it to adjust the level of the transmit power.
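Returning to the receive side, the MVDR solution of Eq. (1.24) can be checked numerically on a toy scenario. The sketch below is not the procedure used in this dissertation; it assumes a hypothetical 4-element ULA with half-wavelength spacing, a single-path spatial signature for each source (so the signature equals the array response of Eq. (1.18), with $f_c d / c$ written as the spacing in wavelengths), and one strong cochannel interferer. The small linear solver is included only to keep the example free of external libraries.

```python
import cmath
import math

def ula_steering(k_elems, spacing_wavelengths, theta_rad):
    """ULA response: element i sees phase -2*pi*i*(d/lambda)*sin(theta)."""
    return [cmath.exp(-2j * math.pi * i * spacing_wavelengths * math.sin(theta_rad))
            for i in range(k_elems)]

def correlation_matrix(signatures, powers, noise_power):
    """R = sum_j P_j a_j a_j^H + N I (path gains folded into the powers)."""
    k = len(signatures[0])
    r = [[(noise_power if m == n else 0.0) + 0j for n in range(k)] for m in range(k)]
    for a, p in zip(signatures, powers):
        for m in range(k):
            for n in range(k):
                r[m][n] += p * a[m] * a[n].conjugate()
    return r

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def inner(w, a):
    """Hermitian inner product w^H a."""
    return sum(wi.conjugate() * ai for wi, ai in zip(w, a))

def mvdr_weights(r, a_desired):
    """Eq. (1.24): w = R^{-1} a / (a^H R^{-1} a)."""
    r_inv_a = solve(r, a_desired)
    denom = inner(a_desired, r_inv_a)
    return [x / denom for x in r_inv_a]

# Hypothetical scenario: desired user at +10 degrees, interferer at -40 degrees.
a_des = ula_steering(4, 0.5, math.radians(10.0))
a_int = ula_steering(4, 0.5, math.radians(-40.0))
R = correlation_matrix([a_des, a_int], powers=[1.0, 10.0], noise_power=0.01)
w = mvdr_weights(R, a_des)

gain_desired = abs(inner(w, a_des))      # distortionless constraint: unity gain
gain_interferer = abs(inner(w, a_int))   # deep null toward the interferer
print(f"|w^H a_desired| = {gain_desired:.6f}")
print(f"|w^H a_interferer| = {gain_interferer:.2e}")
```

The printed gains show the two properties stated above: the response toward the desired signature stays at one, while the response toward the interferer is driven close to zero.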
Similar to the receive diversity case, we can show that the desired signal power at the $i$th receiver is given by $P_i\mathbf{v}_i^H\mathbf{G}_{ii}\mathbf{v}_i$, and the interference power from the $b$th base station is given by $P_b\mathbf{v}_b^H\mathbf{G}_{ib}\mathbf{v}_b$, where $\mathbf{G}_{ii}$ and $\mathbf{G}_{ib}$ are the channel gain matrices. The SINR at this receiver is given by

$$\gamma_i = \frac{P_i\mathbf{v}_i^H\mathbf{G}_{ii}\mathbf{v}_i}{\sum_{b\neq i} P_b\mathbf{v}_b^H\mathbf{G}_{ib}\mathbf{v}_b + N_i}, \tag{1.26}$$

where $N_i$ is the thermal noise power at the $i$th mobile. Again, the transmit weight vectors $\mathbf{v}_i$ could be chosen such that the SINR at the desired mobile is maximized.

1.10.2 Space-Time Coding

The design of codes for multiple-antenna systems (space-time codes [29]) is attracting considerable attention. Assume that a space-time code is used with block length $N$, with $t$ transmit and $r$ receive antennas. The transmitted signal is represented by the $t \times N$ matrix $X \triangleq (x[1], \ldots, x[N])$. The code, which we denote $\mathcal{X}$, has $|\mathcal{X}|$ words. The row index of $X$ indicates space, while the column index indicates time: the $i$th component of the $t$-vector $x[n]$, denoted $x_i[n]$, is a complex number describing the signal transmitted by the $i$th antenna at discrete time $n$ (for $i = 1, \ldots, t$; $n = 1, \ldots, N$). The received signal is the $r \times N$ matrix $Y = HX + Z$, where $Z$ is a matrix of zero-mean complex Gaussian random variables with independent real and imaginary parts of the same variance $N_0/2$ (i.e., circularly distributed). Thus, the noise affecting the received signal is spatially and temporally independent, with $E[ZZ^H] = NN_0 I_r$, where $I_r$ denotes the identity matrix of size $r$. The channel is described by the $r \times t$ matrix $H$, which is independent of both $X$ and $Z$. Assuming a quasi-static channel, $H$ remains constant during the transmission of an entire codeword, and its realization (the channel state information, or CSI) is known at the receiver. Notice that the variance of the elements of $H$ is chosen such that the total power received by the $r$ antennas from each transmit antenna remains constant as $r$ varies.
Under the assumption of CSI perfectly known by the receiver, and of additive white Gaussian noise, maximum likelihood (ML) detection and decoding corresponds to choosing the codeword $X$ which minimizes the squared Frobenius norm $\|Y - HX\|^2$, where for a matrix $A$ we define $\|A\|^2 = \mathrm{Tr}(AA^H)$. That is, ML detection and decoding corresponds to the minimization of the quantity

$$\|Y - HX\|^2 = \sum_{n=1}^{N}\sum_{i=1}^{r}\left|y_{in} - \sum_{j=1}^{t} h_{ij}x_{jn}\right|^2. \tag{1.27}$$

Using the union bound on the error probability,

$$P(e) \le \frac{1}{|\mathcal{X}|}\sum_{X\in\mathcal{X}}\;\sum_{\tilde{X}\in\mathcal{X}\setminus\{X\}} P(X \to \tilde{X}), \tag{1.28}$$

the pairwise error probability $P(X \to \tilde{X})$ is given by

$$P(X \to \tilde{X}) = E\left[Q\left(\frac{\|H(X - \tilde{X})\|}{\sqrt{2N_0}}\right)\right] \le E\left[\exp\left(-\|H(X - \tilde{X})\|^2/4N_0\right)\right]. \tag{1.29}$$

Since $\|H(X - \tilde{X})\|^2 = \mathrm{Tr}\left(H^H H (X - \tilde{X})(X - \tilde{X})^H\right)$, and using the properties of Rayleigh flat fading channels, we can change the inequality to

$$P(X \to \tilde{X}) \le \det\left[I_t + (X - \tilde{X})(X - \tilde{X})^H/4N_0\right]^{-r}. \tag{1.30}$$

By writing the RHS of this inequality in terms of the product of the eigenvalues $\lambda_j$ of the matrix $(X - \tilde{X})(X - \tilde{X})^H$, we have

$$P(X \to \tilde{X}) \le \left(\prod_{j=1}^{\nu}\lambda_j\right)^{-r}\left(\frac{\rho}{4}\right)^{-r\nu}, \tag{1.31}$$

where $\rho$ is the signal-to-noise ratio and $\nu$ is the number of nonzero eigenvalues. From this expression we see that the total diversity order of the coded system is $r\nu_{\min}$, where $\nu_{\min}$ is the minimum rank of $(X - \tilde{X})(X - \tilde{X})^H$ across all possible pairs $X$, $\tilde{X}$ (diversity gain). Moreover, the pairwise error probability depends on the $r$th power of the product of the eigenvalues of $(X - \tilde{X})(X - \tilde{X})^H$. This term does not depend on the SNR $\rho$, and displaces the error probability curve rather than changing its slope. This is called the coding gain.

In a rapidly changing mobile environment, or when long training sequences are not allowed, the assumption of perfect CSI at the receiver may not be valid. In the absence of CSI at the receiver, [30] advocates unitary space-time modulation, a technique which circumvents the use of training symbols (which for maximum throughput should occupy half of the transmission interval, as seen before).
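The rank and eigenvalue quantities that determine the diversity and coding gains in Eqs. (1.30) and (1.31) can be computed directly for small codes. The sketch below compares two hypothetical $2 \times 2$ codes for $t = 2$ transmit antennas: the Alamouti code, and a baseline in which both antennas repeat the same symbols. Neither code is discussed in this section; they are chosen only because the eigenvalues of a $2 \times 2$ Hermitian matrix have a closed form. All function names and the values of $r$ and $N_0$ are assumptions.

```python
import math

def alamouti(s1, s2):
    """2x2 Alamouti codeword: rows are antennas, columns are time slots."""
    return [[s1, -s2.conjugate()], [s2, s1.conjugate()]]

def repetition(s1, s2):
    """Hypothetical baseline: both antennas transmit the same symbols."""
    return [[s1, s2], [s1, s2]]

def difference_gram(x, x_tilde):
    """A = (X - X~)(X - X~)^H for 2x2 codewords."""
    d = [[x[i][n] - x_tilde[i][n] for n in range(2)] for i in range(2)]
    return [[sum(d[i][n] * d[k][n].conjugate() for n in range(2)) for k in range(2)]
            for i in range(2)]

def hermitian_2x2_eigenvalues(a):
    """Closed-form eigenvalues of a 2x2 Hermitian matrix."""
    half_trace = (a[0][0] + a[1][1]).real / 2
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    disc = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return [half_trace + disc, half_trace - disc]

def rank_and_bound(x, x_tilde, r, n0, tol=1e-9):
    """Rank of A and the bound of Eq. (1.30), prod_j (1 + lambda_j / (4 N0))^(-r)."""
    eigs = hermitian_2x2_eigenvalues(difference_gram(x, x_tilde))
    rank = sum(1 for lam in eigs if lam > tol)
    bound = 1.0
    for lam in eigs:
        bound *= (1 + lam / (4 * n0)) ** (-r)
    return rank, bound

# Two BPSK symbol pairs that differ only in the first symbol.
pair, pair_tilde = (1 + 0j, 1 + 0j), (-1 + 0j, 1 + 0j)
R_ANT, N0 = 2, 0.1  # hypothetical number of receive antennas and noise level

rank_a, bound_a = rank_and_bound(alamouti(*pair), alamouti(*pair_tilde), R_ANT, N0)
rank_r, bound_r = rank_and_bound(repetition(*pair), repetition(*pair_tilde), R_ANT, N0)

print(f"Alamouti:   rank {rank_a}, diversity {R_ANT * rank_a}, bound {bound_a:.2e}")
print(f"Repetition: rank {rank_r}, diversity {R_ANT * rank_r}, bound {bound_r:.2e}")
```

For this codeword pair the Alamouti difference matrix has full rank, so the diversity order is $2r$, whereas the repetition baseline has rank one and only achieves diversity order $r$.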
In unitary space-time modulation, the information is carried on the subspace that is spanned by the orthonormal signals that are sent to the transmit antennas. This subspace survives multiplication by the unknown channel-gain matrix. A scheme based on differential unitary space-time signals using algebraic techniques is presented in [31].

1.10.3 Spatial Multiplexing

The basic principle of spatial multiplexing is to transmit independent data from each antenna, albeit with FEC coding. At the receiver, the data from each antenna is then separated by appropriate signal processing, usually involving a combination of linear decorrelation/MMSE detection and non-linear interference cancellation. In BLAST (Bell Labs Layered Space-Time), unlike beamforming techniques, we assume a rich scattering environment and use multiple transmitters and receivers, each with its own antenna carrying independent data. All the transmitted signals occupy the same bandwidth simultaneously, so the spectral efficiency is roughly proportional to the number of streams. Here the data is demultiplexed into as many separate streams as there are transmit antennas, each of which is separately coded and fed to a separate antenna. However, the mapping of code stream to antenna is cyclically rotated every few code symbols. In this way, if any particular transmit antenna is subject to particularly severe fading, only those symbols are affected, and the code should be able to overcome it. At the receiver, BLAST uses a combination of linear and nonlinear (joint MMSE) detection techniques to disentangle the mutually interfering co-channel signals. The results of decoding one stream are used in succession to remove interference from the others, while interference from those streams which cannot be cancelled (because they have not been decoded yet) is minimized in the MMSE algorithm.

Using this approach, the aggregate theoretical capacity of the subchannels can considerably exceed the capacity obtained when the channel is treated conventionally, i.e.
as a single (scalar) channel. Under the assumption of independent Rayleigh scattering, the information-theoretic capacity of the BLAST architecture grows roughly linearly with the number of antennas, even when the total transmitted power is held constant [32]. This translates into a tremendous capacity improvement. In effect, Foschini showed that in the high-SNR regime, the capacity of a channel with $t$ transmit antennas, $r$ receive antennas, and i.i.d. Rayleigh-faded gains between each antenna pair is given by

\[ C(\gamma) = \min\{t, r\} \log \gamma + O(1), \tag{1.32} \]

where $\gamma$ is the SNR of the environment.

However, there are a number of technical issues to be addressed before BLAST can be deployed in a mobile wireless cellular environment. First, both transmitter and receiver are required to have multiple antennas, which increases the size and cost of mobile devices. Second, BLAST assumes rich scattering environments, which may not always exist outdoors. Finally, BLAST requires computationally intensive processing.

There are different versions of BLAST, namely Diagonal BLAST (D-BLAST), where each successive block of information is sent from a different antenna, Vertical BLAST (V-BLAST), in which one data stream is sent from each transmit antenna, ... Here, we briefly describe Zero Forcing V-BLAST for $r \ge t$, where $r$ and $t$ are the number of receive and transmit antennas, respectively [33]. The channel matrix $H$ is first decomposed as $H = QR$ using the QR factorization, where $R$ is an upper triangular $t \times t$ matrix (normalized so that the diagonal elements are positive), and $Q$ is an $r \times t$ matrix with $Q^H Q = I_t$ (notice that if $r = t$ then $Q$ becomes a unitary matrix, i.e., $Q^H Q = Q Q^H = I_t$). Observe that this factorization implies an ordering of the transmit antennas, which can be performed in $t!$ ways. The receiver uses the feed-forward filter matrix $Q$ to obtain

\[ \tilde{Y} \triangleq Q^H Y = Q^H (QRX + Z) = RX + \tilde{Z}, \tag{1.33} \]

where $\tilde{Z}$ is a $t \times N$ matrix whose elements have the same distribution as those of the noise matrix $Z$. If interface processing were stopped at this stage (i.e., no cancellation took place), the metric would be equivalent to ML:

\[ \|\tilde{Y} - RX\|^2 = \|Q^H Y - Q^H H X\|^2 = \|Y - HX\|^2. \tag{1.34} \]

Further processing (the cancellation step) by the ZF-BLAST interface is done by the (nonlinear) feedback filter. This removes the remaining spatial interference resulting from the off-diagonal terms of $R$, which is achieved by decoding the subcode transmitted by antenna $t$ first, then subtracting its decoded values from the signal received from antenna $t-1$, and so on. Specifically, we first obtain the estimate $\hat{x}_t$ of $x_t$ by decoding $R_{t,t} x_t + \tilde{z}_t$. Next we obtain $\hat{x}_{t-1}$ by decoding $R_{t-1,t-1} x_{t-1} + R_{t-1,t} \hat{x}_t + \tilde{z}_{t-1}$, etc. The statistics of $R_{i,i}$ depend on the value of $i$: in particular, the expected value of $R_{i,i}^2$ decreases as $i$ increases, which means that the first decoding steps are more at risk of entailing error propagation to subsequent steps. To avoid this error propagation, a possible strategy consists of choosing the most favorable ordering of the transmit antennas, i.e., of the columns of $H$ [34].

It is worthwhile to conclude this section with the tradeoff between the diversity and spatial multiplexing gains [35]. Given a MIMO channel, both gains can in fact be obtained simultaneously, but there is a fundamental tradeoff between how much of each type of gain any coding scheme can extract: higher spatial multiplexing gain comes at the price of sacrificing diversity. A scheme is said to have a spatial multiplexing gain $r$ and a diversity advantage $d$ if the rate of the scheme scales like $r \log \gamma$ and the average error probability decays like $(1/\gamma)^d$. The diversity-multiplexing tradeoff is essentially the tradeoff between the error probability and the data rate of a system, obtained by allowing both of them to scale with the SNR.

1.11 Contribution of this Dissertation

The contributions of this dissertation are as follows:

• A joint iterative frequency-domain power control and receive beamforming scheme is used in an OFDM system to achieve the desired SINR with the minimum total power at each OFDM subchannel, such that we can achieve a better overall error probability with the same total transmission power.
• To reduce the computational complexity of the joint frequency-domain beamforming and power control for OFDM, an iterative joint time-domain beamforming and power control scheme for OFDM uplink transmission is proposed. Although this scheme is suboptimal in terms of total transmission power, its complexity is significantly lower. The convergence of this scheme is also discussed.
• For practical implementations, joint time-domain MMSE beamforming and power control for OFDM is also proposed.
• The schemes mentioned in the previous items are also extended to Coded OFDM (COFDM) and their results are compared.
• Two iterative schemes are proposed for a downlink multi-user, multi-cell, multi-stream OFDM system with multiple transmit and receive antennas (MIMO-OFDM) to maximize the overall mutual information for all users. Multiple streams are used in order to achieve spatial multiplexing and increase the maximum achievable data rate.
• A non-cooperative game-theoretic framework is established for maximization of the achievable data rate, where the players are the different users and the parameters to be selected are the transmit and receive weight vectors.
• The single-stream MIMO-OFDM case is also considered, and an iterative algorithm is proposed to maximize the actual transmission data rate when the transmission power is constrained.
• The concept of super Golay codes is introduced and their effect on the Peak to Average Power Ratio (PAPR) of OFDM systems using 16-QAM sequences is discussed.
• The concept of cyclic-Golay codes is introduced such that their PAPR is as low as that of Golay codes, but their coding rate is higher.
• A construction method for creating cyclic-Golay codes is proposed. This method could be applied to find the cyclic shifts of any code presented by means of generalized Boolean functions.
• Two decoding schemes, one non-recursive and one recursive, for $RM_{2^h}(r, m)$ are devised. The non-recursive scheme is an extension of the majority logic Reed algorithm used for decoding binary Reed-Muller codes. The complexities of both methods are analyzed and it is shown that the recursive method has lower complexity.
• A scheduling algorithm for providing QoS for wireless mobile users and utilizing the system resources efficiently is devised. The algorithm performs a trade-off between QoS and throughput by maximizing the income of Service Level Agreement (SLA)-based networks.
• Both a short-term optimal greedy algorithm (income maximization for one time slot) and a long-term optimal scheme using dynamic programming are introduced.
• A QoS-provisioned optimal subcarrier allocation to different users of an Orthogonal Frequency Division Multiple Access (OFDMA) system using the notion of revenue maximization is proposed. Optimal and sub-optimal algorithms are also presented and their performances and complexities are studied.

1.12 Organization of the Dissertation

In Chapter 2 of this dissertation, we describe the basic principles of an Orthogonal Frequency Division Multiplexing (OFDM) system, its history, and its advantages and disadvantages. We also compare OFDM with single carrier modulation schemes. Next, we present the loading algorithms and compare them with the use of coding in OFDM systems. We also describe the problem of Peak to Average Power Ratio (PAPR) in an OFDM system and the different solutions proposed to resolve this issue. Finally, we briefly discuss the Orthogonal Frequency Division Multiple Access (OFDMA) scheme, which is used in IEEE 802.16 fixed wireless broadband systems, and the problem of channel allocation in these systems.
In Chapter 3, we first describe the power control problem in a multi-user cellular system and then outline our proposal for using joint power control and receive beamforming in multi-user OFDM systems. To reduce the complexity of this scheme, we also propose joint power control and time-domain receive beamforming in an OFDM system. For situations where the array response is not known at the receiver, we propose MMSE joint beamforming and power control, both in the time and frequency domains. We apply these schemes to Coded OFDM (COFDM), and finally the performances of these schemes are compared in this chapter.

Chapter 4 considers multiple antennas at both the transmitter and the receiver of a multi-user, multi-cell, multi-stream OFDM system. It proposes the use of iterative water-filling to divide a fixed power among different streams at different subcarriers in order to achieve the maximum achievable rates. A game-theoretic approach to this problem is presented and the convergence of these schemes is also discussed. Another algorithm for single-stream transmission is also presented.

In Chapter 5, we focus on the problem of PAPR reduction in OFDM systems by coding across subcarriers. In an attempt to increase the coding rate of the well-known Golay codes, we first propose the use of super-Golay codes when the symbols are selected from a non-equal-energy constellation such as 16-QAM. Next, we propose the use of cyclic-Golay codes and analyze the PAPR achieved by these codes. Then, we propose a construction method to obtain these codes from Golay codes. The proposed scheme can be used to generate the cyclic shift of any code described by means of Boolean functions. We show that these codes are in general a subset of $r$th-order Reed-Muller codes ($RM_{2^h}(r, m)$), and propose two schemes, one recursive and one non-recursive, to decode generalized Reed-Muller codes. The complexities of these schemes are discussed and compared with some existing methods.
Chapter 6 serves two purposes. First, it proposes a novel scheduling algorithm for the case where there is a Service Level Agreement (SLA) between the administrator and the users of a wireless network. This scheme is based on the notion of network income and tries to achieve a meaningful tradeoff between the throughput of the system and the guaranteed QoS for users. Both greedy (optimization for one time slot) and long-term solutions are proposed, and their performances are compared to each other and to those of some other scheduling algorithms. This concept also builds a foundation for the channel allocation scheme proposed in the second part of the chapter. That scheme is used to allocate the subcarriers of an OFDM system to different users while aiming at a meaningful tradeoff between the overall throughput and the QoS.

Finally, Chapter 7 concludes the dissertation and outlines some possible future work on the subjects covered in this dissertation.

Chapter 2 Orthogonal Frequency Division Multiplexing (OFDM)

2.1 Motivation for Introducing OFDM

The most detrimental effect in wireless communication is the fading caused by multipath propagation. Other problems are Inter-Symbol Interference (ISI), shadowing, and interference. Further constraints are limited bandwidth, low power consumption, and network management. Because of multipath propagation, many reflected signals from trees, hills, buildings, people, cars, etc. arrive at the receiver at different times. Fading and ISI are caused by the combination of these echoes. This combination can be either constructive or destructive. Because of this fading, some frequencies are enhanced, whereas others are attenuated, and therefore the channel is frequency selective. If the bandwidth of the signal is large, some parts of the signal may suffer from constructive interference and be enhanced in level, whereas others may suffer from destructive interference and be attenuated.
In general, frequency components that are close together will suffer variations in signal strength that are strongly correlated. The correlation (or coherence) bandwidth is used as a measure of this phenomenon. For a narrowband signal, distortion is usually minimized if the bandwidth is less than the coherence bandwidth of the channel, because all frequencies in the band are usually distorted in the same way. However, a signal which occupies a wider bandwidth, greater than the coherence bandwidth, will be subject to more distortion but will suffer less variation in total received power, even if it is subject to significant levels of multipath propagation. This comes from the fact that the variation averages out if the bandwidth is much larger than the coherence bandwidth, because different parts of the band suffer different levels of distortion. One can often find the following formula for the coherence bandwidth $CBW$ [7], where $\tau_{rms}$ is the RMS value of the delay spread (not the average):

\[ CBW \approx \frac{1}{\tau_{rms}}. \tag{2.1} \]

There are many echoes present in the time-domain response of the channel. The number of echoes differs between environments, such as outdoor and indoor areas. The range of delays can be measured and then processed to obtain statistical parameters. Different studies use the total range of delay or the average delay. Whichever is chosen, its inverse leads to a good approximation of the coherence bandwidth.

Spread spectrum techniques have proven to be robust against fading and interference, but they place impossible demands on existing technology. For instance, if a user needs a rate of 20 Mb/s on air and the spreading factor is 128, this results in 2.56 Gchip/s, which has to be processed in real time and occupies an impracticably large bandwidth. Besides that, these techniques have difficulty with the near-far effect and have a large power consumption. Single-carrier techniques are vulnerable to fading and multipath propagation, especially in the case of very high bit rates.
Improvements can be made with frequency equalization and directional antennas, which can also be used to improve multicarrier techniques.

Orthogonal Frequency Division Multiplexing (OFDM) is a wideband modulation scheme that is designed to cope with the problems of multipath reception. Essentially, the wideband frequency-selective fading channel is divided into many narrowband subchannels. If the number of subchannels is high enough, each subchannel can be considered flat. This is because we transmit many narrowband overlapping digital signals in parallel inside one wide band. Increasing the number of parallel transmission channels reduces the data rate that each individual carrier must convey, which lengthens the symbol period. Therefore, the delay time of reflected waves is suppressed to within one symbol time.

[Figure 2.1: (a) A typical FDM spectrum. (b) A typical OFDM spectrum, where B is the saving in bandwidth.]

2.2 OFDM History

The idea of using parallel data transmission by frequency division multiplexing (FDM) was published in the mid-1960s [36]. A U.S. patent was filed and issued in January 1970. The idea of this patent, which was intended for military applications, was to use parallel data streams and FDM with overlapping subchannels to avoid complicated equalization and to combat frequency-domain noise and multipath distortion. The terms Discrete Multi-Tone (DMT), multichannel modulation, and Multi-Carrier Modulation (MCM) are used interchangeably with OFDM. The main feature of OFDM that is not required in MCM is that each carrier is orthogonal to all other carriers. OFDM is an optimal version of multicarrier transmission schemes.

For a large number of subchannels, the arrays of sinusoidal generators and coherent demodulators required in a parallel system become unreasonably expensive and complex.
The receiver needs precise phasing of the demodulating carriers and sampling times in order to keep crosstalk between subchannels acceptable. Weinstein and Ebert [37] applied the Discrete Fourier Transform (DFT) to parallel data transmission systems as part of the modulation and demodulation process. In addition to eliminating the banks of subcarrier oscillators and coherent demodulators required by FDM, a completely digital implementation could be built around special-purpose hardware performing the Fast Fourier Transform (FFT). Fig. 2.1 compares the bandwidth utilization of FDM and OFDM. Fig. 2.2 shows the spectrum of OFDM signals, where each of the sinc(·) functions represents the spectrum of one OFDM tone. This figure shows that the OFDM signal is indeed the multiplexing of individual spectra with a frequency spacing equal to the transmission speed of each subcarrier. It also shows that at the center frequency of each subcarrier, there is no crosstalk from other carriers. Therefore, if we use the DFT at the receiver and calculate correlation values with the center frequency of each subcarrier, we can recover the transmitted data with no Inter-Channel Interference (ICI). In addition, using the DFT-based multicarrier technique, frequency division multiplexing is achieved not by bandpass filtering but by baseband processing.

Recent advances in VLSI technology enable the production of high-speed chips that can perform large FFTs at an affordable price. In the 1980s, OFDM was studied for high-speed modems, digital mobile communications, and high-density recording. One of the systems used a pilot tone for stabilizing carrier and clock frequency control, and trellis coding was implemented. Various fast modems were developed for telephone networks.
In the 1990s, OFDM was exploited for wideband data communications over mobile radio FM channels, High-bit-rate Digital Subscriber Lines (HDSL) with a speed up to 1.6 Mb/s, Asymmetric Digital Subscriber Lines (ADSL) for 1.536 Mb/s, Very High-speed Digital Subscriber Lines (VHDSL) for 100 Mb/s, Digital Audio Broadcasting (DAB), and HDTV terrestrial broadcasting. In 1999, the Wireless LAN (WLAN) standard committee adopted OFDM as the physical layer of IEEE 802.11a, which was designed to support up to 54 Mb/s in the 5 GHz band. Other WLAN standards, such as the more recent IEEE 802.11g, ETSI HIPERLAN/2, and Mobile Multimedia Access Communication (MMAC), have also accepted OFDM as their physical layer specification. These WLAN systems also combine coding with OFDM to combat dispersive channels. It has been shown that Coded OFDM over dispersive channels can improve the reliability of the transmission. IEEE 802.16, the standard for fixed wireless access, has adopted OFDM, and IEEE 802.20, the standard for broadband wireless systems that aims to replace wideband systems like 3G, also exploits OFDM.

[Figure 2.2: Spectra of OFDM signal.]

[Figure 2.3: Basic building block of an OFDM transmitter. Blocks: input stream, coding/bank of modulation blocks, IFFT block, parallel-to-serial conversion, guard interval insertion, D/A, LPF, and up-conversion.]

2.3 Description of OFDM

In a conventional serial data system, the symbols are transmitted sequentially, with the frequency spectrum of each data symbol allowed to occupy the entire available bandwidth. In a parallel data transmission system, several symbols are transmitted at the same time, which offers possibilities for alleviating many of the problems encountered with serial systems. In OFDM, the data is divided among a large number of closely spaced carriers. This accounts for the frequency division multiplex part of the name.
This is not a multiple access technique, since there is no common medium to be shared. The entire bandwidth is filled from a single source of data. Instead of being transmitted serially, data is transferred in parallel. OFDM can be simply defined as a form of multicarrier modulation whose carrier spacing is carefully selected so that each subcarrier is orthogonal to the other subcarriers. As is well known, orthogonal signals can be separated at the receiver by correlation techniques; hence, ISI among channels can be eliminated. Orthogonality can be achieved by carefully selecting the carrier spacing, such as letting the carrier spacing be equal to the reciprocal of the useful symbol period.

Unlike FDM, the carriers in an OFDM signal are arranged so that the sidebands of the individual carriers overlap and the signals can still be received without adjacent carrier interference. In order to do this, the carriers must be orthogonal. The receiver acts as a bank of demodulators, translating each carrier down to DC, with the resulting signal then being integrated over a symbol period to recover the raw data. If the other carriers all beat down to frequencies which, in the time domain, have a whole number of cycles in the block period ($T_b$), then the integration process results in zero contribution from all these carriers. Thus, the carriers are linearly independent (i.e., orthogonal) if the carrier spacing is a multiple of $1/T_b$. Mathematically, it is very easy to see that the sinusoidal series $e^{j 2\pi n t/(N \Delta t)}$, $n = 0, 1, \ldots, N-1$, constitutes an orthogonal series, in the sense that

\[ \int_0^{N \Delta t} e^{j 2\pi n t/(N \Delta t)} \, e^{-j 2\pi m t/(N \Delta t)} \, dt = \begin{cases} N \Delta t, & \text{for } m = n, \\ 0, & \text{for } m \neq n, \end{cases} \tag{2.2} \]

where the interval $[0, N \Delta t]$ is a symbol period. Figs. 2.3 and 2.4 depict a conventional OFDM transmitter and receiver.
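The orthogonality relation (2.2) can be checked numerically. The following is a minimal sketch that approximates the integral by a Riemann sum; the function name `inner_product`, the values N = 8 and Δt = 1 µs, and the number of integration steps are assumptions chosen for illustration and do not come from the text. The third case uses a carrier spacing that is not a multiple of 1/T_b, to show that the inner product then no longer vanishes.

```python
import cmath

def inner_product(n: float, m: float, N: int, dt: float, steps: int = 4096) -> complex:
    """Riemann-sum approximation of the integral in Eq. (2.2) over [0, N*dt]."""
    T = N * dt                      # symbol period N * dt
    h = T / steps                   # integration step
    total = 0j
    for k in range(steps):
        t = k * h
        total += (cmath.exp(2j * cmath.pi * n * t / T)
                  * cmath.exp(-2j * cmath.pi * m * t / T))
    return total * h

N, dt = 8, 1e-6                         # assumed example values
same = inner_product(3, 3, N, dt)       # m = n: expected to be close to N*dt
diff = inner_product(3, 5, N, dt)       # m != n: expected to be close to 0
half = inner_product(3, 3.5, N, dt)     # spacing of 0.5/T_b: not orthogonal
```

In this sketch the half-integer spacing leaves a residual of magnitude about 2NΔt/π, which is the crosstalk that integer multiples of 1/T_b avoid.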
The complex continuous-wave $N$-carrier OFDM signal after the IFFT block is

\[ S_c(t) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} X_n e^{j 2\pi (f_0 + n \Delta f) t}, \tag{2.3} \]

where $f_0$ is the carrier frequency, $\Delta f$ is the carrier spacing, and $X_n$ is the complex symbol to be transmitted at carrier $n$. A window of incoming data bits is selected (the size of the window is equal to the OFDM block period $T_b$) and converted from a serial stream to parallel substreams. $X_n$ could be one of the constellation points of some modulation scheme, like QPSK or 16-QAM, or ... selected by the substream bits. It could also be generated by some coding scheme applied to the incoming windowed bits. Eq. (2.3) has the form of an inverse DFT of the symbols $X_n$ ($n = 0, 1, \ldots, N-1$). The DFT is a variant of the normal transform in which the signals are sampled in both the time and frequency domains. By definition, the time waveform must repeat continually, and this leads to a frequency spectrum that repeats continually in the frequency domain [38].

[Figure 2.4: Basic building block of the jth OFDM receiver.]

The fast Fourier transform (FFT) is merely a rapid mathematical method for computer applications of the DFT. It is the availability of this technique, and the technology that allows it to be implemented on integrated circuits at a reasonable price, that has permitted OFDM to be developed efficiently. The process of transforming from the time-domain representation to the frequency-domain representation uses the Fourier transform itself, whereas the reverse process uses the inverse Fourier transform. A natural consequence of using the FFT is that it allows us to generate carriers that are orthogonal. The members of an orthogonal set are linearly independent. The baseband complex discrete OFDM symbols are

\[ D_k = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} X_n e^{j 2\pi n k/N} = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} X_n e^{j 2\pi f_n t_k}, \]

where $f_n = n/(N \Delta t)$, $t_k = k \Delta t$, and $\Delta t$ is an arbitrarily chosen symbol duration of the serial data sequence $X_n$.
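The discrete symbols D_k above can be generated directly from the definition. The sketch below is an illustration only: it uses a direct O(N²) sum rather than an FFT, and the function names `idft` and `dft`, the block size N = 8, and the particular QPSK symbol sequence are assumptions that do not appear in the text. It also applies the matching forward transform to confirm that the subcarrier symbols are recovered in the absence of channel distortion and noise.

```python
import cmath

def idft(X):
    """Unitary inverse DFT: D_k = (1/sqrt(N)) * sum_n X_n * exp(+j*2*pi*n*k/N)."""
    N = len(X)
    return [sum(X[n] * cmath.exp(2j * cmath.pi * n * k / N) for n in range(N)) / N ** 0.5
            for k in range(N)]

def dft(D):
    """Unitary forward DFT, the inverse operation of idft above."""
    N = len(D)
    return [sum(D[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N ** 0.5
            for n in range(N)]

# Assumed example: N = 8 subcarriers, each carrying one QPSK symbol.
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
X = [qpsk[i] for i in (0, 3, 1, 2, 2, 0, 3, 1)]

D = idft(X)        # time-domain OFDM samples D_k
X_rec = dft(D)     # ideal receiver: DFT of the undistorted samples

max_err = max(abs(a - b) for a, b in zip(X, X_rec))
energy_in = sum(abs(x) ** 2 for x in X)     # energy over the subcarriers
energy_out = sum(abs(d) ** 2 for d in D)    # energy over the time samples
```

With the 1/√N normalization on both transforms, the energy of the block is the same in the time and frequency domains, which is the Parseval relation used later when single-carrier and OFDM systems are compared.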
The in-phase and quadrature components of these symbols are applied to a low-pass filter to obtain the baseband continuous-wave OFDM signal as

\[ y(t) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} X_n e^{j 2\pi n t/(N \Delta t)}, \qquad 0 \le t \le N \Delta t = T_b. \]

Here, it is assumed that the pulse-shaping filter is normalized to 1. If the signal $y(t)$ is up-converted to the carrier frequency, the signal in (2.3) is obtained. The transmitted OFDM signal is the real part of $S_c(t)$.

By dividing the input data stream among $N$ subcarriers, the symbol duration is made $N$ times longer, which also reduces the multipath delay spread, relative to the symbol time, by the same factor. The orthogonality of subchannels in OFDM can be maintained, and individual subchannels can be completely separated by the FFT at the receiver, when there is no ISI or ICI introduced by transmission channel distortion. In practice these conditions cannot be obtained. Since the spectrum of an OFDM signal is not strictly band limited (sinc(f) function), linear distortion such as multipath causes each subchannel to spread energy into the adjacent channels and consequently causes ISI. A simple solution is to increase the symbol duration or the number of carriers so that the distortion becomes insignificant. However, this method may be difficult to implement in terms of carrier stability, Doppler shift, FFT size, and latency.

One way to prevent ISI is to create a cyclically extended guard interval, where each OFDM symbol is preceded by a periodic extension of the signal itself. The total symbol duration is $T_{total} = T_g + T_b$, where $T_g$ is the guard interval and $T_b$ is the useful symbol duration. When the guard interval is longer than the channel impulse response, or the multipath delay, the ISI can be eliminated. However, the ICI, or in-band fading, still exists. ICI is the crosstalk between different subcarriers, which means they are no longer orthogonal. The ratio of the guard interval to the useful symbol duration is application-dependent.
Since the insertion of a guard interval reduces data throughput, $T_g$ is usually less than $T_b/4$. Other reasons to use a cyclic prefix for the guard interval are:

• To maintain receiver carrier synchronization, some signal must always be transmitted instead of a long silence;
• Cyclic convolution can still be applied between the OFDM signal and the channel response to model the transmission system.

To explain the second reason for using a cyclic prefix, note that the received signal is the linear convolution of the channel impulse response $h(t)$ and the time-domain OFDM signal, plus noise $n(t)$:

\[ r(t) = y(t) * h(t) + n(t), \tag{2.4} \]

where $*$ represents linear convolution. If we assume that the channel response has $L$ taps, then by inserting a guard time of $(L-1)\Delta t$ between successive transmission blocks, we can make sure that the symbol duration is larger than the time dispersion caused by the channel. However, adding the time-domain samples $y_{N-P}, y_{N-P+1}, \ldots, y_{N-1}$ ($P \ge L-1$) to the block as a cyclic prefix causes the linear convolution shown in (2.4) to be performed between the channel and the augmented symbol. Since the augmented signal is cyclic, this is equivalent to the circular convolution of the original OFDM sequence $y_n$ with the channel coefficients. Note that

\[ \mathrm{DFT}(y \circledast h) = \mathrm{DFT}(y) \cdot \mathrm{DFT}(h), \tag{2.5} \]

where $\circledast$ denotes circular convolution. This relation is not valid for linear convolution.

At the receiver, the signal is translated to baseband and sampled at multiples of $\Delta t$. After removal of the cyclic prefix, it is fed to a DFT module in order to yield the received frequency-domain symbols $\hat{X}_m$, $m = 0, \ldots, N-1$. To simplify the discussion, assume that the multipath fading channel is constant over one OFDM symbol (quasi-static), and can therefore be written as

\[ h(t) = \sqrt{G} \sum_{l=1}^{L} \alpha_l \delta(t - \tau_l), \tag{2.6} \]

where $G$ accounts for the path loss and shadow fading.
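The role of the cyclic prefix in (2.4) and (2.5) can be illustrated with a small noise-free example. The sketch below is only an illustration under stated assumptions: the sample values of `y`, the three channel taps in `h`, and the helper names `linear_convolve` and `circular_convolve` are invented for the example, and the DFT is written as a direct, unnormalized sum so that (2.5) holds without an extra scale factor.

```python
import cmath

def dft(x):
    """Unnormalized DFT, for which DFT(y circ h) = DFT(y) * DFT(h) holds exactly."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N))
            for n in range(N)]

def linear_convolve(x, h):
    """Linear convolution, as performed by the physical channel in Eq. (2.4)."""
    out = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for l, hl in enumerate(h):
            out[i + l] += xi * hl
    return out

def circular_convolve(y, h):
    """Circular convolution of an N-sample block y with the channel taps h."""
    N = len(y)
    return [sum(h[l] * y[(k - l) % N] for l in range(len(h))) for k in range(N)]

y = [1 + 0j, 2 - 1j, -1 + 1j, 0.5j, -2 + 0j, 1 + 1j, 0j, 3 - 2j]  # assumed block, N = 8
h = [1.0 + 0j, 0.5 - 0.2j, 0.25j]                                 # assumed taps, L = 3
N, L = len(y), len(h)
P = L - 1                                 # shortest prefix satisfying P >= L - 1

tx = y[N - P:] + y                        # prepend the last P samples as cyclic prefix
rx_block = linear_convolve(tx, h)[P:P + N]    # channel output with the prefix removed

circ = circular_convolve(y, h)
err_time = max(abs(a - b) for a, b in zip(rx_block, circ))

H = dft(h + [0j] * (N - L))               # channel response on the N subcarriers
Y, R = dft(y), dft(rx_block)
err_freq = max(abs(R[m] - H[m] * Y[m]) for m in range(N))

# Without the prefix, the first L-1 received samples miss the wrapped-around terms.
no_cp = linear_convolve(y, h)[:N]
err_no_cp = max(abs(a - b) for a, b in zip(no_cp, circ))
```

In this example the block received after prefix removal coincides with the circular convolution, so each subcarrier sees a single complex gain, whereas the block sent without a prefix does not satisfy (2.5).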
The signal after down-conversion is

\[
\begin{aligned}
r(t) &= \sqrt{G} \sum_{l=1}^{L} \alpha_l e^{-j 2\pi f_c \tau_l} \, y(t - \tau_l) + n(t) \\
&= \frac{\sqrt{G}}{\sqrt{N}} \sum_{l=1}^{L} \alpha_l e^{-j 2\pi f_c \tau_l} \sum_{n=0}^{N-1} X_n e^{j 2\pi n (t - \tau_l)/(N \Delta t)} + n(t) \\
&= \frac{\sqrt{G}}{\sqrt{N}} \sum_{n=0}^{N-1} X_n e^{j 2\pi n t/(N \Delta t)} \sum_{l=1}^{L} \alpha_l e^{-j 2\pi (f_c + n/(N \Delta t)) \tau_l} + n(t),
\end{aligned} \tag{2.7}
\]

where $f_c$ is the carrier frequency (denoted $f_0$ in (2.3)). After sampling at the time points $k \Delta t$ ($k = 0, 1, \ldots, N-1$), the $k$th sample is given by

\[ r_k = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} H_n X_n e^{j 2\pi n k/N} + n_k, \qquad k = 0, 1, \ldots, N-1, \tag{2.8} \]

where $n_k$ are noise samples. $H_n$ is the channel frequency response at subcarrier $n$, and is given by

\[ H_n = \sqrt{G} \sum_{l=1}^{L} \alpha_l e^{-j 2\pi (f_c + n/(N \Delta t)) \tau_l}. \tag{2.9} \]

These samples are applied to a DFT block, and because of orthogonality and Eq. (2.5), the resulting frequency-domain symbols are

\[ \hat{X}_m = H_m X_m + n_m, \tag{2.10} \]

which says that the received OFDM symbols are scaled versions of the transmitted ones plus thermal noise. In order to retrieve the transmitted symbol, the receiver needs to know the channel state information (CSI). Frequency-domain channel estimation can be performed with pilot symbols that are interspersed with the transmitted OFDM symbols. A pilot $e$ consists of known symbols $e_m$, $m = 0, \ldots, N-1$. The received pilot symbol at subcarrier $m$ after the DFT is $z_m = e_m H_m + n_m$. Then the Minimum Mean Square Error (MMSE) estimate of the complex gain $H_m$ is obtained by

\[ \tilde{H}_m = \frac{z_m}{e_m} = H_m + \frac{n_m}{e_m}. \]

These estimates are used for Frequency Domain Equalization (FEQ). If the estimates are accurate enough, the maximum likelihood detector makes its decision based on the statistics $R_m / \tilde{H}_m$, where $R_m$ is the received frequency-domain symbol at subcarrier $m$. If the channel is slowly time varying, the transmitter can obtain reliable CSI as feedback from the receiver.

2.3.1 Advantages of OFDM

Compared to FDM, the overlapping spectra of the subcarriers in OFDM yield better spectral utilization. Only a small amount of the data is carried on each subcarrier, and by this lowering of the bit rate per subcarrier (not the total bit rate), the influence of ISI is significantly reduced. In principle, many modulation schemes could be used to modulate the data at a low bit rate onto each subcarrier.
It is an important part of the OFDM system design that the bandwidth occupied is greater than the coherence bandwidth of the fading channel. Then, although some of the carriers are degraded by multipath fading, the majority of the carriers should still be adequately received. OFDM can effectively randomize burst errors caused by Rayleigh fading, which comes from interleaving due to parallelization. So, instead of several adjacent symbols being completely destroyed, many symbols are only slightly distorted. Because the entire channel bandwidth is divided into many narrow subbands, the frequency response over each individual subband is relatively flat. Since each subchannel covers only a small fraction of the original bandwidth, equalization is potentially simpler than in a serial data system. A simple equalization algorithm can minimize mean-square distortion on each subchannel, and the implementation of differential encoding may make it possible to avoid equalization altogether [37]. This allows the precise reconstruction of the majority of the symbols, even without forward error correction (FEC). In addition, by using a guard interval the sensitivity of the system to delay spread can be reduced [39].

Furthermore, OFDM provides additional flexibility in adapting the transmission to varying channel conditions, by allowing the modulation level and power to be adjusted for each symbol in a subcarrier [40]. It also accommodates multiple users by allocating different subcarriers to different users (OFDMA). OFDM can be used in conjunction with bit loading techniques to improve the capacity of a highly frequency-selective channel.

2.3.2 Disadvantages of OFDM

In what follows, we describe the main disadvantages of an OFDM system very briefly.

High sensitivity to carrier frequency offset: The offset in carrier frequency reduces the desired symbol amplitude and introduces ICI.
When comparing OFDM to a conventional single carrier system, OFDM is orders of magnitude more sensitive to frequency offset and Wiener phase noise [41]. This problem is more pronounced in mobile applications, since a mobile channel has a time-varying nature. The offset causes the received frequency-domain subcarriers to be shifted. The subcarriers are still mutually orthogonal, but the received data symbols, which were mapped to the OFDM spectrum, are in the wrong position in the demodulated spectrum, resulting in increased BER.

Sensitivity to time-domain synchronization errors and sensitivity to phase noise: If the receiver's FFT window is shifted with respect to that of the transmitter, the timing misalignment introduces phase error between adjacent subcarriers at the receiver side. The influence of phase error on an OFDM system depends on the modulation scheme: (1) Coherent modulation: in the case of imperfect time synchronization, phase correction mechanisms are crucial for coherent modulation. (2) Pilot symbol assisted modulation: pilots are interspersed with the data symbols in the frequency domain, and the receiver can estimate the evolving phase error from the phases of the received pilots. (3) Differential modulation: differential encoding can be implemented either between corresponding subcarriers of consecutive OFDM blocks or between adjacent subcarriers of the same OFDM block to alleviate the effect of phase error.

Large Peak to Average Power Ratio (PAPR): This will be described later.

Non-linear distortion: Since an OFDM signal generally consists of a large sum of independently modulated subcarriers, we might see large signal excursions and large PAPRs. Therefore, an OFDM signal is very vulnerable to non-linear distortion caused by any non-linear element in the system, such as High Power Amplifiers (HPAs). The effect of a non-linearity on an OFDM signal can be seen from two different points of view.
First, the large signal excursions occasionally reach the non-linear region (saturation point) of a non-linear element in the system, so the output of the non-linearity is a distorted replica of its input. Second, any non-linear element in the system introduces severe harmonic distortion and inter-modulation distortion due to the multi-carrier nature of an OFDM signal. For example, the response of a third-order (cubic) non-linearity to a sum of sine waves with frequencies $f_i$, $i = 1, \ldots, N$, is a sum of sine waves consisting of the original sine waves with frequencies $f_i$, harmonics with frequencies $3f_i$, and other frequencies of the type $f_i + f_j - f_k$ and $2f_i - f_j$, where $i$, $j$, and $k$ are integers that distinguish the different input frequencies. Generally, the response of an $n$th-order non-linearity to a sum of sine waves is again a sum of sine waves, with frequencies corresponding to every possible combination of sums and/or differences of $n$ input frequencies. These newly generated frequencies can fall in-band or out-of-band. Frequencies that fall in-band degrade the signal error probability, and frequencies that fall out-of-band cause spectral spreading. Note that the two perspectives discussed above are related, as the large signal excursions are a consequence of the multi-carrier nature of the OFDM signal.

2.3.3 Single Carrier versus OFDM Comparison

The main difference between single carrier and OFDM systems is their robustness to fading and to synchronization errors (both frequency and timing). Assuming perfect synchronization, the performances of a single carrier system and an OFDM system are equivalent for AWGN and flat fading channels. Consider the received signal for a single carrier system

$$ y_i = h_i s_i + n_i, \quad 0 \le i \le N-1, \qquad (2.11) $$

where $h_i$ is a complex random variable, $s_i$ is the baseband representation of the $i$th modulation symbol, and $n_i$ is the complex additive Gaussian noise sample in the $i$th signal interval. The equivalent OFDM received signal (assuming no ISI and ICI) is

$$ Y_k = H_k S_k + N_k, \quad 0 \le k \le N-1, \qquad (2.12) $$

where $H_k$, $S_k$, and $N_k$ are the frequency-domain representations of $h_i$, $s_i$, and $n_i$, respectively. Since the noise powers of $n_i$ and $N_k$ are equal by Parseval's theorem [38], there is no advantage in detecting the signal using either of these equations.

Now consider reception of a signal over a frequency-selective channel. For the single carrier system, the received signal becomes

$$ \mathbf{y}(i) = \mathbf{h}(i) * \mathbf{s}(i) + \mathbf{n}(i), \qquad (2.13) $$

where $\mathbf{y}(i) = [y_i, y_{i+1}, \ldots, y_{L+i-1}]^H$, $\mathbf{A}^H$ denotes the Hermitian (conjugate transpose) of a matrix $\mathbf{A}$, $L$ is the number of taps, and $*$ denotes linear convolution. $\mathbf{h}(i)$, $\mathbf{s}(i)$, and $\mathbf{n}(i)$ are defined correspondingly. The single carrier system requires an equalizer $\mathbf{g}(i)$ to compensate for the channel effect by deconvolving the received signal, i.e.,

$$ \hat{\mathbf{y}}(i) = \mathbf{g}(i) * \mathbf{y}(i) = \mathbf{s}(i) + \mathbf{g}(i) * \mathbf{n}(i) + \boldsymbol{\epsilon}(i), \qquad (2.14) $$

where $\boldsymbol{\epsilon}(i)$ is the residual error that arises because the equalizer normally cannot perfectly invert the effects of the channel. Note that the deconvolution process also enhances the noise amplitude in some samples. In contrast, if we assume that the circular convolution has removed the effect of ISI, the OFDM system performs equalization in the same manner as in the flat fading case, i.e.,

$$ Y_k = H_k S_k + V_k, \qquad \hat{Y}_k = S_k + \varphi_k V_k, \qquad (2.15) $$

where

$$ \varphi_k = [1/h_i, 1/h_{i+1}, \ldots, 1/h_{p_k+i+1}]^H, \qquad (2.16) $$

and $p_k$ is the number of taps in carrier $k$. Equalization in OFDM systems is subject to the same impairments, such as residual error and noise enhancement, as in the single carrier system; thus, theoretically, the two systems have equivalent performance. Yet the complexity of the equalizer for the OFDM system is substantially less than that for the single carrier system.
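The per-subcarrier equalization in (2.15) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the 3-tap channel, the QPSK block, and all function names are hypothetical, noise is omitted, and a simple zero-forcing division stands in for whatever equalizer a real receiver would use.

```python
import cmath

def dft(x):
    """Naive N-point DFT; O(N^2), which is fine for a short illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft() above (includes the 1/N factor)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def circular_convolve(h, s):
    """Circular convolution of the channel taps h with one time-domain block s."""
    n = len(s)
    return [sum(h[m] * s[(t - m) % n] for m in range(len(h))) for t in range(n)]

def one_tap_equalize(Y, H):
    """Zero-forcing equalization: one complex division per subcarrier."""
    return [y_k / h_k for y_k, h_k in zip(Y, H)]

# Hypothetical 3-tap channel and one block of N = 8 QPSK symbols S_k.
h = [1.0, 0.5, 0.2]
S = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]

s_time = idft(S)                         # transmitter IFFT
y_time = circular_convolve(h, s_time)    # channel acting as a circular convolution
Y = dft(y_time)                          # receiver FFT
H = dft(h + [0.0] * (len(S) - len(h)))   # per-subcarrier channel gains H_k
S_hat = one_tap_equalize(Y, H)           # recovered symbols, one division each
```

Because the channel acts as a circular convolution on the block, the FFT output is $H_k S_k$ on each subcarrier, so a single division per subcarrier recovers the symbols in this noise-free setting.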
The OFDM equalizer is simpler because OFDM systems employ a bank of single-tap equalizers, while single carrier systems employ multi-tap equalizers. Further, the complexity of the equalizer grows as the square of the number of taps.

Synchronization Errors

Synchronization errors can be in timing, in frequency, or in both. The single carrier system is much more sensitive to timing errors than the OFDM system. On the other hand, the OFDM system is more sensitive to frequency errors.

Effect of Timing Errors: Even with training intervals, the demodulator reference circuitry may not be able to recover the transmitter timing completely. Without timing synchronization, the SNR at the output of the detection filter is degraded. For a particular sampling time $T_{optimal}$, the output SNR depends on the autocorrelation $A(\tau)$ [42], where $\tau$ is a random variable that represents the delay between the optimum sampling instant $T_{optimal}$ and the associated symbol timing of the received signal, and is estimated in the presence of noise. The variability of this delay is called timing jitter. As an example, the autocorrelation function for band-limited signals is given by [43]:

$$ A(\tau) = \frac{1}{N} \left[ \frac{\sin(\pi N W \tau)}{\sin(\pi W \tau)} \right], \qquad (2.17) $$

where $W$ is the bandwidth of the band-limited signal. $A(\tau)$ describes the single carrier system, whereas the OFDM system is described as a time-limited signal. For single carrier systems, the timing error or jitter causes a phase error in the bandpass signal. In OFDM systems, however, we can transmit pilot symbols on some reserved tones and thereby estimate residual phase errors. The single carrier system does not have a mechanism to achieve this compensation [44].

Effects of Frequency Errors: When there is relative motion between the transmitter and the receiver, a Doppler shift of the RF carrier results and introduces a frequency error. Also, any error in the oscillators, at either the transmitter or the receiver, can result in a residual frequency error.
In either case, well-known carrier recovery schemes, such as the Costas loop [45], are available for single carrier systems. An important result in [46] states that although the carrier can be recovered, the phase may be unknown and random.

In summary, single carrier systems are more robust to frequency offset error but more vulnerable (in terms of SNR degradation) to timing error. OFDM systems are robust to timing error, but their SNR can be degraded by frequency offset error. These properties come from the fact that, for the same data rate, the OFDM symbol duration is $N$ times longer than that of the single carrier system. The equalizer complexity and the system performance in flat fading are the same for both, while OFDM has much better performance and a much simpler equalizer in frequency-selective fading environments.

2.4 Loading Algorithms

The key advantage of OFDM is that each subchannel is relatively narrowband and is assumed to have flat fading. However, a given subchannel may well have a low gain, resulting in a large BER. Thus, it is desirable to take advantage of subchannels having relatively good performance; this is the motivation for adaptive modulation. In the context of time-varying channels, there is a decorrelation time associated with each frequency-selective channel instance. Thus, a new adaptation must be implemented each time the channel decorrelates. The number of bits that can be carried by each of the $N$ subcarriers is given by

$$ b_i = \log_2 \left( 1 + \frac{\mathrm{SNR}_i}{\Gamma_i} \right), $$

where the SNR gap $\Gamma_i$ at subcarrier $i$ measures the reduction of SNR with respect to capacity. It depends on the target bit error rate and the modulation used. If the target BER is the same for all subchannels, then $\Gamma_i = \Gamma$ for all $i$. If coding is used, the SNR gap can be modified by introducing additional coding gain factors [47].
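As a small numerical illustration of the bit-rate expression above, the following sketch evaluates $b_i$ for a handful of subcarriers. The SNR values and the 8.8 dB gap are hypothetical inputs chosen only for the example, and the helper names are not taken from the cited references.

```python
import math

def db_to_linear(value_db):
    """Convert a power ratio from dB to linear scale."""
    return 10.0 ** (value_db / 10.0)

def bits_per_subcarrier(snr_linear, gap_linear):
    """b_i = log2(1 + SNR_i / Gamma_i), with both arguments in linear scale."""
    return math.log2(1.0 + snr_linear / gap_linear)

# Hypothetical per-subcarrier SNRs (dB) and a hypothetical common SNR gap (dB).
snr_db = [25.0, 18.0, 9.0, 3.0]
gap_db = 8.8

gap = db_to_linear(gap_db)
real_bits = [bits_per_subcarrier(db_to_linear(s), gap) for s in snr_db]
# Rounding down gives the integer starting point used by discrete loading schemes.
integer_bits = [math.floor(b) for b in real_bits]
```

Subcarriers whose SNR falls below the gap carry less than one bit before rounding, which is why weak subchannels end up with no bits in a discrete allocation.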
The optimal adaptive transmission scheme, which achieves the Shannon capacity for a fixed transmit power, is the water-filling distribution of power over the frequency-selective channel. However, while the water-filling distribution yields the optimal solution, it is difficult to compute. Moreover, it requires infinite granularity in the constellation size, which is not practically realizable. Here, we present the OFDM bit loading algorithms used in [48,49], which optimize the power and rate based on knowledge of the subchannel gains.

In the discrete bit loading algorithm of [48], the amount of energy necessary to transmit $b$ bits on subchannel $n$ at the desired probability of error using a given coding scheme is represented by a convex monotonic function $e_n(b)$, which is initialized to zero for all $n$. The problem of energy minimization and bit allocation can be formulated as

$$ \min_{b_n} \sum_{n=1}^{N} e_n(b_n) \qquad (2.18) $$

$$ \text{subject to} \quad \sum_{n=1}^{N} b_n = B, \qquad b_n \in \mathbb{Z}, \; b_n \ge 0, \; n = 1, 2, \ldots, N. $$

To initialize the bit allocation, the number of bits for the $i$th subchannel, $b(i)$, $i = 0, 1, \ldots, N-1$, is first computed based on (2.4) and rounded down to the integer value $\tilde{b}(i)$. Note that, since we are using MQAM, the number of bits is restricted to the values 0, 1, 2, 4, 6, 8, based on the chosen constellation. Then, the energy for the $i$th subchannel is computed by rewriting (2.4) as

$$ e_i(\tilde{b}(i)) = \left( 2^{\tilde{b}(i)} - 1 \right) \Gamma / \mathrm{SNR}_i. \qquad (2.19) $$

Next, we need to form a table of energy increments for each subchannel. For the $i$th subchannel,

$$ \Delta e_i(b) = e_i(b) - e_i(b-1) = 2^{b-1} \frac{\Gamma}{\mathrm{SNR}_i}. \qquad (2.20) $$

This table provides the incremental energy required for each subchannel to go from supporting $b-1$ bits to supporting $b$ bits, given the channel gain and the noise Power Spectral Density (PSD). The energy increment required to go beyond the maximum supportable number of bits is set to infinity (or a very high value).
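The energy table of (2.19) and (2.20), together with a greedy adjustment loop of the kind that [48] builds on it (described in the paragraphs that follow), can be sketched as below. This is a simplified illustration rather than the procedure of [48] itself: it assumes every integer from 0 to a hypothetical `b_max` is allowed instead of the restricted MQAM set, it skips the averaging and last-bit steps, and all names and numbers are made up for the example.

```python
import math

def energy(b, gap, snr):
    """Energy needed to carry b bits, following (2.19)."""
    return (2 ** b - 1) * gap / snr

def delta_energy(b, gap, snr, b_max):
    """Energy increment from b-1 to b bits, following (2.20).

    Increments beyond b_max are infinite so that such bits are never added.
    """
    if b > b_max:
        return math.inf
    return 2 ** (b - 1) * gap / snr

def adjust_bits(bits, snrs, gap, target, b_max):
    """Greedily add or remove single bits until sum(bits) equals target."""
    bits = list(bits)
    total = sum(bits)
    while total > target:
        # Remove the bit whose removal saves the most energy.
        loaded = [n for n in range(len(bits)) if bits[n] > 0]
        n = max(loaded, key=lambda m: delta_energy(bits[m], gap, snrs[m], b_max))
        bits[n] -= 1
        total -= 1
    while total < target:
        # Add the bit that costs the least additional energy.
        n = min(range(len(bits)),
                key=lambda m: delta_energy(bits[m] + 1, gap, snrs[m], b_max))
        if math.isinf(delta_energy(bits[n] + 1, gap, snrs[n], b_max)):
            raise ValueError("target exceeds the maximum supportable bits")
        bits[n] += 1
        total += 1
    return bits

# Hypothetical linear-scale SNRs, unit gap, and an 8-bit ceiling per subchannel.
snrs = [120.0, 50.0, 10.0, 2.0]
gap, b_max = 1.0, 8
initial = [min(b_max, math.floor(math.log2(1.0 + s / gap))) for s in snrs]
loaded_down = adjust_bits(initial, snrs, gap, target=12, b_max=b_max)
loaded_up = adjust_bits(initial, snrs, gap, target=17, b_max=b_max)
```

Because each $e_n(b)$ is convex, the increments in (2.20) grow with $b$, which is what makes moving one bit at a time a sensible strategy.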
Note that, since we are restricted to a limited set ($\{0, 1, 2, \ldots, \log_2 M\}$), the energy for other numbers of bits is averaged using the energy increments for the allowed numbers of bits. This averaging could be applied to all but the final bit to even out the total number of bits on that subchannel. An algorithm to resolve the last-bit issue is also proposed in [48].

After the initialization is performed, the authors in [48] take the initial bit allocation $b$ and the total number of bits to be allocated, $B$, as the input and find the optimized bit allocation. The algorithm starts with $B' = \sum_n b(n)$. If $B' > B$, it finds the subcarrier that maximizes the energy decrement from $b_n$ bits to $b_n - 1$ bits, $\Delta e_n(b_n)$, and removes one bit from that subcarrier. It then updates $B'$ accordingly. The algorithm continues as long as $B' \ne B$. If, however, $B' < B$ at the beginning, it finds the subcarrier that minimizes the energy increment from $b_n$ bits to $b_n + 1$ bits, $\Delta e_n(b_n + 1)$, and increments the bit allocation of that subcarrier. It then updates $B'$ accordingly. The algorithm continues as long as $B' \ne B$. These algorithms constitute a complete characterization of the bit loading procedure proposed in [48,49] for a given frequency-selective channel.

Hughes-Hartogs [50] proposed another method in which the bits are assigned one by one to the subchannel with the lowest power increment until a pre-specified target rate $R_T$ is reached. This is a very slow procedure that requires extensive sorting and searching. Another loading algorithm has been proposed in [51], where the distribution of bits is adapted to the shape of the transfer function of each subchannel. This is very close to what has been proposed in [49], except that a different approach is used to obtain the bit capacity of each subchannel:

$$ C'_{QAM} \approx 0.31 \left( 10 \log_{10} \frac{S}{N} - 0.67 \right). \qquad (2.21) $$

This is rounded down to the largest integer $m_i$ that is smaller than $C'_{QAM}$.
In [51], the required number of bits per OFDM symbol, $M$, is assumed to be fixed, and in each iteration, depending on the relation between $M$ and $m = \sum_i m_i$, the number of bits in each individual subchannel is reduced or increased.

Fischer and Huber [52] exploit the fact that the signal power and the rate at each subchannel are related. They minimize the BER at each subchannel with a constant data rate and transmission power. They characterize the subchannels as AWGN and use the following relation to find the symbol error rate at each subchannel (assuming QAM):

$$ p_e(i) = K_i \, Q\left( \sqrt{ \frac{d_i^2 / 4}{N_i / 2} } \right), \qquad (2.22) $$

where the term under the square root is equal to the constant SNR at each subchannel, $Q(\cdot)$ is the Gaussian tail probability, $d_i$ is the minimum Euclidean distance between signal points in the constellation, and $N_i/2$ is the noise power.

2.5 Peak to Average Power Ratio (PAPR)

A major hurdle to the widespread use of OFDM is the high Peak to Average Power Ratio (PAPR) of OFDM signals. An OFDM signal consists of a number of independently modulated subcarriers, which can give rise to a large PAPR when added up coherently. When $N$ equal-amplitude signals are added with the same phase, they produce a peak power that is $N$ times the average power. The peak power is defined as the power of a sine wave with an amplitude equal to the maximum envelope value. Hence, an unmodulated carrier has a PAPR of 0 dB. An alternative measure of the envelope variation of a signal is the crest factor, which is defined as the maximum signal value divided by the rms signal value. For an unmodulated carrier, the crest factor is 3 dB. This 3 dB difference between PAPR and crest factor also holds for other signals, provided that the center frequency is large in comparison with the signal bandwidth. Usually, transmitters are constrained to a limited peak power.
If the peak envelope power is subject to a design or regulatory limit, this reduces the mean envelope power allowed under OFDM relative to that allowed under constant envelope modulation. This reduces the effective range of OFDM transmissions and is particularly acute in mobile applications, where battery power is a constraint. Moreover, to prevent signal distortion and spectral growth due to non-linearities inherent in electronic components, power amplifiers must operate below their compression point, where power is converted most efficiently. This results in more expensive and inefficiently used components.

A high PAPR or a high crest factor causes problems when the signal is applied to a non-linear device such as a power amplifier, since it results in in-band distortion and spectral spreading. To counteract these effects, the amplifier needs to be highly linear or to operate with a large back-off. Both approaches result in a severe power efficiency penalty and are expensive [53]. An analysis of non-linear amplifiers, however, shows that acceptable performance regarding in-band distortion is obtained with only a small back-off [54]. The tolerable out-of-band radiation or spectral spreading sets the bound on the back-off that is needed [53]. In conclusion, simply dimensioning the system components to cope with the worst-case signal peaks is practically impossible. That is why many researchers have tried over the years to counteract the PAPR problem.

PAPR can be defined in different ways: in the discrete domain, in the complex continuous domain, or in the real continuous domain. The definitions in the continuous domain can be extended to baseband as well as pass-band OFDM signals. Assume that the sequence $x_n$, $n = 0, 1, \ldots, N-1$, is applied to the IFFT block in an OFDM transmitter. The discrete-domain PAPR of this codeword is

$$ \mathrm{PAPR}(x) = \frac{1}{P_x} \max_k \{ p_x[k] \} = \frac{1}{P_x} \max_k \sum_{i=0}^{N-1} \sum_{u=0}^{N-1} x_i x_u^* e^{j 2\pi (i-u) k / N}, \qquad (2.23) $$

where $P_x = \|x\|^2$ is the energy of the codeword $x$; the double sum equals $\left| \sum_{i=0}^{N-1} x_i e^{j 2\pi i k / N} \right|^2$. The complex continuous pass-band OFDM signal is

$$ s_x(t) = \sum_{i=0}^{N-1} x_i \exp\left( j 2\pi (f_c + i \Delta f) t \right). \qquad (2.24) $$

The instantaneous envelope power of the signal is

$$ p_x(t) = |s_x(t)|^2 = \sum_{i=0}^{N-1} \sum_{u=0}^{N-1} x_i x_u^* \exp\left( j 2\pi (i-u) \Delta f \, t \right). \qquad (2.25) $$

The PAPR of the sequence $x$ is then defined as

$$ \mathrm{PAPR}(x) \triangleq \frac{1}{P_x} \max_{0 \le t \le T} \{ p_x(t) \}, \qquad (2.26) $$

where $T$ is the length of the OFDM block.

Some papers have introduced another term, Peak to Mean Envelope Power Ratio (PMEPR), to describe the statistical definition of PAPR. PAPR is also defined for a code applied to an OFDM system. Let $C$ be a code that maps blocks of $k$ input bits into blocks of $N$ constellation symbols from a constellation $Q$ with $2^a$ elements. The rate of $C$ is defined to be $R = k/(aN)$. The codeword $c = [c_0 \; c_1 \; \ldots \; c_{N-1}]$ is applied to the IFFT block, and the transmitted signal is $\mathrm{Re}(s_c(t))$ for $0 \le t \le T$, where $\mathrm{Re}(\cdot)$ denotes the real part of a signal. The relation between $\Delta f$ and $T$ depends on whether a guard interval is appended. However, $\Delta f = 1/T$ is commonly assumed in an ideal situation. Assuming that $f_c / \Delta f \gg 1$, it is well known that the average envelope power of $s_c(t)$ is $\|c\|^2 = \sum_{i=0}^{N-1} |c_i|^2$. Thus, the PAPR of the signal is given by

$$ \mathrm{PAPR}(c) = \frac{ \max_{0 \le t \le T} [\mathrm{Re}(s_c(t))]^2 }{ P_{av} }, \qquad (2.27) $$

where $P_{av} = E(\|c\|^2) = \sum_{c \in C} \|c\|^2 p(c)$, and $p(c)$ is the probability of transmitting the codeword $c$. The value $P_{av}$ is referred to as the average square length of the transmitted codewords. The PAPR of the codebook $C$ is defined to be

$$ \mathrm{PAPR}(C) = \max_{c \in C} [\mathrm{PAPR}(c)]. \qquad (2.28) $$

We also let $P_{max} = \max_{c \in C} \|c\|^2$ and refer to it as the maximum square length of the transmitted codewords. The maximum square length $P_{max}$ is the maximum of the average envelope power of signals corresponding to different codewords.
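The discrete definition in (2.23) is easy to evaluate numerically. The sketch below is a minimal illustration: the function name `papr` and the `oversample` parameter are assumptions introduced here, and evaluating the sum on a finer time grid is only one simple way to approximate the continuous-time maximum in (2.26).

```python
import cmath
import math

def papr(x, oversample=1):
    """Peak envelope power divided by codeword energy, as in (2.23).

    With oversample = 1 the envelope is evaluated at the N critically
    sampled instants; larger values evaluate it on a finer grid.
    """
    n = len(x)
    energy = sum(abs(v) ** 2 for v in x)
    m = n * oversample
    peak = max(
        abs(sum(x[i] * cmath.exp(2j * cmath.pi * i * k / m) for i in range(n))) ** 2
        for k in range(m)
    )
    return peak / energy

def to_db(ratio):
    """Express a power ratio in dB."""
    return 10.0 * math.log10(ratio)

# Eight equal, in-phase symbols add coherently: the worst case, PAPR = N.
worst_case = papr([1, 1, 1, 1, 1, 1, 1, 1])

# An illustrative length-4 binary sequence evaluated with 8x oversampling.
low_peak = papr([1, 1, 1, -1], oversample=8)
```

For the all-ones codeword the peak occurs at $k = 0$, where all $N$ terms add in phase, which gives a ratio of $N$ as stated in the text.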
Note that $P_{max}/P_{av}$ should not be confused with $\mathrm{PAPR}(C)$. For instance, if an equal-energy constellation is used, then $P_{max}/P_{av} = 1$, but $\mathrm{PAPR}(C)$ can be as large as $N$.

To evaluate the PAPR of continuous signals, we need to sample the signal at some specific rate. If the sampling period is $1/(N \Delta f) = T/N$, the PAPR computed from the samples coincides with the discrete PAPR. However, to obtain a more accurate value for the continuous PAPR, we need to oversample the OFDM signal at a rate that is normally a multiple of $N \Delta f$ [55].

2.5.1 Statistical Properties of OFDM Signals

This section describes the statistical properties of OFDM signals, assuming that the discrete PAPR is considered (critical sampling). Assume that the symbols $x_n$, $n = 0, 1, \ldots, N-1$, are i.i.d. complex Gaussian random variables with zero mean and unit variance. It can be shown that the time-domain symbols $X_k$ are also i.i.d. Gaussian random variables with zero mean and unit variance. Even if the $x_n$ are not Gaussian, by the Central Limit Theorem (CLT) the time-domain symbols, which are linear combinations of $N$ frequency-domain symbols, are still approximately Gaussian with unit variance, since the IDFT is an orthogonal transformation. To establish independence, note that the samples are uncorrelated:

$$ E[X_r X_s^*] = \frac{1}{N} E\left[ \left( \sum_{l=0}^{N-1} x_l e^{j 2\pi l r / N} \right) \left( \sum_{k=0}^{N-1} x_k e^{j 2\pi k s / N} \right)^* \right] = \frac{1}{N} \sum_{l=0}^{N-1} \sum_{k=0}^{N-1} E[x_l x_k^*] \, e^{j 2\pi l r / N} e^{-j 2\pi k s / N} = \delta(r - s). \qquad (2.29) $$

In most practical applications, the data is randomized prior to modulation, and the frequency-domain symbols can be approximated as independent discrete uniform random variables, typically MQAM, MPSK, or APSK. In these cases, the symbol samples $X_n$ are linear combinations of $N$ discrete uniform random variables. For OFDM, all symbols $x_k$ are chosen from the same constellation, and thus the $N$ discrete uniform random variables are i.i.d. Since the symbols $x_n$ are independent, the symbol samples $X_k$ are still uncorrelated.
Moreover, for large $N$, the CLT leads to the common assumption that the symbol samples are approximately i.i.d. Gaussian random variables. With this assumption, the Cumulative Distribution Function (CDF) of the random variable $\mathrm{PAPR}(x)$ has a simple closed form:

$$ \mathrm{Prob}\{ \mathrm{PAPR}(x) < \gamma^2 \} = \mathrm{Prob}\left\{ \frac{|X_0|^2}{P_x} < \gamma^2, \ldots, \frac{|X_{N-1}|^2}{P_x} < \gamma^2 \right\} = \left( \mathrm{Prob}\left\{ \frac{|X_0|^2}{P_x} < \gamma^2 \right\} \right)^N. \qquad (2.30) $$

For real baseband symbols, we have

$$ \mathrm{Prob}\{ \mathrm{PAPR}(x) < \gamma^2 \} = (1 - 2Q(\gamma))^N. \qquad (2.31) $$

For complex OFDM symbols, the CDF turns out to be

$$ \mathrm{Prob}\{ \mathrm{PAPR}(x) < \gamma^2 \} = \left( 1 - \exp(-\gamma^2) \right)^N. \qquad (2.32) $$

Therefore, for complex signals, the Complementary Cumulative Distribution Function (CCDF), or the probability that the PAPR exceeds some value $\gamma^2$, is [56]:

$$ \mathrm{Prob}\{ \mathrm{PAPR}(x) \ge \gamma^2 \} = 1 - \left( 1 - \exp(-\gamma^2) \right)^N \approx 1 - \exp\left( -\sqrt{\frac{\pi}{3}} \, N \gamma \, e^{-\gamma^2} \right). \qquad (2.33) $$

$\gamma$ can be considered as a clipping point. Setting $\gamma$ at a level such that the clipping noise is negligible (e.g., 50 dB below the signal level is obtained when $\gamma > 4$ [57]) is not optimal. Lowering the dynamic range of the A/D and D/A converters for a constant number of bits reduces the quantization noise significantly. The clipping noise, however, increases, as the clipping probability is larger. We therefore need to trade off quantization distortion against clipping distortion to minimize the overall distortion [58]. Alternatively, for a constant SNR, the number of bits in the D/A and A/D converters can be decreased, lowering the implementation cost.

2.5.2 Techniques for OFDM PAPR Reduction

There are three main classes of methods to reduce the PAPR: methods based on block coding, methods based on clipping, and probabilistic methods.

PAPR Reduction with Distortion (Clipping)

These methods clip the OFDM signal if it is above the dynamic range of the A/D and D/A converters or, in some cases, above a predetermined threshold. However, clipping causes in-band (for over-sampled) and out-of-band distortion (for unoversampled or analog signals). These methods try to reduce the effect of clipping.
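How often a clipper set at a given level is actually triggered follows from the first expression in (2.33). The sketch below evaluates it under the same i.i.d. complex Gaussian assumption used above; the clipping level is written `gamma` here, and the function name and the chosen values of $N$ are only illustrative.

```python
import math

def papr_ccdf(gamma, n):
    """Prob{PAPR >= gamma^2} = 1 - (1 - exp(-gamma^2))^N for N i.i.d. samples."""
    return 1.0 - (1.0 - math.exp(-gamma ** 2)) ** n

# Probability that at least one sample of a block exceeds the clipping level,
# tabulated for a few hypothetical block lengths and clipping levels.
block_lengths = [64, 256, 1024]
levels = [2.0, 2.5, 3.0]
table = {n: [papr_ccdf(g, n) for g in levels] for n in block_lengths}
```

For a fixed level the probability grows with $N$, and for a fixed $N$ it falls quickly as the level is raised, which is the trade-off between clipping probability and converter dynamic range discussed in the text.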
There are three main clipping schemes. The first is block scaling, in which an optimal clipping threshold is selected from a limited set. This threshold determination is performed on a symbol-by-symbol basis [59]. The selected threshold is then transmitted to the receiver in a reserved tone. In the second scheme, clipping is performed at the transmitter, and the receiver tries to compensate for some of its effects. The receiver needs to estimate the size and the location of the clip [60]. Finally, the third method applies signal processing to reduce the effect of clipping. Two kinds of processing are applied: peak windowing, in which the large peaks are multiplied by a small window such as a Kaiser or Hamming window [61], and adding a correction function in the vicinity of the clip [62]. Both approaches decrease the out-of-band distortion by smoothing the hard-limiting effect.

Block Coding Schemes

With block coding, we limit the set of possible signals that can be transmitted. Only signals with low amplitude are allowed. Therefore no clipping occurs, and no distortion is introduced. These codes not only offer a low PAPR but also have error-correcting capabilities. In fact, the CCDF in Eq. (2.33) is the fraction of symbols that are discarded for a given clipping level. This expression shows that as $N$ increases, the proportion of sequences to be discarded goes up. Indeed, no good codes for practical values of $N > 64$ are known. In other words, for large $N$ the code rate is very low, and this is an inherent property of coding methods. One simple strategy is to perform an exhaustive search over all possible codewords and use a table lookup [63]. Other options restrict the phase possibilities of certain tones [64] or use only part of the bits in a differential phase modulation scheme [65]. Some codes have a small PAPR because their instantaneous power is close to the average power most of the time.
Thus, the spectrum of the code is almost flat, or, equivalently, its autocorrelation is impulse-like [66,67]. Two codes based on this criterion are Golay complementary sequences, which we describe in detail in Chapter 5, and m-sequences [67].

The m-sequences are a class of $(2^m - 1, m)$ linear cyclic codes that create an OFDM block length of $N = 2^m - 1$. They are used for generating pseudo-noise sequences for spread spectrum communications, because their autocorrelation function is an impulse and their spectrum is therefore almost flat, implying a very low PAPR. Their autocorrelation function is

$$ A_c(j) = \sum_{i=0}^{N-1} c_i \, c_{[(i+j) \bmod N]} = \begin{cases} 2^m - 1, & j = 0, \\ -1, & 1 \le j \le 2^m - 2. \end{cases} \qquad (2.34) $$

All these block codes provide a low PAPR (typically below 3 dB). However, their most important drawback is that their code rate for large $N$ is not acceptable. This drawback dramatically limits their usefulness in real applications.

Probabilistic Methods, Parameter Optimization

These methods do not guarantee a reduced PAPR; instead, they reduce the probability that large peaks occur. This lowers the CCDF in (2.33) for any given clipping level. The basic way to reduce this probability is to perform a linear transformation on the input vector of the IFFT block [68].

Partial Transmit Sequence (PTS): In PTS, the input vector $x$ is subdivided into $V$ non-overlapping subvectors $x_v$ of size $N/V$ [69]. Each carrier in the subvector $x_v$ is multiplied by the rotation factor of that subvector. The rotation factors of the different subvectors are statistically independent. This scheme corresponds to a linear transformation in which the additive vector is all-zero. Because of the linearity of the IFFT, this is equivalent to applying the IFFT to each subvector (with the other elements set to zero) and then multiplying the result by the corresponding rotation factor. Fig. 2.5 depicts a block diagram of this method. One advantage of this method is that it reduces the complexity of the IFFT operation, because the length of each IFFT is $N/V$. If the rotation factors are chosen from a set of size $L$, the total number of phase combinations is $L^V$. For each set of rotations, the PAPR is computed, and the one with the lowest PAPR is transmitted. The index of the chosen set of phases must be transmitted to the receiver using $V \log_2 L$ bits. It is also possible to use differential modulation for each subvector, in which case no side information is sent.

Figure 2.5: Partial Transmit Sequences for PAPR reduction in an OFDM.

Selective Mapping (SLM): The basic idea is to have $L$ statistically independent vectors represent the same information [70]. The vector with the lowest PAPR is selected for transmission. Since these vectors are independent, the probability that the PAPR exceeds some value is equal to the CCDF in (2.33) raised to the power $L$, and is therefore significantly lower. Fixed, independent rotation vectors are used to create these $L$ independent vectors. Fig. 2.6 depicts the building block of SLM. The authors in [71] use $L$ cyclically non-correlated m-sequences to generate these $L$ replicas, rather than rotations. This is different from using m-sequences as in Section 2.5.2. Here, the input vector is multiplied by an m-sequence, so a flatter spectrum is obtained and the amplitude peaks are reduced.

Figure 2.6: Selective Mapping for PAPR reduction in an OFDM.

Tone Reservation and Injection

The idea of tone reservation is to use a shaping function such that the clipping noise is concentrated on unused tones. Unused tones are the tones in which the SNR is low and to which the bit loading algorithms therefore allocate no data (or a very small number of bits). This problem is equivalent to finding the closure of some convex sets [60].

In tone injection, the QAM constellation is extended such that the same data point corresponds to multiple possible constellation points [72,73]. Fig. 2.7 illustrates an example of such an extension.
The advantage of tone injection is that the receiver does not need any side information from the transmitter, but it increases the energy of some points compared to the original constellation points. The optimization problem, which works per tone and repeats iteratively over all tones, is to choose one of the constellation points corresponding to the tone's substream such that the reduction in PAPR is maximized while the energy of the symbol is kept as low as possible.

Figure 2.7: The extension of 16QAM constellation.

2.6 Orthogonal Frequency Division Multiple Access

There are many multiple access schemes to accommodate multiuser systems. In Section 1.4 we briefly listed some of these schemes. In summary, in TDMA, time is the resource that is shared among different users, but every user uses the whole bandwidth. In FDMA, the whole available spectrum is divided into several bands, and each user modulates its data over one band; all users send their data simultaneously. In CDMA, all users send their data simultaneously using the whole spectrum but are distinguished by the spreading codes they use. Orthogonality of the codes and power control are essential in this system. In SDMA, users are separated by their location, and space diversity is essential in this system.

Figure 2.8: OFDMA carrier segmentation.

Orthogonal Frequency Division Multiple Access (OFDMA) is another multiple access system that uses the subcarriers of an OFDM transmitter to differentiate users. The functionality of OFDMA differs between down-link and up-link transmission. Here, the active carriers (the carriers that carry data, not pilot or null carriers) are divided into subsets of carriers, and each subset is termed a subchannel [5]. In the down-link, a subchannel may be intended for different groups of receivers, while in the up-link a transmitter may be assigned one or more subchannels, and several transmitters may transmit in parallel.
Note that in the up-link, all users transmit at the same time (unlike TDMA), using the same frequency band (unlike FDMA), without using any orthogonal spreading code (unlike CDMA), and, assuming no ISI or ICI, there is no interference among users. Because of this separation, power control is not crucial in OFDMA. However, if the number of users is high and all of them must be assigned distinct subchannels, the transmission rate will fall significantly. For this reason, we might need to allow some users to share one or more subchannels. As a result, we might need power control and/or multiple transmit antennas to reject interference. The concept of carrier segmentation in OFDMA is shown in Fig. 2.8. In addition to supporting multiple access, OFDMA allows for scalability and the use of multiple-antenna signal processing.

The up-link OFDMA block transmitted from user $n$ is

$$ s(t) = \mathrm{Re}\left\{ e^{j 2\pi f_c t} \sum_{k = N_{start}+1}^{N_{start}+N_{used}-1} c_k \, e^{j 2\pi k \Delta f (t - T_g)} \right\}, \qquad (2.35) $$

where $t$ is the time elapsed since the beginning of the subject OFDM block, with $0 < t < T_S$, $c_k$ is the complex symbol to be transmitted at subcarrier $k$, $T_g$ is the guard time, $T_S$ is the OFDM symbol duration including the guard time, and $\Delta f$ is the carrier frequency spacing. The carrier $N_{start}$ is excluded because it corresponds to the DC component.

One slot in IEEE 802.16 is defined as a group of contiguous subchannels in a group of contiguous OFDMA blocks. This allocation can be seen as a two-dimensional rectangle, where the horizontal dimension is time, spanning a specific number of OFDMA blocks, and the vertical dimension spans the number of subchannels. The whole slot is called a data region.

2.6.1 Channel Allocation

Channel allocation is an integral part of OFDMA. Channel allocation is also the most important part of any multiple access scheme.
In TDMA, time slots are allocated; in FDMA, carrier frequencies; in CDMA, spreading codes; in SDMA, antenna beams; and finally, in OFDMA, subcarriers (or a group of subcarriers called a subchannel). The performance of any channel allocation algorithm is measured by the number of users it can accommodate given a limited resource and by the quality of transmission in each channel (capacity of the system). Channel allocation algorithms aim to maximize the reuse factor. However, some procedures might be needed to cancel the effect of possible interference. One goal in channel allocation is to increase the total system throughput when users have different rate requirements.

Channel allocation can be done either dynamically or in a fixed manner. In Fixed Channel Allocation (FCA), a set of channels is allocated permanently to a cell, and all users in that cell use the available channels. New users are admitted to the cell if a channel is available. In Dynamic Channel Allocation (DCA), each cell has a reuse distance constraint. Channels can be reused in different cells if they are well separated and the amount of interference is low enough. Another scheme is the reassignment of channels to different users to accommodate a newly arriving user. The performance of these methods, which are called Maximum Packing (MP), is an upper bound on the performance of other algorithms in terms of the number of users that can be accommodated [74].

Channel Allocation in OFDMA

The carriers forming one subchannel may, but need not, be adjacent. Except for the guard tones and the DC carrier, the carriers are allocated to pilots and data in both down-link and up-link. In the down-link, the pilot tones are allocated first, and the rest are grouped into subchannels and allocated to data; there is therefore a common set of fixed pilot carriers.
However, in the up-link, the set of used carriers is first grouped into subchannels, and then the pilot carriers are allocated from within each subchannel, so each subchannel contains its own pilot symbols. The reason for this setup is that in down-link OFDM every subscriber receives the signal, whereas in the up-link each subchannel may be transmitted from a different user. Note that in each case there are two kinds of pilot carriers: fixed-location pilots and variable-location pilots, which shift their location every block, repeating every 4 blocks for the down-link and every 13 blocks for the up-link. For the down-link, the remaining carriers are first divided into groups of contiguous carriers, and each subchannel then consists of one carrier from each of these groups. The partitioning of subcarriers into subchannels is performed according to the following equation, which is called a permutation formula:

$$ \mathrm{carrier}(n, s) = N_{subchannels} \cdot n + \left\{ p_s[n \bmod N_{subchannels}] + ID_{cell} \cdot \mathrm{ceil}[(n+1)/N_{subchannels}] \right\} \bmod N_{subchannels}, \qquad (2.36) $$

where $s$ is the index number of a subchannel, selected from the set $[0, \ldots, N_{subchannels} - 1]$, $n$ is the carrier-in-subchannel index from the set $[0, \ldots, N_{subcarriers} - 1]$, $\mathrm{carrier}(n, s)$ is the carrier index of carrier $n$ in subchannel $s$, $N_{subchannels}$ is the number of subchannels, $p_s[j]$ is the series obtained by rotating a permutation cyclically to the left $s$ times, $\mathrm{ceil}[\cdot]$ is the function that rounds its argument up to the next integer (ceiling), and $ID_{cell}$ is a positive integer assigned by the MAC to identify this particular base station sector.

Chapter 3

Power Allocation for OFDM using Adaptive Beamforming over Wireless Networks

3.1 Motivation and Previous Works

Orthogonal Frequency Division Multiplexing (OFDM) is a parallel data transmission scheme. If the width of each subchannel is smaller than the coherence bandwidth of the channel, it converts the wideband frequency-selective fading channel into a series of narrowband flat fading subchannels [75].
Depending on the carrier spacing, data rate, and the coherence bandwidth of the channel, there is no need for sophisticated equalization methods [76].

One of the disadvantages of OFDM is worst-subchannel domination [51,77]. In an uncoded OFDM system with a fixed modulation scheme for all subchannels, the error probability of the whole system is dominated by the subchannel with the highest attenuation [51,77]. If the SINR fluctuates over subchannels, the ones with the worst SINR affect the overall BER the most. As a result, in the case of frequency selective fading channels, the performance of the whole system in terms of error probability improves only slowly as the transmitted power is increased. In order to obtain a minimum overall error probability under a fixed total power policy, the optimum algorithm is to have a uniform error probability for all of the subchannels [77].

Several methods have been proposed to combat the aforementioned problem. One solution is to use Coded OFDM (COFDM) [78-80]. Other methods try to adjust the bit and power distribution among subchannels according to their link gains and are mostly called "loading algorithms" [49-52]. In most of these algorithms, the bit allocation in each subchannel is adapted to its capacity and therefore a fixed modulation scheme is not considered for all subchannels. Some of the loading algorithms were introduced in Section 2.4. However, most of these methods have been proposed for a single transmitter with a single antenna, without considering the effect of interference.

In this chapter, we look at the problem of power allocation for different subchannels from a different point of view. In a mobile environment, each user's signal can affect the others and this, in turn, results in more interference. The loading algorithms proposed in the previous works do not consider such phenomena and therefore cannot reach the optimum solution in the sense of minimum total transmission power.
In contrast, we use an adaptive power allocation scheme to distribute the powers at each subchannel based on the interference from other users at the same subchannel.

Furthermore, we exploit antenna arrays to further reduce the interference. We will consider the antenna arrays in two situations. First we assume that the base station knows the array response and performs both frequency-domain and time-domain beamforming. Then we relax this assumption and use MMSE beamforming, in which training sequences are exploited to update the weight vectors and minimize the interference. These training sequences are transmitted to the receiver to estimate the channel response. The rate at which the training sequences are transmitted depends on the speed of channel variation. In each method, the total interference and noise is calculated and fed back to the transmitter through a feedback channel.

It should be noted here that sometimes, because of deep frequency selective fading, the tone-to-tone SNR differences are very large and therefore a tremendous amount of power adjustment might be needed to compensate for the SNR fluctuations. In this case, it might be better to discard the subchannel instead of allocating a large portion of the power to it. However, this can result in data rate reduction. Besides, it is not always possible to discard some parts of the signal. As an example, in the IEEE 802.11a wireless LAN protocol, some of the subchannels are used to transmit pilot signals, and the preambles are transmitted using OFDM symbols, which are very important to recover. In these situations, we have to do proper bit or power allocation to save the signal. Another approach is to limit the transmitted power at each subchannel.

In COFDM, by coding across subchannels, the BER is averaged over all of the subchannels. However, if by exploiting power control and beamforming the SINR at all of the subchannels can be increased, the overall BER is decreased.
Moreover, using power control and beamforming can reduce the total network power. We will compare the uncoded OFDM using our proposed algorithms with COFDM using power control and beamforming, and also with a COFDM system with no power control or beamforming.

In this chapter we will assume that each mobile uses all of the subchannels. However, with a slight modification, the same formulations and the same algorithms can be applied to OFDMA, where different subchannels could be assigned to different users.

This chapter is organized as follows: In Section 3.2 we will review the concept of power control and propose the power control for OFDM receivers. Section 3.3 proposes the OFDM joint power control and frequency-domain beamforming. Joint time-domain beamforming and power control is proposed in Section 3.4. In Section 3.5 we will use the MMSE approach to perform the beamforming. Section 3.6 extends the proposed algorithm to COFDM. Section 3.7 presents some simulation results, and finally Section 3.7 concludes the chapter.

3.2 OFDM with Adaptive Power Control

3.2.1 Background

The objective of power control in wireless networks is to minimize the transmitted power while some target error probabilities are met [81,82]. Consider a network of $M$ mobiles trying to access the same channel. We denote the power link gain between the $i$th mobile and the $b$th base station by a real number $G_{ib}$, and the $i$th mobile transmitted power by $P_i$. We assume that one base station is assigned to each mobile. Moreover, these base stations are co-channel in the sense that they use the same frequency band in FDMA, the same time slot in TDMA, or the same spreading code in CDMA. As a result they all suffer from co-channel interference. The SINR at the $b$th receiver is given by:

\[ \Gamma_b = \frac{G_{bb} P_b}{\sum_{i=0,\, i \neq b}^{M-1} G_{ib} P_i + N_b}, \tag{3.1} \]

where $N_b$ is the noise power at the $b$th base station.
The objective is to keep the total transmitted power as low as possible, while the SINRs are kept above a threshold. If we denote the minimum acceptable SINR at base $b$ by $\gamma_b$, the requirement for acceptable link quality is $\Gamma_b \geq \gamma_b$, $b = 0, \ldots, M-1$, or in matrix form:

\[ [\mathbf{I} - \mathbf{D}\mathbf{F}]\,\mathbf{P} \geq \mathbf{u}, \tag{3.2} \]

where $\mathbf{I}$ is an $M \times M$ identity matrix, $\mathbf{P}$ is the power column vector, $\mathbf{D}$ is a diagonal matrix whose $b$th diagonal element is $\gamma_b / G_{bb}$, $[\mathbf{u}]_b = \gamma_b N_b / G_{bb}$, and

\[ [\mathbf{F}]_{ij} = \begin{cases} 0 & \text{if } j = i \\ G_{ji} & \text{if } j \neq i. \end{cases} \tag{3.3} \]

If there is at least one power vector that satisfies (3.2), the SINR thresholds can be achieved. The power control problem is now defined as follows:

\[ \text{minimize } \sum_i P_i, \quad \text{subject to } [\mathbf{I} - \mathbf{D}\mathbf{F}]\,\mathbf{P} \geq \mathbf{u}. \]

The matrix $\mathbf{D}\mathbf{F}$ is called the gain matrix. Let us call the spectral radius of this matrix $\rho(\mathbf{D}\mathbf{F})$. The Perron-Frobenius theorem [83] says that any positive (or irreducible non-negative) matrix has a positive real eigenvalue $\lambda^*$ such that $\lambda^* = \max\{|\lambda_i|\}_{i=1}^{M}$, where the $\lambda_i$ are the eigenvalues of the matrix. It has been shown in [84] that if the spectral radius of the gain matrix is less than unity, then the matrix $\mathbf{I} - \mathbf{D}\mathbf{F}$ is invertible and its inverse is positive. Such a network is called a "feasible network". For a feasible network, the lowest possible total power is obtained when all of the SINRs are equal to the threshold, i.e.

\[ \Gamma_b = \gamma_b, \quad b = 0, \ldots, M-1. \tag{3.4} \]

This translates to

\[ \hat{\mathbf{P}} = [\mathbf{I} - \mathbf{D}\mathbf{F}]^{-1}\mathbf{u}. \tag{3.5} \]

The above formula can be rewritten as

\[ \hat{\mathbf{P}} = \mathbf{D}\mathbf{F}\hat{\mathbf{P}} + \mathbf{u}, \tag{3.6} \]

which leads to the following iterative equation:

\[ \mathbf{P}^{n+1} = \mathbf{D}\mathbf{F}\mathbf{P}^{n} + \mathbf{u}. \tag{3.7} \]

A distributed power update scheme is proposed in [85] that achieves the optimal solution for (3.1). The $b$th mobile power at the $n$th stage of iteration is updated by

\[ P_b(n+1) = \frac{\gamma_b}{G_{bb}} \left( \sum_{i=0,\, i \neq b}^{M-1} G_{ib} P_i(n) + N_b \right), \quad b = 0, \ldots, M-1. \tag{3.8} \]

The right hand side of (3.8) is a function of the noise and interference at the $b$th base station (the term inside the parentheses), the link gain $G_{bb}$, and the target SINR.
All of these can be measured locally and transmitted through a feedback channel to the corresponding mobile [81]. So, transmitters need not know all the existing path gains and transmitter powers. At each iteration, each transmitter updates its power using the total interference and the link gain to its corresponding receiver, both of which are fed back by the receiver. If the network is feasible, the above iteration converges to the optimal power vector $\hat{\mathbf{P}}$. It has been shown in [86] that this holds starting from any arbitrary power vector.

In the following, we consider this scheme in a multiuser environment using multicarrier transmission. Our objective is to optimize the power allocation at each subchannel for all of the mobiles, such that:

1. The SINRs at all of the subchannels for all of the mobiles are close to each other and are above an SINR threshold. This causes the error probability to decrease faster with increasing transmitter power, compared to the case of unbalanced SINRs.

2. The total power used to achieve the aforementioned objective is minimized.

The basic idea is to allocate less power to the subchannels with less interference, and more power to the subchannels with lower SINR.

Figure 3.1: The ith OFDM transmitter using Adaptive Power Control.

3.2.2 System Configuration

Fig. 3.1 depicts the transmitter proposed for OFDM systems using adaptive power control at each subchannel.
If the maximum number of paths between the $i$th user and the $b$th base station is assumed to be $L$, the corresponding link can be modelled by the following impulse response (we have ignored the Doppler effect):

\[ h_{ib}(t) = \sqrt{G_{ib}} \sum_{l=0}^{L-1} \alpha_{ib}^{l}\, \delta(t - \tau_{ib}^{l}), \tag{3.9} \]

where $\alpha_{ib}^{l}$ denotes the $l$th path fading; the path fadings are independent complex Gaussian variables with variance $(\sigma_{ib}^{l})^2$ (their amplitudes are Rayleigh). The $\tau_{ib}^{l}$ are the delays of the corresponding paths. $G_{ib}$ is a real random variable representing the log-normal shadow fading and path loss. The frequency response of the channel is

\[ H_{ib}(f) = \sqrt{G_{ib}} \sum_{l=0}^{L-1} \alpha_{ib}^{l}\, e^{-j 2\pi f \tau_{ib}^{l}}. \tag{3.10} \]

In this chapter the vectors are shown by bold underlined letters. Moreover, the transmitted and received signals in the time domain are shown by uppercase letters and the same values in the frequency domain by lowercase letters. Let us assume that $N$ denotes the number of subchannels, $T_s$ is the symbol period, and $f_0$ is the carrier frequency. If $N$ is large enough, each subchannel can be modelled as a flat fading channel [75], and so the link gain at subchannel $c$, $H_{ib}^{c}$, can be calculated simply by replacing $f$ with $f_c \triangleq f_0 + \frac{c}{N T_s}$ in (3.10), i.e.

\[ H_{ib}^{c} = \sqrt{G_{ib}} \sum_{l=0}^{L-1} \alpha_{ib}^{l}\, e^{-j 2\pi f_c \tau_{ib}^{l}}. \tag{3.11} \]

Without loss of generality, we can assume the path loss and shadowing for different paths to be the same, and any difference can be absorbed in the fading coefficients.

In this chapter, we assume that a proper guard interval has been inserted in the time domain such that the effect of Inter-Symbol Interference (ISI) can be ignored. Moreover, the guard interval has the form of a cyclic prefix and therefore the interaction of the received signals at different subchannels in the frequency domain is zero. This is due to the cyclic convolution performed between the channel and the transmitted signal. The modulated data at subchannel $c$ for user $i$ is $d_i^c$, whose energy is assumed to be unity (e.g. using a fixed MPSK on all subchannels).
We assume that all of the subchannels use the same modulation; since the subchannel link gains are different, distributing the power equally among them would make the SINRs unbalanced.

Now, we perform a power control algorithm at each subchannel separately. If we denote the power allocated to mobile $i$ at subchannel $c$ by $P_i^c$ and define $W_N \triangleq e^{-j2\pi/N}$, the $k$th sampled received signal ($k = 0, \ldots, N-1$) at the $b$th receiver (after down conversion, guard interval removal, proper matched filtering, and sampling at intervals $T_s$) will be

\[ \hat{X}_b^k = \frac{1}{\sqrt{N}} \sum_{i=0}^{M-1} \left( \sum_{c=0}^{N-1} \underbrace{H_{ib}^{c} \sqrt{P_i^c}\, d_i^c}_{t_{ib}^{c}}\, W_N^{-ck} \right) + n_b^k \tag{3.12} \]

\[ \phantom{\hat{X}_b^k} = \frac{1}{\sqrt{N}} \sum_{c=0}^{N-1} \left( \sum_{i=0}^{M-1} t_{ib}^{c} + \hat{n}_b^c \right) W_N^{-ck}, \quad k = 0, \ldots, N-1, \tag{3.12$'$} \]

where $n_b^k$ is the $k$th noise sample, and $\hat{n}_b^c$ is the Fourier transform of the noise samples. Since the noise samples are uncorrelated, these two variables have the same power, $N_b$.

Figure 3.2: Frequency-Domain Beamforming in the jth OFDM receiver.

The signal at subchannel $c$ for base station $b$ in the frequency domain is

\[ \hat{d}_b^c = \sum_{i=0}^{M-1} t_{ib}^{c} + \hat{n}_b^c = t_{bb}^{c} + \left[ \sum_{i \neq b} t_{ib}^{c} + \hat{n}_b^c \right], \quad c = 0, \ldots, N-1, \tag{3.13} \]

where the first part is the desired signal attenuated by the link gain, and the term inside the bracket is the sum of the interferences and thermal noise. The SINR at the $c$th subchannel, $\Gamma_b^c$, is given by (3.1) as a function of the link gain, power value, and noise at the $c$th subchannel. Our goal is to maintain $\Gamma_b^c$ above a target value $\gamma_b^c$ while the sum of the allocated powers is minimized. To achieve this goal, we apply the power control algorithm described in the previous subsection to each subchannel independently. Since the subchannels are assumed to be orthogonal, this guarantees that the SINR at each subchannel is at least $\gamma_b^c$ [81,82].
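As an illustration of applying the update (3.8) to each subchannel independently, the following is a minimal Python sketch under stated assumptions. The function names, the two-user, two-subchannel gain values, the noise powers, and the target SINR are all hypothetical and chosen only so that the network is feasible; `gains[c][i][b]` stands for the power gain (path loss, shadowing, and fading combined) from mobile i to base station b on subchannel c.

```python
def sinr(gain, power, noise, b):
    """SINR at base station b, as in Eq. (3.1), for one subchannel."""
    interference = sum(gain[i][b] * power[i]
                       for i in range(len(power)) if i != b)
    return gain[b][b] * power[b] / (interference + noise[b])


def power_update(gain, power, noise, target):
    """One step of the distributed update of Eq. (3.8) for all mobiles."""
    m = len(power)
    return [target[b] / gain[b][b]
            * (sum(gain[i][b] * power[i] for i in range(m) if i != b)
               + noise[b])
            for b in range(m)]


def per_subchannel_power_control(gains, noise, target, iterations=200):
    """Run the iteration separately on every subchannel, starting from zero."""
    powers = []
    for gain in gains:
        power = [0.0] * len(noise)
        for _ in range(iterations):
            power = power_update(gain, power, noise, target)
        powers.append(power)
    return powers


# Hypothetical example: 2 subchannels, 2 mobile/base-station pairs.
gains = [
    [[1.0, 0.1], [0.2, 0.8]],    # subchannel 0: gains[0][i][b]
    [[0.5, 0.05], [0.1, 1.2]],   # subchannel 1
]
noise = [0.01, 0.01]             # noise power at each base station
target = [2.0, 2.0]              # target SINR (linear scale) per base station

powers = per_subchannel_power_control(gains, noise, target)
for c, p in enumerate(powers):
    print(c, p, [sinr(gains[c], p, noise, b) for b in range(2)])
```

With these assumed gains the spectral radius of the gain matrix is below one on both subchannels, so the iteration settles at powers for which every SINR equals its target; the subchannel with the weaker direct gain ends up with the larger power.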
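3.3 Power Control and Frequency-Domain Beamforming

Now consider an uplink OFDM system where adaptive beamforming is deployed at each subchannel of all OFDM receivers. Fig. 3.2 depicts the receiver with frequency-domain beamforming at each subchannel, which ensures that the subchannels can still be considered independently. The $k$th sampled received vector at the $b$th base station in the time domain is given by (using the notation introduced in (3.12))

\[ \mathbf{X}_b^k = \frac{1}{\sqrt{N}} \sum_{i=0}^{M-1} \left( \sum_{c=0}^{N-1} t_{ib}^{c} W_N^{-ck} \right) \mathbf{a}_{ib} + \mathbf{n}_b^k, \tag{3.14} \]

where $\mathbf{n}_b^k$ is the noise vector (with dimension $Q$, the number of antennas) whose elements are the noise samples at the input of each antenna, and the $Q$-dimensional vector $\mathbf{a}_{ib}$ is the array response at the $b$th receiver for the $i$th transmitter. The received signal at each antenna is passed through an OFDM receiver. The resultant $c$th outputs of the FFT blocks create the vector

\[ \hat{\mathbf{d}}_b^c = \sum_{i=0}^{M-1} t_{ib}^{c}\, \mathbf{a}_{ib} + \hat{\mathbf{n}}_b^c, \tag{3.15} \]

where $\hat{\mathbf{n}}_b^c$ is the Fourier transform of $\mathbf{n}_b^k$. The output of the beamformer at subchannel $c$ is then given by $e^c = \mathbf{w}_b^{cH} \hat{\mathbf{d}}_b^c$.

If we assume that the receiver knows the array response to the desired user, we can use Minimum Variance Distortionless Response (MVDR) beamforming [87]. In MVDR, the weight vector is calculated in order to minimize the total energy at the beamformer output while the gain toward the desired direction is unity. The joint beamforming and power control algorithm is performed at each subchannel separately, assuming perfect orthogonalization. The energy of the beampattern at subchannel $c$ is $\mathcal{E}^c = E[e^c e^{c*}] = \mathbf{w}_b^{cH} E\!\left[\hat{\mathbf{d}}_b^c \hat{\mathbf{d}}_b^{cH}\right] \mathbf{w}_b^c$. Assuming that the noise is a zero mean, white Gaussian process, and the transmitted symbols are independent and have average unity energy (see the assumptions in Section 3.2), we obtain from (3.15) that

\[ \mathcal{E}^c = \sum_{i=0}^{M-1} \Big( P_i^c |H_{ib}^{c}|^2 \underbrace{\mathbf{w}_b^{cH} \mathbf{a}_{ib} \mathbf{a}_{ib}^{H} \mathbf{w}_b^{c}}_{u_{ib}^{c}} \Big) + \frac{N_b}{2} |\mathbf{w}_b^c|^2 = \left[ \sum_{i \neq b} \left( P_i^c |H_{ib}^{c}|^2 u_{ib}^{c} \right) + \frac{N_b}{2} |\mathbf{w}_b^c|^2 \right] + P_b^c |H_{bb}^{c}|^2 u_{bb}^{c}, \tag{3.16} \]

in which the term inside the bracket is the interference plus noise and the second term is the power of the signal coming from the desired direction. The SINR at the output of the beamformer at subchannel $c$ is given by

\[ \Gamma_b^c = \frac{P_b^c |H_{bb}^{c}|^2 \left| \mathbf{w}_b^{cH} \mathbf{a}_{bb} \right|^2}{\sum_{i \neq b} \left( P_i^c |H_{ib}^{c}|^2 \left| \mathbf{w}_b^{cH} \mathbf{a}_{ib} \right|^2 \right) + N_b |\mathbf{w}_b^c|^2}, \quad c = 0, \ldots, N-1. \tag{3.17} \]

The MVDR solution for beamforming optimization will be [87]

\[ \mathbf{w}_b^c = \frac{(\mathbf{R}_b^c)^{-1} \mathbf{a}_{bb}}{\mathbf{a}_{bb}^{H} (\mathbf{R}_b^c)^{-1} \mathbf{a}_{bb}}, \quad c = 0, \ldots, N-1, \tag{3.18} \]

where the data correlation matrix at the $c$th subchannel of base station $b$ is

\[ \mathbf{R}_b^c = \sum_{i \neq b} \left( P_i^c |H_{ib}^{c}|^2 \mathbf{a}_{ib} \mathbf{a}_{ib}^{H} \right) + N_b \mathbf{I}. \tag{3.19} \]

Note that in (3.19) we have used $i \neq b$.

Lemma 3.3.1. To find the optimum receive beamforming weight vector, we can drop the restriction $i \neq b$ in the definition of the autocorrelation matrix in MVDR beamforming (Eq. (3.19)).

Proof. Take $\mathbf{R}_b^c$ as defined in Eq. (3.19), and define

\[ \mathbf{\Phi} = \sum_{i} \left( P_i^c |H_{ib}^{c}|^2 \mathbf{a}_{ib} \mathbf{a}_{ib}^{H} \right) + N_b \mathbf{I}. \tag{3.20} \]

We want to show that

\[ \frac{(\mathbf{R}_b^c)^{-1} \mathbf{a}_{bb}}{\mathbf{a}_{bb}^{H} (\mathbf{R}_b^c)^{-1} \mathbf{a}_{bb}} = \frac{\mathbf{\Phi}^{-1} \mathbf{a}_{bb}}{\mathbf{a}_{bb}^{H} \mathbf{\Phi}^{-1} \mathbf{a}_{bb}}. \tag{3.21} \]

The Matrix Inversion Lemma (MIL) [87] says that for any three matrices $\mathbf{P}$, $\mathbf{M}$, $\mathbf{Q}$, we have

\[ \left( \mathbf{P}^{-1} + \mathbf{M}^{H} \mathbf{Q}^{-1} \mathbf{M} \right)^{-1} = \mathbf{P} - \mathbf{P}\mathbf{M}^{H} \left( \mathbf{M}\mathbf{P}\mathbf{M}^{H} + \mathbf{Q} \right)^{-1} \mathbf{M}\mathbf{P}. \tag{3.22} \]

Take $\mathbf{A} = P_b^c |H_{bb}^{c}|^2 \mathbf{a}_{bb} \mathbf{a}_{bb}^{H}$, $\mathbf{P} = (\mathbf{R}_b^c)^{-1}$, $\mathbf{M} = \mathbf{a}_{bb}^{H}$, and $\mathbf{Q} = \frac{1}{P_b^c |H_{bb}^{c}|^2}$, so that $\mathbf{M}^{H}\mathbf{Q}^{-1}\mathbf{M} = \mathbf{A}$. Considering $\mathbf{\Phi} = \mathbf{A} + \mathbf{R}_b^c = \mathbf{P}^{-1} + \mathbf{M}^{H}\mathbf{Q}^{-1}\mathbf{M}$, we need to show

\[ \frac{\left( \mathbf{P}^{-1} + \mathbf{M}^{H} \mathbf{Q}^{-1} \mathbf{M} \right)^{-1} \mathbf{M}^{H}}{\mathbf{M} \left( \mathbf{P}^{-1} + \mathbf{M}^{H} \mathbf{Q}^{-1} \mathbf{M} \right)^{-1} \mathbf{M}^{H}} = \frac{\mathbf{P}\mathbf{M}^{H}}{\mathbf{M}\mathbf{P}\mathbf{M}^{H}}. \tag{3.23} \]

Using the MIL, and noting that $\alpha \triangleq \left( \mathbf{M}\mathbf{P}\mathbf{M}^{H} + \mathbf{Q} \right)^{-1}$ and $\beta = \mathbf{M}\mathbf{P}\mathbf{M}^{H}$ are two scalar numbers, we have

\[ \frac{\left( \mathbf{P}^{-1} + \mathbf{M}^{H} \mathbf{Q}^{-1} \mathbf{M} \right)^{-1} \mathbf{M}^{H}}{\mathbf{M} \left( \mathbf{P}^{-1} + \mathbf{M}^{H} \mathbf{Q}^{-1} \mathbf{M} \right)^{-1} \mathbf{M}^{H}} = \frac{\left\{ \mathbf{P} - \mathbf{P}\mathbf{M}^{H} \left( \mathbf{M}\mathbf{P}\mathbf{M}^{H} + \mathbf{Q} \right)^{-1} \mathbf{M}\mathbf{P} \right\} \mathbf{M}^{H}}{\mathbf{M} \left\{ \mathbf{P} - \mathbf{P}\mathbf{M}^{H} \left[ \mathbf{M}\mathbf{P}\mathbf{M}^{H} + \mathbf{Q} \right]^{-1} \mathbf{M}\mathbf{P} \right\} \mathbf{M}^{H}} = \frac{\mathbf{P}\mathbf{M}^{H} - \alpha\beta\, \mathbf{P}\mathbf{M}^{H}}{\beta - \alpha\beta^2} = \frac{\mathbf{P}\mathbf{M}^{H} (1 - \alpha\beta)}{\beta (1 - \alpha\beta)} = \frac{\mathbf{P}\mathbf{M}^{H}}{\beta} = \frac{\mathbf{P}\mathbf{M}^{H}}{\mathbf{M}\mathbf{P}\mathbf{M}^{H}}. \]

By considering the fact that the MVDR constraint enforces $\left| \mathbf{w}_b^{cH} \mathbf{a}_{bb} \right|^2 = 1$, we solve (3.17) in terms of $P_b^c$ and adopt the iterative scheme presented in [81]. As a result, the mapping at each iteration is the combination of (3.18) and the following equation:

\[ P_b^c(n+1) = \min_{\mathbf{w}_b^c} \frac{\gamma_b^c}{|H_{bb}^{c}|^2} \underbrace{\left[ \sum_{i \neq b} \left( |H_{ib}^{c}|^2 P_i^c(n)\, u_{ib}^{c} \right) + \frac{N_b}{2} |\mathbf{w}_b^c|^2 \right]}_{I_b^c(n)}, \quad \text{subject to } \mathbf{w}_b^{cH} \mathbf{a}_{bb} = 1. \tag{3.24} \]

The following algorithm achieves the jointly optimal power allocations and adaptive beamforming at each subchannel:

Algorithm I [Joint Power control and Frequency-Domain Beamforming for OFDM]

1. At step $n = 0$, the $b$th base station sets $P_b^c(n) = 0$ ($c = 0, \ldots, N-1$) for its mobile.

2. For each subchannel, the $b$th base station calculates the autocorrelation matrix $\mathbf{R}_b^c$, and uses (3.18) to compute the weight vector $\mathbf{w}_b^c$.

3. The base station calculates the interference and noise at subchannel $c$, $I_b^c(n)$, as given in (3.16), and transmits it to the transmitter through the feedback channel.

4. The mobile transmitter updates the power at each subchannel according to

\[ P_b^c(n+1) = \frac{\gamma_b^c}{|H_{bb}^{c}|^2}\, I_b^c(n). \tag{3.25} \]

5. If $P_b^c(n+1) > P_{\max}$, we set $P_b^c(n+1) = P_{\max}$, where $P_{\max}$ is a pre-determined maximum subchannel power. This prevents the subchannels in deep fade from consuming a tremendous amount of power.

6. If $\sum_{p=0}^{N-1} |P_b^p(n+1) - P_b^p(n)|^2 \leq \epsilon$, where $\epsilon$ is a threshold that defines the speed of convergence, the base station stops; otherwise it sets $n = n+1$ and goes back to step 2.

Note that instead of imposing an upper bound on each subchannel's power ($P_{\max}$), we can discard the subchannel that is in deep fade. However, this results in data rate reduction.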
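The weight computation in step 2 of Algorithm I and the claim of Lemma 3.3.1 can be checked numerically on a small case. The sketch below is only an illustration under assumptions that are not in the text: a two-element array with a half-wavelength steering model, one interferer, hypothetical arrival angles, and hypothetical received powers $P_i^c |H_{ib}^c|^2$. It computes the MVDR weights of (3.18) once with the matrix of (3.19) and once with the matrix of (3.20), and compares them.

```python
import cmath
import math


def steering(theta):
    """Array response of a 2-element, half-wavelength array (assumed model)."""
    return [1 + 0j, cmath.exp(-1j * math.pi * math.sin(theta))]


def rank_one(a, scale):
    """Return scale * a a^H as a 2x2 nested list."""
    return [[scale * a[r] * a[c].conjugate() for c in range(2)]
            for r in range(2)]


def mat_add(x, y):
    return [[x[r][c] + y[r][c] for c in range(2)] for r in range(2)]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mvdr_weights(r, a):
    """Eq. (3.18): w = R^{-1} a / (a^H R^{-1} a)."""
    r_inv = inverse_2x2(r)
    r_inv_a = [r_inv[k][0] * a[0] + r_inv[k][1] * a[1] for k in range(2)]
    denom = a[0].conjugate() * r_inv_a[0] + a[1].conjugate() * r_inv_a[1]
    return [x / denom for x in r_inv_a]


# Hypothetical scenario: desired user at 0.3 rad, one interferer at -0.7 rad.
a_b = steering(0.3)
a_i = steering(-0.7)
noise_power = 0.1
noise_term = [[noise_power, 0], [0, noise_power]]

r_interf = mat_add(rank_one(a_i, 0.8), noise_term)   # Eq. (3.19), i != b only
phi = mat_add(r_interf, rank_one(a_b, 1.5))          # Eq. (3.20), all users

w_r = mvdr_weights(r_interf, a_b)
w_phi = mvdr_weights(phi, a_b)
gain = w_r[0].conjugate() * a_b[0] + w_r[1].conjugate() * a_b[1]

print(w_r)
print(w_phi)
print(gain)   # distortionless constraint: w^H a_bb should be 1
```

In this example the two weight vectors agree to within rounding, and the gain toward the desired user is unity, which is what lets the base station estimate the correlation matrix from the full received signal instead of separating out the interference first.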
It has been shown in [81] that if there is a solution for the joint power control and beamforming problem for single carrier users, this algorithm will converge to the optimum solution and this solution is unique. Here, we have used the same algorithm at each subchannel independently. Therefore, the convergence and uniqueness results also apply to the proposed algorithm.

In this work, we assumed that each mobile is using all available subchannels. However, in Orthogonal Frequency Division Multiple Access (OFDMA), users are grouped and a subset of OFDM subchannels is assigned to each group of users. In this case, the proposed algorithms can be applied to the co-channel users in each group by using $N_i$ for the number of subchannels assigned to group $i$. The modulation schemes can be different in different groups, and this allows us to have a separate desired SINR for each group.

3.4 Power Control and Time-Domain Beamforming

In Fig. 3.2, to be able to consider the subchannels individually, we had to apply the weight vectors after the OFDM receiver. If a $Q$-element antenna array is deployed and we use an $N$-subchannel FFT block at each antenna, the complexity (number of multiplications) at the receiver would be in the order of $QN \log N + NQ^4$.

Let us consider a system as depicted in Fig. 3.3. In this case, the weight vectors will pass through the FFT block and the subchannels cannot be considered independently anymore. Note that the beamformer is applied prior to the FFT block, and the data applied to the beamformer is said to be in the time domain. This means that the samples are not the data of different subchannels, and so their optimum weight vectors need not be different.

Figure 3.3: Time-Domain Beamforming in the jth OFDM receiver.

In this system the beamforming is performed in the time domain and, unlike Fig. 3.2 in which $N$ sets of weight vectors are calculated at each iteration, only one set of weight vectors is calculated at each iteration. Clearly, this system has lower complexity than that of Fig. 3.2. Using the same parameters, the complexity of Fig. 3.3 is $N \log N + Q^4$. Using typical values of 4 for $Q$ and 128 for $N$, the complexity of Fig. 3.2 is about 33847, while that of Fig. 3.3 is about 525, which means a complexity reduction by a factor of about 64.

Using the system depicted in Fig. 3.3, we are no longer able to consider the joint beamforming and power control at each subchannel independently. In an OFDM system, the symbol decisions are made at the FFT output, so the error and weight vector calculations have to be done in the frequency domain. If a time-domain beamformer is to be used, we need to relate the frequency-domain error to the corresponding quantity in the time domain. In the sequel, we present the relationship between the time-domain beamforming error and the same quantity in the frequency domain. One way to look at this problem is to minimize the energy of $\hat{D}_b$, the output of the beamformer in Fig. 3.3. Using the Parseval equation, this is equivalent to minimizing the sum of the energies of the subchannels at the output of the FFT block.
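The complexity figures quoted above for Fig. 3.2 and Fig. 3.3 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it assumes the logarithm is taken to base 10, which is the choice that appears to match the quoted values of roughly 33847 and 525.

```python
import math


def freq_domain_cost(q, n):
    """Q*N*log(N) + N*Q^4: one FFT per antenna, one beamformer per subchannel."""
    return q * n * math.log10(n) + n * q ** 4


def time_domain_cost(q, n):
    """N*log(N) + Q^4: a single beamformer followed by a single FFT."""
    return n * math.log10(n) + q ** 4


Q, N = 4, 128
f_cost = freq_domain_cost(Q, N)
t_cost = time_domain_cost(Q, N)
print(f_cost, t_cost, f_cost / t_cost)
```

Under this assumption the ratio of the two costs comes out slightly above 64, in line with the reduction stated in the text; with a base-2 logarithm the absolute counts would differ while the ordering of the two schemes stays the same.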
If $Q$ antennas are employed at each receiver and, within one symbol period made up of $N$ samples, we denote the $k$th sample input to the $q$th antenna at base station $b$ by $x_{bq}^{k}$, the $k$th input to the FFT block will be

\[ \hat{X}_b^k = \sum_{q=0}^{Q-1} x_{bq}^{k}\, w_{bq}^{*}, \quad k = 0, \ldots, N-1. \tag{3.26} \]

Using (3.14) and (3.26), the received signal at the $p$th subchannel will be

\[ \hat{d}_b^p = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \hat{X}_b^k W_N^{kp} = \frac{1}{N} \sum_{q=0}^{Q-1} w_{bq}^{*} \sum_{i=0}^{M-1} a_{ib}^{q} \sum_{c=0}^{N-1} t_{ib}^{c} \sum_{k=0}^{N-1} W_N^{k(p-c)} + \frac{1}{\sqrt{N}} \sum_{q=0}^{Q-1} w_{bq}^{*} \sum_{k=0}^{N-1} W_N^{kp}\, n_{bq}^{k} = \sum_{i=0}^{M-1} t_{ib}^{p} \underbrace{\sum_{q=0}^{Q-1} a_{ib}^{q} w_{bq}^{*}}_{v_{ib} \triangleq \mathbf{w}_b^{H} \mathbf{a}_{ib}} + \sum_{q=0}^{Q-1} w_{bq}^{*}\, \hat{n}_{bq}^{p}. \tag{3.27} \]

Note that in (3.27) we have used the property of comb sequences, which states that $\sum_{k=0}^{N-1} W_N^{kp} = 0$ for all $p \neq 0$. Using the fact that the input symbol energy is unity, the signal energy at subchannel $p$ is obtained as

\[ e_p = E\!\left[ \hat{d}^p \hat{d}^{p*} \right] = \sum_{i=0}^{M-1} P_i^p |H_{ib}^{p}|^2 |v_{ib}|^2 + \frac{N_b}{2} |\mathbf{w}_b|^2. \tag{3.28} \]

It is not possible to minimize the energy of all of the subchannels simultaneously, thus we use a metric which is a positive combination of all the $e_p$. Since each $e_p$ is actually an energy quantity, we simply minimize the sum of the energies, which is equivalent to the energy at the output of the beamformer, $\hat{D}_b$, as illustrated in Fig. 3.3. So our optimization problem becomes

\[ \mathbf{w}_b = \arg\min_{\mathbf{w}_b} \left\{ \sum_{i=0}^{M-1} |v_{ib}|^2 \sum_{p=0}^{N-1} P_i^p |H_{ib}^{p}|^2 + \frac{N N_b}{2} |\mathbf{w}_b|^2 \right\}, \quad \text{subject to } \mathbf{w}_b^{H} \mathbf{a}_{bb} = 1, \tag{3.29} \]

in which the term inside the braces is equal to $\sum_{p=0}^{N-1} e_p$.

This is very similar to a normal beamforming process. The solution for the vector $\mathbf{w}_b$ is similar to (3.18) with the matrix $\mathbf{R}_b$ defined as

\[ \mathbf{R}_b = \sum_{i \neq b} \left( \mathbf{a}_{ib} \mathbf{a}_{ib}^{H} \sum_{p=0}^{N-1} P_i^p |H_{ib}^{p}|^2 \right) + \frac{N N_b}{2} \mathbf{I} = \sum_{p=0}^{N-1} \mathbf{R}_b^p, \tag{3.30} \]

where $\mathbf{R}_b^p$ is the autocorrelation matrix at subchannel $p$ as defined in (3.19). As in the case of frequency-domain beamforming, using the Matrix Inversion Lemma, the restriction $i \neq b$ in (3.30) can be dropped. By replacing $u_{ib}^{c}$ with $u_{ib} = \left| \mathbf{w}_b^{H} \mathbf{a}_{ib} \right|^2$, the SINR at the $c$th subchannel is similar to (3.17).
Again, by adopting the iterative scheme presented in [81,82], the iteration for power calculation at each step will be

\[ P_b^c(n+1) = \min_{\mathbf{w}_b} \frac{\gamma_b^c}{|H_{bb}^{c}|^2} \underbrace{\left[ \sum_{i \neq b} \left[ u_{ib} \sum_{p=0}^{N-1} \left( |H_{ib}^{p}|^2 P_i^c(n) \right) \right] + \frac{N N_b}{2} |\mathbf{w}_b|^2 \right]}_{I_b^c(n)}, \quad \text{subject to } \mathbf{w}_b^{H} \mathbf{a}_{bb} = 1. \tag{3.31} \]

At each receiver the algorithm will be as follows:

Algorithm II [Joint Power control and Time-Domain Beamforming for OFDM]

1. At step $n = 0$, the $b$th base station sets $P_b^p(n) = 0$ ($p = 0, \ldots, N-1$) for its mobile.

2. The $b$th base station calculates the sum of the autocorrelation matrices of all subchannels, $\sum_{p=0}^{N-1} \mathbf{R}_b^p$.

3. The base station uses the quantities computed in Step 2 to find the weight vector $\mathbf{w}_b$. Note that here the weight vector is not calculated for each subchannel.

4. The base station measures the interference represented by $I_b^c(n)$ in (3.31) at each subchannel locally, and transmits these values to the $b$th mobile through a feedback channel.

5. The $b$th mobile uses (3.25) to re-calculate the power at each subchannel.

6. If $P_b^c(n+1) > P_{\max}$, we set $P_b^c(n+1) = P_{\max}$, where $P_{\max}$ is a pre-determined maximum subchannel power. This prevents the subchannels in deep fade from consuming a tremendous amount of power.

7. If $\sum_{c=0}^{N-1} |P_b^c(n+1) - P_b^c(n)|^2 \leq \epsilon$, where $\epsilon$ is a pre-determined threshold that defines the speed of convergence, the base station stops; otherwise it sets $n = n+1$ and goes back to step 2.

Let us assume that the gain matrix is denoted by $\mathbf{F}(\mathbf{w})$, whose $(ib)$th element is $\frac{\gamma_b^c |H_{ib}^{c}|^2 |\mathbf{w}_b^{H} \mathbf{a}_{ib}|^2}{|H_{bb}^{c}|^2}$ for $i \neq b$ and 0 for $i = b$. The gain matrix is an irreducible non-negative matrix and by the Perron-Frobenius theorem [83] has a positive real eigenvalue that is larger than the amplitude of all other eigenvalues (the spectral radius of the matrix). If the spectral radius of the gain matrix is less than unity, there is a solution for the algorithm [81,82]. Let us call the mapping defined by the modified version of (3.25) (replacing $\mathbf{w}_b^c$ with $\mathbf{w}_b$) $m^{\mathbf{w}}(\mathbf{p}_n)$, and the mapping defined by the combination of (3.29) and (3.25) $m(\mathbf{p}_n)$.
Theorem 3.4.1. For any two power vectors $\mathbf{p}_1^l$ and $\mathbf{p}_2^l$ at subchannel $l$ such that $\mathbf{p}_1^l \leq \mathbf{p}_2^l$, the following holds:

\[ \text{(a) } m(\mathbf{p}_1^l) \leq m^{\mathbf{w}}(\mathbf{p}_1^l) \text{ for all } \mathbf{w}; \qquad \text{(b) } m^{\mathbf{w}}(\mathbf{p}_1^l) \leq m^{\mathbf{w}}(\mathbf{p}_2^l) \text{ for all } \mathbf{w}; \qquad \text{(c) } m(\mathbf{p}_1^l) \leq m(\mathbf{p}_2^l). \tag{3.32} \]

Moreover, using Algorithm II, the fixed point of the mapping $m(\mathbf{p}_n)$ is unique.

Proof. Note that the coefficients of the power values in the mappings are positive. As a result, the proof of (a), (b), and (c), and also of the convergence and optimality of Algorithm II and the uniqueness of the fixed point of $m(\mathbf{p}_n)$, is essentially very similar to the proof of Theorem 1 in [81].

Therefore, if the link gains and steering vectors are such that there exists a solution for this joint power control and beamforming problem, the above mentioned algorithm will always converge to a unique optimal solution. If there is a solution to the iterative algorithm, the application of the upper bound to each subchannel's power ($P_{\max}$) will ease the convergence.

As in the frequency-domain case, in time-domain beamforming only one real value is exchanged through the feedback channel from the receiver to the transmitter for each link per update. Therefore, the required bandwidth for the feedback channel is the same for both methods.

3.5 Power Control and MMSE Time-Domain Beamforming

If the base stations do not have full knowledge of the array responses $\mathbf{a}_{ib}$, we must use a training sequence which is correlated with the desired signal. The weight vector is obtained by minimizing the mean square error between the estimated signal and the training sequence. This is called the Minimum Mean Squared Error (MMSE) approach [26]. MMSE can be applied to both frequency-domain and time-domain beamforming. Here we only show the MMSE time-domain beamforming.
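If we denote the $k$th sample at the $q$th antenna at base station $b$ by $X_{bq}^{k}$, from (3.27) we obtain

\[ \hat{d}_b^c = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \hat{X}_b^k W_N^{kc} = \sum_{q=0}^{Q-1} w_{bq}^{*} \left[ \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_{bq}^{k} W_N^{kc} \right] = \mathbf{w}_b^{H} \mathbf{s}_b^c, \tag{3.33} \]

where the sequence $s_{bq}^{c}$ ($c = 0, \ldots, N-1$) is the Fourier transform of the sequence $X_{bq}^{k}$ ($k = 0, \ldots, N-1$). The objective in MMSE beamforming is to minimize $\varepsilon_{cb}^2$ given by

\[ \varepsilon_{cb}^2 = E\!\left[ |\hat{d}_b^c - t_b^c|^2 \right] = P_b^c |H_{bb}^{c}|^2 + \mathbf{w}_b^{H} \mathbf{R}_{ss}^{c} \mathbf{w}_b - 2\,\mathrm{Re}\{ \mathbf{w}_b^{H} \mathbf{r}_{st}^{c} \}, \tag{3.34} \]

where $t_b^c = \sqrt{P_b^c}\, H_{bb}^{c}\, d_b^c$, and $d_b^c$ is the training sequence at the $b$th transmitter, whose power is assumed to be unity. Moreover, $\mathbf{R}_{ss}^{c} = E[\mathbf{s}_b^c \mathbf{s}_b^{cH}]$ and $\mathbf{r}_{st}^{c} = E[\mathbf{s}_b^c t_b^{c*}]$. Here we have $N$ subchannels and the weight vector is the same for all of them, so the criterion in MMSE time-domain beamforming is to minimize $\sum_{c=0}^{N-1} \varepsilon_{cb}^2$, where

\[ \sum_{p=0}^{N-1} \varepsilon_{pb}^2 = \sum_{p=0}^{N-1} P_b^p |H_{bb}^{p}|^2 + \mathbf{w}_b^{H} \left( \sum_{p=0}^{N-1} \mathbf{R}_{ss}^{p} \right) \mathbf{w}_b - 2\,\mathrm{Re}\!\left( \mathbf{w}_b^{H} \sum_{p=0}^{N-1} \mathbf{r}_{st}^{p} \right). \tag{3.35} \]

This is a typical MMSE optimization problem and, if $\sum_{c=0}^{N-1} \mathbf{R}_{ss}^{c}$ is nonsingular, its solution will be the well-known Wiener-Hopf equation [26] given by

\[ \mathbf{w}_b^{\mathrm{opt}} = \left( \sum_{c=0}^{N-1} \mathbf{R}_{ss}^{c} \right)^{-1} \sum_{c=0}^{N-1} \mathbf{r}_{st}^{c}. \tag{3.36} \]

The time-domain beamformer does not minimize the individual error at each subchannel. Therefore, the principle of orthogonality, valid for the Wiener-Hopf solution, is not satisfied here. However, in the following theorem, we will prove that this solution is equivalent to the MVDR solution up to a constant coefficient and therefore results in the same SINR [26].

Theorem 3.5.1. If the training sequences transmitted from different mobiles are uncorrelated, the MMSE weight vector presented in (3.36) is equivalent to the MVDR weight vector, as expressed in (3.18) with the autocorrelation matrix defined as in (3.30), up to a constant coefficient.

Proof. Using (3.14) and the definition of the vector $\mathbf{s}_b^c$, we have

\[ \mathbf{r}_{st}^{p} = E[\mathbf{s}_b^p t_b^{p*}] = E\!\left[ \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \mathbf{x}_b^k W_N^{kp}\, t_b^{p*} \right] = \frac{1}{N} \sum_{i=0}^{M-1} \sum_{c=0}^{N-1} \mathbf{a}_{ib} E[t_{ib}^{c} t_b^{p*}] \sum_{k=0}^{N-1} W_N^{k(p-c)} + \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} E[\mathbf{n}_b^k t_b^{p*}] W_N^{kp} = \sum_{i=0}^{M-1} \mathbf{a}_{ib} E[t_{ib}^{p} t_b^{p*}] = E[|t_b^p|^2]\, \mathbf{a}_{bb} = P_b^p |H_{bb}^{p}|^2\, \mathbf{a}_{bb}. \tag{3.37} \]

Therefore

\[ \sum_{p=0}^{N-1} \mathbf{r}_{st}^{p} = \mathbf{a}_{bb} \sum_{p=0}^{N-1} P_b^p |H_{bb}^{p}|^2. \tag{3.38} \]

On the other hand, by using the definition of $\mathbf{R}_{ss}^{p}$, we get

\[ \begin{aligned} \mathbf{R}_{ss}^{p} = E[\mathbf{s}_b^p \mathbf{s}_b^{pH}] &= \frac{1}{N} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} E[\mathbf{x}_b^k \mathbf{x}_b^{k'H}]\, W_N^{kp} W_N^{-k'p} \\ &= \frac{1}{N^2} \sum_{i=0}^{M-1} \sum_{i'=0}^{M-1} \mathbf{a}_{ib} \mathbf{a}_{i'b}^{H} \sum_{c=0}^{N-1} \sum_{c'=0}^{N-1} E[t_{ib}^{c} t_{i'b}^{c'*}] \sum_{k=0}^{N-1} W_N^{k(p-c)} \sum_{k'=0}^{N-1} W_N^{-k'(p-c')} + \frac{1}{N} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} E[\mathbf{n}_b^k \mathbf{n}_b^{k'H}]\, W_N^{(k-k')p} \\ &= \sum_{i=0}^{M-1} \sum_{i'=0}^{M-1} \mathbf{a}_{ib} \mathbf{a}_{i'b}^{H} E[t_{ib}^{p} t_{i'b}^{p*}] + \frac{1}{N} \sum_{k=0}^{N-1} E[\mathbf{n}_b^k \mathbf{n}_b^{kH}] \\ &= \sum_{i=0}^{M-1} \mathbf{a}_{ib} \mathbf{a}_{ib}^{H} E[|t_{ib}^{p}|^2] + \frac{N_b}{2} \mathbf{I} = \sum_{i=0}^{M-1} \mathbf{a}_{ib} \mathbf{a}_{ib}^{H} P_i^p |H_{ib}^{p}|^2 + \frac{N_b}{2} \mathbf{I} = \mathbf{R}_b^p. \end{aligned} \tag{3.39} \]

Therefore

\[ \sum_{p=0}^{N-1} \mathbf{R}_{ss}^{p} = \sum_{p=0}^{N-1} \mathbf{R}_b^p. \tag{3.40} \]

This is the same as the autocorrelation matrix for MVDR defined in (3.30). Using (3.37) and (3.39), the weight vectors for MVDR and MMSE time-domain beamforming coincide up to a constant coefficient.

The interference at subchannel $p$ equals the difference between the received power, $E[|\hat{d}_b^p|^2]$, and the power of the desired signal, and is given by

\[ I_b^p = \mathbf{w}_b^{\mathrm{opt}\,H} \mathbf{R}_{ss}^{p} \mathbf{w}_b^{\mathrm{opt}} - P_b^p |H_{bb}^{p}|^2 \left| \mathbf{w}_b^{\mathrm{opt}\,H} \mathbf{a}_{bb} \right|^2 = \mathbf{w}_b^{\mathrm{opt}\,H} \left( \mathbf{R}_{ss}^{p} - \frac{1}{P_b^p |H_{bb}^{p}|^2}\, \mathbf{r}_{st}^{p} \mathbf{r}_{st}^{pH} \right) \mathbf{w}_b^{\mathrm{opt}}, \quad p = 0, \ldots, N-1. \tag{3.41} \]

Therefore, the MMSE algorithm is outlined as follows:

Algorithm III [Joint Power control and MMSE Time-Domain Beamforming for OFDM]

1. At step $n = 0$, the $b$th base station sets $P_b^c(n) = 0$ ($c = 0, \ldots, N-1$) for its mobile.

2. Using (3.36), the $b$th base station finds the weight vector $\mathbf{w}_b^{\mathrm{opt}}$.

3. The base station uses (3.41) to find the interference at each subchannel and transmits these values to the $b$th mobile through the feedback channel.

4. The $b$th mobile uses (3.25) to re-calculate the power at each subchannel. If $P_b^c(n+1) > P_{\max}$, we set $P_b^c(n+1) = P_{\max}$.

5. If

\[ \sum_{c=0}^{N-1} |P_b^c(n+1) - P_b^c(n)|^2 \leq \epsilon, \tag{3.42} \]

where $\epsilon$ is a pre-determined threshold that defines the speed of convergence, the base station stops; otherwise it sets $n = n+1$ and goes back to step 2.

3.6 Extension to COFDM

Theoretically, the bit and power allocation obtained by the loading algorithms meet the desired bit error rate as long as the time variation of the channel is very limited.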
Performing bit loading on time-varying channels requires a mechanism to adapt to the channel variation. Moreover, as we mentioned in Section 3.1, the interference from other mobiles could be severe and cause performance degradation.

Many practical OFDM systems use coding across subchannels (in the frequency domain) to achieve better immunity to frequency-selective fading channels. This provides a link between bits transmitted on separate subchannels and is done in such a way that the information conveyed by the subchannels in deep fade can be reconstructed from the information received through the ones which are not in deep fade. Therefore, coding applied to OFDM can be seen as a tool to average fading across subchannels. Block or convolutional codes are used either on their own or combined together (as the inner and outer code), possibly with interleaving.

Here, we will use Trellis Coded Modulation (TCM) at the OFDM transmitters. In TCM, convolutional coding is combined with modulation, which results in a higher coding gain. Mostly, TCM is based on the set partitioning performed by Ungerboeck's encoder [11], in which $m$ information bits map into a signal from the $2^{m+1}$-ary constellation. $k$ of these bits are encoded by a rate $k/(k+1)$ convolutional encoder to select one of the $2^{k+1}$ partitions at the $(k+1)$th level of the constellation's partition tree. The remaining $m-k$ bits are used to select one point within the designated partition. Adaptive TCM (ATCM) uses an MQAM constellation and has a coding gain of at least 7 dB over simple TCM (see [79] for details). The TCM used in this chapter is the 8-state 8-PSK encoder depicted in Fig. 3.4, and its trellis is depicted in Fig. 3.5. This encoder has a coding gain of 3.6 dB for high SNR environments [88].

Although COFDM averages the fading over all of the subchannels, in an environment with low or moderately high SNR, the BER depends on the SNR of each subchannel.
This dependence can be seen more clearly for TCM through the following inequality [7,89]:

\[ N_{d_f}\, Q\!\left( \sqrt{\frac{d_f^2 E_s}{2N_0}} \right) \le P_e \le \sum_{d=d_f}^{\infty} N_d\, Q\!\left( \sqrt{\frac{d^2 E_s}{2N_0}} \right), \tag{3.43} \]

where $N_d$ is the average number of paths in the trellis having squared Euclidean distance $d^2$ from the all-zero path, $d_f^2$ is the normalized squared free distance of the code, and $Q(\cdot)$ is the Gaussian tail (error) function.

[Figure 3.4: The 8-state 8-PSK TCM encoder.]

Moreover, in a multiuser environment it is not only the fading that determines the performance of each subchannel and the overall performance of a single link. Interference plays a detrimental role in the overall bit error rate, and therefore increasing the SINR at each subchannel can improve the overall bit error rate. From a system-level point of view, COFDM applied to a single user only averages the fades over the different subchannels of that user, but cannot optimize the allocation of resources in a multiuser environment so that the effect of interference is minimized. Consequently, applying beamforming to each subchannel can improve the performance of the system. Moreover, by beamforming at each subchannel, we are able to decrease the power consumed to achieve the same performance. The SINR fluctuation is amplified by the spatial processing, and therefore the dependence of the overall bit error rate on the SINR of each subchannel becomes more severe. Power control can compensate for this fluctuation.

In this chapter, we will consider four COFDM systems using TCM: a COFDM system with no power control or beamforming; a COFDM system using frequency-domain beamforming to increase the SINR with the same amount of total network power; a system with joint power control and frequency-domain beamforming per subchannel; and finally a COFDM system where power control is performed per user and beamforming per subchannel. In the second and third systems, the per-subchannel SINR is measured at the symbol level, before decoding (or demodulation). In the last system, the following iterative algorithm is used to achieve power control per user and beamforming per subchannel. This algorithm adapts the total user power according to the equivalent SINR of the COFDM system, derived from the BER at the receiver. The equivalent SINR of the COFDM system is defined as the SINR of an uncoded-OFDM system achieving the same BER, minus the coding gain of the code.

[Figure 3.5: The 8-state 8-PSK TCM trellis.]

Algorithm IV [Joint Power Control per User and Frequency-Domain Beamforming for COFDM]

1. At step $n = 0$, all mobiles start with equal powers at all subchannels. The weight vectors have one component equal to one and the rest equal to zero.
2. Each base station calculates the BER using a fixed number of frames.
3. Each base station calculates the SINR of the equivalent uncoded system using the relationship
\[ \mathrm{BER} \approx \frac{2}{k}\, Q\!\left( \sqrt{2\gamma_{\mathrm{uncoded}}}\, \sin\frac{\pi}{M} \right), \tag{3.44} \]
where $M = 2^k$ is the constellation size and $\gamma_{\mathrm{uncoded}}$ is the SINR of the equivalent uncoded system [7]. This expression approximates the bit error probability of MPSK modulation over AWGN channels. We use it here because we have assumed that the channel is known at the receiver and, by the central limit theorem, the interference can be considered Gaussian.
4. The equivalent SINR of the COFDM system is calculated as
\[ \gamma_{\mathrm{coded}} = \gamma_{\mathrm{uncoded}} - C, \tag{3.45} \]
where $C$ is the coding gain of the coding scheme and the SINRs are evaluated in dB.
5. The following relationship is used to calculate the total power of each mobile, which is distributed equally among the subchannels:
\[ P(n+1) = P(n)\, \frac{\gamma_{\mathrm{desired}}}{\gamma_{\mathrm{coded}}}. \tag{3.46} \]
6. Each base station uses (3.18) to find the beamforming weight at each subchannel.
7. The algorithm is repeated until convergence.

Spatial processing improves the SINR in each subchannel, and the amount of improvement depends on the channel response, the spatial signature, and the interference in each subchannel. Since some subchannels benefit more from the spatial processing, power control per subchannel saves more power compared to a system where power control is performed per user.

Table 3.1: The COST207 Typical Urban 6-ray power delay profile.

Path ($l$)                      0      1      2      3      4      5
Delay ($\tau_l$), $\mu$s        0.0    0.2    0.5    1.6    2.3    5.0
Power ($\sigma_l^2$)            0.189  0.379  0.239  0.095  0.061  0.037

3.7 Performance Results

Simulation Setup

We use a wireless network consisting of 36 base stations placed in a hexagonal pattern. We assume that all of the base stations belong to the same co-channel set and that each cell contains one base station and one mobile. This model can be used with any multiple access scheme that distinguishes the mobiles in a cell. The formulations presented in this chapter are applicable to the cases where multiple users are assigned to each base station, by allocating different indices to the same base station associated with different mobiles. Users are randomly distributed in a cell according to a uniform distribution. We use an OFDM system with 32 subchannels for transmission. The communication channel is assumed to follow the COST207 Typical Urban 6-ray channel model, whose power delay profile is given in Table 3.1 [90]. The maximum channel delay spread is 5 $\mu$s, and so the channel coherence bandwidth is 200 kHz. Link gains are calculated by considering 2.5 dB for the variance of shadow fading and a path loss exponent of four. We assume a quasi-static channel, where the channel is assumed to be fixed over multiple OFDM symbols. Note that the subchannel link gains for each user are correlated according to (3.11).
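As an illustration of this setup, the following minimal Python sketch draws one realization of the Table 3.1 profile and evaluates the resulting subchannel gains. It assumes the usual tapped-delay-line model with independent complex Gaussian path gains and a uniform subcarrier grid over a 1 MHz band; the exact form of (3.11) is not reproduced here, and the function names are hypothetical, introduced only for this example.

```python
import cmath
import math
import random

# COST207 Typical Urban 6-ray profile, values taken from Table 3.1.
DELAYS_US = [0.0, 0.2, 0.5, 1.6, 2.3, 5.0]
POWERS = [0.189, 0.379, 0.239, 0.095, 0.061, 0.037]

N_SUBCHANNELS = 32
BANDWIDTH_HZ = 1e6  # assumed total bandwidth for this sketch


def coherence_bandwidth_hz(delays_us):
    """Rule of thumb used in the text: 1 / (maximum delay spread)."""
    return 1.0 / (max(delays_us) * 1e-6)


def draw_taps(rng):
    """Draw independent zero-mean complex Gaussian path gains.

    Each path l has average power POWERS[l] (an assumption of this sketch).
    """
    taps = []
    for p in POWERS:
        sigma = math.sqrt(p / 2.0)  # per real dimension
        taps.append(complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))
    return taps


def subchannel_gains(taps, n_sub=N_SUBCHANNELS, bandwidth_hz=BANDWIDTH_HZ):
    """Frequency response at each subcarrier: H_k = sum_l a_l exp(-j 2 pi f_k tau_l)."""
    spacing = bandwidth_hz / n_sub
    gains = []
    for k in range(n_sub):
        f_k = k * spacing
        h = sum(a * cmath.exp(-2j * math.pi * f_k * tau * 1e-6)
                for a, tau in zip(taps, DELAYS_US))
        gains.append(h)
    return gains
```

Because all subchannel gains are generated from the same six path gains, neighboring subchannels are strongly correlated, which is the property the text refers to.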
A one-tap frequency-domain equalizer is assumed at the receiver, such that the channel between the base station and its desired mobile can be estimated. Each OFDM symbol is assumed to be 1 $\mu$s long, which corresponds to a bandwidth of 1 MHz. Therefore, the subcarrier spacing is 32 kHz, which is smaller than the coherence bandwidth of the channel, and so the fading at each subchannel can be considered flat. The average power of the signal at each subchannel at each transmitter is assumed to be unity. White Gaussian noise whose power is 60 dB below the received signal power is added to the signal at the receiver. A QPSK modulator and demodulator are used at all of the subchannels in the transmitter and receiver. The desired SINR at each subchannel for uncoded systems varies over a range of $-5$ dB to 15 dB. For the systems using beamforming, a four-element antenna array is deployed at each receiver. When adaptive power control is performed, the SINRs at all of the subchannels for all of the mobiles are almost the same, and therefore it does not matter which base station we pick for the calculation of bit error rate and SINR. However, when we divide the total network power equally among all of the subchannels and all mobiles (uniform power policy), the SINRs are different. Therefore we pick several base stations based on the average SINRs of their subchannels to calculate the performance (e.g., the base stations labelled "best base", "worst base", "base 0", and "average base" in Figs. 3.6 and 3.7). The variable "total network power" appearing in some plots represents the sum of the powers of all subchannels of all mobiles.

Numerical Results

In Fig. 3.6 a single antenna configuration is used to perform the adaptive and uniform power policies, when the channel is assumed to change from one OFDM block to another but is fixed during one OFDM symbol, and the total network power in the uniform policy experiment is the same as in the adaptive one.
It is clear that the BER of all of the base stations for the adaptive case is close to that of the base station having average SINR. The BER vs. total network power is plotted in Fig. 3.7 when the channel follows a quasi-static model. In other words, in Fig. 3.7 the channel is assumed to be constant between two successive power updates, but it could vary from one to another, while in Fig. 3.6 the channel coefficients are only allowed to change between two successive power updates.

[Figure 3.6: Bit error rate vs. SINR [dB] for single antenna cases for time-varying channel between power updates. Curves: Adaptive Power (Base 0); Uniform Power (Base 0, Best Base, Avg Base, Worst Base).]

We expect that under the adaptive power policy all of the subchannels perform close to the target SINR, while under the uniform policy, because of the different link gains at different subchannels, the SINRs are expected to be different. This fact is illustrated in Fig. 3.8, where the SINRs for different subchannels of a base station using both policies are shown. This figure clearly shows that under the adaptive policy all of the subchannels perform at the same level of SINR, and therefore the error probability of the receiver improves more as the transmitted power is increased [77]. By power allocation, we force the SINRs at all of the subchannels of all mobiles to be in the vicinity of a desired value, while the total transmission power is minimized. The target SINR in this experiment is 5 dB. The decrease in the SINR of the adaptive policy for the 6th subchannel is due to the upper bound enforced on each subchannel's transmitted power ($P_{\max}$). The average SINR for the mobile chosen in the uniform policy is about 5.1 dB.

[Figure 3.7: Bit error rate vs. total network power [dBm] for single antenna cases assuming quasi-static channel. Curves: Adaptive Power; Uniform Power (Base 0, Best Base, Avg Base, Worst Base).]

Fig. 3.9 compares the three adaptive power control methods proposed in this chapter. This figure shows that by using frequency-domain beamforming, we can achieve lower total network power for the same target SINR. For example, the 10 dB threshold SINR is achieved with about 4 dB less total network power compared to the single antenna case. It is also shown that, with the channel parameters we have used, the time-domain beamforming, although not optimal, performs somewhere between the single antenna system and the system utilizing frequency-domain beamforming at each receiver. In this case, for the same target SINR the total network power is about 3 dB lower compared to the single antenna case. This amount clearly depends on the channel parameters. As expected, the frequency-domain beamforming performs better than the time-domain beamforming. In turn, we can significantly decrease complexity by using the time-domain beamforming. In all of these cases uncoded OFDM is used, and by using the adaptive power control scheme we have guaranteed the SINR at all of the subchannels to be close to the desired SINR. Since we have assumed a fixed modulation scheme at all of the subchannels, we expect to achieve similar bit error rates in all of these cases. The simulation results have confirmed our expectation.

[Figure 3.8: SINRs of different subchannels in the single antenna cases (desired SINR = 5 dB). Curves: Adaptive Policy; Uniform Policy.]

Figs. 3.11 and 3.12 compare the uncoded-OFDM system with the rate 2/3 COFDM system using the TCM represented in Fig. 3.5.
This is an Ungerboeck 8-state 8-PSK TCM encoder, whose minimum squared free distance $d_{\mathrm{free}}^2$ is equal to 4.586 (no parallel transitions) and whose asymptotic coding gain (the coding gain at high SNR) is $\gamma = 2.29$ (3.6 dB). This scheme is optimized for AWGN channels [13], and since we assume perfect knowledge of the channel at the receiver, we will use it in this simulation.

[Figure 3.9: Total network power [dBm] vs. desired SINR [dB] for adaptive power control cases. Curves: Single antenna; Freq.-Domain Beamforming; Time-Domain Beamforming.]

The generator matrix of the encoder is

\[
g_{11} = [\,0\ 1\ 0\,], \quad g_{21} = [\,0\ 0\,], \quad
g_{12} = [\,0\ 0\ 1\,], \quad g_{22} = [\,1\ 0\,], \quad
g_{13} = [\,1\ 0\ 0\,], \quad g_{23} = [\,0\ 1\,].
\]

Viterbi decoding is used at each receiver. The equivalent uncoded system uses QPSK modulation. The SINR range for comparing the COFDM systems is chosen to be 0 to 30 dB. Fig. 3.10 is used to evaluate the coding gain at different SINRs. This figure is obtained by calculating the performance of a single carrier system using the same TCM encoder and Viterbi decoder (the arrows show the coding gain at BER $= 10^{-5}$).

[Figure 3.10: Coding gain of the TCM encoder depicted in Fig. 3.5. Bit error rate vs. SNR [dB] for single carrier systems. Curves: COFDM; uncoded.]

In Fig. 3.11, the total network power of an uncoded-OFDM system is compared with that of a COFDM system with per-user power control and per-subchannel beamforming. Note that the total network power of a COFDM system with per-subchannel power control and beamforming is the same as that of the uncoded-OFDM system with power control and beamforming per subchannel. The total power for the uncoded system is lower than for the COFDM system at low or moderate SNR, but is higher at high SNR. This is compensated by the lower BER shown in Fig. 3.12, where we compare the BER vs. desired SINR for different OFDM systems. As can be seen from these curves, the COFDM system without any power control or beamforming has the lowest performance compared to the other systems. A COFDM system where the transmitted powers are equal at all subchannels but frequency-domain beamforming is performed at each subchannel performs better than a COFDM system with no power control or beamforming. The curve marked by diamonds shows the BER of a COFDM system in which per-user power control jointly with per-subchannel frequency-domain beamforming (the algorithm mentioned in Section 3.6) is performed.

[Figure 3.11: Total network power vs. desired SINR for coded and uncoded OFDM. Curves: Uncoded; Coded PC/user.]

This figure shows that if joint power control and frequency-domain beamforming is performed at each subchannel, both the uncoded and coded systems perform better than the other configurations. For low SINR environments, the uncoded system achieves lower BER, while the performance of the coded system is better for moderate and high SINR environments. As can be seen, these two curves intersect when the desired per-subchannel SINR is 7 dB. As the SINR is increased, the coded system performs better. For low BERs, the OFDM coding gain is about 3.6 dB, which is consistent with the asymptotic coding gain of the trellis depicted in Fig. 3.5. Since in both cases power control and frequency-domain beamforming are performed at the symbol level, we expect the uncoded system to perform better at low SINRs, while at moderate or high SINRs the coded system performs better.

3.8 Summary of the Chapter

We considered iterative joint power control and beamforming for wireless networks using OFDM.
Our study showed that we can force the SINR at all of the subchannels at all mobiles to be at least equal to a target value, while the total network power needed to achieve this goal is minimized. To reduce the complexity of the OFDM receivers, we performed the array processing in the time domain and provided an iterative algorithm to distribute the power among subchannels. We also proposed MMSE time-domain beamforming jointly with power control, for the cases when the angles of arrival are unknown. We observed that an uncoded-OFDM system with the proposed algorithms performs better than the simple COFDM system, a coded system with per-subchannel beamforming and equal powers across subchannels, and a COFDM system with per-user power control and per-subchannel beamforming. If the proposed algorithm is applied to COFDM, the BER is improved for moderate and high SINRs.

[Figure 3.12: Bit error rate vs. desired SINR for coded and uncoded OFDM. Curves: COFDM, No PC, No BF; COFDM, PC/user+BF/bin; COFDM, BF/bin no PC; COFDM, PC+BF/bin; Uncoded, PC+BF/bin.]

Chapter 4

MIMO-OFDM Systems with Multi-User Interference

4.1 Motivation and Previous Works

Recent information theoretic results suggest that there is significant capacity improvement for wireless communication systems using multiple antenna transmission [23]. The use of multiple antennas at both ends of a wireless link has been shown to have the potential to achieve tremendous improvements in bit rates [32,91]. This bit rate increase is obtained without the need for additional power or bandwidth. Extensive studies of the capacity of flat fading (deterministic and stochastic) multiple-input multiple-output (MIMO) channels can be found in [23,32,92-94].
Multiple antennas are used to perform beamforming [26], to gain space diversity by using the well-known space-time codes [29], or to achieve spatial multiplexing [32]. In BLAST (Bell-Labs Layered Space-Time architecture), multiple data streams are transmitted simultaneously and in the same frequency band, and can be separated using receiver signal processing because of the distinct spatial signatures at the transmit antennas. Spatial multiplexing can benefit significantly from transmit processing when the channel is known at the transmitter side in addition to the receiver side [91]. Many of these studies assume a single-user system transmitting multiple data streams. Optimal or suboptimal strategies that maximize the information theoretic sum capacity of vector multiple access channels have been studied in [95].

In a multiple access system, if the data rates are constant for different users, normally there is a trade-off between the throughput of the overall network and the spectral efficiency of one particular user. The reason is the interference each user might cause at other co-channel receivers. In systems where the data rate is homogeneous, power control can be used, while in non-homogeneous systems data rate adaptation is necessary for increasing the system spectral efficiency [96].

Transmit and receive beamforming have also been studied for multiuser systems when each user is transmitting a single data stream [97,98], by maximizing the Signal to Interference and Noise Ratio (SINR). The authors in [99] have proposed an algorithm that performs optimal transmit and receive processing by minimizing the system-wide Mean Square Error (MSE). So far, most of the research in this context has considered a narrowband flat fading channel. However, Orthogonal Frequency Division Multiplexing (OFDM) has proven to be an efficient transmit and receive scheme, which is robust to the detrimental effect of multipath delays.
Different subchannels in an OFDM system experience different attenuations. How to adaptively allocate the powers and transmission rates to fully utilize the spectrum is therefore an active topic in recent research. As a matter of fact, this property of OFDM systems offers an efficient tool for performing data rate adaptation. The exploitation of multiple transmit and receive antennas (Multiple-Input, Multiple-Output, or MIMO), both in the form of beamforming [26] and space-frequency coding [29], has recently received much attention. The authors in [91,100] provide expressions for the ergodic capacity and the outage capacity of OFDM-based MIMO systems with the channel unknown at the transmitter, and study the effect of channel parameters on the capacity. In [91] the authors have used spatial multiplexing for DMT when the powers are distributed equally among all eigen-modes. Many of these schemes have considered a single transmitter and receiver. In addition to the effect of multipath fading, in a multi-user environment the interference from other transmitters plays a detrimental role in degrading the system performance.

In this chapter, we will consider MIMO-OFDM systems in a multi-user environment, where multiple data streams are transmitted from each transmitter. Unlike some previous works in this context, we will consider different power constraints at each transmitter. Two scenarios are considered in this chapter. First, we restrict ourselves to the case where the transmission power is fixed at each subcarrier of an OFDM system. This case can happen when, by performing power control, the subcarrier power is restricted to a maximum value. In the second scenario, we assume that the overall transmit power of each user is fixed. In each case, we will quantify the overall OFDM bit rate with respect to the transmit and receive weight vectors, and use iterative water-filling to find the vectors that maximize the overall OFDM bit rate.
We present an extension of BLAST, which is optimal in the sense that it maximizes the spectral efficiency of different links, assuming that the covariance of the interference and the channel conditions are fed back to the transmitter from the receiver. The transmitter sends multiple streams through the eigen-modes of the channel and interference. We will also consider rate maximization in MIMO/OFDM systems when one single stream is transmitted. We will establish a game theoretic analogy of the problem.

The rest of this chapter is organized as follows. Section 4.2 describes the system model. Section 4.3 quantifies the overall OFDM data rate with the transmit and receive weight vectors in both cases, and provides iterative algorithms to maximize this quantity. We will also present the game theoretic approach to the MIMO/OFDM problem in this section. In Section 4.4, a single-mode MIMO/OFDM system is considered. Section 4.5 presents some numerical results, and finally Section 4.6 concludes the chapter.

4.2 System Model

We consider a multi-user OFDM system consisting of $M$ cells, one base station per cell, and $N_m$ mobiles in cell number $m$. The total number of mobiles is $N$, and therefore $\sum_{m=1}^{M} N_m = N$. That is, there are $N$ co-channel link pairs for uplink and downlink transmission. Downlink communication is considered here, where $N_t$ transmit antennas are employed at each base station and each mobile user is equipped with $N_r$ receive antennas. However, the same analysis can be applied to uplink transmission. $T$ independent data streams are transmitted from each base station, where a different transmit weight vector is calculated for transmitting each stream to the corresponding mobile. Assuming that the number of OFDM subcarriers, $K$, is large enough, each subchannel is assumed to follow a flat fading quasi-static model, where the subchannel is constant over a few OFDM blocks.
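The frequency-domain subchannel link gain from base station $m$ to mobile $n$ at the $k$th subcarrier is denoted by an $N_r \times N_t$ matrix $\mathbf{H}_{mn}^k$, whose $(i,j)$th element $[\mathbf{H}_{mn}^k]_{ij}$ represents the channel link gain from transmit antenna $j$ to receive antenna $i$. Note that we will consider a general model in which all mobile receivers in a cell are supposed to receive all of the streams transmitted from the base station. In this case, a specific mobile considers the data transmitted from its base station for other mobiles as interference. However, in a more practical model, specific data streams might be intended for specific users. In those cases, the powers of the eigen-modes corresponding to those streams for those mobiles are considered to be zero.

Consider an OFDM system where the received signal at the $k$th carrier of the $i$th mobile is given by

\[ \mathbf{x}_i^k = \mathbf{W}_i^{kH} \left( \sum_{m=1}^{N} \mathbf{H}_{mi}^k \mathbf{V}_m^k \mathbf{s}_m^k + \mathbf{n}_i^k \right), \tag{4.1} \]

where $\mathbf{s}_m^k$ is the data vector of size $T$ that is intended to be transmitted to user $m$, and its covariance $\boldsymbol{\varphi}_m^k$ is defined by

\[ \boldsymbol{\varphi}_m^k = E\{\mathbf{s}_m^k (\mathbf{s}_m^k)^H\}. \tag{4.2} \]

$\mathbf{n}_i^k$ is the thermal noise vector of size $N_r$ at the $i$th receiver, whose spatial covariance matrix is diagonal with equal power $(\sigma_i^k)^2$ per antenna. Note that in (4.1) we have considered the sum of $N$ transmitted signals rather than $M$. The reason is that each base station (say base station $m$) transmits independent data streams to $N_m$ receivers, and the data aimed at each mobile is considered interference at the other receivers. The $N_r \times T$ matrix $\mathbf{W}_i^k$ and the $N_t \times T$ matrix $\mathbf{V}_m^k$ represent the receive and transmit weight vectors, respectively. If $s_{mt}^k$ is the $t$th stream transmitted from transmitter $m$ at subcarrier $k$, the transmitted power of stream $t$ from antenna $i$ at subcarrier $k$ is $E[|s_{mt}^k|^2]\,|[\mathbf{V}_m^k]_{i,t}|^2$, and therefore the total transmit power at the $k$th subcarrier is

\[
P_m^k = \sum_{t=1}^{T} \sum_{i=1}^{N_t} E\!\left[|s_{mt}^k|^2\right] \left|[\mathbf{V}_m^k]_{i,t}\right|^2
 = \sum_{i=1}^{N_t} \left[ \sum_{t=1}^{T} [\mathbf{V}_m^k]_{i,t}\, E\!\left[|s_{mt}^k|^2\right] [\mathbf{V}_m^k]_{i,t}^{*} \right]
 = \mathrm{tr}\!\left( \mathbf{V}_m^k \boldsymbol{\varphi}_m^k \mathbf{V}_m^{kH} \right). \tag{4.3}
\]

Assuming that the transmitted streams from different users are independent of each other and also independent of the noise samples, the total covariance matrix at receiver $i$ is given by

\[
\begin{aligned}
\mathbf{X}_i^k = E\{\mathbf{x}_i^k (\mathbf{x}_i^k)^H\}
 &= \mathbf{W}_i^{kH} \left[ \sum_{m=1}^{N} \mathbf{H}_{mi}^k \mathbf{V}_m^k \boldsymbol{\varphi}_m^k \mathbf{V}_m^{kH} \mathbf{H}_{mi}^{kH} + (\sigma_i^k)^2 \mathbf{I}_{N_r} \right] \mathbf{W}_i^k \\
 &= \mathbf{W}_i^{kH} \mathbf{H}_{ii}^k \mathbf{V}_i^k \boldsymbol{\varphi}_i^k \mathbf{V}_i^{kH} \mathbf{H}_{ii}^{kH} \mathbf{W}_i^k + \mathbf{W}_i^{kH} \mathbf{Q}_i^k \mathbf{W}_i^k,
\end{aligned}
\tag{4.4}
\]

where

\[ \mathbf{Q}_i^k = \sum_{m \ne i} \mathbf{H}_{mi}^k \mathbf{V}_m^k \boldsymbol{\varphi}_m^k \mathbf{V}_m^{kH} \mathbf{H}_{mi}^{kH} + (\sigma_i^k)^2 \mathbf{I}_{N_r} \tag{4.5} \]

is the covariance of the interference and noise at the $k$th subcarrier of the $i$th receiver.

4.3 Achievable Rate with Known Interference Covariance Matrix

We assume that the interference has a Gaussian distribution. This happens, for example, when the transmit signal has a Gaussian distribution, or when the number of users in the system is large, according to the Central Limit Theorem (CLT). Furthermore, we follow two scenarios. In the first one, we assume a fixed transmit power policy per subcarrier, where every user adapts its data rate with the total transmit power across its antennas held constant. This happens, for example, when we use power control and therefore each user is assigned a fixed power value per subcarrier. In the second scheme, the transmit power is fixed per user.

In the following, we will find the expressions for the achievable rate for each subcarrier at receiver $i$. For notational simplicity, we will drop the subcarrier index $k$, and reuse it wherever needed. Assuming Gaussian interference, the mutual information at subcarrier $k$ can be expressed as [101]:

\[
\begin{aligned}
I_i &= \log_2 \det\!\left[ \mathbf{W}_i^H \left( \mathbf{H}_{ii} \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H \mathbf{H}_{ii}^H + \mathbf{Q}_i \right) \mathbf{W}_i \right] - \log_2 \det\!\left[ \mathbf{W}_i^H \mathbf{Q}_i \mathbf{W}_i \right] \\
 &= \log_2 \det\!\left[ \mathbf{W}_i^H \mathbf{H}_{ii} \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H \mathbf{H}_{ii}^H \mathbf{W}_i \left( \mathbf{W}_i^H \mathbf{Q}_i \mathbf{W}_i \right)^{-1} + \mathbf{I}_T \right].
\end{aligned}
\tag{4.6}
\]

So the optimization problem for each user $i$, knowing the data covariance matrix $\boldsymbol{\varphi}_i$, is defined as finding the transmit and receive weight vectors so as to

\[ \max_{\mathbf{W}_i, \mathbf{V}_i} I_i, \quad \text{subject to} \quad \mathrm{tr}\!\left( \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H \right) \le P_i^k. \tag{4.7} \]

To this end we will use a well-known relation in linear algebra, which states that if matrix $\mathbf{A}$ is $m \times n$ and $\mathbf{B}$ is $n \times m$, then [102]

\[ \det(\mathbf{I}_m + \mathbf{A}\mathbf{B}) = \det(\mathbf{I}_n + \mathbf{B}\mathbf{A}), \tag{4.8} \]

where $\mathbf{I}_m$ and $\mathbf{I}_n$ are identity matrices. Using (4.8), we can change the mutual information to

\[ I_i = \log_2 \det\!\left[ \mathbf{H}_{ii}^H \mathbf{W}_i \left( \mathbf{W}_i^H \mathbf{Q}_i \mathbf{W}_i \right)^{-1} \mathbf{W}_i^H \mathbf{H}_{ii} \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H + \mathbf{I}_{N_t} \right]. \tag{4.9} \]

The MMSE receiver weight vectors are given by

\[ \mathbf{W}_i = \alpha\, \mathbf{Q}_i^{-1} \mathbf{H}_{ii} \mathbf{V}_i, \tag{4.10} \]

where $\alpha$ is a real constant. It is well known [103] that this choice maximizes the SINR (and therefore maximizes the mutual information). Using this expression for the receive weight vectors, we have

\[ \left( \mathbf{W}_i^H \mathbf{Q}_i \mathbf{W}_i \right)^{-1} = \alpha^{-2} \left( \mathbf{V}_i^H \mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii} \mathbf{V}_i \right)^{-1}. \tag{4.11} \]

The mutual information becomes

\[
\begin{aligned}
I_i &= \log_2 \det\!\left[ \mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii} \mathbf{V}_i \left( \mathbf{V}_i^H \mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii} \mathbf{V}_i \right)^{-1} \mathbf{V}_i^H \mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii} \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H + \mathbf{I}_{N_t} \right] \\
 &= \log_2 \det\!\left( \mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii} \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H + \mathbf{I}_{N_t} \right),
\end{aligned}
\tag{4.12}
\]

which is the capacity obtained by optimum receiver processing. Therefore the chosen MMSE receiver weight vector is optimum.

The eigenvalue decomposition of the matrix $\mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii}$ is

\[ \mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^H, \tag{4.13} \]

where the columns of the unitary matrix $\mathbf{U}$ and the diagonal elements of the diagonal matrix $\boldsymbol{\Lambda}$ are the eigenvectors and eigenvalues of the matrix $\mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii}$, respectively. Since $\mathbf{H}_{ii}^H \mathbf{Q}_i^{-1} \mathbf{H}_{ii}$ is positive semidefinite, all of these eigenvalues are non-negative. Note that the number of eigen-modes of the system is determined by the rank of this matrix, which we call $r_i$. From (4.5), the matrix $\mathbf{Q}_i$ is the sum of many independent matrices, and assuming that the number of transmitters is high enough, we can assume it is full rank (rank $N_r$). By the inequality [102]

\[ \mathrm{rank}(\mathbf{A}\mathbf{B}) \le \min\big(\mathrm{rank}(\mathbf{A}), \mathrm{rank}(\mathbf{B})\big), \tag{4.14} \]

we have

\[ r_i \le \min(N_t, N_r). \tag{4.15} \]

Using this decomposition, the mutual information becomes

\[
I_i = \log_2 \det\!\left( \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^H \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H + \mathbf{I}_{N_t} \right)
 = \log_2 \det\!\left( \boldsymbol{\Lambda}^{1/2} \mathbf{U}^H \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H \mathbf{U} \boldsymbol{\Lambda}^{1/2} + \mathbf{I}_{N_t} \right). \tag{4.16}
\]

The Hadamard inequality for an $m \times m$ positive semidefinite matrix $\mathbf{A}$ says [102]

\[ \det(\mathbf{A}) \le \prod_{i=1}^{m} a_{ii}, \tag{4.17} \]

where $a_{ii}$ is the $i$th diagonal element of matrix $\mathbf{A}$.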
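Therefore, for given diagonal entries, $\det(\mathbf{A})$ is maximized when the matrix $\mathbf{A}$ is diagonal. The mutual information $I_i$ is maximized if

\[ \mathbf{U}^H \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H \mathbf{U} = \tilde{\mathbf{P}}_i \tag{4.18} \]

is diagonal with non-negative elements $p_{i1}, p_{i2}, \ldots, p_{iN_t}$. In this case

\[ I_i = \log_2 \det\!\left[ \tilde{\mathbf{P}}_i \boldsymbol{\Lambda} + \mathbf{I}_{N_t} \right] = \sum_j \log_2 (1 + p_{ij} \lambda_j), \tag{4.19} \]

where $\lambda_j$ is the $j$th eigenvalue in $\boldsymbol{\Lambda}$. Since $\mathbf{U}$ is unitary and $\mathrm{tr}(\mathbf{A}\mathbf{B}) = \mathrm{tr}(\mathbf{B}\mathbf{A})$, we have

\[ \mathrm{tr}(\mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H) = \mathrm{tr}(\tilde{\mathbf{P}}_i) = \sum_{l=1}^{N_t} p_{il} \le P_i^k. \tag{4.20} \]

4.3.1 Constant Power per Subcarrier

If we assume that the total power of the eigen-modes at each subcarrier is fixed, our problem becomes the maximization of $I_i$ under the constraint $\mathrm{tr}(\tilde{\mathbf{P}}_i) \le P_i^k$. The answer is the well-known water-filling solution [101], which states that for non-zero $\lambda_j$'s

\[ p_{ij} = \left( \mu - \frac{1}{\lambda_j} \right)^{+}, \tag{4.21} \]

where $\mu$ is chosen in such a way that $\sum_j p_{ij} = P_i^k$, and $(x)^+ = \max(x, 0)$. If $\lambda_j = 0$, we take $p_{ij} = 0$.

To find the optimum transmit weight vectors, we need to solve $\mathbf{U}^H \mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H \mathbf{U} = \tilde{\mathbf{P}}_i$, or $\big(\mathbf{V}_i \boldsymbol{\varphi}_i^{1/2}\big)\big(\mathbf{V}_i \boldsymbol{\varphi}_i^{1/2}\big)^H = \mathbf{U} \tilde{\mathbf{P}}_i \mathbf{U}^H$, for $\mathbf{V}_i$. To this end, we consider three cases:

1. $N_t < T$: In this case
\[ \mathbf{U} \begin{bmatrix} \tilde{\mathbf{P}}_i^{1/2} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \tilde{\mathbf{P}}_i^{1/2} \\ \mathbf{0} \end{bmatrix} \mathbf{U}^H = \mathbf{U} \tilde{\mathbf{P}}_i \mathbf{U}^H. \tag{4.22} \]
So $\mathbf{V}_i \boldsymbol{\varphi}_i^{1/2} = \mathbf{U}\mathbf{B}$ with $\mathbf{B} = \begin{bmatrix} \tilde{\mathbf{P}}_i^{1/2} & \mathbf{0} \end{bmatrix}$, or $\mathbf{V}_i = \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2}$.

2. $N_t > T$: Note that $\mathrm{rank}\big(\mathbf{V}_i \boldsymbol{\varphi}_i \mathbf{V}_i^H\big) \le \min(T, N_t) = T$, and therefore $\mathrm{rank}(\tilde{\mathbf{P}}_i) \le T$. Since $\tilde{\mathbf{P}}_i$ is diagonal, without loss of generality we can assume that the first $r_i$ eigenvalues are non-zero, and therefore we can write
\[ \tilde{\mathbf{P}}_i = \begin{pmatrix} \mathbf{P}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{pmatrix}, \tag{4.23} \]
where $\mathbf{P}_1$ is a $T \times T$ diagonal matrix. In this case,
\[ \mathbf{V}_i = \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2}, \qquad \mathbf{B} = \begin{bmatrix} \mathbf{P}_1^{1/2} \\ \mathbf{0} \end{bmatrix}. \tag{4.24} \]

3. $N_t = T$: In this case $\mathbf{B} = \tilde{\mathbf{P}}_i^{1/2}$, and again $\mathbf{V}_i = \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2}$.

Note that in all cases we have $\mathbf{B}\mathbf{B}^H = \tilde{\mathbf{P}}_i$. We assume that all of the data streams are independent, and therefore the matrix $\boldsymbol{\varphi}_i$ is diagonal.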
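The water-filling rule (4.21) can be made concrete with a short numerical sketch. The Python function below is a minimal illustration, not the implementation used for the results in this chapter; the names `water_filling` and `rate_bits` are hypothetical. It finds the water level $\mu$ by discarding the weakest eigen-mode until every remaining mode receives positive power.

```python
import math


def water_filling(eigenvalues, total_power):
    """Return powers p_j = max(mu - 1/lambda_j, 0) that sum to total_power.

    Modes with a zero eigenvalue receive zero power, as in the text.
    """
    powers = [0.0] * len(eigenvalues)
    # Indices of usable modes, strongest first.
    active = sorted((j for j, lam in enumerate(eigenvalues) if lam > 0),
                    key=lambda j: eigenvalues[j], reverse=True)
    while active:
        # Water level that spends exactly total_power on the active modes.
        mu = (total_power + sum(1.0 / eigenvalues[j] for j in active)) / len(active)
        weakest = active[-1]
        if mu - 1.0 / eigenvalues[weakest] > 0:
            for j in active:
                powers[j] = mu - 1.0 / eigenvalues[j]
            break
        active.pop()  # weakest mode would get no power: drop it and retry
    return powers


def rate_bits(eigenvalues, powers):
    """Mutual information of (4.19): sum_j log2(1 + p_j * lambda_j)."""
    return sum(math.log2(1.0 + p * lam) for p, lam in zip(powers, eigenvalues))
```

For instance, with eigenvalues 2, 1, 0.1, 0 and unit total power, the two weakest modes are switched off and the water level settles at $\mu = 1.25$, so the powers are 0.75 and 0.25 on the two strongest modes.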
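Therefore

\[
\begin{aligned}
\mathbf{x}_i^k &= \mathbf{W}_i^{kH} \Big( \mathbf{H}_{ii}^k \mathbf{V}_i^k \mathbf{s}_i^k + \underbrace{\sum_{m \ne i} \mathbf{H}_{mi}^k \mathbf{V}_m^k \mathbf{s}_m^k + \mathbf{n}_i^k}_{\mathbf{q}} \Big) \\
 &= \alpha\, \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{H}_{ii}^{kH} \mathbf{Q}_i^{-1} \mathbf{H}_{ii}^k \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2} \mathbf{s}_i^k + \alpha\, \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{H}_{ii}^{kH} \mathbf{Q}_i^{-1} \mathbf{q} \\
 &= \alpha\, \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^H \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2} \mathbf{s}_i^k + \alpha\, \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{H}_{ii}^{kH} \mathbf{Q}_i^{-1} \mathbf{q} \\
 &= \alpha\, \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \boldsymbol{\Lambda} \mathbf{B}\boldsymbol{\varphi}_i^{-1/2} \mathbf{s}_i^k + \alpha\, \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{H}_{ii}^{kH} \mathbf{Q}_i^{-1} \mathbf{q} \\
 &= \alpha\, \tilde{\boldsymbol{\Lambda}} \boldsymbol{\varphi}_i^{-1} \mathbf{s}_i^k + \alpha\, \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{H}_{ii}^{kH} \mathbf{Q}_i^{-1} \mathbf{q},
\end{aligned}
\tag{4.25}
\]

where $\tilde{\boldsymbol{\Lambda}} = \mathbf{B}^H \boldsymbol{\Lambda} \mathbf{B}$ is a diagonal matrix. This result shows that at the receiver, and at each subchannel, the multiple streams are orthogonal to each other. The receiver covariance matrix is

\[
\begin{aligned}
E\!\left[ \mathbf{x}_i^k \mathbf{x}_i^{kH} \right]
 &= \alpha^2 \tilde{\boldsymbol{\Lambda}} \boldsymbol{\varphi}_i^{-1} E[\mathbf{s}_i^k \mathbf{s}_i^{kH}] \boldsymbol{\varphi}_i^{-1} \tilde{\boldsymbol{\Lambda}} + \alpha^2 \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{H}_{ii}^{kH} \mathbf{Q}_i^{-1} E[\mathbf{q}\mathbf{q}^H] \mathbf{Q}_i^{-1} \mathbf{H}_{ii}^k \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2} \\
 &= \alpha^2 \boldsymbol{\varphi}_i^{-1} \mathbf{B}^H \boldsymbol{\Lambda} \underbrace{\mathbf{B}\mathbf{B}^H}_{\tilde{\mathbf{P}}_i} \boldsymbol{\Lambda} \mathbf{B} + \alpha^2 \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{H}_{ii}^{kH} \mathbf{Q}_i^{-1} \mathbf{H}_{ii}^k \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2} \\
 &= \alpha^2 \mathbf{B}^H \boldsymbol{\Lambda}^2 \tilde{\mathbf{P}}_i \mathbf{B} \boldsymbol{\varphi}_i^{-1} + \alpha^2 \boldsymbol{\varphi}_i^{-1/2} \mathbf{B}^H \mathbf{U}^H \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^H \mathbf{U}\mathbf{B}\boldsymbol{\varphi}_i^{-1/2} \\
 &= \alpha^2 \mathbf{B}^H \boldsymbol{\Lambda}^2 \tilde{\mathbf{P}}_i \mathbf{B} \boldsymbol{\varphi}_i^{-1} + \alpha^2 \boldsymbol{\varphi}_i^{-1} \tilde{\boldsymbol{\Lambda}}.
\end{aligned}
\tag{4.26}
\]

The first term is the power of the desired signal and the second is the power of the interference plus noise. Again we consider three cases:

1. $N_t < T$: In this case $\mathbf{B}^H \mathbf{D} \mathbf{B} = \begin{bmatrix} \mathbf{D}\tilde{\mathbf{P}}_i & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}$ for each diagonal matrix $\mathbf{D}$, and therefore
\[ E\!\left[ \mathbf{x}_i^k \mathbf{x}_i^{kH} \right] = \alpha^2 \tilde{\mathbf{P}}_i^2 \boldsymbol{\Lambda}^2 \boldsymbol{\varphi}_{i1}^{-1} + \alpha^2 \tilde{\mathbf{P}}_i \boldsymbol{\Lambda} \boldsymbol{\varphi}_{i1}^{-1}, \tag{4.27} \]
where $\boldsymbol{\varphi}_{i1}$ is the first diagonal $N_t \times N_t$ subblock of $\boldsymbol{\varphi}_i$. So the SNR for each of the first $N_t$ streams is $p_{ij}\lambda_j$.

2. $N_t > T$: In this case $\mathbf{B}^H \mathbf{D} \mathbf{B} = \mathbf{P}_1 \mathbf{D}_1$, where $\mathbf{D}_1$ is the first diagonal $T \times T$ subblock of the diagonal matrix $\mathbf{D}$. Therefore, with $\tilde{\mathbf{P}}_{i1} = \mathbf{P}_1$ and $\boldsymbol{\Lambda}_1$ the corresponding subblock of $\boldsymbol{\Lambda}$,
\[ E\!\left[ \mathbf{x}_i^k \mathbf{x}_i^{kH} \right] = \alpha^2 \tilde{\mathbf{P}}_{i1}^2 \boldsymbol{\Lambda}_1^2 \boldsymbol{\varphi}_i^{-1} + \alpha^2 \tilde{\mathbf{P}}_{i1} \boldsymbol{\Lambda}_1 \boldsymbol{\varphi}_i^{-1}. \tag{4.28} \]
So the SNR for the $j$th stream is $p_{ij}\lambda_j$.

3. $N_t = T$: In this case
\[ E\!\left[ \mathbf{x}_i^k \mathbf{x}_i^{kH} \right] = \alpha^2 \tilde{\mathbf{P}}_i^2 \boldsymbol{\Lambda}^2 \boldsymbol{\varphi}_i^{-1} + \alpha^2 \tilde{\mathbf{P}}_i \boldsymbol{\Lambda} \boldsymbol{\varphi}_i^{-1}. \tag{4.29} \]
So the SNR for the $j$th stream is $p_{ij}\lambda_j$.

4.3.2 Constant Power per User

In the previous subsection, we assumed that the total transmit power of each user at each subcarrier is a known fixed value. However, a more practical arrangement is to consider a known constant power for each user, rather than for each subcarrier. To find the optimal weight vectors in this case, we notice that the overall mutual information between the $i$th receiver and its corresponding transmitter can be obtained from (4.16) as

\[
I_i = \sum_{k=1}^{K} I_i^k = \sum_{k=1}^{K} \log_2 \det\!\Big[ (\boldsymbol{\Lambda}_i^k)^{1/2}\, \underbrace{\mathbf{U}_i^{kH} \mathbf{V}_i^k \boldsymbol{\varphi}_i^k \mathbf{V}_i^{kH} \mathbf{U}_i^k}_{\tilde{\mathbf{P}}_i^k}\, (\boldsymbol{\Lambda}_i^k)^{1/2} + \mathbf{I}_{N_t} \Big]. \tag{4.30}
\]

If $\tilde{\mathbf{P}}_i^k$ is diagonal for each $k$, then by the Hadamard inequality each term of the above sum is maximized regardless of any power constraint. Since each term is positive, the sum will also be maximized. Therefore, we want to maximize

\[
I_i = \sum_{k=1}^{K} \log_2 \prod_{t=1}^{T} \left( 1 + p_{it}^k \lambda_{it}^k \right) = \sum_{r=1}^{KT} \log_2 (1 + p_{ir} \lambda_{ir}), \tag{4.31}
\]

subject to $\sum_{r=1}^{KT} p_{ir} = P_i$, where $P_i$ is the power allocated to user $i$. The index $r$ runs over all streams and all subcarriers. For a given $r \in \{1, \ldots, KT\}$, the index of the stream is given by $j = ((r-1) \bmod T) + 1$ and the index of the subcarrier by $k = \lfloor (r-1)/T \rfloor + 1$. Again, the answer is the water-filling solution presented in (4.21). The SNRs are again $p_{ir}\lambda_{ir}$ at the corresponding subcarriers and streams.

4.3.3 Iterative Water-filling

Note that each transmitter optimizes its transmit spectrum independently of the other transmitters, but knowing the interference covariance matrix. Individual link-optimal solutions are not stable points for the network, since they are affected by any change in the spatial signatures on other links, and so they need to be re-optimized. We use these link solutions to define a system-wide algorithm, wherein at each iteration the link solution is readjusted to the new interference environment. Since each transmitter repeats the same process, the interference at the receiver changes, and the above steps should be repeated until the transmit and interference covariance matrices converge. Therefore, iterative water-filling [95] is used to find the powers at each eigen-mode at each subcarrier of each transmitter.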
Note that the authors in [95] have used iterative water-filling for a single-cell single-carrier system, while we propose this scheme for multi-cell multicarrier multiple-antenna systems. For the first scheme, in which constant power per subcarrier is considered, the iterative water-filling at subcarrier k (k = 1, ..., K) is proposed as follows:

Algorithm I [Multi-cell iterative water-filling, constant power per subcarrier]
1. For receiver n (n = 1, ..., N), set the elements of the first row of $V_n^k$ to 1, and the rest to 0. Set n = 1.
2. Find the interference from Eq. (4.5).
3. Find the eigenvalue decomposition $H_{nn}^{kH} Q_n^{k\,-1} H_{nn}^k = U_n^k \Lambda_n^k U_n^{kH}$ (as in (4.13)).
4. Find the power of each eigen-mode with nonzero eigenvalue from (4.21), where the $\lambda_j$'s are the diagonal elements of the diagonal matrix $\Lambda_n^k$. For $\lambda_j = 0$, take $p_{ij} = 0$. Create a diagonal matrix $\tilde{P}_n^k$ whose diagonal elements are the $p_{ij}$.
5. Find the transmit weight vectors from $V_n^k = U_n^k B \varphi_n^{k\,-1/2}$, where B is $\begin{bmatrix} \tilde{P}_n^{1/2} & 0 \end{bmatrix}$, or $\begin{bmatrix} P_1^{1/2} \\ 0 \end{bmatrix}$ (see (4.23)), or $\tilde{P}_n^{1/2}$, depending on the relation between $N_t$ and T.
6. Set n = n + 1 and continue from step 2 until convergence.

For the second scheme, where constant power per user is considered, the iterative water-filling is proposed as follows:

Algorithm II [Multi-cell iterative water-filling, constant power per user]
1. For receiver n (n = 1, ..., N) and all subcarriers k (k = 1, ..., K), set the elements of the first row of $V_n^k$ to 1, and the rest to 0. Set n = 1.
2. For each carrier k, find the interference from Eq. (4.5).
3. For each carrier k, find the eigenvalue decomposition $H_{nn}^{kH} Q_n^{k\,-1} H_{nn}^k = U_n^k \Lambda_n^k U_n^{kH}$ (as in (4.13)).
4. For r = 1, ..., KT, find the power of each eigen-mode with nonzero eigenvalue from (4.21), where the $\lambda_{ir}$'s are the diagonal elements of the corresponding subcarrier and stream. For $\lambda_{ir} = 0$, take $p_{ir} = 0$. For each subcarrier, create a diagonal matrix $\tilde{P}_n^k$ whose diagonal elements are the corresponding $p_{ir}$'s.
5. Find the transmit weight vectors at each subcarrier k from $V_n^k = U_n^k B \varphi_n^{k\,-1/2}$, where B is as introduced in Algorithm I.
6. Set n = n + 1 and continue from step 2 until convergence.

For each iterative water-filling scheme, the choice of receive weight vectors at a receiver has no effect on the amount of interference at other users. Therefore, the receive weight vectors are evaluated using Eq. (4.10) after the iterative algorithms have converged. Note that the above iteration is independent of the receiver processing. We saw that a receiver can achieve the maximum achievable rate by applying the optimum processing (the MMSE receive weight vector (4.10)). In the simulation results for the multi-cell system, we avoid the receiver processing by using the above capacity as the achievable rate.

4.3.4 Game Theoretic approach for rate maximization

As described in Section 4.3, each user tries to maximize its mutual information, or maximum achievable rate, by allocating powers to different streams at different subcarriers of an OFDM system. However, the choice of transmit weight vectors by user i affects the interference at other users, and therefore no centralized solution is available for this problem unless we define a different utility function, such as the total rate of all users constrained by some fairness conditions. That problem might be very difficult to approach, and therefore we proposed the distributed iterative solution. The structure of the optimization problem suggests that it fits into the context of game theory [104,105]. The players of the game are the transmitters, which try to select the transmit and receive weight vectors, and the utility function to be optimized is the mutual information, or maximum achievable rate, of each user.
Therefore, by defining $W_i = \{W_i^1, W_i^2, \ldots, W_i^K\}$ and $V_i = \{V_i^1, V_i^2, \ldots, V_i^K\}$, the non-cooperative game for fixed power per user can be established as a zero-sum (players with opposed preferences), strictly competitive game:
\[
\max_{W_i, V_i} \Big\{ u_i = \sum_{k=1}^{K} I_i^k = \sum_{k=1}^{K} \log_2 \prod_{t=1}^{T} \big(1 + p_{it}^k \lambda_{it}^k\big) = \sum_{r=1}^{KT} \log_2 (1 + p_{ir}\lambda_{ir}) \Big\}
\quad \text{subject to} \quad \sum_{k=1}^{K} \mathrm{tr}\big( V_i^k \varphi_i^k V_i^{kH} \big) \le P_m, \tag{4.32}
\]
where $u_i$ is the utility function for user i that needs to be maximized. This game can be formulated as an extensive game in which there is perfect (or imperfect) information about the other players' moves. The strategy selected by a player in an extensive game depends on the strategies previously selected by the other users (for two users, this is called a Stackelberg game [104,105]), as opposed to strategic games (pure or mixed), in which a user has no information about the other players' strategies, or all the users choose their strategies simultaneously. The saddle point of an extensive game is different from the Nash equilibrium obtained for strategic games. However, in some specific situations (like a game between a player and nature, where the player moves first) this saddle point is the same as the Nash equilibrium. So we use the concept of Nash equilibrium:

Definition 1. Define
\[
V_{-i} = \{V_1, V_2, \ldots, V_{i-1}, V_{i+1}, \ldots, V_M\} \tag{4.33}
\]
and
\[
W_{-i} = \{W_1, W_2, \ldots, W_{i-1}, W_{i+1}, \ldots, W_N\} \tag{4.34}
\]
as the transmit and receive strategies of all players other than user i. Then the Nash Equilibrium Saddle Point (NESP) is defined as a point at which
\[
I_i(V_i, V_{-i}, W_i, W_{-i}) \ge I_i(V'_i, V_{-i}, W'_i, W_{-i}), \tag{4.35}
\]
where $V'_i$ and $W'_i$ belong to the set of admissible matrices (those that satisfy the constraints defined in (4.32)).

In other words, the NESP is a point at which, given the other users' weight vectors, none of the transmitters can increase its mutual information by modifying its own strategy alone. The maximization is performed in a distributed fashion, and we proposed iterative water-filling to approach the optimal weight vector allocations.
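The best-response character of iterative water-filling can be illustrated with a deliberately simplified Python sketch. It assumes single-antenna links, so that each eigen-mode reduces to one scalar gain per subcarrier, and it uses the standard water-filling rule for each user in turn; it is therefore not the matrix-valued Algorithm II, only the same fixed-point iteration in scalar form. The array layout `g[m][n][k]` and all function names are hypothetical.

```python
def water_fill(gains, budget):
    """Standard water-filling: p = max(level - 1/g, 0) with sum(p) = budget."""
    active = sorted((g for g in gains if g > 0), reverse=True)
    level = 0.0
    for n in range(len(active), 0, -1):
        level = (budget + sum(1.0 / g for g in active[:n])) / n
        if level > 1.0 / active[n - 1]:
            break
    return [max(level - 1.0 / g, 0.0) if g > 0 else 0.0 for g in gains]


def iterative_water_filling(g, noise, budget, max_sweeps=200, tol=1e-9):
    """Each user in turn water-fills against the interference it currently sees.

    g[m][n][k] is the power gain from transmitter m to receiver n on
    subcarrier k (scalar stand-in for the MIMO channel matrices).
    Returns the power table p[n][k] and the number of sweeps performed.
    """
    n_users, n_sub = len(g), len(g[0][0])
    p = [[budget / n_sub] * n_sub for _ in range(n_users)]
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for n in range(n_users):
            effective = []
            for k in range(n_sub):
                interference = sum(g[m][n][k] * p[m][k]
                                   for m in range(n_users) if m != n)
                effective.append(g[n][n][k] / (noise + interference))
            new_p = water_fill(effective, budget)
            change = max(change, max(abs(a - b) for a, b in zip(new_p, p[n])))
            p[n] = new_p
        if change < tol:          # no user wants to deviate any more
            return p, sweep
    return p, max_sweeps
```

When the loop stops because no user changes its allocation, the power table is a fixed point of the best-response map, which is the scalar analogue of the NESP condition in (4.35). Convergence is not guaranteed by this sketch; the sweep limit is there for that reason.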
Simulation results have shown that, starting from different initial values for the powers and transmit weight vectors, the proposed algorithm always converges to a unique solution. This fact motivates us to propose the following conjecture, which we will try to prove in the future. At the moment, the exact necessary conditions are not quite apparent.

Conjecture 4.3.1. Assuming that there exists a solution for $\mathrm{tr}\big( V_i^k \varphi_i^k V_i^{kH} \big) \le P_m$ (the set of feasible solutions is nonempty), the iterative water-filling proposed in Algorithm II converges to the unique NESP for the game defined in (4.32). The NESP is the maximum achievable rate solution for the multiuser OFDM system.

An insight into the proof comes from the fact that there exists an NESP for a non-cooperative game if the set of feasible transmit weight vectors is non-empty, convex, and compact, and if the mutual information function $I_i$ is continuous in $V_i$ and $W_i$. Note that the capacity of the channel is the infimum of the maximum achievable rates ($I_i$), where the infimum is taken over all possible channel gain matrices. The same game can be established for the case when the power per carrier is fixed.

4.3.5 Sub-optimal Solution; Same Cell Interference

As mentioned in Section 4.2, we consider a cellular system in which there are M co-channel cells, each cell containing one base station and $N_m$ mobiles, performing downlink transmission, where the base stations are the transmitters and the mobiles are the receivers. In this case the interference at each mobile can be divided into two categories. The first is the interference from the signals transmitted from the same-cell base station to the other mobiles in the cell. The second consists of the interference coming from other base stations. Because the same-cell base station is closer than the other base stations, the former of these two interferences is more significant.
Therefore, the performance degradation would not be very significant if we modify either Algorithm I or II to consider only the same-cell interference. It is worthwhile to mention that in Algorithms I and II, the weight vectors are calculated in such a way that the achievable rates are maximized at each receiver, considering all other signals as interference. In other words, the mth base station transmits a sum of $N_m T$ symbols simultaneously. Each group of T streams is intended for one user in the cell, and the weight vectors are adjusted to the channel between the base station and the desired mobile. Algorithms III and IV below outline the iterative suboptimal solution for the maximum achievable rates for fixed power per subcarrier and per user, respectively.

Algorithm III [Single-cell iterative water-filling, constant power per subcarrier]
1. For receiver n (n = 1, ..., N), set the elements of the first row of $V_n^k$ to 1, and the rest to 0. Set n = 1.
2. Assume that receiver n is in the cell determined by base station m. Define $S_m = \{1 + \sum_{m'=1}^{m-1} N_{m'}, \ldots, \sum_{m'=1}^{m} N_{m'}\}$ as the set of the indices of all mobiles in cell m. Find the interference at the kth subcarrier from the following equation, which is a modification of Eq. (4.5):
\[
Q_n^k = \sum_{\substack{i \in S_m \\ i \neq n}} H_{mi}^k V_m^k \varphi_m^k V_m^{kH} H_{mi}^{kH} + \sigma_n^{k\,2} I_{N_r}. \tag{4.36}
\]
3. Find the eigenvalue decomposition $H_{nn}^{kH} Q_n^{k\,-1} H_{nn}^k = U_n^k \Lambda_n^k U_n^{kH}$.
4. Find the power of each eigen-mode with nonzero eigenvalue from (4.21). Create a diagonal matrix $\tilde{P}_n^k$ whose diagonal elements are the $p_{ij}$.
5. Find the transmit weight vectors from $V_n^k = U_n^k B \varphi_n^{k\,-1/2}$, where B is defined appropriately.
6. Set n = n + 1 and continue from step 2 until convergence.

Algorithm IV [Single-cell iterative water-filling, constant power per user]
1. For receiver n (n = 1, ..., N) and all subcarriers k (k = 1, ..., K), set the elements of the first row of $V_n^k$ to 1, and the rest to 0. Set n = 1.
2. Find the single-cell interference from (4.36).
3. For each carrier k, find the eigenvalue decomposition $H_{nn}^{kH} Q_n^{k\,-1} H_{nn}^k = U_n^k \Lambda_n^k U_n^{kH}$ (as in (4.13)).
4. For r = 1, ..., KT, find the power of each eigen-mode with nonzero eigenvalue from (4.21). For each subcarrier, create a diagonal matrix $\tilde{P}_n^k$ whose diagonal elements are the corresponding $p_{ir}$'s.
5. Find the transmit weight vectors at each subcarrier k from $V_n^k = U_n^k B \varphi_n^{k\,-1/2}$, where B is defined appropriately.
6. Set n = n + 1 and continue from step 2 until convergence.

In each case, optimal processing is assumed at each receiver. Note that although we only consider the in-cell interference when performing the iterative algorithm, the overall interference is taken into account when evaluating the achievable rate at each receiver.

4.4 Single Stream SNR Maximization

If a single stream is transmitted from each base station to each of its corresponding mobiles, then from (4.15) the rank of the system is 1, and therefore all of the power allocated to a subcarrier is given to that stream. As a result, Algorithms I-IV are not applicable in this case. For this reason, we consider the problem of determining the transmit and receive weight vectors from a different point of view. In the previous section, we tried to maximize the maximum achievable rate at each receiver, while in this section we look at the actual achieved rate. It is well known that the throughput of the transmission link from a transmitter to the ith receiver at the kth subcarrier of an OFDM system is
\[
r_i^k = \log_2\Big(1 + \frac{\gamma_i^k}{\Gamma}\Big), \tag{4.37}
\]
where $\gamma_i^k$ is the Signal to Noise Ratio (SNR) at the kth subcarrier, and $\Gamma$ is the "SNR gap", which converts the capacity (obtained from the Shannon formula) to an achievable rate. This gap is a function of the coding and modulation scheme and of the desired Bit Error Rate (BER). Since $\log_2$ is a monotonically increasing function, maximizing the SNR is equivalent to maximizing the achieved throughput at the subcarrier.
To this end, we consider frequency-domain beamforming, where the transmitter and receiver each have a beamforming weight vector at each OFDM subchannel. The received signal at subchannel k, at the output of the nth receiver's beamformer, is given by
\[
\hat{x}_n(k) = w_n^H(k)\, x_n(k) = w_n^H(k) \sum_{m=1}^{N} \sqrt{P_m(k)}\, H_{mn}(k)\, v_m(k)\, s_m(k) + w_n^H(k)\, n_n(k),
\]
where we have considered N rather than M transmitters to reflect the fact that each base station transmits independent data streams to different receivers in its cell simultaneously, and for each receiver one transmit beamforming weight vector is calculated. Here $v_m(k)$ and $s_m(k)$ are the transmit beamforming vector and the message at the mth transmitter, $w_n(k)$ is the receive beamforming weight vector, and $n_n(k)$ is the noise vector at receiver n, all in subchannel k. The noise samples are considered to be independent with zero mean and variance $\sigma^2$. $P_m(k)$ is the transmit power of transmitter m at subchannel k. By defining $f_{mn}(k) = w_n^H(k) H_{mn}(k) v_m(k)$, the SINR at subchannel k is given by
\[
\gamma_n(k) = \frac{P_n(k)\, |w_n^H(k) H_{nn}(k) v_n(k)|^2}{\sum_{m \neq n} P_m(k)\, |w_n^H(k) H_{mn}(k) v_m(k)|^2 + \sigma^2 w_n^H(k) w_n(k)}. \tag{4.38}
\]
Let us assume that the transmit power at each subchannel is a known value, $P_m(k)\|v_m(k)\|^2 = P_0$. It is straightforward to see that the receive weight vector that maximizes the SNR can be obtained by either the MVDR approach or the MMSE approach, and is given by [26]
\[
w_n(k) = \eta_n(k)\, Q_n^{-1}(k)\, H_{nn}(k)\, v_n(k), \tag{4.39}
\]
where $\eta_n(k)$ is a constant coefficient. In the case of MVDR, the weight vector is given by
\[
w_n(k) = \frac{Q_n^{-1}(k) H_{nn}(k) v_n(k)}{v_n^H(k) H_{nn}^H(k) Q_n^{-1}(k) H_{nn}(k) v_n(k)}, \tag{4.40}
\]
where
\[
Q_n(k) = \sum_{m \neq n} \big( P_m(k) H_{mn}(k) v_m(k) v_m^H(k) H_{mn}^H(k) \big) + \sigma^2 I. \tag{4.41}
\]
In this case, $\eta_n(k)$ is equal to the inverse of the denominator of (4.40). Given this receive weight vector, the SNR is obtained from
\[
\gamma_n(k) = P_n(k)\, v_n^H(k) \big( H_{nn}^H(k) Q_n^{-1}(k) H_{nn}(k) \big) v_n(k) = P_0\, v_n'^H(k) \big( H_{nn}^H(k) Q_n^{-1}(k) H_{nn}(k) \big) v'_n(k), \tag{4.42}
\]
where $\|v'_m(k)\|^2 = 1$.
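Equations (4.38) through (4.42) can be cross-checked numerically with a small Python sketch. It assumes that the receive-side signatures $a_m = H_{mn}(k) v_m(k)$ have already been formed, so only vectors are passed around; the helper names (`solve`, `inner`, `mvdr_weights`, `sinr`) and the numbers in the usage example are illustrative assumptions, not values from the dissertation.

```python
def solve(A, b):
    """Solve A x = b for a small complex system (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def interference_covariance(interferers, powers, noise_var):
    """Q of (4.41), built from the interferer signatures a_m = H_mn v_m."""
    n = len(interferers[0])
    return [[noise_var * (i == j)
             + sum(p * a[i] * complex(a[j]).conjugate()
                   for p, a in zip(powers, interferers))
             for j in range(n)] for i in range(n)]


def mvdr_weights(desired, Q):
    """MVDR receive weights of (4.40): Q^{-1} a / (a^H Q^{-1} a)."""
    q_inv_a = solve(Q, desired)
    scale = inner(desired, q_inv_a)
    return [x / scale for x in q_inv_a]


def sinr(w, desired, p_desired, interferers, powers, noise_var):
    """SINR of (4.38) for an arbitrary receive weight vector w."""
    signal = p_desired * abs(inner(w, desired)) ** 2
    noise = noise_var * inner(w, w).real
    interference = sum(p * abs(inner(w, a)) ** 2
                       for p, a in zip(powers, interferers))
    return signal / (interference + noise)


if __name__ == "__main__":
    a_n = [1 + 0j, 1j]                 # hypothetical desired signature
    others = [[1 + 0j, 1 + 0j]]        # one hypothetical interferer
    Q = interference_covariance(others, [2.0], 0.5)
    w = mvdr_weights(a_n, Q)
    print(sinr(w, a_n, 1.5, others, [2.0], 0.5))
    print(1.5 * inner(a_n, solve(Q, a_n)).real)   # closed form (4.42)
```

The two printed values should agree up to rounding, since substituting (4.40) into (4.38) yields (4.42).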
With the constant length constraint for $v'_m(k)$, it is proven that (4.42) is maximized when $v'_m(k)$ is the principal eigenvector of the matrix $H_{nn}^H(k) Q_n^{-1}(k) H_{nn}(k)$ [82,98,106]. The principal eigenvector of an irreducible matrix A is the eigenvector corresponding to $\rho(A)$, which is the real positive eigenvalue of A with the maximum modulus (spectral radius).

If the network enforces a fixed transmit power per user, the weight vectors (transmit and receive) of different subchannels cannot be calculated independently. In this case, we have $\sum_{k=0}^{K-1} P_m(k)|v_m(k)|^2 \le P_0$. The total user power is divided among the subchannels in such a way as to maximize the overall data rate of each user. Therefore, the optimization problem in this case is
\[
\max \Big\{ r_i = \sum_{k=1}^{K} r_i^k \Big\}, \quad \text{subject to} \quad \sum_{k=1}^{K} P_m(k)|v_m(k)|^2 \le P_0, \tag{4.43}
\]
where $r_i^k$ is the data rate at subcarrier k for user i, obtained from (4.37), and $r_i$ is user i's overall rate.

Assuming that we know the SNR at each subcarrier, it is well known that under a fixed overall power, water-filling [101] achieves the maximum overall rate. However, in a multiuser environment, the power allocation among different subchannels for one user affects the SINR of other users. Therefore, the following distributed iterative algorithm is proposed for all users simultaneously:
1. For each subchannel k (k = 0, ..., K - 1), initialize the transmission power and the transmit and receive weight vectors.
2. For the kth subcarrier of user m = 1, ..., N, calculate the receive weight vector so as to maximize the achievable SINR, $\gamma_m^k$, using either the MVDR or the MMSE approach, both represented by Eq. (4.39).
3. Use the water-filling algorithm to find the power at each subcarrier from the following relation:
\[
P_m(k) = \Big( \mu_m - \frac{\Gamma}{\gamma_m(k)} \Big)^{+}, \tag{4.44}
\]
where $(\cdot)^+$ is zero when the argument is negative, $\Gamma$ is the SNR gap, and the constant $\mu_m$ is chosen such that the total transmit power is equal to $P_0$: $\sum_k P_m(k) = P_0$.
4. Fix the receive weight vectors, and calculate the transmit weight vector as the principal eigenvector of the matrix $H_{mm}^H(k) Q_m^{-1}(k) H_{mm}(k)$, where $Q_m(k)$ is the kth subcarrier interference defined as in (4.41).
5. Repeat from step 2 until convergence.

As in the case of multiple-stream MIMO/OFDM, it is possible to establish the problem in the framework of game theory. The simulation results suggest that we can propose a conjecture similar to Conjecture 4.3.1, stating that this algorithm converges to the NESP of this game.

Figure 4.1: Cellular structure and distribution of users (plot title: Distribution of users in network; axes: x, y)

4.5 Simulation

We use a wireless network consisting of M base stations placed in a hexagonal pattern. We assume that all of the base stations belong to the same co-channel set and that each cell contains one base station and $N_m$ mobiles. The cellular pattern is shown in Figure 4.1, depicted for M = 16 and $N_m$ = 2. Users are randomly distributed in a cell according to a uniform distribution. We use an OFDM system with 8 subchannels for transmission. The communication channel is assumed to follow the COST207 Typical Urban 6-ray channel model with average path delays of {0.0, 0.2, 0.5, 1.6, 2.3, 5.0} measured in μs and path fading powers of {0.189, 0.379, 0.239, 0.095, 0.061, 0.037} [90]. The maximum channel delay spread is 5 μs, and so the channel coherence bandwidth is 200 kHz. Link gains are calculated by considering 6-8 dB for the variance of shadow fading and a path loss exponent of four. We assume a quasi-static channel, where the channel is assumed to be fixed over multiple OFDM symbols.

Figure 4.2: Achievable rate CDF, for fixed power per carrier with 16 base stations, 2 mobile per cell, reuse factor of 7 and 2 streams (plot title: CDF of rates for fixed power per Carrier, M=16, Nm=1, RF=3, 2 streams; curves: Multi Cell, Single Cell, Single User; axes: Rates, CDF)
The channel frequency response can be obtained simply by taking the Fourier transform of the time-domain channel impulse response and sampling this response at the carrier frequency $m f_c$, where $f_c$ is the subcarrier separation; i.e.,
\[
H_m = \sqrt{G} \sum_{l=0}^{L-1} \alpha_{ln}\, e^{-j 2\pi m f_c \tau_{ln}}, \tag{4.45}
\]
where G captures the effect of path loss and shadow fading, which we consider to be the same for different paths. Any difference can be absorbed in the fading coefficients. L is the number of multipaths, $\tau_l$ is the delay of each path, and $\alpha_l$ is the fading of path l. Note that the subchannel link gains for each user are correlated according to (4.45).

A one-tap frequency-domain equalizer is assumed at the receiver, such that the channel between the base station and its desired mobile can be estimated. Each OFDM symbol is assumed to be 2 μs long, which corresponds to a bandwidth of 500 kHz. Therefore, the subcarrier spacing is 16 kHz, which is smaller than the coherence bandwidth of the channel, and so the fading at each subchannel can be considered flat. The average power of the signal at each subchannel at each transmitter is assumed to be unity. The white Gaussian thermal noise power at each receiver is calculated based on a noise figure of 3 dB and the receiver bandwidth, which is assumed to be 500 kHz. Each base station uses four transmit antennas and each mobile uses four receive antennas. Multiple data streams are transmitted from each base station to each mobile in its corresponding cell.

Figure 4.3: Achievable rate CDF, for fixed power per carrier with 16 base stations, 2 mobile per cell, reuse factor of 1 and 2 streams (plot title: CDF of rates for fixed power per Carrier, M=16, Nm=2l, RF=1, 2 streams; curves: Multi Cell, Single Cell, Single User; axes: Rates, CDF)

We have simulated Algorithms I, II and III.
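The subcarrier gains of (4.45) can be generated with a short Python sketch using the COST207 delay and power profile quoted above. The complex Gaussian (Rayleigh) draw for the path coefficients, the seed, and the function names are assumptions made for illustration; the text specifies only the average path powers and delays.

```python
import cmath
import math
import random

DELAYS_US = [0.0, 0.2, 0.5, 1.6, 2.3, 5.0]             # path delays in microseconds
PATH_POWERS = [0.189, 0.379, 0.239, 0.095, 0.061, 0.037]


def draw_taps(rng):
    """Draw one complex Gaussian coefficient per path with the listed average power."""
    return [complex(rng.gauss(0.0, math.sqrt(p / 2.0)),
                    rng.gauss(0.0, math.sqrt(p / 2.0)))
            for p in PATH_POWERS]


def frequency_response(path_gains, delays_s, spacing_hz, n_sub, link_gain=1.0):
    """Evaluate (4.45): H_m = sqrt(G) * sum_l alpha_l * exp(-j 2 pi m f_c tau_l)."""
    return [math.sqrt(link_gain)
            * sum(a * cmath.exp(-2j * math.pi * m * spacing_hz * tau)
                  for a, tau in zip(path_gains, delays_s))
            for m in range(n_sub)]


if __name__ == "__main__":
    rng = random.Random(0)                       # fixed seed, illustrative only
    delays_s = [d * 1e-6 for d in DELAYS_US]
    H = frequency_response(draw_taps(rng), delays_s, spacing_hz=16e3, n_sub=8)
    for m, h in enumerate(H):
        print(m, round(abs(h) ** 2, 4))          # per-subcarrier power gain
```

Because all subcarriers share the same path coefficients, neighbouring entries of `H` are correlated, which is the property noted after (4.45).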
Note that in Algorithm I, we consider multi-cell optimization, in which the interference from all base stations is considered in finding the transmit and receive weight vectors, and the power is fixed per subcarrier. Algorithm II considers the same configuration, but with fixed power per user. In this case, we needed to adapt the power at each subcarrier. In Algorithm III only the in-cell interference is considered in performing the optimization, while multi-cell interference is included at the final step to find the OFDM rate, and the power is fixed per user. Here, we have ignored the out-of-cell interference in performing the iterative optimization. Note that the same experiment has been simulated when the power per carrier is fixed (which we call Algorithm III'). For reference, we have simulated Algorithm III (and III') by only considering the thermal noise covariance in the optimization. Again, the multi-cell interference is considered for evaluating the final OFDM rate. In each case the appropriate noise and interference covariance matrix (the one used for optimization) is measured at each receiver and is fed back to the base station for transmit weight optimization.

Figure 4.4: Achievable rate CDF, for fixed power per user with 125 base stations, 1 mobile per cell, reuse factor of 3 and 2 streams (plot title: CDF of rates for fixed power per User, M=125, Nm=1, RF=3, 2 streams; curves: Multi Cell, Single Cell, Single User; axes: Rates, CDF)

Figure 4.2 shows the CDF of the achievable rate for Algorithms II, III, and noise-only optimization, with 16 base stations, 2 mobiles per cell, and 2 streams, at a reuse factor larger than 1. Figure 4.3 considers the same configuration with a reuse factor of 1. Both of these figures show that the maximum achievable rate for multi-cell optimization outperforms those of single-cell and noise-only optimization. Moreover, comparing these two figures confirms the expected result that increasing the reuse factor reduces the interference and therefore results in higher capacities. However, this improvement is more significant in the case of multi-cell optimization. The same argument is valid for fixed power per user, as seen in Figure 4.4.

Figure 4.5: Achievable rate CDF of different number of streams, for fixed power per user with 100 base stations, 2 mobile per cell, and reuse factor of 3 (plot title: CDF of rates for fixed power per User, M=100, Nm=2, RF=3, vs Num. of streams; curves: 1 stream, 2 streams, 3 streams, 4 streams)

Figure 4.5 depicts the CDF of achievable rates for 100 base stations, 2 mobiles per cell, and reuse factor 3, for different numbers of streams when the power is fixed per user. The same quantities are depicted for 16 cells with two mobiles per cell in Figure 4.6, and for 16 cells with one mobile per cell in Figure 4.7. When there is only one mobile per cell (using some multiple-access method, like TDMA or CDMA, the mobiles in a cell are considered non-interfering), as we increase the number of streams, more bandwidth can be assigned per user and the maximum achievable rate per user is increased. When we increase the number of co-channel mobiles per cell, more processing is needed to combat the effect of in-cell interference, and therefore the performance of two streams is better than that of four streams. When we increase the number of cells, it is clearly shown that multiple streams degrade the maximum achievable rate.

Figure 4.6: Achievable rate CDF of different number of streams, for fixed power per user with 16 base stations, 2 mobile per cell, and reuse factor of 3 (plot title: CDF of rates for fixed power per User, M=16, Nm=2, RF=3, vs Num. of streams; curves: 1 stream, 2 streams, 3 streams, 4 streams)
4.6 Summary

We have proposed iterative water-filling solutions for multi-user multi-cell wireless systems where multiple antennas are deployed at both transmitters and receivers. The proposed algorithm assigns multiple independent substreams to each user to increase the data rate of each user, and is performed as a distributed scheme. We have established a non-cooperative game theoretic analogy for the MIMO/OFDM problem and proposed a conjecture that the iterative algorithms proposed in this work converge to the Nash equilibrium saddle point of the game. We have also proposed an iterative algorithm that considers single-stream transmission and tries to maximize the actual transmission data rate of each user by performing transmit and receive beamforming. Through numerical analysis, we observed that if the number of co-channel mobiles per cell is increased, it is better to limit the number of streams.

Figure 4.7: Achievable rate CDF of different number of streams, for fixed power per user with 16 base stations, 1 mobile per cell, and reuse factor of 3 (plot title: CDF of rates for fixed power per User, M=16, Nm=1, RF=3, vs Num. of streams; curves: 1 stream, 2 streams, 3 streams, 4 streams)

Chapter 5
Low Peak to Average Power Ratio with modified Golay Sequences for OFDM Systems

5.1 Motivation and Previous Works

One of the major hurdles to the widespread use of OFDM is the high Peak to Average Power Ratio (PAPR) of OFDM signals [107]. A large PAPR may force the amplifier to operate in the nonlinear region, or to back off from the saturation point. These effects might reduce the mobile battery lifetime or degrade the signal quality.

Many approaches have been proposed to overcome this barrier [69,70,108-120]. In Section 2.5 we categorized these approaches into three classes and described each class briefly. We will focus more on the block coding approach in this dissertation.
Cyclic coding has been used in [110] and involves adding extra carriers in which the phase of every fourth carrier is calculated from the phases of the previous three information carriers. Lawrey and Kikkert [119] presented a technique that combines SLM and cyclic coding. They added extra carriers, referred to as Peak Reduction Carriers (PRC), whose phase and amplitude are varied to minimize the overall PAPR. Another block coding approach was proposed in [120], using codewords drawn from offsets of a linear code. The work in [121] goes further and proposes a computationally efficient algorithm which, given any code and a maximum-likelihood decoding algorithm for that code, finds good offsets. However, this method is not guaranteed to obtain a significant PAPR reduction in every case.

In one of the most recent and efficient works in this area, Golay Complementary Sequences (GCS) [122] are used to control the modulation of carrier information, resulting in OFDM signals with a PAPR of at most 2. The correlation properties of these codes have made them a suitable choice for several applications, like multi-code CDMA systems [123]. Recently, the IEEE 802.11 standard committee has adopted Complementary Code Keying (CCK) signals (which are in principle QPSK Golay complementary sequences of size 8) as the physical layer of the IEEE 802.11b Wireless LAN standard. Davis and Jedwab [118] obtained a large set of length-$2^m$ binary Golay Complementary Pairs (GCP) from certain second-order cosets of the first-order Reed-Muller (RM) codes [13]. They combined block coding schemes (with all of their properties, like efficient encoding and decoding and error correction capabilities) with the use of GCS (with their attractive power control properties). They also went one step further and found $2^h$-ary GCP from cosets of an appropriate generalization of the Reed-Muller codes.
As a result, they found good binary, quaternary, and octary OFDM codes with good error correcting capabilities, efficient encoding and decoding, and a tightly controlled PAPR. By allowing a higher PAPR, they were able to guarantee higher coding rates. Paterson [112] generalized Davis and Jedwab's results in two ways. First, he used q-ary instead of $2^h$-ary alphabets (with q even), related to the general q-ary second-order cosets of the first-order Reed-Muller codes. Secondly, instead of Golay complementary pairs, he defined the concept of Golay complementary sets of size $2^{k+1}$. He showed that any sequence lying in a Golay complementary set of size $2^{k+1}$ has PAPR at most $2^{k+1}$, and found upper and lower bounds on the PAPR of complete second-order cosets of $RM_q(1,m)$ (the first-order Reed-Muller code defined over $\mathbb{Z}_q$). Moreover, he gave an explicit, non-recursive construction of q-phase Golay complementary sets.

However, Paterson and Tarokh [115] performed a theoretical analysis for a general coding scheme having a constellation with equal-energy symbols. They found a trade-off among the PAPR, the data rate, and the minimum distance of the codebook, regardless of the coding scheme. Given the minimum distance of the code, they found a lower bound for the PAPR that increases with increasing data rate. They also showed that, in general, if the PAPR is bounded above by a constant value, the data rate decreases with decreasing minimum distance of the code. Davis and Jedwab [118] have provided several trade-offs among different codes with different PAPR and rates.

In the first part of this chapter, we look at the general problem of PAPR reduction in multicarrier systems and specifically try to overcome the limitation on PAPR reduction imposed by the coding rate [115]. Here, we relax the assumption of having an equal-energy constellation and use QAM for modulation. We define a new version of Golay Complementary Sequences to support these codes.
The scheme presented in this chapter uses a recursive procedure to build the Super Golay (SGolay) 16-QAM sequences. We generalize the recursive schemes introduced in [122] for the case of binary Golay sequences. Using QAM modulation also allows better error correcting capabilities than QPSK. Tarokh and Chong [124] designed a construction method for low-PAPR 16-QAM that uses Davis and Jedwab's construction for QPSK Golay sequences [118]. The coding rate achieved by this construction is low (about 3% for 256 subchannels). It should be pointed out that their construction does not guarantee an upper bound of 3 dB for the OFDM PAPR. If the PAPR is required to be bounded by 3 dB, their coding rate is cut in half. Tarokh and Roessing, in a related work [117], designed another construction for 16-QAM low-PAPR codes. In that work a QAM symbol is represented as a weighted sum of two QPSK symbols. The constructed sequences are not Golay, but their PAPR is bounded by 3.6 (= 5.6 dB), and the coding rate is twice the rate of the Golay QPSK codes found in [118]. For example, for N = 256 it results in a rate of $2 \times 50/(4 \times 256) = 9.7\%$ while the PAPR is guaranteed to be bounded by 3.6 (5.6 dB).

In the second part of this chapter we set an upper bound for the PAPR (3 dB) and provide some trade-offs between the coding rate and the error correction capabilities of the OFDM code, by developing the concept of cyclic Golay codes. We present a method for constructing these codes out of Golay codes. The construction method we propose in this chapter is suitable for generating the cyclic shift of any code described by means of Boolean functions. We show that if these codes are used as the codewords applied to the IFFT block in an OFDM system, the PAPR is bounded by 3 dB, and we present a trade-off between the coding rate and the minimum distance of these codes.
Note that if the signal to noise ratio in an environment is above a threshold, we might be able to tolerate low-distance codes in exchange for higher rates. Although this work does not offer a very significant rate increase over the Golay codes, it outlines a methodology to generate the cyclic shift of any code represented by Boolean algebraic functions, even if the generated code is not linear (the construction of linear cyclic codes is well known in the literature).

We will show in this chapter that cyclic Golay codes are in principle special cases of Generalized Reed-Muller codes. Decoding of the first-order RM codes is done efficiently by exploiting the fast Hadamard transform [13]. Higher-order binary RM codes are decoded using the majority logic Reed algorithm [13]. Grant et al. [125] and Davis et al. [118], in different works, proposed second-order coset decoding of first-order Generalized RM codes defined over the alphabet $\mathbb{Z}_{2^h}$. Paterson and Jones [126] presented several optimal and suboptimal decoding algorithms, with both hard and soft decision decoding, and in both the signal and coding domains, for second-order generalized RM codes. They also mentioned the extension of their reduction decoding scheme, proposed for second-order codes, to higher-order Generalized RM codes. In this chapter, we focus on both soft- and hard-decision decoding of generalized RM codes of any order, in both the complex and coding domains, from a different perspective. We provide two decoding algorithms for $RM_{2^h}(r,m)$. The first one is a generalization of the Reed algorithm for binary Reed-Muller codes. We restate this algorithm using the concept of the Karnaugh Map, or K-Map [127], and present it for $RM_{2^h}(3,4)$. The decoding steps for other sizes and orders can be obtained similarly. We also propose a recursive decoding algorithm to avoid the complexity of higher-dimensional K-Maps.
The remainder of this chapter is organized as follows. We first introduce the concept of Golay Complementary Sequences (GCS) and their impact on the PAPR of OFDM systems using equal-power constellations. Section 5.2.2 then outlines a proposed structure to achieve low PAPR for unequal-power constellations. In Section 5.2.3 a seed for our recursive procedure is introduced. Section 5.3 outlines the definition of cyclic Golay codes and their PAPR properties. In Section 5.3.1, we find construction methods for building these codes out of Davis-Jedwab Golay codes. Section 5.3.2 outlines two decoding algorithms for $\mathrm{RM}_{2^h}(r,m)$ (in particular, cyclic Golay codes), one recursive and one non-recursive. In Section 5.3.3 we present some simulation results for both parts, and finally Section 5.3.4 concludes the chapter.

5.2 Golay Complementary Sequences for equal-power constellations

Channel coding is a means to perform PAPR reduction and some error correction for OFDM systems simultaneously. If the total number of bits assigned to one OFDM symbol is $Nm$, where $N$ is the number of subchannels, then we choose an $N$-valued sequence from a codebook and this sequence is fed into the IFFT block. If $x$ is the codeword submitted to the IFFT block, the transmitted OFDM signal is

\[ s_x(t) = \sum_{i=0}^{N-1} x_i \exp[j2\pi(f_0 + i\Delta f)t], \tag{5.1} \]

where $\Delta f$ is the subcarrier separation and $f_0$ is the carrier frequency. The instantaneous envelope power of the signal is

\[ p_x(t) = |s_x(t)|^2 = \sum_{i=0}^{N-1}\sum_{u=0}^{N-1} x_i x_u^* \exp[j2\pi(i-u)\Delta f\, t]. \tag{5.2} \]

Let us define the cross-correlation of two sequences and the auto-correlation of a sequence as follows:

Definition 2. The Cross-Correlation of two $N$-valued complex codes $x$ and $y$ with displacement $-N \le l \le N-1$ is defined as

\[ C_l(x,y) = \begin{cases} \sum_{i=0}^{N-l-1} x_{i+l}\, y_i^* & \text{if } 0 \le l \le N-1, \\ \sum_{i=0}^{N+l-1} x_i\, y_{i-l}^* & \text{if } -N \le l < 0, \\ 0 & \text{otherwise.} \end{cases} \tag{5.3} \]

In other words, for non-negative $l$ we have

\[ C_l(x,y) = \sum_{i=0}^{N-l-1} x_{i+l}\, y_i^*, \tag{5.4} \]

and $C_{-l}(x,y) = (C_l(y,x))^*$.

Definition 3.
The Auto-Correlation of an $N$-valued complex code $x$ with displacement $l$ is defined as

\[ A_l(x) = C_l(x,x). \tag{5.5} \]

In other words,

\[ A_l(x) = \begin{cases} \sum_{i=0}^{N-l-1} x_{i+l}\, x_i^*, & \text{for } l \ge 0, \\ A_{-l}^*(x), & \text{for } l < 0. \end{cases} \tag{5.6} \]

Let us start from (5.2) to restate the instantaneous envelope power of the OFDM signal. When $i$ and $u$ go from $0$ to $N-1$ and we write $l = i-u$, then $l$ goes from $-(N-1)$ to $N-1$. For each $l > 0$ we set $i = u+l$ and let $u$ go from $0$ to $N-1-l$; for each $l < 0$ we set $u = i-l$ and let $i$ go from $0$ to $N-1+l$. This can easily be seen by arranging the index pairs $(i,u)$ as a matrix and summing along its diagonals. Therefore, using Definition 3, we have

\[ \begin{aligned} p_x(t) &= \sum_{u=0}^{N-1}\sum_{i=0}^{N-1} x_i x_u^* \exp[j2\pi(i-u)\Delta f\, t] \\ &= \sum_{i=0}^{N-1} x_i^* x_i + \sum_{l=1}^{N-1}\sum_{u=0}^{N-1-l} x_u^* x_{u+l}\, e^{j2\pi l\Delta f t} + \sum_{l=-(N-1)}^{-1}\sum_{i=0}^{N-1+l} x_i x_{i-l}^*\, e^{j2\pi l\Delta f t} \\ &= A_0(x) + \sum_{l=1}^{N-1} A_l(x)\, e^{j2\pi l\Delta f t} + \sum_{l=-(N-1)}^{-1} A_{-l}^*(x)\, e^{j2\pi l\Delta f t} \\ &= A_0(x) + \sum_{l=1}^{N-1} A_l(x)\, e^{j2\pi l\Delta f t} + \sum_{l=-(N-1)}^{-1} A_l(x)\, e^{j2\pi l\Delta f t} \\ &= A_0(x) + \sum_{|l|=1}^{N-1} A_l(x)\, e^{j2\pi l\Delta f t}. \end{aligned} \tag{5.7} \]

Note that the last two equalities in (5.7) can be rewritten as

\[ \begin{aligned} p_x(t) &= A_0(x) + \sum_{l=1}^{N-1} A_l(x)\, e^{j2\pi l\Delta f t} + \sum_{l=-(N-1)}^{-1} A_{-l}^*(x)\, e^{j2\pi l\Delta f t} \\ &= A_0(x) + \sum_{l=1}^{N-1} A_l(x)\, e^{j2\pi l\Delta f t} + \sum_{l=1}^{N-1} A_l^*(x)\, e^{-j2\pi l\Delta f t} \\ &= A_0(x) + 2\,\mathrm{Re}\left\{\sum_{l=1}^{N-1} A_l(x)\, e^{j2\pi l\Delta f t}\right\}. \end{aligned} \tag{5.8} \]

The average power of the OFDM signal is $A_0(x)$, which is the same as the power of the code $x$ (by Parseval's equation) and is denoted by $P_x$. The maximum possible value of the PAPR in an OFDM system is equal to the number of subchannels, $N$ (attained when all of the symbols applied to the IFFT block are equal).

Definition 4. Two $N$-valued complex sequences $x$ and $y$ are called Golay Complementary Pairs (GCP) if

\[ A_l(x) + A_l(y) = 0 \quad \forall\, l \ne 0. \]

We denote this by $x \sim y$. Each of the sequences $x$ and $y$ is called a Golay Complementary Sequence (GCS).

Assume that the sequences $x$ and $y$ are GCP and have the same power ($A_0(x) = A_0(y)$).
Then, by (5.7), $p_x(t) + p_y(t) = A_0(x) + A_0(y) = 2P_x$ for all $t$. Since the instantaneous envelope power is non-negative at all times, we can see that

\[ \mathrm{PAPR}(x) \triangleq \frac{\max_t \{p_x(t)\}}{P_x} \le 2 = 3\ \mathrm{dB}. \tag{5.9} \]

Therefore, if we choose the codes from a set of Golay sequences, the PAPR is bounded by 3 dB.

5.2.1 Construction of equal-power Golay Sequences

A Boolean function is a function $f$ from $\mathbb{Z}_2^m = \{(x_1, x_2, \ldots, x_m) \mid x_i \in \{0,1\}\}$ to $\mathbb{Z}_2$. We regard each 0-1 variable $x_i$ as itself being a Boolean function $f_i(x_1, x_2, \ldots, x_m) = x_i$ and consider the $2^m$ monomials

\[ 1,\ x_1, \ldots, x_m,\ x_1x_2,\ x_1x_3, \ldots, x_{m-1}x_m,\ \ldots,\ x_1x_2\cdots x_m. \tag{5.10} \]

Any Boolean function $f$ can be uniquely expressed as a linear combination over $\mathbb{Z}_2$ of these monomials, where the coefficient of each monomial belongs to $\mathbb{Z}_2$. We specify a sequence $\mathbf{f}$ of length $2^m$ corresponding to $f$ by listing the values taken by $f(x_1, x_2, \ldots, x_m)$ as $(x_1, x_2, \ldots, x_m)$ ranges over all its $2^m$ combinations in lexicographic order. In other words, if $(i_1, i_2, \ldots, i_m)$ is the binary representation of the integer $i$, i.e.

\[ i = \sum_{k=1}^{m} i_k 2^{k-1}, \tag{5.11} \]

the $i$th element of the sequence $\mathbf{f}$ is $f(i_1, i_2, \ldots, i_m)$. For example, with $m = 2$, the Boolean function $f(x_1, x_2)$ defines the following codeword:

\[ \mathbf{f} = \big[\, \underbrace{f(0,0)}_{0}\ \ \underbrace{f(1,0)}_{1}\ \ \underbrace{f(0,1)}_{2}\ \ \underbrace{f(1,1)}_{3} \,\big]. \]

Note that the first argument of each element of the codeword represents the least significant bit of the binary representation of the index of that element, and the last one represents the most significant bit.

We define a generalized Boolean function to be a function $f$ from $\mathbb{Z}_2^m$ to $\mathbb{Z}_{2^h}$, where $h \ge 1$. It is straightforward to show that any such function can be uniquely expressed as a linear combination over $\mathbb{Z}_{2^h}$ of the monomials (5.10), where the coefficient of each monomial belongs to $\mathbb{Z}_{2^h}$. As above, we specify a sequence $\mathbf{f}$ of length $2^m$ corresponding to the generalized Boolean function $f$. For example, for $h = 2$ and $m = 3$, with the index ordering of (5.11), we have $3x_1 = (0\ 3\ 0\ 3\ 0\ 3\ 0\ 3)$, $2x_1x_2x_3 = (0\ 0\ 0\ 0\ 0\ 0\ 0\ 2)$, and $x_1x_2 + 3x_2x_3 + 2\cdot\mathbf{1} = (2\ 2\ 2\ 3\ 2\ 2\ 1\ 2)$.
(Technically, for such expressions to be valid we must embed the range space $\mathbb{Z}_2$ of the monomials (5.10) in $\mathbb{Z}_{2^h}$.)

We assume that the elements of the codeword to be transmitted using an OFDM system are chosen from an equal-energy constellation like QPSK or 8-PSK. Therefore, without loss of generality, the elements of a codeword $y$ can be written as

\[ y_i = \exp\!\left(j\frac{2\pi}{2^h}\, a_i\right), \tag{5.12} \]

where $a_i$ is chosen from the $2^h$-ary alphabet $\mathbb{Z}_{2^h}$. As a result, constructing the sequence $a$ provides the sequence $y$. Using this notation, the main result of [118] is stated as follows:

Theorem 5.2.1. Let

\[ f \triangleq f(x_1, x_2, \ldots, x_m) = \left[\, 2^{h-1}\sum_{k=1}^{m-1} x_{\pi(k)}x_{\pi(k+1)} + \sum_{k=1}^{m} c_k x_k + c \,\right] \bmod 2^h, \tag{5.13} \]

where $\pi$ is a permutation of the symbols $\{1, 2, \ldots, m\}$, and $c, c_k \in \mathbb{Z}_{2^h}$. Then the sequences $f$ and $(f + 2^{h-1}x_{\pi(1)} + c') \bmod 2^h$ comprise a Golay pair of length $2^m$ over $\mathbb{Z}_{2^h}$, for any $c' \in \mathbb{Z}_{2^h}$.

Note that it is not the sequence $a$ with elements $a_i = f(i_1, \ldots, i_m)$, where $i$ and the vector $(i_1, \ldots, i_m)$ are related as in (5.11), that is a Golay sequence; rather, the sequence $y$ defined as in (5.12) is Golay. Now, let us consider the definition of Reed-Muller codes:

Definition 5. For $h \ge 1$ and $0 \le r \le m$, the $r$th order linear Reed-Muller code of length $2^m$ over $\mathbb{Z}_{2^h}$, $\mathrm{RM}_{2^h}(r,m)$, is defined to be generated by all monomials in the $x_i$ of degree at most $r$.

Using this definition, the authors of [118] have restated Theorem 5.2.1 as:

Theorem 5.2.2. Each of the $m!/2$ cosets of $\mathrm{RM}_{2^h}(1,m)$ having a coset representative of the form $2^{h-1}\sum_{k=1}^{m-1} x_{\pi(k)}x_{\pi(k+1)}$ comprises $2^{h(m+1)}$ Golay sequences over $\mathbb{Z}_{2^h}$ of length $2^m$, where $\pi$ is a permutation of the symbols $\{1, 2, \ldots, m\}$.

By varying the $c_k$'s and $c$ over $\mathbb{Z}_{2^h}$ in (5.13), this theorem generates $\frac{m!}{2}\, 2^{h(m+1)}$ Golay sequences of length $2^m$. So, the coding rate of the Davis-Jedwab construction is

\[ \frac{h(m+1) + \log_2(m!/2)}{h\, 2^m}. \]

Using these theorems, in Corollary 2.5 of [118], which we repeat here, Davis and Jedwab introduced a construction for building at least $2^{h(m+2)}\, m!/2$ Golay complementary pairs over $\mathbb{Z}_{2^h}$ of length $2^m$.
Corollary 5.2.3. Let $f$ be defined as in (5.13). Then any sequence in the set $A$ forms a Golay complementary pair over $\mathbb{Z}_{2^h}$ of length $2^m$ with any sequence in the set $B$, where

\[ \begin{aligned} A &= \{\, f + c,\ \ f + 2^{h-1}(x_{\pi(1)} + x_{\pi(m)}) + c \ \mid\ c \in \mathbb{Z}_{2^h} \,\}, \\ B &= \{\, f + 2^{h-1}x_{\pi(1)} + c',\ \ f + 2^{h-1}x_{\pi(m)} + c' \ \mid\ c' \in \mathbb{Z}_{2^h} \,\}. \end{aligned} \]

5.2.2 PAPR reduction for the non-equal power constellation

Given the length of the code, the minimum Euclidean distance, and the maximum PAPR, Tarokh and Paterson [115] found a lower bound for the achievable coding rate. On the other hand, given the length of the code, the coding rate, and the minimum Euclidean distance, they found a lower bound for the PAPR. The lower bound for the PAPR increases with the coding rate. The need for low PAPR, while at the same time overcoming the rate limit of equal-power codes, has motivated us to investigate non-equal power codes that achieve low PAPR. To do this we define a special case of GCSs.

Definition 6. Two $N$-valued complex sequences $x$ and $y$ are called Super Golay Complementary Pairs (SGCP) if
• they are Golay complementary pairs, and
• if $P_{av}$ is the average power of the constellation, $P_x + P_y \le 2NP_{av}$.
We denote this by $x \sim y$. Each of the sequences $x$ and $y$ is called a Super Golay Complementary Sequence (SGCS).

As a special case, it has been proved in [117] that if the 16-QAM codes are realized as a sum of two sequences chosen from an equiprobable set of QPSK codewords, then the mean envelope power of the transmitted OFDM symbol is $P_{av}$, the average power of the 16-QAM constellation. This fact can easily be generalized to our structures introduced in Section 5.2.3.

Theorem 5.2.4. The PAPR achieved by any SGCS is bounded by 3 dB.

Proof. By virtue of (5.23) and the fact that the instantaneous power of each code is always non-negative, we have

\[ \mathrm{PAPR}(x) = \frac{\max_t\{p_x(t)\}}{P_x} \le \frac{p_x(t) + p_y(t)}{NP_{av}} = \frac{P_x + P_y}{NP_{av}} \le 2 = 3\ \mathrm{dB}. \tag{5.14} \]

Next, we would like to find a construction method for SGolay codes.
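The Golay-pair property of Theorem 5.2.1 and the resulting 3 dB bound can be checked numerically for small parameters. The following is a minimal sketch, not part of the original derivation: the function names, the choice $m = 3$, $h = 2$, the permutation, and the coefficient values are illustrative assumptions, and the envelope is only sampled on a finite grid rather than maximized exactly.

```python
import cmath

def autocorr(x, l):
    """Aperiodic auto-correlation A_l(x) of Definition 3, for 0 <= l < len(x)."""
    return sum(x[i + l] * x[i].conjugate() for i in range(len(x) - l))

def dj_values(m, h, perm, c_lin, c0):
    """Values of f in (5.13); bit k-1 of the index plays the role of x_k, as in (5.11)."""
    q = 2 ** h
    out = []
    for i in range(2 ** m):
        x = [(i >> k) & 1 for k in range(m)]          # x[k-1] = x_k
        quad = sum(x[perm[k] - 1] * x[perm[k + 1] - 1] for k in range(m - 1))
        lin = sum(c * v for c, v in zip(c_lin, x))
        out.append((q // 2 * quad + lin + c0) % q)
    return out

def to_psk(a, h):
    """Map a sequence over Z_{2^h} to 2^h-PSK symbols as in (5.12)."""
    return [cmath.exp(2j * cmath.pi * v / 2 ** h) for v in a]

def sampled_papr(y, oversample=16):
    """Peak of |sum_i y_i e^{j*theta*i}|^2 over a grid of theta, divided by A_0(y)."""
    n = len(y)
    peak = 0.0
    for s in range(oversample * n):
        theta = 2 * cmath.pi * s / (oversample * n)
        value = sum(v * cmath.exp(1j * theta * i) for i, v in enumerate(y))
        peak = max(peak, abs(value) ** 2)
    return peak / autocorr(y, 0).real

# Illustrative parameters (assumptions, not taken from the text).
m, h, perm = 3, 2, (2, 1, 3)
a = dj_values(m, h, perm, (1, 0, 3), 2)
x_pi1 = [(i >> (perm[0] - 1)) & 1 for i in range(2 ** m)]
b = [(v + 2 ** (h - 1) * u + 1) % 2 ** h for v, u in zip(a, x_pi1)]   # c' = 1

x, y = to_psk(a, h), to_psk(b, h)
worst = max(abs(autocorr(x, l) + autocorr(y, l)) for l in range(1, 2 ** m))
print(worst < 1e-9, sampled_papr(x) <= 2 + 1e-9, sampled_papr(y) <= 2 + 1e-9)
```

The check works on the PSK sequence of (5.12) rather than on the $\mathbb{Z}_{2^h}$-valued sequence, since it is the former that is Golay.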
If $x$ and $y$ are two $N$-valued sequences, we denote the reverse of $x$ by $\hat{x}$, the element-wise conjugate of $x$ by $x^*$, the concatenation of $x$ and $y$ by $x|y$, and the interleaving of $x$ and $y$ by $x \# y$. Also, we denote the sequence $(x_1, -x_2, x_3, \ldots, (-1)^{N-1}x_N)$ by $x'$.

Theorem 5.2.5. The property of being super Golay complementary pairs is invariant under the following transformations:
a. Reflection w.r.t. the origin.
b. Reflection w.r.t. both axes.
c. Multiplication of one or both sequences by a complex number with magnitude 1.
d. Reflection w.r.t. the bisectors of all quadrants.
e. Rotation.

Proof. We prove each item separately:
a. Using Definition 3,
\[ A_l(-x) = A_l(x). \tag{5.15} \]
b. The reflection of $x$ w.r.t. the real axis is $x^*$. Using Definition 3,
\[ A_l(x^*) = [A_l(x)]^*. \tag{5.16} \]
The reflection of a sequence $x$ w.r.t. the imaginary axis is $-x^*$.
c. If $\alpha$ is an arbitrary complex number, then
\[ A_l(\alpha x) = |\alpha|^2 A_l(x). \tag{5.17} \]
Therefore if $x \sim y$ and $|\alpha| = 1$, then $x \sim \alpha y$ and $\alpha x \sim \alpha y$.
d. The reflection of $x$ w.r.t. the bisector of the first and third quadrants is $jx^*$. Also, the reflection of $x$ w.r.t. the bisector of the second and fourth quadrants is $-jx^*$.
e. Rotation of a sequence by the angle $\theta$ is equivalent to multiplying the sequence by $e^{j\theta}$.
Note that, in all of these cases, the power of the sequences is preserved.

Theorem 5.2.6. If $x \sim y$ then
a. $x' \sim y'$,
b. $\hat{x} \sim \hat{y}$,
c. $x \sim \hat{y}^*$.

Proof. If $x \sim y$ then
a. The $k$th member of the sequence $x'$ is $x'_k = (-1)^k x_k$; therefore, using Definition 3,
\[ A_l(x') = (-1)^l A_l(x). \tag{5.18} \]
b. The $k$th member of the sequence $\hat{x}$ is $\hat{x}_k = x_{N-1-k}$; therefore, using Definition 3,
\[ A_l(\hat{x}) = \sum_{i=0}^{N-l-1} \hat{x}_{i+l}\,\hat{x}_i^* = \sum_{i=0}^{N-l-1} x_{N-l-i-1}\, x_{N-i-1}^* = \sum_{k=0}^{N-l-1} x_{k+l}^*\, x_k = (A_l(x))^*. \tag{5.19} \]
c. The statement follows from (5.16) and (5.19).

Theorem 5.2.7. If $x \sim y$ then
a. $x|y \sim x|{-y}$,
b. $x \# y \sim x \# {-y}$.

Proof. The items are proved separately:
a. It is easy to see that
\[ A_l(x|y) = A_l(x) + A_l(y) + \sum_{i=0}^{l-1} x_{N-1-i}^*\, y_{l-1-i}, \]
where the last (cross) term changes sign when $y$ is replaced by $-y$, and therefore
\[ A_l(x|y) + A_l(x|{-y}) = 2(A_l(x) + A_l(y)) = 0. \]
b.
If $l = 2k$ then
\[ A_l(x \# y) = A_{l/2}(x) + A_{l/2}(y) = 0. \tag{5.20} \]
Using (5.15) and since $x \sim -y$, $A_l(x \# {-y}) = 0$. Therefore, $A_l(x \# y) + A_l(x \# {-y}) = 0$. If $l = 2k+1$ then, with $k = (l-1)/2$,
\[ A_l(x \# y) = \sum_{i=0}^{N-1-k} x_i^*\, y_{i+k} + \sum_{i=0}^{N-2-k} x_{i+k+1}\, y_i^*, \tag{5.21} \]
which changes sign when $y$ is replaced by $-y$, and therefore $A_l(x \# y) + A_l(x \# {-y}) = 0$.

By applying the transformations defined in Theorems 5.2.5 and 5.2.6 to the statements of Theorem 5.2.7 we can build a set of structures that create $2N$-valued super Golay pairs from $N$-valued ones. Specifically, if $x \sim y$, each with size $N$, then the following sequences are super Golay pairs ($[j]$ means that multiplying the sequence by $j$ is optional):

1) $\pm[j](x|y) \sim \pm[j](x|{-y})$
2) $\pm[j](x \# y) \sim \pm[j](x \# {-y})$
3) $\pm[j](x|y) \sim \pm[j](\hat{y}^*|{-\hat{x}^*})$
4) $\pm[j](x \# y) \sim \pm[j](\hat{y}^* \# {-\hat{x}^*})$
5) $\pm[j](x \# {-y}) \sim \pm[j](\hat{y}^* \# \hat{x}^*)$

However, because of the special structure of the 16-QAM constellation, many of these constructions yield similar sequences. For example, reversing the roles of $x$ and $y$ will not yield new pairs. If the number of $N$-valued pairs is $M$, the first structure yields $4M$ of the $2N$-valued SGolay pairs, and this is true for the second structure too. We have performed a simulation for the pairs with size 8 and obtained the same result. In general, each pair with size $N$ yields 32 pairs, each with size $2N$.

This is very similar to the Reed-Muller codes used for equal-power Golay sequences. Reed-Muller codes of degree $r+1$ and length $2^{m+1}$ can be constructed from two Reed-Muller codes of length $2^m$, one of degree $r$ and one of degree $r+1$. The exact statement of the theorem is [13]:

Theorem 5.2.8. $\mathrm{RM}(r+1, m+1) = \{\, f \,|\, f+g \ \ \forall\, f \in \mathrm{RM}(r+1, m) \text{ and } g \in \mathrm{RM}(r, m) \,\}$.

These similarities can lead us to a new definition of modified Reed-Muller codes in the context of non-equal power constellations like 16-QAM.

5.2.3 Super-Golay 16QAM pairs from QPSK pairs

In this section we look at an important question, namely how to start the recursive construction. To this end, we use the relation that Tarokh and Roessing [117] used.
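The recursive step that such a seed feeds, namely the concatenation and interleaving rules of Theorem 5.2.7, can be sanity-checked numerically. The sketch below is an illustration only: the helper names and the length-2 seed pair are assumptions chosen for the example, and it tests only the Golay complementary property, not the power condition of Definition 6.

```python
def autocorr(x, l):
    """Aperiodic auto-correlation A_l(x) of Definition 3, for 0 <= l < len(x)."""
    return sum(x[i + l] * x[i].conjugate() for i in range(len(x) - l))

def is_golay_pair(x, y, tol=1e-9):
    """True if A_l(x) + A_l(y) = 0 for every nonzero displacement l."""
    return all(abs(autocorr(x, l) + autocorr(y, l)) < tol for l in range(1, len(x)))

def negate(x):
    return [-v for v in x]

def concat_step(x, y):
    """(x, y) -> (x|y, x|-y), item a of Theorem 5.2.7."""
    return x + y, x + negate(y)

def interleave(x, y):
    """x # y = (x_0, y_0, x_1, y_1, ...)."""
    return [v for pair in zip(x, y) for v in pair]

def interleave_step(x, y):
    """(x, y) -> (x # y, x # -y), item b of Theorem 5.2.7."""
    return interleave(x, y), interleave(x, negate(y))

# Hypothetical length-2 seed: A_1(x) = j and A_1(y) = -j, so the pair is complementary.
seed = ([1 + 0j, 1j], [1 + 0j, -1j])

pairs = [seed]
for _ in range(3):                      # three doublings: length 2 -> 16
    pairs = [step(x, y) for (x, y) in pairs for step in (concat_step, interleave_step)]

print(len(pairs), len(pairs[0][0]), all(is_golay_pair(x, y) for x, y in pairs))
```

Each doubling applies both rules to every pair, so the number of pairs doubles along with the length.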
Tarokh and Roessing represented a 16-QAM symbol by a weighted sum of two QPSK symbols. We generalize their observation and find a construction that builds 16-QAM SGolay sequences from QPSK Golay sequences. Let us define the QPSK symbols as the set

\[ \mathrm{QPSK} = \left\{ \exp\!\left[ j\left(\frac{k\pi}{2} + \frac{\pi}{4}\right) \right] \ \Big|\ k \in \mathbb{Z}_{2^h} \right\}, \quad h = 2. \]

Using Definitions 2 and 3, the following theorem can be proved easily.

Theorem 5.2.9. For any two sequences $x$ and $y$ and any two complex numbers $\alpha$ and $\beta$,

\[ A_l(\alpha x + \beta y) = |\alpha|^2 A_l(x) + |\beta|^2 A_l(y) + \alpha\beta^* C_l(x,y) + \alpha^*\beta\, C_l(y,x). \]

Theorem 5.2.10. If $x$ and $y$ are $N$-valued QPSK Golay complementary pairs, and $\alpha$ and $\beta$ are two arbitrary complex numbers with $|\alpha| = |\beta|$, then each of the following pairs is a pair of 16-QAM super Golay sequences:
1. $c = \alpha(x + 2y)$ and $t = \beta(-2x + y)$
2. $c = \alpha(x - 2y)$ and $t = \beta(2x + y)$
3. $c = \alpha(x + 2jy)$ and $t = \beta(2jx + y)$
4. $c = \alpha(x - 2jy)$ and $t = \beta(-2jx + y)$

Proof. We prove the result for the third item; the rest can be proved similarly. Using Theorem 5.2.9, for each nonzero $l$,

\[ \begin{aligned} A_l(c) + A_l(t) &= |\alpha|^2 A_l(x) + 4|\alpha|^2 A_l(y) - 2j|\alpha|^2 C_l(x,y) + 2j|\alpha|^2 C_l(y,x) \\ &\quad + 4|\beta|^2 A_l(x) + |\beta|^2 A_l(y) + 2j|\beta|^2 C_l(x,y) - 2j|\beta|^2 C_l(y,x) \\ &= 5|\alpha|^2 (A_l(x) + A_l(y)) = 0. \end{aligned} \]

Therefore $c$ and $t$ are Golay complementary pairs. It is easy to see that each of these sequences is actually a 16-QAM sequence, where the average power of the constellation is $P_{av} = 5|\alpha|^2$. If we denote the Hermitian transpose of $x$ by $x^H$, and consider the fact that the power of both $x$ and $y$ is $N$, then

\[ \begin{aligned} P_c + P_t &= |\alpha|^2 \left( \|x + 2jy\|^2 + \|2jx + y\|^2 \right) \\ &= |\alpha|^2 \left[ (\|x\|^2 + 4\|y\|^2 + 2j\,x^H y - 2j\,y^H x) + (\|y\|^2 + 4\|x\|^2 - 2j\,x^H y + 2j\,y^H x) \right] \\ &= 5|\alpha|^2 (\|x\|^2 + \|y\|^2) = 10N|\alpha|^2 = 2NP_{av}. \end{aligned} \]

Therefore, by Definition 6, $c$ and $t$ are super Golay sequences.

Theorem 5.2.10 suggests a starting point for the proposed recursive construction. If we limit ourselves to a 16-QAM construction with $P_{av} = 10$, then $\alpha$ and $\beta$ can be chosen from the set $\{\sqrt{2}, -\sqrt{2}, j\sqrt{2}, -j\sqrt{2}\}$, and therefore for each of the $2^{h(m+2)}\, m!/2$ Golay complementary pairs over $\mathbb{Z}_{2^h}$ of length $2^m$, there are 64 super Golay 16-QAM pairs. However, some of these pairs are repeated.
As an example, if $c \sim t$, then $-c \sim -t$, and therefore we do not need to multiply the two sequences in the first construction of Theorem 5.2.10 by $-\sqrt{2}$. Eliminating these repeated sequences, the number of 16-QAM super Golay pairs generated from each QPSK Golay pair is 16. Therefore we can build $2^{4+h(m+2)}\, m!/2$ distinct QAM super Golay pairs over $\mathbb{Z}_{2^h}$ of length $2^m$. For $m = 2$ and QPSK symbols ($h = 2$), this translates to 4096 pairs. Through exhaustive search, we have found that there are exactly 12032 super Golay pairs, and Theorem 5.2.10 builds 4096 of them.

We were able to come up with some structures that build on average 32 new $2N$-valued SGolay pairs from one $N$-valued SGolay pair. Therefore, starting from 4-valued codes, our construction is able to achieve at least a code rate of

\[ R = \frac{12 + 5\log_2(N/4)}{4N} \tag{5.22} \]

for $N$ OFDM subchannels. For 128 subchannels, this gives a code rate of $37/512 \approx 7.2\%$. Although still not satisfactory, this is about an 11% improvement over Tarokh and Chong's work [124]. The achievable code rate is about 20% below that of the equal-power Golay codes constructed by Davis and Jedwab [118]. However, because of the use of the 16-QAM constellation, the information rate achieved by these structures is twice the information rate achieved by Jedwab's construction, while the error correction properties of the code are maintained.

5.2.4 Super-Golay 64QAM pairs from QPSK pairs

The structure we proposed is for general super Golay codes, regardless of the constellation. However, we have focused on the 16-QAM constellation for the sake of simulation. This scheme can be generalized to higher-order QAM constellations, like 64-QAM, which is used in IEEE WLAN standards such as IEEE 802.11a. To find a construction method for low-PAPR 64-QAM sequences we can use the concept of Golay sets, defined as follows:

Definition 7.
The set of $N$-valued complex sequences $\{x_i \mid i = 1, \ldots, n\}$ is called a Golay set if

\[ \sum_{i=1}^{n} A_l(x_i) = 0 \quad \forall\, l \ne 0. \]

It is easy to see that if a Golay complementary set is taken from an equal-energy constellation, the PAPR of each sequence is bounded by $n$. Paterson has shown [112] that the equal-power Golay sets of size $2^{k+1}$ can be represented by certain cosets of $\mathrm{RM}_{2^h}(1,m)$ in $\mathrm{RM}_{2^h}(2,m)$. The following lemma, which can be proved easily using Definitions 2 and 3, is a generalization of Theorem 5.2.9.

Lemma 5.2.11. For any $n$ sequences $x_i$ and any $n$ complex numbers $\alpha_i$, $i = 1, \ldots, n$,

\[ A_l\!\left( \sum_{i=1}^{n} \alpha_i x_i \right) = \sum_{i=1}^{n} |\alpha_i|^2 A_l(x_i) + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \alpha_i \alpha_j^*\, C_l(x_i, x_j). \]

Using this lemma, the following theorem can be used to generate 64-QAM super Golay sets (the sum of the powers is $nNP_{av}$).

Theorem 5.2.12. If $\{x_i \mid i = 1, \ldots, 4\}$ comprises an $N$-valued QPSK Golay set, and $|\alpha_i|$ is constant for all $i = 1, \ldots, 4$, then the following is a 64-QAM super Golay set and therefore the PAPR of each element is at most 4 (6 dB):

\[ \{\, \alpha_1(4x_1 + jx_2 - jx_3 - x_4),\ \ \alpha_2(4x_2 + jx_1 - x_3 - jx_4),\ \ \alpha_3(4x_3 - jx_1 + x_2 - jx_4),\ \ \alpha_4(4x_4 + x_1 - jx_2 - jx_3) \,\}. \]

5.3 Cyclic Golay Sequences

Eq. (5.2) gives the instantaneous power of the continuous OFDM signal. This power envelope is obtained after passing the OFDM channel symbols through a low-pass filter and up-converting the signal to the carrier frequency. If we consider coded OFDM with codebook $C$, the Peak to Mean Envelope Power Ratio (PMEPR) of the code is defined as

\[ \mathrm{PMEPR}(C) = \max_{0 \le t \le T} \frac{p_x(t)}{P_{av}}, \]

and the PAPR is defined using the maximum of the real part of the OFDM signal in the time domain, taken over all admissible codewords. PAPR measures the peak of the RF signal, whereas PMEPR measures the peak of the baseband signal. The actual peak of the OFDM time-domain signal depends on the pulse shaping and the low-pass filter used after the IFFT block. It is obvious that this peak is in general different from the peak of the samples of the OFDM signal at the multiples of $\frac{1}{N\Delta f}$.
However, there is a direct relation between the peak of the continuous OFDM signal and the maximum of the discrete OFDM symbols sampled at the multiples of $\frac{1}{N\Delta f}$. The PAPR of the discrete OFDM symbols is an indication of the PMEPR of the continuous OFDM signal. By some proper time shaping, we can consider the PAPR of the discrete sequence obtained after the IFFT operation as a measure of the PAPR of the OFDM signal. In the sequel, we define the PAPR of a codeword $x$ to be

\[ \mathrm{PAPR}(x) = \frac{1}{P_x} \max_k \{p_x[k]\} = \frac{1}{P_x} \max_k \left\{ \sum_{i=0}^{N-1} \sum_{u=0}^{N-1} x[i]\, x^*[u]\, e^{j2\pi (i-u)k/N} \right\}, \tag{5.23} \]

where $P_x = \|x\|^2$ is the power of the codeword $x$. Using the definition of the auto-correlation of an $N$-valued complex sequence $x$ in Definition 3, the power of the $k$th OFDM channel symbol can be restated as

\[ p_x[k] = A_0(x) + \sum_{|l|=1}^{N-1} A_l(x)\, w_k^l, \tag{5.24} \]

where $w_k \triangleq e^{j2\pi k/N}$ is an $N$th root of unity. Note that $A_0(x)$ is actually the same as $P_x$, the power of the code $x$, and by Parseval's equation this is the same as the average power of the OFDM channel symbols. Therefore, the PAPR of the codeword $x$ is

\[ \mathrm{PAPR}(x) = \frac{1}{A_0(x)} \max_k \{p_x[k]\}. \tag{5.25} \]

For simplicity, we denote "$k \bmod n$" by "$k\,\%\,n$" in the following definition and in what comes hereafter.

Definition 8. The Cyclic Auto-Correlation of an $N$-valued complex sequence $x$, with nonzero displacement $l$, is defined as

\[ CA_l(x) = \sum_{i=0}^{N-1} x^*[i]\, x[(i+l)\,\%\,N]. \]

Using this definition, one can see that

\[ \begin{aligned} CA_l(x) &= \sum_{i=0}^{N-1} x^*[i]\, x[(i+l)\,\%\,N] \\ &= \sum_{i=0}^{N-l-1} x^*[i]\, x[i+l] + \sum_{i=N-l}^{N-1} x^*[i]\, x[i+l-N] \\ &= A_l(x) + \sum_{i'=0}^{l-1} x^*[i'+N-l]\, x[i'+N-l+l-N] \\ &= A_l(x) + \sum_{i'=0}^{N-(N-l)-1} x[i']\, x^*[i'+N-l] \\ &= A_l(x) + A_{N-l}^*(x). \end{aligned} \tag{5.26} \]

Definition 9. Two $N$-valued complex sequences $x$ and $y$ are called cyclic Golay complementary pairs if

\[ CA_l(x) + CA_l(y) = 0 \quad \forall\, l \ne 0. \]

Each of the sequences $x$ and $y$ is called a Cyclic Golay Complementary Sequence (CGCS).

Using (5.26), it is obvious that if two sequences are Golay pairs, they are cyclic Golay pairs, too.
In other words, if $G$ is the set of Golay sequences and $G_C$ is the set of cyclic Golay sequences, then

\[ G \subset G_C. \tag{5.27} \]

Theorem 5.3.1. The PAPR achieved by cyclic Golay sequences is upper bounded by 3 dB.

Proof. The power of the OFDM channel symbols in (5.24) can be restated as

\[ \begin{aligned} p_x[k] &= A_0(x) + \sum_{|l|=1}^{N-1} A_l(x)\, w_k^l \\ &= A_0(x) + \sum_{l=1}^{N-1} \left( A_l(x)\, w_k^l + A_{-l}(x)\, w_k^{-l} \right) \\ &= A_0(x) + \sum_{l=1}^{N-1} A_l(x)\, w_k^l + \sum_{l'=1}^{N-1} A_{N-l'}^*(x)\, w_k^{-(N-l')} \\ &= A_0(x) + \sum_{l=1}^{N-1} A_l(x)\, w_k^l + \sum_{l=1}^{N-1} A_{N-l}^*(x)\, w_k^l\, w_k^{-N} \\ &= A_0(x) + \sum_{l=1}^{N-1} \left( A_l(x) + A_{N-l}^*(x) \right) w_k^l \\ &= CA_0(x) + \sum_{l=1}^{N-1} CA_l(x)\, w_k^l. \end{aligned} \tag{5.28} \]

Note that, to obtain (5.28), we have used the equation $A_0(x) = CA_0(x)$, the fact that $A_{-l}(x) = A_l^*(x)$ from Eq. (5.6), and $w_k^{-N} = 1$. Considering the fact that the power of each OFDM channel symbol is always non-negative, combining (5.25) and (5.28) shows that the PAPR of each CGCS is upper bounded by 3 dB.

Theorem 5.3.1 and equation (5.27) state that the number of codewords achieving a PAPR of at most 3 dB (in the discrete domain) is larger than just the number of Golay sequences. This translates to higher coding rates.

If $x$ is an $N$-sized sequence, we denote its cyclic $l$-shift by $x^l$ ($0 \le l \le N-1$). The $k$th element of $x^l$ is

\[ x^l[k] = x[(k+l)\,\%\,N] = \begin{cases} x[k+l] & \text{if } k \le N-l-1, \\ x[k+l-N] & \text{if } k > N-l-1. \end{cases} \tag{5.29} \]

Lemma 5.3.2. The property of being cyclic Golay is preserved under any cyclic $l$-shift of a sequence with size $N$.

Proof.

\[ \begin{aligned} CA_u(x^l) &= \sum_{i=0}^{N-1} x^{l*}[i]\, x^l[(i+u)\,\%\,N] \\ &= \sum_{i=0}^{N-l-1} x^*[i+l]\, x[(i+l+u)\,\%\,N] + \sum_{i=N-l}^{N-1} x^*[i+l-N]\, x[(i+l+u)\,\%\,N] \\ &= \sum_{k=l}^{N-1} x^*[k]\, x[(k+u)\,\%\,N] + \sum_{k'=0}^{l-1} x^*[k']\, x[(k'+u+N)\,\%\,N] \\ &= \sum_{k=l}^{N-1} x^*[k]\, x[(k+u)\,\%\,N] + \sum_{k=0}^{l-1} x^*[k]\, x[(k+u)\,\%\,N] \\ &= \sum_{k=0}^{N-1} x^*[k]\, x[(k+u)\,\%\,N] = CA_u(x). \end{aligned} \tag{5.30} \]

Therefore, if two sequences are cyclic Golay sequences, their shifted versions by any displacement $l$ are cyclic Golay, too.

Theorem 5.3.1 can be generalized to non-Golay sequences in the following way:

Theorem 5.3.3.
The PAPR achieved by any cyclically shifted version of a sequence $x$ is the same as the PAPR achieved by the sequence $x$ itself.

Proof. $A_0(x)$ is the power of the sequence $x$, which is preserved under any cyclic shift. Therefore, by (5.23) and (5.30),

\[ p_{x^l}[k] = A_0(x^l) + \sum_{i=1}^{N-1} CA_i(x^l)\, w_k^i = A_0(x) + \sum_{i=1}^{N-1} CA_i(x)\, w_k^i = p_x[k]. \tag{5.31} \]

So, the PAPR of the sequence $x^l$ is the same as the PAPR of $x$.

In Section 5.2.2, we listed several transformations that, when performed on Golay pairs, yield pairs that are still GCP. These transformations are reflection with respect to the origin, with respect to both axes, and with respect to the bisectors of all quadrants, and the rotation of one or both sequences. The same was true for the concatenation and interleaving of Golay pairs, reversing each sequence in a Golay pair, and alternately multiplying the elements of each sequence by $-1$. Considering equation (5.26), we can deduce that the property of being cyclic Golay pairs is invariant under any of these transformations.

5.3.1 Construction of Cyclic Golay Codes

If we start from one of the Golay sequences with size $2^m$ over $\mathbb{Z}_{2^h}$ and make a cyclic shift, the resultant sequence is cyclic Golay. Our construction can create $2^m$ cyclic Golay sequences out of each Golay sequence by $l$-shifting the original sequence with $0 \le l \le 2^m - 1$. However, some of these newly generated sequences are also part of the original Golay sequences, which can be created by different values of the $c_k$'s and $c$ in (5.13). As a result, we need to carefully develop a structure for constructing the cyclic Golay sequences. In what follows, we design a framework for obtaining the cyclic shifts of a sequence represented by Boolean functions. To this end, we start from the field $\mathbb{Z}_2$ with addition defined modulo 2.

Cyclic Shift of Binary Codes

The $k$th element of each basis sequence, $\{x_n \mid n = 1, \ldots, m\}$, is given by

\[ x_n[k] = \begin{cases} 0 & \text{if } k\,\%\,2^n < 2^{n-1}, \\ 1 & \text{if } k\,\%\,2^n \ge 2^{n-1}. \end{cases} \tag{5.32} \]

By considering the relation $(t\,\%\,ab)\,\%\,b = t\,\%\,b$, we can represent the cyclic $l$-shift of $x_n$ ($1 \le n \le m$) as

\[ x_n^l[k] = \begin{cases} 0 & \text{if } [(k+l)\,\%\,2^m]\,\%\,2^n < 2^{n-1}, \\ 1 & \text{if } [(k+l)\,\%\,2^m]\,\%\,2^n \ge 2^{n-1} \end{cases} = \begin{cases} 0 & \text{if } (k+l)\,\%\,2^n < 2^{n-1}, \\ 1 & \text{if } (k+l)\,\%\,2^n \ge 2^{n-1}. \end{cases} \tag{5.33} \]

For example, with $m = 3$, we have

\[ \begin{aligned} x_1 &= [0\ 1\ 0\ 1\ 0\ 1\ 0\ 1], \\ x_2 &= [0\ 0\ 1\ 1\ 0\ 0\ 1\ 1], \\ x_3 &= [0\ 0\ 0\ 0\ 1\ 1\ 1\ 1], \\ x_2^3 &= [1\ 0\ 0\ 1\ 1\ 0\ 0\ 1]. \end{aligned} \tag{5.34} \]

Moreover, by $xy$ we mean the Hadamard product of two codewords $x$ and $y$. The Hadamard product of two $p \times q$ matrices $A$ and $B$ is another $p \times q$ matrix $C$ whose $ij$th element is $c_{ij} = a_{ij}b_{ij}$. The second-order monomials in the definition of Reed-Muller codes, and also in Theorem 5.2.1, are equivalent to the Hadamard product of two basis codewords. In the sequel, all additions are performed modulo 2 and all codeword products are Hadamard products, unless otherwise stated.

Lemma 5.3.4. The cyclic 1-shift of a basis codeword $x_n$, $1 \le n \le m$, is

\[ x_n^1 = x_n + \prod_{i=0}^{n-1} x_i, \]

where $x_0 = \mathbf{1}$ is the all-one codeword.

Proof. Examining (5.32), it is easy to see that

\[ \left( \prod_{i=0}^{n-1} x_i \right)[k] = \begin{cases} 1 & \text{if } k\,\%\,2^{n-1} = -1, \\ 0 & \text{if } k\,\%\,2^{n-1} \ne -1, \end{cases} \tag{5.35} \]

where the residue $-1$ stands for $2^{n-1} - 1$. If $k\,\%\,2^{n-1} = -1$, then $k = q\,2^{n-1} - 1$. Depending on whether $q$ is even or odd, we have $k\,\%\,2^n = 2^n - 1$ or $k\,\%\,2^n = 2^{n-1} - 1$. Therefore,

\[ \begin{aligned} \left( x_n + \prod_{i=0}^{n-1} x_i \right)[k] &= \begin{cases} 0 & \text{if } k\,\%\,2^n < 2^{n-1} - 1, \\ 0 & \text{if } k\,\%\,2^n = 2^{n-1} - 1, \\ 1 & \text{if } 2^{n-1} \le k\,\%\,2^n < 2^n - 1, \\ 1 & \text{if } k\,\%\,2^n = 2^n - 1 \end{cases} + \begin{cases} 0 & \text{if } k\,\%\,2^n < 2^{n-1} - 1, \\ 1 & \text{if } k\,\%\,2^n = 2^{n-1} - 1, \\ 0 & \text{if } 2^{n-1} \le k\,\%\,2^n < 2^n - 1, \\ 1 & \text{if } k\,\%\,2^n = 2^n - 1 \end{cases} \\ &= \begin{cases} 0 & \text{if } k\,\%\,2^n < 2^{n-1} - 1, \\ 1 & \text{if } k\,\%\,2^n = 2^{n-1} - 1, \\ 1 & \text{if } 2^{n-1} \le k\,\%\,2^n < 2^n - 1, \\ 0 & \text{if } k\,\%\,2^n = 2^n - 1 \end{cases} = \begin{cases} 0 & \text{if } (k+1)\,\%\,2^n < 2^{n-1}, \\ 1 & \text{if } (k+1)\,\%\,2^n \ge 2^{n-1} \end{cases} = x_n^1[k]. \end{aligned} \]

Equation (5.35) can be generalized for $s > 0$, again writing residues modulo $2^{n-1}$ as negative numbers, as

\[ \left( \prod_{i=s}^{n-1} x_i \right)[k] = \begin{cases} 1 & \text{if } -2^{s-1} \le k\,\%\,2^{n-1} < 0, \\ 0 & \text{otherwise.} \end{cases} \tag{5.36} \]

Using (5.36) and the same argument as in Lemma 5.3.4, we can generalize Lemma 5.3.4 and obtain the following result:

Lemma 5.3.5. For $n > 1$ and $0 \le k < n-1$, the cyclic $2^k$-shift of a basis codeword $x_n$ is

\[ x_n^{2^k} = x_n + \prod_{i=k+1}^{n-1} x_i. \]

Moreover, $x_n^{2^{n-1}} = 1 + x_n$, and $x_n^{2^k} = x_n$ for $k \ge n \ge 0$.

Theorem 5.3.6. The following relations hold:
• For $2^{n-1} \le l\,\%\,2^n$, the $l$-shift of $x_n$ is $x_n^l = 1 + x_n^{\,l\,\%\,2^{n-1}}$.
• For $l \ge 2^n$, the $l$-shift of $x_n$ is $x_n^l = x_n^{\,l\,\%\,2^n}$.
• If $l = 2^a + 2^b$ with $n-1 > a > b$, we have
\[ x_n^l = x_n + \left( x_{a+1} + \prod_{i=b+1}^{a+1} x_i + \prod_{i=b+1}^{a} x_i \right) y_n, \quad \text{where} \quad y_n = \begin{cases} \prod_{i=a+2}^{n-1} x_i & \text{if } a < n-2, \\ 1 & \text{if } a = n-2. \end{cases} \tag{5.37} \]

Proof. We prove each item separately.
• If $2^{n-1} \le l\,\%\,2^n$, then $l\,\%\,2^n = 2^{n-1} + l\,\%\,2^{n-1}$. Therefore,
\[ (k+l)\,\%\,2^n = (k\,\%\,2^n + 2^{n-1} + l\,\%\,2^{n-1})\,\%\,2^n = [(k + l\,\%\,2^{n-1})\,\%\,2^n + 2^{n-1}]\,\%\,2^n. \]
As a result, if $(k + l\,\%\,2^{n-1})\,\%\,2^n < 2^{n-1}$, then $(k+l)\,\%\,2^n \ge 2^{n-1}$, and vice versa. Using (5.33), the relation $x_n^l = 1 + x_n^{\,l\,\%\,2^{n-1}}$ is obtained.
• It is easy to see that $(k+l)\,\%\,2^n = (k + l\,\%\,2^n)\,\%\,2^n$. Using (5.33), the relation $x_n^l = x_n^{\,l\,\%\,2^n}$ is obtained for $2^n \le l$.
• If $l = 2^a + 2^b$ with $a > b$, we have $x_n^l = \left( x_n^{2^b} \right)^{2^a}$. Then, by virtue of Lemma 5.3.5, we have
\[ \begin{aligned} x_n^l &= \left( x_n + \prod_{i=b+1}^{n-1} x_i \right)^{2^a} = x_n + \prod_{i=a+1}^{n-1} x_i + x_{b+1}^{2^a} x_{b+2}^{2^a} \cdots x_a^{2^a}\, x_{a+1}^{2^a} \cdots x_s^{2^a} \cdots x_{n-1}^{2^a} \\ &= x_n + \prod_{i=a+1}^{n-1} x_i + x_{b+1} x_{b+2} \cdots x_a\, (1 + x_{a+1}) \cdots \left( x_s + \prod_{i=a+1}^{s-1} x_i \right) \cdots \left( x_{n-1} + \prod_{i=a+1}^{n-2} x_i \right). \end{aligned} \]
For $a = n-2$, we have
\[ x_n^l = x_n + x_{a+1} + (1 + x_{a+1}) \prod_{i=b+1}^{a} x_i = x_n + x_{a+1} + \prod_{i=b+1}^{a} x_i + \prod_{i=b+1}^{a+1} x_i. \]
If $a < n-2$, for each $s \ge a+2$ we have $\left( x_s + \prod_{j=a+1}^{s-1} x_j \right)(1 + x_{a+1}) = x_s (1 + x_{a+1})$. Therefore,
\[ \begin{aligned} x_n^l &= x_n + \prod_{i=a+1}^{n-1} x_i + \left( \prod_{i=b+1}^{a} x_i \right) \left( \prod_{i=a+2}^{n-1} x_i \right) (1 + x_{a+1}) \\ &= x_n + \left( x_{a+1} + \prod_{i=b+1}^{a} x_i + \prod_{i=b+1}^{a+1} x_i \right) \prod_{i=a+2}^{n-1} x_i. \end{aligned} \]
Note that in the last equation we have used the fact that $\left( \prod_{j=a+1}^{i-1} x_j \right)(1 + x_{a+1}) = 0$ for each $i$, since $x_{a+1}(1 + x_{a+1}) = 0$ modulo 2.

As an application of Theorem 5.3.6, for $l = 3 = 2^1 + 2^0$, we have

\[ x_n^3 = x_n + \left( \prod_{i=3}^{n-1} x_i \right) (x_1 + x_2 + x_1x_2). \]

The following corollary is a direct consequence of Theorem 5.3.6.

Corollary 5.3.7.
If $l = 2^a + b$, the Boolean function for $x_n^l$ can be found by replacing each $x_{n'}$ in $x_n^b$ by $x_{n'} + \prod_{i=a+1}^{n'-1} x_i$.

All of the cyclic shifts we have discussed up to this point have been left shifts. Now, we turn our attention to right shifts. The right cyclic shift of $x_n$ is defined as

\[ x_n^{-l}[k] = \begin{cases} 0 & \text{if } (k-l)\,\%\,2^n < 2^{n-1}, \\ 1 & \text{if } (k-l)\,\%\,2^n \ge 2^{n-1}. \end{cases} \tag{5.38} \]

Examining (5.32), one can see that, for $s > 0$,

\[ \left( \prod_{i=s}^{n-1} (1 + x_i) \right)[k] = \begin{cases} 1 & \text{if } 0 \le k\,\%\,2^{n-1} < 2^{s-1}, \\ 0 & \text{otherwise.} \end{cases} \tag{5.39} \]

Using the same arguments as in Lemmas 5.3.4 and 5.3.5, one can further prove the following lemma:

Lemma 5.3.8. For $n \ge 1$ and $0 \le k < n-1$, the right cyclic $2^k$-shift of a basis codeword $x_n$ is

\[ x_n^{-2^k} = x_n + \prod_{i=k+1}^{n-1} (1 + x_i). \]

Moreover, $x_n^{-2^{n-1}} = 1 + x_n$ and $x_n^{-2^k} = x_n$ for $k \ge n \ge 0$.

Finally, the following lemma relates the left cyclic $(2^{n-1} - l)$-shift of $x_n$ to the right cyclic $l$-shift of the basis codewords.

Lemma 5.3.9. For $1 \le l < 2^{n-1}$, the left cyclic $(2^{n-1} - l)$-shift of $x_n$ is

\[ x_n^{2^{n-1} - l} = 1 + x_n^{-l}. \]

Proof. It is easy to see that $(k + 2^{n-1} - l)\,\%\,2^n = [(k-l)\,\%\,2^n + 2^{n-1}]\,\%\,2^n$. So, if $(k + 2^{n-1} - l)\,\%\,2^n < 2^{n-1}$, then $(k-l)\,\%\,2^n \ge 2^{n-1}$, and vice versa. This means $x_n^{2^{n-1} - l} = 1 + x_n^{-l}$.

Cyclic Shift of non-Binary Codes

We have built a framework to find the Boolean function representation of a large set of cyclic shifts of basis codewords. This framework defines the cyclic shift of any codeword defined over $\mathbb{Z}_2$ by means of Boolean functions. However, if the code is defined over $\mathbb{Z}_{2^h}$, there is an ambiguity between modulo-2 addition and modulo-$2^h$ addition. This ambiguity can be avoided by using the following relation:

\[ c\,[(x + y)\,\%\,2] = (cx + cy - 2cxy)\,\%\,2^h, \tag{5.40} \]

where $c$ is a constant defined over $\mathbb{Z}_{2^h}$ and $x$ and $y$ are two binary codewords. This equation can easily be verified by examining three cases for every element of $x$ and $y$: the $k$th components of $x$ and $y$ are both 0, both are 1, or one is 0 and the other is 1.
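The case analysis behind (5.40) is small enough to enumerate exhaustively. The following sketch does so for a few values of $h$; the function names and the range of $h$ are illustrative assumptions.

```python
def lifted_xor(c, x, y, h):
    """Right-hand side of (5.40): (c*x + c*y - 2*c*x*y) mod 2^h."""
    return (c * x + c * y - 2 * c * x * y) % (2 ** h)

def identity_holds(max_h=4):
    """Check c * ((x + y) % 2) == lifted_xor(c, x, y, h) for all bits x, y and all c in Z_{2^h}."""
    for h in range(1, max_h + 1):
        for c in range(2 ** h):
            for x in (0, 1):
                for y in (0, 1):
                    # c * ((x + y) % 2) is either 0 or c, so it already lies in Z_{2^h}.
                    if c * ((x + y) % 2) != lifted_xor(c, x, y, h):
                        return False
    return True

print(identity_holds())
```

The only case in which the correction term $-2cxy$ matters is $x = y = 1$, where $cx + cy = 2c$ would otherwise survive the reduction modulo $2^h$.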
By defining $S_m = \{1, 2, \ldots, m\}$, induction can be used to generalize equation (5.40) to $m$ sequences $f_i$ ($i = 1, \ldots, m$) as

\[ c \left[ \left( \sum_{i=1}^{m} f_i \right) \%\, 2 \right] = \sum_{i=1}^{m} c f_i - \sum_{\substack{(i,j) \in S_m^2 \\ i < j}} 2c f_i f_j + \cdots + \begin{cases} (-2)^{m-1} c \prod_{i=1}^{m} f_i & \text{if } m \le h, \\ \sum_{\substack{(i_1, i_2, \ldots, i_h) \in S_m^h \\ i_1 < i_2 < \cdots < i_h}} (-2)^{h-1} c \prod_{l=1}^{h} f_{i_l} & \text{otherwise,} \end{cases} \tag{5.41} \]

where the additions and subtractions on the right-hand side of (5.41) are modulo $2^h$, and the codewords $f_i$ represent any binary codewords, not only the basis codewords.

As an example of this framework, we construct the cyclic Golay code with size 8 over $\mathbb{Z}_4$. The coset representatives of $\mathrm{RM}_4(1,3)$ that belong to $\mathrm{RM}_4(2,3)$, coming from Theorem 5.2.1, are $2(x_1x_2 + x_2x_3)$, $2(x_1x_2 + x_1x_3)$, and $2(x_1x_3 + x_2x_3)$. It is worthwhile to mention that the additions in the coset representatives in (5.13) are modulo $2^h$. Table 5.1 shows the Golay sequences that are repeated under cyclic shifts. The sequences are categorized based on the coset representative and the coefficient values from $\mathbb{Z}_4$. These values are obtained by applying the above-mentioned framework to each basis codeword.

Shifts / Coset rep.: x1x2 + x2x3 | x1x3 + x2x3 | x1x3 + x1x2
l = 1: c3 = 0 & (c2 = 1 or c2 = 3); c3 = 0 & (c2 = 0 or c2 = 2); c3 = 2 & (c2 = 0 or c2 = 2); c3 = 2 & (c2 = 1 or c2 = 3)
l = 2: c3 % 2 = 0; c3 % 2 = 1
l = 3: c3 = 0 & (c2 = 0 or c2 = 2); c3 = 0 & (c2 = 1 or c2 = 3); c3 = 2 & (c2 = 1 or c2 = 3); c3 = 2 & (c2 = 0 or c2 = 2)
l = 4: all
l = 5: c3 = 0 & (c2 = 1 or c2 = 3); c3 = 0 & (c2 = 0 or c2 = 2); c3 = 2 & (c2 = 0 or c2 = 2); c3 = 2 & (c2 = 1 or c2 = 3)
l = 6: c3 % 2 = 0; c3 % 2 = 1
l = 7: c3 = 0 & (c2 = 0 or c2 = 2); c3 = 0 & (c2 = 1 or c2 = 3); c3 = 2 & (c2 = 1 or c2 = 3); c3 = 2 & (c2 = 0 or c2 = 2)
# of non-Golay sequences: 1024 | 1024 | 1024

Table 5.1: List of repeated Golay sequences under cyclic shifts for m = 3 and h = 2
For example, when the coset representative is $2(x_1x_2 + x_2x_3)$, the 3rd cyclic shift of the Golay sequence represented in (5.13) is:

\[ \begin{aligned} & c_1 x_1^3 + c_2 x_2^3 + c_3 x_3^3 + c + 2(x_1^3 x_2^3 + x_2^3 x_3^3) \\ &\quad = c_1(1+x_1) + c_2(1+x_1+x_2) + c_3(x_3 + x_1 + x_2 + x_1x_2) + c + 2(1 + x_1 + x_2 + x_3 + x_1x_3 + x_2x_3) \\ &\quad = c_1 + c_2 + c + 2 + (2 + c_3 - c_1 - c_2)x_1 + (2 + c_3 - c_2)x_2 + (2 + c_3)x_3 \\ &\qquad + (-c_3 - 2c_2)x_1x_2 + (2 - 2c_3)x_1x_3 + (2 - 2c_3)x_2x_3 - 2c_3 x_1x_2x_3. \end{aligned} \]

The requirements for this expression to be a Golay sequence (so that the 3rd cyclic shift does not create a new codeword) are:

• $2c_3 = 0 \bmod 4$, to delete the term $x_1x_2x_3$.
• One of the three terms "$-c_3 - 2c_2$", "$2 - 2c_3$", or "$2 - 2c_3$" is equal to zero, to create a valid second order coset.
• The other two nonzero terms are equal to 2.

These requirements translate into $c_3 \,\%\, 2 = 0$. Moreover, if $c_3 = 0$ then $c_2 \,\%\, 2 = 0$, and if $c_3 = 2$ then $c_2 \,\%\, 2 = 1$. Therefore, out of the 256 Golay sequences generated by this coset, the 3rd cyclic shifts of only 64 sequences are repetitions of Golay codes. So, the 3rd cyclic shift of this coset generates 192 cyclic Golay sequences that are not Golay. The last row of Table 5.1 shows the total number of cyclic Golay sequences created by each coset representative that are not Golay. The total number of new sequences created by $2(x_1x_2 + x_2x_3)$ is 1024. Considering all 3 cosets, the cyclic shifts generate 3072 new sequences. However, some of the sequences newly generated by one coset representative can also be created by other cosets with different values of the $c_k$'s. For example, when $c_3 = 3$, all of the sequences generated by cyclic shifts of Golay codewords coincide with ones created with $c_3 = 1$, with different values for $c_1$, $c_2$, and $c$, and different coset representatives. To find this out, we need to create a table containing all of the coefficients for each coset representative and each shift, for different values of $c_3$. By deleting the identical columns, we can find the non-repeated cyclic Golay sequences. We have performed such an inspection for $m = 3$ and $h = 2$.
The result shows that 1024 of these cyclic Golay sequences are non-repeated and can be generated from Golay sequences, using the procedure presented in Table 5.2. This is the procedure that we use at the encoder to encode these sequences.

Table 5.2: List of non-repeated cyclic shifts on Golay sequences for m = 3 and h = 2.
Coeff. / Coset | x1x2 + x2x3 | x1x3 + x2x3 | x1x3 + x1x2
c3 = 0, c2 % 2 = 0 | None | l = 1, 2, 3 | l = 2, 3
c3 = 0, c2 % 2 = 1 | None | l = 2 | l = 1, 2
c3 = 1 | l = 1, 2, 3, 5, 6, 7 | None | l = 1, 3, 5, 7
c3 = 2, c2 % 2 = 0 | None | l = 2 | None
c3 = 2, c2 % 2 = 1 | None | l = 1, 2, 3 | None
c3 = 3 | None (all cosets)

The rate of size 8 Golay sequences over $Z_4$ is

\[ \frac{\log_2\left(\frac{3!}{2}\, 2^{2(3+1)}\right)}{2 \cdot 2^3} = 0.599, \]

while the rate of size 8 cyclic Golay sequences over $Z_4$ is

\[ \frac{\log_2\left(1024 + \frac{3!}{2}\, 2^{2(3+1)}\right)}{2 \cdot 2^3} = 0.690. \]

As the size of the codewords increases, more basis vectors are considered and therefore the cyclic shifts of (5.13) create higher order monomials. As a result, the number of newly generated cyclic Golay codewords increases. The same thing happens as the dimension of the field from which we pick the coefficients $c_i$ is increased. Therefore, as we move toward higher dimension fields and larger codes, the rate increase is more significant. As an example for $m = 4$, if we shift each of the codewords from the coset represented by $x_1x_2 + x_2x_3 + x_3x_4$ by one position, none of the newly generated codewords is a repeated Golay sequence. This is because the Boolean function representation of these codewords not only includes some of the 3rd order monomials, but its second order coset is also not in the form shown in (5.13). Therefore, none of these cyclic shifts can be obtained from (5.13). However, the number of newly generated cyclic Golay sequences from (5.13) cannot be more than the number of Golay sequences times the size of each codeword ($2^m$).
It is worthwhile to mention that the total number of size $2^m$ cyclic Golay sequences over $Z_{2^h}$ found by exhaustive search is greater than the number we just found using (5.13) and our framework.

The cyclic Golay code we generated is clearly non-linear. Even if we subtract the coset representatives from (5.13) to obtain a linear $RM_{2^h}(1,m)$ code and then perform the cyclic shifts, the resultant code is not linear. Therefore, this is not a linear cyclic code and we cannot define a minimal generator polynomial for generating this code. It is known that a linear cyclic code created by polynomials cannot be of size $2^m$, and our code is of size $2^m$. Massey [128] has introduced cyclic Reed-Muller codes by puncturing the first column of the generator matrix of $RM(r,m)$ and reordering the first order rows to create m-sequences. The size of the codewords in this case is $2^m - 1$. However, even with the special cosets defined in (5.13), the cyclic Reed-Muller code is no longer Golay and therefore does not have the low PAPR property.

The major drawback of the cyclic Golay code is its low Hamming and Lee distances. The Hamming weight of a codeword $x$ of size $m$ over $Z_{2^h}$, denoted by $wt_H^h(x)$, is equal to the number of nonzero components of $x$, while the Lee weight of $x$ over $Z_{2^h}$, denoted by $wt_L^h(x)$, is defined to be $\sum_{j=0}^{m-1} \min\{x_j, 2^h - x_j\}$ (the sum is an ordinary integer sum, not modulo $2^h$). The Hamming distance of two codewords $x$ and $y$ is defined as $d_H^h(x,y) = wt_H^h(x - y \bmod 2^h)$, and the Lee distance is $d_L^h(x,y) = wt_L^h(x - y \bmod 2^h)$. The Hamming distance measures the number of positions in which $x$ and $y$ differ, whereas the Lee distance takes into account the magnitude of the difference over $Z_{2^h}$ at each position. The two coincide in the binary case, $h = 1$. For example, the Hamming distance between the pair of codewords $(5,7,0,1)$ and $(3,7,7,6)$ over $Z_{2^3}$ is 3, while their Lee distance is $2 + 0 + 1 + 3 = 6$.
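The two distance definitions above can be written directly as code. The following is a minimal sketch that reproduces the worked example over $Z_{2^3}$; the function names are illustrative and not taken from the text.

```python
def hamming_weight(x, h):
    """Number of components of x that are nonzero modulo 2**h."""
    q = 2**h
    return sum(1 for v in x if v % q != 0)


def lee_weight(x, h):
    """Sum of min(x_j, 2**h - x_j), taken as an ordinary integer sum."""
    q = 2**h
    return sum(min(v % q, q - (v % q)) for v in x)


def hamming_distance(x, y, h):
    """Hamming weight of the component-wise difference x - y mod 2**h."""
    q = 2**h
    return hamming_weight([(a - b) % q for a, b in zip(x, y)], h)


def lee_distance(x, y, h):
    """Lee weight of the component-wise difference x - y mod 2**h."""
    q = 2**h
    return lee_weight([(a - b) % q for a, b in zip(x, y)], h)


if __name__ == "__main__":
    x, y = (5, 7, 0, 1), (3, 7, 7, 6)
    print(hamming_distance(x, y, 3))  # 3, as in the example above
    print(lee_distance(x, y, 3))      # 2 + 0 + 1 + 3 = 6
```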
The minimum Hamming distance of a code $C$, taken over all distinct pairs of codewords in $Z_{2^h}$, is defined to be $d_H^h(C) = \min_{x,y \in C,\; x \ne y} \{ d_H^h(x,y) \}$, and similarly, the minimum Lee distance of the code is $d_L^h(C) = \min_{x,y \in C,\; x \ne y} \{ d_L^h(x,y) \}$. The minimum Hamming or Lee distance of a code is a measure of its error correction capability; that is, if the minimum Hamming or Lee distance is $d$, then we can always correct errors of (Hamming or Lee) weight less than $d/2$. If the transmission channel renders all $H-1$ possible errors for a given codeword position equally likely, then the traditional Hamming distance metric is suitable. However, if errors involving a transition between adjacent values in $Z_{2^h}$ are much more likely than other errors in a given position, then the Lee distance metric is more appropriate [129]. Both measures are appropriate metrics for evaluating the error correction capability for OFDM transmission. It is proved in [13,118] that the Hamming and Lee distances of $RM_{2^h}(r,m)$ are both $2^{m-r}$. Therefore, the Hamming and Lee distances of the code defined in (5.13) are both $2^{m-2}$. The cyclic Golay code defined in this section is a subset of $RM_{2^h}(m,m)$, and therefore in general has a low distance. However, by reducing the coding rate, we can increase the Hamming and Lee distances while maintaining the same upper bound for the PAPR. This trade-off can be explained using an example. Take $m = 4$ and $h = 2$. If we start from (5.13) with the coset representative $x_1x_2 + x_2x_3 + x_3x_4$, the Boolean function representation of the cyclic 1-shift of each codeword is:

\[ \begin{aligned} & c + c_1 + (c_2 - c_1)x_1 + (2 + c_2)x_2 + c_3x_3 + c_4x_4 + (-2c_2 + c_3 + 2)x_1x_2 \\ &\quad + 2x_3(x_1 + x_2 + x_4) + (-2c_3 + c_4)x_1x_2x_3 + 2x_1x_2x_4 - 2c_4 x_1x_2x_3x_4. \end{aligned} \]

It is observed from this expression that if we avoid the cyclic 1-shift of the codewords for $c_4 \,\%\, 2 = 1$, the resulting code is in $RM_4(3,4)$ instead of $RM_4(4,4)$.
Moreover, it can be easily checked that by applying some restriction on $c_3$, all of the codewords can be considered either as a subset of $RM_4(2,4)$ or as a coset of $RM_4(2,4)$ with coset representative $2x_1x_2x_4$.

5.3.2 Maximum-likelihood Decoding of $RM_{2^h}(r,m)$

We assume that the cyclic Golay code is constructed to be a subset of $RM_{2^h}(r,m)$. Our goal in this section is to find a decoding scheme for these codes. To this end, we provide two decoding algorithms for $RM_{2^h}(r,m)$. The first one is a generalization of the Reed algorithm, which is stated in [13] for binary Reed-Muller codes. We will restate this algorithm using the concept of the Karnaugh Map, or K-Map [127], for generalized non-binary Reed-Muller codes. In this chapter, we will focus particularly on $RM_{2^h}(3,4)$. The decoding steps for other sizes and orders can be obtained in the same way. The second algorithm is a recursive algorithm based upon the first scheme and other existing maximum-likelihood decoding schemes for the first order generalized Reed-Muller codes. In Subsection 5.3.2, we will compare the complexities of these two algorithms.

Maximum-likelihood Decoding of $RM_{2^h}(r,m)$ using Karnaugh Maps

A K-Map is a set of squares, each representing a number in the range $\{0, 1, \ldots, 2^m - 1\}$, with the property that the binary representations of every two neighboring squares differ in exactly one bit position. The K-Maps for $m = 4$ and $m = 5$ are depicted in Tables 5.3 and 5.4. As is apparent from these tables, two squares are neighbors if they are either adjacent, or circularly adjacent (like 0 and 2 in Table 5.3), or at the same position in different blocks (like 15 and 31 in Table 5.4).

Table 5.3: K-Map for 4 variables (rows indexed by x4x3, columns by x2x1).
x4x3 \ x2x1 | 00 | 01 | 11 | 10
00 | 0 | 1 | 3 | 2
01 | 4 | 5 | 7 | 6
11 | 12 | 13 | 15 | 14
10 | 8 | 9 | 11 | 10

Table 5.4: K-Map for 5 variables (left block x5 = 0, right block x5 = 1; rows indexed by x4x3, columns by x2x1).
x4x3 \ x2x1 | 00 | 01 | 11 | 10 || 00 | 01 | 11 | 10
00 | 0 | 1 | 3 | 2 || 16 | 17 | 19 | 18
01 | 4 | 5 | 7 | 6 || 20 | 21 | 23 | 22
11 | 12 | 13 | 15 | 14 || 28 | 29 | 31 | 30
10 | 8 | 9 | 11 | 10 || 24 | 25 | 27 | 26

Let us define $S_k = \{1, 2, \ldots, k\}$.
The transmitted sequence in $RM_{2^h}(3,4)$ can be written as:

\[ f(x_1, x_2, x_3, x_4) = \Bigg[ \alpha + \sum_{i \in S_4} \alpha_i x_i + \sum_{\substack{(i,j) \in S_4^2 \\ i \ne j}} \alpha_{ij} x_i x_j + \sum_{\substack{(i,j,k) \in S_4^3 \\ i \ne j \ne k}} \alpha_{ijk} x_i x_j x_k \Bigg] \bmod 2^h, \tag{5.42} \]

where the $\alpha$'s represent the Reed-Muller coefficients. Note that decoding of generalized Reed-Muller codes is equivalent to determining the coefficients of the monomials. Using the expansion in (5.42), we can find the components of the transmitted vector. For example, by setting all of the $x_i$'s equal to 1, we obtain $f_{15}$, the 15th component of $f$. In the following we show $f_{15}$, $f_{14}$, $f_{13}$, $f_{12}$, and $f_1$:

\[ f_{15} = \Bigg[ \alpha + \sum_{i \in S_4} \alpha_i + \sum_{\substack{(i,j) \in S_4^2 \\ i \ne j}} \alpha_{ij} + \sum_{\substack{(i,j,k) \in S_4^3 \\ i \ne j \ne k}} \alpha_{ijk} \Bigg] \bmod 2^h, \]

\[ f_{14} = \Bigg[ f_{15} - \alpha_1 - \sum_{i \in S_4 - \{1\}} \alpha_{1i} - \sum_{\substack{(i,j) \in (S_4 - \{1\})^2 \\ i \ne j}} \alpha_{1ij} \Bigg] \bmod 2^h, \]

\[ f_{13} = \Bigg[ f_{15} - \alpha_2 - \sum_{i \in S_4 - \{2\}} \alpha_{2i} - \sum_{\substack{(i,j) \in (S_4 - \{2\})^2 \\ i \ne j}} \alpha_{2ij} \Bigg] \bmod 2^h, \]

\[ f_{12} = \Bigg[ f_{15} - \alpha_1 - \alpha_2 - \alpha_{12} - \sum_{i \in S_4 - \{1,2\}} (\alpha_{1i} + \alpha_{2i} + \alpha_{12i}) - \sum_{\substack{(i,j) \in (S_4 - \{1,2\})^2 \\ i \ne j}} (\alpha_{1ij} + \alpha_{2ij}) \Bigg] \bmod 2^h, \]

\[ f_1 = [\alpha + \alpha_1] \bmod 2^h. \]

If we rewrite these expressions for all of the components of the encoded codeword, both $\alpha_{123}^1$ and $\alpha_{123}^2$, shown in (5.43), can be considered as estimates of $\alpha_{123}$:

\[ \begin{aligned} \alpha_{123}^1 &= [f_{15} - (f_{11} + f_{13} + f_{14}) + (f_9 + f_{10} + f_{12}) - f_8] \bmod 2^h, \\ \alpha_{123}^2 &= [f_7 - (f_3 + f_5 + f_6) + (f_1 + f_2 + f_4) - f_0] \bmod 2^h. \end{aligned} \tag{5.43} \]

By examining Table 5.3, the estimate of $\alpha_{123}$ is obtained from the K-Map in the following manner:

• Specify the block of squares containing $x_1$, $x_2$, and $x_3$ (squares 15 and 7). Write the corresponding terms with a positive sign in different equations.
• For each of the squares found in the previous step (e.g. $f_{15}$), write the adjacent squares with negative signs ($f_{11}$, $f_{13}$, and $f_{14}$). Exclude the squares found in the previous step.
• Continue step 2, alternating the negative and positive signs, until all of the squares are covered.

If $\alpha_{123}^1$ and $\alpha_{123}^2$ are the same, we can take the common value as the estimate of $\alpha_{123}$. Note that the Hamming distance of $RM_{2^h}(r,m)$ is $2^{m-r}$ [13].
For $RM_{2^h}(3,4)$, this distance is 2 and therefore we can only detect one symbol error. This fact is apparent from our algorithm, too. Any single symbol error might cause the values of $\alpha_{123}$ found from these two equations to differ. Therefore, we can detect one symbol error, but cannot correct it. This is true for the Lee distance, too, since the Lee distance of $RM_{2^h}(r,m)$ is also $2^{m-r}$. The other 3rd order coefficients, $\alpha_{124}$, $\alpha_{134}$, and $\alpha_{234}$, can be found in the same way. For the lower order coefficients, we first need to subtract the higher order monomials from the original codeword. If we denote the result of subtracting the third order monomials from the transmitted codeword by $f^1$, we have

\[ f^1 = \Bigg[ f - \sum_{\substack{(i,j,k) \in S_4^3 \\ i \ne j \ne k}} \alpha_{ijk} x_i x_j x_k \Bigg] \bmod 2^h. \tag{5.44} \]

Then, the following set of equations can be obtained from the K-Map for $\alpha_{24}$:

\[ \begin{aligned} \hat{\alpha}_{24} &= f^1_{15} - (f^1_7 + f^1_{13}) + f^1_5 \\ &= f^1_{14} - (f^1_6 + f^1_{12}) + f^1_4 \\ &= f^1_{11} - (f^1_3 + f^1_9) + f^1_1 \\ &= f^1_{10} - (f^1_2 + f^1_8) + f^1_0 \pmod{2^h}. \end{aligned} \]

Again, $\alpha_{24}$ can be found using a majority vote. Likewise, we can obtain the hard-decision estimates of the other second order coefficients. By subtracting the second order monomials from $f^1$ and obtaining $f^2$, we can write a set of equations for decoding the first order monomials, such as:

\[ \hat{\alpha}_1 = f^2_1 - f^2_0 = f^2_3 - f^2_2 = f^2_5 - f^2_4 = f^2_7 - f^2_6 = f^2_9 - f^2_8 = f^2_{11} - f^2_{10} = f^2_{13} - f^2_{12} = f^2_{15} - f^2_{14} \pmod{2^h}. \]

The same set of equations can be obtained for the other first order coefficients. Finally, the constant term coefficient is the result of a majority vote over all of the remaining components. It can be verified that the majority logic scheme shown above using K-Maps is a $2^{m-r}$ Hamming and Lee distance decoder.
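The two estimates in (5.43) and the single-error detection property can be illustrated with a small hard-decision sketch. It assumes the indexing used above, where component $k$ of a codeword corresponds to $x_i$ being bit $i-1$ of $k$ (so $f_{15}$ has all $x_i = 1$ and $f_1 = \alpha + \alpha_1$). The helper names and the example coefficients are hypothetical.

```python
def rm_encode(coeffs, m, h):
    """Evaluate a generalized Reed-Muller codeword over Z_{2^h}.

    coeffs maps a tuple of 1-based variable indices (a monomial) to its
    coefficient; the empty tuple is the constant term.
    """
    q = 2**h
    monomials = {frozenset(mon): c for mon, c in coeffs.items()}
    word = []
    for k in range(2**m):
        ones = {i for i in range(1, m + 1) if (k >> (i - 1)) & 1}
        # A monomial contributes when all of its variables equal 1 at index k.
        word.append(sum(c for mon, c in monomials.items() if mon <= ones) % q)
    return word


def estimates_alpha123(f, h):
    """The two estimates of alpha_123 from (5.43), for a length-16 word f."""
    q = 2**h
    e1 = (f[15] - (f[11] + f[13] + f[14]) + (f[9] + f[10] + f[12]) - f[8]) % q
    e2 = (f[7] - (f[3] + f[5] + f[6]) + (f[1] + f[2] + f[4]) - f[0]) % q
    return e1, e2


if __name__ == "__main__":
    h, m = 2, 4
    # Hypothetical codeword of RM_4(3,4) with alpha_123 = 3.
    coeffs = {(): 1, (1,): 2, (2, 4): 3, (1, 2, 3): 3, (2, 3, 4): 1}
    f = rm_encode(coeffs, m, h)
    print(estimates_alpha123(f, h))        # both estimates agree

    corrupted = list(f)
    corrupted[15] = (corrupted[15] + 1) % 2**h   # a single symbol error
    print(estimates_alpha123(corrupted, h))      # estimates disagree: error detected
```

With only two estimates per third order coefficient, a disagreement signals an error but gives no majority, which matches the detect-but-not-correct behaviour described above.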
To perform soft-decision decoding in the complex domain, if the received codeword is $\{y_0, y_1, \ldots, y_{15}\}$, the maximum-likelihood estimate of $\alpha_{123}$ is a number $\hat{\alpha}_{123} \in Z_{2^h}$ that maximizes the real part of the following expression:

\[ w^{-\hat{\alpha}_{123}} \left[ y_{15} (y_{11} y_{13} y_{14})^* (y_9 y_{10} y_{12}) y_8^* + y_7 (y_3 y_5 y_6)^* (y_1 y_2 y_4) y_0^* \right]. \tag{5.45} \]

Note that the conjugation of $y_{11} y_{13} y_{14}$ in (5.45), performed in the complex domain, corresponds to the subtraction in (5.43), performed in the code domain. The other coefficients can be found likewise.

Recursive Maximum-likelihood Decoding of $RM_{2^h}(r,m)$

The K-Map scheme is appropriate if $m$ is not very large. For larger $m$, we introduce a recursive scheme that decomposes the decoding of $RM_{2^h}(r,m)$ into the decoding of lower order or smaller Reed-Muller codes. To this end, we first restate the maximum-likelihood soft-decision decoding scheme for $RM_{2^h}(1,m)$ presented in [125]. If the information symbol to be transmitted is represented by a vector $[u_0 \; u] \in (Z_{2^h})^{m+1}$, then the transmitted sequence from $RM_{2^h}(1,m)$ is $w^{u_0} w^{A^T u}$, where $w = e^{2\pi j / 2^h}$ is the $2^h$-th root of unity, $A$ is an $m \times 2^m$ binary matrix whose columns are the $2^m$ binary vectors of length $m$, and the exponentiation is taken component-wise. The vector received through an AWGN channel is $y = w^{u_0} w^{A^T u} + n$, where $n$ is a complex white Gaussian noise vector with $E[nn^H] = \sigma^2 I_{2^m}$, and $I_{2^m}$ is the identity matrix of size $2^m$. The maximum-likelihood decoder finds the vector $u$ that maximizes the correlation between the received vector $y$ and all of the first order Reed-Muller codewords in $Z_{2^h}$. Specifically, it maximizes $|w^{-u_0} y^T w^{-A^T u}|$, with the constraint that $-\frac{\pi}{2^h} \le \arg(w^{-u_0} y^T w^{-A^T u}) \le \frac{\pi}{2^h}$ [125]. This maximization can be done using the modified fast Hadamard transformation. The Hadamard matrix, $H_m$, is a $2^m \times (2^h)^m$ matrix. Each row corresponds to a binary vector of size $m$, and each column corresponds to a $2^h$-ary vector of size $m$.
For any two vectors $a \in (Z_{2^h})^m$ and $b \in (Z_2)^m$, the element at the intersection of the corresponding row and column is $H_m[a,b] = w^{a^T b}$. The matrix $H_m$ can be generated recursively using the following equation, which is very similar to the recursive equation for binary Hadamard matrices [13]:

\[ H_m = \prod_{i=1}^{m} I_{2^{m-i}} \otimes H_1 \otimes I_{2^{h(i-1)}}, \]

where $\otimes$ denotes the Kronecker product. The algorithm finds the fast Hadamard transformation of the received word, $Y = H_m^H y$, and locates the element of $Y$, say $Y_i$, with the largest magnitude $|Y_i|$. Then, it finds $u_0 \in Z_{2^h}$ such that the real part of $w^{-u_0} Y_i$ is maximized. If the $i$th column of $H_m$ corresponds to the $2^h$-ary vector $u$ of size $m$, the decoder outputs the vector $[u_0 \; u]$.

Another soft-decision decoding algorithm, for $RM_{2^h}(1,3)$, is presented in [107]. This algorithm is based on the Reed majority logic decoder for binary Reed-Muller codes. The IEEE Wireless LAN standard committee has adopted this structure for the IEEE 802.11b physical layer. For size 8 complementary codes, at the transmitter, 4 phases $\phi_1$, $\phi_2$, $\phi_3$, and $\phi_4$ are calculated for each input symbol, and based on these phases, the sequence

\[ \left\{ e^{j\phi_1},\; -e^{j(\phi_1+\phi_2)},\; e^{j(\phi_1+\phi_3)},\; e^{j(\phi_1+\phi_2+\phi_3)},\; -e^{j(\phi_1+\phi_4)},\; e^{j(\phi_1+\phi_2+\phi_4)},\; e^{j(\phi_1+\phi_3+\phi_4)},\; e^{j(\phi_1+\phi_2+\phi_3+\phi_4)} \right\} \]

is transmitted. At the receiver, the sequence $y$ is received and the phase $\phi_2$ is estimated from

\[ \hat{\phi}_2 = \frac{1}{4} \left( y_0^* y_1 + y_2^* y_3 + y_4^* y_5 + y_6^* y_7 \right). \]

The other phases can be computed similarly. This algorithm can be generalized to decode any $RM_{2^h}(1,m)$ code, for an arbitrary $m$.

Another efficient algorithm for hard-decision decoding of $RM_{2^h}(1,m)$ is presented in [118]; it performs the binary fast Hadamard decoding $h$ times. This algorithm is not a maximum-likelihood decoding scheme, but it is a $2^{m-1}$ Hamming and Lee distance decoder.

All of these algorithms can be used for decoding $RM_{2^h}(1,m)$.
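For small $m$, the correlation criterion behind these first order decoders can be illustrated by exhaustive search. The sketch below is not the fast Hadamard algorithm of [125]; it is a brute-force stand-in that tries every $[u_0 \; u]$ and keeps the one whose codeword has the largest real correlation with $y$. It assumes the mapping $c \mapsto w^c$ with $w = e^{2\pi j/2^h}$ and the bit ordering in which $x_{i+1}$ is bit $i$ of the position index; both are stated assumptions, and all function names are illustrative.

```python
import cmath
from itertools import product


def rm1_codeword(u0, u, h):
    """First order codeword u0 + sum_i u[i] * x_{i+1} over Z_{2^h}, length 2**len(u)."""
    m, q = len(u), 2**h
    return [(u0 + sum(u[i] * ((k >> i) & 1) for i in range(m))) % q
            for k in range(2**m)]


def modulate(c, h):
    """Map code symbols to 2**h-PSK points w**c (assumed convention)."""
    w = cmath.exp(2j * cmath.pi / 2**h)
    return [w**v for v in c]


def ml_decode_rm1(y, m, h):
    """Exhaustive maximum-correlation decoder for RM_{2^h}(1, m).

    Complexity grows as (2**h)**(m+1), so this is only meant for small m.
    """
    q = 2**h
    best_metric, best = None, None
    for cand in product(range(q), repeat=m + 1):
        u0, u = cand[0], list(cand[1:])
        s = modulate(rm1_codeword(u0, u, h), h)
        metric = sum(yk * sk.conjugate() for yk, sk in zip(y, s)).real
        if best_metric is None or metric > best_metric:
            best_metric, best = metric, (u0, u)
    return best


if __name__ == "__main__":
    h, m = 2, 3
    tx = modulate(rm1_codeword(3, [1, 0, 2], h), h)
    # Small deterministic perturbation standing in for channel noise.
    rx = [s + 0.1 * (1 + 1j) * (-1) ** k for k, s in enumerate(tx)]
    print(ml_decode_rm1(rx, m, h))
```

For equal-energy codewords in AWGN, maximizing the real part of the correlation is equivalent to minimizing Euclidean distance, which is why the exhaustive search serves as a reference against which a fast transform implementation could be compared.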
However, to decode higher order Reed-Muller codes, we will restate the theorem presented in [13] for constructing higher order non-binary Reed-Muller codes, defined in $Z_{2^h}$, from lower order ones. The proof is basically the same as in the binary case, with the additions performed modulo $2^h$. The steps of the proof are used in our decoding scheme, and therefore the proof is stated here for the case of $RM_{2^h}(r,m+1)$. In the following, $x|y$ means the concatenation of two sequences $x$ and $y$.

Theorem 5.3.10. The $r$th order generalized Reed-Muller code of size $2^{m+1}$ can be constructed by the following recursive equation:

\[ RM_{2^h}(r, m+1) = \{ (f \,|\, f+g),\; \forall f \in RM_{2^h}(r,m) \;\&\; \forall g \in RM_{2^h}(r-1,m) \}. \]

Proof. If $c$ is a sequence from the $r$th order generalized Reed-Muller code of size $2^{m+1}$, it can be written as:

\[ c(x_1, x_2, \ldots, x_m, x_{m+1}) = [f(x_1, x_2, \ldots, x_m) + 0 \cdot (x_{m+1})] + (x_{m+1}) \cdot g(x_1, x_2, \ldots, x_m) \bmod 2^h, \tag{5.46} \]

where $f(x_1, x_2, \ldots, x_m)$ is an $r$th order generalized Reed-Muller sequence of size $2^m$ and $g(x_1, x_2, \ldots, x_m)$ is an $(r-1)$th order generalized Reed-Muller codeword of size $2^m$, both over $Z_{2^h}$, and the addition is modulo $2^h$. Equation (5.46) can be written as $c = (f|f) + (0|g) = f \,|\, f+g$. Note that the coefficients of all of the monomials containing $x_{m+1}$ are determined by $g$.

Since the transmitted codeword is equal to $w^c$, in order to obtain the partial sequence $g$ we have to multiply the second half of the received codeword by the conjugate of the first half, component-wise. Theorem 5.3.10 provides the following algorithm to decode $RM_{2^h}(r,m+1)$:

Algorithm I [Recursive Decoding of $RM_{2^h}(r,m+1)$]

1. The word $y = \{y_0, y_1, \ldots, y_{2^{m+1}-1}\}$ is received.
2. For each $i = 0, 1, \ldots, 2^m - 1$, find $g_i = y_{i+2^m} y_i^*$. This is equivalent to finding $g(x_1, x_2, \ldots, x_m)$, or all of the monomials containing $x_{m+1}$, in (5.46). The resultant $g$ is in $RM_{2^h}(r-1,m)$. Call the first half of the original sequence $f$.
3. Use either this recursive algorithm or any of the non-recursive algorithms (mentioned in the previous or this subsection) to find the coefficients corresponding to $g$.
4. Subtract (in the code domain) $g$ from the second half of $y$, to find an estimate of $f$, which is in $RM_{2^h}(r,m)$.
5. Apply either this recursive algorithm or any of the non-recursive algorithms (presented in the previous or this subsection) to the average of $f$ and the estimate of $f$ found in the previous step. Instead of averaging, we can find the equations corresponding to the coefficients in $f$ and its estimate separately, and then for each coefficient perform the same maximum-likelihood decoding, similar to (5.45).

Note that we can terminate the recursive algorithm at any step by using any one of the non-recursive algorithms mentioned in this subsection (for the case of the first or second order generalized Reed-Muller code) or the K-Map method presented in the previous section (for higher order codes).

[Figure 5.1: Bit Error Rate (BER) vs. SNR for AWGN channels. Axes: BER (log scale) vs. SNR. Curves: m = 3, Golay; m = 4, Golay; m = 3, order 2; m = 3, order 3; m = 4, order 2; m = 4, order 3; m = 4, order 4.]

Comparison of the Complexities of Decoding Algorithms

We measure the complexity of these decoders by enumerating the number of complex multiplications and additions needed to perform the decoding. For the non-recursive scheme, to estimate one of the $k$th order coefficients, we need to perform $2^{m-k}(2^k - 1)$ complex multiplications to create equations similar to (5.43), $2^h$ complex multiplications to evaluate expressions like (5.45), and $2^{m-k} - 1$ complex additions. Since there are exactly $\binom{m}{k}$ coefficients of order $k$, decoding of $RM_{2^h}(r,m)$ requires $\sum_{k=0}^{r} \binom{m}{k} \left( 2^h + 2^m (1 - 2^{-k}) \right)$ complex multiplications and $\sum_{k=0}^{r} \binom{m}{k} \left( 2^{m-k} - 1 \right)$ complex additions. For $m = 5$, $r = 3$ and $h = 2$, these numbers are calculated as 736 complex multiplications and 232 complex additions.
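The construction in Theorem 5.3.10 and step 2 of Algorithm I can be sketched for the noise-free case. The sketch assumes the mapping $c \mapsto w^c$ with $w = e^{2\pi j/2^h}$ and recovers $g$ by quantizing the phase of $y_{i+2^m} y_i^*$; the function names and the example vectors are hypothetical, and no claim is made about behaviour under noise.

```python
import cmath


def concat_construct(f, g, h):
    """Build c = (f | f + g) mod 2**h from two length-2**m sequences."""
    q = 2**h
    return list(f) + [(a + b) % q for a, b in zip(f, g)]


def modulate(c, h):
    """Map code symbols to 2**h-PSK points w**c (assumed convention)."""
    w = cmath.exp(2j * cmath.pi / 2**h)
    return [w**v for v in c]


def recover_g(y, h):
    """Step 2 of Algorithm I: g_i from the phase of y[i + half] * conj(y[i])."""
    q = 2**h
    half = len(y) // 2
    g = []
    for i in range(half):
        z = y[i + half] * y[i].conjugate()
        # Quantize the phase to the nearest multiple of 2*pi / 2**h.
        g.append(round(cmath.phase(z) * q / (2 * cmath.pi)) % q)
    return g


if __name__ == "__main__":
    h = 2
    f = [1, 3, 0, 2]   # a word of RM_4(2, 2)
    g = [0, 1, 2, 3]   # x1 + 2*x2, a word of RM_4(1, 2)
    c = concat_construct(f, g, h)
    print(c)
    print(recover_g(modulate(c, h), h))
```

Multiplying by the conjugate cancels the common factor $w^{f_i}$, so only the contribution of $g$ remains, which is the observation the recursion relies on.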
As for the recursive scheme, we measure the complexity of one iteration, assuming that at the end of this iteration the K-Map method is used to decode the partial sequences. Assuming that we start from the generalized $RM_{2^h}(r,m+1)$, separation of $f$ and $g$ (as defined in Theorem 5.3.10) requires $2^m$ complex multiplications, decoding of $g \in RM_{2^h}(r-1,m)$ requires $\sum_{k=0}^{r-1} \binom{m}{k} \left( 2^h + 2^m (1 - 2^{-k}) \right)$ complex multiplications and $\sum_{k=0}^{r-1} \binom{m}{k} \left( 2^{m-k} - 1 \right)$ complex additions, and decoding of $f \in RM_{2^h}(r,m)$ and its estimate requires $\sum_{k=0}^{r} \binom{m}{k} \left( 2^h + 2 \cdot 2^m (1 - 2^{-k}) \right)$ complex multiplications and $\sum_{k=0}^{r} \binom{m}{k} \left( 2^{m+1-k} - 1 \right)$ complex additions. In the following theorem, we prove that one iteration of the recursive algorithm is no more complex than the K-Map method.

Theorem 5.3.11. The decoding complexity of $RM_{2^h}(r,m+1)$, in terms of the number of complex multiplications and additions, associated with the K-Map decoding method presented in Subsection 5.3.2 is at least as high as the one associated with one iteration of the recursive Algorithm I.

Proof. We consider decoding of $RM_{2^h}(r,m+1)$.

a. Multiplications: From our discussion in Subsection 5.3.2, the total number of complex multiplications for the non-recursive algorithm minus the number for the recursive algorithm is given by:

\[ \sum_{k=0}^{r} \binom{m+1}{k} \left( 2^h + 2^{m+1}(1 - 2^{-k}) \right) - \left[ 2^m + \sum_{k=0}^{r-1} \binom{m}{k} \left( 2^h + 2^m(1 - 2^{-k}) \right) + \sum_{k=0}^{r} \binom{m}{k} \left( 2^h + 2 \cdot 2^m (1 - 2^{-k}) \right) \right]. \]

Using the relation $\binom{m+1}{k} = \binom{m}{k} + \binom{m}{k-1}$ and some changes of indices, the above expression becomes

\[ 2^{m+1} \sum_{k=0}^{r-1} \binom{m}{k} \left( 1 - 2^{-(k+1)} \right) - 2^m - 2^m \sum_{k=0}^{r-1} \binom{m}{k} \left( 1 - 2^{-k} \right) = 2^m \sum_{k=0}^{r-1} \binom{m}{k} - 2^m \ge 0. \]

Equality holds if $r = 1$.

b. Additions: From our discussion in Subsection 5.3.2, the total number of complex additions for the non-recursive algorithm minus the number for the recursive algorithm is given by:

\[ \begin{aligned} & \sum_{k=0}^{r} \binom{m+1}{k} \left( 2^{m+1-k} - 1 \right) - \left[ \sum_{k=0}^{r-1} \binom{m}{k} \left( 2^{m-k} - 1 \right) + \sum_{k=0}^{r} \binom{m}{k} \left( 2^{m+1-k} - 1 \right) \right] \\ &\quad = \sum_{k=0}^{r} \binom{m}{k} \left( 2^{m+1-k} - 1 \right) + \sum_{k=0}^{r-1} \binom{m}{k} \left( 2^{m-k} - 1 \right) - \sum_{k=0}^{r-1} \binom{m}{k} \left( 2^{m-k} - 1 \right) - \sum_{k=0}^{r} \binom{m}{k} \left( 2^{m+1-k} - 1 \right) = 0. \end{aligned} \]

[Figure 5.2: Coding rate vs. SNR for AWGN channels. Axes: coding rate vs. SNR. Curves: m = 3, Golay; m = 4, Golay; m = 3, order 2; m = 3, order 3; m = 4, order 2; m = 4, order 3; m = 4, order 4.]

The algorithms presented in [118,125,126] are proposed for the cosets of the first order Reed-Muller codes that lie in the second order Reed-Muller codes, whereas our algorithms are presented for general $RM_{2^h}(r,m)$. Nevertheless, in Table 5.5 we compare the complexity of our algorithms for $m = 5$, $h = 2$ with the known schemes for the cosets of $RM_4(1,5)$ in $RM_4(2,5)$, in terms of the number of real additions and multiplications. It is worthwhile to mention that although our algorithms are proposed for the generalized Reed-Muller code of any order, they still have complexities comparable with the other schemes. Another point is that the other schemes consider special second order monomials as the cosets, while our algorithms consider any second order monomials, and therefore our algorithms perform a higher number of calculations. As a matter of fact, the basis of comparison is not accurate enough and is biased against our algorithms.

Table 5.5: Computational complexities of decoders for codes from RM4(2,5).
Operation | [125], Alg. 1 | [118], Alg. 17 | [126], Alg. 6 | [126], Alg. 8 | K-Map | Recursive Alg.
+ | 19200 | 9600 | 1960 | 5800 | 1184 | 1088
× | 0 | 0 | 720 | 840 | 1248 | 1104

Table 5.6: Number of constructed 8-valued SGolay pairs.
Scheme | size 4 SGPs | size 8 SGPs built by Theorem 5.2.7
1 | 12032 | 385020
2 | 4096 | 131720

5.3.3 Performance Results

Simulation Results for super-Golay Codes

We have performed a search to find all of the QPSK Golay pairs and 16-QAM SGolay pairs of size 4. The total number of QPSK Golay pairs is 256, which agrees exactly with the formula given in [118] ($2^{2(m+2)} \frac{m!}{2}$ distinct QPSK Golay pairs, evaluated for size 4). Using the representations in Theorem 5.2.10, we have been able to build 4096 SGolay pairs of size 4. The total number of 4-valued SGolay pairs is 12032.
Table 5.6 compares the number of 8-valued SGolay pairs obtained from these two sets. In both cases, each Golay pair of size 4 yields 32 new SGolay pairs of size 8. In this table, scheme 1 refers to the total number of SGolay pairs and scheme 2 refers to the number of SGolay pairs built by Theorem 5.2.10.

Tables 5.7 and 5.8 compare the code rate, information rate, and achievable PMEPR of the proposed recursive structure with the values reported in [112,118,124] using the QPSK constellation, for 16 and 32 subcarriers. In the rows labeled "Jedwab1", the second order Reed-Muller cosets of the form $2^{h-1} \sum_{k=0}^{m-1} x_{\pi(k)} x_{\pi(k+1)}$ are used, and therefore the constructed codes are Golay and their PMEPR is bounded above by 3 dB. However, in the scheme labeled "Jedwab2", other forms of second order Reed-Muller cosets are also allowed. This causes the PMEPR of the code to lie above 3 dB. In the row labeled "Paterson", the concept of Golay sets is used. It is proved in [112] that a Golay set of size $2^{k+1}$ achieves a maximum PMEPR of $2^{k+1}$.

Table 5.7: Rates and PMEPRs for size 16.
Scheme | Max PMEPR | Code rate | Information rate
Recursive1 | 3 dB | 0.3438 | 1.375
Recursive2 | 3 dB | 0.375 | 1.5
Jedwab1 | 3 dB | 0.31 | 0.62
Jedwab2 | 6 dB | 0.47 | 0.94
Paterson | 6 dB | 0.563 | 1.126
Chong1 | 3 dB | 0.2954 | 1.1817
Chong2 | 5.56 dB | 0.3053 | 1.2212

Table 5.8: Rates and PMEPRs for size 32.
Scheme | PMEPR | Code rate | Information rate
Recursive1 | 3 dB | 0.2109 | 0.84375
Recursive2 | 3 dB | 0.2266 | 0.9063
Jedwab1 | 3 dB | 0.19 | 0.38
Jedwab2 | 6 dB | 0.31 | 0.62
Paterson | 6 dB | 0.375 | 0.75
Chong1 | 3 dB | 0.1835 | 0.7341
Chong2 | 5.56 dB | 0.3053 | 1.2212

[Figure 5.3: Bit Error Rate (BER) vs. SNR for multipath fading channels, when the channel is known at the receiver. Axes: BER (log scale) vs. SNR. Curves: m = 3, Golay; m = 4, Golay; m = 3, order 2; m = 3, order 3; m = 4, order 2; m = 4, order 3; m = 4, order 4.]
The row labeled "Recursive1" uses the 4-valued 16-QAM SGolay pairs generated from QPSK pairs as the seed, while the "Recursive2" scheme uses all of the 4-valued 16-QAM codes found by exhaustive search as the seed.

Simulation Results for Cyclic Golay Codes

Table 5.9 compares the rates and distances of Golay and cyclic Golay QPSK codes for $m = 3$ and $m = 4$. As expected, the Hamming and Lee distances of the cyclic Golay codes are in general low, while their coding rate is higher than that of the Golay code.

Table 5.9: Rate (R), Hamming distance (H), and Lee distance (L) of some codes with low PAPR.
Code / Size | m = 3 | m = 4
Golay | R = 0.59, H = 2, L = 4 | R = 0.42, H = 4, L = 8
mth order cyclic | R = 0.68, H = 2, L = 2 | R = 0.51, H = 2, L = 2
Exhaustive cyclic | R = 0.73, H = 2, L = 2 | R = 0.58, H = 2, L = 2

For the comparison of bit error rate, we used an OFDM system with $m = 3$ and $m = 4$, corresponding to 8 and 16 subchannels. The same discussion applies to larger numbers of subchannels. White Gaussian noise with variance 0.5 per dimension was assumed at each subchannel. The symbols were chosen from $Z_4$, corresponding to $h = 2$. The bit error rate versus SNR for the Golay and cyclic Golay schemes is shown in Fig. 5.1 for AWGN channels. The cyclic Golay codes with $m = 4$, order 4 and $m = 3$, order 3 have the worst bit error rates, and the Golay code with $m = 4$ has the best performance. The second order cyclic Golay code with $m = 4$ performs better than the other cyclic Golay codes. As the difference between the dimension of the code and the order decreases, the bit error rate of the code increases. This is attributed to the fact that the cyclic Golay code is a subset of $RM_{2^h}(r,m)$, and the distance of the generalized Reed-Muller codes depends on the difference between the dimension and the order of the code. Note that the differences between the BERs are better observed at high SNRs.

Fig. 5.2 shows the rate we obtained for these codes. The following facts can be seen from this figure. First, the rates shown in this figure are a little lower than those in Table 5.9. This is because, for example for $m = 4$, only 8 ($= 2^3$) out of 12 ($= \frac{4!}{2}$) coset representatives are used in the simulation. Second, the codes with $m = 4$ normally have lower rates than those with $m = 3$. Third, for each dimension, the pure Golay code has the worst coding rate. Fourth, as the difference between the dimension and the order of the cyclic Golay code decreases, the code can achieve a better coding rate. Finally, although the cyclic Golay code can result in different rates in different OFDM blocks, the rate of the code over a long frame is almost constant for all values of SNR. These two figures clearly show the trade-off between the coding rate and the code distance, while the PAPR is bounded above by 3 dB.

[Figure 5.4: Bit Error Rate (BER) vs. SNR for multipath fading channels, when the channel is not known at the receiver. Axes: BER (log scale) vs. SNR. Curves: m = 3, Golay; m = 4, Golay; m = 3, order 2; m = 3, order 3; m = 4, order 2; m = 4, order 3; m = 4, order 4.]

[Figure 5.5: OFDM transmitter with Golay and cyclic Golay encoder. Blocks: input stream, modulation (m bits per branch), N-point IFFT, parallel/serial conversion and guard interval insertion (cyclic extension), D/A and low-pass filter, up-conversion.]

The results for the two-path fading channels are shown in Figs. 5.3 and 5.4. Both paths have a Rayleigh envelope and a delay equal to the OFDM symbol duration divided by the number of subchannels. We have also considered the same white Gaussian noise as in Fig. 5.1 in these figures. In Fig. 5.3 we assume that the receiver knows the channel characteristics, and in Fig. 5.4 the receiver has no knowledge of the channel condition. We can observe the same trend as in Fig. 5.1, of course with lower bit error rates. When the channel is unknown at the receiver, the BER tends to converge to a constant level at high SNR.
This is due to the fact that in this case, the SNR is the quotient of the signal power to the noise power. At high SNRs, the effect of the fading characteristics dominates the effect of noise, and therefore the BER does not change significantly with increasing SNR.

We saw that the cyclic Golay code can achieve a higher coding rate at the expense of a lower distance. However, in a high SINR environment, we might be able to tolerate lower distance codes. The system depicted in Fig. 5.5 sets an SINR threshold level. If the SINR is above the threshold, we use cyclic Golay codes at the transmitter to transmit at higher data rates. If the SINR is below the threshold, we switch to the Golay codes, which have better error correcting properties but lower rates. The threshold can be chosen according to the upper bound on the BER that we can tolerate. We use this system with $m = 4$ for our simulations in Figs. 5.6 and 5.7. We assume that the SNR of the received signal changes randomly according to a uniform distribution. These figures compare four schemes. In the first and second schemes, when the SNR is below a threshold, we use the Golay code, while the cyclic Golay code is used when the SNR is above the threshold. The first scheme uses the second order cyclic Golay code and the second scheme uses the fourth order cyclic Golay code. In the other two schemes, we have two thresholds. When the SNR crosses the first threshold, we switch from Golay to the 2nd order cyclic Golay code, and when the SNR is above the second threshold, we switch to the 4th order cyclic Golay code. The two schemes differ in the spacing between the thresholds. Notice that the x-axis in these figures corresponds to the SNR threshold at which we switch the coding scheme, not the actual SNR. The actual SNR changes from one OFDM block to another, according to a uniform distribution.
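The threshold-based switching described above amounts to a lookup of the current block's SNR against an ordered list of thresholds. The following is a minimal sketch; the code labels, the threshold values in the usage example, and the choice to treat an SNR exactly equal to a threshold as being above it are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical labels for the three codes that the text switches among for m = 4.
TWO_THRESHOLD_CODES = ["Golay", "cyclic Golay (order 2)", "cyclic Golay (order 4)"]


def select_code(snr_db, thresholds_db, codes=TWO_THRESHOLD_CODES):
    """Pick the code for one OFDM block from its SNR.

    thresholds_db must be sorted in ascending order and codes must contain
    exactly one more entry than thresholds_db. An SNR equal to a threshold
    is treated as above it (an assumption; the text does not specify this).
    """
    if len(codes) != len(thresholds_db) + 1:
        raise ValueError("need exactly len(thresholds_db) + 1 codes")
    return codes[bisect_right(thresholds_db, snr_db)]


if __name__ == "__main__":
    # Two-threshold scheme with illustrative thresholds of 5 dB and 8 dB.
    for snr in (3.0, 6.0, 10.0):
        print(snr, select_code(snr, [5.0, 8.0]))
    # Single-threshold scheme: Golay below the threshold, order-4 cyclic Golay above.
    print(select_code(7.0, [5.0], ["Golay", "cyclic Golay (order 4)"]))
```

Raising the thresholds makes the Golay code the more frequent choice, which is the mechanism behind the BER and rate trends plotted against the threshold.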
When the threshold is set to the lowest value, the Golay codes are never used, and therefore we expect poor BER performance with a better coding rate. As the threshold becomes larger, the cyclic Golay code is used less often and therefore the BER tends to be lower. When the threshold is set to the highest value, we use the Golay code all the time.

Fig. 5.8 shows the change in bit error rate versus the coding rate for $m = 4$, when the SNR is kept constant. The figure agrees with our expectation that, for a fixed SNR, the codes exhibit a higher bit error rate as the rate increases. For example, with the SNR fixed at 6 dB, as the coding rate goes from 0.40625 to 0.50358, the bit error rate increases from $5.4 \times 10^{-4}$ to $6.7 \times 10^{-2}$.

Figure 5.6: Bit Error Rate (BER) vs. SNR threshold for AWGN channels, when m = 4. Curves: two thresholds with D = 3; two thresholds with D = 5; one threshold, order 2; one threshold, order 4.

5.3.4 Summary of the Chapter

In this chapter we modified Golay complementary codes to increase their achievable rates. First, we extended the concept of Golay codes to cover non-equal-energy constellations, and to this end we defined super-Golay codes. We proposed recursive construction schemes that allow us to build super-Golay codes of a specific size from codes of lower size. The construction starts from QPSK Golay sequences, which are efficiently created using 2nd-order cosets of $RM_{2^h}(1, m)$.

Next, we introduced the concept of cyclic Golay codes and showed that, with appropriate time shaping, they maintain the same level of PAPR in the discrete domain (critical sampling) as the Golay codes. Moreover, we showed that the set of cyclic Golay codes is a superset of the Golay codes and therefore results in a higher coding rate. We designed a construction method to find the cyclic shift of any code represented by Boolean algebraic forms. These codes introduce a trade-off between the coding rate and the distance of the code.

Figure 5.7: Coding rate vs. SNR threshold for AWGN channels, when m = 4. Curves: two thresholds with D = 3; two thresholds with D = 5; one threshold, order 2; one threshold, order 4.

Two decoding methods for $RM_{2^h}(r, m)$ were also introduced. We proposed a generalization of the majority-logic decoding approach, using Karnaugh maps, for both soft-decision and hard-decision decoding of $RM_{2^h}(r, m)$. To reduce the complexity of the decoders, we introduced a recursive approach that reduces the decoding procedure to the decoding of codes of lower size and lower order.

Figure 5.8: Bit error rate vs. coding rate for fixed values of SNR for AWGN channels, when m = 4.

Chapter 6
Service Level Agreement (SLA) Based Scheduling Algorithms and QoS-provisioned Channel Allocation for OFDMA

6.1 Motivation and Previous Works

With the rapid growth in broadband wireless data networks and the increasing demand for multimedia applications, future wireless networks should be able to provide services for heterogeneous traffic with diverse quality of service (QoS) requirements. Also, the fast increase in traffic volume, in both the number of users and the bandwidth requirements of the emerging applications, calls for efficient utilization of the limited spectrum of wireless networks. The development of new technologies (including packet scheduling) for wireless networks has been receiving a lot of attention in both research and industry. A packet scheduling scheme is the mechanism that resolves contention between packets and manages bandwidth among the mobile users. Scheduling algorithms that support QoS and maintain high throughput are crucial to the development of broadband wireless networks.
Although many well-developed scheduling algorithms are available for wireline networks, they cannot be directly carried over to wireless networks because of major differences in the medium. These include location-dependent and time-varying wireless link capacity, scarce and limited bandwidth, high error rate, and user mobility. The time-varying nature of wireless channels introduces discontinuity in the availability of a user when the channel is in a bad condition. The very same nature of the wireless channel provides opportunities for transmitting a large amount of information when the channel is in a good condition. An efficient scheduler has to take advantage of the time variation of the wireless channel. On the other hand, if a scheduler operates independently of the channel condition, it might allocate bandwidth to users in deep fade, where most of the data is lost and bandwidth is wasted, while at the same time depriving users with good channels of the chance to exploit their instantaneous large capacity. Therefore, there is a need to develop new packet scheduling technologies that account for the special characteristics of wireless networks, support QoS (differentiation and guarantees), and provide high network utilization.

Quality of service (QoS) refers to the capability of a network to provide a certain service to selected network traffic. The four important attributes of QoS in packet networks are dedicated bandwidth, controlled jitter, controlled latency, and controlled loss characteristics. Through the notion of effective bandwidth, it can be shown that a certain QoS level can be translated into a bandwidth guaranteed to a user [16]. In other words, given the traffic characteristics and its QoS requirements, there is a required bandwidth (the effective bandwidth) that must be provided to the traffic to support the requested QoS. For instance, let us assume that the traffic is a two-state Markov process (ON/OFF model) with exponential state transition rates.
The ON state durations and OFF state durations are exponentially distributed with parameters $\beta_M$ and $\alpha_M$, respectively. In the OFF state the source is idle, while in the ON state the source generates packets at the peak rate $\lambda_M$. For this traffic, we assume that the QoS requirement is to guarantee that the probability that the delay of the packets in the buffer exceeds a certain threshold $D_M$ is less than $\epsilon_M$, i.e.,

\[ P[d > D_M] < \epsilon_M. \tag{6.1} \]

Given the traffic parameters $(\alpha_M, \beta_M, \lambda_M)$ and the QoS parameters $(D_M, \epsilon_M)$, the effective bandwidth of this traffic is [130]:

\[ C_d = \frac{\lambda_M (\alpha_M D_M - \ln \epsilon_M)}{(\alpha_M + \beta_M) D_M - \ln \epsilon_M}. \tag{6.2} \]

For other QoS parameters and traffic patterns, there are similar expressions that relate QoS to effective bandwidth. As a consequence, in this work QoS is represented by a reserved bandwidth guaranteed to a user.

In wireless scheduling algorithms, there is a trade-off between throughput maximization, which relates to the efficiency of bandwidth utilization, and QoS support, which indicates how resources are shared among users. Throughput is defined as the total bandwidth supported for the aggregate of the users, regardless of how much bandwidth is assigned to each user. On the other hand, QoS refers to the ability to provide a minimum guaranteed bandwidth to a user under different rate assignments. Let us define the QoS region as the summation of the reserved rates across all the users for all the feasible rate assignments. It is important to notice that for any given scheduler, the QoS region cannot be higher than the throughput. To maximize spectrum utilization, a good scheduler identifies and schedules users with high instantaneous channel capacities. However, this methodology could be biased toward mobile users who are closer to the base station. In other words, such a scheduler allows only users close to the base station to access the channel, while the users far from the base station do not receive the reserved bandwidth necessary for the required QoS.
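As a quick numerical illustration of the effective bandwidth expression (6.2), the following sketch evaluates $C_d$ for an ON/OFF source. The function and argument names are hypothetical; only the formula itself comes from the text. The two limiting cases are a useful sanity check: a zero delay bound requires the peak rate, and a very loose delay bound approaches the mean rate $\lambda_M \alpha_M / (\alpha_M + \beta_M)$.

```python
import math


def effective_bandwidth(alpha, beta, peak_rate, delay_bound, epsilon):
    """Effective bandwidth of an ON/OFF source, following (6.2).

    alpha, beta: parameters of the OFF and ON duration distributions.
    peak_rate:   packet generation rate in the ON state.
    delay_bound: delay threshold D_M.
    epsilon:     allowed probability of exceeding the delay threshold.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    log_term = -math.log(epsilon)  # positive because epsilon < 1
    numerator = peak_rate * (alpha * delay_bound + log_term)
    denominator = (alpha + beta) * delay_bound + log_term
    return numerator / denominator


# Example values (assumed, for illustration only).
tight = effective_bandwidth(1.0, 3.0, 2.0, delay_bound=0.0, epsilon=0.01)
loose = effective_bandwidth(1.0, 3.0, 2.0, delay_bound=1e9, epsilon=0.01)
```

Here `tight` equals the peak rate and `loose` is close to the mean rate, so the reserved bandwidth always lies between the two.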
There exist scheduling mechanisms that provide high utilization but sacrifice user satisfaction. There also exist scheduling schemes that sacrifice network throughput to support QoS. Our main objective is to obtain a scheduling algorithm that maximizes the system throughput while providing a QoS region very close to the system throughput.

Channel state dependent packet scheduling (CSDPS) defers the transmission of packets on links experiencing bursty errors [131]. A link status monitor checks the channel condition for all mobile hosts, and when it determines that a channel is in a bad state, the scheduler does not serve the user associated with that link. Any of the known wireline scheduling algorithms, e.g., round robin, earliest deadline first, and longest queue first, can be used as the service policy for this scheduling algorithm. However, CSDPS does not have any mechanism for supporting QoS (guaranteeing bandwidth) for a mobile user. CSDPS with class-based queuing (CSDPS-CBQ) groups users into different classes, and each class is committed a certain amount of bandwidth [132]. It keeps track of the amount of service received by each class in a certain time window. However, this scheduling algorithm does not have an explicit mechanism for compensating those mobile users who have not received the promised service.

Idealized wireless fair queuing (IWFQ) is a modified version of the weighted fair queuing (WFQ) scheduling algorithm for wireless networks [133]. It uses an error-free WFQ system as a reference and tries to make the real service approximate the ideal error-free system. In IWFQ, each flow is assigned a queue, and the $m$th packet of the $n$th queue is assigned a start time $S_{n,m}$ and a finish time $F_{n,m}$, where:

\[ S_{n,m} = \max\{v(A(t)),\, F_{n,m-1}\}, \qquad F_{n,m} = S_{n,m} + \frac{L_{n,m}}{r_n}. \tag{6.3} \]

Here, $L_{n,m}$ is the packet size, $v(A(t))$ is the system virtual time defined in WFQ, and $r_n$ is the rate reserved by user $n$ [133].
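The tagging rule (6.3) can be illustrated with a short sketch. The helper below is hypothetical and takes the virtual time $v(A(t))$ as an input; computing the WFQ virtual time itself is outside the scope of this example.

```python
def wfq_tags(virtual_time, prev_finish, packet_length, reserved_rate):
    """Start and finish tags of one packet, following (6.3).

    virtual_time:  system virtual time v(A(t)) at the packet arrival.
    prev_finish:   finish tag F_{n,m-1} of the previous packet in the queue.
    packet_length: packet size L_{n,m}.
    reserved_rate: rate r_n reserved by the user.
    """
    start = max(virtual_time, prev_finish)
    finish = start + packet_length / reserved_rate
    return start, finish


# A packet arriving to an idle queue starts at the current virtual time.
first = wfq_tags(virtual_time=5.0, prev_finish=3.0,
                 packet_length=2.0, reserved_rate=0.5)

# A packet arriving to a backlogged queue starts when its predecessor ends.
second = wfq_tags(virtual_time=5.0, prev_finish=first[1],
                  packet_length=1.0, reserved_rate=0.5)
```

A smaller reserved rate stretches the finish tags, so packets of that flow are served later relative to flows with larger reservations.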
This algorithm has some appealing properties in terms of fairness and QoS guarantees. However, while a user is being compensated for its previously lagging service, all other users with good channels are not served at all.

A Service Level Agreement (SLA) is a contract between a user and its service provider. An SLA defines the service (QoS) requested by the user, the price that the user must pay for the service, the penalty if the agreement is violated, and so on. This chapter considers the SLA as the reference point between the network and the network users. In this work, we introduce a notion of income maximization, by which the scheduler is rewarded when the total network throughput is increased and penalized when the SLA of a user is violated. We will show that by properly choosing the penalty, as a function of the SLA, and the reward, as a function of network throughput, the trade-off between throughput and user satisfaction is handled efficiently. We will also show that our algorithms meet the QoS requirements and utilize network resources efficiently. We propose a greedy solution and a dynamic programming approach for the problem.

This scheme is used for channel allocation in an OFDMA system. In OFDM systems, a high-rate data stream is split into a number of lower-rate data streams. Each substream is modulated separately on one of the orthogonal subcarriers. One way of applying OFDM to a multi-user scenario is through OFDM-TDMA or OFDM-CDMA [134], where different users are allocated different time slots or different frequency spreading codes. However, each user has to transmit its signal over the entire spectrum. This leads to an averaged-down effect in the presence of deep fading and narrowband interference. Alternatively, one can divide the total bandwidth into traffic channels (one or a cluster of OFDM subcarriers) so that multiple access can be accommodated in an orthogonal frequency division multiple access (OFDMA) fashion.
An OFDMA system is defined as one in which each user occupies a subset of subcarriers, and each carrier is assigned exclusively to one user at any time. One advantage of OFDMA over OFDM-TDMA and OFDM-CDMA is the elimination of intra-cell interference, which avoids the need for CDMA-type multi-user detection. In OFDMA, each transmitter is assigned a subset of the subcarriers to resolve the interference caused by the multi-access environment. This enables the network to perform flexible resource allocation, with the goal of increasing the overall network throughput depending on the traffic load of the different users. This capability leads to increased system throughput and spectral efficiency when the allocation of subcarriers to different users is performed carefully [135,136]. Another advantage of OFDMA is that it can exploit multiuser diversity, where a user avoids the channels that are in deep fade or have narrowband interference. Since different users face different channel qualities, a badly faded channel for one user may still be favorable to other users. With careful subcarrier allocation, the spectral efficiency of the system can significantly exceed that of interference-averaging techniques [137].

Clearly, careful subchannel allocation in an OFDMA system is crucial in determining the performance of the overall system. However, optimum channel allocation in OFDMA is a fundamentally difficult problem. In practice, additional constraints, e.g., the rate requirements of individual users, further complicate the problem. In [136], the problem of subcarrier allocation for generic multiple access systems with orthogonal subchannels in a multi-cell system is studied. In [138] the authors propose a model in which a single network access point serves a number of terminals that require varying data rates. They also use OFDMA, so that different terminals are allocated a different number of subcarriers depending on their data rate requirement.
In this manner all terminals can transmit simultaneously without collisions. The use of an IFFT in an OFDM modulator greatly simplifies this, because subcarriers that are not used can simply be zeroed out at the IFFT input. In [139] a dynamic resource allocation scheme for OFDMA-based wireless broadband networks is presented. The authors formulate the problem of maximizing the total packet throughput subject to an outage probability constraint for each user, assuming a finite buffer for the arriving packets, and dynamically allocate the radio resource based on the users' channel characteristics, traffic patterns, and QoS requirements. A Lagrangian relaxation algorithm is first introduced in [140] to minimize the total power consumption under constraints on the transmission rate for users requiring different classes of service. Linear programming is used in [141] to solve the subcarrier allocation problem by linearizing the rate as a function of power.

Section 2.6 briefly described OFDMA and channel allocation in IEEE 802.16a. In summary, a fixed subset of subcarriers in consecutive time slots is assigned to each user according to a static assignment [5]. The network has to maintain a required Quality of Service (QoS) for each user, which is not necessarily in line with maximizing the total throughput of the network. The main goal of this chapter is to allocate a subset of subcarriers to each user such that the QoS is satisfied for each user and at the same time the overall network throughput is maximized. In order to increase resource utilization in the network while providing QoS, a scheduler has to adjust the allocation of subcarriers to users based on user demands and channel conditions. The scheduler has to achieve this with reasonable performance and low complexity.
To this end, we use the concept of revenue maximization in an OFDMA system, in which the allocation of subcarriers to different users is performed in such a way that the overall network revenue is maximized. As a result, throughput is maximized while QoS is maintained above the required level.

This chapter is divided into two parts. In the first part (Section 6.2) we present SLA-based scheduling in wireless networks. The system model is introduced in Subsection 6.2.1. Subsections 6.2.2 and 6.2.3 describe two extreme cases, where only QoS or only throughput is the objective, and Subsection 6.2.4 describes our proposed trade-off between QoS and throughput through the notion of "income maximization". The second part covers the application of income maximization to OFDMA channel allocation (Section 6.3). The OFDMA system model is presented in Subsection 6.3.1, and different scheduling algorithms are presented in Subsection 6.3.2. In Section 6.4 the performance of the proposed algorithms, for both TDMA and OFDMA, is compared through numerical studies. Finally, Section 6.5 summarizes the chapter.

6.2 SLA-Based Scheduling for Wireless Networks

In the first part of this chapter, we propose our framework for SLA-based scheduling. This scheme can be applied to any multiuser scheme such as CDMA, TDMA, and OFDMA. In the second part of this chapter, we apply this framework to OFDMA.

6.2.1 System Model

In the first part, a single-cell wireless network is investigated. We consider a time-slotted system, where time is the resource to be shared among all mobile users by a central processor. At any given time, only one user can be scheduled to occupy a given channel within a cell. The scheduling algorithm decides which user each time slot is assigned to.
On the downlink, a separate queue is assigned to each mobile user at the base station, and we assume that the scheduler at the base station has full knowledge of the status of the queues. The block diagram of this system is shown in Figure 6.1.

Figure 6.1: System block diagram

Let $r_n$ denote the effective bandwidth (rate) reserved by user $n$ ($n = 1, 2, \ldots, N$), which is a fraction of the total available bandwidth ($0 \le r_n \le 1$). In the fluid model, user $n$ expects to receive a fraction $r_n$ of a time slot. However, in this work we do not consider the fluid model and assume that a time slot is assigned to only one user. Define $Y_n(t)$ as:

\[ Y_n(t) = \begin{cases} 1 & \text{if the scheduler selects user } n \text{ at time } t, \\ 0 & \text{otherwise.} \end{cases} \tag{6.4} \]

Also, assume that the indicator $I_n(t)$ is defined as:

\[ I_n(t) = \begin{cases} 1 & \text{if the queue of user } n \text{ is non-empty at time } t, \\ 0 & \text{otherwise.} \end{cases} \tag{6.5} \]

We assume that the link between each user and the base station is a wireless fading channel. In a power-controlled system, the average power in each link is maintained at a fixed level and the instantaneous power follows a Rayleigh fading distribution. The signal-to-noise ratio (SNR) for the $n$th user is a function of the received power $P_n$ and the noise power $N_n$. The capacity per unit bandwidth for this user, $B_n$, is given by

\[ B_n = \log_2(1 + P_n / N_n). \tag{6.6} \]

Since the thermal noise at each receiver is fixed, the SNR at each user follows a Rayleigh distribution. We assume that the link capacity is quantized to a limited number of levels. Let the channel capacity for user $n$ at time $t$ be denoted by $g_n(t)$, which is a fractional number. Therefore, the service received by user $n$ at time $t$ is $g_n(t) Y_n(t)$.

Next, we consider two scheduling algorithms. The first one is designed to support QoS, and the other one provides high network throughput.
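The capacity expression (6.6) and the quantization of the link capacity can be sketched as follows. The text does not specify the quantization rule, so rounding down to the largest supported level is an assumption made here for illustration, as are the function names and the example levels.

```python
import math


def capacity_per_hz(received_power, noise_power):
    """Capacity per unit bandwidth B_n from (6.6), in bit/s/Hz."""
    return math.log2(1.0 + received_power / noise_power)


def quantize(capacity, levels):
    """Largest level not exceeding the capacity (assumed rule); 0.0 if none."""
    feasible = [level for level in levels if level <= capacity]
    return max(feasible, default=0.0)


# An SNR of 3 (about 4.8 dB) gives log2(4) = 2 bit/s/Hz.
b = capacity_per_hz(received_power=3.0, noise_power=1.0)

# Hypothetical set of supported capacity levels.
g = quantize(2.7, levels=[0.5, 1.0, 2.0, 3.0])
```

Rounding down keeps the scheduled rate achievable on the link, at the cost of leaving some capacity unused.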
6.2.2 Maximum Credit Scheduling (MCS)

In order to support QoS, a scheduler monitors and allocates resources in such a way that the effective rates of the users stay within a satisfactory range. A credit-based mechanism can be used to measure and control the service provided to each user: user $n$ is assigned a credit, denoted by $C_n(t)$ ($n = 1, 2, \ldots, N$). The credit of a user represents how much service the network owes to that user. The credit for user $n$ at time $t$ evolves as follows:

\[ C_n(t) = C_n(t-1) + I_n(t) r_n - g_n(t) Y_n(t). \tag{6.7} \]

The second term on the right-hand side of the above equation, $I_n(t) r_n$, represents the service reserved by user $n$. If the $n$th queue is non-empty, this term is the requested service. The third term represents the service received by user $n$. Starting from $C_n(-1) = 0$ for all users, it is straightforward to see by induction that (6.7) leads to the following non-recursive expression for the credits:

\[ C_n(t) = r_n \sum_{s=0}^{t} I_n(s) - \sum_{s=0}^{t} g_n(s) Y_n(s). \tag{6.8} \]

In the above equation, the first and second terms are the service reserved and the service received by user $n$ up to time $t$, respectively. A negative credit means that a user has received better service than it reserved. On the other hand, a positive credit implies that the network owes service to the user. Therefore, the credit is a measure of how much QoS has been delivered or how much the network owes to a user. To support QoS, a scheduler must keep the credits of all users as small as possible. In this case, for users with non-empty queues, the service delivered to each user is close to its reserved service.
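The equivalence between the recursion (6.7) and the closed form (6.8) is easy to check numerically. The sketch below uses hypothetical function names and an arbitrary four-slot trace for a single user; only the two update rules come from the text.

```python
def update_credit(prev_credit, reserved_rate, capacity, selected, backlogged):
    """One step of the credit recursion (6.7) for a single user."""
    reserved = reserved_rate if backlogged else 0.0   # I_n(t) r_n
    received = capacity if selected else 0.0          # g_n(t) Y_n(t)
    return prev_credit + reserved - received


def closed_form_credit(reserved_rate, backlog_flags, capacities, selections):
    """Credit after the last slot, computed from (6.8)."""
    reserved = reserved_rate * sum(backlog_flags)
    received = sum(g * y for g, y in zip(capacities, selections))
    return reserved - received


# Assumed example trace: I_n(t), g_n(t), and Y_n(t) for t = 0..3.
r = 0.25
backlog = [1, 1, 0, 1]
caps = [0.5, 1.0, 0.75, 0.5]
picks = [0, 1, 0, 1]

credit = 0.0  # C_n(-1) = 0
for i_t, g_t, y_t in zip(backlog, caps, picks):
    credit = update_credit(credit, r, g_t, bool(y_t), bool(i_t))
```

After the loop, `credit` is negative: the user was served in two good slots and has received more than its reservation, which matches the interpretation of a negative credit given above.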
In order to minimize user credits, a maximum credit scheduler (MCS) assigns the available bandwidth to the user with the maximum credit [142]:

\[ Y_n(t) = \begin{cases} 1 & \text{if } n = \arg\max_k \{C_k(t)\}, \\ 0 & \text{otherwise.} \end{cases} \tag{6.9} \]

Since this scheduling is based on the credit values at time $t$, which are independent of the channel capacities at time $t$, the total system throughput with this algorithm is equal to the average channel capacity, i.e., $E(g_n(t))$.

6.2.3 Maximum Throughput Scheduling (MTS)

It has been proved that, to maximize the network throughput, a scheduler should select the user with the best capacity, or the lowest fading, among all users [143], i.e.,

\[ Y_n(t) = \begin{cases} 1 & \text{if } n = \arg\max_k \{g_k(t)\}, \\ 0 & \text{otherwise.} \end{cases} \tag{6.10} \]

The total throughput of this algorithm is equal to the expectation of the maximum channel capacity, i.e., $E(\max_n g_n(t))$. However, this algorithm has no mechanism for supporting QoS.

6.2.4 A Trade-off: SLA-Based Scheduling Algorithms

In MTS, a user that is trapped in a bad channel state does not receive service as long as its channel stays in that state. For this user, the QoS or SLA is not satisfied. Thus, supporting QoS and maximizing network throughput cannot necessarily be achieved at the same time.

MCS does not face this trade-off in wired networks, since in those networks the channel responses of all users are equally good, i.e., $g_n(t) = 1$ for all $n, t$. Therefore, by selecting the user with the highest credit, the scheduler keeps the credits as small as possible. However, in wireless networks, attempting to support QoS for users with a bad channel response may reduce network throughput without guaranteeing QoS for those users. A more efficient scheduler may ignore users that are most eligible for service on QoS grounds but are in deep fade, in favor of users with a better channel response, in the hope that the capacity of the ignored users will improve in the future.
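The two baseline rules, (6.9) for MCS and (6.10) for MTS, can be contrasted in a few lines. The function names and the example numbers are hypothetical, and ties are broken in favor of the lowest user index, which is an assumption the text does not address.

```python
def mcs_select(credits):
    """Maximum credit scheduling (6.9): index of the user with the largest credit."""
    return max(range(len(credits)), key=lambda k: credits[k])


def mts_select(capacities):
    """Maximum throughput scheduling (6.10): index of the user with the best channel."""
    return max(range(len(capacities)), key=lambda k: capacities[k])


# Assumed snapshot of one time slot for three users.
credits = [0.2, 1.5, -0.3]      # user 1 is owed the most service
capacities = [0.9, 0.1, 0.4]    # user 1 is also in a deep fade

mcs_user = mcs_select(credits)
mts_user = mts_select(capacities)
```

In this snapshot the two rules disagree: MCS spends the slot on a user in deep fade, while MTS serves the best channel and lets the owed credit grow, which is the tension the SLA-based schedulers are meant to resolve.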
In this work, we study schedulers that provide QoS and at the same time maximize network throughput.

We propose scheduling algorithms that resolve the trade-off between throughput and QoS based on the SLAs of the users. An SLA includes the QoS, the pricing for the service provided, and the penalty when the agreement is violated. Let $d_n(t)$ denote the income of the network from user $n$ at time $t$. Also, let us assume that for the service received by user $n$, i.e., $g_n(t) Y_n(t)$, the network charges user $n$ the amount $\alpha_n g_n(t) Y_n(t)$, where $\alpha_n$ is the rate that the $n$th user pays for the service. On the other hand, if user $n$ has not received its requested QoS, the credit assigned to it increases, and the network is penalized by $f_n[C_n(t)]$. We assume that $f_n[\cdot]$ is a real, positive, and continuous function with $f_n[x] = 0$ for $x \le 0$. Both the function $f_n[\cdot]$ and $\alpha_n$ are defined through the SLA between the network and user $n$. Then, we obtain:

\[ d_n(t) = \alpha_n g_n(t) Y_n(t) - I_n(t) f_n[C_n(t)] = \alpha_n g_n(t) Y_n(t) - I_n(t) f_n[C_n(t-1) + r_n - g_n(t) Y_n(t)]. \]

The total income of the network at time $t$ is given by

\[ D(t) = \sum_{n=1}^{N} d_n(t), \]

where $N$ is the total number of users. An SLA-based scheduler selects the user that increases the total income.

The penalty function plays a significant role in the performance of an SLA-based algorithm. This function is chosen in such a way that a user with negative credit does not penalize the system, since this user has received its requested QoS; therefore, $f_n[x] = 0$ for $x \le 0$. Also, if a user has accumulated a large credit, i.e., has received poor QoS, it might be beneficial to drop this connection and pay the corresponding penalty. Moreover, we expect the increase in penalty to be more significant for high credits. This means that $f[\cdot]$ needs to be convex. We will see that for some special cases, the convexity of $f$ is necessary. The effect of the penalty function is examined further in the simulation results.
One special example of $f_n[\cdot]$ is:

\[ f_n[x] = \begin{cases} \gamma_n x^2 & \text{if } x > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{6.11} \]

where $\gamma_n$ is a positive number.

6.2.5 Maximum Income Greedy Scheduling: A Suboptimal Solution

The maximum income greedy scheduling (MIGS) algorithm selects the user that maximizes the total system income at each time slot $t$:

\[ D(t) = \sum_{n=1}^{N} d_n(t) = \sum_{n=1}^{N} \alpha_n g_n(t) Y_n(t) - \sum_{n=1}^{N} I_n(t) f_n[C_n(t)]. \]

Without loss of generality, from now on we assume that all users have non-empty queues, i.e., $I_n(t) = 1$. The following theorem summarizes the MIGS algorithm.

Theorem 6.2.1. The maximum income greedy scheduling algorithm selects the user that maximizes the following quantity over all the users with non-empty queues:

\[ H_p(t) = \alpha_p g_p(t) + f_p[C_p(t-1) + r_p] - f_p[C_p(t-1) + r_p - g_p(t)]. \tag{6.12} \]

Proof. If user $k$ is selected at time $t$, then $Y_k(t) = 1$ and $Y_n(t) = 0$ for $n \ne k$. We denote the total income at time $t$ if user $k$ is selected by $D_k(t)$, i.e.:

\[ D_k(t) = \alpha_k g_k(t) - \sum_{\substack{n=1 \\ n \ne k}}^{N} f_n[C_n(t-1) + r_n] - f_k[C_k(t-1) + r_k - g_k(t)]. \tag{6.13} \]

MIGS selects the user $p$ that maximizes the total income:

\[ p = \arg\max_k \{D_k(t)\}. \tag{6.14} \]

That is, $Y_p(t) = 1$, $Y_k(t) = 0$ for $k \ne p$, and $D_p(t) \ge D_k(t)$ for $k \ne p$. Therefore, we get

\[ \alpha_p g_p(t) - \sum_{\substack{n=1 \\ n \ne p}}^{N} f_n[C_n(t-1) + r_n] - f_p[C_p(t-1) + r_p - g_p(t)] \ge \alpha_k g_k(t) - \sum_{\substack{n=1 \\ n \ne k}}^{N} f_n[C_n(t-1) + r_n] - f_k[C_k(t-1) + r_k - g_k(t)]. \]

Adding $\sum_{n=1}^{N} f_n[C_n(t-1) + r_n]$ to both sides, we obtain

\[ \alpha_p g_p(t) + f_p[C_p(t-1) + r_p] - f_p[C_p(t-1) + r_p - g_p(t)] \ge \alpha_k g_k(t) + f_k[C_k(t-1) + r_k] - f_k[C_k(t-1) + r_k - g_k(t)]. \tag{6.15} \]

Therefore, the scheduling algorithm that maximizes the total income at time $t$ selects the user with the largest value of the metric that appears on both sides of the above inequality, which is (6.12).

Figure 6.2: Metric generator block diagram for MIGS

Figure 6.2 displays the metric generator block. This block receives $\alpha_n$, $f_n[\cdot]$, and $r_n$ from the customer SLA and $g_n(t)$ from the channel estimator, and generates $H_n(t)$. The state variable of this block is the user credit, $C_n(t)$. The state variable is updated by the feedback circuit shown in Figure 6.2.
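A minimal sketch of the MIGS decision follows, using the quadratic penalty (6.11) and the metric (6.12), together with a brute-force evaluation of the total income (6.13) to confirm that both pick the same user. All function names and the example numbers are hypothetical, and every queue is assumed non-empty, as in the theorem.

```python
def quadratic_penalty(x, gamma=1.0):
    """Penalty (6.11): gamma * x^2 for positive credit, zero otherwise."""
    return gamma * x * x if x > 0 else 0.0


def migs_metric(price, capacity, prev_credit, reserved_rate,
                penalty=quadratic_penalty):
    """Per-user metric H_p(t) from (6.12)."""
    owed = prev_credit + reserved_rate           # credit if the user is skipped
    return price * capacity + penalty(owed) - penalty(owed - capacity)


def total_income(selected, prices, capacities, prev_credits, reserved_rates,
                 penalty=quadratic_penalty):
    """Total income D_k(t) from (6.13) when user `selected` is served."""
    income = prices[selected] * capacities[selected]
    for n in range(len(prices)):
        served = capacities[n] if n == selected else 0.0
        income -= penalty(prev_credits[n] + reserved_rates[n] - served)
    return income


# Assumed snapshot: user 0 has the best channel, user 1 is owed the most.
prices = [1.0, 1.0, 1.0]
capacities = [1.0, 0.5, 0.25]
prev_credits = [0.0, 2.0, 0.0]
reserved_rates = [0.25, 0.25, 0.25]

metrics = [migs_metric(a, g, c, r) for a, g, c, r
           in zip(prices, capacities, prev_credits, reserved_rates)]
best = max(range(len(metrics)), key=metrics.__getitem__)

incomes = [total_income(k, prices, capacities, prev_credits, reserved_rates)
           for k in range(len(prices))]
best_by_income = max(range(len(incomes)), key=incomes.__getitem__)
```

The metric needs only the parameters of the user in question, whereas the brute-force income touches every user; the two differ by a constant that is the same for all candidates, which is why they rank the users identically.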
As illustrated in this figure, the decision of the scheduler, $Y_n(t)$, contributes to the update of the credit, $C_n(t)$. It is interesting to notice that this module depends only on the parameters associated with user $n$ and is independent of the other users.

Figure 6.3 shows the SLA scheduler. As shown in this figure, the SLA scheduler is simply a maximization that selects the user with the highest metric, $H_n(t)$. The output of the scheduler is the binary vector $\{Y_n(t)\}_{n=1}^{N}$ that decides which user is selected in this time slot.

Figure 6.3: SLA scheduler block diagram

Special Cases

In the following, we consider two extreme cases: in one case only the system throughput is important, and in the other case only the QoS matters. We will show that the scheduling algorithm described in this section reduces to each of the two cases, so this algorithm addresses the throughput-QoS trade-off.

Case I: Deep Fading: Let us assume that at time $t$, the $k$th user is in deep fade (i.e., $g_k(t) = \epsilon \ll 1$). Intuitively, we expect this user not to be scheduled, because scheduling this user results in bandwidth loss at time slot $t$. Moreover, the small allocated rate $g_k(t)$ is not large enough to support the QoS for this user. Because of the continuity of the function $f_k[\cdot]$, we can write

\[ f_k[C_k(t-1) + r_k] - f_k[C_k(t-1) + r_k - g_k(t)] = \delta, \]

for a small positive $\delta$. Thus, the metric associated with this user is:

\[ H_k(t) = \alpha_k g_k(t) + f_k[C_k(t-1) + r_k] - f_k[C_k(t-1) + r_k - g_k(t)] = \alpha_k \epsilon + \delta, \]

which is a small positive number. Therefore, a user in deep fade has a small metric and is not expected to be scheduled.

Case II: Maximum Throughput Scheduling (MTS): Let us assume that the system is not penalized for not supporting QoS, i.e., $f_k[\cdot] = 0$. Also, we assume that the system charges different users at the same rate, i.e., $\alpha_k = \alpha$.
As we discussed in Section 6.2.3, in this case the system tries to maximize the system throughput and the scheduler selects the user with the best capacity. With these assumptions, the optimal user is selected as follows:

\[ p = \arg\max_k \{\alpha g_k(t)\} \equiv \arg\max_k \{g_k(t)\}. \tag{6.16} \]

Hence, if QoS is not an issue, our SLA-based scheduling algorithm reduces to maximum throughput scheduling (MTS).

Case III: Minimum Penalty Scheduling (MPS): Here, let us assume that the only goal of the system is to deliver QoS to the users, and that the system throughput is not important. In this case $\alpha_n = 0$ for all users. Also, we assume that the penalty function is the same for all users, i.e., $f_k[\cdot] \equiv f[\cdot]$. Then, the SLA-based scheduling rule becomes:

\[ p = \arg\max_k \{f[C_k(t-1) + r_k] - f[C_k(t-1) + r_k - g_k(t)]\}. \tag{6.17} \]

We expect this scheduling algorithm (which we call minimum penalty scheduling, MPS) to support QoS in wireless networks.

Now, let us assume that $g_n(t) = 1$, as in wireline networks. In this case, the MPS decision is obtained by

\[ p = \arg\max_k \{f[C_k(t-1) + r_k] - f[C_k(t-1) + r_k - 1]\}. \tag{6.18} \]

We now show that, in this special case, if the penalty function $f[\cdot]$ is a positive, continuous, increasing, and convex function, then minimum penalty scheduling (MPS) is equivalent to MCS. To this end, we start with the following lemma.

Lemma 6.2.2. If $f(\cdot)$ is a convex function, then for any $x \ge z \ge y \ge w$ with $x + w = z + y$, we have:

\[ f(z) + f(y) \le f(x) + f(w). \]

Proof. It is obvious that

\[ 0 \le a \triangleq \frac{z - y}{z - w} \le 1 \;\Rightarrow\; y = a w + (1 - a) z. \]

Moreover, because of the convexity of $f[\cdot]$:

\[ f(y) \le \frac{z - y}{z - w} f(w) + \left(1 - \frac{z - y}{z - w}\right) f(z). \tag{6.19} \]

Also,

\[ 0 \le b \triangleq \frac{z - y}{x - y} \le 1 \;\Rightarrow\; z = b x + (1 - b) y. \]

However,

\[ x + w = z + y \;\Rightarrow\; a = b \;\Rightarrow\; f(z) \le \frac{z - y}{z - w} f(x) + \left(1 - \frac{z - y}{z - w}\right) f(y). \tag{6.20} \]

Adding (6.19) and (6.20) yields:

\[ f(z) + f(y) \le \frac{z - y}{z - w} f(x) + \frac{z - y}{z - w} f(w) + \left(1 - \frac{z - y}{z - w}\right) f(z) + \left(1 - \frac{z - y}{z - w}\right) f(y) \;\Rightarrow\; f(z) + f(y) \le f(x) + f(w). \]

Corollary 6.2.3. For a convex function $f[\cdot]$ and $x \ge y$, we have:

\[ f(y) - f(y - \delta) \le f(x) - f(x - \delta). \]

Proof. Whether $x - \delta \ge y$ or $x - \delta < y$, using the previous lemma we have:

\[ f(x - \delta) + f(y) \le f(x) + f(y - \delta) \;\Rightarrow\; f(y) - f(y - \delta) \le f(x) - f(x - \delta). \]

We now prove the main theorem.

Theorem 6.2.4. If $f[\cdot]$ is a positive, continuous, increasing, and convex function, then minimum penalty scheduling (MPS) with $g_n(t) = 1$ is similar to MCS (described in Section 6.2.2), as follows:

\[ p = \arg\max_k \{C_k(t-1) + r_k\}. \]

Proof. Under the wireline assumption, MPS is based on:

\[ p = \arg\max_k \{f[C_k(t-1) + r_k] - f[C_k(t-1) + r_k - g_k(t)]\}, \tag{6.21} \]

with $g_k(t) = 1$. Using the previous corollary, we have:

\[ p = \arg\max_k \{C_k(t-1) + r_k\}, \]

which is similar to MCS.

6.2.6 Maximum Income Dynamic Programming Scheduling (MIDPS): Optimal Solution

Background: Dynamic Programming

In this section, we review the concept of Dynamic Programming (DP) and mention some of its relevant results. DP is used in this work to find the long-term optimal scheduling policy. DP problems fall into two main categories: Finite Horizon (FH) and Infinite Horizon (IH). We first summarize DP for FH and then move to IH.

Let $\{X_k\}_{k=0}^{N-1}$ be a discrete-time process taking values in a countable state space $S_k$, which can be denoted by a set of non-negative integers $\{0, 1, 2, \ldots, K-1\}$. At each time instant $k \in \{0, 1, 2, \ldots, N-1\}$, we are required to choose an "action" $\mu_k \in M$, where $M$ is the given set of all possible actions. The dynamics of the system can be stated as

\[ X_{k+1} = f_k(X_k, u_k, w_k), \tag{6.22} \]

where $u_k$ is a control element taken from a space $C_k$ that depends on the present state, and the random noise (disturbance) $w_k$ is taken from another space $D_k$ and is characterized by a probability distribution $P_k(\cdot \mid X_k, u_k)$. This distribution depends on $X_k$ and $u_k$, but not on the previous values $w_0, w_1, \ldots, w_{k-1}$ of this random variable. We define a policy, $\pi$, as a rule for choosing the sequence of actions $\mu_k$, for $k \in \{0, 1, 2, \ldots, N-1\}$. The policy is called admissible if each action $\mu_k(X_k)$ belongs to the admissible set of control elements.
If at time $k$ we are in state $X_k$ and choose an action $\mu_k$, a cost of $g_k(X_k, \mu_k, w_k)$ is incurred. The cost function $g(\cdot)$ is a mapping $\{0, 1, 2, \dots, K-1\} \times M \to \mathbb{R}$, where $\mathbb{R}$ denotes the set of real numbers. Given the initial state $X_0$, we want to find the optimal policy $\pi^*$ that minimizes the overall cost function, i.e.,
\[ J_{\pi^*}(X_0) = J^*(X_0) = \min_{\pi \in \Pi} \; E_{w_k,\, k=0,1,\dots,N-1} \left\{ g_N(X_N) + \sum_{k=0}^{N-1} g_k\big(X_k, \mu_k(X_k), w_k\big) \right\}. \tag{6.23} \]
$J^*(\cdot)$ is a function that assigns an optimal cost to each initial state, and is called the optimal cost function or optimal value function.

The theory of dynamic programming uses the principle of optimality, which states that if at time instant $i$ we are at state $X_i$, and $\pi^* = \{\mu_0^*, \mu_1^*, \dots, \mu_{N-1}^*\}$ is the optimal policy, then the truncated policy $\{\mu_i^*, \mu_{i+1}^*, \dots, \mu_{N-1}^*\}$ is optimal in the sense that it minimizes the "cost-to-go" from time $i$ to time $N$, obtained from (6.23) by summing the cost per step from $k = i$ to $N-1$ rather than from $k = 0$ to $N-1$.

The DP algorithm is stated as follows [144]:

Theorem 6.2.5. For any initial state $X_0$, the optimal cost $J^*(X_0)$ is equal to $J_0(X_0)$, the cost of the last step of the backward recursion
\[ J_N(X_N) = g_N(X_N), \qquad J_k(X_k) = \min_{u_k} E_{w_k} \big\{ g_k(X_k, u_k, w_k) + J_{k+1}\big(f_k(X_k, u_k, w_k)\big) \big\}. \tag{6.24} \]
Moreover, if $u_k^* = \mu_k^*(X_k)$ minimizes the right-hand side of (6.24) for each $X_k$ and $k$, the policy $\pi^* = \{\mu_0^*, \mu_1^*, \dots, \mu_{N-1}^*\}$ is optimal.

The IH problem differs from the FH problem in that the number of stages is not finite (instead of $N$ we have $\infty$), and the system is stationary; that is, the noise distribution, the cost per stage, and the dynamics of the system do not depend on $k$ (they do not change with time). In other words, the action taken at time $k$ depends only on the state of the process at time $k$. Therefore, a stationary policy is, in effect, a mapping $\mu$ from the state space to the action space; i.e., given the current state, we can determine the current action uniquely.
A stationary controlled process of this kind is called a Markov Decision Process (MDP) [144-146]. A stationary policy can be considered admissible if it is of the form $\{\mu, \mu, \dots, \mu\}$. IH problems can be classified into four principal classes, of which we consider one in this chapter: "discounted problems with bounded cost per stage". If the state space is finite or countable, then by a theorem in real analysis [147], it is known that any mapping defined over such a finite or countable space is bounded.

Given the evolution of the process $\{X_k\}_{k=0}^{\infty}$, dynamic programming tries to choose $\{\mu_k\}_{k=0}^{\infty}$ such that
\[ J_\pi(i) \triangleq E_i^\pi \left[ \sum_{k=0}^{\infty} \beta^k g(X_k, \mu_k) \right] \tag{6.25} \]
is minimized. Here $E_i^\pi$ denotes the expectation under the policy $\pi$, $X_0 = i$ is the initial state, and $\beta > 0$ is the discount factor. The above cost reflects the fact that, while choosing the action $\mu_k$ at time $k$, we would like to take into account the effect of this action on the future. How far into the future we wish to look before taking any action can be controlled by choosing an appropriate value for $\beta$. If $\beta < 1$, the use of the discount factor is motivated by the fact that a cost to be incurred in the future is less important than one incurred at the present time instant. In general, if no restrictions are placed on the set of allowable policies, the action could in principle depend on the history of the process up to and including the present time. To ensure the existence of the expected infinite-horizon discounted cost, it suffices to have a uniformly bounded cost function $g(\cdot)$ and $0 < \beta < 1$. Note that, for stationary policies, the current action is determined uniquely by the current state, regardless of the time instant. The optimal discounted cost is then
\[ J(i) = \min_\pi J_\pi(i), \quad i \ge 0, \tag{6.26} \]
assuming that the minimum exists, where the minimization is over all policies $\pi$. Let $\pi^*$ denote an optimal policy that achieves the minimum in (6.26). Then
\[ J_{\pi^*}(i) = J(i), \quad i \ge 0. \tag{6.27} \]

The main result for the discounted infinite-horizon DP problem is the following [144,145]:

Theorem 6.2.6. If $p_{ij}(u)$ is the probability of going from state $i$ to state $j$ when the action $u$ is taken, then the sequence
\[ J_{k+1}(i) = \min_{u \in U(i)} \left\{ g(i,u) + \beta \sum_{j=0}^{K-1} p_{ij}(u) J_k(j) \right\}, \quad i = 0, 1, \dots, K-1, \tag{6.28} \]
converges to the optimal costs $J^*(i)$ for all $i$, starting from arbitrary initial conditions $J_0(0), J_0(1), \dots, J_0(K-1)$. Moreover, the optimal costs $J^*(0), J^*(1), \dots, J^*(K-1)$ uniquely satisfy Bellman's equations:
\[ J^*(i) = \min_{u \in U(i)} \left\{ g(i,u) + \beta \sum_{j=0}^{K-1} p_{ij}(u) J^*(j) \right\}, \quad i = 0, 1, \dots, K-1. \tag{6.29} \]

The algorithm defined by the iterative equation (6.28) is called value iteration. In principle it requires an infinite number of iterations; in practical applications, however, it is terminated after a finite number of steps, once the change in the cost value falls below a threshold. A similar iteration applies to a fixed stationary policy $\mu$, whose costs $J_\mu(0), \dots, J_\mu(K-1)$ are the unique solutions of the equation
\[ J_\mu(i) = g(i, \mu(i)) + \beta \sum_{j=0}^{K-1} p_{ij}(\mu(i)) J_\mu(j), \quad i = 0, 1, \dots, K-1. \tag{6.30} \]
Furthermore, given any initial condition, the sequence $J_k(i)$ generated by the DP iteration
\[ J_{k+1}(i) = g(i, \mu(i)) + \beta \sum_{j=0}^{K-1} p_{ij}(\mu(i)) J_k(j), \quad i = 0, 1, \dots, K-1, \tag{6.31} \]
converges to the cost $J_\mu(i)$ of that policy for each $i$.

An alternative to value iteration is policy iteration, which starts from a stationary policy $\mu^0$ and generates a sequence of new policies $\mu^1, \mu^2, \dots$. Each policy $\mu^k$ is evaluated through (6.30), and the policy improvement step is given by
\[ \mu^{k+1}(i) = \arg\min_{u \in U(i)} \left\{ g(i,u) + \beta \sum_{j=0}^{K-1} p_{ij}(u) J_{\mu^k}(j) \right\}, \quad i = 0, 1, \dots, K-1. \tag{6.32} \]
It is proven in [144] that policy iteration produces an improving sequence of policies and terminates with an optimal policy.

MIDPS

The algorithms presented in the previous sections maximize the total income locally. In this section, the objective is to maximize the system income globally. To do so, dynamic programming algorithms are used to account for the future when making decisions at the present time.
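As an illustration of the value iteration (6.28) on which this approach relies, the following minimal Python sketch runs the recursion on a toy two-state problem. The array layout (`g[i, u]`, `P[u, i, j]`), the stopping tolerance, and the toy numbers are assumptions made for this example only; they are not taken from the simulations in this chapter.

```python
import numpy as np

def value_iteration(g, P, beta, tol=1e-9, max_iter=10_000):
    """Value iteration (6.28) for a discounted, cost-minimizing MDP.

    g[i, u]    : cost per stage g(i, u) in state i under action u
    P[u, i, j] : transition probability p_ij(u)
    beta       : discount factor, 0 < beta < 1
    Returns approximate optimal costs and a greedy stationary policy.
    """
    J = np.zeros(g.shape[0])                     # arbitrary initial condition J_0
    for _ in range(max_iter):
        # Q[i, u] = g(i, u) + beta * sum_j p_ij(u) * J_k(j)
        Q = g + beta * np.einsum("uij,j->iu", P, J)
        J_next = Q.min(axis=1)
        done = np.max(np.abs(J_next - J)) < tol  # practical stopping rule
        J = J_next
        if done:
            break
    Q = g + beta * np.einsum("uij,j->iu", P, J)
    return J, Q.argmin(axis=1)

# Toy example (hypothetical numbers): action 0 stays, action 1 switches state.
g = np.array([[1.0, 0.5],                   # state 0: staying costs 1.0, switching 0.5
              [0.0, 0.5]])                  # state 1: staying is free
P = np.array([[[1.0, 0.0], [0.0, 1.0]],     # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])    # action 1: switch
J_opt, policy = value_iteration(g, P, beta=0.9)
# Here J_opt is approximately [0.5, 0.0] and policy is [1, 0]:
# switch out of state 0 once, then stay in state 1.
```

The sketch minimizes a cost, as in Theorem 6.2.6; an income-maximizing scheduler would use the same recursion with `max` and `argmax` in place of `min` and `argmin`.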
In this framework, the optimization can be done within a finite horizon or an infinite horizon [144]. We focus on the infinite-horizon problem, since it provides a steady-state policy that is independent of time. Define the expected total income as follows:
\[ D = E \left\{ \sum_{t=0}^{\infty} \beta^t \left[ \sum_{n=1}^{N} D_n(t) \right] \right\}, \]
where $0 < \beta \le 1$ is the discount factor that keeps the total income bounded. For simplicity, we assume that $I_n(t) = 1$ for all users. Then the credit update equation is given by
\[ C_n(t+1) = C_n(t) + r_n - g_n(t) Y_n(t). \]
Let us define
\[ C(t) = [C_1(t), C_2(t), \dots, C_N(t)]^T, \quad Y(t) = [Y_1(t), Y_2(t), \dots, Y_N(t)]^T, \quad g(t) = [g_1(t), g_2(t), \dots, g_N(t)]^T, \]
\[ r = [r_1, r_2, \dots, r_N]^T, \quad \alpha = [\alpha_1, \alpha_2, \dots, \alpha_N]^T. \]
Then
\[ C(t+1) = C(t) + r - g(t) \odot Y(t), \]
where $\odot$ denotes the Hadamard product. In the infinite-horizon case, we maximize the total system income as follows:
\[ \max_Y \; E_Y \left\{ \sum_{t=0}^{\infty} \beta^t \left[ \sum_{n=1}^{N} \big\{ \alpha_n g_n(t) Y_n(t) - f_n[C_n(t+1)] \big\} \right] \right\}. \]
Now, we define
\[ G(C(t), Y(t), g(t)) \triangleq \sum_{n=1}^{N} \big\{ \alpha_n g_n(t) Y_n(t) - f_n[C_n(t) + r_n - g_n(t) Y_n(t)] \big\}. \]
Let $S_t$ denote the set of vectors $X(t) = (C(t), g(t))$ at time $t$. If at time $t = 0$ the system state is $C(0) = C^i$ and the capacity vector is $g$, then we have
\[ J^*(C^i, g) \triangleq \max_Y \; E \left\{ \sum_{t=0}^{\infty} \beta^t G(C(t), Y(t), g(t)) \;\middle|\; C(0) = C^i,\; g(0) = g \right\}. \]
We would like to obtain the optimal policy $Y^*(t) = \mu^*(C(t), g(t))$ that maximizes the above income function. Note that the scheduler knows the channel capacities at the decision time, and therefore the channel capacities are part of the state vector. We can rewrite the optimal income in the form of Bellman's equation for the discounted infinite-horizon problem [144] as follows:
\[ J^*(C^i, g) = \max_Y \Big\{ G(C^i, Y, g) + \beta\, E\big[ J^*\big(C^i + r - g \odot Y,\; g(t+1)\big) \big] \Big\}. \]
If we denote the probability of $g(t+1) = g^k$ by $\hat{p}_{g^k}$, we obtain
\[ J^*(C^i, g) = \max_Y \left\{ G(C^i, Y, g) + \beta \sum_k \hat{p}_{g^k} J^*\big(C^i + r - g \odot Y,\; g^k\big) \right\}. \]
If the system state is $C^i$ and the capacity vector is $g$, the scheduler selects the user that maximizes
\[ \max_Y \left\{ \sum_{n=1}^{N} \big\{ \alpha_n g_n Y_n - f_n[C_n^i + r_n - g_n Y_n] \big\} + \beta \sum_k \hat{p}_{g^k} J^*\big(C^i + r - g \odot Y,\; g^k\big) \right\}. \tag{6.33} \]
The first two terms represent the current income (as seen in the greedy algorithm, MIGS) and the third term represents the income-to-go. In a similar fashion as in Theorem 6.2.1, it can be shown that this maximization is equivalent to selecting the user $n$ in the following maximization problem:
\[ \max_n \left\{ \alpha_n g_n + f_n[C_n^i + r_n] - f_n[C_n^i + r_n - g_n] + \beta \sum_k \hat{p}_{g^k} J^*\!\left( C^i + r - \begin{bmatrix} 0 \\ \vdots \\ g_n \\ \vdots \\ 0 \end{bmatrix},\; g^k \right) \right\}. \]

6.2.7 System Admission Control Policy, and Pricing

The admission control policy in a network controls the network load, by controlling either the number of users or the load of the users, in such a way that a criterion is met. For example, the performance criterion could be either QoS or total income. Obviously, for a large network load, supporting QoS is a challenge, and the network prefers to deal with lower network loads so as not to pay too much penalty for QoS violation. One reasonable admission control scheme in our system is to keep the network load below a threshold such that the relative assigned rate stays above some predetermined factor, e.g., 0.95. Another scheme is to keep the network load below a threshold such that the system income stays above some expected value.

Pricing is another factor that could be considered in this work. The problem of pricing can be viewed from two perspectives. In the first perspective, the network earning from the users who have received the service ($\alpha_n$) and the penalty function could be adjusted based on the rate reserved by that user. It is economically justified that whoever requires more service (higher $r_n$) needs to pay more (higher $\alpha_n$), and penalizes the network more for the lack of service (higher order for $f_n$). Another aspect of pricing relates to the problem of supply and demand.
In other words, when the load of the network is low, the system administrator might be willing to decrease the prices (lower $\alpha_n$) to encourage the users to use the system (require higher $r_n$), while in the case of heavy loads (high $\rho$), the prices might be adjusted accordingly.

6.3 QoS-Provisioned Channel Allocation for OFDMA

The main goal of this section is to allocate a subset of OFDM subcarriers to each user such that the QoS is satisfied for each user and, at the same time, the overall network throughput is maximized. The allocation is performed based on each user's demands and channel conditions. The scheduler has to achieve this with reasonable performance and low complexity. The concept of revenue maximization is used in an OFDMA system, in which subcarriers are allocated to different users in such a way that the overall network revenue is maximized. As a result, throughput is maximized while QoS is maintained above the required level.

6.3.1 OFDMA System Model

A single-cell multiuser OFDMA system with $N$ users and $M$ subcarriers is considered in this chapter (see Figure 6.4). In each OFDM block period, the network can assign any subset of subcarriers to each user. The maximum achievable rate per unit bandwidth at the $m$th subcarrier for the $n$th user is given by
\[ g_n^m = \log_2\!\left( 1 + \frac{|H_n^m|^2 P_n^m}{N_n^m \Gamma} \right), \tag{6.34} \]
where $H_n^m$ is the $n$th user's channel frequency response at the $m$th subcarrier, $P_n^m$ is the transmit power, $N_n^m$ is the noise power, and $\Gamma$ is the SNR gap [77]. We assume that the base station knows the channel condition of each user. In other words, the $g_n^m$'s are known in every time slot [57]. Traffic from different users is directed to their assigned queues, and each queue is served according to its user's QoS. Let $r_n$ denote the rate reserved by user $n$ ($n = 1, 2, \dots, N$), i.e., assigned to queue $n$.
It is well known that the QoS of a user can be translated to a minimum guaranteed rate (i.e., $r_n$) through the notion of effective bandwidth [16]. We assume that the queues of all users are backlogged, so they have packets to transmit at all times. Also, we assume that at each time slot a subcarrier can be assigned to only one user. We denote the set of subcarriers assigned to user $n$ at time $t$ by $S_n(t)$. Obviously, the total rate assigned to user $n$ at time $t$ is $\sum_{m \in S_n(t)} g_n^m(t)$.

Figure 6.4: Block diagram of the system.

6.3.2 OFDMA Scheduling Algorithms through Subcarrier Allocation

We introduce a revenue model in which the network charges the users based on the throughput it provides to them, and is penalized if the QoS defined in the SLA of any user is violated. We assume that, for the service received by user $n$ at time $t$, $\sum_{m \in S_n(t)} g_n^m(t)$, the network charges the user $\alpha_n \sum_{m \in S_n(t)} g_n^m(t)$. Here, $\alpha_n$ is the rate that the $n$th user pays for the service, and is defined in its SLA.

To measure the QoS delivered to users, we use the notion of credit. The network assigns to user $n$ a credit, denoted by $C_n(t)$, that evolves as follows:
\[ C_n(t) = C_n(t-1) + r_n - \sum_{m \in S_n(t)} g_n^m(t), \]
where $r_n$ and $\sum_{m \in S_n(t)} g_n^m(t)$ are the $n$th user's reserved rate and received service at time $t$, respectively. Credit is introduced in Section 6.2.2. It is a measure of how much service the network has provided to a user [142]; i.e., if the network provides the reserved rate to a user, the user's credit is close to zero. We measure the QoS provided to users by the credits, so if a user has not received the requested QoS, its credit is high (Section 6.2.4). In this case, the network is penalized by $f[C_n]$, where $f[\cdot]$ is a real, positive, convex, and continuous function with $f[x] = 0$ for $x \le 0$.

Let $d_n(t)$ denote the income of the network from user $n$ at time $t$. Here, we perform scheduling in one OFDM symbol period, and maximize the total income at time $t$ using a greedy algorithm [148].
Therefore, from now on and without loss of generality, we drop the time index ($t$) in our discussions. If $C_n$ is the credit of user $n$ at the beginning of the current OFDM symbol period, the new credit after all subcarriers have been assigned is $C_n + r_n - \sum_{m \in S_n} g_n^m$. As a result, we obtain
\[ d_n = \alpha_n \sum_{m \in S_n} g_n^m - f\!\left[ C_n + r_n - \sum_{m \in S_n} g_n^m \right]. \tag{6.35} \]
The total income of the network is given by $D = \sum_{n=1}^{N} d_n$.

We assume that in any time slot, the SLA-based scheduler knows the $r_n$'s and $C_n$'s ($n = 1, \dots, N$) and also the $g_n^m$'s ($n = 1, \dots, N$; $m = 1, \dots, M$), and that it assigns subcarriers to the users such that the total income ($D$) is maximized. The penalty function shown in (6.11) is used here.

Figure 6.5: Scheduling tree of subcarrier allocation.

In what follows, we present optimal and suboptimal scheduling algorithms for subcarrier allocation in OFDMA systems.

Optimal Solution

An exhaustive search among all possible assignments achieves the optimal solution that maximizes the total income, $D = \sum_{n=1}^{N} d_n$. Each of the $M$ subcarriers can be assigned to any of the $N$ users. Therefore, the total number of assignments is $N^M$. The set of all possible assignments can be illustrated by a scheduling tree, as shown in Figure 6.5. The leaf labelled 1 corresponds to allocating all subcarriers to user 1; in leaf $N^M$ all subcarriers are allocated to user $N$; and in leaf 2, carriers $1, \dots, M-1$ are assigned to user 1 and carrier $M$ to user 2. Other leaves are labelled accordingly. The exhaustive search algorithm evaluates the income for each leaf of the scheduling tree (all possible assignments) and selects the one with the maximum income.

If the complexity of evaluating $d_n$ is upper bounded by $L$, the complexity of performing the exhaustive search is $L N^{M+1}$. Since the computational complexity of this algorithm grows exponentially with the number of subcarriers, exhaustive search may not be a practical solution.
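The income model (6.35) and the exhaustive search over the $N^M$ leaves of the scheduling tree can be sketched in a few lines of Python. This is only an illustrative sketch: the penalty `max(x, 0) ** v` is a stand-in assumed for the example (the actual penalty is the one in (6.11)), and the function names and the small rate matrix are hypothetical.

```python
from itertools import product

def penalty(credit, v=2.0):
    # Hypothetical penalty: zero for non-positive credit, convex for positive credit.
    return max(credit, 0.0) ** v

def total_income(assign, g, alpha, credit, r, f=penalty):
    """Total income D = sum_n d_n of (6.35) for one OFDM symbol.

    assign[m] is the user that receives subcarrier m; g[n][m] is the rate
    of user n on subcarrier m.
    """
    served = [0.0] * len(g)
    for m, n in enumerate(assign):
        served[n] += g[n][m]
    return sum(alpha[n] * served[n] - f(credit[n] + r[n] - served[n])
               for n in range(len(g)))

def exhaustive_search(g, alpha, credit, r, f=penalty):
    """Evaluate every leaf of the scheduling tree (N**M assignments)."""
    N, M = len(g), len(g[0])
    best = max(product(range(N), repeat=M),
               key=lambda a: total_income(a, g, alpha, credit, r, f))
    return list(best), total_income(best, g, alpha, credit, r, f)

# Hypothetical example: N = 2 users, M = 3 subcarriers, so 2**3 = 8 leaves.
g = [[1.0, 0.2, 0.6],
     [0.3, 0.9, 0.4]]
best, income = exhaustive_search(g, alpha=[1.0, 1.0],
                                 credit=[0.0, 0.0], r=[0.5, 0.5])
# best is [0, 1, 0] (subcarriers 0 and 2 to user 0, subcarrier 1 to user 1),
# with an income of about 2.5.
```

Even for this toy case the search visits every assignment; with $N = 4$ and $M = 32$ it would have to visit $4^{32}$ leaves, which is why it serves as a reference rather than a practical scheduler.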
Exhaustive search can nevertheless be used as a performance reference for all of the other algorithms. In the next sections, we propose a number of lower-complexity suboptimal algorithms. The algorithms are presented in increasing order of complexity and optimality.

Sequential Assignment (SA)

As the simplest suboptimal algorithm, we can assign users to carriers one at a time. Assume that at the $k$th step there is a pool of subcarriers left. We introduce the income at step $k$, $d_n^k$, defined as in Equation (6.35) with $S_n$ replaced by $S_n^k$, the set of subcarriers assigned to user $n$ at the end of the $k$th step. Also, assume that at the $k$th step a carrier denoted by $m_k$ is to be assigned. The best user for this carrier is determined by
\[ \hat{n}_k = \arg\max_n \left\{ \alpha_n \sum_{m \in S_n^{k-1}} g_n^m + \alpha_n g_n^{m_k} - f\!\left[ C_n + r_n - \sum_{m \in S_n^{k-1}} g_n^m - g_n^{m_k} \right] \right\}. \tag{6.36} \]
The above algorithm is summarized as follows:
• Start with a subcarrier, calculate the income for each user, and find the user with the best income according to (6.36).
• Assign the subcarrier to the best user found in the previous step. Remove the carrier from the pool.
• Proceed to the next subcarrier.

The carrier selection order can be random or fixed. We observed through numerical studies that the performance is similar in both cases. In other words, the performance of SA is independent of the subcarrier assignment order. While it has low complexity, this scheme performs far from the optimal solution, because each subcarrier is assigned independently of the assignments made in later steps.

Viterbi Algorithm (VA)

As mentioned in earlier sections, each channel assignment is represented by a path on the scheduling tree. The scheduling tree can be translated into a scheduling trellis, in which rows represent the users and columns represent the subcarriers. For simplicity, we can add a dummy initial node and a dummy terminal node to the scheduling trellis.
Therefore, every path connecting the initial node to the terminal node through the scheduling trellis is an assignment. For instance, if carrier $m$ is assigned to user $n$, the $n$th node of the $m$th column in the scheduling trellis is on the assignment path. For every path in the scheduling tree, there is one (and exactly one) corresponding path in the scheduling trellis.

Now, the optimal assignment can be translated to finding the path with the highest income in the scheduling trellis. If we consider the income of every assignment (path) in the trellis as the weight or length of the path, then the optimization problem is to find the longest path in the trellis. The longest path in a trellis can be obtained by the low-complexity Viterbi Algorithm if the length of a path has the additive property; in other words, if by adding a link to a path at step $k+1$, the length of the link is added to the length of the path at step $k$. Unfortunately, in this problem the length of a path does not have the additive property. The length of a path at step $k$ depends on the assignment ($S_n^k$) through the nonlinear function $f[\cdot]$. Therefore, applying the Viterbi Algorithm to this problem does not necessarily provide the longest (highest-income) path in the scheduling trellis. However, we expect the Viterbi Algorithm to provide a suboptimal solution.

Figure 6.6: Viterbi channel assignment.

In the Viterbi Algorithm, at each column, each node computes the lengths of the $N$ paths arriving at it, and picks the longest one as the survivor path. Hence, at each stage there are $N$ survivor paths, and after $M$ steps the algorithm chooses one path as the final survivor path. In each column there are $N$ nodes, and $N$ paths compete at each node. The complexity of computing a path length is upper bounded by $NL$, so the complexity of this algorithm is upper bounded by $M N^3 L$.
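The survivor-path procedure described above can be sketched in Python as follows. Because the path length is not additive, the sketch re-evaluates the income of each candidate partial path rather than accumulating link weights. The penalty function, the function names, and the small example are assumptions for illustration, not the configuration used in the simulations.

```python
def penalty(credit, v=2.0):
    # Hypothetical penalty: zero for non-positive credit, convex for positive credit.
    return max(credit, 0.0) ** v

def path_income(path, g, alpha, credit, r, f=penalty):
    # Income of a (partial) assignment path: path[m] is the user of subcarrier m.
    served = [0.0] * len(g)
    for m, n in enumerate(path):
        served[n] += g[n][m]
    return sum(alpha[n] * served[n] - f(credit[n] + r[n] - served[n])
               for n in range(len(g)))

def viterbi_assign(g, alpha, credit, r, f=penalty):
    """Keep one survivor path per user node in each trellis column."""
    N, M = len(g), len(g[0])

    def score(path):
        return path_income(path, g, alpha, credit, r, f)

    # Column 0: one link from the dummy initial node to each user node.
    survivors = [[n] for n in range(N)]
    for _ in range(1, M):
        # At node n of the next column, the N survivors compete;
        # the extension with the highest income survives.
        survivors = [max((p + [n] for p in survivors), key=score)
                     for n in range(N)]
    best = max(survivors, key=score)   # final survivor at the dummy terminal node
    return best, score(best)

# Hypothetical example: N = 2 users, M = 3 subcarriers.
g = [[1.0, 0.2, 0.6],
     [0.3, 0.9, 0.4]]
best, income = viterbi_assign(g, alpha=[1.0, 1.0],
                              credit=[0.0, 0.0], r=[0.5, 0.5])
# For this small instance the survivor is [0, 1, 0] with income about 2.5.
```

In this tiny example the survivor happens to coincide with the best of the eight possible assignments; in general the non-additive penalty means the result is only suboptimal, as noted above.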
Figure 6.6 displays the survival of a path using the Viterbi Algorithm in the scheduling trellis for four users and six subcarriers.

Iterative Algorithm (IA)

The sequential assignment algorithm has very low complexity, but its performance is suboptimal. This is because the assignments made in future steps change the income of the current assignment. That is, the assignment at each step must not be considered independently of the other steps, including future assignments. As a result, the Viterbi and sequential algorithms are suboptimal. In this section, we modify the sequential assignment algorithm to achieve a close-to-optimal solution. For this purpose, we repeat the carrier assignment and refine the set of carriers assigned to each user in order to maximize the income, until the assignment converges. The iterative subcarrier assignment algorithm is as follows:

1. Assume that all of the subcarriers are initially assigned to different users by some algorithm (such as a fixed assignment).

2. Take the $k$th subcarrier. We want to reassign this carrier to the locally optimal user. In this step, the incomes for all users are checked for a possible new assignment. The user that maximizes the total income obtains this carrier. That is,
\[ \hat{n}_k = \arg\max_n \left\{ \alpha_n \sum_{m \in S_n^{k-1}} g_n^m + \alpha_n g_n^{m_k} - f\!\left[ C_n + r_n - \sum_{m \in S_n^{k-1}} g_n^m - g_n^{m_k} \right] \right\}. \]

3. Assume that in the initial assignment or in previous iterations, the $k$th subcarrier was assigned to another user, say $\tilde{n}$. Then the assignments are updated as
\[ \begin{cases} S_n^k = S_n^{k-1}, & n \ne \tilde{n} \text{ and } n \ne \hat{n}, \\ S_{\tilde{n}}^k = S_{\tilde{n}}^{k-1} \setminus \{k\}, \\ S_{\hat{n}}^k = S_{\hat{n}}^{k-1} \cup \{k\}. \end{cases} \tag{6.37} \]

4. Update the incomes of the affected users, $\tilde{n}$ and $\hat{n}$.

5. Repeat steps 2-4 for the remaining subcarriers until all of the subcarriers have been reassigned.

6. Repeat steps 2-5 until the income no longer increases.

The above algorithm converges to a fixed point. The reason is that the income never decreases at any step of the algorithm.
Since the total income has an upper bound, which is the income of the optimal full-search assignment, this algorithm converges to a fixed point, which may not be the globally optimal assignment. Depending on the initial assignment, it converges to a local or the global optimum. In order to improve the performance, we start the iteration from different random initial points, and then pick the assignment with the maximum overall income. As we increase the number of random initial points, the probability that the algorithm achieves the globally optimal assignment increases. In order to speed up the convergence, we try the reassignment of carriers in random order.

We will show through numerical studies that the performance achieved by this algorithm is close to the optimal solution, with much lower complexity. However, its complexity is higher than that of the sequential algorithm or the Viterbi solution. If we repeat the iterative algorithm from $Q$ different initial points, the average number of iterations needed to reach a fixed point is $P$, and the average complexity of evaluating the income for one user is $L$, then the overall complexity of this scheme is $QPNML$. For typical values of $N$ and $M$, this is much less than the $L N^{M+1}$ of the exhaustive search method.

Subcarrier Clustering

We observed that there is a trade-off between the complexity and the performance of the algorithms. One way to reduce the complexity of subcarrier assignment is to bundle the subcarriers into clusters. For instance, if the system has $M$ subcarriers and each cluster has $k$ subcarriers, then we obtain $M/k$ clusters. The capacity matrix of the new system is easily derived by adding the capacities of the subcarriers in each cluster, and it is a matrix with dimensions $N \times M/k$. Every one of the algorithms mentioned in this chapter can then be used for the new system, by allocating clusters rather than subcarriers.
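The iterative algorithm with random restarts, and the clustering step, can be sketched together in Python as follows. This is a minimal sketch under stated assumptions: the penalty function is a hypothetical stand-in for (6.11), the initial assignments are drawn uniformly at random, and the function names, the `restarts` parameter (playing the role of $Q$), and the example matrices are introduced only for illustration.

```python
import random

def penalty(credit, v=2.0):
    # Hypothetical penalty: zero for non-positive credit, convex for positive credit.
    return max(credit, 0.0) ** v

def total_income(assign, g, alpha, credit, r, f=penalty):
    # D = sum_n d_n as in (6.35); assign[m] is the user holding subcarrier m.
    served = [0.0] * len(g)
    for m, n in enumerate(assign):
        served[n] += g[n][m]
    return sum(alpha[n] * served[n] - f(credit[n] + r[n] - served[n])
               for n in range(len(g)))

def cluster_capacity(g, k):
    # Bundle k adjacent subcarriers by summing their rates (N x M/k matrix).
    return [[sum(row[c:c + k]) for c in range(0, len(row), k)] for row in g]

def iterative_assign(g, alpha, credit, r, restarts=5, seed=0):
    rng = random.Random(seed)
    N, M = len(g), len(g[0])
    best, best_income = None, float("-inf")
    for _ in range(restarts):                    # Q random initial points
        assign = [rng.randrange(N) for _ in range(M)]
        current = total_income(assign, g, alpha, credit, r)
        improved = True
        while improved:                          # stop when a full sweep brings no gain
            improved = False
            order = list(range(M))
            rng.shuffle(order)                   # reassign carriers in random order
            for m in order:
                def income_if(n):
                    trial = assign[:m] + [n] + assign[m + 1:]
                    return total_income(trial, g, alpha, credit, r)
                n_hat = max(range(N), key=income_if)
                new_income = income_if(n_hat)
                if new_income > current + 1e-12:  # accept strict improvements only
                    assign[m], current = n_hat, new_income
                    improved = True
        if current > best_income:
            best, best_income = assign[:], current
    return best, best_income

# Hypothetical example: N = 2 users, M = 3 subcarriers.
g = [[1.0, 0.2, 0.6],
     [0.3, 0.9, 0.4]]
best, income = iterative_assign(g, alpha=[1.0, 1.0],
                                credit=[0.0, 0.0], r=[0.5, 0.5])
# Clustering example: pairs of subcarriers are merged into two clusters.
clustered = cluster_capacity([[1, 2, 3, 4], [5, 6, 7, 8]], k=2)
```

For this small instance every starting point is driven to the same assignment, `[0, 1, 0]`, with an income of about 2.5, and `clustered` is `[[3, 7], [11, 15]]`. Passing a clustered matrix to `iterative_assign` allocates clusters instead of individual subcarriers.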
By clustering, we can reduce the complexity; however, it is clear that this process degrades the system performance.

Figure 6.7: Throughput versus network load. (Plot of system throughput against total network load for MTS, MCS, MPS, MIGS, IWFQ, and CSDPS.)

6.4 Performance Results

6.4.1 Simulation Results for SLA-Based Scheduling

In order to evaluate the performance of our algorithms, we have simulated a single-cell wireless system in which users are randomly distributed. We assume that path loss and shadow fading are compensated by a power allocation mechanism, and that the channel follows a Rayleigh fading distribution. By considering the same noise level at all receivers, the received signal power also follows a Rayleigh distribution. We have quantized the Signal to Noise Ratio (SNR) of each link into four distinct levels, and for each SNR level we have calculated the channel capacity according to Equation (6.6). The quantized levels of channel capacities and their probabilities are $\{1.0, 0.6, 0.4, 0.2\}$ and $\{0.43, 0.24, 0.19, 0.14\}$, respectively.

If $R_n$ is the rate assigned to user $n$ (or the proportion of time assigned to user $n$), and $r_n$ is the rate reserved by that user, we define the minimum assigned relative rate over all users to be
\[ \gamma = \min_n \frac{R_n}{r_n}. \tag{6.38} \]
This value can be considered as the measure of QoS; to support QoS for all users, we want $\gamma \ge 1$.

Figure 6.8: Minimum assigned relative rate versus network load. (Plot of the minimum assigned relative rate against total network load for MTS, MCS, MPS, MIGS, IWFQ, and CSDPS.)

First, we present the simulation results for MIGS and compare its performance with MTS, MCS, MPS, IWFQ, and CSDPS for a system with four users. The reserved rates of the four users are $\rho\,[0.1, 0.2, 0.3, 0.4]$, where $0 \le \rho \le 1$ is the network load (the sum of the reserved rates is $\rho$).
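As a quick check on the quantized capacity model above, the mean link capacity and the expected maximum capacity over four users can be computed directly from the stated levels and probabilities. The sketch below assumes that the users' capacities are independent and identically distributed, which is an assumption of this example; the function names are hypothetical.

```python
levels = [1.0, 0.6, 0.4, 0.2]
probs = [0.43, 0.24, 0.19, 0.14]

def mean_capacity(levels, probs):
    # E[g] for a single link.
    return sum(level * p for level, p in zip(levels, probs))

def expected_max_capacity(levels, probs, n_users):
    """E[max of n_users i.i.d. draws], using P(max <= x) = F(x) ** n_users."""
    total, cdf_prev, cdf = 0.0, 0.0, 0.0
    for level, p in sorted(zip(levels, probs)):   # ascending capacity levels
        cdf += p
        # Probability that the maximum equals this level.
        total += level * (cdf ** n_users - cdf_prev ** n_users)
        cdf_prev = cdf
    return total

avg = mean_capacity(levels, probs)                 # about 0.678
best_of_four = expected_max_capacity(levels, probs, 4)   # about 0.955
```

Rounded to two decimals these values are 0.68 and 0.96, which bound what a channel-unaware scheduler and a best-user scheduler, respectively, can achieve under this capacity model.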
Also, we assume that $\alpha = 1000$ for all users. Throughput, minimum assigned relative rate ($\gamma$), and total income are plotted in Figures 6.7-6.9, respectively. The penalty function in the simulations is selected as in (6.11) with its parameter set to 1. The horizontal axis in all these figures shows the network load ($\rho$).

Figure 6.9: Total income vs. network load. (Plot of total system income against total network load for MTS, MCS, MPS, MIGS, IWFQ, and CSDPS.)

As illustrated in Figure 6.7, MTS and MIGS achieve the maximum throughput (the expectation of the maximum link capacity, $E\{\max(g)\} = 0.96$). Note that the maximum achievable throughput is $E\{\max(g)\} = 0.96$, and therefore no throughput above this value is possible. As can be seen from these figures, IWFQ and CSDPS provide a lower throughput than MIGS. It is interesting to notice that the throughputs of IWFQ and CSDPS are almost equal. The reason is that in both of these schedulers, if the link capacity of a specific user is below a threshold, which is the same for both schemes, the scheduler does not schedule that user. MCS achieves a flat throughput, which is equal to the average link capacity ($E(g) = 0.68$). At low network loads, MPS tries to satisfy each user with its requested bandwidth; that is, only the throughput needed to satisfy each user is allocated. As the network load increases, the system throughput increases and approaches that of MTS and MIGS.

Figure 6.10: Throughput of MIDPS, MIGS, and MPS vs. network load. (Plot of throughput against total reserved rate.)

As illustrated in Figure 6.8, at low network loads all the algorithms support QoS. However, as the network load increases, MPS and MIGS try to maintain QoS for all users, while MTS and CSDPS fail to do so. This result is expected, since MTS and CSDPS are not designed to provide QoS.
Since MCS does not utilize bandwidth as efficiently as MIGS, it fails at high network loads due to the lack of available channel bandwidth. IWFQ has a mechanism for supporting QoS. However, since its throughput does not go beyond 0.76, it fails to support QoS above this network load. This is a general rule, which is reflected in Remark 6.4.1.

Remark 6.4.1. If the maximum throughput achievable by a scheduling scheme is $C_{\max}$, then the scheme fails to support the QoS ($\gamma \ge 1$) for all loads above $C_{\max}$.

Therefore, IWFQ cannot maintain QoS as strongly as MIGS. It is important to notice that it is not possible to find a scheduler that provides QoS for a network load of one.

Figure 6.11: Minimum assigned relative rate of MIDPS, MIGS, and MPS vs. network load. (Plot of the minimum assigned relative rate against total reserved rate.)

As mentioned earlier, the total income is a combination of the system throughput and the penalty incurred when the QoS requirement is not met. This quantity is shown in Figure 6.9. MIGS generates the highest income, since the throughput and the penalty are optimized jointly. The incomes of MTS, CSDPS, MCS, and IWFQ drop at high network loads, since they all fail to meet QoS above certain loads. MTS and CSDPS fail to meet QoS at lower network loads than MCS and IWFQ, and as a result their incomes drop faster. The total income of MPS increases as the load increases: MPS tries to minimize the penalty independently of the load, while at large loads the effect of throughput prevails, so the throughput grows and increases the income. At high network loads, the MPS income approaches that of MIGS, since both achieve similar throughput at high loads.

Next, we evaluate the performance of MIDPS and compare it with those of MIGS and MPS (see Figures 6.10, 6.11, and 6.12). However, because of the complexity of the DP algorithm, we consider only three users, limit the credits of the users to be between -1 and 2, and perform the simulations for three cases in which the reserved rates are (a) $[0.2, 0.2, 0.2]$, (b) $[0.2, 0.2, 0.4]$, and (c) $[0.2, 0.2, 0.6]$. The penalty function is described by Equation (6.11), and $\alpha_1 = 1$, $\alpha_2 = 2$, and $\alpha_3 = 4$.

Figure 6.12: Income of MIDPS, MIGS, and MPS vs. network load. (Plot of total income against total reserved rate.)

Figures 6.10 and 6.11 show that the system throughput and QoS with MIDPS are as good as the throughput and QoS with MIGS. Therefore, MIDPS can support QoS and provide high system throughput. However, as shown in Figure 6.12, the total income with MIDPS is better than that of the suboptimal MIGS. We note that when we increase the range of credits, the suboptimal solution MIGS performs close to the optimal solution MIDPS.

Optimizing the System Parameters: The system performance, i.e., QoS, system throughput, and total system income, depends on the values of the $\alpha_n$'s and on the penalty functions $f_n[\cdot]$. The system charges user $n$ based on the value of $\alpha_n$ and the service provided to this user. On the other hand, user $n$ penalizes the system, based on the penalty function $f_n[\cdot]$, if its SLA is violated. In this subsection, we investigate the effect of these parameters on the system performance. As the value of $\alpha_n$ increases, the throughput becomes more significant. For the penalty function, we consider the polynomial function $f_n(x) = x^v$.

Figure 6.13: QoS versus the network load for different penalty functions, $\alpha = 5$. (Plot of the relative rate ratio against total network load for exponents 0.5, 1.0, 1.3, 2.0, and 20.0.)
For high values of v, the penalty function increases for large values of credits, therefore, QoS becomes more signiflcant. As mentioned earlier in this chapter, a scheduler cannot provide QoS above its throughput. Therefore, low throughput implies lack of QoS for large network loads. In our simulations, we use a[0:1;0:1;0:2;0:6] as the reserved rate assignment. Figure 6.13 displays the relative assigned rate ratio and Figure 6.14 shows the throughput for fin = 5 and for difierent penalty functions. As shown in these flgures, for this value of fin = 5, the exponent of between 1:1 ? v ? 3 is an appropriate value for good performance. Also, there 254 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10.78 0.8 0.82 0.84 0.86 0.88 0.9 0.92 0.94 0.96 System throughput Throughput vs. network load for different penalty fucntions, ? =5 0.51.0 1.32.0 20.0 Figure 6.14: Throughput versus the network load for difierent penalty functions, fi = 5 is an optimal value of v? = 1:3 that optimizes the system performance in terms of both QoS and throughput. For small values of v, the system does not pay enough attention to QoS and for large values of v, the system throughput reduces and the system fails to support QoS for large network loads. Figures 6.15 and 6.16 show the QoS and throughput for difierent values of penalty function for fin = 500. We expect the optimal solution to change for this case. As shown in this flgure, the optimal value for v moves, i.e., v? = 3. 6.4.2 Simulation Results for OFDMA Channel Allocation We evaluate the performance of the proposed algorithms in a single cell system where four users (N = 4) are randomly distributed in the cell, and each user can be assigned to any of M = 32 subcarriers. We consider a multipath channel model with R = 4 distinct paths. 255 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10.6 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 Total network load Relative assigned rate ratio QoS vs. network load for different penalty functions, ? 
[Figure 6.15: QoS versus the network load for different penalty functions, $\alpha = 500$. x-axis: total network load; y-axis: relative assigned rate ratio; legend: 1.3, 2.0, 3.0, 7.0, 20.0.]

The channel response for the $n$th user can be represented as

$$h_n(t) = \sqrt{G_n} \sum_{r=0}^{R-1} \alpha_{rn}\, \delta(t - \tau_{rn}), \qquad (6.39)$$

where $G_n$ includes log-normal shadow fading and path loss, and $\tau_{rn}$ and $\alpha_{rn}$ denote the delay and fading of the $r$th path, respectively. The path fading follows a complex Gaussian distribution, so the received signal amplitude has a Rayleigh distribution. The baseband channel frequency response is the Fourier transform of $h_n(t)$ sampled at the subcarrier frequencies $m f_c$, where $f_c$ is the subcarrier separation:

$$H_{mn} = \sqrt{G_n} \sum_{r=0}^{R-1} \alpha_{rn}\, e^{-j 2\pi m f_c \tau_{rn}}. \qquad (6.40)$$

We assume that the path loss and shadowing for different paths are the same, and any difference can be absorbed in the fading coefficients.

[Figure 6.16: Throughput versus the network load for different penalty functions, $\alpha = 500$. x-axis: total network load; y-axis: system throughput; legend: 1.3, 2.0, 3.0, 7.0, 20.0.]

For performance evaluation, we consider the sequential assignment (SA), Viterbi algorithm (VA), iterative algorithm (IA), and fixed assignment (FA). In the fixed assignment, we assume that the network assigns a set of subcarriers to each user for the whole duration of the simulations. In the other approaches, the subcarrier assignment is performed for one time slot, and it changes in every time slot. Obviously, FA and the exhaustive search have the lowest and the highest complexities, respectively. Our numerical results reveal that the IA performs close to the optimal exhaustive search algorithm. To see this, we compare the performance of both approaches through simulations. Figure 6.17 shows the cumulative distribution functions (CDF) of the total income for the exhaustive search algorithm along with IA with 20 and 80 iterations.
As shown in this figure, the performance of IA is close to the optimal one even with 20 iterations. Therefore, from now on, we consider IA as the reference algorithm.

[Figure 6.17: CDFs of the optimal exhaustive search algorithm and the iterative algorithm with 20 and 80 iterations. x-axis: total income; y-axis: CDF; curves: Full Search, Iter20, Iter80.]

The total throughput versus the network load is shown in Figure 6.18. As illustrated in this figure, the IA achieves the maximum throughput (the expectation of the maximum link capacity, $E\{\sum_m \max(g_{mn})\}$). VA achieves close to the maximum throughput, but is outperformed by IA. FA achieves close to the average capacity of the system, $E\{g_{mn}\}$. At low network loads, SA achieves the maximum throughput. However, its performance drops very fast as the load is increased. This is because this algorithm assigns the subcarriers independently; therefore, the assignments at the early stages of the algorithm limit the performance of the later stages.

[Figure 6.18: Throughput vs. load. x-axis: network load; y-axis: throughput; curves: Iterative, Viterbi, Fixed, Sequential.]

We use the minimum assigned relative rate defined in Equation 6.38 to represent QoS. This value is displayed versus the network load in Figure 6.19. Again, IA and VA satisfy the QoS requirement for almost all loading values, while SA and FA fail to meet the QoS requirement for large loads.

The total income is depicted versus the network load in Figure 6.20. As shown in this figure, the income of IA is higher than that of the other algorithms, since IA provides the highest throughput and supports the QoS; thus, the penalty of IA for violating the QoS is the lowest. VA provides a suboptimal performance. The performance of SA and FA drops significantly as the network load increases.
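As a closing illustration of this simulation setup, the frequency response of Equation (6.40) and the two throughput reference levels mentioned above can be sketched in a few lines of Python with NumPy. This is only a sketch under stated assumptions: the unit gain $G_n = 1$, the normalized subcarrier separation, the uniformly drawn delays, the rate proxy $\log_2(1 + \mathrm{SNR}\,|H_{mn}|^2)$ for $g_{mn}$, and the round-robin fixed assignment are all hypothetical choices, and neither the IA nor the VA is implemented here.

```python
import numpy as np


def channel_frequency_response(alpha, tau, gain, num_subcarriers, spacing):
    """Evaluate H_m = sqrt(G) * sum_r alpha_r * exp(-j*2*pi*m*f_c*tau_r), m = 0..M-1."""
    m = np.arange(num_subcarriers)[:, None]                      # shape (M, 1)
    phases = np.exp(-2j * np.pi * m * spacing * tau[None, :])    # shape (M, R)
    return np.sqrt(gain) * (phases @ alpha)                      # shape (M,)


rng = np.random.default_rng(0)
N, M, R = 4, 32, 4      # users, subcarriers, paths (values from the text)
spacing = 1.0           # assumed normalized subcarrier separation f_c
snr = 10.0              # assumed linear SNR used in the rate proxy

H = np.empty((M, N), dtype=complex)
for n in range(N):
    # Complex Gaussian path fading (Rayleigh amplitude), unit total power.
    alpha = (rng.standard_normal(R) + 1j * rng.standard_normal(R)) / np.sqrt(2 * R)
    tau = np.sort(rng.uniform(0.0, 0.25, R))   # assumed path delays
    H[:, n] = channel_frequency_response(alpha, tau, 1.0, M, spacing)

# Assumed per-link rate proxy standing in for g_mn.
g = np.log2(1.0 + snr * np.abs(H) ** 2)

# Upper reference: every subcarrier goes to its best user, sum_m max_n g_mn.
best_user_rate = g.max(axis=1).sum()

# Lower reference: hypothetical fixed assignment, subcarrier m -> user m mod N.
fixed_owner = np.arange(M) % N
fixed_rate = g[np.arange(M), fixed_owner].sum()

print("best-user throughput:", best_user_rate)
print("fixed-assignment throughput:", fixed_rate)
```

The best-user sum ignores the QoS constraints, so it only marks the throughput ceiling against which an income-based assignment could be compared.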
[Figure 6.19: Worst case actual to desired throughput vs. load. Plot title: QoS vs. network load; x-axis: network load; y-axis: QoS; curves: Iterative, Viterbi, Fixed, Sequential.]

[Figure 6.20: Total revenue vs. load. Plot title: Total income vs. network load; x-axis: total network load; y-axis: total income ($\times 10^5$); curves: Iterative, Viterbi, Fixed, Sequential.]

6.5 Summary of the Chapter

In the first part of this chapter we proposed Service Level Agreement (SLA) based scheduling schemes. We introduced a notion of income maximization in which throughput is the objective of the maximization, with the constraint that the scheduler is penalized when the QoS or SLA is violated. We proposed a greedy approach and a dynamic programming approach to solve the problem. Our results show that the performance of these algorithms is superior to cases where only throughput or only QoS is considered in the scheduling process.

In the second part of the chapter, we presented scheduling algorithms that maximize OFDMA system throughput for QoS-sensitive users, using the notion of revenue maximization. The OFDM subcarriers were allocated to different users based on their capacities, the users' required rates, and the total income of the system. We proposed low-complexity sub-optimal scheduling algorithms to reduce the computational complexity.

Chapter 7

Conclusion and Future Works

7.0.1 Summary

The main focus of this dissertation was on the development of efficient algorithms for increasing the performance of OFDM systems via power control and multiple transmit and receive antennas, and for providing the required QoS to the users of a network who share a wireless medium (WM).

In Chapter 2 we outlined the basic concepts of OFDM and explained some of the existing problems of an OFDM system.
The problems of equalizing the performance of different subcarriers, increasing the capacity of the system, and the high Peak to Average Power Ratio (PAPR) of OFDM systems, as well as different applications of multiple transmit and receive antennas in a wireless system, were described.

In Chapter 3, we considered iterative joint power control and beamforming, both in the frequency domain and in the time domain, for OFDM wireless networks. We could achieve the following goals simultaneously:

• The performances of all of the subchannels of an OFDM receiver are very close to each other.
• The SINRs at all subchannels of all mobiles are at least equal to a pre-defined threshold value.
• The total network power needed to achieve the above goals is minimized.

To reduce the complexity of the OFDM receivers, we performed the array processing in the time domain and provided an iterative algorithm to distribute the power among subchannels. This scheme provides a sub-optimal performance in terms of the total network power, with a receiver complexity about 64 times lower (for 128 subchannels and 4 antennas) than that of the optimal frequency-domain beamforming. This reduction in complexity was achieved at the price of a higher total network power for the same target bit error rate. We also considered MMSE time-domain beamforming jointly with power control for practical situations where the angles of arrival are unknown. We proved that the MMSE time-domain beamforming solution has the same SINR as the MVDR solution. Finally, we applied the proposed joint power control and frequency-domain beamforming to COFDM systems. We observed that an uncoded OFDM system with the proposed algorithms performs better than the simple COFDM system, a coded system with per-subchannel beamforming and equal powers across subchannels, and a COFDM system with per-user power control and per-subchannel beamforming.
If the proposed algorithm is applied to COFDM, the BER is improved for moderate and high SINRs.

In Chapter 4, we proposed iterative water-filling solutions for downlink multi-user multi-cell wireless systems where multiple antennas are deployed at both transmitters and receivers. The proposed algorithm assigns multiple independent substreams to each user to increase the maximum achievable data rate for each user, and is performed as a distributed scheme. Each transmitter has knowledge only of the interference at its own receiver and the channel response of that link. By simulation, we observed that the algorithm converges to a fixed-point solution where the data rate for each user is locally optimized. We established a non-cooperative game theoretic analogy for the MIMO/OFDM problem and proposed a conjecture that the iterative algorithms proposed in this work converge to the Nash equilibrium saddle point of the game. We also proposed another iterative algorithm that considers single-stream transmission and tries to maximize the actual transmission data rate of each user by performing transmit and receive beamforming. The performance of the proposed iterative schemes was also evaluated through numerical analysis. We saw that if the number of co-channel mobiles per cell is increased, it is better to limit the number of streams. For example, with two co-channel mobiles per cell and four transmit antennas, two streams achieve a higher capacity than four streams.

Chapter 5 focused on the problem of reducing the PAPR of OFDM systems. We modified Golay complementary codes to increase their achievable rates. We introduced the concept of super-Golay codes, which are the extension of Golay complementary codes to non-equal energy constellations. Specifically, we focused on the 16-QAM constellation. We constructed a recursive scheme that allowed us to build all of the super Golay codes with a specific size.
These codes were derived from non-equal energy constellations and had the same property that Golay complementary codes derived from equal energy constellations have. We formed some structures to obtain these codes from super Golay codes of half the size. This helped us to establish a modified Reed-Muller code that covers the structure of super Golay complementary codes. The construction started from QPSK Golay sequences, which are efficiently created using the 2nd order cosets of $RM_{2^h}(1, m)$. One future direction of this work is to search for more structures. The more structures we obtain, the larger the number of super Golay codes we can cover and the higher the coding rate we can achieve. Although the information rate is higher than that of the existing works in this context, it is still not an acceptable rate. Another future direction of this work is to find the possible trade-off between the coding rate and PAPR. The structure we propose is for general super Golay codes, regardless of the constellation. However, we have focused on the 16-QAM constellation for the sake of simulation. This scheme can be generalized to higher order QAM constellations, like 64-QAM, which is used in IEEE WLAN standards like IEEE802.11a.

Then, in the second part of this chapter, we introduced the concept of cyclic Golay codes and showed that with appropriate time shaping, and in the discrete domain (critical sampling), they maintain the same level of PAPR as the Golay codes. Moreover, we have shown that these constitute a super-set of Golay codes and therefore result in a higher coding rate. However, the increase in the coding rate was not very significant (22% for $m = 4$). We designed a construction method to find the cyclic shift of any code represented by Boolean algebraic forms.
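The critical-sampling statement above can be checked numerically with a short NumPy sketch. It assumes that the cyclic shift acts on the length-$2^m$ codeword that modulates the subcarriers, and it uses a single binary Golay sequence built from the Boolean function $x_1 x_2 + x_2 x_3 + \dots + x_{m-1} x_m$ (a second-order coset representative of the first-order Reed-Muller code); the helper names `golay_bpsk` and `papr` are hypothetical and are not the construction of Chapter 5.

```python
import numpy as np


def golay_bpsk(m):
    """BPSK Golay sequence of length 2**m from f(x) = x1*x2 + x2*x3 + ... (mod 2)."""
    idx = np.arange(2 ** m)
    bits = (idx[:, None] >> np.arange(m)) & 1             # shape (2**m, m)
    f = (bits[:, :-1] * bits[:, 1:]).sum(axis=1) % 2      # adjacent-variable products
    return 1.0 - 2.0 * f                                  # map {0, 1} -> {+1, -1}


def papr(symbols, oversampling=1):
    """Peak-to-average power ratio of the IFFT of `symbols` (zero-padded if oversampled)."""
    n = len(symbols)
    x = np.fft.ifft(symbols, n * oversampling)
    power = np.abs(x) ** 2
    return power.max() / power.mean()


a = golay_bpsk(4)
critical = [papr(np.roll(a, s)) for s in range(len(a))]
oversampled = [papr(np.roll(a, s), oversampling=8) for s in range(len(a))]

print("critical-sampling PAPR, min/max over shifts:", min(critical), max(critical))
print("oversampled PAPR of the unshifted sequence:", oversampled[0])
print("largest oversampled PAPR over all shifts:", max(oversampled))
```

At critical sampling a cyclic shift of the codeword only multiplies the time samples by a phase ramp, so the sample magnitudes, and hence the PAPR, are unchanged. With oversampling this argument no longer applies, which is why the two lists are computed separately.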
The cyclic shifts of the Golay second order cosets of the first order Reed-Muller codes generated by our construction have low Hamming and Lee distances. However, we introduced a trade-off between the coding rate and the distance of the code. We introduced an OFDM system with an SNR threshold that, according to the SNR of the received signal, switches between the Golay code and its cyclic shifts of different orders. We demonstrated that increasing the threshold lowers both the BER and the coding rate. If the SNR threshold is set to the lowest value ($-2$ dB in the corresponding figures), the system always uses the cyclic Golay code. However, if the threshold is set to the maximum value (20 dB), the Golay code is always used, and that is the point of convergence shown in those figures. We observed that the cyclic Golay codes are in general a subset of $RM_{2^h}(r, m)$. So, we proposed two decoding algorithms for these codes. The first one was a generalization of the majority logic decoding approach, using Karnaugh maps. It was discussed that the decoding scheme is a Hamming and Lee distance decoder for generalized $RM_{2^h}(r, m)$. A maximum-likelihood equivalent of the scheme was introduced for soft-decision decoding in the complex domain. We also devised a recursive approach that reduces the decoding procedure to the decoding of codes of smaller size and lower order. We analyzed the complexities of both decoding algorithms in terms of the number of complex multiplications and additions. We observed that although our algorithms apply to generalized $RM_{2^h}(r, m)$, their complexities are comparable to those of the existing decoding schemes for the second order cosets of first order generalized Reed-Muller codes. We proved that the complexity of the recursive scheme is lower than that of the non-recursive scheme.

In Chapter 6, we proposed Service Level Agreement (SLA) based scheduling schemes.
We introduced the notion of income maximization, where the objective was to maximize the throughput of the network with the constraint that the scheduler is penalized when the QoS or SLA is violated. We proposed a greedy short-term approach that considered the optimization problem over one time slot, and also a dynamic programming approach that tried to solve the problem optimally over a long period of time (infinite in this case). This scheme provided the scheduling decision for each time slot using the fading characteristic of each wireless link. Our results showed that the performance of these algorithms was superior to cases where only throughput or only QoS is considered in the scheduling process.

Finally, in this chapter we presented scheduling algorithms that maximize OFDMA system throughput for QoS-sensitive users. We used the same notion of revenue maximization to balance throughput optimization and QoS. The OFDM subcarriers were allocated to different users based on their capacities, the users' required rates, and the total income of the system. We proposed low-complexity sub-optimal scheduling algorithms to reduce the computational complexity. Through simulation, we observed that the iterative sequential assignment, while having significantly lower complexity, can achieve a fair trade-off between the QoS and the total throughput through the notion of total income.

7.0.2 Future Works

The work presented in this dissertation could be extended in the following directions:

• To devise suboptimal time-domain MIMO-OFDM schemes that increase the overall mutual information with lower complexity.
• To prove Conjecture 4.3.1 presented in Chapter 4, and to find more accurate necessary conditions for this conjecture.
• To search for more structures that constitute higher dimension super Golay codes from lower dimension codes. The more structures we obtain, the larger the number of super Golay codes we can cover and the higher the coding rate we can achieve.
• To find the possible trade-off between the coding rate and the PAPR of super Golay codes. Although the information rate of super Golay codes was higher than that of the existing works in this context, it was not an acceptable rate for practical applications.
• To devise systematic encoding and decoding methods for super Golay codes.
• To create a relation between the discrete-domain PAPR based on sampling of the OFDM signal and the PMEPR of the continuous-time OFDM signal. The 3 dB upper bound for the cyclic Golay codes is only applicable to discrete sampled OFDM time-domain symbols. However, it is an indication of how well the code performs.
• To find the relation between the PAPR of some extended second order cosets of $RM_{2^h}(1, m)$ (which have PAPR above 3 dB) and the PAPR of their cyclic shifts.
• To devise coding schemes with low PAPR for space-frequency OFDM codes. Many PAPR reduction codes have considered single antenna transmission. In [149], we devised a code that achieved full diversity gain over OFDM systems. We plan to extend these codes to codes with the same diversity and coding advantage and low PAPR. There are three approaches we are following, namely rotation of the symbols, using cosets of first order Reed-Muller codes, and row interleaving.
• To apply the framework proposed in Chapter 6 for SLA-based scheduling to other multiple access protocols like CDMA and FDMA.
• To devise a pricing scheme based on the demand for network resources and their availability in terms of the overall capacity of the wireless network. The choice of revenue and penalty parameters in defining the total income of the network has some effect on the trade-off between the QoS and the system throughput, and also on the income of the network. We plan to choose these parameters based on demand and supply. Another aspect of pricing is the dependence of $\alpha_n$ and $f_n$ on the required rates of the users ($r_n$).
It is economically justified that whoever requires more service (higher $r_n$) needs to pay more (higher $\alpha_n$), and penalizes the network more for the lack of service (higher order for $f_n$).
• To devise an admission control policy that adjusts the maximum achievable network load in such a way that the relative assigned rate is above some predetermined factor. Another scheme for the admission control policy is to keep the network load below a threshold such that the system income is above some expected value.
• To find an efficient long-term scheduling decision obtained from dynamic programming. The algorithm presented in Chapter 6 is obtained using value iteration. However, we will seek a decision-driven algorithm for long-term scheduling.
• To prove the convergence of the sub-optimal schemes proposed for channel allocation in OFDMA systems.
• To combine the channel allocation scheme with power control. In the second part of Chapter 6, we assumed that the powers are fixed for each subchannel. However, it is possible to allocate powers intelligently, such that the overall throughput is higher while the required QoS for all users is fulfilled.
• To apply the framework proposed in Chapter 6 to devise scheduling algorithms for mobile ad hoc networks that use multiple antenna transmission.

BIBLIOGRAPHY

[1] S. Nanda, K. Balachandran, and S. Kumar, "Adaptation techniques in wireless packet data services," IEEE Communications Magazine, vol. 38, pp. 54-64, Jan 2000.
[2] wirlessLAN802.11 standard committee, "IEEE802.11 wireless LAN standards," http://grouper.ieee.org/groups/802/11/.
[3] E. T. S. I. (ETSI), "ETSI broadband radio access networks," http://www.etsi.org/.
[4] J. Kruys, "Standardization of wireless high speed premises data networks adaptation techniques in wireless packet data services," Wireless ATM workshop, Espoo, Finland, Sep. 1996.
[5] wirelessLAN802.16 standard committee, "IEEE Standard for Local and Metropolitan Area Networks, Part 16: Air Interface for Fixed Broadband Wireless Access Systems," http://standards.ieee.org/getieee802/download/802.16-2001.pdf, IEEE Std 802.16-2001, 2001.
[6] wirlessLAN802.20 standard committee, "IEEE802.20 mobile broadband wireless access (MBWA)," http://grouper.ieee.org/groups/802/20/.
[7] J. G. Proakis, Digital Communications. McGraw Hill, third ed., 1995.
[8] C. Heegard and S. B. Wicker, Turbo Codes. Kluwer Academic Publishers, 1999.
[9] A. Shokrollahi, "LDPC Codes: an Introduction," http://www.ipm.ac.ir/IPM/homepage/Amin2.pdf, Apr. 2003.
[10] J. B. Cain, J. G. C. Clark, and J. M. Geist, "Punctured convolutional codes of rate (n-1)/n and simplified maximum likelihood decoding," IEEE Transaction on Information Theory, pp. 97-100, Jan 1979.
[11] G. Ungerboeck, "Trellis Coded Modulation with Redundant Signal Set," IEEE Comm. Magazine, vol. 27, pp. 5-21, February 1987.
[12] I. S. Reed and G. Solomon, "Polynomial codes over certain finite fields," SIAM Journal on Applied Mathematics, vol. 8, Jan 1960.
[13] S. B. Wicker, Error Control Systems for Digital Communication and Storage. Prentice Hall, New Jersey, 1995.
[14] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo-codes," IEEE International Conference on Communications, Geneva, vol. 2, 1993.
[15] C. Villamizar and T. Li, "IS-IS optimized multipath," Internet draft, draft-villaizar-isis-omp-00.tex, Oct 1998.
[16] R. Guerin, H. Ahmadi, and M. Naghsineh, "Equivalent Capacity and its Application to Bandwidth Allocation in High-speed Networks," IEEE Journal of Selected Areas in Communications, Sep 1991.
[17] A. Orda, "Routing with end-to-end QoS guarantees in broadband networks," Tech. Rep., Technion, Israel.
[18] S. Abhyankar, R. Roshiwal, C. Cordeiro, and D.
Agrawal, "On the application of traffic engineering over bluetooth ad hoc networks," Department of ECECS, University of Cincinnati, Oct 1998.
[19] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, P. Whiting, and R. Vijaykumar, "Providing quality of service over a shared wireless link," IEEE Communications Magazine, vol. 39, pp. 150-154, Feb 2001.
[20] T. Rappaport, Wireless Communications. IEEE Press, 1996.
[21] J. Gibson, The Mobile Communications Handbook. IEEE Press, 1996.
[22] R. Steele, Mobile Radio Communications. IEEE Press, 1992.
[23] G. J. Foschini and M. J. Gans, "On Limits of Wireless Communications in Fading Environment When using Multi-element Antennas," Wireless Personal Communications, vol. 6, pp. 311-335, 1998.
[24] R. O. Schmidt, "Multiple emitter location and signal parameters estimation," IEEE Trans. on Antenna and Propagations, vol. 34, pp. 276-280, March 1986.
[25] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-a subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 34, pp. 1340-1342, Oct 1986.
[26] R. A. Monzingo and T. Miller, Introduction to Adaptive Arrays. Wiley, New York, 1980.
[27] B. Ottersten, R. Roy, and T. Kailath, "Signal waveform estimation in sensor array processing," Proc. of the 23rd Asilomar Conference on Signals, Systems, and Computers, Nov. 1989.
[28] B. Suard, A. Naguib, G. Xu, and A. Paulraj, "Performance analysis of CDMA mobile communication systems using antenna array," IEEE Proc. Intn. Conf. on Acoustic, Speech and Signal Processing, ICASSP'93, vol. 4, pp. 153-156, April 1993.
[29] V. Tarokh, N. Seshadri, and A. R. Calderbank, "Space-time codes for high data rate wireless communication: Performance criterion and code construction," IEEE Transactions on Information Theory, vol. 44, pp. 744-765, Mar. 1998.
[30] B. Hochwald and T.
Marzetta, "Unitary space-time modulation for multiple-antenna communication in Rayleigh flat-fading," IEEE Trans. Inform. Theory, vol. 46, pp. 543-564, March 2000.
[31] B. Hassibi, B. M. Hochwald, A. Shokrollahi, and W. Sweldens, "Representation theory for high-rate multiple antenna code design," IEEE Trans. Inform. Theory, vol. 47, pp. 2335-2367, Sept. 2001.
[32] G. J. Foschini, "Layered Space-Time Architecture for Wireless Communication in a Fading Environment When using Multi-element Antennas," Bell Labs Tech Journal, vol. 1, pp. 41-59, Fall 1996.
[33] E. Biglieri and A. T. G. Taricc and, "Coding and signal processing for multiple-antenna transmission systems: A review," Seventh International Workshop on Digital Signal Processing Techniques for Space Communications, Oct 2001.
[34] G. J. Foschini, G. D. Golden, R. A. Valenzuela, and P. W. Wolniansky, "Simplified processing for high spectral efficiency wireless communications employing multi-element arrays," IEEE Journal Select. Areas Comm. JSAC, vol. 17, pp. 1841-1852, November 1999.
[35] L. Zheng and D. N. Tse, "Diversity and multiplexing: A fundamental tradeoff in multiple-antenna channels," IEEE Transactions on Information Theory, vol. 49, May 2003.
[36] B. Salzberg, "Performance of an efficient parallel data transmission system," IEEE Trans. Commun. Technol., vol. COM-15, pp. 805-813, Dec 1967.
[37] S. Weinstein and P. Ebert, "Data transmission by frequency-division multiplexing using the discrete Fourier transform," IEEE Trans. Commun. Technol., vol. COM-19, pp. 628-634, Oct 1971.
[38] A. Oppenheim and R. Schafer, Discrete-time signal processing. Prentice-Hall International, 1989.
[39] J. L. J. Cimini, "Analysis and simulation of a digital mobile channel using orthogonal frequency division multiplexing," IEEE Trans. on Commun., vol. COM-33, pp. 665-675, Jul 1985.
[40] T. Keller and L. Hanzo, "Adaptive modulation techniques for duplex OFDM transmission," IEEE Trans.
on Vehicular Technologies Commun., vol. 49, pp. 1893-1905, Sep. 2000.
[41] T. Pollet, M. Bladel, and M. Moeneclaey, "Sensitivity of OFDM systems to carrier frequency offset and Wiener phase noise," IEEE Transaction on Communications, vol. 43, pp. 191-193, Apr 1995.
[42] B. Sklar, Digital Communications: Fundamentals and Applications. Prentice Hall, 1988.
[43] A. J. Viterbi, CDMA, Principles of Spread Spectrum Communication. Addison Wesley, 1996.
[44] J. Heiskala and J. Terry, OFDM Wireless LANs: A theoretical and practical guide. Sams, 2002.
[45] A. N. D'Andrea and U. Mengali, Synchronization Techniques for Digital Receivers. Plenum Press, New York, 1997.
[46] V. K. Bhargava, D. Haccoun, R. Matyas, and P. P. Nuspl, Digital Communications by satellite: Modulation, Multiple Access, and Coding. Kreiger, 1991.
[47] L. M. C. Hoo, J. Tellado, and J. M. Cioffi, "Dual QoS loading algorithms for multicarrier systems offering different CBR services," Proc. of PIMRC, vol. 1, 1998.
[48] J. C. de Souza, "Discrete bit loading for multicarrier modulation systems," PhD Thesis, May 1999.
[49] J. A. C. Bingham, J. M. Cioffi, and P. S. Chow, "A practical discrete multitone transceiver loading algorithm for data transmission over spectrally shaped channels," IEEE Trans. On Communications, pp. 773-775, February 1995.
[50] D. Hughes-Hartogs, "Ensembled Modem Structure for Imperfect Transmission Media," U.S. Patent Notes 4679226, July 1987.
[51] A. Czylwik, "Adaptive OFDM for Wideband Radio Channels," Proc. of IEEE GlobeCom'96, vol. 1, pp. 713-718, 1996.
[52] R. F. H. Fischer and J. B. Huber, "A New Loading Algorithm for Discrete Multitone Transmission," Proc. of IEEE GlobeCom'96, pp. 724-728, February 1996.
[53] S. H. Muller, R. W. Ba, R. F. Fischer, and J. B. Huber, "OFDM with Reduced Peak to Average Power Ratio by Multiple Signal Representation," Annals of Telecommunications, vol. 52, no. 1/2, pp. 58-67, 1997.
[54] C.-L.
Liu, "The effect of nonlinearity on a QPSK-OFDM-QAM signal," IEEE Trans. on Consumer Electronics, vol. 43, no. 3, pp. 443-447, 1997.
[55] M. Sharif, M. G. Alkhansari, and B. H. Khalaj, "On the peak-to-average power of OFDM signals based on oversampling," IEEE Trans. on Comm., vol. 51, pp. 72-78, Jan. 2003.
[56] J. Tellado, Multicarrier Modulation with Low PAPR, Application to DSL and Wireless. Kluwer Academic Publishers, 2000.
[57] J. Gross and F. Fitzek, "Channel State Dependent Scheduling Policies for an OFDM Physical Layer Using a Binary State Model," Tech. Rep. TKN-01-009, Telecommunication Networks Group, Technische Universität Berlin, Jun 2001.
[58] D. Mestdagh and P. Spruyt, "A method to reduce the probability of DMT-based transceivers," IEEE Trans. on Comm., vol. 44, no. 10, pp. 1234-1238, 1996.
[59] J. Chow, J. Bingham, and J. Flowers, "Mitigating clipping noise in multi-carrier systems," Proceedings ICC97, Montreal, Canada, pp. 715-719, 1997.
[60] A. Gatherer and M. Polley, "Controlling clipping probability in DMT transmission," Proc. Conf. on Signals, Systems and Computers, Pacific Grove, CA, pp. 578-584, 1997.
[61] R. V. Nee and A. D. Wild, "Reducing the peak-to-average power ratio of OFDM," Proceedings VTC98, Ottawa, Canada, pp. 2072-2076, 1998.
[62] T. May and H. Rohling, "Reducing the peak-to-average power ratio in OFDM radio transmission systems," Proceedings VTC98, Ottawa, Canada, pp. 2474-2478, 1998.
[63] T. A. Wilkinson and A. E. Jones, "Minimization of the peak to mean envelope power ratio of multicarrier transmission schemes by block coding," IEEE 45th Vehicular Technology Conf., Chicago, IL, pp. 825-829, Jul. 1995.
[64] A. Kamerman and A. Krishnakumar, "OFDM encoding with reduced crest factors," Symp. On Comm. and Vehicular Technology in the Benelux, Louvain-La-Neuve, Belgium, pp. 182-186, 1994.
[65] M. Friese, "Multicarrier modulation with low peak-to-mean average power ratio," Electronic Lett., vol. 33, pp. 713-714, 1996.
[66] H. Ochiai and H. Imai, "Block Coding Scheme Based on Complementary Sequences for Multicarrier Signals," IEICE Trans. Fundamentals, pp. 2136-2143, Nov. 1997.
[67] X. Li and J. Ritcey, "M-sequences for OFDM peak-to-average power ratio reduction and error correction," Electronics Letters, vol. 33, no. 7, pp. 554-555, 1997.
[68] C. Schurgers and M. B. Srivastava, "A systematic approach to peak-to-average power ratio in OFDM," SPIE's 47th Annual Meeting, San Diego, CA, pp. 454-464, Jul 2001.
[69] S. H. Muller and J. B. Huber, "OFDM with Reduced Peak-to-Average Power Ratio by Optimum Combination of Partial Transmit Sequences," Electronic Lett., vol. 33, pp. 368-369, Feb. 1997.
[70] R. W. Bauml, R. F. H. Fischer, and J. B. Huber, "Reducing the Peak-to-Average Power Ratio of Multicarrier Modulation by Selective Mapping," Electronic Lett., vol. 32, pp. 2056-2057, Oct. 1996.
[71] P. V. Eetvelt, G. Wade, and M. Tomlinson, "Peak to average power reduction for OFDM schemes by selective scrambling," Electronics Letters, vol. 32, pp. 1963-1964, Jul 1996.
[72] J. Tellado and J. Cioffi, "Peak power reduction for multicarrier transmission," Proceedings Globecom98, Sydney, Australia, 1998.
[73] D. Jones, "Peak power reduction in OFDM and DMT via active channel modification," Proceedings of 1999 Asilomar Conference, Pacific Grove, CA, pp. 1076-1079, 1999.
[74] D. Everitt and D. Manfield, "Performance analysis of cellular mobile communication systems with dynamic channel assignment," IEEE Journal of Selected Areas in Comm., JSAC, vol. 7, pp. 1172-1180, Oct 1989.
[75] J. T. E. McDonnel and T. A. Wilkinson, "Comparison of Computational Complexity of Adaptive Equalization and OFDM for Indoor Wireless Networks," Proc. of Personal, Indoor and Mobile Radio Comm. (PIMRC), pp. 1088-1091, 1996.
[76] Y. Wu and W. Y. Zou, "Orthogonal Frequency Division Multiplexing: A Multicarrier Modulation Scheme," IEEE Trans. on Consumer Electronics, vol. 41, pp. 392-398, June 1995.
[77] J. A. C.
Bingham, "Multicarrier Modulation for Data Transmission: An Idea Whose Time has Come," IEEE Communication Magazine, pp. 5-14, May 1990.
[78] ETS300-401(1994), "Radio Broadcast Systems: Digital Audio Broadcasting (DAB) to Mobile, Portable and Fixed Receivers," http://www.ets.fr/, 1994.
[79] S. K. Lai, R. S. Cheng, K. Letaief, and R. D. Murch, "Adaptive Trellis Coded MQAM and Power Optimization for OFDM Transmission," Proc. of IEEE Vehicular. Tech. Conf., vol. 49, 1999.
[80] H. Sari, G. Karam, and I. Jeanclaude, "Transmission Techniques for Digital Terrestrial TV Broadcasting," IEEE Comm. Magazine, 1995.
[81] F. R. Farrokhi, L. Tassiulas, and K. J. R. Liu, "Joint Optimal Power Control and Beamforming in Wireless Networks Using Antenna Arrays," IEEE Trans. on Communications, vol. 46, pp. 1313-1324, October 1998.
[82] F. R. Farrokhi, K. J. R. Liu, and L. Tassiulas, "Transmit Beamforming and Power Control for Cellular Wireless Systems," IEEE Journal of Selected Areas in Communications, Special Issue on Signal Processing for Wireless Communications, vol. 16, pp. 1437-1450, October 1998.
[83] F. R. Gantmacher, The Theory of Matrices. Chelsea, New York, third ed., 1990.
[84] J. Zander, "Performance of Optimum Transmitter Power Control in Cellular Radio Systems," Proc. of IEEE Vehicular. Tech. Conf., vol. 41, pp. 57-62, February 1992.
[85] G. J. Foschini, "A Simple Distributed Autonomous Power Control Algorithm and its Convergence," Proc. of IEEE Vehicular. Tech. Conf., vol. 42, pp. 641-646, November 1993.
[86] J. Zander, "Distributed Cochannel Interference Control in Cellular Radio Systems," Proc. of IEEE Vehicular. Tech. Conf., vol. 41, pp. 305-311, August 1992.
[87] S. Haykin, Adaptive Filter Theory. Prentice Hall, third ed., 1996.
[88] E. Biglieri, D. Divsalar, P. J. McLane, and M. K. Simon, Introduction to Trellis Coded Modulation with Applications. Macmillan, 1991.
[89] G. Ungerboeck, "The State of the Art in Trellis Coded Modulation," Proc.
of the 5th Tirrenia Intl. workshop on Digital Communications, Tirrenia, Italy, September 1991. [90] A. Mehrotra, GSM System Engineering. Artech House Inc., Boston, 1996. [91] G. G. Raleigh and J. M. Cio?, \Spatio-Temporal Coding for Wireless Communica- tions," IEEE Transaction on Communications, vol. 46, no. 3, pp. 357{366, 1998. [92] I. E. Telatar, \Capacity of Multi-Antenna Gaussian Channels," Bell Labs Tech Journal, Tech Rep. num. BL0112170-0950615-07TM, 1995. 282 [93] T. L. Marzetta and B. Hochwald, \Capacity of a Mobile Multi-Antenna Communica- tion Link in Rayleigh Flat Fading," IEEE Transaction on Information Theory, vol. 45, pp. 139{157, January 1999. [94] D. Shiu, G. J. Foschini, M. J. Gans, and J. M. Kahn, \Fading Correlation and its Efiect on the Capacity of Multi-Element Antenna Systems," IEEE Transaction on Communications, vol. 48, pp. 502{513, Mar 2000. [95] W. Yu, W. Rhee, S. Boyd, and J. M. Cio?, \Iterative Water-Filling for Gaussian Vector Multiple Access Channels," IEEE International Symposium on Information Theory (ISIT2001), June 2001. [96] M. S. Alouini and A. J. Goldsmith, \Area spectral e?ciency of cellular mobile radio systems," Transaction on Vehic. Tech., vol. 48, pp. 1047{1066, July 1999. [97] J. Yang and S. Roy, \On joint transmitter and receiver optimization for multiple-input, multiple-output (MIMO) transmission systems," IEEE Transactions on Information Theory, vol. 42, Dec. 1994. [98] K. K. Wong, R. S. K. Cheng, K. B. Latief, and R. D. Murch, \Adaptive antennas at the mobile and base station in an OFDM/TDMA system," IEEE Transactions on Information Theory, vol. 49, Jan 2001. 283 [99] S. Serbetli and A. Yener, \Iterative transceiver optimization for multiuser MIMO sys- tems," Proceedings of 40th Allerton Conference on Communications, Control and Com- puting, Monticello, Illinois, Oct. 2002. [100] H. B?olcskei, D. Gesbert, and A. J. Paulraj, \On the capacity of ofdm-based spatial mul- tiplexing systems," IEEE Trans. 
on Communications, vol. 50, pp. 225{234, February 2002. [101] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Willey, 1990. [102] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, 1985. [103] J. R. T. Comption, Adaptive Antennas Concepts and Performance. New Jersey: Pren- tice Hall, 1988. [104] M. Osborne and A. Rubinstein, A course in Game Theory. Cambridge, MA: MIT Press, 1994. [105] D. Fudenberg and J. Tirole, Game Theory. Cambridge, MA: MIT Press, 1992. [106] H. B. Jovanovich, Linear Algebra and Its Application. 1988. [107] R. van Nee, \OFDM Codes for Peak-to-Average Power Reduction and Error Correc- tion," Proc. of IEEE Globecom?96, pp. 740{744, 1996. 284 [108] R.Dinis, P.Montezuma, and A.Gusmao, \PerformanceTrade-ofis with Quasi-Linearly Amplifled OFDM through a Two Branch Combining Technique," Proc. of IEEE Ve- hicular. Tech. Conf., pp. 899{903, May 1996. [109] X. Li and L. J. Cimini, \Efiects of Clipping and Filtering on the Performance of OFDM," Proc. of IEEE Vehicular. Tech. Conf., pp. 1634{1638, May 1997. [110] D. Wulich, \Reduction of Peak to Mean Power Ratio of Multicarrier Modulation Using Cyclic Coding," Electronic Lett., vol. 32, pp. 432{433, Feb. 1996. [111] H. Ochiai and H. Imai, \Block Coding Scheme Based on Complementary Sequences for Multicarrier Signals," Proc. of IEEE Int?l. Conf. on Comm. (ICC), 1998. [112] K. G. Paterson, \Generalized Reed-Muller Codes and Power Control in OFDM Mod- ulation," IEEE Trans. on Comm., vol. 46, pp. 104{120, Jan. 2000. [113] K. G. Paterson and A. E. Jones, \E?cient Decoding Algorithms for Generalized Reed- Muller Codes," IEEE Trans. on Comm., vol. 48, p. 1272, Aug. 2000. [114] K. G. Paterson, \Coding Techniques for Power Controlled OFDM," Proc.of 9th Intl. Symp.on Personal, Indoor and Mobile Radio Comm. (PIMRC ?98), vol. 2, pp. 801{805, 1998. [115] K. G. Paterson and V. Tarokh, \On the Existence and Construction of Good Codes with Low Peak-to-Average Power Ratios," IEEE Trans. 
on Information Theory, vol. 46, pp. 1974{1987, 2000. 285 [116] A.E.Jones, T.A.Wilkinson, andS.K.Barton, \BlockCodingSchemeforReductionof Peak to Mean Envelope Power Ratio of Multicarrier Transmission Scheme," Electronic Lett., vol. 47, pp. 2098{2099, Dec. 1994. [117] C. R?o?ing and V. Tarokh, \A Construction of OFDM 16-QAM Sequences Having Low Peak Powers," IEEE Trans. on Information Theory, vol. 47, pp. 2091{2093, Jul. 2001. [118] J. A. Davis and J. Jedwab, \Peak-to-Mean Power Control in OFDM, Golay Comple- mentary Sequences, and Reed-Muller Codes.," IEEE Trans. on Information Theory, vol. 45, pp. 2397{2417, Nov. 1999. [119] E. Lawrey and C. J. Kikkert, \Peak to Average Power Ratio Reduction of OFDM Signals Using Peak Reduction Carriers (PRC)," the Fifth Intl. Symp. on Signal Pro- cessing and its Applications. Brisbane, Queensland, Australia (ISSPA?99), vol. 542, pp. 737{740, Aug. 1999. [120] A. E. Jones and T. A. Wilkinson, \Combined Coding for Error Control and Increased Robustness to System Nonlinearties in OFDM," IEEE 46th Vehicular Tech. Conf., Atlanta, GA, pp. 904{908, Apr/May. 1996. [121] V. Tarokh and H. Jafarkhani, \On the computation and reduction of the peak to average power ratio in multicarrier communications," IEEE Transactions on Commu- nications, vol. 48, pp. 37{44, Jan. 2000. 286 [122] M. J. E. Golay, \Complementary Series," IRE Trans. on Information Theory, pp. 82{ 87, Apr. 1961. [123] K. G. Paterson, \On the Codes with Low Peak-to-Average Power Ratio for Multi-Code CDMA," IEEE International Symposium on Information Theory (ISIT), pp. 49{49, 2002. [124] C. V. Chong and V. Tarokh, \Two constructions of 16-QAM Golay Complementary Sequences," http://www.mit.edu/ vahid. [125] A. J. Grant and R. D. V. Nee, \E?cient Maximum-Likelihood Decoding of Q-ary Modulated Reed-Muller Codes," IEEE Trans. on Comm., vol. 2, pp. 134{136, May 1998. [126] K. G. Paterson and A. E. Jones, \E?cient Decoding Algorithms for Generlized Reed- Muller Codes," IEEE Trans. 
on Comm, vol. 48, pp. 1272{1285, August 2000. [127] M. M. Mano, Digital Design. Third Edition, Prentice Hall, New Jersey, 2002. [128] J. L. Massey, \The Ubiquity of Reed-Muller Codes," Applied Algebra, Algebraic Algo- rithms and Error-Correcting Codes (Eds. S. Boztas and I. E. Shparlinski), New York, Springer, no. 2227, pp. 1{12, 2001. [129] W. W. Peterson and E. J. W. Jr., Error Correcting Codes. Cambridge: MIT Press, 2 ed., 1972. 287 [130] L. Yanyu, Z. Zhifei, and T. Pinghui, \Equivalent bandwidth estimation under delay constrains," IEEE International Conference on Communication Technology Proceed- ings, 2000. WCC - ICCT 2000, vol. 2, pp. 1465 {1469, 2000. [131] P. Bhagwat, A. Krishna, and S. Tripathi, \Enhancing Throughput over Wireless LANs Using Channel State Dependent Packet Scheduling," Proceedings of IEEE INFO- COM96, vol. 2, pp. 1133{1140, Mar 1996. [132] C. Fragouli, \Controlled Multimedia Wireless Link Sharing via Enhanced Class-based Queuing with Channel-State Dependent Packet Scheduling," Proceedings of IEEE IN- FOCOM?98, vol. 2, pp. 572{580, Mar 1998. [133] Y. Cao, V. O. K. Li, and Z. Cao, \Scheduling Delay Sensitive and Best Efiort Tra?c in Wireless Networks," Proc. of IEEE Int?l. Conf. on Comm. (ICC), May 2003. [134] H. Rohling and R. Grunheid, \Performance comparison of difierent multiple access schemes for downlink of an ofdm communication system," IEEE 47th Vehicular Tech- nology Conference, vol. 3, pp. 1365{1369, May 1997. [135] W. Rhee and J. Cio?, \Increase in Capacity of Multiuser OFDM System Using Dy- namic Subchannel Allocation," Proc. Vehicular Technology Conference (VTC), p. 1085 1089, 2000. 288 [136] I. Koutsopoulos and L. Tassiulas, \Channel State Adaptive Techniques for Throughput Enhancement in Wireless Broadband Networks," Proceedings of IEEE INFOCOM, pp. 757{766, 2001. [137] G. J. Pottie, \System design choices in personal communications," IEEE Personal Communication, vol. 2, pp. 50 {67, Oct. 1995. [138] B. McFarland, G. Chesson, C. 
Temme, and T. Meng, \The 5-UPTM protocol for unifled multi-service wireless networks," IEEE Communications Magazine, pp. 74{80, November 2001. [139] G. Li and H. Liu, \Dynamic resource allocation with flnite bufier constraint in broad- band OFDMA networks," Proc. of IEEE Conf. on Acoustics, Speech, and Signal Pro- cessing, ICASSP ?03, vol. 4, pp. 197{200, April 2003. [140] Y. W. Cheong, R. S. Cheng, K. B. Lataief, and R. D. Murch, \Multiuser OFDM with adaptive subcarrier, bit, and power allocation," IEEE Journal on Selected Areas in communication, vol. 17, pp. 1747{1758, Oct. 1999. [141] K. Inhyoung, L. L. Hae, K. Beomsup, and Y. H. Lee, \On the use of linear program- ming for dynamic subchannel and bit allocation in multiuser OFDM," IEEE Global Telecommunications Conference, vol. 6, pp. 3648 {3652, 2001. 289 [142] A. C. Kam and K. Y. Siu, \Linear Complexity Algorithms for Bandwidth Reser- vatations and Delay Guarantees in Input Queued Switches with no Speedup," Proc. International Conference on Network Protocols 98, Austin TX, pp. 2{11, Oct 1998. [143] M. Grossglauser and D. Tse, \Mobility Increases the Capacity of Wireless Adhoc Networks," Proceedings of IEEE INFOCOM?01, Apr 2001. [144] D. P. Bertsekas, Dynamic Programming and Optimal Control. Athena Scientiflc, 1995. [145] S. M. Ross, Stochastic Dynamic Programming. Academic Press, 1983. [146] P. P. Varaiya, Notes on Optimization. New York: Van Nostrand Reinhold, 1972. [147] H. L. Royden, Real Analysis. New York: McMillan, 1968. [148] M. Alasti, F. R. Farrokhi, M. Olfat, and K. J. R. Liu, \Service Level Agreement (SLA) Based Scheduling Algorithms for Wireless Networks," Submitted to International Con- ference on Communications, ICC, 2004. [149] W. Su, Z. Safar, M. Olfat, and K. J. R. Liu, \Obtaining Full Diversity Space Frequency Codes from Space-Time Codes via Mapping," IEEE Trans. 
on Signal Processing, spe- cial issue on Signal Processing for Multiple-Input Multiple-Output (MIMO) Wireless Communications systems, Dec 2002, vol. 51, pp. 2905{2916, Nov. 2003. 290