ABSTRACT Title of dissertation: MULTIMEDIA SOCIAL NETWORKS: GAME THEORETIC MODELING AND EQUILIBRIUM ANALYSIS Yan Chen, Doctor of Philosophy, 2011 Dissertation directed by: Professor K. J. Ray Liu Department of Electrical and Computer Engineering Multimedia content sharing and distribution over multimedia social networks is more popular now than ever before: we download music from Napster, share our images on Flickr, view user-created video on YouTube, and watch peer-to-peer tele- vision using Coolstreaming, PPLive and PPStream. Within these multimedia social networks, users share, exchange, and compete for scarce resources such as multime- dia data and bandwidth, and thus in uence each other?s decision and performance. Therefore, to provide fundamental guidelines for the better system design, it is important to analyze the users? behaviors and interactions in a multimedia social network, i.e., how users interact with and respond to each other. Game theory is a mathematical tool that analyzes the strategic interactions among multiple decision makers. It is ideal and essential for studying, analyzing, and modeling the users? behaviors and interactions in social networking. In this thesis, game theory will be used to model users? behaviors in social networks and analyze the corresponding equilibria. Speciflcally, in this thesis, we flrst illustrate how to use game theory to analyze and model users? behaviors in multimedia social networks by discussing the following three difierent scenarios. In the flrst scenario, we consider a non-cooperative multimedia social network where users in the social network compete for the same resource. We use multiuser rate allocation social network as an example for this scenario. In the second scenario, we consider a cooperative multimedia social network where users in the social network cooperate with each other to obtain the content. We use cooperative peer-to-peer streaming social network as an example for this scenario. In the third scenario, we consider how to use the indirect reciprocity game to stimulate cooperation among users. We use the packet forwarding social network as an example. Moreover, the concept of \multimedia social networks" can be applied into the fleld of signal and image processing. If each pixel/sample is treated as a user, then the whole image/signal can be regarded as a multimedia social network. From such a perspective, we introduce a new paradigm for signal and image processing, and develop generalized and unifled frameworks for classical signal and image prob- lems. In this thesis, we use image denoising and image interpolation as examples to illustrate how to use game theory to re-formulate the classical signal and image processing problems. MULTIMEDIA SOCIAL NETWORKS: GAME THEORETIC MODELING AND EQUILIBRIUM ANALYSIS by Yan Chen Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulflllment of the requirements for the degree of Doctor of Philosophy 2011 Advisory Committee: Professor K. J. Ray Liu, Chair/Advisor Professor Min Wu Professor Rama Chellappa Professor Shuvra S. Bhattacharyya Professor Lawrence C. Washington c Copyright by Yan Chen 2011 Dedication To my family. ii Acknowledgments This thesis is the outcome of a wonderful four-year study and research expe- rience in Signal and Information Group (SIG) at University of Maryland, College Park. Here, I would like to take this opportunity to express my sincere apprecia- tion and gratitude to my advisor, Prof. K. J. Ray Liu, for his guidance, patience, understanding, and encouragement. During these four years, he inspired me and gave me freedom to work on many exciting cutting-edge research topics. His vision, creativeness, and enthusiasm not only have fueled my current research but also have a lasting in uence in my professional and personal development as a role model. I would like to thank Prof. Min Wu and Prof. Rama Chellappa for not only serving on my thesis committes and reviewing my thesis but also serving as references for my faculty application. I am also grateful to Prof. Shuvra S. Bhattacharyya and Prof. Lawrence C. Washington for their time and efiort serving on my committees and reviewing my thesis. I would also like to thank Prof. Oscar Au, Prof. Andre L. Tits, and Prof. Peter Cramton for serving as references for my faculty application. Thanks should also go to all members in our Signals and Information Group for their friendship, encouragement, and help. I have learned a lot from them for which I am deeply grateful. Special thanks go to Dr. Beibei Wang, Dr. Yongle Wu, Dr. Wan-Yi Lin, Mr. Matt Stamm, Dr. Steve Tjoa, and Dr. Quoc Lai for our research discussions that were all inspiring for me. My deepest gratitude goes to my wife Yang. She is always there for me, iii giving me her love and support. We share all things happened in life, and discuss the problems on research. I would also like to thank my father, my sister and my parents in law, for their constant love, comfort, support and encouragement. Finally, I would like to dedicate my thesis to my mother. I really hope that my thesis would make her proud. I would never forget her, and she is always in my heart. iv Table of Contents List of Tables viii List of Figures ix 1 Introduction 1 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Related Works on Social Networks . . . . . . . . . . . . . . . . . . . 3 1.3 Dissertation Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3.1 An Overview of Game Theory (Chapter 2) . . . . . . . . . . . 7 1.3.2 Multiuser Rate Allocation Social Networks (Chapter 3) . . . . 7 1.3.3 Peer-to-Peer Cooperative Video Streaming Social Networks (Chapter 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.3.4 Cooperation Stimulation Using Indirect Reciprocity Game Mod- eling (Chapter 5) . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.3.5 Image Denoising Games (Chapter 6) . . . . . . . . . . . . . . 10 1.3.6 Simultaneous Image Denoising and Interpolation Using Evo- lutionary Games (Chapter 7) . . . . . . . . . . . . . . . . . . 11 2 An Overview of Game Theory 13 2.1 Nash Equilibrium and Pareto Optimality . . . . . . . . . . . . . . . . 14 2.2 Auction Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.3 Evolutionary Games . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.4 Coalition Formation Games . . . . . . . . . . . . . . . . . . . . . . . 18 2.5 Stochastic Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3 Multiuser Rate Allocation Social Networks 22 3.1 The Game-Theoretic Framework . . . . . . . . . . . . . . . . . . . . . 26 3.1.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.1.2 Video Distortion-Rate Model . . . . . . . . . . . . . . . . . . 27 3.1.3 User?s Utility Function . . . . . . . . . . . . . . . . . . . . . . 28 3.1.4 Multi-User Rate Allocation Game . . . . . . . . . . . . . . . . 31 3.2 Analysis of The Multi-User Rate Allocation Game . . . . . . . . . . . 32 3.2.1 Non-E?cient Rate Allocation (a > a0) . . . . . . . . . . . . . 32 3.2.2 E?cient Rate Allocation (0 ? a < a0) . . . . . . . . . . . . . . 33 3.2.3 E?cient and Proportionally Fair in Both Utility and PSNR (a = a0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 3.3 Relation To The Traditional Optimization-Based Approach . . . . . . 37 3.4 Clock Auction For Distributed Cheat-Proof Optimal Rate Allocation 38 3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.5.1 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . 46 3.5.2 Multi-User Rate Allocation . . . . . . . . . . . . . . . . . . . 49 3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 v 4 Peer-to-Peer Cooperative Video Streaming Social Networks 55 4.1 The System Model and Utility Function . . . . . . . . . . . . . . . . 59 4.1.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 4.1.2 Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . 61 4.2 Agents Selection Within A Homogeneous Group . . . . . . . . . . . . 64 4.2.1 Centralized Agent Selection . . . . . . . . . . . . . . . . . . . 64 4.2.2 Distributed Agent Selection . . . . . . . . . . . . . . . . . . . 65 4.2.3 Evolutionary Cooperative Streaming Game . . . . . . . . . . . 66 4.2.4 Analysis of the Cooperative Streaming Game . . . . . . . . . 67 4.3 Agents Selection Within A Heterogeneous Group . . . . . . . . . . . 74 4.3.1 Two-Player Game . . . . . . . . . . . . . . . . . . . . . . . . . 75 4.3.2 Multi-Player Game . . . . . . . . . . . . . . . . . . . . . . . . 77 4.4 A Distributed Learning Algorithm For ESS . . . . . . . . . . . . . . . 78 4.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 5 Cooperation Stimulation Using Indirect Reciprocity Game Modeling 89 5.1 The System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5.1.1 Social Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 5.1.2 Action Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 5.2 Optimal Action Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 5.2.1 Reputation Updating Policy . . . . . . . . . . . . . . . . . . . 98 5.2.2 Stationary Reputation Distribution . . . . . . . . . . . . . . . 101 5.2.3 Payofi Function . . . . . . . . . . . . . . . . . . . . . . . . . . 102 5.2.4 Optimal Action Using An Alternative Algorithm . . . . . . . 104 5.3 Action Spreading Due To Natural Selection . . . . . . . . . . . . . . 105 5.3.1 Action Spreading Algorithm Using Wright Fisher Model . . . 106 5.3.2 Action Spreading Algorithm Using Replicator Dynamic Equa- tion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 5.4 Evolutionarily Stable Strategy and Simulations . . . . . . . . . . . . 108 5.4.1 Binary Reputation Scenario . . . . . . . . . . . . . . . . . . . 109 5.4.2 Multi-Level Reputation Scenario . . . . . . . . . . . . . . . . . 113 5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 6 Image Denoising Games 119 6.1 The System Model and Coalition Formation Game . . . . . . . . . . 122 6.1.1 The System Model . . . . . . . . . . . . . . . . . . . . . . . . 122 6.1.2 The Coalition Formation Game . . . . . . . . . . . . . . . . . 123 6.2 Candidate Set Selection . . . . . . . . . . . . . . . . . . . . . . . . . 125 6.3 Game Theoretical Problem Formulation . . . . . . . . . . . . . . . . 128 6.3.1 Stein?s unbiased risk estimate (SURE) . . . . . . . . . . . . . 128 6.3.2 Confldence and Distortion Trade-ofi . . . . . . . . . . . . . . . 132 6.3.2.1 Confldence . . . . . . . . . . . . . . . . . . . . . . . 132 6.3.2.2 Distortion . . . . . . . . . . . . . . . . . . . . . . . . 133 6.3.2.3 Confldence and Distortion Trade-ofi . . . . . . . . . 133 vi 6.3.3 Utility Function and Solution to the Game . . . . . . . . . . . 134 6.4 Relation to the Traditional Approaches . . . . . . . . . . . . . . . . . 139 6.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 7 Simultaneous Image Denoising and Interpolation Using Evolutionary Games151 7.1 Evolutionary Game Model . . . . . . . . . . . . . . . . . . . . . . . . 156 7.2 Image Denosing and Interpolation as an Evolutionary Game . . . . . 158 7.2.1 Problems of Denoising+Interpolation . . . . . . . . . . . . . . 158 7.2.2 Game Theoretic Formulation . . . . . . . . . . . . . . . . . . 160 7.2.3 Pure Strategy Set ?ij . . . . . . . . . . . . . . . . . . . . . . . 162 7.2.4 Payofi Function . . . . . . . . . . . . . . . . . . . . . . . . . . 163 7.3 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 8 Conclusions and Future Work 177 8.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 8.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 Bibliography 182 vii List of Tables 3.1 ?i , fl?i , Rmini (kb=s), and Rmaxi (kb=s) for difierent sequence by training. 47 4.1 Utility table of a two-player game. . . . . . . . . . . . . . . . . . . . . 74 viii List of Figures 3.1 System Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.2 The visual quality of Foreman sequence at difierent PSNR level: (a) 33dB; (b) 34dB; (c) 40dB; (d) 41dB. . . . . . . . . . . . . . . . . . . 30 3.3 Illustration for the proof of Lemma 2: (a) If ~Rit ? Rti andRL1i > Rmini , we can see that [ln( i + fliRL1i ) ? @ ln( i+fliRi)@Ri jRi=RL1i (R L1 i ? RL2i )] ? ln( i + fliRL2i ); (b) If ~Ri t > Rti and RL1i < Rmaxi , we can see that ln( i + fliRL1i ) ? [ln( i + fliRL2i )? @ ln( i+fliRi)@Ri jRi=RL1i (R L2 i ?RL1i )]. . . 43 3.4 Training and fl: (a) Football; (b) Coastguard. . . . . . . . . . . . . 48 3.5 Allocated rates for Akiyo, Carphone, Coastguard, Foreman, Table, Football, and Mobile using difierent methods. . . . . . . . . . . . . . 49 3.6 The sum of PSNR vs. the available network bandwidth R. . . . . . . 50 3.7 Cheat-proof performance: (a) AFD, AFR, and MSPSNR; (b) Pro- posed Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 4.1 A cooperative streaming example. . . . . . . . . . . . . . . . . . . . . 60 4.2 The deceasing property of wi. . . . . . . . . . . . . . . . . . . . . . . 68 4.3 The social welfare comparison among Non-Coop, MSW-C, MSW- D, and ESS-D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.4 Behavior dynamic of a homogeneous group of peers. . . . . . . . . . . 82 4.5 Behavior dynamic of a heterogeneous group of peers. . . . . . . . . . 83 4.6 The probability of real-time streaming comparison between Non- Coop and ESS-D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.7 The visual quality comparison: (a) Non-Coop; (b) ESS-D with N=2; (c) ESS-D with N=3; (d) ESS-D with N=4. . . . . . . . . . . 85 4.8 Single-source rate comparison between Non-Coop and ESS-D. . . . 86 4.9 Multi-source rate comparison between Non-Coop and ESS-D. . . . 87 4.10 The social welfare comparison between Non-Coop and ESS-D when the utility function is deflned as (4.42). . . . . . . . . . . . . . . . . . 88 5.1 System model. Within every interaction, a pair of transmitter and re- ceiver is randomly sampled from the population. Then, the transmit- ter will forward a certain amount of packets to the receiver according to the receiver?s and his/her own reputations. After the transmission, the transmitter?s reputation will be updated by the receiver and the observers. Finally, the transmitter?s reputation is propagated to the whole population from the receiver and the observers through a noisy gossip channel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 5.2 Reputation updating policy. . . . . . . . . . . . . . . . . . . . . . . . 98 5.3 The population evolution when L = 1, g = 1 and c = 0:1: (a) the percentage of the population with reputation L = 1; (b) the percentage of the population using optimal action shown in (5.24). . . 108 ix 5.4 The stable region for the optimal action rule shown in (5.24) when L = 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 5.5 The population evolution when L = 1, g = 1 and c = 0:6: (a) the percentage of the population with reputation L = 1; (b) the percentage of the population using optimal action shown in (5.24). . . 111 5.6 The stable region for the optimal action rule shown in (5.26) when L = 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 5.7 The population evolution when L = 4, g = 1 and c = 0:5: (a) the percentage of the population with reputation L = 4; (b) the percentage of the population using optimal action shown in (5.26). . . 114 5.8 The population evolution when L = 4, g = 1 and c = 0:7: (a) the percentage of the population with reputation L = 4; (b) the percentage of the population using optimal action shown in (5.26). . . 115 6.1 An example of optimal candidate set with an edge region: (a) original image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) the ideally optimal candidate set of (b) when the original signal is available; (g) the ideally optimal candidate set of (c) when the original signal is available; (h) the ideally optimal candidate set of (d) when the original signal is available. . . . . . . . . . . . . . . . . 127 6.2 An example of optimal candidate set with a smooth region: (a) origi- nal image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) the ideally optimal candidate set of (b) when the original signal is available; (g) the ideally optimal candidate set of (c) when the orig- inal signal is available; (h) the ideally optimal candidate set of (d) when the original signal is available. . . . . . . . . . . . . . . . . . . . 129 6.3 The optimal m? for lena image when = 10. . . . . . . . . . . . . . . 130 6.4 The trade-ofi between the confldence term C and the distortion term D: (a) the performance of C with difierent N ; (b) the performance of D with difierent N . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 6.5 Some possible gain functions. . . . . . . . . . . . . . . . . . . . . . . 134 6.6 The four tested images: (a) Lena; (b) Barbara; (c) Boat; (d) Flinstones.135 6.7 An example of optimal candidate set with an edge region: (a) original image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) (g) and (h) are the ideally optimal candidate sets of (b) (c) and (d) when the original signal is available; (i) (j) and (k) are the candidate sets of (b) (c) and (d) generated by the proposed method. . . . . . . 144 x 6.8 An example of optimal candidate set with a smooth region: (a) origi- nal image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) (g) and (h) are the ideally optimal candidate sets of (b) (c) and (d) when the original signal is available; (i) (j) and (k) are the candidate sets of (b) (c) and (d) generated by the proposed method. . . . . . . 145 6.9 The PSNR comparison for difierent images: (a) Lena; (b) Barbara; (c) Boat; (d) Flinstones. . . . . . . . . . . . . . . . . . . . . . . . . . 146 6.10 The visual quality comparison for Flinstones with = 25: (a) origi- nal image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. . . . . . 147 6.11 The visual quality comparison for Barbara with = 35: (a) origi- nal image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. . . . . . 148 6.12 The visual quality comparison for Lena with = 45: (a) original im- age; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. . . . . . . . . . . 149 6.13 The visual quality comparison for Boat with = 20: (a) original im- age; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. . . . . . . . . . . 150 7.1 (a) a region of Lena image that is corrupted with Gaussian noise with noise variance 2 = 25; (b) the interpolation result using bicubic; (c) the interpolation result using SAI. . . . . . . . . . . . . . . . . . . . . 153 7.2 (a) Denoising before interpolation; (b) Simultaneous denoising and interpolation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 7.3 (a) a region of Lena image that is corrupted by Gaussian noise with noise variance 2 = 225; (b) the denoised result using nonlocal; (c) the interpolation result of (b) using bicubic; (d) the interpolation result of (b) using SAI. . . . . . . . . . . . . . . . . . . . . . . . . . . 159 7.4 The pure strategy set ?ij for Ih(2m?1; 2n?1); Ih(2m?1; 2n); Ih(2m; 2n? 1); Ih(2m; 2n). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 7.5 The block diagram of the proposed method. . . . . . . . . . . . . . . 165 7.6 The test images: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07; (e) Kodim09. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 7.7 The PSNR comparison: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07; (e) Kodim09. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 7.8 The visual quality comparison for Lena: (a) the noisy low resolution image with = 10; (b) the result generated by Nonlocal+Bicubic (30.71dB); (c) the result generated by Nonlocal+SAI method (31.11dB); (d) the result generated by the proposed method (31.38dB). . . . . . 172 7.9 The visual quality comparison for Boat: (a) the noisy low resolu- tion image with = 5; (b) the result generated by Nonlocal+Bicubic (28.52dB); (c) the result generated by Nonlocal+SAI method (28.92dB); (d) the result generated by the proposed method (28.74dB). . . . . . 173 xi 7.10 The visual quality comparison for Kodim07: (a) the noisy low resolu- tion image with = 10; (b) the result generated by Nonlocal+Bicubic (29.85dB); (c) the result generated by Nonlocal+SAI method (30.18dB); (d) the result generated by the proposed method (30.35dB). . . . . . 174 7.11 The visual quality comparison for Kodim09: (a) the noisy low resolu- tion image with = 15; (b) the result generated by Nonlocal+Bicubic (29.56dB); (c) the result generated by Nonlocal+SAI method (29.81dB); (d) the result generated by the proposed method (30.41dB). . . . . . 175 7.12 The visual quality comparison for Kodim03: (a) the noisy low resolu- tion image with = 20; (b) the result generated by Nonlocal+Bicubic (27.85dB); (c) the result generated by Nonlocal+SAI method (28.03dB); (d) the result generated by the proposed method (28.37dB). . . . . . 176 xii Chapter 1 Introduction 1.1 Motivation Multimedia content sharing and distribution over multimedia social networks is more popular now than ever before: we download music from Napster [3], share our images on Flickr [2], view user-created video on YouTube [9], and watch peer- to-peer television using Coolstreaming [133], PPLive [4] and PPStream [5]. Within these multimedia social networks, users share, exchange, and compete for scarce resources such as multimedia data and bandwidth, and thus in uence each other?s decision and performance. Therefore, to provide fundamental guidelines for the better system design, it is important to analyze the users? behaviors and interactions in a multimedia social network, i.e., how users interact with and respond to each other. Unlike generic data applications, multimedia applications have time-varying bandwidth requirements, stringent delay deadlines and dynamic characteristics. To enable users in a multimedia social network to successfully participate in the resource competition, the uniquely scalable and delay-sensitive characteristics of multimedia data and the resulting impact on users viewing experiences of multimedia content should be explicitly involved in the system design. In multimedia social networks, users are intelligent and have the ability to 1 observe, learn, and make intelligent decisions. Since users usually belong to difier- ent authorities and pursue difierent goals, they will choose the strategies that can maximize their own payofis. In such a case, traditional centralized optimization- based approaches no longer work well since they only consider the e?ciency of the whole system while totally ignore the fairness among users, which is an even more important issue in multimedia social networks. To better design the system, not only the e?ciency issue from the system designers? perspective but also the fairness issue from the users? perspective should be taken into accout. Moreover, since users in multimedia social networks are rational and thus naturally selflsh, they tend to over-claim what they may need and will not truly report their private information if cheating can improve their payofis. Therefore, enforcing truth-telling is crucial in multimedia social networks. From the above discussions, we can see that the behavior dynamics among users in a multimedia social network are very complex. To understand the users? complex behavior dynamics and thus lead to a better system design, game theory is a powerful mathematical tool that analyzes the strategic interactions among multiple decision makers [97]. It has been developed for understanding cooperation and con ict between individuals in many flelds such as economics, politics, business, social sciences and biology. Thus, game theory is ideal and essential for studying, analyzing, and modeling the users? behaviors and interactions in social networking. Recently, it draws great attentions in cognitive networking [30] [123] and multimedia signal processing [27]. In this thesis, we will illustrate how game theory can be used to model users? behaviors in various multimedia social networks and analyze the 2 corresponding equilibria. 1.2 Related Works on Social Networks A social network is a social structure made of individuals and/or organizations called \nodes", which are connected with each other by certain types of interdepen- dency, such as friendship, kinship, flnancial exchange, con ict, trade, etc. Many methodologies have been studied to formulate the relationships among members at all scales, from interpersonal to international, and social network analysis be- comes a popular topic in sociology, economics, information science and many other disciplines. Most of the existing works on social networks fall into the following three cat- egories [80]: (1) social network properties, (2) social network models, (3) social net- work dynamics and evolution. In [17] [43], the authors showed that the vertex con- nectivities in many large networks follow a scale-free power-law distribution. Such a property is found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. Another important property of social networks is the \small-world" phenomenon. As pointed out in [125] [18], most real-world networks exhibit relatively small diameter, i.e., the networks are highly clustered. Besides the study of the social network properties, there are quite a lot of work on building models for social networks. The simplest model is the random graph 3 model introduced in [37], where given a number of nodes, each pair of nodes has an identical and independent probability of being joined by an edge. However, since it fails to match the real-world social network properties, e.g., it does not produce power law degree distributions, this model is not realistic. A better model that can produce power law degree distributions is the preferential attachment [38] [44] [10], where when a new node u arrives to the network, the probability of connecting to a node v is proportional to the degree of v. Another model that can also produce power law degree distributions is the copying model [103], where a new node joins the networks by uniformly creating random edges or flrst random choosing a node u and then linking to u0s neighbors. Another important research topic in the fleld of social network is the study of social network dynamics and evolution where the researchers study how the social network evolve and how information spread over the networks. Many works have been done to investigate the dynamics and evolution of difierent networks, e.g., trendsetters selecting in viral marketing [49], inoculation targets identiflcation in epidemiology [92], and studying trends in blogosphere [102]. However, most of these existing works study and analyze the social networks at the macroeconomic level, i.e., from system designer?s perspective. Few efiorts have been made to investigate the social networks at the microeconomic level, i.e., from the users? perspective, which is an important issue in social network analysis since users may only care about their own objectives and their decisions greatly afiect the evolution and performance of the social networks. In this thesis, we will study and analyze the social networks from the users? perspective by modeling users? behaviors 4 and interactions using game theory. 1.3 Dissertation Outline Since users in difierent multimedia social networks may have difierent types of interdependency, to efiectively model the users? behaviors and interactions, difierent game models for difierent multimedia social networks should be employed. The two most common types of users? interdependency in multimedia social networks are competition and cooperation, which leads to non-cooperative social networks and cooperative social networks, respectively. In cooperative social networks, since users are rational thus naturally selflsh, they will not cooperate with others unless cooperation can improve their own performance. Therefore, one important issue in cooperative social networks is cooperation stimulation. Without loss of generality, in this thesis, we flrst illustrate how to use game theory to analyze and model users? behaviors in multimedia social networks by discussing the following three difierent scenarios: ? In the flrst scenario, we consider a non-cooperative multimedia social network where users in the social network compete for the same resource. We use multiuser rate allocation social network [29] as an example for this scenario and the details will be described in Chapter 3. ? In the second scenario, we consider a cooperative multimedia social network where users in the social network cooperate with each other to obtain the con- tent. As discussed in Chapter 4, we will use cooperative peer-to-peer streaming 5 social network [34] as an example for this scenario. ? In the third scenario, we consider how to use the indirect reciprocity game to stimulate cooperation among users. In Chapter 5, we will use the packet forwarding social network [28] as an example. Moreover, the concept of \multimedia social networks" can be applied into the fleld of signal and image processing. Although there are seemingly no human factors involved in the algorithmic solution in classical signal/image processing, if we take the view that the pixels/signals of an image are forming a notion of a \social network" to jointly interact to accomplish a common (\processing") goal, be it flltering, denoising, or segmentation, then the game theoretic approach can ofier new views beyond what classical methods can. This completely changes the traditional thinking that we have to decide what a pixel does instead of simply giving some generic rules/guidelines and let the pixels themselves interact/cooperte to decide the best \strategy". From such a perspective, we introduce a new paradigm for signal and image processing, and develop generalized and unifled frameworks for classical signal and image problems. In this thesis, we use image denoising (Chapter 6) and simultaneous image denoising and interpolation (Chapter 7) as examples to illustrate how to use game theory to re-formulate the classical signal and image processing problems. The rest of the dissertation is organized as follows. 6 1.3.1 An Overview of Game Theory (Chapter 2) Since game theory has been recognized as an important tool in studying, mod- eling, and analyzing the interaction process among multiple decision makers, in this chapter, we present an overview of some fundamental concepts of game theory that will be used in this thesis. 1.3.2 Multiuser Rate Allocation Social Networks (Chapter 3) In multiuser rate allocation problem, a set of transmitters want to transmit the video sequences to corresponding receivers through a common channel that is shared by all transmitters. Since the transmitters compete for the same resource, i.e., channel bandwidth, they form a non-cooperative social network. The key problem in this social network is how to e?ciently and fairly allocate data rate among difierent users. Most of the existing optimization-based methods, such as minimizing the weighted sum of the distortions or maximizing the weighted sum of the peak signal- to-noise ratios (PSNRs), have their weights heuristically determined. Moreover, those approaches mainly focus on the e?ciency issue while there is no notion of fairness. In this chapter, we address this problem by proposing a game-theoretic framework, in which the utility function of each user is jointly determined by the characteristics of the transmitted video sequence and the allocated bit-rate. We show that a unique Nash equilibrium (NE), which is proportionally fair in terms of both utility and PSNR, can be obtained, according to which the controller can e?ciently and fairly allocate the available network bandwidth to the users. Moreover, we 7 propose a distributed cheat-proof rate allocation scheme for the users to converge to the optimal NE using alternative ascending clock auction. We also show that the traditional optimization-based approach that maximizes the weighted sum of the PSNRs is a special case of the game-theoretic framework with the utility function deflned as an exponential function of PSNR. Finally, we show several experimental results on real video data to demonstrate the e?ciency and efiectiveness of the proposed method. 1.3.3 Peer-to-Peer Cooperative Video Streaming Social Networks (Chap- ter 4) While peer-to-peer (P2P) video streaming systems have achieved promising results, they introduce a large number of unnecessary traverse links, which con- sequently leads to substantial network ine?ciency. To address this problem and achieve better streaming performance, we propose to enable cooperation among group peers, which are geographically neighboring peers with large intra-group up- load and download bandwidths. Considering the peers selflsh nature, we formulate the cooperative streaming problem as an evolutionary game and derive, for every peer, the evolutionarily stable strategy (ESS), which is the stable Nash equilibrium and no one will deviate from. Moreover, we propose a simple and distributed learn- ing algorithm for the peers to converge to the ESSs. With the proposed algorithm, each peer decides whether to be an agent who downloads data from the peers out- side the group or a freerider who downloads data from the agents by simply tossing 8 a coin, where the probability of being a head for the coin is learned from the peers own past payofi history. Simulation results show that the strategy of a peer con- verges to the ESS. Compared to the traditional non-cooperative P2P schemes, the proposed cooperative scheme achieves much better performance in terms of social welfare, probability of real-time streaming, and video quality (source rate). 1.3.4 Cooperation Stimulation Using Indirect Reciprocity Game Mod- eling (Chapter 5) In social networks, since nodes generally belong to difierent authorities and pursue difierent goals, they will not cooperate with others unless cooperation can improve their own performance. Thus, how to stimulate cooperation among nodes in social networks is very important. However, most of existing game-theoretic cooper- ation stimulation approaches rely on the assumption that the interactions between any pair of players are long-lasting. When this assumption is not true, according to the well-known Prisoners Dilemma and the backward induction principle, the unique Nash equilibrium (NE) is to always play non-cooperatively. In this chapter, we propose a cooperation stimulation scheme for the scenario where the number of interactions between any pair of players are flnite. The proposed algorithm is based on indirect reciprocity game modelling where the key concept is \I help you not because you have helped me but because you have helped others". We formulate the problem of flnding the optimal action rule as a Markov Decision Process (MDP) and propose a modifled value iteration algorithm to flnd the optimal action rule. 9 Using the packet forwarding game as an example, we show that with an appropriate cost-to-gain ratio, the strategy of forwarding the number of packets that is equal to the reputation level of the receiver is an evolutionarily stable strategy (ESS). Fi- nally, simulations are shown to verify the e?ciency and efiectiveness of the proposed algorithm. 1.3.5 Image Denoising Games (Chapter 6) Based on the observation that every small window in a natural image has many similar windows in the same image, the nonlocal denoising methods perform denois- ing by weighted averaging all the pixels in a nonlocal window and have achieved very promising denoising results. However, the use of a flxed square neighborhood window greatly limits the denoising performance. Therefore, an important issue in pixel-domain image denoising algorithms is how to adaptively choose optimal neighborhoods. Obviously, too large a neighborhood set may cause overly-smooth artifacts, while too small a neighborhood set may not be able to e?ciently reduce the noise variance. While the Stein?s principle is shown to be able to estimate the true mean square error (MSE) for determining the optimal neighborhoods, there exists a trade-ofi between the accuracy of the estimate and the minimum of the true MSE. In this chapter, we study the impact of such a trade-ofi and formulate the image denoising problem as a coalition formation game. In this game, every pixel is treated as a player, who tries to seek partners to form a coalition to achieve better denoising results. By forming a coalition, every player in the coalition can 10 obtain certain gains by improving the accuracy of the Stein?s estimate, while incur- ring some costs by increasing the minimum of the true MSE. Moreover, we show that the traditional approaches using a heuristically determined neighborhood set are special cases of the proposed game theoretical framework by choosing the utility function without a cost term. Finally, experimental results show that the proposed game theoretic approach can achieve better performance than the nonlocal method in terms of both PSNR and visual quality. 1.3.6 Simultaneous Image Denoising and Interpolation Using Evolu- tionary Games (Chapter 7) While the existing image interpolation approaches can achieve promising in- terpolation results, they are specially designed for the clean images. However, when the low resolution image is noisy, most of the existing interpolation approaches will also boost the noise and introduces severe visual distortions. Therefore, to achieve better reconstruction, we should jointly consider image denoising and interpola- tion. In this chapter, we study the problem of simultaneous image denoising and interpolation from the game theoretic perspective and formulate the problem as an evolutionary game. In this evolutionary game, the players are the unknown high resolution pixels and the pure strategies of the players are the corresponding noisy low resolution neighbors. By regarding the non-negative weights of the noisy low resolution pixels as the probabilities of selecting the pure strategies, the problem of estimating the high resolution pixels becomes flnding the evolutionarily stable 11 strategies for the evolutionary game. Experimental results show that the proposed game theoretical approach can achieve better performance than the methods that flrst denoise the noisy low resolution image and then interpolate the denoised image, in terms of both PSNR and visual quality. 12 Chapter 2 An Overview of Game Theory Game theory [97] is a mathematical tool that analyzes the strategic interac- tions among multiple decision makers. Its history dates back to 1944 when J. von Neumann and O. Morgenstern publish the book Theory of Games and Economic Behavior. In this book, von Neumann and Morgenstern introduced the method of flnding mutually consistent solutions for two-person zero-sum games, which lays the foundation of game theory. During the late 1940s, cooperative game theory had been studied to analyze how groups of individuals should cooperate with each other to improve their positions in a game. In early 1950s, J. Nash developed an important criterion, known as Nash equilibrium, to characterize mutually optimal strategies of players. This concept is applicable to non-zero-sum games, and thus is more general than the criterion proposed by von Neumann and Morgenstern and marks a quantum leap forward in the development of non-cooperative game theory. During the 1950s, many important concepts of game theory were developed, such as the concepts of the core, the extensive form games, repeated games, and the Shapley value. Reflnement of Nash equilibriums and the concepts of complete information and Bayesian games were proposed in the 1960s. Application of game theory to biol- ogy, i.e., the evolutionary game theory, was introduced by J. M. Smith in the 1970s, during which time, the concepts of correlated equilibrium and common knowledge 13 were introduced by R. Aumann. In nowadays, game theory has been widely rec- ognized as an important tool in many flelds, such as economics, politics, business, social sciences, biology, computer science, and engineering, for understanding co- operation and con ict between individuals. In this chapter, we will present a brief overview on some fundamental concepts of game theory that will be used in this thesis to model and analyze users? behaviors and interaction in multimedia social networks. For more extensive concepts of game theory, the readers are referred to. 2.1 Nash Equilibrium and Pareto Optimality A strategic game hN; (Ai); (ui)i consists of three components: a set of players, denoted by N ; a set of actions, denoted by Ai for player i; and payofi functions, denoted by ui : A ! R for player i, where A = ?i2NAi is the action set of all players. Generally, one player?s payofi depends on not only his/her own action, but also other players? actions, and hence there is a strategic interaction between players. Nash equilibrium is the key concept to understand non-cooperative game the- ory. Informally speaking, it is an equilibrium where everyone plays the best strategy while taking others? decisions into account. Mathematically, a? is a Nash equilibrium if for every player i 2 N , ui(a?i ; a??i) ? ui(ai; a??i); 8ai 2 Ai; (2.1) where ai denotes the strategy of player i and a?i is a common notation in game theory representing the strategies of all players other than player i. Therefore, Nash 14 equilibrium predicts the outcome of a game when all players are rational. Depending on whether players choose a single action or randomize over a set of actions according to some probability distribution, an equilibrium can be classifled as the pure-strategy Nash equilibrium or the mixed-strategy Nash equilibrium. Pareto optimality is a strategy proflle at which no single player can improve his/her own payofi without hurting any other player. Speciflcally, let u be a vector composed of payofis in one particular game outcome. Then, u is Pareto e?cient if there is no u0 of another game outcome for which u0i > ui for all i 2 N ; u is strongly Pareto e?cient if there is no u0 for which u0i ? ui for all i 2 N and u0i > ui for some i 2 N . The Pareto frontier is deflned as the set of all u that are Pareto e?cient. 2.2 Auction Games Auction theory [77] is an applied branch of game theory which analyzes inter- actions in auction markets. An auction, conducted by an auctioneer, is a process of buying and selling products by eliciting bids from potential buyers (i.e., bidders) and deciding the auction outcome based on the bids and auction rules. The rules of auction, or auction mechanisms, determine whom the goods are allocated to (i.e., the allocation rule) and how much they have to pay (i.e., the payment rule). The well-known four basic forms of auctions are: English auction, Dutch auc- tion, Second-price (sealed-bid) auction, and First-price (sealed-bid) auction. English auction is a sequential auction where price increases round by round from a low starting price until only one bidder is left, who wins the product and pays his/her 15 bid. Dutch auction is another sequential auction where price decreases round by round from a high starting price until one bidder accepts the price, who wins the product and pays the price at acceptance. Second-price (sealed-bid) auction is the auction where each bidder submits a bid in a sealed envelope simultaneously, and the highest bidder wins the product with payment equal to the second highest bid. First-price (sealed-bid) auction is the auction where each bidder submits a bid in a sealed envelope simultaneously, and the highest bidder wins the product with payment equal to his/her own bid. Although the four basic auctions appear quite difierent at flrst glance, they are actually equivalent in some sense under certain conditions [119]. As established in [119] by William Vickrey, a Nobel laureate in Economics, the English auction is equivalent to the second-price sealed-bid auction under the private values model while the Dutch auction is equivalent to the flrst-price sealed-bid auction since for every strategy in the flrst-price auction, there is an equivalent strategy in the Dutch auction and vice versa; and given symmetric and risk-neutral bidders and private values, all four auctions yield the same expected revenue of the seller. Therefore, it will su?ce to study or adopt only one kind of auction out of the four basic forms. An auction becomes more complicated when more than one item are simul- taneously sold and bidders bid for \packages" of products instead of individual products. This is known as the combinatorial auction [40]. One possible approach is the Vickrey-Clarke-Groves (VCG) mechanism, which is the generalized version of the second-price mechanism. The basic idea is that the allocation of products maximizes the social welfare and each winner in the auction pays the opportunity 16 cost that their presence introduces to all the other bidders. Another approach is the alternative ascending clock auction proposed in [16], where the basic idea is to awarded the items to bidders at the price whenever they are \clinched". 2.3 Evolutionary Games In some games, there can be more than one Nash equilibrium. When there ex- ist multiple Nash equilibria, one interesting and important problem is how to choose an optimal one in some sense. This process is also known as \equilibria reflnement" in game theory. In the literature, several reflnement criteria have been proposed, e.g. Pareto optimality is deflned to compare multi-dimension payofi proflles. How- ever, the establishment of Pareto optimality is based on the assumption that players have the full knowledge of the game they are playing and others players? actions, and that players are rational and willing to cooperate in their moves. Nevertheless, this assumption may not be true since players may only have limited information about the other players strategies. Moreover, players may take out-of-equilibrium strategies due to the uncertainty of the game and incorrect/noisy estimate of others? strategies. To overcome such problems, we need to provide a robust stable equilib- rium, and evolutionary game theory is such a theory that can provide the desired stable equilibrium { evolutionarily stable strategy. Evolutionary game theory is an application of the mathematical theory of games to the interaction dependent strategy evolution in populations [110] [41]. Arising from the realization that frequency dependent fltness introduces a strategic 17 aspect to evolution, evolutionary game theory becomes an essential component of a mathematical and computational approach to biological contexts, such as genes, viruses, cells, and humans. Recently, however, evolutionary game theory has become of increased interest to economists, sociologists, anthropologists, social scientists, and computer science. Difiers from classical game theory, evolutionary game theory focuses on the dynamics of strategy change more than the properties of strategy equilibria. It can tell us how a rational player should behave to approach a best strategy against a small number of players who do not follow the best strategy, and thus evolutionary game theory can better handle the unpredictable behavior of players. 2.4 Coalition Formation Games The coalition formation game is one type of cooperative game [104], which describes how a set of players can cooperate with others by forming cooperating groups and thus improves their payofis in a game. A coalition S is a nonempty subset of N , the set of all players. Since the players in coalition S have agreed to cooperate together, they can be viewed as one entity and is associated with a value v(S) which represents the worth of coalition S. Then, a coalitional game is determined by N and v(S). When the value v(S) is the total payofi that can be distributed in any way among the members of S, e.g., using an appropriate fairness rule, this kind of coalitional games is known as games with transferrable payofi. However, in some coalitional games, rigid restrictions exist 18 on the allocation of the payofi. These games fall into the other category known as games without transferrable payofi. In coalition formation games, often the value v(S) is determined by two terms: the gain of forming a coalition g(S) and the cost of forming a coalition c(S), i.e. v(S) = g(S)? c(S): (2.2) In general, cooperation by forming larger coalitions is beneflcial for players in terms of a higher gain. This property is referred to as superadditivity, i.e., g ? S1 [ S2 ? ? g(S1) + g(S2); 8S1; S2 ? N; S1 \ S2 = ;: (2.3) However, on the other hand, forming a larger coalition also require a larger cost, i.e. the cost is also superadditive as follows c ? S1 [ S2 ? ? c(S1) + c(S2); 8S1; S2 ? N; S1 \ S2 = ;: (2.4) Therefore, forming larger coalitions are not always beneflcial due to the cost term, which means that grand coalition is seldom formed. The objective of coalition formation games is to flnd the optimal coalition structure S? = fS?1 ; S?2 ; :::; S?l g, S?1 [ S?2 [ ::: [ S?l = N , that maximizes the total coalition values, i.e., S? = fS?1 ; S?2 ; :::; S?l g = arg maxS X i v(Si): (2.5) 2.5 Stochastic Games We have discussed various games, but generally speaking, players are assumed to face the same stage game at each time, meaning the game and the players? 19 strategies are not depending on the current state of the network. However, this is not true for a dynamic environment where players? strategies keep changing over time. In order to study the cooperation and competition behaviors under such a dynamic environment, the theory of stochastic games might be a better flt. A stochastic game [108] is an extension of Markov decision process (MDP) [101] by considering the interactive competition among difierent agents. In a stochas- tic game, there is a set of states, denoted by S, and a collection of action sets, A1; ? ? ? ; AjN j, one for each player in the game. The game is played in a sequence of stages. At the beginning of each stage the game is in a certain state. After the players select and execute their actions, the game then moves to a new random state with some transition probability determined by the current state and actions from all players: T : S ? A1 ? ? ? ? ? AjN j 7! PD(S). Meanwhile, at each stage each player receives a payofi ui : S ? A1 ? ? ? ? ? AjN j 7! R, which also depends on the current state and all the chosen actions. The game is played continually for a number of stages, and each player attempts to maximize an objective function. Like in the repeated game, the overall payofi function is deflned as the expected sum of discounted intermediate payofis. The solution, also called a policy of a stochastic game is deflned as a probability distribution over the action set at any state, ?i : S ! PD(Ai), for all i 2 N . Given the current state st at time t, if player i?s policy ?ti at time t is independent of the states and actions in all previous time slots, the policy ?i is said to be Markov. If the policy is further independent of time, it is said to be stationary. The stationary policy of the players in a stochastic game, i.e., their optimal 20 strategies, can be obtained by value iteration according to Bellman?s optimality condition. For example, in a two-player stochastic game with opposite objectives, let us denote V (s) as the expected reward (of player 1) for the optimal policy starting from state s, and Q(s; a1; a2) as the expected reward of player 1 for taking action a1 against player 2 who takes action a2 from state s and continuing optimally thereafter [85]. Then, the optimal strategy for player 1 can be obtained from the following iterations, V (s) = max ? min a22A2 X a12A1 Q(s; a1; a2)?a1 ; (2.6) Q(s; a1; a2) = u1(s; a1; a2) + ? X s02S T (s; a1; a2; s0)V (s0); (2.7) where ?a1 denotes player 1?s strategy proflle, and T (s; a1; a2; s0) denotes the transi- tion probability from state s to s0 when player 1 takes a1 and player 2 takes a2. 21 Chapter 3 Multiuser Rate Allocation Social Networks Nowadays, due to the explosive growth of the Internet and the advance of compression technologies, delay-sensitive multimedia networking applications such as multimedia streaming and multi-camera surveillance become more and more pop- ular. Therefore, a fundamental problem in these applications, how to fairly and e?- ciently allocate the rate among many users who share the same network bandwidth, becomes more and more important and draws great attention recently. Rate allocation for a single user has been well investigated in the literature [39] [35] [48]. In single-user rate allocation, the task of the rate controller is to assign the available rate to each frame and each macroblock (MB) to achieve the maximal visual quality. This is also known as rate control. The simplest rate control method is the constant bit-rate allocation (CBR), which equally allocates the bit-rate to each frame. However, CBR often results in quality uctuation, due to which the overall visual quality is signiflcantly degraded. To overcome this problem, variable bit-rate allocation (VBR) is proposed for constant quality reconstruction by assigning rate according to the complexity of each frame [78]. A core technique in VBR-based rate control methods is rate distortion modelling [64], which highly afiects the rate control performance. Many works have been done on rate distortion modelling, including parametric method [137] and non-parametric method [134]. 22 If a channel is shared by multiple users, besides considering the rate allocation within the same user (i.e., frame-level rate allocation and MB-level rate allocation), the rate controller needs to consider the rate allocation among difierent users. This becomes the multi-user rate allocation problem. Similar to frame-level rate allo- cation, the simplest multi-user rate allocation is the constant bit-rate allocation (CBR), where the available network bandwidth is equally assigned to each user. A major problem of CBR is that it does not consider the variable bit-rate characteris- tics of the video sequences. One way to overcome this disadvantage is to optimize a global objective function that involves the characteristics of all the video sequences using conventional optimization methods such as Lagrangian or dynamic program- ming [95]. For example, a commonly adopted method is for the rate controller to minimize the weighted sum of the distortions or try to maximize the weighted sum of the PSNRs, i.e., the optimization problem becomes: min Ri NX i=1 wiDi(Ri); s:t: NX i=1 Ri ? R; (3.1) or max Ri NX i=1 wiPSNRi(Ri); s:t: NX i=1 Ri ? R; (3.2) where R is the available network bandwidth, wi is the weight, Di is the distortion, and PSNRi is the PSNR of the ith user. Notice that the solution to the above optimization-based methods is highly re- lated to the selection of the weights wi. However, in the literature, the weights wi?s are usually heuristically determined, e.g., wi is uniformly set to be 1=N [109]. More- over, such a formulation can only address the e?ciency issue, e.g., how to maximize 23 the weighted sum of the PSNRs or minimize the weighted sum of the distortions. As such, the fairness issue, which is an important problem for multi-user rate allocation, has been generally ignored in the image/video/multimedia community. However, in the networking literature, the fairness issue in multi-user rate allocation have been considered in a difierent setting. In [105], the authors formu- lated the optimal channel-assignment problem as a convex optimization problem using a max-min fairness criterion for the downlink application. As pointed out in [61], the max-min approach deals with the worst-cast scenario, so it favors users with worse channels and reduces the system e?ciency. To overcome the disadvan- tage, the authors in [61] considered a generalized proportional fairness based on the Nash bargaining solutions and coalitions. While this proportional fairness criterion was successfully employed in networking applications, it cannot be directly used in content-aware multimedia applications since it does not explicitly consider the char- acteristics of the video content and the resulting impact on video quality. In [98], the authors applied the Nash bargaining solutions to the multimedia multi-user rate allocation problem, where the utility function for each user is deflned as the inverse of the distortion. But there are two main drawbacks of that utility function. Firstly, since no cost in video transmission is considered, every user can overclaim his/her need to get more bandwidth regardless the consequence to the system, which is recognized as selflsh behavior. Due to the selflsh nature, without a cost, all users will become too greedy and want to get as much bit-rate as possible, which is not good to the system [106]. Secondly, since the gain is deflned as the inverse of the distortion, i.e., an exponential function of the PSNR, a certain increase of the bit- 24 rate in the low PSNR region will lead to a less signiflcant gain than that in the high PSNR region. This contradicts with the human visual system (HVS) model since the quality difierence in the low PSNR region is easier to be distinguished than that in the high PSNR region (see Section 3.1.3 for details). Moreover, with the utility function deflned in [98], the generalized Nash bargaining solution is shown to be the same as the traditional optimization-based approach in (3.2), i.e., to maximize the weighted sum of the PSNRs, while the weights are determined by the bargaining powers, which are still heuristically determined. In this chapter, we propose a multi-user rate allocation game framework to e?ciently and fairly allocate the available network bandwidth to difierent multime- dia users. The utility/payofi function of each user/player is deflned according to the characteristics of the transmitted video sequences and the allocated bit-rate. Speciflcally, motivated by the intuition that the quality difierence in the low PSNR region is easier to be distinguished than that in the high PSNR region, we deflne the gain as a logarithm function of the PSNR. We also introduce a cost term in the utility function, which is linear in the allocated rate, to guide users? behaviors. In this way, the users will be more rational in choosing bit-rate since transmitting data with a higher bit-rate in this case does not necessarily result in a higher payofi, especially when the transmitted video sequence is a fast motion and complex scene sequence. Then, we discuss the Nash equilibrium (NE) of this rate allocation game. We show that with a unique NE, which is proportionally fair in terms of both utility and PSNR, can be obtained, based on which the rate controller can e?ciently and fairly allocate the available rate. Moveover, we propose a decentralized cheat-proof 25 rate allocation scheme for the users to converge to the unique NE using alternative ascending clock auction [16]. We also show that the traditional optimization-based method in (3.2) is a special case of the game-theoretic framework if the utility func- tion is deflned as an exponential function of PSNR. This fact indicates that the game-theoretic approach ofiers a more general and unifled solution, especially in a multi-user setting. Finally, we illustrate several experimental results on real video data to demonstrate the e?ciency and efiectiveness of the proposed game-theoretic multi-user multimedia rate allocation method. The rest of this chapter is organized as follows. In Section 3.1, we give a detailed description on the proposed method, including the system model, how to deflne the utility function, and the problem formulation. In Section 3.2, we provide a detailed analysis of the proposed game-theoretic framework. In Section 3.3, we show the relationship between the proposed game-theoretic method and the traditional optimization-based approach. In Section 3.4, we describe in details the proposed distributed cheat-proof rate allocation scheme using alternative ascending clock auction. Finally, we illustrate the experimental results on real video signals in Section 3.5 and draw conclusions in Section 3.6. 3.1 The Game-Theoretic Framework 3.1.1 System Model As shown in Figure 3.1, in our system, we assume that there is a controller, N transmitters, u1, u2, ..., uN , and N receivers, r1, r2, ..., rN . User ui transmits 26 Channel / Link Controller . . . . . . 1v 2v Nv 1u 2u Nu 1r 2r Nr Figure 3.1: System Model. the video sequence vi to the corresponding receiver ri through a channel/link that is shared by other users u1, ..., ui?1, ui+1, ..., uN . Since the channel has a limited bandwidth, it may not be able to satisfy the bandwidth requirements for all users. The role of the controller is to allocate the channel bandwidth to users u1, u2, ..., uN . So, the question is how the controller allocates the bandwidth to the users in an e?cient and fair way? We will formally deflne the notion of fairness later. 3.1.2 Video Distortion-Rate Model Before answering the question raised in the above subsection, let us flrst discuss the Distortion-Rate (DR) model for the video sequences. In video compression, due to the quantization process, there exists a tradeofi between the distortion (D), which is usually deflned as the mean squared error (MSE), and bit-rate (R), which determines the channel bandwidth or storage space required to transmit or store the coded data. Generally, high bit-rate leads to small distortion while low bit-rate causes large distortion. In the literature, several models have been proposed to 27 characterize this distortion rate tradeofi for difierent video coders, such as MPEG2 [63] [48], MPEG4 [39] [35], FGS [46], H.263 [114], H.264 [31] [83], and wavelet-based coders [124]. Without loss of generality, in this chapter, we use a simple two- parameter distortion-rate model, which is widely employed in a medium or high bit-rate situation, and other models can be similarly analyzed. The two-parameter distortion-rate model is described as follows: D(R) = fie?flR; (3.3) where fi and fl are two positive parameters determined by the characteristics of the video content. 3.1.3 User?s Utility Function As shown in Figure 3.1, user ui can get gain by successfully transmitting the video vi to receiver ri, and the gain is determined by the quality of the transmitted video. On the other hand, user ui needs to pay for the used bandwidth to transmit vi, and the payment is determined by the bit-rate of vi. Therefore, given the proflle of ui, the bit-rate Ri and distortion Di, the utility function of user ui can be deflned as: Ui(Ri; Di) = f(Di)? ag(Ri): (3.4) where f(Di) is the gain, g(Ri) is the cost, and a is a parameter controlling the balance between the gain and cost. Generally, since the gain of ui will be larger if the distortion Di is smaller, the function f(:) should be a monotonically decreasing function. Similarly, since the 28 cost of ui will be larger if the bit-rate Ri is larger, the function g(:) should be a monotonically increasing function. Without loss of generality, we assume that the cost per bit-rate unit is one, which means: g(Ri) = Ri: (3.5) The gain f(Di) is generally determined by how much receiver ri is satisfled with the received video. In video processing and coding community, the PSNR is a more common objective quality measure than MSE. For any MSE, i.e., the distortion D, the corresponding PSNR is given by: PSNR = 10 log10 2552 D : (3.6) Moreover, according to the human visual system (HVS) model, the quality difierence in the low PSNR region is easier to be distinguished than that in the high PSNR region, e.g., as shown in Figure 3.2, the 33dB and 34dB images are easier to be distinguished than the 40dB and 41dB images. Therefore, we deflne the f(:) function as: f(Di) = ln(PSNRi) = ln[10 log10 2552 Di ]: (3.7) Note that the reason of using ln(:) function is that ln(:) is a monotonically increasing function in its argument and its second order derivative is negative, due to which a certain increase in the low PSNR region will lead to a more signiflcant gain than that in the high PSNR region. Other functions that have similar properties can also be used. Moreover, if we do not consider the distinct characteristics of video signals, any monotonically decreasing function of the distortion D can be used, e.g., f(D) = 2552=D = e?PSNR; (3.8) 29 (a) (b) (c) (d) Figure 3.2: The visual quality of Foreman sequence at difierent PSNR level: (a) 33dB; (b) 34dB; (c) 40dB; (d) 41dB. or f(D) = 10 log10 2552 D = PSNR: (3.9) Combining (3.3)-(3.7) and ignoring the constant term, the utility function of user ui becomes: Ui(Ri) = ln( i + fliRi)? aRi: (3.10) where i = 2 ln 255? ln fii. 30 3.1.4 Multi-User Rate Allocation Game To answer the question raised in Section 3.1.1, we formulate this problem as a multi-user rate allocation game. As shown in Figure 3.1, in this game, there are N users/players, who share the available network bandwidth with each other. Each user ui has his/her own utility function as shown in (3.10), and it also has a minimum desired quality constraint (minimal rate constraint Rmini ) and a maximum satisfled quality constraint (maximum rate constraint Rmaxi ). Since Rmini is the minimal rate constraint that each user expects by jointing in the game, we assume that the available network rate at least guarantees each user for the minimal desired rate in the game. Obviously, if the available network bandwidth is able to satisfy all the users with the maximum quality constraint Rmaxi , the rate allocation problem is trivial since the controller just needs to allocate Rmaxi to each user ui. However, in the case that the available network bandwidth is not able to satisfy all the user with Rmaxi , the problem becomes more interesting: how does the controller fairly and e?ciently allocate the available bandwidth to the users? From the users? point of view, they try to maximize their utilities subject to the constraint that the sum of the users? bit-rate does not exceed the available bandwidth. Therefore, the game can be formulated as: max Ri Ui(Ri) = ln( i + fliRi)? aRi; s:t: Rmini ? Ri ? Rmaxi ; 8i = 1; 2; :::; N; NX i=1 Ri ? R; (3.11) where R is the available network bandwidth. 31 3.2 Analysis of The Multi-User Rate Allocation Game According to (3.10), we can see that the utility function Ui(Ri) is a concave function in terms of Ri. By taking the derivative of Ui(Ri) over Ri, we have: @Ui(Ri) @Ri = fli i + fliRi ? a; 8i = 1; 2; :::; N: (3.12) Therefore, user ui achieves his/her maximal utility U?i (R?i ) at R?i , where R?i is deflned as: R?i = max[Rmini ; min( 1 a ? i fli ; R max i )]; 8i = 1; 2; :::; N: (3.13) From (3.13), we can see that the optimal R?i corresponding to the maximal utility is determined by the parameter a. Therefore, for difierent choices of a, the game in (3.11) has difierent equilibria with difierent physical meanings. Speciflcally, in the following, we discuss the Nash equilibrium (NE) in three difierent cases: a > a0, 0 ? a < a0, and a = a0, where a0 is the constant that satisfles the following equation: NX i=1 ? max[Rmini ; min( 1 a0 ? i fli ; R max i )] ? = R: (3.14) 3.2.1 Non-E?cient Rate Allocation (a > a0) If a > a0, the game in (3.11) has a unique Nash Equilibrium (R?1; R?2; :::; R?N). Since a > a0, from (3.14), we have PN i=1 R?i < R, which means that the available network bandwidth is not fully utilized. Therefore, this allocation scheme is not e?cient. 32 3.2.2 E?cient Rate Allocation (0 ? a < a0) If 0 ? a < a0, the game in (3.11) has inflnitely many NE. For every NE ( ~R1; ~R2; :::; ~RN), according to Lemma 1, we have PN i=1 ~Ri = R, which means that the available network bandwidth is fully utilized. Therefore, this allocation scheme is e?cient. Lemma 1: When 0 ? a < a0, every NE ( ~R1; ~R2; :::; ~RN) satisfles PN i=1 ~Ri = R. Proof: Since PNi=1 ~Ri ? R, let us assume that there is a NE ( ~R1; ~R2; :::; ~RN) such that PNi=1 ~Ri = R ?? < R. Since 0 ? a < a0, we have PN i=1 R?i > R, which means there exists at least one ~Rj such that ~Rj < R?j . Let R^j = min( ~Rj + ?; R?j ), then Pj?1i=1 ~Ri + R^j + PN i=j+1 ~Ri ? R and Uj(R^j) > Uj( ~Rj) (due to the concavity of the utility function). This contradicts with the assumption that ( ~R1; ~R2; :::; ~RN) is a NE. Therefore, PNi=1 ~Ri = R. This completes the proof. 3.2.3 E?cient and Proportionally Fair in Both Utility and PSNR (a = a0) If a = a0, the game in (3.11) has a unique Nash Equilibrium (R?1; R?2; :::; R?N). According to (3.14), we have PNi=1 R?i = R, which means that the available network bandwidth is fully utilized. Therefore, this allocation scheme is e?cient. Moreover, we will show in the following deflnition [72] and theorem that when a = a0, (R?1; R?2; :::; R?N) is a proportionally fair NE in terms of both utility and PSNR. Deflnition 1: A utility distribution is said to be proportionally fair when any 33 change in the distribution of utilities results in the sum of the proportional changes being non-positive, i.e., X i Ui ? ~Ui ~Ui ? 0; 8Ui 2 S: (3.15) where ~Ui and Ui are the proportionally fair utility and any other feasible utility for the ith user, respectively, and S is a closed and convex subset of 0, clock index t = 0, and initializes a with a small value a0. At the beginning of clock t, the controller flrst announces at to all the users. Then, each user submits his/her optimal demand to the controller. After collecting all the demands, the controller compares the total demand Rtotal with the available bandwidth R. If Rtotal > R, i.e., the total demand exceeds the supply, the auction is not concluded. The controller continues the auction and goes to next clock t + 1 with an increased a computed by at+1 = at + ?. Moreover, the controller computes the cumulative clinch, which is 39 the amount of bit-rate that a user is guaranteed to win at current clock given by Cti = max(0; R? X j 6=i Rtj): (3.24) On the other hand, if Rtotal ? R, then the supply can meet all users? demands and the auction is concluded. Let the flnal clock index be L. As a increases dis- cretely, we may have Rtotal < R and do not fully utilize the bandwidth. To make sure that Rtotal = R, we modify (3.24) by introducing proportional rationing [16], and the flnal cumulative clinch of ui is given by, CLi = RLi + RL?1i ?RLiP i RL?1i ? P i RLi [R? X i RLi ]; with X i CLi = R: (3.25) Finally, the rate allocated to ui is R?i = CLi . The utility of ui is obtained as, U?i = ln( i + fliR?i )? P ?i ; (3.26) where P ?i = C0i a0 + PL t=1 at(Cti ? Ct?1i ) is the payment from user ui. Remark: Since at+1 > at, we have Rt+1j ? Rtj. Therefore, at clock t, ui is guaranteed at least the amount of bit-rate Cti = max(0; R ? P j 6=i Rtj). This is how (3.24) comes from. The rate allocation scheme described in Algorithm 1 has several advantages: ? The auction process is transparent to all users and simple enough for all users to understand. Simplicity and transparency are two important factors to stim- ulate auction since users may not be willing to join in the game if they do not understand the auction process. ? The auction scheme can preserve privacy. Since the scheme is distributed, 40 Algorithm 1 Cheat-Proof Rate Allocation Scheme Using Clock Auction Given the available bandwidth R, step size ? > 0, and clock index t = 0, the controller initializes a with a small value a0. Repeat: (1) the controller announces at to all the users. (2) Each user ui submits his/her optimal demand: Rti = max[Rmini ; min( 1at ? ifli ; Rmaxi )]. (3) The controller sums up all the demand Rttotal = P i Rti and compares Rttotal with R: If Rttotal > R, compute Cti = max(0; R? P j 6=i Rtj), set at+1 = at+ ?, t = t+1, and go to (1). Else, conclude the auction, set L = t, compute CLi = RLi + R L?1 i ?RLiP i RL?1i ? P i RLi [R? P i RLi ], and allocate R?i = CLi to ui. Finally, the payment of ui is P ?i = C0i a0 + PL t=1 at(Cti ? Ct?1i ) and the utility of ui is U?i = ln( i + fliR?i )? P ?i . 41 users do not need to report their private information. Instead, they only need to submit their demands. ? The computational complexity of each user is low since what the users need to do is to submit their optimal demands calculated by Ri = max[Rmini ; min( 1a ? i fli ; Rmaxi )] for any given a. ? The computational complexity of the controller is low in that the controller only needs to sum up the demands from the users, compare it with the available bandwidth, and compute the cumulative clinch for each user. ? Through the auction, each user will converge to the unique proportionally fair NE shown in Section 3.2. This is trivial due to the following two reasons: (1) since the auction concludes if and only if Pi Rti ? R, when ? is su?cient small, the auction will conclude at Pi Rti = R. (2) At each clock t, ui chooses Rti = max[Rmini ; min( 1at ? ifli ; Rmaxi )]. ? The scheme is cheat-proof, meaning that the best strategy of each user is to report his/her true optimal demand at every clock. There is no incentive for ui to deviate, and the proof is shown in Theorem 2. Let Rti be user ui?s true optimal demand at clock t, and ~Ri t be the claimed demand that ui reports to the controller at clock t. Note that ~Rit can be any value in [Rmini ; Rmaxi ] if ui cheats at clock t. Let ?(t; L) = fL; ~Ri 0; :::; ~Rit; Rt+1i ; :::; RLi ; C0i ; :::; CLi ; a0; :::; aLg be the proflle of ui at the following scenario: from clock 0 to clock t, ui reports ~Ri0; :::; ~Rit, and from clock t + 1 to the flnal clock, ui reports 42 2L iR 1 L iR 2ln( )Li i iR? ?+ 1ln( )Li i iR? ?+ 1 1 2 1 (ln( ))ln( ) ( ) L i i L L Li i i i i i i i i R R RR R R R ? ?? ? = ? + + ? ? (a) 1L iR 2 L iR 2ln( )Li i iR? ?+ 1ln( )Li i iR? ?+ 2 2 1 1 (ln( ))ln( ) ( ) L i i L L Li i i i i i i i i R R RR R R R ? ?? ? = ? + + ? ? (b) Figure 3.3: Illustration for the proof of Lemma 2: (a) If ~Rit ? Rti and RL1i > Rmini , we can see that [ln( i + fliRL1i ) ? @ ln( i+fliRi)@Ri jRi=RL1i (R L1 i ? RL2i )] ? ln( i + fliRL2i ); (b) If ~Rit > Rti and RL1i < Rmaxi , we can see that ln( i + fliRL1i ) ? [ln( i + fliRL2i )? @ ln( i+fliRi) @Ri jRi=RL1i (R L2 i ?RL1i )]. Rt+1i ; :::; RLi , where L is the flnal clock index, C0i ; :::; CLi is the corresponding cumu- lative clinch of ui from clock 0 to clock L, and a0; :::; aL is the corresponding value of a at each clock. Let Ui[?(t; L)] be the utility of ui in this scenario. Let ?(?1; L) = fL;R0i ; :::; RLi ;C0i ; :::; CLi ; a0; :::; aLg and ?(L;L) = fL; ~Ri 0; :::; ~RiL;C0i ; :::; CLi ; a0; :::; aLg be two special cases of ?(t; L). Lemma 2: If all other users report their true optimal demands at every clock, then Ui[?(t? 1; L1)] ? Ui[?(t; L2)]. Proof: From (3.26), we have, Ui[?(t? 1; L1)] = ln( i + fliRL1i )? a0C0i ? L1X k=1 ak(Cki ? Ck?1i ); Ui[?(t; L2)] = ln( i + fliRL2i )? a0C0i ? L2X k=1 ak(Cki ? Ck?1i ): (3.27) ? If ~Rit ? Rti, according to Algorithm 1, we have L2 ? L1 and RL2i ? RL1i . 43 Then, Ui[?(t? 1; L1)]? Ui[?(t; L2)] = ln( i + fliRL1i )? ln( i + fliRL2i )? L1X k=L2+1 ak(Cki ? Ck?1i ) ? ln( i + fliRL1i )? ln( i + fliRL2i )? aL1(CL1i ? CL2i ): (3.28) When ? is su?ciently small, CL1i = RL1i and CL2i = RL2i . Since Rti = max[Rmini ; min( 1at ? ifli ; Rmaxi )], { if RL1i > Rmini , according to (3.12), we get aL1 ? @ ln( i+fliRi)@Ri jRi=RL1i . Thus, (3.28) becomes Ui[?(t? 1; L1)]? Ui[?(t; L2)] ? [ln( i + fliRL1i )? @ ln( i + fliRi) @Ri jRi=RL1i (R L1 i ?RL2i )]? ln( i + fliRL2i ) ? 0 (see Figure 3:3(a)): (3.29) { if RL1i = Rmini , since Rmini ? RL2i ? RL1i = Rmini , we have RL2i = Rmini = RL1i . Therefore, Ui[?(t? 1; L1)] = Ui[?(t; L2)]: (3.30) So, if ~Rit ? Rti, we have Ui[?(t? 1; L1)] ? Ui[?(t; L2)]. ? If ~Rit > Rti, according to Algorithm 1, we have L2 ? L1 and RL2i ? RL1i . Then, Ui[?(t? 1; L1)]? Ui[?(t; L2)] = ln( i + fliRL1i )? ln( i + fliRL2i ) + L2X k=L1+1 ak(Cki ? Ck?1i ) ? ln( i + fliRL1i )? ln( i + fliRL2i ) + aL1(CL2i ? CL1i ): (3.31) 44 When ? is su?ciently small, CL1i = RL1i and CL2i = RL2i . Since Rti = max[Rmini ; min( 1at ? ifli ; Rmaxi )], { if RL1i < Rmaxi , according to (3.12), aL1 ? @ ln( i+fliRi)@Ri jRi=RL1i . Thus, (3.31) becomes Ui[?(t? 1; L1)]? Ui[?(t; L2)] ? ln( i + fliRL1i )? [ln( i + fliRL2i )? @ ln( i + fliRi) @Ri jRi=RL1i (R L2 i ?RL1i )] ? 0 (see Figure 3:3(b)): (3.32) { if RL1i = Rmaxi , since Rmaxi ? RL2i ? RL1i = Rmaxi , we have RL2i = Rmaxi = RL1i . Therefore, Ui[?(t? 1; L1)] = Ui[?(t; L2)]: (3.33) So, if ~Rit > Rti, we still have Ui[?(t? 1; L1)] ? Ui[?(t; L2)]. In all, we can show that Ui[?(t? 1; L1)] ? Ui[?(t; L2)]. This completes the proof. With Lemma 2, we can now show that the best strategy of each user is to report his/her true optimal demand at every clock. Theorem 2 (Cheat-Proof): Reporting true optimal demand at every clock is a mutually best response for every user, i.e., Ui[?(L3; L3)] ? Ui[?(?1; L4)] 8i. Proof: If all the other users report their true optimal demands in every clock, according to Lemma 2, we have Ui[?(L3; L3)] ? Ui[?(L3 ? 1; ~L3)] ? ::: ? Ui[?(?1; L4)], where ~L3 stands for the flnal clock index of the following scenario: from clock 0 to clock L3?1, ui reports ~Ri0; :::; ~RiL3?1, and from clock L3 to the flnal clock ~L3, ui reports RL3i ; :::; R ~L3 i . Since all users are non-collaborative, reporting true 45 optimal demand at every clock is a mutually best response for every user. There is no incentive for the users to cheat since any cheating will lead to a loss in utility. Therefore, the proposed scheme is cheat-proof. This completes the proof. In the above theorem, we give a theoretical proof for the cheat-proof strategy. In the following section, we will verify this cheat-proof strategy through experimental results. 3.5 Experimental Results In order to evaluate the proposed game-theoretic multi-user rate allocation game, we conduct experiments on real video data. Seven video sequences: Akiyo, Mobile, Table, Carphone, Coastguard, Foreman, and Football in QCIF format, are tested. Notice that these video sequences include slow, medium or fast motion, and smooth or complex scene. We use the state-of-art H.264 JM 9.0 video codec to encode the video sequences [6]. By changing the quantization parameter (QP) or using the rate control feature, we are able to compress the video sequences at difierent bit-rate and achieve difierent quality requirements. 3.5.1 Parameter Estimation From Section 3.1, we can see that there are several parameters in our frame- work, i, fli, Rmini , and Rmaxi . In this subsection, we will discuss how to estimate these parameters. 46 Table 3.1: ?i , fl?i , Rmini (kb=s), and Rmaxi (kb=s) for difierent sequence by training. Sequence ?i fl?i Rmini (kb=s) Rmaxi (kb=s) Akiyo 6.8449 0.0416 1.5119 84.5447 Carphone 6.6759 0.0114 20.2554 322.0153 Coastguard 6.6796 0.0043 28.4987 878.8011 Foreman 6.7418 0.0093 17.8168 388.7091 Football 6.2201 0.0024 286.311 1720 Mobile 6.3464 0.0025 225.0682 1610 Table 6.8135 0.0074 12.7781 481.1014 According to (3.3) and (3.6), we have: PSNRi = 10(log10 e) ln 2552 Di = (10 log10 e)( i + fliRi) (3.34) Therefore, we can estimate i and fli using ofi-line training. For each video sequence, we flrst generate a set of (PSNRi; Ri) by encoding the sequence using H.264 JM 9.0 with difierent QP. Then, the optimal ?i and fl?i can be computed by: ( ?i ; fl?i ) = min i;fli X j [PSNRi(j)? (10 log10 e)( i + fliRi(j))]2; (3.35) where j is the index of the training set. Through the training data and equation above, we get the optimal ?i and fl?i for difierent video sequences and show them in Table 3.1. As shown in Figure 3.4, 47 200 400 600 800 1000 1200 140028 30 32 34 36 38 40 42 Rate (kb/s) PSNR (dB ) Football QCIF baseline training (a) 0 100 200 300 400 500 600 70026 28 30 32 34 36 38 40 42 Rate (kb/s) PSNR (dB ) Coastguard QCIF baseline training (b) Figure 3.4: Training and fl: (a) Football; (b) Coastguard. with the optimal ?i and fl?i , the (10 log10 e)( i+ fliRi) can approximate PSNRi well. Due to the page limitation, we only show the results for Football and Coastguard. Similar results are observed for other sequences. After flnding the optimal ?i and fl?i , we derive the values for Rmini and Rmaxi . Suppose that the minimal desired PSNR (quality) constraint is PSNRmin, e.g., 30dB, and the maximal satisfled PSNR (quality) constraint is PSNRmax, e.g., 45dB. According to (3.34), we have: Rmini = 1 fl?i (PSNR min 10 log10 e ? ?i ); Rmaxi = 1 fl?i (PSNR max 10 log10 e ? ?i ): (3.36) According the equations above, the Rmini and Rmaxi for difierent sequences are obtained and shown in Table 3.1. From Table 3.1, we can see that the tested video sequences can be classifled to four categories according to fl?, Rmini and Rmaxi : slow motion and smooth scene (Akiyo), medium motion and smooth scene (Carphone, Foreman, and Table), medium motion and complex scene (Coastguard), and fast or 48 1 2 3 4 5 0 0.5 1 1.5 AFD Total Rate (Mb/s) Rate (Mb/s ) Akiyo Carphone Coastguard Foreman Table Football Mobile 1 2 3 4 5 0 0.5 1 1.5 AFR Total Rate (Mb/s) Rate (Mb/s ) Akiyo Carphone Coastguard Foreman Table Football Mobile 1 2 3 4 5 0 0.5 1 1.5 MSPSNR Total Rate (Mb/s) Rate (Mb/s ) Akiyo Carphone Coastguard Foreman Table Football Mobile 1 2 3 4 5 0 0.5 1 1.5 Proposed Total Rate (Mb/s) Rate (Mb/s ) Akiyo Carphone Coastguard Foreman Table Football Mobile Figure 3.5: Allocated rates for Akiyo, Carphone, Coastguard, Foreman, Table, Foot- ball, and Mobile using difierent methods. complex motion (Football and Mobile). 3.5.2 Multi-User Rate Allocation We compare the proposed method with three approaches: the Absolute Fair- ness in Rate (AFR), which equally divides the available bandwidth to all the users, the Absolute Fairness in Distortion (AFD), which minimizes the maximal distor- tion of all the users, i.e., min-max fairness, and the approach Maximizing the Sum of the PSNRs (MSPSNR), i.e. the traditional optimization-based approach shown in (3.2) with uniform weights. Notice that for AFR, AFD, and MSPSNR, the 49 1 1.5 2 2.5 3 3.5 4 4.5 5 200 220 240 260 280 300 320 R (Mb/s) To ta l P SNR (dB ) PSNR AFD AFR MSPSNR Proposed Figure 3.6: The sum of PSNR vs. the available network bandwidth R. allocated rate should be within [Rmini ; Rmaxi ]. Otherwise, we set it to be Rmini or Rmaxi and re-allocate the rest rate for the other users. Given the video sequences to be transmitted, the available bandwidth R, we can compute the rate allocated to each video sequence using difierent methods, i.e., AFD, AFR, MSPSNR, and the proposed method. Then, setting the allocated bit-rate as the target bit-rate, we compress the video sequence using the rate control feature in H.264 JM 9.0 reference software. Finally, each user transmits the compressed bitstream to the corresponding receiver. In the experiments, we assume that there are seven users u1; u2; :::; u7. They transmit Akiyo, Carphone, Coastguard, Foreman, Table, Football, and Mobile to seven receivers r1; r2; :::; r7, respectively. We test R at 1000, 2000, 3000, 4000 and 50 0 0.5 1 1.5 2 2.5 3 3.5 28 30 32 34 36 38 40 42 44 46 Scale factor k PSNR (dB ) ?=0.0025 and R=4.2Mb/s MSPSNR AFD AFR (a) 0 0.5 1 1.5 2 2.5 3 3.5 1.86 1.87 1.88 1.89 1.9 1.91 Scale factor k Utilit y ?=0.0025 and R=4.2Mb/s Cheat?Proof Performance of Proposed Method (b) Figure 3.7: Cheat-proof performance: (a) AFD, AFR, and MSPSNR; (b) Proposed Method. 5000 kb/s. The allocated bit-rate for each video sequence in difierent situations (i.e., difierent R) using difierent methods (i.e., AFD, AFR, MSPSNR, and the proposed method) are shown in Figure 3.5. From Figure 3.5, we can see that AFR equally allocates the bandwidth to each users if the allocated bit-rate is within [Rmini ; Rmaxi ]. AFD tries to allocate more bit-rate to the video sequence that has more complex motion and/or scene (a smaller fl?) to preserve constant quality among difierent users. On the contrary, MSPSNR favors the video sequence that has a larger fl? since allocating more bit-rate to the sequence with a larger fl? leads to a greater increase in the sum of the PSNRs. However, with MSPSNR, the sequence with fl?i will not be allocated more bit-rate than Rmini if there is a sequence with fl?j > fl?i who has not been allocated its maximal rate requirement Rmaxj yet. Speciflcally, the rate controller will flrst allocate each user with Rmini . Then, the remaining rates will be flrst allocated to Akiyo until the bit-rate of Akiyo achieves its maximal requirement. If there are still some unused rates, then Carphone will 51 be satisfled flrst. And the bit-rate of Football with the smallest fl? stays at its minimal requirement until all other sequences with higher fl? have achieved their maximal rate requirements. Obviously, this is not fair to the users who transmit the sequences with smaller fl?. By taking the proportional fairness into account, the proposed method can avoid this disadvantage and balance the rate allocation between the sequences with a larger fl? and a smaller fl?. For example, as shown in Figure 3.5, when the total available network bandwidth R increases from 3000kb/s to 4000kb/s, both the bit-rate of Mobile and Football increase. This is because the proposed method with the proportional fairness criterion aims at maximizing the product of the utility function Ui, and keeping a certain balance between the sequences with a larger fl? and a smaller fl? leads to an increase in the product. Let T PSNR = PNi=1 PSNRi be the sum of the analytical PSNRi computed by (3.34) of all the users. In Figure 3.6, we show T PSNR versus the available network bandwidth R. We can see that there is a big gap between the performance of AFD, AFR and MSPSNR, which means using AFD or AFR leads to a big loss in the system performance. However, the performance of the proposed method is almost the same as that of MSPSNR, which fully demonstrates the e?ciency of the proposed method. Therefore, while achieving a fair rate allocation among difierent users, the proposed method still performs well in terms of total PSNR. Finally, we evaluate the cheat-proof property of difierent methods. As shown in Table 3.1, since fl is the most important parameter representing the characteristics of video sequences, the best way for ui to pretend as another user uj is to use flj rather than fli in calculating optimal demand. Therefore, we evaluate the cheat- 52 proof property in terms of fl. In this experiment, the available network bandwidth R is set to be 4.2Mb/s. We assume that u6 who transmits Mobile sequence will cheat while other users are honest. In AFD, AFR and MSPSNR, u6 reports a false ~fl to the controller by scaling the original fl with a factor k, i.e., ~fl = kfl. In the proposed method, at each clock t of the auction, u6 uses ~fl to generate the \optimal" demand ~R6t using ~R6t = max[Rmin6 ; min( 1at ? 6kfl6 ; Rmax6 )] and reports ~R6 t to the controller. As shown in Figure 3.7(a), the PSNR performance of AFR is independent of the scale factor k. This is because AFR does not care about fl and just equally allocates the bandwidth to each user if the allocated bit-rate is within [Rmini ; Rmaxi ]. The PSNR performance of AFD decreases as k increases. This is because AFD tries to allocate more bit-rate to the video sequence with a smaller fl to preserve constant quality among difierent users. Therefore, with AFD, all users tend to report a smaller fl to the controller to obtain a better PSNR performance. On the contrary, the PSNR performance of MSPSNR is an increasing piecewise constant function in terms of k. This is because, with MSPSNR, the sequence with fli will not be allocated more bit-rate than Rmini if there is a sequence with flj > fli who has not been allocated its maximal rate requirement Rmaxj yet. To be allocated more rate and obtain a higher PSNR, u6 should increase k until at least kfl6 > flj where flj = minl(fll > fl6). Therefore, with MSPSNR, all users tend to report a larger fl to the controller to obtain a better PSNR performance. However, with the proposed method, as shown in Figure 3.7(b), reporting the optimal demand generated by the true fl (k=1) will lead to the best utility. Any deviation will lead to a loss in terms of utility, which means that the proposed method is cheat-proof. 53 Therefore, the proposed method ensures all users will be honest about their private information. 3.6 Summary In this chapter, we proposed a game-theoretic framework for multi-user multi- media rate allocation and a distributed cheat-proof scheme for users to converge to the NE of the game. Difierent from the traditional optimization-based approaches, which mainly focus on the e?ciency issue, e.g. maximizing the system performance, the proposed method not only considers the e?ciency issue but also the fairness issue. From the experimental results on real video sequences, we can see that with the proportional fairness criterion, the proposed game-theoretic method can e?- ciently and fairly allocate bit-rate to difierent users by allocating more bit-rate to the sequence with slower motion and/or simpler scene while keeping an eye on the fast motion and/or complex scene sequence. We also flnd that, with the proposed distributed cheat-proof rate allocation scheme, reporting the true optimal demand at every clock is the mutual best response for every user. Moreover, we show that the traditional optimization-based method that maximizes the weighted sum of the PSNRs is a special case of the game-theoretic framework with the utility function deflned as an exponential function of PSNR. 54 Chapter 4 Peer-to-Peer Cooperative Video Streaming Social Networks With the rapid development of signal processing, communication, and net- working technologies, video-over-IP applications become more and more popular and have attracted millions of users over the Internet [1] [9]. One simple solution to video streaming over Internet is the client-server service model [47] [76], where the video is streamed directly from a server to clients. However, with the client-server service model, the upload bandwidth of the server grows proportionally with the number of clients [86], which makes the large-scale video streaming impractical. To reduce the workload of the server, Peer-to-Peer (P2P) service model is proposed [36] [133], where a peer not only acts as a client to download data from the network, but also acts as a server to upload data for the other peers in the network. The upload bandwidth of the peers reduces the workload placed on the server dramatically, which makes large-scale video streaming possible. Recently, several industrial large-scale P2P video streaming systems have been developed, including Coolstreaming [133], PPLive [4], PPStream [5], UUSee [8] and Sopcast [7]. Studies show that these systems can support hundreds of thousands of users simultaneously [65]. While P2P video streaming systems have achieved promising results, they have several drawbacks. First, there is a large number of unnecessary traverse links within 55 a provider?s network. As observed in [129], each P2P bit on the Verizon network traverses 1000 miles and takes 5.5 metro-hops on average. Second, there is a huge number of cross Internet Service Provider (ISP) tra?c. The studies in [71] [107] showed that 50%-90% of the existing local pieces in active peers are downloaded externally. Third, the difierences in playback time among peers can be as high as 140 seconds [65], and the lag can be greater if the source rate is higher. Fourth, most of the current P2P systems assume that all peers are willing to contribute their resources. However, this assumption may not be true since the P2P systems are self-organizing networks and the peers are selflsh by nature [128] [60]. Note that the selflsh peers will act as free-riders if being free-riders can improve their utilities. In the literature, many approaches have been proposed to overcome these drawbacks. Karagiannis et al. [71] and Madhyastha et al. [87] proposed to use locality-aware P2P schemes to reduce the unnecessary traverse links within and cross ISPs and thus reduce the download time. Purandare and Guha [100] proposed an alliance based peering scheme to reduce the playback time lag and improve the Quality of Service (QoS). Xie et al. [129] proposed a P4P architecture that allows cooperative tra?c control between applications and network providers. To stimulate selflsh peers to contribute their resources, payment mechanisms [121] [57] and reputation schemes [88] [59] are proposed, where peers pay points to receive data and earn points by forwarding data to others. However, such payment or reputation based mechanisms often demand a centralized architecture and thus hinder their scalability. Game theory is a mathematical tool that analyzes the strategic interactions 56 among multiple decision makers. Recently, it draws great attentions in cognitive networking [123], multimedia social networking [136], and is being applied to many multimedia signal processing problems such as video coding [12] and multimedia communications [29]. In P2P networks, peers make intelligent decisions on their strategies of requesting and forwarding packets based on their needs and other peers? actions. Moreover, since peers are rational and thus naturally selflsh, they have no incentive to contribute their resources for other peers. Therefore, it is natural to study the intelligent behaviors and interactions of selflsh peers in P2P networks from a game theoretic perspective [128] [84]. Using a mental cost to describe the level of the peer?s altruism, the authors in [128] presented a game theoretical model to analyze nodes? behaviors and the in uence of incentive mechanism. In [84], a game theoretic framework is proposed for designing distributed, cheat-proof and attack- resistant cooperation stimulation strategies for P2P live streaming social networks. Most of the existing schemes treat every peer as an independent individual. However, in reality, every peer can have a large number of geographically neighboring peers with large intra-group upload and download bandwidths, e.g. the peers in the same lab, building, or campus. Here, we name those geographically neighboring peers with large intra-group upload and download bandwidths as group peers. To reduce the unnecessary traverse links and improve network e?ciency, instead of considering each peer?s strategy independently, we investigate possible cooperation among the group peers. Moreover, since peers are naturally selflsh, they will act as free-riders if doing so can improve their utilities. In such a case, full cooperation cannot be guaranteed. Instead, to achieve better payofi, rational peers will adjust 57 their degree of cooperation by learning from their payofi history. Therefore, a key question to answer is: \how a group of selflsh peers should cooperate with each other to achieve better streaming performance?" The main contributions of this chapter are summarized as follows. ? We propose a cooperative streaming scheme to enable cooperation among group peers to achieve better streaming performance. ? In the proposed scheme, we deflne the utility function of a peer by taking into account the possibility of real-time streaming and the cost of acting as a server to upload data for the other peers. ? Due to their selflsh nature, peers tend to act as free riders to improve their own utilities. Moreover, the peers may take out-of-equilibrium strategies due to the uncertainty of the strategies of the other peers. Therefore, a robust Nash equilibrium (NE) solution is desired for every peer. In this chapter, we formulate the cooperative streaming problem as an evolutionary game and derive the evolutionarily stable strategy (ESS) for every peer, which is the desired stable NE. ? To stimulate cooperation, the cooperative streaming scheme should be simple since peers may not be willing to join the cooperative streaming if the protocol is complicated. The proposed cooperative streaming scheme is very simple. Each peer tosses a coin to decide whether to be an agent or a free rider. If the outcome is head, the peer acts as an agent to download data from the peers outside the group. Otherwise, the peer acts as a free-rider to download data 58 from the agents. And the probability of being a head for the coin is learned from the peer?s own past payofi history. ? Due to the highly dynamic behaviors of the peers, i.e., the peers may join or leave the P2P network at any time, the cooperative streaming scheme should be distributed. We propose a distributed algorithm for every peer to approach the ESS by learning from the peer?s own past payofi history. The rest of this chapter is organized as follows. In Section 4.1, we describe the system model and the utility function. Then, we show in details how to select agents in a homogeneous group in Section 4.2. We extend the analysis to the heterogeneous case in Section 4.3. In Section 4.4, we propose a distributed learning algorithm for ESS. Finally, we show the simulation results in Section 4.5 and draw conclusions in Section 4.6. 4.1 The System Model and Utility Function 4.1.1 System Model As shown in Figure 4.1, there is a set of group peers1 (three in this example) who want to view a real-time video streaming simultaneously. Within a group, every peer can choose either to be an agent or a normal peer. If the peer serves as an agent, he/she not only needs to act as a client to download video data from the 1How to group the peers itself is an interesting problem. However, in this chapter, we assume that the peers have already been grouped and mainly focus on how the group peers cooperate with each other to achieve better streaming performance. 59 Agent Group Peers Figure 4.1: A cooperative streaming example. agents in other groups, but also needs to act as a server to upload video streams for both the agents in other groups and the peers in the same group. However, if the peer chooses not to be an agent, he/she only needs to download/upload data from/to the peers in the same group. We assume that the upload and download bandwidth within the group is larger than those cross groups. In such a case, peers tend to be a normal peer due to the selflsh nature. Nevertheless, the normal peers, on the other hand, take a risk of receiving degraded streaming performance since there may not be su?cient agents to download data from other groups. In order to achieve good streaming performance through cooperation, a question need to be addressed: given a group of peers, which peers should serve as agents. 60 4.1.2 Utility Functions In a P2P network, a peer not only acts as a client to download video data from the other peers but also acts as a server to upload video data for the other peers. Therefore, while a peer can beneflt from downloading data from the other peers, he/she also incurs a cost in uploading data for the other peers, where the cost can be resource spending on uploading data, e.g. bandwidth, bufier size. Given the group peers, u1, u2, ..., uN , we assume that k peers are willing to serve as agents to download multimedia data from the peers outside the group. Let the download rate be the transmission speed between an agent and a corresponding peer outside the group. If we denote that the download rates of the k agents are r1, r2, ..., rk, then the total download rate of the group peers is given by yk = kX i=1 ri: (4.1) Since the agents randomly and independently select peers outside the group for downloading data, the download rate ri?s are random variables. According to [68], the Cumulative Distribution Function (CDF) of a peer?s download bandwidth can be modelled as a linear function, which means that the PDF of a peer?s down- load bandwidth can be modelled as a uniform distribution, i.e., ri?s are uniformly distributed. To provide more insight into the cooperative streaming problem, we flrst con- sider a simple scenario without bufiering. Then, we extend our discussion to the case when there is bufiering efiect in Section 4.5. For the scenario without bufiering, if the total download rate yk is not smaller than the source rate r, then the group 61 peers can have a real-time streaming, and all the group peers can obtain a certain gain G. Otherwise, there will be some delay, and in this case we assume the gain is zero. Therefore, given the total download rate yk and the source rate r, if peer ui chooses to be an agent, then the utility function of ui is given by UA;i(k) = Pr(yk ? r)G? Ci;8k 2 [1; N ]; (4.2) where Ci is the cost of ui when he/she serves as an agent, and Pr(yk ? r) is the probability of achieving a real-time streaming which can be computed according to Theorem 1. Since the upload and download bandwidths within the group is large, the cost of uploading data to the other peers within the group can be negligible. In such a case, if peer ui chooses not to be an agent, then there is no cost for ui and the utility function becomes UN;i(k) = 8 >>< >>: Pr(yk ? r)G; if k 2 [1; N ? 1]; 0; if k = 0. (4.3) Theorem 1: If r1, r2,..., rk are i.i.d. uniformly distributed within [rL; rU ], then Pr(yk ? r) is given by Pr(yk ? r) = 12k! kX l=0 (?1)l 0 BB@ k l 1 CCA h (k ? l)k ? sgn(r^ ? l)(r^ ? l)k i ; (4.4) and when k is su?ciently large, Pr(yk ? r) can be approximated as Pr(yk ? r) ? Q 0 @ r^ ? k 2q k 12 1 A ; (4.5) where r^ = r?krLrU?rL and Q(x) is the Gaussian tail function R1 x 1p2? exp? x2 2 dx. 62 Proof :Let r^l = rl?rLrU?rL , 8l, then r^1, r^2, ..., r^k are i.i.d. uniformly distributed with [0; 1]. And the characteristic function of r^l is given by `(t) = i(1? e it) t : (4.6) Let y^k = Pk l=1 r^l, then the characteristic function of y^k can be computed by `y^k(t) = i(1? eit) t ?k : (4.7) Therefore, the density function of y^k is fy^k(y)=F?1t " i(1? eit) t ?k# (y) = 12(k ? 1)! kX l=0 (?1)l 0 BB@ k l 1 CCA sgn(y ? l)(y ? l)k?1: (4.8) Since Pr(yk ? r) = Pr(y^k ? r^), according to (4.8), we have Pr(yk ? r) = Pr(y^k ? r^) = Z 1 r^ fy^k(y)dy = 12k! kX l=0 (?1)l 0 BB@ k l 1 CCA h (k ? l)k ? sgn(r^ ? l)(r^ ? l)k i : (4.9) When k is su?ciently large, according to the Central Limit Theory, the dis- tribution of y^k can be approximated as Gaussian distribution N(k2 ; k12). Therefore, we have Pr(yk ? r) = Pr(y^k ? r^) ? Q 0 @ r^ ? k 2q k 12 1 A : (4.10) 63 4.2 Agents Selection Within A Homogeneous Group In the previous section, we have discussed the system model and the peer?s utility function. To optimize the streaming performance, proper peers should serve as agents to download data from the peers outside the group. In this section, we will discuss how to select agents within a homogeneous group where the cost of all peers serving as an agent is assumed to be the same. 4.2.1 Centralized Agent Selection If there is a central controller who can choose which peers should act as agents, then a straightforward criterion of selecting proper agents is to maximize the social welfare, which is the sum of all peers? utilities. Let Ci = C be the cost of a peer serving as an agent in a homogeneous group. Then the social welfare of an N ? peer group with k agents can be calculated by SW (k) = Pr(yk ? r)GN ? kC: (4.11) Based on (4.11), the agent selection problem to maximize the social welfare can be formulated as max k SW (k) = max k [Pr(yk ? r)GN ? kC] ; (4.12) where k 2 f1; 2; :::; Ng. By solving (4.12), we can flnd the optimal k? that maximizes the social wel- fare. Then, the central controller can choose k? peers from the group as agents to download data from the peers outside the group based on some mechanism, e.g. 64 the peers take turns to serve as agents. However, since peers? behaviors are highly dynamic, they may join in or leave the P2P network at any time. In such a case, the centralized approach may not be practical. 4.2.2 Distributed Agent Selection To overcome the drawback of the centralized approach, it is possible to consider a distributed approach where each peer acts as an agent with probability x. Then, according to (4.2) and (4.3), the group?s social welfare can be computed by Utotal(x)= NX i=1 0 BB@ N i 1 CCAxi(1? x)N?i h Pr(yi ? r)GN ? iC i : (4.13) The problem of flnding an optimal x to maximize the social welfare can be formulated as max x NX i=1 0 BB@ N i 1 CCAxi(1? x)N?i h Pr(yi ? r)GN ? iC i s:t: 0 ? x ? 1: (4.14) However, since peers are selflsh by nature, they are not as cooperative as a system designer/controller desires. By solving (4.14), we can flnd the optimal x? that maximizes the social welfare, but x? can not maximize each peer?s own utility. Therefore, the social welfare maximizer x? is not attainable when peers are selflsh. Moreover, the solution to the optimization problem shown in (4.14) is not stable since any perturbation will lead to a new solution. 65 4.2.3 Evolutionary Cooperative Streaming Game In order to provide a robust equilibrium strategy for the selflsh peers, we adopt the concept of Evolutionarily Stable Strategy (ESS) [110] [122], which is deflned as follows. Deflnition 1: A strategy a? is an ESS if and only if, 8a 6= a?, a? satisfles ? equilibrium condition: Ui(a; a?) ? Ui(a?; a?), and ? stability condition: if Ui(a; a?) = Ui(a?; a?), Ui(a; a) < Ui(a?; a), where Ui(a1; a2) is the utility of player i when he/she uses strategy a1 and another player uses strategy a2. Since all peers are selflsh, they will cheat if cheating can improve their payofis, which means that all peers are uncertain of other peers? actions and utilities. In such a case, to improve their utilities, peers will try difierent strategies in every play and learn from the strategic interactions using the methodology of understanding-by- building. During the process, the percentage of peers using a certain pure strategy may change. Such a population evolution can be modelled by replicator dynamics. Speciflcally, let xa stand for the probability of a peer using pure strategy a 2 A, where A = fA;Ng is the set of pure strategies including being an agent (A) and not being an agent (N). By replicator dynamics, the evolution dynamics of xa are given by the following difierential equation _xa = ?[ ?U(a; x?a)? ?U(xa)]xa; (4.15) where ?U(a; x?a) is the average payofi of the peers using pure strategy a, x?a is the 66 set of peers who use pure strategies other than a, ?U(xa) is the average payofi of all peers, and ? is a positive scale factor. From (4.15), we can see that if adopting pure strategy a can lead to a higher payofi than the average level, the probability of a peer using a will grow and the growth rate _xa=xa is proportional to the difierence between the average payofi of using strategy a (i.e., ?U(a; x?a)) and the average payofi of all peers (i.e., ?U(xa)). 4.2.4 Analysis of the Cooperative Streaming Game According to (4.2) and (4.3), the average payofi of a peer if he/she choose to be an agent can be computed by ?UA(x) = N?1X i=0 0 BB@ N?1 i 1 CCAxi(1? x)N?1?i h Pr(yi+1 ? r)G? C i ; (4.16) where x is the probability of a peer being an agent, and N?1 i ? xi(1? x)N?1?i is the probability that there are i agents out of N ? 1 other peers. Similarly, the average payofi of a peer if he/she chooses not to be an agent is given by ?UN(x) = N?1X i=1 0 BB@ N?1 i 1 CCAxi(1? x)N?1?iPr(yi ? r)G: (4.17) According to (4.16) and (4.17), the average payofi of a peer is ?U(x) = x ?UA(x) + (1? x) ?UN(x): (4.18) Substituting (4.18) back to (4.15), we have _x = ?x(1? x)[ ?UA(x)? ?UN(x)]: (4.19) 67 0 5 10 15 20 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 i w i Figure 4.2: The deceasing property of wi. At equilibrium x?, no player will deviate from the optimal strategy, which means _x? = 0, and we can obtain x? = 0, 1, or the solutions to ?UA(x) = ?UN(x). However, since _x? = 0 is only the necessary condition for x? to be ESS, we examine the su?cient condition for each ESS candidate and draw the following conclusions with the proofs shown in Theorem 2-4. ? x? = 0 is an ESS only when Pr(y1 ? r)G? C ? 0. ? x? = 1 is an ESS only when Pr(yN ? r)G? Pr(yN?1 ? r)G ? C. ? Let x? be the solution to ?UA(x) = ?UN(x), and x? 2 (0; 1). Then, x? is an ESS. Lemma 1: Let f(x) = ?UA(x) ? ?UN(x), then f 0(x) < 0, 8x 2 [0; 1]. Proof : 68 According to (4.16) and (4.17), we have f(x) = N?1X i=0 0 BB@ N?1 i 1 CCAxi(1? x)N?1?iwi ? C; (4.20) where wi = [Pr(yi+1 ? r)? Pr(yi ? r)]G. ? 8x 2 (0; 1), by taking the derivative of f(x) over x, we have f 0(x) = N?1X i=0 0 BB@ N?1 i 1 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi; = i1X i=0 0 BB@ N?1 i 1 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi + N?1X i=i1+1 0 BB@ N?1 i 1 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi; (4.21) where i1 is the integer such that i1 ? (N ? 1)x and i1 + 1 > (N ? 1)x. Since wi stands for the additional gain by introducing one more agent into the i-agent system, as shown in Fig. 4.2, it is a decreasing function in terms of i, which means that wi ? wi1 ;8i ? i1 and wi ? wi1 ;8i > i1. Therefore, 69 according to (4.21), we have f 0(x) < i1X i=0 0 BB@ N?1 i 1 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi1 + N?1X i=i1+1 0 BB@ N?1 i 1 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi1 ; = wi1 N?1X i=0 0 BB@ N?1 i 1 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]; = wi1 d 2 664 PN?1 i=0 0 BB@ N?1 i 1 CCAxi(1? x)N?1?i 3 775 dx ; = 0: (4.22) Therefore, f 0(x) < 0, 8x 2 (0; 1). ? The derivative of f(x) over x at x = 0 can be computed by f 0(0) = lim "!0 f(")? f(0) " = lim "!0 PN?1 i=0 0 BB@ N?1 i 1 CCA "i(1? ")N?1?iwi ? w0 " = lim "!0 (1? ")N?1w0 ? w0 " + lim"!0 (N ? 1)"(1? ")N?2w1 " = (N ? 1)(w1 ? w0) < 0: (4.23) where the last inequality comes from the fact that wi is a decreasing function in terms of i. 70 ? Similarly, the derivative of f(x) over x at x = 1 can be computed by f 0(1) = lim "!0 f(1)? f(1? ") " = lim "!0 wN?1 ? PN?1 i=0 0 BB@ N?1 i 1 CCA (1? ")i"N?1?iwi " = lim "!0 wN?1 ? (1? ")N?1wN?1 " + lim"!0 ?(N ? 1)(1? ")N?2"wN?2 " = (N ? 1)(wN?1 ? wN?2) < 0: (4.24) where the last inequality comes from the fact that wi is a decreasing function in terms of i. In all, f 0(x) < 0, 8x 2 [0; 1]. This completes the proof of the lemma. Theorem 2: The condition for x? = 0 to be an ESS is Pr(y1 ? r)G?C ? 0. Proof : According to (4.16-4.18), the utility that a peer using mixed strategy x and the other peers use mixed strategy x? = 0 can be written as ?U(x; 0) = ?UN(0) + ( ?UA(0)? ?UN(0))x; where ?UA(0) = Pr(y1 ? r)G? C and ?UN(0) = 0. ? If Pr(y1 ? r)G? C > 0, i.e. ?UA(0) > ?UN(0), every peer will deviate to x = 1 to obtain ?UA(0) rather than ?UN(0). ? If Pr(y1 ? r)G? C < 0, i.e. ?UA(0) < ?UN(0), every peer will stay at x = 0 to obtain ?UN(0) rather than ?UA(0). 71 ? If Pr(y1 ? r)G ? C = 0, i.e. ?UA(0) = ?UN(0), then ?U(x; 0) = 0 8x, and f(0) = ?UA(0)? ?UN(0) = 0. According to Lemma 1, we know that f 0(x) < 0 8x 2 [0; 1], so f(x) = ?UA(x)? ?UN(x) < 0. In such a case, ?U(0; x) = ?UN(x) > ?U(x; x) = ?UN(x)+( ?UA(x)? ?UN(x))x, which means x? = 0 is an ESS according to Deflnition 1. Therefore, x? = 0 is an ESS only when Pr(y1 ? r)G?C ? 0. Theorem 3: The condition for x? = 1 to be an ESS is Pr(yN ? r)G ? Pr(yN?1 ? r)G ? C. Proof : According to (4.16-4.18), the utility that a peer using mixed strategy x and the other peers use mixed strategy x? = 1 can be written as ?U(x; 1) = ?UN(1) + ( ?UA(1)? ?UN(1))x; where ?UA(1) = Pr(yN ? r)G? C and ?UN(1) = Pr(yN?1 ? r)G. ? If Pr(yN ? r)G ? Pr(yN?1 ? r)G < C, i.e., ?UN(1) > ?UA(1), every peer will deviate to x = 0 to obtain ?UN(1) rather than ?UA(1). ? If Pr(yN ? r)G ? Pr(yN?1 ? r)G > C, i.e., ?UN(1) < ?UA(1), every peer will stay at x = 1 to obtain ?UA(1) rather than ?UN(1). ? If Pr(yN ? r)G ? Pr(yN?1 ? r)G = C, i.e. ?UN(1) = ?UA(1), then ?U(x; 1) = ?UN(1) 8x, and f(1) = ?UA(1)? ?UN(1) = 0. According to Lemma 1, we know that f 0(x) < 0 8x 2 [0; 1], so f(x) = ?UA(x) ? ?UN(x) > 0. In such a case, ?U(1; x) = ?UN(x)+ ( ?UA(x)? ?UN(x))1 > ?U(x; x) = ?UN(x) + ( ?UA(x)? ?UN(x))x, which means x? = 1 is an ESS according to Deflnition 1. 72 Therefore, x? = 1 is an ESS only when Pr(yN ? r)G? Pr(yN?1 ? r)G ? C. Theorem 4: If x? 2 (0; 1) is a solution to ?UA(x) = ?UN(x), then x? is an ESS. Proof : Let ?Ui(x; x?) be the utility of player i when player i uses mixed strategy x and other users use mixed strategy x?. Then, we have ?Ui(x; x?) = x ?UA(x?) + (1? x) ?UN(x?): (4.25) Since x? is a solution to ?UA(x) = ?UN(x), we have ?UA(x?) = ?UN(x?). Therefore, (4.25) becomes ?Ui(x; x?) = ?UA(x?) = ?Ui(x?; x?); (4.26) which means x? satisfles the equilibrium condition shown in Deflnition 1. Moreover, according to (4.18), we have ?Ui(x; x) = ?UN(x) + ( ?UA(x)? ?UN(x))x; (4.27) and ?Ui(x?; x) = ?UN(x) + ( ?UA(x)? ?UN(x))x?: (4.28) Therefore, we have ?Ui(x?; x)? ?Ui(x; x) = ( ?UA(x)? ?UN(x))(x? ? x): (4.29) From Lemma 1, we know that f(x) = ?UA(x) ? ?UN(x) is a monotonically decreasing function. Since ?UA(x?) = ?UN(x?), ?UA(x) ? ?UN(x) > 0 if x < x?, and ?UA(x)? ?UN(x) < 0 if x > x?. Therefore, ( ?UA(x)? ?UN(x))(x??x) > 0, 8x 6= x?, i.e. ?Ui(x?; x) > ?Ui(x; x);8x 6= x?; (4.30) 73 which means x? satisfles the stability condition shown in Deflnition 1. According to (4.26) and (4.30), we know that x? is an ESS. 4.3 Agents Selection Within A Heterogeneous Group In this section, we will discuss how to select agents within a heterogeneous group where the costs of the peers acting as agents are difierent. Let xi;ai stands for the probability of peer ui using pure strategy ai 2 A. By replicator dynamics, the evolution dynamics of xi;ai are given by the following difierential equation _xi;ai = ?[ ?Ui(ai; x?i)? ?Ui(xi)]xi;ai ; (4.31) where ?Ui(ai; x?i) is the average payofi of peer ui using pure strategy ai, ?Ui(xi) is the average payofi of peer ui using mixed strategy xi, and ? is a positive scale factor. Since it is generally very di?cult to represent ?Ui(ai; x?i) and ?Ui(xi) in a com- pact form, in the following, we flrst analyze a two-player game to gain some insight. Then, we generalize the observation in the two-player game to the multi-player game. Table 4.1: Utility table of a two-player game. 74 4.3.1 Two-Player Game Let x1 and x2 be the probability of u1 and u2 being an agent, respectively. Let B1 = Pr(y1 ? r)G and B2 = Pr(y2 ? r)G. Then, the payofi matrix of u1 and u2 can be written as in Table 4.1. Therefore, the average payofi ?U1(A; x2) can be computed by ?U1(A; x2) = (B2 ? C1)x2 + (B1 ? C1)(1? x2); (4.32) and the average payofi ?U1(x1) becomes ?U1(x1) = (B2 ? C1)x1x2 + (B1 ? C1)x1(1? x2) +B1(1? x1)x2: (4.33) According to (4.31), the replicator dynamics equation of u1 is given by _x1 = ?x1(1? x1) [B1 ? C1 ? (2B1 ?B2)x2] : (4.34) Similarly, the replicator dynamics equation of u2 can be computed by _x2 = ?x2(1? x2) [B1 ? C2 ? (2B1 ?B2)x1] : (4.35) At equilibrium, we know that _x1 = 0 and _x2 = 0. According to (4.34) and (4.35), we can get flve equilibria: (0; 0), (0; 1), (1; 0), (1; 1), and the mixed strategy equilibrium ? B1?C2 2B1?B2 ; B1?C12B1?B2 ? . According to [41], if an equilibrium of the replicator dynamics equations is a locally asymptotically stable point in a dynamic system, it is an ESS. Therefore, by viewing (4.34) and (4.35) as a nonlinear dynamic system and analyzing the corresponding Jacobian matrix, we can examine whether the flve equilibria are ESSs. 75 By taking partial derivatives of (4.34) and (4.35), the Jacobian matrix can be written as J = 2 664 @ _x1 @x1 @ _x1 @x2 @ _x2 @x1 @ _x2 @x2 3 775 = ? 2 664 J11 J12 J21 J22 3 775 ; (4.36) where J11 = (1 ? 2x1)(B1 ? C1 ? (2B1 ? B2)x2), J12 = ?x1(1 ? x1)(2B1 ? B2), J21 = ?x2(1? x2)(2B1 ?B2), and J22 = (1? 2x2)(B1 ? C2 ? (2B1 ?B2)x1). The asymptotical stability requires that det(J) > 0 and tr(J) < 0 [41]. Sub- stituting the flve equilibria, i.e. (0; 0), (0; 1), (1; 0), (1; 1), and ? B1?C2 2B1?B2 ; B1?C12B1?B2 ? , to (4.36), we conclude that ? If B2?B1?C1 > 0 and B2?B1?C2 > 0, there is a unique ESS (1; 1), where both u1 and u2 converge to be agents. ? Elseif B2 ? B1 ? C1 > 0 and B2 ? B1 ? C2 < 0, there is a unique ESS (1; 0), where u1 converges to be an agent and u2 converges to be a free-rider. ? Elseif B2 ? B1 ? C1 < 0 and B2 ? B1 ? C2 > 0, there is a unique ESS (0; 1), where u2 converges to be an agent and u1 converges to be a free-rider. ? Else, there are two ESSs (0; 1) and (1; 0), where the converged strategy proflles depends on the initial strategy proflles. From the above analysis, we can see that when the gain of being an agent (B2 ? B1) is greater than the cost of being an agent (C1 or C2), the peer tends to be an agent. And the peer with a higher cost tends to be a free-rider and rely on the peer with a lower cost. 76 Algorithm 2 : A Distributed Learning Algorithm For ESS 1. Given the step size ? and the slot index t = 0, each peer ui initializes xi with xi(0). 2. During slot t, for q = 1 : M , ? ui tosses a coin with probability xi(t) being head. If the outcome is head, ui serves as an agent and downloads data from the peers outside the group with download rate ri(t; q). On the other hand, if the outcome is tail, ui acts as a free-rider and downloads the data from the agents. ? ui computes his/her utility using (4.39). ? ui computes the indicator function using (4.38). 3. Then, ui approximates ?Ui(A; x?i(t)) and ?Ui(xi(t)) using (4.40) and (4.41). 4. Finally, ui updates the probability of being an agent xi(t) using (4.37). 4.3.2 Multi-Player Game From the analysis of the two-player game, we can infer that the peer with a higher cost (Ci) tends to take advantage of the peer with a lower cost. This observation can be extended to multi-player game. If there are more than two peers in the game, the strategy of the peers with higher C 0is will converge to \N" with a greater probability. The peers with lower C 0is tend to be agents since they sufier relatively heavier losses if no one serves as an agent. 77 4.4 A Distributed Learning Algorithm For ESS From the previous two sections, we can see that the ESS can be found by solv- ing the replicator dynamics equations ((4.19) or (4.31)). However, solving the repli- cator dynamics equations require the exchange of private information and strategies adopted by other peers. In this section, we will present a distributed learning algo- rithm that can gradually converge to ESS without information exchange. We flrst discretize the replicator dynamics equation shown in (4.31) as xi(t+ 1) = xi(t) + ? ? ?Ui(A; x?i(t))? ?Ui(xi(t)) ?xi(t); (4.37) where t is the slot index and xi(t) is the probability of ui being an agent during slot t. Here, we assume that each slot can be further divided into M subslots and each peer can choose to be an agent or not at the beginning of each subslot. From (4.37), we can see that in order to update xi(t + 1), we need to flrst compute ?Ui(A; x?i(t)) and ?Ui(xi(t)). Let us deflne an indicator function 1i(t; k) as 1i(t; q)= 8 >>< >>: 1; if ui is an agent at subslot q in slot t, 0; else, (4.38) where q is the subslot index. The immediate utility of ui at subslot q in slot t can be computed by Ui(t; q) = 8 >>>>>>>>>>< >>>>>>>>>>: G? Ci; if ui is an agent and rt ? r, ?Ci; if ui is an agent and rt < r, G; if ui is not an agent and rt ? r, 0; if ui is not an agent and rt < r, (4.39) 78 where rt is the total download rate of the agents and r is the source rate. Then, ?Ui(A; x?i(t)) can be approximated using ?Ui(A; x?i(t)) = PM q=1 Ui(t; q)1i(t; q)PM q=1 1i(t; q) ; (4.40) Similarly ?Ui(xi(t)) can be approximated as ?Ui(xi(t)) = 1M MX q=1 Ui(t; q): (4.41) Based on (4.37-4.41), ui can gradually learn the ESS. In Algorithm 2, we summarize the detailed procedures of the proposed distributed learning algorithm. 4.5 Simulation Results In all simulations, the parameters G, rL, and rU are set to be 1, 50, and 800, respectively. For convenience, in the rest of this chapter, we denote the cen- tralized approach maximizing the social welfare shown in (4.12) as MSW-C, the distributed approach maximizing the social welfare shown in (4.14) asMSW-D, and the ESS-based approach as ESS-D. We compare the proposed methods with the traditional P2P non-cooperation method, denoted as Non-Coop. In Non-Coop, each peer acts as an individual and randomly selects some peers for downloading video streams. Such a protocol has been widely used in the existing P2P systems, e.g., Coolstreaming [133] and PPLive [4]. In the flrst simulation, we show the social welfare (the sum of all peers? util- ities) comparison among difierent approaches, where we assume that there are 20 homogenous peers and the cost C is 0:1. As show in Fig. 4.3, MSW-C achieves the 79 best social welfare performance since its objective function is to maximize the social welfare with pure strategy. By using the mixed strategy to maximize the social welfare, MSW-D achieves the second best social welfare performance. However, as discussed in Section 4.2.2, the solution to MSW-D is not stable. With ESS-D, a stable NE solution can be obtained at the cost of a slight loss in social welfare. Nev- ertheless, all three proposed algorithms perform much better than the Non-Coop method. In Non-Coop, the social welfare performance decreases linearly in terms of the source rate. With cooperation and adaptively selecting the proper number of agents, all three proposed algorithms can preserve a high social welfare performance even with a large source rate. In the second simulation, we evaluate the convergence property of the ESS-D. In Fig. 4.4, we show the replicator dynamic of the cooperation streaming game with homogeneous peers, where C = 0:1 and r = 500. We can see that starting from a high initial value, all peers gradually reduce their probabilities of being an agent since being a free-rider more often can bring a higher payofi. However, since too low a probability of being an agent increases the chance of having no peer be an agent, the probability of being an agent will flnally converge to a certain value which is determined by the number of peers. In Fig. 4.5, we show the replicator dynamic of the cooperation streaming game with 20 heterogeneous peers, where r = 500 and the cost Ci is randomly chosen from [0:1; 0:3]. We further assume that Ci is monotonically increasing in i where u1 has the lowest cost and u20 has the highest cost. From Fig. 4.5, we can see that the peers with lower costs (u1, u2, and u3 in this simulation) converge to be an agent while 80 150 200 250 300 350 400 450 500 550 600 2 4 6 8 10 12 14 16 18 20 Source Rate (kb/s) Social Welfar e Non?Coop MSW?C MSW?D ESS?D Figure 4.3: The social welfare comparison among Non-Coop, MSW-C, MSW-D, and ESS-D. the peers with higher costs (u4 ? u20 in this simulation) converge to be a free-rider. This observation coincides with our conclusion in Section 4.3.2, which is \the peers with lower costs tend to be an agent since they sufier relatively higher losses if no one serves as an agent". Note that due to the space limitation, we only show the behavior dynamics of u1 ? u4. All other peers u5 ? u20 have the similar behavior dynamics with u4, and they all converge to be free-riders. In the third simulation, we compare the performance of Non-Coop and ESS- D in terms of the probability of real-time streaming, which is deflned as the proba- bility that the total download rate is greater than the source rate. The simulation results are shown in Fig. 4.6. We can see that with cooperation, the probability of 81 0 100 200 300 400 500 600 0.2 0.25 0.3 0.35 0.4 0.45 0.5 Time interval Probability of being an agen t N=10 N=15 N=20 Figure 4.4: Behavior dynamic of a homogeneous group of peers. real-time streaming can be signiflcantly improved especially at the high source rate region. We also flnd that at the high source rate region, the probability of real-time streaming increases as N increases. The visual quality comparison between Non-Coop and ESS-D is shown in Fig. 4.7. In this simulation, we flx the probability of real-time streaming to be 0.85. According to Fig. 4.6, we can see that the corresponding source rates for \Non-Coop", \ESS-D with N=2", \ESS-D with N=3", and \ESS-D with N=4" are around 100kb/s, 300kb/s, 520kb/s, and 720kb/s, respectively. By setting the above source rates as the target bitrates, we encode the Foreman sequence with CIF format using H.264 encoder. From Fig. 4.7, we can see that the video visual quality with the proposed ESS-D is much better than that with Non-Coop. Then, we show the simulation result of the source rate versus the utility. As 82 0 50 100 150 200 250 300 350 400 0 0.5 1 Probability of being an agent Time interval Peer u 1 0 50 100 150 200 250 300 350 400 0 0.5 1 Probability of being an agent Time interval Peer u 2 0 50 100 150 200 250 300 350 400 0 0.5 1 Probability of being an agent Time interval Peer u 3 0 50 100 150 200 250 300 350 400 0 0.5 Probability of being an agent Time interval Peer u 4 Figure 4.5: Behavior dynamic of a heterogeneous group of peers. shown in Fig. 4.8, without cooperation, if the peer requires a utility around 0.8, the source rate can not be larger than 130 kb/s. However, with cooperation, the source rate can be more than 400 kb/s even when there are only 2 peers. Therefore, with cooperation, the peers can enjoy much higher quality video with the same utility. In the fourth simulation, we consider the case that the peers in the same group are viewing multiple channels with L being the number of the channels. We assume that the source rate is the same for all channels and there are 20 homogenous peers with the cost C = 0:1. Similar to the View-Upload Decoupling (VUD) scheme [127], the uploading and downloading are decoupled in the proposed ESS-D algorithm in this case. We allow cooperation among all the peers where the agent may download 83 0 500 1000 1500 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Source rate (kb/s) Probability of real?time streamin g Non?Coop ESS?D with N=2 ESS?D with N=3 ESS?D with N=4 ESS?D with N=5 Figure 4.6: The probability of real-time streaming comparison between Non-Coop and ESS-D. source data that he/she is not viewing. As shown in Fig. 4.9, without cooperation, if the peer requires a utility around 0.8, the source rate can not be larger than 130 kb/s in the Non-Coop method. However, with the proposed ESS-D algorithm, the source rate can be around 240kb/s even when the peers are view 8 difierent channels. This phenomenon fully demonstrates the e?ciency of the proposed method. In the last simulation, we consider the scenario when there is bufiering efiect. In such a scenario, the gain in the utility will not drop to zero when the total download rate is smaller than the source rate. Instead, the gain should maintain a positive value due to the existence of bufiers. One possible utility function that 84 (a) (b) (c) (d) Figure 4.7: The visual quality comparison: (a) Non-Coop; (b) ESS-D with N=2; (c) ESS-D with N=3; (d) ESS-D with N=4. considers the bufiering efiect is UA;i(k) = 1ln(r)E [ln(yk)]G? Ci;8k 2 [1; N ]; UN;i(k) = 8 >>< >>: 1 ln(r)E [ln(yk)]G; if k 2 [1; N ? 1]; 0; if k = 0. (4.42) From the above utility function, we can see that for any given source rate r, the gain increases as the total download rate yk increases. Moreover, since the probability of playback delay becomes smaller with more data in the bufier, a certain 85 0.4 0.5 0.6 0.7 0.8 0.9 1 50 100 150 200 250 300 350 400 450 500 Utility Source Rate (kb/s ) Non?Coop ESS?D with N=2 ESS?D with N=3 Figure 4.8: Single-source rate comparison between Non-Coop and ESS-D. increase in the high yk region should lead to a less signiflcant gain than that in the low yk region [30]. Here, we use the ln(:) function to characterize such properties. Nevertheless, other functions that have similar properties can also be used. The social welfare comparison between Non-Coop and ESS-D with the util- ity function in (4.42) is shown in Fig. 4.10. From Fig. 4.10, we can see that when the utility function in (4.42) is used, the social welfare performance of Non-Coop no longer decreases linearly in terms of the source rate. This phenomenon is mainly because, with the existence of bufiers, the gain will not drop to zero when the total download rate is smaller than the source rate. Nevertheless, ESS-D can still lead to a much higher social welfare performance for all source rates, compared with 86 0.4 0.5 0.6 0.7 0.8 0.9 1 50 100 150 200 250 300 350 400 450 500 Utility Source Rate (kb/s ) Non?Coop ESS?D with L=4 ESS?D with L=6 ESS?D with L=8 ESS?D with L=10 Figure 4.9: Multi-source rate comparison between Non-Coop and ESS-D. Non-Coop. Moreover, we should notice that all the analysis in Section 4.2 is still applicable to the utility function in (4.42). 4.6 Summary In this chapter, we propose a cooperative streaming scheme to address the network ine?ciency problem encountered by the traditional non-cooperative P2P schemes. We answer the question of \how a group of selflsh peers with large intra- group upload and download bandwidths cooperate with each other to achieve better streaming performance" by formulating the problem as an evolutionary game and deriving the ESS for every peer. We further propose a distributed learning algorithm 87 200 300 400 500 600 16 18 20 22 24 26 28 Source Rate (kb/s) Social Welfar e Non?Coop ESS?D Figure 4.10: The social welfare comparison between Non-Coop and ESS-D when the utility function is deflned as (4.42). for each peer to converge to the ESS by learning from his/her own past payofi history. From the simulation results, we can see that compared with the traditional non-cooperative P2P schemes, the proposed algorithm achieves much better social welfare, higher probability of real-time streaming, and better video quality (higher source rate). Moreover, by incorporated with the recent proposed View-Upload Decoupling (VUD) scheme, the proposed cooperative streaming scheme allows the peers who are viewing difierent videos to cooperate with each other and mutually improve the streaming performance. 88 Chapter 5 Cooperation Stimulation Using Indirect Reciprocity Game Modeling A cognitive network is a network composed of elements that can dynamically adapt to varying network conditions to optimize end-to-end performance through learning and reasoning [116]. In such a network, nodes are intelligent and have the ability to observe, learn, and act to optimize their performance. Since nodes generally belong to difierent authorities and pursue difierent goals, fully cooperative behaviors, such as unconditionally forwarding packets for each other, cannot be taken for granted. Instead, nodes will only cooperate with others when cooperation can improve their own performance. We regard the nodes with such behaviors as selflsh nodes. Therefore, a key problem in cognitive networks is how to stimulate cooperation among selflsh nodes. In the literature, many schemes have been proposed to stimulate node co- operation for difierent cognitive networks, such as [23] [138] for ad hoc networks and [121] [57] for peer-to-peer networks. One way to stimulate cooperation among selflsh nodes is to use payment based methods [139] [14]. Although these schemes can achieve promising cooperation stimulation results, the requirement of tamper- proof hardware or central billing services greatly limits their potential applications. Another way to stimulate cooperation among selflsh nodes is to use reputation- based methods with necessary monitoring [130] [88] [59]. Marti et. al [89] propose 89 a mechanism, called \watchdog", to identify the misbehaving nodes and another mechanism, called \pathrater", to de ect the tra?cs around them. The major drawback of their method is that misbehaving nodes are not punished. Therefore, there is no incentive for the nodes to cooperate. To overcome this problem, Bucheg- ger and Boudec [22] as well as Michiardi and Molva [90] propose reputation-based mechanisms to enforce node cooperation. In both approaches, nodes observe the be- havior of each other, store this information locally, and distribute this information in reputation reports. According to their observations, nodes isolate the misbehav- ing nodes by denying forwarding packets to them. However, there is no theoretical justiflcation about the optimality of such approaches. Recently, efiorts have been made to mathematically analyzing cooperation in cognitive networks using game theory [91] [42] [84] [128]. Srinivasan et al. [112] propose to use generous TIT-FOR-TAT strategy while Urpi et al. [118] propose to use Bayesian games. In [53], Felegyhazi et al. investigate equilibrium conditions of packet forwarding strategies based on game theory and graph theory by taking into account the network topology. In [131], Yu and Liu propose a game theoretic framework to jointly analyze cooperation stimulation and security in autonomous mobile ad hoc networks. Their results show that, for a two-player packet forwarding game, the unique cheat-proof Nash equilibrium for every node is not to help the opponent more than the opponent has helped him/her. However, most of the existing game theoretical frameworks rely on the assump- tion that the game between a pair of players is directly played for inflnite times. In reality, due to mobility or changes of environment, nodes will periodically update 90 their partners to achieve better performance, which means that any pair of players are supposed to play for only flnite times with the termination time are either known or can be estimated by both players. Note that every player can experience inflnite times with many players but never always with the same partner. In such a case, according to the well known Prisoner?s Dilemma and backward induction princi- ple [96], the only optimal strategy is to always play non-cooperatively. The major reason causing such a non-cooperative optimal strategy is the implicit assumption of direct reciprocity in most games, where the action of a player taking towards his/her opponent is purely determined by the history of how the opponent treats him/her. Obviously, under such a scenario, all players have no incentive to play cooperatively since their behaviors will not be evaluated by other players except their opponents. To stimulate the plays? incentive to play cooperatively, not only the evaluations from the opponents but also the evaluations from other observers should be taken into account, which leads to the notion of \indirect reciprocity". Indirect reciprocity is a key mechanism for the evolution of human cooperation and has recently drawn a lot of attentions in the area of social science and evolutionary biology [93] [94]. The key concept of indirect reciprocity is \I help you not because you have helped me but because you have helped others". In this chapter, we propose to use the indirect reciprocity game modelling to stimulate cooperation among selflsh nodes for the scenario where the number of interactions between any pair of players are flnite. The main contributions of this chapter are summarized as follows. ? We propose a cooperation stimulation scheme to stimulate cooperation among 91 selflsh users in cognitive networks using indirect reciprocity game modelling. Difierent from the existing game-theoretic approaches, our proposed scheme does not rely on the assumption that the number of interactions between a pair of players are inflnite. ? In the proposed scheme, we flrst develop the concept of reputation distribu- tion to capture not only the mean behavior of the transmitter?s reputation but also all likelihoods of the transmitter?s reputation that may be. Then, we de- velop a reputation updating policy for the receiver and observers to update the transmitter?s reputation distribution based on the transmitter?s previous rep- utation distribution and his/her action toward the receiver. We also propose a gradient descent algorithm to flnd the stationary reputation distribution of the whole population for any given optimal action rule. ? In the proposed scheme, we formulate the problem of flnding the optimal action rule as a Markov Decision Process (MDP) and proposed a modifled value iteration algorithm to flnd the optimal action rule. ? We show that with an appropriate cost-to-gain ratio, the strategy of forwarding the number of packets that is equal to the reputation level of the receiver is an evolutionarily stable strategy (ESS). We also show that even with only 60 percentage of population adopting the optimal action rule at the beginning, by natural selection, the optimal action rule will quickly spread over the whole population. And once the whole population use the optimal action rule, no one will deviate. Moreover, we flnd that such an ESS will lead to a \good" 92 Population Observers Transmitter Receiver Figure 5.1: System model. Within every interaction, a pair of transmitter and re- ceiver is randomly sampled from the population. Then, the transmitter will forward a certain amount of packets to the receiver according to the receiver?s and his/her own reputations. After the transmission, the transmitter?s reputation will be up- dated by the receiver and the observers. Finally, the transmitter?s reputation is propagated to the whole population from the receiver and the observers through a noisy gossip channel. society with more than 90 percentage of the population have good reputation. The rest of this chapter is organized as follows. In Section 5.1, we describe the problem formulation and introduce basic components in our system model. Then, we show in details how to flnd the optimal action rule in Section 5.2. In Section 5.3, we describe two action spreading algorithms due to natural selection. Finally, we show the simulation results in Section 5.4 and draw conclusions in Section 5.5. 93 5.1 The System Model As shown in Figure 5.1, let us consider a cognitive network with su?ciently large population of nodes. Due to mobility and/or changes of environment, short interactions rather than long-lasting associations between anonymous partners are dominant. At each time slot, a fraction of players is chosen from the population to form pairs to forward packets. Within each pair, one player acts as a transmitter and the other player as a receiver. Let A = f0; 1; :::; Lg stand for the action set that the transmitter may choose, where the action i 2 A stands for the transmitter forwards i packets to the receiver. In the simplest model with L = 1, the receiver can obtain a gain g at a cost c to the transmitter. We should always assume that the gain g is greater than the cost c. Otherwise, no transmission will occur. In such a case, if both players cooperate with each other and forward one packet to the other player, both players receive g ? c, which is better than what they would obtain by both defecting, namely 0. However, a unilateral defector would earn g, which is the highest payofi, and the exploited cooperator would pay the cost c without receiving any beneflt. The payofi structure yields an instance of the well-known Prisoner?s Dilemma game and the unique Nash equilibrium (NE) is defecting, i.e. both players will not forward the packet to the other player. Moreover, with backward deduction, the NE remains the same even the game is played a flnite number of times. Such a non-cooperative optimal strategy is mainly because of the use of direct reciprocity, where the action of a transmitter taking towards a receiver is purely determined by the history of 94 how the receiver treats him/her. Obviously, under such a scenario, all transmitters have no incentive to forward packets since their behaviors will not be evaluated by other players except their corresponding receivers. To stimulate the cooperation under such a scenario, we use the indirect reci- procity game modelling, where the essential concept is: \I help you not because you have helped me but because you have helped others". Therefore, a key concept in indirect reciprocity game is the establishment of the notion of reputation, which is the evaluation of the history of the players? action. Here, to simplify the analysis, we assume that the reputation is quantized to L+1 levels with \0" being the worst reputation and \L" being the best reputation, i.e., the reputation set can be repre- sented as T = f0; 1; :::; Lg. However, the results can be easily extended to the case that the reputation set has difierent size from the action set. Here, we also assume that everyone agrees on the reputation of an individual and no private opinions are allowed. However, errors in assigning reputation are possible. During each inter- action, the transmitter determines his action, i.e. how many packets to forward to the receiver, based on the receiver?s and his/her own reputations. After each inter- action, the reputation of the receiver remains the same, while the reputation of the transmitter is flrst updated by the receiver and the observers, and propagated to the whole population through a noisy gossip channel. Then, each participant (including both the transmitter and receiver) goes back to the population with probability ? or leaves the population with probability 1 ? ?. The parameter ? can be treated as a discounting factor of the future. For every player who leaves the population, a new individual enters with an initial reputation randomly chosen from the reputation set 95 with equal probability 1L+1 . 5.1.1 Social Norms A social norm, Q, is a matrix used for updating the immediate reputation of players, where the immediate reputation is the reputation that a transmitter can immediately obtain by taking an action. Each element Qi;j in the social norm stands for the immediate reputation assigned to a transmitter who has taken the action i toward a receiver whose reputation is j. Without loss of generality, we assume that all players in the population share the same norm. Although the immediate reputation is only determined by the action of the transmitter and the reputation of the receiver, we can see from the later discussion, the flnal reputation updating rule also involves the reputation of the transmitter. Since both the cardinalities of the action set and the reputation set are L+1, there are (L + 1)(L+1)?(L+1) possible social norms. Based on the intuition that for- warding packets to the receiver with good reputation or denying forwarding packets to the receiver with bad reputation should receive good reputation, here, we deflne the immediate reputation Qi;j as follows Qi;j = L? ji? jj; (5.1) 96 which means that the social norm is Q = 0 BBBBBBBBBBBBBB@ L L? 1 : : : 1 0 L? 1 L : : : ... 1 ... L? 1 . . . L? 1 ... 1 ... : : : L L? 1 0 1 : : : L? 1 L 1 CCCCCCCCCCCCCCA : (5.2) For the special case when L = 1, the 2? 2 social norm can be written as Q2?2 = 0 BB@ 1 0 0 1 1 CCA ; (5.3) where \1" stands for good reputation and \0" stands for bad reputation. With such a social norm shown in (5.3), we can see that the transmitter can obtain a good immediate reputation by either forwarding packets to the receiver with good reputation or denying forwarding packets to the receiver with bad reputation. On the other hand, the transmitter will obtain a bad immediate reputation if he/she either denies forwarding packets to the receiver with good reputation or forwards packets to the receiver with bad reputation. 5.1.2 Action Rules An action rule, a, is an action table of the transmitter, where the ith row and jth column element ai;j stands for the action of the transmitter based on his/her own reputation i and the corresponding receiver?s reputation j. Since both the cardinalities of the action set and the reputation set are L + 1, there are (L + 97 Social Norm Noisy Gossip Channel Figure 5.2: Reputation updating policy. 1)(L+1)?(L+1) possible action rules. The optimal action rule, a?, should be the one that maximizes the payofi function as discussed later. 5.2 Optimal Action Rule 5.2.1 Reputation Updating Policy A key concept in indirect reciprocity game is reputation [94]. There is a similar notion of trust [115], however, which is mostly based on direct reciprocity. Players monitor the social interactions within their group and help others establish the reputation of being a helpful player. Therefore, one important step in indirect reciprocity game modelling is how to update reputation based on players? actions. In this subsection, we develop a reputation updating policy based on the action of the transmitter, the reputation of the transmitter and the reputation of the receiver. To capture not only the mean behavior of the transmitter?s reputation but also all likelihoods of the transmitter?s reputation that may be, we assign a reputation distribution for each player. Let d = [d0; d1; :::; dL]T be a reputation distribution for a speciflc player. Then di stands for the likelihood of the player being assigned with 98 reputation i. The proposed reputation updating policy is shown in Fig. 5.2. Suppose, at time index n, a transmitter with a reputation distribution dni is matched with a receiver with a reputation distribution dnj . By taking a certain action, the transmit- ter is assigned with an immediate reputation d^ni based on the social norm. Then, the receiver and the observers will update the transmitter?s reputation distribution using a linear combination of the transmitter?s original and immediate reputations, where the weight ? can be treated as a discounting factor of the past reputation. Finally, the transmitter?s reputation is propagated among the population by the receiver and observers through a noisy gossip channel. In a simple example, we assume that the transmitter?s reputation distribution is dni = ei and the receiver?s reputation distribution is dnj = ej, where ei and ej are the standard basis vectors. Let ai;j be the action the transmitter takes towards the receiver. Then, the immediate reputation of the transmitter is eQai;j ;j . According to the reputation updating policy in Fig. 5.2, after the transmission, the transmitter?s reputation distribution becomes dn+1i = PN ? ?ei + (1? ?)eQai;j ;j ? ; (5.4) where PN is the transition matrix of the noisy channel. Without loss of generality1, 1Note that the analysis in this chapter are also applicable to the PN with other forms. 99 we deflne PN as follows PN = 0 BBBBBBBBBBBBBB@ 1? ? ?=L : : : ?=L ?=L ?=L 1? ? : : : ... ?=L ... ?=L . . . ?=L ... ?=L ... : : : 1? ? ?=L ?=L ?=L : : : ?=L 1? ? 1 CCCCCCCCCCCCCCA ; (5.5) with ? 2 [0; 0:5] being a constant. The dn+1i in (5.4) is the updated reputation distribution of the transmitter after the transmitter with an original reputation ei takes an action ai;j towards the receiver with a reputation ej. Since this updated reputation distribution will be used later in the analysis for flnding the optimal action rule, we use a speciflc symbol ~di!j to denote it, i.e., ~di!j = PN ? ?ei + (1? ?)eQai;j ;j ? : (5.6) For the general case that dni 6= ei and/or dnj 6= ej, the transmitter?s updated reputation distribution cannot be simply expressed using (5.4) since, given an action rule, difierent combinations of the transmitter?s and receiver?s reputations may lead to the same immediate reputation. In such a case, we need to flrst flnd the immediate reputation using d^ni (k) = X p X q: Qap;q;q=k dni (p)dnj (q): (5.7) Then, according to Fig. 5.2, the transmitter?s updated reputation distribution can be computed by dn+1i = PN ? ?dni + (1? ?)d^ n i ? : (5.8) 100 5.2.2 Stationary Reputation Distribution Let x = [x0; x1; :::; xL]T stand for the reputation distribution of the entire population, where xi is the portion of the population that have the reputation i. Since every pair of transmitter and receiver is chosen from the population, given the transmitter with reputation i, the probability of matching with the receiver with reputation k is xk. After the transmission, the reputation of the transmitter is updated using the policy shown in Fig.5.2. Therefore, the evolution of x can be described by the following difierential equation dx dt = x new ? x; (5.9) where xnew is the new reputation distribution of the entire population and can be computed by xnew = PN (?I+ (1? ?)PT )x; (5.10) with the ith row and jth column element of the matrix PT being deflned as PT (j; i) = X k: Qa?i;k;k=j xk: (5.11) According to (5.9), (5.10), and (5.11), the stationary reputation distribution x? is the solution to the following equation PN (?I+ (1? ?)PT )x? = x?: (5.12) From (5.11) and (5.12), we can see that, given the optimal action a?, the stationary reputation distribution can be found by solving the nonlinear equations in (5.12). In Algorithm 3, we propose a gradient descent algorithm for flnding the stationary reputation distribution given the optimal action rule. 101 Algorithm 3 : Finding Stationary Reputation Distribution Using Gradi- ent Descent 1. Given the optimal action a?i;j ; 8i; 8j, the tolerance ?0 = 0:01, the index t = 0, and step size fi = 0:1, initialize x = [x0; x1; :::; xL]T with x0 = [x00; x01; :::; x0L]T , set ? = 1, and let F(x) = PN (?I+ (1? ?)PT (x))x? x: 2. while ? > ?0 ? Compute the updating vector ?xt+1 using ?xt+1 = ?fi?rF(xt)? F(xt). ? Update xt+1 by xt+1 = xt +?xt+1. ? Normalize xt+1 using xt+1 = xt+1jjxt+1jj2 . ? Update the parameter ? by ? = jjxt+1 ? xtjj2. ? Update the index t = t+ 1. End 3. The stationary reputation distribution is x? = xt. 5.2.3 Payofi Function Suppose that the cost of forwarding a packet is a constant, c, the total cost of the transmitter with reputation i taking action ai;j towards a receiver with reputa- tion j is given by C(ai;j) = ai;jc: (5.13) Similarly, if the gain of receiving a packet is a constant, g, the total gain of the receiver with reputation i can be computed by G(aj;i) = aj;ig; (5.14) where aj;i is the action of the corresponding transmitter with reputation j. 102 Let Wi;j denote the maximum payofi that a player, currently having reputation i and being matched with a player with reputation j, can gain from this interaction to future. Obviously, if the player with reputation i serves as a transmitter, then the long-term expected payofi that he/she can obtain by taking action ai;j would be f1(ai;j) = ?ai;jc+ ? X k X l ~di!j(k)x?(l)Wk;l; (5.15) where the flrst term ai;jc is the immediate cost the transmitter incurred by taking action ai;j, and the second term P k P l ~di!j(k)x?(l)Wk;l stands for the beneflt he gains in the future with a discounting factor ?. According to (5.6), by taking action ai;j, the reputation distribution of the transmitter will change from ei to ~di!j. Since his opponent in the next round is randomly sampled from the population with a sta- tionary reputation distribution x?, with probability ~di!j(k)x?(l), the transmitter?s reputation becomes k and his opponent?s reputation is l. On the other hand, if the player with reputation i serves as a receiver, the long-term expected payofi that he/she can obtain is f2 = a?j;ig + ? X l x?lWi;l; (5.16) where the flrst term a?j;ig is the immediate gain he/she can obtain when the trans- mitter takes the optimal action a?j;i, and the second term P l x?lWi;l stands for the beneflt he gains in the future with a discounting factor ?. As a receiver, the reputa- tion will not change after the transmission. Since his opponent in the next round is randomly sampled from the population with a stationary reputation distribution x?, with probability x?(l), the receiver?s reputation is i and his opponent?s reputation is l. 103 With each interaction, the play acts either as a transmitter or as a receiver with equal probability 12 . Therefore, the Bellman equation of Wi;j can be written as Wi;j = maxai;j " 1 2 ? ?ai;jc+ ? X k X l ~di!j(k)x?(l)Wk;l ! + 12 ? a?j;ig + ? X l x?(l)Wi;l !# ; (5.17) and the optimal action a?i;j can be computed by a?i;j=argmaxai;j Wi;j =argmax ai;j " 1 2 ? ?ai;jc+ ? X k X l ~di!j(k)x?(l)Wk;l !# : (5.18) From (5.17) and (5.18), we can see that the problem of flnding the optimal action rule is a Markov Decision Process (MDP), where the state is the reputation pair (i, j), the action is ai;j, the transition probability is determined by ~di!j and x?, and the reward is determined by c and g. Therefore, given the stationary reputa- tion distribution, the optimal action can be found by solving (5.18) using dynamic programming. In this chapter, we propose a modifled value iteration algorithm to flnd the optimal action given stationary reputation distribution, which is shown in Algorithm 4. 5.2.4 Optimal Action Using An Alternative Algorithm From the previous two subsections, we can see that given the optimal action, the stationary reputation distribution can be found using Algorithm 3, and given 104 the stationary reputation distribution, the optimal action can be found using Algo- rithm 4. Therefore, we can obtain the optimal action and the stationary reputation distribution alternatively by iteratively flxing one and solving the other. The de- tailed processes are summarized in Algorithm 5. Note that the convergence speed of Algorithm 5 is highly determined by the initial action rule a0. Nevertheless, it will converge since the number of the possible action rules is flnite. Moreover, Algorithm 5 can also be used to test the evolutionary stability of any action rule. The idea is to set the tested action rule as the initial action rule and see whether it can converge in one iteration. The details will be discussed in Section 5.4. 5.3 Action Spreading Due To Natural Selection Based on Algorithm 5, we can flnd the optimal action rule and the stationary reputation distribution. However, during the above analysis, we do not include the perturbation efiect, where players may take non-optimal action rule due to uncertainty of the system and/or the incorrect (noisy) parameters. Taking the perturbation efiect into account, we need to evaluate the stability of the optimal action rule. Here, we adopt the concept of evolutionarily stable strategy (ESS) [110], which is \a strategy such that, if all members of the population adopt it, then no mutant strategy could invade the population under the in uence of natural selection". In the following subsections, we flrst discuss, by natural selection, how the action rules spread over the population. Speciflcally, we discuss two action spreading algorithms: one is action spreading algorithm using Wright-Fisher model 105 [54] and the other is action spreading algorithm using replicator dynamic equation [110]. Then, we examine, in Section 5.4, the stability of the optimal action rule derived by Algorithm 5 by the simulations. Let M be the number of action rules, a1; a2; :::; aM , used in the population. Let pti be the percentage of the population that uses action rule ai at time t. Then, we have PMi=1 pti = 1. Let U ti be the average payofi using action rule ai at time t. 5.3.1 Action Spreading Algorithm Using Wright Fisher Model The Wright-Fisher model is by far the most popular stochastic model for re- production in population genetics [54]. It is based on the assumption that the prob- ability of an individual adopting a certain strategy is proportional to the expected payofi of the population using that strategy. Due to its simplicity and capability of capturing the essence of the biology involved, we use the Wright-Fisher model here to characterize how action rules spread over the population. Let yi be the probability of an individual using action ai. Then, we have PM i=1 yi = 1. With the Wright-Fisher Model, we assume that yi is proportional to the total payofi of the users using ai. Therefore, yi can be computed by yi = p t iU tiPM j=1 ptjU tj ; (5.19) where the numerator ptiU ti is the total payofi of the users using action ai, and the denominator PMj=1 ptjU tj is the total payofi of the whole population, which is the normalization term that ensures PMi=1 yi = 1. Based on the assumption that the population size is su?ciently large, the per- 106 centage of the population using action ai is equal to the probability of an individual using ai. Therefore, the action spreading equation can be written as pt+1i = ptiU tiPM j=1 ptjU tj : (5.20) 5.3.2 Action Spreading Algorithm Using Replicator Dynamic Equa- tion Replicator dynamic equation is widely used to characterize the population evolution in evolutionary game theory [110]. It is based on the following intuition: if a certain strategy results in a higher payofi than the average level, the population share using that strategy will grow with the growth rate proportional to the dif- ference between the expected payofi of the population using that strategy and the expected payofi of the entire population. In this subsection, we use the replicator dynamic equation to model the evolution of the percentage of the population using a certain action rule, which means that the evolution of pi is given by the following equation dpi dt = ? ? Ui ? MX j=1 pjUj ! pi; (5.21) where ? is a scale factor controlling the speed of the evolution. By discretizing the replicator dynamic equation in (5.21), we have the action spreading equation pt+1i = pti + ? ? U ti ? MX j=1 ptjU tj ! pti = pti " 1 + ? ? U ti ? MX j=1 ptjU tj !# : (5.22) 107 0 200 400 600 800 1000 1200 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 Time index Percentag e Evoluation of the population with reputation L RDE WFM (a) 0 200 400 600 800 1000 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 Time index Percentag e Evoluation of the population using optimal action RDE WFM (b) Figure 5.3: The population evolution when L = 1, g = 1 and c = 0:1: (a) the percentage of the population with reputation L = 1; (b) the percentage of the population using optimal action shown in (5.24). 5.4 Evolutionarily Stable Strategy and Simulations To verify the proposed algorithm, we simulate the packet forwarding game. We study a flxed-size population, N = 1000. Each new player receives an initial reputation, which is randomly chosen from f0; 1; :::; Lg with equal probability 1L+1 . Each player uses one of (L + 1)(L+1)?(L+1) possible action rules. All players in the population share the flxed social norm deflned in (5.2). Before any one elementary step of action updating, each individual has exactly 20 interactions with other ran- domly chosen individuals. Individuals act as transmitter and receiver on average 10 times each. After each interaction, the reputation of the transmitter is updated according to the reputation updating policy shown in Fig. 5.2. We assume that every player in the population agrees on the reputation generated by the reputation updating policy. No private lists of reputation are considered. After all 20 interac- 108 tions have taken place, each participant including both the transmitter and receiver goes back to the population with probability ? or leaves the population with proba- bility 1? ?. For every player who leaves, a new individual enters the population to keep the total population size constant. The initial reputation of the new coming is randomly chosen from f0; 1; :::; Lg with equal probability 1L+1 . Then, the players in the population, including the old players who stay in the population and the new players who enter the population, choose their new action rules according to previ- ous payofi history of the whole population. There are two possible action spreading algorithms as shown in the previous section. One is the action spreading algorithm using Wright Fisher Model, which is denoted as \WFM", and the other one is the action spreading algorithm using Replicator Dynamic Equation, which is denoted as \RDE". After updating the action rule, the payofis of all players are reset to zero. Therefore, older players do not accumulate their payofis. In all the following simulations, the parameters ?, ?, and ? are set to be 0.5, 0.9, and 0.95 respectively. The parameter ? that controls the speed of the evolution in RDE is set to be 0.1. 5.4.1 Binary Reputation Scenario To give more insights into the proposed algorithm, we flrst evaluate the binary reputation scenario where L = 1. We assume that the gain per unit is 1 and the cost per unit is 0.1, i.e. g = 1 and c = 0:1. According to Algorithm 5, with difierent initial conditions, we can flnd three pairs of stationary reputation distribution x? 109 g c c=0.582g Stable Region Figure 5.4: The stable region for the optimal action rule shown in (5.24) when L = 1. and the optimal action rule a?, which are x?1 = 0 BB@ 0:5 0:5 1 CCA ; a?1 = 0 BB@ 0 0 0 0 1 CCA : (5.23) x?2 = 0 BB@ 0:0909 0:9091 1 CCA ; a?2 = 0 BB@ 0 1 0 1 1 CCA : (5.24) x?3 = 0 BB@ 0:9091 0:0909 1 CCA ; a?3 = 0 BB@ 1 0 1 0 1 CCA : (5.25) With (x?1,a?1), the transmitter will not forward any packet to the receiver re- gardless his/her own reputation and the corresponding receiver?s reputation. Obvi- 110 0 200 400 600 800 1000 1200 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 Time index Percentag e Evoluation of the population with reputation L RDE WFM (a) 0 200 400 600 800 1000 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Time index Percentag e Evoluation of the population using optimal action RDE WFM (b) Figure 5.5: The population evolution when L = 1, g = 1 and c = 0:6: (a) the percentage of the population with reputation L = 1; (b) the percentage of the population using optimal action shown in (5.24). ously, it is a bad strategy since, with such a strategy, there is no cooperation and the payofi of every player is zero. The pairs (x?2,a?2) and (x?3,a?3) are symmetric where with the former pair, the transmitter will always forward packets to the receiver who has good reputation, and with the latter pair, the transmitter will always forward packets to the receiver who has bad reputation. We can also flnd that the pair (x?2,a?2) leads to a population with more than 90 percentage of the players are good reputation while (x?3,a?3) leads to a population with more than 90 percentage of the players are bad reputation. Here, we prefer (x?2,a?2) since it leads to a \good" society with more than 90 percentage of the population are good reputation. Then, we evaluate the evolutionary stability of (x?2,a?2). In the simulation, the initial frequency of the optimal action rule a? shown in (5.24) is set to be 0.6. The initial frequencies of the other action rules are randomly chosen. The initial repu- tation of new players is randomly chosen from f0; 1g with equal probability 12 . In 111 Fig. 5.3 (a), we show the evolutionary results of the percentage of the population with reputation level L = 1. From Fig. 5.3 (a), we can see that for both WFM and RDE, the reputation distribution converges to the stationary reputation dis- tribution x?2. Compared with WFM, the convergence speed of RDE is a bit slower since a small speed controlling parameter ? = 0:1 is used in RDE. The evolutionary results of the percentage of the population using the action rule a?2 are shown in Fig. 5.3 (b). From Fig. 5.3 (b), we can see that for both WFM and RDE, the action rule a?2 will spread over the whole population. And once the whole population adopt a?2, no one will deviate. Therefore, the action rule a?2 is an evolutionarily stable strategy (ESS) [110] in this case. From (5.17), we can see that the optimal action rule is determined by the values of g and c. Intuitively, if g c, every player is willing to cooperate with other players since in such a scenario, the potential cooperation gain will be greater than the immediate cooperation cost. On the other hand, if c g, every player tends not to cooperate with other players since the potential cooperation gain will be smaller than the immediate cooperation cost in such a scenario. Based on the intuition, there should exist a critical cost-to-gain ratio such that the optimal action rule a?2 is stable if c < g and is not stable otherwise. By setting a?2 as the initial action rule a0 in Algorithm 5 and varying g and c, we flnd that if cg ? 0:582, the optimal action rule found by Algorithm 5 is a?2. On the other hand, if cg > 0:582, the optimal action rule changes to be a?1. Therefore, the critical cost-to-gain ratio is equal to 0.582 in this case, which means that the stable region for a?2 is the shadow region shown in Fig. 5.4. 112 Stable region g c c=0.622g Stable Region Figure 5.6: The stable region for the optimal action rule shown in (5.26) when L = 4. We verify the above statement by evaluating the stability of a?2 when g = 1 and c = 0:6. The corresponding evolutionary results are shown in Fig. 5.5. From Fig. 5.5 (b), we can see that when cg = 0:6 > 0:582, the percentage of the population using action rule a?2 does not converge to 1 for both WFM and RDE. Therefore, a?2 is not stable in this case. Correspondingly, we can also see from Fig. 5.5 (a) that the reputation distribution does not converge to x?2 in this case. 5.4.2 Multi-Level Reputation Scenario For the multi-level reputation scenario where L ? 2, due to the large dimension of the action space ((L+1)(L+1)?(L+1)), it is di?cult to flnd all the possible pairs of 113 0 200 400 600 800 1000 1200 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Time index Percentag e Evoluation of the population with reputation L RDE WFM (a) 0 200 400 600 800 1000 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 Time index Percentag e Evoluation of the population using optimal action RDE WFM (b) Figure 5.7: The population evolution when L = 4, g = 1 and c = 0:5: (a) the percentage of the population with reputation L = 4; (b) the percentage of the population using optimal action shown in (5.26). stationary reputation distribution x? and optimal action rule a?. However, based on the results in the binary reputation scenario, we can infer that one possible optimal action rule a?0 is to forward i packets to the receiver with reputation i, i.e. a?0 can be written as a?0 = 0 BBBBBBBBBB@ 0 1 : : : L 0 1 : : : L ... ... ... ... 0 1 : : : L 1 CCCCCCCCCCA : (5.26) According to Algorithm 3, we can flnd the corresponding stationary reputation distribution x?0. For the special case with L = 4, x?0 is x?0 = 0:0235 0:0235 0:0235 0:0235 0:906 ?T : (5.27) Then, similar to the binary reputation scenario, we obtain the stable region for the optimal action rule a?0. By setting a?0 as the initial action rule a0 in Algorithm 114 0 200 400 600 800 1000 1200 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 Time index Percentag e Evoluation of the population with reputation L RDE WFM (a) 0 200 400 600 800 1000 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 Time index Percentag e Evoluation of the population using optimal action RDE WFM (b) Figure 5.8: The population evolution when L = 4, g = 1 and c = 0:7: (a) the percentage of the population with reputation L = 4; (b) the percentage of the population using optimal action shown in (5.26). 5 and varying g and c, we flnd that if cg ? 0:622, the optimal action rule found by Algorithm 5 is still a?0. On the other hand, if cg > 0:622, the optimal action rule changes. Therefore, the critical cost-to-gain ratio in this case is equal to 0.622, which means that the stable region for a?0 is the shadow region shown in Fig. 5.6. We then verify the above statement by simulating the packet forwarding game with two difierent cost-to-gain ratio settings. One is g = 1 and c = 0:5, i.e. cg = 0:5 < 0:622, and the other is g = 1 and c = 0:7, i.e. cg = 0:7 > 0:622. The evolutionary results for the former setting are shown in Fig. 5.7. From Fig. 5.7, we can see that when the cost-to-gain ratio is set to be cg = 0:5 < 0:622, the reputation distribution converges to x?0 and the optimal action rule a?0 spreads over the whole population for both WFM and RDE, which verifles that a?0 is an ESS in this case. The evolutionary results for the latter cost-to-gain ratio setting are difierent and shown in Fig. 5.8. From Fig. 5.8, we can see that when the cost-to-gain ratio 115 is set to be cg = 0:7 > 0:622, for both WFM and RDE, the action rule a?0 does not spread over the whole population and the reputation distribution does not converge to x?0. Therefore, a?0 is not stable in this case. 5.5 Summary In this chapter, we propose a cooperation stimulation scheme for cognitive networks using indirect reciprocity game modelling. Difierent from the existing game theoretic approaches, our proposed scheme does not rely on the assumption that the number of interactions between a pair of players are inflnite. From the simulation results, we can see that with a proper cost-to-gain ratio, the action rule of forwarding i packets to the receiver with reputation level i is an ESS. Even starting with only 60 percentage of population adopting the optimal action rule, the optimal action rule will quickly spread over the whole population by natural selection. And once the whole population use the optimal action rule, no one will deviate. Moreover, such an ESS will lead to a \good" society where more than 90 percentage of the population have good reputation. 116 Algorithm 4 : Modifled Value Iteration For Optimal Action Selection Given Stationary Reputation Distribution 1. Given the stationary reputation x?, tolerance ?0 = 0:01, initialize a?i;j with a0i;j 8i 8j, set ?1 = 1 and ?2 = 1. 2. while ?1 > ?0 ? Set ?2 = 1. ? Initialize Wi;j = 0 8i 8j. ? while ?2 > ?0 { Compute ~di!j using ~di!j = PN ? ?ei + (1? ?)eQai;j;j ? . { Compute W^i;j using W^i;j = maxai;j " 1 2 ? ?ai;jc+ ? X k X l ~di!j(k)x?(l)Wk;l ! + 12 ? a?j;ig + ? X l x?(l)Wi;l !# : { Compute a^i;j using a^i;j = argmaxai;j " 1 2 ? ?ai;jc+ ? X k X l ~di!j(k)x?(l)Wk;l !# : { Update the parameter ?2 by ?2 = jjW^?Wjj2. { Update W by W = W^. { End ? Update the parameter ?1 by ?1 = jja^? a?jj2. ? Update a? by a? = a^. End 3. The optimal action is a?. 117 Algorithm 5 : An Alternative Algorithm For Finding Stationary Repu- tation Distribution And Optimal Action 1. Given the tolerance ?0 = 0:01, initialize a? with a0 and set ? = 1. 2. while ? > ?0 ? Given the optimal action a?, flnding the stationary reputation distribution x? using Al- gorithm 3. ? Given the stationary reputation distribution x?, flnding the optimal action a^? using Al- gorithm 4. ? Update the parameter ? by ? = jja^? ? a?jj2. ? Update a? by a? = a^?. End 3. The stationary reputation distribution is x? and the optimal action is a?. 118 Chapter 6 Image Denoising Games During the processes of being captured, digitized, recorded, and transmitted, an image is usually distorted and noisy. Such a noisy image is visually annoying and often not suited to further perform tasks such as segmentation, recognition and compression. Therefore, image denoising is a very important issue to reconstruct a good estimate of the original image from the noisy observations. Many approaches have been proposed in the literature to reconstruct the orig- inal image by exploiting the inherently spatial correlation. By assuming that the image locally satisfles a stationary Gaussian process, Woods and Radewan [126] propose to estimate the original image from the noisy image using Kalman fllter while Jin et al [70] propose to use adaptive Wiener fllter. In both approaches, the flrst-order and second-order statistics used in the fllters are calculated based on the noisy samples within a local window. In [111] [51] [117], the authors propose to use bilateral flltering over the local neighborhood samples, where the weights of the bilateral fllters are computed based on the intensity and radiometric distances between the center sample and the neighboring samples. Another class of locally adaptive image denoising approaches are derived by considering image processing as a variational problem where the restored image is computed by minimizing a care- fully designed energy function [26] [55] [56]. Typically, such energy functions consist 119 of a fldelity term that is determined by the difierence between the reconstructed image and the noisy image, and a regularization penalty term that is determined by the image prior. To further exploit the spatial correlation, Buades et al [20] proposed to average, in a weighted manner, all the pixels in a nonlocal window instead of only involving the locally neighboring pixels, where the weights are determined by the difierences between the region centered by the target pixel and the regions centered by the candidate pixels. Since the weights are not determined by the radiometric (physical) distance, similar pixels that are far away from the target pixel can still be awarded large weights. In such a way, the denoising performance is greatly improved. Several extensions of the nonlocal approach are also proposed [21] [73] [74]. Besides the pixel-domain approaches, transform-domain approaches are also investigated [50] [99] [58] [66]. The transform-domain approaches are mainly based on the assumption that the original signal can be well approximated by a linear combination of few basis, i.e., the original signal is sparse in the transform-domain. In such a case, the original signal can be well estimated by preserving the few high-magnitude transform coe?cients that convey mostly the energy of the original signal and discarding the rest which are mainly introduced by the noise. Therefore, one important issue in the transform-domain approaches is how to threshold the transform coe?cients. Many threshold rules have been proposed from difierent speculations [50] [11] [52]. A combination of the nonlocal and transform-domain thresholding ideas is proposed in [45]. The basic idea is to flrst group similar 2D image blocks into 3D data arrays, then perform 3D wavelet transform, and flnally 120 shrinkage the transform spectrum. Most of the existing schemes focus on how to choose good weights for some given neighborhoods to achieve better reconstructions. However, how to adaptively choose optimal neighborhoods can be even more important since too large a neigh- borhood set may cause overly-smooth artifacts, while too small a neighborhood set may not be able to e?ciently reduce noise variance. Due to the absence of the original image, the Stein?s principle [113] is used to estimate the true MSE for determining the optimal neighborhoods. Nevertheless, we flnd that there exists a trade-ofi between the accuracy of the estimate and the minimum of the true MSE. In this chapter, we study the impact of this trade-ofi and formulate the image de- noising problem as a coalition formation game. In this game, every pixel is treated as a player, who tries to seek partners to form a coalition to improve the accu- racy of the Stein?s estimate while incurring a cost of increasing the minimum of the true MSE. Since flnding the optimal coalition structures is NP-hard, we propose a heuristically distributed algorithm in solving the coalition formation game. We also show that the traditional approaches that use a heuristically determined candidate set are special cases of the proposed game theoretical framework by choosing the utility function without a cost term. Finally, experimental results show that the proposed game theoretical approach can achieve better performance than the non- local method in terms of both PSNR and visual quality. Note that the proposed game is also applicable in other scenarios besides the nonlocal method as long as 1) there exist some locally adaptive parameters to be estimated, and 2) the estimation accuracy will be improved when more samples are involved in the estimate process. 121 The rest of this chapter is organized as follows. In Section 6.1, we give a brief description of the system model and the coalition formation game. Then, we discuss how to choose a good candidate set in Section 6.2. In Section 6.3, we study the trade-ofi between the accuracy of the estimate and the minimum of the true MSE and provide a detailed analysis of the proposed game-theoretic image denoising framework. In Section 6.4, we show the relationship between the proposed game-theoretic framework and the traditional candidate set selection approaches. Finally, we illustrate the experimental results on real images in Section 6.5 and draw conclusions in Section 6.6. 6.1 The System Model and Coalition Formation Game 6.1.1 The System Model In this chapter, we consider the problem of restoring images degraded by additive white Gaussian noise. The degraded process can be modelled as In(k) = I(k) + n(k); (6.1) where I is the original image, In is the noisy observation of the image, and n is the additive Gaussian noise with zero mean and 2 noise variance. The k = (k1; k2) is the coordinate of a pixel. The problem is to flnd an estimate I^ of the original image based on the noisy observation In. It is well known that the image denoising problem is ill-posed. To reconstruct the original image from the noisy observation, we need to use some prior information 122 such as the correlations among spatial neighboring pixels. In this chapter, we focus on the spatially adaptive linear flltering approach. For the pixel located at k, we flnd the estimate I^(k) using the weighted average of the spatially neighboring pixels, i.e., I^(k) = P l2S(k) wk;lIn(l)P l2S(k) wk;l ; (6.2) where S(k) is the candidate set that contains the spatially neighboring pixels for k, and wk;l is the weight for pixel In(l). 6.1.2 The Coalition Formation Game Game theory is a mathematical tool that analyzes the strategic interactions among multiple decision makers. A game is mainly composed by three components: ? a flnite set of players, denoted by u1; u2; :::; uN ; ? a set of actions, denoted by Ai, for each player ui; ? payofi/utility function, denoted by Ui, which measures the outcome for player ui determined by the actions of all players. A coalition formation game is a game where the players seek to form coopera- tive groups, i.e., coalitions, to strengthen their positions in the game. The players? actions in the coalition formation game are whom to cooperate with, i.e., which players to form coalitions with. The payofi/utility function in the coalition for- mation game is deflned over coalitions, called coalition value. The coalition value, which quantifles the worth of a coalition, is mainly determined by two terms: the 123 gain and the cost. By forming a coalition, every player in the coalition can obtain a gain through cooperation within the coalition. However, the gain is limited by a cooperation cost for forming the coalition, e.g. the negotiation cost or information exchange cost. Given the player set and the coalition value, the coalition formation game is uniquely deflned, and the outcome of the game is a set of coalitions, which is the optimal partitions of the player set. To obtain the optimal partitions, there are two possible approaches: centralized approach and distributed approach. For the centralized approach, the centralized controller needs to search over all the partitions of the player set to flnd the optimal partitions, which is NP-complete and impractical when the size of the player set is large [104]. For the distributed approach, the players will make their own decisions as to whether or not they join a coalition. One typical approach is to use the merge and split rules proposed in [15]. This approach starts with an initial partition and repeats alternatively the merge and split rule, 1) merge rule: merge any set of coalitions into a single coalition if the new coalition can provide larger total coalition values; 2) split rule: split a coalition into smaller coalitions if the resulting smaller coalitions can provide larger total coalition values. In Section 6.3, we will discuss in details how to use the coalition formation game to formulate the image denoising problem, where each pixel will be treated as a player seeking to form coalitions to achieve optimal denosing performance. 124 6.2 Candidate Set Selection From (6.2), we can see that the reconstruction performance are determined by the selection of the weights wk;l and the candidate set S(k). For any given S(k), the optimal weights w?k;l(S(k)) are determined by the correlation between pixels I(k) and I(l), and should be chosen to minimize the difierence between the estimation I^(k) and the original pixel I(k) as below. w?k;l(S(k)) = arg minwk;l ?P l2S(k) wk;lIn(l)P l2S(k) wk;l ? I(k) !2 (6.3) Note that when the optimal weights in (6.3) are used, the selection of the candidate set S(k) is trivial since the accuracy of the reconstruction improves as the candidate set S(k) becomes larger. The proof can be found in Theorem 1. Theorem 1: When the optimal weights in (6.3) are used, the accuracy of the reconstruction improves as the candidate set S(k) becomes larger. Proof : Let S1(k) and S2(k) be two candidate sets with S1(k) ? S2(k). Let w?k;l(S1(k)) and w?k;l(S2(k)) be the corresponding optimal weights computed by (6.3). Suppose ~wk;l(S2(k)) are the weights for S2(k) and are deflned as follows ~wk;l(S2(k)) = 8 >>< >>: w?k;l(S1(k)); l 2 S1(k); 0; else. (6.4) Then, according to the optimality of w?k;l(S2(k)), we have ?P l2S2(k) wk;l(S2(k))In(l)P l2S2(k) wk;l(S2(k)) ? I(k) !2 ? ?P l2S2(k) ~wk;l(S2(k))In(l)P l2S2(k) ~wk;l(S2(k)) ? I(k) !2 = ?P l2S1(k) wk;l(S1(k))In(l)P l2S1(k) wk;l(S1(k)) ? I(k) !2 : (6.5) 125 Therefore, when optimal weights are used, the reconstruction using candidate set S2(k) is more accurate than that using candidate set S1(k). However, due to the absence of the original pixel I(k), it is impossible for us to flnd the optimal weights using (6.3). One possible approximation is to use the similarity between the neighborhoods around k and l [20], which is deflned as follows wk;l = exp ( ? P b2B [In(k+ b)? In(l+ b)]2 h2 ) ; (6.6) where B is a predeflned neighborhood and h is the parameter related to the noise?s variance. Nevertheless, since the weights in (6.6) are not optimal, the selection of the candidate set S(k) for the reconstruction becomes critically important. On one hand, if the size of the candidate set is too small, then the noise may not be efiectively removed. On the other hand, if the size of the candidate set is too large, then the reconstruction may be overly-smooth. Moreover, according to (6.6), we can see that the pixels that are more similar to the target pixel would have larger weights. To prevent the reconstruction from being overly-smooth, we will only involve the pixels that have relatively large weights. Let ?(m) stand for the subset of S(k) which contains the pixels with the flrst m largest weights. Then, the reconstruction I^(k;m) using ?(m) can be written as I^(k;m) = P l2?(m) wk;lIn(l)P l2?(m) wk;l : (6.7) Obviously, the parameter m in (6.7) should be chosen in such a way that the difierence between I^(k;m) and I(k) is minimized, i.e., the optimal m? can be found 126 (a) (b) (c) (d) (e) (f) (g) (h) Figure 6.1: An example of optimal candidate set with an edge region: (a) original image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) the ideally optimal candidate set of (b) when the original signal is available; (g) the ideally optimal candidate set of (c) when the original signal is available; (h) the ideally optimal candidate set of (d) when the original signal is available. by m? = arg min m jI^(k;m)? I(k)j2: (6.8) In general, m? is content dependent, i.e., m? may be difierent for difierent k and/or difierent noise variances. Even with the same m?, the structure of the candidate set ?(m?) may be difierent for difierent pixels. In Figs. 6.1 and 6.2, we show the structure of the optimal candidate set for two difierent scenarios: 1) the target pixel is centered within an edge region, and 2) the target pixel is centered within a smooth region. For illustration purpose, we assume that the original image 127 is available for flnding m? in these two examples. Later in Section 6.3, we will discuss how to flnd m? using game theory under the scenario that the original image is not available. As shown in Figs. 6.1 and 6.2, (a) is the original image centered by the target pixel, which is denoted by red \x", (b)-(d) are the noisy images with being 15, 25, and 35 respectively. (e) is the candidate set using in [20], which is a square window. (f)-(h) are the optimal candidate sets generated using (6.8) for (b)-(d) respectively. Note that the black pixels in (f)-(h) stand for the pixels in the candidate set. From Figs. 6.1 and 6.2, we can see that for the scenario where the target pixel is centered within an edge region, the candidate set has an edge structure, while for the scenario where the target pixel is centered within a smooth region, the structure of the candidate set is unpredictable. Moreover, we can also see that with difierent noise variance, the candidate sets are quite difierent. Therefore, the candidate set should not be pre-deflned in a flxed way such as using a square window in [20]. Instead, the candidate set should be chosen adaptively to minimize the difierence between the estimate and the original signal. 6.3 Game Theoretical Problem Formulation 6.3.1 Stein?s unbiased risk estimate (SURE) Since I(k) is unknown, the optimal m? can not be explicitly computed using (6.5). Fortunately, we can flrst use the Stein?s unbiased risk estimate (SURE) [113] to estimate the true mean squared error (MSE) from the noisy observation and then use the estimated MSE to flnd the optimal m?. 128 (a) (b) (c) (d) (e) (f) (g) (h) Figure 6.2: An example of optimal candidate set with a smooth region: (a) original image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) the ideally optimal candidate set of (b) when the original signal is available; (g) the ideally optimal candidate set of (c) when the original signal is available; (h) the ideally optimal candidate set of (d) when the original signal is available. In Fig. 6.3, we show the optimal m? obtained using (6.8) for lena image when the standard deviation of the noise is = 10, where the intensity stands for the optimal m? value. From Fig. 6.3, we can see that there are many pixels have the similar m? value, which can be grouped together for flnding m?. For example, the pixels in the red circles have the m? value near 60 can be grouped together. Suppose that the whole image is partitioned into M subsets ' = f'1;'2; :::;'Mg, where each subset 'i contains a set of pixels that may not be physical neighboring but have the 129 8 441 6 28 310 296 26 441 5 138 49 16 6 389 225 441 58 177 88 28 6 441 240 23 37 62 12 191 312 3 7 1 71 93 315 72 441 10 58 441 441 5 252 35 441 8 144 441 85 254 10 36 12 185 14 51 306 2 2 20 10 59 60 155 Figure 6.3: The optimal m? for lena image when = 10. same optimal parameter m?i , i.e., m?i = arg minm X k2'i jI^(k;m)? I(k)j2: (6.9) With the optimal m?i , the mean square error (MSE) for the subset 'i can be computed by msei = 1j'ij X k2'i jI^(k;m?i )? I(k)j2; (6.10) and such a MSE can be approximated using SURE, according to Theorem 2, as follows SUREi = 1 j'ij X k2'i jI^(k;m?i )?In(k)j2+ 2 ? 2 j'ij X k2'i @I^(k;m?i ) @In(k) ?1 ! ; (6.11) 130 where @I^(k;m?i )@In(k) can be found by @I^(k;m?i ) @In(k) = 1P l2?(m?i ) wk;l ? 0 @ X l2?(m?i ) @wk;l @In(k)I n(l) + 1? I^(k;m?i ) X l2?(m?i ) @wk;l @In(k) 1 A ; (6.12) with @wk;l@In(k) being deflned as @wk;l @In(k) = 8 >>< >>: 2wk;l I n(l)?In(k) h2 ; l? k 62 B; 2wk;l I n(l)+In(2k?l)?2In(k) h2 ; l? k 2 B. (6.13) Theorem 2: The SUREi in (6.11) is an unbiased estimator of the true MSE msei in (6.10), i.e, E[SUREi] = E[msei]: (6.14) Proof : By substituting I(k) with In(k)? n(k), we can re-write E[msei] as follows E[msei] = E " 1 j'ij X k2'i jI^(k;m?i )? I(k)j2 # = E " 1 j'ij X k2'i ? jI^(k;m?i )? In(k)j2 +2n(k)I^(k;m?i )? n(k)2 ?# = 1j'ij X k2'i E h jI^(k;m?i )? In(k)j2 i +2 1j'ij X k2'i E h n(k)I^(k;m?i ) i ? 2: (6.15) According to Stein?s Lemma [113], we have E h n(k)I^(k;m?i ) i = 2E " @I^(k;m?i ) @In(k) # : (6.16) 131 0 10 20 30 40 10 20 30 40 50 60 70 80 90 N conf_ter m (a) 0 10 20 30 40 14 16 18 20 22 N mse_ter m (b) Figure 6.4: The trade-ofi between the confldence term C and the distortion term D: (a) the performance of C with difierent N ; (b) the performance of D with difierent N . Then, by substituting (6.19) back to (6.18), we have E[msei] = E " 1 j'ij X k2'i jI^(k;m?i )? In(k)j2 + 2 ? 2 j'ij X k2'i @I^(k;m?i ) @In(k) ? 1 !# = E[SUREi]: 6.3.2 Confldence and Distortion Trade-ofi 6.3.2.1 Confldence From Theorem 2, we can see that SUREi is an unbiased estimator of msei. However, there can be some mismatch between SUREi and msei for each realization (noise observation), i.e., SUREi is just an approximation of msei. To measure the 132 accuracy of the approximation, let us deflne the confldence term, C, as the average difierence between SUREi and msei over the whole image C = 1j'j MX i=1 j'ij ? jmsei ? SUREij: (6.17) According to [113], the estimator SUREi becomes closer to msei as j'ij in- creases, which means that the confldence term C in (6.17) decreases as j'ij increases. 6.3.2.2 Distortion With the partition ' = f'1;'2; :::;'Mg and the optimal parameters fm?1;m?2; :::;m?Mg, we can compute the mean square error for the whole image, D, as follows D = 1j'j MX i=1 j'ij ? msei: (6.18) According to the analysis in Section 6.3.1, we group the pixels with similar m? values together and assign a common m?i to all pixels in subset 'i. In such a case, as j'ij increases, the probability that the pixels in 'i have difierent true m? values increases, which leads to the increase of msei. Therefore, the distortion term D in (6.18) increases as j'ij increases. 6.3.2.3 Confldence and Distortion Trade-ofi From the above discussion, we can see that as j'ij increases, the confldence term C decreases but the distortion term D increases. Therefore, there exists a trade-ofi between C and D. To verify such a trade-ofi, we conduct a simple ex- periment by setting j'ij = N; 8i. As shown in Figure 6.4, the confldence term C 133 0 5 10 15 20 25 30 ?1 ?0.9 ?0.8 ?0.7 ?0.6 ?0.5 ?0.4 ?0.3 ?0.2 ?0.1 0 |?i| Gain function g1(|?i|,?2=1)=?1/|?i| g2(|?i|,? 2=1)=?exp(?|?i|/4) g3(|?i|,? 2=1)=ln(|?i|/(|?i|+1)) Figure 6.5: Some possible gain functions. decrease as N increases while the distortion term D increases as N increases, which are consistent with our analysis. 6.3.3 Utility Function and Solution to the Game From the previous subsections, we can see that given the partition ' = f'1;'2; :::;'Mg, SURE can be used to approximate the true MSE to flnd the opti- mal m?. However, how to flnd a good partition is not trivial since the number of the partition is not flxed and the size of each partition can vary. Due to the uncertainty of the number of the partition, the traditional segmentation and clustering meth- ods may not work. To study the complex interactions among difierent pixels and 134 (a) (b) (c) (d) Figure 6.6: The four tested images: (a) Lena; (b) Barbara; (c) Boat; (d) Flinstones. the dynamic partition formation process, we propose to use the coalition formation game. In this game theoretical formulation, every pixel is treated as a player, who tries to seek partners to form coalitions to achieve better reconstruction. By forming a coalition, every player in the coalition can obtain a gain of reducing the difierence between the SURE and the true estimate, i.e., the confldence term in (6.17), while incurring a cost of increasing the minimum of the MSE. With this idea in mind, we 135 deflne the utility for a coalition as: U('i) = ?j'ij ? SUREi + g(j'ij; 2); (6.19) where the flrst term of the right hand side is the cost and the second term g(j'ij; 2) is the gain. The function g(j'ij; 2) in (6.19) characterizes the gain of forming a coalition, which is the reduction of the difierence between the SURE and the true estimate due to the increase of the coalition size. Therefore, g(j'ij; 2) should satisfy the following properties 1. g(j'ij; 2) should be an increasing function in terms of j'ij since the gain increases as the coalition size j'ij increases, i.e., @g(j'ij; 2) @j'ij > 0. 2. g(j'ij; 2) should be a concave function in terms of j'ij since a certain increase of the coalition size in the low j'ij region should lead to a more signiflcant gain than that in the high j'ij region, i.e., @ 2g(j'ij; 2) @j'ij2 < 0. 3. g(j'ij; 2) should be a superadditive function since the gain of a large coalition should be no smaller than that of two sub coalitions, i.e., g(j'i + 'jj; 2) ? g(j'ij; 2) + g(j'jj; 2). 4. g(j'ij; 2) should be a decreasing function in terms of 2 since the gain de- creases as noise variance 2 increases, i.e., @g(j'ij; 2)@ 2 < 0. There are many functions that can satisfy the above property. In the following, we list three possible functions g1(j'ij; 2) = ?1 2 ?1 j'ij ? ; (6.20) 136 g2(j'ij; 2) = ?2 2 ? ? exp ?j'ij 4 ?? ; (6.21) g3(j'ij; 2) = ?3 2 ? ln j'ij j'ij+ 1 ?? ; (6.22) where ?1, ?2, and ?3 are flxed parameters. In Fig. 6.5, we plot the three possible gain functions versus j'ij by setting 2 = 1. We can see that all the three functions meet our requirements and are therefore valid gain functions. Moreover, we can see that all three functions behave similarly. Therefore, in this chapter, we only evaluate the flrst gain function, i.e., g(j'ij; 2) in (6.19) is set to be g1(j'ij; 2). Nevertheless, similar results can be obtained with the other two functions (g2(j'ij; 2) and g3(j'ij; 2)) and any other functions with similar properties. With the utility function in (6.19), we can see that as the size of the coalition increases, the members in the coalition can obtain gains from g(j'ij; 2). However, the gains are limited by the a cost of forming the coalition, which is ?j'ij?SUREi. The problem now is to flnd the optimal coalition structures based on the utility func- tion in (6.19). One possible approach is to use the merge and split rules proposed in [15], where the authors prove that their algorithm will converge to a unique solu- tion with arbitrary merge and split iterations. However, the computation complexity is still very large since all possible sub-partitions need to be evaluated during the split process. To make the problem traceable, in this chapter, we propose a heuris- tic algorithm in solving the coalition formation game. As shown in Algorithm 6, the proposed heuristic algorithm starts with a randomly chosen pixel and flnds the coalition by selecting the neighborhoods that can give best average utility. Then, all 137 the pixels in the coalition are denoised with the corresponding optimal m?. Finally, all the pixels in the coalition are excluded from the un-denoised set. The above procedures are repeated until all pixels are denoised. The proposed heuristic algorithm is distributive, and only locally neighboring information is required for flnding the coalition. Moreover, since it is not an iterative algorithm, there is no convergence issue. Compared with the merge and split rules [15], the computation complexity is greatly reduced since the split process is avoided. From the experimental results shown in Section 6.5, we can see that the proposed heuristic algorithm performs quite well. Algorithm 6 A Heuristic Algorithm For Coalition Formation Initialization: let the set of denoised pixel SD = ; and its complement ?SD = ', let N1 = 800, N2 = 21? 21, and i = 0. While ?SD 6= ; ? i = i+ 1 ? randomly choose k 2 ?SD, let '0 = fkg and set j = N1 ? While j > 0 { j = j ? 1 { (l?;m?) = arg min l2 ?SDn'0;1?m?N2 SURE ?'0 [ flg;m? { set '0 = '0 [ fl?g { compute u(j'0j) = g(j'0j)j'0j ? SURE ?'0;m?? End ? let n?i = argmaxn u(n), 'i = '0(1 : n?i ) ? compute m?i = arg min1?m?N2 SURE ('i;m) ? set ?SD = ?SD n 'i and SD = SD [ 'i ? denoise the pixel in 'i using (6.4) with m = m?i End 138 6.4 Relation to the Traditional Approaches In the traditional pixel-domain image denoising approaches, every pixel is de- noised using (6.2) with a heuristically pre-deflned candidate set S(k). For example, a flxed-size square window centered by the target pixel k is chosen as the candidate set in the nonlocal image denoising method [20]. Such kinds of approaches have a performance limitation due to the self-constrained use of a pre-deflned candidate set. As shown in Figs. 6.1 and 6.2, we can see that the candidate set should be adaptively chosen for difierent neighborhoods and/or noise variances. Moreover, we will show in the following analysis that the traditional methods such as the nonlocal method [20] is actually a special case of the proposed game theoretical framework by choosing a utility function without a cost term U('i) = g(j'ij; 2): (6.23) According to the discussion in Section 6.3.3, we know that a valid gain function g(j'ij; 2) should be monotonically increasing, concave, and superadditive in terms of j'ij. In such a case, if the utility function only involves the gain function as in (6.23), then all pixels will form a grand coalition and use the same candidate set. In such a case, it return to the traditional ad-hoc approaches where a flxed candidate set is used for all pixels. In this sense, we can say that the traditional ad-hoc approaches are special cases of the proposed game theoretical framework. 139 6.5 Experimental Results We evaluate the proposed game theoretical image denoising approach by com- paring it with the nonlocal method [20]. Four 512? 512 images shown in Fig. 6.6: Lena, Barbara, Boat and Flinstones, are tested. The neighborhood B and the pa- rameter h in (6.6) are set to be 11? 11 and 10 respectively. The candidate set for the nonlocal method is set to be a 21 ? 21 square window. The parameter ?1 for the proposed method in the gain function in (6.20) is set to be 0.875. Note that this parameter ?1 is flxed for all four tested images. We flrst examine the candidate set generated by the proposed approach. In Figs. 6.1 and 6.2, we show the ideally optimal candidate sets of two difierent image patches by assuming the original signal is available. Obviously, due to the absence of the original signal, we are not able to get the ideally optimal candidate sets. Nevertheless, with the proposed game theoretical approach and the SURE estimate, we can flnd approximate candidate sets and the results are shown in Figs. 6.7 and 6.8. From Figs. 6.7 and 6.8, we can see that the approximate candidate sets are much more similar to the ideally optimal candidate sets compared with the flxed square window candidate set used by [20]. Then, we evaluate the PSNR comparison versus the standard derivation of the noise. The PSNR comparison between the nonlocal method and the proposed method for the tested images at difierent noise levels are shown in Fig. 6.9. From Fig. 6.9, we can see that the proposed method always performs better than, if not equal to, the nonlocal method for all tested images at all difierent noise variances. 140 When the standard deviation of the noise is no larger than 20, i.e. ? 20, the PSNR performance of the proposed approach is just a bit better than the nonlocal method. This is because when is small, the weights of the pixels outside the optimal candidate set are too small. In such a case, the reconstruction using a heuristically determined square window candidate set is similar to that using an optimal candidate set, i.e., the performance of the nonlocal method is similar to the performance of the proposed approach when is small. However, we should notice that the performance of the nonlocal method is always upper bounded by that of the proposed method since the adaptively chosen candidate set is always better than, if not equal to, the flxed square window candidate set. When becomes larger, the superiority of the proposed approach becomes more signiflcant. This phenomenon is mainly because when is large, the weights of the pixels outside the optimal candidate is relatively large and is no longer negligible. In such a case, the reconstruction using all pixels in a heuristically determined square window tends to lead to over-smooth artifacts. On the other hand, since the candidate set is adaptively chosen for every pixel in the proposed method, we are able to preserve details of the original image and avoid over-smooth artifacts. From Fig. 6.9, we can see that the gain of the proposed approach over the nonlocal method can be up to 2.16dB for the Flinestones image when = 60, which fully demonstrates the efiectiveness of the proposed approach. Finally, we evaluate the visual quality of the reconstructions. In Fig. 6.10, we show the visual quality comparison for Flinstones. As shown in Fig. 6.10, (a) is the original patch of Flinstones and (b) is the noisy patch with = 25. The results 141 generated by the nonlocal method and the proposed approach are shown in (c) and (d) respectively. We can see that the result generated by the nonlocal method is over-smooth. This phenomenon is because the nonlocal method involves too many dis-similar pixels in the averaging process. With the proposed approach, every pixel (player) seeks parters to form coalition to determine the best number of neighbor- hoods to perform denoising, which can rule out the dis-similar neighborhoods and avoid over-smooth artifacts. Therefore, the details can be well-preserved in the pro- posed approach. Similar phenomenons can be observed in Figs. 6.11, 6.12 and 6.13 for Barbara, Lena and Boat at noise level = 35, = 45, and = 20 respectively. Due to the page limitation, we only show the results of one for each image in this chapter. Similar results are observed for difierent 0s. 6.6 Summary In this chapter, we study the trade-ofi between the accuracy of the Stein?s esti- mate and the minimum of the true MSE and formulate the image denoising problem as a coalition formation game. With the proposed game, every player (pixel) seek parters to form coalitions to obtain better decision for the optimal neighborhoods se- lection and thus lead to better denoising results. The experimental results show that compared with the nonlocal method [20], the proposed game theoretical approach can achieve not only better PSNR performance but also better visual quality. Note that the proposed game is also applicable in other scenarios besides the nonlocal method as long as 1) there exist some locally adaptive parameters to be estimated, 142 and 2) the estimation accuracy will be improved when more samples are involved in the estimate process. Moreover, we showed that the traditional approaches using a heuristically determined candidate set are special cases of the game theoretical framework by choosing the utility function without a cost term. 143 (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) Figure 6.7: An example of optimal candidate set with an edge region: (a) original image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) (g) and (h) are the ideally optimal candidate sets of (b) (c) and (d) when the original signal is available; (i) (j) and (k) are the candidate sets of (b) (c) and (d) generated by the proposed method. 144 (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) Figure 6.8: An example of optimal candidate set with a smooth region: (a) original image; (b) noisy image with = 15; (c) noisy image with = 25; (d) noisy image with = 35; (e) the candidate set used by nonlocal; (f) (g) and (h) are the ideally optimal candidate sets of (b) (c) and (d) when the original signal is available; (i) (j) and (k) are the candidate sets of (b) (c) and (d) generated by the proposed method. 145 10 15 20 25 30 35 40 45 50 55 60 24 25 26 27 28 29 30 31 32 33 34 ? PSNR (dB ) Lena Nonlocal Proposed (a) 10 15 20 25 30 35 40 45 50 55 60 22 23 24 25 26 27 28 29 30 31 32 ? PSNR (dB ) Barbara Nonlocal Proposed (b) 10 15 20 25 30 35 40 45 50 55 60 22 23 24 25 26 27 28 29 30 31 32 ? PSNR (dB ) Boat Nonlocal Proposed (c) 10 15 20 25 30 35 40 45 50 55 60 18 20 22 24 26 28 30 ? PSNR (dB ) Flinstones Nonlocal Proposed (d) Figure 6.9: The PSNR comparison for difierent images: (a) Lena; (b) Barbara; (c) Boat; (d) Flinstones. 146 (a) (b) (c) (d) Figure 6.10: The visual quality comparison for Flinstones with = 25: (a) original image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. 147 (a) (b) (c) (d) Figure 6.11: The visual quality comparison for Barbara with = 35: (a) original image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. 148 (a) (b) (c) (d) Figure 6.12: The visual quality comparison for Lena with = 45: (a) original image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. 149 (a) (b) (c) (d) Figure 6.13: The visual quality comparison for Boat with = 20: (a) original image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the result generated by the proposed approach. 150 Chapter 7 Simultaneous Image Denoising and Interpolation Using Evolutionary Games Spatial resolution up-conversion is one of the most important tasks in the fleld of image processing. Image interpolation is the technique addressing the problem of spatial resolution up-conversion. It generates a high resolution image from the input low resolution image by exploiting the inherent relationship between them. The commonly used image interpolation methods are the conventional linear interpolation schemes such as bilinear and bicubic interpolation [75]. These methods generate the high resolution image using a spatial-invariant linear interpolation fllter. Although the computational complexity is low, these methods are not favored since they introduce a lot of blurring and ringing artifacts. To overcome the drawbacks of conventional linear interpolation schemes, many more sophisticated adaptive image interpolation methods have been proposed. Jensen and Anastassiou proposed to flrst detect edges and then flt them with some tem- plates to improve the interpolation result [69], while Carrato and Tenze optimized the interpolation parameters by using some predetermined edge pattern [24]. In [13], Allebach and Wong proposed to flrst estimate the high resolution edge map from the low resolution image using a subpixel edge estimation technique, and then cor- rect the interpolated high resolution pixels based on the high resolution edge map. 151 Based on the assumption that the covariance matrix of the high resolution image can be well estimated from the covariance matrix of the low resolution image, Li and Orchard proposed an edge-preserved interpolation scheme [82]. Cha and Kim in [25] proposed a modifled bilinear method by amending the error based on the interpolation error theorem in an edge-adaptive way. In [81], an MRF model-based edge-directed interpolation method is proposed by formulating the image interpola- tion problem as an energy minimization problem over a 2-D Markov Random fleld, where the edge direction information generated by a statistical based approach is incorporated in the energy function. For better interpolation, Zhang and Wu pro- posed to flrst generate multiple reconstructions from difierent directions, and then fuse the results by minimum mean square error estimation [132]. To further improve the interpolation results, a soft-decision adaptive interpolation (SAI) technique is proposed in [135] by combining the piecewise 2-D autoregressive modeling and block estimation. Although the existing approaches can achieve promising interpolation results, they are designed for the noisy-free images, i.e., clean images. However, if the low resolution image is noisy, most of the existing interpolation approaches will also boost the noise and introduces severe visual distortions such as fake edge artifacts shown in Figure 7.1. In reality, due to the quality of the sensors or the conditions of the environment, sensor noise is introduced during the image acquisition processes. Moreover, additional noise and distortions will be introduced during the processes of being digitized, recorded, and transmitted. To avoid the distortion caused by the undesired noise boosting, one may perform the denoising before the interpolation. 152 (a) (b) (c) Figure 7.1: (a) a region of Lena image that is corrupted with Gaussian noise with noise variance 2 = 25; (b) the interpolation result using bicubic; (c) the interpola- tion result using SAI. Many image denoising approaches have been proposed in the literature by exploiting the inherently spatial correlation. For examples, Woods and Radewan [126] proposed to estimate the original image from the noisy image by using Kalman fllter while Jin [70] proposed to use adaptive Wiener fllter. To further exploit the spatial cor- relation, Buades [20] proposed to average, in a weighted manner, all the pixels in a nonlocal window instead of only involving the locally neighboring pixels. For better reconstruction, the authors in [33] propose a game theoretic approach to adaptively choose the optimal neighborhood. Nonlinear approaches such as difiusion [55], total variation [56], rate distortion optimization [32], and fuzzy flltering [120] were also investigated. Besides the pixel-domain approaches, transform-domain approaches such as wavelet shrinkage [50] [45] were also investigated. Nevertheless, as shown later in Section 7.2.1., since the denoising problem it- self is ill-posed, the methods that flrst perform denoising and then interpolate the 153 denoising image can in adversely introduce severe visual artifacts. Therefore, to achieve better results, it is very important for us to jointly consider image denois- ing and interpolation together. In the literature, there are some prior works on jointly perform denoising and demosaicking { a special case of interpolation that reconstructs the missing color component due to color-flltered image sensors. For example, the authors in [67] proposed to use total least square technique, while the authors in [79] proposed to combine directional fllter with wavelet-based denoising method. However, since these approaches are specially designed for demosaicking, they cannot be directly applied into the general interpolation problem. The essential problem of simultaneous denoising and interpolation is to es- timate the unknown clean pixels in the high resolution image based on the low resolution noisy image by exploiting the inherent relationship between them, e.g., estimating the unknown high resolution pixels using weighted average of a set of neighboring low resolution noisy pixels. However, since the original high resolution pixels are unknown, the optimal weights are not achievable. Most of the existing in- terpolation approaches flnd the approximated weights based on the assumption that the covariance matrix of the high resolution image can be well-estimated from the co- variance matrix of the low resolution image [82]. Nevertheless, when this assumption is not true, the interpolation performance will be greatly degraded. Moreover, since the low resolution image is corrupted by noise, the reconstruction performance can be further degraded. Here, instead of directly estimating the weights in one step, we propose to progressively reflne the weights by alternatively estimating the weights based on the reconstruction and flnding the reconstruction using the weights. Such 154 a reflnement process of the weights is actually an evolutionary process and can be naturally formulated as an evolutionary game. Thus, in this chapter, we consider the problem of simultaneous denoising and interpolation from the game theoretic perspective and formulate the problem as an evolutionary game. Evolutionary game theory is an application of the mathematical theory of games to the interaction dependent strategy evolution in populations. Arising from the realization that frequency dependent fltness introduces a strategic aspect to evo- lution, evolutionary game theory becomes an essential component of a mathematical and computational approach to biological contexts, such as genes, viruses, cells, and humans. There are three basic components in an evolutionary game: players, strate- gies, and payofi functions. Players are the entities who play the game. Strategies, which can be divided into pure strategies and mixed strategies, are the complete plans of actions players may take in the game. A pure strategy is a deterministic plan of how a player will play a game while a mixed strategy is an assignment of a probability to each pure strategy. Payofi functions determine the payofis players can obtain by adopting a certain strategy. In the proposed evolutionary game for simultaneous image denoising and interpolation, the players are the unknown high resolution pixels, and their pure strategies are the neighboring low resolution noisy pixels. The probabilities in the mixed strategy are the non-negative normalized weights of the low resolution noisy pixels. In this sense, the simultaneous image denoising and interpolation problem is no longer ill-posed. Instead, the problem becomes well-deflned, and the objective of the player is to flnd the evolutionarily stable strategy, i.e., the optimal combination of the low resolution noisy pixels, to 155 achieve better denoising or interpolation performance. The rest of this chapter is organized as follows. We flrst give an introduction about evolutionary game model in Section 7.1. Then, we show the problems of the methods that flrst perform denoising and then interpolate the denoising image, and describe in details how to formulate the problem of simultaneous image denoising and interpolation as an evolutionary game, how to choose the pure strategy set and how to deflne the payofi function in Section 7.2. Experimental results are shown in Section 7.3. Finally, we draw conclusions in Section 7.4. 7.1 Evolutionary Game Model Before discussing how to use evolutionary game to simultaneously perform denoising and interpolation, in this section, we flrst brie y give an introduction about evolutionary games. A game G = fU;A; Fg is generally deflned as follows. U = fu1; u2; :::; uNg are the players who play the game. A = A1 ? A2 ? ::: ? AN are the pure strategy sets, where Ai is the pure strategy set containing all possible pure strategies for user ui. Let ai 2 Ai be one possible strategy for ui, then a = (a1; a2; :::; aN) 2 A is a strategy proflle of U. F = ff1; f2; :::; fNg are the players? payofi/utility functions, where fi is the payofi function of ui. In general, fi is determined by all players? strategies a rather than ai only, i.e., fi = fi(a). Also, a?i = (a1; :::; ai?1; ai+1; :::; aN) denotes the players? strategy proflle except player ui. Besides pure strategies, players can also take a mixed strategy by randomizing among difierent pure strategies. Suppose there are Mi pure strategies in the pure 156 strategy set Ai = fa1i ; :::; aMii g, and let pij; j = 1; :::;Mi be the probability of ui choosing the jth pure strategy aji , then pi = (pi1; :::; piMi) is a mixed strategy. Since the payofi function fi is generally not only determined by ai but also by a?i, to maximize the payofi, every player needs to know all other players? strategies. However, it is generally very di?cult or even impossible for players to know other player strategies and payofis. In such a case, to improve their payofis, players will try difierent strategies in every play and learn from the strategic interactions using the methodology of understanding-by-building, which leads to the concept of \Evolutionary Game" [34] [110]. An evolutionary game is a game that studies the evolution of the interaction dependent strategy in populations, and was flrst articulated by John Maynard Smith and G. R. Price for evolutionary biology. It is based on the idea that an organism?s genes largely determine its fltness in a given environment. Organisms that are more flt to the environment will tend to produce more ofispring, due to which genes that provide greater fltness have more representation in the population. Therefore, fltter genes tend to win over time and drive out other genes. If we treat organisms as players and genes as strategies, then the genes which persist in the population are the evolutionarily stable strategy (ESS) of an evolutionary game, which can be formally deflned as \a strategy such that, if all members of the population adopt it, then no mutant strategy could invade the population under the in uence of natural selection". The evolution of the strategies over the population by natural selection can be characterized by the Wright-Fisher model [54], which is by far the most popular stochastic model for reproduction in population genetics. It is based on 157 Denoising Interpolation n lI ?lI ?hI (a) Simultaneous Denoising and Interpolation n lI ?hI (b) Figure 7.2: (a) Denoising before interpolation; (b) Simultaneous denoising and in- terpolation. the assumption that the probability of an individual adopting a certain strategy is proportional to the expected payofi of the population using that strategy, and the strategy updating equation can be written as pt+1ij = ptijf ti (aji ; a?i)PMi k=1 ptikf ti (aki ; a?i) ; (7.1) where the numerator ptijf ti (aji ; a?i) is the expected payofi of ui using strategy aji , and the denominatorPMik=1 ptikf ti (aki ; a?i) is the total expected payofi of ui using difierent strategies, which is the normalization term that ensures PMij=1 pt+1ij = 1. 7.2 Image Denosing and Interpolation as an Evolutionary Game 7.2.1 Problems of Denoising+Interpolation As discussed in the introduction, when the input low resolution image is noisy, most of the existing interpolation approaches will also boost the noise and introduces severe visual distortions. To avoid such kinds of distortions, one may consider performing the denoising before the interpolation as shown in Figure 7.2 (a). The noisy low resolution image Inl is flrst passed through a denoising process and an estimate of the original low resolution image I^l is found. Then, the estimated low 158 (a) (b) (c) (d) Figure 7.3: (a) a region of Lena image that is corrupted by Gaussian noise with noise variance 2 = 225; (b) the denoised result using nonlocal; (c) the interpolation result of (b) using bicubic; (d) the interpolation result of (b) using SAI. resolution image I^l goes through a interpolation process and a reconstruction of the original high resolution image I^h is obtained. Nevertheless, since the denoising problem itself is ill-posed [20], the low resolution estimate I^l is not perfect and there will be some difierences between the estimate I^l and the original Il. On one hand, the edge structures and textures may be removed during the denoising process, e.g., the details of the hair shown in Figure 7.3 (b). On the other hand, some noise may not be e?ciently suppressed during the denoising process, e.g., the region around the nose and mouth shown in Figure 7.3 (b). In such a case, if we directly 159 perform interpolation on I^l, the lost edge structures and textures will never be reconstructed back and the remained noise will be boosted which introducing severe visual distortions such as fake edge artifacts as shown in Figure 7.3 (c) and (d). Therefore, we should jointly consider image denoising and interpolation to achieve better reconstruction as illustrated in Figure 7.2 (b). 7.2.2 Game Theoretic Formulation In this simultaneous denoising and interpolation problem, we have the noisy observation of the low resolution pixels, Inl (m;n); 1 ? m ? NH ; 1 ? n ? MW . The objective of this problem is to estimate the high resolution image Ih(i; j); 1 ? i ? 2 ? NH ; 1 ? j ? 2 ? MW based on the noisy low resolution image Inl (n;m). Obviously, this problem is ill-posed. To flnd a good estimate for the unknown high resolution pixels, we need to exploit the correlation among the low resolution pixels and between the low resolution pixels and the high resolution pixels. One possible approach is to use the spatially varying linear fllter, i.e., each unknown high resolution pixel can be estimated using weighted average of a set of neighboring noisy low resolution pixels. I^h(i; j) = X (m;n)2?ij wijmnInl (m;n); 8i;8j; (7.2) where ?ij is the candidate set of neighboring noisy low resolution pixels for Ih(i; j) and wijmn is the weight of candidate pixel Inl (m;n). Usually, we have the constraints that 0 ? wijmn ? 1 and P (m;n)2?ij wijmn = 1. Obviously, the optimal weights in (7.2) are not achievable since the original 160 high resolution pixels are unknown. Most of the existing approaches flnd the ap- proximated weights based on the assumption that the covariance matrix of the high resolution image can be well-estimated from the covariance matrix of the low res- olution image [82]. However, when this assumption is not true, the denoising and interpolation performance will be greatly degraded. Here, instead of directly esti- mating the weights in one step, we propose to progressively reflne the weights by alternatively estimating the weights based on the reconstruction and flnding the reconstruction using the weights. Such a reflnement process of the weights is ac- tually an evolutionary process and can be naturally formulated as an evolutionary game [110]. In the proposed evolutionary game for image denoising and interpola- tion, the players are the unknown high resolution pixels to be estimated and the pure strategies are the correspondingly neighboring noisy low resolution pixels. By regarding the non-negative weights of the neighboring noisy low resolution pixels as the probabilities of selecting the pure strategies, the problem of estimating the high resolution pixels becomes flnding the evolutionarily stable strategies for the evolutionary game. In this sense, the simultaneous image denoising and interpola- tion problem is no longer ill-posed. Instead, the problem becomes well-deflned, and the objective of the problem is to flnd the evolutionarily stable strategies to achieve good denoising and interpolation performance. In summary, the estimation problem in (7.2) can be formulated as an evolu- tionary game as follows. ? Players: unknown high resolution pixels Ih(i; j) 161 ij? hI (2 1,2 1)hI m n? ? Figure 7.4: The pure strategy set ?ij for Ih(2m?1; 2n?1); Ih(2m?1; 2n); Ih(2m; 2n? 1); Ih(2m; 2n). ? Pure strategies: noisy low resolution pixels Inl (m;n) ? Pure strategy set: the candidate set ?ij ? Mixed strategy: the estimate I^h(i; j) ? Probabilities in the mixed strategy: the non-negative normalized weights wijmn 7.2.3 Pure Strategy Set ?ij In most of the previous interpolation approaches, the high resolution image is reconstructed in two steps: in the flrst step, the unknown high resolution pixels sur- rounded by four low resolution pixels, i.e., Ih(2m; 2n); and in the second step, other unknown high resolution pixels Ih(2m? 1; 2n) and Ih(2m; 2n? 1) are reconstructed 162 with the help of reconstructed high resolution pixels Ih(2n; 2m). Difierent from pre- vious approaches, in our game theoretic framework, pixels are treated as players and they have the same priority. The pure strategy set ?ij is the same for all four high resolution pixels Ih(2m?1; 2n?1); Ih(2m?1; 2n); Ih(2m; 2n?1); Ih(2m; 2n) and is deflned as a square window centered by Ih(2m? 1; 2n? 1) in Figure 7.4. As shown in Figure 7.4, the red circle stands for pixel Ih(2m? 1; 2n? 1) and the three green squares stand for pixels Ih(2m? 1; 2n), Ih(2m; 2n? 1), and Ih(2m; 2n) respectively. The gray square window centered by Ih(2m? 1; 2n? 1), denoted as ?ij, is the pure strategy set for Ih(2m? 1; 2n? 1); Ih(2m? 1; 2n); Ih(2m; 2n? 1), and Ih(2m; 2n). 7.2.4 Payofi Function After choosing the pure strategy set, we now discuss how to deflne the payofi function. The payofi function f tij(aij; a?ij) measures the player?s payofi of taking strategy aij when other players? strategies are a?ij at time t. Let I^ th stand for the estimate of the high resolution image Ih at time t, and B(I^ th(i; j)) stand for the patch centered by I^ th(i; j). The neighborhood similarity between pixel I^ th(i; j) and I^ th(m;n) at time t can be measured by Dt(I^ th(i; j) $ I^ th(m;n)) as follows Dt(I^ th(i; j) $ I^ th(m;n)) = jjB(I^ th(i; j))?B(I^ th(m;n))jj2: (7.3) Moreover, let us deflne ?Dt?1(I^h(i; j) $ I^h(m;n)) as the weighted average of the neighborhood similarity up to time t? 1 as follow ?Dt?1(I^h(i; j) $ I^h(m;n)) = Pt?1 k=0 flkDk(I^kh(i; j) $ I^kh(m;n))Pt?1 k=0 flk ; (7.4) 163 where fl is a discounting factor, and when fl = 1, ?Dt?1(I^h(i; j) $ I^h(m;n)) reduces to the simple averaging. Note that ?Dt?1(I^h(i; j) $ I^h(m;n)) can be treated as an estimate ofD(Ih(i; j) $ Ih(m;n)) at time t?1 by taking into account all previous reconstruction I^0h; :::; I^ t?1h , and Dt(I^ th(i; j) $ I^ th(m;n)) is an estimate of D(Ih(i; j) $ Ih(m;n)) at time t us- ing the reconstruction I^ th. If I^ th(i; j) is a good estimate, Dt(I^ th(i; j) $ I^ th(m;n)) tends to be close to ?Dt?1(I^h(i; j) $ I^h(m;n)). If Dt(I^ th(i; j) $ I^ th(m;n)) < ?Dt?1(I^h(i; j) $ I^h(m;n)), the neighborhood similarity between pixel I^h(i; j) and I^h(m;n) is larger than what we anticipate at time t? 1, which means that a larger payofi should be received by adopting strategy I^ th(m;n) at time t. On the other hand if Dt(I^ th(i; j) $ I^ th(m;n)) > ?Dt?1(I^h(i; j) $ I^h(m;n)), the neighborhood sim- ilarity between pixel I^h(i; j) and I^h(m;n) is smaller than what we anticipate at time t ? 1, which means that a smaller payofi should be received by adopting strategy I^ th(m;n) at time t. Therefore, the payofi function should be an increasing function of ( ?Dt?1(I^h(i; j) $ I^h(m;n) ?Dt(I^ th(i; j) $ I^ th(m;n))). Here, we use the following payofi function f tij(aij; a?ij) = exp 0 @ fit ? ?Dt?1(I^h(i; j) $ aij)?Dt(I^ th(i; j) $ aij) ? 1 A ; (7.5) where fit and are parameters. Interestingly, if we set fit = fltPt k=0 flk and substitute (7.5) back into (7.1), the probability of choosing strategy aij at time t+ 1 can be simplifled as pt+1aij = exp ? ? ?Dt(I^h(i;j)$aij) ? P aij exp ? ? ?Dt(I^h(i;j)$aij) ? : (7.6) 164 Bicubic Neighborhood Similarity Probabilities of Pure Strategies Reconstruction Figure 7.5: The block diagram of the proposed method. where ?Dt(I^h(i; j) $ I^h(m;n)) is deflned as ?Dt(I^h(i; j) $ I^h(m;n)) = Pt k=0 flkDk(I^kh(i; j) $ I^kh(m;n))Pt k=0 flk ; (7.7) The (7.6) can be interpreted as follows: the true similarity between Ih(i; j) and aij is approximately measured by ?Dt(I^h(i; j) $ aij), and the larger ?Dt(I^h(i; j) $ aij), the less the similarity; therefore, the pixels with smaller ?Dt(I^h(i; j) $ aij) should make more contribution during the denoising and interpolation process, i.e., the probability of choosing aij should be a decreasing function of ?Dt(I^h(i; j) $ aij), and here we use an exponential function. According to the above discussions, the proposed simultaneous image denoising and interpolation using evolutionary games can be summarized as in Figure 7.5. As shown in Figure 7.5, the noisy low resolution image Inl is flrst interpolated using Bicubic [75] to obtain a noisy estimate of the high resolution image I^0h. Then, at each evolution time index t, we compute the neighborhood similarity Dt(I^ th(i; j) $ I^ th(m;n)) using (7.3). Then, the probability of using a certain pure strategy aij can be updated using (7.6) and (7.7). Finally, the estimate of the high resolution image 165 at time index t+ 1 can be found as follows I^ t+1h (i; j) = X aij2?ij pt+1aij Inl (aij): (7.8) Note that the outcome of the proposed game theoretic algorithm is an evo- lutionarily stable strategy, and the proof can be found in the following Theorem 1. Theorem 1: For any fl 2 [0; 1), the outcome of the proposed game theoretic algorithm shown in Figure 7.5 is an evolutionarily stable strategy (ESS). Proof : Since fl 2 [0; 1), flk goes to zero as k goes to inflnite. According to (7.7), ?Dt(I^h(i; j) $ I^h(m;n)) converges for su?ciently large t. Therefore, according to (7.6), pt+1aij converges for su?ciently large t, which means that the probability distribution of using the pure strategies pij = (p1; :::; pj?ij j) converges. Since such a strategy is chosen under the in uence of Wright-Fisher natural selection model [54], according to the deflnition of ESS, the strategy is an ESS. 7.3 Experimental Results To evaluate the proposed game theoretic approach for simultaneous image de- noising and interpolation, we compare with the methods that flrst perform denoising using nonlocal [20] and then perform interpolation using either bicubic method [75] or the soft-decision adaptive interpolation (SAI) method [135], which are denoted as \Nonlocal+Bicubic" and \Nonlocal+SAI". Five images shown in Figure 7.6: Lena, Boat, Kodim03, Kodim07, and Kodim09, are tested. For convenience, all tested image are truncated to 512x512. The noisy low resolution images are gen- 166 erated by flrst adding additive white Gaussian noise to the high resolution images and then directly perform downsample. We flrst evaluate the PSNR performance for difierent approaches. In Figure 7.7, we show the PSNR comparison among Nonlocal+Bicubic, Nonlocal+SAI and the proposed method. We can see that the proposed method outperforms both Nonlocal+Bicubic and Nonlocal+SAI when ? 10, and the gain becomes larger and larger as increases. We can also see that as increases, the advantage of using SAI diminishes. This is mainly because too many details are removed during the denoising process when is high. This phenomenon fully demonstrates the importance of performing joint denoising and interpolation. When 5 < < 10, the PSNR performance of the proposed method is similar to that of Nonlocal+SAI (the proposed method has slightly better performance for Lena and Kodim03, and slightly worse performance for Boat and Kodim09). However, both the proposed method and Nonlocal+SAI are better than Nonlocal+Bicubic. When ? 5, the proposed method performs slightly worse than Nonlocal+SAI. This is mainly be- cause the proposed method not only removes the additive noise but also removes the sensor noise in the original image, and the PSNR is computed based on the original images which contains some sensor noise. Nevertheless, the proposed method can achieve much better visual quality even in the low region as shown in Figure 7.9. We also evaluate the visual quality performance and show the visual quality comparison in Figures 7.8, 7.9, 7.10, 7.11, and 7.12, respectively, for difierent images at difierent noise variances. The results for Lena image are shown in Figure 7.8, where (a) is the noisy low resolution Lena image with = 10, (b) is the result 167 generated by Nonlocal+Bicubic with PSNR 30.71dB, (c) is the result generated by Nonlocal+SAI method with PSNR 31.11dB, and (d) is the result generated by the proposed method with PSNR 31.38dB. By comparing (b) (c) and (d) in Figure 7.8, we can see that both Nonlocal+Bicubic and Nonlocal+SAI cannot well suppress noise in the edge and texture regions and introduce some visually annoying artifacts such as fake edge artifacts in the face region. Since the proposed game theoretic method can simultaneously perform denoising and interpolation, we can avoid the artifacts caused by the separation of denoising and interpolation and generate the reconstruction with much better performance in terms of both PSNR and visual quality. Moreover, the proposed method has the ability to automatically remove the visually annoying sensor noise. In Figure 7.9, we show the reconstructed results for Boat image. Note that the original Boat image contains some sensor noise es- pecially in the sky region as shown in Figure 7.6 (b). In Figure 7.9, (a) is the noisy low resolution Boat image with = 5, (b) is the result generated by Non- local+Bicubic with PSNR 28.52dB, (c) is the result generated by Nonlocal+SAI method with PSNR 28.92dB, and (d) is the result generated by the proposed method with PSNR 28.74dB. By comparing (b) (c) and (d) in Figure 7.9, we can see that the proposed method automatically removes the sensor noise in the sky region while Nonlocal+SAI and Nonlocal+Bicubic cannot. In such a case, although the proposed method has a lower PSNR performance compared with Nonlocal+SAI, the visual quality of the result generated by the proposed method is still much better than that of Nonlocal+SAI. 168 The visual quality of the reconstructions of Kodim07, Kodim09 and Kodim03 are also evaluated at noise level = 10, = 15, and = 20 in Figures 7.10, 7.11, and 7.12, respectively. Similar to previous experiments, the proposed method can greatly suppress the noise and restore the image with not only better PSNR performance but also better visual quality, especially in the regions around edges and textures. 7.4 Summary In this chapter, we investigated the problem of simultaneous image denoising and interpolation from a completely new angle: game theoretic perspective. We treat each unknown high resolution pixel as an individual player and formulate the joint image denoising and interpolation problem as an evolutionary game. From such a perspective, the problem of estimating the high resolution pixels becomes flnding the evolutionarily stable strategies for the evolutionary game. The exper- imental results show that compared with the methods that flrst denoise the noisy low resolution image and then interpolate the denoised image, the proposed game theoretic approach can achieve not only better PSNR performance but also better visual quality. 169 (a) (b) (c) (d) (e) Figure 7.6: The test images: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07; (e) Kodim09. 170 0 5 10 15 20 25 30 35 40 24 26 28 30 32 34 36 ? PSNR (dB ) Lena Nonlocal+Bicubic Nonlocal+SAI Proposed (a) 0 5 10 15 20 25 30 35 40 22 23 24 25 26 27 28 29 30 ? PSNR (dB ) Boat Nonlocal+Bicubic Nonlocal+SAI Proposed (b) 0 5 10 15 20 25 30 35 40 25 26 27 28 29 30 31 ? PSNR (dB ) Kodim03 Nonlocal+Bicubic Nonlocal+SAI Proposed (c) 0 5 10 15 20 25 30 35 40 24 25 26 27 28 29 30 31 32 33 ? PSNR (dB ) Kodim07 Nonlocal+Bicubic Nonlocal+SAI Proposed (d) 0 5 10 15 20 25 30 35 40 25 26 27 28 29 30 31 32 ? PSNR (dB ) Kodim09 Nonlocal+Bicubic Nonlocal+SAI Proposed (e) Figure 7.7: The PSNR comparison: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07; (e) Kodim09. 171 (a) (b) (c) (d) Figure 7.8: The visual quality comparison for Lena: (a) the noisy low resolution image with = 10; (b) the result generated by Nonlocal+Bicubic (30.71dB); (c) the result generated by Nonlocal+SAI method (31.11dB); (d) the result generated by the proposed method (31.38dB). 172 (a) (b) (c) (d) Figure 7.9: The visual quality comparison for Boat: (a) the noisy low resolution image with = 5; (b) the result generated by Nonlocal+Bicubic (28.52dB); (c) the result generated by Nonlocal+SAI method (28.92dB); (d) the result generated by the proposed method (28.74dB). 173 (a) (b) (c) (d) Figure 7.10: The visual quality comparison for Kodim07: (a) the noisy low resolution image with = 10; (b) the result generated by Nonlocal+Bicubic (29.85dB); (c) the result generated by Nonlocal+SAI method (30.18dB); (d) the result generated by the proposed method (30.35dB). 174 (a) (b) (c) (d) Figure 7.11: The visual quality comparison for Kodim09: (a) the noisy low resolution image with = 15; (b) the result generated by Nonlocal+Bicubic (29.56dB); (c) the result generated by Nonlocal+SAI method (29.81dB); (d) the result generated by the proposed method (30.41dB). 175 (a) (b) (c) (d) Figure 7.12: The visual quality comparison for Kodim03: (a) the noisy low resolution image with = 20; (b) the result generated by Nonlocal+Bicubic (27.85dB); (c) the result generated by Nonlocal+SAI method (28.03dB); (d) the result generated by the proposed method (28.37dB). 176 Chapter 8 Conclusions and Future Work 8.1 Conclusions In this thesis, we develop a game-theoretic framework that enables us to flrst analyze and model human behaviors in multimedia social networks, and then better design the multimedia systems by taking into account the impact of human factors. We have showed that understanding the human behaviors and dynamics in a mul- timedia social network can ultimately ofier better system performance and thus is essential for its continued progress, and such analysis and modeling can be applied to any social networks. Moreover, we extend the concept of multimedia social networking into classical signal/image processing problems to liberate pixels/signals as players to develop a game-theoretic framework that can, not only overcome some of the undesired ill-posed formulations in traditional approaches, but also obtain a more general paradigm beyond what can be accomplished by using traditional optimization tools. With the notion of learning and cognitive process inherent in a game theoretic formulation, we show that many classical approaches are basically special cases of the proposed game-theoretic framework. Therefore, the proposed framework ofiers new directions, insight, and methodologies in further advancing of the science of signal/image processing. 177 The broader impact of this thesis is 1. Social networks have pervaded our daily life. By illustrating that game theory can be used to understand human behavior and dynamics in a multimedia social network with better system performance, this thesis can motivate similar new ideas in many social networks. 2. Signal and image processing has been a fundamental tool in many scientiflc and engineering disciplines. By introducing the proposed new game-theoretic paradigm to classical signal/image problems with new insight and signiflcant performance improvement, this thesis can trigger similar new thinking to many scientiflc areas that use signal/image processing. Speciflcally, in Chapter 3, we consider a non-cooperative multimedia social network and discuss how a group of users compete for the same resource. We use multiuser rate allocation social network as an example and show that game the- ory can provide a more general framework by theoretically proving that the tradi- tional optimization-based approach is a special case of the proposed game theoretical framework. Moreover, with the proposed method, we can flnd, in a distributed man- ner, a NE that is not only e?cient from system designer?s perspective but also fair from users? perspective. Then, in Chapter 4, we consider a cooperative multimedia social network and discuss how a group of selflsh users cooperate with each other to better obtain the content. We use cooperative peer-to-peer streaming social network as an example and show that evolutionary game can be used in such a scenario and ESS is the desired cooperative strategy. Moreover, we propose a distributed learn- 178 ing algorithm for users to converge to the ESS by learning from their own payofi history. In Chapter 5, we discuss how to stimulate cooperation in cooperative social networks. We flrst show that most of the existing game theoretic cooperation stim- ulation approaches fail when the number of interaction between a pair of players is flnite, and the major reason is the use of direct reciprocity. Then, we propose to use indirect reciprocity games to stimulate cooperation in such a scenario by taking into account the indirect opinions. With such a modeling, we show with simulations that an evolutionarily stable cooperative strategy can be achieved with a proper cost-to-gain ratio. In Chapter 6, the image denoising problem is formulated as a coalition formation game, where every pixel is treated as a player who tries to seek partners to form a coalition to flnd the optimal neighborhoods for better denoising results. By forming a coalition, every player in the coalition can obtain certain gains by improving the accuracy of the distortion estimation, while incurring some costs by increasing the true distortion. With such a formulation, the traditional image denoising approaches using a heuristically determined neighborhood set can be seen as special cases of the proposed game theoretical framework by choosing the utility function without a cost term. Another example is formulating the problem of si- multaneous image denoising and interpolation as an evolutionary game in Chapter 7, where the players are the unknown high resolution pixels and the pure strategies of the players are the corresponding noisy low resolution neighbors. By regarding the nonnegative weights of the noisy low resolution pixels as the probabilities of selecting the pure strategies, the problem of estimating the high resolution pixels becomes flnding the evolutionarily stable strategies for the evolutionary game. In 179 this sense, we say that the simultaneous image denoising and interpolation problem is no longer ill-posed. Instead, the problem becomes well-deflned, and the objec- tive of the problem is to flnd the evolutionarily stable strategies to achieve good denoising and interpolation performance. 8.2 Future Work Recently, the area of human and social dynamics has been identifled by the U.S. National Science Foundation as one of its flve priority areas, which shows the importance of this emerging interdisciplinary research area. Game theoretic modeling for multimedia social networks is a new emerging research fleld and is still in an infant stage. There are a lot of exciting problems to be investigated and addressed, which I will continue to devote my efiorts to. Security in multimedia social networks is one of these problems. Besides self- ish users, there are a group of users, called malicious users, in multimedia social networks. Unlike selflsh users whose aims are to maximize their own payofis, the objective of malicious users is to damage or even break down the system. Therefore, to successfully deploy the multimedia social networks, we need to study and analyze the malicious users attack strategies and develop the corresponding attack-resistant strategies. With my previous works, we can see that understanding the behavior dy- namics among users can ultimately ofier better system performance. Such analysis and modeling can be applied to other social networks. Therefore, in the future, I 180 would like to investigate the possibility of using such analysis and modeling in other networks such as online social networks, smart grid networks and camera networks. In my previous works, I have successfully used the concept of multimedia social networks to reformulate image denoising and image interpolation problems. In the future, I would like to apply the concept to more classical problems such as image/video compression problems, estimation and detection problems, pattern recognition and classiflcation problems, adaptive signal processing problems, and information theory related problems. It is our belief that from the multimedia social networks point of view, we can make the ill-posed problems well deflned and are able to construct generalized and unifled frameworks for the classical problems. We hope that we can introduce a new paradigm not only in the fleld of signal and image processing but also in many other flelds such as computer vision and information theory. 181 Bibliography [1] Accustream iMedia Research Homepage: http://www.accustreamresearch.com. [2] Flickr: http://www. ickr.com. [3] Napster: http://www.napster.com. [4] PPLive: http://www.pplive.com. [5] PPStream: http://www.ppstream.com. [6] H.264/AVC Software: http://iphome.hhi.de/suehring/tml/. [7] Sopcast: http://www.sopcast.com. [8] UUSee: http://www.uusee.com. [9] Youtube: http://www.youtube.com. [10] A. M. Frieze A. D. Flaxman and J. Vera. A geometric preferential attachment model of networks ii. In Proceedings of the 5th Workshop On Algorithms And Models For The Web-Graph, page 4155, 2007. [11] F. Abramovich and Y. Benjamini. Adaptive thresholding of wavelet coe?- cients. Computational Statistics and Data Analysis, 22:351{361, 1996. [12] Ishfaq Ahmad and Jiancong Luo. On using game theory for perceptually tuned rate control algorithm for video coding. IEEE Trans. Circuits Syst. Video Technol., 16(2):202{208, Feb. 2006. [13] Allebach and P. W. Wong. Edge-directed interpolation. In IEEE Int. Conf. Image Processing, volume 3, pages 707{710, 1996. [14] L. Anderegg and S. Eidenbenz. Ad hoc-vcg: A truthful and cost-e?cient routing protocol for mobile ad hoc networks with selflsh agents. In Proc. ACM MobiCom, 2003. [15] K. R. Apt and A. Witzel. A generic approach to coalition formation. In Proc. Int. Workshop Computational Social Choice (COMSOC), 2006. [16] Lawrence M. Ausubel. An e?cient ascending-bid auction for multiple objects. American Economic Review, 94:1452{1475, 2004. [17] A.-L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286:509{512, 1999. [18] B. Bollobas and O. Riordan. The diameter of a scale-free random graph. In Combinatorica, page 534, 2004. 182 [19] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. [20] A. Buades, B. Coll, and J. M. Morel. A review of image denoising algorithms, with a new one. SIAM Multiscale Modeling and Simulation, 4(2):490{530, July 2005. [21] A. Buades, B. Coll, and J. M. Morel. The staircasing efiect in neighborhood fllters and its solution. IEEE Trans. Image Processing, 15:1499{1505, 2006. [22] S. Buchegger and J. Y. Le Boudec. Performance analysis of the confldant protocol. In Proc. ACM MobiHoc, 2002. [23] L. Buttyan and J. P. Hubaux. Enforcing service availability in mobile ad-hoc networks. In Proc. ACM MobiHoc, 2000. [24] S. Carrato and L. Tenze. A high quality 2x image interpolator. IEEE Signal Process. Lett., 7(6):132{135, Jun. 2000. [25] Y. Cha and S. Kim. The error-amended sharp edge (ease) scheme for image zooming. IEEE Trans. Image Process., 16(6):1496{1505, Jun. 2007. [26] T. Chan and J. Shen. Image processing and analysis: variational, PDE, wavelet, and stochastic methods. SIAM, 2005. [27] Y. Chen, W. S. Lin, and K. J. R. Liu. Risk-distortion analysis for video collusion attacks: A mouse-and-cat game. IEEE Trans. on Image Processing, 19(7):1798{1807, July 2010. [28] Y. Chen and K. J. R. Liu. Indirect reciprocity game modelling for cooperation stimulation in cognitive networks. IEEE Trans. Commun., 59(1):159{168, Jan. 2011. [29] Y. Chen, B. Wang, and K. J. R. Liu. Multi-user rate allocation games for multimedia communications. IEEE Trans. Multimedia, 11(6):1170{1181, Oct. 2009. [30] Y. Chen, Y. Wu, B. Wang, and K.J.R. Liu. Spectrum auction games for multimedia streaming over cognitive radio networks. IEEE Trans. Commun., 58(8):2381{2390, Aug. 2010. [31] Yan Chen and Oscar C. Au. Simultaneous RD-Optimized rate control and video de-noising. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Pro- cessing, 2008. [32] Yan Chen, Oscar C. Au, and Xiaopeng Fan. Simultaneous MAP-based video denoising and rate-distortion optimized video encoding. IEEE Trans. on Cir- cuits and Sytems for Video Technology, 19(1):15{26, Jan. 2009. 183 [33] Yan Chen and K. J. Ray Liu. A game theoretical approach for image denoising. In Proc. IEEE Int. Conf. Image Process., 2010. [34] Yan Chen, Beibei Wang, W. Sabrina Lin, Yongle Wu, and K. J. Ray Liu. Co- operative peer-to-peer streaming: An evolutionary game-theoretic approach. IEEE Trans. on Circuits and Systems for Video Technology, 20(10):1346{1357, Oct. 2010. [35] T. H. Chiang and Y. Q. Zhang. A new rate control scheme using quadratic rate distortion model. IEEE Trans. Circuits Syst. Video Technol., 7:246{250, 1997. [36] Y. Chu, S. G. Rao, and H. Zhang. A case for end system multicast. In ACM SIGMETRICS, pages 1{12, 2000. [37] F. R. K. Chung and L. Lu. The diameter of sparse random graphs. In Advances in Applied Mathematics, page 257279, 2001. [38] C. Cooper and A. Frieze. A general model of web graphs. In Random Struc- tures and Algorithms, page 311335, 2003. [39] J. R. Corbera and S. Lei. Rate control in DCT video coding for low-delay communications. IEEE Trans. Circuits Syst. Video Technol., 9:172{185, 1999. [40] P. Cramton, Y. Shoham, and R. Steinberg. Combinatorial Auctions. MIT Press, 2006. [41] R. Cressman. Evolutionary dynamics and extensive form games. MIT Press, 2003. [42] J. Crowcroft, R. Gibbens, F. Kelly, and S. Ostring. Modeling incentives for collaboration in mobile ad hoc networks. In Proc. Workshop Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOPT?03), 2003. [43] Y. Zhan D. Chakrabarti and C. Faloutsos. R-mat: A recursive model for graph mining. In SIAM Conference on Data Mining, 2004. [44] S. Lawrence E. J. Glover D. M. Pennock, G. W. Flake and C. L. Giles. Win- ners don?t take all: Characterizing the competition for links on the web. In Proceedings of the National Academy of Sciences, page 52075211, 2002. [45] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3d transform-domain collaborative flltering. IEEE Trans. Image Processing, 16(8):2080{2095, 2007. [46] M. Dai, D. Loguinov, and H. Radha. Rate-distortion modeling of scalable video coders. In Proc. IEEE Int. Conf. Image Processing, 2004. 184 [47] S. Deering and D. Cheriton. Multicast routing in datagram inter-networks and extended lans. ACM Transactions on Computer Systems, 8(2):85{111, 1990. [48] W. Ding and B. Liu. Rate control of MPEG video coding and recording by rate-quantization modeling. IEEE Trans. Circuits Syst. Video Technol., 6:12{20, 1996. [49] P. Domingos and M. Richardson. Mining the network value of customers. In Proceedings of the 7th ACM SIGKDD international conference on Knowledge discovery and data mining, 2001. [50] D. L. Donoho. De-noising by soft-thresholding. IEEE Trans. Inform. Theory, 41:613{627, May 1995. [51] M. Elad. On the origin of the bilateral fllter and ways to improve it. IEEE Trans. Image Processing, 10(10):1141{1151, 2002. [52] M. Elad. Why shinkage is still relevant for redundant representations? IEEE Trans. Inform. Theory, 52(12):5559{5569, 2006. [53] M. Felegyhazi, J. P. Hubaux, and L. Buttyan. Nash equilibria of packet for- warding strategies in wireless ad hoc networks. IEEE Trans. Mobile Comput., 5:463{476, May 2006. [54] R. Fisher. The genetical theory of natural selection. Clarendon Press, Oxford, 1930. [55] G. Gilboa, N. Sochen, and Y. Y. Zeevi. Image enhancement and denoising by complex difiusion processes. IEEE Trans. Pattern Anal. Machine Intell., 26:1020{1036, Aug. 2004. [56] G. Gilboa, N. Sochen, and Y. Y. Zeevi. Variational denoising of partl tex- tured images by spatially varying constraints. IEEE Trans. Image Processing, 15:2281{2289, Aug. 2006. [57] P. Golle, K. Leyton-Brown, and I. Mironov. Incentive for sharing in peer- to-peer networks. In Proc. of the ACM Conference on Electronic Commerce, 2001. [58] O. Guleryuz. Weighted averaging for denoising with overcomplete dictionaries. IEEE Trans. Image Processing, 16(12):3020{3034, 2007. [59] M. Gupta, P. Judge, and M. Ammar. A reputation system for peer-to-peer networks. In Proc. of the ACM 13th International Workshop on Network and Operating Systems Support for Digital Audio and Video, 2003. [60] Ahsan Habib and John Chuang. Service difierentiated peer selection: An incentive mechanism for peer-to-peer media streaming. IEEE Trans. Multi- media, 8(3):610{621, 2006. 185 [61] Zhu Han, Zhu Ji, and K. J. Ray Liu. Fair multiuser channel allocation for OFDMA networks using Nash bargaining solutions and coalitions. IEEE Trans. Commun., 53:1366{1376, 2005. [62] Zhu Han, Zhu Ji, and K. J. Ray Liu. A cartel maintenance framework to en- force cooperation in wireless networks with selflsh users. IEEE Trans. Wirless Commun., 7:1889{1899, 2008. [63] H. M. Hang and J. J. Chen. Source model for transform video coder and its application - part i: Fundamental theory. IEEE Trans. Circuits Syst. Video Technol., 7:287{298, 1997. [64] Zhihai He and S. K. Mitra. A unifled rate-distortion analysis framework for tansform coding. IEEE Trans. Circuits Syst. Video Technol., 11:1221{1236, 2001. [65] X. Hei, C. Liang, J. Liang, Y. Liu, and K. Ross. A measurement study of a large-scale p2p iptv system. IEEE Trans. Multimedia, 9(8):1672{1687, 2007. [66] Y. Hel-Or and D. Shaked. A discriminative approach for wavelet shrinkage denoising. IEEE Trans. Image Processing, 17(4):443{457, 2008. [67] K. Hirakawa and T. W. Parks. Joint demosaicking and denoising. IEEE Trans. on Image Processing, 15(8):2146{2157, Aug. 2006. [68] Cheng Huang, Jin Li, and Keith W. Ross. Can internet video-on-demand be profltable? In ACM SIGCOMM, pages 133{144, 2007. [69] K. Jensen and D. Anastassiou. Subpixel edge localization and the interpolation of still images. IEEE Trans. Image Process., 4(3):285{295, Mar. 1995. [70] F. Jin, P. Fieguth, L. Winger, and E. Jernigan. Adaptive wiener flltering of noisy images and image sequences. In Proc. IEEE Int. Conf. Image Process., pages 349{352, 2003. [71] T. Karagiannis, P. Rodriguez, and K. Papagiannaki. Should internet service providers fear peer-assisted content distribution? In Proceedings of the Inter- net Measurement Conference, 2005. [72] F. Kelly. Charging and rate control for elastic tra?c. Eur. Trans. Telecom- mun., 28:33{37, 1997. [73] C. Kervrann and J. Boulanger. Optimal spatial adaptation for patch-based image denoising. IEEE Trans. Image Processing, 15(10):2866{2878, 2006. [74] C. Kervrann and J. Boulanger. Local adaptivity to variable smoothness for exemplar-based image regularization and representation. International Jour- nal of Computer Vision, 79:45{69, 2008. 186 [75] R. G. Keys. Cubic convolution interpolation for digital image processing. IEEE TranS. Acoust., Speech, Signal Process., 29(6):1153{1160, Dec. 1981. [76] L. Kontothanassis, R. Sitaraman, J. Wein, D. Hong, R. Kleinberg, B. Man- cuso, D. Shaw, and D. Stodolsky. A transport layer for live streaming in a content delivery network. IEEE Proceedings, 92:1408{1419, 2004. [77] V. Krishna. Auction Theory. Academic Press, 2002. [78] T. V. Lakshman, A. Ortega, and A. R. Reibman. VBR video: Trade-ofis and potentials. In Proc. IEEE, 1998. [79] X. Wu Lei Zhang and D. Zhang. Color reproduction from noisy cfa data of single sensor digital cameras. IEEE Trans. on Image Processing, 16(9):2184{ 2197, Sept. 2007. [80] J. Leskovec. Dynamics of large networks. PhD Dissertation, Machine Learning Department, Carnegie Mellon University, Technical report CMU-ML-08-111, 2008. [81] M. Li and T. Q. Nguyen. Markov random fleld model-based edge-directed image interpolation. IEEE Trans. Image Process., 17(7):1121{1128, Jul. 2008. [82] X. Li and M. T. Orchard. New edge-directed interpolation. IEEE Trans. Image Process., 10(10):1521{1527, Oct. 2001. [83] Z. G. Li, F. Pan, K. P. Lim, X. Lin, and S. Rahardja. Adaptive rate control for H.264. In Proc. IEEE Int. Conf. Image Processing, 2004. [84] W. S. Lin, H. V. Zhao, and K. J. R. Liu. Incentive cooperation strategies for peer-to-peer live multimedia streaming social networks. IEEE Trans. on Multimedia, 11(3):396{412, 2009. [85] M. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proc. International Conference on Machine Learning, 1994. [86] Yong Liu, Yang Guo, and Chao Liang. A survey on peer-to-peer video stream- ing systems. Journal of Peer-to-Peer Networking and Applications, 1:18{28, 2008. [87] H. V. Madhyastha, T. Isdal, M. Piatek, C. Dixon, T. Anderson, A. Krishna- murthy, and A. Venkataramani. iplane: An information plane for distributed services. In 7th Symposium on Operating Systems Design and Implementation (OSDI), 2006. [88] S. Marti and H. Garcia-Molina. Limited reputation sharing in p2p systems. In Proc. of the 5th ACM Conference on Electronic Commerce, 2004. [89] S. Marti, T. J. Giuli, K. Lai, and M. Baker. Mitigating routing misbehavior in mobile ad hoc networks. In Proc. ACM MobiCom, 2000. 187 [90] P. Michiardi and R. Molva. Core: A collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks. In Proc. IFIP-Comm. and Multimedia Security Conf, 2002. [91] P. Michiardi and R. Molva. A game theoretical approach to evaluate coop- eration enforcement mechanisms in mobile ad hoc networks. In Proc. Work- shop Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOPT?03), 2003. [92] M. E. J. Newman. The spread of epidemic disease on networks. In Physical Review E, 2002. [93] Martin A. Nowak and Karl Sigmund. Evolution of indirect reciprocity. Nature, 437:1291{1298, 2005. [94] Hisashi Ohtsuki, Yoh Iwasa, and Martin A. Nowak. Indirect reciprocity pro- vides only a narrow margin for e?ciency for costly punishment. Nature, 457:79{82, 2009. [95] A. Ortega and K. Ramchandran. Rate-distortion techniques in image and video compression. IEEE Signal Process. Mag., 15:25{50, 1998. [96] M. J. Osborne and A. Rubinste. A Course in Game Theory. The MIT Press, 1994. [97] Martin J. Osborne. An introduction to game theory. Oxford University Press, 2004. [98] H. Park and M. van der Schaar. Bargaining strategies for networked mul- timedia resource management. IEEE Trans. Signal Process., 55:3496{3511, 2007. [99] A. Pizurica, W. Philips, I. Lemahieu, and M. Acheroy. A joint inter and intrascale statistical model for bayesian wavelet based image denoising. IEEE Trans. Image Processing, 11(5):545{557, 2002. [100] Darshan Purandare and Ratan Guha. An alliance based peering scheme for p2p live media streaming. IEEE Trans. Multimedia, 9(8):1633{1644, 2007. [101] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, 1994. [102] P. Raghavan R. Kumar, J. Novak and A. Tomkins. On the bursty evolution of blogspace. In Proceedings of the 11th international conference on World Wide Web, page 568576, 2003. [103] S. Rajagopalan D. Sivakumar A. Tomkins R. Kumar, P. Raghavan and E. Up- fal. Stochastic models for the web graph. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, 2000. 188 [104] D. Ray. A Game-Theoretic Perspective on Coalition Formation. Oxford Uni- versity Press, 2007. [105] W. Rhee and J. M. Cio?. Increase in capacity of multiuser OFDM system using dynamic subchannel allocation. In Proc. IEEE Veh. Technol. Conf., 2000. [106] Cem U. Saraydar, Narayan B. Mandayam, and David J. Goodman. E?cient power control via pricing in wireless data networks. IEEE Trans. Commun., 50:291{303, 2002. [107] S. Seetharaman and M. Ammar. Characterizing and mitigating inter-domain policy violations in overlay routes. In Proc. of IEEE International Conference on Network Protocols, 2006. [108] L. Shapley. Stochastic games. In Proceedings of the National Academy of Sciences of USA, page 10951100, 1953. [109] C. Shen and M. van der Schaar. Optimal resource allocation for multimedia applications over multiaccess fading channels. IEEE Trans. Wirless Commun., 7:3546{3557, 2008. [110] J. Maynard Smith. Evolutionary and the theory of games. Cambridege Uni- versity Press, 1982. [111] S. M. Smith and J. M. Brady. Susan - a new approach to low level image processing. International Journal of Computer Vision, 23(1):45{78, 1997. [112] V. Srinivasan, P. Nuggehalli, C. F. Chiasserini, and R. R. Rao. Cooperation in wireless ad hoc networks. In Proc. IEEE INFOCOM, 2003. [113] C. Stein. Estimation of the mean of a multivariate normal distribution. In Ann. Statist., pages 1135{1151, 1981. [114] K. Stuhlmuller, N. Farber, M. Link, and B. Girod. Analysis of video transmis- sion over lossy channels. IEEE J. Sel. Areas Commun., 18:1012{1032, 2000. [115] Y. Sun, W. Yu, Z. Han, and K. J. R. Liu. Information theoretic framework of trust modelling and evaluation for ad hoc networks. IEEE Journal of Se- lected Areas in Communications, special issue on Security in Wireless Ad Hoc Networks, 24:305{317, 2006. [116] Ryan W. Thomas, Daniel H. Friend, Luiz A. DaSilva, and Allen B. MacKen- zie. Cognitive networks: Adaptation and learning to achieve end-to-end per- formance objectives. IEEE Commun. Mag., 44:51{57, December 2006. [117] C. Tomasi and R. Manduchi. Bilateral flltering for gray and color images. In Proc. of the Sixth Int. Conf. on Computer vision, pages 839{846, 1998. 189 [118] A. Urpi, M. Bonuccelli, and S. Giordano. Modeling cooperation in mobile ad hoc networks: A formal description of selflshness. In Proc. Workshop Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOPT?03), 2003. [119] W. Vickrey. Counterspeculation, auctions, and competitive sealed tenders. The Journal of Finance, 16(1):8{37, 1961. [120] D.V.D. Ville, M. Nachtegael, D.V.D. Weken, E. E. Kerre, W. Philips, and I. Lemahieu. Noise reduction by fuzzy image flltering. Aug. 2004. [121] V. Vishumurthy, S. Chandrakumar, and E.G. Sirer. Karma: A secure eco- nomic framework for peer-to-peer resource sharing. In Proc. of the 2003 Work- shop on Economics of Peer-to-Peer Systems, 2003. [122] B. Wang, K. J. R. Liu, and T. C. Clancy. Evolutionary cooperative spectrum sensing game: How to collaborate? IEEE Trans. Commun., 58(3):890{900, 2010. [123] B. Wang, Y. Wu, and K. J. R. Liu. Game theory for cognitive radio networks: An overview. Computer Networks, 54(14):2537{2561, Oct. 2010. [124] M. Wang and M. van der Schaar. Rate-distortion modeling for Wavelet video coders. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, 2005. [125] D. J. Watts and S. H. Strogatz. Collective dynamics of ?small-world? networks. Nature, 393:440{442, 1998. [126] J. Woods and C. Radewan. Kalman flltering in two dimensions. IEEE Trans. Inform. Theory, 23:473{482, Jul. 1977. [127] Di Wu, Chao Liang, Yong Liu, and Keith W. Ross. View-upload decoupling: A redesign of multi-channel p2p video systems. In IEEE Conference on Com- puter Communications (IEEE INFOCOM ?09), Mini-Conference, 2009. [128] Min Xiao and Debao Xiao. Understanding peer behavior and designing incen- tive mechanism in peer-to-peer networks: An analytical model based on game theory. In Proc. of International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), 2007. [129] Haiyong Xie, Y. Richard Yang, Arvind Krishnamurthy, Yanbin Liu, and Avi Siberschatz. P4p: Provider portal for applications. In ACM SIGCOMM, pages 351{362, 2008. [130] W. Yu and K. J. Ray Liu. Attack-resistant cooperation stimulation in au- tonomous ad hoc networks. IEEE J. Sel. Areas Commun., 23:2260{2271, Dec. 2005. 190 [131] W. Yu and K. J. Ray Liu. Game theoretic analysis of cooperation stimulation and security in antonomous mobile ad hoc networks. IEEE Trans. Mobile Comput., 6:459{473, May 2007. [132] L. Zhang and X. Wu. Image interpolation via directional flltering and data fusion. IEEE Trans. Image Process., 15(8):2226{2238, Aug. 2006. [133] X. Zhang, J. Liu, B. Li, and T-S. P. Yum. Coolstreaming: A data-driven overlay network for peer-to-peer live media streaming. In IEEE Conference on Computer Communications (INFOCOM), pages 2102{2111, 2005. [134] X. Zhang, A. Vetro, Y. Q. Shi, and H. Sun. Constant quality constrained rate allocation for FGS video coded bitstreams. In Proc. SPIE Conf. on Visual Comm. and Image Processing (VCIP), 2002. [135] X. Zhang and X. Wu. Image interpolation by adaptive 2-d autoregressive mod- eling and soft-decision estimation. IEEE Trans. Image Process., 17(6):887{ 896, June 2008. [136] H. Zhao, W.S. Lin, and K. J. R. Liu. Behavior modeling and forensics for multimedia social networks: A case study in multimedia flngerprinting. IEEE Signal Process. Mag., 26(1):118{139, Jan. 2009. [137] X. J. Zhao, Y. W. He, S. Q. Yang, and Y. Z. Zhong. Rate allocation of equal image quality for MPEG-4 FGS video streaming. In Proc. Packet Video (PV), 2002. [138] S. Zhong, J. Chen, and Y. R. Yang. Sprite: A simple, cheat-proof, credit-based system for mobile ad-hoc networks. In Proc. IEEE INFOCOM, 2003. [139] S. Zhong, L. Li, Y. G. Liu, and Y. R. Yang. On designing incentive-compatible routing and forwarding protocols in wireless ad-hoc networks - an integrated approach using game theoretical and crytographic techniques. In Proc. ACM MobiCom, 2005. 191