ABSTRACT
 Title of dissertation: MULTIMEDIA SOCIAL NETWORKS:
 GAME THEORETIC MODELING
 AND EQUILIBRIUM ANALYSIS
 Yan Chen, Doctor of Philosophy, 2011
 Dissertation directed by: Professor K. J. Ray Liu
 Department of Electrical and Computer Engineering
 Multimedia content sharing and distribution over multimedia social networks
 is more popular now than ever before: we download music from Napster, share our
 images on Flickr, view user-created video on YouTube, and watch peer-to-peer tele-
 vision using Coolstreaming, PPLive and PPStream. Within these multimedia social
 networks, users share, exchange, and compete for scarce resources such as multime-
 dia data and bandwidth, and thus in uence each other?s decision and performance.
 Therefore, to provide fundamental guidelines for the better system design, it is
 important to analyze the users? behaviors and interactions in a multimedia social
 network, i.e., how users interact with and respond to each other.
 Game theory is a mathematical tool that analyzes the strategic interactions
 among multiple decision makers. It is ideal and essential for studying, analyzing,
 and modeling the users? behaviors and interactions in social networking. In this
 thesis, game theory will be used to model users? behaviors in social networks and
 analyze the corresponding equilibria. Speciflcally, in this thesis, we flrst illustrate
how to use game theory to analyze and model users? behaviors in multimedia social
 networks by discussing the following three difierent scenarios. In the flrst scenario,
 we consider a non-cooperative multimedia social network where users in the social
 network compete for the same resource. We use multiuser rate allocation social
 network as an example for this scenario. In the second scenario, we consider a
 cooperative multimedia social network where users in the social network cooperate
 with each other to obtain the content. We use cooperative peer-to-peer streaming
 social network as an example for this scenario. In the third scenario, we consider
 how to use the indirect reciprocity game to stimulate cooperation among users. We
 use the packet forwarding social network as an example.
 Moreover, the concept of \multimedia social networks" can be applied into
 the fleld of signal and image processing. If each pixel/sample is treated as a user,
 then the whole image/signal can be regarded as a multimedia social network. From
 such a perspective, we introduce a new paradigm for signal and image processing,
 and develop generalized and unifled frameworks for classical signal and image prob-
 lems. In this thesis, we use image denoising and image interpolation as examples
 to illustrate how to use game theory to re-formulate the classical signal and image
 processing problems.
MULTIMEDIA SOCIAL NETWORKS: GAME THEORETIC
 MODELING AND EQUILIBRIUM ANALYSIS
 by
 Yan Chen
 Dissertation submitted to the Faculty of the Graduate School of the
 University of Maryland, College Park in partial fulflllment
 of the requirements for the degree of
 Doctor of Philosophy
 2011
 Advisory Committee:
 Professor K. J. Ray Liu, Chair/Advisor
 Professor Min Wu
 Professor Rama Chellappa
 Professor Shuvra S. Bhattacharyya
 Professor Lawrence C. Washington
c Copyright by
 Yan Chen
 2011
Dedication
 To my family.
 ii
Acknowledgments
 This thesis is the outcome of a wonderful four-year study and research expe-
 rience in Signal and Information Group (SIG) at University of Maryland, College
 Park.
 Here, I would like to take this opportunity to express my sincere apprecia-
 tion and gratitude to my advisor, Prof. K. J. Ray Liu, for his guidance, patience,
 understanding, and encouragement. During these four years, he inspired me and
 gave me freedom to work on many exciting cutting-edge research topics. His vision,
 creativeness, and enthusiasm not only have fueled my current research but also have
 a lasting in uence in my professional and personal development as a role model.
 I would like to thank Prof. Min Wu and Prof. Rama Chellappa for not only
 serving on my thesis committes and reviewing my thesis but also serving as references
 for my faculty application. I am also grateful to Prof. Shuvra S. Bhattacharyya and
 Prof. Lawrence C. Washington for their time and efiort serving on my committees
 and reviewing my thesis. I would also like to thank Prof. Oscar Au, Prof. Andre L.
 Tits, and Prof. Peter Cramton for serving as references for my faculty application.
 Thanks should also go to all members in our Signals and Information Group
 for their friendship, encouragement, and help. I have learned a lot from them for
 which I am deeply grateful. Special thanks go to Dr. Beibei Wang, Dr. Yongle Wu,
 Dr. Wan-Yi Lin, Mr. Matt Stamm, Dr. Steve Tjoa, and Dr. Quoc Lai for our
 research discussions that were all inspiring for me.
 My deepest gratitude goes to my wife Yang. She is always there for me,
 iii
giving me her love and support. We share all things happened in life, and discuss
 the problems on research. I would also like to thank my father, my sister and my
 parents in law, for their constant love, comfort, support and encouragement. Finally,
 I would like to dedicate my thesis to my mother. I really hope that my thesis would
 make her proud. I would never forget her, and she is always in my heart.
 iv
Table of Contents
 List of Tables viii
 List of Figures ix
 1 Introduction 1
 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
 1.2 Related Works on Social Networks . . . . . . . . . . . . . . . . . . . 3
 1.3 Dissertation Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
 1.3.1 An Overview of Game Theory (Chapter 2) . . . . . . . . . . . 7
 1.3.2 Multiuser Rate Allocation Social Networks (Chapter 3) . . . . 7
 1.3.3 Peer-to-Peer Cooperative Video Streaming Social Networks
 (Chapter 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
 1.3.4 Cooperation Stimulation Using Indirect Reciprocity Game Mod-
 eling (Chapter 5) . . . . . . . . . . . . . . . . . . . . . . . . . 9
 1.3.5 Image Denoising Games (Chapter 6) . . . . . . . . . . . . . . 10
 1.3.6 Simultaneous Image Denoising and Interpolation Using Evo-
 lutionary Games (Chapter 7) . . . . . . . . . . . . . . . . . . 11
 2 An Overview of Game Theory 13
 2.1 Nash Equilibrium and Pareto Optimality . . . . . . . . . . . . . . . . 14
 2.2 Auction Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
 2.3 Evolutionary Games . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
 2.4 Coalition Formation Games . . . . . . . . . . . . . . . . . . . . . . . 18
 2.5 Stochastic Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
 3 Multiuser Rate Allocation Social Networks 22
 3.1 The Game-Theoretic Framework . . . . . . . . . . . . . . . . . . . . . 26
 3.1.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
 3.1.2 Video Distortion-Rate Model . . . . . . . . . . . . . . . . . . 27
 3.1.3 User?s Utility Function . . . . . . . . . . . . . . . . . . . . . . 28
 3.1.4 Multi-User Rate Allocation Game . . . . . . . . . . . . . . . . 31
 3.2 Analysis of The Multi-User Rate Allocation Game . . . . . . . . . . . 32
 3.2.1 Non-E?cient Rate Allocation (a > a0) . . . . . . . . . . . . . 32
 3.2.2 E?cient Rate Allocation (0 ? a < a0) . . . . . . . . . . . . . . 33
 3.2.3 E?cient and Proportionally Fair in Both Utility and PSNR
 (a = a0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
 3.3 Relation To The Traditional Optimization-Based Approach . . . . . . 37
 3.4 Clock Auction For Distributed Cheat-Proof Optimal Rate Allocation 38
 3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
 3.5.1 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . 46
 3.5.2 Multi-User Rate Allocation . . . . . . . . . . . . . . . . . . . 49
 3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
 v
4 Peer-to-Peer Cooperative Video Streaming Social Networks 55
 4.1 The System Model and Utility Function . . . . . . . . . . . . . . . . 59
 4.1.1 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
 4.1.2 Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . 61
 4.2 Agents Selection Within A Homogeneous Group . . . . . . . . . . . . 64
 4.2.1 Centralized Agent Selection . . . . . . . . . . . . . . . . . . . 64
 4.2.2 Distributed Agent Selection . . . . . . . . . . . . . . . . . . . 65
 4.2.3 Evolutionary Cooperative Streaming Game . . . . . . . . . . . 66
 4.2.4 Analysis of the Cooperative Streaming Game . . . . . . . . . 67
 4.3 Agents Selection Within A Heterogeneous Group . . . . . . . . . . . 74
 4.3.1 Two-Player Game . . . . . . . . . . . . . . . . . . . . . . . . . 75
 4.3.2 Multi-Player Game . . . . . . . . . . . . . . . . . . . . . . . . 77
 4.4 A Distributed Learning Algorithm For ESS . . . . . . . . . . . . . . . 78
 4.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
 4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
 5 Cooperation Stimulation Using Indirect Reciprocity Game Modeling 89
 5.1 The System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
 5.1.1 Social Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
 5.1.2 Action Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
 5.2 Optimal Action Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
 5.2.1 Reputation Updating Policy . . . . . . . . . . . . . . . . . . . 98
 5.2.2 Stationary Reputation Distribution . . . . . . . . . . . . . . . 101
 5.2.3 Payofi Function . . . . . . . . . . . . . . . . . . . . . . . . . . 102
 5.2.4 Optimal Action Using An Alternative Algorithm . . . . . . . 104
 5.3 Action Spreading Due To Natural Selection . . . . . . . . . . . . . . 105
 5.3.1 Action Spreading Algorithm Using Wright Fisher Model . . . 106
 5.3.2 Action Spreading Algorithm Using Replicator Dynamic Equa-
 tion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
 5.4 Evolutionarily Stable Strategy and Simulations . . . . . . . . . . . . 108
 5.4.1 Binary Reputation Scenario . . . . . . . . . . . . . . . . . . . 109
 5.4.2 Multi-Level Reputation Scenario . . . . . . . . . . . . . . . . . 113
 5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
 6 Image Denoising Games 119
 6.1 The System Model and Coalition Formation Game . . . . . . . . . . 122
 6.1.1 The System Model . . . . . . . . . . . . . . . . . . . . . . . . 122
 6.1.2 The Coalition Formation Game . . . . . . . . . . . . . . . . . 123
 6.2 Candidate Set Selection . . . . . . . . . . . . . . . . . . . . . . . . . 125
 6.3 Game Theoretical Problem Formulation . . . . . . . . . . . . . . . . 128
 6.3.1 Stein?s unbiased risk estimate (SURE) . . . . . . . . . . . . . 128
 6.3.2 Confldence and Distortion Trade-ofi . . . . . . . . . . . . . . . 132
 6.3.2.1 Confldence . . . . . . . . . . . . . . . . . . . . . . . 132
 6.3.2.2 Distortion . . . . . . . . . . . . . . . . . . . . . . . . 133
 6.3.2.3 Confldence and Distortion Trade-ofi . . . . . . . . . 133
 vi
6.3.3 Utility Function and Solution to the Game . . . . . . . . . . . 134
 6.4 Relation to the Traditional Approaches . . . . . . . . . . . . . . . . . 139
 6.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
 6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
 7 Simultaneous Image Denoising and Interpolation Using Evolutionary Games151
 7.1 Evolutionary Game Model . . . . . . . . . . . . . . . . . . . . . . . . 156
 7.2 Image Denosing and Interpolation as an Evolutionary Game . . . . . 158
 7.2.1 Problems of Denoising+Interpolation . . . . . . . . . . . . . . 158
 7.2.2 Game Theoretic Formulation . . . . . . . . . . . . . . . . . . 160
 7.2.3 Pure Strategy Set ?ij . . . . . . . . . . . . . . . . . . . . . . . 162
 7.2.4 Payofi Function . . . . . . . . . . . . . . . . . . . . . . . . . . 163
 7.3 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
 7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
 8 Conclusions and Future Work 177
 8.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
 8.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
 Bibliography 182
 vii
List of Tables
 3.1  ?i , fl?i , Rmini (kb=s), and Rmaxi (kb=s) for difierent sequence by training. 47
 4.1 Utility table of a two-player game. . . . . . . . . . . . . . . . . . . . . 74
 viii
List of Figures
 3.1 System Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
 3.2 The visual quality of Foreman sequence at difierent PSNR level: (a)
 33dB; (b) 34dB; (c) 40dB; (d) 41dB. . . . . . . . . . . . . . . . . . . 30
 3.3 Illustration for the proof of Lemma 2: (a) If ~Rit ? Rti andRL1i > Rmini ,
 we can see that [ln( i + fliRL1i ) ? @ ln( i+fliRi)@Ri jRi=RL1i (R
 L1
 i ? RL2i )] ?
 ln( i + fliRL2i ); (b) If ~Ri
 t > Rti and RL1i < Rmaxi , we can see that
 ln( i + fliRL1i ) ? [ln( i + fliRL2i )? @ ln( i+fliRi)@Ri jRi=RL1i (R
 L2
 i ?RL1i )]. . . 43
 3.4 Training  and fl: (a) Football; (b) Coastguard. . . . . . . . . . . . . 48
 3.5 Allocated rates for Akiyo, Carphone, Coastguard, Foreman, Table,
 Football, and Mobile using difierent methods. . . . . . . . . . . . . . 49
 3.6 The sum of PSNR vs. the available network bandwidth R. . . . . . . 50
 3.7 Cheat-proof performance: (a) AFD, AFR, and MSPSNR; (b) Pro-
 posed Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
 4.1 A cooperative streaming example. . . . . . . . . . . . . . . . . . . . . 60
 4.2 The deceasing property of wi. . . . . . . . . . . . . . . . . . . . . . . 68
 4.3 The social welfare comparison among Non-Coop, MSW-C, MSW-
 D, and ESS-D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
 4.4 Behavior dynamic of a homogeneous group of peers. . . . . . . . . . . 82
 4.5 Behavior dynamic of a heterogeneous group of peers. . . . . . . . . . 83
 4.6 The probability of real-time streaming comparison between Non-
 Coop and ESS-D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
 4.7 The visual quality comparison: (a) Non-Coop; (b) ESS-D with
 N=2; (c) ESS-D with N=3; (d) ESS-D with N=4. . . . . . . . . . . 85
 4.8 Single-source rate comparison between Non-Coop and ESS-D. . . . 86
 4.9 Multi-source rate comparison between Non-Coop and ESS-D. . . . 87
 4.10 The social welfare comparison between Non-Coop and ESS-D when
 the utility function is deflned as (4.42). . . . . . . . . . . . . . . . . . 88
 5.1 System model. Within every interaction, a pair of transmitter and re-
 ceiver is randomly sampled from the population. Then, the transmit-
 ter will forward a certain amount of packets to the receiver according
 to the receiver?s and his/her own reputations. After the transmission,
 the transmitter?s reputation will be updated by the receiver and the
 observers. Finally, the transmitter?s reputation is propagated to the
 whole population from the receiver and the observers through a noisy
 gossip channel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
 5.2 Reputation updating policy. . . . . . . . . . . . . . . . . . . . . . . . 98
 5.3 The population evolution when L = 1, g = 1 and c = 0:1: (a)
 the percentage of the population with reputation L = 1; (b) the
 percentage of the population using optimal action shown in (5.24). . . 108
 ix
5.4 The stable region for the optimal action rule shown in (5.24) when
 L = 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
 5.5 The population evolution when L = 1, g = 1 and c = 0:6: (a)
 the percentage of the population with reputation L = 1; (b) the
 percentage of the population using optimal action shown in (5.24). . . 111
 5.6 The stable region for the optimal action rule shown in (5.26) when
 L = 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
 5.7 The population evolution when L = 4, g = 1 and c = 0:5: (a)
 the percentage of the population with reputation L = 4; (b) the
 percentage of the population using optimal action shown in (5.26). . . 114
 5.8 The population evolution when L = 4, g = 1 and c = 0:7: (a)
 the percentage of the population with reputation L = 4; (b) the
 percentage of the population using optimal action shown in (5.26). . . 115
 6.1 An example of optimal candidate set with an edge region: (a) original
 image; (b) noisy image with  = 15; (c) noisy image with  = 25;
 (d) noisy image with  = 35; (e) the candidate set used by nonlocal;
 (f) the ideally optimal candidate set of (b) when the original signal
 is available; (g) the ideally optimal candidate set of (c) when the
 original signal is available; (h) the ideally optimal candidate set of
 (d) when the original signal is available. . . . . . . . . . . . . . . . . 127
 6.2 An example of optimal candidate set with a smooth region: (a) origi-
 nal image; (b) noisy image with  = 15; (c) noisy image with  = 25;
 (d) noisy image with  = 35; (e) the candidate set used by nonlocal;
 (f) the ideally optimal candidate set of (b) when the original signal is
 available; (g) the ideally optimal candidate set of (c) when the orig-
 inal signal is available; (h) the ideally optimal candidate set of (d)
 when the original signal is available. . . . . . . . . . . . . . . . . . . . 129
 6.3 The optimal m? for lena image when  = 10. . . . . . . . . . . . . . . 130
 6.4 The trade-ofi between the confldence term C and the distortion term
 D: (a) the performance of C with difierent N ; (b) the performance
 of D with difierent N . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
 6.5 Some possible gain functions. . . . . . . . . . . . . . . . . . . . . . . 134
 6.6 The four tested images: (a) Lena; (b) Barbara; (c) Boat; (d) Flinstones.135
 6.7 An example of optimal candidate set with an edge region: (a) original
 image; (b) noisy image with  = 15; (c) noisy image with  = 25; (d)
 noisy image with  = 35; (e) the candidate set used by nonlocal; (f)
 (g) and (h) are the ideally optimal candidate sets of (b) (c) and (d)
 when the original signal is available; (i) (j) and (k) are the candidate
 sets of (b) (c) and (d) generated by the proposed method. . . . . . . 144
 x
6.8 An example of optimal candidate set with a smooth region: (a) origi-
 nal image; (b) noisy image with  = 15; (c) noisy image with  = 25;
 (d) noisy image with  = 35; (e) the candidate set used by nonlocal;
 (f) (g) and (h) are the ideally optimal candidate sets of (b) (c) and (d)
 when the original signal is available; (i) (j) and (k) are the candidate
 sets of (b) (c) and (d) generated by the proposed method. . . . . . . 145
 6.9 The PSNR comparison for difierent images: (a) Lena; (b) Barbara;
 (c) Boat; (d) Flinstones. . . . . . . . . . . . . . . . . . . . . . . . . . 146
 6.10 The visual quality comparison for Flinstones with  = 25: (a) origi-
 nal image; (b) noisy image; (c) the result generated by the nonlocal
 method; (d) the result generated by the proposed approach. . . . . . 147
 6.11 The visual quality comparison for Barbara with  = 35: (a) origi-
 nal image; (b) noisy image; (c) the result generated by the nonlocal
 method; (d) the result generated by the proposed approach. . . . . . 148
 6.12 The visual quality comparison for Lena with  = 45: (a) original im-
 age; (b) noisy image; (c) the result generated by the nonlocal method;
 (d) the result generated by the proposed approach. . . . . . . . . . . 149
 6.13 The visual quality comparison for Boat with  = 20: (a) original im-
 age; (b) noisy image; (c) the result generated by the nonlocal method;
 (d) the result generated by the proposed approach. . . . . . . . . . . 150
 7.1 (a) a region of Lena image that is corrupted with Gaussian noise with
 noise variance  2 = 25; (b) the interpolation result using bicubic; (c)
 the interpolation result using SAI. . . . . . . . . . . . . . . . . . . . . 153
 7.2 (a) Denoising before interpolation; (b) Simultaneous denoising and
 interpolation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
 7.3 (a) a region of Lena image that is corrupted by Gaussian noise with
 noise variance  2 = 225; (b) the denoised result using nonlocal; (c)
 the interpolation result of (b) using bicubic; (d) the interpolation
 result of (b) using SAI. . . . . . . . . . . . . . . . . . . . . . . . . . . 159
 7.4 The pure strategy set ?ij for Ih(2m?1; 2n?1); Ih(2m?1; 2n); Ih(2m; 2n?
 1); Ih(2m; 2n). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
 7.5 The block diagram of the proposed method. . . . . . . . . . . . . . . 165
 7.6 The test images: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07; (e)
 Kodim09. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
 7.7 The PSNR comparison: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07;
 (e) Kodim09. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
 7.8 The visual quality comparison for Lena: (a) the noisy low resolution
 image with  = 10; (b) the result generated by Nonlocal+Bicubic
 (30.71dB); (c) the result generated by Nonlocal+SAI method (31.11dB);
 (d) the result generated by the proposed method (31.38dB). . . . . . 172
 7.9 The visual quality comparison for Boat: (a) the noisy low resolu-
 tion image with  = 5; (b) the result generated by Nonlocal+Bicubic
 (28.52dB); (c) the result generated by Nonlocal+SAI method (28.92dB);
 (d) the result generated by the proposed method (28.74dB). . . . . . 173
 xi
7.10 The visual quality comparison for Kodim07: (a) the noisy low resolu-
 tion image with  = 10; (b) the result generated by Nonlocal+Bicubic
 (29.85dB); (c) the result generated by Nonlocal+SAI method (30.18dB);
 (d) the result generated by the proposed method (30.35dB). . . . . . 174
 7.11 The visual quality comparison for Kodim09: (a) the noisy low resolu-
 tion image with  = 15; (b) the result generated by Nonlocal+Bicubic
 (29.56dB); (c) the result generated by Nonlocal+SAI method (29.81dB);
 (d) the result generated by the proposed method (30.41dB). . . . . . 175
 7.12 The visual quality comparison for Kodim03: (a) the noisy low resolu-
 tion image with  = 20; (b) the result generated by Nonlocal+Bicubic
 (27.85dB); (c) the result generated by Nonlocal+SAI method (28.03dB);
 (d) the result generated by the proposed method (28.37dB). . . . . . 176
 xii
Chapter 1
 Introduction
 1.1 Motivation
 Multimedia content sharing and distribution over multimedia social networks
 is more popular now than ever before: we download music from Napster [3], share
 our images on Flickr [2], view user-created video on YouTube [9], and watch peer-
 to-peer television using Coolstreaming [133], PPLive [4] and PPStream [5]. Within
 these multimedia social networks, users share, exchange, and compete for scarce
 resources such as multimedia data and bandwidth, and thus in uence each other?s
 decision and performance. Therefore, to provide fundamental guidelines for the
 better system design, it is important to analyze the users? behaviors and interactions
 in a multimedia social network, i.e., how users interact with and respond to each
 other.
 Unlike generic data applications, multimedia applications have time-varying
 bandwidth requirements, stringent delay deadlines and dynamic characteristics. To
 enable users in a multimedia social network to successfully participate in the resource
 competition, the uniquely scalable and delay-sensitive characteristics of multimedia
 data and the resulting impact on users viewing experiences of multimedia content
 should be explicitly involved in the system design.
 In multimedia social networks, users are intelligent and have the ability to
 1
observe, learn, and make intelligent decisions. Since users usually belong to difier-
 ent authorities and pursue difierent goals, they will choose the strategies that can
 maximize their own payofis. In such a case, traditional centralized optimization-
 based approaches no longer work well since they only consider the e?ciency of the
 whole system while totally ignore the fairness among users, which is an even more
 important issue in multimedia social networks. To better design the system, not
 only the e?ciency issue from the system designers? perspective but also the fairness
 issue from the users? perspective should be taken into accout. Moreover, since users
 in multimedia social networks are rational and thus naturally selflsh, they tend to
 over-claim what they may need and will not truly report their private information
 if cheating can improve their payofis. Therefore, enforcing truth-telling is crucial in
 multimedia social networks.
 From the above discussions, we can see that the behavior dynamics among
 users in a multimedia social network are very complex. To understand the users?
 complex behavior dynamics and thus lead to a better system design, game theory is a
 powerful mathematical tool that analyzes the strategic interactions among multiple
 decision makers [97]. It has been developed for understanding cooperation and
 con ict between individuals in many flelds such as economics, politics, business,
 social sciences and biology. Thus, game theory is ideal and essential for studying,
 analyzing, and modeling the users? behaviors and interactions in social networking.
 Recently, it draws great attentions in cognitive networking [30] [123] and multimedia
 signal processing [27]. In this thesis, we will illustrate how game theory can be used
 to model users? behaviors in various multimedia social networks and analyze the
 2
corresponding equilibria.
 1.2 Related Works on Social Networks
 A social network is a social structure made of individuals and/or organizations
 called \nodes", which are connected with each other by certain types of interdepen-
 dency, such as friendship, kinship, flnancial exchange, con ict, trade, etc. Many
 methodologies have been studied to formulate the relationships among members
 at all scales, from interpersonal to international, and social network analysis be-
 comes a popular topic in sociology, economics, information science and many other
 disciplines.
 Most of the existing works on social networks fall into the following three cat-
 egories [80]: (1) social network properties, (2) social network models, (3) social net-
 work dynamics and evolution. In [17] [43], the authors showed that the vertex con-
 nectivities in many large networks follow a scale-free power-law distribution. Such
 a property is found to be a consequence of two generic mechanisms: (i) networks
 expand continuously by the addition of new vertices, and (ii) new vertices attach
 preferentially to sites that are already well connected. Another important property
 of social networks is the \small-world" phenomenon. As pointed out in [125] [18],
 most real-world networks exhibit relatively small diameter, i.e., the networks are
 highly clustered.
 Besides the study of the social network properties, there are quite a lot of work
 on building models for social networks. The simplest model is the random graph
 3
model introduced in [37], where given a number of nodes, each pair of nodes has an
 identical and independent probability of being joined by an edge. However, since
 it fails to match the real-world social network properties, e.g., it does not produce
 power law degree distributions, this model is not realistic. A better model that can
 produce power law degree distributions is the preferential attachment [38] [44] [10],
 where when a new node u arrives to the network, the probability of connecting to
 a node v is proportional to the degree of v. Another model that can also produce
 power law degree distributions is the copying model [103], where a new node joins
 the networks by uniformly creating random edges or flrst random choosing a node
 u and then linking to u0s neighbors.
 Another important research topic in the fleld of social network is the study of
 social network dynamics and evolution where the researchers study how the social
 network evolve and how information spread over the networks. Many works have
 been done to investigate the dynamics and evolution of difierent networks, e.g.,
 trendsetters selecting in viral marketing [49], inoculation targets identiflcation in
 epidemiology [92], and studying trends in blogosphere [102].
 However, most of these existing works study and analyze the social networks at
 the macroeconomic level, i.e., from system designer?s perspective. Few efiorts have
 been made to investigate the social networks at the microeconomic level, i.e., from
 the users? perspective, which is an important issue in social network analysis since
 users may only care about their own objectives and their decisions greatly afiect the
 evolution and performance of the social networks. In this thesis, we will study and
 analyze the social networks from the users? perspective by modeling users? behaviors
 4
and interactions using game theory.
 1.3 Dissertation Outline
 Since users in difierent multimedia social networks may have difierent types of
 interdependency, to efiectively model the users? behaviors and interactions, difierent
 game models for difierent multimedia social networks should be employed. The
 two most common types of users? interdependency in multimedia social networks
 are competition and cooperation, which leads to non-cooperative social networks
 and cooperative social networks, respectively. In cooperative social networks, since
 users are rational thus naturally selflsh, they will not cooperate with others unless
 cooperation can improve their own performance. Therefore, one important issue in
 cooperative social networks is cooperation stimulation. Without loss of generality,
 in this thesis, we flrst illustrate how to use game theory to analyze and model users?
 behaviors in multimedia social networks by discussing the following three difierent
 scenarios:
 ? In the flrst scenario, we consider a non-cooperative multimedia social network
 where users in the social network compete for the same resource. We use
 multiuser rate allocation social network [29] as an example for this scenario
 and the details will be described in Chapter 3.
 ? In the second scenario, we consider a cooperative multimedia social network
 where users in the social network cooperate with each other to obtain the con-
 tent. As discussed in Chapter 4, we will use cooperative peer-to-peer streaming
 5
social network [34] as an example for this scenario.
 ? In the third scenario, we consider how to use the indirect reciprocity game
 to stimulate cooperation among users. In Chapter 5, we will use the packet
 forwarding social network [28] as an example.
 Moreover, the concept of \multimedia social networks" can be applied into
 the fleld of signal and image processing. Although there are seemingly no human
 factors involved in the algorithmic solution in classical signal/image processing, if
 we take the view that the pixels/signals of an image are forming a notion of a
 \social network" to jointly interact to accomplish a common (\processing") goal,
 be it flltering, denoising, or segmentation, then the game theoretic approach can
 ofier new views beyond what classical methods can. This completely changes the
 traditional thinking that we have to decide what a pixel does instead of simply
 giving some generic rules/guidelines and let the pixels themselves interact/cooperte
 to decide the best \strategy". From such a perspective, we introduce a new paradigm
 for signal and image processing, and develop generalized and unifled frameworks for
 classical signal and image problems. In this thesis, we use image denoising (Chapter
 6) and simultaneous image denoising and interpolation (Chapter 7) as examples to
 illustrate how to use game theory to re-formulate the classical signal and image
 processing problems. The rest of the dissertation is organized as follows.
 6
1.3.1 An Overview of Game Theory (Chapter 2)
 Since game theory has been recognized as an important tool in studying, mod-
 eling, and analyzing the interaction process among multiple decision makers, in this
 chapter, we present an overview of some fundamental concepts of game theory that
 will be used in this thesis.
 1.3.2 Multiuser Rate Allocation Social Networks (Chapter 3)
 In multiuser rate allocation problem, a set of transmitters want to transmit
 the video sequences to corresponding receivers through a common channel that is
 shared by all transmitters. Since the transmitters compete for the same resource, i.e.,
 channel bandwidth, they form a non-cooperative social network. The key problem in
 this social network is how to e?ciently and fairly allocate data rate among difierent
 users. Most of the existing optimization-based methods, such as minimizing the
 weighted sum of the distortions or maximizing the weighted sum of the peak signal-
 to-noise ratios (PSNRs), have their weights heuristically determined. Moreover,
 those approaches mainly focus on the e?ciency issue while there is no notion of
 fairness. In this chapter, we address this problem by proposing a game-theoretic
 framework, in which the utility function of each user is jointly determined by the
 characteristics of the transmitted video sequence and the allocated bit-rate. We show
 that a unique Nash equilibrium (NE), which is proportionally fair in terms of both
 utility and PSNR, can be obtained, according to which the controller can e?ciently
 and fairly allocate the available network bandwidth to the users. Moreover, we
 7
propose a distributed cheat-proof rate allocation scheme for the users to converge
 to the optimal NE using alternative ascending clock auction. We also show that the
 traditional optimization-based approach that maximizes the weighted sum of the
 PSNRs is a special case of the game-theoretic framework with the utility function
 deflned as an exponential function of PSNR. Finally, we show several experimental
 results on real video data to demonstrate the e?ciency and efiectiveness of the
 proposed method.
 1.3.3 Peer-to-Peer Cooperative Video Streaming Social Networks (Chap-
 ter 4)
 While peer-to-peer (P2P) video streaming systems have achieved promising
 results, they introduce a large number of unnecessary traverse links, which con-
 sequently leads to substantial network ine?ciency. To address this problem and
 achieve better streaming performance, we propose to enable cooperation among
 group peers, which are geographically neighboring peers with large intra-group up-
 load and download bandwidths. Considering the peers selflsh nature, we formulate
 the cooperative streaming problem as an evolutionary game and derive, for every
 peer, the evolutionarily stable strategy (ESS), which is the stable Nash equilibrium
 and no one will deviate from. Moreover, we propose a simple and distributed learn-
 ing algorithm for the peers to converge to the ESSs. With the proposed algorithm,
 each peer decides whether to be an agent who downloads data from the peers out-
 side the group or a freerider who downloads data from the agents by simply tossing
 8
a coin, where the probability of being a head for the coin is learned from the peers
 own past payofi history. Simulation results show that the strategy of a peer con-
 verges to the ESS. Compared to the traditional non-cooperative P2P schemes, the
 proposed cooperative scheme achieves much better performance in terms of social
 welfare, probability of real-time streaming, and video quality (source rate).
 1.3.4 Cooperation Stimulation Using Indirect Reciprocity Game Mod-
 eling (Chapter 5)
 In social networks, since nodes generally belong to difierent authorities and
 pursue difierent goals, they will not cooperate with others unless cooperation can
 improve their own performance. Thus, how to stimulate cooperation among nodes in
 social networks is very important. However, most of existing game-theoretic cooper-
 ation stimulation approaches rely on the assumption that the interactions between
 any pair of players are long-lasting. When this assumption is not true, according
 to the well-known Prisoners Dilemma and the backward induction principle, the
 unique Nash equilibrium (NE) is to always play non-cooperatively. In this chapter,
 we propose a cooperation stimulation scheme for the scenario where the number of
 interactions between any pair of players are flnite. The proposed algorithm is based
 on indirect reciprocity game modelling where the key concept is \I help you not
 because you have helped me but because you have helped others". We formulate
 the problem of flnding the optimal action rule as a Markov Decision Process (MDP)
 and propose a modifled value iteration algorithm to flnd the optimal action rule.
 9
Using the packet forwarding game as an example, we show that with an appropriate
 cost-to-gain ratio, the strategy of forwarding the number of packets that is equal
 to the reputation level of the receiver is an evolutionarily stable strategy (ESS). Fi-
 nally, simulations are shown to verify the e?ciency and efiectiveness of the proposed
 algorithm.
 1.3.5 Image Denoising Games (Chapter 6)
 Based on the observation that every small window in a natural image has many
 similar windows in the same image, the nonlocal denoising methods perform denois-
 ing by weighted averaging all the pixels in a nonlocal window and have achieved
 very promising denoising results. However, the use of a flxed square neighborhood
 window greatly limits the denoising performance. Therefore, an important issue
 in pixel-domain image denoising algorithms is how to adaptively choose optimal
 neighborhoods. Obviously, too large a neighborhood set may cause overly-smooth
 artifacts, while too small a neighborhood set may not be able to e?ciently reduce
 the noise variance. While the Stein?s principle is shown to be able to estimate the
 true mean square error (MSE) for determining the optimal neighborhoods, there
 exists a trade-ofi between the accuracy of the estimate and the minimum of the
 true MSE. In this chapter, we study the impact of such a trade-ofi and formulate
 the image denoising problem as a coalition formation game. In this game, every
 pixel is treated as a player, who tries to seek partners to form a coalition to achieve
 better denoising results. By forming a coalition, every player in the coalition can
 10
obtain certain gains by improving the accuracy of the Stein?s estimate, while incur-
 ring some costs by increasing the minimum of the true MSE. Moreover, we show
 that the traditional approaches using a heuristically determined neighborhood set
 are special cases of the proposed game theoretical framework by choosing the utility
 function without a cost term. Finally, experimental results show that the proposed
 game theoretic approach can achieve better performance than the nonlocal method
 in terms of both PSNR and visual quality.
 1.3.6 Simultaneous Image Denoising and Interpolation Using Evolu-
 tionary Games (Chapter 7)
 While the existing image interpolation approaches can achieve promising in-
 terpolation results, they are specially designed for the clean images. However, when
 the low resolution image is noisy, most of the existing interpolation approaches will
 also boost the noise and introduces severe visual distortions. Therefore, to achieve
 better reconstruction, we should jointly consider image denoising and interpola-
 tion. In this chapter, we study the problem of simultaneous image denoising and
 interpolation from the game theoretic perspective and formulate the problem as an
 evolutionary game. In this evolutionary game, the players are the unknown high
 resolution pixels and the pure strategies of the players are the corresponding noisy
 low resolution neighbors. By regarding the non-negative weights of the noisy low
 resolution pixels as the probabilities of selecting the pure strategies, the problem
 of estimating the high resolution pixels becomes flnding the evolutionarily stable
 11
strategies for the evolutionary game. Experimental results show that the proposed
 game theoretical approach can achieve better performance than the methods that
 flrst denoise the noisy low resolution image and then interpolate the denoised image,
 in terms of both PSNR and visual quality.
 12
Chapter 2
 An Overview of Game Theory
 Game theory [97] is a mathematical tool that analyzes the strategic interac-
 tions among multiple decision makers. Its history dates back to 1944 when J. von
 Neumann and O. Morgenstern publish the book Theory of Games and Economic
 Behavior. In this book, von Neumann and Morgenstern introduced the method
 of flnding mutually consistent solutions for two-person zero-sum games, which lays
 the foundation of game theory. During the late 1940s, cooperative game theory
 had been studied to analyze how groups of individuals should cooperate with each
 other to improve their positions in a game. In early 1950s, J. Nash developed an
 important criterion, known as Nash equilibrium, to characterize mutually optimal
 strategies of players. This concept is applicable to non-zero-sum games, and thus
 is more general than the criterion proposed by von Neumann and Morgenstern and
 marks a quantum leap forward in the development of non-cooperative game theory.
 During the 1950s, many important concepts of game theory were developed, such as
 the concepts of the core, the extensive form games, repeated games, and the Shapley
 value. Reflnement of Nash equilibriums and the concepts of complete information
 and Bayesian games were proposed in the 1960s. Application of game theory to biol-
 ogy, i.e., the evolutionary game theory, was introduced by J. M. Smith in the 1970s,
 during which time, the concepts of correlated equilibrium and common knowledge
 13
were introduced by R. Aumann. In nowadays, game theory has been widely rec-
 ognized as an important tool in many flelds, such as economics, politics, business,
 social sciences, biology, computer science, and engineering, for understanding co-
 operation and con ict between individuals. In this chapter, we will present a brief
 overview on some fundamental concepts of game theory that will be used in this
 thesis to model and analyze users? behaviors and interaction in multimedia social
 networks. For more extensive concepts of game theory, the readers are referred to.
 2.1 Nash Equilibrium and Pareto Optimality
 A strategic game hN; (Ai); (ui)i consists of three components: a set of players,
 denoted by N ; a set of actions, denoted by Ai for player i; and payofi functions,
 denoted by ui : A ! R for player i, where A = ?i2NAi is the action set of all
 players. Generally, one player?s payofi depends on not only his/her own action,
 but also other players? actions, and hence there is a strategic interaction between
 players.
 Nash equilibrium is the key concept to understand non-cooperative game the-
 ory. Informally speaking, it is an equilibrium where everyone plays the best strategy
 while taking others? decisions into account. Mathematically, a? is a Nash equilibrium
 if for every player i 2 N ,
 ui(a?i ; a??i) ? ui(ai; a??i); 8ai 2 Ai; (2.1)
 where ai denotes the strategy of player i and a?i is a common notation in game
 theory representing the strategies of all players other than player i. Therefore, Nash
 14
equilibrium predicts the outcome of a game when all players are rational. Depending
 on whether players choose a single action or randomize over a set of actions according
 to some probability distribution, an equilibrium can be classifled as the pure-strategy
 Nash equilibrium or the mixed-strategy Nash equilibrium.
 Pareto optimality is a strategy proflle at which no single player can improve
 his/her own payofi without hurting any other player. Speciflcally, let u be a vector
 composed of payofis in one particular game outcome. Then, u is Pareto e?cient if
 there is no u0 of another game outcome for which u0i > ui for all i 2 N ; u is strongly
 Pareto e?cient if there is no u0 for which u0i ? ui for all i 2 N and u0i > ui for some
 i 2 N . The Pareto frontier is deflned as the set of all u that are Pareto e?cient.
 2.2 Auction Games
 Auction theory [77] is an applied branch of game theory which analyzes inter-
 actions in auction markets. An auction, conducted by an auctioneer, is a process
 of buying and selling products by eliciting bids from potential buyers (i.e., bidders)
 and deciding the auction outcome based on the bids and auction rules. The rules of
 auction, or auction mechanisms, determine whom the goods are allocated to (i.e.,
 the allocation rule) and how much they have to pay (i.e., the payment rule).
 The well-known four basic forms of auctions are: English auction, Dutch auc-
 tion, Second-price (sealed-bid) auction, and First-price (sealed-bid) auction. English
 auction is a sequential auction where price increases round by round from a low
 starting price until only one bidder is left, who wins the product and pays his/her
 15
bid. Dutch auction is another sequential auction where price decreases round by
 round from a high starting price until one bidder accepts the price, who wins the
 product and pays the price at acceptance. Second-price (sealed-bid) auction is the
 auction where each bidder submits a bid in a sealed envelope simultaneously, and
 the highest bidder wins the product with payment equal to the second highest bid.
 First-price (sealed-bid) auction is the auction where each bidder submits a bid in
 a sealed envelope simultaneously, and the highest bidder wins the product with
 payment equal to his/her own bid.
 Although the four basic auctions appear quite difierent at flrst glance, they
 are actually equivalent in some sense under certain conditions [119]. As established
 in [119] by William Vickrey, a Nobel laureate in Economics, the English auction
 is equivalent to the second-price sealed-bid auction under the private values model
 while the Dutch auction is equivalent to the flrst-price sealed-bid auction since for
 every strategy in the flrst-price auction, there is an equivalent strategy in the Dutch
 auction and vice versa; and given symmetric and risk-neutral bidders and private
 values, all four auctions yield the same expected revenue of the seller. Therefore, it
 will su?ce to study or adopt only one kind of auction out of the four basic forms.
 An auction becomes more complicated when more than one item are simul-
 taneously sold and bidders bid for \packages" of products instead of individual
 products. This is known as the combinatorial auction [40]. One possible approach
 is the Vickrey-Clarke-Groves (VCG) mechanism, which is the generalized version
 of the second-price mechanism. The basic idea is that the allocation of products
 maximizes the social welfare and each winner in the auction pays the opportunity
 16
cost that their presence introduces to all the other bidders. Another approach is
 the alternative ascending clock auction proposed in [16], where the basic idea is to
 awarded the items to bidders at the price whenever they are \clinched".
 2.3 Evolutionary Games
 In some games, there can be more than one Nash equilibrium. When there ex-
 ist multiple Nash equilibria, one interesting and important problem is how to choose
 an optimal one in some sense. This process is also known as \equilibria reflnement"
 in game theory. In the literature, several reflnement criteria have been proposed,
 e.g. Pareto optimality is deflned to compare multi-dimension payofi proflles. How-
 ever, the establishment of Pareto optimality is based on the assumption that players
 have the full knowledge of the game they are playing and others players? actions,
 and that players are rational and willing to cooperate in their moves. Nevertheless,
 this assumption may not be true since players may only have limited information
 about the other players strategies. Moreover, players may take out-of-equilibrium
 strategies due to the uncertainty of the game and incorrect/noisy estimate of others?
 strategies. To overcome such problems, we need to provide a robust stable equilib-
 rium, and evolutionary game theory is such a theory that can provide the desired
 stable equilibrium { evolutionarily stable strategy.
 Evolutionary game theory is an application of the mathematical theory of
 games to the interaction dependent strategy evolution in populations [110] [41].
 Arising from the realization that frequency dependent fltness introduces a strategic
 17
aspect to evolution, evolutionary game theory becomes an essential component of
 a mathematical and computational approach to biological contexts, such as genes,
 viruses, cells, and humans. Recently, however, evolutionary game theory has become
 of increased interest to economists, sociologists, anthropologists, social scientists,
 and computer science.
 Difiers from classical game theory, evolutionary game theory focuses on the
 dynamics of strategy change more than the properties of strategy equilibria. It can
 tell us how a rational player should behave to approach a best strategy against a
 small number of players who do not follow the best strategy, and thus evolutionary
 game theory can better handle the unpredictable behavior of players.
 2.4 Coalition Formation Games
 The coalition formation game is one type of cooperative game [104], which
 describes how a set of players can cooperate with others by forming cooperating
 groups and thus improves their payofis in a game.
 A coalition S is a nonempty subset of N , the set of all players. Since the
 players in coalition S have agreed to cooperate together, they can be viewed as one
 entity and is associated with a value v(S) which represents the worth of coalition
 S. Then, a coalitional game is determined by N and v(S). When the value v(S) is
 the total payofi that can be distributed in any way among the members of S, e.g.,
 using an appropriate fairness rule, this kind of coalitional games is known as games
 with transferrable payofi. However, in some coalitional games, rigid restrictions exist
 18
on the allocation of the payofi. These games fall into the other category known as
 games without transferrable payofi.
 In coalition formation games, often the value v(S) is determined by two terms:
 the gain of forming a coalition g(S) and the cost of forming a coalition c(S), i.e.
 v(S) = g(S)? c(S): (2.2)
 In general, cooperation by forming larger coalitions is beneflcial for players in
 terms of a higher gain. This property is referred to as superadditivity, i.e.,
 g
 ?
 S1
 [
 S2
 ?
 ? g(S1) + g(S2); 8S1; S2 ? N; S1
 \
 S2 = ;: (2.3)
 However, on the other hand, forming a larger coalition also require a larger
 cost, i.e. the cost is also superadditive as follows
 c
 ?
 S1
 [
 S2
 ?
 ? c(S1) + c(S2); 8S1; S2 ? N; S1
 \
 S2 = ;: (2.4)
 Therefore, forming larger coalitions are not always beneflcial due to the cost
 term, which means that grand coalition is seldom formed. The objective of coalition
 formation games is to flnd the optimal coalition structure S? = fS?1 ; S?2 ; :::; S?l g,
 S?1 [ S?2 [ ::: [ S?l = N , that maximizes the total coalition values, i.e.,
 S? = fS?1 ; S?2 ; :::; S?l g = arg maxS
 X
 i
 v(Si): (2.5)
 2.5 Stochastic Games
 We have discussed various games, but generally speaking, players are assumed
 to face the same stage game at each time, meaning the game and the players?
 19
strategies are not depending on the current state of the network. However, this is
 not true for a dynamic environment where players? strategies keep changing over
 time. In order to study the cooperation and competition behaviors under such a
 dynamic environment, the theory of stochastic games might be a better flt.
 A stochastic game [108] is an extension of Markov decision process (MDP) [101]
 by considering the interactive competition among difierent agents. In a stochas-
 tic game, there is a set of states, denoted by S, and a collection of action sets,
 A1; ? ? ? ; AjN j, one for each player in the game. The game is played in a sequence
 of stages. At the beginning of each stage the game is in a certain state. After the
 players select and execute their actions, the game then moves to a new random
 state with some transition probability determined by the current state and actions
 from all players: T : S ? A1 ? ? ? ? ? AjN j 7! PD(S). Meanwhile, at each stage
 each player receives a payofi ui : S ? A1 ? ? ? ? ? AjN j 7! R, which also depends on
 the current state and all the chosen actions. The game is played continually for a
 number of stages, and each player attempts to maximize an objective function. Like
 in the repeated game, the overall payofi function is deflned as the expected sum of
 discounted intermediate payofis.
 The solution, also called a policy of a stochastic game is deflned as a probability
 distribution over the action set at any state, ?i : S ! PD(Ai), for all i 2 N . Given
 the current state st at time t, if player i?s policy ?ti at time t is independent of the
 states and actions in all previous time slots, the policy ?i is said to be Markov. If
 the policy is further independent of time, it is said to be stationary.
 The stationary policy of the players in a stochastic game, i.e., their optimal
 20
strategies, can be obtained by value iteration according to Bellman?s optimality
 condition. For example, in a two-player stochastic game with opposite objectives,
 let us denote V (s) as the expected reward (of player 1) for the optimal policy
 starting from state s, and Q(s; a1; a2) as the expected reward of player 1 for taking
 action a1 against player 2 who takes action a2 from state s and continuing optimally
 thereafter [85]. Then, the optimal strategy for player 1 can be obtained from the
 following iterations,
 V (s) = max
 ?
 min
 a22A2
 X
 a12A1
 Q(s; a1; a2)?a1 ; (2.6)
 Q(s; a1; a2) = u1(s; a1; a2) + ?
 X
 s02S
 T (s; a1; a2; s0)V (s0); (2.7)
 where ?a1 denotes player 1?s strategy proflle, and T (s; a1; a2; s0) denotes the transi-
 tion probability from state s to s0 when player 1 takes a1 and player 2 takes a2.
 21
Chapter 3
 Multiuser Rate Allocation Social Networks
 Nowadays, due to the explosive growth of the Internet and the advance of
 compression technologies, delay-sensitive multimedia networking applications such
 as multimedia streaming and multi-camera surveillance become more and more pop-
 ular. Therefore, a fundamental problem in these applications, how to fairly and e?-
 ciently allocate the rate among many users who share the same network bandwidth,
 becomes more and more important and draws great attention recently.
 Rate allocation for a single user has been well investigated in the literature
 [39] [35] [48]. In single-user rate allocation, the task of the rate controller is to assign
 the available rate to each frame and each macroblock (MB) to achieve the maximal
 visual quality. This is also known as rate control. The simplest rate control method
 is the constant bit-rate allocation (CBR), which equally allocates the bit-rate to each
 frame. However, CBR often results in quality  uctuation, due to which the overall
 visual quality is signiflcantly degraded. To overcome this problem, variable bit-rate
 allocation (VBR) is proposed for constant quality reconstruction by assigning rate
 according to the complexity of each frame [78]. A core technique in VBR-based
 rate control methods is rate distortion modelling [64], which highly afiects the rate
 control performance. Many works have been done on rate distortion modelling,
 including parametric method [137] and non-parametric method [134].
 22
If a channel is shared by multiple users, besides considering the rate allocation
 within the same user (i.e., frame-level rate allocation and MB-level rate allocation),
 the rate controller needs to consider the rate allocation among difierent users. This
 becomes the multi-user rate allocation problem. Similar to frame-level rate allo-
 cation, the simplest multi-user rate allocation is the constant bit-rate allocation
 (CBR), where the available network bandwidth is equally assigned to each user. A
 major problem of CBR is that it does not consider the variable bit-rate characteris-
 tics of the video sequences. One way to overcome this disadvantage is to optimize a
 global objective function that involves the characteristics of all the video sequences
 using conventional optimization methods such as Lagrangian or dynamic program-
 ming [95]. For example, a commonly adopted method is for the rate controller to
 minimize the weighted sum of the distortions or try to maximize the weighted sum
 of the PSNRs, i.e., the optimization problem becomes:
 min
 Ri
 NX
 i=1
 wiDi(Ri); s:t:
 NX
 i=1
 Ri ? R; (3.1)
 or
 max
 Ri
 NX
 i=1
 wiPSNRi(Ri); s:t:
 NX
 i=1
 Ri ? R; (3.2)
 where R is the available network bandwidth, wi is the weight, Di is the distortion,
 and PSNRi is the PSNR of the ith user.
 Notice that the solution to the above optimization-based methods is highly re-
 lated to the selection of the weights wi. However, in the literature, the weights wi?s
 are usually heuristically determined, e.g., wi is uniformly set to be 1=N [109]. More-
 over, such a formulation can only address the e?ciency issue, e.g., how to maximize
 23
the weighted sum of the PSNRs or minimize the weighted sum of the distortions. As
 such, the fairness issue, which is an important problem for multi-user rate allocation,
 has been generally ignored in the image/video/multimedia community.
 However, in the networking literature, the fairness issue in multi-user rate
 allocation have been considered in a difierent setting. In [105], the authors formu-
 lated the optimal channel-assignment problem as a convex optimization problem
 using a max-min fairness criterion for the downlink application. As pointed out
 in [61], the max-min approach deals with the worst-cast scenario, so it favors users
 with worse channels and reduces the system e?ciency. To overcome the disadvan-
 tage, the authors in [61] considered a generalized proportional fairness based on the
 Nash bargaining solutions and coalitions. While this proportional fairness criterion
 was successfully employed in networking applications, it cannot be directly used in
 content-aware multimedia applications since it does not explicitly consider the char-
 acteristics of the video content and the resulting impact on video quality. In [98],
 the authors applied the Nash bargaining solutions to the multimedia multi-user rate
 allocation problem, where the utility function for each user is deflned as the inverse
 of the distortion. But there are two main drawbacks of that utility function. Firstly,
 since no cost in video transmission is considered, every user can overclaim his/her
 need to get more bandwidth regardless the consequence to the system, which is
 recognized as selflsh behavior. Due to the selflsh nature, without a cost, all users
 will become too greedy and want to get as much bit-rate as possible, which is not
 good to the system [106]. Secondly, since the gain is deflned as the inverse of the
 distortion, i.e., an exponential function of the PSNR, a certain increase of the bit-
 24
rate in the low PSNR region will lead to a less signiflcant gain than that in the high
 PSNR region. This contradicts with the human visual system (HVS) model since
 the quality difierence in the low PSNR region is easier to be distinguished than that
 in the high PSNR region (see Section 3.1.3 for details). Moreover, with the utility
 function deflned in [98], the generalized Nash bargaining solution is shown to be the
 same as the traditional optimization-based approach in (3.2), i.e., to maximize the
 weighted sum of the PSNRs, while the weights are determined by the bargaining
 powers, which are still heuristically determined.
 In this chapter, we propose a multi-user rate allocation game framework to
 e?ciently and fairly allocate the available network bandwidth to difierent multime-
 dia users. The utility/payofi function of each user/player is deflned according to
 the characteristics of the transmitted video sequences and the allocated bit-rate.
 Speciflcally, motivated by the intuition that the quality difierence in the low PSNR
 region is easier to be distinguished than that in the high PSNR region, we deflne
 the gain as a logarithm function of the PSNR. We also introduce a cost term in
 the utility function, which is linear in the allocated rate, to guide users? behaviors.
 In this way, the users will be more rational in choosing bit-rate since transmitting
 data with a higher bit-rate in this case does not necessarily result in a higher payofi,
 especially when the transmitted video sequence is a fast motion and complex scene
 sequence. Then, we discuss the Nash equilibrium (NE) of this rate allocation game.
 We show that with a unique NE, which is proportionally fair in terms of both utility
 and PSNR, can be obtained, based on which the rate controller can e?ciently and
 fairly allocate the available rate. Moveover, we propose a decentralized cheat-proof
 25
rate allocation scheme for the users to converge to the unique NE using alternative
 ascending clock auction [16]. We also show that the traditional optimization-based
 method in (3.2) is a special case of the game-theoretic framework if the utility func-
 tion is deflned as an exponential function of PSNR. This fact indicates that the
 game-theoretic approach ofiers a more general and unifled solution, especially in a
 multi-user setting. Finally, we illustrate several experimental results on real video
 data to demonstrate the e?ciency and efiectiveness of the proposed game-theoretic
 multi-user multimedia rate allocation method.
 The rest of this chapter is organized as follows. In Section 3.1, we give a
 detailed description on the proposed method, including the system model, how
 to deflne the utility function, and the problem formulation. In Section 3.2, we
 provide a detailed analysis of the proposed game-theoretic framework. In Section
 3.3, we show the relationship between the proposed game-theoretic method and the
 traditional optimization-based approach. In Section 3.4, we describe in details the
 proposed distributed cheat-proof rate allocation scheme using alternative ascending
 clock auction. Finally, we illustrate the experimental results on real video signals in
 Section 3.5 and draw conclusions in Section 3.6.
 3.1 The Game-Theoretic Framework
 3.1.1 System Model
 As shown in Figure 3.1, in our system, we assume that there is a controller,
 N transmitters, u1, u2, ..., uN , and N receivers, r1, r2, ..., rN . User ui transmits
 26
Channel / Link
 Controller
 .
 .
 .
 .
 .
 .
 1v
 2v
 Nv
 1u
 2u
 Nu
 1r
 2r
 Nr
 Figure 3.1: System Model.
 the video sequence vi to the corresponding receiver ri through a channel/link that
 is shared by other users u1, ..., ui?1, ui+1, ..., uN . Since the channel has a limited
 bandwidth, it may not be able to satisfy the bandwidth requirements for all users.
 The role of the controller is to allocate the channel bandwidth to users u1, u2, ...,
 uN . So, the question is how the controller allocates the bandwidth to the users in
 an e?cient and fair way? We will formally deflne the notion of fairness later.
 3.1.2 Video Distortion-Rate Model
 Before answering the question raised in the above subsection, let us flrst discuss
 the Distortion-Rate (DR) model for the video sequences. In video compression,
 due to the quantization process, there exists a tradeofi between the distortion (D),
 which is usually deflned as the mean squared error (MSE), and bit-rate (R), which
 determines the channel bandwidth or storage space required to transmit or store
 the coded data. Generally, high bit-rate leads to small distortion while low bit-rate
 causes large distortion. In the literature, several models have been proposed to
 27
characterize this distortion rate tradeofi for difierent video coders, such as MPEG2
 [63] [48], MPEG4 [39] [35], FGS [46], H.263 [114], H.264 [31] [83], and wavelet-based
 coders [124]. Without loss of generality, in this chapter, we use a simple two-
 parameter distortion-rate model, which is widely employed in a medium or high
 bit-rate situation, and other models can be similarly analyzed. The two-parameter
 distortion-rate model is described as follows:
 D(R) = fie?flR; (3.3)
 where fi and fl are two positive parameters determined by the characteristics of the
 video content.
 3.1.3 User?s Utility Function
 As shown in Figure 3.1, user ui can get gain by successfully transmitting the
 video vi to receiver ri, and the gain is determined by the quality of the transmitted
 video. On the other hand, user ui needs to pay for the used bandwidth to transmit
 vi, and the payment is determined by the bit-rate of vi. Therefore, given the proflle
 of ui, the bit-rate Ri and distortion Di, the utility function of user ui can be deflned
 as:
 Ui(Ri; Di) = f(Di)? ag(Ri): (3.4)
 where f(Di) is the gain, g(Ri) is the cost, and a is a parameter controlling the balance
 between the gain and cost.
 Generally, since the gain of ui will be larger if the distortion Di is smaller,
 the function f(:) should be a monotonically decreasing function. Similarly, since the
 28
cost of ui will be larger if the bit-rate Ri is larger, the function g(:) should be a
 monotonically increasing function. Without loss of generality, we assume that the
 cost per bit-rate unit is one, which means:
 g(Ri) = Ri: (3.5)
 The gain f(Di) is generally determined by how much receiver ri is satisfled with
 the received video. In video processing and coding community, the PSNR is a more
 common objective quality measure than MSE. For any MSE, i.e., the distortion D,
 the corresponding PSNR is given by:
 PSNR = 10 log10
 2552
 D : (3.6)
 Moreover, according to the human visual system (HVS) model, the quality
 difierence in the low PSNR region is easier to be distinguished than that in the high
 PSNR region, e.g., as shown in Figure 3.2, the 33dB and 34dB images are easier
 to be distinguished than the 40dB and 41dB images. Therefore, we deflne the f(:)
 function as:
 f(Di) = ln(PSNRi) = ln[10 log10
 2552
 Di ]: (3.7)
 Note that the reason of using ln(:) function is that ln(:) is a monotonically
 increasing function in its argument and its second order derivative is negative, due
 to which a certain increase in the low PSNR region will lead to a more signiflcant
 gain than that in the high PSNR region. Other functions that have similar properties
 can also be used. Moreover, if we do not consider the distinct characteristics of video
 signals, any monotonically decreasing function of the distortion D can be used, e.g.,
 f(D) = 2552=D = e?PSNR; (3.8)
 29
(a) (b)
 (c) (d)
 Figure 3.2: The visual quality of Foreman sequence at difierent PSNR level: (a)
 33dB; (b) 34dB; (c) 40dB; (d) 41dB.
 or
 f(D) = 10 log10
 2552
 D = PSNR: (3.9)
 Combining (3.3)-(3.7) and ignoring the constant term, the utility function of
 user ui becomes:
 Ui(Ri) = ln( i + fliRi)? aRi: (3.10)
 where  i = 2 ln 255? ln fii.
 30
3.1.4 Multi-User Rate Allocation Game
 To answer the question raised in Section 3.1.1, we formulate this problem as
 a multi-user rate allocation game. As shown in Figure 3.1, in this game, there are
 N users/players, who share the available network bandwidth with each other. Each
 user ui has his/her own utility function as shown in (3.10), and it also has a minimum
 desired quality constraint (minimal rate constraint Rmini ) and a maximum satisfled
 quality constraint (maximum rate constraint Rmaxi ). Since Rmini is the minimal
 rate constraint that each user expects by jointing in the game, we assume that the
 available network rate at least guarantees each user for the minimal desired rate
 in the game. Obviously, if the available network bandwidth is able to satisfy all
 the users with the maximum quality constraint Rmaxi , the rate allocation problem
 is trivial since the controller just needs to allocate Rmaxi to each user ui. However,
 in the case that the available network bandwidth is not able to satisfy all the user
 with Rmaxi , the problem becomes more interesting: how does the controller fairly
 and e?ciently allocate the available bandwidth to the users? From the users? point
 of view, they try to maximize their utilities subject to the constraint that the sum
 of the users? bit-rate does not exceed the available bandwidth. Therefore, the game
 can be formulated as:
 max
 Ri
 Ui(Ri) = ln( i + fliRi)? aRi;
 s:t: Rmini ? Ri ? Rmaxi ; 8i = 1; 2; :::; N;
 NX
 i=1
 Ri ? R; (3.11)
 where R is the available network bandwidth.
 31
3.2 Analysis of The Multi-User Rate Allocation Game
 According to (3.10), we can see that the utility function Ui(Ri) is a concave
 function in terms of Ri. By taking the derivative of Ui(Ri) over Ri, we have:
 @Ui(Ri)
 @Ri =
 fli
  i + fliRi ? a; 8i = 1; 2; :::; N: (3.12)
 Therefore, user ui achieves his/her maximal utility U?i (R?i ) at R?i , where R?i is
 deflned as:
 R?i = max[Rmini ; min(
 1
 a ?
  i
 fli ; R
 max
 i )]; 8i = 1; 2; :::; N: (3.13)
 From (3.13), we can see that the optimal R?i corresponding to the maximal
 utility is determined by the parameter a. Therefore, for difierent choices of a, the
 game in (3.11) has difierent equilibria with difierent physical meanings. Speciflcally,
 in the following, we discuss the Nash equilibrium (NE) in three difierent cases:
 a > a0, 0 ? a < a0, and a = a0, where a0 is the constant that satisfles the following
 equation:
 NX
 i=1
 ?
 max[Rmini ; min(
 1
 a0 ?
  i
 fli ; R
 max
 i )]
 ?
 = R: (3.14)
 3.2.1 Non-E?cient Rate Allocation (a > a0)
 If a > a0, the game in (3.11) has a unique Nash Equilibrium (R?1; R?2; :::; R?N).
 Since a > a0, from (3.14), we have
 PN
 i=1 R?i < R, which means that the available
 network bandwidth is not fully utilized. Therefore, this allocation scheme is not
 e?cient.
 32
3.2.2 E?cient Rate Allocation (0 ? a < a0)
 If 0 ? a < a0, the game in (3.11) has inflnitely many NE. For every NE
 ( ~R1; ~R2; :::; ~RN), according to Lemma 1, we have
 PN
 i=1 ~Ri = R, which means that
 the available network bandwidth is fully utilized. Therefore, this allocation scheme
 is e?cient.
 Lemma 1: When 0 ? a < a0, every NE ( ~R1; ~R2; :::; ~RN) satisfles
 PN
 i=1 ~Ri = R.
 Proof: Since PNi=1 ~Ri ? R, let us assume that there is a NE ( ~R1; ~R2; :::; ~RN)
 such that PNi=1 ~Ri = R ?? < R. Since 0 ? a < a0, we have
 PN
 i=1 R?i > R, which
 means there exists at least one ~Rj such that ~Rj < R?j . Let R^j = min( ~Rj + ?; R?j ),
 then Pj?1i=1 ~Ri + R^j +
 PN
 i=j+1 ~Ri ? R and Uj(R^j) > Uj( ~Rj) (due to the concavity of
 the utility function). This contradicts with the assumption that ( ~R1; ~R2; :::; ~RN) is
 a NE. Therefore, PNi=1 ~Ri = R. This completes the proof.
 3.2.3 E?cient and Proportionally Fair in Both Utility and PSNR
 (a = a0)
 If a = a0, the game in (3.11) has a unique Nash Equilibrium (R?1; R?2; :::; R?N).
 According to (3.14), we have PNi=1 R?i = R, which means that the available network
 bandwidth is fully utilized. Therefore, this allocation scheme is e?cient.
 Moreover, we will show in the following deflnition [72] and theorem that when
 a = a0, (R?1; R?2; :::; R?N) is a proportionally fair NE in terms of both utility and
 PSNR.
 Deflnition 1: A utility distribution is said to be proportionally fair when any
 33
change in the distribution of utilities results in the sum of the proportional changes
 being non-positive, i.e.,
 X
 i
 Ui ? ~Ui
 ~Ui
 ? 0; 8Ui 2 S: (3.15)
 where ~Ui and Ui are the proportionally fair utility and any other feasible utility for
 the ith user, respectively, and S is a closed and convex subset of <N to represent the
 set of feasible utility functions that the users can achieve.
 Remark: The deflnition of proportional fairness comes from the fact that, if
 ( ~U1; ~U2; :::; ~UN) satisfles (3.15), any deviation from ( ~U1; ~U2; :::; ~UN) will lead to a non-
 increasing sum of the proportional changes. Moreover, from [72] and [61], we can see
 that ( ~U1; ~U2; :::; ~UN) is a proportionally fair utility if and only if
 QN
 i=1 ~Ui ?
 QN
 i=1 Ui
 for any feasible (U1; U2; :::; UN).
 Theorem 1: When a = a0, (R?1; R?2; :::; R?N) is a proportionally fair NE in terms
 of both utility and PSNR.
 Proof: According to the above remark, (R?1; R?2; :::; R?N) is a proportionally fair
 NE in terms of both utility and PSNR if and only if it is the solution to the following
 two optimization problems:
 max
 Ri
 NY
 i=1
 Ui =
 NY
 i=1
 [ln( i + fliRi)? aRi]
 s:t: Rmini ? Ri ? Rmaxi ; 8i = 1; 2; :::; N;
 NX
 i=1
 Ri ? R; (3.16)
 34
and
 max
 Ri
 NY
 i=1
 PSNRi =
 NY
 i=1
 ( i + fliRi)
 s:t: Rmini ? Ri ? Rmaxi ; 8i = 1; 2; :::; N;
 NX
 i=1
 Ri ? R: (3.17)
 ? We flrst show that (R?1; R?2; :::; R?N) is a proportionally fair NE in terms of
 utility. Since user ui achieves his/her maximal utility U?i (R?i ) at R?i , we have
 Ui(R?i ) ? Ui(Ri), for any Ri satisfles Rmini ? Ri ? Rmaxi and
 PN
 i=1 Ri ? R.
 This means that QNi=1 Ui(R?i ) ?
 QN
 i=1 Ui(Ri) for any feasible (R1; R2; :::; RN).
 Therefore, (R?1; R?2; :::; R?N) is a proportionally fair NE in terms of utility.
 ? We then show that (R?1; R?2; :::; R?N) is a proportionally fair NE in terms of
 PSNR. Since maximizing Qi xi is the same as maximizing
 P
 i ln(xi), the opti-
 mization problem in (3.17) is equivalent to the following optimization problem:
 max
 Ri
 NX
 i=1
 ln( i + fliRi)
 s:t: Rmini ? Ri ? Rmaxi ; 8i = 1; 2; :::; N;
 NX
 i=1
 Ri ? R; (3.18)
 Since the above optimization problem is convex, the optimal solution can be
 found by solving the Karush-Kuhn-Tucker (KKT) conditions [19]. We flrst
 write the Lagrangian of problem (3.18) as:
 L(Ri; ?; ?i; ?i) = ?
 NX
 i=1
 ln( i + fliRi) + ?(
 NX
 i=1
 Ri ?R)
 +
 NX
 i=1
 ?i(Ri ?Rmaxi ) +
 NX
 i=1
 ?i(Rmini ?Ri): (3.19)
 35
Then, the KKT conditions are:
 ? fli i + fliRi + ?+ ?i ? ?i = 0;
 ?(
 NX
 i=1
 Ri ?R) = 0;
 ?i(Ri ?Rmaxi ) = 0; 8i = 1; :::; N:
 ?i(Rmini ?Ri) = 0; 8i = 1; :::; N:
 Rmini ? Ri ? Rmaxi ; 8i = 1; :::; N:
 NX
 i=1
 Ri ? R;
 ? ? 0; ?i ? 0; ?i ? 0; 8i = 1; :::; N: (3.20)
 By solving the KKT conditions above, the optimal solution is:
 R?i = max[Rmini ; min(
 1
 ? ?
  i
 fli ; R
 max
 i )]; 8i = 1; :::; N: (3.21)
 where PNi=1 max[Rmini ; min( 1? ?  ifli ; Rmaxi )] = R.
 Therefore, (R?1; R?2; :::; R?N) is the solution to the optimization problem in
 (3.17), which means that it is a proportionally fair NE in terms of PSNR.
 In all, when a = a0, (R?1; R?2; :::; R?N) is a proportionally fair NE in terms of both
 utility and PSNR. This completes the proof.
 Remark: From the above analysis, we can see that choosing a = a0 is the
 best among the three difierent cases due to the following four reasons: (1) a unique
 proportionally fair NE in terms of both utility and PSNR can be found when a = a0;
 (2) with the unique proportionally fair NE, the available network bandwidth will be
 fully utilized; (3) since the optimal solution shown in (3.21) is very simple, no opti-
 mization is needed and the computational complexity is low; and (4) a distributed
 36
algorithm can be designed for the users to converge to the unique NE which will
 discussed in Section 3.4.
 3.3 Relation To The Traditional Optimization-Based Approach
 While the task of rate allocation for a single user is to flnd the best trade-ofi
 point on the rate-distortion curve, the traditional optimization-based multi-user rate
 allocation approach can be seen as flrst constructing an overall rate-distortion curve
 by combining rate-distortion curves of all users, and then flnding the best trade-
 ofi point on the joint rate-distortion curve. However, it is di?cult to construct
 the overall rate-distortion curve from all users? rate-distortion curve. The approach
 shown in (3.2) is one possible way, but there is no notion of fairness. Furthermore,
 the weights in (3.2) are hard to determine and are usually deflned heuristically.
 Instead of focusing on flnding a good way of constructing the overall rate-
 distortion curve, the proposed game-theoretic framework considers each user?s rate-
 distortion trade-ofi in the utility function. Then, the notion of proportional fairness,
 is introduced to balance the rate allocation among difierent users and to make sure
 that the total rate constraint is satisfled. Moreover, from (3.2), (3.4), and (3.16), we
 can see that the traditional optimization-based approach shown in (3.2) is actually a
 special case of the proposed game-theoretic framework by choosing the gain function
 and the cost function as follows:
 f(Di) = ewiPSNRi ; g(Ri) = 0; (3.22)
 37
which means,
 Ui(Ri; Di) = ewiPSNRi : (3.23)
 Note that there are mainly three drawbacks of this kind of utility function:
 ? The parameters wi are usually heuristically determined.
 ? If no cost in video transmission is considered, selflsh users may become too
 greedy and want to get as much bit-rate as possible, which is not good to the
 system [106].
 ? Since the gain is deflned as an exponential function of the PSNR, a certain
 increase of the bit-rate in the low PSNR region will lead to a less signiflcant
 gain than that in the high PSNR region. This contradicts with the human
 visual system (HVS) model since the quality difierence in the low PSNR region
 is easier to be distinguished than that in the high PSNR region.
 3.4 Clock Auction For Distributed Cheat-Proof Optimal Rate Allo-
 cation
 In Section 3.2, we have discussed the NE of the multi-user rate allocation
 game for difierent a?s and found that, when a = a0, the game has a proportionally
 fair NE in both utility and PSNR. However, we have not discussed how to obtain
 a0 and how the users converge to the NE yet. There are two possible approaches,
 centralized approach and distributed approach. For the centralized approach, the
 controller knows exactly all the private information of each user, i.e.,  i, fli, Rmini ,
 38
and Rmaxi . Then, the controller can flrst flnd a0 in a collective way by solving (3.14)
 and then allocate R?i to ui.
 However, in general, the users can be geographically distributed in many
 places, it is therefore not feasible for the controller to collect all the private informa-
 tion of each user. Moreover, since the users are selflsh, e.g., they tend to overclaim
 what they may need, they will not truly report their private information if cheating
 can improve their utilities [62]. To solve this problem, we propose a distributed
 cheat-proof rate allocation scheme using alternative ascending clock auction [16].
 An auction is a decentralized mechanism for allocating resources, where there is an
 auctioneer and several bidders. The auction processes can be described as follows:
 the auctioneer announces a price, bidders report to the auctioneer their demands
 at that price, and the auctioneer raises the price until the total demand meets the
 supply. In our multi-user rate allocation problem, the controller is the auctioneer
 and the users are the bidders.
 The proposed rate allocation scheme is described in Algorithm 1. As shown
 in Algorithm 1, before the auction, the controller sets up the step size ? > 0, clock
 index t = 0, and initializes a with a small value a0. At the beginning of clock t,
 the controller flrst announces at to all the users. Then, each user submits his/her
 optimal demand to the controller. After collecting all the demands, the controller
 compares the total demand Rtotal with the available bandwidth R. If Rtotal > R, i.e.,
 the total demand exceeds the supply, the auction is not concluded. The controller
 continues the auction and goes to next clock t + 1 with an increased a computed
 by at+1 = at + ?. Moreover, the controller computes the cumulative clinch, which is
 39
the amount of bit-rate that a user is guaranteed to win at current clock given by
 Cti = max(0; R?
 X
 j 6=i
 Rtj): (3.24)
 On the other hand, if Rtotal ? R, then the supply can meet all users? demands
 and the auction is concluded. Let the flnal clock index be L. As a increases dis-
 cretely, we may have Rtotal < R and do not fully utilize the bandwidth. To make
 sure that Rtotal = R, we modify (3.24) by introducing proportional rationing [16],
 and the flnal cumulative clinch of ui is given by,
 CLi = RLi +
 RL?1i ?RLiP
 i RL?1i ?
 P
 i RLi
 [R?
 X
 i
 RLi ]; with
 X
 i
 CLi = R: (3.25)
 Finally, the rate allocated to ui is R?i = CLi . The utility of ui is obtained as,
 U?i = ln( i + fliR?i )? P ?i ; (3.26)
 where P ?i = C0i a0 +
 PL
 t=1 at(Cti ? Ct?1i ) is the payment from user ui.
 Remark: Since at+1 > at, we have Rt+1j ? Rtj. Therefore, at clock t, ui is
 guaranteed at least the amount of bit-rate Cti = max(0; R ?
 P
 j 6=i
 Rtj). This is how
 (3.24) comes from.
 The rate allocation scheme described in Algorithm 1 has several advantages:
 ? The auction process is transparent to all users and simple enough for all users
 to understand. Simplicity and transparency are two important factors to stim-
 ulate auction since users may not be willing to join in the game if they do not
 understand the auction process.
 ? The auction scheme can preserve privacy. Since the scheme is distributed,
 40
Algorithm 1 Cheat-Proof Rate Allocation Scheme Using Clock Auction
 Given the available bandwidth R, step size ? > 0, and clock index t = 0, the
 controller initializes a with a small value a0.
 Repeat:
 (1) the controller announces at to all the users.
 (2) Each user ui submits his/her optimal demand:
 Rti = max[Rmini ; min( 1at ?  ifli ; Rmaxi )].
 (3) The controller sums up all the demand Rttotal =
 P
 i Rti and compares Rttotal
 with R:
 If Rttotal > R, compute Cti = max(0; R?
 P
 j 6=i
 Rtj), set at+1 = at+ ?, t = t+1,
 and go to (1).
 Else, conclude the auction, set L = t, compute CLi = RLi + R
 L?1
 i ?RLiP
 i RL?1i ?
 P
 i RLi
 [R?
 P
 i RLi ],
 and allocate R?i = CLi to ui.
 Finally, the payment of ui is P ?i = C0i a0 +
 PL
 t=1 at(Cti ? Ct?1i ) and the utility of
 ui is U?i = ln( i + fliR?i )? P ?i .
 41
users do not need to report their private information. Instead, they only need
 to submit their demands.
 ? The computational complexity of each user is low since what the users need to
 do is to submit their optimal demands calculated by Ri = max[Rmini ; min( 1a ?
  i
 fli ; Rmaxi )] for any given a.
 ? The computational complexity of the controller is low in that the controller
 only needs to sum up the demands from the users, compare it with the available
 bandwidth, and compute the cumulative clinch for each user.
 ? Through the auction, each user will converge to the unique proportionally fair
 NE shown in Section 3.2. This is trivial due to the following two reasons:
 (1) since the auction concludes if and only if Pi Rti ? R, when ? is su?cient
 small, the auction will conclude at Pi Rti = R. (2) At each clock t, ui chooses
 Rti = max[Rmini ; min( 1at ?  ifli ; Rmaxi )].
 ? The scheme is cheat-proof, meaning that the best strategy of each user is to
 report his/her true optimal demand at every clock. There is no incentive for
 ui to deviate, and the proof is shown in Theorem 2.
 Let Rti be user ui?s true optimal demand at clock t, and ~Ri
 t be the claimed
 demand that ui reports to the controller at clock t. Note that ~Rit can be any value
 in [Rmini ; Rmaxi ] if ui cheats at clock t. Let ?(t; L) = fL; ~Ri
 0; :::; ~Rit; Rt+1i ; :::; RLi ;
 C0i ; :::; CLi ; a0; :::; aLg be the proflle of ui at the following scenario: from clock 0 to
 clock t, ui reports ~Ri0; :::; ~Rit, and from clock t + 1 to the flnal clock, ui reports
 42
2L
 iR 1
 L
 iR
 2ln( )Li i iR? ?+
 1ln( )Li i iR? ?+
 1 1 2
 1
 (ln( ))ln( ) ( )
 L
 i i
 L L Li i i
 i i i i i
 i R R
 RR R R
 R
 ? ?? ?
 =
 ? +
 + ? ?
 (a)
 1L
 iR 2
 L
 iR
 2ln( )Li i iR? ?+
 1ln( )Li i iR? ?+
 2 2 1
 1
 (ln( ))ln( ) ( )
 L
 i i
 L L Li i i
 i i i i i
 i R R
 RR R R
 R
 ? ?? ?
 =
 ? +
 + ? ?
 (b)
 Figure 3.3: Illustration for the proof of Lemma 2: (a) If ~Rit ? Rti and RL1i > Rmini ,
 we can see that [ln( i + fliRL1i ) ? @ ln( i+fliRi)@Ri jRi=RL1i (R
 L1
 i ? RL2i )] ? ln( i + fliRL2i );
 (b) If ~Rit > Rti and RL1i < Rmaxi , we can see that ln( i + fliRL1i ) ? [ln( i + fliRL2i )?
 @ ln( i+fliRi)
 @Ri jRi=RL1i (R
 L2
 i ?RL1i )].
 Rt+1i ; :::; RLi , where L is the flnal clock index, C0i ; :::; CLi is the corresponding cumu-
 lative clinch of ui from clock 0 to clock L, and a0; :::; aL is the corresponding value
 of a at each clock. Let Ui[?(t; L)] be the utility of ui in this scenario. Let ?(?1; L) =
 fL;R0i ; :::; RLi ;C0i ; :::; CLi ; a0; :::; aLg and ?(L;L) = fL; ~Ri
 0; :::; ~RiL;C0i ; :::; CLi ; a0; :::; aLg
 be two special cases of ?(t; L).
 Lemma 2: If all other users report their true optimal demands at every clock,
 then Ui[?(t? 1; L1)] ? Ui[?(t; L2)].
 Proof: From (3.26), we have,
 Ui[?(t? 1; L1)] = ln( i + fliRL1i )? a0C0i ?
 L1X
 k=1
 ak(Cki ? Ck?1i );
 Ui[?(t; L2)] = ln( i + fliRL2i )? a0C0i ?
 L2X
 k=1
 ak(Cki ? Ck?1i ): (3.27)
 ? If ~Rit ? Rti, according to Algorithm 1, we have L2 ? L1 and RL2i ? RL1i .
 43
Then,
 Ui[?(t? 1; L1)]? Ui[?(t; L2)]
 = ln( i + fliRL1i )? ln( i + fliRL2i )?
 L1X
 k=L2+1
 ak(Cki ? Ck?1i )
 ? ln( i + fliRL1i )? ln( i + fliRL2i )? aL1(CL1i ? CL2i ): (3.28)
 When ? is su?ciently small, CL1i = RL1i and CL2i = RL2i . Since Rti =
 max[Rmini ; min( 1at ?  ifli ; Rmaxi )],
 { if RL1i > Rmini , according to (3.12), we get aL1 ? @ ln( i+fliRi)@Ri jRi=RL1i .
 Thus, (3.28) becomes
 Ui[?(t? 1; L1)]? Ui[?(t; L2)]
 ? [ln( i + fliRL1i )?
 @ ln( i + fliRi)
 @Ri jRi=RL1i (R
 L1
 i ?RL2i )]? ln( i + fliRL2i )
 ? 0 (see Figure 3:3(a)): (3.29)
 { if RL1i = Rmini , since Rmini ? RL2i ? RL1i = Rmini , we have RL2i = Rmini =
 RL1i . Therefore,
 Ui[?(t? 1; L1)] = Ui[?(t; L2)]: (3.30)
 So, if ~Rit ? Rti, we have Ui[?(t? 1; L1)] ? Ui[?(t; L2)].
 ? If ~Rit > Rti, according to Algorithm 1, we have L2 ? L1 and RL2i ? RL1i .
 Then,
 Ui[?(t? 1; L1)]? Ui[?(t; L2)]
 = ln( i + fliRL1i )? ln( i + fliRL2i ) +
 L2X
 k=L1+1
 ak(Cki ? Ck?1i )
 ? ln( i + fliRL1i )? ln( i + fliRL2i ) + aL1(CL2i ? CL1i ): (3.31)
 44
When ? is su?ciently small, CL1i = RL1i and CL2i = RL2i . Since Rti =
 max[Rmini ; min( 1at ?  ifli ; Rmaxi )],
 { if RL1i < Rmaxi , according to (3.12), aL1 ? @ ln( i+fliRi)@Ri jRi=RL1i . Thus,
 (3.31) becomes
 Ui[?(t? 1; L1)]? Ui[?(t; L2)]
 ? ln( i + fliRL1i )? [ln( i + fliRL2i )?
 @ ln( i + fliRi)
 @Ri jRi=RL1i (R
 L2
 i ?RL1i )]
 ? 0 (see Figure 3:3(b)): (3.32)
 { if RL1i = Rmaxi , since Rmaxi ? RL2i ? RL1i = Rmaxi , we have RL2i = Rmaxi =
 RL1i . Therefore,
 Ui[?(t? 1; L1)] = Ui[?(t; L2)]: (3.33)
 So, if ~Rit > Rti, we still have Ui[?(t? 1; L1)] ? Ui[?(t; L2)].
 In all, we can show that Ui[?(t? 1; L1)] ? Ui[?(t; L2)]. This completes the proof.
 With Lemma 2, we can now show that the best strategy of each user is to
 report his/her true optimal demand at every clock.
 Theorem 2 (Cheat-Proof): Reporting true optimal demand at every clock is a
 mutually best response for every user, i.e., Ui[?(L3; L3)] ? Ui[?(?1; L4)] 8i.
 Proof: If all the other users report their true optimal demands in every
 clock, according to Lemma 2, we have Ui[?(L3; L3)] ? Ui[?(L3 ? 1; ~L3)] ? ::: ?
 Ui[?(?1; L4)], where ~L3 stands for the flnal clock index of the following scenario:
 from clock 0 to clock L3?1, ui reports ~Ri0; :::; ~RiL3?1, and from clock L3 to the flnal
 clock ~L3, ui reports RL3i ; :::; R
 ~L3
 i . Since all users are non-collaborative, reporting true
 45
optimal demand at every clock is a mutually best response for every user. There is
 no incentive for the users to cheat since any cheating will lead to a loss in utility.
 Therefore, the proposed scheme is cheat-proof. This completes the proof.
 In the above theorem, we give a theoretical proof for the cheat-proof strategy.
 In the following section, we will verify this cheat-proof strategy through experimental
 results.
 3.5 Experimental Results
 In order to evaluate the proposed game-theoretic multi-user rate allocation
 game, we conduct experiments on real video data. Seven video sequences: Akiyo,
 Mobile, Table, Carphone, Coastguard, Foreman, and Football in QCIF format, are
 tested. Notice that these video sequences include slow, medium or fast motion,
 and smooth or complex scene. We use the state-of-art H.264 JM 9.0 video codec
 to encode the video sequences [6]. By changing the quantization parameter (QP)
 or using the rate control feature, we are able to compress the video sequences at
 difierent bit-rate and achieve difierent quality requirements.
 3.5.1 Parameter Estimation
 From Section 3.1, we can see that there are several parameters in our frame-
 work,  i, fli, Rmini , and Rmaxi . In this subsection, we will discuss how to estimate
 these parameters.
 46
Table 3.1:  ?i , fl?i , Rmini (kb=s), and Rmaxi (kb=s) for difierent sequence by training.
 Sequence  ?i fl?i Rmini (kb=s) Rmaxi (kb=s)
 Akiyo 6.8449 0.0416 1.5119 84.5447
 Carphone 6.6759 0.0114 20.2554 322.0153
 Coastguard 6.6796 0.0043 28.4987 878.8011
 Foreman 6.7418 0.0093 17.8168 388.7091
 Football 6.2201 0.0024 286.311 1720
 Mobile 6.3464 0.0025 225.0682 1610
 Table 6.8135 0.0074 12.7781 481.1014
 According to (3.3) and (3.6), we have:
 PSNRi = 10(log10 e) ln
 2552
 Di
 = (10 log10 e)( i + fliRi) (3.34)
 Therefore, we can estimate  i and fli using ofi-line training. For each video
 sequence, we flrst generate a set of (PSNRi; Ri) by encoding the sequence using
 H.264 JM 9.0 with difierent QP. Then, the optimal  ?i and fl?i can be computed by:
 ( ?i ; fl?i ) = min i;fli
 X
 j
 [PSNRi(j)? (10 log10 e)( i + fliRi(j))]2; (3.35)
 where j is the index of the training set.
 Through the training data and equation above, we get the optimal  ?i and fl?i
 for difierent video sequences and show them in Table 3.1. As shown in Figure 3.4,
 47
200 400 600 800 1000 1200 140028
 30
 32
 34
 36
 38
 40
 42
 Rate (kb/s)
 PSNR (dB
 )
 Football QCIF
  
 
baseline
 training
 (a)
 0 100 200 300 400 500 600 70026
 28
 30
 32
 34
 36
 38
 40
 42
 Rate (kb/s)
 PSNR (dB
 )
 Coastguard QCIF
  
 
baseline
 training
 (b)
 Figure 3.4: Training  and fl: (a) Football; (b) Coastguard.
 with the optimal  ?i and fl?i , the (10 log10 e)( i+ fliRi) can approximate PSNRi well.
 Due to the page limitation, we only show the results for Football and Coastguard.
 Similar results are observed for other sequences.
 After flnding the optimal  ?i and fl?i , we derive the values for Rmini and Rmaxi .
 Suppose that the minimal desired PSNR (quality) constraint is PSNRmin, e.g.,
 30dB, and the maximal satisfled PSNR (quality) constraint is PSNRmax, e.g., 45dB.
 According to (3.34), we have:
 Rmini =
 1
 fl?i
 (PSNR
 min
 10 log10 e
 ?  ?i );
 Rmaxi =
 1
 fl?i
 (PSNR
 max
 10 log10 e
 ?  ?i ): (3.36)
 According the equations above, the Rmini and Rmaxi for difierent sequences are
 obtained and shown in Table 3.1. From Table 3.1, we can see that the tested video
 sequences can be classifled to four categories according to fl?, Rmini and Rmaxi : slow
 motion and smooth scene (Akiyo), medium motion and smooth scene (Carphone,
 Foreman, and Table), medium motion and complex scene (Coastguard), and fast or
 48
1 2 3 4 5
 0
 0.5
 1
 1.5
 AFD
 Total Rate (Mb/s)
 Rate (Mb/s
 )
  
 
Akiyo
 Carphone
 Coastguard
 Foreman
 Table
 Football
 Mobile
 1 2 3 4 5
 0
 0.5
 1
 1.5
 AFR
 Total Rate (Mb/s)
 Rate (Mb/s
 )
  
 
Akiyo
 Carphone
 Coastguard
 Foreman
 Table
 Football
 Mobile
 1 2 3 4 5
 0
 0.5
 1
 1.5
 MSPSNR
 Total Rate (Mb/s)
 Rate (Mb/s
 )
  
 
Akiyo
 Carphone
 Coastguard
 Foreman
 Table
 Football
 Mobile
 1 2 3 4 5
 0
 0.5
 1
 1.5
 Proposed
 Total Rate (Mb/s)
 Rate (Mb/s
 )
  
 
Akiyo
 Carphone
 Coastguard
 Foreman
 Table
 Football
 Mobile
 Figure 3.5: Allocated rates for Akiyo, Carphone, Coastguard, Foreman, Table, Foot-
 ball, and Mobile using difierent methods.
 complex motion (Football and Mobile).
 3.5.2 Multi-User Rate Allocation
 We compare the proposed method with three approaches: the Absolute Fair-
 ness in Rate (AFR), which equally divides the available bandwidth to all the users,
 the Absolute Fairness in Distortion (AFD), which minimizes the maximal distor-
 tion of all the users, i.e., min-max fairness, and the approach Maximizing the Sum
 of the PSNRs (MSPSNR), i.e. the traditional optimization-based approach shown
 in (3.2) with uniform weights. Notice that for AFR, AFD, and MSPSNR, the
 49
1 1.5 2 2.5 3 3.5 4 4.5 5
 200
 220
 240
 260
 280
 300
 320
 R (Mb/s)
 To
 ta
 l P
 SNR (dB
 )
 PSNR
  
 
AFD
 AFR
 MSPSNR
 Proposed
 Figure 3.6: The sum of PSNR vs. the available network bandwidth R.
 allocated rate should be within [Rmini ; Rmaxi ]. Otherwise, we set it to be Rmini or
 Rmaxi and re-allocate the rest rate for the other users. Given the video sequences
 to be transmitted, the available bandwidth R, we can compute the rate allocated
 to each video sequence using difierent methods, i.e., AFD, AFR, MSPSNR, and
 the proposed method. Then, setting the allocated bit-rate as the target bit-rate,
 we compress the video sequence using the rate control feature in H.264 JM 9.0
 reference software. Finally, each user transmits the compressed bitstream to the
 corresponding receiver.
 In the experiments, we assume that there are seven users u1; u2; :::; u7. They
 transmit Akiyo, Carphone, Coastguard, Foreman, Table, Football, and Mobile to
 seven receivers r1; r2; :::; r7, respectively. We test R at 1000, 2000, 3000, 4000 and
 50
0 0.5 1 1.5 2 2.5 3 3.5
 28
 30
 32
 34
 36
 38
 40
 42
 44
 46
 Scale factor k
 PSNR (dB
 )
 ?=0.0025 and R=4.2Mb/s
  
 
MSPSNR
 AFD
 AFR
 (a)
 0 0.5 1 1.5 2 2.5 3 3.5
 1.86
 1.87
 1.88
 1.89
 1.9
 1.91
 Scale factor k
 Utilit
 y
 ?=0.0025 and R=4.2Mb/s
  
 
Cheat?Proof Performance of Proposed Method
 (b)
 Figure 3.7: Cheat-proof performance: (a) AFD, AFR, and MSPSNR; (b) Proposed
 Method.
 5000 kb/s. The allocated bit-rate for each video sequence in difierent situations
 (i.e., difierent R) using difierent methods (i.e., AFD, AFR, MSPSNR, and the
 proposed method) are shown in Figure 3.5. From Figure 3.5, we can see that AFR
 equally allocates the bandwidth to each users if the allocated bit-rate is within
 [Rmini ; Rmaxi ]. AFD tries to allocate more bit-rate to the video sequence that has
 more complex motion and/or scene (a smaller fl?) to preserve constant quality among
 difierent users. On the contrary, MSPSNR favors the video sequence that has
 a larger fl? since allocating more bit-rate to the sequence with a larger fl? leads
 to a greater increase in the sum of the PSNRs. However, with MSPSNR, the
 sequence with fl?i will not be allocated more bit-rate than Rmini if there is a sequence
 with fl?j > fl?i who has not been allocated its maximal rate requirement Rmaxj yet.
 Speciflcally, the rate controller will flrst allocate each user with Rmini . Then, the
 remaining rates will be flrst allocated to Akiyo until the bit-rate of Akiyo achieves
 its maximal requirement. If there are still some unused rates, then Carphone will
 51
be satisfled flrst. And the bit-rate of Football with the smallest fl? stays at its
 minimal requirement until all other sequences with higher fl? have achieved their
 maximal rate requirements. Obviously, this is not fair to the users who transmit
 the sequences with smaller fl?. By taking the proportional fairness into account,
 the proposed method can avoid this disadvantage and balance the rate allocation
 between the sequences with a larger fl? and a smaller fl?. For example, as shown in
 Figure 3.5, when the total available network bandwidth R increases from 3000kb/s
 to 4000kb/s, both the bit-rate of Mobile and Football increase. This is because
 the proposed method with the proportional fairness criterion aims at maximizing
 the product of the utility function Ui, and keeping a certain balance between the
 sequences with a larger fl? and a smaller fl? leads to an increase in the product.
 Let T PSNR = PNi=1 PSNRi be the sum of the analytical PSNRi computed
 by (3.34) of all the users. In Figure 3.6, we show T PSNR versus the available
 network bandwidth R. We can see that there is a big gap between the performance
 of AFD, AFR and MSPSNR, which means using AFD or AFR leads to a big
 loss in the system performance. However, the performance of the proposed method
 is almost the same as that of MSPSNR, which fully demonstrates the e?ciency
 of the proposed method. Therefore, while achieving a fair rate allocation among
 difierent users, the proposed method still performs well in terms of total PSNR.
 Finally, we evaluate the cheat-proof property of difierent methods. As shown
 in Table 3.1, since fl is the most important parameter representing the characteristics
 of video sequences, the best way for ui to pretend as another user uj is to use flj
 rather than fli in calculating optimal demand. Therefore, we evaluate the cheat-
 52
proof property in terms of fl. In this experiment, the available network bandwidth
 R is set to be 4.2Mb/s. We assume that u6 who transmits Mobile sequence will
 cheat while other users are honest. In AFD, AFR and MSPSNR, u6 reports a
 false ~fl to the controller by scaling the original fl with a factor k, i.e., ~fl = kfl.
 In the proposed method, at each clock t of the auction, u6 uses ~fl to generate the
 \optimal" demand ~R6t using ~R6t = max[Rmin6 ; min( 1at ?  6kfl6 ; Rmax6 )] and reports ~R6
 t
 to the controller. As shown in Figure 3.7(a), the PSNR performance of AFR is
 independent of the scale factor k. This is because AFR does not care about fl and
 just equally allocates the bandwidth to each user if the allocated bit-rate is within
 [Rmini ; Rmaxi ]. The PSNR performance of AFD decreases as k increases. This is
 because AFD tries to allocate more bit-rate to the video sequence with a smaller fl
 to preserve constant quality among difierent users. Therefore, with AFD, all users
 tend to report a smaller fl to the controller to obtain a better PSNR performance.
 On the contrary, the PSNR performance of MSPSNR is an increasing piecewise
 constant function in terms of k. This is because, with MSPSNR, the sequence
 with fli will not be allocated more bit-rate than Rmini if there is a sequence with
 flj > fli who has not been allocated its maximal rate requirement Rmaxj yet. To be
 allocated more rate and obtain a higher PSNR, u6 should increase k until at least
 kfl6 > flj where flj = minl(fll > fl6). Therefore, with MSPSNR, all users tend to
 report a larger fl to the controller to obtain a better PSNR performance. However,
 with the proposed method, as shown in Figure 3.7(b), reporting the optimal demand
 generated by the true fl (k=1) will lead to the best utility. Any deviation will lead
 to a loss in terms of utility, which means that the proposed method is cheat-proof.
 53
Therefore, the proposed method ensures all users will be honest about their private
 information.
 3.6 Summary
 In this chapter, we proposed a game-theoretic framework for multi-user multi-
 media rate allocation and a distributed cheat-proof scheme for users to converge to
 the NE of the game. Difierent from the traditional optimization-based approaches,
 which mainly focus on the e?ciency issue, e.g. maximizing the system performance,
 the proposed method not only considers the e?ciency issue but also the fairness
 issue. From the experimental results on real video sequences, we can see that with
 the proportional fairness criterion, the proposed game-theoretic method can e?-
 ciently and fairly allocate bit-rate to difierent users by allocating more bit-rate to
 the sequence with slower motion and/or simpler scene while keeping an eye on the
 fast motion and/or complex scene sequence. We also flnd that, with the proposed
 distributed cheat-proof rate allocation scheme, reporting the true optimal demand
 at every clock is the mutual best response for every user. Moreover, we show that
 the traditional optimization-based method that maximizes the weighted sum of the
 PSNRs is a special case of the game-theoretic framework with the utility function
 deflned as an exponential function of PSNR.
 54
Chapter 4
 Peer-to-Peer Cooperative Video Streaming Social Networks
 With the rapid development of signal processing, communication, and net-
 working technologies, video-over-IP applications become more and more popular
 and have attracted millions of users over the Internet [1] [9]. One simple solution to
 video streaming over Internet is the client-server service model [47] [76], where the
 video is streamed directly from a server to clients. However, with the client-server
 service model, the upload bandwidth of the server grows proportionally with the
 number of clients [86], which makes the large-scale video streaming impractical.
 To reduce the workload of the server, Peer-to-Peer (P2P) service model is
 proposed [36] [133], where a peer not only acts as a client to download data from
 the network, but also acts as a server to upload data for the other peers in the
 network. The upload bandwidth of the peers reduces the workload placed on the
 server dramatically, which makes large-scale video streaming possible. Recently,
 several industrial large-scale P2P video streaming systems have been developed,
 including Coolstreaming [133], PPLive [4], PPStream [5], UUSee [8] and Sopcast
 [7]. Studies show that these systems can support hundreds of thousands of users
 simultaneously [65].
 While P2P video streaming systems have achieved promising results, they have
 several drawbacks. First, there is a large number of unnecessary traverse links within
 55
a provider?s network. As observed in [129], each P2P bit on the Verizon network
 traverses 1000 miles and takes 5.5 metro-hops on average. Second, there is a huge
 number of cross Internet Service Provider (ISP) tra?c. The studies in [71] [107]
 showed that 50%-90% of the existing local pieces in active peers are downloaded
 externally. Third, the difierences in playback time among peers can be as high as
 140 seconds [65], and the lag can be greater if the source rate is higher. Fourth,
 most of the current P2P systems assume that all peers are willing to contribute
 their resources. However, this assumption may not be true since the P2P systems
 are self-organizing networks and the peers are selflsh by nature [128] [60]. Note that
 the selflsh peers will act as free-riders if being free-riders can improve their utilities.
 In the literature, many approaches have been proposed to overcome these
 drawbacks. Karagiannis et al. [71] and Madhyastha et al. [87] proposed to use
 locality-aware P2P schemes to reduce the unnecessary traverse links within and
 cross ISPs and thus reduce the download time. Purandare and Guha [100] proposed
 an alliance based peering scheme to reduce the playback time lag and improve
 the Quality of Service (QoS). Xie et al. [129] proposed a P4P architecture that
 allows cooperative tra?c control between applications and network providers. To
 stimulate selflsh peers to contribute their resources, payment mechanisms [121] [57]
 and reputation schemes [88] [59] are proposed, where peers pay points to receive data
 and earn points by forwarding data to others. However, such payment or reputation
 based mechanisms often demand a centralized architecture and thus hinder their
 scalability.
 Game theory is a mathematical tool that analyzes the strategic interactions
 56
among multiple decision makers. Recently, it draws great attentions in cognitive
 networking [123], multimedia social networking [136], and is being applied to many
 multimedia signal processing problems such as video coding [12] and multimedia
 communications [29]. In P2P networks, peers make intelligent decisions on their
 strategies of requesting and forwarding packets based on their needs and other peers?
 actions. Moreover, since peers are rational and thus naturally selflsh, they have no
 incentive to contribute their resources for other peers. Therefore, it is natural to
 study the intelligent behaviors and interactions of selflsh peers in P2P networks from
 a game theoretic perspective [128] [84]. Using a mental cost to describe the level
 of the peer?s altruism, the authors in [128] presented a game theoretical model to
 analyze nodes? behaviors and the in uence of incentive mechanism. In [84], a game
 theoretic framework is proposed for designing distributed, cheat-proof and attack-
 resistant cooperation stimulation strategies for P2P live streaming social networks.
 Most of the existing schemes treat every peer as an independent individual.
 However, in reality, every peer can have a large number of geographically neighboring
 peers with large intra-group upload and download bandwidths, e.g. the peers in the
 same lab, building, or campus. Here, we name those geographically neighboring
 peers with large intra-group upload and download bandwidths as group peers. To
 reduce the unnecessary traverse links and improve network e?ciency, instead of
 considering each peer?s strategy independently, we investigate possible cooperation
 among the group peers. Moreover, since peers are naturally selflsh, they will act as
 free-riders if doing so can improve their utilities. In such a case, full cooperation
 cannot be guaranteed. Instead, to achieve better payofi, rational peers will adjust
 57
their degree of cooperation by learning from their payofi history. Therefore, a key
 question to answer is: \how a group of selflsh peers should cooperate with each other
 to achieve better streaming performance?"
 The main contributions of this chapter are summarized as follows.
 ? We propose a cooperative streaming scheme to enable cooperation among
 group peers to achieve better streaming performance.
 ? In the proposed scheme, we deflne the utility function of a peer by taking into
 account the possibility of real-time streaming and the cost of acting as a server
 to upload data for the other peers.
 ? Due to their selflsh nature, peers tend to act as free riders to improve their
 own utilities. Moreover, the peers may take out-of-equilibrium strategies due
 to the uncertainty of the strategies of the other peers. Therefore, a robust
 Nash equilibrium (NE) solution is desired for every peer. In this chapter, we
 formulate the cooperative streaming problem as an evolutionary game and
 derive the evolutionarily stable strategy (ESS) for every peer, which is the
 desired stable NE.
 ? To stimulate cooperation, the cooperative streaming scheme should be simple
 since peers may not be willing to join the cooperative streaming if the protocol
 is complicated. The proposed cooperative streaming scheme is very simple.
 Each peer tosses a coin to decide whether to be an agent or a free rider. If the
 outcome is head, the peer acts as an agent to download data from the peers
 outside the group. Otherwise, the peer acts as a free-rider to download data
 58
from the agents. And the probability of being a head for the coin is learned
 from the peer?s own past payofi history.
 ? Due to the highly dynamic behaviors of the peers, i.e., the peers may join or
 leave the P2P network at any time, the cooperative streaming scheme should
 be distributed. We propose a distributed algorithm for every peer to approach
 the ESS by learning from the peer?s own past payofi history.
 The rest of this chapter is organized as follows. In Section 4.1, we describe the
 system model and the utility function. Then, we show in details how to select agents
 in a homogeneous group in Section 4.2. We extend the analysis to the heterogeneous
 case in Section 4.3. In Section 4.4, we propose a distributed learning algorithm for
 ESS. Finally, we show the simulation results in Section 4.5 and draw conclusions in
 Section 4.6.
 4.1 The System Model and Utility Function
 4.1.1 System Model
 As shown in Figure 4.1, there is a set of group peers1 (three in this example)
 who want to view a real-time video streaming simultaneously. Within a group, every
 peer can choose either to be an agent or a normal peer. If the peer serves as an
 agent, he/she not only needs to act as a client to download video data from the
 1How to group the peers itself is an interesting problem. However, in this chapter, we assume
 that the peers have already been grouped and mainly focus on how the group peers cooperate with
 each other to achieve better streaming performance.
 59
Agent
 Group Peers
 Figure 4.1: A cooperative streaming example.
 agents in other groups, but also needs to act as a server to upload video streams
 for both the agents in other groups and the peers in the same group. However, if
 the peer chooses not to be an agent, he/she only needs to download/upload data
 from/to the peers in the same group. We assume that the upload and download
 bandwidth within the group is larger than those cross groups. In such a case, peers
 tend to be a normal peer due to the selflsh nature. Nevertheless, the normal peers,
 on the other hand, take a risk of receiving degraded streaming performance since
 there may not be su?cient agents to download data from other groups. In order
 to achieve good streaming performance through cooperation, a question need to be
 addressed: given a group of peers, which peers should serve as agents.
 60
4.1.2 Utility Functions
 In a P2P network, a peer not only acts as a client to download video data from
 the other peers but also acts as a server to upload video data for the other peers.
 Therefore, while a peer can beneflt from downloading data from the other peers,
 he/she also incurs a cost in uploading data for the other peers, where the cost can
 be resource spending on uploading data, e.g. bandwidth, bufier size.
 Given the group peers, u1, u2, ..., uN , we assume that k peers are willing to
 serve as agents to download multimedia data from the peers outside the group. Let
 the download rate be the transmission speed between an agent and a corresponding
 peer outside the group. If we denote that the download rates of the k agents are r1,
 r2, ..., rk, then the total download rate of the group peers is given by
 yk =
 kX
 i=1
 ri: (4.1)
 Since the agents randomly and independently select peers outside the group for
 downloading data, the download rate ri?s are random variables. According to [68],
 the Cumulative Distribution Function (CDF) of a peer?s download bandwidth can
 be modelled as a linear function, which means that the PDF of a peer?s down-
 load bandwidth can be modelled as a uniform distribution, i.e., ri?s are uniformly
 distributed.
 To provide more insight into the cooperative streaming problem, we flrst con-
 sider a simple scenario without bufiering. Then, we extend our discussion to the
 case when there is bufiering efiect in Section 4.5. For the scenario without bufiering,
 if the total download rate yk is not smaller than the source rate r, then the group
 61
peers can have a real-time streaming, and all the group peers can obtain a certain
 gain G. Otherwise, there will be some delay, and in this case we assume the gain is
 zero. Therefore, given the total download rate yk and the source rate r, if peer ui
 chooses to be an agent, then the utility function of ui is given by
 UA;i(k) = Pr(yk ? r)G? Ci;8k 2 [1; N ]; (4.2)
 where Ci is the cost of ui when he/she serves as an agent, and Pr(yk ? r) is the
 probability of achieving a real-time streaming which can be computed according to
 Theorem 1.
 Since the upload and download bandwidths within the group is large, the cost
 of uploading data to the other peers within the group can be negligible. In such
 a case, if peer ui chooses not to be an agent, then there is no cost for ui and the
 utility function becomes
 UN;i(k) =
 8
 >><
 >>:
 Pr(yk ? r)G; if k 2 [1; N ? 1];
 0; if k = 0.
 (4.3)
 Theorem 1: If r1, r2,..., rk are i.i.d. uniformly distributed within [rL; rU ],
 then Pr(yk ? r) is given by
 Pr(yk ? r) = 12k!
 kX
 l=0
 (?1)l
 0
 BB@
 k
 l
 1
 CCA
 h
 (k ? l)k ? sgn(r^ ? l)(r^ ? l)k
 i
 ; (4.4)
 and when k is su?ciently large, Pr(yk ? r) can be approximated as
 Pr(yk ? r) ? Q
 0
 @ r^ ?
 k
 2q
 k
 12
 1
 A ; (4.5)
 where r^ = r?krLrU?rL and Q(x) is the Gaussian tail function
 R1
 x
 1p2? exp?
 x2
 2 dx.
 62
Proof :Let r^l = rl?rLrU?rL , 8l, then r^1, r^2, ..., r^k are i.i.d. uniformly distributed
 with [0; 1]. And the characteristic function of r^l is given by
 `(t) = i(1? e
 it)
 t : (4.6)
 Let y^k =
 Pk
 l=1 r^l, then the characteristic function of y^k can be computed by
 `y^k(t) =
  i(1? eit)
 t
 ?k
 : (4.7)
 Therefore, the density function of y^k is
 fy^k(y)=F?1t
 " i(1? eit)
 t
 ?k#
 (y)
 = 12(k ? 1)!
 kX
 l=0
 (?1)l
 0
 BB@
 k
 l
 1
 CCA sgn(y ? l)(y ? l)k?1: (4.8)
 Since Pr(yk ? r) = Pr(y^k ? r^), according to (4.8), we have
 Pr(yk ? r)
 = Pr(y^k ? r^) =
 Z 1
 r^
 fy^k(y)dy
 = 12k!
 kX
 l=0
 (?1)l
 0
 BB@
 k
 l
 1
 CCA
 h
 (k ? l)k ? sgn(r^ ? l)(r^ ? l)k
 i
 : (4.9)
 When k is su?ciently large, according to the Central Limit Theory, the dis-
 tribution of y^k can be approximated as Gaussian distribution N(k2 ; k12). Therefore,
 we have
 Pr(yk ? r) = Pr(y^k ? r^) ? Q
 0
 @ r^ ?
 k
 2q
 k
 12
 1
 A : (4.10)
 63
4.2 Agents Selection Within A Homogeneous Group
 In the previous section, we have discussed the system model and the peer?s
 utility function. To optimize the streaming performance, proper peers should serve
 as agents to download data from the peers outside the group. In this section, we
 will discuss how to select agents within a homogeneous group where the cost of all
 peers serving as an agent is assumed to be the same.
 4.2.1 Centralized Agent Selection
 If there is a central controller who can choose which peers should act as agents,
 then a straightforward criterion of selecting proper agents is to maximize the social
 welfare, which is the sum of all peers? utilities.
 Let Ci = C be the cost of a peer serving as an agent in a homogeneous group.
 Then the social welfare of an N ? peer group with k agents can be calculated by
 SW (k) = Pr(yk ? r)GN ? kC: (4.11)
 Based on (4.11), the agent selection problem to maximize the social welfare
 can be formulated as
 max
 k
 SW (k) = max
 k
 [Pr(yk ? r)GN ? kC] ; (4.12)
 where k 2 f1; 2; :::; Ng.
 By solving (4.12), we can flnd the optimal k? that maximizes the social wel-
 fare. Then, the central controller can choose k? peers from the group as agents to
 download data from the peers outside the group based on some mechanism, e.g.
 64
the peers take turns to serve as agents. However, since peers? behaviors are highly
 dynamic, they may join in or leave the P2P network at any time. In such a case,
 the centralized approach may not be practical.
 4.2.2 Distributed Agent Selection
 To overcome the drawback of the centralized approach, it is possible to consider
 a distributed approach where each peer acts as an agent with probability x. Then,
 according to (4.2) and (4.3), the group?s social welfare can be computed by
 Utotal(x)=
 NX
 i=1
 0
 BB@
 N
 i
 1
 CCAxi(1? x)N?i
 h
 Pr(yi ? r)GN ? iC
 i
 : (4.13)
 The problem of flnding an optimal x to maximize the social welfare can be
 formulated as
 max
 x
 NX
 i=1
 0
 BB@
 N
 i
 1
 CCAxi(1? x)N?i
 h
 Pr(yi ? r)GN ? iC
 i
 s:t: 0 ? x ? 1: (4.14)
 However, since peers are selflsh by nature, they are not as cooperative as a
 system designer/controller desires. By solving (4.14), we can flnd the optimal x?
 that maximizes the social welfare, but x? can not maximize each peer?s own utility.
 Therefore, the social welfare maximizer x? is not attainable when peers are selflsh.
 Moreover, the solution to the optimization problem shown in (4.14) is not stable
 since any perturbation will lead to a new solution.
 65
4.2.3 Evolutionary Cooperative Streaming Game
 In order to provide a robust equilibrium strategy for the selflsh peers, we adopt
 the concept of Evolutionarily Stable Strategy (ESS) [110] [122], which is deflned as
 follows.
 Deflnition 1: A strategy a? is an ESS if and only if, 8a 6= a?, a? satisfles
 ? equilibrium condition: Ui(a; a?) ? Ui(a?; a?), and
 ? stability condition: if Ui(a; a?) = Ui(a?; a?), Ui(a; a) < Ui(a?; a),
 where Ui(a1; a2) is the utility of player i when he/she uses strategy a1 and another
 player uses strategy a2.
 Since all peers are selflsh, they will cheat if cheating can improve their payofis,
 which means that all peers are uncertain of other peers? actions and utilities. In such
 a case, to improve their utilities, peers will try difierent strategies in every play and
 learn from the strategic interactions using the methodology of understanding-by-
 building. During the process, the percentage of peers using a certain pure strategy
 may change. Such a population evolution can be modelled by replicator dynamics.
 Speciflcally, let xa stand for the probability of a peer using pure strategy a 2 A,
 where A = fA;Ng is the set of pure strategies including being an agent (A) and
 not being an agent (N). By replicator dynamics, the evolution dynamics of xa are
 given by the following difierential equation
 _xa = ?[ ?U(a; x?a)? ?U(xa)]xa; (4.15)
 where ?U(a; x?a) is the average payofi of the peers using pure strategy a, x?a is the
 66
set of peers who use pure strategies other than a, ?U(xa) is the average payofi of all
 peers, and ? is a positive scale factor.
 From (4.15), we can see that if adopting pure strategy a can lead to a higher
 payofi than the average level, the probability of a peer using a will grow and the
 growth rate _xa=xa is proportional to the difierence between the average payofi of
 using strategy a (i.e., ?U(a; x?a)) and the average payofi of all peers (i.e., ?U(xa)).
 4.2.4 Analysis of the Cooperative Streaming Game
 According to (4.2) and (4.3), the average payofi of a peer if he/she choose to
 be an agent can be computed by
 ?UA(x) =
 N?1X
 i=0
 0
 BB@
 N?1
 i
 1
 CCAxi(1? x)N?1?i
 h
 Pr(yi+1 ? r)G? C
 i
 ; (4.16)
 where x is the probability of a peer being an agent, and
  
N?1
 i
 ?
 xi(1? x)N?1?i is the
 probability that there are i agents out of N ? 1 other peers.
 Similarly, the average payofi of a peer if he/she chooses not to be an agent is
 given by
 ?UN(x) =
 N?1X
 i=1
 0
 BB@
 N?1
 i
 1
 CCAxi(1? x)N?1?iPr(yi ? r)G: (4.17)
 According to (4.16) and (4.17), the average payofi of a peer is
 ?U(x) = x ?UA(x) + (1? x) ?UN(x): (4.18)
 Substituting (4.18) back to (4.15), we have
 _x = ?x(1? x)[ ?UA(x)? ?UN(x)]: (4.19)
 67
0 5 10 15 20
 0
 0.1
 0.2
 0.3
 0.4
 0.5
 0.6
 0.7
 0.8
 0.9
 i
 w
 i
 Figure 4.2: The deceasing property of wi.
 At equilibrium x?, no player will deviate from the optimal strategy, which
 means _x? = 0, and we can obtain x? = 0, 1, or the solutions to ?UA(x) = ?UN(x).
 However, since _x? = 0 is only the necessary condition for x? to be ESS, we examine
 the su?cient condition for each ESS candidate and draw the following conclusions
 with the proofs shown in Theorem 2-4.
 ? x? = 0 is an ESS only when Pr(y1 ? r)G? C ? 0.
 ? x? = 1 is an ESS only when Pr(yN ? r)G? Pr(yN?1 ? r)G ? C.
 ? Let x? be the solution to ?UA(x) = ?UN(x), and x? 2 (0; 1). Then, x? is an ESS.
 Lemma 1: Let f(x) = ?UA(x) ? ?UN(x), then f 0(x) < 0, 8x 2 [0; 1]. Proof :
 68
According to (4.16) and (4.17), we have
 f(x) =
 N?1X
 i=0
 0
 BB@
 N?1
 i
 1
 CCAxi(1? x)N?1?iwi ? C; (4.20)
 where wi = [Pr(yi+1 ? r)? Pr(yi ? r)]G.
 ? 8x 2 (0; 1), by taking the derivative of f(x) over x, we have
 f 0(x) =
 N?1X
 i=0
 0
 BB@
 N?1
 i
 1
 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi;
 =
 i1X
 i=0
 0
 BB@
 N?1
 i
 1
 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi
 +
 N?1X
 i=i1+1
 0
 BB@
 N?1
 i
 1
 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi;
 (4.21)
 where i1 is the integer such that i1 ? (N ? 1)x and i1 + 1 > (N ? 1)x.
 Since wi stands for the additional gain by introducing one more agent into
 the i-agent system, as shown in Fig. 4.2, it is a decreasing function in terms
 of i, which means that wi ? wi1 ;8i ? i1 and wi ? wi1 ;8i > i1. Therefore,
 69
according to (4.21), we have
 f 0(x) <
 i1X
 i=0
 0
 BB@
 N?1
 i
 1
 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi1
 +
 N?1X
 i=i1+1
 0
 BB@
 N?1
 i
 1
 CCAxi?1(1? x)N?2?i[i? (N ? 1)x]wi1 ;
 = wi1
 N?1X
 i=0
 0
 BB@
 N?1
 i
 1
 CCAxi?1(1? x)N?2?i[i? (N ? 1)x];
 = wi1
 d
 2
 664
 PN?1
 i=0
 0
 BB@
 N?1
 i
 1
 CCAxi(1? x)N?1?i
 3
 775
 dx ;
 = 0: (4.22)
 Therefore, f 0(x) < 0, 8x 2 (0; 1).
 ? The derivative of f(x) over x at x = 0 can be computed by
 f 0(0) = lim
 "!0
 f(")? f(0)
 "
 = lim
 "!0
 PN?1
 i=0
 0
 BB@
 N?1
 i
 1
 CCA "i(1? ")N?1?iwi ? w0
 "
 = lim
 "!0
 (1? ")N?1w0 ? w0
 " + lim"!0
 (N ? 1)"(1? ")N?2w1
 "
 = (N ? 1)(w1 ? w0)
 < 0: (4.23)
 where the last inequality comes from the fact that wi is a decreasing function
 in terms of i.
 70
? Similarly, the derivative of f(x) over x at x = 1 can be computed by
 f 0(1) = lim
 "!0
 f(1)? f(1? ")
 "
 = lim
 "!0
 wN?1 ?
 PN?1
 i=0
 0
 BB@
 N?1
 i
 1
 CCA (1? ")i"N?1?iwi
 "
 = lim
 "!0
 wN?1 ? (1? ")N?1wN?1
 " + lim"!0
 ?(N ? 1)(1? ")N?2"wN?2
 "
 = (N ? 1)(wN?1 ? wN?2)
 < 0: (4.24)
 where the last inequality comes from the fact that wi is a decreasing function
 in terms of i.
 In all, f 0(x) < 0, 8x 2 [0; 1]. This completes the proof of the lemma.
 Theorem 2: The condition for x? = 0 to be an ESS is Pr(y1 ? r)G?C ? 0.
 Proof : According to (4.16-4.18), the utility that a peer using mixed strategy
 x and the other peers use mixed strategy x? = 0 can be written as
 ?U(x; 0) = ?UN(0) + ( ?UA(0)? ?UN(0))x;
 where ?UA(0) = Pr(y1 ? r)G? C and ?UN(0) = 0.
 ? If Pr(y1 ? r)G? C > 0, i.e. ?UA(0) > ?UN(0), every peer will deviate to x = 1
 to obtain ?UA(0) rather than ?UN(0).
 ? If Pr(y1 ? r)G? C < 0, i.e. ?UA(0) < ?UN(0), every peer will stay at x = 0 to
 obtain ?UN(0) rather than ?UA(0).
 71
? If Pr(y1 ? r)G ? C = 0, i.e. ?UA(0) = ?UN(0), then ?U(x; 0) = 0 8x, and
 f(0) = ?UA(0)? ?UN(0) = 0. According to Lemma 1, we know that f 0(x) < 0
 8x 2 [0; 1], so f(x) = ?UA(x)? ?UN(x) < 0. In such a case, ?U(0; x) = ?UN(x) >
 ?U(x; x) = ?UN(x)+( ?UA(x)? ?UN(x))x, which means x? = 0 is an ESS according
 to Deflnition 1.
 Therefore, x? = 0 is an ESS only when Pr(y1 ? r)G?C ? 0.
 Theorem 3: The condition for x? = 1 to be an ESS is Pr(yN ? r)G ?
 Pr(yN?1 ? r)G ? C.
 Proof : According to (4.16-4.18), the utility that a peer using mixed strategy
 x and the other peers use mixed strategy x? = 1 can be written as
 ?U(x; 1) = ?UN(1) + ( ?UA(1)? ?UN(1))x;
 where ?UA(1) = Pr(yN ? r)G? C and ?UN(1) = Pr(yN?1 ? r)G.
 ? If Pr(yN ? r)G ? Pr(yN?1 ? r)G < C, i.e., ?UN(1) > ?UA(1), every peer will
 deviate to x = 0 to obtain ?UN(1) rather than ?UA(1).
 ? If Pr(yN ? r)G ? Pr(yN?1 ? r)G > C, i.e., ?UN(1) < ?UA(1), every peer will
 stay at x = 1 to obtain ?UA(1) rather than ?UN(1).
 ? If Pr(yN ? r)G ? Pr(yN?1 ? r)G = C, i.e. ?UN(1) = ?UA(1), then ?U(x; 1) =
 ?UN(1) 8x, and f(1) = ?UA(1)? ?UN(1) = 0. According to Lemma 1, we know
 that f 0(x) < 0 8x 2 [0; 1], so f(x) = ?UA(x) ? ?UN(x) > 0. In such a case,
 ?U(1; x) = ?UN(x)+ ( ?UA(x)? ?UN(x))1 > ?U(x; x) = ?UN(x) + ( ?UA(x)? ?UN(x))x,
 which means x? = 1 is an ESS according to Deflnition 1.
 72
Therefore, x? = 1 is an ESS only when Pr(yN ? r)G? Pr(yN?1 ? r)G ? C.
 Theorem 4: If x? 2 (0; 1) is a solution to ?UA(x) = ?UN(x), then x? is an ESS.
 Proof : Let ?Ui(x; x?) be the utility of player i when player i uses mixed strategy
 x and other users use mixed strategy x?. Then, we have
 ?Ui(x; x?) = x ?UA(x?) + (1? x) ?UN(x?): (4.25)
 Since x? is a solution to ?UA(x) = ?UN(x), we have ?UA(x?) = ?UN(x?). Therefore,
 (4.25) becomes
 ?Ui(x; x?) = ?UA(x?) = ?Ui(x?; x?); (4.26)
 which means x? satisfles the equilibrium condition shown in Deflnition 1.
 Moreover, according to (4.18), we have
 ?Ui(x; x) = ?UN(x) + ( ?UA(x)? ?UN(x))x; (4.27)
 and
 ?Ui(x?; x) = ?UN(x) + ( ?UA(x)? ?UN(x))x?: (4.28)
 Therefore, we have
 ?Ui(x?; x)? ?Ui(x; x) = ( ?UA(x)? ?UN(x))(x? ? x): (4.29)
 From Lemma 1, we know that f(x) = ?UA(x) ? ?UN(x) is a monotonically
 decreasing function. Since ?UA(x?) = ?UN(x?), ?UA(x) ? ?UN(x) > 0 if x < x?, and
 ?UA(x)? ?UN(x) < 0 if x > x?. Therefore, ( ?UA(x)? ?UN(x))(x??x) > 0, 8x 6= x?, i.e.
 ?Ui(x?; x) > ?Ui(x; x);8x 6= x?; (4.30)
 73
which means x? satisfles the stability condition shown in Deflnition 1.
 According to (4.26) and (4.30), we know that x? is an ESS.
 4.3 Agents Selection Within A Heterogeneous Group
 In this section, we will discuss how to select agents within a heterogeneous
 group where the costs of the peers acting as agents are difierent.
 Let xi;ai stands for the probability of peer ui using pure strategy ai 2 A.
 By replicator dynamics, the evolution dynamics of xi;ai are given by the following
 difierential equation
 _xi;ai = ?[ ?Ui(ai; x?i)? ?Ui(xi)]xi;ai ; (4.31)
 where ?Ui(ai; x?i) is the average payofi of peer ui using pure strategy ai, ?Ui(xi) is the
 average payofi of peer ui using mixed strategy xi, and ? is a positive scale factor.
 Since it is generally very di?cult to represent ?Ui(ai; x?i) and ?Ui(xi) in a com-
 pact form, in the following, we flrst analyze a two-player game to gain some insight.
 Then, we generalize the observation in the two-player game to the multi-player game.
 Table 4.1: Utility table of a two-player game.
 74
4.3.1 Two-Player Game
 Let x1 and x2 be the probability of u1 and u2 being an agent, respectively.
 Let B1 = Pr(y1 ? r)G and B2 = Pr(y2 ? r)G. Then, the payofi matrix of u1 and
 u2 can be written as in Table 4.1. Therefore, the average payofi ?U1(A; x2) can be
 computed by
 ?U1(A; x2) = (B2 ? C1)x2 + (B1 ? C1)(1? x2); (4.32)
 and the average payofi ?U1(x1) becomes
 ?U1(x1) = (B2 ? C1)x1x2 + (B1 ? C1)x1(1? x2) +B1(1? x1)x2: (4.33)
 According to (4.31), the replicator dynamics equation of u1 is given by
 _x1 = ?x1(1? x1) [B1 ? C1 ? (2B1 ?B2)x2] : (4.34)
 Similarly, the replicator dynamics equation of u2 can be computed by
 _x2 = ?x2(1? x2) [B1 ? C2 ? (2B1 ?B2)x1] : (4.35)
 At equilibrium, we know that _x1 = 0 and _x2 = 0. According to (4.34) and
 (4.35), we can get flve equilibria: (0; 0), (0; 1), (1; 0), (1; 1), and the mixed strategy
 equilibrium
 ?
 B1?C2
 2B1?B2 ; B1?C12B1?B2
 ?
 .
 According to [41], if an equilibrium of the replicator dynamics equations is a
 locally asymptotically stable point in a dynamic system, it is an ESS. Therefore,
 by viewing (4.34) and (4.35) as a nonlinear dynamic system and analyzing the
 corresponding Jacobian matrix, we can examine whether the flve equilibria are ESSs.
 75
By taking partial derivatives of (4.34) and (4.35), the Jacobian matrix can be written
 as
 J =
 2
 664
 @ _x1
 @x1
 @ _x1
 @x2
 @ _x2
 @x1
 @ _x2
 @x2
 3
 775 = ?
 2
 664
 J11 J12
 J21 J22
 3
 775 ; (4.36)
 where J11 = (1 ? 2x1)(B1 ? C1 ? (2B1 ? B2)x2), J12 = ?x1(1 ? x1)(2B1 ? B2),
 J21 = ?x2(1? x2)(2B1 ?B2), and J22 = (1? 2x2)(B1 ? C2 ? (2B1 ?B2)x1).
 The asymptotical stability requires that det(J) > 0 and tr(J) < 0 [41]. Sub-
 stituting the flve equilibria, i.e. (0; 0), (0; 1), (1; 0), (1; 1), and
 ?
 B1?C2
 2B1?B2 ; B1?C12B1?B2
 ?
 , to
 (4.36), we conclude that
 ? If B2?B1?C1 > 0 and B2?B1?C2 > 0, there is a unique ESS (1; 1), where
 both u1 and u2 converge to be agents.
 ? Elseif B2 ? B1 ? C1 > 0 and B2 ? B1 ? C2 < 0, there is a unique ESS (1; 0),
 where u1 converges to be an agent and u2 converges to be a free-rider.
 ? Elseif B2 ? B1 ? C1 < 0 and B2 ? B1 ? C2 > 0, there is a unique ESS (0; 1),
 where u2 converges to be an agent and u1 converges to be a free-rider.
 ? Else, there are two ESSs (0; 1) and (1; 0), where the converged strategy proflles
 depends on the initial strategy proflles.
 From the above analysis, we can see that when the gain of being an agent
 (B2 ? B1) is greater than the cost of being an agent (C1 or C2), the peer tends to
 be an agent. And the peer with a higher cost tends to be a free-rider and rely on
 the peer with a lower cost.
 76
Algorithm 2 : A Distributed Learning Algorithm For ESS
 1. Given the step size ? and the slot index t = 0, each peer ui initializes xi with
 xi(0).
 2. During slot t, for q = 1 : M ,
 ? ui tosses a coin with probability xi(t) being head. If the outcome is head,
 ui serves as an agent and downloads data from the peers outside the group
 with download rate ri(t; q). On the other hand, if the outcome is tail, ui
 acts as a free-rider and downloads the data from the agents.
 ? ui computes his/her utility using (4.39).
 ? ui computes the indicator function using (4.38).
 3. Then, ui approximates ?Ui(A; x?i(t)) and ?Ui(xi(t)) using (4.40) and (4.41).
 4. Finally, ui updates the probability of being an agent xi(t) using (4.37).
 4.3.2 Multi-Player Game
 From the analysis of the two-player game, we can infer that the peer with
 a higher cost (Ci) tends to take advantage of the peer with a lower cost. This
 observation can be extended to multi-player game. If there are more than two peers
 in the game, the strategy of the peers with higher C 0is will converge to \N" with
 a greater probability. The peers with lower C 0is tend to be agents since they sufier
 relatively heavier losses if no one serves as an agent.
 77
4.4 A Distributed Learning Algorithm For ESS
 From the previous two sections, we can see that the ESS can be found by solv-
 ing the replicator dynamics equations ((4.19) or (4.31)). However, solving the repli-
 cator dynamics equations require the exchange of private information and strategies
 adopted by other peers. In this section, we will present a distributed learning algo-
 rithm that can gradually converge to ESS without information exchange.
 We flrst discretize the replicator dynamics equation shown in (4.31) as
 xi(t+ 1) = xi(t) + ?
 ? ?Ui(A; x?i(t))? ?Ui(xi(t))
 ?xi(t); (4.37)
 where t is the slot index and xi(t) is the probability of ui being an agent during slot
 t. Here, we assume that each slot can be further divided into M subslots and each
 peer can choose to be an agent or not at the beginning of each subslot.
 From (4.37), we can see that in order to update xi(t + 1), we need to flrst
 compute ?Ui(A; x?i(t)) and ?Ui(xi(t)). Let us deflne an indicator function 1i(t; k) as
 1i(t; q)=
 8
 >><
 >>:
 1; if ui is an agent at subslot q in slot t,
 0; else,
 (4.38)
 where q is the subslot index.
 The immediate utility of ui at subslot q in slot t can be computed by
 Ui(t; q) =
 8
 >>>>>>>>>><
 >>>>>>>>>>:
 G? Ci; if ui is an agent and rt ? r,
 ?Ci; if ui is an agent and rt < r,
 G; if ui is not an agent and rt ? r,
 0; if ui is not an agent and rt < r,
 (4.39)
 78
where rt is the total download rate of the agents and r is the source rate.
 Then, ?Ui(A; x?i(t)) can be approximated using
 ?Ui(A; x?i(t)) =
 PM
 q=1 Ui(t; q)1i(t; q)PM
 q=1 1i(t; q)
 ; (4.40)
 Similarly ?Ui(xi(t)) can be approximated as
 ?Ui(xi(t)) = 1M
 MX
 q=1
 Ui(t; q): (4.41)
 Based on (4.37-4.41), ui can gradually learn the ESS. In Algorithm 2, we
 summarize the detailed procedures of the proposed distributed learning algorithm.
 4.5 Simulation Results
 In all simulations, the parameters G, rL, and rU are set to be 1, 50, and
 800, respectively. For convenience, in the rest of this chapter, we denote the cen-
 tralized approach maximizing the social welfare shown in (4.12) as MSW-C, the
 distributed approach maximizing the social welfare shown in (4.14) asMSW-D, and
 the ESS-based approach as ESS-D. We compare the proposed methods with the
 traditional P2P non-cooperation method, denoted as Non-Coop. In Non-Coop,
 each peer acts as an individual and randomly selects some peers for downloading
 video streams. Such a protocol has been widely used in the existing P2P systems,
 e.g., Coolstreaming [133] and PPLive [4].
 In the flrst simulation, we show the social welfare (the sum of all peers? util-
 ities) comparison among difierent approaches, where we assume that there are 20
 homogenous peers and the cost C is 0:1. As show in Fig. 4.3, MSW-C achieves the
 79
best social welfare performance since its objective function is to maximize the social
 welfare with pure strategy. By using the mixed strategy to maximize the social
 welfare, MSW-D achieves the second best social welfare performance. However, as
 discussed in Section 4.2.2, the solution to MSW-D is not stable. With ESS-D, a
 stable NE solution can be obtained at the cost of a slight loss in social welfare. Nev-
 ertheless, all three proposed algorithms perform much better than the Non-Coop
 method. In Non-Coop, the social welfare performance decreases linearly in terms
 of the source rate. With cooperation and adaptively selecting the proper number of
 agents, all three proposed algorithms can preserve a high social welfare performance
 even with a large source rate.
 In the second simulation, we evaluate the convergence property of the ESS-D.
 In Fig. 4.4, we show the replicator dynamic of the cooperation streaming game with
 homogeneous peers, where C = 0:1 and r = 500. We can see that starting from
 a high initial value, all peers gradually reduce their probabilities of being an agent
 since being a free-rider more often can bring a higher payofi. However, since too low
 a probability of being an agent increases the chance of having no peer be an agent,
 the probability of being an agent will flnally converge to a certain value which is
 determined by the number of peers.
 In Fig. 4.5, we show the replicator dynamic of the cooperation streaming game
 with 20 heterogeneous peers, where r = 500 and the cost Ci is randomly chosen from
 [0:1; 0:3]. We further assume that Ci is monotonically increasing in i where u1 has
 the lowest cost and u20 has the highest cost. From Fig. 4.5, we can see that the peers
 with lower costs (u1, u2, and u3 in this simulation) converge to be an agent while
 80
150 200 250 300 350 400 450 500 550 600
 2
 4
 6
 8
 10
 12
 14
 16
 18
 20
 Source Rate (kb/s)
 Social Welfar
 e
  
 
Non?Coop
 MSW?C
 MSW?D
 ESS?D
 Figure 4.3: The social welfare comparison among Non-Coop, MSW-C, MSW-D,
 and ESS-D.
 the peers with higher costs (u4 ? u20 in this simulation) converge to be a free-rider.
 This observation coincides with our conclusion in Section 4.3.2, which is \the peers
 with lower costs tend to be an agent since they sufier relatively higher losses if no
 one serves as an agent". Note that due to the space limitation, we only show the
 behavior dynamics of u1 ? u4. All other peers u5 ? u20 have the similar behavior
 dynamics with u4, and they all converge to be free-riders.
 In the third simulation, we compare the performance of Non-Coop and ESS-
 D in terms of the probability of real-time streaming, which is deflned as the proba-
 bility that the total download rate is greater than the source rate. The simulation
 results are shown in Fig. 4.6. We can see that with cooperation, the probability of
 81
0 100 200 300 400 500 600
 0.2
 0.25
 0.3
 0.35
 0.4
 0.45
 0.5
 Time interval
 Probability of being an agen
 t
  
 
N=10
 N=15
 N=20
 Figure 4.4: Behavior dynamic of a homogeneous group of peers.
 real-time streaming can be signiflcantly improved especially at the high source rate
 region. We also flnd that at the high source rate region, the probability of real-time
 streaming increases as N increases.
 The visual quality comparison between Non-Coop and ESS-D is shown in
 Fig. 4.7. In this simulation, we flx the probability of real-time streaming to be
 0.85. According to Fig. 4.6, we can see that the corresponding source rates for
 \Non-Coop", \ESS-D with N=2", \ESS-D with N=3", and \ESS-D with N=4"
 are around 100kb/s, 300kb/s, 520kb/s, and 720kb/s, respectively. By setting the
 above source rates as the target bitrates, we encode the Foreman sequence with CIF
 format using H.264 encoder. From Fig. 4.7, we can see that the video visual quality
 with the proposed ESS-D is much better than that with Non-Coop.
 Then, we show the simulation result of the source rate versus the utility. As
 82
0 50 100 150 200 250 300 350 400
 0
 0.5
 1
 Probability of being an agent
 Time interval
 Peer 
u 1
 0 50 100 150 200 250 300 350 400
 0
 0.5
 1
 Probability of being an agent
 Time interval
 Peer 
u 2
 0 50 100 150 200 250 300 350 400
 0
 0.5
 1
 Probability of being an agent
 Time interval
 Peer 
u 3
 0 50 100 150 200 250 300 350 400
 0
 0.5
 Probability of being an agent
 Time interval
 Peer 
u 4
 Figure 4.5: Behavior dynamic of a heterogeneous group of peers.
 shown in Fig. 4.8, without cooperation, if the peer requires a utility around 0.8, the
 source rate can not be larger than 130 kb/s. However, with cooperation, the source
 rate can be more than 400 kb/s even when there are only 2 peers. Therefore, with
 cooperation, the peers can enjoy much higher quality video with the same utility.
 In the fourth simulation, we consider the case that the peers in the same group
 are viewing multiple channels with L being the number of the channels. We assume
 that the source rate is the same for all channels and there are 20 homogenous peers
 with the cost C = 0:1. Similar to the View-Upload Decoupling (VUD) scheme [127],
 the uploading and downloading are decoupled in the proposed ESS-D algorithm in
 this case. We allow cooperation among all the peers where the agent may download
 83
0 500 1000 1500
 0
 0.1
 0.2
 0.3
 0.4
 0.5
 0.6
 0.7
 0.8
 0.9
 1
 Source rate (kb/s)
 Probability of real?time streamin
 g
  
 
Non?Coop
 ESS?D with N=2
 ESS?D with N=3
 ESS?D with N=4
 ESS?D with N=5
 Figure 4.6: The probability of real-time streaming comparison between Non-Coop
 and ESS-D.
 source data that he/she is not viewing. As shown in Fig. 4.9, without cooperation,
 if the peer requires a utility around 0.8, the source rate can not be larger than 130
 kb/s in the Non-Coop method. However, with the proposed ESS-D algorithm, the
 source rate can be around 240kb/s even when the peers are view 8 difierent channels.
 This phenomenon fully demonstrates the e?ciency of the proposed method.
 In the last simulation, we consider the scenario when there is bufiering efiect.
 In such a scenario, the gain in the utility will not drop to zero when the total
 download rate is smaller than the source rate. Instead, the gain should maintain
 a positive value due to the existence of bufiers. One possible utility function that
 84
(a) (b)
 (c) (d)
 Figure 4.7: The visual quality comparison: (a) Non-Coop; (b) ESS-D with N=2;
 (c) ESS-D with N=3; (d) ESS-D with N=4.
 considers the bufiering efiect is
 UA;i(k) = 1ln(r)E [ln(yk)]G? Ci;8k 2 [1; N ];
 UN;i(k) =
 8
 >><
 >>:
 1
 ln(r)E [ln(yk)]G; if k 2 [1; N ? 1];
 0; if k = 0.
 (4.42)
 From the above utility function, we can see that for any given source rate
 r, the gain increases as the total download rate yk increases. Moreover, since the
 probability of playback delay becomes smaller with more data in the bufier, a certain
 85
0.4 0.5 0.6 0.7 0.8 0.9 1
 50
 100
 150
 200
 250
 300
 350
 400
 450
 500
 Utility
 Source Rate (kb/s
 )
  
 
Non?Coop
 ESS?D with N=2
 ESS?D with N=3
 Figure 4.8: Single-source rate comparison between Non-Coop and ESS-D.
 increase in the high yk region should lead to a less signiflcant gain than that in the
 low yk region [30]. Here, we use the ln(:) function to characterize such properties.
 Nevertheless, other functions that have similar properties can also be used.
 The social welfare comparison between Non-Coop and ESS-D with the util-
 ity function in (4.42) is shown in Fig. 4.10. From Fig. 4.10, we can see that when
 the utility function in (4.42) is used, the social welfare performance of Non-Coop
 no longer decreases linearly in terms of the source rate. This phenomenon is mainly
 because, with the existence of bufiers, the gain will not drop to zero when the total
 download rate is smaller than the source rate. Nevertheless, ESS-D can still lead
 to a much higher social welfare performance for all source rates, compared with
 86
0.4 0.5 0.6 0.7 0.8 0.9 1
 50
 100
 150
 200
 250
 300
 350
 400
 450
 500
 Utility
 Source Rate (kb/s
 )
  
 
Non?Coop
 ESS?D with L=4
 ESS?D with L=6
 ESS?D with L=8
 ESS?D with L=10
 Figure 4.9: Multi-source rate comparison between Non-Coop and ESS-D.
 Non-Coop. Moreover, we should notice that all the analysis in Section 4.2 is still
 applicable to the utility function in (4.42).
 4.6 Summary
 In this chapter, we propose a cooperative streaming scheme to address the
 network ine?ciency problem encountered by the traditional non-cooperative P2P
 schemes. We answer the question of \how a group of selflsh peers with large intra-
 group upload and download bandwidths cooperate with each other to achieve better
 streaming performance" by formulating the problem as an evolutionary game and
 deriving the ESS for every peer. We further propose a distributed learning algorithm
 87
200 300 400 500 600
 16
 18
 20
 22
 24
 26
 28
 Source Rate (kb/s)
 Social Welfar
 e
  
 
Non?Coop
 ESS?D
 Figure 4.10: The social welfare comparison between Non-Coop and ESS-D when
 the utility function is deflned as (4.42).
 for each peer to converge to the ESS by learning from his/her own past payofi
 history. From the simulation results, we can see that compared with the traditional
 non-cooperative P2P schemes, the proposed algorithm achieves much better social
 welfare, higher probability of real-time streaming, and better video quality (higher
 source rate). Moreover, by incorporated with the recent proposed View-Upload
 Decoupling (VUD) scheme, the proposed cooperative streaming scheme allows the
 peers who are viewing difierent videos to cooperate with each other and mutually
 improve the streaming performance.
 88
Chapter 5
 Cooperation Stimulation Using Indirect Reciprocity Game Modeling
 A cognitive network is a network composed of elements that can dynamically
 adapt to varying network conditions to optimize end-to-end performance through
 learning and reasoning [116]. In such a network, nodes are intelligent and have
 the ability to observe, learn, and act to optimize their performance. Since nodes
 generally belong to difierent authorities and pursue difierent goals, fully cooperative
 behaviors, such as unconditionally forwarding packets for each other, cannot be
 taken for granted. Instead, nodes will only cooperate with others when cooperation
 can improve their own performance. We regard the nodes with such behaviors as
 selflsh nodes. Therefore, a key problem in cognitive networks is how to stimulate
 cooperation among selflsh nodes.
 In the literature, many schemes have been proposed to stimulate node co-
 operation for difierent cognitive networks, such as [23] [138] for ad hoc networks
 and [121] [57] for peer-to-peer networks. One way to stimulate cooperation among
 selflsh nodes is to use payment based methods [139] [14]. Although these schemes
 can achieve promising cooperation stimulation results, the requirement of tamper-
 proof hardware or central billing services greatly limits their potential applications.
 Another way to stimulate cooperation among selflsh nodes is to use reputation-
 based methods with necessary monitoring [130] [88] [59]. Marti et. al [89] propose
 89
a mechanism, called \watchdog", to identify the misbehaving nodes and another
 mechanism, called \pathrater", to de ect the tra?cs around them. The major
 drawback of their method is that misbehaving nodes are not punished. Therefore,
 there is no incentive for the nodes to cooperate. To overcome this problem, Bucheg-
 ger and Boudec [22] as well as Michiardi and Molva [90] propose reputation-based
 mechanisms to enforce node cooperation. In both approaches, nodes observe the be-
 havior of each other, store this information locally, and distribute this information
 in reputation reports. According to their observations, nodes isolate the misbehav-
 ing nodes by denying forwarding packets to them. However, there is no theoretical
 justiflcation about the optimality of such approaches.
 Recently, efiorts have been made to mathematically analyzing cooperation in
 cognitive networks using game theory [91] [42] [84] [128]. Srinivasan et al. [112]
 propose to use generous TIT-FOR-TAT strategy while Urpi et al. [118] propose to
 use Bayesian games. In [53], Felegyhazi et al. investigate equilibrium conditions
 of packet forwarding strategies based on game theory and graph theory by taking
 into account the network topology. In [131], Yu and Liu propose a game theoretic
 framework to jointly analyze cooperation stimulation and security in autonomous
 mobile ad hoc networks. Their results show that, for a two-player packet forwarding
 game, the unique cheat-proof Nash equilibrium for every node is not to help the
 opponent more than the opponent has helped him/her.
 However, most of the existing game theoretical frameworks rely on the assump-
 tion that the game between a pair of players is directly played for inflnite times. In
 reality, due to mobility or changes of environment, nodes will periodically update
 90
their partners to achieve better performance, which means that any pair of players
 are supposed to play for only flnite times with the termination time are either known
 or can be estimated by both players. Note that every player can experience inflnite
 times with many players but never always with the same partner. In such a case,
 according to the well known Prisoner?s Dilemma and backward induction princi-
 ple [96], the only optimal strategy is to always play non-cooperatively. The major
 reason causing such a non-cooperative optimal strategy is the implicit assumption of
 direct reciprocity in most games, where the action of a player taking towards his/her
 opponent is purely determined by the history of how the opponent treats him/her.
 Obviously, under such a scenario, all players have no incentive to play cooperatively
 since their behaviors will not be evaluated by other players except their opponents.
 To stimulate the plays? incentive to play cooperatively, not only the evaluations
 from the opponents but also the evaluations from other observers should be taken
 into account, which leads to the notion of \indirect reciprocity". Indirect reciprocity
 is a key mechanism for the evolution of human cooperation and has recently drawn
 a lot of attentions in the area of social science and evolutionary biology [93] [94].
 The key concept of indirect reciprocity is \I help you not because you have helped
 me but because you have helped others". In this chapter, we propose to use the
 indirect reciprocity game modelling to stimulate cooperation among selflsh nodes
 for the scenario where the number of interactions between any pair of players are
 flnite. The main contributions of this chapter are summarized as follows.
 ? We propose a cooperation stimulation scheme to stimulate cooperation among
 91
selflsh users in cognitive networks using indirect reciprocity game modelling.
 Difierent from the existing game-theoretic approaches, our proposed scheme
 does not rely on the assumption that the number of interactions between a
 pair of players are inflnite.
 ? In the proposed scheme, we flrst develop the concept of reputation distribu-
 tion to capture not only the mean behavior of the transmitter?s reputation but
 also all likelihoods of the transmitter?s reputation that may be. Then, we de-
 velop a reputation updating policy for the receiver and observers to update the
 transmitter?s reputation distribution based on the transmitter?s previous rep-
 utation distribution and his/her action toward the receiver. We also propose
 a gradient descent algorithm to flnd the stationary reputation distribution of
 the whole population for any given optimal action rule.
 ? In the proposed scheme, we formulate the problem of flnding the optimal
 action rule as a Markov Decision Process (MDP) and proposed a modifled
 value iteration algorithm to flnd the optimal action rule.
 ? We show that with an appropriate cost-to-gain ratio, the strategy of forwarding
 the number of packets that is equal to the reputation level of the receiver is
 an evolutionarily stable strategy (ESS). We also show that even with only 60
 percentage of population adopting the optimal action rule at the beginning,
 by natural selection, the optimal action rule will quickly spread over the whole
 population. And once the whole population use the optimal action rule, no
 one will deviate. Moreover, we flnd that such an ESS will lead to a \good"
 92
Population
 Observers
 Transmitter Receiver
 Figure 5.1: System model. Within every interaction, a pair of transmitter and re-
 ceiver is randomly sampled from the population. Then, the transmitter will forward
 a certain amount of packets to the receiver according to the receiver?s and his/her
 own reputations. After the transmission, the transmitter?s reputation will be up-
 dated by the receiver and the observers. Finally, the transmitter?s reputation is
 propagated to the whole population from the receiver and the observers through a
 noisy gossip channel.
 society with more than 90 percentage of the population have good reputation.
 The rest of this chapter is organized as follows. In Section 5.1, we describe the
 problem formulation and introduce basic components in our system model. Then,
 we show in details how to flnd the optimal action rule in Section 5.2. In Section
 5.3, we describe two action spreading algorithms due to natural selection. Finally,
 we show the simulation results in Section 5.4 and draw conclusions in Section 5.5.
 93
5.1 The System Model
 As shown in Figure 5.1, let us consider a cognitive network with su?ciently
 large population of nodes. Due to mobility and/or changes of environment, short
 interactions rather than long-lasting associations between anonymous partners are
 dominant. At each time slot, a fraction of players is chosen from the population to
 form pairs to forward packets. Within each pair, one player acts as a transmitter
 and the other player as a receiver. Let A = f0; 1; :::; Lg stand for the action set
 that the transmitter may choose, where the action i 2 A stands for the transmitter
 forwards i packets to the receiver.
 In the simplest model with L = 1, the receiver can obtain a gain g at a cost c
 to the transmitter. We should always assume that the gain g is greater than the cost
 c. Otherwise, no transmission will occur. In such a case, if both players cooperate
 with each other and forward one packet to the other player, both players receive
 g ? c, which is better than what they would obtain by both defecting, namely 0.
 However, a unilateral defector would earn g, which is the highest payofi, and the
 exploited cooperator would pay the cost c without receiving any beneflt. The payofi
 structure yields an instance of the well-known Prisoner?s Dilemma game and the
 unique Nash equilibrium (NE) is defecting, i.e. both players will not forward the
 packet to the other player. Moreover, with backward deduction, the NE remains
 the same even the game is played a flnite number of times. Such a non-cooperative
 optimal strategy is mainly because of the use of direct reciprocity, where the action
 of a transmitter taking towards a receiver is purely determined by the history of
 94
how the receiver treats him/her. Obviously, under such a scenario, all transmitters
 have no incentive to forward packets since their behaviors will not be evaluated by
 other players except their corresponding receivers.
 To stimulate the cooperation under such a scenario, we use the indirect reci-
 procity game modelling, where the essential concept is: \I help you not because you
 have helped me but because you have helped others". Therefore, a key concept in
 indirect reciprocity game is the establishment of the notion of reputation, which is
 the evaluation of the history of the players? action. Here, to simplify the analysis,
 we assume that the reputation is quantized to L+1 levels with \0" being the worst
 reputation and \L" being the best reputation, i.e., the reputation set can be repre-
 sented as T = f0; 1; :::; Lg. However, the results can be easily extended to the case
 that the reputation set has difierent size from the action set. Here, we also assume
 that everyone agrees on the reputation of an individual and no private opinions are
 allowed. However, errors in assigning reputation are possible. During each inter-
 action, the transmitter determines his action, i.e. how many packets to forward to
 the receiver, based on the receiver?s and his/her own reputations. After each inter-
 action, the reputation of the receiver remains the same, while the reputation of the
 transmitter is flrst updated by the receiver and the observers, and propagated to the
 whole population through a noisy gossip channel. Then, each participant (including
 both the transmitter and receiver) goes back to the population with probability ? or
 leaves the population with probability 1 ? ?. The parameter ? can be treated as a
 discounting factor of the future. For every player who leaves the population, a new
 individual enters with an initial reputation randomly chosen from the reputation set
 95
with equal probability 1L+1 .
 5.1.1 Social Norms
 A social norm, Q, is a matrix used for updating the immediate reputation of
 players, where the immediate reputation is the reputation that a transmitter can
 immediately obtain by taking an action. Each element Qi;j in the social norm stands
 for the immediate reputation assigned to a transmitter who has taken the action
 i toward a receiver whose reputation is j. Without loss of generality, we assume
 that all players in the population share the same norm. Although the immediate
 reputation is only determined by the action of the transmitter and the reputation
 of the receiver, we can see from the later discussion, the flnal reputation updating
 rule also involves the reputation of the transmitter.
 Since both the cardinalities of the action set and the reputation set are L+1,
 there are (L + 1)(L+1)?(L+1) possible social norms. Based on the intuition that for-
 warding packets to the receiver with good reputation or denying forwarding packets
 to the receiver with bad reputation should receive good reputation, here, we deflne
 the immediate reputation Qi;j as follows
 Qi;j = L? ji? jj; (5.1)
 96
which means that the social norm is
 Q =
 0
 BBBBBBBBBBBBBB@
 L L? 1 : : : 1 0
 L? 1 L : : : ... 1
 ... L? 1 . . . L? 1 ...
 1 ... : : : L L? 1
 0 1 : : : L? 1 L
 1
 CCCCCCCCCCCCCCA
 : (5.2)
 For the special case when L = 1, the 2? 2 social norm can be written as
 Q2?2 =
 0
 BB@
 1 0
 0 1
 1
 CCA ; (5.3)
 where \1" stands for good reputation and \0" stands for bad reputation.
 With such a social norm shown in (5.3), we can see that the transmitter can
 obtain a good immediate reputation by either forwarding packets to the receiver with
 good reputation or denying forwarding packets to the receiver with bad reputation.
 On the other hand, the transmitter will obtain a bad immediate reputation if he/she
 either denies forwarding packets to the receiver with good reputation or forwards
 packets to the receiver with bad reputation.
 5.1.2 Action Rules
 An action rule, a, is an action table of the transmitter, where the ith row and
 jth column element ai;j stands for the action of the transmitter based on his/her
 own reputation i and the corresponding receiver?s reputation j. Since both the
 cardinalities of the action set and the reputation set are L + 1, there are (L +
 97
Social Norm Noisy Gossip 
Channel
 Figure 5.2: Reputation updating policy.
 1)(L+1)?(L+1) possible action rules. The optimal action rule, a?, should be the one
 that maximizes the payofi function as discussed later.
 5.2 Optimal Action Rule
 5.2.1 Reputation Updating Policy
 A key concept in indirect reciprocity game is reputation [94]. There is a
 similar notion of trust [115], however, which is mostly based on direct reciprocity.
 Players monitor the social interactions within their group and help others establish
 the reputation of being a helpful player. Therefore, one important step in indirect
 reciprocity game modelling is how to update reputation based on players? actions.
 In this subsection, we develop a reputation updating policy based on the action of
 the transmitter, the reputation of the transmitter and the reputation of the receiver.
 To capture not only the mean behavior of the transmitter?s reputation but also
 all likelihoods of the transmitter?s reputation that may be, we assign a reputation
 distribution for each player. Let d = [d0; d1; :::; dL]T be a reputation distribution for
 a speciflc player. Then di stands for the likelihood of the player being assigned with
 98
reputation i.
 The proposed reputation updating policy is shown in Fig. 5.2. Suppose, at
 time index n, a transmitter with a reputation distribution dni is matched with a
 receiver with a reputation distribution dnj . By taking a certain action, the transmit-
 ter is assigned with an immediate reputation d^ni based on the social norm. Then,
 the receiver and the observers will update the transmitter?s reputation distribution
 using a linear combination of the transmitter?s original and immediate reputations,
 where the weight ? can be treated as a discounting factor of the past reputation.
 Finally, the transmitter?s reputation is propagated among the population by the
 receiver and observers through a noisy gossip channel.
 In a simple example, we assume that the transmitter?s reputation distribution
 is dni = ei and the receiver?s reputation distribution is dnj = ej, where ei and ej are
 the standard basis vectors. Let ai;j be the action the transmitter takes towards the
 receiver. Then, the immediate reputation of the transmitter is eQai;j ;j . According to
 the reputation updating policy in Fig. 5.2, after the transmission, the transmitter?s
 reputation distribution becomes
 dn+1i = PN
 ?
 ?ei + (1? ?)eQai;j ;j
 ?
 ; (5.4)
 where PN is the transition matrix of the noisy channel. Without loss of generality1,
 1Note that the analysis in this chapter are also applicable to the PN with other forms.
 99
we deflne PN as follows
 PN =
 0
 BBBBBBBBBBBBBB@
 1? ? ?=L : : : ?=L ?=L
 ?=L 1? ? : : : ... ?=L
 ... ?=L . . . ?=L ...
 ?=L ... : : : 1? ? ?=L
 ?=L ?=L : : : ?=L 1? ?
 1
 CCCCCCCCCCCCCCA
 ; (5.5)
 with ? 2 [0; 0:5] being a constant.
 The dn+1i in (5.4) is the updated reputation distribution of the transmitter
 after the transmitter with an original reputation ei takes an action ai;j towards
 the receiver with a reputation ej. Since this updated reputation distribution will
 be used later in the analysis for flnding the optimal action rule, we use a speciflc
 symbol ~di!j to denote it, i.e.,
 ~di!j = PN
 ?
 ?ei + (1? ?)eQai;j ;j
 ?
 : (5.6)
 For the general case that dni 6= ei and/or dnj 6= ej, the transmitter?s updated
 reputation distribution cannot be simply expressed using (5.4) since, given an action
 rule, difierent combinations of the transmitter?s and receiver?s reputations may lead
 to the same immediate reputation. In such a case, we need to flrst flnd the immediate
 reputation using
 d^ni (k) =
 X
 p
 X
 q: Qap;q;q=k
 dni (p)dnj (q): (5.7)
 Then, according to Fig. 5.2, the transmitter?s updated reputation distribution
 can be computed by
 dn+1i = PN
 ?
 ?dni + (1? ?)d^
 n
 i
 ?
 : (5.8)
 100
5.2.2 Stationary Reputation Distribution
 Let x = [x0; x1; :::; xL]T stand for the reputation distribution of the entire
 population, where xi is the portion of the population that have the reputation i.
 Since every pair of transmitter and receiver is chosen from the population, given
 the transmitter with reputation i, the probability of matching with the receiver
 with reputation k is xk. After the transmission, the reputation of the transmitter
 is updated using the policy shown in Fig.5.2. Therefore, the evolution of x can be
 described by the following difierential equation
 dx
 dt = x
 new ? x; (5.9)
 where xnew is the new reputation distribution of the entire population and can be
 computed by
 xnew = PN (?I+ (1? ?)PT )x; (5.10)
 with the ith row and jth column element of the matrix PT being deflned as
 PT (j; i) =
 X
 k: Qa?i;k;k=j
 xk: (5.11)
 According to (5.9), (5.10), and (5.11), the stationary reputation distribution
 x? is the solution to the following equation
 PN (?I+ (1? ?)PT )x? = x?: (5.12)
 From (5.11) and (5.12), we can see that, given the optimal action a?, the
 stationary reputation distribution can be found by solving the nonlinear equations
 in (5.12). In Algorithm 3, we propose a gradient descent algorithm for flnding the
 stationary reputation distribution given the optimal action rule.
 101
Algorithm 3 : Finding Stationary Reputation Distribution Using Gradi-
 ent Descent
 1. Given the optimal action a?i;j ; 8i; 8j, the tolerance ?0 = 0:01, the index t = 0, and step
 size fi = 0:1, initialize x = [x0; x1; :::; xL]T with x0 = [x00; x01; :::; x0L]T , set ? = 1, and let
 F(x) = PN (?I+ (1? ?)PT (x))x? x:
 2. while ? > ?0
 ? Compute the updating vector ?xt+1 using ?xt+1 = ?fi?rF(xt)? F(xt).
 ? Update xt+1 by xt+1 = xt +?xt+1.
 ? Normalize xt+1 using xt+1 = xt+1jjxt+1jj2 .
 ? Update the parameter ? by ? = jjxt+1 ? xtjj2.
 ? Update the index t = t+ 1.
 End
 3. The stationary reputation distribution is x? = xt.
 5.2.3 Payofi Function
 Suppose that the cost of forwarding a packet is a constant, c, the total cost of
 the transmitter with reputation i taking action ai;j towards a receiver with reputa-
 tion j is given by
 C(ai;j) = ai;jc: (5.13)
 Similarly, if the gain of receiving a packet is a constant, g, the total gain of
 the receiver with reputation i can be computed by
 G(aj;i) = aj;ig; (5.14)
 where aj;i is the action of the corresponding transmitter with reputation j.
 102
Let Wi;j denote the maximum payofi that a player, currently having reputation
 i and being matched with a player with reputation j, can gain from this interaction
 to future. Obviously, if the player with reputation i serves as a transmitter, then
 the long-term expected payofi that he/she can obtain by taking action ai;j would be
 f1(ai;j) = ?ai;jc+ ?
 X
 k
 X
 l
 ~di!j(k)x?(l)Wk;l; (5.15)
 where the flrst term ai;jc is the immediate cost the transmitter incurred by taking
 action ai;j, and the second term
 P
 k
 P
 l ~di!j(k)x?(l)Wk;l stands for the beneflt he
 gains in the future with a discounting factor ?. According to (5.6), by taking action
 ai;j, the reputation distribution of the transmitter will change from ei to ~di!j. Since
 his opponent in the next round is randomly sampled from the population with a sta-
 tionary reputation distribution x?, with probability ~di!j(k)x?(l), the transmitter?s
 reputation becomes k and his opponent?s reputation is l.
 On the other hand, if the player with reputation i serves as a receiver, the
 long-term expected payofi that he/she can obtain is
 f2 = a?j;ig + ?
 X
 l
 x?lWi;l; (5.16)
 where the flrst term a?j;ig is the immediate gain he/she can obtain when the trans-
 mitter takes the optimal action a?j;i, and the second term
 P
 l x?lWi;l stands for the
 beneflt he gains in the future with a discounting factor ?. As a receiver, the reputa-
 tion will not change after the transmission. Since his opponent in the next round is
 randomly sampled from the population with a stationary reputation distribution x?,
 with probability x?(l), the receiver?s reputation is i and his opponent?s reputation
 is l.
 103
With each interaction, the play acts either as a transmitter or as a receiver
 with equal probability 12 . Therefore, the Bellman equation of Wi;j can be written as
 Wi;j = maxai;j
 "
 1
 2
 ?
 ?ai;jc+ ?
 X
 k
 X
 l
 ~di!j(k)x?(l)Wk;l
 !
 + 12
 ?
 a?j;ig + ?
 X
 l
 x?(l)Wi;l
 !#
 ; (5.17)
 and the optimal action a?i;j can be computed by
 a?i;j=argmaxai;j Wi;j
 =argmax
 ai;j
 "
 1
 2
 ?
 ?ai;jc+ ?
 X
 k
 X
 l
 ~di!j(k)x?(l)Wk;l
 !#
 :
 (5.18)
 From (5.17) and (5.18), we can see that the problem of flnding the optimal
 action rule is a Markov Decision Process (MDP), where the state is the reputation
 pair (i, j), the action is ai;j, the transition probability is determined by ~di!j and x?,
 and the reward is determined by c and g. Therefore, given the stationary reputa-
 tion distribution, the optimal action can be found by solving (5.18) using dynamic
 programming. In this chapter, we propose a modifled value iteration algorithm to
 flnd the optimal action given stationary reputation distribution, which is shown in
 Algorithm 4.
 5.2.4 Optimal Action Using An Alternative Algorithm
 From the previous two subsections, we can see that given the optimal action,
 the stationary reputation distribution can be found using Algorithm 3, and given
 104
the stationary reputation distribution, the optimal action can be found using Algo-
 rithm 4. Therefore, we can obtain the optimal action and the stationary reputation
 distribution alternatively by iteratively flxing one and solving the other. The de-
 tailed processes are summarized in Algorithm 5. Note that the convergence speed of
 Algorithm 5 is highly determined by the initial action rule a0. Nevertheless, it will
 converge since the number of the possible action rules is flnite. Moreover, Algorithm
 5 can also be used to test the evolutionary stability of any action rule. The idea is
 to set the tested action rule as the initial action rule and see whether it can converge
 in one iteration. The details will be discussed in Section 5.4.
 5.3 Action Spreading Due To Natural Selection
 Based on Algorithm 5, we can flnd the optimal action rule and the stationary
 reputation distribution. However, during the above analysis, we do not include
 the perturbation efiect, where players may take non-optimal action rule due to
 uncertainty of the system and/or the incorrect (noisy) parameters. Taking the
 perturbation efiect into account, we need to evaluate the stability of the optimal
 action rule. Here, we adopt the concept of evolutionarily stable strategy (ESS)
 [110], which is \a strategy such that, if all members of the population adopt it,
 then no mutant strategy could invade the population under the in uence of natural
 selection". In the following subsections, we flrst discuss, by natural selection, how
 the action rules spread over the population. Speciflcally, we discuss two action
 spreading algorithms: one is action spreading algorithm using Wright-Fisher model
 105
[54] and the other is action spreading algorithm using replicator dynamic equation
 [110]. Then, we examine, in Section 5.4, the stability of the optimal action rule
 derived by Algorithm 5 by the simulations.
 Let M be the number of action rules, a1; a2; :::; aM , used in the population.
 Let pti be the percentage of the population that uses action rule ai at time t. Then,
 we have PMi=1 pti = 1. Let U ti be the average payofi using action rule ai at time t.
 5.3.1 Action Spreading Algorithm Using Wright Fisher Model
 The Wright-Fisher model is by far the most popular stochastic model for re-
 production in population genetics [54]. It is based on the assumption that the prob-
 ability of an individual adopting a certain strategy is proportional to the expected
 payofi of the population using that strategy. Due to its simplicity and capability of
 capturing the essence of the biology involved, we use the Wright-Fisher model here
 to characterize how action rules spread over the population.
 Let yi be the probability of an individual using action ai. Then, we have
 PM
 i=1 yi = 1. With the Wright-Fisher Model, we assume that yi is proportional to
 the total payofi of the users using ai. Therefore, yi can be computed by
 yi = p
 t
 iU tiPM
 j=1 ptjU tj
 ; (5.19)
 where the numerator ptiU ti is the total payofi of the users using action ai, and the
 denominator PMj=1 ptjU tj is the total payofi of the whole population, which is the
 normalization term that ensures PMi=1 yi = 1.
 Based on the assumption that the population size is su?ciently large, the per-
 106
centage of the population using action ai is equal to the probability of an individual
 using ai. Therefore, the action spreading equation can be written as
 pt+1i =
 ptiU tiPM
 j=1 ptjU tj
 : (5.20)
 5.3.2 Action Spreading Algorithm Using Replicator Dynamic Equa-
 tion
 Replicator dynamic equation is widely used to characterize the population
 evolution in evolutionary game theory [110]. It is based on the following intuition:
 if a certain strategy results in a higher payofi than the average level, the population
 share using that strategy will grow with the growth rate proportional to the dif-
 ference between the expected payofi of the population using that strategy and the
 expected payofi of the entire population. In this subsection, we use the replicator
 dynamic equation to model the evolution of the percentage of the population using
 a certain action rule, which means that the evolution of pi is given by the following
 equation
 dpi
 dt = ?
 ?
 Ui ?
 MX
 j=1
 pjUj
 !
 pi; (5.21)
 where ? is a scale factor controlling the speed of the evolution.
 By discretizing the replicator dynamic equation in (5.21), we have the action
 spreading equation
 pt+1i = pti + ?
 ?
 U ti ?
 MX
 j=1
 ptjU tj
 !
 pti
 = pti
 "
 1 + ?
 ?
 U ti ?
 MX
 j=1
 ptjU tj
 !#
 : (5.22)
 107
0 200 400 600 800 1000 1200
 0.45
 0.5
 0.55
 0.6
 0.65
 0.7
 0.75
 0.8
 0.85
 0.9
 Time index
 Percentag
 e
 Evoluation of the population with reputation L
  
 
RDE
 WFM
 (a)
 0 200 400 600 800 1000
 0.65
 0.7
 0.75
 0.8
 0.85
 0.9
 0.95
 1
 Time index
 Percentag
 e
 Evoluation of the population using optimal action
  
 
RDE
 WFM
 (b)
 Figure 5.3: The population evolution when L = 1, g = 1 and c = 0:1: (a) the
 percentage of the population with reputation L = 1; (b) the percentage of the
 population using optimal action shown in (5.24).
 5.4 Evolutionarily Stable Strategy and Simulations
 To verify the proposed algorithm, we simulate the packet forwarding game.
 We study a flxed-size population, N = 1000. Each new player receives an initial
 reputation, which is randomly chosen from f0; 1; :::; Lg with equal probability 1L+1 .
 Each player uses one of (L + 1)(L+1)?(L+1) possible action rules. All players in the
 population share the flxed social norm deflned in (5.2). Before any one elementary
 step of action updating, each individual has exactly 20 interactions with other ran-
 domly chosen individuals. Individuals act as transmitter and receiver on average
 10 times each. After each interaction, the reputation of the transmitter is updated
 according to the reputation updating policy shown in Fig. 5.2. We assume that
 every player in the population agrees on the reputation generated by the reputation
 updating policy. No private lists of reputation are considered. After all 20 interac-
 108
tions have taken place, each participant including both the transmitter and receiver
 goes back to the population with probability ? or leaves the population with proba-
 bility 1? ?. For every player who leaves, a new individual enters the population to
 keep the total population size constant. The initial reputation of the new coming is
 randomly chosen from f0; 1; :::; Lg with equal probability 1L+1 . Then, the players in
 the population, including the old players who stay in the population and the new
 players who enter the population, choose their new action rules according to previ-
 ous payofi history of the whole population. There are two possible action spreading
 algorithms as shown in the previous section. One is the action spreading algorithm
 using Wright Fisher Model, which is denoted as \WFM", and the other one is the
 action spreading algorithm using Replicator Dynamic Equation, which is denoted
 as \RDE". After updating the action rule, the payofis of all players are reset to
 zero. Therefore, older players do not accumulate their payofis. In all the following
 simulations, the parameters ?, ?, and ? are set to be 0.5, 0.9, and 0.95 respectively.
 The parameter ? that controls the speed of the evolution in RDE is set to be 0.1.
 5.4.1 Binary Reputation Scenario
 To give more insights into the proposed algorithm, we flrst evaluate the binary
 reputation scenario where L = 1. We assume that the gain per unit is 1 and the
 cost per unit is 0.1, i.e. g = 1 and c = 0:1. According to Algorithm 5, with difierent
 initial conditions, we can flnd three pairs of stationary reputation distribution x?
 109
g
 c
 c=0.582g
 Stable Region
 Figure 5.4: The stable region for the optimal action rule shown in (5.24) when
 L = 1.
 and the optimal action rule a?, which are
 x?1 =
 0
 BB@
 0:5
 0:5
 1
 CCA ; a?1 =
 0
 BB@
 0 0
 0 0
 1
 CCA : (5.23)
 x?2 =
 0
 BB@
 0:0909
 0:9091
 1
 CCA ; a?2 =
 0
 BB@
 0 1
 0 1
 1
 CCA : (5.24)
 x?3 =
 0
 BB@
 0:9091
 0:0909
 1
 CCA ; a?3 =
 0
 BB@
 1 0
 1 0
 1
 CCA : (5.25)
 With (x?1,a?1), the transmitter will not forward any packet to the receiver re-
 gardless his/her own reputation and the corresponding receiver?s reputation. Obvi-
 110
0 200 400 600 800 1000 1200
 0.45
 0.5
 0.55
 0.6
 0.65
 0.7
 0.75
 0.8
 Time index
 Percentag
 e
 Evoluation of the population with reputation L
  
 
RDE
 WFM
 (a)
 0 200 400 600 800 1000
 0.1
 0.2
 0.3
 0.4
 0.5
 0.6
 0.7
 0.8
 Time index
 Percentag
 e
 Evoluation of the population using optimal action
  
 
RDE
 WFM
 (b)
 Figure 5.5: The population evolution when L = 1, g = 1 and c = 0:6: (a) the
 percentage of the population with reputation L = 1; (b) the percentage of the
 population using optimal action shown in (5.24).
 ously, it is a bad strategy since, with such a strategy, there is no cooperation and
 the payofi of every player is zero. The pairs (x?2,a?2) and (x?3,a?3) are symmetric where
 with the former pair, the transmitter will always forward packets to the receiver who
 has good reputation, and with the latter pair, the transmitter will always forward
 packets to the receiver who has bad reputation. We can also flnd that the pair
 (x?2,a?2) leads to a population with more than 90 percentage of the players are good
 reputation while (x?3,a?3) leads to a population with more than 90 percentage of the
 players are bad reputation. Here, we prefer (x?2,a?2) since it leads to a \good" society
 with more than 90 percentage of the population are good reputation.
 Then, we evaluate the evolutionary stability of (x?2,a?2). In the simulation, the
 initial frequency of the optimal action rule a? shown in (5.24) is set to be 0.6. The
 initial frequencies of the other action rules are randomly chosen. The initial repu-
 tation of new players is randomly chosen from f0; 1g with equal probability 12 . In
 111
Fig. 5.3 (a), we show the evolutionary results of the percentage of the population
 with reputation level L = 1. From Fig. 5.3 (a), we can see that for both WFM
 and RDE, the reputation distribution converges to the stationary reputation dis-
 tribution x?2. Compared with WFM, the convergence speed of RDE is a bit slower
 since a small speed controlling parameter ? = 0:1 is used in RDE.
 The evolutionary results of the percentage of the population using the action
 rule a?2 are shown in Fig. 5.3 (b). From Fig. 5.3 (b), we can see that for both WFM
 and RDE, the action rule a?2 will spread over the whole population. And once the
 whole population adopt a?2, no one will deviate. Therefore, the action rule a?2 is an
 evolutionarily stable strategy (ESS) [110] in this case.
 From (5.17), we can see that the optimal action rule is determined by the
 values of g and c. Intuitively, if g  c, every player is willing to cooperate with
 other players since in such a scenario, the potential cooperation gain will be greater
 than the immediate cooperation cost. On the other hand, if c  g, every player
 tends not to cooperate with other players since the potential cooperation gain will
 be smaller than the immediate cooperation cost in such a scenario. Based on the
 intuition, there should exist a critical cost-to-gain ratio  such that the optimal
 action rule a?2 is stable if c <  g and is not stable otherwise.
 By setting a?2 as the initial action rule a0 in Algorithm 5 and varying g and c,
 we flnd that if cg ? 0:582, the optimal action rule found by Algorithm 5 is a?2. On
 the other hand, if cg > 0:582, the optimal action rule changes to be a?1. Therefore,
 the critical cost-to-gain ratio  is equal to 0.582 in this case, which means that the
 stable region for a?2 is the shadow region shown in Fig. 5.4.
 112
Stable region
 g
 c
 c=0.622g
 Stable Region
 Figure 5.6: The stable region for the optimal action rule shown in (5.26) when
 L = 4.
 We verify the above statement by evaluating the stability of a?2 when g = 1
 and c = 0:6. The corresponding evolutionary results are shown in Fig. 5.5. From
 Fig. 5.5 (b), we can see that when cg = 0:6 > 0:582, the percentage of the population
 using action rule a?2 does not converge to 1 for both WFM and RDE. Therefore,
 a?2 is not stable in this case. Correspondingly, we can also see from Fig. 5.5 (a) that
 the reputation distribution does not converge to x?2 in this case.
 5.4.2 Multi-Level Reputation Scenario
 For the multi-level reputation scenario where L ? 2, due to the large dimension
 of the action space ((L+1)(L+1)?(L+1)), it is di?cult to flnd all the possible pairs of
 113
0 200 400 600 800 1000 1200
 0.2
 0.3
 0.4
 0.5
 0.6
 0.7
 0.8
 0.9
 1
 Time index
 Percentag
 e
 Evoluation of the population with reputation L
  
 
RDE
 WFM
 (a)
 0 200 400 600 800 1000
 0.55
 0.6
 0.65
 0.7
 0.75
 0.8
 0.85
 0.9
 0.95
 1
 Time index
 Percentag
 e
 Evoluation of the population using optimal action
  
 
RDE
 WFM
 (b)
 Figure 5.7: The population evolution when L = 4, g = 1 and c = 0:5: (a) the
 percentage of the population with reputation L = 4; (b) the percentage of the
 population using optimal action shown in (5.26).
 stationary reputation distribution x? and optimal action rule a?. However, based on
 the results in the binary reputation scenario, we can infer that one possible optimal
 action rule a?0 is to forward i packets to the receiver with reputation i, i.e. a?0 can
 be written as
 a?0 =
 0
 BBBBBBBBBB@
 0 1 : : : L
 0 1 : : : L
 ... ... ... ...
 0 1 : : : L
 1
 CCCCCCCCCCA
 : (5.26)
 According to Algorithm 3, we can flnd the corresponding stationary reputation
 distribution x?0. For the special case with L = 4, x?0 is
 x?0 =
  
0:0235 0:0235 0:0235 0:0235 0:906
 ?T
 : (5.27)
 Then, similar to the binary reputation scenario, we obtain the stable region for
 the optimal action rule a?0. By setting a?0 as the initial action rule a0 in Algorithm
 114
0 200 400 600 800 1000 1200
 0.2
 0.25
 0.3
 0.35
 0.4
 0.45
 0.5
 0.55
 0.6
 0.65
 Time index
 Percentag
 e
 Evoluation of the population with reputation L
  
 
RDE
 WFM
 (a)
 0 200 400 600 800 1000
 0
 0.1
 0.2
 0.3
 0.4
 0.5
 0.6
 0.7
 Time index
 Percentag
 e
 Evoluation of the population using optimal action
  
 
RDE
 WFM
 (b)
 Figure 5.8: The population evolution when L = 4, g = 1 and c = 0:7: (a) the
 percentage of the population with reputation L = 4; (b) the percentage of the
 population using optimal action shown in (5.26).
 5 and varying g and c, we flnd that if cg ? 0:622, the optimal action rule found by
 Algorithm 5 is still a?0. On the other hand, if cg > 0:622, the optimal action rule
 changes. Therefore, the critical cost-to-gain ratio  in this case is equal to 0.622,
 which means that the stable region for a?0 is the shadow region shown in Fig. 5.6.
 We then verify the above statement by simulating the packet forwarding game
 with two difierent cost-to-gain ratio settings. One is g = 1 and c = 0:5, i.e. cg =
 0:5 < 0:622, and the other is g = 1 and c = 0:7, i.e. cg = 0:7 > 0:622. The
 evolutionary results for the former setting are shown in Fig. 5.7. From Fig. 5.7, we
 can see that when the cost-to-gain ratio is set to be cg = 0:5 < 0:622, the reputation
 distribution converges to x?0 and the optimal action rule a?0 spreads over the whole
 population for both WFM and RDE, which verifles that a?0 is an ESS in this case.
 The evolutionary results for the latter cost-to-gain ratio setting are difierent
 and shown in Fig. 5.8. From Fig. 5.8, we can see that when the cost-to-gain ratio
 115
is set to be cg = 0:7 > 0:622, for both WFM and RDE, the action rule a?0 does not
 spread over the whole population and the reputation distribution does not converge
 to x?0. Therefore, a?0 is not stable in this case.
 5.5 Summary
 In this chapter, we propose a cooperation stimulation scheme for cognitive
 networks using indirect reciprocity game modelling. Difierent from the existing
 game theoretic approaches, our proposed scheme does not rely on the assumption
 that the number of interactions between a pair of players are inflnite. From the
 simulation results, we can see that with a proper cost-to-gain ratio, the action rule
 of forwarding i packets to the receiver with reputation level i is an ESS. Even
 starting with only 60 percentage of population adopting the optimal action rule,
 the optimal action rule will quickly spread over the whole population by natural
 selection. And once the whole population use the optimal action rule, no one will
 deviate. Moreover, such an ESS will lead to a \good" society where more than 90
 percentage of the population have good reputation.
 116
Algorithm 4 : Modifled Value Iteration For Optimal Action Selection
 Given Stationary Reputation Distribution
 1. Given the stationary reputation x?, tolerance ?0 = 0:01, initialize a?i;j with a0i;j 8i 8j, set
 ?1 = 1 and ?2 = 1.
 2. while ?1 > ?0
 ? Set ?2 = 1.
 ? Initialize Wi;j = 0 8i 8j.
 ? while ?2 > ?0
 { Compute ~di!j using ~di!j = PN
 ?
 ?ei + (1? ?)eQai;j;j
 ?
 .
 { Compute W^i;j using
 W^i;j = maxai;j
 "
 1
 2
 ?
 ?ai;jc+ ?
 X
 k
 X
 l
 ~di!j(k)x?(l)Wk;l
 !
 + 12
 ?
 a?j;ig + ?
 X
 l
 x?(l)Wi;l
 !#
 :
 { Compute a^i;j using
 a^i;j = argmaxai;j
 "
 1
 2
 ?
 ?ai;jc+ ?
 X
 k
 X
 l
 ~di!j(k)x?(l)Wk;l
 !#
 :
 { Update the parameter ?2 by ?2 = jjW^?Wjj2.
 { Update W by W = W^.
 { End
 ? Update the parameter ?1 by ?1 = jja^? a?jj2.
 ? Update a? by a? = a^.
 End
 3. The optimal action is a?.
 117
Algorithm 5 : An Alternative Algorithm For Finding Stationary Repu-
 tation Distribution And Optimal Action
 1. Given the tolerance ?0 = 0:01, initialize a? with a0 and set ? = 1.
 2. while ? > ?0
 ? Given the optimal action a?, flnding the stationary reputation distribution x? using Al-
 gorithm 3.
 ? Given the stationary reputation distribution x?, flnding the optimal action a^? using Al-
 gorithm 4.
 ? Update the parameter ? by ? = jja^? ? a?jj2.
 ? Update a? by a? = a^?.
 End
 3. The stationary reputation distribution is x? and the optimal action is a?.
 118
Chapter 6
 Image Denoising Games
 During the processes of being captured, digitized, recorded, and transmitted,
 an image is usually distorted and noisy. Such a noisy image is visually annoying
 and often not suited to further perform tasks such as segmentation, recognition and
 compression. Therefore, image denoising is a very important issue to reconstruct a
 good estimate of the original image from the noisy observations.
 Many approaches have been proposed in the literature to reconstruct the orig-
 inal image by exploiting the inherently spatial correlation. By assuming that the
 image locally satisfles a stationary Gaussian process, Woods and Radewan [126]
 propose to estimate the original image from the noisy image using Kalman fllter
 while Jin et al [70] propose to use adaptive Wiener fllter. In both approaches, the
 flrst-order and second-order statistics used in the fllters are calculated based on the
 noisy samples within a local window. In [111] [51] [117], the authors propose to
 use bilateral flltering over the local neighborhood samples, where the weights of
 the bilateral fllters are computed based on the intensity and radiometric distances
 between the center sample and the neighboring samples. Another class of locally
 adaptive image denoising approaches are derived by considering image processing as
 a variational problem where the restored image is computed by minimizing a care-
 fully designed energy function [26] [55] [56]. Typically, such energy functions consist
 119
of a fldelity term that is determined by the difierence between the reconstructed
 image and the noisy image, and a regularization penalty term that is determined by
 the image prior.
 To further exploit the spatial correlation, Buades et al [20] proposed to average,
 in a weighted manner, all the pixels in a nonlocal window instead of only involving
 the locally neighboring pixels, where the weights are determined by the difierences
 between the region centered by the target pixel and the regions centered by the
 candidate pixels. Since the weights are not determined by the radiometric (physical)
 distance, similar pixels that are far away from the target pixel can still be awarded
 large weights. In such a way, the denoising performance is greatly improved. Several
 extensions of the nonlocal approach are also proposed [21] [73] [74].
 Besides the pixel-domain approaches, transform-domain approaches are also
 investigated [50] [99] [58] [66]. The transform-domain approaches are mainly based
 on the assumption that the original signal can be well approximated by a linear
 combination of few basis, i.e., the original signal is sparse in the transform-domain.
 In such a case, the original signal can be well estimated by preserving the few
 high-magnitude transform coe?cients that convey mostly the energy of the original
 signal and discarding the rest which are mainly introduced by the noise. Therefore,
 one important issue in the transform-domain approaches is how to threshold the
 transform coe?cients. Many threshold rules have been proposed from difierent
 speculations [50] [11] [52]. A combination of the nonlocal and transform-domain
 thresholding ideas is proposed in [45]. The basic idea is to flrst group similar 2D
 image blocks into 3D data arrays, then perform 3D wavelet transform, and flnally
 120
shrinkage the transform spectrum.
 Most of the existing schemes focus on how to choose good weights for some
 given neighborhoods to achieve better reconstructions. However, how to adaptively
 choose optimal neighborhoods can be even more important since too large a neigh-
 borhood set may cause overly-smooth artifacts, while too small a neighborhood
 set may not be able to e?ciently reduce noise variance. Due to the absence of
 the original image, the Stein?s principle [113] is used to estimate the true MSE for
 determining the optimal neighborhoods. Nevertheless, we flnd that there exists a
 trade-ofi between the accuracy of the estimate and the minimum of the true MSE.
 In this chapter, we study the impact of this trade-ofi and formulate the image de-
 noising problem as a coalition formation game. In this game, every pixel is treated
 as a player, who tries to seek partners to form a coalition to improve the accu-
 racy of the Stein?s estimate while incurring a cost of increasing the minimum of the
 true MSE. Since flnding the optimal coalition structures is NP-hard, we propose a
 heuristically distributed algorithm in solving the coalition formation game. We also
 show that the traditional approaches that use a heuristically determined candidate
 set are special cases of the proposed game theoretical framework by choosing the
 utility function without a cost term. Finally, experimental results show that the
 proposed game theoretical approach can achieve better performance than the non-
 local method in terms of both PSNR and visual quality. Note that the proposed
 game is also applicable in other scenarios besides the nonlocal method as long as 1)
 there exist some locally adaptive parameters to be estimated, and 2) the estimation
 accuracy will be improved when more samples are involved in the estimate process.
 121
The rest of this chapter is organized as follows. In Section 6.1, we give a
 brief description of the system model and the coalition formation game. Then,
 we discuss how to choose a good candidate set in Section 6.2. In Section 6.3,
 we study the trade-ofi between the accuracy of the estimate and the minimum of
 the true MSE and provide a detailed analysis of the proposed game-theoretic image
 denoising framework. In Section 6.4, we show the relationship between the proposed
 game-theoretic framework and the traditional candidate set selection approaches.
 Finally, we illustrate the experimental results on real images in Section 6.5 and
 draw conclusions in Section 6.6.
 6.1 The System Model and Coalition Formation Game
 6.1.1 The System Model
 In this chapter, we consider the problem of restoring images degraded by
 additive white Gaussian noise. The degraded process can be modelled as
 In(k) = I(k) + n(k); (6.1)
 where I is the original image, In is the noisy observation of the image, and n is the
 additive Gaussian noise with zero mean and  2 noise variance. The k = (k1; k2) is
 the coordinate of a pixel. The problem is to flnd an estimate I^ of the original image
 based on the noisy observation In.
 It is well known that the image denoising problem is ill-posed. To reconstruct
 the original image from the noisy observation, we need to use some prior information
 122
such as the correlations among spatial neighboring pixels. In this chapter, we focus
 on the spatially adaptive linear flltering approach. For the pixel located at k, we
 flnd the estimate I^(k) using the weighted average of the spatially neighboring pixels,
 i.e.,
 I^(k) =
 P
 l2S(k) wk;lIn(l)P
 l2S(k) wk;l
 ; (6.2)
 where S(k) is the candidate set that contains the spatially neighboring pixels for k,
 and wk;l is the weight for pixel In(l).
 6.1.2 The Coalition Formation Game
 Game theory is a mathematical tool that analyzes the strategic interactions
 among multiple decision makers. A game is mainly composed by three components:
 ? a flnite set of players, denoted by u1; u2; :::; uN ;
 ? a set of actions, denoted by Ai, for each player ui;
 ? payofi/utility function, denoted by Ui, which measures the outcome for player
 ui determined by the actions of all players.
 A coalition formation game is a game where the players seek to form coopera-
 tive groups, i.e., coalitions, to strengthen their positions in the game. The players?
 actions in the coalition formation game are whom to cooperate with, i.e., which
 players to form coalitions with. The payofi/utility function in the coalition for-
 mation game is deflned over coalitions, called coalition value. The coalition value,
 which quantifles the worth of a coalition, is mainly determined by two terms: the
 123
gain and the cost. By forming a coalition, every player in the coalition can obtain
 a gain through cooperation within the coalition. However, the gain is limited by a
 cooperation cost for forming the coalition, e.g. the negotiation cost or information
 exchange cost.
 Given the player set and the coalition value, the coalition formation game
 is uniquely deflned, and the outcome of the game is a set of coalitions, which is
 the optimal partitions of the player set. To obtain the optimal partitions, there
 are two possible approaches: centralized approach and distributed approach. For
 the centralized approach, the centralized controller needs to search over all the
 partitions of the player set to flnd the optimal partitions, which is NP-complete
 and impractical when the size of the player set is large [104]. For the distributed
 approach, the players will make their own decisions as to whether or not they join a
 coalition. One typical approach is to use the merge and split rules proposed in [15].
 This approach starts with an initial partition and repeats alternatively the merge
 and split rule, 1) merge rule: merge any set of coalitions into a single coalition
 if the new coalition can provide larger total coalition values; 2) split rule: split a
 coalition into smaller coalitions if the resulting smaller coalitions can provide larger
 total coalition values.
 In Section 6.3, we will discuss in details how to use the coalition formation
 game to formulate the image denoising problem, where each pixel will be treated as
 a player seeking to form coalitions to achieve optimal denosing performance.
 124
6.2 Candidate Set Selection
 From (6.2), we can see that the reconstruction performance are determined by
 the selection of the weights wk;l and the candidate set S(k). For any given S(k), the
 optimal weights w?k;l(S(k)) are determined by the correlation between pixels I(k)
 and I(l), and should be chosen to minimize the difierence between the estimation
 I^(k) and the original pixel I(k) as below.
 w?k;l(S(k)) = arg minwk;l
 ?P
 l2S(k) wk;lIn(l)P
 l2S(k) wk;l
 ? I(k)
 !2
 (6.3)
 Note that when the optimal weights in (6.3) are used, the selection of the
 candidate set S(k) is trivial since the accuracy of the reconstruction improves as
 the candidate set S(k) becomes larger. The proof can be found in Theorem 1.
 Theorem 1: When the optimal weights in (6.3) are used, the accuracy of the
 reconstruction improves as the candidate set S(k) becomes larger. Proof : Let S1(k)
 and S2(k) be two candidate sets with S1(k) ? S2(k). Let w?k;l(S1(k)) and w?k;l(S2(k))
 be the corresponding optimal weights computed by (6.3). Suppose ~wk;l(S2(k)) are
 the weights for S2(k) and are deflned as follows
 ~wk;l(S2(k)) =
 8
 >><
 >>:
 w?k;l(S1(k)); l 2 S1(k);
 0; else.
 (6.4)
 Then, according to the optimality of w?k;l(S2(k)), we have
 ?P
 l2S2(k) wk;l(S2(k))In(l)P
 l2S2(k) wk;l(S2(k))
 ? I(k)
 !2
 ?
 ?P
 l2S2(k) ~wk;l(S2(k))In(l)P
 l2S2(k) ~wk;l(S2(k))
 ? I(k)
 !2
 =
 ?P
 l2S1(k) wk;l(S1(k))In(l)P
 l2S1(k) wk;l(S1(k))
 ? I(k)
 !2
 : (6.5)
 125
Therefore, when optimal weights are used, the reconstruction using candidate
 set S2(k) is more accurate than that using candidate set S1(k).
 However, due to the absence of the original pixel I(k), it is impossible for us
 to flnd the optimal weights using (6.3). One possible approximation is to use the
 similarity between the neighborhoods around k and l [20], which is deflned as follows
 wk;l = exp
 (
 ?
 P
 b2B [In(k+ b)? In(l+ b)]2
 h2
 )
 ; (6.6)
 where B is a predeflned neighborhood and h is the parameter related to the noise?s
 variance.
 Nevertheless, since the weights in (6.6) are not optimal, the selection of the
 candidate set S(k) for the reconstruction becomes critically important. On one
 hand, if the size of the candidate set is too small, then the noise may not be efiectively
 removed. On the other hand, if the size of the candidate set is too large, then the
 reconstruction may be overly-smooth. Moreover, according to (6.6), we can see
 that the pixels that are more similar to the target pixel would have larger weights.
 To prevent the reconstruction from being overly-smooth, we will only involve the
 pixels that have relatively large weights. Let ?(m) stand for the subset of S(k)
 which contains the pixels with the flrst m largest weights. Then, the reconstruction
 I^(k;m) using ?(m) can be written as
 I^(k;m) =
 P
 l2?(m) wk;lIn(l)P
 l2?(m) wk;l
 : (6.7)
 Obviously, the parameter m in (6.7) should be chosen in such a way that the
 difierence between I^(k;m) and I(k) is minimized, i.e., the optimal m? can be found
 126
(a) (b) (c) (d)
 (e) (f) (g) (h)
 Figure 6.1: An example of optimal candidate set with an edge region: (a) original
 image; (b) noisy image with  = 15; (c) noisy image with  = 25; (d) noisy image
 with  = 35; (e) the candidate set used by nonlocal; (f) the ideally optimal candidate
 set of (b) when the original signal is available; (g) the ideally optimal candidate set
 of (c) when the original signal is available; (h) the ideally optimal candidate set of
 (d) when the original signal is available.
 by
 m? = arg min
 m
 jI^(k;m)? I(k)j2: (6.8)
 In general, m? is content dependent, i.e., m? may be difierent for difierent
 k and/or difierent noise variances. Even with the same m?, the structure of the
 candidate set ?(m?) may be difierent for difierent pixels. In Figs. 6.1 and 6.2, we
 show the structure of the optimal candidate set for two difierent scenarios: 1) the
 target pixel is centered within an edge region, and 2) the target pixel is centered
 within a smooth region. For illustration purpose, we assume that the original image
 127
is available for flnding m? in these two examples. Later in Section 6.3, we will discuss
 how to flnd m? using game theory under the scenario that the original image is not
 available. As shown in Figs. 6.1 and 6.2, (a) is the original image centered by
 the target pixel, which is denoted by red \x", (b)-(d) are the noisy images with  
being 15, 25, and 35 respectively. (e) is the candidate set using in [20], which is
 a square window. (f)-(h) are the optimal candidate sets generated using (6.8) for
 (b)-(d) respectively. Note that the black pixels in (f)-(h) stand for the pixels in
 the candidate set. From Figs. 6.1 and 6.2, we can see that for the scenario where
 the target pixel is centered within an edge region, the candidate set has an edge
 structure, while for the scenario where the target pixel is centered within a smooth
 region, the structure of the candidate set is unpredictable. Moreover, we can also see
 that with difierent noise variance, the candidate sets are quite difierent. Therefore,
 the candidate set should not be pre-deflned in a flxed way such as using a square
 window in [20]. Instead, the candidate set should be chosen adaptively to minimize
 the difierence between the estimate and the original signal.
 6.3 Game Theoretical Problem Formulation
 6.3.1 Stein?s unbiased risk estimate (SURE)
 Since I(k) is unknown, the optimal m? can not be explicitly computed using
 (6.5). Fortunately, we can flrst use the Stein?s unbiased risk estimate (SURE) [113]
 to estimate the true mean squared error (MSE) from the noisy observation and then
 use the estimated MSE to flnd the optimal m?.
 128
(a) (b) (c) (d)
 (e) (f) (g) (h)
 Figure 6.2: An example of optimal candidate set with a smooth region: (a) original
 image; (b) noisy image with  = 15; (c) noisy image with  = 25; (d) noisy image
 with  = 35; (e) the candidate set used by nonlocal; (f) the ideally optimal candidate
 set of (b) when the original signal is available; (g) the ideally optimal candidate set
 of (c) when the original signal is available; (h) the ideally optimal candidate set of
 (d) when the original signal is available.
 In Fig. 6.3, we show the optimal m? obtained using (6.8) for lena image when
 the standard deviation of the noise is  = 10, where the intensity stands for the
 optimal m? value. From Fig. 6.3, we can see that there are many pixels have the
 similar m? value, which can be grouped together for flnding m?. For example, the
 pixels in the red circles have the m? value near 60 can be grouped together. Suppose
 that the whole image is partitioned into M subsets ' = f'1;'2; :::;'Mg, where each
 subset 'i contains a set of pixels that may not be physical neighboring but have the
 129
8 441 6 28 310 296 26 441 
5 138 49 16 6 389 225 441 
58 177 88 28 6 441 240 23 
37 62 12 191 312 3 7 1 
71 93 315 72 441 10 58 441 
441 5 252 35 441 8 144 441 
85 254 10 36 12 185 14 51 
306 2 2 20 10 59 60 155 
Figure 6.3: The optimal m? for lena image when  = 10.
 same optimal parameter m?i , i.e.,
 m?i = arg minm
 X
 k2'i
 jI^(k;m)? I(k)j2: (6.9)
 With the optimal m?i , the mean square error (MSE) for the subset 'i can be
 computed by
 msei = 1j'ij
 X
 k2'i
 jI^(k;m?i )? I(k)j2; (6.10)
 and such a MSE can be approximated using SURE, according to Theorem 2, as
 follows
 SUREi =
 1
 j'ij
 X
 k2'i
 jI^(k;m?i )?In(k)j2+ 2
 ?
 2
 j'ij
 X
 k2'i
 @I^(k;m?i )
 @In(k) ?1
 !
 ; (6.11)
 130
where @I^(k;m?i )@In(k) can be found by
 @I^(k;m?i )
 @In(k) =
 1P
 l2?(m?i ) wk;l
 ?
 0
 @ X
 l2?(m?i )
 @wk;l
 @In(k)I
 n(l) + 1? I^(k;m?i )
 X
 l2?(m?i )
 @wk;l
 @In(k)
 1
 A ; (6.12)
 with @wk;l@In(k) being deflned as
 @wk;l
 @In(k) =
 8
 >><
 >>:
 2wk;l I
 n(l)?In(k)
 h2 ; l? k 62 B;
 2wk;l I
 n(l)+In(2k?l)?2In(k)
 h2 ; l? k 2 B.
 (6.13)
 Theorem 2: The SUREi in (6.11) is an unbiased estimator of the true MSE
 msei in (6.10), i.e,
 E[SUREi] = E[msei]: (6.14)
 Proof : By substituting I(k) with In(k)? n(k), we can re-write E[msei] as follows
 E[msei] = E
 "
 1
 j'ij
 X
 k2'i
 jI^(k;m?i )? I(k)j2
 #
 = E
 "
 1
 j'ij
 X
 k2'i
 ?
 jI^(k;m?i )? In(k)j2
 +2n(k)I^(k;m?i )? n(k)2
 ?#
 = 1j'ij
 X
 k2'i
 E
 h
 jI^(k;m?i )? In(k)j2
 i
 +2 1j'ij
 X
 k2'i
 E
 h
 n(k)I^(k;m?i )
 i
 ?  2: (6.15)
 According to Stein?s Lemma [113], we have
 E
 h
 n(k)I^(k;m?i )
 i
 =  2E
 "
 @I^(k;m?i )
 @In(k)
 #
 : (6.16)
 131
0 10 20 30 40
 10
 20
 30
 40
 50
 60
 70
 80
 90
 N
 conf_ter
 m
 (a)
 0 10 20 30 40
 14
 16
 18
 20
 22
 N
 mse_ter
 m
 (b)
 Figure 6.4: The trade-ofi between the confldence term C and the distortion term D:
 (a) the performance of C with difierent N ; (b) the performance of D with difierent
 N .
 Then, by substituting (6.19) back to (6.18), we have
 E[msei] = E
 "
 1
 j'ij
 X
 k2'i
 jI^(k;m?i )? In(k)j2
 + 2
 ?
 2
 j'ij
 X
 k2'i
 @I^(k;m?i )
 @In(k) ? 1
 !#
 = E[SUREi]:
 6.3.2 Confldence and Distortion Trade-ofi
 6.3.2.1 Confldence
 From Theorem 2, we can see that SUREi is an unbiased estimator of msei.
 However, there can be some mismatch between SUREi and msei for each realization
 (noise observation), i.e., SUREi is just an approximation of msei. To measure the
 132
accuracy of the approximation, let us deflne the confldence term, C, as the average
 difierence between SUREi and msei over the whole image
 C = 1j'j
 MX
 i=1
 j'ij ? jmsei ? SUREij: (6.17)
 According to [113], the estimator SUREi becomes closer to msei as j'ij in-
 creases, which means that the confldence term C in (6.17) decreases as j'ij increases.
 6.3.2.2 Distortion
 With the partition ' = f'1;'2; :::;'Mg and the optimal parameters fm?1;m?2; :::;m?Mg,
 we can compute the mean square error for the whole image, D, as follows
 D = 1j'j
 MX
 i=1
 j'ij ? msei: (6.18)
 According to the analysis in Section 6.3.1, we group the pixels with similar m?
 values together and assign a common m?i to all pixels in subset 'i. In such a case,
 as j'ij increases, the probability that the pixels in 'i have difierent true m? values
 increases, which leads to the increase of msei. Therefore, the distortion term D in
 (6.18) increases as j'ij increases.
 6.3.2.3 Confldence and Distortion Trade-ofi
 From the above discussion, we can see that as j'ij increases, the confldence
 term C decreases but the distortion term D increases. Therefore, there exists a
 trade-ofi between C and D. To verify such a trade-ofi, we conduct a simple ex-
 periment by setting j'ij = N; 8i. As shown in Figure 6.4, the confldence term C
 133
0 5 10 15 20 25 30
 ?1
 ?0.9
 ?0.8
 ?0.7
 ?0.6
 ?0.5
 ?0.4
 ?0.3
 ?0.2
 ?0.1
 0
 |?i|
 Gain function
  
 
g1(|?i|,?2=1)=?1/|?i|
 g2(|?i|,?
 2=1)=?exp(?|?i|/4)
 g3(|?i|,?
 2=1)=ln(|?i|/(|?i|+1))
 Figure 6.5: Some possible gain functions.
 decrease as N increases while the distortion term D increases as N increases, which
 are consistent with our analysis.
 6.3.3 Utility Function and Solution to the Game
 From the previous subsections, we can see that given the partition ' =
 f'1;'2; :::;'Mg, SURE can be used to approximate the true MSE to flnd the opti-
 mal m?. However, how to flnd a good partition is not trivial since the number of the
 partition is not flxed and the size of each partition can vary. Due to the uncertainty
 of the number of the partition, the traditional segmentation and clustering meth-
 ods may not work. To study the complex interactions among difierent pixels and
 134
(a) (b)
 (c) (d)
 Figure 6.6: The four tested images: (a) Lena; (b) Barbara; (c) Boat; (d) Flinstones.
 the dynamic partition formation process, we propose to use the coalition formation
 game.
 In this game theoretical formulation, every pixel is treated as a player, who
 tries to seek partners to form coalitions to achieve better reconstruction. By forming
 a coalition, every player in the coalition can obtain a gain of reducing the difierence
 between the SURE and the true estimate, i.e., the confldence term in (6.17), while
 incurring a cost of increasing the minimum of the MSE. With this idea in mind, we
 135
deflne the utility for a coalition as:
 U('i) = ?j'ij ? SUREi + g(j'ij;  2); (6.19)
 where the flrst term of the right hand side is the cost and the second term g(j'ij;  2)
 is the gain.
 The function g(j'ij;  2) in (6.19) characterizes the gain of forming a coalition,
 which is the reduction of the difierence between the SURE and the true estimate
 due to the increase of the coalition size. Therefore, g(j'ij;  2) should satisfy the
 following properties
 1. g(j'ij;  2) should be an increasing function in terms of j'ij since the gain
 increases as the coalition size j'ij increases, i.e., @g(j'ij; 
2)
 @j'ij > 0.
 2. g(j'ij;  2) should be a concave function in terms of j'ij since a certain increase
 of the coalition size in the low j'ij region should lead to a more signiflcant
 gain than that in the high j'ij region, i.e., @
 2g(j'ij; 2)
 @j'ij2 < 0.
 3. g(j'ij;  2) should be a superadditive function since the gain of a large coalition
 should be no smaller than that of two sub coalitions, i.e., g(j'i + 'jj;  2) ?
 g(j'ij;  2) + g(j'jj;  2).
 4. g(j'ij;  2) should be a decreasing function in terms of  2 since the gain de-
 creases as noise variance  2 increases, i.e., @g(j'ij; 2)@ 2 < 0.
 There are many functions that can satisfy the above property. In the following,
 we list three possible functions
 g1(j'ij;  2) = ?1 2
  ?1
 j'ij
 ?
 ; (6.20)
 136
g2(j'ij;  2) = ?2 2
 ?
 ? exp
  ?j'ij
 4
 ??
 ; (6.21)
 g3(j'ij;  2) = ?3 2
 ?
 ln
  j'ij
 j'ij+ 1
 ??
 ; (6.22)
 where ?1, ?2, and ?3 are flxed parameters.
 In Fig. 6.5, we plot the three possible gain functions versus j'ij by setting
  2 = 1. We can see that all the three functions meet our requirements and are
 therefore valid gain functions. Moreover, we can see that all three functions behave
 similarly. Therefore, in this chapter, we only evaluate the flrst gain function, i.e.,
 g(j'ij;  2) in (6.19) is set to be g1(j'ij;  2). Nevertheless, similar results can be
 obtained with the other two functions (g2(j'ij;  2) and g3(j'ij;  2)) and any other
 functions with similar properties.
 With the utility function in (6.19), we can see that as the size of the coalition
 increases, the members in the coalition can obtain gains from g(j'ij;  2). However,
 the gains are limited by the a cost of forming the coalition, which is ?j'ij?SUREi.
 The problem now is to flnd the optimal coalition structures based on the utility func-
 tion in (6.19). One possible approach is to use the merge and split rules proposed
 in [15], where the authors prove that their algorithm will converge to a unique solu-
 tion with arbitrary merge and split iterations. However, the computation complexity
 is still very large since all possible sub-partitions need to be evaluated during the
 split process. To make the problem traceable, in this chapter, we propose a heuris-
 tic algorithm in solving the coalition formation game. As shown in Algorithm 6,
 the proposed heuristic algorithm starts with a randomly chosen pixel and flnds the
 coalition by selecting the neighborhoods that can give best average utility. Then, all
 137
the pixels in the coalition are denoised with the corresponding optimal m?. Finally,
 all the pixels in the coalition are excluded from the un-denoised set. The above
 procedures are repeated until all pixels are denoised.
 The proposed heuristic algorithm is distributive, and only locally neighboring
 information is required for flnding the coalition. Moreover, since it is not an iterative
 algorithm, there is no convergence issue. Compared with the merge and split rules
 [15], the computation complexity is greatly reduced since the split process is avoided.
 From the experimental results shown in Section 6.5, we can see that the proposed
 heuristic algorithm performs quite well.
 Algorithm 6 A Heuristic Algorithm For Coalition Formation
 Initialization: let the set of denoised pixel SD = ; and its complement ?SD = ', let N1 = 800,
 N2 = 21? 21, and i = 0.
 While ?SD 6= ;
 ? i = i+ 1
 ? randomly choose k 2 ?SD, let '0 = fkg and set j = N1
 ? While j > 0
 { j = j ? 1
 { (l?;m?) = arg min
 l2 ?SDn'0;1?m?N2
 SURE ?'0 [ flg;m?
 { set '0 = '0 [ fl?g
 { compute u(j'0j) = g(j'0j)j'0j ? SURE
 ?'0;m??
 End
 ? let n?i = argmaxn u(n), 'i = '0(1 : n?i )
 ? compute m?i = arg min1?m?N2 SURE ('i;m)
 ? set ?SD = ?SD n 'i and SD = SD [ 'i
 ? denoise the pixel in 'i using (6.4) with m = m?i
 End
 138
6.4 Relation to the Traditional Approaches
 In the traditional pixel-domain image denoising approaches, every pixel is de-
 noised using (6.2) with a heuristically pre-deflned candidate set S(k). For example,
 a flxed-size square window centered by the target pixel k is chosen as the candidate
 set in the nonlocal image denoising method [20]. Such kinds of approaches have a
 performance limitation due to the self-constrained use of a pre-deflned candidate
 set. As shown in Figs. 6.1 and 6.2, we can see that the candidate set should be
 adaptively chosen for difierent neighborhoods and/or noise variances. Moreover, we
 will show in the following analysis that the traditional methods such as the nonlocal
 method [20] is actually a special case of the proposed game theoretical framework
 by choosing a utility function without a cost term
 U('i) = g(j'ij;  2): (6.23)
 According to the discussion in Section 6.3.3, we know that a valid gain function
 g(j'ij;  2) should be monotonically increasing, concave, and superadditive in terms
 of j'ij. In such a case, if the utility function only involves the gain function as
 in (6.23), then all pixels will form a grand coalition and use the same candidate
 set. In such a case, it return to the traditional ad-hoc approaches where a flxed
 candidate set is used for all pixels. In this sense, we can say that the traditional
 ad-hoc approaches are special cases of the proposed game theoretical framework.
 139
6.5 Experimental Results
 We evaluate the proposed game theoretical image denoising approach by com-
 paring it with the nonlocal method [20]. Four 512? 512 images shown in Fig. 6.6:
 Lena, Barbara, Boat and Flinstones, are tested. The neighborhood B and the pa-
 rameter h in (6.6) are set to be 11? 11 and 10 respectively. The candidate set for
 the nonlocal method is set to be a 21 ? 21 square window. The parameter ?1 for
 the proposed method in the gain function in (6.20) is set to be 0.875. Note that this
 parameter ?1 is flxed for all four tested images.
 We flrst examine the candidate set generated by the proposed approach. In
 Figs. 6.1 and 6.2, we show the ideally optimal candidate sets of two difierent image
 patches by assuming the original signal is available. Obviously, due to the absence
 of the original signal, we are not able to get the ideally optimal candidate sets.
 Nevertheless, with the proposed game theoretical approach and the SURE estimate,
 we can flnd approximate candidate sets and the results are shown in Figs. 6.7 and
 6.8. From Figs. 6.7 and 6.8, we can see that the approximate candidate sets are
 much more similar to the ideally optimal candidate sets compared with the flxed
 square window candidate set used by [20].
 Then, we evaluate the PSNR comparison versus the standard derivation of
 the noise. The PSNR comparison between the nonlocal method and the proposed
 method for the tested images at difierent noise levels are shown in Fig. 6.9. From
 Fig. 6.9, we can see that the proposed method always performs better than, if not
 equal to, the nonlocal method for all tested images at all difierent noise variances.
 140
When the standard deviation of the noise is no larger than 20, i.e.  ? 20, the
 PSNR performance of the proposed approach is just a bit better than the nonlocal
 method. This is because when  is small, the weights of the pixels outside the
 optimal candidate set are too small. In such a case, the reconstruction using a
 heuristically determined square window candidate set is similar to that using an
 optimal candidate set, i.e., the performance of the nonlocal method is similar to the
 performance of the proposed approach when  is small. However, we should notice
 that the performance of the nonlocal method is always upper bounded by that of the
 proposed method since the adaptively chosen candidate set is always better than, if
 not equal to, the flxed square window candidate set.
 When  becomes larger, the superiority of the proposed approach becomes
 more signiflcant. This phenomenon is mainly because when  is large, the weights of
 the pixels outside the optimal candidate is relatively large and is no longer negligible.
 In such a case, the reconstruction using all pixels in a heuristically determined
 square window tends to lead to over-smooth artifacts. On the other hand, since
 the candidate set is adaptively chosen for every pixel in the proposed method, we
 are able to preserve details of the original image and avoid over-smooth artifacts.
 From Fig. 6.9, we can see that the gain of the proposed approach over the nonlocal
 method can be up to 2.16dB for the Flinestones image when  = 60, which fully
 demonstrates the efiectiveness of the proposed approach.
 Finally, we evaluate the visual quality of the reconstructions. In Fig. 6.10,
 we show the visual quality comparison for Flinstones. As shown in Fig. 6.10, (a) is
 the original patch of Flinstones and (b) is the noisy patch with  = 25. The results
 141
generated by the nonlocal method and the proposed approach are shown in (c) and
 (d) respectively. We can see that the result generated by the nonlocal method is
 over-smooth. This phenomenon is because the nonlocal method involves too many
 dis-similar pixels in the averaging process. With the proposed approach, every pixel
 (player) seeks parters to form coalition to determine the best number of neighbor-
 hoods to perform denoising, which can rule out the dis-similar neighborhoods and
 avoid over-smooth artifacts. Therefore, the details can be well-preserved in the pro-
 posed approach. Similar phenomenons can be observed in Figs. 6.11, 6.12 and 6.13
 for Barbara, Lena and Boat at noise level  = 35,  = 45, and  = 20 respectively.
 Due to the page limitation, we only show the results of one  for each image in this
 chapter. Similar results are observed for difierent  0s.
 6.6 Summary
 In this chapter, we study the trade-ofi between the accuracy of the Stein?s esti-
 mate and the minimum of the true MSE and formulate the image denoising problem
 as a coalition formation game. With the proposed game, every player (pixel) seek
 parters to form coalitions to obtain better decision for the optimal neighborhoods se-
 lection and thus lead to better denoising results. The experimental results show that
 compared with the nonlocal method [20], the proposed game theoretical approach
 can achieve not only better PSNR performance but also better visual quality. Note
 that the proposed game is also applicable in other scenarios besides the nonlocal
 method as long as 1) there exist some locally adaptive parameters to be estimated,
 142
and 2) the estimation accuracy will be improved when more samples are involved
 in the estimate process. Moreover, we showed that the traditional approaches using
 a heuristically determined candidate set are special cases of the game theoretical
 framework by choosing the utility function without a cost term.
 143
(a) (b) (c) (d)
 (e) (f) (g) (h)
 (i) (j) (k)
 Figure 6.7: An example of optimal candidate set with an edge region: (a) original
 image; (b) noisy image with  = 15; (c) noisy image with  = 25; (d) noisy image
 with  = 35; (e) the candidate set used by nonlocal; (f) (g) and (h) are the ideally
 optimal candidate sets of (b) (c) and (d) when the original signal is available; (i) (j)
 and (k) are the candidate sets of (b) (c) and (d) generated by the proposed method.
 144
(a) (b) (c) (d)
 (e) (f) (g) (h)
 (i) (j) (k)
 Figure 6.8: An example of optimal candidate set with a smooth region: (a) original
 image; (b) noisy image with  = 15; (c) noisy image with  = 25; (d) noisy image
 with  = 35; (e) the candidate set used by nonlocal; (f) (g) and (h) are the ideally
 optimal candidate sets of (b) (c) and (d) when the original signal is available; (i) (j)
 and (k) are the candidate sets of (b) (c) and (d) generated by the proposed method.
 145
10 15 20 25 30 35 40 45 50 55 60
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 ?
 PSNR (dB
 )
 Lena
  
 
Nonlocal
 Proposed
 (a)
 10 15 20 25 30 35 40 45 50 55 60
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 ?
 PSNR (dB
 )
 Barbara
  
 
Nonlocal
 Proposed
 (b)
 10 15 20 25 30 35 40 45 50 55 60
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 ?
 PSNR (dB
 )
 Boat
  
 
Nonlocal
 Proposed
 (c)
 10 15 20 25 30 35 40 45 50 55 60
 18
 20
 22
 24
 26
 28
 30
 ?
 PSNR (dB
 )
 Flinstones
  
 
Nonlocal
 Proposed
 (d)
 Figure 6.9: The PSNR comparison for difierent images: (a) Lena; (b) Barbara; (c)
 Boat; (d) Flinstones.
 146
(a) (b)
 (c) (d)
 Figure 6.10: The visual quality comparison for Flinstones with  = 25: (a) original
 image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the
 result generated by the proposed approach.
 147
(a) (b)
 (c) (d)
 Figure 6.11: The visual quality comparison for Barbara with  = 35: (a) original
 image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the
 result generated by the proposed approach.
 148
(a) (b)
 (c) (d)
 Figure 6.12: The visual quality comparison for Lena with  = 45: (a) original
 image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the
 result generated by the proposed approach.
 149
(a) (b)
 (c) (d)
 Figure 6.13: The visual quality comparison for Boat with  = 20: (a) original
 image; (b) noisy image; (c) the result generated by the nonlocal method; (d) the
 result generated by the proposed approach.
 150
Chapter 7
 Simultaneous Image Denoising and Interpolation Using Evolutionary
 Games
 Spatial resolution up-conversion is one of the most important tasks in the fleld
 of image processing. Image interpolation is the technique addressing the problem
 of spatial resolution up-conversion. It generates a high resolution image from the
 input low resolution image by exploiting the inherent relationship between them.
 The commonly used image interpolation methods are the conventional linear
 interpolation schemes such as bilinear and bicubic interpolation [75]. These methods
 generate the high resolution image using a spatial-invariant linear interpolation fllter.
 Although the computational complexity is low, these methods are not favored since
 they introduce a lot of blurring and ringing artifacts.
 To overcome the drawbacks of conventional linear interpolation schemes, many
 more sophisticated adaptive image interpolation methods have been proposed. Jensen
 and Anastassiou proposed to flrst detect edges and then flt them with some tem-
 plates to improve the interpolation result [69], while Carrato and Tenze optimized
 the interpolation parameters by using some predetermined edge pattern [24]. In [13],
 Allebach and Wong proposed to flrst estimate the high resolution edge map from
 the low resolution image using a subpixel edge estimation technique, and then cor-
 rect the interpolated high resolution pixels based on the high resolution edge map.
 151
Based on the assumption that the covariance matrix of the high resolution image
 can be well estimated from the covariance matrix of the low resolution image, Li
 and Orchard proposed an edge-preserved interpolation scheme [82]. Cha and Kim
 in [25] proposed a modifled bilinear method by amending the error based on the
 interpolation error theorem in an edge-adaptive way. In [81], an MRF model-based
 edge-directed interpolation method is proposed by formulating the image interpola-
 tion problem as an energy minimization problem over a 2-D Markov Random fleld,
 where the edge direction information generated by a statistical based approach is
 incorporated in the energy function. For better interpolation, Zhang and Wu pro-
 posed to flrst generate multiple reconstructions from difierent directions, and then
 fuse the results by minimum mean square error estimation [132]. To further improve
 the interpolation results, a soft-decision adaptive interpolation (SAI) technique is
 proposed in [135] by combining the piecewise 2-D autoregressive modeling and block
 estimation.
 Although the existing approaches can achieve promising interpolation results,
 they are designed for the noisy-free images, i.e., clean images. However, if the low
 resolution image is noisy, most of the existing interpolation approaches will also
 boost the noise and introduces severe visual distortions such as fake edge artifacts
 shown in Figure 7.1. In reality, due to the quality of the sensors or the conditions of
 the environment, sensor noise is introduced during the image acquisition processes.
 Moreover, additional noise and distortions will be introduced during the processes
 of being digitized, recorded, and transmitted. To avoid the distortion caused by the
 undesired noise boosting, one may perform the denoising before the interpolation.
 152
(a) (b) (c)
 Figure 7.1: (a) a region of Lena image that is corrupted with Gaussian noise with
 noise variance  2 = 25; (b) the interpolation result using bicubic; (c) the interpola-
 tion result using SAI.
 Many image denoising approaches have been proposed in the literature by exploiting
 the inherently spatial correlation. For examples, Woods and Radewan [126] proposed
 to estimate the original image from the noisy image by using Kalman fllter while
 Jin [70] proposed to use adaptive Wiener fllter. To further exploit the spatial cor-
 relation, Buades [20] proposed to average, in a weighted manner, all the pixels in a
 nonlocal window instead of only involving the locally neighboring pixels. For better
 reconstruction, the authors in [33] propose a game theoretic approach to adaptively
 choose the optimal neighborhood. Nonlinear approaches such as difiusion [55], total
 variation [56], rate distortion optimization [32], and fuzzy flltering [120] were also
 investigated. Besides the pixel-domain approaches, transform-domain approaches
 such as wavelet shrinkage [50] [45] were also investigated.
 Nevertheless, as shown later in Section 7.2.1., since the denoising problem it-
 self is ill-posed, the methods that flrst perform denoising and then interpolate the
 153
denoising image can in adversely introduce severe visual artifacts. Therefore, to
 achieve better results, it is very important for us to jointly consider image denois-
 ing and interpolation together. In the literature, there are some prior works on
 jointly perform denoising and demosaicking { a special case of interpolation that
 reconstructs the missing color component due to color-flltered image sensors. For
 example, the authors in [67] proposed to use total least square technique, while the
 authors in [79] proposed to combine directional fllter with wavelet-based denoising
 method. However, since these approaches are specially designed for demosaicking,
 they cannot be directly applied into the general interpolation problem.
 The essential problem of simultaneous denoising and interpolation is to es-
 timate the unknown clean pixels in the high resolution image based on the low
 resolution noisy image by exploiting the inherent relationship between them, e.g.,
 estimating the unknown high resolution pixels using weighted average of a set of
 neighboring low resolution noisy pixels. However, since the original high resolution
 pixels are unknown, the optimal weights are not achievable. Most of the existing in-
 terpolation approaches flnd the approximated weights based on the assumption that
 the covariance matrix of the high resolution image can be well-estimated from the co-
 variance matrix of the low resolution image [82]. Nevertheless, when this assumption
 is not true, the interpolation performance will be greatly degraded. Moreover, since
 the low resolution image is corrupted by noise, the reconstruction performance can
 be further degraded. Here, instead of directly estimating the weights in one step, we
 propose to progressively reflne the weights by alternatively estimating the weights
 based on the reconstruction and flnding the reconstruction using the weights. Such
 154
a reflnement process of the weights is actually an evolutionary process and can be
 naturally formulated as an evolutionary game. Thus, in this chapter, we consider
 the problem of simultaneous denoising and interpolation from the game theoretic
 perspective and formulate the problem as an evolutionary game.
 Evolutionary game theory is an application of the mathematical theory of
 games to the interaction dependent strategy evolution in populations. Arising from
 the realization that frequency dependent fltness introduces a strategic aspect to evo-
 lution, evolutionary game theory becomes an essential component of a mathematical
 and computational approach to biological contexts, such as genes, viruses, cells, and
 humans. There are three basic components in an evolutionary game: players, strate-
 gies, and payofi functions. Players are the entities who play the game. Strategies,
 which can be divided into pure strategies and mixed strategies, are the complete
 plans of actions players may take in the game. A pure strategy is a deterministic
 plan of how a player will play a game while a mixed strategy is an assignment of
 a probability to each pure strategy. Payofi functions determine the payofis players
 can obtain by adopting a certain strategy. In the proposed evolutionary game for
 simultaneous image denoising and interpolation, the players are the unknown high
 resolution pixels, and their pure strategies are the neighboring low resolution noisy
 pixels. The probabilities in the mixed strategy are the non-negative normalized
 weights of the low resolution noisy pixels. In this sense, the simultaneous image
 denoising and interpolation problem is no longer ill-posed. Instead, the problem
 becomes well-deflned, and the objective of the player is to flnd the evolutionarily
 stable strategy, i.e., the optimal combination of the low resolution noisy pixels, to
 155
achieve better denoising or interpolation performance.
 The rest of this chapter is organized as follows. We flrst give an introduction
 about evolutionary game model in Section 7.1. Then, we show the problems of the
 methods that flrst perform denoising and then interpolate the denoising image, and
 describe in details how to formulate the problem of simultaneous image denoising
 and interpolation as an evolutionary game, how to choose the pure strategy set and
 how to deflne the payofi function in Section 7.2. Experimental results are shown in
 Section 7.3. Finally, we draw conclusions in Section 7.4.
 7.1 Evolutionary Game Model
 Before discussing how to use evolutionary game to simultaneously perform
 denoising and interpolation, in this section, we flrst brie y give an introduction
 about evolutionary games. A game G = fU;A; Fg is generally deflned as follows.
 U = fu1; u2; :::; uNg are the players who play the game. A = A1 ? A2 ? ::: ? AN
 are the pure strategy sets, where Ai is the pure strategy set containing all possible
 pure strategies for user ui. Let ai 2 Ai be one possible strategy for ui, then a =
 (a1; a2; :::; aN) 2 A is a strategy proflle of U. F = ff1; f2; :::; fNg are the players?
 payofi/utility functions, where fi is the payofi function of ui. In general, fi is
 determined by all players? strategies a rather than ai only, i.e., fi = fi(a). Also,
 a?i = (a1; :::; ai?1; ai+1; :::; aN) denotes the players? strategy proflle except player
 ui. Besides pure strategies, players can also take a mixed strategy by randomizing
 among difierent pure strategies. Suppose there are Mi pure strategies in the pure
 156
strategy set Ai = fa1i ; :::; aMii g, and let pij; j = 1; :::;Mi be the probability of ui
 choosing the jth pure strategy aji , then pi = (pi1; :::; piMi) is a mixed strategy.
 Since the payofi function fi is generally not only determined by ai but also by
 a?i, to maximize the payofi, every player needs to know all other players? strategies.
 However, it is generally very di?cult or even impossible for players to know other
 player strategies and payofis. In such a case, to improve their payofis, players
 will try difierent strategies in every play and learn from the strategic interactions
 using the methodology of understanding-by-building, which leads to the concept of
 \Evolutionary Game" [34] [110].
 An evolutionary game is a game that studies the evolution of the interaction
 dependent strategy in populations, and was flrst articulated by John Maynard Smith
 and G. R. Price for evolutionary biology. It is based on the idea that an organism?s
 genes largely determine its fltness in a given environment. Organisms that are more
 flt to the environment will tend to produce more ofispring, due to which genes
 that provide greater fltness have more representation in the population. Therefore,
 fltter genes tend to win over time and drive out other genes. If we treat organisms
 as players and genes as strategies, then the genes which persist in the population
 are the evolutionarily stable strategy (ESS) of an evolutionary game, which can be
 formally deflned as \a strategy such that, if all members of the population adopt it,
 then no mutant strategy could invade the population under the in uence of natural
 selection". The evolution of the strategies over the population by natural selection
 can be characterized by the Wright-Fisher model [54], which is by far the most
 popular stochastic model for reproduction in population genetics. It is based on
 157
Denoising Interpolation
 n
 lI ?lI ?hI
 (a)
 Simultaneous Denoising 
and Interpolation
 n
 lI ?hI
 (b)
 Figure 7.2: (a) Denoising before interpolation; (b) Simultaneous denoising and in-
 terpolation.
 the assumption that the probability of an individual adopting a certain strategy is
 proportional to the expected payofi of the population using that strategy, and the
 strategy updating equation can be written as
 pt+1ij =
 ptijf ti (aji ; a?i)PMi
 k=1 ptikf ti (aki ; a?i)
 ; (7.1)
 where the numerator ptijf ti (aji ; a?i) is the expected payofi of ui using strategy aji , and
 the denominatorPMik=1 ptikf ti (aki ; a?i) is the total expected payofi of ui using difierent
 strategies, which is the normalization term that ensures PMij=1 pt+1ij = 1.
 7.2 Image Denosing and Interpolation as an Evolutionary Game
 7.2.1 Problems of Denoising+Interpolation
 As discussed in the introduction, when the input low resolution image is noisy,
 most of the existing interpolation approaches will also boost the noise and introduces
 severe visual distortions. To avoid such kinds of distortions, one may consider
 performing the denoising before the interpolation as shown in Figure 7.2 (a). The
 noisy low resolution image Inl is flrst passed through a denoising process and an
 estimate of the original low resolution image I^l is found. Then, the estimated low
 158
(a) (b)
 (c) (d)
 Figure 7.3: (a) a region of Lena image that is corrupted by Gaussian noise with noise
 variance  2 = 225; (b) the denoised result using nonlocal; (c) the interpolation result
 of (b) using bicubic; (d) the interpolation result of (b) using SAI.
 resolution image I^l goes through a interpolation process and a reconstruction of
 the original high resolution image I^h is obtained. Nevertheless, since the denoising
 problem itself is ill-posed [20], the low resolution estimate I^l is not perfect and there
 will be some difierences between the estimate I^l and the original Il. On one hand,
 the edge structures and textures may be removed during the denoising process,
 e.g., the details of the hair shown in Figure 7.3 (b). On the other hand, some
 noise may not be e?ciently suppressed during the denoising process, e.g., the region
 around the nose and mouth shown in Figure 7.3 (b). In such a case, if we directly
 159
perform interpolation on I^l, the lost edge structures and textures will never be
 reconstructed back and the remained noise will be boosted which introducing severe
 visual distortions such as fake edge artifacts as shown in Figure 7.3 (c) and (d).
 Therefore, we should jointly consider image denoising and interpolation to achieve
 better reconstruction as illustrated in Figure 7.2 (b).
 7.2.2 Game Theoretic Formulation
 In this simultaneous denoising and interpolation problem, we have the noisy
 observation of the low resolution pixels, Inl (m;n); 1 ? m ? NH ; 1 ? n ? MW .
 The objective of this problem is to estimate the high resolution image Ih(i; j); 1 ?
 i ? 2 ? NH ; 1 ? j ? 2 ? MW based on the noisy low resolution image Inl (n;m).
 Obviously, this problem is ill-posed. To flnd a good estimate for the unknown
 high resolution pixels, we need to exploit the correlation among the low resolution
 pixels and between the low resolution pixels and the high resolution pixels. One
 possible approach is to use the spatially varying linear fllter, i.e., each unknown
 high resolution pixel can be estimated using weighted average of a set of neighboring
 noisy low resolution pixels.
 I^h(i; j) =
 X
 (m;n)2?ij
 wijmnInl (m;n); 8i;8j; (7.2)
 where ?ij is the candidate set of neighboring noisy low resolution pixels for Ih(i; j)
 and wijmn is the weight of candidate pixel Inl (m;n). Usually, we have the constraints
 that 0 ? wijmn ? 1 and
 P
 (m;n)2?ij wijmn = 1.
 Obviously, the optimal weights in (7.2) are not achievable since the original
 160
high resolution pixels are unknown. Most of the existing approaches flnd the ap-
 proximated weights based on the assumption that the covariance matrix of the high
 resolution image can be well-estimated from the covariance matrix of the low res-
 olution image [82]. However, when this assumption is not true, the denoising and
 interpolation performance will be greatly degraded. Here, instead of directly esti-
 mating the weights in one step, we propose to progressively reflne the weights by
 alternatively estimating the weights based on the reconstruction and flnding the
 reconstruction using the weights. Such a reflnement process of the weights is ac-
 tually an evolutionary process and can be naturally formulated as an evolutionary
 game [110]. In the proposed evolutionary game for image denoising and interpola-
 tion, the players are the unknown high resolution pixels to be estimated and the
 pure strategies are the correspondingly neighboring noisy low resolution pixels. By
 regarding the non-negative weights of the neighboring noisy low resolution pixels
 as the probabilities of selecting the pure strategies, the problem of estimating the
 high resolution pixels becomes flnding the evolutionarily stable strategies for the
 evolutionary game. In this sense, the simultaneous image denoising and interpola-
 tion problem is no longer ill-posed. Instead, the problem becomes well-deflned, and
 the objective of the problem is to flnd the evolutionarily stable strategies to achieve
 good denoising and interpolation performance.
 In summary, the estimation problem in (7.2) can be formulated as an evolu-
 tionary game as follows.
 ? Players: unknown high resolution pixels Ih(i; j)
 161
ij?
 hI
 (2 1,2 1)hI m n? ?
 Figure 7.4: The pure strategy set ?ij for Ih(2m?1; 2n?1); Ih(2m?1; 2n); Ih(2m; 2n?
 1); Ih(2m; 2n).
 ? Pure strategies: noisy low resolution pixels Inl (m;n)
 ? Pure strategy set: the candidate set ?ij
 ? Mixed strategy: the estimate I^h(i; j)
 ? Probabilities in the mixed strategy: the non-negative normalized weights wijmn
 7.2.3 Pure Strategy Set ?ij
 In most of the previous interpolation approaches, the high resolution image is
 reconstructed in two steps: in the flrst step, the unknown high resolution pixels sur-
 rounded by four low resolution pixels, i.e., Ih(2m; 2n); and in the second step, other
 unknown high resolution pixels Ih(2m? 1; 2n) and Ih(2m; 2n? 1) are reconstructed
 162
with the help of reconstructed high resolution pixels Ih(2n; 2m). Difierent from pre-
 vious approaches, in our game theoretic framework, pixels are treated as players and
 they have the same priority. The pure strategy set ?ij is the same for all four high
 resolution pixels Ih(2m?1; 2n?1); Ih(2m?1; 2n); Ih(2m; 2n?1); Ih(2m; 2n) and is
 deflned as a square window centered by Ih(2m? 1; 2n? 1) in Figure 7.4. As shown
 in Figure 7.4, the red circle stands for pixel Ih(2m? 1; 2n? 1) and the three green
 squares stand for pixels Ih(2m? 1; 2n), Ih(2m; 2n? 1), and Ih(2m; 2n) respectively.
 The gray square window centered by Ih(2m? 1; 2n? 1), denoted as ?ij, is the pure
 strategy set for Ih(2m? 1; 2n? 1); Ih(2m? 1; 2n); Ih(2m; 2n? 1), and Ih(2m; 2n).
 7.2.4 Payofi Function
 After choosing the pure strategy set, we now discuss how to deflne the payofi
 function. The payofi function f tij(aij; a?ij) measures the player?s payofi of taking
 strategy aij when other players? strategies are a?ij at time t. Let I^ th stand for the
 estimate of the high resolution image Ih at time t, and B(I^ th(i; j)) stand for the
 patch centered by I^ th(i; j). The neighborhood similarity between pixel I^ th(i; j) and
 I^ th(m;n) at time t can be measured by Dt(I^ th(i; j) $ I^ th(m;n)) as follows
 Dt(I^ th(i; j) $ I^ th(m;n)) = jjB(I^ th(i; j))?B(I^ th(m;n))jj2: (7.3)
 Moreover, let us deflne ?Dt?1(I^h(i; j) $ I^h(m;n)) as the weighted average of
 the neighborhood similarity up to time t? 1 as follow
 ?Dt?1(I^h(i; j) $ I^h(m;n)) =
 Pt?1
 k=0 flkDk(I^kh(i; j) $ I^kh(m;n))Pt?1
 k=0 flk
 ; (7.4)
 163
where fl is a discounting factor, and when fl = 1, ?Dt?1(I^h(i; j) $ I^h(m;n)) reduces
 to the simple averaging.
 Note that ?Dt?1(I^h(i; j) $ I^h(m;n)) can be treated as an estimate ofD(Ih(i; j) $
 Ih(m;n)) at time t?1 by taking into account all previous reconstruction I^0h; :::; I^ t?1h ,
 and Dt(I^ th(i; j) $ I^ th(m;n)) is an estimate of D(Ih(i; j) $ Ih(m;n)) at time t us-
 ing the reconstruction I^ th. If I^ th(i; j) is a good estimate, Dt(I^ th(i; j) $ I^ th(m;n))
 tends to be close to ?Dt?1(I^h(i; j) $ I^h(m;n)). If Dt(I^ th(i; j) $ I^ th(m;n)) <
 ?Dt?1(I^h(i; j) $ I^h(m;n)), the neighborhood similarity between pixel I^h(i; j) and
 I^h(m;n) is larger than what we anticipate at time t? 1, which means that a larger
 payofi should be received by adopting strategy I^ th(m;n) at time t. On the other
 hand if Dt(I^ th(i; j) $ I^ th(m;n)) > ?Dt?1(I^h(i; j) $ I^h(m;n)), the neighborhood sim-
 ilarity between pixel I^h(i; j) and I^h(m;n) is smaller than what we anticipate at time
 t ? 1, which means that a smaller payofi should be received by adopting strategy
 I^ th(m;n) at time t. Therefore, the payofi function should be an increasing function
 of ( ?Dt?1(I^h(i; j) $ I^h(m;n) ?Dt(I^ th(i; j) $ I^ th(m;n))). Here, we use the following
 payofi function
 f tij(aij; a?ij) = exp
 0
 @ fit
 ?
 ?Dt?1(I^h(i; j) $ aij)?Dt(I^ th(i; j) $ aij)
 ?
  
1
 A ; (7.5)
 where fit and  are parameters.
 Interestingly, if we set fit = fltPt
 k=0 flk
 and substitute (7.5) back into (7.1), the
 probability of choosing strategy aij at time t+ 1 can be simplifled as
 pt+1aij =
 exp
 ?
 ? ?Dt(I^h(i;j)$aij) 
?
 P
 aij exp
 ?
 ? ?Dt(I^h(i;j)$aij) 
? : (7.6)
 164
Bicubic Neighborhood Similarity
 Probabilities of 
Pure Strategies Reconstruction
 Figure 7.5: The block diagram of the proposed method.
 where ?Dt(I^h(i; j) $ I^h(m;n)) is deflned as
 ?Dt(I^h(i; j) $ I^h(m;n)) =
 Pt
 k=0 flkDk(I^kh(i; j) $ I^kh(m;n))Pt
 k=0 flk
 ; (7.7)
 The (7.6) can be interpreted as follows: the true similarity between Ih(i; j) and
 aij is approximately measured by ?Dt(I^h(i; j) $ aij), and the larger ?Dt(I^h(i; j) $
 aij), the less the similarity; therefore, the pixels with smaller ?Dt(I^h(i; j) $ aij)
 should make more contribution during the denoising and interpolation process, i.e.,
 the probability of choosing aij should be a decreasing function of ?Dt(I^h(i; j) $ aij),
 and here we use an exponential function.
 According to the above discussions, the proposed simultaneous image denoising
 and interpolation using evolutionary games can be summarized as in Figure 7.5. As
 shown in Figure 7.5, the noisy low resolution image Inl is flrst interpolated using
 Bicubic [75] to obtain a noisy estimate of the high resolution image I^0h. Then, at
 each evolution time index t, we compute the neighborhood similarity Dt(I^ th(i; j) $
 I^ th(m;n)) using (7.3). Then, the probability of using a certain pure strategy aij can
 be updated using (7.6) and (7.7). Finally, the estimate of the high resolution image
 165
at time index t+ 1 can be found as follows
 I^ t+1h (i; j) =
 X
 aij2?ij
 pt+1aij Inl (aij): (7.8)
 Note that the outcome of the proposed game theoretic algorithm is an evo-
 lutionarily stable strategy, and the proof can be found in the following Theorem
 1.
 Theorem 1: For any fl 2 [0; 1), the outcome of the proposed game theoretic
 algorithm shown in Figure 7.5 is an evolutionarily stable strategy (ESS).
 Proof : Since fl 2 [0; 1), flk goes to zero as k goes to inflnite. According to
 (7.7), ?Dt(I^h(i; j) $ I^h(m;n)) converges for su?ciently large t. Therefore, according
 to (7.6), pt+1aij converges for su?ciently large t, which means that the probability
 distribution of using the pure strategies pij = (p1; :::; pj?ij j) converges. Since such a
 strategy is chosen under the in uence of Wright-Fisher natural selection model [54],
 according to the deflnition of ESS, the strategy is an ESS.
 7.3 Experimental Results
 To evaluate the proposed game theoretic approach for simultaneous image de-
 noising and interpolation, we compare with the methods that flrst perform denoising
 using nonlocal [20] and then perform interpolation using either bicubic method [75]
 or the soft-decision adaptive interpolation (SAI) method [135], which are denoted as
 \Nonlocal+Bicubic" and \Nonlocal+SAI". Five images shown in Figure 7.6:
 Lena, Boat, Kodim03, Kodim07, and Kodim09, are tested. For convenience, all
 tested image are truncated to 512x512. The noisy low resolution images are gen-
 166
erated by flrst adding additive white Gaussian noise to the high resolution images
 and then directly perform downsample.
 We flrst evaluate the PSNR performance for difierent approaches. In Figure
 7.7, we show the PSNR comparison among Nonlocal+Bicubic, Nonlocal+SAI and
 the proposed method. We can see that the proposed method outperforms both
 Nonlocal+Bicubic and Nonlocal+SAI when  ? 10, and the gain becomes larger
 and larger as  increases. We can also see that as  increases, the advantage of
 using SAI diminishes. This is mainly because too many details are removed during
 the denoising process when  is high. This phenomenon fully demonstrates the
 importance of performing joint denoising and interpolation. When 5 <  < 10,
 the PSNR performance of the proposed method is similar to that of Nonlocal+SAI
 (the proposed method has slightly better performance for Lena and Kodim03, and
 slightly worse performance for Boat and Kodim09). However, both the proposed
 method and Nonlocal+SAI are better than Nonlocal+Bicubic. When  ? 5, the
 proposed method performs slightly worse than Nonlocal+SAI. This is mainly be-
 cause the proposed method not only removes the additive noise but also removes the
 sensor noise in the original image, and the PSNR is computed based on the original
 images which contains some sensor noise. Nevertheless, the proposed method can
 achieve much better visual quality even in the low  region as shown in Figure 7.9.
 We also evaluate the visual quality performance and show the visual quality
 comparison in Figures 7.8, 7.9, 7.10, 7.11, and 7.12, respectively, for difierent images
 at difierent noise variances. The results for Lena image are shown in Figure 7.8,
 where (a) is the noisy low resolution Lena image with  = 10, (b) is the result
 167
generated by Nonlocal+Bicubic with PSNR 30.71dB, (c) is the result generated by
 Nonlocal+SAI method with PSNR 31.11dB, and (d) is the result generated by the
 proposed method with PSNR 31.38dB. By comparing (b) (c) and (d) in Figure 7.8,
 we can see that both Nonlocal+Bicubic and Nonlocal+SAI cannot well suppress
 noise in the edge and texture regions and introduce some visually annoying artifacts
 such as fake edge artifacts in the face region. Since the proposed game theoretic
 method can simultaneously perform denoising and interpolation, we can avoid the
 artifacts caused by the separation of denoising and interpolation and generate the
 reconstruction with much better performance in terms of both PSNR and visual
 quality.
 Moreover, the proposed method has the ability to automatically remove the
 visually annoying sensor noise. In Figure 7.9, we show the reconstructed results
 for Boat image. Note that the original Boat image contains some sensor noise es-
 pecially in the sky region as shown in Figure 7.6 (b). In Figure 7.9, (a) is the
 noisy low resolution Boat image with  = 5, (b) is the result generated by Non-
 local+Bicubic with PSNR 28.52dB, (c) is the result generated by Nonlocal+SAI
 method with PSNR 28.92dB, and (d) is the result generated by the proposed method
 with PSNR 28.74dB. By comparing (b) (c) and (d) in Figure 7.9, we can see that
 the proposed method automatically removes the sensor noise in the sky region while
 Nonlocal+SAI and Nonlocal+Bicubic cannot. In such a case, although the proposed
 method has a lower PSNR performance compared with Nonlocal+SAI, the visual
 quality of the result generated by the proposed method is still much better than
 that of Nonlocal+SAI.
 168
The visual quality of the reconstructions of Kodim07, Kodim09 and Kodim03
 are also evaluated at noise level  = 10,  = 15, and  = 20 in Figures 7.10,
 7.11, and 7.12, respectively. Similar to previous experiments, the proposed method
 can greatly suppress the noise and restore the image with not only better PSNR
 performance but also better visual quality, especially in the regions around edges
 and textures.
 7.4 Summary
 In this chapter, we investigated the problem of simultaneous image denoising
 and interpolation from a completely new angle: game theoretic perspective. We
 treat each unknown high resolution pixel as an individual player and formulate the
 joint image denoising and interpolation problem as an evolutionary game. From
 such a perspective, the problem of estimating the high resolution pixels becomes
 flnding the evolutionarily stable strategies for the evolutionary game. The exper-
 imental results show that compared with the methods that flrst denoise the noisy
 low resolution image and then interpolate the denoised image, the proposed game
 theoretic approach can achieve not only better PSNR performance but also better
 visual quality.
 169
(a) (b)
 (c) (d)
 (e)
 Figure 7.6: The test images: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07; (e)
 Kodim09.
 170
0 5 10 15 20 25 30 35 40
 24
 26
 28
 30
 32
 34
 36
 ?
 PSNR (dB
 )
 Lena
  
 
Nonlocal+Bicubic
 Nonlocal+SAI
 Proposed
 (a)
 0 5 10 15 20 25 30 35 40
 22
 23
 24
 25
 26
 27
 28
 29
 30
 ?
 PSNR (dB
 )
 Boat
  
 
Nonlocal+Bicubic
 Nonlocal+SAI
 Proposed
 (b)
 0 5 10 15 20 25 30 35 40
 25
 26
 27
 28
 29
 30
 31
 ?
 PSNR (dB
 )
 Kodim03
  
 
Nonlocal+Bicubic
 Nonlocal+SAI
 Proposed
 (c)
 0 5 10 15 20 25 30 35 40
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 ?
 PSNR (dB
 )
 Kodim07
  
 
Nonlocal+Bicubic
 Nonlocal+SAI
 Proposed
 (d)
 0 5 10 15 20 25 30 35 40
 25
 26
 27
 28
 29
 30
 31
 32
 ?
 PSNR (dB
 )
 Kodim09
  
 
Nonlocal+Bicubic
 Nonlocal+SAI
 Proposed
 (e)
 Figure 7.7: The PSNR comparison: (a) Lena; (b) Boat; (c) Kodim03; (d) Kodim07;
 (e) Kodim09.
 171
(a) (b)
 (c) (d)
 Figure 7.8: The visual quality comparison for Lena: (a) the noisy low resolution
 image with  = 10; (b) the result generated by Nonlocal+Bicubic (30.71dB); (c)
 the result generated by Nonlocal+SAI method (31.11dB); (d) the result generated
 by the proposed method (31.38dB).
 172
(a) (b)
 (c) (d)
 Figure 7.9: The visual quality comparison for Boat: (a) the noisy low resolution
 image with  = 5; (b) the result generated by Nonlocal+Bicubic (28.52dB); (c) the
 result generated by Nonlocal+SAI method (28.92dB); (d) the result generated by
 the proposed method (28.74dB).
 173
(a) (b)
 (c) (d)
 Figure 7.10: The visual quality comparison for Kodim07: (a) the noisy low resolution
 image with  = 10; (b) the result generated by Nonlocal+Bicubic (29.85dB); (c)
 the result generated by Nonlocal+SAI method (30.18dB); (d) the result generated
 by the proposed method (30.35dB).
 174
(a) (b)
 (c) (d)
 Figure 7.11: The visual quality comparison for Kodim09: (a) the noisy low resolution
 image with  = 15; (b) the result generated by Nonlocal+Bicubic (29.56dB); (c)
 the result generated by Nonlocal+SAI method (29.81dB); (d) the result generated
 by the proposed method (30.41dB).
 175
(a) (b)
 (c) (d)
 Figure 7.12: The visual quality comparison for Kodim03: (a) the noisy low resolution
 image with  = 20; (b) the result generated by Nonlocal+Bicubic (27.85dB); (c)
 the result generated by Nonlocal+SAI method (28.03dB); (d) the result generated
 by the proposed method (28.37dB).
 176
Chapter 8
 Conclusions and Future Work
 8.1 Conclusions
 In this thesis, we develop a game-theoretic framework that enables us to flrst
 analyze and model human behaviors in multimedia social networks, and then better
 design the multimedia systems by taking into account the impact of human factors.
 We have showed that understanding the human behaviors and dynamics in a mul-
 timedia social network can ultimately ofier better system performance and thus is
 essential for its continued progress, and such analysis and modeling can be applied
 to any social networks.
 Moreover, we extend the concept of multimedia social networking into classical
 signal/image processing problems to liberate pixels/signals as players to develop
 a game-theoretic framework that can, not only overcome some of the undesired
 ill-posed formulations in traditional approaches, but also obtain a more general
 paradigm beyond what can be accomplished by using traditional optimization tools.
 With the notion of learning and cognitive process inherent in a game theoretic
 formulation, we show that many classical approaches are basically special cases of
 the proposed game-theoretic framework. Therefore, the proposed framework ofiers
 new directions, insight, and methodologies in further advancing of the science of
 signal/image processing.
 177
The broader impact of this thesis is
 1. Social networks have pervaded our daily life. By illustrating that game theory
 can be used to understand human behavior and dynamics in a multimedia
 social network with better system performance, this thesis can motivate similar
 new ideas in many social networks.
 2. Signal and image processing has been a fundamental tool in many scientiflc
 and engineering disciplines. By introducing the proposed new game-theoretic
 paradigm to classical signal/image problems with new insight and signiflcant
 performance improvement, this thesis can trigger similar new thinking to many
 scientiflc areas that use signal/image processing.
 Speciflcally, in Chapter 3, we consider a non-cooperative multimedia social
 network and discuss how a group of users compete for the same resource. We use
 multiuser rate allocation social network as an example and show that game the-
 ory can provide a more general framework by theoretically proving that the tradi-
 tional optimization-based approach is a special case of the proposed game theoretical
 framework. Moreover, with the proposed method, we can flnd, in a distributed man-
 ner, a NE that is not only e?cient from system designer?s perspective but also fair
 from users? perspective. Then, in Chapter 4, we consider a cooperative multimedia
 social network and discuss how a group of selflsh users cooperate with each other to
 better obtain the content. We use cooperative peer-to-peer streaming social network
 as an example and show that evolutionary game can be used in such a scenario and
 ESS is the desired cooperative strategy. Moreover, we propose a distributed learn-
 178
ing algorithm for users to converge to the ESS by learning from their own payofi
 history. In Chapter 5, we discuss how to stimulate cooperation in cooperative social
 networks. We flrst show that most of the existing game theoretic cooperation stim-
 ulation approaches fail when the number of interaction between a pair of players is
 flnite, and the major reason is the use of direct reciprocity. Then, we propose to
 use indirect reciprocity games to stimulate cooperation in such a scenario by taking
 into account the indirect opinions. With such a modeling, we show with simulations
 that an evolutionarily stable cooperative strategy can be achieved with a proper
 cost-to-gain ratio. In Chapter 6, the image denoising problem is formulated as a
 coalition formation game, where every pixel is treated as a player who tries to seek
 partners to form a coalition to flnd the optimal neighborhoods for better denoising
 results. By forming a coalition, every player in the coalition can obtain certain gains
 by improving the accuracy of the distortion estimation, while incurring some costs
 by increasing the true distortion. With such a formulation, the traditional image
 denoising approaches using a heuristically determined neighborhood set can be seen
 as special cases of the proposed game theoretical framework by choosing the utility
 function without a cost term. Another example is formulating the problem of si-
 multaneous image denoising and interpolation as an evolutionary game in Chapter
 7, where the players are the unknown high resolution pixels and the pure strategies
 of the players are the corresponding noisy low resolution neighbors. By regarding
 the nonnegative weights of the noisy low resolution pixels as the probabilities of
 selecting the pure strategies, the problem of estimating the high resolution pixels
 becomes flnding the evolutionarily stable strategies for the evolutionary game. In
 179
this sense, we say that the simultaneous image denoising and interpolation problem
 is no longer ill-posed. Instead, the problem becomes well-deflned, and the objec-
 tive of the problem is to flnd the evolutionarily stable strategies to achieve good
 denoising and interpolation performance.
 8.2 Future Work
 Recently, the area of human and social dynamics has been identifled by the
 U.S. National Science Foundation as one of its flve priority areas, which shows
 the importance of this emerging interdisciplinary research area. Game theoretic
 modeling for multimedia social networks is a new emerging research fleld and is
 still in an infant stage. There are a lot of exciting problems to be investigated and
 addressed, which I will continue to devote my efiorts to.
 Security in multimedia social networks is one of these problems. Besides self-
 ish users, there are a group of users, called malicious users, in multimedia social
 networks. Unlike selflsh users whose aims are to maximize their own payofis, the
 objective of malicious users is to damage or even break down the system. Therefore,
 to successfully deploy the multimedia social networks, we need to study and analyze
 the malicious users attack strategies and develop the corresponding attack-resistant
 strategies.
 With my previous works, we can see that understanding the behavior dy-
 namics among users can ultimately ofier better system performance. Such analysis
 and modeling can be applied to other social networks. Therefore, in the future, I
 180
would like to investigate the possibility of using such analysis and modeling in other
 networks such as online social networks, smart grid networks and camera networks.
 In my previous works, I have successfully used the concept of multimedia
 social networks to reformulate image denoising and image interpolation problems.
 In the future, I would like to apply the concept to more classical problems such
 as image/video compression problems, estimation and detection problems, pattern
 recognition and classiflcation problems, adaptive signal processing problems, and
 information theory related problems. It is our belief that from the multimedia social
 networks point of view, we can make the ill-posed problems well deflned and are
 able to construct generalized and unifled frameworks for the classical problems. We
 hope that we can introduce a new paradigm not only in the fleld of signal and image
 processing but also in many other flelds such as computer vision and information
 theory.
 181
Bibliography
 [1] Accustream iMedia Research Homepage: http://www.accustreamresearch.com.
 [2] Flickr: http://www. ickr.com.
 [3] Napster: http://www.napster.com.
 [4] PPLive: http://www.pplive.com.
 [5] PPStream: http://www.ppstream.com.
 [6] H.264/AVC Software: http://iphome.hhi.de/suehring/tml/.
 [7] Sopcast: http://www.sopcast.com.
 [8] UUSee: http://www.uusee.com.
 [9] Youtube: http://www.youtube.com.
 [10] A. M. Frieze A. D. Flaxman and J. Vera. A geometric preferential attachment
 model of networks ii. In Proceedings of the 5th Workshop On Algorithms And
 Models For The Web-Graph, page 4155, 2007.
 [11] F. Abramovich and Y. Benjamini. Adaptive thresholding of wavelet coe?-
 cients. Computational Statistics and Data Analysis, 22:351{361, 1996.
 [12] Ishfaq Ahmad and Jiancong Luo. On using game theory for perceptually
 tuned rate control algorithm for video coding. IEEE Trans. Circuits Syst.
 Video Technol., 16(2):202{208, Feb. 2006.
 [13] Allebach and P. W. Wong. Edge-directed interpolation. In IEEE Int. Conf.
 Image Processing, volume 3, pages 707{710, 1996.
 [14] L. Anderegg and S. Eidenbenz. Ad hoc-vcg: A truthful and cost-e?cient
 routing protocol for mobile ad hoc networks with selflsh agents. In Proc.
 ACM MobiCom, 2003.
 [15] K. R. Apt and A. Witzel. A generic approach to coalition formation. In Proc.
 Int. Workshop Computational Social Choice (COMSOC), 2006.
 [16] Lawrence M. Ausubel. An e?cient ascending-bid auction for multiple objects.
 American Economic Review, 94:1452{1475, 2004.
 [17] A.-L. Barabasi and R. Albert. Emergence of scaling in random networks.
 Science, 286:509{512, 1999.
 [18] B. Bollobas and O. Riordan. The diameter of a scale-free random graph. In
 Combinatorica, page 534, 2004.
 182
[19] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University
 Press, 2004.
 [20] A. Buades, B. Coll, and J. M. Morel. A review of image denoising algorithms,
 with a new one. SIAM Multiscale Modeling and Simulation, 4(2):490{530,
 July 2005.
 [21] A. Buades, B. Coll, and J. M. Morel. The staircasing efiect in neighborhood
 fllters and its solution. IEEE Trans. Image Processing, 15:1499{1505, 2006.
 [22] S. Buchegger and J. Y. Le Boudec. Performance analysis of the confldant
 protocol. In Proc. ACM MobiHoc, 2002.
 [23] L. Buttyan and J. P. Hubaux. Enforcing service availability in mobile ad-hoc
 networks. In Proc. ACM MobiHoc, 2000.
 [24] S. Carrato and L. Tenze. A high quality 2x image interpolator. IEEE Signal
 Process. Lett., 7(6):132{135, Jun. 2000.
 [25] Y. Cha and S. Kim. The error-amended sharp edge (ease) scheme for image
 zooming. IEEE Trans. Image Process., 16(6):1496{1505, Jun. 2007.
 [26] T. Chan and J. Shen. Image processing and analysis: variational, PDE,
 wavelet, and stochastic methods. SIAM, 2005.
 [27] Y. Chen, W. S. Lin, and K. J. R. Liu. Risk-distortion analysis for video
 collusion attacks: A mouse-and-cat game. IEEE Trans. on Image Processing,
 19(7):1798{1807, July 2010.
 [28] Y. Chen and K. J. R. Liu. Indirect reciprocity game modelling for cooperation
 stimulation in cognitive networks. IEEE Trans. Commun., 59(1):159{168, Jan.
 2011.
 [29] Y. Chen, B. Wang, and K. J. R. Liu. Multi-user rate allocation games for
 multimedia communications. IEEE Trans. Multimedia, 11(6):1170{1181, Oct.
 2009.
 [30] Y. Chen, Y. Wu, B. Wang, and K.J.R. Liu. Spectrum auction games for
 multimedia streaming over cognitive radio networks. IEEE Trans. Commun.,
 58(8):2381{2390, Aug. 2010.
 [31] Yan Chen and Oscar C. Au. Simultaneous RD-Optimized rate control and
 video de-noising. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Pro-
 cessing, 2008.
 [32] Yan Chen, Oscar C. Au, and Xiaopeng Fan. Simultaneous MAP-based video
 denoising and rate-distortion optimized video encoding. IEEE Trans. on Cir-
 cuits and Sytems for Video Technology, 19(1):15{26, Jan. 2009.
 183
[33] Yan Chen and K. J. Ray Liu. A game theoretical approach for image denoising.
 In Proc. IEEE Int. Conf. Image Process., 2010.
 [34] Yan Chen, Beibei Wang, W. Sabrina Lin, Yongle Wu, and K. J. Ray Liu. Co-
 operative peer-to-peer streaming: An evolutionary game-theoretic approach.
 IEEE Trans. on Circuits and Systems for Video Technology, 20(10):1346{1357,
 Oct. 2010.
 [35] T. H. Chiang and Y. Q. Zhang. A new rate control scheme using quadratic
 rate distortion model. IEEE Trans. Circuits Syst. Video Technol., 7:246{250,
 1997.
 [36] Y. Chu, S. G. Rao, and H. Zhang. A case for end system multicast. In ACM
 SIGMETRICS, pages 1{12, 2000.
 [37] F. R. K. Chung and L. Lu. The diameter of sparse random graphs. In Advances
 in Applied Mathematics, page 257279, 2001.
 [38] C. Cooper and A. Frieze. A general model of web graphs. In Random Struc-
 tures and Algorithms, page 311335, 2003.
 [39] J. R. Corbera and S. Lei. Rate control in DCT video coding for low-delay
 communications. IEEE Trans. Circuits Syst. Video Technol., 9:172{185, 1999.
 [40] P. Cramton, Y. Shoham, and R. Steinberg. Combinatorial Auctions. MIT
 Press, 2006.
 [41] R. Cressman. Evolutionary dynamics and extensive form games. MIT Press,
 2003.
 [42] J. Crowcroft, R. Gibbens, F. Kelly, and S. Ostring. Modeling incentives for
 collaboration in mobile ad hoc networks. In Proc. Workshop Modeling and
 Optimization in Mobile, Ad Hoc and Wireless Networks (WiOPT?03), 2003.
 [43] Y. Zhan D. Chakrabarti and C. Faloutsos. R-mat: A recursive model for
 graph mining. In SIAM Conference on Data Mining, 2004.
 [44] S. Lawrence E. J. Glover D. M. Pennock, G. W. Flake and C. L. Giles. Win-
 ners don?t take all: Characterizing the competition for links on the web. In
 Proceedings of the National Academy of Sciences, page 52075211, 2002.
 [45] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse
 3d transform-domain collaborative flltering. IEEE Trans. Image Processing,
 16(8):2080{2095, 2007.
 [46] M. Dai, D. Loguinov, and H. Radha. Rate-distortion modeling of scalable
 video coders. In Proc. IEEE Int. Conf. Image Processing, 2004.
 184
[47] S. Deering and D. Cheriton. Multicast routing in datagram inter-networks
 and extended lans. ACM Transactions on Computer Systems, 8(2):85{111,
 1990.
 [48] W. Ding and B. Liu. Rate control of MPEG video coding and recording
 by rate-quantization modeling. IEEE Trans. Circuits Syst. Video Technol.,
 6:12{20, 1996.
 [49] P. Domingos and M. Richardson. Mining the network value of customers. In
 Proceedings of the 7th ACM SIGKDD international conference on Knowledge
 discovery and data mining, 2001.
 [50] D. L. Donoho. De-noising by soft-thresholding. IEEE Trans. Inform. Theory,
 41:613{627, May 1995.
 [51] M. Elad. On the origin of the bilateral fllter and ways to improve it. IEEE
 Trans. Image Processing, 10(10):1141{1151, 2002.
 [52] M. Elad. Why shinkage is still relevant for redundant representations? IEEE
 Trans. Inform. Theory, 52(12):5559{5569, 2006.
 [53] M. Felegyhazi, J. P. Hubaux, and L. Buttyan. Nash equilibria of packet for-
 warding strategies in wireless ad hoc networks. IEEE Trans. Mobile Comput.,
 5:463{476, May 2006.
 [54] R. Fisher. The genetical theory of natural selection. Clarendon Press, Oxford,
 1930.
 [55] G. Gilboa, N. Sochen, and Y. Y. Zeevi. Image enhancement and denoising
 by complex difiusion processes. IEEE Trans. Pattern Anal. Machine Intell.,
 26:1020{1036, Aug. 2004.
 [56] G. Gilboa, N. Sochen, and Y. Y. Zeevi. Variational denoising of partl tex-
 tured images by spatially varying constraints. IEEE Trans. Image Processing,
 15:2281{2289, Aug. 2006.
 [57] P. Golle, K. Leyton-Brown, and I. Mironov. Incentive for sharing in peer-
 to-peer networks. In Proc. of the ACM Conference on Electronic Commerce,
 2001.
 [58] O. Guleryuz. Weighted averaging for denoising with overcomplete dictionaries.
 IEEE Trans. Image Processing, 16(12):3020{3034, 2007.
 [59] M. Gupta, P. Judge, and M. Ammar. A reputation system for peer-to-peer
 networks. In Proc. of the ACM 13th International Workshop on Network and
 Operating Systems Support for Digital Audio and Video, 2003.
 [60] Ahsan Habib and John Chuang. Service difierentiated peer selection: An
 incentive mechanism for peer-to-peer media streaming. IEEE Trans. Multi-
 media, 8(3):610{621, 2006.
 185
[61] Zhu Han, Zhu Ji, and K. J. Ray Liu. Fair multiuser channel allocation for
 OFDMA networks using Nash bargaining solutions and coalitions. IEEE
 Trans. Commun., 53:1366{1376, 2005.
 [62] Zhu Han, Zhu Ji, and K. J. Ray Liu. A cartel maintenance framework to en-
 force cooperation in wireless networks with selflsh users. IEEE Trans. Wirless
 Commun., 7:1889{1899, 2008.
 [63] H. M. Hang and J. J. Chen. Source model for transform video coder and its
 application - part i: Fundamental theory. IEEE Trans. Circuits Syst. Video
 Technol., 7:287{298, 1997.
 [64] Zhihai He and S. K. Mitra. A unifled rate-distortion analysis framework for
 tansform coding. IEEE Trans. Circuits Syst. Video Technol., 11:1221{1236,
 2001.
 [65] X. Hei, C. Liang, J. Liang, Y. Liu, and K. Ross. A measurement study of a
 large-scale p2p iptv system. IEEE Trans. Multimedia, 9(8):1672{1687, 2007.
 [66] Y. Hel-Or and D. Shaked. A discriminative approach for wavelet shrinkage
 denoising. IEEE Trans. Image Processing, 17(4):443{457, 2008.
 [67] K. Hirakawa and T. W. Parks. Joint demosaicking and denoising. IEEE Trans.
 on Image Processing, 15(8):2146{2157, Aug. 2006.
 [68] Cheng Huang, Jin Li, and Keith W. Ross. Can internet video-on-demand be
 profltable? In ACM SIGCOMM, pages 133{144, 2007.
 [69] K. Jensen and D. Anastassiou. Subpixel edge localization and the interpolation
 of still images. IEEE Trans. Image Process., 4(3):285{295, Mar. 1995.
 [70] F. Jin, P. Fieguth, L. Winger, and E. Jernigan. Adaptive wiener flltering of
 noisy images and image sequences. In Proc. IEEE Int. Conf. Image Process.,
 pages 349{352, 2003.
 [71] T. Karagiannis, P. Rodriguez, and K. Papagiannaki. Should internet service
 providers fear peer-assisted content distribution? In Proceedings of the Inter-
 net Measurement Conference, 2005.
 [72] F. Kelly. Charging and rate control for elastic tra?c. Eur. Trans. Telecom-
 mun., 28:33{37, 1997.
 [73] C. Kervrann and J. Boulanger. Optimal spatial adaptation for patch-based
 image denoising. IEEE Trans. Image Processing, 15(10):2866{2878, 2006.
 [74] C. Kervrann and J. Boulanger. Local adaptivity to variable smoothness for
 exemplar-based image regularization and representation. International Jour-
 nal of Computer Vision, 79:45{69, 2008.
 186
[75] R. G. Keys. Cubic convolution interpolation for digital image processing.
 IEEE TranS. Acoust., Speech, Signal Process., 29(6):1153{1160, Dec. 1981.
 [76] L. Kontothanassis, R. Sitaraman, J. Wein, D. Hong, R. Kleinberg, B. Man-
 cuso, D. Shaw, and D. Stodolsky. A transport layer for live streaming in a
 content delivery network. IEEE Proceedings, 92:1408{1419, 2004.
 [77] V. Krishna. Auction Theory. Academic Press, 2002.
 [78] T. V. Lakshman, A. Ortega, and A. R. Reibman. VBR video: Trade-ofis and
 potentials. In Proc. IEEE, 1998.
 [79] X. Wu Lei Zhang and D. Zhang. Color reproduction from noisy cfa data of
 single sensor digital cameras. IEEE Trans. on Image Processing, 16(9):2184{
 2197, Sept. 2007.
 [80] J. Leskovec. Dynamics of large networks. PhD Dissertation, Machine Learning
 Department, Carnegie Mellon University, Technical report CMU-ML-08-111,
 2008.
 [81] M. Li and T. Q. Nguyen. Markov random fleld model-based edge-directed
 image interpolation. IEEE Trans. Image Process., 17(7):1121{1128, Jul. 2008.
 [82] X. Li and M. T. Orchard. New edge-directed interpolation. IEEE Trans.
 Image Process., 10(10):1521{1527, Oct. 2001.
 [83] Z. G. Li, F. Pan, K. P. Lim, X. Lin, and S. Rahardja. Adaptive rate control
 for H.264. In Proc. IEEE Int. Conf. Image Processing, 2004.
 [84] W. S. Lin, H. V. Zhao, and K. J. R. Liu. Incentive cooperation strategies
 for peer-to-peer live multimedia streaming social networks. IEEE Trans. on
 Multimedia, 11(3):396{412, 2009.
 [85] M. Littman. Markov games as a framework for multi-agent reinforcement
 learning. In Proc. International Conference on Machine Learning, 1994.
 [86] Yong Liu, Yang Guo, and Chao Liang. A survey on peer-to-peer video stream-
 ing systems. Journal of Peer-to-Peer Networking and Applications, 1:18{28,
 2008.
 [87] H. V. Madhyastha, T. Isdal, M. Piatek, C. Dixon, T. Anderson, A. Krishna-
 murthy, and A. Venkataramani. iplane: An information plane for distributed
 services. In 7th Symposium on Operating Systems Design and Implementation
 (OSDI), 2006.
 [88] S. Marti and H. Garcia-Molina. Limited reputation sharing in p2p systems.
 In Proc. of the 5th ACM Conference on Electronic Commerce, 2004.
 [89] S. Marti, T. J. Giuli, K. Lai, and M. Baker. Mitigating routing misbehavior
 in mobile ad hoc networks. In Proc. ACM MobiCom, 2000.
 187
[90] P. Michiardi and R. Molva. Core: A collaborative reputation mechanism to
 enforce node cooperation in mobile ad hoc networks. In Proc. IFIP-Comm.
 and Multimedia Security Conf, 2002.
 [91] P. Michiardi and R. Molva. A game theoretical approach to evaluate coop-
 eration enforcement mechanisms in mobile ad hoc networks. In Proc. Work-
 shop Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks
 (WiOPT?03), 2003.
 [92] M. E. J. Newman. The spread of epidemic disease on networks. In Physical
 Review E, 2002.
 [93] Martin A. Nowak and Karl Sigmund. Evolution of indirect reciprocity. Nature,
 437:1291{1298, 2005.
 [94] Hisashi Ohtsuki, Yoh Iwasa, and Martin A. Nowak. Indirect reciprocity pro-
 vides only a narrow margin for e?ciency for costly punishment. Nature,
 457:79{82, 2009.
 [95] A. Ortega and K. Ramchandran. Rate-distortion techniques in image and
 video compression. IEEE Signal Process. Mag., 15:25{50, 1998.
 [96] M. J. Osborne and A. Rubinste. A Course in Game Theory. The MIT Press,
 1994.
 [97] Martin J. Osborne. An introduction to game theory. Oxford University Press,
 2004.
 [98] H. Park and M. van der Schaar. Bargaining strategies for networked mul-
 timedia resource management. IEEE Trans. Signal Process., 55:3496{3511,
 2007.
 [99] A. Pizurica, W. Philips, I. Lemahieu, and M. Acheroy. A joint inter and
 intrascale statistical model for bayesian wavelet based image denoising. IEEE
 Trans. Image Processing, 11(5):545{557, 2002.
 [100] Darshan Purandare and Ratan Guha. An alliance based peering scheme for
 p2p live media streaming. IEEE Trans. Multimedia, 9(8):1633{1644, 2007.
 [101] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic
 Programming. John Wiley and Sons, 1994.
 [102] P. Raghavan R. Kumar, J. Novak and A. Tomkins. On the bursty evolution of
 blogspace. In Proceedings of the 11th international conference on World Wide
 Web, page 568576, 2003.
 [103] S. Rajagopalan D. Sivakumar A. Tomkins R. Kumar, P. Raghavan and E. Up-
 fal. Stochastic models for the web graph. In Proceedings of the 41st Annual
 Symposium on Foundations of Computer Science, 2000.
 188
[104] D. Ray. A Game-Theoretic Perspective on Coalition Formation. Oxford Uni-
 versity Press, 2007.
 [105] W. Rhee and J. M. Cio?. Increase in capacity of multiuser OFDM system
 using dynamic subchannel allocation. In Proc. IEEE Veh. Technol. Conf.,
 2000.
 [106] Cem U. Saraydar, Narayan B. Mandayam, and David J. Goodman. E?cient
 power control via pricing in wireless data networks. IEEE Trans. Commun.,
 50:291{303, 2002.
 [107] S. Seetharaman and M. Ammar. Characterizing and mitigating inter-domain
 policy violations in overlay routes. In Proc. of IEEE International Conference
 on Network Protocols, 2006.
 [108] L. Shapley. Stochastic games. In Proceedings of the National Academy of
 Sciences of USA, page 10951100, 1953.
 [109] C. Shen and M. van der Schaar. Optimal resource allocation for multimedia
 applications over multiaccess fading channels. IEEE Trans. Wirless Commun.,
 7:3546{3557, 2008.
 [110] J. Maynard Smith. Evolutionary and the theory of games. Cambridege Uni-
 versity Press, 1982.
 [111] S. M. Smith and J. M. Brady. Susan - a new approach to low level image
 processing. International Journal of Computer Vision, 23(1):45{78, 1997.
 [112] V. Srinivasan, P. Nuggehalli, C. F. Chiasserini, and R. R. Rao. Cooperation
 in wireless ad hoc networks. In Proc. IEEE INFOCOM, 2003.
 [113] C. Stein. Estimation of the mean of a multivariate normal distribution. In
 Ann. Statist., pages 1135{1151, 1981.
 [114] K. Stuhlmuller, N. Farber, M. Link, and B. Girod. Analysis of video transmis-
 sion over lossy channels. IEEE J. Sel. Areas Commun., 18:1012{1032, 2000.
 [115] Y. Sun, W. Yu, Z. Han, and K. J. R. Liu. Information theoretic framework
 of trust modelling and evaluation for ad hoc networks. IEEE Journal of Se-
 lected Areas in Communications, special issue on Security in Wireless Ad Hoc
 Networks, 24:305{317, 2006.
 [116] Ryan W. Thomas, Daniel H. Friend, Luiz A. DaSilva, and Allen B. MacKen-
 zie. Cognitive networks: Adaptation and learning to achieve end-to-end per-
 formance objectives. IEEE Commun. Mag., 44:51{57, December 2006.
 [117] C. Tomasi and R. Manduchi. Bilateral flltering for gray and color images. In
 Proc. of the Sixth Int. Conf. on Computer vision, pages 839{846, 1998.
 189
[118] A. Urpi, M. Bonuccelli, and S. Giordano. Modeling cooperation in mobile ad
 hoc networks: A formal description of selflshness. In Proc. Workshop Modeling
 and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOPT?03),
 2003.
 [119] W. Vickrey. Counterspeculation, auctions, and competitive sealed tenders.
 The Journal of Finance, 16(1):8{37, 1961.
 [120] D.V.D. Ville, M. Nachtegael, D.V.D. Weken, E. E. Kerre, W. Philips, and
 I. Lemahieu. Noise reduction by fuzzy image flltering. Aug. 2004.
 [121] V. Vishumurthy, S. Chandrakumar, and E.G. Sirer. Karma: A secure eco-
 nomic framework for peer-to-peer resource sharing. In Proc. of the 2003 Work-
 shop on Economics of Peer-to-Peer Systems, 2003.
 [122] B. Wang, K. J. R. Liu, and T. C. Clancy. Evolutionary cooperative spectrum
 sensing game: How to collaborate? IEEE Trans. Commun., 58(3):890{900,
 2010.
 [123] B. Wang, Y. Wu, and K. J. R. Liu. Game theory for cognitive radio networks:
 An overview. Computer Networks, 54(14):2537{2561, Oct. 2010.
 [124] M. Wang and M. van der Schaar. Rate-distortion modeling for Wavelet video
 coders. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing,
 2005.
 [125] D. J. Watts and S. H. Strogatz. Collective dynamics of ?small-world? networks.
 Nature, 393:440{442, 1998.
 [126] J. Woods and C. Radewan. Kalman flltering in two dimensions. IEEE Trans.
 Inform. Theory, 23:473{482, Jul. 1977.
 [127] Di Wu, Chao Liang, Yong Liu, and Keith W. Ross. View-upload decoupling:
 A redesign of multi-channel p2p video systems. In IEEE Conference on Com-
 puter Communications (IEEE INFOCOM ?09), Mini-Conference, 2009.
 [128] Min Xiao and Debao Xiao. Understanding peer behavior and designing incen-
 tive mechanism in peer-to-peer networks: An analytical model based on game
 theory. In Proc. of International Conference on Algorithms and Architectures
 for Parallel Processing (ICA3PP), 2007.
 [129] Haiyong Xie, Y. Richard Yang, Arvind Krishnamurthy, Yanbin Liu, and Avi
 Siberschatz. P4p: Provider portal for applications. In ACM SIGCOMM, pages
 351{362, 2008.
 [130] W. Yu and K. J. Ray Liu. Attack-resistant cooperation stimulation in au-
 tonomous ad hoc networks. IEEE J. Sel. Areas Commun., 23:2260{2271, Dec.
 2005.
 190
[131] W. Yu and K. J. Ray Liu. Game theoretic analysis of cooperation stimulation
 and security in antonomous mobile ad hoc networks. IEEE Trans. Mobile
 Comput., 6:459{473, May 2007.
 [132] L. Zhang and X. Wu. Image interpolation via directional flltering and data
 fusion. IEEE Trans. Image Process., 15(8):2226{2238, Aug. 2006.
 [133] X. Zhang, J. Liu, B. Li, and T-S. P. Yum. Coolstreaming: A data-driven
 overlay network for peer-to-peer live media streaming. In IEEE Conference
 on Computer Communications (INFOCOM), pages 2102{2111, 2005.
 [134] X. Zhang, A. Vetro, Y. Q. Shi, and H. Sun. Constant quality constrained rate
 allocation for FGS video coded bitstreams. In Proc. SPIE Conf. on Visual
 Comm. and Image Processing (VCIP), 2002.
 [135] X. Zhang and X. Wu. Image interpolation by adaptive 2-d autoregressive mod-
 eling and soft-decision estimation. IEEE Trans. Image Process., 17(6):887{
 896, June 2008.
 [136] H. Zhao, W.S. Lin, and K. J. R. Liu. Behavior modeling and forensics for
 multimedia social networks: A case study in multimedia flngerprinting. IEEE
 Signal Process. Mag., 26(1):118{139, Jan. 2009.
 [137] X. J. Zhao, Y. W. He, S. Q. Yang, and Y. Z. Zhong. Rate allocation of equal
 image quality for MPEG-4 FGS video streaming. In Proc. Packet Video
 (PV), 2002.
 [138] S. Zhong, J. Chen, and Y. R. Yang. Sprite: A simple, cheat-proof, credit-based
 system for mobile ad-hoc networks. In Proc. IEEE INFOCOM, 2003.
 [139] S. Zhong, L. Li, Y. G. Liu, and Y. R. Yang. On designing incentive-compatible
 routing and forwarding protocols in wireless ad-hoc networks - an integrated
 approach using game theoretical and crytographic techniques. In Proc. ACM
 MobiCom, 2005.
 191