ABSTRACT 
 
 
 
 
Title of dissertation: ESTIMATION OF EXPECTED RETURNS, 
TIME CONSISTENCY OF A STOCK 
RETURN MODEL, AND THEIR 
APPLICATION TO PORTFOLIO SELECTION 
 
 Huaqiang Ma, Doctor of Philosophy, 2010 
Dissertation directed by: Prof. Dilip B. Madan  
AMSC / Department of Finance 
 
 
 
Longer-horizon returns are modeled by two approaches, which have different impacts 
on skewness and excess kurtosis. The Lévy approach, which treats the random 
variable at a longer horizon as the sum of i.i.d. random variables from shorter 
horizons, makes skewness and excess kurtosis decay along the time horizon at a 
faster rate than the real data imply. On the other hand, the scaling approach keeps 
skewness and excess kurtosis constant along the time horizon. A combination of 
these two approaches may perform better than either one alone. This 
empirical work employs the mixed approach to study returns at five time scales, 
from one hour to two weeks. At all time scales, the mixed model outperforms the 
other two in terms of the KS test and numerous statistical distances. 
Traditionally, the expected return is estimated from historical data through the 
classic asset pricing models and their variations. However, because realized 
returns are so volatile, decades or even longer periods of data are required to attain 
relatively accurate estimates. Furthermore, it is questionable to extrapolate the 
expected return from historical data because the return is determined by future 
uncertainty. Therefore, instead of using historical data, the expected return should 
be estimated from data representing future uncertainty, such as the option prices 
used in our method. A numeraire portfolio links option prices to the 
expected return through its striking feature: any contingent claim's price, 
if denominated in this portfolio, is the conditional expectation of its denominated 
future payoffs under the physical measure. The numeraire portfolio therefore contains 
information about the expected return. In this study, the expected returns are 
estimated from option calibration through the numeraire portfolio pricing method. 
The results are compared to the realized returns through a linear regression model, 
which shows that the difference between the two returns is insensitive to the major 
risk factors. This demonstrates that the numeraire portfolio pricing method provides 
a good estimator of the expected return. 
Modern portfolio theory is well developed. However, various aspects of its 
implementation are questionable: for example, the expected return is not properly 
estimated using historical data, and the return distribution is assumed to be 
Gaussian, which does not reflect the empirical facts. The results from the first two 
studies can be applied to this problem. The portfolio constructed using the estimated 
expected return is superior to the reference portfolios whose expected returns are 
estimated from historical data. Furthermore, this portfolio also outperforms the 
market index, SPX. 
 
  
 
 
 
 
 
 
 
 
ESTIMATION OF EXPECTED RETURNS, TIME CONSISTENCY OF A STOCK 
RETURN MODEL, AND THEIR APPLICATION 
 TO PORTFOLIO SELECTION    
 
 
 
By 
 
 
Huaqiang Ma 
 
 
 
 
 
Dissertation submitted to the Faculty of the Graduate School of the  
University of Maryland, College Park, in partial fulfillment 
of the requirements for the degree of 
Doctor of Philosophy 
2010 
 
 
 
 
 
 
 
 
 
 
Advisory Committee: 
Professor Dilip B. Madan, Chair 
Professor Benjamin Kedem 
Professor Mark Loewenstein 
Professor Tobias von Petersdorff 
Professor Victor M. Yakovenko 
 
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
© Copyright by 
Huaqiang Ma 
2010 
 
 
 
 
 
 
 
 
 
 
  
 

Acknowledgements 
First and foremost, I would like to thank my advisor, Professor Dilip B. Madan. 
Throughout the years of my PhD study, Dr. Madan was always there offering his help. 
His breadth and depth of knowledge opened my eyes to the world of mathematical 
finance; his insight lit the way when I got lost in the mist; his intolerance for 
ambiguity sharpened my mind; and his sense of humor soothed my nerves when I was 
struggling in my research. To me, Dr. Madan is also a role model. Most people 
only enjoy the time after work. Very few people enjoy their life at work, and Dr. 
Madan is one of them. His passion for knowledge inspired and motivated me during 
my PhD study, and will inspire and motivate me throughout my life. 
I want to thank Professor Mark Loewenstein for his generous advice on my thesis. I 
would like to extend my gratitude to Professor Benjamin Kedem, Professor Tobias 
von Petersdorff, and Professor Victor M. Yakovenko for agreeing to serve on the PhD 
advisory committee. I also want to thank Professor Michael Fu for organizing our 
weekly financial seminar, which broadened my knowledge of mathematical finance. 
I would like to thank my fellow classmates. Samvit Prakash, Bing Zhang, and 
Christian Silva guided me with their own experience in mathematical finance. I thank 
Yun Zhou, Linyan Cao, Guoyuan Liu, and Geping Liu for numerous hours of discussion 
of the problems in my research. 
I would like to express my gratitude to the staff members. Ms. Alverda 
McCoy was always there and willing to help. Mr. William (Bill) Schildknecht was so 
kind as to offer me teaching opportunities in and out of the math department. Mr. 
Chuck LaHaie helped me with financial data. 
My sincere thanks go to my friends, especially Wei Hu, Yiling Luo, Baozhong Mao, 
Hua Sheng, Anshuman Sinha, and Min Sun. Your sincere help and encouragement 
helped me get through the hard times in my PhD study. 
Finally, I would like to thank my parents. They always stood by me with their endless 
love and support. No words can express my gratitude. 
 
 
Contents
1 Empirical Study of A Stock Return Model 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Two Approaches to Model Stock Market Returns . . . . . . . . . . . 6
1.2.1 Lévy Processes: Definition, Properties, and Lévy Market Models 6
1.2.2 Scaling Property and Self-similarity . . . . . . . . . . . . . . . 13
1.3 Modeling Stock Returns with Lévy and Scaling . . . . . . . . . . . . 16
1.3.1 Preliminary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.2 Self-decomposable Laws . . . . . . . . . . . . . . . . . . . . . 17
1.3.3 Mixed Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4 Related Methods and Techniques . . . . . . . . . . . . . . . . . . . . 21
1.4.1 Variance Gamma Process and the Associated Law . . . . . . . 21
1.4.2 Stock Price Dynamics with the VG Process . . . . . . . . . . 26
1.4.3 Maximum Likelihood Estimation (MLE) . . . . . . . . . . . . 28
1.4.4 Fast Fourier Transform (FFT) . . . . . . . . . . . . . . . . . . 29
1.4.5 VG Random Number Simulation . . . . . . . . . . . . . . . . 31
1.5 Numerical Implementation and Results . . . . . . . . . . . . . . . . . 33
1.5.1 Data, Sketch of the Procedure, and Brief Introduction of the
Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . 33
1.5.2 Statistical Estimation at the Unit Time . . . . . . . . . . . . . 35
1.5.3 Statistical Estimation at Longer Horizons . . . . . . . . . . . 37
1.5.4 Statistical Analysis for Model Performance Comparison . . . . 44
1.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2 Estimating Expected Return By Numeraire-Portfolio Method 50
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.2 Pricing with Physical Measure . . . . . . . . . . . . . . . . . . . . . . 54
2.2.1 A Simple Example . . . . . . . . . . . . . . . . . . . . . . . . 54
2.2.2 The Numeraire Portfolio . . . . . . . . . . . . . . . . . . . . . 56
2.2.3 Pricing Under the Physical Measure . . . . . . . . . . . . . . . 64
2.3 Estimating Expected Returns Via Numeraire-portfolio Pricing Method 67
2.3.1 Idea of the Expected Return Estimation . . . . . . . . . . . . 67
2.3.2 Multivariate Random Number Simulation Via FGC . . . . . . 70
2.4 Numerical Implementation and Results . . . . . . . . . . . . . . . . . 75
2.4.1 Estimating Expected Returns . . . . . . . . . . . . . . . . . . 76
2.4.2 Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . 81
2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3 A New Approach to Portfolio Selection 92
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.2 Portfolio Evaluation - Acceptability Indices . . . . . . . . . . . . . . . 95
3.2.1 Acceptance Sets and Coherent Risk Measure . . . . . . . . . . 95
3.2.2 Acceptability Indices . . . . . . . . . . . . . . . . . . . . . . . 96
3.2.3 WVAR Acceptability Indices . . . . . . . . . . . . . . . . . . . 99
3.2.4 Bid and Ask Prices . . . . . . . . . . . . . . . . . . . . . . . . 102
3.3 Numerical Implementation and Results . . . . . . . . . . . . . . . . . 104
3.3.1 Trading Strategy . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.3.2 Procedure and Results . . . . . . . . . . . . . . . . . . . . . . 105
3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
List of Figures
1-1 VG fit to WMT at 1-hour time scale . . . . . . . . . . . . . . . . . . 36
1-2 Statistical fit to WMT at 2hr timescale . . . . . . . . . . . . . . . . . 39
1-3 Statistical fit to WMT at 3hr timescale . . . . . . . . . . . . . . . . . 40
1-4 Statistical fit to WMT at 1d timescale . . . . . . . . . . . . . . . . . 41
1-5 Statistical fit to WMT at 1w timescale . . . . . . . . . . . . . . . . . 42
1-6 Statistical fit to WMT at 2w timescale . . . . . . . . . . . . . . . . . 43
1-7 Proportion of stocks with p-value greater than certain level (2-hour
timescale) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1-8 Proportion of stocks with p-value greater than certain level (3-hour
timescale) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1-9 Proportion of stocks with p-value greater than certain level (1-day
timescale) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1-10 Proportion of stocks with p-value greater than certain level (1-week
timescale) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1-11 Proportion of stocks with p-value greater than certain level (2-week
timescale) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2-1 A single-period binomial model. . . . . . . . . . . . . . . . . . . . . 54
2-2 The fitted option data of SPX on July 11, 2007, with one-month
maturity, RMSE=2.01 . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2-3 The fitted option data of HPQ on July 11, 2007, with one-month
maturity, RMSE=0.0617 . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2-4 Estimated return vs. realized return for SPX (January
1999 to October 2009) . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3-1 Graphic illustration of the Representation Theorem . . . . . . . . . . 98
3-2 Cumulative returns of the five portfolios ( = 0.05) . . . . . . . . . . 108
3-3 Cumulative returns of the five portfolios ( = 0.10) . . . . . . . . . . 109
3-4 Cumulative returns of the five portfolios ( = 0.15) . . . . . . . . . . 110
3-5 Cumulative returns of the five portfolios ( = 0.20) . . . . . . . . . . 111
3-6 Cumulative returns of the five portfolios ( = 0.25) . . . . . . . . . . 112
3-7 Estimated-return portfolio at different risk levels (AIX=MINMAXVAR) 113
List of Tables
1.1 Statistics of the Estimated VG Parameters (at unit time = 1 hour) . 36
1.2 Statistics of the Estimated VGMixed Parameter c . . . . . . . . . . . 38
1.3 Statistics of the Estimated VGMixed Parameter  . . . . . . . . . . . 38
1.4 Mean of the statistical distances of the three models at different timescales 45
1.5 Std. of the statistical distances of the three models at different timescales 45
2.1 Percentage of positive risk premium of each stock in 130 days . . . . 81
2.2 Mean and std. of the estimated return and realized return
(annualized) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.3 t-test of each i (indi. = 0 represents i = 0, indi. = 1 represents i ≠ 0) 88
2.4 F test of  (indi. = 0 represents  = 0, indi. = 1 represents  ≠ 0) . . 89
3.1 Mean of portfolio return at different risk levels (January 1999 - October
2009) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
3.2 Std. of portfolio return at different risk levels (January 1999 - October
2009) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Chapter 1
Empirical Study of A Stock Return
Model
1.1 Introduction
Stock markets are complex dynamic systems with many interacting elements. The
interaction of these elements is observed as price fluctuations through time.
Because of the complexity of stock markets, price fluctuations exhibit
statistical properties, which can be reproduced by financial models. Sophisticated
models are capable of describing these properties dynamically; that is, they can
provide statistical fit and analysis for stock returns at various time scales, from
hourly to daily returns, and even to monthly and yearly returns. Such models are
said to be time consistent through the time horizon. Statistical accuracy of
financial models is useful in many areas. For example, in risk management, a proper
model can help reduce severe losses. With incentives from both academic and
practical purposes, many financial models have been developed.
The stochastic approach has been widely implemented to model this complex dynamic
system. In 1900, Louis Bachelier [2] first proposed that the stock
price behaves as a random walk and modeled the price at time $t$ as $S_t = S_0 + \sigma W_t$,
where $W_t$ is Brownian motion and $\sigma$ is the volatility. In this model, the price
fluctuation $\Delta S_t = S_t - S_0$ follows Brownian motion, which has independent and
stationary increments with Gaussian distributions. To ensure positive prices,
the log-price fluctuation $\ln(S_t/S_0) = \ln S_t - \ln S_0$, instead of the price
fluctuation, is modeled to follow Brownian motion [2]. The stochastic process $S_t$ is
then said to follow geometric Brownian motion, with dynamics
$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, where $\mu$ is the drift, which has the analytic
solution $S_t = S_0 \exp\{(\mu - \sigma^2/2)t + \sigma W_t\}$. This model is also
employed in option pricing by Black and Scholes [7] and Merton [58], and the work
earned Scholes and Merton the Nobel Prize in economics in 1997.
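As a small illustration of the closed-form solution above, the following Python sketch evaluates $S_t = S_0 \exp\{(\mu - \sigma^2/2)t + \sigma W_t\}$ along one simulated Brownian path. The function name `gbm_path` and all parameter values are illustrative assumptions and are not taken from the study.

```python
import math
import random

def gbm_path(s0, mu, sigma, horizon, n_steps, seed=0):
    """Sample S_t = S0 * exp((mu - sigma^2/2) t + sigma W_t) on a uniform time grid."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    w = 0.0                                   # Brownian motion starts at W_0 = 0
    path = [s0]
    for k in range(1, n_steps + 1):
        w += rng.gauss(0.0, math.sqrt(dt))    # independent Gaussian increment over dt
        t = k * dt
        path.append(s0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * w))
    return path

# Hypothetical parameters: 5% drift, 20% volatility, one year of daily steps
prices = gbm_path(s0=100.0, mu=0.05, sigma=0.2, horizon=1.0, n_steps=252)
```

Because the price is an exponential of the Brownian path, every simulated value stays strictly positive, which is the reason for modeling log prices rather than prices.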
Sophisticated financial models can reproduce some, if not all, of the important
statistical properties of stock markets. The main objective is to study the probability
density function (pdf) of the increments at various time scales and compare the
implied statistical properties with the empirical facts. The increments in stock
markets are the log-price fluctuations (or log returns) $\ln(S_{t+\tau}/S_t)$ at some given
time scale $\tau$. The Brownian motion model misses some of the important statistical
properties of stock markets. Compared to Brownian motion, the empirical
distribution of log returns has more mass at the origin and in the
tails, i.e., stock markets have more events of small returns and big returns (either
losses or gains) than the model predicts. This empirical fact is called heavy tails (or
fat tails), and it can be measured quantitatively by excess kurtosis. The excess
kurtosis of the Gaussian distribution is zero, while markets typically show positive
excess kurtosis. The other discrepancy is that Brownian motion has continuous sample
paths, whereas the path of stock prices may show discontinuities (or jumps).
Lévy processes, which were introduced by Paul Lévy in the early 20th century,
are stochastic processes with independent and stationary increments. They relax one
assumption of Brownian motion, namely Gaussian increments. This relaxation provides the
flexibility to choose among many distributions for the increments, as long as they
are infinitely divisible, and these distributions are more leptokurtic (positive excess
kurtosis) than the Gaussian one. Furthermore, discontinuities can appear
in the paths of Lévy processes. Merton [59] was one of the first to propose a Lévy
process (Brownian motion plus a compound Poisson process, or jump-diffusion process) to
model asset returns. Later, numerous Lévy models were proposed; among them we
cite the Variance Gamma model by Madan and Seneta [49] [48], the NIG model by
Barndorff-Nielsen [3], the Generalized Hyperbolic model by Eberlein and Prause [24]
and Prause [67], the Meixner model by Schoutens [73], and the CGMY model by Carr
et al. [11]. Because of the flexibility of choosing from various distributions, Lévy
models can capture some of the important empirical facts, such as jumps and fat
tails. The statistical fit is usually performed at some fixed time scale. However, little
work has been done on statistical analysis along the time horizon. One example
is the work of Eberlein and Ozkan [23], who investigated the time consistency
of Lévy models, where a Lévy distribution model is employed at different time scales,
from hourly to daily returns. For Lévy processes, skewness decays like the reciprocal
of the square root of the time horizon and excess kurtosis decays like the reciprocal
of the time horizon. However, empirical studies [17] [22] show that in actual data
these quantities decay more slowly than Lévy processes predict.
Besides Lévy models, self-similarity or scaling is applied in financial markets. For a
stochastic process, the scaling property means that the distribution of increments at
any time scale can be obtained from that at another time scale by rescaling the random
variable at that time scale. Mandelbrot [52] was the first to introduce this concept
into financial markets, where he considered cotton price returns as having the scaling
property. Other works include Mantegna and Stanley [54], Cont,
Bouchaud and Potters [18], Mandelbrot [53], Peters [63], Cont [17], and Galloway
and Nolder [31]. Under the assumption of scaling behavior, the distributions at
larger timescales can be derived from those at smaller ones, which are easier
to estimate because the data are sufficient. However, stochastic processes with the
scaling property have constant skewness and excess kurtosis at all timescales, which,
as with the decay implied by Lévy processes, does not agree with the empirical results.
These two approaches have different impacts on skewness and excess kurtosis
through the time horizon. The Lévy approach indicates a faster decay than the market
exhibits, while the scaling approach has constant skewness and excess kurtosis at all
horizons. Thus, Eberlein and Madan [22] proposed a model that mixes the two approaches,
which provides the freedom to let the term structure of skewness and excess kurtosis
follow a pattern similar to that of the markets. It should be pointed out that this
mixed approach is not associated with any process. Instead, it only uses the
distributional properties taken from these two kinds of processes. In this model, the
random variables of the increments of the stock returns at various time horizons are
decomposed into two independent parts: one comes from the increments of a Lévy process,
and the other comes from those of a scaling process.
In this chapter, we start the statistical parameter estimation from a short horizon,
e.g., one hour, because of the abundant data. A base distribution is chosen, which
shall be infinitely divisible and have the self-decomposability characteristic (SDC),
as required by Lévy processes and by the decomposition of the random variables.
The distribution is decomposed in law into two parts: one is a fraction of itself, and
the other is a remaining component. The distributions at longer horizons are driven by
these two components, represented by two corresponding parameters: one gives
the proportion of the Lévy component; the other is the scaling coefficient of the
remaining component. These two parameters are estimated at the longer horizons,
namely two hours, three hours, one day, one week, and two weeks in this chapter.
For comparison, the statistical estimations are also conducted with the associated Lévy
model and scaling model at the same timescales, and statistical analyses, including
the Kolmogorov-Smirnov (KS) test, the Kolmogorov distance, the $\chi^2$-distance, and the
$L^1$ and $L^2$ distances, are performed. All these statistical analyses indicate that
the scaling model outperforms the Lévy model at longer horizons, while the mixed model
dominates both models at all horizons. Furthermore, the averages of the two parameters
over 500 individual stocks are both around 0.4 through the time horizon. Thus, it is
reasonable to assume the value 0.4 for these two parameters at even longer time scales,
such as half a year, one year, or longer, where statistical parameter estimation is
not feasible due to the lack of data.
The remainder of this chapter is organized as follows. Section 1.2 introduces
and discusses the two approaches to modeling financial markets, the Lévy process
approach and the scaling approach. Section 1.3 reviews the empirical behavior of
skewness and excess kurtosis in stock markets, explains why each of the two approaches
alone is questionable, and develops the mixed approach to model the distribution of
returns through the time horizon. Section 1.4 describes the methods used for
statistical estimation. Section 1.5 presents the numerical implementation and results.
Section 1.6 concludes the study and suggests further work.
1.2 Two Approaches to Model Stock Market Returns
1.2.1 Lévy Processes: Definition, Properties, and Lévy Market Models
• Lévy Processes and Infinitely Divisible Distributions
Relaxing one assumption of Brownian motion, the Gaussian increments, yields
Lévy processes, which provide more flexibility to build continuous-time stochastic
models.
Definition 1 Lévy Process
A stochastic process $X = \{X_t : t \ge 0\}$ on $(\Omega, \mathcal{F}, P)$ with values in
$\mathbb{R}^d$ is said to be a Lévy process if
(1) $X$ has independent increments: that is, for any $0 \le t_1 < t_2 < \dots < t_n < \infty$,
$X_{t_2} - X_{t_1}, \dots, X_{t_n} - X_{t_{n-1}}$ are independent.
(2) $X$ has stationary increments: that is, the law of $X_t - X_s$ is the same as that of
$X_{t-s}$, where $0 \le s < t < \infty$.
(3) $X_t$ is stochastically continuous: that is, for all $\varepsilon > 0$ and $t > 0$,
$$\lim_{h \to 0} P(|X_{t+h} - X_t| \ge \varepsilon) = 0.$$
(4) $X_0 = 0$ almost surely.
(5) $X_t$ has the càdlàg property: that is, its paths are right-continuous with left limits.
The third condition, stochastic continuity, does not imply continuous sample paths.
It only means that at any given (deterministic) time $t$, the probability that a
jump occurs is zero. However, sample paths may still be discontinuous at random
times. Thus, a Lévy process is capable of capturing the random jumps occurring in
financial markets.
The first two conditions are the main features of Lévy processes, and the third one
follows from the first two (Keller [37]). Jacod and Shiryaev [36] name processes
satisfying conditions (1) and (2) "processes with stationary independent increments (PIIS)."
Given a random variable $Y$ with probability distribution $F$, if we set
$X_t \overset{law}{=} Y$ for a given $t > 0$, then we can construct a Lévy process through
the time horizon as follows:
1. For the sample path at $t, 2t, 3t, \dots, nt, \dots$ with $n > 1$,
$X_{nt} = X_t + (X_{2t} - X_t) + \dots + (X_{nt} - X_{(n-1)t})$, whose distribution is the
same as that of the sum of $n$ i.i.d. random variables
$X_t, X_{2t} - X_t, \dots, X_{nt} - X_{(n-1)t}$ because of the "stationary
independent increments" property;
2. Within the interval $[0, t]$, we choose any integer $m > 1$ and $\tau$ such that
$m\tau = t$. Then $X_t = X_\tau + (X_{2\tau} - X_\tau) + \dots + (X_{m\tau} - X_{(m-1)\tau})$,
that is, $X_t$ is divided into $m$ i.i.d. parts and has the same law as the sum of $m$
i.i.d. random variables, whose distribution can be derived from that of $X_t$.
In this procedure, the distribution of $X_t$ can be infinitely "divided" since $m > 1$
can be arbitrarily large. This property is called "infinite divisibility."
Definition 2 Infinite Divisibility
Let $X$ be a random variable with distribution $F$ on $\mathbb{R}^d$. The probability
distribution $F$ is infinitely divisible if for any integer $n > 1$ there exist $n$ i.i.d.
random variables $X_1, X_2, \dots, X_n$ such that
$$X \overset{law}{=} X_1 + X_2 + \dots + X_n. \tag{1.1}$$
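Condition (1.1) can be checked through characteristic functions: a law is the sum of $n$ i.i.d. parts exactly when its characteristic function is the $n$-th power of the characteristic function of one part. The sketch below does this for two standard infinitely divisible laws, the Gaussian (parts $N(\mu/n, \sigma^2/n)$) and the Poisson (parts with rate $\lambda/n$). The function names and parameter values are illustrative assumptions.

```python
import cmath

def gaussian_cf(u, mean, var):
    """Characteristic function of N(mean, var)."""
    return cmath.exp(1j * mean * u - 0.5 * var * u ** 2)

def poisson_cf(u, rate):
    """Characteristic function of a Poisson(rate) random variable."""
    return cmath.exp(rate * (cmath.exp(1j * u) - 1.0))

def max_divisibility_error(n, grid, mean=0.1, var=0.04, rate=2.0):
    """Largest gap between cf(X) and cf(X_1)^n when X is split into n i.i.d. parts."""
    err = 0.0
    for u in grid:
        err = max(err, abs(gaussian_cf(u, mean, var)
                           - gaussian_cf(u, mean / n, var / n) ** n))
        err = max(err, abs(poisson_cf(u, rate) - poisson_cf(u, rate / n) ** n))
    return err

grid = [k / 10.0 for k in range(-50, 51)]
errors = {n: max_divisibility_error(n, grid) for n in (2, 7, 100)}
```

The gaps are at the level of floating-point rounding for every $n$ tried, which is what infinite divisibility requires.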
The following proposition shows the one-to-one relationship between
Lévy processes and infinitely divisible distributions:
Proposition 3 Lévy Processes and Infinitely Divisible Distributions
For any infinitely divisible distribution $F$, there exists a Lévy process $\{X_t : t \ge 0\}$
such that the law of $X_1$ is $F$. Conversely, given a Lévy process $\{X_t : t \ge 0\}$, the
distribution of $X_t$ is infinitely divisible for every $t > 0$.
Compared to Brownian motion, Lévy processes allow considerable flexibility in choosing
the distribution of the increments, with the only constraint being that the distribution
is infinitely divisible.
Usually the pdf of $X_t$ in a Lévy process is not easy to obtain [36]. Instead, we study
the characteristic function of $X_t$. Let $\phi_t(u)$ or $\phi_{X_t}(u)$ denote the
characteristic function of $X_t$, $E[e^{iuX_t}]$. Define
$\Psi_t(u) = \Psi_{X_t}(u) = \ln \phi_t(u)$ as the characteristic exponent. Then
the characteristic function of a Lévy process is given by the following proposition:
Proposition 4 Characteristic Function of Lévy Processes
Let $\{X_t : t \ge 0\}$ be a Lévy process on $\mathbb{R}^d$ and let $\Psi$ be its
characteristic exponent at $t = 1$. Then $\Psi$ is a continuous function
$\Psi : \mathbb{R}^d \to \mathbb{C}$ such that
$$E[e^{iuX_t}] = e^{t\Psi(u)}, \quad u \in \mathbb{R}^d. \tag{1.2}$$
By this proposition, we can build a Lévy process from any infinitely divisible
distribution through its characteristic function. Therefore, the law of $X_t$ is determined
by the law of $X_1$; both are infinitely divisible.
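A quick numerical check of (1.2) can be made with the Poisson process, whose characteristic exponent at $t = 1$ is the standard $\Psi(u) = \lambda(e^{iu} - 1)$. The sketch below compares $e^{t\Psi(u)}$ with $E[e^{iuN_t}]$ summed directly over the Poisson probabilities. The function names, the truncation at 200 terms, and the parameter values are assumptions made for illustration.

```python
import cmath
import math

def poisson_exponent(u, rate):
    """Characteristic exponent Psi(u) of a Poisson process at t = 1."""
    return rate * (cmath.exp(1j * u) - 1.0)

def cf_from_exponent(u, t, rate):
    """Right-hand side of (1.2): exp(t * Psi(u))."""
    return cmath.exp(t * poisson_exponent(u, rate))

def cf_from_pmf(u, t, rate, n_terms=200):
    """E[exp(i u N_t)] summed directly over the Poisson(rate * t) probabilities."""
    total = 0.0 + 0.0j
    pmf = math.exp(-rate * t)            # P(N_t = 0)
    for k in range(n_terms):
        total += cmath.exp(1j * u * k) * pmf
        pmf *= rate * t / (k + 1)        # P(N_t = k + 1) from P(N_t = k)
    return total

gap = abs(cf_from_exponent(0.7, 2.5, 3.0) - cf_from_pmf(0.7, 2.5, 3.0))
```

The time horizon enters only through the factor $t$ in the exponent, so the law at any horizon follows from the law at the unit time.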
• Properties of Lévy Processes
Brownian motion is a well-known Lévy process with Gaussian increments. Another
simple and common Lévy process is the compound Poisson process $\{X_t : t \ge 0\}$,
which is defined as $X_t = \sum_{i=1}^{N_t} Y_i$, where $\{N_t : t \ge 0\}$ is a Poisson
process and the $Y_i$ are i.i.d. random variables independent of $N_t$. Other Lévy
processes can be built from these simple building blocks.
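The compound Poisson process described above can be simulated directly from its definition. In the sketch below, jump times come from exponential inter-arrival times and each jump size is drawn independently; the Gaussian jump law and all numerical values are hypothetical choices for illustration only.

```python
import random

def compound_poisson(t, rate, jump_sampler, rng):
    """One draw of X_t = sum_{i=1}^{N_t} Y_i.

    Jump times are built from exponential inter-arrival times with the given
    rate, and jump_sampler(rng) returns one jump size Y_i.
    """
    x = 0.0
    clock = rng.expovariate(rate)        # time of the first jump
    while clock <= t:
        x += jump_sampler(rng)
        clock += rng.expovariate(rate)   # waiting time until the next jump
    return x

rng = random.Random(1)
# Hypothetical jump law: Gaussian jumps with mean -0.01 and standard deviation 0.05
draws = [compound_poisson(1.0, 5.0, lambda r: r.gauss(-0.01, 0.05), rng)
         for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)    # theory: rate * t * E[Y] = -0.05
```

With no jumps before time $t$ the sum is empty and $X_t = 0$, consistent with $X_0 = 0$.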
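Proposition 5 Lévy-Itô Decomposition
Let $\{X_t : t \ge 0\}$ be a Lévy process on $\mathbb{R}^d$. It can be decomposed into three parts:
$$X_t = X^B_t + X^C_t + \lim_{\varepsilon \downarrow 0} \tilde{X}^{\varepsilon}_t, \tag{1.3}$$
where
$$X^B_t = rt + A W^{(d)}_t,$$
$$X^C_t = \int_{|x| \ge 1,\ s \in [0,t]} x\, N(ds, dx),$$
$$\tilde{X}^{\varepsilon}_t = \int_{\varepsilon \le |x| < 1,\ s \in [0,t]} x\, \tilde{N}(ds, dx).$$
$X^B_t$ is a $d$-dimensional continuous Gaussian process with drift $r$ and covariance
matrix $A$, and $W^{(d)}_t$ is a $d$-dimensional Brownian motion;
$X^C_t$ is a compound Poisson process with jump size $|x| \ge 1$, and $N(ds, dx)$ is a Poisson
random measure on $\mathbb{R}^+ \times (\mathbb{R}^d \setminus \{0\})$;
$\tilde{X}^{\varepsilon}_t$ is a compensated compound Poisson process with
$\tilde{N}(ds, dx) = N(ds, dx) - \nu(dx)\,ds$, where $\nu$ is the jump intensity (also called
the Lévy measure) on $\mathbb{R}^d \setminus \{0\}$, given by $\nu(dx) = E[N(1, dx)]$. $\nu$ also
satisfies $\int_{\mathbb{R}^d \setminus \{0\}} (1 \wedge |x|^2)\, \nu(dx) < \infty$.
$(r, A, \nu)$ is called the Lévy characteristic triplet or Lévy triplet.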
The Lévy-Itô decomposition states that the sample path of any
Lévy process consists of three parts: a diffusion process with drift, $X^B_t$; a "large
jump" process with jump sizes greater than one, $X^C_t$; and a "small jump" process with
jump sizes less than one, $\tilde{X}^{\varepsilon}_t$. As there can be infinitely many small
jumps around zero and their sum may not converge, we have to compensate the compound
Poisson process of small jumps to make it a martingale so that it does not explode. This
is how we obtain the third part $\tilde{X}^{\varepsilon}_t$ in the decomposition
(Proposition 2.16 in [19]).
With the Lévy-Itô decomposition formula, it is easy to derive the characteristic
function of Lévy processes, which is given in the next theorem:
Theorem 6 Lévy-Khinchin Representation
Let $\{X_t : t \ge 0\}$ be a Lévy process. Its characteristic function satisfies
$$E[e^{iuX_t}] = e^{t\Psi(u)}, \quad u \in \mathbb{R}^d, \tag{1.4}$$
where
$$\Psi(u) = iru - \tfrac{1}{2} uAu + \int_{\mathbb{R}^d \setminus \{0\}} \left(e^{iux} - 1 - iux\,1_{|x| \le 1}\right) \nu(dx). \tag{1.5}$$
The Lévy-Khinchin representation explicitly links Lévy processes to infinitely
divisible distributions. Given an infinitely divisible distribution $F$ with characteristic
exponent (1.5), a Lévy process $X_t$ can be generated whose law at $t = 1$ is
$F$. Thus, we can study any Lévy process through its corresponding infinitely divisible
distribution.
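To make (1.5) concrete, the following sketch evaluates the characteristic exponent in one dimension for a Lévy measure made of finitely many point masses, so that the integral reduces to a sum. It then recovers the mean of $X_1$ as $-i\Psi'(0)$ by a central finite difference. The triplet values, the function name, and the step size `h` are hypothetical and chosen only for illustration.

```python
import cmath

def levy_khinchin_exponent(u, drift, variance, jumps):
    """Psi(u) from (1.5) in one dimension for a finite Lévy measure.

    jumps is a list of (size, intensity) pairs, i.e. nu is a sum of point masses.
    """
    psi = 1j * drift * u - 0.5 * variance * u ** 2
    for size, intensity in jumps:
        # The compensating term i*u*x is applied only to small jumps, |x| <= 1
        truncation = 1j * u * size if abs(size) <= 1.0 else 0.0
        psi += intensity * (cmath.exp(1j * u * size) - 1.0 - truncation)
    return psi

# Hypothetical triplet: drift 0.05, variance 0.04, two small jump sizes and one large
jumps = [(-0.2, 3.0), (0.1, 4.0), (1.5, 0.2)]
h = 1e-5
mean_numeric = ((levy_khinchin_exponent(h, 0.05, 0.04, jumps)
                 - levy_khinchin_exponent(-h, 0.05, 0.04, jumps)) / (2j * h)).real
mean_from_triplet = 0.05 + 1.5 * 0.2   # drift plus the contribution of the large jump
```

Only the jump of size 1.5 adds to the mean beyond the drift, because the small jumps are compensated inside the integrand.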
• Distributional Property of Lévy Processes: Tail Behavior
The Lévy-Khinchin representation enables us to study the tail behavior of the
distribution of a Lévy process through its associated infinitely divisible distribution $F$,
which is characterized by a Lévy triplet $(r, A, \nu)$. We cite the following proposition
from Cont [19], Proposition 3.13.
Proposition 7 Moments and Cumulants of a Lévy Process
Let $\{X_t : t \ge 0\}$ be a Lévy process on $\mathbb{R}$ with characteristic triplet
$(r, A, \nu)$. The $n$-th absolute moment of $X_t$, $E[|X_t|^n]$, is finite for some $t$ or,
equivalently, for every $t > 0$ if and only if $\int_{|x| \ge 1} |x|^n\, \nu(dx) < \infty$. In
this case the moments of $X_t$ can be computed from its characteristic function by
differentiation. In particular, the cumulants of $X_t$ are:
$$E[X_t] = t\left(r + \int_{|x| \ge 1} x\, \nu(dx)\right), \tag{1.6}$$
$$c_2(X_t) = \mathrm{Var}\,X_t = t\left(A + \int_{\mathbb{R}} x^2\, \nu(dx)\right), \tag{1.7}$$
$$c_n(X_t) = t \int_{\mathbb{R}} x^n\, \nu(dx) \quad \text{for } n \ge 3. \tag{1.8}$$
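The skewness and excess kurtosis of an increment over a horizon of length $t$, which by
stationarity has the same law as $X_t$, can be derived from this proposition:
$$\mathrm{skewness}(X_t) = \frac{c_3(X_t)}{c_2(X_t)^{3/2}} = \frac{\mathrm{skewness}(X_1)}{\sqrt{t}}, \tag{1.9}$$
$$\mathrm{excess\ kurtosis}(X_t) = \frac{c_4(X_t)}{c_2(X_t)^{2}} = \frac{\mathrm{excess\ kurtosis}(X_1)}{t}. \tag{1.10}$$
We conclude that skewness decays at the rate $t^{-1/2}$ and excess kurtosis
decays at the rate $t^{-1}$. This proposition also implies that the distributions of the
increments of Lévy processes with jumps are leptokurtic (excess kurtosis is positive),
since $c_4(X_t) > 0$ whenever the Lévy measure $\nu$ is not zero.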
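The decay rates in (1.9) and (1.10) follow from the fact that every cumulant in (1.7) and (1.8) is linear in $t$. The sketch below computes the cumulants for a jump-diffusion whose Lévy measure consists of two point masses and compares the resulting term structure with $\mathrm{skewness}(X_1)/\sqrt{t}$ and $\mathrm{excess\ kurtosis}(X_1)/t$. The jump sizes, intensities, and diffusion variance are hypothetical values.

```python
def jump_diffusion_cumulants(t, variance, jumps):
    """c2, c3, c4 of X_t from (1.7)-(1.8) for a Lévy measure made of point masses."""
    c2 = t * (variance + sum(w * x ** 2 for x, w in jumps))
    c3 = t * sum(w * x ** 3 for x, w in jumps)
    c4 = t * sum(w * x ** 4 for x, w in jumps)
    return c2, c3, c4

def skewness_and_excess_kurtosis(t, variance, jumps):
    """Standardized third and fourth cumulants, as in (1.9)-(1.10)."""
    c2, c3, c4 = jump_diffusion_cumulants(t, variance, jumps)
    return c3 / c2 ** 1.5, c4 / c2 ** 2

jumps = [(-0.03, 20.0), (0.02, 25.0)]   # hypothetical (size, intensity) pairs
s1, k1 = skewness_and_excess_kurtosis(1.0, 0.0001, jumps)
for t in (1.0, 2.0, 4.0, 8.0):
    s, k = skewness_and_excess_kurtosis(t, 0.0001, jumps)
    # Each row: horizon, skewness, skewness(X_1)/sqrt(t), excess kurtosis, excess kurtosis(X_1)/t
    print(t, s, s1 / t ** 0.5, k, k1 / t)
```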
1.2.2 Scaling Property and Self-similarity
Traditionally, scaling phenomena have been observed and studied in the physical
sciences. In the 1990s, the availability of high-frequency data and computer technology
made it possible to investigate scaling behavior in economic systems. Empirical studies
show that asset prices exhibit similar statistical properties at different time scales,
which has raised interest in applying the scaling property in economics.
In mathematics, scaling behavior is associated with stochastic processes
exhibiting the self-similarity property, which is defined below:
Definition 8 Self-Similarity and Self-Similar Processes
A stochastic process $\{X_t : t \ge 0\}$ is said to be self-similar if
$$X_{\lambda t} \overset{law}{=} \lambda^{\beta} X_t, \tag{1.11}$$
where $\lambda > 0$ is a scaling factor and $\beta$ is called the self-similarity exponent.
Such a process is also called a self-similar process or a $\beta$-self-similar process.
A well-known self-similar process is Brownian motion, with self-similarity exponent
$\beta = 1/2$. Some other Lévy processes also have the self-similarity property;
they are called $\alpha$-stable Lévy processes. However, self-similarity does
not appear solely in Lévy processes. It also exists in processes with dependent
increments. For example, fractional Brownian motions, which have correlated increments,
are also self-similar.
If we take the unit time for Xt in (1.11), we have
13
Xt law= tX1; 8 t > 0; (1.12)
which indicates the law for Xt at time t can be obtained from the law at the unit
time, scaled by a coe? cient t.
Nowletusstudyhowtheproperties, includingcharacteristicfunction, distribution
function, tail behavior, and moments, of self-similar processes behave through time
horizon.
 Characteristic function
Xt law= tX1 (1.13)
, E[eiuXt] = E[eiutX1]
, Xt(u) = [X1(u)]t or 	Xt(u) = t	X1(u)
 Distribution function
cdf:
Xt law= tX1
, P(Xt  x) = P(tX1  x)
, FXt(x) = FX1( xt) (1.14)
pdf: di?erentiate (2.13), we get fXt(x) = 1tfX1( xt)
14
At the center x = 0,
fXt(0) = 1tfX1(0) (1.15)
 Center and Tail Behavior
Several authors have used (1.15) to test self-similarity of stock returns and to
estimate $\gamma$ around the center of the density function. Mantegna and Stanley
[54] studied the S&P 500 Index and concluded that $\gamma \approx 0.71$ and that a
self-similar Lévy process (an $\alpha$-stable process) describes the dynamics of the
pdf well at zero. However, this model fails in the tails. Later, power-law
distributions, along with self-similar processes, were proposed by numerous authors
[18] [32] [30] to model the tail behavior of stock returns. If a self-similar process
has a power-law tail at the unit time,
$$P(|X_1| > x) \sim \frac{1}{x^{\alpha}},$$
then at other time scales
$$P(|X_t| > x) \sim \frac{t^{\alpha\gamma}}{x^{\alpha}}, \quad \text{for } t > 0,$$
which means the tail still follows a power law, with scaling coefficient
$t^{\alpha\gamma}$.
• Moments, variance, skewness, and kurtosis
Using Eq. (1.12), the moments at $t > 0$ follow directly from those at $t = 1$:
Moment: $E[X_t] = t^{\gamma}E[X_1]$, $\quad E[X_t^n] = t^{n\gamma}E[X_1^n]$
Variance: $\mathrm{Var}[X_t] = t^{2\gamma}\,\mathrm{Var}[X_1]$
Skewness: $\mathrm{skew}(X_t) = \dfrac{E[(X_t - EX_t)^3]}{[\mathrm{Var}(X_t)]^{3/2}} = \mathrm{skew}(X_1)$
Kurtosis: $\mathrm{kurt}(X_t) = \dfrac{E[(X_t - EX_t)^4]}{[\mathrm{Var}(X_t)]^{2}} = \mathrm{kurt}(X_1)$
We can see that the skewness and excess kurtosis of self-similar processes do not
change along the time horizon.
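The invariance of skewness and excess kurtosis under scaling can be checked numerically. The following is a minimal sketch, not part of the original study: the unit-time sample (a centred gamma draw), the exponent `gamma_exp = 0.4`, and the horizon `t = 10` are illustrative assumptions.

```python
import numpy as np

def sample_skew_kurt(x):
    """Return (skewness, excess kurtosis) of a 1-D sample."""
    d = x - x.mean()
    var = np.mean(d**2)
    return np.mean(d**3) / var**1.5, np.mean(d**4) / var**2 - 3.0

rng = np.random.default_rng(0)
# Hypothetical skewed, fat-tailed unit-time sample (assumption for illustration).
x1 = rng.gamma(shape=2.0, scale=1.0, size=200_000) - 2.0

gamma_exp, t = 0.4, 10.0          # illustrative exponent and horizon
xt = t**gamma_exp * x1            # scaled law as in Eq. (1.12)

skew1, kurt1 = sample_skew_kurt(x1)
skewt, kurtt = sample_skew_kurt(xt)
var_ratio = xt.var() / x1.var()   # should match t**(2*gamma_exp)
```

Under these assumptions the two skewness values and the two excess-kurtosis values coincide up to rounding, while the variance grows by the factor $t^{2\gamma}$.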
1.3 Modeling Stock Returns with Lévy and Scaling
1.3.1 Preliminary
As discussed in the previous section, the term structures of skewness and excess
kurtosis (the relationship between skewness/excess kurtosis and the time horizon)
exhibit different patterns under different models. If we assume that the price is
moved by independent news and is the accumulation of these independent, identically
distributed shocks, then the stock price is driven by a Lévy process. The skewness and
excess kurtosis of the price fluctuations then decay as $1/\sqrt{t}$ and $1/t$,
respectively. If the stock markets, as complex dynamic systems, exhibit scaling
behavior as numerous authors have indicated, then the skewness and excess kurtosis of
the price fluctuations remain constant at all time scales. These two postulates have
been investigated by numerous authors. Empirical studies show that the term structures
of skewness and excess kurtosis lie between these two approaches, that is, skewness
and excess kurtosis decay more slowly than under Lévy but faster than under scaling.
Thus, it is natural to propose a model that combines these two ideas: at a chosen time
scale, called unit time, the random variable of the log-price increment (or price
fluctuation, or log return) is split into two components; one runs as the accumulation
of i.i.d. random variables, which is Lévy, and the other behaves as a scaling random
variable along the time horizon. Again, we point out that this construction concerns
only the distributions of the stock returns at various time horizons, and does not
necessarily have to be associated with any stochastic process.
1.3.2 Self-decomposable Laws
The first step in modeling is to split the random variable at the unit time, which is
related to a family of limit laws and its associated property, self-decomposability.
It is known that stock prices are moved by many pieces of information or noise.
If these pieces of information are considered as a sequence of independent random
variables (not necessarily identically distributed) $\{Z_i : i = 1, 2, \ldots\}$, then
the price fluctuation $P_t$ is the consequence of the impacts of all $Z_i$. Let
$S_n = \sum_{i=1}^{n} Z_i$ and consider the normalized sum $a_nS_n + b_n$. Lévy [42]
and Khinchin [40] studied the asymptotic behavior of $a_nS_n + b_n$ and defined a
family of laws called "class L": if there exist sequences of constants $a_n$, the
scaling coefficients, and $b_n$, the centering constants, such that the distribution
of $a_nS_n + b_n$ converges to the law of a random variable $X$, then the law of $X$
belongs to class L. The class L laws are thus limit laws. The Central Limit Theorem,
which says that the distribution of the normalized sum of a large number of i.i.d.
random variables converges to a Gaussian distribution, is a special case of class L.
As $P_t$, the price change over a horizon of length $t$, is the outcome of many
independent random variables arriving within $t$, it can be approximated by a random
variable $X$ whose law is of class L.
Sato [72] studied another class of random variables, those with the
self-decomposability property, which is defined below.
Definition 9 Self-decomposable Laws
A random variable $X$ is self-decomposable if, for all $c \in (0,1)$,
$$X \overset{\text{law}}{=} cX + X^{(c)}, \tag{1.16}$$
where $X^{(c)}$ is a random variable independent of $X$.
This means that a self-decomposable random variable $X$ can be decomposed into a
fraction of itself and another independent random variable. Furthermore, Sato [72]
shows that a random variable $X$ is self-decomposable if and only if its distribution
is of class L. Thus, we can study the properties of the price fluctuation $P_t$
through the self-decomposable laws, which are easier to handle than class L directly.
Self-decomposable distributions belong to the family of infinitely divisible laws
[41]. Their characteristic functions are given by the following proposition [72]:
Proposition 10 Characteristic Function of Self-decomposable Laws
The characteristic function of a self-decomposable random variable $X$ is
$$E[e^{iuX}] = \exp\left( iru - \frac{1}{2}\sigma^2u^2 + \int_{\mathbb{R}} \left( e^{iux} - 1 - iux\,1_{|x|\le 1} \right) \frac{g(x)}{|x|}\,dx \right), \tag{1.17}$$
where $r$ and $\sigma$ are constants, $\sigma^2 \ge 0$,
$\int_{\mathbb{R}} (|x|^2 \wedge 1)\,\frac{g(x)}{|x|}\,dx < \infty$, and $g(x)$ is an
increasing function for $x < 0$ and a decreasing function for $x > 0$.
The Lévy measure of a self-decomposable law has the form $\frac{g(x)}{|x|}\,dx$, with
the constraints on $g(x)$ indicated above. Such a function $g(x)$ is called the
self-decomposability characteristic (SDC) of the random variable $X$ [12].
1.3.3 Mixed Model
Let the log-price change (or the log return) $X = \ln S_t - \ln S_0$ be a
self-decomposable random variable over some chosen unit time ($t = 1$), e.g., one hour
or one day. By Eq. (1.16), we have $X \overset{\text{law}}{=} cX + X^{(c)}$. The log
return $Y_t$ at other time scales $t$ is built from the two components at the unit
time: $Y_t$ is decomposed into $cX(t)$, which runs as a Lévy process started from
$cX$, and $t^{\gamma}X^{(c)}$, which is scaled from $X^{(c)}$:
$$Y_t = cX(t) + t^{\gamma}X^{(c)}. \tag{1.18}$$
The characteristic function of $Y_t$ can be derived as
$$E[e^{iuY_t}] = E[e^{iu(cX(t) + t^{\gamma}X^{(c)})}] = E[e^{iucX(t)}]\,E[e^{iut^{\gamma}X^{(c)}}].$$
From
$$E[e^{iuX}] = E[e^{iu(cX + X^{(c)})}] = E[e^{iucX}]\,E[e^{iuX^{(c)}}],$$
we get
$$E[e^{iuX^{(c)}}] = \frac{E[e^{iuX}]}{E[e^{iucX}]} = \frac{\exp(\Psi(u))}{\exp(\Psi(cu))} = \exp(\Psi(u) - \Psi(cu)),$$
where $\Psi(\cdot)$ is the characteristic exponent of $X$, so
$$E[e^{iuY_t}] = \exp\{ t\Psi(cu) + \Psi(ut^{\gamma}) - \Psi(cut^{\gamma}) \}. \tag{1.19}$$
The following proposition [22] provides the term structure of variance, skewness,
and excess kurtosis in this model:
Proposition 11 Variance, Skewness, and Kurtosis of the Mixed Model
Let Var(X), Skew(X), Kurt(X) be the variance, skewness, and excess kurtosis at
the unit time $t = 1$ of a self-decomposable random variable $X$ defined by (1.16).
Then the variance, skewness, and excess kurtosis of $Y_t$ defined by Eq. (1.18) are:
$$\mathrm{Var}(Y_t) = \mathrm{Var}(X)\left( c^2t + (1 - c^2)t^{2\gamma} \right) \tag{1.20}$$
$$\mathrm{Skew}(Y_t) = \frac{\mathrm{Skew}(X)}{\sqrt{t}} \left[ \frac{c^3 + (1 - c^3)t^{3\gamma - 1}}{\left( c^2 + (1 - c^2)t^{2\gamma - 1} \right)^{3/2}} \right] \tag{1.21}$$
$$\mathrm{Kurt}(Y_t) = \frac{\mathrm{Kurt}(X)}{t} \left[ \frac{c^4 + (1 - c^4)t^{4\gamma - 1}}{\left( c^2 + (1 - c^2)t^{2\gamma - 1} \right)^{2}} \right] \tag{1.22}$$
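Remarks:
(1) The bracketed factors in (1.21) and (1.22),
$$\frac{c^3 + (1 - c^3)t^{3\gamma - 1}}{\left( c^2 + (1 - c^2)t^{2\gamma - 1} \right)^{3/2}} \quad\text{and}\quad \frac{c^4 + (1 - c^4)t^{4\gamma - 1}}{\left( c^2 + (1 - c^2)t^{2\gamma - 1} \right)^{2}},$$
equal 1 when $c = 1$ and reduce to $\sqrt{t}$ and $t$, respectively, when $c = 0$.
For $0 < c < 1$ the model lies between these two cases; thus, skewness decays at a
rate between $1/\sqrt{t}$ and 0, and excess kurtosis decays at a rate between $1/t$
and 0.
(2) It can be seen from this proposition that $Y_t$ follows a Lévy process when
$c = 1$ and a scaling process when $c = 0$.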
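The term structure in Proposition 11 is easy to tabulate. The sketch below is an illustration only: the function name `mixed_term_structure` and the argument name `gamma_exp` (the scaling exponent $\gamma$) are hypothetical, and the formulas are Eqs. (1.20)-(1.22) as written above.

```python
import numpy as np

def mixed_term_structure(t, c, gamma_exp, var1, skew1, kurt1):
    """Variance, skewness, and excess kurtosis of Y_t from Eqs. (1.20)-(1.22).

    var1, skew1, kurt1 are the unit-time (t = 1) values for X.
    """
    t = np.asarray(t, dtype=float)
    base = c**2 + (1.0 - c**2) * t**(2.0 * gamma_exp - 1.0)
    var_t = var1 * t * base  # equals Var(X) * (c^2 t + (1 - c^2) t^(2 gamma))
    skew_t = (skew1 / np.sqrt(t)
              * (c**3 + (1.0 - c**3) * t**(3.0 * gamma_exp - 1.0)) / base**1.5)
    kurt_t = (kurt1 / t
              * (c**4 + (1.0 - c**4) * t**(4.0 * gamma_exp - 1.0)) / base**2)
    return var_t, skew_t, kurt_t

# Illustrative horizons in units of the unit time (assumed values).
horizons = np.array([1.0, 2.0, 8.0, 80.0])
levy_case = mixed_term_structure(horizons, 1.0, 0.4, 2.0, -0.5, 3.0)
scaling_case = mixed_term_structure(horizons, 0.0, 0.4, 2.0, -0.5, 3.0)
```

With $c = 1$ the sketch reproduces the Lévy decay ($1/\sqrt{t}$ and $1/t$), and with $c = 0$ it returns constant skewness and excess kurtosis, in line with Remark (2).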
1.4 Related Methods and Techniques
The experimental procedure requires several methods and techniques, including the
law at the unit time, maximum likelihood estimation, the fast Fourier transform (FFT),
and simulation. They are used for statistical parameter estimation and analysis both
in this chapter and in the other two chapters. This section provides a brief review
of these methods.
1.4.1 Variance Gamma Process and the Associated Law
Variance Gamma Process as a Time-changed Brownian Motion
Brownian motion captures the essence of stock markets but also misses some
important empirical facts. The dynamics of the markets are not homogeneous through
time: sometimes the markets are very active, while at other times they are
relatively slow. Time, instead of being considered a steadily increasing process, can
be viewed as a randomly changing time that is "economically relevant." Thus, we
have a generalized version of Brownian motion with random time, which provides
more flexibility to describe log stock prices. If this random time follows a gamma
process, the time-changed Brownian motion is called the Variance Gamma process
[49][48].¹
First, we define the gamma process.
Definition 12 Gamma Process
A gamma process $\gamma(t;\mu,\nu)$ is a Lévy process with independent gamma
increments, where $\mu$ is the mean rate and $\nu$ is the variance rate. The increment
$g_h = \gamma(t+h;\mu,\nu) - \gamma(t;\mu,\nu)$ is a gamma random variable with
probability density function (pdf)
$$f_h(g) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, g^{\alpha - 1} e^{-g/\beta}, \tag{1.23}$$
where $\alpha = \mu^2 h/\nu$, $\beta = \nu/\mu$, and $g \ge 0$.
The gamma process is a pure jump Lévy process, i.e., it has no diffusion part
$\sigma W(t)$. The mean of $g_h$ is $\mu h$ and the variance is $\nu h$.
The characteristic function of the gamma process $\gamma(t)$ is
$$\phi_{\gamma(t)}(u) = \left( \frac{1}{1 - iu\,\nu/\mu} \right)^{\mu^2 t/\nu}. \tag{1.24}$$
When the calendar time $t$ in Brownian motion is replaced with a random time
$\gamma(t)$, the expectation of $\gamma(t)$ should equal $t$, i.e.,
$E[\gamma(t)] = t$. Thus, the gamma process must increase at a unit mean rate, which
means $\mu = 1$, and $\gamma(t;1,\nu)$ is used to model the time in this time-changed
Brownian motion.
¹ Most of the results in this section, unless otherwise indicated, come from Ref. [48].
Given a gamma process with a unit mean rate, we can build a Variance Gamma
process.
Definition 13 Variance Gamma Process
Let $b(t;\theta,\sigma)$ be a Brownian motion with drift $\theta$ and standard
deviation $\sigma$, $b(t;\theta,\sigma) = \theta t + \sigma W(t)$. Let
$\gamma(t;1,\nu)$ be an independent gamma process with unit mean rate.
Then the Variance Gamma process (VG) is defined as
$$X(t;\sigma,\nu,\theta) = b(\gamma(t;1,\nu);\theta,\sigma) = \theta\gamma(t;1,\nu) + \sigma W(\gamma(t;1,\nu)), \tag{1.25}$$
a time-changed Brownian motion with gamma random time.
• Properties of the VG Process
The random variable $X(t) = \theta\gamma(t) + \sigma W(\gamma(t))$ of the VG process
over a time interval $t$ contains two independent random parts: a Gaussian random
variable $W(t)$ and a gamma random variable $\gamma(t)$. This independence lets us
conveniently use conditional expectation to derive its pdf and characteristic
function.
The pdf of $X(t)$ can be obtained from a normal density function conditioned on
a gamma random variable. Integrating out the gamma part gives the density function
of $X(t)$:
$$f_{X(t)}(x) = \int_0^{\infty} \frac{1}{\sigma\sqrt{2\pi g}} \exp\left( -\frac{(x - \theta g)^2}{2\sigma^2 g} \right) \frac{g^{t/\nu - 1}\exp(-g/\nu)}{\nu^{t/\nu}\,\Gamma(t/\nu)}\, dg. \tag{1.26}$$
Similarly, the characteristic function of $X(t)$ can be obtained by conditioning on
the gamma random variable $\gamma(t)$:
$$\phi_{X(t)}(u) = \left( \frac{1}{1 - i\theta\nu u + (\sigma^2\nu/2)u^2} \right)^{t/\nu}. \tag{1.27}$$
Because the characteristic function has a much simpler expression than the density
function, it is used most of the time.
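As an illustration of how compact Eq. (1.27) is in code, the following sketch evaluates the VG characteristic function with NumPy and recovers the mean $\theta t$ by a central finite difference. The function name `vg_cf` and the parameter values are assumptions made for this example only.

```python
import numpy as np

def vg_cf(u, t, sigma, nu, theta):
    """VG characteristic function of X(t), Eq. (1.27)."""
    u = np.asarray(u, dtype=complex)
    base = 1.0 - 1j * theta * nu * u + 0.5 * sigma**2 * nu * u**2
    return base ** (-t / nu)

# Illustrative parameters (not estimates from the data).
t, sigma, nu, theta = 2.0, 0.2, 0.5, -0.1

# phi'(0) = i * E[X(t)], so a central difference recovers the mean theta * t.
h = 1e-4
mean_fd = ((vg_cf(h, t, sigma, nu, theta) - vg_cf(-h, t, sigma, nu, theta))
           / (2.0 * h) / 1j).real
```

For real $u$ the base of the power has real part at least one, so the principal branch used by NumPy is the appropriate one here.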
Another representation of the VG process interprets it as the difference of
two independent gamma processes,
$$X(t) = \gamma_p(t) - \gamma_n(t), \tag{1.28}$$
where $\gamma_p(t)$ is a gamma process representing positive jumps and $\gamma_n(t)$
is an independent gamma process representing negative jumps. The VG process is the
net effect of these two processes, which implies that the VG process is also a pure
jump Lévy process.
The Lévy measure (or Lévy density) can be determined from the characteristic
function,
$$k_X(dx) = \begin{cases} C\exp(Gx)\,|x|^{-1}\,dx, & x < 0 \\ C\exp(-Mx)\,x^{-1}\,dx, & x > 0, \end{cases} \tag{1.29}$$
where
$$C = \frac{1}{\nu} > 0 \tag{1.30}$$
$$G = \left( \sqrt{\frac{\theta^2\nu^2}{4} + \frac{\sigma^2\nu}{2}} - \frac{\theta\nu}{2} \right)^{-1} > 0 \tag{1.31}$$
$$M = \left( \sqrt{\frac{\theta^2\nu^2}{4} + \frac{\sigma^2\nu}{2}} + \frac{\theta\nu}{2} \right)^{-1} > 0. \tag{1.32}$$
From (1.29) we can see that the first part ($x < 0$) is the Lévy measure of the
negative jumps ($\gamma_n$) with parameters $C$, $G$, and the second part ($x > 0$)
is the Lévy measure of the positive jumps ($\gamma_p$) with parameters $C$, $M$.
The central moments, skewness, and kurtosis can also be derived. The results, at the
unit time $t = 1$, are listed below:
$$\text{mean} = \theta \tag{1.33}$$
$$\text{variance} = \sigma^2 + \nu\theta^2 \tag{1.34}$$
$$\text{skewness} = \theta\nu(3\sigma^2 + 2\nu\theta^2)/(\sigma^2 + \nu\theta^2)^{3/2} \tag{1.35}$$
$$\text{kurtosis} = 3\left( 1 + 2\nu - \nu\sigma^4(\sigma^2 + \nu\theta^2)^{-2} \right), \tag{1.36}$$
which show that the VG process has the flexibility to control skewness and excess
kurtosis, unlike Brownian motion, for which these values are fixed.
The Lévy measure of the VG process (1.29) indicates that the VG random variable has
a self-decomposable distribution (Proposition 10). Thus, the VG random variable is
a candidate building block at the unit time in the mixed model.
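The two parameterizations of the VG law, $(\sigma,\nu,\theta)$ and $(C,G,M)$, can be cross-checked against each other. The sketch below is an illustration under the formulas (1.30)-(1.36); the function names are hypothetical. It relies on the standard fact that a gamma process with Lévy density $C e^{-Mx}/x$ has unit-time cumulants $C\,(n-1)!/M^n$.

```python
import math

def vg_unit_moments(sigma, nu, theta):
    """Mean, variance, skewness, kurtosis at t = 1, Eqs. (1.33)-(1.36)."""
    var = sigma**2 + nu * theta**2
    skew = theta * nu * (3.0 * sigma**2 + 2.0 * nu * theta**2) / var**1.5
    kurt = 3.0 * (1.0 + 2.0 * nu - nu * sigma**4 / var**2)
    return theta, var, skew, kurt

def vg_cgm(sigma, nu, theta):
    """Levy-measure parameters C, G, M from Eqs. (1.30)-(1.32)."""
    root = math.sqrt(theta**2 * nu**2 / 4.0 + sigma**2 * nu / 2.0)
    return 1.0 / nu, 1.0 / (root - theta * nu / 2.0), 1.0 / (root + theta * nu / 2.0)

# Illustrative parameters (not estimates from the data).
mean, var, skew, kurt = vg_unit_moments(0.2, 0.5, -0.1)
C, G, M = vg_cgm(0.2, 0.5, -0.1)

# Difference-of-gammas cumulants: positive jumps minus negative jumps.
mean_from_cgm = C * (1.0 / M - 1.0 / G)
var_from_cgm = C * (1.0 / M**2 + 1.0 / G**2)
```

A negative $\theta$ gives $G < M$, i.e., a slower-decaying left tail and negative skewness.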
1.4.2 Stock Price Dynamics with the VG Process
• Preliminary - Stock Market Models
The stock price can be modeled as a stochastic process
$$S_t = S_0\exp(\mu t + X_t), \tag{1.37}$$
where $S_t$ and $S_0$ are the stock prices at times $t$ and 0, respectively, $X_t$
represents a stochastic process, and $\mu$ is the mean rate of return of the stock.
However, with $S_t$ as in (1.37), the discounted price $S_te^{-\mu t}$ is in general
not a martingale under the statistical probability measure (usually called the
"physical measure"). The following proposition provides a way to make it a
martingale [Theorems 2.5.1 and 2.5.3 in [72]].
Proposition 14 Let $\{X_t : t \ge 0\}$ be a real-valued process with independent
increments. If $E[e^{aX_t}] < \infty$ for some real-valued $a$, then
$\left( \dfrac{e^{aX_t}}{E[e^{aX_t}]} \right)_{t \ge 0}$ is a martingale.
Letting $a = 1$ and assuming $E[e^{X_t}] < \infty$, Eq. (1.37) can be modified to
$$S_t = S_0\,\frac{\exp(\mu t + X_t)}{E[e^{X_t}]}. \tag{1.38}$$
Taking the expectation of $S_t$, we get
$E[S_t] = S_0e^{\mu t}E\left[ \dfrac{e^{X_t}}{E[e^{X_t}]} \right] = S_0e^{\mu t}$, or
$S_0 = E[S_t]e^{-\mu t}$; together with Proposition 14, this means that
$S_te^{-\mu t}$, the stock price discounted by its drift term $e^{\mu t}$, is a
martingale under the physical measure.
• VG Stock Market Model
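Let $X_t$ in (1.38) be the VG random variable and $\{X_t : t \ge 0\}$ the associated
VG process; we then have the VG stock market model.
By the VG characteristic function (1.27),
$$E[\exp(X_{VG}(t))] = \phi_{X_{VG}(t)}(-i) = \left( \frac{1}{1 - i\theta\nu(-i) + (\sigma^2\nu/2)(-i)^2} \right)^{t/\nu} = \exp\left( -\frac{t}{\nu}\ln\left( 1 - \theta\nu - \frac{\sigma^2\nu}{2} \right) \right).$$
Denoting $\omega = \dfrac{1}{\nu}\ln\left( 1 - \theta\nu - \dfrac{\sigma^2\nu}{2} \right)$,
Eq. (1.38) becomes
$$S_t = S_0\exp\{ \mu t + X_{VG}(t) + \omega t \}, \tag{1.39}$$
which is the stochastic process for the stock price. The log-price return
$\ln\left( \dfrac{S_t}{S_0} \right)$ follows $\mu t + X_{VG}(t) + \omega t$.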
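The role of the correction term $\omega$ can be illustrated in a few lines. This is a sketch under the formulas above, with hypothetical function names and illustrative parameter values; it evaluates $E[e^{X_{VG}(t)}]$ as the characteristic function (1.27) at $u = -i$ and checks that multiplying by $e^{\omega t}$ gives one.

```python
import cmath
import math

def vg_cf(u, t, sigma, nu, theta):
    """VG characteristic function, Eq. (1.27); u may be complex."""
    base = 1.0 - 1j * theta * nu * u + 0.5 * sigma**2 * nu * u**2
    return cmath.exp(-(t / nu) * cmath.log(base))

def vg_omega(sigma, nu, theta):
    """omega = (1/nu) * ln(1 - theta*nu - sigma^2*nu/2), as in Eq. (1.39)."""
    arg = 1.0 - theta * nu - 0.5 * sigma**2 * nu
    if arg <= 0.0:
        raise ValueError("E[exp(X)] is not finite for these parameters")
    return math.log(arg) / nu

# Illustrative parameters (not estimates from the data).
sigma, nu, theta, t = 0.2, 0.5, -0.1, 3.0
exp_moment = vg_cf(-1j, t, sigma, nu, theta)   # E[exp(X_VG(t))]
omega = vg_omega(sigma, nu, theta)
discounted_mean = (exp_moment * cmath.exp(omega * t)).real
```

As $\nu \to 0$ the value of $\omega$ tends to $-(\theta + \sigma^2/2)$, the usual lognormal correction for a Brownian motion with drift $\theta$ and volatility $\sigma$.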
Using the density function (1.26) of the VG random variable $X_{VG}(t)$ and
integrating out the gamma random variable, we can derive the density function (pdf)
of the log-price return $\ln\left( \dfrac{S_t}{S_0} \right)$.
Proposition 15 Density Function (pdf) of VG Log-price Returns
Let $r(t) = \ln\left( \dfrac{S(t)}{S(0)} \right)$ be the log-price return, and let
$S_t$ follow Eq. (1.39). Under the physical measure, the pdf of $r(t)$ is given by
$$f(r(t)) = \frac{2\exp\left( \theta z/\sigma^2 \right)}{\sqrt{2\pi}\,\sigma\,\nu^{t/\nu}\,\Gamma(t/\nu)} \left( \frac{z^2}{2\sigma^2/\nu + \theta^2} \right)^{\frac{t}{2\nu} - \frac{1}{4}} K_{\frac{t}{\nu} - \frac{1}{2}}\!\left( \frac{1}{\sigma^2}\sqrt{z^2\left( \frac{2\sigma^2}{\nu} + \theta^2 \right)} \right), \tag{1.40}$$
where $K$ is the modified Bessel function of the second kind, and
$z = r(t) - \mu t - \dfrac{t}{\nu}\ln\left( 1 - \theta\nu - \dfrac{\sigma^2\nu}{2} \right)$.
As $\ln\left( \dfrac{S_t}{S_0} \right) = \mu t + X_{VG}(t) + \omega t$, the
characteristic function of the log-price return $r(t)$ is easier to obtain:
$$E[e^{iur(t)}] = E\left[ e^{iu(\mu t + X_{VG}(t) + \omega t)} \right] = e^{iu(\mu + \omega)t}\,\phi_{X_{VG}(t)}(u) = \exp\{ iu(\mu + \omega)t \}\left( 1 - i\theta\nu u + \frac{\sigma^2\nu}{2}u^2 \right)^{-t/\nu}. \tag{1.41}$$
The characteristic function has a much simpler form than the density function. As
these two functions are in one-to-one correspondence through the Fourier transform,
the characteristic function is used most of the time.
1.4.3 Maximum Likelihood Estimation (MLE)
A probability distribution can model log stock returns over a fixed interval, e.g.,
daily returns or hourly returns. Given a set of sample points, maximum likelihood
estimation (MLE) can be employed to estimate the model parameters by fitting
the statistical model to the data.
Let $x_1, x_2, \ldots, x_n$ be $n$ i.i.d. sample data points collected from a
population. The pdf of a proposed distribution is $f(x;\vec\theta)$, where
$\vec\theta = \{\theta_1, \theta_2, \ldots, \theta_k\}$ is the parameter set. The
likelihood function is defined as
$$L(\vec\theta \mid x) = \prod_{i=1}^{n} f(x_i;\vec\theta). \tag{1.42}$$
The method of maximum likelihood estimates $\vec\theta$ by finding the parameter
values $\hat\theta$ that maximize $L(\vec\theta \mid x)$. Eq. (1.42) can be rewritten
in logarithmic form,
$$\log L(\vec\theta \mid x) = \sum_{i=1}^{n} \log f(x_i;\vec\theta),$$
which is called the log-likelihood function. Maximization is then conducted on this
function instead.
Due to the complexity of the pdf formulas, in most cases MLE can only be performed
numerically. When a closed form of the pdf is not known, or is too complicated to use
in MLE, the characteristic function is employed instead. In this situation, the
conversion from the characteristic function to the pdf is performed by the Fourier
transform. A useful technique for the Fourier transform is discussed in the next
section.
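The numerical MLE step can be sketched as follows. This is a deliberately simplified illustration, not the estimation code used in this work: a zero-mean normal density stands in for the VG pdf, the sample is simulated, and the maximization is a plain grid search over one parameter. All names and values are assumptions.

```python
import numpy as np

def log_likelihood(pdf, x, params):
    """Log-likelihood: sum of log f(x_i; params), the log of Eq. (1.42)."""
    return float(np.sum(np.log(pdf(x, *params))))

def normal_pdf(x, sigma):
    """Zero-mean normal density, used here only as a stand-in model."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(1)
x = rng.normal(0.0, 0.02, size=5000)      # hypothetical demeaned returns

grid = np.linspace(0.005, 0.05, 901)      # candidate values of sigma
ll = np.array([log_likelihood(normal_pdf, x, (s,)) for s in grid])
sigma_hat = grid[np.argmax(ll)]           # grid maximizer of the log-likelihood
```

For this stand-in model the maximizer is known in closed form, $\hat\sigma = \sqrt{\tfrac{1}{n}\sum x_i^2}$, which gives a simple check of the grid search. In practice a numerical optimizer would replace the grid, and the pdf would come from the model at hand.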
1.4.4 Fast Fourier Transform (FFT)
The numerical MLE method is realized through an optimization procedure, which
requires evaluating the likelihood function at each iteration. If the pdf is
not known or not computationally feasible, its values are obtained from the
characteristic function through the Fourier transform,
$$f(x;\vec\theta) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iux}\phi_X(u)\,du.$$
The fact that $f(x;\vec\theta)$ is a real-valued function implies that $\phi_X(u)$
has an even real part and an odd imaginary part, which reduces the transform to
$$f(x;\vec\theta) = \frac{1}{\pi}\,\mathrm{Re}\int_0^{\infty} e^{-iux}\phi_X(u)\,du.$$
The latter can be computed numerically by
$$f(x_k;\vec\theta) \approx \frac{1}{\pi}\,\mathrm{Re}\sum_{j=1}^{N} e^{-iu_jx_k}\phi_X(u_j)\,\eta, \tag{1.43}$$
where $\eta = \Delta u$ and $k = 1, 2, \ldots, N$. Eq. (1.43) has the form of a
discrete Fourier transform (DFT) and requires $O(N^2)$ operations: $N$ computations
are needed for each $x_k$, and there are $N$ values of $x_k$. The fast Fourier
transform (FFT) is an efficient algorithm to compute the DFT. It produces exactly the
same result as the DFT and requires only $O(N\log N)$ operations. There are several
FFT algorithms. The one used in our Matlab code is based on FFTW [81], and the
details of this algorithm can be found in Ref. [78].
The ready-to-use formula for the numerical computation of the pdf can be derived
from Eq. (1.43).² In this formula, $N$ is usually a power of 2. Given the step size
$\eta$, the upper limit of $u$ is $a = N\eta$, and the grid points are
$u_j = \eta(j - 1)$. Let $\lambda$ be the grid spacing of $x$; then $x$ ranges from
$-b$ to $b$, where $b = \dfrac{N\lambda}{2}$, and the grid points are
$x_k = -b + \lambda(k - 1)$ for $k = 1, 2, \ldots, N$.
With the above setting, Eq. (1.43) becomes
$$f(x_k;\vec\theta) \approx \frac{1}{\pi}\,\mathrm{Re}\sum_{j=1}^{N} e^{-i\eta(j-1)(-b + \lambda(k-1))}\phi_X(u_j)\,\eta = \frac{1}{\pi}\,\mathrm{Re}\sum_{j=1}^{N} e^{-i\lambda\eta(j-1)(k-1)}\,e^{ibu_j}\phi_X(u_j)\,\eta. \tag{1.44}$$
² The procedure is based on the work by Carr and Madan [13].
The formula of the standard discrete Fourier transform is
$$Z(k) = \sum_{j=1}^{N} e^{-\frac{2\pi i}{N}(j-1)(k-1)}\,z(j). \tag{1.45}$$
Comparing Eq. (1.44) with Eq. (1.45), we get
$$\lambda\eta = \frac{2\pi}{N}.$$
With properly chosen values of $\eta$ and $N$, Eq. (1.44) can be used immediately in
the numerical procedure. It can be further refined by incorporating Simpson's rule,
$$f(x_k;\vec\theta) \approx \frac{1}{\pi}\,\mathrm{Re}\sum_{j=1}^{N} e^{-\frac{2\pi i}{N}(j-1)(k-1)}\,e^{ibu_j}\phi_X(u_j)\,\frac{\eta}{3}\left( 3 + (-1)^j - \delta_{j-1} \right), \tag{1.46}$$
where $\delta_{j-1}$ is the Kronecker delta, equal to one at $j = 1$ and zero
elsewhere.
Comparing Eq. (1.45) and Eq. (1.46), we have
$$z(j) = \frac{1}{\pi}\,e^{ibu_j}\phi_X(u_j)\,\frac{\eta}{3}\left( 3 + (-1)^j - \delta_{j-1} \right), \tag{1.47}$$
which is the input to the FFT calculation; $f(x_k;\vec\theta)$ is the real part of
the output $Z(k)$.
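The recipe in Eqs. (1.44)-(1.47) can be sketched with NumPy's FFT, whose convention matches Eq. (1.45) with zero-based indices. This is an illustrative sketch rather than the Matlab code referred to above; the function name `pdf_from_cf`, the default values of `n` and `eta`, and the standard normal test case are assumptions.

```python
import numpy as np

def pdf_from_cf(cf, n=4096, eta=0.05):
    """Recover a pdf on a grid from a characteristic function, Eqs. (1.44)-(1.47).

    cf  : callable returning the characteristic function at an array of u
    n   : number of grid points (a power of 2)
    eta : step size in u; the x spacing is lam = 2*pi/(n*eta)
    """
    lam = 2.0 * np.pi / (n * eta)
    b = n * lam / 2.0
    j = np.arange(n)                     # zero-based, i.e. (j - 1) in the text
    u = eta * j
    x = -b + lam * j
    # Simpson weights (eta/3) * (3 + (-1)^j - delta_{j-1}) for j = 1, ..., N
    w = (eta / 3.0) * (3.0 + (-1.0) ** (j + 1))
    w[0] = eta / 3.0
    z = np.exp(1j * b * u) * cf(u) * w / np.pi   # FFT input, Eq. (1.47)
    return x, np.real(np.fft.fft(z))

# Check on a law with a known density: the standard normal.
x, f = pdf_from_cf(lambda u: np.exp(-0.5 * u**2))
center = np.abs(x) <= 5.0
exact = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
max_err = float(np.max(np.abs(f[center] - exact[center])))
```

In this sketch only the central part of the $x$ grid is compared with the exact density: with Simpson weights the values near $\pm b$ can be contaminated by wrap-around, so $\eta$ and $N$ should be chosen so that the region of interest lies well inside the grid.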
1.4.5 VG Random Number Simulation
Simulation of VG random numbers is needed for assessing the goodness of fit of the VG
model in our work. VG random numbers can be simulated directly from the definition
of the VG process. Recall that the VG process $X(t)$ is a drifted Brownian motion
whose time follows a gamma process independent of the Brownian motion
(Definition 13),
$$X(t) = \theta\gamma(t) + \sigma W(\gamma(t)). \tag{1.48}$$
To generate the random number $X(t)$ at time $t$, we simulate two independent
random variables separately: a standard normal random variable $W(1)$ and a gamma
random variable $\gamma(t)$. Conditional on $\gamma(t)$, $W(\gamma(t))$ in (1.48) is
a Gaussian random variable, so it can be written in law as
$W(\gamma(t)) = \sqrt{\gamma(t)}\,W(1)$. As $W(1)$ and $\gamma(t)$ are independent,
the product of $\sqrt{\gamma(t)}$ and the simulated normal number gives
$W(\gamma(t))$. Thus, the random number $X(t)$ in (1.48) is obtained.
The above procedure is summarized below:
Algorithm 16 VG Random Number Simulation
Step 1. Generate a standard normal random number $z \sim N(0,1)$.
Step 2. Generate a gamma random number $g \sim \mathrm{Gamma}(t/\nu, \nu)$ (shape $t/\nu$, scale $\nu$).
Step 3. The VG random number $X(t)$ at time $t$ is given by $X(t) = \theta g + \sigma\sqrt{g}\,z$.
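Algorithm 16 translates almost line by line into NumPy. The sketch below is an illustration with a hypothetical function name and illustrative parameter values; it uses the shape/scale convention of `numpy.random.Generator.gamma`, so the gamma time has mean $t$ and variance $\nu t$.

```python
import numpy as np

def simulate_vg(t, sigma, nu, theta, size, rng=None):
    """Draw `size` VG random numbers X(t) following Algorithm 16."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(size)                      # Step 1
    g = rng.gamma(shape=t / nu, scale=nu, size=size)   # Step 2: gamma time
    return theta * g + sigma * np.sqrt(g) * z          # Step 3

# Illustrative parameters (not estimates from the data).
rng = np.random.default_rng(42)
sample = simulate_vg(t=1.0, sigma=0.2, nu=0.5, theta=-0.1, size=400_000, rng=rng)
sample_mean = float(sample.mean())   # should be close to theta * t
sample_var = float(sample.var())     # should be close to (sigma^2 + nu*theta^2) * t
```

Comparing the sample mean and variance with $\theta t$ and $(\sigma^2 + \nu\theta^2)t$ gives a quick sanity check on the simulated numbers.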
1.5 Numerical Implementation and Results
1.5.1 Data, Sketch of the Procedure, and Brief Introduction
of the Statistical Analysis
The 495 largest stocks are chosen from the S&P 500 and S&P MidCap 400 [83].
The data are log-price returns from January 2, 2003 to December 29, 2006, sampled
as nonoverlapping one-hour, two-hour, three-hour, daily, weekly, and biweekly
returns.
The procedure is sketched below:
• Step 1. Fit the log return data at the unit time, which is chosen to be
one hour, for each stock using the Variance Gamma distribution.
• Step 2. Fit the log return data at longer horizons for each stock using three
different models: the VG iid model (in which the random variables at longer
horizons are cumulative sums of i.i.d. random variables from the unit time),
the VG scaling model (with the VG law at the unit time), and the VG mixed
model. The longer horizons are two hours, three hours, one day, one week, and
two weeks.
• Step 3. Assess the statistical goodness of fit of each model. The statistical
analyses include the Kolmogorov-Smirnov test, the Kolmogorov distance, the
modified Kolmogorov distance, the $\chi^2$-distance, and the $L^1$ and $L^2$
distances, which are briefly described below.
• Kolmogorov-Smirnov test (KS-test)
The KS-test is a statistical hypothesis test to determine whether two data
sets differ significantly. The null hypothesis is that the two data sets are drawn
from the same continuous distribution, and the alternative hypothesis is that they
come from different distributions. The significance level is usually taken as 5%.
The advantage of the KS-test is that it makes no assumptions about the
distributions of the data. Its disadvantage is that it is less sensitive than other
tests.
• Various statistical distances to measure how close two probability
distributions are
We have two probability distributions: one is the empirical distribution obtained
from the data; the other is the fitted probability distribution. If the two
distributions are similar, then graphically their cdf curves are close. The distance
between these two curves can be used to measure how good the fit is.
Let $F_1$ and $F_2$ be the cdfs of two distributions. Several distances are defined.
◦ Kolmogorov distance
$$\mathrm{dist}_K(F_1,F_2) = \sup_{x \in \mathbb{R}} |F_1(x) - F_2(x)|$$
measures the largest distance between the two cdfs.
◦ Modified Kolmogorov distance
$$\mathrm{dist}_{\tilde K}(F_1,F_2) = \sup_{|x| > \varepsilon} |F_1(x) - F_2(x)|$$
The empirical cdf has a large jump at $x = 0$, which is called the 0-return effect.
To eliminate this effect, the distance is not measured in a small neighborhood of 0.
◦ $\chi^2$-distance
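Denote by $n_1$, $n_2$ the $m$-dimensional frequency vectors from samples of the two
distributions. The $\chi^2$-distance is calculated by
$$\mathrm{dist}_{\chi^2}(F_1,F_2) = \sum_{i=1}^{m} \left( \frac{n_{1i}}{\sum_{k} n_{1k}} - \frac{n_{2i}}{\sum_{k} n_{2k}} \right)^2 \Big/ (n_{1i} + n_{2i}),$$
where $n_{1i}$, $n_{2i}$ are the elements of the vectors $n_1$, $n_2$.
◦ $L^p$-distance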
$$\mathrm{dist}_{L^p}(F_1,F_2) = \left( \int_{\mathbb{R}} |F_1(x) - F_2(x)|^p\,dx \right)^{1/p}, \quad p = 1, 2, \ldots$$
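The cdf-based distances above can be approximated on a grid. The following sketch is an illustration only: the function names are hypothetical, the supremum is taken over grid points rather than all of $\mathbb{R}$, and the integral uses the trapezoidal rule. The $\chi^2$-distance is left out because it works on binned frequencies rather than cdfs.

```python
import numpy as np

def ecdf_on_grid(sample, grid):
    """Empirical cdf of `sample` evaluated at the points of `grid`."""
    s = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(s, grid, side="right") / s.size

def kolmogorov_distance(F1, F2):
    """Largest absolute gap between two cdfs sampled on a common grid."""
    return float(np.max(np.abs(F1 - F2)))

def modified_kolmogorov_distance(F1, F2, grid, eps):
    """Kolmogorov distance restricted to |x| > eps (skips the jump at zero)."""
    mask = np.abs(grid) > eps
    return float(np.max(np.abs(F1[mask] - F2[mask])))

def lp_distance(F1, F2, grid, p):
    """L^p distance between two cdfs via the trapezoidal rule on `grid`."""
    integrand = np.abs(F1 - F2) ** p
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))
    return float(integral ** (1.0 / p))

# Toy example with known answers: uniform on [0, 1] versus uniform on [0.5, 1.5].
grid = np.linspace(-1.0, 3.0, 4001)
F1 = np.clip(grid, 0.0, 1.0)
F2 = np.clip(grid - 0.5, 0.0, 1.0)
d_k = kolmogorov_distance(F1, F2)
d_l1 = lp_distance(F1, F2, grid, 1)
```

In the toy example both the Kolmogorov distance and the $L^1$ distance are 0.5; for a fitted model, `F1` would be the empirical cdf from `ecdf_on_grid` and `F2` the model cdf on the same grid.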
1.5.2 Statistical Estimation at the Unit Time
We first estimate the VG parameters $(\sigma, \nu, \theta)$ for the hourly demeaned
returns using MLE. The KS-test is then performed on the observed data and data
simulated from the fitted VG model, at significance level $\alpha = 0.05$. Among the
495 stocks, the VG model fits adequately (p-value greater than 5%) for only about one
quarter (126) of the stocks. Further estimation at longer horizons is conducted only
on these 126 stocks.
Summary statistics of the estimated parameters are presented in Table 1.1, including
the mean, standard deviation, minimum, maximum, first and third quartiles, and
median. To illustrate the statistical fit graphically, the fitted pdf and its
empirical counterpart are shown in Figure 1-1 for WMT (Walmart).
                σ         ν           θ
mean            0.4473    1.32E-04    0.0397
std.            0.0936    2.27E-05    0.3282
min.            0.2788    8.73E-05   -0.5000
quantile 1/4    0.3694    1.13E-04   -0.1886
median          0.4436    1.31E-04    0.0763
quantile 3/4    0.5124    1.48E-04    0.2940
max             0.6713    2.11E-04    0.5000
Table 1.1: Statistics of the Estimated VG Parameters (at unit time = 1 hour)
Figure 1-1: VG fit to WMT at 1-hour time scale
1.5.3 Statistical Estimation at Longer Horizons
The distribution of the mixed model is based on the distribution at the unit time,
which means the estimated parameters of the base distribution,
$(\hat\sigma, \hat\nu, \hat\theta)$, are used at longer horizons. The remaining two
parameters, $c$ and $\gamma$, are to be estimated, where $c$ represents the proportion
of the random variable behaving like a Lévy process, and $\gamma$ represents the
scaling coefficient of the remaining component of the random variable. Maximum
likelihood estimation is performed on the nonoverlapping demeaned log return data at
each longer horizon: two hours, three hours, one day, one week, and two weeks.
The summary statistics of the estimates $(\hat c, \hat\gamma)$ are presented in
Table 1.2 ($\hat c$) and Table 1.3 ($\hat\gamma$). The parameters of the VG mixed
model are hard to estimate at even longer horizons, such as half a year, because of
the lack of data. The average values of $c$ and $\gamma$ are around 0.4 in our
estimation. Thus, $c$ and $\gamma$ may reasonably be assumed to equal 0.4 at horizons
where estimation is not feasible due to lack of data.
The performance of the VG mixed model is compared with that of two other models, the
VG iid model and the VG scaling model. The distribution of the VG iid model at a
longer horizon $t$ is that of the sum of i.i.d. VG variables up to time $t$, so it is
known once the distribution at the unit time is given. In the VG scaling model, the
random variable $X_t$ is $t^{\gamma'}X_1$, a scaled version of $X_1$ from the unit
time $t = 1$. Thus, we need to estimate $\gamma'$, the scaling coefficient at time
$t$. Sample graphs of the fitted and empirical pdfs of these three models at various
time horizons for WMT are shown in Figure 1-2 to Figure 1-6.
                2 hours   3 hours   1 day     1 week    2 weeks
mean            0.6404    0.7204    0.3390    0.4274    0.4713
std.            0.1712    0.1122    0.1116    0.1249    0.1421
min.            0.0064    0.3985    0.0002    0.0504    0.0002
quantile 1/4    0.5882    0.6518    0.2716    0.3436    0.3959
median          0.6591    0.7420    0.3334    0.4310    0.5025
quantile 3/4    0.7665    0.7955    0.4198    0.5165    0.5764
max             0.9144    0.9792    0.5759    0.7142    0.6927
Table 1.2: Statistics of the Estimated VG Mixed Parameter c
                2 hours   3 hours   1 day     1 week    2 weeks
mean            0.4966    0.5117    0.3277    0.3344    0.2671
std.            0.0617    0.0968    0.0536    0.111     0.1648
min.            0.2074    0.3043    0.0456    4.3e-06   1.9e-08
quantile 1/4    0.4674    0.4666    0.3074    0.3258    0.0536
median          0.5005    0.5028    0.3413    0.3634    0.3451
quantile 3/4    0.5363    0.5439    0.3621    0.3998    0.3864
max             0.6359    1.0000    0.4239    0.4681    0.4782
Table 1.3: Statistics of the Estimated VG Mixed Parameter γ
Figure 1-2: Statistical fit to WMT at 2hr timescale
Figure 1-3: Statistical fit to WMT at 3hr timescale
Figure 1-4: Statistical fit to WMT at 1d timescale
Figure 1-5: Statistical fit to WMT at 1w timescale
Figure 1-6: Statistical fit to WMT at 2w timescale
1.5.4 Statistical Analysis for Model Performance Comparison
Several statistical analyses, described in Section 1.5.1, are conducted to compare
the performance of the three models, namely VG mixed, VG scaling, and VG iid. The
statistics of the five distances are presented in Table 1.4 (mean) and Table 1.5
(std.).
The KS-test is also employed at the longer horizons to test and compare the three
models. It examines whether the observed data and data simulated from the
fitted model belong to the same distribution. The p-values are obtained and graphed
in Figure 1-7 to Figure 1-11, where the x-axis is the p-value and the y-axis is the
proportion of stocks whose p-value exceeds the corresponding value on the x-axis.
The graphs show that the VG scaling model performs better than the VG iid model,
and the VG mixed model outperforms both models at all horizons. At the longest
horizon (2-week), however, the VG iid model performs better than the VG scaling
model, which is consistent with the empirical fact that the return distribution
asymptotically approaches the Gaussian, which has i.i.d. increments along the time
horizon.
Mean             dist_K    dist_K̃    dist_χ²   dist_L1   dist_L2
2h-VG Mixed      0.0064    0.0047    0.0001    0.0008    4.00E-06
2h-VG Scaling    0.0122    0.0060    0.0002    0.0013    7.00E-06
2h-VG iid        0.0160    0.0137    0.0003    0.0018    9.00E-06
3h-VG Mixed      0.0096    0.0065    0.0002    0.0011    6.00E-06
3h-VG Scaling    0.0170    0.0101    0.0004    0.0019    1.40E-05
3h-VG iid        0.0204    0.0182    0.0005    0.0024    1.20E-05
1d-VG Mixed      0.0144    0.0118    0.0007    0.0024    7.00E-06
1d-VG Scaling    0.0247    0.0224    0.0011    0.0038    1.40E-05
1d-VG iid        0.1157    0.1157    0.0083    0.0265    3.04E-04
1w-VG Mixed      0.0250    0.0239    0.0027    0.0070    1.60E-05
1w-VG Scaling    0.0346    0.0342    0.0036    0.0094    1.90E-05
1w-VG iid        0.1143    0.1143    0.0164    0.0369    2.73E-04
2w-VG Mixed      0.0388    0.0381    0.0053    0.0121    2.80E-05
2w-VG Scaling    0.0503    0.0502    0.0065    0.0155    3.30E-05
2w-VG iid        0.1140    0.1140    0.0204    0.0411    2.31E-04
Table 1.4: Mean of the statistical distances of the three models at different timescales
Std.             dist_K    dist_K̃    dist_χ²   dist_L1    dist_L2
2h-VG Mixed      0.0024    0.0021    0.0001    2.00E-04   2.00E-06
2h-VG Scaling    0.0052    0.0022    0.0001    4.00E-04   5.00E-06
2h-VG iid        0.0056    0.0054    0.0001    6.00E-04   5.00E-06
3h-VG Mixed      0.0037    0.0028    0.0001    4.00E-04   4.00E-06
3h-VG Scaling    0.0060    0.0041    0.0001    4.00E-04   1.00E-05
3h-VG iid        0.0064    0.0063    0.0002    7.00E-04   7.00E-06
1d-VG Mixed      0.0062    0.0046    0.0003    8.00E-04   4.00E-06
1d-VG Scaling    0.0094    0.0089    0.0004    1.00E-03   1.10E-05
1d-VG iid        0.0129    0.0129    0.0017    3.20E-03   2.29E-04
1w-VG Mixed      0.0103    0.0095    0.0012    2.30E-03   9.00E-06
1w-VG Scaling    0.0131    0.0129    0.0013    2.60E-03   1.30E-05
1w-VG iid        0.0211    0.0211    0.0041    6.40E-03   1.66E-04
2w-VG Mixed      0.0124    0.0124    0.0025    4.30E-03   1.60E-05
2w-VG Scaling    0.0175    0.0177    0.0026    4.80E-03   2.00E-05
2w-VG iid        0.0228    0.0228    0.0063    8.90E-03   1.34E-04
Table 1.5: Std. of the statistical distances of the three models at different timescales
Figure 1-7: Proportion of stocks with p-value greater than certain level (2-hour timescale)
Figure 1-8: Proportion of stocks with p-value greater than certain level (3-hour timescale)
Figure 1-9: Proportion of stocks with p-value greater than certain level (1-day timescale)
Figure 1-10: Proportion of stocks with p-value greater than certain level (1-week timescale)
Figure 1-11: Proportion of stocks with p-value greater than certain level (2-week timescale)
1.6 Conclusion
This chapter investigates the performance of three stock return models, the VG iid
model, the VG scaling model, and a mixed version of the two models, the VG mixed
model. The first two approaches have different effects on skewness and excess kurtosis
along the time horizon. The empirical study shows that skewness and excess kurtosis
in Lévy models decline much faster than in the observed data as the horizon increases,
while they stay constant in scaling models. A strategy of combining the two approaches
is proposed [22]: at a short horizon named unit time, e.g., one hour, we split the
random variable of the log return into two components, one a fraction of itself and
the other the remaining part. The first component follows the VG iid model along the
time horizon, and the second follows the VG scaling model.
Statistical estimation and analysis are conducted. All the statistical analyses
show that the VG mixed model outperforms the other two models at all longer horizons.
Furthermore, both estimated coefficients in the VG mixed model, $c$ and $\gamma$, have
an average value of about 0.4 across horizons. The mixed model provides a practical
method to construct return distributions at longer horizons, which has many
applications in the financial industry.
Several issues remain for future study.
1. In this study, the VG distribution fits only one-fourth of the stocks at the unit
time of one hour. Other self-decomposable distributions should be explored to
obtain a better statistical fit at this unit time.
2. The value 0.4 is assumed for $c$ and $\gamma$ when estimation is not feasible due
to the lack of data. However, other values may perform better than 0.4. One possible
approach is to combine data from different horizons to estimate $c$ and $\gamma$,
which may be more accurate than assuming 0.4.
3. The mixed model provides a strategy to build return distributions at longer
horizons. The probability measure obtained from time series stock return data is
called the physical measure $P$. The option surface contains the information to
construct another return distribution, called the risk-neutral measure $Q$. How the
ratio $P/Q$ behaves along the time horizon has long interested researchers. This
question is not easy to answer because of the difficulty of obtaining the physical
measure at longer horizons. The mixed model provides a possible way to study this
topic.
Chapter 2
Estimating Expected Return By
Numeraire-Portfolio Method
2.1 Introduction
The expected return is one of the most important numbers in finance, as it predicts
a risky asset's future performance. The estimation of expected returns is crucial to
many investment decisions, e.g., portfolio selection. Much research has been done to
analyze and model expected returns by various risk factors. However, few studies have
been done to estimate this important number. Furthermore, there is no universally
accepted estimation method.
One widely used method implements classic asset pricing models, mainly the Capital
Asset Pricing Model (CAPM) and the Fama-French three-factor model, to estimate
expected returns from historical data. In these models (and their variations),
the expected return is determined by the asset's exposure to one or more risk factors,
measured by its beta(s). In the estimation procedure, the beta(s) are first estimated using
a simple OLS regression on historical data. Then the expected return is obtained as the
product of the estimated beta(s) and the associated risk premium. However, realized returns
are so volatile that a huge amount of data is required to obtain relatively precise
estimates. A detailed discussion can be found in [6]. An empirical study by Bartholdy
and Peare [4] also indicates that neither of the two popular models provides a
good fit: the regressions in both models explain on average only 5% of the
differences in returns.
Numerous authors, including Breeden [8], Lucas [46], Mehra and Prescott [57], and
Rubinstein [71], demonstrate that expected returns are determined by future uncertainty
and investors' preferences, rather than implied by realized returns. Therefore,
a discounted cash flow model, which links expected returns to future cash flows
(uncertainty), has been proposed. The estimator is originally derived from the Dividend
Discount Model by Preinreich [68], which says an asset's current price is its future payoffs
discounted by the expected return. Edwards and Bell [25] and Ohlson [62] improved the
model and derived the Edwards-Bell-Ohlson equation, which, along with its modified
versions, has been implemented by numerous authors, for example Claus and Thomas [15] and
Philips [64]. However, this method is not robust for assets with dividends or earnings
growth rates.
For other estimation approaches, we refer to Welch's paper [79], which provides a
review of the existing estimates of expected returns. In that paper, a survey
is conducted among a group of academic financial economists, and their
forecasts of the equity premium are reported.
In this chapter, we propose a novel estimation approach, which also tries to extract
the expected return from future uncertainty, here represented by option prices.
As an alternative to classic risk-neutral pricing, an option can be priced by a
method based on the so-called numeraire portfolio [45]. The numeraire portfolio is a
self-financing, positive portfolio that maximizes the expected log utility at the
terminal time. It exists if, and only if, there is no arbitrage opportunity. A striking
feature of this portfolio is that the price process of any asset in the same market, if
denominated by this portfolio, is a martingale under the physical measure. Therefore,
the numeraire portfolio provides a pricing method for contingent claims, which is
proposed by Platen [9] [65] [66]. More explicitly, an option's price, denominated by the
numeraire portfolio, is the expectation of its numeraire-denominated terminal payoff
under the physical measure. The physical measure implies the expected return through
stochastic stock price models. Therefore, the numeraire pricing method links the expected
return and future uncertainty, and it leads to a new method to estimate the expected return.
The numeraire portfolio is required in this method. However, its composition is
not as easy to determine as its existence. Long [45] demonstrates that it is a levered
position in the market portfolio. Furthermore, empirical studies [29] [69] [70] suggest
that the market portfolio and the numeraire portfolio can be proxied by value-weighted or
equal-weighted portfolios, such as the S&P 500 or NYSE indices.
Option calibration is conducted to estimate the parameters, which include the
desired expected return. A simulation technique is employed in the calibration procedure.
As the stock and the numeraire portfolio (or its proxy) are correlated, the bivariate
random variable is simulated through the full-rank Gaussian copula (FGC) [39] [50]
[51], which transforms the marginal samples to standard normal random variables,
estimates the dependence structure from the resulting binormal random variable, and then
transforms simulated standard normal numbers back to the desired bivariate random
numbers. In the calibration procedure, a stock price model for long horizons (one month
in this study) is required; we use the VG mixed model by Eberlein and Madan [22].
The expected returns of the first 50 stocks in the S&P 500 are estimated once every
month from January 1999 to October 2009. Unlike the realized return or its sample
mean, nearly 95% of the estimated expected returns are positive. The statistics of
these estimates are more stable than those of the realized returns. A simple linear
regression model further shows that the estimated returns and the realized returns have
the same mean for nearly 80% of the stocks. The results indicate that the estimated return
can serve as an estimator for the expected return, and that it is superior to the
estimators from historical data.
The rest of the chapter is organized as follows. Section 2.2 introduces the numeraire
portfolio and its pricing method. Section 2.3 describes the estimation procedure
using the numeraire-portfolio method. The numerical implementation and
results are presented in Section 2.4. Section 2.5 concludes.
2.2 Pricing with Physical Measure
2.2.1 A Simple Example
In this section, we present a simple example to illustrate how to price an asset.¹
Consider the single-period binomial model in Figure 2-1. We are interested in valuing
a stock A whose current price is \(S_0\) = $100. At time t = 1, its price will be either
$110 or $95, each with 50% probability. To simplify the situation, the interest rate
is assumed to be zero.
Figure 2-1: A single-period binomial model.
In this example, the probability at t = 1 is the real-world probability, which
is called the "physical measure." Simply taking the expectation under the physical
measure, \(E[S_1 \mid t_0] = 110 \times \tfrac{1}{2} + 95 \times \tfrac{1}{2} = 102.5\), does not
give us \(S_0\) = $100, the stock price at t = 0. If the price were $102.50, nobody would buy
this stock, as people could invest this amount at time 0 in the money market, which is
risk free, and get back $102.50 at time 1 with no risk of loss. So the actual price is
lower than the expected price under the physical measure, and the extra amount $2.50 is
the risk premium, a compensation for the uncertainty that people take on with this risky asset.
¹ Please note that this is a rough example, for illustration purposes only.
One approach to obtaining the price is to take the expectation under another
probability measure called the "risk-neutral measure." The idea was first proposed by
Cox and Ross [20] in 1976, and it is now the widely used method for pricing derivatives.
Let \(P^Q(S_1 = 110) = \tfrac{1}{3}\) and \(P^Q(S_1 = 95) = \tfrac{2}{3}\). Taking the expectation
under this measure, we get \(E^Q[S_1 \mid t_0] = 110 \times \tfrac{1}{3} + 95 \times \tfrac{2}{3} = 100\),
which is the actual price at time 0. This new measure is not the actual probability
measure, but it is equivalent to the physical measure. Harrison and Kreps [34] name this
risk-neutral measure an "equivalent martingale measure" because, under this measure,
the asset price processes are martingales.
How can we still obtain the actual price when taking the expectation under the physical
measure? In other words, under what condition can the price process still be a
martingale under the physical measure? Let us assume a portfolio with value $100
at time 0, and $105 in the up state and $99.75 in the down state at time 1. Now divide
stock A's prices by the values of this portfolio and take the expectation
under the physical measure. The expected denominated price at time 1 is
\(E[S^d_1 \mid t_0] = \frac{110}{105} \times \tfrac{1}{2} + \frac{95}{99.75} \times \tfrac{1}{2} = 1\),
which equals the denominated price at time 0, \(S^d_0 = \frac{100}{100} = 1\).
Thus, the stock's denominated price is a martingale under the physical measure. Such a
portfolio was found by Long [45], who named it the numeraire portfolio. Again, please
keep in mind that this is a rough example to illustrate the idea of asset pricing; no
further information is implied. The structure and properties of the numeraire portfolio
are discussed in the following section.
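The three expectations in this example can be checked numerically. The following is a minimal sketch that only reproduces the arithmetic of the example with exact fractions; the variable and function names are our own choices for illustration.

```python
from fractions import Fraction as F

# Terminal stock prices in the up and down states, and the time-0 price.
stock = [F(110), F(95)]
s0 = F(100)

# Physical probabilities and the risk-neutral probabilities chosen in the text.
p = [F(1, 2), F(1, 2)]
q = [F(1, 3), F(2, 3)]

# Values of the denominating portfolio: 100 at t = 0, then 105 or 99.75.
port0 = F(100)
port1 = [F(105), F(399, 4)]  # 99.75 = 399/4


def expectation(values, probs):
    """Expectation of a two-state random variable."""
    return sum(v * w for v, w in zip(values, probs))


e_physical = expectation(stock, p)       # expectation under the physical measure
e_risk_neutral = expectation(stock, q)   # expectation under the risk-neutral measure
e_denominated = expectation([s / v for s, v in zip(stock, port1)], p)

print("physical expectation:   ", float(e_physical))
print("risk-neutral expectation:", float(e_risk_neutral))
print("denominated expectation: ", float(e_denominated))
print("risk premium:            ", float(e_physical - s0))
```

Exact fractions are used so that the martingale identity for the denominated price holds without rounding error.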
2.2.2 The Numeraire Portfolio
• The Setting
In this section, the basic definitions and assumptions are set up. The numeraire
portfolio is discussed within this context.
A single-period model of an asset market is considered. We assume no transaction
costs and no restrictions on short sales. N tradable assets exist in the market, with price
\(S_{ti}\) for asset i at time t, where i = 1, ..., N and t = 0, 1. To simplify the situation,
the asset prices are adjusted values, in which dividends and splits are
reflected. All the assets are assumed to have strictly positive prices, i.e., \(S_{ti} > 0\). It is
also reasonable to assume all prices are bounded, denoted as
\(P(S_{ti} < D;\ i = 1,\dots,N;\ t = 0,1) = 1\), which means all prices are almost surely less
than a finite number D. Let \(R_{ti}\) be the rate of return for asset i from time t - 1 to t.
Thus, \(R_{ti}\) is also bounded. Now, we have the N × 1 price and rate-of-return vectors
\(S_t\) and \(R_t\) for the N assets at time t.
Portfolios can be constructed using the N assets. Denote by \(\delta_{ti}\) the number of units
of asset i at time t, and by \(\delta_t\) the associated 1 × N composition vector. We also assume
finite portfolios, meaning that \(\delta_{ti}\) is a finite number for all i and t. The market value
of the portfolio at time t is denoted as \(V_t\), which equals \(\delta_t S_t\).
There are some specific requirements for the portfolios in our context: they must be
self-financing and always have positive value. In a self-financing portfolio, the purchase
of new assets must be funded by the sale of its own assets, expressed mathematically
as \(\delta_{t-1} S_t = \delta_t S_t\) for all \(t \ge 1\). Because of self-financing, only portfolios
with positive values can survive in the market: once a portfolio's value falls to zero or
below, there is no exogenous infusion and the portfolio is worthless. We assume there exists
at least one self-financing portfolio with positive value at all times. In that case, we
have at least one portfolio that performs well enough to serve as a numeraire portfolio,
which will be defined later.
Last, we define arbitrage, termed "profit opportunities" in Long's paper [45].
Roughly speaking, arbitrage is the opportunity to get something from nothing, or a
"free lunch." In our case, a portfolio with an arbitrage opportunity has zero initial cost
(t = 0), a nonnegative terminal value with probability one (at t = 1, or in the more
general case at t = T where \(T \ge 1\)), and a positive probability of a strictly positive
terminal value.
Mathematically it is defined as follows:
(1) \(V_0 = 0\);
(2) \(P\{V_1 \ge 0\} = 1\);
(3) \(P\{V_1 > 0\} > 0\).
• The Numeraire Portfolio: Definition
Within the above scenario, let us find a portfolio with the maximal expected log
return at the terminal time t = 1. The initial value of all portfolios is set to 1,
i.e., \(V_0 = 1\). The composition of the portfolio at time t is \(\delta_t\), a 1 × N vector. In a
single-period model, we select the portfolio at t = 0 and hold the position until t = 1.
Therefore, \(\delta_t\) is the same at t = 0 and 1 and can be simplified as \(\delta\). Correspondingly,
\(\delta_i\) is the number of shares of asset i. The portfolio value is still denoted as \(V_t\),
as it equals \(\delta S_t\), which still depends on t. Under the physical measure, this
maximization problem can be formulated as below:
\[ \max_{\delta} E_0[\ln \delta S_1] \quad \text{or} \quad \max_{\delta} E_0[\ln V_1], \quad \text{s.t. } \delta S_0 = 1. \]
Using the Lagrangian method with multiplier \(\lambda\),
\[ \frac{\partial \left( E_0[\ln \delta S_1] - \lambda(\delta S_0 - 1) \right)}{\partial \delta_i} = 0 \]
for each i, i = 1, ..., N. Then
\[ E_0\!\left[\frac{S_{1i}}{V_1}\right] - \lambda S_{0i} = 0, \quad \text{where } V_1 = \delta S_1,\ i = 1,\dots,N. \tag{2.1} \]
Multiply the i-th equation by \(\delta_i\) and sum over the N equations,
\[ E_0\!\left[\frac{\delta S_1}{V_1}\right] = \lambda\, \delta S_0. \]
Because \(\delta S_0 = 1\) and \(V_1 = \delta S_1\), we have \(\lambda = 1\). Eq (2.1) then becomes
\[ E_0\!\left[\frac{S_{1i}}{V_1}\right] = S_{0i} = \frac{S_{0i}}{V_0}. \]
The maximum is achieved, by the second-order condition, if the portfolio \(V_1\) has positive
value.
The above result means that each asset denominated by a portfolio with maximized
expected log return is a martingale under the physical measure. Furthermore, this
conclusion can be extended to the multi-period discrete-time case by the following
theorem².
² The idea of the proof can be found on page 54 in [45].
Theorem 17 In a discrete-time market with N assets, let \(\delta = \{\delta_t : t \ge 0\}\) be a positive
self-financing portfolio, where \(\delta_t\) is a 1 × N vector representing the portfolio's
composition at time t, t = 0, ..., T. If this portfolio maximizes \(E_0[\ln V_T]\) at T, then for any
asset i in this market
\[ \frac{S_{it}}{V_t} = E_t\!\left[\frac{S_{iT}}{V_T}\right], \quad \text{for } 0 \le t < T, \tag{2.2} \]
where \(S_{it}\) is the price of asset i at t, and \(V_t\) is the value of portfolio \(\delta\) at t.
Long [45] calls this kind of denominating portfolio the "numeraire portfolio".
Definition 18 Numeraire Portfolio
In a discrete-time market, a numeraire portfolio \(V^{*}\) is a self-financing portfolio
with maximized terminal expected log return. When each asset in the market is
denominated by this portfolio, it is a martingale under the physical measure:
\[ \frac{S_{it}}{V^{*}_t} = E_t\!\left[\frac{S_{iT}}{V^{*}_T}\right], \quad \text{for } 0 \le t < T,\ i = 1,\dots,N. \]
The numeraire portfolio is obtained by maximizing the expected log return, which
is also called the expected growth (or the expected growth rate in the continuous-time
model). Kelly [38] proposed an investment portfolio, named the growth-optimal
portfolio (GOP), obtained by maximizing the expected growth rate of portfolios. Thus,
the numeraire portfolio is the growth-optimal portfolio, and both are optimal for
log-utility investors when the initial portfolio value is set to one.
Let \(\hat{S}_{ti} = S_{ti}/V^{*}_t\) be the denominated price of asset i, and let \(\hat{R}_{ti}\) be the
rate of return of the denominated asset i from t - 1 to t,
\[ \hat{R}_{ti} = \frac{\hat{S}_{ti}}{\hat{S}_{t-1,i}} - 1 = \frac{S_{ti}/V^{*}_t}{S_{t-1,i}/V^{*}_{t-1}} - 1. \]
Taking the conditional expectation on both sides, we get
\[ E_{t-1}[\hat{R}_{ti}] = \frac{E_{t-1}\!\left[S_{ti}/V^{*}_t\right]}{S_{t-1,i}/V^{*}_{t-1}} - 1. \]
The numeraire portfolio implies that \(E_{t-1}[\hat{R}_{ti}] = 0\): the best conditional forecast
of the rate of return for any denominated asset is zero. This is an impressive feature
that implies a relationship with the market portfolio of the Capital Asset Pricing
Model (CAPM), which in turn provides information on the composition and proxies of the
numeraire portfolio. Details are discussed below under "The Numeraire Portfolio:
Composition and Proxies."
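As an illustration of the first-order condition (2.1) and the resulting martingale property, the sketch below finds the log-optimal portfolio in a small two-state market. The market (a stock moving from 100 to 110 or 95 with equal probability, plus a riskless asset with zero interest) reuses the numbers of Section 2.2.1. The parametrization by a wealth fraction w, the function names, and the bisection solver are our own assumptions for illustration and are not part of the original derivation.

```python
P = [0.5, 0.5]               # physical probabilities of the two states
STOCK_GROSS = [1.10, 0.95]   # S_1 / S_0 in the up and down states
BOND_GROSS = [1.0, 1.0]      # riskless asset with zero interest


def portfolio_gross(w):
    """Gross return V_1 / V_0 when a fraction w of wealth is held in the stock."""
    return [w * s + (1.0 - w) * b for s, b in zip(STOCK_GROSS, BOND_GROSS)]


def foc(w):
    """Derivative of E[ln V_1] with respect to w (first-order condition)."""
    values = portfolio_gross(w)
    return sum(p * (s - b) / v
               for p, s, b, v in zip(P, STOCK_GROSS, BOND_GROSS, values))


def log_optimal_weight(lo=0.0, hi=15.0, tol=1e-12):
    """Bisection on the first-order condition.

    E[ln V_1] is concave in w, so its derivative is decreasing; the bracket
    [0, 15] keeps the portfolio value strictly positive in both states.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


w_star = log_optimal_weight()
v_star = portfolio_gross(w_star)

# Each asset, denominated by the log-optimal portfolio, should have
# expected gross return one under the physical measure.
stock_check = sum(p * s / v for p, s, v in zip(P, STOCK_GROSS, v_star))
bond_check = sum(p * b / v for p, b, v in zip(P, BOND_GROSS, v_star))

print("optimal stock fraction:", w_star)
print("portfolio gross returns:", v_star)
print("E[stock / V*], E[bond / V*]:", stock_check, bond_check)
```

In this market the optimum is a levered position (a stock fraction of 5, financed by borrowing), which is in line with the statement that the numeraire portfolio is a levered position in the risky assets.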
• The Numeraire Portfolio: Existence
The numeraire portfolio has the striking properties described in the previous section.
The question now is under what conditions such a portfolio exists.
Theorem 19 Existence and Uniqueness of Numeraire Portfolio
In a market, all the portfolios are assumed to have bounded values. A numeraire
portfolio exists if and only if no arbitrage opportunity exists in the market. If there
are two numeraire portfolios, then they have the same rates of return.
We sketch the idea of the proof to give a better understanding of the conditions
for existence. A detailed proof can be found on page 53 in [45].
First, if there exists a numeraire portfolio \(\delta^{*}\), then its definition implies that any
other portfolio \(\delta\) has
\[ \hat{V}_0 = E_0[\hat{V}_1], \tag{2.3} \]
where \(\hat{V}_0 = V_0/V^{*}_0\) and \(\hat{V}_1 = V_1/V^{*}_1\) are the denominated values of the
portfolio \(\delta\).
If \(\delta\) contains an arbitrage opportunity, then by the arbitrage definition we have
\[ \hat{V}_0 = 0 \quad \text{and} \quad E_0[\hat{V}_1] > 0, \]
which contradicts Eq (2.3). Thus, the existence of a numeraire portfolio excludes
arbitrage.
Secondly, we demonstrate the "if" part. The numeraire portfolio is derived from the
maximization of \(E_0[\ln V_1]\) with the constraint \(V_0 = 1\). In the setting above, we assumed
that the prices of all assets are bounded and that there exists at least one self-financing
portfolio with always positive values. The maximization problem has a solution under the
no-arbitrage condition and the assumptions listed above. Thus, there exists
a numeraire portfolio.
Last, we address the uniqueness of the rates of return. If we have two
numeraire portfolios A and B with rates of return \(R_{At}\) and \(R_{Bt}\), then each can
be denominated by the other, so that, consistent with the zero expected denominated return,
\[ E_0\!\left[\frac{1 + R_{At}}{1 + R_{Bt}}\right] - 1 = E_0\!\left[\frac{1 + R_{Bt}}{1 + R_{At}}\right] - 1 = 0. \]
By Jensen's inequality, a positive random variable and its reciprocal can both have
expectation one only if the variable equals one almost surely, so the above equations hold
if and only if \(R_{At} = R_{Bt}\). However, the uniqueness of the rate of return does not imply
a unique composition of the numeraire portfolio. Vasicek [77] provides an example in which,
with redundant assets, there exist numeraire portfolios with different compositions.
• The Numeraire Portfolio: Composition and Proxies
Above, we described an impressive property of the numeraire portfolio: zero is the
best conditional forecast of the rate of return for any numeraire-denominated asset.
This property connects the numeraire portfolio with the market portfolio in the CAPM,
which is a portfolio consisting of all assets in the market, with weights proportional
to their market values. Denote by \(R_i\) and \(R^{*}\) the rates of return of asset i and the
numeraire portfolio \(\delta^{*}\), respectively. Then, we have
\[ 1 + \hat{R}_i = \frac{S_{1i}/V^{*}_1}{S_{0i}/V^{*}_0} = \frac{S_{1i}/S_{0i}}{V^{*}_1/V^{*}_0} = \frac{1 + R_i}{1 + R^{*}}, \tag{2.4} \]
which has conditional expectation 1 because \(E_0[\hat{R}_i] = 0\). This implies that
if asset i has a high expected rate of return, it also has a high covariance with the
numeraire portfolio's rate of return. This relationship is similar to the relationship
between individual assets and the market portfolio in the CAPM. Thus, the numeraire
portfolio is similar to the market portfolio. Furthermore, Long [45] indicates that the
numeraire portfolio is a levered position in the mean-variance efficient portfolio. Roll [70]
states that the mean-variance efficient portfolio P can serve as the market portfolio in the
CAPM equation,
\[ E[R_i] = R_f + \beta_{iP}\,(E[R_P] - R_f), \]
where the original CAPM equation is
\[ E[R_i] = R_f + \beta_{im}\,(E[R_m] - R_f), \]
where m represents the market portfolio, and the mean-variance efficient portfolio
is the market portfolio. Thus, the numeraire portfolio is also a levered position in the
market portfolio.
Eq (2.4) shows that the numeraire-denominated rate of return of asset i is
\[ \hat{R}_i = \frac{1 + R_i}{1 + R^{*}} - 1. \]
As discussed previously, the above rate of return has mean zero. This property
can be used to test and compare different proxies for the numeraire portfolio. Let
\(R_{iP} = \frac{1 + R_i}{1 + R_P} - 1\) be the proxy-denominated returns, where \(R_P\) is the rate of
return of the proxy. Roll [69] uses the market portfolio proxy, the S&P 500, for the
numeraire portfolio, Fama and MacBeth [29] choose the NYSE equal-weighted portfolio, and
Long [45] picks the NYSE value-weighted portfolio. Hotelling's \(T^2\) tests are employed
with the null hypothesis that the expected proxy-denominated return equals zero. All the
tests indicate zero expected returns with sufficiently high p-values. Long [45] also finds
that the proxy-denominated returns have means close to zero with small standard deviations.
These empirical tests suggest that value-weighted or equal-weighted portfolios, such as the
S&P 500 or NYSE indices, can serve as proper proxies for the numeraire portfolio.
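The proxy test described above can be sketched for a single asset as follows. This is a simplified stand-in, not the procedure of the cited studies: it uses a univariate one-sample t-statistic instead of the multivariate Hotelling \(T^2\) test, the function names are our own, and the return series are hypothetical numbers chosen only to make the example run.

```python
import math
from statistics import mean, stdev


def proxy_denominated_returns(asset_returns, proxy_returns):
    """Compute (1 + R_i) / (1 + R_P) - 1 for each period."""
    return [(1.0 + r) / (1.0 + rp) - 1.0
            for r, rp in zip(asset_returns, proxy_returns)]


def t_statistic(sample):
    """One-sample t-statistic for the null hypothesis of zero mean."""
    n = len(sample)
    return mean(sample) / (stdev(sample) / math.sqrt(n))


# Hypothetical monthly returns of one asset and of the proxy (illustration only).
asset = [0.021, -0.013, 0.034, 0.008, -0.027, 0.015]
proxy = [0.018, -0.010, 0.025, 0.011, -0.020, 0.012]

denominated = proxy_denominated_returns(asset, proxy)
print("mean proxy-denominated return:", mean(denominated))
print("t-statistic for zero mean:    ", t_statistic(denominated))
```

A t-statistic that is small in absolute value is consistent with a zero expected proxy-denominated return; with several assets tested jointly, the multivariate test named in the text would replace the univariate statistic.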
2.2.3 Pricing Under the Physical Measure
As discussed in Section 2.2.2, any asset in a market, when denominated by a numeraire
portfolio, is a martingale, and so is any portfolio that is a linear combination of
the assets in this market. This numeraire-denominated portfolio process is called the
fair price process by Bühlmann and Platen [9].
Definition 20 Fair Price Process
In a no-arbitrage market, let \(\delta^{*} = \{\delta^{*}_t : t \ge 0\}\) be the numeraire portfolio with
value process \(V^{*} = \{V^{*}_t : t \ge 0\}\), and let \(\delta = \{\delta_t : t \ge 0\}\) be a self-financing
portfolio with price process \(V = \{V_t : t \ge 0\}\). If the numeraire-denominated price of this
portfolio satisfies
\[ \frac{V_t}{V^{*}_t} = E_t\!\left[\frac{V_s}{V^{*}_s}\right], \quad 0 \le t < s < T;\ t, s, T \in \mathbb{N}, \]
then V is called a fair price process, and this market is a fair market.
In a fair market, denote by \(H = \{H_t : 0 \le t \le T\}\) a contingent claim, which
is \(\mathcal{F}_t\)-measurable and has nonnegative payoff \(H_t\) on or before maturity T. Given a
numeraire portfolio in this market, a fair price for this contingent claim can be defined
[9].
Definition 21 Numeraire Pricing
Given a numeraire portfolio \(\delta^{*} = \{\delta^{*}_t : t \ge 0\}\) in a no-arbitrage market, the fair
price at time t of a contingent claim \(H = \{H_t : 0 \le t \le T\}\) is defined by
\[ V^{H}_t = V^{*}_t\, E^{P}_t\!\left[\frac{H_T}{V^{*}_T}\right], \quad 0 \le t < T, \tag{2.5} \]
where \(H_T\) is the contingent claim's terminal payoff and P is the physical measure.
The numeraire-denominated price \(\hat{V}^{H}_t = V^{H}_t / V^{*}_t\) is called the numeraire-portfolio
price (or "benchmarked price" in [9]), which is a martingale under the physical measure P.
The numeraire pricing requires the existence of the physical measure and a numeraire
portfolio. As discussed in Theorem 19, the numeraire portfolio exists if there
is no arbitrage opportunity, and the market indices are proper proxies for it. If the
physical measure is easy to obtain, then this method is feasible and easy to implement.
Another advantage is that the assets' expected return is contained in the pricing formula.
Thus, the numeraire-portfolio pricing method provides a possible way to estimate expected
returns, which are not easy to obtain otherwise. Details are discussed in the next section.
This pricing method is also named "real world pricing" in [9] because the pricing
formula (2.5) uses the physical measure (or the real-world probability measure).
According to Cochrane [16], the price in Eq (2.5) is the conditional expectation of the
final payoff discounted by the stochastic factor \(V^{*}_t / V^{*}_T\). Alternatively, we have the
widely used risk-neutral pricing method, which values a financial asset by the expected
future payoff, discounted by a risk-free asset \(B_t\), under the risk-neutral measure.
Platen [66] claims that risk-neutral pricing is a special case of numeraire pricing.
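Let \(\{B_t : t \ge 0\}\) be the riskless savings account (or a risk-free bond). Starting from
Eq (2.5),
\[
\begin{aligned}
V^{H}_0 &= V^{*}_0\, E^{P}_0\!\left[\frac{H_T}{V^{*}_T} \cdot \frac{B_T}{B_0} \cdot \frac{B_0}{B_T}\right] \\
        &= E^{P}_0\!\left[H_T \cdot \frac{B_T}{V^{*}_T} \cdot \frac{V^{*}_0}{B_0} \cdot \frac{B_0}{B_T}\right] \\
        &= B_0\, E^{P}_0\!\left[\frac{H_T}{B_T}\, \Lambda_{T,0}\right],
\end{aligned}
\]
where \(\Lambda_{T,0} = \frac{B_T / V^{*}_T}{B_0 / V^{*}_0} = \frac{\hat{B}_T}{\hat{B}_0}\), and \(\hat{B}_0\) and \(\hat{B}_T\) are the
numeraire-denominated prices of the risk-free bond. \(B_0\) and \(V^{*}_0\) equal 1 as initially set.
In a no-arbitrage market, any numeraire-denominated asset is a martingale under the
physical measure P. Therefore, \(E_0[\Lambda_{T,0}] = E_0\!\left[\frac{B_T}{V^{*}_T}\right] = 1\). As the risk-free
bond \(B_t\) and the numeraire portfolio value \(V^{*}_t\) are nonnegative, \(\frac{B_T}{V^{*}_T}\) is also a
nonnegative random variable. Thus, \(\Lambda_{T,0}\) is a Radon-Nikodym derivative, which transforms
the physical measure P to another measure Q by the following formula:
\[ V^{H}_0 = E^{P}_0\!\left[\frac{H_T}{B_T}\, \Lambda_{T,0}\right] = E^{Q}_0\!\left[\frac{H_T}{B_T}\right], \]
which is the risk-neutral pricing formula. Thus, Q is the risk-neutral measure.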
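The change of measure can be made concrete in a two-state market. The sketch below is illustrative only: it takes the stock of Section 2.2.1 (100 moving to 110 or 95 with equal probability), a riskless account with zero interest, and, as an assumption of the sketch, a numeraire portfolio with terminal values 1.5 and 0.75 per unit invested. These two values are the terminal values of the log-optimal portfolio of this particular two-asset market, and the code verifies the martingale property for them before using them. The call strike K = 100 and all variable names are hypothetical.

```python
P = [0.5, 0.5]                # physical probabilities
S_0, S_T = 100.0, [110.0, 95.0]
B_0, B_T = 1.0, [1.0, 1.0]    # riskless account, zero interest
V_0, V_T = 1.0, [1.5, 0.75]   # assumed numeraire portfolio values (see lead-in)
K = 100.0                     # hypothetical strike of a European call

# Sanity check: both assets are martingales when denominated by V.
stock_ok = sum(p * s / v for p, s, v in zip(P, S_T, V_T)) - S_0 / V_0
bond_ok = sum(p * b / v for p, b, v in zip(P, B_T, V_T)) - B_0 / V_0

# Radon-Nikodym derivative Lambda_{T,0} = (B_T / V_T) / (B_0 / V_0), state by state,
# and the implied risk-neutral probabilities q = p * Lambda.
lam = [(b / v) / (B_0 / V_0) for b, v in zip(B_T, V_T)]
Q = [p * l for p, l in zip(P, lam)]

payoff = [max(s - K, 0.0) for s in S_T]

# Numeraire pricing under P (Eq 2.5) versus risk-neutral pricing under Q.
price_numeraire = V_0 * sum(p * h / v for p, h, v in zip(P, payoff, V_T))
price_risk_neutral = B_0 * sum(q * h / b for q, h, b in zip(Q, payoff, B_T))

print("risk-neutral probabilities:", Q)
print("numeraire price:   ", price_numeraire)
print("risk-neutral price:", price_risk_neutral)
```

The implied probabilities coincide with the risk-neutral measure used in the simple example of Section 2.2.1, and the two pricing formulas return the same call price.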
2.3 Estimating Expected Returns Via Numeraire-Portfolio Pricing Method
2.3.1 Idea of the Expected Return Estimation
Consider a European call option on asset i with terminal payoff \(H_{Ti} = (S_{Ti} - K)^{+}\),
where \(S_{Ti}\) is the asset price and K is the strike. Let \(V = \{V_t : t \ge 0\}\) be the price
of the proxy of the numeraire portfolio. Using the numeraire pricing formula (2.5),
the price of this call option is
\[ V^{H}_{0i} = V_0\, E^{P}_0\!\left[\frac{(S_{Ti} - K)^{+}}{V_T}\right], \tag{2.6} \]
where the conditional expectation is under the physical measure of the stock and
the proxy. The associated probability distribution is the bivariate distribution of the
random variables \((S_{Ti}, V_T)\).
Recall from Section 1.4.2 that the model of the asset price is given by Eq (1.39)³
\[ S_T = S_0 \exp\{\mu T + X(T) + \omega T\}, \tag{2.7} \]
where \(\mu\) is the mean rate of return, \(\mu T\) is the expected return over the time interval
0 to T, and \(\omega\) is a "convexity correction" that makes the expected rate of return equal
\(\mu\) under the physical measure.
³ Although Eq (1.39) is the asset price model using the VG distribution, it is applicable
to all other proper laws for X(t).
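As discussed in Theorem 19, the existence of the numeraire portfolio is under the
assumption that all portfolios in the market are bounded. If the assets are modeled
by Eq (2.7), we need to check that the portfolios constructed from assets with the dynamics
of Eq (2.7) are bounded. Let the value of such a portfolio be \(\sum_i \delta_{ti} S_{ti}\) at time t,
where \(S_{ti}\) is asset i's price and \(\delta_{ti}\) is its number of shares. By Eq (2.7), we have
\[
\begin{aligned}
\ln\!\Big(\sum_i \delta_{ti} S_{ti}\Big)
 &= \ln\!\Big(\sum_i \delta_{ti} S_{0i} \exp\{\mu_i t + X_i(t) + \omega_i t\}\Big) \\
 &\le \ln\!\Big(\sqrt{\sum_i (\delta_{ti})^2}\ \sqrt{\sum_i \big(S_{0i} \exp(\mu_i t + \omega_i t)\big)^2}\ \sqrt{\sum_i e^{2X_i(t)}}\Big) \\
 &= \tfrac{1}{2} \ln\!\Big(\sum_i (\delta_{ti})^2\Big)
  + \tfrac{1}{2} \ln\!\Big(\sum_i \big(S_{0i} \exp(\mu_i t + \omega_i t)\big)^2\Big)
  + \tfrac{1}{2} \ln\!\Big(\sum_i e^{2X_i(t)}\Big).
\end{aligned}
\]
For the last term, by Jensen's inequality,
\[
E\Big[\ln\!\Big(\sum_i e^{2X_i(t)}\Big)\Big]
 \le \ln E\Big[\sum_i e^{2X_i(t)}\Big]
 = \ln\!\Big(\sum_i E\,e^{2X_i(t)}\Big)
 = \ln\!\Big(\sum_i \phi_i(-2\mathrm{i})\Big) < \infty,
\]
where \(\phi_i\) denotes the characteristic function of \(X_i(t)\).
The sum \(\sum_i \big(S_{0i} \exp(\mu_i t + \omega_i t)\big)^2\) is finite. If \(\sum_i (\delta_{ti})^2\) is bounded, which
is a reasonable constraint, then all the portfolios in this model setting are bounded. Thus,
the numeraire portfolio exists when the asset prices in the market follow the dynamics of
Eq (2.7).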
To estimate the mean rate of return \(\mu\) using Eq (2.6), we need to calibrate the model to
the option prices of asset i, which is done by minimizing an average error
between the market prices and the model prices \(V^{H}_{0i}\). Several such error measures
have been defined; they are summarized in Schoutens's book [74].
• Average Pricing Error (APE)
\[ \mathrm{APE} = \frac{1}{\bar{P}_R} \sum_{i=1}^{N} \frac{|P_R - P_M|}{N} \]
• Average Relative Percentage Error (ARPE)
\[ \mathrm{ARPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|P_R - P_M|}{P_R} \]
• Average Absolute Error (AAE)
\[ \mathrm{AAE} = \sum_{i=1}^{N} \frac{|P_R - P_M|}{N} \]
• Root-mean-square Error (RMSE)
\[ \mathrm{RMSE} = \sqrt{\sum_{i=1}^{N} \frac{|P_R - P_M|^2}{N}} \tag{2.8} \]
where \(P_R\) is the market price, \(\bar{P}_R\) is the mean of the market prices, \(P_M\) is the
model price, and N is the number of options. The model parameters in Eq (2.7)
can be estimated by minimizing one of these errors.
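The four error measures can be written compactly as below. This is a minimal sketch following the formulas above; the function name and the three option prices are hypothetical and serve only as an illustration.

```python
import math


def pricing_errors(market, model):
    """Return APE, ARPE, AAE and RMSE between market and model prices."""
    if len(market) != len(model) or not market:
        raise ValueError("price lists must be non-empty and of equal length")
    n = len(market)
    abs_err = [abs(pr - pm) for pr, pm in zip(market, model)]
    aae = sum(abs_err) / n                                   # average absolute error
    ape = aae / (sum(market) / n)                            # scaled by mean market price
    arpe = sum(e / pr for e, pr in zip(abs_err, market)) / n  # relative to each price
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    return {"APE": ape, "ARPE": arpe, "AAE": aae, "RMSE": rmse}


# Hypothetical market and model option prices (illustration only).
market_prices = [4.0, 2.0, 1.0]
model_prices = [4.5, 1.5, 1.0]

errors = pricing_errors(market_prices, model_prices)
for name, value in errors.items():
    print(name, value)
```

In a calibration, a function of this kind would be evaluated inside the optimizer, with the model prices recomputed for each candidate parameter set.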
The asset prices \((S_{Ti}, V_T)\) can be obtained either by analytical calculation or by
simulation. In our work, \(S_{Ti}\) and \(V_T\) are simulated by a technique called the full-rank
Gaussian copula (FGC), which will be introduced in the next section.
2.3.2 Multivariate Random Number Simulation Via FGC
Simulation of multivariate random numbers requires the associated multivariate
distribution function, which does not always have a closed form.
Also, the correlation is complicated in non-Gaussian distributions, and covariance may
not be a proper measure of dependence. A copula provides a general approach to
constructing dependence structures and formulating a multivariate distribution from
arbitrary marginal distributions, where the dependence structure is independent of the
marginal distributions. We briefly introduce the definition and some important
properties of copulas below. One type of copula and its application in simulation will
also be described. Details on copulas can be found in [61].
Definition 22 Copula
An n-dimensional copula C is a multivariate joint distribution function defined on
\([0,1]^n\) with mapping \([0,1]^n \to [0,1]\), which has the following properties:
(1) C is grounded⁴ and n-increasing;
(2) \(C(1,\dots,1,u_i,1,\dots,1) = u_i\), \(u_i \in [0,1]\), i = 1, ..., n;
(3) \(C(u_1,\dots,u_n) = 0\) if at least one \(u_i\) equals zero.
This definition indicates that a copula C, as a multivariate distribution function,
has uniform marginal distributions.
How is a copula, as a multivariate distribution function, related to other arbitrary
multivariate distribution functions? Sklar [76] provides a solution, which is the foundation
of most applications of copulas.
⁴ A function \(f(x_1, x_2, \dots, x_n)\) is called grounded if
\(f(a_1, \dots) = f(\dots, a_2, \dots) = \dots = f(\dots, a_n) = 0\),
where \(a_i\) is the least element in the domain of \(x_i\).
Theorem 23 Sklar's Theorem
Let F be an n-dimensional multivariate distribution function with continuous marginal
distributions \(F_1, F_2, \dots, F_n\). Then, there exists a unique n-dimensional copula C
defined on \([0,1]^n\) such that
\[ F(x_1, x_2, \dots, x_n) = C(F_1(x_1), F_2(x_2), \dots, F_n(x_n)). \]
Sklar's theorem states that, given a joint law F and the corresponding marginal
laws, there exists a copula C that describes the dependence structure, and that
copula does not contain any information about the form of the marginal laws. Thus,
the joint law F can be constructed from the marginal laws \(F_1, F_2, \dots, F_n\) and the
dependence structure C separately.
Instead of being expressed in the variables \((x_1, x_2, \dots, x_n)\), Sklar's theorem has
an equivalent form expressed through the probability distributions \(F_1, F_2, \dots, F_n\).
Corollary 24 An Equivalent Form of Sklar's Theorem
Let F be an n-dimensional multivariate distribution function with continuous marginal
distributions \(F_1, F_2, \dots, F_n\). Then, there exists a unique n-dimensional copula C
defined on \([0,1]^n\) such that
\[ C(u_1, u_2, \dots, u_n) = F(F_1^{-1}(u_1), F_2^{-1}(u_2), \dots, F_n^{-1}(u_n)), \]
where \(u_i \in [0,1]\) for i = 1, 2, ..., n.
The copula also has a useful invariance property given by Embrechts et al. [27].
Theorem 25 Invariant Property of Copula
Let \((x_1, \dots, x_n)\) be a continuous n-dimensional random vector with copula C, and let
\(h_1(x_1), \dots, h_n(x_n)\) be strictly increasing continuous functions on the ranges of
\(x_1, \dots, x_n\). Then the n-dimensional random vector \((h_1(x_1), \dots, h_n(x_n))\) also has the
same copula C.
This invariance property provides a powerful way to construct multivariate
distributions. If the distribution of the random vector \(\vec{X}\) is not easy to obtain, then
we can transform it to a new random vector whose dependence structure is easy
to build. The only requirement for this procedure is that the transform function is
strictly increasing.
Now let us introduce one copula, the Gaussian copula proposed by Li [43],
which is widely used in financial modeling. The Gaussian copula function has
the same structure as the cumulative distribution function (cdf) of standard
multivariate Gaussian random variables.
Definition 26 Gaussian Copula
The Gaussian copula function is
\[ C_G(u_1, \dots, u_n) = \Phi(\Phi_1^{-1}(u_1), \dots, \Phi_n^{-1}(u_n)), \quad u_i \in [0,1] \text{ for } i = 1, \dots, n, \]
where \(\Phi\) is the multivariate Gaussian cdf with mean zero and correlation matrix
A, and \(\Phi_i\) is the univariate standard Gaussian cdf.
Malevergne and Sornette [50] provide a method to build the joint distribution
using the Gaussian copula. The idea is briefly described here; details can be found
in [50]. Let \(\vec{X}\) be an n-dimensional random vector with marginal cdf \(F_i(x_i)\) and
pdf \(f_i(x_i)\) for \(X_i\), and let \(\vec{Z}\) be an n-dimensional standard Gaussian random vector,
linked to \(\vec{X}\) by the conservation of probability
\[ f_i(x_i)\,dx_i = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z_i^2}{2}\right) dz_i. \]
Integrating this equation gives
\[ F_i(x_i) = \Phi(z_i), \quad \text{where } \Phi(z_i) \text{ is the cdf of } Z_i, \]
\[ z_i = \Phi^{-1}(F_i(x_i)). \tag{2.9} \]
The map in Eq (2.9) is strictly increasing. Thus, by Theorem 25, \(\vec{X}\) and \(\vec{Z}\) have the
same copula C. \(\vec{Z}\) has a simple and well-defined dependence structure, which is the
covariance matrix A. The copula of \(\vec{Z}\) is determined by its multivariate cdf and is the
Gaussian copula. The joint distribution can then be easily obtained by combining this
Gaussian copula with the marginal distributions of \(\vec{X}\).
We are interested in applying the copula method to simulating multi-asset returns.
A model of dependence, termed the full-rank Gaussian copula (FGC), is employed.
It was proposed and studied by Malevergne and Sornette [51], and later summarized by
Madan and Khanna [39].
FGC can be a very useful tool to simulate multivariate non-Gaussian random
numbers in finance. Empirical studies ([17] [54] [32], and many others) indicate that
the distributions of returns have power-law tails, whose variance and covariance are
either not well-defined, or exist in principle but are hard to estimate accurately
because of the poor convergence of the sample estimators. These multivariate random
variables \(\vec{X}\) can first be transformed to standard Gaussian variables \(\vec{Z}\), which have
a well-defined dependence structure, the covariance matrix A, possibly of full rank. Then,
the correlation can be estimated more accurately than by direct estimation on the
original random samples of \(\vec{X}\). Next, multivariate Gaussian random numbers can be
simulated using the estimated covariance matrix \(\hat{A}\). Finally, the simulated Gaussian
random numbers are transformed back to get the random sample of the original random
vector \(\vec{X}\).
The algorithm is summarized below:
Algorithm 27 Multivariate Simulation Using FGC
Step 1. Calculate the values of cdf for sample of Xi with the marginal cdf Fi(xi)
P(Xi  x) = Fi(x)
Step 2. Transform the marginal distribution Fi(x) to the standard Gaussian vari-
able
zi =  1(Fi(xi)); (zi) is the cdf of the standard normal Zi
Step 3. Estimate the covariance matrix A with the transformed sample. The
74
estimated covariance matrix is denoted as  !A.
Step 4. Simulate multivariate Gaussian variables with  !A. The simulated random
numbers are denoted as eZ.
Step 5. Convert each eZi to fXi by
fXi = F 1i ((eZi))
eX are the simulated random numbers.
Remark:
1. By Theorem 25, any multivariate random variable $\vec{Y}$ that is connected with $\vec{X}$ by strictly increasing functions $Y_i = g_i(X_i)$ can be simulated by Algorithm 27. We can start from random samples of $\vec{X}$ in Step 1 and obtain the random numbers of $\vec{Y}$ in Step 5;
2. Algorithm 27 also works in the special case when $X$ is a univariate random variable.
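The five steps of Algorithm 27 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this study: the marginal cdf $F_i$ is taken to be the empirical cdf of the sample (the study uses fitted VG marginals), the estimated covariance is rescaled to a correlation matrix so that the simulated Gaussians have unit variance, and all function names are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # cdf is Phi, inv_cdf is Phi^{-1}


def gaussian_scores(sample):
    """Steps 1-2: z = Phi^{-1}(F(x)), with F the empirical cdf (an assumption)."""
    n = len(sample)
    order = sorted(range(n), key=lambda k: sample[k])
    z = [0.0] * n
    for rank, k in enumerate(order, start=1):
        z[k] = STD_NORMAL.inv_cdf(rank / (n + 1))  # rank/(n+1) stays inside (0, 1)
    return z


def correlation_matrix(columns):
    """Step 3: covariance of the Gaussian scores, rescaled to unit diagonal."""
    d, n = len(columns), len(columns[0])
    mean = [sum(c) / n for c in columns]
    cov = [[sum((columns[i][k] - mean[i]) * (columns[j][k] - mean[j])
                for k in range(n)) / (n - 1) for j in range(d)] for i in range(d)]
    return [[cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5 for j in range(d)]
            for i in range(d)]


def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (a[i][i] - s) ** 0.5 if i == j else (a[i][j] - s) / low[j][j]
    return low


def simulate_fgc(columns, n_sim, rng=None):
    """Steps 4-5: draw correlated Gaussians, map back by empirical quantiles."""
    rng = rng or random.Random()
    d = len(columns)
    low = cholesky(correlation_matrix([gaussian_scores(c) for c in columns]))
    sorted_cols = [sorted(c) for c in columns]
    draws = []
    for _ in range(n_sim):
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        z = [sum(low[i][k] * eps[k] for k in range(i + 1)) for i in range(d)]
        row = []
        for i in range(d):
            u = STD_NORMAL.cdf(z[i])
            idx = min(len(sorted_cols[i]) - 1, int(u * len(sorted_cols[i])))
            row.append(sorted_cols[i][idx])  # empirical version of F_i^{-1}(Phi(z_i))
        draws.append(row)
    return draws
```

Because the back-transformation uses empirical quantiles, every simulated value is one of the observed sample values; a parametric marginal would replace both the ranking in `gaussian_scores` and the quantile lookup in `simulate_fgc`.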
2.4 Numerical Implementation and Results
The expected returns are estimated for the S&P 500 Index and the first 50 stocks of the S&P 500 from January 1999 to October 2009. The estimation is conducted once every month, on a Wednesday in the middle of that month, for a total of 130 months. These 130 days are termed the estimation days. The S&P 500 is chosen as the proxy of the numeraire portfolio. The stock and option data used to estimate the expected returns are obtained from WRDS. The price data of eight sectors of the S&P 500 are obtained from Reuters to perform the statistical analysis.
2.4.1 Estimating Expected Returns
The expected return is estimated by calibrating one-month options^5, which is realized by minimizing one of the average absolute errors given in Section 2.3.1. RMSE (Eq 2.8) is chosen in our study,
$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} |P_R - V^H_{0t}|^2}$,
where $V^H_{0t}$ is the option's model price (Eq 2.6),
$V^H_{0i} = V_0 E^P_0\left[\frac{(S_{Ti} - K)^+}{V_T}\right]$,
in which $S_{Ti}$ and $V_T$ are modeled by Eq (2.7), $S_T = S_0 \exp\{\mu T + X(T) + \omega T\}$. The FGC simulation technique is employed in the calibration. To simulate $S_{Ti}$ (stock i) and $V_T$ (the proxy of the numeraire portfolio) in Eq (2.6), a distribution model at horizon T is required. Because of its better performance at longer horizons, the VG mixed model is employed in this study, as T is one month. The inputs of this model are the VG parameters $(\sigma_i, \nu_i, \theta_i)$ of the marginal distribution for each stock i at the unit time, which is one day in this study. On each estimation day, these parameters are estimated from the four years of daily stock price data prior to that day. The expected return $\mu_i$ of stock i and the other parameters $(c_i, \gamma_i)$ in the VG mixed model can then be estimated through iterations of simulating $(S_{Ti}, V_T)$ in the optimization. As the index $V_T$ also appears in Eq (2.6), we first need to estimate its parameters $(\mu, c, \gamma)$ with its option pricing formula
$V^H_0 = V_0 E^P_0\left[\frac{(V_T - K)^+}{V_T}\right]$.
5 Actual maturity varies from four to five weeks (roughly one month), depending on the number of days between the estimation day and the next available maturity.
The estimation is conducted on 130 estimation days from January 1999 to October 2009. On each estimation day the following procedure is employed. The S&P 500 (SPX) is used as the proxy of the numeraire portfolio.
1. Estimate the VG parameters $(\sigma, \nu, \theta)$ for SPX and the 50 stocks using the four years of daily asset price data prior to the estimation day.
2. Estimate $(\mu, c, \gamma)$ for SPX from the calibration of SPX one-month option data. The VG mixed model is employed to model the price of SPX.
3. Estimate $(\mu_i, c_i, \gamma_i)$ for each stock i from the calibration of its one-month option data.
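The objective minimized in steps 2 and 3 can be sketched as follows. This is a minimal illustration, assuming that paired scenarios of $(S_T, V_T)$ have already been simulated (in the study they come from the VG mixed model and FGC, which are not reproduced here); the function names are hypothetical.

```python
import math


def numeraire_call_price(v0, s_T, v_T, strike):
    """Monte Carlo estimate of V_0 * E^P[(S_T - K)^+ / V_T] from paired scenarios."""
    payoffs = [max(s - strike, 0.0) / v for s, v in zip(s_T, v_T)]
    return v0 * sum(payoffs) / len(payoffs)


def rmse(market_prices, model_prices):
    """Root mean squared error between market and model option prices."""
    n = len(market_prices)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(market_prices, model_prices)) / n)


def calibration_error(v0, s_T, v_T, strikes, market_prices):
    """Objective a calibration routine would minimize over the model parameters."""
    model = [numeraire_call_price(v0, s_T, v_T, k) for k in strikes]
    return rmse(market_prices, model)
```

In a full calibration, each trial value of the parameters would regenerate the scenarios `s_T` and `v_T` before `calibration_error` is evaluated.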
Figure 2-2 and Figure 2-3 present two sample calibration results. The former is from SPX option data on July 11, 2007, with an RMSE value of 2.01; the latter is from HPQ option data on the same day, with an RMSE of 0.098. Both calibrations use option data with a single maturity of one month.
Among all the estimated returns of the 51 assets on the 130 estimation days, 94.52% of the $\hat{\mu}$ values and 73.73% of the $\hat{\mu} - r_f$ values (risk premium, where $r_f$ is the risk-free rate) are positive. The percentage of positive risk premiums ($\hat{\mu} - r_f$) for each asset is presented in Table 2.1.
As an example, the estimated return $\hat{\mu}$ and the realized return $\tilde{\mu}$ of SPX are compared in Figure 2-4.
From Table 2.1 we can tell that the estimated risk premiums ($\hat{\mu} - r_f$) are positive most of the time, which is consistent with the argument in financial models that risky assets carry a positive risk premium. Figure 2-4 also shows that the estimated expected returns are more stable than the observed ones for SPX. Other stocks show similar results.
Figure 2-2: The fitted option data of SPX on July 11, 2007, with one-month maturity, RMSE=2.01
Figure 2-3: The fitted option data of HPQ on July 11, 2007, with one-month maturity, RMSE=0.0617
Figure 2-4: Estimated return $\hat{\mu}_{SPX}$ vs. realized return $\tilde{\mu}_{SPX}$ for SPX (January 1999 to October 2009)
Symbol  positive %    Symbol  positive %    Symbol  positive %
SPX     89.2          DOW     64.6          OXY     63.1
ABT     64.6          DD      68.5          ORCL    92.3
MO      53.8          EMR     71.5          PNC     76.9
AXP     88.5          XOM     63.1          PEP     59.2
AMGN    86.2          F       64.6          PFE     64.6
AAPL    93.1          GE      72.3          PG      54.6
BAC     67.7          HAL     80.8          SLB     78.5
BA      83.1          HPQ     90.0          TGT     89.2
CVS     82.3          HD      82.3          TXN     90.0
CAT     70.8          INTC    82.3          MMM     63.1
CVX     56.2          JNJ     47.7          UNP     79.2
CSCO    82.3          LLY     64.6          UTX     76.2
C       74.6          LOW     88.5          UNH     85.4
KO      60.0          MCD     70.0          VZ      64.6
CL      65.4          MRK     67.7          WMT     75.4
COP     59.2          MSFT    78.5          WAG     80.0
DIS     78.5          NKE     80.8          WFC     74.6
Table 2.1: Percentage of positive risk premiums for each asset over the 130 estimation days
The mean and the standard deviation of the estimated return $\hat{\mu}$ and the observed return $\tilde{\mu}$ for each asset are displayed in Table 2.2, which shows that the estimated returns $\hat{\mu}$ have lower standard deviations than the observed ones.
2.4.2 Statistical Analysis
The previous section shows that the expected return estimated using the numeraire-portfolio method performs better, in the sense of being more stable, than the realized return and its sample mean. Further investigation is required to test how well it estimates the expected return.
Name mean($\hat{\mu}$) std($\hat{\mu}$) mean($\tilde{\mu}$) std($\tilde{\mu}$) Name mean($\hat{\mu}$) std($\hat{\mu}$) mean($\tilde{\mu}$) std($\tilde{\mu}$)
AAPL 0.088 0.055 0.451 1.655 LLY 0.055 0.049 -0.02 0.869
ABT 0.048 0.039 0.076 0.772 LOW 0.085 0.070 0.132 1.113
AMGN 0.079 0.053 0.130 1.250 MCD 0.053 0.045 0.075 0.876
AXP 0.114 0.134 0.121 1.295 MMM 0.052 0.046 0.092 0.780
BA 0.065 0.048 0.129 1.051 MO 0.033 0.053 0.157 0.917
BAC 0.066 0.120 0.053 1.670 MRK 0.053 0.052 0.041 0.953
C 0.096 0.200 -0.009 1.894 MSFT 0.073 0.054 0.092 1.040
CAT 0.060 0.058 0.125 1.224 NKE 0.067 0.051 0.262 1.085
CL 0.049 0.044 0.057 0.664 ORCL 0.102 0.073 0.217 1.436
COP 0.046 0.052 0.165 0.919 OXY 0.052 0.067 0.298 0.971
CSCO 0.097 0.068 0.073 1.286 PEP 0.045 0.037 0.089 0.667
CVS 0.062 0.047 0.066 0.995 PFE 0.055 0.047 -0.017 0.832
CVX 0.042 0.048 0.109 0.735 PG 0.035 0.047 0.073 0.802
DD 0.059 0.054 0.016 1.015 PNC 0.080 0.077 0.024 1.191
DIS 0.076 0.065 0.072 0.991 SLB 0.060 0.053 0.194 1.115
DOW 0.061 0.080 0.103 1.503 TGT 0.086 0.062 0.120 1.066
EMR 0.065 0.057 0.070 0.783 TXN 0.089 0.054 0.075 1.294
F 0.060 0.131 -0.055 1.819 UNH 0.070 0.052 0.240 1.227
GE 0.076 0.082 0.019 1.052 UNP 0.065 0.058 0.110 0.898
HAL 0.070 0.064 0.188 1.518 UTX 0.068 0.055 0.148 0.874
HD 0.077 0.062 0.045 1.116 VZ 0.054 0.062 -0.008 0.749
HPQ 0.081 0.053 0.199 1.190 WAG 0.061 0.043 0.086 0.837
INTC 0.088 0.062 0.028 1.300 WFC 0.075 0.096 0.089 1.213
JNJ 0.034 0.037 0.095 0.601 WMT 0.066 0.048 0.051 0.728
KO 0.045 0.040 0.059 0.730 XOM 0.051 0.05 0.117 0.677
Table 2.2: Mean and std. of the estimated return $\hat{\mu}$ and realized return $\tilde{\mu}$ (annualized)
Let $R_{i,t+1}$ be asset i's return from t to t+1 and $\hat{\mu}_{i,t}$ be the associated estimated expected return from t to t+1 using the numeraire-portfolio method. $R_{i,t+1}$ is $\mathcal{F}_{t+1}$-measurable and $\hat{\mu}_{i,t}$ is $\mathcal{F}_t$-measurable, as it is obtained at time t from the option prices, which are $\mathcal{F}_t$-measurable. If $\hat{\mu}_{i,t}$ properly estimates the expectation of $R_{i,t+1}$, then the conditional expectation of the difference of these two returns is zero, i.e., $E_t[R_{i,t+1} - \hat{\mu}_{i,t}] = 0$, and furthermore $E[R_{i,t+1} - \hat{\mu}_{i,t}] = E[E_t(R_{i,t+1} - \hat{\mu}_{i,t})] = 0$. The generalized method of moments (GMM) [33] can be employed to test this hypothesis. In the GMM hypothesis test, the null hypothesis is $H_0: E[u] = 0$, where u is the orthogonality condition. In our study, let $z_t$, which is $\mathcal{F}_t$-measurable, be the vector of instrumental variables that may affect $R_{i,t+1}$ or $R_{i,t+1} - \hat{\mu}_{i,t}$. Thus, the null hypothesis becomes $E[(R_{i,t+1} - \hat{\mu}_{i,t}) z_t] = 0$.
Asset i's return $R_{i,t+1}$ can similarly be modeled by the classical asset return models, such as the Capital Asset Pricing Model (CAPM) [44] [60] [75] and the Fama-French three-factor model [28], which establish the relation between assets' expected returns and their risk attributes. In these models, an asset's expected return, and the return itself, are linearly determined by various factors that represent the different risks the asset is exposed to. The CAPM measures the asset's return through its sensitivity to one factor, the systematic or market risk:
$E[R_{i,t}] = r_{f,t} + \beta_{iM}(E(R_{M,t}) - r_{f,t})$,
where:
• $R_{M,t}$ is the market return from t-1 to t with expectation $E(R_{M,t})$
• $r_{f,t}$ is the risk-free rate
• $\beta_{iM}$ measures the sensitivity of asset i's expected return to the market return, and $\beta_{iM} = \frac{Cov(R_i, R_M)}{Var(R_M)}$
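As a small worked illustration of the two CAPM formulas above, the sketch below estimates $\beta_{iM}$ from sample moments and plugs it into the expected-return relation. The function names and the use of sample (n-1) moments are assumptions made for the example only.

```python
def capm_beta(asset_returns, market_returns):
    """Sample estimate of beta_iM = Cov(R_i, R_M) / Var(R_M)."""
    n = len(asset_returns)
    mean_i = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((a - mean_i) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns)) / (n - 1)
    var = sum((m - mean_m) ** 2 for m in market_returns) / (n - 1)
    return cov / var


def capm_expected_return(beta, expected_market_return, risk_free_rate):
    """E[R_i] = r_f + beta_iM * (E[R_M] - r_f)."""
    return risk_free_rate + beta * (expected_market_return - risk_free_rate)
```

For instance, an asset whose returns are exactly twice the market's has a beta of 2, and with a beta of 1.5, an expected market return of 8% and a risk-free rate of 2% the relation gives 11%.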
In the CAPM, only a single factor loading $\beta_{iM}$ is used to measure asset i's expected return, which oversimplifies the complicated situation in the market. The Fama-French three-factor model introduces two more risk factors, firm size and book-to-market ratio, which are represented by two classes of stocks: small cap stocks and value stocks (stocks with a high book-to-market ratio, BTM). These stocks tend to outperform the market. Fama and French include these two factors in their model to adjust for this outperformance tendency:
$R_{i,t} = r_{f,t} + \beta_{iM}(R_{M,t} - r_{f,t}) + \beta_{iS}\,SMB_t + \beta_{iH}\,HML_t + \varepsilon_{i,t}$,
where, in addition to the CAPM terms:
• SMB represents "Small Minus Big stocks," which is the excess return of small stocks over big ones
• HML represents "High BTM Minus Low BTM," which is the excess return of high BTM stocks over low BTM ones
• $\beta_{iS}$ and $\beta_{iH}$ measure the sensitivities to firm size and book-to-market ratio
Besides the risk factors in these two traditional models, other factors have been proposed by many people in academia and industry. MSCI Barra [82] suggests numerous factors in their industrial models. We choose two such factors in our analysis: the asset's own performance and the influence of the asset's sector^6. The third factor is the market risk in the CAPM and the Fama-French model. The returns associated with these three factors are also used as the instrumental variables $z_t$.
Ri;t+1 is Ft+1-measurable. bi;t and the three returns, namely the market return
RM;t, asset i?s return Ri;t, and the sector return RS;t are Ft-measurable. At time t, if
the null hypothesis is chosen to be
Et[(Ri;t+1  bi;t)zt] = 0; (2.10)
then the linear regression model assumed in our study is
Ri;t+1 bi;t = 0 +iM(RM;t rf;t)+ii(Ri;t rf;t)+iS(RS;t rf;t)+"i;t+1; (2.11)
where:
 RM;t is the market return at t
 RS;t is the sector?s return at t
 rf;t is the risk-free rate at t
 ii re?ects asset i?s exposure to its own performance
 iS measures asset i?s sensitivity to its sector
 iM and "i;t+1 are the same as those in the CAMP and the Fama-French?s model
6The category an asset belongs to, such as Exxon in energy sector.
85
 0 is the intercept.
If bi;t properly measures asset i?s return, then, all the betas in Eq (2.11) should
be zero. Under this scenario, the null hypothesis (2.10) can be rewritten as
Et[zt(yt+1  xt)] = 0;
where:
 zt is the column return vector (1;RM;t  rf;t;Ri;t  rf;t;RS;t  rf;t)
 yt+1 = Ri;t+1  bi;t
  is the column risk factor vector (0;iM;ii;iS)
This is equivalent to the ordinary least square (OLS) model [80] y = X+". The
estimator b = (XX) 1Xy is the same as the OLS estimator. After  is estimated
through OLS, the F-test in OLS can be performed, where the null hypothesis is
H0 :  = 0. The acceptance of this null hypothesis implies the acceptance of the
GMM null hypothesis (2.10).
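The estimator and the joint test described above can be sketched in plain Python. This is an illustrative sketch, not the code used in the study: the helper names are hypothetical, the normal equations are solved by Gaussian elimination for brevity, and only the F statistic is computed (a p-value would additionally need the F distribution's survival function, for example `scipy.stats.f.sf`).

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(X, y):
    """beta_hat = (X'X)^{-1} X'y via the normal equations."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def f_statistic_all_zero(X, y):
    """F statistic for H0: beta = 0 (every coefficient, intercept included)."""
    n, k = len(X), len(X[0])
    beta = ols(X, y)
    rss = sum((yi - sum(b * xi for b, xi in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    rss_restricted = sum(yi ** 2 for yi in y)  # under H0 the fitted value is 0
    return ((rss_restricted - rss) / k) / (rss / (n - k))
```

Each row of `X` plays the role of $z_t^\top$ (a leading 1 followed by the three excess returns) and each entry of `y` is $R_{i,t+1} - \hat{\mu}_{i,t}$.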
Linear regressions are run on two pairs of returns using Eq (2.11): one pair is the estimated returns $\hat{\mu}_{i,t}$ and the associated realized returns $R_{i,t+1}$; the other is the estimated returns $\hat{\mu}_{i,t}$ and the sample mean $\bar{R}_{i,t+1}$. Both regressions are conducted for 50 stocks using 130 data points from the 130 estimation days from January 1999 to October 2009. Let $t_i$ be the number of days to the next available maturity from the estimation day i on which $\hat{\mu}_{i,t}$ is estimated. $R_{i,t+1}$ is the associated realized return during this time period, and $R_{i,t}$, $r_{f,t}$, $R_{S,t}$ and $R_{M,t}$ are the realized returns during the $t_i$ days preceding the estimation day i. $\bar{R}_{i,t+1}$ is the sample mean calculated from the daily returns during the $t_i$ days following the estimation day i. All these returns are annualized.
Two hypothesis tests are conducted to test the beta values:
The t test determines the significance of each individual beta, with the null hypothesis $H_0: \beta_i = 0$ and the alternative hypothesis $H_1: \beta_i \neq 0$, where $\beta_i$ is $\beta_0$, $\beta_{iM}$, $\beta_{ii}$, or $\beta_{iS}$.
The F test is for the overall significance of all the betas, with the null hypothesis $H_0: \beta_0 = \beta_{iM} = \beta_{ii} = \beta_{iS} = 0$ and the alternative hypothesis $H_1$: at least one beta is not equal to zero.
Both tests are performed at the 95% confidence level. The test results are displayed in Tables 2.3 and 2.4. The results are also summarized below:
t test:
The hypothesis tests for both pairs indicate that, for a large portion of stocks, the hypothesis $\beta_i = 0$ is not rejected:
$R_{i,t+1} - \hat{\mu}_{i,t}$: 76% for $\beta_{ii}$, 90% for $\beta_{iM}$, 72% for $\beta_{iS}$, and 80% for $\beta_0$.
$\bar{R}_{i,t+1} - \hat{\mu}_{i,t}$: 76% for $\beta_{ii}$, 84% for $\beta_{iM}$, 76% for $\beta_{iS}$, and 66% for $\beta_0$.
F test:
The F test shows that the p-value of 34 out of 50 stocks is greater than 0.05, which means that, for these stocks, the hypothesis that all the betas are zero cannot be rejected at the 5% significance level.
Similar results are also attained using the Fama-French three-factor model. Thus, the numeraire-portfolio method provides a good approach to estimating expected returns.
Name indi.($\beta_{ii}$) p-value($\beta_{ii}$) indi.($\beta_{iM}$) p-value($\beta_{iM}$) indi.($\beta_{iS}$) p-value($\beta_{iS}$) indi.($\beta_0$) p-value($\beta_0$)
AAPL 0 0.138 0 0.799 0 0.854 1 0.004
ABT 0 0.612 0 0.328 0 0.513 0 0.348
AMGN 0 0.534 0 0.410 0 0.269 0 0.240
AXP 0 0.360 0 0.338 1 0.039 0 0.423
BA 1 0.005 0 0.422 1 0.033 0 0.371
BAC 0 0.052 0 0.887 0 0.309 0 0.804
C 0 0.770 0 0.867 0 0.707 0 0.366
CAT 0 0.195 0 0.157 0 0.067 0 0.260
CL 0 0.949 1 0.004 1 0.001 0 0.152
COP 0 0.543 0 0.632 0 0.678 0 0.111
CSCO 1 0.049 0 0.151 1 0.012 0 0.692
CVS 0 0.579 0 0.697 0 0.975 0 0.355
CVX 0 0.471 0 0.373 0 0.069 0 0.067
DD 1 0.006 0 0.122 1 0.001 0 0.992
DIS 1 0.042 0 0.300 0 0.881 0 0.405
DOW 0 0.506 0 0.333 0 0.367 0 0.963
EMR 0 0.944 0 0.334 0 0.811 0 0.633
F 0 0.875 0 0.573 0 0.471 0 0.117
GE 1 0.040 0 0.976 0 0.151 0 0.937
HAL 0 0.791 0 0.080 0 0.562 1 0.027
HD 1 0.018 0 0.290 1 0.006 0 0.966
HPQ 0 0.292 0 0.524 1 0.039 0 0.359
INTC 0 0.629 1 0.009 0 0.058 0 0.675
JNJ 0 0.488 0 0.099 1 0.001 0 0.137
KO 0 0.281 0 0.543 0 0.210 0 0.288
LLY 0 0.984 0 0.533 1 0.032 0 0.075
LOW 0 0.211 0 0.594 0 0.211 0 0.241
MCD 0 0.769 0 0.518 0 0.271 0 0.102
MMM 0 0.268 0 0.693 0 0.588 0 0.183
MO 1 0.010 0 0.845 1 0.014 1 0.001
MRK 0 0.665 0 0.576 0 0.725 0 0.703
MSFT 0 0.152 0 0.055 0 0.580 0 0.511
NKE 1 0.001 0 0.383 0 0.900 1 0.000
ORCL 1 0.000 1 0.033 0 0.385 0 0.314
OXY 0 0.085 0 0.626 0 0.779 1 0.000
PEP 1 0.003 0 0.735 0 0.189 1 0.029
PFE 1 0.025 0 0.935 1 0.025 0 0.958
PG 0 0.859 0 0.075 1 0.014 1 0.012
PNC 0 0.187 0 0.828 0 0.260 0 0.223
SLB 0 0.401 0 0.067 0 0.177 0 0.059
TGT 0 0.640 0 0.730 0 0.747 0 0.273
TXN 0 0.707 0 0.547 0 0.225 0 0.806
UNH 0 0.331 0 0.906 0 0.698 1 0.015
UNP 0 0.635 1 0.004 1 0.007 1 0.048
UTX 0 0.052 0 0.842 0 0.386 1 0.007
VZ 1 0.016 1 0.012 1 0.014 0 0.907
WAG 0 0.123 0 0.671 0 0.618 0 0.704
WFC 0 0.343 0 0.081 0 0.075 0 0.301
WMT 0 0.530 0 0.061 0 0.441 0 0.495
XOM 0 0.140 0 0.891 0 0.352 0 0.304
Table 2.3: t-test of each $\beta_i$ (indi. = 0 represents $\beta_i = 0$, indi. = 1 represents $\beta_i \neq 0$)
Name indi. p-value Name indi. p-value Name indi. p-value
AAPL 0 0.292 F 0 0.912 OXY 0 0.171
ABT 0 0.791 GE 0 0.146 PEP 1 0.028
AMGN 0 0.497 HAL 0 0.345 PFE 0 0.056
AXP 0 0.169 HD 1 0.008 PG 1 0.027
BA 1 0.020 HPQ 0 0.102 PNC 0 0.227
BAC 0 0.182 INTC 0 0.079 SLB 0 0.171
C 0 0.820 JNJ 1 0.002 TGT 0 0.714
CAT 0 0.320 KO 0 0.633 TXN 0 0.589
CL 1 0.001 LLY 1 0.009 UNH 0 0.796
COP 0 0.876 LOW 1 0.006 UNP 1 0.044
CSCO 0 0.097 MCD 0 0.417 UTX 0 0.238
CVS 0 0.930 MMM 0 0.213 VZ 1 0.013
CVX 0 0.105 MO 1 0.022 WAG 1 0.050
DD 1 0.014 MRK 0 0.792 WFC 0 0.339
DIS 0 0.169 MSFT 0 0.115 WMT 1 0.003
DOW 0 0.695 NKE 1 0.005 XOM 0 0.473
EMR 0 0.313 ORCL 1 0.003
Table 2.4: F test of $\beta$ (indi. = 0 represents $\beta = 0$, indi. = 1 represents $\beta \neq 0$)
2.5 Conclusion
Expected returns are determined by future uncertainty, which can be represented by option prices. The numeraire-portfolio pricing method links expected returns to option prices. This method states that the numeraire-denominated option price is the conditional expectation of the numeraire-denominated terminal payoff under the physical measure, which contains the information of the expected return. The expected return is estimated by option calibration, and a statistical analysis of the results is performed.
A couple of advantages of this method are summarized below:
1. Stocks are riskier than riskless assets such as the money market account. Therefore, their expected returns should be higher than the risk-free rate. Otherwise, it does not make sense to invest in them. Realized returns, which represent the history, are so volatile that they may not accurately reveal expected returns. For example, stocks could be outperformed by risk-free assets for a relatively long period: the stock market's return was on average less than that of the risk-free asset for eleven years, from 1973 to 1984 [35]. Furthermore, market conditions may change over time in the long run. Therefore, past average returns may not represent the current situation [6].
Expected returns estimated by the numeraire-portfolio method use option prices with a short maturity, e.g., one month. Thus, the price information revealed is "local" and it represents future uncertainty. The results in this study show that the risk premiums are positive most of the time and more stable than the realized returns. Furthermore, the OLS regressions indicate that the difference between the estimated returns and the realized returns is indifferent to two major risk factors for a large portion of assets. This result indicates that the estimated returns properly estimate the expected returns.
2. The traditional estimation using the CAPM and the Fama-French model requires the input of betas, the risk factors. However, there is no universally accepted agreement on which betas should be chosen. Academics and practitioners try to "fine gold" by data mining [6]. The uncertain input in the numeraire-portfolio method is the proxy of the numeraire portfolio. Numerous empirical studies show that market indices or equal-weighted/value-weighted portfolios can serve as good proxies. Thus, there is less uncertain input in the numeraire-portfolio approach than in the traditional methods.
Therefore, our study demonstrates that the numeraire-portfolio approach provides a good estimation of expected returns.
Future study may include a generalization of the estimation of the expected return. Options are one example of an instrument that represents future uncertainty. The idea of estimating the expected return has two requirements: first, we need to find a security that can reveal future uncertainty; second, this security must contain the information of the expected return. Futures may be one of the candidates that satisfy the above criteria.
Chapter 3
A New Approach to Portfolio
Selection
3.1 Introduction
In financial markets, investors often face the question of how to allocate their wealth among various assets, and in what sense. Modern portfolio theory (MPT), first articulated by Markowitz [55] [56], provides selection principles for maximizing a portfolio's expected return for a fixed variance, or minimizing the variance for a fixed level of expected return. These two principles define the efficient frontier, from which investors can choose their preferred portfolio with the optimal combination of gain (the expected return) and risk (defined as the standard deviation of return), where the investor's preference is the trade-off between gain and risk. Another important concept is diversification. Because assets are not perfectly correlated with one another, the variance of a properly constructed portfolio can be smaller than the sum of all the assets' variances. Thus, investors can reduce risk by holding a diversified portfolio instead of investing in individual assets. In modern portfolio theory, asset returns are assumed to be multivariate Gaussian random variables. To optimize a portfolio, investors first need to estimate each asset's expected return, its variance, and its correlation with other assets. Portfolio selection is well developed in both theory and implementation. We refer to Elton and Gruber's review paper [26], which surveys the literature on each topic in modern portfolio theory.
However, various aspects of modern portfolio theory have been questioned. As discussed in Chapter 1, empirical studies indicate that an individual asset's return is not normally distributed, which complicates the measurement of dependence. In this situation, covariance may not properly measure correlation. Another issue lies in the implementation procedure. The return in the model input is the expected return, which is the prediction of the asset's future return. In practice, the expected return is estimated from historical data, which does not necessarily provide a good prediction. Furthermore, as discussed in Chapter 2, historical data are very volatile and cannot give a relatively precise estimate.
In this chapter, we propose some alternative approaches to these questions. A non-Gaussian law, the VG mixed distribution described in Chapter 1, is employed to model the marginal distributions of asset returns. This model captures well the skewness and excess kurtosis patterns exhibited in the data. The joint law is formulated by FGC, a simulation technique proposed by Malevergne and Sornette [50] [51] and summarized by Madan and Khanna [39]. The FGC transforms all marginal random numbers to standard normal random numbers, and then constructs the dependence structure through the covariance matrix of the multivariate normal distribution, which is well defined as a measure of correlation. Last, the expected return estimated by the numeraire-portfolio method introduced in Chapter 2 is employed. The estimation is conducted on option prices, which can be viewed as predictions of the asset's future values. It is also demonstrated in Chapter 2 that this estimator is more stable than that from historical data. Thus, the expected return estimated by the numeraire-portfolio method is expected to serve as a better and more precise estimator.
The criteria used in portfolio optimization are another issue to consider. They include the mathematical formulation of the optimization and the measures for portfolio evaluation. Traditionally, the utility function is the objective function to be maximized in the optimization [56]. A variety of measures have been discussed to evaluate portfolios. The paper by Biglova et al. [5] provides a good review and also compares various measures (or risk estimations) that are all in the form of ratios between the expected return and certain risk measures.
New criteria are proposed in this study. We construct a portfolio from the buyer's side. To be a competitive player, the buyer should offer a price as high as possible while keeping the residual cash flow acceptable; this price is called the bid price [47]. It is defined based on the acceptability index developed by Cherny and Madan [14], and is the distorted expectation of the terminal payoff. Different risk levels lead to different bid prices. Given a risk level, the buyer can reach his or her highest profit by maximizing the bid price, which also depends on the composition of the portfolio. Thus, the bid price, or the distorted expectation, serves as the objective function, and the associated acceptability index evaluates the portfolio's performance.
The outline of the rest of the chapter is as follows. Section 3.2 introduces the criteria of the portfolio optimization, the bid price, and the acceptability index. In Section 3.3, the optimization problem is formulated and the numerical implementation and results are presented. Section 3.4 concludes with a summary.
3.2 Portfolio Evaluation - Acceptability Indices
In this section, we start with the traditional utility function, which leads to the concept of an acceptance set and the associated coherent risk measure, which measures the risk level of the acceptance set. The acceptability indices, in the sense of the coherent risk measure, are then introduced, along with examples. Finally, given a fixed acceptability level, the bid and ask prices are described; these will be employed as the objective function in the portfolio optimization problem.
3.2.1 Acceptance Sets and Coherent Risk Measure
In the classical portfolio optimization problem, an investor allocates wealth by maximizing the portfolio's utility function. If he or she starts from a position with a zero-cost portfolio, any position that increases the portfolio's terminal expected utility is acceptable to this investor. These positions form a convex set that contains the nonnegative terminal cash flows. Every investor has his or her own acceptable set, depending on the preference, or the utility function. The set that is accepted by all investors is the intersection of all these sets, which is a convex cone. It is called the acceptance set [1], and it also includes the nonnegative terminal cash flows. The acceptance sets are studied by Artzner et al. [1] and Carr, Geman, and Madan [10]. The model is set up for the random variable $X$, the terminal cash flow of zero-cost trades, on a probability space $(\Omega, \mathcal{F}, P)$. The risk measure $\rho(X)$ for the acceptance sets, defined as a mapping from the set of risks to the real line $\mathbb{R}$, is connected with the acceptance set by a nonempty set of probability measures, denoted as $D$, which are equivalent to $P$ [1] [10]. This risk measure is called a coherent risk measure [1]. Delbaen [21] further indicates that the coherent risk measure has the form
$\rho(X) = -\inf_{Q \in D} E^Q[X]$, (3.1)
and a trade $X$ is acceptable when $\rho(X) \le 0$.
3.2.2 Acceptability Indices
Based on the axioms of the coherent risk measure in [1], Cherny and Madan [14] define the "index of acceptability," a mapping $\alpha$ from the class of bounded random variables $X$ to the extended nonnegative real line $\mathbb{R}_+ = [0, \infty]$, which has the following four properties:
1. Monotonicity:
If $X$ is dominated by the random variable $Y$, $X \le Y$, then $\alpha(X) \le \alpha(Y)$.
2. Scale Invariance:
$\alpha(X)$ stays the same when $X$ is scaled by a positive number, $\alpha(cX) = \alpha(X)$ for $c > 0$.
3. Quasi-concavity:
If $\alpha(X) \ge \gamma$ and $\alpha(Y) \ge \gamma$, then $\alpha(X + Y) \ge \gamma$.
4. Fatou Property (Convergence):
Let $\{X_n\}$ be a sequence of random variables with $|X_n| \le 1$ that converges in probability to a random variable $X$. If $\alpha(X_n) \ge x$, then $\alpha(X) \ge x$.
$\alpha(X)$ can be considered as a degree of quality of the terminal cash flow $X$: the bigger the value of $\alpha(X)$, the closer $X$ is to an arbitrage. $\alpha(X) = +\infty$ represents arbitrage, in which case all random variables in the acceptable cone are nonnegative.
Under the above four conditions, a basic representation theorem is derived by Cherny and Madan [14], which connects the acceptability index $\alpha(X)$ to families of probability measures.
Theorem 28 Representation Theorem of Acceptability Indices
Let $L^\infty = L^\infty(\Omega, \mathcal{F}, \tilde{P})$ be the space of bounded random variables $X$. $\alpha(X)$ is an acceptability index, i.e., a map $\alpha: L^\infty \to [0, \infty]$ that satisfies conditions 1-4, if and only if there exists a family of subsets $\{D_\gamma : \gamma \ge 0\}$ of $\tilde{P}$ such that
$\alpha(X) = \sup\{\gamma \in \mathbb{R}_+ : \inf_{Q \in D_\gamma} E^Q[X] \ge 0\}$, (3.2)
and $\{D_\gamma : \gamma \ge 0\}$ is an increasing family of sets of probability measures, i.e., $D_\gamma \subseteq D_{\gamma'}$ for $\gamma \le \gamma'$.
Remark:
(1) The probability measures in $D_\gamma$ are absolutely continuous with respect to the original probability measure $P$ for $X$, and each $Q \in D$ is equivalent to $P$.
(2) Eq (3.2) indicates that $\alpha(X) = \gamma$ is the largest value that keeps the expectation of $X$ nonnegative under all probability measures in $D_\gamma$. This is roughly illustrated in Figure 3-1.
(3) Recall that the coherent risk measure $\rho(X)$ in Eq (3.1) defines the acceptable trades through $\rho(X) \le 0$. Writing $\rho_\gamma(X) = -\inf_{Q \in D_\gamma} E^Q[X]$, the acceptability index $\alpha(X)$ is linked with the coherent risk measure by
$\alpha(X) = \sup\{\gamma \in \mathbb{R}_+ : \rho_\gamma(X) \le 0\}$.
Thus, $\alpha(X)$ is the largest risk level $\gamma$ at which the cash flow $X$ is acceptable.
(4) The sets of probability measures $\{D_\gamma : \gamma \ge 0\}$ can be considered as pricing kernels, which will be discussed later.
Figure 3-1: Graphic illustration of the Representation Theorem
3.2.3 WVAR Acceptability Indices
There are many acceptability indices that satisfy the four conditions in Section 3.2.2. The weighted VAR (WVAR) acceptability indices [14] are used in this study because of their computational feasibility. The WVAR has the following form:
$\mathrm{WVAR}_\gamma(X) = -\int_{\mathbb{R}} x \, d\Psi^\gamma(F_X(x))$, (3.3)
where $F_X(x)$ is the cdf of the random variable $X$. $\{\Psi^\gamma : \gamma \ge 0\}$ is a set of increasing concave continuous functions $\Psi^\gamma : [0,1] \to [0,1]$, where $\Psi^\gamma(0) = 0$ and $\Psi^\gamma(1) = 1$. Furthermore, $\Psi^\gamma(y)$ increases in $\gamma$ for a fixed value of $y$. Thus, $\Psi^\gamma(y)$ can be viewed as a function that distorts the cdf $y = F_X(x)$, adding more weight to the losses, which correspond to the region where $F_X(x)$ is close to 0, i.e., where $X$ takes large negative values. $\Psi^\gamma(F_X(x))$ can again be seen as a probability distribution function, and WVAR can be viewed as the negative of a distorted expectation of the cash flow $X$.
Applying Eq (3.3) to Eq (3.2), the WVAR acceptability index $\alpha(X)$ is defined as
$\alpha(X) = \sup\{\gamma \in \mathbb{R}_+ : \int_{\mathbb{R}} x \, d\Psi^\gamma(F_X(x)) \ge 0\}$, (3.4)
where $\alpha(X)$ is the biggest $\gamma$ such that the distorted expectation is still nonnegative.
The expectation of $X$ is taken under a new probability measure $Q \in D_\gamma$ obtained by the measure change $\frac{dQ}{dP} = \Psi^{\gamma\prime}(F_X(X))$, the derivative of the distortion evaluated at $F_X(X)$, where $P$ is the original probability measure of $X$. $D_\gamma$ is the set of probability measures discussed in Section 3.2.2. When $\gamma$ increases, $\Psi^\gamma$ also increases, which distorts the cdf $F_X$ more and more to the left, or in other words, gives more weight to the losses. If the distorted expectation of $X$ still remains nonnegative in this situation, the trading strategy $X$ is more acceptable, as it can survive worse situations, and thus performs better. Therefore, $\alpha(X) = \gamma$ is a performance measure for the trade $X$: the higher $\gamma$, the better the performance of the trading strategy $X$. $\gamma$ can also be seen as a "stress level" for the cash flow $X$: the higher $\gamma$, the more stressed $X$ is.
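The computation of the distorted expectation is relatively simple. Given a sample $x_1, x_2, \ldots, x_N$ of the cash flow $X$, the numerical formula is
$\int_{\mathbb{R}} x \, d\Psi^\gamma(F_X(x)) = \sum_{i=1}^{N} x_{(i)} \left[\Psi^\gamma\left(\frac{i}{N}\right) - \Psi^\gamma\left(\frac{i-1}{N}\right)\right]$, (3.5)
where $\{x_{(i)}\}$ are the sample values sorted in increasing order and $\frac{i}{N}$ is the value of the empirical distribution function at $x_{(i)}$. Without distortion, i.e., $\Psi^\gamma(y) = y$, each weight equals $\frac{i}{N} - \frac{i-1}{N} = \frac{1}{N}$ for all $i = 1, 2, \ldots, N$.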
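Eq (3.5) translates almost directly into code. The sketch below is a minimal illustration; the function name is hypothetical and the distortion is passed in as an ordinary Python callable `psi` mapping [0, 1] to [0, 1].

```python
def distorted_expectation(sample, psi):
    """Eq (3.5): sum of sorted values weighted by increments of the distortion psi."""
    xs = sorted(sample)
    n = len(xs)
    return sum(x * (psi(i / n) - psi((i - 1) / n)) for i, x in enumerate(xs, start=1))
```

With the identity distortion the result is the ordinary sample mean. With a concave distortion such as `lambda u: 1 - (1 - u) ** 2`, the two-point sample `[-1, 1]` receives weights 0.75 and 0.25, so the distorted expectation drops from 0 to -0.5.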
Four WVAR indices are provided in [14], namely MINVAR, MAXVAR, MINMAXVAR, and MAXMINVAR.
We first look at a simple case. Let $Y \stackrel{law}{=} \min\{x_1, x_2, \ldots, x_{\gamma+1}\}$, where $x_1, x_2, \ldots, x_{\gamma+1}$ are $\gamma + 1$ independent draws from $X$. If the cdf of $X$ is $z = F_X(x)$, then by order statistics, the distribution function of $Y$ is given by
$\Psi^\gamma(z) = 1 - (1 - z)^{1+\gamma}$, where $z \in [0,1]$, $\gamma \ge 0$. (3.6)
In this case, the index is the largest $\gamma$ such that the expected value of the minimum of $\gamma + 1$ draws still remains nonnegative. A larger value of $\gamma$ represents a better performance of the trade $X$. This acceptability index is termed MINVAR, as it is related to the minimum of a number of independent draws.
Now let us check how the distortion in MINVAR reweights losses and gains. Differentiating the distortion function (3.6) gives
$\frac{d\Psi^\gamma(F_X(x))}{dx} = (\gamma + 1)(1 - F_X(x))^{\gamma} f_X(x)$,
where $f_X(x)$ is the pdf of $X$. This derivative shows that the MINVAR distortion adds more weight to large losses (when $F_X(x)$ is close to 0) and reduces the weight on large gains (when $F_X(x)$ is close to 1). However, the weight on large losses is bounded by $\gamma + 1$, so large losses cannot be reweighted to infinitely large levels. Thus, a modified MINVAR is considered: $\Psi^\gamma(z) = 1 - (1 - z^{\frac{1}{1+\gamma}})^{1+\gamma}$, with derivative
$\frac{d\Psi^\gamma(F_X(x))}{dx} = \left(1 - F_X(x)^{\frac{1}{1+\gamma}}\right)^{\gamma} F_X(x)^{-\frac{\gamma}{1+\gamma}} f_X(x)$.
In this case, the weight on large losses tends to infinity and the weight on large gains tends to zero. This index is called MINMAXVAR, which will be implemented in this study.
Details of the other two indices, MAXVAR and MAXMINVAR, can be found in [14].
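The MINVAR and MINMAXVAR distortions, together with the index of Eq (3.4), can be sketched as follows. This is an illustration only: the function names, the bisection search, and the cap `upper` on the search range are assumptions of the sketch, which relies on the distorted expectation being nonincreasing in the level $\gamma$.

```python
def minvar(gamma):
    """MINVAR distortion: psi(u) = 1 - (1 - u)^(1 + gamma)."""
    return lambda u: 1.0 - (1.0 - u) ** (1.0 + gamma)


def minmaxvar(gamma):
    """MINMAXVAR distortion: psi(u) = 1 - (1 - u^(1/(1+gamma)))^(1 + gamma)."""
    return lambda u: 1.0 - (1.0 - u ** (1.0 / (1.0 + gamma))) ** (1.0 + gamma)


def distorted_expectation(sample, psi):
    """Eq (3.5) applied to an empirical sample."""
    xs = sorted(sample)
    n = len(xs)
    return sum(x * (psi(i / n) - psi((i - 1) / n)) for i, x in enumerate(xs, start=1))


def acceptability_index(sample, family, upper=100.0, tol=1e-8):
    """Largest gamma in [0, upper] with nonnegative distorted expectation (bisection)."""
    if distorted_expectation(sample, family(0.0)) < 0.0:
        return 0.0  # not acceptable even without distortion
    if distorted_expectation(sample, family(upper)) >= 0.0:
        return upper  # index is at least the search cap
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if distorted_expectation(sample, family(mid)) >= 0.0:
            lo = mid
        else:
            hi = mid
    return lo
```

For the two-point sample `[-1, 3]` under MINVAR, the distorted expectation is $-1 + 4 \cdot 0.5^{1+\gamma}$, which crosses zero at $\gamma = 1$, so `acceptability_index([-1.0, 3.0], minvar)` is approximately 1.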
3.2.4 Bid and Ask Prices
Given a fixed acceptability index and level, we study the corresponding acceptable price for the terminal cash flow $X$, from either the seller's or the buyer's point of view, namely the ask price and the bid price [47], respectively. Let us derive a trading strategy from the buyer's side. If $b$ is the bid price, then the buyer's residual cash flow is $X - b$. To be competitive in the market, the buyer should offer as much as possible. If the residual cash flow $X - b$ is required to be $\gamma$-acceptable (acceptable at level $\gamma$) under a given acceptability index $\alpha$, then the competitive bid price is the largest such price and is defined as

$$b_{\gamma}(X) = \sup\{b : \alpha(X - b) \ge \gamma\}. \tag{3.7}$$
For the bid price, by Theorem 28, we have

$$\inf_{Q \in D_{\gamma}} E^{Q}[X - b] \ge 0 \iff b \le \inf_{Q \in D_{\gamma}} E^{Q}[X].$$

By Eq (3.7), $b_{\gamma} = \inf_{Q \in D_{\gamma}} E^{Q}[X]$. Therefore, the bid price is the infimum of the expectation of $X$ over all $Q \in D_{\gamma}$ at the fixed level $\gamma$.
If WVAR is chosen to be the acceptability index, by Eq (3.4), we have

$$\int_{\mathbb{R}} x \, d\Psi^{\gamma}(F_{X-b}(x)) \ge 0 \iff -b + \int_{\mathbb{R}} x \, d\Psi^{\gamma}(F_X(x)) \ge 0.$$

By Eq (3.7), the competitive bid price is

$$b_{\gamma}(X) = \int_{\mathbb{R}} x \, d\Psi^{\gamma}(F_X(x)), \tag{3.8}$$

the distorted expectation of the terminal cash flow $X$.
Similarly, suppose the seller has the residual cash flow $a - X$, required to be acceptable at level $\gamma$ under the acceptability index $\alpha$, where $a$ is the ask price. The competitive ask price is the minimal such price, defined by

$$a_{\gamma}(X) = \inf\{a : \alpha(a - X) \ge \gamma\}.$$

By a similar procedure, given the acceptable level $\gamma$, we can derive the competitive ask price

$$a_{\gamma} = \sup_{Q \in D_{\gamma}} E^{Q}[X],$$

and

$$a_{\gamma}(X) = -\int_{\mathbb{R}} x \, d\Psi^{\gamma}(F_{-X}(x)),$$

if WVAR is used as the acceptability index.
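As an illustration of the bid and ask formulas, the sketch below estimates both prices from a simulated sample using MINMAXVAR. It assumes that the sample version of the distorted expectation is the usual order-statistic sum $\sum_i x_{(i)} [\Psi^{\gamma}(i/N) - \Psi^{\gamma}((i-1)/N)]$; the Gaussian cash flow and all function names are hypothetical choices made for the example.

```python
import random

def minmaxvar(z, gamma):
    """MINMAXVAR distortion: 1 - (1 - z**(1/(1+gamma)))**(1 + gamma)."""
    return 1.0 - (1.0 - z ** (1.0 / (1.0 + gamma))) ** (1.0 + gamma)

def distorted_expectation(samples, gamma):
    """Order-statistic estimate of the distorted expectation of a sample.

    Each sorted value x_(i) receives weight Psi(i/N) - Psi((i-1)/N),
    so the smallest outcomes (largest losses) get the largest weights.
    """
    xs = sorted(samples)
    n = len(xs)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs, start=1):
        cur = minmaxvar(i / n, gamma)
        total += x * (cur - prev)
        prev = cur
    return total

def bid_price(samples, gamma):
    """Bid price as in Eq. (3.8): distorted expectation of X."""
    return distorted_expectation(samples, gamma)

def ask_price(samples, gamma):
    """Ask price: minus the distorted expectation of -X."""
    return -distorted_expectation([-x for x in samples], gamma)

# Hypothetical terminal cash flow: standard normal draws.
rng = random.Random(1)
cash_flow = [rng.gauss(0.0, 1.0) for _ in range(5000)]
mean = sum(cash_flow) / len(cash_flow)
for gamma in (0.0, 0.10, 0.25):
    print(gamma, round(bid_price(cash_flow, gamma), 4), round(mean, 4),
          round(ask_price(cash_flow, gamma), 4))
```

At $\gamma = 0$ the distortion is the identity and both prices collapse to the sample mean; as $\gamma$ increases the bid falls below the mean and the ask rises above it, so the bid-ask spread widens with the required level of acceptability.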
3.3 Numerical Implementation and Results
3.3.1 Trading Strategy
A trading strategy over a single period is implemented. As a buyer, we construct a stock-only portfolio at time 0 with a bid price $b$ and maturity $t$. All of the payoff, or cash flow, $X$ is delivered at maturity. The resulting residual cash flow $X - b$ is required to be acceptable at level $\gamma$, with WVAR as the acceptability index $\alpha$. The buyer maximizes the bid price $b$, which is the distorted expectation given by Eq (3.8) in Section 3.2.4. Different asset weights result in different trading strategies and hence different final cash flows $X$, so the distorted expectation depends on the weights. The optimal weights are those that maximize the distorted expectation for the buyer.
Let $w = (w_1, \ldots, w_i, \ldots, w_n)$ and $R = (R_1, \ldots, R_i, \ldots, R_n)$ be the weight and return vectors of the portfolio, where $n$ is the total number of stocks. The cash flow $x$ is defined as $x = w \cdot R$, which is the portfolio's return. The portfolio optimization problem is formulated as follows:

$$\max_{w} \int_{\mathbb{R}} x \, d\Psi^{\gamma}(F_X(x)), \tag{3.9}$$

$$\text{s.t.} \quad \sum_{i=1}^{n} w_i = 1, \quad -1 \le w_i \le 1.$$
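A toy version of problem (3.9) can be sketched as follows. This is not the procedure used in this study: the simulated VG mixed returns are replaced by independent Gaussian returns for three hypothetical stocks, the optimizer is replaced by a coarse grid search over the feasible weights, and the distorted expectation is estimated by the order-statistic sum with MINMAXVAR. All names and parameter values are assumptions made for the illustration.

```python
import random

def minmaxvar(z, gamma):
    """MINMAXVAR distortion: 1 - (1 - z**(1/(1+gamma)))**(1 + gamma)."""
    return 1.0 - (1.0 - z ** (1.0 / (1.0 + gamma))) ** (1.0 + gamma)

def distorted_expectation(samples, gamma):
    """Order-statistic estimate of the distorted expectation."""
    xs = sorted(samples)
    n = len(xs)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs, start=1):
        cur = minmaxvar(i / n, gamma)
        total += x * (cur - prev)
        prev = cur
    return total

def simulate_returns(n_samples, seed=0):
    """Toy stand-in for the simulation step: independent Gaussian returns."""
    rng = random.Random(seed)
    means = (0.08, 0.03, -0.02)   # hypothetical expected returns
    vols = (0.20, 0.10, 0.15)     # hypothetical volatilities
    return [[rng.gauss(m, s) for m, s in zip(means, vols)]
            for _ in range(n_samples)]

def grid_search(returns, gamma, step=0.25):
    """Maximize the distorted expectation of w.R over a weight grid.

    Constraints of (3.9): weights sum to 1 and each lies in [-1, 1].
    The third weight is fixed by the budget constraint.
    """
    n_steps = int(round(2.0 / step))
    grid = [-1.0 + k * step for k in range(n_steps + 1)]
    best_w, best_val = None, float("-inf")
    for w1 in grid:
        for w2 in grid:
            w3 = 1.0 - w1 - w2
            if not -1.0 <= w3 <= 1.0:
                continue
            w = (w1, w2, w3)
            x = [sum(wi * ri for wi, ri in zip(w, row)) for row in returns]
            val = distorted_expectation(x, gamma)
            if val > best_val:
                best_w, best_val = w, val
    return best_w, best_val

returns = simulate_returns(2000)
weights, value = grid_search(returns, gamma=0.10)
print(weights, round(value, 4))
```

A grid search is only practical for a handful of assets; for a 50-stock portfolio a constrained numerical optimizer would be needed, but the objective evaluated at each candidate weight vector has the same form.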
The objective function in (3.9) can be computed numerically by Eq (3.5), in which $N$ samples of $x$ are obtained by simulation, with $x = w \cdot R = \sum_{i=1}^{n} w_i R_i$, where $R_i$ has the VG mixed distribution [22]. The stock's log return is given by Eq (1.39), reformulated as

$$r_i = \ln \frac{S^{i}_{t}}{S^{i}_{0}} = \mu_i t + X^{(i)}_{VGMixed}(t) + \omega_i t, \tag{3.10}$$

where $\mu_i$ is the expected return, which can be estimated by the numeraire-portfolio method. The stock's return $R_i$ then equals $\exp(r_i) - 1$.
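The structure of Eq (3.10) can be illustrated with a plain variance gamma (VG) process in place of the VG mixed law, whose definition is given in Chapter 1 and is not reproduced here. The sketch assumes the standard VG representation as Brownian motion with drift $\theta$ and volatility $\sigma$ evaluated at a gamma time with variance rate $\nu$, and it assumes that $\omega$ is the usual convexity correction $\omega = \frac{1}{\nu} \ln(1 - \theta\nu - \sigma^2\nu/2)$, which makes $E[\exp(r)] = \exp(\mu t)$. Parameter values and function names are hypothetical.

```python
import math
import random

def vg_convexity_correction(sigma, nu, theta):
    """omega such that E[exp(X_VG(t) + omega * t)] = 1 (assumed form)."""
    arg = 1.0 - theta * nu - 0.5 * sigma * sigma * nu
    if arg <= 0.0:
        raise ValueError("VG parameters violate the exponential moment condition")
    return math.log(arg) / nu

def simulate_log_returns(n, mu, sigma, nu, theta, t, seed=0):
    """Draw n log returns r = mu*t + X_VG(t) + omega*t with a plain VG X."""
    rng = random.Random(seed)
    omega = vg_convexity_correction(sigma, nu, theta)
    out = []
    for _ in range(n):
        # Gamma time change with shape t/nu and scale nu: mean t, variance nu*t.
        g = rng.gammavariate(t / nu, nu)
        x = theta * g + sigma * math.sqrt(g) * rng.gauss(0.0, 1.0)
        out.append(mu * t + x + omega * t)
    return out

def simple_returns(log_returns):
    """Convert log returns r to simple returns R = exp(r) - 1."""
    return [math.exp(r) - 1.0 for r in log_returns]

log_rets = simulate_log_returns(50000, mu=0.05, sigma=0.2, nu=0.5,
                                theta=-0.1, t=1.0)
rets = simple_returns(log_rets)
# The sample mean of 1 + R should be close to exp(mu * t).
print(round(1.0 + sum(rets) / len(rets), 4), round(math.exp(0.05), 4))
```

The check at the end reflects the role of $\omega_i t$ in (3.10): it removes the drift introduced by exponentiating the jump component, so that $\mu_i$ carries the expected return.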
3.3.2 Procedure and Results
The results and data obtained in Chapter 2 are employed in this study. The trading strategy is implemented on the 130 estimation days of Chapter 2, ranging from January 1999 to October 2009. On each day, a portfolio containing the first 50 stocks from the S&P 500 (SPX) is constructed, and the holding period is the same as the time-to-maturity of the options used in the calibration. The FGC in Algorithm 27 is implemented to simulate the multivariate random variable $r = (r_1, \ldots, r_i, \ldots, r_n)$ with the VG mixed distribution as the marginal law of each $r_i$. The inputs of FGC are all obtained from the results in Chapter 2: the VG parameters $(\sigma_i, \nu_i, \theta_i)$ from the marginal daily return, the expected return $\mu_i$ estimated by the numeraire-portfolio method ($\hat{\mu}_i$), and the two parameters of the VG mixed distribution at time horizon $t$ ($c_i$ and its accompanying parameter). Thus, the only parameters left in the objective function (3.9) are the weights, which are obtained by optimization. The optimal portfolio is named the estimated-return portfolio (ERP).
Two reference portfolios are constructed for comparison with the estimated-return portfolio. They use the same input parameters as the estimated-return portfolio except the one for the expected return $\mu_i$: one, using the realized return $R_i$, is named the realized-return portfolio (RRP); the other, using the sample mean $\bar{R}_i$, is named the mean-return portfolio (MRP). On each of the 130 estimation days, these two portfolios are obtained by optimizing the objective function (3.9).
The actual return of the optimized portfolio at maturity can be calculated by $\sum_{i=1}^{n} \hat{w}_i f_i$, where $\hat{w}_i$ is stock $i$'s weight in the optimized portfolio, and $f_i$ is stock $i$'s actual return during the period from the estimation day to maturity. The actual return is calculated for each of the three optimized portfolios.
Besides comparing with the two reference portfolios, we are also interested in comparing the performance of the estimated-return portfolio with that of the market index, which is SPX in this study. An adjustment of SPX's return is required before the comparison because of the difference in leverage. In the estimated-return portfolio, the weight of each stock ranges from $-1$ to $1$ and there are 50 stocks in the portfolio, whereas the weight in SPX can be considered as 1. Thus, the estimated-return portfolio is leveraged compared to SPX. To set them to the same leverage level, the return of SPX is multiplied by leverage ratios, which are defined as follows:

$$l_{pos} = \sum_{i=1}^{n} \hat{w}_i^{+} \quad \text{and} \quad l_{neg} = \sum_{i=1}^{n} \hat{w}_i^{-},$$

where

$$\hat{w}_i^{+} = \begin{cases} \hat{w}_i, & \text{if } \hat{w}_i > 0 \\ 0, & \text{otherwise} \end{cases} \qquad \hat{w}_i^{-} = \begin{cases} \hat{w}_i, & \text{if } \hat{w}_i < 0 \\ 0, & \text{otherwise.} \end{cases}$$
We have two leveraged returns for SPX,

$$R^{+}_{SPX} = l_{pos} R_{SPX}, \qquad R^{-}_{SPX} = |l_{neg}| R_{SPX},$$

where $R_{SPX}$ is SPX's actual return during the period from the estimation day to maturity. These two leveraged indices are named the positive-leveraged SPX and the negative-leveraged SPX.
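The realized portfolio return and the two leverage adjustments can be sketched as follows. The weights, stock returns and index return in the example are made-up numbers, and the function names are ours.

```python
def portfolio_return(weights, realized_returns):
    """Actual portfolio return: sum of w_i * f_i over the stocks."""
    return sum(w * f for w, f in zip(weights, realized_returns))

def leverage_ratios(weights):
    """Return (l_pos, l_neg): sums of the positive and negative weights."""
    l_pos = sum(w for w in weights if w > 0)
    l_neg = sum(w for w in weights if w < 0)
    return l_pos, l_neg

def leveraged_spx_returns(weights, r_spx):
    """Positive- and negative-leveraged SPX returns for given weights."""
    l_pos, l_neg = leverage_ratios(weights)
    return l_pos * r_spx, abs(l_neg) * r_spx

# Hypothetical three-stock example with one short position.
weights = [0.8, -0.3, 0.5]
stock_returns = [0.10, 0.20, -0.10]
print(portfolio_return(weights, stock_returns))
print(leverage_ratios(weights))
print(leveraged_spx_returns(weights, r_spx=0.10))
```

Because the weights sum to 1, $l_{pos} + l_{neg} = 1$, so $l_{pos} \ge 1$ whenever any stock is held short.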
The three optimized portfolios are constructed at different risk levels, namely $\gamma = 0.05$, $0.10$, $0.15$, $0.20$, and $0.25$. MINMAXVAR is employed as the acceptability index.
At each risk level, cumulative returns are calculated to compare the performances of the five portfolios, namely the estimated-return portfolio, the realized-return portfolio, the mean-return portfolio, the positive-leveraged SPX, and the negative-leveraged SPX. The cumulative returns are graphed in Figure 3-2 to Figure 3-6, one figure per risk level. The cumulative returns of the estimated-return portfolio at the different risk levels are shown in Figure 3-7 to check the effect of the risk level on its performance. The statistics of the returns for each portfolio at every risk level are displayed in Tables 3.1 and 3.2. All returns are annualized before the analysis.
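It is observed from the figures that the estimated-return portfolio is superior to the two reference portfolios, and that its performance is even better than SPX at all risk levels except $\gamma = 0.05$. The tables confirm the first observation at every risk level. For the comparison with SPX, Table 3.1 shows the mean return of the estimated-return portfolio exceeding both leveraged SPX means at $\gamma = 0.20$ and $0.25$. Furthermore, Table 3.2 shows that the return of the estimated-return portfolio is less volatile than the positive-leveraged SPX at every risk level, and less volatile than the negative-leveraged SPX at every level except $\gamma = 0.25$. Where its mean return is also higher, the estimated-return portfolio is therefore preferable to SPX in mean-variance terms.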
Figure 3-2: Cumulative returns of the five portfolios ($\gamma = 0.05$)
Figure 3-3: Cumulative returns of the five portfolios ($\gamma = 0.10$)
Figure 3-4: Cumulative returns of the five portfolios ($\gamma = 0.15$)
Figure 3-5: Cumulative returns of the five portfolios ($\gamma = 0.20$)
Figure 3-6: Cumulative returns of the five portfolios ($\gamma = 0.25$)
Figure 3-7: Estimated-return portfolio at different risk levels $\gamma$ (AIX = MINMAXVAR)
                      ERP       RRP       MRP      SPX+      SPX-
$\gamma = 0.05$   -0.0176    -0.031   -0.0319    0.0156     0.015
$\gamma = 0.10$   -0.0175   -0.0873   -0.0824    0.0156    0.0174
$\gamma = 0.15$    0.0078   -0.2722   -0.2272    0.0156    0.0226
$\gamma = 0.20$    0.0264   -0.4262   -0.3671    0.0156    0.0193
$\gamma = 0.25$    0.0394   -0.3704   -0.3829    0.0156    0.0252
Table 3.1: Mean of portfolio return at different risk levels (January 1999 - October 2009)

                      ERP       RRP       MRP      SPX+      SPX-
$\gamma = 0.05$    0.2895    0.4373    0.4159    0.5964    0.5666
$\gamma = 0.10$    0.2740    0.9312    0.9438    0.5964    0.5406
$\gamma = 0.15$    0.2797    2.4156    2.5823    0.5964    0.4539
$\gamma = 0.20$    0.3155    3.4108    3.5483    0.5964    0.3989
$\gamma = 0.25$    0.3599    3.9118    3.9843    0.5964    0.3429
Table 3.2: Std. of portfolio return at different risk levels (January 1999 - October 2009)
3.4 Conclusion
A new method is proposed for the classic portfolio selection problem. Several new approaches are employed in this method: portfolios are constructed in a non-Gaussian environment; the FGC technique is employed to construct the complicated dependence structure; a new estimator for the expected return is used, which is expected to provide a better and more precise estimate; finally, new criteria, the distorted expectation and the acceptability index, are employed to formulate the optimization mathematically and to evaluate the portfolio performance.
Three kinds of portfolios are built with the same settings in the portfolio selection procedure, except for the estimator used as the expected return input. These portfolios are compared with two leveraged versions of SPX. The comparison of the five portfolios is conducted at each of the five risk levels. We observe that the estimated-return portfolio outperforms the two reference portfolios, which show consistent losses. This indicates that the estimated return from the numeraire-portfolio method is an effective estimator of the asset's expected return, compared to estimators based on historical data. Furthermore, this estimator may also serve as a good input to the portfolio optimization at higher acceptability levels ($\gamma$ near or above $0.20$), because the portfolio performs similarly to the market index at these levels.
Future work following this study can test the portfolio performance under various scenarios, including different optimization criteria (e.g., a utility function as the objective function) and a wider variety of portfolio components, such as options and bonds. The purpose is to find out whether, in more general scenarios, the numeraire-portfolio method provides an effective estimator for the expected return in portfolio selection.
Bibliography
[1] Artzner, P., F. Delbaen, J. Eber, and D. Heath (1999). Coherent Measures of Risk, Mathematical Finance, 9: 203–228.
[2] Bachelier, L. (1900). Théorie de la spéculation, Annales de l'Ecole Normale Supérieure, 17: 21–86. (English translation: Cootner, P., ed. 1964, The random character of stock market Prices, MIT Press: Cambridge, MA).
[3] Barndorff-Nielsen, O. E. (1997). Normal inverse Gaussian distributions and stochastic volatility modelling, Scandinavian Journal of Statistics, 2: 41–68.
[4] Bartholdy, J. and P. Peare (2005). Estimation of expected return: CAPM vs. Fama and French, International Review of Financial Analysis, 14: 407–427.
[5] Biglova, A., S. Ortobelli, S. Rachev and S. Stoyanov (2004). Different approaches to risk estimation in portfolio theory, Journal of Portfolio Management, 31: 103–112.
[6] Black, F. (1995). Estimating expected return, Financial Analysts Journal, 51: 168–171.
[7] Black, F. and M. Scholes (1973). The pricing of options and corporate liabilities, Journal of Political Economy, 81: 637–654.
[8] Breeden, D. (1979). An intertemporal asset pricing model with stochastic consumption and investment opportunities, Journal of Financial Economics, 7: 265–296.
[9] Bühlmann, H. and E. Platen (2002). A discrete time benchmark approach for finance and insurance, Working paper.
[10] Carr, P., H. Geman, and D. B. Madan (2001). Pricing and hedging in incomplete markets, Journal of Financial Economics, 62: 131–67.
[11] Carr, P., H. Geman, D. B. Madan, and M. Yor (2002). The fine structure of asset returns: An empirical investigation, Journal of Business, 75: 305–332.
[12] Carr, P., H. Geman, D. B. Madan, and M. Yor (2007). Self-decomposability and option pricing, Mathematical Finance, 17: 31–57.
[13] Carr, P. and D. B. Madan (1998). Option valuation using the fast Fourier transform, Journal of Computational Finance, 2: 61–73.
[14] Cherny, A. and D. B. Madan (2009). New measures for performance evaluation, Review of Financial Studies, 22: 2571–2606.
[15] Claus, J. and J. Thomas (2001). Equity premia as low as three percent? Evidence from analysts' earnings forecasts for domestic and international stock markets, Journal of Finance, 56: 1629–1666.
[16] Cochrane, J. H. (2001). Asset Pricing, Princeton University Press.
[17] Cont, R. (2001). Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1: 223–236.
[18] Cont, R., J.-P. Bouchaud, and M. Potters (1997). Scaling in stock market data: stable laws and beyond, in Scale Invariance and Beyond, Dubrulle, B., F. Graner, and D. Sornette, eds., Springer: Berlin.
[19] Cont, R. and P. Tankov (2004). Financial modelling with jump processes, Chapman & Hall/CRC.
[20] Cox, J. C. and S. A. Ross (1976). The valuation of options for alternative stochastic processes, Journal of Financial Economics, 3: 145–166.
[21] Delbaen, F. (2002). Coherent risk measures on general probability spaces, in K. Sandmann and P. Schönbucher (eds.), Advances in Finance and Stochastics: Essays in Honor of Dieter Sondermann, Berlin: Springer, 1–37.
[22] Eberlein, E. and D. B. Madan (2010). The distribution of returns at longer horizons, Working paper.
[23] Eberlein, E. and F. Özkan (2003). Time consistency of Lévy models, Quantitative Finance, 3: 40–50.
[24] Eberlein, E. and K. Prause (2000). The generalized hyperbolic model: financial derivatives and risk measures, Mathematical Finance-Bachelier Congress.
[25] Edwards, E. O. and P. W. Bell (1961). The theory and measurement of business income, New York, John Wiley and Sons.
[26] Elton, E. J. and M. J. Gruber (1997). Modern portfolio theory, 1950 to date, Journal of Banking & Finance, 21: 1743–1759.
[27] Embrechts, P., F. Lindskog, and A. McNeil (2003). Modelling dependence with copulas and applications to risk management, in Rachev, S. (Eds.) Handbook of Heavy Tailed Distributions in Finance, Elsevier, Chapter 8: 329–384.
[28] Fama, E. F. and K. R. French (1993). Common risk factors in the returns on stocks and bonds, Journal of Financial Economics, 33: 3–56.
[29] Fama, E. F. and J. D. MacBeth (1974). Long-term growth in a short-term market, Journal of Finance, 29: 857–885.
[30] Gabaix, X., P. Gopikrishnan, V. Plerou, and H. E. Stanley (2003). A theory of power-law distributions in financial market fluctuations, Nature, 423: 267–270.
[31] Galloway, M. L., and C. A. Nolder (2007). Option pricing with selfsimilar additive processes, Working paper.
[32] Gopikrishnan, P., V. Plerou, L. A. Nunes Amaral, M. Meyer, and H. E. Stanley (1999). Scaling of the distribution of fluctuations of financial market indices, Physical Review E, 60: 5305–5316.
[33] Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators, Econometrica, 50: 1029–1054.
[34] Harrison, J. M. and D. M. Kreps (1979). Martingales and arbitrage in multiperiod securities markets, Journal of Economic Theory, 20: 381–408.
[35] Ibbotson Associates (1995). Stocks, bonds, bills and inflation, Yearbook, Ibbotson Associates, Chicago, Ill.
[36] Jacod, J. and A. N. Shiryaev (1987). Limit theorems for stochastic processes, Berlin: Springer-Verlag.
[37] Keller, U. (1997). Realistic modelling of financial derivatives, Dissertation. Mathematische Fakultät der Albert-Ludwigs-Universität Freiburg im Breisgau.
[38] Kelly, J. R., Jr. (1956). A new interpretation of information rate, Bell System Technical Journal, 35: 917–926.
[39] Khanna, A. and D. B. Madan (2009). Non Gaussian models of dependence in returns, Working paper.
[40] Khintchine, A. Y. (1938). Limit laws of sums of independent random variables, ONTI, Moscow, Russia.
[41] Knight, F. B. (2001). On the path of an inert object impinged on one side by a Brownian particle, Probability Theory Related Fields, 121: 577–598.
[42] Lévy, P. (1937). Théorie de l'Addition des Variables Aléatoires, Paris: Gauthier-Villars.
[43] Li, D. X. (2000). On default correlation: a copula function approach, Journal of Fixed Income, 9: 43–54.
[44] Lintner, J. (1965). The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets, Review of Economics and Statistics, 47: 13–37.
[45] Long, J. B. (1990). The numeraire portfolio, Journal of Financial Economics, 26: 29–69.
[46] Lucas, R. E., Jr. (1978). Asset prices in an exchange economy, Econometrica, 46: 1429–1445.
[47] Madan, D. B. (2010). Pricing and hedging basket options to prespecified levels of acceptability, Quantitative Finance, 10: 607–615.
[48] Madan, D., P. Carr, and E. Chang (1998). The variance gamma process and option pricing, European Economic Review, 2: 79–105.
[49] Madan, D. and E. Seneta (1990). The variance gamma (VG) model for share market returns, Journal of Business, 63: 511–524.
[50] Malevergne, Y. and D. Sornette (2004). Multivariate Weibull distributions for asset returns: I, Finance Letters, 2: 16–32.
[51] Malevergne, Y. and D. Sornette (2005). High-order moments and cumulants of multivariate Weibull asset return distributions: Analytical theory and empirical tests: II, Finance Letters, 3: 54–63.
[52] Mandelbrot, B. (1963). The variation of certain speculative prices, Journal of Business, 36: 394–419.
[53] Mandelbrot, B. (1997). Fractals and scaling in finance: discontinuity, concentration, risk: selecta volume E.
[54] Mantegna, R. N., and H. E. Stanley (1995). Scaling behaviour in the dynamics of an economic index, Nature, 376: 46–49.
[55] Markowitz, H. M. (1952). Portfolio selection, Journal of Finance, 7: 77–91.
[56] Markowitz, H. M. (1959). Portfolio selection: efficient diversification of investments, John Wiley & Sons, New York.
[57] Mehra, R. and E. C. Prescott (1985). The equity premium: a puzzle, Journal of Monetary Economics, 15: 145–161.
[58] Merton, R. C. (1973). Theory of rational option pricing, Bell Journal of Economics and Management Science, 4: 141–183.
[59] Merton, R. C. (1976). Option pricing when underlying stock returns are discontinuous, Journal of Financial Economics, 3: 125–144.
[60] Mossin, J. (1966). Equilibrium in a Capital Asset Market, Econometrica, 34: 768–783.
[61] Nelsen, R. B. (1999). An introduction to copulas, Springer-Verlag, New York.
[62] Ohlson, J. (1995). Earnings, book values and dividends in security valuation, Contemporary Accounting Research, 11: 661–687.
[63] Peters, E. E. (1999). Complexity, risk and financial markets, Wiley, New York.
[64] Philips, T. K. (2003). Estimating expected returns, Journal of Investing, 12: 49–57.
[65] Platen, E. and D. Heath (2006). A Benchmark Approach to Quantitative Finance, Springer Finance, Springer.
[66] Platen, E. (2009). A benchmark approach to investing and pricing, Working paper.
[67] Prause, K. (1999). The generalized hyperbolic model: Estimation, financial derivatives, and risk measures, Doctoral Thesis, University of Freiburg.
[68] Preinreich, G. (1938). Annual survey of economic theory: The theory of depreciation, Econometrica, 6: 219–241.
[69] Roll, R. (1973). Evidence on the "Growth-Optimum" model, Journal of Finance, 28: 551–66.
[70] Roll, R. (1977). A critique of the asset pricing theory's tests Part I: On past and potential testability of the theory, Journal of Financial Economics, 4: 129–176.
[71] Rubinstein, M. (1976). The valuation of uncertain income streams and the pricing of options, Bell Journal of Economics, 7: 407–425.
[72] Sato, K. (1999). Lévy processes and infinitely divisible distributions, Cambridge: Cambridge University Press.
[73] Schoutens, W. (2001). The Meixner processes in finance, EURANDOM Report 2001–2001, EURANDOM, Eindhoven.
[74] Schoutens, W. (2003). Lévy processes in finance: pricing financial derivatives, John Wiley & Sons Inc.
[75] Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk, Journal of Finance, 19: 425–442.
[76] Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges, Publications de l'Institut de Statistique de l'Université de Paris, 8: 229–231.
[77] Vasicek, O. (1977). An equilibrium characterization of the term structure, Journal of Financial Economics, 5: 177–188.
[78] Walker, J. S. (1996). Fast Fourier Transforms, CRC Press, Boca Raton, Florida.
[79] Welch, I. (2000). Views of financial economists on the equity premium and on professional controversies, The Journal of Business, 73: 501–537.
[80] http://en.wikipedia.org/wiki/Generalized_method_of_moments
[81] http://www.fftw.org
[82] http://www.mscibarra.com
[83] http://wrds.wharton.upenn.edu