ABSTRACT

Title of dissertation: A NEW LÉVY BASED SHORT-RATE MODEL FOR THE FIXED INCOME MARKET AND ITS ESTIMATION WITH PARTICLE FILTER

Bing Zhang, Doctor of Philosophy, 2006

Dissertation directed by: Professor Dilip Madan, Smith School of Business, University of Maryland, College Park

In this thesis two contributions are made to the area of mathematical finance. First, in order to explain the non-trivial skewness and kurtosis observed in the time-series data of constant maturity swap (CMS) rates, we employ pure jump Lévy processes, in particular the Variance Gamma process, to model the variation of unobservable economic factors. It is the first model to include Lévy dynamics in short rate modeling. Specifically, the Vasicek [51] type of short rate framework is adopted, where the short rate is an affine combination of three mean-reverting state variables. Zero-coupon bonds and a few fixed income derivatives are priced under the model based on the transform method in Carr et al [13]. It is expected that the Lévy based short rate model gives more realistic explanations of yield curve movements than Gaussian-based models.

Second, the model parameters are estimated by the particle filter (PF) technique. The PF has not seen wide application in the field of financial engineering, partly due to its stringent requirement on computing capability. However, given cheap computing cost nowadays, the PF method is a flexible yet powerful tool for estimating state-space models with non-Gaussian dynamics, such as Lévy-based models. To customize the PF algorithm to our model, the continuous-time Lévy short rate model is cast into discrete form by a first-order forward Euler approximation. The PF technique is used to retrieve values of the unobservable factors by sequentially using readily available market prices. The optimal set of model parameters is obtained by invoking quasi-maximum likelihood estimation.
A NEW LÉVY BASED SHORT-RATE MODEL FOR THE FIXED INCOME MARKET AND ITS ESTIMATION WITH PARTICLE FILTER

by Bing Zhang

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2006

Advisory Committee: Professor Dilip Madan, Chair/Advisor; Professor Michael Fu; Professor David Levermore; Professor Konstantina Trivisa; Professor Haluk Unal

© Copyright by Bing Zhang 2006

DEDICATION

In memory of my father Zhang, Jianguo. To my mother Bao, Xinhua.

ACKNOWLEDGMENTS

I want to first give my utmost gratitude to a respectful professor, an extraordinary scholar, a skillful professional, and a lovable friend: my advisor Dr. Dilip Madan. Since becoming his student, I have witnessed and have been deeply touched by his passion for knowledge, his diligence in working, his humbleness in learning, his intolerance for ambiguity, as well as his sense of humor. These observations have been the driving force that moved me forward in my pursuit of the PhD throughout these years. I cannot recall in retrospect how many times I wandered outside, with a handful of questions, before entering his office, nervous, hopeless and lacking self-confidence, but after talking to him my mind was always filled with clarity and order. I am greatly indebted to such a rare individual for his expertise, patience, and help. It is impossible to utter my entire gratitude in one page, so the rest is just left unsaid.

I want to thank Prof. Michael Fu for his excellent organization of our weekly financial seminar, where I acquired extensive information and knowledge on the most recent developments in mathematical finance. Also, his outstanding teaching and off-class help on stochastic processes and simulation will be a life-time benefit. I would like to acknowledge the help from Dr. Peter Carr at Bloomberg LP in New York.
It was a wonderful experience spending the 2005 summer as an intern in his quantitative financial research group. There I received close nurturing from Peter on a project of pricing foreign exchange barrier options under a stochastic skew model. I also want to express my gratitude for his generous advice and help on my employment.

I thank Prof. Dave Levermore for funding my living during my doctoral education; otherwise I would not have completed the dissertation alive. I would also like to extend my gratitude to Prof. Konstantina Trivisa and Prof. Haluk Unal for agreeing to serve as PhD committee members.

I would also like to say thanks to my fellow classmates. Christian Silva guided me through my first finance project; Samvit Prakash initiated a math finance students' gathering regarding our academic/career development; Qing Xia and I spent numerous hours together discussing math finance problems; Huaqiang Ma has been good company at academic conferences; etc. I would like to express my grateful feelings to the staff members Ms. Alverda McCoy and Ms. Liz Wincek for the laughter and joy they brought to me on each frustrating and endless research day. I am sure this enumeration is far from complete for the people to whom I owe a debt; any inadvertent omission from the list is my fault.

In the end, I want to thank my graceful wife Fei Lin. Without her love and support, this dissertation would not have been possible.

TABLE OF CONTENTS

List of Tables
List of Figures
1 Introduction
2 Lévy Processes in Finance and Monte Carlo Simulation
  2.1 Background
  2.2 Mathematics
  2.3 Pure jump Lévy market models
    2.3.1 Variance Gamma process
    2.3.2 CGMY process
    2.3.3 Market models
  2.4 Monte Carlo simulation
    2.4.1 Simulate VG and CGMY
    2.4.2 Simulate VG and CGMY as time-changed Brownian motion
    2.4.3 Importance sampling under Lévy processes
    2.4.4 Numerical results
3 The 3-Factor Lévy Based Short Rate Model
  3.1 Background
  3.2 Overview of fixed income modeling
    3.2.1 Bond market and money account
    3.2.2 Forward rate and Libor rate
    3.2.3 Change of numeraire
  3.3 Current short rate models
  3.4 3-Factor Lévy short rate model dynamics
    3.4.1 Motivation
    3.4.2 Model dynamics
  3.5 Pricing of bond and bond derivatives
    3.5.1 Joint characteristic function
    3.5.2 Bond
    3.5.3 Swaps
    3.5.4 Swaptions
    3.5.5 Caps/Floors
4 Particle Filter and Parameter Estimation
  4.1 Background
  4.2 The problem and conceptual solution
  4.3 Kalman filter
  4.4 Particle filter
    4.4.1 Importance sampling in particle filter
    4.4.2 Sequential importance sampling
    4.4.3 Selection of the importance density
    4.4.4 Resampling
    4.4.5 Generic particle filter algorithm
  4.5 Estimate the 3-factor Lévy short rate model
    4.5.1 Data description
    4.5.2 Quasi-maximum likelihood estimation with particle filter
    4.5.3 Estimation results and discussion
  4.6 Yield curve analysis and pricing
    4.6.1 Yield curve factor loading
    4.6.2 Price caplet using simulation
5 Conclusion
A Appendix
  A.1 Solution derivation of OU process, Eqn. (3.9)
  A.2 Proof of the Bayes rule in the update stage for the conceptual solution of the filtering problem, Eqn. (4.4)
  A.3 Proof of the posterior expansion, Eqn. (4.17)
Bibliography

LIST OF TABLES

2.1 VG process simulation as the difference of two gamma random variables
2.2 CGMY process simulation as compound Poisson and Brownian motion approximation
2.3 VG process simulation as time-changed Brownian motion
2.4 CGMY process simulation as time-changed Brownian motion
2.5 Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as a compound Poisson process, 10^4 samples. N.R. stands for Not Rejected
2.6 Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as time-changed Brownian motion, 10^4 samples. N.R. stands for Not Rejected
2.7 Parameters and specifications for the numerical illustration of the VG measure change
3.1 ICA statistics for swap rates of 10 maturities
3.2 Numbers of model parameters for the original model and the reduced model
4.1 Sequential importance sampling algorithm
4.2 Resampling algorithm
4.3 Particle filter algorithm
4.4 In-sample maximum likelihood parameter estimates of the 3-factor Lévy short rate model with particle filter using 10-year data from 04/25/1994 through 10/13/2004
4.5 In-sample estimation performance statistics of the 3-factor Lévy short rate model with particle filter using 10-year data from 04/25/1994 through 10/13/2004
4.6 Implied volatility charts of model-generated caplet values of tenor 3-month on 10/13/2004 with maturity from 1 to 9 years and moneyness from at-the-money 1 to out-of-the-money 1.1

LIST OF FIGURES

2.1 Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.5, C = 0.8, G = 30, M = 30, Y = 0.5, 2x10^4 samples
2.2 Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.3, C = 0.5, G = 20, M = 10, Y = 0.5, 2x10^4 samples
2.3 CPU time comparison of compound Poisson and time-changed Brownian motion for simulating CGMY(t; C, G, M, Y) with t = 0.2, C = 0.1, G = 20, M = 10 and varying Y, 10^4 samples
2.4 Histogram comparison for VG(t; C, G, M) and VG(t; C', G', M') with parameters listed in Table (2.7), 2x10^4 samples
2.5 Convergence comparison of pricing out-of-the-money put options for VG(t; C, G, M) and VG(t; C', G', M') with importance sampling, parameters listed in Table (2.7)
3.1 US treasury yield curve on Feb. 09, 2005
4.1 Resampling scenario and mechanism in particle filter
4.2 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for Eurodollar future on Libor(3m, 3m) and lower 2 panels for Eurodollar future on Libor(6m, 3m)
4.3 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for Eurodollar future on Libor(12m, 3m) and lower 2 panels for 2-year swap rate
4.4 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for 3-year swap rate and lower 2 panels for 5-year swap rate
4.5 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for 10-year swap rate and lower 2 panels for 20-year swap rate
4.6 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: panels for 30-year swap rate
4.7 Yield curve and factor loading
4.8 Implied volatility surface of model-generated caplet values, 10/13/2004

Chapter 1

Introduction

The past few decades have seen rapid development in the fixed income market worldwide. As bonds issued by governments and corporations are actively traded between banks and funds, financial institutions on Wall Street have been aggressive in designing and trading bond derivatives to provide insurance to investors. This phenomenal progress has stimulated an exuberance of efforts, from both academia and industry, in mathematically modeling the dynamics exhibited by the fixed income market.

Modeling the dynamics of the short rate, which is the instantaneous increment of a unit deposit, has been one of the most popular approaches to modeling interest rate term structures. This stream of work was pioneered by Vasicek [51] and Cox-Ingersoll-Ross [18], and followed by Hull-White [36], etc. A common feature that all such models share is to describe the randomness in the interest rate term structure by Brownian motion. The use of Brownian motion implies that the conditional distribution of the modeled object is normal, and one only needs the mean and variance to characterize the distribution.

However, in the context of modeling financial assets, e.g. interest rates, such an assumption is in contradiction to many aspects of the real market. For instance, the time series data of swap rates show that significant higher probabilistic moments, i.e. skewness/kurtosis, persist (see [24]).
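The excess-moment point can be checked numerically. The sketch below is illustrative only: the fat-tailed series is a synthetic Gaussian mixture standing in for weekly rate changes, not the swap-rate data of this dissertation. It shows how a simple sample-moment computation separates a fat-tailed series from a Gaussian one.

```python
import random

def sample_moments(xs):
    """Return (skewness, excess kurtosis) of a sample."""
    n = len(xs)
    m = sum(xs) / n
    cs = [x - m for x in xs]
    m2 = sum(c * c for c in cs) / n
    m3 = sum(c ** 3 for c in cs) / n
    m4 = sum(c ** 4 for c in cs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

random.seed(0)
# Stand-in for weekly rate changes: a Gaussian mixture whose occasional
# high-volatility draws create fat tails (hypothetical, not market data).
fat = [random.gauss(0, 1) if random.random() < 0.95 else random.gauss(0, 5)
       for _ in range(20000)]
norm = [random.gauss(0, 1) for _ in range(20000)]

print(sample_moments(norm))  # excess kurtosis near 0 for the Gaussian sample
print(sample_moments(fat))   # markedly positive excess kurtosis
```

A Gaussian model forces both quantities to zero, which is exactly what the swap-rate data contradict.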
In addition, the evidence from the fixed income derivative market indicates that the normality assumption breaks down too, as the Black-implied volatilities1 backed out from caps/floors/swaptions prices are markedly non-constant, whereas they should have been flat if normality held (see [9]).

Because the market clearly does not support the normality assumption, in this dissertation I seek to remedy these discrepancies in the existing fixed income models. I adopt an approach that has seen great success in modeling equity market dynamics. This approach, termed pure-jump Lévy models, was originally proposed by Madan et al in [42], and some other researchers [50].

A Lévy process is a stochastic process with independent and stationary increments. It can be decomposed into three independent components: a deterministic drift process, a continuous-path mean-zero diffusion and a jump process. A drifted Brownian motion is the special case of a Lévy process without jumps, while the general Lévy setting permits flexible jump structures in addition to the diffusion and drift. In terms of building models for asset returns, it has been argued in Madan et al [42] that there is no absolute necessity to include a Brownian motion in the underlying asset movement, because in reality no asset's value moves continuously (at the least, the movement is limited by the minimum tick size, e.g. 1 cent for traded stocks on the New York Stock Exchange). Backed by this notion, the aforementioned pure-jump Lévy market models use infinitely active jump kernels, which make small jumps arrive infinitely frequently, in order to remove the diffusion component.

1The Black-implied volatility is the market convention to quote caps/floors/swaptions based on the Black formula [6]. The calculation of such a quantity is thus a reverse-engineering problem to back out a model parameter, the Black-implied volatility, from the known prices.
Such models can flexibly incorporate the skewness and kurtosis exhibited in the market data by specifying the jump kernel. Specifically, an uneven jump measure yields the desired skewness, and a heavy tail weight in the jump measure provides adequate kurtosis. A few pure jump Lévy models for the equity market, including the Variance Gamma (VG) model which is used in this work, have proved sound in fitting both the time series of asset returns under the physical measure and the smile/skew volatility surface under the risk-neutral measure. As evidence of their market success, some of those models have reportedly been used by a couple of prestigious financial institutions on Wall Street to evaluate their equity option positions on a daily basis.

In this dissertation, I develop a 3-factor Lévy-based short rate model whose dynamics are

r(t) = alpha + beta' x(t),
dx(t) = kappa (a - x(t)) dt + B dL(t),   (1.1)

where the short rate r(t) is a scalar process expressed as an affine combination of factors x(t) driven by Lévy dynamics dL(t) in the background. The affine structure is specified by the scalar coefficient alpha and vector coefficient beta, and the vector x(t) follows a multidimensional mean-reverting Ornstein-Uhlenbeck (OU) process. The vector a denotes the long-run mean-reverting level for the factors x(t), kappa the mean-reverting strength, and B the correlating matrix.

This is the first model utilizing the Lévy structure dL(t) in the short rate, as opposed to the existing Brownian motion-based models. Multi-factor models are more effective in explaining interest rate movements, as one-factor models cannot decouple the correlations between bonds across different maturities. The 3-factor model is in line with the conclusion from the seminal paper by Litterman et al [41], which found that 3 factors are adequate to describe 99% of the yield curve data. The model follows the Vasicek equilibrium modeling framework (see [51]) by specifying the process under the physical measure.
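The dynamics (1.1) can be illustrated with a small simulation. This is a hedged sketch, not the estimated model: B is taken as the identity, the VG increments use the difference-of-gammas representation developed in Chapter 2, and all parameter values (alpha, beta, kappa, a, and the VG parameters C, G, M) are hypothetical placeholders rather than the estimates reported in Chapter 4.

```python
import random

random.seed(1)

def vg_increment(dt, C, G, M):
    """One VG increment over dt, as the difference of two gamma variates
    with shape C*dt and scales 1/M and 1/G (difference-of-gammas form)."""
    return (random.gammavariate(C * dt, 1.0 / M)
            - random.gammavariate(C * dt, 1.0 / G))

# Hypothetical parameter values for illustration only.
alpha, beta = 0.02, [0.4, 0.35, 0.25]   # affine map r = alpha + beta'x
kappa = [0.1, 0.5, 1.5]                 # mean-reversion strengths
a = [0.03, 0.02, 0.01]                  # long-run mean-reverting levels
dt, n_steps = 1.0 / 52, 520             # weekly steps, 10 years

x = a[:]                                # start the factors at their means
for _ in range(n_steps):
    for j in range(3):
        # first-order Euler step: dx = kappa*(a - x)*dt + dL
        x[j] += kappa[j] * (a[j] - x[j]) * dt + vg_increment(dt, 5.0, 60.0, 70.0)

r = alpha + sum(b * xj for b, xj in zip(beta, x))
print(round(r, 4))
```

Each factor mean-reverts toward its level a_j while the VG shocks inject skewed, fat-tailed noise; the short rate is then read off through the affine map.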
Because we are primarily concerned with pricing, the measure change from the physical measure to the risk-neutral pricing measure is developed. Under the risk-neutral measure, closed-form formulas for pricing bonds, swaps, caps/floors and swaptions via their characteristic functions are derived based on the transform method in Carr et al [13].

Parameter estimation is of critical importance in applying financial models. Although the Lévy-based model is structurally superior to Gaussian-based models, it is computationally more expensive and requires more sophisticated algorithms for parameter estimation. In order to acquire a stable estimation of the parameters, the model is first cast into state-space format. In other words, the state variables are modeled as latent/unobservable factors while the model outputs observable measurements, i.e. the prices of the fixed income products. The idea of the state-space model is formalized in Eqn. (1.2)

x_t = (I - e^{-kappa Delta t}) a + e^{-kappa Delta t} x_{t-1} + B Delta L_t,
z_t = O(x_t; Theta) + e_t,   (1.2)

where the first equation approximates the OU process solution by the first-order Euler method. Theta is the set of parameters to be estimated, and O(x_t; Theta) in the second equation represents the non-linear pricing function of the state variables x_t and Theta. z_t are the output prices and e_t are the normally distributed pricing errors.

We are given 10-year weekly-sampled time series across 9 different asset values as data input. Maximum likelihood estimation (MLE) is used as the estimation algorithm, and the likelihood function is constructed for the errors e_t between the market time-series quotes and the model output prices. In the context of state-space model estimation, such a joint error likelihood function is built with the help of a filter technique such as the Kalman filter or its variants. However, the widely-used Kalman filter requires strict theoretical assumptions, i.e.
the linearity assumption on the measurement and propagation functions and the normality assumption on the dynamics. Therefore, despite its power and convenience, the Kalman filter (KF) is not immediately applicable to our model, since the randomness is not Gaussian, nor are the pricing functions linear. We opt to use its counterpart, the particle filter (see [7, 20]), as our choice for estimating the model.

The particle filter (PF) had not seen massive application until recently, when computation costs were reduced tremendously. In contrast to the KF, which only updates two essential quantities, i.e. the mean and variance, the rationale behind the PF is to use a large number of particles to represent the entire distribution. The movement of the distribution is carried out by applying forward Monte Carlo simulation sequentially. Therefore, by design, the particle filter offers great flexibility, as it can accommodate arbitrary dynamic structures and measurement functions. In a nutshell, at each time step a particle filter first makes an ex-ante prediction of the unobservable state variables, denoted x-hat_t. Once the new market observation z_t is received, the filter updates the state variables under the Bayes theorem. In theory, the updated state variables x_t converge to their true values, which are used to output the model prices via O(x_t; Theta). A filter technique recursively performs the prediction-and-update procedure at all time steps till it reaches the end of the time series. The joint error likelihood function is then a multidimensional normal distribution across all sampled dates and assets, and the logarithm of such a likelihood function is given as

L(Theta) = -(1/2) sum_{t=1}^{M} sum_{i=1}^{N} (z_t^i - ztilde_t^i)' (R_i)^{-1} (z_t^i - ztilde_t^i),   (1.3)

where z_t^i are the market quotes and ztilde_t^i the model outputs. R represents the covariance matrix of the errors. M and N are the number of sampling days and asset classes, respectively. Maximization of this likelihood function yields the optimal model parameters as follows

Theta* = arg max_Theta L(Theta)
. (1.4)

In this dissertation, Chapter 2 introduces the Lévy process as a mathematical concept. It discusses a couple of popular pure-jump Lévy market models and their simulation algorithms. As an important tool to reduce simulation variance, the importance sampling technique for jump processes is studied. Chapter 3 focuses on the development of the 3-factor Lévy-based short rate model. The measure change from the physical measure to the pricing (risk-neutral) measure is derived. Under the risk-neutral measure, closed-form formulas for pricing bonds, swaps, caps/floors and swaptions via their characteristic functions are developed. In Chapter 4, the maximum likelihood estimation algorithm for the Lévy short rate model with particle filter is discussed in detail. The estimation results show that the model has certain forecasting power on the Libor and swap rates. In addition, the yield curve is analyzed using the factor loading methodology, which facilitates hedging and risk management. We take the caplet as an example of pricing fixed income derivatives. The model prices for caplets show qualitative improvement over the existing models, but they also indicate that the model is inadequate to price fixed income derivatives without further seasoning.

Chapter 2

Lévy Processes in Finance and Monte Carlo Simulation

2.1 Background

Mathematical finance has been an active research area since the birth of the Black-Merton-Scholes (BMS) equation in 1973. However, the constant volatility assumption in BMS was obviously violated by the market observation of non-constant volatility across maturities and strikes. To explain this so-called "volatility smile/skew" phenomenon, diligent and astute researchers have proposed numerous alternatives to the BMS theory. The attempts can be grouped into three classes:

1.
Local volatility model: The rationale underlying the local volatility model is that future volatilities are deterministic functions of the underlying value and calendar time, and these functions are implied by the current vanilla option prices. The model was developed by Dupire [10], Rubinstein [48], and Derman and Kani [19] in three independent efforts. The model retains the convenience of the Black-Scholes type of hedging argument. Because of its simplicity, traders use the local volatility model to price and hedge exotic options after the model is calibrated to the vanilla option market.

2. Stochastic volatility model: The second alternative is to randomize the volatility by a second Markovian stochastic process. This stream of efforts was pioneered by Hull and White [36], Heston [34], etc. In those models, a non-zero correlation between the spot and volatility processes is assigned to reproduce the skewness across strikes and maturities. The Heston stochastic volatility model is one of the market standards for derivative pricing and risk management.

3. Jump model: The third approach is to keep the dynamics a one-dimensional Markovian process but add jumps to the underlying, and the jump structures can be made to incorporate the skew/smile exhibited by the market. Merton [45] in 1976 proposed the first jump-diffusion model for the underlying. Recently, pure jump Lévy market models have become increasingly popular, represented by the Variance Gamma model by Madan et al [42], CGMY by Carr et al [12], etc. The Lévy models will be the building block for the work in this thesis and will be elaborated in great detail herein.

The Lévy models have proved effective in explaining the smile/skew in equity and foreign exchange (FX). People have built a certain degree of belief in option prices under those models. As evidence of the models'
market success, a couple of prestigious financial houses on Wall Street have been using those models or their extensions to evaluate their option positions. In this chapter, I will first introduce the mathematics needed to understand Lévy processes. A few popular Lévy market models will then be described, including the VG and CGMY models. Monte Carlo (MC) simulation is the most widely used computational tool in derivative pricing, so the last section of this chapter is dedicated to studying the simulation algorithms for the VG and CGMY processes. The importance sampling technique is discussed as a variance reduction technique for MC simulation, and a couple of toy examples are presented in the end.

2.2 Mathematics

A Lévy process is a stochastic process X_t that has independent and stationary increments. Consequently, it is a Markovian process with the marginal distribution of the random variable X_t being infinitely divisible. It is most common to study Lévy processes via their characteristic functions. The characteristic function of a random variable X is defined as

phi_X(u) = E[e^{iuX}] = int_{-inf}^{+inf} e^{iux} f_X(x) dx,   (2.1)

where f_X(x) is the probability density function of X and u is real.1 The characteristic function can be graphically viewed as the probability weighted average of a unit circle on the complex plane. If we write phi_X(u) = e^{psi_X(u)}, then psi_X(u) is called the characteristic exponent of X. A Lévy process X_t is infinitely divisible, which indicates that the characteristic function of the marginal random variable X_t can be expressed as follows

phi_{X_t}(u) = E[e^{iuX_t}] = e^{t psi_{X_1}(u)},   (2.2)

where psi_{X_1}(u) is the characteristic exponent of the Lévy process at unit time. The property of infinite divisibility gives rise to a great convenience in studying X_t, namely that one only needs to look at X_1 in order to investigate the distributional properties of X_t for any finite t.

1u can be extended to the complex plane.
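Property (2.2) is easy to verify numerically for a concrete infinitely divisible law. The sketch below uses a gamma process, whose characteristic function at time t is (1 - ius)^(-kt); the shape rate k and scale s are arbitrary illustrative values. It checks that the characteristic function at time t equals the unit-time characteristic function raised to the power t.

```python
import cmath  # complex arithmetic; ** uses the principal branch

def phi_gamma(u, t, k=2.0, s=0.5):
    """Characteristic function of a gamma process at time t:
    E[exp(i*u*X_t)] = (1 - i*u*s)**(-k*t)."""
    return (1 - 1j * u * s) ** (-k * t)

u, t = 1.7, 0.3
# Infinite divisibility: phi_{X_t}(u) = exp(t * psi_{X_1}(u)), i.e. the
# characteristic function at time t is the unit-time one to the power t.
lhs = phi_gamma(u, t)
rhs = phi_gamma(u, 1.0) ** t
print(abs(lhs - rhs) < 1e-12)
```

The same check applies to any Lévy marginal once its unit-time characteristic function is known, e.g. the VG characteristic function of Section 2.3.1.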
A Lévy process can be decomposed into three independent components: the first is a deterministic drift with rate b, the second is a continuous-path diffusion with volatility sigma, and the third is a jump process with measure nu(dx). Hence, a Lévy process can be fully characterized by the combined Lévy triplet (b, sigma, nu(dx)), where b is real, sigma is a non-negative real, and nu(dx) is a measure defined on the real line excluding 0. The Lévy measure nu(dx) describes the arrival frequency of jumps of different sizes, and can be written in the functional form nu(dx) = k(x)dx, where k(x) is called the Lévy density. For a one-dimensional Lévy process, the Lévy-Khintchine formula gives the expression for the characteristic exponent psi_{X_1}(u) as follows

psi_{X_1}(u) = bui - (1/2) sigma^2 u^2 + int_{-inf}^{+inf} (e^{iux} - 1 - iux 1_{[|x|<1]}) nu(dx),   (2.3)

with

int_{-inf}^{+inf} min(1, x^2) nu(dx) < inf.   (2.4)

For a good reference on Lévy processes, please see Sato [49].

It is worth pointing out that many well-known stochastic processes are special cases of the general Lévy setting. For instance, if we set b = 0 and let the jump density k(x) vanish for all real x, a standard Brownian motion with variance sigma^2 is left. Or, if both b and sigma are set to zero but nu(dx) = lambda delta_1(dx), where delta_1 denotes the Dirac measure at 1, the Poisson process with arrival rate lambda is restored.

The so-called pure jump Lévy models ignore the Brownian motion component but use tiny jumps to mimic the continuous movement. This can be realized by tilting the Lévy density k(x) sufficiently large as the jump size x approaches zero. For such pure jump processes, we can group the models into 3 exclusive categories. Let us define

I = int_{-inf}^{+inf} nu(dx),   J = int_{-inf}^{+inf} |x| nu(dx),

where I and J denote the total arrival rate and total variation respectively. Thus, a Lévy process is called a

1. Finite activity process if I < inf and J < inf

2. Infinite activity but finite variation process if I = inf but J < inf

3.
Infinite activity and infinite variation process if I = inf and J = inf

This classification distinguishes the behavior of small jumps around the origin. Due to the high frequency of tiny jumps, an infinite activity process implies that the total number of jump occurrences during any time interval is infinite. The infinite variation process, which encompasses the infinite activity process, indicates that beyond the infinite jump frequency, the summation of the absolute values of all occurred jumps goes to infinity too in any finite time interval. Both infinite activity and infinite variation processes can be used as the building block for the pure-jump Lévy market models developed in the next section.

2.3 Pure jump Lévy market models

After the 1987 equity market crash, investors flocked to buy out-of-the-money put options to protect their equity positions, and this pushed up the out-of-the-money volatility and made the negative skew more pronounced. Additionally, it is well known that log daily returns have significant skewness and are fat-tailed. As the market evidence does not support the constant volatility Brownian motion models, people started looking for models with richer structures that yield realistic explanations.

It is natural to think, after observing a catastrophic market crash, that the market moves not only continuously, but jumps from time to time. As the first attempt in this regard, R. Merton in 1976 proposed a jump-diffusion model for the equity market in [45] which used Brownian motion for the small movements and jumps for the large. However, in the real world trading never happens continuously but rather one trade after another, and the movement of the stock price path cannot be absolutely continuous because it is at least limited by the minimum tick size (e.g. 1 cent for the traded stocks on the New York Stock Exchange).
This raises the question of whether a Brownian component is needed in the model at all, especially in the interest of parsimonious modeling. Several researchers [42] have argued that in reality the continuous diffusion component is not statistically significant, as long as the small moves can be represented by structures other than Brownian motion. Under this rationale, Brownian motion is excluded from the pure jump Lévy models.

In this section I will first describe the dynamics of two popular Lévy processes, VG and CGMY. The market models use these processes to describe the logarithm of the asset price; consequently, the asset price itself follows an exponential Lévy process. Careful treatment is required here, because a legitimate model for the stock price must satisfy the martingale condition under the appropriate measure and associated numeraire in order to prevent arbitrage opportunities. This section will discuss how to make the exponential Lévy process a martingale by correcting the convexity term caused by the exponentiation.

2.3.1 Variance Gamma process

The Variance Gamma (VG) model developed by Madan, Carr and Chang in [42] is an elegant model that offers analytical tractability and straightforward simulation schemes, and it has been one of the most well-known pure-jump Lévy models in this area. A VG random variable X follows a 3-parameter (σ, ν, θ) probability law, and its characteristic function is given by

φ_VG(u; σ, ν, θ) = (1 − iuθν + ½σ²νu²)^{−1/ν}, (2.5)

with σ ∈ R₊, ν ∈ R₊, θ ∈ R. The elegance of the VG process lies in the fact that its Lévy jump density can be expressed in the simple form

k_VG(x) = { C exp(Gx)/|x|, x < 0; C exp(−Mx)/x, x > 0 }, (2.6)

where

C = 1/ν, G = (√(¼θ²ν² + ½σ²ν) − ½θν)^{−1}, M = (√(¼θ²ν² + ½σ²ν) + ½θν)^{−1}. (2.7)

One can recognize that each branch in Eqn. (2.6) is in fact the Lévy measure of a gamma random variable.
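The two parameterizations in Eqns. (2.5)-(2.7) can be cross-checked numerically: substituting C, G, M into the equivalent (C, G, M) form of the characteristic function, φ_VG(u) = (GM/(GM + (M − G)iu + u²))^C, must reproduce Eqn. (2.5). A small Python sketch (the parameter values are illustrative, not taken from the text):

```python
import cmath
import math

def vg_cgm_params(sigma, nu, theta):
    """Map the (sigma, nu, theta) VG parameterization to (C, G, M), Eqn. (2.7)."""
    root = math.sqrt(theta ** 2 * nu ** 2 / 4.0 + sigma ** 2 * nu / 2.0)
    return 1.0 / nu, 1.0 / (root - theta * nu / 2.0), 1.0 / (root + theta * nu / 2.0)

def vg_cf_snt(u, sigma, nu, theta):
    """VG characteristic function in the (sigma, nu, theta) form, Eqn. (2.5)."""
    return (1 - 1j * u * theta * nu + 0.5 * sigma ** 2 * nu * u ** 2) ** (-1.0 / nu)

def vg_cf_cgm(u, C, G, M):
    """The same characteristic function in the (C, G, M) form."""
    return (G * M / (G * M + (M - G) * 1j * u + u ** 2)) ** C

sigma, nu, theta = 0.2, 0.5, -0.1   # illustrative values
C, G, M = vg_cgm_params(sigma, nu, theta)
for u in (0.5, 1.0, 3.0):
    assert abs(vg_cf_snt(u, sigma, nu, theta) - vg_cf_cgm(u, C, G, M)) < 1e-9
```

The identity holds because the mapping in Eqn. (2.7) implies 1/(GM) = σ²ν/2 and 1/G − 1/M = −θν.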
This indicates that a VG random variable can be decomposed into two gamma random variables, one contributing the positive jumps and the other the negative jumps. Under this representation, a VG random variable X_VG(C, G, M) can be written as the difference of two gamma random variables:

X_VG(C, G, M) = X_g(C, 1/M) − X_g(C, 1/G). (2.8)

This fact leads to the straightforward simulation algorithm that we present in the following section. The VG process can handle the skewness and excess kurtosis exhibited by historical stock prices, and fits the vanilla option volatility curve well for single maturities. For instance, a negative parameter θ results in negative skewness, and the parameter ν provides the primary control over the fat tails of the empirical distribution. For the vanilla option market, a negative θ accounts for the negative slope of the volatility curve.

The VG process can also be intuitively expressed as a time-changed Brownian motion, where the time-change process² is a gamma process. Specifically, a VG process can be obtained by substituting a gamma random time g(t) for the deterministic time t in a drifted Brownian motion X(t) = θt + σW(t). Under the (σ, ν, θ) parameterization we have

X_VG(t) = θg(t) + σW(g(t)), (2.9)

where g(t) follows the gamma distribution Gamma(t/ν, ν).

²A time-change process is also called a subordinator in some of the literature.

The concept of time-changed Brownian motion has a strong economic intuition. The market clearly does not evolve identically every day; rather, on some days trading activity is intense, while on other days the market is quiet and less trading goes on. So the length of a market day is better measured by a random "business time" than by calendar time. As a technical benefit, the VG process can be simulated by generating a standard Brownian motion subordinated by a gamma random time.
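The time-change representation in Eqn. (2.9) translates directly into a two-line sampler. The sketch below (with illustrative parameter values) checks the sample mean against E[X_t] = θt:

```python
import random
import statistics

def simulate_vg_tcbm(t, sigma, nu, theta, rng):
    """One draw of X_t ~ VG as Brownian motion run on a gamma clock, Eqn. (2.9)."""
    g = rng.gammavariate(t / nu, nu)   # gamma "business time", E[g(t)] = t
    w = rng.gauss(0.0, 1.0)            # standard normal increment
    return theta * g + sigma * (g ** 0.5) * w

rng = random.Random(7)
t, sigma, nu, theta = 1.0, 0.2, 0.5, -0.1   # illustrative values
draws = [simulate_vg_tcbm(t, sigma, nu, theta, rng) for _ in range(200_000)]
assert abs(statistics.fmean(draws) - theta * t) < 5e-3   # E[X_t] = theta * t
```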
2.3.2 CGMY process

The Variance Gamma process is of infinite activity but finite variation; it can be extended, by adding one parameter, to give better control over the fine structure of the asset return distribution. The generalized model is the CGMY model developed by Carr et al. in [12]. Based on the (C, G, M) parameterization of VG, the CGMY process adds a fourth parameter Y to the power of the denominator x in Eqn. (2.6). The CGMY jump density is then given by

k_CGMY(x) = { C exp(Gx)/|x|^{1+Y}, x < 0; C exp(−Mx)/x^{1+Y}, x > 0 }, (2.10)

where C > 0, G > 0, M > 0, Y < 2. (The restriction Y < 2 keeps the second moment of the process finite.) By adding Y, we gain the flexibility to specify a finer structure. For Y < 0, the Lévy process is of finite activity; for 0 ≤ Y < 1 it is of infinite activity but finite variation; for 1 ≤ Y < 2 the process is of infinite activity and infinite variation. As special cases, the VG process is recovered for Y = 0, and Kou's double exponential model in [40] is a finite activity process with Y = −1. The characteristic function of a CGMY random variable is given by

φ_CGMY(u; C, G, M, Y) = exp( CΓ(−Y)[(M − iu)^Y − M^Y + (G + iu)^Y − G^Y] ), (2.11)

where Γ(·) is the gamma function. The CGMY process also has a time-changed Brownian motion representation, recently discovered in Madan et al. [43]. We will present it together with the CGMY simulation algorithm in the next section.

2.3.3 Market models

Lévy market models assume that the martingale component of the dynamics of the log price X_t is given by a Lévy process, e.g. VG or CGMY. As we are mostly concerned with pricing derivatives, the model specification under the physical measure is skipped. Rather, the stock price dynamics under the risk-neutral measure is given by the exponential Lévy process

S_t = S_0 exp( (r + ω)t + X_t ), (2.12)

where r is the risk-free interest rate and ω accounts for the "logarithmic convexity correction". The ω
appears in the exponential because it is needed to make the discounted stock price a martingale, a general requirement of asset pricing theory to prevent arbitrage opportunities. It can be shown that ω is defined by

exp(−ω) = φ(−i; Θ),

where φ(·;·) is the characteristic function of the Lévy process and Θ represents the model parameters. For the VG and CGMY models, the characteristic functions are given in Eqns. (2.5) and (2.11) respectively.

Lévy market models have recently been applied to option pricing in the real world and their popularity has been increasing (see [17, 50] for details). Here we only briefly compare the Lévy models with the stochastic volatility models, in terms of their pros and cons for option pricing. Lévy models have the advantage of staying within the family of one-dimensional Markov processes, whereas stochastic volatility models need an additional stochastic process to capture the volatility. Secondly, stochastic volatility models, which have only continuous martingale components, are unable to generate the smile/skew volatility curve for short-dated options. This is because path continuity prevents Brownian motions from generating enough variation over a short period of time dt. Such a problem does not exist for infinite activity Lévy models, because an infinite number of jumps occur during a time interval of any length. Therefore, even for short-lived options such Lévy models can produce the exhibited skews.

However, pure jump Lévy models contradict the market in some other respects. For instance, under Lévy models the implied volatility curve of long-dated options flattens out, whereas the market shows significant skew in those options. This is a consequence of the i.i.d. increment assumption that Lévy processes carry.
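As a quick numerical check of the relation exp(−ω) = φ(−i; Θ), the sketch below computes ω for the VG model from its characteristic function and compares it with the closed form ω = ln(1 − θν − σ²ν/2)/ν implied by Eqn. (2.5) (parameter values are illustrative):

```python
import cmath
import math

def vg_cf(u, sigma, nu, theta):
    """VG characteristic function, Eqn. (2.5)."""
    return (1 - 1j * u * theta * nu + 0.5 * sigma ** 2 * nu * u ** 2) ** (-1.0 / nu)

def convexity_correction(sigma, nu, theta):
    """omega defined by exp(-omega) = phi(-i), so that the discounted
    stock price in Eqn. (2.12) is a martingale."""
    return -cmath.log(vg_cf(-1j, sigma, nu, theta)).real

sigma, nu, theta = 0.2, 0.5, -0.1   # illustrative values
omega = convexity_correction(sigma, nu, theta)
closed_form = math.log(1 - theta * nu - 0.5 * sigma ** 2 * nu) / nu
assert abs(omega - closed_form) < 1e-12
# per-unit-time martingale check: E[exp(omega + X_1)] = exp(omega) * phi(-i) = 1
assert abs(math.exp(omega) * vg_cf(-1j, sigma, nu, theta).real - 1.0) < 1e-12
```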
The central limit theorem (CLT) states that the normalized sum of a large number of i.i.d. random variables approaches the normal distribution, so in the long run the skewness and kurtosis provided by the Lévy process are suppressed by the CLT. Another contradicting example is that under time-homogeneous Lévy models the implied volatility surface of the forward-start option³ is theoretically an exact duplicate of the current implied volatility surface, whereas the market indicates a significant difference between the two surfaces. A third unsatisfactory feature of Lévy models is the constant variance of the process, which cannot explain the volatility clustering and leverage effect observed in historical realized volatility data.

To solve the above problems associated with basic Lévy models, Carr et al. in [14] proposed stochastic volatility Lévy models, which essentially employ an additional Markov process to time-change the Lévy process in the spot price dynamics. These models combine the advantages of the basic Lévy and stochastic volatility models, and thus attack the aforementioned deficiencies simultaneously. Although it would be naive to claim that stochastic volatility Lévy models are a perfect description of the real world, they provide a closer look that helps explain option market phenomena.

2.4 Monte Carlo simulation

Monte Carlo simulation has been widely used in financial engineering, with applications ranging from pricing and hedging to risk management. Compared to other methodologies for pricing derivatives, Monte Carlo simulation has the advantage of being flexible and easy to implement. In fact, Monte Carlo simulation is the only viable approach when pricing derivatives with complicated structures, such as the exotic options and hybrid products that have become a booming business in recent years.

³A plain vanilla option that comes into life on a specified future date.
For a full account and overview in this regard, readers are referred to [26, 37]. It is a well-known drawback of Monte Carlo simulation that convergence is slow: the confidence interval shrinks at a rate proportional only to the square root of the number of random draws. In practice, one needs variance reduction techniques to speed up convergence. Depending on the specific scenario, different variance reduction tools should be chosen. For instance, when pricing far out-of-the-money options, importance sampling is used to redirect the process toward the region of interest and then correct for the change by multiplying by the Radon–Nikodym ratio. Variance reduction is especially important in the derivative pricing business, because a customer often wants a real-time quote on a product with an arbitrarily complicated structure.

In this section I will focus on the simulation algorithms for the VG and CGMY processes and provide numerical comparisons between different simulation schemes. I will also discuss the importance sampling technique under Lévy processes, with numerical demonstrations.

2.4.1 Simulating VG and CGMY

Variance Gamma

We have learned from Eqn. (2.8) that a VG random variable can be decomposed into two gamma random variables. Based on this rationale, one can take advantage of gamma random variable simulation and generate the VG process as the difference of two gamma random variables. The corresponding simulation algorithm, using the {C, G, M} parameterization, is listed in Table (2.1).

Simulation of X_t ~ VG(t; C, G, M)
1. Generate G⁻_t ~ Gamma(tC, 1/G)
2. Generate G⁺_t ~ Gamma(tC, 1/M)
3. Return X_t = G⁺_t − G⁻_t

Table 2.1: VG process simulation as the difference of two gamma random variables

CGMY

Unfortunately, an equally elegant simulation algorithm is not available for all Lévy processes.
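Table 2.1 can be sketched in a few lines of Python (parameter values are illustrative). The test compares the sample mean with E[X_t] = tC(1/M − 1/G), the difference of the two gamma means:

```python
import random
import statistics

def simulate_vg_gamma_diff(t, C, G, M, rng):
    """One draw of X_t ~ VG(t; C, G, M) as the difference of two gammas (Table 2.1)."""
    g_minus = rng.gammavariate(t * C, 1.0 / G)   # negative-jump component
    g_plus = rng.gammavariate(t * C, 1.0 / M)    # positive-jump component
    return g_plus - g_minus

rng = random.Random(42)
t, C, G, M = 1.0, 2.0, 10.0, 8.0   # illustrative values
draws = [simulate_vg_gamma_diff(t, C, G, M, rng) for _ in range(200_000)]
mean_theory = t * C * (1.0 / M - 1.0 / G)
assert abs(statistics.fmean(draws) - mean_theory) < 5e-3
```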
The most general approach to simulating an arbitrary Lévy process is to treat the jumps as a compound Poisson process and sample from its Lévy density. For compound Poisson simulation, one first calculates the jump arrival rate and simulates the point process to locate the random sequence of jump epochs. Given that a jump has occurred, its size is sampled from the normalized Lévy density. A compound Poisson random variable X_CP(t) is then the sum of all jumps up to time t.

The problem with this approach is that for infinite activity and infinite variation Lévy processes, the arrival rate is infinite on any time interval. However, one realizes that this infinity is caused by the jumps of tiny size, while the arrival rates of large jumps are always finite by definition. This suggests a general way of simulating such processes: the small jumps are cut off and approximated, while the large jumps are simulated as a compound Poisson process. Below, this general idea is tailored to a CGMY simulation algorithm.

We first cut off the small jumps of absolute size less than ε, a small positive value⁴. These small jumps will be approximated by a Brownian motion with the corresponding variance⁵. For the remaining large jumps, compound Poisson simulation is employed and we sample the jump sizes using the acceptance-rejection method. We first form the truncated jump density k^ε_CGMY from Eqn. (2.10):

k^ε_CGMY(x) = { C exp(Gx)/|x|^{1+Y}, x < −ε; C exp(−Mx)/x^{1+Y}, x ≥ ε }.

To obtain the best efficiency from the acceptance-rejection method, we want to find a proposal density f_ε(x) whose (suitably scaled) value is close to but always above k^ε_CGMY(x) at every x. For x ≥ ε, it can be shown that the function

f_ε(x) = (Yε^Y / x^{Y+1}) · 1_{[x ≥ ε]}, Y < 2,

has an efficiency close to 1 as ε approaches zero.

⁴For practical purposes we pick ε = 10⁻⁴.
⁵Brownian motion provides a good approximation for small jumps, as shown by Asmussen and Rosiński [1].
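The acceptance-rejection step above can be sketched as follows for the positive jumps (the negative side is symmetric, with G in place of M). Inverting the Pareto proposal CDF gives x = εu^{−1/Y}, and the ratio of the target density C e^{−Mx}/x^{1+Y} to the scaled envelope C/x^{1+Y} gives the acceptance probability e^{−Mx}. Parameter values are illustrative:

```python
import math
import random

def sample_positive_cgmy_jump(M, Y, eps, rng):
    """One large positive CGMY jump (size >= eps) by acceptance-rejection."""
    while True:
        u = 1.0 - rng.random()                 # uniform on (0, 1]
        x = eps * u ** (-1.0 / Y)              # Pareto proposal via CDF inversion
        if rng.random() <= math.exp(-M * x):   # accept with target/envelope ratio
            return x

rng = random.Random(0)
M, Y, eps = 10.0, 0.5, 1e-4   # illustrative values
jumps = [sample_positive_cgmy_jump(M, Y, eps, rng) for _ in range(1000)]
assert min(jumps) >= eps   # all accepted jumps exceed the cutoff
```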
The cumulative distribution function of f_ε(x) is

F_ε(x) = (1 − ε^Y x^{−Y}) · 1_{[x ≥ ε]},

whose inverse is simply F_ε^{−1}(u) = εu^{−1/Y}, where u is a uniform random variable. The case of negative large jumps is treated identically. For jump sizes with absolute values smaller than ε, we calculate the variance as follows:

σ_ε² = ∫_{−ε}^{+ε} x² ν(dx) < ∞.

The small jumps are then approximated by a Brownian motion with zero drift and variance σ_ε². So far we have developed a simulation algorithm by decomposing the CGMY process into three components, i.e. the large positive jumps, the large negative jumps, and the small jumps, and generating each component separately. The algorithm is listed in Table (2.2). The λ⁻ and λ⁺ are the intensities of negative and positive jumps respectively, and can be computed as incomplete gamma functions.

2.4.2 Simulating VG and CGMY as time-changed Brownian motion

The benefit of viewing a Lévy process as a time-changed Brownian motion is twofold with respect to simulation. First, we avoid dealing directly with the Lévy jump density, which might be difficult to sample from. Second, it provides a modularity benefit, fitting neatly into existing simulation packages.

Simulation of Z_t ~ CGMY(t; C, G, M, Y)
1. Calculate λ⁻ = C ∫_{−∞}^{−ε} e^{Gx}/|x|^{Y+1} dx and λ⁺ = C ∫_ε^{∞} e^{−Mx}/x^{Y+1} dx
2. Simulate N⁻ ~ Poisson(tλ⁻) and N⁺ ~ Poisson(tλ⁺), where N⁻ and N⁺ are the numbers of negative and positive jumps respectively
3. For the negative jumps, loop over i = 1 : N⁻
   • Repeat
     – Generate {W, V} ~ Uniform[0, 1]
     – Set X⁻_i = εW^{−1/Y} (the magnitude of the jump)
     – Set T = exp(G X⁻_i) (the ratio of the scaled envelope to the target density)
   • Until VT ≤ 1, and then store X⁻_i
4. Do the same for the N⁺ positive jumps as in step (3), with M in place of G, and store all X⁺_i
5. Calculate σ_ε² = ∫_{−ε}^{+ε} x² k_CGMY(x) dx < ∞
6. Simulate X^ε = σ_ε √t B, where B ~ Normal(0, 1)
7. Return Z_t = − Σ_{i=1}^{N⁻} X⁻_i + Σ_{i=1}^{N⁺} X⁺_i + X^ε
Table 2.2: CGMY process simulation as compound Poisson plus Brownian motion approximation

Variance Gamma

As in Eqn. (2.9), VG can be viewed as a Brownian motion subordinated by a gamma process. Table (2.3) below gives the procedure for simulating a VG process X_t with parameter set (σ, ν, θ).

Simulation of X_t ~ VG(t; σ, ν, θ)
1. First step: generate G_t ~ Gamma(t/ν, ν)
2. Second step: insert G_t into a Brownian motion
   • Generate a standard normal random variable W ~ Normal(0, 1)
   • Return X_t = θG_t + σ√(G_t) W

Table 2.3: VG process simulation as a time-changed Brownian motion

CGMY

The expression of CGMY as a time-changed Brownian motion was recently discovered by Madan and Yor in [43], using a random truncation of a stable process. We skip the discussion here and only list, in Table (2.4), the simulation algorithm developed in [53].

2.4.3 Importance sampling under Lévy processes

Monte Carlo simulation is often used to evaluate integrals by sampling random points from the relevant probability distribution. However, the choice of sampling distribution makes a real difference to the efficiency of the method. For example, brute-force simulation of a rare event under the original distribution creates a great number of void points that contribute nothing to the calculation of the integral, causing high variance and slow convergence. Importance sampling is a variance reduction technique. As the name suggests, it samples from a different distribution under which most random draws make non-zero contributions to the integral evaluation. The displacement caused by the distribution

Simulation of the CGMY(t; C, G, M, Y) process
1. A = (G − M)/2, B = (G + M)/2.
2. Take ε a small value, say 0.0001. Jumps of the subordinator smaller than ε are replaced by their expectation; jumps bigger than ε are simulated by inverting the CDF.
3. The expectation of the small jumps is d = ∫_0^ε y · C/y^{Y/2+1} dy = Cε^{1−Y/2}/(1 − Y/2).
4. The arrival rate for jumps bigger than ε
is λ = ∫_ε^∞ C/y^{1+Y/2} dy = 2C/(Yε^{Y/2}).
5. Let T be a pseudo-time, T = tC√π Γ(1 + Y/2)/2^{Y/2}.
6. Generate a Poisson random variable N with arrival rate Tλ.
7. Generate t_i, i = 1, ..., N, uniformly distributed within [0, t].
8. Generate the jump y_i at t_i by y_i = ε(1 − u¹_i)^{−2/Y}, where u¹_i is an independent uniform sequence (inversion of the normalized Lévy density).
9. S(t) = dt + Σ y_i · 1_{[h(y_i) > u²_i]}, where u²_i is another independent uniform sequence (the calculation of h(y_i) is presented below).
10. Once S(t) is known, the CGMY process is X_t = A S(t) + √(S(t)) W, where W follows Normal(0, 1).

Calculation of the truncation function h(y)
1. h(y) = e^{−(B² − A²)y/2} · [Γ((Y+1)/2)/(Γ(Y)Γ(1/2))] · 2^Y (B²y/2)^{Y/2} · I(Y; B²y; B²y/2).
2. I(Y; 2λ; α) = H_{−Y}(√(2λ)) Γ(Y)/(2λ)^{Y/2}, where H_ν(·) is the Hermite function.
3. The Hermite function is explicitly known in terms of the confluent hypergeometric function ₁F₁:
H_ν(z) = 2^{ν/2} [ (Γ(1/2)/Γ((1−ν)/2)) ₁F₁(−ν/2; 1/2; z²/2) − (√2 z Γ(3/2)/Γ(−ν/2)) ₁F₁((1−ν)/2; 3/2; z²/2) ].

Table 2.4: CGMY process simulation as a time-changed Brownian motion

change will be corrected by multiplying each individual random evaluation by a ratio, as discussed in detail below. Suppose p(x) is a probability density function (pdf) and one is interested in the integral

I = ∫ f(x) p(x) dx. (2.13)

With a large number N of random draws x_i from the original p(x), the integral I can be approximated as

I′ = (1/N) Σ f(x_i). (2.14)

However, the pdf p(x) and the evaluated function f(x) are independent of each other, so p(x) may concentrate its weight in a region where f(x) has no significant values. The Monte Carlo approximation then becomes inefficient, because most draws are wasted in this case. To increase the efficiency, we can equivalently rewrite the integral as

J = ∫ f(x) (p(x)/q(x)) q(x) dx, (2.15)

so that it can be approximated by

J′ = (1/N) Σ f(x_j) p(x_j)/q(x_j), (2.16)

with the points x_j drawn from the new distribution q(x), termed the importance distribution.
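Before moving from random variables to processes, Eqns. (2.13)-(2.16) can be illustrated on a toy rare-event problem: estimating the Gaussian tail probability P(X < −3) by sampling from an importance distribution q recentered at −3 (a hypothetical example, not the VG experiment of Section 2.4.4):

```python
import math
import random

def normal_pdf(x, mu=0.0):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

rng = random.Random(1)
n = 100_000
est = 0.0
for _ in range(n):
    x = rng.gauss(-3.0, 1.0)                        # draw from q(x) = N(-3, 1)
    if x < -3.0:                                    # f(x) = 1_{x < -3}
        est += normal_pdf(x) / normal_pdf(x, -3.0)  # weight p(x)/q(x), Eqn. (2.16)
est /= n
true_p = 0.0013498980316301   # Phi(-3)
assert abs(est - true_p) / true_p < 0.05
```

A naive estimate with the same budget would see only about n·Φ(−3) ≈ 135 hits and a far wider confidence interval, which is exactly the inefficiency described above.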
q(x) shares the same sample space as p(x) but can be chosen to concentrate on the region where f(x) is non-zero. The technique reduces the variance incurred in the integral approximation.

Defined above is the concept of importance sampling for random variables. In finance, however, we are dealing with time series of asset returns, which are processes, i.e. collections of random variables indexed by time. Nevertheless, a process can be considered as a random variable X governed by the probability law P on the path space Ω. If we construct a new probability measure Q equivalent to P, namely

P(A) = 1 ⟺ Q(A) = 1,

then we call the new probability measure a change of measure. The Radon–Nikodym derivative Z_t = dP/dQ|_{F_t} is what we need to perform importance sampling. It is an F_t-measurable martingale process under the measure Q. A rigorous definition of the general Radon–Nikodym derivative Z_t for Lévy processes can be found in Sato [49]. Here we discuss the special case of a pure jump Lévy process without a continuous martingale component, i.e. without Brownian motion.

Proposition 1 Let (X_t, P) and (X_t, P′) be two pure jump Lévy processes with triplets (b, 0, ν) and (b′, 0, ν′). Then P|_{F_t} and P′|_{F_t} are equivalent for all t if and only if the following two conditions are satisfied:

1. The Lévy measures are equivalent, with

∫_{−∞}^{+∞} (e^{ℓ(x)/2} − 1)² ν(dx) < ∞, where ℓ(x) = ln(dν′/dν);

2. We must also have

b′ − b = ∫_{−1}^{1} x (ν′ − ν)(dx).

When they are equivalent, the Radon–Nikodym derivative is

dP′/dP |_{F_t} = e^{U_t}, (2.17)

where U_t is given by

U_t = lim_{ε→0} [ Σ_{s≤t, |ΔX_s|>ε} ℓ(ΔX_s) − t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) ]. (2.18)

The definition of the Radon–Nikodym derivative in Proposition 1 indicates that in general its calculation depends on the information of all jumps {ΔX_s, ∀s ≤ t} along the entire path. This is an unwanted property, because it can substantially increase the requirements for computer storage and computation.
For example, suppose one is applying the importance sampling technique to price a far out-of-the-money European option. Simulating only the terminal stock prices under the new measure is then no longer sufficient to price the option; one has to generate the whole evolution of each sample path. As a matter of fact, the benefit of the variance reduction would be largely offset, if not totally lost, by the excessive CPU time spent on the path simulation. However, the next two propositions show that in certain cases, namely when the ratio of the two Lévy measures is strictly exponential, we can still rely on the terminal values alone to calculate the Radon–Nikodym derivative. As it turns out, the VG and CGMY processes fall into this convenient class.

Proposition 2 Let (X_t, P) and (X_t, P′) be two pure jump Lévy processes with triplets (b, 0, ν) and (b′, 0, ν′). Then the Radon–Nikodym derivative dP′/dP|_{F_t} depends only on the terminal value X_t instead of the whole path {ΔX_s, ∀s ≤ t} if and only if the ratio of the two Lévy measures dν′/dν is a strict exponential function, i.e.

dν′/dν = e^{cx} for all x,

where c is a constant.

Proof of Proposition 2. From Eqn. (2.17), we have

dP′/dP |_{F_t} = e^{U_t}
= exp( lim_{ε→0} [ Σ_{s≤t, |ΔX_s|>ε} ℓ(ΔX_s) − t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) ] )
= exp( lim_{ε→0} Σ_{s≤t, |ΔX_s|>ε} ℓ(ΔX_s) ) · exp( −lim_{ε→0} t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) )
= exp( lim_{ε→0} Σ_{s≤t, |ΔX_s|>ε} c ΔX_s ) · exp( −lim_{ε→0} t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) )
= exp(cX_t) · exp( −lim_{ε→0} t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) ).

As shown in the proof, under the condition of Proposition 2 the calculation of the measure change depends solely on the terminal value X_t and no longer requires the path information. We show in Proposition 3 how this simplified measure change applies to the Variance Gamma process.

Proposition 3 Let P and P′ be the measures of Variance Gamma processes following VG(t; C, G, M) and VG(t; C′, G′, M′). The Lévy measures are ν and ν′, respectively.
X_t is the random variable under the measure P and can be decomposed into the difference of two gamma processes, i.e. X_t = g⁺_t − g⁻_t, where g⁻_t ~ Gamma(Ct, 1/G) and g⁺_t ~ Gamma(Ct, 1/M). Then the Radon–Nikodym derivative dP′/dP|_{F_t} depends only on the terminal values g⁻_t and g⁺_t, and not on the entire path {Δg^±_s, ∀s ≤ t}, only if C = C′, in which case

dP′/dP |_{F_t} = exp(−tZ) ℓ⁺(g⁺_t) ℓ⁻(−g⁻_t),

where

Z = ∫_{−∞}^{∞} (ν′ − ν)(dx),
ℓ⁻(x) = e^{−(G′−G)|x|}, x < 0,
ℓ⁺(x) = e^{−(M′−M)x}, x > 0.

The validity of Proposition 3 follows easily from Proposition 2, and the extension from the VG to the CGMY process is straightforward as well. We skip both proofs here.

The measure change technique is widely used in pricing complicated financial derivative products. For example, for a far out-of-the-money European put option, naive Monte Carlo simulation will produce a price with significant variance, since only a small proportion of the simulated paths end up below the strike. Another example is the up-and-out barrier option in which the upper barrier is far from the spot, so that the probability of breaching the barrier is slim. In such cases, a limited number of simulation paths is not sufficient to make an accurate estimate of the price. To perform importance sampling, we alter the process and redirect it toward the region of interest. For example, suppose the underlying dynamics follows a VG(t; C, G, M) law and the task is to price an up-and-out barrier call option with a high upper barrier. We could then sample from a new measure VG(t; C′, G′, M′) with G′ > G but M′ < M, so as to make the new process move upward more and downward less than the original one. Note that the parameters C and C′ must be kept equal in order to satisfy the condition in Proposition 2. By this means, more paths will favorably be sampled toward the barrier; the bias caused by the change is corrected by multiplying the results by the Radon–Nikodym derivative developed in Proposition 2.
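Proposition 3 can be put to work directly on the pseudo put experiment of Table (2.7): simulate the terminal gamma components under the tilted measure P′ and weight each payoff by dP/dP′, the reciprocal of the derivative in the proposition. A sketch (parameters as in Table (2.7); with equal C, the constant Z reduces to C[ln(G/G′) + ln(M/M′)] by Frullani's integral):

```python
import math
import random

# Table 2.7 parameters: original VG(C, G, M), tilted VG(C', G', M') with C' = C
C, G, M = 5.0, 10.0, 10.0
Gp, Mp = 7.0, 12.0
t, K = 4.0, -1.5

Z = C * (math.log(G / Gp) + math.log(M / Mp))   # integral of (nu' - nu)(dx)

rng = random.Random(3)
n = 200_000
price = 0.0
weight_sum = 0.0
for _ in range(n):
    g_plus = rng.gammavariate(t * C, 1.0 / Mp)    # positive component under P'
    g_minus = rng.gammavariate(t * C, 1.0 / Gp)   # negative component under P'
    x = g_plus - g_minus
    # dP/dP' = 1 / [exp(-tZ) * l+(g+) * l-(-g-)], from Proposition 3
    w = math.exp(t * Z + (Mp - M) * g_plus + (Gp - G) * g_minus)
    weight_sum += w
    price += w * max(K - x, 0.0)
price /= n
assert abs(weight_sum / n - 1.0) < 0.05   # E_{P'}[dP/dP'] = 1
assert price > 0.0
```

The first assertion checks the defining property of the measure change: the weights average to one under the tilted measure, so the weighted payoff is an unbiased estimate of the price under the original measure.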
We will show a concrete numerical example in the next subsection.

2.4.4 Numerical results

Simulation validation

In this section we first present simulation results for the CGMY process using the two different algorithms, as a compound Poisson process and as a time-changed Brownian motion (TCBM), and compare the CPU time spent by the two algorithms. Figure 2.1 qualitatively illustrates the goodness-of-fit of the two algorithms for the symmetric case (G = M), and Figure 2.2 for the asymmetric one (G ≠ M). In both plots the blue lines represent the binned simulation data, while the red lines are the theoretical PDFs, obtained by Fourier inversion of the characteristic function using the algorithm developed in [44]. Both figures show good fitting quality. We also tabulate the chi-square goodness-of-fit results in Tables (2.5) and (2.6)⁶ for a quantitative view of the algorithms' performance. We conduct the experiment for various configurations of the model parameters for both algorithms.

⁶χ² is the chi-square test statistic, k is the number of degrees of freedom, p is the chi-square CDF value, and χ²_{α,k} is the critical value at confidence level α with k degrees of freedom. The null hypothesis H₀ cannot be rejected (denoted by "H₀ N.R." in both tables) if χ² is less than χ²_{α,k}.
[Figure 2.1: Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.5, C = 0.8, G = 30, M = 30, Y = 0.5, 2×10⁴ samples]

[Figure 2.2: Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.3, C = 0.5, G = 20, M = 10, Y = 0.5, 2×10⁴ samples]

The tables show that in all cases the null hypothesis H₀, which states that the simulated random variables are drawn from the specified CGMY distribution, cannot be rejected at any of the three significance levels (α = 0.01, 0.05, 0.1).

Efficiency comparison

Given that both algorithms pass the goodness-of-fit test, we next compare their efficiency in terms of CPU time. In CGMY process simulation, the value of the parameter Y deserves special attention because it controls the behavior of the small jumps, which cause most of the difficulty in simulation. For a large value of Y (Y < 2), the Lévy density increases sharply around the origin, and hence tiny jumps occur more frequently than for a smaller Y. Computation-wise, a larger Y value takes more simulation time for both algorithms. Although the compound Poisson method is more efficient for small Y, it suffers most from an increasing Y, because the acceptance-rejection method rejects more and more draws as Y gets larger.
The time-changed Brownian motion takes more time than the compound Poisson for small Y, since it involves the computation of special functions such as the confluent hypergeometric function ₁F₁, but it beats its counterpart for larger Y, where the rejection rate becomes overwhelming for the compound Poisson method. Figure (2.3) illustrates two aspects of the efficiency of the two algorithms:
1. both methods consume more CPU time as Y increases;
2. the compound Poisson beats the time-changed Brownian motion (TCBM) for Y ≤ 1 but underperforms the TCBM for Y > 1.

t    C    G   M   Y    χ²      p     k    χ²(0.01,k)  χ²(0.05,k)  χ²(0.10,k)  Result
0.5  0.8  30  30  0.5  95.77   0.42  98   133.476     122.108     116.315     H₀ N.R.
0.5  0.5  10  10  0.5  182.17  0.57  178  224.8       210.1       202.6       H₀ N.R.
0.5  0.5  30  20  0.5  80.64   0.13  95   130         118.8       113         H₀ N.R.
0.3  0.2  30  20  0.8  77.26   0.33  82   114.7       104.1       98.78       H₀ N.R.
0.3  0.2  30  20  1.2  114.24  0.09  135  176.1       163.1       156.4       H₀ N.R.
0.2  0.1  20  10  1.4  152.07  0.62  146  188.7       175.2       168.3       H₀ N.R.

Table 2.5: Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as a compound Poisson process, 10⁴ samples. N.R. stands for Not Rejected (at all three significance levels).

t    C    G   M   Y    χ²      p     k    χ²(0.01,k)  χ²(0.05,k)  χ²(0.10,k)  Result
0.5  0.8  30  30  0.5  86.91   0.24  96   131.141     119.871     114.131     H₀ N.R.
0.5  0.5  10  10  0.5  186.69  0.68  178  224.8       210.1       202.6       H₀ N.R.
0.5  0.5  30  20  0.5  106.79  0.64  101  137         125.5       119.6       H₀ N.R.
0.3  0.2  30  20  0.8  62.66   0.07  80   112.3       101.9       96.58       H₀ N.R.
0.3  0.2  30  20  1.2  130.76  0.34  137  178.4       165.3       158.6       H₀ N.R.
0.2  0.1  20  10  1.4  169.60  0.85  150  193.2       179.6       172.6       H₀ N.R.

Table 2.6: Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as a time-changed Brownian motion, 10⁴ samples. N.R. stands for Not Rejected (at all three significance levels).

Importance sampling

We next show how to improve the accuracy of Monte Carlo simulation using importance sampling. For simplicity and convenience we take the Variance Gamma process for this demonstration, following Proposition 3 to perform the measure change. Assume that the underlying follows the symmetric process VG(t; C, G, M), and our task is to price a pseudo far out-of-the-money put option with a deep-low strike. We pick a new VG(t; C′, G′, M′) process trending downward, so that paths are more likely to breach the low strike. The model parameters and option specifications are listed in Table (2.7). With a smaller G′ but larger M′, we allow more negative jumps than positive ones. As illustrated in Figure (2.4), the draws of random variables under the new measure concentrate on the negative half axis, which is exactly what is needed to put more probability mass in the neighborhood of the deep-low strike. Figure (2.5) plots the averaged Monte Carlo price against the number of simulation runs for naive Monte Carlo and for Monte Carlo with importance sampling. It clearly shows the superior performance of importance sampling: the option price under the new process converges much faster than that under the original one.
[Figure 2.3: CPU time comparison of compound Poisson and time-changed Brownian motion for simulating the CGMY(t; C, G, M, Y) process with t = 0.2, C = 0.1, G = 20, M = 10 and varying Y, 10⁴ samples]

Original VG          New VG               Option specifications
C    G    M          C′   G′   M′         Type      Strike     Maturity  Initial  Interest
5    10   10         5    7    12         Euro put  K = −1.5   T = 4     S₀ = 0   r = 0

Table 2.7: Parameters and specifications for the numerical illustration of the VG measure change

[Figure 2.4: Histogram comparison for VG(t; C, G, M) and VG(t; C′, G′, M′) with the parameters listed in Table (2.7), 2×10⁴ samples. Panel (a): histogram for the original process; panel (b): histogram for the new process]

[Figure 2.5: Convergence comparison for pricing out-of-the-money put options under VG(t; C, G, M) and VG(t; C′, G′, M′) with importance sampling, parameters listed in Table (2.7)]

Chapter 3
The 3-Factor Lévy Based Short Rate Model

3.1 Background

Whereas for modeling in equity markets one arguably has the Black–Scholes theory as a benchmark or market standard, the situation in the fixed income market is more complicated. The mathematics involved in modeling fixed income market dynamics is essentially infinite dimensional, and no market standard has been established beyond a few special asset classes. On the other hand, the fixed income market is much larger than the equity market. It is so large that even bond derivatives are liquidly traded.
On 04/25/2006, for instance, the outstanding notional for the CBOT futures contracts on government bonds with maturities of [30, 10, 5, 2] years was worth 180 billion dollars, which amounts to the 2004 GDP of Ireland, the 30th largest country in the world in terms of GDP.

Research on modeling the fixed income market has been active. The first attempt was the equilibrium modeling pioneered by Vasicek [51] and Cox-Ingersoll-Ross [18] and followed by Hull-White [36], among others. Those models view the short rate, defined as the instantaneous increment of a unit deposit, as the only source of uncertainty in the economy. The short rate is then modeled by a mean-reverting process with Gaussian randomness. Short rate models are favored by hedge funds and investment banks' proprietary trading desks because those models supply the equilibrium prices, which are presumably the prices the market will asymptotically converge to. However, fixed income derivative traders are not fond of this type of short rate model because it is not arbitrage free; that is, differences between the model prices and the market quotes always exist under such models. Pricing derivatives with a model that cannot make the underlying arbitrage free is dangerous and is avoided by practitioners.

No-arbitrage models came up in the mid-1990s, represented by the Heath-Jarrow-Morton (HJM) model [31]. Instead of modeling the short rate, the HJM model describes the forward rate dynamics. Within the HJM framework, market models such as the BGM/J model [8, 38] were developed for particular fixed income rates. No-arbitrage models give reasonable derivative prices, because the underlying values are perfectly matched to the market by design. However, all the aforementioned models use Brownian motion to describe the uncertainty.
As we have discussed, the normality assumption associated with Brownian motion is barely supported by the market, because Brownian motion lacks the structural properties needed to model many observed market phenomena. Although the disagreement over the choice of models has not deterred banks from trading, a better model certainly helps one understand the market better, and hence profit more.

In this chapter we develop a new short rate model based on the Vasicek framework yet without using Brownian motion. Rather, infinitely active Lévy jump structures are included in the short rate in order to reconcile the discrepancy between the market and the existing models.

As one-factor models have proved less effective in explaining the correlations between bonds across different maturities, three independent Lévy factors are employed in the model. This setup is in line with the conclusion in [41, 32], which showed that 3 factors in the dynamics are adequate to explain more than 99% of the yield curve data historically.

In this chapter, the basic conventions and methodologies used in fixed income modeling are first introduced, together with a brief overview of the existing fixed income models. The degree of mathematics of general asset pricing theory will be kept minimal; only necessary concepts and notations will be included. We will spend most of this chapter developing the 3-factor Lévy based short rate model with a view on measure change. The formulas for fixed income derivatives under the model will be developed in the end.

3.2 Overview of fixed income modeling

3.2.1 Bond market and money account

Bonds are the basic element of the fixed income market. For all fixed income modeling they serve as the market input used to determine model structures and estimate model parameters. In practice, the zero-coupon bond is mostly used as the reference instrument to infer the yield curve.
For the sake of argument we hereafter assume that zero-coupon bonds with continuous maturity in time are traded on the market. A bond is a securitized form of loan; that is, a loan that can be traded. A zero-coupon bond P(t,T), issued at time t and maturing at T, is a contract that guarantees the holder one dollar to be paid at the terminal time T.¹ For brevity, such a zero-coupon bond P(t,T) is called a T-bond. In contrast to the T-bond, a coupon bond has intermediate coupon payments between t and T, made periodically. We summarize the assumptions we have made on T-bonds below:

- P(T,T) = 1 for all T;
- P(t,T) < 1 for all t < T;
- there exists a market for T-bonds for every T > 0, and P(t,T) is continuously differentiable in T.

To complete the market, one usually assumes there is a frictionless risk-free money account B(t). The money account grows as

    B(t) = \exp\left( \int_0^t r(s)\,ds \right),    (3.1)

where B(0) = 1 and the compounding is taken continuously. Here we have introduced the concept of a spot rate r(t), defined as the instantaneous increment at time t of the money account. Clearly B(t) is a risk-free asset insofar as its future value at t + dt bears no uncertainty infinitesimally. B(t) also relates amounts of currency at different times: in order to have one dollar in the bank account at time T we need to have

    \frac{B(t)}{B(T)} = \exp\left( -\int_t^T r(s)\,ds \right)

dollars in the account at time t ≤ T. The relationship between a T-bond P(0,T) and the money account B(T) can be described as

    P(0,T) = \exp\left( -\int_0^T r(s)\,ds \right) = \frac{1}{B(T)}.    (3.2)

We need another important yet straightforward relationship linking the T-bond P(t,T) with the spot rate r(t), given by

    P(t,T) = \exp\left( -\int_t^T r(s)\,ds \right).    (3.3)

¹Here we assume the bond is default free.

As Eqn. (3.3) is not intuitive about how much the bond grows during the period (t,T), the term yield to maturity (or yield) R(t,T) is designated to describe the average gain per unit time from the specific T-bond P(t,T):

    P(t,T) = \exp\left( -R(t,T)(T-t) \right),  \qquad  R(t,T) = -\frac{\log P(t,T)}{T-t}.    (3.4)

It is clear that at time t, the yield is a function of maturity T given the short rate dynamics. The plot of R(t,T) against the maturity T is referred to as the yield curve; Figure 3.1 shows an upward-sloping yield curve on February 9, 2005. In theory, the yield curve on different days can have all sorts of shapes, e.g. up-sloping, down-sloping or humped, reflecting people's expectations of future interest rate levels. While the yield curve can be bootstrapped off the bond prices by Eqn. (3.4), the spot rate is not directly observable. Available market proxies for the short rate include the Fed rate and a short-dated, say 1-month, treasury rate; the latter is considered better since short-dated treasury bonds are liquidly traded.

Figure 3.1: US treasury yield curve on Feb. 09, 2005

3.2.2 Forward rate and Libor rate

The term structure of zero-coupon bond prices does not contain very rich visual information, but several more informative measures can be expressed as implied interest rates. Below we list a variety of them, including both the conceptual and the market-observable ones.

- The discrete forward rate F(t; T_1, T_2) for the period [T_1, T_2] prevailing at t is defined as

    F(t;T_1,T_2) = \frac{1}{T_2 - T_1}\left( \frac{P(t,T_1)}{P(t,T_2)} - 1 \right).

  It can be regarded as the expected average return for the future time period [T_1, T_2], viewed at time t.

- The simple spot rate for the period [t,T] is denoted by F(t,T) = F(t;t,T).

- The instantaneous forward rate f(t,T) with maturity T prevailing at t is defined as

    f(t,T) = -\frac{\partial \log P(t,T)}{\partial T},

  which determines the instantaneous gain of the continuously compounded T-bond at the future time T.
- Libor (London InterBank Offer Rate) rates are the most important market-observable rates and underlie many interest rate derivative contracts such as swaps and caps/floors. In essence, Libor rates are simply compounded forward rates of different tenors, the most used one being the 3-month Libor rate. A Libor rate with tenor δ is given as

    L_t(T, T+\delta) = F(t; T, T+\delta).

Some simple yet important links between the bond prices and the above defined rates are summarized here. The T-bond price relates to the instantaneous forward rate in the following manner:

    P(t,T) = \exp\left( -\int_t^T f(t,u)\,du \right).

And the instantaneous forward rate approaches the short rate as T nears t:

    r(t) = \lim_{T \to t} f(t,T).

3.2.3 Change of numeraire

Since the celebrated Black-Scholes work, the most popular tool for computing asset prices has been "risk-neutral pricing". As argued in Harrison & Kreps [29] and Harrison & Pliska [30], the absence of arbitrage implies the existence of a risk-adjusted probability Q such that the current price of any security equals its discounted expectation of future values. Under this equivalent measure Q the associated discounting factor, referred to as the numeraire, is the riskless money account B(t). Thus, the problem of derivative pricing is simply reduced to calculating the discounted expectation under the risk-neutral measure Q. However, Geman et al [25] noted that neither is the Q measure necessarily the most natural choice for pricing a contingent claim, nor is the money account the most convenient numeraire. In fact, under the measure Q many expectation calculations can be considerably complicated. In such cases, a change of numeraire can help simplify the problem, and it has proved remarkably effective in reducing the complexity of pricing derivatives, especially in the fixed income market. Specifically, Geman et al in [25] introduced the following definition.

Definition 4 A numeraire is any positive non-dividend-paying asset.
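The bond-implied rates of Section 3.2.2 are elementary functions of discount factors. A minimal sketch (function names are ours; the flat curve used in the check is a hypothetical example, not thesis data):

```python
import math

def yield_to_maturity(P, t, T):
    # Continuously compounded yield, Eqn. (3.4): R(t,T) = -log P(t,T) / (T - t)
    return -math.log(P) / (T - t)

def forward_rate(P1, P2, T1, T2):
    # Simply compounded forward rate for [T1, T2]:
    # F(t; T1, T2) = (P(t,T1)/P(t,T2) - 1) / (T2 - T1)
    return (P1 / P2 - 1.0) / (T2 - T1)

def libor(P_T, P_T_plus_delta, T, delta):
    # Libor with tenor delta is the forward rate for [T, T + delta]
    return forward_rate(P_T, P_T_plus_delta, T, T + delta)
```

On a flat 5% continuously compounded curve, `yield_to_maturity` recovers 0.05 for every maturity, while every one-year forward equals the simple rate e^0.05 - 1, illustrating the compounding-convention gap between the two measures.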
Intuitively, a numeraire is a unit asset chosen so as to normalize all other assets. With this definition, the following Proposition 5 holds.

Proposition 5 Assume there exists a numeraire N and a probability measure Q^N, equivalent to the original measure Q^0, such that the price of any traded asset X discounted by N is a (local) martingale under Q^N, i.e.,

    \frac{X_t}{N_t} = E^N\left[ \frac{X_T}{N_T} \,\Big|\, \mathcal{F}_t \right], \qquad 0 \le t \le T.

Let U be an arbitrary numeraire associated with the measure Q^U. Similar to (N, Q^N), the pair (U, Q^U) satisfies

    \frac{X_t}{U_t} = E^U\left[ \frac{X_T}{U_T} \,\Big|\, \mathcal{F}_t \right], \qquad 0 \le t \le T.

Then the Radon-Nikodym derivative defining the measure Q^U is given by

    \frac{dQ^U}{dQ^N} = \frac{U_T N_0}{U_0 N_T}.    (3.5)

By Proposition 5, the discounted expectation of a traded asset X under an inconvenient measure can be translated to another, more convenient measure:

    E^N\left[ \frac{X_T}{N_T} \right] = E^U\left[ \frac{X_T}{N_T}\,\frac{dQ^N}{dQ^U} \right] = E^U\left[ \frac{U_0}{N_0}\,\frac{X_T}{U_T} \right].

The change of numeraire technique is a useful pricing tool at one's disposal. It is typically employed when the money account discount factor under the risk-neutral measure makes the expectation calculation difficult, such as in pricing caps/floors. To be specific, we can make the transformation

    E^0\left[ \frac{h(X_T)}{B_T} \right] = S_0\, E^S\left[ \frac{h(X_T)}{S_T} \right]

whenever the following two properties are satisfied:

- X_t S_t is a tradable asset (0 ≤ t ≤ T);
- h(X_T)/S_T is conveniently simple;

where h(X_T) is the payoff function at time T. The first condition makes the quantity (X_t S_t)/S_t = X_t a martingale under the measure Q^S; the second condition ensures that under the new numeraire the computation is made simpler. Change of numeraire is widely applied in fixed income and foreign exchange derivative pricing. As one can see in the sections where the pricing of caps/floors and swaptions is discussed, the change of numeraire significantly reduces the complexity of pricing.
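A concrete instance of this transformation, standard in the term-structure literature (our illustration, not taken from the thesis), chooses the T-bond itself as numeraire, S_t = P(t,T); the resulting measure is the T-forward measure:

```latex
% Take S_t = P(t,T), so that S_T = P(T,T) = 1. The generic transformation
% E^0[h(X_T)/B_T] = S_0 E^S[h(X_T)/S_T] then becomes
E^{0}\!\left[\frac{h(X_T)}{B_T}\right]
  \;=\; P(0,T)\, E^{T}\!\left[\frac{h(X_T)}{P(T,T)}\right]
  \;=\; P(0,T)\, E^{T}\!\left[\, h(X_T) \,\right].
% Discounting by the stochastic money account B_T is replaced by the
% deterministic factor P(0,T), at the cost of switching the expectation
% to the T-forward measure; h(X_T)/S_T = h(X_T) is "conveniently simple".
```

This is exactly the device used later for caplets, where the T_i-bond serves as numeraire.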
For FX derivatives, change of numeraire is used when the derivative is priced in one currency but the underlying is denominated in another currency, e.g. a quanto option. Interested readers are referred to [3].

3.3 Current short rate models

The first seminal paper on short rate modeling was published by Vasicek [51] in 1977, and together with the development of the interest rate market, fixed income models have been mushrooming ever since. As the first short rate model, Vasicek's work assumes the short rate evolves as an Ornstein-Uhlenbeck (OU) process with constant coefficients

    dr(t) = \kappa\left( \theta - r(t) \right)dt + \sigma\,dW_0(t),    (3.6)

where W_0(t) is a Brownian motion under the real-world measure. As far as pricing is concerned, one needs to move the process from the physical measure to the risk-neutral measure. Therefore, the concept of market price of risk is introduced as the compensation investors are paid for taking the risk of entering a risky contract. If we specify the market price of risk process λ(t) to be linear in the short rate², i.e. λ(t) = λr(t) with λ being a constant, the short rate process under the risk-neutral measure stays within the same OU family. Thus, the risk-neutral short rate process is given as

    dr(t) = (\kappa + \lambda\sigma)\left( \frac{\kappa\theta}{\kappa + \lambda\sigma} - r(t) \right)dt + \sigma\,dW(t),

where W(t) is the Brownian motion under the risk-neutral measure such that

    dW(t) = dW_0(t) + \lambda(t)\,dt.

The solution to the SDE in Eqn. (3.6) shows that at every time t, the random variable r(t) is normally distributed. On the bright side, this property brings analytical solutions for pricing a variety of fixed income products. However, r(t) being normally distributed implies the possibility of a negative short rate, which is counter-intuitive. For practical purposes, this drawback is sometimes tolerated by practitioners, who argue that, with reasonable parameter sets, the probability of r(t) dropping below zero is slim. Nevertheless, the possibility of negative short rates remains an undesirable property.
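The negative-rate drawback is easy to see by simulation with a first-order Euler discretization (the same scheme the thesis later applies to the Lévy model); parameter values below are hypothetical and the function names are ours:

```python
import math
import random

def simulate_vasicek(r0, kappa, theta, sigma, T, n_steps, rng):
    # Euler discretization of dr = kappa*(theta - r) dt + sigma dW
    dt = T / n_steps
    r = r0
    path = [r]
    for _ in range(n_steps):
        r += kappa * (theta - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(r)
    return path
```

With, say, r0 = 2%, long-run mean 4%, σ = 2% and κ = 0.5 over five years, a noticeable fraction of paths dips below zero at some point, while the cross-sectional mean of r(T) tracks the analytical value θ + (r0 − θ)e^{−κT}.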
Then came the celebrated Cox-Ingersoll-Ross (CIR) [18] model, which fixed this drawback. From its inception, the CIR model has been a benchmark short rate model. Under the CIR model, the short rate r(t) also follows a mean-reverting process, but it differs from Vasicek's in having the Brownian motion term multiplied by the square root of r(t). The CIR dynamics is given as

    dr(t) = \kappa\left( \theta - r(t) \right)dt + \sigma\sqrt{r(t)}\,dW_0(t).

²In general there is no reason why the market price of risk process should take the linear form, but this form is taken to simplify the problem.

It can be shown that the solution to the CIR SDE follows a non-central chi-square probability distribution, which is strictly non-negative³. Analogously to Vasicek's model, we move the process to the risk-neutral measure by specifying the market price of risk under the CIR model to be linear in the square root of r(t), i.e.

    \lambda(t) = \lambda\sqrt{r(t)}.

Under this formulation, the risk-neutral CIR process is given by

    dr(t) = (\kappa + \lambda\sigma)\left( \frac{\kappa\theta}{\kappa + \lambda\sigma} - r(t) \right)dt + \sigma\sqrt{r(t)}\,dW(t),

where W(t) is the Brownian motion under the risk-neutral measure such that

    dW(t) = dW_0(t) + \lambda(t)\,dt.

Both the Vasicek and CIR models have closed-form formulas for pricing bonds and bond options under the risk-neutral measure, but the solutions under Vasicek's are much simpler. Therefore, for the sake of simplicity, the Vasicek framework is adopted in our Lévy-based short rate model. We have covered two of the milestone models, but a complete review of existing models is far beyond the scope of this thesis. Interested readers are referred to the books by Brigo and Mercurio [9], Cairns [11], Rebonato [47] and Zagst [52].

³For a very small set of parameters, the process could hit zero.

3.4 3-Factor Lévy short rate model dynamics

The problem that has puzzled researchers in fixed income modeling resembles that in equity modeling: historical data show strong non-normality, and the derivative market exhibits significantly non-constant implied volatility.
These marked violations of the classic Black-Scholes or Black model call for alternative structures to explain the market behavior. We are encouraged by the success of pure jump Lévy models in equity modeling, and conjecture that similar problems in the fixed income market can be tackled in the same manner. Therefore, in this dissertation a short rate model is developed using the multi-factor Vasicek framework with the Brownian motion replaced by pure jump Lévy dynamics.

3.4.1 Motivation

A preliminary investigation (see [24]) of changes in 10 swap rates over the period 04/25/1994 to 10/14/2004 using the methods of independent component analysis (ICA for short) revealed that they may be written as mixtures of factors with the statistics given in Table 3.1.⁴ We notice that the first three factors have highly non-zero skewness of 1.05, -1.52, -1.12 and significant kurtosis of 21.62, 132.28, 68.27, respectively. The results indicate that classic Gaussian-based short rate models such as Vasicek and CIR, regardless of dimensionality, are inadequate to explain these time series data.

⁴We note that all ICA factors have zero means and unit variances by construction.

Table 3.1: ICA statistics for swap rates of 10 maturities

  Factor   Skewness   Kurtosis
    1        1.05       21.62
    2       -1.52      132.28
    3       -1.12       68.27
    4       -0.08       10.98
    5        0.15       11.72
    6        0.33       15.86
    7       -0.47        7.50
    8        0.37        6.50
    9        0.07        5.08
   10        0.0029      4.18

Besides, the statistics in Table 3.1 confirm the conclusion of Litterman [41] that 3 factors are sufficient to explain the variations in the yield curve. Lévy processes, as described in Chapter 2, are ideal candidates for incorporating the non-vanishing higher moments revealed in the time series. We are thus motivated to use a 3-factor Lévy dynamics to model the short rate in order to capture the characteristics shown in the table.

3.4.2 Model dynamics

Vasicek began modeling the short rate by studying economic equilibrium, but this step is skipped here; we simply borrow the conclusion.
The spot rate is written in a linear relationship with the 3 latent factors under the physical measure P as

    r(t) = \alpha + \beta' x(t),    (3.7)

where x(t) is a three-dimensional latent factor process, α is a real number and β is a 3-dimensional vector. We specify x(t) to follow the multidimensional Ornstein-Uhlenbeck (OU) process whose stochastic differential equation (SDE) is given as

    dx(t) = \left( a - \Gamma x(t) \right)dt + B\,dL(t),    (3.8)

where a is a 3-dimensional vector, Γ and B are full 3-by-3 matrices and L is a 3-dimensional vector of independent Lévy processes. The solution of the SDE in Eqn. (3.8) takes the form

    x(t) = e^{-\Gamma t} x(0) + \int_0^t e^{-\Gamma(t-s)}\left( a\,ds + B\,dL(s) \right).    (3.9)

The detailed derivation can be found in Appendix A.1. This specification determines the short rate dynamics under the physical measure P. However, pricing bonds and other contingent claims must be done under a risk-neutral measure Q. To change from the physical measure to the risk-neutral measure, we propose a specific form for the Radon-Nikodym derivative of Q with respect to P:

    \frac{dQ}{dP} = \prod_{i=1}^{3} \mathcal{E}\left( \left( e^{\gamma_i x - \theta_i |x|} - 1 \right) * (\mu_i - \nu_i) \right),    (3.10)

where μ_i is the random counting measure associated with the jumps of L_i, and ν_i is the compensator of μ_i under the P measure. γ_i and θ_i are coefficients for risk pricing, with (γ_i - θ_i) being the risk premium on positive jump risks and (γ_i + θ_i) the risk premium on negative jump risks. \mathcal{E}(\cdot) denotes the stochastic exponential, which makes the Radon-Nikodym derivative an exponential martingale. (For a complete treatment of the stochastic exponential see Protter [46].)

Under this proposed form of measure change, the risk-neutral process and the physical process remain in the same parametric family. Now that the model dynamics and measure change have been worked out, we choose the Variance Gamma (VG) process as the specific structure of the background driving Lévy process L(t). The model parameters are listed in Table 3.2.
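The dynamics (3.7)-(3.8) are later discretized by a first-order forward Euler scheme for the particle filter; a sketch of one such step is below. This is our own illustration: B is taken as the identity, Γ lower triangular (as in the reduced parameterization that follows), the VG increments are generated as differences of gamma variates, and all parameter values are hypothetical.

```python
import random

def vg_increment(dt, C, G, M, rng):
    # VG(dt; C, G, M) increment as a difference of two gamma variates
    return rng.gammavariate(C * dt, 1.0 / M) - rng.gammavariate(C * dt, 1.0 / G)

def short_rate(alpha, beta, x):
    # r = alpha + beta' x, Eqn. (3.7)
    return alpha + sum(b * xi for b, xi in zip(beta, x))

def euler_step(x, a, Gamma, B, dt, levy_params, rng):
    # One forward-Euler step of dx = (a - Gamma x) dt + B dL,
    # with L a vector of three independent VG processes.
    dL = [vg_increment(dt, p[0], p[1], p[2], rng) for p in levy_params]
    drift = [a[i] - sum(Gamma[i][j] * x[j] for j in range(3)) for i in range(3)]
    jump = [sum(B[i][j] * dL[j] for j in range(3)) for i in range(3)]
    return [x[i] + drift[i] * dt + jump[i] for i in range(3)]
```

With symmetric VG factors (G = M) and a = 0, the mean of each factor decays toward zero at the rate set by the diagonal of Γ, mirroring the mean reversion of the Gaussian OU case while retaining jump-driven skewness and kurtosis when G ≠ M.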
As shown in the left column, the original model has 43 parameters in total. Apparently, this large number of parameters would overwhelm the computer program for model estimation and hence substantially slow down the optimization convergence. Therefore, we seek to simplify the model configuration by removing redundancies in the model parameters. First, the matrix B serves to correlate dL(t) and create dependence among the factors x(t), whereas the matrix Γ has an identical effect; hence we can harmlessly collapse B to the constant identity matrix. Second, Γ does not need to be a full matrix: a lower triangular (positive definite) matrix suffices for its role yet saves 3 more parameters. Third, since we are primarily interested in pricing rather than the physical process of the short rate, we can assume the model parameters are under the risk-neutral measure; this way we eliminate the six measure change parameters γ_i and θ_i in Eqn. (3.10). The model parameterization is then reduced to only 25 parameters, as listed in the right column of Table 3.2. The model estimation is conducted in Chapter 4 for the reduced form.

Table 3.2: Numbers of model parameters for the original model and the reduced model

  Original parameter set           Reduced parameter set
  (physical parameters)            (risk-neutral parameters)
  Symbol         Number            Symbol         Number
  alpha          1                 alpha          1
  beta           3                 beta           3
  a              3                 a              3
  Gamma          9                 Gamma          6 (lower triangular matrix)
  B              9                 B              0 (constant identity matrix)
  x(0)           3                 x(0)           3
  L_{1,2,3}      9                 L_{1,2,3}      9
  gamma_{1,2,3}  3                 gamma_{1,2,3}  0
  theta_{1,2,3}  3                 theta_{1,2,3}  0
  Total          43                Total          25

3.5 Pricing of bonds and bond derivatives

In this section we discuss analytical solutions for pricing fixed income products. It is common practice to approach these problems by developing the characteristic function of the short rate (and its integral). As the building block for pricing derivatives, the bond pricing formula is derived first.
Given the bond pricer, it is straightforward to calculate the swap rate, which can be represented by a bond portfolio. Derivative pricing formulas, such as those for caps/floors and swaptions, are derived through the Fourier transform.

3.5.1 Joint characteristic function

We first write the characteristic function of the multi-factor OU process x(t) as follows:

    \varphi_{x(t)}(u) = E\left[ e^{iu'x(t)} \right]
                      = E\left[ e^{iu'e^{-\Gamma t}x(0) + iu'\int_0^t e^{-\Gamma(t-s)}(a\,ds + B\,dL(s))} \right]
                      = \exp\left( iu'e^{-\Gamma t}x(0) + iu'\left(I - e^{-\Gamma t}\right)\Gamma^{-1}a - \sum_{j=1}^{3}\int_0^t \psi_j\left( \left( u'e^{-\Gamma(t-s)}B \right)_j \right) ds \right)
                      = \exp\left( -\Psi_{x(t)}(u) \right),    (3.11)

where the characteristic function of each individual Lévy process is denoted by

    E\left[ e^{ivL_j(1)} \right] = \exp\left( -\psi_j(v) \right).

We may then write the characteristic exponent of x(t) as

    \Psi_{x(t)}(u) = -iu'e^{-\Gamma t}x(0) - iu'\left(I - e^{-\Gamma t}\right)\Gamma^{-1}a + \sum_{j=1}^{3}\int_0^t \psi_j\left( \left( u'e^{-\Gamma(t-s)}B \right)_j \right) ds.

Recall that the spot rate r(t) in Eqn. (3.7) is an affine combination of x(t); thus the characteristic function of r(t) is straightforwardly given as

    \varphi_{r(t)}(u) = E\left[ e^{iur(t)} \right] = E\left[ e^{iu\alpha + iu\beta'x(t)} \right] = e^{iu\alpha}\exp\left( -\Psi_{x(t)}(u\beta) \right).

To find the bond price, we are interested in the characteristic function of the integral of r(t). Below, we develop the joint characteristic function of the spot rate r(t) and its integral \int_0^t r(s)\,ds:

    \varphi_t(u,v) = E\left[ \exp\left( iu\int_0^t r(s)\,ds + ivr(t) \right) \right].

We develop as follows:

    E\left[ \exp\left( iu\int_0^t r(s)\,ds + ivr(t) \right) \right]
    = E\left[ \exp\left( iu\int_0^t \left( \alpha + \beta'x(s) \right)ds + iv\left( \alpha + \beta'x(t) \right) \right) \right]
    = e^{i\alpha(v+ut)}\,E\left[ \exp\left( iu\int_0^t \beta'e^{-\Gamma s}x(0)\,ds + iu\int_0^t\!\!\int_0^s \beta'e^{-\Gamma(s-w)}\left( a\,dw + B\,dL(w) \right)ds \right)
      \exp\left( iv\beta'e^{-\Gamma t}x(0) + iv\beta'\int_0^t e^{-\Gamma(t-w)}\left( a\,dw + B\,dL(w) \right) \right) \right]
    = e^{i\alpha(v+ut) + \left[ iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'e^{-\Gamma t} \right]x(0) + \left[ iut\beta' - iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'(I-e^{-\Gamma t}) \right]\Gamma^{-1}a}
      \times E\left[ \exp\left( \int_0^t\!\!\int_0^s iu\beta'e^{-\Gamma(s-w)}B\,dL(w)\,ds + \int_0^t iv\beta'e^{-\Gamma(t-w)}B\,dL(w) \right) \right].

For the final expectation we note first that

    \int_0^t\!\!\int_0^s iu\beta'e^{-\Gamma(s-w)}B\,dL(w)\,ds = \int_0^t\!\!\int_w^t iu\beta'e^{-\Gamma(s-w)}\,ds\,B\,dL(w) = \int_0^t iu\beta'\left( I - e^{-\Gamma(t-w)} \right)\Gamma^{-1}B\,dL(w).

The joint characteristic function is therefore given as

    \varphi_t(u,v) = \exp\Big( i\alpha(v+ut) + \left[ iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'e^{-\Gamma t} \right]x(0)
                                 + \left[ iut\beta' - iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'(I-e^{-\Gamma t}) \right]\Gamma^{-1}a \Big)
                     \times \exp\Big( -\sum_j \int_0^t \psi_j\Big( \big( \left( u\beta'(I-e^{-\Gamma(t-w)})\Gamma^{-1} + v\beta'e^{-\Gamma(t-w)} \right)B \big)_j \Big)\,dw \Big).    (3.12)

3.5.2 Bond

Equipped with the joint characteristic function, we can obtain the bond price P(0,t), given the factors at level x(0) = x, by setting u = i, v = 0 in Eqn. (3.12):

    P(0,t) = E\left[ \exp\left( -\int_0^t r(s)\,ds \right) \right] = \varphi_t(i,0) = e^{A},

where

    A = -\alpha t - \beta'\left( I - e^{-\Gamma t} \right)\Gamma^{-1}x + \left[ \beta'\left( I - e^{-\Gamma t} \right)\Gamma^{-1} - t\beta' \right]\Gamma^{-1}a
        - \sum_j \int_0^t \psi_j\left( i\left( \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1}B \right)_j \right) ds.

The matrix exponentials e^{-\Gamma t} involved can be calculated via the eigendecomposition

    \Gamma = T\Lambda T^{-1} = \sum_k \lambda_k Q_k, \qquad Q_k = T_k R'_k,

where T_k is the k-th column of T and R'_k is the k-th row of T^{-1}. We may then expand A as

    A = -\alpha t - \beta'\Gamma^{-1}x + \sum_j \beta'Q_j\Gamma^{-1}x\,e^{-\lambda_j t} - t\beta'\Gamma^{-1}a + \beta'\Gamma^{-2}a - \sum_j \beta'Q_j\Gamma^{-2}a\,e^{-\lambda_j t}
        - \sum_j \int_0^t \psi_j\left( i\left( \beta'\Gamma^{-1}B \right)_j - i\sum_k \left( \beta'Q_k\Gamma^{-1}B \right)_j e^{-\lambda_k(t-s)} \right) ds.

Since Lévy processes are time-homogeneous, the more general bond price P(s,t) can be expressed as

    P(s,t) = \exp\Big( -\alpha(t-s) - \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1}x(s) + \left[ \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1} - (t-s)\beta' \right]\Gamma^{-1}a \Big)
             \times \exp\Big( -\sum_j \int_0^{t-s} \psi_j\left( i\left( \beta'\left( I - e^{-\Gamma(t-s-w)} \right)\Gamma^{-1}B \right)_j \right) dw \Big)
           = \exp\left( a_{s,t} + b'_{s,t}x(s) \right),    (3.13)

where the coefficients a_{s,t} and b_{s,t} are defined as

    a_{s,t} = -\alpha(t-s) + \left[ \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1} - (t-s)\beta' \right]\Gamma^{-1}a
              - \sum_j \int_0^{t-s} \psi_j\left( i\left( \beta'\left( I - e^{-\Gamma(t-s-w)} \right)\Gamma^{-1}B \right)_j \right) dw
            = -\alpha(t-s) - (t-s)\beta'\Gamma^{-1}a + \beta'\Gamma^{-2}a - \sum_j \beta'Q_j\Gamma^{-2}a\,e^{-\lambda_j(t-s)}
              - \sum_j \int_0^{t-s} \psi_j\left( i\left( \beta'\Gamma^{-1}B \right)_j - i\sum_k \left( \beta'Q_k\Gamma^{-1}B \right)_j e^{-\lambda_k(t-s-w)} \right) dw,    (3.14)

    b'_{s,t} = -\beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1} = -\beta'\Gamma^{-1} + \sum_j \beta'Q_j\Gamma^{-1}e^{-\lambda_j(t-s)}.    (3.15)

3.5.3 Swaps

Interest rate swaps are central to the (over-the-counter, OTC) fixed income derivative market; they have proven very successful in managing risks and are arguably among the most successful innovations in the financial market. Many exotic fixed income derivatives are written with the swap rate as underlying.
The economic motivation behind swaps is attributed to comparative advantage: different companies can borrow at different rates in different markets, and they would be better off if they were allowed to borrow from markets they cannot access themselves. The interest rate swap contract is designed to enable companies with different borrowing conduits to exchange these comparative benefits; see Hull [35] for more economic explanations.

In technical terms, an interest rate swap is an agreement between two parties to exchange cash flows at a series of predefined future dates. In a vanilla payer (receiver) interest rate swap, one party agrees to pay (receive) a predetermined fixed rate on agreed dates and receive (pay) a floating rate, often the Libor (London InterBank Offer Rate) prevailing one period before. The payer swap can be specified as follows:

- a notional N and a fixed rate K;
- a number of future dates at which cash flows are exchanged, T_0 < T_1 < ... < T_n, with α_i = T_i - T_{i-1} being the time fraction;⁵
- the fixed leg pays Nα_iK and receives Nα_iL_{T_{i-1}}(T_{i-1},T_i) at time T_i, where L_{T_{i-1}}(T_{i-1},T_i) is the Libor rate prevailing at T_{i-1} for the period [T_{i-1},T_i].

⁵There is no cash flow changing hands on the first date T_0, but T_0 sets the first floating rate paid at T_1.

Pricing an interest rate swap amounts to finding the fair value of the fixed rate K such that it is costless (with respect to the current yield curve) for both sides to enter the contract at the initial time. To determine the swap contract value, let us have a closer look at a single payment at time T_i. From the payer swap holder's point of view, the present value of the net cash flow at time T_i is

    \Pi_i(t;K) = N\alpha_i P(t,T_i)\left( F(t;T_{i-1},T_i) - K \right) = N\left( P(t,T_{i-1}) - P(t,T_i) - K\alpha_i P(t,T_i) \right).    (3.16)

Thus the total discounted cash flow Π(t;K) is the sum of all future discounted cash flows:

    \Pi(t;K) = \sum_i \Pi_i(t;K) = N\left( P(t,T_0) - P(t,T_n) - K\sum_{i=1}^{n}\alpha_i P(t,T_i) \right).

The swap rate is obtained by setting Π(t;K) = 0, so that

    K = k_t(T_0,\ldots,T_n) = \frac{P(t,T_0) - P(t,T_n)}{\sum_{i=1}^{n}\alpha_i P(t,T_i)}    (3.17)

for all t < T_0. By Eqn. (3.17), the swap rate can be expressed as the ratio of two T-bond portfolios. Armed with the bond pricer developed in the previous section, swap pricing by this formula is straightforward.

3.5.4 Swaptions

A swaption is, as the name suggests, an option written on an underlying swap. A European vanilla payer/receiver swaption with strike rate K is an option giving the holder the right, but not the obligation, to enter a payer/receiver swap at a given future date, the maturity. Usually the maturity coincides with the first resetting date T_0 of the underlying swap, and the time from T_0 to T_n⁶ is called the tenor of the swaption.

⁶See Section 3.5.3 for the swap specification.

Recall that the value of a payer swap with fixed rate K at the first reset date T_0 is given by

    \Pi(T_0;K) = N\sum_{i=1}^{n}\alpha_i P(T_0,T_i)\left( F(T_0;T_{i-1},T_i) - K \right),

and the payoff of the swaption with strike K at maturity T_0 is

    w(T_0;K) = N\left( \sum_{i=1}^{n}\alpha_i P(T_0,T_i)\left( F(T_0;T_{i-1},T_i) - K \right) \right)^{+}.    (3.18)

A swaption can be viewed as an option on a bond portfolio, and this makes it very difficult to evaluate because it can hardly be decomposed into elementary payoffs. Since Π(T_0; K = k_{T_0}), with k_{T_0} being the swap rate at time T_0, must be zero, one can show that the payoff in Eqn. (3.18) can also be written as

    w(T_0;K) = N\left( k_{T_0} - K \right)^{+}\sum_{i=1}^{n}\alpha_i P(T_0,T_i).    (3.19)

Eqn. (3.19) extends easily to the value of the swaption at time t < T_0:

    w(t;K) = N\,E^{Q}\left[ e^{-\int_t^{T_0} r_s\,ds}\left( k_{T_0} - K \right)^{+}\sum_{i=1}^{n}\alpha_i P(T_0,T_i) \,\Big|\, \mathcal{F}_t \right],    (3.20)

where the expectation is taken under the risk-neutral measure Q. The swaption formula in Eqn. (3.20) indicates that the valuation problem can be viewed as valuing a vanilla option written on the swap rate, multiplied by the value of a traded bond portfolio.
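As a quick numeric check of the swap rate formula, Eqn. (3.17) can be evaluated directly from a vector of discount factors; on a flat continuously compounded curve the swap rate reduces exactly to the common simply compounded forward rate. The sketch below is ours, with a hypothetical flat 5% curve:

```python
import math

def swap_rate(P, alphas):
    # k_t = (P(t,T0) - P(t,Tn)) / sum_i alpha_i P(t,Ti), Eqn. (3.17).
    # P = [P(t,T0), P(t,T1), ..., P(t,Tn)]; alphas = [alpha_1, ..., alpha_n].
    annuity = sum(a * p for a, p in zip(alphas, P[1:]))
    return (P[0] - P[-1]) / annuity
```

Since sum_i alpha_i P_i F_i telescopes to P(t,T0) - P(t,Tn), the swap rate is an annuity-weighted average of the forward rates; with all forwards equal, the average is exact.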
The di?culty lies in that, under the risk-neutral measure, the discounting factor and bond portfolio multiplier are dependent on the payofi such that they could not be taken outside the expectation operator. 64 To overcome this di?culty, we consider change of numeraire technique as dis- cussed in Section 3.2.3. We take the new numeraire to be the bond portfolio value and call it forward swap measure, which is deflned by deQ dQ = ?s = e ?x52s0 r(w)dw Pn i=1 fiiP(s;Ti)Pn i=1 fiiP(0;Ti) (3.21) = E ??e Ys(y)?1 ? ?(???) ? for 0 < s < T0 where eQ is denoted as the forward swap measure. Under the measure eQ, the swap rate ks is a positive martingale with ks = E ((Ysswap(y)?1)?(??e?)) ; (3.22) where E is the stochastic exponential (see [46]). In the above equations, ? stands for the random jump measure, and ~? and ? the compensation jump measures associated with forward swap measure eQ and the risk neutral measure Q, respectively. The value of a swaption at time 0 with strike K is given by w(K) = N nX i=1 fiiP(0;Ti)Ex65Q?(kT0 ?K)+? : The problem is now reduced to evaluate the expectation of (kT0 ?K)+ under the measure eQ. We learned from Carr and Madan [13] that the evaluation of such an expectation can be done using the transform method, i.e. the Fourier Transform, given the closed-form formula for the characteristic function of the swap rate ks is readily known. We then develop the characteristic function for ks under the measure eQ. To approach it, we need to evaluate eYs(y), which is the response of the measure change 65 process ?s to a jump shock in the L?evy factors. Consider a jump of size yj in the j-th factor (j = 1;2;3). We have that d?js ?js = ?e Ys(yj)?1 ? : On the other hand, by the deflnition of Radon-Nikodym derivative ?s in Eqn. (3.21) we may observe that d?js ?js = nX i=1 fiiP(s;Ti)P n i=1 fiiP(s;Ti) ? eb0s;TiBjyj ?1 ? = nX i=1 !i(s) ? eb0s;TiBjyj ?1 ? 
: (3.23) Toreduce thecomplexityinEqn.(3.23), weadoptthe ideaofslow-varyingmartingale as in the BGM model [8], setting !i(s) ? !(0). Such an approximation is empirically valid, because in reality the variability of !i is much less than eb0s;TiBjyj?1. Therefore, Eqn.(3.23) is simplifled to be d?js ?js ? nX i=1 !i(0) ? eb0s;TiBjyj ?1 ? = nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ? eb0s;TiBjyj ?1 ? : Now we wish to see d?js?j s expressed in the exponential form d?js ?js = ?ea s(yj) ?1 ? : This implies that as(yj) = ln ? 1+ nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ? eb0s;TiBjyj ?1 ?! ; and taking a flrst order approximation gives as(yj) = nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ?b0 s;TiB ? j yj : 66 We then write eYs(y) = exp ? 3X j=1 nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ?b0 s;TiB ? j yj ! : We may then infer that e?s;j (eyj) = ecs;j?j (eyj) cs;j = nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ?b0 s;TiB ? j ; where e?s;j and ?s;j are Levy densities under eQ and Q measures for the j-th factor. FollowingL?evy-Khintchinetheoremwehavethecharacteristicexponent e?s;j(u) for the jth L?evy factor at time s under the new forward swap measure, e?s;j(u) = ? Z 1 ?1 ?eiuy j ?1 ?e? s;j(yj)dyj = ? Z 1 ?1 ?eiuy j ?1 ?ec s;jyj?s;j(yj)dyj = ? Z 1 ?1 ?e(iu+c s;j)yj ?ecs;jyj ?? s;j(yj)dyj = ? Z 1 ?1 ?e(iu+c s;j)yj ?1 ?? s;j(yj)dyj + Z 1 ?1 (ecs;jyj ?1)?s;j(yj)dyj = ?Qs;j(u?ics;j)??Qs;j(?ics;j); where ?Qs;j is the risk neutral characteristic exponent for Levy factor j at time s. We now determine Y swaps (y) in Eqn. (3.22). Consider now a jump of yj in the evolution of the j-th factor. We have that dks ks = (Y swap s (yj)?1) : Again, by direct computation dks ks = ? P(s;Tn) 1?P(s;Tn) ? e(b0s;TB)jyj ?1 ? ? nX i=1 fiiP(s;Ti)P n i=1 fiiP(s;Ti) ? eb0s;TiBjyj ?1 ? : 67 Following the similar approximations (slow varying martingale and Taylor expanded exponential), we show the solution of Y swaps (y) as Y swaps (y) = exp ? 3X j=1 ?s;jyj ! ?s;j = ? P(0;Tn)1?P(0;T n) b0s;TnBj ? 
We are now able to build the characteristic function of the logarithm of the swap rate k_{T_0} at maturity T_0 under the forward swap measure by noting that

  k_{T_0} = k_0 exp( Σ_{j=1}^3 ∫_0^{T_0} γ_{s,j} dF_j(s) - Σ_{j=1}^3 ∫_0^{T_0} ∫_{-∞}^{∞} ( e^{γ_{s,j} y_j} - 1 ) ν̃_{s,j}(y_j) dy_j ds ),

where F_j denotes the j-th Lévy factor. We recognize from this expression that

  E[ e^{iu ln k_{T_0}} ] = exp( -Ψ^{swap}_{T_0}(u) ),

where

  Ψ^{swap}_{T_0}(u) = -iu Σ_{j=1}^3 ∫_0^{T_0} ψ̃_{s,j}(-iγ_{s,j}) ds + Σ_{j=1}^3 ∫_0^{T_0} ψ̃_{s,j}(uγ_{s,j}) ds.

With the knowledge of Ψ^{swap}, we can fully utilize the FFT method in [13] to evaluate swaption prices across different strikes.

3.5.5 Caps/Floors

A cap (floor) is a strip of caplets (floorlets) that gives the holder protection against rising (declining) rates. Thus, a cap contract consists of the following elements:

- a notional N and a fixed strike K;
- a set of future dates T_0 < T_1 < ... < T_n with T_i - T_{i-1} = α_i;
- at each time T_i, i = 1,...,n, the holder of the cap contract receives the cash flow Nα_i ( L_{T_{i-1}}(T_{i-1},T_i) - K )^+.

A floor contrasts with a cap in that the floor contract holder receives the cash amount Nα_i ( K - L_{T_{i-1}}(T_{i-1},T_i) )^+ at each time T_i. It is important to note that the following equality, known as the cap/floor parity, holds:

  Cp(t,K) - Fl(t,K) = π_p(t,K),

where Cp(t,K) denotes the value of a cap at time t with strike K, Fl(t,K) the corresponding value of a floor, and π_p(t,K) the value of a payer swap^7.

In this section we show how to price the caplet under the model. The floorlet's valuation is then straightforward using the parity relationship. Because the present value of a cap is simply the sum of the present values of all its caplets prior to maturity, a cap can be viewed as a portfolio of options. This is an important distinction from a swaption contract, which is an option on a portfolio, and it makes the evaluation of a cap relatively easier than that of a swaption.
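Both the swaption and the caplet valuations in this section reduce, once a characteristic function is in hand, to the damped Fourier inversion of Carr and Madan [13]. As a hedged illustration of that transform step only (not the thesis's code, which would plug in the swap-rate or forward-rate characteristic functions derived here), the sketch below prices a vanilla call by direct quadrature of the Carr-Madan integrand, using a Black-Scholes characteristic function as a stand-in so the result can be checked against the closed form:

```python
import cmath
import math

def bs_char_fn(u, s0, r, sigma, t):
    """Characteristic function of ln S_T under Black-Scholes dynamics
    (a stand-in for the model's characteristic functions in the text)."""
    mu = math.log(s0) + (r - 0.5 * sigma * sigma) * t
    return cmath.exp(1j * u * mu - 0.5 * sigma * sigma * u * u * t)

def carr_madan_call(strike, s0, r, sigma, t, alpha=1.5, v_max=120.0, n=12000):
    """Damped-transform call price C(k) = e^{-alpha k}/pi * Int_0^inf
    Re[e^{-ivk} psi(v)] dv, evaluated here by trapezoidal quadrature."""
    k = math.log(strike)
    h = v_max / n
    total = 0.0
    for m in range(n + 1):
        v = m * h
        phi = bs_char_fn(v - (alpha + 1.0) * 1j, s0, r, sigma, t)
        psi = math.exp(-r * t) * phi / (
            alpha * alpha + alpha - v * v + 1j * (2.0 * alpha + 1.0) * v)
        w = 0.5 if m in (0, n) else 1.0      # trapezoid end-point weights
        total += w * (cmath.exp(-1j * v * k) * psi).real
    return math.exp(-alpha * k) / math.pi * total * h
```

In production one would replace the quadrature by an FFT over a strike grid, as [13] does; the direct integral is shown because it makes the damping factor e^{-αk} and the shifted argument v - (α+1)i explicit.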
A single caplet with reset date T_{i-1} and settlement date T_i pays the holder the notional N multiplied by the difference between the Libor rate L_{T_{i-1}}(T_{i-1},T_i) (set at time T_{i-1} and maturing at time T_i) and the strike K if L_{T_{i-1}}(T_{i-1},T_i) > K, and zero otherwise. Therefore the value w_i(K) of such a caplet at the initial time is

  w_i(K) = Nα_i E^Q[ e^{-∫_0^{T_i} r_s ds} ( L_{T_{i-1}}(T_{i-1},T_i) - K )^+ ],

where the expectation is taken under the risk-neutral measure Q.

^7 The cap, floor and swap must have exactly coinciding dates for the future cash flow exchanges.

Similar to the swaption case, the above expression is inconvenient for evaluation, so we seek a convenient measure change. The new measure, termed the T_i-forward measure, is defined in the following way for s < T_i:

  dQ̃^{T_i}/dQ = Λ_{T_i}(s) = e^{-∫_0^s r(w)dw} P(s,T_i) / P(0,T_i)   (3.24)
               = E( (e^{Y^{T_i}_s(y)} - 1) ∗ (μ - ν) ).

We show that

  dΛ^j_{T_i}(s) / Λ^j_{T_i}(s) = exp( b'_{s,T_i}B_j y_j ) - 1,

where y_j is the jump in the j-th jump component (j = 1,2,3). After algebraic manipulation and approximation identical to the swaption case, we have

  e^{Y^{T_i}_s(y)} = exp( Σ_{j=1}^3 b'_{s,T_i}B_j y_j ).

The characteristic exponent of the j-th jump component under the forward measure is therefore given by

  ψ̃^{T_i}_{s,j}(u) = -∫_{-∞}^{∞} ( e^{iuy_j} - 1 ) ν̃_{s,j}(y_j) dy_j
                   = -∫_{-∞}^{∞} ( e^{iuy_j} - 1 ) e^{b'_{s,T_i}B_j y_j} ν_{s,j}(y_j) dy_j
                   = -∫_{-∞}^{∞} ( e^{(iu + b'_{s,T_i}B_j)y_j} - e^{b'_{s,T_i}B_j y_j} ) ν_{s,j}(y_j) dy_j
                   = -∫_{-∞}^{∞} ( e^{(iu + b'_{s,T_i}B_j)y_j} - 1 ) ν_{s,j}(y_j) dy_j + ∫_{-∞}^{∞} ( e^{b'_{s,T_i}B_j y_j} - 1 ) ν_{s,j}(y_j) dy_j
                   = ψ^Q_j( u - i b'_{s,T_i}B_j ) - ψ^Q_j( -i b'_{s,T_i}B_j ),

where ψ^Q_j denotes the characteristic exponent of the j-th jump component under the original risk-neutral measure Q.

Now we look at the dynamics of the underlying forward rate L_i(s) = F(s; s, T_i), the simply compounded spot rate prevailing between time s and T_i, defined as

  L_i(s) = (1/α_i) ( 1/P(s,T_i) - 1 ) = (1 - P(s,T_i)) / (α_i P(s,T_i)),

where α_i = T_i - s. It is easy to see that under the forward measure Q̃^{T_i}, L_i(s) is a martingale. Analogous to the swaption case, we define

  dL_i(s) / L_i(s) = Y_{α,s}(y_j) - 1
and we determine Y_{α,s}(y) from the analysis of

  dL_i(s) / L_i(s) = -[P(s,T_i) / (1 - P(s,T_i))] ( e^{Σ_j b'_{s,T_i}B_j y_j} - 1 ) - ( e^{Σ_j b'_{s,T_i}B_j y_j} - 1 )
                   = -[1 / (1 - P(s,T_i))] ( e^{Σ_j b'_{s,T_i}B_j y_j} - 1 ).

Hence, we approximate

  Y_{α,s}(y) = exp( -Σ_{j=1}^3 b'_{s,T_i}B_j y_j / (1 - P(0,T_i)) ).

We then obtain the characteristic exponent of the logarithm of L_i(T_{i-1}) = F(T_{i-1}; T_{i-1}, T_i), since the cap contract payoff is known at time T_{i-1}:

  E[ e^{iu ln L_i(T_{i-1})} ] = exp( -Ψ^i_{i-1}(u) ),

and we have

  Ψ^i_{i-1}(u) = -iu Σ_{j=1}^3 ∫_0^{T_{i-1}} ψ̃^{T_i}_{s,j}(-iγ_{s,j}) ds + Σ_{j=1}^3 ∫_0^{T_{i-1}} ψ̃^{T_i}_{s,j}(uγ_{s,j}) ds,
  γ_{s,j} = -b'_{s,T_i}B_j / (1 - P(0,T_i)).

Once we have obtained the characteristic function of the logarithm of the forward rate at time T_{i-1}, the rest of the calculation is similar to the swaption case: the FFT method is applied to compute the value of each caplet paying off at time T_i. The price of a cap is then the sum of all caplet values, and one can use the parity equation to deduce the value of a floor with the same structure as the cap.

Chapter 4

Particle Filter and Parameter Estimation

4.1 Background

The mission of assessing the empirical validity of interest rate models in general, and of our Lévy based model in particular, is of obvious importance. Before engaging in any performance comparison with other models, we should have a clear picture of how our model parameters influence the observable market quantities, such as bond prices. That is, the model should first be sensibly estimated.

The problem has two aspects. From the theoretical point of view, we have developed a state-space model containing time-invariant parameters and a vector of latent state variables driven by a multi-factor Lévy process. The short rate is an affine combination of the latent factors. Bond prices and derivative values are non-linear functions of the current state variables, given the model parameters. From the empirical perspective, time series of market prices for the past 10 years are available.
However, market prices are noisy data contaminated with trading errors, bid/ask spreads, market mis-specification, etc. The task here is to design an algorithm that accounts for both the model structure and the noisy data supply in order to obtain a sound estimate of the model parameters.

Parameter estimation, also called "calibration" in financial engineering lingo, is an optimization procedure that minimizes the difference between the model outputs and the market inputs. The challenge in estimating a state-space model is that the model outputs depend on the values of state variables that are not directly observable. Therefore, the critical step in estimating such models is to retrieve the values of the latent state variables by filtering out the aforementioned noises. Once the state variables are determined, the optimization is carried out by minimizing the pricing errors. As a special case, if the errors are assumed to be white noise following a multidimensional normal distribution, maximum likelihood estimation can be invoked.

Specifically, model estimation in the state-space framework with time series data consists of two steps:

(1) Take an initial guess of the model parameters (or the parameter values returned from the last optimization iteration); estimate the state variables sequentially through time by filtering; generate model prices based on the filtered state variables.

(2) Maximize the joint likelihood function of the errors between the model prices and the market prices.

Step (1) is the focus of this chapter; that is, we will focus on designing algorithms that can sequentially retrieve the unobservable state variables from a large set of noisy signals. The Kalman filter (KF) is the standard tool in this area, and under the assumptions that the state variables are driven by Brownian motion and that the measurement/propagation functions are linear, the KF has proved very capable of providing efficient, robust estimation.
We have seen massive applications of this technique in the area of financial engineering for model calibration, price prediction, etc. For instance, one investment strategy that hedge funds and banks' proprietary trading desks often use is to build an equilibrium model, cast it into state-space form, estimate it using the KF technique, make projections on price movements, and trade the difference between the model prediction and the market quotes. If the model indeed describes the correct market dynamics, market prices should converge (statistically) to the predicted prices and traders pocket the difference. The performance of such strategies crucially depends on the quality of the implementation of the filter technique used for model estimation and prediction.

A KF-like filter performs well if we live in a perfectly linear and Gaussian world. Unfortunately, the real world is not even close to perfect. As pointed out in Chapter 3, non-Gaussianity is significant in the fixed-income world, and pricing functions are highly nonlinear. Whereas the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), as extensions of the original KF, were devised to cope with the non-linearity, the non-Gaussianity cannot be rescued within the restrictive Kalman filter framework.

Therefore we resort to the particle filter (PF), a newly emerged filtering technique that has not been widely applied due to its high demand for computing power. However, as this demand is being eased by the recent dramatic reduction in computing cost, the particle filter has become an attractive alternative for handling non-Gaussian and non-linear problems. Unlike the KF, where the distribution is completely determined by the variance-covariance specification, the PF relies on a large number of simulated particles to represent the distribution.
The distribution is moved forward by propagating the entire set of particles, while under the KF only the mean and variance-covariance matrix need to be updated. The propagation of the particles is realized by Monte Carlo simulation, which is why the PF is also referred to in some literature as the sequential Monte Carlo method.

In this chapter, the Kalman filter and its extensions are discussed first, given their prominent status in state-space model estimation. Next we focus on explaining the particle filter technique, studying generic algorithms including the resampling technique. The generic particle filter algorithm is then customized to our model context and numerical results are presented. After the model is estimated, we perform factor analysis on the yield curve and experiment with caplet pricing using Monte Carlo simulation.

4.2 The problem and conceptual solution

Let us introduce the filtering problem. Define x_k ∈ R^{n_x} as the state vector, where n_x is the dimension of the state variable and k is the time index. The evolution of x_k follows the discrete-time stochastic model

  x_k = f_{k-1}(x_{k-1}, v_{k-1}),   (4.1)

where v_{k-1} is the randomness in the state evolution, and f_{k-1} is the propagation function perturbed by the unforeseeable disturbance v_{k-1}. The measurements are functions of the state variables x_k and the measurement noise w_k:

  z_k = h_k(x_k, w_k),   (4.2)

where z_k ∈ R^{n_z}, and h_k^1 is a known, possibly non-linear function. The error w_k is often assumed to be white noise.

^1 It is noteworthy that, for convenience, in many cases both the evolution function f_k and the measurement function h_k are taken to be time-homogeneous, so that the time index k can be dropped.

The challenge is to extract the values of the unobservable x_k for all k based on the observations Z_k ≜ {z_i, i = 1,...,k}, which become available over time. Because the x_k are themselves random variables, the problem is essentially to discover the posterior probability density function (PDF) p(x_k | Z_k), conditioned on the observations Z_k.
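A minimal sketch of the generic pair (4.1)-(4.2): the scalar mean-reverting state and the nonlinear, bond-price-style measurement below are hypothetical illustrative choices, not the thesis's 3-factor model.

```python
import math
import random

def f(x_prev, v):
    """State propagation (4.1): one mean-reverting Euler step plus shock v.
    kappa, theta, dt are illustrative constants."""
    kappa, theta, dt = 0.5, 0.04, 1.0 / 52.0
    return x_prev + kappa * (theta - x_prev) * dt + v

def h(x, w):
    """Measurement (4.2): a nonlinear pricing-style function plus noise w."""
    return math.exp(-x) + w

# Simulate one year of weekly states and noisy observations.
random.seed(0)
xs, zs = [0.04], []
for k in range(52):
    xs.append(f(xs[-1], random.gauss(0.0, 0.002)))
    zs.append(h(xs[-1], random.gauss(0.0, 0.001)))
```

The filtering problem stated above is exactly the inverse task: given only the sequence `zs`, recover the distribution of the hidden `xs`.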
Assuming the initial PDF p(x_0) ≜ p(x_0 | Z_0) is known, we invoke Bayes' theorem to recursively update the posterior distribution of x_k at each k. There are two steps in this task: prediction and update.

1. First we predict the PDF p(x_k | Z_{k-1}) by applying the Chapman-Kolmogorov equation

  p(x_k | Z_{k-1}) = ∫ p(x_k | x_{k-1}, Z_{k-1}) p(x_{k-1} | Z_{k-1}) dx_{k-1}
                   = ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{k-1}) dx_{k-1}.   (4.3)

Note: from the first line to the second line of Eqn. (4.3) we used the Markov property, namely that the information in Z_{k-1} is already contained in x_{k-1}.

2. Second, after the observation z_k becomes available, Bayes' rule is used to update the posterior:

  p(x_k | Z_k) = p(z_k | x_k) p(x_k | Z_{k-1}) / p(z_k | Z_{k-1}),   (4.4)

where the normalizing factor in the denominator can be written as

  p(z_k | Z_{k-1}) = ∫ p(z_k | x_k) p(x_k | Z_{k-1}) dx_k.

The conceptual solution in Eqn. (4.4) unfortunately cannot, in general, be computed exactly: storing the entire PDF is equivalent to storing an infinite-dimensional vector, so one has to resort to sub-optimal algorithms that approximate the solution. This difficulty is considerably alleviated, however, for models assuming linearity and Gaussian dynamics. In the next section we demonstrate how the Kalman filter leads to an optimal solution under these ideal assumptions.

4.3 Kalman filter

The Kalman filter assumes that the posterior PDF at every time step is Gaussian and hence completely characterized by its mean vector and covariance matrix. In other words, if p(x_{k-1} | Z_{k-1}) is normally distributed, it can be proved that p(x_k | Z_k) is also normal provided the following conditions on Eqn. (4.1) and (4.2) are satisfied:

- v_{k-1} and w_k are drawn from normal distributions;
- x_k = F_{k-1} x_{k-1} + v_{k-1}, with F_{k-1} an n_x × n_x matrix, so that f_{k-1}(x_{k-1}, v_{k-1}) is linear in x_{k-1} and v_{k-1};
- z_k = H_k x_k + w_k, with H_k an n_z × n_x matrix, so that h_k(x_k, w_k) is linear in x_k and w_k.

We further denote by Q_{k-1} and R_k the covariance matrices of the white noises v_{k-1} and w_k respectively, which are mutually independent.

The Kalman filter can then be viewed as the following recursion:

  p(x_{k-1} | Z_{k-1}) = N(x_{k-1}; x̂_{k-1|k-1}, P_{k-1|k-1})   (4.5)
  p(x_k | Z_{k-1}) = N(x_k; x̂_{k|k-1}, P_{k|k-1})   (4.6)
  p(x_k | Z_k) = N(x_k; x̂_{k|k}, P_{k|k})   (4.7)

where N(x; m, P) is the density of the normal distribution with mean m and covariance P; x̂_{k|k} and P_{k|k} denote the posterior estimates of the mean and covariance, and x̂_{k|k-1} and P_{k|k-1} the prior estimates before the newest update. With this notation, the Kalman filter algorithm is carried out as follows:

  x̂_{k|k-1} = F_{k-1} x̂_{k-1|k-1}   (4.8)
  P_{k|k-1} = Q_{k-1} + F_{k-1} P_{k-1|k-1} F'_{k-1}   (4.9)
  x̂_{k|k} = x̂_{k|k-1} + K_k ( z_k - H_k x̂_{k|k-1} )   (4.10)
  P_{k|k} = P_{k|k-1} - K_k S_k K'_k   (4.11)

where S_k = H_k P_{k|k-1} H'_k + R_k is the covariance of the innovation term, and K_k = P_{k|k-1} H'_k S_k^{-1} is the so-called Kalman gain.

Looking at Eqn. (4.11), we see how the Kalman filter reduces the covariance of the state variables upon the reception of new information: the posterior covariance P_{k|k} equals the prior covariance P_{k|k-1} less K_k S_k K'_k, the amount of variance reduction contributed by the new observation z_k through the Kalman gain.

Equations (4.8) to (4.11) constitute the complete recipe of the Kalman filter. The Kalman filter is the optimal solution when the aforementioned assumptions hold. However, in cases where the measurement functions are non-linear, sub-optimal solutions such as the extended Kalman filter (EKF) and the unscented Kalman filter (UKF) (see [20, 27, 28] for details) have been devised to cope with the broken assumptions.

1. If the functions f(x,v) or h(x,w) (or both) are almost linear, we approximate them by first-order Taylor expansion and replace them with their Jacobian matrices. This is the extended Kalman filter.

2.
If the non-linearity in f(x,v) or h(x,w) is high, we use the unscented transformation, which propagates a deterministic set of sigma points to approximate the distribution. This method is the unscented Kalman filter, or UKF.

Both the EKF and the UKF attempt to remedy the non-linearity violation in the original KF, but neither can theoretically handle the problem if the state variables are non-Gaussian. It is crucial in our case, however, to relax the Gaussianity restriction, since the state variables in our model are driven by pure jump Lévy processes. We therefore resort to the counterpart of the KF: the particle filter.

4.4 Particle filter

Reality often manifests itself as being very complex: non-Gaussian and non-linear, with continuous or discontinuous state space. The Kalman filter, albeit theoretically sound, is therefore not applicable in many practical situations. The particle filter emerged as an alternative algorithm to substitute for the Kalman filter in the non-Gaussian case. Instead of updating only the mean and covariance, the PF uses a large number of points (particles) to empirically approximate the posterior distribution of interest.

Admittedly, in comparison to the optimal Kalman filter, particle filters are sub-optimal. Obviously, a finite number of simulated particles cannot completely recover a continuous distribution, though more particles increase the precision. However, the number of particles cannot grow too large in practical state-space model estimation, since one needs to perform the simulation at every time step. This heavy computational load prohibited the application of the PF in the past, when computing cost was high. Given enough computing power nowadays, however, the particle filter technique can be very helpful and offers great flexibility in estimating state-space models without restrictive assumptions. Our 3-factor Lévy fixed income model is a perfect application in this regard.
In this section we discuss the particle filter in its generic form. Importance sampling is first reviewed in the context of PF implementation. Sequential importance sampling (SIS), the simplest PF implementation, is then investigated. As naive SIS introduces the degeneracy problem, the resampling technique is discussed as a way to reduce the degree of degeneracy.

4.4.1 Importance sampling in the particle filter

The mechanism of importance sampling was demonstrated in Eqn. (2.13) to (2.16); the idea behind it is to make the random draws from a concentrated region that matters most for the calculation of interest. In the particle filter context, since we are interested in finding the posterior distribution p_X(·), we want to generate particles from a guessed importance distribution q_X(·) that is preferably close to the true posterior. Indeed, the more educated the guess of q_X(·), the more accurate the calculation of p_X(·) via Bayes' theorem. It can be shown that, to some extent, the quality of the parameter estimation depends on the choice of the importance distribution. The selection of the importance distribution is addressed later in this chapter.

Recall that, in order to apply importance sampling, the Radon-Nikodym derivative ω̃(·) needs to be calculated for each particle x^i as

  ω̃(x^i) = p_X(x^i) / q_X(x^i).   (4.12)

Since p_X(·) is not known beforehand, we cannot rely on Eqn. (4.12) directly to compute the weights ω̃(x^i). To bypass this difficulty, normalization is introduced, so the new weight for each draw i is obtained as

  ω(x^i) = ω̃(x^i) / Σ_i ω̃(x^i).   (4.13)

The normalization procedure will be discussed in detail in the next section.

4.4.2 Sequential importance sampling

Now we discuss how to apply importance sampling within the particle filter, based on the conceptual solution in Eqn. (4.4).
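Before specializing to the sequential setting, the plain weight computations (4.12)-(4.13) can be sketched directly. The target and proposal densities below are toy stand-ins (a standard normal shape and a wider normal), not the thesis's posteriors:

```python
import math
import random

def target_pdf(x):
    """Unnormalized target p_X: standard-normal shape (toy example)."""
    return math.exp(-0.5 * x * x)

def proposal_pdf(x):
    """Proposal q_X: normal density with standard deviation 2."""
    return math.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * math.sqrt(2.0 * math.pi))

random.seed(1)
xs = [random.gauss(0.0, 2.0) for _ in range(20000)]        # draws from q_X
w_tilde = [target_pdf(x) / proposal_pdf(x) for x in xs]    # Eqn. (4.12)
z = sum(w_tilde)
w = [wt / z for wt in w_tilde]                             # Eqn. (4.13)

# Self-normalized estimate of E_p[X]; the true value is 0 here.
est_mean = sum(wi * xi for wi, xi in zip(w, xs))
```

Note that the normalization (4.13) is what lets us work with an *unnormalized* target: any constant factor in p_X cancels between numerator and denominator.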
We denote by X_k = {x_j : j = 1,...,k} the sequence of state variables x and by Z_k = {z_j : j = 1,...,k} the sequence of observations z, both up to time k. The joint posterior distribution at time k is then estimated as

  p(X_k | Z_k) ≈ Σ_{i=1}^N ω^i_k δ(X_k - X^i_k),   (4.14)

where δ(·) is the delta function. The calculation of the weights ω^i_k follows Eqn. (4.13). In particular, if samples are drawn from an importance density q(X_k | Z_k) other than the true density, then the weights can be expressed as

  ω^i_k ∝ p(X^i_k | Z_k) / q(X^i_k | Z_k).   (4.15)

Suppose at time k-1 we have samples constituting an approximation of p(X_{k-1} | Z_{k-1}), and we are interested in obtaining p(X_k | Z_k) at time k upon the reception of the new observation z_k. If the importance distribution is chosen to have the factorization

  q(X_k | Z_k) = q(x_k | X_{k-1}, Z_k) q(X_{k-1} | Z_{k-1}),   (4.16)

one can then augment the existing samples X^i_{k-1} ~ q(X_{k-1} | Z_{k-1}) with the new state x^i_k ~ q(x_k | X_{k-1}, Z_k) to obtain samples X^i_k ~ q(X_k | Z_k).

To derive the weight update, the PDF p(X_k | Z_k) can be expanded as follows (for the derivation see Appendix (A.3)):

  p(X_k | Z_k) ∝ p(z_k | x_k) p(x_k | x_{k-1}) p(X_{k-1} | Z_{k-1}).   (4.17)

Plugging Eqn. (4.16) and (4.17) into Eqn. (4.15), the weight update equation can be shown to be

  ω^i_k ∝ [p(z_k | x^i_k) p(x^i_k | x^i_{k-1}) p(X^i_{k-1} | Z_{k-1})] / [q(x^i_k | X^i_{k-1}, Z_k) q(X^i_{k-1} | Z_{k-1})]
        = ω^i_{k-1} p(z_k | x^i_k) p(x^i_k | x^i_{k-1}) / q(x^i_k | X^i_{k-1}, Z_k).   (4.18)

The relationship in Eqn. (4.18) can be simplified by the Markov property to drop the dependence on the history,

  q(x^i_k | X^i_{k-1}, Z_k) = q(x^i_k | x^i_{k-1}, z_k),   (4.19)

so that the importance density depends only on the most recent state and observation, x_{k-1} and z_k. This reduction is particularly helpful when only the filtered posterior p(x_k | Z_k) is estimated sequentially and large storage of past information is undesirable. After the simplification, the weight update equation reads

  ω^i_k ∝ ω^i_{k-1} p(z_k | x^i_k) p(x^i_k | x^i_{k-1}) / q(x^i_k | x^i_{k-1}, z_k).   (4.20)

And the updated posterior density is approximated as

  p(x_k | Z_k) ≈ Σ_{i=1}^N ω^i_k δ(x_k - x^i_k).   (4.21)
The sequential importance sampling (SIS) algorithm thus propagates the posterior distribution, with new updates at each time step, by Monte Carlo simulating a large number of particles. With the entire posterior distribution obtained sequentially, we can perform any relevant calculation involving the distribution of the state variables, such as evaluating the likelihood function.

4.4.3 Selection of the importance density

The quality of the particle filter estimation crucially depends on the selection of the importance density q(x^i_k | x^i_{k-1}, z_k) in Eqn. (4.20). The optimal choice would minimize the variance of the importance weights; however, except in a few special cases, finding the optimal choice is infeasible. Here we present the most widely used suboptimal choice, the so-called transitional prior,

  q(x^i_k | x^i_{k-1}, z_k) = p(x^i_k | x^i_{k-1}).   (4.22)

This choice is very popular due to its simplicity. With this prior, the weight update function Eqn. (4.20) conveniently reduces to

  ω^i_k ∝ ω^i_{k-1} p(z_k | x^i_k).   (4.23)

The SIS algorithm under this choice of prior distribution is summarized in Table 4.1. The transitional prior works fine if most of the prior's distribution mass lies on a narrower domain than that of the likelihood function p(z_k | x_k); otherwise problems such as fast degeneracy and impoverishment arise. Improved algorithms to handle these problems have been developed, and one of them, the resampling technique, is discussed in the next section. Interested readers are referred to Chapter 3 of [27] for a more detailed account.

4.4.4 Resampling

Any suboptimal choice of the prior distribution, such as the transitional prior discussed in the previous section, causes the degeneracy problem. Specifically, it is

[{x^i_k, ω^i_k}_{i=1}^N] = SIS([{x^i_{k-1}, ω^i_{k-1}}_{i=1}^N, z_k])
- For i = 1 : N
  - Draw x^i_k from the prior distribution q(x^i_k | x^i_{k-1}, z_k) = p(x^i_k | x^i_{k-1});
  - Evaluate the non-normalized weights according to Eqn. (4.23): ω̃^i_k = ω^i_{k-1} p(z_k | x^i_k)
- End For
- Calculate the normalizing factor Z = Σ_{j=1}^N ω̃^j_k
- For i = 1 : N
  - Normalize: ω^i_k = ω̃^i_k / Z
- End For

Table 4.1: Sequential importance sampling algorithm

found that a suboptimal prior can increase the variance of the importance weights over time. As a consequence, it is very likely that, after a few time steps, one particle picks up almost all the weight while the other particles weigh little. The degeneracy problem leads to a large variance of the posterior distribution and harms the estimation. Unfortunately, it is theoretically unavoidable when applying the suboptimal sequential importance sampling algorithm.

Whereas it is impossible to avoid degeneracy entirely, we can seek to reduce it. The degree of degeneracy can be measured by the effective sample size N̂_eff, defined as

  N̂_eff = 1 / Σ_i (ω^i_k)^2.

The smaller N̂_eff is, the stronger the degeneracy. N̂_eff is monitored over time; once a pre-defined threshold N_thr is breached, resampling is executed to lessen the degree of degeneracy.

The resampling procedure splits the particles that carry large weights into smaller particles according to their probabilities; technically, it maps a random measure {x^i_k, ω^i_k} into another random measure {x^{i*}_k, 1/N} of equal weights. The new random measure is generated by sampling with replacement N times from the approximate discrete posterior p(x_k | Z_k) given by Eqn. (4.21), such that P{x^{i*}_k = x^j_k} equals ω^j_k. Therefore, after resampling, a new set of particles with uniform weights is available for the next step of propagation. The mechanism of the resampling procedure is illustrated in Fig. 4.1, and the algorithm is summarized in Table 4.2.

[Figure 4.1: Resampling scenario and mechanism in particle filter]

[{x^{i*}_k, ω^i_k}_{i=1}^N] = RESAMPLE([{x^i_k, ω^i_k}_{i=1}^N, z_k])
- Initialize c_1 = ω^1_k
- For i = 2 : N
  - c_i = c_{i-1} + ω^i_k
- End For
- Draw a random number u_1 ~ U[0, 1/N]; set i = 1
- For j = 1 : N
  - Let u_j = u_1 + (j-1)/N
  - While u_j > c_i: i = i + 1
  - Assign x^{j*}_k = x^i_k and ω^j_k = 1/N
- End For

Table 4.2: Resampling algorithm

4.4.5 Generic particle filter algorithm

We conclude this section by summarizing, in Table 4.3, the algorithm of the generic particle filter, which combines sequential importance sampling (SIS) with resampling. At time step k the algorithm generates N new pairs {x^i_k, ω^i_k}, i ∈ {1,...,N}, from the input {x^i_{k-1}, ω^i_{k-1} = 1/N}.

[{x^i_k, ω^i_k}_{i=1}^N] = PF([{x^i_{k-1}, ω^i_{k-1}}_{i=1}^N, z_k])
- Filter via SIS in Table 4.1
- Resample using the algorithm in Table 4.2: [{x^i_k, ω^i_k = 1/N}_{i=1}^N] = RESAMPLE([{x^i_k, ω^i_k}_{i=1}^N, z_k])

Table 4.3: Particle filter algorithm

4.5 Estimating the 3-factor Lévy short rate model

In this section we first describe the data set used in the estimation and then tailor the generic particle filter algorithm to our model estimation context. Numerical results are then presented and analyzed.

4.5.1 Data description

By courtesy of Caspian Capital Management LLP, we are provided with over-the-counter Eurodollar future rates and constant maturity swap (CMS) rates across different maturities and tenors.

Specifically, the first 3 columns of data are rates reflected from the Eurodollar futures contracts, which are written on 3-month Libor rates for 3 different maturities of T = 3, 6, 12 months. The next 6 columns are swap rates of 6 different maturities, T_n = 2, 3, 5, 10, 15, 30 years. In those swap contracts, the fixed-rate coupon payments are made semi-annually. We therefore have 9 columns of data in total.

The data set contains 10 years of daily market quotes from 04/25/1994 through 10/13/2004. To avoid weekday effects in the estimation, we sample the data only weekly, on every Wednesday. This makes the sampled data set contain 9 × 538 = 4842 entries.
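Before turning to the estimation itself, the generic algorithm of Tables 4.1-4.3 (transitional-prior SIS plus systematic resampling) can be sketched end to end. Python is used here purely for illustration (the thesis's implementation is in Matlab), and the scalar random-walk model with Gaussian measurement noise is a toy stand-in, not the thesis's 3-factor Lévy model:

```python
import math
import random

def systematic_resample(particles, weights):
    """Table 4.2: map {x^i, w^i} into N equally weighted particles."""
    n = len(particles)
    c, acc = [], 0.0
    for w in weights:                 # cumulative weights c_i
        acc += w
        c.append(acc)
    u1 = random.uniform(0.0, 1.0 / n)
    i, out = 0, []
    for j in range(n):
        uj = u1 + j / n
        while i < n - 1 and uj > c[i]:
            i += 1
        out.append(particles[i])
    return out

def particle_filter(zs, n=500, sig_v=0.05, sig_w=0.1):
    """Tables 4.1 + 4.3: transitional-prior SIS, resampling at every step."""
    random.seed(42)
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]   # initial particle cloud
    means = []
    for z in zs:
        # Propagate: draw from p(x_k | x_{k-1}) (the transitional prior, 4.22).
        xs = [x + random.gauss(0.0, sig_v) for x in xs]
        # Weight by the likelihood p(z_k | x_k) (Eqn. 4.23), then normalize.
        wt = [math.exp(-0.5 * ((z - x) / sig_w) ** 2) for x in xs]
        s = sum(wt)
        w = [v / s for v in wt]
        means.append(sum(wi * xi for wi, xi in zip(w, xs)))  # posterior mean
        xs = systematic_resample(xs, w)                      # Table 4.2
    return means
```

Resampling at every step (rather than only when N̂_eff falls below N_thr) is a simplification; monitoring N̂_eff as in Section 4.4.4 would only add a guard around the last line of the loop.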
4.5.2 Quasi-maximum likelihood estimation with the particle filter

Quasi-maximum likelihood estimation (MLE)^2 is applied to calibrate the model. The continuous-time dynamics of the short rate factors is first cast into a discrete version, as the data are sampled weekly. The particle filter then retrieves the values of the state variables sequentially at every sampled time step. Based on the filtered state variables, the pricing equations output the model prices, and the pricing errors are simply the differences between the market prices and the model prices. The errors are assumed to be white noise with pre-estimated variances, and the error likelihood functions are computed via the multidimensional normal density. The concrete procedures are given below.

^2 It is called the quasi-maximum likelihood method because the likelihood function is, strictly speaking, the conditional likelihood function.

We cast the SDE of the state variables x(t) in Eqn. (3.8) and (3.9), by Euler approximation, into the discrete propagation equation

  x_t = (I - e^{-κΔt}) ā + e^{-κΔt} x_{t-1} + B ΔL_t,   (4.24)

where I is the 3 × 3 identity matrix and ΔL_t is the Lévy randomness. With weekly sampling frequency, we set Δt = 7/365.

We construct measurement functions for Eurodollar futures and swap rates of different kinds, assuming additive, normally distributed measurement errors:

  z_t = O(x_t; Θ) + e_t,   (4.25)

where z_t denotes the observable prices at time t, and O(x_t; Θ) denotes the model-implied values as a function of the parameter set Θ and the factor vector x_t. The term e_t denotes the normally distributed pricing errors with mean 0 and covariance matrix R.^3

^3 For the sake of convenience, in the estimation we take the covariance matrix R to be the identity matrix I times a constant scalar.

Recall the pricing formulas O(x_t; Θ) for Eurodollar futures and swap rates, summarized below:

  L_t(T, T+α) = [P(t,T) - P(t,T+α)] / [α P(t,T+α)],   (4.26)

  K_t(T_0,...,T_n) = [P(t,T_0) - P(t,T_n)] / [Σ_{i=1}^n α P(t,T_i)],   (4.27)

where the bond formulas, given the state variables, are shown in Eqn. (3.13).
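The propagation step (4.24) can be sketched as follows, under simplifying assumptions stated here explicitly: κ is taken diagonal (so e^{-κΔt} acts elementwise), B = I, and each VG increment is simulated in the standard way as Brownian motion with drift θ and volatility σ evaluated at a gamma time G ~ Gamma(Δt/ν, ν). The (θ, σ, ν) values below are hypothetical placeholders; the diagonal κ entries and ā echo Table 4.4 only for scale.

```python
import math
import random

def vg_increment(dt, theta, sigma, nu, rng):
    """One VG increment over dt via gamma time change:
    dL = theta*G + sigma*sqrt(G)*Z with G ~ Gamma(dt/nu, nu)."""
    g = rng.gammavariate(dt / nu, nu)
    return theta * g + sigma * math.sqrt(g) * rng.gauss(0.0, 1.0)

def propagate(x_prev, dt, kappa, a_bar, vg_params, rng):
    """Discrete propagation (4.24), with kappa diagonal and B = I assumed."""
    x_new = []
    for j in range(3):
        decay = math.exp(-kappa[j] * dt)          # e^{-kappa dt}, elementwise
        dl = vg_increment(dt, *vg_params[j], rng)
        x_new.append((1.0 - decay) * a_bar[j] + decay * x_prev[j] + dl)
    return x_new

rng = random.Random(7)
dt = 7.0 / 365.0                       # weekly sampling, as in the text
kappa = [0.0013, 0.6353, 0.9691]       # diagonal entries, for scale only
a_bar = [0.0009, 0.0353, 0.4118]
vg = [(-0.02, 0.12, 0.17)] * 3         # hypothetical (theta, sigma, nu)
x = [0.02, 0.01, 0.03]
for _ in range(52):                    # propagate one year of weekly steps
    x = propagate(x, dt, kappa, a_bar, vg, rng)
```

In the full model κ is a general 3 × 3 matrix, so e^{-κΔt} is a matrix exponential rather than an elementwise one; the structure of the recursion is otherwise unchanged.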
The rate L_t(T, T+α) in Eqn. (4.26) denotes the Eurodollar future rate prevailing at time t for the period [T, T+α], where α = 3 months. The rate K_t(T_0,...,T_n) in Eqn. (4.27) stands for the rate of the swap contract with coupons paid at T_1,...,T_n; in such a contract, the fixed-rate leg makes a payment every α = 0.5 years. To match the data, the pricing function O(x_t; Θ) is 9-dimensional, with the first 3 components being Eurodollar futures maturing in T = 3, 6, 12 months and the rest being swap contracts maturing in T_n = 2, 3, 5, 10, 20, 30 years. We define the pricing errors as the differences between z̃_t, the rates output by the model, and z_t, the rates observed in the market.

The particle filter technique developed in Section 4.4 can be straightforwardly customized to our model estimation context. Specifically, at a particular time t-1 one has a large number of particles with known values x_{t-1} that represent the posterior distribution at time t-1. To propagate the posterior distribution to the next time step t, one first makes the ex ante predictions x̃_t according to Eqn. (4.24). At time t the market quotes z_t become available, and we use the observable data to update the prediction. As shown in Table 4.1, this is equivalent to finding the weight ω^i_t for each particle i. Recall that the pricing error for each day t and each asset is modeled as white noise with covariance R, so the log-likelihood function l^i_t = log p(z_t | x̃^i_t) is the exponent of the normal density,

  l^i_t = -(1/2) (z_t - z̃^i_t)' R^{-1} (z_t - z̃^i_t).   (4.28)

To estimate the parameters, we build the joint log-likelihood function across the entire sampled history. That is,

  L(Θ; z^M_{t=1}) = Σ_{t=1}^M Σ_{i=1}^N l^i_t,   (4.29)

where M is the number of days on which we sample data and N is the number of particles we simulate.^4

^4 We take N = 200 in the estimation.
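With R a scalar multiple of the identity, as assumed in footnote 3, Eqns. (4.28)-(4.29) reduce to sums of scaled squared pricing errors. A minimal sketch (the additive normalizing constant of the normal density is dropped, which does not affect the argmax over Θ):

```python
def log_likelihood(z_market, z_model, s2):
    """Eqn. (4.28) with R = s2 * I: Gaussian log-likelihood of the pricing
    errors, up to the additive constant that does not affect the argmax."""
    return -0.5 * sum((zm - zh) ** 2 for zm, zh in zip(z_market, z_model)) / s2

def joint_log_likelihood(market_series, model_series, s2):
    """Eqn. (4.29): sum l_t^i over sampled dates t and particles i.
    model_series[t] holds one model-price vector per particle."""
    total = 0.0
    for z_t, particles_t in zip(market_series, model_series):
        for z_hat in particles_t:
            total += log_likelihood(z_t, z_hat, s2)
    return total
```

A perfect fit gives the maximal value 0; any pricing error lowers the objective, which the outer optimizer over Θ then tries to drive back up.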
We choose the parameters to maximize the log-likelihood of the data series,

  Θ* = argmax_Θ L(Θ; z^M_{t=1}).

4.5.3 Estimation results and discussion

Assuming the Lévy process follows a Variance Gamma law, we estimate the reduced form of the model with the risk-neutral parameterization listed in the right column of Table 3.2, totalling 25 parameters. The program is written in Matlab and utilizes its optimization routines. The selection of the initial parameter values is critical for the optimization, and an informed selection substantially expedites convergence. The initial values were chosen via a trial-and-error procedure.

On the one hand, a large number of particles gives better precision in representing the distribution, and hence reduces the number of iterations used in the optimization. On the other hand, however, more particles substantially increase the simulation workload. The optimal number of particles is beyond our knowledge, but in the Matlab code 200 particles are employed.

The estimation took considerable CPU time to converge. Due to the large-scale optimization, along with the randomness introduced by simulation, it is not guaranteed that the likelihood maximization stops at the global maximum. But, as visualized by the plots in Figures 4.2 to 4.6, a reasonable convergence has been achieved. The estimated parameters are presented in Table 4.4.^5

The performance statistics of the estimated model are presented in Table 4.5. Errors are the differences between market prices and model prices, quoted in basis points^6. The statistics of the pricing errors for the ex-ante prediction and the ex-post update are presented in panels A and B respectively; the columns titled [Mean, Median, Std, MAD, Max, Min] refer to the mean, median, standard deviation, mean absolute deviation, maximum and minimum of the error series. EF stands for Eurodollar futures.
⁵ We use $L_1, L_2, L_3$ to denote the parameter vectors for each factor. Since we are using the VG process, all are 3-dimensional vectors.
⁶ A basis point is one hundredth of a percentage point.

Apparently, with the new market observation, the ex-post performance is much better than the ex-ante prediction in terms of smaller error standard deviations. Cross-sectionally, the results indicate good predictive power, as the mean and median errors for all assets except EF(12m, 3m) are within 10 basis points of zero. EF(12m, 3m) appears to be the least predictable. We also notice that the predictions for swap rates of shorter maturities have much smaller error standard deviations and MADs than the other swap rates and the Eurodollar futures. The whole time series of model-implied values against market quotes is illustrated in Figures 4.2 to 4.6 for each of the 9 asset classes.

Although the Lévy based models are structurally superior, the estimation procedures are at the same time considerably more complicated. It is expected that a better computing facility would enable us to use more particles, bringing increased accuracy and enhanced performance.

4.6 Yield curve analysis and pricing

Based on the model estimated in the previous section, we perform a factor loading analysis on the yield curve and price caplets using Monte Carlo simulation. The caplet volatility surface is constructed by converting the caplet values into Black implied volatilities.

Model SDE: $r(t) = \alpha + \beta' x(t)$, $\quad dx(t) = (a - \kappa x(t))\,dt + dL(t)$
    alpha = 0.0762

    kappa = [  0.0013   0        0
               0.0903   0.6353   0
              -0.3450  -0.3927   0.9691 ]

    beta  = ( -0.0666, -0.0568, 0.1241 )'

    a     = (  0.0009,  0.0353,  0.4118 )'
    L1    = ( -0.0169,  0.1200,  0.1702 )'
    L2    = ( -0.0996,  0.1028,  0.1983 )'
    L3    = ( -0.0096,  0.0828,  0.1874 )'
    x0    = (  2.1199, -0.4353,  1.0190 )'

Table 4.4: In-sample maximum likelihood parameter estimates of the 3-factor Lévy short rate model with particle filter, using 10-year data from 04/25/1994 through 10/13/2004

In-sample error statistics in basis points (bps)

    Asset          Mean      Median    Std       MAD       Max       Min
    A. Ex-ante
    EF(3m, 3m)     4.1811    4.1557   18.1331   14.0131    59.8101  -95.3415
    EF(6m, 3m)    -7.7202   -6.7680   18.1461   13.7373    55.7387  -123.5331
    EF(12m, 3m)  -12.7738  -10.5906   26.2126   21.0457    49.6241  -99.1613
    Swap(2 year)  -2.9530   -3.7763   14.7992   11.4461    50.6473  -83.8178
    Swap(3 year)  -4.9998   -5.6067   16.9883   13.3460    57.3284  -72.2385
    Swap(5 year)  -5.7343   -6.5089   18.9978   14.8604    63.1169  -57.7497
    Swap(10 year)  0.9012   -1.0923   20.4852   15.8901    93.8079  -50.6519
    Swap(20 year)  7.7000    5.2903   22.2722   17.5681   110.7255  -39.3156
    Swap(30 year)  2.6371    0.5487   23.0010   18.2084   106.6838  -47.9710
    Average       -2.0846   -2.7053   19.8928   15.5684    71.9425  -74.4200
    B. Ex-post
    EF(3m, 3m)     7.2434    5.9534   16.2113   12.5880    58.8404  -34.5549
    EF(6m, 3m)    -4.8583   -4.2072   10.5541    8.2076    51.1436  -43.0244
    EF(12m, 3m)  -10.3143   -7.5075   18.8791   15.4580    38.4103  -69.2283
    Swap(2 year)  -0.2902   -0.0407    5.3054    4.1561    12.7163  -21.5128
    Swap(3 year)  -2.6413   -3.0292    8.5221    6.6426    29.6837  -31.1299
    Swap(5 year)  -3.8019   -4.6334   11.7396    8.9714    42.5053  -32.8672
    Swap(10 year)  2.3866    1.1166   14.3203   10.9077    75.8342  -29.3619
    Swap(20 year)  9.0442    7.4268   16.9991   13.2676    92.7829  -25.4996
    Swap(30 year)  3.8909    1.6666   18.7174   14.6119    89.2396  -36.5544
    Average        0.0732   -0.3616   13.4720   10.5345    54.5729  -35.9704

Table 4.5: In-sample estimation performance statistics of the 3-factor Lévy short rate model with particle filter, using 10-year data from 04/25/1994 through 10/13/2004
Figure 4.2: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: Eurodollar future on Libor(3m, 3m); lower two panels: Eurodollar future on Libor(6m, 3m)

Figure 4.3: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: Eurodollar future on Libor(12m, 3m); lower two panels: 2-year swap rate

Figure 4.4: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: 3-year swap rate; lower two panels: 5-year swap rate
Figure 4.5: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: 10-year swap rate; lower two panels: 20-year swap rate

Figure 4.6: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Panels for the 30-year swap rate

4.6.1 Yield curve factor loading

Factor loading analysis has been very popular in the field of fixed income research, as pioneered by Litterman and Scheinkman [41]. As a function of maturity, the loading of a yield curve factor stands for the change in the yield of that maturity given a unit shock in the factor. Factor loading analysis helps to hedge yield-curve-related positions, as traders can neutralize the duration and convexity of each factor individually according to the loading functions.

We follow the parametric approach in [33]. According to Eqn. (3.13), the coefficients $a_{s,t}$ and $b_{s,t}$ determine the term structure of interest rates. The fair values $Y(s,t)$ of simply compounded spot rates are linked to the yield curve Lévy factors $x(s)$ by
$$Y(s,t) = -\frac{a_{s,t}}{t-s} - \left(\frac{b_{s,t}}{t-s}\right)' x(s).$$
The slope coefficients $-b_{s,t}/(t-s)$ are the loading functions of the Lévy factors. We illustrate the factor loadings in Figure 4.7.
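A minimal sketch of the loading computation, assuming the affine coefficients $a_{s,t}$ and $b_{s,t}$ are already available from Eqn. (3.13); the helper names are hypothetical:

```python
import numpy as np

def spot_yield(a, b, x, tau):
    """Affine yield Y = -(a/tau) - (b/tau)' x for time-to-maturity tau,
    where exp(a + b'x) is the zero-coupon bond price (a scalar, b a 3-vector)."""
    return -(a / tau) - np.dot(b, x) / tau

def factor_loadings(b, tau):
    """Loading of each factor at maturity tau: dY/dx = -b/tau.
    Evaluating this over a grid of tau traces out the curves of Figure 4.7."""
    return -np.asarray(b) / tau
```

Because the loading is the yield's sensitivity to a unit factor shock, a trader can combine the three loading curves to neutralize exposure to each factor maturity by maturity.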
Figure 4.7: Yield Curve and Factor Loading. Upper panel: model-implied yield curve on 10/13/2004; lower panel: factor loading of the yield curve, factors 1-3

The upper panel of Figure 4.7 plots the model-implied fair value of the yield curve on Oct. 13th, 2004, and the lower panel illustrates the factor loadings, both as functions of maturity. The upper panel is consistent with the upward-trending term structure of the market quotes on the same day. The factor loading plot shows how the three factors control the variation of the yields at different maturities. The first factor is persistently significant across the spectrum of maturities; the second is the least significant, as it picks up slowly for short maturities but diminishes rapidly for long maturities; the third factor is the most significant for short maturities but dies out quickly beyond a maturity of 10 years.

4.6.2 Pricing caplets using simulation

With the estimated model we use Monte Carlo simulation to price one of the most actively traded fixed income derivatives, the caplet (see Section 3.5.5 for the payoff structure). The Black formula is the market benchmark for pricing a caplet. Under the Black formula, the annualized caplet rate $Cpl(t,T,T+\Delta t)$ with strike $K$ and volatility $\sigma$ is
$$Cpl(t,T,T+\Delta t) = P(t,T)\left[F(t,T,T+\Delta t)\,\Phi(d_+) - K\,\Phi(d_-)\right], \qquad (4.30)$$
where
$$d_\pm = \frac{\log\!\big(F(t,T,T+\Delta t)/K\big) \pm \tfrac{1}{2}\sigma^2 (T-t)}{\sigma\sqrt{T-t}}, \qquad (4.31)$$
and $F(t,T,T+\Delta t)$ is the forward rate prevailing at time $t$ for the future period $[T, T+\Delta t]$. Obviously, if the price of the caplet is known, one can invert Eqns. (4.30) and (4.31) to obtain the implied volatility $\sigma$. The market has adopted the convention of quoting caplet values in terms of $\sigma$ rather than the rate itself, which gives a more intuitive measure of the caplet valuation. The Black implied volatilities are reported in Table 4.6 and illustrated in the 3-D plot of Figure 4.8.
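Eqns. (4.30)-(4.31) and the volatility inversion can be sketched as follows; the bisection step exploits the monotonicity of the Black price in $\sigma$, and the function names are illustrative:

```python
import math

def norm_cdf(x):
    """Standard normal cdf Phi(x) via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(P, F, K, sigma, tau):
    """Black caplet value, Eqn. (4.30): P(t,T) [F Phi(d+) - K Phi(d-)],
    with tau = T - t. Degenerate inputs collapse to intrinsic value."""
    if sigma <= 0.0 or tau <= 0.0:
        return P * max(F - K, 0.0)
    d_plus = (math.log(F / K) + 0.5 * sigma**2 * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    return P * (F * norm_cdf(d_plus) - K * norm_cdf(d_minus))

def implied_vol(price, P, F, K, tau, lo=1e-6, hi=2.0, tol=1e-10):
    """Invert Eqn. (4.30) for sigma by bisection: the Black price is
    monotone increasing in sigma, so the bracket [lo, hi] shrinks to the root."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_caplet(P, F, K, mid, tau) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to each simulated caplet value over the maturity/moneyness grid produces a surface such as the one in Table 4.6.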
As we can see, the implied volatility increases as the strike moves farther out of the money, for all maturities. Looking along the maturity direction, we find that the implied volatility decreases monotonically as the maturity gets longer. These results are qualitatively consistent with some aspects of market observations.

However, certain unrealistic characteristics of the model's caplet prices exist. First, the model can only generate a monotonic downward slope in the volatility surface with respect to maturity, while the market has historically shown a hump-shaped volatility curve with a peak between 2 and 6 years. Second, the volatility drops too fast as the maturity goes up: the 9-year ATM volatility falls below 5%, far below market observations averaging around 20%. This diminishing volatility at long maturities is possibly caused by the i.i.d. increment assumption of the Lévy process. These observations are in fact in agreement with the industry consensus that equilibrium models, like ours, are not ideal candidates for pricing derivatives. This is because

1. The model is not arbitrage-free, and the underlying bond prices are not matched perfectly.
Black Implied Volatility of Caplet

    Maturity          Moneyness (K/F)
                1.00     1.02     1.04     1.06     1.08     1.10
    1Y        0.2267   0.2420   0.2557   0.2682   0.2796   0.2900
    2Y        0.1359   0.1469   0.1566   0.1653   0.1732   0.1804
    3Y        0.0992   0.1081   0.1160   0.1229   0.1292   0.1349
    4Y        0.0789   0.0866   0.0934   0.0995   0.1049   0.1098
    5Y        0.0686   0.0755   0.0816   0.0870   0.0918   0.0961
    6Y        0.0607   0.0670   0.0726   0.0774   0.0818   0.0857
    7Y        0.0546   0.0605   0.0656   0.0701   0.0742   0.0778
    8Y        0.0511   0.0567   0.0614   0.0657   0.0694   0.0728
    9Y        0.0492   0.0544   0.0589   0.0629   0.0665   0.0697

Table 4.6: Implied volatility chart of model-generated caplet values of tenor 3 months on 10/13/2004, with maturity from 1 to 9 years and moneyness from at-the-money (1.0) to out-of-the-money (1.1)

Figure 4.8: Implied volatility surface of model-generated caplet values, 10/13/2004

2. Term structure models can by no means capture certain characteristics that are unique to derivative markets, e.g. the caplet market, as those characteristics are conceivably not priced in the yield curve.

There are three general methodologies for pricing derivatives based on yield curve models. First, additional factors besides the yield curve factors are included in the model, dedicated solely to spanning the variation in the particular derivative market. The second is to add stochastic volatility to the Lévy factors in the original model to reflect the dynamics of the derivative market. The third approach is to model directly the market-observable rate that underlies the derivative, such as the Libor rate for the cap market or the swap rate for the swaption market. By doing this, the underlying rates are automatically matched perfectly and hence arbitrage opportunities are eliminated.

Chapter 5

Conclusion

This work makes contributions to the area of mathematical finance in two aspects:

1.
Developed a 3-factor Lévy-based model, which is the first attempt to use a non-Brownian structure in modeling the short rate dynamics. Bond, caplet/floorlet, and swaption formulas are developed under the model.

2. Implemented a large-scale particle filter to estimate the short rate model. The particle filter methodology provides robust estimation for state-space models with non-Gaussian dynamics, and hence is an ideal candidate for studying Lévy based financial models, which are becoming increasingly popular.

The estimation results show quality fits to the historical data, and the estimated model demonstrates a certain degree of forecasting power. However, the caplet pricing results indicate that the model is not adequate for pricing fixed income derivatives without further seasoning. This is in fact consistent with the consensus that equilibrium fixed income models such as short rate models, regardless of the dynamics, are not able to capture certain characteristics of individual fixed income derivative markets.

As for future work, two things can be done along this line.

1. First, the particle filter implementation can be improved by using a more efficient optimization routine, a better trial-and-error scheme, or more particles to increase the precision.

2. Second, based on the current framework, stochastic volatility can be added to the Lévy factors in order to explain the unique characteristics exhibited by different derivative markets and hence give reasonable prices for those products.

Appendix A

A.1 Solution derivation of the OU process, Eqn. (3.9)

Proof. We notice from Eqn. (3.8) that
$$
\begin{aligned}
d(e^{\kappa t} x(t)) &= e^{\kappa t}\,dx(t) + \kappa e^{\kappa t} x(t)\,dt \\
\Rightarrow\quad d(e^{\kappa t} x(t)) &= e^{\kappa t}\,(a\,dt + B\,dL(t)) \\
\Rightarrow\quad \int_0^t d(e^{\kappa s} x(s)) &= \int_0^t e^{\kappa s}\,(a\,ds + B\,dL(s)) \\
\Rightarrow\quad e^{\kappa t} x(t) - x(0) &= \int_0^t e^{\kappa s}\,(a\,ds + B\,dL(s)) \\
\Rightarrow\quad x(t) &= e^{-\kappa t} x(0) + \int_0^t e^{-\kappa(t-s)}\,(a\,ds + B\,dL(s)).
\end{aligned}
$$

A.2 Proof of the Bayes rule in the update stage for the conceptual solution of the filtering problem, Eqn. (4.4)

Proof.
$$
\begin{aligned}
p(x_k \mid Z_k) &= \frac{p(Z_k \mid x_k)\, p(x_k)}{p(Z_k)}
= \frac{p(z_k, Z_{k-1} \mid x_k)\, p(x_k)}{p(z_k, Z_{k-1})} \\
&= \frac{p(z_k \mid Z_{k-1}, x_k)\, p(Z_{k-1} \mid x_k)\, p(x_k)}{p(z_k \mid Z_{k-1})\, p(Z_{k-1})} \\
&= \frac{p(z_k \mid Z_{k-1}, x_k)\, p(x_k \mid Z_{k-1})\, p(Z_{k-1})\, p(x_k)}{p(z_k \mid Z_{k-1})\, p(Z_{k-1})\, p(x_k)} \\
&= \frac{p(z_k \mid x_k)\, p(x_k \mid Z_{k-1})}{p(z_k \mid Z_{k-1})}.
\end{aligned}
$$

A.3 Proof of the posterior expansion, Eqn. (4.17)

Proof.
$$
\begin{aligned}
p(X_k \mid Z_k) &= \frac{p(z_k \mid X_k, Z_{k-1})\, p(X_k \mid Z_{k-1})}{p(z_k \mid Z_{k-1})} \\
&= \frac{p(z_k \mid X_k, Z_{k-1})\, p(x_k \mid X_{k-1}, Z_{k-1})\, p(X_{k-1} \mid Z_{k-1})}{p(z_k \mid Z_{k-1})} \\
&= \frac{p(z_k \mid x_k)\, p(x_k \mid x_{k-1})}{p(z_k \mid Z_{k-1})}\, p(X_{k-1} \mid Z_{k-1}) \\
&\propto p(z_k \mid x_k)\, p(x_k \mid x_{k-1})\, p(X_{k-1} \mid Z_{k-1}).
\end{aligned}
$$
We used the Markov properties in deriving these equalities.

BIBLIOGRAPHY

[1] S. Asmussen and J. Rosiński, Approximation of small jumps of Lévy processes with a view towards simulation, Journal of Applied Probability, vol. 38, pp. 482-493, 2001

[2] B. Azimi-Sadjadi and P.S. Krishnaprasad, A Particle Filtering Approach to Change Detection for Nonlinear Systems, EURASIP Journal on Applied Signal Processing, vol. 15, pp. 2295-2305, 2004

[3] M. Baxter and A. Rennie, Financial Calculus: An Introduction to Derivative Pricing, Cambridge University Press, 1996

[4] F. Black and P. Karasinski, Bond and Option Pricing When Short Rates Are Lognormal, Financial Analysts Journal, vol. 47, pp. 52-59, Jul/Aug 1991

[5] F. Black, E. Derman and W. Toy, A One-Factor Model of Interest Rates and Its Application to Treasury Bond Options, Financial Analysts Journal, vol. 46, pp. 33-39, Jan/Feb 1990

[6] F. Black, The pricing of commodity contracts, Journal of Financial Economics, vol. 3, pp. 167-179, 1976

[7] M. Bolić, P. Djurić and S. Hong, Resampling Algorithms and Architectures for Distributed Particle Filters, Working paper, 2004

[8] A. Brace, D. Gatarek and M. Musiela, The Market Model of Interest Rate Dynamics, Mathematical Finance, vol. 7, no. 2, pp. 127-155, 1997

[9] D. Brigo and F. Mercurio, Interest Rate Models: Theory and Practice, Springer Finance, 2001

[10] B. Dupire, Pricing with a Smile, Risk, vol. 7, no. 1, Jan. 1994, pp. 18-20

[11] A. Cairns, Interest Rate Models, Princeton University Press, 2004

[12] P. Carr, H. Geman, D. Madan and M.
Yor, The Fine Structure of Asset Returns: An Empirical Investigation, Journal of Business, vol. 75, no. 2, April 2002, pp. 305-332

[13] P. Carr and D. Madan, Option valuation using the fast Fourier transform, Journal of Computational Finance, vol. 2, pp. 61-73, 1999

[14] P. Carr, H. Geman, D. Madan and M. Yor, Stochastic Volatility for Lévy Processes, Mathematical Finance, vol. 13, no. 3, July 2003, pp. 345-382

[15] P. Carr and B. Zhang, Pricing FX Barrier Options under Stochastic Skew Model by Monte Carlo Simulation, Bloomberg Technical Report, 2005

[16] P. Carr, L. Wu and B. Zhang, Finite Difference Method under Heston Stochastic Volatility Model and Stochastic Skew Model for Pricing Barrier Options, Bloomberg Technical Report, 2005

[17] R. Cont and P. Tankov, Financial Modelling with Jump Processes, Chapman & Hall/CRC, 2004

[18] J. Cox, J. Ingersoll and S. Ross, A Theory of the Term Structure of Interest Rates, Econometrica, vol. 53, pp. 385-408, 1985

[19] E. Derman and I. Kani, Riding on a smile, Risk, vol. 7, no. 2, pp. 32-39, 1994

[20] A. Doucet, On Sequential Simulation-Based Methods for Bayesian Filtering, Technical report CUED/F-INFENG/TR.310, 1998

[21] J.C. Duan and A. Fulop, Estimating the Structural Credit Risk Model When Equity Prices Are Contaminated by Trading Noises, Working paper, 2005

[22] D. Duffie, Dynamic Asset Pricing Theory, Princeton University Press, 1992

[23] E. Eberlein and S. Raible, Term Structure Models Driven by General Lévy Processes, Mathematical Finance, vol. 9, no. 1, Jan. 1999, pp. 31-53

[24] H. Geman, E. Eberlein and D. Madan, Formulation of Lévy Based Interest Rate Model, working paper, Feb. 2005

[25] H. Geman, N. El Karoui and J.-C. Rochet, Changes of Numéraire, Changes of Probability Measure and Option Pricing, Journal of Applied Probability, vol. 32, no. 2, June 1995, pp. 443-458

[26] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer Finance, 2003

[27] B. Ristic, S. Arulampalam and N.
Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, Boston and London, 2004

[28] A. Doucet, N. de Freitas and N. Gordon, Sequential Monte Carlo Methods in Practice, Springer, 2001

[29] J. Harrison and D. Kreps, Martingales and Arbitrage in Multiperiod Securities Markets, Journal of Economic Theory, vol. 20, pp. 381-408, 1979

[30] J. Harrison and S. Pliska, Martingales and Stochastic Integrals in the Theory of Continuous Trading, Stochastic Processes and Their Applications, vol. 11, pp. 215-260, 1981

[31] D. Heath, R. Jarrow and A. Morton, Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation, Econometrica, vol. 60, no. 1, pp. 77-105, 1992

[32] M. Heidari and L. Wu, Are Interest Rate Derivatives Spanned by the Term Structure of Interest Rates?, Journal of Fixed Income, June 2003, pp. 75-86

[33] M. Heidari and L. Wu, Term Structure of Interest Rates, Yield Curve Residuals, and Consistent Pricing of Interest Rate Derivatives, Working paper, 2005

[34] S. Heston, A Closed-form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options, Review of Financial Studies, vol. 6, no. 2, pp. 327-343, 1993

[35] J. Hull, Options, Futures and Other Derivatives, Prentice Hall, July 2005

[36] J. Hull and A. White, An Analysis of the Bias in Option Pricing Caused by a Stochastic Volatility, Advances in Futures and Options Research, vol. 3, pp. 29-61, 1988

[37] P. Jäckel, Monte Carlo Methods in Finance, John Wiley & Sons Ltd., 2002

[38] F. Jamshidian, Libor and Swap Market Models and Measures, Finance and Stochastics, vol. 1, pp. 290-330, 1997

[39] A. Javaheri, Inside Volatility Arbitrage, Wiley Finance, 2005

[40] S. Kou and H. Wang, First passage times of a jump diffusion process, Advances in Applied Probability, vol. 35, pp. 504-531, 2003

[41] R. Litterman and J. Scheinkman, Common Factors Affecting Bond Returns, Journal of Fixed Income, vol. 1, pp. 54-61, 1991

[42] D. Madan, P.
Carr and E. Chang, The Variance Gamma Process and Option Pricing, European Finance Review, vol. 2, no. 1, June 1998, pp. 79-105

[43] D. Madan and M. Yor, Representing CGMY and Meixner Lévy Processes as Time-Changed Brownian Motions, Working paper, 2005

[44] C. Menn and S. Rachev, Calibrated FFT-based Density Approximations for α-stable Distributions, working paper, 2004

[45] R. Merton, Option Pricing When Underlying Stock Returns Are Discontinuous, Journal of Financial Economics, vol. 3, pp. 125-144, 1976

[46] P. Protter, Stochastic Integration and Differential Equations, Second Edition, Springer-Verlag, Heidelberg, 2005

[47] R. Rebonato, Interest-Rate Option Models, second edition, John Wiley & Sons, 1998

[48] M. Rubinstein, Implied Binomial Trees, Journal of Finance, vol. 49, no. 3, pp. 771-818, 1994

[49] K. Sato, Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press, 1999

[50] W. Schoutens, Lévy Processes in Finance, Wiley Finance, 2003

[51] O. Vasicek, An Equilibrium Characterization of the Term Structure, Journal of Financial Economics, vol. 5, pp. 177-188, 1977

[52] R. Zagst, Interest Rate Management, Springer Finance, 2002

[53] B. Zhang, Simulating CGMY as a Time-Changed Brownian Motion, Bloomberg Technical Report, 2005

[54] F. Zhou, Black Smirks, Risk, pp. 87-91, May 2003