ABSTRACT

Title of dissertation: A NEW LÉVY BASED SHORT-RATE MODEL FOR THE FIXED INCOME MARKET AND ITS ESTIMATION WITH PARTICLE FILTER

Bing Zhang, Doctor of Philosophy, 2006

Dissertation directed by: Professor Dilip Madan, Smith School of Business, University of Maryland, College Park

In this thesis two contributions are made to the area of mathematical finance. First, in order to explain the non-trivial skewness and kurtosis observed in the time-series data of constant maturity swap (CMS) rates, we employ pure jump Lévy processes, in particular the Variance Gamma process, to model the variation of unobservable economic factors. It is the first model to include Lévy dynamics in short rate modeling. Specifically, the Vasicek [51] type of short rate framework is adopted, where the short rate is an affine combination of three mean-reverting state variables. Zero-coupon bonds and a few fixed income derivatives are priced under the model based on the transform method in Carr et al [13]. It is expected that the Lévy based short rate model gives more realistic explanations of yield curve movements than Gaussian-based models.

Second, the model parameters are estimated by the particle filter (PF) technique. The PF has not seen wide application in the field of financial engineering, partly due to its stringent requirement on computing capability. However, given cheap computing cost nowadays, the PF method is a flexible yet powerful tool for estimating state-space models with non-Gaussian dynamics, such as Lévy-based models. To customize the PF algorithm to our model, the continuous-time Lévy short rate model is cast into discrete form by a first-order forward Euler approximation. The PF technique is used to retrieve values of the unobservable factors by sequentially using readily available market prices. The optimal set of model parameters is obtained by invoking quasi-maximum likelihood estimation.
A NEW LÉVY BASED SHORT-RATE MODEL FOR THE FIXED INCOME MARKET AND ITS ESTIMATION WITH PARTICLE FILTER

by Bing Zhang

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2006

Advisory Committee: Professor Dilip Madan, Chair/Advisor; Professor Michael Fu; Professor David Levermore; Professor Konstantina Trivisa; Professor Haluk Unal

© Copyright by Bing Zhang 2006

DEDICATION

In memory of my father Zhang, Jianguo. To my mother Bao, Xinhua.

ACKNOWLEDGMENTS

I want to first give my utmost gratitude to a respectful professor, an extraordinary scholar, a skillful professional, and a lovable friend: my advisor Dr. Dilip Madan. Since becoming his student, I have witnessed and have been deeply touched by his passion for knowledge, his diligence in working, his humbleness in learning, his intolerance for ambiguity, as well as his sense of humor. These observations have been the driving force that moved me forward in my pursuit of the PhD throughout these years. I cannot recall in retrospect how many times I wandered outside, with a handful of questions, before entering his office, nervous, hopeless and lacking self-confidence, but after talking to him my mind was always filled with clarity and order. I am greatly indebted to such a rare individual for his expertise, patience, and help. It is impossible to utter my entire gratitude in one page, so the rest is just left unsaid.

I want to thank Prof. Michael Fu for his excellent organization of our weekly financial seminar, where I acquired extensive information and knowledge on the most recent developments in mathematical finance. Also, his outstanding teaching and off-class help on stochastic processes and simulation will be a life-time benefit. I would like to acknowledge the help from Dr. Peter Carr at Bloomberg LP in New York.
It was a wonderful experience spending the 2005 summer as an intern in his quantitative financial research group. There I received close nurturing from Peter on a project of pricing foreign exchange barrier options under a stochastic skew model. I also want to express my gratitude for his generous advice and help on my employment.

I thank Prof. Dave Levermore for funding my living during my doctoral education; otherwise I would not have completed the dissertation alive. I would also like to extend my gratitude to Prof. Konstantina Trivisa and Prof. Haluk Unal for agreeing to serve as PhD committee members.

I would also like to say thanks to my fellow classmates. Christian Silva guided me through my first finance project; Samvit Prakash initiated a math finance students' gathering regarding our academic/career development; Qing Xia and I spent numerous hours together discussing math finance problems; Huaqiang Ma has been good company at academic conferences; etc. I would like to express my grateful feelings to the staff members Ms. Alverda McCoy and Ms. Liz Wincek for the laughter and joy they brought to me on each frustrating and endless research day. I am sure this enumeration is far from complete for the people to whom I owe a debt; any inadvertent omission from the list is my fault.

In the end, I want to thank my graceful wife Fei Lin. Without her love and support, this dissertation would not have been possible.

TABLE OF CONTENTS

List of Tables
List of Figures
1 Introduction
2 Lévy Processes in Finance and Monte Carlo Simulation
  2.1 Background
  2.2 Mathematics
  2.3 Pure jump Lévy market models
    2.3.1 Variance Gamma process
    2.3.2 CGMY process
    2.3.3 Market models
  2.4 Monte Carlo simulation
    2.4.1 Simulate VG and CGMY
    2.4.2 Simulate VG and CGMY as time-changed Brownian motion
    2.4.3 Importance sampling under Lévy processes
    2.4.4 Numerical results
3 The 3-Factor Lévy Based Short Rate Model
  3.1 Background
  3.2 Overview of fixed income modeling
    3.2.1 Bond market and money account
    3.2.2 Forward rate and Libor rate
    3.2.3 Change of numeraire
  3.3 Current short rate models
  3.4 3-Factor Lévy short rate model dynamics
    3.4.1 Motivation
    3.4.2 Model dynamics
  3.5 Pricing of bond and bond derivatives
    3.5.1 Joint characteristic function
    3.5.2 Bond
    3.5.3 Swaps
    3.5.4 Swaptions
    3.5.5 Caps/Floors
4 Particle Filter and Parameter Estimation
  4.1 Background
  4.2 The problem and conceptual solution
  4.3 Kalman filter
  4.4 Particle filter
    4.4.1 Importance sampling in particle filter
    4.4.2 Sequential importance sampling
    4.4.3 Selection of the importance density
    4.4.4 Resampling
    4.4.5 Generic particle filter algorithm
  4.5 Estimate the 3-factor Lévy short rate model
    4.5.1 Data description
    4.5.2 Quasi-maximum likelihood estimation with particle filter
    4.5.3 Estimation results and discussion
  4.6 Yield curve analysis and pricing
    4.6.1 Yield curve factor loading
    4.6.2 Price caplet using simulation
5 Conclusion
A Appendix
  A.1 Solution derivation of OU process, Eqn. (3.9)
  A.2 Proof of the Bayes rule in the update stage for the conceptual solution of the filtering problem, Eqn. (4.4)
  A.3 Proof of the posterior expansion, Eqn. (4.17)
Bibliography

LIST OF TABLES

2.1 VG process simulation as the difference of two gamma random variables
2.2 CGMY process simulation as compound Poisson and Brownian motion approximation
2.3 VG process simulation as time-changed Brownian motion
2.4 CGMY process simulation as time-changed Brownian motion
2.5 Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as a compound Poisson process, 10^4 samples. N.R. stands for Not Rejected
2.6 Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as time-changed Brownian motion, 10^4 samples. N.R. stands for Not Rejected
2.7 Parameters and specifications for the numerical illustration of the VG measure change
3.1 ICA statistics for swap rates of 10 maturities
3.2 Numbers of model parameters for the original model and the reduced model
4.1 Sequential importance sampling algorithm
4.2 Resampling algorithm
4.3 Particle filter algorithm
4.4 In-sample maximum likelihood parameter estimates of the 3-factor Lévy short rate model with particle filter using 10-year data from 04/25/1994 through 10/13/2004
4.5 In-sample estimation performance statistics of the 3-factor Lévy short rate model with particle filter using 10-year data from 04/25/1994 through 10/13/2004
4.6 Implied volatility charts of model-generated caplet values of tenor 3-month on 10/13/2004 with maturity from 1 to 9 years and moneyness from at-the-money 1 to out-of-the-money 1.1

LIST OF FIGURES

2.1 Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.5, C = 0.8, G = 30, M = 30, Y = 0.5, 2x10^4 samples
2.2 Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.3, C = 0.5, G = 20, M = 10, Y = 0.5, 2x10^4 samples
2.3 CPU time comparison of compound Poisson and time-changed Brownian motion for simulating CGMY(t; C, G, M, Y) with t = 0.2, C = 0.1, G = 20, M = 10 and varying Y, 10^4 samples
2.4 Histogram comparison for VG(t; C, G, M) and VG(t; C', G', M') with parameters listed in Table (2.7), 2x10^4 samples
2.5 Convergence comparison of pricing out-of-the-money put options for VG(t; C, G, M) and VG(t; C', G', M') with importance sampling, parameters listed in Table (2.7)
3.1 US treasury yield curve on Feb. 09, 2005
4.1 Resampling scenario and mechanism in particle filter
4.2 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for Eurodollar future on Libor(3m, 3m) and lower 2 panels for Eurodollar future on Libor(6m, 3m)
4.3 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for Eurodollar future on Libor(12m, 3m) and lower 2 panels for 2-year swap rate
4.4 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for 3-year swap rate and lower 2 panels for 5-year swap rate
4.5 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: upper 2 panels for 10-year swap rate and lower 2 panels for 20-year swap rate
4.6 Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004: panels for 30-year swap rate
4.7 Yield curve and factor loading
4.8 Implied volatility surface of model-generated caplet values, 10/13/2004

Chapter 1

Introduction

The past few decades have seen rapid development in the fixed income market worldwide. As bonds issued by governments and corporations are actively traded between banks and funds, financial institutions on Wall Street have been aggressive in designing and trading bond derivatives to provide insurance to investors. This phenomenal progress has stimulated an exuberance of efforts, from both academia and industry, in mathematically modeling the dynamics exhibited by the fixed income market.

Modeling the dynamics of the short rate, which is the instantaneous increment of a unit deposit, has been one of the most popular approaches to modeling interest rate term structures. This stream of work was pioneered by Vasicek [51] and Cox-Ingersoll-Ross [18], and followed by Hull-White [36], etc. A common feature that all such models share is to describe the randomness in the interest rate term structure by Brownian motion. The use of Brownian motion implies that the conditional distribution of the modeled object is normal, and one only needs the mean and variance to characterize the distribution.

However, in the context of modeling financial assets, e.g. interest rates, such an assumption is in contradiction to many aspects of the real market. For instance, the time series data of swap rates show that significant higher probabilistic moments, i.e. skewness/kurtosis, persist (see [24]).
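The excess-moment point can be checked numerically. The sketch below is illustrative only: the fat-tailed series is a synthetic Gaussian mixture standing in for weekly rate changes, not the swap-rate data of this dissertation. It shows how a simple sample-moment computation separates a fat-tailed series from a Gaussian one.

```python
import random

def sample_moments(xs):
    """Return (skewness, excess kurtosis) of a sample."""
    n = len(xs)
    m = sum(xs) / n
    cs = [x - m for x in xs]
    m2 = sum(c * c for c in cs) / n
    m3 = sum(c ** 3 for c in cs) / n
    m4 = sum(c ** 4 for c in cs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

random.seed(0)
# Stand-in for weekly rate changes: a Gaussian mixture whose occasional
# high-volatility draws create fat tails (hypothetical, not market data).
fat = [random.gauss(0, 1) if random.random() < 0.95 else random.gauss(0, 5)
       for _ in range(20000)]
norm = [random.gauss(0, 1) for _ in range(20000)]

print(sample_moments(norm))  # excess kurtosis near 0 for the Gaussian sample
print(sample_moments(fat))   # markedly positive excess kurtosis
```

A Gaussian model forces both quantities to zero, which is exactly what the swap-rate data contradict.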
In addition, the evidence from the fixed income derivative market indicates that the normality assumption breaks down too, as the Black-implied volatilities1 backed out from caps/floors/swaptions prices are markedly non-constant, whereas they should have been flat if normality held (see [9]).

Because the market clearly does not support the normality assumption, in this dissertation I seek to remedy these discrepancies in the existing fixed income models. I adopt an approach that has seen great success in modeling equity market dynamics. This approach, termed pure-jump Lévy models, was originally proposed by Madan et al in [42], and some other researchers [50].

A Lévy process is a stochastic process with independent and stationary increments. It can be decomposed into three independent components: a deterministic drift process, a continuous-path mean-zero diffusion and a jump process. A drifted Brownian motion is the special case of a Lévy process without jumps, while the general Lévy setting permits flexible jump structures in addition to the diffusion and drift. In terms of building models for asset returns, it has been argued in Madan et al [42] that there is no absolute necessity to include a Brownian motion in the underlying asset movement, because in reality no asset's value moves continuously (at the least, the movement is limited by the minimum tick size, e.g. 1 cent for traded stocks on the New York Stock Exchange). Backed by this notion, the aforementioned pure-jump Lévy market models use infinitely active jump kernels, which make small jumps arrive infinitely frequently, in order to remove the diffusion component.

1The Black-implied volatility is the market convention to quote caps/floors/swaptions based on the Black formula [6]. The calculation of such a quantity is thus a reverse-engineering problem to back out a model parameter, the Black-implied volatility, from the known prices.
Such models can flexibly incorporate the skewness and kurtosis exhibited in the market data by specifying the jump kernel. Specifically, an uneven jump measure yields the desired skewness, and a heavy tail weight in the jump measure provides adequate kurtosis. A few pure jump Lévy models for the equity market, including the Variance Gamma (VG) model which is used in this work, have proved sound in fitting both the time series of asset returns under the physical measure and the smile/skew volatility surface under the risk-neutral measure. As evidence of their market success, some of those models have reportedly been used by a couple of prestigious financial institutions on Wall Street to evaluate their equity option positions on a daily basis.

In this dissertation, I develop a 3-factor Lévy-based short rate model whose dynamics are

r(t) = alpha + beta' x(t),
dx(t) = kappa (a - x(t)) dt + B dL(t),   (1.1)

where the short rate r(t) is a scalar process expressed as an affine combination of factors x(t) driven by Lévy dynamics dL(t) in the background. The affine structure is specified by the scalar coefficient alpha and vector coefficient beta, and the vector x(t) follows a multidimensional mean-reverting Ornstein-Uhlenbeck (OU) process. The vector a denotes the long-run mean-reverting level for the factors x(t), kappa the mean-reverting strength, and B the correlating matrix.

This is the first model utilizing the Lévy structure dL(t) in the short rate, as opposed to the existing Brownian motion-based models. Multi-factor models are more effective in explaining interest rate movements, as one-factor models cannot decouple the correlations between bonds across different maturities. The 3-factor model is in line with the conclusion from the seminal paper by Litterman et al [41], which found that 3 factors are adequate to describe 99% of the yield curve data. The model follows the Vasicek equilibrium modeling framework (see [51]) by specifying the process under the physical measure.
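The dynamics (1.1) can be illustrated with a small simulation. This is a hedged sketch, not the estimated model: B is taken as the identity, the VG increments use the difference-of-gammas representation developed in Chapter 2, and all parameter values (alpha, beta, kappa, a, and the VG parameters C, G, M) are hypothetical placeholders rather than the estimates reported in Chapter 4.

```python
import random

random.seed(1)

def vg_increment(dt, C, G, M):
    """One VG increment over dt, as the difference of two gamma variates
    with shape C*dt and scales 1/M and 1/G (difference-of-gammas form)."""
    return (random.gammavariate(C * dt, 1.0 / M)
            - random.gammavariate(C * dt, 1.0 / G))

# Hypothetical parameter values for illustration only.
alpha, beta = 0.02, [0.4, 0.35, 0.25]   # affine map r = alpha + beta'x
kappa = [0.1, 0.5, 1.5]                 # mean-reversion strengths
a = [0.03, 0.02, 0.01]                  # long-run mean-reverting levels
dt, n_steps = 1.0 / 52, 520             # weekly steps, 10 years

x = a[:]                                # start the factors at their means
for _ in range(n_steps):
    for j in range(3):
        # first-order Euler step: dx = kappa*(a - x)*dt + dL
        x[j] += kappa[j] * (a[j] - x[j]) * dt + vg_increment(dt, 5.0, 60.0, 70.0)

r = alpha + sum(b * xj for b, xj in zip(beta, x))
print(round(r, 4))
```

Each factor mean-reverts toward its level a_j while the VG shocks inject skewed, fat-tailed noise; the short rate is then read off through the affine map.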
Because we are primarily concerned with pricing, the measure change from the physical measure to the risk-neutral pricing measure is developed. Under the risk-neutral measure, closed-form formulas for pricing bonds, swaps, caps/floors and swaptions via their characteristic functions are derived based on the transform method in Carr et al [13].

Parameter estimation is of critical importance in applying financial models. Although the Lévy-based model is structurally superior to Gaussian-based models, it is computationally more expensive and requires more sophisticated algorithms for parameter estimation. In order to acquire a stable estimation of the parameters, the model is first cast into state-space format. In other words, the state variables are modeled as latent/unobservable factors while the model outputs observable measurements, i.e. the prices of the fixed income products. The idea of the state-space model is formalized in Eqn. (1.2)

x_t = (I - e^{-kappa Delta t}) a + e^{-kappa Delta t} x_{t-1} + B Delta L_t,
z_t = O(x_t; Theta) + e_t,   (1.2)

where the first equation approximates the OU process solution by the first-order Euler method. Theta is the set of parameters to be estimated, and O(x_t; Theta) in the second equation represents the non-linear pricing function of the state variables x_t and Theta. z_t are the output prices and e_t are the normally distributed pricing errors.

We are given 10-year weekly-sampled time series across 9 different asset values as data input. Maximum likelihood estimation (MLE) is used as the estimation algorithm, and the likelihood function is constructed for the errors e_t between the market time-series quotes and the model output prices. In the context of state-space model estimation, such a joint error likelihood function is built with the help of a filter technique such as the Kalman filter or its variants. However, the widely-used Kalman filter requires strict theoretical assumptions, i.e.
the linearity assumption on the measurement and propagation functions and the normality assumption on the dynamics. Therefore, despite its power and convenience, the Kalman filter (KF) is not immediately applicable to our model, since the randomness is not Gaussian, nor are the pricing functions linear. We opt to use its counterpart, the particle filter (see [7, 20]), as our choice for estimating the model.

The particle filter (PF) had not seen massive application until recently, when computation costs were reduced tremendously. In contrast to the KF, which only updates two essential quantities, i.e. the mean and variance, the rationale behind the PF is to use a large number of particles to represent the entire distribution. The movement of the distribution is carried out by applying forward Monte Carlo simulation sequentially. Therefore, by design, the particle filter offers great flexibility, as it can accommodate arbitrary dynamic structures and measurement functions. In a nutshell, at each time step a particle filter first makes an ex-ante prediction of the unobservable state variables, denoted x-hat_t. Once the new market observation z_t is received, the filter updates the state variables under the Bayes theorem. In theory, the updated state variables x_t converge to their true values, which are used to output the model prices via O(x_t; Theta). A filter technique recursively performs the prediction-and-update procedure at all time steps till it reaches the end of the time series. The joint error likelihood function is then a multidimensional normal distribution across all sampled dates and assets, and the logarithm of such a likelihood function is given as

L(Theta) = -(1/2) sum_{t=1}^{M} sum_{i=1}^{N} (z_t^i - ztilde_t^i)' (R_i)^{-1} (z_t^i - ztilde_t^i),   (1.3)

where z_t^i are the market quotes and ztilde_t^i the model outputs. R represents the covariance matrix of the errors. M and N are the number of sampling days and asset classes, respectively. Maximization of this likelihood function yields the optimal model parameters as follows

Theta* = arg max_Theta L(Theta)
. (1.4)

In this dissertation, Chapter 2 introduces the Lévy process as a mathematical concept. It discusses a couple of popular pure-jump Lévy market models and their simulation algorithms. As an important tool to reduce simulation variance, the importance sampling technique for jump processes is studied. Chapter 3 focuses on the development of the 3-factor Lévy-based short rate model. The measure change from the physical measure to the pricing (risk-neutral) measure is derived. Under the risk-neutral measure, closed-form formulas for pricing bonds, swaps, caps/floors and swaptions via their characteristic functions are developed. In Chapter 4, the maximum likelihood estimation algorithm for the Lévy short rate model with particle filter is discussed in detail. The estimation results show that the model has certain forecasting power on the Libor and swap rates. In addition, the yield curve is analyzed using the factor loading methodology, which facilitates hedging and risk management. We take the caplet as an example of pricing fixed income derivatives. The model prices for caplets show qualitative improvement over the existing models, but they also indicate that the model is inadequate to price fixed income derivatives without further seasoning.

Chapter 2

Lévy Processes in Finance and Monte Carlo Simulation

2.1 Background

Mathematical finance has been an active research area since the birth of the Black-Merton-Scholes (BMS) equation in 1973. However, the constant volatility assumption in BMS was obviously violated by the market observation of non-constant volatility across maturities and strikes. To explain this so-called "volatility smile/skew" phenomenon, diligent and astute researchers have proposed numerous alternatives to the BMS theory. The attempts can be grouped into three classes:

1.
Local volatility model: The rationale underlying the local volatility model is that future volatilities are deterministic functions of the underlying value and calendar time, and these functions are implied by the current vanilla option prices. The model was developed by Dupire [10], Rubinstein [48], and Derman and Kani [19] in three independent efforts. The model retains the convenience of the Black-Scholes type of hedging argument. Because of its simplicity, traders use the local volatility model to price and hedge exotic options after the model is calibrated to the vanilla option market.

2. Stochastic volatility model: The second alternative is to randomize the volatility by a second Markovian stochastic process. This stream of efforts was pioneered by Hull and White [36], Heston [34], etc. In those models, a non-zero correlation between the spot and volatility processes is assigned to reproduce the skewness across strikes and maturities. The Heston stochastic volatility model is one of the market standards for derivative pricing and risk management.

3. Jump model: The third approach is to keep the dynamics a one-dimensional Markovian process but add jumps to the underlying, and the jump structures can be made to incorporate the skew/smile exhibited by the market. Merton [45] in 1976 proposed the first jump-diffusion model for the underlying. Recently, pure jump Lévy market models have become increasingly popular, represented by the Variance Gamma model by Madan et al [42], CGMY by Carr et al [12], etc. The Lévy models will be the building block for the work in this thesis and will be elaborated in great detail herein.

The Lévy models have proved effective in explaining the smile/skew in equity and foreign exchange (FX). People have built a certain degree of belief in option prices under those models. As evidence of the models'
market success, a couple of prestigious financial houses on Wall Street have been using those models or their extensions to evaluate their option positions. In this chapter, I will first introduce the mathematics needed to understand Lévy processes. A few popular Lévy market models will then be described, including the VG and CGMY models. Monte Carlo (MC) simulation is the most widely used computational tool in derivative pricing, so the last section of this chapter is dedicated to studying the simulation algorithms for the VG and CGMY processes. The importance sampling technique is discussed as a variance reduction technique for MC simulation, and a couple of toy examples are presented in the end.

2.2 Mathematics

A Lévy process is a stochastic process X_t that has independent and stationary increments. Consequently, it is a Markovian process with the marginal distribution of the random variable X_t being infinitely divisible. It is most common to study Lévy processes via their characteristic functions. The characteristic function of a random variable X is defined as

phi_X(u) = E[e^{iuX}] = int_{-inf}^{+inf} e^{iux} f_X(x) dx,   (2.1)

where f_X(x) is the probability density function of X and u is real.1 The characteristic function can be graphically viewed as the probability weighted average of a unit circle on the complex plane. If we write phi_X(u) = e^{psi_X(u)}, then psi_X(u) is called the characteristic exponent of X. A Lévy process X_t is infinitely divisible, which indicates that the characteristic function of the marginal random variable X_t can be expressed as follows

phi_{X_t}(u) = E[e^{iuX_t}] = e^{t psi_{X_1}(u)},   (2.2)

where psi_{X_1}(u) is the characteristic exponent of the Lévy process at unit time. The property of infinite divisibility gives rise to a great convenience in studying X_t, namely that one only needs to look at X_1 in order to investigate the distributional properties of X_t for any finite t.

1u can be extended to the complex plane.
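Property (2.2) is easy to verify numerically for a concrete infinitely divisible law. The sketch below uses a gamma process, whose characteristic function at time t is (1 - ius)^(-kt); the shape rate k and scale s are arbitrary illustrative values. It checks that the characteristic function at time t equals the unit-time characteristic function raised to the power t.

```python
import cmath  # complex arithmetic; ** uses the principal branch

def phi_gamma(u, t, k=2.0, s=0.5):
    """Characteristic function of a gamma process at time t:
    E[exp(i*u*X_t)] = (1 - i*u*s)**(-k*t)."""
    return (1 - 1j * u * s) ** (-k * t)

u, t = 1.7, 0.3
# Infinite divisibility: phi_{X_t}(u) = exp(t * psi_{X_1}(u)), i.e. the
# characteristic function at time t is the unit-time one to the power t.
lhs = phi_gamma(u, t)
rhs = phi_gamma(u, 1.0) ** t
print(abs(lhs - rhs) < 1e-12)
```

The same check applies to any Lévy marginal once its unit-time characteristic function is known, e.g. the VG characteristic function of Section 2.3.1.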
A Lévy process can be decomposed into three independent components: the first is a deterministic drift with rate b, the second is a continuous-path diffusion with volatility sigma, and the third is a jump process with measure nu(dx). Hence, a Lévy process can be fully characterized by the combined Lévy triplet (b, sigma, nu(dx)), where b is real, sigma is a non-negative real, and nu(dx) is a measure defined on the real line excluding 0. The Lévy measure nu(dx) describes the arrival frequency of jumps of different sizes, and can be written in the functional form nu(dx) = k(x)dx, where k(x) is called the Lévy density. For a one-dimensional Lévy process, the Lévy-Khintchine formula gives the expression for the characteristic exponent psi_{X_1}(u) as follows

psi_{X_1}(u) = bui - (1/2) sigma^2 u^2 + int_{-inf}^{+inf} (e^{iux} - 1 - iux 1_{[|x|<1]}) nu(dx),   (2.3)

with

int_{-inf}^{+inf} min(1, x^2) nu(dx) < inf.   (2.4)

For a good reference on Lévy processes, please see Sato [49].

It is worth pointing out that many well-known stochastic processes are special cases of the general Lévy setting. For instance, if we set b = 0 and let the jump density k(x) vanish for all real x, a standard Brownian motion with variance sigma^2 is left. Or, if both b and sigma are set to zero but nu(dx) = lambda delta_1(dx), where delta_1 denotes the Dirac measure at 1, the Poisson process with arrival rate lambda is restored.

The so-called pure jump Lévy models ignore the Brownian motion component but use tiny jumps to mimic the continuous movement. This can be realized by tilting the Lévy density k(x) sufficiently large as the jump size x approaches zero. For such pure jump processes, we can group the models into 3 exclusive categories. Let us define

I = int_{-inf}^{+inf} nu(dx),   J = int_{-inf}^{+inf} |x| nu(dx),

where I and J denote the total arrival rate and total variation respectively. Thus, a Lévy process is called a

1. Finite activity process if I < inf and J < inf

2. Infinite activity but finite variation process if I = inf but J < inf

3.
Infinite activity and infinite variation process if I = inf and J = inf

This classification distinguishes the behavior of small jumps around the origin. Due to the high frequency of tiny jumps, an infinite activity process implies that the total number of jump occurrences during any time interval is infinite. The infinite variation process, which encompasses the infinite activity process, indicates that beyond the infinite jump frequency, the summation of the absolute values of all occurred jumps goes to infinity too in any finite time interval. Both infinite activity and infinite variation processes can be used as the building block for the pure-jump Lévy market models developed in the next section.

2.3 Pure jump Lévy market models

After the 1987 equity market crash, investors flocked to buy out-of-the-money put options to protect their equity positions, and this pushed up the out-of-the-money volatility and made the negative skew more pronounced. Additionally, it is well known that log daily returns have significant skewness and are fat-tailed. As the market evidence does not support the constant volatility Brownian motion models, people started looking for models with richer structures that yield realistic explanations.

It is natural to think, after observing a catastrophic market crash, that the market moves not only continuously, but jumps from time to time. As the first attempt in this regard, R. Merton in 1976 proposed a jump-diffusion model for the equity market in [45] which used Brownian motion for the small movements and jumps for the large. However, in the real world trading never happens continuously but rather one trade after another, and the movement of the stock price path cannot be absolutely continuous because it is at least limited by the minimum tick size (e.g. 1 cent for the traded stocks on the New York Stock Exchange).
This raises the question of whether a Brownian component is needed in the model at all, especially in the interest of parsimonious modeling. Several researchers [42] have argued that in reality the continuous diffusion component is not statistically significant, as long as the small moves can be represented by structures other than Brownian motion. Under this rationale, Brownian motion is excluded from the pure jump Lévy models.

In this section I will first describe the dynamics of two popular Lévy processes, VG and CGMY. The market models use these processes to describe the logarithm of the asset price; consequently, the asset price itself follows an exponential Lévy process. Careful treatment is required here, because a legitimate model for the stock price must satisfy the martingale condition under the appropriate measure and associated numeraire in order to prevent arbitrage opportunities. This section will discuss how to make the exponential Lévy process a martingale by correcting the convexity term caused by the exponentiation.

2.3.1 Variance Gamma process

The Variance Gamma (VG) model developed by Madan, Carr and Chang in [42] is an elegant model that offers analytical tractability and straightforward simulation schemes, and it has been one of the most well-known pure-jump Lévy models in this area. A VG random variable X follows a 3-parameter (σ, ν, θ) probability law, and its characteristic function is given by

φ_VG(u; σ, ν, θ) = (1 − iuθν + ½σ²νu²)^{−1/ν}, (2.5)

with σ ∈ R₊, ν ∈ R₊, θ ∈ R. The elegance of the VG process lies in the fact that its Lévy jump density can be expressed in the simple form

k_VG(x) = { C exp(Gx)/|x|, x < 0; C exp(−Mx)/x, x > 0 }, (2.6)

where

C = 1/ν, G = (√(¼θ²ν² + ½σ²ν) − ½θν)^{−1}, M = (√(¼θ²ν² + ½σ²ν) + ½θν)^{−1}. (2.7)

One can recognize that each branch in Eqn. (2.6) is in fact the Lévy measure of a gamma random variable.
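The two parameterizations in Eqns. (2.5)-(2.7) can be cross-checked numerically: substituting C, G, M into the equivalent (C, G, M) form of the characteristic function, φ_VG(u) = (GM/(GM + (M − G)iu + u²))^C, must reproduce Eqn. (2.5). A small Python sketch (the parameter values are illustrative, not taken from the text):

```python
import cmath
import math

def vg_cgm_params(sigma, nu, theta):
    """Map the (sigma, nu, theta) VG parameterization to (C, G, M), Eqn. (2.7)."""
    root = math.sqrt(theta ** 2 * nu ** 2 / 4.0 + sigma ** 2 * nu / 2.0)
    return 1.0 / nu, 1.0 / (root - theta * nu / 2.0), 1.0 / (root + theta * nu / 2.0)

def vg_cf_snt(u, sigma, nu, theta):
    """VG characteristic function in the (sigma, nu, theta) form, Eqn. (2.5)."""
    return (1 - 1j * u * theta * nu + 0.5 * sigma ** 2 * nu * u ** 2) ** (-1.0 / nu)

def vg_cf_cgm(u, C, G, M):
    """The same characteristic function in the (C, G, M) form."""
    return (G * M / (G * M + (M - G) * 1j * u + u ** 2)) ** C

sigma, nu, theta = 0.2, 0.5, -0.1   # illustrative values
C, G, M = vg_cgm_params(sigma, nu, theta)
for u in (0.5, 1.0, 3.0):
    assert abs(vg_cf_snt(u, sigma, nu, theta) - vg_cf_cgm(u, C, G, M)) < 1e-9
```

The identity holds because the mapping in Eqn. (2.7) implies 1/(GM) = σ²ν/2 and 1/G − 1/M = −θν.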
This indicates that a VG random variable can be decomposed into two gamma random variables, one contributing the positive jumps and the other the negative jumps. Under this representation, a VG random variable X_VG(C, G, M) can be written as the difference of two gamma random variables:

X_VG(C, G, M) = X_g(C, 1/M) − X_g(C, 1/G). (2.8)

This fact leads to the straightforward simulation algorithm that we present in the following section. The VG process can handle the skewness and excess kurtosis exhibited by historical stock prices, and fits the vanilla option volatility curve well for single maturities. For instance, a negative parameter θ results in negative skewness, and the parameter ν provides the primary control over the fat tails of the empirical distribution. For the vanilla option market, a negative θ accounts for the negative slope of the volatility curve.

The VG process can also be intuitively expressed as a time-changed Brownian motion, where the time-change process² is a gamma process. Specifically, a VG process can be obtained by substituting a gamma random time g(t) for the deterministic time t in a drifted Brownian motion X(t) = θt + σW(t). Under the (σ, ν, θ) parameterization we have

X_VG(t) = θg(t) + σW(g(t)), (2.9)

where g(t) follows the gamma distribution Gamma(t/ν, ν).

²A time-change process is also called a subordinator in some of the literature.

The concept of time-changed Brownian motion has a strong economic intuition. The market clearly does not evolve identically every day; rather, on some days trading activity is intense, while on other days the market is quiet and less trading goes on. So the length of a market day is better measured by a random "business time" than by calendar time. As a technical benefit, the VG process can be simulated by generating a standard Brownian motion subordinated by a gamma random time.
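The time-change representation in Eqn. (2.9) translates directly into a two-line sampler. The sketch below (with illustrative parameter values) checks the sample mean against E[X_t] = θt:

```python
import random
import statistics

def simulate_vg_tcbm(t, sigma, nu, theta, rng):
    """One draw of X_t ~ VG as Brownian motion run on a gamma clock, Eqn. (2.9)."""
    g = rng.gammavariate(t / nu, nu)   # gamma "business time", E[g(t)] = t
    w = rng.gauss(0.0, 1.0)            # standard normal increment
    return theta * g + sigma * (g ** 0.5) * w

rng = random.Random(7)
t, sigma, nu, theta = 1.0, 0.2, 0.5, -0.1   # illustrative values
draws = [simulate_vg_tcbm(t, sigma, nu, theta, rng) for _ in range(200_000)]
assert abs(statistics.fmean(draws) - theta * t) < 5e-3   # E[X_t] = theta * t
```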
2.3.2 CGMY process

The Variance Gamma process is of infinite activity but finite variation; it can be extended, by adding one parameter, to give better control over the fine structure of the asset return distribution. The generalized model is the CGMY model developed by Carr et al. in [12]. Based on the (C, G, M) parameterization of VG, the CGMY process adds a fourth parameter Y to the power of the denominator x in Eqn. (2.6). The CGMY jump density is then given by

k_CGMY(x) = { C exp(Gx)/|x|^{1+Y}, x < 0; C exp(−Mx)/x^{1+Y}, x > 0 }, (2.10)

where C > 0, G > 0, M > 0, Y < 2. (The restriction Y < 2 keeps the second moment of the process finite.) By adding Y, we gain the flexibility to specify a finer structure. For Y < 0, the Lévy process is of finite activity; for 0 ≤ Y < 1 it is of infinite activity but finite variation; for 1 ≤ Y < 2 the process is of infinite activity and infinite variation. As special cases, the VG process is recovered for Y = 0, and Kou's double exponential model in [40] is a finite activity process with Y = −1. The characteristic function of a CGMY random variable is given by

φ_CGMY(u; C, G, M, Y) = exp( CΓ(−Y)[(M − iu)^Y − M^Y + (G + iu)^Y − G^Y] ), (2.11)

where Γ(·) is the gamma function. The CGMY process also has a time-changed Brownian motion representation, recently discovered in Madan et al. [43]. We will present it together with the CGMY simulation algorithm in the next section.

2.3.3 Market models

Lévy market models assume that the martingale component of the dynamics of the log price X_t is given by a Lévy process, e.g. VG or CGMY. As we are mostly concerned with pricing derivatives, the model specification under the physical measure is skipped. Rather, the stock price dynamics under the risk-neutral measure is given by the exponential Lévy process

S_t = S_0 exp( (r + ω)t + X_t ), (2.12)

where r is the risk-free interest rate and ω accounts for the "logarithmic convexity correction". The ω
appears in the exponential because it is needed to make the discounted stock price a martingale, a general requirement of asset pricing theory to prevent arbitrage opportunities. It can be shown that ω is defined by

exp(−ω) = φ(−i; Θ),

where φ(·;·) is the characteristic function of the Lévy process and Θ represents the model parameters. For the VG and CGMY models, the characteristic functions are given in Eqns. (2.5) and (2.11) respectively.

Lévy market models have recently been applied to option pricing in the real world and their popularity has been increasing (see [17, 50] for details). Here we only briefly compare the Lévy models with the stochastic volatility models, in terms of their pros and cons for option pricing. Lévy models have the advantage of staying within the family of one-dimensional Markov processes, whereas stochastic volatility models need an additional stochastic process to capture the volatility. Secondly, stochastic volatility models, which have only continuous martingale components, are unable to generate the smile/skew volatility curve for short-dated options. This is because path continuity prevents Brownian motions from generating enough variation over a short period of time dt. Such a problem does not exist for infinite activity Lévy models, because an infinite number of jumps occur during a time interval of any length. Therefore, even for short-lived options such Lévy models can produce the exhibited skews.

However, pure jump Lévy models contradict the market in some other respects. For instance, under Lévy models the implied volatility curve of long-dated options flattens out, whereas the market shows significant skew in those options. This is a consequence of the i.i.d. increment assumption that Lévy processes carry.
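As a quick numerical check of the relation exp(−ω) = φ(−i; Θ), the sketch below computes ω for the VG model from its characteristic function and compares it with the closed form ω = ln(1 − θν − σ²ν/2)/ν implied by Eqn. (2.5) (parameter values are illustrative):

```python
import cmath
import math

def vg_cf(u, sigma, nu, theta):
    """VG characteristic function, Eqn. (2.5)."""
    return (1 - 1j * u * theta * nu + 0.5 * sigma ** 2 * nu * u ** 2) ** (-1.0 / nu)

def convexity_correction(sigma, nu, theta):
    """omega defined by exp(-omega) = phi(-i), so that the discounted
    stock price in Eqn. (2.12) is a martingale."""
    return -cmath.log(vg_cf(-1j, sigma, nu, theta)).real

sigma, nu, theta = 0.2, 0.5, -0.1   # illustrative values
omega = convexity_correction(sigma, nu, theta)
closed_form = math.log(1 - theta * nu - 0.5 * sigma ** 2 * nu) / nu
assert abs(omega - closed_form) < 1e-12
# per-unit-time martingale check: E[exp(omega + X_1)] = exp(omega) * phi(-i) = 1
assert abs(math.exp(omega) * vg_cf(-1j, sigma, nu, theta).real - 1.0) < 1e-12
```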
The central limit theorem (CLT) states that the normalized sum of a large number of i.i.d. random variables approaches the normal distribution, so in the long run the skewness and kurtosis provided by the Lévy process are suppressed by the CLT. Another contradicting example is that under time-homogeneous Lévy models the implied volatility surface of the forward-start option³ is theoretically an exact duplicate of the current implied volatility surface, whereas the market indicates a significant difference between the two surfaces. A third unsatisfactory feature of Lévy models is the constant variance of the process, which cannot explain the volatility clustering and leverage effect observed in historical realized volatility data.

To solve the above problems associated with basic Lévy models, Carr et al. in [14] proposed stochastic volatility Lévy models, which essentially employ an additional Markov process to time-change the Lévy process in the spot price dynamics. These models combine the advantages of the basic Lévy and stochastic volatility models, and thus attack the aforementioned deficiencies simultaneously. Although it would be naive to claim that stochastic volatility Lévy models are a perfect description of the real world, they provide a closer look that helps explain option market phenomena.

2.4 Monte Carlo simulation

Monte Carlo simulation has been widely used in financial engineering, with applications ranging from pricing and hedging to risk management. Compared to other methodologies for pricing derivatives, Monte Carlo simulation has the advantage of being flexible and easy to implement. In fact, Monte Carlo simulation is the only viable approach when pricing derivatives with complicated structures, such as the exotic options and hybrid products that have become a booming business in recent years.

³A plain vanilla option that comes into life on a specified future date.
For a full account and overview in this regard, readers are referred to [26, 37]. It is a well-known drawback of Monte Carlo simulation that convergence is slow: the confidence interval shrinks at a rate proportional only to the square root of the number of random draws. In practice, one needs variance reduction techniques to speed up convergence. Depending on the specific scenario, different variance reduction tools should be chosen. For instance, when pricing far out-of-the-money options, importance sampling is used to redirect the process toward the region of interest and then correct for the change by multiplying by the Radon–Nikodym ratio. Variance reduction is especially important in the derivative pricing business, because a customer often wants a real-time quote on a product with an arbitrarily complicated structure.

In this section I will focus on the simulation algorithms for the VG and CGMY processes and provide numerical comparisons between different simulation schemes. I will also discuss the importance sampling technique under Lévy processes, with numerical demonstrations.

2.4.1 Simulating VG and CGMY

Variance Gamma

We have learned from Eqn. (2.8) that a VG random variable can be decomposed into two gamma random variables. Based on this rationale, one can take advantage of gamma random variable simulation and generate the VG process as the difference of two gamma random variables. The corresponding simulation algorithm, using the {C, G, M} parameterization, is listed in Table (2.1).

Simulation of X_t ~ VG(t; C, G, M)
1. Generate G⁻_t ~ Gamma(tC, 1/G)
2. Generate G⁺_t ~ Gamma(tC, 1/M)
3. Return X_t = G⁺_t − G⁻_t

Table 2.1: VG process simulation as the difference of two gamma random variables

CGMY

Unfortunately, an equally elegant simulation algorithm is not available for all Lévy processes.
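Table 2.1 can be sketched in a few lines of Python (parameter values are illustrative). The test compares the sample mean with E[X_t] = tC(1/M − 1/G), the difference of the two gamma means:

```python
import random
import statistics

def simulate_vg_gamma_diff(t, C, G, M, rng):
    """One draw of X_t ~ VG(t; C, G, M) as the difference of two gammas (Table 2.1)."""
    g_minus = rng.gammavariate(t * C, 1.0 / G)   # negative-jump component
    g_plus = rng.gammavariate(t * C, 1.0 / M)    # positive-jump component
    return g_plus - g_minus

rng = random.Random(42)
t, C, G, M = 1.0, 2.0, 10.0, 8.0   # illustrative values
draws = [simulate_vg_gamma_diff(t, C, G, M, rng) for _ in range(200_000)]
mean_theory = t * C * (1.0 / M - 1.0 / G)
assert abs(statistics.fmean(draws) - mean_theory) < 5e-3
```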
The most general approach to simulating an arbitrary Lévy process is to treat the jumps as a compound Poisson process and sample from its Lévy density. For compound Poisson simulation, one first calculates the jump arrival rate and simulates the point process to locate the random sequence of jump epochs. Given that a jump has occurred, its size is sampled from the normalized Lévy density. A compound Poisson random variable X_CP(t) is then the sum of all jumps up to time t.

The problem with this approach is that for infinite activity and infinite variation Lévy processes, the arrival rate is infinite on any time interval. However, one realizes that this infinity is caused by the jumps of tiny size, while the arrival rates of large jumps are always finite by definition. This suggests a general way of simulating such processes: the small jumps are cut off and approximated, while the large jumps are simulated as a compound Poisson process. Below, this general idea is tailored to a CGMY simulation algorithm.

We first cut off the small jumps of absolute size less than ε, a small positive value⁴. These small jumps will be approximated by a Brownian motion with the corresponding variance⁵. For the remaining large jumps, compound Poisson simulation is employed and we sample the jump sizes using the acceptance-rejection method. We first form the truncated jump density k^ε_CGMY from Eqn. (2.10):

k^ε_CGMY(x) = { C exp(Gx)/|x|^{1+Y}, x < −ε; C exp(−Mx)/x^{1+Y}, x ≥ ε }.

To obtain the best efficiency from the acceptance-rejection method, we want to find a proposal density f_ε(x) whose (suitably scaled) value is close to but always above k^ε_CGMY(x) at every x. For x ≥ ε, it can be shown that the function

f_ε(x) = (Yε^Y / x^{Y+1}) · 1_{[x ≥ ε]}, Y < 2,

has an efficiency close to 1 as ε approaches zero.

⁴For practical purposes we pick ε = 10⁻⁴.
⁵Brownian motion provides a good approximation for small jumps, as shown by Asmussen and Rosiński [1].
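The acceptance-rejection step above can be sketched as follows for the positive jumps (the negative side is symmetric, with G in place of M). Inverting the Pareto proposal CDF gives x = εu^{−1/Y}, and the ratio of the target density C e^{−Mx}/x^{1+Y} to the scaled envelope C/x^{1+Y} gives the acceptance probability e^{−Mx}. Parameter values are illustrative:

```python
import math
import random

def sample_positive_cgmy_jump(M, Y, eps, rng):
    """One large positive CGMY jump (size >= eps) by acceptance-rejection."""
    while True:
        u = 1.0 - rng.random()                 # uniform on (0, 1]
        x = eps * u ** (-1.0 / Y)              # Pareto proposal via CDF inversion
        if rng.random() <= math.exp(-M * x):   # accept with target/envelope ratio
            return x

rng = random.Random(0)
M, Y, eps = 10.0, 0.5, 1e-4   # illustrative values
jumps = [sample_positive_cgmy_jump(M, Y, eps, rng) for _ in range(1000)]
assert min(jumps) >= eps   # all accepted jumps exceed the cutoff
```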
The cumulative distribution function of f_ε(x) is

F_ε(x) = (1 − ε^Y x^{−Y}) · 1_{[x ≥ ε]},

whose inverse is simply F_ε^{−1}(u) = εu^{−1/Y}, where u is a uniform random variable. The case of negative large jumps is treated identically. For jump sizes with absolute values smaller than ε, we calculate the variance as follows:

σ_ε² = ∫_{−ε}^{+ε} x² ν(dx) < ∞.

The small jumps are then approximated by a Brownian motion with zero drift and variance σ_ε². So far we have developed a simulation algorithm by decomposing the CGMY process into three components, i.e. the large positive jumps, the large negative jumps, and the small jumps, and generating each component separately. The algorithm is listed in Table (2.2). The λ⁻ and λ⁺ are the intensities of negative and positive jumps respectively, and can be computed as incomplete gamma functions.

2.4.2 Simulating VG and CGMY as time-changed Brownian motion

The benefit of viewing a Lévy process as a time-changed Brownian motion is twofold with respect to simulation. First, we avoid dealing directly with the Lévy jump density, which might be difficult to sample from. Second, it provides a modularity benefit, fitting neatly into existing simulation packages.

Simulation of Z_t ~ CGMY(t; C, G, M, Y)
1. Calculate λ⁻ = C ∫_{−∞}^{−ε} e^{Gx}/|x|^{Y+1} dx and λ⁺ = C ∫_ε^{∞} e^{−Mx}/x^{Y+1} dx
2. Simulate N⁻ ~ Poisson(tλ⁻) and N⁺ ~ Poisson(tλ⁺), where N⁻ and N⁺ are the numbers of negative and positive jumps respectively
3. For the negative jumps, loop over i = 1 : N⁻
   • Repeat
     – Generate {W, V} ~ Uniform[0, 1]
     – Set X⁻_i = εW^{−1/Y} (the magnitude of the jump)
     – Set T = exp(G X⁻_i) (the ratio of the scaled envelope to the target density)
   • Until VT ≤ 1, and then store X⁻_i
4. Do the same for the N⁺ positive jumps as in step (3), with M in place of G, and store all X⁺_i
5. Calculate σ_ε² = ∫_{−ε}^{+ε} x² k_CGMY(x) dx < ∞
6. Simulate X^ε = σ_ε √t B, where B ~ Normal(0, 1)
7. Return Z_t = − Σ_{i=1}^{N⁻} X⁻_i + Σ_{i=1}^{N⁺} X⁺_i + X^ε
Table 2.2: CGMY process simulation as compound Poisson plus Brownian motion approximation

Variance Gamma

As in Eqn. (2.9), VG can be viewed as a Brownian motion subordinated by a gamma process. Table (2.3) below gives the procedure for simulating a VG process X_t with parameter set (σ, ν, θ).

Simulation of X_t ~ VG(t; σ, ν, θ)
1. First step: generate G_t ~ Gamma(t/ν, ν)
2. Second step: insert G_t into a Brownian motion
   • Generate a standard normal random variable W ~ Normal(0, 1)
   • Return X_t = θG_t + σ√(G_t) W

Table 2.3: VG process simulation as a time-changed Brownian motion

CGMY

The expression of CGMY as a time-changed Brownian motion was recently discovered by Madan and Yor in [43], using a random truncation of a stable process. We skip the discussion here and only list, in Table (2.4), the simulation algorithm developed in [53].

2.4.3 Importance sampling under Lévy processes

Monte Carlo simulation is often used to evaluate integrals by sampling random points from the relevant probability distribution. However, the choice of sampling distribution makes a real difference to the efficiency of the method. For example, brute-force simulation of a rare event under the original distribution creates a great number of void points that contribute nothing to the calculation of the integral, causing high variance and slow convergence. Importance sampling is a variance reduction technique. As the name suggests, it samples from a different distribution under which most random draws make non-zero contributions to the integral evaluation. The displacement caused by the distribution

Simulation of the CGMY(t; C, G, M, Y) process
1. A = (G − M)/2, B = (G + M)/2.
2. Take ε a small value, say 0.0001. Jumps of the subordinator smaller than ε are replaced by their expectation; jumps bigger than ε are simulated by inverting the CDF.
3. The expectation of the small jumps is d = ∫_0^ε y · C/y^{Y/2+1} dy = Cε^{1−Y/2}/(1 − Y/2).
4. The arrival rate for jumps bigger than ε
is λ = ∫_ε^∞ C/y^{1+Y/2} dy = 2C/(Yε^{Y/2}).
5. Let T be a pseudo-time, T = tC√π Γ(1 + Y/2)/2^{Y/2}.
6. Generate a Poisson random variable N with arrival rate Tλ.
7. Generate t_i, i = 1, ..., N, uniformly distributed within [0, t].
8. Generate the jump y_i at t_i by y_i = ε(1 − u¹_i)^{−2/Y}, where u¹_i is an independent uniform sequence (inversion of the normalized Lévy density).
9. S(t) = dt + Σ y_i · 1_{[h(y_i) > u²_i]}, where u²_i is another independent uniform sequence (the calculation of h(y_i) is presented below).
10. Once S(t) is known, the CGMY process is X_t = A S(t) + √(S(t)) W, where W follows Normal(0, 1).

Calculation of the truncation function h(y)
1. h(y) = e^{−(B² − A²)y/2} · [Γ((Y+1)/2)/(Γ(Y)Γ(1/2))] · 2^Y (B²y/2)^{Y/2} · I(Y; B²y; B²y/2).
2. I(Y; 2λ; α) = H_{−Y}(√(2λ)) Γ(Y)/(2λ)^{Y/2}, where H_ν(·) is the Hermite function.
3. The Hermite function is explicitly known in terms of the confluent hypergeometric function ₁F₁:
H_ν(z) = 2^{ν/2} [ (Γ(1/2)/Γ((1−ν)/2)) ₁F₁(−ν/2; 1/2; z²/2) − (√2 z Γ(3/2)/Γ(−ν/2)) ₁F₁((1−ν)/2; 3/2; z²/2) ].

Table 2.4: CGMY process simulation as a time-changed Brownian motion

change will be corrected by multiplying each individual random evaluation by a ratio, as discussed in detail below. Suppose p(x) is a probability density function (pdf) and one is interested in the integral

I = ∫ f(x) p(x) dx. (2.13)

With a large number N of random draws x_i from the original p(x), the integral I can be approximated as

I′ = (1/N) Σ f(x_i). (2.14)

However, the pdf p(x) and the evaluated function f(x) are independent of each other, so p(x) may concentrate its weight in a region where f(x) has no significant values. The Monte Carlo approximation then becomes inefficient, because most draws are wasted in this case. To increase the efficiency, we can equivalently rewrite the integral as

J = ∫ f(x) (p(x)/q(x)) q(x) dx, (2.15)

so that it can be approximated by

J′ = (1/N) Σ f(x_j) p(x_j)/q(x_j), (2.16)

with the points x_j drawn from the new distribution q(x), termed the importance distribution.
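Before moving from random variables to processes, Eqns. (2.13)-(2.16) can be illustrated on a toy rare-event problem: estimating the Gaussian tail probability P(X < −3) by sampling from an importance distribution q recentered at −3 (a hypothetical example, not the VG experiment of Section 2.4.4):

```python
import math
import random

def normal_pdf(x, mu=0.0):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

rng = random.Random(1)
n = 100_000
est = 0.0
for _ in range(n):
    x = rng.gauss(-3.0, 1.0)                        # draw from q(x) = N(-3, 1)
    if x < -3.0:                                    # f(x) = 1_{x < -3}
        est += normal_pdf(x) / normal_pdf(x, -3.0)  # weight p(x)/q(x), Eqn. (2.16)
est /= n
true_p = 0.0013498980316301   # Phi(-3)
assert abs(est - true_p) / true_p < 0.05
```

A naive estimate with the same budget would see only about n·Φ(−3) ≈ 135 hits and a far wider confidence interval, which is exactly the inefficiency described above.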
q(x) shares the same sample space as p(x) but can be chosen to concentrate on the region where f(x) is non-zero. The technique reduces the variance incurred in the integral approximation.

Defined above is the concept of importance sampling for random variables. In finance, however, we are dealing with time series of asset returns, which are processes, i.e. collections of random variables indexed by time. Nevertheless, a process can be considered as a random variable X governed by the probability law P on the path space Ω. If we construct a new probability measure Q equivalent to P, namely

P(A) = 1 ⟺ Q(A) = 1,

then we call the new probability measure a change of measure. The Radon–Nikodym derivative Z_t = dP/dQ|_{F_t} is what we need to perform importance sampling. It is an F_t-measurable martingale process under the measure Q. A rigorous definition of the general Radon–Nikodym derivative Z_t for Lévy processes can be found in Sato [49]. Here we discuss the special case of a pure jump Lévy process without a continuous martingale component, i.e. without Brownian motion.

Proposition 1 Let (X_t, P) and (X_t, P′) be two pure jump Lévy processes with triplets (b, 0, ν) and (b′, 0, ν′). Then P|_{F_t} and P′|_{F_t} are equivalent for all t if and only if the following two conditions are satisfied:

1. The Lévy measures are equivalent, with

∫_{−∞}^{+∞} (e^{ℓ(x)/2} − 1)² ν(dx) < ∞, where ℓ(x) = ln(dν′/dν);

2. We must also have

b′ − b = ∫_{−1}^{1} x (ν′ − ν)(dx).

When they are equivalent, the Radon–Nikodym derivative is

dP′/dP |_{F_t} = e^{U_t}, (2.17)

where U_t is given by

U_t = lim_{ε→0} [ Σ_{s≤t, |ΔX_s|>ε} ℓ(ΔX_s) − t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) ]. (2.18)

The definition of the Radon–Nikodym derivative in Proposition 1 indicates that in general its calculation depends on the information of all jumps {ΔX_s, ∀s ≤ t} along the entire path. This is an unwanted property, because it can substantially increase the requirements for computer storage and computation.
For example, suppose one is applying the importance sampling technique to price a far out-of-the-money European option. Simulating only the terminal stock prices under the new measure is then no longer sufficient to price the option; one has to generate the whole evolution of each sample path. As a matter of fact, the benefit of the variance reduction would be largely offset, if not totally lost, by the excessive CPU time spent on the path simulation. However, the next two propositions show that in certain cases, namely when the ratio of the two Lévy measures is strictly exponential, we can still rely on the terminal values alone to calculate the Radon–Nikodym derivative. As it turns out, the VG and CGMY processes fall into this convenient class.

Proposition 2 Let (X_t, P) and (X_t, P′) be two pure jump Lévy processes with triplets (b, 0, ν) and (b′, 0, ν′). Then the Radon–Nikodym derivative dP′/dP|_{F_t} depends only on the terminal value X_t instead of the whole path {ΔX_s, ∀s ≤ t} if and only if the ratio of the two Lévy measures dν′/dν is a strict exponential function, i.e.

dν′/dν = e^{cx} for all x,

where c is a constant.

Proof of Proposition 2. From Eqn. (2.17), we have

dP′/dP |_{F_t} = e^{U_t}
= exp( lim_{ε→0} [ Σ_{s≤t, |ΔX_s|>ε} ℓ(ΔX_s) − t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) ] )
= exp( lim_{ε→0} Σ_{s≤t, |ΔX_s|>ε} ℓ(ΔX_s) ) · exp( −lim_{ε→0} t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) )
= exp( lim_{ε→0} Σ_{s≤t, |ΔX_s|>ε} c ΔX_s ) · exp( −lim_{ε→0} t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) )
= exp(cX_t) · exp( −lim_{ε→0} t ∫_{|x|>ε} (e^{ℓ(x)} − 1) ν(dx) ).

As shown in the proof, under the condition of Proposition 2 the calculation of the measure change depends solely on the terminal value X_t and no longer requires the path information. We show in Proposition 3 how this simplified measure change applies to the Variance Gamma process.

Proposition 3 Let P and P′ be the measures of Variance Gamma processes following VG(t; C, G, M) and VG(t; C′, G′, M′). The Lévy measures are ν and ν′, respectively.
X_t is the random variable under the measure P and can be decomposed into the difference of two gamma processes, i.e. X_t = g⁺_t − g⁻_t, where g⁻_t ~ Gamma(Ct, 1/G) and g⁺_t ~ Gamma(Ct, 1/M). Then the Radon–Nikodym derivative dP′/dP|_{F_t} depends only on the terminal values g⁻_t and g⁺_t, and not on the entire path {Δg^±_s, ∀s ≤ t}, only if C = C′, in which case

dP′/dP |_{F_t} = exp(−tZ) ℓ⁺(g⁺_t) ℓ⁻(−g⁻_t),

where

Z = ∫_{−∞}^{∞} (ν′ − ν)(dx),
ℓ⁻(x) = e^{−(G′−G)|x|}, x < 0,
ℓ⁺(x) = e^{−(M′−M)x}, x > 0.

The validity of Proposition 3 follows easily from Proposition 2, and the extension from the VG to the CGMY process is straightforward as well. We skip both proofs here.

The measure change technique is widely used in pricing complicated financial derivative products. For example, for a far out-of-the-money European put option, naive Monte Carlo simulation will produce a price with significant variance, since only a small proportion of the simulated paths end up below the strike. Another example is the up-and-out barrier option in which the upper barrier is far from the spot, so that the probability of breaching the barrier is slim. In such cases, a limited number of simulation paths is not sufficient to make an accurate estimate of the price. To perform importance sampling, we alter the process and redirect it toward the region of interest. For example, suppose the underlying dynamics follows a VG(t; C, G, M) law and the task is to price an up-and-out barrier call option with a high upper barrier. We could then sample from a new measure VG(t; C′, G′, M′) with G′ > G but M′ < M, so as to make the new process move upward more and downward less than the original one. Note that the parameters C and C′ must be kept equal in order to satisfy the condition in Proposition 2. By this means, more paths will favorably be sampled toward the barrier; the bias caused by the change is corrected by multiplying the results by the Radon–Nikodym derivative developed in Proposition 2.
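Proposition 3 can be put to work directly on the pseudo put experiment of Table (2.7): simulate the terminal gamma components under the tilted measure P′ and weight each payoff by dP/dP′, the reciprocal of the derivative in the proposition. A sketch (parameters as in Table (2.7); with equal C, the constant Z reduces to C[ln(G/G′) + ln(M/M′)] by Frullani's integral):

```python
import math
import random

# Table 2.7 parameters: original VG(C, G, M), tilted VG(C', G', M') with C' = C
C, G, M = 5.0, 10.0, 10.0
Gp, Mp = 7.0, 12.0
t, K = 4.0, -1.5

Z = C * (math.log(G / Gp) + math.log(M / Mp))   # integral of (nu' - nu)(dx)

rng = random.Random(3)
n = 200_000
price = 0.0
weight_sum = 0.0
for _ in range(n):
    g_plus = rng.gammavariate(t * C, 1.0 / Mp)    # positive component under P'
    g_minus = rng.gammavariate(t * C, 1.0 / Gp)   # negative component under P'
    x = g_plus - g_minus
    # dP/dP' = 1 / [exp(-tZ) * l+(g+) * l-(-g-)], from Proposition 3
    w = math.exp(t * Z + (Mp - M) * g_plus + (Gp - G) * g_minus)
    weight_sum += w
    price += w * max(K - x, 0.0)
price /= n
assert abs(weight_sum / n - 1.0) < 0.05   # E_{P'}[dP/dP'] = 1
assert price > 0.0
```

The first assertion checks the defining property of the measure change: the weights average to one under the tilted measure, so the weighted payoff is an unbiased estimate of the price under the original measure.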
We will show a concrete numerical example in the next subsection.

2.4.4 Numerical results

Simulation validation

In this section we first present simulation results for the CGMY process using the two different algorithms, as a compound Poisson process and as a time-changed Brownian motion (TCBM), and compare the CPU time spent by the two algorithms. Figure 2.1 qualitatively illustrates the goodness-of-fit of the two algorithms for the symmetric case (G = M), and Figure 2.2 for the asymmetric one (G ≠ M). In both plots the blue lines represent the binned simulation data, while the red lines are the theoretical PDFs, obtained by Fourier inversion of the characteristic function using the algorithm developed in [44]. Both figures show good fitting quality. We also tabulate the chi-square goodness-of-fit results in Tables (2.5) and (2.6)⁶ for a quantitative view of the algorithms' performance. We conduct the experiment for various configurations of the model parameters for both algorithms.

⁶χ² is the chi-square test statistic, k is the number of degrees of freedom, p is the chi-square CDF value, and χ²_{α,k} is the critical value at confidence level α with k degrees of freedom. The null hypothesis H₀ cannot be rejected (denoted by "H₀ N.R." in both tables) if χ² is less than χ²_{α,k}.
[Figure 2.1: Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.5, C = 0.8, G = 30, M = 30, Y = 0.5, 2×10⁴ samples]

[Figure 2.2: Chi-square goodness-of-fit test illustration for simulation of the CGMY(t; C, G, M, Y) process, as compound Poisson and time-changed Brownian motion, with t = 0.3, C = 0.5, G = 20, M = 10, Y = 0.5, 2×10⁴ samples]

The tables show that in all cases the null hypothesis H₀, which states that the simulated random variables are drawn from the specified CGMY distribution, cannot be rejected at any of the three significance levels (α = 0.01, 0.05, 0.1).

Efficiency comparison

Given that both algorithms pass the goodness-of-fit test, we next compare their efficiency in terms of CPU time. In CGMY process simulation, the value of the parameter Y deserves special attention because it controls the behavior of the small jumps, which cause most of the difficulty in simulation. For a large value of Y (Y < 2), the Lévy density increases sharply around the origin, and hence tiny jumps occur more frequently than for a smaller Y. Computation-wise, a larger Y value takes more simulation time for both algorithms. Although the compound Poisson method is more efficient for small Y, it suffers most from an increasing Y, because the acceptance-rejection method rejects more and more draws as Y gets larger.
The time-changed Brownian motion takes more time than the compound Poisson for small Y, since it involves the computation of special functions such as the confluent hypergeometric function ₁F₁, but it beats its counterpart for larger Y, where the rejection rate becomes overwhelming for the compound Poisson method. Figure (2.3) illustrates two aspects of the efficiency of the two algorithms:
1. both methods consume more CPU time as Y increases;
2. the compound Poisson beats the time-changed Brownian motion (TCBM) for Y ≤ 1 but underperforms the TCBM for Y > 1.

t    C    G   M   Y    χ²      p     k    χ²(0.01,k)  χ²(0.05,k)  χ²(0.10,k)  Result
0.5  0.8  30  30  0.5  95.77   0.42  98   133.476     122.108     116.315     H₀ N.R.
0.5  0.5  10  10  0.5  182.17  0.57  178  224.8       210.1       202.6       H₀ N.R.
0.5  0.5  30  20  0.5  80.64   0.13  95   130         118.8       113         H₀ N.R.
0.3  0.2  30  20  0.8  77.26   0.33  82   114.7       104.1       98.78       H₀ N.R.
0.3  0.2  30  20  1.2  114.24  0.09  135  176.1       163.1       156.4       H₀ N.R.
0.2  0.1  20  10  1.4  152.07  0.62  146  188.7       175.2       168.3       H₀ N.R.

Table 2.5: Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as a compound Poisson process, 10⁴ samples. N.R. stands for Not Rejected (at all three significance levels).

t    C    G   M   Y    χ²      p     k    χ²(0.01,k)  χ²(0.05,k)  χ²(0.10,k)  Result
0.5  0.8  30  30  0.5  86.91   0.24  96   131.141     119.871     114.131     H₀ N.R.
0.5  0.5  10  10  0.5  186.69  0.68  178  224.8       210.1       202.6       H₀ N.R.
0.5  0.5  30  20  0.5  106.79  0.64  101  137         125.5       119.6       H₀ N.R.
0.3  0.2  30  20  0.8  62.66   0.07  80   112.3       101.9       96.58       H₀ N.R.
0.3  0.2  30  20  1.2  130.76  0.34  137  178.4       165.3       158.6       H₀ N.R.
0.2  0.1  20  10  1.4  169.60  0.85  150  193.2       179.6       172.6       H₀ N.R.

Table 2.6: Chi-square goodness-of-fit test numerical results for simulation of the CGMY(t; C, G, M, Y) process as a time-changed Brownian motion, 10⁴ samples. N.R. stands for Not Rejected (at all three significance levels).

Importance sampling

We next show how to improve the accuracy of Monte Carlo simulation using importance sampling. For simplicity and convenience we take the Variance Gamma process for this demonstration, following Proposition 3 to perform the measure change. Assume that the underlying follows the symmetric process VG(t; C, G, M), and our task is to price a pseudo far out-of-the-money put option with a deep-low strike. We pick a new VG(t; C′, G′, M′) process trending downward, so that paths are more likely to breach the low strike. The model parameters and option specifications are listed in Table (2.7). With a smaller G′ but larger M′, we allow more negative jumps than positive ones. As illustrated in Figure (2.4), the draws of random variables under the new measure concentrate on the negative half axis, which is exactly what is needed to put more probability mass in the neighborhood of the deep-low strike. Figure (2.5) plots the averaged Monte Carlo price against the number of simulation runs for naive Monte Carlo and for Monte Carlo with importance sampling. It clearly shows the superior performance of importance sampling: the option price under the new process converges much faster than that under the original one.
[Figure 2.3: CPU time comparison of compound Poisson and time-changed Brownian motion for simulating the CGMY(t; C, G, M, Y) process with t = 0.2, C = 0.1, G = 20, M = 10 and varying Y, 10⁴ samples]

Original VG          New VG               Option specifications
C    G    M          C′   G′   M′         Type      Strike     Maturity  Initial  Interest
5    10   10         5    7    12         Euro put  K = −1.5   T = 4     S₀ = 0   r = 0

Table 2.7: Parameters and specifications for the numerical illustration of the VG measure change

[Figure 2.4: Histogram comparison for VG(t; C, G, M) and VG(t; C′, G′, M′) with the parameters listed in Table (2.7), 2×10⁴ samples. Panel (a): histogram for the original process; panel (b): histogram for the new process]

[Figure 2.5: Convergence comparison for pricing out-of-the-money put options under VG(t; C, G, M) and VG(t; C′, G′, M′) with importance sampling, parameters listed in Table (2.7)]

Chapter 3
The 3-Factor Lévy Based Short Rate Model

3.1 Background

Whereas for modeling in equity markets one arguably has the Black–Scholes theory as a benchmark or market standard, the situation in the fixed income market is more complicated. The mathematics involved in modeling fixed income market dynamics is essentially infinite dimensional, and no market standard has been established beyond a few special asset classes. On the other hand, the fixed income market is much larger than the equity market. It is so large that even bond derivatives are liquidly traded.
On 04/25/2006, for instance, the outstanding notional for the CBOT futures contracts on government bonds with maturities of [30, 10, 5, 2] years was worth 180 billion dollars, which amounts to the 2004 GDP of Ireland, the 30th largest country in the world in terms of GDP.

Research on modeling the fixed income market has been active. The first attempt was the equilibrium modeling pioneered by Vasicek [51] and Cox-Ingersoll-Ross [18] and followed by Hull-White [36], among others. Those models view the short rate, defined as the instantaneous increment of a unit deposit, as the only source of uncertainty in the economy. The short rate is then modeled by a mean-reverting process with Gaussian randomness. Short rate models are favored by hedge funds and investment banks' proprietary trading desks because those models supply the equilibrium prices, which are presumably the prices the market will asymptotically converge to. However, fixed income derivative traders are not fond of this type of short rate model because it is not arbitrage free; that is, differences between the model prices and the market quotes always exist under such models. Pricing derivatives with a model that cannot make the underlying arbitrage free is dangerous and is avoided by practitioners.

No-arbitrage models came up in the mid-1990s, represented by the Heath-Jarrow-Morton (HJM) model [31]. Instead of modeling the short rate, the HJM model describes the forward rate dynamics. Within the HJM framework, market models such as the BGM/J model [8, 38] were developed for particular fixed income rates. No-arbitrage models give reasonable derivative prices, because the underlying values are perfectly matched to the market by design. However, all the aforementioned models use Brownian motion to describe the uncertainty.
As we have discussed, the normality assumption associated with Brownian motion is barely supported by the market, because Brownian motion lacks the structural properties needed to model many observed market phenomena. Although the disagreement over the choice of models has not deterred banks from trading, a better model certainly helps one understand the market better, and hence profit more.

In this chapter we develop a new short rate model based on the Vasicek framework yet without using Brownian motion. Rather, infinitely active Lévy jump structures are included in the short rate in order to reconcile the discrepancy between the market and the existing models.

As one-factor models have proved less effective in explaining the correlations between bonds across different maturities, three independent Lévy factors are employed in the model. This setup is in line with the conclusion in [41, 32], which showed that 3 factors in the dynamics are adequate to explain more than 99% of the yield curve data historically.

In this chapter, the basic conventions and methodologies used in fixed income modeling are first introduced, together with a brief overview of the existing fixed income models. The degree of mathematics of general asset pricing theory will be kept minimal; only necessary concepts and notations will be included. We will spend most of this chapter developing the 3-factor Lévy based short rate model with a view on measure change. The formulas for fixed income derivatives under the model will be developed in the end.

3.2 Overview of fixed income modeling

3.2.1 Bond market and money account

Bonds are the basic element of the fixed income market. For all fixed income modeling they serve as the market input used to determine model structures and estimate model parameters. In practice, the zero-coupon bond is mostly used as the reference instrument to infer the yield curve.
For the sake of argument we hereafter assume that zero-coupon bonds with continuous maturity in time are traded on the market. A bond is a securitized form of loan; that is, a loan that can be traded. A zero-coupon bond P(t,T), issued at time t and maturing at T, is a contract that guarantees the holder one dollar to be paid at the terminal time T.¹ For brevity, such a zero-coupon bond P(t,T) is called a T-bond. In contrast to the T-bond, a coupon bond has intermediate coupon payments between t and T, made periodically. We summarize the assumptions we have made on T-bonds below:

- P(T,T) = 1 for all T;
- P(t,T) < 1 for all t < T;
- there exists a market for T-bonds for every T > 0, and P(t,T) is continuously differentiable in T.

To complete the market, one usually assumes there is a frictionless risk-free money account B(t). The money account grows as

    B(t) = \exp\left( \int_0^t r(s)\,ds \right),    (3.1)

where B(0) = 1 and the compounding is taken continuously. Here we have introduced the concept of a spot rate r(t), defined as the instantaneous increment at time t of the money account. Clearly B(t) is a risk-free asset insofar as its future value at t + dt bears no uncertainty infinitesimally. B(t) also relates amounts of currency at different times: in order to have one dollar in the bank account at time T we need to have

    \frac{B(t)}{B(T)} = \exp\left( -\int_t^T r(s)\,ds \right)

dollars in the account at time t ≤ T. The relationship between a T-bond P(0,T) and the money account B(T) can be described as

    P(0,T) = \exp\left( -\int_0^T r(s)\,ds \right) = \frac{1}{B(T)}.    (3.2)

We need another important yet straightforward relationship linking the T-bond P(t,T) with the spot rate r(t), given by

    P(t,T) = \exp\left( -\int_t^T r(s)\,ds \right).    (3.3)

¹Here we assume the bond is default free.

As Eqn. (3.3) is not intuitive about how much the bond grows during the period (t,T), the term yield to maturity (or yield) R(t,T) is designated to describe the average gain per unit time from the specific T-bond P(t,T):

    P(t,T) = \exp\left( -R(t,T)(T-t) \right),  \qquad  R(t,T) = -\frac{\log P(t,T)}{T-t}.    (3.4)

It is clear that at time t, the yield is a function of maturity T given the short rate dynamics. The plot of R(t,T) against the maturity T is referred to as the yield curve; Figure 3.1 shows an upward-sloping yield curve on February 9, 2005. In theory, the yield curve on different days can have all sorts of shapes, e.g. up-sloping, down-sloping or humped, reflecting people's expectations of future interest rate levels. While the yield curve can be bootstrapped off the bond prices by Eqn. (3.4), the spot rate is not directly observable. Available market proxies for the short rate include the Fed rate and a short-dated, say 1-month, treasury rate; the latter is considered better since short-dated treasury bonds are liquidly traded.

Figure 3.1: US treasury yield curve on Feb. 09, 2005

3.2.2 Forward rate and Libor rate

The term structure of zero-coupon bond prices does not contain very rich visual information, but several more informative measures can be expressed as implied interest rates. Below we list a variety of them, including both the conceptual and the market-observable ones.

- The discrete forward rate F(t; T_1, T_2) for the period [T_1, T_2] prevailing at t is defined as

    F(t;T_1,T_2) = \frac{1}{T_2 - T_1}\left( \frac{P(t,T_1)}{P(t,T_2)} - 1 \right).

  It can be regarded as the expected average return for the future time period [T_1, T_2], viewed at time t.

- The simple spot rate for the period [t,T] is denoted by F(t,T) = F(t;t,T).

- The instantaneous forward rate f(t,T) with maturity T prevailing at t is defined as

    f(t,T) = -\frac{\partial \log P(t,T)}{\partial T},

  which determines the instantaneous gain of the continuously compounded T-bond at the future time T.
- Libor (London InterBank Offer Rate) rates are the most important market-observable rates and underlie many interest rate derivative contracts such as swaps and caps/floors. In essence, Libor rates are simply compounded forward rates of different tenors, the most used one being the 3-month Libor rate. A Libor rate with tenor δ is given as

    L_t(T, T+\delta) = F(t; T, T+\delta).

Some simple yet important links between the bond prices and the above defined rates are summarized here. The T-bond price relates to the instantaneous forward rate in the following manner:

    P(t,T) = \exp\left( -\int_t^T f(t,u)\,du \right).

And the instantaneous forward rate approaches the short rate as T nears t:

    r(t) = \lim_{T \to t} f(t,T).

3.2.3 Change of numeraire

Since the celebrated Black-Scholes work, the most popular tool for computing asset prices has been "risk-neutral pricing". As argued in Harrison & Kreps [29] and Harrison & Pliska [30], the absence of arbitrage implies the existence of a risk-adjusted probability Q such that the current price of any security equals its discounted expectation of future values. Under this equivalent measure Q the associated discounting factor, referred to as the numeraire, is the riskless money account B(t). Thus, the problem of derivative pricing is simply reduced to calculating the discounted expectation under the risk-neutral measure Q. However, Geman et al [25] noted that neither is the Q measure necessarily the most natural choice for pricing a contingent claim, nor is the money account the most convenient numeraire. In fact, under the measure Q many expectation calculations can be considerably complicated. In such cases, a change of numeraire can help simplify the problem, and it has proved remarkably effective in reducing the complexity of pricing derivatives, especially in the fixed income market. Specifically, Geman et al in [25] introduced the following definition.

Definition 4 A numeraire is any positive non-dividend-paying asset.
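The bond-implied rates of Section 3.2.2 are elementary functions of discount factors. A minimal sketch (function names are ours; the flat curve used in the check is a hypothetical example, not thesis data):

```python
import math

def yield_to_maturity(P, t, T):
    # Continuously compounded yield, Eqn. (3.4): R(t,T) = -log P(t,T) / (T - t)
    return -math.log(P) / (T - t)

def forward_rate(P1, P2, T1, T2):
    # Simply compounded forward rate for [T1, T2]:
    # F(t; T1, T2) = (P(t,T1)/P(t,T2) - 1) / (T2 - T1)
    return (P1 / P2 - 1.0) / (T2 - T1)

def libor(P_T, P_T_plus_delta, T, delta):
    # Libor with tenor delta is the forward rate for [T, T + delta]
    return forward_rate(P_T, P_T_plus_delta, T, T + delta)
```

On a flat 5% continuously compounded curve, `yield_to_maturity` recovers 0.05 for every maturity, while every one-year forward equals the simple rate e^0.05 - 1, illustrating the compounding-convention gap between the two measures.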
Intuitively, a numeraire is a unit asset chosen so as to normalize all other assets. With this definition, the following Proposition 5 holds.

Proposition 5 Assume there exists a numeraire N and a probability measure Q^N, equivalent to the original measure Q^0, such that the price of any traded asset X discounted by N is a (local) martingale under Q^N, i.e.,

    \frac{X_t}{N_t} = E^N\left[ \frac{X_T}{N_T} \,\Big|\, \mathcal{F}_t \right], \qquad 0 \le t \le T.

Let U be an arbitrary numeraire associated with the measure Q^U. Similar to (N, Q^N), the pair (U, Q^U) satisfies

    \frac{X_t}{U_t} = E^U\left[ \frac{X_T}{U_T} \,\Big|\, \mathcal{F}_t \right], \qquad 0 \le t \le T.

Then the Radon-Nikodym derivative defining the measure Q^U is given by

    \frac{dQ^U}{dQ^N} = \frac{U_T N_0}{U_0 N_T}.    (3.5)

By Proposition 5, the discounted expectation of a traded asset X under an inconvenient measure can be translated to another, more convenient measure:

    E^N\left[ \frac{X_T}{N_T} \right] = E^U\left[ \frac{X_T}{N_T}\,\frac{dQ^N}{dQ^U} \right] = E^U\left[ \frac{U_0}{N_0}\,\frac{X_T}{U_T} \right].

The change of numeraire technique is a useful pricing tool at one's disposal. It is typically employed when the money account discount factor under the risk-neutral measure makes the expectation calculation difficult, such as in pricing caps/floors. To be specific, we can make the transformation

    E^0\left[ \frac{h(X_T)}{B_T} \right] = S_0\, E^S\left[ \frac{h(X_T)}{S_T} \right]

whenever the following two properties are satisfied:

- X_t S_t is a tradable asset (0 ≤ t ≤ T);
- h(X_T)/S_T is conveniently simple;

where h(X_T) is the payoff function at time T. The first condition makes the quantity (X_t S_t)/S_t = X_t a martingale under the measure Q^S; the second condition ensures that under the new numeraire the computation is made simpler. Change of numeraire is widely applied in fixed income and foreign exchange derivative pricing. As one can see in the sections where the pricing of caps/floors and swaptions is discussed, the change of numeraire significantly reduces the complexity of pricing.
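A concrete instance of this transformation, standard in the term-structure literature (our illustration, not taken from the thesis), chooses the T-bond itself as numeraire, S_t = P(t,T); the resulting measure is the T-forward measure:

```latex
% Take S_t = P(t,T), so that S_T = P(T,T) = 1. The generic transformation
% E^0[h(X_T)/B_T] = S_0 E^S[h(X_T)/S_T] then becomes
E^{0}\!\left[\frac{h(X_T)}{B_T}\right]
  \;=\; P(0,T)\, E^{T}\!\left[\frac{h(X_T)}{P(T,T)}\right]
  \;=\; P(0,T)\, E^{T}\!\left[\, h(X_T) \,\right].
% Discounting by the stochastic money account B_T is replaced by the
% deterministic factor P(0,T), at the cost of switching the expectation
% to the T-forward measure; h(X_T)/S_T = h(X_T) is "conveniently simple".
```

This is exactly the device used later for caplets, where the T_i-bond serves as numeraire.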
For FX derivatives, change of numeraire is used when the derivative is priced in one currency but the underlying is denominated in another currency, e.g. a quanto option. Interested readers are referred to [3].

3.3 Current short rate models

The first seminal paper on short rate modeling was published by Vasicek [51] in 1977, and together with the development of the interest rate market, fixed income models have been mushrooming ever since. As the first short rate model, Vasicek's work assumes the short rate evolves as an Ornstein-Uhlenbeck (OU) process with constant coefficients

    dr(t) = \kappa\left( \theta - r(t) \right)dt + \sigma\,dW_0(t),    (3.6)

where W_0(t) is a Brownian motion under the real-world measure. As far as pricing is concerned, one needs to move the process from the physical measure to the risk-neutral measure. Therefore, the concept of market price of risk is introduced as the compensation investors are paid for taking the risk of entering a risky contract. If we specify the market price of risk process λ(t) to be linear in the short rate², i.e. λ(t) = λr(t) with λ being a constant, the short rate process under the risk-neutral measure stays within the same OU family. Thus, the risk-neutral short rate process is given as

    dr(t) = (\kappa + \lambda\sigma)\left( \frac{\kappa\theta}{\kappa + \lambda\sigma} - r(t) \right)dt + \sigma\,dW(t),

where W(t) is the Brownian motion under the risk-neutral measure such that

    dW(t) = dW_0(t) + \lambda(t)\,dt.

The solution to the SDE in Eqn. (3.6) shows that at every time t, the random variable r(t) is normally distributed. On the bright side, this property brings analytical solutions for pricing a variety of fixed income products. However, r(t) being normally distributed implies the possibility of a negative short rate, which is counter-intuitive. For practical purposes, this drawback is sometimes tolerated by practitioners, who argue that, with reasonable parameter sets, the probability of r(t) dropping below zero is slim. Nevertheless, the possibility of negative short rates remains an undesirable property.
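The negative-rate drawback is easy to see by simulation with a first-order Euler discretization (the same scheme the thesis later applies to the Lévy model); parameter values below are hypothetical and the function names are ours:

```python
import math
import random

def simulate_vasicek(r0, kappa, theta, sigma, T, n_steps, rng):
    # Euler discretization of dr = kappa*(theta - r) dt + sigma dW
    dt = T / n_steps
    r = r0
    path = [r]
    for _ in range(n_steps):
        r += kappa * (theta - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(r)
    return path
```

With, say, r0 = 2%, long-run mean 4%, σ = 2% and κ = 0.5 over five years, a noticeable fraction of paths dips below zero at some point, while the cross-sectional mean of r(T) tracks the analytical value θ + (r0 − θ)e^{−κT}.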
Then came the celebrated Cox-Ingersoll-Ross (CIR) [18] model, which fixed this drawback. From its inception, the CIR model has been a benchmark short rate model. Under the CIR model, the short rate r(t) also follows a mean-reverting process, but it differs from Vasicek's in having the Brownian motion term multiplied by the square root of r(t). The CIR dynamics is given as

    dr(t) = \kappa\left( \theta - r(t) \right)dt + \sigma\sqrt{r(t)}\,dW_0(t).

²In general there is no reason why the market price of risk process should take the linear form, but this form is taken to simplify the problem.

It can be shown that the solution to the CIR SDE follows a non-central chi-square probability distribution, which is strictly non-negative³. Analogously to Vasicek's model, we move the process to the risk-neutral measure by specifying the market price of risk under the CIR model to be linear in the square root of r(t), i.e.

    \lambda(t) = \lambda\sqrt{r(t)}.

Under this formulation, the risk-neutral CIR process is given by

    dr(t) = (\kappa + \lambda\sigma)\left( \frac{\kappa\theta}{\kappa + \lambda\sigma} - r(t) \right)dt + \sigma\sqrt{r(t)}\,dW(t),

where W(t) is the Brownian motion under the risk-neutral measure such that

    dW(t) = dW_0(t) + \lambda(t)\,dt.

Both the Vasicek and CIR models have closed-form formulas for pricing bonds and bond options under the risk-neutral measure, but the solutions under Vasicek's are much simpler. Therefore, for the sake of simplicity, the Vasicek framework is adopted in our Lévy-based short rate model. We have covered two of the milestone models, but a complete review of existing models is far beyond the scope of this thesis. Interested readers are referred to the books by Brigo and Mercurio [9], Cairns [11], Rebonato [47] and Zagst [52].

³For a very small set of parameters, the process could hit zero.

3.4 3-Factor Lévy short rate model dynamics

The problem that has puzzled researchers in fixed income modeling resembles that in equity modeling: historical data show strong non-normality, and the derivative market exhibits significantly non-constant implied volatility.
These marked violations of the classic Black-Scholes or Black model call for alternative structures to explain the market behavior. We are encouraged by the success of pure jump Lévy models in equity modeling, and conjecture that similar problems in the fixed income market can be tackled in the same manner. Therefore, in this dissertation a short rate model is developed using the multi-factor Vasicek framework with the Brownian motion replaced by pure jump Lévy dynamics.

3.4.1 Motivation

A preliminary investigation (see [24]) of changes in 10 swap rates over the period 04/25/1994 to 10/14/2004 using the methods of independent component analysis (ICA for short) revealed that they may be written as mixtures of factors with the statistics given in Table 3.1.⁴ We notice that the first three factors have highly non-zero skewness of 1.05, -1.52, -1.12 and significant kurtosis of 21.62, 132.28, 68.27, respectively. The results indicate that classic Gaussian-based short rate models such as Vasicek and CIR, regardless of dimensionality, are inadequate to explain these time series data.

⁴We note that all ICA factors have zero means and unit variances by construction.

Table 3.1: ICA statistics for swap rates of 10 maturities

  Factor   Skewness   Kurtosis
    1        1.05       21.62
    2       -1.52      132.28
    3       -1.12       68.27
    4       -0.08       10.98
    5        0.15       11.72
    6        0.33       15.86
    7       -0.47        7.50
    8        0.37        6.50
    9        0.07        5.08
   10        0.0029      4.18

Besides, the statistics in Table 3.1 confirm the conclusion of Litterman [41] that 3 factors are sufficient to explain the variations in the yield curve. Lévy processes, as described in Chapter 2, are ideal candidates for incorporating the non-vanishing higher moments revealed in the time series. We are thus motivated to use a 3-factor Lévy dynamics to model the short rate in order to capture the characteristics shown in the table.

3.4.2 Model dynamics

Vasicek began modeling the short rate by studying economic equilibrium, but this step is skipped here; we simply borrow the conclusion.
The spot rate is written in a linear relationship with the 3 latent factors under the physical measure P as

    r(t) = \alpha + \beta' x(t),    (3.7)

where x(t) is a three-dimensional latent factor process, α is a real number and β is a 3-dimensional vector. We specify x(t) to follow the multidimensional Ornstein-Uhlenbeck (OU) process whose stochastic differential equation (SDE) is given as

    dx(t) = \left( a - \Gamma x(t) \right)dt + B\,dL(t),    (3.8)

where a is a 3-dimensional vector, Γ and B are full 3-by-3 matrices and L is a 3-dimensional vector of independent Lévy processes. The solution of the SDE in Eqn. (3.8) takes the form

    x(t) = e^{-\Gamma t} x(0) + \int_0^t e^{-\Gamma(t-s)}\left( a\,ds + B\,dL(s) \right).    (3.9)

The detailed derivation can be found in Appendix A.1. This specification determines the short rate dynamics under the physical measure P. However, pricing bonds and other contingent claims must be done under a risk-neutral measure Q. To change from the physical measure to the risk-neutral measure, we propose a specific form for the Radon-Nikodym derivative of Q with respect to P:

    \frac{dQ}{dP} = \prod_{i=1}^{3} \mathcal{E}\left( \left( e^{\gamma_i x - \theta_i |x|} - 1 \right) * (\mu_i - \nu_i) \right),    (3.10)

where μ_i is the random counting measure associated with the jumps of L_i, and ν_i is the compensator of μ_i under the P measure. γ_i and θ_i are coefficients for risk pricing, with (γ_i - θ_i) being the risk premium on positive jump risks and (γ_i + θ_i) the risk premium on negative jump risks. \mathcal{E}(\cdot) denotes the stochastic exponential, which makes the Radon-Nikodym derivative an exponential martingale. (For a complete treatment of the stochastic exponential see Protter [46].)

Under this proposed form of measure change, the risk-neutral process and the physical process remain in the same parametric family. Now that the model dynamics and measure change have been worked out, we choose the Variance Gamma (VG) process as the specific structure of the background driving Lévy process L(t). The model parameters are listed in Table 3.2.
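The dynamics (3.7)-(3.8) are later discretized by a first-order forward Euler scheme for the particle filter; a sketch of one such step is below. This is our own illustration: B is taken as the identity, Γ lower triangular (as in the reduced parameterization that follows), the VG increments are generated as differences of gamma variates, and all parameter values are hypothetical.

```python
import random

def vg_increment(dt, C, G, M, rng):
    # VG(dt; C, G, M) increment as a difference of two gamma variates
    return rng.gammavariate(C * dt, 1.0 / M) - rng.gammavariate(C * dt, 1.0 / G)

def short_rate(alpha, beta, x):
    # r = alpha + beta' x, Eqn. (3.7)
    return alpha + sum(b * xi for b, xi in zip(beta, x))

def euler_step(x, a, Gamma, B, dt, levy_params, rng):
    # One forward-Euler step of dx = (a - Gamma x) dt + B dL,
    # with L a vector of three independent VG processes.
    dL = [vg_increment(dt, p[0], p[1], p[2], rng) for p in levy_params]
    drift = [a[i] - sum(Gamma[i][j] * x[j] for j in range(3)) for i in range(3)]
    jump = [sum(B[i][j] * dL[j] for j in range(3)) for i in range(3)]
    return [x[i] + drift[i] * dt + jump[i] for i in range(3)]
```

With symmetric VG factors (G = M) and a = 0, the mean of each factor decays toward zero at the rate set by the diagonal of Γ, mirroring the mean reversion of the Gaussian OU case while retaining jump-driven skewness and kurtosis when G ≠ M.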
As shown in the left column, the original model has 43 parameters in total. Apparently, this large number of parameters would overwhelm the computer program for model estimation and hence substantially slow down the optimization convergence. Therefore, we seek to simplify the model configuration by removing redundancies in the model parameters. First, the matrix B serves to correlate dL(t) and create dependence among the factors x(t), whereas the matrix Γ has an identical effect; hence we can harmlessly collapse B to the constant identity matrix. Second, Γ does not need to be a full matrix: a lower triangular (positive definite) matrix suffices for its role yet saves 3 more parameters. Third, since we are primarily interested in pricing rather than the physical process of the short rate, we can assume the model parameters are under the risk-neutral measure; this way we eliminate the six measure change parameters γ_i and θ_i in Eqn. (3.10). The model parameterization is then reduced to only 25 parameters, as listed in the right column of Table 3.2. The model estimation is conducted in Chapter 4 for the reduced form.

Table 3.2: Numbers of model parameters for the original model and the reduced model

  Original parameter set           Reduced parameter set
  (physical parameters)            (risk-neutral parameters)
  Symbol         Number            Symbol         Number
  alpha          1                 alpha          1
  beta           3                 beta           3
  a              3                 a              3
  Gamma          9                 Gamma          6 (lower triangular matrix)
  B              9                 B              0 (constant identity matrix)
  x(0)           3                 x(0)           3
  L_{1,2,3}      9                 L_{1,2,3}      9
  gamma_{1,2,3}  3                 gamma_{1,2,3}  0
  theta_{1,2,3}  3                 theta_{1,2,3}  0
  Total          43                Total          25

3.5 Pricing of bonds and bond derivatives

In this section we discuss analytical solutions for pricing fixed income products. It is common practice to approach these problems by developing the characteristic function of the short rate (and its integral). As the building block for pricing derivatives, the bond pricing formula is derived first.
Given the bond pricer, it is straightforward to calculate the swap rate, which can be represented by a bond portfolio. Derivative pricing formulas, such as those for caps/floors and swaptions, are derived through the Fourier transform.

3.5.1 Joint characteristic function

We first write the characteristic function of the multi-factor OU process x(t) as follows:

    \varphi_{x(t)}(u) = E\left[ e^{iu'x(t)} \right]
                      = E\left[ e^{iu'e^{-\Gamma t}x(0) + iu'\int_0^t e^{-\Gamma(t-s)}(a\,ds + B\,dL(s))} \right]
                      = \exp\left( iu'e^{-\Gamma t}x(0) + iu'\left(I - e^{-\Gamma t}\right)\Gamma^{-1}a - \sum_{j=1}^{3}\int_0^t \psi_j\left( \left( u'e^{-\Gamma(t-s)}B \right)_j \right) ds \right)
                      = \exp\left( -\Psi_{x(t)}(u) \right),    (3.11)

where the characteristic function of each individual Lévy process is denoted by

    E\left[ e^{ivL_j(1)} \right] = \exp\left( -\psi_j(v) \right).

We may then write the characteristic exponent of x(t) as

    \Psi_{x(t)}(u) = -iu'e^{-\Gamma t}x(0) - iu'\left(I - e^{-\Gamma t}\right)\Gamma^{-1}a + \sum_{j=1}^{3}\int_0^t \psi_j\left( \left( u'e^{-\Gamma(t-s)}B \right)_j \right) ds.

Recall that the spot rate r(t) in Eqn. (3.7) is an affine combination of x(t); thus the characteristic function of r(t) is straightforwardly given as

    \varphi_{r(t)}(u) = E\left[ e^{iur(t)} \right] = E\left[ e^{iu\alpha + iu\beta'x(t)} \right] = e^{iu\alpha}\exp\left( -\Psi_{x(t)}(u\beta) \right).

To find the bond price, we are interested in the characteristic function of the integral of r(t). Below, we develop the joint characteristic function of the spot rate r(t) and its integral \int_0^t r(s)\,ds:

    \varphi_t(u,v) = E\left[ \exp\left( iu\int_0^t r(s)\,ds + ivr(t) \right) \right].

We develop as follows:

    E\left[ \exp\left( iu\int_0^t r(s)\,ds + ivr(t) \right) \right]
    = E\left[ \exp\left( iu\int_0^t \left( \alpha + \beta'x(s) \right)ds + iv\left( \alpha + \beta'x(t) \right) \right) \right]
    = e^{i\alpha(v+ut)}\,E\left[ \exp\left( iu\int_0^t \beta'e^{-\Gamma s}x(0)\,ds + iu\int_0^t\!\!\int_0^s \beta'e^{-\Gamma(s-w)}\left( a\,dw + B\,dL(w) \right)ds \right)
      \exp\left( iv\beta'e^{-\Gamma t}x(0) + iv\beta'\int_0^t e^{-\Gamma(t-w)}\left( a\,dw + B\,dL(w) \right) \right) \right]
    = e^{i\alpha(v+ut) + \left[ iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'e^{-\Gamma t} \right]x(0) + \left[ iut\beta' - iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'(I-e^{-\Gamma t}) \right]\Gamma^{-1}a}
      \times E\left[ \exp\left( \int_0^t\!\!\int_0^s iu\beta'e^{-\Gamma(s-w)}B\,dL(w)\,ds + \int_0^t iv\beta'e^{-\Gamma(t-w)}B\,dL(w) \right) \right].

For the final expectation we note first that

    \int_0^t\!\!\int_0^s iu\beta'e^{-\Gamma(s-w)}B\,dL(w)\,ds = \int_0^t\!\!\int_w^t iu\beta'e^{-\Gamma(s-w)}\,ds\,B\,dL(w) = \int_0^t iu\beta'\left( I - e^{-\Gamma(t-w)} \right)\Gamma^{-1}B\,dL(w).

The joint characteristic function is therefore given as

    \varphi_t(u,v) = \exp\Big( i\alpha(v+ut) + \left[ iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'e^{-\Gamma t} \right]x(0)
                                 + \left[ iut\beta' - iu\beta'(I-e^{-\Gamma t})\Gamma^{-1} + iv\beta'(I-e^{-\Gamma t}) \right]\Gamma^{-1}a \Big)
                     \times \exp\Big( -\sum_j \int_0^t \psi_j\Big( \big( \left( u\beta'(I-e^{-\Gamma(t-w)})\Gamma^{-1} + v\beta'e^{-\Gamma(t-w)} \right)B \big)_j \Big)\,dw \Big).    (3.12)

3.5.2 Bond

Equipped with the joint characteristic function, we can obtain the bond price P(0,t), given the factors at level x(0) = x, by setting u = i, v = 0 in Eqn. (3.12):

    P(0,t) = E\left[ \exp\left( -\int_0^t r(s)\,ds \right) \right] = \varphi_t(i,0) = e^{A},

where

    A = -\alpha t - \beta'\left( I - e^{-\Gamma t} \right)\Gamma^{-1}x + \left[ \beta'\left( I - e^{-\Gamma t} \right)\Gamma^{-1} - t\beta' \right]\Gamma^{-1}a
        - \sum_j \int_0^t \psi_j\left( i\left( \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1}B \right)_j \right) ds.

The matrix exponentials e^{-\Gamma t} involved can be calculated via the eigendecomposition

    \Gamma = T\Lambda T^{-1} = \sum_k \lambda_k Q_k, \qquad Q_k = T_k R'_k,

where T_k is the k-th column of T and R'_k is the k-th row of T^{-1}. We may then expand A as

    A = -\alpha t - \beta'\Gamma^{-1}x + \sum_j \beta'Q_j\Gamma^{-1}x\,e^{-\lambda_j t} - t\beta'\Gamma^{-1}a + \beta'\Gamma^{-2}a - \sum_j \beta'Q_j\Gamma^{-2}a\,e^{-\lambda_j t}
        - \sum_j \int_0^t \psi_j\left( i\left( \beta'\Gamma^{-1}B \right)_j - i\sum_k \left( \beta'Q_k\Gamma^{-1}B \right)_j e^{-\lambda_k(t-s)} \right) ds.

Since Lévy processes are time-homogeneous, the more general bond price P(s,t) can be expressed as

    P(s,t) = \exp\Big( -\alpha(t-s) - \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1}x(s) + \left[ \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1} - (t-s)\beta' \right]\Gamma^{-1}a \Big)
             \times \exp\Big( -\sum_j \int_0^{t-s} \psi_j\left( i\left( \beta'\left( I - e^{-\Gamma(t-s-w)} \right)\Gamma^{-1}B \right)_j \right) dw \Big)
           = \exp\left( a_{s,t} + b'_{s,t}x(s) \right),    (3.13)

where the coefficients a_{s,t} and b_{s,t} are defined as

    a_{s,t} = -\alpha(t-s) + \left[ \beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1} - (t-s)\beta' \right]\Gamma^{-1}a
              - \sum_j \int_0^{t-s} \psi_j\left( i\left( \beta'\left( I - e^{-\Gamma(t-s-w)} \right)\Gamma^{-1}B \right)_j \right) dw
            = -\alpha(t-s) - (t-s)\beta'\Gamma^{-1}a + \beta'\Gamma^{-2}a - \sum_j \beta'Q_j\Gamma^{-2}a\,e^{-\lambda_j(t-s)}
              - \sum_j \int_0^{t-s} \psi_j\left( i\left( \beta'\Gamma^{-1}B \right)_j - i\sum_k \left( \beta'Q_k\Gamma^{-1}B \right)_j e^{-\lambda_k(t-s-w)} \right) dw,    (3.14)

    b'_{s,t} = -\beta'\left( I - e^{-\Gamma(t-s)} \right)\Gamma^{-1} = -\beta'\Gamma^{-1} + \sum_j \beta'Q_j\Gamma^{-1}e^{-\lambda_j(t-s)}.    (3.15)

3.5.3 Swaps

Interest rate swaps are central to the (over-the-counter, OTC) fixed income derivative market; they have proven very successful in managing risks and are arguably among the most successful innovations in the financial market. Many exotic fixed income derivatives are written with the swap rate as underlying.
The economic motivation behind swaps is attributed to comparative advantage: different companies can borrow at different rates in different markets, and they would be better off if they were allowed to borrow from markets they cannot access themselves. The interest rate swap contract is designed to enable companies with different borrowing conduits to exchange these comparative benefits; see Hull [35] for more economic explanations.

In technical terms, an interest rate swap is an agreement between two parties to exchange cash flows at a series of predefined future dates. In a vanilla payer (receiver) interest rate swap, one party agrees to pay (receive) a predetermined fixed rate on agreed dates and receive (pay) a floating rate, often the Libor (London InterBank Offer Rate) prevailing one period before. The payer swap can be specified as follows:

- a notional N and a fixed rate K;
- a number of future dates at which cash flows are exchanged, T_0 < T_1 < ... < T_n, with α_i = T_i - T_{i-1} being the time fraction;⁵
- the fixed leg pays Nα_iK and receives Nα_iL_{T_{i-1}}(T_{i-1},T_i) at time T_i, where L_{T_{i-1}}(T_{i-1},T_i) is the Libor rate prevailing at T_{i-1} for the period [T_{i-1},T_i].

⁵There is no cash flow changing hands on the first date T_0, but T_0 sets the first floating rate paid at T_1.

Pricing an interest rate swap amounts to finding the fair value of the fixed rate K such that it is costless (with respect to the current yield curve) for both sides to enter the contract at the initial time. To determine the swap contract value, let us have a closer look at a single payment at time T_i. From the payer swap holder's point of view, the present value of the net cash flow at time T_i is

    \Pi_i(t;K) = N\alpha_i P(t,T_i)\left( F(t;T_{i-1},T_i) - K \right) = N\left( P(t,T_{i-1}) - P(t,T_i) - K\alpha_i P(t,T_i) \right).    (3.16)

Thus the total discounted cash flow Π(t;K) is the sum of all future discounted cash flows:

    \Pi(t;K) = \sum_i \Pi_i(t;K) = N\left( P(t,T_0) - P(t,T_n) - K\sum_{i=1}^{n}\alpha_i P(t,T_i) \right).

The swap rate is obtained by setting Π(t;K) = 0, so that

    K = k_t(T_0,\ldots,T_n) = \frac{P(t,T_0) - P(t,T_n)}{\sum_{i=1}^{n}\alpha_i P(t,T_i)}    (3.17)

for all t < T_0. By Eqn. (3.17), the swap rate can be expressed as the ratio of two T-bond portfolios. Armed with the bond pricer developed in the previous section, swap pricing by this formula is straightforward.

3.5.4 Swaptions

A swaption is, as the name suggests, an option written on an underlying swap. A European vanilla payer/receiver swaption with strike rate K is an option giving the holder the right, but not the obligation, to enter a payer/receiver swap at a given future date, the maturity. Usually the maturity coincides with the first resetting date T_0 of the underlying swap, and the time from T_0 to T_n⁶ is called the tenor of the swaption.

⁶See Section 3.5.3 for the swap specification.

Recall that the value of a payer swap with fixed rate K at the first reset date T_0 is given by

    \Pi(T_0;K) = N\sum_{i=1}^{n}\alpha_i P(T_0,T_i)\left( F(T_0;T_{i-1},T_i) - K \right),

and the payoff of the swaption with strike K at maturity T_0 is

    w(T_0;K) = N\left( \sum_{i=1}^{n}\alpha_i P(T_0,T_i)\left( F(T_0;T_{i-1},T_i) - K \right) \right)^{+}.    (3.18)

A swaption can be viewed as an option on a bond portfolio, and this makes it very difficult to evaluate because it can hardly be decomposed into elementary payoffs. Since Π(T_0; K = k_{T_0}), with k_{T_0} being the swap rate at time T_0, must be zero, one can show that the payoff in Eqn. (3.18) can also be written as

    w(T_0;K) = N\left( k_{T_0} - K \right)^{+}\sum_{i=1}^{n}\alpha_i P(T_0,T_i).    (3.19)

Eqn. (3.19) extends easily to the value of the swaption at time t < T_0:

    w(t;K) = N\,E^{Q}\left[ e^{-\int_t^{T_0} r_s\,ds}\left( k_{T_0} - K \right)^{+}\sum_{i=1}^{n}\alpha_i P(T_0,T_i) \,\Big|\, \mathcal{F}_t \right],    (3.20)

where the expectation is taken under the risk-neutral measure Q. The swaption formula in Eqn. (3.20) indicates that the valuation problem can be viewed as valuing a vanilla option written on the swap rate, multiplied by the value of a traded bond portfolio.
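As a quick numeric check of the swap rate formula, Eqn. (3.17) can be evaluated directly from a vector of discount factors; on a flat continuously compounded curve the swap rate reduces exactly to the common simply compounded forward rate. The sketch below is ours, with a hypothetical flat 5% curve:

```python
import math

def swap_rate(P, alphas):
    # k_t = (P(t,T0) - P(t,Tn)) / sum_i alpha_i P(t,Ti), Eqn. (3.17).
    # P = [P(t,T0), P(t,T1), ..., P(t,Tn)]; alphas = [alpha_1, ..., alpha_n].
    annuity = sum(a * p for a, p in zip(alphas, P[1:]))
    return (P[0] - P[-1]) / annuity
```

Since sum_i alpha_i P_i F_i telescopes to P(t,T0) - P(t,Tn), the swap rate is an annuity-weighted average of the forward rates; with all forwards equal, the average is exact.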
The di?culty lies in that, under the risk-neutral measure, the discounting factor and bond portfolio multiplier are dependent on the payofi such that they could not be taken outside the expectation operator. 64 To overcome this di?culty, we consider change of numeraire technique as dis- cussed in Section 3.2.3. We take the new numeraire to be the bond portfolio value and call it forward swap measure, which is deflned by deQ dQ = ?s = e ?x52s0 r(w)dw Pn i=1 fiiP(s;Ti)Pn i=1 fiiP(0;Ti) (3.21) = E ??e Ys(y)?1 ? ?(???) ? for 0 < s < T0 where eQ is denoted as the forward swap measure. Under the measure eQ, the swap rate ks is a positive martingale with ks = E ((Ysswap(y)?1)?(??e?)) ; (3.22) where E is the stochastic exponential (see [46]). In the above equations, ? stands for the random jump measure, and ~? and ? the compensation jump measures associated with forward swap measure eQ and the risk neutral measure Q, respectively. The value of a swaption at time 0 with strike K is given by w(K) = N nX i=1 fiiP(0;Ti)Ex65Q?(kT0 ?K)+? : The problem is now reduced to evaluate the expectation of (kT0 ?K)+ under the measure eQ. We learned from Carr and Madan [13] that the evaluation of such an expectation can be done using the transform method, i.e. the Fourier Transform, given the closed-form formula for the characteristic function of the swap rate ks is readily known. We then develop the characteristic function for ks under the measure eQ. To approach it, we need to evaluate eYs(y), which is the response of the measure change 65 process ?s to a jump shock in the L?evy factors. Consider a jump of size yj in the j-th factor (j = 1;2;3). We have that d?js ?js = ?e Ys(yj)?1 ? : On the other hand, by the deflnition of Radon-Nikodym derivative ?s in Eqn. (3.21) we may observe that d?js ?js = nX i=1 fiiP(s;Ti)P n i=1 fiiP(s;Ti) ? eb0s;TiBjyj ?1 ? = nX i=1 !i(s) ? eb0s;TiBjyj ?1 ? 
: (3.23) Toreduce thecomplexityinEqn.(3.23), weadoptthe ideaofslow-varyingmartingale as in the BGM model [8], setting !i(s) ? !(0). Such an approximation is empirically valid, because in reality the variability of !i is much less than eb0s;TiBjyj?1. Therefore, Eqn.(3.23) is simplifled to be d?js ?js ? nX i=1 !i(0) ? eb0s;TiBjyj ?1 ? = nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ? eb0s;TiBjyj ?1 ? : Now we wish to see d?js?j s expressed in the exponential form d?js ?js = ?ea s(yj) ?1 ? : This implies that as(yj) = ln ? 1+ nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ? eb0s;TiBjyj ?1 ?! ; and taking a flrst order approximation gives as(yj) = nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ?b0 s;TiB ? j yj : 66 We then write eYs(y) = exp ? 3X j=1 nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ?b0 s;TiB ? j yj ! : We may then infer that e?s;j (eyj) = ecs;j?j (eyj) cs;j = nX i=1 fiiP(0;Ti)P n i=1 fiiP(0;Ti) ?b0 s;TiB ? j ; where e?s;j and ?s;j are Levy densities under eQ and Q measures for the j-th factor. FollowingL?evy-Khintchinetheoremwehavethecharacteristicexponent e?s;j(u) for the jth L?evy factor at time s under the new forward swap measure, e?s;j(u) = ? Z 1 ?1 ?eiuy j ?1 ?e? s;j(yj)dyj = ? Z 1 ?1 ?eiuy j ?1 ?ec s;jyj?s;j(yj)dyj = ? Z 1 ?1 ?e(iu+c s;j)yj ?ecs;jyj ?? s;j(yj)dyj = ? Z 1 ?1 ?e(iu+c s;j)yj ?1 ?? s;j(yj)dyj + Z 1 ?1 (ecs;jyj ?1)?s;j(yj)dyj = ?Qs;j(u?ics;j)??Qs;j(?ics;j); where ?Qs;j is the risk neutral characteristic exponent for Levy factor j at time s. We now determine Y swaps (y) in Eqn. (3.22). Consider now a jump of yj in the evolution of the j-th factor. We have that dks ks = (Y swap s (yj)?1) : Again, by direct computation dks ks = ? P(s;Tn) 1?P(s;Tn) ? e(b0s;TB)jyj ?1 ? ? nX i=1 fiiP(s;Ti)P n i=1 fiiP(s;Ti) ? eb0s;TiBjyj ?1 ? : 67 Following the similar approximations (slow varying martingale and Taylor expanded exponential), we show the solution of Y swaps (y) as Y swaps (y) = exp ? 3X j=1 ?s;jyj ! ?s;j = ? P(0;Tn)1?P(0;T n) b0s;TnBj ? 
We are now able to build the characteristic function of the logarithm of the swap rate k_{T_0} at maturity T_0 under the forward swap measure by noting that

  k_{T_0} = k_0 exp( Σ_{j=1}^3 ∫_0^{T_0} γ_{s,j} dF_j(s) - Σ_{j=1}^3 ∫_0^{T_0} ∫_{-∞}^{∞} ( e^{γ_{s,j} y_j} - 1 ) ν̃_{s,j}(y_j) dy_j ds ),

where F_j denotes the j-th Lévy factor. We recognize from this expression that

  E[ e^{iu ln k_{T_0}} ] = exp( -Ψ^{swap}_{T_0}(u) ),

where

  Ψ^{swap}_{T_0}(u) = -iu Σ_{j=1}^3 ∫_0^{T_0} ψ̃_{s,j}(-iγ_{s,j}) ds + Σ_{j=1}^3 ∫_0^{T_0} ψ̃_{s,j}(uγ_{s,j}) ds.

With the knowledge of Ψ^{swap}, we can fully utilize the FFT method in [13] to evaluate swaption prices across different strikes.

3.5.5 Caps/Floors

A cap (floor) is a strip of caplets (floorlets) that gives the holder protection against rising (declining) rates. Thus, a cap contract consists of the following elements:

- a notional N and a fixed strike K;
- a set of future dates T_0 < T_1 < ... < T_n with T_i - T_{i-1} = α_i;
- at each time T_i, i = 1,...,n, the holder of the cap contract receives the cash flow Nα_i ( L_{T_{i-1}}(T_{i-1},T_i) - K )^+.

A floor contrasts with a cap in that the floor contract holder receives the cash amount Nα_i ( K - L_{T_{i-1}}(T_{i-1},T_i) )^+ at each time T_i. It is important to note that the following equality, known as the cap/floor parity, holds:

  Cp(t,K) - Fl(t,K) = π_p(t,K),

where Cp(t,K) denotes the value of a cap at time t with strike K, Fl(t,K) the corresponding value of a floor, and π_p(t,K) the value of a payer swap^7.

In this section we show how to price the caplet under the model. The floorlet's valuation is then straightforward using the parity relationship. Because the present value of a cap is simply the sum of the present values of all its caplets prior to maturity, a cap can be viewed as a portfolio of options. This is an important distinction from a swaption contract, which is an option on a portfolio, and it makes the evaluation of a cap relatively easier than that of a swaption.
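Both the swaption and the caplet valuations in this section reduce, once a characteristic function is in hand, to the damped Fourier inversion of Carr and Madan [13]. As a hedged illustration of that transform step only (not the thesis's code, which would plug in the swap-rate or forward-rate characteristic functions derived here), the sketch below prices a vanilla call by direct quadrature of the Carr-Madan integrand, using a Black-Scholes characteristic function as a stand-in so the result can be checked against the closed form:

```python
import cmath
import math

def bs_char_fn(u, s0, r, sigma, t):
    """Characteristic function of ln S_T under Black-Scholes dynamics
    (a stand-in for the model's characteristic functions in the text)."""
    mu = math.log(s0) + (r - 0.5 * sigma * sigma) * t
    return cmath.exp(1j * u * mu - 0.5 * sigma * sigma * u * u * t)

def carr_madan_call(strike, s0, r, sigma, t, alpha=1.5, v_max=120.0, n=12000):
    """Damped-transform call price C(k) = e^{-alpha k}/pi * Int_0^inf
    Re[e^{-ivk} psi(v)] dv, evaluated here by trapezoidal quadrature."""
    k = math.log(strike)
    h = v_max / n
    total = 0.0
    for m in range(n + 1):
        v = m * h
        phi = bs_char_fn(v - (alpha + 1.0) * 1j, s0, r, sigma, t)
        psi = math.exp(-r * t) * phi / (
            alpha * alpha + alpha - v * v + 1j * (2.0 * alpha + 1.0) * v)
        w = 0.5 if m in (0, n) else 1.0      # trapezoid end-point weights
        total += w * (cmath.exp(-1j * v * k) * psi).real
    return math.exp(-alpha * k) / math.pi * total * h
```

In production one would replace the quadrature by an FFT over a strike grid, as [13] does; the direct integral is shown because it makes the damping factor e^{-αk} and the shifted argument v - (α+1)i explicit.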
A single caplet with reset date T_{i-1} and settlement date T_i pays the holder the notional N multiplied by the difference between the Libor rate L_{T_{i-1}}(T_{i-1},T_i) (set at time T_{i-1} and maturing at time T_i) and the strike K if L_{T_{i-1}}(T_{i-1},T_i) > K, and zero otherwise. Therefore the value w_i(K) of such a caplet at the initial time is

  w_i(K) = Nα_i E^Q[ e^{-∫_0^{T_i} r_s ds} ( L_{T_{i-1}}(T_{i-1},T_i) - K )^+ ],

where the expectation is taken under the risk-neutral measure Q.

^7 The cap, floor and swap must have exactly coinciding dates for the future cash flow exchanges.

Similar to the swaption case, the above expression is inconvenient for evaluation, so we seek a convenient measure change. The new measure, termed the T_i-forward measure, is defined in the following way for s < T_i:

  dQ̃^{T_i}/dQ = Λ_{T_i}(s) = e^{-∫_0^s r(w)dw} P(s,T_i) / P(0,T_i)   (3.24)
               = E( (e^{Y^{T_i}_s(y)} - 1) ∗ (μ - ν) ).

We show that

  dΛ^j_{T_i}(s) / Λ^j_{T_i}(s) = exp( b'_{s,T_i}B_j y_j ) - 1,

where y_j is the jump in the j-th jump component (j = 1,2,3). After algebraic manipulation and approximation identical to the swaption case, we have

  e^{Y^{T_i}_s(y)} = exp( Σ_{j=1}^3 b'_{s,T_i}B_j y_j ).

The characteristic exponent of the j-th jump component under the forward measure is therefore given by

  ψ̃^{T_i}_{s,j}(u) = -∫_{-∞}^{∞} ( e^{iuy_j} - 1 ) ν̃_{s,j}(y_j) dy_j
                   = -∫_{-∞}^{∞} ( e^{iuy_j} - 1 ) e^{b'_{s,T_i}B_j y_j} ν_{s,j}(y_j) dy_j
                   = -∫_{-∞}^{∞} ( e^{(iu + b'_{s,T_i}B_j)y_j} - e^{b'_{s,T_i}B_j y_j} ) ν_{s,j}(y_j) dy_j
                   = -∫_{-∞}^{∞} ( e^{(iu + b'_{s,T_i}B_j)y_j} - 1 ) ν_{s,j}(y_j) dy_j + ∫_{-∞}^{∞} ( e^{b'_{s,T_i}B_j y_j} - 1 ) ν_{s,j}(y_j) dy_j
                   = ψ^Q_j( u - i b'_{s,T_i}B_j ) - ψ^Q_j( -i b'_{s,T_i}B_j ),

where ψ^Q_j denotes the characteristic exponent of the j-th jump component under the original risk-neutral measure Q.

Now we look at the dynamics of the underlying forward rate L_i(s) = F(s; s, T_i), the simply compounded spot rate prevailing between time s and T_i, defined as

  L_i(s) = (1/α_i) ( 1/P(s,T_i) - 1 ) = (1 - P(s,T_i)) / (α_i P(s,T_i)),

where α_i = T_i - s. It is easy to see that under the forward measure Q̃^{T_i}, L_i(s) is a martingale. Analogous to the swaption case, we define

  dL_i(s) / L_i(s) = Y_{α,s}(y_j) - 1
and we determine Y_{α,s}(y) from the analysis of

  dL_i(s) / L_i(s) = -[P(s,T_i) / (1 - P(s,T_i))] ( e^{Σ_j b'_{s,T_i}B_j y_j} - 1 ) - ( e^{Σ_j b'_{s,T_i}B_j y_j} - 1 )
                   = -[1 / (1 - P(s,T_i))] ( e^{Σ_j b'_{s,T_i}B_j y_j} - 1 ).

Hence, we approximate

  Y_{α,s}(y) = exp( -Σ_{j=1}^3 b'_{s,T_i}B_j y_j / (1 - P(0,T_i)) ).

We then obtain the characteristic exponent of the logarithm of L_i(T_{i-1}) = F(T_{i-1}; T_{i-1}, T_i), since the cap contract payoff is known at time T_{i-1}:

  E[ e^{iu ln L_i(T_{i-1})} ] = exp( -Ψ^i_{i-1}(u) ),

and we have

  Ψ^i_{i-1}(u) = -iu Σ_{j=1}^3 ∫_0^{T_{i-1}} ψ̃^{T_i}_{s,j}(-iγ_{s,j}) ds + Σ_{j=1}^3 ∫_0^{T_{i-1}} ψ̃^{T_i}_{s,j}(uγ_{s,j}) ds,
  γ_{s,j} = -b'_{s,T_i}B_j / (1 - P(0,T_i)).

Once we have obtained the characteristic function of the logarithm of the forward rate at time T_{i-1}, the rest of the calculation is similar to the swaption case: the FFT method is applied to compute the value of each caplet paying off at time T_i. The price of a cap is then the sum of all caplet values, and one can use the parity equation to deduce the value of a floor with the same structure as the cap.

Chapter 4

Particle Filter and Parameter Estimation

4.1 Background

The mission of assessing the empirical validity of interest rate models in general, and of our Lévy based model in particular, is of obvious importance. Before engaging in any performance comparison with other models, we should have a clear picture of how our model parameters influence the observable market quantities, such as bond prices. That is, the model should first be sensibly estimated.

The problem has two aspects. From the theoretical point of view, we have developed a state-space model containing time-invariant parameters and a vector of latent state variables driven by a multi-factor Lévy process. The short rate is an affine combination of the latent factors. Bond prices and derivative values are non-linear functions of the current state variables, given the model parameters. From the empirical perspective, time series of market prices for the past 10 years are available.
However, market prices are noisy data contaminated with trading errors, bid/ask spreads, market mis-specification, etc. The task here is to design an algorithm that accounts for both the model structure and the noisy data supply in order to obtain a sound estimate of the model parameters.

Parameter estimation, also called "calibration" in financial engineering lingo, is an optimization procedure that minimizes the difference between the model outputs and the market inputs. The challenge in estimating a state-space model is that the model outputs depend on the values of state variables that are not directly observable. Therefore, the critical step in estimating such models is to retrieve the values of the latent state variables by filtering out the aforementioned noises. Once the state variables are determined, the optimization is carried out by minimizing the pricing errors. As a special case, if the errors are assumed to be white noise following a multidimensional normal distribution, maximum likelihood estimation can be invoked.

Specifically, model estimation in the state-space framework with time series data consists of two steps:

(1) Take an initial guess of the model parameters (or the parameter values returned from the last optimization iteration); estimate the state variables sequentially through time by filtering; generate model prices based on the filtered state variables.

(2) Maximize the joint likelihood function of the errors between the model prices and the market prices.

Step (1) is the focus of this chapter; that is, we will focus on designing algorithms that can sequentially retrieve the unobservable state variables from a large set of noisy signals. The Kalman filter (KF) is the standard tool in this area, and under the assumptions that the state variables are driven by Brownian motion and that the measurement/propagation functions are linear, the KF has proved very capable of providing efficient, robust estimation.
We have seen massive applications of this technique in the area of financial engineering for model calibration, price prediction, etc. For instance, one investment strategy that hedge funds and banks' proprietary trading desks often use is to build an equilibrium model, cast it into state-space form, estimate it using the KF technique, make projections on price movements, and trade the difference between the model prediction and the market quotes. If the model indeed describes the correct market dynamics, market prices should converge (statistically) to the predicted prices and traders pocket the difference. The performance of such strategies crucially depends on the quality of the implementation of the filter technique used for model estimation and prediction.

A KF-like filter performs well if we live in a perfectly linear and Gaussian world. Unfortunately, the real world is not even close to perfect. As pointed out in Chapter 3, non-Gaussianity is significant in the fixed-income world, and pricing functions are highly nonlinear. Whereas the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), as extensions of the original KF, were devised to cope with the non-linearity, the non-Gaussianity cannot be rescued within the restrictive Kalman filter framework.

Therefore we resort to the particle filter (PF), a newly emerged filtering technique that has not been widely applied due to its high demand for computing power. However, as this demand is being eased by the recent dramatic reduction in computing cost, the particle filter has become an attractive alternative for handling non-Gaussian and non-linear problems. Unlike the KF, where the distribution is completely determined by the variance-covariance specification, the PF relies on a large number of simulated particles to represent the distribution.
The distribution is moved forward by propagating the entire set of particles, while under the KF only the mean and variance-covariance matrix need to be updated. The propagation of the particles is realized by Monte Carlo simulation, which is why the PF is also referred to in some literature as the sequential Monte Carlo method.

In this chapter, the Kalman filter and its extensions are discussed first, given their prominent status in state-space model estimation. Next we focus on explaining the particle filter technique, studying generic algorithms including the resampling technique. The generic particle filter algorithm is then customized to our model context and numerical results are presented. After the model is estimated, we perform factor analysis on the yield curve and experiment with caplet pricing using Monte Carlo simulation.

4.2 The problem and conceptual solution

Let us introduce the filtering problem. Define x_k ∈ R^{n_x} as the state vector, where n_x is the dimension of the state variable and k is the time index. The evolution of x_k follows the discrete-time stochastic model

  x_k = f_{k-1}(x_{k-1}, v_{k-1}),   (4.1)

where v_{k-1} is the randomness in the state evolution, and f_{k-1} is the propagation function perturbed by the unforeseeable disturbance v_{k-1}. The measurements are functions of the state variables x_k and the measurement noise w_k:

  z_k = h_k(x_k, w_k),   (4.2)

where z_k ∈ R^{n_z}, and h_k^1 is a known, possibly non-linear function. The error w_k is often assumed to be white noise.

^1 It is noteworthy that, for convenience, in many cases both the evolution function f_k and the measurement function h_k are taken to be time-homogeneous, so that the time index k can be dropped.

The challenge is to extract the values of the unobservable x_k for all k based on the observations Z_k ≜ {z_i, i = 1,...,k}, which become available over time. Because the x_k are themselves random variables, the problem is essentially to discover the posterior probability density function (PDF) p(x_k | Z_k), conditioned on the observations Z_k.
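A minimal sketch of the generic pair (4.1)-(4.2): the scalar mean-reverting state and the nonlinear, bond-price-style measurement below are hypothetical illustrative choices, not the thesis's 3-factor model.

```python
import math
import random

def f(x_prev, v):
    """State propagation (4.1): one mean-reverting Euler step plus shock v.
    kappa, theta, dt are illustrative constants."""
    kappa, theta, dt = 0.5, 0.04, 1.0 / 52.0
    return x_prev + kappa * (theta - x_prev) * dt + v

def h(x, w):
    """Measurement (4.2): a nonlinear pricing-style function plus noise w."""
    return math.exp(-x) + w

# Simulate one year of weekly states and noisy observations.
random.seed(0)
xs, zs = [0.04], []
for k in range(52):
    xs.append(f(xs[-1], random.gauss(0.0, 0.002)))
    zs.append(h(xs[-1], random.gauss(0.0, 0.001)))
```

The filtering problem stated above is exactly the inverse task: given only the sequence `zs`, recover the distribution of the hidden `xs`.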
Assuming the initial PDF p(x_0) ≜ p(x_0 | Z_0) is known, we invoke Bayes' theorem to recursively update the posterior distribution of x_k at each k. There are two steps in this task: prediction and update.

1. First we predict the PDF p(x_k | Z_{k-1}) by applying the Chapman-Kolmogorov equation

  p(x_k | Z_{k-1}) = ∫ p(x_k | x_{k-1}, Z_{k-1}) p(x_{k-1} | Z_{k-1}) dx_{k-1}
                   = ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{k-1}) dx_{k-1}.   (4.3)

Note: from the first line to the second line of Eqn. (4.3) we used the Markov property, namely that the information in Z_{k-1} is already contained in x_{k-1}.

2. Second, after the observation z_k becomes available, Bayes' rule is used to update the posterior:

  p(x_k | Z_k) = p(z_k | x_k) p(x_k | Z_{k-1}) / p(z_k | Z_{k-1}),   (4.4)

where the normalizing factor in the denominator can be written as

  p(z_k | Z_{k-1}) = ∫ p(z_k | x_k) p(x_k | Z_{k-1}) dx_k.

The conceptual solution in Eqn. (4.4) unfortunately cannot, in general, be computed exactly: storing the entire PDF is equivalent to storing an infinite-dimensional vector, so one has to resort to sub-optimal algorithms that approximate the solution. This difficulty is considerably alleviated, however, for models assuming linearity and Gaussian dynamics. In the next section we demonstrate how the Kalman filter leads to an optimal solution under these ideal assumptions.

4.3 Kalman filter

The Kalman filter assumes that the posterior PDF at every time step is Gaussian and hence completely characterized by its mean vector and covariance matrix. In other words, if p(x_{k-1} | Z_{k-1}) is normally distributed, it can be proved that p(x_k | Z_k) is also normal provided the following conditions on Eqn. (4.1) and (4.2) are satisfied:

- v_{k-1} and w_k are drawn from normal distributions;
- x_k = F_{k-1} x_{k-1} + v_{k-1}, with F_{k-1} an n_x × n_x matrix, so that f_{k-1}(x_{k-1}, v_{k-1}) is linear in x_{k-1} and v_{k-1};
- z_k = H_k x_k + w_k, with H_k an n_z × n_x matrix, so that h_k(x_k, w_k) is linear in x_k and w_k.

We further denote by Q_{k-1} and R_k the covariance matrices of the white noises v_{k-1} and w_k respectively, which are mutually independent.

The Kalman filter can then be viewed as the following recursion:

  p(x_{k-1} | Z_{k-1}) = N(x_{k-1}; x̂_{k-1|k-1}, P_{k-1|k-1})   (4.5)
  p(x_k | Z_{k-1}) = N(x_k; x̂_{k|k-1}, P_{k|k-1})   (4.6)
  p(x_k | Z_k) = N(x_k; x̂_{k|k}, P_{k|k})   (4.7)

where N(x; m, P) is the density of the normal distribution with mean m and covariance P; x̂_{k|k} and P_{k|k} denote the posterior estimates of the mean and covariance, and x̂_{k|k-1} and P_{k|k-1} the prior estimates before the newest update. With this notation, the Kalman filter algorithm is carried out as follows:

  x̂_{k|k-1} = F_{k-1} x̂_{k-1|k-1}   (4.8)
  P_{k|k-1} = Q_{k-1} + F_{k-1} P_{k-1|k-1} F'_{k-1}   (4.9)
  x̂_{k|k} = x̂_{k|k-1} + K_k ( z_k - H_k x̂_{k|k-1} )   (4.10)
  P_{k|k} = P_{k|k-1} - K_k S_k K'_k   (4.11)

where S_k = H_k P_{k|k-1} H'_k + R_k is the covariance of the innovation term, and K_k = P_{k|k-1} H'_k S_k^{-1} is the so-called Kalman gain.

Looking at Eqn. (4.11), we see how the Kalman filter reduces the covariance of the state variables upon the reception of new information: the posterior covariance P_{k|k} equals the prior covariance P_{k|k-1} less K_k S_k K'_k, the amount of variance reduction contributed by the new observation z_k through the Kalman gain.

Equations (4.8) to (4.11) constitute the complete recipe of the Kalman filter. The Kalman filter is the optimal solution when the aforementioned assumptions hold. However, in cases where the measurement functions are non-linear, sub-optimal solutions such as the extended Kalman filter (EKF) and the unscented Kalman filter (UKF) (see [20, 27, 28] for details) have been devised to cope with the broken assumptions.

1. If the functions f(x,v) or h(x,w) (or both) are almost linear, we approximate them by first-order Taylor expansion and replace them with their Jacobian matrices. This is the extended Kalman filter.

2.
If the non-linearity in f(x,v) or h(x,w) is high, we use the unscented transformation, which propagates a deterministic set of sigma points to approximate the distribution. This method is the unscented Kalman filter, or UKF.

Both the EKF and the UKF attempt to remedy the non-linearity violation in the original KF, but neither can theoretically handle the problem if the state variables are non-Gaussian. It is crucial in our case, however, to relax the Gaussianity restriction, since the state variables in our model are driven by pure jump Lévy processes. We therefore resort to the counterpart of the KF: the particle filter.

4.4 Particle filter

Reality often manifests itself as being very complex: non-Gaussian and non-linear, with continuous or discontinuous state space. The Kalman filter, albeit theoretically sound, is therefore not applicable in many practical situations. The particle filter emerged as an alternative algorithm to substitute for the Kalman filter in the non-Gaussian case. Instead of updating only the mean and covariance, the PF uses a large number of points (particles) to empirically approximate the posterior distribution of interest.

Admittedly, in comparison to the optimal Kalman filter, particle filters are sub-optimal. Obviously, a finite number of simulated particles cannot completely recover a continuous distribution, though more particles increase the precision. However, the number of particles cannot grow too large in practical state-space model estimation, since one needs to perform the simulation at every time step. This heavy computational load prohibited the application of the PF in the past, when computing cost was high. Given enough computing power nowadays, however, the particle filter technique can be very helpful and offers great flexibility in estimating state-space models without restrictive assumptions. Our 3-factor Lévy fixed income model is a perfect application in this regard.
In this section we discuss the particle filter in its generic form. Importance sampling is first reviewed in the context of PF implementation. Sequential importance sampling (SIS), the simplest PF implementation, is then investigated. As naive SIS introduces the degeneracy problem, the resampling technique is discussed as a way to reduce the degree of degeneracy.

4.4.1 Importance sampling in the particle filter

The mechanism of importance sampling was demonstrated in Eqn. (2.13) to (2.16); the idea behind it is to make the random draws from a concentrated region that matters most for the calculation of interest. In the particle filter context, since we are interested in finding the posterior distribution p_X(·), we want to generate particles from a guessed importance distribution q_X(·) that is preferably close to the true posterior. Indeed, the more educated the guess of q_X(·), the more accurate the calculation of p_X(·) via Bayes' theorem. It can be shown that, to some extent, the quality of the parameter estimation depends on the choice of the importance distribution. The selection of the importance distribution is addressed later in this chapter.

Recall that, in order to apply importance sampling, the Radon-Nikodym derivative ω̃(·) needs to be calculated for each particle x^i as

  ω̃(x^i) = p_X(x^i) / q_X(x^i).   (4.12)

Since p_X(·) is not known beforehand, we cannot rely on Eqn. (4.12) directly to compute the weights ω̃(x^i). To bypass this difficulty, normalization is introduced, so the new weight for each draw i is obtained as

  ω(x^i) = ω̃(x^i) / Σ_i ω̃(x^i).   (4.13)

The normalization procedure will be discussed in detail in the next section.

4.4.2 Sequential importance sampling

Now we discuss how to apply importance sampling within the particle filter, based on the conceptual solution in Eqn. (4.4).
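Before specializing to the sequential setting, the plain weight computations (4.12)-(4.13) can be sketched directly. The target and proposal densities below are toy stand-ins (a standard normal shape and a wider normal), not the thesis's posteriors:

```python
import math
import random

def target_pdf(x):
    """Unnormalized target p_X: standard-normal shape (toy example)."""
    return math.exp(-0.5 * x * x)

def proposal_pdf(x):
    """Proposal q_X: normal density with standard deviation 2."""
    return math.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * math.sqrt(2.0 * math.pi))

random.seed(1)
xs = [random.gauss(0.0, 2.0) for _ in range(20000)]        # draws from q_X
w_tilde = [target_pdf(x) / proposal_pdf(x) for x in xs]    # Eqn. (4.12)
z = sum(w_tilde)
w = [wt / z for wt in w_tilde]                             # Eqn. (4.13)

# Self-normalized estimate of E_p[X]; the true value is 0 here.
est_mean = sum(wi * xi for wi, xi in zip(w, xs))
```

Note that the normalization (4.13) is what lets us work with an *unnormalized* target: any constant factor in p_X cancels between numerator and denominator.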
We denote by X_k = {x_j : j = 1,...,k} the sequence of state variables x and by Z_k = {z_j : j = 1,...,k} the sequence of observations z, both up to time k. The joint posterior distribution at time k is then estimated as

  p(X_k | Z_k) ≈ Σ_{i=1}^N ω^i_k δ(X_k - X^i_k),   (4.14)

where δ(·) is the delta function. The calculation of the weights ω^i_k follows Eqn. (4.13). In particular, if samples are drawn from an importance density q(X_k | Z_k) other than the true density, then the weights can be expressed as

  ω^i_k ∝ p(X^i_k | Z_k) / q(X^i_k | Z_k).   (4.15)

Suppose at time k-1 we have samples constituting an approximation of p(X_{k-1} | Z_{k-1}), and we are interested in obtaining p(X_k | Z_k) at time k upon the reception of the new observation z_k. If the importance distribution is chosen to have the factorization

  q(X_k | Z_k) = q(x_k | X_{k-1}, Z_k) q(X_{k-1} | Z_{k-1}),   (4.16)

one can then augment the existing samples X^i_{k-1} ~ q(X_{k-1} | Z_{k-1}) with the new state x^i_k ~ q(x_k | X_{k-1}, Z_k) to obtain samples X^i_k ~ q(X_k | Z_k).

To derive the weight update, the PDF p(X_k | Z_k) can be expanded as follows (for the derivation see Appendix (A.3)):

  p(X_k | Z_k) ∝ p(z_k | x_k) p(x_k | x_{k-1}) p(X_{k-1} | Z_{k-1}).   (4.17)

Plugging Eqn. (4.16) and (4.17) into Eqn. (4.15), the weight update equation can be shown to be

  ω^i_k ∝ [p(z_k | x^i_k) p(x^i_k | x^i_{k-1}) p(X^i_{k-1} | Z_{k-1})] / [q(x^i_k | X^i_{k-1}, Z_k) q(X^i_{k-1} | Z_{k-1})]
        = ω^i_{k-1} p(z_k | x^i_k) p(x^i_k | x^i_{k-1}) / q(x^i_k | X^i_{k-1}, Z_k).   (4.18)

The relationship in Eqn. (4.18) can be simplified by the Markov property to drop the dependence on the history,

  q(x^i_k | X^i_{k-1}, Z_k) = q(x^i_k | x^i_{k-1}, z_k),   (4.19)

so that the importance density depends only on the most recent state and observation, x_{k-1} and z_k. This reduction is particularly helpful when only the filtered posterior p(x_k | Z_k) is estimated sequentially and large storage of past information is undesirable. After the simplification, the weight update equation reads

  ω^i_k ∝ ω^i_{k-1} p(z_k | x^i_k) p(x^i_k | x^i_{k-1}) / q(x^i_k | x^i_{k-1}, z_k).   (4.20)

And the updated posterior density is approximated as

  p(x_k | Z_k) ≈ Σ_{i=1}^N ω^i_k δ(x_k - x^i_k).   (4.21)
The sequential importance sampling (SIS) algorithm thus propagates the posterior distribution, with new updates at each time step, by Monte Carlo simulating a large number of particles. With the entire posterior distribution obtained sequentially, we can perform any relevant calculation involving the distribution of the state variables, such as evaluating the likelihood function.

4.4.3 Selection of the importance density

The quality of the particle filter estimation crucially depends on the selection of the importance density q(x^i_k | x^i_{k-1}, z_k) in Eqn. (4.20). The optimal choice would minimize the variance of the importance weights; however, except in a few special cases, finding the optimal choice is infeasible. Here we present the most widely used suboptimal choice, the so-called transitional prior,

  q(x^i_k | x^i_{k-1}, z_k) = p(x^i_k | x^i_{k-1}).   (4.22)

This choice is very popular due to its simplicity. With this prior, the weight update function Eqn. (4.20) conveniently reduces to

  ω^i_k ∝ ω^i_{k-1} p(z_k | x^i_k).   (4.23)

The SIS algorithm under this choice of prior distribution is summarized in Table 4.1. The transitional prior works fine if most of the prior's distribution mass lies on a narrower domain than that of the likelihood function p(z_k | x_k); otherwise problems such as fast degeneracy and impoverishment arise. Improved algorithms to handle these problems have been developed, and one of them, the resampling technique, is discussed in the next section. Interested readers are referred to Chapter 3 of [27] for a more detailed account.

4.4.4 Resampling

Any suboptimal choice of the prior distribution, such as the transitional prior discussed in the previous section, causes the degeneracy problem. Specifically, it is

[{x^i_k, ω^i_k}_{i=1}^N] = SIS([{x^i_{k-1}, ω^i_{k-1}}_{i=1}^N, z_k])
- For i = 1 : N
  - Draw x^i_k from the prior distribution q(x^i_k | x^i_{k-1}, z_k) = p(x^i_k | x^i_{k-1});
  - Evaluate the non-normalized weights according to Eqn. (4.23): ω̃^i_k = ω^i_{k-1} p(z_k | x^i_k)
- End For
- Calculate the normalizing factor Z = Σ_{j=1}^N ω̃^j_k
- For i = 1 : N
  - Normalize: ω^i_k = ω̃^i_k / Z
- End For

Table 4.1: Sequential importance sampling algorithm

found that a suboptimal prior can increase the variance of the importance weights over time. As a consequence, it is very likely that, after a few time steps, one particle picks up almost all the weight while the other particles weigh little. The degeneracy problem leads to a large variance of the posterior distribution and harms the estimation. Unfortunately, it is theoretically unavoidable when applying the suboptimal sequential importance sampling algorithm.

Whereas it is impossible to avoid degeneracy entirely, we can seek to reduce it. The degree of degeneracy can be measured by the effective sample size N̂_eff, defined as

  N̂_eff = 1 / Σ_i (ω^i_k)^2.

The smaller N̂_eff is, the stronger the degeneracy. N̂_eff is monitored over time; once a pre-defined threshold N_thr is breached, resampling is executed to lessen the degree of degeneracy.

The resampling procedure splits the particles that carry large weights into smaller particles according to their probabilities; technically, it maps a random measure {x^i_k, ω^i_k} into another random measure {x^{i*}_k, 1/N} of equal weights. The new random measure is generated by sampling with replacement N times from the approximate discrete posterior p(x_k | Z_k) given by Eqn. (4.21), such that P{x^{i*}_k = x^j_k} equals ω^j_k. Therefore, after resampling, a new set of particles with uniform weights is available for the next step of propagation. The mechanism of the resampling procedure is illustrated in Fig. 4.1, and the algorithm is summarized in Table 4.2.

[Figure 4.1: Resampling scenario and mechanism in particle filter]

[{x^{i*}_k, ω^i_k}_{i=1}^N] = RESAMPLE([{x^i_k, ω^i_k}_{i=1}^N, z_k])
- Initialize c_1 = ω^1_k
- For i = 2 : N
  - c_i = c_{i-1} + ω^i_k
- End For
- Draw a random number u_1 ~ U[0, 1/N]; set i = 1
- For j = 1 : N
  - Let u_j = u_1 + (j-1)/N
  - While u_j > c_i: i = i + 1
  - Assign x^{j*}_k = x^i_k and ω^j_k = 1/N
- End For

Table 4.2: Resampling algorithm

4.4.5 Generic particle filter algorithm

We conclude this section by summarizing, in Table 4.3, the algorithm of the generic particle filter, which combines sequential importance sampling (SIS) with resampling. At time step k the algorithm generates N new pairs {x^i_k, ω^i_k}, i ∈ {1,...,N}, from the input {x^i_{k-1}, ω^i_{k-1} = 1/N}.

[{x^i_k, ω^i_k}_{i=1}^N] = PF([{x^i_{k-1}, ω^i_{k-1}}_{i=1}^N, z_k])
- Filter via SIS in Table 4.1
- Resample using the algorithm in Table 4.2: [{x^i_k, ω^i_k = 1/N}_{i=1}^N] = RESAMPLE([{x^i_k, ω^i_k}_{i=1}^N, z_k])

Table 4.3: Particle filter algorithm

4.5 Estimating the 3-factor Lévy short rate model

In this section we first describe the data set used in the estimation and then tailor the generic particle filter algorithm to our model estimation context. Numerical results are then presented and analyzed.

4.5.1 Data description

By courtesy of Caspian Capital Management LLP, we are provided with over-the-counter Eurodollar future rates and constant maturity swap (CMS) rates across different maturities and tenors.

Specifically, the first 3 columns of data are rates reflected from the Eurodollar futures contracts, which are written on 3-month Libor rates for 3 different maturities of T = 3, 6, 12 months. The next 6 columns are swap rates of 6 different maturities, T_n = 2, 3, 5, 10, 15, 30 years. In those swap contracts, the fixed-rate coupon payments are made semi-annually. We therefore have 9 columns of data in total.

The data set contains 10 years of daily market quotes from 04/25/1994 through 10/13/2004. To avoid weekday effects in the estimation, we sample the data only weekly, on every Wednesday. This makes the sampled data set contain 9 × 538 = 4842 entries.
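Before turning to the estimation itself, the generic algorithm of Tables 4.1-4.3 (transitional-prior SIS plus systematic resampling) can be sketched end to end. Python is used here purely for illustration (the thesis's implementation is in Matlab), and the scalar random-walk model with Gaussian measurement noise is a toy stand-in, not the thesis's 3-factor Lévy model:

```python
import math
import random

def systematic_resample(particles, weights):
    """Table 4.2: map {x^i, w^i} into N equally weighted particles."""
    n = len(particles)
    c, acc = [], 0.0
    for w in weights:                 # cumulative weights c_i
        acc += w
        c.append(acc)
    u1 = random.uniform(0.0, 1.0 / n)
    i, out = 0, []
    for j in range(n):
        uj = u1 + j / n
        while i < n - 1 and uj > c[i]:
            i += 1
        out.append(particles[i])
    return out

def particle_filter(zs, n=500, sig_v=0.05, sig_w=0.1):
    """Tables 4.1 + 4.3: transitional-prior SIS, resampling at every step."""
    random.seed(42)
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]   # initial particle cloud
    means = []
    for z in zs:
        # Propagate: draw from p(x_k | x_{k-1}) (the transitional prior, 4.22).
        xs = [x + random.gauss(0.0, sig_v) for x in xs]
        # Weight by the likelihood p(z_k | x_k) (Eqn. 4.23), then normalize.
        wt = [math.exp(-0.5 * ((z - x) / sig_w) ** 2) for x in xs]
        s = sum(wt)
        w = [v / s for v in wt]
        means.append(sum(wi * xi for wi, xi in zip(w, xs)))  # posterior mean
        xs = systematic_resample(xs, w)                      # Table 4.2
    return means
```

Resampling at every step (rather than only when N̂_eff falls below N_thr) is a simplification; monitoring N̂_eff as in Section 4.4.4 would only add a guard around the last line of the loop.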
4.5.2 Quasi-maximum likelihood estimation with the particle filter

Quasi-maximum likelihood estimation (MLE)^2 is applied to calibrate the model. The continuous-time dynamics of the short rate factors is first cast into a discrete version, as the data are sampled weekly. The particle filter then retrieves the values of the state variables sequentially at every sampled time step. Based on the filtered state variables, the pricing equations output the model prices, and the pricing errors are simply the differences between the market prices and the model prices. The errors are assumed to be white noise with pre-estimated variances, and the error likelihood functions are computed via the multidimensional normal density. The concrete procedures are given below.

^2 It is called the quasi-maximum likelihood method because the likelihood function is, strictly speaking, the conditional likelihood function.

We cast the SDE of the state variables x(t) in Eqn. (3.8) and (3.9), by Euler approximation, into the discrete propagation equation

  x_t = (I - e^{-κΔt}) ā + e^{-κΔt} x_{t-1} + B ΔL_t,   (4.24)

where I is the 3 × 3 identity matrix and ΔL_t is the Lévy randomness. With weekly sampling frequency, we set Δt = 7/365.

We construct measurement functions for Eurodollar futures and swap rates of different kinds, assuming additive, normally distributed measurement errors:

  z_t = O(x_t; Θ) + e_t,   (4.25)

where z_t denotes the observable prices at time t, and O(x_t; Θ) denotes the model-implied values as a function of the parameter set Θ and the factor vector x_t. The term e_t denotes the normally distributed pricing errors with mean 0 and covariance matrix R.^3

^3 For the sake of convenience, in the estimation we take the covariance matrix R to be the identity matrix I times a constant scalar.

Recall the pricing formulas O(x_t; Θ) for Eurodollar futures and swap rates, summarized below:

  L_t(T, T+α) = [P(t,T) - P(t,T+α)] / [α P(t,T+α)],   (4.26)

  K_t(T_0,...,T_n) = [P(t,T_0) - P(t,T_n)] / [Σ_{i=1}^n α P(t,T_i)],   (4.27)

where the bond formulas, given the state variables, are shown in Eqn. (3.13).
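The propagation step (4.24) can be sketched as follows, under simplifying assumptions stated here explicitly: κ is taken diagonal (so e^{-κΔt} acts elementwise), B = I, and each VG increment is simulated in the standard way as Brownian motion with drift θ and volatility σ evaluated at a gamma time G ~ Gamma(Δt/ν, ν). The (θ, σ, ν) values below are hypothetical placeholders; the diagonal κ entries and ā echo Table 4.4 only for scale.

```python
import math
import random

def vg_increment(dt, theta, sigma, nu, rng):
    """One VG increment over dt via gamma time change:
    dL = theta*G + sigma*sqrt(G)*Z with G ~ Gamma(dt/nu, nu)."""
    g = rng.gammavariate(dt / nu, nu)
    return theta * g + sigma * math.sqrt(g) * rng.gauss(0.0, 1.0)

def propagate(x_prev, dt, kappa, a_bar, vg_params, rng):
    """Discrete propagation (4.24), with kappa diagonal and B = I assumed."""
    x_new = []
    for j in range(3):
        decay = math.exp(-kappa[j] * dt)          # e^{-kappa dt}, elementwise
        dl = vg_increment(dt, *vg_params[j], rng)
        x_new.append((1.0 - decay) * a_bar[j] + decay * x_prev[j] + dl)
    return x_new

rng = random.Random(7)
dt = 7.0 / 365.0                       # weekly sampling, as in the text
kappa = [0.0013, 0.6353, 0.9691]       # diagonal entries, for scale only
a_bar = [0.0009, 0.0353, 0.4118]
vg = [(-0.02, 0.12, 0.17)] * 3         # hypothetical (theta, sigma, nu)
x = [0.02, 0.01, 0.03]
for _ in range(52):                    # propagate one year of weekly steps
    x = propagate(x, dt, kappa, a_bar, vg, rng)
```

In the full model κ is a general 3 × 3 matrix, so e^{-κΔt} is a matrix exponential rather than an elementwise one; the structure of the recursion is otherwise unchanged.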
The rate L_t(T, T+α) in Eqn. (4.26) denotes the Eurodollar future rate prevailing at time t for the period [T, T+α], where α = 3 months. The rate K_t(T_0,...,T_n) in Eqn. (4.27) stands for the rate of the swap contract with coupons paid at T_1,...,T_n; in such a contract, the fixed-rate leg makes a payment every α = 0.5 years. To match the data, the pricing function O(x_t; Θ) is 9-dimensional, with the first 3 components being Eurodollar futures maturing in T = 3, 6, 12 months and the rest being swap contracts maturing in T_n = 2, 3, 5, 10, 20, 30 years. We define the pricing errors as the differences between z̃_t, the rates output by the model, and z_t, the rates observed in the market.

The particle filter technique developed in Section 4.4 can be straightforwardly customized to our model estimation context. Specifically, at a particular time t-1 one has a large number of particles with known values x_{t-1} that represent the posterior distribution at time t-1. To propagate the posterior distribution to the next time step t, one first makes the ex ante predictions x̃_t according to Eqn. (4.24). At time t the market quotes z_t become available, and we use the observable data to update the prediction. As shown in Table 4.1, this is equivalent to finding the weight ω^i_t for each particle i. Recall that the pricing error for each day t and each asset is modeled as white noise with covariance R, so the log-likelihood function l^i_t = log p(z_t | x̃^i_t) is the exponent of the normal density,

  l^i_t = -(1/2) (z_t - z̃^i_t)' R^{-1} (z_t - z̃^i_t).   (4.28)

To estimate the parameters, we build the joint log-likelihood function across the entire sampled history. That is,

  L(Θ; z^M_{t=1}) = Σ_{t=1}^M Σ_{i=1}^N l^i_t,   (4.29)

where M is the number of days on which we sample data and N is the number of particles we simulate.^4

^4 We take N = 200 in the estimation.
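With R a scalar multiple of the identity, as assumed in footnote 3, Eqns. (4.28)-(4.29) reduce to sums of scaled squared pricing errors. A minimal sketch (the additive normalizing constant of the normal density is dropped, which does not affect the argmax over Θ):

```python
def log_likelihood(z_market, z_model, s2):
    """Eqn. (4.28) with R = s2 * I: Gaussian log-likelihood of the pricing
    errors, up to the additive constant that does not affect the argmax."""
    return -0.5 * sum((zm - zh) ** 2 for zm, zh in zip(z_market, z_model)) / s2

def joint_log_likelihood(market_series, model_series, s2):
    """Eqn. (4.29): sum l_t^i over sampled dates t and particles i.
    model_series[t] holds one model-price vector per particle."""
    total = 0.0
    for z_t, particles_t in zip(market_series, model_series):
        for z_hat in particles_t:
            total += log_likelihood(z_t, z_hat, s2)
    return total
```

A perfect fit gives the maximal value 0; any pricing error lowers the objective, which the outer optimizer over Θ then tries to drive back up.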
We choose the parameters to maximize the log-likelihood of the data series,

  Θ* = argmax_Θ L(Θ; z^M_{t=1}).

4.5.3 Estimation results and discussion

Assuming the Lévy process follows a Variance Gamma law, we estimate the reduced form of the model with the risk-neutral parameterization listed in the right column of Table 3.2, totalling 25 parameters. The program is written in Matlab and utilizes its optimization routines. The selection of the initial parameter values is critical for the optimization, and an informed selection substantially expedites convergence. The initial values were chosen via a trial-and-error procedure.

On the one hand, a large number of particles gives better precision in representing the distribution, and hence reduces the number of iterations used in the optimization. On the other hand, however, more particles substantially increase the simulation workload. The optimal number of particles is beyond our knowledge, but in the Matlab code 200 particles are employed.

The estimation took considerable CPU time to converge. Due to the large-scale optimization, along with the randomness introduced by simulation, it is not guaranteed that the likelihood maximization stops at the global maximum. But, as visualized by the plots in Figures 4.2 to 4.6, a reasonable convergence has been achieved. The estimated parameters are presented in Table 4.4.^5

The performance statistics of the estimated model are presented in Table 4.5. Errors are the differences between market prices and model prices, quoted in basis points^6. The statistics of the pricing errors for the ex-ante prediction and the ex-post update are presented in panels A and B respectively; the columns titled [Mean, Median, Std, MAD, Max, Min] refer to the mean, median, standard deviation, mean absolute deviation, maximum and minimum of the error series. EF stands for Eurodollar futures.
⁵ We use $L_1, L_2, L_3$ to denote the parameter vectors for each factor. Since we are using the VG process, all are 3-dimensional vectors.
⁶ A basis point is one hundredth of a percentage point.

Apparently, with the new market observation, the ex-post performance is much better than the ex-ante prediction in terms of smaller error standard deviations. Cross-sectionally, the results indicate good predictive power, as the mean and median errors for all assets except EF(12m, 3m) are within 10 basis points of zero. EF(12m, 3m) appears to be the least predictable. We also notice that the predictions for swap rates of shorter maturities have much smaller error standard deviations and MADs than the other swap rates and the Eurodollar futures. The whole time series of model-implied values against market quotes is illustrated in Figures 4.2 to 4.6 for each of the 9 asset classes.

Although the Lévy based models are structurally superior, the estimation procedures are at the same time considerably more complicated. It is expected that a better computing facility would enable us to use more particles, bringing increased accuracy and enhanced performance.

4.6 Yield curve analysis and pricing

Based on the model estimated in the previous section, we perform a factor loading analysis on the yield curve and price caplets using Monte Carlo simulation. The caplet volatility surface is constructed by converting the caplet values into Black implied volatilities.

Model SDE: $r(t) = \alpha + \beta' x(t)$, $\quad dx(t) = (a - \kappa x(t))\,dt + dL(t)$
    alpha = 0.0762

    kappa = [  0.0013   0        0
               0.0903   0.6353   0
              -0.3450  -0.3927   0.9691 ]

    beta  = ( -0.0666, -0.0568, 0.1241 )'

    a     = (  0.0009,  0.0353,  0.4118 )'
    L1    = ( -0.0169,  0.1200,  0.1702 )'
    L2    = ( -0.0996,  0.1028,  0.1983 )'
    L3    = ( -0.0096,  0.0828,  0.1874 )'
    x0    = (  2.1199, -0.4353,  1.0190 )'

Table 4.4: In-sample maximum likelihood parameter estimates of the 3-factor Lévy short rate model with particle filter, using 10-year data from 04/25/1994 through 10/13/2004

In-sample error statistics in basis points (bps)

    Asset          Mean      Median    Std       MAD       Max       Min
    A. Ex-ante
    EF(3m, 3m)     4.1811    4.1557   18.1331   14.0131    59.8101  -95.3415
    EF(6m, 3m)    -7.7202   -6.7680   18.1461   13.7373    55.7387  -123.5331
    EF(12m, 3m)  -12.7738  -10.5906   26.2126   21.0457    49.6241  -99.1613
    Swap(2 year)  -2.9530   -3.7763   14.7992   11.4461    50.6473  -83.8178
    Swap(3 year)  -4.9998   -5.6067   16.9883   13.3460    57.3284  -72.2385
    Swap(5 year)  -5.7343   -6.5089   18.9978   14.8604    63.1169  -57.7497
    Swap(10 year)  0.9012   -1.0923   20.4852   15.8901    93.8079  -50.6519
    Swap(20 year)  7.7000    5.2903   22.2722   17.5681   110.7255  -39.3156
    Swap(30 year)  2.6371    0.5487   23.0010   18.2084   106.6838  -47.9710
    Average       -2.0846   -2.7053   19.8928   15.5684    71.9425  -74.4200
    B. Ex-post
    EF(3m, 3m)     7.2434    5.9534   16.2113   12.5880    58.8404  -34.5549
    EF(6m, 3m)    -4.8583   -4.2072   10.5541    8.2076    51.1436  -43.0244
    EF(12m, 3m)  -10.3143   -7.5075   18.8791   15.4580    38.4103  -69.2283
    Swap(2 year)  -0.2902   -0.0407    5.3054    4.1561    12.7163  -21.5128
    Swap(3 year)  -2.6413   -3.0292    8.5221    6.6426    29.6837  -31.1299
    Swap(5 year)  -3.8019   -4.6334   11.7396    8.9714    42.5053  -32.8672
    Swap(10 year)  2.3866    1.1166   14.3203   10.9077    75.8342  -29.3619
    Swap(20 year)  9.0442    7.4268   16.9991   13.2676    92.7829  -25.4996
    Swap(30 year)  3.8909    1.6666   18.7174   14.6119    89.2396  -36.5544
    Average        0.0732   -0.3616   13.4720   10.5345    54.5729  -35.9704

Table 4.5: In-sample estimation performance statistics of the 3-factor Lévy short rate model with particle filter, using 10-year data from 04/25/1994 through 10/13/2004
Figure 4.2: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: Eurodollar future on Libor(3m, 3m); lower two panels: Eurodollar future on Libor(6m, 3m)

Figure 4.3: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: Eurodollar future on Libor(12m, 3m); lower two panels: 2-year swap rate

Figure 4.4: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: 3-year swap rate; lower two panels: 5-year swap rate
Figure 4.5: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Upper two panels: 10-year swap rate; lower two panels: 20-year swap rate

Figure 4.6: Illustration of (quasi) maximum likelihood estimation using particle filter on Eurodollar futures and swap rates against market quotes from 04/25/1994 to 10/13/2004. Panels for the 30-year swap rate

4.6.1 Yield curve factor loading

Factor loading analysis has been very popular in the field of fixed income research, as pioneered by Litterman and Scheinkman [41]. As a function of maturity, the loading of a yield curve factor stands for the change in the yield of that maturity given a unit shock in the factor. Factor loading analysis helps to hedge yield-curve-related positions, as traders can neutralize the duration and convexity of each factor individually according to the loading functions.

We follow the parametric approach in [33]. According to Eqn. (3.13), the coefficients $a_{s,t}$ and $b_{s,t}$ determine the term structure of interest rates. The fair values $Y(s,t)$ of simply compounded spot rates are linked to the yield curve Lévy factors $x(s)$ by
$$Y(s,t) = -\frac{a_{s,t}}{t-s} - \left(\frac{b_{s,t}}{t-s}\right)' x(s).$$
The slope coefficients $-b_{s,t}/(t-s)$ are the loading functions of the Lévy factors. We illustrate the factor loadings in Figure 4.7.
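A minimal sketch of the loading computation, assuming the affine coefficients $a_{s,t}$ and $b_{s,t}$ are already available from Eqn. (3.13); the helper names are hypothetical:

```python
import numpy as np

def spot_yield(a, b, x, tau):
    """Affine yield Y = -(a/tau) - (b/tau)' x for time-to-maturity tau,
    where exp(a + b'x) is the zero-coupon bond price (a scalar, b a 3-vector)."""
    return -(a / tau) - np.dot(b, x) / tau

def factor_loadings(b, tau):
    """Loading of each factor at maturity tau: dY/dx = -b/tau.
    Evaluating this over a grid of tau traces out the curves of Figure 4.7."""
    return -np.asarray(b) / tau
```

Because the loading is the yield's sensitivity to a unit factor shock, a trader can combine the three loading curves to neutralize exposure to each factor maturity by maturity.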
Figure 4.7: Yield Curve and Factor Loading. Upper panel: model-implied yield curve on 10/13/2004; lower panel: factor loading of the yield curve, factors 1-3

The upper panel of Figure 4.7 plots the model-implied fair value of the yield curve on Oct. 13th, 2004, and the lower panel illustrates the factor loadings, both as functions of maturity. The upper panel is consistent with the upward-trending term structure of the market quotes on the same day. The factor loading plot shows how the three factors control the variation of the yields at different maturities. The first factor is persistently significant across the spectrum of maturities; the second is the least significant, as it picks up slowly for short maturities but diminishes rapidly for long maturities; the third factor is the most significant for short maturities but dies out quickly beyond a maturity of 10 years.

4.6.2 Pricing caplets using simulation

With the estimated model we use Monte Carlo simulation to price one of the most actively traded fixed income derivatives, the caplet (see Section 3.5.5 for the payoff structure). The Black formula is the market benchmark for pricing a caplet. Under the Black formula, the annualized caplet rate $Cpl(t,T,T+\Delta t)$ with strike $K$ and volatility $\sigma$ is
$$Cpl(t,T,T+\Delta t) = P(t,T)\left[F(t,T,T+\Delta t)\,\Phi(d_+) - K\,\Phi(d_-)\right], \qquad (4.30)$$
where
$$d_\pm = \frac{\log\!\big(F(t,T,T+\Delta t)/K\big) \pm \tfrac{1}{2}\sigma^2 (T-t)}{\sigma\sqrt{T-t}}, \qquad (4.31)$$
and $F(t,T,T+\Delta t)$ is the forward rate prevailing at time $t$ for the future period $[T, T+\Delta t]$. Obviously, if the price of the caplet is known, one can invert Eqns. (4.30) and (4.31) to obtain the implied volatility $\sigma$. The market has adopted the convention of quoting caplet values in terms of $\sigma$ rather than the rate itself, which gives a more intuitive measure of the caplet valuation. The Black implied volatilities are reported in Table 4.6 and illustrated in the 3-D plot of Figure 4.8.
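Eqns. (4.30)-(4.31) and the volatility inversion can be sketched as follows; the bisection step exploits the monotonicity of the Black price in $\sigma$, and the function names are illustrative:

```python
import math

def norm_cdf(x):
    """Standard normal cdf Phi(x) via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(P, F, K, sigma, tau):
    """Black caplet value, Eqn. (4.30): P(t,T) [F Phi(d+) - K Phi(d-)],
    with tau = T - t. Degenerate inputs collapse to intrinsic value."""
    if sigma <= 0.0 or tau <= 0.0:
        return P * max(F - K, 0.0)
    d_plus = (math.log(F / K) + 0.5 * sigma**2 * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    return P * (F * norm_cdf(d_plus) - K * norm_cdf(d_minus))

def implied_vol(price, P, F, K, tau, lo=1e-6, hi=2.0, tol=1e-10):
    """Invert Eqn. (4.30) for sigma by bisection: the Black price is
    monotone increasing in sigma, so the bracket [lo, hi] shrinks to the root."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_caplet(P, F, K, mid, tau) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to each simulated caplet value over the maturity/moneyness grid produces a surface such as the one in Table 4.6.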
As we can see, the implied volatility increases as the strike moves farther out of the money, for all maturities. Looking along the maturity direction, we find that the implied volatility decreases monotonically as the maturity gets longer. These results are qualitatively consistent with some aspects of market observations.

However, certain unrealistic characteristics of the model's caplet prices exist. First, the model can only generate a monotonic downward slope in the volatility surface with respect to maturity, while the market has historically shown a hump-shaped volatility curve with a peak between 2 and 6 years. Second, the volatility drops too fast as the maturity goes up: the 9-year ATM volatility falls below 5%, far below market observations averaging around 20%. This diminishing volatility at long maturities is possibly caused by the i.i.d. increment assumption of the Lévy process. These observations are in fact in agreement with the industry consensus that equilibrium models, like ours, are not ideal candidates for pricing derivatives. This is because

1. The model is not arbitrage-free, and the underlying bond prices are not matched perfectly.
Black Implied Volatility of Caplet

    Maturity          Moneyness (K/F)
                1.00     1.02     1.04     1.06     1.08     1.10
    1Y        0.2267   0.2420   0.2557   0.2682   0.2796   0.2900
    2Y        0.1359   0.1469   0.1566   0.1653   0.1732   0.1804
    3Y        0.0992   0.1081   0.1160   0.1229   0.1292   0.1349
    4Y        0.0789   0.0866   0.0934   0.0995   0.1049   0.1098
    5Y        0.0686   0.0755   0.0816   0.0870   0.0918   0.0961
    6Y        0.0607   0.0670   0.0726   0.0774   0.0818   0.0857
    7Y        0.0546   0.0605   0.0656   0.0701   0.0742   0.0778
    8Y        0.0511   0.0567   0.0614   0.0657   0.0694   0.0728
    9Y        0.0492   0.0544   0.0589   0.0629   0.0665   0.0697

Table 4.6: Implied volatility chart of model-generated caplet values of tenor 3 months on 10/13/2004, with maturity from 1 to 9 years and moneyness from at-the-money (1.0) to out-of-the-money (1.1)

Figure 4.8: Implied volatility surface of model-generated caplet values, 10/13/2004

2. Term structure models can by no means capture certain characteristics that are unique to derivative markets, e.g. the caplet market, as those characteristics are conceivably not priced in the yield curve.

There are three general methodologies for pricing derivatives based on yield curve models. First, additional factors besides the yield curve factors are included in the model, dedicated solely to spanning the variation in the particular derivative market. The second is to add stochastic volatility to the Lévy factors in the original model to reflect the dynamics of the derivative market. The third approach is to model directly the market-observable rate that underlies the derivative, such as the Libor rate for the cap market or the swap rate for the swaption market. By doing this, the underlying rates are automatically matched perfectly and hence arbitrage opportunities are eliminated.

Chapter 5

Conclusion

This work makes contributions to the area of mathematical finance in two aspects:

1.
Developed a 3-factor Lévy-based model, which is the first attempt to use a non-Brownian structure in modeling the short rate dynamics. Bond, caplet/floorlet, and swaption formulas are developed under the model.

2. Implemented a large-scale particle filter to estimate the short rate model. The particle filter methodology provides robust estimation for state-space models with non-Gaussian dynamics, and hence is an ideal candidate for studying Lévy based financial models, which are becoming increasingly popular.

The estimation results show quality fits to the historical data, and the estimated model demonstrates a certain degree of forecasting power. However, the caplet pricing results indicate that the model is not adequate for pricing fixed income derivatives without further seasoning. This is in fact consistent with the consensus that equilibrium fixed income models such as short rate models, regardless of the dynamics, are not able to capture certain characteristics of individual fixed income derivative markets.

As for future work, two things can be done along this line.

1. First, the particle filter implementation can be improved by using a more efficient optimization routine, a better trial-and-error scheme, or more particles to increase the precision.

2. Second, based on the current framework, stochastic volatility can be added to the Lévy factors in order to explain the unique characteristics exhibited by different derivative markets and hence give reasonable prices for those products.

Appendix A

A.1 Solution derivation of the OU process, Eqn. (3.9)

Proof. We notice from Eqn. (3.8) that
$$
\begin{aligned}
d(e^{\kappa t} x(t)) &= e^{\kappa t}\,dx(t) + \kappa e^{\kappa t} x(t)\,dt \\
\Rightarrow\quad d(e^{\kappa t} x(t)) &= e^{\kappa t}\,(a\,dt + B\,dL(t)) \\
\Rightarrow\quad \int_0^t d(e^{\kappa s} x(s)) &= \int_0^t e^{\kappa s}\,(a\,ds + B\,dL(s)) \\
\Rightarrow\quad e^{\kappa t} x(t) - x(0) &= \int_0^t e^{\kappa s}\,(a\,ds + B\,dL(s)) \\
\Rightarrow\quad x(t) &= e^{-\kappa t} x(0) + \int_0^t e^{-\kappa(t-s)}\,(a\,ds + B\,dL(s)).
\end{aligned}
$$

A.2 Proof of the Bayes rule in the update stage for the conceptual solution of the filtering problem, Eqn. (4.4)

Proof.
$$
\begin{aligned}
p(x_k \mid Z_k) &= \frac{p(Z_k \mid x_k)\, p(x_k)}{p(Z_k)}
= \frac{p(z_k, Z_{k-1} \mid x_k)\, p(x_k)}{p(z_k, Z_{k-1})} \\
&= \frac{p(z_k \mid Z_{k-1}, x_k)\, p(Z_{k-1} \mid x_k)\, p(x_k)}{p(z_k \mid Z_{k-1})\, p(Z_{k-1})} \\
&= \frac{p(z_k \mid Z_{k-1}, x_k)\, p(x_k \mid Z_{k-1})\, p(Z_{k-1})\, p(x_k)}{p(z_k \mid Z_{k-1})\, p(Z_{k-1})\, p(x_k)} \\
&= \frac{p(z_k \mid x_k)\, p(x_k \mid Z_{k-1})}{p(z_k \mid Z_{k-1})}.
\end{aligned}
$$

A.3 Proof of the posterior expansion, Eqn. (4.17)

Proof.
$$
\begin{aligned}
p(X_k \mid Z_k) &= \frac{p(z_k \mid X_k, Z_{k-1})\, p(X_k \mid Z_{k-1})}{p(z_k \mid Z_{k-1})} \\
&= \frac{p(z_k \mid X_k, Z_{k-1})\, p(x_k \mid X_{k-1}, Z_{k-1})\, p(X_{k-1} \mid Z_{k-1})}{p(z_k \mid Z_{k-1})} \\
&= \frac{p(z_k \mid x_k)\, p(x_k \mid x_{k-1})}{p(z_k \mid Z_{k-1})}\, p(X_{k-1} \mid Z_{k-1}) \\
&\propto p(z_k \mid x_k)\, p(x_k \mid x_{k-1})\, p(X_{k-1} \mid Z_{k-1}).
\end{aligned}
$$
We used the Markov properties in deriving these equalities.

BIBLIOGRAPHY

[1] S. Asmussen and J. Rosiński, Approximation of small jumps of Lévy processes with a view towards simulation, Journal of Applied Probability, vol. 38, pp. 482-493, 2001

[2] B. Azimi-Sadjadi and P.S. Krishnaprasad, A Particle Filtering Approach to Change Detection for Nonlinear Systems, EURASIP Journal on Applied Signal Processing, vol. 15, pp. 2295-2305, 2004

[3] M. Baxter and A. Rennie, Financial Calculus: An Introduction to Derivative Pricing, Cambridge University Press, 1996

[4] F. Black and P. Karasinski, Bond and Option Pricing When Short Rates Are Lognormal, Financial Analysts Journal, vol. 47, pp. 52-59, Jul/Aug 1991

[5] F. Black, E. Derman and W. Toy, A One-Factor Model of Interest Rates and Its Application to Treasury Bond Options, Financial Analysts Journal, vol. 46, pp. 33-39, Jan/Feb 1990

[6] F. Black, The pricing of commodity contracts, Journal of Financial Economics, vol. 3, pp. 167-179, 1976

[7] M. Bolić, P. Djurić and S. Hong, Resampling Algorithms and Architectures for Distributed Particle Filters, Working paper, 2004

[8] A. Brace, D. Gatarek and M. Musiela, The Market Model of Interest Rate Dynamics, Mathematical Finance, vol. 7, no. 2, pp. 127-155, 1997

[9] D. Brigo and F. Mercurio, Interest Rate Models: Theory and Practice, Springer Finance, 2001

[10] B. Dupire, Pricing with a Smile, Risk, vol. 7, no. 1, Jan. 1994, pp. 18-20

[11] A. Cairns, Interest Rate Models, Princeton University Press, 2004

[12] P. Carr, H. Geman, D. Madan and M.
Yor, The Fine Structure of Asset Returns: An Empirical Investigation, Journal of Business, vol. 75, no. 2, April 2002, pp. 305-332

[13] P. Carr and D. Madan, Option valuation using the fast Fourier transform, Journal of Computational Finance, vol. 2, pp. 61-73, 1999

[14] P. Carr, H. Geman, D. Madan and M. Yor, Stochastic Volatility for Lévy Processes, Mathematical Finance, vol. 13, no. 3, July 2003, pp. 345-382

[15] P. Carr and B. Zhang, Pricing FX Barrier Options under Stochastic Skew Model by Monte Carlo Simulation, Bloomberg Technical Report, 2005

[16] P. Carr, L. Wu and B. Zhang, Finite Difference Method under Heston Stochastic Volatility Model and Stochastic Skew Model for Pricing Barrier Options, Bloomberg Technical Report, 2005

[17] R. Cont and P. Tankov, Financial Modelling with Jump Processes, Chapman & Hall/CRC, 2004

[18] J. Cox, J. Ingersoll and S. Ross, A Theory of the Term Structure of Interest Rates, Econometrica, vol. 53, pp. 385-408, 1985

[19] E. Derman and I. Kani, Riding on a smile, Risk, vol. 7, no. 2, pp. 32-39, 1994

[20] A. Doucet, On Sequential Simulation-Based Methods for Bayesian Filtering, Technical report CUED/F-INFENG/TR.310, 1998

[21] J.C. Duan and A. Fulop, Estimating the Structural Credit Risk Model When Equity Prices Are Contaminated by Trading Noises, Working paper, 2005

[22] D. Duffie, Dynamic Asset Pricing Theory, Princeton University Press, 1992

[23] E. Eberlein and S. Raible, Term Structure Models Driven by General Lévy Processes, Mathematical Finance, vol. 9, no. 1, Jan. 1999, pp. 31-53

[24] H. Geman, E. Eberlein and D. Madan, Formulation of Lévy Based Interest Rate Model, working paper, Feb. 2005

[25] H. Geman, N. El Karoui and J.-C. Rochet, Changes of Numéraire, Changes of Probability Measure and Option Pricing, Journal of Applied Probability, vol. 32, no. 2, June 1995, pp. 443-458

[26] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer Finance, 2003

[27] B. Ristic, S. Arulampalam and N.
Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, Boston and London, 2004

[28] A. Doucet, N. de Freitas and N. Gordon, Sequential Monte Carlo Methods in Practice, Springer, 2001

[29] J. Harrison and D. Kreps, Martingales and Arbitrage in Multiperiod Securities Markets, Journal of Economic Theory, vol. 20, pp. 381-408, 1979

[30] J. Harrison and S. Pliska, Martingales and Stochastic Integrals in the Theory of Continuous Trading, Stochastic Processes and Their Applications, vol. 11, pp. 215-260, 1981

[31] D. Heath, R. Jarrow and A. Morton, Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation, Econometrica, vol. 60, no. 1, pp. 77-105, 1992

[32] M. Heidari and L. Wu, Are Interest Rate Derivatives Spanned by the Term Structure of Interest Rates?, Journal of Fixed Income, June 2003, pp. 75-86

[33] M. Heidari and L. Wu, Term Structure of Interest Rates, Yield Curve Residuals, and Consistent Pricing of Interest Rate Derivatives, Working paper, 2005

[34] S. Heston, A Closed-form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options, Review of Financial Studies, vol. 6, no. 2, pp. 327-343, 1993

[35] J. Hull, Options, Futures and Other Derivatives, Prentice Hall, July 2005

[36] J. Hull and A. White, An Analysis of the Bias in Option Pricing Caused by a Stochastic Volatility, Advances in Futures and Options Research, vol. 3, pp. 29-61, 1988

[37] P. Jäckel, Monte Carlo Methods in Finance, John Wiley & Sons Ltd., 2002

[38] F. Jamshidian, Libor and Swap Market Models and Measures, Finance and Stochastics, vol. 1, pp. 290-330, 1997

[39] A. Javaheri, Inside Volatility Arbitrage, Wiley Finance, 2005

[40] S. Kou and H. Wang, First passage times of a jump diffusion process, Advances in Applied Probability, vol. 35, pp. 504-531, 2003

[41] R. Litterman and J. Scheinkman, Common Factors Affecting Bond Returns, Journal of Fixed Income, vol. 1, pp. 54-61, 1991

[42] D. Madan, P.
Carr and E. Chang, The Variance Gamma Process and Option Pricing, European Finance Review, vol. 2, no. 1, June 1998, pp. 79-105

[43] D. Madan and M. Yor, Representing CGMY and Meixner Lévy Processes as Time-Changed Brownian Motions, Working paper, 2005

[44] C. Menn and S. Rachev, Calibrated FFT-based Density Approximations for α-stable Distributions, working paper, 2004

[45] R. Merton, Option Pricing When Underlying Stock Returns Are Discontinuous, Journal of Financial Economics, vol. 3, pp. 125-144, 1976

[46] P. Protter, Stochastic Integration and Differential Equations, Second Edition, Springer-Verlag, Heidelberg, 2005

[47] R. Rebonato, Interest-Rate Option Models, second edition, John Wiley & Sons, 1998

[48] M. Rubinstein, Implied Binomial Trees, Journal of Finance, vol. 49, no. 3, pp. 771-818, 1994

[49] K. Sato, Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press, 1999

[50] W. Schoutens, Lévy Processes in Finance, Wiley Finance, 2003

[51] O. Vasicek, An Equilibrium Characterization of the Term Structure, Journal of Financial Economics, vol. 5, pp. 177-188, 1977

[52] R. Zagst, Interest Rate Management, Springer Finance, 2002

[53] B. Zhang, Simulating CGMY as a Time-Changed Brownian Motion, Bloomberg Technical Report, 2005

[54] F. Zhou, Black Smirks, Risk, pp. 87-91, May 2003