ABSTRACT

Title of dissertation: HIGHER ORDER ASYMPTOTICS FOR THE CENTRAL LIMIT THEOREM AND LARGE DEVIATION PRINCIPLES

Buddhima Kasun Fernando Akurugodage, Doctor of Philosophy, 2018

Dissertation directed by: Professor Dmitry Dolgopyat, Department of Mathematics

First, we present results that extend the classical theory of Edgeworth expansions to independent identically distributed non-lattice discrete random variables. We consider sums of independent identically distributed random variables whose distributions have d + 1 atoms and show that such distributions never admit an Edgeworth expansion of order d, but that for almost all parameters the Edgeworth expansion of order d − 1 is valid, and the error of the order d − 1 Edgeworth expansion is typically O(n^{−d/2}), although the O(n^{−d/2}) terms have wild oscillations.

Next, going a step further, we introduce a general theory of Edgeworth expansions for weakly dependent random variables. This gives us higher order asymptotics for the Central Limit Theorem for strongly ergodic Markov chains and for piecewise expanding maps. In addition, alternative versions of asymptotic expansions are introduced in order to estimate errors when the classical expansions fail to hold. As applications, we obtain Local Limit Theorems and a Moderate Deviation Principle.

Finally, we introduce asymptotic expansions for large deviations. For sufficiently regular weakly dependent random variables, we obtain higher order asymptotics (similar to Edgeworth expansions) for Large Deviation Principles. In particular, we obtain asymptotic expansions for Cramér's classical Large Deviation Principle for independent identically distributed random variables, and for the Large Deviation Principle for strongly ergodic Markov chains.

HIGHER ORDER ASYMPTOTICS FOR THE CENTRAL LIMIT THEOREM AND LARGE DEVIATION PRINCIPLES

by Buddhima Kasun Fernando Akurugodage

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2018

Advisory Committee:
Prof. Dmitry Dolgopyat, Chair/Adviser
Prof. Leonid Koralov
Prof. Mark Freidlin
Prof. Rodrigo Treviño
Prof. Edward Ott, Dean's Representative

© Copyright by Buddhima Kasun Fernando Akurugodage, 2018

Dedication

To the memory of my uncle, Ivan Fernando, who urged me to seek truth.

Acknowledgments

I am forever grateful to Dima, for taking me under his wing and showing me how to find my way in the mathematical landscape, for the insightful conversations, and for his patience and thoughtfulness. His mathematical brilliance and humility are things that I will always look up to. I am also thankful to my collaborators, Carlangelo and Pratima, for keeping me on the right track, and to Lashi, Sam, Alex and Peter for being positive role models. I am grateful to all my friends at UMD: Hamid, Danul, Micah, Steven, Pratima, Jerry, Jenny, Hsin-Yi, Shujie, David, Corry, Jing, Phil, Minsung, Patrick, JP, Ryan, Jacky and Nick, for being great companions. I extend my gratitude to Larry, without whom I would not even be in the program, to Vadim for initiating me into Dynamics and to Leonid for initiating me into Probability. I also appreciate the work of the awesome administrative staff at the UMD Mathematics Department, including Haydee, Celeste, Liliana, Sharon, Bill and Cristina. Thank you for turning paperwork into a minor problem and for all the reminders about deadlines.
Finally, I am grateful to my parents and Kasunka, for everything they have done. I would not have been able to get through my PhD if not for their encour- agement and support. iii Table of Contents Dedication ii Acknowledgements iii List of Abbreviations and Symbols vi 1 Introduction 1 2 Central Limit Theorem: Discrete Random Variables. 11 2.1 Overview and main results. . . . . . . . . . . . . . . . . . . . . . . . 11 2.2 Edgeworth Expansion under Diophantine conditions. . . . . . . . . . 19 2.3 Change of variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.4 Cut off. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4.1 Density. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4.2 Fourier transform. . . . . . . . . . . . . . . . . . . . . . . . . 23 2.5 Simplifying the error. . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.6 Expectation of characteristic function. . . . . . . . . . . . . . . . . . 34 2.7 Relation to homogeneous flows. . . . . . . . . . . . . . . . . . . . . . 38 2.8 Finite intervals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3 Central Limit Theorem: Weakly Dependent Random Variables. 44 3.1 Overview and main results. . . . . . . . . . . . . . . . . . . . . . . . 44 3.2 Proofs of the main results. . . . . . . . . . . . . . . . . . . . . . . . . 50 3.3 Computing coefficients. . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.4 Applications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.4.1 Local Limit Theorem. . . . . . . . . . . . . . . . . . . . . . . 79 3.4.2 Moderate Deviations. . . . . . . . . . . . . . . . . . . . . . . . 83 3.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 3.5.1 Independent variables. . . . . . . . . . . . . . . . . . . . . . . 86 3.5.2 Finite state Markov chains. . . . . . . . . . . . . . . . . . . . 87 3.5.3 More general Markov chains. . . . . . . . . . . . . . . . . . . . 90 3.5.3.1 Chains with smooth transition density. . . . . . . . . 90 3.5.3.2 Chains without densities. . . . . . . . . . . . . . . . 95 3.5.4 One dimensional piecewise expanding maps. . . . . . . . . . . 97 3.5.5 Multidimensional expanding maps. . . . . . . . . . . . . . . . 102 iv 4 Large Deviation Principles. 104 4.1 Asymptotics for Cramér’s Theorem. . . . . . . . . . . . . . . . . . . . 104 4.1.1 Weak asymptotic expansions. . . . . . . . . . . . . . . . . . . 104 4.1.2 Strong asymptotic expansions. . . . . . . . . . . . . . . . . . . 109 4.2 Higher order asymptotics in the non–i.i.d. case. . . . . . . . . . . . . 113 4.3 An application to Markov Chains. . . . . . . . . . . . . . . . . . . . . 122 A Appendix 128 A.1 Convergence of X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 A.2 Hierarchy of Expansions. . . . . . . . . . . . . . . . . . . . . . . . . . 131 A.3 Construction of {fk}. . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 Bibliography 137 v List of Abbreviations CLT Central Limit Theorem i.i.d. independent and identically distributed LCLT Local Central Limit Theorem LDP Large Deviation Principle LDCT Lebesgue Dominated Convergence Theorem LHS Left hand side LLT Local Limit Theorem PDE Partial differential equation RHS Right hand side WLOG Without loss of generality vi vii Chapter 1: Introduction The Central Limit Theorem (CLT) is one of the most fundamental concepts in probability which was introduced by the work of Laplace and Bernoulli. 
It describes the long term behaviour of random trials repeated under uniform conditions. Let S_N = \sum_{n=1}^{N} X_n be a sum of random variables. We say that S_N satisfies the CLT if there are real constants A and σ > 0 such that

\lim_{N\to\infty} P\left(\frac{S_N - NA}{\sqrt{N}} \le z\right) = N(z)    (1.1)

where N(z) = \int_{-\infty}^{z} n(y)\,dy and n(y) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-y^2/(2\sigma^2)}.

The usefulness of the CLT and related limit theorems depends on the rapid convergence of the distributions of normalized partial sums to the limiting distribution. This is because limit theorems are primarily used for approximating distributions of sums of a large but finite number of random variables. Therefore, an important problem is to estimate the rate of convergence in (1.1). In this regard, an asymptotic expansion as a series of increasing powers of n^{-1/2} (now commonly referred to as the Edgeworth expansion) was formally derived by Chebyshev in [8]. Kolmogorov and Gnedenko emphasize the importance of these expansions in their monograph [23] by stating that the Edgeworth expansion is "the most powerful and general method of finding such corrections."

Definition 1. S_N admits an Edgeworth expansion of order r if there are polynomials P_1(z), ..., P_r(z) such that

P\left(\frac{S_N - NA}{\sqrt{N}} \le z\right) = \underbrace{N(z) + \sum_{p=1}^{r} \frac{P_p(z)}{N^{p/2}}\, n(z)}_{E_{r,N}(z)} + o\left(N^{-r/2}\right)    (1.2)

uniformly for z ∈ R.

Remark 1.1. It is an easy observation that the Edgeworth expansion of S_N, if it exists, is unique. Suppose {P_p(z)}_p and {\tilde P_p(z)}_p, 1 ≤ p ≤ r, are polynomials corresponding to two Edgeworth expansions. Then

\sum_{p=1}^{r} \frac{P_p(z)}{N^{p/2}}\, n(z) = \sum_{p=1}^{r} \frac{\tilde P_p(z)}{N^{p/2}}\, n(z) + o\left(N^{-r/2}\right).

Multiplying by \sqrt{N} and taking the limit N → ∞ we obtain P_1(z) = \tilde P_1(z). Therefore,

\sum_{p=2}^{r} \frac{P_p(z)}{N^{p/2}}\, n(z) = \sum_{p=2}^{r} \frac{\tilde P_p(z)}{N^{p/2}}\, n(z) + o\left(N^{-r/2}\right).

Then, multiplying by N and taking N → ∞, P_2(z) = \tilde P_2(z). Continuing this r times we can conclude that P_p(z) = \tilde P_p(z) for 1 ≤ p ≤ r.

Here and in what follows, A is the asymptotic mean, i.e. A = \lim_{N\to\infty} E\left(\frac{S_N}{N}\right).

The work of Lyapunov, Edgeworth and Cramér focused on the problem of finding higher order asymptotics in the CLT. Their main focus was on independent and identically distributed (i.i.d.) sequences of random variables. In 1928, Cramér introduced a theory of Edgeworth expansions for a broad class of random variables. For the first rigorous derivation of this expansion see [10]. The monograph [11] by Cramér also gives a detailed account of his theory of Edgeworth expansions.

Theorem 1.1 (Cramér). Let X be a centred random variable with E(X^2) = σ^2 > 0 and r + 2 absolute moments. Let X_1, ..., X_N, ... be a sequence of i.i.d. copies of X. Assume further that

\limsup_{|t|\to\infty} |E(e^{itX})| < 1.    (1.3)

Then S_N satisfies (1.2).

Many refinements of this result appear in the later literature. A good introduction to this theory and later developments can be found in [3, 11, 20, 23]. In the i.i.d. case, the P_p's are polynomials such that the characteristic function φ(t) = E(e^{itX}) and the Fourier transform \hat E_{r,N} of E_{r,N} satisfy

φ^N\left(\frac{t}{σ\sqrt{N}}\right) - \hat E_{r,N}(t) = o\left(N^{-r/2}\right).

For example,

E_{1,n}(z) = N(z) + n(z)\,\frac{E(X^3)}{6σ^3\sqrt{n}}\,(1 - z^2)

and

E_{2,n}(z) = N(z) + n(z)\left[\frac{E(X^3)}{6\sqrt{n}\,σ^3}(1 - z^2) + \frac{E(X^4) - 3σ^4}{24\,n\,σ^4}(3z - z^3) - \frac{E(X^3)^2}{72\,n\,σ^6}(15z - 10z^3 + z^5)\right].

Since all distributions with an absolutely continuous component satisfy (1.3), this theorem covers a large class of random variables. However, (1.3) indicates that the common distribution of the X_n's is far from being discrete. In fact, (1.3) fails when the random variables are purely discrete. Surprisingly, not much had been explored in the case of discrete random variables, except in the lattice case.
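To make the role of these correction terms concrete, here is a minimal Monte Carlo sketch (not part of the original argument; the centred exponential distribution, the sample sizes and all variable names are illustrative choices, and numpy/scipy are assumed available). It compares an empirical estimate of the distribution function of the standardized sum with the plain normal approximation and with the order 1 correction corresponding to E_{1,n} above, written here for the sum standardized by σ so that the limiting law is the standard Gaussian.

```python
# A minimal Monte Carlo sketch (illustrative parameters only): for centred exponential
# summands, compare the empirical CDF of S_n/(sigma*sqrt(n)) with the normal
# approximation and with the order-1 Edgeworth correction
#   Phi(z) + phi(z) * E[X^3] * (1 - z^2) / (6 * sigma^3 * sqrt(n)),
# i.e. E_{1,n} written for the sum standardized by sigma.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, samples = 20, 200_000

# X = Exp(1) - 1: mean 0, sigma^2 = 1, E[X^3] = 2; X is non-lattice, so (1.3) holds.
sigma, m3 = 1.0, 2.0
X = rng.exponential(1.0, size=(samples, n)) - 1.0
W = X.sum(axis=1) / (sigma * np.sqrt(n))          # standardized partial sums

for z in (-2.0, -0.5, 0.0, 0.5, 2.0):
    empirical = np.mean(W <= z)                   # Monte Carlo estimate of P(W <= z)
    clt = norm.cdf(z)                             # plain CLT approximation
    edge1 = clt + norm.pdf(z) * m3 * (1 - z**2) / (6 * sigma**3 * np.sqrt(n))
    print(f"z={z:+.1f}  empirical={empirical:.4f}  CLT={clt:.4f}  order-1={edge1:.4f}")
```

For moderate n the order 1 term typically accounts for most of the gap between the empirical values and the plain normal approximation. For purely discrete summands, however, (1.3) fails and this picture breaks down; that is the situation studied in Chapter 2.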
The purpose of my first project [16], joint with Dmitry Dolgopyat, was to address this issue. A detailed discussion can be found in Chapter 2.

When the X_n's are i.i.d., it is known that the order 1 Edgeworth expansion exists if and only if the distribution is non-lattice (see [19]). Therefore, the following asymptotic expansion for the Local Central Limit Theorem (LCLT) for lattice random variables is also useful.

Definition 2. Suppose that the X_n's are integer valued. We say that S_N admits a lattice Edgeworth expansion of order r if there are polynomials P_{0,d}, ..., P_{r,d} and a number A such that

\sqrt{N}\, P(S_N = k) = \sum_{p=0}^{r} \frac{P_{p,d}\big((k - NA)/\sqrt{N}\big)}{N^{p/2}}\; n\!\left(\frac{k - NA}{\sqrt{N}}\right) + o\left(N^{-r/2}\right)

uniformly for k ∈ Z.

Remark 1.2. Here, the subscript d in P_{p,d} refers to the fact that the expansion is for discrete lattice-valued random variables. A priori, there is no reason for the polynomials P_p in Definition 1 to be related to the P_{p,d}. In Section 3.3, we show how these two families of polynomials are related. As in Remark 1.1, we can prove the uniqueness of this expansion. Because the P_{p,d}'s have finite degree, say at most q, choose N large enough so that S_N takes more than q values. Then the argument of Remark 1.1 applies.

During the 20th century, the work of Lyapunov, Edgeworth, Cramér, Kolmogorov, Esséen, Petrov, Bhattacharya and many others led to the development of the theory of these two asymptotic expansions. See [26, 31] and references therein for more details.

It is immediate that S_N admits an order r Edgeworth expansion if

\lim_{N\to\infty} N^{r/2}\left[ P\left(\frac{S_N - NA}{\sqrt{N}} \le z\right) - E_{r,N}(z) \right] = 0    (1.4)

uniformly in z. [3, 4] discuss weak Edgeworth expansions where the LHS of (1.4) is convolved with smooth compactly supported functions. These expansions yield the asymptotics of E(f(S_N)).

To introduce these expansions, suppose (F, ‖·‖) is a function space.

Definition 3. S_N admits a weak global Edgeworth expansion of order r for f ∈ F if there are polynomials P_{0,g}(z), ..., P_{r,g}(z) and A (which are independent of f) such that

E\big(f(S_N - NA)\big) = \sum_{p=0}^{r} \frac{1}{N^{p/2}} \int P_{p,g}(z)\, n(z)\, f\big(z\sqrt{N}\big)\, dz + \|f\| \cdot o\left(N^{-(r+1)/2}\right).

Definition 4. S_N admits a weak local Edgeworth expansion of order r for f ∈ F if there are polynomials P_{0,l}(z), ..., P_{r,l}(z) and A (which are independent of f) such that

\sqrt{N}\, E\big(f(S_N - NA)\big) = \frac{1}{2\pi} \sum_{p=0}^{\lfloor r/2 \rfloor} \frac{1}{N^{p}} \int P_{p,l}(z)\, f(z)\, dz + \|f\| \cdot o\left(N^{-r/2}\right).

We also introduce the following asymptotic expansion, which yields an averaged form of the error of approximation.

Definition 5. S_N admits an averaged Edgeworth expansion of order r if there are polynomials P_{1,a}(z), ..., P_{r,a}(z) and numbers k, m such that for f ∈ F we have

\int \left[ P\left(\frac{S_N - NA}{\sqrt{N}} \le z + \frac{y}{\sqrt{N}}\right) - N\left(z + \frac{y}{\sqrt{N}}\right) \right] f(y)\, dy = \sum_{p=1}^{r} \frac{1}{N^{p/2}} \int P_{p,a}\left(z + \frac{y}{\sqrt{N}}\right) n\left(z + \frac{y}{\sqrt{N}}\right) f(y)\, dy + \|f\| \cdot o\left(N^{-r/2}\right).

Remark 1.3. Here, the subscripts g, l, a refer to global, local and averaged respectively, and are used to distinguish the polynomials appearing in each definition. In Section 3.3, we show how these polynomials are related.

All of these weak forms of expansions are unique provided that F is dense in C_c^∞. If there are two different weak global expansions with polynomials {P_{p,g}} and {\tilde P_{p,g}}, the argument of Remark 1.1 yields

\int P_{p,g}(z)\, n(z)\, f\big(z\sqrt{N}\big)\, dz = \int \tilde P_{p,g}(z)\, n(z)\, f\big(z\sqrt{N}\big)\, dz

for all f ∈ C_c^∞, which gives us the equality P_{p,g}(z) = \tilde P_{p,g}(z). The same idea works for the other two expansions.

We have seen that these asymptotic expansions are unique. They also form a hierarchy. We discuss this in Appendix A.2.
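Before moving on, here is a minimal numerical illustration of Definition 2 in its simplest form (order 0, keeping only the Gaussian term). It is not part of the original text; the Bernoulli summands, the value of N and the chosen lattice points are illustrative, and the exact values of P(S_N = k) come from the binomial distribution.

```python
# A minimal sketch of the leading (order-0) term of Definition 2 for Bernoulli(p)
# summands, where P(S_N = k) is the exact binomial probability; p, N and the chosen
# lattice points k are illustrative.  Here A = p, sigma^2 = p(1-p), and n(.) is the
# N(0, sigma^2) density defined after (1.1).
import numpy as np
from scipy.stats import binom

p, N = 0.3, 400
A, sigma2 = p, p * (1 - p)

def n_density(y):
    # density of the normal law N(0, sigma^2)
    return np.exp(-y**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

for k in (100, 110, 120, 130, 140):
    z = (k - N * A) / np.sqrt(N)
    lhs = np.sqrt(N) * binom.pmf(k, N, p)   # left-hand side of Definition 2
    rhs = n_density(z)                      # leading Gaussian term of the expansion
    print(f"k={k}  sqrt(N)*P(S_N=k)={lhs:.6f}  n((k-NA)/sqrt(N))={rhs:.6f}  "
          f"diff={lhs - rhs:+.6f}")
```

The remaining discrepancy is roughly of relative size N^{-1/2}, which is what the polynomials P_{1,d}, P_{2,d}, ... in Definition 2 are designed to remove.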
Due to this hierarchy, in the absence of one, others can be useful in extracting information about the rate of convergence in (1.1). Previous results on existence of Edgeworth expansions, for example in [11, 20, 23], assume independence of random variables Xn. For many applications the independence assumption of random variables is too restrictive. Because of this reason, there have been attempts to develop a theory of Edgeworth expansions for weakly dependent random variables where weak dependence often refers to asymp- totic decorrelation. See [9, 22, 29, 40, 41] for such examples. Their focus is on the classical expansions introduced in Definition 1 and Definition 2. Except in [9], the sequences of random variables considered are uniformly er- godic Markov processes with strong recurrent properties or processes approximated by such Markov processes. In [9], the authors consider aperiodic subshifts of finite type endowed with a stationary equilibrium state and give explicit construction of the order 1 Edgeworth expansion. They also prove the existence of higher order classical Edgeworth expansions under a rapid decay assumption on the tail of the characteristic function. 6 The goal of [21], a joint work with Carlangelo Liverani, is to generalize these results and to provide sufficient conditions that guarantee the existence of Edgeworth expansions for weakly dependent random variables including observations arising from sufficiently chaotic dynamical systems, and strongly ergodic Markov chains. In fact, we introduce a widely applicable theory for both classical and weak forms of Edgeworth expansions and significantly improve existing results. This work is discussed in detail in Chapter 3. The CLT and related asymptotic expansions provide accurate descriptions only of typical events. For example, if Xn’s are centered i.i.d. random variables then for SN all a > 0, lim P(SN ≥ aN) = 0, due to the Law of Large Numbers i.e. → 0 N→∞ N in probability. Large Deviation Principles (LDPs) give better descriptions of these non–typical events by specifying the exponential rate at which their probabilities decay. Before we present results related to LDPs, we recall the following definitions, and facts whose proofs can be found in [17,30]. Given a function f : R→ (−∞,∞] with f 6= ∞, define its effective domain to be Df = {x ∈ R|f(x) < ∞} and its Legendre transform by f ∗(x) = sup [tx − f(t)]. Then, f ∗ is convex and lower t∈R semi-continuous. Therefore, Df∗ is an interval and f ∗ is continuous on Df∗ . In addition, suppose f is convex, lower semi-continuous with D̊f = (a, b) and f ∈ C2(a, b) with f ′′ > 0 on (a, b) (possibly a = −∞ or b = +∞). Then, D̊f∗ = (A,B) where A = lim f ′(t) and B = lim f ′(t), f ∗ is continuously differentiable t→a+ t→b− on (A,B) and (f ∗)′ = (f ′)−1. For any f satisfying the above properties, for any x ∈ D̊f∗ the supremum in the definition of f ∗(x) is achieved at the unique point 7 t ∈ D̊f which solves f ′(t) = x and hence, f ∗(x) = sup [tx− f(t)]. Also, f is called t∈D̊f steep if lim |f ′(t)| = lim |f ′(t)| =∞. t→a t→b The following classical result, due to Cramér, is one of the fundamental results in the theory of Large Deviations. Theorem 1.2 (Cramér). Let X be a real valued random variable with mean A and variance σ2 > 0. Suppose that the logarithmic moment generating function of X, logE(etX), is finite in a neighbourhood of 0. Let Xn be a sequence of i.i.d. copies of X. 
Then,

\lim_{N\to\infty} \frac{1}{N}\log P(S_N \ge Nz) = -I(z), \quad \text{if } z > A,

and

\lim_{N\to\infty} \frac{1}{N}\log P(S_N \le Nz) = -I(z), \quad \text{if } z < A,

where I is given by I(z) = \sup_{\lambda\in\mathbb{R}} \big[\lambda z - \log E(e^{\lambda X})\big] (the Legendre transform of the logarithmic moment generating function of X).

From the hypothesis it is immediate that I is convex and lower semi-continuous. Also, I'' > 0 on D̊_I = (inf(supp X), sup(supp X)); therefore I is strictly convex on D̊_I, I(z) = 0 ⟺ z = µ, and there is a unique λ* such that I(z) = λ* z − \log E(e^{λ* X}).

Cramér's LDP has an extension to the non-i.i.d. case. We refer the reader to [6, Chapter V.6] for a proof of the following result.

Theorem 1.3 (Local Gärtner–Ellis). Let X_n be a sequence of random variables, not necessarily i.i.d. Suppose there exists δ > 0 such that for λ ∈ (−δ, δ),

\lim_{N\to\infty} \frac{1}{N}\log E(e^{\lambda S_N}) = \Omega(\lambda)    (1.5)

where Ω is a strictly convex, continuously differentiable function with Ω'(0) = 0. Then, for all z ∈ (0, Ω(δ)/δ),

\lim_{N\to\infty} \frac{1}{N}\log P(S_N \ge Nz) = -I(z)    (1.6)

where I(z) = \sup_{\lambda\in(-\delta,\delta)} [z\lambda - \Omega(\lambda)].

Remark 1.4.

1. If the limit (1.5) exists for all λ ∈ R, then B = \lim_{\delta\to\infty} \Omega(\delta)/\delta exists and (1.6) holds for all z ∈ (0, B).

2. The function I appearing in the theorem is called the rate function because it gives the exponential rate of decay of the tail probabilities.

In an on-going joint work with Pratima Hebbar, we develop a theory of higher order asymptotics for LDPs, using the weak forms of Edgeworth expansions and extensions of the results in [27, Chapter VIII]. As in the CLT case, the higher order asymptotics are given as expansions.

Definition 6. Suppose S_N satisfies an LDP with rate function I. Then S_N admits a strong asymptotic expansion of order r for large deviations in the range (0, L) if there are functions C_p : (0, L) → R for 0 ≤ p < r/2 and A > 0 such that for each a ∈ (0, L),

P(S_N - AN \ge aN)\, e^{I(a)N} = \sum_{p=0}^{\lfloor r/2 \rfloor} \frac{C_p(a)}{N^{p+1/2}} + C_{r,a} \cdot o\left(\frac{1}{N^{(r+1)/2}}\right).

These expansions are in the spirit of the higher order expansions found in [1] for i.i.d. sequences of random variables. In [7], the authors refer to these expansions as strong large deviation results. [7, 32] establish the order 1 expansions under certain assumptions on the behaviour of the moment generating functions. These strengthen the results of [1], but only in the order 1 case. Here, we give an alternative way to establish the so-called strong large deviation results of all orders. We also manage to recover the results of [1] in the non-lattice setting. For applications of these results to statistics, see the examples listed in [1, 7, 32] and references therein.

We also introduce the following weak form of the expansion for LDPs. As in the CLT case, we define these expansions over a function space (F, ‖·‖).

Definition 7. Suppose S_N satisfies an LDP with rate function I. Then S_N admits a weak asymptotic expansion of order r for large deviations in the range (0, L) for f ∈ F if there are functions D_p : (0, L) → R for 0 ≤ p < r/2 and A > 0 such that for each a ∈ (0, L),

E\big(f(S_N - aN)\big)\, e^{I(a)N} = \sum_{p=0}^{\lfloor r/2 \rfloor} \frac{D_p(a)}{N^{p+1/2}} + C_{r,a} \cdot o\left(\frac{1}{N^{(r+1)/2}}\right).

In fact, our results show that for a sequence X_n of i.i.d. l-Diophantine random variables with all exponential moments, for every r, S_N admits weak asymptotic expansions of order r for large deviations on (0, ∞) for sufficiently regular f. This is a refinement of Cramér's LDP for a broad class of random variables. We also obtain similar results for certain classes of non-i.i.d. random variables. As an application, we obtain asymptotic expansions for the LDP in the case of Markov chains with smooth densities.
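Before describing the Markov chain application in more detail, here is a minimal sketch of the exponential rate in Theorem 1.2 in the simplest i.i.d. setting. It is illustrative only: Bernoulli summands are lattice, so the sketch checks nothing beyond the logarithmic rate, and the parameters and variable names are arbitrary choices.

```python
# A minimal sketch of the exponential rate in Cramer's theorem (Theorem 1.2) for
# Bernoulli(p) summands; p, z and the values of N are illustrative.  The rate function
# I(z) = sup_lambda [lambda*z - log E(exp(lambda*X))] is computed on a grid of lambda,
# and the tail probability P(S_N >= N*z) is evaluated exactly from the binomial tail.
import numpy as np
from scipy.stats import binom

p, z = 0.3, 0.45                 # z > A = p, so the upper-tail statement applies

def log_mgf(lam):
    # logarithmic moment generating function of X ~ Bernoulli(p)
    return np.log(1 - p + p * np.exp(lam))

lams = np.linspace(-20.0, 20.0, 40001)
I_z = np.max(lams * z - log_mgf(lams))         # numerical Legendre transform

for N in (50, 200, 800, 3200):
    tail = binom.sf(np.ceil(N * z) - 1, N, p)  # exact P(S_N >= N*z)
    print(f"N={N:5d}  -(1/N) log P(S_N >= Nz) = {-np.log(tail) / N:.5f}")

print(f"I(z) = {I_z:.5f}")
# For Bernoulli(p) the supremum is attained at lambda* = log(z(1-p)/((1-z)p)) and
# I(z) is the Kullback-Leibler divergence z*log(z/p) + (1-z)*log((1-z)/(1-p)).
```

The slow convergence of −(1/N) log P(S_N ≥ Nz) to I(z) visible here is one motivation for the sharper, non-logarithmic asymptotics of Definitions 6 and 7.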
In particular, let xn be a time homogeneous Markov chain on a compact connected manifoldM with a smooth transition density and h : M×M → R be smooth with non-degenerate critical points. Then Xn = h(xn, xn−1) admits asymptotic expansions for large deviations of all orders. These results are presented in Chapter 4. 10 Chapter 2: Central Limit Theorem: Discrete Random Variables. 2.1 Overview and main results. L∑et X be a random variable with zero mean and positive variance σ 2. Let n Sn = Xj where Xj are independent identically distributed and have the same n=1 distribution as X. Then, it is well-known that Sn satisfies the CLT with A = 0 and σ as in (1.1). In this chapter, we consider a case which is opposite to X having a density, namely we suppose that X has a discrete distribution with d+1 atoms where d ≥ 2. d = 2 is the simplest non-trivial case since distributions with two atoms are lattice and as a result they do not admit even the first order Edgeworth expansion. Thus, we suppose thatX takes values a1, . . . , ad+1 with probabilities p1, . . . , pd+1 respectively. Since X should have zero mean we suppose that our 2(d + 1)−tuple (a,p) belongs to the set Ω = {pi > 0, p1 + · · ·+ pd+1 = 1, p1a1 + · · ·+ pd+1ad+1 = 0}. It is easy to see that Sn never admits the order d Edgeworth expansion. Indeed ∑ n! P m(S ≤ z) = pm1 . . . p d+1a,p n . (2.1) ∑ ∑ m1! . . .m ! 1 d+1 mi≥ d+1 0, mi=n miai≤z 11 Applying the Local Central Limit Theorem to the time homogeneous Zd-random walk which jumps to ei from the origin 0 with probability pi for i = 1, . . . , d and stays at 0 with probability pd+1 we conclude that if ∑ ∑ √ miai = n aipi +O( n) then d/2 n! m1 mn p . . . p d+1 m 1 d+11! . . .md+1! is uniformly bounded from below. Accordingly Pa,p(Sn ≤ z) has jumps of order n−d/2. On the other hand Ed(z) is a smooth function of z. So, it cannot approximate both Pa,p(Sn ≤ z − 0) and Pa,p(Sn ≤ z + 0) at the points of jumps. Here we show that for typical (a,p) the order d Edgeworth expansion just barely fails. We present two results in this direction. For the first result let bj = aj − a1, for j = 2 . . . d+ 1. Set d(s) = max dist(bjs, 2πZ). j∈{2,...d+1} We say that a is β-Diophantine if there is a constant K such that for |s| > 1, K d(s) ≥ . |s|β It is easy to see that almost all a is β-Diophantine provided that β > (d− 1)−1 (see [36,47]). Theorem 2.1.1. If a is β-Diopha(ntine an)d − 12 R β < 1 (2.2) 2 12 then [ ( ) ] Sn lim nR P √ ≤ z − E →∞ a,p d−1 (z) = 0. n σ n Thus for almost every a the order (d − 1) Edgeworth expansion approximates the S√ndistribution of with error O(nε−d/2) for any ε. σ n Note that Theorem 2.1.1 applies for all βs, in particular for βs which are much larger than (d− 1)−1. However if β is large, then the statement of the theorem can be simplified. Namely, let r be the integer such that r < 2R ≤ r + 1. (Note that 1 (2.2) can be rewritten as 2R < + 1 so provided that 2R is suffciently close to β 1 + 1 we can take r = bβ−1( )c+ 1. Then,β ( ) S P √na,p ≤ z = E 1 d−1(z) + σ n (o n)R E 1= r(z) + o +O (Ed−1(z)− Er(z)) . nR r + 1 Since > R the first error term dominates the second and we obtain the 2 following result. Corollary 2.1.1. [ ( ) ] lim nR S P √na,p ≤ z − Er(z) = 0 n→∞ σ n 1 provided that a is β-Diophantine, r = 1 + bβ−1c, and r < 2R < + 1. β Theorem 2.1.1 shows that for almost every a and for r ∈ {1, . . . , d − 1}, the order r Edgeworth expansion is v(alid. 
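To see this dichotomy numerically before the precise statements, the following sketch (illustrative only and not part of the original text; the atoms, probabilities and evaluation point are arbitrary choices, with atom differences 1 and √2 so that a is Diophantine) evaluates F_n(z) = P_{a,p}(S_n/(σ√n) ≤ z) exactly from (2.1) and compares it with the order 1 and order 2 expansions for the standardized sum, i.e. the analogues of E_{1,n} and E_{2,n} from Chapter 1 after rescaling by σ. In line with Theorem 2.1.1, √n(F_n − E_1) remains small, while n(F_n − E_2) stays of order one and keeps fluctuating, which is the behaviour quantified by the results that follow.

```python
# A minimal exact computation for d = 2 (three atoms): the rescaled error of the
# order-1 expansion stays small, while the rescaled error of the order-2 (= order-d)
# expansion keeps fluctuating.  The atoms (differences 1 and sqrt(2)), the
# probabilities and the evaluation point z are illustrative; F_n is computed exactly
# from the multinomial formula (2.1).
import itertools
import math
import numpy as np
from scipy.stats import norm

b = np.array([0.0, 1.0, math.sqrt(2.0)])
p = np.array([0.3, 0.4, 0.3])
a = b - np.dot(p, b)                                  # centre the atoms: mean zero
sigma = np.sqrt(np.dot(p, a**2))
lam3 = np.dot(p, a**3) / sigma**3                     # standardized third cumulant
lam4 = (np.dot(p, a**4) - 3 * sigma**4) / sigma**4    # standardized fourth cumulant

def F_n(n, z):
    """Exact P(S_n <= z*sigma*sqrt(n)), enumerating multinomial outcomes as in (2.1)."""
    cutoff = z * sigma * np.sqrt(n)
    log_fact = [math.lgamma(k + 1) for k in range(n + 1)]
    total = 0.0
    for m1, m2 in itertools.product(range(n + 1), repeat=2):
        m3 = n - m1 - m2
        if m3 < 0:
            continue
        if m1 * a[0] + m2 * a[1] + m3 * a[2] <= cutoff:
            logw = (log_fact[n] - log_fact[m1] - log_fact[m2] - log_fact[m3]
                    + m1 * math.log(p[0]) + m2 * math.log(p[1]) + m3 * math.log(p[2]))
            total += math.exp(logw)
    return total

def edgeworth(n, z, order):
    """Order 1 or 2 expansion for the sum standardized by sigma*sqrt(n)."""
    e = norm.cdf(z) - norm.pdf(z) * lam3 * (z**2 - 1) / (6 * math.sqrt(n))
    if order == 2:
        e -= norm.pdf(z) * (lam4 * (z**3 - 3 * z) / (24 * n)
                            + lam3**2 * (z**5 - 10 * z**3 + 15 * z) / (72 * n))
    return e

z = 0.5
for n in range(60, 121, 10):
    fn = F_n(n, z)
    err1 = math.sqrt(n) * (fn - edgeworth(n, z, 1))   # small, as in Theorem 2.1.1
    err2 = n * (fn - edgeworth(n, z, 2))              # order one and fluctuating
    print(f"n={n:4d}  sqrt(n)*(F_n - E_1)={err1:+.4f}   n*(F_n - E_2)={err2:+.4f}")
```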
Result)s that follow show that, S P na,p √ ≤ z − Ed(z) (2.3) σ n is typically of order O(n−d/2) but the O(n−d/2) term has wild oscillations. To for- mulate this result precisely we suppose that our 2(d+ 1)-tuple is chosen at random 13 according to an absolutely continuous distribution P on Ω. Thus (2.3) becomes a random variable. Theorem 2.1.2. There exists a smooth function Λ(a,p) such that for each z the random variable [ ( )] d/2 z2/2 n E − Se d(z) Pa,p √n ≤ z Λ(a,p) σ n converges in law to a non-trivial random variable X . More precisely we have, |ad+1 − a1| Λ(a,p) = d+ 1 √ (2.4) 2dπ 2 det(Da,p) σ(a,p) where Da,p is a (d − 1) × (d − 1) matrix defined by equations (2.37)–(2.38) of Section 2.5, σ(a,p) denotes the standard deviation of the distribution of the random variable taking value aj with probability pj and X is defined as follows. Let M be the space of pairs (L, χ) where L is a unimodular lattice in Rd and χ is a homeomorphism χ : L → T. In the formulas below, we identify T with segment [0, 1) equipped with addition modulo one. Given a vector w ∈ Rd we denote by y(w) its first coordinate and by x(w) its last d− 1 coordinates. Lemma 2.1.2. For almost every pair (L, χ) ∈M with respect to the Haar measure the following limit exists ∑ X L sin(2πχ(w)) −||x(w)||2( , χ) = lim e . (2.5) R→∞ y(w) w∈L\{0}, ||w||≤R In order to simplify the notation we will abbreviate expressions such as (2.5) by ∑ X L sin(2πχ(w)) 2( , χ) = e−||x(w)|| . (2.6) y(w) w∈L\{0} 14 The Haar measure on M can be defined in two equivalent ways. First, note that χ is of the form χ(w) = eiχ̃(w) for some linear functional χ̃ ∈ (Rd)∗. SLd(R) acts on Rd ⊕ (Rd)∗ by the formula, A(w, χ̃) = (Aw, χ̃A−1). Observe that if A(w, χ̃) = (ŵ, χ̂) then, χ̃(w) = ŵ(χ̂). (2.7) The above action of SLd(R) induces the following action of SL (R) n (Rd)∗d on M given by, (A, χ̃)(L, χ) = (AL, e2πitχ̃ · (χ ◦ A−1)). This action is transitive because SLd(R) acts transitively on unimodular lattices and (Rd)∗ acts transitively on characters. This allows us to identify M with (SLd(R) n Rd)/(SLd(Z) n Zd) and so M inherits the Haar measure from SL (R) nRdd . The second way to define the Haar measure is to note that the space M of unimodular lattices is naturally identified with SLd(R)/SLd(Z) and so it inherits the Haar measure from SLd(R). Next for a fixed L the set of homeomorphisms χ : L → T is a d dimensional torus so it comes with its own Haar measure. Now, if we want to compute the average of a function Φ(L, χ) with respect to the Haar measure then we can first compute its average Φ̄(L) in each fiber and then integrate the result with respect to the Haar measure on the space of lattices. In the proof of Lemma 2.1.2 given in Section A.1 the averaging inside a fiber will be denoted by Eχ and the averaging with respect to the Haar measure on the space of lattices will be denoted by EL. 15 If we assume that the pair (L, χ) is distributed according to the Haar measure on M then X , defined in Lemma 2.1.2, becomes a random variable. This is the variable mentioned in Theorem 2.1.2. Note that the distribution of X depends neither on P nor on z. Using the second representation of the Haar measure we can also describe X as follows. Let w1, . . . ,wd be the shortest spanning set of L. That is w1 is the shortest non zero vector in L and, for j > 1, wj is the shortest vector which is linearly independent of w1, . . . ,wj−1. Given m = (m1, . . . ,md) ∈ Zd let (y,x)(m), y ∈ R and x ∈ Rd−1, denote the point m1w1 + · · ·+mdwd ∈ L. (2.8) Let θj = χ(wj). 
Then θj are uniformly distributed on T and independent of each other. Set θ(m) = m1θ1 + · · · + mdθd. Now X (see definition in Lemma 2.1.2) can be rewritten as ∑ X sin(2πθ(m))= e−||x(m)||2 (2.9) y(m) m∈Zd\{0} where L is uniformly distributed on the space of lattices, (y,x)(m) is defined by (2.8), and (θ1, . . . θd) is uniformly distributed on Td and independent of L. Theorems 2.1.1 and 2.1.2 have analogues when we consider probabilities that Sn belongs to finite intervals. In particular, our results have applications to the Local Limit Theorem. Theorem 2.1.3. Let z1(n) and z2(n) be two uniformly bounded sequences such that |z (n)− z (n)|nd/21 (2 [ →∞. Th(en, the ran)d]om vec[tor, ( )]) nd/2 z2 Sne 1/2 E (z )− P √ ≤ z , ez2 Sn2/2d 1 a,p 1 Ed(z2)− Pa,p √ ≤ z2 (2.10) Λ(a,p) σ n σ n 16 converges in law to a random vector (X (L, χ1),X (L, χ2)) where X (L, χ) is defined by (2.6) and the triple (L, χ1, χ2) is uniformly distributed on (SLd(R)/SLd(Z)) × Td × Td. Here and below the uniform distribution of (L, χ1, χ2) means that L is uni- formly distributed on the space of lattices and for a given lattice, χ1 and χ2 are chosen independently and uniformly from the space of characters. Theorem 2.1.4. Let z1(n) < z2(n) be two uniformly bounded sequences such that ln = z2(n)− z1(n)→ 0. (a) If l ≥ Cnε−d/2n for some ε > 0 then Pa,p(z < S√n1 < zσ n 2) → 1 almost surely. lnn(z1) (b) If l nd/2n →∞ then P Sna,p(z1 < √ < zσ n 2) ⇒ 1 lnn(z1) (here and below “⇒” denotes the convergence in law). c|ad+1 − a1| (c) If ln = then σ(a,p)nd/2√ [ ]P (z < S√na,p 1 < z2) 2d− 3 d σ n 2π det(Da,p) − 1 ⇒ Y lnn(z1) where ∑ Y L sin(2π[χ(w)− cy(w)])− sin(2πχ(w))( , χ, c) = e−||x(w)||2 y(w) w∈L\{0} and L, χ are as in Theorem 2.1.2 and Da,p given by equations (2.37)–(2.38). The intuition behind this result is the following. Call yn δ-plausible if P(Sn = yn) ≥ δn−d/2. The discussion following (2.1) shows that for each δ there are about 17 C(δ)nd/2 δ-plausible values. Therefore, if ln  n−d/2 then the interval [z1(n), z2(n)] would typically contain no plausible values. Hence, we should not expect the LLT to hold on that scale. Theorem 2.1.4 shows that as soon as interval [z1(n), z2(n)] contains many plausible values then the LLT typically holds for this interval. Recall that, ∑ n! P ma,p(Sn ∈ [z1, z2]) = pm1∑ 1 . . . p d+1 d+1 . ∑ m1! . . .m≥ d+1!mi 0, mi=n z1≤ miai≤z2 ∑Thus, in Theorem 2.1.4 we just count the number of visits of a random linear form miai to a finite interval with weights given by multinomial coefficients. It is also interesting to consider counting with equal weight. In this case the analogue of Theorem 2.1.4(c) is obtained in [38] while for longer intervals only partial results are available, for example see [15,34]. The chapter is organized as follows. Theorem 2.1.1 is proven in Section 2.2. The proof is a minor modification of the arguments of [20, Chapter XVI]. The bulk of the chapter is devoted to the proof of Theorem 2.1.2. In Section 2.3 we provide an equivalent formula for X . This formula looks more complicated than (2.6) but it is easier to identify with the limit of the error term. Section 2.4 contains preliminary reductions. We show that the density ρ on Ω could be assumed smooth and the integration in the Fourier inversion formula could be restricted to a finite domain. In Section 2.5, we show that main contribution to the error term comes from resonances where characteristic function of Sn is close to 1 in absolute value. 
The proof relies on several technical estimates which are established in Section 2.6. In Section 2.7, we use dynamics on homogenuous spaces in order to show that the contribution of 18 resonances converges to (2.6) completing the proof of Theorem 2.1.2. The proofs of Theorems 2.1.3 and 2.1.4 are similar to the proof of Theorem 2.1.2. The necessary modifications are explained in Section 2.8. We postpone the proof of Lemma 2.1.2 till Appendix A.1. 2.2 Edgeworth Expansion under Diophantine conditions. Theorem 2.1.1 is a consequence of Theorem 2.2.1 below and the fact that in our case there is a positive constant c such that |φ(s)| ≤ 1− cd(s)2. (2.11) (2.11) follows from inequality (2.35) proven in Section 2.5. Theorem 2.2.1. If the distribution of X has d+ 2 moments and its characteristic function satisfies | Kφ(s)| ≤ 1− (2.12) |s|γ d and R < is such that 2 ( ) R− 1 γ < 1 (2.13) 2 then [ ( ) ] R SP √nlim n ≤ z − E →∞ d−1 (z) = 0. n σ n Theorem 2.2.1 follows easily from the estimates in [20, ChapterXVI] but we provide the proof here for completeness. Proof. Denoting ( ) Sn ∆̄n(a,p) = Pa,p √ ≤ z − Ed−1(z) σ n 19 we get by [20, Chapter XVI] that for∣each T∫ ∣ 1 √ T ∣ √ σ n | ∣∆̄n(a,p)| ≤ ∣ π − √T ∣φ n(s)− Êd−1(sσ n) ∣∣∣∣ Cds+ . (2.14)s T σ n C C ε Choose T = BnR with B = . Then, = . Take a small δ and split the ε T nR integral ∣in the RHS of (2.14) ∣into two parts.∫ ∣∣∣ √ ∣ √ ∣1 δ ∣φn ∫ (s)− Êd−1(sσ n) ∣∣∣∣ 1 ∣∣φn(s)− Ê ∣d−1(sσ n) ∣ds + ∣∣ ∣∣ ds.π −δ s π δ<|s|δ s |s| < BnR−1/2∣ /σ}∣. Thus, we only need to approximate,∫ ∣∣∣ n ∣∣∣ ∫ ∫φ (s) ≤ 1 | n | ≤ C ( )1ds φ (s) ds exp −c̄ n1−(R− γ2) ds (2.16) J s δ J δ J where the last inequality is due to (2.12). By (2.13) the integral decay faster than d any power of n. Because R < the contribution of |s| ≤ δ is also under control. 2 2.3 Change of variables. Here we deduce Theorem 2.1.2 from: Theorem 2.1.2*. For each z[the random v(ariable )] nd/2 Ed(z)− S P na,p √ ≤ z σ n converges in law to X̂ where ∑ X̂ L −z2/2 |ad+1 −√a1| sin 2πχ(w)(a, p, , χ) = e e−4π2x(w)TDa,px(w) (2.17) 2σ(a, p) π3 y(w) w∈L\{0} 20 a = (a1, . . . , ad+1), p = (p1, . . . , pd+1) and (a, p) ∈ Ω are distributed according to P and Da,p and σ(a, p) are defined immediately after (2.4). In order to deduce Theorem 2.1.2 from Theorem 2.1.2* we need to show that z2/2 X̂e has the same distribution as X . To this end we rewrite the sum in (2.17) Λ(a, p) as 1 √ ∑ sin(2πχ(w))√ √e−4π2||( Da,px(w))||2 . (2.18) (2π)d−1 det( Da,p) d−1w∈L\{ } y(w)/((2π) det( D0 a,p)) Let A be the linear map suc(h that √ )y √A(y,x) = , 2π Da,p x . (2π)d−1 det(Da,p) Put (L̄, χ̄) = A(L, χ). Then, using (2.7), (2.18) can be rewritten as: 1 √ ∑ sin(2πχ̄(w̄))e−||x(w̄))||2 . (2π)d−1 det( Da,p) y(w̄)w̄∈L\{0} Since det(A) = 1, the pair (L̄, χ̄) is distributed according to the Haar measure on M proving our formula for X . Sections 2.4–2.7 are devoted to the proof of Theorem 2.1.2*. Note that simi- larly to (2.9) we have X̂ −z2/2 |ad+1 −√a1| ∑ sin 2πθ(m) = e e−4π 2x(m)TDa,px(m). 2σ(a, p) π3 y(m) m∈Zd\{0} The statements of Theorems 2.1.2 and 2.1.2* look similar, however, there is an important distinction. Namely the proof of Theorem 2.1.2* is constructive. In the course of the proof given n, a and z we construct a lattice L(a, n) and a character χ(a,p, n, z) such that the expression n−d/2X̂ (a,p,L(a, n), χ(a,p, n, z)) well-approximates the error in the Edgeworth expansion. 
We believe that such a 21 construction could be made for more general distributions where the Edgeworth ex- pansion fails, and this will be a subject of a future investigation. So the difference between Theorems 2.1.2 and 2.1.2* is that in the first case we have only an approx- imation in law while in the second case we are able to obtain an approximation in probability. 2.4 Cut off. 2.4.1 Density. Here we show that it is enough to prove Theorem 2.1.2* under the assumption that P has smooth density supported on a subset Ωκ = {(a,p) ∈ Ω : ∀i pi ≥ κ and ∀i 6= j |ai − aj| ≥ κ} for some κ > 0. Indeed suppose that the theorem is true for such densities. Let p(a,p) the original density of P. Let φ be a bounded continuous test function. Given ε we can find a smooth density p̃(a,p) supported on some Ωκ such that ||p− p̃||L1∫≤ ε. In Section 2.7 we∫p∫rove that φ(nd/2∆n)(p̃ da dp→) φ(X̂ (a,p,L,θ))p̃ da dp dµ(L,θ) (2.19) S where ∆n = Ed(z)−P √n ≤ z and µ is the Haar measure on (SLd(R)/SLd(Z))× σ n Td. Let pm(a,p) be the smooth density supported on Ωκ corresponding to ε = m−1. Passing to subsequence, pm → p almost surely. Because |pmφ| ≤ ‖φ‖|pm| ∈ L1 and |pφ| ≤ ‖φ‖|p| ∈ L1 and ‖φ‖|pm| → ‖φ‖|p| almost surely, Lebesgue Dominated Convergence Theorem gives 22 ∫∫ φ(X̂ (a,p,L,θ))pm da dp dµ(L,θ∫) ∫ → φ(X̂ (a,p,L,θ))p da dp dµ(L,θ). (2.20) Combining∫(2.19) and (2.20) we h∫ave that, φ(nd/2∆∫n∫)p da dp = φ(n d/2∆n)pm da dp +O(m−1‖φ‖) (2.21) −n−→−∞→ ∫∫ φ(X̂ (a,p,L,θ))pm da dp dµ(L,θ) +O(m −1‖φ‖) −m−→−∞→ φ(X̂ (a,p,L,θ))p da dp dµ(L,θ). 2.4.2 Fourier transform. As in the previous section let ( ) Sn ∆n = Ed(z)− Fn(z) where Fn(z) = Pa,p √( ≤ z .σ n ) 1 · 1− cosTxDenote by vT (x) = and let V(s, T ) = 1− |s| 1 2 |s|≤T be itsπ Tx T Fourier transform. Using the approach of [20, Section XVI.3] we let T2 = n 2d+6 and decompose ∆n = [Ed − Fn] ? vT2(z)− [Fn − Fn ? vT2 ] (z) + [Ed − Ed ? vT2 ] (z). (2.22) To estimate the last term we split ∫ [Ed − Ed ? vT2 ] (z) =∫ [Ed(z)− Ed(z − x)] vT2(x)dx (2.23)|x|≤1 + [Ed(z)− Ed(z − x)] vT2(x)dx. |x|≥1 Since vT is even∫the first integral in (2.2∫3) equals to E ′′E ′ (y(z, x))d(z)xvT2(x)dx− d x2vT2(x)dx |x|≤1 |x|≤1 2 23 ∫ E ′′ ( ) d (y(z, x)) 1− cosT2x= dx = O 1 . |x|≤1 2 πT2 T2 Since both Ed and cosine are bounded the second integral in (2.23) is bounded by ∫ dx C C = . ( 2|x|≥1)T2x T2 Thus the last term in (2.22) is O T−12 . To estima√te the second term in√(2.22) we split the integral in Fn ?√vT2 into regions {|x| ≥ 1/ T2} and {|x| ≤ 1/ T2}. The contribution of {|x| ≥ 1/ T2} is bounded by ∫ ∞ dx C √ = √ C . 1/ T T x 2 T 2 2 2 On the other hand ∫ √ [Fn(z)− Fn(z − x)]VT2(x)dx = 0 |x|≤1/ T2 [ √ √ ] unless there is a point of increase of Fn inside z − 1/ T2, z + 1/ T2 . The prob- ability that such a point exists is bounded by ∑ ( [ √ √ ]) P m1a1 + · · ·+md+1ad+1 ∈ z − 1/ T2, z + 1/ T2 . (2.24) m1+···+md+1=n Note that for each fixed (m1, . . . ,md+1) the random variable m1a1 + · · ·+md+1ad+1 has a bou(n√ded density with r)espect to the uniform distribution on the segment of length O m21 + · · ·+m2d+1 and so ( ) |J | P(m.a ∈ J) = O ‖m‖ 24 ( ) for(any in)terval J . Hence each term in( 1(2.2)4) is O √ and so the(sum isn T2d ) O √n 1 −1/2. Thus with probability 1−O we have that ∆n = ∆n,2+O T n T2 n4 2 where ∫ [ ( ) ]T φn1 2 √t − Ên d(t) ∆n,2 = ∫ V(t, T )e −itz 2 dt 2π −T it2 T√2 √1 σ n √ φn−iszσ n (s)− Êd(sσ n)= e V(s, n, T2)ds , ∣ 2π ∣ − T√2 is∣ σ n√ ∣ V sσ n(s, n, T ) = 1− ∣∣ ∣∣ and φ(s) is the characteristic function of X given byT φ(s) = p1e isa1 + · · ·+ p eisad+1d+1 . 
Let T1 = K1n d/2 and define ∫ T√1 √1 σ n √ n−iszσ n φ (s)− Êd(sσ n)∆n,1 = e V(s, n, T2) ds. 2π − T√1 is σ n Let Γn = ∆n,2 −∆n,1. Put ∫ 1 √ n Γ̃ = e−iszσ n φ (s) n √ √ V(s, n, T2) ds.2π |s|∈[T1/((σ n),T2)/(σ n)] is 2 Then, we have Γn = Γ̃n +O e−εT1 due to the exponential decay of Êd. The main result of Subsection 2.4.2 is the following. Proposition 2.4.1. ∥∥∥ ∥∥Γ̃n∥ ≤ √C . (2.25) L2 T nd1 Proof. ∫∫ ( √ ) E(Γ̃2 ) = E e−i(s1+s2)zσ n n n V V ds1 ds2n φ (s1)φ (s2) (s1, n, T2) (s2, n, T2) .s1 s2 We split this integral into two parts. 25 (1) In the region where |s1 + s2| ≤ 1 we use Corollary 2.5.2 proven in Section 2.5 to estimate the(in∫tegral by ) O 1√ √ √ E (|φ n(s1)|) ds1 . (2.26)2 |s|∈[T1/(σ n),T2/(σ n)] ns1 The next result will be proven in Section 2.6. Lemma 2.4.2. C E (|φn(s1)|) ≤ . nd/2 Plugging the estimate of Lemma 2.4.2 into ((2.26) an)d integrating we see that 1 the contribution of the first region to E(Γ̃2n) is O .T d/21n (2) Consider now the region where |s1 + s2| ≥ 1. Denote bd+1 = ad+1 − a1, . . . , b2 = a2 − a1. Then φ(s) = eisa1ψ(s) where ψ(s) = p + p eisb21 2 + · · ·+ p eisbd+1d+1 . Denote ν = (p1, . . . , pd, b2, . . . , bd). Then there exists a compactly supported density ρ = ρ(a1,ν) such that the contribution of the second region is ∫∫ (∫ ) √ −i(s +s )zσ n in(s +s )a n n V V ds1 2 1 ds2e e 1 2 1ψ (s1)ψ (s2) (s1) (s2)ρ da1 dν . |s s s1+s2|≥1 1 2 We are able to use a 2d-dimensional coordinate system because on Ω p1 + · · ·+ pd+1 = 1, and p1a1 + · · ·+ pd+1ad+1 = 0. (2.27) To estimate this integral we integ[rate by p]arts with respect to a1. We use thatk eisna 1 d 1da1 = de isna1 isn da1 26 for some large k (for(exam)ple we(can take k = 2d + 1)). The integration by partskd √ amoun{ts to applying to e iszσ n }{ ρ[ψ(s1})ψ{(s n 2)] which leads to the terms( )k [ da1 ( ) ( ) }d 1 √ ] kd 2 ki(s +s )zσ n d 3e 1 2 [ρ] [[ψ(s1)ψ(s n2)] ] da1 da1 da1 where k1 + k2 + k3 = k. (Note that both σ and ψ depend on a1 implicitly due to the second equation in (2.27)). Thus, the contribution of the above term to the integral is bounded ∫by∫ (s + s k1 (k1/2)+k31 2) n ds1 ds2 C √ √ E (|φn(s1)|) . |s1|,|s2|∈[T1/σ n,T2/σ n] (s + s )k nk s1 s2 |s1+s 1 2 2|≥1 Using Lemma 2.4.2 again we can estimate the above integral by C if k ≥ k − 2nk/2 1  C − − otherwise.T k+d/2 k1/2 k31n Thus the main contribution comes from k1 = k2 = 0, k3 = k proving Proposition 2.4.1. Proposition 2.4.1 shows that the contribution from Γ̃n to the L 2-limit of nd/2∆n √ can be made arbitrarily small by choosing K1 large. Also, on |s| ≤ T1/σ n we have ( √ ) V sσ n sσ(s, n, T2) = 1− 1 √ T |s|j,j=6 1 ∑d + 2 pjp1 cos(bjξk + ηj,k). (2.35) j=2 Therefore ∑ r2k = 1− plpj[(bl − bj)ξk + η − η ]2l,k j,k − pd+1p1b2 2d+1ξk l>j,j=6 1 30 ∑ ( ∑ )d d − pjp1(bjξk + ηj,k)2 +O ξ3k + η3l,k . j=2 l=1 Taking η1,k = b1 = 0 we can write the above as, ∑ ∑ r2 = −ξ2 p p (b − b )2k k l j l j − 2ξk plpj(bl − bj)(ηl,k − ηj,k) l>j l>j ∑ (l,j)6=(d,1) ( ∑ )d + 1− plpj(bl − bj)(η 2 3 3l,k − ηj,k) +O ξk + ηl,k . l>j l=1 (l,j)6=(d,1) Since we have r2k approximated by a quadratic polynomial of ξk (the unknown) we can approximate ξk by∑determining the maximizer of r2k(ξk), obtaining l>j∑ plpj(bl − bj)(ηl,k − ηj,k) ( )ξk = − (l,j)=6 (d,1) +O ‖η 2k| . (2.36)2 l>j plpj(bl − bj) Substituting back we find rk in terms of ηj,k only. 
Ignoring higher order terms we compute the maximum to be: ∑ r2k = 1− plpj(bl − bj)(ηl,k − η 2j,k) l>j (l,j)=6 (d,1) [∑ ]2 p p (b − b )(η − η ) ( )l>j l j l j l,k j,k ∑d (l,j)6=(d,1) + ∑ +O η3 l>j plpj(bl − b )2 l,k [∑ ] j l=1 −1 Put R = p 2lpj(bl − bj) . Then, l>j ∑ r2 2k = 1 + plpj(bl − bj) [plpj(bl − bj)R− 1] (ηl,k − ηj,k) l>j ∑ (l,j)6=(d,1) (∑ ) + plpjpmpn(bl− bj)(b 3m− bn)R(ηl,k − ηj,k)(ηm,k − ηn,k) +O ηl,j l>j,m>n l>j l 6=m,j 6=n (l,j),(m,n)6=(d,1) 31 ∑ (d ∑ ) := 1− 2 Dl,j(a,p)ηl,kηj,k +O η3l,j . (2.37) l,j=2 l>j Thus, ∑ ( )d ∑ rk = 1− Dl,j(a,p)ηl,kηj,k +O η3 = 1− ηT 3l,j kDa,pηk +O(‖ηk‖ ) l,j=2 l>j where Da,p is a (d− 1)× (d− 1) matrix with [Da,p]i,j = Di,j(a,p) (2.38) and ηTk = (η2,k, . . . , ηd,k). From this we have, −z2e /2I √ (1− η T kDa,pηk +O(‖η ‖3))n √= k einφk−iskzσ nk (1 + o(1)). i πnσ sk Let B T1(a,p) be the contribution of the boundary terms ± √ ∈ Ik. σ n Lemma 2.5.3. E(|B| ≤ C) . n(2d−1)/2 Lemma 2.5.4. Let Ik,l = Ik1|k|αn1/4‖η ‖∈[2l,2l+1k ]. with α = [2(d− 1)]−1. Then there is a constant c̃ such that E ∑ ∑ ( )|Ik,l| O 1= 2K exp(−c̃22K) . nd/2 0<|k|K Lemmas 2.5.3 and 2.5.4 will be proven in Section 2.6. Next we prove a lemma that would allow us to further simplify ∆̂n. Lemma 2.5.5. (a) sk = sk + ω Tηk +O(‖η‖2k) where ω = ω(a,p) is a 1× (d− 1) vector. ( ) ‖ ‖ O l√nn(b) If η = then nφk = nska1 + np2η2,k + · · ·+ npdηd,k + o(1). n 32 Proof. Since sk − sk = ζk part (a) follows by (2.36). Next, by (2.34) ( ) O 3 ln 3/2 n φk = arg φ(sk) + δ + n3/2 Note that, φ(s ) = eiska1(p + p eiη2,k + · · ·+ p eiηd,kk 1 2 d + pd+1). Thus, ( ) −1 p2 sin η2,k + · · ·+ pd sin ηd,karg(φ(sk)) = ska1 + tan ∑ p1 + p2 cos η2,k + · · ·+ pd cos ηd,k + pd+1d = ska1 + plηl,k +O(‖η 3k‖ ) l=2 since the denominator in the first line is 1+O(||η||2). Now part (b) follows easily. Now, we continue the analysis of the leading term in ∆̂n. Pick a small δ and define A1 = {(a,p)| Ik,l = 0 ∀k, l s.t. |k| < δn(d−1)/2 and l < K}. Then Ac = {(a,p)| ∃|k| < δn(d−1)/2, |k|αn1/41 ‖ηk‖ ≤ 2K}. Thus, ∑ c C2 K √ P(A K1) = = O( δ2 )|k|(d−1)αn(d−1)/4 |k|<δn(d−1)/2 1 √ if α = . Hence, for a very large K and δ such that δ2K is very small, we 2(d− 1) |k| can approximate ∆n by the sum of Ik’s with δ ≤ ≤ K and |k|αn1/4‖η ‖ ≤ n(d−1)/2 k 2K . 33 √ k We define the random vector Xk = nηk and Yk = − . Then, combiningn(d 1)/2 terms corresponding to k and −k, we obtain the following approximation to the distribution of ∆n for large n √ | | −z2b e /2 ∑d+1 √ sin(nφk − skzσ n)e−XTk Da,pXk nd/2σ π3 Y∈ kk S(n,δ,K) where S(n, δ,K) = {k > 0|δ < Yk < K, |Yk|α‖Xk‖ < 2K}. Define q = (p2, . . . , pd). Then, Lemma 2.5.5 shows that √ √ √ nφk − s T Tkzσ n = sk(na1 − zσ n) + nq ηk − zσ nω ηk + o(1) 2πnd/2 √ √ = ( na − zσ)Y + ( nq− zσω)T1 k Xk + o(1).|bd+1| √ Therefore, for large n and K and δ such that δ2K is very small, the distribution of ∆n is well approximated by ∑ (2πnd/2 √ √ )|b |e−z2/2 sin | | ( na1 − zσ)Yk + ( nq− zσω)TXkd+1 bd+1 T ∆̃n(δ,K) = √ e−Xk Da,pXk . nd/2σ π3 Yk k∈S(n,δ,K) 2.6 Expectation of characteristic function. Proof of Lemma 2.4.2. Recall that d(s) = max d(bjs, 0) where the distance is 2≤j≤d+1 computed on the torus R/(2πZ). Formula (2.35) shows that there are positive con- stants C, c such that 1 ≤ |φ n(s)| C e−cnd(s)2 < C. (2.39) 34 ( 2) √ To prove the lemma we decompose E e−cnd(s) into the pieces where d(s) n is of order 2l for some l ≤ (log2 n)/2. 
and use the fact that ∂ has a bounded density.( ) (lo∑g2 n)/2 ( √ ) E (φn(s)) ≤ 1CP d(s) < √ + C P d(s) n ∈ [2l, 2l+1] e−c4l n l=0 (lo∑g2 n)/2 ≤ C 4 l l C + C e−c4 ≤ nd/2 nd/2 nd/2 l=0 completing the proof. T√1Proof of Lemma 2.5.3. Let k be such that ∈ Ik. Then ∫ σ n√T1/σ n √ n I = e−iszσ nφ (s)k ds. [π(2k−1)/|b sd+1| ] Because T = K nd/2 ∈ π(2k − 1) Tand s , √1 we have s ≈ n(d−1)/21 1 . Thus|bd+1(| ∫ σ n√ )T1/σ n E(|I Ck|) ≤ E |φn(s)| ds . n(d−1)/2 π(2k−1)/|bd+1| We claim that for all fixed bd, ∫∫ e−cnd(s) 2 C ds db2 . . . dbd−1 ≤ . (2.40) nd/2 If this is true then using that ρ is a smooth compactly supported density of bd we have that, (∫ √ ) ∫∫ ∫ √T1/σ n T1/σ n E |φn(s)| ds = |φn∫∫ (s)| ds dbd dbd−1 . . . db2π(2k−1)/|bd+1| π∫(2k−1)/|bd+1|√T1/σ n ≤ C e−cnd(s)2∫ ∫∫ ρ(x) ds dx dbd−1 . . . db2π(2k−1)/|x| ≤ C e−cnd(s)2∫ ds(dbd−1 .). . db2 ρ(x) dx ≤ C ρ(x) dx = O 1 . nd/2 nd/2 35 Thus C E(|Ik|) ≤ . (2.41) n(2d−1)/2 − √T ∈ I |B| ≤ CSimilarly, if k, then(2.41) holds. Hence, E( ) − as required.σ n n(2d 1)/2 √ To prove (2.40) we decompose it into pieces where d(s) n is of order 2l. Taking µ to be the product measure ds dbd−1 . . . db2 from (2.39) we have ∫∫ 2 √ e−cnd(s) ds dbd−1 . . . db2 ≤ Cµ{(s, b2, . . . , bd−1)|d(s) < 1/ n} (lo∑g2 n)/2 √ + C µ{(s, b2, . . . , bd−1)|d(s) n ∈ [2l, 2l+1]}e−c4 l l=0 (lo∑g2 n)/2C 4l≤ + C e−c4l ≤ C nd/2 nd/2 nd/2 l=0 as required. Proof of Lemma 2.5.4. Because r = 1− ηTD η +O(‖η ‖3) and |k|αn1/4k k a,p k k ‖ηk‖ ∈ [2l, 2l+1] we can write 4l r = 1− c √ +O(n−3/4k ).|k|2α n Accordingly c22l √ − n rn ≤ Ce |k|2αk . Also l P(|k|αn1/4‖η‖ ∈ [2l C2, 2l+1]) ≤ √ . |k|n(d−1)/4 Hence, c22l √ 2l√ − n − c2 n C |k|2α l l |k|2α E(Ik,l) ≤ √ e √ 2 C2 e= . n|k| |k|n(d−1)/4 |k|3/2n(d+1)/4 36 Thus ∑ c22K√K − nC2 e |k|2α E(Ik,l) ≤ .|k|3/2n(d+1)/4 l>K Therefore we need to estimate ∑ c22K√− nC2Ke |k|2α = |k|3/2n(d+1)/4 0<|k| 0. On the set Qk = {φ(ν) > ε} we can write 1 ( ) g(a ,ν) da = d exp ia n(d+1)/21 1 φ(ν) . iφ(ν)n(d+1)/2 1 Integrating by parts on Qk (note that h has compact support) and using trivial bounds on Qck, w∣∣e∫can co(nclude that ) ∣ | ∣Jn,k| ≤ ∣∣ exp ia n(d+1)/21 φ(ν) ∣∣h′(a1,ν) da1∣+ CP({φ(ν) ≤ ε})iφ∫(ν)n(d+1)/2 ∣ ≤ 1 |h′(a1,ν)| da1 + CP({φ(ν) ≤ ε}) εn(d+1)/2 √ for small enough ε. But h′(a1,ν) = O(nd/2), hence the first term is O(1/ n). Therefore, first taking n → ∞ and then taking ε → 0 we have the required result. Proposition 2.7.1 implies that as n → ∞ the distribution of nd/2∆̃n(δ,K) converges to the distribution of ∑ −z2/2 |ad+1 −√a1| sin 2πθ(m)e e−4π2xTDa,px1 y(m) {δ<|y(m)| 0 such that t 7→ Lt is s times continuously differentiable for |t| ≤ δ. (A2) 1 is an isolated and simple eigenvalue of L0, all other eigenvalues of L0 have absolute value less than 1 and its essential spectrum is contained strictly inside the disk of radius 1 (spectral gap). (A3) For all t =6 0, sp(Lt) ⊂ {|z| < 1}. ∥ ∥ 1 (A4) There are positive real numbers K, r1, r2 and N0 such that ∥LN∥t ≤ forN r2 all t satisfying K ≤ |t| ≤ N r1 and N > N0. 44 Remark 3.1.1. 1. In practice we would check (A3) by showing that when t =6 0, the spectral radius of Lt is at most 1 and no eigenvalue of Lt is on the unit circle. Because the spectrum of a linear operator is a closed set this would imply that sp(Lt) is contained in a closed disk strictly inside the unit disk. (r −)/r 2. Suppose (A4) holds. Let N1 > N 1 1 0 be such that N1 > N0. Then, for all N > N1, dN(r1−)/r‖LN‖ ≤ ‖ L 1e /r N 1‖ ≤ ‖ LdN (r1−)/r1e /r1 1 t ( t ) ( t )‖N1 ≤ 1 for K ≤ |t| ≤ N r1− /r dN (r1−)/r1er2N 11 ≤ 1 N r2KN1 r1 −  where K = N /r1N1 . 
Therefore fixing N1 large enough we can maker1 r2KN1 as large as we want. Hence, given (A4), by slightly reducing r1, we may assume r2 is sufficiently large. 3. Suppose (A1), (A2) and (A3) are satisfied with s ≥ 3. Then, [24, Theorem 2.4] implies that there exists A ∈ R and σ2 ≥ 0 such that SN√−NA →−d N (0, σ2). (3.2) N Our interest is in SN that satisfies the CLT i.e. the case σ 2 > 0. Since in ap- plications we specify conditions which guarantee this, in the following theorems we always assume that σ2 > 0. This is essentially an extension of Nagaev-Guivarc'h method. Some of the spectral assumptions in the theorem can be found in the proofs of decay of corre- 45 lations and the CLT using transfer operators. For example, see [24, 29, 37]. The key novelty here is the condition (A4) which guarantees a sufficient control over the characteristic function for intermediate values of t. This is analogous to the condi- tion (1.3) in Theorem 1.1. In addition, parallels can be drawn between the moment condition in Theorem 1.1 with the condition s = r + 2. The proof of the result is based on classical perturbation theory in [33], applicable due to (A1), (A2) and (A3), which provides the actual expansion and control of the error near 0, the Berry- Esseen inequality (see (3.4) below) which reduces that error to a Fourier inversion integral over an interval of size O(nr/2) and the condition (A4). Now we are in a position to state our first result on the existence of the classical Edgeworth expansion for random variables satisfying (A1) through (A4) which we refer to as weakly dependent random variables. Theorem 3.1.1. Let r ∈ N with r ≥ 2. Suppose (A1) through (A4) hold with r − 1 s = r + 2 and r1 > . Then SN admits Edgeworth expansion of order r. 2 Next, we examine the error of the order 1 Edgeworth expansion in more detail. We first show that the order 1 expansion exists if (A1) through (A3) hold with s = 3. Then, we show that the error of approximation can be improved if (A4) holds. Theorem 3.1.2. Suppose (A1) through (A3) hold with s ≥ 3. Then, the order 1 Edgeworth expansion exists. Theorem 3.1.3. Suppose (A1) through (A4) hold with s ≥ 4. Then, ( ) ( ) SN√−NA ≤ P1(z) 1P z = N(z) + n(z) +O N N1/2 N q{ 1 } where q = min 1, + r1 . 2 46 1 As one would expect, more precise asymptotics than the usual o(N− 2 ) are available when the characteristic function has better decay. The proof shows that the error depends mostly on the expansion of the characteristic function at 0. This is an indication that the error in Theorem 3.1.2 cannot be improved more than by √1a factor of even when r1 is large. N In [9], analogous results are obtained for subshifts of finite type in the sta- tionary case and an explicit description of the first order Edgeworth expansion is given. Here, we consider a wider class of (not necessarily stationary) sequences and give explicit descriptions of higher order Edgeworth polynomials by relating the coefficients to asymptotic moments. Also, we improve the condition ( )n | itS | ≤ − c α(r − 1)Hr : E(e N ) K 1 , < 1, |t| > K|t|α 2 found in [9] by replacing it with (A4). In addition, this allows us to obtain better asymptotics for the first order expansion. We also extend the results in [4] on the existence of weak Edgeworth expan- sions for i.i.d. random variables. In section 3.5.1, we compare their results with the ours. Before we mention our results, we define the space Fmk of functions. Put Cm(f) = max ‖f (j)‖L1 and Ck(f) = max ‖xjf‖≤ L1 .0 j≤m 0≤j≤k Define Cmk (f) = C m(f) + Ck(f). 
We say f ∈ Fmk if f is m times continuously differentiable and Cmk (f) <∞. 47 Theorem 3.1.4. Suppose (A1) through (A4) hold with s = r+2. Choose q ∈ N such r + 1 that q > . Then, for f ∈ F q+2, SN admits weak local Edgeworth expansion of 2r r+11 order r. Theorem 3.1.5. Suppose (A1) through (A4) hold with s = r+2. Choose q ∈ N such r + 1 that q > . Then, for f ∈ F q+20 , SN admits weak global Edgeworth expansion of2r1 order r. In Theorem 3.1.4 and Theorem 3.1.5, f is required to have at least three derivatives in order to guarantee the integrability of Fourier transforms of f and its derivatives. In addition to (A1) through (A4), if we have, C (A5) There exists C, α > 0 and N1 such that ‖LNt ‖ ≤ for |t| > N r1 for N > Nα 1.t then we can improve this assumption to f having only one continuous deriva- tive. r + 1 Theorem 3.1.4*. Suppose (A1) through (A5) hold with s = r + 2 and α > 2r1 for sufficiently large N . Then, for f ∈ F 1r+1, SN admits weak local Edgeworth expansion of order r. r + 1 Theorem 3.1.5*. Suppose (A1) through (A5) hold with s = r+2 and α > for 2r1 sufficiently large N . Then, for f ∈ F 10 , SN admits weak global Edgeworth expansion of order r. The proofs of these theorems are minor modifications of the proofs of the previ- ous two theorems. This is described in remark 3.2.2 appearing after the proofs. The next theorem gives sufficient conditions for the existence of the averaged Edgeworth expansion. Theorem 3.1.6. Suppose (A1) through (A4) hold with s = r + 2. Choose q ∈ N 48 r such that q > . Then, SN admits averaged Edgeworth expansion of order r for 2r1 f ∈ F q0 . We note that for integer valued random variable assumptions (A3) and (A4) cannot hold since the characteristic function of SN is 2π-periodic. Therefore we replace (A3) by, (̃A3) When t 6∈ 2πZ, sp(Lt) ⊂ {|z| < 1} and when t ∈ 2πZ, sp(Lt) ⊂ {|z| < 1}∪{1}. Also, because of periodicity of the characteristic function, an assumption similar to (A4) is not required. The following theorem provides conditions for the existence of asymptotic expansions for the LCLT for weakly dependent integer valued random variables. A similar result for Xn’s that are Zd-valued, is obtained in [42]. Compare with Proposition 4.2 and 4.4 therein. Theorem 3.1.7. Suppose Xn are integer valued, (A1), (A2) and (̃A3) are satisfied with s = r + 2. Then SN admits order r lattice Edgeworth expansion. The layout of the rest of the chapter is as follows. In section 3.2 we prove the results mentioned earlier by constructing the Edgeworth polynomials using char- acteristic functions and concluding that they satisfy the required asymptotics. In section 3.3 we relate the coefficients of these polynomials to moments of SN and provide an algorithm to compute coefficients. A few applications of the Edgeworth expansions such as the Local Central Limit Theorem and Moderate Deviations, are discussed in section 3.4. In the last section we give examples of sequences of random variables for which our theory can be applied. First, we revisit the i.i.d. case and 49 recover previous results. Then, we focus on non-trivial examples like observations arising from piece-wise expanding maps of an interval, Markov chains with finitely many states and markov processes which are strongly ergodic. 3.2 Proofs of the main results. Here we prove the results mentioned earlier. From now on we work in the setting described in section 3.1. Proof of Theorem 3.1.1. We seek polynomials Pp(x) with real coefficients such that( ) Sn√− nA ∑r ≤ − Pp(x) ( ) P x N(x) = n(x) + o n−r/2 . 
(3.3) n np/2 p=1 Once we have found suitable candidates for Pp(x) we can apply the Berry-Esseen inequality, ∫ ∣∣ ∣∣T | − E | ≤ 1 ∣∣∣ F̂n(t)− Êr,n(t) ∣∣∣ C0Fn(x) r,N(x) dt+ , (3.4)π −T t T where ( ) − rSn√ nA ∑ Pp(x) Fn(x) = P ≤ x , Er,n(x) = N(x) + n(x), n np/2 p=1 and C0 is independent of T . We refer the reader to [20, Chapter XVI.3] for a proof of (3.4). What follows is a formal derivation of Pp(x). Later, we will use (3.4) along with other estimates to prove (3.3). It follows from (A1), (A2) and classical perturbation theory (see [33, IV.3.6 and VII.1.8]) that there exist δ > 0 such that for |t| ≤ δ, Lt has a top eigenvalue µ(t) which is simple and the remainder of the spectrum is contained in a strictly 50 smaller disk. One can express Lt as Lt = µ(t)Πt + Λt (3.5) where Πt is the eigenprojection to the top eigenspace of Lt and Λt = (I − Πt)Lt. Because ΛtΠt = ΠtΛt = 0, iterating (3.5), we obtain Ln n nt = µ (t)Πt + Λt . Using (A3) and compactness, there exist C (which does not depend on n and t) and 0 < r < 1 such that ‖Λnt ‖ ≤ Crn for all |t| ≤ δ. By (3.1), √ ( )n ( ) ( ) E(eitSn/ n t ) = µ √ ` Π √ n √t/ nv + ` Λt/ nv . (3.6)n Now, we focus on the first term of (3.6). Put Z(t) = `(Πtv). (3.7) Then, substituting t = 0 in (3.6) yields 1 = Z(0) + `(Λn0v). Also, we know that lim ‖Λn0v‖ = 0. This gives lim `(Λn0v) = 0. Therefore, Z(0) = 1 and Z(t) 6= 0 n→∞ n→∞ when |t| < δ. Also, this shows that `(Λn0v) = 0 for all n. Next, note that t →7 µ(t) and t 7→ Πt are r+ 2 times continuously differentiable on |t| < δ (see [33, IV.3.6 and VII.1.8]). Therefore, Z(t) is r + 2 times continuously differentiable on |t| < δ. Now we are in a position to compute Pp(x). To this end we make use of ideas in [20, Chapter XVI] (where the Edgeworth expansions for i.i.d. random variables are constructed) and [24] (where the CLT is proved using Nagaev-Guivarc’h method). 51 Consider the function ψ such that, ( ) 2 2 ( ) ( ) ( ( )) inAt σ2t2 log µ √t i√At= − σ t t+ ψ √ ⇐⇒ µn √t √ − √t= e n 2 exp nψ . n n( 2n n nS ) ([S ] n 2) n n − nA where A = lim E is the asymptotic mean and σ2 = lim E √ is n→∞ n n→∞ n the asymptotic variance. (For details see section 3.3.) By (3.6) we have, ( ( )) ( ) ( ) Sn E it √ −nA σ2t2− t t − i√nAt(e n ) = e 2 exp nψ √ Z √ + e n ` Λn√t v (3.8)n n n Notice that ψ(0) = ψ′(0) = 0 and ψ(t) is r+2 times continuously differentiable. Now, denote by t2ψr(t) the order (r + 2) Taylor approximation of ψ. Then, ψr is the unique polynomial such that ψ(t) = t2ψr(t) + o(|t|r+2). Also, ψr(0) = 0 and ψr is a polynomial of degree r. In fact, we can write ψ(t) = t2ψ r+2r(t) + t ψ̃r(t) where ψ̃r is continuous and ψ̃r(0) = 0. Thus, ( ( ))t ( ( t ) 1 ( t )) exp nψ √ = exp t2ψ √ + tr+2r ψ̃r √ . n n nr/2 n Denote by Zr(t) the order−r Taylor expansion of Z(t) − 1. Then, Zr(0) = 0 and Z(t) = 1 + Zr(t) + t rZ̃r(t) with twice continuously differentiable Z̃r(t) such that Z̃r(0) = 0. Then, to make the order n −j/2 terms explicit, we compute: 2 ( ) ( )σ t2 t t e µn2 √ Z √ n ( n2 2 t ) ( )σ t t = e n2 µ √( ( e)xp logZ √ n n t 1 ( ) = exp t2ψr √ t + tr+2ψ̃ √ n nr/2 r∑nr − (−1) k+1 [ ( )]k ( )) Zr √ t − 1 tr tZ r/2 r √ k n n n k=1 52 ∑r r1 [ (2 √t ) ∑ k+1[ ( )]k]m= 1 + t ψr − (−1) Zr √t m! n k n m=1 k=1 1 ( ) (r+2 √t 1 r √t ) (r+1 − r+1 )+ t ψ̃ − t Z + t O n 2 ∑ nr/2 r r/2 rn n nr Ak(t) tr ( t ) ( r+1 ) = + ϕ √ + tr+1O n− 2 (3.9) nk/2 nr/2 n k=0 where A0 ≡ 1, ϕ(t) = t2ψ̃r(t) − Zr(t) is continuous and ϕ(0) = 0. Here Zr is the remainder of logZ(t) when approximated by powers of Zr. Next write, ∑r Ak(t) Qn(t) = . 
(3.10) nk/2 k=1 Notice that Ak and k have the same parity. (3.11) This can be seen directly from the construction, because we collect terms with the same power of n−1/2 t , ψr and Zr are a polynomial in √ with no constant term and n we take powers of t2ψr(t) and Zr(t), the resulting Ak will contain terms of the form c t2s+ks . We claim tha(t,∫ ∣∣∣ ) ( ) 2 2 2n t t − t σ t σ2∣µ √ Z √ ∣e− − e−2 2 Qn(t)n n ∣∣√ |t|<δ n ∫ ∣ t [ ( ) (∣ dt (3.12) 2 2 ∣∣∣ )] exp nψ √tt σ + logZ √ t − 1− ∣Qn(t)n n ∣ = e− 2 ∣√ ∣ dt |(t|<δ n ) t = o n−r/2 . We note[ tha(t fr)om the ch(oice)]of Qn, exp nψ √t + logZ √t − 1−Qn(t) ( ( ) )n n 1 ( ) = tr−1ϕ √t r+1 + trO n− 2 t nr/2 n 53 where ϕ(t) = o(1) as t→ 0. As a result, for all ε > 0 the integrand of (3.12) can be ε 2 2 made smaller than (tr−1 t σ + tr)e− 2 by choosing δ small enough. This proves nr/2 the claim. √ Even though the following derivation is only valid for |t| < δ n, once the polynomial function Qn(t) is obtained as above, we can consider it to be defined for all t ∈ R. Suppose |t| ≤ δ. From classical perturbation theory (see [33, Chapter IV] and [29, Section 7]) we have ∫ n 1Λt = z n(z − Lt)−1 dz (3.13) 2πi Γ where Γ is the positively oriented circle centered at z = 0 with radius ε0. Here ε0 is uniform in t and 0 < ε0 < 1. ∫Now, Λn − n 1 nt Λ0 = ∫ z [(z − L −1 t) − (z − Lt)−1] dz 2πi Γ 1 = zn[(z − L −10) (Lt − L0)(z − L )−1t ] dz. 2πi Γ Λn − Λn Because Lt − L0 = O(|t|) we have that t 0 = O(εn0 ). ` ∈ B′ and `(Λn0v) = 0|t| implies th∫at ∣∣∣ − i√nAt ∣ ∣ − i√nAt∣e n `(Λn √ v) ∣∣∣ ∫ ∣∣∣e n `(Λn √ v − Λnv)t/ n t/ n 0√ dt =t ∫ √ ∣ ∣ t ∣ ∣∣∣ dt |t|<δ n |t|<δ n ∣∣∣Λn − Λn ∣≤ C t 0 ∣∣ dt = O(εn). |t|<δ t 0 This decays exponentially fast to 0 as n→∞. This allows us to control the second term in the R∫HS of ∣(3.6). Combining this with (3.12) w∣e can conclude that,∣∣∣ Sn√−nA t2σ2 t2σ2√ |t|<δ n ∣E it (e n )− e− −2 − e 2 Qn(t) ∣∣∣∣ dt = o(n−r/2). (3.14)t 54 Observe that, σ2 2 ̂k − t 1 d − t2 k k d̂ ∫ (it) e 2 = √ e 2σ2 = n(t)2πσ2 dtk dtk where f̂(x) = e−itxf(t) dt is the Fourier transform of f . Therefore, ( )[ 2 ]t Rj(t)n(t) = √ 1 − dAj i e− 2σ2 . (3.15) 2πσ2 dt Then, the required Pp(x) for p ≥ 1, can be found using the relation, d [ ] n(x)Rp(x) = n(x)Pp(x) . (3.16) dx For more details, we refer the reader to [20, Chapter XVI.3,4]. C0 Given ε > 0, choose B > where C0 is as in (3.4). Let r ∈ N. Then we ε choose polynomials Pp(x) as descr∣ibed above. Then, from (3.4) it f∣ollows that,∫ Bnr/2 ∣∣∣ Sn−nA 2 2| − E | ≤ 1 ∣E it √ − − t σ(e n ) e 2 (1 +Qn(t))∣∣ C0 Fn(x) r,n(x) ∣ dt+ π −Bnr/2 t ∣ Bnr/2 ≤ εI1 + I2 + I3 + nr/2 where ∫ ∣∣∣∣ itSn√−nA t2σ2 ∣1 ∣E(e n )− e− 2 (1 +Qn(t)) ∣∣I1 = π √ ∣ dt|t|<δ n∫ t∣∣∣∣ √ ∣ ∣ 1 E(eitSn/ n) ∣ I ∣2 = dt π ∫ √ ∣δ n<|t|δ n From (3.12) we have that I1 is o(n −r/2). Because our choice of ε > 0 is arbitrary the proof is complete, if I2 and I3 are also o(n −r/2). These follow from (3.18), (3.19) and (3.17) below. 55 It is easy to see that, ∫ ∣ ∣ 2 2 − t σ ∣∣∣1 +Qn(t) ∣2 ∣√ e ∣ dt = O(e−cn) (3.17) |t|>δ n t for some c > 0. Thus, we only need to control, ∫ ∣∣∣∣ √ ∣ E(eitSn/ n)∣ I2 = ∣∫ √ ∣ dtδ n<|t| max{δ,K} with K as in (A4). By (A3) the spectral radius of Lt has modulus strictly less than 1. Because t 7→ Lt is continuous, for all p < q, there exists γ < 1 and C > 0, such that ‖Lmt ‖ ≤ Cγm for all p ≤ |t| ≤ q for sufficiently large m. Then using (3.1) for sufficiently large n we have, ∫ ∣∣∣ √∣ ∣E(eitSn/ n) ∣∣∣ ∫≤ √1 Cγn√ √ dt √ √ ‖Ln √t/ n‖ dt ≤ √ . 
(3.18) δ n<|t|<δ n t δ n δ n<|t|<δ n n √ This shows that the integral converges to 0 faster than any inverse power of n. Next for sufficiently large n, ∫ √∣∣∣∣ ∣itS / n ∣∣∣ ∫E(e n ) ≤ 1√ dt √ √ |`(Ln √ v)| dt (3.19) δ n<|t| (we can assume r2 > for large n due to Remark 3.1.1) and 2 2 √|t| r−1 r−1K ≤ δ < < Bn r2 ≤ n 1 for n ∈ N with nr1− 2 ≥ B. n 56 The proof of Theorem 3.1.2 follows the same idea. We include its proof for completion. Proof of Theorem 3.1.2. Because (A1) through (A3) hold with s ≥ 3, we have (3.9) C0 where ϕ is continuous, ϕ(0)∫= 0 a∣nd r = 1. Given ε > 0, choose B > . Then,√ ∣∣∣ ε Sn√−nA 2 2 ∣ 1 B n ∣E it(e n )− e− t σ ∣ 2 | (1 +Qn(t))∣Fn(x)− E1,n(x)| ≤ ∣ π √−B n t ∣ C√0dt+ B n ≤ εI1 + I2 + I3 + √ . B n Because, ϕ(t)[= o((1) )as t→ 0(and)] exp nψ √t + logZ √t − 1−Q1(t) ( ) ( )n n = √1 t 1ϕ √ + tO t n n n we have that, ∫ ∣∣∣∣ itSn√−nA t 2σ2 t2σ2 ∣E(e n )− e− − e−2 2 Q1(t) ∣ I ∣1 = √ ∣ dt = o(n−1/2). |t|<δ n t Also, I = O(e−cn∫ 3 ). Finally, because of (A3) there is γ < 1 such that,∣∣ √∣∣ ∣ ∣ ∣E(eitSn/ n ∫) ∣∣∣ ∣E(eitSn) ∣√ √ dt = ∣∣ ∣∣ dt ≤ C sup ‖Lnt ‖ ≤ Cγn δ n<|t| max{δ,K}. ‖Lnt ‖ ≤ 1 where K ≤ δ < |t| < nr1 . ∫ nr2 ∣∣∣∣ √ ∣∣∣∣ ∫E(eitSn/ n) ∣∣∣∣ ∣E(eitSn) ∣∣ 1√ dt = ∣ dt ≤ Cnr1−r2+ 2 δ n<|t| 0. (A particular δ is chosen later). Notice that for all δ ≤ |t| ≤ K (where K as in (A4)), there exists c0 ∈ (0, 1) such that ‖Lnt ‖ ≤ cn0 . Thus, ∣∣∣∣ ∫ ∣∣ ∫ ∣∣f̂(t)E(eit(Sn−nA)) dt∣∣ ≤ ∣f̂(t)`(Lnt v)∣∣∣ dt ≤ C‖f‖1cn0 . δ<|t| r1 + (r + 1)/2. Therefore, ∣∣∣∣ ∫ ∣∣ ∫it(S −nA) ∣∣ ≤ ‖ ‖ ‖ ‖‖ ‖ ‖Ln‖ ≤ C‖f‖1f̂(t)E(e n ) dt f 1 ` v t dt r −r K<|t| implies, ∣ 2r∣∣∣ ∫ ∣∣ ∫ ∫ ∣ 1 ∣ f̂ (q)it(S −nA) (t) ∣∣f̂(t)E(e n ) dt∣∣ ≤ |f̂(t)| dt ≤ ∣ ∣ dt (3.21)q |t|>nr1 |t|>nr1 |t|>nr1 t 59 ‖f̂ (q)≤ ‖1 = ‖f̂ (q)‖ o(n−(r+1)/21 ). nr1q Therefore, ∣∣∣∣ ∫ ∣∣f̂(t)E(eit(Sn−nA)) dt∣∣ = o(n−(r+1)/2). (3.22) |t|>δ √ From (3.8), for |t| ≤ δ n, we have, itSn√−nA σ 2t2 E 2(e n ) = e− et O(δ)2 (1 +O(δ)) +O(n0 ). √ Thus, choosing small δ, for large n when |t| < δ n there exist c, C > 0 such that ∣∣ ( Sn√−nA )∣E ite n ∣ ≤ 2Ce−ct . Then, √ √ ∣ ∣Sn−nA D log n < |t| ∣ it √ ∣< δ n =⇒ ∣E(e n )∣ ≤ Ce−cD logn C= ncD an∣d∣∣∣ ∫ ∣ ∣ ∫ ( ) ∣√ f̂(t)E(eit(Sn− ∣∣∣ ∣∣∣ t itSn√−nA ∣nA) dt) dt = f̂ √ E(e n )√ ∣D logn √ √ ∣<|t|<δ D∫logn<|t|<δ n n nn ≤ C | | 2δC‖f‖1 ncD √ f̂(t) dt = . D logn cD<|t|<δ n n Combining this wi∣t∫h (3.22) and choosing D such that, cD > (r+ 1)/2 we have that,∣∣∣ ∣∣√ f̂(t)E(eit(Sn−nA)) dt∣∣ = o(n−(r+1)/2). (3.23) |t|> D√lognn | | D log nNext, suppose t < . Then, ∑nr f̂ (j)(0) tr+1 f̂(t) = tj + f̂ (r+1)((t)) j! (r + 1)! j=0 where 0 ≤ |(t)| ≤ |t|. N∣ ∫ote that,∣ ∣∣ ∫ |f̂ (r+1)((t))| = ∣∣ xr+1e−i(t)xf(x) dx∣∣ ≤ |xr+1f(x)| dx ≤ Cr+1(f). 60 Therefore, ∫ ( t ) itSn√−nA √ f̂ √ E(e n ) dt |t|< D logn ∑nr f̂ (j) ∫(0) itSn√−nA = tjE(e n√ ) dtj!nj/2 j=0 |t|< D logn ∫ 1 1 ( ( ))itSn√−nA t + E(e n )tr+1f̂ (r+1)  √ dt n(r+1)/2 (r + 1)! √|t|< D logn n where ∣∣∣∣ ∫ ∣ ∫itSn√− ( ( ))nA ∣E(e n )tr+1f̂ (r+1) t 2√  √ dt∣∣ ≤ C (f) |t|r+1e−ctr+1 dt |t|< D logn n for large n. Hence, ∫ ( t ) itSn√−nA n √ f̂ √ E(e ∫ ) dt|t|< D logn ∑ nr f̂ (j)(0) itSn√−nA = j n −(r+1)/2 j!nj/2 √ t E(e ) dt+ Cr+1(f)O(n ). (3.24) j=0 |t|< D logn Because s = r + 2, from (3.9), σ2t2 ( ( )) ( ) itSn√−nA t t i 2 2E √ √ − √ nAt+σ t ( ) e 2 (e n ) = exp nψ Z + e n 2 ` Λn √ ∑ n n t/ n v r A (t) tr ( t ) ( log(r+1)/2 )k = + ϕ √ (n)+O . 
(3.25) nk/2 nr/2 n n(r+1)/2 k=0 Substituting this in (3.24), ∫ ( ) √t E it Sn√−nA √ f̂ (e n ) dt (3.26) |t|∑< D logn ∫ nr f̂ (j) r(0) ∑ (2 2 Ak(t) log(r+1)/2(n)) = √ t je−σ t /2 dt+O j!nj/2 ∫ nk/2 n(r+1)/2∑j=0 ∑ |t|< D logn k=0r r f̂ (j)(0) = tjA (t)e−σ 2t2/2 dt+ o(n−r/2). j!n(k+j)/2 √ k k=0 j=0 |t|< D logn 61 Recall from (3.11) that Ak and k have the same parity. Therefore, if k + j is odd then ∫ j 2 2 √ t A (t)e −σ t /2 k dt = 0. |t|< D logn So only integral powers of n−1 will remain in the expansion. Also there is C that depends only on r such that, ∫ ∫ j −σ2t2/2 ≤ 4r −σ2t2/2 C C√ t Ak(t)e dt C √ t e dt ≤ = .eσ2D log(n)/4 nσ2D/4|t|≥ D logn |t|≥ D logn Choosing D such that 2σ2D > (r + 1)/2, ∫ ∫ tj 2 2 2 2 A (t)e−σ t /2 dt = tjA (t)e−σ t /2 dt+ o(n−r/2k √ k ). R |t|≤ D logn Therefore, fixing D large, we can assume the integrals to be over the whole real line. Now, define ∫ a = tj 2 2 k,j Ak(t)e −σ t /2 dt R and substitute ∫ f̂ (j)(0) = (−it)jf(t) dt R in (3.26) to obtain, ∫ ( t ) ∑r ∑r ∫√ E itSn√−nA 1√ f̂ (e n ) dt = ak,j (−it)jf(t) dt+ o(n−r/2) |t|< D logn n j!n (k+j)/2 k=0 j=0 R ∑ ∫ ∑ (3.27)r 1 ak,j = f(t) (−it)j dt+ o(n−r/2) np j! p∑=0 R k+j=2pbr/2c ∫1 = f(t)P (t) dt+ o(n−r/2p,l ) np p=0 R 62 where ∑ ak,j Pp,l(t) = (−it)j. (3.28) j! k+j=2p The final simplification was done by absorbing the terms corresponding to higher powers of n−1 into the error term. Note that Pp,l is a polynomial of degree at most 2p and that once we know A0, . . . , A2p we can compute Pp,l. Finally combining (3.27) and (3.23) substituting in (3.20) we obtain the re- quired result as shown below. √ ∫ − 1 ( t ) itSn√−nA nE(f(Sn nA)) = n 2π √ f̂ √ E(e ) dt |t|< D logn n √ ∫ n + √ f̂(t)E(eit(Sn−nA)) dt D logn b 1 ∑ 2π |t|> n r/2c ∫ 1 √ = f(t)P −r/2p,l(t) dt+ o(n ) + n o(n −(r+1)/2) 2π np ∑p=0br/2c ∫R1 1 = f(t)P (t) dt+ o(n−r/2p,l ). 2π np p=0 R The proof of Theorem 3.1.5 uses the relation (3.25) derived in the previous proof. But we do not use the Taylor expansion of f̂ , so differentiability of f̂ is not required. So the assumption on the decay of f at infinity can be relaxed. Proof of Theorem 3.1.5. Multiplying (3.25) by f̂ and integrating we obtain, ∫ ( t ) itSn√−nA √ f̂ √ E(e n ) dt |t|< D logn n ∑r ∫1 ( t ) σ2t2 = √ f̂ √ Ak(t)e − 2 dt+ ‖f‖1o(n−r/2). nk/2 n k=0 |t|< D logn 63 As in the proof of Theorem 3.1.4 the integrals above can be replaced by inte- grals over R with∫out altering the order of the error because( t ) 2 2 √ f̂ √ σ t Ak(t)e − 2 dt ≤ ‖f‖1 o(n−r/2) |t|≥ D logn n f∫or D such that 2σ 2D > (r + 1)/2. Therefore∫,( ) rt ( )Sn ∑√−nA 1 t σ2t2it n √ f̂ √ E(e ) dt = f̂ √ Ak(t)e − 2 dt+ ‖f‖ o(n−r/2). k/2 1 |t|< D logn n n nk=0 R We pick Rp as in (3.15) and claim Pp,g = Rp. √ √ √ Note th∫at nf(t n)←→ f̂(t/ n). So ∫by the Plancherel theorem,√ ( √ ) 1 ( t ) σ2t2 nf t n Rk(t)n(t) dt = f̂ √ Ak(t)e− 2 dt. R 2π R n Thus, ∫ 1 ( t ) itSn√−nA√ n 2π n √ f̂ √ E(e |t|< D logn n (∑r ∫ ) dt 1 1 √ ( √ ) ) = √ nf t n Rp(t)n(t) dt+ ‖f‖ −r/2p/2 1o(n ) ∑n np=0 Rr ∫1 ( √ ) = f t n R (t)n(t) dt+ ‖f‖ o(n−(r+1)/2). (3.29) np/2 p 1 p=0 R Note that (3.23) holds because f ∈ F q+20 . Now, combining (3.29) with the estimate (3.23) completes the proof. Remark 3.2.2. Proofs of both the Theorem 3.1.4* and Theorem 3.1.5* are almost identical except the estimate (3.21). 
In order to obtain the same asymptotics, the assumption on the integrability of f̂ (q) can be replaced by (A5) and the fact that |f̂(t)| ∼ 1 for a∣s t→ ±∞.t ∣∣∣ ∫ ∣f̂(t)E(eit(Sn−nA)) dt∣∣∣ ∫≤ C |f̂(t)|‖Lnt ‖ dt |t|>nr1 |t|>nr1 64 ∫ ≤ ‖ ‖ 1C f 1 dt1+α |∫t|>nr1 t ≤ C‖f‖1 1 dt nr1(α−) t1+ r + 1 Since, r1α > choosing  small enough we can make the expression ‖f‖ o(n−(r+1)/21 ) 2 as required. Proof of Theorem 3.1.6. Select A as in (3.2). Define Pp by (3.15) and (3.16) and √ y f̃n(x) = f(− nx). Then the change of variables −√ → y yields, ∫ n[ (S − nA y ) ( y ) ( y )] √ P n√ ≤ x+ √ −N x+ √ − Er,n x+ √ f(y)dy = n∆n ∗ f̃n(x). n n n n ∑r where E 1r,n(x) = Pp(x)n(x). np/2 p=1 itSn√−nA Notice that E(e n )f̂̃n ∈ L1. Therefore, ∫ ′ 1 − itSn√−nA(Fn ∗ f̃n) (x) = e itxE(e )f̂̃n n(t) dt. 2π Also, [ (∑r ∫1 )] 1 σ2t2 ( ) n + Rpn ∗ f̃n(x) = e−itxe− 2 1 +Qn(t) f̂̃n(t) dt np/2 2π p=1 where Rp’s are polynomials given by (3.15) and Qn(t) is given by (3.10). From these we conclude that, ∫ ′ 1 − ( itSn√−nAitx σ2t2 ( )(∆n ∗ f̃n) (x) = e E(e n )− e− 2 1 +Qn(t) f̂̃n(t) dt. (3.30) 2π We claim that, ∫ itSn√−nA σ2t2 ( ) 1 n− E(e )− e − 2 itx 1 +Qn(t)(∆ ∗ f̃ )(x) = e f̂̃n n n(t) dt. (3.31) 2π −it 65 Indeed, if the right side of (3.31) converges absolutely, then Riemann-Lebesgue Lemma gives us that it converges 0 as |x| → ∞. Differentiating (3.31) we obtain (3.30). Thus the two sides in (3.31) can differ only by a constant. Since both are 0 at ±∞, this constant is 0 and (3.31) holds. Now, we are left with the task of showing that the right side of (3 ̂̃ 1 ( .31) con t )- verges absolutely. From the definition of f̃n it follows that, fn(t) = √ f̂ − √ . n n Combining this with (3.14), we have that, ∣∣∣ ∫ Sn√−nA σ2t2it ( )∣ ∣n−itxE(e )− e− 2 1 +Qn(t) ∣√ e ̂̃(fn(t) dt ∣ |t|<δ n ∫ ∣∣ −it ∣ ∣ Sn−nA 2 2 )∣ ∣E it √ − −σ t(e n ) e 2 1 +Q (t)≤ n ̂̃ ∣∣√∫ ∣ ( fn(t)∣ dt |t|<δ n t itSn√−nA σ2t2 )∣ ≤ ‖√f‖ n 1 ∣∣E(e )− e− 2 1 +Qn(t) ∣∣ dt n √ ∣ ∣|t|<δ n t = ‖f‖1o(n−(r+1)/2). Note that, ∣∣∣ ∫ itSn√−nA 2 2∣ E(e n )− e−σ t ( ) − 2e itx 1 +Qn(t) f̂̃ ∣∣√ (n(t) dt∣ ∣ |t|>δ n ∫ ∣ −it∣∣ itSn√−nA 2 2∣ −σ t ) ( )∣E(e n )− e 2 1 +Qn(t)≤ t ∣∣√ ( f̂ − √ )∣ dt|t|>δ∫ n t n 1 ∣∣∣ 2 2 2∣ √ ∣E(e−it(Sn−nA) n σ t)− e− 2 1 +Q (− nt) ∣≤ √ n∫ f̂(t) ∣ dt n ∣|t|>δ ∣ ∣ t ≤ √1 ∣∣∣E(e−it(Sn−nA)) ∣∣∣ 2f̂(t) dt+O(e−cn ).n |t|>δ t Put, ∫ 1 ∣∣∣∣E(e−it(Sn− ∣nA)) ∣Jn = √ f̂(t)n t ∣∣ dt.|t|>δ We claim J = o(n−(r+1)/2n ). This proves that (3.31) converges absolutely as required. 66 To conclude the asymptotics of Jn, choose δ > max{δ,K} where K as in (A4). From (A3) there exists γ < 1 such that ‖Lnt ‖ ≤ γn for all δ ≤ |t| ≤ δ for sufficiently large n. Then, usin∣g (3.1) for sufficien∣tly large n we have,∫ ∫ √1 ∣∣∣E(e−it(Sn−nA)) ∣∣∣ ≤ C‖√f‖1f̂(t) dt ‖Ln‖ dt = O(γn).n δ<|t|<δ t δ n tδ<|t|<δ 1 Next, for K ≤ δ ≤ |t| ≤ nr1 , ‖Lnt ‖ ≤ . Hence, for n sufficiently large so thatnr2 r r2 > , 2 ∫ √1 ∣∣∣∣ ∣E(e−it(Sn−nA)) ∣ ∫f̂(t)∣∣ dt ≤ √C ‖Lnt ‖|f̂(t)| dtn δ<|t| , we have that, ∫ 2r1 ∣∣ ∣ ∫ √1 ∣∣E(e−it(Sn−nA)) ∣ ‖f (q)‖1 1 C‖f (q)‖1f̂(t)∣ dt ≤ √ dt ≤n ∣ q+1 qr1+1/2|t|>nr1 t n |t|>nr1 |t| n = o(n−(r+1)/2). Combining the above estimates, J = Cq(f)o(n−(r+1)/2n ). This completes the proof that (∆ ∗ f̃ )(x) = o(n−(r+1)/2n n ). Hence,∫ [ (Sn√− nA ) ( ))]P ≤ yx+ √ −N x+ √y f(y)dy n ∫ (n ) n √ = Er,n x∫+ √ y ( f(y) dy)+ n∆n ∗ f̃ (x)∑ nnr 1 y = P x+ √ n(x)f(y) dy + Cq(f)o(n−r/2p ) np/2 n p=1 as required. In the lattice case, periodicity allows us to simplify the proof significantly although the idea behind the proof is similar to the previous proofs. 67 Proof of Theorem 3.1.7. 
Under assumptions (A1) and (A2) we have the CLT for Sn. Put A as in (3.2). We obs∫erve that,π ∫ π 2πP(Sn = k) = e−itkE(eitSn) dt = e−itk`(Lnt v) dt. −π −π After changing variabl∫es and using (3.6), (3.7) we have,√ √√ π n ( ∫−√itk t )n ( t ) π n −√itk ( ) 2π nP (Sn = k) = √ e nµ √ Z √ dt+ n n √ n n √ e ` Λt/ nv dt. −π n −π n (3.32) By (̃A3) there exists C > 0 and r ∈ (0, 1) (both independent of t) such that |` (Λnt v) | ≤ Crn for all t ∈ [−π, π]. Therefore the second term of (3.32) decays exponentially fast to 0 as n→∞. Now, we focus on the first term. Using the same strategy as in the proof of Theorem 3.1.1 we have, ( √t )n ( t ) inAt σ2√ − t2 [ ] µ Z √ = e n 2 1 +Qn(t) + o(n−r/2) (3.33) n n where Qn(t) is as in (3.10{). Define Rj as(in (3.15).√ r √ )}1 (k−nA)2 ∑ (R (k − nA)/ n) 2π nP p(Sn = k)− 2π √ e− 2σ2n 1 + ∫ ( 2π j/2 ) ( ) nj=1√π n −√itk n = e n√ ∫ µ √ t √tZ dt −π n n n ∞ ∫ − it(k√−nA) ∞ 2 2 − e n e− 2 2 −√ itk σ t /2 σ tdt− e n e− 2 Qn(t) dt+ o(n−r/2). −∞ −∞ We estimate th∫e RHS by estimating the three integrals given below,√δ n ( ) ( ) −√itk t n t − it(k√−nA) σ2− t2I n n1 = 2∫ √ e µ √ Z √ − e e [1 +Qn(t)] dt−δ n n ( )n ( ) −√itk t n t I2 = √ √ e nµ √ Z √ dt δ n<|t|<π n n n 68 ∫ − it(k√−nA) σ2t2 I n −3 = e e 2√ [1 +Qn(t)] dt. |t|>δ n Clearly, |I3| decays to 0 exponentially fast as n → ∞. Also, |µ(2π)| = 1 and |µ(t)| ∈ (0, 1) for 0 < |t| < 2π. Therefore, there exists  > 0 such that |µ(t)| <  on δ ≤ |t| ≤ π. Put M = max |Z(t)|. Then, δ≤|t|≤π √ ∫ √ |I n2| ≤M n |µ(t)| dt ≤ 2M(π − δ) nn. <|t|<π Hence, |I2| decays to 0 exponentially fast as n→∞. From (3.33), we have that [ ( ) ( ) ] −√itk t n t i√nAt σ2t2 σ2t2 e n µ √ Z √ − e n e− 2 [1 +Qn(t)] = e− −r/22 o(n ). n n This implies |I | = o(n−r/21 ). Combining these estimates we have the required result. 3.3 Computing coefficients. ∫ Since E(eitSn) dt decays sufficiently fast, the Edgeworth expansion, and |t|>δ hence its coefficients, depend only on the Taylor expansion of E(eitSn) about 0. Here we relate the coefficients of Edgeworth polynomials to the asymptotics of moments of Sn by relating them to derivatives of µ(t) and Z(t) at 0. Suppose (A1) through (A4) are satisfied with s = r + 2. Recall (3.6): E(eitSn) = µ (t)n ` (Πtv) + ` (Λnt v) . (3.34) Put Z(t) = ` (Πtv) as before. Also write Un(t) = ` (Λ n t v). We already know that µ(t), Z(t) and U(t) are r+ 2 times continuously differentiable. Using (3.13) one can 69 show further that the derivatives of Un(t) satisfy: sup ‖U (k)n ‖ ≤ Cεn0 |t|≤δ for all n and for all 1 ≤ k ≤ r + 2. Taking the first derivative of (3.34) at t = 0 we have: (S )n iE(S ′ ′ ′n) = nµ (0) + Z (0) + Un(0) =⇒ lim iE = µ′(0). n→∞ n In fact, using the Taylor expansion of log µ(t) and above limit one can conclude that the number A we used in the statement of the CLT in (3.2), is given by (S )n A = lim E . n→∞ n Therefore one can rewrite (3.6) as E(eit(Sn−nA)) = e−ntµ′(0)µ (t)n Z(t) + Un(t) (3.35) ′ (k) where U (t) = e−ntµ (0)n Un(t). Also note that its derivatives satisfy ‖Un ‖∞ = O(εn0 ) for all 1 ≤ k ≤ r + 2. From (3.35), it follows that moments of Sn−nA can be expanded in powers of n with coefficients depending on derivatives of µ and Z at 0. However, only powers of n upto order k/2 will appear. We prove this fact below. Lemma 3.3.1. Let 1 ≤ k ≤ r + 2. Then for large n, ( k ) b∑k/2cE [Sn − nA] = a j nk,jn +O(0 ). (3.36) j=0 Proof. 
We first note that taking the ∣kth derivative of (3.35) at t = 0,( ) dk ∣ [ ]′ (k) ikE [S − nA]k = ∣∣ e−ntµ (0)n µ (t)n Z(t) + U (0)dtk t=0 70 ∣ dk ∣∣∣ [ ]= e−ntµ′(0)µ (t)n Z(t) +O(n).dtk 0t=0 Observe that all the derivatives of e−ntµ ′(0)µ (t)n Z(t) will∣onl[y have positive inte]graldk ′ p∑owers of n (possibly) up to order k. Therefore, ∣ e−ntµ (0)µ (t)n Z(t) = dtk t=0 k ak,jn j. We claim that for j > k/2, ak,j = 0. This claim proves the result. j=0 We notice that the first derivative of e−tµ ′(0)µ (t) at t = 0∣ is 0. Thus we provedk the more general claim that if g(0) = 1 and g′(0) = 0 then ∣ [g(t)nZ(t)] has no dtk t=0 terms with powers of n greater than k/2. From the Leibniz rule, ∣∣∣∣ ∑k ( ) ∣dk n k dl ∣[g(t) Z(t)] = Z(k−l)(0) ∣ [g(t)n].dtk l ∣t=0 l dtl=0 t=0 dl ∣ Therefore it is enough to prove that ∣ [g(t)n] has no powers of n greater than dtl t=0 l/2. To this end we use the order l Taylor expansion of g(t) about t = 0. Since g′(0) = 0 and g is r + 2 times continuously differentiable for l ≤ r + 2 there exists φ(t) continuous such that, g(t) = 1 + a t2 + · · ·+ a tl + tl+12∑ l φ(t)n! =⇒ g(t)n = (a t22 )k2 . . . t(l+1)kl+1φ(t)kl+1 ∑ k !k··· 0 2! . . . kl+1!k0+k2+ +kl+1=n Ck0k2...k n!= l+1 t2k2+···+(l+1)kl+1φ(t)kl+1 . k0!k2! . . . kl+1! k0+k2+···+kl+1=n After combining and rearranging terms according to powers of t, we can obtain the order l Taylor expansion of g(t)n. Notice that if kl+1 ≥ 1 then 2k2 + · · · + (l + 1)kl+1 ≥ l + 1. Terms with kl+1 ≥ 1 are part of the error term of the order l Taylor expansion of g(t)n. Since our focus is on the derivative at t = 0, the 71 only terms that matter are terms with kl+1 = 0 and 2k2 + · · · + lkl = l. This l implies that k2 + · · · + kl ≤ . Because ki’s are non-negative integers, this means 2 k2 + · · · l + kl ≤ b c l . Hence, k0 ≥ n− b c. 2 2 dl This analysis shows that the largest contribution to ∣∣ [g(t)n] comes from dtl t=0 the term, C(n−b(l c),1,...,1,0),...,0 n!2 tl n− b l c ! 2 whose kth derivative at 0 is, C ( ⌊ ⌋ )(n−b (l c),1,...,1,0,...,0 l! n! l l2 ) = C b cl n− b l c ! (n−b c 2 ),1,...,1,0,...,0 l! n . . . n− + 1 = O(n ). 2 2 2 Therefore, dl ∣∣∣ l[g(t)n] = O(nb c2 ). dtl t=0 It is immediate from the proof that the coefficients ak,j are determined by the derivatives of µ(t) and Z(t) near 0. For example, the constant term ak,0 = (−i)kZ(k)(0). This follows from these three facts. The expansion (3.36) is the kth ′ derivative of the product of the three functions e−ntµ (0), µ (t)n and Z(t) at t = 0. ′ All derivatives of µ (t)n and e−ntµ (0) at t = 0 contain powers of n and thus, ak,0 corresponds to the term Z(t) being differentiated k times in the Leibneiz rule. Both e−ntµ ′(0) and µ (t)n are 1 at t = 0. We will see later that the other coefficients ak,j are combinations of µ′(0) = iA, higher order derivatives of µ at 0 upto order k and derivatives of Z at 0 upto order k − 1. As a corollary to Lemma 3.3.1, we conclude that asymptotic moments of orders upto r + 2 exist. These provide us an alternative way to describe ak,j. 72 m Corollary 3.3.2. For all 1 ≤ m ≤ r + 2 and 0 ≤ j ≤ , 2 E m([Sn − nA]m)− nj+1am,j+1 − · · · − nb c2 am,bm c a 2m,j = lim . n→∞ nj Proof. When m = 1, E([Sn − nA]) = a1,0 + O(n0 ) and it is immediate that a1,0 = lim E([Sn − nA]). 
For arbitrary k we have, n→∞ ( ) E [Sn − nA]k = a bk/2ck,bk/2cn + a nbk/2c−1k,bk/2c−1 + · · ·+ ak,0 +O(n0 ) and dividing by n we obtain, ( ) E [Sn − nA]k ( 1) b c = ak,bk/2c +O .n k/2 n Now, it is immediate that, ( ) E [Sn − nA]k ak,bk/2c = lim b c .n→∞ n k/2 k Having computed ak,j, for r ≤ j ≤ b c, we can write, 2 ( ) E [Sn − nA]k − a bk/2ck,bk/2cn − · · · − ak,rnr = a r−1k,r−1n + · · ·+ ak,0 +O(n0 ). Dividing by nr−1, we obtain, ( ) E [Sn − nA]k − nrak,r − · · · − nbk/2ca ( )k,bk/2c 1 = a +O . nr−1 k,r−1 n Now, we can compute am+1(,r−1 , k )E [Sn − nA] − nra bk/2ck,r − · · · − n ak,bk/2c ak,r−1 = lim − .n→∞ nr 1 This proves the Corollary for arbitrary k ∈ {1, . . . , r + 2}. 73 Because the coefficients of polynomials Ap(t) (see (3.10)) are combinations of derivatives of µ(t) and Z(t) at t = 0, we can write them explicitly in terms of ak,j, and hence, by applying Corollary 3.3.2, the coefficients of Edgeworth polynomials can be expressed in terms of moments of Sn. Next, we will introduce a recursive algorithm to do this and illustrate the process by computing the first and second Edgeworth polynomials. Taking the first derivative of (3.35) at t = 0, iE ′([Sn − nA]) = Z ′(0) + Un(0). Then, a ′1,0 = lim E([Sn − nA]) = −iZ (0). n→∞ Next, taking the second derivative of (3.35) at t = 0 we have, ′′ i2E([S − nA]2n ) = n[µ′′(0)− µ′(0)2] + Z ′′(0) + Un(0). Therefore, dividing by n and takin(g the limit we)have,[ ]2 2 Sn − nAa ′ 2 ′′2,1 = σ = lim E √ = µ (0) − µ (0). (3.37) n→∞ n Once we have found a2,1 we can find ( ) a2,0 = lim E([Sn − nA]2)− nσ2 = −Z ′′(0). n→∞ We can repeat this procedure iteratively. For example, after we compute the 3rd derivative of (3.35) at t = 0: i3E([S − nA]3) = Z(3)n (0) + nµ′(0)[2µ′(0)2 − 3µ′′(0)] + nµ(3)(0) 74 + 3nZ ′(0)[µ′(0)2 − ′′ (3)µ (0)] + Un (0) we get that, 1 ( ) a3,1 = lim E [Sn − nA]3 = −A(3σ2 + A2) + iµ(3)(0)− 3iσ2Z ′(0) n→∞ n = −A(3σ2 + A2) + iµ(3)(0) + 3σ2a1,0. This gives us µ(3)(0) and Z(3)(0) in terms of asymptotics of moments of Sn: iµ(3)(0) = a3,1 + A(3σ 2 + A2)− 3σ2a1,0 ( ) iZ(3)(0) = lim E([Sn − nA]3)− na→∞ 3,1 .n Given that we have all the coefficients ak,j, 1 ≤ k ≤ m computed and µ(k)(0), Z(k)(0) for 1 ≤ k ≤ m expressed in terms of the former, we can compute a and express µ(m+1)m+1,j (0), Z (m+1)(0) in terms of ak,j, 1 ≤ k ≤ m+ 1. To see this note that µ(m+1)(0) appears only as a result of µn(t) being differ- entiated m+ 1 times. So, µ(m+1)(0) only appears in derivatives of order m+ 1 and higher. It is also easy to see that it appears in the form nµ(m+1)(0) in the (m+ 1)th derivative of (3.35). Thus, it is a part of am+1,1 and all the other terms in am+1,1 are products of µ(k)(0), Z(k)(0) for 1 ≤ k ≤ m whose orders add upto m + 1 and hence they are products of ak,j, 1 ≤ k ≤ m. Also, Zm+1(0) appears only in am+1,0. This is because Z m+1(0) appears only as a result of Z(t) being differentiated m + 1 times. Thus, it appears only in derivatives of (3.35) of order m+ 1 or higher. In the (m+ 1)th derivative of (3.35), ′ there is only one term containing Z(m+1)(t) and it is e−ntµ (0)µ (t)n Zm+1(t). So am+1,0 = (−i)m+1Zm+1(0). 75 Using Corollary 3.3.2, we have, ( ) E [Sn − nA]m+1 am+1,bm+1 c = lim→∞ bm+1 . 2 n n c2 m+ 1 Having computed am+1,j, for r ≤ j ≤ b c, we compute a( ) m+1,r−1 : 2 E [Sn − nA]m+1 − nram+1,r − · · · − b m+1 n c2 am+1,bm+1 c a 2m+1,r−1 = lim . n→∞ nr−1 This gives us Z(m+1)(0) = im+1a m+1m+1,0 and µ (0) in terms of am+1,1 and ak,j, 1 ≤ k ≤ m i.e. explicitly in terms of moments of Sn. 
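The following Python sketch is an informal numerical illustration only (the centred exponential example, the sample sizes and all names are our choices, not part of the text). It checks Corollary 3.3.2 in the simplest setting of Section 3.5.1, i.i.d. summands with A = 0: the Monte Carlo estimates of E([S_n − nA]^k)/n^{⌊k/2⌋} stabilise as n grows, and their limits recover a_{2,1} = σ² and a_{3,1}, which for centred i.i.d. summands equal Var(X_1) and E(X_1³).

import numpy as np

# Minimal Monte Carlo sketch of Corollary 3.3.2 for the hypothetical i.i.d.
# example X_i = Exp(1) - 1, so A = 0, sigma^2 = 1 and E[X_1^3] = 2.
rng = np.random.default_rng(0)

def centred_moment(k, n, samples=400_000):
    # For centred Exp(1) summands, S_n - nA = Gamma(n, 1) - n exactly.
    S = rng.gamma(n, 1.0, size=samples) - n
    return np.mean(S ** k)

# a_{2,1} = lim E[(S_n - nA)^2] / n   (the asymptotic variance sigma^2)
# a_{3,1} = lim E[(S_n - nA)^3] / n   (equals E[X_1^3] for centred i.i.d. sums)
for n in (25, 100, 400):
    print(n, centred_moment(2, n) / n, centred_moment(3, n) / n)
# Both ratios should settle near 1 and 2 respectively; the k = 3 column is
# noisier since the variance of S_n^3 grows like n^3.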
Proceeding inductively we can compute all the derivatives upto order r of µ(t) and Z(t) at t = 0 in this manner by taking derivatives up to order r of (3.35) at t = 0. This is possible because our assumptions guarantee the existence of the first r + 2 derivatives of (3.35) near t = 0. Remark 3.3.1. This representation of µ(k)(0) and Z(k)(0) in terms of ak,j is not unique. However, it is convenient to choose the ak,j’s with the lowest possible indices. The inductive procedure explained above yields exactly this representation. We will illustrate how the first and the second order Edgeworth expansion can be computed explicitly once we have µ(4)(0), µ(3)(0), Z ′′(0) and Z ′(0) in terms of asymptotic moments of Sn. Because A0(t) = 1 we have R0(t) = 1. From the derivation of (3.9) we have, t3 3 A (t) = (log µ)(3)(0) − Z ′(0)t = (µ(3) t1 ( (0)− 3µ ′′(0)µ′(0) +)2µ ′(0)3) − Z ′(0)t 6 6 3 = µ(3) t (0) + iA(3σ2 + A2) − Z ′(0)t 6 (it)3 = (a 23,1 − 3σ a1,0) − a1,0(it). 6 76 After taking the inverse Fourier transform as shown in (3.15) we have, (a3,1 − 3σ2a1,0) a1,0 R1(x) = x(3σ 2 − x2) + x. 6σ6 σ2 Using (3.16) we obtain the firs(t Edgeworth p)olynomial, a3,1 − 3σ2a1,0 2 − 2 − a1,0P1(x) = (σ x ) . 6σ4 σ Similar calculations give us, (it)6 [ A2(t) = (a3,1 + 3σ 2a )21,0 + A 2(6σ2 + A4]) + 4a3,1(A− 2a1,0)72 4 2 − (it) (it)3σ2(2a 2 22,0 − 4Aa1,0 + σ ) + a4,1 + (2a 24 1,0 − a2,0) . 2 From (3.15) and (3.16) we have, x62 2 − 15σ2x4 + 45σ4x2 − 15σ6R2(t) =(a3,1[+ 3σ a1,0) 72σ12 ] + A2(6σ2 + A4) + 4a3,1(A− 2a 2 21,0)− 3σ (2a2,0 − 4Aa1,0 + σ ) + a4,1 × (x 4 − 6σ2x2 + 3σ2) 2 − (x 2 − σ2) + (2a1,0 a2,0) ,24σ8 2σ4 x(15σ2 − 10σ2x2 + x6) P2(t) =(a3,1[+ 3σ 2a 21,0) 72σ10 ] + A2(6σ2 + A4) + 4a (A− 2a )− 3σ23,1 1,0 (2a 22,0 − 4Aa1,0 + σ ) + a4,1 × x(3σ 2 − x2) 2 − x+ (2a1,0 a2,0) .24σ6 2σ2 Remark 3.3.2. Once we have Rp for p ∈ N0 and Pp for p ∈ N, the polynomials Pp,g, Pp,d and Pp,a are given by Pp,g = Pp,d = Rp and Pp,a = Pp. These relations were obtained in the proofs in section 3.2. Also, one can compute Pp,l using (3.28 ∑ ∫ ): (−ix)j σ2t2 Pp,l(x) = t jA (t)e−l 2 dt. j! l+j=2p 77 For example, ∫ √ σ2t2 2π P −0,l(x) = A0(t)e 2 dt = . σ2 ∫ ∫ ∫ σ2t2 σ2 2 x2t σ2t2 P −1,l(x) = A2(t)e 2 dt − ix tA1(t)e− 2 dt− t2A −0(t)e 2 dt 2 P√1,l(x) =(a 2 2 53,1 + 3σ a1,0) 2π [ 24σ7 ] + A2(6σ2 + A4 1 ) + 4a3,1(A− 2a )(− 3σ2 21,0 (2a2,0 − 4Aa1,0 + σ ))+ a4,1 8σ5 2 − (2a21,0 − 1 a2,0) − (a3,1 − 2 1 2a1,0 x x 3σ a 6 1,0 ) + − 2σ σ5 σ3 2 2σ3 Higher order Edgeworth polynomials can be computed similarly. We can compare our results with the centered i.i.d. case. Then, we have that 1 A = 0, a1,0 = 0 because the sequence is stationary. Also, a3,1 = lim E([S − n→∞ nn nA]3) = E((X1−A)3), a2,0 = 0 and a4,1 = E(X41 ). So, the above polynomials reduce to, E(X3) E(X3) E(X3) A1(t) = 1 (it)3, R1(x) = 1 x(3σ2 − x2), P1(x) = 1 (σ2 − x2) 6 6σ6 6σ4 (it)6 4 A2(t) = E(X3 2 41 ) + (E(X1 )− 4 (it) 3σ ) 72 ( 24P 3 2 4 ) 3√0,l(x) 1 P√1,l(x) E(X1 ) 5 E(X1 ) − 3 1 − E(X1 ) x − 1 x2= , = + 2π σ 2π σ7 24 σ5 σ 8 σ5 2 σ3 2 These agree with the polynomials found in [20, Chapter XVI] (to see this one has to replace x by x/σ to make up for not normalizing by σ here) and [4]. The polynomials 1 Qk found in the latter are related to Pk,l by Qk(x) = Pk,l(x). 2π It is also easy to see that these agree with previous work on non-i.i.d. examples. In both [9, 29] only the first order Edgeworth polynomial is given explicitly. In [9], because the sequence is stationary and centered, we can take A = 0 and a1,0 = 0. 78 Also, the pressure P (t) given there, corresponds to log µ(t) here. 
So we recover 3 A (t) = P ′′′ (it) 1 (0) in [9, Theorem 3]. In [29], sequence is centered but not assumed 6 to be stationary. So A = 0 and a1,0 6= 0 and the asymptotic bias appears in the (it)3 expansion and A (t) = iµ(3)1 (0) − a1,0(it) which agrees with [29, Theorem 8.1]. 6 This dependence on initial distribution corresponds to presence of ` in (3.1). 3.4 Applications. 3.4.1 Local Limit Theorem. Existence of the Edgeworth expansion allows us to derive Local Limit Theo- rems (LLTs). For example see [16, Theorem 4]. Also, as direct consequences of weak global Edgeworth expansions, an LCLT comparable to the one given in [27, Chapter II], holds. In fact, a stronger version of LCLT holds true in special cases. To make the nota(tion)simpler, we assume that the asymptotic mean of SN isSN 0. That is A = lim E = 0. N→∞ N Proposition 3.4.1. Suppose that SN satisfies the weak global Edgeworth expansion of order 0 for an integrable function f ∈ (F , ‖·‖) where ‖·‖ is translation invariant. Further, assume that |xf(x)| is integrable. Then, √ ∫u2 NE(f(S − u)) = √ 1 e− 2Nσ2N f(x) dx+ o(1) (3.38) 2πσ2 uniformly for u ∈ R. √ Proof. After the change of variables z N → z in the RHS of the weak global 79 Edgeworth expansion, √ NE(f(∫SN −( u)) √z ) = ∫ [n ( )f(z − u)dz +(‖f‖o(N )1]) = √u ′ √zu( n )∫ + (z − u)n ∫ f(z − u()dz +)‖f‖o(1)N N = n √u − C zuf(z u) dz + (z − u)n √ f(z − u)dz + ‖f‖o(1) N N N Here zu is between u and z and depends continuously on u. Notice that, ∣∣∣ ∫ ( z ) ∣∣ ∫u(z − u)n √ f(z − u)dz∣ ≤ |(z − u)f(z − u)|dz ≤ ‖xf‖1 N Therefore, after a change of variables z − u→ z in the RHS, √ ( u )∫ NE(f(SN − u)) = n √ f(z)dz + max{‖xf‖1, ‖f‖} o(1) N as required. In particular, the result holds for F = F 10 . If the order 0 weak global Edge- worth expansion holds for all f ∈ F 10 , then we have the following corollary. We note that this is indeed the case for faster decaying |E(eitSN )| as in Markov chains and piecewise expanding maps described in sections 3.5.3.1, 3.5.3.2 and 3.5.4. Corollary 3.4.2. Suppose that SN admits the weak global Edgeworth expansion of order 0 for all f ∈ F 10 . Then, for all a < b, √ N ( ) u2 P SN ∈ 1 (u+ a, u+ b) = √ e− 2Nσ2 + o(1) (b− a) 2πσ2 uniformly in u ∈ R. 80 Proof. Fix a < b. It is elementary to see that there exists a sequence fk ∈ F 10 with compact support such that fk → 1(u+a,u+b) point-wise and fk’s are uniformly bounded in F 11 . This bound can be chosen uniformly in u, call it C. Therefore, from the proof of Proposition 3.4.1, we have, √ ( )∫ NE(fk(SN − u u)) = n √ fk(z)dz + C11(fk) o(1) N Because 0 ≤ C11(fk) ≤ C, taking the limit as k →∞ we conclude, √ ( ) ( u )∫ u+b NP SN ∈ (u+ a, u+ b) = n √ 1 dz + C o(1) N u+a and the result follows. In fact, u in the previous theorem need not be fixed. For example, for a uN sequence uN with √ → u, we have the following: N Corollary 3.4.3. Suppose that SN admits the weak global Edgeworth expansion of uN order 0 for all f ∈ F 10 . Let uN be a sequence such that lim √ = u. Then, for all N→∞ N a < b, √ N ( ) 1 u2 lim P SN ∈ (uN + a, uN + b) = √ e− 2σ2 . N→∞ (b− a) 2πσ2 Now, we state the stronger version of LCLT in which we allow intervals to shrink. Definition 8. Given a sequence N in R+ with N → 0 as N →∞, we say that SN admits an LCLT for N if we have, √ N ( ) 1 u2 P SN ∈ (u− N , u+  −N) = √ e 2Nσ2 + o(1) 2N 2πσ2 uniformly in u ∈ R. 81 The next proposition gives a existence of weak global Edgeworth expansions as a sufficient condition for SN to admit a LCLT for a sequence N . 
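Before stating it, we give a rough Monte Carlo illustration of Definition 8 in Python (a sketch only; the centred exponential summands, the window ε_N = N^{−1/4} and all names below are our hypothetical choices, not part of the text). The scaled probability of the shrinking window around u should approach the Gaussian density on the right-hand side of the definition as N grows.

import numpy as np

# Monte Carlo check of Definition 8 for the hypothetical i.i.d. example
# X_i = Exp(1) - 1 (so sigma^2 = 1), with shrinking windows eps_N = N^{-1/4}.
rng = np.random.default_rng(1)

def lclt_ratio(N, u, samples=2_000_000):
    eps = N ** -0.25
    S = rng.gamma(N, 1.0, size=samples) - N    # S_N for centred Exp(1) - 1 summands
    prob = np.mean(np.abs(S - u) < eps)        # P(S_N in (u - eps_N, u + eps_N))
    lhs = np.sqrt(N) * prob / (2 * eps)
    rhs = np.exp(-u ** 2 / (2 * N)) / np.sqrt(2 * np.pi)
    return lhs, rhs

for N in (100, 400, 1600):
    print(N, lclt_ratio(N, u=0.5 * np.sqrt(N)))
# The two numbers in each pair should approach one another as N grows.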
Notice that existence of higher order expansions allow N to decay faster. In case expansions of all orders exist, N can decay at any subexponential rate. Proposition 3.4.4. Suppose that SN satisfies the weak global Edgeworth expansion of order r (≥ 1) for all f ∈ F 10 . Let N be a sequence of positive real numbers such that N → 0 and NN r/2 →∞ as N → ∞. Then, SN admits an LCLT for N . Proof. WLOG assume N < 1 for all N . As in the previous proof, there exists a sequence fk ∈ F 10 with compact support such that fk → 1(u− ,u+ ) point-wise andN N fk’s are uniformly bounded in F 1 0 . This bound can be chosen uniformly in N and u, call it C. Let N ∈ N. Note that for all k, ∑r ∫1 ( √ ) ( ) E(fk(SN)) = p Pp,g(z)n(z)f 1 −(r+1)/2k z N dz + C0(fk) o N . N 2 p=0 By taking the limit as k →∞ and using the fact 0 ≤ C10(fk) ≤ C, we conclude, ( ) ∑r ∫ u√+N1 N ( ) P SN ∈ (u− N , u+ N) = p Pp,g(z)n(z) dz + C o N−(r+1)/2 . N 2 u−N p=0 √N z After a change of variables z → √ in the p = 0 term and divide the whole N equation by 2N to get, √ N ( ) P S∫N ∈ (u− N , u+ N)2N ( ) ∑r √ ∫ u1 √+N ( )N = 1J (z − z N 1 u)n √ dz + P N p p,g(z)n(z) dz + C o2N N 2 u− r/2p=1 NN 2 √ N NNN where JN = (−N , N). 82 Note that for p ≥ 1, there exists Cp such that |Pp,g(z)n(z)| < Cp. Therefore,∣∣∣ √ √∣ ∫ u√+N ∣∣ ∫ u√+NN N Cp N N Cpp Pp,g(z)n(z) dz∣∣ ≤ p 1 dz ≤ = o(1)2 u− u− p/2NN 2 √ N 2NN 2 √ N N N N Also, as in the proof of Proposition 3.4.1, ∫ ∫ 1 ( ) (− √z 1 √u ) u+N1J (z u)n dz = n 1 dz 2 NN N 2N N u−N ∫ C u+N ( ) + (z − zuu)n √ dz 2NN u−N N Note that, ∣∣∣∣ ∫ u+ ( ) ∣∣∣∣ ∫C N − √z C u+Nu(z u)n dz ≤ | − | CNz u dz =2NN u− N 2NN u− 2NN N Therefore, ∫ 1 (− z ) ( u )1J (z u)n √ dz = n √ + o(1). 2 NN N N Combining these estimates with  N r/2N →∞ we have that, √ N ( ) ( u ) P SN ∈ (u− N , u+ N) = n √ + o(1) 2N N and it is straightforward from the proof that this is uniform. Remark 3.4.1. We note that this result implies [16, Theorem 4] because existence of classical Edgeworth expansions imply the existence of the weak global Edgeworth expansion and this result is uniform in u. 3.4.2 Moderate Deviations. While the CLT describes the typical behaviour or ordinary deviations from the mean provided by the law of large numbers, it is not sufficient to understand prop- 83 erties of distribution of Xn completely. Therefore, the study of excessive deviations is important. For example, deviations of order n are called large deviations. An exponential moment condition is required for a large deviation√principle to hold, even for the i.i.d. case. However, when deviations are of order n log n (moderate deviations) this is not the case. We show here that a moderate deviation principle holds for SN under a weaker assumption than the exponential moment assumption. It is also worth noting that moderate deviations have numerous applications in areas like statistical physics and risk analysis. For example, moderate deviations are greatly involved in the computation of Bayes risk efficiency. See [44] for details. Proposition 3.4.5. Suppose SN admits the order r Edgeworth expansion. Then √ for all c ∈ (0, r), when 1 ≤ x ≤ cσ2(lnN, ) 1− P SN√−AN ≤ x N lim = 1. (3.39) N→∞ 1−N(x) Proof. Note that, [ ( − − − S )] ( ) P N1 N(x) 1 √− AN ≤ SN − ANx = P √ ≤ x −N(x) N ∑ Nr Pp(x) ( ) = n(x) + o N−r/2 Np/2 p=1 √ uniformly in x. So it is enough to show that for 1 ≤ x ≤ cσ2 lnN , Pp(x)n(x) N −r/2 lim = 0 and = o(1) N→∞ Np/2(1−N(x)) 1−N(x) Note that for x ≥ 1, 2 − σ n(x) ( O n(x) ) 1 N(x) = + . 
x x3 84 Thus, N−r/2 N−r/2 (√ −r/2 )≤ √ = O NlnN c 1−N(x) 1−N( cσ2 lnN) ( e−) lnN2 O lnN= N (r−c)/2 Say Pp(x) is of degree q. Then for some C and K, ∣∣∣ Pp(x)n(x) ∣∣∣ ≤ (xq +K)n(x) (xq +K) ( ( 1 ))C = C x 1 +O Np/2(1−N(x)) Np/2(1−N(x)) Np/2 x2 ≤ (lnN) q+1 C → 0 as N →∞. Np/2 This completes the proof of (3.39). Proposition 3.4.5 is a generalization of the results on moderate deviations found in [43] to the non-i.i.d. case along with improvements on the moment condi- tion. It should be noted that [4] contains an improvement of the moment condition for the i.i.d. case. But the proof we present here is different from the proof presented in [4]. As an immediate corollary to the above theorem, we can state the following first order asymptotic for probability of moderate deviations. Corollary 3.4.6. Assume SN admits the order r Edgeworth expansion. Then for all c ∈ (0, r), √ P(SN ≥ 1 1 AN + cσ2N lnN) ∼ √ √ . 2πc N c lnN 3.5 Examples Here we give several examples of systems satisfying assumptions (A1)–(A4). 85 3.5.1 Independent variables. Let Xn be i.i.d. with r + 2 moments. In this case we can take B = R, and define L v = E(eitX1t v) = φ(t)v where φ is the characteristic function of X1. Here we have taken ` = 1. Put v = 1. Then, the independence of the random variables gives us, Lnt 1 = E(eitSn) = φ(t)n. Also, the moment condition implies t → φ(t) is Cr+2. This means (A1) is satisfied. (A2) is clear. Suppose X1 is l−Diophantine. That is there exists C > 0 and t0 > 0 such that C − C for all |t| > t0, |φ(t)| < 1− . Then |φ(t)| ≤ e |t|l . So |φ(t)| < 1 for all t =6 0. So|t|l we have (A3). Also, this implies that X1 is non-lattice. An easy computation shows 1 that when r1 < , there exists r2 such that t < |t| < nr10 =⇒ |φ(t)|n ≤ n−r2 . In l fact, |φ(t)|n ≤ e−cnα 1where α = 1− r1l > 0. So, (A4) is satisfied with r1 < . l r − 1 When l = 0 we see that (A4) is satisfied with r1 > . Hence, by Theo- 2 rem 3.1.1 order r Edgeworth expansion for Sn exists. This is exactly the classical result of Cramér because the condition: lim sup |φ(t)| < 1 corresponds to l = 0. |t|→∞ r + 1 (r + 1)l Choose q > > . Then, by Theorem 3.1.4 and Theorem 3.1.5 we 2r1 2 have that Sn admits weak global expansion for f ∈ F q+20 and weak local expansion for f ∈ F q+2r+1 . These are similar to the results appearing in [4] but slightly weaker (r + 1)l because we require one more derivative: q + 2 > 2 + as opposed to 1 + 2 (r + 1)l . This is because we do not use the optimal conditions for the integrability 2 of the Fourier transform. If we required f ∈ F q+1 and f (q+1)r to be α−Hölder for small α, then the proof would still hold true and we could recover the results in [4]. 86 3.5.2 Finite state Markov chains. Here we present a non-trivial example for which the weak Edgeworth expan- sions exist but the strong expansion does not exist. Consider the Markov chain xn with states S = {1, . . . , d} whose transition probability matrix P = (pjk)d×d is positive. Then, by the Perron-Forbenius theorem, 1 is a simple eigenvalue of P and all other eigenvalues are strictly contained inside the unit disk. Suppose h = (hjk)d×d ∈ M(d,R) and that there does not exist constants c, r and a d−vector H such that rhjk = c+H(k)−H(j) mod 2π for all j, k. Put Xn = hxnxn+1 . For the family of operators L : Cd → Cd, ∑td (L f) = eithjkt j pjkfk, j = 1, . . . , d (3.40) k=1 v = 1 and ` = µ0, the initial distribution, we have (3.1). Define br,j,k = hrj + hjk for all j, r = 1, . . . , d and k = 2, . . . , d. Put d(s) = max {(br,j,k − br,1,k)s} where { . 
} denotes the fractional part. We further assume that h is β−Diophantine, that is, there exists K ∈ R such that for all |s| > 1, K d(s) ≥ . (3.41) |s|β 1 If β > then almost all h are β−Diophantine. d2(d− 1)− 1 2 Because Sn can take at most O(nd −1) distinct values, Sn has a maximal jump 2 of order at least n−(d −1). Therefore, the process Xhn = hxnxn−1 does not admit the order 2(d2 − 1) Edgeworth expansion. 87 The Perron-Forbenius theorem implies that the operator L0 satisfies (A2). Because (3.40) is a finite sum, it is clear that t 7→ Lt is analytic on R. So we also have (A1). Also the spectral radius of Lt is at most 1. Assume Lt has an eigenvalue on the unit circle, say eiλ, with eigenvector f , then, ∑d eiλf ithjkj = (Ltf)j = e pjkfk k=1 Assuming max |fj| = |fr|, j ∣∣∑d |fr| = |eiλf | = ∣∣ eithjkr pjkfk∣∣∣∣ ∑d ∑d≤ pjk|fk| =⇒ pjk(|fk| − |fr|) ≥ 0 k=1 k=1 k=1 Because |fk|− |fr| ≤ 0 for all k and pjk ≥ 0 for all j and k we have |fk| = |fr| for all k. Therefore, there exist a d−vector H such that f = ReiH(k)k for all k. Then, ∑d eiλReiH(j) = eithjkpjkRe iH(k) ∑k=1d 0 = p (ei(thjk+H(k)−H(j)−λ)jk − 1) k=1 =⇒ thjk = λ+H(j)−H(k) mod 2π But this is a contradiction. Therefore, (A3) holds. Next we notice that, |(L2f) | = ∣∣∣∣∑d ∑d ∣∣∣∣ ∣eit(hrj+hjk)t r prjpjkfk = ∣∣∣∑d (∑d ) ∣∣eit(hrj+hjk)prjp f ∣jk k∣ j=1 k=1 k=1(∑j=1d ∣∣∣∣∑d ∣∣)≤ ‖f‖ eitbr,j,kp ∣rjpjk∣ (3.42) k=1 j=1 Now we estimate |br,k(t)| where ∑d ∑d br,k(t) = e itbr,j,kp p = eitbr,1,k eit(br,j,k−br,1,k)rj jk prjpjk j=1 j=1 88 Then we have, ∑d ∑d |b (t)|2 = p2 2r,k rjpjk + 2 prjpjkprlplk cos((br,j,k − br,l,k)t) (j=∑1 ) j>ld 2 ∑d = prjpjk − 2 prjpjkprlplk[1− cos((br,j,k − br,l,k)t)] (∑j=1 j>ld )2 = p 2rjpjk − 2Cd(t) +O(d(t)3), C > 0 ∑j=1d |br,k(t)| = prjpjk − C̃d(t)2 +O(d(t)3), C̃ > 0 j=1 Therefore, ∑d ∣∣∣∣∑d ∣∣ ∑d (∑d )eitbr,j,kp p ∣rj jk∣ = prjpjk − Cd(t)2 +O(d(t)3) k=1 j=1 k=1 j=1 = 1− Cd(t)2 +O(d(t)3), C > 0 From the Diophantine condition (3.41), we can conclude that there exists θ > 0 such that for all |t| > 1, ( ) ‖L2t‖ ≤ 1− θd(t)2 =⇒ ‖LNt ‖ ≤ − 2 dN/2e 1 θd(t) ≤ e−θd(t)2N/2 ≤ −θt−2βe N/2 1−  1−  When 1 < |t| < N 2β , we have, ‖LN‖ ≤ e−θN /2t which gives us (A4) with r1 = 2β r + 1 where  > 0 can be made as small as required. Because for small , d e = 2(1− ) dr + 1e r + 1, choosing q > β, we conclude that for f ∈ F q+20 weak global and for2 2 f ∈ F q+2r+1 weak local Edgeworth expansions of order r for the process Xhn exist. Also, SN admits averaged Edgeworth expansions of order r for f ∈ F 20 . In the special 1 case of β > , these hold for a full measure set of h even though the d2(d− 1)− 1 order r strong expansion does not exist for r + 1 ≥ d2. 89 3.5.3 More general Markov chains. 3.5.3.1 Chains with smooth transition density. First we consider the case where xn is a time homogeneous Markov process on a compact connected manifold M with smooth transition density p(x, y) which is bounded away from 0, and Xn = h(xn−1, xn) for a piece-wise smooth function h :M×M→ R. We assume that h(x, y) can not be written in the form h(x, y) = H(y)−H(x) + c(x, y) (3.43) where c(x, y) is piece-wise constant. In particular, there is no constant c and a function H such that h(x, y) = H(y)−H(x)+c. Also, the transition probability P (x, dy) of Xn has a non-degenrate absolute continuous component. Then, by [25], the CLT holds with σ2 > 0. To check the assumption 3.43 we need the following: Lemma 3.5.1. (3.43) holds iff there exists o ∈ M such that the function x 7→ h(o, x) + h(x, y) is piece-wise constant for each y. Proof. 
If (3.43) holds then for each o ∈M h(o, x) + h(x, y) = c(o, x) + c(x, y) +H(y)−H(o) where c(o, x) + c(x, y) is piece-wise constant in x for each y. Conversely, suppose for some o ∈ M, x 7→ h(o, x) + h(x, y) is piece-wise constant for each y. Fix y. Let c = h(o, o) and H(x) = h(o, x) − h(o, o). Then, h(o, o) + h(o, y) and h(o, x) + h(x, y) differ by a piece-wise constant function. Then 90 (3.43) holds because h(o, x)+h(x, y)−(h(o, o)+h(o, y)) = h(x, y)+H(x)−H(y)−c is piecewise constant. Let B = L∞(M) and consider∫the family of integral operators, (L u)(x) = p(x, y)eith(x,y)t u(y) dy. Let µ be the initial distribution of the Markov chain and {Fn} be the filtration adapted to the processes. Then, using the Markov property, E [eitSnµ ] = Eµ[eitSn−1Lt1]. By induction we can conclude ∫ Eµ(eitSn) = Lnt 1 dµ Because h is bounded, expanding eith(x,y) as a power series in t, we see that t 7→ Lt is analytic for all t. This shows that (A1) is statisfied. From the Weierstrass theorem there exist fun∑ctions qk, rk on M such thatn p(x, y) is a uniform limit of functions of the form qk(x)rk(y). Therefore, Lt k=1 is a uniform limit of finite rank operators and is compact. Compact operators have a point spectrum hence the essential spectral radius of Lt vanishes. It is also immediate that ‖Lt‖ ≤ 1 for all t. Hence the spectrum is contained in the closed unit disk. In addition, L : L∞(M)→ L∞0 (M∫ ) given by (L0u)(x) = p(x, y)u(y) dy is a positive operator. Note that (L01)(x) = 1 for all x. Thus, 1 is an eigenvalue of L0 with eigenfunction 1. Also, eigenvalue 1 is simple and all other eigenvalues 91 β are such that |β| < 1. This follows from a direct application of Birkhoff Theory (see [2]). Thus, we have (A2). Next we show that if β ∈ sp(Lt), t =6 0 then |β| < 1. If not, then there exists λ and u ∈ L∞(M) such that ∫ p(x, y)eith(x,y)u(y) dy = eiλu(x) Suppose sup |u(x)| = R then for each  > 0 there exists x such that x ∣∣∫ ∣∣ ∫ R−  ≤ |u(x )| = |eiλu(x )| = ∣∣ p(x, y)eith(x,y)u(y) dy∣  ∣ ≤ p(x, y)|u(y)| dy Therefore, ∫ p(x, y)[|u(y)| −R] dy ≥ −, But |u(y)| − R ≤ 0. Hence, |u(y)| = R a.e. Therefore, u(y) = Reiθ(y) a.e. for some function θ and we may assume θ ∈ [0, 2π). ∫ p(x, y)eith(x,y)Reiθ(y) dy = Reiλeiθ(x)∫ =⇒ p(x, y)[ei(th(x,y)−λ+θ(y)−θ(x)) − 1] dy = 0 =⇒ th(x, y)− λ+ θ(y)− θ(x) ≡ 0 mod 2π (3.44) Fix y and t. Then, for all z, x 7→ h(y, x) + h(x, z) does not depend on x modulo 2π i.e. it is piece-wise constant for all t =6 0. By Lemma 3.5.1, h(x, y) satisfies (3.43). This contradiction proves (A3). Recall that if K is integral operator ∫ (Ku)(x) = k(x, y)u(y)dy 92 then ∫ ‖K‖ = sup |k(x, y)|dy. x In our case L2t has the kernel, ∫ lt(x, y) = e it[h(x,z)+h(z,y)]p(x, z)p(z, y)dz. By Lemma 3.5.1 for each x and y the function z →7 (h(x, z)+h(z, y)) is not piecewise constant. So its derivative (whenever it exists) is not identically 0. Thus there is an open set Vx,y and a vector field e such that ∂e[h(x, z) + h(z, y)] 6= 0 on Vx,y. Integrating by parts in the direction of e we conclude that ∫ lim eit[h(x,z)+h(z,y)]p(x, z)p(z, y)dz = 0. t→∞ Vx,y By compactness there are constants r0, ε0 such that for |t| ≥ r0 and all x and y in M, |lt(x, y)| ≤ l0(x, y)− ε0. It follows that ∫ ∫ ‖L2t‖ = sup |lt(x, y)|dy ≤ l0(x, y)dy − ε0. (3.45) x M M The first term here equals ∫∫ p(x, z)p(z, y)dzdy = 1. M×M Hence for |t| ≥ r , ‖L20 t‖ ≤ 1 − ε0 and so ||LN || ≤ (1 − ε )dN/2et 0 . This proves (A4) with no restriction on r1. Therefore, SN admits Edgeworth expansions of all orders. 
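Before turning to the remaining cases, the following Python sketch gives an informal numerical check of (A2) and (A3) for this class of chains (an illustration only; the circle M = [0, 1), the density p, the observable h and all names are our hypothetical choices). Since the chosen h has a non-vanishing mixed partial derivative, it is not of the form (3.43), and the discretised operator L_t should have spectral radius 1 at t = 0 and spectral radius strictly below 1 for t ≠ 0.

import numpy as np

# Discretisation of (L_t u)(x) = int p(x,y) e^{i t h(x,y)} u(y) dy on the circle.
m = 400
x = np.arange(m) / m                             # equispaced grid on [0, 1)
X, Y = np.meshgrid(x, x, indexing="ij")
p = 1.0 + 0.5 * np.cos(2 * np.pi * (X - Y))      # smooth density, bounded below by 1/2
h = np.cos(2 * np.pi * (X + Y))                  # observable h(x, y), not a coboundary

def L(t):
    # Matrix acting on grid values of u; the factor 1/m is the quadrature weight.
    return p * np.exp(1j * t * h) / m

for t in (0.0, 0.5, 2.0, 10.0):
    rho = np.max(np.abs(np.linalg.eigvals(L(t))))
    print(f"t = {t:5.1f}   spectral radius ~ {rho:.4f}")
# Expected pattern: rho = 1 at t = 0 (simple top eigenvalue with eigenfunction 1)
# and rho < 1 once t != 0, consistent with (A2) and (A3).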
Next we look at the case when (3.43) fails but the constants are not lattice valued. Then the arguments for (A1), (A2) and (A3) hold. In particular, (3.44) cannot hold since it implies that

h(x, y) + (θ(y) − θ(x))/t ∈ λ/t + (2π/t)Z.

However, we have to impose a Diophantine condition on the values that h(x, y) can take in order to obtain sufficient control over ‖L_t^N‖ and obtain (A4). For fixed x, y let the range of z ↦ h(x, z) + h(z, y) be S = {c_1, . . . , c_d}. Note that these c_i's may depend on x and y. However, there can be at most finitely many values that h(x, z) + h(z, y) can take as x and y vary on M because h is piece-wise smooth. So we might as well assume that S is this complete set of values. Also, take U_k to be the open set on which z ↦ h(x, z) + h(z, y) takes the value c_k. Take b_k = c_k − c_1 and define d(s) = max_k {b_k s}, where {·} denotes the fractional part as before. Assume further that there exists K > 0 such that for all |s| > 1,

d(s) ≥ K/|s|^β.

If β > (d − 1)^{−1}, then the above holds for almost all d-tuples c = (c_1, . . . , c_d). Note that

|L_t² u(x)| = |∫ [∫ e^{it[h(x,z)+h(z,y)]} p(x, z) p(z, y) dz] u(y) dy|
           ≤ ‖u‖ ∫ |∑_{k=1}^{d} e^{itc_k} ∫_{U_k} p(x, z) p(z, y) dz| dy
           = ‖u‖ ∫ |∑_{k=1}^{d} p_k e^{itb_k}| dy

where p_k = ∫_{U_k} p(x, z) p(z, y) dz. Therefore p_1 + · · · + p_d = p(x, y). Now the situation is similar to that of (3.42) and a similar calculation yields

|∑_{k=1}^{d} p_k e^{itb_k}| = p(x, y) − C d(t)² + O(d(t)³),  C > 0.

Therefore,

‖L_t²‖ ≤ ∫ [p(x, y) − C d(t)² + O(d(t)³)] dy = 1 − C̃ d(t)².

From this we can repeat the analysis done in the finite state Markov chain example following (3.42). In particular, when 1 < |t| < N^{(1−ε)/(2β)}, there exists θ > 0 such that ‖L_t^N‖ ≤ e^{−θN^ε}, which gives us (A4).

Finally, when (3.43) fails and h takes integer values with span 1, X_n is a lattice random variable and we can discuss the existence of the lattice Edgeworth expansion. In this case S_N admits the lattice expansion of all orders. To this end, only the condition (Ã3) needs to be checked. First note that L_0 = L_{2πk} for all k ∈ Z. Also, assuming L_t has an eigenvalue on the unit circle, we conclude (3.44):

t h(x, y) − λ + θ(y) − θ(x) ≡ 0 mod 2π.

This implies t(h(x, y) + h(y, x)) ∈ 2πZ + 2λ. Note that the LHS belongs to a lattice with span t while the RHS is a lattice with span 2π. Because t is not a multiple of 2π this equality cannot happen. Therefore, when t ∉ 2πZ, sp(L_t) ⊂ {|z| < 1} and we have the claim.

3.5.3.2 Chains without densities.

We consider a more general case where the transition probabilities may not have a density. We claim we can recover (A1)–(A4) if the transition operator takes the form L_0 = aJ_0 + (1 − a)K_0 where a ∈ (0, 1) and J_0 and K_0 are Markov operators on L^∞(M) (i.e. J_0 f ≥ 0 if f ≥ 0 and J_0 1 = 1, and similarly for K_0),

J_0 f(x) = ∫ p(x, y) f(y) dµ(y)

and

K_0 f(x) = ∫ f(y) Q(x, dy)

where p is a smooth transition density and Q is a transition probability measure. Let h(x, y) be piece-wise smooth and put

J_t(f) = J_0(e^{ith} f)  and  K_t(f) = K_0(e^{ith} f).

Defining L_t = aJ_t + (1 − a)K_t, we can conclude that t ↦ L_t is analytic and that

E_µ(e^{itS_n}) = ∫ L_t^n 1 dµ.

Now we show that conditions (A2), (A3) and (A4) are satisfied. Because ‖J_t‖ ≤ 1 and ‖K_t‖ ≤ 1 we have ‖L_t‖ ≤ 1. Thus the spectral radius of L_t is at most 1. Because aJ_t is compact, L_t and (1 − a)K_t have the same essential spectrum; see [33, Theorem IV.5.35]. However, the spectral radius of the latter is at most 1 − a. Hence, the essential spectral radius of L_t is at most 1 − a.
Because both J0 and K0 are Markov operators we can conclude that 1 is an eigenvalue of L0 with constant function 1 as the corresponding eigenfunction. From the previous paragraph the essential spectral radius of L0 is at most (1−a). Because Ln is norm bounded it cannot have Jordan blocks. So 1 is semisimple. Suppose, Ltu = eiθu. Without loss of generality we may assume ‖u‖∞ = 1. Assuming there exists a positive measure set Ω with |u(x)| < 1− δ we can conclude that, for all x, |u(x)| = |Ltu(x)| = |a∫Jtu(x) + (1− a)Ktu(x)| ∫ ≤ a |u(y)|p(x, y)dµ(y) + a |u(y)|p(x, y)dµ(y) + (1− a) Ω Ωc 96 ≤ 1− aδµ(Ω). This is a contradiction. Therefore, |u(x)| = 1. Put u(x) = eiγ(x). Then, ∫ 1 = a ei(th(x,y)+γ(y)−γ(x)−θ)p(x, y)dµ(y) + (1− a)e−i(θ+γ(x))Ktu ∫ Hence, ei(th(x,y)+γ(y)−γ(x)−θ)p(x, y)dµ(y) = 1 =⇒ Jtu = eiθu. From section 3.5.3.1, this can only be true when t = 0 and in this case θ = 0 and u ≡ 1. This concludes that Lt, t 6= 0 has no eigenvalues on the unit disk and the only eigenvalue of L0 on the unit disk is 1 and its geometric multiplicity is 1. As 1 is semisimple, it is simple as required. This concludes proof of (A2) and (A3). From the previous case, there exists r > 0 and  ∈ (0, 1) such that such that for all |t| > r we have ‖J 2t ‖ ≤ 1− . From this we have, ‖L2t‖ = ‖a2J 2t + a(1− a)JtKt + (1− a)aKtJt + (1− a)2K2t ‖ ≤ 1− a2. Hence, for all |t| > r, for all N , ‖LNt ‖ ≤ (1 − a2)bN/2c which gives us (A4) with no restrictions on r1. Therefore, SN admits Edgeworth expansions of all orders as before. As in the previous section, an analysis can be carried out when (3.43) fails. The conclusions are exactly the same. 3.5.4 One dimensional piecewise expanding maps. Here we check assumptions (3.1), (A1)–(A4) for piecewise expanding maps of the interval using the results of [5, 37]. 97 Let f : [0, 1]→ [0, 1] be such that there is a finite partition A0 of [0, 1] (except possibly a measure 0 set) into open intervals such that for all I ∈ A0, f |I extends to a C2 map on an interval containing I. In other words f is a piece-wise C2 map. F∨urther, assume that f ′ ≥ λ > 1 i.e. f is uniformly expanding. Next, let n A = T−jn A0 and suppose for each n there is Nn such that for all I ∈ An, k=0 fNnI = [0, 1]. Such maps are called covering. Statistical properties of piece-wise C2 covering expanding maps of an interval, are well-understood. For example, see [37]. In particular, such a function f has a unique absolutely continuous invariant measure with a strictly positive density h ∈ BV[0, 1] and the associated transfer operator ∑ L ϕ(y)0ϕ(x) = f ′(y) y∈f−1(x) has a spectral gap. Let g be C2 except possibly at finite number of points and admitting a C2 extension on each interval of smoothness. Define Xn = g ◦ fn and consider it as a random variable with x distributed according to some measure ρ(x)dx, ρ ∈ BV[0, 1]. Define a family of operators Lt : BV[0, 1]→ BV[0, 1] by ∑ itg(y) L etϕ(x) = ′ ϕ(y)f (y) y∈f−1(x) where t = 0 corresponds to the transfer operator. Because g is bounded, writing eitg(y) as a power series we can conclude t → Lt is analytic for all t. This gives (A1). 98 (A2) follows from the fact that L0 has a spectral gap. We further assume that g is not cohomologous to a piece-wise constant function. (3.46) In particular, g is not a BV coboundary. The assumption (3.46) is reasonable. Indeed, suppose that g is piece-wise constant taking values c1, c2 . . . ck. 
Then Sn takes less than n k−1 distinct values so the maximal jump is of order at least n−(k−1) so Sn can not admit Edgeworth expansion of order (2k − 2) in contrast to the case where (3.46) holds as we shall see below. A direct computation gives, ∫ √ 1 E(eitSn/ n) = Ln √t/ nρ(x) dx. 0 Therefore, there exists A such that, E it Sn√−nA 2 2 lim (e n ) = e−t σ /2 (3.47) n→∞ where σ2 ≥ 0. It is well know that σ2 > 0 ⇐⇒ g is a BV coboundary (see [24]). From (3.47) it is clear that Sn satisfies the CLT. To show (A3) holds, we first normalize the family of operators, ∑ eitg(y)L h(y)tv(x) = ′ v(y)f (y)h ◦ f(y) f(y)=x Then, L −1t = H ◦Lt ◦H where H is multiplication by the function h. Therefore, Lt and Lt have the same spectrum. However, the eigenfunction corresponding to the eigenvalue 1 of L0 changes to the constant function 1. 99 Assume eiθ is an eigenvalue of Lt. Then, there exists u ∈ BV[0, 1] with L iθtu(x) = e u(x). Observe that, ∑ L | | |u(y)|h(y)0 u (x) = ∣ f ′(y)h ◦ f(y)f∣(y∑)=x ≥ ∣∣ ∣eitg(y)u(y)h(y) ∣∣ iθ′ = |Ltu(x)| = |e u(x)| = |u(x)|f (y)h ◦ f(y) ∣ f(y)=x n Also note that, L0 is a positive operator. Hence, L0 |u|(x) ≥ |u(x)| for all n. How- ever, ∫ n lim (L0 |u|)(x) = |u(y)| · 1 dy n→∞ because 1 is the eigenfunction co∫rresponding to the top eigenvalue. So for all x, |u(y)| dy ≥ |u(x)| This implies that |u(x)| is constant. WLOG |u(x)| ≡ 1. So we can write u(x) = eiγ(x). Then, ∑ L h(y)u(x) = ei(tg(y)+γ(y)) = ei(θ+γ(x))t f ′(y)h ◦ f(y) ∑f(y)=x ⇒ h(y)= ei(tg(y)+γ(y)−γ(f(y))−θ) f ′ = 1 (y)h ◦ f(y) f(y)=x for all x. Since, ∑ L h(y)01 = f ′ = 1 (y)h ◦ f(y) f(y)=x and ei(tg(y)+γ(y)−γ(x)−θ) are unit vectors, it follows that tg(y) + γ(y)− γ(f(y))− θ = 0 mod 2π (3.48) for all y. Because g is not cohomologous to a piecewise constant function we have a contradiction. Therefore, Lt and hence Lt does not have an eigenvalue on the unit circle when t 6= 0. 100 To complete the proof of (A3) one has to show that the spectral radius of Lt is at most 1 and that the essential spectral radius of Lt is strictly less than 1. This is clear from Lasota-Yorke type inequality in [5, Lemma 1]. In fact, there is a uniform κ ∈ (0, 1) such that ress(Lt) ≤ κ for all t. Next, we describe in detail how the estimate in [5, Proposition 1] gives us (A4). To make the notation easier we assume t > 0 and we replace |t| by t. [5, Proposition 1] implies that there exist c and C such that if K1 large enough (we fix one such K1) then for all t > K1, ‖Ldc ln tet u‖ ≤ e−Cdc ln tet ‖u‖t (3.49) where ‖h‖t = (1 + t)−1‖h‖BV + ‖h‖L1 . Therefore, ‖Lkdc ln teu‖ ≤ e−Cdc ln te‖L(k−1)dc ln tet t u‖t ≤ · · · ≤ e−Ckdc ln te‖u‖t Also, ‖Lt‖t ≤ 1. So, if n = kdc ln te+ r where 0 ≤ r < dc ln te then kdc ln te ‖Lnu‖ ≤ e−Ckdc ln te r k ‖L u‖ ≤ e−Cn kdc ln te+rt t t t ‖u‖ −Cn t ≤ e k+1‖u‖t However, (1 + t)−1‖h‖BV ≤ ‖h‖t ≤ [1 + (1 + t)−1]‖h‖BV Therefore, (1 + t)−1‖Lnu‖ ≤ [1 + (1 + t)−1]e−Cn k BV k+1t ‖u‖BV which gives us ‖Ln k t ‖BV ≤ (t+ 2)e −Cn k+1 101 n and here k = k(n, t) = b c. When K ≤ |t| ≤ nr n11 , kmin = b c anddc ln te dc lnnr1e kmin → k1 as n→∞. Also, 1 ≥ ≥ kmin and, kmin + 1 k + 1 kmin + 1 k(n,t) k n −Cn min‖L ‖ ≤ (t+ 2)e−Cn k(n,t)+1 r1 k +1t BV ≤ 2n e min kmin 1 Choosing n0 such that for all n > n0, > (so this choice of n0 works for kmin + 1 2 all t) we can conclude that, ‖Ln‖ ≤ 2nr1e−Cn/2t BV r − 1 This proves (A4) for all choices of r1. In particular given r, we can choose r1 > 2 in the above proof. This implies that Edgeworth expansions of all orders exist. 3.5.5 Multidimensional expanding maps. 
LetM be a compact Riemannian manifold and f :M→M be a C2 expand- ing map. Let g : M → R be a C2 function which is non homologous to constant. The proof of Lemma 3.13 in [13] shows that this condition is equivalent to g not being infinitesimally integrable in the following sense. The natural extension of f acts on the space of pairs ({yn}n∈N, x) where f(yn+1) = yn for n > 0 and fy1 = x. Given such pair let [∑ ] [ ]n−1 ∑n ∑∞ Γ({y } ∂, x) = lim g(fk ∂ ∂n yn) = lim g(yk) = g(yk). n→∞ ∂x n→∞ ∂x ∂x k=0 k=1 k=1 g is called infinitesimally integrable if Γ({yn}, x) actually depends only on x but not on {yn}. Let Xn = g ◦fn. We want to verify (A1)–(A4) when x is distributed according to a smooth density ρ. Note that assumption (3.1) holds with v = ρ, ` being the 102 Lebesgue measure and ∑ eitg(y) (Ltφ)(x) = ∣∣ ( )det ∂f ∣∣φ(y). y∈f−1(x) ∂x We will check (A1)–(A4) for L acting on C1t (M). The proof of (A1)–(A3) is the same as in section 3.5.4. In particular, for (A3) we need Lasota–Yorke inequality (see (3.52) below) which is proven in [13, equation (19)]. The proof of (A4) is also similar to section 3.5.4, so we just explain the differ- ences. As before we assume that t >(0. Given a small c)onstant κ let κ‖Dφ‖C0‖φ‖t = max ‖φ‖C0 , . 1 + t Then by [13, Proposition 3.16] ‖Lnt φ‖t ≤ ‖φ‖t (3.50) provided that n ≥ C1 ln t. By [13, Lemma 3.18] if g is not infinitesimally integrable then there exists a constant η < 1 such that ‖Lnt φ‖L1 ≤ ηn‖φ‖t. (3.51) The Lasota–Yorke inequality says that there is a constant θ < 1, such that ‖D (Lnφ)‖ ≤ C (t‖φ‖ + θnt C0 3 C0 ‖Dφ‖C0) (3.52) Also, ‖Lnt φ‖C0 ≤ ‖L n 0 (|φ|)‖C0 ≤ C4 (‖ |φ| ‖L1 + θn‖ |φ| ‖Lip) (3.53) where the last step relies on L0 having a spectral gap on the space of Lipshitz functions. Combing (3.50) through (3.53), we conclude that Lt satisfies (3.49). The rest of the argument is the same as in section 3.5.4. 103 Chapter 4: Large Deviation Principles. 4.1 Asymptotics for Cramér’s Theorem. In this section, we focus on sequences of i.i.d. random variables. First, we prove the existence of weak asymptic expansions for Cramér’s LDP – Theorem 1.2. Next, we deduce existence of the strong expansion in special cases. As expected, a stronger assumption on the regularity of the law of the random variables is required for the second step. 4.1.1 Weak asymptotic expansions. We recall that a random variable X is called l−Diophantine if there exist C positive constants t0 and C such that |E(eitX)| < 1 − for |t| > t0. It is known|t|l that when X is l−Diophantine and r+2 moments exist weak Edgeworth expansions exist. For example, see [4] and Section 3.5.1. Given a random variable X with distribution function F , we define YX,γ to be a random variable with distribution function Gγ given by, yγ γ e dF (y)dG (y) = (4.1) µ(γ) 104 ∫ where µ(γ) = eyγdF (y). Therefore, ∫∫yeyγdF (y)E[YX,γ] = . (4.2) eyγdF (y) In Section 3.1 we defined the functio(n spaces Fm mk : f ∈ Fk if f is m) times continuously differentiable and Cmk (f) = max ‖f (j)‖L1 + max ‖xjf‖L1 < ∞. 0≤j≤m 0≤j≤k We call a function f , (left) exponential of order α, if lim |e−αxf(x)| = 0. Denote x→−∞ by F km,α the collection of all f ∈ F km with f (k) is exponential of order α. We note that due to assumption f ∈ F k , f (k)m being exponential of order α is enough to guarantee that f (l) is exponential of order α for all 0 ≤ l ≤ k. To see this suppose f, f ′ ∈ L1. Then, lim f(x) = 0. Suppose f ′ is exponential of order |x|→∞ α. 
In Section 3.1 we defined the function spaces $\mathcal F^m_k$: $f\in\mathcal F^m_k$ if $f$ is $m$ times continuously differentiable and
\[ C^m_k(f)=\max_{0\le j\le m}\|f^{(j)}\|_{L^1}+\max_{0\le j\le k}\|x^jf\|_{L^1}<\infty. \]
We call a function $f$ (left) exponential of order $\alpha$ if $\lim_{x\to-\infty}|e^{-\alpha x}f(x)|=0$. Denote by $\mathcal F^m_{k,\alpha}$ the collection of all $f\in\mathcal F^m_k$ such that $f^{(m)}$ is exponential of order $\alpha$. We note that, due to the assumption $f\in\mathcal F^m_k$, $f^{(m)}$ being exponential of order $\alpha$ is enough to guarantee that $f^{(l)}$ is exponential of order $\alpha$ for all $0\le l\le m$. To see this, suppose $f,f'\in L^1$. Then $\lim_{|x|\to\infty}f(x)=0$. Suppose $f'$ is exponential of order $\alpha$. Then, given $\varepsilon>0$ there is $M>0$ such that for $x<-M$, $-\varepsilon e^{\alpha x}<f'(x)<\varepsilon e^{\alpha x}$. So,
\[ -\varepsilon\int_{-\infty}^xe^{\alpha y}\,dy\le\int_{-\infty}^xf'(y)\,dy\le\varepsilon\int_{-\infty}^xe^{\alpha y}\,dy\implies-\frac{\varepsilon}{\alpha}e^{\alpha x}\le f(x)\le\frac{\varepsilon}{\alpha}e^{\alpha x} \]
for $x<-M$. So $f$ is also of exponential order $\alpha$. Since $f^{(l)}\in L^1$ for all $0\le l\le m$, we can repeat the same argument starting from $m$ and conclude that all lower order derivatives are of exponential order $\alpha$.

It is clear that $\mathcal F^m_{k,\alpha}\subset\mathcal F^m_{k,\beta}$ if $\alpha>\beta$. Finally, define $\mathcal F^m_{k,\infty}=\bigcap_{\alpha>0}\mathcal F^m_{k,\alpha}$. This intersection is non-empty. For example, the family of Gaussian functions and $C^\infty_c(\mathbb R)$ are in $\mathcal F^m_{k,\alpha}$ for all $\alpha>0$.

Recall from Chapter 1 that for a function $f:\mathbb R\to(-\infty,\infty]$ with $f\not\equiv\infty$, $D_f=\{x\in\mathbb R\,|\,f(x)<\infty\}$ and $f^*(x)=\sup_{t\in\mathbb R}[tx-f(t)]$. If $f$ is convex, lower semi-continuous with $\mathring D_f=(a,b)$ and $f\in C^2(a,b)$ with $f''>0$ on $(a,b)$, then $\mathring D_{f^*}=(A,B)$ where $A=\lim_{t\to a^+}f'(t)$ and $B=\lim_{t\to b^-}f'(t)$, and $f^*$ is continuously differentiable on $(A,B)$. For any $f$ satisfying the above properties and any $x\in\mathring D_{f^*}$, the supremum in the definition of $f^*(x)$ is achieved at a unique point. $f$ is called steep if $\lim_{t\to a^+}|f'(t)|=\lim_{t\to b^-}|f'(t)|=\infty$.

Theorem 4.1.1. Let $X$ be a non-constant, real-valued, centred random variable. Assume that the logarithmic moment generating function $h(\theta)=\log E(e^{\theta X})$ is finite on a neighbourhood of $0$. Further assume that there is $l\in\mathbb N$ such that for all $\theta\in\mathring D_h$, $Y_{X,\theta}$ is $l$–Diophantine. Let $X_n$ be a sequence of i.i.d. copies of $X$. Let $r\in\mathbb N$ and $a\in(0,\sup(\mathrm{supp}\,X))$. Let $\theta_a$ be the unique $\theta$ such that
\[ I(a)=\sup_{\theta\in\mathring D_h}\Big(a\theta-\log\int e^{y\theta}dF(y)\Big)=a\theta_a-\log\int e^{y\theta_a}dF(y). \]
Take $q>\frac{l(r+2)}{2}+1$ and $\alpha>\theta_a$. Then, for every $f\in\mathcal F^q_{r+1,\alpha}$ we have
\[ E(f(S_N-aN))e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+\frac12}}\int P_p(z)f_{\theta_a}(z)\,dz+C^q_{r+1}(f_{\theta_a})\cdot o_{r,\theta_a}\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big) \qquad (4.3) \]
where $f_\theta(x)=\frac{1}{2\pi}e^{-\theta x}f(x)$ and the $P_p(z)$ are polynomials depending on $a$.

Proof. Assuming $F$ to be the distribution function of $X$, we can define $Y_{X,\gamma}$ by (4.1). Let the $Y_i$ be i.i.d. copies of $Y_{X,\gamma}$ and take $\tilde S_N=Y_1+\dots+Y_N$. A simple computation gives us
\[ dG^\gamma_N(y)=\frac{e^{y\gamma}\,dF_N(y)}{\mu(\gamma)^N} \]
where $F_N$ is the distribution function of $S_N$ and $G^\gamma_N$ is the distribution function of $\tilde S_N$. Now, we formally compute
\[ E(f(S_N-aN))e^{a\gamma N}=E(e^{a\gamma N}f(S_N-aN))=E\big(e^{\gamma S_N}\,2\pi f_\gamma(S_N-aN)\big)=\int e^{\gamma y}\,2\pi f_\gamma(y-aN)\,dF_N(y) \]
\[ =\mu(\gamma)^N\int2\pi f_\gamma(y-aN)\,dG^\gamma_N(y)=\mu(\gamma)^NE^\gamma\big(2\pi f_\gamma(\tilde S_N-aN)\big) \]
where $f_\gamma(s)=\frac{1}{2\pi}e^{-s\gamma}f(s)$. Hence,
\[ E(f(S_N-aN))e^{(a\gamma-\log\mu(\gamma))N}=E^\gamma\big(2\pi f_\gamma(\tilde S_N-aN)\big). \qquad (4.4) \]
Put $\gamma=\theta_a$. Then $Y_{X,\gamma}$ has mean $a$ (see [17, Chapter 2]). Since $f\in\mathcal F^q_{r+1,\alpha}$ with $\theta_a<\alpha$, we have $f_{\theta_a}\in\mathcal F^q_{r+1}$. We prove this when $r=0$ and $q=1$. The argument for general $q$ and $r$ is similar. Suppose $f(x),f'(x),xf(x)\in L^1$ and $f'(x)$ is continuous. It is immediate that $(e^{-\theta_ax}f(x))'=-\theta_ae^{-\theta_ax}f(x)+e^{-\theta_ax}f'(x)$ is continuous. We need to show $e^{-\theta_ax}f(x),\ (e^{-\theta_ax}f(x))',\ xe^{-\theta_ax}f(x)\in L^1$. Since $f$ and $f'$ are of exponential order, it is enough to show that $e^{-\theta_ax}g(x),\ xe^{-\theta_ax}g(x)\in L^1$ whenever $g\in L^1$ is exponential of order $\alpha\,(>\theta_a)$. This is true because there is $M>0$ such that for $x<-M$, $|e^{-\theta_ax}g(x)|<e^{(\alpha-\theta_a)x}$ and $|xe^{-\theta_ax}g(x)|<-xe^{(\alpha-\theta_a)x}$.

Therefore, from [4], the RHS of (4.4) admits the weak Edgeworth expansion whose coefficients are determined by the moments of $Y_{X,\theta_a}$. Therefore, we have that for all functions $f\in\mathcal F^q_{r+1,\alpha}$,
\[ E(f(S_N-aN))e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+\frac12}}\int P_{p,l}(z)f_{\theta_a}(z)\,dz+C^q_{r+1}(f_{\theta_a})\cdot o\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big). \]
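As a quick illustration of the quantities appearing in the theorem (an example not worked out in the text), let $X=\mathcal E-1$ where $\mathcal E$ is standard exponential. Then $h(\theta)=-\theta-\log(1-\theta)$ for $\theta<1$, and solving $h'(\theta_a)=a$ gives
\[ \theta_a=\frac{a}{1+a},\qquad I(a)=a\theta_a-h(\theta_a)=a-\log(1+a),\qquad a>0. \]
Moreover, $Y_{X,\theta}$ has a density for every $\theta<1$, hence is $0$–Diophantine, and $\sup(\mathrm{supp}\,X)=\infty$, so the expansion (4.3) is available for every $a>0$ in this example.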
Remark 4.1.1.

1. The assumption of $X$ being centred is just to simplify the notation. One can easily reformulate the results for non-centred $X$ using the corresponding results for $X-E(X)$. Therefore, from now on we discuss results for centred random variables only.

2. A similar result holds for $a\in(\inf(\mathrm{supp}\,X),0)$. In fact, one can deduce the corresponding results for $a<0$ by considering $-X$ and $(-a)>0$. But, for simplicity, we focus only on $a>0$ hereafter.

3. Note that the requirement to expand $E^\gamma(f_{\theta_a}(\tilde S_N-aN))$ is $f_{\theta_a}\in\mathcal F^q_{r+1}$, which is indeed the case when $f\in\mathcal F^q_{r+1,\alpha}$ for some $\alpha>\theta_a$. In particular, this result holds for $f\in C^q_c(\mathbb R)$.

4. In addition, if $h(\theta)$ is steep then $\sup(\mathrm{supp}\,X)=\infty$ (see [30, Chapter 1]) and the expansion holds for all $a>0$.

We note that for a large class of random variables $X$, $Y_{X,\theta}$ is $l$–Diophantine. For example, if $X$ is $0$–Diophantine then so is $Y_{X,\theta}$, because $X$ is absolutely continuous with respect to $Y_{X,\theta}$ (see [1, Lemma 4]). Also, we claim that if $X$ is compactly supported and $l$–Diophantine for $l>0$, then so is $Y_{X,\theta}$. We recall from [4] that a random variable $X$ with distribution function $F$ is $l$–Diophantine if and only if there exist $C_1,C_2>0$ such that for all $|x|>C_1$,
\[ \inf_{y\in\mathbb R}\int_{\mathbb R}\{ax+y\}^2\,dF(a)\ge\frac{C_2}{|x|^l} \]
where $\{z\}=\mathrm{dist}(z,\mathbb Z)$. If $X$ is compactly supported (say on $[c,d]$) then
\[ \int\{ax+y\}^2\,dG^\theta(a)=\frac{1}{\int_c^de^{\theta a}\,dF(a)}\int_c^d\{ax+y\}^2e^{\theta a}\,dF(a)\ge\frac{e^{\theta c}}{\int_c^de^{\theta a}\,dF(a)}\int_c^d\{ax+y\}^2\,dF(a). \]
Thus, for all $|x|>C_1$,
\[ \inf_{y\in\mathbb R}\int\{ax+y\}^2\,dG^\theta(a)\ge\frac{e^{\theta c}}{\int_c^de^{\theta a}\,dF(a)}\cdot\frac{C_2}{|x|^l}. \]
So the random variable $Y_{X,\theta}$ with distribution function $G^\theta$ is $l$–Diophantine as claimed earlier. From this we obtain the following corollary.

Corollary 4.1.2. Let $X$ be a non-constant, real-valued, compactly supported, $l$–Diophantine, centred random variable. Let $X_n$ be a sequence of i.i.d. copies of $X$. Let $r\in\mathbb N$ and $a\in(0,\sup(\mathrm{supp}\,X))$. Let $\theta_a$ be the unique $\theta$ such that
\[ I(a)=\sup_{\theta\in\mathring D_h}\Big(a\theta-\log\int e^{y\theta}dF(y)\Big)=a\theta_a-\log\int e^{y\theta_a}dF(y). \]
Then, for every $f\in\mathcal F^q_{r+1,\alpha}$ with $q>\frac{l(r+2)}{2}+1$ and $\alpha>\theta_a$ we have
\[ E(f(S_N-aN))e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+\frac12}}\int P_p(z)f_{\theta_a}(z)\,dz+C^q_{r+1}(f_{\theta_a})\cdot o_{r,\theta_a}\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big) \]
for some polynomials $P_p(z)$ depending on $a$.

4.1.2 Strong asymptotic expansions.

We prove a lemma that gives conditions for the point-wise limit of a sequence of functions uniformly bounded in $\mathcal F^q_{r+1}$ to satisfy the asymptotic expansions.

Lemma 4.1.3. Let $q\ge0$. Suppose $\{f_k\}$ is a sequence in $\mathcal F^q_{r+1}$ such that $S_N$ admits the weak local Edgeworth expansion for each $f_k$, $C^q_{r+1}(f_k)\le C$ for all $k$, the $f_k$ are uniformly bounded in $L^\infty(\mathbb R)$, $f_k\to f$ point-wise, and for all $p$,
\[ \lim_{k\to\infty}\int P_p(z)f_k(z)\,dz=\int P_p(z)f(z)\,dz. \qquad (4.5) \]
Then,
\[ \sqrt N\,E(f(S_N))=\frac{1}{2\pi}\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^p}\int P_p(z)f(z)\,dz+C\cdot o_{r,\beta}(N^{-r/2}). \]

Proof. For large $N$,
\[ \Big|\sqrt N\,E(f_k(S_N))-\frac{1}{2\pi}\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^p}\int P_p(z)f_k(z)\,dz\Big|\le C^q_{r+1}(f_k)\cdot o_{r,\beta}(N^{-r/2})\le C\cdot o_{r,\beta}(N^{-r/2}). \qquad (4.6) \]
The LDCT gives us that
\[ \lim_{k\to\infty}E(f_k(S_N))=E(f(S_N)). \]
This, along with assumption (4.5), allows us to take the limit $k\to\infty$ in (4.6) and to conclude
\[ \Big|\sqrt N\,E(f(S_N))-\frac{1}{2\pi}\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^p}\int P_p(z)f(z)\,dz\Big|\le C\cdot o_{r,\beta}(N^{-r/2}) \]
which implies the result.

Remark 4.1.2. The same would hold if we replaced weak local by weak global. However, our focus here is on weak local expansions.

The next theorem specifies when the existence of weak expansions implies the existence of strong expansions.

Theorem 4.1.4. Let $X_n$ be a sequence of random variables, not necessarily i.i.d. Suppose $S_N=X_1+\dots+X_N$ admits the weak asymptotic expansion of order $r$ for large deviations in the range $(0,L)$ for $f\in\mathcal F^1_{r+1,L_+}$, where $L_+>L$ when $L<\infty$ and $L_+=\infty$ if $L=\infty$. That is,
\[ E(f(S_N-aN))e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+1/2}}\int P_p(z)f_{\theta_a}(z)\,dz+C^1_{r+1}(f_{\theta_a})\cdot o_{r,\theta_a}\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big) \]
for all $a\in(0,L)$, where $I(a)$ and $\theta_a$ are as in (4.11). Then, $S_N$ admits the strong asymptotic expansion of order $r$ for large deviations in $(0,L)$.

Proof. If $f\in C^\infty_c$ then $f_\theta\in\mathcal F^1_{r+1}$ for all $\theta$. Therefore, we approximate $1_{[0,\infty)}$ by a sequence $f_k$ of $C^\infty_c$ functions such that the $(f_k)_{\theta_a}$ are uniformly bounded in $\mathcal F^1_{r+1}$ (see Appendix A.3 for such a sequence) and invoke Lemma 4.1.3 to establish
\[ \mathbb P(S_N\ge aN)e^{I(a)N}=\frac{1}{2\pi}\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+1/2}}\int_0^\infty P_p(z)e^{-\theta_az}\,dz+C\cdot o_{r,\theta_a}\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big). \]
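For concreteness, one convenient family of approximations (a sketch in the spirit of the construction carried out in Appendix A.3, whose precise sequence differs in details) is
\[ f_k(x)=\chi_k(x)\Big(\frac{1}{\pi}\tan^{-1}(kx)+\frac12\Big), \]
where $\chi_k\in C^\infty$ equals $1$ on $[-1,k]$, vanishes outside $(-2,k+1)$, and has $|\chi_k'|$ bounded uniformly in $k$. Then $0\le f_k\le1$, $f_k\to1_{[0,\infty)}$ pointwise, and for any fixed $\theta_a>0$ the tilted functions $(f_k)_{\theta_a}(x)=\frac{1}{2\pi}e^{-\theta_ax}f_k(x)$ are uniformly bounded in $\mathcal F^1_{r+1}$, since $e^{-\theta_ax}$ is integrable against polynomials on $[-2,\infty)$ and $\|\frac{d}{dx}\tan^{-1}(kx)\|_{L^1}=\pi$ for every $k$.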
Remark 4.1.3. Note that the coefficients of the strong expansion are
\[ C_p(a)=\frac{1}{2\pi}\int_0^\infty P_p(z)e^{-\theta_az}\,dz, \]
obtained by replacing $f$ with $1_{[0,\infty)}$ in the coefficients of the weak expansions. Since the $f_k$ are bounded in $\mathcal F^1_{r+1}$, we can do this without altering the order of the error. However, for any $q>1$, $1_{[0,\infty)}$ is not a pointwise limit of a sequence of functions $f_k$ in $\mathcal F^q_r$ with $C^q_{r+1}(f_k)$ bounded. To see this, assume that $\|f_k\|_1,\|f'_k\|_1,\|f''_k\|_1$ are uniformly bounded and $f_k\to1_{[0,\infty)}$ point-wise. Then, for all $\phi\in C^\infty_c(\mathbb R)$,
\[ \int\delta'\,\phi=-\int\delta\,\phi'=\int1_{[0,\infty)}\phi''=\lim_{k\to\infty}\int f_k\,\phi''=\lim_{k\to\infty}-\int f'_k\,\phi'=\lim_{k\to\infty}\int f''_k\,\phi. \]
This implies that $\frac{|\phi'(0)|}{\|\phi\|_\infty}\le\sup_k\|f''_k\|_1$ for all $\phi\in C^\infty_c(\mathbb R)$. Clearly, this is a contradiction. Therefore, Theorem 4.1.1 does not automatically give us strong expansions.

Now we are in a position to state and prove the main result of this section, which extends Cramér's LDP for i.i.d. random variables when the random variables have a sufficiently regular density.

Theorem 4.1.5. Let $X$ be a non-constant, real-valued, centred random variable. Assume that the logarithmic moment generating function $h(\theta)=\log E(e^{\theta X})$ is finite on a neighbourhood of $0$. Further assume that $X$ is $0$–Diophantine. Let $r\in\mathbb N$. Then for all $a\in(0,\sup(\mathrm{supp}\,X))$, there are constants $C_p(a)$ such that
\[ \mathbb P(S_N\ge aN)e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{C_p(a)}{N^{p+\frac12}}+o\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big) \]
where
\[ C_p(a)=\frac{1}{2\pi}\int_0^\infty e^{-\theta_az}P_p(z)\,dz \]
for some polynomials $P_p(z)$ depending on $a$,
\[ I(a)=\sup_{\theta\in\mathbb R}\Big(a\theta-\log\int e^{y\theta}dF(y)\Big), \]
and $\theta_a$ is the unique point at which the supremum is achieved.

Proof. If $X$ is $0$–Diophantine then so is $Y_{X,\theta}$, as $X$ is absolutely continuous with respect to $Y_{X,\theta}$ (see [1, Lemma 4]). Since $Y_{X,\theta}$ has moments of all orders, $Y_{X,\theta}$ admits the strong Edgeworth expansion of all orders. Therefore, for each $r\in\mathbb N$, $Y_{X,\theta}$ admits the weak local Edgeworth expansion of order $r$ for $f\in\mathcal F^1_r$ (see Appendix A.2). From (4.4) we know that
\[ E(f(S_N-aN))e^{I(a)N}=E^\gamma\big(2\pi f_{\theta_a}(\tilde S_N-aN)\big) \]
where the summands of $\tilde S_N$ have mean $a$. The assumptions allow us to expand the RHS using the weak local Edgeworth expansion and obtain
\[ E(f(S_N-aN))e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+\frac12}}\int P_p(z)f_{\theta_a}(z)\,dz+C^1_{r+1}(f_{\theta_a})\cdot o_{r,\beta}(N^{-r/2}) \]
for $f\in C^\infty_c(\mathbb R)$. Now, we approximate $1_{[0,\infty)}$ by a sequence $f_k\in C^\infty_c(\mathbb R)$ such that the $(f_k)_{\theta_a}$ are bounded in $\mathcal F^1_{r+1}$ (see Appendix A.3 for such a sequence) and use Theorem 4.1.4 to obtain the required expansion.

Remark 4.1.4. This gives us an alternative proof of [1, Theorem 2] for $X$ satisfying Cramér's condition (which corresponds to Case 1 there). There are two ways the coefficients $C_p(a)$ depend on $a$. First, note that $\theta_a$ depends on the choice of $a$. Also, from Section 3.3, we know exactly how the coefficients of $P_p$ depend on the first $p+2$ asymptotic moments of $\tilde S_N$ and thus on the first $p+2$ moments of $Y_{X,\theta_a}$. So the dependence of $C_p(a)$ on $a$ is explicit and one can compute these coefficients. In addition, $C_p(a)$ does not depend on $r$ because the $P_p(z)$ do not.
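The leading term in Theorem 4.1.5 can be sanity-checked numerically in the centred exponential example above. The sketch below is an illustration only: the constant $1/(a\sqrt{2\pi})$ is the classical first order coefficient of [1] for this example, not computed from the polynomials $P_p$ above, and it uses the fact that for $X=\mathcal E-1$ the event $\{S_N\ge aN\}$ coincides with $\{\Gamma_N\ge(1+a)N\}$ for a Gamma$(N,1)$ variable $\Gamma_N$, so the tail probability is available exactly.

\begin{verbatim}
# Numerical check of P(S_N >= aN) * exp(I(a)N) * sqrt(N) -> C_0(a)
# for X = E - 1, E ~ Exp(1):  I(a) = a - log(1+a),  C_0(a) = 1/(a*sqrt(2*pi)).
import numpy as np
from scipy.stats import gamma

a = 0.5
I = a - np.log1p(a)                          # rate function I(a)
C0 = 1.0 / (a * np.sqrt(2.0 * np.pi))        # classical leading coefficient
for N in [50, 200, 800, 3200]:
    tail = gamma.sf((1.0 + a) * N, N)        # P(S_N >= aN), exactly
    print(N, tail * np.exp(I * N) * np.sqrt(N), C0)
\end{verbatim}

The printed ratios should approach $C_0(a)\approx0.798$, in agreement with the $N^{-1/2}$ leading term of the expansion.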
4.2 Higher order asymptotics in the non–i.i.d. case.

Let $X_n$ be a sequence of random variables, not necessarily i.i.d., with asymptotic mean $0$. Suppose that there exist a Banach space $\mathbb B$, a family of bounded linear operators $\mathcal L_z:\mathbb B\to\mathbb B$ and vectors $v\in\mathbb B$, $\ell\in\mathbb B'$ such that
\[ E\big(e^{zS_N}\big)=\ell(\mathcal L^N_zv),\qquad z\in\mathbb C, \qquad (4.7) \]
satisfying the following.

(B1) There exists $\delta>0$ such that $z\mapsto\mathcal L_z$ is continuous on the strip $|\mathrm{Re}(z)|<\delta$ and holomorphic on the disc $|z|<\delta$.

(B2) $1$ is an isolated and simple eigenvalue of $\mathcal L_0$, all other eigenvalues of $\mathcal L_0$ have absolute value less than $1$, and its essential spectrum is contained strictly inside the disk of radius $1$ (spectral gap).

(B1) and (B2), along with the perturbation theory of operators (see [33]), imply that there is $\delta_0\in(0,\delta)$ such that
\[ \mathcal L_z=\mu(z)\Pi_z+\Lambda_z,\qquad|z|<\delta_0, \qquad (4.8) \]
where $\mu(z)$ is the top eigenvalue of $\mathcal L_z$, $\Pi_z$ is the corresponding eigen–projection, $\Pi_z\Lambda_z=\Lambda_z\Pi_z=0$, and $z\mapsto\mu(z)$, $z\mapsto\Pi_z$ and $z\mapsto\Lambda_z$ are holomorphic. In addition,
\[ \Big\|\frac{d^k}{dz^k}\Lambda^N_z\Big\|<\beta_k^N\quad\text{with }0<\beta_k<1. \]
Therefore,
\[ \mathcal L^N_z=\mu(z)^N\Pi_z+\Lambda^N_z. \]
Combining this with (4.7) we have
\[ E(e^{zS_N})=\mu(z)^N\ell(\Pi_zv)+\ell(\Lambda^N_zv). \qquad (4.9) \]
Then, plugging in $z=0$ and taking $N\to\infty$, we conclude that $\ell(\Pi_0v)=1$. Also, taking the derivative at $z=0$, dividing by $N$ and taking the limit as $N\to\infty$, we obtain
\[ \frac{d}{dz}\mu(z)\Big|_{z=0}=\lim_{N\to\infty}\frac{E(S_N)}{N}=0. \]
Taking the second derivative at $z=0$, dividing by $N$ and taking the limit as $N\to\infty$, we obtain
\[ \frac{d^2}{dz^2}\mu(z)\Big|_{z=0}=\lim_{N\to\infty}\frac{E(S_N^2)}{N}. \]
In addition, it follows from [24, Theorem 2.4] that there exists $\sigma^2\ge0$ such that $\frac{S_N}{\sqrt N}\xrightarrow{\ d\ }\mathcal N(0,\sigma^2)$. Since our interest is in $S_N$ that satisfies the CLT, we assume from now on that $\sigma^2>0$. We also assume the following:

(B3) $\mu(\theta)>0$ for all $\theta\in(-\delta_0,\delta_0)$ (here $\delta_0$ is as in (4.8)).

Define $\Omega(\theta)=\log\mu(\theta)$ for $\theta\in(-\delta_0,\delta_0)$. Then $\Omega(0)=\log\mu(0)=0$ and $\Omega'(0)=\frac{\mu'(0)}{\mu(0)}=0$. Also, $\Omega''(0)=\frac{\mu''(0)\mu(0)-\mu'(0)^2}{\mu(0)^2}=\mu''(0)=\sigma^2>0$. Since $\Omega''$ is continuous, there exists $\delta_1\in(0,\delta_0)$ such that $\Omega$ is strictly convex on $(-\delta_1,\delta_1)$. Note that due to convexity, $\Omega'(-\delta_1)<0<\Omega'(\delta_1)$. In addition, when $\theta\neq0$, $\mu(\theta)>\mu(0)=1$ by convexity.

Next, we consider the Legendre transform $I$ of $\Omega$, given by
\[ I(a)=\sup_{\theta\in(-\delta_1,\delta_1)}[a\theta-\Omega(\theta)],\qquad a\in[0,\Omega'(\delta_1)), \]
which is itself a strictly convex function. Because $\Omega'$ is strictly increasing and continuous on $[0,\delta_1]$, for each $a\in[0,\Omega'(\delta_1))$ the equation $a-\Omega'(\theta)=0$ has a unique solution $\theta_a$, which depends continuously on $a$. Note that $I(a)\ge0$ for all $a$ and $I(a)=0\iff a=0$. Also, $I(a)$ is continuous because $I$ is convex and $I(0)=0$. In addition, $I(\Omega'(\delta_1))=\delta_1\Omega'(\delta_1)-\Omega(\delta_1)$.
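A minimal finite-state illustration (not taken from the text, but in the spirit of Section 3.5.2): let $x_n$ be the Markov chain on $\{1,2\}$ with transition probabilities $p(x,x)=p$ and $p(x,y)=1-p$ for $x\neq y$, and let $X_n=f(x_n)$ with $f(1)=1$, $f(2)=-1$. Then $\mathcal L_\theta$ is the $2\times2$ matrix $\big(p(x,y)e^{\theta f(y)}\big)_{x,y}$ and a direct computation gives
\[ \mu(\theta)=p\cosh\theta+\sqrt{p^2\sinh^2\theta+(1-p)^2}, \]
so $\mu(0)=1$ and $\mu'(0)=0$; for $p\in(0,1)$ one checks that $\Omega(\theta)=\log\mu(\theta)$ is strictly convex on all of $\mathbb R$, and for $p=\tfrac12$ this reduces to the i.i.d. coin-tossing case $\mu(\theta)=\cosh\theta$.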
Now we are in a position to prove a Large Deviation Principle for $S_N$ using Theorem 1.3. The following lemma shows that Theorem 1.3 applies in our case.

Lemma 4.2.1. Suppose (B1), (B2) and (B3) hold. Then, there exists $0<\delta_2\le\delta_1$ such that for $\theta\in(-\delta_2,\delta_2)$,
\[ \lim_{N\to\infty}\frac1N\log E(e^{\theta S_N})=\log\mu(\theta). \]

Proof. Because $\ell(\Pi_0v)>0$, there exist $\delta_2$ and $m>0$ such that for $\theta\in[-\delta_2,\delta_2]$, $\ell(\Pi_\theta v)>2m$. Because the spectral radius of $\Lambda_\theta$ is strictly smaller than $\mu(\theta)$, we have $\lim_{N\to\infty}\mu(\theta)^{-N}\ell(\Lambda^N_\theta v)=0$. Hence, there exists $N_0$ such that for $N>N_0$,
\[ m<\ell(\Pi_\theta v)+\mu(\theta)^{-N}\ell(\Lambda^N_\theta v)<3m. \]
Hence,
\[ \lim_{N\to\infty}\frac1N\ln\big[\ell(\Pi_\theta v)+\mu(\theta)^{-N}\ell(\Lambda^N_\theta v)\big]=0. \]
Now, for $\theta\in(-\delta_2,\delta_2)$ we can rewrite (4.9) as
\[ \frac1N\log E(e^{\theta S_N})=\log\mu(\theta)+\frac1N\log\big[\ell(\Pi_\theta v)+\mu(\theta)^{-N}\ell(\Lambda^N_\theta v)\big]. \]
This implies that
\[ \lim_{N\to\infty}\frac1N\log E(e^{\theta S_N})=\log\mu(\theta). \]

Combining this lemma with Theorem 1.3 and the analysis preceding it, we have the following LDP.

Theorem 4.2.2. Suppose (B1), (B2) and (B3) hold. Then, there exists $\delta_2\in(0,\delta_1]$ such that for all $a\in\big(0,\frac{\log\mu(\delta_2)}{\delta_2}\big)$,
\[ \lim_{N\to\infty}\frac1N\log\mathbb P(S_N\ge aN)=-I(a) \qquad (4.10) \]
where
\[ I(a)=\sup_{\theta\in(-\delta_2,\delta_2)}[a\theta-\log\mu(\theta)]=a\theta_a-\log\mu(\theta_a) \qquad (4.11) \]
and $\theta_a$ is the unique $\theta$ solving $(\log\mu(\theta))'=\frac{\mu'(\theta)}{\mu(\theta)}=a$.

Remark 4.2.1. The range of $a$ for which the LDP holds is constrained by the assumptions (B1), (B2) and (B3). We require a positive top eigenvalue $\mu(\theta)$ to exist, $\log\mu(\theta)$ to be strictly convex, and $\ell(\Pi_\theta v)>0$. The larger the range of $\theta$ for which these hold, the larger the range of $a$. In particular, if they hold for all $\theta\in\mathbb R$, then by convexity $B=\lim_{\delta\to\infty}\frac{\log\mu(\delta)}{\delta}$ exists as an extended real number, and the LDP holds for all $a\in(0,B)$.

Next, we compute higher order asymptotics of this LDP. To this end, we make two more assumptions about $\mathcal L_z$.

(B4) For all $\theta\in(-\delta_2,\delta_2)$ and all real numbers $t\neq0$, $\mathrm{sp}(\mathcal L_{\theta+it})\subset\{|z|<\mu(\theta)\}$.

(B5) There are positive real numbers $r_1,r_2,C,K$ and $N_0$ such that for all $\theta\in(-\delta_2,\delta_2)$, all $N>N_0$ and all $K<|t|<N^{r_1}$, $\|\mathcal L^N_{\theta+it}\|\le C\frac{\mu(\theta)^N}{N^{r_2}}$.

Remark 4.2.2. As in Remark 3.1.1, it follows that by slightly decreasing $r_1$ we can assume $r_2$ to be as large as required for large enough $N$.

Pick $a\in\big(0,\frac{\log\mu(\delta_2)}{\delta_2}\big)$. Then,
\[ E(f(S_N-aN))e^{a\theta N}=E\big(e^{\theta S_N}e^{-(S_N-aN)\theta}f(S_N-aN)\big)=\int\hat f_\theta(t)\,e^{-iatN}\,\ell(\mathcal L^N_{\theta+it}v)\,dt \]
where $f_\theta(x)=\frac{1}{2\pi}e^{-\theta x}f(x)$. Now define $\mathbb L_{\theta+it}=\frac{e^{-iat}}{\mu(\theta)}\mathcal L_{\theta+it}$. Then,
\[ E(f(S_N-aN))e^{a\theta N}=\mu(\theta)^N\int\hat f_\theta(t)\,\ell(\mathbb L^N_{\theta+it}v)\,dt. \]
From this we have
\[ E(f(S_N-aN))e^{[a\theta-\log\mu(\theta)]N}=\int\hat f_\theta(t)\,\ell(\mathbb L^N_{\theta+it}v)\,dt. \]
In particular,
\[ E(f(S_N-aN))e^{I(a)N}=\int\hat f_{\theta_a}(t)\,\ell(\mathbb L^N_{\theta_a+it}v)\,dt. \qquad (4.12) \]
Note that for $|\theta_a+it|<\delta_0$ the top eigenvalue of $\mathbb L_{\theta_a+it}$ is
\[ \nu(t)=\frac{e^{-iat}}{\mu(\theta_a)}\,\mu(\theta_a+it). \]
As a function of $t$, $\nu$ is analytic in a neighbourhood of $0$ by (4.8). Further,
\[ \nu(0)=1,\qquad\nu'(0)=-ia+i\frac{\mu'(\theta_a)}{\mu(\theta_a)}=0,\qquad\nu''(0)=a^2-\frac{\mu''(\theta_a)}{\mu(\theta_a)}=-\sigma_a^2 \]
with $\sigma_a>0$. Thus, there exists $\delta$ such that
\[ |\nu(t)|<e^{-\sigma_a^2t^2/4},\qquad|t|<\delta. \qquad (4.13) \]
We also notice that
\[ \lim_{N\to\infty}\frac{\ell(\Lambda^N_\theta v)}{\mu(\theta)^N}=0 \]
because the spectral radius of $\Lambda_\theta$ is strictly smaller than $\mu(\theta)$. Combining this with $E(e^{\theta S_N})=\mu(\theta)^N\ell(\Pi_\theta v)+\ell(\Lambda^N_\theta v)$, we conclude that for all $\theta$,
\[ \ell(\Pi_\theta v)=\lim_{N\to\infty}\frac{E(e^{\theta S_N})}{\mu(\theta)^N}. \]

The following lemma allows us to obtain asymptotics of (4.12). We note that it is analogous to Theorem 3.1.4, where asymptotics of $E(f(S_N-aN))$ for $f\in\mathcal F^{q+2}_{r+1}$ are discussed, and it can be proven using the ideas in the proof of Theorem 3.1.4. One just has to replace $\mathcal L_t$ there by $\mathbb L_{\theta_a+it}$ and introduce the corresponding changes.

Lemma 4.2.3. Suppose (B1) through (B5) hold. Let $r\in\mathbb N$. Then, there exists $\delta_2\in(0,\delta)$ such that for all $a\in\big(0,\frac{\log\mu(\delta_2)}{\delta_2}\big)$ there are polynomials $P_p(z)$ such that for $g\in\mathcal F^{q+1}_{r+1}$ with $q>\frac{r+1}{2r_1}$,
\[ \int\hat g(t)\,\ell(\mathbb L^N_{\theta_a+it}v)\,dt=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+1/2}}\int P_p(z)g(z)\,dz+C^{q+1}_{r+1}(g)\cdot o_{r,\theta_a}\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big) \]
where $\theta_a$ is as in (4.11).

Proof. We indicate how to estimate the LHS away from $0$. The rest of the proof, which contains the construction of the polynomials $P_p$, is identical to that of Theorem 3.1.5 with $it$ replaced by $\theta_a+it$.

Fix $\delta>0$ as in (4.13). By (B4), for $\delta\le|t|\le K$ there exists $c_0\in(0,1)$ such that $\|\mathbb L^n_{\theta_a+it}\|\le c_0^n$. Thus,
\[ \Big|\int_{\delta<|t|<K}\hat g(t)\,\ell(\mathbb L^n_{\theta_a+it}v)\,dt\Big|\le C\|g\|_1c_0^n. \]
Also, by (B5) and Remark 4.2.2, for $r_2>r_1+\frac{r+1}{2}$,
\[ \Big|\int_{K<|t|<n^{r_1}}\hat g(t)\,\ell(\mathbb L^n_{\theta_a+it}v)\,dt\Big|\le C\|g\|_1\int_{K<|t|<n^{r_1}}\|\mathbb L^n_{\theta_a+it}\|\,dt\le\frac{C\|g\|_1}{n^{r_2-r_1}}. \]
Finally, $q>\frac{r+1}{2r_1}$ implies
\[ \Big|\int_{|t|>n^{r_1}}\hat g(t)\,\ell(\mathbb L^n_{\theta_a+it}v)\,dt\Big|\le\int_{|t|>n^{r_1}}|\hat g(t)|\,dt\le\int_{|t|>n^{r_1}}\Big|\frac{\widehat{g^{(q)}}(t)}{t^q}\Big|\,dt\le\frac{\|\widehat{g^{(q)}}\|_1}{n^{r_1q}}=\|\widehat{g^{(q)}}\|_1\,o(n^{-(r+1)/2}). \qquad (4.14) \]
Therefore,
\[ \Big|\int_{|t|>\delta}\hat g(t)\,\ell(\mathbb L^n_{\theta_a+it}v)\,dt\Big|=o(n^{-(r+1)/2}). \qquad (4.15) \]
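Schematically (a summary of the estimates above, not an additional statement), the integral in the lemma is split as
\[ \int_{\mathbb R}=\int_{|t|\le\delta}+\int_{\delta<|t|\le K}+\int_{K<|t|<N^{r_1}}+\int_{|t|\ge N^{r_1}}, \]
where the first piece produces the polynomials $P_p$ via the expansion of the top eigenvalue around $t=0$ as in (4.13), the second is exponentially small by (B4), the third is $O(N^{r_1-r_2})$ by (B5), and the last is $o(N^{-(r+1)/2})$ by the smoothness of $g$ as in (4.14).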
Remark 4.2.3.

1. The proof is almost identical to the proof of Theorem 3.1.4 and hence the coefficients of the polynomials $P_p$ can be computed as shown in Section 3.3. In particular, they depend on exponential moments of $S_N$.

2. Since $\theta_a$ depends on $a$, the coefficients of the polynomials $P_p$ also depend on $a$.

As a direct consequence of Lemma 4.2.3 and equation (4.12), we have the following theorem.

Theorem 4.2.4. Suppose (B1) through (B5) hold. Let $r\in\mathbb N$. Then, for $a\in\big(0,\frac{\log\mu(\delta_2)}{\delta_2}\big)$ there exist $\theta_a\in(0,\delta_2)$ and polynomials $P_p(z)$ such that for $f\in\mathcal F^{q+2}_{r+1,\alpha}$ with $q>\frac{r+1}{2r_1}$ and $\alpha>\delta_2$,
\[ E(f(S_N-aN))e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{1}{N^{p+1/2}}\int P_p(z)f_{\theta_a}(z)\,dz+C^{q+2}_{r+1}(f_{\theta_a})\cdot o_{r,\theta_a}\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big) \]
where $f_\theta(x)=\frac{1}{2\pi}e^{-\theta x}f(x)$, and $I$ and $\theta_a$ are as in (4.11).

Remark 4.2.4. In particular, the theorem holds for all $f\in C^\infty_c(\mathbb R)$. This is the weak asymptotic expansion which gives us the required higher order asymptotics for (4.10), the LDP in Theorem 4.2.2.

Next, we replace (B5) by the following stronger assumption, which allows us to conclude the existence of strong expansions for the LDP. Compare this assumption with assumption (A5) in Chapter 3.

$(\widetilde{\mathrm{B5}})$ There are positive real numbers $r_1,r_2,r_3,C,K$ and $N_0$ such that for all $\theta\in(-\delta_2,\delta_2)$, all $N>N_0$ and all $|t|>K$, $\|\mathcal L^N_{\theta+it}\|\le C\frac{\mu(\theta)^N}{N^{r_2}|t|^{r_3}}$.

As in the case of (B5), we can assume $r_2$ and $r_3$ to be large after slightly reducing $r_1$. Therefore, we have the following theorem.

Theorem 4.2.5. Suppose (B1) through (B4) and $(\widetilde{\mathrm{B5}})$ hold. Let $r\in\mathbb N$. Then, there exists $0<\delta_2\le\delta$ such that $S_N$ admits a weak asymptotic expansion for the LDP in the range $\big(0,\frac{\log\mu(\delta_2)}{\delta_2}\big)$ for $f\in\mathcal F^1_{r+1,\alpha}$ with $\alpha>\delta_2$.

In particular, for all $a\in\big(0,\frac{\log\mu(\delta_2)}{\delta_2}\big)$ there exist constants $C_p(a)$ such that
\[ \mathbb P(S_N\ge aN)e^{I(a)N}=\sum_{p=0}^{\lfloor r/2\rfloor}\frac{C_p(a)}{N^{p+1/2}}+C_{r,\theta_a}\,o\Big(\frac{1}{N^{\frac{r+1}{2}}}\Big) \]
where
\[ C_p(a)=\frac{1}{2\pi}\int_0^\infty e^{-\theta_az}P_p(z)\,dz \]
for some polynomials $P_0(z),\dots,P_r(z)$ depending on $a$ and the unique $\theta_a\in(0,\delta_2)$ such that
\[ I(a)=\sup_{\theta\in(-\delta_2,\delta_2)}[a\theta-\log\mu(\theta)]=a\theta_a-\log\mu(\theta_a). \]

Proof. The proof of the first part is similar to that of Theorem 4.2.4. The only difference is the estimate (4.14). Since $f\in\mathcal F^1_{r+1,\alpha}$, we have $g=f_\theta\in\mathcal F^1_{r+1}$. So $t\hat g(t)=(-i)\widehat{g'}(t)$. WLOG assume $r_3>\frac{r+1}{2r_1}$. Then,
\[ \Big|\int_{|t|>n^{r_1}}\hat g(t)\,\ell(\mathbb L^n_{\theta_a+it}v)\,dt\Big|\le C\int_{|t|>n^{r_1}}|\hat g(t)|\,\|\mathbb L^n_{\theta_a+it}\|\,dt\le C\int_{|t|>n^{r_1}}\frac{|\widehat{g'}(t)|}{|t|^{1+r_3}}\,dt\le\frac{C\|g'\|_1}{n^{r_1r_3}}=\|g'\|_1\,o(n^{-(r+1)/2}). \]
Now, the existence of the strong expansion follows from the first part of the theorem and Theorem 4.1.4.

As in the i.i.d. case, $C_p(a)$ does not depend on $r$ because $\theta_a$ and $P_p$ do not. Also, there are two ways the coefficients $C_p(a)$ depend on $a$. First, note that $\theta_a$ depends on the choice of $a$. Also, from Section 3.3, we know exactly how the coefficients of $P_p$ depend on the derivatives of $\mu(z)$ and $\ell(\Pi_z(\cdot))$ at $\theta_a$, and thus on the exponential moments of $S_N$. Since this dependence of $C_p(a)$ on $a$ is explicit, one can compute these coefficients.
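For orientation (a reduction not spelled out in the text): in the i.i.d. setting of Section 4.1 one may take $\mathbb B=\mathbb C$, $v=1$, $\ell=\mathrm{id}$ and $\mathcal L_z$ to be multiplication by $E(e^{zX})$, so that (4.7) holds trivially, $\mu(z)=E(e^{zX})=e^{h(z)}$, $\Pi_z=\mathrm{id}$ and $\Lambda_z=0$. Then $I(a)$ in (4.11) is the Legendre transform of $h$, (B4) amounts to $|E(e^{(\theta+it)X})|<E(e^{\theta X})$ for $t\neq0$, i.e. to $Y_{X,\theta}$ being non-lattice, and (B5), $(\widetilde{\mathrm{B5}})$ are quantitative versions of the Diophantine conditions used in Section 4.1.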
4.3 An application to Markov Chains.

Take $x_n$ to be a time homogeneous Markov process on a compact connected manifold $\mathcal M$ with smooth transition density $p(x,y)$ which is bounded away from $0$, and $X_n=h(x_{n-1},x_n)$ for a smooth function $h:\mathcal M\times\mathcal M\to\mathbb R$. We assume that $h(x,y)$ cannot be written in the form
\[ h(x,y)=H(y)-H(x)+c(x,y) \qquad (4.16) \]
where $c(x,y)$ is piece-wise constant. (An equivalent condition is given in Lemma 3.5.1.) This is exactly the setting we worked in Section 3.5.3.1. We need the following lemma to establish (B1) through (B5).

Lemma 4.3.1. Let $K(x,y)$ be a smooth positive function on $\mathcal M\times\mathcal M$. Let $P$ be the operator on $L^\infty(\mathcal M)$ given by $Pu(x)=\int_{\mathcal M}K(x,y)u(y)\,dy$. Then $P$ has a simple leading eigenvalue $\lambda>0$ and the corresponding eigenfunction $g$ is positive and smooth.

Proof. From the Weierstrass theorem, $K(x,y)$ is a uniform limit of functions of the form $\sum_{r\le n}J_r(x)L_r(y)$. Therefore, $P$ can be approximated by finite rank operators. So $P$ is compact. Since $P$ is an operator which leaves the cone of positive functions invariant, by a direct application of Birkhoff theory (see [2]), $P$ has a leading eigenvalue $\lambda$ which is positive and simple. The corresponding eigenfunction $g$ is also positive.

Because $P$ is compact, there is $l\in(0,\lambda)$ such that $\mathrm{sp}_{L^\infty}(P)\cap\{|z|>l\}=\{\lambda\}$. Next, we consider $P$ acting on $C^1(\mathcal M)$. Observe that
\[ \frac{d}{dx}(Pu)(x)=\int_{\mathcal M}\frac{\partial K}{\partial x}(x,y)u(y)\,dy. \]
So $\|Pu\|_{C^1}\le C\|u\|_\infty$ for some $C$. Since $\|\cdot\|_\infty\le\|\cdot\|_{C^1}$, the unit ball with respect to $\|\cdot\|_{C^1}$ is relatively compact with respect to $\|\cdot\|_\infty$. Therefore the essential spectral radius is $0$ by [24, Lemma 2.2]. This gives us $\mathrm{sp}_{C^1}(P)\cap\{|z|>l\}\subseteq\{\lambda\}$. To see that equality holds, note that the constant function $1\in C^1(\mathcal M)$. By positivity of $P$,
\[ 1\ge\frac{g}{\sup g}\implies P^n1\ge\frac{P^ng}{\sup g}=\frac{\lambda^ng}{\sup g}\implies|\!|\!|P^n|\!|\!|\ge\|P^n1\|_{C^1}\ge\lambda^n\frac{\sup g}{\sup g}=\lambda^n, \]
where $|\!|\!|\cdot|\!|\!|$ is the operator norm of $P$ acting on $C^1(\mathcal M)$. Therefore, the spectral radius of $P$ on $C^1(\mathcal M)$ is at least $\lambda$. This establishes that $g\in C^1$. We can repeat the argument and show $g\in C^r$ for all $r\in\mathbb N$.

Take $\mathbb B=L^\infty(\mathcal M)$ and consider the family of integral operators
\[ (\mathcal L_zu)(x)=\int_{\mathcal M}p(x,y)e^{zh(x,y)}u(y)\,dy,\qquad z\in\mathbb C. \]
Let $\mu$ be the initial distribution of the Markov chain. Then, using the Markov property, we have $E_\mu[e^{zS_N}]=\mu(\mathcal L^N_z1)$. Now, we check conditions (B1) through (B5). Conditions (B1) and (B2) coincide with the conditions (A1) and (A2) in Chapter 3 and we verify them in Section 3.5.3.1. In particular, (B1) holds with $\delta=\infty$. Note that, for all $\theta$, $\mathcal L_\theta$ is of the form of $P$ in Lemma 4.3.1. Therefore, (B3) holds for all $\theta$. Take $\lambda(\theta)$ to be the top eigenvalue and $g_\theta$ to be the corresponding eigenfunction. Then $g_\theta$ is smooth.

To show (B4) and (B5) we define a new operator $Q_\theta$ as follows:
\[ (Q_\theta u)(x)=\frac{1}{\lambda(\theta)}\int_{\mathcal M}e^{\theta h(x,y)}p(x,y)\frac{g_\theta(y)}{g_\theta(x)}\,u(y)\,dy. \]
It is easy to see that
\[ p_\theta(x,y)=\frac{e^{\theta h(x,y)}p(x,y)}{g_\theta(x)\lambda(\theta)}\qquad\text{and}\qquad dm_\theta(y)=g_\theta(y)\,dy \]
define a new Markov chain $x^\theta_n$ with associated Markov operator $Q_\theta$. That is, $Q_\theta$ is a positive operator and
\[ Q_\theta1=\frac{1}{\lambda(\theta)}\int_{\mathcal M}e^{\theta h(x,y)}p(x,y)\frac{g_\theta(y)}{g_\theta(x)}\,dy=1 \]
because $g_\theta$ is the eigenfunction corresponding to the eigenvalue $\lambda(\theta)$ of $\mathcal L_\theta$. Now, we can repeat the argument in Section 3.5.3.1 to establish properties of the perturbed operator given by
\[ (Q_{\theta+it}u)(x)=\int_{\mathcal M}e^{ith(x,y)}p_\theta(x,y)\,u(y)\,dm_\theta(y). \]
Since (4.16) does not hold, we conclude that $\mathrm{sp}(Q_{\theta+it})\subset\{|z|<1\}$. Take $G_\theta$ to be the operator on $L^\infty(\mathcal M)$ that corresponds to multiplication by $g_\theta$. Then $\mathcal L_{\theta+it}=\lambda(\theta)\,G_\theta Q_{\theta+it}G_\theta^{-1}$. Therefore, $\mathrm{sp}(\mathcal L_{\theta+it})$ is $\mathrm{sp}(Q_{\theta+it})$ scaled by $\lambda(\theta)$. This implies $\mathrm{sp}(\mathcal L_{\theta+it})\subset\{|z|<\lambda(\theta)\}$ as required.

Since (4.16) does not hold, the asymptotic variance $\sigma^2_\theta$ of $X^\theta_n=h(x^\theta_{n-1},x^\theta_n)$ is positive. Taking $\gamma(\theta+it)$ to be the top eigenvalue of $Q_{\theta+it}$, we have $\lambda(\theta+it)=\lambda(\theta)\gamma(\theta+it)$. Thus,
\[ (\log\lambda(\theta))''=-\frac{d^2}{dt^2}\Big|_{t=0}\log\lambda(\theta+it)=-\frac{d^2}{dt^2}\Big|_{t=0}\log\gamma(\theta+it)=-\frac{\gamma''(\theta)}{\gamma(\theta)}+\Big(\frac{\gamma'(\theta)}{\gamma(\theta)}\Big)^2=-\gamma''(\theta)+\gamma'(\theta)^2 \]
(since $\gamma(\theta)=1$), where $\gamma'(\theta)$ and $\gamma''(\theta)$ denote the $t$–derivatives of $\gamma(\theta+it)$ at $t=0$. Put $S^\theta_N=X^\theta_1+\dots+X^\theta_N$. Since $E(e^{itS^\theta_N})=\int Q^N_{\theta+it}1\,d\mu$, from (3.37) we have that $\gamma'(\theta)^2-\gamma''(\theta)=\sigma^2_\theta$. Thus $(\log\lambda(\theta))''=\sigma^2_\theta>0$. Therefore, $\log\lambda(\theta)$ is a strictly convex function.
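In practice, $\lambda(\theta)$ and the rate function can be computed numerically by discretizing the integral operator. The sketch below is an illustration only; the kernel, the observable and the grid size are hypothetical choices, not taken from the text. It uses a Nyström-type discretization on $\mathcal M=S^1$ and reads off $I(a)=\sup_\theta[a\theta-\log\lambda(\theta)]$ on a grid of $\theta$'s.

\begin{verbatim}
# Discretize (L_theta u)(x) = \int p(x,y) e^{theta h(x,y)} u(y) dy on the circle,
# take the top eigenvalue lambda(theta), and Legendre-transform log lambda(theta).
import numpy as np

n = 200
x = np.arange(n) / n                          # grid on S^1 ~ [0,1)
dy = 1.0 / n
D = x[None, :] - x[:, None]                   # y - x on the grid
p = np.exp(np.cos(2 * np.pi * D))             # smooth positive kernel
p /= p.sum(axis=1, keepdims=True) * dy        # normalize rows: a transition density
h = np.tile(np.sin(2 * np.pi * x), (n, 1))    # observable h(x,y) = sin(2*pi*y)

def Omega(theta):
    L = p * np.exp(theta * h) * dy            # matrix approximation of L_theta
    return np.log(np.max(np.abs(np.linalg.eigvals(L))))

thetas = np.linspace(-3.0, 3.0, 301)
Om = np.array([Omega(t) for t in thetas])
for a in [0.1, 0.3, 0.5]:
    print(a, np.max(a * thetas - Om))         # I(a) = sup_theta [a*theta - Omega(theta)]
\end{verbatim}

Since the discretized $\mathcal L_0$ is (approximately) a stochastic matrix, $\Omega(0)\approx0$, matching $\lambda(0)=1$.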
Note that $\mathcal L_\theta=\lambda(\theta)\Pi_\theta+\Lambda_\theta$ where $\Pi_\theta$ is the projection onto the top eigenspace. From [27, Chapter III], $\Pi_\theta=g_\theta\otimes\varphi_\theta$ where $\varphi_\theta$ is the top eigenfunction of $Q^*_\theta$, the adjoint of $Q_\theta$. Because $Q^*_\theta$ is itself a positive compact operator acting on $(L^\infty)^*$ (the space of finitely additive finite signed measures), $\varphi_\theta$ is a finite positive measure. Hence, $\mu(\Pi_\theta1)=\varphi_\theta(1)\mu(g_\theta)>0$ for all $\theta$.

As a result, Lemma 4.2.1 holds with $\delta_2$ arbitrarily large, and hence Theorem 4.2.2 holds with $\delta_2$ arbitrarily large. So the rate function $I(a)$ in Theorem 4.2.2 is finite for $a\in(0,B)$ where $B=\lim_{\theta\to\infty}\frac{\log\lambda(\theta)}{\theta}$. We observe that $B<\infty$ because $h$ is bounded, i.e. $\frac{S_N}{N}\le\|h\|_\infty$ surely. In fact, we claim that $B=\lim_{N\to\infty}\frac{B_N}{N}$, where
\[ B_N=\sup_{x_0,\dots,x_N}\sum_{j=1}^Nh(x_{j-1},x_j) \]
(the supremum taken over all possible realizations of the Markov chain $x_n$).

First note that $B_N$ is subadditive. So $\lim_{N\to\infty}\frac{B_N}{N}$ exists and is equal to $\inf_N\frac{B_N}{N}$. Given $a>\lim_N\frac{B_N}{N}$, there exists $N_0$ such that for all $N>N_0$, $\frac{S_N}{N}\le\frac{B_N}{N}<a$. Therefore, $\mathbb P(S_N\ge aN)=0$ for all $N>N_0$ and hence $I(a)=\infty$. Next, given $a<\lim_N\frac{B_N}{N}$, we have $B_N>aN$ for all $N$. Fix $N$. Then, there exists a realization $x_0,x_1,\dots,x_N$ such that $aN<\sum_{j=1}^Nh(x_{j-1},x_j)\le B_N$. Since $h$ is uniformly continuous on $\mathcal M\times\mathcal M$, there exists $\delta>0$ such that by choosing $y_j$ from a ball of radius $\delta$ centred at $x_j$, i.e. $y_j\in B(x_j,\delta)$, we still have $aN<\sum_{j=1}^Nh(y_{j-1},y_j)$. We estimate the probability of choosing such a realization $y_0,y_1,\dots,y_N$ and obtain a lower bound for $\mathbb P(S_N\ge aN)$:
\[ \mathbb P(S_N\ge aN)\ge\int_{B(x_N,\delta)}\cdots\int_{B(x_1,\delta)}\int_{B(x_0,\delta)}p(y_{N-1},y_N)\cdots p(y_0,y_1)\,d\mu(y_0)\,dy_1\cdots dy_N\ge\mu(B(x_0,\delta))\Big(\min_{x,y\in\mathcal M}p(x,y)\,\mathrm{vol}(B_\delta)\Big)^N. \]
Therefore, $I(a)<\infty$ as required.

Also, because $g_\theta$ is smooth, we can repeat the argument in Section 3.5.3.1 to obtain (3.45) for $Q_{\theta+it}$. That is, there are $\epsilon_\theta>0$ and $r_\theta$ such that $\|Q^2_{\theta+it}\|\le1-\epsilon_\theta$ for all $|t|>r_\theta$. Therefore,
\[ \|\mathcal L^N_{\theta+it}\|=\lambda(\theta)^N\|G_\theta Q^N_{\theta+it}G_\theta^{-1}\|\le\lambda(\theta)^N\|G_\theta\|\,\|Q^N_{\theta+it}\|\,\|G_\theta^{-1}\|\le C\lambda(\theta)^N(1-\epsilon_\theta)^{\lfloor N/2\rfloor}. \]
This establishes (B5). Since the rate in (B5) is exponential and Theorem 4.2.2 holds for $(0,B)$, we conclude that for all $r\in\mathbb N$, these Markov chains admit weak expansions for large deviations of order $r$ in the range $(0,B)$ for $\mathcal F^3_{r+1,B_+}$, where $B_+=\infty$ if $B=\infty$ and $B_+>B$ if $B<\infty$.

We need a stronger assumption on $h$ to establish $(\widetilde{\mathrm{B5}})$. Suppose:
\[ \text{for all }x,y,\text{ the critical points of }z\mapsto h(x,z)+h(z,y)\text{ are non-degenerate.} \qquad (4.17) \]
Since the critical points of $z\mapsto h(x,z)+h(z,y)$ are non-degenerate, we can use the stationary phase asymptotics in [48, Chapter VIII.2] to obtain
\[ \Big|\int_{\mathcal M}e^{it(h(x,z)+h(z,y))}p(x,z)p(z,y)e^{\theta(h(x,z)+h(z,y))}\,dz\Big|\le\frac{M}{|t|^{d/2}} \]
where $M$ is a constant and $d$ is the dimension of $\mathcal M$. Therefore, $\|\mathcal L^2_{\theta+it}\|\le\frac{M}{|t|^{d/2}}$. Choose $K=(2M)^{2/d}$. Then for all $|t|>K$, $\|\mathcal L^2_{\theta+it}\|\le\frac12$ and hence
\[ \|\mathcal L^N_{\theta+it}\|\le\|\mathcal L^{N-2}_{\theta+it}\|\,\|\mathcal L^2_{\theta+it}\|\le\Big(\frac12\Big)^{\lfloor(N-2)/2\rfloor}\frac{M}{t^{d/2}},\qquad|t|>K. \]
By convexity, $\lambda(\theta)>1$. Thus,
\[ \|\mathcal L^N_{\theta+it}\|\le M\lambda(\theta)^N\Big(\frac12\Big)^{\lfloor(N-2)/2\rfloor}\frac{1}{t^{d/2}},\qquad|t|>K. \]
This establishes $(\widetilde{\mathrm{B5}})$.

In particular, when $h$ depends only on one variable, i.e. $h(x,y)=H(x)$ for some $H$, we have that $h(x,z)+h(z,y)=H(x)+H(z)$. Then the condition (4.17) reduces to the critical points of $H$ being non-degenerate.

Again, because Theorem 4.2.2 holds for all of $(0,B)$ and the rate in $(\widetilde{\mathrm{B5}})$ is exponential, we conclude that these strongly ergodic Markov chains admit strong expansions for large deviations of all orders in the range $(0,B)$.

Chapter A: Appendix

A.1 Convergence of X.

We need some background information. Given a piecewise smooth function $g:\mathbb R^d\to\mathbb R$ of compact support, its Siegel transform is the function on the space of lattices defined by
\[ S(g)(L)=\sum g(w), \]
the sum running over
w∈L\{0} We need an identity of Siegel, see ( [38, Section 3.7] or [46, Lecture XV]) saying that ∫ EL(S(g)) = g(w)dw. (A.1) Rd In particular, if B is a set in Rd with piecewise smooth boundary not containing 0 then PL(L ∩B 6= ∅) ≤ P(S(1B)(L) ≥ 1) ≤ EL(S(1B)) = Vol(B). (A.2) sin(2πχ(w)) Proof of Lemma 2.1.2. Let L+ = {w ∈ L : y(w) > 0}. Since is even y(w) it is enough to restrict the attention to w ∈ L+. Throughout the proof we fix two numbers ε > 0, τ < 1 such that ε (1−τ) 1. It is easy to see using (A.2) and Borel-Cantelli Lemma that for almost every lattice 128 L C, there exists C and β such that y(w) > . It follows that ∑ ‖w‖ β sin 2πχ(w) ∑ e−||x(w)|| 2 ≤ 2εC||w||βe−||w|| y(w) w∈L+: ||x(w)||≥||w||ε w∈L+ converges absolutely. Hence it suffices to establish the convergence of ∑ X̄ sin 2πχ(w):= e−||x(w)||2 . y(w) w∈L+: ||x(w)||≤||w||ε 0, ∫ ∫ ∫ ∞ |(fk) (x)| dx = |e−γxγ fk(x)| dx ≤ e−γx dx = Cγ,1 <∞ −2 because 0 ≤ fk ≤ 1. 135 Since |f ′k| ≤ 5 on [k, k + 1], 0 ≤ fk ≤ 1 and fk is increasing on [−2, k],∫ ∫ k+1 |((f ′k)γ) (x)|dx = ∫ |γe −γxfk(x) + e −γxf ′k(x)| dx −2 k+1 ( ) ≤ γe−γx∫ fk(∫x) + e −γx|f ′k(x)| dx −2 k k ∫ k+1 ≤ γ −γx ′ −γx −γx −2 ∫e dx+ fk(x) dx+ (γe + 5e ) dx−1 kk+1 ≤ 1 + (5 + γ)e−γx dx = Cγ,2 <∞ −2 Also, note that |xlf l −γxk(x)| ≤ x e for all x ∈ [−2, k + 1]. Hence, ∫ ∫ ∞ |xlf l −γxk(x)| dx ≤ x e dx = Jγ,l <∞ −2 Put Jr(γ) = max Jγ,l and Cγ(r) = max{Jr(γ), Cγ,1, Cγ,2}. Then, Cγ(r) is finite and 1≤l≤r depends only on γ and r. Now, we have the following, 1. C1r+1((fk)γ) ≤ Cγ(r) for all k. 1 1 2. Since, tan−1(kx) + converges pointwise to 1[0,∞)(x), it is easy to see that π 2 fk → 1[0,∞) pointwise. 3. Since for each p, e−γzPp(z)fk(z) converges pointwise to e −γzPp(z)1[0,∞)(z), e−γz|P (z)|1 −γz −γzp [−2,∞) is integrable and |e Pp(z)fk(z)| ≤ e |Pp(z)|1[−2,∞), we can apply the LDCT to conclude, ∫ ∫ ∞ ∫ ∞ Pp(z)g −γz −γz k (z) dz = e Pp(z)fk(z) dz → e Pp(z) dz. −2 0 136 Bibliography [1] Bahadur, R. R., Ranga Rao R.; On Deviations of the Sample Mean. Ann. Math. Statist. 31 (1960), no. 4, 1015-1027. [2] Birkhoff, G. Extensions of Jentzschs theorem. Trans. Amer. Math. Soc. 85 (1957), no. 1, 219–227. [3] Bhattacharya, R. N., Ranga Rao R.; Normal Approximation and Asymptotic Expansions, first edition, John Wiley and Sons, 1976, xiv+274 pp. [4] Breuillard, E. Distributions diophantiennes et theoreme limite local sur Rd. Probab. Theory Related Fields 132 (2005), no. 1, 39–73. [5] Butterley, O., Eslami, P. Exponential mixing for skew products with disconti- nuities. Trans. Amer. Math. Soc. 369 (2017), no. 2, 783–803. [6] Bougerol, P., Lacroix, J.; Products of random matrices with applications to Schrödinger operators, Progress in Probability and Statistics, first edition, Birkhäuser Basel, Boston, 1985, xi+284 pp. [7] Chaganty, N. R., Sethuraman, J., Strong Large Deviation and Local Limit The- orems, Ann. Probab. 21 (1993), no. 3, 1671-1690. [8] Chebyshev, P. L. Sur le développement des fonctions à une seule variable. Bull, de l’Acad. Imp. des Sci. de St. Petersbourg 3(1) (1860), 193-200. [9] Coelho, Z., Parry, W. Central limit asymptotics for shifts of finite type. Israel J. Math. 69 (1990), no. 2, 235–249. [10] Cramér, H. On the composition of elementary errors. Skand. Aktuarietidskr. 1 (1928), 13–74; 141–180. [11] Cramér, H.; Random variables and probability distributions. Cambridge Tracts in Mathematics no. 36, Cambridge, 1937, 122 pp. 137 [12] Dolgopyat, D. A Local Limit Theorem for sum of independent random vectors, Electronic J. Prob. 21 (2016) paper 39. [13] Dolgopyat, D. 
On mixing properties of compact group extensions of hyperbolic systems. Israel J. Math. 130 (2002), 157–205. [14] Dolgopyat D., Fayad B. Deviations of ergodic sums for toral translations: Con- vex bodies, GAFA 24 (2014) 85–115. [15] Dolgopyat D., Fayad B. Limit theorems for toral translations, Proc. Sympos. Pure Math 89 (2015) 227–277. [16] Dolgopyat, D., Fernando, K. An error term in the Central Limit Theorem for sums of discrete random variables. preprint. [17] Dembo, A., Zeitouni O.; Large Deviations Techniques and Applications, second edition, Springer–Verlag Berlin Heidelberg, 2010, XVI+396. [18] Eskin A., McMullen C. Mixing, counting, and equidistribution in Lie groups, Duke Math. J. 71 (1993) 181–209. [19] Esséen, C.–G. Fourier analysis of distribution functions. A mathematical study of the Laplace-Gaussian law, Acta Math. 77 (1945) 1–125. [20] Feller, W.; An introduction to probability theory and its applications Vol. II., Second edition, John Wiley & Sons, Inc., New York-London-Sydney, 1971, xxiv+669. [21] Fernando, K., Liverani, C. Edgeworth expansions for weakly dependent random variables. arXiv:1803.07667 [math.PR]. [22] Götze F., Hipp C. Asymptotic Expansions for sums of Weakly Dependent Ran- dom Vectors, Z. Wahrscheinlickeitstheorie verw., 64 (1983) 211-239. [23] Gnedenko B.V., Kolmogorov A.N.; Limit distributions for sums of independent random variables, Trans. K.L. Chung, Revised edition, Addison-Wesley, 1968, ix+264 pp. [24] Gouëzel, S. Limit theorems in dynamical systems using the spectral method. Hyperbolic dynamics, fluctuations and large deviations, Proc. Sympos. Pure Math., 89 (2015) 161–193, AMS, Providence, RI. [25] Guivarc'h, Y., Hardy J. Théorèmes limites pour une classe de châınes de Markov et applications aux difféomorphismes d 'Anosov, Annales de l'I.H.P. Probabilités et statistiques, 24 (1) (1988) 73-98. [26] Hall, P. Contributions of Rabi Bhattacharya to the Central Limit Theory and 138 Normal Approximation. In Manfred Denker & Edward C. Waymire (Eds.), Rabi N. Bhattacharya Selected Papers, (pp 3–13). Birkhäuser Basel, 2016. [27] Hennion, H., Hervé, L.; Limit Theorems for Markov Chains and Stochastic Properties of Dynamical Systems by Quasi-Compactness, Lecture Notes in Mathematics, first edition, Springer-Verlag, Berlin Heidelberg, 2001, viii+125 pp. [28] Hebbar, P., Nolen, J. The asymptotics of solutions to parabolic PDE with peri- odic coefficients, preprint. [29] Hervé, L., Pène, F. The Nagaev-Guivarc'h method via the Keller-Liverani the- orem, Bull. Soc. Math. France 138 (2010) no. 3, 415–489. [30] den Hollander, F.; Large Deviations, Fields Institute Monographs 14, American Mathematical Society, Providence, RI, 2000, x+142 pp. [31] Ibragimov, I. A., Linnik, Y. V.; Independent and stationary sequences of ran- dom variables. With a supplementary chapter by I. A. Ibragimov and V. V. Petrov. Translation from the Russian edited by J. F. C. Kingman. Wolters- Noordhoff Publishing, Groningen, 1971, 443 pp. [32] Joutard, C. Strong large deviations for arbitrary sequences of random variables, Ann. Inst. Stat. Math. 65 (2013) no. 1, 49-67. [33] Kato, T.; Perturbation theory for linear operators, Classics in Mathematics, Reprint of the 1980 edition, Springer-Verlag, Berlin, 1995, xxii+619. [34] Kesten H. Uniform distribution mod 1, part I: Ann. of Math. 71 (1960) 445–471, part II: Acta Arith. 7 (1961/1962) 355–380. [35] Kleinbock D. Y., Margulis G. A. Bounded orbits of nonquasiunipotent flows on homogeneous spaces, AMS Transl. 171 (1996) 141–172. [36] Kleinbock D. 
Y., Margulis G. A. Logarithm laws for flows on homogeneous spaces, Invent. Math. 138 (1999) 451–494. [37] Liverani, C. Decay of correlations for piecewise expanding maps. J. Statist. Phys. 78 (1995), no. 3-4, 1111–1129. [38] Marklof J. The n-point correlations between values of a linear form, Erg. Th., Dynam. Sys. 20 (2000) 1127–1172. [39] Marklof J., Strombergsson A. The distribution of free path lengths in the pe- riodic Lorentz gas and related lattice point problems, Ann. Math. 172 (2010) 1949–2033. 139 [40] Nagaev S. V. More Exact Statement of Limit Theorems for Homogeneous Markov Chain, Theory Probab. Appl., 6(1) (1961) 62–81. [41] Nagaev S. V. Some Limit Theorems for Stationary Markov Chains, Theory Probab. Appl., 2(4) (1959) 378–406. [42] Pène F. Mixing and decorrelation in infinite measure: the case of periodic Sinai billiard, arXiv:1706.04461v1 [math.DS]. [43] Rubin H., Sethuraman J. Probabilities of moderate deviations, Sankhya Ser. A, 27 (1965) 325–346. [44] Rubin H., Sethuraman J. Bayes risk efficiency, Sankhya Ser. A, 27 (1965) 347–356. [45] Shah N. A. Limit distributions of expanding translates of certain orbits on ho- mogeneous spaces, Proc. Indian Acad. Sci. Math. Sci. 106 (1996) 105–125. [46] Siegel C. L.; Lectures on the geometry of numbers, Springer, Berlin, 1989. x+160 pp. [47] Sprindzuk V. G.; Metric theory of Diophantine approximations, Scripta Series in Math. V. H. Winston & Sons, Washington, D.C. 1979. xiii+156 pp. [48] Stein E. M.; Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals, Princeton University Press, first edition, 1993. xiv+695 pp. 140