ABSTRACT Title of dissertation: DATA ASSIMILATION EXPERIMENTS WITH A SIMPLE COUPLED OCEAN-ATMOSPHERE MODEL Tamara Singleton, Doctor of Philosophy, 2011 Dissertation directed by: Professor Eugenia Kalnay Department of Atmospheric and Oceanic Sciences The coupled ocean-atmosphere system has instabilities that span time scales from a few minutes (e.g. cumulus convection) to years (e.g. El Ni~no). Fast time scales have stronger growth rates and within linear approximations used in data assimilation, they do not saturate and may distort the slower longer time-scale solution. Therefore, it is not clear whether a data assimilation focused on long-term variability should include smaller time scales. To study this problem, we perform sequential and variational data assimilation experiments with 3 coupled Lorenz (1963) models of di erent time scales, simulating a coupled ocean-atmosphere model. We aim to better understand the abilities of di erent data assimilation methods for coupled models and aid in the development of data assimilation systems for larger coupled ocean-atmosphere models such as a general circulation models. The dissertation provides an overview of the sequential and variational data assimilation methods, which includes Ensemble Kalman Filter (EnKF)-based meth- ods, a fully coupled 4-dimensional variational data assimilation (4D-Var), and an ECCO-like 4D-Var, which uses the initial ocean state and surface uxes as control variables. Assuming a perfect model and observing all model variables, EnKF-based algorithms, without a quasi-outer loop or model localization, experience lter diver- gence for long assimilation windows, but were stable for shorter windows. The EnKF analyses depend on the covariance in ation and number of ensemble members. We found that short assimilation windows require a smaller in ation than long assimi- lation windows. The fully coupled 4D-Var analyses provide a good estimate of the model state and depend on the amplitude of the background error covariance. When comparing the EnKF analyses with the 4D-Var analyses, we found that the lters with a quasi-outer loop and model localization are more accurate than the fully coupled 4D-Var analyses for short windows, but the fully coupled 4D-Var method outperforms the EnKFs for long windows. The ECCO-like 4D-Var improves the 4D-Var analyses which uses only the initial ocean state as control variables, but the fully coupled 4D-Var outperforms the ECCO-like 4D-Var and 4D-Var analyses. The data assimilation experiments o er insight on developing and advancing sequential and variational data assimilation systems for coupled models. DATA ASSIMILATION EXPERIMENTS WITH A SIMPLE COUPLED OCEAN-ATMOSPHERE MODEL by Tamara Singleton Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial ful llment of the requirements for the degree of Doctor of Philosophy 2011 Graduate Committee: Dr. Eugenia Kalnay, Chair Dr. William Dorland Dr. Brian Hunt Dr. Kayo Ide Dr. Konstantina Trivisa Dr. Shu-Chih Yang Dr. Ning Zeng, Dean?s Representative c Copyright by Tamara Singleton 2011 Acknowledgments I am tremendously grateful and blessed for all of the people who have sup- ported me throughout my graduate studies and dissertation work. I?d like to thank my husband, Olusegun Goyea, for his unconditional love and encouragement throughout my years at Tulane University and the University of Maryland. He has helped me through the toughest times of my career, from preparing for quali ers to the obstacles of Hurricane Katrina. I will be forever appreciative for everything you have done for me. I would also like to thank my family - my mother, sisters, and aunts. They are women of great strength and inspire me to pursue my educational and career goals. I would like to thank my advisor, Dr. Eugenia Kalnay, for giving me an invalu- able opportunity to work on challenging and interesting data assimilation problems. She has provided me with a solid foundation in data assimilation and o ered great words of wisdom regarding my educational, career, and personal endeavors. It has been a pleasure to work with and learn from such an admirable teacher. I would also like to thank my co-advisors, Drs. Shu-Chih Yang and Kayo Ide. I would like to thank Dr. Yang for her patience and assistance through the coding process of the data assimilation methods. I thank Dr. Ide for her attention to details and working with me throughout the disseration. I would like to acknowledge Dr. David Starr of the National Aeronautics Space Administration (NASA) Goddard Space Flight Center and the nancial support from the NASA graduate fellowship programs, Harriett Jenkins Pre-doctoral and ii Graduate Student Researchers program. Dr. Starr taught me fundamental concepts of atmospheric sciences, which required a certain degree of patience being that I am a mathematician. I am thankful for his guidance and support. I would also like to thank my graduate committee members, Dr. Hunt, Dr. Trivisa, Dr. Dorland, and Dr. Zeng. I thank Mrs.Alverda McCoy and the Applied Mathematics and Scienti c Com- puting (AMSC) program for welcoming me to campus and helping me continue my graduate studies after a di cult time with the remnants of Hurricane Katrina. I am grateful for your kindness. I would also like to acknowledge Dean Halperin and the College of Computer, Mathematical, and Natural Sciences (CMNS) for their support throughout my grad- uate work at the University of Maryland, College Park. It was an honor to work with CMNS on outreach and recruitment initiatives. Lastly, I thank God for seeing me through this wonderful experience. iii Table of Contents List of Tables vi List of Figures vii List of Figures ix List of Abbreviations ix 1 Introduction 1 1.1 Coupled Ocean-Atmosphere Data Assimilation . . . . . . . . . . . . . 1 1.2 Studies of Coupled Data Assimilation . . . . . . . . . . . . . . . . . . 2 1.3 Simple Coupled Ocean-Atmosphere Model . . . . . . . . . . . . . . . 7 1.4 Study Objectives and Approach . . . . . . . . . . . . . . . . . . . . . 12 1.5 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2 Overview of Data Assimilation Algorithms 15 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2 Local Ensemble Transform Kalman Filter (LETKF) . . . . . . . . . . 16 2.2.1 Extended Kalman Filter . . . . . . . . . . . . . . . . . . . . . 17 2.2.2 Ensemble Kalman Filters . . . . . . . . . . . . . . . . . . . . . 19 2.2.3 LETKF Formulation . . . . . . . . . . . . . . . . . . . . . . . 21 2.3 4-Dimensional LETKF . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.4 LETKF Quasi-Outer Loop (QOL) . . . . . . . . . . . . . . . . . . . . 31 2.5 Covariance In ation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 2.6 4D-Variational Data Assimilation . . . . . . . . . . . . . . . . . . . . 38 2.7 Comparisons of the two advanced methods, 4D-Var vs. EnKF . . . . 40 3 EnKF-based Data Assimilation Experiments using the Simple Cou- pled Ocean-Atmosphere Model 41 3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.2 Experiment 1: ETKF Data Assimilation Experiments . . . . . . . . . 45 3.3 Experiment 2: 4D-ETKF Data Assimilation Experiments . . . . . . . 54 3.4 Experiment 3: ETKF with a Quasi-Outer Loop (ETKF QOL) . . . . 57 3.5 Experiment 4: LETKF Data Assimilation Experiment . . . . . . . . . 60 3.6 Experiment 5: 4D-LETKF Data Assimilation Experiments . . . . . . 63 4 4D-Var Experiments with the Simple Coupled Ocean-Atmosphere Model 66 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 4.2 4D-Var Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.3 Tangent Linear and Adjoint Model . . . . . . . . . . . . . . . . . . . 69 4.4 Background error covariance estimation . . . . . . . . . . . . . . . . . 76 4.5 Quasi-static Variational Data Assimilation . . . . . . . . . . . . . . . 77 iv 4.6 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.7 4D-Var vs. EnKF-based Methods . . . . . . . . . . . . . . . . . . . . 82 5 Estimating the Circulation and Climate of the Ocean (ECCO) 4D- Var Data Assimilation System 85 5.1 About ECCO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 5.2 ECCO analyses and comparisons with other methods . . . . . . . . . 87 5.3 ECCO-like 4D-Var Data Assimilation System for the Simple Coupled Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 5.4 NCEP-like Flux Estimates . . . . . . . . . . . . . . . . . . . . . . . . 92 5.5 ECCO-like 4D-Var: Tangent Linear and Adjoint Models . . . . . . . 94 5.6 ECCO-like 4D-Var Experiments . . . . . . . . . . . . . . . . . . . . . 98 6 Conclusions and Future Work 103 Bibliography 110 Bibliography 116 v List of Tables 1.1 Parameters of the simple coupled ocean-atmosphere model . . . . . . 8 2.1 LETKF vs. 4D-LETKF . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.1 Parameters of the simple coupled ocean-atmosphere model . . . . . . 42 3.2 ETKF Experiment: Increasing in ation for 64 time-step assimilation - Ocean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 3.3 ETKF Experiment: Tuning multiplicative in ation - Extraropics . . . 51 3.4 ETKF Experiment: Tuning multiplicative in ation - Tropics . . . . . 51 3.5 ETKF Experiment: Tuning multiplicative in ation - Ocean . . . . . . 52 3.6 ETKF Experiment: Varying the number of ensemble members assim- ilating every 8 time steps . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.7 ETKF Experiment: Varying the number of ensemble members assim- ilating every 40 time steps . . . . . . . . . . . . . . . . . . . . . . . . 53 3.8 ETKF Experiment: Varying the number of ensemble members assim- ilating every 80 time steps . . . . . . . . . . . . . . . . . . . . . . . . 53 4.1 Verifying the adjoint model code . . . . . . . . . . . . . . . . . . . . 75 5.1 ECCO-like 4D-Var Experiment: Table of mean rms errors (no QSVA) 99 5.2 4D-Var Experiment with Fluxes(with QVA): Table of mean rms errors 101 vi List of Figures 1.1 Breeding experiments for a coupled model . . . . . . . . . . . . . . . 5 1.2 Solution trajectories of simple coupled ocean-atmosphere model . . . 9 1.3 x-component of the solution trajectories for the simple coupled ocean- atmosphere model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.4 Breeding experiment for the coupled ocean-atmosphere model . . . . 11 2.1 Description of LETKF Data Assimilation . . . . . . . . . . . . . . . . 28 2.2 4D-LETKF vs. LETKF . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.1 ETKF vs. LETKF . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.2 Description of EnKF-based methods . . . . . . . . . . . . . . . . . . 45 3.3 ETKF Experiment: x-solution trajectory and rms errors for the Ocean 46 3.4 ETKF Experiment: Mean RMS Errors vs. Varying Assimilation In- tervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.5 ETKF Experiment: X-component of the Ocean Solution Trajectory and RMS Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 3.6 ETKF with In ation vs. ETKF without In ation: Plot of Mean RMS Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.7 4D-ETKF with In ation vs. 4D-ETKF without In ation: Plot of Mean RMS Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.8 ETKF Experiment: 4D-ETKF vs. ETKF . . . . . . . . . . . . . . . 56 3.9 ETKF-QOL with In ation vs. ETKF-QOL without In ation: Plot of Mean RMS Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.10 ETKF-QOL vs. ETKF vs. 4D-ETKF Experiments: Mean RMS Errors 59 3.11 LETKF vs. ETKF-QOL vs. ETKF vs. 4D-ETKF Experiments: Mean RMS Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.12 LETKF for Longer Assimilation Windows . . . . . . . . . . . . . . . 62 3.13 4D-LETKF vs. LETKF vs ETKFs Experiments: Mean RMS Errors . 64 4.1 4D-Var: Verifying correctness of tangent linear model . . . . . . . . . 71 4.2 Verifying correctness of tangent linear model with the norm . . . . . 72 4.3 Schematic of Quasi-static Variational Assimilation . . . . . . . . . . . 78 4.4 4D-Var Assimilation with and without QVA . . . . . . . . . . . . . . 79 4.5 4D-Var Assimilation: Tuning Background Error Covariance . . . . . . 81 4.6 4D-Var vs. EnKFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 4.7 4D-Var vs. EnKFs for Longer Windows . . . . . . . . . . . . . . . . . 83 5.1 ECCO: A Comparison Study of Nine Ocean Analyses . . . . . . . . . 89 5.2 ECCO-like 4D-Var vs. 4D-Var . . . . . . . . . . . . . . . . . . . . . . 91 5.3 NCEP-like Flux Estimates . . . . . . . . . . . . . . . . . . . . . . . . 93 5.4 ECCO-like 4D-Var: Tangent Linear Model . . . . . . . . . . . . . . . 95 5.5 ECCO-like 4D-Var: Adjoint Model . . . . . . . . . . . . . . . . . . . 98 5.6 ECCO-like 4D-Var Flux Estimates . . . . . . . . . . . . . . . . . . . 100 vii 5.7 ECCO-like 4D-Var vs. 4D-Var vs. Fully Coupled 4D-Var . . . . . . . 102 viii List of Abbreviations ENSO El Ni~no Southern Oscillation EnKF Ensemble Kalman Filter ETKF Ensemble Transform Kalman Filter 4D-ETKF 4-dimensional Ensemble Transform Kalman Filter ETKF-QOL Ensemble Transform Kalman Filter with Quasi-Outer Loop LEKTF Local Ensemble Transform Kalman Filter 4D-Var 4D-Variational Data Assimilation ECCO Estimating the Circulation and Climate of the Ocean ix Chapter 1 Introduction 1.1 Coupled Ocean-Atmosphere Data Assimilation The coupled ocean-atmosphere system, consisting of subsystems of vastly dif- ferent spatial and temporal scales, is needed for numerical weather prediction beyond a few days, for understanding the Earth?s climate system, and for short-term climate predictions. An important phenomena that highlights the need for a coupled ocean- atmosphere data assimilation system is the El Ni~no Southern Oscillation (ENSO). ENSO is characterized by changes in the air surface pressure in the tropical Pa- ci c and sea surface temperature of the tropical Paci c Ocean. It is known to be associated with extreme weather phenomena of signi cant societal and economical impact such as droughts, oods, and hurricanes. Coupled ocean-atmosphere mod- els have been developed for ENSO prediction and forecasting (e.g. Kirtman et al. 1997; Latif et al. 1994; Wang et al. 2001). Currently, skill of ENSO predictions is limited to 6-12 months. Chen (2008) identi es four factors a ecting the skill of ENSO predictions. These include observing systems,, model errors, limits to pre- dictability, and available observational data. To address these factors, there is a need to improve in data assimilation systems for coupled ocean-atmosphere model. Chen (2008) acknowledges that current ENSO forecast models are initialized in an uncoupled way, which a ects the skill of forecasts. Coupled data assimilation can 1 be used to initialize ENSO forecast models to improve the forecasts (Yang et al., 2009). Data assimilation is an approach that produces an estimate of a dynamical model state through the combination of observations and short-range forecasts. It can be used to determine the initial conditions for forecast models. Coupled ocean- atmosphere data assimilation can be used for advancing and improving coupled models, numerical weather prediction, and seasonal and inter-annual predictions Major challenges of coupled ocean-atmosphere data assimilation include the di ering time and spatial scales of the atmospheric and oceanic systems and the vast range of growing instabilities of the system. This study investigates the performance of data assimilation schemes using a simple coupled ocean-atmosphere model of di erent time scales and amplitude. 1.2 Studies of Coupled Data Assimilation Ballabrera-Poy et al (2009) used an Ensemble Kalman Filter, a sequential data assimilation scheme, to investigate the ability of the Kalman lter to estimate the state of a non-linear dynamical system with two di erent spatial and temporal scales. The study adapted a chaotic system from the model of Lorenz and Emmanuel (1998) that has been extensively used for data assimilation schemes. The equations of the model are dXi dt = Xi 1(Xi+1 Xi 2) Xi + F +Gi dYj;i dt = cb(Yj 1;i Yj+2;i) cYj+1;i +Hi 2 Gi = h c b JX j=1 Yj;i Hi = h c b Xi The above equations describe a coupled model of two variables with di erent temporal and spatial scales. Parameters were speci ed such that the variables X has a large amplitude and slow/long temporal scale and Y has a small amplitude and fast/short temporal scale. Three data assimilation experiments were performed Experiment 1: all observations were assimilated Experiment 2: only X (large-scale variables) observations were assimilated Experiment 3: all X (large-scale variables) observations were assimilated and a few Y (small-scale variables) observations were assimilated They assessed the performance of the EnKF by measuring the root mean square (rms) of the di erence between the true model state and the estimated model state from the EnKF data assimilation. Experiments 1-3 showed that simultaneous assim- ilation of the fast and slow variables a ected the slow variables. Experiment 2, when only the large-scale variable (X) observations where assimilated, had the smallest rms errors for the X variable. Rms errors for the large-scale variable (X) increased when they used the EnKF to assimilate both the large-scale (X) and small-scale (Y) variables. The study therefore recommended that in a system with multiple scales, di erent initialization techniques should be used. They used a Newtonian relaxation method to assimilate the fast variables and the Ensemble Kalman Filter to assimilate the slow variables. 3 Pe~na and Kalnay (2004) suggest that coupled ocean-atmosphere data assimi- lations should isolate the slow modes so that the fast instabilities do not invalidate the slower important processes in the data assimilation forecasts. Pe~na and Kalnay (2004) developed a fast-slow system based on coupling 3 Lorenz (1963) models and conducted breeding experiments to isolate the fast and slow modes in coupled chaotic system of variable amplitudes and di erent time scales. Toth and Kalnay (1993) introduced the breeding method that identi es rapidly growing instabilities in the ow of a dynamical system. The steps of the breeding method are 1. Generate an initial random perturbation of a prescribed amplitude and specify a rescaling interval 2. Add the initial random perturbation to the initial state of a nonlinear model resulting in an initial perturbed model state 3. Integrate the nonlinear model from both the initial model state and the ini- tial perturbed model state forward in time to obtain a control state and a perturbed state 4. At the end of the rescaling interval, scale the di erence between the perturbed model state and the control model state to the size of the initial random perturbation 5. Add the rescaled di erence in step (4) to the control state to create the per- turbed state after the rescaling. 6. Repeat steps (2) - (5) 4 The rescaled di erences between the perturbed and the control model states are called bred vectors, which capture the fast-growing instabilities of a dynamical sys- tem akin to leading Lyapunov vectors. To study the instabilities of coupled systems, Pe~na and Kalnay (2004) devel- oped coupled models based on Lorenz (1963) equations that qualitatively describe (1) a weather with convection case, (2) an ENSO case, and (3) a triple-coupled system of a tropical ENSO atmosphere that is weakly coupled to an extratropical atmosphere and strongly coupled to an ENSO ocean. Using these versions of the coupled models, they performed breeding experiments using the bred vectors to study and understand the experiment results. They found that they could separate the fast and slow modes of the coupled systems by choosing the rescaling amplitude and frequency of the breeding method. Fig. 1.1 is the results of the breeding exper- Figure 1.1: Breeding experiments with the coupled ?weather convection? model (Pe~na and Kalnay,2004). iments for the coupled weather convection model. The coupled weather convection 5 model is described by the following ordinary di erential equations _x = (y x) c(SX + k1) _y = rx y xz + c(SY + k1) _z = xy bz + czZ _X = (Y X) c(x+ k1) _Y = rX Y SZ + c(y + k1) _Z = SXY bZ czz The lower-case letters (x,y,z)) denote the fast subsystem and the upper-case letters (X,Y,Z) denote the slow subsystem. The parameters = 10, b = 8=3, and r = 28 are the standard Lorenz (1963) model parameters, c is the coupling strength of the x and y components, and cz is the coupling strength of the z component, S is the spatial scale factor, is the temporal scale factor, and k1 is an "?uncentering"? parameter. To simulate weather convection there is a weak coupling between the fast and slow subsystems and no coupling of the z component. The amplitude of the slow system solution trajectory is chosen to be 10 times larger than the fast subsystem (i.e. S = 0:1) and the slow system is 10 time slower than the fast system (i.e. = 0:1). The left side of Fig. 1.1 corresponds to breeding experiments using a small perturbation and a short rescaling interval. The right side of Fig.1.1 corresponds to breeding experiments with a larger perturbation and a long rescaling interval. The panels of each gure corresponds to a plot of the total bred vectors growth rate, the bred vectors growth rate measured with the fast variables (x,y,z), the 6 bred vectors growth rate measured with the slow variables (X,Y,Z), the Lyapunov vectors growth rate, and the Singular vectors growth rate over a period respectively. The left panel shows that the growth rate of the fast variables dominates the total growth rate, which can also be seen in the growth rates of the Lyapunov vector. The right shows how the breeding experiment was able to isolate the slow modes of the system when choosing a large amplitude and long rescaling interval. This can be seen in rst panel, where the slow growth rates dominates the total growth rate. The right side also shows that the Lyapunov and Singular vector growth rates were unable to capture the slow modes of the system. Hence, Pena and Kalnay were able to capture the fast "convective" and slow "weather waves" modes of the coupled system. Corazza et al. (2002) showed that there is a relationship between the EnKF data assimilation method and breeding. Therefore, Pe~na and Kalnay suggested that for a coupled ocean-atmosphere data assimilation system, the interval between analyses should be chosen in such a way to allow the fast irrelevant oscillations to saturate and not overwhelm the slower growth rates of the coupled system. 1.3 Simple Coupled Ocean-Atmosphere Model Pe~na and Kalnay (2004) adapted the Lorenz (1963) model to develop a simple coupled ocean-atmosphere model. The equations of the coupled ocean-atmosphere model are _xe = (ye xe) ce(Sxt + k1) (1.1) _ye = rxe ye xeze + ce(Syt + k1) (1.2) 7 _ze = xeye bze (1.3) _xt = (yt xt) c(SX + k2) ce(Sxe + k1) (1.4) _yt = rxt yt xtzt + c(SY + k2) + ce(Sye + k1) (1.5) _zt = xtyt bzt + czZ (1.6) _X = (Y X) c(xt + k2) (1.7) _Y = rX Y SXZ + c(yt + k2) (1.8) _Z = SXY bZ czzt (1.9) This system is composed of a fast extratropical atmosphere weakly coupled to a tropical atmosphere, which is strongly coupled to the slow ocean as in ENSO. The extratropical atmosphere model state is given by (xe; ye; ze), the tropical atmosphere model state is (xt; yt; zt), and the ocean model state is given by (X; Y; Z). The parameters c,ce,cz are the coupling strengths, S is the spatial scale factor, denotes the temporal scale factor, and , b, and r are the standard Lorenz (1963) parameters. To specify a weak coupling between the atmospheres and a strong coupling between the tropical atmosphere and ocean, the model parameters are de ned in table 1.1. Table 1.1: Parameters of the simple coupled ocean-atmosphere model Parameters Description Values c,cz,ce coupling coe cients c,cz = 1; ceb = 0:08 S spatial scale factor S = 1 temporal scale factor = 0:1 , b, r Lorenz parameters = 10, b = 8=3, and r = 28 The solution trajectories of the coupled system were obtained by integrat- 8 ing the simple coupled ocean-atmosphere model with the 4-th order Runge-Kutta method with a time step of t = 0:01. Figure 1.2: Attractors of the coupled system. Fig.1.2 is a plot of the attractors of the coupled ocean-atmosphere system. The arrows indicate the strength of the coupling, where the extra-tropical atmo- sphere is weakly coupled to the tropical atmosphere, which is strongly coupled to the ocean. The extra-tropical atmosphere attractor maintains its classical Lorenz butter y shape because of its weak coupling to the tropical atmosphere, whereas the tropical atmosphere loses its shape due to its coupling to the ocean. The attractor for the ocean shows how the ocean is oscillating between a "normal" state (which 9 lasts 3-12 years) and an El Ni=tildeno state (which lasts about 1 year). Fig.1.3 Figure 1.3: Evolution of the x-component for the subsystems of the coupled system. is a plot of the x-component of the solution trajectories for each subsystem of the coupled model, i.e. xe for the ?extratropics?, xt for the tropics, and X for the ocean. It highlights the larger amplitude of the slow subsystem and the coupling strengths between each subsystem with the tropical atmosphere being more of a slave to the ocean. Pe~na and Kalnay (2004) performed breeding experiments using this coupled ocean-atmosphere model to isolate its fast and slow instabilities. Fig. 1.4 shows the results of the breeding experiments for the coupled ocean- 10 Figure 1.4: Breeding experiment for the coupled ocean-atmosphere model (Pe~na and Kalnay, 2004). atmosphere model. Breeding results were reported in terms of the total growth rate ( rst panel), the growth rates for each subsystem (second, third, and fourth panel), the Lyapunov vectors growth rate ( fth panel), and singular vector growth rate (last panel) of Fig. 1.4. When choosing small amplitudes and short rescaling intervals for breeding, the total growth rate is dominated by the extratropical atmosphere as shown on the left side of Fig. 1.4. When choosing larger amplitudes and longer rescaling intervals, breeding ltered out the fast oscillations and the slow ocean growth rate dominated the total growth rate as shown on the right side of Fig. 1.4. Motivated by the breeding experiments for the coupled ocean-atmosphere model and by Ballabrera et al. (2009) experiments with a coupled fast-slow model, this dissertation will discuss the performance of data assimilation methods for the simple coupled ocean-atmosphere system equations (1.1)-(1.9), when choosing short and longer assimilation intervals. 11 1.4 Study Objectives and Approach The goal of this study is to assess the ability of a sequential and variational data assimilation systems to estimate the state of the simple coupled atmosphere- model of di erent temporal scales (fast/slow) and variable amplitudes (small/large). We hope to better understand the abilities of di erent data assimilation meth- ods for coupled models of varying scales, which is important to short-term climate prediction. We also hope this study will aid in the development of data assimilation systems for larger and more complex coupled ocean-atmosphere models. To achieve the goal, we will answer the following questions Is it possible to carry out a comprehensive coupled data assimilation, where all the observations corresponding to multiple time scales are assimilated si- multaneously? Or is it better to perform data assimilations at di erent time intervals, allowing for faster \noisy" phenomena to saturate? Can two types of data assimilation methods (sequential and variational) be used and are the coupled assimilation results similar? Can a variational data assimilation method include the initial model state and its forcings to provide an accurate estimate of the model state and its forc- ings as in the ECCO (Estimating the Circulation and Climate of the Ocean) assimilation system? To address the study questions, sequential and variational data assimilation 12 experiments are performed using the simple coupled ocean-atmosphere model dis- cussed in section 1.3. Observing Systems Simulation Experiments (OSSEs) were performed for the data assimilation experiments. For OSSEs the true model state is assumed to be known and observations are simulated by adding random noise to the truth according to observational errors. To answer the study questions, several ensemble and variational data assimilation systems were developed: 1. Ensemble Transform Kalman Filter Data Assimilation 2. 4-Dimensional Ensemble Transform Kalman Filter Data Assimilation 3. Ensemble Transform Kalman Filter with a Quasi-Outer Loop Data Assimila- tion 4. Local Ensemble Transform Kalman Filter Data Assimilation 5. 4D-Variational Data Assimilation To address the last study question, a 4D-Var data assimilation system for the ocean component of the coupled system that included the initial ocean state and its forcings was developed. The design of this 4D-Var data assimilation system was adapted from the Estimating the Circulation and Climate of the Ocean (ECCO) data assimilation system. This study refers to this 4D-Var data assimilation system as ECCO 4D-Var. All data assimilation systems were compared to assess their ability to accu- rately estimate the state of the coupled system when using short and long assimila- tion windows. 13 All data assimilation experiments assume a perfect model. 1.5 Outline Chapter 2 provides an overview of the data assimilation algorithms. Chapter 3 discusses the EnKF-based data assimilation experiments for the coupled ocean- atmosphere system. These are perfect model simulations and we examine the data assimilation performance for the coupled system of di erent time scales and variable amplitude when varying the assimilation window length. Chapter 4 discusses the 4D- Var data assimilation experiments for the coupled system. These are perfect model simulations and we assess its performance when varying the assimilation window and tuning the error covariance. We then compare the EnKF-based and 4D-Var data assimilations. Chapter 5 introduces the ECCO 4D-Var data assimilation system. It includes a discussion of ECCO, the experiment design using the slow subsystem of the simple coupled model, and comparisons between ECCO 4D-Var and 4D-var. Chapter 6 summarizes our ndings and proposes future work. 14 Chapter 2 Overview of Data Assimilation Algorithms 2.1 Introduction Data assimilation seeks to estimate the state of a dynamical system, called an analysis, using observations, a short-range forecast called the background, error statistics describing the uncertainties in the background state and observations, and a numerical model describes the physical laws governing the dynamical system. Data assimilation is an iterative process that typically occurs in two steps (Hunt et al., 2007) 1. forecast step: model predicts the future state of the system 2. analysis step: produce a current state estimate of the system by statistically combining the observations and a prior short-range forecast or background state It is widely used in the elds of meteorology and oceanography (Ghil et al., 1991; Bertino et al., 2003; Ghil et al., 1997; Luong et al., 1998), an important component of numerical weather prediction and used operationally by numerous numerical weather prediction centers (Parrish and Derber, 1992; Rabieret al., 1997). Daley (1991) and Kalany (2003) provide a detailed discussion on data assimilation. 15 Data assimilation methods can be classi ed into two categories, sequential data assimilation and variational data assimilation. Sequential data assimilation involves evolving the model state sequentially and updating the estimated state each time an observation is available. Variational data assimilation involves minimizing a cost function that measures the mis t between the initial model state and the observations and background. A signi cant challenge of sequential data assimilation schemes (e.g. Kalman lters) is that the covariance matrices that describes the uncertainties in the observations and background of the state variables have very large dimensions for operational numerical models and the EnKF was introduced to address this challenge. A signi cant problem of the variational data assimilation methods is the assumption of a static or constant background error covariances (Courtier et al., 1998; Lorenc, 2003). We develop Ensemble Kalman Filter-based data assimilation systems, which are sequential data assimilation methods, for the simple coupled ocean-atmosphere model. We also develop 4-dimensional Variational data assimilation systems, which are variational data assimilation methods, for the same simple coupled ocean-atmosphere model. Sections 2.2-2.6 provide a discussion on the data assimilation algorithms used in this dissertation. 2.2 Local Ensemble Transform Kalman Filter (LETKF) The LETKF is based on the Kalman Filter (Kalman and Bucy, 1960). The Kalman Filter is a recursive approach that, in a linear setting, generates the op- 16 timal estimate (called an analysis) of a dynamical system (e.g. atmosphere) using observations and short-range forecasts (called the background state). The simplest generalizaton of the Kalman Filter to a nonlinear setting is the Extended Kalman Filter, described subsequently. 2.2.1 Extended Kalman Filter Assuming a perfect model, let xti be the unknown state of the dynamical system at time ti xbi be the background state with errors b i = x b i x t i at time ti yoi be a vector of observations at time ti with errors o i = y o i H(x t i). H is an operator that maps model variables from model space to observational space. Throughout this dissertation, bold lower-case variables denote vectors and bold upper-case text denote matrices. xti and x b i are m-dimensional vectors containing m elements and yoi is an p-dimensional vector containing p elements. The errors b i and oi are Gaussian random variables with mean 0 and covariance matrix P b and R respectively. The covariance matrices are de ned as background error covariance matrix - Pb = E[ b bT ] observation error covariance - R = E[ o oT ] where E[ ] denoted the expected value. Using the observations, yo, the background state xb, and the error statistics, Pb and R, the Kalman lter produces the best estimate of the dynamical system called the 17 analysis (xa) by xa = xb + K(yo H(xb)) (2.1) K is called the Kalman gain matrix or the weighting matrix. Note, we drop the time indices. The Kalman gain is de ned by K = PbHT (HPbHT + R) 1 (2.2) where H is the Jacobian of H. From (2.2) we see that K is dependent on the background and observation errors. Hence, the optimality of the analysis depends on the accuracy of these error statistics. The estimation of the background and observation error covariance remains a challenge for data assimilation. The forecast and analysis error covariance matrices are evolved with time i, i.e. Pbi = MiP a i 1M T i Pai = (I KiHi)P b i (2.3) where I denotes the identify matrix, Pa denotes the analysis error covariance matrix, M is the Jacobian of the nonlinear model also known as the tangent linear model, and MT is the transpose of the tangent linear model also known as the adjoint model. The Kalman lter is a sequential process that occurs in two steps: (1) a forecast step and (2) an analysis step. During the forecast step, we generate the background state, xbi , and background covariance matrix, P b i by integrating the nonlinear model to evolve the analysis state, xai 1, and its error covariance P b i 1. The analysis step involves expressing the analysis state as a weighted mean of the background state and observations. 18 Forecast Step: xbi = M(x a i 1) Pbi = MP a i 1M T Analysis Step: Ki = PbiH T (HPbiH T + R) 1 xai = x b i + K(yi H(x b i+1)) Pai = (I KiH)P b i 2.2.2 Ensemble Kalman Filters Typically, numerical models used in weather forecasting or climate predic- tions are high-dimensional or have a large number of model variables. Hence, the computation of the background and analysis error covariance are extremely expen- sive. Evensen (1994) proposed an alternative called the Ensemble Kalman Filter (EnKF), where matrix operations are performed in the low-dimensional ensemble space. For the EnKF, we begin with an ensemble of K initial analysis states given by n xa(k)i 1 : k = 1; 2; :::; K o . Each ensemble member is a m-dimensional vector of m elements. Integrating the nonlinear model, each ensemble member is evolved to generate an ensemble of background states n xb(k)i : k = 1; 2; :::; K o , i.e. xb(k)i = M(x a(k) i 1 ) The sample background error covariance is de ned as Pbi = 1 K 1 KX k=1 (xb(k)i x b i)(x b(k) i x b i) T 19 = 1 K 1 XbiX bT i where xb is the background ensemble mean given by xb = 1 K KX k=1 xb(k) Xb is the background ensemble perturbation matrix of dimension m K, whose kth column is Xb(k) = xb(k) xb, where m is the dimension of the model state. The ensemble size, K, restricts the rank of the background error covariance matrix (Pb) because of its dependence on the background perturbation matrix, Xb, whose rank is at most (K 1) (Hunt et al., 2007). For the EnKF, there are several ways to determine the analysis mean xa and its covariance Pa. There are two basic tech- niques, stochastic (or perturbed observations) EnKF and deterministic (or square root) EnKF. The stochastic EnKF adds random perturbations to observations to generate an ensemble of observations that are used to update the background en- semble state (Burgers et al., 1998; Evensen and van Leeuwen, 1996; Houtekamer and Mitchell, 1998). The steps for the stochastic EnKF can be summarized as Forecast Step xb(k)i = M(x a(k) i 1 ) Pbi = 1 K 1 KX k=1 (xb(k)i x b i)(x b(k) i x b i) T Analysis Step: Ki = PbiH T (HPbiH T + R) 1 xa(k)i = x b(k) i + Ki(y o(k) i H(x b(k) i )) 20 The deterministic EnKF assimilates the observations to update the analysis ensem- ble mean and the analysis ensemble perturbations are generated by transforming the background ensemble via a transform matrix (Whitaker and Hamill, 2002; Bishop et al., 2001; Anderson 2001; Ott et al., 2004). It can be summarized as Forecast Step: xb(k)i = M(x a(k) i 1 ) Pbi = 1 K 1 KX k=1 (xb(k)i x b i)(x b(k) i x b i) T Analysis Step: Ki = PbiH T (HPbiH T + R) 1 xa(k)i = xbi + Ki(y o i H( x b i)) Xa(k)i = TiX b(k) i xa(k)i = x a i + X a(k) i where T is a transform matrix speci ed by the type of square root EnKF. There are di erent types of square root EnKFs. This study will focus on the Local Ensemble Transform Kalman Filter (LETKF), a square root EnKF. 2.2.3 LETKF Formulation Hunt et al. (2007) introduces and provides a detailed derivation of the LETKF. We present a derivation of the LETKF based on Hunt et al.(2007). Recall the Extended Kalman lter equations with dropped time indices, xa = xb + K(yo Hxb) (2.4) 21 Pa = (I KH)Pb (2.5) where xa denotes the m-dimensional analysis state, xb denotes the m-dimensional background state, yo is the p-dimensional vector of observations, H is the linearza- tion of the operator H, I is the identity matrix, and Pb is the background error covariance matrix. K is the Kalman gain matrix given by K = PbHT (HPbHT + R) 1 (2.6) Recall from EnKF, n xa(k)i 1 : k = 1; 2; :::; K o represents the mean of an ensemble of K members of initial conditions and the ensemble spread is given by Pai 1 at time ti 1. Using the nonlinear model, each ensemble member is evolved to produce the background ensemble at time ti given by n xb(k)i : k = 1; 2; :::; K o , whose error co- variance is given by Pbi . The background state is taken to be the background mean given by xbi = 1 K KX k=1 xb(k)i The sample background error covariance matrix is Pbi = 1 K 1 KX k=1 (xb(k)i x b i)(x b(k) i x b i) T = 1 K 1 XbiX bT i where Xbi is the background ensemble perturbation matrix of dimension m K, whose kth column is xb(k)i x b i . The Kalman lter equations 2.4 - 2.6 provides the analysis state xa that minimizes the cost function J(x) = (x xb)T (Pb)(x xb) + [yo H(x)]T R 1 [yo H(x)] (2.7) 22 provided the observation operator H is linear. Thus for the LETKF, we seek the analysis mean, xa, that minimizes the cost function J(x) = (x xb)T (Pb)(x xb) + [yo H(x)]T R 1 [yo H(x)] (2.8) Note the use of the analysis mean and background mean for EnKFs to represent the analysis and background state respectively. The characteristics of the m m background error covariance matrix,Pb are its rank is K 1 < m, where the rank is the number of linearly independent columns ) it is a singular matrix it is a symmetric matrix, i.e. Pb = PbT and therefore one-to-one on its column space S, which is the space spanned by the background perturbations, Xb Hunt et al. (2007) minimize J in a subspaces of the model space, a low- dimensional space. (Pb) 1 is well-de ned since it is one-to-one on S. Because (Pb) 1 is well-de ned, J is well-de ned for (x xb) if it is also in S. Hunt et al. (2007) therefore performed the analysis on the low-dimensional subspace S. To begin the computation of the analysis in the subspace S, they chose a basis for S, which is a set of linearly independent vectors that can represent a vector in S as a linear combination. The dimension of this basis (K 1). Hunt et al. (2007) accomplishes this through a coordinate transformation, where the matrix of background perturbations, Xb, is represented as a linear transformation from a K-dimensional space ~S onto S. If a K-dimensional vector w is a vector in ~S with mean 0 and covariance 1K 1I, then the vector X bw is in S. Hence, x = xb + Xbw is 23 in S and has mean xb and covariance Pb. With x = xb + Xbw, the cost function on ~S is given by ~J(w) = (K 1)wTw + yo H( xb + Xbw) T R 1 yo H( xb + Xbw) (2.9) Hunt et al. (2007) show that if a vector w in ~S minimizes ~J , then xa = xb + Xbw minimizes the cost function J in the low-dimensional subspace S. Note, that H is a nonlinear operator. Hunt et al. (2007) handles H by linearizing about the background ensemble mean xb. This was accomplished by applying H to each member of the background ensemble, xb(k) : k = 1; 2; :::K to generate a new ensemble of background observation vectors yb(k) : k = 1; 2; :::; K , where yb(k) = H(xb(k)). The mean of the ensemble of background observation vectors is yb and the p k matrix of perturbations is Yb whose kth column is yb(k) yb. Thus, the linear approximation for H( xb + Xbw) in 2.9 is given by yb + Ybw and through substitution the resulting approximation to the cost function ~J is given by ~J (w) = (K 1)wTw + yo yb Ybw T R 1 yo yb Ybw Note that since in ~S, the background distribution has a mean 0 and background covariance ~Pb = 1(K 1)I. Recall the Kalman lter analysis equations 2.4, xa = xb + K(yo H xb) (2.10) Pa = (I KH)Pb (2.11) where K is given by K = PbHT (HPbHT + R) 1 (2.12) 24 Since wa, minimizes ~J in ~S, these equations become wa = ~Pa(Yb)TR 1(yo yb) (2.13) ~Pa = (K 1)I + (Yb)TR 1Yb 1 (2.14) in ~S. In model space, the analysis mean and covariance are xa = xb + Xb wa (2.15) Pa = Xb ~Pa(Xb)T (2.16) Hunt et al. (2007) utilize a symmetric square root to express the analysis ensemble perturbations, i.e. Xa = XbWa (2.17) Wa = [(K 1) ~Pa]1=2 (2.18) The LETKF uses the symmetric square root to ensure that the analysis ensemble has the correct sample mean and that Wa depends on ~Pa (Hunt et al., 2007). In summary, Hunt et al. (2007) were able to obtain an analysis in the low-dimensional space S, by performing analysis in ~S and through a linear transformation mapping it over to S. Note, the LETKF without localization computes the same transform matrix as the ETKF of Bishop et al., 2001, but uses a di erent notation. One of the important features of the LETKF is localization. There are three bene ts of localization (Hunt et al, 2007) 1. Improves Accuracy: Localization allows the local analyses to be represented by di erent linear combination of ensemble members in di erent regions of the 25 model grid. This means that the LETKF has the ability to correct uncertain- ties in larger spaces, beyond the K-dimensional space (Ott et al, 2004). 2. Removing Spurious Correlations: Using small ensembles result in the correlations between other grid points of the model grid, which causes obser- vations at one grid point to in uence the analysis at grid points further away. The LETKF could eliminate these correlations through a speci ed criteria. 3. Parallel Computing: The LETKF allows the analysis to be computed lo- cally, using only nearby observations surrounding the analysis grid point. The analysis at di erent grid points is computed independently, thereby allowing calculations to be done simultaneously throughout the model grid as in parallel computing. The steps of the LETKF can be summarized as follows: Globally: Forecast Step xb(k)i = M(x a(k) i 1 ) Locally: Analysis Step 1. Apply H to each background ensemble member xb(k) : k = 1; 2; :::K to ob- tain the background observation ensemble yb(k) : k = 1; 2; :::K and compute the mean to obtain yb = 1K PK k=1 y b(k). Compute Yb = yb(k) yb = H(xb(k)) H( xb) 26 2. Compute the matrix of background ensemble perturbations Xb by comput- ing the background mean xb = 1K PK k=1 x b(k) and subtracting it from each background ensemble member xb(k) : k = 1; 2; :::K . 3. Choose the observations to be used and compute the local analysis error co- variance for each grid point in ensemble space ~Pa = (K 1)I + YbTR 1Yb 1 4. Compute the K K matrix Wa Wa = h (K 1) ~Pa i1=2 5. Compute the K-dimensional vector wa to obtain the analysis mean in en- semble space and add to Wa to get the analysis ensemble in ensemble space wa(k) : k = 1; 2; :::; K , where wa = ~PaYbTR 1(yo yb 6. Multiply Xb by each wa(k) and add xb to obtain the analysis ensemble members xa(k) : k = 1; 2; :::; K at each analysis grid point. 7. The new global analyses is obtained by gathering all of the grid point analyses. A owchart describing the LETKF data assimilation is given in Fig. 2.1. 2.3 4-Dimensional LETKF Hunt et al. (2004) introduced the 4-dimensional ensemble Kalman lter (4DEnKF), which uses observations throughout an assimilation window instead of just using 27 Figure 2.1: Flowchart of LETKF Data Assimilation the observations at the analysis time as in the EnKF. Harlim and Hunt (2007) and Fertig et al. (2007) adapted the 4DEnKF to develop the 4-dimensional lo- cal ensemble transform Kalman lter (4D-LETKF). Harlim and Hunt (2007) used the 4D-LETKF to perform numerical experiments with a global circulation model called the Simpli ed-Parametrized primitivE Equation Dynamics (SPEEDY) model. They found that for shorter assimilation windows, the 4D-LETKF performance was comparable to the LETKF performance, but for longer assimilation windows, 4D-LETKF outperformed the LETKF. Fertig et al.(2007) used the 4D-LETKF to perform perfect model simulations with the Lorenz (1996) model. The study then compared the 4D-LETKF performance to the 4D-Var performance. They found that the 4D-LETKF and 4D-Var has comparable errors when the 4D-LETKF assimila- tion was performed frequently and when 4D-Var is performed over long assimilation windows. Harlim and Hunt (2007) and Fertig et al. (2007) provide a detailed derivation of the 4D-LETKF. We will summarize their derivation based on their studies. The 28 4D-LETKF generates an ensemble of analysis states whose mean, xa, estimates the true state, xt, of a dynamical system and error covariance re ects the uncertainty in the estimate. The LETKF uses a background ensemble n xb(k)i : k = 1; 2; :::; K o and noisy observations, fyol g at times ftl : l = 1; 2; :::; ig, to generate an analysis ensemble n xa(k)i : k = 1; 2; :::; K o . 4D-LETKF accomplishes this by minimizing the cost function J(x) = 1 2 (xi xbi) T (Pbi) 1(xi xbi) (2.19) + 1 2 iX l=1 [yol Hl(xl)] T R 1l [y o l Hl(xl)] where Pbi = 1 K 1X b i(X b i) T and Xbi = h xb(1)i x b i ; :::;x b(K) i x b i i . 4D-LETKF assumes that the observation errors at di erent times ftl : l = 1; 2; :::; ig are uncorrelated, so R is a block diagonal matrix. Let nl be the number of observations at time tl, so that each block matrix is a nl nl covariance matrix Rl = E[ ol ( o l )]. The minimization of the cost function (2.19) is done through a coordinate transfor- mation. Let w be a vector in a K-dimensional subspace ~S that doesn?t depend on time, then the vector xl = xbl + X b lw is the corresponding model state in the model space S. Linearizing the nonlinear observation vector Hl(xl) = Hl( xbl ) + X b lw Hl( x b l ) + Y b lw 29 where with ybl = 1 K PK k=K Hl(x b(k)) Ybl = h Hl(x b(1) l ) y b l ; :::; Hl(x b(K) l ) y b l i The cost function becomes J(w) = 1 2 (K 1)wTw (2.20) + 1 2 iX l=1 yol Hl( x b l ) Y b lw T (R) 1l yol Hl( x b l ) Y b lw The minimum of the cost function (2.20) is given by wa = ~Pa " iX l=1 Ybl (R) 1 l (y o l Hl( x b(tl))) # ~Pa = " (K 1)I + iX l=1 YbTl (R) 1 l Y b l # 1 The corresponding model state is the mean analysis state xa = xb + Xbwa The analysis ensemble is given by xa(k) = xa + XbW(k) where W = h (K 1) ~Pa i1=2 The analysis ensemble, xa(k) : k = 1; 2; :::; K , becomes the background ensemble, xb(k) : k = 1; 2; :::; K , for the next analysis time. Table 2.1 below describes the di erence between the 4D-LETKF and LETKF data assimilation schemes, which is expressed in the analysis mean in the ensemble space, wa, and the local analysis error covariance, ~Pa. 30 Table 2.1: Di erence between the 4D-LETKF and LETKF LETKF 4D-LETKF wa = ~Pa(Yb)TR 1(yo H( xb)) wa = ~Pa hPi l=1 Y b l (R) 1 l (y o l Hl( x b(tl))) i ~Pa = (K 1)I + (Yb)TR 1Yb 1 ~Pa = h (K 1)I + Pi l=1 Y bT l (R) 1 l Y b l i 1 Figure 2.2: 4D-LETKF vs. LETKF Fig. 2.2 summarizes key the di erence between the LETKF and 4D-LETKF data assimilation. LETKF determines the linear combination of forecasts that best ts the observations (grey circles) at the analysis time. LETKF determines the linear combination of forecasts that best ts the observations throughout an assimilation window. 2.4 LETKF Quasi-Outer Loop (QOL) The EnKFs assume that ensemble perturbations are normally or Gaussian distributed. Miller et al. (1994) and Evensen (1992) introduced the notion of ? l- 31 ter divergence?. Filter divergence occurs when the EnKF data assimilation scheme is unable to estimate the true state of a dynamical system due to the growth of nonlinear perturbations which result in perturbations with a non-Gaussian distri- bution. EnKFs can handle some nonlinearities because the full nonlinear model is integrated to generate the background ensemble members, which includes the satu- ration of nonlinear perturbations (Evensen, 1994, 1997). However, EnKFs perform better for shorter assimilation intervals than longer assimilation intervals because the ensemble perturbations grow nonlinearly and become non-Gaussian for longer windows (Fertig et al., 2007). Kalnay et al. (2007) suggests that the EnKFs need an outer loop (as in 4D-Var) to handle non-linearities for longer assimilation intervals. Studies have shown that for incremental 4D-Var, the outer loop improves the model state of a nonlinear system (Courtier et al., 1994, Rabier et al, 2000, Anderson et al., 2005). Yang and Kalnay (2010) developed the quasi-outer loop (QOL) LETKF data assimilation system. Recall the equations for the LETKF Forecast Step 1. Generate the background ensemble from an initial analysis ensemble by inte- grating the nonlinear model M xb(k)i = M(x a(k) i 1 ) for k=1,2,...,K where K is the total number of ensemble members. Analysis Step 32 1. Compute the matrix of background perturbations at the analysis time ti (Xbi) Xb(k)i = x b(k) i x b i 2. Compute the matrix of background perturbations in observation space (Ybi ) Ybi = H(x b i) H(xbi) 3. Compute the analysis error covariance in ensemble space ~Pai = (K 1)I + YbTi R 1Ybi 1 4. Compute the analysis ensemble perturbations in model space Xa = XbTi ~P a i X b i 5. Compute the analysis mean in ensemble space wai = ~P a i Y bT i R 1(yo H(xbi) 6. Compute the analysis weight perturbation matrix Wai = h (K 1) ~Pai i1=2 7. Compute the mean analysis xai = X b i w a i + x b i 8. Compute the analysis ensemble perturbations Xai = X b iW a i 33 The LETKF utilizes the weights, wai , to linearly combine the background en- semble model states to estimate the true model state of the dynamical system at the analysis time ti. The weights at the analysis time can be used throughout the assim- ilation window (Kalnay and Yang, 2010; Yang and Kalnay, 2011). Therefore, using the observations at the analysis time, the LETKF can employ a no-cost smoother to improve the analysis model state at the beginning of an assimilation window (Kalnay et al., 2007b, Yang et al., 2009b, and Yang 2010). The no-cost smoother is applied by using the weights at the end of an assimilation window, wai , at the beginning of an assimilation window, ti 1, to express the analysis mean as ~xai 1 = x a i 1 + X a i 1 w a i (2.21) The analysis weight perturbation matrix at the analysis time, Wai , is used to express the analysis perturbations at the beginning of the assimilation window as ~Xai 1 = X a i 1W a i (2.22) ~xai 1 and ~X a i 1 are the smooth analysis mean and smooth analysis perturba- tions respectively. Kalnay and Yang (2010) shows that because of the use of the observations at the analysis time ti, the smooth analysis mean is more accurate than the analysis mean of the LETKF. It can also be shown that M(~xai 1) = x a i . Let L = @M@ xa be the tangent linear model operator, then M(~xai 1) = M( x a i 1 + X a i 1 w a i ) = M( xai 1) + LX a i 1 w a i = xbi + X a i w a i 34 = xai and M( ~Xai 1) = LX a i 1W a i = Xai With equations (2.21) and (2.22), Yang and Kalnay (2011) developed a quasi- outer loop for the LETKF that improves the ensemble mean and nonlinear evolution of the ensemble perturbations. The algorithm for the LETKF QOL is as follows for an assimilation window [ti 1; ti]. Let j denote the iteration. 1. j=0, perform the LETKF to obtain the analysis weights, wai;j at the end of the assimilation window ti. wai;j = ~P a i Y bT i R 1 i (y o i Hi(xbi) 2. Use the analysis weights, wai;j, to compute the smooth analysis mean and analysis ensemble perturbations ~xai 1;j+1 = x a i 1;j + X a i 1;j w a i;j 3. Update the background ensemble mean trajectory by integrating the nonlinear model from the smooth analysis mean xbi = M( x a i 1;j+1) 4. Compute the matrix of background perturbations by adding small random normally distributed perturbations,Ei, to the updated 35 Xbi;j+1 = X a i;j + Ei;j+1 The random perturbations are needed to prevent Xbi;j+1 from being the same as Xai;j 5. Compute and save the root mean square of the di erences between the obser- vations and forecast (RMSOMF) at j + 1. This will be used as a criterion for performing an outer loop. RMSOMF (j + 1) = [yoi Hix] (R 1 i ) [y o i Hix] 6. Compute the criteria to determine if another iteration of the outer loop is required RMSOMF (j) RMSOMF (j+1) RMSOMF (j) > where 0:01. If the quotient is greater than and the number of iterations is are not greater than two, then the outer loop is repeated. The criterion is used to ensure that the observations are only used when there is additional information to gain. Note, that in the Running in Place (RIP) algorithm both the mean and the perturbations are update using the no cost smoother weights. 2.5 Covariance In ation Kalman ltering data assimilation schemes may underestimate the uncertainty in its model state estimate that is re ected in the forecast error covariance due to 36 model errors and/or strong nonlinearities. When this happens, the observations may not be given enough weight during the analysis because of overcon dence in the background state estimate (Hunt et al., 2007). In time, this will cause the analyses to separate from the true model state. Two approaches are used to deal with this ? lter divergence?: (1) multiplicative covariance in ation and (2) additive covariance in ation. With multiplicative covariance in ation, we multiply by a factor the background error covariance, 1 + Pb, i.e. Pb ! (1 + )Pb (2.23) is a tunable parameter that is less than 1, i.e. << 1. This increases or in ates the background error covariance so that more weight can be placed on the obser- vations during the local analysis. Whitaker and Hamill (2002) and Anderson and Anderson (1999) found that multiplicative in ation improved the state estimate of a dynamical system. With additive covariance in ation, the background or analy- sis error covariance are in ated by adding random perturbations to each ensemble member. Ott et al. (2004) instead added a multiple of the identity matrix to the analysis error covariance, i.e. Pa + K I where is the trace of the analysis error covariance matrix in order to give more weight to the nonleading eigenvectors (Miyoshi, 2011). 37 2.6 4D-Variational Data Assimilation 4D-Var is an extension of the 3-dimensional variational data assimilation (3D- Var). Let?s assume the errors in the background state are Gaussian distributed with mean xb and covariance Pb, then the probability density function (pdf) for the model state x with respect to the background state xb is given by p(x) / exp 1 2 (x xb)TB 1(x xb) Let?s also assume that the errors of the observations are Gaussian distributed with mean Hxb and covariance R, then the conditional probability of the observa- tions (yo) given the model state (x) is given by p(yjx) / exp 1 2 (x Hxb)T (R) 1(x Hxb) Using Bayes theorem, p(xjy) / p(yjx)p(x), we can express the probability density of the model state given the observations can be written as p(xjy) / exp 1 2 (x xb)T (B) 1(x xb) (y Hxb)T (R) 1(y Hxb) (2.24) The variational data assimilation methods maximize the probability (2.24) or equivalently estimate the model state that best ts the observations. This is accomplished by minimizing the cost function J(x) = 1 2 (x xb)T (B) 1(x xb) + 1 2 (y Hxb)TR 1(y Hxb) (2.25) 38 The cost function measures the mis t between the model state and the obser- vations. 3D-Var estimates the state of the system by minimizing this cost function. It uses observations available at the analysis time. 4D-Var is an extension of 3D-Var by using observations throughout an assimilation window [ti; tN ]. The 4D-Var cost function is given by J(x0) = 1 2 (x0 xb)T (B) 1(x0 xb) + 1 2 NX i=0 (y Hixbi) TR 1i (yi Hix b i) (2.26) where, x0 is the initial model state, xi is the model state determined by in- tegrating the nonlinear model M , i.e. xi = M(x0), and the background error co- variance matrix, B, is assumed to be constant in time, homogeneous in space, and isotropic . 4D-Var minimizes this cost function and uses the initial model state as the control variables. The minimum of the cost function J is computed by computing its gradient and solving rJ = 0, i.e. r(J) = B 1(x0 xb) + NX i=1 LT [ti; t0]HTi R 1 i (Hi(xi) y 0 i ) = 0 (2.27) Minimization algorithms (e.g. conjugate gradient or quasi-Newton methods) are used to compute the minimum of the 4D-Var cost function. LT is called the adjoint model operator given by LT = @M @x T Computing the gradient requires the backward integration of the adjoint model. The adjoint model is the transpose of the tangent linear model, L given by L = @M @x 39 The tangent linear model approximates the evolution of perturbations in a nonlinear model state. 4-D Var is used in atmospheric sciences, meteorology, and physical oceanog- raphy. It is applied at European Center for Medium-Range Weather Forecasting in weather forecasting (Rabier et al., 2000; Mahfouf and Rabier, 2000;Klinker et al., 2000) and in several other countries (France, United Kingdom, Japan, and Canada). Some 4D-Var approximations are also applied in operational oceanography (Weaver et al., 2003; Vialard et al., 2003) to initialize ocean circulation models. 2.7 Comparisons of the two advanced methods, 4D-Var vs. EnKF There are advantages and disadvantages for both the 4D-Var and EnKF-based data assimilation techniques. The most important advantage of 4D-Var is its ability to assimilate asynchronous observations. Major disadvantages of 4D-Var is the strong constraint of a perfect model and the need to developthe tangent linear and adjoint models. Advantages of the EnKF data assimilation include its easy implementation, that it is generally model independent, it doesn?t require the development or inte- gration of tangent linear or adjoint models, and it provides an automatic estima- tion of ow-dependent error covariances for the analysis and background. A major drawback of the EnKF data assimilation schemes are the possible de ciences in the estimation of the background error covariance because of the low dimension of the ensemble. 40 Chapter 3 EnKF-based Data Assimilation Experiments using the Simple Coupled Ocean-Atmosphere Model Summary. This section introduces ve EnKF-based data assimilation systems for the simple coupled ocean-atmosphere model and discusses experiments used to study the per- formances of the data assimilation systems in the presence of multiple time scales and variable amplitudes. The EnKF-based algorithms performed well for short assimilation windows and were able to track regime changes in the model solution through growth in root-mean-square errors. Some EnKF-based data assimilation systems experienced l- ter divergence for long assimilation windows and thus require very large multiplicative in ation. We found that EnKF-based data assimilation systems with model localization (similar to variable localization) or with quasi-outer loops performed better for long as- similation windows. 3.1 Overview The equations of the simple "coupled ocean-atmosphere" model are given by _xe = (ye xe) ce(Sxt + k1) _ye = rxe ye xeze + ce(Syt + k1) _ze = xeye bze _xt = (yt xt) c(SX + k2) ce(Sxe + k1) 41 _yt = rxt yt xtzt + c(SY + k2) + ce(Sye + k1) (3.1) _zt = xtyt bzt + czZ _X = (Y X) c(xt + k2) _Y = rX Y SXZ + c(yt + k2) _Z = SXY bZ czzt This system is composed of a fast extratropical atmosphere weakly coupled to a fast tropical atmosphere, which is strongly coupled to the slow ocean as in ENSO. The fast variables are the extratropical atmosphere model (xe; ye; ze) and the tropical atmosphere model state (xt; yt; zt). The slow variables are the ocean model state is given by (X;Y; Z). Table 3.1: Parameters of the simple coupled ocean-atmosphere model Parameters Description Values c,cz,ce coupling coe cients c,cz = 1; ceb = 0:08 S spatial scale factor S = 1 temporal scale factor = 0:1 , b, r Lorenz parameters = 10, b = 8=3, and r = 28 k1,k2 uncentering parameters k1 = 10, k2 = 11 The speci cations of the model parameters are given in Table 3.1. For all data assimilation experiments, we are assuming a perfect model. The EnKF-based data assimilation systems developed for the coupled ocean- atmosphere model can be classi ed as Ensemble Transform Kalman Filters and Local Ensemble Transform Kalman Filters. Because the dimension of our model is small, we do not employ spatial localization, but subsystem localization. The primary di erence between our Ensemble Transform Kalman Filters and the Local Ensemble Transform Kalman Filters is that the Ensemble Transform Kalman Filters 42 assimlates observations simultaneously to update the ensemble mean, whereas the Local Ensemble Transform Kalman Filters assimilates the atmospheric and oceanic observations separately to update the ensemble mean. Figure 3.1: Illustrates the di erence between the ETKF and LETKF The Ensemble Transform Kalman Filters we used were the 1. Ensemble Transform Kalman Filter (ETKF) - The ETKF assimilates the com- plete system, i.e. the fast and slow variables of the coupled model simulta- neously using observations available at the analysis time. It has the same equations as the LETKF but with no localization. 2. 4D-ETKF - The 4D-ETKF assimilates the fast and slow variables of the cou- pled system simultaneously, but uses observations throughout the assimilation 43 window. 3. ETKF-QOL - The ETKF with a quasi-outer loop (Kalnay and Yang, 2010; Yang and Kalnay 2011) assimilates the fast and slow variables of the coupled system simultaneously using an outer loop that uses a no-cost smoother to re-center ensemble perturbations around a more accurate ensemble mean. The Local Ensemble Transform Kalman Filters we used were 1. LETKF - The LETKF assimilates the fast and slow variables of the coupled systems separately using model localization (similar to variable localization, Kang et al., 2011) 2. 4D-LETKF - The 4D-LETKF assimilates the fast variables separately with an ETKF and the slow variables with a 4D-ETKF using oceanic observations throughout an assimilation window. Fig. 3.2 summarizes the Ensemble Kalman Filter-based methods. The data assimilation experiments were performed observing all variables over a range of data assimilation intervals. A reference simulation, which created the true model state, is performed by integrating the coupled system for a long integration period using the 4th-order Runge- Kutta method with a time step of t = 0:01. Observations were generated by adding random errors to the model state. The sim- ulated observations have a standard error of p 2, which means the observation error covariance is given by R = 2I as in Miller et al. (1994). We generated observa- tions every 8 time steps of a simulation. The performance of the data assimilation 44 Figure 3.2: Description of EnKF-based methods experiments was assessed by computing the root-mean-square (rms) error, which measures the di erence between the analysis (i.e. estimate of the model state) and the true model state. The EnKF-based data assimilation experiments will be compared with the 4D-Var data assimilation experiments in chapter 4. 3.2 Experiment 1: ETKF Data Assimilation Experiments We performed ETKF data assimilation experiments with the coupled ocean- atmosphere system. These experiments include varying the data assimilation interval tuning the multiplicative in ation 45 varying the number of ensemble members We implemented the ETKFs at full rank. That is, the number of ensemble members equaled the dimension of the model state of nine, except for the experiment of varying the number of ensemble members. Atmospheric and oceanic states were assimilated simultaneously using observations available at the analysis time. All observations were assimilated every 8n time steps for n = 1; 2; :::; 10. Varying assimilation windows. ETKF assimilations were able to track regime changes in the model solutions through the growth of rms errors. Regime changes in the model solutions occurs when values of the x and y model state variables changes signs. Figure 3.3: ETKF Experiment: Plot of x-solution trajectory for the ocean (top panel) and rms errors (bottom panel) showing a tendency to increasing rms errors before and during regime changes (Evans et al., 2004) 46 The top panel of Fig. 3.3 is a plot of the x-solution trajectory for the ocean, the truth (blue) and the analysis mean (green) when assimilating every 24 time steps. The bottom panel is a plot of the rms errors for the analysis. It shows a growth in the rms errors right before regime changes, even for the slow ocean model (Evans et al., 2004). Figure 3.4: ETKF Experiment: Mean rms errors vs. varying assimilation intervals Fig. 3.4 summarizes the results of the ETKF assimilation with no in ation experiment when assimilating all observations and varying the assimilation inter- val. The y-axis is the mean rms errors and the x-axis is the assimilation interval in time-steps. The blue line corresponds to the mean rms errors for the extratrop- ical atmosphere, the pink line corresponds to the mean rms errors for the tropical atmosphere, and the brown line corresponds to the mean rms errors for the ocean. 47 The black dashed line is the observation standard error of p 2. The strong coupling between the ocean and tropical atmosphere manifests itself in the rms errors, with the tropical atmosphere being a "slave" to the ocean. The large amplitude of the ocean solution trajectory a ects the magnitude of the mean rms errors for the ocean when the assimilation fails, which in turn a ects the magnitude of the mean rms errors for the tropical atmosphere. The small rms errors show that the assimilation was able to provide an accurate estimate of the true model state for each subsystem for short assimilation intervals. However, as the assimilation intervals increased, the rms errors for each subsystem grew. For example, we observed a signi cant increase in the rms errors for the ocean when assimilating every 40 and 64 time steps. Fig. Figure 3.5: ETKF Experiment: Plot of x-component of the ocean so- lution trajectory (top panel) and rms errors (bottom panel). Note the errors with in ation (green) are much smaller than the errors without in ation (red) 48 3.5 is a plot of the x-component of the solution trajectory for the ocean subsystem (top panel) and the rms errors (bottom panel) for each analysis cycle in time steps when assimilating every 64 time steps. We are not using any in ation for these ETKF experiments. When there is no in ation, we see that the analysis (red line) decouples from the true model state (blue line) right before a regime change at about the 175th time step. When we increased the multiplicative in ation from 1.00 to 1.30, the analysis (green line) was able to track the true solution trajectory. Thus, increasing the in ation drastically decreased the mean rms errors of the assimilation. Table 3.2: ETKF Experiment: Increasing the multiplicative covariance in ation for the 64 time-step assimilation decreases the mean rms errors Multiplicative In ation Extratropics Tropics Ocean 1.0 2.47 5.87 20.90 1.30 1.29 0.47 0.58 Table 3.2 shows the reduction in mean rms errors for each subsystem when increasing the in ation for the 64 time step assimilation window. We therefore performed tuning in ation experiments to identify the optimal in ation for each assimilation window. Tuning multiplicative in ation, . Tuning experiments were performed to identify the optimal in ation for each assimilation window of our ETKF experiments and to assess the performance of the ETKF data assimilation for longer assimilation 49 intervals. Figure 3.6: ETKF with in ation vs. ETKF without in ation: Plot of mean rms errors for the extratropical atmosphere (top left), the tropical atmosphere (top right), and the ocean (bottom). Note the errors with in ation (solid line) are much smaller than the errors without in ation (dashed line) for longer windows Fig. 3.6 displays the mean rms errors for the ETKF with and without in ation data assimilations for varying assimilation windows. The dashed line corresponds to an ETKF data assimilation without in ation and the solid line correponds to an ETKF data assimilation with in ation. Mean rms errors were plotted for the extratropics (top left), the tropics (top right), and the ocean (bottom). The black dashed line denotes the observation error. Fig. 3.6 shows that tuning the in ation drastically improves the performance of the ETKF assimilation for longer assimilation windows. This is seen in the solid line for each subsytem falling below the observation error. 50 Tables 3.3 - 3.5 shows that longer assimilation intervals, when the perturba- tions become nonlinear, require larger multiplicative in ation for the analysis to be close to the truth. This holds for each subsystem of the coupled model when assimilating simultaneously. Table 3.3: ETKF Experiment: Table of mean rms errors for the Extratropical At- mosphere while tuning the multiplicative covariance in ation. Number of ensemble members = 9, Observational Error = p 2, Minimum errors bolded. Assimilation Interval = 1:00 = 1:10 = 1:20 = 1:30 = 1:40 = 1:50 = 1:60 = 1:70 = 1:80 = 1:90 8 0:30 0.36 0.45 0.52 0.59 0.64 0.68 0.72 0.75 0.79 40 2.66 1.12 0:79 0.80 0.80 0.81 0.85 0.86 0.88 0.83 80 2.54 1.81 1.94 1.58 1.38 1.41 1.31 1:24 1.38 1.41 120 2.93 2.17 1.84 1.55 1.63 1.50 1.67 1:43 1.34 1.32 160 3.63 2.12 2.39 2.10 1.50 1.48 1.31 1.42 1:29 1.58 Table 3.4: ETKF Experiment: Same as table 3.3 but for the Tropical Atmosphere Assimilation Interval = 1:00 = 1:10 = 1:20 = 1:30 = 1:40 = 1:50 = 1:60 = 1:70 = 1:80 = 1:90 8 0:06 0.23 0.34 0.44 0.52 0.57 0.62 0.67 0.70 0.74 40 3.15 0.39 0:27 0.33 0.40 0.44 0.46 0.52 0.53 0.57 80 6.36 1.46 1.13 1.02 0.62 0:58 0.64 0.73 0.64 0.68 120 4.47 3.21 1.95 0.61 0.81 0:60 0.61 0.61 0.64 0.66 160 2.87 2.96 3.49 3.07 1.11 0.79 0:66 1.23 0.73 1.36 Varying number of ensemble members . These experiments assess the per- formance of the ETKF data assimilation when varying the number of ensemble 51 Table 3.5: ETKF Experiment: Same as table 3.3 but for the Ocean Assimilation Interval = 1:00 = 1:10 = 1:20 = 1:30 = 1:40 = 1:50 1:60 1:70 1:80 1:90 8 0:15 0.32 0.47 0.59 0.68 0.75 0.81 0.86 0.91 0.94 40 12.17 0.61 0:44 0.49 0.54 0.62 0.67 0.69 0.76 0.70 80 24.13 3.80 2.90 2.24 0.75 0:70 0.79 0.99 0.90 0.82 120 17.60 11.92 4.69 0.88 1.58 0:74 0.83 0.79 0.83 0.83 160 8.81 8.53 12.21 7.66 2.00 1.02 0:93 2.29 0.97 2.73 members, K. For assimilation intervals of 8, 40, and 80 time steps and an optimal multiplicative in ation, we looked at the mean rms errors versus the number of en- semble members for each subsystem of the coupled model. Table 3.6 shows that for 8 time steps, 6 ensemble members (less than full rank) give the best results. Table 3.6: ETKF Experiment: Table of mean rms errors for the while varying the number of ensemble members. Assimilation Window = 8 time-steps, Optimal multiplicative in ation and Observational Error = p 2 K Extratropics Tropics Ocean 3 1.85 3.60 6.20 6 0.33 0.20 0.28 9 0.35 0.23 0.32 12 0.38 0.25 0.32 52 Table 3.7: ETKF Experiment: As table 3.7 but with an Assimilation Window = 40 time-steps. K Extratropics Tropics Ocean 3 5.10 6.65 24.79 6 1.34 0.48 0.66 9 1.13 0.39 0.61 12 0.83 0.23 0.36 Table 3.8: ETKF Experiment: As table 3.6 but with an Assimilation Window = 80 time-steps. K Extratropics Tropics Ocean 3 5.40 8.12 36.80 6 2.40 1.66 5.50 9 1.81 1.47 3.28 12 1.70 0.70 0.95 Table 3.6 - 3.8 shows that the performance of the ETKF data assimilation improves with increasing the number of ensemble members for the extratropical atmosphere, but beyond 6 ensemble members the results remain about the same. For assimilation windows longer than 8 time steps, increasing the number of ensemble 53 members reduces the error. For larger models, the number of ensemble members, K, is less than the number of model variables, M, i.e. K << M , so that we use 9 ensemble members in the rest of the experiments to make it computationally feasible. The ETKF experiments show that we are able to successfully assimilate all of the observations corresponding to multiple time scales simultaneously for shorter assimilation intervals. A large multiplicative in ation is required for assimilating all of the observations for longer assimilation intervals. The ETKF data assimilation was a ected by changes in regimes by a growth in rms errors. The performance of the ETKF is also dependent on the number of ensemble members and the size of the multiplicative in ation. We will compare this experiment with the 4D-ETKF that assimilates all the observations later to determine if it was able to provide a more accurate estimate of the model state. 3.3 Experiment 2: 4D-ETKF Data Assimilation Experiments We performed 4D-ETKF data assimilation experiments with the coupled ocean- atmosphere system. Atmospheric and oceanic observations are available every 8 time steps. The formulation of the 4D-ETKF data assimilation is the 4D-LETKF data assimilation (Hunt et al., 2004, 2007), but without spatial localization. We used 9 ensemble members and performed assimilations with and without in ation for each assimilation window. For these experiments, the atmospheric and oceanic states were assimilated simultaneously and we varied the assimilation windows from 8 to 54 80 time steps. Instead of using only the observations at the analysis time, the 4D- ETKF uses all of the observations within the assimilation interval. It selects the linear combination of the ensemble forecasts that best ts the observations through- out an assimilation window. Fig. 3.7 plots the mean rms errors of the 4D-ETKFs Figure 3.7: 4D-ETKF with in ation vs. 4D-ETKF without in ation: Plot of mean rms errors for the extratropical atmosphere (top left), the tropical atmosphere (top right), and the ocean (bottom). Note the er- rors with in ation (solid line) are much smaller than the errors without in ation (dashed line) for longer windows with and without in ation for each assimilation interval. Fig. 3.7 shows the mean rms errors for the extratropical atmosphere (top left), the tropical atmosphere (top right), and the ocean (bottom). The solid line corresponds to the 4D-ETKFs with in ation and the dashed line corresponds to the 4D-ETKFs without in ation. The black dashed line denotes the observation error. We see that for each subsystem, 55 there is a growth in the mean rms errors which shows that the assimilation without in ation fails for longer assimilation intervals. This is due to the growth of nonlinear perturbations as we lengthen the assimilation window. We have to tune in ation to account for the lter underestimating the uncertainty in the background state. For longer assimilation windows, the 4D-ETKF analysis tends to place less weight on the observations within the assimilation intervals. The dashed lines (4D-ETKFs with optimal in ation) for each subsystem shows that by tuning the multiplicative in ation, we improved the performance of the 4D-ETKF for longer assimilation windows. Figure 3.8: 4D-ETKF vs. ETKF: Plot of the mean rms errors vs. as- similation intervals with optimal in ation We compared the performance of the ETKF (solid line) to the performance of the 4D-ETKF (dashed line) in Fig. 3.8. We see that the 4D-ETKF outperforms the 56 ETKF for short and long assimilation intervals for each subsystem. Both data assimilations required an increase in multiplicative in ation to improve the analysis for long assimilation intervals. The 4D-ETKF bene ted from not using only the observations available at the analysis time, but using the observations throughout the assimlation window and has an excellent performance even for long assimilation windows. We saw that both the ETKF and 4D-ETKF assimilations experienced lter divergence for longer assimilations when optimal in ation was not applied. Typ- ically, ETKFs perform better for shorter assimilation windows because for longer assimilation windows the perturbations become non-Gaussian. Yang and Kalnay (2011) developed the Quasi-Outer Loop (QOL) within an ETKF to deal with the issue of nonlinearities. The QOL improves the mean model trajectory by using the weights of the ETKF valid at the analysis time to correct the mean initial analysis at the beginning of the assimilation window (Yang and Kalnay, 2011). We applied the QOL to our ETKF to improve the mean analysis for longer assimilation windows. 3.4 Experiment 3: ETKF with a Quasi-Outer Loop (ETKF QOL) We performed ETKF QOL data assimilations with the simple coupled ocean- atmosphere system to check whether the QOL would be able to overcome the issues of the ETKF without localization. We applied the ETKF QOL with and without in ation using an ensemble of 9 members. Fig. 3.9 is a plot of the ETKF-QOL mean rms errors for each subsystem of 57 Figure 3.9: ETKF-QOL with in ation vs. ETKF-QOL without in ation: Plot of mean rms errors for the extratropical atmosphere (top left), the tropical atmosphere (top right), and the ocean (bottom). Note the er- rors with in ation (solid line) are much smaller than the errors without in ation (dashed line) for longer windows the coupled model, extratropics (top left), tropics (top right), and ocean (bottom). The solid line corresponds to the ETKF-QOL with in ation and the dashed line corresponds to the ETKF-QOL without in ation. The black dashed line denotes the observation error. The ETKF-QOL was able to handle the nonlinearties for longer assimlation windows for the extratropics when in ation is not applied. However, the amplitude of the ocean a ected the performance of the ETKF-QOL with no in ation for longer windows and hence the performance of the ETKF-QOL for the tropics. We therefore used optimal in ation to improve the analysis for all three subsystems (dashed line) or longer assimilation windows. We then compared the performance of 58 the ETKF-QOL to the ETKF and 4D-ETKF assimilations with optimal in ation. Figure 3.10: ETKF QOL vs. ETKF vs. 4D-ETKF: Plots comparing the mean rms errors for all EnKF-based data assimilation experiments using optimal in ation for the extratropics (top left), the tropics (top right), and the ocean (bottom) Fig. 3.10 displays plots comparing the mean rms errors for all ETKF (solid-square line), 4D-ETKF (dashed-square line), ETKF QOL (solid line) for the extratropics (top left), tropics (top right), and ocean (bottom). For each subsystem, the ETKF- QOL outperforms the ETKF for short and long assimilation windows. For longer windows, the ETKF QOL competes with the 4D-ETKF which uses more observa- tions. ETKF QOL bene ts from being able to handle the growing non-Gaussian perturbations by recentering the analysis over a more accurate area. The ETKFs experienced two common problems for longer assimilation win- dows 59 1. sampling errors in the background error covariance 2. underestimation in the uncertainty in the state estimate We used two brute force approaches to address the issues, increasing the number of ensemble members and increasing the multiplicative in ation. The ETKF-QOL was able to improve the ETKF analysis, which used only the observations at the anal- ysis time. The ETKF-QOL analysis compared with the 4D-ETKF analysis for the tropics and ocean which used observations throughout an assimilation window. Our next experiments applied subsystem localization to the ETKF data assimilations. 3.5 Experiment 4: LETKF Data Assimilation Experiment We performed LETKF with optimal in ation data assimilation experiments with the simple coupled ocean-atmosphere system. The atmospheric and oceanic states were assimilated separately (subsystem localization). The assimilation for the fast subsystems was performed every 8 time steps, but the assimilation intervals for the ocean were varied (i.e. every 8 to 80 time steps). For our LETKF data assimila- tion system, we used an ensemble of 9 members. The extratropical atmosphere and tropical atmosphere are weakly coupled, while the tropical atmosphere is strongly coupled to the ocean. Fig. 3.11 compares the performance of the LETKF and ETKF data assimi- lation schemes for the extratropical atmosphere (top left), the tropical atmosphere (top right), and the ocean (bottom). The solid-square line denotes the ETKF as- similations, the dashed-square line the 4D-ETKF assimilations, the solid line the 60 Figure 3.11: LETKF vs. ETKF QOL vs. ETKF vs. 4D-ETKF: Plots comparing the mean rms errors using optimal in ation for the extrat- ropics (top left), the tropics (top right), and the ocean (bottom) ETKF-QOL, and the dashed-circle the LETKF. The black dashed line is the ob- servation error. Fig. 3.11 shows that the LETKF outperforms the ETKFs for the extratropcial atmosphere. The LETKF for the extratropical atmosphere bene ts from assimilating the atmospheres separately from the ocean and assimilating fre- quently (every 8 time-steps). While the LETKF improves the ETKF analysis for the tropical atmosphere and ocean, it doesn?t outperform the 4D-ETKF or ETKF- QOL. We then extended the assimilation window beyond 80 time steps. Fig. 3.12 compares the LETKF to the 4D-ETKF and ETKF-QOL for longer assimilation windows, i.e. 88, 96, 120, 160, 200, and 240 time steps. It shows that the LETKF ultimately outperforms the 4D-ETKF and ETKF-QOL for longer windows. Assim- 61 Figure 3.12: LETKF vs. ETKF QOL vs. 4D-ETKF: Plots comparing the mean rms errors for longer assimilation windows using optimal in a- tion for the extratropics (top left), the tropics (top right), and the ocean (bottom) ilating the atmospheres every 8 time steps prevents the growth of nonlinear per- turbations, which a ects the ocean assimilation since the tropical atmosphere and ocean are strongly coupled. The LETKF experiments shows that while possibly computationally more expensive, assimilating the subsystems separately is better for longer assimilation intervals. The LETKF works better also because it simulates ?variable localization? (Kang et al., 2011) by reducing spurious correlations due to sampling errors in the covariance matrix. 62 3.6 Experiment 5: 4D-LETKF Data Assimilation Experiments We performed a 4D-LETKF data assimilation experiments using the coupled ocean-atmosphere system. For this experiment, the atmospheric and oceanic states were assimilated separately (model localization). Localization was performed by assimilating separately the fast subsystems every 8 time steps with the ETKF, but varying the assimilation intervals for the ocean with the 4D-ETKF, i.e. every 8 to 80 time steps. For our LETKF data assimilation system, we used an ensemble of 9 members. We also used an optimal in ation. Oceanic observations were assimilated every 8 to 80 time steps using observations within the assimilation window instead of just using the oceanic observations available at the analysis time. Atmospheric and oceanic observations are available every 8 time steps of the simulation. Fig. 3.13 compares the 4D-LETKF (red line) to the LETKF (dashed-circle line), the 4D-ETKF (dashed-square line), the ETKF-QOL (solid line), and the ETKF (solid-square line) for the extratropics (top left), tropics (top right), and ocean (bottom). The 4D-LETKF data assimilation system performs the best out of all systems for longer assimilation intervals. This agrees with Ballabrera et al. (2009) conclusion that recommends assimilating the fast and slow subsystem sepa- rately. For shorter windows, we are able to assimilate simultaneously, but for longer windows, when the nonlinear perturbations grow, it is bene cial to assimilate the submodels separately. The 4D-LEKTF data assimilation system also mimics vari- able localization (Kang et al., 2011), where the spurious correlations due to sampling errors are suppressed. It performs better than the LETKF because the ocean is using 63 Figure 3.13: 4D-LETKF vs. LETKF vs. ETKFs: Plots comparing the mean rms errors using optimal in ation for the extratropics (top left), the tropics (top right), and the ocean (bottom) more observations to estimate the model state. In summary, we were able to develop EnKF-based data assimilation systems for a coupled data assimilation. The ETKFs were not able estimate the model state for longer assimilation windows without in ation due to growing nonlinear perturbations. Optimal in ation is required for these systems to assimilate for longer 64 windows. The LETKFs data assimilation systems assimilated the fast and slow variables separately. It bene ted from a type of "variable localization" based on fast- slow model localization and was able to assimilate for longer assimilation windows. The ETKF-QOL is unique among the ETKFs in its ability to improve the analysis and assimilate the fast and slow submodels simultaneously for longer assimilation windows without falling victim to lter divergence. The experiments show promise of EnKF-based algorithms for coupled data assimilations. Further improvements in advancing these systems will prove bene cial to potential operational use of EnKF- based algorithms for coupled data assimilations. 65 Chapter 4 4D-Var Experiments with the Simple Coupled Ocean-Atmosphere Model Summary. This section introduces the 4D-Var data assimilation system for the simple coupled ocean-atmosphere model and discusses experiments used to study its performances in the presence of multiple time scales and variable amplitudes. 4D-Var was able to esti- mate the model state for short and long assimilation windows. Long assimilation windows experienced multiple minima due to non-Gaussian perturbations. We applied a quasi- static variational data assimilation (Pires et al., 1996) to improve the analyses for longer windows. We also performed tuning experiments with the background error covariance and found a strong dependence of the analyses on the amplitude of the background error covariance. When comparing the 4D-Var analyses with the EnKF-based analyses, the EnKF-based schemes with variable localization and a quasi-static outer loop competes with the 4D-Var analyses. For extended windows, the 4D-Var analyses o ers the best estimate of the coupled system. 4.1 Introduction 4D-Var determines the initial model state of a dynamical system that bets ts the model and observations over an assimilation window. This is accomplished by minimizing a cost function that measures the mis t between the model and observations over the assimilation period. The minimization of the cost function requires the development of a 66 tangent linear and adjoint model, which is integrated backwards in time. This chapter discusses the complete development of the 4D-Var data assimilation system four the simple coupled ocean-atmosphere model. Section 2 discusses the cost function and the minimization algorithm. In section 3, we formulate the tangent linear and adjoint models and verify the correctness of the models. Section 4 describes the background error covariance estimation and section 5 discusses the 4D-Var experiments. Section 6 compares the EnKF-based data assimilations and the 4D-Var data assimilations. 4.2 4D-Var Cost Function The 4D-Var cost function is given by J(xt0) = 1 2 h xt0 x b t0 iT (B0) 1 h xt0 x b t0 i + 1 2 nX i=1 H(xti) y o ti T Rti) 1[H(xti) y o ti (4.1) = Jb + Jo (4.2) where yoti is the vector of observations made at time ti, R is the observation error covariance matrix, B is the background error covariance matrix, xbt0 is the background model state or rst guess at time t0, xti is the evolved model state at time ti, i.e. xti = M(xt0), H is the observation operator at time ti that maps from model space to observation space. Jb is called the background cost function and Jo is called the observation cost function. To minimize the cost function, the gradient of the cost function, rJ , is computed. rJ(xto) = B 1 t0 h xt0 x b t0 i + NX i=1 LTHTtiR 1 ti H(xti) y o ti (4.3) For our 4D-Var experiments, all of the model variables are observed so the operator H and H are the identity matrix I. Iterative minimization algorithms are used to determine 67 the minimum of J , or solve rJ = 0. The adjoint model LT , which is the transpose of the tangent linear model L, is integrated backwards during the iterative method to compute the gradient. A derivation of the tangent linear and adjoint model will be discussed later in the chapter. The minimization algorithm used in this 4D-Var data assimilation system is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) numerical method. The BFGS method is a Quasi-Newton method. Quasi-Newton methods are preferred over Newton and steepest descent methods for minimization because they are less computationally expensive and do not require the computation of the Hessian matrix (a square matrix of second-derivatives of a function). Given an initial model state at iteration j = 0, x(0), and an estimate of the Hessian, G(0), the BFGS method for minimizing the function f is 1. Determine a direction p(j) by solving p(j) = rf(x(j)) 2. Perform a line search to determine a stepsize (j) > 0. This is accomplished by minimizing the function f(x(j) + p(j)). Then update the model state x(j+1) = x(j) + (j)p(j) 3. Compute s(j) = x(j+1) x(j) 4. Compute y(j) = rf(x(j+1)) rf(x(j)) 5. Approximate the Hessian for the next iteration: G(j+1) = G(j)+y (j)y(j)T y(j)T s(j) G (j)s(j)(G(j)s(j))T s(j)TG(j)s(j) 6. Iterate the counter j, i.e. j = j + 1 and return to step 1. The initial estimate of the Hessian matrix can be the identity matrix. Convergence is checked by the criteria jjrf(x(j))jj < , where > 0. A disadvantage of the BFGS method is that convergence is not quadratic as the Newton method, but convergence is 68 still superlinear. Next, we will discuss the tangent linear and adjoint models needed to minimize the 4D-Var cost function. 4.3 Tangent Linear and Adjoint Model The tangent linear model is derived from the simple coupled ocean-atmosphere system, which is the forecast model for this data assimilation system. Let M denote the nonlinear coupled ocean-atmosphere model, such that xti+1 = M [xti ] then given a perturbed model state, xtlti+1 , the tangent linear model of M , is xti+1 = Lxti+1 = @M(xti) @x xtlti L = @M(xti ) @x is the tangent linear model operator. The coded form of our nonlinear model can be written as dxdt(1) = sig (x(2) x(1)) ce (s x(4) + k1) dxdt(2) = r x(1) x(2) x(1) x(3) + ce(s x(5) + k1) dxdt(3) = x(1) x(2) b x(3) dxdt(4) = sig (x(5) x(4)) c (s x(7) + k2) ce(s x(1) + k1) dxdt(5) = r x(4) x(5) x(4) x(6) + c(s x(8) + k2) ce(s x(2) + k1) dxdt(6) = x(4) x(5) b x(6) + cz x(8) dxdt(7) = tau sig (x(8) x(7)) c(x(4) + k2) dxdt(8) = tau r x(7) tau x(8) tau s x() x(9) + c (x(5) + k2) dxdt(9) = tau s x(7) x(8) tau b x(9) cz x(6) 69 Linearizing each line of the code for the nonlinear model gives us the he equations of the tangent linear model that are coded as dxdt tl(1) = sig (xtl(2) xtl(1)) ce s xtl(4) dxdt tl(2) = (r x(3)) xtl(1) xtl(2) x(1) xtl(3) + ce s xtl(5) dxdt tl(3) = x(2) xtl(1) + x(1) xtl(2) b xtl(3) dxdt tl(4) = sig (xtl(5) xtl(6)) c s xtl(7) ce S xtl(1) dxdt tl(5) = (r x(6)) xtl(4) xtl(5) x(4) xtl(6) + c s xtl(7) + ce S xtl(2) dxdt tl(6) = x(5) xtl(4) + x(4) xtl(5) b xtl(6) + cz xtl(9) dxdt tl(7) = tau sig(xtl(8) xtl(7)) c xtl(4) dxdt tl(8) = (tau r x(9)) xtl(7) tau xtl(8) tau s x(7) xtl(9) + c xtl(5) dxdt tl(9) = tau s x(8) xtl(7) + tau s x(7) xtl(8) tau b xtl(9) cz xtl(6) where (x(1), x(2), x(3), x(4), x(5), x(6), x(7), x(8), x(9)) is the model state and (xtl(1), xtl(2), xtl(3), xtl(4), xtl(5), xtl(6), xtl(7), xtl(8), xtl(9)) are the tangent linear model variables. The background state is needed in the tangent linear model. It is not directly used in the 4D-Var data assimilation system, but it is needed for the development of the adjoint model. The correctness of the tangent linear model is checked by computing the following relation (Navon et al., 1992) M(x + x) M(x) Lx To verify the tangent linear model using the above relation, 1. Integrate the nonlinear model from an initial model state x, to get the nonlinear solution M(x) 70 2. Integrate the nonlinear model from the initial model state x + x, where x is a small perturbation to generate the nonlinear solution M(x + x) 3. Integrate the tangent linear model from x using M(x) as the background state to produce the tangent linear model solution L( x) 4. Compute the relation jj(M(x + x) M(x))=L( x)jj. This relation should be approximately equal to 1 for su ciently small x. We checked the correctness of the tangent linear model eq. 4.4 using a small initial perturbation of x = 0:01 and integrated for 1000 time steps so that x evolves with time. Figure 4.1: Verifying correctness of tangent linear model: The ?black? line corresponds to L( x) and the ?blue? line corresponds to M(x + x) M(x) for the Extratropical atmosphere (top), Tropical atmosphere (middle), Ocean(bottom). The left side corresponds to x = 0:01 and the right side corresponds to x = 0:1. 71 Figure 4.2: Verifying correctness of tangent linear model with the norm: A plot of jjM(x + x) M(x)=L( x)jj vs. number of time steps. Fig. 4.1 displays M(x+ x) M(x) (red line) and L( x) (black line) for each subsystem of the coupled model. The left side corresponds to a perturbation of 0.01 ( x = 0:01) and the right side corresponds to a perturbation of 0.1 ( x = 0:1). Fig. 4.2 shows the norm jj(M(x + x) M(x))=L( x)jj, which should approximately equal to 1. The left side is for a perturbation of 0.01 and the right side if for a perturbation of 0.1. Figures 4.1 and 4.2 show that the period of time for which the tangent linear model approximates the nonlinear model depends on the magnitude of the perturbation. The larger the magnitude of the perturbation is, the shorter the valid time period. For the smaller perturbation of 0.01, the approximation (M(x + x) M(x))=L( x) becomes invalid after 600 time steps, while for the larger perturbation of 0.1 the approximation becomes invalid after 400 time steps. 72 The invalidity may a ect the 4D-Var analyses. The adjoint model operator, LT , found in the gradient of the cost function J is the transpose of the tangent linear model operator. The gradient of the cost function with respect to the control variable is obtained by a backward integration of the adjoint model. Automatic adjoint code generators have been developed for complex systems (Rostaing et al., 1993). A line-by-line approach is used here to code the adjoint model. With the line-by-line approach, the adjoint code for the adjoint model is the transpose of each line of the tangent linear code in reverse order. To provide a description of the line-by-line approach, consider the 6th equation, for example, from the tangent linear model dxdt tl(6) = x(5) xtl(4) + x(4) xtl(5) b xtl(6) + cz xtl(9) where dxdttl is a 9 1 vector representing the left-side of the tangent linear model, xtl is a 9 1 vector representing the tangent linear model variables, and x is a 9 1 vector representing the model state. In matrix form, this becomes 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 xtl(4) xtl(5) xtl(6) xtl(9) dxdttl(6) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 x(5) x(4) b cz 0 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 xtl(4) xtl(5) xtl(6) xtl(9) dxdttl(6) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 73 Taking the transpose, we obtain 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 x tl(4) x tl(5) x tl(6) x tl(9) dxdt tl(6) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 1 0 0 0 x(5) 0 1 0 0 x(4) 0 0 1 0 b 0 0 0 1 cz 0 0 0 0 0 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 x tl(4) x tl(5) x tl(6) x tl(9) dxdt tl(6) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 where x tl are the adjoint variables. The corresponding code is ax(4) = ax(4) + x(5) adxdt(6) ax(5) = ax(5) + x(4) adxdt(6) ax(6) = ax(6) b adxdt(6) ax(9) = ax(9) + cz adxdt(6) adxdt(6) = 0 where the pre x a represents the adjoint model variables This line-by-line approach is done for each equation of the tangent linear model to code the adjoint model. The correctness of the adjoint code of the adjoint model is veri ed by using the identity (Navon et al., 1992) (L x)T (L x) = ( x)TLT (L x This is a strict equality and it should hold if the adjoint model is coded cor- rectly. The procedure for verifying the correctness of the adjoint is 1. Integrate the forecast model, M , from an initial model state, x to get a non- linear solution M(x), which is saved as the background state 74 2. Integrate the tangent linear model from a small perturbation x to get the tangent linear solution L( x), using M(x) as the reference state 3. Integrate the adjoint model backwards starting from the evolved state, L( x) 4. Compute the relation (L x)T (L x) = ( x)TLT (L x to verify the correctness of the adjoint model We veri ed the correctness of the adjoint model for our 4D-Var data assimila- tion system. Table 4.1: Verifying the adjoint model code for the coupled ocean-atmosphere model 4D-Var data assimilation system x (L x)T (L x) x)TLT (L x) 0:001 182:66 182:66 0:01 1:8266E4 1:8266E4 0:1 1:8266E5 1:8266E5 1 1:8266E8 1:8266E8 Table 4.1 describes the adjoint model code check that included a 1000 time step forward nonlinear model and tangent linear model integration and a 1000 time step backward adjoint model integration. 75 4.4 Background error covariance estimation The minimization of the 4D-Var cost function J requires an estimation of the background error covariance B. Recall that background errors, b = xb xt, with mean b have an error covariance given by B = E(( b b)( b b)T ) = 2 6 6 6 6 6 6 6 6 6 6 4 var(x1) cov(x1; x2) cov(x1; x9) cov(x1; x2) var(x2) cov(x2; x9) ... ... . . . ... cov(x1; x9) cov(x2; x9) var(x9) 3 7 7 7 7 7 7 7 7 7 7 5 where x = (x1; x2; x3; x4; x5; x6; x7; x8; x9) denotes the model state, var() computes the variance, and cov() is the covariance. The background error covariances are often di cult to estimate because the background state are never observed directly and thus there is a lack of information about the background errors. For our 4D-Var data assimilation system, B is estimated using the National Meteorological Center method (Parrish and Derber, 1992). The NMC method estimates the background error covariance matrix by computing the average di erences between two short- range model forecasts valid at the same time. B E (xf(16ts) xf (8ts))(xf(16ts) xf (8ts))T where ts denotes time-steps and is a tunable scalar. An advantage of the NMC method is that it provides background error statistics that are easy to implement in a variational scheme (Sadiki et al, 2000). The amplitude of the background error covariances can be tuned to determine the weights of the 76 observations. Various Numerical Weather Prediction centers have used the NMC method to estimate the background error covariance (Parrish and Derber, 1992; Gauthier et al., 1998; Rabier et al., 1998; Derber and Bouttier, 1999). We use the NMC method to estimate the background error covariance for our 4D-Var data assimilations. 4.5 Quasi-static Variational Data Assimilation For longer assimilation windows, non-Gaussian perturbations of the observa- tion error and background error may result in non-quadratic cost functions, which challenges the minimization algorithm to nd the global minimum. Pires et al. (1996) proposed the Quasi-static Variational Data Assimilation (QVA) approach. QVA nds the minimum of a non-quadratic cost function by beginning with a mini- mum short window and progressively increasing the window to the maximum assim- ilation window while simultaneously adjusting the minimum of the cost function. Steps of the QVA approach can be summarized as 1. Begin with an initial guess of the model state and a short assimilation window 2. Minimize the 4D-cost function over that assimilation window to generate a new estimate of the model state 3. Increase the window by a small increment and repeat step 2 until the maximum window is obtained 77 Figure 4.3: Schematic of quasi-static variational data assimilation (Pires et al, 1996) Fig. 4.3 illustrates the schematic of the QVA. We applied QVA to our 4D-Var data assimilation system. 4.6 Experiments For our 4D-Var data assimilation experiments, observations are generated from the nature (or control) run plus random errors of fully coupled model. The obser- vational error standard error is p (2), i.e. R = 2I where I is the identity matrix. Observations are available every 8 time steps. Our forecast model is the simple cou- pled ocean-atmosphere model where the extratropical atmosphere is weakly coupled to the tropical atmosphere that is strongly coupled to the ocean. We performed 4D- Var data assimilation experiments with and without QVA. The control variables 78 were the initial model state and we varied the length of the assimilation window, i.e. 8, 16, 24, 32, 40, 48, 56, 64, 72, 80 time steps. The rms errors were used to assess the performance of the 4D-Var data assimilation. Figure 4.4: 4D-Var Assimilation with and without QVA: Plot of analysis mean rms errors for the extratropics (top left), tropics (top right), and ocean (bottom) Fig 4.4 describes the performance of the 4D-Var assimilation with and without QVA as we vary the assimilation window length (in time-steps) in terms of the mean rms errors. Mean rms errors were plotted for the extratropical atmosphere (top left), the tropical atmosphere (top right), and the ocean (bottom). The black dashed line denotes the observation error and the red line corresponds to the 4D-Var assimilation without QVA. The y-axis is the range of rms errors and the x-axis in the length of the assimilation window in time-steps. With observations available every 8 time 79 steps, the extra-tropical analysis rms errors improves up to a 32 time step window for the 4D-Var assimilation without QVA. The tropical and ocean analysis rms errors improve up to an 80 time-step window for the 4D-Var assimilation without QVA. Fig. 4.4 shows that applying QVA slightly improves the 4D-Var analysis. The extra-tropical analysis mean rms errors decrease up to a 72 time step assimilation window instead of the 24 time step assimilation window without QVA. The tropical analysis mean rms errors decrease up to an 80 time-step window and the ocean analysis mean rms errors decrease up to a 88 time-step window. In our 4D-Var experiments, we estimated the background error covariance using the NMC method. We performed tuning the background error covariance experiments to determine if the 4D-Var analysis is impacted by the amplitude of the backgrond error covariance. Tuning experiments were performed bymultiplying the background error covariance B by a scalar , i.e. B. Fig 4.5 describes the impact of tuning the background error covariance for short and long assimilation windows. It is a plot of the mean rms errors for the extratropics (top left), tropics (top right), and ocean (bottom). The red line corre- sponds to the 4D-Var assimilations with optimal B. The black dashed line denote the observation error covariance. For each subsystem, we were able to improve the 4D-Var analysis by tuning the background error covariance B. It shows that tuning B has a signi cant positive impact on the 4D-Var analyses. In summary, we were able to develop a 4D-Var data assimilation system for the simple coupled ocean-atmosphere model, where the extratropical atmosphere is weakly coupled to the tropical atmosphere which is strongly coupled to the ocean. 80 Figure 4.5: 4D-Var Assimilation: Tuning Background Error Covariance - Plot of analysis mean rms errors for the extratropics (top left), tropics (top right), and ocean (bottom). The red line corresponds to the 4D-Var assimilation with optimal B We performed experiments varying the assimilation window in time steps and found that lengthening the assimilation window to a certain time step and applying QVA improves the 4D-Var analysis. We also found that tuning the amplitude of the background error covariance has an impact on the performance of the assimilation, somthing which is not explicitly acknowledged in the description of 4D-Var assimi- lation systems. Longer windows require a smaller amplitude, re ecting the fact that the background information becomes relatively smaller or irrelevant compared to the increased number of observations. 81 4.7 4D-Var vs. EnKF-based Methods We compare the mean rms errors from our EnKF-based data assimilations to our 4D-Var data assimilations with optimal B. Figure 4.6: 4D-Var vs. EnKFs: Plot of analysis mean rms errors for ETKF-QOL (red), LETKF (light blue), 4D-LETKF (dark blue), and 4D-Var (green) for the extratropical atmosphere (top left), tropical at- mosphere (top right), and ocean (bottom) Fig. 4.6 shows that the EnKFs compete with 4D-Var for short and long assim- ilation windows. For short assimilation windows, the LETKFs and the ETKF-QOL outperform 4D-Var for the extratropics, tropics, and ocean. LETKF submodel lo- calization and a quasi-outerloop aids in assimilating for longer windows by reducing sampling errors and handling nonlinear perturbations respectively. We exteneded the assimilaiton windows beyond 80 time-steps. 82 Figure 4.7: 4D-Var vs. EnKFs for Longer Windows: Plot of analysis mean rms errors for ETKF-QOL (red), LETKF (light blue), 4D-LETKF (dark blue), and 4D-Var (green) for the extratropical atmosphere (top left), tropical atmosphere (top right), and ocean (bottom) When assimilating for even longer windows, g. 4.7 shows that 4D-Var per- forms better or competes with the LETKFs. With QVA, 4D-Var was able to handle the strong nonlinear perturbations for longer windows. Note, we did not compare the ETKF and 4D-ETKF since they didn?t perform well for very long assimilation windows. In summary, we were able to develop a 4D-Var assimilation for the simple coupled ocean-atmosphere model. 4D-Var was able to provide good estimates of the model state using observations within a short and long assimilation window. We were able to improve the analyses using QVA. The 4D-Var analyses dependes on the amplitude of the background error covariance, which was estimated using the NMC 83 method. When comparing the 4D-Var analyses to the EnKF-based schemes, we found that the 4D-VAR and EnKF-based systems yield comparable mean analysis and forecast errors when 4D-VAR is performed with a long enough assimilation window and when EnKFs are performed su ciently frequently. 84 Chapter 5 Estimating the Circulation and Climate of the Ocean (ECCO) 4D-Var Data Assimilation System Summary We develop an ECCO-like 4D-Var data assimilation system using the slow ocean submodel of our coupled ocean-atmosphere system. The control variables are the initial ocean state and the uxes between the tropical atmosphere and ocean that are updated every 8 time steps of a simulation and unchanged by the model. We develop ux estimates akin to the NCEP Reanalysis uxes (Kalnay et al.,1996) for background information. The ECCO-like 4D-Var data assimilation experiments were carried out for short and long assimilation windows to provide an estimate of the ocean state. We then performed a 4D-Var data assimilation system using the slow component of the coupled model and only the initial ocean state as control variables. QVA was used to improve the analyses for longer windows. The ECCO-like 4D-Var analyses improved the 4D-Var analyses, but the system was unable to improve upon the NCEP-like ux estimates. With QVA, we were also able to extend the assimilation window, but not beyond the standard 4D-Var. 5.1 About ECCO The Consortium for Estimating the Circulation and Climate of the Ocean (ECCO) is a collaboration of a group of scientists who seeks to provide a coupled ocean/biochemical/sea- ice, and atmospheric state estimate through assimilation methods. The assimilation meth- 85 ods combine a general ocean circulation model and various observations to produce a a global ocean state estimate. ECCO uses an ocean general circulation model that is based on the Massachusetts Institute of Technology (MIT) general circulation model The prog- nostic variables are horizontal velocity, potential, and salt. Several ECCO products are ECCO-SIO (Scripps Institution of Oceanography) - Scientists from MIT and SIO used the 4D-Var method or the adjoint method to obtain global ocean state esti- mates over the periods of 1992-1997, 1992-2000, and 1992-2002. ECCO-JPL (Jet Propulsion Laboratory) - Scientists at JPL used the extended Kalman lter and Rauch-Tung-Striebel (RTS) smoother to obtain global ocean state estimates from 1993 - present. ECCO-GODAE (Global Ocean Data Assimilation Experiment) - Scientists used the adjoint or 4D-Var method and various observation sets to obtain the global set estimate over the periods of 1992-2004, 1992-2007, and 1992-2006. German ECCO (GECCO) - Scientists based at the University of Hamburg?s Institut fuer Meereskunde who seeks the global ocean state estimate over the full 50-year NCEP/NCAR (National Centers for Environmental Prediction/National Center for Atmospheric Research) re-analysis period and estimates in the North Atlantic by using the adjoint (4D-Var) method. ECCO-Phase II - In this phase, scientists seek to generate sea-ice data and global ocean state estimates at high resolutions allowing for eddy resolution. ECCO developed a 4D-Var data assimilation system to estimate the ocean state using the World Ocean Circulation Experiment (WOCE) data, NCEP reanalysis of surface 86 uxes, and their ocean general circulation model. For their data assimilation system, they used as control variables both the initial model state as in 4D-Var as well as surface uxes. Here, we developed an ECCO-like data assimilation system using the ocean subsys- tem of the simple coupled ocean-atmosphere model to see if the initial model state and its forcings can be used to estimate the model state of the slow system. Section 2 discusses previous studies related to the ECCO 4D-Var data assimilation system. Section 3 intro- duces the ECCO-like 4D-Var data assimilation system for our simple model, discussing the tangent linear and adjoint models, the cost function, and the error covariance matrix estimations. Section 4 presents the experiments and summarizes the ndings. Section 5 discusses a 4D-Var data assimilation system for the slow component of the coupled system and its experiments. Section 6 compares the ECCO-like 4D-Var data assimilation with the LETKF and 4D-Var assimilations. 5.2 ECCO analyses and comparisons with other methods Stammer et al. (2004) performed a 4D-Var data assimilation using the initial ocean state and air-sea uxes of heat, freshwater, and momentum to obtain an optimal estimate of the global ocean state and air-sea uxes for a 10 year period. The control variables are the initial potential temperature and salinity eld and the daily surface forcings of heat, freshwater, and momentum uxes over 10 years (Stammer et al., 2004). The observa- tions that were used to t the model included satellite data sets, surface drifter velocities, in-situ hydrographic temperature, and salinity pro les. The 4D-Var system used NCEP surfaces uxes to constrain the forcing elds and the model?s monthly mean climatology of temperature and salinity. Stammer et al. (2004) compared the ECCO ux estimates to independent estimates from bulk formula and observations to determine if they improved 87 the NCEP reanalysis uxes. They found general agreement between the ECCO ux esti- mates and the independent ux estimates, with the ECCO adjustments being within the NCEP error estimates. They did nd that small scale structures due to model error in the momentum uxes. To improve their 4D-Var estimation, Stammer et al. (2004) proposed using spatial covariances for the the ux errors and improving the model resolution and physics. Carton and Santorelli (2008) examined and compared nine analyses of the ocean state and heat content during the period of 1960-2002. The analyses included six from sequential data assimilation, two that were independent of numerical models, and the ECCO 4D-Var analysis that employed a general circulation model. The ECCO 4D-Var data assimilation system used in this comparison study employed the initial conditions and atmospheric uxes as the control variables. The study found the ECCO analyses to be outliers in some of the comparison studies. Fig 5.1 shows ECCO bing a relative outlier during the comparison study of ocean analyses by Carton and Santorelli (2008). It is a plot of the rst empiri- cal orthogonal eigenfunction of monthly heat content and the corresponding time series. The ECCO analyses was the only one among the nine analyses for which the eigenfunction did not resemble the Paci c Decadal Oscillation Pattern. This doesn?t necessarily discredit the ECCO analyses, but may highlight a weakness in the analyses. 88 Figure 5.1: A comparison study of nine ocean analyses (Carton and Santorelli, 2008) that shows an ECCO analysis as an outlier 5.3 ECCO-like 4D-Var Data Assimilation System for the Simple Cou- pled Model We developed a 4D-Var data assimilation system similar to the ECCO 4D-Var data assimilation system using the slow component of the simple coupled ocean- atmosphere model. Like ECCO, we include as control variables the initial conditions and the ?surface uxes? between the ocean and the tropical atmosphere of our simple coupled ocean-atmosphere model. The equations describing the slow ?ocean? component of the simple coupled model are _X = (Y X) + f1 89 _Y = rX Y SXZ + f2 (5.1) _Z = SXY bZ + f3 _f1 = 0 _f2 = 0 _f3 = 0 where the model state is given by x = (X; Y; Z; f1; f2; f3)T . (X; Y; Z) represents the ocean model state and (f1; f2; f3) represents the model forcings ( uxes between the ocean and tropical atmosphere) that are not changed by the nonlinear model. The cost function for our ECCO-like 4D-Var system is given by J(xt0) = 1 2 h xt0 x b;nfe t0 iT (Bb;nfe0 ) 1 h xt0 x b;nfe t0 i (5.2) + 1 2 nX i=1 H(xti) y o ti T Rti) 1[H(xti) y o ti (5.3) where the control vector is given xt0 = [X0; Y0; Z0; f 1 1 ; f 1 2 ; f 1 3 ; :::; f n 1 ; f n 2 ; f n 3 ] T . (X0; Y0; Z0) corresponds to the initial ocean state, (f 11 ; f 1 2 ; f 1 3 ) are the uxes for time step 1 through time step 8, (f 21 ; f 2 2 ; f 2 3 ) are the uxes for time step 9 through time step 16, and (fn1 ; f n 2 ; f n 3 ) are the uxes for time steps 8(n+ 1) + 1 through time step 8n. Note, n = total number of time steps=8. The background state is given by xb;nfe = [Xb; Y b; Zb; fnfe;11 ; f nfe;1 2 ; f nfe;1 3 ; :::; f nfe;n 1 ; f nfe;n 2 ; f nfe;n 3 ] T , where (Xb; Y b; Zb) corre- sponds to the background ocean state, (fnfe;11 ; f nfe;1 2 ; f nfe;1 3 ) are the NCEP-like ux estimates for time step 1 and time step 8, (fnfe;21 ; f nfe;2 2 ; f nfe;2 3 ) are the NCEP-like ux estimates for time step 9 and time step 16, and (fnfe;n1 ; f nfe;n 2 ; f nfe;n 3 ) are the NCEP-like ux estimates for time steps 8(n+ 1) + 1 and time step 8n. 90 Figure 5.2: ECCO-like 4D-Var vs. 4D-Var: The di erence between a 4D-Var system and the ECCO-like 4D-Var system There are two major di erences between the ECCO 4D-Var and the standard 4D-Var. The rst is that both the initial conditions and surface ocean-atmosphere uxes are used as control variables. The second is that this approach allows ECCO to use very long assimilation windows, essentially in nite. For example, the 10 year ECCO ocean reanalysis uses a single 10 year window. We use "NCEP-like" ux estimates for the background state. We will discuss the computation of the NCEP-like ux estimates in the next section. The background error covariance, Bb;nfe, is an augmented matrix consisting of the background error covariance matrix for the background ocean state, B and the error covariance matrix for the ux estimates, Q, i.e. 91 B = 2 6 6 4 B 0 0 Q 3 7 7 5 The background error covariance matrix, B was estimated using the NMC method. The error covariance matrix, Q is a diagonal matrix with time-averaged variances of the ux estimates along the main diagonal. The ECCO-like 4D-Var data assimilation system minimizes this cost function to get an estimate of the ocean state. Because of the additional constraint by using surface uxes as control variables, ECCO is able to use very long windows at least in the published work. 5.4 NCEP-like Flux Estimates The NCEP uxes used by ECCO as background information in the estimation of the uxes were obtained from the NCEP Reanalysis (Kalnay et al., 1996). The reanalysis was carried out on the global atmosphere and with observed sea surface temperature (SST). In our simple coupled ocean-atmosphere model, we imitate this procedure by doing a "reanalysis" of the tropical-extratropical atmospheres with ob- served ocean variables (with errors). This allows to estimate the ocean-atmosphere uxes within a one-way coupling, as in the NCEP Reanalysis. To compute NCEP- like ux estimates using the simple couple ocean-atmosphere model, we uncouple the ocean from the tropical atmosphere and perform a ETKF data assimilation with 10 ensemble members. The tropical atmosphere is weakly coupled to the extratropical atmosphere and is forced by ocean observations every 8 time steps. We assimilate 92 every 8 time steps and use the mean analysis state of the tropical atmosphere to estimate the uxes. The equations for uxes are given by f1 = c(x a tr + k1) f2 = c(y a tr + k1) f3 = czz a tr where (xatr; y a tr; z a tr) corresponds to the mean analysis state for the tropical atmo- sphere. The parameters c,cz,and k1 are the speci ed parameters of the simple cou- pled ocean-atmosphere model. Figure 5.3: NCEP-like Flux Estimates: A plot of the NCEP-like ux estimates (f1; f2; f3) (top panel) and the di erence between the true and estimated uxes (bottom panel) Fig. 5.3 shows a plot of the ux estimates in the top panel and the di erence 93 between the ux estimates and ?true? uxes in the bottom panel. A normally dis- tributed random noise was added to the ux estimates to create more variability. The ?true? uxes were computed by using the ?true? tropical atmosphere state when computing the ux estimates 5.5 ECCO-like 4D-Var: Tangent Linear and Adjoint Models The tangent linear model is used to develop the adjoint model. The equations for the tangent linear model are _X tl = (Y tl X tl) + f tl1 _Y tl = ( rX Z)X tl Y tl SXZtl + f tl2 _Ztl = SY X tl + SXY tl bZtl + f tl3 _f1 tl = 0 _f2 tl = 0 _f3 tl = 0 where (X tl; Y tl; Ztl; f tl1 ; f tl 2 ; f tl 3 ) are the tangent linear model state variables. For correctness, we check M(x + x) M(x) L( x) where x is a model state, x is a small perturbation, and L is the tangent linear model operator. To verify the tangent linear model using the above relation, 1. Integrate the nonlinear model from an initial model state x, to get the non- linear solution M(x) 94 2. Integrate the nonlinear model from the initial model state x + x, where x is a small perturbation to generate the nonlinear solution M(x + x) 3. Integrate the tangent linear model from x using M(x) as the background state to produce the tangent linear model solution L( x) 4. Compute the relation jj(M(x + x) M(x))=L( x)jj. This relation should approximately equal to 1. Figure 5.4: ECCO-LIKE 4D-Var: Tangent Linear Model Check - A plot of the jjM(x + x) M(x)=L( x)jj when varying the assimilation window and perturbation size Fig. 5.4 shows that the tangent linear model approximates the di erence between the two nonlinear model solutions well up to through an 80 time-step as- similation window. 95 The adjoint model is developed using the tangent linear model. The line-by- line approach is used to code the adjoint model, which involves transposing the equations of the tangent linear model. Consider, for example, the equation from the tangent linear model _Ztl = SY X tl + SXY tl bZtl + f tl3 This equation can be coded as dxdt tl(3) = tau s x(2) xtl(1) + tau s x(1) xtl(2) tau b xtl(3) + xtl(6) where dxdt tl is a 6 1 vector representing the left-side of the tangent linear model, xtl is a 6 1 vector representing the tangent linear model variable, and x is a 6 1 vector representing the model state. In matrix form, this becomes 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 xtl(1) xtl(2) xtl(3) xtl(6) dxdttl(3) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 Sx(2) Sx(1) b 1 0 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 xtl(1) xtl(2) xtl(3) xtl(6) dxdttl(3) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 Taking the transpose, we obtain 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 x tl(1) x tl(2) x tl(3) x tl(6) dxdt tl(3) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 = 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 1 0 0 0 Sx(2) 0 1 0 0 Sx(1) 0 0 1 0 b 0 0 0 1 1 0 0 0 0 0 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4 x tl(1) x tl(2) x tl(3) x tl(6) dxdt tl(3) 3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 96 where x tl are the adjoint variables. The corresponding adjoint code is ax(1) = ax(1) + tau s x (2) adxdt(3) ax(2) = ax(2) + tau s x(1) adxdt(3) ax(3) = ax(3) tau b adxdt(3 ax(6) = ax(6) + adxdt(6) adxdt(3) = 0 This line-by-line approach is done for each equation of the tangent linear model to code the adjoint model, reversing the order of the tangent linear model equations. To verify the correctness of the adjoint model we check (L x)T (L x) = ( x)TLT (L x This is a strict equality and it should hold if the adjoint model is coded cor- rectly. The procedure for verifying the correctness of the adjoint is 1. Integrate the forecast model, M , from an initial model state, x to get a non- linear solution M(x), which is saved as the background state 2. Integrate the tangent linear model from a small perturbation x to get the tangent linear solution L( x), using M(x) as the reference state 3. Integrate the adjoint model backwards starting from the evolved state, L( x) 4. Compute the relation (L x)T (L x) = ( x)TLT (L x 97 Figure 5.5: ECCO-LIKE 4D-Var: Adjoint Model Check - A plot of the (L x)T (L x) and ( x)TLT (L x when varying the assimilation window and perturbation size to verify the correctness of the adjoint model For a small perturbation of x = 0:001, we see in Fig. 5.5 that the strict equality holds. 5.6 ECCO-like 4D-Var Experiments For our ECCO-like 4D-Var experiments, observations were generated from the nature (or control) run of the fully coupled model adding random errors. Observa- tional standard error is p (2), i.e. R = 2I where I is the identity matrix. Ocean observations are available every 8 time steps. The forecast model is the slow sub- system of the coupled ocean-atmosphere model with uxes changing after every 8 98 time steps. We perform a 4D-Var data assimilation using the initial ocean state and uxes as control variables. We also varied the length of the assimilation windows, 8 to 80 time steps in increments of 8 time steps. The rms errors for the analysis was computed to assess the performance of the data assimilation system. Table 5.1: ECCO-like 4D-Var Experiment with and without QSVA: Table of mean analysis rms errors for each assimilation window Assimilation Window Length Mean RMS Error, Analysis Mean RMS Error, Analysis (with QVA) 8 0:88 0:87 16 0:64 0:64 24 0:67 0:66 32 0:74 0:74 40 0:81 0:81 48 0:85 0:85 56 0:94 0:94 64 1:00 1:00 72 1:05 1:02 80 1:12 1:07 120 1:52 1:44 160 2:18 1:55 200 3:31 1:59 Table 5.1 displays the mean rms errors for the analysis with and without QVA for each assimilation window. We observe decreasing rms errors up to a 16 time- step assimilation window. Longer windows typically introduce multiple minima in 99 a 4D-Var assimilation system, so we applied the QVA. QVA slightly improves the analyses for longer windows. We are able to use the initial ocean state and uxes to obtain good estimates of the ocean state. However, our ECCO-like 4D-Var data assimilation system was not able to improve the NCEP-like ux estimates. Figure 5.6: ECCO-Like 4D-Var Flux Estimate - Plots of the ECCO-like 4D-Var ux estimates (blue line) and the NCEP-like ux estimates (red line) Fig. 5.6 shows the rms errors for the ECCO-like 4D-Var ux estimates and the NCEP-like ux estimates. From the gure, we observe that we are not improving the NCEP-like ux estimates and that the ECCO errors increase with increasing assimilation window. The errors in the ECCO-like ux estimates maybe due to the fact that the ux control variables do not control the ocean state and doesn?t force it 100 to be close to the observations. Our system is chaotic and the ocean of our coupled model is not dominated by the uxes. We also performed a 4D-Var data assimilation, where we used only the initial ocean state as the control variables and forced the system every 8 time steps with NCEP-like ux estimates. The assimilation included the QVA. Table 5.2: 4D-Var Experiment with Fluxes (QSVA applied): Table of mean analysis rms errors for each assimilation window when applying Pires et al.(1996) Assimilation Window Length Mean RMS Error, Analysis 8 1:14 16 0:74 24 0:74 32 0:84 40 0:92 48 0:96 56 1:11 64 1:24 72 1:28 80 1:41 120 2:37 160 3:19 200 4; 76 101 Figure 5.7: ECCO-LIKE 4D-Var vs. 4D-Var vs. Fully Coupled 4D-Var: A plot of the mean rms errors for the ECCO-like 4D-Var assimilation (orange), the 4D-Var assimilation (blue), and the Fully Coupled 4D-Var (red) with lengthening assimilation windows Fig. 5.7 compares the ECCO-like 4D-Var, 4D-Var, and Fully Coupled 4D- Var (from chapter 4). The fully coupled 4D-Var provides better estimates of the ocean state since it bene ts from ?true? model forcings from the tropical atmosphere. Comparing the ECCO-like 4D-Var data assimilation to the 4D-Var data assimila- tion, we see that the ECCO-like 4D-Var analyses improves the ocean states. This experiment shows that including model forcings in the control variables improves the ocean state estimate. 102 Chapter 6 Conclusions and Future Work The goal of this study was to develop and compare sequential and variational data assimilation systems for a simple "coupled ocean-atmosphere model" of dif- ferent time scales and varying amplitudes. We developed EnKF-based and 4D-Var data assimilation systems and compared their performance when assimilating for short and long assimilation windows. We summarize and discuss the conclusions for each data assimilation system below. EnKF-based Assimilation Systems The EnKF-based data assimilation methods used in this work were ETKF - assimilating all model variables, the ETKF nds the linear combina- tion of the ensemble forecasts that best ts the observations valid at the end of the assimilation window (analysis time) 4D-ETKF - assimilating all model variables, the 4D-ETKF nds the linear combination of the ensemble forecasts at the analysis time that best ts the observations made throughout an assimilation window LETKF - assimilating the ?fast atmospheric? variables frequently while sepa- rately assimilating the ?slow ocean? variables. For this system, the "localiza- tion" takes place on the submodel. 103 4D-LETKF - assimilating the ?fast atmospheric? variables frequently using an ETKF while separately assimilating the ?slow ocean? variables using a 4D- ETKF ETKF-Quasi-Outer Loop - assimilating the fast and slow variables simul- taneously using the ETKF with a quasi-outer loop. We found that the ETKF was able to assimilate all of the observations corresponding to multiple time scales for short assimilation intervals. It experiences a lter divergence for longer assimilation windows due to nonlinear, hence a non- Gaussian) growth of perturbations. the 4D-ETKF was able to assimilate all of the observations corresponding to multiple time scales for short assimilation intervals. It outperforms the ETKF because of the use of more observations within an assimilation interval. It also experiences lter divergence for longer assimilation windows due to nonlinear perturbations. the LETKF was able to assimilate at di erent time intervals, assimilating the fast variables frequently while varying the assimilation intervals for the slow variables. The model localization akin to variable localization of the LETKF suppresses the e ect of spurious correlations resulting from the use of small ensembles. The LETKF outperforms the ETKFs for longer assimilation intervals and avoids lter divergence. The frequent assimilation of the fast 104 variables allows the faster ?noisy? phenomena to saturate. the 4D-LETKF was able to assimilate at di erent time intervals, assimilating the fast variables via the local ETKF frequently while varying the assimilation intervals using a 4D-ETKF for the slow variables. It outperforms the ETKFs and LETKF for long assimilation intervals. With the 4D-LETKF, the ocean assimilation bene ts from using more observations and both systems bene t from using variable localization. the ETKF with a Quasi-Outer Loop was able to assimilate the fast and slow variables simultaneously for short and long assimilation intervals. It is com- petitive with the LETKFs for long assimilation window. This data assimi- lation system bene ts from being able to handle the nonlinear perturbations for longer windows by using a no-cost smoother to improve the mean analysis state A challenge of the EnKF-based assimilation systems, especially when assimi- lating for longer windows, is lter divergence. Multiplicative in ation was applied to mitigate lter divergence. Future work for coupled data assimilation using EnKF- based assimilation methods include Extending the ETKF Quasi-Outer Loop: Analyses of the model state can be improved by extending the ETKF Quasi-Outer Loop to a 4D-ETKF Quasi-Outer Loop using observations over an assimilation window. The ETKF Quasi-Outer Loop could also be extended to 4D-LETKF Quasi-Outer Loop, assimilating the fast and slow variables separately. 105 Adaptive In ation: Manually optimizing multiplicative in ation is compu- tationally expensive and ine cient. Anderson (2007,2009), Li et al. (2009), and Miyoshi (2011) developed adaptive in ation approaches where they adap- tively estimated multiplicative in ation for EnKF-based algorithms. Their work showed improvements in the model analysis with adaptive in ation. An adaptive in ation approach can be used with the LETKFs or ETKF with a Quasi-Outer Loop to improve the analyses for coupled EnKF-based data assimilations. Variable Localization: In the application of the LETKF, the localization was carried out by slow and fast submodels, as in Ballabrera et al. (2009). Alternatively, we should try the method of variable localization (Kang et al., 2011) and identify the variables in th coupled model where errors should be physically uncorrelated. For these variables, the corresponding covariance is explicitly zeroed out, thus avoiding the spurious correlation that are inevitably generated in the standard EnKF method that includes all covariances. For ex- ample, the fast extratropical atmosphere and the ocean should be uncorrelated with the errors. Therefore, variable localization (zeroing out the covariances between the extratropical atmosphere and the ocean) will eliminate spurious correlations, which at the same time allowing for the simultaneous ETKF of the fully coupled model. Other experiments can be done to investigate the performance of coupled EnKF-based data assimilation methods. This includes reducing observation cov- 106 erage and studying its e ect on the performance of the assimilation systems and introducing model errors to perform imperfect model experiments using the assim- ilation systems. These experiments would provide further information about the EnKF-based methods for coupled data assimilation. 4D-Var Assimilation System Assuming a perfect model, a 4D-Var data assimilation system was developed for the simple coupled ocean-atmosphere model. We found that it provided good estimates of the extratropical, tropical, and ocean model states for short and long as- similation windows. For short and long assimilation windows, the 4D-Var analyses competes with the LETKF and ETKF-QOL analyses. However, when extending the windows, through the use of the expensive Quasi-static Variational Data As- similation (Pires et al., 1996), the 4D-Var analyses outperforms the LETKF and ETKF-QOL. Future work for the 4D-Var assimilation system include Incremental 4D-Var 4D-Var is computationally expensive when minimizing the cost function. Incremental 4D-Var (Courtier et al., 1994) can be use to reduce the cost of 4D-Var. It expresses the cost function in terms of increments with respect to the background state, x = x xb, for computational e ciency. The cost function J(x) = 1 2 (x xb)TB 1(x xb) + 1 2 nX i=1 [yoi Hi(xi)] TR 1[yoi Hi(xi)] T becomes J( x) = 1 2 ( x)TB 1( x) 107 + 1 2 nX i=1 (Hi xi di)TR 1i (Hi xi di) where di = yi Hi( xb) are called the innovations. Using increments reduces the tangent linear and adjoint models during minimization. Weak Constraint 4D-Var Weak constraint 4D-Var can be used to perform imperfect model experiments. With the weak constraint 4D-Var, the cost function becomes J(x) = 1 2 (x xb)TB 1(x xb) + 1 2 nX i=1 [yoi Hi(xi)] TR 1[yoi Hi(xi)] T + nX i=1 Ti Q 1 i where is the model error de ned as (x) = xi Mi 1 and Q is the model error covariance. ECCO-like 4D-Variational Data Assimilation We developed an ECCO-like 4D-Var data assimilation system for the slow ocean component of our simple coupled ocean-atmosphere model. This 4D-Var assimilation system used the ocean model state and forcings as the control variables. While including the forcings in the control variables slightly improved the analyses, we were not able to improve the uxes or extend inde nitely the windows as done in ECCO. Future work for this study includes Error Covariance for Fluxes: A diagonal matrix was used to represent the error covariance for the uxes. Future work could include using full error covariance matrix for the uxes to study if it would improve ux estimates. 108 Longer Assimilation Windows: ECCO has been performed with a 10-year as well as a 50-year analysis with a single assimilation window. Future work could include lengthening the assimilation window to study 4D-Var estimates of the slow model state. But in our ECCO-like assimilation, we were not able to extend the assimilation window beyond that used for 4D-Var. Long assim- ilation windows present the challenge of rapidly growing perturbations that invalidates the tangent linear model. The approach of dampening the adjoint model (Hoteit, 2005) can be used to explore assimilating for longer windows. In addition, we could use nudging toward the ocean model climatology, as done in some versions of ECCO (J. Carton pers. comm., 2011). In closing, we were able to perform coupled data assimilation for a simple coupled ocean-atmosphere system using sequential and variational data assimila- tion methods. The study highlights advantages, challenges, and possible additional improvements for both systems that could further advance the data assimilation systems for coupled data assimilations. The results obtained with this simple "cou- pled ocean-atmosphere" system can serve to guide the design of future generations of data assimilation systems for operational coupled ocean-atmosphere models. 109 Bibliography [1] Anderson, J.L. and S.L. Anderson, 1999: A Monte Carlo implementation of the nonlinear ltering problem to produce ensemble assimilations and forecasts, Mon. Wea. Rev, 127, 2741-2758. [2] Anderson, J.L., 2001: An ensemble adjustment kalman lter for data assimila- tion, Mon. Weather Rev., 129, 2884-2903. [3] Anderson, J.L. 2007: An adaptive covariance in ation error correction algo- rithm for ensemble lters. Tellus, 59A, 210-224. [4] Andersson, E., Fisher,M., Holm, E., Isaksen, L., Radnoti, G., and Y. Tremolet, 2005: Will the 4D-Var approach be defeated by nonlinearity?, ECMWF Tech Memo., 479, 26. [5] Ballabrera-Poy, J., Kalnay, E., and S.C. Yang, 2009: Data assimilation in a system with two scales - combining two initialization techniques, Tellus, 61A, 539-549. [6] Bertino, L., Evensen, Geir, and H. Wackernagel, 2003: Sequential Data Assim- ilation Techniques in Oceanography. International Statistical Review, 71(2), 223-241. [7] Bishop Bishop, C.H., Etherton, B.J., and S.J. Majumdar, 2001: Adaptive sam- pling with ensemble transform kalman lter. Part I: Theoretical aspects, Mon. Weather Rev., 129, 420-436. [8] Burgers, G., van Leeuwen, P.G., and G. Evensen, 1998: On the analysis scheme in the ensemble kalman lter. Mon. Weather Rev, 126, 1719-1724. [9] Carton, J.A., and A. Santorelli, 2008: Global decadal upper-ocean heat content as viewed in nine analyses, J. Clim., 21, 6015-35. [10] Chen, D. 2008: Coupled data assimilation for ENSO prediction. Advances in Geosciences, 18, 45-62. [11] Corazza, M., Kalnay, E., Patil, D.J., Ott, E., Yorke, J., Szunyogh, I., and M. Cai, 2002: Use of the breeding technique in the estimation of the background error covariance matrix for a quasigeostrophic model, in AMS Symposium on Observations, Data Assimilation, and Probabilistic Prediction, Orlando, FL, 154-157. 110 [12] Courtier, P., Thpaut, J.-N. and Hollingsworth, A., 1994: A strategy for op- erational implementation of 4D-Var, using an incremental approach. Q. J. R. Meteorol. Soc., 120, 1367-1388. [13] Courtier, P., Anderson, E., Heckley, W., Pailleux, J., Vasiljevic, D., Hamrud, J., Hollingsworth, A., Rabier, F. and M. Fisher, 1998: The ECMWF implemen- tation of three dimensional variational assimilation (3D-Var) I: Formulation. Quarterly Journal of Royal Meterology Society, 124, 1783-1807. [14] Daley, R. 1991: Atmospheric data analysis. Cambridge University Press. [15] Derber, R. and F. Bouttier, 1999: A reformulation of the background error covariance in the ECMWF global data assimilation system. Tellus, 51A, 195- 221. [16] Evans, E., N. Bhatti, J. Kinney, L. Pann, M. Pena, S-C. Yang, E. Kalnay, and J. Hansen, 2004: RISE undergraduates nd that regime changes in Lorenzs model are predictable. Bull. Amer. Meteor. Soc., 85, 520524. [17] Evensen, G., 1992: Using the extended kalman lter with a multilayer quasi- geostrophic ocean model, J. Geophys. Res., 97(C11), 17905-17924. [18] Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi- geostrophic model using Monte Carlo methods to forecast error statistics, J. Geophys. Res., 99(C5), 10143-10162. [19] Evensen, G. and P.J. van Leewen, 1996: Assimilation of geosat altimeter data for the aghulas current using the ensemble kalman lter with a quasi-geostrophic model, Mon. Weather Rev., 124, 85-96. [20] Evenson, G., 1997: Advanced data assimilation for strongly nonlinear dynam- ics, Mon. Wea. Rev, 125, 1342-1354. [21] Fertig, E., Harlim, J., and B.R. Hunt, 2007: A comparative study of 4D-Var and a 4D Ensemble Kalman Filter: perfect model simulations with Lorenz-96, Tellus, 59A, 96-100. [22] Gauthier, P., Buehner, M., and L. Fillion, 1998: Background error statistics modelling in a 3D variational data assimilation scheme: estimation and impact on the analyses. ECMWF workshop on diagnosis of data assimilation systems, 131-145. [23] Ghil, M. and P. Malanotte-Rizzoli, 1991: Data assimilation in meteorology and oceanography, Adv. Geophys. 33, 141-266 111 [24] Harlim, J. and B. Hunt, 2007: Four-dimensional local ensemble transform Kalman lter: numerical experiments with a global circulation model, Tellus, 59A, 731-748. [25] Hoteit, I., Cornuelle, B., Kohl, A., and D. Stammer, 2005: Treating strong adjoint sensitivities in tropical eddy-permitting variational data assimilation., Q. J. R. Meterol. Soc., 131, 3659-3682. [26] Houtekamer P.L. and H.L. Mitchell, 1998: Data assimilation using an ensemble kalman lter technique, Mon. Weather Rev., 126, 796-811. [27] Hunt, B., Kalnay, E., Kostelich, E.J., Ott, E., Patil, D.J., Sauer, T., Szunyogh, I., Yorke, J.A., and A.V. Zimin, 2004: Four-dimensional ensemble Kalman ltering, Tellus, 56A, 273-277. [28] Hunt, B., Kostelich, E., and I. Szunyogh, 2007: E cient data assimilation for spatiotemporal chaos: A local ensemble Kalman Filter. Physica D, 230, 112- 126. Kalman Kalman, R.E. and Bucy, R.C., 1960: A new approach to linear ltering and prediction problems, J. Basic Eng. (ASME) 82D, 35-45. [29] Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437471. [30] Kalnay, E. 2003: Atmospheric modeling, data assimilation, and predictability. Cambridge University Press. [31] Kalnay, E., Li, H., Miyoshi, Yang, S.C., and J. Ballabrera, 2007a: 4D-Var or Ensemble Kalman Filter? Tellus A, 59, 758-773. [32] Kalnay, E., Li, H., Miyoshi, T., Yang, S.C., and J. Ballabrera, 2007b: Response to the discussion on \4D-Var or EnKF?" by Nils Gusta son, Tellus A, 59, 778- 780. [33] Kalnay, E. and S.C. Yang, 2010: Accelerating the spin-up of Ensemble Kalman Filtering. Quart. J. Roy. Meteor. Soc., 136, 16441651. [34] Kang, J.-S., E. Kalnay, J. Liu, I. Fung, T. Miyoshi, and K. Ide, 2011: \Variable localization" in an ensemble Kalman lter: Application to the carbon cycle data assimilation, J. Geophys. Res., in press. 112 [35] Kirtman, B.P., Shukla, J., Huang, B., Zhu, Z., and E.K. Schneider, 1997: Multi- seasonal predictions with a coupled tropical ocean-global atmosphere system, Mon. Wea. Rev., 125, 789-808. [36] Klinker, E., F. Rabier, G. Kelly and J. F. Mahfouf, 2000: The ECMWF op- erational implementation of four-dimensional variational assimilation. Part I: experimental results and diagnostics with operational con guration. Q. J. R. Meteorol. Soc. 126, 1191-1215. [37] Latif, M. and T.P. Barnett, 1994: A review of ENSO prediction studies. Climate Dyn., 9, 167-179. [38] Li, H., Kalnay, E., and T. Miyoship, 2009: Simultaneous estimatiion of covari- ance in ation and observation errors within an ensemble Kalman lter. Quart. J. Roy. Meteor. Soc., 135, 523-533. [39] Lorenc A. C., 2003: The potential of the Ensemble Kalman Filter for NWP-A comparison with the 4D-VAR. Quarterly Journal of Royal Meterology Society, 129, 3183-3203. [40] Lorenz, E., 1963: Deterministic non-periodic ow. J. Atmos. Sci., 20, 130-141. [41] Lorenz, E.N., 1996: Predictability-a problem partly solved. In Proceedings on predictability, held at ECMWF on 4-8 September1995. [42] Lorenz, E.N. and K.A. Emanuel, 1998: Optimal sites for supplementary weath- rer observations: simulations with a small model. J. Atmos. Sci., 55, 399-414. Luong Luong, B., J. Blum, and J. Verron, 1998: A variational method for the resolution of a data assimilation problem in oceanography. Inverse Problems, 14, 979-997. [43] Mahfouf, J.F. and F. Rabier, 200: The ECMWF operational implementation of four dimensional variational assimilation. Part II: Experimental results with improved physics. Quart. J. Roy. Meteor. Soc, 126, 1171-1190. [44] Miller, R.N. Ghil, M., and F. Gauthiez, 1994: Advanced data assimilation in strongly nonlinear dynamical systems, J. Atmos. Sci., 51, 1037-1055. [45] Miyoshi, T., 2011: The Gaussian Approach to Adaptive Covariance In ation and Its Implementation with the Local Ensemble Transform Kalman Filter. Mon. Wea. Rev., 139, 1519-1535. 113 [46] Navon, I. and D.M. Legler, 1987: Conjugate-gradient methods for large scale minimization in meteorology. Mon. Wea. Rev, 115, 1479-1502. [47] Ott, E., Hunt, B.R., Szunyogh, I., Zimin, A.V., Kostelich, E.J., Corazza, M., Kalnay, E., Patil, D.J., and J.A. Yorke, 2004: A local ensemble Kalman lter for atmospheric data assimilation, Tellus A, 56, 415-428. [48] Parrish DF, Derber JC, 1992: The National-Meteorological-Centers spectral statistical interpolation analysis system, Mon Wea Rev, 120(8), 1747-1763. [49] Pena, M. and E. Kalnay, 2004: Separating fast and slow modes in coupled chaotic systems, Nonlinear Process. Geophys., 11, 319-327. [50] Pires, C., Vautard, R., and O. Talagrand, 1996: On extending the limits of variational assimilation in nonlinear chaotic systems. Tellus A, 48, 96-121. [51] Rabier, F., Jarvinen, H., Klinker, E., Mahfouf, J.-F. and Simmons, A.,2000: The ECMWF operational implementation of four-dimensional variational physics, Q. J. R. Meteorol. Soc., 126, 1143-1170. [52] Rostaing, N., Dalmas, S., and A. Galligo, 1993: Automatic di erentiation in Odyssee. Tellus, 45A, 558-568. [53] Sadiki, W., Fischer, C., and J.F. Geleyn, 2000: Mesoscale background error covariances - recent results obtained with the limited area model ALADIN over Morocco. Mon. Wea. Rev., 128, 3927-2935. [54] Vialard, J., Weaver, A.T., Anderson, D.L.T., and P. Delecluse, 2003: Three- and four-dimensional variational data assimilation with an ocean general cir- culation model of the tropical Paci c Ocean. Part 2: physical validation, Mon. Wea. Rev., 131, 1379-1395. [55] Wang, G., Kleeman, R., Smith, N., and F. Tseitkin, 2002: The BMRC coupled general circulation model ENSO forecast system, Mon. Wea. Rev., 130, 975- 991. [56] Wevar, A.T., Vialard, J. and D.L.T. Anderson, 2003: Three- and four- dimensional variational assimilation with an ocean geneal circulatin model of the tropical Paci c Ocean. Part 1: formulation, internal diagnostics and con- sistency checks, Mon. Wea. Rev., 131, 1360-1378. [57] Whitaker J.S. and T.M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Weather Rev., 130, 1913-1924. 114 [58] Yang, S.C., Kalnay, E., Hunt, B., and N. Bowler, 2009a: Weight interpolati- ion for e cient data assimilation with the Local Ensemble Transform Kalman Filter, Quart.J.Roy.Meteor.Soc., 135, 251-262. [59] Yang, S.C., Corazza, A., Kalnay, E., and T. Miyoshi, 2009b: Comparison of ensemble-based and variational-based data assimilation schemes in a quasi- geostrophic model, Mov. Wea. Rev., 137, 639-709. [60] Yang, S.C. and E. Kalnay, 2011: Handling nonlinearity and non-Gaussianity in Ensemble Kalman Filter. submitted to a special collection "Intercomparisons of 4D-Variational Assimilation and the Ensemble Kalman Filter", Mon. Wea. Rev.(submitted manuscript). 115 Bibliography 116